Welcome to Practical Data Manipulation Techniques! In this unit, we’ll combine and build upon everything you've learned about data transformation in Ruby. You’ll work through techniques for filtering, projecting, and aggregating data, using methods like map, select, sum, and reduce. By the end, you’ll know how to harness these methods to analyze and summarize data effectively.
Let’s dive in!
Throughout this unit, we’ll work with a structured dataset to apply and combine the techniques you’ve learned. Here’s an array of hashes representing individuals with different attributes:
This dataset will be the foundation as we explore data manipulation techniques.
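As a minimal sketch, a dataset of this shape might look like the following. The names, ages, professions, and salaries are made-up sample values, and the variable name people is an assumption used only for illustration.

```ruby
# Hypothetical sample data: every value here is illustrative only.
people = [
  { name: "Alice", age: 34, profession: "Engineer", salary: 85_000 },
  { name: "Bob",   age: 19, profession: "Intern",   salary: 30_000 },
  { name: "Carol", age: 41, profession: "Doctor",   salary: 120_000 },
  { name: "Dave",  age: 28, profession: "Engineer", salary: 75_000 }
]

# Each entry is a hash with symbol keys, so fields are read with person[:key].
people.each { |person| puts "#{person[:name]} (#{person[:profession]})" }
```

Symbol keys keep lookups such as person[:age] short, which is why they are assumed here.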
Data projection is used to select specific fields from each entry in a dataset. Let’s say we only want to see each person's name and profession:
In this example:
- map iterates through each person in the dataset.
- select extracts only the name and profession fields.
The result is an array of hashes containing only the projected fields.
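The projection described above could be sketched as follows. The people array is hypothetical sample data, not the lesson's original listing, and projected is an illustrative variable name.

```ruby
# Hypothetical sample data: every value here is illustrative only.
people = [
  { name: "Alice", age: 34, profession: "Engineer", salary: 85_000 },
  { name: "Bob",   age: 19, profession: "Intern",   salary: 30_000 },
  { name: "Carol", age: 41, profession: "Doctor",   salary: 120_000 },
  { name: "Dave",  age: 28, profession: "Engineer", salary: 75_000 }
]

# map visits each person; Hash#select keeps only the key/value pairs
# whose key is :name or :profession and returns them as a new hash.
projected = people.map do |person|
  person.select { |key, _value| [:name, :profession].include?(key) }
end

puts projected.inspect
```

On Ruby 2.5 or later, person.slice(:name, :profession) should give the same result with less code.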
Filtering allows you to keep only the data that matches specific conditions. Let’s select only the individuals who are 30 years old or older:
Here:
- select filters entries where the age is 30 or above.
- The result contains only entries matching this age criterion.
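A minimal sketch of this filter, assuming a hypothetical people array with an :age key on each entry (the variable name thirty_and_over is also illustrative):

```ruby
# Hypothetical sample data: every value here is illustrative only.
people = [
  { name: "Alice", age: 34, profession: "Engineer", salary: 85_000 },
  { name: "Bob",   age: 19, profession: "Intern",   salary: 30_000 },
  { name: "Carol", age: 41, profession: "Doctor",   salary: 120_000 },
  { name: "Dave",  age: 28, profession: "Engineer", salary: 75_000 }
]

# select keeps the entries for which the block returns a truthy value.
thirty_and_over = people.select { |person| person[:age] >= 30 }

thirty_and_over.each { |person| puts "#{person[:name]}: #{person[:age]}" }
```

With this sample data, only Alice and Carol pass the condition. The returned entries are the full hashes, not copies reduced to selected fields.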
By combining projection and filtering, we can create a more refined view of our data. For instance, let’s retrieve only the name and salary of people who are engineers and over the age of 30:
In this example:
- select filters for people who are engineers and over 30.
- map then projects only their name and salary.
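One way to write this combination is shown below. The dataset is hypothetical, and the profession value "Engineer" is an assumed spelling for the sake of the example.

```ruby
# Hypothetical sample data: every value here is illustrative only.
people = [
  { name: "Alice", age: 34, profession: "Engineer", salary: 85_000 },
  { name: "Bob",   age: 19, profession: "Intern",   salary: 30_000 },
  { name: "Carol", age: 41, profession: "Doctor",   salary: 120_000 },
  { name: "Dave",  age: 28, profession: "Engineer", salary: 75_000 }
]

# Filter first, then build a smaller hash for each remaining person.
engineers_over_30 = people
  .select { |person| person[:profession] == "Engineer" && person[:age] > 30 }
  .map { |person| { name: person[:name], salary: person[:salary] } }

puts engineers_over_30.inspect
```

Filtering before projecting matters here: after the projection, the age and profession fields would no longer be available to test against.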
Aggregation techniques allow us to summarize data by calculating sums, averages, counts, and more. Let’s explore a few common aggregation tasks.
We can calculate the total salary by extracting the salary field for each person and summing it:
Here:
- map extracts each person’s salary.
- sum calculates the total of all salaries.
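As a sketch, using a hypothetical people array with made-up salaries:

```ruby
# Hypothetical sample data: every value here is illustrative only.
people = [
  { name: "Alice", age: 34, profession: "Engineer", salary: 85_000 },
  { name: "Bob",   age: 19, profession: "Intern",   salary: 30_000 },
  { name: "Carol", age: 41, profession: "Doctor",   salary: 120_000 },
  { name: "Dave",  age: 28, profession: "Engineer", salary: 75_000 }
]

# map produces an array of salaries; sum adds them up.
total_salary = people.map { |person| person[:salary] }.sum

puts total_salary # 310000
```

sum also accepts a block, so people.sum { |person| person[:salary] } should produce the same total without building the intermediate array.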
To find the average age, sum all the ages and divide by the total count:
This code:
- Maps the dataset to ages.
- Sums the ages and divides by the number of people to find the average.
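These two steps might look like this, again on hypothetical sample data. The to_f call is an addition for this sketch: without it, dividing two integers in Ruby would truncate the result.

```ruby
# Hypothetical sample data: every value here is illustrative only.
people = [
  { name: "Alice", age: 34, profession: "Engineer", salary: 85_000 },
  { name: "Bob",   age: 19, profession: "Intern",   salary: 30_000 },
  { name: "Carol", age: 41, profession: "Doctor",   salary: 120_000 },
  { name: "Dave",  age: 28, profession: "Engineer", salary: 75_000 }
]

# Map the dataset to ages.
ages = people.map { |person| person[:age] }

# Convert the sum to a float so the division keeps its fractional part.
average_age = ages.sum.to_f / ages.size

puts average_age # 30.5
```

For an empty dataset this expression evaluates to NaN, so a real script would likely check ages.empty? first.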
Using max and min, we can find the highest and lowest salaries:
This example uses map to get all salaries, then max and min to find the highest and lowest values.
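A short sketch of that approach, using made-up salaries in a hypothetical people array:

```ruby
# Hypothetical sample data: every value here is illustrative only.
people = [
  { name: "Alice", age: 34, profession: "Engineer", salary: 85_000 },
  { name: "Bob",   age: 19, profession: "Intern",   salary: 30_000 },
  { name: "Carol", age: 41, profession: "Doctor",   salary: 120_000 },
  { name: "Dave",  age: 28, profession: "Engineer", salary: 75_000 }
]

# Collect the salaries once, then query the array twice.
salaries = people.map { |person| person[:salary] }
highest_salary = salaries.max
lowest_salary = salaries.min

puts "Highest: #{highest_salary}" # Highest: 120000
puts "Lowest: #{lowest_salary}"   # Lowest: 30000
```

If the whole record is needed rather than just the number, people.max_by { |person| person[:salary] } returns the matching hash instead.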
The reduce method (also called inject) is useful for custom aggregations. Let’s use it to count the number of people who are 21 or older:
Here:
- We first map the ages.
- reduce accumulates a count of people aged 21 or over.
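The counting logic could be sketched like this. The dataset and the variable name adult_count are assumptions for illustration.

```ruby
# Hypothetical sample data: every value here is illustrative only.
people = [
  { name: "Alice", age: 34, profession: "Engineer", salary: 85_000 },
  { name: "Bob",   age: 19, profession: "Intern",   salary: 30_000 },
  { name: "Carol", age: 41, profession: "Doctor",   salary: 120_000 },
  { name: "Dave",  age: 28, profession: "Engineer", salary: 75_000 }
]

ages = people.map { |person| person[:age] }

# Start the accumulator at 0 and add 1 for every age of 21 or more.
# The block's return value becomes the accumulator for the next element.
adult_count = ages.reduce(0) do |count, age|
  age >= 21 ? count + 1 : count
end

puts adult_count # 3
```

For a plain count, ages.count { |age| age >= 21 } is shorter. reduce becomes more useful when the accumulator is something richer, such as a hash of running totals.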
Ruby’s then method can help in chaining operations, making code easier to read. Here’s an example where we chain selection, projection, and averaging of salaries for engineers over 25:
This example:
- Filters for engineers older than 25.
- Projects only the salary field.
- Uses then to calculate the average salary, adding readability by separating the final calculation step.
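A chain along these lines might look as follows. It assumes Ruby 2.6 or later, where then is available, and uses a hypothetical dataset with "Engineer" as the assumed profession value.

```ruby
# Hypothetical sample data: every value here is illustrative only.
people = [
  { name: "Alice", age: 34, profession: "Engineer", salary: 85_000 },
  { name: "Bob",   age: 19, profession: "Intern",   salary: 30_000 },
  { name: "Carol", age: 41, profession: "Doctor",   salary: 120_000 },
  { name: "Dave",  age: 28, profession: "Engineer", salary: 75_000 }
]

# then yields the whole salaries array to the block and returns the
# block's result, so the average can be computed without a temp variable.
average_engineer_salary = people
  .select { |person| person[:profession] == "Engineer" && person[:age] > 25 }
  .map { |person| person[:salary] }
  .then { |salaries| salaries.sum.to_f / salaries.size }

puts average_engineer_salary # 80000.0
```

Unlike map, then passes the entire array to its block once, which is what makes the sum-and-divide step possible inside the chain. If no entries match the filter, the result is NaN, so a guard on salaries.empty? may be worth adding.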
In Practical Data Manipulation Techniques, we’ve brought together essential methods for transforming, filtering, and aggregating data in Ruby. You’ve learned to:
- Project specific fields with map and select.
- Filter data to include only entries meeting certain criteria.
- Aggregate data using sum, reduce, max, and min, and chain the steps with then.
With these combined techniques, you’re well-prepared to process, analyze, and summarize data efficiently in Ruby. Dive into the exercises to reinforce your skills—happy coding!
