Lesson 3
Mastering Data Aggregation and JSON Streams
Introduction

Welcome to our lesson on mastering data aggregation and data streams with JSON formatting in Ruby.

In this lesson, we'll start by building a basic sales records aggregator and then extend its functionality to handle more complex operations, such as filtering, data aggregation, and JSON formatting. By the end of this session, you’ll be able to manage and format data streams efficiently with Ruby.

Starter Task Methods and Their Definitions

To begin, we’ll implement a basic sales record aggregator with essential methods:

  • add_sale(sale_id, amount) — Adds a sale record with a unique identifier sale_id and an amount. If a sale with the same sale_id already exists, it updates the amount.
  • get_sale(sale_id) — Retrieves the sale amount associated with the sale_id. If the sale does not exist, it returns nil.
  • delete_sale(sale_id) — Deletes the sale record with the given sale_id. Returns true if the sale was deleted and false if the sale does not exist.

With these methods in place, let’s proceed to the code.

Starter Task Solution

Here is the complete code for the starter task:

Ruby
class SalesAggregator
  def initialize
    @sales = {}
  end

  def add_sale(sale_id, amount)
    @sales[sale_id] = amount
  end

  def get_sale(sale_id)
    @sales[sale_id]
  end

  def delete_sale(sale_id)
    !!@sales.delete(sale_id)
  end
end

The initialize method sets up an empty hash to store sales records. The add_sale method either adds a new sale or updates the amount if a sale with the same ID already exists. The get_sale method retrieves the amount for a given sale ID, returning nil if the sale does not exist. The delete_sale method removes the sale record for a specified sale ID, returning true if a record was removed and false if the sale does not exist.

To test these methods, here’s some example usage:

Ruby
# Example Usage
aggregator = SalesAggregator.new

# Add sales
aggregator.add_sale('001', 100.50)
aggregator.add_sale('002', 200.75)

# Get sale
puts aggregator.get_sale('001') # Output: 100.5

# Delete sale
puts aggregator.delete_sale('002') # Output: true
puts aggregator.get_sale('002').inspect # Output: nil
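
A short note on the !! in delete_sale: Hash#delete returns the removed value (or nil when the key is absent), and the double negation turns that into true or false. The sketch below illustrates this on a plain hash. The delete_key? helper is a hypothetical name, not part of the lesson's class, and is shown only as one way to handle the edge case where a stored value is itself nil or false.

```ruby
# Hash#delete returns the removed value, or nil when the key is absent.
sales = { '001' => 100.50 }

removed = sales.delete('001') # 100.5, the stored amount
missing = sales.delete('999') # nil, because there is no such key

puts(!!removed) # true
puts(!!missing) # false

# Caveat: a stored value of nil or false would also be reported as false.
# Checking for the key first avoids that ambiguity (hypothetical helper).
def delete_key?(hash, key)
  return false unless hash.key?(key)

  hash.delete(key)
  true
end

flags = { 'a' => nil }
puts delete_key?(flags, 'a') # true, the key existed even though its value was nil
puts delete_key?(flags, 'a') # false, the key is gone now
```

For sale amounts this distinction rarely matters, since numbers (including 0) are truthy in Ruby.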

With the basic aggregator working, let’s expand its functionality to handle more advanced operations.

New Methods and Their Definitions

To increase the functionality of our sales aggregator, we’ll introduce new methods for advanced data aggregation, filtering, and JSON formatting:

  • aggregate_sales(min_amount = 0) — Returns a hash with the total number of sales and the total amount of sales where the sale amount is above min_amount. The hash format is:
    Ruby
    { total_sales: 0, total_amount: 0.0 }
  • format_sales(min_amount = 0) — Returns the sales data, filtered by min_amount, formatted as JSON. Includes aggregated sales statistics in the output.
  • add_sale(sale_id, amount, date) — Adds or updates a sale record with sale_id, amount, and a date in the format "YYYY-MM-DD".
  • get_sales_in_date_range(start_date, end_date) — Retrieves all sales within the specified date range, inclusive, including each sale’s sale_id, amount, and date.

Let’s implement these methods step-by-step, using map, select, and reduce where applicable.

Step 1: Enhancing the add_sale Method to Include Date

We’ll first modify the add_sale method to accept a date, so each sale record includes a date in addition to the amount.

Ruby
def add_sale(sale_id, amount, date)
  @sales[sale_id] = { amount: amount, date: date }
end

This change allows each sale to store both an amount and a date, preparing the data structure for date-based filtering.
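
One consequence worth seeing in code: if get_sale is left unchanged (an assumption, since the lesson does not revisit it), it now returns the whole record hash rather than a bare amount. The sketch below uses a cut-down stand-in for the class with only the two methods involved.

```ruby
# Minimal stand-in for the lesson's class, reduced to the two methods involved.
class SalesAggregator
  def initialize
    @sales = {}
  end

  def add_sale(sale_id, amount, date)
    @sales[sale_id] = { amount: amount, date: date }
  end

  # Unchanged from the starter task, so it now returns the record hash.
  def get_sale(sale_id)
    @sales[sale_id]
  end
end

aggregator = SalesAggregator.new
aggregator.add_sale('001', 100.50, '2023-01-01')

record = aggregator.get_sale('001')
puts record[:amount] # 100.5
puts record[:date]   # 2023-01-01

# Safe navigation avoids a NoMethodError when the sale does not exist.
p aggregator.get_sale('999')&.fetch(:amount) # nil
```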

Step 2: Implementing the aggregate_sales Method with reduce

The aggregate_sales method will aggregate sales that exceed the specified min_amount, returning the total number of qualifying sales and the total sales amount. Here, we can use reduce to streamline the aggregation.

Ruby
def aggregate_sales(min_amount = 0)
  @sales.values.reduce({ total_sales: 0, total_amount: 0.0 }) do |acc, sale|
    if sale[:amount] > min_amount
      acc[:total_sales] += 1
      acc[:total_amount] += sale[:amount]
    end
    acc
  end
end

In aggregate_sales, reduce iterates through each sale and updates the accumulator (acc) with the count and sum of sales above min_amount. This approach makes the method concise and efficient for handling aggregations.

Example usage:

Ruby
# Create an instance of SalesAggregator
aggregator = SalesAggregator.new

# Add sales with date
aggregator.add_sale('001', 100.50, '2023-01-01')
aggregator.add_sale('002', 200.75, '2023-01-15')

# Aggregate sales (min_amount is a positional argument, not a keyword)
puts aggregator.aggregate_sales(50)
# Output: {:total_sales=>2, :total_amount=>301.25}
# (Ruby 3.4+ prints {total_sales: 2, total_amount: 301.25})
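
For comparison, the same totals can be computed in two passes with select and sum instead of a single reduce. This is only an alternative sketch: the aggregate_with_select name and the SALES sample data are made up for illustration, and the function takes the hash as a parameter so it can run outside the class.

```ruby
# Sample data in the same shape that add_sale stores (values are made up).
SALES = {
  '001' => { amount: 100.50, date: '2023-01-01' },
  '002' => { amount: 200.75, date: '2023-01-15' },
  '003' => { amount: 25.00,  date: '2023-02-01' }
}.freeze

# Equivalent of aggregate_sales written as filter-then-total.
def aggregate_with_select(sales, min_amount = 0)
  amounts = sales.values
                 .map { |sale| sale[:amount] }
                 .select { |amount| amount > min_amount }

  # sum(0.0) keeps total_amount a Float even when nothing qualifies.
  { total_sales: amounts.size, total_amount: amounts.sum(0.0) }
end

result = aggregate_with_select(SALES, 50)
puts result[:total_sales]  # 2
puts result[:total_amount] # 301.25
```

The reduce version walks the data once and builds the result as it goes; the version here allocates an intermediate array but may read more directly. Either choice is reasonable at this scale.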

Step 3: Implementing the format_sales Method with select and map

The format_sales method will output the sales data in JSON format, filtered by min_amount, and include aggregated statistics. We’ll use select and map here.

Ruby
require 'json'

def format_sales(min_amount = 0)
  # Keep only the sales above the threshold (Hash#select returns a Hash).
  filtered_sales = @sales.select { |_, sale| sale[:amount] > min_amount }
  sales_array = filtered_sales.map do |sale_id, sale|
    { sale_id: sale_id, amount: sale[:amount], date: sale[:date] }
  end
  statistics = aggregate_sales(min_amount)

  result = {
    sales: sales_array,
    statistics: statistics
  }
  JSON.generate(result)
end

In format_sales, select filters the sales hash and returns a new hash. map then iterates over its key-value pairs and converts each one into a hash containing sale_id, amount, and date, producing an array. JSON.generate serializes the combined result to a compact JSON string. No extra options are needed: forward slashes are not escaped by default, and the quirks_mode option is unrelated to slash escaping, so it is omitted here.

Example usage:

Ruby
# Format sales to JSON (min_amount is a positional argument, not a keyword)
puts aggregator.format_sales(50)
# Output: {"sales":[{"sale_id":"001","amount":100.5,"date":"2023-01-01"},{"sale_id":"002","amount":200.75,"date":"2023-01-15"}],"statistics":{"total_sales":2,"total_amount":301.25}}
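
To see what the JSON step does on its own, here is a small sketch using a hand-written payload in the same shape that format_sales builds. The PAYLOAD constant and its values are made up for illustration. It shows the compact output, the indented variant from JSON.pretty_generate, and a round trip back into a Ruby hash.

```ruby
require 'json'

# A payload in the same shape that format_sales builds (values are made up).
PAYLOAD = {
  sales: [{ sale_id: '001', amount: 100.5, date: '2023-01-01' }],
  statistics: { total_sales: 1, total_amount: 100.5 }
}.freeze

compact = JSON.generate(PAYLOAD)
puts compact
# {"sales":[{"sale_id":"001","amount":100.5,"date":"2023-01-01"}],"statistics":{"total_sales":1,"total_amount":100.5}}

# pretty_generate emits the same data with indentation for human readers.
puts JSON.pretty_generate(PAYLOAD)

# Parsing yields string keys unless symbolize_names is set.
parsed = JSON.parse(compact, symbolize_names: true)
puts parsed == PAYLOAD # true
```

Symbol keys become plain strings in JSON, which is why symbolize_names: true is needed to get back a hash equal to the original.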

Step 4: Implementing the get_sales_in_date_range Method with select and map

Finally, let’s implement the get_sales_in_date_range method to retrieve sales within a specific date range, using select and map to streamline the process.

Ruby
require 'date'

def get_sales_in_date_range(start_date, end_date)
  start_d = Date.parse(start_date)
  end_d = Date.parse(end_date)
  @sales.select do |_, sale|
    sale_date = Date.parse(sale[:date])
    start_d <= sale_date && sale_date <= end_d
  end.map do |sale_id, sale|
    { sale_id: sale_id, amount: sale[:amount], date: sale[:date] }
  end
end

In get_sales_in_date_range, select filters the sales within the specified date range, and map formats the results as an array of hashes, each containing sale_id, amount, and date.

Example usage:

Ruby
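# Get sales in date range
# (p prints the array in inspect form; puts would print one element per line)
p aggregator.get_sales_in_date_range('2023-01-01', '2023-12-31')
# Output: [{:sale_id=>"001", :amount=>100.5, :date=>"2023-01-01"}, {:sale_id=>"002", :amount=>200.75, :date=>"2023-01-15"}]
# (Ruby 3.4+ prints keys as sale_id: "001", amount: 100.5, ...)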
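
The inclusive comparison inside the select block can also be written with Date#between?, and dates can be validated against the "YYYY-MM-DD" format before they are stored. The sketch below shows both ideas in isolation. The in_range? and strict_date helpers are hypothetical names, not part of the lesson's class.

```ruby
require 'date'

# Date#between? expresses the inclusive range check directly.
def in_range?(date_string, start_date, end_date)
  Date.parse(date_string).between?(Date.parse(start_date), Date.parse(end_date))
end

puts in_range?('2023-01-01', '2023-01-01', '2023-01-31') # true, start is inclusive
puts in_range?('2023-01-31', '2023-01-01', '2023-01-31') # true, end is inclusive
puts in_range?('2023-02-01', '2023-01-01', '2023-01-31') # false

# Date.parse guesses the format; Date.strptime checks it against a pattern.
# Returns a Date, or nil when the string is not a valid YYYY-MM-DD date.
def strict_date(date_string)
  Date.strptime(date_string, '%Y-%m-%d')
rescue ArgumentError # Date::Error is a subclass of ArgumentError
  nil
end

p strict_date('2023-01-15').nil? # false
p strict_date('2023-13-01').nil? # true, there is no month 13
```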

Lesson Summary

Congratulations! You’ve extended a basic sales aggregator into a fully functional data management tool, capable of filtering, aggregating, and formatting sales data in JSON using Ruby’s powerful select, map, and reduce methods. These skills are invaluable for efficiently handling data streams, especially with larger datasets.

Experiment with similar tasks to reinforce your understanding. Great job, and see you in the next lesson!
