Lesson 4
Batch Processing of CSV Files with PHP
Introduction to Reading Data in Batches with PHP

In previous lessons, you learned how to handle datasets stored in compressed formats and manage large numerical datasets efficiently using PHP. Building on that foundation, today's lesson will teach you how to read and process data in batches from multiple CSV files using PHP. This matters because splitting a large dataset into smaller chunks, or batches, keeps each file a manageable size and lets you process the data one piece at a time.

Our focus in this lesson will be on a practical scenario where a dataset containing car information is spread across multiple files. You will learn to read, process, and analyze this data to extract meaningful insights, such as determining the car with the lowest price.

Understanding CSV Data Structure

In this lesson, we will work with a set of CSV files containing car data. Here's what a typical CSV file might look like:

```csv
transmission,price,color,year,model,distance_traveled_km
Automatic,60383.80,Silver,2013,Ford Focus,10437
Manual,82471.28,White,2011,Toyota Corolla,221662
Automatic,52266.72,Black,2012,BMW Series 5,30296
...
```

Each line represents a car record with the following attributes:

  • Transmission: Type of transmission (e.g., Automatic, Manual)
  • Price: The price of the car
  • Color: The color of the car
  • Year: The manufacturing year of the car
  • Model: The model of the car
  • Distance Traveled (km): Kilometers the car has traveled

The dataset is divided into multiple files to allow batch processing. Knowing the column order is important because the code in this lesson reads fields by position: counting from 0, price is column 1 and model is column 4.
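To make the column layout concrete, here is a minimal sketch that parses the header and the first data row from the sample above. It uses `str_getcsv` and `array_combine` from the PHP standard library; the variable names are illustrative only and are not part of the lesson's solution.

```php
<?php
// Header and first data row copied from the sample file shown above.
$headerLine = 'transmission,price,color,year,model,distance_traveled_km';
$rowLine    = 'Automatic,60383.80,Silver,2013,Ford Focus,10437';

// str_getcsv() splits one CSV line into an array of fields.
// Separator, enclosure, and escape characters are passed explicitly.
$header = str_getcsv($headerLine, ',', '"', '\\');
$fields = str_getcsv($rowLine, ',', '"', '\\');

// array_combine() pairs each column name with the value in the same position.
$record = array_combine($header, $fields);

echo $record['model'], "\n";               // Ford Focus
echo (float) $record['price'], "\n";       // 60383.8
echo array_search('model', $header), "\n"; // 4
```

Looking up fields by name, as here, is one option; reading them by numeric index is shorter but depends on the column order never changing.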

Setting Up for CSV File Batch Reading

Now, let's delve into reading these CSV files in batches using PHP constructs. We'll build our solution step-by-step. First, we need to specify the filenames for our CSV files and prepare a data structure to hold the combined data.

```php
// Class to represent a car
class Car
{
    public $model;
    public $price;

    public function __construct($model, $price)
    {
        $this->model = $model;
        $this->price = $price;
    }
}

// Filenames to read
$filenames = ['data_part1.csv', 'data_part2.csv', 'data_part3.csv'];

// Array to store all car data
$carData = [];
```

Here, we declare an array $filenames to hold the names of the CSV files and an array $carData to store instances of the Car class, representing the car data read from the files.
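Before reading, it can help to confirm that every listed file is actually present. The sketch below is one possible pre-check, not part of the lesson's solution: `partitionReadable` is a hypothetical helper name, and it relies only on the standard `is_file` and `is_readable` functions.

```php
<?php
// Hypothetical helper: split a list of paths into readable and missing files.
function partitionReadable(array $filenames): array
{
    $readable = [];
    $missing  = [];

    foreach ($filenames as $filename) {
        if (is_file($filename) && is_readable($filename)) {
            $readable[] = $filename;
        } else {
            $missing[] = $filename;
        }
    }

    return [$readable, $missing];
}

// Filenames taken from the lesson; whether they exist depends on your working directory.
$filenames = ['data_part1.csv', 'data_part2.csv', 'data_part3.csv'];
[$readable, $missing] = partitionReadable($filenames);

foreach ($missing as $filename) {
    echo "Skipping missing or unreadable file: {$filename}\n";
}
echo count($readable), " file(s) ready to read.\n";
```

Reporting missing files up front makes it easier to tell an empty result apart from a wrong path.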

Reading Data from Each File

Now, we'll loop through each filename, read the data using fopen and fgetcsv, and append it to our $carData structure.

```php
// Loop through each file and load car data
foreach ($filenames as $filename) {
    // Open the CSV file for reading; skip files that cannot be opened
    $file = fopen($filename, 'r');
    if ($file === false) {
        echo "Could not open {$filename}, skipping.\n";
        continue;
    }

    // Skip header line to avoid adding it to the data
    fgetcsv($file);

    // Read each line of the CSV and create Car objects
    while (($columns = fgetcsv($file)) !== false) {
        // Skip blank or incomplete rows and rows without a numeric price
        if (count($columns) < 5 || !is_numeric($columns[1])) {
            continue;
        }

        // Create a new Car object and add it to the car data array
        $carData[] = new Car($columns[4], (float)$columns[1]);
    }

    // Close the file after reading
    fclose($file);
}
```

In this code:

  • We open each file for reading using fopen and skip any file that cannot be opened.
  • We skip the header with fgetcsv($file).
  • For each row, fgetcsv is used to extract the columns as an array.
  • We skip rows that have fewer than five columns or a non-numeric price.
  • We create a new Car object using $columns[4] for the model and $columns[1], cast to a float, for the price, appending each valid car entry to $carData.
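The loop above collects every row into $carData before any processing happens. If a single file is too large for that, one alternative is to hand rows to the caller in fixed-size batches. The following is a minimal sketch using a PHP generator; `readCsvInBatches` and the batch size of 2 are hypothetical choices for illustration, and the temporary file only holds the three sample rows shown earlier in this lesson.

```php
<?php
// Hypothetical helper: yield the data rows of a CSV file in arrays of at most $batchSize rows.
function readCsvInBatches(string $filename, int $batchSize): Generator
{
    if ($batchSize < 1) {
        throw new InvalidArgumentException('Batch size must be at least 1.');
    }
    if (!is_readable($filename)) {
        return; // Nothing to yield for a missing or unreadable file.
    }

    $handle = fopen($filename, 'r');
    if ($handle === false) {
        return;
    }

    try {
        fgetcsv($handle, null, ',', '"', '\\'); // Skip the header line.
        $batch = [];
        while (($columns = fgetcsv($handle, null, ',', '"', '\\')) !== false) {
            if ($columns === [null]) {
                continue; // fgetcsv() returns [null] for a blank line.
            }
            $batch[] = $columns;
            if (count($batch) === $batchSize) {
                yield $batch;
                $batch = [];
            }
        }
        if ($batch !== []) {
            yield $batch; // Final, possibly smaller, batch.
        }
    } finally {
        fclose($handle);
    }
}

// Demo input: the three sample rows from this lesson, written to a temporary file.
$path = tempnam(sys_get_temp_dir(), 'cars');
file_put_contents($path, implode("\n", [
    'transmission,price,color,year,model,distance_traveled_km',
    'Automatic,60383.80,Silver,2013,Ford Focus,10437',
    'Manual,82471.28,White,2011,Toyota Corolla,221662',
    'Automatic,52266.72,Black,2012,BMW Series 5,30296',
]) . "\n");

foreach (readCsvInBatches($path, 2) as $i => $batch) {
    echo "Batch {$i}: ", count($batch), " row(s)\n";
}
// Batch 0: 2 row(s)
// Batch 1: 1 row(s)

unlink($path);
```

Because the generator only holds one batch at a time, memory use depends on the batch size rather than on the file size.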

Finding the Car with the Lowest Price

With all data combined in $carData, the next step is identifying the car with the lowest price in PHP.

```php
// Check if there are any cars in the data
if (!empty($carData)) {
    // Initialize the car with the lowest price
    $lowestCostCar = $carData[0];

    // Loop through the car data to find the lowest priced car
    foreach ($carData as $car) {
        if ($car->price < $lowestCostCar->price) {
            $lowestCostCar = $car;
        }
    }

    // Output the model and price (two decimal places) of the lowest cost car
    echo "Model: {$lowestCostCar->model}\n";
    echo "Price: \$" . number_format($lowestCostCar->price, 2, '.', '') . "\n";
} else {
    // Output a message if no valid car data was found
    echo "No valid car data available.\n";
}
```

Here:

  • We check if the $carData array has elements.
  • We initialize $lowestCostCar with the first car in the array.
  • We loop through the $carData to find the car with the minimum price.
  • Finally, we output the model and price of the car with the lowest price.
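The same comparison can be written as a reusable function that accepts any iterable, so it works on an array or on rows produced one at a time. This is a sketch under assumptions: `findLowestPriced` is a hypothetical name, the cars are plain associative arrays rather than Car objects to keep the example self-contained, and the data is the three sample rows from this lesson.

```php
<?php
// Hypothetical helper: return the entry with the lowest 'price', or null for empty input.
function findLowestPriced(iterable $cars): ?array
{
    $lowest = null;

    foreach ($cars as $car) {
        // Keep only the cheapest car seen so far.
        if ($lowest === null || $car['price'] < $lowest['price']) {
            $lowest = $car;
        }
    }

    return $lowest;
}

// Model and price values taken from the sample rows shown earlier.
$cars = [
    ['model' => 'Ford Focus',     'price' => 60383.80],
    ['model' => 'Toyota Corolla', 'price' => 82471.28],
    ['model' => 'BMW Series 5',   'price' => 52266.72],
];

$lowest = findLowestPriced($cars);
if ($lowest !== null) {
    printf("Model: %s\nPrice: \$%.2f\n", $lowest['model'], $lowest['price']);
}
// Model: BMW Series 5
// Price: $52266.72
```

Starting from null instead of the first element means the function handles an empty input without a separate check by the caller.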

Summary and Practice Preparation

In this lesson, you have learned how to:

  • Read data in batches from multiple CSV files using PHP file handling with fopen and fgetcsv.
  • Convert each row into a Car object, casting the price from a string to a float.
  • Identify insights, such as the car with the lowest price.

These techniques prepare you to handle similar datasets efficiently using PHP. Practice these skills with exercises designed to reinforce your understanding, focusing on efficient data handling techniques in PHP.
