Lesson 3
Introducing Aggregate Functions for Data Analysis
Introduction and Overview

Welcome to this lesson on aggregate functions in PostgreSQL. In the previous lessons, we covered how to use the COUNT function for quantitative analysis and utilized the DISTINCT clause to explore unique values in our data. Now, we will dive into more advanced data analysis techniques. In this lesson, we will explore the SUM, AVG, MAX and MIN functions.

By the end of this lesson, you'll be able to analyze different aspects of the movie data, like calculating total box office earnings, average IMDb ratings, and identifying extreme values.

Dataset Review

For convenience, the tables and columns in our Marvel movies dataset are:

Movies Table

Plain text
1 movie_id | movie_name | release_date | phase 2----------+------------+--------------+-------

Movie Details Table

Plain text
1 movie_id | budget_million_usd | box_office_million_usd | imdb_rating | runtime_minutes 2----------+--------------------+------------------------+-------------+-----------------

Characters Table

Plain text
1 character_id | movie_id | character_name | actor | screen_time_minutes 2--------------+----------+----------------+-------+---------------------
Basic Syntax and Usage

Here is the basic syntax for using aggregate functions.

SQL
1SELECT aggregate_function(column_name) 2FROM table_name;
Calculating Total Box Office Sales with SUM

We want to find the total box office sales for all 33 movies in our dataset. To do this, we use the SUM() aggregate function to add the values of box_office_million_usd of each row in the movie_details table. We also want the resulting table to have a column named "total_sales". The query is:

SQL
1SELECT SUM(movie_details.box_office_million_usd) AS "Total Sales" 2FROM movie_details;

In this code, we pass the movie_details.box_office_million_usd column into the SUM function. We use AS to create an alias for the column called "Total Sales".

The result is:

Plain text
1 Total Sales 2------------- 3 30481.9 4(1 row)

Using the SUM function, we can see that the total box office sales from all 33 movies is 30481.9 million dollars ($30,481,900,000).

Using AVG to Calculate Average IMDb Ratings

Now, let's use the AVG function, to find the average IMDb rating for movies with a runtime longer than 120 minutes and less than or equal to 120 minutes. The queries are:

SQL
1SELECT AVG(movie_details.imdb_rating) 2FROM movie_details 3WHERE movie_details.runtime_minutes > 120; 4 5SELECT AVG(movie_details.imdb_rating) 6FROM movie_details 7WHERE movie_details.runtime_minutes <= 120;

In the code, we pass the movie_details.imdb_rating into the AVG function. In the first query, we only use movies with a runtime greater than 120 minutes. The second query only uses movies with a runtime less than or equal to 120 minutes.

The output is:

Plain text
1 avg 2-------------------- 3 7.4740740740740741 4(1 row) 5 6 avg 7-------------------- 8 7.0833333333333333 9(1 row)

Using the AVG function with the WHERE clause, we see that movies with a runtime longer than 2 hours tend to have higher IMDb ratings.

Using MAX and MIN to Find Extreme Values

The MAX and MIN functions find the maximum and minimum values of a column. MAX and MIN can accept numbers or even text. The MIN value of a text column is the text that comes first alphabetically. Similarly, MAX finds the text that comes last alphabetically.

SQL
1SELECT MIN(movies.movie_name) AS "First Movie", 2MAX(movies.movie_name) AS "Last Movie" 3FROM movies;

In this code, we pass the movies.movie_name into both the MAX and MIN function.

The output is:

Plain text
1 First Movie | Last Movie 2-------------+---------------------- 3 Ant-Man | Thor: The Dark World 4(1 row)

When listing the movies in alphabetical order, "Ant-Man" comes first and "Thor: The Dark World" comes last.

Summary and Practice Preparation

To recap, in this lesson, we focused on the following key points:

  • Using aggregate functions like SUM(), AVG(), MAX(), and MIN().
  • Using AS to create aliases for aggregate functions
  • Filtering aggregate function input using the WHERE

Next, you'll apply what you've learned through practice exercises to solidify your understanding of these concepts. Happy querying!

Enjoy this lesson? Now it's time to practice with Cosmo!
Practice is how you turn knowledge into actual skills.