Welcome to this lesson on aggregate functions in PostgreSQL. In the previous lessons, we covered how to use the COUNT function for quantitative analysis and utilized the DISTINCT clause to explore unique values in our data. Now, we will dive into more advanced data analysis techniques. In this lesson, we will explore the SUM, AVG, MAX and MIN functions.
By the end of this lesson, you'll be able to analyze different aspects of the movie data, like calculating total box office earnings, average IMDb ratings, and identifying extreme values.
For convenience, the tables and columns in our Marvel movies dataset are:
Movies Table
Movie Details Table
Characters Table
Here is the basic syntax for using aggregate functions.
We want to find the total box office sales for all 33 movies in our dataset. To do this, we use the SUM() aggregate function to add the values of box_office_million_usd of each row in the movie_details table. We also want the resulting table to have a column named "total_sales". The query is:
In this code, we pass the movie_details.box_office_million_usd column into the SUM function. We use AS to create an alias for the column called "Total Sales".
The result is:
Using the SUM function, we can see that the total box office sales from all 33 movies is 30481.9 million dollars ($30,481,900,000).
Now, let's use the AVG function, to find the average IMDb rating for movies with a runtime longer than 120 minutes and less than or equal to 120 minutes.
The queries are:
In the code, we pass the movie_details.imdb_rating into the AVG function. In the first query, we only use movies with a runtime greater than 120 minutes. The second query only uses movies with a runtime less than or equal to 120 minutes.
The output is:
Using the AVG function with the WHERE clause, we see that movies with a runtime longer than 2 hours tend to have higher IMDb ratings.
The MAX and MIN functions find the maximum and minimum values of a column. MAX and MIN can accept numbers or even text. The MIN value of a text column is the text that comes first alphabetically. Similarly, MAX finds the text that comes last alphabetically.
In this code, we pass the movies.movie_name into both the MAX and MIN function.
The output is:
When listing the movies in alphabetical order, "Ant-Man" comes first and "Thor: The Dark World" comes last.
To recap, in this lesson, we focused on the following key points:
- Using aggregate functions like
SUM(),AVG(),MAX(), andMIN(). - Using
ASto create aliases for aggregate functions - Filtering aggregate function input using the
WHERE
Next, you'll apply what you've learned through practice exercises to solidify your understanding of these concepts. Happy querying!
