Introduction and Overview

Welcome back! In our previous lessons, we've explored various ways to query and analyze data using PostgreSQL. We began by mastering text-based queries, moved on to understanding subqueries, and learned to create new columns and perform mathematical operations.

In this lesson, we'll dive deeper into complex queries and conditional logic:

  1. Using the CASE statement to categorize data.
  2. Combining multiple conditions in the WHERE clause.
  3. Enhancing queries with subqueries.

These techniques will help you make your data queries more powerful and insightful.

Dataset Review

For convenience, the tables and columns in our Marvel movies dataset are:

Movies Table

Movie Details Table

Characters Table

Understanding the CASE Statement in PostgreSQL

The CASE statement allows you to perform conditional logic directly within your queries. It's similar to an IF-THEN-ELSE statement in programming languages.

The syntax of a CASE statement is as follows:

  • CASE: Begins the conditional logic block.
  • WHEN condition1 THEN result1: If condition1 is true, the statement returns result1.
  • WHEN condition2 THEN result2: If condition2 is true, the statement returns result2.
  • ELSE resultN: If none of the conditions are true, the statement returns resultN. This part is optional but recommended for completeness.
  • END: Ends the CASE block.
  • AS alias_name: (Optional) Provides a temporary name for the resulting column.

Let's see a real-world application of the CASE statement. We'll categorize movies based on their box office earnings into three categories: Flop, Hit, and Blockbuster.

Categorizing Movies by Box Office Earnings

We want to categorize movies as either a "Flop", "Hit", or "Blockbuster" based on box office sales. The query is:

Let's break down this query:

  • We use INNER JOIN to combine data from the movies and movie_details tables based on movie_id.
  • The SELECT statement retrieves movie names.
  • CASE begins the conditional logic block to categorize the movies based on their box office earnings.
    • WHEN movie_details.box_office_million_usd < 400 THEN 'Flop': If the movie's box office earnings are less than 400 million USD, the movie is categorized as 'Flop'.
    • WHEN movie_details.box_office_million_usd BETWEEN 400 AND 1000 THEN 'Hit': If the movie's box office earnings are between 400 and 1000 million USD (inclusive), the movie is categorized as 'Hit'.
    • ELSE 'Blockbuster': If the movie's box office earnings are greater than 1000 million USD, the movie is categorized as 'Blockbuster'.
    • END AS success_category: Ends the CASE block and gives the result column the alias success_category.

The output is:

From the output, we can see that "The Incredible Hulk" is listed as a 'Flop', "Iron Man" is listed as a 'Hit', and "The Avengers" is listed as a 'Blockbuster'.

This example demonstrates how you can use the CASE statement to derive meaningful categories from numerical data.

Putting It All Together: Super Query

Congratulations on reaching the last lesson of the "Intro to SQL with PostgreSQL" course path! Now let's put your PostgreSQL super powers to the test. In this example, we combine every query technique we have learned.

We want to create a table with the following columns:

  • Movie Name: The name of the movie.
  • Success Level: Categorized as 'Flop', 'Hit', or 'Blockbuster' based on the box office earnings.
  • Profit in Millions: The difference between the box office earnings and the budget, representing the movie's profit.
  • Average Screen Time: The rounded average screen time of characters in the movie, displayed in minutes with 'mins' appended.

Only movies with an IMDb rating higher than average should be included, and the output should be sorted by profit in descending order.

Let's take a look at this super query:

Let's break this query down line by line:

SELECT Clause:

  • Retrieves the movie_name.
  • Uses a CASE statement to determine the "Success Level" of each movie based on its box_office_million_usd:
    • Flop for earnings less than 400 million USD.
    • Hit for earnings between 400 and 1000 million USD.
    • Blockbuster for earnings greater than 1000 million USD.
  • Computes the "Profit in Millions" by subtracting the budget_million_usd from the box_office_million_usd.
  • Calculates the "Average Screen Time" for characters in the movie by rounding the average screen_time_minutes to the nearest whole number and appending 'mins'.

FROM Clause:

  • Uses the movies table as the base dataset.

INNER JOIN:

  • Combines movies with movie_details using movie_id.
  • Combines movies with characters also using movie_id.

WHERE Clause:

  • Filters the results to include only movies where the imdb_rating is higher than the average IMDb rating of all movies.
  • The subquery (SELECT AVG(movie_details.imdb_rating) FROM movie_details) calculates the average IMDb rating across all movies.

GROUP BY Clause:

  • Groups the results by movie_name, box_office_million_usd, and budget_million_usd to aggregate the average screen time correctly. When using aggregate functions, all non-aggregate columns in the SELECT statement must be included in the GROUP BY clause.

ORDER BY Clause:

  • Orders the results based on "Profit in Millions" in descending order to list the most profitable movies first.

The output is:

The output is a list of movies including their name, success level, profit in millions, and average screen time, filtered to include only those movies with above-average IMDb ratings and sorted by profit in descending order.

Notice that none of the movies are labeled as a flop. Remember that we filtered out all movies with an IMDb rating less than average. From our output, we know that any movie labeled as a flop was excluded from our table. This means that all movies that made less than 400 million dollars in profit have an IMDb rating lower than average.

Summary and Next Steps

Congratulations on reaching the final lesson in the "Intro to SQL with PostgreSQL" course path. You have now mastered the key techniques for dealing with complex queries and conditional logic in PostgreSQL. You've come a long way in unlocking your PostgreSQL Super Powers! Keep exploring and practicing to further enhance your skills.

Sign up
Join the 1M+ learners on CodeSignal
Be a part of our community of 1M+ users who develop and demonstrate their skills on CodeSignal