Welcome to the lesson on SQL JOINs with PostgreSQL! So far, you’ve learned how to retrieve data using basic SQL queries. Now, we’ll explore how to combine data from multiple tables, which is crucial for complex database querying and analysis.
In this lesson, we’ll use a simplified version of the Marvel movies dataset. Here’s a look at the simplified tables.
Movies Table
movie_id | movie_name |
---|---|
1 | Iron Man |
2 | The Incredible Hulk |
3 | Captain America: The First Avenger |
4 | The Avengers |
Movie Details Table
movie_id | imdb_rating | box_office_million_usd |
---|---|---|
1 | 7.9 | 585.2 |
3 | 6.9 | 370.6 |
4 | 8.0 | 1519.6 |
Characters Table
character_id | movie_id | character_name |
---|---|---|
1 | 1 | Tony Stark/Iron Man |
2 | 1 | Pepper Potts |
3 | 1 | James Rhodes |
4 | 1 | Obadiah Stane |
5 | 3 | Steve Rogers/Captain America |
6 | 3 | Peggy Carter |
7 | 3 | Bucky Barnes/Winter Soldier |
8 | 3 | Johann Schmidt/Red Skull |
9 | 3 | Howard Stark |
10 | 4 | Steve Rogers/Captain America |
11 | 4 | Tony Stark/Iron Man |
12 | 4 | Thor |
13 | 4 | Natasha Romanoff/Black Widow |
14 | 4 | Bruce Banner/Hulk |
15 | 4 | Clint Barton/Hawkeye |
16 | 4 | Loki |
17 | 5 | T'challa |
In order to link these tables together, we will use the movie_id
key which is present in all 3 tables. Let’s dive in and explore how to use different types of JOINs to combine these tables effectively.
SQL JOINs help us combine rows from two or more tables based on related columns among them. There are 4 main categories of joins:
INNER JOIN
- Returns only the rows where there is a match in both tables. It’s useful when retrieving data that exists in both tables.
LEFT JOIN
- Returns all rows from the left table and the matched rows from the right table. If there is no match, the result is
NULL
on the side of the right table.
RIGHT JOIN
- Returns all rows from the right table and the matched rows from the left table. If there is no match, the result is
NULL
on the side of the left table.
FULL JOIN
- Returns all rows when there is a match in either table. Rows without a match in one of the tables will appear with
NULL
s in the places where there is no matching row.
Here is a great way to visualize how these different types of joins. In the diagram below, A
and B
are the tables we are joining and the green areas are the outputs of the different JOIN
s.
The basic syntax for a join query follows this structure:
SQL1SELECT column1, column2, ...
2FROM table1
3JOIN_TYPE table2
4ON table1.column_name = table2.column_name;
SELECT column1, column2, ...
: Specifies the columns you want to retrieve from one or multiple tables.FROM table1
: Indicates the primary table from which to start the query.JOIN_TYPE table2
: Indicates the type of join (INNER JOIN, LEFT JOIN, RIGHT JOIN, FULL JOIN) and the second table to join.ON table1.column_name = table2.column_name
: Specifies the join condition, indicating how the tables are related to each other by comparing columns.
We will cover the details of these queries in more detail later in the course. For now, just focus on how different types of joins affect the output.
An INNER JOIN
returns only the rows where there is a match in both tables.
Suppose, we want to retrieve the movie names and their corresponding IMDb ratings. The movies
and movie_details
tables both have a movie_id
column. This query returns all rows where the movie_id
values match.
SQL1SELECT *
2FROM movies
3INNER JOIN movie_details
4ON movies.movie_id = movie_details.movie_id;
In this query:
- The
movies
table is one side of the join. - The
movie_details
table is the other side of the join. - The
ON
clause specifies the condition to join the tables, which ismovies.movie_id = movie_details.movie_id
.
The output is:
Plain text1 movie_id | movie_name | movie_id | box_office_million_usd | imdb_rating 2----------+------------------------------------+----------+------------------------+------------- 3 1 | Iron Man | 1 | 585.2 | 7.9 4 3 | Captain America: The First Avenger | 3 | 370.6 | 6.9 5 4 | The Avengers | 4 | 1519.6 | 8.0 6(3 rows) 7
The result of this query includes only the movies that have matching entries in the movie_details table. Even though "The Incredible Hulk" is found in the movies
table, it is not in the movie_details
table, so it does not appear in the output.
Next, let’s use a LEFT JOIN
to list movies and their characters, even if a movie doesn’t have a character listed. In a LEFT JOIN
, all rows from the left table are returned, along with matching rows from the right table where available.
SQL1SELECT movies.movie_name, characters.character_name
2FROM movies
3LEFT JOIN characters
4ON movies.movie_id = characters.movie_id;
In this query:
- The
movies
table is the left table. - The
characters
table is the right table. - The
ON
clause specifies the condition to join the tables, which ismovies.movie_id = characters.movie_id
.
The output is:
Plain text1 movie_name | character_name 2------------------------------------+------------------------------ 3Iron Man | Tony Stark/Iron Man 4Iron Man | Pepper Potts 5Iron Man | James Rhodes 6Iron Man | Obadiah Stane 7Captain America: The First Avenger | Steve Rogers/Captain America 8Captain America: The First Avenger | Peggy Carter 9Captain America: The First Avenger | Bucky Barnes/Winter Soldier 10Captain America: The First Avenger | Johann Schmidt/Red Skull 11Captain America: The First Avenger | Howard Stark 12The Avengers | Steve Rogers/Captain America 13The Avengers | Tony Stark/Iron Man 14The Avengers | Thor 15The Avengers | Natasha Romanoff/Black Widow 16The Avengers | Bruce Banner/Hulk 17The Avengers | Clint Barton/Hawkeye 18The Avengers | Loki 19The Incredible Hulk | NULL 20(17 rows)
This result shows all movies, including:
- Movies with multiple characters, which appear multiple times in the result, each time with a different character.
- Movies without any characters listed, which appear with
NULL
in thecharacter_name
column. For example, "The Incredible Hulk" appears in the result even though it does not have any character listed in thecharacters
table.
Now, let’s use a RIGHT JOIN
to ensure we list all characters, even if they don’t have a corresponding movie entry. In a RIGHT JOIN
, all rows from the right table are returned, along with matching rows from the left table where available.
SQL1SELECT movies.movie_name, characters.character_name
2FROM movies
3RIGHT JOIN characters
4ON characters.movie_id = movies.movie_id;
In this query:
- The
movies
table is the left table. - The
characters
table is the right table. - The
ON
clause specifies the condition to join the tables, which ischaracters.movie_id = movies.movie_id
.
The output is:
Plain text1 movie_name | character_name 2------------------------------------+------------------------------ 3Iron Man | Tony Stark/Iron Man 4Iron Man | Pepper Potts 5Iron Man | James Rhodes 6Iron Man | Obadiah Stane 7Captain America: The First Avenger | Steve Rogers/Captain America 8Captain America: The First Avenger | Peggy Carter 9Captain America: The First Avenger | Bucky Barnes/Winter Soldier 10Captain America: The First Avenger | Johann Schmidt/Red Skull 11Captain America: The First Avenger | Howard Stark 12The Avengers | Steve Rogers/Captain America 13The Avengers | Tony Stark/Iron Man 14The Avengers | Thor 15The Avengers | Natasha Romanoff/Black Widow 16The Avengers | Bruce Banner/Hulk 17The Avengers | Clint Barton/Hawkeye 18The Avengers | Loki 19NULL | T'challa 20(17 rows) 21
The result of this query includes:
- All rows from the
characters
table. - The corresponding movie names from the
movies
table for characters where there is a match onmovie_id
. - If no matching movies are found, the
movie_name
column will beNULL
for that character. For example, "T'challa" appears in the result even though it does not have any movie listed in themovies
table.
Lastly, let’s use a FULL JOIN
to retrieve all movies and their characters, even if one of the tables doesn’t have a matching entry for a movie_id
. In a FULL JOIN
, all rows from both tables are returned, along with matching rows where available. If there is no match, NULL
values are used for the missing parts.
SQL1SELECT movies.movie_name, characters.character_name
2FROM movies
3FULL JOIN characters ON movies.movie_id = characters.movie_id;
In this query:
- The
movies
table is one side of the join. - The
characters
table is the other side of the join. - The
ON
clause specifies the condition to join the tables, which ismovies.movie_id = characters.movie_id
.
The output is:
Plain text1 movie_name | character_name 2------------------------------------+------------------------------ 3Iron Man | Tony Stark/Iron Man 4Iron Man | Pepper Potts 5Iron Man | James Rhodes 6Iron Man | Obadiah Stane 7Captain America: The First Avenger | Steve Rogers/Captain America 8Captain America: The First Avenger | Peggy Carter 9Captain America: The First Avenger | Bucky Barnes/Winter Soldier 10Captain America: The First Avenger | Johann Schmidt/Red Skull 11Captain America: The First Avenger | Howard Stark 12The Avengers | Steve Rogers/Captain America 13The Avengers | Tony Stark/Iron Man 14The Avengers | Thor 15The Avengers | Natasha Romanoff/Black Widow 16The Avengers | Bruce Banner/Hulk 17The Avengers | Clint Barton/Hawkeye 18The Avengers | Loki 19NULL | T'challa 20The Incredible Hulk | NULL 21(18 rows)
The result of this query includes:
- All rows from the
movies
table. - All rows from the
characters
table. - The corresponding characters from the
characters
table for movies where there is a match onmovie_id
. - The corresponding movie names from the
movies
table for characters where there is a match onmovie_id
. - If no matching characters are found, the
character_name
column will beNULL
for that movie. - If no matching movies are found, the
movie_name
column will beNULL
for that character.
In this lesson, you learned about different types of SQL JOINs and how to use them to combine data from multiple tables. You've practiced using INNER JOIN
, LEFT JOIN
, RIGHT JOIN
, and FULL JOIN
.
Now, it's time to practice! In the exercises, we will use an expanded dataset with more movies, characters, and movie details. Keep practicing to solidify your understanding!