Welcome back! In the previous lesson, we explored the COUNT
function for quantitative analysis in PostgreSQL. Today, we'll dive into another essential SQL feature: the DISTINCT
clause.
The DISTINCT
clause is used to return unique values from a column, helping you find and filter out duplicate data. For example, the characters table has 243 rows, but some of those characters appear in multiple movies. Therefore, they appear multiple times in the table. To find out how many distinct character names there are, we can use the DISTINCT
function.
For convenience, the tables and columns in our Marvel movies dataset are:
Movies Table
Plain text1 movie_id | movie_name | release_date | phase 2----------+------------+--------------+-------
Movie Details Table
Plain text1 movie_id | budget_million_usd | box_office_million_usd | imdb_rating | runtime_minutes 2----------+--------------------+------------------------+-------------+-----------------
Characters Table
Plain text1 character_id | movie_id | character_name | actor | screen_time_minutes 2--------------+----------+----------------+-------+---------------------
Let's start by understanding the syntax of the DISTINCT
clause:
SQL1SELECT DISTINCT column_name 2FROM table_name;
Here's a quick breakdown of the syntax:
SELECT DISTINCT
: This part of the command tells PostgreSQL that you want to select unique values from a specific column.column_name
: Replace this with the name of the column you want to filter for unique values.FROM table_name
: Replace this with the name of the table containing the data.
Now, let's see it in action with a practical example.
Suppose we want to get a list of unique character names. The SQL query is:
SQL1SELECT DISTINCT character_name 2FROM characters;
SELECT DISTINCT character_name
: This part selects uniquecharacter_name
values from thecharacters
table.FROM characters
: Specifies thecharacters
table where the data is located.
When you run this query, you will get a table of unique character names. The first 20 entries of the output are:
Plain text1 character_name 2------------------------------------- 3 Stephen Strange/Doctor Strange 4 JARVIS 5 Mysterio 6 Adrian Toomes/Vulture 7 Karl Mordo 8 Christine Palmer 9 Nebula 10 Rocket 11 Gamora 12 Phil Coulson 13 Makkari 14 Flash Thompson 15 Drax the Destroyer 16 Gorr the God Butcher 17 Ned Leeds 18 Max Dillon/Electro 19 Odin 20 Everett Ross 21 Ronan the Accuser 22 Pepper Potts 23.... 24(147 rows)
By using the DISTINCT
clause, we eliminate the duplicate entries and get a list of unique character names. The output tells us that there are 147 unique character names in the characters
table.
To combine COUNT
and DISTINCT
, you can use the COUNT function with the DISTINCT keyword inside it. This allows you to count the number of unique values in a column without listing the whole output table.
Here's the syntax:
SQL1SELECT COUNT(DISTINCT column_name) 2FROM table_name;
The query to count the number of unique character names is:
SQL1SELECT COUNT(DISTINCT character_name) AS unique_character_count 2FROM characters;
COUNT(DISTINCT character_name)
: This part counts the number of uniquecharacter_name
values in thecharacters
table.AS unique_character_count
: This assigns an aliasunique_character_count
to the resulting count.FROM characters
: Specifies thecharacters
table where the data is located.
The following output shows that there are 147 distinct character names.
Plain text1 unique_character_count 2------------------------ 3 147 4(1 row)
Today, you learned how to use the DISTINCT
clause to extract unique values from a column in PostgreSQL. Let's summarize the key points:
- The
DISTINCT
clause helps eliminate duplicate rows from queries. - We walked through a practical example of selecting unique character names from a table.
- We combined the
COUNT
function andDISTINCT
clause to find the number of unique character names
Now, you're ready to apply what you've learned in the upcoming practice exercises. These exercises will help reinforce your understanding and give you hands-on experience with the DISTINCT
clause. Keep up the great work, and let's continue our journey into mastering SQL functions and clauses with PostgreSQL!