Welcome back! In the previous lesson, we explored the COUNT
function for quantitative analysis in PostgreSQL. Today, we'll dive into another essential SQL feature: the DISTINCT
clause.
The DISTINCT
clause is used to return unique values from a column, helping you find and filter out duplicate data. For example, the characters table has 243 rows, but some of those characters appear in multiple movies. Therefore, they appear multiple times in the table. To find out how many distinct character names there are, we can use the DISTINCT
function.
For convenience, the tables and columns in our Marvel movies dataset are:
Movies Table
Movie Details Table
Characters Table
Let's start by understanding the syntax of the DISTINCT
clause:
Here's a quick breakdown of the syntax:
SELECT DISTINCT
: This part of the command tells PostgreSQL that you want to select unique values from a specific column.column_name
: Replace this with the name of the column you want to filter for unique values.FROM table_name
: Replace this with the name of the table containing the data.
Now, let's see it in action with a practical example.
Suppose we want to get a list of unique character names. The SQL query is:
SELECT DISTINCT character_name
: This part selects uniquecharacter_name
values from thecharacters
table.FROM characters
: Specifies thecharacters
table where the data is located.
When you run this query, you will get a table of unique character names. The first 20 entries of the output are:
By using the DISTINCT
clause, we eliminate the duplicate entries and get a list of unique character names. The output tells us that there are 147 unique character names in the characters
table.
To combine COUNT
and DISTINCT
, you can use the COUNT function with the DISTINCT keyword inside it. This allows you to count the number of unique values in a column without listing the whole output table.
Here's the syntax:
The query to count the number of unique character names is:
COUNT(DISTINCT character_name)
: This part counts the number of uniquecharacter_name
values in thecharacters
table.AS unique_character_count
: This assigns an aliasunique_character_count
to the resulting count.FROM characters
: Specifies thecharacters
table where the data is located.
The following output shows that there are 147 distinct character names.
Today, you learned how to use the DISTINCT
clause to extract unique values from a column in PostgreSQL. Let's summarize the key points:
- The
DISTINCT
clause helps eliminate duplicate rows from queries. - We walked through a practical example of selecting unique character names from a table.
- We combined the
COUNT
function andDISTINCT
clause to find the number of unique character names
Now, you're ready to apply what you've learned in the upcoming practice exercises. These exercises will help reinforce your understanding and give you hands-on experience with the DISTINCT
clause. Keep up the great work, and let's continue our journey into mastering SQL functions and clauses with PostgreSQL!
