Introduction to the DISTINCT Clause

Welcome back! In the previous lesson, we explored the COUNT function for quantitative analysis in PostgreSQL. Today, we'll dive into another essential SQL feature: the DISTINCT clause.

The DISTINCT clause is used to return unique values from a column, helping you find and filter out duplicate data. For example, the characters table has 243 rows, but some of those characters appear in multiple movies. Therefore, they appear multiple times in the table. To find out how many distinct character names there are, we can use the DISTINCT function.

Dataset Review

For convenience, the tables and columns in our Marvel movies dataset are:

Movies Table

Movie Details Table

Characters Table

Basic Syntax of DISTINCT

Let's start by understanding the syntax of the DISTINCT clause:

Here's a quick breakdown of the syntax:

  • SELECT DISTINCT: This part of the command tells PostgreSQL that you want to select unique values from a specific column.
  • column_name: Replace this with the name of the column you want to filter for unique values.
  • FROM table_name: Replace this with the name of the table containing the data.

Now, let's see it in action with a practical example.

Selecting Unique Character Names

Suppose we want to get a list of unique character names. The SQL query is:

  • SELECT DISTINCT character_name: This part selects unique character_name values from the characters table.
  • FROM characters: Specifies the characters table where the data is located.

When you run this query, you will get a table of unique character names. The first 20 entries of the output are:

By using the DISTINCT clause, we eliminate the duplicate entries and get a list of unique character names. The output tells us that there are 147 unique character names in the characters table.

Combining DISTINCT and COUNT

To combine COUNT and DISTINCT, you can use the COUNT function with the DISTINCT keyword inside it. This allows you to count the number of unique values in a column without listing the whole output table.

Here's the syntax:

The query to count the number of unique character names is:

  • COUNT(DISTINCT character_name): This part counts the number of unique character_name values in the characters table.
  • AS unique_character_count: This assigns an alias unique_character_count to the resulting count.
  • FROM characters: Specifies the characters table where the data is located.

The following output shows that there are 147 distinct character names.

Summary and Next Steps

Today, you learned how to use the DISTINCT clause to extract unique values from a column in PostgreSQL. Let's summarize the key points:

  • The DISTINCT clause helps eliminate duplicate rows from queries.
  • We walked through a practical example of selecting unique character names from a table.
  • We combined the COUNT function and DISTINCT clause to find the number of unique character names

Now, you're ready to apply what you've learned in the upcoming practice exercises. These exercises will help reinforce your understanding and give you hands-on experience with the DISTINCT clause. Keep up the great work, and let's continue our journey into mastering SQL functions and clauses with PostgreSQL!

Sign up
Join the 1M+ learners on CodeSignal
Be a part of our community of 1M+ users who develop and demonstrate their skills on CodeSignal