Introduction and Overview

Welcome to the first lesson of our course on Advanced Query Techniques and Conditional Logic in PostgreSQL. In this lesson, we will focus on mastering text-based queries.

By the end of this lesson, you will know how to search for patterns in text and handle case sensitivity effectively using PostgreSQL.

Dataset Overview

This course uses the Marvel movies dataset from previous courses. The dataset has a movies table, movie_details table, and a characters table. As a refresher, the tables are:

Movies Table

The movies table includes the first 33 Marvel movies. Each entry in the table includes a value for movie_id, movie_name, release_date and phase. The movie_id column corresponds with the movie_id columns from the movie_details and characters tables.

Movie Details Table

The movie_details table includes details for the first 33 Marvel movies. Each entry in the table includes a value for movie_id, budget_million_usd, box_office_million_usd, imdb_rating, and runtime_minutes. The movie_id column corresponds with the movie_id columns from the movies and characters tables.

Characters Table

The characters table contains 243 entries for characters that appear in the first 33 Marvel movies. Each entry has a character_id, movie_id, character_name, actor, and screen_time_minutes. The movie_id column corresponds with the movie_id column from the movies and movie_details table.

Now that we understand the dataset, let's dive into text-based queries.

Introduction to Pattern Matching using %

Pattern matching in PostgreSQL is primarily facilitated through the use of the % wildcard within the LIKE and ILIKE operators. The % wildcard matches any sequence of characters, including an empty sequence.

The LIKE operator performs case-sensitive pattern matching while the ILIKE operator performs case-insensitive pattern matching.

Placing % at the end of a text query matches any string that starts with the text placed before %. For example, the pattern A% matches any string that starts with the letter "A" followed by any sequence of characters, including an empty sequence.

Placing % at the start of a text query matches any string that ends with the text placed after %. For example, the pattern %a matches any string that ends with the letter "a".

Finding Characters Starting with "A"

Suppose we want to find characters with names that begin with "A". The query is:

The output is:

Here, any character name starting with "A" (case-sensitive) followed by any sequence of characters is selected.

Find Characters Ending with "a"

The query to find character names that end with "a" (case-sensitive) preceded by any sequence of characters is:

The output is:

Any character name ending with "a" will be included, regardless of the preceding sequence of characters.

Exact Matches

In PostgreSQL, you can use the = operator to find exact matches in text-based queries. This operator ensures only entries that exactly match the specified text are returned.

To find exact matches for the character name "Tony Stark", you can use the following query:

The output is:

This query returns only the entries where the character_name is exactly "Tony Stark", with no additional characters or variations in case.

Substring Matches

Placing % at the beginning and end of a text finds all entries that contain that substring. The % at the beginning means any sequence of characters (including an empty sequence) can come before the substring, and the % at the ends means any sequence of characters (including an empty sequence) can come after the substring. Let's take a look at an example:

The output of this query is:

The output indicates that there are two distinct entries in the character_name column that contain "Tony Stark": "Tony Stark" and "Tony Stark/Iron Man".

"Tony Stark/Iron Man" did not appear in the query containing WHERE character_name = 'Tony Stark' because "Tony Stark/Iron Man" is not an exact match.

Exploring Case Sensitivity with ILIKE

PostgreSQL provides two pattern-matching functions: LIKE and ILIKE. LIKE performs case-sensitive pattern matching, while ILIKE performs case-insensitive pattern matching.

Let's revisit the example of finding the characters whose names begin with "A". In the characters table, the first letter of each character's name is capitalized.

The above query does not return any entries because no character names start with lowercase "a". The output is:

We can use ILIKE to find all names that begin with the letter "A"/"a" regardless of case.

The output of this query is:

The output shows how ILIKE can be used to query text without regard to case.

Handling Case Sensitivity

Sometimes, you might want to handle case sensitivity explicitly. We can do this using the UPPER and LOWER functions. These functions return their input in all capital letters and all lowercase letters, respectively.

To find names that are fully uppercase, we use the UPPER function as follows:

UPPER(character_name) returns the character_name in all caps. Using WHERE character_name = UPPER(character_name);, we find all character names that remain unchanged when passed into the UPPER function.

The output is:

The output shows the names of characters that only contain capital letters.

Similarly, to find names in lowercase, we use the LOWER function.

The output is:

This query returns an empty table, as no actor names are in all lowercase.

Summary and Next Steps

In this lesson, we:

  • Reviewed the Marvel movies database
  • Explored pattern matching using % and =
  • Found exact matches and substring matches in text data
  • Handled case sensitivity using the UPPER and LOWER functions

You are now well-equipped to handle text-based queries in PostgreSQL. The next steps involve practicing these queries through exercises to reinforce your understanding. Happy querying!

Sign up
Join the 1M+ learners on CodeSignal
Be a part of our community of 1M+ users who develop and demonstrate their skills on CodeSignal