Introduction: Combining Filters and Vector Search

Welcome to the first lesson of the course, where we begin our journey into advanced querying with pgvector. In this lesson, you will learn how to combine traditional SQL filters with vector similarity search. This is a powerful technique that allows you to narrow down your search results using product attributes like category, stock, and price, and then rank those results by how semantically similar they are to a given query. This approach is common in real-world applications, such as e-commerce platforms, where users want to find products that not only match certain criteria but are also most relevant to their needs. By the end of this lesson, you will be able to write queries that combine these two types of filtering, setting a strong foundation for more advanced techniques in the rest of the course.

How Vector Search Works in SQL with pgvector

Before we dive into combining filters, let’s quickly review how vector search works in SQL using the pgvector extension. In pgvector, you can store vector embeddings — numerical representations of text or other data — in a special column type. To compare how similar two vectors are, you use the <-> operator, which calculates the L2 distance between them. The smaller the distance, the more similar the vectors are.

For example, if you have a table of products and each product has an embedding column, you can find the products most similar to a query embedding like this:

This query returns the 10 products whose embeddings are closest to the query embedding. The ${QUERY_EMBEDDING} is a placeholder for the vector you want to compare against, such as the embedding of a search phrase.

Adding Multiple Filters to a Vector Search Query

Now, let’s see how you can add traditional SQL filters to your vector search. In real-world scenarios, you often want to filter your data by certain attributes before ranking by similarity. For example, you might want to consider only products in the "Electronics" category, with more than 100 items in stock, and a price above $300. By applying these filters first, you reduce the number of items that need to be compared by vector similarity, which can improve both the relevance and performance of your search.

In SQL, you use the WHERE clause to add these filters. You can combine multiple conditions using AND to make sure all criteria are met before the vector similarity is calculated.

Step-by-Step Example: Writing a Combined Query

Let’s walk through a complete example that combines multiple filters with vector search, this time using a join to reference category names from a separate categories table. Here is the query:

In this query, we are selecting the product_id and product_name from the products table (aliased as p). We use a JOIN to connect the products table to the categories table (aliased as c) using the category_id field. The WHERE clause applies three filters: the product must belong to the "Electronics" category (as found in the categories table), have a stock_quantity greater than 100, and a price greater than 300. Only products that meet all these conditions will be considered in the next step.

After filtering, the ORDER BY p.embedding <-> ${QUERY_EMBEDDING} part sorts the remaining products by how close their embedding is to the query embedding. The ensures that only the top 10 most similar products are returned.

Summary and What’s Next

In this lesson, you learned how to combine multiple SQL filters with vector similarity search using pgvector. By filtering your data first and then ranking by vector similarity, you can create more targeted and efficient search queries. This technique is especially useful in applications where both metadata and semantic relevance matter. In the next set of exercises, you’ll get hands-on practice writing and modifying queries like the one we covered here. This will help you build confidence in combining filters and vector search, preparing you for even more advanced querying techniques in the rest of the course.

Sign up
Join the 1M+ learners on CodeSignal
Be a part of our community of 1M+ users who develop and demonstrate their skills on CodeSignal