Welcome back! In the previous lesson, you learned how to combine traditional SQL filters — such as category, stock, and price — with vector similarity search using pgvector. This approach allowed you to narrow down your results using product attributes and then rank them by semantic relevance.
In this lesson, we will take your querying skills a step further by introducing full-text search into the mix. Specifically, you will learn how to filter your results using keywords in product descriptions and then use vector similarity to rank those results. This combination is especially useful when you want to find items that not only match a certain topic or keyword but are also semantically similar to your search intent.
By the end of this lesson, you will be able to write queries that combine keyword filtering and vector similarity, making your searches even more powerful and precise.
Let’s look at the query you will master in this lesson:
This query does two important things. First, it uses ILIKE '%AI%'
in the WHERE
clause to filter products whose descriptions contain the word "AI," regardless of case. The ILIKE
operator is a case-insensitive version of LIKE
, which is helpful when you want to match keywords without worrying about capitalization.
Second, it adds another condition: embedding <-> ${QUERY_EMBEDDING} < 0.5
. This means that only products whose vector embedding is within a certain distance (in this case, less than 0.5) from your query embedding will be considered. The <->
operator calculates the distance between two vectors, and a smaller value means higher similarity.
After filtering, the query orders the results by similarity using ORDER BY embedding <-> ${QUERY_EMBEDDING}
and limits the output to the top 10 most similar products.
Let’s see how this works in practice. Imagine you have a products
table with various items, and you want to find products that mention "AI" in their descriptions and are also semantically similar to your search intent. Here’s the query again:
Suppose your products
table looks like this:
If your ${QUERY_EMBEDDING}
represents a search for "AI home devices," the query will first filter out any products whose descriptions do not contain "AI" (so "Fitness Tracker" and "Blender" are excluded). Then, among the remaining products, it will keep only those whose embedding is within 0.5 distance of your query embedding. Finally, it will sort these by similarity and return the top 10. The output might look like this:
This result shows you the most relevant "AI" products, ranked by how closely they match your search intent.
You can easily adjust this query to fit different needs. For example, if you want to search for products related to "robotics," you can change the keyword in the ILIKE
clause to '%robotics%'
. If you want to be more or less strict about how similar the products need to be, you can adjust the similarity threshold from 0.5
to another value, such as 0.3
for stricter matches or 0.7
for broader results.
The ${QUERY_EMBEDDING}
variable can also be changed to represent different search intents, depending on what you are looking for. Each of these changes will affect which products are returned and how they are ranked, giving you fine control over your search results.
In this lesson, you learned how to combine full-text keyword search with vector similarity in a single SQL query using pgvector. By filtering products with a keyword in their descriptions and then ranking them by semantic similarity, you can create highly targeted and relevant search experiences.
This technique is especially useful when you want to ensure that your results are both topically and semantically aligned with your search intent. In the next set of practice exercises, you will get hands-on experience writing and modifying queries like the one we covered here. This will help you build confidence in combining full-text and vector search, preparing you for even more advanced querying techniques in the rest of the course.
