Welcome back! In the previous lesson, you learned how to inspect your tables and indexes in PostgreSQL with pgvector. You now know how to check storage usage, view index definitions, and analyze index usage statistics. These skills are essential for understanding the current state of your database and making informed decisions about optimization.
Now, let’s move forward and focus on how to get the best performance from your vector search queries. In real-world applications, you often need to balance speed and accuracy when searching for similar vectors. This is where tuning approximate indexes comes in. By adjusting certain parameters, you can control how much time the database spends searching and how accurate the results are. In this lesson, I will show you how to tune and run queries using the two main approximate index types in pgvector: IVFFlat and HNSW. You will see how to set the key parameters for each index and how these settings affect your search results.
When running vector search queries with approximate indexes, there are two important parameters you should know about: `ivfflat.probes` for IVFFlat indexes and `hnsw.ef_search` for HNSW indexes. These parameters let you control the trade-off between search speed and accuracy. In this context, accuracy refers to how closely the results of your approximate search match the true nearest neighbors you would get from an exact search. Measuring accuracy is important when tuning parameters, as it helps you understand whether faster queries are sacrificing too much result quality. You can estimate accuracy by comparing the results of your approximate search to those from an exact (brute-force) search on the same data.
The `ivfflat.probes` parameter determines how many clusters the database will search when using an IVFFlat index. A higher value means the search will be more accurate, but it will also take more time. A lower value makes the search faster, but you might miss some good matches.
The `hnsw.ef_search` parameter is similar, but it is used with HNSW indexes. It controls how many candidate vectors are considered during the search. Increasing `ef_search` usually improves the quality of the results, but it also increases the query time.
By tuning these parameters, you can find the right balance for your application — whether you need the fastest possible results or the most accurate matches.
Let’s look at a practical example of tuning and running a query with an IVFFlat index. Suppose you want to search for the top 10 products most similar to a given embedding using L2 distance. In this example, we’ll use the query text "solution-oriented framework for future scalability". The embedding for this query will be calculated from the text in a file called `query.txt`.
Before running the query, you can set the `ivfflat.probes` parameter to control how many clusters are searched.
Note: If you have both IVFFlat and HNSW indexes on the `embedding` column, this query will use the IVFFlat index because it uses the `<->` operator (L2 distance) and the `ivfflat.probes` parameter.
Here is how you would do this in SQL:
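The snippet below is a sketch of what this looks like. The table and column names (`products`, `id`, `name`, `embedding`) and the `${QUERY_EMBEDDING}` placeholder, which stands in for the embedding computed from `query.txt`, are illustrative and may differ in your schema.

```sql
-- Search 10 clusters (lists) when scanning the IVFFlat index
SET ivfflat.probes = 10;

-- Order products by L2 distance (<->) to the query embedding
SELECT id, name
FROM products
ORDER BY embedding <-> '${QUERY_EMBEDDING}'
LIMIT 10;
```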
In this example, the first line sets the number of probes to 10. This means the database will search 10 clusters for the nearest neighbors. The second part is the actual query, which orders the products by their L2 distance to your query embedding (calculated from the text "solution-oriented framework for future scalability" in `query.txt`) and returns the top 10 results.
If you run this query, you might see output like:
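The exact rows depend entirely on your data; the product names below are hypothetical, shown only to illustrate the shape of a typical `psql` result:

```
 id |        name
----+---------------------
 17 | Example Product A
  4 | Example Product B
 ...
(10 rows)
```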
By adjusting the value of `ivfflat.probes`, you can make the search faster or more accurate, depending on your needs. If you set it to a higher number, you may get better results, but the query will take longer.
Now let’s see how to tune and run a query using an HNSW index. With HNSW, the key parameter is `hnsw.ef_search`. This controls how many candidate vectors are explored during the search. A higher value usually means better recall (more accurate results), but it can also slow down the query.
For this example, we’ll also use the query text "solution-oriented framework for future scalability", with the embedding calculated from the text in `query.txt`.
Note: If you have both IVFFlat and HNSW indexes on the `embedding` column, this query will use the HNSW index because it uses the `<=>` operator (cosine distance) and the `hnsw.ef_search` parameter.
Here is an example of how to set this parameter and run a query using cosine distance:
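A sketch of this query follows. As before, the table and column names are illustrative, and `${QUERY_EMBEDDING}` is a placeholder for the embedding computed from `query.txt`:

```sql
-- Explore up to 100 candidate vectors during the HNSW graph search
SET hnsw.ef_search = 100;

-- Order products by cosine distance (<=>) to the query embedding
SELECT id, name
FROM products
ORDER BY embedding <=> '${QUERY_EMBEDDING}'
LIMIT 10;
```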
In this example, the first line sets `ef_search` to 100. The query then finds the 10 products whose embeddings are closest to your query embedding (again, calculated from the text in `query.txt`), using cosine distance. The `${QUERY_EMBEDDING}` is a placeholder for the vector you want to compare against, such as the embedding of a search phrase.
The output might again look like this:
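As with the IVFFlat example, the rows here are hypothetical and only illustrate the result shape; note that with a different distance metric, the ranking of products may differ from the L2 results:

```
 id |        name
----+---------------------
 17 | Example Product A
 23 | Example Product C
 ...
(10 rows)
```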
In some cases, especially when working with large datasets or during index maintenance, you might want to adjust certain PostgreSQL settings to improve performance. Two settings you may encounter are `max_parallel_maintenance_workers` and `maintenance_work_mem`.
The `max_parallel_maintenance_workers` setting controls how many parallel workers PostgreSQL can use for maintenance tasks, such as building indexes. Increasing this value can speed up index creation if your system has enough CPU resources.
The `maintenance_work_mem` setting determines how much memory is available for maintenance operations. Setting this to a higher value can also help speed up index creation and maintenance, especially for large tables.
Here is how you might set these parameters:
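A minimal sketch, assuming your system has the CPU and memory headroom for these values (the specific numbers are illustrative, not recommendations):

```sql
-- Allow up to 4 parallel workers for index builds and other maintenance
SET max_parallel_maintenance_workers = 4;

-- Give maintenance operations up to 512MB of working memory
SET maintenance_work_mem = '512MB';
```

You would typically run these just before a `CREATE INDEX` statement in the same session, since `SET` applies only to the current session.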
These settings are usually adjusted by database administrators and are most useful during index creation or maintenance, not during regular query execution. On CodeSignal, these settings are informational, as the environment is already configured for you, but it is good to know about them for your own projects.
In this lesson, you learned how to tune and run vector search queries using approximate indexes in pgvector. You saw how to set the `ivfflat.probes` parameter for IVFFlat indexes and the `hnsw.ef_search` parameter for HNSW indexes, and you practiced running queries with each. You also learned about optional admin settings that can help with performance during index maintenance.
Understanding how to tune these parameters will help you get the best balance of speed and accuracy for your vector search applications. In the next set of practice exercises, you will have the chance to try out these queries and see how changing the parameters affects your results. This hands-on experience will help you become more confident in optimizing and scaling your own vector search systems.
