Welcome back! In the last lesson, you learned how to embed document chunks and store them in a vector database. Now, you are ready to make your smart email assistant even more useful.
Imagine you receive a new email and want your assistant to reply with helpful information. To do this, the assistant needs to find the most relevant pieces of information from all the documents you have processed so far. This is where retrieving relevant context comes in.
In this lesson, you will learn how to use semantic search to find the most useful document chunks for a given email. This is a key step in making your assistant truly smart and responsive.
Let's quickly remind ourselves of two important ideas:
- Embeddings: These are special lists of numbers (vectors) that represent the meaning of text. You learned how to create them from document chunks in the last lesson.
- Vector Databases: These are databases designed to store and search embeddings efficiently. You used LibSQLVector to store your chunk embeddings.
Why do we use them? Because searching by embeddings lets us find text that is similar in meaning, not just in exact words. This is called semantic search.
When a new email arrives, the first step is to turn its content into an embedding. This embedding will help us search for similar content in our database.
Let's see how to do this step by step.
First, suppose you have an email object with a body property:
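A minimal sketch of such an object (the field names besides `body` are assumptions for illustration):

```typescript
// Hypothetical incoming email -- only `body` is needed for the steps below
const email = {
  from: "client@example.com",
  subject: "Project update request",
  body: "Hi, could you send me the latest status of the project?",
};

console.log(email.body);
```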
To create an embedding for this email, you use the embed method from your AI library:
What does this do?
- The `embed` method takes a text string and converts it into a numerical vector that represents its meaning.
- The `value` parameter is the text you want to convert to an embedding.
- The `model` parameter specifies which embedding model to use.
- The result, `queryEmbedding`, is the embedding vector for your email.
At this point, you have a vector that represents the meaning of the email. This is what you will use to search for similar document chunks.
Now that you have the query embedding, you can search your vector database for the most relevant document chunks.
Here's how you do it using the query method provided by your vector database:
What does this do?
- `storage` is your LibSQLVector database instance.
- `indexName: "embedding"` tells it which vector index to search in.
- `queryVector` is the embedding you want to find similar content for.
- `topK: 5` means you want the five most relevant chunks returned.
- The method returns the most similar document chunks based on vector similarity.
The `relevantChunks` variable will now contain the top five document chunks that are most similar in meaning to your email.
Let's print out the relevant context:
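One way to do that, assuming each result carries a similarity `score` and the original text under `metadata.text` (the result shape and the sample chunk below are hypothetical, for illustration only):

```typescript
// Hypothetical result from storage.query -- shape assumed: { id, score, metadata }
const relevantChunks = [
  { id: "chunk-1", score: 0.91, metadata: { text: "Milestone 2 of Project X was completed on May 3." } },
];

// Print each chunk with its similarity score
for (const chunk of relevantChunks) {
  console.log(`[${chunk.score.toFixed(2)}] ${chunk.metadata.text}`);
}
```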
Sample Output:
This output shows the most relevant pieces of information your assistant can use to answer the email.
Now that you have the most relevant document chunks, your assistant can use them to generate a helpful reply.
For example, if the email asked for the latest project update, the assistant can use the retrieved context to answer accurately, even if the exact words were not used in the email or the documents.
Why is this important?
- It allows your assistant to understand the meaning behind questions.
- It helps the assistant provide answers based on the most useful information, not just keyword matches.
This step is what makes your email assistant smart and context-aware.
In this lesson, you learned how to:
- Turn a new email into an embedding using the `embed` method with OpenAI's embedding model.
- Search your LibSQLVector database for the most relevant document chunks using the `query` method.
- Use the retrieved context to help your assistant generate better, more informed responses.
You are now ready to practice these steps in the upcoming exercises. This hands-on practice will help you get comfortable with retrieving relevant context for your smart email assistant. Good luck!
