Introduction

Welcome to our second lesson in the “Introduction to RAG” course! In the previous lesson, you learned how RAG evolved from traditional Information Retrieval (IR). Today, we'll connect those ideas to concrete code and illustrate a simple end-to-end RAG workflow. By the end of this lesson, you'll see how indexing, retrieval, prompt augmentation, and final text generation fit together to produce a targeted answer.

In this lesson, we showcase a scenario involving “Project Chimera.” Think of it as a company's internal project; here, it simply serves as an example of how a naive, context-free system can invent inaccurate details, whereas a RAG-based system provides reliable answers drawn from an authoritative knowledge base. Note that we deliberately use extremely simplified methods (like simple keyword matching) to illustrate how each part of a RAG pipeline works. Later in this course, we will explore more realistic and robust approaches, such as embeddings and vector databases, for each component of our RAG pipeline.

The RAG Workflow: Four Key Steps

Before diving into the code, let's take a quick look at the four primary steps of our simple RAG workflow:

  • Indexing: Documents are structured in a way that makes them easy to search.
  • Retrieval: The most relevant piece of text is fetched based on a user query.
  • Prompt (Query) Augmentation: The retrieved text is combined with the user's question to form a context-rich prompt.
  • Generation: A language model processes the prompt and produces a final answer anchored to the provided text.

This process ensures that answers are backed by your data, reducing the risk of fabricated or off-topic responses. Let's examine each step through code!

Indexing: Organizing Documents

We start by defining our knowledge base. Below, “Project Chimera” serves as the example domain:

Python
KNOWLEDGE_BASE = {
    "doc1": {
        "title": "Project Chimera Overview",
        "content": (
            "Project Chimera is a research initiative focused on developing "
            "novel bio-integrated interfaces. It aims to merge biological "
            "systems with advanced computing technologies."
        )
    },
    "doc2": {
        "title": "Chimera's Neural Interface",
        "content": (
            "The core component of Project Chimera is a neural interface "
            "that allows for bidirectional communication between the brain "
            "and external devices. This interface uses biocompatible "
            "nanomaterials."
        )
    },
    "doc3": {
        "title": "Applications of Chimera",
        "content": (
            "Potential applications of Project Chimera include advanced "
            "prosthetics, treatment of neurological disorders, and enhanced "
            "human-computer interaction. Ethical considerations are paramount."
        )
    }
}

Here's a quick breakdown:

  • We define a Python dictionary named KNOWLEDGE_BASE that contains multiple documents.
  • Each entry has an ID (e.g., "doc1") and both a title and content field.
  • “Project Chimera” information is now the authoritative data source for the RAG system.

Keep in mind this is a very simplified approach for educational purposes; in a real-world production scenario, your KNOWLEDGE_BASE would be backed by more advanced components, such as a vector database. But don't worry, we'll be dealing with these databases in Course 3!
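Before moving on, it can help to poke at the structure directly. The snippet below is just standard dictionary access against the KNOWLEDGE_BASE defined above, nothing RAG-specific:

Python
# Look up a single document by its ID
doc = KNOWLEDGE_BASE["doc1"]
print(doc["title"])  # -> Project Chimera Overview

# List every document ID alongside its title
for doc_id, entry in KNOWLEDGE_BASE.items():
    print(doc_id, "->", entry["title"])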

Retrieval: Locating Relevant Information

Next, we create a function to return the best document from our knowledge base, based on simple keyword overlap:

Python
def rag_retrieval(query, documents):
    query_words = set(query.lower().split())
    best_doc_id = None
    best_overlap = 0

    for doc_id, doc in documents.items():
        # Compare the query words with the document's content words
        doc_words = set(doc["content"].lower().split())
        overlap = len(query_words.intersection(doc_words))

        if overlap > best_overlap:
            best_overlap = overlap
            best_doc_id = doc_id

    # Return the best document, or None if nothing matched
    return documents.get(best_doc_id)

Let's walk through the code:

  • The query is split into lowercase words and stored in a set.
  • Each document's text is similarly tokenized.
  • The function picks the document with the greatest word overlap.
  • If no match is found, it returns None.
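To see retrieval in action, run the function against the KNOWLEDGE_BASE defined earlier; because matching is deterministic, the query below lands on doc2:

Python
query = "What is the main goal of Project Chimera?"
best_doc = rag_retrieval(query, KNOWLEDGE_BASE)

if best_doc:
    print(best_doc["title"])  # -> Chimera's Neural Interface
else:
    print("No matching document found.")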

Query Augmentation: Creating Context-Rich Prompts

Once we retrieve the relevant document, we augment the user's original question with that document's content. This additional context significantly reduces hallucinations because the language model sees both the question and real data.

Python
def rag_generation(query, document):
    if document:
        snippet = f"{document['title']}: {document['content']}"
        prompt = f"Using the following information: '{snippet}', answer: {query}"
    else:
        prompt = f"No relevant information found. Answer directly: {query}"
    # For now we simply return the augmented prompt; the Generation step
    # below extends this function to send the prompt to a language model.
    return prompt

Here's the process in detail:

  • If a document was found, we include its title and main content as a snippet in the prompt.
  • The combined query and document text forms the prompt that we'll send to the language model in the Generation step.
  • If no document matches, we still form a direct prompt.

Given our KNOWLEDGE_BASE and the query "What is the main goal of Project Chimera?", retrieval selects doc2: its content shares the most words with the query, although these are largely common words like "the", "of", and "is" (and "Chimera?" never matches "Chimera", since we split on whitespace without stripping punctuation). Such quirks of naive keyword matching are exactly what embeddings will address later in the course. The resulting RAG-augmented prompt is:

Using the following information: 'Chimera's Neural Interface: The core component of Project Chimera is a neural interface that allows for bidirectional communication between the brain and external devices. This interface uses biocompatible nanomaterials.', answer: What is the main goal of Project Chimera?
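If you want to reproduce this prompt yourself, the two functions defined so far compose directly:

Python
query = "What is the main goal of Project Chimera?"
doc = rag_retrieval(query, KNOWLEDGE_BASE)
print(rag_generation(query, doc))  # prints the augmented prompt shown above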

Generation: Producing Tailored Answers

Finally, let's see the difference between a naive approach—one that might invent details if it doesn't have any context—and a RAG approach, which leverages the knowledge base.

Python
def get_llm_response(prompt):
    """
    This function interfaces with a language model to generate a response
    based on the provided prompt.

    Parameters:
    - prompt (str): A string containing the question or task for the
      language model, potentially augmented with additional context.

    Returns:
    - response (str): The generated text from the language model, which aims
      to answer the question or fulfill the task described in the prompt.
    """
    pass


def naive_generation(query):
    # This approach ignores the knowledge base
    prompt = f"Answer directly the following query: {query}"
    return get_llm_response(prompt)


def rag_generation(query, document):
    # This approach augments the prompt via the knowledge base
    if document:
        snippet = f"{document['title']}: {document['content']}"
        prompt = f"Using the following information: '{snippet}', answer: {query}"
    else:
        prompt = f"No relevant information found. Answer directly: {query}"
    return get_llm_response(prompt)


query = "What is the main goal of Project Chimera?"

naive_answer = naive_generation(query)
print("Naive approach:", naive_answer)

doc = rag_retrieval(query, KNOWLEDGE_BASE)
rag_answer = rag_generation(query, doc)
print("RAG approach:", rag_answer)

Let's break down this section:

  • naive_generation(query) can easily lead to random or inaccurate answers regarding “Project Chimera”, such as: "The main goal of Project Chimera is to develop advanced artificial intelligence systems that can enhance human capabilities and improve decision-making processes across various fields.".
  • rag_generation(query, doc) provides contextual information from the knowledge base, ensuring the answer is grounded: "The main goal of Project Chimera is to enable bidirectional communication between the brain and external devices through the use of a neural interface.".
  • Seeing both approaches (and their actual output) helps you compare how naive answers can drift from your authoritative data, while RAG-based responses stay closer to the truth.
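In the listing above, get_llm_response is deliberately left as a stub, since any LLM provider can sit behind it. If you'd like to run the comparison end to end, here is one possible implementation, a minimal sketch assuming the openai Python package (v1.x) is installed and an OPENAI_API_KEY environment variable is set; the model name is only an example:

Python
# A minimal sketch, not the course's official implementation. Assumes the
# openai package (v1.x) and an OPENAI_API_KEY environment variable; the
# model name below is only an example.
from openai import OpenAI

client = OpenAI()

def get_llm_response(prompt):
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # example model; substitute your own
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

The only contract the rest of the code relies on is that get_llm_response accepts a prompt string and returns a response string.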

Conclusion and Next Steps

You've now implemented:

  • A simple knowledge base indexing scheme.
  • Basic retrieval to find the most relevant document.
  • Prompt augmentation to combine user queries and reference data.
  • Generation that relies on actual context, lowering the chance of hallucinations.

Up next, you'll get hands-on practice with these steps in coding exercises. As you progress, you'll see how RAG can be extended to tackle more complex tasks and domains—whether you're talking about a literal “Project Chimera” or a real-world internal project.
