Introduction

Welcome to our second lesson in the “Introduction to RAG” course! In the previous lesson, you learned how RAG evolved from traditional Information Retrieval (IR). Today, we'll connect those ideas to concrete code and illustrate a simple end-to-end RAG workflow. By the end of this lesson, you'll see how indexing, retrieval, prompt augmentation, and final text generation fit together to produce a targeted answer.

In this lesson, we showcase a scenario involving “Project Chimera.” Think of it as a company's internal project; here, it simply serves as an example of how a naive, context-free system can invent inaccurate details, whereas a RAG-based system provides reliable answers drawn from an authoritative knowledge base. Note that we deliberately use extremely simplified methods (like simple keyword matching) to illustrate how each part of a RAG pipeline works. Later in this course, we will explore more realistic and robust approaches, such as embeddings and vector databases, for each component of our RAG pipeline.

The RAG Workflow: Four Key Steps

Before diving into the code, let's take a quick look at the four primary steps of our simple RAG workflow:

  • Indexing: Documents are structured in a way that makes them easy to search.
  • Retrieval: The most relevant piece of text is fetched based on a user query.
  • Prompt (Query) Augmentation: The retrieved text is combined with the user's question to form a context-rich prompt.
  • Generation: A language model processes the prompt and produces a final answer anchored to the provided text.

This process ensures that answers are backed by your data, reducing the risk of fabricated or off-topic responses. Let's examine each step through code!

Indexing: Organizing Documents

We start by defining our knowledge base. Below, “Project Chimera” serves as the example domain:

Python
KNOWLEDGE_BASE = {
    "doc1": {
        "title": "Project Chimera Overview",
        "content": (
            "Project Chimera is a research initiative focused on developing "
            "novel bio-integrated interfaces. It aims to merge biological "
            "systems with advanced computing technologies."
        )
    },
    "doc2": {
        "title": "Chimera's Neural Interface",
        "content": (
            "The core component of Project Chimera is a neural interface "
            "that allows for bidirectional communication between the brain "
            "and external devices. This interface uses biocompatible "
            "nanomaterials."
        )
    },
    "doc3": {
        "title": "Applications of Chimera",
        "content": (
            "Potential applications of Project Chimera include advanced "
            "prosthetics, treatment of neurological disorders, and enhanced "
            "human-computer interaction. Ethical considerations are paramount."
        )
    }
}

Here's a quick breakdown:

  • We define a Python dictionary named KNOWLEDGE_BASE that contains multiple documents.
  • Each entry has an ID (e.g., "doc1") and both a title and content field.
  • “Project Chimera” information is now the authoritative data source for the RAG system.

Keep in mind this is a very simplified approach for educational purposes; in a real-world production scenario, your KNOWLEDGE_BASE would be backed by more advanced components, such as a vector database. But don't worry, we'll be dealing with these databases in Course 3!
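Before moving on, it can help to poke at the structure directly. The snippet below is just standard dictionary access against the KNOWLEDGE_BASE defined above, nothing RAG-specific:

Python
# Look up a single document by its ID
doc = KNOWLEDGE_BASE["doc1"]
print(doc["title"])  # -> Project Chimera Overview

# List every document ID alongside its title
for doc_id, entry in KNOWLEDGE_BASE.items():
    print(doc_id, "->", entry["title"])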

Retrieval: Locating Relevant Information

Next, we create a function to return the best document from our knowledge base, based on simple keyword overlap:

Python
def rag_retrieval(query, documents):
    query_words = set(query.lower().split())
    best_doc_id = None
    best_overlap = 0

    for doc_id, doc in documents.items():
        # Compare the query words with the document's content words
        doc_words = set(doc["content"].lower().split())
        overlap = len(query_words.intersection(doc_words))

        if overlap > best_overlap:
            best_overlap = overlap
            best_doc_id = doc_id

    # Return the best document, or None if nothing matched
    return documents.get(best_doc_id)

Let's walk through the code:

  • The query is split into lowercase words and stored in a set.
  • Each document's text is similarly tokenized.
  • The function picks the document with the greatest word overlap.
  • If no match is found, it returns None.
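To see retrieval in action, run the function against the KNOWLEDGE_BASE defined earlier; because matching is deterministic, the query below lands on doc2:

Python
query = "What is the main goal of Project Chimera?"
best_doc = rag_retrieval(query, KNOWLEDGE_BASE)

if best_doc:
    print(best_doc["title"])  # -> Chimera's Neural Interface
else:
    print("No matching document found.")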

Query Augmentation: Creating Context-Rich Prompts

Once we retrieve the relevant document, we augment the user's original question with that document's content. This additional context significantly reduces hallucinations because the language model sees both the question and real data.

Python
def rag_generation(query, document):
    if document:
        snippet = f"{document['title']}: {document['content']}"
        prompt = f"Using the following information: '{snippet}', answer: {query}"
    else:
        prompt = f"No relevant information found. Answer directly: {query}"
    # For now we simply return the augmented prompt; the Generation step
    # below extends this function to send the prompt to a language model.
    return prompt

Here's the process in detail:

  • If a document was found, we include its title and main content as a snippet in the prompt.
  • The combined query and document text forms the prompt that we'll send to the language model in the Generation step.
  • If no document matches, we still form a direct prompt.

Given our KNOWLEDGE_BASE and the query "What is the main goal of Project Chimera?", retrieval selects doc2: its content shares the most words with the query, although these are largely common words like "the", "of", and "is" (and "Chimera?" never matches "Chimera", since we split on whitespace without stripping punctuation). Such quirks of naive keyword matching are exactly what embeddings will address later in the course. The resulting RAG-augmented prompt is:

Using the following information: 'Chimera's Neural Interface: The core component of Project Chimera is a neural interface that allows for bidirectional communication between the brain and external devices. This interface uses biocompatible nanomaterials.', answer: What is the main goal of Project Chimera?
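If you want to reproduce this prompt yourself, the two functions defined so far compose directly:

Python
query = "What is the main goal of Project Chimera?"
doc = rag_retrieval(query, KNOWLEDGE_BASE)
print(rag_generation(query, doc))  # prints the augmented prompt shown above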

Generation: Producing Tailored Answers

Finally, let's see the difference between a naive approach—one that might invent details if it doesn't have any context—and a RAG approach, which leverages the knowledge base.

Python
def get_llm_response(prompt):
    """
    This function interfaces with a language model to generate a response
    based on the provided prompt.

    Parameters:
    - prompt (str): A string containing the question or task for the
      language model, potentially augmented with additional context.

    Returns:
    - response (str): The generated text from the language model, which aims
      to answer the question or fulfill the task described in the prompt.
    """
    pass


def naive_generation(query):
    # This approach ignores the knowledge base
    prompt = f"Answer directly the following query: {query}"
    return get_llm_response(prompt)


def rag_generation(query, document):
    # This approach augments the prompt via the knowledge base
    if document:
        snippet = f"{document['title']}: {document['content']}"
        prompt = f"Using the following information: '{snippet}', answer: {query}"
    else:
        prompt = f"No relevant information found. Answer directly: {query}"
    return get_llm_response(prompt)


query = "What is the main goal of Project Chimera?"

naive_answer = naive_generation(query)
print("Naive approach:", naive_answer)

doc = rag_retrieval(query, KNOWLEDGE_BASE)
rag_answer = rag_generation(query, doc)
print("RAG approach:", rag_answer)

Let's break down this section:

  • naive_generation(query) can easily lead to random or inaccurate answers regarding “Project Chimera”, such as: "The main goal of Project Chimera is to develop advanced artificial intelligence systems that can enhance human capabilities and improve decision-making processes across various fields.".
  • rag_generation(query, doc) provides contextual information from the knowledge base, ensuring the answer is grounded: "The main goal of Project Chimera is to enable bidirectional communication between the brain and external devices through the use of a neural interface.".
  • Seeing both approaches (and their actual output) helps you compare how naive answers can drift from your authoritative data, while RAG-based responses stay closer to the truth.
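In the listing above, get_llm_response is deliberately left as a stub, since any LLM provider can sit behind it. If you'd like to run the comparison end to end, here is one possible implementation, a minimal sketch assuming the openai Python package (v1.x) is installed and an OPENAI_API_KEY environment variable is set; the model name is only an example:

Python
# A minimal sketch, not the course's official implementation. Assumes the
# openai package (v1.x) and an OPENAI_API_KEY environment variable; the
# model name below is only an example.
from openai import OpenAI

client = OpenAI()

def get_llm_response(prompt):
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # example model; substitute your own
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

The only contract the rest of the code relies on is that get_llm_response accepts a prompt string and returns a response string.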

Conclusion and Next Steps

You've now implemented:

  • A simple knowledge base indexing scheme.
  • Basic retrieval to find the most relevant document.
  • Prompt augmentation to combine user queries and reference data.
  • Generation that relies on actual context, lowering the chance of hallucinations.

Up next, you'll get hands-on practice with these steps in coding exercises. As you progress, you'll see how RAG can be extended to tackle more complex tasks and domains—whether you're talking about a literal “Project Chimera” or a real-world internal project.
