Welcome to the second unit of our course on building a RAG-powered chatbot! In the previous lesson, we built a document processor that forms the retrieval component of our RAG system. Today, we'll focus on the conversational aspect by creating a chat engine that can maintain conversation history.
While our document processor is excellent at finding relevant information, a complete RAG system needs a way to interact with users in a natural, conversational manner. This is where our chat engine comes in. The chat engine is responsible for managing the conversation flow, formatting prompts with relevant context, and maintaining a history of the interaction.
The chat engine we'll build today will:
- Manage interactions with the language model
- Maintain a history of the conversation
- Format prompts with relevant context from our document processor
- Provide methods to reset the conversation when needed
By the end of this lesson, you'll have a fully functional chat engine that can be integrated with the document processor we built previously to create a complete RAG system. Let's get started!
Let's begin by setting up the basic structure of our `ChatEngine` class. This class will encapsulate all the functionality needed for managing conversations with the language model.
```python
from langchain_openai import ChatOpenAI
from langchain.schema.messages import SystemMessage, HumanMessage, AIMessage
from langchain.prompts import ChatPromptTemplate


class ChatEngine:
    def __init__(self):
        self.chat_model = ChatOpenAI()
        self.system_message = (
            "You are a helpful assistant that ONLY answers questions based on the "
            "provided context. If no relevant context is provided, politely inform "
            "the user that you don't have the necessary information to answer their "
            "question accurately."
        )

        # Initialize conversation history with system message
        self.conversation_history = [SystemMessage(content=self.system_message)]

        # Define the prompt template
        self.prompt = ChatPromptTemplate.from_template(
            "Answer the following question based ONLY on the provided context. "
            "If the context doesn't contain relevant information to answer the "
            "question, respond with 'I don't have enough information in the "
            "provided context to answer this question.'\n\n"
            "Context:\n{context}\n\n"
            "Question: {question}"
        )
```
In this initialization method, we're setting up several important components:
- Chat Model: We initialize `self.chat_model` using `ChatOpenAI()` to create an instance of the OpenAI chat model for generating responses.
- System Message: We define strict instructions that guide the AI's behavior, telling it to only answer questions based on provided context and to politely decline if no relevant context is available.
- Conversation History: We initialize this as a list containing our system message, which will store the entire conversation using LangChain's message schema classes.
- Prompt Template: We create a template using `ChatPromptTemplate.from_template()` that defines how we'll format prompts with placeholders for context and questions, with clear instructions to only use the provided context.
This structure ensures our chat engine can properly communicate with the language model while maintaining conversation state. The system message establishes strict boundaries for the AI's responses, which is particularly important in a RAG system where we want answers based solely on retrieved information.
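To get a feel for how the template works, you can format it by hand and inspect the result. This is just an exploratory snippet; the sample context and question are made up for illustration:

```python
from chat_engine import ChatEngine

engine = ChatEngine()

# Fill the placeholders manually to inspect the prompt the model will see
messages = engine.prompt.format_messages(
    context="Madrid is the capital of Spain.",
    question="What is the capital of Spain?"
)
print(messages[0].content)
```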
Now that we have our basic class structure, let's implement the core functionality: sending messages and receiving responses. The `send_message` method will handle this process.
```python
def send_message(self, user_message, context=""):
    """Send a message to the chat engine and get a response"""
    # Format the messages using the prompt template
    messages = self.prompt.format_messages(
        context=context,
        question=user_message
    )

    # Add the current message to the conversation history
    self.conversation_history.append(HumanMessage(content=user_message))

    # Get the response from the model
    response = self.chat_model.invoke(messages)

    # Add the response to conversation history
    self.conversation_history.append(AIMessage(content=response.content))

    return response.content
```
The `send_message` method is the heart of our chat engine. It takes two parameters: `user_message` (the question from the user) and `context` (optional relevant information from our document processor).
- Format Messages: We use our prompt template to fill in placeholders with the provided context and question.
- Update History: We add the user's message to our conversation history as a `HumanMessage`.
- Get Response: We invoke the chat model with our formatted messages using `self.chat_model.invoke(messages)`.
- Store Response: We add the AI's response to conversation history as an `AIMessage`.
- Return Result: We return the content of the response to be displayed to the user.
While we maintain conversation history, we don't currently use it in prompts to the model. This is intentional for our RAG system, where each query is answered based on retrieved context rather than previous exchanges. In future enhancements, you could include relevant parts of conversation history in the prompt.
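If you want to experiment with that enhancement, one option is to splice recent history into the messages passed to the model. The sketch below is a hypothetical variant, not part of this lesson's `ChatEngine`, and the four-message window is an arbitrary illustrative choice:

```python
def send_message_with_history(self, user_message, context=""):
    """Hypothetical variant that also shows the model recent exchanges."""
    formatted = self.prompt.format_messages(context=context, question=user_message)

    # Keep the system message, then the last few exchanges, then the new prompt
    system = self.conversation_history[0]
    recent = self.conversation_history[1:][-4:]  # up to two question/answer pairs
    messages = [system] + recent + formatted

    self.conversation_history.append(HumanMessage(content=user_message))
    response = self.chat_model.invoke(messages)
    self.conversation_history.append(AIMessage(content=response.content))
    return response.content
```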
An important aspect of any chat system is the ability to manage the conversation state. Let's implement a method to reset the conversation when needed:
```python
def reset_conversation(self):
    """Reset the conversation history"""
    self.conversation_history = [SystemMessage(content=self.system_message)]
```
The `reset_conversation` method is straightforward but essential. It resets the conversation history to its initial state, containing only the system message. This is useful in several scenarios:
- When starting a new topic or conversation
- When the conversation has gone on for too long and might be approaching token limits
- When the user explicitly requests to start fresh
Managing conversation state is crucial for long-running chat applications. Language models have context window limitations, meaning they can only process a certain number of tokens at once. By providing a reset mechanism, we ensure that users can continue using the system indefinitely without running into these limitations. In a more advanced implementation, you might consider automatically truncating the conversation history when it reaches a certain length or implementing a more sophisticated memory system that summarizes previous exchanges. For our current purposes, however, a simple reset functionality is sufficient.
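If you later want to try automatic truncation, a small helper like the hypothetical one below could be added to `ChatEngine`; the default cap of 20 messages is an arbitrary illustrative value:

```python
def trim_history(self, max_messages=20):
    """Drop the oldest exchanges once history exceeds max_messages,
    always preserving the system message at index 0."""
    if len(self.conversation_history) > max_messages:
        system = self.conversation_history[0]
        recent = self.conversation_history[-(max_messages - 1):]
        self.conversation_history = [system] + recent
```

You could call this at the end of `send_message` so the stored history never grows without bound.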
Now that we've built our chat engine, let's test it with some examples to see how it works in practice.
First, let's see how the chat engine responds when we don't provide any context:
```python
from chat_engine import ChatEngine

# Initialize the chat engine
chat_engine = ChatEngine()

# Send a message without context
query = "What is the capital of Spain?"

# Get response without providing any context
response = chat_engine.send_message(query)

# Display the question and answer
print(f"Question without context: {query}")
print(f"Answer: {response}")
```
When you run this code, you'll see output similar to:
```text
Question without context: What is the capital of Spain?
Answer: I don't have enough information in the provided context to answer this question.
```
As expected, the chat engine follows the instructions in our system message and refuses to answer without context. This is the desired behavior for our RAG system, which should only provide information based on the context it's given.
Now, let's see how the chat engine responds when we provide relevant context:
```python
# Define context about Madrid
context = """Madrid is the capital and most populous city of Spain.
The Royal Palace, Plaza Mayor, and Prado Museum are among its most famous landmarks."""

# Ask the same question but with context this time
query = "What is the capital of Spain?"

# Get response with context provided
response = chat_engine.send_message(query, context)

# Display the question and answer
print(f"\nQuestion with context: {query}")
print(f"Answer: {response}")
```
The output will look something like:
```text
Question with context: What is the capital of Spain?
Answer: Madrid
```
With context provided, the chat engine now gives an accurate answer based on the information available. This demonstrates how our RAG system will use retrieved documents to answer user queries.
Finally, let's test the reset functionality:
```python
# Reset the conversation history
chat_engine.reset_conversation()

# Define completely new context about a different topic
context = """Python is a high-level, interpreted programming language created by
Guido van Rossum and first released in 1991. It emphasizes code readability with
its notable use of significant whitespace."""

# Ask a question about the new topic
query = "When was Python first released?"

# Get response after reset with new context
response = chat_engine.send_message(query, context)

# Display the new question and answer
print(f"\nNew question with context: {query}")
print(f"Answer: {response}")
```
The output might look like this:
```text
New question with context: When was Python first released?
Answer: Python was first released in 1991.
```
After the reset, the conversation history contains only the system message again, and the chat engine responds based entirely on the new information about the Python programming language. This reset functionality is important for managing long conversations and allowing users to switch topics cleanly.
In this lesson, we've built a powerful chat engine for our RAG chatbot. We've learned how to:
- Create a `ChatEngine` class that manages conversations with a language model
- Define system messages to guide the AI's behavior
- Maintain conversation history using LangChain's message schema
- Format prompts with context and questions using templates
- Implement methods to send messages and reset conversations
- Test our chat engine with various scenarios
Our chat engine complements the document processor we built in the previous lesson. While the document processor handles the retrieval of relevant information, the chat engine manages the conversation flow and presents this information to the user in a natural way. In the next unit, we'll integrate the document processor and chat engine to create a complete RAG system. This integration will allow our chatbot to automatically retrieve relevant context from documents based on user queries, creating a seamless experience where users can ask questions about their documents and receive informed, contextual responses.
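As a preview of where this is headed, here is a rough sketch of what that glue code might look like. The `document_processor` import and its `retrieve_relevant_chunks` method are assumptions for illustration only; the real integration is the subject of the next unit:

```python
from chat_engine import ChatEngine
from document_processor import DocumentProcessor  # from the previous lesson

processor = DocumentProcessor()
engine = ChatEngine()

query = "What does this document say about Madrid?"

# Hypothetical retrieval call; the method name is assumed for this sketch
relevant_chunks = processor.retrieve_relevant_chunks(query)
context = "\n\n".join(relevant_chunks)

print(engine.send_message(query, context))
```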
Get ready to practice what you've learned and take your RAG chatbot to the next level!