Context Generator: Building Useful Context for AI Code Review

Introduction: The Role of Context in AI Code Review

Welcome back! In the previous lessons, you learned how to set up the OpenAI client for code review and how to break down code changes using a diff parser. Now, you are ready to take the next step: generating context for the AI code review assistant.

Context is the information that helps the AI understand what is happening in the code. Without context, the AI might miss important details or misunderstand the code changes. A context generator gathers the most relevant information about a file, such as its current content, recent changes, and related files. This makes the AI’s feedback more accurate and useful.

In this lesson, you will learn how to build a simple context generator. You will see how to extract file content, summarize recent changes, and find related files. Each step will be explained with clear examples so you can follow along easily.

Quick Recall: Code Files and Sessions

Before we dive in, let’s quickly remind ourselves how code files and sessions are represented in our project. In earlier lessons, you saw that we use simple Python classes to represent files and a session object to interact with them.

For example, here is a basic data class for a code file:
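A minimal sketch of such a data class is shown here. The file_path field matches the filter used later in this lesson; the content field name is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass
class CodeFile:
    """One source file known to the review tool."""

    file_path: str  # path of the file, e.g. "utils/helper.py"
    content: str    # full text of the file (field name is an assumption)


# Build one record and read its fields back.
example = CodeFile(file_path="utils/helper.py", content="def helper():\n    return 42\n")
print(example.file_path)
```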

And here is a mock Session class that lets us query these files:
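One possible mock is sketched below. It keeps files in memory and supports only the query(...).filter_by(...).first() chain used in this lesson. The Query helper class and the constructor argument are assumptions for illustration, not part of any real database library.

```python
from dataclasses import dataclass
from typing import List, Optional

_MISSING = object()  # sentinel for attributes that do not exist


@dataclass
class CodeFile:
    file_path: str
    content: str  # field name is an assumption


class Query:
    """Holds candidate files and narrows them down (hypothetical helper)."""

    def __init__(self, items: List[CodeFile]):
        self._items = items

    def filter_by(self, **criteria) -> "Query":
        # Keep only items whose attributes equal every keyword argument.
        matches = [
            item
            for item in self._items
            if all(getattr(item, key, _MISSING) == value for key, value in criteria.items())
        ]
        return Query(matches)

    def first(self) -> Optional[CodeFile]:
        # Return the first match, or None when nothing matched.
        return self._items[0] if self._items else None


class Session:
    """In-memory stand-in for a database session (mock only)."""

    def __init__(self, files: List[CodeFile]):
        self._files = list(files)

    def query(self, model) -> Query:
        # Only stored objects of the requested class are considered.
        return Query([f for f in self._files if isinstance(f, model)])


session = Session([CodeFile("utils/helper.py", "def helper():\n    return 42\n")])
found = session.query(CodeFile).filter_by(file_path="utils/helper.py").first()
print(found.file_path if found else "not found")
```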

These classes help us organize and access code files in our project. You will see them used in the examples throughout this lesson.

Extracting File Content for Review

The first step in generating context is to get the content of the file you want to review. This gives the AI assistant a snapshot of the code as it currently exists.

Let’s start by writing a function that retrieves the content of a file given its path. We will use the Session and CodeFile classes from earlier.
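A minimal sketch of such a function follows. The compact CodeFile and Session stand-ins are repeated so the example runs on its own. The default value of max_lines, the content field name, and the wording of the truncation note are assumptions.

```python
from dataclasses import dataclass


@dataclass
class CodeFile:
    file_path: str
    content: str  # field name is an assumption


class _Query:
    # Compact stand-in for the mock query object.
    def __init__(self, items):
        self._items = items

    def filter_by(self, **criteria):
        return _Query([i for i in self._items
                       if all(getattr(i, k) == v for k, v in criteria.items())])

    def first(self):
        return self._items[0] if self._items else None


class Session:
    # Compact stand-in for the mock session.
    def __init__(self, files):
        self._files = list(files)

    def query(self, model):
        return _Query([f for f in self._files if isinstance(f, model)])


def get_file_content(session: Session, file_path: str, max_lines: int = 100) -> str:
    """Return the file's content, cut to max_lines lines if it is longer."""
    code_file = session.query(CodeFile).filter_by(file_path=file_path).first()
    if code_file is None:
        return ""  # unknown path: nothing to add to the context

    lines = code_file.content.splitlines()
    if len(lines) <= max_lines:
        return code_file.content  # short enough, return everything

    kept = "\n".join(lines[:max_lines])
    # The exact wording of this note is an assumption.
    return f"{kept}\n... (truncated, {len(lines) - max_lines} more lines)"


session = Session([CodeFile("app/main.py", "import os\nprint(os.getcwd())\n")])
print(get_file_content(session, "app/main.py", max_lines=50))
```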

Let’s break this down:

We use session.query(CodeFile).filter_by(file_path=file_path).first() to find the file with the given path.
If the file is not found, we return an empty string.
We split the file content into lines. If the file is short enough (less than or equal to max_lines), we return the whole content.
If the file is long, we return only the first max_lines lines and add a note that the content was truncated.

For example, with max_lines set to 50, a file of 50 lines or fewer is returned unchanged. If the file had more than 50 lines, only the first 50 would be shown, followed by a truncation message.

Summarizing Recent File Changes

Next, it’s helpful to show the AI assistant what has changed in the file recently. This usually means showing a summary of the latest commits.

Let’s write a function that returns a list of recent changes for a file:
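A minimal sketch under stated assumptions: the commit entries below are hard-coded placeholders, not real history, and the dictionary keys (commit, message, author) and the limit parameter are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class CodeFile:
    file_path: str
    content: str  # field name is an assumption


class _Query:
    # Compact stand-in for the mock query object.
    def __init__(self, items):
        self._items = items

    def filter_by(self, **criteria):
        return _Query([i for i in self._items
                       if all(getattr(i, k) == v for k, v in criteria.items())])

    def first(self):
        return self._items[0] if self._items else None


class Session:
    # Compact stand-in for the mock session.
    def __init__(self, files):
        self._files = list(files)

    def query(self, model):
        return _Query([f for f in self._files if isinstance(f, model)])


def get_recent_changes(session: Session, file_path: str, limit: int = 3) -> List[Dict[str, str]]:
    """Return up to `limit` recent changes for a known file, else an empty list."""
    code_file = session.query(CodeFile).filter_by(file_path=file_path).first()
    if code_file is None:
        return []

    # Placeholder data for demonstration only. A real project would read
    # this from a version control system instead.
    placeholder_history = [
        {"commit": "placeholder-1", "message": "Example: fix input validation", "author": "dev-a"},
        {"commit": "placeholder-2", "message": "Example: add logging", "author": "dev-b"},
        {"commit": "placeholder-3", "message": "Example: initial version", "author": "dev-a"},
    ]
    return placeholder_history[:limit]


session = Session([CodeFile("app/main.py", "print('hello')\n")])
for change in get_recent_changes(session, "app/main.py"):
    print(change["commit"], "-", change["message"])
```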

Here’s what’s happening:

We look up the file using the session, just like before.
If the file is not found, we return an empty list.
For demonstration, we return a list of dictionaries, each representing a commit. In a real project, you would fetch this data from a version control system.

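Called with the path of a known file, the function returns this list of commit summaries, which gives the AI assistant a quick overview of what has changed in the file recently. For an unknown path, the result is an empty list.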

Finding Related Files

Sometimes, understanding a file requires looking at other files it depends on. For example, if a file imports another module, it can be helpful to include those related files in the context.

Let’s write a function that finds related files based on import statements:
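A minimal sketch, assuming the function returns file paths as strings and that module names map directly onto paths relative to the project root. The max_related parameter name is hypothetical, and the import matching is a simple string check rather than a full parser.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CodeFile:
    file_path: str
    content: str  # field name is an assumption


class _Query:
    # Compact stand-in for the mock query object.
    def __init__(self, items):
        self._items = items

    def filter_by(self, **criteria):
        return _Query([i for i in self._items
                       if all(getattr(i, k) == v for k, v in criteria.items())])

    def first(self):
        return self._items[0] if self._items else None


class Session:
    # Compact stand-in for the mock session.
    def __init__(self, files):
        self._files = list(files)

    def query(self, model):
        return _Query([f for f in self._files if isinstance(f, model)])


def find_related_files(session: Session, file_path: str, max_related: int = 3) -> List[str]:
    """Return paths of files in the session that the given file imports."""
    code_file = session.query(CodeFile).filter_by(file_path=file_path).first()
    if code_file is None:
        return []

    related: List[str] = []
    for line in code_file.content.splitlines():
        stripped = line.strip()  # leading whitespace is ignored
        if not stripped.startswith(("import ", "from ")):
            continue
        module = stripped.split()[1].rstrip(",")
        if "." not in module:
            continue  # only dotted module names are mapped to paths
        candidate = module.replace(".", "/") + ".py"  # utils.helper -> utils/helper.py
        exists = session.query(CodeFile).filter_by(file_path=candidate).first() is not None
        if exists and candidate not in related:
            related.append(candidate)
    return related[:max_related]


session = Session([
    CodeFile("app/main.py", "import os\nfrom utils.helper import slugify\n"),
    CodeFile("utils/helper.py", "def slugify(text):\n    return text.lower()\n"),
])
print(find_related_files(session, "app/main.py"))
```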

Here’s how this works:

We get the file content and look for lines that start with import or from.
If the import refers to a module with a dot (like utils.helper), we convert it to a file path (utils/helper.py).
We check if that file exists in our session, and if so, add it to the list of related files.
We limit the result to three related files.

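For example, if the file under review imports from utils.helper and utils/helper.py exists in the session, that file appears in the result. This helps the AI assistant see the bigger picture by including files that are likely to be important for understanding the current file.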

Summary And What’s Next

In this lesson, you learned how to generate useful context for an AI code review assistant by:

Extracting the content of a file, with truncation for large files.
Summarizing recent changes to the file.
Finding related files based on import statements.

These steps help the AI provide more accurate and helpful code reviews. In the next practice exercises, you will get hands-on experience building and using these context generation functions. This will prepare you to integrate context generation into a real AI code review workflow.
