Welcome to the first lesson of this course on advanced MCP server and agent integration in Python. In previous courses, you learned how to build an MCP server and connect it to an agent, giving your agent the ability to use external tools. Now, we will take your skills further by focusing on how to make your agent more efficient and responsive, especially when handling a sequence of user queries. In this lesson, you will learn how to use tool caching to reduce latency and improve performance when running an agent across multiple queries. These techniques are essential for building agents that feel fast and natural in real-world applications.
Before we dive into advanced topics, let’s quickly review how MCP servers and agents work together. An MCP server provides a set of tools — these are actions or functions the agent can use to help answer user queries. In earlier lessons, you learned how to launch an MCP server and connect it to an agent using the OpenAI Agents SDK. The agent gathers tools from the MCP server (or servers) and can use them alongside any built-in tools you define. This integration allows your agent to perform a wide range of tasks, from fetching data to managing files, depending on the tools available.
Every time an agent starts a new session or receives a query, it needs to know what tools are available. By default, the agent asks the MCP server for the list of tools each time. If the MCP server is running locally, this might be fast, but if it’s remote or the tool list is large, this can slow things down. Caching the tool list means the agent remembers the tools after the first request, so it doesn’t have to ask the server again for every query. This reduces latency, speeds up responses, and saves resources.
To see the impact of not using tool caching, here’s an example of what you might see in the logs of an MCP server running with Server-Sent Events (SSE) when an agent processes multiple queries without caching enabled. Notice how the server receives a `ListToolsRequest` for each new query, even though the list of tools hasn’t changed:
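The exact log format depends on your server setup and logging configuration, but an illustrative excerpt for two consecutive queries might look like this (note the repeated `ListToolsRequest` before each tool call):

```
INFO:mcp.server.lowlevel.server:Processing request of type ListToolsRequest
INFO:mcp.server.lowlevel.server:Processing request of type CallToolRequest
INFO:mcp.server.lowlevel.server:Processing request of type ListToolsRequest
INFO:mcp.server.lowlevel.server:Processing request of type CallToolRequest
```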
Each time the agent receives a new query, it sends a `ListToolsRequest` to the MCP server to fetch the available tools, even if the tool list hasn’t changed. This repeated fetching adds unnecessary latency and load to the server, especially in multi-query conversations. By enabling tool caching, you can avoid these redundant requests and make your agent much more efficient.
Tool caching is most effective when your tool list is stable and does not change often. However, if your tool list is dynamic and changes frequently—such as when tools are added, removed, or updated at runtime—caching can cause the agent to miss new or updated tools. In these cases, the agent may continue to use an outdated tool list, leading to incorrect or incomplete responses.
If you expect frequent changes to your tool list, consider disabling caching, or make sure to invalidate the cache manually whenever the tool set changes. This ensures your agent always has access to the latest tools and can respond accurately to user queries.
Let’s look at how to enable tool caching in your agent setup. In the OpenAI Agents SDK, you can set the `cache_tools_list` parameter to `True` when you create your MCP server connection. This tells the agent to fetch the tool list once and reuse it for future queries. Here’s how it looks in practice:
When you use this `mcp_server` in your agent, the agent will only fetch the tool list from the server the first time.
Tool caching works best when your tool list is stable, but there are situations where you need to update the cache to reflect changes. Only the metadata about the tools (such as names, input/output schema, and descriptions) is cached—not the actual tool logic or state. The logic itself always runs live on the server. If you update a tool’s implementation on the server but keep its metadata the same, you do not need to invalidate the cache. However, if you add, remove, or modify tools (for example, by changing their schema or description), you should clear the cache so the agent can fetch the latest tool list.
You can manually clear the cached tool list by calling:
A common use case for this is when tools are dynamically registered based on user roles or runtime data. For example, if administrative tools are only available when a user logs in as an admin, you should call `invalidate_tools_cache()` immediately after a login event or role change. This ensures the agent fetches the correct tool list for the new user context and always has access to the appropriate tools.
If your tool list changes frequently, consider when and how to invalidate the cache so your agent always operates with the most up-to-date information.
A key part of building a helpful agent is maintaining the conversation context across multiple user queries. This means the agent remembers what was said before and can respond in a way that makes sense for the ongoing conversation. In the example below, you will see how to run an agent through a series of queries, updating the conversation history each time.
Here is a complete example that brings together everything we’ve discussed:
In this code, the agent is connected to an MCP server with tool caching enabled. Because the tool list is cached, the agent only fetches the list of available tools from the server once, instead of making a separate request for each query. This significantly reduces latency and speeds up the agent’s responses across the entire sequence of user queries. Tool caching is especially beneficial in multi-query scenarios like this, where the agent can reuse the same tool list for each turn in the conversation, resulting in a smoother and more responsive user experience.
In this lesson, you learned how to make your agent more efficient by caching the list of tools from the MCP server. Caching reduces latency and improves performance, especially when the tool list is stable. As a best practice, enable tool caching when your tool list does not change often, and remember to invalidate the cache if you update the tools.
You are now ready to practice these concepts in hands-on exercises. Keep up the great work — these skills will help you build faster and smarter agents!
