Welcome to another step in this journey! In previous courses, you learned how to build sophisticated agent systems that can use tools, handle multi-turn conversations, and even transfer control between specialized agents. You also learned the basics of concurrent execution in TypeScript using Promises to make multiple Claude API calls simultaneously. Now it's time to bring these concepts together and learn how to run multiple independent agent conversations concurrently, significantly improving the performance of your agentic systems.
Let's start by understanding what happens when we run agent conversations one after another. Consider a scenario where you have multiple math problems to solve. The straightforward approach would be to process each problem sequentially, waiting for one to complete before starting the next.
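To make this concrete, here is a minimal sketch of the serial approach. The `agent` object and its `run()` method are hypothetical stand-ins for the agent class built in earlier lessons; the `setTimeout` delay simulates the latency of a real multi-turn conversation with the Claude API.

```typescript
// Hypothetical stand-in for the agent from earlier lessons: run() simulates
// a full conversation by waiting briefly, then returning a final answer.
const agent = {
  async run(prompt: string): Promise<string> {
    await new Promise((resolve) => setTimeout(resolve, 100));
    return `Final answer for: ${prompt}`;
  },
};

const prompts = ["What is 2 + 3?", "What is 4 * 4 - 1?"];

async function runSerially(): Promise<string[]> {
  const results: string[] = [];
  for (const prompt of prompts) {
    // The await inside the loop blocks it: conversation 2 cannot
    // start until conversation 1 has fully completed.
    results.push(await agent.run(prompt));
  }
  return results;
}
```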
In this approach, the second conversation doesn't start until the first one is completely finished. If the first conversation takes 5 seconds and the second takes 4 seconds, the total execution time is 9 seconds. The bottleneck here is clear: we're forcing conversations to wait unnecessarily, even though they're completely independent and could theoretically run at the same time.
Now let's transform our serial approach into concurrent execution. The key is to create all our agent conversation tasks upfront without immediately waiting for them to complete.
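A sketch of that step, again using a hypothetical `agent.run()` that simulates a conversation with a short delay:

```typescript
// Hypothetical agent stand-in; the delay simulates conversation latency.
const agent = {
  async run(prompt: string): Promise<string> {
    await new Promise((resolve) => setTimeout(resolve, 100));
    return `Final answer for: ${prompt}`;
  },
};

const prompts = ["What is 2 + 3?", "What is 4 * 4 - 1?"];

// No await inside the callback: each call to agent.run() begins its
// conversation right away, and map collects the in-flight promises.
const tasks: Promise<string>[] = prompts.map((prompt) => agent.run(prompt));
```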
We use map to create an array of tasks, where each task is a call to agent.run(). Notice we're not using await inside the map callback, so we collect the promises without pausing between them. Because promises in JavaScript are eager, each call to agent.run() starts its conversation immediately; map simply gathers all of the in-flight promises into an array so we can wait on them together.
Once we have our array of in-flight tasks, we use Promise.all to wait for every single conversation to finish before continuing. Promise.all resolves with an array of results once all of its input promises resolve, and rejects as soon as any one of them rejects.
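Putting the two steps together, a complete concurrent version might look like the following sketch (same hypothetical agent stand-in as before):

```typescript
// Hypothetical agent stand-in; the delay simulates conversation latency.
const agent = {
  async run(prompt: string): Promise<string> {
    await new Promise((resolve) => setTimeout(resolve, 100));
    return `Final answer for: ${prompt}`;
  },
};

async function runConcurrently(prompts: string[]): Promise<string[]> {
  // Every conversation is already in flight after this line executes.
  const tasks = prompts.map((prompt) => agent.run(prompt));
  // Wait for all of them; results come back in input order.
  return Promise.all(tasks);
}
```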
When you run this code, you'll see output that demonstrates the concurrent nature of the execution. The tool calls from both conversations appear interleaved, showing that both agents are actively working at the same time:
The results array contains the outcomes in the same order as the original prompts, regardless of which conversation finished first: Promise.all preserves the order of its input array, not the order of completion.
You might wonder whether it's safe to use the same agent instance for multiple concurrent conversations. The answer is yes, and understanding why is important for writing efficient code.
When you call agent.run(), you pass in a messages array that represents the conversation history. Each call to run() creates a local copy of the messages and builds upon it throughout the conversation. This means each concurrent conversation maintains its own separate state. The agent's configuration, like systemPrompt, tools, and model, is read-only during execution, so multiple conversations can safely read these shared properties simultaneously without interfering with each other.
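One way this isolation can be implemented is sketched below, using hypothetical Message and Agent shapes rather than the exact code from earlier lessons:

```typescript
interface Message {
  role: "user" | "assistant";
  content: string;
}

class Agent {
  // Shared, read-only configuration: safe for concurrent reads.
  constructor(readonly systemPrompt: string) {}

  async run(messages: Message[]): Promise<Message[]> {
    // Copy the caller's history so each conversation owns its own state.
    const localMessages: Message[] = [...messages];
    // ...call the model, execute tools, loop until done (elided)...
    localMessages.push({ role: "assistant", content: "Done." });
    return localMessages;
  }
}
```

Because run() never mutates the array it was given, two concurrent calls on the same instance cannot corrupt each other's conversation history.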
Let's examine the concrete performance benefits of concurrent execution. Imagine you have two conversations: one takes 5 seconds to complete and another takes 4 seconds.
With serial execution, the second conversation must wait for the first to finish completely. The total time is simply the sum of all individual conversation times. With concurrent execution, both conversations start at the same time, and the total execution time becomes the duration of the longest conversation, not the sum of all conversations. In this example, we've reduced the total time from 9 seconds to 5 seconds, a 44% improvement.
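You can observe this difference directly by timing both strategies against simulated conversations, scaled down here from seconds to milliseconds:

```typescript
// Simulate a conversation that takes `ms` milliseconds to complete.
const fakeConversation = (ms: number): Promise<number> =>
  new Promise((resolve) => setTimeout(() => resolve(ms), ms));

async function compare(): Promise<{ serial: number; concurrent: number }> {
  const durations = [50, 40]; // stand-ins for the 5s and 4s conversations

  let start = Date.now();
  for (const ms of durations) {
    await fakeConversation(ms); // total is roughly 50 + 40 = 90ms
  }
  const serial = Date.now() - start;

  start = Date.now();
  await Promise.all(durations.map(fakeConversation)); // roughly max(50, 40) = 50ms
  const concurrent = Date.now() - start;

  return { serial, concurrent };
}
```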
While we're running multiple conversations concurrently, it's important to understand what remains sequential. Within each individual conversation, tools are always executed serially, one at a time, even when Claude requests multiple tools simultaneously.
The first and third tool calls belong to conversation 1, while the second belongs to conversation 2. They're interleaved because the conversations run concurrently, but within conversation 1, sum_numbers completes before multiply_numbers is called.
Claude often requests multiple tool calls in a single response when the results are independent. For example, it might ask to calculate 2 + 3 and 4 * 4 at the same time since neither calculation depends on the other. However, our agent implementation uses a for loop to process these tool calls:
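The loop in question looks roughly like this. The ToolCall shape and executeTool helper are hypothetical sketches standing in for the lesson's actual agent internals:

```typescript
interface ToolCall {
  name: string;
  input: Record<string, unknown>;
}

interface ToolResult {
  name: string;
  output: unknown;
}

// Hypothetical tool executor: each tool takes a moment to run.
async function executeTool(call: ToolCall): Promise<ToolResult> {
  await new Promise((resolve) => setTimeout(resolve, 10));
  const { a, b } = call.input as { a: number; b: number };
  if (call.name === "sum_numbers") return { name: call.name, output: a + b };
  if (call.name === "multiply_numbers") return { name: call.name, output: a * b };
  throw new Error(`Unknown tool: ${call.name}`);
}

async function processToolCalls(calls: ToolCall[]): Promise<ToolResult[]> {
  const results: ToolResult[] = [];
  // Sequential by design: the second tool does not start until the
  // first has finished, even when the calls are independent.
  for (const call of calls) {
    results.push(await executeTool(call));
  }
  return results;
}
```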
This means even when Claude requests multiple independent tools, they execute one after another within that conversation. The second tool doesn't start until the first completes. Each conversation follows its own agentic loop: Claude responds, tools execute sequentially, results go back to Claude, and the cycle continues until Claude provides a final answer.
This design keeps the code simple and predictable, but it's not the most efficient approach. Tools that involve I/O operations, like making network API calls to fetch data or contacting another agent, could run concurrently to save time. CPU-bound tools like our math operations could be offloaded to worker threads to avoid blocking. We'll explore these optimization techniques in future lessons. However, with our current setup, we've already unlocked a powerful optimization: we can launch multiple agent conversations at once, allowing different conversations to progress in parallel even if the tools within each conversation run sequentially.
You've now learned how to run multiple independent agent conversations concurrently using Promise.all, transforming serial execution into parallel processing that can dramatically reduce total execution time. You discovered that a single agent instance can safely handle multiple concurrent conversations because each maintains its own isolated messages array, and you understand that while conversations run in parallel, the agentic loop within each conversation remains sequential. Now you're ready to practice implementing concurrent agent conversations and see firsthand how parallelization improves the efficiency of your agentic systems!
