Welcome to another step in this journey! In previous lessons, you learned how to build sophisticated agent systems that can use tools, handle multi-turn conversations, and even transfer control between specialized agents. You also learned the basics of concurrent execution in TypeScript using Promises to make multiple GPT-5 calls simultaneously. Now it’s time to bring these concepts together and learn how to run multiple independent agent conversations concurrently, significantly improving the performance of your agentic systems.
In this lesson, our Agent is implemented on top of the OpenAI Responses API (client.responses.create) and uses function calling (tools) as part of the agent loop.
Let’s start by understanding what happens when we run agent conversations one after another. Consider a scenario where you have multiple math problems to solve. The straightforward approach would be to process each problem sequentially, waiting for one to complete before starting the next.
In this approach, the second conversation doesn’t start until the first one is completely finished. If the first conversation takes 5 seconds and the second takes 4 seconds, the total execution time is 9 seconds. The bottleneck here is clear: we’re forcing conversations to wait unnecessarily, even though they’re completely independent and could theoretically run at the same time.
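To make this concrete, here is a minimal sketch of serial execution. The agent below is a hypothetical stand-in whose run() resolves after a short delay, not the real Responses API agent, but the control flow is the same: each await blocks the loop until that conversation finishes.

```typescript
// Hypothetical stand-in agent: run() simulates a conversation with a delay
const agent = {
  async run(messages: { role: string; content: string }[]): Promise<string> {
    await new Promise((resolve) => setTimeout(resolve, 50));
    return `answer to: ${messages[0].content}`;
  },
};

const prompts = ["What is 12 * 7?", "What is 144 / 12?"];

async function runSerially(): Promise<string[]> {
  const results: string[] = [];
  for (const prompt of prompts) {
    // Each conversation must finish before the next one starts
    results.push(await agent.run([{ role: "user", content: prompt }]));
  }
  return results;
}
```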
Now let’s transform our serial approach into concurrent execution. The key is to create all our agent conversation tasks upfront without immediately waiting for them to complete.
We use map to create an array of tasks, where each task is a call to agent.run(). Notice we’re not using await inside the map function. In JavaScript, Promises are eager, meaning that as soon as agent.run() is called, the network request and agent logic begin executing.
Once we have our array of tasks, we can wait for all conversations to finish using Promise.all.
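Putting these two steps together, a minimal sketch (with a hypothetical stand-in agent in place of the real Responses API agent) looks like this:

```typescript
// Hypothetical stand-in agent: run() simulates a conversation with a delay
const agent = {
  async run(messages: { role: string; content: string }[]): Promise<string> {
    await new Promise((resolve) => setTimeout(resolve, 50));
    return `answer to: ${messages[0].content}`;
  },
};

const prompts = ["What is 12 * 7?", "What is 144 / 12?"];

async function runConcurrently(): Promise<string[]> {
  // No await inside map: every conversation starts immediately
  const tasks = prompts.map((prompt) =>
    agent.run([{ role: "user", content: prompt }])
  );
  // Wait for all of them; results keep the order of `prompts`
  return Promise.all(tasks);
}
```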
When you run this code, you’ll see output that demonstrates the concurrent nature of the execution. The tool calls from both conversations can appear interleaved, showing that both agents are actively working at the same time:
The results array contains the outcomes in the same order as the original prompts, regardless of which conversation finished first: Promise.all preserves input order even when completion order differs.
It is important to remember that Promise.all follows an "all-or-nothing" strategy. If any single conversation fails and its promise rejects, the entire Promise.all call will immediately reject, losing access to the results of the successful conversations. In workflows where you want to keep partial results even if one agent crashes, consider using Promise.allSettled(). This ensures you can inspect which conversations succeeded and which failed individually.
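For example, here is a sketch of the Promise.allSettled pattern, using a hypothetical task that fails on command so we can see both outcomes:

```typescript
// Hypothetical tasks standing in for agent conversations; one is made to fail
async function runWithPartialResults(prompts: string[]) {
  const tasks = prompts.map(async (prompt) => {
    if (prompt.includes("fail")) {
      throw new Error(`agent crashed on: ${prompt}`);
    }
    return `answer to: ${prompt}`;
  });

  // allSettled never rejects: every outcome is reported individually
  const settled = await Promise.allSettled(tasks);
  for (const [i, outcome] of settled.entries()) {
    if (outcome.status === "fulfilled") {
      console.log(`prompt ${i} succeeded:`, outcome.value);
    } else {
      console.log(`prompt ${i} failed:`, outcome.reason.message);
    }
  }
  return settled;
}
```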
You might wonder whether it’s safe to use the same agent instance for multiple concurrent conversations. The answer is yes, and understanding why is important for writing efficient code.
When you call agent.run(), you pass in an initial messages array that represents the conversation history. Each call to run() creates a local copy of that array and builds upon it throughout the conversation. This means each concurrent conversation maintains its own separate state.
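A simplified sketch of this idea (a hypothetical Agent class, not the real implementation) shows why a shared initial array stays untouched:

```typescript
type Message = { role: "system" | "user" | "assistant"; content: string };

// Hypothetical simplified Agent; the real one loops over Responses API calls
class Agent {
  constructor(private readonly systemPrompt: string) {}

  async run(initial: Message[]): Promise<Message[]> {
    // Local copy: pushes below never mutate the caller's array
    const messages: Message[] = [
      { role: "system", content: this.systemPrompt },
      ...initial,
    ];
    // Stand-in for the real sequence of model turns and tool results
    messages.push({ role: "assistant", content: "done" });
    return messages;
  }
}
```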
The agent’s configuration—like systemPrompt, tools, toolSchemas, model, and reasoningEffort—is read-only during execution, so multiple conversations can safely read these shared properties simultaneously without interfering with each other.
Let’s examine the concrete performance benefits of concurrent execution. Imagine you have two conversations: one takes 5 seconds to complete and another takes 4 seconds.
With serial execution, the total time is the sum of all individual conversation times. With concurrent execution, all conversations start at once, and the total execution time becomes the duration of the longest conversation.
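You can verify this arithmetic with simulated conversations. The delays below are scaled-down stand-ins for the 5-second and 4-second conversations: serial time approaches the sum, concurrent time approaches the maximum.

```typescript
// Simulate a conversation that takes `ms` milliseconds
const simulate = (ms: number) =>
  new Promise<number>((resolve) => setTimeout(() => resolve(ms), ms));

async function compare() {
  const durations = [50, 40]; // stand-ins for 5s and 4s conversations

  let start = Date.now();
  for (const d of durations) {
    await simulate(d); // serial: total is roughly the sum (~90ms)
  }
  const serial = Date.now() - start;

  start = Date.now();
  await Promise.all(durations.map(simulate)); // concurrent: roughly the max (~50ms)
  const concurrent = Date.now() - start;

  return { serial, concurrent };
}
```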
While we’re running multiple conversations concurrently, it’s important to understand what remains sequential. Within each individual conversation, tools are always executed serially, one at a time, even when the model requests multiple tools in a single turn.
In your agent loop, tool calls are extracted from response.output:
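A sketch of that step, assuming simplified item types (the real response.output carries richer items than shown here):

```typescript
// Simplified shapes for items that can appear in response.output
type OutputItem =
  | { type: "function_call"; name: string; arguments: string; call_id: string }
  | { type: "message"; content: string };

// Keep only the function calls the model requested this turn
function extractToolCalls(output: OutputItem[]) {
  return output.filter(
    (item): item is Extract<OutputItem, { type: "function_call" }> =>
      item.type === "function_call"
  );
}
```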
Then they’re executed with a for loop:
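A sketch of that loop, assuming a hypothetical tools registry mapping names to async functions and JSON-encoded arguments:

```typescript
type ToolCall = { name: string; arguments: string; call_id: string };

// Hypothetical tool registry: name -> async implementation
const tools: Record<string, (args: any) => Promise<string>> = {
  add: async ({ a, b }) => String(a + b),
  multiply: async ({ a, b }) => String(a * b),
};

async function executeToolCalls(toolCalls: ToolCall[]) {
  const outputs: { call_id: string; output: string }[] = [];
  for (const call of toolCalls) {
    // Awaiting inside the loop means each tool finishes before the next starts
    const result = await tools[call.name](JSON.parse(call.arguments));
    outputs.push({ call_id: call.call_id, output: result });
  }
  return outputs;
}
```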
This means even when the model requests multiple independent function calls, they execute one after another within that conversation. The interleaving you see in the logs happens because conversation A and conversation B are progressing concurrently, not because tools inside a single conversation are parallel.
This design keeps the code simple and predictable, but it leaves some performance on the table: tools that involve I/O, like network API calls, could run concurrently when the calls are independent of one another. We’ll explore these optimization techniques in future lessons.
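As a preview, here is a sketch of what that could look like, using a hypothetical I/O-bound tool and dispatching independent calls with Promise.all instead of a serial for loop:

```typescript
type ToolCall = { name: string; arguments: string; call_id: string };

// Hypothetical registry with a simulated I/O-bound tool (e.g. a network lookup)
const tools: Record<string, (args: any) => Promise<string>> = {
  lookup: async ({ q }) => {
    await new Promise((resolve) => setTimeout(resolve, 30));
    return `result for ${q}`;
  },
};

async function executeToolCallsConcurrently(toolCalls: ToolCall[]) {
  // All tool calls start at once; results keep the order of `toolCalls`
  return Promise.all(
    toolCalls.map(async (call) => ({
      call_id: call.call_id,
      output: await tools[call.name](JSON.parse(call.arguments)),
    }))
  );
}
```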
You’ve now learned how to run multiple independent agent conversations concurrently using Promise.all, overlapping their waiting time instead of running them back to back. You discovered that a single agent instance can safely handle multiple concurrent conversations because each run maintains its own isolated messages array.
Now you’re ready to practice implementing concurrent agent conversations!
