Introduction: Parallelizing Tools Within a Turn

In the previous lesson, you learned how to run multiple independent conversations concurrently using Promise.all(). That approach optimized execution at the application level by starting multiple agent.run() calls at once. Now, we'll take this optimization one level deeper by parallelizing tool calls within a single conversation turn.

When the model responds with multiple function_call items in a single turn, these tool calls are independent of each other, and there's no reason to wait for one to finish before starting the next. By executing these independent tool calls concurrently, we can significantly reduce the time it takes for the agent to complete complex multi-step tasks.

The Sequential Tool Execution Bottleneck

Let's examine how the current agent processes tool calls. When the model returns a response with function_call items in response.output, the agent loops through all of them and executes each tool sequentially:
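The exact loop depends on the agent implementation, but a minimal sketch of the sequential version might look like this (the names runToolsSequentially, callTool, and the shape of the fake tool are illustrative assumptions, not the lesson's actual code):

```javascript
// Hypothetical sketch of the sequential version.
// Each tool call is awaited before the next one starts.
async function runToolsSequentially(functionCalls, callTool) {
  const functionOutputs = [];
  for (const functionCall of functionCalls) {
    // The loop blocks here until this tool call finishes.
    const output = await callTool(functionCall);
    functionOutputs.push(output);
  }
  return functionOutputs;
}

// Demo: three fake tools that each take ~50 ms resolve one after another.
const fakeCallTool = (fc) =>
  new Promise((resolve) => setTimeout(() => resolve(`${fc.name} done`), 50));

runToolsSequentially(
  [{ name: "a" }, { name: "b" }, { name: "c" }],
  fakeCallTool
).then((outputs) => console.log(outputs.join(", "))); // → "a done, b done, c done"
```

With three 50 ms tools, this loop takes roughly 150 ms in total, since each await blocks until its tool completes.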

Notice the await this.callTool(functionCall) inside the loop. This means the agent processes each tool call one at a time, waiting for each to complete before starting the next. If the model requests three tool calls and each takes 1 second, the total time is 3 seconds. Since these tool calls are independent, we could execute them all at once and reduce the total time to approximately 1 second, which is exactly what we'll implement next.

The Concurrent Execution Strategy

To parallelize tool execution, we'll use the same Promise.all() pattern you learned in the previous lesson, but this time we'll apply it inside the Agent.run() method. The strategy consists of three key steps:

  1. Collect tool calls as Promise tasks:
    Loop through the function_call items in response.output and, instead of awaiting each tool call immediately, collect each as a Promise task.

  2. Execute all tool tasks concurrently:
    Use Promise.all() to run all the collected tool tasks at the same time.

  3. Gather and add results to the message history:
    Once all tool calls complete, gather their results and add them to the message history together.

This approach leverages the fact that our callTool method is already an async function that returns a Promise, making it perfect for concurrent execution.

Understanding callTool's Promise-Based Design

Our callTool method is already designed to return a Promise because it's an async function. Let's examine how it works:
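A sketch of what such a method might look like (the exact field names and error format are assumptions based on the lesson's description, not the actual implementation):

```javascript
// Hypothetical sketch of an async callTool method.
class Agent {
  constructor(tools) {
    this.tools = tools; // e.g. { add: ({ a, b }) => a + b }
  }

  // Declared async, so it always returns a Promise automatically.
  async callTool(functionCall) {
    try {
      // The model sends arguments as a JSON string; parse before calling.
      const args = JSON.parse(functionCall.arguments);
      const result = await this.tools[functionCall.name](args);
      return {
        type: "function_call_output",
        call_id: functionCall.call_id,
        output: JSON.stringify(result),
      };
    } catch (error) {
      // Surface the error to the model instead of throwing.
      return {
        type: "function_call_output",
        call_id: functionCall.call_id,
        output: `Error: ${error.message}`,
      };
    }
  }
}
```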

The key points here are:

  • Already returns a Promise: Because callTool is declared as async, it automatically returns a Promise. We don't need to wrap it with Promise.resolve().

  • Parses arguments as JSON: The functionCall.arguments string is parsed into a JSON object using JSON.parse(), then passed directly to the tool function.

Collecting Tool Tasks Instead of Awaiting

Now, let's modify the tool execution loop to collect Promise tasks instead of awaiting each tool call immediately:
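Extracted into a standalone function for illustration, the collection step might look like this (collectToolTasks is a hypothetical name; in the lesson this logic lives inside Agent.run()):

```javascript
// Sketch: collect Promises instead of awaiting inside the loop.
function collectToolTasks(functionCalls, callTool) {
  const toolTasks = [];
  for (const functionCall of functionCalls) {
    if (functionCall.name === "handoff") continue; // handoffs are handled separately
    // No await: the tool call starts now, and the loop moves on immediately.
    toolTasks.push(callTool(functionCall));
  }
  return toolTasks;
}
```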

We've introduced a new array called toolTasks that will hold our Promise objects. When we encounter a regular tool call (not a handoff), instead of writing await this.callTool(functionCall), we simply push the Promise returned by this.callTool(functionCall) into the toolTasks array.

Notice there's no await keyword here, which means the tool call starts immediately but doesn't block the loop. The loop continues, starting all the tool calls one after another without waiting for any of them to finish, and now we're ready to execute them all concurrently.

Executing All Tool Tasks with Promise.all

After collecting all the tool tasks, we use Promise.all() to execute them concurrently and gather their results:
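As a standalone sketch, the execution step might look like this (executeToolTasks is a hypothetical helper; the real code runs inline in Agent.run()):

```javascript
// Sketch: run the collected tasks concurrently and merge the results.
async function executeToolTasks(toolTasks, functionOutputs, messages) {
  if (toolTasks.length > 0) {
    // Wait for every tool call; results come back in task order.
    const toolResults = await Promise.all(toolTasks);
    functionOutputs.push(...toolResults);
  }
  // Append all outputs to the message history in one batch.
  messages.push(...functionOutputs);
  return messages;
}
```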

The if (toolTasks.length > 0) check ensures we only call Promise.all() when there are actually tool tasks to execute. This handles the case where the model's response contains only handoff calls — the toolTasks array would be empty, and functionOutputs would only contain handoff results. While Promise.all([]) would work (returning an empty array), checking first makes the intent clearer.

The Promise.all(toolTasks) call waits for all tool calls to complete and returns results in the same order as the tasks. We use the spread operator ...toolResults to add these results to functionOutputs, which might already contain handoff results from earlier in the loop. Finally, we add all the function outputs to the message history, now with the benefit of concurrent execution.

When there are no function calls in the response, the agent surfaces the final assistant message using response.output_text:
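The exit condition can be sketched as a small predicate (getFinalText is an illustrative helper name, not the lesson's actual code):

```javascript
// Sketch: the exit condition of the agent loop. When the model's response
// contains no function_call items, the turn is complete.
function getFinalText(response) {
  const hasFunctionCalls = response.output.some(
    (item) => item.type === "function_call"
  );
  return hasFunctionCalls ? null : response.output_text;
}
```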

Handling Handoffs Separately

Handoff calls are treated differently from regular tool calls, and for good reason:

  • Special Function Recognition:
    The agent exposes a special handoff function through this.handoffSchema, which the model can select just like any other function. However, when the agent detects a function_call with name === "handoff", it routes it to callHandoff instead of callTool.

  • Immediate Control Transfer:
    A handoff call transfers control to a different agent by calling that agent's run() method and returning the result of that agent's entire conversation. This is fundamentally different from a regular tool call, which simply returns a result.

  • Short-Circuit on Success:
    When a handoff succeeds, we immediately return from the current agent's run() method with the result from the target agent. This means we do not continue processing other tool calls in the same turn.

  • Error Handling:
    If a handoff fails (for example, if the target agent doesn't exist), we push a function_call_output with the error message into the message history so the agent can see what went wrong and potentially try a different approach.
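Putting those behaviors together, the handoff branch might be sketched like this (handleFunctionCall and the { done, result } return shape are illustrative assumptions; the lesson's real code runs inline in the tool loop and calls this.callHandoff):

```javascript
// Hypothetical sketch of the handoff branch inside the tool loop.
async function handleFunctionCall(functionCall, agent, functionOutputs) {
  if (functionCall.name === "handoff") {
    try {
      // Transfer control: the target agent runs its entire conversation,
      // and we short-circuit out of this agent's run() with its result.
      const result = await agent.callHandoff(functionCall);
      return { done: true, result };
    } catch (error) {
      // Record the failure so the model can try a different approach.
      functionOutputs.push({
        type: "function_call_output",
        call_id: functionCall.call_id,
        output: `Handoff failed: ${error.message}`,
      });
      return { done: false };
    }
  }
  return { done: false }; // regular tool calls take the callTool path
}
```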

Why Synchronous Tools Don't Show Visible Concurrency

When you run the code after parallelizing tool calls, you'll notice that the output looks very similar to what you'd see with sequential tool execution, even though we've implemented concurrent execution.

Notice that while some tool calls happen very close together (within milliseconds), the timestamps still show them executing in sequence. This is because our current tools are synchronous and CPU-bound (simple math functions). In JavaScript, when you call a synchronous function, even if you wrap it in a Promise or start it as part of a Promise.all(), the function runs to completion immediately, blocking the event loop until it returns.

This means that, even though the code is written to start all tool calls "at the same time," each tool call actually runs to completion before the next one can begin.
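A tiny demonstration of this behavior (syncSquare is a made-up stand-in for the lesson's math tools):

```javascript
// Demo: wrapping synchronous work in Promises does not make it parallel.
// Each syncSquare call runs to completion, on the main thread, before the
// next one even starts; Promise.resolve only wraps the finished result.
const syncSquare = (n) => n * n;

Promise.all([1, 2, 3].map((n) => Promise.resolve(syncSquare(n)))).then(
  (results) => console.log(results.join(", ")) // → "1, 4, 9"
);
```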

The real power of this pattern becomes visible when we use asynchronous tools that involve I/O operations like API calls, database queries, or network requests. With async tools, Promise.all() allows all operations to be in-flight simultaneously, and the total time becomes determined by the slowest operation rather than the sum of all operations.

For example, if you had three tools that each make a 2-second API call, sequential execution would take about 6 seconds in total, while concurrent execution would finish in roughly 2 seconds.
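This can be simulated with setTimeout standing in for the API calls (scaled down to 100 ms here so the demo runs quickly; the tool names are invented for illustration):

```javascript
// Simulated async tools: each delay stands in for a slow API call.
const delay = (ms, value) =>
  new Promise((resolve) => setTimeout(() => resolve(value), ms));

const fetchWeather = () => delay(100, "sunny");
const fetchStock = () => delay(100, "up 2%");
const fetchNews = () => delay(100, "all quiet");

async function runConcurrently() {
  const start = Date.now();
  // All three calls are in flight at once; total time ≈ the slowest call,
  // not the sum of all three.
  const results = await Promise.all([fetchWeather(), fetchStock(), fetchNews()]);
  return { results, elapsed: Date.now() - start };
}
```

Running the same three calls sequentially with await would take about three times as long, since each delay would only start after the previous one resolved.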

Best Practices for Async Tools

When working with async tools, especially those involving I/O operations, keep these practices in mind:

  • Explicit prompting for parallel execution:
    Make your agent's system prompt explicit about issuing multiple function_call items "in the same turn" when appropriate. This helps the model understand that it should produce multiple items together rather than sequencing them across turns. For example: "When given independent subproblems, call the appropriate tools in parallel within a single turn."

  • Idempotent operations:
    Design your tools so that calling them multiple times with the same input produces the same result. This makes retries safe if something goes wrong.

  • Error handling:
    Each tool call in the Promise.all() array can fail independently. Our current implementation catches errors within callTool and returns them as function_call_output results with error messages, allowing the agent to see what went wrong and potentially adjust its approach.

  • Concurrency limits:
    If you expect many parallel tool calls, consider adding a concurrency limiter to respect API rate limits or system resources. For most applications, the natural limit of how many tools the model requests in a single turn is sufficient.
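For cases where a limiter is warranted, here is a minimal hand-rolled sketch (runWithLimit is a made-up helper; in practice you might reach for a library such as p-limit instead):

```javascript
// Minimal concurrency limiter: runs task functions with at most
// `limit` Promises in flight at any one time, preserving result order.
async function runWithLimit(taskFns, limit) {
  const results = new Array(taskFns.length);
  let next = 0;
  async function worker() {
    // Each worker pulls the next unclaimed task until none remain.
    // JavaScript is single-threaded, so next++ between awaits is safe.
    while (next < taskFns.length) {
      const i = next++;
      results[i] = await taskFns[i]();
    }
  }
  const workers = Array.from(
    { length: Math.min(limit, taskFns.length) },
    () => worker()
  );
  await Promise.all(workers);
  return results;
}
```

You would wrap each tool call in a function (() => this.callTool(fc)) rather than starting it eagerly, so the limiter controls when each call begins.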

Summary: Ready to Practice Concurrent Tool Execution

You've now learned how to transform sequential tool execution into concurrent execution using Promise.all(), and you understand how this pattern delivers performance benefits with asynchronous tools. The pattern is simple but powerful:

  1. Collect function_call items from response.output as Promise tasks instead of awaiting them immediately.
  2. Use Promise.all() to execute all tasks concurrently.
  3. Gather all function_call_output results and append them to the message history together.
  4. When no further function calls are requested, surface response.output_text as the final assistant reply.

With synchronous tools, this pattern prepares your code for future optimization. With async tools involving I/O operations, it delivers immediate, substantial performance improvements. Now you're ready to practice implementing this pattern yourself and see firsthand how parallelization transforms the efficiency of your agentic systems!
