In the previous lesson, you learned how to run multiple independent conversations concurrently using Promise.all(). That approach optimized execution at the application level by starting multiple agent.run() calls at once. Now, we'll take this optimization one level deeper by parallelizing tool calls within a single conversation turn.
When the model responds with multiple function_call items in a single turn, these tool calls are independent of each other, and there's no reason to wait for one to finish before starting the next. By executing these independent tool calls concurrently, we can significantly reduce the time it takes for the agent to complete complex multi-step tasks.
Let's examine how the current agent processes tool calls. When the model returns a response with function_call items in response.output, the agent loops through all of them and executes each tool sequentially:
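Your exact code may differ, but the sequential version looks something like this sketch (using this.messages as the running message history is an assumption here, not the exact course code):

```js
// Sequential version (sketch): each tool call is awaited inside the loop,
// so the next call cannot start until the previous one finishes.
for (const functionCall of response.output) {
  if (functionCall.type === "function_call") {
    const output = await this.callTool(functionCall); // blocks the loop
    this.messages.push(output); // add the result to the message history
  }
}
```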
Notice the await this.callTool(functionCall) inside the loop. This means the agent processes each tool call one at a time, waiting for each to complete before starting the next. If the model requests three tool calls and each takes 1 second, the total time is 3 seconds. Since these tool calls are independent, we could execute them all at once and reduce the total time to approximately 1 second, which is exactly what we'll implement next.
To parallelize tool execution, we'll use the same Promise.all() pattern you learned in the previous lesson, but this time we'll apply it inside the Agent.run() method. The strategy consists of three key steps:
- **Collect tool calls as Promise tasks:** Loop through the function_call items in response.output and, instead of awaiting each tool call immediately, collect each as a Promise task.
- **Execute all tool tasks concurrently:** Use Promise.all() to run all the collected tool tasks at the same time.
- **Gather and add results to the message history:** Once all tool calls complete, gather their results and add them to the message history together.
This approach leverages the fact that our callTool method is already an async function that returns a Promise, making it perfect for concurrent execution.
Before we modify the loop, let's take a closer look at how callTool works:
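Its exact shape depends on your agent, but a plausible sketch looks like this (the this.tools name-to-function map and the output object shape are assumptions for illustration):

```js
// Sketch of callTool: declared async, so it always returns a Promise.
async callTool(functionCall) {
  try {
    // Parse the JSON-encoded argument string into a plain object.
    const args = JSON.parse(functionCall.arguments);
    // Look up the tool by name and invoke it with the parsed arguments.
    const result = await this.tools[functionCall.name](args);
    return {
      type: "function_call_output",
      call_id: functionCall.call_id,
      output: JSON.stringify(result),
    };
  } catch (error) {
    // Errors become outputs too, so the model can see what went wrong.
    return {
      type: "function_call_output",
      call_id: functionCall.call_id,
      output: `Error: ${error.message}`,
    };
  }
}
```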
The key points here are:
- **Already returns a Promise:** Because callTool is declared as async, it automatically returns a Promise. We don't need to wrap it with Promise.resolve().
- **Parses arguments as JSON:** The functionCall.arguments string is parsed into a JavaScript object using JSON.parse(), then passed directly to the tool function.
Now, let's modify the tool execution loop to collect Promise tasks instead of awaiting each tool call immediately:
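Here's a sketch of the revised loop (handoff handling is summarized as a comment here and shown in full later):

```js
// Collect tool-call Promises instead of awaiting each call in turn.
const toolTasks = [];
const functionOutputs = [];

for (const functionCall of response.output) {
  if (functionCall.type !== "function_call") continue;

  if (functionCall.name === "handoff") {
    // Handoff calls keep their special treatment (shown in full below).
    continue;
  }

  // No await: the tool call starts now, but the loop moves on immediately.
  toolTasks.push(this.callTool(functionCall));
}
```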
We've introduced a new array called toolTasks that will hold our Promise objects. When we encounter a regular tool call (not a handoff), instead of writing await this.callTool(functionCall), we simply push the Promise returned by this.callTool(functionCall) into the toolTasks array.
Notice there's no await keyword here, which means the tool call starts immediately but doesn't block the loop. The loop continues, starting all the tool calls one after another without waiting for any of them to finish, and now we're ready to execute them all concurrently.
After collecting all the tool tasks, we use Promise.all() to execute them concurrently and gather their results:
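In sketch form, continuing from the loop above:

```js
if (toolTasks.length > 0) {
  // Wait for every collected tool call to finish; results arrive in the
  // same order the tasks were pushed, regardless of which finished first.
  const toolResults = await Promise.all(toolTasks);
  functionOutputs.push(...toolResults);
}

// Append every function output to the message history in one batch.
this.messages.push(...functionOutputs);
```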
The if (toolTasks.length > 0) check ensures we only call Promise.all() when there are actually tool tasks to execute. This handles the case where the model's response contains only handoff calls — the toolTasks array would be empty, and functionOutputs would only contain handoff results. While Promise.all([]) would work (returning an empty array), checking first makes the intent clearer.
The Promise.all(toolTasks) call waits for all tool calls to complete and returns results in the same order as the tasks. We use the spread operator ...toolResults to add these results to functionOutputs, which might already contain handoff results from earlier in the loop. Finally, we add all the function outputs to the message history, now with the benefit of concurrent execution.
When there are no function calls in the response, the agent surfaces the final assistant message using response.output_text:
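A minimal sketch of that check (assuming run() resolves to the assistant's final text):

```js
// When the model requests no tools, the turn is done: return its answer.
const requestedTools = response.output.some(
  (item) => item.type === "function_call"
);
if (!requestedTools) {
  return response.output_text; // final assistant message for this run
}
```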
Handoff calls are treated differently from regular tool calls, and for good reason (a code sketch follows the list below):
- **Special Function Recognition:** The agent exposes a special handoff function through this.handoffSchema, which the model can select just like any other function. However, when the agent detects a function_call with name === "handoff", it routes it to callHandoff instead of callTool.
- **Immediate Control Transfer:** A handoff call transfers control to a different agent by calling that agent's run() method and returning the result of that agent's entire conversation. This is fundamentally different from a regular tool call, which simply returns a result.
- **Short-Circuit on Success:** When a handoff succeeds, we immediately return from the current agent's run() method with the result from the target agent. This means we do not continue processing other tool calls in the same turn.
- **Error Handling:** If a handoff fails (for example, if the target agent doesn't exist), we push a function_call_output with the error message into functionOutputs so the agent can see what went wrong and potentially try a different approach.
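Putting these rules together, the handoff branch inside the loop might look like this sketch (whether a failure surfaces as an exception or a returned value depends on your callHandoff; this sketch assumes it throws):

```js
if (functionCall.name === "handoff") {
  try {
    // Success: run the target agent and short-circuit this run() with
    // the target agent's entire result. Remaining tool calls are skipped.
    return await this.callHandoff(functionCall);
  } catch (error) {
    // Failure: record the error so the model can adjust its approach.
    functionOutputs.push({
      type: "function_call_output",
      call_id: functionCall.call_id,
      output: `Handoff failed: ${error.message}`,
    });
  }
}
```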
When you run the code after parallelizing tool calls, you'll notice that the output looks very similar to what you'd see with sequential tool execution, even though we've implemented concurrent execution:
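The exact output depends on your tools, but it might look roughly like this (a made-up transcript with illustrative timestamps):

```
[14:32:07.512] Tool call: add({"a": 5, "b": 3}) -> 8
[14:32:07.513] Tool call: multiply({"a": 12, "b": 7}) -> 84
[14:32:07.513] Tool call: subtract({"a": 10, "b": 4}) -> 6
```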
Notice that while some tool calls happen very close together (within milliseconds), the timestamps still show them executing in sequence. This is because our current tools are synchronous and CPU-bound (simple math functions). In JavaScript, when you call a synchronous function (even if you wrap it in a Promise or start it as part of a Promise.all()), it runs to completion immediately, blocking the event loop until it finishes.
This means that, even though the code is written to start all tool calls "at the same time," each tool call actually runs to completion before the next one can begin.
The real power of this pattern becomes visible when we use asynchronous tools that involve I/O operations like API calls, database queries, or network requests. With async tools, Promise.all() allows all operations to be in-flight simultaneously, and the total time becomes determined by the slowest operation rather than the sum of all operations.
For example, if you had three tools that each make a 2-second API call:
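Here's a self-contained sketch with three hypothetical tools (fetchWeather, fetchStock, and fetchNews are names invented for this example; run it as an ES module so top-level await is available):

```js
// Hypothetical async tools: each stands in for a 2-second API call.
const wait = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

const fetchWeather = async () => { await wait(2000); return "sunny"; };
const fetchStock = async () => { await wait(2000); return "up 2%"; };
const fetchNews = async () => { await wait(2000); return "quiet day"; };

const start = Date.now();

// Sequential execution would take ~6 seconds total:
//   await fetchWeather(); await fetchStock(); await fetchNews();

// Concurrent execution keeps all three calls in flight at once, so the
// total time is roughly that of the slowest call: ~2 seconds.
const results = await Promise.all([fetchWeather(), fetchStock(), fetchNews()]);
console.log(results, `took ${Date.now() - start} ms`); // ≈ 2000 ms
```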
When working with async tools, especially those involving I/O operations, keep these practices in mind:
- **Explicit prompting for parallel execution:** Make your agent's system prompt explicit about issuing multiple function_call items in the same turn when appropriate. This helps the model understand that it should produce multiple items together rather than sequencing them across turns. For example: "When given independent subproblems, call the appropriate tools in parallel within a single turn."
- **Idempotent operations:** Design your tools so that calling them multiple times with the same input produces the same result. This makes retries safe if something goes wrong.
- **Error handling:** Each tool call in the Promise.all() array can fail independently. Our current implementation catches errors within callTool and returns them as function_call_output results with error messages, allowing the agent to see what went wrong and potentially adjust its approach.
- **Concurrency limits:** If you expect many parallel tool calls, consider adding a concurrency limiter to respect API rate limits or system resources. For most applications, the natural limit of how many tools the model requests in a single turn is sufficient; if you do need a cap, a minimal limiter sketch follows this list.
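Here's one hand-rolled approach (libraries such as p-limit implement the same idea; runWithLimit is a name invented for this sketch):

```js
// A minimal concurrency limiter: runs at most `limit` tasks at once.
// `tasks` is an array of thunks (functions returning Promises), so each
// task starts only when a worker picks it up.
async function runWithLimit(tasks, limit) {
  const results = new Array(tasks.length);
  let next = 0;

  // Each worker repeatedly claims the next task index until none remain.
  async function worker() {
    while (next < tasks.length) {
      const index = next++;
      results[index] = await tasks[index]();
    }
  }

  await Promise.all(
    Array.from({ length: Math.min(limit, tasks.length) }, worker)
  );
  return results;
}

// Usage sketch: wrap each tool call in a thunk before handing it over.
// const toolResults = await runWithLimit(
//   functionCalls.map((fc) => () => this.callTool(fc)),
//   3 // at most three tool calls in flight at once
// );
```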
You've now learned how to transform sequential tool execution into concurrent execution using Promise.all(), and you understand how this pattern delivers performance benefits with asynchronous tools. The pattern is simple but powerful:
- Collect function_call items from response.output as Promise tasks instead of awaiting them immediately.
- Use Promise.all() to execute all tasks concurrently.
- Gather all function_call_output results and append them to the message history together.
- When no further function calls are requested, surface response.output_text as the final assistant reply.
With synchronous tools, this pattern prepares your code for future optimization. With async tools involving I/O operations, it delivers immediate, substantial performance improvements. Now you're ready to practice implementing this pattern yourself and see firsthand how parallelization transforms the efficiency of your agentic systems!
