Parallel Tool Execution

Introduction & Context

Welcome back! In the previous lesson, you converted the agent's Gemini calls to work inside an async workflow, so multiple conversations can make progress while waiting for model responses. That was an important first step, but there is still another bottleneck: tool execution. When Gemini asks for tools, the agent receives one or more function calls in the model response. If the agent executes those function calls one at a time, a single slow tool can delay every other tool result in that turn. In this lesson, you'll make tool execution non-blocking and then run multiple requested tools concurrently with asyncio.create_task() and asyncio.gather().

Understanding the Tool Execution Bottleneck

The agent already wraps Gemini API calls with asyncio.to_thread(), because the Google Gen AI Python SDK call is synchronous. We need the same idea for regular Python tools. A math function like sum_numbers() returns quickly, but in real systems a tool might call another service, read a file, query a database, or do other blocking work. If we call that function directly from an async method, it blocks the event loop. The first improvement is to make _call_tool() itself async and run synchronous tools in a worker thread:

    async def _call_tool(self, name, args):
        print(f"🔧 Tool called: {name}({args})")
        try:
            fn = self.tools[name]
        except KeyError:
            result = f"Error: Function {name} not found"
            return self._function_response_part(name, result)
        try:
            result = await asyncio.to_thread(fn, **args)
            return self._function_response_part(name, str(result))
        except Exception as e:
            result = f"Error: {str(e)}"
            return self._function_response_part(name, result)

This change prevents a blocking tool from freezing the event loop, so other conversations can continue while the tool runs.
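To see why the worker thread matters, here is a minimal, self-contained sketch that is separate from the agent code. The names slow_tool(), heartbeat(), and ticks_during() are hypothetical helpers invented for this illustration, and time.sleep() stands in for any blocking work. The sketch counts how many heartbeat ticks the event loop manages to record while the tool is running.

```python
import asyncio
import time


def slow_tool(seconds: float) -> str:
    """Hypothetical blocking tool; time.sleep() stands in for slow I/O."""
    time.sleep(seconds)
    return "done"


async def heartbeat(ticks: list, interval: float = 0.05) -> None:
    """Append a timestamp on every interval; stalls if the loop is blocked."""
    while True:
        await asyncio.sleep(interval)
        ticks.append(time.perf_counter())


async def ticks_during(call_tool) -> int:
    """Count heartbeat ticks recorded while call_tool() is running."""
    ticks: list = []
    beat = asyncio.create_task(heartbeat(ticks))
    await call_tool()
    seen = len(ticks)
    # Stop the heartbeat and wait for the cancellation to finish.
    beat.cancel()
    await asyncio.gather(beat, return_exceptions=True)
    return seen


async def call_directly() -> str:
    # Blocks the event loop: nothing else can run for 0.3 seconds.
    return slow_tool(0.3)


async def call_in_thread() -> str:
    # Runs the tool in a worker thread; the event loop stays free.
    return await asyncio.to_thread(slow_tool, 0.3)


async def main():
    blocked = await ticks_during(call_directly)
    offloaded = await ticks_during(call_in_thread)
    return blocked, offloaded


blocked_ticks, offloaded_ticks = asyncio.run(main())
print(f"ticks while calling directly: {blocked_ticks}")
print(f"ticks while using to_thread:  {offloaded_ticks}")
```

The direct call records no ticks because the heartbeat never gets a turn, while the to_thread() version lets it tick several times. The exact count in the second case depends on timing.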

Scheduling Multiple Tool Calls

Making _call_tool() async is useful, but if the run() method still awaits each tool immediately, tools in the same turn still execute one after another:

    for name, args in self._iter_function_calls(response):
        tool_result = await self._call_tool(name, args)
        tool_results.append(tool_result)

To run independent tools in parallel, collect them as tasks first:

    tool_results = []
    tasks = []
    for name, args in self._iter_function_calls(response):
        if name == "handoff":
            ok, res = await self._call_handoff(args, messages)
            if ok:
                return res
            tool_results.append(self._function_response_part("handoff", res))
        else:
            tasks.append(asyncio.create_task(self._call_tool(name, args)))

asyncio.create_task() schedules each tool call without waiting for it immediately. That gives all tool calls in the same model turn a chance to run concurrently.
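The difference between awaiting each call immediately and scheduling tasks first is easy to measure. The following sketch is an illustration only: fake_tool() is a hypothetical coroutine that simulates a tool with asyncio.sleep(), and the two runner functions are not part of the agent.

```python
import asyncio
import time


async def fake_tool(name: str, delay: float) -> str:
    """Hypothetical tool; asyncio.sleep() simulates waiting on I/O."""
    await asyncio.sleep(delay)
    return f"{name} finished"


async def run_sequentially(calls):
    """Await each call before starting the next one."""
    start = time.perf_counter()
    results = [await fake_tool(name, delay) for name, delay in calls]
    return results, time.perf_counter() - start


async def run_as_tasks(calls):
    """Schedule every call first, then wait for the results."""
    start = time.perf_counter()
    tasks = [asyncio.create_task(fake_tool(name, delay)) for name, delay in calls]
    results = [await task for task in tasks]
    return results, time.perf_counter() - start


async def main():
    calls = [("a", 0.2), ("b", 0.2), ("c", 0.2)]
    sequential = await run_sequentially(calls)
    concurrent = await run_as_tasks(calls)
    return sequential, concurrent


(seq_results, seq_time), (task_results, task_time) = asyncio.run(main())
print(f"sequential: {seq_time:.2f}s, as tasks: {task_time:.2f}s")
```

Both versions return the same results, but the sequential run takes roughly the sum of the delays, while the task-based run takes roughly as long as the slowest call.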

Gathering Tool Results

Once all tool tasks are scheduled, asyncio.gather() waits for them and returns their results in the same order as the task list:

    if tasks:
        tool_results.extend(await asyncio.gather(*tasks))
    if tool_results:
        messages.append({"role": "user", "parts": tool_results})

Because _call_tool() catches its own exceptions and returns formatted function_response parts, one failed tool can be sent back to Gemini as an error result without crashing the whole agent loop.
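Both properties, stable ordering and contained failures, can be checked in a small standalone sketch. Here call_tool() is a hypothetical stand-in for _call_tool() that returns plain dictionaries instead of function_response parts, and completion_order exists only to show that tasks finish in a different order than they were scheduled.

```python
import asyncio

# Records tool names in the order they actually finish.
completion_order = []


async def call_tool(name: str, delay: float, fail: bool = False) -> dict:
    """Hypothetical stand-in for _call_tool(): never raises, always returns a result."""
    try:
        await asyncio.sleep(delay)
        if fail:
            raise ValueError("bad input")
        result = "ok"
    except Exception as e:
        # Convert the failure into an error result instead of propagating it.
        result = f"Error: {e}"
    completion_order.append(name)
    return {"name": name, "result": result}


async def main():
    tasks = [
        asyncio.create_task(call_tool("slow", 0.2)),
        asyncio.create_task(call_tool("broken", 0.1, fail=True)),
        asyncio.create_task(call_tool("fast", 0.0)),
    ]
    # gather() returns results in task-list order, not completion order.
    return await asyncio.gather(*tasks)


results = asyncio.run(main())
print("finished in order:", completion_order)
print("results in order: ", [r["name"] for r in results])
print("broken tool result:", results[1]["result"])
```

The fastest tool finishes first, yet its result still appears last, matching its position in the task list. The failing tool contributes an error string rather than cancelling the other results.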

Adding a Synchronous Wrapper

After converting run() to async, synchronous scripts can no longer call agent.run(...) directly. They receive a coroutine instead of a final result. For tests or simple scripts that are not already inside an event loop, a small wrapper can make the async agent easier to use:

    def run_sync(self, input_messages):
        return asyncio.run(self.run(input_messages))

This wrapper does not replace the async API. It simply provides a convenient bridge for synchronous callers. Note that asyncio.run() raises a RuntimeError when an event loop is already running in the current thread, which is typically the case inside Jupyter notebooks, so run_sync() does not work there. If your application is already async, you should continue to use await agent.run(...) directly.
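The sketch below shows both sides of that bridge under simplified assumptions. TinyAgent is a hypothetical class whose run() only sleeps briefly and reports how many messages it received; it is not the lesson's agent. The second half shows what happens when asyncio.run() is attempted from code that is already running inside an event loop.

```python
import asyncio


class TinyAgent:
    """Hypothetical minimal agent; only the sync/async bridge matters here."""

    async def run(self, input_messages):
        await asyncio.sleep(0.01)  # stands in for model and tool calls
        return f"handled {len(input_messages)} message(s)"

    def run_sync(self, input_messages):
        # Starts a fresh event loop, runs the coroutine, and closes the loop.
        return asyncio.run(self.run(input_messages))


agent = TinyAgent()

# From plain synchronous code, the wrapper returns the final result.
sync_result = agent.run_sync([{"role": "user", "parts": ["hi"]}])
print(sync_result)


async def already_async():
    """Inside a running loop, asyncio.run() refuses to start a second one."""
    coro = agent.run([])
    try:
        asyncio.run(coro)
        message = None
    except RuntimeError as exc:
        coro.close()  # avoid a "never awaited" warning for the unused coroutine
        message = str(exc)
    # The right approach in async code is to await run() directly.
    return message, await agent.run([])


nested_error, async_result = asyncio.run(already_async())
print(nested_error)
print(async_result)
```

This is why the wrapper is best reserved for entry points such as command-line scripts, while async callers keep awaiting run() themselves.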

Summary & Practice Exercises

In this lesson, you refined your agent's parallel capabilities by removing the tool execution bottleneck. In the practice exercises, you'll first make _call_tool() async with asyncio.to_thread(), then launch multiple tool calls with asyncio.create_task() and collect them with asyncio.gather(). Finally, you'll add a run_sync() helper so synchronous code can still use the async agent safely.
