Gemini Tool Use Workflow

Introduction

Welcome to understanding how Gemini handles tool use in its API! In the previous lesson, you learned how to define tool schemas that describe your functions for Gemini. Now, you'll discover how to include these tools in your Gemini API requests and, most importantly, how to interpret Gemini's responses when it decides to call a tool. In this lesson, you'll learn how to configure your requests to enable tool use, understand the different types of responses Gemini can provide, and extract the specific information you need to execute the tools Gemini requests. By the end, you'll be able to recognize when Gemini wants to use a tool and gather all the details needed to make that happen.

The Client-Side Loop

It is important to understand that Gemini does not execute your code. When Gemini "calls a tool," it is actually sending a structured request back to you. Your Python application is responsible for the heavy lifting:

1. Parsing the request from Gemini.
2. Executing the actual Python function locally in your environment.
3. Sending the function's output back to Gemini to continue the conversation.

This design ensures you maintain full control over your security and environment, as the model never has direct access to your local system.
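Because execution happens on your side, you decide which functions can run at all. The following is a minimal sketch of that idea, not part of the Gemini SDK: `ALLOWED_TOOLS` and `run_requested_tool` are hypothetical names, and the `SimpleNamespace` objects are stand-ins that mimic the `name` and `args` fields of a function call described later in this lesson.

```python
from types import SimpleNamespace


def sum_numbers(a, b):
    """Local tool implementation; it runs in your process, not on Gemini's side."""
    return a + b


# Only functions listed here can ever be executed, whatever the model asks for.
ALLOWED_TOOLS = {"sum_numbers": sum_numbers}


def run_requested_tool(function_call):
    """Execute a requested tool if it is on the allowlist, else report an error."""
    tool = ALLOWED_TOOLS.get(function_call.name)
    if tool is None:
        return {"error": f"Unknown tool: {function_call.name}"}
    return {"result": tool(**dict(function_call.args))}


# Stand-ins for the function_call object found in a real response part.
requested = SimpleNamespace(name="sum_numbers", args={"a": 15, "b": 27})
blocked = SimpleNamespace(name="delete_files", args={})

print(run_requested_tool(requested))  # {'result': 42}
print(run_requested_tool(blocked))    # {'error': 'Unknown tool: delete_files'}
```

A request for a function that is not registered never reaches any code, which is one practical form of the control described above.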

The Tool Use Workflow

Integrating tools into your agent involves a specific lifecycle. To build a functional tool-calling agent, you must follow these steps:

1. Define: Describe your tools using function schemas.
2. Provide: Pass these tools to Gemini in your API request.
3. Detect: Inspect the response structure to see if Gemini wants to call a tool.
4. Extract: Pull the tool name and arguments from the response.
5. Execute: Run your local code with the provided arguments.
6. Return: Send the results back to Gemini to complete the interaction.

Let's dive into the details of how this works in practice.
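To see how the six steps fit together before touching the real API, here is an offline dry run. It is only a sketch: `fake_generate` is a hypothetical stand-in for the API call that always requests `sum_numbers`, the schema is abbreviated, and the `function_response` dictionary at the end is an assumed shape for the result you would send back.

```python
# 1. Define: describe the tool (parameters omitted here for brevity).
tool_schema = {"name": "sum_numbers", "description": "Add two numbers."}


def sum_numbers(a, b):
    return a + b


LOCAL_TOOLS = {"sum_numbers": sum_numbers}


def fake_generate(contents, tools):
    """Hypothetical stand-in for the API call; always requests the first tool."""
    # 2. Provide: a real request would send `tools` along with `contents`.
    call = {"name": tools[0]["name"], "args": {"a": 15, "b": 27}}
    return {"parts": [{"function_call": call}]}


response = fake_generate("Please calculate 15 + 27", [tool_schema])

# 3. Detect: look for function_call entries in the parts.
calls = [p["function_call"] for p in response["parts"] if "function_call" in p]

results = []
for call in calls:
    # 4. Extract: pull out the tool name and arguments.
    name, args = call["name"], call["args"]
    # 5. Execute: run the matching local function.
    output = LOCAL_TOOLS[name](**args)
    # 6. Return: package the output to send back (assumed shape).
    results.append({"function_response": {"name": name, "response": {"result": output}}})

print(results)
```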

Guiding Gemini's Tool Use with System Instructions

Gemini can automatically detect and use any tools you provide in your request. However, you can guide Gemini's behavior by including instructions in the system message or prompt, which helps ensure more consistent and appropriate tool usage. Here's an example of how you can provide tool usage guidance in Gemini's system instruction:

    from google import genai
    from google.genai import types

    # System instruction with optional tool usage guidance
    system_instruction = (
        "You are a helpful math assistant. "
        "When performing calculations, use the available tools for accuracy."
    )

When you send a request to Gemini, you can include this instruction as part of the system_instruction parameter (or as the first message in the conversation, depending on the API version). This guidance helps Gemini understand your preferences for when and how to use tools, but it is not required for tool functionality. Gemini can still recognize and use tools based solely on their availability in the request.

Providing Tools to Gemini

To make your functions available to Gemini, you provide a list of tool schemas using the tools field in GenerateContentConfig. Gemini uses these schemas to understand what functions it can call and how to call them.

Note: The Google Gen AI Python SDK accepts standard JSON Schema types (e.g., "object", "number"). You can pass your schemas as-is.

Here's how you can provide tools to Gemini using the google.genai library and the model models/gemini-flash-latest:

    import json
    import os

    from google import genai
    from google.genai import types

    # Initialize the Gemini API client
    API = os.environ["GOOGLE_API_KEY"]
    BASE = os.environ.get("GOOGLE_BASE_URL", "").rstrip("/")

    client_kwargs = {"api_key": API}
    if BASE:
        client_kwargs["http_options"] = types.HttpOptions(base_url=BASE)
    client = genai.Client(**client_kwargs)

    model_name = "models/gemini-flash-latest"

    system_instruction = (
        "You are a helpful math assistant. "
        "When performing calculations, use the available tools for accuracy."
    )

    # Load the tool schemas from disk
    with open("schemas.json", "r") as f:
        tool_schemas = json.load(f)

    # Configure tool usage for this request
    config = types.GenerateContentConfig(
        tools=[types.Tool(function_declarations=tool_schemas)],
        system_instruction=system_instruction,
    )

    def generate(contents):
        return client.models.generate_content(
            model=model_name,
            contents=contents,
            config=config,
        )

    # Create a message requesting a calculation
    messages = [
        {"role": "user", "parts": [{"text": "Please calculate 15 + 27"}]}
    ]

    # Send the request with tools enabled
    response = generate(messages)

Gemini will recognize the available tools from the tools parameter. The system instruction can help encourage consistent tool usage, but Gemini will still be able to see and use your tools even without explicit instructions.
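The code above loads schemas.json without showing what the file contains. As an illustration only, the sketch below writes and reloads a file with one plausible entry for the `sum_numbers` tool used in this lesson. The descriptions and parameter details are assumptions; your own file from the previous lesson may differ. A temporary directory is used so the example does not overwrite an existing file.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical contents for schemas.json: a list of function declarations.
tool_schemas = [
    {
        "name": "sum_numbers",
        "description": "Add two numbers and return the sum.",
        "parameters": {
            "type": "object",
            "properties": {
                "a": {"type": "number", "description": "First addend."},
                "b": {"type": "number", "description": "Second addend."},
            },
            "required": ["a", "b"],
        },
    }
]

with tempfile.TemporaryDirectory() as tmp:
    schema_path = Path(tmp) / "schemas.json"
    # Write the schemas, then read them back the same way the lesson code does.
    schema_path.write_text(json.dumps(tool_schemas, indent=2))
    with open(schema_path, "r") as f:
        loaded = json.load(f)

print([schema["name"] for schema in loaded])  # ['sum_numbers']
```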

Understanding Gemini's Tool Use Responses

When Gemini decides to use a tool, the response structure includes a special field indicating a function call, rather than just a simple text reply. Let's examine what a complete tool use response looks like by printing the entire response structure. We'll use pprint (short for "pretty-print"), a Python standard library module that formats complex data structures like dictionaries to make them more human-readable:

    import pprint

    # Print the complete response structure
    pprint.pprint(response.model_dump())

This will output a detailed dictionary showing all the components of Gemini's response. A typical tool use response from Gemini, simplified to its key fields, looks like this:

    {
        'candidates': [
            {
                'content': {
                    'parts': [
                        {
                            'function_call': {
                                'name': 'sum_numbers',
                                'args': {'a': 15, 'b': 27}
                            }
                        }
                    ],
                    'role': 'model'
                },
                'finish_reason': 'function_call'
            }
        ],
        'usage_metadata': {...}
    }

Notice two key elements that appear when Gemini wants to use a tool:

- The parts array: This contains a function_call block with the tool name and arguments.
- The finish_reason: This is often set to "function_call".

This structure allows Gemini to provide both the intent to call a function and the structured information your system needs to execute the requested function.
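The dictionary returned by model_dump() is navigated with keys, whereas the response object itself is navigated with attributes. The short sketch below walks a hand-written dictionary trimmed from the sample output in this section, so it runs without an API call. The variable names are illustrative.

```python
# Sample dictionary trimmed from the dumped structure shown in this lesson.
dumped = {
    "candidates": [
        {
            "content": {
                "parts": [
                    {"function_call": {"name": "sum_numbers",
                                       "args": {"a": 15, "b": 27}}}
                ]
            },
            "finish_reason": "function_call",
        }
    ]
}

# Walk candidates -> content -> parts and collect any function calls.
found = []
for part in dumped["candidates"][0]["content"]["parts"]:
    call = part.get("function_call")
    if call:
        found.append((call["name"], call["args"]))

print(found)  # [('sum_numbers', {'a': 15, 'b': 27})]
```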

Detecting Tool Use in Gemini Responses

While the finish_reason field provides a hint about why the model stopped, the most robust way to determine if Gemini wants to use a tool is to inspect the content parts directly. Depending on the SDK version or specific model behavior, finish_reason can sometimes vary, but the presence of a function_call in the parts is the definitive signal. Here is how you can reliably detect if any tool calls were requested:

    # Check the first candidate for parts
    parts = response.candidates[0].content.parts

    # Detect if any part contains a function call
    tool_calls = [part.function_call for part in parts if part.function_call]

    if tool_calls:
        print(f"Gemini requested {len(tool_calls)} tool call(s).")
    else:
        print("Gemini returned a text response.")

By scanning the parts array, your agent logic remains resilient. Here are the common values for finish_reason you might see as metadata:

- "function_call": Gemini has requested tool execution and is waiting for results.
- "stop": Gemini has completed its response (though it may still contain tool calls in some scenarios).
- "max_tokens": Gemini reached the token limit.

The exact spelling of these values depends on the SDK and endpoint you use. The google-genai SDK, for example, exposes finish reasons as enum members such as STOP and MAX_TOKENS, which is one more reason to rely on the parts rather than on this field.
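The detection logic can be wrapped in a small helper that also copes with responses that have no candidates, no content, or no parts. This is a sketch under the assumption that any of those fields may be missing or None. `extract_function_calls` is a hypothetical helper name, and the `SimpleNamespace` objects merely mimic the attribute layout of a real response.

```python
from types import SimpleNamespace


def extract_function_calls(response):
    """Return every function_call in the first candidate, or [] if none exist."""
    candidates = getattr(response, "candidates", None) or []
    if not candidates:
        return []
    content = getattr(candidates[0], "content", None)
    parts = getattr(content, "parts", None) or []
    return [
        part.function_call
        for part in parts
        if getattr(part, "function_call", None)
    ]


# Stand-in objects (hypothetical) that mimic a tool-use response.
call = SimpleNamespace(name="sum_numbers", args={"a": 15, "b": 27})
tool_part = SimpleNamespace(function_call=call, text=None)
tool_response = SimpleNamespace(
    candidates=[SimpleNamespace(content=SimpleNamespace(parts=[tool_part]))]
)

# A response whose first candidate carries no content at all.
empty_response = SimpleNamespace(candidates=[SimpleNamespace(content=None)])

print(len(extract_function_calls(tool_response)))  # 1
print(extract_function_calls(empty_response))      # []
```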

Understanding the Parts Array and Function Call Blocks

When a tool call is detected, the parts array in the response contains one or more function_call blocks. Each block specifies a function Gemini wants to call, along with the arguments. Let's iterate through the parts array to examine each item:

    # Process each part in the response
    parts = response.candidates[0].content.parts

    for i, part in enumerate(parts):
        print(f"\nPart {i+1}:")
        if part.function_call:
            func_call = part.function_call
            print(f"Function Name: {func_call.name}")
            print(f"Arguments: {dict(func_call.args)}")
        elif part.text:
            print(f"Text: {part.text}")

Running this code will show you the structure of each part:

    Part 1:
    Function Name: sum_numbers
    Arguments: {'a': 15, 'b': 27}

Each function call block contains the essential information your system needs to execute the requested function:

- name: The function name that matches your tool schema (e.g., "sum_numbers").
- args: A dictionary of parameters to pass to your function, with keys matching your schema's parameter names.

If Gemini wants to call multiple tools in parallel, the parts array may contain multiple function_call blocks.
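Since the args keys are expected to match your schema's parameter names, you may want to check them before running anything. The sketch below shows one possible pre-execution check. `check_arguments` is a hypothetical helper, and the schema is an assumed example for `sum_numbers`, not taken from your schemas.json.

```python
def check_arguments(schema, args):
    """Return a list of problems found when comparing args with a tool schema."""
    params = schema.get("parameters", {})
    known = set(params.get("properties", {}))
    required = set(params.get("required", []))
    problems = [f"missing argument: {name}" for name in sorted(required - set(args))]
    problems += [f"unexpected argument: {name}" for name in sorted(set(args) - known)]
    return problems


# Assumed schema for the sum_numbers tool used throughout this lesson.
schema = {
    "name": "sum_numbers",
    "parameters": {
        "type": "object",
        "properties": {"a": {"type": "number"}, "b": {"type": "number"}},
        "required": ["a", "b"],
    },
}

print(check_arguments(schema, {"a": 15, "b": 27}))  # []
print(check_arguments(schema, {"a": 15, "c": 1}))
# ['missing argument: b', 'unexpected argument: c']
```

An empty list means the arguments line up with the schema. Anything else can be reported back instead of calling the function with bad input.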

Summary and Next Steps

You now understand how Gemini communicates its tool use intentions through structured API responses. Remember that Gemini only requests the call; your client code must perform the actual execution. To detect these requests reliably, you should always inspect the parts of the response for function_call objects. In the upcoming practice exercises, you'll work with these response structures hands-on, learning to parse tool use requests and prepare for the next step: actually executing the requested tools and sending results back to Gemini to complete the conversation flow.
