Introduction

Welcome to this lesson on how Gemini handles tool use in its API! In the previous lesson, you learned how to define tool schemas that describe your functions for Gemini. Now, you'll discover how to include these tools in your Gemini API requests and, most importantly, how to interpret Gemini's responses when it decides to call a tool.

In this lesson, you'll learn how to configure your requests to enable tool use, understand the different types of responses Gemini can provide, and extract the specific information you need to execute the tools Gemini requests. By the end, you'll be able to recognize when Gemini wants to use a tool and gather all the details needed to make that happen.

The Client-Side Loop

It is important to understand that Gemini does not execute your code. When Gemini "calls a tool," it is actually sending a structured request back to you. Your Python application is responsible for the heavy lifting:

  1. Parsing the request from Gemini.
  2. Executing the actual Python function locally in your environment.
  3. Sending the function's output back to Gemini to continue the conversation.

This design ensures you maintain full control over your security and environment, as the model never has direct access to your local system.
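The three steps above can be sketched in plain Python. This is a minimal illustration rather than SDK code: the requested_call dictionary is a hand-written stand-in for the structured request Gemini sends, and sum_numbers, AVAILABLE_TOOLS, and handle_tool_request are hypothetical names chosen for this example.

```python
# Hypothetical local tool. Gemini never runs this; your application does.
def sum_numbers(a: float, b: float) -> float:
    return a + b


# Registry mapping tool names to local callables (an assumed convention).
AVAILABLE_TOOLS = {"sum_numbers": sum_numbers}


def handle_tool_request(requested_call: dict) -> dict:
    """Parse a tool request, run the matching function, and package the result."""
    # 1. Parse the request.
    name = requested_call["name"]
    args = requested_call["args"]

    # 2. Execute the matching Python function locally.
    if name not in AVAILABLE_TOOLS:
        raise KeyError(f"Unknown tool: {name}")
    result = AVAILABLE_TOOLS[name](**args)

    # 3. Package the output so it can be sent back to the model.
    return {"name": name, "response": {"result": result}}


# Stand-in for the structured request a model might send back.
requested_call = {"name": "sum_numbers", "args": {"a": 5, "b": 7}}
tool_result = handle_tool_request(requested_call)
print(tool_result)  # {'name': 'sum_numbers', 'response': {'result': 12}}
```

Because the registry only contains functions you put there, the model cannot trigger anything outside that set.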

The Tool Use Workflow

Integrating tools into your agent follows a specific lifecycle. To build a functional tool-calling agent, work through these six steps in order:

  • Define: Describe your tools using function schemas.
  • Provide: Pass these tools to Gemini in your API request.
  • Detect: Inspect the response structure to see if Gemini wants to call a tool.
  • Extract: Pull the tool name and arguments from the response.
  • Execute: Run your local code with the provided arguments.
  • Return: Send the results back to Gemini to complete the interaction.

Let's dive into the details of how this works in practice.

Guiding Gemini's Tool Use with System Instructions

Gemini can automatically detect and use any tools you provide in your request. However, you can guide Gemini's behavior by including instructions in the system message or prompt, which helps ensure more consistent and appropriate tool usage.

Here's an example of how you can provide tool usage guidance in Gemini's system instruction:
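The wording below is an illustrative assumption, not text taken from the course; it assumes a single tool named sum_numbers is available and stores the guidance in a hypothetical SYSTEM_INSTRUCTION constant.

```python
# Illustrative wording only; adapt it to your own tools and use case.
SYSTEM_INSTRUCTION = (
    "You are a helpful assistant with access to tools. "
    "When the user asks you to add numbers, call the sum_numbers tool "
    "instead of computing the answer yourself. "
    "If no tool is relevant to the question, answer directly."
)

print(SYSTEM_INSTRUCTION)
```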

When you send a request to Gemini, you can include this instruction as part of the system_instruction parameter (or as the first message in the conversation, depending on the API version). This guidance helps Gemini understand your preferences for when and how to use tools, but it is not required for tool functionality. Gemini can still recognize and use tools based solely on their availability in the request.

Providing Tools to Gemini

To make your functions available to Gemini, you provide a list of tool schemas using the tools field in GenerateContentConfig. Gemini uses these schemas to understand what functions it can call and how to call them.

Note: The Google Gen AI Python SDK accepts standard JSON Schema types (e.g., "object", "number"). You can pass your schemas as-is.

Here's how you can provide tools to Gemini using the google.genai library and the model models/gemini-flash-latest:
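A minimal sketch follows, assuming the google-genai package is installed and an API key is available in the environment. The sum_numbers schema, its parameter names a and b, the ask_gemini helper, and the system instruction text are all assumptions made for this example.

```python
# Tool schema as a plain dictionary using standard JSON Schema types.
# The parameter names "a" and "b" are assumptions made for this sketch.
sum_numbers_schema = {
    "name": "sum_numbers",
    "description": "Add two numbers and return their sum.",
    "parameters": {
        "type": "object",
        "properties": {
            "a": {"type": "number", "description": "First number."},
            "b": {"type": "number", "description": "Second number."},
        },
        "required": ["a", "b"],
    },
}


def ask_gemini(prompt: str):
    """Send a prompt to Gemini with the sum_numbers tool available."""
    # Imported here so the schema above can be inspected without the SDK.
    from google import genai
    from google.genai import types

    # Assumes the API key is set in the environment (e.g. GEMINI_API_KEY).
    client = genai.Client()
    config = types.GenerateContentConfig(
        system_instruction="Use the available tools for arithmetic.",
        tools=[types.Tool(function_declarations=[sum_numbers_schema])],
    )
    return client.models.generate_content(
        model="models/gemini-flash-latest",
        contents=prompt,
        config=config,
    )


# Example call (requires network access and a valid API key):
# response = ask_gemini("What is 5 plus 7?")
```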

Gemini will recognize the available tools from the tools parameter. The system instruction can help encourage consistent tool usage, but Gemini will still be able to see and use your tools even without explicit instructions.

Understanding Gemini's Tool Use Responses

When Gemini decides to use a tool, the response structure includes a special field indicating a function call, rather than just a simple text reply. Let's examine what a complete tool use response looks like by printing the entire response structure.

We'll use pprint (short for "pretty-print"), a Python standard library module that formats complex data structures like dictionaries to make them more human-readable:
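A small sketch of that idea is shown here. It assumes the SDK's response object is a pydantic model that exposes model_dump(); the helper names to_plain_dict and show_response are hypothetical, and the fallback branch lets the same code print an ordinary dictionary.

```python
from pprint import pprint


def to_plain_dict(response):
    """Convert a response into plain Python data suitable for printing."""
    # Assumption: SDK response objects are pydantic models with model_dump().
    if hasattr(response, "model_dump"):
        return response.model_dump(exclude_none=True)
    # Anything else (for example a dict) is returned unchanged.
    return response


def show_response(response):
    """Pretty-print the full response structure."""
    pprint(to_plain_dict(response))


# With a real response: show_response(response)
```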

This will output a detailed dictionary showing all the components of Gemini's response. A typical tool use response from Gemini looks like this:
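The structure below is a hand-written, trimmed illustration of that shape rather than captured output. The argument names a and b and their values are assumptions, real responses carry additional metadata fields, and the finish_reason value is left as a placeholder because it depends on the SDK version and model.

```python
# Hand-written illustration of a tool use response, NOT real API output.
illustrative_response = {
    "candidates": [
        {
            "content": {
                "role": "model",
                "parts": [
                    {
                        "function_call": {
                            "name": "sum_numbers",
                            # Assumed argument names and values.
                            "args": {"a": 5, "b": 7},
                        }
                    }
                ],
            },
            # Placeholder: the actual value varies by SDK version and model.
            "finish_reason": "<reason generation stopped>",
        }
    ],
}

first_part = illustrative_response["candidates"][0]["content"]["parts"][0]
print(first_part["function_call"]["name"])  # sum_numbers
```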

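Notice two key elements in a tool use response:

  • The parts array: This contains a function_call block with the tool name and arguments.
  • The finish_reason: This records why generation stopped. Its value can vary with the SDK version and model (the Gemini API commonly reports STOP even when the response contains a function call), so treat it as metadata rather than as the tool-call signal.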

This structure allows Gemini to provide both the intent to call a function and the structured information your system needs to execute the requested function.

Detecting Tool Use in Gemini Responses

While the finish_reason field provides a hint about why the model stopped, the most robust way to determine if Gemini wants to use a tool is to inspect the content parts directly. Depending on the SDK version or specific model behavior, finish_reason can sometimes vary, but the presence of a function_call in the parts is the definitive signal.

Here is how you can reliably detect if any tool calls were requested:
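The sketch below assumes the response exposes candidates, each with content.parts, and that a part without a tool request has function_call set to None. The has_function_call and fake_response helpers are hypothetical, and SimpleNamespace objects stand in for real SDK responses so the example runs offline.

```python
from types import SimpleNamespace


def has_function_call(response) -> bool:
    """Return True if any part of any candidate contains a function call."""
    for candidate in response.candidates or []:
        content = candidate.content
        parts = (content.parts if content else None) or []
        for part in parts:
            if getattr(part, "function_call", None) is not None:
                return True
    return False


def fake_response(*parts):
    """Build a stand-in object shaped like a response (for demonstration only)."""
    content = SimpleNamespace(parts=list(parts))
    return SimpleNamespace(candidates=[SimpleNamespace(content=content)])


text_part = SimpleNamespace(text="Hello!", function_call=None)
call_part = SimpleNamespace(
    text=None,
    function_call=SimpleNamespace(name="sum_numbers", args={"a": 5, "b": 7}),
)

print(has_function_call(fake_response(text_part)))  # False
print(has_function_call(fake_response(call_part)))  # True
```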

By scanning the parts array, your agent logic remains resilient to these variations. The finish_reason values you are most likely to see as metadata are listed below; the google.genai SDK exposes them as uppercase enum members:

  • STOP: Gemini has completed its response. The response may still contain function calls, so check the parts.
  • MAX_TOKENS: Gemini reached the token limit.
  • MALFORMED_FUNCTION_CALL: Gemini attempted a function call that was not valid.

Understanding the Parts Array and Function Call Blocks

When a tool call is detected, the parts array in the response contains one or more function_call blocks. Each block specifies a function Gemini wants to call, along with the arguments.

Let's iterate through the parts array to examine each item:
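A minimal sketch of that loop is shown here, again using SimpleNamespace stand-ins instead of a live response. The describe_parts helper, the sample text, and the argument names are assumptions; with a real response you would pass the SDK object in place of the stand-in.

```python
from types import SimpleNamespace


def describe_parts(response) -> list[str]:
    """Return one human-readable line per part in the first candidate."""
    lines = []
    parts = response.candidates[0].content.parts or []
    for index, part in enumerate(parts):
        call = getattr(part, "function_call", None)
        if call is not None:
            lines.append(
                f"Part {index}: function_call name={call.name!r} "
                f"args={dict(call.args)!r}"
            )
        elif getattr(part, "text", None):
            lines.append(f"Part {index}: text={part.text!r}")
        else:
            lines.append(f"Part {index}: other")
    return lines


# Stand-in response with one text part and one function call part.
demo_response = SimpleNamespace(
    candidates=[
        SimpleNamespace(
            content=SimpleNamespace(
                parts=[
                    SimpleNamespace(text="Let me add those.", function_call=None),
                    SimpleNamespace(
                        text=None,
                        function_call=SimpleNamespace(
                            name="sum_numbers", args={"a": 5, "b": 7}
                        ),
                    ),
                ]
            )
        )
    ]
)

for line in describe_parts(demo_response):
    print(line)
# Part 0: text='Let me add those.'
# Part 1: function_call name='sum_numbers' args={'a': 5, 'b': 7}
```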

Running this code shows the structure of each part.

Each function call block contains the essential information your system needs to execute the requested function:

  • name: The function name that matches your tool schema (e.g., "sum_numbers").
  • args: A dictionary of parameters to pass to your function, with keys matching your schema's parameter names.

If Gemini wants to call multiple tools in parallel, the parts array may contain multiple function_call blocks.
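To handle one or many requests uniformly, you can collect every function_call into a list before executing anything. The sketch below assumes the same response shape as before; extract_function_calls is a hypothetical helper, and the two stand-in calls are invented for illustration.

```python
from types import SimpleNamespace


def extract_function_calls(response) -> list[tuple[str, dict]]:
    """Collect (name, args) pairs for every function call in the first candidate."""
    calls = []
    for part in response.candidates[0].content.parts or []:
        call = getattr(part, "function_call", None)
        if call is not None:
            # Copy args into a plain dict so they can be passed as **kwargs.
            calls.append((call.name, dict(call.args or {})))
    return calls


def make_call_part(name, args):
    """Build a stand-in part holding a function call (demonstration only)."""
    return SimpleNamespace(
        text=None, function_call=SimpleNamespace(name=name, args=args)
    )


# Stand-in response containing two parallel function calls.
parallel_response = SimpleNamespace(
    candidates=[
        SimpleNamespace(
            content=SimpleNamespace(
                parts=[
                    make_call_part("sum_numbers", {"a": 1, "b": 2}),
                    make_call_part("sum_numbers", {"a": 10, "b": 20}),
                ]
            )
        )
    ]
)

for name, args in extract_function_calls(parallel_response):
    print(name, args)
# sum_numbers {'a': 1, 'b': 2}
# sum_numbers {'a': 10, 'b': 20}
```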

Summary and Next Steps

You now understand how Gemini communicates its tool use intentions through structured API responses. Remember that Gemini only requests the call; your client code must perform the actual execution. To detect these requests reliably, you should always inspect the parts of the response for function_call objects.

In the upcoming practice exercises, you'll work with these response structures hands-on, learning to parse tool use requests and prepare for the next step: actually executing the requested tools and sending results back to Gemini to complete the conversation flow.
