Introduction & Goals

Welcome to your first lesson in building effective agents with Gemini! You'll learn how to send messages using the Google Gen AI Python SDK (google.genai) and master the structure of Gemini's responses.

By the end, you'll be able to create a Python script that communicates with Gemini and inspects the full response object—a foundation for the advanced agent workflows we'll build later.

Setting Up the Client

To use Gemini, you must initialize the Gen AI client. In this environment, we use a custom base URL to route requests correctly.

Model Selection

You can choose different models based on your needs:

  • models/gemini-flash-latest: Points to the latest Gemini Flash model, optimized for speed, efficiency, and high-volume tasks.
  • models/gemini-pro-latest: Points to the latest Gemini Pro model, designed for complex reasoning and large contexts.

For this lesson, we'll use the Flash model for simplicity. See the Gemini API models documentation for the full list of available models.

How Gemini Messaging Works

Gemini uses a conversational pattern. Each interaction consists of a list of messages. Every message is a dictionary with a role ("user" or "model") and a list of parts containing the content.

Example structure:

The System Instruction

A system instruction defines the model's persona or behavioral constraints. It guides Gemini's logic throughout the entire session.

While user messages provide the task, the system instruction sets the "rules of engagement."

Sending Your First Request

To generate a response, use the generate_content method. We pass the model ID, the message list, and a configuration object containing our system instruction and parameters like temperature.

Exploring the Response Object

The generate_content call returns a structured GenerateContentResponse object. To see everything Gemini provides—including metadata and token usage—you can dump it to JSON:

Key fields include:

  • candidates: The list of generated responses (usually index 0).
  • finish_reason: Why the model stopped (e.g., "STOP" or "MAX_TOKENS").
  • usage_metadata: Details on token consumption.
Extracting the Text Content

There are two ways to get the text from a response. You can use the convenience property response.text, or use a robust approach that manually checks the candidates:

Summary & Next Steps

In this lesson, you configured the Gemini client, defined system instructions, and sent your first request. You also explored the structured response object and learned how to extract text safely.

Next, we will build on this foundation by managing multi-turn conversation states and handling more complex data structures.

Sign up
Join the 1M+ learners on CodeSignal
Be a part of our community of 1M+ users who develop and demonstrate their skills on CodeSignal