Generating Images With Text

Introduction to Text Integration in Image Generation

Welcome to the final lesson of this course on creating images with the Gemini API. In previous lessons, you explored various aspects of image generation, including crafting effective prompts and using photography modifiers. Now we turn to integrating text into your images, which lets you add context or branding elements and strengthens the visual storytelling of your creations. In this lesson, you will learn how to construct prompts that guide the model to place text within images, and how to generate those images using the gemini-3.1-flash-image model. By the end of this lesson, you will be equipped to create images with text for applications such as logos, posters, or digital art.

Constructing Effective Prompts for Text Placement

Creating effective prompts is crucial for guiding the AI to generate images with text. When constructing prompts, consider the following guidelines:

- Character Limits: Keep text short, ideally 25 characters or fewer, to ensure clarity and readability.
- Multiple Phrases: Use up to three distinct phrases to provide additional information without cluttering the image.
- Text Placement: Specify where you want the text to appear, such as at the top arc or at the bottom arc.

Let's break down an example prompt:

"A circular emblem featuring a central image of a mountain. At the top arc, the text Adventure Awaits is curved gracefully, and at the bottom arc, the text Explore the Unknown follows the curve. The design has a vintage aesthetic with serif fonts."

This prompt provides clear guidance on text placement and style, helping the AI generate an image that meets your expectations.
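The guidelines above can also be checked in code before a prompt is sent. The following is a minimal sketch rather than part of the course code: the function name buildTextPrompt, its parameters, the generic "is clearly legible" wording, and the decision to throw on violations are all illustrative assumptions. The 25-character and three-phrase limits are the guideline values from this lesson, not limits enforced by the API.

```php
<?php
// Hypothetical helper (not part of the Gemini API): assembles a prompt from
// placement => phrase pairs and enforces this lesson's guidelines.
function buildTextPrompt(string $subject, array $phrases, string $style = ""): string
{
    // Guideline: use up to three distinct phrases.
    if (count($phrases) === 0 || count($phrases) > 3) {
        throw new InvalidArgumentException("Provide between one and three phrases.");
    }
    if (count(array_unique($phrases)) !== count($phrases)) {
        throw new InvalidArgumentException("Phrases must be distinct.");
    }

    $clauses = [];
    foreach ($phrases as $placement => $text) {
        // Guideline: keep each phrase to 25 characters or fewer.
        // Falls back to a byte count when the mbstring extension is missing.
        $length = function_exists("mb_strlen") ? mb_strlen($text) : strlen($text);
        if ($length > 25) {
            throw new InvalidArgumentException("Phrase too long: '$text'");
        }
        // Guideline: state explicitly where the text should appear.
        $clauses[] = "At the $placement, the text '$text' is clearly legible.";
    }

    $prompt = $subject . " " . implode(" ", $clauses);
    return $style === "" ? $prompt : $prompt . " " . $style;
}

$prompt = buildTextPrompt(
    "A circular emblem featuring a central image of a mountain.",
    ["top arc" => "Adventure Awaits", "bottom arc" => "Explore the Unknown"],
    "The design has a vintage aesthetic with serif fonts."
);
echo $prompt . PHP_EOL;
```

Validating up front keeps overly long or crowded text from reaching the model, where it is more likely to be rendered illegibly.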

Generating Images with Text Using Gemini

Now that you understand how to construct prompts, let's generate an image with text using the gemini-3.1-flash-image model. Here's a step-by-step walkthrough of the code in PHP:

```php
<?php
require 'vendor/autoload.php';

use GuzzleHttp\Client;
use GuzzleHttp\Exception\RequestException;

// Load API key from environment
$apiKey = getenv("GEMINI_API_KEY");
if (!$apiKey) {
    die("GEMINI_API_KEY environment variable not set.\n");
}

// Define endpoint and payload
$endpoint = getenv("GEMINI_BASE_URL") . "?key=" . $apiKey;

// Define the prompt with text placement guidance
$prompt = "A circular emblem featuring a central image of a mountain. " .
    "At the top arc, the text 'Adventure Awaits' is curved gracefully, " .
    "and at the bottom arc, the text 'Explore the Unknown' follows the curve. " .
    "The design has a vintage aesthetic with serif fonts.";

$payload = [
    "contents" => [
        [
            "parts" => [
                ["text" => $prompt]
            ]
        ]
    ],
    "generationConfig" => [
        "responseModalities" => ["TEXT", "IMAGE"]
    ]
];

// Call the API
$client = new Client();

try {
    $response = $client->post($endpoint, [
        'headers' => [
            'Content-Type' => 'application/json'
        ],
        'json' => $payload
    ]);

    $statusCode = $response->getStatusCode();
    // Cast the body stream to a string before decoding it into an array
    $body = json_decode((string) $response->getBody(), true);

    // Steps to save the image ...
} catch (RequestException $e) {
    echo "API request failed.\n";
    if ($e->hasResponse()) {
        echo "Response:\n" . $e->getResponse()->getBody();
    } else {
        echo $e->getMessage();
    }
}
?>
```

In this code, we read your API key from the environment, create a Guzzle HTTP client, and define a prompt that includes text placement instructions. The post method sends the prompt to the URL held in GEMINI_BASE_URL, which is expected to point at the generateContent endpoint for the gemini-3.1-flash-image model, with the API key appended as the key query parameter. The request body uses the contents -> parts -> text structure you have seen throughout this course, and responseModalities is set to request image output alongside text. Once the response is decoded, the generated image data can be extracted from the inlineData field within the returned parts.
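The saving step is elided in the walkthrough, and its details depend on the shape of the decoded response. As a rough sketch, assuming the response follows a candidates -> content -> parts layout with base64-encoded image bytes under inlineData.data and a MIME type under inlineData.mimeType, the image parts could be collected and written to disk as shown below. The function name extractInlineImages, the image/png fallback, and the output file naming are illustrative choices, not part of the course code.

```php
<?php
// Hypothetical helper: collects decoded image parts from a decoded
// generateContent response. The candidates -> content -> parts -> inlineData
// layout is assumed here; verify it against the response you actually receive.
function extractInlineImages(array $body): array
{
    $images = [];
    foreach ($body["candidates"] ?? [] as $candidate) {
        foreach ($candidate["content"]["parts"] ?? [] as $part) {
            if (!isset($part["inlineData"]["data"])) {
                continue; // Text parts carry no inlineData.
            }
            // Strict mode makes base64_decode return false on invalid input.
            $bytes = base64_decode($part["inlineData"]["data"], true);
            if ($bytes === false) {
                continue;
            }
            $images[] = [
                // Assumed fallback when no MIME type is reported.
                "mimeType" => $part["inlineData"]["mimeType"] ?? "image/png",
                "bytes" => $bytes,
            ];
        }
    }
    return $images;
}

// Stand-in response for demonstration; a real $body comes from json_decode.
$body = [
    "candidates" => [[
        "content" => ["parts" => [
            ["text" => "Here is your emblem."],
            ["inlineData" => [
                "mimeType" => "image/png",
                "data" => base64_encode("fake-image-bytes"),
            ]],
        ]],
    ]],
];

foreach (extractInlineImages($body) as $i => $image) {
    // Derive a file extension from the MIME type, e.g. image/png -> png.
    $extension = substr($image["mimeType"], strpos($image["mimeType"], "/") + 1);
    $path = sys_get_temp_dir() . "/emblem_" . $i . "." . $extension;
    file_put_contents($path, $image["bytes"]);
    echo "Saved " . $path . PHP_EOL;
}
```

Skipping parts without inlineData matters because a response requested with both TEXT and IMAGE modalities can mix text parts and image parts in the same list.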

Result

The following image demonstrates the model's ability to follow precise layout instructions, placing the requested text along the specified top and bottom arcs of the emblem:

Summary and Next Steps

Congratulations on completing the course! In this lesson, you learned how to integrate text into images using the Gemini API with the gemini-3.1-flash-image model. You now have the skills to send prompts via generateContent, process the returned inlineData image parts, and save the resulting images for various applications, from logos to digital art. As you move on to the practice exercises, take the opportunity to experiment with different prompts and configurations. This hands-on practice will reinforce your understanding and prepare you for real-world applications. Well done on reaching the end of the course, and happy image generating!
