Introduction to the Image Manager

Welcome to the second lesson of our course on building an image generation service with FastAPI. In our previous lesson, we created the PromptManager class to handle the formatting of prompts for our image generation process. Now, we'll build upon that foundation by creating the ImageManager class, which will be responsible for handling the images once they're generated.

The ImageManager serves as a crucial component in our application architecture. While the PromptManager prepares the instructions for image generation, the ImageManager takes care of what happens after an image is created. Its primary responsibilities include:

  1. Storing generated images along with their associated prompts
  2. Converting images to a web-friendly format (base64)
  3. Providing access to the collection of stored images

When we use the Gemini API, it returns generated image data as a native SDK object. By converting these image objects (or fallback PIL images used for testing) to base64 strings, we can easily embed them in HTML or send them as JSON responses in our API. Additionally, keeping track of which prompt generated which image allows us to maintain a history of generations.

Let's dive into building this important piece of our image generation service.

Setting Up the Image Manager Class

To begin, we'll create a new file called image_manager.py in our app/models directory. This file will contain our ImageManager class. First, let's set up the basic structure and understand the imports we'll need:
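
A minimal sketch of that starting point, assuming only the three imports discussed in this section and a constructor that initializes the images list. The docstring and comments are illustrative additions:

```python
# app/models/image_manager.py (sketch)
import base64            # binary <-> ASCII text encoding
from io import BytesIO   # in-memory, file-like byte buffer

from PIL import Image    # Pillow's Image module, used for the PIL fallback


class ImageManager:
    """Stores generated images together with the prompts that produced them."""

    def __init__(self):
        # Each entry will be a dict describing one stored image.
        self.images = []
```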

Let's break down these imports:

  • base64: This module provides functions for encoding binary data to ASCII characters and decoding such encodings back to binary data. We'll use it to convert our images to a string format that can be easily transmitted over HTTP.
  • PIL (Python Imaging Library): We import the Image module from PIL (distributed today as the Pillow package), which lets us open, manipulate, and save many image file formats. Its Image.Image class gives us a fallback for handling images that do not come from the SDK, such as test images.
  • BytesIO: This is a class from the io module that implements a file-like interface for reading and writing bytes in memory rather than to a disk file. It's useful for saving our images as a buffer before encoding them.
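
To make the base64 and BytesIO descriptions concrete, here is a small standard-library sketch. The payload bytes are a made-up stand-in for real image data:

```python
import base64
from io import BytesIO

payload = b"\x89PNG fake image bytes"   # stand-in, not a real image

# Write to an in-memory buffer exactly as if it were a file on disk.
buffer = BytesIO()
buffer.write(payload)

# getvalue() returns everything written so far, regardless of cursor position.
encoded = base64.b64encode(buffer.getvalue()).decode("utf-8")

# The encoded form is plain ASCII text and decodes back to the same bytes.
assert encoded.isascii()
assert base64.b64decode(encoded) == payload
```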

Our ImageManager class starts with a simple constructor that initializes an empty list called images. This list will store dictionaries containing information about each image, including:

  • A unique identifier
  • The prompt that generated the image
  • The base64-encoded image data

Converting Images to Base64 Format

One of the key functions of our ImageManager class is to convert image objects into a format that can be easily used in web applications. Base64 encoding is perfect for this purpose as it transforms binary data into a string of ASCII characters that can be included directly in HTML or JSON.
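
As a rough illustration of what "included directly in HTML or JSON" can look like, the sketch below embeds a base64 string in a data URI and in a JSON body. The bytes, the prompt text, and the JSON key names are placeholders chosen for this example:

```python
import base64
import json

raw = b"example image bytes"   # stand-in for real PNG data
b64 = base64.b64encode(raw).decode("utf-8")

# Embedded in HTML through a data URI; the MIME type must match the real format.
img_tag = f'<img src="data:image/png;base64,{b64}" alt="generated image">'

# Sent as a JSON field; the key names here are illustrative only.
body = json.dumps({"prompt": "a red square", "image_base64": b64})

# The receiver can recover the original bytes from the JSON payload.
assert base64.b64decode(json.loads(body)["image_base64"]) == raw
```

Keep in mind that base64 represents every 3 bytes as 4 characters, so the encoded string is roughly a third larger than the binary image.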

Let's implement the image_to_base64 method:
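
One way the method might look is sketched below. The lesson text does not name the exception type that is raised, so ValueError is an assumption here, as are the error messages. FakeSdkImage is a hypothetical stand-in for an SDK image object, used only to exercise the first branch:

```python
import base64
from io import BytesIO

from PIL import Image


class ImageManager:
    def __init__(self):
        self.images = []

    def image_to_base64(self, image):
        try:
            # SDK-style objects expose the encoded image as raw bytes.
            if hasattr(image, "image_bytes"):
                return base64.b64encode(image.image_bytes).decode("utf-8")

            # Fallback: a PIL image is saved as PNG into an in-memory buffer.
            if isinstance(image, Image.Image):
                buffer = BytesIO()
                image.save(buffer, format="PNG")
                return base64.b64encode(buffer.getvalue()).decode("utf-8")

            raise TypeError(f"Unsupported image type: {type(image).__name__}")
        except Exception as exc:
            # ValueError is an assumed choice for the "more informative" error.
            raise ValueError(f"Error converting image to base64: {exc}") from exc


class FakeSdkImage:
    """Hypothetical object mimicking an SDK image with an image_bytes attribute."""
    image_bytes = b"hello"


manager = ImageManager()
sdk_result = manager.image_to_base64(FakeSdkImage())
pil_result = manager.image_to_base64(Image.new("RGB", (4, 4), color="red"))
```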

This method takes either a Google GenAI Image object or a PIL.Image.Image object. Let's walk through what's happening:

  1. We first check if the object has an image_bytes attribute (which the GenAI SDK returns). If so, we encode those raw bytes directly into a base64 string.
  2. If not, we verify that it is a standard PIL Image. We then create a BytesIO buffer, save the PIL image to this buffer in PNG format, base64-encode the buffer's contents, and decode the resulting bytes to a UTF-8 string.
  3. If any errors occur during this process, we catch them and raise a more informative exception.

Building Image Collection Methods

Now that we can convert images to base64, let's implement the methods for adding images to our collection and retrieving the stored images.

The add_image Method
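
A minimal sketch of add_image follows. To keep it short, image_to_base64 is reduced to the SDK-style image_bytes branch. The dictionary key "image_base64" and the argument order (prompt, image) are assumptions, and FakeSdkImage is a hypothetical stand-in for an SDK image object:

```python
import base64


class ImageManager:
    def __init__(self):
        self.images = []

    def image_to_base64(self, image):
        # Shortened for this sketch: only the SDK-style image_bytes branch.
        return base64.b64encode(image.image_bytes).decode("utf-8")

    def add_image(self, prompt, image):
        image_base64 = self.image_to_base64(image)
        image_entry = {
            "id": len(self.images),         # 0, 1, 2, ... in insertion order
            "prompt": prompt,
            "image_base64": image_base64,   # key name is an assumption
        }
        self.images.append(image_entry)
        return image_base64


class FakeSdkImage:
    """Hypothetical stand-in for an SDK image object."""

    def __init__(self, data):
        self.image_bytes = data


manager = ImageManager()
first = manager.add_image("a red square", FakeSdkImage(b"one"))
manager.add_image("a blue circle", FakeSdkImage(b"two"))
```

An id derived from the list length stays unique only as long as entries are never removed. If deletion were added later, a counter or a UUID would be a safer choice.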

This method performs several important tasks:

  1. It calls our image_to_base64 method to convert the image object to a base64 string.
  2. It creates a dictionary (called image_entry) containing:
    • An id based on the current length of the images list (ensuring each image gets a unique identifier)
    • The prompt that was used to generate the image
    • The base64-encoded image data
  3. It appends this dictionary to our images list.
  4. It returns the base64 string, which can be immediately used by the caller if needed.

The get_images Method

This method simply returns the entire list of image entries, allowing other parts of our application to access all stored images and their associated metadata.
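
The method described above could be as small as the sketch below. The entry is appended by hand for illustration, and its key names are assumptions:

```python
class ImageManager:
    def __init__(self):
        self.images = []

    def get_images(self):
        # Returns the list itself, not a copy.
        return self.images


manager = ImageManager()
# Entry added by hand for illustration; the key names are assumptions.
manager.images.append({"id": 0, "prompt": "a red square", "image_base64": "b25l"})

for entry in manager.get_images():
    print(entry["id"], entry["prompt"])

# Callers that want to modify the result safely can take a shallow copy.
snapshot = list(manager.get_images())
```

Because the internal list is returned directly, any caller that mutates it also changes the manager's state. Returning a copy instead is a possible design variation, not something the lesson prescribes.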

Testing the Image Manager

Now that we've implemented our ImageManager class, let's create a simple test script to verify that it works correctly. We'll create this in our app/main.py file and exercise the PIL fallback path, so that no API call is needed:
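
A sketch of such a script is shown below. In the project, ImageManager would be imported from app/models/image_manager.py. A compact stand-in is defined inline here so the example runs on its own. The key name "image_base64", the prompt text, and the image size are assumptions, and the printed result is truncated for readability:

```python
import base64
from io import BytesIO

from PIL import Image


# Stand-in for: from app.models.image_manager import ImageManager
class ImageManager:
    def __init__(self):
        self.images = []

    def image_to_base64(self, image):
        if hasattr(image, "image_bytes"):
            return base64.b64encode(image.image_bytes).decode("utf-8")
        buffer = BytesIO()
        image.save(buffer, format="PNG")
        return base64.b64encode(buffer.getvalue()).decode("utf-8")

    def add_image(self, prompt, image):
        image_base64 = self.image_to_base64(image)
        entry = {"id": len(self.images), "prompt": prompt, "image_base64": image_base64}
        self.images.append(entry)
        return image_base64

    def get_images(self):
        return self.images


manager = ImageManager()

# A solid red square stands in for a generated image, so no API call is made.
fake_image = Image.new("RGB", (100, 100), color="red")
test_prompt = "A solid red square"

result = manager.add_image(test_prompt, fake_image)
print("Returned base64 (first 60 chars):", result[:60])

for entry in manager.get_images():
    print(entry["id"], entry["prompt"], len(entry["image_base64"]), "characters")
```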

In this test script, we:

  1. Import our ImageManager class and the PIL.Image module.
  2. Create an instance of the ImageManager.
  3. Create a fake solid-color PIL image to simulate testing without needing an API call.
  4. Call the add_image method with our test prompt and fake image.
  5. Print the result and then retrieve and print all stored images.

When you run this script, the output will include a very long base64 string representing the red square image.

Summary and Practice Preview

In this lesson, we've built the ImageManager class, a crucial component of our image generation service. This class handles the storage and processing of generated images, converting native SDK objects or PIL objects to a web-friendly base64 format and maintaining a collection of all images along with their associated prompts.

The ImageManager complements the PromptManager we built in the previous lesson. Together, these components form the foundation of our image generation service.

In the upcoming practice exercises, you'll have the opportunity to work with the ImageManager class, testing its functionality and seeing how it behaves in various scenarios.

In our next lesson, we'll build upon this foundation by implementing the ImageGeneratorService, which will connect to Google's Gemini API to actually generate images based on our prompts using the gemini-3.1-flash-image model.
