Welcome to the second lesson of our course on building an image generation service with Flask. In our previous lesson, we created the `PromptManager` class to handle the formatting of prompts for our image generation process. Now, we'll build upon that foundation by creating the `ImageManager` class, which will be responsible for handling the images once they're generated.
The `ImageManager` serves as a crucial component in our application architecture. While the `PromptManager` prepares the instructions for image generation, the `ImageManager` takes care of what happens after an image is created. Its primary responsibilities include:
- Storing generated images along with their associated prompts
- Converting images to a web-friendly format (`base64`)
- Providing access to the collection of stored images
This component is essential because raw binary image data can't be placed directly into text formats such as JSON or HTML. By converting images to `base64` strings, we can easily embed them in HTML or send them as JSON responses in our API. Additionally, keeping track of which prompt generated which image allows us to maintain a history of generations and potentially reuse successful prompts.
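To make the embedding idea concrete, here is a minimal sketch using only the standard library. The placeholder bytes, the sample prompt, and the `image/jpeg` MIME type are assumptions chosen for illustration; they are not part of the lesson's code.

```python
import base64
import json

# Placeholder bytes standing in for real image data (not a valid image).
raw_bytes = b"\xff\xd8\xff\xe0 example bytes"

# Encode to base64 text so the data can travel inside JSON or HTML.
encoded = base64.b64encode(raw_bytes).decode("utf-8")

# Embed in JSON: raw bytes are not JSON-serializable, but a str is.
payload = json.dumps({"prompt": "a red bicycle", "image_base64": encoded})

# Embed in HTML via a data URI (MIME type assumed to be image/jpeg here).
img_tag = f'<img src="data:image/jpeg;base64,{encoded}" alt="generated image">'

# Decoding recovers the original bytes exactly.
assert base64.b64decode(json.loads(payload)["image_base64"]) == raw_bytes
```

The trade-off is size: base64 text is roughly a third larger than the binary data it represents.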
Let's dive into building this important piece of our image generation service.
To begin, we'll create a new file called `image_manager.py` in our `app/models` directory. This file will contain our `ImageManager` class. First, let's set up the basic structure and understand the imports we'll need:
```python
import base64
from PIL import Image
from io import BytesIO

class ImageManager:
    def __init__(self):
        self.images = []
```
Let's break down these imports:
- `base64`: This module provides functions for encoding binary data to ASCII characters and decoding such encodings back to binary data. We'll use it to convert our images to a string format that can be easily transmitted over HTTP.
- PIL (Python Imaging Library): We import the `Image` class from `PIL`, which allows us to open, manipulate, and save many different image file formats. In our case, we'll use it to process the image data.
- `BytesIO`: This is a class from the `io` module that implements a file-like interface for reading and writing bytes in memory rather than to a disk file. It's useful for handling binary data like images.
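The two standard-library pieces can be tried on their own before any image is involved. The following is a small sketch, with `b"hello world"` as arbitrary sample data:

```python
import base64
from io import BytesIO

# BytesIO behaves like a binary file that lives in memory.
buffer = BytesIO()
buffer.write(b"hello ")
buffer.write(b"world")

# getvalue() returns everything written so far, regardless of cursor position.
data = buffer.getvalue()

# A BytesIO can also wrap existing bytes and be read like a file.
first_five = BytesIO(data).read(5)

# b64encode returns bytes; decode("utf-8") turns them into a regular str.
encoded = base64.b64encode(data).decode("utf-8")
decoded = base64.b64decode(encoded)

print(first_five)       # b'hello'
print(encoded)          # aGVsbG8gd29ybGQ=
print(decoded == data)  # True
```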
Our `ImageManager` class starts with a simple constructor that initializes an empty list called `images`. This list will store dictionaries containing information about each image, including:
- A unique identifier
- The prompt that generated the image
- The `base64`-encoded image data
This simple data structure allows us to keep track of all generated images and their associated metadata in memory. In a production application, you might want to use a database instead, but for our learning purposes, this in-memory storage works well.
One of the key functions of our `ImageManager` class is to convert image data into a format that can be easily used in web applications. `Base64` encoding is perfect for this purpose as it transforms binary data into a string of ASCII characters that can be included directly in HTML or JSON.
Let's implement the `image_to_base64` method:
```python
def image_to_base64(self, image_data):
    try:
        image_bytes = image_data.image.image_bytes
        image = Image.open(BytesIO(image_bytes))
        buffered = BytesIO()
        image.save(buffered, format="JPEG")
        return base64.b64encode(buffered.getvalue()).decode("utf-8")
    except Exception as e:
        raise RuntimeError(f"Error processing image: {str(e)}")
```
This method takes `image_data` as input, which will be the response from our image generation API. Let's walk through what's happening:
- We extract the raw image bytes from the API response structure (`image_data.image.image_bytes`).
- We use `BytesIO` to create a file-like object from these bytes, which `PIL`'s `Image.open()` can read.
- We create another `BytesIO` buffer to hold the processed image.
- We save the image to this buffer in `JPEG` format.
- We encode the buffer's contents as `base64` and then decode it to a `UTF-8` string (rather than leaving it as bytes).
- If any errors occur during this process, we catch them and raise a more informative `RuntimeError`.
The error handling is particularly important here because image processing can fail for various reasons, such as corrupted image data or memory issues. By catching exceptions and providing clear error messages, we make debugging much easier.
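The error-wrapping pattern can be explored in isolation, without Pillow or a real API response. The sketch below is only an illustration: `extract_image_bytes` is a hypothetical helper, `types.SimpleNamespace` stands in for the API response, and the `from e` clause is an addition that the lesson's method does not use.

```python
from types import SimpleNamespace


def extract_image_bytes(image_data):
    """Hypothetical helper: pull raw bytes out of a response-like object."""
    try:
        return image_data.image.image_bytes
    except Exception as e:
        # "from e" keeps the original exception attached as __cause__.
        raise RuntimeError(f"Error processing image: {e}") from e


# A stand-in that mimics the attribute path the lesson's code expects.
good_response = SimpleNamespace(image=SimpleNamespace(image_bytes=b"\x00\x01"))
print(extract_image_bytes(good_response))

# Plain bytes have no ".image" attribute, so the wrapper raises RuntimeError.
try:
    extract_image_bytes(b"not a response object")
except RuntimeError as err:
    print(err)
    print(type(err.__cause__).__name__)
```

Chaining with `from e` preserves the original traceback, which can help when the wrapped message alone does not reveal where the failure started.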
Now that we can convert images to `base64`, let's implement the methods for adding images to our collection and retrieving the stored images.
```python
def add_image(self, prompt, image_data):
    image_base64 = self.image_to_base64(image_data)
    image_entry = {"id": len(self.images), "prompt": prompt, "image_base64": image_base64}
    self.images.append(image_entry)
    return image_base64
```
This method performs several important tasks:
- It calls our `image_to_base64` method to convert the image data to a `base64` string.
- It creates a dictionary (called `image_entry`) containing:
  - An `id` based on the current length of the `images` list (unique as long as entries are never removed from the list)
  - The `prompt` that was used to generate the image
  - The `base64`-encoded image data
- It appends this dictionary to our `images` list.
- It returns the `base64` string, which can be immediately used by the caller if needed.
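Length-based ids stay unique only while the list never shrinks. If entries were ever removed, one possible alternative is a running counter. This is a sketch under that assumption; `add_entry` and `next_id` are hypothetical names, and the lesson's class does not support deletion.

```python
from itertools import count

images = []
next_id = count()  # yields 0, 1, 2, ... and never repeats a value


def add_entry(prompt, image_base64):
    """Hypothetical variant of add_image that draws ids from a counter."""
    entry = {"id": next(next_id), "prompt": prompt, "image_base64": image_base64}
    images.append(entry)
    return entry


add_entry("first", "AAAA")
add_entry("second", "BBBB")
images.pop(0)  # remove the first entry
third = add_entry("third", "CCCC")

# With len(images) the new id would have been 1, colliding with "second".
print([entry["id"] for entry in images])  # [1, 2]
```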
```python
def get_images(self):
    return self.images
```
This method simply returns the entire list of image entries, allowing other parts of our application to access all stored images and their associated metadata.
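One detail worth knowing is that returning `self.images` hands callers the very same list object the manager stores, so changes made by a caller affect the manager too. The sketch below contrasts this with returning a shallow copy; `ImageStore` and `get_images_copy` are hypothetical names used only for this illustration.

```python
class ImageStore:
    """Hypothetical stand-in for ImageManager, reduced to the storage list."""

    def __init__(self):
        self.images = []

    def get_images(self):
        return self.images        # same list object, as in the lesson

    def get_images_copy(self):
        return list(self.images)  # shallow copy: new list, same entry dicts


store = ImageStore()
store.images.append({"id": 0, "prompt": "a red bicycle", "image_base64": "AAAA"})

store.get_images_copy().clear()  # only the copy is emptied
print(len(store.images))         # 1

store.get_images().clear()       # the store's own list is emptied
print(len(store.images))         # 0
```

For a learning project, returning the list directly is simple and works fine; a copy is a defensive option if other code might modify the result.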
Now that we've implemented our `ImageManager` class, let's create a simple test script to verify that it works correctly. We'll create this in our `app/main.py` file:
```python
from models.image_manager import ImageManager

image_manager = ImageManager()
fake_image_data = b"fake image data"  # Simulated image data (use real data when integrating)
prompt = "Test prompt for ImageManager"

result = image_manager.add_image(prompt, fake_image_data)
print("Image Added Successfully:")
print(result)
print("\nAll Stored Images:")
print(image_manager.get_images())
```
In this test script, we:
- Import our `ImageManager` class.
- Create an instance of the `ImageManager`.
- Define some fake image data and a test prompt. Note that this fake data won't actually work with our `image_to_base64` method as implemented: plain bytes have no `image.image_bytes` attribute, so the call raises a `RuntimeError`. It serves only as a placeholder for our test.
- Call the `add_image` method with our test data.
- Print the result and then retrieve and print all stored images.
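One way to exercise `add_image` and `get_images` without real image data is to replace the conversion step with a stub. The sketch below re-declares a trimmed copy of the class so that it runs without Pillow; `StubImageManager` is a hypothetical test double, and its direct encoding of the bytes is an assumption made for testing only, not how the real conversion works.

```python
import base64


class ImageManager:
    """Trimmed copy of the lesson's class; the Pillow conversion is omitted."""

    def __init__(self):
        self.images = []

    def image_to_base64(self, image_data):
        raise NotImplementedError("real conversion needs Pillow and an API response")

    def add_image(self, prompt, image_data):
        image_base64 = self.image_to_base64(image_data)
        image_entry = {"id": len(self.images), "prompt": prompt, "image_base64": image_base64}
        self.images.append(image_entry)
        return image_base64

    def get_images(self):
        return self.images


class StubImageManager(ImageManager):
    """Hypothetical test double: encodes the bytes directly, skipping image parsing."""

    def image_to_base64(self, image_data):
        return base64.b64encode(image_data).decode("utf-8")


manager = StubImageManager()
result = manager.add_image("Test prompt for ImageManager", b"fake image data")
print(result)
print(manager.get_images())
```

This checks the storage logic (ids, prompts, list growth) independently of image decoding, which still needs a real response to test.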
When running this script with real image data, you would see output similar to:
```
Image Added Successfully:
/9j/4AAQSkZJRgABAQAAAQABAAD/2wBDAAgGBgcGBQgHBwcJCQgKDBQNDAsLDBkSEw8UHRofHh0a...

All Stored Images:
[{'id': 0, 'prompt': 'Test prompt for ImageManager', 'image_base64': '/9j/4AAQSkZJRgABAQAAAQABAAD/2wBDAAgGBgcGBQgHBwcJCQgKDBQNDAsLDBkSEw8UHRofHh0a...'}]
```
The `base64` string would be much longer in a real application, but I've truncated it here for readability.
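The `/9j/` at the start of the sample output is not a coincidence. JPEG data begins with the start-of-image marker `FF D8`, followed by `FF`, the first byte of the next marker, and those three bytes encode to `/9j/` in base64. The sketch below shows this; `looks_like_jpeg` is a hypothetical sanity check, not part of the lesson's class.

```python
import base64

# The first three bytes of JPEG data: FF D8 (start of image), then FF.
jpeg_prefix = bytes([0xFF, 0xD8, 0xFF])
print(base64.b64encode(jpeg_prefix).decode("utf-8"))  # /9j/


def looks_like_jpeg(image_base64):
    """Hypothetical check: decode the first 4 characters and compare prefixes."""
    return base64.b64decode(image_base64[:4]) == jpeg_prefix


print(looks_like_jpeg("/9j/4AAQSkZJRgABAQAAAQABAAD"))  # True
print(looks_like_jpeg("iVBORw0KGgo"))                  # False (typical PNG prefix)
```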
In this lesson, we've built the `ImageManager` class, a crucial component of our image generation service. This class handles the storage and processing of generated images, converting them to a web-friendly format and maintaining a collection of all images along with their associated prompts.
Let's review what we've learned:
- We created a class structure with a simple in-memory storage mechanism.
- We implemented a method to convert image data to `base64` format, making it suitable for web applications.
- We built methods to add images to our collection and retrieve all stored images.
- We created a simple test script to verify the functionality of our `ImageManager`.
The `ImageManager` complements the `PromptManager` we built in the previous lesson. While the `PromptManager` prepares the instructions for image generation, the `ImageManager` handles the results of that generation process. Together, these components form the foundation of our image generation service.
In the upcoming practice exercises, you'll have the opportunity to work with the `ImageManager` class, testing its functionality with different types of image data and exploring how it integrates with the rest of our application. You'll also get to experiment with error handling and see how the class behaves in various scenarios.
In our next lesson, we'll build upon this foundation by implementing the `ImageGeneratorService`, which will connect to Google's Gemini API to actually generate images based on our prompts. This service will use both the `PromptManager` and `ImageManager` classes we've created so far.
