Welcome back! In the previous lesson, you learned how to use AI to generate new recipes from a list of ingredients. Now, let’s take the next step: extracting recipes from real-world web pages.
Many cooking websites have great recipes, but the information is often buried in messy HTML code. Our goal is to build a script that can take a raw HTML file, use AI to extract a clean recipe, and then store that recipe in our database. This process is a key part of making our AI Cooking Helper smarter and more useful.
By the end of this lesson, you’ll understand how to automate recipe extraction from HTML using Python, prompt templates, and your existing database setup.
Before we dive in, let’s briefly remind ourselves how we previously generated recipes with AI.
In the last lesson, you learned how to:
- Use prompt templates to ask the AI for a recipe based on a list of ingredients.
- Send these prompts to the AI and receive a structured recipe in response.
- Parse the AI’s response and use it in your Flask app.
This time, instead of generating a recipe from scratch, we’ll use the AI to extract a recipe from a messy HTML page. The process is similar, but the input and prompts are a bit different.
Let’s look at the big picture before we break things down.
The script you’ll be working with is called extract_and_store_recipe.py. Its job is to:
- Read a raw
HTMLfile from your computer. - Use AI to extract a clean recipe from that
HTML. - Parse the AI’s response into structured data (
name,ingredients,steps). - Store the recipe in your database, making sure not to add duplicates.
Here’s a simple diagram of the flow:
This script brings together everything you’ve learned so far: prompt templates, LLM calls, and database operations.
The first step is to get the AI to read the HTML and return a clean recipe. We do this by sending it a carefully crafted prompt.
Let’s look at how the script prepares and sends this prompt.
We use two prompt templates:
- A system prompt that tells the AI what its job is.
- A user prompt that gives the AI the actual
HTMLto process.
Here’s how the script loads and fills in these templates:
generate_responseis a function that loads the prompt templates, fills in the{{html}}variable with yourHTML, and sends the request to the AI.- The AI is told to extract a recipe and return it in a specific format.
System prompt:
User prompt:
When the script runs, it replaces {{html}} with the actual HTML content. The AI then returns a recipe in the requested format.
Once the AI returns its response, we need to turn that text into structured data and save it to the database.
The script uses a function called parse_recipe_string to break the AI’s response into parts:
- The function reads each line of the AI’s response.
- It looks for the
Name:,Ingredients:, andSteps:sections. - It collects the recipe name, a list of ingredients, and a list of steps.
You might find the code familiar, as its the same logic we used in the generate_recipe function inside routes.py
Example output:
Now, let’s see how the script saves the recipe:
- The function first checks if the recipe data is empty.
- It opens a database session and checks if a recipe with the same name already exists.
- If not, it creates a new
Recipeobject and adds each ingredient, creating new ones if needed. - Finally, it saves everything to the database.
You might also find the code familiar, as its the same logic we used in the add_manual_recipe.py script with avoidance of duplicate insert added.
Example output:
or, if the recipe already exists:
In this lesson, you learned how to extract a recipe from a messy HTML page using AI and store it in your database. You saw how the script:
- Reads an
HTMLfile, - Uses prompt templates to guide the AI,
- Parses the AI’s response into structured data,
- And saves the recipe and its ingredients to your database.
Next, you’ll get hands-on practice running and modifying this script. You’ll see how it works with real HTML files and learn how to troubleshoot and improve the extraction process. Great job making it this far — let’s keep building your AI Cooking Helper!
