Welcome back! So far, you have learned how to build and use API endpoints to retrieve, search, and rate recipes in your cooking helper app. As your app grows and more recipes are added, it’s important to make sure your data stays clean and reliable. One common problem in real-world applications is duplicate data — when the same recipe appears more than once in your database.
Duplicate recipes can confuse users, make search results messy, and even affect features like ratings and recommendations. In this lesson, you will learn how to identify and remove duplicate recipes from your database using a Python script. This is an important step in keeping your app’s data accurate and user-friendly.
By the end of this lesson, you’ll know how to use the remove_duplicate_recipes.py script to find and safely delete duplicate recipes, making your cooking helper app more professional and enjoyable for users.
Before we look at the script, let’s talk about how duplicate recipes can appear in your database. Sometimes, users might add the same recipe twice by mistake. Other times, small differences — like extra spaces or different capitalization — can make two recipes look different to a computer, even though they are the same to a person.
For example, a person would treat these two names as the same recipe, but a plain string comparison sees two different values:
"Chocolate Cake"
" chocolate cake "
In our project, we consider recipes to be duplicates if their names are the same when you ignore spaces at the beginning or end and treat uppercase and lowercase letters as the same. This is called “normalizing” the name.
Let’s break down how the `remove_duplicate_recipes.py` script works, step by step.
First, the script needs to connect to your Flask app and the database where your recipes are stored. This is done by importing the app and models:
- The `sys.path.insert` line makes sure Python can find your app’s code.
- `create_app` is used to set up the Flask app and connect to the database.
- `Recipe` and `db` are imported so the script can work with your recipes.
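As a minimal sketch of the path setup, assuming the script sits in a folder one level below the project root (the folder layout, the helper name `add_project_root`, and the commented-out module paths are all hypothetical), the idea looks roughly like this:

```python
import os
import sys


def add_project_root(script_path: str) -> str:
    """Put the directory above the script's folder at the front of sys.path."""
    script_dir = os.path.dirname(os.path.abspath(script_path))
    project_root = os.path.dirname(script_dir)
    if project_root not in sys.path:
        sys.path.insert(0, project_root)
    return project_root


# In a real script this would typically be called with __file__.
# The path below is only an illustration.
root = add_project_root(os.path.join("project", "scripts", "remove_duplicate_recipes.py"))

# With the root on sys.path, imports like these become possible.
# The module paths are assumptions; use the ones from your own project:
# from app import create_app
# from app.models import Recipe, db
```

Inserting at position 0 means the project's own packages are found before any installed package with the same name.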
To find duplicates, the script needs to compare recipe names in a way that ignores spaces and capitalization. This is done with a helper function:
- `strip()` removes spaces at the beginning and end.
- `lower()` makes all letters lowercase.
For example, `" Chocolate Cake "` becomes `"chocolate cake"`.
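A helper along these lines would do the normalization described above. The function name `normalize_name` is an assumption for illustration; the script may call it something else.

```python
def normalize_name(name: str) -> str:
    """Trim surrounding whitespace and lowercase the name for comparison."""
    return name.strip().lower()


print(normalize_name(" Chocolate Cake "))  # chocolate cake

# Two names count as duplicates when their normalized forms are equal.
print(normalize_name("Chocolate Cake") == normalize_name(" chocolate cake "))  # True
```

Note that only the comparison uses the normalized form. The name stored in the database stays as the user typed it.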
The script then loads all recipes and groups them by their normalized name:
- `all_recipes = Recipe.query.all()` gets every recipe from the database.
- `name_map` is a dictionary that groups recipes by their normalized name.
- If a group has more than one recipe, it’s a duplicate.
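The grouping step can be sketched without a database. Here a small dataclass stands in for the app's `Recipe` model (assumed to have `id` and `name` fields), a plain list stands in for the result of `Recipe.query.all()`, and `find_duplicates` is a hypothetical helper name.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Recipe:
    """Stand-in for the app's Recipe model (assumption: it has id and name)."""
    id: int
    name: str


def find_duplicates(recipes):
    """Return {normalized_name: [recipes]} for names that occur more than once."""
    name_map = defaultdict(list)
    for recipe in recipes:
        name_map[recipe.name.strip().lower()].append(recipe)
    return {name: group for name, group in name_map.items() if len(group) > 1}


# In the real script this list would come from Recipe.query.all().
all_recipes = [
    Recipe(1, "Chocolate Cake"),
    Recipe(2, " chocolate cake "),
    Recipe(3, "Pancakes"),
]
duplicates = find_duplicates(all_recipes)
print(sorted(duplicates))                            # ['chocolate cake']
print([r.id for r in duplicates["chocolate cake"]])  # [1, 2]
```

Because every recipe is visited once and dictionary lookups are fast, this approach stays quick even with many recipes.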
The script then prints out the duplicates it finds:
Example Output:
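The exact wording depends on the script. As a rough sketch, a report along these lines prints one line per duplicate group; the function name and the message format are assumptions, not the script's actual output.

```python
def format_duplicate_report(duplicates):
    """Build one line per duplicate group: the shared name and the recipe ids."""
    lines = []
    for name, recipes in sorted(duplicates.items()):
        ids = ", ".join(str(recipe_id) for recipe_id, _ in recipes)
        lines.append(f"'{name}': {len(recipes)} copies (ids: {ids})")
    return lines


# Each recipe is a plain (id, name) tuple here instead of a database row.
duplicates = {"chocolate cake": [(1, "Chocolate Cake"), (2, " chocolate cake ")]}
for line in format_duplicate_report(duplicates):
    print(line)
# 'chocolate cake': 2 copies (ids: 1, 2)
```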
The script asks you if you want to delete the duplicates, keeping only one copy of each:
- If you type `y`, the script deletes all but one recipe in each duplicate group.
- If you type anything else, it does nothing.
This makes sure you don’t accidentally delete recipes without checking first.
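The confirm-then-delete step might look like the sketch below. Several details are assumptions: the prompt text, keeping the first recipe in each group, accepting an uppercase `Y`, and the helper name `remove_duplicates`. The `delete` and `ask` parameters exist so the example runs without a database or a keyboard. With Flask-SQLAlchemy, the delete step would typically be `db.session.delete(recipe)` followed by one `db.session.commit()`.

```python
def remove_duplicates(duplicates, delete, ask=input):
    """Ask for confirmation, then delete all but the first recipe per group.

    Returns the number of recipes deleted (0 if the user declines).
    """
    answer = ask("Delete duplicates, keeping one copy of each? (y/n): ")
    if answer.strip().lower() != "y":
        return 0
    deleted = 0
    for group in duplicates.values():
        for recipe in group[1:]:  # group[0] is the copy that is kept
            delete(recipe)
            deleted += 1
    return deleted


duplicates = {"chocolate cake": ["recipe 1", "recipe 2", "recipe 3"]}
removed = []

# Simulate typing "y": two of the three copies are removed.
print(remove_duplicates(duplicates, removed.append, ask=lambda prompt: "y"))  # 2
print(removed)  # ['recipe 2', 'recipe 3']

# Simulate pressing Enter: nothing is deleted.
print(remove_duplicates(duplicates, removed.append, ask=lambda prompt: ""))  # 0
```

Treating anything other than `y` as "no" is the safer default, because a stray keypress can never delete data.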
When a duplicate recipe is deleted, you don’t have to worry about its reviews being left behind in the database. This is because the `Review` model uses `ondelete='CASCADE'` on its `recipe_id` foreign key. This means that when a recipe is deleted, all reviews linked to that recipe are automatically deleted by the database. This helps keep your data consistent and prevents orphaned reviews.
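To see the cascade at the database level, here is a small standalone sketch using Python's built-in `sqlite3` module instead of the app's models. The table and column layout is an assumption modeled on the description above, not the project's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys (and cascades) when this is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE recipe (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    """
    CREATE TABLE review (
        id INTEGER PRIMARY KEY,
        recipe_id INTEGER NOT NULL REFERENCES recipe(id) ON DELETE CASCADE,
        rating INTEGER
    )
    """
)
conn.execute(
    "INSERT INTO recipe (id, name) VALUES (1, 'Chocolate Cake'), (2, ' chocolate cake ')"
)
conn.execute("INSERT INTO review (recipe_id, rating) VALUES (1, 5), (2, 4), (2, 3)")

# Delete the duplicate recipe; its two reviews are removed by the database.
conn.execute("DELETE FROM recipe WHERE id = 2")
remaining = conn.execute("SELECT recipe_id, rating FROM review").fetchall()
print(remaining)  # [(1, 5)]
```

If your app runs on SQLite, keep in mind that the cascade only fires when foreign key enforcement is enabled for the connection, as the `PRAGMA` line does here.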
Let’s see what happens when you run the script.
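If there are no duplicates, the script reports that none were found and changes nothing.
If duplicates are found, it lists each group and asks whether to delete the extra copies.
If you type `y` and press Enter, the script removes the extra copies and confirms the deletion.
If you press Enter or type anything else, it leaves the database unchanged and tells you that nothing was deleted.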
This gives you a chance to review what will be deleted before making any changes.
In this lesson, you learned why duplicate recipes can be a problem and how to use the `remove_duplicate_recipes.py` script to keep your recipe database clean. You saw how the script connects to your app, normalizes recipe names, finds duplicates, and safely removes them with your confirmation.
Good data hygiene is important for any real-world app. By keeping your recipes unique, you make your cooking helper more reliable and enjoyable for users.
Congratulations on reaching the end of this course! You now have the skills to build, search, and maintain a high-quality recipe API. Be sure to try the practice exercises next to reinforce what you’ve learned. Great job!
