Welcome to the first lesson of the "Creating a Researcher in Python with OpenAI" course! In this course, you will learn how to build DeepResearcher, an AI-powered research tool that can search the web, gather information, and generate a final report, all using Python.
Before we dive into the details of web searching and AI, it’s important to set up a solid foundation. A clear project structure will help you keep your code organized, make it easier to add new features, and help you debug problems as you go. In this lesson, we’ll walk through the basic structure of the DeepResearcher project and explain how the main program is set up.
By the end of this lesson, you’ll understand how the main parts of the project fit together and be ready to start building out each piece in future lessons.
Let’s start with a quick reminder about how DeepResearcher works:
This flowchart illustrates the step-by-step process, showing how each component fits into the overall workflow. The LLM (Large Language Model) and Web components work together to automate the research process.
Now, let’s look at the main file of our project: main.py. This file is the entry point for DeepResearcher. It’s where the program starts running.
Let’s break down the key parts of this file step by step.
At the top of the file, we import a function from another part of our project:
This allows us to use the clear_visited_pages function in our main program.
Next, we see several function definitions. Right now, these functions are just “stubs” — they don’t do anything yet, but they show what the main steps of our program will be.
- generate_initial_search_queries: This will take the user’s research topic and create a list of search queries.
- perform_iterative_research: This will handle the main research loop, searching the web and collecting information.
- generate_final_report: This will take all the information we’ve gathered and create a final report.
The pass statement is a placeholder. It means “do nothing for now.” We’ll fill in these functions in later lessons.
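The stubs described above might look roughly like this. The function names come from the lesson, but the parameter lists are assumptions chosen for illustration and may differ from the actual project.

```python
def generate_initial_search_queries(topic):
    """Will turn the research topic into a list of search queries."""
    pass  # placeholder: no logic yet


def perform_iterative_research(topic, queries, all_queries, iterations):
    """Will run the main research loop: search the web and collect information."""
    pass  # placeholder: no logic yet


def generate_final_report(topic, research_info):
    """Will turn the gathered information into a final report."""
    pass  # placeholder: no logic yet
```

Calling a stub is valid Python. Because the body contains only pass, the function simply returns None.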
The main logic of the program is inside the research_main() function:
Let’s break this down:
- The program asks the user for a research topic and how many times to repeat the research process.
- It clears any previously visited web pages.
- It generates the first set of search queries.
- If there are no search queries, it stops.
- It copies the search queries to keep track of all queries used.
- It performs the main research loop.
- Finally, it generates a report.
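The steps above can be sketched as follows. This is a minimal illustration, not the course's actual code: the helper functions are simple stand-ins that return dummy values so the flow can run end to end, and the ask parameter, the prompt texts, and the helper signatures are all assumptions.

```python
def clear_visited_pages():
    """Stand-in for the imported helper; the real one lives elsewhere in the project."""


def generate_initial_search_queries(topic):
    """Stand-in that returns a single query so the flow can be followed."""
    return [f"{topic} overview"]


def perform_iterative_research(topic, queries, all_queries, iterations):
    """Stand-in research loop: records one note per iteration."""
    return [f"notes from iteration {i + 1}" for i in range(iterations)]


def generate_final_report(topic, research_info):
    """Stand-in report: joins the collected notes into one string."""
    return f"Report on {topic}: " + "; ".join(research_info)


def research_main(ask=input):
    # Ask for the research topic and how many times to repeat the research process.
    # `ask` defaults to input(); it is a parameter only so the flow can be tested.
    topic = ask("Research topic: ").strip()
    iterations = int(ask("Number of research iterations: "))

    # Clear any previously visited web pages.
    clear_visited_pages()

    # Generate the first set of search queries and stop if there are none.
    queries = generate_initial_search_queries(topic)
    if not queries:
        print("No search queries generated. Exiting.")
        return None

    # Copy the list to keep track of every query used.
    all_queries = list(queries)

    # Run the main research loop, then build the report.
    research_info = perform_iterative_research(topic, queries, all_queries, iterations)
    report = generate_final_report(topic, research_info)
    print(report)
    return report
```

Copying the query list with list(queries) gives a separate list, so later changes to the current queries do not silently alter the record of all queries used.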
At the bottom, we see:
This means: “If this file is run directly, start the program by calling research_main().”
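The check described here is the standard Python entry-point idiom. A minimal sketch, using a stand-in research_main that only prints a message (the message and return value are assumptions for illustration):

```python
def research_main():
    """Stand-in for the real entry function described in the lesson."""
    print("DeepResearcher starting")
    return "started"


# True only when this file is executed directly (for example: python main.py),
# not when it is imported as a module by another file.
if __name__ == "__main__":
    research_main()
```

With this guard in place, other files can import functions from main.py without triggering the interactive program.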
In this lesson, you learned how to set up the basic structure for the DeepResearcher project. You saw how the main program is organized, what each function is responsible for, and how the program flows from user input to generating a final report.
This structure will make it much easier to build and test each part of the project as we move forward. In the next practice exercises, you’ll get hands-on experience working with this structure and preparing your own project files. After that, we’ll start filling in each function to bring DeepResearcher to life, step by step.
