Throughout this course, you have mastered the fundamentals of tool integration with GPT-5: creating tool schemas, understanding GPT-5's tool use responses, and executing single tool requests. However, the approach you've learned so far has a significant limitation — it only handles one tool call per conversation turn. While this works perfectly for simple tasks, many real-world problems require multiple sequential steps, and often the number and nature of these steps cannot be determined in advance.
In this lesson, we'll work together to transform GPT-5 from a single-turn tool user into an autonomous agent capable of iterative problem-solving. We'll build an agent class that can call tools, analyze results, decide what to do next, and continue this process until complex multi-step tasks are completed. This represents a fundamental shift from reactive tool usage to proactive, intelligent problem-solving that mirrors how humans approach complex challenges.
Before we start coding, let's understand how autonomous agents operate through action-feedback loops, in which each tool execution provides information that influences the next decision. This iterative process mirrors human problem-solving: we take an action, observe the result, decide what to do next, and repeat until we reach our goal. The action-feedback loop consists of four key phases that repeat until task completion:
- Decision Phase: GPT-5 analyzes the current situation and determines the next action, which may include calling one or more tools.
- Action Phase: Our agent executes the requested tool(s) based on GPT-5's instructions.
- Feedback Phase: The results from the tool execution(s) are captured and added to the conversation history.
- Evaluation Phase: GPT-5 reviews the new information, decides whether the task is complete or if additional steps are needed, and the loop continues.
This loop structure enables complex problem-solving because each iteration builds upon previous results. For example, when solving a quadratic equation, GPT-5 might first calculate the discriminant, then use that result to determine if real solutions exist, then calculate the square root of the discriminant, and finally compute the two solutions. The key insight is that GPT-5 doesn't need to plan all steps in advance — it can adapt its approach based on intermediate results, just like a human mathematician working through a problem.
Now let's start building our agent class to make this iterative process possible.
Let's begin by creating the foundation of our autonomous agent. We need to establish the core structure that will manage extended conversations, tool execution, and decision-making loops. We'll start with the class definition and constructor:
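A minimal sketch of what this foundation might look like is shown below. The exact names here (the injected client, max_turns, and the wording of the base prompt) are my own choices for illustration, not fixed by the lesson:

```python
# Base instructions telling GPT-5 it may chain multiple tool calls and that
# only the final answer is shown to the user (wording is illustrative).
BASE_SYSTEM_PROMPT = (
    "You may call tools as many times as needed across multiple turns. "
    "The user never sees intermediate steps, only your final answer."
)

class Agent:
    def __init__(self, name, tools, tool_schemas, client,
                 system_prompt="", model="gpt-5", max_turns=10):
        self.name = name                    # identifier for logging/debugging
        self.tools = tools                  # dict mapping tool name -> callable
        self.tool_schemas = tool_schemas    # schemas advertised to the API
        self.client = client                # injected API client, e.g. OpenAI()
        self.model = model
        self.max_turns = max_turns          # safety cap on loop iterations
        # Combine the base prompt with optional domain-specific guidance
        self.instructions = BASE_SYSTEM_PROMPT
        if system_prompt:
            self.instructions += "\n\n" + system_prompt
```

Injecting the client through the constructor keeps the class easy to test and lets the same code work with any compatible client object.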
Our agent's foundation relies on key design decisions that enable autonomous behavior while maintaining flexibility for different use cases:
- BASE_SYSTEM_PROMPT: Explicitly tells GPT-5 that it can make multiple tool calls and that users won't see the intermediate steps — only the final result. We combine this with custom instructions to allow domain-specific guidance while maintaining the autonomous behavior.
- Constructor parameters: Provide flexibility for different scenarios while ensuring safe defaults:
  - name: A clear identifier for the agent, useful for debugging, logging, and working with multiple agents in complex systems.
  - system_prompt: Allows customization for specific domains like math or data analysis.
Now let's add the method that handles individual tool executions within our agent loop. This method needs to be robust because tool failures shouldn't break our entire autonomous process:
This method handles the individual tool executions that will happen within our larger iterative loop. Here's how it manages the execution flow:
- Extract tool information: Gets the tool name, call ID, and parses the arguments from JSON format into a Python dictionary that we can use to call the function.
- Debug tracking: Prints which tool is being called with what parameters — invaluable for debugging and understanding how our agent thinks.
- Execute with comprehensive error handling: Attempts to run the tool, catching both missing tools (KeyError, the equivalent of checking whether the tool exists in our dictionary) and execution failures (Exception).
- Return structured results: Converts both successful results and errors into properly formatted function call output objects with the correct call_id reference.
The error handling ensures that tool failures don't break the entire agent loop. Instead, errors are converted into tool results that GPT-5 can understand and potentially work around. This robustness allows our agent to continue operating even when individual tools encounter problems, making the whole system much more resilient.
Now we're ready to implement the heart of our autonomous agent: the run method. This method will manage the iterative loop that enables multi-step problem-solving. Let's start by understanding how our agent handles conversation state:
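A tiny self-contained demonstration of that first step (the run_stub name is mine, standing in for the opening of the real run method):

```python
def run_stub(input_messages):
    """Illustrates only the first step of run(): copying the input."""
    # A shallow copy is enough because we only ever append new messages
    messages = input_messages.copy()
    messages.append({"role": "assistant", "content": "(intermediate step)"})
    return messages

conversation = [{"role": "user", "content": "Solve 3x^2 - 7x + 2 = 0"}]
extended = run_stub(conversation)
# conversation still holds 1 message; extended holds 2
```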
The input_messages.copy() call is important because it ensures our agent remains stateless. Just like normal LLM API calls, where you pass the complete conversation history each time, our agent doesn't store any conversation state between calls. Each time you call agent.run(), you provide the full context through input_messages, and the agent processes only that specific conversation without any memory of previous interactions.
By copying the input messages instead of modifying them directly, we preserve the original conversation and allow the same agent instance to handle multiple independent conversations. This design also gives you complete control over context management — you can decide exactly what conversation history to include, filter out irrelevant messages, or combine conversations as needed before passing them to the agent.
Now let's add the basic loop structure that will enable our agent's iterative problem-solving:
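A sketch of that loop and its API call follows; the reasoning effort value is an example I chose, and the break is a placeholder for the tool handling and completion logic developed in the next steps:

```python
def run(self, input_messages):
    """Iterate action-feedback cycles until a final answer or the turn cap."""
    messages = input_messages.copy()
    response = None
    for turn in range(self.max_turns):        # cap prevents infinite loops
        response = self.client.responses.create(
            model=self.model,
            instructions=self.instructions,   # combined base + custom guidance
            input=messages,                   # full conversation for context
            tools=self.tool_schemas,          # tools GPT-5 may request
            reasoning={"effort": "medium"},   # example effort level
            store=False,                      # don't persist conversation data
        )
        # Placeholder: tool handling and the completion check go here
        break
    return messages, response
```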
We're starting with a controlled loop that will continue until GPT-5 provides a final answer or we reach our maximum turn limit. Each iteration represents one complete action-feedback cycle in which GPT-5 makes a decision (potentially including tool calls), and we capture that decision in our conversation history. The turn counter prevents infinite loops while allowing sufficient iterations for complex problems.
The API call uses the GPT-5 Responses API with several key parameters:
- instructions: provides the combined base and custom instructions that guide the agent's behavior.
- input: contains the full conversation history for context.
- tools: specifies which tools are available for GPT-5 to use.
- reasoning: controls the computational effort applied to problem-solving.
- store=False: ensures we don't persist conversation data.
Now let's add the logic for handling tool calls within our loop. This is where the magic of autonomous behavior happens:
When GPT-5 decides to use tools, we handle the execution through a systematic two-step process:
- Detect function calls: We filter the response output array to find all items with type == "function_call", which tells us GPT-5 wants to execute tools.
- Add function calls to conversation: Before executing anything, we add each function call to the messages array. This maintains a complete record of what GPT-5 requested.
- Execute all requested tools: GPT-5 might call multiple tools in a single turn, and we need to execute each one to gather all the information it needs for its next decision.
Finally, let's complete our loop with the logic for handling final responses and error conditions:
When GPT-5 reaches a final answer, we handle the completion through a structured return process:
- Detect completion: When GPT-5 doesn't request any tools (no function calls in the output), it signals that it has reached a final answer and no further iterations are needed.
- Extract clean response: GPT-5 provides the final text directly through response.output_text, which contains the readable answer without any tool-related content.
- Add to conversation history: Before returning, we add the final assistant response to the messages array to maintain a complete conversation record.
- Return complete state: We return both the full conversation history (messages) and the final response text (response.output_text) to maintain our stateless design — the caller receives everything needed to understand what happened and can use the conversation history for follow-up questions or multi-turn interactions.
Here's how our complete run method looks when put together:
Now let's put our agent to work! We'll create a math-focused autonomous agent and see how it handles a complex quadratic equation. We'll provide more math tools following the same pattern used across the course, so you can easily extend your agent's capabilities as needed:
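The tool set might look like the following. The schema-builder helper and its layout are illustrative of the Responses API function-tool format; the final wiring into the agent class built in this lesson is shown only in comments, since running it needs a live client:

```python
import math

# Math tools following the one-function-per-tool pattern from earlier lessons
def add(a: float, b: float) -> float:
    return a + b

def subtract(a: float, b: float) -> float:
    return a - b

def multiply(a: float, b: float) -> float:
    return a * b

def square_root(x: float) -> float:
    return math.sqrt(x)

MATH_TOOLS = {"add": add, "subtract": subtract,
              "multiply": multiply, "square_root": square_root}

def number_tool_schema(name, description, params):
    """Build a function-tool schema whose parameters are all numbers."""
    return {
        "type": "function",
        "name": name,
        "description": description,
        "parameters": {
            "type": "object",
            "properties": {p: {"type": "number"} for p in params},
            "required": params,
        },
    }

MATH_SCHEMAS = [
    number_tool_schema("add", "Add two numbers", ["a", "b"]),
    number_tool_schema("subtract", "Subtract b from a", ["a", "b"]),
    number_tool_schema("multiply", "Multiply two numbers", ["a", "b"]),
    number_tool_schema("square_root", "Square root of x", ["x"]),
]

# Wiring this into the agent class built in this lesson would then look like:
# agent = Agent("MathAgent", tools=MATH_TOOLS, tool_schemas=MATH_SCHEMAS,
#               client=OpenAI(), system_prompt="You are a careful math tutor.")
# history, answer = agent.run(
#     [{"role": "user", "content": "Solve 3x^2 - 7x + 2 = 0"}])
```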
When we run this code, our agent demonstrates sophisticated autonomous reasoning:
Our agent systematically applied the quadratic formula by calculating b² ((-7)²), computing 4ac (first 2×3, then 4×6), finding the discriminant (49-24), and taking the square root (√25). Each tool call built upon previous results, demonstrating true autonomous reasoning. The agent made 5 tool calls across multiple conversation turns, yet the user only sees the final, complete answer.
Together, we've successfully built an autonomous agent capable of complex, multi-step problem-solving. Our agent class encapsulates conversation management, tool execution, and iterative decision-making in a reusable structure that can tackle problems requiring dozens of sequential operations.
The architecture we created enables GPT-5 to operate as a true autonomous agent: it can assess situations, make decisions, execute tools, learn from results, and continue iterating until complex tasks are completed. This represents a fundamental advancement from simple tool usage to intelligent, adaptive problem-solving.
In the upcoming practice exercises, you'll implement your own autonomous agents, experiment with different instructions and tool combinations, and tackle increasingly complex multi-step problems. You'll gain hands-on experience with the debugging and optimization techniques needed for production agent systems, building upon the solid foundation we've created together.
