Executing Tool Calls and Managing Context

Introduction: The Tool Execution Loop

You've learned how to define tool schemas and parse function calls from model responses. Now you're ready to close the loop by actually executing those function calls and feeding their results back to the model. This lesson introduces Factor 3 (Own your context window) by showing you how to explicitly control what information flows through your agent's conversation history. By the end of this lesson, you'll build agents that can perform multi-step reasoning, where the model calls a tool, your code executes it, and the model sees the result before deciding what to do next.

Defining Executable Python Functions

To execute function calls, you first need actual Python functions that perform real work. These functions should be simple and focused, with clear inputs and outputs. For this lesson, you'll define two basic mathematical operations that your agent can use.

```python
def add(a: float, b: float) -> float:
    """Add two numbers together."""
    return a + b


def multiply(a: float, b: float) -> float:
    """Multiply two numbers together."""
    return a * b
```

Each function has type hints that document what it expects (float) and returns. The docstrings describe what the function does in plain language. These functions represent the actual capabilities your agent has: when the model decides to call add, your code will execute this Python function and return the result. Now you need to create tool schemas that tell the model that these functions exist and how to use them.

Creating Tool Schemas for Your Functions

You'll create tool_schemas that match your Python functions, plus a final_answer tool that signals when the agent has finished its work. The schemas tell the model what tools are available and exactly how to use them.

```python
tool_schemas = [
    {
        "type": "function",
        "name": "final_answer",
        "description": "Provide the final answer and stop.",
        "strict": True,
        "parameters": {
            "type": "object",
            "properties": {
                "answer": {"type": "string", "description": "The final answer for the user."}
            },
            "required": ["answer"],
            "additionalProperties": False
        }
    },
    {
        "type": "function",
        "name": "add",
        "description": "Add two numbers together",
        "strict": True,
        "parameters": {
            "type": "object",
            "properties": {
                "a": {"type": "number", "description": "The first number"},
                "b": {"type": "number", "description": "The second number"}
            },
            "required": ["a", "b"],
            "additionalProperties": False
        }
    },
    {
        "type": "function",
        "name": "multiply",
        "description": "Multiply two numbers together",
        "strict": True,
        "parameters": {
            "type": "object",
            "properties": {
                "a": {"type": "number", "description": "The first number"},
                "b": {"type": "number", "description": "The second number"}
            },
            "required": ["a", "b"],
            "additionalProperties": False
        }
    }
]
```

Notice how the schema for add matches the signature of your Python function: the parameter names (a and b) are identical, and the types correspond to Python's type hints. This consistency is critical because when the model calls add with arguments, you'll parse those arguments and pass them directly to your Python function using keyword argument unpacking. With your functions and schemas defined, you're ready to build the conversation context.

Initializing the Context with the User's Message

The context is a list that holds the entire conversation history between the user and the agent. You explicitly control what goes into this list, which implements Factor 3 by giving you full ownership of the context window. You start by creating a context with just the user's initial message.

```python
context = [
    {
        "role": "user",
        "content": "Compute 15 + 27"
    }
]
```

This simple list represents everything the model will see when you make your first API call. The user has asked the agent to perform a calculation, and the model will need to decide which tool to call based on this request and the available tool_schemas. Before making the call, you'll also need to define a system_prompt that explains the agent's role.

Setting Up the System Prompt

You need a system_prompt that explains the agent's role and instructs it to use tools when appropriate. Since you're providing explicit tool_schemas, the system_prompt can remain simple and focused on behavior rather than output format.

```python
system_prompt = """
You are a helpful assistant that can perform calculations.
When asked to do math, you must use the provided tools.
When your work is done, call the final_answer tool.
"""
```

The system_prompt tells the agent when to use tools (for math) and when to signal completion (via final_answer). This gives the model clear behavioral guidance while the tool_schemas handle the structural requirements. With both the context and system_prompt ready, you can make your first API call.

Making the First API Call

With the context initialized and the system_prompt defined, you're ready to make the first API call. The model will see the user's request, understand the available tools from the schemas, and decide which tool to call first.

```python
response = openai.responses.create(
    model="gpt-5",
    instructions=system_prompt,
    input=context,
    tools=tool_schemas,
    tool_choice="required",
    reasoning={"effort": "low"}
)
```

The tool_choice="required" parameter forces the model to call a tool rather than respond conversationally. This is important because you're building a structured agent that operates through tool calls, not free-form text. The model will analyze the user's request, recognize that it needs to add two numbers, and call the add tool with the appropriate arguments. Now you need to process the response and identify the function call.

Understanding the Context Message Structure

Before processing the response and building your context, you need to understand the exact structure the API expects for function-related messages. When you add function calls and their results to context, you must follow a specific format with required fields.

For function call messages added to context, the API requires:

- "type": "function_call", which identifies this as a function call message
- "name": <string>, the name of the function being called
- "arguments": <string>, a JSON string containing the function arguments
- "call_id": <string>, a unique identifier linking this call to its result

For function result messages added to context, the API requires:

- "type": "function_call_output", which identifies this as a function result message
- "call_id": <string>, which must match the call_id from the original function call
- "output": <string>, a JSON string containing the function's result

The critical rule is that the call_id must match between a function call and its corresponding result. This linkage allows the API to understand which result belongs to which call, which becomes important in complex scenarios where multiple function calls might be in progress. With this structure in mind, you're ready to process the response and build your context correctly.
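
The two message shapes and the call_id rule can be captured in a few small functions. This is a sketch under stated assumptions: make_function_call, make_function_output, and unmatched_call_ids are hypothetical names introduced here for illustration, and the call_id value is a made-up placeholder rather than a real API-generated identifier.

```python
import json


def make_function_call(name: str, arguments: str, call_id: str) -> dict:
    """Hypothetical helper: build a function call item for the context."""
    return {
        "type": "function_call",
        "name": name,
        "arguments": arguments,  # already a JSON string
        "call_id": call_id,
    }


def make_function_output(call_id: str, result) -> dict:
    """Hypothetical helper: build the matching function result item."""
    return {
        "type": "function_call_output",
        "call_id": call_id,
        "output": json.dumps({"result": result}),
    }


def unmatched_call_ids(context: list) -> set:
    """Return call_ids of function calls that have no recorded output yet."""
    calls = {i["call_id"] for i in context if i.get("type") == "function_call"}
    outputs = {i["call_id"] for i in context if i.get("type") == "function_call_output"}
    return calls - outputs


context = [{"role": "user", "content": "Compute 15 + 27"}]
context.append(make_function_call("add", '{"a": 15, "b": 27}', "call_example_1"))
print(unmatched_call_ids(context))  # the call has no output yet

context.append(make_function_output("call_example_1", 42))
print(unmatched_call_ids(context))  # now every call has a result
```

Running a check like unmatched_call_ids before each API call is one way to catch a missing result early, before the request is sent.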

Identifying and Recording the Function Call

After the API returns, you need to process the response by iterating through the response.output items and identifying function calls. As a reminder from the previous lesson, the response.output list can contain different types of items, and you're specifically looking for items where type == "function_call". When you find one, the first step is to add it to your context.

```python
for item in response.output:
    if item.type == "function_call":
        # Step 1: Add the function call to context
        context.append({
            "type": "function_call",
            "name": item.name,
            "arguments": item.arguments,
            "call_id": item.call_id
        })
```

Recording the function_call in context is crucial because the model needs to see its own decisions in the conversation history. The function call includes the tool name, the arguments as a JSON string, and a unique call_id that links the call to its eventual result. This call_id is generated by the API and serves as a tracking mechanism that you'll use when adding the function's result back to the context. With the call recorded, you're ready to execute the actual Python function.

Parsing Arguments and Dispatching to Functions

With the function_call recorded in context, you're ready to execute the actual Python function. First, you need to parse the arguments from the JSON string into a Python dictionary using json.loads(), then use a match statement to dispatch to the correct function. This code continues inside the same if block as Step 1, needs import json at the top of your script, and requires Python 3.10 or later for the match statement.

```python
        # Step 2: Execute the function call using a match statement
        args = json.loads(item.arguments)
        match item.name:
            case "add":
                result = add(**args)
            case "multiply":
                result = multiply(**args)
            case _:
                result = f"Error: Tool {item.name} not implemented"

        print(f"Executed {item.name}({args}) = {result}")
```

The **args syntax unpacks the dictionary as keyword arguments, which works because your tool_schemas use the same parameter names as your Python functions. When the model calls add with {"a": 15, "b": 27}, this becomes add(a=15, b=27) in Python. The default case handles any unexpected tool names by returning an error message, which is good defensive programming. After executing the function, you print a confirmation message to help you understand what's happening during development.

```text
Executed add({'a': 15, 'b': 27}) = 42
```

The execution succeeded and produced the expected result. Now you need to add this result back to the context so the model can see what happened.
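
The same dispatch can also be written with a dictionary lookup, which works on Python versions without the match statement. This is an alternative sketch rather than the lesson's approach: TOOLS and execute_tool are hypothetical names, and the extra handling of malformed arguments is an added assumption about what defensive behavior you might want.

```python
import json


def add(a: float, b: float) -> float:
    """Add two numbers together."""
    return a + b


def multiply(a: float, b: float) -> float:
    """Multiply two numbers together."""
    return a * b


# Hypothetical registry mapping tool names to Python functions.
TOOLS = {"add": add, "multiply": multiply}


def execute_tool(name: str, arguments: str):
    """Look up the tool by name, parse its JSON arguments, and call it.

    Returns the function's result, or an error string if the tool is
    unknown or the arguments cannot be parsed or applied.
    """
    func = TOOLS.get(name)
    if func is None:
        return f"Error: Tool {name} not implemented"
    try:
        args = json.loads(arguments)
        return func(**args)  # keyword unpacking, as in the match version
    except (json.JSONDecodeError, TypeError) as exc:
        return f"Error: could not call {name}: {exc}"


print(execute_tool("add", '{"a": 15, "b": 27}'))  # 42
print(execute_tool("divide", "{}"))               # unknown tool error
```

Returning the error as a string, instead of raising, keeps the failure visible to the model once the result is written back to the context.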

Recording the Function Result in Context

After executing the function, you need to add the result back to the context. This is where the call_id becomes important: you use it to link this result to the original function_call.

```python
        # Step 3: Add the function result back to context
        context.append({
            "type": "function_call_output",
            "call_id": item.call_id,
            "output": json.dumps({"result": result})
        })
```

The output is formatted as a JSON string containing the result. You use json.dumps() to ensure the format is consistent and parseable. The function_call_output type tells the API that this message contains the result of a previous function call, and the matching call_id links everything together. This linkage is important for complex agents that might have multiple function calls in flight simultaneously, though in this simple example, you'll only have one at a time. With the result recorded, you can inspect the complete context before making your second API call.

Inspecting the Context Before the Second Call

At this point, your context contains three items: the user's original request, the function_call the model made, and the result of executing that function. Before making the second API call, you can inspect the context to see exactly what the model will receive.

```python
# Display the context before the second call
print("\nCurrent context:")
print(json.dumps(context, indent=2))
```

This produces clear visibility into what you're sending to the model:

```text
Current context:
[
  {
    "role": "user",
    "content": "Compute 15 + 27"
  },
  {
    "type": "function_call",
    "name": "add",
    "arguments": "{\"a\":15, \"b\":27}",
    "call_id": "call_lUnRop9ARHhrzWEkWvSLhcrq"
  },
  {
    "type": "function_call_output",
    "call_id": "call_lUnRop9ARHhrzWEkWvSLhcrq",
    "output": "{\"result\": 42}"
  }
]
```

This explicit visibility into the context is a key benefit of Factor 3: you're not guessing what the model sees or relying on hidden framework state. You know exactly what information is available because you built the context yourself. Now you're ready to make the second API call where the model will see the complete history.
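
As the context grows, a full JSON dump can become long, and a one-line-per-item view may be easier to scan. The sketch below assumes the three item shapes used in this lesson; summarize_context is a hypothetical helper and the call_id is a placeholder value.

```python
def summarize_context(context: list) -> list:
    """Hypothetical helper: return one short line per context item."""
    lines = []
    for index, item in enumerate(context):
        if "role" in item:
            lines.append(f"{index}: {item['role']}: {item['content']}")
        elif item.get("type") == "function_call":
            lines.append(
                f"{index}: call {item['name']}({item['arguments']}) [{item['call_id']}]"
            )
        elif item.get("type") == "function_call_output":
            lines.append(f"{index}: output {item['output']} [{item['call_id']}]")
        else:
            lines.append(f"{index}: unknown item")
    return lines


context = [
    {"role": "user", "content": "Compute 15 + 27"},
    {
        "type": "function_call",
        "name": "add",
        "arguments": '{"a": 15, "b": 27}',
        "call_id": "call_example_1",
    },
    {
        "type": "function_call_output",
        "call_id": "call_example_1",
        "output": '{"result": 42}',
    },
]

for line in summarize_context(context):
    print(line)
```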

Making the Second API Call with Complete History

Now you make a second API call with the updated context. The model will see the entire history: the user's request, the tool call it previously made, and the result of that execution. Based on this information, it should recognize that the work is complete and call final_answer to provide the result to the user.

```python
# Second API call: the model sees the function calls and their results
response = openai.responses.create(
    model="gpt-5",
    instructions=system_prompt,
    input=context,
    tools=tool_schemas,
    tool_choice="required",
    reasoning={"effort": "low"}
)
```

The same parameters are used as before, but now the input parameter contains the full conversation history, including the function execution. The model can use this context to understand that the addition was successful and decide how to proceed. You'll process this second response to extract the final_answer.

Processing the Final Answer

You process this second response the same way you processed the first, looking for function calls in the output. This time, you expect to see a final_answer call with the computed result.

```python
# Process the second set of tool calls (final_answer)
print("")
for item in response.output:
    if item.type == "function_call" and item.name == "final_answer":
        args = json.loads(item.arguments)
        print(f"Final response: {args['answer']}")
```

Running this code produces the final result:

```text
Final response: 42
```

The model used the context to understand that the addition was successful and provided the answer back to the user through the final_answer tool. This completes the execution loop: request, tool call, execution, result feedback, and final answer.
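
To rehearse the whole sequence without network access, the API call can be replaced by a stub. The sketch below is an offline illustration only: fake_model is a hypothetical stand-in that returns objects shaped like the response items used in this lesson (type, name, arguments, call_id), and its call_id values are invented. It does not reproduce how the real model decides what to call.

```python
import json
from types import SimpleNamespace


def add(a: float, b: float) -> float:
    """Add two numbers together."""
    return a + b


def fake_model(context: list) -> SimpleNamespace:
    """Hypothetical stub for the API call: asks for add first, then
    returns final_answer once a function result is in the context."""
    outputs = [i for i in context if i.get("type") == "function_call_output"]
    if not outputs:
        call = SimpleNamespace(type="function_call", name="add",
                               arguments='{"a": 15, "b": 27}',
                               call_id="call_demo_1")
    else:
        result = json.loads(outputs[-1]["output"])["result"]
        call = SimpleNamespace(type="function_call", name="final_answer",
                               arguments=json.dumps({"answer": str(result)}),
                               call_id="call_demo_2")
    return SimpleNamespace(output=[call])


context = [{"role": "user", "content": "Compute 15 + 27"}]

# First call: record the function call, execute it, record the result.
response = fake_model(context)
for item in response.output:
    if item.type == "function_call":
        context.append({"type": "function_call", "name": item.name,
                        "arguments": item.arguments, "call_id": item.call_id})
        result = add(**json.loads(item.arguments))
        context.append({"type": "function_call_output",
                        "call_id": item.call_id,
                        "output": json.dumps({"result": result})})

# Second call: the stub now sees the result and returns final_answer.
response = fake_model(context)
final_answer = None
for item in response.output:
    if item.type == "function_call" and item.name == "final_answer":
        final_answer = json.loads(item.arguments)["answer"]

print(f"Final response: {final_answer}")  # Final response: 42
```

Swapping fake_model for the real API call leaves the context handling unchanged, which is the point of keeping the context as a plain list you manage yourself.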

Summary: Owning Your Context Window

You've now implemented the complete tool execution pattern by defining Python functions alongside their schemas, initializing an explicit context, processing function calls by recording them in context, executing the functions and feeding results back, and completing the loop with a second API call where the model can see the full history. This pattern implements Factor 3 by giving you full control over the context window — you decide what information goes into each API call by explicitly managing the context list, and you can inspect it at any time to understand exactly what the model sees. In the practice exercises, you'll implement this pattern yourself with different tools and scenarios.
