Integrating Human Input Back to Agents

Introduction: Completing the Human-in-the-Loop Workflow

In the previous lesson, you completed Factor 6 by implementing pause and resume endpoints that give clients lifecycle control over agent workflows. Now, you'll implement Factor 7 — Contact humans with tool calls. As you learned when studying the 12-Factor Agents methodology, this factor treats human escalation as a first-class tool: when the agent lacks information, it doesn't guess or fail — it calls a structured ask_human tool, pauses itself, and waits for a response. The human's input is recorded in the context just like any other tool output, making the interaction auditable and reproducible. By the end of this lesson, you'll have a complete human-in-the-loop system where agents and users collaborate seamlessly to solve problems together.

Defining the ask_human Tool Schema

To allow the agent to request information from users, you need to create a new tool schema. The ask_human tool requires only one parameter: the question or prompt that the agent wants to present to the user. Create a new file at src/core/tools/schemas/ask_human.json with the following content:

This schema follows the same structure as your other tools, like sum_numbers or final_answer. The name field identifies the function as ask_human, while the description tells the language model when it should use this tool. The parameters section defines a single required field called question that holds the text the agent wants to show to the user. By keeping the schema simple with just one string parameter, you make it easy for the language model to formulate clear questions without worrying about complex argument structures.
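As a minimal sketch, the schema described above can be built as a Python dictionary and serialized to JSON. This assumes the flat function-tool layout (type, name, description, parameters). Only the tool name and the single required question string come from the lesson text; the description wording is illustrative.

```python
import json

# Hypothetical schema contents: the wording of both descriptions is assumed.
ask_human_schema = {
    "type": "function",
    "name": "ask_human",
    "description": (
        "Ask the user a question when information needed to continue "
        "is missing. Use this instead of guessing."
    ),
    "parameters": {
        "type": "object",
        "properties": {
            "question": {
                "type": "string",
                "description": "The question to show to the user.",
            }
        },
        "required": ["question"],
    },
}

# Serialize to the JSON text that would be saved as ask_human.json.
schema_json = json.dumps(ask_human_schema, indent=2)
print(schema_json)
```

Saving the printed text to src/core/tools/schemas/ask_human.json would give a file in the shape the paragraph above describes.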

Loading the ask_human Schema in the Agent

Now you need to load this schema and register it alongside your existing tools in the agent initialization. Open src/core/agent.py and modify the __init__ method to include the new schema:

The code opens the new schema file and loads it as JSON, just like it does for the math and final_answer schemas. Then, it includes ask_human_schema in the self.tool_schemas list, making the tool available to the language model during each agent step.
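The loading step might look roughly like the sketch below. The Agent class here is a simplified stand-in that shows only schema loading, and the file names sum_numbers.json and final_answer.json, the schema_dir parameter, and the load_schema helper are assumptions made for illustration.

```python
import json
from pathlib import Path


def load_schema(path: Path) -> dict:
    """Read one tool schema file and parse it as JSON."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


class Agent:
    """Simplified stand-in for the lesson's agent; only schema loading is shown."""

    def __init__(self, schema_dir: Path):
        # File names other than ask_human.json are assumed for illustration.
        math_schema = load_schema(schema_dir / "sum_numbers.json")
        final_answer_schema = load_schema(schema_dir / "final_answer.json")
        ask_human_schema = load_schema(schema_dir / "ask_human.json")

        # Every schema in this list is offered to the model on each step.
        self.tool_schemas = [math_schema, final_answer_schema, ask_human_schema]
```

Because ask_human_schema sits in the same list as the other tools, the model can choose it on any step without further wiring.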

Handling ask_human in the Agent's Step Logic

When the agent calls the ask_human tool, you need special tool dispatch logic that differs from regular tools. Regular tools execute immediately and return results, while final_answer terminates the workflow. The ask_human tool should pause the workflow and wait for external input before continuing. Add a new case to the tool dispatch logic in the _next_step method:

When the agent detects an ask_human call, the handler performs three steps:

Removes the call from state.pending_tool_calls to prevent reprocessing it later.
Sets state.status to waiting_human_input, which signals to the rest of your system that the agent is paused and waiting for user interaction.
Returns the state immediately without continuing the loop.

Unlike final_answer, which ends the workflow permanently, ask_human represents a temporary pause that will resume after the user provides input.
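The three steps above can be sketched as follows. AgentState is a simplified stand-in with assumed fields, and dispatch_tool_call is a hypothetical function name; in the lesson this logic lives inside the _next_step method alongside the other tool branches.

```python
from dataclasses import dataclass, field


@dataclass
class AgentState:
    """Simplified stand-in for the lesson's state model (fields assumed)."""
    status: str = "running"
    context: list = field(default_factory=list)
    pending_tool_calls: list = field(default_factory=list)


def dispatch_tool_call(state: AgentState, call: dict) -> AgentState:
    """Handle one pending tool call; only the ask_human branch is sketched."""
    if call["name"] == "ask_human":
        # 1. Drop the call so it is not processed again after resumption.
        state.pending_tool_calls.remove(call)
        # 2. Mark the workflow as paused until a human answers.
        state.status = "waiting_human_input"
        # 3. Return right away instead of continuing the step loop.
        return state
    raise NotImplementedError("other tools are outside this sketch")


call = {
    "type": "function_call",
    "name": "ask_human",
    "call_id": "call_123",
    "arguments": '{"question": "What is the radius of the circle?"}',
}
state = AgentState(context=[call], pending_tool_calls=[call])
state = dispatch_tool_call(state, call)
print(state.status)
```

Note that the call stays in the context even though it leaves pending_tool_calls, so the question remains part of the recorded history.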

Creating the provide_input Endpoint Request Model

To accept human responses and resume agent execution, you need to create a new API endpoint. Start by defining a Pydantic model for the request payload in src/server/main.py:

This model requires two fields: id identifies which agent state should receive the input, and answer contains the human's response to the agent's question.
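A minimal sketch of such a model, assuming Pydantic is installed. The class name ProvideInputRequest and the choice of str for id are assumptions; only the two field names come from the lesson.

```python
from pydantic import BaseModel


class ProvideInputRequest(BaseModel):
    """Payload for the provide_input endpoint (class name is an assumption)."""

    id: str      # identifies the stored agent state; string type assumed
    answer: str  # the human's reply to the agent's question


req = ProvideInputRequest(id="state-1", answer="The radius is 10 inches")
print(req.answer)
```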

Implementing the provide_input Route Handler

Now implement the route handler that validates the request and prepares the state for resumption:

The endpoint opens a database session and queries for the state record matching the provided id, returning a 404 error if it does not exist. Before accepting the input, the handler validates that the state's current status is waiting_human_input. This prevents clients from accidentally sending answers to agents that are not waiting for them. After validation passes, the code converts the database record to a Pydantic model to work with in the subsequent logic.
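The validation order described above can be sketched without a web framework or database. Here a plain dictionary stands in for the database table, HTTPError stands in for FastAPI's HTTPException, and load_state_for_input is a hypothetical helper name. The 400 status code for the "not waiting" case is an assumption, since the lesson only specifies the 404.

```python
class HTTPError(Exception):
    """Stand-in for FastAPI's HTTPException so the sketch has no dependencies."""

    def __init__(self, status_code: int, detail: str):
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail


def load_state_for_input(states: dict, state_id: str) -> dict:
    """Return the stored state if it exists and is waiting for a human."""
    record = states.get(state_id)
    if record is None:
        # Unknown id: nothing to answer.
        raise HTTPError(404, "State not found")
    if record["status"] != "waiting_human_input":
        # Status code assumed; rejects answers sent to agents that never asked.
        raise HTTPError(400, "Agent is not waiting for human input")
    return record


states = {
    "a": {"status": "waiting_human_input", "context": []},
    "b": {"status": "running", "context": []},
}
print(load_state_for_input(states, "a")["status"])
```

Checking existence before status keeps the two failure modes distinct for the client: a wrong id and a badly timed answer produce different errors.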

Extracting the call_id from the Context

Before you can add the human's answer to the agent's context, you need to determine which specific ask_human call the answer corresponds to. The language model requires this information because it tracks function calls and their outputs using unique call_id values. For that, let's implement a module-level helper function in src/server/main.py that searches backwards through the context:

Place this function above the provide_input route handler so it's available when the endpoint calls it. The function iterates through the context in reverse order using Python's reversed() function. Searching from the end finds the most recent ask_human call first, and it is also quicker than scanning from the beginning because that call is likely to be near the end of the context. For each item, it checks whether the item is a dictionary with type equal to function_call and name equal to ask_human. When it finds such an item, it extracts and returns the call_id field. If no matching call is found, the function returns None.

Using the call_id to Validate the Context

Now, back inside the provide_input route handler, you can call this helper to retrieve the call_id needed for constructing the response:

After extracting the call_id, the code validates that it exists. If _get_call_id_from_state returns None, the endpoint raises a 400 error, indicating that the context is in an invalid state.

Constructing the Function Call Output and Securing the Lifecycle

With the call_id identified, you can construct the function_call_output structure that the language model expects and append it to the agent's context.

The human_response dictionary links the output to the original ask_human call using the extracted call_id, and wraps the human's response in a JSON string. The endpoint then sets the status to running and schedules the background task, allowing the agent to continue seamlessly.
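The construction step might look like the sketch below. The function name apply_human_answer and the "answer" key inside the JSON string are assumptions, and the background task scheduling is left out because it depends on the server setup from earlier lessons.

```python
import json
from types import SimpleNamespace


def apply_human_answer(state, call_id: str, answer: str):
    """Append the human's reply as a tool output and mark the state runnable."""
    human_response = {
        "type": "function_call_output",
        "call_id": call_id,  # links the output to the original ask_human call
        "output": json.dumps({"answer": answer}),  # key name "answer" assumed
    }
    state.context.append(human_response)
    state.status = "running"
    return state


state = SimpleNamespace(
    status="waiting_human_input",
    context=[
        {
            "type": "function_call",
            "name": "ask_human",
            "call_id": "call_1",
            "arguments": '{"question": "What is the radius?"}',
        }
    ],
)
state = apply_human_answer(state, "call_1", "The radius is 10 inches")
print(state.context[-1])
```

Encoding the answer with json.dumps keeps the output field a plain string, which matches how other tool results are stored in the context.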

Important Note: Because the agent is now waiting for specific input, you must also secure your existing /agent/resume endpoint. If a client blindly calls resume while the agent is waiting for a human, the agent will loop indefinitely or fail. You should update /agent/resume to reject requests with a 400 error if the state is waiting_human_input, forcing clients to use the provide_input endpoint instead. You'll do this in the upcoming exercise!

Testing the Launch and Waiting Phase

To verify that the human-in-the-loop workflow functions correctly, let's write a test script that launches an agent with an incomplete prompt and waits for the ask_human call.

Running this script produces output showing the agent launching and running until it needs human input:

The output shows that the agent launched successfully and ran for two steps before calling ask_human to request the missing circle measurement. The question it formulated is clear and specific: it asks for either the radius or the diameter and mentions that it will use pi = 3.14, as specified in the original prompt.
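The waiting part of such a script is essentially a polling loop. The sketch below assumes each fetched state is a dictionary with a status key; wait_for_status and fetch_state are hypothetical names, and the "completed" status value is also an assumption. In a real script, fetch_state would issue an HTTP request to your API, whereas here a fake sequence of states is used so the loop runs on its own.

```python
import time
from typing import Callable


def wait_for_status(
    fetch_state: Callable[[], dict],
    targets: set,
    interval: float = 1.0,
    max_polls: int = 30,
) -> dict:
    """Poll until the state's status is one of the targets, then return it."""
    for _ in range(max_polls):
        state = fetch_state()
        if state["status"] in targets:
            return state
        time.sleep(interval)
    raise TimeoutError("agent did not reach a target status in time")


# Fake server responses standing in for repeated HTTP calls.
fake_states = iter([
    {"status": "running"},
    {"status": "running"},
    {"status": "waiting_human_input"},
])
final = wait_for_status(
    lambda: next(fake_states),
    {"waiting_human_input", "completed"},
    interval=0,
)
print(final["status"])
```

Bounding the loop with max_polls keeps a test script from hanging forever if the agent never asks its question.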

Testing the Input and Completion Phase

Now extend the test script to provide the human input and monitor completion:

Running this continuation of the script produces output demonstrating the complete human-in-the-loop cycle:

After the script provides the radius of ten inches, the agent resumes execution and continues for four more steps, ultimately reaching completion with the correct calculated area of 314.0 square inches. If you examine the full context in the final state, you'll see the ask_human function call followed by the function_call_output containing the human's answer — both linked by the same call_id, providing a complete audit trail of the collaboration.
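To check the audit trail programmatically, you could pair each ask_human call with the output that shares its call_id. The sketch below uses a hand-written context with assumed field values, and linked_pairs is a hypothetical helper. It also recomputes the area from the figures in the lesson (pi = 3.14, radius 10).

```python
def linked_pairs(context: list) -> list:
    """Pair each ask_human call with the output that shares its call_id."""
    outputs = {
        item["call_id"]: item
        for item in context
        if item.get("type") == "function_call_output"
    }
    return [
        (item, outputs.get(item["call_id"]))
        for item in context
        if item.get("type") == "function_call" and item.get("name") == "ask_human"
    ]


# Illustrative context; real entries come from the final state of your agent.
context = [
    {
        "type": "function_call",
        "name": "ask_human",
        "call_id": "call_1",
        "arguments": '{"question": "What is the radius or diameter?"}',
    },
    {
        "type": "function_call_output",
        "call_id": "call_1",
        "output": '{"answer": "The radius is 10 inches"}',
    },
]
pairs = linked_pairs(context)
question_call, answer_output = pairs[0]
print(question_call["call_id"] == answer_output["call_id"])

# Area of the circle with the values used in the lesson.
area = 3.14 * 10 ** 2
print(area)
```

An ask_human call paired with None would indicate a question that never received an answer.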

Summary

You've implemented Factor 7 — Contact humans with tool calls by building a complete human-in-the-loop system. The ask_human tool schema, the provide_input endpoint, and the call matching logic work together so that agents can request structured information from users, record the exchange in context like any other tool call, and resume seamlessly. Combined with the launch, pause, and resume endpoints from earlier lessons (Factor 6), your API now supports rich, collaborative workflows between agents and humans.
