Introduction: Grounding Factor 1 in Practice

In the previous course, you learned about Factor 1 (Natural Language to Tool Calls) of the 12-Factor Agents methodology: translating natural language into tool calls. The core idea is that instead of letting your LLM produce free-form text that you then have to parse with brittle string manipulation, you design the interaction so the model returns structured outputs — typically JSON or function calls — that your code can reliably process.

Now it's time to implement that principle. In this lesson, you'll learn a probabilistic approach to getting structured outputs: prompting OpenAI models to return JSON. This method relies on careful instruction through the system prompt, but it's important to understand that this is a best-effort technique — the model will usually comply, but there's no guarantee. You'll need robust error handling to manage cases where the model doesn't follow instructions perfectly.

The System Prompt Pattern

The first step in getting structured outputs is crafting a system prompt that clearly instructs the model on what format to use. Think of the system prompt as setting the rules of the game before the conversation starts. Here's a simple but effective pattern:

Notice what this prompt does. It doesn't just ask the model to "return JSON" in a vague way. It shows the exact JSON schema you expect, complete with field names and a brief description of what goes in each field. The word "only" is deliberate — it reinforces that the model should not add extra commentary or explanations outside the JSON structure. This approach leverages the model's training on structured data formats, so when you show a clear schema, the model generally complies. However, remember this is a best-effort instruction — the model may still occasionally deviate from the schema or return invalid JSON.

Making the API Request

With your system_prompt ready, you can now make a request to the OpenAI Responses API. This API is designed for structured interactions where you want more control over the model's behavior.

Let's break down each parameter used in the request:

  • model: Specifies which OpenAI model to use — in this case, "gpt-5", which is one of the more capable models.
  • instructions: Takes your system_prompt, which sets the behavioral constraints for the entire conversation.
  • input: A list of messages representing the conversation history, where each message has a role (like "user" or "assistant") and content (the actual text).
  • reasoning: Setting this to "low" effort is appropriate for straightforward tasks like this math question. More complex reasoning would use or but consume more tokens.
Navigating the Response Structure

Once you get a response back from the API, you need to extract the actual JSON data. The response object has a nested structure that requires careful navigation.

The response has an output attribute, which is a list of items. Each item could be a message, a tool call, or other types of content. You iterate through these items and check if the type is "message" — this indicates it's a text response from the model. Inside the message, the content is itself a list, so you access the first item and get its text attribute to retrieve the raw string the model generated. With this text in hand, you're ready to parse it as JSON.

Parsing and Extracting the Answer

Now comes the critical step: converting the text string into a Python dictionary so you can access the structured data. This is where you validate that the model actually followed your instructions.

The json.loads() function converts the model's raw text response into a Python dictionary. By accessing the "answer" key, you can extract the specific value you need. For our addition problem, the terminal will display the final calculated result:

However, because this prompting approach is probabilistic, robust error handling is essential.

Handling JSON Parse Failures

Because prompt-based JSON generation is a best-effort technique, there's always a real possibility the model might return something that isn't valid JSON — maybe it added extra text, used the wrong field names, or included a syntax error. This isn't a rare edge case; it's an inherent limitation of relying on instruction-following rather than enforced structure.

By wrapping the JSON parsing in a try-except block, you handle failures gracefully. If json.loads() encounters invalid JSON, it raises a JSONDecodeError, which you catch and handle instead of letting your program crash. In a production system, you would want to do more than just print an error message — you might log the failure for debugging, retry the request with a modified prompt, or escalate to a human operator. This defensive coding approach is fundamental to working with probabilistic structured outputs.

Summary & Moving to Practice

You've now seen a complete pattern for prompting LLMs to return structured JSON outputs: a clear system prompt with an explicit schema, a properly structured API call, careful extraction of the response content, and defensive parsing with error handling. This is a probabilistic approach that works well in many cases but requires robust error handling because the model may not always comply with your instructions.

In the practice exercises that follow, you'll implement this pattern yourself with different schemas and questions, gaining hands-on experience with both successful cases and the error handling you'll need when the model doesn't return perfectly structured output.

Sign up
Join the 1M+ learners on CodeSignal
Be a part of our community of 1M+ users who develop and demonstrate their skills on CodeSignal