Welcome to your first lesson in building effective agents with GPT-5! Whether you're new to the OpenAI API or have some experience, this lesson will give you a solid foundation in structuring requests and interpreting responses — skills you’ll reuse for agent workflows throughout this course.
In this lesson, you’ll learn how to send messages using the Responses API and understand the complete response structure. By the end, you’ll be able to create a script that communicates with a chat-capable model and inspect the full JSON response. This matters because, throughout the course, we’ll be working with different parts of Responses API outputs — from basic text to reasoning summaries and conversation flow control.
The same patterns extend to more complex workflows later in the path, regardless of which compatible model you choose.
To communicate with GPT-5, you'll need two things: the OpenAI Python SDK and an API key from OpenAI. The SDK handles all the technical details of making API requests, and you'd normally install it through your package manager (for example, with pip install openai). The API key authenticates your requests, and the OpenAI client automatically looks for it in the OPENAI_API_KEY environment variable.
In CodeSignal, we've already configured everything for you — the SDK is pre-installed and your API key is set up, so you can focus on learning the core concepts without worrying about setup details.
Every interaction follows a structured conversation pattern. The Responses API uses role-based messaging:
- The system role (passed via instructions) sets behavior and context for the model, serving as the model's "job description" for the conversation.
- The user role represents messages from you or your end users.
- The assistant role represents model responses.
When you make a request, you provide a model, an instructions string, an input array representing the conversation history, and optional flags like store to control persistence. The API returns a structured JSON object containing the model’s output and metadata — details we’ll rely on later for reasoning tracking, conversation management, and error handling.
Let’s build a first interaction by initializing the client and defining a model and system prompt. We’ll use a chat-capable model identifier as an example; you can substitute any compatible model.
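```python
from openai import OpenAI

# The client automatically reads your API key from the
# OPENAI_API_KEY environment variable.
client = OpenAI()

# Example identifier; substitute any chat-capable model you have access to.
MODEL = "gpt-5"

# An example system prompt; adjust the wording to the behavior you want.
SYSTEM_PROMPT = (
    "You are a friendly, concise assistant. "
    "Answer questions clearly and keep responses short."
)
```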
The client pulls your API key from environment variables. The system prompt will influence how the model responds across the conversation and is central to defining behavior you’ll use later for tools and agent workflows.
Create the messages array representing the conversation so far (the user question below is just an example):
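```python
# A single user turn; the question itself is only an example.
messages = [
    {"role": "user", "content": "In a couple of sentences, what is machine learning?"}
]
```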
Each message has role and content fields. We use an array even for a single turn because the same structure extends naturally to multi-turn conversations.
With our message prepared, we can now send it to GPT-5:
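```python
response = client.responses.create(
    model=MODEL,
    instructions=SYSTEM_PROMPT,  # system-role behavior for the conversation
    input=messages,              # the conversation history so far
    store=False,                 # don't persist this conversation on OpenAI's side
)
```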
The client.responses.create() method sends an HTTP request to OpenAI's servers, where GPT-5 processes your message according to the instructions and returns a structured response.
The store parameter controls whether OpenAI persists this conversation for later retrieval. By default, it's set to True, meaning conversations are stored and can be accessed again using their unique ID. Setting store=False tells OpenAI not to save the conversation — it processes your request and returns a response, but the conversation won't be retrievable later. Throughout this course, we'll manage the entire context window of our systems ourselves by maintaining conversation history in our code, so we don't need OpenAI to store anything. This gives us complete control over what gets sent in each request and how conversations are structured.
Notice that, unlike some other APIs, we don't need to specify a maximum response length — GPT-5 manages response sizing automatically based on the conversation context and the task at hand.
To understand what GPT-5 returns, let's examine the complete response structure:
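```python
import json

# Convert the response object to a dictionary and pretty-print it as JSON.
print(json.dumps(response.model_dump(), indent=2))
```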
The response.model_dump() method converts GPT-5's response into a dictionary, and json.dumps() formats it as readable JSON. You'll see output like this (abridged here, with illustrative IDs, text, and token counts):
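```json
{
  "id": "resp_abc123",
  "status": "completed",
  "model": "gpt-5",
  "output": [
    {
      "id": "rs_abc123",
      "type": "reasoning",
      "summary": []
    },
    {
      "id": "msg_abc123",
      "type": "message",
      "role": "assistant",
      "status": "completed",
      "content": [
        {
          "type": "output_text",
          "text": "Machine learning is a way of building software that learns patterns from data..."
        }
      ]
    }
  ],
  "usage": {
    "input_tokens": 36,
    "output_tokens": 150,
    "output_tokens_details": {
      "reasoning_tokens": 64
    },
    "total_tokens": 186
  }
}
```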
This structure contains everything you need to see how the API processed your request and what it returned.
Understanding this response structure is essential for the rest of the course. Key fields include:
- id: Unique identifier for logging and debugging
- status: Whether the response completed successfully
- output: Array of output blocks (e.g., reasoning blocks and message blocks)
- usage: Token consumption details, which matter for monitoring and cost control
In output, you typically see a reasoning block (optionally with a summary if enabled/supported) and a message block (the final answer). Inside message blocks, the actual text lives in content items of type "output_text".
It's important to note that GPT-5 requests include reasoning with medium effort by default — even when you don't explicitly specify the reasoning parameter, the model allocates tokens to internal reasoning processes. You'll see this reflected in the usage section, where reasoning_tokens are counted separately from regular output tokens. This default behavior ensures thoughtful, well-considered responses while balancing quality and speed. This structure allows the API to include both intermediate reasoning summaries and final outputs when supported.
While the full JSON structure shows you everything about the response, most of the time you'll want to access just GPT-5's text response. The OpenAI Python SDK provides a convenient property for this:
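```python
# output_text is a convenience property computed by the SDK,
# not a field in the raw JSON response.
print(response.output_text)
```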
The output_text property is a client-side convenience that automatically extracts the text content from the response structure. It scans through the output array for assistant message blocks, finds content items where type is "output_text", and collects their text values. This produces clean text output along these lines (the exact wording will vary from run to run):
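```
Machine learning is a way of building software that learns patterns from data
instead of following hand-written rules. Given enough examples, a model can
make predictions or decisions about new inputs it has never seen before.
```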
This is the easiest approach for simple interactions, as the SDK handles navigating the nested structure for you. Note that output_text doesn't appear in the raw JSON response — it's computed by the Python client based on the content structure.
To continue a conversation, maintain history by appending the model’s response as an assistant message, then add your follow-up:
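```python
# Append the model's answer so the history stays complete,
# then add a follow-up question (again, just an example).
messages.append({"role": "assistant", "content": response.output_text})
messages.append({"role": "user", "content": "And how are these models trained?"})

response = client.responses.create(
    model=MODEL,
    instructions=SYSTEM_PROMPT,
    input=messages,
    store=False,
)
```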
Notice how we append GPT-5's text response as an assistant message to maintain the conversation structure. This preserves the conversation flow and allows GPT-5 to understand the context of our follow-up question. Now let's see GPT-5's response:
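```python
print(response.output_text)
```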
This produces output along these lines (actual text will vary):
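```
Training typically works by showing the model many examples and gradually
adjusting its internal parameters to reduce prediction errors, a process
driven by optimization algorithms such as gradient descent.
```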
The conversation continues naturally because GPT-5 can see the full context of our previous exchange, allowing it to provide a focused answer about training specifically.
Although GPT-5 includes reasoning with medium effort by default, we can explicitly control the reasoning behavior to suit our needs. Let's continue the conversation with custom reasoning settings to see how summaries can appear:
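```python
# Continue the same conversation, this time with explicit reasoning settings.
# The follow-up question is again just an example.
messages.append({"role": "assistant", "content": response.output_text})
messages.append({"role": "user", "content": "Can you summarize our conversation in one sentence?"})

response = client.responses.create(
    model=MODEL,
    instructions=SYSTEM_PROMPT,
    input=messages,
    reasoning={"effort": "low", "summary": "auto"},  # lower effort, automatic summaries
    store=False,
)
```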
The reasoning parameter allows us to override the default behavior. Setting effort to "low" reduces the computational effort GPT-5 invests in thinking through the problem compared to the default medium level, while summary set to "auto" tells GPT-5 to automatically generate a summary of its reasoning process when appropriate. This gives us fine-grained control over the balance between response speed, depth of reasoning, and visibility into the model's thought process.
Let's examine the full response structure to see both GPT-5's internal reasoning summary and its final answer:
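```python
print(json.dumps(response.model_dump(), indent=2))
```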
You'll see output that includes both a reasoning block with a summary and a message block with the final answer (abridged here, with illustrative IDs and text):
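```json
{
  "id": "resp_def456",
  "status": "completed",
  "output": [
    {
      "id": "rs_def456",
      "type": "reasoning",
      "summary": [
        {
          "type": "summary_text",
          "text": "Recapping the conversation: the user asked what machine learning is and how models are trained, and now wants a one-sentence summary."
        }
      ]
    },
    {
      "id": "msg_def456",
      "type": "message",
      "role": "assistant",
      "status": "completed",
      "content": [
        {
          "type": "output_text",
          "text": "We discussed how machine learning systems learn patterns from data and are trained by iteratively adjusting their parameters to reduce errors."
        }
      ]
    }
  ],
  "usage": {
    "input_tokens": 180,
    "output_tokens": 170,
    "output_tokens_details": {
      "reasoning_tokens": 128
    },
    "total_tokens": 350
  }
}
```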
Notice how the response now contains two output blocks: a reasoning block showing GPT-5's internal thought process as a summary, and a message block with the polished final answer. The usage section also breaks down token consumption, showing that 128 tokens were used for reasoning.
When reasoning is enabled, you can access the summary through the response structure, but the output_text property still gives you just the final answer:
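```python
# Walk the output blocks; block order and summary availability can vary
# by model and settings.
for block in response.output:
    if block.type == "reasoning":
        for item in block.summary:
            print("Reasoning summary:", item.text)

# output_text still returns only the final answer text.
print("Final answer:", response.output_text)
```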
You’ve now learned how to interact with the OpenAI Responses API: structuring conversations with roles, sending requests with instructions, inspecting full response JSON, extracting text via output_text, maintaining multi-turn history, and enabling reasoning with the reasoning parameter where supported.
These core patterns — input as a role-structured array, output parsing through output blocks, and optional reasoning summaries — form the foundation for the agent workflows you’ll build in the rest of this course. In the upcoming practices, you'll get hands-on experience building on these concepts and exploring different ways to interact with GPT-5. This foundation will serve you well as we progress through more advanced topics in the course!
