In the previous lessons, you learned how large language models (LLMs) generate text by predicting one token at a time and how different model versions can affect the quality of responses. Let's build on that foundation by exploring how you can get even better answers from LLMs — especially for complex or multi-step problems — by guiding them to "think" before answering.
When you ask a model a simple question, it often gives you a direct answer. However, for more complicated tasks, such as solving a math problem or analyzing a scenario, the model can make mistakes if it tries to answer too quickly. This is where reasoning techniques come in. Encouraging the model to break down its thought process can help it arrive at more accurate and logical answers.
The Chain of Thought (CoT) technique is a way to prompt LLMs to solve problems step by step, just like you might do on paper. Instead of asking for a final answer right away, you guide the model to show its reasoning process.
Let's see how this works, starting with a simple math problem.
Suppose you ask the model to multiply two two-digit numbers (the exact figures below are just an illustration):
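```
What is 47 × 36?
```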
If you ask this, the model will try to predict the next token. It can't actually do the math; it just predicts something plausible. Sometimes it gets the answer right, but it often makes mistakes, especially with large numbers or multi-step problems.
To help the model, you can add a phrase like "Think step by step" to your prompt. This tells the model to break down the problem:
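```
What is 47 × 36? Think step by step.
```

Even this small nudge often gets the model to write out intermediate steps instead of jumping straight to a (possibly wrong) final number.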
But you can be even more helpful by showing the model exactly how to break down the steps. Let's build this up together.
First, you can show the model how to multiply using place value:
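```
47 × 36
= 47 × 30 + 47 × 6
= 1410 + 282
= 1692
```

Each line is a small, checkable step that the model can imitate.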
Now, let's put this into a full prompt, step by step (the layout below is just one way to arrange it):
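```
Q: What is 23 × 14?
A: Let's solve it step by step.
23 × 14 = 23 × 10 + 23 × 4
23 × 10 = 230
23 × 4 = 92
230 + 92 = 322
So 23 × 14 = 322.

Q: What is 47 × 36?
A: Let's solve it step by step.
```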
In this prompt, we give the model a worked example for a different problem and then ask it to solve the actual one. This way, we show the model how to "think" properly.
By guiding the model through each step, you help it avoid mistakes and clarify its reasoning. This is the core idea behind Chain of Thought prompting.
You can use this approach for many problems, not just math. For example, you can ask the model to explain its reasoning in logic puzzles, story analysis, or code debugging.
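For instance, the same technique applied to a small logic puzzle (the puzzle itself is just an illustration) might look like this:

```
Alice, Bob, and Carol ran a race. Alice finished before Bob, and Carol finished after Bob.
Who finished last? Think step by step and explain your reasoning before giving the answer.
```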
Some advanced LLMs are trained to use Chain of Thought reasoning automatically, especially when they see specific cues in your prompt. These are called reasoning models. They are designed to handle multi-step problems by breaking them down internally, even if you don't explicitly ask them to.
For example, if you prompt a reasoning model with:
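```
What is 47 × 36?
```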
A reasoning model might automatically start breaking down the steps, similar to the example above, and show its work before giving the final answer.
However, not all models do this by default. For regular models, you often need to guide them by providing an example or by adding "Think step by step" to your prompt.
Reasoning models are more likely to follow this process independently, but you can still help them by being clear in your prompt.
Let's examine the pros and cons of using reasoning models and when to use each type.
When to use reasoning models:
- Solving math problems with multiple steps
- Logic puzzles or riddles
- Explaining a complex topic or process in detail
- Brainstorming sessions
- Analyzing stories or scenarios
- Advanced coding tasks
When to use regular models:
- Quick factual lookups (e.g., "What is the capital of France?")
- Simple, direct questions
- When you want to create something, e.g., generate a story or a code snippet
- When you want a short answer
- When you want to save money
In our environment, the following models are reasoning models:
- Anthropic Claude Sonnet 4
- DeepSeek R1
Other models in our environment are not reasoning models.
A reasoning model's answer will contain its "thoughts" as part of the response. The exact format varies from model to model, but the output has roughly this shape:
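```
[Reasoning]
To compute 47 × 36, split 36 into 30 + 6.
47 × 30 = 1410
47 × 6 = 282
1410 + 282 = 1692

[Answer]
47 × 36 = 1692
```

Some models wrap these thoughts in special tags or show them in a separate panel, but the pattern is the same: the reasoning comes first, then the final answer.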
In this lesson, you learned how to use the Chain of Thought technique to guide LLMs through step-by-step reasoning. You saw how to build prompts that encourage the model to show its work, and you learned the difference between reasoning models and regular models. You also saw when to use each approach for the best results.
Next, you'll get to practice writing your own Chain of Thought prompts and see how different models respond. This hands-on practice will help you become more confident in getting accurate and thoughtful answers from LLMs.
