Tuning Model Responses

Exploring Model Parameters

Welcome back! In the previous lesson, you learned how to send a simple message to DeepSeek's language model and receive a response. Now, we will take a step further by exploring model parameters that allow you to customize the AI tutor's responses. These parameters are crucial for tailoring the tutor's behavior to meet specific educational needs.

In this lesson, we will focus on four key parameters: maxTokens, temperature, presencePenalty, and frequencyPenalty. Understanding these parameters will enable you to control the creativity, length, and content of the AI's explanations, enhancing your personal tutor's effectiveness.

Understanding the Messaging Pipeline

Before we dive into tuning model responses, it's important to understand three key classes you'll use when interacting with the DeepSeek model in Spring AI:

- UserMessage: Represents a message from the user. You use this class to wrap the text or question you want to send to the AI tutor.
- Prompt: Acts as a container that holds one or more messages (such as UserMessage) along with any chat options (like model parameters). The Prompt is what you send to the model for processing.
- AssistantMessage: Represents the AI tutor's reply. After you send a Prompt to the model, the response you receive is encapsulated in an AssistantMessage.

These classes form the core of the message exchange process between your application and the AI model. You'll see them in action throughout the code examples in this lesson.
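The flow between these three roles can be sketched in plain Java. The types below (UserMsg, AssistantMsg, Options, PromptSketch) and the call method are hypothetical local stand-ins written for this illustration. They are not Spring AI's classes, and no real model is contacted; the sketch only mirrors the wrap, send, and read steps described above.

```java
import java.util.List;

// Hypothetical stand-ins, NOT Spring AI's classes: they only mirror the roles described above.
record UserMsg(String text) {}
record AssistantMsg(String text) {}
record Options(String model, int maxTokens) {}
record PromptSketch(List<UserMsg> messages, Options options) {}

class PipelineSketch {
    // Hypothetical fake model: echoes the last user message instead of calling a real API.
    static AssistantMsg call(PromptSketch prompt) {
        UserMsg last = prompt.messages().get(prompt.messages().size() - 1);
        return new AssistantMsg("[" + prompt.options().model() + "] You asked: " + last.text());
    }

    public static void main(String[] args) {
        // 1. Wrap the question in a user message.
        UserMsg question = new UserMsg("What is a prime number?");
        // 2. Bundle the message and the options into a prompt.
        PromptSketch prompt = new PromptSketch(List.of(question), new Options("deepseek-chat-v3", 150));
        // 3. Send the prompt and read the reply from the assistant message.
        AssistantMsg reply = call(prompt);
        System.out.println(reply.text());
    }
}
```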

Controlling Response Length with Max Tokens

The maxTokens parameter sets a hard limit on the number of tokens the AI can generate in its response. A "token" can be a whole word or just part of a word. For example, "tutor" might be one token, while "explanation" could be split into multiple tokens. Token counts also vary across models and languages, so the same text can have a different token count depending on the tokenizer in use.

When you set maxTokens, you specify the maximum number of tokens the AI can produce. This is a strict limit, meaning the model will stop generating text once it reaches this count, even if it results in an incomplete answer.

In Spring AI, you configure these settings using the ChatOptions class. ChatOptions allows you to specify which model to use and set various parameters that control the AI's behavior, such as the maximum number of tokens, temperature, and penalties for presence and frequency.

For example, suppose we set maxTokens to 150 and use the deepseek-chat-v3 model. This imposes a hard limit of 150 tokens on the AI tutor's explanation, so the response may be abruptly cut off if the model hasn't completed its intended thought. Importantly, the parameter doesn't make the model inherently more concise or brief. It only caps the length of the output.
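To see why a hard cap truncates rather than condenses, here is a toy sketch in plain Java. It assumes whitespace splitting as a stand-in for a real tokenizer (actual tokenizers split text differently), and the truncate helper is a hypothetical name used only for this illustration.

```java
import java.util.Arrays;
import java.util.List;

class MaxTokensSketch {
    // Assumption: whitespace splitting stands in for a real tokenizer.
    static List<String> tokenize(String text) {
        return Arrays.asList(text.trim().split("\\s+"));
    }

    // Keeps at most maxTokens tokens; anything past the limit is dropped,
    // whether or not the thought was finished.
    static String truncate(String fullAnswer, int maxTokens) {
        List<String> tokens = tokenize(fullAnswer);
        int kept = Math.min(maxTokens, tokens.size());
        return String.join(" ", tokens.subList(0, kept));
    }

    public static void main(String[] args) {
        String answer = "A prime number is a whole number greater than one "
                + "whose only divisors are one and itself";
        // prints "A prime number is a whole number greater"
        System.out.println(truncate(answer, 8));
    }
}
```

The cut-off text is not a shorter explanation of the same idea. It is the same explanation with its ending missing, which is why a small limit usually needs to be paired with a prompt that asks for a brief answer.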

Exploring Temperature

The temperature parameter controls the randomness or creativity of the AI's responses. A lower temperature value, such as 0.2, makes the AI's output more deterministic and focused, often resulting in more predictable explanations. Conversely, a higher temperature value, like 0.8, encourages the AI to generate more diverse and creative responses, which can be useful for providing varied educational content and explanations.

For example, with the temperature set to 0.6, the AI tutor is likely to provide an explanation that balances variety and predictability. Experimenting with different temperature values will help you find the right balance for your specific tutoring scenarios.

A lower temperature might be preferable for mathematical or scientific explanations where precision is crucial, while a higher temperature could work better for creative writing or brainstorming sessions.
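The effect of temperature can be sketched with the standard temperature-scaled softmax: each candidate token's score (logit) is divided by the temperature before being turned into a probability. The logits below are hypothetical values chosen for illustration, and the sketch assumes a temperature greater than zero.

```java
class TemperatureSketch {
    // Divides each logit by the temperature, then applies softmax
    // to obtain sampling probabilities. Assumes temperature > 0.
    static double[] softmax(double[] logits, double temperature) {
        double max = Double.NEGATIVE_INFINITY;
        for (double l : logits) {
            max = Math.max(max, l / temperature);
        }
        double sum = 0.0;
        double[] probs = new double[logits.length];
        for (int i = 0; i < logits.length; i++) {
            // Subtracting the max keeps Math.exp numerically stable.
            probs[i] = Math.exp(logits[i] / temperature - max);
            sum += probs[i];
        }
        for (int i = 0; i < probs.length; i++) {
            probs[i] /= sum;
        }
        return probs;
    }

    public static void main(String[] args) {
        double[] logits = {2.0, 1.0, 0.5}; // hypothetical scores for three candidate tokens
        for (double t : new double[] {0.2, 0.6, 0.8}) {
            double[] p = softmax(logits, t);
            System.out.printf("T=%.1f -> %.3f %.3f %.3f%n", t, p[0], p[1], p[2]);
        }
    }
}
```

At 0.2 almost all of the probability lands on the top-scoring token, which is why low temperatures feel deterministic. At 0.8 the lower-scoring tokens receive a noticeably larger share, so repeated runs vary more.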

Encouraging New Topics with Presence Penalty

The presencePenalty parameter is a powerful tool for encouraging the AI tutor to introduce new concepts in its explanations. It works by penalizing the AI for using words that have already appeared in the conversation, thus promoting diversity in the dialogue.

A presencePenalty of 0.0 applies no penalty, so the AI is free to reuse words it has already used, which tends to keep explanations focused on the same ideas. In contrast, a high presencePenalty value, like 1.0, strongly encourages the AI to explore new topics, resulting in more varied and diverse tutoring content.

For example, with a presencePenalty of 0.5, the AI tutor is more likely to introduce new concepts and provide varied explanations. This can be particularly useful in educational scenarios where you want to expose students to a broader range of related ideas or approaches to a problem.

Reducing Repetition with Frequency Penalty

The frequencyPenalty parameter helps reduce repetition in the AI's explanations by discouraging the repeated use of the same words or phrases. This encourages more varied and engaging educational content. A low frequencyPenalty value, such as 0.0, allows for more repetition, which can be useful for reinforcing key concepts. Conversely, a high frequencyPenalty value, such as 1.0, reduces repetition, promoting more dynamic and varied explanations.

Although presencePenalty and frequencyPenalty both discourage repetition, they work differently:

- Presence penalty: Applies a one-time penalty to any word that has already appeared, no matter how often, which nudges the AI toward new concepts.
- Frequency penalty: Applies a penalty that grows each time a word is repeated, which reduces overuse of the same words or phrases.

This distinction allows you to manage both the range of topics and the variety of language in the AI tutor's output.
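The difference between the two penalties can be sketched numerically. The code below assumes the additive formulation documented for OpenAI-style chat APIs, where a token's score is reduced by a flat presence term plus a frequency term that scales with its count. Whether DeepSeek applies exactly this formula internally is an assumption, and the starting logit is a hypothetical value.

```java
class PenaltySketch {
    // Adjusts one token's logit based on how many times that token has already appeared.
    // Assumed formula: logit - (count > 0 ? presencePenalty : 0) - count * frequencyPenalty
    static double penalize(double logit, int count, double presencePenalty, double frequencyPenalty) {
        double presenceTerm = count > 0 ? presencePenalty : 0.0; // flat, applied once
        double frequencyTerm = count * frequencyPenalty;         // grows with each repeat
        return logit - presenceTerm - frequencyTerm;
    }

    public static void main(String[] args) {
        double logit = 3.0; // hypothetical raw score for one token
        for (int count = 0; count <= 3; count++) {
            System.out.printf("count=%d -> %.2f%n", count, penalize(logit, count, 0.5, 0.2));
        }
        // prints:
        // count=0 -> 3.00
        // count=1 -> 2.30
        // count=2 -> 2.10
        // count=3 -> 1.90
    }
}
```

The first use costs the most because both terms apply at once. After that, only the frequency term keeps growing.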

For example, by applying a frequencyPenalty of 0.2, you can reduce redundancy in the tutor's explanations while still allowing for some repetition when necessary for educational purposes. This results in more dynamic and engaging educational content, striking a balance between variation and reinforcement of important concepts.

Example: Implementing Model Parameters in Code

Let's bring it all together. A complete request sets all four parameters in the chat options, wraps the question in a UserMessage, sends both to the model in a Prompt, and reads the reply from the AssistantMessage.

By adjusting these parameters, you can fine-tune the tutor's behavior to meet your specific requirements. When you run such a request, you should see a response that reflects the balance of creativity, length, and content diversity that you've set.
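How the four parameters interact during generation can be sketched with a toy sampling loop in plain Java. Everything here is hypothetical: the five-word vocabulary, the fixed logits, and the generate method are invented for illustration, and the penalty formula is the additive one assumed for OpenAI-style APIs. A real model recomputes its scores at every step and can also stop early on its own. This sketch does neither.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

class TutorSamplingSketch {
    // Hypothetical toy vocabulary with fixed scores; a real model recomputes scores each step.
    static final String[] VOCAB = {"prime", "number", "divisor", "factor", "integer"};
    static final double[] BASE_LOGITS = {2.0, 1.8, 1.0, 0.8, 0.5};

    static List<String> generate(int maxTokens, double temperature,
                                 double presencePenalty, double frequencyPenalty, long seed) {
        Random random = new Random(seed);
        int[] counts = new int[VOCAB.length];
        List<String> output = new ArrayList<>();
        while (output.size() < maxTokens) { // maxTokens: hard stop on output length
            double[] weights = new double[VOCAB.length];
            double total = 0.0;
            for (int i = 0; i < VOCAB.length; i++) {
                // Penalties lower the score of tokens that were already emitted.
                double logit = BASE_LOGITS[i]
                        - (counts[i] > 0 ? presencePenalty : 0.0)
                        - counts[i] * frequencyPenalty;
                // Temperature rescales the scores before they become sampling weights.
                weights[i] = Math.exp(logit / temperature);
                total += weights[i];
            }
            // Draw one token in proportion to its weight.
            double r = random.nextDouble() * total;
            int chosen = VOCAB.length - 1;
            for (int i = 0; i < VOCAB.length; i++) {
                r -= weights[i];
                if (r < 0) {
                    chosen = i;
                    break;
                }
            }
            counts[chosen]++;
            output.add(VOCAB[chosen]);
        }
        return output;
    }

    public static void main(String[] args) {
        // Temperature and penalty values from this lesson; a small maxTokens keeps the output short.
        System.out.println(generate(8, 0.6, 0.5, 0.2, 42L));
    }
}
```

Rerunning with a lower temperature or with both penalties set to 0.0 should make the top-scoring words dominate the output, while raising the penalties spreads the choices across the vocabulary.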

Summary and Preparation for Practice

In this lesson, we explored how to use model parameters to customize AI tutor responses. You learned about the temperature, maxTokens, presencePenalty, and frequencyPenalty parameters and saw how they can be applied in code. These tools allow you to control the creativity, length, and content of the AI's explanations, enhancing your personal tutor's educational effectiveness.

As you move on to the practice exercises, I encourage you to experiment with different parameter settings to see their effects firsthand. This hands-on practice will reinforce what you've learned and prepare you for the next unit, where we'll delve deeper into managing tutoring sessions and message types. Keep up the great work, and enjoy the journey of creating your personal tutor with DeepSeek!
