Language Models as DSPy's Foundation

Welcome to our second lesson in the DSPy Programming course! In our previous lesson, we introduced DSPy as a framework that shifts your focus from manually writing prompt templates to creating modular, composable Python code that "programs" language models. We also covered the three stages of DSPy development: Programming, Evaluation, and Optimization.

Today, we'll focus on the first critical component you need to master in the DSPy programming stage: working with Language Models (LMs). Language models serve as the computational engine that powers everything in DSPy. Just as a traditional program needs a CPU to execute its instructions, a DSPy program needs a language model to process its modules and signatures.

In DSPy, language models aren't just tools you occasionally query — they're the foundation upon which your entire application is built. Every module you create, every signature you define, and every optimization you perform ultimately relies on an underlying language model to do the actual work of generating text, making decisions, or processing information.

The DSPy framework abstracts away many of the complexities of working directly with language model APIs, providing a consistent interface regardless of which specific model you're using. This means you can write your DSPy code once and easily switch between different language models — whether that's GPT-4, Claude, or an open-source model you're running locally — with minimal changes to your code.

As we explore how to use language models in DSPy, keep in mind that this knowledge forms the foundation for everything else we'll cover in this course. Mastering these concepts will enable you to build increasingly sophisticated AI systems as we progress.

Initializing Language Models in DSPy

To start working with language models in DSPy, you first need to initialize an LM instance. DSPy supports various language model providers, with a unified interface that makes it easy to switch between them.

Let's begin by importing the DSPy library and creating our first language model instance:
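```python
import dspy

# The API key below is a placeholder -- substitute your own.
lm = dspy.LM("openai/gpt-4o-mini", api_key="YOUR_OPENAI_API_KEY")
```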

In this example, we're creating an instance of dspy.LM that will use OpenAI's GPT-4o-mini model. The model identifier follows a provider/model format, where openai is the provider and gpt-4o-mini is the specific model.

For authentication, you'll need to provide your API key. This is typically done by passing it directly to the LM constructor as shown above. If you're working in a secure environment, you might prefer to use environment variables instead of hardcoding your API key:
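```python
import dspy

# Assumes OPENAI_API_KEY is already set in your environment, e.g.:
#   export OPENAI_API_KEY=sk-...
# When no api_key argument is given, the key is read from the environment.
lm = dspy.LM("openai/gpt-4o-mini")
```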

DSPy supports various language model providers beyond OpenAI. The syntax for initializing models from different providers follows the same pattern:
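```python
import dspy

# Anthropic's Claude (model name is illustrative -- check current availability)
claude_lm = dspy.LM("anthropic/claude-3-5-sonnet-20240620", api_key="YOUR_ANTHROPIC_API_KEY")

# A local open-source model served through Ollama (illustrative local setup)
local_lm = dspy.LM("ollama_chat/llama3", api_base="http://localhost:11434")
```

The model identifiers and the local server address above are illustrative; adjust them to whatever providers and models you have access to.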

When working in CodeSignal or similar environments, many popular language models might already be configured and available, so you won't need to worry about API keys. However, it's important to understand how to set up these connections for when you're working in your own environment.

Now that we've initialized our language model, let's see how to use it to generate text.

Making Direct LM Calls

Once you've initialized a language model, you can make direct calls to it. This is the most basic way to interact with a language model in DSPy, and it's useful for simple text generation tasks or for testing how a model responds to different inputs.

There are two main ways to call a language model in DSPy, both shown in the sketch below:

  1. Using a simple string input
  2. Using a structured message format (similar to chat models)
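```python
# 1. Simple string input
response = lm("Say this is a test!", temperature=0.7)
print(response)  # e.g., ['This is a test!']

# 2. Structured message format, as used by chat APIs
response = lm(messages=[{"role": "user", "content": "Say this is a test!"}])
print(response)  # e.g., ['This is a test!']
```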

Notice that in both cases, the response is returned as a list of strings. This is because language models can potentially generate multiple completions for a single prompt (though by default, they return just one).

Modifying the Parameters of Your LM Calls

You can pass various parameters to control the generation process:
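```python
response = lm(
    "Share a fun fact about the ocean.",  # illustrative prompt
    temperature=0.7,   # moderate randomness
    max_tokens=50,     # cap the length of the generated text
    stop=["\n\n"],     # stop generating at the first double newline
    cache=False,       # don't reuse a cached response for this call
)
print(response)
```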

The output might look something like:
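```
["Did you know that the ocean produces over half of the world's oxygen?"]
```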

These parameters control different aspects of text generation:

  • temperature: Controls randomness. Higher values (e.g., 0.8) make output more random, while lower values (e.g., 0.2) make it more deterministic.
  • max_tokens: The maximum length of the generated text.
  • stop: Sequences where the model should stop generating (e.g., "\n\n" to stop at double newlines).
  • cache: Whether to cache responses to avoid redundant API calls.

Understanding these parameters is crucial for getting the most out of your language model interactions.

Global LM Configuration

In a DSPy application, you'll often want to use the same language model throughout your code. Rather than passing the LM instance to every module or function, DSPy allows you to configure a default LM that will be used automatically.

To set a global default LM, use the dspy.configure() function:
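```python
import dspy

lm = dspy.LM("openai/gpt-4o-mini", api_key="YOUR_OPENAI_API_KEY")  # placeholder key
dspy.configure(lm=lm)
```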

Once you've configured a default LM, any DSPy module you create will automatically use this LM unless specified otherwise. This makes your code cleaner and more maintainable.

For example, if you have a question-answering module (we'll cover modules in more detail in a future lesson), you can use it without explicitly passing an LM:
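```python
# A minimal question-answering module built from a string signature;
# signatures are covered in detail in the next lesson.
qa = dspy.Predict("question -> answer")

result = qa(question="What is the capital of France?")
print(result.answer)  # e.g., 'Paris'
```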

Local LM Configuration

Sometimes, you might want to temporarily use a different language model for specific parts of your code. DSPy provides a context manager for this purpose:
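```python
# Temporarily run qa with a different model (model name is illustrative)
with dspy.context(lm=dspy.LM("openai/gpt-3.5-turbo")):
    result = qa(question="What is the capital of France?")
    print(result.answer)

# Outside the block, qa goes back to the globally configured LM.
```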

In this example, the qa module will use GPT-3.5-turbo within the context manager block, and then revert to the globally configured LM (GPT-4o-mini in our case) outside of it. This is particularly useful for comparing how different models perform on the same task or for using more powerful models only when necessary.

You can also configure other global settings using dspy.configure(), such as retrievers for RAG applications or metrics for evaluation. We'll explore these in later lessons.

Monitoring and Debugging LM Interactions

When working with language models, it's important to be able to monitor and debug your interactions. DSPy provides several tools to help with this.

Every LM instance in DSPy keeps a history of all the calls made to it. You can access this history to see what prompts were sent, what responses were received, and other metadata:
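```python
# Each LM instance records its calls in order
print(len(lm.history))  # number of calls made with this instance
```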

You can also inspect the details of specific calls:
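```python
# The most recent call; each history entry is a dictionary of call metadata
print(lm.history[-1])
```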

The output might look something like:
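```
{'prompt': 'Share a fun fact about the ocean.',
 'response': ...,
 'usage': {'prompt_tokens': 14, 'completion_tokens': 17, 'total_tokens': 31},
 'timestamp': '2025-01-01T12:00:00',
 'kwargs': {'temperature': 0.7, 'max_tokens': 50},
 ...}
```

(The entry above is abridged and illustrative; the exact fields vary with the DSPy version and provider.)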

This gives you access to:

  • The prompt that was sent to the model
  • The response that was received
  • Usage information (like token counts)
  • The timestamp of the call
  • The parameters used for generation

You can dig deeper into specific aspects:
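```python
last_call = lm.history[-1]

# Field names follow the illustrative entry above and may differ slightly
# across DSPy versions.
print(last_call["prompt"])                 # the exact prompt that was sent
print(last_call["usage"]["total_tokens"])  # token usage, handy for cost tracking
```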

This information is invaluable for debugging issues with your DSPy applications. For example, if you're not getting the responses you expect, you can check the exact prompt that was sent to the model. If you're concerned about API costs, you can monitor token usage to optimize your prompts.

For more complex applications, you might want to implement custom logging or monitoring. DSPy's history mechanism provides the foundation for building these more advanced tools.

Summary and Practice Preview

In this lesson, we've explored how to work with language models in DSPy, which form the foundation of any DSPy application. We've covered initializing LMs with different providers, making direct calls to generate text, configuring global and local LM settings, and monitoring LM interactions for debugging purposes.

Here are the key takeaways:

  1. Language models are the computational engine that powers DSPy applications.
  2. You can initialize an LM with dspy.LM(), specifying the provider, model, and authentication.
  3. Direct LM calls can be made with either simple strings or structured message formats.
  4. You can configure a global default LM with dspy.configure() and temporarily override it with dspy.context().
  5. The LM history provides valuable information for monitoring and debugging.

In the upcoming practice exercises, you'll have the opportunity to apply these concepts by initializing different language models, making various types of calls, and experimenting with different configuration options. These hands-on activities will help solidify your understanding of how to effectively work with language models in DSPy.

In our next lesson, we'll build on this foundation by exploring how to create custom signatures in DSPy. Signatures allow you to define the input/output behavior of your language model tasks, providing a structured way to interact with LMs beyond simple text generation. You'll learn how to specify the semantic roles of inputs and outputs, and how to use these signatures with the LMs we've covered in this lesson.

As you work through the practice exercises, remember that mastering language model interactions is essential for everything else we'll do in DSPy. Take your time to experiment with different models, parameters, and input formats to get a feel for how they affect the generated outputs.
