Introduction to Custom Signatures

Welcome to our third lesson in the DSPy Programming course! In our previous lesson, we explored how to work with language models in DSPy, learning how to initialize them, make direct calls, and configure global and local settings. These language models serve as the computational engine that powers everything in DSPy.

Today, we'll build on that foundation by diving into signatures, which are the way we define the expected input and output behavior of our language model tasks. Think of signatures as contracts that specify what information goes into a task and what should come out. They're a critical part of DSPy's approach to structured programming with language models.

While we briefly mentioned signatures in our introduction to DSPy, now we'll learn how to create custom signatures tailored to your specific needs. This is where DSPy really starts to shine compared to traditional prompt engineering. Instead of writing lengthy prompt templates, you'll define clear input and output specifications that DSPy will use to generate appropriate prompts behind the scenes.

Signatures in DSPy serve several important purposes:

  1. They provide a structured way to interact with language models.
  2. They make your code more readable and maintainable.
  3. They enable DSPy's optimization capabilities.
  4. They allow for modular composition of complex AI systems.

By the end of this lesson, you'll be able to create both simple string-based signatures and more complex class-based signatures for a variety of tasks. This knowledge will form the foundation for building sophisticated DSPy modules and programs in future lessons.

Let's start by looking at the simplest way to define signatures in DSPy.

String-Based Signatures

The most straightforward way to create a signature in DSPy is by using a string format. This concise syntax is perfect for simple tasks where you need to quickly define the relationship between inputs and outputs.

String-based signatures follow this general pattern:

The arrow (->) separates inputs from outputs, making it clear what goes in and what comes out. Let's look at a basic example:

In this example, we've created a signature that takes a sentence as input and produces a sentiment as output. The : bool part specifies that the output should be a boolean value (True or False).

Now let's use this signature to classify the sentiment of a sentence:

When we run this code with a clearly positive sentence, the language model should classify it as positive and return True. DSPy handles the prompt engineering behind the scenes: it instructs the language model to perform sentiment classification and parses the reply into a boolean result.

Let's try another example with a different task, summarization:

In this case, our signature simply specifies that we want to transform a document into a summary. Since we didn't specify a type for the summary, it defaults to a string.

The string-based signature format is powerful in its simplicity. It allows you to quickly define tasks without a lot of boilerplate code. However, as your tasks become more complex, you might need more control over the inputs and outputs, which brings us to our next topic.

Working with Multiple Parameters

Many real-world tasks require multiple inputs or produce multiple outputs. DSPy signatures support this directly: you can list several named fields on either side of the arrow.

To specify multiple inputs in a string-based signature, simply separate them with commas:

This signature defines a question-answering task that takes two inputs:

  1. context: A list of strings containing relevant information
  2. question: A string representing the question to be answered

It produces a single output:

  • answer: A string containing the response to the question

Notice how we can specify types for both inputs and outputs. In this case, we're telling DSPy that context should be a list of strings, while question and answer are individual strings.

Similarly, you can define signatures with multiple outputs:

This signature defines a multiple-choice question-answering task with:

  • Inputs: A question and a list of choices
  • Outputs: The reasoning behind the answer and the selection (an integer representing the chosen option)

When you call a module with multiple inputs, you need to provide all of them:

Similarly, when a signature produces multiple outputs, you can access each one individually:

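The exact values vary by model and run: reasoning holds a short free-text explanation, and selection holds an integer identifying the chosen option.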

The ability to work with multiple parameters makes DSPy signatures extremely versatile. You can model complex tasks with interdependent inputs and outputs, all while maintaining a clean and readable syntax.

However, as your signatures become more complex, you might find the string-based format limiting. For more advanced use cases, DSPy provides a class-based approach to signature definition.

Class-Based Signature Definition

While string-based signatures are convenient for simple tasks, class-based signatures offer more control and expressiveness for complex scenarios. They allow you to:

  1. Add detailed documentation
  2. Provide field descriptions
  3. Enforce stricter type constraints
  4. Create more complex input/output structures

To create a class-based signature, you define a Python class that inherits from dspy.Signature. Here's a basic example:

This signature defines an emotion classification task. Let's break down its components:

  • The class inherits from dspy.Signature.
  • The docstring provides a brief description of the task.
  • sentence is defined as an input field of type str.
  • sentiment is defined as an output field with a specific set of allowed values.

The Literal type from Python's typing module is particularly useful here. It constrains the output to one of the specified values, ensuring that the language model's response falls within our expected categories.

Now let's use this signature:

Class-based signatures also allow you to provide more detailed descriptions for each field using the desc parameter:

These descriptions serve two important purposes:

  1. They provide documentation for developers using your signature.
  2. They give the language model more guidance about what each field represents.

The desc parameter is particularly valuable for output fields, as it helps steer the language model toward generating responses that match your expectations.

Advanced Type Annotations

One of the most powerful features of DSPy signatures is their support for Python's type system. By leveraging type annotations, you can create highly structured inputs and outputs that guide the language model's responses.

Let's explore some advanced type annotations with a complex example:

This signature uses a TypedDict to define a structured entity representation, with fields for the entity's name, type, and an optional description.

Python's type system provides a rich vocabulary for expressing complex data structures, and DSPy leverages this to create highly structured interactions with language models. By using appropriate type annotations, you can guide the model to produce outputs that match your expected format, making it easier to integrate language models into larger applications.

Summary and Practice Preview

In this lesson, we've explored how to create custom signatures in DSPy, which are essential for defining the input/output behavior of your language model tasks. We've covered both string-based and class-based approaches, as well as advanced type annotations for more complex scenarios.

Here are the key takeaways:

  1. String-based signatures provide a concise syntax for simple tasks, using the arrow (->) to separate inputs from outputs.
  2. You can work with multiple parameters by separating them with commas and specifying types where needed.
  3. Class-based signatures offer more control and expressiveness, allowing you to add documentation and field descriptions.
  4. Advanced type annotations enable you to create highly structured inputs and outputs, guiding the language model's responses.

These concepts build directly on the language model foundation we covered in the previous lesson. The LMs we learned to initialize and configure are the computational engine that powers these signatures, turning your structured specifications into natural language interactions.

In the upcoming practice exercises, you'll have the opportunity to apply these concepts by creating various types of signatures for different tasks. You'll experiment with both string-based and class-based approaches, and you'll see how different type annotations affect the language model's responses.

As you work through these exercises, remember that effective signature design is about finding the right balance between flexibility and constraint. Too little structure might lead to unpredictable outputs, while too much might unnecessarily limit the language model's capabilities.

In our next lesson, we'll build on this foundation by exploring DSPy modules, which allow you to compose multiple signatures into more complex AI systems. You'll learn how to use built-in modules like ChainOfThought and ReAct, and how to create your own custom modules for specific tasks.

For now, focus on mastering signature creation, as it's the building block for everything else we'll do in DSPy. The more comfortable you become with defining clear input/output specifications, the more effectively you'll be able to harness the power of language models in your applications.
