DSPy — High-Level Overview
What is DSPy?
DSPy is a framework for programming — not just prompting — language models. Instead of writing fragile prompt strings, you define typed Python classes that describe what goes in and what comes out. DSPy handles the actual prompt construction, model calls, and output parsing.
The analogy: DSPy is to LLMs what PyTorch is to neural networks. PyTorch lets you define layers and forward passes without manually computing gradients. DSPy lets you define LLM operations without manually writing prompts.
Another way to think about it: DSPy is like FastAPI route definitions, but for LLM calls — typed inputs, typed outputs, composable modules.
Core Concepts
1. Signatures
A Signature is a typed contract that defines the inputs and outputs of an LLM call. Think of it like a function type signature, but for a language model.
```python
import dspy

class SummarizeArticle(dspy.Signature):
    """Summarize the given article in 2-3 sentences."""

    article: str = dspy.InputField(desc="The full text of the article")
    summary: str = dspy.OutputField(desc="A concise 2-3 sentence summary")
```
What's happening here:
- The docstring becomes the system instruction to the LLM
- dspy.InputField() marks what gets sent to the model
- dspy.OutputField() marks what the model should return
- The desc parameter tells the LLM what each field means
- You never write the actual prompt — DSPy constructs it from this definition
Signatures can output Pydantic models (not just strings):
```python
from pydantic import BaseModel

class WorkoutPlan(BaseModel):
    name: str
    description: str
    num_weeks: int
    sessions_per_week: int

class GeneratePlan(dspy.Signature):
    """Generate a structured workout plan."""

    user_goals: str = dspy.InputField()
    plan: WorkoutPlan = dspy.OutputField()  # Structured output!
```
DSPy will automatically instruct the LLM to return JSON matching the Pydantic schema and parse/validate the response.
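A minimal usage sketch (the goals string and the model's concrete values are illustrative):

```python
generator = dspy.Predict(GeneratePlan)
result = generator(user_goals="Build general strength, 4 days per week")

# result.plan is already a validated WorkoutPlan instance, not a raw JSON string
print(type(result.plan).__name__)  # WorkoutPlan
print(result.plan.num_weeks)       # e.g. 8
```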
2. Modules / Predictors
A Module is a class that uses one or more Signatures to do actual work. DSPy provides built-in predictor types that wrap signatures with different strategies:
dspy.Predict — Basic Call
The simplest predictor. Takes input, calls the LLM, returns output.
```python
summarizer = dspy.Predict(SummarizeArticle)
result = summarizer(article="Long article text here...")

print(result.summary)  # "The article discusses..."
```
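If you're curious what prompt DSPy actually constructed for that call, you can print the most recent LM interaction:

```python
# Prints the full prompt and response for the last LM call DSPy made
dspy.inspect_history(n=1)
```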
dspy.ChainOfThought — Reasoning First
Automatically adds a "reasoning" step before the output. The LLM thinks through its answer before committing.
```python
planner = dspy.ChainOfThought(GeneratePlan)
result = planner(user_goals="Build muscle, 3 days per week")

# The LLM first reasons about the plan, THEN produces the structured output
print(result.reasoning)  # "The user wants hypertrophy with 3 sessions..."
print(result.plan)       # WorkoutPlan(name="...", num_weeks=8, ...)
```
When to use which:
| Predictor | Use When |
|---|---|
| dspy.Predict | Simple extraction/parsing tasks |
| dspy.ChainOfThought | Tasks requiring reasoning (planning, analysis, complex generation) |
| dspy.ReAct | Agent-like tasks that need to use tools |
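dspy.Predict and dspy.ChainOfThought are shown above. Here is a rough sketch of dspy.ReAct, assuming a hypothetical search_web tool you would implement yourself:

```python
def search_web(query: str) -> str:
    """Search the web and return a short snippet of results."""
    ...  # hypothetical tool -- call your search API of choice here

# ReAct lets the LLM decide when to call the tool, observe the result, and iterate
agent = dspy.ReAct("question -> answer", tools=[search_web])
result = agent(question="What rep ranges are typically recommended for hypertrophy?")
print(result.answer)
```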
Custom Modules — Chaining Multiple Steps
You can compose multiple predictors into a pipeline by subclassing dspy.Module:
```python
class PlanGenerator(dspy.Module):
    def __init__(self):
        super().__init__()
        # Define the predictors this module uses
        self.generate_overview = dspy.ChainOfThought(GeneratePlanOverview)
        self.generate_sessions = dspy.ChainOfThought(GenerateSessions)
        self.parse_prescriptions = dspy.Predict(ParsePrescriptions)

    def forward(self, user_goals: str):
        # Stage 1: Generate the plan overview
        overview = self.generate_overview(user_goals=user_goals)

        # Stage 2: Generate sessions using the overview as context
        sessions = self.generate_sessions(
            plan_overview=overview.plan,
            user_goals=user_goals
        )

        # Stage 3: Parse prescriptions from the sessions
        parsed = self.parse_prescriptions(
            raw_sessions=sessions.sessions
        )
        return parsed
```
Key point: Each stage can use a different LLM (see Context Switching below). The module chains them together like a pipeline.
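Using a custom module looks the same as using a single predictor; calling the instance runs forward (the goals string is illustrative):

```python
# Assumes an LM has already been configured, e.g. dspy.configure(lm=...)
pipeline = PlanGenerator()
result = pipeline(user_goals="Build muscle, 3 days per week")  # runs all three stages in order
```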
3. LM Configuration
You configure which language model to use globally or per-call:
```python
# Create LM instances
grok = dspy.LM("openrouter/x-ai/grok-4-fast", max_tokens=32000)
sonnet = dspy.LM("anthropic/claude-sonnet-4-5", max_tokens=8000)
haiku = dspy.LM("anthropic/claude-haiku-4-5", max_tokens=4000)

# Set the default LM for all calls
dspy.configure(lm=grok)
```
DSPy uses LiteLLM under the hood, so model strings follow the provider/model-name format and you can use any provider (OpenAI, Anthropic, OpenRouter, local vLLM, etc.).
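A few more configurations, assuming the relevant API keys are available (model names and URLs here are illustrative):

```python
# OpenAI (reads OPENAI_API_KEY from the environment by default)
gpt = dspy.LM("openai/gpt-4o-mini", max_tokens=4000)

# Pass credentials explicitly instead of via environment variables
claude = dspy.LM("anthropic/claude-sonnet-4-5", api_key="sk-...")

# A local OpenAI-compatible server (e.g. vLLM)
local = dspy.LM("openai/my-local-model", api_base="http://localhost:8000/v1", api_key="local")
```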
4. Context Switching (dspy.context)
This is what enables multi-model routing — using different models for different stages:
```python
class MultiModelPipeline(dspy.Module):
    def __init__(self):
        super().__init__()
        self.planner = dspy.ChainOfThought(PlanSignature)
        self.parser = dspy.Predict(ParseSignature)

    def forward(self, user_input):
        # Use Grok-4 for the complex planning step
        with dspy.context(lm=grok):
            plan = self.planner(input=user_input)

        # Use Claude Haiku for the cheap parsing step
        with dspy.context(lm=haiku):
            parsed = self.parser(raw_plan=plan.output)

        return parsed
```
Why this matters: You match model capability to task complexity. Expensive models for hard reasoning, cheap models for simple extraction. This is a core cost optimization pattern.
5. Composing Modules
DSPy modules are composable — you can nest them, loop them, and build complex pipelines:
```python
class FullPipeline(dspy.Module):
    def __init__(self):
        super().__init__()
        self.plan_overview = dspy.ChainOfThought(OverviewSignature)
        self.session_generator = dspy.ChainOfThought(SessionSignature)

    def forward(self, user_context):
        # Generate overview
        overview = self.plan_overview(context=user_context)

        # Loop: generate each session with context from previous ones
        all_sessions = []
        for week in overview.plan.cycles:
            for day in range(week.sessions_per_week):
                session = self.session_generator(
                    context=user_context,
                    cycle=week,
                    previous_sessions=all_sessions  # Progressive context!
                )
                all_sessions.append(session)

        return all_sessions
```
This progressive context pattern is useful for generating multi-part content where each part needs awareness of what came before.
DSPy vs Raw Prompting
| Aspect | Raw Prompting | DSPy |
|---|---|---|
| Prompt definition | String templates with f-strings | Typed Python classes |
| Output parsing | Manual JSON parsing, regex, hope | Automatic via Pydantic models |
| Model switching | Change API calls everywhere | dspy.context(lm=...) |
| Testing | Test the full prompt+model combo | Test signatures independently |
| Prompt optimization | Manual trial and error | DSPy can optimize prompts algorithmically |
| Type safety | None — strings in, strings out | Full type checking on inputs and outputs |
| Maintenance | Prompts break when you change models | Signatures are model-agnostic |
Example — the same task both ways:
Raw prompting:
```python
response = openai.chat.completions.create(
    model="gpt-4",
    messages=[{
        "role": "system",
        "content": "You are a fitness coach. Generate a workout plan as JSON..."
    }, {
        "role": "user",
        "content": f"Goals: {user_goals}\nEquipment: {equipment}\n..."
    }]
)
plan = json.loads(response.choices[0].message.content)  # Might fail!
```
DSPy:
```python
class GeneratePlan(dspy.Signature):
    """You are a fitness coach. Generate a workout plan."""

    user_goals: str = dspy.InputField()
    equipment: list[str] = dspy.InputField()
    plan: WorkoutPlan = dspy.OutputField()

generator = dspy.ChainOfThought(GeneratePlan)
result = generator(user_goals=goals, equipment=equip)
plan = result.plan  # Already a validated Pydantic object
```
DSPy vs LangChain
| Aspect | LangChain | DSPy |
|---|---|---|
| Philosophy | Chain prompts together | Program with LMs like functions |
| Abstraction | Wrappers around API calls | Typed signatures compiled to prompts |
| Prompt control | You write prompt templates | DSPy generates prompts from signatures |
| Optimization | Manual prompt tuning | Can algorithmically optimize prompts |
| Learning curve | Many abstractions to learn | Closer to normal Python |
| When to use | Quick prototyping, many integrations | Production pipelines needing reliability |
Multi-Stage Pipeline Example
```
User Fitness Profile
        |
        v
[Stage 1] dspy.ChainOfThought(GeneratePlanSignature)
          LM: Grok-4 via OpenRouter (32K tokens)
          Output: WorkPlanLLM (Pydantic model)
        |
        v
[Stage 2] dspy.ChainOfThought(GenerateWeeklyTasksSignature) x N cycles
          LM: Grok-4 via OpenRouter (24K tokens)
          Input includes: previous sessions for balance
        |
        v
[Stage 3] dspy.ChainOfThought(GenerateSingleSessionSignature) x N sessions
          LM: Claude Sonnet 4.5 (8K tokens)
          Input includes: all prior sessions that week
        |
        v
[Stage 4] dspy.Predict(ParsePrescriptionSignature)
          LM: Claude Haiku 4.5 (4K tokens)
          Parses "3x5 @ RPE 8" into structured sets
        |
        v
[Enrichment] Vector search matches exercises to canonical library
             (Not DSPy — deterministic code using pgvector)
```
Key DSPy features used:
- dspy.ChainOfThought for reasoning-heavy stages (planning, session design)
- dspy.Predict for simpler tasks (prescription parsing, exercise enrichment)
- dspy.context(lm=...) for per-stage model routing
- Pydantic OutputField types for guaranteed structured outputs
Key Terms Glossary
| Term | Definition |
|---|---|
| Signature | A typed class defining LLM inputs/outputs. The "function signature" of an LLM call. |
| InputField | A typed input parameter that gets sent to the LLM as part of the prompt. |
| OutputField | A typed output parameter that the LLM must produce. Can be a string or Pydantic model. |
| Module | A class that composes one or more predictors into a pipeline. Subclass of dspy.Module. |
| Predictor | A wrapper around a Signature that defines HOW the LLM should approach the task (basic, chain-of-thought, ReAct). |
| dspy.Predict | Simplest predictor — direct input/output, no reasoning step. |
| dspy.ChainOfThought | Predictor that adds automatic "reasoning" before producing output. Better for complex tasks. |
| dspy.ReAct | Predictor that enables tool use — the LLM can call functions, observe results, and iterate. |
| dspy.context | Context manager to temporarily override the LM, temperature, or other settings for a block of code. |
| dspy.configure | Set the default LM and global settings. |
| LM | A language model instance (dspy.LM("provider/model")). Uses LiteLLM for multi-provider support. |
| Optimizer | DSPy can algorithmically optimize prompts by trying variations and measuring output quality. Advanced feature. |
| Teleprompter | Legacy name for DSPy optimizers. You may see this in older docs. |
| Compiled program | A DSPy module after optimization — the prompts have been tuned for best performance. |
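As a rough sketch of what optimization looks like in practice (the metric and training examples below are placeholders you would define for your own task):

```python
from dspy.teleprompt import BootstrapFewShot

# A metric is just a function that scores a prediction against a labeled example
def summary_matches(example, prediction, trace=None):
    return example.summary == prediction.summary

# Training data: dspy.Example objects with their input fields marked
trainset = [
    dspy.Example(article="...", summary="...").with_inputs("article"),
    # ...
]

optimizer = BootstrapFewShot(metric=summary_matches, max_bootstrapped_demos=4)
compiled_summarizer = optimizer.compile(dspy.Predict(SummarizeArticle), trainset=trainset)
```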