denniscao.net

DSPy — High-Level Overview

What is DSPy?

DSPy is a framework for programming — not just prompting — language models. Instead of writing fragile prompt strings, you define typed Python classes that describe what goes in and what comes out. DSPy handles the actual prompt construction, model calls, and output parsing.

The analogy: DSPy is to LLMs what PyTorch is to neural networks. PyTorch lets you define layers and forward passes without manually computing gradients. DSPy lets you define LLM operations without manually writing prompts.

Another way to think about it: DSPy is like FastAPI route definitions, but for LLM calls — typed inputs, typed outputs, composable modules.


Core Concepts

1. Signatures

A Signature is a typed contract that defines the inputs and outputs of an LLM call. Think of it like a function type signature, but for a language model.

python
import dspy

class SummarizeArticle(dspy.Signature):
    """Summarize the given article in 2-3 sentences."""

    article: str = dspy.InputField(desc="The full text of the article")
    summary: str = dspy.OutputField(desc="A concise 2-3 sentence summary")

What's happening here:

  • The docstring becomes the system instruction to the LLM
  • dspy.InputField() marks what gets sent to the model
  • dspy.OutputField() marks what the model should return
  • The desc parameter tells the LLM what each field means
  • You never write the actual prompt — DSPy constructs it from this definition

Signatures can output Pydantic models (not just strings):

python
from pydantic import BaseModel

class WorkoutPlan(BaseModel):
    name: str
    description: str
    num_weeks: int
    sessions_per_week: int

class GeneratePlan(dspy.Signature):
    """Generate a structured workout plan."""

    user_goals: str = dspy.InputField()
    plan: WorkoutPlan = dspy.OutputField()  # Structured output!

DSPy will automatically instruct the LLM to return JSON matching the Pydantic schema and parse/validate the response.
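
The validation step is ordinary Pydantic. A rough sketch of what happens to the model's raw JSON response (the payload below is invented for illustration):

```python
from pydantic import BaseModel, ValidationError

class WorkoutPlan(BaseModel):
    name: str
    description: str
    num_weeks: int
    sessions_per_week: int

# Hypothetical raw JSON as an LLM might return it.
raw = '{"name": "Hypertrophy Base", "description": "8-week block", "num_weeks": 8, "sessions_per_week": 3}'

plan = WorkoutPlan.model_validate_json(raw)  # parsed and type-checked
print(plan.num_weeks)  # 8

# Malformed output raises instead of silently passing through.
try:
    WorkoutPlan.model_validate_json('{"name": "oops"}')
except ValidationError:
    print("validation failed")
```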


2. Modules / Predictors

A Module is a class that uses one or more Signatures to do actual work. DSPy provides built-in predictor types that wrap signatures with different strategies:

dspy.Predict — Basic Call

The simplest predictor. Takes input, calls the LLM, returns output.

python
summarizer = dspy.Predict(SummarizeArticle)
result = summarizer(article="Long article text here...")
print(result.summary)  # "The article discusses..."

dspy.ChainOfThought — Reasoning First

Automatically adds a "reasoning" step before the output. The LLM thinks through its answer before committing.

python
planner = dspy.ChainOfThought(GeneratePlan)
result = planner(user_goals="Build muscle, 3 days per week")
# The LLM first reasons about the plan, THEN produces the structured output
print(result.reasoning)  # "The user wants hypertrophy with 3 sessions..."
print(result.plan)       # WorkoutPlan(name="...", num_weeks=8, ...)

When to use which:

Predictor             Use When
dspy.Predict          Simple extraction/parsing tasks
dspy.ChainOfThought   Tasks requiring reasoning (planning, analysis, complex generation)
dspy.ReAct            Agent-like tasks that need to use tools

Custom Modules — Chaining Multiple Steps

You can compose multiple predictors into a pipeline by subclassing dspy.Module:

python
class PlanGenerator(dspy.Module):
    def __init__(self):
        super().__init__()
        # Define the predictors this module uses
        self.generate_overview = dspy.ChainOfThought(GeneratePlanOverview)
        self.generate_sessions = dspy.ChainOfThought(GenerateSessions)
        self.parse_prescriptions = dspy.Predict(ParsePrescriptions)

    def forward(self, user_goals: str):
        # Stage 1: Generate the plan overview
        overview = self.generate_overview(user_goals=user_goals)

        # Stage 2: Generate sessions using the overview as context
        sessions = self.generate_sessions(
            plan_overview=overview.plan,
            user_goals=user_goals
        )

        # Stage 3: Parse prescriptions from the sessions
        parsed = self.parse_prescriptions(
            raw_sessions=sessions.sessions
        )

        return parsed

Key point: Each stage can use a different LLM (see Context Switching below). The module chains them together like a pipeline.


3. LM Configuration

You configure which language model to use globally or per-call:

python
# Create LM instances
grok = dspy.LM("openrouter/x-ai/grok-4-fast", max_tokens=32000)
sonnet = dspy.LM("anthropic/claude-sonnet-4-5", max_tokens=8000)
haiku = dspy.LM("anthropic/claude-haiku-4-5", max_tokens=4000)

# Set default
dspy.configure(lm=grok)

DSPy uses LiteLLM under the hood, so model strings follow the provider/model-name format and you can use any provider (OpenAI, Anthropic, OpenRouter, local vLLM, etc.).


4. Context Switching (dspy.context)

This is what enables multi-model routing — using different models for different stages:

python
class MultiModelPipeline(dspy.Module):
    def __init__(self):
        super().__init__()
        self.planner = dspy.ChainOfThought(PlanSignature)
        self.parser = dspy.Predict(ParseSignature)

    def forward(self, user_input):
        # Use Grok-4 for the complex planning step
        with dspy.context(lm=grok):
            plan = self.planner(input=user_input)

        # Use Claude Haiku for the cheap parsing step
        with dspy.context(lm=haiku):
            parsed = self.parser(raw_plan=plan.output)

        return parsed

Why this matters: You match model capability to task complexity. Expensive models for hard reasoning, cheap models for simple extraction. This is a core cost optimization pattern.


5. Composing Modules

DSPy modules are composable — you can nest them, loop them, and build complex pipelines:

python
class FullPipeline(dspy.Module):
    def __init__(self):
        super().__init__()
        self.plan_overview = dspy.ChainOfThought(OverviewSignature)
        self.session_generator = dspy.ChainOfThought(SessionSignature)

    def forward(self, user_context):
        # Generate overview
        overview = self.plan_overview(context=user_context)

        # Loop: generate each session with context from previous ones
        all_sessions = []
        for cycle in overview.plan.cycles:
            for _day in range(cycle.sessions_per_week):
                session = self.session_generator(
                    context=user_context,
                    cycle=cycle,
                    previous_sessions=all_sessions  # Progressive context!
                )
                all_sessions.append(session)

        return all_sessions

This progressive context pattern is useful for generating multi-part content where each part needs awareness of what came before.


DSPy vs Raw Prompting

Aspect                Raw Prompting                          DSPy
Prompt definition     String templates with f-strings        Typed Python classes
Output parsing        Manual JSON parsing, regex, hope       Automatic via Pydantic models
Model switching       Change API calls everywhere            dspy.context(lm=...)
Testing               Test the full prompt+model combo       Test signatures independently
Prompt optimization   Manual trial and error                 DSPy can optimize prompts algorithmically
Type safety           None (strings in, strings out)         Full type checking on inputs and outputs
Maintenance           Prompts break when you change models   Signatures are model-agnostic

Example — the same task both ways:

Raw prompting:

python
import json
import openai

response = openai.chat.completions.create(
    model="gpt-4",
    messages=[{
        "role": "system",
        "content": "You are a fitness coach. Generate a workout plan as JSON..."
    }, {
        "role": "user",
        "content": f"Goals: {user_goals}\nEquipment: {equipment}\n..."
    }]
)
plan = json.loads(response.choices[0].message.content)  # Might fail!

DSPy:

python
class GeneratePlan(dspy.Signature):
    """You are a fitness coach. Generate a workout plan."""
    user_goals: str = dspy.InputField()
    equipment: list[str] = dspy.InputField()
    plan: WorkoutPlan = dspy.OutputField()

generator = dspy.ChainOfThought(GeneratePlan)
result = generator(user_goals=goals, equipment=equip)
plan = result.plan  # Already a validated Pydantic object

DSPy vs LangChain

Aspect           LangChain                              DSPy
Philosophy       Chain prompts together                 Program with LMs like functions
Abstraction      Wrappers around API calls              Typed signatures compiled to prompts
Prompt control   You write prompt templates             DSPy generates prompts from signatures
Optimization     Manual prompt tuning                   Can algorithmically optimize prompts
Learning curve   Many abstractions to learn             Closer to normal Python
When to use      Quick prototyping, many integrations   Production pipelines needing reliability

Multi-Stage Pipeline Example

User Fitness Profile
        |
        v
[Stage 1] dspy.ChainOfThought(GeneratePlanSignature)
          LM: Grok-4 via OpenRouter (32K tokens)
          Output: WorkPlanLLM (Pydantic model)
        |
        v
[Stage 2] dspy.ChainOfThought(GenerateWeeklyTasksSignature)  x N cycles
          LM: Grok-4 via OpenRouter (24K tokens)
          Input includes: previous sessions for balance
        |
        v
[Stage 3] dspy.ChainOfThought(GenerateSingleSessionSignature)  x N sessions
          LM: Claude Sonnet 4.5 (8K tokens)
          Input includes: all prior sessions that week
        |
        v
[Stage 4] dspy.Predict(ParsePrescriptionSignature)
          LM: Claude Haiku 4.5 (4K tokens)
          Parses "3x5 @ RPE 8" into structured sets
        |
        v
[Enrichment] Vector search matches exercises to canonical library
             (Not DSPy — deterministic code using pgvector)

Key DSPy features used:

  • dspy.ChainOfThought for reasoning-heavy stages (planning, session design)
  • dspy.Predict for simpler tasks (prescription parsing, exercise enrichment)
  • dspy.context(lm=...) for per-stage model routing
  • Pydantic OutputField types for guaranteed structured outputs

Key Terms Glossary

Term                  Definition
Signature             A typed class defining LLM inputs/outputs. The "function signature" of an LLM call.
InputField            A typed input parameter that gets sent to the LLM as part of the prompt.
OutputField           A typed output parameter that the LLM must produce. Can be a string or Pydantic model.
Module                A class that composes one or more predictors into a pipeline. Subclass of dspy.Module.
Predictor             A wrapper around a Signature that defines HOW the LLM should approach the task (basic, chain-of-thought, ReAct).
dspy.Predict          Simplest predictor: direct input/output, no reasoning step.
dspy.ChainOfThought   Predictor that adds automatic "reasoning" before producing output. Better for complex tasks.
dspy.ReAct            Predictor that enables tool use: the LLM can call functions, observe results, and iterate.
dspy.context          Context manager to temporarily override the LM, temperature, or other settings for a block of code.
dspy.configure        Sets the default LM and global settings.
LM                    A language model instance (dspy.LM("provider/model")). Uses LiteLLM for multi-provider support.
Optimizer             DSPy can algorithmically optimize prompts by trying variations and measuring output quality. Advanced feature.
Teleprompter          Legacy name for DSPy optimizers. You may see this in older docs.
Compiled program      A DSPy module after optimization: its prompts have been tuned for best performance.