DSPy — High-Level Overview
What is DSPy?
DSPy is a framework for programming — not just prompting — language models. Instead of writing fragile prompt strings, you define typed Python classes that describe what goes in and what comes out. DSPy handles the actual prompt construction, model calls, and output parsing.
The analogy: DSPy is to LLMs what PyTorch is to neural networks. PyTorch lets you define layers and forward passes without manually computing gradients. DSPy lets you define LLM operations without manually writing prompts.
Another way to think about it: DSPy is like FastAPI route definitions, but for LLM calls — typed inputs, typed outputs, composable modules.
Core Concepts
1. Signatures
A Signature is a typed contract that defines the inputs and outputs of an LLM call. Think of it like a function type signature, but for a language model.
```python
import dspy

class SummarizeArticle(dspy.Signature):
    """Summarize the given article in 2-3 sentences."""

    article: str = dspy.InputField(desc="The full text of the article")
    summary: str = dspy.OutputField(desc="A concise 2-3 sentence summary")
```
What's happening here:
- The docstring becomes the system instruction to the LLM
- dspy.InputField() marks what gets sent to the model
- dspy.OutputField() marks what the model should return
- The desc parameter tells the LLM what each field means
- You never write the actual prompt — DSPy constructs it from this definition
Signatures can output Pydantic models (not just strings):
```python
from pydantic import BaseModel

class WorkoutPlan(BaseModel):
    name: str
    description: str
    num_weeks: int
    sessions_per_week: int

class GeneratePlan(dspy.Signature):
    """Generate a structured workout plan."""

    user_goals: str = dspy.InputField()
    plan: WorkoutPlan = dspy.OutputField()  # Structured output!
```
DSPy will automatically instruct the LLM to return JSON matching the Pydantic schema and parse/validate the response.
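A minimal usage sketch (the goals string and the model's concrete values are illustrative):

```python
generator = dspy.Predict(GeneratePlan)
result = generator(user_goals="Build general strength, 4 days per week")

# result.plan is already a validated WorkoutPlan instance, not a raw JSON string
print(type(result.plan).__name__)  # WorkoutPlan
print(result.plan.num_weeks)       # e.g. 8
```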
2. Modules / Predictors
A Module is a class that uses one or more Signatures to do actual work. DSPy provides built-in predictor types that wrap signatures with different strategies:
dspy.Predict — Basic Call
The simplest predictor. Takes input, calls the LLM, returns output.
```python
summarizer = dspy.Predict(SummarizeArticle)
result = summarizer(article="Long article text here...")

print(result.summary)  # "The article discusses..."
```
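If you're curious what prompt DSPy actually constructed for that call, you can print the most recent LM interaction:

```python
# Prints the full prompt and response for the last LM call DSPy made
dspy.inspect_history(n=1)
```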
dspy.ChainOfThought — Reasoning First
Automatically adds a "reasoning" step before the output. The LLM thinks through its answer before committing.
```python
planner = dspy.ChainOfThought(GeneratePlan)
result = planner(user_goals="Build muscle, 3 days per week")

# The LLM first reasons about the plan, THEN produces the structured output
print(result.reasoning)  # "The user wants hypertrophy with 3 sessions..."
print(result.plan)       # WorkoutPlan(name="...", num_weeks=8, ...)
```
When to use which:
| Predictor | Use When |
|---|---|
| dspy.Predict | Simple extraction/parsing tasks |
| dspy.ChainOfThought | Tasks requiring reasoning (planning, analysis, complex generation) |
| dspy.ReAct | Agent-like tasks that need to use tools |
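dspy.Predict and dspy.ChainOfThought are shown above. Here is a rough sketch of dspy.ReAct, assuming a hypothetical search_web tool you would implement yourself:

```python
def search_web(query: str) -> str:
    """Search the web and return a short snippet of results."""
    ...  # hypothetical tool -- call your search API of choice here

# ReAct lets the LLM decide when to call the tool, observe the result, and iterate
agent = dspy.ReAct("question -> answer", tools=[search_web])
result = agent(question="What rep ranges are typically recommended for hypertrophy?")
print(result.answer)
```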
Custom Modules — Chaining Multiple Steps
You can compose multiple predictors into a pipeline by subclassing dspy.Module:
```python
class PlanGenerator(dspy.Module):
    def __init__(self):
        super().__init__()
        # Define the predictors this module uses
        self.generate_overview = dspy.ChainOfThought(GeneratePlanOverview)
        self.generate_sessions = dspy.ChainOfThought(GenerateSessions)
        self.parse_prescriptions = dspy.Predict(ParsePrescriptions)

    def forward(self, user_goals: str):
        # Stage 1: Generate the plan overview
        overview = self.generate_overview(user_goals=user_goals)

        # Stage 2: Generate sessions using the overview as context
        sessions = self.generate_sessions(
            plan_overview=overview.plan,
            user_goals=user_goals
        )

        # Stage 3: Parse prescriptions from the sessions
        parsed = self.parse_prescriptions(
            raw_sessions=sessions.sessions
        )
        return parsed
```
Key point: Each stage can use a different LLM (see Context Switching below). The module chains them together like a pipeline.
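Using a custom module looks the same as using a single predictor; calling the instance runs forward (the goals string is illustrative):

```python
# Assumes an LM has already been configured, e.g. dspy.configure(lm=...)
pipeline = PlanGenerator()
result = pipeline(user_goals="Build muscle, 3 days per week")  # runs all three stages in order
```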
3. LM Configuration
You configure which language model to use globally or per-call:
```python
# Create LM instances
grok = dspy.LM("openrouter/x-ai/grok-4-fast", max_tokens=32000)
sonnet = dspy.LM("anthropic/claude-sonnet-4-5", max_tokens=8000)
haiku = dspy.LM("anthropic/claude-haiku-4-5", max_tokens=4000)

# Set the default LM for all calls
dspy.configure(lm=grok)
```
DSPy uses LiteLLM under the hood, so model strings follow the provider/model-name format and you can use any provider (OpenAI, Anthropic, OpenRouter, local vLLM, etc.).
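A few more configurations, assuming the relevant API keys are available (model names and URLs here are illustrative):

```python
# OpenAI (reads OPENAI_API_KEY from the environment by default)
gpt = dspy.LM("openai/gpt-4o-mini", max_tokens=4000)

# Pass credentials explicitly instead of via environment variables
claude = dspy.LM("anthropic/claude-sonnet-4-5", api_key="sk-...")

# A local OpenAI-compatible server (e.g. vLLM)
local = dspy.LM("openai/my-local-model", api_base="http://localhost:8000/v1", api_key="local")
```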
4. Context Switching (dspy.context)
This is what enables multi-model routing — using different models for different stages:
```python
class MultiModelPipeline(dspy.Module):
    def __init__(self):
        super().__init__()
        self.planner = dspy.ChainOfThought(PlanSignature)
        self.parser = dspy.Predict(ParseSignature)

    def forward(self, user_input):
        # Use Grok-4 for the complex planning step
        with dspy.context(lm=grok):
            plan = self.planner(input=user_input)

        # Use Claude Haiku for the cheap parsing step
        with dspy.context(lm=haiku):
            parsed = self.parser(raw_plan=plan.output)

        return parsed
```
Why this matters: You match model capability to task complexity. Expensive models for hard reasoning, cheap models for simple extraction. This is a core cost optimization pattern.
5. Composing Modules
DSPy modules are composable — you can nest them, loop them, and build complex pipelines:
```python
class FullPipeline(dspy.Module):
    def __init__(self):
        super().__init__()
        self.plan_overview = dspy.ChainOfThought(OverviewSignature)
        self.session_generator = dspy.ChainOfThought(SessionSignature)

    def forward(self, user_context):
        # Generate overview
        overview = self.plan_overview(context=user_context)

        # Loop: generate each session with context from previous ones
        all_sessions = []
        for week in overview.plan.cycles:
            for day in range(week.sessions_per_week):
                session = self.session_generator(
                    context=user_context,
                    cycle=week,
                    previous_sessions=all_sessions  # Progressive context!
                )
                all_sessions.append(session)

        return all_sessions
```
This progressive context pattern is useful for generating multi-part content where each part needs awareness of what came before.
DSPy vs Raw Prompting
| Aspect | Raw Prompting | DSPy |
|---|---|---|
| Prompt definition | String templates with f-strings | Typed Python classes |
| Output parsing | Manual JSON parsing, regex, hope | Automatic via Pydantic models |
| Model switching | Change API calls everywhere | dspy.context(lm=...) |
| Testing | Test the full prompt+model combo | Test signatures independently |
| Prompt optimization | Manual trial and error | DSPy can optimize prompts algorithmically |
| Type safety | None — strings in, strings out | Full type checking on inputs and outputs |
| Maintenance | Prompts break when you change models | Signatures are model-agnostic |
Example — the same task both ways:
Raw prompting:
```python
response = openai.chat.completions.create(
    model="gpt-4",
    messages=[{
        "role": "system",
        "content": "You are a fitness coach. Generate a workout plan as JSON..."
    }, {
        "role": "user",
        "content": f"Goals: {user_goals}\nEquipment: {equipment}\n..."
    }]
)
plan = json.loads(response.choices[0].message.content)  # Might fail!
```
DSPy:
```python
class GeneratePlan(dspy.Signature):
    """You are a fitness coach. Generate a workout plan."""

    user_goals: str = dspy.InputField()
    equipment: list[str] = dspy.InputField()
    plan: WorkoutPlan = dspy.OutputField()

generator = dspy.ChainOfThought(GeneratePlan)
result = generator(user_goals=goals, equipment=equip)
plan = result.plan  # Already a validated Pydantic object
```
DSPy vs LangChain
| Aspect | LangChain | DSPy |
|---|---|---|
| Philosophy | Chain prompts together | Program with LMs like functions |
| Abstraction | Wrappers around API calls | Typed signatures compiled to prompts |
| Prompt control | You write prompt templates | DSPy generates prompts from signatures |
| Optimization | Manual prompt tuning | Can algorithmically optimize prompts |
| Learning curve | Many abstractions to learn | Closer to normal Python |
| When to use | Quick prototyping, many integrations | Production pipelines needing reliability |
Multi-Stage Pipeline Example
```
User Fitness Profile
        |
        v
[Stage 1] dspy.ChainOfThought(GeneratePlanSignature)
          LM: Grok-4 via OpenRouter (32K tokens)
          Output: WorkPlanLLM (Pydantic model)
        |
        v
[Stage 2] dspy.ChainOfThought(GenerateWeeklyTasksSignature) x N cycles
          LM: Grok-4 via OpenRouter (24K tokens)
          Input includes: previous sessions for balance
        |
        v
[Stage 3] dspy.ChainOfThought(GenerateSingleSessionSignature) x N sessions
          LM: Claude Sonnet 4.5 (8K tokens)
          Input includes: all prior sessions that week
        |
        v
[Stage 4] dspy.Predict(ParsePrescriptionSignature)
          LM: Claude Haiku 4.5 (4K tokens)
          Parses "3x5 @ RPE 8" into structured sets
        |
        v
[Enrichment] Vector search matches exercises to canonical library
             (Not DSPy — deterministic code using pgvector)
```
Key DSPy features used:
- dspy.ChainOfThought for reasoning-heavy stages (planning, session design)
- dspy.Predict for simpler tasks (prescription parsing, exercise enrichment)
- dspy.context(lm=...) for per-stage model routing
- Pydantic OutputField types for guaranteed structured outputs
Key Terms Glossary
| Term | Definition |
|---|---|
| Signature | A typed class defining LLM inputs/outputs. The "function signature" of an LLM call. |
| InputField | A typed input parameter that gets sent to the LLM as part of the prompt. |
| OutputField | A typed output parameter that the LLM must produce. Can be a string or Pydantic model. |
| Module | A class that composes one or more predictors into a pipeline. Subclass of dspy.Module. |
| Predictor | A wrapper around a Signature that defines HOW the LLM should approach the task (basic, chain-of-thought, ReAct). |
| dspy.Predict | Simplest predictor — direct input/output, no reasoning step. |
| dspy.ChainOfThought | Predictor that adds automatic "reasoning" before producing output. Better for complex tasks. |
| dspy.ReAct | Predictor that enables tool use — the LLM can call functions, observe results, and iterate. |
| dspy.context | Context manager to temporarily override the LM, temperature, or other settings for a block of code. |
| dspy.configure | Set the default LM and global settings. |
| LM | A language model instance (dspy.LM("provider/model")). Uses LiteLLM for multi-provider support. |
| Optimizer | DSPy can algorithmically optimize prompts by trying variations and measuring output quality. Advanced feature. |
| Teleprompter | Legacy name for DSPy optimizers. You may see this in older docs. |
| Compiled program | A DSPy module after optimization — the prompts have been tuned for best performance. |
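As a rough sketch of what optimization looks like in practice (the metric and training examples below are placeholders you would define for your own task):

```python
from dspy.teleprompt import BootstrapFewShot

# A metric is just a function that scores a prediction against a labeled example
def summary_matches(example, prediction, trace=None):
    return example.summary == prediction.summary

# Training data: dspy.Example objects with their input fields marked
trainset = [
    dspy.Example(article="...", summary="...").with_inputs("article"),
    # ...
]

optimizer = BootstrapFewShot(metric=summary_matches, max_bootstrapped_demos=4)
compiled_summarizer = optimizer.compile(dspy.Predict(SummarizeArticle), trainset=trainset)
```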