Introduction
Hello everyone!
This isn't my first blog post on DSPy—I've written several before. However, I've noticed some recent updates to DSPy, and I'd rather not consult the documentation every time I want to build programs. So, I plan to jot down some basic DSPy concepts in this post. Additionally, I intend to use this document as external knowledge for GPT or Claude.
Install DSPy and Set up the LM
Here's how to install DSPy:
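Assuming a standard Python environment with pip, the install is a single command. The package is published as `dspy`; older tutorials may refer to it as `dspy-ai`.

```shell
pip install -U dspy
```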
I usually test models through OpenAI, Anthropic, and other third-party providers, so here is how to set up the LM for each of them.
OpenAI
Anthropic
Other providers (compatible with OpenAI)
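The three setups can be sketched in one place. This is a minimal sketch, assuming a recent DSPy release in which `dspy.LM` takes a `provider/model` string plus `api_key` and `api_base`. The model names, the `COMPATIBLE_API_KEY` variable, the local URL, and the `configure_lm` helper are placeholders of mine, not recommendations. The DSPy import sits inside the function so the settings table can be inspected without the library or any network access.

```python
import os

# Keyword arguments for dspy.LM, one entry per provider.
# Model names and the api_base URL are illustrative placeholders.
LM_SETTINGS = {
    "openai": {
        "model": "openai/gpt-4o-mini",
        "api_key": os.getenv("OPENAI_API_KEY"),
    },
    "anthropic": {
        "model": "anthropic/claude-3-5-sonnet-20240620",
        "api_key": os.getenv("ANTHROPIC_API_KEY"),
    },
    # OpenAI-compatible servers reuse the "openai/" prefix and point
    # api_base at the server's own endpoint.
    "compatible": {
        "model": "openai/your-model-name",
        "api_key": os.getenv("COMPATIBLE_API_KEY", "EMPTY"),
        "api_base": "http://localhost:8000/v1",
    },
}


def configure_lm(provider: str):
    """Build the LM for `provider` and make it DSPy's default."""
    import dspy  # requires DSPy to be installed

    lm = dspy.LM(**LM_SETTINGS[provider])
    dspy.configure(lm=lm)
    return lm
```

Calling `configure_lm("openai")` once at the start of a script is enough; every module created afterwards uses that LM unless told otherwise.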
Calling the LM
Here's how to call the LM directly and get an answer.
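A minimal sketch of a direct call, assuming that a `dspy.LM` object is callable with a prompt string and returns a list of completion strings, as in recent DSPy versions. `first_completion` and `EchoLM` are hypothetical names; `EchoLM` is an offline stand-in, so the snippet runs without an API key.

```python
def first_completion(lm, prompt: str) -> str:
    """Send one prompt and return the first completion."""
    completions = lm(prompt)  # with dspy.LM this is a list of strings
    return completions[0]


class EchoLM:
    """Offline stand-in that mimics the call shape of an LM object."""

    def __call__(self, prompt: str, **kwargs) -> list[str]:
        return [f"(echo) {prompt}"]


answer = first_completion(EchoLM(), "Say this is a test!")
print(answer)
```

With a real model, you would pass the object returned by `dspy.LM(...)` instead of `EchoLM()`.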
Signatures
A signature is a declarative specification of input/output behavior of a DSPy module.
Inline DSPy Signatures
- Question Answering: "question -> answer", which you can also write as "question: str -> answer: str"
- Sentiment Classification: "sentence -> sentiment: bool", e.g. True if positive
- Summarization: "document -> summary"
You can also have multiple input/output fields with types:
- RAG: "context: list[str], question: str -> answer: str"
- Multiple-Choice Question Answering with Reasoning: "question, choices: list[str] -> reasoning: str, selection: int"
Here are some examples of Inline DSPy Signatures:
Classification
Summarization
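In DSPy, the classification and summarization signatures are plain strings passed to a module such as `dspy.Predict`. To make the string format concrete, here is a toy parser for it. This is only an illustration of how the string splits into typed fields, not DSPy's own parser; the function names are hypothetical.

```python
def split_top_level(text: str) -> list[str]:
    """Split on commas that are not nested inside square brackets."""
    parts, depth, current = [], 0, []
    for ch in text:
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    parts.append("".join(current).strip())
    return [p for p in parts if p]


def parse_fields(side: str) -> dict[str, str]:
    """Map each field name to its type, defaulting to 'str'."""
    fields = {}
    for item in split_top_level(side):
        name, _, type_name = item.partition(":")
        fields[name.strip()] = type_name.strip() or "str"
    return fields


def parse_signature(signature: str) -> tuple[dict[str, str], dict[str, str]]:
    """Split 'inputs -> outputs' into two name-to-type mappings."""
    inputs, arrow, outputs = signature.partition("->")
    if not arrow:
        raise ValueError("signature must contain '->'")
    return parse_fields(inputs), parse_fields(outputs)


classification = parse_signature("sentence -> sentiment: bool")
summarization = parse_signature("document -> summary")
print(classification)
print(summarization)
```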
Class-Based DSPy Signatures
For some advanced tasks, you need more verbose signatures. This is typically to:
- Clarify something about the nature of the task (expressed below as a docstring).
- Supply hints on the nature of an input field, expressed as a desc keyword argument for dspy.InputField.
- Supply constraints on an output field, expressed as a desc keyword argument for dspy.OutputField.
Here are some examples of Class-Based DSPy Signatures:
Modules
Modules in DSPy help you shift from tinkering with prompt strings to programming with structured natural-language modules. For each AI component in the system, you can specify input/output behavior as a signature and select a module to assign a strategy for invoking the LM. DSPy expands the signature into prompts and parses your typed output.
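To picture what "expands the signature into prompts and parses your typed output" means, here is a toy version of that round trip. It is a sketch under my own assumptions, not DSPy's actual prompt format or parser; `build_prompt`, `parse_reply`, and `fake_lm` are hypothetical names, and `fake_lm` returns a canned reply instead of calling a model.

```python
def build_prompt(inputs: dict, output_fields: dict) -> str:
    """Render input values and ask for one line per output field."""
    lines = [f"{name}: {value}" for name, value in inputs.items()]
    lines.append("Reply with one line per field: " + ", ".join(output_fields))
    return "\n".join(lines)


def parse_reply(reply: str, output_fields: dict) -> dict:
    """Read 'name: value' lines and cast each value to its declared type."""
    parsed = {}
    for line in reply.splitlines():
        name, sep, raw = line.partition(":")
        name = name.strip()
        if sep and name in output_fields:
            parsed[name] = output_fields[name](raw.strip())
    return parsed


def fake_lm(prompt: str) -> str:
    """Canned reply standing in for a real model call."""
    return "reasoning: 3 times 4 is 12\nanswer: 12"


output_fields = {"reasoning": str, "answer": int}
prompt = build_prompt({"question": "What is 3 * 4?"}, output_fields)
result = parse_reply(fake_lm(prompt), output_fields)
print(result)
```

The point of the typed `output_fields` mapping is that the caller gets back an `int`, not the string "12".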
Here are some examples to illustrate:
Math (Chain of Thought)
Possible Output
Classification
Possible Output
Information Extraction
Possible Output
Agents
Possible Output
Optimizer
A DSPy optimizer is an algorithm that can tune the parameters of a DSPy program (i.e., the prompts and/or the LM weights) to maximize the metrics you specify, like accuracy.
A typical DSPy optimizer takes three things:
- Your DSPy Program
- Your metric
- A few training inputs. This set may be very small (i.e., only 5 or 10 examples) and incomplete (only inputs to your program, without any labels).
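The metric and the training set can be sketched in plain Python. This assumes the metric convention from DSPy's documentation, a function taking `(example, pred, trace=None)` and returning a score. `SimpleNamespace` objects stand in for DSPy examples and predictions, `toy_program` is a hypothetical stand-in for a real program, and the data is invented for illustration.

```python
from types import SimpleNamespace


def exact_match(example, pred, trace=None) -> bool:
    """Metric: does the predicted answer match the label, ignoring case?"""
    return example.answer.strip().lower() == pred.answer.strip().lower()


# A tiny, hand-written training set (illustrative data only).
trainset = [
    SimpleNamespace(question="What is 2 + 2?", answer="4"),
    SimpleNamespace(question="What is the capital of France?", answer="Paris"),
]


def toy_program(question: str) -> SimpleNamespace:
    """Stand-in for a DSPy program; it only knows one answer."""
    return SimpleNamespace(answer="paris" if "France" in question else "unknown")


scores = [exact_match(ex, toy_program(ex.question)) for ex in trainset]
accuracy = sum(scores) / len(scores)
print(accuracy)
```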
Automatic Finetuning
This optimizer is used to fine-tune the underlying LLM(s)
- BootstrapFinetune: Distills a prompt-based DSPy program into weight updates. The output is a DSPy program that has the same steps but where each step is conducted by a finetuned model instead of a prompted LM.
Here is an example of fine-tuning a program with BootstrapFinetune.
Automatic Few-Shot Learning
These optimizers extend the signature by automatically generating and including optimized examples within the prompt sent to the model, implementing few-shot learning.
LabeledFewShot: Simply constructs few-shot examples (demos) from provided labeled input and output data points. Requires k (the number of examples for the prompt) and trainset to randomly select k examples from.
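The idea behind this optimizer fits in a few lines. The sketch below is a toy re-creation of the idea (sample k labeled examples, put them in front of the new input), not DSPy's implementation; the prompt layout and the function names are assumptions.

```python
import random


def labeled_few_shot(trainset: list[dict], k: int, seed: int = 0) -> list[dict]:
    """Pick up to k labeled examples at random to use as demos."""
    rng = random.Random(seed)
    return rng.sample(trainset, min(k, len(trainset)))


def render_prompt(demos: list[dict], question: str) -> str:
    """Place the demos before the new question."""
    blocks = [f"Question: {d['question']}\nAnswer: {d['answer']}" for d in demos]
    blocks.append(f"Question: {question}\nAnswer:")
    return "\n\n".join(blocks)


trainset = [
    {"question": "2 + 2?", "answer": "4"},
    {"question": "3 + 5?", "answer": "8"},
    {"question": "7 - 4?", "answer": "3"},
]
demos = labeled_few_shot(trainset, k=2)
print(render_prompt(demos, "6 + 1?"))
```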
BootstrapFewShot: Uses a teacher module (which defaults to your program) to generate complete demonstrations for every stage of your program, along with labeled examples in trainset. Parameters include max_labeled_demos (the number of demonstrations randomly selected from the trainset) and max_bootstrapped_demos (the number of additional examples generated by the teacher). The bootstrapping process employs the metric to validate demonstrations, including only those that pass the metric in the "compiled" prompt. Advanced: supports using a teacher program that is a different DSPy program with a compatible structure, for harder tasks.
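The bootstrapping step can be sketched without any LM. In this toy version, the teacher is an ordinary function that is deliberately wrong on some inputs, so you can see the metric filtering its output. It illustrates the idea only and is not DSPy's implementation; all names besides `max_bootstrapped_demos` are hypothetical.

```python
def bootstrap_demos(teacher, trainset, metric, max_bootstrapped_demos):
    """Run the teacher on training inputs and keep outputs the metric accepts."""
    demos = []
    for example in trainset:
        if len(demos) >= max_bootstrapped_demos:
            break
        prediction = teacher(example["question"])
        if metric(example, prediction):
            demos.append({"question": example["question"], **prediction})
    return demos


def teacher(question: str) -> dict:
    """Toy teacher: adds two integers and records its reasoning."""
    a, b = (int(tok) for tok in question.split("+"))
    # Deliberately wrong for large inputs, to show the metric filtering.
    total = a + b if a < 10 else 0
    return {"reasoning": f"{a} plus {b} is {total}", "answer": str(total)}


def metric(example: dict, prediction: dict) -> bool:
    """Accept a prediction only if its answer matches the label."""
    return example["answer"] == prediction["answer"]


trainset = [
    {"question": "1 + 2", "answer": "3"},
    {"question": "20 + 5", "answer": "25"},
    {"question": "4 + 4", "answer": "8"},
]
demos = bootstrap_demos(teacher, trainset, metric, max_bootstrapped_demos=2)
print(demos)
```

Note that the kept demos carry the teacher's intermediate `reasoning`, which the labeled training data never contained. That is what makes them "complete" demonstrations.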
BootstrapFewShotWithRandomSearch: Applies BootstrapFewShot several times with random search over generated demonstrations, and selects the best program over the optimization. Parameters mirror those of BootstrapFewShot, with the addition of num_candidate_programs, which specifies the number of random programs evaluated over the optimization, including candidates of the uncompiled program, the LabeledFewShot optimized program, the BootstrapFewShot compiled program with unshuffled examples, and num_candidate_programs of BootstrapFewShot compiled programs with randomized example sets.
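The random-search layer is essentially "try several demo sets, keep the best". Here is a toy sketch of that loop. The `evaluate` function is an invented stand-in for running a compiled program against a validation set, and only `num_candidate_programs` is a name taken from the text above; this is not how DSPy scores candidates.

```python
import random


def evaluate(demos, devset) -> float:
    """Toy score: fraction of dev questions whose operator appears in a demo."""
    seen_ops = {d["question"].split()[1] for d in demos}
    hits = sum(ex["question"].split()[1] in seen_ops for ex in devset)
    return hits / len(devset)


def random_search(pool, devset, demos_per_program, num_candidate_programs, seed=0):
    """Sample several demo sets and return the best one with its score."""
    rng = random.Random(seed)
    best_demos, best_score = [], evaluate([], devset)  # no-demo baseline
    for _ in range(num_candidate_programs):
        candidate = rng.sample(pool, demos_per_program)
        score = evaluate(candidate, devset)
        if score > best_score:
            best_demos, best_score = candidate, score
    return best_demos, best_score


pool = [
    {"question": "2 + 2", "answer": "4"},
    {"question": "9 - 3", "answer": "6"},
    {"question": "3 * 3", "answer": "9"},
    {"question": "1 + 7", "answer": "8"},
]
devset = [{"question": "5 + 1"}, {"question": "8 - 2"}, {"question": "2 * 6"}]
best_demos, best_score = random_search(pool, devset, 2, num_candidate_programs=5)
print(best_score, best_demos)
```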
KNNFewShot: Uses the k-Nearest Neighbors algorithm to find the nearest training example demonstrations for a given input example. These nearest neighbor demonstrations are then used as the trainset for the BootstrapFewShot optimization process.
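The retrieval half of this optimizer can be sketched as a plain nearest-neighbor lookup. The version below uses word-overlap (Jaccard) similarity as a simple stand-in for the embedding-based similarity a real setup would use; the function names and data are hypothetical.

```python
def jaccard(a: str, b: str) -> float:
    """Word-overlap similarity between two strings (0.0 to 1.0)."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0


def nearest_examples(query: str, trainset: list[dict], k: int) -> list[dict]:
    """Return the k training examples most similar to the query."""
    ranked = sorted(
        trainset, key=lambda ex: jaccard(query, ex["question"]), reverse=True
    )
    return ranked[:k]


trainset = [
    {"question": "What is the capital of France", "answer": "Paris"},
    {"question": "What is the capital of Japan", "answer": "Tokyo"},
    {"question": "Who wrote Hamlet", "answer": "Shakespeare"},
    {"question": "What is 2 plus 2", "answer": "4"},
]
neighbors = nearest_examples("What is the capital of Italy", trainset, k=2)
print([ex["answer"] for ex in neighbors])
```

The two geography questions are selected because they share the most words with the query; those neighbors would then serve as the trainset for bootstrapping.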
Automatic Instruction Optimization
These optimizers produce optimized instructions for the prompt and, in the case of MIPROv2, can also optimize the set of few-shot demonstrations.
COPRO: Generates and refines new instructions for each step, and optimizes them with coordinate ascent (hill-climbing using the metric function and the trainset). Parameters include depth, which is the number of iterations of prompt improvement the optimizer runs over.
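Coordinate ascent here means changing one step's instruction at a time and keeping a change only if the score goes up. The sketch below shows that loop with a fixed list of candidate instructions and an invented keyword-based `score`; in the real optimizer an LM proposes the candidates and the score comes from running the program on the trainset. Only `depth` is a name taken from the text above.

```python
def coordinate_ascent(instructions, candidates, score, depth):
    """Improve one step's instruction at a time, keeping changes that score higher."""
    best = dict(instructions)
    best_score = score(best)
    for _ in range(depth):
        improved = False
        for step, options in candidates.items():
            for option in options:
                trial = {**best, step: option}
                trial_score = score(trial)
                if trial_score > best_score:
                    best, best_score, improved = trial, trial_score, True
        if not improved:
            break  # no single-step change helps any more
    return best, best_score


def score(instr: dict) -> int:
    """Toy metric standing in for running the program on the trainset."""
    points = 0
    points += "step by step" in instr["reason"]
    points += "one word" in instr["answer"]
    return points


start = {"reason": "Think.", "answer": "Answer."}
candidates = {
    "reason": ["Think carefully.", "Think step by step."],
    "answer": ["Answer in one word.", "Answer at length."],
}
best, best_score = coordinate_ascent(start, candidates, score, depth=3)
print(best, best_score)
```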
MIPROv2: Generates instructions and few-shot examples in each step. The instruction generation is data-aware and demonstration-aware. Uses Bayesian Optimization to effectively search over the space of generated instructions/demonstrations across your modules.
- Author: Chengsheng Deng
- URL: https://chengshengddeng.com/article/notes-on-dspy
- Copyright: All articles in this blog, except for special statements, adopt the BY-NC-SA agreement. Please indicate the source!