DSPy-Opt is a framework for building and automatically optimizing Retrieval-Augmented Generation (RAG) pipelines using DSPy. It implements a modular 5-stage pipeline (query rewriting, sub-query decomposition, metadata extraction, hybrid Weaviate search, and chain-of-thought answer generation) and uses DSPy optimizers to automatically tune prompts and few-shot examples against DeepEval metrics.
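To make the five-stage structure concrete, here is a minimal sketch in plain Python. It is not DSPy-Opt's code: every name in it (`PipelineState`, `rewrite_query`, `make_search`, and so on) is hypothetical, and each stage is a trivial stand-in for what would really be an LLM call or a Weaviate query. It only illustrates how five independent stages can pass a shared state along a chain.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class PipelineState:
    """Hypothetical container passed from stage to stage."""
    question: str
    rewritten: str = ""
    sub_queries: list[str] = field(default_factory=list)
    filters: dict[str, str] = field(default_factory=dict)
    passages: list[str] = field(default_factory=list)
    answer: str = ""


Stage = Callable[[PipelineState], PipelineState]


def rewrite_query(state: PipelineState) -> PipelineState:
    # Stand-in for an LLM rewrite: normalize whitespace, drop a trailing "?".
    state.rewritten = " ".join(state.question.split()).rstrip("?")
    return state


def decompose(state: PipelineState) -> PipelineState:
    # Stand-in for sub-query generation: split on the word "and".
    parts = state.rewritten.split(" and ")
    state.sub_queries = [p.strip() for p in parts if p.strip()]
    return state


def extract_metadata(state: PipelineState) -> PipelineState:
    # Stand-in for metadata extraction: treat a four-digit token as a year.
    years = [t for t in state.rewritten.split() if t.isdigit() and len(t) == 4]
    if years:
        state.filters["year"] = years[0]
    return state


def make_search(corpus: list[str]) -> Stage:
    # Stand-in for hybrid search: keep documents sharing a word with a sub-query.
    def search(state: PipelineState) -> PipelineState:
        hits: list[str] = []
        for query in state.sub_queries:
            terms = set(query.lower().split())
            for doc in corpus:
                if terms & set(doc.lower().split()) and doc not in hits:
                    hits.append(doc)
        state.passages = hits
        return state
    return search


def generate_answer(state: PipelineState) -> PipelineState:
    # Stand-in for chain-of-thought generation: echo the top passage.
    state.answer = state.passages[0] if state.passages else "no supporting passage found"
    return state


def run_pipeline(question: str, stages: list[Stage]) -> PipelineState:
    state = PipelineState(question=question)
    for stage in stages:
        state = stage(state)
    return state


corpus = ["Weaviate supports hybrid search", "DSPy compiles prompts"]
stages = [rewrite_query, decompose, extract_metadata, make_search(corpus), generate_answer]
result = run_pipeline("What is  Weaviate and what is DSPy?", stages)
print(result.sub_queries, result.passages, result.answer)
```

Because each stage has the same signature, any one of them can be swapped or tuned without touching the others, which is the property an optimizer relies on when it adjusts prompts per stage.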
Quickstart
Build and run your first optimized RAG pipeline in minutes
Installation
Set up dependencies, environment variables, and Weaviate
Pipeline Architecture
Understand the 5-stage modular RAG pipeline design
Optimizers
Compare MIPROv2, COPRO, SIMBA, GEPA, and BootstrapFewShot
What DSPy-Opt Does
DSPy-Opt replaces manual prompt engineering with automated optimization. Instead of hand-tuning prompts for each stage of your RAG pipeline, you define the pipeline structure once and let a DSPy optimizer search over instructions and few-shot demonstrations to maximize your evaluation metric.
5 Optimizers
MIPROv2, COPRO, SIMBA, GEPA, and BootstrapFewShot — each with a different search strategy
5 Datasets
FreshQA, HotpotQA, PubMedQA, TriviaQA, and Wikipedia — covering single-hop, multi-hop, and biomedical QA
5 Metrics
Answer Relevancy, Faithfulness, Contextual Precision, Contextual Recall, and Contextual Relevancy, all computed via DeepEval
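The idea of searching over instructions and few-shot demonstrations to maximize a metric can be illustrated with a toy exhaustive search. This is a sketch under loud assumptions: it is plain grid search, not the strategy used by MIPROv2, COPRO, SIMBA, GEPA, or BootstrapFewShot, and `build_program`, `exact_match`, and `optimize` are hypothetical names. The "program" is a lookup table standing in for an LLM, so the demos deliberately overlap the dev set to keep the example deterministic.

```python
from itertools import combinations


def build_program(instruction, demos):
    """Toy stand-in for a prompted LLM: answers only what its demos cover."""
    lookup = dict(demos)

    def program(question):
        if question in lookup:
            return lookup[question]
        # The instruction changes fallback behavior, as a prompt would.
        return "unknown" if "say unknown" in instruction else ""

    return program


def exact_match(prediction, gold):
    # Toy metric returning 1.0 or 0.0; a real run would use an evaluation metric.
    return 1.0 if prediction == gold else 0.0


def optimize(instructions, demo_pool, devset, max_demos=2):
    """Try every (instruction, demo subset) pair and keep the best scorer."""
    best_score, best_instruction, best_demos = -1.0, None, ()
    for instruction in instructions:
        for k in range(max_demos + 1):
            for demos in combinations(demo_pool, k):
                program = build_program(instruction, demos)
                score = sum(exact_match(program(q), gold) for q, gold in devset) / len(devset)
                if score > best_score:
                    best_score, best_instruction, best_demos = score, instruction, demos
    return best_instruction, best_demos, best_score


instructions = ["Answer the question.", "Answer the question; say unknown if unsure."]
demo_pool = [("capital of France?", "Paris"), ("capital of Peru?", "Lima")]
devset = demo_pool + [("capital of Mars?", "unknown")]

best_instruction, best_demos, best_score = optimize(instructions, demo_pool, devset)
print(best_instruction, len(best_demos), best_score)
```

Exhaustive search grows combinatorially with the number of candidate instructions and demos, which is one reason practical optimizers rely on smarter search strategies.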
How It Works
Index your dataset
Load documents from HuggingFace, extract structured metadata with an LLM, embed with SentenceTransformers, and store in a Weaviate collection.
Configure your pipeline
Set models, optimizer hyperparameters, and metadata schema in a YAML config file.
Run the optimizer
The DSPy optimizer compiles your pipeline by searching over prompt instructions and few-shot examples, scoring each candidate with DeepEval metrics.
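As a rough illustration of the configuration step, the sketch below validates a parsed config before a run. The field names (`dataset`, `optimizer`, `llm_model`, `embedding_model`, `optimizer_params`, `metadata_schema`) and the model names are assumptions made for this example, not DSPy-Opt's actual schema; only the optimizer and dataset names come from this page. The input is a plain dictionary of the kind a YAML parser would return.

```python
from dataclasses import dataclass, field

# Names taken from this page; everything else here is hypothetical.
KNOWN_OPTIMIZERS = {"MIPROv2", "COPRO", "SIMBA", "GEPA", "BootstrapFewShot"}
KNOWN_DATASETS = {"FreshQA", "HotpotQA", "PubMedQA", "TriviaQA", "Wikipedia"}
REQUIRED_KEYS = {"dataset", "optimizer", "llm_model", "embedding_model"}


@dataclass(frozen=True)
class RunConfig:
    """Hypothetical shape of one optimization run's settings."""
    dataset: str
    optimizer: str
    llm_model: str
    embedding_model: str
    optimizer_params: dict = field(default_factory=dict)
    metadata_schema: dict = field(default_factory=dict)


def load_config(raw: dict) -> RunConfig:
    """Validate a parsed config dictionary and return a typed object."""
    missing = REQUIRED_KEYS - raw.keys()
    if missing:
        raise ValueError(f"missing config keys: {sorted(missing)}")
    if raw["optimizer"] not in KNOWN_OPTIMIZERS:
        raise ValueError(f"unknown optimizer: {raw['optimizer']!r}")
    if raw["dataset"] not in KNOWN_DATASETS:
        raise ValueError(f"unknown dataset: {raw['dataset']!r}")
    return RunConfig(**raw)


config = load_config({
    "dataset": "HotpotQA",
    "optimizer": "MIPROv2",
    "llm_model": "example-llm",              # placeholder model name
    "embedding_model": "example-embedder",   # placeholder model name
    "optimizer_params": {"num_candidates": 4},  # hypothetical hyperparameter
    "metadata_schema": {"topic": "string"},
})
print(config.optimizer, config.optimizer_params)
```

Failing fast on an unknown optimizer or dataset name is cheaper than discovering the typo after indexing has already run.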
Explore the Documentation
Pipeline Components
Detailed API reference for QueryRewriter, SubQueryGenerator, MetadataExtractor, WeaviateRetriever, and Metrics
Dataset Pipelines
Per-dataset pipeline classes, metadata schemas, and configuration for all five QA benchmarks
Running Optimizers
Step-by-step guide to running each optimizer with the right configuration
Adding a Dataset
Extend DSPy-Opt to a new dataset by following the established pattern