DSPy-Opt is a framework for building and automatically optimizing Retrieval-Augmented Generation (RAG) pipelines with DSPy. It implements a modular 5-stage pipeline (query rewriting, sub-query decomposition, metadata extraction, hybrid Weaviate search, and chain-of-thought answer generation) and uses DSPy optimizers to tune prompts and few-shot examples against DeepEval metrics.
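To make the five-stage flow concrete, here is a minimal sketch of how such stages could be chained. It uses only the Python standard library: every function, class, and field name is hypothetical, and each stage body is a trivial stand-in for what would be an LLM call or a Weaviate query in the real pipeline. It is not DSPy-Opt's actual API.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class PipelineState:
    """Hypothetical container for the data handed from stage to stage."""
    question: str
    rewritten: str = ""
    sub_queries: list[str] = field(default_factory=list)
    filters: dict[str, str] = field(default_factory=dict)
    passages: list[str] = field(default_factory=list)
    answer: str = ""


Stage = Callable[[PipelineState], PipelineState]


def rewrite_query(state: PipelineState) -> PipelineState:
    # Stand-in for LLM query rewriting: normalize whitespace, drop the "?".
    state.rewritten = " ".join(state.question.split()).rstrip("?")
    return state


def decompose(state: PipelineState) -> PipelineState:
    # Stand-in for sub-query decomposition: split on the word "and".
    parts = state.rewritten.split(" and ")
    state.sub_queries = [p.strip() for p in parts if p.strip()]
    return state


def extract_metadata(state: PipelineState) -> PipelineState:
    # Stand-in for metadata extraction: treat a 4-digit token as a year filter.
    years = [t for t in state.rewritten.split() if t.isdigit() and len(t) == 4]
    if years:
        state.filters["year"] = years[0]
    return state


def retrieve(state: PipelineState) -> PipelineState:
    # Stand-in for hybrid search: one placeholder passage per sub-query.
    state.passages = [f"[passage for: {q}]" for q in state.sub_queries]
    return state


def generate_answer(state: PipelineState) -> PipelineState:
    # Stand-in for chain-of-thought generation over the retrieved passages.
    state.answer = f"Answer drawn from {len(state.passages)} passage(s)."
    return state


STAGES: list[Stage] = [
    rewrite_query, decompose, extract_metadata, retrieve, generate_answer,
]


def run_pipeline(question: str) -> PipelineState:
    """Run every stage in order, threading one state object through them."""
    state = PipelineState(question=question)
    for stage in STAGES:
        state = stage(state)
    return state
```

Calling `run_pipeline("Who won the 2018 World Cup and where was it held?")` yields two sub-queries and a `year` filter. Because each stage shares the same signature, a stage can be swapped or tuned without touching the others, which is the property the modular design relies on.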

Quickstart

Build and run your first optimized RAG pipeline in minutes

Installation

Set up dependencies, environment variables, and Weaviate

Pipeline Architecture

Understand the 5-stage modular RAG pipeline design

Optimizers

Compare MIPROv2, COPRO, SIMBA, GEPA, and BootstrapFewShot

What DSPy-Opt Does

DSPy-Opt replaces manual prompt engineering with automated optimization. Instead of hand-tuning prompts for each stage of your RAG pipeline, you define the pipeline structure once and let a DSPy optimizer search over instructions and few-shot demonstrations to maximize your evaluation metric.
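The idea of searching over instructions and few-shot demonstrations can be illustrated with a toy example. The sketch below is an assumption-laden stand-in, not how any DSPy optimizer is implemented: `toy_program` replaces the language model with a lookup table, the metric is exact match rather than a DeepEval metric, and the search is a brute-force loop, whereas real optimizers use more sample-efficient strategies. All names and data are hypothetical.

```python
from itertools import combinations

# Hypothetical candidate instructions and demonstration pool.
INSTRUCTIONS = ["Answer the question.", "Answer in one lowercase word."]
DEMO_POOL = [
    ("capital of France?", "paris"),
    ("capital of Japan?", "tokyo"),
    ("capital of Peru?", "lima"),
]
DEVSET = [("capital of Japan?", "tokyo"), ("capital of Peru?", "lima")]


def toy_program(instruction, demos, question):
    """Stand-in for an LM call: it only 'knows' what its demos contain,
    and it answers in lowercase only when the instruction asks for it."""
    answer = dict(demos).get(question, "unknown")
    return answer if "lowercase" in instruction else answer.upper()


def metric(prediction, gold):
    """Exact-match score; a real run would plug in an evaluation metric."""
    return float(prediction == gold)


def optimize(instructions, demo_pool, devset, max_demos=2):
    """Try every (instruction, demo subset) pair and keep the best scorer."""
    best_score, best_instruction, best_demos = -1.0, None, ()
    for instruction in instructions:
        for k in range(max_demos + 1):
            for demos in combinations(demo_pool, k):
                total = sum(
                    metric(toy_program(instruction, demos, q), gold)
                    for q, gold in devset
                )
                score = total / len(devset)
                if score > best_score:
                    best_score, best_instruction, best_demos = (
                        score, instruction, demos,
                    )
    return best_score, best_instruction, best_demos


best_score, best_instruction, best_demos = optimize(
    INSTRUCTIONS, DEMO_POOL, DEVSET
)
```

The pipeline structure (`toy_program`) is fixed; only the instruction text and the chosen demonstrations vary, and the metric alone decides which combination wins.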

5 Optimizers

MIPROv2, COPRO, SIMBA, GEPA, and BootstrapFewShot — each with a different search strategy

5 Datasets

FreshQA, HotpotQA, PubMedQA, TriviaQA, and Wikipedia — covering single-hop, multi-hop, and biomedical QA

5 Metrics

Answer Relevancy, Faithfulness, Contextual Precision, Contextual Recall, and Contextual Relevancy, all computed with DeepEval

How It Works

1. Index your dataset

Load documents from Hugging Face, extract structured metadata with an LLM, embed them with SentenceTransformers, and store them in a Weaviate collection.

2. Configure your pipeline

Set models, optimizer hyperparameters, and the metadata schema in a YAML config file.

3. Run the optimizer

The DSPy optimizer compiles your pipeline by searching over prompt instructions and few-shot examples, scoring each candidate with DeepEval metrics.

4. Save and evaluate

The optimized pipeline is saved to JSON. Run the evaluation script to measure performance across all DeepEval metrics.
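As a rough illustration of the configure and save steps above, the sketch below shows a config as the dictionary a YAML loader would produce, then writes an optimized pipeline state to JSON and reads it back. The config keys, the placeholder model names, and the shape of the saved state are all assumptions made for illustration; DSPy-Opt's real schema and file layout may differ. Only the standard library is used.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical config, shown as the dict a YAML loader would return.
config = {
    "models": {
        "generator": "<llm-name>",
        "embedder": "<sentence-transformers-model>",
    },
    "optimizer": {"name": "MIPROv2", "params": {}},
    "metadata_schema": {"year": "string", "topic": "string"},
}

# Hypothetical result of an optimizer run: the tuned instruction and demos.
optimized_state = {
    "optimizer": config["optimizer"]["name"],
    "instruction": "Answer using only the retrieved passages.",
    "demos": [{"question": "example question", "answer": "example answer"}],
}


def save_optimized(state: dict, path: Path) -> None:
    """Persist the optimized state as human-readable JSON."""
    path.write_text(json.dumps(state, indent=2), encoding="utf-8")


def load_optimized(path: Path) -> dict:
    """Reload a previously saved state for evaluation."""
    return json.loads(path.read_text(encoding="utf-8"))


# Round trip through a temporary file to show that nothing is lost.
with tempfile.TemporaryDirectory() as tmp_dir:
    saved_path = Path(tmp_dir) / "optimized_pipeline.json"
    save_optimized(optimized_state, saved_path)
    reloaded = load_optimized(saved_path)
```

Keeping the saved artifact as plain JSON means the tuned instructions and demonstrations can be inspected, diffed, and version-controlled like any other text file.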

Explore the Documentation

Pipeline Components

Detailed API reference for QueryRewriter, SubQueryGenerator, MetadataExtractor, WeaviateRetriever, and Metrics

Dataset Pipelines

Per-dataset pipeline classes, metadata schemas, and configuration for all five QA benchmarks

Running Optimizers

Step-by-step guide to running each optimizer with the right configuration

Adding a Dataset

Extend DSPy-Opt to a new dataset by following the established pattern
