DSPy-Opt organizes retrieval-augmented generation as a deterministic five-stage pipeline. Each stage is a composable DSPy module with learnable prompts, meaning optimizers can tune stages 1, 2, 3, and 5 automatically, improving retrieval quality and answer accuracy without manual prompt engineering. The pipeline accepts a plain-text question and returns a structured prediction containing the final answer, the chain-of-thought reasoning, the rewritten query, the generated sub-queries, and the full list of retrieved passages.
Pipeline Overview
The five stages execute sequentially inside FreshQARAG.forward(). Retrieval in stage 4 fans out across the main rewritten query and every sub-query, then collapses back into a single deduplicated passage list that feeds stage 5.
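The fan-out and collapse can be sketched in plain Python. This is a minimal illustration, not the project's code: `merge_passages`, `fake_retrieve`, and the in-memory index are names invented here, and `retrieve` stands in for whatever call actually queries the vector store. Only the use of `dict.fromkeys()` for order-preserving deduplication and the fallback string are taken from this page.

```python
from typing import Callable, Iterable, List

NO_CONTEXT = "No relevant context found in the knowledge base."


def merge_passages(
    queries: Iterable[str],
    retrieve: Callable[[str], List[str]],
) -> List[str]:
    """Run `retrieve` once per query and return one deduplicated passage list."""
    all_passages: List[str] = []
    for query in queries:
        all_passages.extend(retrieve(query))
    # dict.fromkeys keeps the first occurrence of each passage, in order.
    unique = list(dict.fromkeys(all_passages))
    # Guarantee a non-empty context list for the answer stage.
    return unique or [NO_CONTEXT]


# Hypothetical in-memory retriever, used only to demonstrate the merge.
_FAKE_INDEX = {
    "tesla model 3 range": ["p1", "p2"],
    "tesla model 3 battery": ["p2", "p3"],
}


def fake_retrieve(query: str) -> List[str]:
    return _FAKE_INDEX.get(query, [])


merged = merge_passages(
    ["tesla model 3 range", "tesla model 3 battery"], fake_retrieve
)
empty = merge_passages(["unknown topic"], fake_retrieve)
```

Here `merged` keeps `p2` only once, at the position where it first appeared, and `empty` contains just the placeholder string, so the answer stage never receives an empty list.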
Query Rewriting
The raw user question is passed to QueryRewriter, which uses dspy.ChainOfThought(QueryRewriteSignature) by default (switchable to dspy.Predict via use_chain_of_thought=False) to produce a search-optimized string. The signature instructs the model to expand the query with relevant synonyms and concepts, clarify ambiguous terms, remove conversational noise such as “I want” or “looking for”, preserve key entities and numerical constraints, and keep the result between 5 and 15 words.
Sub-Query Generation
The rewritten query is passed to SubQueryGenerator, which uses dspy.ChainOfThought(SubQuerySignature) to decompose multi-faceted questions into 2–5 focused sub-queries for parallel retrieval. The target number of sub-queries is determined automatically by _determine_complexity(), which considers comparative keywords (compare, vs, versus), conjunctions (and, &), query length, and punctuation. Each generated sub-query must be self-contained and 5–12 words long.
If JSON parsing fails or the model returns fewer sub-queries than min_subqueries, SubQueryGenerator falls back to a single simplified rewrite of the original query with common stop words removed. This ensures retrieval always proceeds even when decomposition fails.
Metadata Extraction
MetadataExtractor calls a dedicated extractor_llm (separate from the answer LLM) via dspy.Predict(ExtractMetadataSignature) inside a dspy.context(lm=self.extractor_llm) block. It extracts structured fields defined in a user-provided JSON schema (for example, title and category in the FreshQA config) and returns only the non-null fields as a plain Python dictionary. Extraction runs once for the main rewritten query and once for each sub-query, yielding per-query metadata dictionaries.
The schema is validated before each call: every property must use one of the allowed types (string, number, boolean), and enum is restricted to string fields. Any extraction failure returns an empty dict {} so the pipeline degrades gracefully.
The extractor is intentionally instructed not to use placeholders like "Unknown" or "N/A". Only fields explicitly stated in the input text are populated, making the resulting filter predicates reliable rather than noisy.
Document Retrieval
WeaviateRetriever performs hybrid search, combining dense vector similarity with keyword-based BM25, against a named Weaviate collection. It accepts an optional precomputed embedding vector and an optional metadata filter. The filter is built from the extracted metadata dictionary: only keys present in the metadata_schema passed to the retriever are translated into Weaviate Filter predicates via metadata_to_weaviate_filter().
Retrieval is called once for the main rewritten query and once per sub-query. All passage lists are concatenated, then deduplicated with dict.fromkeys() to preserve insertion order. If no passages are retrieved, the list falls back to ["No relevant context found in the knowledge base."] so that stage 5 always receives a non-empty context list.
Answer Generation
The deduplicated passage list and the original question are fed into dspy.ChainOfThought(FreshQAAnswerSignature). This produces four output fields: rewritten_query, sub_queries, answer, and reasoning. The reasoning field exposes the model’s chain-of-thought: how it synthesized the retrieved passages into a final answer.
The pipeline’s forward() method wraps stage 5’s output in a dspy.Prediction that exposes all intermediate state (question, rewritten_query, sub_queries, retrieved_context, answer, and reasoning) so that downstream evaluation and optimization have full visibility into how the answer was produced.
Complete forward() Method
The full FreshQARAG.forward() method ties all five stages together. A top-level try/except provides a fallback path if any stage raises an unhandled exception: the pipeline generates an answer directly from "Limited context available" rather than crashing.
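The control flow described above can be sketched without DSPy or Weaviate by injecting each stage as a plain callable. This is an illustrative stand-in under stated assumptions, not the actual FreshQARAG source: `SketchRAG`, `PipelineResult`, and the stub stages are hypothetical names, the result is a dataclass rather than a dspy.Prediction, and the values used for `rewritten_query` and `sub_queries` on the fallback path are guesses. Only the stage order, the per-query fan-out, the deduplication, and the "Limited context available" fallback come from this page.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class PipelineResult:
    """Plain stand-in for the dspy.Prediction the real pipeline returns."""
    question: str
    rewritten_query: str
    sub_queries: List[str]
    retrieved_context: List[str]
    answer: str
    reasoning: str


class SketchRAG:
    """Five stages injected as callables so the control flow can be tested."""

    def __init__(
        self,
        rewrite: Callable[[str], str],
        make_sub_queries: Callable[[str], List[str]],
        extract_metadata: Callable[[str], Dict[str, object]],
        retrieve: Callable[[str, Dict[str, object]], List[str]],
        generate_answer: Callable[[List[str], str], Tuple[str, str]],
    ) -> None:
        self.rewrite = rewrite
        self.make_sub_queries = make_sub_queries
        self.extract_metadata = extract_metadata
        self.retrieve = retrieve
        self.generate_answer = generate_answer

    def forward(self, question: str) -> PipelineResult:
        try:
            rewritten = self.rewrite(question)                    # stage 1
            sub_queries = self.make_sub_queries(rewritten)        # stage 2
            passages: List[str] = []
            for query in [rewritten, *sub_queries]:
                metadata = self.extract_metadata(query)           # stage 3
                passages.extend(self.retrieve(query, metadata))   # stage 4
            # Order-preserving deduplication with a non-empty fallback.
            context = list(dict.fromkeys(passages)) or [
                "No relevant context found in the knowledge base."
            ]
            answer, reasoning = self.generate_answer(context, question)  # stage 5
            return PipelineResult(
                question, rewritten, sub_queries, context, answer, reasoning
            )
        except Exception:
            # Fallback path: answer from a placeholder context instead of crashing.
            # Assumption: the original question is reported as the rewritten query.
            context = ["Limited context available"]
            answer, reasoning = self.generate_answer(context, question)
            return PipelineResult(question, question, [], context, answer, reasoning)


def echo_answer(context: List[str], question: str) -> Tuple[str, str]:
    return f"Answer from {len(context)} passage(s)", "stub reasoning"


def broken_retrieve(query: str, metadata: Dict[str, object]) -> List[str]:
    raise ConnectionError("vector store unreachable")


stages = dict(
    rewrite=str.lower,
    make_sub_queries=lambda q: [q + " price", q + " range"],
    extract_metadata=lambda q: {},
    generate_answer=echo_answer,
)
ok = SketchRAG(
    retrieve=lambda q, m: ["shared passage", f"passage for {q}"], **stages
).forward("Tesla Model 3")
degraded = SketchRAG(retrieve=broken_retrieve, **stages).forward("Tesla Model 3")
```

In this sketch `ok.retrieved_context` holds four unique passages even though three retrieval calls returned six, and `degraded` still carries an answer despite the retriever raising. One consequence of this shape is that a failure inside the answer stage itself would also fail on the fallback path; whether the real method guards against that is not stated here.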
Which Stages Are Optimizable
Stages 1, 2, 3, and 5 each contain DSPy modules with learnable prompts:

| Stage | Component | Implementation | Optimizable |
|---|---|---|---|
| 1 | QueryRewriter | dspy.ChainOfThought(QueryRewriteSignature) | ✅ Yes |
| 2 | SubQueryGenerator | dspy.ChainOfThought(SubQuerySignature) | ✅ Yes |
| 3 | MetadataExtractor | dspy.Predict(ExtractMetadataSignature) | ✅ Yes |
| 4 | WeaviateRetriever | Deterministic hybrid search | ❌ No |
| 5 | generate_answer | dspy.ChainOfThought(FreshQAAnswerSignature) | ✅ Yes |

Stage 4 (WeaviateRetriever) is a deterministic database call with no learnable parameters. All other stages expose their instruction text and few-shot slots to the DSPy optimizer during compilation.