SubQueryGenerator is a dspy.Module that tackles multi-faceted questions by splitting them into several focused sub-queries, each designed to be retrieved independently. Rather than sending a long, ambiguous question directly to the search layer, the generator identifies distinct aspects, comparisons, and constraints within the original query and produces a JSON array of targeted sub-queries, one for each retrievable concept. The module first estimates how many sub-queries are needed using linguistic heuristics, then uses dspy.ChainOfThought to produce them.
Signature
The module is driven by SubQuerySignature, which defines the three fields passed to the LLM:
| Field | Type | Description |
|---|---|---|
| original_query | InputField | The user’s original complex search query. The LLM must identify distinct aspects, entities, and constraints that each warrant a separate search. |
| num_subqueries | InputField | Target number of sub-queries to generate (typically 2–5). Scaled by complexity: 2 for simple, 3 for medium, 4–5 for highly complex queries. |
| sub_queries | OutputField | A JSON array of optimized sub-query strings. Each must be self-contained, address a distinct aspect of the original query, preserve all critical constraints, and be 5–12 words long. Output must be valid JSON with no additional text. |
Constructor
min_subqueries: Minimum number of sub-queries to generate. If the LLM produces fewer than this threshold, the module falls back to a single expanded query derived from the original. Internally clamped to min(2, min_subqueries).
max_subqueries: Maximum number of sub-queries to generate. Any excess sub-queries from the LLM output are truncated. Internally clamped to max(5, max_subqueries).
Methods
forward
The original complex search query to decompose.
Optional override for the number of sub-queries to generate. When omitted, the count is determined automatically by _determine_complexity. The value is always clamped to [min_subqueries, max_subqueries].
Returns a dspy.Prediction containing:
- sub_queries: List[str] of optimized sub-query strings
- rationale: the reasoning steps used to decompose the query (from dspy.ChainOfThought)
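The count resolution and output handling described for forward can be sketched in plain Python. This is a minimal illustration, not the module's actual code: the helper names (resolve_count, fallback_rewrite, postprocess) and the default bounds of 2 and 5 are assumptions, and treating unparseable JSON the same as "too few sub-queries" is also an assumption. The stop-word list is the one quoted in the Error handling section.

```python
import json
from typing import List, Optional

# Stop-words quoted in the Error handling section of this page.
STOP_WORDS = {"how", "what", "why", "i", "me", "my"}


def resolve_count(override: Optional[int], estimated: int,
                  min_subqueries: int = 2, max_subqueries: int = 5) -> int:
    """Use the override when given, else the estimate; clamp either to the bounds."""
    requested = estimated if override is None else override
    return max(min_subqueries, min(max_subqueries, requested))


def fallback_rewrite(query: str) -> str:
    """Hypothetical stand-in for _fallback_rewrite: drop conversational stop-words."""
    kept = [w for w in query.split() if w.lower().strip("?.,!") not in STOP_WORDS]
    return " ".join(kept) if kept else query


def postprocess(raw_output: str, original_query: str,
                min_subqueries: int = 2, max_subqueries: int = 5) -> List[str]:
    """Parse the LLM's JSON array, truncate it, or fall back to one rewritten query."""
    try:
        parsed = json.loads(raw_output)
    except json.JSONDecodeError:
        parsed = []  # ASSUMPTION: invalid JSON is handled like an empty result
    if not isinstance(parsed, list):
        parsed = []
    # Keep only non-empty strings.
    sub_queries = [s.strip() for s in parsed if isinstance(s, str) and s.strip()]
    if len(sub_queries) < min_subqueries:
        return [fallback_rewrite(original_query)]
    return sub_queries[:max_subqueries]
```

Keeping parsing separate from the LLM call makes the truncation and fallback paths easy to unit-test without a model.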
batch_generate
Decomposes several queries in one call by invoking forward on each.
A list of complex search queries to decompose.
Returns List[List[str]]: a list of sub-query lists, one per input query.
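As a rough sketch of the batch behaviour, the function below maps a single-query generator over a list of queries. The names batch_generate_sketch and split_on_and are hypothetical, and the toy generator only stands in for the real LLM-backed forward call.

```python
from typing import Callable, List


def batch_generate_sketch(queries: List[str],
                          generate_one: Callable[[str], List[str]]) -> List[List[str]]:
    """Return one list of sub-queries per input query, in input order."""
    return [generate_one(query) for query in queries]


def split_on_and(query: str) -> List[str]:
    """Toy generator used only for illustration: split on the word 'and'."""
    return [part.strip() for part in query.split(" and ")]


results = batch_generate_sketch(["indexing and sharding", "vector search"], split_on_and)
```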
_determine_complexity
Estimates how many sub-queries a query needs from the following linguistic signals:
| Signal | Check |
|---|---|
| Comparison language | "compare", "versus", "vs", "difference" present |
| Conjunction language | "and", "&", "also" present |
| Query length | More than 10 words |
| List punctuation | :, ;, or , present |
The resulting count is clamped to [min_subqueries, max_subqueries].
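One way the four signals in the table could be turned into a count is sketched below. The scoring rule (start at the minimum and add one per signal), the whole-word matching, the function name estimate_subquery_count, and the default bounds of 2 and 5 are all assumptions made for illustration. This page does not state how the module weighs the signals.

```python
import re

# Keyword groups taken from the signal table.
COMPARISON_TERMS = ("compare", "versus", "vs", "difference")
CONJUNCTION_TERMS = ("and", "&", "also")


def estimate_subquery_count(query: str, min_subqueries: int = 2,
                            max_subqueries: int = 5) -> int:
    """Estimate a sub-query count from the four documented signals."""
    # Whole-word tokens, so "brand" does not count as "and" (an assumption).
    words = re.findall(r"[a-z0-9&]+", query.lower())

    signals = [
        any(term in words for term in COMPARISON_TERMS),   # comparison language
        any(term in words for term in CONJUNCTION_TERMS),  # conjunction language
        len(query.split()) > 10,                           # query length
        any(mark in query for mark in ":;,"),              # list punctuation
    ]

    # ASSUMPTION: one extra sub-query per signal, starting from the minimum.
    count = min_subqueries + sum(signals)
    return max(min_subqueries, min(max_subqueries, count))
```

Under these assumptions a short keyword query stays at the minimum, while a long comparison with list punctuation saturates at the maximum.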
Error handling
SubQueryGenerator is designed to never crash the pipeline:
| Scenario | Behavior |
|---|---|
| LLM call succeeds, output is valid JSON list of strings | Normal path: sub-queries are returned, truncated to max_subqueries if needed. |
| LLM call succeeds, but fewer sub-queries than min_subqueries | Falls back to a single expanded query produced by _fallback_rewrite, which strips conversational stop-words ("how", "what", "why", "i", "me", "my") from the original. |
Usage
All sub-queries produced by this module are designed for independent parallel retrieval. Feed the full result.sub_queries list into WeaviateRetriever (or any other retriever) to fetch passages for each facet, then merge the results before passing them to the answer generator.
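The fan-out-then-merge pattern can be sketched as follows. The retriever is modelled as a plain callable because WeaviateRetriever's interface is not shown on this page. The name retrieve_and_merge, the in-memory fake_index, and the merge rule (order-preserving de-duplication) are assumptions for illustration only.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def retrieve_and_merge(sub_queries: List[str],
                       retrieve: Callable[[str], List[str]]) -> List[str]:
    """Run one retrieval per sub-query in parallel, then merge without duplicates."""
    with ThreadPoolExecutor(max_workers=max(1, len(sub_queries))) as pool:
        # pool.map keeps results in the same order as sub_queries.
        per_query = list(pool.map(retrieve, sub_queries))

    merged: List[str] = []
    seen = set()
    for passages in per_query:
        for passage in passages:
            if passage not in seen:  # drop passages already returned by another facet
                seen.add(passage)
                merged.append(passage)
    return merged


# Hypothetical in-memory stand-in for a real retriever.
fake_index = {
    "postgres read latency": ["p1", "p2"],
    "mysql read latency": ["p2", "p3"],
}
passages = retrieve_and_merge(list(fake_index), lambda q: fake_index.get(q, []))
```

De-duplicating at merge time matters because overlapping sub-queries often return the same passage, which would otherwise take up context in the answer generator twice.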