Usage
Natural language query or structured query document
Options
Filter by collection name (can be specified multiple times). Defaults to all included collections.
Maximum number of results
Return all matches (use with --min-score)
Minimum score threshold (0-1)
Show full document instead of best chunk
Add line numbers to output
Output as JSON
Output as CSV
Output as Markdown
Output as XML
Output file paths only (with docid and score)
How It Works
Automatic Mode (recommended)
Single-line natural language query:
- BM25 check: fast keyword search for strong signals
- Query expansion: generate lex/vec/hyde variations if needed
- Hybrid search: combine BM25 and vector results using RRF (reciprocal rank fusion)
- Neural reranking: score all chunks with a cross-encoder model
- Best chunks: return the highest-scoring chunk per document
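The hybrid search step can be illustrated with a minimal sketch of reciprocal rank fusion. This is not QMD's implementation: the function name `rrf_fuse`, the constant `k=60` (a commonly used default for RRF), and the file names are assumptions chosen for illustration.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists of document ids with reciprocal rank fusion.

    Each document receives 1 / (k + rank) from every list it appears in.
    k=60 is a common default and an assumption here, not a QMD setting.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document id.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from the keyword and vector searches.
bm25_hits = ["a.md", "b.md", "c.md"]
vector_hits = ["b.md", "d.md", "a.md"]

fused = rrf_fuse([bm25_hits, vector_hits])
print(fused)
```

Documents that rank well in both lists rise to the top, and because only ranks are used, BM25 scores and vector similarities never need to be put on a common scale.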
Structured Mode (advanced)
Multi-line query with an explicit strategy per line. Each line starts with one of the prefixes lex:, vec:, or hyde:
- lex: BM25 keyword search (supports phrases, proximity, negation)
- vec: Vector semantic search
- hyde: Hypothetical document (generates text that would answer the query)
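As a rough illustration of the line-per-strategy format described above, the following sketch splits a structured query document into (strategy, text) pairs. The function `parse_structured_query` and the sample queries are hypothetical, and the sketch does not attempt to cover QMD's full query grammar (phrases, proximity, negation).

```python
STRATEGIES = ("lex", "vec", "hyde")


def parse_structured_query(text):
    """Split a multi-line query into (strategy, query) pairs.

    Hypothetical helper: accepts only lines of the form 'prefix: text'
    where prefix is lex, vec, or hyde. Blank lines are ignored.
    """
    parts = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        prefix, sep, rest = line.partition(":")
        prefix, rest = prefix.strip(), rest.strip()
        if not sep or prefix not in STRATEGIES or not rest:
            raise ValueError(f"expected a lex:, vec:, or hyde: line, got {line!r}")
        parts.append((prefix, rest))
    return parts


# Illustrative query document, one strategy per line.
example = """
lex: rate limit retry
vec: how do we handle throttled API calls
hyde: Requests that exceed the quota are retried with exponential backoff.
"""

parsed = parse_structured_query(example)
for strategy, query in parsed:
    print(strategy, "->", query)
```

Each pair can then be routed to the matching search: lex lines to BM25, vec lines to vector search, and hyde lines to hypothetical-document search.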
Query Syntax Grammar
Examples
Automatic Queries
Structured Queries
Query Expansion
When using automatic mode, QMD generates multiple query variations (lex, vec, and hyde) if needed.
Reranking Model
Default model: qwen3-reranker (Qwen/Qwen2.5-3B-Instruct-qn_k_m from Hugging Face)
- Cross-encoder scoring for relevance
- Processes all chunks from top documents
- Selects best chunk per document
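The last step, selecting the best chunk per document, could look roughly like the sketch below. It assumes the cross-encoder has already produced one relevance score per chunk; the function name, the tuple layout, and the scores are made up for illustration and do not come from QMD.

```python
def best_chunk_per_document(scored_chunks):
    """Keep the highest-scoring chunk for each document.

    scored_chunks: iterable of (doc_id, chunk_text, score) tuples, where
    score is assumed to be the reranker's relevance score for that chunk.
    Returns one (doc_id, chunk_text, score) tuple per document, best first.
    """
    best = {}
    for doc_id, chunk_text, score in scored_chunks:
        if doc_id not in best or score > best[doc_id][1]:
            best[doc_id] = (chunk_text, score)
    rows = [(doc_id, text, score) for doc_id, (text, score) in best.items()]
    return sorted(rows, key=lambda row: row[2], reverse=True)


# Illustrative reranker output; the scores are invented.
chunks = [
    ("guide.md", "Install the CLI", 0.21),
    ("guide.md", "Retry with backoff", 0.88),
    ("faq.md", "Why are requests throttled?", 0.64),
]

results = best_chunk_per_document(chunks)
for doc_id, text, score in results:
    print(f"{score:.2f}  {doc_id}  {text}")
```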
Performance
Typical query times:
- Strong BM25 signal: 50-100ms (skips expansion)
- With expansion: 500ms-2s (embedding + reranking)
- Large result sets: 2-5s (more chunks to rerank)
- Query expansion
- Embedding generation
- Reranking
Related Commands
qmd search
Fast BM25 keyword search only
qmd vsearch
Vector semantic search only
qmd embed
Generate embeddings
Search guide
Learn search strategies