Usage
The command takes one positional argument: a natural language query.
Options
- Filter by collection name (can be specified multiple times). Defaults to all included collections.
- Maximum number of results
- Return all matches (use with --min-score)
- --min-score: Minimum similarity score threshold (0-1)
- Show the full document instead of a snippet
- Add line numbers to the output
- Output as JSON
- Output as CSV
- Output as Markdown
- Output as XML
- Output file paths only (with docid and score)
How It Works
- Query embedding — Convert your query to a vector using the embedding model
- Vector search — Find nearest neighbors in the vector index using cosine similarity
- Expansion — Automatically generate related queries (lexical, vector, hypothetical)
- Fusion — Combine results using Reciprocal Rank Fusion (RRF)
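The nearest-neighbor step above can be sketched in a few lines. This is an illustrative sketch, not qmd's implementation: the doc IDs and vectors are made up, and a real index uses an approximate-nearest-neighbor structure rather than a linear scan.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, index, k=3):
    """Linear scan over (doc_id, vector) pairs; returns the k best matches."""
    scored = [(doc_id, cosine(query_vec, vec)) for doc_id, vec in index]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]

# Toy 3-dimensional index; real embeddings have hundreds of dimensions.
index = [("auth.md",    [1.0, 0.1, 0.0]),
         ("billing.md", [0.0, 1.0, 0.2]),
         ("deploy.md",  [0.1, 0.0, 1.0])]
results = top_k([0.9, 0.2, 0.1], index, k=2)
```

Here `results` holds the two documents whose embeddings point in the direction closest to the query vector, with their similarity scores.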
Examples
Query Expansion
Vector search automatically expands your query into multiple forms:

- lex: Keyword-optimized version for BM25 search
- vec: Semantic variations for vector search
- hyde: Hypothetical document text that would answer the query
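Results from these expanded query forms can then be fused with Reciprocal Rank Fusion. A minimal sketch, assuming the standard RRF formula score(d) = Σ 1/(k + rank) with the conventional constant k = 60; the result lists are made up, and qmd's actual constants may differ.

```python
def rrf(rankings, k=60):
    """Reciprocal Rank Fusion: each list contributes 1/(k + rank) per document."""
    scores = {}
    for ranked_docs in rankings.values():
        for rank, doc in enumerate(ranked_docs, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical ranked result lists from the three expanded query forms.
fused = rrf({
    "lex":  ["d1", "d2", "d3"],
    "vec":  ["d2", "d1", "d4"],
    "hyde": ["d2", "d3", "d1"],
})
```

A document that ranks highly in several lists (d2 here) outscores one that tops only a single list, which is what makes RRF robust to any one query form misfiring.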
Embedding Model
Default model: embeddinggemma (google/gemma-2-2b-it-qn_k_m from Hugging Face)
- Dimensions: 2048
- Chunking: 900 tokens/chunk with 15% overlap
- Boundaries: Prefers markdown headings
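The chunking numbers above work out to a stride of 765 tokens (900 minus 15% overlap). This sketch shows only the window arithmetic, assuming token counts are already known; the real chunker also prefers markdown-heading boundaries, which this ignores.

```python
def chunk_spans(n_tokens, chunk_size=900, overlap=0.15):
    """Return (start, end) token spans covering n_tokens with overlapping windows."""
    stride = int(chunk_size * (1 - overlap))  # 765 tokens between chunk starts
    spans = []
    start = 0
    while True:
        end = min(start + chunk_size, n_tokens)
        spans.append((start, end))
        if end == n_tokens:
            break
        start += stride
    return spans

# A 2000-token document yields three chunks, each sharing 135 tokens
# (15% of 900) with its neighbor.
spans = chunk_spans(2000)
```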
Run qmd embed to generate embeddings before using vector search.
Performance
Vector search is slower than BM25 but finds semantically similar content:

- Embedding query: ~50-200ms
- Vector search: ~10-50ms
- Total: ~100-300ms (without expansion)
For the best of both, use qmd query, which combines vector search with BM25 and reranking.
Related Commands
- qmd query: Hybrid search with reranking (recommended)
- qmd search: Fast BM25 keyword search
- qmd embed: Generate vector embeddings
- Vector search guide: Learn more about embeddings