| Mode | Speed | Quality | Use Case |
|---|---|---|---|
| search | Fast (~10ms) | Good | Keyword matching, exact terms |
| vsearch | Medium (~100ms) | Better | Semantic similarity, questions |
| query | Slow (~1-2s) | Best | High-quality results, LLM integration |
## search - BM25 Keyword Search
Fast full-text search using SQLite FTS5 with BM25 ranking. No LLM or embedding lookup required.

### How It Works
- Parses the query into FTS5 syntax
- Searches the `documents_fts` table (full-text index)
- Ranks results using the BM25 algorithm
- Returns the top N results (default: 5)
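The steps above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not qmd's actual code: only the `documents_fts` table name comes from the list above; the columns and sample rows are invented here.

```python
import sqlite3

con = sqlite3.connect(":memory:")
# FTS5 virtual table standing in for the real full-text index
con.execute("CREATE VIRTUAL TABLE documents_fts USING fts5(title, body)")
con.executemany(
    "INSERT INTO documents_fts VALUES (?, ?)",
    [
        ("Rate limiting", "Token bucket rate limiter design notes"),
        ("Auth guide", "OAuth 2.0 flows and token refresh"),
        ("Perf tips", "Measuring performance with benchmarks"),
    ],
)
# bm25() returns a rank where lower (more negative) is better,
# so ascending order puts the best match first
rows = con.execute(
    "SELECT title, bm25(documents_fts) AS rank "
    "FROM documents_fts WHERE documents_fts MATCH ? "
    "ORDER BY rank LIMIT 5",
    ('"rate limiter"',),
).fetchall()
print(rows)  # only the rate-limiting doc matches the phrase
```

Because FTS5 and BM25 live inside SQLite itself, there is no model to load, which is why this mode answers in tens of milliseconds.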
### Performance
- Speed: ~10-50ms
- No GPU required
- No embeddings required
### When to Use
#### Keyword matching

When you know specific terms that appear in documents, e.g. `qmd search "rate limiter"`.
#### Exact phrase matching

Use quotes to match an exact phrase, e.g. `"rate limiter"`.
#### Exclusions

Exclude terms with `-`, e.g. `-sports`.

#### Fast preliminary search

When you need instant results and semantic understanding isn’t critical.
### Lex Query Syntax
The `search` command uses lex (lexical) query syntax:
| Syntax | Meaning | Example |
|---|---|---|
| `word` | Prefix match | `perf` matches “performance” |
| `"phrase"` | Exact phrase | `"rate limiter"` |
| `-word` | Exclude term | `-sports` |
| `-"phrase"` | Exclude phrase | `-"test data"` |
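To make the table concrete, here is a rough sketch of how lex tokens could be lowered into an FTS5 `MATCH` expression. The function and the exact lowering rules are illustrative assumptions, not qmd's actual parser:

```python
import shlex

def lex_to_fts5(query: str) -> str:
    """Lower lex tokens into an FTS5 MATCH expression (illustrative)."""
    include, exclude = [], []
    for tok in shlex.split(query, posix=False):  # posix=False keeps the quotes
        neg = tok.startswith("-")
        if neg:
            tok = tok[1:]
        if tok.startswith('"') and tok.endswith('"'):
            term = tok          # exact phrase: quotes pass through to FTS5
        else:
            term = tok + "*"    # bare word: prefix match
        (exclude if neg else include).append(term)
    expr = " AND ".join(include)
    for term in exclude:
        expr += " NOT " + term
    return expr

print(lex_to_fts5('perf "rate limiter" -sports'))
# prints: perf* AND "rate limiter" NOT sports*
```

The trailing `*` is what makes bare words prefix matches, which is also why they can return false positives (see Limitations below).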
### Limitations
- No semantic understanding (“auth” won’t match “login”)
- Requires knowing exact terminology
- Prefix matching can return false positives
## vsearch - Vector Semantic Search
Semantic similarity search using embeddings and cosine distance. Better at understanding intent than keyword search.

### How It Works
- Embeds the query using the embeddinggemma-300M model
- Searches the `vectors_vec` table (vector index)
- Computes cosine distance to all document chunks
- Returns the top N results (default: 5)
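The core of the ranking step is plain cosine distance. A self-contained sketch with toy 3-dimensional vectors (real chunk embeddings come from the embedding model and have hundreds of dimensions; the document names are invented):

```python
import math

def cosine_distance(a, b):
    """1 - cosine similarity; smaller means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (na * nb)

# toy "embeddings" of two document chunks
chunks = {
    "auth-doc": [0.9, 0.1, 0.0],
    "perf-doc": [0.0, 0.2, 0.9],
}
query_vec = [0.8, 0.2, 0.1]  # toy embedding of the user's question

# nearest chunks first, mirroring the top-N retrieval step
ranked = sorted(chunks, key=lambda c: cosine_distance(query_vec, chunks[c]))
print(ranked)  # "auth-doc" comes first: its vector points the same way
```

Because every chunk is compared against the single query embedding, latency grows with index size, which is why a GPU helps mainly with the embedding step rather than the distance computation.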
### Performance
- Speed: ~50-200ms (depends on index size and GPU)
- Requires: embeddings (`qmd embed`)
- GPU: accelerates embedding generation
### When to Use
#### Natural language questions

When searching with questions or descriptions rather than keywords.
#### Semantic similarity

When you don’t know exact terms but understand the concept.
#### Cross-lingual understanding

Embeddings can match concepts across different phrasings.
#### Fast semantic search

When you need semantic understanding but don’t need reranking or query expansion.
### Vec Query Format
Vector queries are plain natural language; there is no special syntax.

### Limitations
- Requires running `qmd embed` first
- No query expansion (single embedding only)
- No LLM reranking (may miss nuance)
- Results not optimized for best-first ordering
## query - Hybrid Search with Reranking
Highest-quality search combining BM25, vector search, query expansion, and LLM reranking. Recommended for most use cases.

### How It Works
The pipeline runs in three stages:

#### 1. Query Expansion
The LLM generates alternative query phrasings using the qmd-query-expansion-1.7B model. The original query gets 2× weight in fusion to preserve exact matches.
#### 2. Parallel Retrieval
Each query variation searches:
- FTS (BM25): Keyword matching
- Vector: Semantic similarity
#### 3. RRF Fusion

Combines all results using Reciprocal Rank Fusion (RRF). Top-ranked documents get a bonus:
- Rank #1: +0.05
- Rank #2-3: +0.02
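The fusion stage above can be sketched as follows. The 2× original-query weight and the rank bonuses mirror the description above; `k=60` is the conventional RRF constant and an assumption here, as is treating each retrieval source as one ranked list:

```python
def rrf_fuse(result_lists, weights, k=60):
    """RRF: score(d) = sum over lists of weight / (k + rank of d in that list)."""
    scores = {}
    for results, w in zip(result_lists, weights):
        for rank, doc in enumerate(results, start=1):
            scores[doc] = scores.get(doc, 0.0) + w / (k + rank)
    # bonus for the top-ranked documents, as described above
    for i, doc in enumerate(sorted(scores, key=scores.get, reverse=True)):
        if i == 0:
            scores[doc] += 0.05   # rank #1
        elif i in (1, 2):
            scores[doc] += 0.02   # ranks #2-3
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# the original query's FTS and vector lists get 2x weight,
# a single expanded-query list gets 1x
fused = rrf_fuse(
    [["a", "b", "c"], ["b", "a", "d"], ["c", "b"]],
    weights=[2.0, 2.0, 1.0],
)
print(fused[0][0])  # "b": it appears near the top of all three lists
```

RRF only needs ranks, not raw scores, which is what lets it combine BM25 and cosine-distance results that live on incompatible scales.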
### Performance
- Speed: ~500ms-2s (depends on GPU and query complexity)
- Requires: embeddings (`qmd embed`) and a GPU for best performance
- Models: 3 GGUF models (~2GB total)
### When to Use
#### Best-quality results

When you need the most relevant results and can afford the latency.
#### LLM integration

When feeding results to an LLM (via MCP or the CLI).
#### Complex questions

When queries need understanding and expansion.
#### Recall-critical searches

When missing relevant documents is worse than higher latency.
### Score Interpretation
| Score | Meaning |
|---|---|
| 0.8 - 1.0 | Highly relevant |
| 0.5 - 0.8 | Moderately relevant |
| 0.2 - 0.5 | Somewhat relevant |
| 0.0 - 0.2 | Low relevance |
Use `--min-score` to filter results below a threshold, e.g. `--min-score 0.5`.
## Choosing the Right Mode
### search
Use when:
- You know exact keywords
- Speed is critical (<50ms)
- No embeddings available
```shell
qmd search "OAuth 2.0"
```

### vsearch
Use when:
- Natural language queries
- Semantic understanding needed
- Fast results (<200ms)
```shell
qmd vsearch "how to authenticate"
```

### query
Use when:
- Best quality required
- LLM integration
- Complex questions
```shell
qmd query "authentication best practices"
```

## Common Options
All search modes support a common set of options.

## Advanced: Structured Queries
For maximum control, use Query Syntax to specify multiple query types.

## Related
- Query Syntax - Advanced query document format
- Embeddings - How vector search works
- MCP Server - Using search modes via MCP
- CLI Reference - Search command documentation