Semantic search using vector embeddings. Finds documents by meaning rather than exact keywords.

Usage

qmd vsearch [options] <query>
  • query (string, required): Natural language query

Options

  • -c, --collection (string): Filter by collection name (can be specified multiple times). Defaults to all included collections.
  • -n (number, default: 5 (CLI), 20 (JSON/files)): Maximum number of results
  • --all (boolean, default: false): Return all matches (use with --min-score)
  • --min-score (number, default: 0.3): Minimum similarity score threshold (0-1)
  • --full (boolean, default: false): Show full document instead of snippet
  • --line-numbers (boolean, default: false): Add line numbers to output
  • --json (boolean, default: false): Output as JSON
  • --csv (boolean, default: false): Output as CSV
  • --md (boolean, default: false): Output as Markdown
  • --xml (boolean, default: false): Output as XML
  • --files (boolean, default: false): Output file paths only (with docid and score)
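
The interaction between -n, --all, and --min-score can be pictured with a small sketch. This is only one plausible reading of the option descriptions above, not qmd's actual implementation: the function names, the in-memory docs mapping, and the filter-then-truncate order are all assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def select_results(query_vec, docs, n=5, min_score=0.3, return_all=False):
    """Hypothetical helper: score, drop weak matches, then apply the limit.

    docs maps a document id to its embedding vector (an assumed layout).
    """
    scored = [(doc_id, cosine_similarity(query_vec, vec))
              for doc_id, vec in docs.items()]
    # --min-score: keep only matches at or above the threshold
    kept = [(doc_id, score) for doc_id, score in scored if score >= min_score]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    # --all: skip the -n limit and return every match that passed the threshold
    return kept if return_all else kept[:n]


docs = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
top_one = select_results([1.0, 0.0], docs, n=1)
everything = select_results([1.0, 0.0], docs, return_all=True)
```

Under this reading, --all only removes the result cap, so the threshold still decides which documents count as matches.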

How It Works

  1. Query embedding: Converts your query to a vector using the embedding model
  2. Vector search: Finds nearest neighbors in the vector index using cosine similarity
  3. Expansion: Automatically generates related queries (lexical, vector, hypothetical)
  4. Fusion: Combines the result lists using Reciprocal Rank Fusion (RRF)
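
To make the fusion step concrete, here is a minimal sketch of Reciprocal Rank Fusion in Python. It assumes each expanded query yields a ranked list of document ids and uses k = 60, a commonly cited constant for RRF; the value qmd actually uses is not stated here, and the function name and file names are hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    Each document earns 1 / (k + rank) from every list it appears in,
    where rank starts at 1. k=60 is an assumed, commonly used constant.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document id for stable output
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical result lists, one per query variant
original = ["auth.md", "login.md", "tokens.md"]
vec_variant = ["login.md", "auth.md", "sessions.md"]
hyde = ["auth.md", "tokens.md", "sessions.md"]

fused = reciprocal_rank_fusion([original, vec_variant, hyde])
for doc_id, score in fused:
    print(f"{doc_id}\t{score:.4f}")
```

Because RRF uses only ranks, lists whose raw scores are on different scales can be merged without normalization.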

Examples

# Semantic search
qmd vsearch "how does authentication work"

# Search specific collection
qmd vsearch "deployment strategies" -c devops

# Higher threshold for quality
qmd vsearch "database design" --min-score 0.5

# Get all semantic matches
qmd vsearch "error handling" --all --min-score 0.3

# JSON output
qmd vsearch "CI/CD pipeline" --json -n 10
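
For scripting, the JSON output can be consumed from another program. The sketch below is a hedged example in Python: it uses only the flags documented above, the helper names are hypothetical, and because the structure of the JSON output is not described on this page, the result is returned as parsed without assuming any field names.

```python
import json
import subprocess


def build_vsearch_argv(query, collections=(), limit=None, min_score=None):
    """Hypothetical helper: build a `qmd vsearch --json` command line."""
    argv = ["qmd", "vsearch", "--json"]
    for name in collections:
        argv += ["-c", name]  # -c may be repeated, once per collection
    if limit is not None:
        argv += ["-n", str(limit)]
    if min_score is not None:
        argv += ["--min-score", str(min_score)]
    argv.append(query)  # options first, then the query, as in Usage
    return argv


def run_vsearch(query, **options):
    """Run qmd and parse its stdout as JSON (requires qmd on PATH)."""
    completed = subprocess.run(
        build_vsearch_argv(query, **options),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return json.loads(completed.stdout)


print(build_vsearch_argv("CI/CD pipeline", limit=10))
```

Passing the arguments as a list avoids shell quoting problems with queries that contain spaces or slashes.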

Query Expansion

Vector search automatically expands your query into multiple forms:
  • lex: Keyword-optimized version for BM25 search
  • vec: Semantic variations for vector search
  • hyde: Hypothetical document text that would answer the query
Example expansion for “how does auth work”:
├─ how does auth work
├─ lex: authentication authorization security login
├─ vec: authentication workflow authorization flow
└─ hyde: Authentication uses JWT tokens stored in cookies...

Embedding Model

Default model: embeddinggemma (google/gemma-2-2b-it-qn_k_m from Hugging Face)
  • Dimensions: 2048
  • Chunking: 900 tokens/chunk with 15% overlap
  • Boundaries: Prefers markdown headings
Run qmd embed to generate embeddings before using vector search.
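
The chunking figures above imply a sliding window over each document. The following sketch shows the arithmetic only, assuming a fixed-size window over a pre-tokenized list: 15% of 900 tokens is 135 tokens of overlap, so each new chunk starts 765 tokens after the previous one. It deliberately ignores the preference for markdown heading boundaries, and the function name is hypothetical, not part of qmd.

```python
def chunk_tokens(tokens, chunk_size=900, overlap_ratio=0.15):
    """Split a token list into fixed-size chunks that share an overlap.

    Simplified illustration: no attempt is made to align chunk
    boundaries with markdown headings.
    """
    overlap = round(chunk_size * overlap_ratio)  # 135 tokens at the defaults
    step = chunk_size - overlap                  # 765 tokens at the defaults
    chunks = []
    start = 0
    while start < len(tokens):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this chunk already reaches the end of the document
        start += step
    return chunks


tokens = list(range(2000))  # stand-in for a tokenized document
chunks = chunk_tokens(tokens)
print([len(chunk) for chunk in chunks])
```

The overlap means text near a chunk boundary appears in both neighboring chunks, which reduces the chance that a relevant passage is split across two embeddings.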

Performance

Vector search is slower than BM25 but finds semantically similar content. Approximate timings:
  • Embedding query: ~50-200ms
  • Vector search: ~10-50ms
  • Total: ~100-300ms (without expansion)
For best results, use qmd query which combines vector search with BM25 and reranking.

See also

  • qmd query: Hybrid search with reranking (recommended)
  • qmd search: Fast BM25 keyword search
  • qmd embed: Generate vector embeddings
  • Vector search guide: Learn more about embeddings
