Usage
The command takes one positional argument: a natural language query.
Options
- Filter by collection name (can be specified multiple times). Defaults to all included collections.
- Maximum number of results
- Return all matches (use with --min-score)
- --min-score: Minimum similarity score threshold (0-1)
- Show the full document instead of a snippet
- Add line numbers to the output
- Output as JSON
- Output as CSV
- Output as Markdown
- Output as XML
- Output file paths only (with docid and score)
How It Works
- Query embedding — Convert your query to a vector using the embedding model
- Vector search — Find nearest neighbors in the vector index using cosine similarity
- Expansion — Automatically generate related queries (lexical, vector, hypothetical)
- Fusion — Combine results using Reciprocal Rank Fusion (RRF)
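The nearest-neighbor step above can be sketched in a few lines. This is an illustrative sketch, not qmd's implementation: the doc IDs and vectors are made up, and a real index uses an approximate-nearest-neighbor structure rather than a linear scan.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, index, k=3):
    """Linear scan over (doc_id, vector) pairs; returns the k best matches."""
    scored = [(doc_id, cosine(query_vec, vec)) for doc_id, vec in index]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]

# Toy 3-dimensional index; real embeddings have hundreds of dimensions.
index = [("auth.md",    [1.0, 0.1, 0.0]),
         ("billing.md", [0.0, 1.0, 0.2]),
         ("deploy.md",  [0.1, 0.0, 1.0])]
results = top_k([0.9, 0.2, 0.1], index, k=2)
```

Here `results` holds the two documents whose embeddings point in the direction closest to the query vector, with their similarity scores.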
Examples
Query Expansion
Vector search automatically expands your query into multiple forms:

- lex: Keyword-optimized version for BM25 search
- vec: Semantic variations for vector search
- hyde: Hypothetical document text that would answer the query
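Results from these expanded query forms can then be fused with Reciprocal Rank Fusion. A minimal sketch, assuming the standard RRF formula score(d) = Σ 1/(k + rank) with the conventional constant k = 60; the result lists are made up, and qmd's actual constants may differ.

```python
def rrf(rankings, k=60):
    """Reciprocal Rank Fusion: each list contributes 1/(k + rank) per document."""
    scores = {}
    for ranked_docs in rankings.values():
        for rank, doc in enumerate(ranked_docs, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical ranked result lists from the three expanded query forms.
fused = rrf({
    "lex":  ["d1", "d2", "d3"],
    "vec":  ["d2", "d1", "d4"],
    "hyde": ["d2", "d3", "d1"],
})
```

A document that ranks highly in several lists (d2 here) outscores one that tops only a single list, which is what makes RRF robust to any one query form misfiring.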
Embedding Model
Default model: embeddinggemma (google/gemma-2-2b-it-qn_k_m from Hugging Face)
- Dimensions: 2048
- Chunking: 900 tokens/chunk with 15% overlap
- Boundaries: Prefers markdown headings
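The chunking numbers above work out to a stride of 765 tokens (900 minus 15% overlap). This sketch shows only the window arithmetic, assuming token counts are already known; the real chunker also prefers markdown-heading boundaries, which this ignores.

```python
def chunk_spans(n_tokens, chunk_size=900, overlap=0.15):
    """Return (start, end) token spans covering n_tokens with overlapping windows."""
    stride = int(chunk_size * (1 - overlap))  # 765 tokens between chunk starts
    spans = []
    start = 0
    while True:
        end = min(start + chunk_size, n_tokens)
        spans.append((start, end))
        if end == n_tokens:
            break
        start += stride
    return spans

# A 2000-token document yields three chunks, each sharing 135 tokens
# (15% of 900) with its neighbor.
spans = chunk_spans(2000)
```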
Run qmd embed to generate embeddings before using vector search.
Performance
Vector search is slower than BM25 but finds semantically similar content:

- Embedding query: ~50-200ms
- Vector search: ~10-50ms
- Total: ~100-300ms (without expansion)
For the best of both, use qmd query, which combines vector search with BM25 and reranking.
Related Commands
- qmd query: Hybrid search with reranking (recommended)
- qmd search: Fast BM25 keyword search
- qmd embed: Generate vector embeddings
- Vector search guide: Learn more about embeddings