## Usage
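A sketch of the invocation, assuming only the `--force` option documented below:

```sh
qmd embed [--force]
```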
## Options

- `--force`: Force re-embedding of all documents (clears existing vectors)
## How It Works

1. **Find documents** — identifies unique content hashes needing embeddings
2. **Chunk documents** — splits large documents into 900-token chunks with 15% overlap
3. **Generate embeddings** — processes chunks in batches using the embedding model
4. **Store vectors** — saves embeddings to a sqlite-vec table for fast similarity search
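The pipeline above can be sketched roughly as follows. All names here (`embed_pending`, `embed_batch`, the dict-based `store`) are illustrative stand-ins, not qmd's actual internals:

```python
def embed_pending(documents, store, embed_batch, max_tokens=900, batch_size=32):
    """Sketch of the embed pipeline: `documents` maps content hash -> text,
    `store` maps content hash -> list of vectors, and `embed_batch` stands in
    for the real embedding model (all hypothetical names, not qmd internals)."""
    # 1. Find documents: unique content hashes not yet embedded
    pending = {h: text for h, text in documents.items() if h not in store}
    # 2. Chunk: whitespace "tokens" stand in for the real tokenizer,
    #    and the 15% overlap is omitted for brevity
    chunks = []
    for h, text in pending.items():
        words = text.split()
        for i in range(0, max(len(words), 1), max_tokens):
            chunks.append((h, " ".join(words[i:i + max_tokens])))
    # 3. Generate embeddings batch by batch
    for i in range(0, len(chunks), batch_size):
        batch = chunks[i:i + batch_size]
        vectors = embed_batch([text for _, text in batch])
        # 4. Store each vector under its document's content hash
        for (h, _), vec in zip(batch, vectors):
            store.setdefault(h, []).append(vec)
    return len(chunks)
```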
## Chunking Strategy

- **Max tokens**: 900 per chunk
- **Overlap**: 15% (135 tokens)
- **Boundaries**: prefers Markdown headings as split points
- **Multi-chunk docs**: large documents are split into multiple chunks
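The windowing arithmetic implied by these numbers can be sketched as below. This is a minimal illustration only: it slides a fixed window over a token list and ignores the heading-boundary preference.

```python
def chunk_tokens(tokens, max_tokens=900, overlap=0.15):
    # Stride = chunk size minus the overlap carried into the next chunk.
    # With the defaults: 900 - int(900 * 0.15) = 900 - 135 = 765 tokens per step.
    step = max_tokens - int(max_tokens * overlap)
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + max_tokens])
        if start + max_tokens >= len(tokens):
            break
    return chunks
```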
## Embedding Model

Default: `embeddinggemma` (`google/gemma-2-2b-it-qn_k_m` from Hugging Face)

- **Dimensions**: 2048
- **Format**: GGUF quantized (`qn_k_m`)
- **Storage**: `~/.cache/qmd/models/`
## Examples
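Assuming only the `--force` option documented on this page, typical invocations would look like:

```sh
# Embed any documents that do not yet have vectors
qmd embed

# Clear existing vectors and re-embed everything
qmd embed --force
```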
## Output

Shows progress with:

- Total documents and chunks
- Progress bar with percentage
- Throughput (MB/s)
- ETA
- Error count (if any)
## Performance

Typical throughput:

- GPU (CUDA/Metal/Vulkan): 500 KB–2 MB/s
- CPU only: 50–200 KB/s
## When to Run

Run `qmd embed` after:

- **Adding collections** — `qmd collection add`
- **Updating content** — `qmd update`
- **First install** — before using vector/hybrid search
## Force Re-embedding

Use `--force` when:

- Switching embedding models
- Fixing corrupted embeddings
- Changing chunking parameters (requires a code rebuild)
## Deduplication

Embeddings are stored by content hash:

- Identical content is embedded only once
- Saves time and storage
- Chunks are shared across documents
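The hash-keyed lookup can be illustrated like this. Note that SHA-256 is an assumption for illustration; the hash function qmd actually uses is not specified on this page:

```python
import hashlib

def content_hash(text):
    # Assumption: sha256 shown for illustration; qmd's real hash may differ.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def needs_embedding(docs, embedded_hashes):
    """Return (hash, text) pairs for content not yet embedded,
    embedding identical content only once."""
    seen = set(embedded_hashes)
    todo = []
    for text in docs:
        h = content_hash(text)
        if h not in seen:  # duplicate content shares one stored embedding
            seen.add(h)
            todo.append((h, text))
    return todo
```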
## Related Commands

- `qmd query` — Hybrid search (uses embeddings)
- `qmd vsearch` — Vector semantic search
- `qmd update` — Re-index collections
- `qmd status` — Check embedding coverage