SQLite Schema
The index is stored in ~/.cache/qmd/index.sqlite with the following structure:
Core Tables
FTS5 Virtual Table
QMD uses SQLite’s FTS5 extension for full-text search with BM25 ranking. The porter tokenizer applies Porter stemming, and unicode61 provides Unicode-aware tokenization.
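As a rough illustration of how an FTS5 table with this tokenizer behaves, here is a minimal sketch using Python's standard sqlite3 module. The table name docs_fts and its columns are assumptions made for the example, not QMD's actual schema; only the tokenizer string comes from the description above. It also assumes the local SQLite build includes FTS5.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table and column names; only the tokenizer string
# ('porter unicode61') is taken from the description above.
conn.execute(
    "CREATE VIRTUAL TABLE docs_fts "
    "USING fts5(title, body, tokenize='porter unicode61')"
)
conn.executemany(
    "INSERT INTO docs_fts (title, body) VALUES (?, ?)",
    [
        ("Indexing notes", "The indexer is running over every collection."),
        ("Search notes", "Queries are ranked with BM25."),
    ],
)

# Porter stemming reduces both "runs" and "running" to the same stem,
# so the query term "runs" matches the stored word "running".
# bm25() yields smaller values for better matches, so sort ascending.
rows = conn.execute(
    "SELECT title, bm25(docs_fts) AS score FROM docs_fts "
    "WHERE docs_fts MATCH ? ORDER BY score",
    ("runs",),
).fetchall()
titles = [title for title, _score in rows]
print(titles)
```

Without the porter wrapper, the same query would only match the exact token "runs".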
sqlite-vec Virtual Table
Vector embeddings are stored in a sqlite-vec virtual table. The hash_seq key is formatted as {hash}_{seq} to uniquely identify each chunk.
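The key format can be sketched with two small helpers. The function names are hypothetical and exist only for this example; the {hash}_{seq} layout is the part taken from the text.

```python
def make_hash_seq(content_hash: str, seq: int) -> str:
    """Build a chunk key in the {hash}_{seq} layout described above."""
    return f"{content_hash}_{seq}"


def split_hash_seq(key: str) -> tuple[str, int]:
    """Recover (hash, seq) from a key; split on the last underscore."""
    content_hash, _, seq = key.rpartition("_")
    return content_hash, int(seq)


key = make_hash_seq("abc123", 2)
print(key, split_hash_seq(key))
```

Splitting on the last underscore keeps the round trip safe even if the hash portion were ever to contain an underscore itself.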
Indexing Pipeline
Step 1: Collection Scanning
Step 2: Content Hashing
Each document’s content is hashed using SHA-256.
Step 3: Title Extraction
Titles are extracted from document headers.
Step 4: Database Insertion
Content and metadata are inserted into SQLite.
Step 5: FTS5 Triggers
Automatic triggers keep the FTS5 index synchronized.
Embedding Generation
Vector embeddings are generated separately using qmd embed.
Embedding Pipeline
- Identify Documents Needing Embeddings
- Chunk Documents
- Format for Embedding
- Generate Embeddings
- Store Vectors
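The five steps above can be sketched as follows. Everything specific here is an assumption for illustration: the chunk size and overlap, the "title | text" formatting, the in-memory dict standing in for the vector table, and above all toy_embed, which is a deterministic placeholder and not a real embedding model.

```python
import hashlib


def chunk_text(text: str, size: int = 800, overlap: int = 100) -> list[str]:
    """Split text into character windows of `size` that overlap by `overlap`."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the last window already reaches the end of the text
        start += size - overlap
    return chunks


def toy_embed(text: str, dims: int = 8) -> list[float]:
    """Placeholder for a real model: pseudo-vector derived from a SHA-256 digest."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [byte / 255.0 for byte in digest[:dims]]


def embed_documents(
    docs: dict[str, tuple[str, str]], stored: dict[str, list[float]]
) -> list[str]:
    """docs maps content hash -> (title, body); returns the keys newly stored."""
    new_keys = []
    for doc_hash, (title, body) in docs.items():
        # 1. Identify documents needing embeddings: skip if chunk 0 exists.
        if f"{doc_hash}_0" in stored:
            continue
        # 2. Chunk the document.
        for seq, chunk in enumerate(chunk_text(body)):
            # 3. Format for embedding (hypothetical layout).
            formatted = f"{title} | {chunk}"
            # 4. Generate the embedding, 5. store it under {hash}_{seq}.
            key = f"{doc_hash}_{seq}"
            stored[key] = toy_embed(formatted)
            new_keys.append(key)
    return new_keys


store: dict[str, list[float]] = {}
print(embed_documents({"aaa": ("Notes", "x" * 1000)}, store))
```

Because the check in step 1 is keyed on the content hash, running the function a second time over unchanged documents stores nothing new.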
Index Maintenance
Update Flow
- Pull latest changes (if --pull specified and collection is a git repo)
- Re-scan collection directories
- Mark missing documents as inactive (active = 0)
- Hash new/modified files
- Insert new content and update document records
- FTS5 triggers automatically update the full-text index
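The update flow above can be sketched end to end with Python's sqlite3 and hashlib modules. This is an illustration under stated assumptions, not QMD's implementation: the table, column, and trigger names are hypothetical, a dict of path to text stands in for the directory re-scan, the title rule (first line starting with "#", else the path) is a guess at "document headers", and the git pull step is omitted.

```python
import hashlib
import sqlite3

SCHEMA = """
CREATE TABLE content (hash TEXT PRIMARY KEY, doc TEXT NOT NULL);
CREATE TABLE documents (
    id INTEGER PRIMARY KEY, path TEXT UNIQUE NOT NULL, title TEXT NOT NULL,
    hash TEXT NOT NULL REFERENCES content(hash), active INTEGER NOT NULL DEFAULT 1);
CREATE VIRTUAL TABLE docs_fts USING fts5(title, body, tokenize='porter unicode61');
-- Triggers keep docs_fts in step with documents; inactive rows are dropped.
CREATE TRIGGER documents_ai AFTER INSERT ON documents BEGIN
    INSERT INTO docs_fts (rowid, title, body)
    SELECT new.id, new.title, content.doc FROM content
    WHERE content.hash = new.hash;
END;
CREATE TRIGGER documents_au AFTER UPDATE ON documents BEGIN
    DELETE FROM docs_fts WHERE rowid = old.id;
    INSERT INTO docs_fts (rowid, title, body)
    SELECT new.id, new.title, content.doc FROM content
    WHERE content.hash = new.hash AND new.active = 1;
END;
"""


def open_index() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def extract_title(text: str, path: str) -> str:
    """Hypothetical rule: first line starting with '#', else fall back to the path."""
    for line in text.splitlines():
        if line.startswith("#"):
            return line.lstrip("#").strip()
    return path


def update_index(conn: sqlite3.Connection, files: dict[str, str]) -> None:
    """files maps path -> current text, standing in for a directory re-scan."""
    # Mark documents that no longer exist as inactive.
    active = conn.execute("SELECT path FROM documents WHERE active = 1").fetchall()
    for (path,) in active:
        if path not in files:
            conn.execute("UPDATE documents SET active = 0 WHERE path = ?", (path,))
    for path, text in files.items():
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        row = conn.execute(
            "SELECT hash, active FROM documents WHERE path = ?", (path,)
        ).fetchone()
        if row == (digest, 1):
            continue  # unchanged file: nothing to do
        # Content is stored once per hash; the document row points at it.
        conn.execute("INSERT OR IGNORE INTO content VALUES (?, ?)", (digest, text))
        title = extract_title(text, path)
        if row is None:
            conn.execute(
                "INSERT INTO documents (path, title, hash) VALUES (?, ?, ?)",
                (path, title, digest),
            )
        else:
            conn.execute(
                "UPDATE documents SET title = ?, hash = ?, active = 1 WHERE path = ?",
                (title, digest, path),
            )
    conn.commit()


def search(conn: sqlite3.Connection, query: str) -> list[str]:
    sql = "SELECT title FROM docs_fts WHERE docs_fts MATCH ? ORDER BY bm25(docs_fts)"
    return [title for (title,) in conn.execute(sql, (query,))]


conn = open_index()
update_index(conn, {"a.md": "# Alpha\nrunning tests", "b.md": "plain body"})
print(search(conn, "running"))
```

Routing every change through the documents table means the application code never writes to the full-text table directly; the triggers do that as a side effect of each insert or update.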
Cleanup Operations
Configuration
Collections are managed in ~/.config/qmd/index.yml: