System Architecture
Core Components
Storage Layer
QMD uses SQLite as its storage backend with two key extensions:

- FTS5 (full-text search) for BM25 keyword matching
- sqlite-vec for vector similarity search
The index is stored at ~/.cache/qmd/index.sqlite.
Search Backends
| Backend | Raw Score | Conversion | Range |
|---|---|---|---|
| FTS (BM25) | SQLite FTS5 BM25 | Math.abs(score) | 0 to ~25+ |
| Vector | Cosine distance | 1 / (1 + distance) | 0.0 to 1.0 |
| Reranker | LLM 0-10 rating | score / 10 | 0.0 to 1.0 |
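
The conversions in the table can be sketched as a single function. This is a minimal illustration, not QMD's actual code: the `Backend` type and `normalizeScore` name are hypothetical, and only the three formulas come from the table above.

```typescript
// Hypothetical names; only the conversion formulas are taken from the table.
type Backend = "fts" | "vector" | "reranker";

function normalizeScore(backend: Backend, raw: number): number {
  switch (backend) {
    case "fts":
      // SQLite FTS5 bm25() returns negative values (more negative = better
      // match), so the absolute value yields an unbounded positive score.
      return Math.abs(raw);
    case "vector":
      // Cosine distance 0 maps to 1.0; larger distances approach 0.
      return 1 / (1 + raw);
    case "reranker":
      // A 0-10 rating scaled down to 0.0-1.0.
      return raw / 10;
  }
}

console.log(normalizeScore("fts", -12.5));   // 12.5
console.log(normalizeScore("vector", 1));    // 0.5
console.log(normalizeScore("reranker", 7));  // 0.7
```

Note that only the vector and reranker scores land in the 0.0 to 1.0 range; the converted FTS score stays unbounded, so it cannot be compared directly with the other two.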
LLM Models
QMD uses three local GGUF models (auto-downloaded on first use):

| Model | Purpose | Size | URI |
|---|---|---|---|
| embeddinggemma-300M-Q8_0 | Vector embeddings | ~300MB | hf:ggml-org/embeddinggemma-300M-GGUF/embeddinggemma-300M-Q8_0.gguf |
| qwen3-reranker-0.6b-q8_0 | Re-ranking | ~640MB | hf:ggml-org/Qwen3-Reranker-0.6B-Q8_0-GGUF/qwen3-reranker-0.6b-q8_0.gguf |
| qmd-query-expansion-1.7B-q4_k_m | Query expansion (fine-tuned) | ~1.1GB | hf:tobil/qmd-query-expansion-1.7B-gguf/qmd-query-expansion-1.7B-q4_k_m.gguf |
Downloaded models are cached in ~/.cache/qmd/models/.
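
The URIs in the model table appear to follow an `hf:<org>/<repo>/<file>` layout. As a rough sketch of how such a URI could be split into its parts (the `HfModelRef` type and `parseHfUri` function are hypothetical helpers for illustration, not part of QMD):

```typescript
// Hypothetical helper: splits an "hf:<org>/<repo>/<file>" URI into parts.
// The layout is inferred from the URIs listed in the model table.
interface HfModelRef {
  org: string;
  repo: string;
  file: string;
}

function parseHfUri(uri: string): HfModelRef {
  if (!uri.startsWith("hf:")) {
    throw new Error(`Not an hf: model URI: ${uri}`);
  }
  const rest = uri.slice(3);
  const firstSlash = rest.indexOf("/");
  const secondSlash = rest.indexOf("/", firstSlash + 1);
  // Require non-empty org, repo, and file segments.
  if (firstSlash <= 0 || secondSlash <= firstSlash + 1 || secondSlash === rest.length - 1) {
    throw new Error(`Expected hf:<org>/<repo>/<file>, got: ${uri}`);
  }
  return {
    org: rest.slice(0, firstSlash),
    repo: rest.slice(firstSlash + 1, secondSlash),
    file: rest.slice(secondSlash + 1),
  };
}

const ref = parseHfUri(
  "hf:ggml-org/embeddinggemma-300M-GGUF/embeddinggemma-300M-Q8_0.gguf",
);
console.log(ref.repo); // embeddinggemma-300M-GGUF
```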
Data Flow
Indexing Flow
Embedding Flow
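
Before embedding, each document is split into overlapping chunks. The sketch below shows plain fixed-size chunking with overlap over an already tokenized input, using 900 tokens and 15% overlap as defaults. It is a simplification under stated assumptions: `chunkTokens` and its parameters are hypothetical names, and QMD's boundary detection is not reproduced here, so chunks are cut at fixed offsets only.

```typescript
// Hypothetical sketch: fixed-size chunking with overlap.
// Does NOT implement boundary detection; cuts happen at fixed offsets.
function chunkTokens<T>(
  tokens: readonly T[],
  chunkSize = 900,
  overlapRatio = 0.15,
): T[][] {
  if (!Number.isInteger(chunkSize) || chunkSize <= 0) {
    throw new RangeError("chunkSize must be a positive integer");
  }
  if (overlapRatio < 0 || overlapRatio >= 1) {
    throw new RangeError("overlapRatio must be in [0, 1)");
  }
  // With the defaults: 135 shared tokens, so each chunk starts 765 tokens
  // after the previous one.
  const overlap = Math.round(chunkSize * overlapRatio);
  const step = Math.max(1, chunkSize - overlap);

  const chunks: T[][] = [];
  for (let start = 0; start < tokens.length; start += step) {
    chunks.push(tokens.slice(start, start + chunkSize));
    // Stop once a chunk has reached the end of the input.
    if (start + chunkSize >= tokens.length) break;
  }
  return chunks;
}

// 2000 tokens -> chunks starting at offsets 0, 765, and 1530.
const demoTokens = Array.from({ length: 2000 }, (_, i) => i);
console.log(chunkTokens(demoTokens).length); // 3
```
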
Documents are chunked into ~900-token pieces with 15% overlap using smart boundary detection.
Search Modes
QMD provides three search modes:

| Mode | Description | Use Case |
|---|---|---|
| search | BM25 full-text search only | Fast keyword search, exact term matching |
| vsearch | Vector semantic search only | Conceptual similarity, synonyms |
| query | Hybrid: FTS + Vector + Query Expansion + Re-ranking | Best quality, recommended for most searches |
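
This section does not say how the hybrid mode merges its FTS and vector result lists. Reciprocal rank fusion (RRF) is one common technique for combining rankings whose raw scores are not comparable, and is shown here purely as an illustration. The function name, the constant `k = 60`, and the document IDs are assumptions, not QMD's confirmed behavior.

```typescript
// Illustrative reciprocal rank fusion; not confirmed to be QMD's method.
// Each input is a list of document IDs ordered best-first.
function reciprocalRankFusion(
  rankings: readonly (readonly string[])[],
  k = 60, // conventional smoothing constant; an assumption here
): Array<{ id: string; score: number }> {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      // Ranks are 1-based, so the top hit contributes 1 / (k + 1).
      const contribution = 1 / (k + index + 1);
      scores.set(id, (scores.get(id) ?? 0) + contribution);
    });
  }
  return Array.from(scores, ([id, score]) => ({ id, score })).sort(
    (a, b) => b.score - a.score,
  );
}

// Hypothetical result lists from a keyword search and a vector search.
const ftsHits = ["a", "b", "c"];
const vectorHits = ["b", "d", "a"];
const fused = reciprocalRankFusion([ftsHits, vectorHits]);
console.log(fused.map((r) => r.id)); // [ 'b', 'a', 'd', 'c' ]
```

Documents that appear near the top of both lists rise to the front, which is why `b` and `a` outrank the single-list hits `d` and `c`.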
Context System
QMD supports hierarchical context annotations that help LLMs understand document structure.
Score Interpretation
| Score | Meaning |
|---|---|
| 0.8 - 1.0 | Highly relevant |
| 0.5 - 0.8 | Moderately relevant |
| 0.2 - 0.5 | Somewhat relevant |
| 0.0 - 0.2 | Low relevance |
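
The bands above can be turned into a small lookup. This is a minimal sketch: `interpretScore` is a hypothetical helper, and since the table's ranges share their endpoints, it assumes each lower bound is inclusive (so 0.8 counts as highly relevant).

```typescript
// Hypothetical helper mapping a 0.0-1.0 score to the labels in the table.
// Assumption: lower bounds are inclusive (0.8 -> "Highly relevant").
type Relevance =
  | "Highly relevant"
  | "Moderately relevant"
  | "Somewhat relevant"
  | "Low relevance";

function interpretScore(score: number): Relevance {
  if (score >= 0.8) return "Highly relevant";
  if (score >= 0.5) return "Moderately relevant";
  if (score >= 0.2) return "Somewhat relevant";
  // Anything below 0.2 (including NaN) falls through to the lowest band.
  return "Low relevance";
}

console.log(interpretScore(0.93)); // Highly relevant
console.log(interpretScore(0.35)); // Somewhat relevant
```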