
Installation

QMD requires Node.js 22+ or Bun 1.0+ and uses three local GGUF models for embeddings, re-ranking, and query expansion. Models are automatically downloaded on first use.

System Requirements

Runtime

# Node.js >= 22.0.0 required
node --version
QMD relies on modern JavaScript features that older Node.js releases do not support, so version 22 or higher is required.
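As a minimal sketch, the version requirement can be checked in a script before installing. The helper name `node_version_ok` is hypothetical and not part of QMD; it only parses the string that `node --version` prints.

```shell
# Hypothetical helper (not part of QMD): succeeds when a version string
# such as "v22.3.0" has a major version of at least 22.
node_version_ok() {
  version=${1#v}          # drop the leading "v"
  major=${version%%.*}    # keep everything before the first dot
  case $major in
    ''|*[!0-9]*) return 1 ;;   # not a plain number
  esac
  [ "$major" -ge 22 ]
}

# Only query Node.js when it is actually installed.
if command -v node >/dev/null 2>&1; then
  if node_version_ok "$(node --version)"; then
    echo "Node.js version is new enough for QMD"
  else
    echo "Node.js 22 or higher is required" >&2
  fi
fi
```

The check is kept in a function so it can be reused in install scripts or CI without running Node.js itself.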

Platform-Specific Requirements

On macOS, install Homebrew SQLite for extension support:
brew install sqlite
This is required for the sqlite-vec extension used by vector search.

Storage Requirements

Plan for approximately 2.5GB of storage:
| Component | Size | Purpose |
| --- | --- | --- |
| Models | ~2.1GB total | GGUF model files (see below) |
| Index | Variable | SQLite database (~10MB per 1000 docs) |
| Cache | Variable | Model cache and LLM response cache |
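The figures in the table give a rough way to budget disk space. The sketch below assumes the index grows linearly at about 10MB per 1000 documents and treats the ~2.1GB of models as 2100MB. The function name is illustrative, and the cache is left out because the table lists its size as variable.

```shell
# Illustrative estimate based on the table above (not a QMD command).
# Assumes ~10MB of index per 1000 documents and ~2100MB of models.
estimate_storage_mb() {
  docs=$1
  index_mb=$(( (docs * 10 + 999) / 1000 ))   # round up to a whole MB
  echo $(( 2100 + index_mb ))
}

estimate_storage_mb 50000   # prints 2600
```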

Install QMD

npm install -g @tobilu/qmd
This makes the qmd command available system-wide.

Run Without Installing

npx @tobilu/qmd <command>
Use npx to run QMD without a global install. This is useful for trying QMD out or for running a specific version.

GGUF Models

QMD uses three local GGUF models that are automatically downloaded from HuggingFace on first use:
Embedding model
Purpose: Generate vector embeddings for documents and queries
Model: ggml-org/embeddinggemma-300M-GGUF/embeddinggemma-300M-Q8_0.gguf
Quantization: Q8_0 (8-bit quantization for quality)
Usage:
  • Document chunking and embedding during qmd embed
  • Query embedding during qmd vsearch and qmd query
Prompt Format:
# For queries
task: search result | query: {query}

# For documents
title: {title} | text: {content}
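To make the two templates concrete, here is a small sketch that fills them in. The function names are hypothetical and this is not QMD's own code; only the template strings come from the text above.

```shell
# Illustrative helpers (not QMD's code) that fill in the two templates above.
format_query_prompt() {
  printf 'task: search result | query: %s\n' "$1"
}

format_document_prompt() {
  printf 'title: %s | text: %s\n' "$1" "$2"
}

format_query_prompt "how do I rotate API keys"
format_document_prompt "Key rotation" "Rotate keys every 90 days."
```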

Re-ranking model
Purpose: Re-rank search results using cross-encoder scoring
Model: ggml-org/Qwen3-Reranker-0.6B-Q8_0-GGUF/qwen3-reranker-0.6b-q8_0.gguf
Quantization: Q8_0 (8-bit quantization)
Usage:
  • Re-ranks top 30 candidates in qmd query pipeline
  • Returns yes/no relevance with logprob confidence scores
How it works: Uses node-llama-cpp’s createRankingContext() API to score query-document pairs.

Query expansion model
Purpose: Generate alternative query phrasings for better recall
Model: tobil/qmd-query-expansion-1.7B-gguf/qmd-query-expansion-1.7B-q4_k_m.gguf
Quantization: Q4_K_M (4-bit K-quant for efficiency)
Usage:
  • Generates 1-2 alternative queries in qmd query pipeline
  • Original query is weighted 2x in fusion scoring
Fine-tuned: This model is specifically fine-tuned for QMD’s query expansion task.
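The 2x weighting of the original query can be illustrated with a small fusion sketch. This page does not say which fusion formula QMD uses, so the example assumes reciprocal rank fusion with the commonly used constant k=60. The function name and the input format are made up for illustration.

```shell
# Sketch only: assumes reciprocal rank fusion (RRF) with k=60, which this
# page does not specify. Input lines are "<weight> <rank> <doc_id>";
# use weight 2 for results of the original query and 1 for expansions.
fuse_rankings() {
  awk -v k=60 '
    { score[$3] += $1 / (k + $2) }
    END { for (doc in score) printf "%.6f %s\n", score[doc], doc }
  ' | sort -k1,1nr -k2,2 | awk '{ print $2 }'
}

# "b" ranks second for the original query and first for an expansion,
# so it ends up ahead of "a", which only the original query returned.
printf '%s\n' '2 1 a' '2 2 b' '1 1 b' '1 2 c' | fuse_rankings
```

Doubling the weight means a top hit for the original query counts as much as two top hits from expansion queries, which keeps generated rephrasings from drowning out what the user actually typed.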
Models are cached in ~/.cache/qmd/models/ by default. You can set XDG_CACHE_HOME to change the cache directory.

Model Download

Models download automatically on first use:
1. First Command

Run any command that requires models (e.g., qmd embed or qmd query):
qmd embed
2. Automatic Download

QMD detects missing models and downloads them from HuggingFace:
Downloading embeddinggemma-300M-Q8_0.gguf...
[████████████████████████████████] 300MB/300MB
Progress is shown in the terminal.
3. Cache and Reuse

Models are cached locally, so subsequent commands reuse them without downloading again.

Manual Model Download

If you prefer to download models explicitly before first use:
# Download all models (requires LLM session)
qmd embed --help  # Triggers model check/download
Or download directly from HuggingFace:
mkdir -p ~/.cache/qmd/models
cd ~/.cache/qmd/models

# Download embedding model
wget https://huggingface.co/ggml-org/embeddinggemma-300M-GGUF/resolve/main/embeddinggemma-300M-Q8_0.gguf

# Download reranker model
wget https://huggingface.co/ggml-org/Qwen3-Reranker-0.6B-Q8_0-GGUF/resolve/main/qwen3-reranker-0.6b-q8_0.gguf

# Download query expansion model
wget https://huggingface.co/tobil/qmd-query-expansion-1.7B-gguf/resolve/main/qmd-query-expansion-1.7B-q4_k_m.gguf
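After a manual download, a quick check like the following can confirm that all three files landed in the cache directory. It is a sketch with a hypothetical function name. It only tests that the files exist under the names used in the wget commands and does not verify their contents.

```shell
# Illustrative check (not a QMD command): prints the name of each expected
# model file that is missing from the given directory.
missing_models() {
  dir=$1
  for file in \
    embeddinggemma-300M-Q8_0.gguf \
    qwen3-reranker-0.6b-q8_0.gguf \
    qmd-query-expansion-1.7B-q4_k_m.gguf
  do
    [ -f "$dir/$file" ] || echo "$file"
  done
}

missing_models ~/.cache/qmd/models
```

No output means every file is present.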

Storage Locations

QMD uses XDG Base Directory conventions:
| Data | Location | Environment Variable |
| --- | --- | --- |
| Index database | ~/.cache/qmd/index.sqlite | XDG_CACHE_HOME |
| GGUF models | ~/.cache/qmd/models/ | XDG_CACHE_HOME |
| Collection config | ~/.config/qmd/index.yml | XDG_CONFIG_HOME |
Set XDG_CACHE_HOME or XDG_CONFIG_HOME to customize storage locations:
export XDG_CACHE_HOME=~/my-cache
export XDG_CONFIG_HOME=~/my-config
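The following sketch shows how XDG-style paths are typically resolved: use the environment variable when it is set and non-empty, otherwise fall back to the default under the home directory. It is an assumption about how the locations above are derived, not QMD's actual code, and the function names are hypothetical.

```shell
# Sketch of XDG-style resolution (an assumption about how the defaults in
# the table are derived, not QMD's actual code).
qmd_cache_dir() {
  echo "${XDG_CACHE_HOME:-$HOME/.cache}/qmd"
}

qmd_config_dir() {
  echo "${XDG_CONFIG_HOME:-$HOME/.config}/qmd"
}

echo "Index:  $(qmd_cache_dir)/index.sqlite"
echo "Models: $(qmd_cache_dir)/models/"
echo "Config: $(qmd_config_dir)/index.yml"
```

Under this convention, with the exports shown above in effect, the index would resolve to ~/my-cache/qmd/index.sqlite.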

Verify Installation

Confirm that QMD is installed and check its version:
qmd --version
Expected output:
qmd version 1.1.0
Check index status (creates empty database if missing):
qmd status
Expected output:
Index: /Users/you/.cache/qmd/index.sqlite
Documents: 0
Collections: 0
Embeddings: 0 (0%)

Troubleshooting

Error: SyntaxError: Unexpected token '??='
Solution: Upgrade to Node.js 22+:
# Using nvm
nvm install 22
nvm use 22

# Or download from nodejs.org
# https://nodejs.org/
Error: Error loading sqlite-vec extension
Solution: Install Homebrew SQLite:
brew install sqlite
Make sure Homebrew’s SQLite is in your PATH before system SQLite.
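One way to check the PATH ordering is sketched below. The helper `path_precedes` is hypothetical. The example assumes Homebrew is installed, that `brew --prefix sqlite` reports the SQLite install prefix, and that the system SQLite lives in /usr/bin.

```shell
# Illustrative helper (not part of QMD): succeeds when directory $2 appears
# before directory $3 in the colon-separated list $1.
path_precedes() (
  set -f          # do not glob-expand PATH entries
  IFS=:
  for dir in $1; do
    if [ "$dir" = "$2" ]; then exit 0; fi
    if [ "$dir" = "$3" ]; then exit 1; fi
  done
  exit 1
)

# Assumes Homebrew is installed and the system SQLite lives in /usr/bin.
if command -v brew >/dev/null 2>&1; then
  if path_precedes "$PATH" "$(brew --prefix sqlite)/bin" /usr/bin; then
    echo "Homebrew SQLite comes first in PATH"
  else
    echo "Homebrew SQLite is missing from PATH or comes after /usr/bin" >&2
  fi
fi
```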
Error: Failed to download model from HuggingFace
Solutions:
  1. Check internet connection
  2. Try again (HuggingFace can be slow)
  3. Download manually (see Manual Model Download above)
  4. Check firewall/proxy settings
Error: ENOSPC: no space left on device
Solution: Free up at least 3GB of space for models and index data.
Check current usage:
du -sh ~/.cache/qmd
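To check free space before retrying, a sketch like this compares what is available against the 3GB figure. The helper name is hypothetical. It reads the "Available" column of POSIX-format `df` output and assumes the cache lives on the same filesystem as the home directory.

```shell
# Illustrative helper (not a QMD command): succeeds when the filesystem
# holding path $1 has at least $2 kilobytes available.
has_free_kb() {
  available=$(df -Pk "$1" | awk 'NR == 2 { print $4 }')
  [ "${available:-0}" -ge "$2" ]
}

# 3GB expressed in 1024-byte blocks, matching the figure above.
required_kb=$((3 * 1024 * 1024))
if has_free_kb "$HOME" "$required_kb"; then
  echo "Enough free space for QMD models and index"
else
  echo "Less than 3GB free; clear space before retrying" >&2
fi
```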

Next Steps

Quick Start

Get your first search working with a step-by-step tutorial
