Installation

QMD requires Node.js 22+ or Bun 1.0+ and uses three local GGUF models for embeddings, re-ranking, and query expansion. Models are automatically downloaded on first use.

System Requirements

Runtime

# Node.js >= 22.0.0 required
node --version

Node.js version 22 or higher is required. QMD uses modern JavaScript features that are not available in older versions.
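
The version requirement can also be checked from a script before installing. This is a minimal sketch; `node_major` and `node_ok` are illustrative helper names, not part of QMD.

```shell
# Hypothetical helpers (not part of QMD): extract the major version from
# a string such as "v22.3.0" and compare it against the required minimum.
node_major() {
  version="${1#v}"                  # drop the leading "v"
  printf '%s\n' "${version%%.*}"    # keep everything before the first dot
}

node_ok() {
  [ "$(node_major "$1")" -ge 22 ]
}

# Usage: node_ok "$(node --version)" || echo "Node.js 22+ required" >&2
```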

Platform-Specific Requirements

macOS

Install Homebrew SQLite for extension support:

brew install sqlite

This is required for the sqlite-vec extension used by vector search.

Linux and Windows

No additional dependencies required. The sqlite-vec native bindings are included.

Storage Requirements

Plan for approximately 2.5GB of storage:

Component	Size	Purpose
Models	~2.1GB total	GGUF model files (see below)
Index	Variable	SQLite database (~10MB per 1000 docs)
Cache	Variable	Model cache and LLM response cache
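
For a rough sizing estimate, the figures in the table can be combined with a little arithmetic. The sketch below assumes the index grows linearly at about 10MB per 1000 documents and ignores the cache; the function names are hypothetical.

```shell
# Hypothetical helper: estimate index size in MB from a document count,
# using the ~10MB per 1000 documents figure from the table above.
estimate_index_mb() {
  echo $(( $1 * 10 / 1000 ))
}

# Models take ~2.1GB (2100MB); add the index estimate for a rough total.
estimate_total_mb() {
  echo $(( 2100 + $(estimate_index_mb "$1") ))
}

estimate_index_mb 50000   # prints 500
estimate_total_mb 50000   # prints 2600
```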

Install QMD

npm

Global Installation (Recommended)

npm install -g @tobilu/qmd

This makes the qmd command available system-wide.

Run Without Installing

npx @tobilu/qmd <command>

Use npx to run QMD without a global install. Useful for trying it out or running specific versions.

Bun

Global Installation (Recommended)

bun install -g @tobilu/qmd

This makes the qmd command available system-wide.

Run Without Installing

bunx @tobilu/qmd <command>

Use bunx to run QMD without a global install. Useful for trying it out or running specific versions.

Development

Install from Source

git clone https://github.com/tobi/qmd
cd qmd
npm install
npm link

This creates a symlink to the development version. Use npm unlink to remove it.

Run from Source

bun src/qmd.ts <command>

GGUF Models

QMD uses three local GGUF models that are automatically downloaded from HuggingFace on first use:

embeddinggemma-300M-Q8_0 (~300MB)

Purpose: Generate vector embeddings for documents and queries
Model: ggml-org/embeddinggemma-300M-GGUF/embeddinggemma-300M-Q8_0.gguf
Quantization: Q8_0 (8-bit quantization for quality)
Usage:

Document chunking and embedding during qmd embed
Query embedding during qmd vsearch and qmd query

Prompt Format:

# For queries
task: search result | query: {query}

# For documents
title: {title} | text: {content}
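
To make the two templates concrete, here is a small sketch that fills them in. QMD builds these prompts internally; the function names below are hypothetical and exist only for illustration.

```shell
# Hypothetical helpers that fill in the two templates shown above.
format_query_prompt() {
  printf 'task: search result | query: %s\n' "$1"
}

format_document_prompt() {
  printf 'title: %s | text: %s\n' "$1" "$2"
}

format_query_prompt "how do I rotate logs"
# prints: task: search result | query: how do I rotate logs
```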

qwen3-reranker-0.6b-q8_0 (~640MB)

Purpose: Re-rank search results using cross-encoder scoring
Model: ggml-org/Qwen3-Reranker-0.6B-Q8_0-GGUF/qwen3-reranker-0.6b-q8_0.gguf
Quantization: Q8_0 (8-bit quantization)
Usage:

Re-ranks top 30 candidates in qmd query pipeline
Returns yes/no relevance with logprob confidence scores

How it works: Uses node-llama-cpp’s createRankingContext() API to score query-document pairs.

qmd-query-expansion-1.7B-q4_k_m (~1.1GB)

Purpose: Generate alternative query phrasings for better recall
Model: tobil/qmd-query-expansion-1.7B-gguf/qmd-query-expansion-1.7B-q4_k_m.gguf
Quantization: Q4_K_M (4-bit K-quant for efficiency)
Usage:

Generates 1-2 alternative queries in qmd query pipeline
Original query is weighted 2x in fusion scoring

Fine-tuned: This model is specifically fine-tuned for QMD’s query expansion task.
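
The 2x weighting of the original query can be pictured with a small fusion sketch. This section does not say which fusion formula QMD uses, so the example assumes weighted reciprocal rank fusion with k=60; the function name and input file format are also hypothetical.

```shell
# Sketch only: assumes reciprocal rank fusion (RRF) with k=60, which the
# text above does not specify. Each argument is WEIGHT:FILE, where FILE
# lists one document id per line, best match first.
weighted_rrf() {
  for spec in "$@"; do
    weight="${spec%%:*}"
    file="${spec#*:}"
    # Each list contributes weight / (k + rank) to a document's score.
    awk -v w="$weight" -v k=60 '{ printf "%s %.6f\n", $1, w / (k + NR) }' "$file"
  done |
    awk '{ score[$1] += $2 } END { for (d in score) printf "%.6f %s\n", score[d], d }' |
    sort -rn
}

# Usage: results for the original query count double.
# weighted_rrf 2:original.txt 1:expansion1.txt 1:expansion2.txt
```

Under this scheme, a document ranked first for the original query outscores one ranked first only for an expanded query, which matches the intent of weighting the original more heavily.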

Models are cached in ~/.cache/qmd/models/ by default. You can set XDG_CACHE_HOME to change the cache directory.

Model Download

Models download automatically on first use:

First Command

Run any command that requires models (e.g., qmd embed or qmd query):

qmd embed

Automatic Download

QMD detects missing models and downloads them from HuggingFace:

Downloading embeddinggemma-300M-Q8_0.gguf...
[████████████████████████████████] 300MB/300MB

Progress is shown in the terminal.

Cache and Reuse

Models are cached locally. Subsequent commands reuse the cached files without downloading them again.

Manual Model Download

If you prefer to download models explicitly before first use:

# Download all models (requires LLM session)
qmd embed --help  # Triggers model check/download

Or download directly from HuggingFace:

mkdir -p ~/.cache/qmd/models
cd ~/.cache/qmd/models

# Download embedding model
wget https://huggingface.co/ggml-org/embeddinggemma-300M-GGUF/resolve/main/embeddinggemma-300M-Q8_0.gguf

# Download reranker model
wget https://huggingface.co/ggml-org/Qwen3-Reranker-0.6B-Q8_0-GGUF/resolve/main/qwen3-reranker-0.6b-q8_0.gguf

# Download query expansion model
wget https://huggingface.co/tobil/qmd-query-expansion-1.7B-gguf/resolve/main/qmd-query-expansion-1.7B-q4_k_m.gguf
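
After a manual download, it can help to confirm that all three files are in place. The sketch below assumes QMD looks for the models under the same file names used in the wget commands above; `missing_models` is a hypothetical helper.

```shell
# Hypothetical helper: list which of the three model files are absent
# from a directory (file names taken from the download commands above).
missing_models() {
  for f in \
    embeddinggemma-300M-Q8_0.gguf \
    qwen3-reranker-0.6b-q8_0.gguf \
    qmd-query-expansion-1.7B-q4_k_m.gguf
  do
    [ -f "$1/$f" ] || printf '%s\n' "$f"
  done
}

# Usage: missing_models ~/.cache/qmd/models
```

An empty result means every model file is present.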

Storage Locations

QMD uses XDG Base Directory conventions:

Data	Location	Environment Variable
Index database	`~/.cache/qmd/index.sqlite`	`XDG_CACHE_HOME`
GGUF models	`~/.cache/qmd/models/`	`XDG_CACHE_HOME`
Collection config	`~/.config/qmd/index.yml`	`XDG_CONFIG_HOME`

Set XDG_CACHE_HOME or XDG_CONFIG_HOME to customize storage locations:

export XDG_CACHE_HOME=~/my-cache
export XDG_CONFIG_HOME=~/my-config
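
To see where files would end up for the current environment, the table can be turned into a small script. This sketch assumes the standard XDG fallbacks (~/.cache and ~/.config) when the variables are unset; `qmd_paths` is a hypothetical helper, not a QMD command.

```shell
# Hypothetical helper: print where QMD's files would live, applying the
# XDG defaults (~/.cache and ~/.config) when the variables are unset.
qmd_paths() {
  cache="${XDG_CACHE_HOME:-$HOME/.cache}"
  config="${XDG_CONFIG_HOME:-$HOME/.config}"
  printf 'index: %s/qmd/index.sqlite\n' "$cache"
  printf 'models: %s/qmd/models/\n' "$cache"
  printf 'config: %s/qmd/index.yml\n' "$config"
}

# Usage: qmd_paths
```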

Verify Installation

Confirm QMD is installed and check version:

qmd --version

Expected output:

qmd version 1.1.0

Check index status (creates empty database if missing):

qmd status

Expected output:

Index: /Users/you/.cache/qmd/index.sqlite
Documents: 0
Collections: 0
Embeddings: 0 (0%)

Troubleshooting

Node version too old

Error: SyntaxError: Unexpected token '??='
Solution: Upgrade to Node.js 22+:

# Using nvm
nvm install 22
nvm use 22

# Or download from nodejs.org
# https://nodejs.org/

sqlite-vec extension not loading (macOS)

Error: Error loading sqlite-vec extension
Solution: Install Homebrew SQLite:

brew install sqlite

Make sure Homebrew’s SQLite is in your PATH before system SQLite.
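
One way to check the ordering is to find which directory on PATH supplies sqlite3 first. The sketch below is illustrative: `first_sqlite3_dir` is a hypothetical helper, and the usage comments assume Homebrew is installed so that `brew --prefix sqlite` can report where its SQLite lives.

```shell
# Hypothetical helper: print the first directory in a PATH-style string
# that contains an executable named sqlite3.
first_sqlite3_dir() {
  old_ifs=$IFS
  IFS=:
  for dir in $1; do
    if [ -x "$dir/sqlite3" ]; then
      IFS=$old_ifs
      printf '%s\n' "$dir"
      return 0
    fi
  done
  IFS=$old_ifs
  return 1
}

# Usage (assumes Homebrew is installed):
# first_sqlite3_dir "$PATH"        # compare with "$(brew --prefix sqlite)/bin"
# export PATH="$(brew --prefix sqlite)/bin:$PATH"   # put Homebrew's copy first
```

Homebrew's SQLite is keg-only on macOS, so it is not on PATH unless you add it yourself.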

Model download fails

Error: Failed to download model from HuggingFace
Solutions:

Check internet connection
Try again (HuggingFace can be slow)
Download manually (see Manual Model Download above)
Check firewall/proxy settings

Out of disk space

Error: ENOSPC: no space left on device
Solution: Free up at least 3GB of space for models and index data.
Check current usage:

du -sh ~/.cache/qmd
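
Free space can also be checked before the first download. This is a minimal sketch using POSIX `df -Pk`; `has_free_gb` is a hypothetical helper, and the 3GB threshold in the usage line comes from the guidance above.

```shell
# Hypothetical helper: succeed if the filesystem holding DIR has at least
# MIN_GB gigabytes available. `df -Pk` reports sizes in 1024-byte blocks,
# with the available amount in the fourth column of the second line.
has_free_gb() {
  avail_kb=$(df -Pk "$1" | awk 'NR == 2 { print $4 }')
  [ "$avail_kb" -ge $(( $2 * 1024 * 1024 )) ]
}

# Usage: has_free_gb "$HOME" 3 || echo "Less than 3GB free" >&2
```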

Next Steps

Quick Start

Get your first search working with a step-by-step tutorial
