Search Your Knowledge Base Locally

QMD combines BM25 full-text search, vector semantic search, and LLM reranking, all running on-device with GGUF models. Index your markdown notes, documentation, and knowledge bases without API keys or external services.


Quick Start

Get QMD running in minutes with these simple steps

Install QMD

Install globally via npm or bun:

npm install -g @tobilu/qmd
# or
bun install -g @tobilu/qmd

Create a collection

Index your markdown files by creating a collection:

qmd collection add ~/notes --name notes
qmd collection add ~/Documents/meetings --name meetings

Add context

Add contextual descriptions to help with search relevance:

qmd context add qmd://notes "Personal notes and ideas"
qmd context add qmd://meetings "Meeting transcripts and notes"

Generate embeddings

Create vector embeddings for semantic search:

qmd embed

Models are downloaded automatically on first use (about 2 GB total) and cached in ~/.cache/qmd/models/.
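
To check how much disk space the downloaded models occupy, a small script can total the file sizes under the cache directory. This is a minimal sketch: the path is the one given above, while the helper name directory_size_bytes is hypothetical and not part of QMD.

```python
from pathlib import Path

def directory_size_bytes(root: Path) -> int:
    """Sum the sizes of all regular files under root (0 if root is missing)."""
    if not root.is_dir():
        return 0
    return sum(p.stat().st_size for p in root.rglob("*") if p.is_file())

# Cache location taken from the note above.
models_dir = Path.home() / ".cache" / "qmd" / "models"
size_gb = directory_size_bytes(models_dir) / 1e9
print(f"{models_dir}: {size_gb:.2f} GB")
```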

Search your content

Use any of the three search modes:

qmd search "project timeline"           # Fast keyword search
qmd vsearch "how to deploy"             # Semantic search
qmd query "quarterly planning process"  # Hybrid + reranking
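
When calling these modes from a script, it can help to keep the mode-to-subcommand mapping in one place. The sketch below does that in Python. The function names and the mode labels keyword, semantic, and hybrid are illustrative choices, not part of QMD; only the three subcommands shown above are used, and no additional flags are assumed.

```python
import shutil
import subprocess

# Mode labels are hypothetical; the subcommands come from the examples above.
SUBCOMMANDS = {
    "keyword": "search",    # fast keyword search
    "semantic": "vsearch",  # semantic search
    "hybrid": "query",      # hybrid search + reranking
}

def build_qmd_command(mode: str, text: str) -> list[str]:
    """Return the argv list for running one QMD search from a script."""
    if mode not in SUBCOMMANDS:
        raise ValueError(f"unknown mode {mode!r}; expected one of {sorted(SUBCOMMANDS)}")
    return ["qmd", SUBCOMMANDS[mode], text]

def run_search(mode: str, text: str) -> str:
    """Run the search and return its raw stdout; requires qmd on PATH."""
    if shutil.which("qmd") is None:
        raise RuntimeError("qmd is not installed or not on PATH")
    result = subprocess.run(
        build_qmd_command(mode, text),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Passing the query as a single list element avoids shell quoting issues with multi-word queries.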

Explore by topic

Learn about QMD’s core features and capabilities

Collections

Organize and index your markdown files into searchable collections

Search modes

Choose between BM25, vector, or hybrid search with reranking

Query syntax

Write structured queries with typed sub-queries for precise results

Context management

Add hierarchical metadata to improve search relevance

AI agents

Integrate QMD with AI workflows using JSON and file outputs

MCP server

Expose QMD as an MCP server for Claude and other AI tools

Key features

Everything you need for powerful local search

Hybrid search pipeline

Combines BM25 full-text search with vector semantic search, then reranks the combined candidates with an LLM
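
This page does not say how the BM25 and vector result lists are merged before reranking. One common technique for that step is reciprocal rank fusion (RRF). The sketch below assumes it purely for illustration, with the conventional constant k = 60 and made-up document IDs; it is not presented as QMD's actual implementation.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked ID lists: each list adds 1 / (k + rank) to a document's score."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for deterministic output.
    return sorted(scores, key=lambda d: (-scores[d], d))

bm25_hits = ["a1", "b2", "c3"]    # hypothetical keyword ranking
vector_hits = ["b2", "d4", "a1"]  # hypothetical semantic ranking
fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
print(fused)  # ['b2', 'a1', 'd4', 'c3']
```

Documents that appear in both lists (b2 and a1 here) rise to the top. In a full pipeline, the head of the fused list would then be handed to the reranking model.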

100% local execution

All models run on-device via node-llama-cpp. No API keys, no external dependencies, no data leaving your machine

Smart chunking

Markdown-aware boundary detection keeps sections intact. Chunks are 900 tokens each, with 15% overlap between consecutive chunks
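
To make the numbers concrete: 15% of 900 tokens is 135 tokens, so consecutive chunks start 765 tokens apart. The sketch below shows only this sliding-window arithmetic over an already tokenized list. It does not attempt the markdown-aware boundary detection, and the function name chunk_tokens is hypothetical.

```python
def chunk_tokens(
    tokens: list[str], size: int = 900, overlap_ratio: float = 0.15
) -> list[list[str]]:
    """Split tokens into fixed-size windows that share a fraction of their length."""
    overlap = round(size * overlap_ratio)  # 135 tokens for the defaults
    stride = size - overlap                # 765 tokens between chunk starts
    if stride <= 0:
        raise ValueError("overlap_ratio must be less than 1")
    chunks = []
    for start in range(0, len(tokens), stride):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # this window already reaches the end of the input
    return chunks

tokens = [f"t{i}" for i in range(2000)]  # stand-in for real tokenizer output
chunks = chunk_tokens(tokens)
print([len(c) for c in chunks])  # [900, 900, 470]
```

The last 135 tokens of each chunk reappear at the start of the next one, so a sentence that straddles a boundary is still fully contained in at least one chunk.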

Document IDs (docid)

Every document gets a short hash ID for quick reference. The IDs appear in search results and can be used in retrieval commands
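
This page does not specify how the short hash is derived. As an illustration only, the sketch below assumes a SHA-256 digest of the document content truncated to six hex characters. Both the algorithm and the length are assumptions, and short_docid is a hypothetical name.

```python
import hashlib

def short_docid(content: str, length: int = 6) -> str:
    """Return the first `length` hex characters of the content's SHA-256 digest."""
    digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
    return digest[:length]

# Hypothetical document body; the same content always yields the same ID.
doc_id = short_docid("# Meeting notes\nDiscuss Q3 roadmap.")
print(doc_id)
```

Truncating the digest trades collision resistance for brevity, which is usually an acceptable trade at the scale of a personal knowledge base.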

Query document format

Write multi-line queries with typed sub-queries (lex, vec, hyde) for precise control over search strategy

Multiple output formats

Export results as JSON, CSV, Markdown, XML, or plain file lists for integration with AI agents and scripts

Ready to get started?

Install QMD and start searching your knowledge base in minutes

