Headroom ships as two separate packages depending on what you need: headroom-ai on PyPI (Python, includes the headroom CLI) and headroom-ai on npm (TypeScript SDK — library only, no CLI). This page covers every install path, all optional extras, Docker image tags, required environment variables, and how to handle SSL-inspection environments. If something goes wrong during install, the troubleshooting section at the bottom has the common fixes.

Which Package Do You Need?

  • pip — you are writing Python, or you need the CLI (headroom proxy, headroom wrap, headroom mcp, headroom learn, headroom perf), regardless of what language your app is in.
  • npm — you are writing TypeScript/Node and want inline compress(), SDK wrapping (withHeadroom), or Vercel AI SDK middleware.

Python

Headroom requires Python 3.10+. Current release wheels are built for Python 3.10 through 3.13 on Linux (manylinux_2_28 x86_64 / aarch64) and macOS (Apple Silicon). Windows and Intel macOS fall back to building the Rust extension from source and require a working native toolchain — see the Windows section below.
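The wheel matrix above can be checked before installing. The following is a minimal sketch using only the standard library; the helper name `wheel_expected` is illustrative and not part of Headroom, and it does not check the manylinux_2_28 glibc requirement on Linux.

```python
import platform
import sys

# (system, machine) pairs the paragraph above lists as having prebuilt wheels.
WHEEL_PLATFORMS = {
    ("Linux", "x86_64"),
    ("Linux", "aarch64"),
    ("Darwin", "arm64"),  # Apple Silicon
}


def wheel_expected(version=None, system=None, machine=None):
    """Return True if a prebuilt wheel is likely for this interpreter and platform."""
    major, minor = tuple(version or sys.version_info)[:2]
    system = system or platform.system()
    machine = machine or platform.machine()
    in_range = (3, 10) <= (major, minor) <= (3, 13)
    return in_range and (system, machine) in WHEEL_PLATFORMS
```

Calling `wheel_expected()` with no arguments inspects the running interpreter. A `False` result suggests the install will compile the Rust extension from source.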

Core Package

pip install headroom-ai
The core package includes the compress() function, SmartCrusher, CacheAligner, and ContentRouter. No heavy ML dependencies.

Everything at Once

pip install "headroom-ai[all]"
Installs the proxy server, all compressors (code, ML, image, voice, HTML, spreadsheet), memory, relevance scoring, MCP tools, reports, OpenTelemetry, and the evaluation framework. Does not include [vector] (requires a C++ toolchain), [pytorch-mps] (macOS-only GPU offload), or framework adapters ([langchain], [agno], [strands], [anyllm], [bedrock]).

Python Extras

Install only what you need, or combine extras:
pip install "headroom-ai[proxy,langchain,ml]"
| Extra | What it adds | Install command |
| --- | --- | --- |
| proxy | Proxy server, MCP tools, HTTP API | pip install "headroom-ai[proxy]" |
| ml | Kompress-v2-base (ModernBERT text compression, requires PyTorch) | pip install "headroom-ai[ml]" |
| code | CodeCompressor (tree-sitter AST parsing for 7+ languages) | pip install "headroom-ai[code]" |
| memory | Persistent memory with sqlite-vec; pure-Python, no compiler required | pip install "headroom-ai[memory]" |
| vector | Optional HNSW vector backend (hnswlib); needs a C++ toolchain; not in [all] | pip install "headroom-ai[vector]" |
| relevance | fastembed-based relevance scoring (BAAI/bge-small-en-v1.5, ONNX) | pip install "headroom-ai[relevance]" |
| image | Image compression (Pillow, ONNX runtime, OCR) | pip install "headroom-ai[image]" |
| reports | HTML/Markdown report generation (Jinja2) | pip install "headroom-ai[reports]" |
| otel | OpenTelemetry exporter (OTLP) | pip install "headroom-ai[otel]" |
| voice | Voice/audio filler detection | pip install "headroom-ai[voice]" |
| mcp | MCP server tools (headroom_compress, headroom_retrieve, headroom_stats) | pip install "headroom-ai[mcp]" |
| langchain | LangChain HeadroomChatModel wrapper; not in [all] | pip install "headroom-ai[langchain]" |
| agno | Agno HeadroomAgnoModel wrapper; not in [all] | pip install "headroom-ai[agno]" |
| strands | AWS Strands Agents SDK integration; not in [all] | pip install "headroom-ai[strands]" |
| anyllm | any-llm multi-provider backend (Python 3.11+); not in [all] | pip install "headroom-ai[anyllm]" |
| bedrock | AWS Bedrock backend (boto3 + CRT); not in [all] | pip install "headroom-ai[bedrock]" |
| html | HTML content extraction (trafilatura) | pip install "headroom-ai[html]" |
| spreadsheet | Binary spreadsheet ingestion (.xlsx / .xls) | pip install "headroom-ai[spreadsheet]" |
| evals | Evaluation framework (GSM8K, SQuAD, BFCL benchmarks) | pip install "headroom-ai[evals]" |
| pytorch-mps | Apple-GPU (MPS) memory-embedder offload; macOS only, not in [all]; opt in with HEADROOM_EMBEDDER_RUNTIME=pytorch_mps | pip install "headroom-ai[pytorch-mps]" |
| all | proxy, code, ml, memory, relevance, image, reports, otel, evals, voice, html, mcp, spreadsheet; excludes vector, pytorch-mps, and framework adapters | pip install "headroom-ai[all]" |
[all] covers the core stack but excludes framework adapters. Install them separately: pip install "headroom-ai[langchain]" (also [agno], [strands], [anyllm], [bedrock]).
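Because [all] leaves some extras out, it can help to build the requirement string from the list of features you want. This sketch encodes the extras from the table above; the `requirement` helper is hypothetical and only assembles a string for pip.

```python
# Extras the table lists as included in [all].
IN_ALL = {
    "proxy", "code", "ml", "memory", "relevance", "image", "reports",
    "otel", "evals", "voice", "html", "mcp", "spreadsheet",
}
# Extras the table marks as not in [all].
NOT_IN_ALL = {
    "vector", "pytorch-mps", "langchain", "agno", "strands", "anyllm", "bedrock",
}


def requirement(extras):
    """Build a pip requirement such as 'headroom-ai[all,langchain]'."""
    wanted = set(extras)
    unknown = wanted - IN_ALL - NOT_IN_ALL - {"all"}
    if unknown:
        raise ValueError(f"unknown extras: {sorted(unknown)}")
    if "all" in wanted:
        wanted -= IN_ALL  # already covered by [all]
    names = sorted(wanted)
    return f"headroom-ai[{','.join(names)}]" if names else "headroom-ai"
```

For example, `requirement(["all", "proxy", "langchain"])` yields `headroom-ai[all,langchain]`, which can be passed quoted to pip, pipx, or uv.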

pipx

pipx creates one isolated virtual environment per app. If that environment resolves to an unsupported Python version, pipx may pull an older Headroom release. Always specify Python 3.13 explicitly:
pipx install --python python3.13 "headroom-ai[all]"
Pick Python 3.13 for dollar savings. The proxy dashboard's "Proxy $ Saved" tile prices compression using LiteLLM, which does not yet support Python 3.14+. On 3.14, token savings still track, but the dollar figure shows $0.00. Switch with pipx reinstall headroom-ai --python python3.13 and restart the proxy.
For a pinned release:
pipx install --python python3.13 "headroom-ai[all]==0.28.0"
Check which Python an existing pipx environment uses:
pipx list
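The same check can be scripted. The sketch below parses the output of `pipx list --json` and flags environments outside the 3.10 to 3.13 range described above. The JSON layout it reads (`venvs`, then `metadata`, then `python_version` as a string like "Python 3.13.1") is an assumption about pipx's output, and both helper names are illustrative.

```python
import json


def pipx_python_versions(pipx_json_text):
    """Map each pipx app name to its (major, minor) Python version, or None."""
    data = json.loads(pipx_json_text)
    versions = {}
    for name, venv in data.get("venvs", {}).items():
        # Assumed format: "Python 3.13.1"
        raw = venv.get("metadata", {}).get("python_version", "")
        digits = raw.replace("Python", "").strip().split(".")
        try:
            versions[name] = (int(digits[0]), int(digits[1]))
        except (IndexError, ValueError):
            versions[name] = None  # version string missing or unrecognized
    return versions


def supports_dollar_savings(version):
    """True for Python 3.10 through 3.13, where the text says pricing works."""
    return version is not None and (3, 10) <= version < (3, 14)
```

Feed it the text captured from `pipx list --json`. An entry for headroom-ai that fails `supports_dollar_savings` is a candidate for reinstalling under Python 3.13.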

uv

uv tool install --python 3.13 "headroom-ai[all]"

Verify the Install

python -c "import headroom; print(headroom.__version__)"
headroom doctor    # full health check — confirms routing and model loading work
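
Where importing the package is undesirable, for example in a CI step, the installed version can also be read from package metadata. This is a minimal sketch using only the standard library; `installed_version` is an illustrative helper, not part of Headroom.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name="headroom-ai"):
    """Return the installed distribution version, or None if it is not installed."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None
```

A `None` result means the active environment has no headroom-ai distribution, which usually points at the wrong virtual environment or interpreter.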

Windows

There are no prebuilt Windows wheels yet. pip install headroom-ai falls back to building the Rust extension from the sdist, which requires the MSVC toolchain on PATH. Without it you will see:
error: linker `link.exe` not found
note: please ensure that Visual Studio 2017 or later, or Build Tools for
Visual Studio were installed with the Visual C++ option
To install the prerequisites:
  1. MSVC toolchain — install Build Tools for Visual Studio and select the “Desktop development with C++” workload (link.exe). VS Code alone is not enough.
  2. Rust — install via rustup.rs and choose the stable-x86_64-pc-windows-msvc toolchain so Cargo uses the MSVC linker.
  3. Open a fresh PowerShell (so PATH updates take effect), then run:
uv tool install --python 3.13 "headroom-ai[all]"
# or
pip install "headroom-ai[all]"
If you want to avoid the native toolchain entirely, use Docker — see the Docker section below.

TypeScript / Node.js

The TypeScript SDK requires Node.js 18+. It is a library you import — it does not install the headroom CLI (headroom wrap, headroom proxy, etc.), which ships only with the Python package.
npm install headroom-ai
The TypeScript SDK sends messages to a local Headroom proxy for compression. The proxy runs the full Python compression pipeline. Start it before using the SDK:
pip install "headroom-ai[proxy]"
headroom proxy --port 8787
Then point the SDK at it:
import { compress } from 'headroom-ai';

const result = await compress(messages, {
  baseUrl: 'http://localhost:8787',
});
Or set the HEADROOM_BASE_URL environment variable to avoid passing baseUrl on every call.
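
Since the SDK depends on a running proxy, a quick reachability check can save debugging time. The sketch below resolves the proxy address and tests whether anything accepts connections on it. The precedence shown (explicit argument, then HEADROOM_BASE_URL, then the documented default) is an assumption, and both function names are hypothetical.

```python
import os
import socket
from urllib.parse import urlsplit

DEFAULT_BASE_URL = "http://localhost:8787"


def proxy_address(base_url=None):
    """Resolve (host, port) from an explicit URL, the environment, or the default."""
    url = base_url or os.environ.get("HEADROOM_BASE_URL") or DEFAULT_BASE_URL
    parts = urlsplit(url)
    port = parts.port or (443 if parts.scheme == "https" else 80)
    return parts.hostname, port


def is_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

`is_listening(*proxy_address())` only confirms that a TCP listener exists; it does not verify that the listener is Headroom.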

Verify the Install

node -e "const h = require('headroom-ai'); console.log('headroom-ai loaded')"

Docker

Pre-built images are published to GitHub Container Registry on every release.
docker pull ghcr.io/chopratejas/headroom:latest
docker run -p 8787:8787 ghcr.io/chopratejas/headroom:latest

Image Tags

| Tag | Extras | Base image | Description |
| --- | --- | --- | --- |
| latest | proxy | Debian slim | Default image, runs the proxy |
| <version> | proxy | Debian slim | Pinned release version |
| nonroot | proxy | Debian slim | Runs as non-root user |
| code | proxy,code | Debian slim | Includes tree-sitter for code compression |
| code-nonroot | proxy,code | Debian slim | Code compression, non-root |
| slim | proxy | Distroless | Minimal image, no shell |
| slim-nonroot | proxy | Distroless | Minimal, non-root |
| code-slim | proxy,code | Distroless | Code compression, minimal |
| code-slim-nonroot | proxy,code | Distroless | Code compression, minimal, non-root |
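
The variant tags in the table follow a regular pattern: the optional parts code, slim, and nonroot appear in that order, joined by hyphens, with latest when none apply. A small sketch of that naming rule follows; the helper names are illustrative, and pinned <version> tags are not covered.

```python
REGISTRY_IMAGE = "ghcr.io/chopratejas/headroom"


def image_tag(code=False, slim=False, nonroot=False):
    """Compose a variant tag from the three options in the table above."""
    options = (("code", code), ("slim", slim), ("nonroot", nonroot))
    parts = [name for name, enabled in options if enabled]
    return "-".join(parts) or "latest"


def image_ref(code=False, slim=False, nonroot=False):
    """Return the full image reference for docker pull or docker run."""
    return f"{REGISTRY_IMAGE}:{image_tag(code, slim, nonroot)}"
```

For instance, `image_ref(code=True, nonroot=True)` returns `ghcr.io/chopratejas/headroom:code-nonroot`.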

Build from Source

Use Docker Bake for multi-variant builds:
# List all targets
docker buildx bake --list targets

# Build the default runtime image
docker buildx bake runtime-default

# Build a specific variant with a custom registry
docker buildx bake runtime-code-slim-nonroot \
  --set '*.tags=my-registry/headroom:code-slim-nonroot'

Environment Variables

Set these in your shell, .env file, or container environment.

LLM Provider Keys

| Variable | Description |
| --- | --- |
| OPENAI_API_KEY | OpenAI API key (used when proxying to OpenAI) |
| ANTHROPIC_API_KEY | Anthropic API key (used when proxying to Anthropic) |
| AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY | AWS credentials for Bedrock backend |
| GOOGLE_APPLICATION_CREDENTIALS | Google Cloud credentials for Vertex AI backend |

Proxy Configuration

| Variable | Default | Description |
| --- | --- | --- |
| HEADROOM_PORT | 8787 | Port the proxy listens on |
| HEADROOM_HOST | 127.0.0.1 | Host the proxy binds to |
| HEADROOM_MODE | token | Default optimization mode: token or cache |
| HEADROOM_TELEMETRY | off | Set to on to opt into anonymous telemetry |
| HEADROOM_REQUEST_TIMEOUT | 300 | Request timeout in seconds |
| HEADROOM_OUTPUT_SHAPER | 0 | Set to 1 to enable output token reduction (verbosity steering + effort routing) |
| HEADROOM_OUTPUT_HOLDOUT | | Float 0–1: fraction of conversations left unshaped as a control group for measured savings |
| HEADROOM_UPDATE_CHECK | on | Set to off to disable the daily PyPI update check |
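
To make the defaults concrete, the sketch below reads the documented variables into a typed settings object. It illustrates how the defaults and on/off values could be interpreted and is not Headroom's own loader; `ProxySettings` and `load_settings` are hypothetical names. HEADROOM_OUTPUT_HOLDOUT is omitted because no default is listed for it.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class ProxySettings:
    host: str
    port: int
    mode: str
    telemetry: bool
    request_timeout: int
    output_shaper: bool
    update_check: bool


def load_settings(env=None):
    """Read the documented variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    mode = env.get("HEADROOM_MODE", "token")
    if mode not in ("token", "cache"):
        raise ValueError(f"HEADROOM_MODE must be 'token' or 'cache', got {mode!r}")
    return ProxySettings(
        host=env.get("HEADROOM_HOST", "127.0.0.1"),
        port=int(env.get("HEADROOM_PORT", "8787")),
        mode=mode,
        telemetry=env.get("HEADROOM_TELEMETRY", "off") == "on",
        request_timeout=int(env.get("HEADROOM_REQUEST_TIMEOUT", "300")),
        output_shaper=env.get("HEADROOM_OUTPUT_SHAPER", "0") == "1",
        update_check=env.get("HEADROOM_UPDATE_CHECK", "on") != "off",
    )
```

Passing a plain dict instead of the process environment makes it easy to preview what a given .env file would produce.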

TypeScript SDK

| Variable | Default | Description |
| --- | --- | --- |
| HEADROOM_BASE_URL | http://localhost:8787 | Proxy URL for the TypeScript SDK |
| HEADROOM_API_KEY | (none) | API key if the proxy requires authentication |

Model Downloads

| Variable | Description |
| --- | --- |
| HF_HUB_OFFLINE | Set to 1 to disable all HuggingFace Hub network calls (requires pre-downloaded model) |
| HF_ENDPOINT | Override the HuggingFace Hub endpoint; use this to point at a trusted internal mirror |
| ORT_STRATEGY | Set to system to use a pre-installed ONNX Runtime instead of downloading from cdn.pyke.io |
| ORT_LIB_LOCATION | Path to a pre-installed ONNX Runtime library (used with ORT_STRATEGY=system) |

Updating

headroom update detects how Headroom was installed and upgrades in place across macOS, Linux, and Windows:
headroom update          # upgrade to the latest release
headroom update --check  # report the latest release without upgrading
headroom update --pre    # include pre-releases
For git checkouts, editable installs, Docker images, and externally-managed system Pythons (PEP 668), the command prints the correct manual step instead of guessing. The proxy also shows a one-line “update available” notice on startup. It checks PyPI at most once a day, in the background, and never blocks. Opt out with HEADROOM_UPDATE_CHECK=off.
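
The "at most once a day" behavior amounts to a simple timestamp throttle. The following sketch shows that logic in isolation. It is not Headroom's implementation, and where the last-check timestamp is stored is left to the caller.

```python
import time

ONE_DAY_SECONDS = 24 * 60 * 60


def update_check_due(last_check, now=None, env_value="on"):
    """Decide whether a background update check should run.

    last_check: Unix timestamp of the previous check, or None if never checked.
    env_value:  the value of HEADROOM_UPDATE_CHECK ("off" disables checks).
    """
    if env_value == "off":
        return False
    now = time.time() if now is None else now
    return last_check is None or now - last_check >= ONE_DAY_SECONDS
```

With this rule, restarting the proxy several times in one day triggers at most one PyPI lookup.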

Corporate / SSL-Inspection Environments

CERTIFICATE_VERIFY_FAILED During Install

If pip install "headroom-ai[all]" fails with CERTIFICATE_VERIFY_FAILED (unable to get local issuer certificate), your network uses SSL inspection — a MITM proxy presenting a company-issued CA. The build backend downloads Rust (rustup) over a connection your TLS stack does not trust. Install Rust first so the build does not fetch it:
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh && rustup default stable
Restart your shell, then re-run pip install "headroom-ai[all]". A prebuilt wheel avoids the Rust build entirely where available:
pip install --only-binary headroom-ai headroom-ai
Prebuilt wheels are published for Windows (win_amd64), Linux (x86_64 / aarch64), and macOS (Apple Silicon). The Rust-first dance above is only needed for the platform-independent sdist fallback (e.g. Intel macOS).

Two runtime assets are fetched over TLS. If they are blocked, trust your corporate CA via REQUESTS_CA_BUNDLE / SSL_CERT_FILE / CURL_CA_BUNDLE:
  • cdn.pyke.io — the ONNX Runtime for the Rust core. Pre-provide it with ORT_STRATEGY=system and ORT_LIB_LOCATION=/path/to/onnxruntime.
  • huggingface.co — the Kompress-v2-base model. Pre-download it and run with HF_HUB_OFFLINE=1, or set HF_ENDPOINT to a trusted mirror.
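
The variables above can be assembled once and passed to whatever launches the proxy. This is a minimal sketch: the `corporate_env` helper and its parameters are hypothetical, and the paths you pass in are your own.

```python
import os

CA_VARS = ("REQUESTS_CA_BUNDLE", "SSL_CERT_FILE", "CURL_CA_BUNDLE")


def corporate_env(ca_bundle, base=None, ort_lib=None, hf_offline=False):
    """Return a copy of the environment with the corporate CA and offline options set."""
    env = dict(os.environ if base is None else base)
    for name in CA_VARS:
        env[name] = ca_bundle  # point every TLS client at the same CA bundle
    if ort_lib:
        # Use a pre-installed ONNX Runtime instead of fetching from cdn.pyke.io.
        env["ORT_STRATEGY"] = "system"
        env["ORT_LIB_LOCATION"] = ort_lib
    if hf_offline:
        # Requires the model to be downloaded beforehand.
        env["HF_HUB_OFFLINE"] = "1"
    return env
```

The resulting dict can be passed as the `env` argument to `subprocess.run` when starting `headroom proxy`, which keeps the settings out of your global shell profile.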

Basic Constraints of CA cert not marked critical (Python 3.13+)

This is a different failure. Python 3.13 + OpenSSL 3.x enable VERIFY_X509_STRICT by default (RFC 5280 §4.2.1.9): a CA cert’s basicConstraints must be marked critical. Inspection roots like Zscaler set CA:TRUE without the critical bit, so the chain is rejected. Set HEADROOM_TLS_STRICT=0 to clear only the strict flag from every TLS context Headroom controls — the proxy’s upstream client and the HuggingFace model-download path. Chain validation, signature, expiry, and hostname checks all remain on:
HEADROOM_TLS_STRICT=0 headroom proxy --port 8787
The Rust core’s ONNX download (cdn.pyke.io) uses a separate TLS stack (rustls / OS trust store) that HEADROOM_TLS_STRICT does not affect. On Windows, the corporate root must be in the machine certificate store, or pre-provision ONNX Runtime with ORT_STRATEGY=system + ORT_LIB_LOCATION.
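
For readers curious what "clearing only the strict flag" means at the Python level, the standard `ssl` module exposes it directly. The sketch below is not Headroom's code; it shows the general technique on a default context and confirms that the other checks stay enabled.

```python
import ssl


def relaxed_strict_context():
    """Default TLS context with only VERIFY_X509_STRICT cleared."""
    ctx = ssl.create_default_context()
    # Drop the strict RFC 5280 checks; everything else is left untouched.
    ctx.verify_flags &= ~ssl.VERIFY_X509_STRICT
    return ctx


ctx = relaxed_strict_context()
# Certificate chain validation and hostname checking remain on.
assert ctx.verify_mode == ssl.CERT_REQUIRED
assert ctx.check_hostname is True
```

This is narrower than disabling verification: an expired, mis-signed, or wrong-host certificate is still rejected.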

Troubleshooting

Python Version Error

Headroom requires Python 3.10 or later. Check your version:
python3 --version
If needed (macOS with Homebrew):
brew install python@3.13

Editable Install Fails (pip install -e)

Upgrade pip to the latest version:
python3 -m pip install --upgrade pip

Missing cargo (Rust Error)

Some install paths compile a Rust extension. Install Rust via rustup.rs:
# macOS / Linux
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
Restart your shell and re-run the install.

Next Steps

Quickstart

Compress your first messages and see token savings in under 5 minutes.

Proxy Server

Run headroom proxy as a persistent service and open the savings dashboard.

Configuration

CompressConfig, target ratios, savings profiles, and runtime toggles.

Vercel AI SDK

wrapLanguageModel middleware for Next.js and edge functions.
