Use this file to discover all available pages before exploring further.
Headroom integrates with the Anthropic SDK through withHeadroom(), a single wrapper that intercepts messages.create() calls and compresses conversation history before they reach Claude. The adapter handles full Anthropic message format including content blocks, tool use, and tool results — the conversion is lossless, so your request and response behave identically to an unwrapped client.
Use HeadroomClient with AnthropicProvider to wrap your Anthropic instance. AnthropicProvider enables accurate token counting against Claude’s exact context limits:
Headroom’s CacheAligner stabilizes prompt prefixes so Anthropic’s prompt-caching KV cache actually hits on repeated calls. When compression rearranges message content, CacheAligner ensures the stable prefix (system prompt, earlier turns) is positioned consistently to maximize cache hit rate.
The AnthropicProvider used by Headroom knows Claude’s exact context limits per model: 200 000 tokens for claude-3-5-sonnet and above, 100 000 for claude-3-haiku. This means compression only activates when you actually need it.
Supported models
claude-opus-4, claude-sonnet-4-5-20250929, claude-haiku-3-5, and all claude-3 variants. Context limits are auto-detected per model ID.
Prompt caching
CacheAligner keeps your stable prefixes pinned so Anthropic’s prompt cache keeps hitting even after Headroom compresses later turns.