Headroom can be configured via the SDK constructor, the headroom proxy command line, environment variables, or per-request overrides. Settings are applied in this order — later entries override earlier ones: built-in defaults → environment variables → SDK constructor arguments → per-request overrides.
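The precedence order above can be pictured as a layered merge. The following is a minimal sketch, not Headroom's implementation: `resolve_settings`, `BUILT_IN_DEFAULTS`, and the setting names `mode` and `keep_turns` are hypothetical and used only for illustration.

```python
from typing import Any, Mapping

# Hypothetical built-in defaults; the real setting names and values may differ.
BUILT_IN_DEFAULTS: dict[str, Any] = {"mode": "audit", "keep_turns": 3}


def resolve_settings(
    env_settings: Mapping[str, Any],
    constructor_args: Mapping[str, Any],
    request_overrides: Mapping[str, Any],
) -> dict[str, Any]:
    """Merge layers so that later layers win, skipping unset (None) values."""
    resolved = dict(BUILT_IN_DEFAULTS)
    # Order matters: environment, then constructor, then per-request.
    for layer in (env_settings, constructor_args, request_overrides):
        resolved.update({k: v for k, v in layer.items() if v is not None})
    return resolved


resolved = resolve_settings(
    env_settings={"mode": "optimize"},
    constructor_args={"keep_turns": 5},
    request_overrides={"mode": "simulate"},
)
print(resolved)  # {'mode': 'simulate', 'keep_turns': 5}
```

In this sketch the per-request `mode` wins over the environment value, while `keep_turns` comes from the constructor because no later layer sets it.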
These modes apply to SDK usage via HeadroomClient(default_mode=...) or as a per-request override. They are not the same axis as the proxy --mode flag — each controls a different layer of the stack.
| Mode | Behavior | Use case |
| --- | --- | --- |
| audit | Observes and logs; no modifications | Production monitoring, baseline measurement |
| optimize | Applies safe, deterministic transforms | Production optimization |
| simulate | Returns the transform plan without making an API call | Testing, cost estimation |
Proxy --mode is a separate axis. Running headroom proxy --mode token maximizes compression by rewriting prior turns, while --mode cache freezes prior turns to maximize provider prefix-cache hit rates. The proxy does not accept audit, optimize, or simulate.
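To make the difference between the three SDK modes concrete, here is a rough sketch of how a client could branch on them. It is an illustration under stated assumptions, not Headroom's code: `handle_request`, `transform`, and `call_api` are hypothetical names, and the audit branch assumes the planned transform is computed only so it can be logged.

```python
from typing import Any, Callable

Messages = list[Any]


def handle_request(
    mode: str,
    messages: Messages,
    transform: Callable[[Messages], Messages],
    call_api: Callable[[Messages], Any],
) -> Any:
    """Illustrative dispatcher for audit, optimize, and simulate."""
    if mode == "audit":
        # Observe and log only; the original messages are sent unchanged.
        planned = transform(messages)
        print(f"audit: {len(messages)} -> {len(planned)} messages if optimized")
        return call_api(messages)
    if mode == "optimize":
        # Apply the transforms, then make the API call.
        return call_api(transform(messages))
    if mode == "simulate":
        # Return the plan without making an API call.
        return {"plan": transform(messages)}
    raise ValueError(f"unknown mode: {mode!r}")
```

Only the simulate branch skips `call_api`, which is why that mode suits cost estimation.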
SmartCrusher is Headroom’s universal JSON array compressor. It uses statistical analysis to intelligently select which items to keep while preserving the original JSON schema.
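As a rough intuition for statistical item selection, the sketch below keeps the first and last records of an array plus any record whose numeric field is an outlier, and returns the kept records unmodified so their shape is preserved. This is not SmartCrusher's actual algorithm: the function `crush_array`, the z-score rule, and the `z_cutoff` parameter are assumptions chosen for illustration.

```python
import statistics
from typing import Any


def crush_array(
    items: list[dict[str, Any]], field: str, z_cutoff: float = 2.0
) -> list[dict[str, Any]]:
    """Keep boundary items and statistical outliers; drop the rest."""
    if len(items) <= 2:
        return list(items)
    values = [item[field] for item in items]
    mean = statistics.fmean(values)
    stdev = statistics.pstdev(values)
    keep = {0, len(items) - 1}  # always keep the first and last item
    if stdev > 0:
        for index, value in enumerate(values):
            if abs(value - mean) / stdev >= z_cutoff:
                keep.add(index)
    # Kept items are returned as-is, so every record has its original keys.
    return [items[i] for i in sorted(keep)]


rows = [{"id": i, "latency_ms": 10} for i in range(20)]
rows[7]["latency_ms"] = 900  # one anomalous record
print([row["id"] for row in crush_array(rows, "latency_ms")])  # [0, 7, 19]
```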
CacheAligner stabilizes system-prompt prefixes so provider KV caches actually hit on repeated turns. It extracts dynamic content (dates, UUIDs, tokens) and moves it to a trailing section after a stable prefix.
CacheAligner is disabled by default; prefix-stability gains are marginal in most setups.

| Option | Default | Description |
| --- | --- | --- |
| use_dynamic_detector | True | Use full DynamicContentDetector (15+ patterns) instead of legacy date regex |
| detection_tiers | ["regex"] | Detection tiers: regex (fast, ~0 ms), ner (spaCy, ~5–10 ms), semantic (~20–50 ms) |
| extra_dynamic_labels | [] | Extra KEY names that hint the VALUE is dynamic (e.g. "session") |
| entropy_threshold | 0.7 | Entropy threshold for random-string detection (0–1; higher = more selective) |
| normalize_whitespace | True | Normalize whitespace in system prompts |
| collapse_blank_lines | True | Collapse multiple blank lines |
| dynamic_tail_separator | "\n\n---\n[Dynamic Context]\n" | Separator marking where dynamic content begins |
| date_patterns | (4 patterns) | Legacy date regex patterns used when use_dynamic_detector=False |
CacheAligner is applied only to system messages — never to user, assistant, or tool content. Code blocks with significant indentation or ASCII art may be affected by whitespace normalization; test before enabling in production.
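The idea of moving dynamic content behind a stable prefix can be sketched with two regex patterns. This is a simplified illustration, not CacheAligner itself: `align_system_prompt`, the two patterns, and the `<date>` / `<uuid>` placeholders are assumptions. Only the separator string is taken from the default dynamic_tail_separator above.

```python
import re

# Default dynamic_tail_separator as listed in the options above.
SEPARATOR = "\n\n---\n[Dynamic Context]\n"

# Two illustrative patterns only; UUIDs are matched before dates.
DYNAMIC_PATTERNS = {
    "uuid": re.compile(
        r"\b[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}"
        r"-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}\b"
    ),
    "date": re.compile(r"\b\d{4}-\d{2}-\d{2}\b"),
}


def align_system_prompt(prompt: str) -> str:
    """Replace dynamic values with placeholders and append them as a tail."""
    extracted: list[str] = []
    stable = prompt
    for label, pattern in DYNAMIC_PATTERNS.items():

        def move_to_tail(match: re.Match, label: str = label) -> str:
            extracted.append(f"{label}: {match.group(0)}")
            return f"<{label}>"  # hypothetical placeholder format

        stable = pattern.sub(move_to_tail, stable)
    if not extracted:
        return stable
    return stable + SEPARATOR + "\n".join(extracted)


monday = align_system_prompt("Today is 2025-01-15. Be concise.")
tuesday = align_system_prompt("Today is 2025-01-16. Be concise.")
# Everything before the separator is identical across the two days.
print(monday.split(SEPARATOR)[0] == tuesday.split(SEPARATOR)[0])  # True
```

Because the text before the separator no longer changes from day to day, a provider prefix cache keyed on that text has a chance to hit on repeated turns.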
Override configuration on individual requests without changing global settings:
Python

```python
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[...],
    # Override mode for this request only
    headroom_mode="audit",
    # Reserve more tokens for output
    headroom_output_buffer_tokens=8000,
    # Keep last N turns uncompressed
    headroom_keep_turns=5,
    # Skip compression for specific tools
    headroom_tool_profiles={
        "important_tool": {"skip_compression": True},
        "search_tool": {"max_items_after_crush": 25},
    },
)
```

TypeScript

```typescript
import { compress } from 'headroom-ai';

const result = await compress(messages, {
  model: 'gpt-4o',
  tokenBudget: 100_000,
  timeout: 15_000,
});
```
Set to 0 to relax OpenSSL’s RFC 5280 strict CA-constraint check (required behind Zscaler/Netskope on Python 3.13+). Chain validation, hostname, and expiry checks remain enabled.
| Variable | Default | Description |
| --- | --- | --- |
| HEADROOM_PROXY_TOKEN | — | Require this bearer token for non-loopback callers |
Read-mostly config root. Derives models.json and plugin config paths.
| Variable | Default | Description |
| --- | --- | --- |
| HEADROOM_WORKSPACE_DIR | ~/.headroom | Read-write state root. Derives savings, memory DB, logs, TOIN, and more. |
| HEADROOM_BASE_URL | http://localhost:8787 | TypeScript SDK: proxy base URL |
| HEADROOM_API_KEY | — | TypeScript SDK: optional API key for authenticated endpoints |
Set HEADROOM_CONTEXT_TOOL=lean-ctx before headroom wrap to use lean-ctx for CLI context filtering instead of RTK. Both tools are supported; RTK is the default.
Configure context limits and pricing for new or fine-tuned models. Save as ~/.headroom/config/models.json or point HEADROOM_MODEL_LIMITS at a JSON string or file path: