Headroom’s compression subsystem is built from composable transform classes. When you useDocumentation Index
Fetch the complete documentation index at: https://mintlify.com/headroomlabs-ai/headroom/llms.txt
Use this file to discover all available pages before exploring further.
HeadroomClient or compress(), these transforms run automatically. You can also call them directly when you need fine-grained control — to benchmark individual transforms, build custom pipelines, or integrate into frameworks that already manage their own request loop.
SmartCrusher
SmartCrusher is Headroom’s primary compressor for JSON and array tool outputs. It uses statistical analysis to identify which items are important — errors, anomalies, query-relevant entries, and boundary items — and drops everything else. The output preserves the original JSON schema exactly; no wrapper objects or generated text are inserted.What SmartCrusher Preserves
| Category | Guarantee | Mechanism |
|---|---|---|
| Error items | 100% kept | status: error, level: error, exception objects |
| First N items | 100% kept | Configurable first_fraction of max_items_after_crush |
| Last N items | 100% kept | Configurable last_fraction of max_items_after_crush |
| Anomalies | 100% kept | Numeric values > variance_threshold std devs from mean |
| Relevant items | Top K kept | BM25 / embedding / hybrid scoring against user query |
| Change points | Kept | Significant data transitions (5-item window) |
Constructor
Compression configuration. When
None, uses SmartCrusherConfig() defaults. See SmartCrusherConfig for all fields.Direct Usage
SmartCrusher is backed by a Rust extension (
headroom._core.SmartCrusher) built with PyO3. The public Python surface — SmartCrusherConfig, SmartCrusher, and CrushResult — is unchanged. Build the extension locally with scripts/build_rust_extension.sh or install a prebuilt wheel.CCR Sentinels
When SmartCrusher’s lossy row-drop path removes items, it appends a sentinel object to the kept-items array:headroom_retrieve(hash) to fetch the original data. If you iterate a compressed array and need to skip sentinels:
CacheAligner
CacheAligner is a detector-only transform. It scans system messages for volatile content that would cause provider KV-cache misses and logs warnings when instability is found. It does not modify messages — the system prompt is never mutated. Detected volatile patterns include:- UUIDs — RFC 4122 canonical form (36 chars with dashes)
- ISO 8601 timestamps — parsed via
datetime.fromisoformat - JWTs — three dot-separated base64url segments
- Hex hashes — MD5 (32), SHA1 (40), SHA256 (64) character strings
Constructor
Aligner configuration. When
None, uses CacheAlignerConfig() defaults. See CacheAlignerConfig.Direct Usage
get_alignment_score()
0.0 (highly volatile prefix) to 100.0 (perfectly stable prefix). Each detected volatile pattern (UUID, timestamp, JWT, etc.) deducts 10 points. Useful for monitoring cache health without running the full pipeline.
TransformPipeline
TransformPipeline orchestrates transforms in sequence. The default pipeline runs:
- CacheAligner — detect volatile content in the system prefix
- ContentRouter — route each message to the appropriate compressor:
- SmartCrusher for JSON arrays
- Kompress for text
- CodeAwareCompressor for source code
- SearchCompressor for web/grep results
- LogCompressor for log output
- DiffCompressor for diffs
transforms list to override the default order.
Constructor
Full Headroom configuration. When
None, HeadroomConfig() defaults are used.Custom transform list. When provided, replaces the default
[CacheAligner, ContentRouter] order entirely.Provider for model-specific tokenization. Used when building the default pipeline with per-provider behavior.
apply()
TransformResult. The model_limit keyword argument is required and must be provided explicitly (there is no default — passing None raises ValueError). Common kwargs:
model_limit: int— context window size in tokens (required)output_buffer: int— tokens to reserve for model output (default4000)tool_profiles: dict[str, dict]— per-tool compression profiles
simulate()
TransformResult as apply() but does not persist metrics. Accepts the same kwargs as apply(), including the required model_limit.
CompressionHooks
CompressionHooks is a base class with no-op defaults. Subclass it to inject custom logic at three well-defined pipeline stages:
pre_compress— modify messages before compressioncompute_biases— set per-message compression aggressivenesspost_compress— observe results after compressionon_pipeline_event— observe canonical lifecycle events
pre_compress
compute_biases
{message_index: bias_float}. Values:
1.0— default compression> 1.0— compress less aggressively (keep more)< 1.0— compress more aggressively- Missing indices default to
1.0
post_compress
CompressContext Fields
Model name for this compression call.
Extracted user query (empty if not detected).
Turn counter within the session.
Tool names called in this context.
Provider name:
"anthropic", "openai", "gemini", etc.CompressEvent Fields
Tokens before compression.
Tokens after compression.
tokens_before - tokens_after.Fraction of tokens saved.
Transforms that ran.
CCR hashes for any offloaded data.
Model name.
Extracted user query.
Provider name.
PipelineStage
PipelineStage is a string enum listing the canonical lifecycle stages that PipelineExtensionManager emits events for.
PipelineEvent
PipelineEvent is the event object emitted at each stage. Extensions receive it via on_pipeline_event.
The stage this event was emitted from.
Operation name, e.g.
"sdk.request", "compress".Unique request identifier (empty string if not set).
Provider name.
Model name.
Messages at this stage. May be
None for stages that don’t involve messages.Tools list at this stage.
Request headers at this stage.
API response (only at
POST_SEND / RESPONSE_RECEIVED).Stage-specific metadata (e.g. token counts, transform names).
PipelineExtensionManager
PipelineExtensionManager dispatches PipelineEvent objects to a list of extensions. Extensions are loaded from:
- The
hooks=argument (any object withon_pipeline_event) - The
extensions=list - Auto-discovered entry points under the
headroom.pipeline_extensiongroup (whendiscover=True)
CANONICAL_PIPELINE_STAGES
A tuple of allPipelineStage values in execution order:
Custom Pipeline Example
The following example composes a minimal custom pipeline with SmartCrusher and CacheAligner, attaches observability hooks, and runs it directly on a message list:Using Transforms with compress()
Hooks integrate directly with thecompress() function — no HeadroomClient required:
When building a custom
TransformPipeline with an explicit transforms list, the ContentRouter is not included unless you add it. The ContentRouter is responsible for dispatching individual messages to content-specific compressors (SmartCrusher, Kompress, CodeAwareCompressor, etc.). If you only include SmartCrusher directly, only explicit SmartCrusher logic runs — not the full routing heuristics.