The /v1/compress endpoint is the primary way to use SuperCompress from non-Python environments. Send a context string and a user query; receive a compressed version along with token counts and savings metrics. Every successful request is recorded against the API key used, so usage dashboards stay up to date automatically.
Request
Method: POST
Path: /v1/compress
Auth header: X-API-Key: sc_live_… or Authorization: Bearer sc_live_…
Content-Type: application/json
Body parameters
context: The full context to compress, for example a retrieved document, conversation history, or code file. Maximum 120,000 characters.
query: The current user query. SuperCompress uses this to guide token retention, keeping the content most relevant to the question. Maximum 2,000 characters.
budget_ratio: Fraction of tokens to retain. Must be between 0.05 and 1.0 inclusive; defaults to 0.35 if omitted. A value of 0.35 retains roughly 35% of the original tokens, yielding up to 65% KV-cache savings.
Example request
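A request along these lines can be sketched in Python using only the standard library. The host in BASE_URL is a placeholder assumption (the base URL is not given here), and the helper names build_compress_request and send are hypothetical, chosen for illustration. The sketch checks the documented limits client-side before building the POST request, and supports either documented auth header.

```python
import json
import urllib.request

# Assumption: placeholder host. Substitute the base URL of your deployment.
BASE_URL = "https://supercompress.example.com"

MAX_CONTEXT_CHARS = 120_000  # documented limit for `context`
MAX_QUERY_CHARS = 2_000      # documented limit for `query`


def build_compress_request(api_key, context, query, budget_ratio=0.35, use_bearer=False):
    """Check the documented limits and build (but do not send) the POST request."""
    if len(context) > MAX_CONTEXT_CHARS:
        raise ValueError("context exceeds 120,000 characters")
    if len(query) > MAX_QUERY_CHARS:
        raise ValueError("query exceeds 2,000 characters")
    if not 0.05 <= budget_ratio <= 1.0:
        raise ValueError("budget_ratio must be between 0.05 and 1.0 inclusive")

    body = json.dumps(
        {"context": context, "query": query, "budget_ratio": budget_ratio}
    ).encode("utf-8")

    headers = {"Content-Type": "application/json"}
    if use_bearer:
        headers["Authorization"] = f"Bearer {api_key}"  # alternative auth form
    else:
        headers["X-API-Key"] = api_key

    return urllib.request.Request(
        f"{BASE_URL}/v1/compress", data=body, headers=headers, method="POST"
    )


def send(request):
    """Send the request and decode the JSON body of a successful response."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling send(build_compress_request(key, context, query)) performs the actual call. Keeping the builder separate from the network step makes the limit checks easy to exercise without a live key.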
Response
A successful request returns HTTP 200 with a JSON object.
The compressed context, ready to be inserted into your LLM prompt. Pass this in place of the original context.
Token count of the input context before compression.
Token count of the compressed context after compression.
Percentage of tokens removed: (1 − kept_tokens / original_tokens) × 100. Rounded to two decimal places.
Share of input lines retained in the output, including sink lines and recent-context lines. Rounded to three decimal places.
Name of the compression policy that was applied, e.g. "SuperCompress" for the learned policy or "H2O-fallback" when the model falls back to a heuristic baseline.
The budget ratio that was used (echoes the request value, or the default 0.35 if omitted).
Example response
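The savings percentage can be cross-checked from the two token counts. The sketch below applies the formula given for the savings field; the function name savings_percent is hypothetical, and the numbers are illustrative rather than real API output.

```python
def savings_percent(original_tokens: int, kept_tokens: int) -> float:
    """Percentage of tokens removed, rounded to two decimal places."""
    if original_tokens <= 0:
        raise ValueError("original_tokens must be positive")
    return round((1 - kept_tokens / original_tokens) * 100, 2)


# Illustrative numbers only: 12,000 input tokens compressed to 4,200.
print(savings_percent(12_000, 4_200))
```

With these illustrative counts, 35% of the tokens are kept, so the computed savings is 65%.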
Usage tracking
Every successful call to /v1/compress automatically increments the request count and token tallies for the API key used. View aggregated usage on the dashboard or via GET /api/keys/{id}/usage.
Usage is recorded after compression succeeds. A 400 or 401 error response does not consume quota.
Error responses
| Status | Cause |
|---|---|
| 400 | Invalid request body (field out of range, context too large, etc.) |
| 401 | Missing, malformed, or revoked API key |
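One way a client might turn these status codes into distinct errors is sketched below. The function name check_status and the choice of exception types are illustrative assumptions, not part of the API. Only the two documented codes are handled specifically; anything else is reported as unexpected.

```python
def check_status(status: int) -> None:
    """Return quietly on 200; raise a descriptive error for other codes."""
    if status == 200:
        return
    if status == 400:
        # Fix the payload before retrying: resending it unchanged will fail again.
        raise ValueError(
            "400: invalid request body (field out of range, context too large, etc.)"
        )
    if status == 401:
        # Check that the key is present, well-formed, and not revoked.
        raise PermissionError("401: missing, malformed, or revoked API key")
    raise RuntimeError(f"unexpected HTTP status {status}")
```

With urllib.request, a 4xx response surfaces as urllib.error.HTTPError, and its code attribute can be passed to check_status.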
Unauthenticated playground
POST /api/compress accepts the same context, query, and budget_ratio fields but requires no API key. It is intended for the browser playground and local smoke tests. It additionally supports a compare field:
compare: When true, the response includes a compare map with results from every built-in policy (FIFO, Truncation, Summarization, H2O, and SuperCompress) run side by side.
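A playground request with comparison enabled might be built as follows. This is a sketch under assumptions: the host in BASE_URL is a placeholder, the helper name build_playground_request is hypothetical, and the JSON content type is assumed to match /v1/compress. No API key header is set, since this endpoint requires none.

```python
import json
import urllib.request

# Assumption: placeholder host. Substitute the base URL of your deployment.
BASE_URL = "https://supercompress.example.com"


def build_playground_request(context, query, budget_ratio=0.35, compare=False):
    """Build (but do not send) an unauthenticated POST to /api/compress."""
    payload = {"context": context, "query": query, "budget_ratio": budget_ratio}
    if compare:
        # Ask for side-by-side results from every built-in policy.
        payload["compare"] = True
    return urllib.request.Request(
        f"{BASE_URL}/api/compress",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},  # assumed, as for /v1/compress
        method="POST",
    )
```

Passing the result to urllib.request.urlopen sends it. The compare key is left out when comparison is not requested, so the payload then matches a plain compression call.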