Management API: usage statistics and analytics endpoints

MonoRelay tracks usage statistics in two layers: a fast in-memory counter that persists to data/stats.json on every update, and a SQLite-backed log database that enables richer historical queries. The stats API surfaces both layers in a single response and also exposes per-client usage tracking from the dedicated usage tracker.

Authentication

All stats endpoints require a valid JWT, sent as a bearer token in the Authorization header:

Authorization: Bearer <jwt>

Endpoints

GET /api/stats

The primary stats endpoint. Returns a composite object combining in-memory global counters, persistent database aggregates, per-key statistics, and per-model detail breakdowns.

curl http://localhost:8787/api/stats \
  -H "Authorization: Bearer <jwt>"
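
The same call can be made from a script. The following is a minimal Python sketch using only the standard library; the helper names build_stats_request and fetch_stats are illustrative and not part of MonoRelay, and the check on the success field assumes the response envelope shown in the example response further down.

```python
import json
import urllib.request


def build_stats_request(base_url: str, token: str) -> urllib.request.Request:
    """Build an authenticated GET request for /api/stats (no network I/O here)."""
    url = base_url.rstrip("/") + "/api/stats"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}", "Accept": "application/json"},
        method="GET",
    )


def fetch_stats(base_url: str, token: str, timeout: float = 10.0) -> dict:
    """Send the request and return the decoded "data" object.

    Assumes the {"success": ..., "data": ...} envelope from the example response.
    """
    request = build_stats_request(base_url, token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.load(response)
    if not payload.get("success"):
        raise RuntimeError(f"stats request failed: {payload}")
    return payload["data"]
```

Calling fetch_stats("http://localhost:8787", "<jwt>") would then return the object whose fields are described in the rest of this section.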

data.in_memory (object)

In-memory stats accumulated since the last restart or reset:

total_requests — total number of proxied requests
total_errors — number of requests that returned an error
error_rate — total_errors / total_requests
total_tokens_in — cumulative input tokens
total_tokens_out — cumulative output tokens
total_cache_hit_tokens — tokens served from upstream prompt cache
total_tokens — combined input and output tokens
estimated_total_cost — estimated USD spend
requests_by_provider — {"openrouter": 142, "anthropic": 58}
requests_by_model — {"claude-opus-4-5": 34, ...}
errors_by_provider — error counts by provider name
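
Two of these fields are derived from the others, so a client can sanity-check a response. The sketch below recomputes error_rate and total_tokens from the documented definitions. The function name check_in_memory, the tolerance, and the treatment of zero requests as a rate of 0.0 are assumptions made for illustration; the server may round or handle the empty case differently.

```python
def check_in_memory(stats: dict, tolerance: float = 1e-4) -> list[str]:
    """Return a list of inconsistencies found in a data.in_memory object."""
    problems = []

    requests = stats["total_requests"]
    # Documented definition: error_rate = total_errors / total_requests.
    # Treating zero requests as a rate of 0.0 is an assumption.
    expected_rate = stats["total_errors"] / requests if requests else 0.0
    if abs(stats["error_rate"] - expected_rate) > tolerance:
        problems.append(
            f"error_rate is {stats['error_rate']}, expected {expected_rate}"
        )

    # Documented definition: total_tokens = input tokens + output tokens.
    expected_tokens = stats["total_tokens_in"] + stats["total_tokens_out"]
    if stats["total_tokens"] != expected_tokens:
        problems.append(
            f"total_tokens is {stats['total_tokens']}, expected {expected_tokens}"
        )

    return problems
```

With the values from the example response (200 requests, 5 errors, 84200 input and 31500 output tokens), the function returns an empty list.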

data.persistent (object)

Aggregates computed from the SQLite request log:

total_requests — all logged requests (survives restarts)
total_cost — total estimated cost from log records
avg_latency_ms — average end-to-end latency
input_tokens — total input tokens across all log entries
output_tokens — total output tokens
cache_hit_tokens — total cache-hit tokens from log entries

data.keys (object)

Per-provider, per-key health data from the key manager: request counts, failure counts, cooldown state, and quota usage.

data.models (object)

Per-model detail from the stats tracker. Each key is a model name:

requests — total requests using that model
errors — error count for that model
total_tokens_in / total_tokens_out — token totals
total_cache_hit_tokens — cache-hit tokens for that model
avg_first_token_ms — exponential-decay weighted average TTFT in ms
avg_speed_tps — exponential-decay weighted average output tokens per second
streaming_requests — number of streaming requests
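
The two averaged fields, avg_first_token_ms (time to first token, TTFT) and avg_speed_tps, are described as exponential-decay weighted averages, meaning recent requests count more than old ones. The sketch below shows one common form of such an average, the exponential moving average. It is not MonoRelay's implementation: the function name ema_update, the smoothing factor alpha = 0.2, and seeding the average with the first sample are all assumptions.

```python
from typing import Optional


def ema_update(previous: Optional[float], sample: float, alpha: float = 0.2) -> float:
    """Fold one new sample into an exponential moving average.

    alpha is a hypothetical smoothing factor: higher values weight recent
    samples more heavily, and each older sample decays by (1 - alpha) per update.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    if previous is None:
        # Assumption: the first observation seeds the average.
        return float(sample)
    return alpha * sample + (1.0 - alpha) * previous


# Example: TTFT samples in milliseconds for one model.
average = None
for ttft_ms in (600.0, 700.0):
    average = ema_update(average, ttft_ms)
# average is now 0.2 * 700 + 0.8 * 600 = 620.0
```

A consequence of this kind of averaging is that a single slow request shifts the reported value only partially, and its influence fades as later requests arrive.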

Example response:

{
  "success": true,
  "data": {
    "in_memory": {
      "total_requests": 200,
      "total_errors": 5,
      "error_rate": 0.025,
      "total_tokens_in": 84200,
      "total_tokens_out": 31500,
      "total_cache_hit_tokens": 12300,
      "total_tokens": 115700,
      "estimated_total_cost": 0.0423,
      "requests_by_provider": {"anthropic": 120, "openrouter": 80},
      "requests_by_model": {"claude-opus-4-5": 90, "gpt-4o-mini": 80},
      "errors_by_provider": {"openrouter": 5}
    },
    "persistent": {
      "total_requests": 200,
      "total_cost": 0.042,
      "avg_latency_ms": 1284.3,
      "input_tokens": 84200,
      "output_tokens": 31500,
      "cache_hit_tokens": 12300
    },
    "models": {
      "claude-opus-4-5": {
        "requests": 90,
        "errors": 0,
        "total_tokens_in": 51000,
        "total_tokens_out": 18200,
        "total_cache_hit_tokens": 12300,
        "avg_first_token_ms": 621.4,
        "avg_speed_tps": 38.7,
        "streaming_requests": 74
      }
    }
  }
}
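
A client will often want ratios rather than raw counters. The sketch below derives a few per-model ratios from the data object of a response shaped like the example above. The function name summarize_models is illustrative, and cache_hit_share assumes that cache-hit tokens are a subset of input tokens, which the field description suggests but does not state outright.

```python
def summarize_models(data: dict) -> dict:
    """Derive per-model ratios from the "data" object of GET /api/stats."""
    summary = {}
    for name, model in data.get("models", {}).items():
        requests = model.get("requests", 0)
        tokens_in = model.get("total_tokens_in", 0)
        summary[name] = {
            # Fraction of requests for this model that ended in an error.
            "error_rate": model.get("errors", 0) / requests if requests else 0.0,
            # Fraction of requests for this model that were streamed.
            "streaming_share": (
                model.get("streaming_requests", 0) / requests if requests else 0.0
            ),
            # Assumption: cache-hit tokens are counted within input tokens.
            "cache_hit_share": (
                model.get("total_cache_hit_tokens", 0) / tokens_in if tokens_in else 0.0
            ),
        }
    return summary
```

For the claude-opus-4-5 entry in the example, this gives an error rate of 0.0 and a streaming share of 74 / 90.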

GET /api/stats/enhanced

Returns an expanded stats object that includes per-provider breakdowns enriched with provider configuration (cost rates, key counts) and a flat key health inventory useful for dashboards.

data.provider_breakdown (object)

One entry per enabled provider:

enabled — provider enabled state
total_requests — requests routed to this provider
total_errors — errors from this provider
keys.total / keys.enabled — key inventory counts
cost_per_m_input / cost_per_m_output — configured pricing rates

data.key_health (array)

Flat list of all keys across all providers, each entry containing provider, label, enabled, total_requests, total_failures, is_available, cooldown_until, quota_limit, quota_used, rate_limit_rps, and expires_at.
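
Because the list is flat, a dashboard can scan it in a single pass. The sketch below flags enabled keys that are currently unavailable or close to their quota. The function name keys_needing_attention and the 90% threshold are illustrative choices, and the code assumes that quota_limit is a number or null and that quota_used is a number; the field types are not documented here.

```python
def keys_needing_attention(
    key_health: list[dict], quota_ratio: float = 0.9
) -> list[tuple[str, str, str]]:
    """Return (provider, label, reason) for enabled keys that look unhealthy."""
    flagged = []
    for key in key_health:
        if not key.get("enabled", True):
            continue  # skip keys that were disabled on purpose
        ident = (key.get("provider", "?"), key.get("label", "?"))
        if not key.get("is_available", True):
            flagged.append((*ident, "unavailable"))
        limit = key.get("quota_limit")
        used = key.get("quota_used")
        # Assumption: a missing, null, or zero quota_limit means "no quota".
        if limit and used is not None and used / limit >= quota_ratio:
            flagged.append((*ident, "quota"))
    return flagged
```

A key can appear twice in the result if it is both unavailable and near its quota.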

GET /api/stats/file

Returns the raw stats.json file content as a string. Useful for backup or cross-instance synchronization.

{"content": "{ \"total_requests\": 200, ... }"}

PUT /api/stats/file

Replaces the stats.json file content and immediately reloads the in-memory stats from the new content.

content (string, required)

JSON string to write as the new stats file content.
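
Note that content is a string holding JSON, not a nested object, so the stats are serialized twice: once into the content string and once for the request body. The sketch below builds such a request. The function name build_stats_file_put is illustrative, and sending the body as a JSON object with Content-Type: application/json is an assumption based on the shape returned by GET /api/stats/file.

```python
import json
import urllib.request


def build_stats_file_put(
    base_url: str, token: str, stats: dict
) -> urllib.request.Request:
    """Build a PUT /api/stats/file request (no network I/O here)."""
    # Inner dumps: the stats become the JSON *string* carried in "content".
    # Outer dumps: the request body itself, assumed to be a JSON object.
    body = json.dumps({"content": json.dumps(stats)}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/api/stats/file",
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="PUT",
    )
```

The resulting request can be sent with urllib.request.urlopen. Decoding the body with json.loads and then decoding its content field again yields the original stats dictionary.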

POST /api/stats/reset

Clears all in-memory and persisted statistics: resets the in-memory counters, deletes the stats.json file, and clears the SQLite request log.

curl -X POST http://localhost:8787/api/stats/reset \
  -H "Authorization: Bearer <jwt>"

{"success": true, "message": "统计数据已清空"}

The message field is returned in Chinese; it translates to "Statistics data has been cleared."

This operation is irreversible. Both the stats file and the full request log database will be cleared. Export or back up data before calling this endpoint.
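
One way to follow that advice is to save the stats file first and only reset once the backup has been written. The sketch below does this through a caller-supplied send function, a hypothetical transport that takes an HTTP method and path and returns the decoded JSON response; neither send nor backup_then_reset is part of MonoRelay. It covers stats.json only. The SQLite request log is not exported by this sketch and would need a separate backup.

```python
import json
from pathlib import Path
from typing import Callable

# Hypothetical transport: (method, path) -> decoded JSON response.
Send = Callable[[str, str], dict]


def backup_then_reset(send: Send, backup_path: Path) -> dict:
    """Save the raw stats file to disk, then call the reset endpoint."""
    snapshot = send("GET", "/api/stats/file")
    content = snapshot["content"]
    # Refuse to continue if the snapshot is not valid JSON; json.loads raises
    # ValueError in that case, so the reset below is never reached.
    json.loads(content)
    backup_path.write_text(content, encoding="utf-8")
    # Only now is it safe to discard the live statistics.
    return send("POST", "/api/stats/reset")
```

The saved file can later be restored by sending its content to PUT /api/stats/file.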

Per-client usage stats

GET /api/usage/stats

Returns usage statistics broken down by client identity (the username or access-key prefix that made each request). Pass client_id as a query parameter to retrieve stats for a single client.

# All clients
curl http://localhost:8787/api/usage/stats \
  -H "Authorization: Bearer <jwt>"

# Single client
curl "http://localhost:8787/api/usage/stats?client_id=alice" \
  -H "Authorization: Bearer <jwt>"
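
When the client identifier comes from user input, it should be URL-encoded rather than pasted into the query string. A small sketch, with the helper name usage_stats_url chosen for illustration:

```python
from typing import Optional
from urllib.parse import urlencode


def usage_stats_url(base_url: str, client_id: Optional[str] = None) -> str:
    """Build the URL for GET /api/usage/stats, optionally for one client."""
    url = base_url.rstrip("/") + "/api/usage/stats"
    if client_id is not None:
        # urlencode escapes spaces, slashes, and other reserved characters.
        url += "?" + urlencode({"client_id": client_id})
    return url
```

Omitting client_id produces the all-clients URL from the first curl example.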

POST /api/usage/clear

Resets per-client usage counters. Pass client_id to clear only that client’s data, or omit it to clear all clients.

Analytics endpoints

GET /api/analytics/overview

Aggregates cost and usage by provider and model over a date range. Defaults to the last 7 days.

Query param	Description
`start_date`	Start date in `YYYY-MM-DD` format
`end_date`	End date in `YYYY-MM-DD` format
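
Both parameters use the ISO calendar format, which Python's date.isoformat() produces directly. The sketch below builds the request URL for an explicit range and leaves out any parameter that is not supplied, so the server-side default applies. The helper name overview_url and the client-side check that the start is not after the end are illustrative additions.

```python
from datetime import date
from typing import Optional
from urllib.parse import urlencode


def overview_url(
    base_url: str, start: Optional[date] = None, end: Optional[date] = None
) -> str:
    """Build the URL for GET /api/analytics/overview."""
    if start is not None and end is not None and start > end:
        raise ValueError("start date must not be after end date")
    params = {}
    if start is not None:
        params["start_date"] = start.isoformat()  # YYYY-MM-DD
    if end is not None:
        params["end_date"] = end.isoformat()
    url = base_url.rstrip("/") + "/api/analytics/overview"
    return url + ("?" + urlencode(params) if params else "")
```

For example, overview_url("http://localhost:8787", date(2025, 1, 1), date(2025, 1, 31)) requests the whole of January 2025.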

data.total_requests (integer)

Total requests in the date range.

data.total_cost (number)

Estimated total cost in USD.

data.total_tokens (object)

{"input": 84200, "output": 31500} — token totals for the period.

data.by_provider (object)

Per-provider request counts and costs.

data.by_model (object)

Per-model request counts, costs, and token totals.

GET /api/analytics/slow-queries

Lists requests whose first-token latency exceeded a threshold, ordered by first-token latency, slowest first.

Query param	Default	Description
`threshold_ms`	`2000`	Minimum first-token latency in ms
`start_date`	7 days ago	Start of date range
`end_date`	today	End of date range
`limit`	`100`	Maximum results

GET /api/analytics/cost-distribution

Returns cost broken down by provider and by model as percentage shares, useful for pie-chart visualizations in dashboards.
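
The response shape for this endpoint is not documented here, so the sketch below only illustrates the underlying arithmetic: turning absolute costs per provider (or per model) into percentage shares. The function name percentage_shares, the 0 to 100 scale, and the rounding to two decimals are assumptions, and the input values are made up for the example.

```python
def percentage_shares(costs: dict[str, float]) -> dict[str, float]:
    """Convert absolute costs into percentage shares of the total."""
    total = sum(costs.values())
    if total <= 0:
        # No spend recorded: report every share as zero rather than dividing by zero.
        return {name: 0.0 for name in costs}
    return {name: round(100.0 * value / total, 2) for name, value in costs.items()}


# Hypothetical per-provider costs in USD.
shares = percentage_shares({"anthropic": 0.03, "openrouter": 0.01})
# shares == {"anthropic": 75.0, "openrouter": 25.0}
```

Because each share is rounded independently, the values may not sum to exactly 100 when there are many entries.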

Dashboard integration

The MonoRelay dashboard reads all of these endpoints automatically. The stats overview panel uses GET /api/stats, the model breakdown table uses data.models, and the real-time activity feed uses GET /api/logs/stream. You can build your own monitoring dashboards using the same API surface that the built-in UI relies on.
