GET /health
Detailed health check endpoint with account status and model quotas.
Request
Response
Overall server status. Always "ok" if the server is running.
ISO 8601 timestamp of the health check.
Time taken to generate the health report (in milliseconds).
Human-readable summary of account pool status. Example: "3 accounts: 2 available, 1 rate-limited"
Account pool statistics:
- total (number): Total accounts configured
- available (number): Accounts ready to use
- rateLimited (number): Accounts waiting for quota reset
- invalid (number): Accounts requiring re-authentication
Detailed status for each account:
- email (string): Account email address
- status (string): "ok", "rate-limited", "invalid", or "banned"
- lastUsed (string | null): ISO 8601 timestamp of last use
- modelRateLimits (object): Per-model rate limit state
- rateLimitCooldownRemaining (number): Milliseconds until next retry
- models (object): Per-model quota information (see below)
Example Response
Model Quota Information
Each account’s models object contains quota details per model:
- remaining (string): Human-readable percentage (e.g., "75%")
- remainingFraction (number | null): Quota fraction from 0 to 1
  - 1.0 = Full quota available
  - 0.0 = Quota exhausted
  - null = Quota unavailable (error fetching)
- resetTime (string | null): ISO 8601 timestamp when quota resets
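The three cases for remainingFraction are easy to conflate, in particular null (unknown) versus 0.0 (exhausted). A minimal sketch of how a client might classify one quota entry follows; the function name describe_quota and the returned labels are illustrative assumptions, not part of the API.

```python
from typing import Optional


def describe_quota(quota: dict) -> str:
    """Classify one per-model quota entry (hypothetical helper, not part of the API)."""
    fraction: Optional[float] = quota.get("remainingFraction")
    if fraction is None:
        # null means the quota could not be fetched; it does not mean exhausted.
        return "unknown"
    if fraction <= 0.0:
        return "exhausted"
    if fraction >= 1.0:
        return "full"
    # Format the fraction as a whole-number percentage, e.g. 0.75 -> "75%".
    return f"{fraction:.0%} remaining"
```

A caller would apply this to each value in an account's models object and treat "unknown" as a fetch problem rather than as a depleted account.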
GET /account-limits
Fetch quota and subscription data for all accounts across all models.
Request
Query Parameters
Response format:
- json: JSON response (default)
- table: ASCII table for terminal display
Include 30-day usage history in the response. Used by the Web UI dashboard.
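As a rough illustration of combining the two query parameters, the sketch below builds a request URL using the parameter names that appear elsewhere on this page (format and includeHistory). The helper name account_limits_url and the base URL in the usage note are assumptions for the example.

```python
from urllib.parse import urlencode


def account_limits_url(base_url: str, fmt: str = "json", include_history: bool = False) -> str:
    """Build an /account-limits URL (hypothetical helper; base_url is supplied by the caller)."""
    if fmt not in ("json", "table"):
        raise ValueError("fmt must be 'json' or 'table'")
    params = {"format": fmt}
    if include_history:
        # Only send the flag when history is wanted.
        params["includeHistory"] = "true"
    return f"{base_url.rstrip('/')}/account-limits?{urlencode(params)}"
```

For example, with an assumed local base URL, account_limits_url("http://localhost:8080", "table") yields a URL ending in /account-limits?format=table.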
Response (JSON)
Human-readable timestamp of the report.
Total number of accounts configured.
List of all model IDs across all accounts.
Server-wide quota threshold (0-0.99). Accounts whose remaining quota fraction is below this threshold are deprioritized.
Detailed account information:
- email (string): Account email
- status (string): "ok", "rate-limited", "invalid", or "banned"
- subscription (object): Subscription tier data
  - tier (string): "free", "pro", or "ultra"
  - projectId (string): Google Cloud project ID
  - detectedAt (number): Timestamp when tier was detected
- limits (object): Per-model quota limits
- quotaThreshold (number | undefined): Per-account quota threshold (overrides global)
- modelQuotaThresholds (object): Per-model quota thresholds
Example Response
Response (ASCII Table)
When ?format=table is used:
Monitoring Best Practices
Health Check Frequency
- Production: Poll /health every 30-60 seconds
- Development: Use the Web UI for real-time monitoring
Account Limits Polling
- Web UI: Polls /account-limits?includeHistory=true every ~30s (with jitter)
- CLI Monitoring: Use ?format=table for terminal-friendly output
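A custom poller can imitate the "~30s with jitter" behavior so that several clients do not hit the server in lockstep. The sketch below shows one way to compute the next delay; the 10% jitter fraction and the function name next_poll_delay are assumptions, since the Web UI's actual jitter range is not specified here.

```python
import random


def next_poll_delay(base_seconds: float = 30.0, jitter_fraction: float = 0.1, rng=random) -> float:
    """Return the base interval plus or minus a random jitter (assumed 10% by default)."""
    spread = base_seconds * jitter_fraction
    # uniform() picks a value in [-spread, +spread], so the delay stays near base_seconds.
    return base_seconds + rng.uniform(-spread, spread)
```

A polling loop would sleep for next_poll_delay() seconds between requests; passing a seeded random.Random instance as rng makes the delays reproducible in tests.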
Alerting
Set up alerts based on:
- No available accounts: counts.available === 0
- High invalid account ratio: counts.invalid / counts.total > 0.3
- Quota exhaustion: All accounts have remainingFraction < 0.1 for a model
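The three alert conditions can be evaluated against a parsed /health response roughly as follows. This is a sketch under assumptions: it takes the per-account list to be under a key named accounts (the key name is not confirmed on this page), treats a null remainingFraction as unknown rather than low, and only considers accounts that report the model in question.

```python
from typing import Dict, List, Optional


def evaluate_alerts(health: dict, invalid_ratio_limit: float = 0.3, low_quota: float = 0.1) -> List[str]:
    """Return alert messages for a parsed /health payload (hypothetical helper)."""
    alerts: List[str] = []
    counts = health.get("counts", {})
    total = counts.get("total", 0)

    if counts.get("available", 0) == 0:
        alerts.append("no available accounts")
    # Guard against division by zero when no accounts are configured.
    if total > 0 and counts.get("invalid", 0) / total > invalid_ratio_limit:
        alerts.append("high invalid account ratio")

    # Group remainingFraction values by model ID across accounts.
    # Assumption: the account list lives under the "accounts" key.
    per_model: Dict[str, List[Optional[float]]] = {}
    for account in health.get("accounts", []):
        for model_id, quota in (account.get("models") or {}).items():
            per_model.setdefault(model_id, []).append(quota.get("remainingFraction"))

    for model_id, fractions in sorted(per_model.items()):
        # Alert only when every reporting account is known to be below the limit.
        if all(f is not None and f < low_quota for f in fractions):
            alerts.append(f"quota nearly exhausted for {model_id}")
    return alerts
```

The returned list can be forwarded to whatever alerting channel is in use; an empty list means none of the three conditions fired.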
Quota Thresholds
The proxy supports three-tier quota protection:
- Global threshold: Configured via globalQuotaThreshold in config.json (0-0.99)
- Per-account threshold: Overrides global for specific accounts
- Per-model threshold: Highest priority, overrides both global and account-level
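The precedence order above (per-model, then per-account, then global) can be sketched as a small lookup. This is an illustration only, not the proxy's implementation: it assumes modelQuotaThresholds is an object keyed by model ID on the account entry, and the function name effective_threshold is hypothetical.

```python
def effective_threshold(global_threshold: float, account: dict, model_id: str) -> float:
    """Resolve the quota threshold for one account and model (illustrative sketch)."""
    # Assumption: modelQuotaThresholds maps model IDs to threshold values.
    model_thresholds = account.get("modelQuotaThresholds") or {}
    if model_thresholds.get(model_id) is not None:
        return model_thresholds[model_id]  # per-model: highest priority
    if account.get("quotaThreshold") is not None:
        return account["quotaThreshold"]  # per-account: overrides global
    return global_threshold  # fall back to globalQuotaThreshold
```

Checking for None rather than truthiness keeps an explicit threshold of 0 from silently falling through to the next tier.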