MonoRelay: Unified LLM API relay for multiple providers

MonoRelay is a self-hosted relay server that sits in front of your AI providers and exposes a single OpenAI-compatible (and Anthropic-compatible) endpoint. Point any existing client at MonoRelay and gain intelligent routing, automatic key rotation, request logging, and a full admin dashboard — without changing your application code.

Quickstart

Get MonoRelay running and send your first request in under five minutes.

Deployment guide

Deploy with Docker, Windows executable, systemd, PM2, or run from source.

Configuration

Configure providers, model routing, key selection, and logging.

API Reference

Explore the OpenAI-compatible and Anthropic-compatible API endpoints.

What MonoRelay does

MonoRelay accepts standard OpenAI or Anthropic API requests and intelligently forwards them to one of your configured providers. Your applications keep using the same API format — MonoRelay handles the complexity behind the scenes.

Multi-provider support

Connect OpenRouter, NVIDIA NIM, OpenAI, Anthropic, DeepSeek, Groq, and ChatGPT web reverse proxy behind one endpoint.

Smart model routing

Define model aliases, provider mapping patterns, complexity-based routing, and cascade fallback chains.

Key management

Rotate API keys with round-robin, random, or weighted strategies. Built-in cooldown on rate-limit errors.

Full streaming support

Native SSE streaming for both OpenAI and Anthropic formats with no buffering overhead.

Admin dashboard

Vue 3 SPA with request stats, provider health, live logs, user management, and config editor.

Request logging

SQLite-backed logging of every request and response with full content viewing and error details.

Get started

Install MonoRelay

Choose your deployment method — Docker is the fastest path to production, or run directly from source for development.

cp config.yml.example config.yml
docker compose up -d

Configure your providers

Edit config.yml to add your API keys and enable the providers you want to use.

config.yml

providers:
  openrouter:
    enabled: true
    base_url: "https://openrouter.ai/api/v1"
    keys:
      - key: "sk-or-v1-your-key-here"
        label: "main"

Point your client at MonoRelay

Update your existing OpenAI client to use MonoRelay’s base URL and your access key.

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8787/v1",
    api_key="your-access-key",
)

Open the admin dashboard

Navigate to http://localhost:8787 in your browser to view request stats, manage providers, and inspect logs.

MonoRelay requires Python 3.11 or later. Python 3.12 is recommended.

Get Started

Configuration

Admin Dashboard

Authentication

MonoRelay: Unified LLM API relay for multiple providers

Quickstart

Deployment guide

Configuration

API Reference

What MonoRelay does

Multi-provider support

Smart model routing

Key management

Full streaming support

Admin dashboard

Request logging

Get started

Build docs developers (and LLMs) love

Get Started

Configuration

Admin Dashboard

Authentication

Documentation Index

Quickstart

Deployment guide

Configuration

API Reference

​What MonoRelay does

Multi-provider support

Smart model routing

Key management

Full streaming support

Admin dashboard

Request logging

​Get started

Build docs developers (and LLMs) love

What MonoRelay does

Get started