By the end of this guide you will have MonoRelay running locally, connected to at least one LLM provider, and you will have sent a successful chat completion request through the relay. The whole process takes under five minutes.
1. Clone MonoRelay

Clone the repository from GitHub and change into the project directory.
git clone https://github.com/Excurs1ons/MonoRelay.git
cd MonoRelay
If you prefer not to use Git, download the latest source archive from the Releases page and extract it.
2. Install dependencies

Install the Python packages listed in requirements.txt. Using a virtual environment is recommended.
python3 -m venv .venv
source .venv/bin/activate   # Windows: .venv\Scripts\activate
pip install -r requirements.txt
Python 3.11 or later is required. Run python3 --version to check. Python 3.12 is recommended.
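If you would rather check the interpreter from inside Python, for example in a setup script, the version requirement above can be sketched as follows. The function name `check_python` and the three status strings are illustrative and not part of MonoRelay.

```python
import sys

# Versions stated in this guide: 3.11 is the minimum, 3.12 is recommended.
MIN_VERSION = (3, 11)
RECOMMENDED_VERSION = (3, 12)


def check_python(version=None):
    """Classify a (major, minor) tuple; defaults to the running interpreter."""
    major_minor = tuple((version or sys.version_info)[:2])
    if major_minor < MIN_VERSION:
        return "too-old"
    if major_minor < RECOMMENDED_VERSION:
        return "upgrade-suggested"
    return "ok"


if __name__ == "__main__":
    print(sys.version.split()[0], "->", check_python())
```

Run it with the same interpreter you used to create the virtual environment, so the result reflects the Python that will actually run the server.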
3. Configure config.yml

Copy the example configuration file and open it in your editor.
cp config.yml.example config.yml
At minimum, set your access_key and enable at least one provider. Here is a minimal configuration using OpenRouter:
config.yml
server:
  host: "0.0.0.0"   # listens on all network interfaces
  port: 8787
  access_key: "my-secret-relay-key"   # change this

providers:
  openrouter:
    enabled: true
    base_url: "https://openrouter.ai/api/v1"
    keys:
      - key: "sk-or-v1-your-openrouter-key"
        label: "main"
        weight: 1
Change access_key before exposing MonoRelay to any network. This key is the only credential protecting your relay endpoint.
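One way to pick a hard-to-guess value for access_key is to generate it with Python's standard `secrets` module. This is a minimal sketch; the helper name `generate_access_key` is hypothetical, and it assumes MonoRelay accepts any string as the key, which this guide does not state explicitly.

```python
import secrets


def generate_access_key(num_bytes: int = 32) -> str:
    """Return a URL-safe random string built from num_bytes of OS randomness."""
    return secrets.token_urlsafe(num_bytes)


if __name__ == "__main__":
    # Paste the printed value into server.access_key in config.yml.
    print(generate_access_key())
```

Clients then send the same value in the Authorization header as a Bearer token.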
4. Start the server

With the virtual environment from step 2 still active, start MonoRelay directly with Python. The server listens on port 8787 by default.
python -m backend.main
You should see output similar to:
INFO:     Started server process [12345]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:8787 (Press CTRL+C to quit)
To run the server in the background instead, use ./start-bg.sh start. See the deployment guide for all production options.
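When the server is started from a script, or in the background, it can help to wait until the port accepts connections before sending the first request. The sketch below polls the TCP port using only the standard library. The helper `wait_for_port` is hypothetical; it only confirms that something is listening on the port, not that MonoRelay has finished loading its configuration.

```python
import socket
import time


def wait_for_port(host: str = "127.0.0.1", port: int = 8787, timeout: float = 10.0) -> bool:
    """Poll until a TCP connection to host:port succeeds or timeout elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means a process is listening on the port.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            time.sleep(0.2)  # not listening yet; retry shortly
    return False


if __name__ == "__main__":
    print("relay reachable:", wait_for_port())
```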
5. Send your first API call

With the server running, send a chat completion request using curl. Replace my-secret-relay-key with the access_key you set in config.yml.
curl http://localhost:8787/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer my-secret-relay-key" \
  -d '{
    "model": "openai/gpt-4o-mini",
    "messages": [
      {"role": "user", "content": "Say hello in one sentence."}
    ]
  }'
A successful response looks like this:
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1716000000,
  "model": "openai/gpt-4o-mini",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! It's great to meet you today."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 14,
    "completion_tokens": 12,
    "total_tokens": 26
  }
}
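The same request can be sent from Python. The sketch below mirrors the curl command using only the standard library and reads the reply from the response shape shown above. The function names are illustrative, and the URL and model are the ones used in this guide; adjust them if your config.yml differs.

```python
import json
import urllib.request

RELAY_URL = "http://localhost:8787/v1/chat/completions"


def build_request(access_key: str, prompt: str,
                  model: str = "openai/gpt-4o-mini") -> urllib.request.Request:
    """Build the same POST request that the curl command sends."""
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        RELAY_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {access_key}",
        },
        method="POST",
    )


def extract_reply(response_body: dict) -> str:
    """Pull the assistant text out of a chat completion response."""
    return response_body["choices"][0]["message"]["content"]


def chat(access_key: str, prompt: str) -> str:
    """Send the request to a running relay and return the assistant's reply."""
    with urllib.request.urlopen(build_request(access_key, prompt), timeout=60) as resp:
        return extract_reply(json.load(resp))


if __name__ == "__main__":
    print(chat("my-secret-relay-key", "Say hello in one sentence."))
```

A wrong access_key or an unreachable provider surfaces as `urllib.error.HTTPError` or `urllib.error.URLError`, so wrap the `chat` call in a try/except if you need to handle those cases.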

Admin dashboard

Once the server is running, open http://localhost:8787 in your browser and log in with your access_key. The dashboard shows request counts, error rates, token usage, provider health, and a live request log.
The Swagger API explorer is available at http://localhost:8787/docs. It lists every endpoint and lets you test requests directly in the browser.
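To list the endpoints from a script instead of the browser, you can read the OpenAPI schema that backs the Swagger page. This sketch assumes the schema is served at /openapi.json, which is the default for FastAPI applications; this guide does not confirm that path for MonoRelay, so treat it as an assumption. The function names are illustrative.

```python
import json
import urllib.request


def list_endpoints(schema: dict) -> list[str]:
    """Return sorted 'METHOD /path' strings from an OpenAPI schema dict."""
    http_methods = {"get", "post", "put", "patch", "delete", "head", "options"}
    endpoints = []
    for path, operations in schema.get("paths", {}).items():
        for method in operations:
            # Path items may also hold non-method keys such as "parameters".
            if method.lower() in http_methods:
                endpoints.append(f"{method.upper()} {path}")
    return sorted(endpoints)


def fetch_schema(base_url: str = "http://localhost:8787") -> dict:
    # ASSUMPTION: the schema lives at /openapi.json (the FastAPI default).
    with urllib.request.urlopen(f"{base_url}/openapi.json", timeout=10) as resp:
        return json.load(resp)


if __name__ == "__main__":
    for endpoint in list_endpoints(fetch_schema()):
        print(endpoint)
```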

Next steps

Now that MonoRelay is running locally, read the deployment guide to choose the right method for production, including Docker, systemd, and PM2.
