By the end of this guide you will have MonoRelay running locally, connected to at least one LLM provider, and you will have sent a successful chat completion request through the relay. The whole process takes under five minutes.
Clone MonoRelay
Clone the repository from GitHub and change into the project directory.
git clone https://github.com/Excurs1ons/MonoRelay.git
cd MonoRelay
If you prefer not to use Git, download the latest source archive from the Releases page and extract it.
Install dependencies
Install the Python packages listed in requirements.txt. Using a virtual environment is recommended.
python3 -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install -r requirements.txt
Python 3.11 or later is required. Run python3 --version to check. Python 3.12 is recommended.
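The version requirement can also be checked from inside the interpreter you plan to use. The following is a minimal sketch; the helper name meets_minimum is hypothetical and not part of MonoRelay, and the minimum of 3.11 is taken from this guide.

```python
import sys

# Minimum version stated in this guide; adjust if the project's requirement changes.
MINIMUM = (3, 11)


def meets_minimum(version=None, minimum=MINIMUM):
    """Return True if a (major, minor, ...) version tuple satisfies the minimum."""
    if version is None:
        version = sys.version_info
    # Compare only major and minor, so 3.11.0 and 3.11.9 are treated alike.
    return tuple(version[:2]) >= tuple(minimum)


current = ".".join(str(part) for part in sys.version_info[:3])
status = "OK" if meets_minimum() else "too old for MonoRelay"
print(f"Python {current}: {status}")
```

Run it with the same interpreter that created the virtual environment, since a system-wide python3 may differ from the one inside .venv.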
Configure config.yml
Copy the example configuration file and open it in your editor.
cp config.yml.example config.yml
At minimum, set your access_key and enable at least one provider. Here is a minimal configuration using OpenRouter:
server:
  host: "0.0.0.0"
  port: 8787
  access_key: "my-secret-relay-key" # change this
providers:
  openrouter:
    enabled: true
    base_url: "https://openrouter.ai/api/v1"
    keys:
      - key: "sk-or-v1-your-openrouter-key"
        label: "main"
        weight: 1
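The two minimum requirements named above (an access_key and at least one enabled provider) can be sanity-checked before starting the server. This is a rough sketch, not MonoRelay's own validation: check_minimal_config is a hypothetical helper, it works on an already parsed configuration (for example a dict loaded with a YAML parser), and it assumes access_key sits under server, falling back to the top level.

```python
PLACEHOLDER_KEY = "my-secret-relay-key"  # the example value used in this guide


def check_minimal_config(config):
    """Return a list of problems; an empty list means the basics are in place."""
    problems = []

    server = config.get("server") or {}
    # Assumption: access_key is nested under `server`; also accept the top level.
    access_key = server.get("access_key") or config.get("access_key")
    if not access_key:
        problems.append("access_key is not set")
    elif access_key == PLACEHOLDER_KEY:
        problems.append("access_key still has the example value")

    providers = config.get("providers") or {}
    # A provider counts only if it is enabled and lists at least one key.
    usable = [
        name for name, provider in providers.items()
        if isinstance(provider, dict)
        and provider.get("enabled")
        and provider.get("keys")
    ]
    if not usable:
        problems.append("no provider is enabled with at least one key")

    return problems


# The minimal configuration from this guide, written as a Python dict.
example = {
    "server": {
        "host": "0.0.0.0",
        "port": 8787,
        "access_key": "my-secret-relay-key",
    },
    "providers": {
        "openrouter": {
            "enabled": True,
            "base_url": "https://openrouter.ai/api/v1",
            "keys": [
                {"key": "sk-or-v1-your-openrouter-key", "label": "main", "weight": 1}
            ],
        }
    },
}

for problem in check_minimal_config(example):
    print("config problem:", problem)
```

With the unmodified example values, the only problem reported is the placeholder access_key.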
Change access_key before exposing MonoRelay to any network. This key is the only credential protecting your relay endpoint.
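One way to produce a hard-to-guess replacement value is Python's standard secrets module. This sketch assumes MonoRelay accepts any string as access_key, which this guide does not state explicitly; generate_access_key is a hypothetical helper name.

```python
import secrets


def generate_access_key(num_bytes=32):
    """Return a random URL-safe string built from num_bytes of entropy."""
    return secrets.token_urlsafe(num_bytes)


# Paste the printed value into config.yml as access_key.
print(generate_access_key())
```

Treat the generated value like a password: keep it out of version control and out of shared shell history.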
Start the server
Run MonoRelay directly with Python. The server starts on port 8787 by default.
You should see output similar to:
INFO: Started server process [12345]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8787 (Press CTRL+C to quit)
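If you start the server from a script, you may want to wait until the port accepts connections before sending requests. The sketch below polls the TCP port with the standard socket module; wait_for_port is a hypothetical helper, and the default port 8787 is taken from this guide. It only confirms that something is listening, not that MonoRelay has finished loading its providers.

```python
import socket
import time


def wait_for_port(host="127.0.0.1", port=8787, timeout=10.0, interval=0.25):
    """Poll until a TCP connection to host:port succeeds or timeout elapses.

    Returns True once a connection is accepted, False if the deadline passes.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means a process is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

Calling wait_for_port() with no arguments checks 127.0.0.1:8787 for up to ten seconds.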
To run in the background immediately, use ./start-bg.sh start instead. See the deployment guide for all production options.
Send your first API call
With the server running, send a chat completion request using curl. Replace my-secret-relay-key with the access_key you set in config.yml.
curl http://localhost:8787/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer my-secret-relay-key" \
  -d '{
    "model": "openai/gpt-4o-mini",
    "messages": [
      {"role": "user", "content": "Say hello in one sentence."}
    ]
  }'
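The same request can be sent from Python using only the standard library. This is a sketch that mirrors the curl command above (same URL, headers, and body); build_chat_request and send_chat_request are hypothetical helper names, not part of MonoRelay.

```python
import json
import urllib.request


def build_chat_request(prompt, access_key, model="openai/gpt-4o-mini",
                       base_url="http://localhost:8787"):
    """Build a POST request equivalent to the curl command in this guide."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {access_key}",
        },
        method="POST",
    )


def send_chat_request(request, timeout=60):
    """Send the request and return the decoded JSON response as a dict."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, send_chat_request(build_chat_request("Say hello in one sentence.", "my-secret-relay-key")) performs the call; a wrong key or an unreachable server surfaces as urllib.error.HTTPError or urllib.error.URLError.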
A successful response looks like this:
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1716000000,
  "model": "openai/gpt-4o-mini",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! It's great to meet you today."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 14,
    "completion_tokens": 12,
    "total_tokens": 26
  }
}
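In application code you usually need only the assistant's text and the token count from a response of this shape. A minimal sketch, assuming the response has already been decoded into a dict; extract_reply is a hypothetical helper name.

```python
def extract_reply(response):
    """Return (assistant_text, total_tokens) from a chat completion dict."""
    choices = response.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    text = choices[0]["message"]["content"]
    # usage may be absent in some responses, so total_tokens can be None.
    usage = response.get("usage") or {}
    return text, usage.get("total_tokens")


# The example response from this guide, as a Python dict.
example_response = {
    "id": "chatcmpl-abc123",
    "object": "chat.completion",
    "created": 1716000000,
    "model": "openai/gpt-4o-mini",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "Hello! It's great to meet you today.",
            },
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 14, "completion_tokens": 12, "total_tokens": 26},
}

text, total_tokens = extract_reply(example_response)
print(text)
print("total tokens:", total_tokens)
```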
Admin dashboard
Once the server is running, open http://localhost:8787 in your browser and log in with your access_key. The dashboard shows request counts, error rates, token usage, provider health, and a live request log.
The Swagger API explorer is available at http://localhost:8787/docs. It lists every endpoint and lets you test requests directly in the browser.
Next steps
Now that MonoRelay is running locally, read the deployment guide to choose the right method for production, including Docker, systemd, and PM2.