
The /api/v1/chat/completions endpoint is the primary way to send messages to LMArena models through the bridge. It follows the OpenAI Chat Completions format, so any client that works with OpenAI’s API can be pointed at LMArena Bridge with minimal changes.
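For example, the official OpenAI Python client only needs its base_url overridden. A minimal sketch, assuming the bridge is reachable at http://localhost:8000 and a key named your_api_key (substitute your own values):

from openai import OpenAI

# Point the standard OpenAI client at the bridge instead of api.openai.com.
client = OpenAI(
    base_url="http://localhost:8000/api/v1",  # assumed local bridge address
    api_key="your_api_key",                   # key created in the dashboard
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)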

Authentication

Authorization
string
required
Bearer token. Pass the API key you created in the dashboard: Authorization: Bearer <api_key>. If no API keys are configured on the bridge, pass an empty string or omit the header.

Request body

model
string
required
The public model ID to use. Retrieve the current list of available IDs from GET /api/v1/models.
messages
object[]
required
Array of message objects forming the conversation. Each object must have a role ("system", "user", or "assistant") and a content field.
stream
boolean
default: false
When true, the response is returned as a Server-Sent Events stream. Each event carries a JSON delta; the stream ends with data: [DONE].
temperature
number
Sampling temperature. Passed through to LMArena when provided.
max_tokens
number
Maximum number of tokens to generate. Passed through to LMArena when provided.

Multi-turn conversations

The bridge maintains a chat session for each API key. When you send a sequence of messages that begins with the same first user message, the bridge routes follow-up turns to the same LMArena session automatically — you do not need to track a session ID yourself. The conversation key is derived from your API key, the model name, and the first user message.
If you change the model or the first user message, a new LMArena session is started. The bridge keeps all sessions in memory; restarting the server clears them.
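As a sketch of what this looks like in practice (reusing the client from the example above), a follow-up turn resends the full history; because the first user message is unchanged, the bridge reuses the existing LMArena session:

# Follow-up turn: the first user message matches the earlier request,
# so the bridge routes it to the same LMArena session automatically.
follow_up = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "user", "content": "Explain the difference between TCP and UDP."},
        {"role": "assistant", "content": "TCP provides reliable, ordered delivery..."},
        {"role": "user", "content": "Which of the two does DNS typically use?"},
    ],
)
print(follow_up.choices[0].message.content)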

Examples

curl -X POST http://localhost:8000/api/v1/chat/completions \
  -H "Authorization: Bearer your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {"role": "user", "content": "Explain the difference between TCP and UDP."}
    ]
  }'
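The equivalent request in Python with the requests library, including the optional temperature and max_tokens fields (the values shown are illustrative):

import requests

resp = requests.post(
    "http://localhost:8000/api/v1/chat/completions",
    headers={
        "Authorization": "Bearer your_api_key",
        "Content-Type": "application/json",
    },
    json={
        "model": "gpt-4o",
        "messages": [
            {"role": "user", "content": "Explain the difference between TCP and UDP."}
        ],
        "temperature": 0.7,   # optional, passed through to LMArena
        "max_tokens": 256,    # optional, passed through to LMArena
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])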

Responses

Non-streaming response

200
{
  "id": "chatcmpl-a1b2c3d4e5f6",
  "object": "chat.completion",
  "created": 1716000000,
  "model": "gpt-4o",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "TCP provides reliable, ordered delivery with error checking, while UDP is connectionless and prioritises low latency over reliability."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 18,
    "completion_tokens": 32,
    "total_tokens": 50
  }
}

Streaming response

Each SSE event has the form data: <json>\n\n. The final event is data: [DONE].
SSE stream
: keep-alive

data: {"id":"chatcmpl-a1b2c3d4e5f6","object":"chat.completion.chunk","created":1716000000,"model":"gpt-4o","choices":[{"index":0,"delta":{"role":"assistant","content":""},"finish_reason":null}]}

data: {"id":"chatcmpl-a1b2c3d4e5f6","object":"chat.completion.chunk","created":1716000000,"model":"gpt-4o","choices":[{"index":0,"delta":{"content":"TCP"},"finish_reason":null}]}

data: {"id":"chatcmpl-a1b2c3d4e5f6","object":"chat.completion.chunk","created":1716000000,"model":"gpt-4o","choices":[{"index":0,"delta":{"content":" provides"},"finish_reason":null}]}

data: [DONE]
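A client reads the stream line by line, strips the data: prefix, and stops at [DONE]. A sketch with requests, assuming the same endpoint and key as the examples above:

import json
import requests

with requests.post(
    "http://localhost:8000/api/v1/chat/completions",
    headers={"Authorization": "Bearer your_api_key"},
    json={
        "model": "gpt-4o",
        "messages": [{"role": "user", "content": "Explain TCP vs UDP."}],
        "stream": True,
    },
    stream=True,
    timeout=120,
) as resp:
    for line in resp.iter_lines(decode_unicode=True):
        if not line or not line.startswith("data: "):
            continue  # skips blank separators and ": keep-alive" comments
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break
        delta = json.loads(payload)["choices"][0]["delta"]
        print(delta.get("content", ""), end="", flush=True)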

Response fields

id
string
Unique identifier for this completion.
object
string
Always "chat.completion" for non-streaming responses.
model
string
The model that generated the response.
choices
object[]
Array of completion choices. Each choice contains an index, a message object with role and content, and a finish_reason.
usage
object
Token usage counts: prompt_tokens, completion_tokens, total_tokens.
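A small helper that pulls the interesting fields out of a parsed response body (the dict shape shown above) might look like this sketch:

def summarize_completion(body: dict) -> str:
    """Extract the assistant text, finish reason, and token usage
    from a chat.completion response body."""
    choice = body["choices"][0]
    usage = body["usage"]
    return (
        f"{choice['message']['content']}\n"
        f"finish_reason={choice['finish_reason']}, "
        f"total_tokens={usage['total_tokens']}"
    )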

Error responses

Status  Meaning
400     Invalid JSON, missing model or messages, empty messages array, or prompt exceeds the ~113,000 character limit.
401     The Authorization header is missing or the API key is invalid.
403     Attempted to use a stealth model (one without a public organisation).
404     The requested model was not found. Check GET /api/v1/models for valid IDs.
429     Rate limit exceeded for your API key.
503     LMArena is unavailable, the model list could not be fetched, or the bridge failed to acquire a reCAPTCHA token.
401
{
  "detail": "Invalid API key"
}
400
{
  "detail": "Prompt too long (120000 characters). LMArena has a character limit of approximately 113567 characters. Please reduce the message size."
}
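A client can branch on these codes, for example backing off on 429 and surfacing the detail message for other failures. A sketch with requests (retry counts and delays are illustrative):

import time
import requests

def post_with_retry(url: str, headers: dict, body: dict, attempts: int = 3) -> dict:
    for attempt in range(attempts):
        resp = requests.post(url, headers=headers, json=body, timeout=120)
        if resp.status_code == 429:
            time.sleep(2 ** attempt)  # exponential backoff on rate limits
            continue
        if resp.status_code >= 400:
            # Error bodies carry a human-readable "detail" field.
            raise RuntimeError(f"{resp.status_code}: {resp.json().get('detail')}")
        return resp.json()
    raise RuntimeError("rate limit persisted after retries")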
