
List Models

Retrieve available models in a room (online participants).
curl http://localhost:3000/rooms/ABC123/v1/models

Endpoint

GET /rooms/:code/v1/models

Path Parameters

code (string, required): Room code

Response

Status: 200 OK

OpenAI-compatible models list with Gambiarra extensions.

object (string): Always "list"
data (array): Array of model objects

Example Response

{
  "object": "list",
  "data": [
    {
      "id": "550e8400-e29b-41d4-a716-446655440000",
      "object": "model",
      "created": 1709409600,
      "owned_by": "Alice",
      "gambiarra": {
        "nickname": "Alice",
        "model": "llama3.2:3b",
        "endpoint": "http://localhost:11434"
      }
    },
    {
      "id": "7c9e6679-7425-40de-944b-e07fc1f90ae7",
      "object": "model",
      "created": 1709409700,
      "owned_by": "Bob",
      "gambiarra": {
        "nickname": "Bob",
        "model": "gpt-4o",
        "endpoint": "http://192.168.1.100:8080"
      }
    }
  ]
}
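The `gambiarra` extension block lets a client map advertised model names back to participant IDs without extra requests. A minimal sketch, assuming a response shaped exactly like the example above (the `models_response` dict below mirrors it):

```python
# Index a /v1/models response by advertised model name.
# Field names mirror the example response above.
models_response = {
    "object": "list",
    "data": [
        {
            "id": "550e8400-e29b-41d4-a716-446655440000",
            "object": "model",
            "owned_by": "Alice",
            "gambiarra": {"nickname": "Alice", "model": "llama3.2:3b"},
        },
        {
            "id": "7c9e6679-7425-40de-944b-e07fc1f90ae7",
            "object": "model",
            "owned_by": "Bob",
            "gambiarra": {"nickname": "Bob", "model": "gpt-4o"},
        },
    ],
}

def index_by_model(resp: dict) -> dict:
    """Map each advertised model name to the first participant ID serving it."""
    index = {}
    for entry in resp["data"]:
        # setdefault keeps the first participant when several serve the same model
        index.setdefault(entry["gambiarra"]["model"], entry["id"])
    return index
```

The resulting map can feed the `model` field of a chat completion request directly (participant-ID routing).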

Error Responses

{
  "error": "Room not found"
}

Chat Completions

Proxy OpenAI-compatible chat completion requests to room participants.
curl -X POST http://localhost:3000/rooms/ABC123/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "550e8400-e29b-41d4-a716-446655440000",
    "messages": [
      {"role": "user", "content": "Hello!"}
    ]
  }'

Endpoint

POST /rooms/:code/v1/chat/completions

Path Parameters

code (string, required): Room code

Request Body

model (string, required): Model routing specification:
  • Participant ID: Route to a specific participant (e.g., "550e8400-e29b-41d4-a716-446655440000")
  • Model name prefix: Route to the first online participant with a matching model (e.g., "model:llama3.2:3b")
  • Wildcard: Route to a random online participant (use "*" or "any")
messages (array, required): Array of message objects (OpenAI format)
stream (boolean): Enable streaming responses (default: false)
temperature (number): Sampling temperature (0-2)
top_p (number): Top-p sampling (0-1)
max_tokens (number): Maximum number of tokens to generate
stop (string | array): Stop sequences
frequency_penalty (number): Frequency penalty (-2 to 2)
presence_penalty (number): Presence penalty (-2 to 2)
seed (number): Random seed for reproducibility
Additional provider-specific parameters are passed through to the participant endpoint.
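Putting the fields together, a request body is a plain JSON object; only `model` and `messages` are required. A sketch with illustrative values (the sampling settings are examples, not defaults):

```python
import json

# Illustrative chat completion request body.
# Only `model` and `messages` are required; the rest are optional tuning knobs.
payload = {
    "model": "model:llama3.2:3b",  # route by model-name prefix
    "messages": [{"role": "user", "content": "Hello!"}],
    "temperature": 0.7,            # optional: sampling temperature (0-2)
    "max_tokens": 128,             # optional: cap on generated tokens
}

# Serialize and send as the POST body with Content-Type: application/json
body = json.dumps(payload)
```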

Response (Non-Streaming)

Status: 200 OK

OpenAI-compatible chat completion response from the participant.

id (string): Completion ID
object (string): Always "chat.completion"
created (number): Unix timestamp (seconds)
model (string): Model used by the participant
choices (array): Array of completion choices
usage (object): Token usage statistics

Example Response (Non-Streaming)

{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1709409600,
  "model": "llama3.2:3b",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 20,
    "total_tokens": 30
  }
}
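Consuming a non-streaming response is plain field access. A sketch against a dict shaped like the example above:

```python
# Response dict mirrors the non-streaming example above.
response = {
    "id": "chatcmpl-123",
    "object": "chat.completion",
    "created": 1709409600,
    "model": "llama3.2:3b",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Hello! How can I help you today?"},
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 10, "completion_tokens": 20, "total_tokens": 30},
}

# The assistant reply lives in the first choice; usage totals sit alongside.
reply = response["choices"][0]["message"]["content"]
total_tokens = response["usage"]["total_tokens"]
```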

Response (Streaming)

Status: 200 OK
Content-Type: text/event-stream

Server-Sent Events stream with OpenAI-compatible chat completion chunks.
data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1709409600,"model":"llama3.2:3b","choices":[{"index":0,"delta":{"role":"assistant","content":"Hello"},"finish_reason":null}]}

data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1709409600,"model":"llama3.2:3b","choices":[{"index":0,"delta":{"content":"!"},"finish_reason":null}]}

data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1709409600,"model":"llama3.2:3b","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}

data: [DONE]
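Each `data:` line carries one JSON chunk; the full reply is reconstructed by concatenating `choices[0].delta.content` until the `[DONE]` sentinel. A minimal parser over lines shaped like the stream above (keep-alive blanks and real network I/O omitted):

```python
import json

# Raw SSE lines as they would arrive over the wire (abbreviated chunks).
sse_lines = [
    'data: {"choices":[{"index":0,"delta":{"role":"assistant","content":"Hello"},"finish_reason":null}]}',
    'data: {"choices":[{"index":0,"delta":{"content":"!"},"finish_reason":null}]}',
    'data: {"choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}',
    "data: [DONE]",
]

def accumulate(lines) -> str:
    """Concatenate delta content from an OpenAI-style chunk stream."""
    parts = []
    for line in lines:
        if not line.startswith("data: "):
            continue  # ignore comments / keep-alives
        data = line[len("data: "):]
        if data == "[DONE]":
            break  # end-of-stream sentinel, not JSON
        chunk = json.loads(data)
        # The final chunk has an empty delta, hence the .get() default.
        parts.append(chunk["choices"][0]["delta"].get("content", ""))
    return "".join(parts)
```

Note the `[DONE]` line must be special-cased before `json.loads`, since it is a sentinel string rather than a JSON document.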

Error Responses

{
  "error": "Room not found"
}
{
  "error": "No available participant for the requested model"
}
{
  "error": "Participant is offline"
}
{
  "error": "Failed to proxy request: Connection refused"
}

Model Routing Logic

The hub routes requests based on the model field:

1. Wildcard ("*" or "any")

Selects a random online participant from the room.
{"model": "*"}
Use case: Load balancing or when you don’t care which participant handles the request.

2. Model Name Prefix ("model:<name>")

Routes to the first online participant with a matching model name.
{"model": "model:llama3.2:3b"}
Use case: Target specific model capabilities without knowing participant IDs.

3. Participant ID (direct)

Routes to a specific participant by ID.
{"model": "550e8400-e29b-41d4-a716-446655440000"}
Use case: Sticky routing, debugging, or targeting a specific machine.

4. Fallback: Model Name (without prefix)

If the value doesn’t match a participant ID, the hub treats it as a model name.
{"model": "llama3.2:3b"}
Use case: Convenience when model names are unique in the room.
Model routing only considers participants with status "online". Offline or busy participants are skipped.
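The four rules above can be sketched as a selection function over the room's participant list. This is a hedged reconstruction, not the hub's actual code: the `status` field follows the note above, and the record shape is assumed from the models response.

```python
import random

def route(model_field: str, participants: list, rng=random):
    """Pick a participant per the routing rules; None if no match is online."""
    online = [p for p in participants if p["status"] == "online"]
    if model_field in ("*", "any"):                       # 1. wildcard
        return rng.choice(online) if online else None
    if model_field.startswith("model:"):                  # 2. model-name prefix
        name = model_field[len("model:"):]
        return next((p for p in online if p["model"] == name), None)
    for p in online:                                      # 3. participant ID
        if p["id"] == model_field:
            return p
    # 4. fallback: treat the value as a bare model name
    return next((p for p in online if p["model"] == model_field), None)
```

Offline participants never match under any rule, which is why a valid participant ID can still yield a "Participant is offline" style failure.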

Streaming Behavior

When stream: true:
  1. Hub proxies the streaming response from the participant
  2. Response uses Content-Type: text/event-stream
  3. Stream is passed through without modification
  4. Connection is kept alive until completion or error
SSE events are NOT broadcast for streaming chunks. Use the /rooms/:code/events endpoint to monitor llm:request and llm:complete events.
