
List Models

Retrieve available models in a room (online participants).
curl http://localhost:3000/rooms/ABC123/v1/models

Endpoint

GET /rooms/:code/v1/models

Path Parameters

code (string, required): Room code

Response

Status: 200 OK

OpenAI-compatible models list with Gambiarra extensions.

object (string): Always "list"
data (array): Array of model objects

Example Response

{
  "object": "list",
  "data": [
    {
      "id": "550e8400-e29b-41d4-a716-446655440000",
      "object": "model",
      "created": 1709409600,
      "owned_by": "Alice",
      "gambiarra": {
        "nickname": "Alice",
        "model": "llama3.2:3b",
        "endpoint": "http://localhost:11434"
      }
    },
    {
      "id": "7c9e6679-7425-40de-944b-e07fc1f90ae7",
      "object": "model",
      "created": 1709409700,
      "owned_by": "Bob",
      "gambiarra": {
        "nickname": "Bob",
        "model": "gpt-4o",
        "endpoint": "http://192.168.1.100:8080"
      }
    }
  ]
}
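The `gambiarra` extension block lets a client map advertised model names back to participant IDs without extra requests. A minimal sketch, assuming a response shaped exactly like the example above (the `models_response` dict below mirrors it):

```python
# Index a /v1/models response by advertised model name.
# Field names mirror the example response above.
models_response = {
    "object": "list",
    "data": [
        {
            "id": "550e8400-e29b-41d4-a716-446655440000",
            "object": "model",
            "owned_by": "Alice",
            "gambiarra": {"nickname": "Alice", "model": "llama3.2:3b"},
        },
        {
            "id": "7c9e6679-7425-40de-944b-e07fc1f90ae7",
            "object": "model",
            "owned_by": "Bob",
            "gambiarra": {"nickname": "Bob", "model": "gpt-4o"},
        },
    ],
}

def index_by_model(resp: dict) -> dict:
    """Map each advertised model name to the first participant ID serving it."""
    index = {}
    for entry in resp["data"]:
        # setdefault keeps the first participant when several serve the same model
        index.setdefault(entry["gambiarra"]["model"], entry["id"])
    return index
```

The resulting map can feed the `model` field of a chat completion request directly (participant-ID routing).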

Error Responses

{
  "error": "Room not found"
}

Chat Completions

Proxy OpenAI-compatible chat completion requests to room participants.
curl -X POST http://localhost:3000/rooms/ABC123/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "550e8400-e29b-41d4-a716-446655440000",
    "messages": [
      {"role": "user", "content": "Hello!"}
    ]
  }'

Endpoint

POST /rooms/:code/v1/chat/completions

Path Parameters

code (string, required): Room code

Request Body

model (string, required): Model routing specification:
  • Participant ID: Route to a specific participant (e.g., "550e8400-e29b-41d4-a716-446655440000")
  • Model name prefix: Route to the first online participant with a matching model (e.g., "model:llama3.2:3b")
  • Wildcard: Route to a random online participant (use "*" or "any")
messages (array, required): Array of message objects (OpenAI format)
stream (boolean): Enable streaming responses (default: false)
temperature (number): Sampling temperature (0-2)
top_p (number): Top-p sampling (0-1)
max_tokens (number): Maximum number of tokens to generate
stop (string | array): Stop sequences
frequency_penalty (number): Frequency penalty (-2 to 2)
presence_penalty (number): Presence penalty (-2 to 2)
seed (number): Random seed for reproducibility
Additional provider-specific parameters are passed through to the participant endpoint.
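Putting the fields together, a request body is a plain JSON object; only `model` and `messages` are required. A sketch with illustrative values (the sampling settings are examples, not defaults):

```python
import json

# Illustrative chat completion request body.
# Only `model` and `messages` are required; the rest are optional tuning knobs.
payload = {
    "model": "model:llama3.2:3b",  # route by model-name prefix
    "messages": [{"role": "user", "content": "Hello!"}],
    "temperature": 0.7,            # optional: sampling temperature (0-2)
    "max_tokens": 128,             # optional: cap on generated tokens
}

# Serialize and send as the POST body with Content-Type: application/json
body = json.dumps(payload)
```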

Response (Non-Streaming)

Status: 200 OK

OpenAI-compatible chat completion response from the participant.

id (string): Completion ID
object (string): Always "chat.completion"
created (number): Unix timestamp (seconds)
model (string): Model used by the participant
choices (array): Array of completion choices
usage (object): Token usage statistics

Example Response (Non-Streaming)

{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1709409600,
  "model": "llama3.2:3b",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 20,
    "total_tokens": 30
  }
}
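Consuming a non-streaming response is plain field access. A sketch against a dict shaped like the example above:

```python
# Response dict mirrors the non-streaming example above.
response = {
    "id": "chatcmpl-123",
    "object": "chat.completion",
    "created": 1709409600,
    "model": "llama3.2:3b",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Hello! How can I help you today?"},
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 10, "completion_tokens": 20, "total_tokens": 30},
}

# The assistant reply lives in the first choice; usage totals sit alongside.
reply = response["choices"][0]["message"]["content"]
total_tokens = response["usage"]["total_tokens"]
```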

Response (Streaming)

Status: 200 OK
Content-Type: text/event-stream

Server-Sent Events stream with OpenAI-compatible chat completion chunks.
data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1709409600,"model":"llama3.2:3b","choices":[{"index":0,"delta":{"role":"assistant","content":"Hello"},"finish_reason":null}]}

data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1709409600,"model":"llama3.2:3b","choices":[{"index":0,"delta":{"content":"!"},"finish_reason":null}]}

data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1709409600,"model":"llama3.2:3b","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}

data: [DONE]
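Each `data:` line carries one JSON chunk; the full reply is reconstructed by concatenating `choices[0].delta.content` until the `[DONE]` sentinel. A minimal parser over lines shaped like the stream above (keep-alive blanks and real network I/O omitted):

```python
import json

# Raw SSE lines as they would arrive over the wire (abbreviated chunks).
sse_lines = [
    'data: {"choices":[{"index":0,"delta":{"role":"assistant","content":"Hello"},"finish_reason":null}]}',
    'data: {"choices":[{"index":0,"delta":{"content":"!"},"finish_reason":null}]}',
    'data: {"choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}',
    "data: [DONE]",
]

def accumulate(lines) -> str:
    """Concatenate delta content from an OpenAI-style chunk stream."""
    parts = []
    for line in lines:
        if not line.startswith("data: "):
            continue  # ignore comments / keep-alives
        data = line[len("data: "):]
        if data == "[DONE]":
            break  # end-of-stream sentinel, not JSON
        chunk = json.loads(data)
        # The final chunk has an empty delta, hence the .get() default.
        parts.append(chunk["choices"][0]["delta"].get("content", ""))
    return "".join(parts)
```

Note the `[DONE]` line must be special-cased before `json.loads`, since it is a sentinel string rather than a JSON document.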

Error Responses

{
  "error": "Room not found"
}
{
  "error": "No available participant for the requested model"
}
{
  "error": "Participant is offline"
}
{
  "error": "Failed to proxy request: Connection refused"
}

Model Routing Logic

The hub routes requests based on the model field:

1. Wildcard ("*" or "any")

Selects a random online participant from the room.
{"model": "*"}
Use case: Load balancing or when you don’t care which participant handles the request.

2. Model Name Prefix ("model:<name>")

Routes to the first online participant with a matching model name.
{"model": "model:llama3.2:3b"}
Use case: Target specific model capabilities without knowing participant IDs.

3. Participant ID (direct)

Routes to a specific participant by ID.
{"model": "550e8400-e29b-41d4-a716-446655440000"}
Use case: Sticky routing, debugging, or targeting a specific machine.

4. Fallback: Model Name (without prefix)

If the value doesn’t match a participant ID, the hub treats it as a model name.
{"model": "llama3.2:3b"}
Use case: Convenience when model names are unique in the room.
Model routing only considers participants with status "online". Offline or busy participants are skipped.
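The four rules above can be sketched as a selection function over the room's participant list. This is a hedged reconstruction, not the hub's actual code: the `status` field follows the note above, and the record shape is assumed from the models response.

```python
import random

def route(model_field: str, participants: list, rng=random):
    """Pick a participant per the routing rules; None if no match is online."""
    online = [p for p in participants if p["status"] == "online"]
    if model_field in ("*", "any"):                       # 1. wildcard
        return rng.choice(online) if online else None
    if model_field.startswith("model:"):                  # 2. model-name prefix
        name = model_field[len("model:"):]
        return next((p for p in online if p["model"] == name), None)
    for p in online:                                      # 3. participant ID
        if p["id"] == model_field:
            return p
    # 4. fallback: treat the value as a bare model name
    return next((p for p in online if p["model"] == model_field), None)
```

Offline participants never match under any rule, which is why a valid participant ID can still yield a "Participant is offline" style failure.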

Streaming Behavior

When stream: true:
  1. Hub proxies the streaming response from the participant
  2. Response uses Content-Type: text/event-stream
  3. Stream is passed through without modification
  4. Connection is kept alive until completion or error
SSE events are NOT broadcast for streaming chunks. Use the /rooms/:code/events endpoint to monitor llm:request and llm:complete events.
