Overview
Gambiarra’s model routing system determines which participant’s LLM handles each request. The hub supports three routing strategies, specified via the model field in chat completion requests:
| Strategy | Pattern | Behavior |
|----------|---------|----------|
| Any | `*` or `any` | Random online participant |
| Model name | `model:<name>` | First online participant with that model |
| Participant ID | `<participant-id>` | Specific participant by UUID |
Routing Strategies
1. Any Available Participant
Use * or any to route to a random online participant. This is useful for load balancing or when you don’t care which model handles the request.
```typescript
import { createGambiarra } from "gambiarra-sdk";
import { generateText } from "ai";

const gambiarra = createGambiarra({ roomCode: "ABC123" });

const result = await generateText({
  model: gambiarra.any(),
  prompt: "Tell me a joke"
});
```
How it works:

1. The hub fetches all participants in the room
2. Filters to participants with `status: "online"`
3. Selects one at random using `Math.random()`
4. Proxies the request to that participant's endpoint

Implementation: packages/core/src/room.ts:167-183
```typescript
function getRandomOnlineParticipant(
  roomId: string
): ParticipantInfo | undefined {
  const room = rooms.get(roomId);
  if (!room) return undefined;

  const online = Array.from(room.participants.values()).filter(
    (p) => p.status === "online"
  );

  if (online.length === 0) return undefined;
  return online[Math.floor(Math.random() * online.length)];
}
```
Use gambiarra.any() when you want to distribute load across all available participants.
2. Model Name Routing
Use model:<name> to route to the first online participant with a specific model name.
```typescript
const result = await generateText({
  model: gambiarra.model("llama3"), // Routes to "model:llama3"
  prompt: "Explain quantum computing"
});
```
How it works:

1. The SDK prepends `model:` to the name (e.g., `"llama3"` → `"model:llama3"`)
2. The hub extracts the model name by slicing off the prefix
3. It iterates through participants to find the first match with `participant.model === "llama3"`
4. Only participants with `status: "online"` are considered

Implementation: packages/core/src/room.ts:150-165
```typescript
function findParticipantByModel(
  roomId: string,
  model: string
): ParticipantInfo | undefined {
  const room = rooms.get(roomId);
  if (!room) return undefined;

  for (const participant of room.participants.values()) {
    if (participant.model === model && participant.status === "online") {
      return participant;
    }
  }

  return undefined;
}
```
SDK implementation: packages/sdk/src/provider.ts:98

```typescript
model: (name: string) => createProvider(`model:${name}`)
```
The model name must exactly match the model field provided during participant registration.
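The prefix handling is a simple round trip: the SDK adds `model:` and the hub strips it again. A minimal sketch of that round trip, with hypothetical helper names (the real logic lives in the SDK and hub snippets above):

```typescript
// Hypothetical helpers mirroring the SDK/hub snippets above:
// the SDK prepends "model:", the hub slices it back off.
const toModelId = (name: string): string => `model:${name}`;

const fromModelId = (id: string): string =>
  id.startsWith("model:") ? id.slice("model:".length) : id;

console.log(toModelId("llama3"));         // → model:llama3
console.log(fromModelId("model:llama3")); // → llama3
```

Note that the hub compares model names with `===`, so matching is an exact, case-sensitive string comparison: a request for `"Llama3"` will not reach a participant registered as `"llama3"`.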
3. Participant ID Routing
Use a specific participant UUID to always route to that participant.
```typescript
// Get the list of participants
const participants = await gambiarra.listParticipants();

// Route to a specific participant
const result = await generateText({
  model: gambiarra.participant(participants[0].id),
  prompt: "What's your GPU?"
});
```
How it works:

1. The hub first tries to find a participant by ID
2. If found, that participant is used
3. If not found, the value is treated as a model name as a fallback

Implementation: packages/core/src/hub.ts:226-247
```typescript
function findParticipant(
  roomId: string,
  modelId: string
): ParticipantInfo | undefined {
  if (modelId === "*" || modelId === "any") {
    return Room.getRandomOnlineParticipant(roomId);
  }

  if (modelId.startsWith("model:")) {
    const actualModel = modelId.slice(6);
    return Room.findParticipantByModel(roomId, actualModel);
  }

  // Try as participant ID first
  const participant = Room.getParticipant(roomId, modelId);
  if (participant) {
    return participant;
  }

  // Fallback: try as model name
  return Room.findParticipantByModel(roomId, modelId);
}
```
If the participant is offline or doesn’t exist, the request will fail with a 404 or 503 error.
Routing Examples
Example 1: Load Balancing
Distribute requests across multiple participants:
```typescript
const gambiarra = createGambiarra({ roomCode: "ABC123" });

// All requests go to random online participants
const promises = Array.from({ length: 10 }, (_, i) =>
  generateText({
    model: gambiarra.any(),
    prompt: `Question ${i + 1}`
  })
);

const results = await Promise.all(promises);
```
Example 2: Model-Specific Routing
Route based on task requirements:
```typescript
// Use a fast model for simple tasks
const summary = await generateText({
  model: gambiarra.model("llama3"),
  prompt: "Summarize: ...long text..."
});

// Use a more capable model for complex reasoning
const analysis = await generateText({
  model: gambiarra.model("gpt-4"),
  prompt: "Analyze: ...complex data..."
});
```
Example 3: Sticky Sessions
Keep a conversation with the same participant:
```typescript
const participants = await gambiarra.listParticipants();
const selectedParticipant = participants[0].id;

const messages = [];
for (const userMessage of conversation) {
  messages.push({ role: "user", content: userMessage });

  const result = await generateText({
    model: gambiarra.participant(selectedParticipant),
    messages
  });

  messages.push({ role: "assistant", content: result.text });
}
```
Example 4: Fallback Strategy
Try specific model, fall back to any:
```typescript
async function generateWithFallback(prompt: string) {
  try {
    // Try the preferred model first
    return await generateText({
      model: gambiarra.model("llama3"),
      prompt
    });
  } catch (error) {
    // Fall back to any available participant
    return await generateText({
      model: gambiarra.any(),
      prompt
    });
  }
}
```
Participant Status
Routing only considers participants with status: "online":
```typescript
type ParticipantStatus = "online" | "busy" | "offline";
```
| Status | Description | Routable? |
|--------|-------------|-----------|
| `online` | Participant is healthy and available | ✅ Yes |
| `busy` | Participant is processing a request | ❌ No |
| `offline` | Participant hasn't sent a health check in 30s | ❌ No |
The busy status is not currently implemented but reserved for future load management.
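Under this rule, routability reduces to a status filter. A minimal sketch, with the participant type trimmed to the fields that matter here:

```typescript
type ParticipantStatus = "online" | "busy" | "offline";

interface Participant {
  id: string;
  status: ParticipantStatus;
}

// Only "online" participants are candidates for any routing strategy.
function routable(participants: Participant[]): Participant[] {
  return participants.filter((p) => p.status === "online");
}

const pool: Participant[] = [
  { id: "a", status: "online" },
  { id: "b", status: "busy" },
  { id: "c", status: "offline" }
];

console.log(routable(pool).map((p) => p.id)); // only "a" remains
```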
Listing Available Models
To see what models are available in a room:
Using SDK
```typescript
const gambiarra = createGambiarra({ roomCode: "ABC123" });

// Get all participants
const participants = await gambiarra.listParticipants();
participants.forEach(p => {
  console.log(`${p.nickname}: ${p.model} (${p.status})`);
});

// Get an OpenAI-compatible model list
const models = await gambiarra.listModels();
models.forEach(m => {
  console.log(`${m.id}: ${m.model} by ${m.nickname}`);
});
```
Using HTTP
```bash
# List participants
curl http://localhost:3000/rooms/ABC123/participants

# List models (OpenAI-compatible)
curl http://localhost:3000/rooms/ABC123/v1/models
```
Response format (`/v1/models`):
```json
{
  "object": "list",
  "data": [
    {
      "id": "participant-uuid",
      "object": "model",
      "created": 1234567890,
      "owned_by": "Alice",
      "gambiarra": {
        "nickname": "Alice",
        "model": "llama3",
        "endpoint": "http://192.168.1.50:11434"
      }
    }
  ]
}
```
Implementation: packages/core/src/hub.ts:161-186
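Since the payload follows the OpenAI list shape, the Gambiarra extras can be pulled out with a small client-side parser. A sketch assuming the exact response format shown above (the helper name is hypothetical):

```typescript
// Types mirroring the /v1/models response format shown above.
interface GambiarraModelEntry {
  id: string;
  object: "model";
  created: number;
  owned_by: string;
  gambiarra: { nickname: string; model: string; endpoint: string };
}

interface ModelListResponse {
  object: "list";
  data: GambiarraModelEntry[];
}

// Summarize a /v1/models response as "<model> by <nickname>" strings.
function describeModels(response: ModelListResponse): string[] {
  return response.data.map(
    (m) => `${m.gambiarra.model} by ${m.gambiarra.nickname}`
  );
}

const response: ModelListResponse = {
  object: "list",
  data: [
    {
      id: "participant-uuid",
      object: "model",
      created: 1234567890,
      owned_by: "Alice",
      gambiarra: {
        nickname: "Alice",
        model: "llama3",
        endpoint: "http://192.168.1.50:11434"
      }
    }
  ]
};

console.log(describeModels(response)); // one entry: "llama3 by Alice"
```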
Error Handling
Routing can fail for several reasons:
404: No Available Participant
```json
{
  "error": "No available participant for the requested model"
}
```
Causes:

- No participants in the room
- The requested model doesn't exist
- All participants are offline
503: Participant Offline
```json
{
  "error": "Participant is offline"
}
```
Causes:

- The participant was found, but its status is not "online"
- The participant's health check expired
502: Proxy Failed
```json
{
  "error": "Failed to proxy request: <details>"
}
```
Causes:

- The participant's LLM endpoint is unreachable
- A network error occurred between the hub and the participant
- The participant's LLM crashed
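A client can branch on these statuses when deciding what to do next. The policy below is one possible sketch, not part of the hub, and the helper and its action names are hypothetical: 404 and 503 suggest another participant may still serve the request, while 502 points at the matched participant's backend.

```typescript
type RoutingAction = "retry-with-any" | "retry-elsewhere" | "give-up";

// Hypothetical client-side policy for the error statuses listed above.
function classifyRoutingError(status: number): RoutingAction {
  switch (status) {
    case 404: // no participant matched the requested model
      return "retry-with-any";
    case 503: // the matched participant is offline
      return "retry-with-any";
    case 502: // the participant's LLM endpoint failed mid-request
      return "retry-elsewhere";
    default:  // not a routing error; surface it to the caller
      return "give-up";
  }
}
```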
Direct OpenAI Client Usage
You can use Gambiarra with any OpenAI-compatible client:
```typescript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "http://localhost:3000/rooms/ABC123/v1",
  apiKey: "not-needed" // API key not required
});

// Any strategy works
const result1 = await client.chat.completions.create({
  model: "*", // Any participant
  messages: [{ role: "user", content: "Hello" }]
});

const result2 = await client.chat.completions.create({
  model: "model:llama3", // Specific model
  messages: [{ role: "user", content: "Hi" }]
});

const result3 = await client.chat.completions.create({
  model: "participant-uuid", // Specific participant
  messages: [{ role: "user", content: "Hey" }]
});
```
Advanced: Custom Routing Logic
For custom routing (e.g., based on GPU specs or latency), implement client-side logic:
```typescript
import { createGambiarra } from "gambiarra-sdk";
import { generateText } from "ai";

const gambiarra = createGambiarra({ roomCode: "ABC123" });

// Fetch participants
const participants = await gambiarra.listParticipants();

// Select the participant with the most VRAM
const bestParticipant = participants
  .filter(p => p.status === "online")
  .sort((a, b) => (b.specs.vram ?? 0) - (a.specs.vram ?? 0))[0];

if (!bestParticipant) {
  throw new Error("No online participants");
}

// Route to the selected participant
const result = await generateText({
  model: gambiarra.participant(bestParticipant.id),
  prompt: "Process this large context..."
});
```
Next Steps
- Room Management: learn how to manage rooms and participants
- Architecture: understand the overall system design