Before starting, make sure you have a local LLM running (Ollama, LM Studio, etc.). If you don’t have one yet, install Ollama and run `ollama pull llama3`.

## Prerequisites
- A local LLM server (Ollama, LM Studio, LocalAI, etc.)
- Node.js 18+ or Bun 1.0+ (for SDK usage)
- Two or more machines on the same network (optional, but ideal for testing)
## Setup

### Start the hub server

The hub is the central coordinator for all rooms and participants. Start it on any machine in your network, and keep that terminal running: the hub needs to stay active for participants to communicate.
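Based on the `gambiarra serve` command and `--mdns` flag referenced in this guide, starting the hub looks like:

```shell
# Start the hub; --mdns enables automatic discovery on the local network.
# The coordinator listens on port 3000 by default, per this guide.
gambiarra serve --mdns
```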
The `--mdns` flag enables automatic discovery on your local network, so clients can find the hub without knowing its IP address.

### Create a room
Open a new terminal and create a room.
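A sketch of the command, assuming a `create` subcommand (the exact name may differ in your CLI version):

```shell
# Create a room on the local hub (subcommand name is an assumption)
gambiarra create
```

The output includes a room code such as `ABC123`.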
The room code (like `ABC123`) is what others will use to join your room. Share it with your team!

### Optional: Password protection
You can protect your room with a password.
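For example, assuming room creation accepts the same `--password` flag used by `gambiarra join` later in this guide:

```shell
# Create a password-protected room (subcommand and flag placement assumed)
gambiarra create --password mySecret123
```

Anyone joining this room will need to supply the password.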
### Join with your LLM

Now join the room with your local LLM endpoint. This makes your LLM available to others in the room; once joined, the participant automatically sends health checks every 10 seconds to stay online. Supported backends include:
- Ollama (default endpoint `http://localhost:11434`)
- LM Studio (default endpoint `http://localhost:1234`)
- LocalAI (default endpoint `http://localhost:8080`)
- Custom (point it at your own endpoint)
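Putting this together, a join command might look like the following. Only `gambiarra join` and `--password` are confirmed elsewhere in this guide; the `--endpoint` and `--model` flag names are assumptions:

```shell
# Join room ABC123, exposing a local Ollama model to the room
# (flag names are assumptions; adapt to your CLI version)
gambiarra join ABC123 --endpoint http://localhost:11434 --model llama3
```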
If the room is password-protected, add `--password mySecret123` to the join command.
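It helps to see a first SDK request before the breakdown that follows. Only `gambiarra.any()` and `gambiarra.model(...)` are named in this guide, so the package name, constructor options, and `chat` method below are assumptions:

```typescript
// Hypothetical SDK usage; package name, constructor options, and
// method names other than any()/model() are assumptions.
import { Gambiarra } from "gambiarra";

const gambiarra = new Gambiarra({
  hub: "http://localhost:3000", // the hub started earlier
  room: "ABC123",               // the room code you created
});

// Route a request to any available participant in the room
const reply = await gambiarra.any().chat("Hello from the SDK!");
console.log(reply);
```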
## What just happened?

Let’s break down what happened:

- **Hub started** - The central coordinator started on port 3000
- **Room created** - A virtual room with code `ABC123` was created
- **Participant joined** - Your local LLM registered as a participant in the room
- **SDK connected** - Your application connected to the hub and made a request
- **Request routed** - The hub routed your request to an available participant
- **Response returned** - The participant’s LLM processed the request and returned the result
## Try different routing patterns

Gambiarra supports three routing strategies:

### 1. Any available participant (random)

Use any online participant.

### 2. Specific model

Route to the first participant with a specific model.

### 3. Specific participant

Route to a specific participant by ID.
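Assuming the client object from earlier (only `any()` and `model()` are named in this guide; `participant()` and `chat()` are assumptions), the three strategies look like:

```typescript
// 1. Any available participant (random)
const a = await gambiarra.any().chat("Summarize this text");

// 2. First participant serving a specific model
const b = await gambiarra.model("mistral").chat("Translate to French");

// 3. A specific participant by ID (participant() is an assumed method;
// the ID here is a placeholder)
const c = await gambiarra.participant("some-participant-id").chat("Hello");
```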
## Streaming responses

Gambiarra fully supports streaming responses.
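A streaming sketch, assuming the route selectors above expose an async-iterable `stream` method (an assumption, not a documented API):

```typescript
// Stream tokens as they arrive (stream() is an assumed method name)
for await (const chunk of gambiarra.any().stream("Tell me a story")) {
  process.stdout.write(chunk);
}
```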
## List available models

You can programmatically list all participants and models in a room.
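A sketch of what such a listing could look like; the method and field names here are assumptions:

```typescript
// List participants and the models they serve (names assumed)
const participants = await gambiarra.listParticipants();
for (const p of participants) {
  console.log(p.id, p.models.join(", "));
}
```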
## Monitor with the Terminal UI

Gambiarra includes a Terminal UI for real-time monitoring. If you run `gambiarra` without arguments, it opens the TUI, where you can:
- See all available rooms
- View participants and their health status
- Monitor request activity in real-time
- Create and join rooms interactively
## Working with multiple participants

The real power of Gambiarra comes from multiple participants. Open another terminal (or use another machine) and join the same room with a different model. With two participants online, `gambiarra.any()` distributes requests randomly across both, while `gambiarra.model("mistral")` routes requests to the second participant.
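For example, a second participant serving mistral might join like this (flag names assumed, as before):

```shell
# Second participant joins the same room with a different model
gambiarra join ABC123 --endpoint http://localhost:11434 --model mistral
```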
## List all rooms

See all available rooms on a hub.
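A sketch, assuming a `rooms` subcommand (the exact name may differ in your CLI version):

```shell
# List all rooms known to the hub (subcommand name assumed)
gambiarra rooms
```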
## Clean up

When you’re done:

- **Stop participants** - Press `Ctrl+C` in the terminal where you ran `gambiarra join`
- **Stop the hub** - Press `Ctrl+C` in the terminal where you ran `gambiarra serve`
## Next steps

- **CLI reference** - Learn about all available CLI commands
- **SDK reference** - Explore advanced SDK features
- **Architecture** - Understand how Gambiarra works
- **Troubleshooting** - Common issues and solutions