Omnigent runs as two cooperating pieces: a server that handles coordination, persistence, and the web UI, and a runner that lives on the user’s machine and executes the actual LLM loop. Separating them lets the server be deployed anywhere — a VPS, Render, Railway, Fly.io — while model credentials and local tools never leave the machine that registers as a host. Sessions stay reachable from any device, including a phone, without code or API keys ever touching the server image.

The Server

The server is a FastAPI application that acts as the coordination hub for everything Omnigent does. It exposes HTTP and SSE routes for clients (the CLI REPL, the web UI, the Python SDK), WebSocket endpoints for terminal attachment, and a WebSocket tunnel endpoint (WS /v1/runner/tunnel) that runners dial into from the user’s machine.

The server stores all session state (messages, sub-agents, terminal resources, file resources, policies) in either SQLite, for single-instance or lightweight deploys, or Postgres, for production and multi-instance setups. The web UI is served as a static SPA from the same process.

Because the server handles only coordination and persistence, not execution, the Docker image is deliberately small: it ships no harness SDKs, no tmux, and no LLM API keys. Those live on the runner’s machine, so a deployed server can be shared with your whole team without anyone’s credentials entering the server environment.
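As a rough illustration of how a runner might locate the tunnel endpoint, the sketch below derives a WebSocket URL from the server's HTTP base URL. Only the path /v1/runner/tunnel comes from this page; the function name `tunnel_url` and the http-to-ws scheme mapping are assumptions made for the example, not Omnigent's actual code.

```python
from urllib.parse import urlsplit, urlunsplit

TUNNEL_PATH = "/v1/runner/tunnel"  # path named in the text above


def tunnel_url(server_url: str) -> str:
    """Map an HTTP(S) base URL to a WebSocket tunnel URL (assumed mapping)."""
    parts = urlsplit(server_url)
    scheme = {"http": "ws", "https": "wss"}.get(parts.scheme)
    if scheme is None:
        raise ValueError(f"expected an http or https URL, got {server_url!r}")
    # Keep any base path the server is mounted under, then append the tunnel path.
    path = parts.path.rstrip("/") + TUNNEL_PATH
    return urlunsplit((scheme, parts.netloc, path, "", ""))


print(tunnel_url("http://localhost:6767"))
print(tunnel_url("https://your-host"))
```

Using `wss` for an `https` server keeps the tunnel encrypted end to end, which matters because the runner connects out from the user's machine over the public internet.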
Deployment targets for the server include:

Render / Railway

One-click deploys with managed Postgres provisioned automatically.

Docker Compose

Run on any VPS or home server with docker compose up -d.

Fly.io / Modal / HF Spaces

CLI-based deploys with SQLite or bring-your-own Postgres.

Local (background)

omnigent server start or auto-started by omnigent run on your machine.

Runners (Hosts)

A runner is a Python subprocess that runs on the user’s machine: a laptop, a dev container, or a cloud sandbox. Runners are not deployed; each user launches one by running omnigent run or omnigent claude, or by registering their machine with omnigent host. The runner dials into the server over WS /v1/runner/tunnel, authenticates, and waits for work. When a session receives a message, the server dispatches the task to the bound runner. The runner then:
  1. Loads the agent spec and selects the harness.
  2. Invokes the LLM loop locally (using the user’s own API keys or CLI login).
  3. Executes tools in the local environment.
  4. Streams events back through the WebSocket tunnel to the server, which fans them out to all connected clients (web UI, CLI REPL, SDK streams).
This design means that even when a session is shared with teammates or accessed from a phone, the agent’s code runs on the machine that registered as a host, with that machine’s credentials and filesystem access.
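The four runner steps can be pictured with a toy model. Everything in this sketch is hypothetical (the `echo_harness` stand-in, the `HARNESSES` registry, the event dictionaries, and the `send` callback that stands in for the WebSocket tunnel); it only shows the shape of the flow, not the real runner.

```python
from typing import Callable, Iterator

Event = dict  # hypothetical event shape, e.g. {"type": "text_delta", ...}


def echo_harness(message: str) -> Iterator[Event]:
    """Stand-in harness; a real one would call the LLM and run tools locally."""
    yield {"type": "text_delta", "text": f"echo: {message}"}
    yield {"type": "tool_call", "name": "noop"}
    yield {"type": "tool_result", "name": "noop", "output": "ok"}


# Hypothetical registry; only the harness name comes from the documentation.
HARNESSES: dict[str, Callable[[str], Iterator[Event]]] = {"claude-sdk": echo_harness}


def handle_task(spec: dict, message: str, send: Callable[[Event], None]) -> int:
    """Run one dispatched task and stream each event through `send`."""
    harness = HARNESSES[spec["executor"]["harness"]]  # step 1: select the harness
    sent = 0
    for event in harness(message):  # steps 2-3: LLM loop and tool execution
        send(event)  # step 4: stream the event back toward the server
        sent += 1
    return sent


outbox: list[Event] = []  # plays the role of the tunnel in this sketch
count = handle_task({"executor": {"harness": "claude-sdk"}}, "hello", outbox.append)
print(count, [event["type"] for event in outbox])
```

Because the harness is a generator, each event is forwarded as soon as it is produced rather than after the whole turn completes, which is what lets clients see text deltas and tool calls as they happen.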

Cloud Sandbox Hosts

If you don’t want a laptop to stay online, runners can be launched in Modal or Daytona cloud sandboxes:
omnigent sandbox create --provider modal
omnigent sandbox connect --provider modal --sandbox-id <id> --server https://your-host
The server can also provision sandboxes automatically per session (managed hosts) by setting a sandbox: block in the server config.

Harnesses

A harness is an adapter that connects the runner to a specific agent runtime or SDK. The runner loads the harness declared in the agent’s executor.harness field (or the --harness CLI flag) and delegates all LLM interaction to it. The six supported harnesses are:
| Harness | Runtime |
| --- | --- |
| claude-sdk | Claude Code via the claude-agent-sdk Python package |
| openai-agents | OpenAI Agents SDK |
| codex | OpenAI Codex CLI via @openai/codex npm package |
| pi | Pi harness (Anthropic Pi) |
| claude-native | Native Claude Code CLI, tmux-based |
| codex-native | Native Codex CLI, tmux-based |
See the Harnesses page for credential requirements and gateway configuration per harness.
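One way to think about harness selection is as a small lookup with validation. The sketch below is illustrative only: the function `resolve_harness` is hypothetical, and the rule that a --harness value overrides executor.harness is an assumption, since this page does not state the precedence.

```python
from typing import Optional

# The six harness names listed above.
SUPPORTED_HARNESSES = frozenset(
    {"claude-sdk", "openai-agents", "codex", "pi", "claude-native", "codex-native"}
)


def resolve_harness(spec: dict, cli_harness: Optional[str] = None) -> str:
    """Pick the harness name for a run.

    Assumption: a --harness value, when given, overrides executor.harness.
    """
    name = cli_harness or spec.get("executor", {}).get("harness")
    if name is None:
        raise ValueError("no harness declared in executor.harness or --harness")
    if name not in SUPPORTED_HARNESSES:
        raise ValueError(
            f"unsupported harness {name!r}; expected one of {sorted(SUPPORTED_HARNESSES)}"
        )
    return name


spec = {"executor": {"harness": "codex"}}
print(resolve_harness(spec))
print(resolve_harness(spec, cli_harness="claude-native"))
```

Failing fast on an unknown name surfaces a typo in the agent spec before any LLM call is made.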

Session Lifecycle

A session is the live context for one agent conversation. Here is how a typical interactive session flows from start to finish:
omnigent claude


CLI ensures backend
  • host daemon starts (or reuses existing)
  • daemon spawns local server if none running
  • daemon connects runner → server via WS tunnel


CLI creates session on server
  POST /v1/sessions  →  session_id: conv_abc123


Server dispatches to runner
  runner loads claude-sdk harness
  harness starts LLM loop locally


User types a message
  CLI → POST /v1/sessions/{id}/events
  Server → forwards to bound runner
  Runner → calls Anthropic API (local creds)
  LLM response → streamed back via WS tunnel
  Server → fans out via SSE to all clients


Session persists in server DB
  (resumable, shareable, forkable)
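The two HTTP calls in the flow above can be sketched with the standard library. The routes come from this page, but the JSON bodies, the helper names, and the use of http://localhost:6767 as the base URL are assumptions for illustration; the real request schema is not shown here. The requests are only built, not sent.

```python
import json
from urllib.request import Request

BASE_URL = "http://localhost:6767"  # assumed: the local default server address
JSON_HEADERS = {"Content-Type": "application/json"}


def create_session_request(agent: str) -> Request:
    # Hypothetical body; the real POST /v1/sessions schema is not documented here.
    body = json.dumps({"agent": agent}).encode()
    return Request(f"{BASE_URL}/v1/sessions", data=body, method="POST", headers=JSON_HEADERS)


def post_event_request(session_id: str, text: str) -> Request:
    # Hypothetical body; only the route comes from the flow above.
    body = json.dumps({"type": "message", "text": text}).encode()
    return Request(
        f"{BASE_URL}/v1/sessions/{session_id}/events",
        data=body,
        method="POST",
        headers=JSON_HEADERS,
    )


req = post_event_request("conv_abc123", "hello")
print(req.get_method(), req.full_url)
```

Passing either request to `urllib.request.urlopen` would send it to a running server.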
Events emitted by the harness — text deltas, tool calls, tool results — are streamed back to the server in real time and forwarded to any connected clients: the terminal REPL, the web UI, SDK stream() callers, and teammates watching a shared session.
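The fan-out described here follows a familiar publish/subscribe shape. The sketch below models it with one asyncio queue per connected client; the `FanOut` class and the event dictionaries are hypothetical, and a real server would write each event to an SSE response instead of a queue.

```python
import asyncio


class FanOut:
    """Copies each published event to every subscriber's queue."""

    def __init__(self) -> None:
        self._subscribers: list[asyncio.Queue] = []

    def subscribe(self) -> asyncio.Queue:
        queue: asyncio.Queue = asyncio.Queue()
        self._subscribers.append(queue)
        return queue

    def publish(self, event: dict) -> None:
        for queue in self._subscribers:
            queue.put_nowait(event)  # queues are unbounded, so this never blocks


async def demo() -> list[list[str]]:
    hub = FanOut()
    web_ui, cli_repl = hub.subscribe(), hub.subscribe()
    for event in ({"type": "text_delta"}, {"type": "tool_call"}, {"type": "tool_result"}):
        hub.publish(event)
    received = []
    for queue in (web_ui, cli_repl):
        received.append([(await queue.get())["type"] for _ in range(3)])
    return received


result = asyncio.run(demo())
print(result)
```

Giving each client its own queue means a slow subscriber, such as a phone on a weak connection, does not delay delivery to the others.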

Local vs. Server Mode

Running omnigent claude or omnigent run with no --server flag starts everything on your machine in one step. A background daemon:
  1. Auto-starts a local Omnigent server on http://localhost:6767.
  2. Connects a runner to that server via the WebSocket tunnel.
The web UI at http://localhost:6767 shows the same session. Teammates on your LAN can open your machine’s LAN address (e.g. http://192.168.x.x:6767) to watch or co-drive.
omnigent claude            # server + runner start automatically
omnigent server status     # check what's running
omnigent stop              # stop everything
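A script that wants to know whether the local server is up can probe the port directly. This is a minimal sketch using only the standard library; the helper name `server_is_listening` is hypothetical, and the check only confirms that something accepts TCP connections on the port, not that it is an Omnigent server.

```python
import socket


def server_is_listening(host: str = "localhost", port: int = 6767, timeout: float = 0.5) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    print("local server reachable:", server_is_listening())
```

For an authoritative answer, omnigent server status remains the documented command.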
