What It Is
The orchestrator is a CopilotSession created via client.createSession() (or resumed via client.resumeSession()). Unlike workers, which are temporary, the orchestrator session is meant to live indefinitely. Its session ID is stored in the max_state SQLite table under the key orchestrator_session_id so it can be resumed after a daemon restart.
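The create-or-resume behaviour described above could look roughly like the following. This is a minimal sketch, not Max's actual code: ClientLike, SessionLike, maxState, and createOrResume are hypothetical stand-ins for the Copilot SDK client and the max_state table, and falling back to a fresh session when resuming fails is an assumption.

```typescript
// Hypothetical stand-ins so the sketch is self-contained; the real Copilot SDK
// client and the max_state SQLite table are not used here.
interface SessionLike {
  sessionId: string;
}

interface ClientLike {
  createSession(): Promise<SessionLike>;
  resumeSession(id: string): Promise<SessionLike>;
}

const STATE_KEY = "orchestrator_session_id";

// In-memory substitute for the max_state table.
const maxState = new Map<string, string>();

async function createOrResume(client: ClientLike): Promise<SessionLike> {
  const savedId = maxState.get(STATE_KEY);
  if (savedId !== undefined) {
    try {
      return await client.resumeSession(savedId);
    } catch {
      // Assumption: an ID that can no longer be resumed is discarded.
      maxState.delete(STATE_KEY);
    }
  }
  // No usable saved ID: create a session and persist its ID for the next start.
  const session = await client.createSession();
  maxState.set(STATE_KEY, session.sessionId);
  return session;
}
```

Keeping the ID under a single well-known key means a restart needs only one lookup before deciding whether to resume or create.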
Session Persistence
On every daemon start, Max attempts to resume the saved session:
1. Look up saved session ID: getState("orchestrator_session_id") reads the ID from max_state. If none exists, skip to step 3.
2. Resume the session: client.resumeSession(savedId, { ... }) reconnects to the existing Copilot SDK session, preserving conversation history managed by the SDK.

The SDK manages long-context conversation history automatically via infiniteSessions. Max sets a compaction threshold at 80% buffer utilization and a hard cutoff at 95%.
Message Queue
The orchestrator is single-threaded by design. All inbound messages are pushed onto messageQueue and processed one at a time by processQueue().
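One way to get this one-at-a-time behaviour is a plain array guarded by a "currently draining" flag. The sketch below is illustrative only: the QueuedMessage shape, the processing flag, runTurn, and enqueue are assumptions, and runTurn simply echoes the prompt instead of calling a real session.

```typescript
// Hypothetical shape of a queue entry; the real entries may carry more fields.
interface QueuedMessage {
  prompt: string;
  callback: (reply: string) => void;
}

const messageQueue: QueuedMessage[] = [];
let processing = false;

// Stand-in for the real session call; it just echoes the prompt.
async function runTurn(prompt: string): Promise<string> {
  return `echo: ${prompt}`;
}

async function processQueue(): Promise<void> {
  if (processing) return; // a drain loop is already running
  processing = true;
  try {
    while (messageQueue.length > 0) {
      const next = messageQueue.shift()!;
      // Each turn finishes before the next message is taken off the queue.
      next.callback(await runTurn(next.prompt));
    }
  } finally {
    processing = false;
  }
}

function enqueue(prompt: string, callback: (reply: string) => void): Promise<void> {
  messageQueue.push({ prompt, callback });
  return processQueue();
}
```

Because only one drain loop can be active, messages that arrive mid-turn wait in the array and are answered in arrival order.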
Message Flow
When sendToOrchestrator(prompt, source, callback) is called:
- Source tagging: user messages are prefixed with [via telegram] or [via tui]. Background worker completions are left untagged.
- Enqueue: the tagged prompt and callback are added to messageQueue.
- Model resolution: resolveModel() classifies the message and selects a model tier (see Model Routing).
- Session execution: session.sendAndWait({ prompt }, 300_000) runs with a 5-minute timeout. Streaming deltas are forwarded to the callback in real time.
- Logging: on success, both sides of the exchange are written to conversation_log.
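The source-tagging step in the list above can be sketched as a small pure function. The MessageSource type, the tagPrompt name, and the single space between the tag and the prompt are assumptions made for illustration.

```typescript
// Hypothetical source names; only the [via telegram] and [via tui] tags come from the text.
type MessageSource = "telegram" | "tui" | "background";

// 5-minute turn timeout, matching the 300_000 ms passed to sendAndWait.
const TURN_TIMEOUT_MS = 300_000;

function tagPrompt(prompt: string, source: MessageSource): string {
  // Background worker completions are left untagged.
  if (source === "background") return prompt;
  // Assumption: tag and prompt are separated by a single space.
  return `[via ${source}] ${prompt}`;
}
```

Tagging before enqueueing means the model sees which channel a message came from without any extra metadata.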
Retry Logic
Transient failures (timeouts, disconnects, EPIPE, etc.) are retried automatically:
| Attempt | Delay before retry |
|---|---|
| 1st retry | 1 000 ms |
| 2nd retry | 3 000 ms |
| 3rd retry | 10 000 ms |
ensureClient(). Non-recoverable errors (e.g. explicit cancellation) fail immediately without retrying.
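The retry schedule in the table can be sketched as a fixed delay list plus a loop. This is an illustration under assumptions: isTransient matching on error message text, the withRetry name, and the injectable sleep parameter are all hypothetical.

```typescript
// Delays before the 1st, 2nd, and 3rd retry, taken from the table above.
const RETRY_DELAYS_MS = [1_000, 3_000, 10_000];

// Assumption: transient errors are recognised by their message text.
function isTransient(err: unknown): boolean {
  const msg = err instanceof Error ? err.message : String(err);
  return /timeout|disconnect|EPIPE/i.test(msg);
}

async function withRetry<T>(
  op: () => Promise<T>,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await op();
    } catch (err) {
      // Non-recoverable errors, or an exhausted schedule, fail immediately.
      if (!isTransient(err) || attempt >= RETRY_DELAYS_MS.length) throw err;
      await sleep(RETRY_DELAYS_MS[attempt]);
    }
  }
}
```

Passing sleep as a parameter keeps the schedule testable without waiting for the real delays.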
Health Check Loop
A setInterval fires every 30 seconds to verify the Copilot SDK client is still connected:
createOrResumeSession() call.
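A periodic connectivity check of this kind can be sketched as follows. The HealthClient interface, isConnected(), and the onDisconnected callback are hypothetical; the real SDK client exposes its own way of reporting connection state, and what Max does on a failed check is not shown here.

```typescript
// Hypothetical client shape used only for this sketch.
interface HealthClient {
  isConnected(): boolean;
}

const HEALTH_CHECK_INTERVAL_MS = 30_000;

// One pass of the check: report a lost connection to the caller.
function runHealthCheck(client: HealthClient, onDisconnected: () => void): void {
  if (!client.isConnected()) onDisconnected();
}

// Starts the periodic check and returns a function that stops it.
function startHealthCheck(
  client: HealthClient,
  onDisconnected: () => void,
  intervalMs: number = HEALTH_CHECK_INTERVAL_MS,
): () => void {
  const timer = setInterval(() => runHealthCheck(client, onDisconnected), intervalMs);
  return () => clearInterval(timer);
}
```

Returning a stop function lets the daemon clear the timer on shutdown so the interval does not keep the process alive.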
System Message
Each time a session is created or resumed, a system message is injected. It contains:
- Role & identity: Max’s name, personality, and the user’s name
- Architecture overview: how channels work ([via telegram], [via tui], [via background])
- Tool usage guide: when to use workers vs. answer directly, skill workflow, memory workflow
- Auto-routing tiers: fast / standard / premium model assignments
- Long-term memory: output of getMemorySummary(), grouped by category
- Self-edit protection: injected unless the --self-edit flag is set
- Current date: resolved at session creation time via process.platform
Model Routing
Before each message is executed, resolveModel() selects a model:
Keyword overrides (highest priority)
Certain keywords bypass classification entirely. The default override routes design/UI/UX requests to claude-opus-4.6.
LLM classification
When auto-routing is enabled and no override matches, the message is classified as fast, standard, or premium by a lightweight LLM call via classifyWithLLM(). The tier maps to a configured model:
| Tier | Default model |
|---|---|
| fast | gpt-4.1 |
| standard | claude-sonnet-4.6 |
| premium | claude-opus-4.6 |
Cooldown
To prevent rapid model switching, a cooldown of 2 messages is enforced. If the classifier wants to switch but the cooldown hasn’t elapsed, the current model is kept.
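Taken together, the override, tier mapping, and cooldown rules could be sketched like this. Several details are assumptions: the keyword pattern, the RouterState shape, the route name, counting the cooldown in messages since the last switch, and letting keyword overrides skip the cooldown. The tier is passed in directly here rather than obtained from an LLM call.

```typescript
type Tier = "fast" | "standard" | "premium";

// Default tier-to-model mapping from the table above.
const TIER_MODELS: Record<Tier, string> = {
  fast: "gpt-4.1",
  standard: "claude-sonnet-4.6",
  premium: "claude-opus-4.6",
};

// Hypothetical keyword list; the text only says design/UI/UX requests are overridden.
const OVERRIDE_PATTERN = /\b(design|ui|ux)\b/i;
const OVERRIDE_MODEL = "claude-opus-4.6";
const COOLDOWN_MESSAGES = 2;

interface RouterState {
  currentModel: string;
  messagesSinceSwitch: number; // messages handled since the last model switch
}

function route(prompt: string, classifiedTier: Tier, state: RouterState): string {
  const override = OVERRIDE_PATTERN.test(prompt);
  const wanted = override ? OVERRIDE_MODEL : TIER_MODELS[classifiedTier];
  const cooledDown = state.messagesSinceSwitch >= COOLDOWN_MESSAGES;

  // Assumption: keyword overrides are applied even during the cooldown.
  if (wanted !== state.currentModel && (override || cooledDown)) {
    state.currentModel = wanted;
    state.messagesSinceSwitch = 0;
  } else {
    state.messagesSinceSwitch++;
  }
  return state.currentModel;
}
```

With this counting scheme, a classifier that flips between tiers on every message can change the model at most once every three messages.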
Session recreation on switch
When the model changes, the current orchestrator session is destroyed (orchestratorSession = undefined) and the session ID is deleted from max_state. The next ensureOrchestratorSession() call creates a fresh session on the new model.
Auto-routing is disabled by default. Enable it with toggle_auto (via natural language) or the /auto TUI command. When disabled, the model configured in ~/.max/.env is used for every message.
Tool Execution
Tools run synchronously inside session.sendAndWait(). When the model calls a tool, the Copilot SDK invokes the registered handler, waits for it to return, then continues generating. Worker tools (create_worker_session, send_to_worker) are the exception: they dispatch work to a background thread and return immediately, keeping the orchestrator turn fast.
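The difference between the two handler styles can be illustrated with a short sketch. Everything here is hypothetical: the ToolHandler type, readNote, sendToWorker, and runWorkerTask are stand-ins, and the background work is modelled as an un-awaited promise rather than whatever dispatch mechanism Max actually uses.

```typescript
// Hypothetical handler signature; the real SDK defines its own tool types.
type ToolHandler = (args: Record<string, string>) => Promise<string>;

const workerResults: string[] = [];

// Stand-in for slow background work.
async function runWorkerTask(task: string): Promise<void> {
  await new Promise<void>((resolve) => setTimeout(resolve, 50));
  workerResults.push(`done: ${task}`);
}

// Ordinary tool: the turn waits until the full result is available.
const readNote: ToolHandler = async (args) => `note for ${args.topic}`;

// Worker-style tool: start the task, do not await it, and acknowledge at once.
const sendToWorker: ToolHandler = async (args) => {
  void runWorkerTask(args.task).catch((err) => console.error("worker failed", err));
  return `dispatched: ${args.task}`;
};
```

The acknowledgement returns before the task finishes, so the orchestrator turn is not held up by long-running work; the result has to reach the orchestrator later through some other path.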