Introduction

Symphony is a long-running automation service that orchestrates coding agents to complete project work. It continuously reads issues from a tracker (Linear), creates isolated workspaces, and runs coding agent sessions inside those workspaces.
Symphony is a scheduler/runner and tracker reader, not a general workflow engine. Ticket writes (state transitions, comments, PR links) are performed by the coding agent using tools in the runtime environment.

Architecture Principles

Separation of Concerns

Symphony is organized into distinct layers:

Policy Layer

Repository-defined rules in WORKFLOW.md for ticket handling and validation

Coordination Layer

Orchestrator managing polling, eligibility, concurrency, and retries

Execution Layer

Workspace management and agent subprocess lifecycle

Integration Layer

Tracker adapters normalizing external API data

Key Design Goals

  • Single authoritative orchestrator for dispatch, retries, and reconciliation
  • Deterministic per-issue workspaces preserved across runs
  • Restart recovery without requiring a persistent database
  • Bounded concurrency with global and per-state limits
  • Exponential backoff recovery from transient failures

System Components

The architecture consists of these main components:

Component Responsibilities

Orchestrator

Core coordination engine that:
  • Owns the poll tick and in-memory runtime state
  • Decides which issues to dispatch, retry, stop, or release
  • Tracks session metrics and retry queue state
  • Reconciles running issues against current tracker state
Implementation: SymphonyElixir.Orchestrator (GenServer)

Workspace

Isolated filesystem management that:
  • Maps issue identifiers to workspace paths
  • Ensures per-issue workspace directories exist
  • Runs lifecycle hooks (after_create, before_run, after_run, before_remove)
  • Cleans workspaces for terminal issues
Implementation: SymphonyElixir.Workspace

Agent Runner

Coding agent execution that:
  • Creates or reuses the workspace for each issue
  • Builds prompts from the issue and the workflow template
  • Launches the Codex app-server client in the workspace
  • Streams agent updates back to the orchestrator
  • Manages multi-turn sessions on the same thread
Implementation: SymphonyElixir.AgentRunner

Tracker

Issue tracker integration that:
  • Fetches candidate issues in active states
  • Fetches current states for running issues (reconciliation)
  • Fetches terminal-state issues during startup cleanup
  • Normalizes tracker payloads into a stable issue model
Implementation: SymphonyElixir.Tracker + SymphonyElixir.Linear.Adapter

Workflow and Config

Configuration management that:
  • Reads WORKFLOW.md from the repository
  • Parses the YAML front matter and prompt body
  • Watches for changes and hot-reloads config
  • Returns typed configuration for the runtime
Implementation: SymphonyElixir.Workflow + SymphonyElixir.Config
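
To make the "YAML front matter and prompt body" split concrete, here is a minimal sketch of how a WORKFLOW.md file could be divided into its two parts. The module name WorkflowSketch and the function split/1 are hypothetical and are not taken from SymphonyElixir.Workflow. The sketch assumes the front matter is delimited by "---" lines and that a file without a leading delimiter is all prompt body. Decoding the YAML itself is left to a library and is not shown.

```elixir
defmodule WorkflowSketch do
  @moduledoc "Hypothetical helper: splits WORKFLOW.md text into front matter and prompt body."

  # Returns {front_matter, body} as strings.
  # Assumption: front matter sits between the first two lines that consist of "---".
  # `parts: 3` stops after the second delimiter, so later "---" lines
  # (for example Markdown rules in the prompt) stay inside the body.
  def split(text) do
    case String.split(text, ~r/^---[ \t]*\r?\n/m, parts: 3) do
      ["", front, body] -> {front, String.trim(body)}
      _no_front_matter -> {"", String.trim(text)}
    end
  end
end
```

Calling WorkflowSketch.split/1 on a file that starts with a "---" line yields the raw YAML text as the first element and the trimmed prompt template as the second.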

Data Flow

Poll Cycle Flow

1. Tick timer fires (every polling.interval_ms)

2. Reconcile running issues
   - Check for stalled sessions (no activity > stall_timeout_ms)
   - Refresh tracker state for all running issues
   - Terminate workers for terminal/inactive issues

3. Validate configuration
   - Ensure tracker.kind, api_key, project_slug present
   - Ensure codex.command configured

4. Fetch candidate issues from tracker
   - Filter by active_states
   - Sort by priority, created_at, identifier

5. Dispatch eligible issues
   - Check: not already running/claimed
   - Check: global concurrency slots available
   - Check: per-state concurrency slots available
   - Check: Todo issues not blocked by non-terminal blockers

6. Schedule next poll tick
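
Steps 4 and 5 above can be sketched as two small functions. This is an illustration under stated assumptions, not the orchestrator's actual code: the module DispatchSketch, the per_state_limits field, and the issue fields priority, created_at, state, and blocked_by are hypothetical names. The sketch also assumes that lower priority numbers are more urgent, that a missing priority sorts last, and that created_at is an ISO 8601 string.

```elixir
defmodule DispatchSketch do
  @moduledoc "Hypothetical sketch of the candidate ordering and dispatch checks."

  # Order by priority, then creation time, then identifier.
  # Assumes created_at is an ISO 8601 string, so plain term comparison is chronological.
  def sort(issues) do
    Enum.sort_by(issues, fn i -> {i.priority || 1_000, i.created_at, i.identifier} end)
  end

  # True only when every dispatch check from step 5 passes.
  def eligible?(issue, state, terminal_states) do
    not Map.has_key?(state.running, issue.id) and
      not MapSet.member?(state.claimed, issue.id) and
      map_size(state.running) < state.max_concurrent_agents and
      state_slot_free?(issue, state) and
      not blocked?(issue, terminal_states)
  end

  # per_state_limits is an assumed map such as %{"Todo" => 1};
  # tracker states without an entry are treated as unlimited.
  defp state_slot_free?(issue, state) do
    case Map.fetch(state.per_state_limits, issue.state) do
      {:ok, limit} ->
        Enum.count(state.running, fn {_id, entry} -> entry.state == issue.state end) < limit

      :error ->
        true
    end
  end

  # Only Todo issues are held back, and only by blockers not yet in a terminal state.
  defp blocked?(%{state: "Todo", blocked_by: blockers}, terminal_states) do
    Enum.any?(blockers, fn blocker -> blocker.state not in terminal_states end)
  end

  defp blocked?(_issue, _terminal_states), do: false
end
```

Keeping the checks as one pure predicate makes it easy to test dispatch decisions without a running tracker or agent.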

Agent Execution Flow

1. Create/reuse workspace directory
   - Sanitize issue identifier for path safety
   - Run after_create hook if newly created

2. Run before_run hook (if configured)

3. Start Codex app-server session
   - Send initialize, initialized, thread/start messages
   - Configure approval policy and sandbox settings

4. Run first turn
   - Render prompt from WORKFLOW.md template + issue data
   - Send turn/start with full task description
   - Stream events back to orchestrator

5. Check for continuation (if turn succeeds)
   - Refresh issue state from tracker
   - If still active AND turn_count < agent.max_turns:
     → Start next turn with continuation guidance
   - If max_turns reached:
     → Return control to orchestrator for retry

6. Run after_run hook (always, even on failure)

7. Report completion to orchestrator
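
The continuation check in step 5 reduces to a three-way decision after each successful turn. The sketch below is one way to express it; the module ContinuationSketch and the returned atoms are hypothetical, and the behavior for an issue that is no longer active (simply finishing the run) is an assumption.

```elixir
defmodule ContinuationSketch do
  @moduledoc "Hypothetical sketch of the post-turn continuation decision."

  # issue_state is the freshly refreshed tracker state;
  # turn_count is the number of turns already completed in this session.
  def next_step(issue_state, turn_count, max_turns, active_states) do
    cond do
      # Assumption: an issue that left the active states needs no further turns.
      issue_state not in active_states -> :done
      # Still active with turns to spare: start another turn on the same thread.
      turn_count < max_turns -> :continue
      # Still active but out of turns: hand control back to the orchestrator.
      true -> :return_to_orchestrator
    end
  end
end
```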

Retry Flow

1. Worker exits abnormally

2. Orchestrator schedules retry with exponential backoff
   - delay = min(10000 * 2^(attempt-1), max_retry_backoff_ms)

3. Retry timer fires

4. Fetch candidate issues (active states only)

5. Find issue by ID

6. Check issue state:
   - Terminal? → Clean workspace, release claim
   - Active + slots available? → Re-dispatch
   - Active + no slots? → Requeue retry
   - Not found/inactive? → Release claim
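
The backoff formula above translates directly into code. This is a minimal sketch: the module name BackoffSketch is hypothetical, and the only values taken from the text are the 10000 ms base, the doubling per attempt, and the max_retry_backoff_ms cap.

```elixir
defmodule BackoffSketch do
  @moduledoc "Hypothetical sketch of the retry delay formula."

  # Base delay in milliseconds, as given in the formula.
  @base_ms 10_000

  # delay = min(10000 * 2^(attempt - 1), max_retry_backoff_ms)
  # attempt is 1-based: the first retry waits the base delay.
  def delay_ms(attempt, max_retry_backoff_ms) when is_integer(attempt) and attempt >= 1 do
    min(@base_ms * Integer.pow(2, attempt - 1), max_retry_backoff_ms)
  end
end
```

With an illustrative cap of 300000 ms, attempts 1 through 5 wait 10000, 20000, 40000, 80000, and 160000 ms; attempt 6 would be 320000 ms and is clamped to 300000 ms.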

Reconciliation Flow

Reconciliation runs before dispatch on every poll tick to ensure running sessions stay aligned with tracker state.
For each running issue:

1. Stall detection
   - elapsed_ms = now - (last_codex_timestamp || started_at)
   - If elapsed_ms > stall_timeout_ms:
     → Terminate worker, schedule retry

2. Tracker state refresh
   - Fetch current state from tracker by issue ID
   - If terminal state:
     → Terminate worker, clean workspace
   - If still active:
     → Update in-memory issue snapshot
   - If neither active nor terminal:
     → Terminate worker (no cleanup)
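
Both reconciliation checks can be written as pure functions over a running entry. The sketch below assumes timestamps are integer milliseconds and uses hypothetical names (ReconcileSketch, stalled?/3, action/3, and the returned atoms); only last_codex_timestamp, started_at, and stall_timeout_ms come from the text.

```elixir
defmodule ReconcileSketch do
  @moduledoc "Hypothetical sketch of stall detection and tracker-state reconciliation."

  # Stall detection: fall back to started_at when no agent event has been seen yet.
  # Assumes all timestamps are integer milliseconds on the same clock.
  def stalled?(entry, now_ms, stall_timeout_ms) do
    last_activity = entry.last_codex_timestamp || entry.started_at
    now_ms - last_activity > stall_timeout_ms
  end

  # Tracker state refresh: map the current tracker state to one of the three outcomes.
  def action(tracker_state, active_states, terminal_states) do
    cond do
      tracker_state in terminal_states -> :terminate_and_clean_workspace
      tracker_state in active_states -> :update_snapshot
      # Neither active nor terminal: stop the worker but keep the workspace.
      true -> :terminate_keep_workspace
    end
  end
end
```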

State Management

Orchestrator Runtime State

The orchestrator keeps a single authoritative runtime state, held entirely in memory:
%State{
  poll_interval_ms: integer(),
  max_concurrent_agents: integer(),
  running: %{issue_id => running_entry},
  claimed: MapSet.t(issue_id),
  retry_attempts: %{issue_id => retry_entry},
  completed: MapSet.t(issue_id),
  codex_totals: %{
    input_tokens: integer(),
    output_tokens: integer(),
    total_tokens: integer(),
    seconds_running: integer()
  },
  codex_rate_limits: map() | nil
}

Issue Orchestration States

Unclaimed

Issue not running, no retry scheduled

Claimed

Reserved to prevent duplicate dispatch (Running or RetryQueued)

Running

Worker task exists, tracked in running map

RetryQueued

Worker not running, retry timer active in retry_attempts

Released

Claim removed (terminal, non-active, missing, or retry completed)
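
One way to read these states is as a view derived from the running, retry_attempts, and claimed fields of the runtime state. The sketch below illustrates that reading with plain maps; the module ClaimSketch and its functions are hypothetical, and treating Released as "all three entries removed" is an assumption.

```elixir
defmodule ClaimSketch do
  @moduledoc "Hypothetical sketch: derive an issue's orchestration state from runtime state."

  # Running and RetryQueued are the two concrete forms a claim can take.
  def status(state, issue_id) do
    cond do
      Map.has_key?(state.running, issue_id) -> :running
      Map.has_key?(state.retry_attempts, issue_id) -> :retry_queued
      MapSet.member?(state.claimed, issue_id) -> :claimed
      true -> :unclaimed
    end
  end

  # Releasing drops every trace of the issue, so it can be dispatched again later.
  def release(state, issue_id) do
    %{
      state
      | running: Map.delete(state.running, issue_id),
        retry_attempts: Map.delete(state.retry_attempts, issue_id),
        claimed: MapSet.delete(state.claimed, issue_id)
    }
  end
end
```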

Safety Invariants

Critical workspace safety rules that must never be violated:
  1. Workspace isolation: Run coding agent only in per-issue workspace path
    • Validate: cwd == workspace_path before launching subprocess
  2. Workspace containment: All workspace paths must stay inside workspace root
    • Normalize both paths to absolute
    • Require that workspace_path has workspace_root as a prefix
    • Reject paths outside workspace root
  3. Workspace key sanitization: Only [A-Za-z0-9._-] allowed in directory names
    • Replace all other characters with _
Implementation reference: SymphonyElixir.Workspace.validate_workspace_path/1
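
Rules 2 and 3 can be illustrated with a short sketch. It is not the body of validate_workspace_path/1; the module WorkspaceSafetySketch and its functions are hypothetical. It assumes POSIX-style paths and compares against the root plus a trailing "/", so that a sibling directory such as /tmp/ws-other does not pass as being inside /tmp/ws.

```elixir
defmodule WorkspaceSafetySketch do
  @moduledoc "Hypothetical sketch of workspace key sanitization and containment."

  # Rule 3: anything outside [A-Za-z0-9._-] becomes "_".
  def sanitize_key(identifier) do
    String.replace(identifier, ~r/[^A-Za-z0-9._-]/u, "_")
  end

  # Rule 2: normalize both paths to absolute, then require containment.
  def workspace_path(workspace_root, identifier) do
    root = Path.expand(workspace_root)
    path = Path.expand(Path.join(root, sanitize_key(identifier)))

    if contained?(path, root) do
      {:ok, path}
    else
      {:error, :outside_workspace_root}
    end
  end

  # Compare against root plus a separator so "/tmp/ws-other" is not
  # mistaken for a child of "/tmp/ws". Assumes "/" as the separator.
  defp contained?(path, root), do: String.starts_with?(path, root <> "/")
end
```

Sanitization alone is not enough: the key ".." contains only allowed characters, so it survives sanitize_key/1 and is rejected only by the containment check after path normalization.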

Configuration Hot-Reload

Symphony watches WORKFLOW.md for changes and hot-reloads its configuration. An invalid reload does not crash the service; Symphony keeps operating with the last known good configuration.
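
The "last known good" behavior amounts to replacing the configuration only when parsing succeeds. A minimal sketch, assuming a parser function that returns {:ok, config} or {:error, reason}; the module ReloadSketch, the function reload/3, and the return shapes are hypothetical.

```elixir
defmodule ReloadSketch do
  @moduledoc "Hypothetical sketch of a reload that keeps the last known good config."

  # parse_fun is any function returning {:ok, config} or {:error, reason}.
  def reload(current_config, new_text, parse_fun) do
    case parse_fun.(new_text) do
      # Valid file: switch to the new configuration.
      {:ok, new_config} -> {:reloaded, new_config}
      # Invalid file: keep serving the current configuration and surface the reason.
      {:error, reason} -> {:kept, current_config, reason}
    end
  end
end
```

Because the failure branch returns the old configuration unchanged, a caller can log the reason and continue polling without a restart.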

Next Steps

Component Deep Dive

Detailed implementation of each component

Workflow Lifecycle

Issue polling → dispatch → execution → cleanup

Workspace Isolation

How workspaces are isolated and managed

Configuration

Complete WORKFLOW.md reference