
Lifecycle Overview

A complete Symphony workflow follows this lifecycle:

Phase 1: Issue Polling

Poll Tick Trigger

The orchestrator schedules recurring ticks at polling.interval_ms (default: 30 seconds).
# Initial tick scheduled at startup
def init(_opts) do
  state = %State{
    poll_interval_ms: Config.poll_interval_ms(),
    # ...
  }
  
  run_terminal_workspace_cleanup()
  :ok = schedule_tick(0)  # Immediate first poll
  {:ok, state}
end

# Subsequent ticks
def handle_info(:tick, state) do
  state = refresh_runtime_config(state)  # Hot-reload WORKFLOW.md
  state = %{state | poll_check_in_progress: true}
  schedule_poll_cycle_start()  # Short delay for dashboard render
  {:noreply, state}
end
Reference: elixir/lib/symphony_elixir/orchestrator.ex:50-76
The orchestrator hot-reloads configuration on every tick, allowing changes to WORKFLOW.md to take effect without restart.
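
Schematically, the recurring tick is just Process.send_after re-armed on every pass. A minimal standalone sketch (the Tick module name is illustrative, not the real API):

```elixir
defmodule Tick do
  # Recurring polling sketch: each tick is scheduled with send_after,
  # and the handler re-arms itself after processing.
  def schedule(pid, interval_ms) do
    Process.send_after(pid, :tick, interval_ms)
  end
end
```

Scheduling with a delay of 0 reproduces the "immediate first poll" at startup.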

Startup Terminal Cleanup

Before the first poll, Symphony cleans up workspaces for issues already in terminal states:
defp run_terminal_workspace_cleanup do
  case Tracker.fetch_issues_by_states(Config.linear_terminal_states()) do
    {:ok, issues} ->
      Enum.each(issues, fn
        %Issue{identifier: identifier} when is_binary(identifier) ->
          cleanup_issue_workspace(identifier)

        _other ->
          :ok
      end)
    
    {:error, reason} ->
      Logger.warning("Skipping startup cleanup; failed to fetch terminal issues: #{inspect(reason)}")
  end
end
Reference: elixir/lib/symphony_elixir/orchestrator.ex:776-791

Phase 2: Reconciliation

Before dispatching new work, the orchestrator reconciles all running issues.

Step 2a: Stall Detection

Stall detection prevents zombie sessions that stop emitting events but don’t exit.
defp reconcile_stalled_running_issues(%State{} = state) do
  timeout_ms = Config.codex_stall_timeout_ms()  # Default: 300000 (5 min)
  
  cond do
    timeout_ms <= 0 -> state  # Disabled
    map_size(state.running) == 0 -> state
    true ->
      now = DateTime.utc_now()
      Enum.reduce(state.running, state, fn {issue_id, running_entry}, state_acc ->
        elapsed_ms = stall_elapsed_ms(running_entry, now)
        
        if is_integer(elapsed_ms) and elapsed_ms > timeout_ms do
          Logger.warning("Issue stalled: #{issue_id} elapsed_ms=#{elapsed_ms}; restarting")
          
          next_attempt = next_retry_attempt_from_running(running_entry)

          state_acc
          |> terminate_running_issue(issue_id, false)
          |> schedule_issue_retry(issue_id, next_attempt, %{
            identifier: running_entry.identifier,
            error: "stalled for #{elapsed_ms}ms without codex activity"
          })
        else
          state_acc
        end
      end)
  end
end
Reference: elixir/lib/symphony_elixir/orchestrator.ex:367-406

Elapsed time calculation:
elapsed_ms = now - (last_codex_timestamp || started_at)
If last_codex_timestamp exists (any event received), use it. Otherwise use started_at (worker launch time).
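
The rule above can be sketched as a small helper. This is a hypothetical Stall module, assuming both timestamps are DateTime structs on the running entry:

```elixir
defmodule Stall do
  # Prefer the last Codex event timestamp; fall back to worker launch time.
  def elapsed_ms(%{last_codex_timestamp: last, started_at: started}, now) do
    case last || started do
      %DateTime{} = reference -> DateTime.diff(now, reference, :millisecond)
      _ -> nil
    end
  end
end
```

Returning nil when neither timestamp is usable lets the caller's is_integer/1 guard skip the entry instead of crashing.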

Step 2b: Tracker State Refresh

Active-run reconciliation ensures running sessions stay aligned with current tracker state.
defp reconcile_running_issues(%State{} = state) do
  running_ids = Map.keys(state.running)
  
  case Tracker.fetch_issue_states_by_ids(running_ids) do
    {:ok, issues} ->
      Enum.reduce(issues, state, fn issue, state_acc ->
        cond do
          terminal_issue_state?(issue.state) ->
            Logger.info("Issue #{issue.identifier} moved to terminal state=#{issue.state}; stopping")
            terminate_running_issue(state_acc, issue.id, true)  # cleanup_workspace=true
          
          !issue_routable_to_worker?(issue) ->
            Logger.info("Issue #{issue.identifier} no longer routed; stopping")
            terminate_running_issue(state_acc, issue.id, false)
          
          active_issue_state?(issue.state) ->
            refresh_running_issue_state(state_acc, issue)  # Update in-memory snapshot
          
          true ->
            Logger.info("Issue #{issue.identifier} moved to non-active state; stopping")
            terminate_running_issue(state_acc, issue.id, false)
        end
      end)
    
    {:error, reason} ->
      Logger.debug("Failed to refresh states: #{inspect(reason)}; keeping workers running")
      state
  end
end
Reference: elixir/lib/symphony_elixir/orchestrator.ex:236-324

Reconciliation outcomes:
  • Terminal State → terminate worker + clean workspace
  • Still Active → update issue snapshot in memory
  • Non-Active → terminate worker (no cleanup)

Phase 3: Validation

Before fetching candidates, the orchestrator validates runtime configuration:
defp maybe_dispatch(%State{} = state) do
  state = reconcile_running_issues(state)
  
  with :ok <- Config.validate!(),
       {:ok, issues} <- Tracker.fetch_candidate_issues(),
       true <- available_slots(state) > 0 do
    choose_issues(issues, state)
  else
    {:error, :missing_linear_api_token} ->
      Logger.error("Linear API token missing in WORKFLOW.md")
      state
    
    {:error, :missing_codex_command} ->
      Logger.error("Codex command missing in WORKFLOW.md")
      state
    
    # ... other validation errors
  end
end
Reference: elixir/lib/symphony_elixir/orchestrator.ex:173-234

Validation checks:
  • tracker.kind present and supported
  • tracker.api_key present after $ resolution
  • tracker.project_slug present (for Linear)
  • codex.command present and non-empty
If validation fails, dispatch is skipped for this tick, but reconciliation continues and the next tick will retry.

Phase 4: Candidate Fetch

Tracker Query

The orchestrator fetches candidate issues in active states:
query CandidateIssues($projectSlug: String!, $states: [String!]!) {
  issues(
    filter: {
      project: { slugId: { eq: $projectSlug } }
      state: { name: { in: $states } }
    }
    first: 50
  ) {
    nodes {
      id
      identifier
      title
      description
      priority
      state { name }
      # ...
    }
  }
}
Variables:
  • projectSlug: tracker.project_slug
  • states: tracker.active_states

Issue Normalization

Raw tracker responses are normalized:
  • labels → lowercase strings
  • blocked_by → derived from inverse blocks relations
  • priority → integer only (non-integers become nil)
  • created_at, updated_at → parsed ISO-8601 timestamps
  • state → compared after trim + lowercase
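
These rules can be sketched as plain functions. The Normalize module below is hypothetical; only the behavior (lowercase labels, integer-only priority, trim + lowercase state comparison) is taken from the list above:

```elixir
defmodule Normalize do
  # Labels: keep only binaries, lowercase them.
  def labels(raw) when is_list(raw),
    do: raw |> Enum.filter(&is_binary/1) |> Enum.map(&String.downcase/1)

  # Priority: integers pass through; anything else becomes nil.
  def priority(p) when is_integer(p), do: p
  def priority(_), do: nil

  # State comparison: trim + lowercase both sides before comparing.
  def same_state?(a, b) when is_binary(a) and is_binary(b),
    do: String.downcase(String.trim(a)) == String.downcase(String.trim(b))
end
```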

Phase 5: Dispatch

Issue Sorting

Candidates are sorted by dispatch priority:
Enum.sort_by(issues, fn issue ->
  {
    priority_rank(issue.priority),      # 1..4 → 1..4, nil → 5
    issue_created_at_sort_key(issue),   # Unix microseconds (oldest first)
    issue.identifier || issue.id        # Lexicographic tie-breaker
  }
end)
Reference: elixir/lib/symphony_elixir/orchestrator.ex:453-461
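
A plausible sketch of priority_rank/1 matching the comment in the sort key (hypothetical module name, not the exact implementation):

```elixir
defmodule Dispatch do
  # Priorities 1..4 keep their value; a missing or out-of-range
  # priority sorts after all of them.
  def priority_rank(p) when is_integer(p) and p in 1..4, do: p
  def priority_rank(_), do: 5
end
```

With this ranking, two issues of equal priority fall through to the created_at key (oldest first), and unprioritized issues always dispatch last.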

Eligibility Checks

For each issue in sorted order:
defp should_dispatch_issue?(issue, state, active_states, terminal_states) do
  candidate_issue?(issue, active_states, terminal_states) and
    !todo_issue_blocked_by_non_terminal?(issue, terminal_states) and
    !MapSet.member?(state.claimed, issue.id) and
    !Map.has_key?(state.running, issue.id) and
    available_slots(state) > 0 and
    state_slots_available?(issue, state.running)
end
Candidate checks:
  1. Has required fields (id, identifier, title, state)
  2. State in active_states and not in terminal_states
  3. Routable to worker (assignee check)
Blocker rule:
  • If issue state is “Todo”, reject if any blocker is non-terminal
Concurrency checks:
  1. Global: max_concurrent_agents - running_count > 0
  2. Per-state: max_concurrent_agents_by_state[state] - running_count_for_state > 0
Reference: elixir/lib/symphony_elixir/orchestrator.ex:473-507
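
The per-state concurrency check could look roughly like this. The Slots module is a hypothetical sketch; in the real code the caps come from max_concurrent_agents_by_state in config:

```elixir
defmodule Slots do
  # caps maps a state name to its per-state worker cap.
  # A state with no configured cap is unlimited (subject to the global cap).
  def state_slots_available?(issue, running, caps) do
    case Map.get(caps, issue.state) do
      nil ->
        true

      cap ->
        in_state =
          Enum.count(running, fn {_id, entry} -> entry.issue.state == issue.state end)

        cap - in_state > 0
    end
  end
end
```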

Issue Revalidation

Before spawning a worker, the orchestrator refreshes the issue from the tracker to avoid acting on stale data:
defp dispatch_issue(state, issue, attempt) do
  case revalidate_issue_for_dispatch(issue, &Tracker.fetch_issue_states_by_ids/1, terminal_states) do
    {:ok, refreshed_issue} ->
      do_dispatch_issue(state, refreshed_issue, attempt)
    
    {:skip, :missing} ->
      Logger.info("Skipping; issue no longer visible: #{issue.identifier}")
      state
    
    {:skip, refreshed_issue} ->
      Logger.info("Skipping stale dispatch: #{refreshed_issue.identifier} state=#{refreshed_issue.state}")
      state
    
    {:error, reason} ->
      Logger.warning("Skipping; refresh failed: #{inspect(reason)}")
      state
  end
end
Reference: elixir/lib/symphony_elixir/orchestrator.ex:578-596

Worker Spawn

defp do_dispatch_issue(state, issue, attempt) do
  recipient = self()
  
  case Task.Supervisor.start_child(SymphonyElixir.TaskSupervisor, fn ->
    AgentRunner.run(issue, recipient, attempt: attempt)
  end) do
    {:ok, pid} ->
      ref = Process.monitor(pid)
      Logger.info("Dispatching #{issue.identifier} to agent pid=#{inspect(pid)} attempt=#{inspect(attempt)}")
      
      running_entry = %{
        pid: pid,
        ref: ref,
        identifier: issue.identifier,
        issue: issue,
        started_at: DateTime.utc_now(),
        retry_attempt: normalize_retry_attempt(attempt),
        # ... session tracking fields
      }
      
      %{
        state |
        running: Map.put(state.running, issue.id, running_entry),
        claimed: MapSet.put(state.claimed, issue.id),
        retry_attempts: Map.delete(state.retry_attempts, issue.id)
      }
  end
end
Reference: elixir/lib/symphony_elixir/orchestrator.ex:598-647

Phase 6: Workspace Creation

The Agent Runner creates an isolated workspace:
def run(issue, recipient, opts) do
  case Workspace.create_for_issue(issue) do
    {:ok, workspace} ->
      try do
        with :ok <- Workspace.run_before_run_hook(workspace, issue),
             :ok <- run_codex_turns(workspace, issue, recipient, opts) do
          :ok
        end
      after
        Workspace.run_after_run_hook(workspace, issue)
      end
  end
end
Reference: elixir/lib/symphony_elixir/agent_runner.ex:11-33

Workspace Path Construction

1. Sanitize identifier: "ABC-123" → "ABC-123" (already safe)
                        "MT/649"  → "MT_649" (slash replaced)
                        
2. Join with root: workspace_root + "/" + sanitized_identifier
   Example: "/tmp/symphony_workspaces/ABC-123"

3. Validate path safety:
   - Must be inside workspace_root (prefix check)
   - Must not equal workspace_root
   - Must not contain symlink escapes
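
Steps 1-3 can be sketched as follows. The WorkspacePath module, its root path, and the sanitizing regex are illustrative assumptions; only the replace-unsafe-characters and prefix-check behavior comes from the steps above:

```elixir
defmodule WorkspacePath do
  @root "/tmp/symphony_workspaces"  # assumption: example workspace_root

  # Step 1: replace path-hostile characters in the identifier.
  def sanitize(identifier), do: String.replace(identifier, ~r/[^A-Za-z0-9_-]/, "_")

  # Steps 2-3: join with the root and enforce the prefix invariant.
  def build(identifier) do
    path = Path.join(@root, sanitize(identifier))

    if String.starts_with?(path, @root <> "/") and path != @root do
      {:ok, path}
    else
      {:error, :unsafe_path}
    end
  end
end
```

The symlink-escape check would additionally resolve the path on disk, which is omitted here.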

Directory Creation

defp ensure_workspace(workspace) do
  cond do
    File.dir?(workspace) ->
      clean_tmp_artifacts(workspace)  # Remove .elixir_ls, tmp/
      {:ok, false}  # Reused
    
    File.exists?(workspace) ->
      File.rm_rf!(workspace)
      create_workspace(workspace)
    
    true ->
      create_workspace(workspace)
  end
end

defp create_workspace(workspace) do
  File.rm_rf!(workspace)
  File.mkdir_p!(workspace)
  {:ok, true}  # Newly created
end
Reference: elixir/lib/symphony_elixir/workspace.ex:32-51

after_create Hook

If the workspace was newly created (not reused), run hooks.after_create:
defp maybe_run_after_create_hook(workspace, issue_context, created?) do
  case created? do
    true ->
      case Config.workspace_hooks()[:after_create] do
        nil -> :ok
        command -> run_hook(command, workspace, issue_context, "after_create")
      end
    false ->
      :ok
  end
end
Reference: elixir/lib/symphony_elixir/workspace.ex:125-139
after_create hook failure is fatal to workspace creation. The workspace will not be used.

Phase 7: Agent Execution

Codex Session Startup

defp run_codex_turns(workspace, issue, recipient, opts) do
  # fetcher and max_turns are derived from opts (elided in this excerpt)
  with {:ok, session} <- AppServer.start_session(workspace) do
    try do
      do_run_codex_turns(session, workspace, issue, recipient, opts, fetcher, 1, max_turns)
    after
      AppServer.stop_session(session)
    end
  end
end
Reference: elixir/lib/symphony_elixir/agent_runner.ex:49-60

Protocol handshake:
{"id":1,"method":"initialize","params":{"clientInfo":{"name":"symphony","version":"1.0"},"capabilities":{}}}
{"method":"initialized","params":{}}
{"id":2,"method":"thread/start","params":{"approvalPolicy":"...","sandbox":"workspace-write","cwd":"/abs/workspace"}}
Reference: SPEC.md:928-936

First Turn

Prompt rendering:
prompt = PromptBuilder.build_prompt(issue, opts)
# Uses WORKFLOW.md template + issue data
Turn start:
{"id":3,"method":"turn/start","params":{
  "threadId":"<thread-id>",
  "input":[{"type":"text","text":"<rendered-prompt>"}],
  "cwd":"/abs/workspace",
  "title":"ABC-123: Example Issue",
  "approvalPolicy":"...",
  "sandboxPolicy":{"type":"..."}
}}
Reference: SPEC.md:954-963

Event streaming: Codex emits line-delimited JSON on stdout:
  • turn/completed → success
  • turn/failed → failure
  • turn/cancelled → failure
  • Tool calls → handled by dynamic tool executor
  • Approval requests → auto-approved or failed (depending on policy)
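
Routing decoded events amounts to a dispatch on the method field. This is a hedged sketch; only the three turn outcomes come from the list above, and everything else falls through to a catch-all:

```elixir
# Classify a decoded event map by its "method" field.
classify = fn
  %{"method" => "turn/completed"} -> :success
  %{"method" => "turn/failed"} -> :failure
  %{"method" => "turn/cancelled"} -> :failure
  _other -> :other  # tool calls, approval requests, notifications
end
```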

Continuation Turns

If the turn completes successfully and the issue is still active:
case continue_with_issue?(issue, fetcher) do
  {:continue, refreshed_issue} when turn_number < max_turns ->
    Logger.info("Continuing after normal turn completion turn=#{turn_number}/#{max_turns}")
    do_run_codex_turns(session, workspace, refreshed_issue, recipient, opts, fetcher, turn_number + 1, max_turns)
  
  {:continue, _} ->
    Logger.info("Reached max_turns with issue still active; returning to orchestrator")
    :ok
  
  {:done, _} ->
    :ok
end
Reference: elixir/lib/symphony_elixir/agent_runner.ex:74-99

Continuation prompt:
Continuation guidance:

- The previous Codex turn completed normally, but the Linear issue is still active.
- This is continuation turn #2 of 20.
- Resume from current workspace state instead of restarting.
- The original task instructions are already in this thread.
- Focus on remaining ticket work.
Reference: elixir/lib/symphony_elixir/agent_runner.ex:105-115
Continuation turns reuse the same Codex thread to preserve context and workspace state across multiple turns.

Phase 8: Completion

The worker task exits and reports to the orchestrator.

Normal Exit

def handle_info({:DOWN, ref, :process, _pid, :normal}, state) do
  case find_issue_id_for_ref(state.running, ref) do
    issue_id ->
      {running_entry, state} = pop_running_entry(state, issue_id)
      state = record_session_completion_totals(state, running_entry)
      
      Logger.info("Agent task completed for #{issue_id}; scheduling continuation check")
      
      state =
        state
        |> complete_issue(issue_id)
        |> schedule_issue_retry(issue_id, 1, %{
          identifier: running_entry.identifier,
          delay_type: :continuation  # 1-second retry
        })

      {:noreply, state}
  end
end
Reference: elixir/lib/symphony_elixir/orchestrator.ex:91-131
Even on normal exit, the orchestrator schedules a continuation retry with a 1-second delay to re-check if the issue is still active.

Abnormal Exit

def handle_info({:DOWN, ref, :process, _pid, reason}, state) when reason != :normal do
  # issue_id and running_entry are looked up from state.running by ref (elided)
  Logger.warning("Agent task exited for #{issue_id} reason=#{inspect(reason)}; scheduling retry")

  next_attempt = next_retry_attempt_from_running(running_entry)

  state =
    schedule_issue_retry(state, issue_id, next_attempt, %{
      identifier: running_entry.identifier,
      error: "agent exited: #{inspect(reason)}"
    })

  {:noreply, state}
end
Reference: elixir/lib/symphony_elixir/orchestrator.ex:116-125

Exponential backoff:
attempt 1: delay = min(10000 * 2^0, 300000) = 10s
attempt 2: delay = min(10000 * 2^1, 300000) = 20s
attempt 3: delay = min(10000 * 2^2, 300000) = 40s
...
attempt 6: delay = min(10000 * 2^5, 300000) = 300s (capped)
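
The table follows from a single formula. A sketch with the documented base and cap (the Backoff module name is illustrative):

```elixir
defmodule Backoff do
  @base_ms 10_000
  @cap_ms 300_000

  # delay = min(base * 2^(attempt - 1), cap)
  def retry_delay(attempt) when is_integer(attempt) and attempt >= 1 do
    min(@base_ms * Integer.pow(2, attempt - 1), @cap_ms)
  end
end
```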

Phase 9: Retry Handling

Retry Timer

defp schedule_issue_retry(state, issue_id, attempt, metadata) do
  delay_ms = retry_delay(attempt, metadata)
  timer_ref = Process.send_after(self(), {:retry_issue, issue_id}, delay_ms)
  
  Logger.warning("Retrying #{issue_id} in #{delay_ms}ms (attempt #{attempt})")
  
  %{
    state |
    retry_attempts: Map.put(state.retry_attempts, issue_id, %{
      attempt: attempt,
      timer_ref: timer_ref,
      due_at_ms: System.monotonic_time(:millisecond) + delay_ms,
      identifier: metadata[:identifier],
      error: metadata[:error]
    })
  }
end
Reference: elixir/lib/symphony_elixir/orchestrator.ex:677-708

Retry Execution

def handle_info({:retry_issue, issue_id}, state) do
  # attempt is read from the state.retry_attempts entry for this issue (elided)
  case Tracker.fetch_candidate_issues() do
    {:ok, issues} ->
      case find_issue_by_id(issues, issue_id) do
        %Issue{} = issue ->
          cond do
            terminal_issue_state?(issue.state) ->
              cleanup_issue_workspace(issue.identifier)
              release_issue_claim(state, issue_id)
            
            retry_candidate_issue?(issue) and slots_available?(issue, state) ->
              dispatch_issue(state, issue, attempt)
            
            retry_candidate_issue?(issue) ->
              schedule_issue_retry(state, issue_id, attempt + 1, %{
                error: "no available orchestrator slots"
              })
            
            true ->
              release_issue_claim(state, issue_id)
          end
        
        nil ->
          release_issue_claim(state, issue_id)
      end
  end
end
Reference: elixir/lib/symphony_elixir/orchestrator.ex:157-768

Phase 10: Cleanup

Terminal State Cleanup

When an issue moves to a terminal state during reconciliation:
terminal_issue_state?(issue.state) ->
  Logger.info("Issue moved to terminal state=#{issue.state}; stopping active agent")
  terminate_running_issue(state, issue.id, true)  # cleanup_workspace=true
Reference: elixir/lib/symphony_elixir/orchestrator.ex:303-306

Cleanup steps:
  1. before_remove hook (if workspace exists)
    case Config.workspace_hooks()[:before_remove] do
      nil -> :ok
      command -> run_hook(command, workspace, issue_context, "before_remove")
    end
    
    Failure is logged and ignored.
  2. Workspace deletion
    File.rm_rf(workspace)
    
  3. State cleanup
    %{
      state |
      running: Map.delete(state.running, issue_id),
      claimed: MapSet.delete(state.claimed, issue_id),
      retry_attempts: Map.delete(state.retry_attempts, issue_id)
    }
    
Reference: elixir/lib/symphony_elixir/orchestrator.ex:335-365

Workspace Preservation

Successful runs do not auto-delete workspaces. Workspaces are reused across runs for the same issue until the issue reaches a terminal state.
This allows:
  • Incremental progress across multiple agent sessions
  • Manual inspection of workspace state between runs
  • Operator-driven workspace cleanup via before_remove hook

Lifecycle Summary

  • Polling: 30s ticks, hot-reload config, reconcile before dispatch
  • Reconciliation: stall detection + tracker state refresh for running issues
  • Dispatch: sort by priority, validate eligibility, spawn worker task
  • Workspace: create/reuse directory, run hooks, enforce safety invariants
  • Execution: multi-turn Codex sessions, continuation guidance, event streaming
  • Retry: exponential backoff for failures, 1s delay for continuations
  • Cleanup: terminal state → run before_remove hook → delete workspace

Next Steps

  • Component Reference: implementation details for each component
  • Workspace Isolation: safety mechanisms and lifecycle hooks
