Lifecycle Overview
A complete Symphony workflow follows this lifecycle:
Phase 1: Issue Polling
Poll Tick Trigger
The orchestrator schedules recurring ticks at polling.interval_ms (default: 30 seconds).
# Initial tick scheduled at startup
def init(_opts) do
  state = %State{
    poll_interval_ms: Config.poll_interval_ms(),
    # ...
  }

  run_terminal_workspace_cleanup()
  :ok = schedule_tick(0) # Immediate first poll
  {:ok, state}
end

# Subsequent ticks
def handle_info(:tick, state) do
  state = refresh_runtime_config(state) # Hot-reload WORKFLOW.md
  state = %{state | poll_check_in_progress: true}
  schedule_poll_cycle_start() # Short delay for dashboard render
  {:noreply, state}
end
Reference: elixir/lib/symphony_elixir/orchestrator.ex:50-76
The orchestrator hot-reloads configuration on every tick, allowing changes to WORKFLOW.md to take effect without restart.
Startup Terminal Cleanup
Before the first poll, Symphony cleans up workspaces for issues already in terminal states:
defp run_terminal_workspace_cleanup do
  case Tracker.fetch_issues_by_states(Config.linear_terminal_states()) do
    {:ok, issues} ->
      Enum.each(issues, fn
        %Issue{identifier: identifier} when is_binary(identifier) ->
          cleanup_issue_workspace(identifier)
      end)

    {:error, reason} ->
      Logger.warning("Skipping startup cleanup; failed to fetch terminal issues: #{inspect(reason)}")
  end
end
Reference: elixir/lib/symphony_elixir/orchestrator.ex:776-791
Phase 2: Reconciliation
Before dispatching new work, the orchestrator reconciles all running issues.
Step 2a: Stall Detection
Stall detection prevents zombie sessions that stop emitting events but don’t exit.
defp reconcile_stalled_running_issues(%State{} = state) do
  timeout_ms = Config.codex_stall_timeout_ms() # Default: 300000 (5 min)

  cond do
    timeout_ms <= 0 -> state # Disabled
    map_size(state.running) == 0 -> state
    true ->
      now = DateTime.utc_now()

      Enum.reduce(state.running, state, fn {issue_id, running_entry}, state_acc ->
        elapsed_ms = stall_elapsed_ms(running_entry, now)

        if is_integer(elapsed_ms) and elapsed_ms > timeout_ms do
          Logger.warning("Issue stalled: #{issue_id} elapsed_ms=#{elapsed_ms}; restarting")
          next_attempt = next_retry_attempt_from_running(running_entry)

          state_acc
          |> terminate_running_issue(issue_id, false)
          |> schedule_issue_retry(issue_id, next_attempt, %{
            identifier: running_entry.identifier,
            error: "stalled for #{elapsed_ms}ms without codex activity"
          })
        else
          state_acc
        end
      end)
  end
end
Reference: elixir/lib/symphony_elixir/orchestrator.ex:367-406
Elapsed time calculation:
elapsed_ms = now - (last_codex_timestamp || started_at)
If last_codex_timestamp exists (any event received), use it. Otherwise use started_at (worker launch time).
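This calculation can be sketched in Python (the real implementation is Elixir; the helper name mirrors `stall_elapsed_ms` above, and the standalone arguments are illustrative):

```python
from datetime import datetime, timezone

def stall_elapsed_ms(last_codex_timestamp, started_at, now=None):
    """Milliseconds since the last sign of life from the worker.

    Prefers the newest codex event timestamp; falls back to the worker
    launch time when no event has been received yet. All arguments are
    timezone-aware datetimes (last_codex_timestamp may be None).
    """
    now = now or datetime.now(timezone.utc)
    anchor = last_codex_timestamp or started_at
    return int((now - anchor).total_seconds() * 1000)
```

A worker that launched 10 minutes ago but emitted an event 2 minutes ago is 2 minutes stale, not 10, so a slow-but-alive session is not restarted.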
Step 2b: Tracker State Refresh
Active-run reconciliation ensures running sessions stay aligned with current tracker state.
defp reconcile_running_issues(%State{} = state) do
  running_ids = Map.keys(state.running)

  case Tracker.fetch_issue_states_by_ids(running_ids) do
    {:ok, issues} ->
      Enum.reduce(issues, state, fn issue, state_acc ->
        cond do
          terminal_issue_state?(issue.state) ->
            Logger.info("Issue #{issue.identifier} moved to terminal state=#{issue.state}; stopping")
            terminate_running_issue(state_acc, issue.id, true) # cleanup_workspace=true

          !issue_routable_to_worker?(issue) ->
            Logger.info("Issue #{issue.identifier} no longer routed; stopping")
            terminate_running_issue(state_acc, issue.id, false)

          active_issue_state?(issue.state) ->
            refresh_running_issue_state(state_acc, issue) # Update in-memory snapshot

          true ->
            Logger.info("Issue #{issue.identifier} moved to non-active state; stopping")
            terminate_running_issue(state_acc, issue.id, false)
        end
      end)

    {:error, reason} ->
      Logger.debug("Failed to refresh states: #{inspect(reason)}; keeping workers running")
      state
  end
end
Reference: elixir/lib/symphony_elixir/orchestrator.ex:236-324
Reconciliation outcomes:
- Terminal state: terminate worker and clean up workspace
- Still active: update issue snapshot in memory
- Non-active: terminate worker (no workspace cleanup)
Phase 3: Validation
Before fetching candidates, the orchestrator validates runtime configuration:
defp maybe_dispatch(%State{} = state) do
  state = reconcile_running_issues(state)

  with :ok <- Config.validate!(),
       {:ok, issues} <- Tracker.fetch_candidate_issues(),
       true <- available_slots(state) > 0 do
    choose_issues(issues, state)
  else
    {:error, :missing_linear_api_token} ->
      Logger.error("Linear API token missing in WORKFLOW.md")
      state

    {:error, :missing_codex_command} ->
      Logger.error("Codex command missing in WORKFLOW.md")
      state

    # ... other validation errors
  end
end
Reference: elixir/lib/symphony_elixir/orchestrator.ex:173-234
Validation checks:
- tracker.kind present and supported
- tracker.api_key present after $ resolution
- tracker.project_slug present (for Linear)
- codex.command present and non-empty
If validation fails, dispatch is skipped for this tick, but reconciliation still runs and the next tick will retry.
Phase 4: Candidate Fetch
Tracker Query
The orchestrator fetches candidate issues in active states:
query CandidateIssues($projectSlug: String!, $states: [String!]!) {
  issues(
    filter: {
      project: { slugId: { eq: $projectSlug } }
      state: { name: { in: $states } }
    }
    first: 50
  ) {
    nodes {
      id
      identifier
      title
      description
      priority
      state { name }
      # ...
    }
  }
}
Variables:
- projectSlug: tracker.project_slug
- states: tracker.active_states
Issue Normalization
Raw tracker responses are normalized:
- labels → lowercase strings
- blocked_by → derived from inverse blocks relations
- priority → integer only (non-integers become nil)
- created_at, updated_at → parsed ISO-8601 timestamps
- state → compared after trim + lowercase
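The normalization rules above can be sketched in Python (field names are taken from the list; `normalize_issue` is an illustrative helper, and the `blocked_by` derivation from inverse relations is elided):

```python
from datetime import datetime

def normalize_issue(raw: dict) -> dict:
    """Apply the tracker-response normalization rules to a raw payload."""
    priority = raw.get("priority")
    created_at = raw.get("created_at")
    return {
        # labels become lowercase strings
        "labels": [str(label).lower() for label in raw.get("labels", [])],
        # priority must already be an integer; anything else becomes None
        "priority": priority if isinstance(priority, int) else None,
        # timestamps are parsed from ISO-8601 strings
        "created_at": datetime.fromisoformat(created_at) if created_at else None,
        # state comparisons happen on the trimmed, lowercased name
        "state": (raw.get("state") or "").strip().lower(),
    }
```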
Phase 5: Dispatch
Issue Sorting
Candidates are sorted by dispatch priority:
Enum.sort_by(issues, fn issue ->
  {
    priority_rank(issue.priority),    # 1..4 → 1..4, nil → 5
    issue_created_at_sort_key(issue), # Unix microseconds (oldest first)
    issue.identifier || issue.id      # Lexicographic tie-breaker
  }
end)
Reference: elixir/lib/symphony_elixir/orchestrator.ex:453-461
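The same three-part sort key can be sketched in Python (the rank bounds follow the comments above; the dict fields are illustrative stand-ins for the issue struct):

```python
def priority_rank(priority):
    # Priorities 1..4 keep their rank; a missing priority sorts last
    return priority if isinstance(priority, int) and 1 <= priority <= 4 else 5

def dispatch_sort_key(issue):
    return (
        priority_rank(issue.get("priority")),
        issue.get("created_at_us", float("inf")),    # oldest first
        issue.get("identifier") or issue.get("id"),  # lexicographic tie-breaker
    )

issues = [
    {"id": "b", "identifier": "ABC-2", "priority": None, "created_at_us": 1},
    {"id": "a", "identifier": "ABC-1", "priority": 2, "created_at_us": 5},
]
issues.sort(key=dispatch_sort_key)
# ABC-1 dispatches first despite being newer: rank 2 beats the nil rank 5
```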
Eligibility Checks
For each issue in sorted order:
defp should_dispatch_issue?(issue, state, active_states, terminal_states) do
  candidate_issue?(issue, active_states, terminal_states) and
    !todo_issue_blocked_by_non_terminal?(issue, terminal_states) and
    !MapSet.member?(state.claimed, issue.id) and
    !Map.has_key?(state.running, issue.id) and
    available_slots(state) > 0 and
    state_slots_available?(issue, state.running)
end
Candidate checks:
- Has required fields (id, identifier, title, state)
- State is in active_states and not in terminal_states
- Routable to a worker (assignee check)

Blocker rule:
If the issue state is “Todo”, reject it if any blocker is non-terminal.

Concurrency checks:
- Global: max_concurrent_agents - running_count > 0
- Per-state: max_concurrent_agents_by_state[state] - running_count_for_state > 0
Reference: elixir/lib/symphony_elixir/orchestrator.ex:473-507
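The two concurrency gates can be sketched as follows (Python; the dict shapes and the "no cap configured" default are assumptions, not taken from the Elixir source):

```python
def available_slots(max_concurrent_agents, running):
    # Global cap: total running workers across all states
    return max_concurrent_agents - len(running)

def state_slots_available(issue_state, running, limits_by_state):
    """Per-state cap: counts only running workers whose issue shares the state."""
    limit = limits_by_state.get(issue_state)
    if limit is None:
        return True  # assumed: no per-state cap configured for this state
    running_for_state = sum(
        1 for entry in running.values() if entry["state"] == issue_state
    )
    return limit - running_for_state > 0
```

Both gates must pass before dispatch: an issue can be blocked by the per-state cap even while global slots remain free.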
Issue Revalidation
Before spawning a worker, the orchestrator refreshes the issue from the tracker to avoid acting on stale data:
defp dispatch_issue(state, issue, attempt) do
  terminal_states = Config.linear_terminal_states()

  case revalidate_issue_for_dispatch(issue, &Tracker.fetch_issue_states_by_ids/1, terminal_states) do
    {:ok, refreshed_issue} ->
      do_dispatch_issue(state, refreshed_issue, attempt)

    {:skip, :missing} ->
      Logger.info("Skipping; issue no longer visible: #{issue.identifier}")
      state

    {:skip, refreshed_issue} ->
      Logger.info("Skipping stale dispatch: #{refreshed_issue.identifier} state=#{refreshed_issue.state}")
      state

    {:error, reason} ->
      Logger.warning("Skipping; refresh failed: #{inspect(reason)}")
      state
  end
end
Reference: elixir/lib/symphony_elixir/orchestrator.ex:578-596
Worker Spawn
defp do_dispatch_issue(state, issue, attempt) do
  recipient = self()

  case Task.Supervisor.start_child(SymphonyElixir.TaskSupervisor, fn ->
         AgentRunner.run(issue, recipient, attempt: attempt)
       end) do
    {:ok, pid} ->
      ref = Process.monitor(pid)
      Logger.info("Dispatching #{issue.identifier} to agent pid=#{inspect(pid)} attempt=#{inspect(attempt)}")

      running_entry = %{
        pid: pid,
        ref: ref,
        identifier: issue.identifier,
        issue: issue,
        started_at: DateTime.utc_now(),
        retry_attempt: normalize_retry_attempt(attempt),
        # ... session tracking fields
      }

      %{
        state
        | running: Map.put(state.running, issue.id, running_entry),
          claimed: MapSet.put(state.claimed, issue.id),
          retry_attempts: Map.delete(state.retry_attempts, issue.id)
      }
  end
end
Reference: elixir/lib/symphony_elixir/orchestrator.ex:598-647
Phase 6: Workspace Creation
The Agent Runner creates an isolated workspace:
def run(issue, recipient, opts) do
  case Workspace.create_for_issue(issue) do
    {:ok, workspace} ->
      try do
        with :ok <- Workspace.run_before_run_hook(workspace, issue),
             :ok <- run_codex_turns(workspace, issue, recipient, opts) do
          :ok
        end
      after
        Workspace.run_after_run_hook(workspace, issue)
      end
  end
end
Reference: elixir/lib/symphony_elixir/agent_runner.ex:11-33
Workspace Path Construction
1. Sanitize the identifier:
   "ABC-123" → "ABC-123" (already safe)
   "MT/649" → "MT_649" (slash replaced)
2. Join with the root: workspace_root + "/" + sanitized_identifier
   Example: "/tmp/symphony_workspaces/ABC-123"
3. Validate path safety:
   - Must be inside workspace_root (prefix check)
   - Must not equal workspace_root
   - Must not contain symlink escapes
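The three steps above can be sketched in Python (a minimal illustration, not the workspace.ex implementation; the safe character set and helper name are assumptions):

```python
import os
import re

def workspace_path(root: str, identifier: str) -> str:
    """Build and validate an isolated workspace path for an issue."""
    # Step 1: replace anything outside an assumed-safe set, so "MT/649" -> "MT_649"
    sanitized = re.sub(r"[^A-Za-z0-9._-]", "_", identifier)
    # Step 2: join with the root; realpath also resolves symlink escapes
    path = os.path.realpath(os.path.join(root, sanitized))
    root = os.path.realpath(root)
    # Step 3: must stay strictly inside the workspace root
    if path == root or not path.startswith(root + os.sep):
        raise ValueError(f"unsafe workspace path: {path}")
    return path
```

Resolving both sides with realpath before the prefix check is what closes the symlink-escape hole: a comparison on the unresolved strings would pass for a symlinked subdirectory pointing outside the root.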
Directory Creation
defp ensure_workspace(workspace) do
  cond do
    File.dir?(workspace) ->
      clean_tmp_artifacts(workspace) # Remove .elixir_ls, tmp/
      {:ok, false} # Reused

    File.exists?(workspace) ->
      File.rm_rf!(workspace)
      create_workspace(workspace)

    true ->
      create_workspace(workspace)
  end
end

defp create_workspace(workspace) do
  File.rm_rf!(workspace)
  File.mkdir_p!(workspace)
  {:ok, true} # Newly created
end
Reference: elixir/lib/symphony_elixir/workspace.ex:32-51
after_create Hook
If the workspace was newly created (not reused), run hooks.after_create:
defp maybe_run_after_create_hook(workspace, issue_context, created?) do
  case created? do
    true ->
      case Config.workspace_hooks()[:after_create] do
        nil -> :ok
        command -> run_hook(command, workspace, issue_context, "after_create")
      end

    false ->
      :ok
  end
end
Reference: elixir/lib/symphony_elixir/workspace.ex:125-139
after_create hook failure is fatal to workspace creation. The workspace will not be used.
Phase 7: Agent Execution
Codex Session Startup
defp run_codex_turns(workspace, issue, recipient, opts) do
  # fetcher and max_turns are derived from opts (elided here)
  with {:ok, session} <- AppServer.start_session(workspace) do
    try do
      do_run_codex_turns(session, workspace, issue, recipient, opts, fetcher, 1, max_turns)
    after
      AppServer.stop_session(session)
    end
  end
end
Reference: elixir/lib/symphony_elixir/agent_runner.ex:49-60
Protocol handshake:
{"id": 1, "method": "initialize", "params": {"clientInfo": {"name": "symphony", "version": "1.0"}, "capabilities": {}}}
{"method": "initialized", "params": {}}
{"id": 2, "method": "thread/start", "params": {"approvalPolicy": "...", "sandbox": "workspace-write", "cwd": "/abs/workspace"}}
Reference: SPEC.md:928-936
First Turn
Prompt rendering:
prompt = PromptBuilder.build_prompt(issue, opts)
# Uses the WORKFLOW.md template + issue data
Turn start:
{"id": 3, "method": "turn/start", "params": {
  "threadId": "<thread-id>",
  "input": [{"type": "text", "text": "<rendered-prompt>"}],
  "cwd": "/abs/workspace",
  "title": "ABC-123: Example Issue",
  "approvalPolicy": "...",
  "sandboxPolicy": {"type": "..."}
}}
Reference: SPEC.md:954-963
Event streaming:
Codex emits line-delimited JSON on stdout:
- turn/completed → success
- turn/failed → failure
- turn/cancelled → failure
- Tool calls → handled by the dynamic tool executor
- Approval requests → auto-approved or failed (depending on policy)
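A consumer of this stream can be sketched as a simple fold over stdout lines (Python; only the three terminal method names above are from the source, the rest is illustrative):

```python
import json

def consume_events(lines):
    """Fold a line-delimited JSON event stream into a turn outcome."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        method = event.get("method")
        if method == "turn/completed":
            return "success"
        if method in ("turn/failed", "turn/cancelled"):
            return "failure"
        # tool calls and approval requests would be dispatched here
    return "stream_ended_without_outcome"
```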
Continuation Turns
If the turn completes successfully and the issue is still active:
case continue_with_issue?(issue, fetcher) do
  {:continue, refreshed_issue} when turn_number < max_turns ->
    Logger.info("Continuing after normal turn completion turn=#{turn_number}/#{max_turns}")
    do_run_codex_turns(session, workspace, refreshed_issue, recipient, opts, fetcher, turn_number + 1, max_turns)

  {:continue, _} ->
    Logger.info("Reached max_turns with issue still active; returning to orchestrator")
    :ok

  {:done, _} ->
    :ok
end
Reference: elixir/lib/symphony_elixir/agent_runner.ex:74-99
Continuation prompt:
Continuation guidance:
- The previous Codex turn completed normally, but the Linear issue is still active.
- This is continuation turn #2 of 20.
- Resume from current workspace state instead of restarting.
- The original task instructions are already in this thread.
- Focus on remaining ticket work.
Reference: elixir/lib/symphony_elixir/agent_runner.ex:105-115
Continuation turns reuse the same Codex thread to preserve context and workspace state across multiple turns.
Phase 8: Completion
The worker task exits and reports to the orchestrator.
Normal Exit
def handle_info({:DOWN, ref, :process, _pid, :normal}, state) do
  case find_issue_id_for_ref(state.running, ref) do
    issue_id ->
      {running_entry, state} = pop_running_entry(state, issue_id)
      state = record_session_completion_totals(state, running_entry)
      Logger.info("Agent task completed for #{issue_id}; scheduling continuation check")

      state
      |> complete_issue(issue_id)
      |> schedule_issue_retry(issue_id, 1, %{
        identifier: running_entry.identifier,
        delay_type: :continuation # 1-second retry
      })
  end
end
Reference: elixir/lib/symphony_elixir/orchestrator.ex:91-131
Even on normal exit, the orchestrator schedules a continuation retry with a 1-second delay to re-check whether the issue is still active.
Abnormal Exit
def handle_info({:DOWN, ref, :process, _pid, reason}, state) when reason != :normal do
  # issue_id and running_entry are looked up from ref (elided here)
  Logger.warning("Agent task exited for #{issue_id} reason=#{inspect(reason)}; scheduling retry")
  next_attempt = next_retry_attempt_from_running(running_entry)

  schedule_issue_retry(state, issue_id, next_attempt, %{
    identifier: running_entry.identifier,
    error: "agent exited: #{inspect(reason)}"
  })
end
Reference: elixir/lib/symphony_elixir/orchestrator.ex:116-125
Exponential backoff:
attempt 1: delay = min(10000 * 2^0, 300000) = 10s
attempt 2: delay = min(10000 * 2^1, 300000) = 20s
attempt 3: delay = min(10000 * 2^2, 300000) = 40s
...
attempt 6: delay = min(10000 * 2^5, 300000) = 300s (capped)
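The schedule above follows a standard capped exponential (10-second base, 5-minute cap); a minimal sketch, with the hypothetical helper name retry_delay_ms:

```python
def retry_delay_ms(attempt, base_ms=10_000, cap_ms=300_000):
    """Capped exponential backoff: 10s, 20s, 40s, ..., capped at 5 minutes."""
    return min(base_ms * 2 ** (attempt - 1), cap_ms)

# First six attempts: [10000, 20000, 40000, 80000, 160000, 300000]
delays = [retry_delay_ms(a) for a in range(1, 7)]
```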
Phase 9: Retry Handling
Retry Timer
defp schedule_issue_retry(state, issue_id, attempt, metadata) do
  delay_ms = retry_delay(attempt, metadata)
  timer_ref = Process.send_after(self(), {:retry_issue, issue_id}, delay_ms)
  Logger.warning("Retrying #{issue_id} in #{delay_ms}ms (attempt #{attempt})")

  %{
    state
    | retry_attempts: Map.put(state.retry_attempts, issue_id, %{
        attempt: attempt,
        timer_ref: timer_ref,
        due_at_ms: System.monotonic_time(:millisecond) + delay_ms,
        identifier: metadata[:identifier],
        error: metadata[:error]
      })
  }
end
Reference: elixir/lib/symphony_elixir/orchestrator.ex:677-708
Retry Execution
def handle_info({:retry_issue, issue_id}, state) do
  # attempt is read from state.retry_attempts (elided here)
  case Tracker.fetch_candidate_issues() do
    {:ok, issues} ->
      case find_issue_by_id(issues, issue_id) do
        %Issue{} = issue ->
          cond do
            terminal_issue_state?(issue.state) ->
              cleanup_issue_workspace(issue.identifier)
              release_issue_claim(state, issue_id)

            retry_candidate_issue?(issue) and slots_available?(issue, state) ->
              dispatch_issue(state, issue, attempt)

            retry_candidate_issue?(issue) ->
              schedule_issue_retry(state, issue_id, attempt + 1, %{
                error: "no available orchestrator slots"
              })

            true ->
              release_issue_claim(state, issue_id)
          end

        nil ->
          release_issue_claim(state, issue_id)
      end
  end
end
Reference: elixir/lib/symphony_elixir/orchestrator.ex:157-768
Phase 10: Cleanup
Terminal State Cleanup
When an issue moves to a terminal state during reconciliation:
terminal_issue_state?(issue.state) ->
  Logger.info("Issue moved to terminal state=#{issue.state}; stopping active agent")
  terminate_running_issue(state, issue.id, true) # cleanup_workspace=true
Reference: elixir/lib/symphony_elixir/orchestrator.ex:303-306
Cleanup steps:
1. Run the before_remove hook (if the workspace exists); failure is logged and ignored:
   case Config.workspace_hooks()[:before_remove] do
     nil -> :ok
     command -> run_hook(command, workspace, issue_context, "before_remove")
   end
2. Delete the workspace.
3. Remove the issue from orchestrator state:
   %{
     state
     | running: Map.delete(state.running, issue_id),
       claimed: MapSet.delete(state.claimed, issue_id),
       retry_attempts: Map.delete(state.retry_attempts, issue_id)
   }
Reference: elixir/lib/symphony_elixir/orchestrator.ex:335-365
Workspace Preservation
Successful runs do not auto-delete workspaces. Workspaces are reused across runs for the same issue until the issue reaches a terminal state.
This allows:
- Incremental progress across multiple agent sessions
- Manual inspection of workspace state between runs
- Operator-driven workspace cleanup via the before_remove hook
Lifecycle Summary
- Polling: 30s ticks, hot-reload config, reconcile before dispatch
- Reconciliation: stall detection + tracker state refresh for running issues
- Dispatch: sort by priority, validate eligibility, spawn worker task
- Workspace: create/reuse directory, run hooks, enforce safety invariants
- Execution: multi-turn Codex sessions, continuation guidance, event streaming
- Retry: exponential backoff for failures, 1s delay for continuations
- Cleanup: terminal state → run before_remove hook → delete workspace
Next Steps
- Component Reference: implementation details for each component
- Workspace Isolation: safety mechanisms and lifecycle hooks