Agent Runner - Symphony

Overview

The Agent Runner wraps workspace creation, prompt rendering, and app-server client integration. It creates/reuses workspaces, builds prompts from workflow templates, launches the coding agent subprocess, and forwards app-server events to the orchestrator.

On any error during the agent run, the worker attempt fails and the orchestrator will retry according to the configured backoff policy.

Agent Runner Contract

The Agent Runner executes the following sequence:

1. Workspace Creation

Create or reuse workspace for the issue

2. Prompt Building

Build prompt from workflow template

3. Session Startup

Start app-server session

4. Event Forwarding

Forward app-server events to orchestrator

Workspaces are intentionally preserved after successful runs to enable continuation across multiple agent sessions.

Workspace Creation and Lifecycle

Workspace Path Determination

Workspace Root

workspace.root (normalized path; the config layer expands path-like values and preserves bare relative names)

Per-Issue Workspace Path

<workspace.root>/<sanitized_issue_identifier>

Creation and Reuse Algorithm

Workspace Creation Steps

Sanitize issue.identifier to workspace_key
Compute workspace path under workspace root
Ensure the workspace path exists as a directory
Mark created_now=true only if the directory was created during this call; otherwise created_now=false
If created_now=true, run after_create hook if configured
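The steps above could be sketched roughly as follows. This is a minimal illustration, not the reference implementation: the function name `ensure_workspace` and the `after_create` callable are hypothetical, and what happens to a freshly created directory when the hook fails is left to the caller.

```python
import re
from pathlib import Path
from typing import Callable, Optional, Tuple


def ensure_workspace(
    workspace_root: Path,
    issue_identifier: str,
    after_create: Optional[Callable[[Path], None]] = None,
) -> Tuple[Path, bool]:
    """Return (workspace_path, created_now) for one issue (hypothetical helper)."""
    # Step 1: sanitize the identifier into a workspace key.
    workspace_key = re.sub(r"[^A-Za-z0-9._-]", "_", issue_identifier)
    # Step 2: compute the per-issue path under the workspace root.
    workspace_path = workspace_root / workspace_key
    # Steps 3-4: exist_ok=False distinguishes "created during this call"
    # from "already present".
    try:
        workspace_path.mkdir(parents=True, exist_ok=False)
        created_now = True
    except FileExistsError:
        if not workspace_path.is_dir():
            raise NotADirectoryError(f"{workspace_path} exists but is not a directory")
        created_now = False
    # Step 5: the hook runs only for newly created workspaces. An exception
    # raised by the hook propagates and aborts workspace creation.
    if created_now and after_create is not None:
        after_create(workspace_path)
    return workspace_path, created_now
```

Calling the function twice with the same identifier should return the same path, with `created_now` true only on the first call.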

This section does not assume any specific repository/VCS workflow. Workspace preparation beyond directory creation (e.g., dependency bootstrap, checkout/sync, code generation) is implementation-defined and typically handled via hooks.

Workspace Hooks

Supported hooks and their execution semantics:

after_create

When: Only when a workspace directory is newly created
Failure: Aborts workspace creation
Timeout: Uses hooks.timeout_ms (default: 60000)

before_run

When: Before each agent attempt, after workspace preparation and before launching the coding agent
Failure: Aborts the current attempt
Timeout: Uses hooks.timeout_ms (default: 60000)

after_run

When: After each agent attempt (success, failure, timeout, or cancellation) once the workspace exists
Failure: Logged but ignored
Timeout: Logged but ignored

before_remove

When: Before workspace deletion, if the directory exists
Failure: Logged but ignored; cleanup still proceeds
Timeout: Logged but ignored
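The failure and timeout semantics of the four hooks could be captured in one helper, sketched below. Several things here are assumptions rather than part of the contract: that a hook is a shell script string, that it runs via `sh -c` with the workspace as its working directory, and the names `run_hook`, `HookError`, and `FATAL_HOOKS`.

```python
import logging
import subprocess

# Hooks whose failure or timeout aborts the surrounding operation.
FATAL_HOOKS = {"after_create", "before_run"}


class HookError(RuntimeError):
    """Raised when a fatal hook fails or times out (hypothetical type)."""


def run_hook(name: str, script: str, workspace_path: str, timeout_ms: int = 60000) -> bool:
    """Run one hook; return True on success, False on an ignored failure."""
    try:
        subprocess.run(
            ["sh", "-c", script],  # assumption: hooks are shell scripts
            cwd=workspace_path,
            check=True,  # non-zero exit raises CalledProcessError
            timeout=timeout_ms / 1000,  # hooks.timeout_ms, converted to seconds
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        return True
    except (subprocess.CalledProcessError, subprocess.TimeoutExpired) as exc:
        if name in FATAL_HOOKS:
            raise HookError(f"{name} hook failed: {exc}") from exc
        # after_run and before_remove: log the problem and carry on.
        logging.getLogger(__name__).warning("%s hook failed (ignored): %s", name, exc)
        return False
```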

Workspace Safety Invariants

Invariant 1: Agent CWD

Run the coding agent only in the per-issue workspace path. Before launching the coding-agent subprocess, validate: cwd == workspace_path

Invariant 2: Path Containment

Workspace path must stay inside workspace root:

Normalize both paths to absolute
Require workspace_path to have workspace_root as a prefix directory
Reject any path outside the workspace root
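One way to implement the containment check is to compare resolved path components rather than raw strings, since a plain string-prefix test would accept a sibling such as `/tmp/ws-evil` for the root `/tmp/ws`. The sketch below uses a hypothetical helper name, and it also rejects a path equal to the root itself, which is an assumption and not something the invariant states.

```python
from pathlib import Path


def assert_inside_root(workspace_root: str, workspace_path: str) -> Path:
    """Return the resolved workspace path, or raise if it escapes the root."""
    # resolve() makes both paths absolute and collapses ".." and symlinks.
    root = Path(workspace_root).resolve()
    path = Path(workspace_path).resolve()
    # The root must be one of the path's ancestor directories.
    if root not in path.parents:
        raise ValueError(f"{path} is not inside workspace root {root}")
    return path
```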

Invariant 3: Sanitized Keys

Workspace key is sanitized:

Only [A-Za-z0-9._-] allowed in workspace directory names
Replace all other characters with _
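The sanitization rule maps directly onto a single regular-expression substitution. The function name below is hypothetical.

```python
import re

# Anything outside the allowed set [A-Za-z0-9._-] is replaced.
_DISALLOWED = re.compile(r"[^A-Za-z0-9._-]")


def sanitize_workspace_key(identifier: str) -> str:
    """Replace every disallowed character with an underscore."""
    return _DISALLOWED.sub("_", identifier)
```

For example, `team/ABC 123` becomes `team_ABC_123`, while `ABC-123` passes through unchanged. Note that an identifier consisting only of dots, such as `..`, is also left unchanged by this rule, which is one reason the path containment check in Invariant 2 is still needed after sanitization.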

Prompt Construction

Inputs

workflow.prompt_template (string, required): Markdown body from WORKFLOW.md
issue (object, required): Normalized issue object with all fields
attempt (integer | null): Optional retry/continuation metadata

Rendering Rules

Render with strict variable checking (unknown variables fail rendering)
Render with strict filter checking (unknown filters fail rendering)
Convert issue object keys to strings for template compatibility
Preserve nested arrays/maps (labels, blockers) so templates can iterate
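The last two rules could be implemented with a small recursive conversion, sketched here with a hypothetical function name. Strict variable and filter checking depends on the template engine in use and is not shown.

```python
from typing import Any


def stringify_keys(value: Any) -> Any:
    """Convert map keys to strings at every depth, keeping nested structure."""
    if isinstance(value, dict):
        return {str(key): stringify_keys(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        # Sequences stay iterable so templates can loop over labels, blockers, etc.
        return [stringify_keys(item) for item in value]
    return value
```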

Retry/Continuation Semantics

Attempt Context

attempt should be passed to the template because the workflow prompt may provide different instructions for:

First run: attempt null or absent
Continuation run: After a successful prior session
Retry after error: After timeout/stall/failure

Failure Behavior

If prompt rendering fails:

Fail the run attempt immediately
Let the orchestrator treat it like any other worker failure and decide retry behavior

Codex App-Server Integration

Launch Contract

Command (string, required): codex.command (default: codex app-server)
Invocation (string, required): bash -lc <codex.command>
Working Directory (string, required): Workspace path
Stdout/Stderr (string, required): Separate streams
Framing (string, required): Line-delimited protocol messages on stdout (JSON-RPC-like JSON per line)
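A launch that follows this contract might look like the sketch below. The function name is hypothetical, and opening a stdin pipe for client-to-server messages is an assumption, since the contract only describes the stdout framing.

```python
import subprocess


def launch_app_server(command: str, workspace_path: str) -> subprocess.Popen:
    """Start the app-server via `bash -lc <command>` in the workspace."""
    return subprocess.Popen(
        ["bash", "-lc", command],
        cwd=workspace_path,  # working directory is the per-issue workspace
        stdin=subprocess.PIPE,  # assumption: requests are written to stdin
        stdout=subprocess.PIPE,  # protocol stream, one JSON message per line
        stderr=subprocess.PIPE,  # kept separate; diagnostics only
        text=True,
        bufsize=1,  # line-buffered text pipes
    )
```

Because stderr is a separate pipe, a real client would need to drain it continuously (for example from a background thread) so the child cannot block on a full pipe buffer.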

Session Startup Handshake

The client must send these protocol messages in order:

1. initialize request

{
  "id": 1,
  "method": "initialize",
  "params": {
    "clientInfo": {"name": "symphony", "version": "1.0"},
    "capabilities": {}
  }
}

Includes clientInfo and capabilities objects. Wait for the response, up to codex.read_timeout_ms, before sending the next message.

2. initialized notification

{
  "method": "initialized",
  "params": {}
}

3. thread/start request

{
  "id": 2,
  "method": "thread/start",
  "params": {
    "approvalPolicy": "<implementation-defined>",
    "sandbox": "<implementation-defined>",
    "cwd": "/abs/workspace"
  }
}

Includes approvalPolicy, sandbox, and cwd. Read thread_id from response result.thread.id.

4. turn/start request

{
  "id": 3,
  "method": "turn/start",
  "params": {
    "threadId": "<thread-id>",
    "input": [{"type": "text", "text": "<rendered prompt>"}],
    "cwd": "/abs/workspace",
    "title": "ABC-123: Example",
    "approvalPolicy": "<implementation-defined>",
    "sandboxPolicy": {"type": "<implementation-defined>"}
  }
}

Includes threadId, input, cwd, title, approvalPolicy, and sandboxPolicy. Read turn_id from response result.turn.id.
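The four messages above could be serialized with helpers like the following. This is a sketch: the function names are hypothetical, and the policy values are passed through as opaque, implementation-defined inputs. `json.dumps` without an indent emits a single line and escapes newlines inside strings, so a multi-line rendered prompt still fits the one-message-per-line framing.

```python
import json
from typing import Any, Dict, List, Optional


def encode_message(method: str, params: Dict[str, Any], request_id: Optional[int] = None) -> str:
    """Serialize one protocol message as a single newline-terminated line."""
    message: Dict[str, Any] = {}
    if request_id is not None:  # notifications such as "initialized" carry no id
        message["id"] = request_id
    message["method"] = method
    message["params"] = params
    return json.dumps(message) + "\n"


def startup_messages(workspace: str, approval_policy: str, sandbox: str) -> List[str]:
    """Messages 1-3 in order; the caller awaits each response before moving on."""
    client_info = {"name": "symphony", "version": "1.0"}
    return [
        encode_message("initialize", {"clientInfo": client_info, "capabilities": {}}, request_id=1),
        encode_message("initialized", {}),
        encode_message(
            "thread/start",
            {"approvalPolicy": approval_policy, "sandbox": sandbox, "cwd": workspace},
            request_id=2,
        ),
    ]


def turn_start_message(
    request_id: int,
    thread_id: str,
    prompt: str,
    workspace: str,
    title: str,
    approval_policy: str,
    sandbox_policy: Dict[str, Any],
) -> str:
    """Message 4; thread_id comes from the thread/start response."""
    params = {
        "threadId": thread_id,
        "input": [{"type": "text", "text": prompt}],
        "cwd": workspace,
        "title": title,
        "approvalPolicy": approval_policy,
        "sandboxPolicy": sandbox_policy,
    }
    return encode_message("turn/start", params, request_id=request_id)
```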

Session Identifiers

Thread ID

Read from thread/start result: result.thread.id

Turn ID

Read from each turn/start result: result.turn.id

Session ID

Emit as: session_id = "<thread_id>-<turn_id>"

Continuation

Reuse the same thread_id for all continuation turns inside one worker run
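The identifier rules above amount to two lookups and one string join. A minimal sketch, with hypothetical function names and the responses assumed to be already-parsed JSON objects:

```python
from typing import Any, Dict


def thread_id_from(response: Dict[str, Any]) -> str:
    """Extract the thread ID from a thread/start response."""
    return response["result"]["thread"]["id"]


def turn_id_from(response: Dict[str, Any]) -> str:
    """Extract the turn ID from a turn/start response."""
    return response["result"]["turn"]["id"]


def session_id(thread_id: str, turn_id: str) -> str:
    """Compose the emitted session ID as "<thread_id>-<turn_id>"."""
    return f"{thread_id}-{turn_id}"
```

On a continuation turn, only the turn ID changes, so each turn yields a new session ID that shares the same thread prefix.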

Streaming Turn Processing

The client reads line-delimited messages until the turn terminates.

Completion Conditions

turn/completed → success
turn/failed → failure
turn/cancelled → failure
turn timeout (turn_timeout_ms) → failure
subprocess exit → failure
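The completion conditions can be expressed as one classification function that the read loop calls after each message or poll. This is a sketch with hypothetical names; the order in which the conditions are checked when several hold at once is an assumption.

```python
from typing import Optional, Tuple

# Terminal protocol methods and the outcome each one maps to.
TERMINAL_METHODS = {
    "turn/completed": "success",
    "turn/failed": "failure",
    "turn/cancelled": "failure",
}


def turn_outcome(
    method: Optional[str],
    elapsed_ms: int,
    turn_timeout_ms: int,
    process_exited: bool,
) -> Optional[Tuple[str, str]]:
    """Return (outcome, condition) once the turn has ended, else None."""
    if method in TERMINAL_METHODS:
        return TERMINAL_METHODS[method], method
    if process_exited:
        return "failure", "subprocess exit"
    if elapsed_ms >= turn_timeout_ms:
        return "failure", "turn timeout"
    return None  # turn is still running; keep reading
```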

Continuation Processing

If the worker decides to continue after a successful turn, it should issue another turn/start on the same live threadId
The app-server subprocess should remain alive across those continuation turns and be stopped only when the worker run is ending

Line Handling Requirements

Read protocol messages from stdout only
Buffer partial stdout lines until newline arrives
Attempt JSON parse on complete stdout lines
Stderr is not part of the protocol stream:
- Ignore it or log it as diagnostics
- Do not attempt protocol JSON parsing on stderr
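A buffer that satisfies the stdout requirements might look like this. The class name is hypothetical, and two behaviors are assumptions: blank lines are skipped, and lines that fail to parse are reported with the tag `malformed`.

```python
import json
from typing import Any, List, Tuple


class StdoutLineBuffer:
    """Accumulate stdout chunks and yield parsed messages per complete line."""

    def __init__(self) -> None:
        self._pending = ""

    def feed(self, chunk: str) -> List[Tuple[str, Any]]:
        self._pending += chunk
        # Everything before the last newline is complete; the remainder
        # (possibly empty) stays buffered until more data arrives.
        *complete, self._pending = self._pending.split("\n")
        results: List[Tuple[str, Any]] = []
        for line in complete:
            if not line.strip():
                continue  # assumption: blank lines carry no message
            try:
                results.append(("message", json.loads(line)))
            except json.JSONDecodeError:
                results.append(("malformed", line))
        return results
```

Feeding `{"method": "turn/` followed by `completed"}\n` produces no result for the first chunk and one parsed message for the second.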

Emitted Runtime Events

The app-server client emits structured events to the orchestrator callback. Each event includes:

event (string, required): Event type enum/string
timestamp (required): UTC timestamp
codex_app_server_pid (string): Process ID if available
usage (object): Optional token counts

Important event types:

session_started
startup_failed
turn_completed
turn_failed
turn_cancelled
turn_ended_with_error
turn_input_required
approval_auto_approved
unsupported_tool_call
notification
other_message
malformed
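An event payload carrying the fields listed above could be assembled as in this sketch. The function name is hypothetical, and the ISO 8601 timestamp format is an assumption, since the contract only requires a UTC timestamp.

```python
from datetime import datetime, timezone
from typing import Any, Dict, Optional


def make_event(
    event: str,
    pid: Optional[int] = None,
    usage: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Build one structured event for the orchestrator callback."""
    payload: Dict[str, Any] = {
        "event": event,
        "timestamp": datetime.now(timezone.utc).isoformat(),  # assumed format
    }
    if pid is not None:
        payload["codex_app_server_pid"] = str(pid)  # field is typed as string
    if usage is not None:
        payload["usage"] = usage
    return payload
```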

Approval, Tool Calls, and User Input Policy

Approval, sandbox, and user-input behavior is implementation-defined.

Each implementation should document its chosen approval, sandbox, and operator-confirmation posture.

Example High-Trust Behavior

Auto-approve command execution approvals for the session
Auto-approve file-change approvals for the session
Treat user-input-required turns as hard failure

Unsupported Dynamic Tool Calls

Supported dynamic tool calls that are explicitly implemented should be handled according to their extension contract
If the agent requests a dynamic tool call (item/tool/call) that is not supported, return a tool failure response and continue the session
This prevents the session from stalling on unsupported tool execution paths

Timeouts and Error Mapping

codex.read_timeout_ms (integer, default: 5000): Request/response timeout during startup and sync requests
codex.turn_timeout_ms (integer, default: 3600000): Total turn stream timeout (1 hour)
codex.stall_timeout_ms (integer, default: 300000): Enforced by orchestrator based on event inactivity (5 minutes). If <= 0, stall detection is disabled.
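The stall rule can be illustrated with a small predicate on the orchestrator side. The function name and the millisecond clock arguments are hypothetical, and treating inactivity strictly greater than the threshold as a stall is an assumption.

```python
def is_stalled(last_event_ms: int, now_ms: int, stall_timeout_ms: int = 300000) -> bool:
    """Report whether event inactivity has exceeded codex.stall_timeout_ms."""
    if stall_timeout_ms <= 0:
        return False  # stall detection is disabled
    return now_ms - last_event_ms > stall_timeout_ms
```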

Error Categories

codex_not_found
invalid_workspace_cwd
response_timeout
turn_timeout
port_exit
response_error
turn_failed
turn_cancelled
turn_input_required
