The polling configuration section controls the orchestrator’s tick frequency and reconciliation behavior.

Configuration

WORKFLOW.md
polling:
  interval_ms: 5000

Fields

interval_ms (integer, default: 30000)
Poll interval in milliseconds; the default is 30 seconds. The orchestrator runs a full tick on this cadence:
  1. Reconcile running issues (state refresh, stall detection)
  2. Validate configuration (dispatch preflight checks)
  3. Fetch candidate issues from tracker
  4. Dispatch eligible issues while slots remain
  5. Notify observability consumers (dashboard, logs)
Dynamic: changes take effect the next time a tick is scheduled; no restart is required.
Example values:
  • 5000 = 5 seconds (high frequency)
  • 30000 = 30 seconds (default)
  • 60000 = 1 minute (conservative)
  • 300000 = 5 minutes (low frequency)
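
As a minimal sketch of how the documented default could be applied, the helper below reads interval_ms from an already-parsed configuration map. The module name, the string-keyed map shape, and the fallback for non-positive values are assumptions for illustration; the orchestrator's own Config module is not shown in this document and may behave differently.

```elixir
defmodule PollingConfigSketch do
  # Hypothetical helper, not taken from the orchestrator source.
  @default_interval_ms 30_000

  # Returns the configured interval when it is a positive integer.
  def interval_ms(%{"polling" => %{"interval_ms" => value}})
      when is_integer(value) and value > 0 do
    value
  end

  # Falls back to the documented default (30 seconds) when the value is
  # missing or invalid. The fallback for invalid values is an assumption.
  def interval_ms(_config), do: @default_interval_ms
end
```

With this sketch, a configuration that omits the polling section resolves to 30000.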

Poll Tick Lifecycle

Each tick follows this sequence:
# From orchestrator.ex
def handle_info(:tick, state) do
  state = refresh_runtime_config(state)
  state = %{state | poll_check_in_progress: true, next_poll_due_at_ms: nil}

  notify_dashboard()
  :ok = schedule_poll_cycle_start()
  {:noreply, state}
end

def handle_info(:run_poll_cycle, state) do
  state = refresh_runtime_config(state)
  state = maybe_dispatch(state)
  now_ms = System.monotonic_time(:millisecond)
  next_poll_due_at_ms = now_ms + state.poll_interval_ms
  :ok = schedule_tick(state.poll_interval_ms)

  state = %{state | poll_check_in_progress: false, next_poll_due_at_ms: next_poll_due_at_ms}

  notify_dashboard()
  {:noreply, state}
end
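
The two handlers above split a tick into two phases. The sketch below isolates just the state updates as pure functions so the observable fields can be inspected at each phase; the module and function names are hypothetical, and the plain map stands in for the orchestrator's state struct.

```elixir
defmodule TickPhasesSketch do
  # Phase 1: mirrors the state update in the :tick handler.
  # While a check is in progress there is no known next due time.
  def begin_tick(state) do
    %{state | poll_check_in_progress: true, next_poll_due_at_ms: nil}
  end

  # Phase 2: mirrors the final state update in the :run_poll_cycle handler.
  # The next due time is measured from the end of the cycle.
  def finish_cycle(state, now_ms) do
    %{
      state
      | poll_check_in_progress: false,
        next_poll_due_at_ms: now_ms + state.poll_interval_ms
    }
  end
end
```

Because the next tick is scheduled only after the cycle's work has run, the effective period between ticks appears to be the configured interval plus the time the cycle itself takes.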

Reconciliation (Every Tick)

Before dispatch, the orchestrator reconciles running issues:

Stall Detection

For each running issue:
elapsed_ms = now_ms - (last_codex_timestamp || started_at)

if elapsed_ms > codex.stall_timeout_ms and codex.stall_timeout_ms > 0 do
  terminate_worker(issue_id)
  schedule_retry(issue_id)
end
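
The stall check above can be written as a small runnable predicate. This is a sketch under assumptions: the entry is a plain map, and the field names started_at_ms and last_codex_timestamp_ms are illustrative rather than the orchestrator's real field names.

```elixir
defmodule StallSketch do
  # Returns true when the worker has shown no activity for longer than
  # the timeout. A timeout of 0 (or less) disables stall detection.
  def stalled?(entry, now_ms, stall_timeout_ms) do
    # Fall back to the start time when no Codex activity has been seen yet.
    last_activity_ms = entry[:last_codex_timestamp_ms] || entry.started_at_ms
    elapsed_ms = now_ms - last_activity_ms

    stall_timeout_ms > 0 and elapsed_ms > stall_timeout_ms
  end
end
```

Note the strict comparison: a worker whose elapsed time exactly equals the timeout is not yet considered stalled.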

State Refresh

  1. Fetch current tracker states for all running issue IDs
  2. For each running issue:
    • Terminal state → Terminate worker, clean workspace
    • Still active → Update in-memory snapshot
    • Neither active nor terminal → Terminate worker (no cleanup)
  3. If state refresh fails → Keep workers running, retry next tick
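
The decision table above can be sketched as a pure function that maps each refreshed tracker state to an action. The module name, the returned atoms, and the shape of the refresh result are assumptions made for illustration; the state names used with it are placeholders.

```elixir
defmodule StateRefreshSketch do
  # Chooses the reconciliation action for one running issue.
  def action(tracker_state, active_states, terminal_states) do
    cond do
      tracker_state in terminal_states -> :terminate_and_clean_workspace
      tracker_state in active_states -> :update_snapshot
      true -> :terminate_without_cleanup
    end
  end

  # A failed refresh leaves every worker running; the next tick retries.
  def plan({:error, _reason}, _active_states, _terminal_states) do
    :keep_workers_and_retry_next_tick
  end

  # A successful refresh yields one action per running issue ID.
  def plan({:ok, states_by_id}, active_states, terminal_states) do
    Map.new(states_by_id, fn {issue_id, tracker_state} ->
      {issue_id, action(tracker_state, active_states, terminal_states)}
    end)
  end
end
```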

Dispatch Preflight Validation

Before fetching candidates, the orchestrator validates:
  • tracker.kind is present and supported
  • tracker.api_key is present after environment resolution
  • tracker.project_slug is present (when required)
  • codex.command is present and non-empty
Validation failures:
  • Skip dispatch for this tick
  • Reconciliation still runs
  • Log operator-visible error
  • Keep last known good config
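
A preflight check along these lines might look like the following sketch. It is not the orchestrator's validator: the module name, the atom-keyed config map, the error strings, and the supported_kinds and project_slug_required options are all hypothetical, introduced because the document does not list the supported tracker kinds or say when project_slug is required.

```elixir
defmodule PreflightSketch do
  # Returns :ok, or {:error, messages} listing every failed check.
  def validate(config, opts) do
    supported_kinds = Keyword.fetch!(opts, :supported_kinds)
    slug_required? = Keyword.get(opts, :project_slug_required, false)
    tracker = Map.get(config, :tracker, %{})
    codex = Map.get(config, :codex, %{})

    errors =
      []
      |> check(tracker[:kind] in supported_kinds, "tracker.kind is missing or unsupported")
      |> check(present?(tracker[:api_key]), "tracker.api_key is missing")
      |> check(
        not slug_required? or present?(tracker[:project_slug]),
        "tracker.project_slug is missing"
      )
      |> check(present?(codex[:command]), "codex.command is missing or empty")

    if errors == [], do: :ok, else: {:error, Enum.reverse(errors)}
  end

  # Accumulates a message for each check that did not pass.
  defp check(errors, true, _message), do: errors
  defp check(errors, false, message), do: [message | errors]

  # A value counts as present when it is a non-blank string.
  defp present?(value), do: is_binary(value) and String.trim(value) != ""
end
```

A caller following the rules above would skip dispatch for the tick on {:error, messages}, log the messages, and still run reconciliation.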

Candidate Selection

The orchestrator fetches issues from the tracker and filters by:
  1. Required fields: Has id, identifier, title, state
  2. Active state: State is in tracker.active_states
  3. Not terminal: State is not in tracker.terminal_states
  4. Not claimed: Not in running or retry_attempts maps
  5. Concurrency slots: Global and per-state limits allow dispatch
  6. Blocker rule (Todo only): No non-terminal blockers
From source:
defp candidate?(issue, state) do
  Issue.has_required_fields?(issue) and
    active_issue_state?(issue.state) and
    not terminal_issue_state?(issue.state) and
    not claimed?(state, issue.id) and
    available_slots_for_issue(state, issue) > 0 and
    not todo_blocked?(issue)
end
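
The blocker rule is the least obvious of the six filters, so here is a minimal sketch of it. The real todo_blocked?/1 is not shown in this document; this version takes the terminal states as an explicit argument and assumes a hypothetical blocked_by field holding a list of maps with a state key.

```elixir
defmodule BlockerSketch do
  # An issue is blocked only when it is in the Todo state and at least
  # one of its blockers has not reached a terminal state.
  def todo_blocked?(issue, terminal_states) do
    issue.state == "Todo" and
      Enum.any?(issue.blocked_by, fn blocker ->
        blocker.state not in terminal_states
      end)
  end
end
```

Issues in any other active state pass this filter regardless of their blockers, which matches the "Todo only" qualifier in the list above.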

Dispatch Sorting

Candidates are sorted by:
  1. Priority (ascending by number): 1 is dispatched first, then 2, 3, 4; issues with no priority (null) come last
  2. Created at (oldest first)
  3. Identifier (lexicographic tie-breaker)
Example:
defp sort_issues_for_dispatch(issues) do
  Enum.sort_by(
    issues,
    fn issue ->
      {
        priority_sort_key(issue.priority),
        created_at_sort_key(issue.created_at),
        issue.identifier
      }
    end
  )
end
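
The helpers priority_sort_key and created_at_sort_key are not shown in the excerpt. The sketch below is one plausible way to write them, assuming priorities are integers 1 to 4 and created_at is a DateTime. Converting the timestamp to an integer matters because tuples containing DateTime structs do not compare chronologically under Elixir's default term ordering.

```elixir
defmodule SortKeySketch do
  # Sentinel that sorts after any realistic timestamp (assumption).
  @max_key 9_223_372_036_854_775_807

  # Priorities 1..4 sort by their number; anything else (including nil) sorts last.
  def priority_sort_key(priority) when is_integer(priority) and priority in 1..4, do: priority
  def priority_sort_key(_priority), do: 5

  # Integer microseconds compare chronologically; missing timestamps sort last.
  def created_at_sort_key(%DateTime{} = created_at), do: DateTime.to_unix(created_at, :microsecond)
  def created_at_sort_key(_created_at), do: @max_key

  def sort(issues) do
    Enum.sort_by(issues, fn issue ->
      {
        priority_sort_key(issue.priority),
        created_at_sort_key(issue.created_at),
        issue.identifier
      }
    end)
  end
end
```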

Polling Frequency Considerations

High Frequency (5-10 seconds)

Pros:
  • Fast reaction to new issues
  • Quick reconciliation of state changes
  • Lower latency for continuation retries
Cons:
  • More API requests to tracker
  • Higher CPU usage for reconciliation
  • Potential rate limiting from tracker API

Medium Frequency (30-60 seconds)

Pros:
  • Balanced API usage
  • Reasonable reaction time
  • Default setting for most workloads
Cons:
  • Moderate delay before new issues are picked up

Low Frequency (5+ minutes)

Pros:
  • Minimal API usage
  • Low overhead
  • Suitable for batch workflows
Cons:
  • High latency for new issue pickup
  • Slower reconciliation of terminal states
  • Longer delays for retry/continuation
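
To put rough numbers on these trade-offs, the sketch below estimates how many ticks, and therefore tracker requests, an interval produces per hour. The module is hypothetical, and requests_per_tick is an assumed input: the tick described earlier makes at least a state refresh and a candidate fetch, but the actual request count depends on the tracker adapter. The figures are upper bounds because they ignore the time each cycle takes.

```elixir
defmodule PollCostSketch do
  @ms_per_hour 3_600_000

  # Upper bound on ticks per hour for a given interval.
  def ticks_per_hour(interval_ms) when is_integer(interval_ms) and interval_ms > 0 do
    div(@ms_per_hour, interval_ms)
  end

  # Upper bound on tracker requests per hour, given an assumed
  # number of requests issued during each tick.
  def tracker_requests_per_hour(interval_ms, requests_per_tick) do
    ticks_per_hour(interval_ms) * requests_per_tick
  end
end
```

For example, a 5000 ms interval yields at most 720 ticks per hour, against 120 at the 30000 ms default and 12 at 300000 ms.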

Configuration Reloading

Polling interval changes are applied dynamically:
defp refresh_runtime_config(state) do
  %{
    state
    | poll_interval_ms: Config.poll_interval_ms(),
      max_concurrent_agents: Config.max_concurrent_agents()
  }
end
  • Read from WORKFLOW.md on every tick
  • Applied to next tick scheduling
  • No restart required
  • A tick that is already scheduled still fires after its original delay; the new interval is used when the next tick is scheduled
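
The effect of a mid-run interval change can be illustrated with a small simulation. This sketch is hypothetical and ignores cycle duration: it takes the interval in force at the end of each cycle and returns the times at which ticks would fire.

```elixir
defmodule ReloadSketch do
  # intervals: the interval read from configuration at the end of each
  # successive poll cycle. Returns the tick times, starting at start_ms.
  def tick_times(start_ms, intervals) do
    later_ticks =
      Enum.scan(intervals, start_ms, fn interval_ms, previous_tick_ms ->
        previous_tick_ms + interval_ms
      end)

    [start_ms | later_ticks]
  end
end
```

If the interval is edited from 30000 to 5000 while the third tick is pending, ReloadSketch.tick_times(0, [30_000, 30_000, 5_000, 5_000]) gives ticks at 0, 30000, 60000, 65000, and 70000: the pending tick keeps its 30-second delay and the shorter interval applies from the following one.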

Startup Behavior

On service startup:
  1. Initialize state with poll_interval_ms read from configuration (30000 if unset)
  2. Run terminal cleanup: Query tracker for terminal issues and remove workspaces
  3. Schedule immediate tick (delay_ms: 0)
  4. Start polling loop
From source:
def init(_opts) do
  now_ms = System.monotonic_time(:millisecond)

  state = %State{
    poll_interval_ms: Config.poll_interval_ms(),
    max_concurrent_agents: Config.max_concurrent_agents(),
    next_poll_due_at_ms: now_ms,
    poll_check_in_progress: false,
    codex_totals: @empty_codex_totals,
    codex_rate_limits: nil
  }

  run_terminal_workspace_cleanup()
  :ok = schedule_tick(0)

  {:ok, state}
end
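
The body of schedule_tick is not shown in the excerpt. A common way to implement such a helper in a GenServer is Process.send_after/3, sketched below; treating it as the orchestrator's actual implementation would be an assumption.

```elixir
defmodule ScheduleSketch do
  # Sends :tick to the calling process after delay_ms milliseconds.
  # A delay of 0 queues the message right away, which is how an
  # "immediate" first tick can be requested from init/1.
  def schedule_tick(delay_ms) when is_integer(delay_ms) and delay_ms >= 0 do
    Process.send_after(self(), :tick, delay_ms)
    :ok
  end
end
```

Because the message is only queued, init/1 can return before the first tick is handled, so startup is not blocked by the first poll cycle.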

Observability

The orchestrator exposes polling state to observability consumers:
  • next_poll_due_at_ms: Monotonic timestamp for next tick
  • poll_check_in_progress: Boolean flag for active tick
  • Dashboard updates after each tick (notify_dashboard())
Example dashboard rendering:
# Adapted from status_dashboard.ex
if state.poll_check_in_progress do
  # next_poll_due_at_ms is nil while a check is running, so no arithmetic here
  "checking now…"
else
  time_until_next_poll_ms = max(state.next_poll_due_at_ms - now_ms, 0)
  "next poll in #{format_duration(time_until_next_poll_ms)}"
end
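
The format_duration helper is referenced but not shown. A minimal stand-in, assuming a simple minutes-and-seconds format that may differ from the dashboard's real output, could be:

```elixir
defmodule DurationSketch do
  # Formats a non-negative millisecond count as seconds, or as minutes
  # and seconds once it reaches one minute. Sub-second remainders are dropped.
  def format_duration(ms) when is_integer(ms) and ms >= 0 do
    total_seconds = div(ms, 1000)
    minutes = div(total_seconds, 60)
    seconds = rem(total_seconds, 60)

    if minutes > 0 do
      "#{minutes}m #{seconds}s"
    else
      "#{seconds}s"
    end
  end
end
```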

Examples

Fast Iteration

polling:
  interval_ms: 5000  # 5 seconds
Use for:
  • Development environments
  • High-priority issue queues
  • Real-time workflows

Production Default

polling:
  interval_ms: 30000  # 30 seconds
Use for:
  • Standard production deployments
  • Balanced API usage
  • Most team workflows

Batch Processing

polling:
  interval_ms: 300000  # 5 minutes
Use for:
  • Low-frequency batch jobs
  • API rate limit avoidance
  • Background maintenance tasks

Conservative (API Rate Limiting)

polling:
  interval_ms: 600000  # 10 minutes
Use for:
  • Strict API quotas
  • Low-traffic projects
  • Long-running agent sessions

Related Configuration

  • tracker - Define which issues to poll for
  • agent - Control concurrency and dispatch limits
  • codex - Configure stall timeout for reconciliation
