The Dispatch Daemon is the heart of Stoneforge’s orchestration system. It runs continuous polling loops to assign tasks, deliver messages, and trigger steward workflows.

Overview

The daemon coordinates all agent activity without manual intervention. It runs as a background service on the smithy server and is started with the sf daemon start command.

Polling Loops

The daemon executes five main polling loops every 5 seconds (configurable):

1. Worker Availability Polling

Purpose: Assign unassigned tasks to available ephemeral workers
Step 1: Find Available Workers

Query for ephemeral workers with:
  • No active session
  • Not rate-limited
  • Below pool capacity
const availableWorkers = agents.filter(a => 
  a.metadata.agentRole === 'worker' &&
  a.metadata.workerMode === 'ephemeral' &&
  a.metadata.sessionStatus === 'idle'
);
Step 2: Query Ready Tasks

Get highest priority tasks that are:
  • Status: OPEN
  • No assignee
  • Not blocked by dependencies
const readyTasks = await api.ready();
const sortedTasks = sortByEffectivePriority(readyTasks);
Step 3: Assign and Dispatch

For each available worker:
  1. Assign highest priority task
  2. Create or reuse worktree
  3. Send dispatch message to worker’s inbox
  4. Spawn worker session in worktree
await api.update(taskId, { 
  assignee: workerId,
  status: 'in_progress',
});

await dispatchService.dispatchTask(taskId, workerId);
Workers are spawned inside their worktree directory for full isolation.
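The three steps above can be sketched as a single pure planning function. This is a minimal illustration, not the daemon's actual implementation: the `WorkerAgent`, `ReadyTask`, and `Assignment` shapes and the `planAssignments` name are hypothetical, and it assumes a lower `priority` number means higher priority.

```typescript
// Hypothetical shapes for illustration; the real Stoneforge types may differ.
interface WorkerAgent {
  id: string;
  metadata: { agentRole: string; workerMode: string; sessionStatus: string };
}

interface ReadyTask {
  id: string;
  priority: number; // assumption: lower number = higher priority
}

interface Assignment {
  taskId: string;
  workerId: string;
}

// Pair each idle, non-rate-limited ephemeral worker with the next
// highest-priority ready task. Nothing is mutated or dispatched here.
function planAssignments(
  agents: WorkerAgent[],
  readyTasks: ReadyTask[],
  rateLimitedIds: Set<string> = new Set<string>(),
): Assignment[] {
  const availableWorkers = agents.filter(
    (a) =>
      a.metadata.agentRole === 'worker' &&
      a.metadata.workerMode === 'ephemeral' &&
      a.metadata.sessionStatus === 'idle' &&
      !rateLimitedIds.has(a.id),
  );

  // Copy before sorting so the caller's array is left untouched.
  const queue = [...readyTasks].sort((x, y) => x.priority - y.priority);

  const assignments: Assignment[] = [];
  for (const worker of availableWorkers) {
    const task = queue.shift();
    if (task === undefined) break; // more workers than ready tasks
    assignments.push({ taskId: task.id, workerId: worker.id });
  }
  return assignments;
}
```

Keeping the pairing logic free of side effects makes it easy to test; the caller would then apply each assignment with the update and dispatch calls shown above.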

2. Inbox Polling

Purpose: Route messages and trigger agent sessions when needed
Message Routing Logic:
  1. Poll for unread messages in worker inbox
  2. For each message:
    • Dispatch message? Mark as read (spawn handled by worker polling)
    • Has active session? Leave unread (session will handle it)
    • Idle with non-dispatch messages? Leave unread to accumulate
  3. If accumulated messages exist, spawn triage session
Triage Session:
  • Groups messages by channel
  • Spawns temporary session to process batch
  • Uses message-triage prompt template
  • Agent responds to messages then exits
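The routing rules above can be sketched as a function that decides, for one worker's unread messages, which to mark as read and which to batch for triage. The `InboxMessage` and `InboxPlan` shapes and the `planInboxPoll` name are assumptions made for this example only.

```typescript
// Hypothetical message shape; the real inbox records may carry more fields.
interface InboxMessage {
  id: string;
  channel: string;
  isDispatch: boolean;
}

interface InboxPlan {
  markRead: string[]; // dispatch messages (spawning is handled by worker polling)
  triageBatches: Map<string, InboxMessage[]>; // accumulated messages, grouped by channel
}

function planInboxPoll(unread: InboxMessage[], hasActiveSession: boolean): InboxPlan {
  const markRead: string[] = [];
  const triageBatches = new Map<string, InboxMessage[]>();

  for (const message of unread) {
    if (message.isDispatch) {
      markRead.push(message.id);
      continue;
    }
    // An active session will read the message itself, so leave it unread.
    if (hasActiveSession) continue;

    // Idle worker: accumulate non-dispatch messages per channel for triage.
    const batch = triageBatches.get(message.channel) ?? [];
    batch.push(message);
    triageBatches.set(message.channel, batch);
  }
  return { markRead, triageBatches };
}
```

A non-empty `triageBatches` map is the signal to spawn a triage session; an empty one means there is nothing to triage in this cycle.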

3. Steward Trigger Polling

Purpose: Activate steward workflows based on triggers
// Event trigger fires when matching event occurs
const trigger: EventTrigger = {
  type: 'event',
  event: 'task_review_ready',
  condition: 'task.priority <= 2', // Optional filter
};

// Daemon checks for triggered events
if (eventMatches(trigger, event)) {
  await spawnStewardWorkflow(stewardId, event);
}
Trigger Flow:
  1. Trigger fires (event or cron)
  2. Create workflow from playbook template
  3. Workflow picked up by Workflow Task Polling
  4. Steward session spawned to execute
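One way an event-matching check like `eventMatches` could work is sketched below. This is not the daemon's real matcher: the source shows the condition as a string expression, whereas this sketch swaps in a predicate function to avoid evaluating strings, and the `DaemonEvent` and `EventTriggerSketch` shapes are hypothetical.

```typescript
// Hypothetical event shape for illustration.
interface DaemonEvent {
  name: string;
  task: { id: string; priority: number };
}

// Assumption: the optional condition is a predicate rather than a string expression.
interface EventTriggerSketch {
  type: 'event';
  event: string;
  condition?: (e: DaemonEvent) => boolean;
}

function eventMatches(trigger: EventTriggerSketch, incoming: DaemonEvent): boolean {
  // The event name must match exactly.
  if (trigger.event !== incoming.name) return false;
  // With no condition, any event with the right name fires the trigger.
  return trigger.condition ? trigger.condition(incoming) : true;
}

const reviewTrigger: EventTriggerSketch = {
  type: 'event',
  event: 'task_review_ready',
  condition: (e) => e.task.priority <= 2,
};
```

A predicate keeps the filter type-checked; a production system that stores conditions as strings would need a parser or a sandboxed evaluator instead.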

4. Workflow Task Polling

Purpose: Assign workflow tasks to available stewards
Step 1: Find Incomplete Workflows

Query for workflows with:
  • Status: RUNNING or PENDING
  • No assigned steward
  • Current step not completed
Step 2: Find Available Stewards

Get stewards matching workflow requirements:
  • No active session
  • Correct steward focus
  • Not rate-limited
Step 3: Assign and Execute

For each available steward:
  1. Assign workflow task
  2. Send dispatch message
  3. Spawn steward session with workflow context
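The matching described in these steps might look roughly like the sketch below. All names here (`StewardAgent`, `PendingWorkflow`, `requiredFocus`, `matchStewards`) and the focus values used with it are illustrative assumptions, not Stoneforge's actual data model.

```typescript
// Hypothetical shapes for illustration only.
interface StewardAgent {
  id: string;
  focus: string;
  sessionStatus: 'idle' | 'running';
  rateLimited: boolean;
}

interface PendingWorkflow {
  id: string;
  status: 'pending' | 'running' | 'completed';
  requiredFocus: string;
  stewardId?: string;
}

interface StewardMatch {
  workflowId: string;
  stewardId: string;
}

function matchStewards(workflows: PendingWorkflow[], stewards: StewardAgent[]): StewardMatch[] {
  // Only idle, non-rate-limited stewards can take work.
  const free = stewards.filter((s) => s.sessionStatus === 'idle' && !s.rateLimited);
  const used = new Set<string>();
  const matches: StewardMatch[] = [];

  for (const wf of workflows) {
    if (wf.stewardId !== undefined) continue; // already assigned
    if (wf.status !== 'pending' && wf.status !== 'running') continue;

    // Each steward takes at most one workflow per poll cycle.
    const steward = free.find((s) => s.focus === wf.requiredFocus && !used.has(s.id));
    if (steward === undefined) continue; // stays unassigned until a later cycle

    used.add(steward.id);
    matches.push({ workflowId: wf.id, stewardId: steward.id });
  }
  return matches;
}
```

Workflows that find no matching steward are simply left for the next poll cycle, which mirrors how the daemon retries on every interval.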

5. Orphan Recovery Polling

Purpose: Recover workers with assigned tasks but no active session after restart
When the orchestrator server restarts, agent sessions terminate but task assignments persist. Orphan recovery re-spawns workers to continue their work.
Step 1: Detect Orphaned Assignments

Find ephemeral workers with:
  • No active session (sessionStatus: 'idle')
  • Assigned tasks (OPEN or IN_PROGRESS)
  • Session terminated by restart
const orphans = workers.filter(w => 
  w.metadata.sessionStatus === 'idle' &&
  hasAssignedTasks(w.id)
);
Step 2: Attempt Resume

Try to resume previous session:
  1. Check for a saved session ID (lastSessionId) in the task’s orchestrator metadata
  2. If found, attempt provider session resume
  3. Reuse existing worktree
  4. Inject resume context explaining restart
await spawner.spawn({
  agentId: workerId,
  resumeSessionId: task.metadata.orchestrator.lastSessionId,
  workingDirectory: task.metadata.orchestrator.handoffWorktree,
});
Step 3: Fallback to Fresh Spawn

If resume fails or no session ID:
  1. Spawn fresh session
  2. Send full task prompt
  3. Continue from existing worktree
  4. Preserve handoff history
Orphan recovery runs at daemon startup and at the start of each poll cycle.
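The resume-then-fallback behavior can be sketched as a small planning function plus an async wrapper. The metadata field names follow the snippet above, but the `OrphanTask`, `RecoveryPlan`, and `SpawnerSketch` types and both function names are assumptions for illustration; the real spawner API is likely different.

```typescript
interface OrphanTask {
  id: string;
  metadata: { orchestrator?: { lastSessionId?: string; handoffWorktree?: string } };
}

type RecoveryPlan =
  | { mode: 'resume'; sessionId: string; workingDirectory: string | undefined }
  | { mode: 'fresh'; workingDirectory: string | undefined };

// Decide whether a resume can be attempted. The existing worktree is reused either way.
function planRecovery(task: OrphanTask): RecoveryPlan {
  const info = task.metadata.orchestrator;
  const workingDirectory = info?.handoffWorktree;
  const sessionId = info?.lastSessionId;
  if (sessionId) {
    return { mode: 'resume', sessionId, workingDirectory };
  }
  return { mode: 'fresh', workingDirectory };
}

// Hypothetical spawner interface, injected so the flow can be tested with fakes.
interface SpawnerSketch {
  resume(sessionId: string, workingDirectory: string | undefined): Promise<void>;
  spawnFresh(taskId: string, workingDirectory: string | undefined): Promise<void>;
}

async function recoverOrphan(task: OrphanTask, spawner: SpawnerSketch): Promise<'resumed' | 'fresh'> {
  const plan = planRecovery(task);
  if (plan.mode === 'resume') {
    try {
      await spawner.resume(plan.sessionId, plan.workingDirectory);
      return 'resumed';
    } catch {
      // Resume failed: fall through to a fresh session in the same worktree.
    }
  }
  await spawner.spawnFresh(task.id, plan.workingDirectory);
  return 'fresh';
}
```

Separating the decision (`planRecovery`) from the side effects (`recoverOrphan`) keeps the fallback path explicit and lets the decision be checked without spawning anything.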

End-to-End Flow

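A task typically flows through the system as follows:
  1. A task becomes ready: status OPEN, no assignee, and not blocked by dependencies.
  2. Worker Availability Polling assigns it to an idle ephemeral worker and sets its status to in_progress.
  3. The daemon creates or reuses a worktree, sends a dispatch message to the worker’s inbox, and spawns the worker session in that worktree.
  4. Inbox Polling marks the dispatch message as read and leaves other messages for the active session to handle.
  5. If the server restarts mid-task, Orphan Recovery Polling resumes the session, or spawns a fresh one, in the same worktree.
  6. Events such as task_review_ready can fire steward triggers, which create workflows that Workflow Task Polling assigns to an available steward.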

Configuration

Customize daemon behavior via configuration:
const daemon = new DispatchDaemon({
  // Poll interval
  pollIntervalMs: 5000, // 5 seconds (default)
  
  // Enable/disable specific loops
  workerAvailabilityPollEnabled: true,
  inboxPollEnabled: true,
  stewardTriggerPollEnabled: true,
  workflowTaskPollEnabled: true,
  orphanRecoveryEnabled: true,
  
  // Recovery settings
  maxResumeAttemptsBeforeRecovery: 3,
  maxSessionDurationMs: 0, // Disabled by default
  maxStewardSessionDurationMs: 1800000, // 30 minutes
  
  // Reconciliation
  closedUnmergedReconciliationEnabled: true,
  closedUnmergedGracePeriodMs: 120000, // 2 minutes
  stuckMergeRecoveryEnabled: true,
  stuckMergeRecoveryGracePeriodMs: 600000, // 10 minutes
});
| Option | Default | Description |
| --- | --- | --- |
| pollIntervalMs | 5000 | Time between poll cycles |
| workerAvailabilityPollEnabled | true | Enable task assignment |
| inboxPollEnabled | true | Enable message routing |
| stewardTriggerPollEnabled | true | Enable steward triggers |
| workflowTaskPollEnabled | true | Enable workflow dispatch |
| orphanRecoveryEnabled | true | Enable restart recovery |
| maxResumeAttemptsBeforeRecovery | 3 | Resume limit before steward |
| maxSessionDurationMs | 0 | Worker timeout (0=disabled) |
| maxStewardSessionDurationMs | 1800000 | Steward timeout (30 min) |
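To show how the enable flags could shape a poll cycle, here is a minimal sketch that merges overrides onto the defaults and lists the loops that would run. The flag names and defaults come from the table above; the `enabledLoops` helper, the `'orphan-recovery'` label, and the ordering of the other loops are assumptions (the source only states that orphan recovery runs at the start of each cycle).

```typescript
interface LoopFlags {
  workerAvailabilityPollEnabled: boolean;
  inboxPollEnabled: boolean;
  stewardTriggerPollEnabled: boolean;
  workflowTaskPollEnabled: boolean;
  orphanRecoveryEnabled: boolean;
}

// Defaults as documented: every loop is enabled.
const DEFAULT_LOOP_FLAGS: LoopFlags = {
  workerAvailabilityPollEnabled: true,
  inboxPollEnabled: true,
  stewardTriggerPollEnabled: true,
  workflowTaskPollEnabled: true,
  orphanRecoveryEnabled: true,
};

// Orphan recovery first; the remaining order is an assumption.
const LOOP_ORDER: Array<[keyof LoopFlags, string]> = [
  ['orphanRecoveryEnabled', 'orphan-recovery'],
  ['workerAvailabilityPollEnabled', 'worker-availability'],
  ['inboxPollEnabled', 'inbox'],
  ['stewardTriggerPollEnabled', 'steward-trigger'],
  ['workflowTaskPollEnabled', 'workflow-task'],
];

function enabledLoops(overrides: Partial<LoopFlags> = {}): string[] {
  const flags = { ...DEFAULT_LOOP_FLAGS, ...overrides };
  return LOOP_ORDER.filter(([flag]) => flags[flag]).map(([, label]) => label);
}
```

Disabling a loop this way skips it entirely for the cycle, which can be useful when debugging one loop in isolation.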

Starting and Stopping

CLI

# Start daemon
sf daemon start

# Check status
sf daemon status

# Stop daemon
sf daemon stop

Programmatically

import { DispatchDaemon } from '@stoneforge/smithy';

const daemon = new DispatchDaemon({
  api: quarryAPI,
  registry: agentRegistry,
  sessionManager,
  dispatchService,
  worktreeManager,
  // ... other dependencies
});

// Start daemon
await daemon.start();

// Stop daemon
await daemon.stop();

// Check if running
const isRunning = daemon.isRunning();

Monitoring

Poll Results

The daemon emits poll results after each cycle:
daemon.on('poll', (result: PollResult) => {
  console.log(`${result.pollType}: processed ${result.processed} in ${result.durationMs}ms`);
  
  if (result.errors > 0) {
    console.error(`Errors: ${result.errorMessages.join(', ')}`);
  }
});
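Poll results can also be aggregated per cycle, for example to feed a dashboard. The sketch below assumes a `PollResult` shape containing only the fields used in the listener above; the `PollSummary` type and `summarizeCycle` helper are hypothetical.

```typescript
// Assumed shape, limited to the fields referenced in the listener above.
interface PollResult {
  pollType: string;
  processed: number;
  errors: number;
  errorMessages: string[];
  durationMs: number;
}

interface PollSummary {
  processed: number;
  errors: number;
  slowestPollType: string | null; // null when no results were collected
}

function summarizeCycle(results: PollResult[]): PollSummary {
  let processed = 0;
  let errors = 0;
  let slowest: PollResult | null = null;

  for (const result of results) {
    processed += result.processed;
    errors += result.errors;
    if (slowest === null || result.durationMs > slowest.durationMs) {
      slowest = result;
    }
  }
  return { processed, errors, slowestPollType: slowest ? slowest.pollType : null };
}
```

Collecting the results emitted during one cycle into an array and passing it to `summarizeCycle` gives a single record per cycle instead of one log line per loop.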

Health Checks

// Get daemon health status
const health = await daemonService.getHealth();

// Health response
{
  status: 'running' | 'stopped',
  uptime: 123456, // milliseconds
  pollCycleCount: 42,
  lastPollResults: {
    'worker-availability': { processed: 3, errors: 0 },
    'inbox': { processed: 5, errors: 0 },
    'steward-trigger': { processed: 1, errors: 0 },
    'workflow-task': { processed: 0, errors: 0 },
  },
}

Advanced Features

Rate Limit Handling

The daemon tracks rate-limited workers:
// Rate limit detected from session error
if (isRateLimitMessage(error)) {
  const resetTime = parseRateLimitResetTime(error);
  rateLimitTracker.set(workerId, resetTime);
}

// Skip rate-limited workers in polling
const availableWorkers = workers.filter(w => 
  !rateLimitTracker.isRateLimited(w.id)
);
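A tracker with the `set` and `isRateLimited` methods used above could be as simple as a map from worker ID to reset time. This is a sketch under assumptions: the reset time is taken to be a `Date`, and the injectable clock is added here purely for testability.

```typescript
class RateLimitTracker {
  private readonly resetTimes = new Map<string, number>();
  private readonly now: () => number;

  // The clock is injectable so expiry can be tested without waiting.
  constructor(now: () => number = () => Date.now()) {
    this.now = now;
  }

  // Record when the worker's rate limit is expected to reset.
  set(workerId: string, resetTime: Date): void {
    this.resetTimes.set(workerId, resetTime.getTime());
  }

  isRateLimited(workerId: string): boolean {
    const resetAt = this.resetTimes.get(workerId);
    if (resetAt === undefined) return false;
    if (this.now() >= resetAt) {
      // The limit has expired: forget the entry so the map does not grow forever.
      this.resetTimes.delete(workerId);
      return false;
    }
    return true;
  }
}
```

Expiring entries lazily inside `isRateLimited` avoids a separate cleanup timer, since the daemon already queries the tracker on every poll cycle.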

Session Timeout

Automatically terminate sessions that run too long:
// Configure timeouts
const daemon = new DispatchDaemon({
  maxSessionDurationMs: 3600000, // 1 hour for workers
  maxStewardSessionDurationMs: 1800000, // 30 minutes for stewards
});

// Daemon checks session duration each poll cycle
if (sessionDuration > maxDuration) {
  await sessionManager.terminate(sessionId);
}
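The duration check can be sketched as a pure function over the active sessions. The `ActiveSession` shape and `findExpiredSessions` name are hypothetical, and treating 0 as "disabled" for stewards as well as workers is an assumption; the source only documents that meaning for `maxSessionDurationMs`.

```typescript
// Hypothetical session record for illustration.
interface ActiveSession {
  id: string;
  role: 'worker' | 'steward';
  startedAt: number; // epoch milliseconds
}

interface TimeoutConfig {
  maxSessionDurationMs: number; // workers; 0 disables the timeout
  maxStewardSessionDurationMs: number; // stewards; 0 assumed to disable as well
}

function findExpiredSessions(
  sessions: ActiveSession[],
  config: TimeoutConfig,
  now: number = Date.now(),
): string[] {
  return sessions
    .filter((session) => {
      const limit =
        session.role === 'steward'
          ? config.maxStewardSessionDurationMs
          : config.maxSessionDurationMs;
      // A limit of 0 means "never time out".
      return limit > 0 && now - session.startedAt > limit;
    })
    .map((session) => session.id);
}
```

Each returned ID would then be passed to the session manager's terminate call shown above.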

Plan Auto-Completion

Automatically close plans when all tasks are done:
// Enabled by default
const daemon = new DispatchDaemon({
  planAutoCompleteEnabled: true,
});

// Daemon detects completed plans
const activePlans = await api.list({ 
  type: 'plan',
  status: 'active',
});

for (const plan of activePlans) {
  if (allTasksCompleted(plan)) {
    await api.updateStatus(plan.id, {
      status: 'completed',
    });
  }
}
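A helper like `allTasksCompleted` needs one guard that is easy to miss: `Array.prototype.every` returns true for an empty array, so a plan with no tasks would be closed immediately. The sketch below assumes a plan carries its tasks inline and that `'closed'` is the finished task status; both are assumptions, as are the `PlanSketch` and `PlanTask` types.

```typescript
// Hypothetical shapes; the real plan/task model may link tasks differently.
interface PlanTask {
  id: string;
  status: string;
}

interface PlanSketch {
  id: string;
  status: 'active' | 'completed';
  tasks: PlanTask[];
}

// Assumption: 'closed' is the status that marks a task as done.
const DONE_STATUSES = new Set<string>(['closed']);

function allTasksCompleted(plan: PlanSketch): boolean {
  // Guard against empty plans: [].every(...) is true.
  return plan.tasks.length > 0 && plan.tasks.every((t) => DONE_STATUSES.has(t.status));
}

// IDs of active plans that are ready to be marked completed.
function plansToComplete(plans: PlanSketch[]): string[] {
  return plans.filter((p) => p.status === 'active' && allTasksCompleted(p)).map((p) => p.id);
}
```

Whether an empty plan should count as complete is a design choice; this sketch leaves such plans active so that a newly created plan is not closed before tasks are added.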

Stuck Merge Recovery

Recover tasks stuck in merge states:
const daemon = new DispatchDaemon({
  stuckMergeRecoveryEnabled: true,
  stuckMergeRecoveryGracePeriodMs: 600000, // 10 minutes
});

// Daemon finds tasks stuck in 'merging' or 'testing'
// for longer than grace period and resets to 'pending'
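The grace-period check described in the comment can be sketched as follows. The status names 'merging', 'testing', and 'pending' come from the text above; the `MergeTask` shape, the `mergeStatusChangedAt` timestamp, and the `resetStuckMerges` name are assumptions for illustration.

```typescript
// Hypothetical task shape; the timestamp field name is an assumption.
interface MergeTask {
  id: string;
  mergeStatus: string;
  mergeStatusChangedAt: number; // epoch milliseconds
}

const STUCK_STATUSES = new Set<string>(['merging', 'testing']);

// Return copies of the tasks, resetting any that have sat in a stuck-prone
// merge status for longer than the grace period. Inputs are not mutated.
function resetStuckMerges(
  tasks: MergeTask[],
  gracePeriodMs: number,
  now: number = Date.now(),
): MergeTask[] {
  return tasks.map((task) => {
    const stuck =
      STUCK_STATUSES.has(task.mergeStatus) &&
      now - task.mergeStatusChangedAt > gracePeriodMs;
    return stuck ? { ...task, mergeStatus: 'pending', mergeStatusChangedAt: now } : task;
  });
}
```

The grace period keeps merges that are merely slow from being reset; with the default of 600000 ms, only tasks that have been in 'merging' or 'testing' for more than 10 minutes are touched.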

Next Steps

Agent Roles

Learn about the agents being orchestrated

Task Management

Understand task lifecycle and statuses

Dependencies

See how blocking affects dispatch

Workflows

Build multi-step steward workflows
