Workflows

Workflows are resumable task sequences that execute in order with durable state. If a step fails, the workflow resumes from that step rather than restarting from scratch.

Workflow vs Task

Task

Single unit of work

One goal/deliverable
One assignee at a time
Binary completion (done or not)
Independent execution

Workflow

Sequence of steps

Multiple stages
Resumable on failure
Durable state tracking
Orchestrated execution

Workflow Lifecycle

Workflows progress through five states:

PENDING
RUNNING
COMPLETED
FAILED
CANCELLED

PENDING: Created but not started

Workflow exists but no execution yet
Variables resolved from template
Waiting for trigger or manual start

const workflow = await api.create({
  type: 'workflow',
  title: 'Deploy to production',
  status: 'pending',
  ephemeral: false,
  variables: { environment: 'prod' },
  createdBy: stewardId,
});

RUNNING: Active execution

Currently processing steps
Has active steward assignment
startedAt timestamp set

await api.updateStatus(workflowId, {
  status: 'running',
});
// startedAt set automatically

COMPLETED: All steps finished successfully

Terminal state (no further transitions)
finishedAt timestamp set
Eligible for garbage collection (if ephemeral)

await api.updateStatus(workflowId, {
  status: 'completed',
});
// finishedAt set automatically

FAILED: Execution failed

Terminal state
finishedAt and failureReason set
Requires manual intervention or retry

await api.updateStatus(workflowId, {
  status: 'failed',
  failureReason: 'Tests failed at step 3',
});

CANCELLED: Manually stopped

Terminal state
finishedAt and cancelReason set
May be restarted as new workflow

await api.updateStatus(workflowId, {
  status: 'cancelled',
  cancelReason: 'Deploy window closed',
});
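The five states above imply a small state machine. The following is a minimal sketch of it, assuming that a pending workflow can be cancelled before it starts and that the three terminal states have no outgoing transitions, as the descriptions above suggest. The `TRANSITIONS` table and the `isTerminal` and `canTransition` helpers are hypothetical names for illustration, not part of the documented API.

```typescript
type WorkflowStatus = 'pending' | 'running' | 'completed' | 'failed' | 'cancelled';

// Assumed transition table: terminal states have no outgoing transitions.
const TRANSITIONS: Record<WorkflowStatus, readonly WorkflowStatus[]> = {
  pending: ['running', 'cancelled'],
  running: ['completed', 'failed', 'cancelled'],
  completed: [],
  failed: [],
  cancelled: [],
};

// A state is terminal when nothing can follow it.
function isTerminal(status: WorkflowStatus): boolean {
  return TRANSITIONS[status].length === 0;
}

// True when the table allows moving from `from` to `to`.
function canTransition(from: WorkflowStatus, to: WorkflowStatus): boolean {
  return TRANSITIONS[from].includes(to);
}
```

A guard like this could run before a status update so that, for example, a completed workflow is never moved back to running.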

Creating Workflows

From Playbook Template

The recommended approach is to instantiate a workflow from a reusable template:

// 1. Create playbook template
const playbook = await api.create({
  type: 'playbook',
  title: 'Deployment Workflow',
  content: `
# Deployment Workflow

## Steps

1. Run tests
2. Build artifacts
3. Deploy to {{environment}}
4. Smoke test
5. Notify team

## Variables
- environment: prod | staging
  `,
  contentType: 'text/markdown',
  createdBy: directorId,
});

// 2. Instantiate workflow from playbook
const workflow = await api.create({
  type: 'workflow',
  title: 'Deploy v1.2.0 to prod',
  playbookId: playbook.id,
  ephemeral: false,
  variables: { 
    environment: 'prod',
    version: '1.2.0',
  },
  createdBy: stewardId,
});

Ad-Hoc Workflow

For one-off sequences without a template:

const workflow = await api.create({
  type: 'workflow',
  title: 'Emergency hotfix deployment',
  descriptionRef: descriptionDocId,
  ephemeral: true, // Delete after completion
  variables: {
    branch: 'hotfix/security-patch',
    skipTests: false,
  },
  createdBy: humanId,
});

Workflow Execution

Steward Assignment

Workflows are executed by Stewards:

Workflow Created

Workflow starts in PENDING status.

Workflow Task Polling

The Dispatch Daemon detects a workflow that needs execution.

Steward Assignment

An available steward with a matching focus is assigned.

Session Spawn

Steward session spawned with workflow context:

await spawner.spawn({
  agentId: stewardId,
  mode: 'headless',
  initialPrompt: renderWorkflowPrompt(workflow),
});

Execution

Steward processes workflow steps, updating status as it progresses.

Resumption on Failure

Workflows preserve their state in metadata, so a failed run can resume from the step that failed:

// Workflow state in metadata
{
  metadata: {
    currentStep: 3,
    completedSteps: [1, 2],
    stepResults: {
      '1': { success: true, output: '...' },
      '2': { success: true, output: '...' },
      '3': { success: false, error: 'Test failed' },
    },
  },
}

// On resume, steward continues from step 3
// Not from step 1

Resumption requires idempotent steps: the failed step runs again on resume, so running it more than once must leave the system in the same state as running it once.
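The resume behaviour described above can be sketched as a loop that skips steps already listed in `completedSteps` and stops at the first failure. This is an illustration under assumptions, not the steward's actual implementation: the `Step` signature, the `WorkflowState` type, and `resumeWorkflow` are hypothetical, and state is kept in memory rather than written through the API.

```typescript
interface StepResult {
  success: boolean;
  output?: string;
  error?: string;
}

interface WorkflowState {
  currentStep: number;
  completedSteps: number[];
  stepResults: Record<string, StepResult>;
}

// Hypothetical step signature: resolves with output, or throws on failure.
type Step = () => Promise<string>;

async function resumeWorkflow(steps: Step[], state: WorkflowState): Promise<WorkflowState> {
  // Work on a copy so the caller's state object is not mutated.
  const next: WorkflowState = {
    currentStep: state.currentStep,
    completedSteps: [...state.completedSteps],
    stepResults: { ...state.stepResults },
  };

  // Steps are numbered from 1, matching the metadata example above.
  for (let n = 1; n <= steps.length; n++) {
    if (next.completedSteps.includes(n)) continue; // already done, skip
    const step = steps[n - 1];
    if (!step) break;
    next.currentStep = n;
    try {
      const output = await step();
      next.stepResults[String(n)] = { success: true, output };
      next.completedSteps.push(n);
    } catch (err) {
      const error = err instanceof Error ? err.message : String(err);
      next.stepResults[String(n)] = { success: false, error };
      return next; // stop here; a later resume retries this step
    }
  }
  return next;
}
```

Because the failed step is retried on the next call, each step has to tolerate being run again.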

Ephemeral vs Durable

Workflows can be ephemeral (temporary) or durable (persistent):

Ephemeral
Durable

Ephemeral: temporary execution, not synced to JSONL. Use for:

Short-lived automation
Event-triggered actions
Runs that can be garbage collected after completion

const workflow = await api.create({
  type: 'workflow',
  title: 'Daily cleanup',
  ephemeral: true,
  variables: {},
  createdBy: stewardId,
});

// After completion, eligible for GC
if (isEligibleForGarbageCollection(workflow)) {
  await api.delete(workflow.id);
}

Benefits:

Reduced storage overhead
Automatic cleanup
Faster performance (no JSONL writes)

Durable: persistent execution, synced to JSONL. Use for:

Critical workflows
Audit trails
Long-running processes
Cross-session workflows

const workflow = await api.create({
  type: 'workflow',
  title: 'Quarterly release',
  ephemeral: false,
  variables: { quarter: 'Q1-2026' },
  createdBy: directorId,
});

// Persists across restarts
// Visible in git history

Benefits:

Survives restarts
Full history tracking
Git-tracked state

Promoting Ephemeral to Durable

// Start as ephemeral
const workflow = await api.create({
  type: 'workflow',
  title: 'Test workflow',
  ephemeral: true,
  variables: {},
  createdBy: stewardId,
});

// Decide to keep it
await api.update(workflow.id, {
  ephemeral: false,
});
// Now synced to JSONL

Variables and Templates

Workflows support variable interpolation from playbooks:

Variable Definition

In playbook markdown:

# Deploy {{service}} to {{environment}}

## Steps

1. Build {{service}} from branch {{branch}}
2. Run tests in {{environment}} mode
3. Deploy to {{environment}}
4. Notify #{{notification_channel}}

## Variables
- service: api | web | worker
- environment: prod | staging | dev
- branch: main | release/*
- notification_channel: deployments

Variable Resolution

At workflow creation:

const workflow = await api.create({
  type: 'workflow',
  title: 'Deploy API to prod',
  playbookId: deployPlaybookId,
  variables: {
    service: 'api',
    environment: 'prod',
    branch: 'release/v1.2.0',
    notification_channel: 'deployments',
  },
  createdBy: stewardId,
});

// Steward sees resolved template:
// "Deploy api to prod"
// "Build api from branch release/v1.2.0"
// etc.

Variables are resolved once at creation time and stored in the workflow.
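To make the interpolation concrete, here is a minimal sketch of how `{{name}}` placeholders could be substituted. It is not the library's resolver: `renderTemplate` is a hypothetical helper, and throwing on a missing variable is an assumption about the desired behaviour.

```typescript
type Variables = Record<string, string | number | boolean>;

// Replace every {{name}} placeholder with the matching variable value.
function renderTemplate(template: string, variables: Variables): string {
  return template.replace(/\{\{\s*(\w+)\s*\}\}/g, (_match, name: string) => {
    // Only accept the object's own keys, not inherited ones such as "toString".
    if (!Object.prototype.hasOwnProperty.call(variables, name)) {
      throw new Error(`Missing variable: ${name}`);
    }
    return String(variables[name]);
  });
}

const rendered = renderTemplate('Deploy {{service}} to {{environment}}', {
  service: 'api',
  environment: 'prod',
});
console.log(rendered); // Deploy api to prod
```

Failing loudly on an unknown placeholder surfaces a misspelled variable at creation time instead of leaving a raw `{{...}}` in the steward's prompt.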

Workflow Patterns

Deploy Workflow

const deployWorkflow = await api.create({
  type: 'workflow',
  title: `Deploy ${version} to ${env}`,
  playbookId: 'playbook-deploy',
  ephemeral: false,
  variables: {
    version,
    environment: env,
    runTests: true,
    notifySlack: true,
  },
  createdBy: stewardId,
});

Steps:

Pull latest code
Run test suite
Build artifacts
Deploy to target environment
Run smoke tests
Notify team in Slack

Data Migration Workflow

const migrationWorkflow = await api.create({
  type: 'workflow',
  title: 'Migrate users table',
  playbookId: 'playbook-migration',
  ephemeral: false,
  variables: {
    tableName: 'users',
    batchSize: 1000,
    dryRun: true,
  },
  createdBy: humanId,
});

Steps:

Backup current data
Run migration in dry-run mode
Validate results
Run actual migration in batches
Verify data integrity
Clean up old data

Periodic Maintenance Workflow

const maintenanceWorkflow = await api.create({
  type: 'workflow',
  title: 'Weekly maintenance',
  playbookId: 'playbook-maintenance',
  ephemeral: true, // Delete after completion
  variables: {
    cleanupThreshold: '7d',
    notifyOnFailure: true,
  },
  createdBy: stewardId,
});

Steps:

Clean up old logs
Vacuum database
Update dependencies
Run health checks
Generate report

Querying Workflows

By Status

// Active workflows
const active = await api.list({
  type: 'workflow',
  status: ['pending', 'running'],
});

// Completed workflows
const completed = await api.list({
  type: 'workflow',
  status: 'completed',
});

By Playbook

// All instances of a playbook
const deployments = await api.list({
  type: 'workflow',
  playbookId: deployPlaybookId,
});

Ephemeral vs Durable

// Only durable workflows
const durable = await api.list({
  type: 'workflow',
  ephemeral: false,
});

// Eligible for garbage collection
const gcCandidates = await api.list({
  type: 'workflow',
  ephemeral: true,
  status: ['completed', 'failed', 'cancelled'],
});
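The garbage-collection filter above can also be written as a local predicate, which is handy for tests or for filtering a list that is already in memory. This is a sketch: `WorkflowLike` and `isGcCandidate` are hypothetical names, and the rule (ephemeral and in a terminal state) is taken from the query above rather than from the library's own eligibility check.

```typescript
type WorkflowStatus = 'pending' | 'running' | 'completed' | 'failed' | 'cancelled';

interface WorkflowLike {
  id: string;
  ephemeral: boolean;
  status: WorkflowStatus;
}

const TERMINAL_STATUSES = new Set<WorkflowStatus>(['completed', 'failed', 'cancelled']);

// Assumed rule: only ephemeral workflows in a terminal state may be collected.
function isGcCandidate(workflow: WorkflowLike): boolean {
  return workflow.ephemeral && TERMINAL_STATUSES.has(workflow.status);
}

const sample: WorkflowLike[] = [
  { id: 'wf-1', ephemeral: true, status: 'completed' },
  { id: 'wf-2', ephemeral: true, status: 'running' },
  { id: 'wf-3', ephemeral: false, status: 'failed' },
];
const candidateIds = sample.filter(isGcCandidate).map((w) => w.id);
console.log(candidateIds); // [ 'wf-1' ]
```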

Workflow Monitoring

Duration Tracking

const workflow = await api.get(workflowId);

if (workflow.startedAt) {
  const duration = workflow.finishedAt
    ? new Date(workflow.finishedAt).getTime() - new Date(workflow.startedAt).getTime()
    : Date.now() - new Date(workflow.startedAt).getTime();
  
  console.log(`Duration: ${duration}ms`);
}

Status Summary

import { groupByWorkflowStatus } from '@stoneforge/core';

const workflows = await api.list({ type: 'workflow' });
const grouped = groupByWorkflowStatus(workflows);

console.log(`
  Pending: ${grouped.pending.length}
  Running: ${grouped.running.length}
  Completed: ${grouped.completed.length}
  Failed: ${grouped.failed.length}
  Cancelled: ${grouped.cancelled.length}
`);
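For readers who want to see what such a grouping involves, here is a hand-rolled sketch. It is not the implementation of `groupByWorkflowStatus`: the return shape (one array per status) is inferred from the usage above, and `groupByStatus` is a hypothetical name.

```typescript
type WorkflowStatus = 'pending' | 'running' | 'completed' | 'failed' | 'cancelled';

// Bucket items by status; every status key is present, even when its bucket is empty.
function groupByStatus<T extends { status: WorkflowStatus }>(items: T[]): Record<WorkflowStatus, T[]> {
  const groups: Record<WorkflowStatus, T[]> = {
    pending: [],
    running: [],
    completed: [],
    failed: [],
    cancelled: [],
  };
  for (const item of items) {
    groups[item.status].push(item);
  }
  return groups;
}

const demo: { id: string; status: WorkflowStatus }[] = [
  { id: 'wf-1', status: 'pending' },
  { id: 'wf-2', status: 'failed' },
  { id: 'wf-3', status: 'pending' },
];
const demoGroups = groupByStatus(demo);
console.log(demoGroups.pending.length, demoGroups.failed.length); // 2 1
```

Pre-filling every key keeps expressions such as `grouped.cancelled.length` safe when no workflow has that status.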

Failure Analysis

const failed = await api.list({
  type: 'workflow',
  status: 'failed',
});

for (const workflow of failed) {
  console.log(`${workflow.title}: ${workflow.failureReason}`);
  
  // Check metadata for step details (stepResults may be missing if no step ran)
  const stepResults = workflow.metadata?.stepResults ?? {};
  const failedStep = Object.entries(stepResults)
    .find(([, result]) => !result.success);
  
  if (failedStep) {
    console.log(`Failed at step ${failedStep[0]}: ${failedStep[1].error}`);
  }
}

Best Practices

1. Make steps idempotent

Steps should be safe to run multiple times:

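import { existsSync, writeFileSync } from 'node:fs';

// Good: checks before acting, so a second run is a no-op
if (!existsSync('output.txt')) {
  writeFileSync('output.txt', data);
}

// Bad: exclusive create ('wx') throws EEXIST on the second run
writeFileSync('output.txt', data, { flag: 'wx' });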

2. Use durable workflows for critical processes

// Critical: Use durable
const deployment = await api.create({
  type: 'workflow',
  title: 'Production deployment',
  ephemeral: false, // Persist for audit
  variables: {},
  createdBy: stewardId,
});

// Routine: Use ephemeral
const cleanup = await api.create({
  type: 'workflow',
  title: 'Daily cleanup',
  ephemeral: true, // Delete after done
  variables: {},
  createdBy: stewardId,
});

3. Validate variables before execution

const workflow = await api.create({
  type: 'workflow',
  title: 'Deploy',
  playbookId,
  variables: {
    environment: 'prod',
    version: '1.2.0',
  },
  createdBy: stewardId,
});

// Validate before starting
if (!isValidVersion(workflow.variables.version)) {
  throw new Error('Invalid version format');
}

await api.updateStatus(workflow.id, { status: 'running' });
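The example above calls `isValidVersion` without defining it. One possible shape for such a helper is sketched below, assuming versions follow a plain `MAJOR.MINOR.PATCH` format with no prefix or pre-release suffix; the pattern and the function body are illustrative, so adjust them to whatever versioning scheme the project actually uses.

```typescript
// Assumed format: three dot-separated numeric parts without leading zeros.
const VERSION_PATTERN = /^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)$/;

// Type guard so callers can treat the value as a string after the check.
function isValidVersion(version: unknown): version is string {
  return typeof version === 'string' && VERSION_PATTERN.test(version);
}

console.log(isValidVersion('1.2.0'));  // true
console.log(isValidVersion('v1.2.0')); // false
```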

4. Store detailed step results in metadata

// Load the current metadata, then update it as steps complete
const workflow = await api.get(workflowId);
await api.update(workflowId, {
  metadata: {
    ...workflow.metadata,
    currentStep: 3,
    stepResults: {
      ...workflow.metadata.stepResults,
      '2': {
        success: true,
        output: 'Tests passed: 142/142',
        duration: 45000,
      },
    },
  },
});
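The merge in the update above can be factored into a pure helper so that the spreading logic lives in one place. This is a sketch under assumptions: `WorkflowMetadata` mirrors the metadata fields shown in this page, `recordStepResult` is a hypothetical name, and advancing `currentStep` only on success is an assumed convention.

```typescript
interface StepResult {
  success: boolean;
  output?: string;
  error?: string;
  duration?: number;
}

interface WorkflowMetadata {
  currentStep?: number;
  completedSteps?: number[];
  stepResults?: Record<string, StepResult>;
}

// Return new metadata with the step result merged in; the input is not mutated.
function recordStepResult(
  metadata: WorkflowMetadata,
  step: number,
  result: StepResult,
): WorkflowMetadata {
  const completed = new Set(metadata.completedSteps ?? []);
  if (result.success) completed.add(step);
  return {
    ...metadata,
    // Assumed convention: advance on success, stay on the failed step otherwise.
    currentStep: result.success ? step + 1 : step,
    completedSteps: Array.from(completed).sort((a, b) => a - b),
    stepResults: { ...(metadata.stepResults ?? {}), [String(step)]: result },
  };
}
```

The returned object can then be passed as the `metadata` field of an update call like the one above.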

5. Set meaningful failure reasons

await api.updateStatus(workflowId, {
  status: 'failed',
  failureReason: 'Tests failed at step 3: authentication_test.ts:42 - Expected 200, got 401',
});

Next Steps

Agent Roles

Learn about Stewards that execute workflows

Orchestration Loop

See how workflows are dispatched

Task Management

Compare workflows to tasks

Dependencies

Understand workflow step dependencies
