Workflows are resumable task sequences that execute in order with durable state. If a step fails, the workflow resumes from that step rather than restarting from scratch.
Workflow vs Task
Task: Single unit of work
One goal/deliverable
One assignee at a time
Binary completion (done or not)
Independent execution
Workflow: Sequence of steps
Multiple stages
Resumable on failure
Durable state tracking
Orchestrated execution
Workflow Lifecycle
Workflows progress through five states:
PENDING
RUNNING
COMPLETED
FAILED
CANCELLED
PENDING: Created but not started
Workflow exists but has not executed yet
Variables resolved from template
Waiting for trigger or manual start
const workflow = await api.create({
  type: 'workflow',
  title: 'Deploy to production',
  status: 'pending',
  ephemeral: false,
  variables: { environment: 'prod' },
  createdBy: stewardId,
});
RUNNING: Active execution
Currently processing steps
Has an active steward assignment
startedAt timestamp set
await api.updateStatus(workflowId, {
  status: 'running',
});
// startedAt set automatically
COMPLETED: All steps finished successfully
Terminal state (no further transitions)
finishedAt timestamp set
Eligible for garbage collection (if ephemeral)
await api.updateStatus(workflowId, {
  status: 'completed',
});
// finishedAt set automatically
FAILED: Execution failed
Terminal state
finishedAt and failureReason set
Requires manual intervention or retry
await api.updateStatus(workflowId, {
  status: 'failed',
  failureReason: 'Tests failed at step 3',
});
CANCELLED: Manually stopped
Terminal state
finishedAt and cancelReason set
May be restarted as a new workflow
await api.updateStatus(workflowId, {
  status: 'cancelled',
  cancelReason: 'Deploy window closed',
});
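The lifecycle above can be read as a small state machine. The sketch below is illustrative only: the transition table and the helper names `canTransition` and `isTerminal` are assumptions, not part of the documented API. In particular, allowing PENDING to move directly to CANCELLED is a guess.

```typescript
// Hypothetical transition table derived from the state descriptions above.
// The real system may allow or forbid transitions differently.
type WorkflowStatus = 'pending' | 'running' | 'completed' | 'failed' | 'cancelled';

const ALLOWED_TRANSITIONS: Record<WorkflowStatus, readonly WorkflowStatus[]> = {
  pending: ['running', 'cancelled'], // assumption: a pending workflow can be cancelled
  running: ['completed', 'failed', 'cancelled'],
  completed: [], // terminal
  failed: [], // terminal
  cancelled: [], // terminal
};

// True when moving from `from` to `to` is listed in the table.
function canTransition(from: WorkflowStatus, to: WorkflowStatus): boolean {
  return ALLOWED_TRANSITIONS[from].includes(to);
}

// A state is terminal when it has no outgoing transitions.
function isTerminal(status: WorkflowStatus): boolean {
  return ALLOWED_TRANSITIONS[status].length === 0;
}
```

A guard like this could run before calling `api.updateStatus`, so that an invalid transition is rejected before any state is written.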
Creating Workflows
From Playbook Template
The recommended approach is to instantiate a workflow from a reusable template:
// 1. Create playbook template
const playbook = await api.create({
  type: 'playbook',
  title: 'Deployment Workflow',
  content: `
# Deployment Workflow
## Steps
1. Run tests
2. Build artifacts
3. Deploy to {{environment}}
4. Smoke test
5. Notify team
## Variables
- environment: prod | staging
`,
  contentType: 'text/markdown',
  createdBy: directorId,
});
// 2. Instantiate workflow from playbook
const workflow = await api.create({
  type: 'workflow',
  title: 'Deploy v1.2.0 to prod',
  playbookId: playbook.id,
  ephemeral: false,
  variables: {
    environment: 'prod',
    version: '1.2.0',
  },
  createdBy: stewardId,
});
Ad-Hoc Workflow
For one-off sequences without a template:
const workflow = await api.create({
  type: 'workflow',
  title: 'Emergency hotfix deployment',
  descriptionRef: descriptionDocId,
  ephemeral: true, // Delete after completion
  variables: {
    branch: 'hotfix/security-patch',
    skipTests: false,
  },
  createdBy: humanId,
});
Workflow Execution
Steward Assignment
Workflows are executed by Stewards:
1. Workflow Created: The workflow starts in PENDING status.
2. Workflow Task Polling: The Dispatch Daemon detects a workflow that needs execution.
3. Steward Assignment: An available steward with a matching focus is assigned.
4. Session Spawn: A steward session is spawned with the workflow context:
await spawner.spawn({
  agentId: stewardId,
  mode: 'headless',
  initialPrompt: renderWorkflowPrompt(workflow),
});
5. Execution: The steward processes the workflow steps, updating status as it progresses.
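To make the polling and assignment steps concrete, here is a minimal in-memory sketch of one dispatch pass. Every type and field name in it (`WorkflowSummary`, `Steward`, `focus`, `available`, `assignPendingWorkflows`) is hypothetical. The actual Dispatch Daemon is not shown in this document and may match stewards to workflows quite differently.

```typescript
// Hypothetical shapes, for illustration only.
interface WorkflowSummary {
  id: string;
  status: string;
  focus: string; // assumption: workflows carry a focus that stewards can match
}

interface Steward {
  id: string;
  focus: string;
  available: boolean;
}

interface Assignment {
  workflowId: string;
  stewardId: string;
}

// One dispatch pass: give each pending workflow the first available steward
// with a matching focus. A steward receives at most one workflow per pass.
function assignPendingWorkflows(workflows: WorkflowSummary[], stewards: Steward[]): Assignment[] {
  const taken = new Set<string>();
  const assignments: Assignment[] = [];
  for (const workflow of workflows) {
    if (workflow.status !== 'pending') continue;
    const steward = stewards.find(
      (s) => s.available && s.focus === workflow.focus && !taken.has(s.id),
    );
    if (!steward) continue; // stays pending until a later pass
    taken.add(steward.id);
    assignments.push({ workflowId: workflow.id, stewardId: steward.id });
  }
  return assignments;
}
```

Each returned assignment would then lead to a session spawn like the `spawner.spawn` call shown in step 4.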
Resumption on Failure
Workflows preserve state and can resume:
// Workflow state in metadata
{
  metadata: {
    currentStep: 3,
    completedSteps: [1, 2],
    stepResults: {
      '1': { success: true, output: '...' },
      '2': { success: true, output: '...' },
      '3': { success: false, error: 'Test failed' },
    },
  },
}
// On resume, steward continues from step 3
// Not from step 1
Resumption requires idempotent steps: running a step more than once must leave the system in the same state as running it once.
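The following sketch shows how a runner could use the stored state to resume. It is a simplified, synchronous illustration, not the steward's actual implementation. The `WorkflowState` shape mirrors the metadata example above, while `Step` and `runFrom` are hypothetical names. It also assumes `currentStep` advances past a step once that step succeeds.

```typescript
interface StepResult {
  success: boolean;
  output?: string;
  error?: string;
}

interface WorkflowState {
  currentStep: number; // 1-based index of the next step to run
  completedSteps: number[];
  stepResults: Record<string, StepResult>;
}

// Hypothetical step signature: returns its output, throws on failure.
type Step = () => string;

// Runs steps starting at state.currentStep and returns the updated state.
// Completed steps are skipped; execution stops at the first failure.
function runFrom(steps: Step[], state: WorkflowState): WorkflowState {
  const next: WorkflowState = {
    currentStep: state.currentStep,
    completedSteps: [...state.completedSteps],
    stepResults: { ...state.stepResults },
  };
  for (let n = state.currentStep; n <= steps.length; n++) {
    if (!next.completedSteps.includes(n)) {
      try {
        const output = steps[n - 1]();
        next.stepResults[String(n)] = { success: true, output };
        next.completedSteps.push(n);
      } catch (err) {
        const error = err instanceof Error ? err.message : String(err);
        next.stepResults[String(n)] = { success: false, error };
        next.currentStep = n; // resume here next time
        return next;
      }
    }
    next.currentStep = n + 1;
  }
  return next;
}
```

Calling `runFrom` again with the returned state retries only the failed step and everything after it, which is why each step must tolerate being run more than once.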
Ephemeral vs Durable
Workflows can be ephemeral (temporary) or durable (persistent):
Ephemeral: Temporary execution, not synced to JSONL.
Use for:
Short-lived automation
Event-triggered actions
Work that can be garbage collected after completion
const workflow = await api.create({
  type: 'workflow',
  title: 'Daily cleanup',
  ephemeral: true,
  variables: {},
  createdBy: stewardId,
});
// After completion, eligible for GC
if (isEligibleForGarbageCollection(workflow)) {
  await api.delete(workflow.id);
}
Benefits:
Reduced storage overhead
Automatic cleanup
Faster performance (no JSONL writes)
Durable: Persistent execution, synced to JSONL.
Use for:
Critical workflows
Audit trails
Long-running processes
Cross-session workflows
const workflow = await api.create({
  type: 'workflow',
  title: 'Quarterly release',
  ephemeral: false,
  variables: { quarter: 'Q1-2026' },
  createdBy: directorId,
});
// Persists across restarts
// Visible in git history
Benefits:
Survives restarts
Full history tracking
Git-tracked state
An ephemeral workflow can later be promoted to durable:
// Start as ephemeral
const workflow = await api.create({
  type: 'workflow',
  title: 'Test workflow',
  ephemeral: true,
  variables: {},
  createdBy: stewardId,
});
// Decide to keep it
await api.update(workflow.id, {
  ephemeral: false,
});
// Now synced to JSONL
Variables and Templates
Workflows support variable interpolation from playbooks:
Variable Definition
In playbook markdown:
# Deploy {{service}} to {{environment}}
## Steps
1. Build {{service}} from branch {{branch}}
2. Run tests in {{environment}} mode
3. Deploy to {{environment}}
4. Notify #{{notification_channel}}
## Variables
- service: api | web | worker
- environment: prod | staging | dev
- branch: main | release/*
- notification_channel: deployments
Variable Resolution
At workflow creation:
const workflow = await api.create({
  type: 'workflow',
  title: 'Deploy API to prod',
  playbookId: deployPlaybookId,
  variables: {
    service: 'api',
    environment: 'prod',
    branch: 'release/v1.2.0',
    notification_channel: 'deployments',
  },
  createdBy: stewardId,
});
// Steward sees resolved template:
// "Deploy api to prod"
// "Build api from branch release/v1.2.0"
// etc.
Variables are resolved once at creation time and stored in the workflow.
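As a rough illustration of what resolution involves, the sketch below substitutes `{{name}}` placeholders in a template string. `renderTemplate` is a hypothetical helper written for this example. The library's own resolver is not shown here and may treat missing variables, whitespace, or escaping differently.

```typescript
type VariableValue = string | number | boolean;

// Replaces each {{name}} placeholder with the matching variable value.
// Throws if the template references a variable that was not supplied.
function renderTemplate(template: string, variables: Record<string, VariableValue>): string {
  const placeholder = /\{\{\s*([A-Za-z_][A-Za-z0-9_]*)\s*\}\}/g;
  return template.replace(placeholder, (_match: string, name: string) => {
    if (!Object.prototype.hasOwnProperty.call(variables, name)) {
      throw new Error(`Missing workflow variable: ${name}`);
    }
    return String(variables[name]);
  });
}
```

Failing fast on a missing variable is a design choice in this sketch: it surfaces a bad variable set at creation time rather than leaving a literal `{{environment}}` in a step the steward later reads.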
Workflow Patterns
Deploy Workflow
const deployWorkflow = await api.create({
  type: 'workflow',
  title: `Deploy ${version} to ${env}`,
  playbookId: 'playbook-deploy',
  ephemeral: false,
  variables: {
    version,
    environment: env,
    runTests: true,
    notifySlack: true,
  },
  createdBy: stewardId,
});
Steps:
1. Pull latest code
2. Run test suite
3. Build artifacts
4. Deploy to target environment
5. Run smoke tests
6. Notify team in Slack
Data Migration Workflow
const migrationWorkflow = await api.create({
  type: 'workflow',
  title: 'Migrate users table',
  playbookId: 'playbook-migration',
  ephemeral: false,
  variables: {
    tableName: 'users',
    batchSize: 1000,
    dryRun: true,
  },
  createdBy: humanId,
});
Steps:
1. Backup current data
2. Run migration in dry-run mode
3. Validate results
4. Run actual migration in batches
5. Verify data integrity
6. Clean up old data
Periodic Maintenance Workflow
const maintenanceWorkflow = await api.create({
  type: 'workflow',
  title: 'Weekly maintenance',
  playbookId: 'playbook-maintenance',
  ephemeral: true, // Delete after completion
  variables: {
    cleanupThreshold: '7d',
    notifyOnFailure: true,
  },
  createdBy: stewardId,
});
Steps:
1. Clean up old logs
2. Vacuum database
3. Update dependencies
4. Run health checks
5. Generate report
Querying Workflows
By Status
// Active workflows
const active = await api.list({
  type: 'workflow',
  status: ['pending', 'running'],
});
// Completed workflows
const completed = await api.list({
  type: 'workflow',
  status: 'completed',
});
By Playbook
// All instances of a playbook
const deployments = await api.list({
  type: 'workflow',
  playbookId: deployPlaybookId,
});
Ephemeral vs Durable
// Only durable workflows
const durable = await api.list({
  type: 'workflow',
  ephemeral: false,
});
// Eligible for garbage collection
const gcCandidates = await api.list({
  type: 'workflow',
  ephemeral: true,
  status: ['completed', 'failed', 'cancelled'],
});
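The same selection can be expressed as a client-side predicate, which may be handy when workflows are already loaded in memory. This is a sketch under the assumption that eligibility means "ephemeral and in a terminal state", as the query above suggests. `isGcCandidate` is a hypothetical name, and the library's `isEligibleForGarbageCollection` may apply additional rules.

```typescript
type WorkflowStatus = 'pending' | 'running' | 'completed' | 'failed' | 'cancelled';

// Minimal workflow shape needed for the check (hypothetical).
interface WorkflowLike {
  id: string;
  ephemeral: boolean;
  status: WorkflowStatus;
}

const TERMINAL_STATUSES: ReadonlySet<WorkflowStatus> = new Set<WorkflowStatus>([
  'completed',
  'failed',
  'cancelled',
]);

// Assumption: a workflow is collectable when it is ephemeral and finished.
function isGcCandidate(workflow: WorkflowLike): boolean {
  return workflow.ephemeral && TERMINAL_STATUSES.has(workflow.status);
}

// Returns the IDs of all collectable workflows in a list.
function selectGcCandidateIds(workflows: WorkflowLike[]): string[] {
  return workflows.filter(isGcCandidate).map((w) => w.id);
}
```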
Workflow Monitoring
Duration Tracking
const workflow = await api.get(workflowId);
if (workflow.startedAt) {
  const duration = workflow.finishedAt
    ? new Date(workflow.finishedAt).getTime() - new Date(workflow.startedAt).getTime()
    : Date.now() - new Date(workflow.startedAt).getTime();
  console.log(`Duration: ${duration}ms`);
}
Status Summary
import { groupByWorkflowStatus } from '@stoneforge/core';
const workflows = await api.list({ type: 'workflow' });
const grouped = groupByWorkflowStatus(workflows);
console.log(`
  Pending: ${grouped.pending.length}
  Running: ${grouped.running.length}
  Completed: ${grouped.completed.length}
  Failed: ${grouped.failed.length}
  Cancelled: ${grouped.cancelled.length}
`);
Failure Analysis
const failed = await api.list({
  type: 'workflow',
  status: 'failed',
});
for (const workflow of failed) {
  console.log(`${workflow.title}: ${workflow.failureReason}`);
  // Check metadata for step details; fall back to an empty object
  // when no step results were recorded
  const stepResults = workflow.metadata?.stepResults ?? {};
  const failedStep = Object.entries(stepResults)
    .find(([_, result]) => !result.success);
  if (failedStep) {
    console.log(`Failed at step ${failedStep[0]}: ${failedStep[1].error}`);
  }
}
Best Practices
1. Make steps idempotent
Steps should be safe to run multiple times:
// Good: Checks before acting
if (!fileExists('output.txt')) {
  writeFile('output.txt', data);
}
// Bad: Fails on second run
writeFile('output.txt', data); // Error if exists
2. Use durable workflows for critical processes
// Critical: Use durable
const deployment = await api.create({
  type: 'workflow',
  title: 'Production deployment',
  ephemeral: false, // Persist for audit
  variables: {},
  createdBy: stewardId,
});
// Routine: Use ephemeral
const cleanup = await api.create({
  type: 'workflow',
  title: 'Daily cleanup',
  ephemeral: true, // Delete after done
  variables: {},
  createdBy: stewardId,
});
3. Validate variables before execution
const workflow = await api.create({
  type: 'workflow',
  title: 'Deploy',
  playbookId,
  variables: {
    environment: 'prod',
    version: '1.2.0',
  },
  createdBy: stewardId,
});
// Validate before starting
if (!isValidVersion(workflow.variables.version)) {
  throw new Error('Invalid version format');
}
await api.updateStatus(workflow.id, { status: 'running' });
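Validation can also be driven by the playbook itself. The sketch below parses a `## Variables` section in the format used by the playbook examples on this page and checks supplied values against it. It assumes that `a | b` lists the allowed values and that a trailing `*` is a prefix wildcard. Both are interpretations for this example, and `parseDeclaredVariables` and `validateVariables` are hypothetical helpers.

```typescript
type DeclaredVariables = Record<string, string[]>;

// Collects "- name: a | b | c" lines that appear under "## Variables".
function parseDeclaredVariables(playbookMarkdown: string): DeclaredVariables {
  const declared: DeclaredVariables = {};
  let inVariables = false;
  for (const rawLine of playbookMarkdown.split('\n')) {
    const line = rawLine.trim();
    if (line.startsWith('## ')) {
      inVariables = line === '## Variables';
      continue;
    }
    if (!inVariables) continue;
    const match = /^-\s*([A-Za-z_][A-Za-z0-9_]*):\s*(.+)$/.exec(line);
    if (match) {
      declared[match[1]] = match[2].split('|').map((v) => v.trim());
    }
  }
  return declared;
}

// Assumption: a pattern ending in "*" matches any value with that prefix.
function matchesPattern(value: string, pattern: string): boolean {
  return pattern.endsWith('*') ? value.startsWith(pattern.slice(0, -1)) : value === pattern;
}

// Returns a list of human-readable problems; an empty list means valid.
function validateVariables(declared: DeclaredVariables, variables: Record<string, unknown>): string[] {
  const errors: string[] = [];
  for (const [name, allowed] of Object.entries(declared)) {
    const value = variables[name];
    if (value === undefined) {
      errors.push(`Missing variable: ${name}`);
    } else if (!allowed.some((pattern) => matchesPattern(String(value), pattern))) {
      errors.push(`Invalid value for ${name}: ${String(value)}`);
    }
  }
  return errors;
}
```

Returning a list of errors rather than throwing on the first one lets the caller report every problem in a single pass before deciding whether to start the workflow.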
4. Store detailed step results in metadata
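One way to keep step results consistent is to route every update through a single helper. The sketch below follows the metadata shape shown under Resumption on Failure. `recordStepResult` is a hypothetical helper, and advancing `currentStep` on success is an assumption about how the counter is meant to behave.

```typescript
interface StepResult {
  success: boolean;
  output?: string;
  error?: string;
}

interface WorkflowMetadata {
  currentStep: number;
  completedSteps: number[];
  stepResults: Record<string, StepResult>;
}

// Returns a new metadata object with the result of `step` recorded.
// On success the step is marked complete and currentStep advances;
// on failure currentStep stays on the failed step so a resume retries it.
function recordStepResult(
  metadata: WorkflowMetadata,
  step: number,
  result: StepResult,
): WorkflowMetadata {
  const alreadyCompleted = metadata.completedSteps.includes(step);
  const completedSteps =
    result.success && !alreadyCompleted
      ? [...metadata.completedSteps, step]
      : metadata.completedSteps;
  return {
    currentStep: result.success ? step + 1 : step,
    completedSteps,
    stepResults: { ...metadata.stepResults, [String(step)]: result },
  };
}
```

The returned object could then be persisted, for example with `api.update`, assuming that call accepts a `metadata` field.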
5. Set meaningful failure reasons
await api.updateStatus(workflowId, {
  status: 'failed',
  failureReason: 'Tests failed at step 3: authentication_test.ts:42 - Expected 200, got 401',
});
Next Steps
Agent Roles: Learn about Stewards that execute workflows
Orchestration Loop: See how workflows are dispatched
Task Management: Compare workflows to tasks
Dependencies: Understand workflow step dependencies