Documentation Index
Fetch the complete documentation index at: https://mintlify.com/cadence-workflow/cadence/llms.txt
Use this file to discover all available pages before exploring further.
Service Overview
Cadence consists of four core services that work together to provide workflow orchestration capabilities. Each service is stateless, horizontally scalable, and has a specific set of responsibilities.
Frontend Service
The Frontend service acts as the API gateway for all client interactions with Cadence.
Responsibilities
- API Gateway: Exposes public APIs for workflow and activity operations
- Request Validation: Validates all incoming requests
- Rate Limiting: Enforces per-domain rate limits
- Authentication & Authorization: Handles security concerns
- Cluster Redirection: Routes requests to appropriate cluster in multi-region setup
- Domain Management: Handles domain registration and updates
Key Components
API Layers
The frontend implements multiple decorator layers:
- Base Handler: Core API implementation
- Version Check: Ensures client compatibility
- Rate Limiter: Enforces quota limits
- Metrics: Captures telemetry data
- Cluster Redirection: Handles multi-cluster routing
- Access Control: Authorization enforcement
Configuration
Rate Limiting
The frontend implements a multi-stage rate limiting system:
- User RPS: Rate limit for client API calls
- Worker RPS: Rate limit for worker poll requests
- Visibility RPS: Rate limit for visibility queries
- Async RPS: Rate limit for async workflow operations
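Each of the stages above is, at its core, a quota check keyed by domain. A minimal sketch of one such per-domain limiter, using a plain token bucket (the class, rates, and domain name are illustrative; Cadence's actual limiter composes separate user, worker, visibility, and async quotas):

```python
import time

class TokenBucket:
    """Toy per-domain token-bucket limiter (illustrative only)."""

    def __init__(self, rps: float, burst: float):
        self.rps = rps            # refill rate, tokens per second
        self.burst = burst        # bucket capacity
        self.tokens = burst       # start full
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rps)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

# One limiter per domain, keyed by domain name (hypothetical domain).
limiters = {"orders": TokenBucket(rps=2, burst=2)}

allowed = sum(limiters["orders"].allow() for _ in range(5))
print(allowed)  # the burst of 2 absorbs 2 of the 5 back-to-back calls
```

Requests rejected by a limiter like this surface to clients as rate-limit errors rather than being queued, which keeps the frontend stateless.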
APIs Exposed
Workflow APIs
- StartWorkflowExecution
- SignalWorkflowExecution
- TerminateWorkflowExecution
- GetWorkflowExecutionHistory
- DescribeWorkflowExecution
Domain APIs
- RegisterDomain
- DescribeDomain
- UpdateDomain
- ListDomains
- DeprecateDomain
Task List APIs
- PollForDecisionTask
- PollForActivityTask
- RespondDecisionTaskCompleted
- RespondActivityTaskCompleted
History Service
The History service is the core workflow execution engine that maintains workflow state and makes execution decisions.
Responsibilities
- Workflow State Management: Maintains mutable state for active workflows
- Event History: Persists immutable workflow history events
- Decision Processing: Processes decisions from workflow workers
- Task Generation: Creates decision and activity tasks
- Timer Management: Handles workflow and activity timeouts
- Shard Ownership: Manages history shard ownership
Shard-Based Architecture
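Shard ownership starts from a stable hash of the workflow's identity: every workflow maps deterministically to one shard, so all of its state lives in one place. A sketch of that mapping (md5 and the shard count here are illustrative; Cadence uses its own internal hash):

```python
import hashlib

NUM_HISTORY_SHARDS = 16  # fixed when the cluster is first provisioned

def shard_for(domain_id: str, workflow_id: str) -> int:
    """Deterministically map a workflow to a history shard."""
    key = f"{domain_id}:{workflow_id}".encode()
    return int.from_bytes(hashlib.md5(key).digest()[:8], "big") % NUM_HISTORY_SHARDS

s = shard_for("domain-A", "order-42")
print(s, shard_for("domain-A", "order-42") == s)  # same workflow, same shard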
Key Components
Workflow Execution State
The History service maintains two types of state:
- Mutable State: Current workflow execution state
  - Pending decision/activity tasks
  - Timers
  - Signals
  - Child workflows
  - Execution info (status, timeouts, etc.)
- Immutable History: Event log of all workflow actions
  - WorkflowExecutionStarted
  - DecisionTaskScheduled
  - ActivityTaskStarted
  - WorkflowExecutionCompleted
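The relationship between the two can be sketched as an append-only event log plus a snapshot derived from it (event names are from the list above; the reducer logic is a simplified illustration):

```python
history = []  # immutable: append-only event log
mutable = {"status": None, "pending_decisions": 0}  # mutable: derived snapshot

def apply(event: str):
    """Append the event to history, then update the derived mutable state."""
    history.append(event)
    if event == "WorkflowExecutionStarted":
        mutable["status"] = "RUNNING"
    elif event == "DecisionTaskScheduled":
        mutable["pending_decisions"] += 1
    elif event == "WorkflowExecutionCompleted":
        mutable["status"] = "COMPLETED"

for e in ["WorkflowExecutionStarted", "DecisionTaskScheduled",
          "WorkflowExecutionCompleted"]:
    apply(e)

print(mutable["status"], len(history))
```

Because mutable state is always derivable from the event log, history is the source of truth and the snapshot can be rebuilt after a failure.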
Task Queues
History service manages multiple task queues per shard:
Transfer Queue
- Processes tasks that need immediate execution
- Examples: Decision tasks, activity tasks, close execution
- FIFO processing within each shard
Timer Queue
- Processes time-based tasks
- Examples: Workflow timeout, activity timeout, retry timer
- Priority queue ordered by fire time
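A fire-time-ordered priority queue like the one described above can be sketched with a min-heap (task names and timestamps are illustrative):

```python
import heapq

# Timer queue sketch: a min-heap keyed by fire time.
timers = []
heapq.heappush(timers, (30, "activity-timeout"))
heapq.heappush(timers, (10, "retry-timer"))
heapq.heappush(timers, (60, "workflow-timeout"))

# Popping always yields the timer with the earliest fire time.
fired = [heapq.heappop(timers)[1] for _ in range(len(timers))]
print(fired)
```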
Configuration
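The key configuration knob for the History service is the shard count. A minimal static-config sketch, assuming the standard `numHistoryShards` setting (referenced under Scalability Considerations below) lives under `persistence`; the surrounding layout is illustrative:

```yaml
persistence:
  # Fixed at first deployment; changing it later requires a cluster migration.
  numHistoryShards: 4096
```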
Scalability Considerations
- Maximum Scale: Limited by numHistoryShards
- Shard Distribution: Automatic rebalancing when instances join/leave
- Graceful Shutdown: Drains shards before stopping
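The automatic rebalancing above relies on consistent hashing of shard IDs onto the current instance set: when an instance leaves, only the shards it owned move. A toy sketch of that property (hashes, point counts, and host names are illustrative; Cadence's real membership layer is Ringpop, covered later):

```python
import bisect
import hashlib

def h(s: str) -> int:
    """Stable 64-bit hash (illustrative; not Cadence's actual hash)."""
    return int.from_bytes(hashlib.md5(s.encode()).digest()[:8], "big")

def build_ring(instances, points=100):
    """Place each instance at many pseudo-random points on a hash ring."""
    return sorted((h(f"{inst}#{i}"), inst) for inst in instances for i in range(points))

def owner(ring, shard_id: int) -> str:
    """A shard is owned by the next instance point clockwise on the ring."""
    keys = [k for k, _ in ring]
    idx = bisect.bisect(keys, h(f"shard-{shard_id}")) % len(ring)
    return ring[idx][1]

ring3 = build_ring(["hist-1", "hist-2", "hist-3"])
ring2 = build_ring(["hist-1", "hist-2"])  # hist-3 has left the cluster

moved = sum(1 for s in range(1024) if owner(ring3, s) != owner(ring2, s))
print(moved)  # only the shards hist-3 owned change hands; the rest stay put
```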
Matching Service
The Matching service routes tasks from History to Workers using task lists.
Responsibilities
- Task List Management: Maintains task lists for decisions and activities
- Task Routing: Delivers tasks to polling workers
- Sync Match: Optimizes latency by matching tasks with waiting pollers
- Task Persistence: Stores unmatched tasks in database
- Load Balancing: Distributes tasks across available workers
Task List Architecture
Key Components
Sync Match Optimization
Sync Match: When a task arrives while workers are already polling, the task is handed directly to a waiting poller:
- Near-zero latency task delivery
- Reduced database load
- Better throughput
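The sync-match decision can be sketched as a bounded hand-off: deliver directly if a poller is parked waiting, otherwise fall back to persistence (here a queue stands in for waiting pollers and a plain list for the database; all names are illustrative):

```python
import queue

waiting_pollers = queue.Queue()  # pollers parked waiting for a task
database = []                    # fallback store for unmatched tasks

def poll(worker_id: str) -> queue.Queue:
    """A worker announces it is waiting by parking a one-slot mailbox."""
    mailbox = queue.Queue(maxsize=1)
    waiting_pollers.put((worker_id, mailbox))
    return mailbox

def offer_task(task: str) -> str:
    """Sync match if a poller is waiting; otherwise persist the task."""
    try:
        worker_id, mailbox = waiting_pollers.get_nowait()
        mailbox.put(task)        # direct hand-off: no database write needed
        return f"sync-matched to {worker_id}"
    except queue.Empty:
        database.append(task)    # async path: store for a later poll
        return "persisted"

mailbox = poll("worker-1")
print(offer_task("decision-task-1"))  # hand-off succeeds: worker was waiting
print(offer_task("decision-task-2"))  # no poller waiting: task is persisted
print(mailbox.get_nowait())           # the worker receives decision-task-1
```

This is why a drop in the sync-match rate is a scaling signal: it means tasks are arriving faster than pollers can park.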
Task List Types
- Decision Task List: Routes decision tasks
  - One per workflow task list
  - Workers poll for workflow execution decisions
- Activity Task List: Routes activity tasks
  - Can be different from the decision task list
  - Workers poll for activity execution
Configuration
Scalability
- Task List Partitioning: High-throughput task lists can be partitioned
- Isolation Groups: Route tasks to specific worker pools
- Dynamic Partitioning: Automatic partition adjustment based on load
Worker Service
The Worker service handles internal background processing tasks for the Cadence system.
Responsibilities
- Replication: Processes cross-datacenter replication tasks
- Indexing: Indexes workflow data to Elasticsearch/Pinot for visibility
- Archival: Archives old workflow histories to blob storage
- System Workflows: Runs internal system workflows
- Domain Replication: Replicates domain metadata across clusters
Key Components
Replicator
Handles cross-cluster replication:
- History service writes replication tasks to Kafka
- Worker service consumes from Kafka
- Applies tasks to target cluster
- Handles conflict resolution
Indexer
Indexes workflow data into Elasticsearch or Pinot for advanced visibility queries.
Archiver
Archives workflow histories to long-term storage:
- Local filesystem
- AWS S3
- Google Cloud Storage
- Custom implementations
Scanner
Performs data consistency checks and cleanup:
- Task List Scanner: Removes orphaned task list entries
- History Scanner: Validates workflow history integrity
- Timer Scanner: Checks for stuck timers
- Execution Scanner: Identifies zombie workflows
Configuration
System Domains
Worker service creates internal system domains:
- cadence-system: Core system workflows
- cadence-batcher: Batch operations
- cadence-canary: Health checks
Inter-Service Communication
Communication Patterns
RPC Configuration
Protocols Supported:
- gRPC (recommended)
- TChannel (legacy)
Service Discovery
Services discover each other using Ringpop.
Deployment Considerations
Resource Requirements
| Service | CPU | Memory | Disk | Network |
|---|---|---|---|---|
| Frontend | Low-Medium | Low | Minimal | High |
| History | High | High | Minimal | Medium |
| Matching | Low-Medium | Low-Medium | Minimal | Medium |
| Worker | Medium | Medium | Low | Medium |
Scaling Guidelines
- Frontend: Scale based on RPS
  - Start with 2-3 instances
  - Add instances as traffic increases
- History: Scale based on shard count
  - Each instance should own 100-200 shards
  - More instances = better distribution
- Matching: Scale based on task throughput
  - Start with 2-3 instances
  - Scale if sync match rate drops
- Worker: Scale based on background load
  - Replication lag
  - Indexing lag
  - Archival backlog
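The History sizing rule above translates into a quick capacity estimate (the 100-200 shards-per-instance band is from the guideline; the shard count is illustrative):

```python
import math

num_history_shards = 4096        # fixed at provisioning (illustrative value)
shards_per_instance = (100, 200)  # target band from the guideline above

# Fewest instances that keep each under 200 shards,
# and the point past which extra instances own fewer than 100 each.
min_instances = math.ceil(num_history_shards / shards_per_instance[1])
max_useful = num_history_shards // shards_per_instance[0]
print(min_instances, max_useful)
```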
Next Steps
Persistence Layer
Learn about database design and configuration
Cross-DC Replication
Set up multi-region deployments