Cadence Canary is a suite of automated workflows that continuously validate the health and correctness of your Cadence cluster. It tests core features and advanced capabilities to detect issues before they impact production workloads.
Overview
The canary suite provides:
- Continuous Monitoring: Periodic execution of validation workflows
- Feature Coverage: Tests for core and advanced Cadence features
- Early Detection: Identifies issues before user impact
- Operational Confidence: Validates deployments and upgrades
- Automated Alerting: Integrates with monitoring systems
Setup
Prerequisites
- Running Cadence cluster
- Advanced Visibility (Elasticsearch) for some tests
- History Archival for archival tests
- Visibility Archival for visibility archival tests
Running Canary
Option 1: Docker Compose
The easiest setup for local development.
Option 2: Docker Image
Intended for production deployments.
Option 3: Build from Source
Configuration
Edit config/canary/development.yaml. Keep in mind:
- Archival tests always use the canary-archival-domain domain
- Exclude tests for features not enabled on your cluster
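The exclusion rule above amounts to filtering the suite by a list of disabled test names. The following is a minimal Python sketch of that idea, not the canary's own code: the test names and the EXCLUDES list are hypothetical placeholders, and the actual keys in config/canary/development.yaml may be named differently.

```python
def select_tests(all_tests, excludes):
    """Return the tests to run, keeping order and dropping excluded names."""
    excluded = set(excludes)
    return [name for name in all_tests if name not in excluded]


# Hypothetical test names, for illustration only.
ALL_TESTS = ["echo", "signal", "visibility", "searchAttributes", "historyArchival"]

# A cluster without Elasticsearch or History Archival would skip these.
EXCLUDES = ["searchAttributes", "historyArchival"]

print(select_tests(ALL_TESTS, EXCLUDES))
```

Excluding a test this way keeps a missing optional feature from showing up as a permanent failure in the canary results.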
Canary Test Cases
Sanity Suite (Starter)
Main test suite that launches all test cases.
Purpose: One-stop validation of all Cadence features.
The suite can also be run manually. To track its results, monitor the workflow_success metric with workflowType = "workflow.sanity".
Cron Canary
Periodically runs the Sanity suite.
Features:
- Continuous validation
- Fixed workflow ID: cadence.canary.cron
- Configurable schedule
- Automatic failure detection
Echo Test
Tests basic workflow functionality.
What it tests:
- Workflow execution
- Activity execution
- Result passing
Signal Test
Tests signal delivery.
What it tests:
- SignalWorkflowExecution API
- Signal reception and handling
- Signal buffering
Visibility Test
Tests basic visibility features.
What it tests:
- Workflow listing
- Status filtering
- Time range queries
- Basic visibility (no Elasticsearch required)
Search Attributes Test
Tests advanced visibility.
What it tests:
- Custom search attributes
- Complex queries
- Elasticsearch integration
Concurrent Execution Test
Tests parallel activity execution.
What it tests:
- Concurrent activities
- Activity results aggregation
- Parallel execution limits
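The pattern this test exercises (fan out several activities, cap how many run at once, then aggregate the results) can be illustrated with a plain Python analogy. This is not Cadence client code; the activity function and the max_parallel parameter are hypothetical stand-ins.

```python
from concurrent.futures import ThreadPoolExecutor


def activity(n):
    """Stand-in for an activity: returns the square of its input."""
    return n * n


def run_concurrently(inputs, max_parallel):
    """Run activity() over inputs with at most max_parallel workers."""
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        # pool.map returns results in input order, whatever the finish order.
        return list(pool.map(activity, inputs))


results = run_concurrently(range(5), max_parallel=2)
print(results, sum(results))
```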
Query Test
Tests workflow query feature.
What it tests:
- Query registration
- Query execution
- Query consistency
Timeout Test
Tests activity timeout enforcement.
What it tests:
- Activity timeout configuration
- Timeout enforcement
- Timeout handling
Local Activity Test
Tests local activity execution.
What it tests:
- Local activity scheduling
- Fast execution path
- No per-activity scheduling events in workflow history (the result is recorded as a marker)
Cancellation Test
Tests workflow cancellation.
What it tests:
- Cancellation requests
- Cancellation propagation
- Activity cancellation
- Child workflow cancellation
Retry Test
Tests activity retry policies.
What it tests:
- Retry policy configuration
- Automatic retries
- Exponential backoff
- Maximum attempts
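To make the interaction between exponential backoff and maximum attempts concrete, here is a small Python sketch that computes the wait before each retry. The parameter names are illustrative and the values are made up; this is a model of the general technique, not the canary's actual retry policy.

```python
def retry_delays(initial_interval, backoff_coefficient, maximum_interval, maximum_attempts):
    """Return the wait in seconds before each retry.

    maximum_attempts counts the first try, so there are at most
    maximum_attempts - 1 retries.
    """
    delays = []
    for retry in range(maximum_attempts - 1):
        delay = initial_interval * (backoff_coefficient ** retry)
        # Cap each wait so the backoff cannot grow without bound.
        delays.append(min(delay, maximum_interval))
    return delays


print(retry_delays(1.0, 2.0, 10.0, 6))
```

With these example values the waits double on each retry until they reach the cap.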
Reset Test
Tests workflow reset feature.
What it tests:
- Workflow reset to decision
- Reset validation
- History replay after reset
History Archival Test
Tests history archival.
What it tests:
- History archival to storage
- Archived history retrieval
- Archival URI validation
Domain: canary-archival-domain
Run manually:
Visibility Archival Test
Tests visibility archival.
What it tests:
- Visibility record archival
- Archived visibility queries
- Archival URI validation
Domain: canary-archival-domain
Run manually:
Batch Test
Tests batch operations.
What it tests:
- Batch workflow termination
- Batch workflow signaling
- Query-based batch operations
Monitoring and Alerting
Key Metrics
Monitor these metrics for canary health:
Workflow Success Rate:
Alerting Rules
Prometheus Alert Example:
Production Monitoring
Recommended Setup:
- Deploy canary in production cluster
- Configure cron schedule (e.g., every 5 minutes)
- Monitor the workflow_success metric
- Alert on failures or high latency
- Exclude tests for disabled features
- Use separate alerting for archival tests
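The alerting logic in this setup can be sketched as a success-rate check over one evaluation window. The Python below is a hedged illustration only: the success count stands for workflow_success, while the failure count, the 0.99 threshold, and the function names are assumptions rather than values from the canary or from any Prometheus rule.

```python
def success_rate(successes, failures):
    """Fraction of canary runs that succeeded, or None if nothing ran."""
    total = successes + failures
    if total == 0:
        return None
    return successes / total


def should_alert(successes, failures, min_rate=0.99):
    """Alert when the rate drops below min_rate or when no runs were seen."""
    rate = success_rate(successes, failures)
    # A silent canary is treated as a failure: no data means no validation.
    return rate is None or rate < min_rate


print(should_alert(100, 0), should_alert(95, 5), should_alert(0, 0))
```

Treating "no executions" as an alert condition covers the case where the cron workflow itself has stopped, which a pure failure-count alert would miss.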
Best Practices
Configuration
- Exclude Unavailable Features: Don't run tests for features that are not enabled on the cluster
- Appropriate Frequency: Balance coverage vs. load (5-30 minutes typical)
- Realistic Timeouts: Set timeouts that match the expected execution time
- Resource Allocation: Ensure the canary doesn't compete with production workloads for resources
Monitoring
- Alert on Failures: Set up immediate alerts for test failures
- Track Latency Trends: Monitor for performance degradation
- Dashboard: Create dedicated canary dashboard
- Test-Specific Metrics: Monitor individual test types
Operations
- Post-Deployment: Run canary immediately after deployments
- Pre-Upgrade: Verify canary passes before upgrades
- Incident Response: Check canary status during incidents
- Capacity Planning: Use canary metrics for baseline performance
Troubleshooting
Canary Not Running
Problem: No canary executions
Solution:
Test Failures
Problem: Specific test consistently failing
Solution:
- Run test manually for debugging
- Check if feature is properly configured
- Verify required dependencies (e.g., Elasticsearch)
- Review server logs for errors
- Check domain configuration
Archival Tests Failing
Problem: Archival tests fail but others pass
Solution:
- Verify archival is enabled in server config
- Check canary-archival-domain exists and has archival enabled
- Verify archival storage (S3/GCS/filestore) is accessible
- Review archival worker logs
- Test archival manually with CLI
Next Steps
- Learn about Benchmarking for load testing
- Configure Archival for tested features
- Set up Dynamic Config for tuning
- Monitor with Web UI