Schedulers manage task distribution to workers for remote execution.
Scheduler Definition
Unique identifier for the scheduler. Referenced in server execution services and worker configurations.
Scheduler type and configuration. Only one type can be specified.
Scheduler Types
Simple
GRPC
CacheLookup
PropertyModifier
Core scheduler for worker task distribution.{
name: "MAIN_SCHEDULER",
simple: {
supported_platform_properties: {
cpu_count: "minimum",
cpu_arch: "exact",
OSFamily: "priority"
},
retain_completed_for_s: 60,
worker_timeout_s: 5,
max_job_retries: 3
}
}
supported_platform_properties
Map of platform property names to match strategies. Keys must match worker-published capabilities.Property types:
"minimum" - Worker value must be >= task value (numeric)
"exact" - Worker value must exactly match task value (string)
"priority" - Informational, doesn’t restrict matching
"ignore" - Allows property in tasks without requiring worker support
supported_platform_properties: {
cpu_count: "minimum", // Worker needs >= cores
memory_kb: "minimum", // Worker needs >= memory
cpu_arch: "exact", // Must match exactly (e.g., "x86_64")
gpu_model: "exact", // Must match exactly
OSFamily: "priority", // Preference hint
"container-image": "priority"
}
Seconds to keep completed action results in memory for WaitExecution calls.
Mark actions as failed if no client updates within this duration.
Remove workers from pool after this many seconds without keepalive.
max_action_executing_timeout_s
Maximum time an action can execute without worker updates before timing out and re-queueing. Set to 0 to disable (rely only on worker_timeout_s).
Maximum retries for jobs that fail with internal errors or timeouts.
allocation_strategy
enum
default:"least_recently_used"
Worker selection strategy when multiple workers can run a task.Options:
least_recently_used - Prefer idle workers
most_recently_used - Prefer active workers
worker_match_logging_interval_s
Log worker matching status every N seconds. Set to -1 to disable.
Storage backend for scheduler state.
In-memory backend (default, non-persistent).experimental_backend: "memory"
Redis-backed persistent scheduler.experimental_backend: {
redis: {
redis_store: "REDIS_STORE_NAME"
}
}
Reference to a redis_store configuration.
Forwards scheduling requests to remote scheduler.{
name: "REMOTE_SCHEDULER",
grpc: {
endpoint: {
address: "grpc://scheduler.example.com:50052"
},
connections_per_endpoint: 5,
max_concurrent_requests: 100
}
}
Remote scheduler endpoint.
GRPC endpoint URL (e.g., grpc://host:port or grpcs://host:port).
TLS configuration for grpcs://.
Certificate authority file path.
Client certificate file path.
Client private key file path.
Maximum concurrent requests to this endpoint.
TCP connection timeout (seconds).
TCP keepalive interval (seconds).
http2_keepalive_interval_s
HTTP/2 PING interval (seconds).
http2_keepalive_timeout_s
HTTP/2 PING timeout (seconds).
Retry configuration for failed requests.
Maximum retries (0 = unlimited).
Initial delay for exponential backoff (seconds).
Jitter factor (0-1) for randomized delays.
Global concurrent request limit (0 = unlimited). Requests queue when limit is reached.
Number of TCP connections per endpoint for load balancing.
Checks Action Cache before delegating to nested scheduler.{
name: "CACHED_SCHEDULER",
cache_lookup: {
ac_store: "AC_STORE",
scheduler: {
simple: { /* ... */ }
}
}
}
Reference to Action Cache store. Should be a completeness_checking store to verify result validity.
Nested scheduler to use on cache miss.
Modifies action platform properties before scheduling.{
name: "MODIFIED_SCHEDULER",
property_modifier: {
modifications: [
{ add: { name: "pool", value: "default" } },
{ remove: "deprecated_property" },
{
replace: {
name: "os",
value: "linux",
new_name: "OSFamily",
new_value: "Linux"
}
}
],
scheduler: { simple: { /* ... */ } }
}
}
Ordered list of property modifications applied to each action.
Add a property.{ add: { name: "pool", value: "gpu" } }
Remove a property by name.{ remove: "unwanted_property" }
Replace matching property.{
replace: {
name: "cpu_arch",
value: "amd64",
new_name: "cpu_arch",
new_value: "x86_64"
}
}
Value to match (omit to match any).
New value (omit to keep original).
Nested scheduler that receives modified actions.
Platform properties match tasks to workers. Workers advertise capabilities, and the scheduler filters based on property type:
| Property Type | Behavior | Example Use Case |
|---|
minimum | Worker value >= task value | CPU cores, memory, disk |
exact | String exact match | Architecture, OS, GPU model |
priority | Soft preference | Container images |
ignore | Accept task property without worker requirement | Optional labels |
Example Configuration
{
name: "PRODUCTION_SCHEDULER",
simple: {
supported_platform_properties: {
// Resource minimums
cpu_count: "minimum",
memory_kb: "minimum",
disk_read_iops: "minimum",
// Exact matches
cpu_arch: "exact", // "x86_64", "aarch64"
cpu_vendor: "exact", // "intel", "amd"
OSFamily: "exact", // "Linux", "Windows"
// Preferences
"container-image": "priority",
// Optional
"build-id": "ignore"
},
allocation_strategy: "least_recently_used",
worker_timeout_s: 10,
max_job_retries: 5
}
}