Overview
Instead of running all build tasks locally, remote execution allows build tools to:
- Upload inputs to a shared Content Addressable Storage (CAS)
- Submit actions to a scheduler
- Execute on workers with matching capabilities
- Download outputs from CAS
Benefits of Remote Execution
Massive Parallelism
Execute hundreds or thousands of tasks simultaneously across a worker pool.
Consistent Environments
All builds run in controlled, hermetic environments ensuring reproducibility.
Resource Offloading
Free up local CPU, memory, and disk for other tasks.
Faster Iteration
Large builds that take hours locally can complete in minutes.
Execution Lifecycle
1. Action Submission
The build client creates an Action protobuf message:
- Command: What to execute (program, arguments, environment)
- Input Root: Merkle tree of all input files/directories
- Platform Properties: Worker requirements (OS, CPU architecture, etc.)
- Timeout: Maximum allowed execution time
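The fields above can be pictured with a small Python sketch. It is not the real protobuf schema: the `Action` dataclass, `digest_bytes`, `digest_tree`, and the `os` property are hypothetical names, and the JSON-based tree hashing is a simplified stand-in for a Merkle tree over directory messages.

```python
import hashlib
import json
from dataclasses import dataclass


def digest_bytes(data: bytes) -> str:
    """Content address of a blob: here, simply its SHA-256 hex digest."""
    return hashlib.sha256(data).hexdigest()


def digest_tree(tree: dict) -> str:
    """Hash a nested dict of {name: bytes | dict} bottom-up (Merkle style).

    Each directory digest is derived from the names and digests of its
    children, so changing any file changes every ancestor digest.
    """
    entries = []
    for name in sorted(tree):
        child = tree[name]
        if isinstance(child, dict):
            entries.append(["dir", name, digest_tree(child)])
        else:
            entries.append(["file", name, digest_bytes(child)])
    return digest_bytes(json.dumps(entries).encode())


@dataclass(frozen=True)
class Action:
    """Hypothetical stand-in for the Action message described above."""
    arguments: tuple            # Command: program and arguments
    input_root_digest: str      # Input Root: digest of the input tree
    platform: tuple = ()        # Platform Properties as (name, value) pairs
    environment: tuple = ()     # Command environment as (name, value) pairs
    timeout_s: int = 600        # Timeout: maximum allowed execution time


inputs = {"src": {"main.c": b"int main(void) { return 0; }\n"}}
action = Action(
    arguments=("cc", "-c", "src/main.c", "-o", "main.o"),
    input_root_digest=digest_tree(inputs),
    platform=(("cpu_arch", "arm64"), ("os", "linux")),
)
```

Because the input root is identified by a digest, two clients that submit identical commands over identical inputs produce the same action, which is what makes a shared cache lookup possible.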
2. Scheduler Queueing
The scheduler receives the action and:
- Validates the action is well-formed
- Checks Action Cache (if caching enabled)
- Queues the action awaiting a suitable worker
- Monitors progress and handles timeouts
Actions are queued in the order received, but workers may pull from the queue based on availability and platform matching.
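A minimal sketch of that behavior, assuming every requested property must match the worker's value exactly. `ActionQueue` and its methods are hypothetical names used only for illustration, not the scheduler's actual interface.

```python
from collections import deque


class ActionQueue:
    """FIFO queue from which a worker takes the oldest action it can run."""

    def __init__(self):
        self._pending = deque()

    def submit(self, action_id: str, platform: dict) -> None:
        self._pending.append((action_id, platform))

    def pull(self, worker_properties: dict):
        """Return the oldest queued action id this worker satisfies, or None."""
        for index, (action_id, platform) in enumerate(self._pending):
            if all(worker_properties.get(k) == v for k, v in platform.items()):
                del self._pending[index]
                return action_id
        return None


queue = ActionQueue()
queue.submit("build-arm", {"cpu_arch": "arm64"})
queue.submit("build-x86", {"cpu_arch": "x86_64"})

x86_worker = {"cpu_arch": "x86_64", "os": "linux"}
first = queue.pull(x86_worker)   # skips "build-arm", which needs arm64
second = queue.pull(x86_worker)  # nothing left that this worker can run
```

Here `first` is `"build-x86"` even though `"build-arm"` was submitted earlier, which is why completion order can differ from submission order.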
3. Worker Matching
The scheduler matches actions to workers based on platform properties.
Matching Rules (configured in scheduler):
- Exact Match
- Minimum Match
- Priority Match
Exact Match: The worker must have the exact property value. Example: an action with cpu_arch: arm64 only runs on workers with cpu_arch: arm64.
4. Worker Execution
Once assigned, the worker performs the following execution steps:
- Precondition Check: Run optional script to verify worker capabilities
- Input Download: Fetch all input files from CAS into working directory
- Environment Setup: Set environment variables, create output directories
- Command Execution: Run the command with timeout and resource monitoring
- Output Capture: Collect stdout, stderr, and exit code
- Output Upload: Hash and upload all output files to CAS
- Result Reporting: Send ActionResult back to scheduler
Working Directory Isolation
Each action executes in a clean, isolated directory:
- No leftover files from previous actions
- Inputs materialized from CAS
- Outputs collected after execution
- Directory deleted after completion
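The execution steps and the isolation guarantees above can be sketched together in Python. This is an illustration under stated assumptions, not the worker's actual implementation: `run_action` is a hypothetical helper, a plain dict stands in for blobs fetched from CAS, the precondition check is omitted, and outputs are only hashed rather than uploaded.

```python
import hashlib
import subprocess
import sys
import tempfile
from pathlib import Path


def run_action(arguments, inputs, output_paths, timeout_s=60):
    """Run one command in a fresh directory and return a result summary.

    `inputs` maps relative paths to bytes (standing in for CAS blobs);
    `output_paths` lists the relative paths to collect afterwards.
    """
    with tempfile.TemporaryDirectory() as workdir:
        root = Path(workdir)
        # Input Download: materialize inputs into the clean directory.
        for rel_path, data in inputs.items():
            target = root / rel_path
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_bytes(data)
        # Environment Setup: create the directories outputs will land in.
        for rel_path in output_paths:
            (root / rel_path).parent.mkdir(parents=True, exist_ok=True)
        # Command Execution and Output Capture (raises on timeout).
        proc = subprocess.run(
            arguments, cwd=root, capture_output=True, timeout=timeout_s
        )
        # Output Upload: hash each output; a real worker would also upload it.
        outputs = {}
        for rel_path in output_paths:
            path = root / rel_path
            if path.is_file():
                outputs[rel_path] = hashlib.sha256(path.read_bytes()).hexdigest()
    # The temporary directory has been deleted by the time we get here.
    return {
        "exit_code": proc.returncode,
        "stdout": proc.stdout,
        "stderr": proc.stderr,
        "outputs": outputs,
    }


script = (
    "import pathlib; "
    "pathlib.Path('out/copy.txt').write_text(pathlib.Path('in.txt').read_text())"
)
result = run_action(
    [sys.executable, "-c", script],
    inputs={"in.txt": b"hello"},
    output_paths=["out/copy.txt"],
)
```

If the command exceeds `timeout_s`, `subprocess.run` raises `subprocess.TimeoutExpired`; the temporary directory is still removed because cleanup happens when the `with` block exits.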
5. Result Collection
The scheduler receives the ActionResult:
- Stored in Action Cache (if caching enabled)
- Returned to client via gRPC stream
- Used to update operation status for WaitExecution subscribers
Worker Configuration
Workers are configured to advertise their capabilities and resource limits.
Platform Properties
Platform properties declare worker capabilities.
Resource Limits
Precondition Scripts
Dynamically check worker readiness:
- Check available disk space
- Verify required tools are installed
- Ensure GPU is available
- Confirm license server connectivity
Precondition scripts run before accepting each action. If the script fails (non-zero exit), the worker rejects the action.
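The accept-or-reject rule can be sketched as follows. `precondition_passes` is a hypothetical helper, and treating a script that hangs or cannot start as a failure is an assumption of this sketch rather than documented behavior.

```python
import subprocess
import sys


def precondition_passes(script_argv, timeout_s=10) -> bool:
    """Run the precondition command; only exit code 0 counts as ready."""
    try:
        completed = subprocess.run(script_argv, timeout=timeout_s)
    except (OSError, subprocess.TimeoutExpired):
        # Assumption: a script that cannot start or hangs counts as a failure.
        return False
    return completed.returncode == 0


# Stand-in precondition: succeed only if the current volume has free space.
check_disk = [
    sys.executable, "-c",
    "import shutil, sys; sys.exit(0 if shutil.disk_usage('.').free > 0 else 1)",
]
# Stand-in for a failing check, such as an unreachable license server.
always_fails = [sys.executable, "-c", "import sys; sys.exit(3)"]

disk_ok = precondition_passes(check_disk)
failing_ok = precondition_passes(always_fails)  # False: the action is rejected
```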
Scheduler Configuration
Schedulers manage the action queue and worker pool.
Simple Scheduler
The simple scheduler is the primary scheduler implementation.
Allocation Strategy
Worker Timeout
Workers send periodic keepalive messages to signal they’re still alive. If a worker crashes or loses network connectivity, it’s automatically removed after the timeout.
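A minimal sketch of keepalive tracking, assuming the scheduler records the time of each worker's last message and periodically evicts workers that have been silent for longer than the timeout. `WorkerRegistry` and the `worker_timeout_s` parameter are hypothetical names; the clock is injectable so the example runs without sleeping.

```python
import time


class WorkerRegistry:
    """Tracks the last keepalive per worker and evicts silent workers."""

    def __init__(self, worker_timeout_s: float, clock=time.monotonic):
        self._timeout = worker_timeout_s
        self._clock = clock
        self._last_seen = {}

    def keepalive(self, worker_id: str) -> None:
        self._last_seen[worker_id] = self._clock()

    def evict_stale(self) -> list:
        """Remove and return workers whose last keepalive is too old."""
        now = self._clock()
        stale = [w for w, seen in self._last_seen.items()
                 if now - seen > self._timeout]
        for worker_id in stale:
            del self._last_seen[worker_id]
        return stale

    def alive(self) -> set:
        return set(self._last_seen)


now = [0.0]  # fake clock, advanced by hand
registry = WorkerRegistry(worker_timeout_s=5.0, clock=lambda: now[0])
registry.keepalive("worker-a")
registry.keepalive("worker-b")
now[0] = 4.0
registry.keepalive("worker-a")   # worker-b stays silent
now[0] = 6.0
evicted = registry.evict_stale()
```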
Action Timeout
- client_action_timeout_s: Mark action failed if no client update (for multi-stage operations)
- max_action_executing_timeout_s: Max time an action can execute without progress updates
Retry Logic
Advanced Features
Cache Lookup Integration
Wrap the scheduler with a cache lookup layer.
Property Modification
Modify action properties before execution:
- Route actions to specific worker pools
- Add default properties
- Remove incompatible properties
Multi-Scheduler Federation
Forward actions to remote schedulers.
Monitoring Execution
Clients can monitor execution progress.
WaitExecution
- Queued: Action accepted, waiting for worker
- Executing: Worker is running the action
- Completed: Action finished (success or failure)
Operation Metadata
- Current execution stage
- Assigned worker ID
- Queue and execution timestamps
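How a client might consume such updates can be sketched as follows. The `Stage` enum, the `OperationUpdate` dataclass, and `wait_for_completion` are hypothetical names, and a plain Python iterator stands in for the gRPC stream.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, Optional


class Stage(Enum):
    QUEUED = "queued"
    EXECUTING = "executing"
    COMPLETED = "completed"


@dataclass
class OperationUpdate:
    """Hypothetical shape of one status update for an operation."""
    stage: Stage
    worker_id: Optional[str] = None
    queued_at: Optional[float] = None
    started_at: Optional[float] = None


def wait_for_completion(updates: Iterable[OperationUpdate]) -> list:
    """Consume updates until COMPLETED and return the stages observed."""
    seen = []
    for update in updates:
        seen.append(update.stage)
        if update.stage is Stage.COMPLETED:
            break
    return seen


stream = iter([
    OperationUpdate(Stage.QUEUED, queued_at=100.0),
    OperationUpdate(Stage.EXECUTING, worker_id="worker-a",
                    queued_at=100.0, started_at=101.5),
    OperationUpdate(Stage.COMPLETED, worker_id="worker-a",
                    queued_at=100.0, started_at=101.5),
])
stages = wait_for_completion(stream)
```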
Performance Optimization
Input Minimization
Include only necessary inputs in input_root_digest:
- Reduces upload/download time
- Decreases storage usage
- Improves cache hit rates
Output Locality
Use output paths to avoid downloading unnecessary outputs.
Worker Affinity
With MRU allocation, repeatedly scheduled actions on the same worker can reuse:
- Downloaded inputs (if still in local cache)
- Compiled headers and intermediate files
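Why MRU allocation tends to keep work on a warm worker can be seen in a small sketch. `IdleWorkers` is a hypothetical class, and the `"lru"` option is included only as an assumed contrast to MRU.

```python
class IdleWorkers:
    """Idle pool that hands out the most (or least) recently used worker."""

    def __init__(self, strategy: str = "mru"):
        if strategy not in ("mru", "lru"):
            raise ValueError(f"unknown strategy: {strategy}")
        self._strategy = strategy
        self._idle = []  # ordered from least to most recently used

    def release(self, worker_id: str) -> None:
        """Mark a worker idle; it becomes the most recently used one."""
        self._idle.append(worker_id)

    def acquire(self):
        """Take an idle worker according to the strategy, or None if empty."""
        if not self._idle:
            return None
        return self._idle.pop() if self._strategy == "mru" else self._idle.pop(0)


def simulate(strategy: str, rounds: int = 3) -> list:
    """Run `rounds` sequential actions and record which worker ran each."""
    pool = IdleWorkers(strategy)
    for worker in ("worker-a", "worker-b", "worker-c"):
        pool.release(worker)
    picks = []
    for _ in range(rounds):
        worker = pool.acquire()
        picks.append(worker)
        pool.release(worker)  # the action finishes and the worker is idle again
    return picks


mru_picks = simulate("mru")
lru_picks = simulate("lru")
```

In this sketch MRU sends every sequential action to the same worker, so its local cache stays hot, while LRU rotates through the pool and spreads the load instead.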
Troubleshooting
Common Issues
Next Steps
Schedulers
Configure scheduler behavior
Workers
Set up and manage worker nodes
Stores
Optimize CAS storage backends