Architecture Overview
NativeLink consists of three primary components that work together to provide caching and remote execution capabilities.
Core Components
Build Clients
Build tools like Bazel, Buck2, Goma, and Reclient interact with NativeLink through the Remote Execution API:
- Submit build actions to the scheduler
- Upload input files to the Content Addressable Storage (CAS)
- Query the Action Cache (AC) for previously computed results
- Download output artifacts from CAS
Schedulers
The scheduler is responsible for managing the execution lifecycle of build actions. Several scheduler types are available:
- Simple Scheduler
- Cache Lookup Scheduler
- Property Modifier Scheduler
- GRPC Scheduler
The Simple Scheduler is the primary scheduler implementation. It handles action queuing, worker matching, and task distribution.
Key Features:
- Platform property-based worker matching
- Configurable allocation strategies (LRU/MRU)
- Action timeout and retry logic
- Worker health monitoring
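The LRU/MRU allocation strategies listed above can be illustrated with a small sketch. It is not NativeLink's implementation: `WorkerPool`, `pick`, and the `"lru"`/`"mru"` strings are hypothetical names, and it assumes LRU means "choose the worker that has gone longest without work" while MRU means "choose the worker used most recently".

```python
from collections import OrderedDict


class WorkerPool:
    """Toy pool that keeps workers ordered by last use (oldest first)."""

    def __init__(self, worker_ids):
        # Insertion order doubles as recency order: first = least recently used.
        self._workers = OrderedDict((w, None) for w in worker_ids)

    def pick(self, strategy="lru"):
        if not self._workers:
            raise LookupError("no workers available")
        if strategy == "lru":
            worker = next(iter(self._workers))
        elif strategy == "mru":
            worker = next(reversed(self._workers))
        else:
            raise ValueError(f"unknown strategy: {strategy}")
        # The chosen worker becomes the most recently used one.
        self._workers.move_to_end(worker)
        return worker


pool = WorkerPool(["w1", "w2", "w3"])
lru_picks = [pool.pick("lru") for _ in range(4)]  # ['w1', 'w2', 'w3', 'w1']
```

Under these assumptions, LRU rotates work across the whole pool, while MRU keeps returning to the same worker, which may help keep local caches warm and leave idle workers free to be scaled down.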
Workers
Worker nodes execute build actions in isolated environments:
- Connect to the scheduler and advertise their capabilities via platform properties
- Download action inputs from CAS
- Execute commands in controlled environments
- Upload outputs back to CAS
- Report execution results to the scheduler
- Multi-action concurrency (configurable max_inflight_tasks)
- Resource management (CPU, memory, disk)
- Precondition scripts for dynamic resource checks
- Graceful draining and shutdown
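To make the idea of a `max_inflight_tasks` limit concrete, here is a minimal sketch of bounded concurrency using an `asyncio.Semaphore`. It only illustrates the concept; `run_actions` and `run_one` are hypothetical helpers, the sleep stands in for real command execution, and nothing here reflects how NativeLink's worker is actually written.

```python
import asyncio


async def run_actions(actions, max_inflight_tasks):
    """Run actions concurrently, never exceeding max_inflight_tasks at once."""
    limit = asyncio.Semaphore(max_inflight_tasks)
    running = 0
    peak = 0  # highest number of actions observed running together

    async def run_one(action):
        nonlocal running, peak
        async with limit:
            running += 1
            peak = max(peak, running)
            await asyncio.sleep(0.01)  # stand-in for executing the command
            running -= 1
            return f"{action}: done"

    results = await asyncio.gather(*(run_one(a) for a in actions))
    return results, peak


results, peak = asyncio.run(run_actions(["a1", "a2", "a3", "a4", "a5"], 2))
```

With five actions and a limit of two, `peak` never exceeds two, while `results` still contains one entry per action in submission order.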
Storage Backends
NativeLink provides a flexible storage abstraction supporting multiple backends and composition strategies. See Stores for details.
Data Flow
Remote Execution Flow
Build Cache Flow
When a build tool performs a build:
- Hash Computation: Compute action digest from inputs, command, and platform properties
- Cache Check: Query Action Cache with digest
- Cache Hit: Download outputs from CAS and skip execution
- Cache Miss: Execute locally or remotely, then populate cache
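The four steps above can be sketched as follows. This is a simplified illustration, not the real protocol: dictionaries stand in for the Action Cache and CAS, JSON stands in for the serialized action message that a Remote Execution client would hash, and `action_digest`, `build`, and `fake_execute` are hypothetical names.

```python
import hashlib
import json


def action_digest(command, input_digests, platform):
    """Hash the command, input digests, and platform properties together."""
    payload = json.dumps(
        {"command": command, "inputs": sorted(input_digests), "platform": platform},
        sort_keys=True,
    ).encode()
    return hashlib.sha256(payload).hexdigest()


def build(action, action_cache, cas, execute):
    digest = action_digest(**action)           # Hash Computation
    output_digest = action_cache.get(digest)   # Cache Check
    if output_digest is not None:
        return cas[output_digest], "hit"       # Cache Hit: skip execution
    output = execute(action["command"])        # Cache Miss: run the action
    output_digest = hashlib.sha256(output).hexdigest()
    cas[output_digest] = output                # store the output blob
    action_cache[digest] = output_digest       # record the result for next time
    return output, "miss"


action_cache, cas, calls = {}, {}, []


def fake_execute(command):
    calls.append(command)
    return b"object-code"


action = {
    "command": ["cc", "-c", "main.c"],
    "input_digests": ["abc123"],
    "platform": {"OSFamily": "linux"},
}
first = build(action, action_cache, cas, fake_execute)
second = build(action, action_cache, cas, fake_execute)
```

The first call misses and executes the command; the second call produces the same digest, finds the cached entry, and returns the stored output without executing anything.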
Deployment Patterns
Single-Node Setup
All components run on a single machine. Ideal for local development and CI runners.
- Scheduler + Worker + Storage on one node
- In-memory or filesystem storage
- Minimal configuration
Distributed Cluster
Components distributed across multiple machines for scalability.
- Dedicated scheduler nodes
- Worker pool (tens to thousands of nodes)
- Shared cloud storage (S3, GCS)
- Redis for metadata
Hybrid Cloud
Local caching with remote execution.
- Local CAS/AC stores
- GRPC Scheduler forwarding to cloud
- FastSlow store for cache tiers
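The cache-tier idea behind this pattern can be sketched with two dictionaries standing in for a local (fast) and a remote (slow) store. This is an illustration only, not NativeLink's FastSlow store: the class and method names are hypothetical, and the policy shown (write to both tiers, read from fast first, copy into fast on a slow-tier hit) is an assumption.

```python
class FastSlowStore:
    """Toy two-tier store: check the fast tier first, fall back to the slow tier."""

    def __init__(self, fast, slow):
        self.fast = fast
        self.slow = slow

    def put(self, digest, data):
        # Assumed policy: writes land in both tiers.
        self.fast[digest] = data
        self.slow[digest] = data

    def get(self, digest):
        if digest in self.fast:
            return self.fast[digest]
        data = self.slow[digest]  # raises KeyError if absent from both tiers
        self.fast[digest] = data  # populate the fast tier for later reads
        return data


remote = {"d1": b"artifact"}
store = FastSlowStore(fast={}, slow=remote)
data = store.get("d1")
```

After the first `get`, the blob is also present in the fast tier, so later reads of `d1` no longer touch the slow store.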
Multi-Region
Geographically distributed deployment.
- Regional schedulers and workers
- Shared global CAS (S3/GCS)
- Compression for network efficiency
Communication Protocols
NativeLink implements the following Remote Execution API v2 services:
Execution Service
- Execute: Submit actions for execution
- WaitExecution: Monitor execution progress
Content Addressable Storage Service
- FindMissingBlobs: Check blob existence
- BatchUpdateBlobs: Upload small blobs
- BatchReadBlobs: Download small blobs
- GetTree: Retrieve directory trees
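A typical way for a client to combine these calls is to ask which blobs are missing and upload only those. The sketch below models that pattern with an in-memory stand-in; `InMemoryCas`, its methods, and `upload_inputs` are hypothetical Python names that mirror the RPCs conceptually and are not a real client API.

```python
import hashlib


def digest_of(blob):
    # A Remote Execution digest pairs a content hash with the size in bytes.
    return (hashlib.sha256(blob).hexdigest(), len(blob))


class InMemoryCas:
    """Toy content-addressable store keyed by digest."""

    def __init__(self):
        self._blobs = {}

    def find_missing_blobs(self, digests):
        return [d for d in digests if d not in self._blobs]

    def batch_update_blobs(self, blobs):
        for blob in blobs:
            self._blobs[digest_of(blob)] = blob


def upload_inputs(cas, blobs):
    """Upload only the blobs the store lacks; return how many were sent."""
    by_digest = {digest_of(b): b for b in blobs}  # identical content collapses
    missing = cas.find_missing_blobs(list(by_digest))
    cas.batch_update_blobs([by_digest[d] for d in missing])
    return len(missing)


cas = InMemoryCas()
sent_first = upload_inputs(cas, [b"a", b"b", b"a"])  # 2: duplicate b"a" collapses
sent_second = upload_inputs(cas, [b"a", b"c"])       # 1: only b"c" is new
```

Because blobs are addressed by content, identical inputs are stored once and repeat uploads are skipped.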
ByteStream Service
- Read: Stream blob downloads
- Write: Stream blob uploads
Action Cache Service
- GetActionResult: Retrieve cached results
- UpdateActionResult: Store action results
Capabilities Service
- GetCapabilities: Query server capabilities
Platform Properties
Platform properties enable fine-grained worker matching.
Property Types
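As a rough illustration of property-based worker matching, the sketch below assumes two matching modes: `"exact"` (values must be equal) and `"minimum"` (the worker must offer at least the requested numeric amount). These modes, the `worker_matches` helper, and the example property names are assumptions made for illustration rather than a statement of NativeLink's configuration schema.

```python
def worker_matches(required, offered, property_types):
    """Return True if a worker's offered properties satisfy an action's needs."""
    for name, wanted in required.items():
        if name not in offered:
            return False
        kind = property_types.get(name, "exact")  # default to exact matching
        if kind == "minimum":
            if int(offered[name]) < int(wanted):
                return False
        elif offered[name] != wanted:
            return False
    return True


property_types = {"cpu_count": "minimum", "OSFamily": "exact"}
worker = {"cpu_count": "8", "OSFamily": "linux"}

fits = worker_matches({"cpu_count": "4", "OSFamily": "linux"}, worker, property_types)
too_big = worker_matches({"cpu_count": "16"}, worker, property_types)
wrong_os = worker_matches({"OSFamily": "windows"}, worker, property_types)
```

Here `fits` is true, while `too_big` and `wrong_os` are false: the worker has enough CPUs for the first action only, and its operating system must match exactly.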
Configuration Files
NativeLink uses JSON5 configuration files that define:
- Stores: CAS and AC backend configurations
- Schedulers: Task scheduling and worker management
- Workers: Execution capabilities and resources
- Servers: gRPC service endpoints
Performance Characteristics
NativeLink is trusted in production to handle over 1 billion requests per month for customers including Samsung.
- Content-addressed deduplication eliminates redundant storage
- Incremental builds reuse cached artifacts
- Parallel remote execution distributes workload
- Store composition (compression, dedup, fast/slow tiers)
- Efficient binary protocols (gRPC + protobuf)
Metrics and Observability
NativeLink provides extensive metrics and tracing:
- Prometheus Metrics: Component-level performance data
- OpenTelemetry Tracing: Distributed request tracing
- Origin Events: Action lifecycle tracking
- Health Endpoints: Service status monitoring
Next Steps
Build Cache
Learn how build caching accelerates builds
Remote Execution
Understand distributed task execution
Storage Backends
Explore storage options and composition