System Architecture

NativeLink is a high-performance, distributed build cache and remote execution system designed to accelerate software compilation and testing. The system follows the Remote Execution API v2 protocol and provides a modular architecture that scales from single-machine setups to large distributed deployments.

Architecture Overview

NativeLink consists of three primary components (schedulers, workers, and storage backends) that work together with build clients to provide caching and remote execution capabilities:

Core Components

Build Clients

Build tools like Bazel, Buck2, Goma, and Reclient interact with NativeLink through the Remote Execution API:

Submit build actions to the scheduler
Upload input files to the Content Addressable Storage (CAS)
Query the Action Cache (AC) for previously computed results
Download output artifacts from CAS

Schedulers

The scheduler is responsible for managing the execution lifecycle of build actions:

Simple Scheduler
Cache Lookup Scheduler
Property Modifier Scheduler
GRPC Scheduler

The Simple Scheduler is the primary scheduler implementation. It handles action queuing, worker matching, and task distribution.

Key Features:

Platform property-based worker matching
Configurable allocation strategies (LRU/MRU)
Action timeout and retry logic
Worker health monitoring

Configuration: See schedulers.rs:88-169
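
As a rough illustration of the LRU/MRU allocation strategies listed above, the following sketch picks an idle worker by how recently it was used. The `Worker` class and `pick_worker` function are hypothetical names made up for this example; they are not NativeLink's API, and the real scheduler may weigh other factors.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Worker:
    name: str
    last_used: float  # timestamp of the last action assigned to this worker
    busy: bool = False


def pick_worker(workers: list[Worker], strategy: str) -> Optional[Worker]:
    """Pick an idle worker by recency of use (hypothetical helper).

    "lru" selects the least recently used idle worker,
    "mru" selects the most recently used idle worker.
    """
    idle = [w for w in workers if not w.busy]
    if not idle:
        return None
    if strategy == "lru":
        return min(idle, key=lambda w: w.last_used)
    if strategy == "mru":
        return max(idle, key=lambda w: w.last_used)
    raise ValueError(f"unknown strategy: {strategy}")


pool = [
    Worker("w1", last_used=10.0),
    Worker("w2", last_used=30.0),
    Worker("w3", last_used=20.0, busy=True),  # busy workers are skipped
]
lru_choice = pick_worker(pool, "lru")
mru_choice = pick_worker(pool, "mru")
```

One plausible reason to choose between them: LRU tends to spread actions across the pool, while MRU tends to keep recently used workers busy so idle ones can be scaled down.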

Workers

Worker nodes execute build actions in isolated environments:

Connect to the scheduler and advertise their capabilities via platform properties
Download action inputs from CAS
Execute commands in controlled environments
Upload outputs back to CAS
Report execution results to the scheduler

Worker Capabilities:

Multi-action concurrency (configurable max_inflight_tasks)
Resource management (CPU, memory, disk)
Precondition scripts for dynamic resource checks
Graceful draining and shutdown
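
The multi-action concurrency limit can be pictured as a counting semaphore: at most `max_inflight_tasks` actions hold a slot at any time. The sketch below is a simplified model only; `InflightLimiter` is a hypothetical class, and the sleep stands in for running a real command.

```python
import threading
import time


class InflightLimiter:
    """Caps the number of concurrently running actions (hypothetical helper)."""

    def __init__(self, max_inflight_tasks: int):
        self._slots = threading.Semaphore(max_inflight_tasks)
        self._lock = threading.Lock()
        self.running = 0  # actions currently executing
        self.peak = 0     # highest concurrency observed

    def run(self, action):
        with self._slots:  # blocks while all slots are taken
            with self._lock:
                self.running += 1
                self.peak = max(self.peak, self.running)
            try:
                return action()
            finally:
                with self._lock:
                    self.running -= 1


limiter = InflightLimiter(max_inflight_tasks=2)


def fake_action():
    time.sleep(0.01)  # placeholder for executing a build command


threads = [threading.Thread(target=limiter.run, args=(fake_action,)) for _ in range(6)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

After all six actions finish, `limiter.peak` never exceeds the configured limit of 2.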

Storage Backends

NativeLink provides a flexible storage abstraction supporting multiple backends and composition strategies. See Stores for details.

Data Flow

Remote Execution Flow

Build Cache Flow

When a build tool performs a build:

Hash Computation: Compute action digest from inputs, command, and platform properties
Cache Check: Query Action Cache with digest
Cache Hit: Download outputs from CAS and skip execution
Cache Miss: Execute locally or remotely, then populate cache
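
The four steps above can be sketched with two in-memory dictionaries standing in for the Action Cache and CAS. This is a simplified model: the Remote Execution API hashes a serialized `Action` message, whereas this example hashes a JSON payload, and `BuildCache` and `action_digest` are hypothetical names.

```python
import hashlib
import json


def action_digest(command, input_digests, platform):
    """Hash the command, inputs, and platform properties into one key.

    Assumption: JSON with sorted keys stands in for the real serialized Action.
    """
    payload = json.dumps(
        {"command": command, "inputs": input_digests, "platform": platform},
        sort_keys=True,
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


class BuildCache:
    def __init__(self):
        self.action_cache = {}  # action digest -> output digest
        self.cas = {}           # output digest -> output bytes

    def run(self, command, input_digests, platform, execute):
        key = action_digest(command, input_digests, platform)
        output_digest = self.action_cache.get(key)
        if output_digest is not None and output_digest in self.cas:
            return self.cas[output_digest], True  # cache hit: skip execution
        output = execute()                        # cache miss: run the action
        output_digest = hashlib.sha256(output).hexdigest()
        self.cas[output_digest] = output
        self.action_cache[key] = output_digest
        return output, False


calls = []


def compile_step():
    calls.append(1)  # records how many times the action really ran
    return b"object code"


cache = BuildCache()
args = (["cc", "-c", "main.c"], {"main.c": "abc123"}, {"OSFamily": "linux"})
first = cache.run(*args, compile_step)
second = cache.run(*args, compile_step)
```

The second call returns the stored output without invoking `compile_step` again. Changing any input digest produces a different key and therefore a miss.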

Deployment Patterns

Single-Node Setup

All components run on a single machine. Ideal for local development and CI runners.

Scheduler + Worker + Storage on one node
In-memory or filesystem storage
Minimal configuration

Distributed Cluster

Components are distributed across multiple machines for scalability.

Dedicated scheduler nodes
Worker pool (tens to thousands of nodes)
Shared cloud storage (S3, GCS)
Redis for metadata

Hybrid Cloud

Local caching with remote execution.

Local CAS/AC stores
GRPC Scheduler forwarding to cloud
FastSlow store for cache tiers
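
To show the idea behind tiering a fast store in front of a slow one, here is a minimal sketch using two dictionaries. `FastSlowStore` below is a hypothetical toy class; NativeLink's actual FastSlow store may differ in write and eviction behavior.

```python
class FastSlowStore:
    """Toy two-tier store: check the fast tier first, fall back to the slow tier."""

    def __init__(self, fast: dict, slow: dict):
        self.fast = fast  # e.g. local memory or disk
        self.slow = slow  # e.g. remote or cloud storage

    def put(self, key, value):
        # Assumption: writes go to both tiers.
        self.fast[key] = value
        self.slow[key] = value

    def get(self, key):
        if key in self.fast:
            return self.fast[key]
        value = self.slow.get(key)
        if value is not None:
            self.fast[key] = value  # populate the fast tier for the next read
        return value


fast, slow = {}, {"abc": b"artifact"}
store = FastSlowStore(fast, slow)
first_read = store.get("abc")  # served from the slow tier, then cached in fast
store.put("def", b"new artifact")
```

After the first read, later reads of `"abc"` are served from the fast tier without touching the slow one.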

Multi-Region

Geographically distributed deployment.

Regional schedulers and workers
Shared global CAS (S3/GCS)
Compression for network efficiency

Communication Protocols

NativeLink implements the following Remote Execution API v2 services:

Execution Service

Execute - Submit actions for execution
WaitExecution - Monitor execution progress

Content Addressable Storage Service

FindMissingBlobs - Check blob existence
BatchUpdateBlobs - Upload small blobs
BatchReadBlobs - Download small blobs
GetTree - Retrieve directory trees
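
A client typically calls FindMissingBlobs before uploading so that only absent blobs are sent. The sketch below models that with a dictionary as the server-side CAS; the function names are hypothetical and no gRPC calls are made.

```python
import hashlib


def digest_of(data: bytes) -> tuple[str, int]:
    """A digest is modeled as (sha256 hex hash, size in bytes)."""
    return hashlib.sha256(data).hexdigest(), len(data)


def find_missing_blobs(server_blobs: dict, digests: list) -> list:
    """Return the digests the server does not have yet."""
    return [d for d in digests if d not in server_blobs]


def upload_inputs(server_blobs: dict, files: dict) -> int:
    """Upload only missing blobs and return how many were sent."""
    by_digest = {digest_of(data): data for data in files.values()}
    missing = find_missing_blobs(server_blobs, list(by_digest))
    for d in missing:
        server_blobs[d] = by_digest[d]  # stands in for BatchUpdateBlobs
    return len(missing)


server = {}
files = {"a.c": b"int a;", "b.c": b"int b;", "copy_of_a.c": b"int a;"}
first_upload = upload_inputs(server, files)   # identical content is sent once
files["c.c"] = b"int c;"
second_upload = upload_inputs(server, files)  # only the new file is sent
```

Because blobs are keyed by content, `a.c` and `copy_of_a.c` share one entry, and the second upload transfers only `c.c`.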

ByteStream Service

Read - Stream blob downloads
Write - Stream blob uploads
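
ByteStream requests identify a blob through a resource name. The helpers below build names in the layout described by the Remote Execution API v2 specification for the default digest function; verify the layout against the API version you target. The helper names and the `batch_limit_bytes` threshold are assumptions made for this sketch.

```python
import uuid


def read_resource_name(instance_name: str, hash_hex: str, size_bytes: int) -> str:
    """Resource name for ByteStream Read: {instance}/blobs/{hash}/{size}."""
    prefix = f"{instance_name}/" if instance_name else ""
    return f"{prefix}blobs/{hash_hex}/{size_bytes}"


def write_resource_name(instance_name, hash_hex, size_bytes, upload_id=None) -> str:
    """Resource name for ByteStream Write: {instance}/uploads/{uuid}/blobs/{hash}/{size}."""
    upload_id = upload_id or str(uuid.uuid4())  # unique per upload attempt
    prefix = f"{instance_name}/" if instance_name else ""
    return f"{prefix}uploads/{upload_id}/blobs/{hash_hex}/{size_bytes}"


def choose_upload_method(size_bytes: int, batch_limit_bytes: int) -> str:
    """Small blobs fit in a batch call; larger ones are streamed.

    batch_limit_bytes is a hypothetical client-side threshold.
    """
    if size_bytes <= batch_limit_bytes:
        return "BatchUpdateBlobs"
    return "ByteStream.Write"


read_name = read_resource_name("main", "abc123", 42)
write_name = write_resource_name("main", "abc123", 42, upload_id="fixed-id")
```

With these inputs, `read_name` is `main/blobs/abc123/42` and `write_name` is `main/uploads/fixed-id/blobs/abc123/42`.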

Action Cache Service

GetActionResult - Retrieve cached results
UpdateActionResult - Store action results

Capabilities Service

GetCapabilities - Query server capabilities

Platform Properties

Platform properties enable fine-grained worker matching:

Property Types

Example Configuration:

{
  "supported_platform_properties": {
    "cpu_count": "minimum",
    "cpu_arch": "exact",
    "OSFamily": "exact"
  }
}
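
To make the `minimum` and `exact` types in the configuration above concrete, the following sketch checks whether a worker satisfies an action's requested properties. The `worker_matches` function is hypothetical, and rejecting properties that are not listed in `supported_platform_properties` is an assumption of this example, not documented NativeLink behavior.

```python
SUPPORTED = {
    "cpu_count": "minimum",
    "cpu_arch": "exact",
    "OSFamily": "exact",
}


def worker_matches(supported: dict, worker_props: dict, action_props: dict) -> bool:
    """Return True if the worker satisfies every property the action requests."""
    for name, required in action_props.items():
        kind = supported.get(name)
        offered = worker_props.get(name)
        if kind is None or offered is None:
            return False  # assumption: unknown or unadvertised properties never match
        if kind == "exact":
            if offered != required:
                return False
        elif kind == "minimum":
            # The worker must offer at least the requested amount.
            if int(offered) < int(required):
                return False
        else:
            raise ValueError(f"unsupported property type: {kind}")
    return True


worker = {"cpu_count": "8", "cpu_arch": "x86_64", "OSFamily": "linux"}
small_action = {"cpu_count": "4", "cpu_arch": "x86_64", "OSFamily": "linux"}
large_action = {"cpu_count": "16", "cpu_arch": "x86_64", "OSFamily": "linux"}
arm_action = {"cpu_count": "2", "cpu_arch": "arm64", "OSFamily": "linux"}
```

Here the worker matches `small_action`, but not `large_action` (too few CPUs) or `arm_action` (different architecture).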

Configuration Files

NativeLink uses JSON5 configuration files that define:

Stores: CAS and AC backend configurations
Schedulers: Task scheduling and worker management
Workers: Execution capabilities and resources
Servers: gRPC service endpoints

See the configuration examples for reference deployments.

Performance Characteristics

NativeLink is trusted in production to handle over 1 billion requests per month for customers including Samsung.

Key Performance Features:

Content-addressed deduplication eliminates redundant storage
Incremental builds reuse cached artifacts
Parallel remote execution distributes workload
Store composition (compression, dedup, fast/slow tiers)
Efficient binary protocols (gRPC + protobuf)

Metrics and Observability

NativeLink provides extensive metrics and tracing:

Prometheus Metrics: Component-level performance data
OpenTelemetry Tracing: Distributed request tracing
Origin Events: Action lifecycle tracking
Health Endpoints: Service status monitoring

Next Steps

Build Cache

Learn how build caching accelerates builds

Remote Execution

Understand distributed task execution

Storage Backends

Explore storage options and composition
