NativeLink is a high-performance, distributed build cache and remote execution system designed to accelerate software compilation and testing. The system follows the Remote Execution API v2 protocol and provides a modular architecture that scales from single-machine setups to large distributed deployments.

Architecture Overview

NativeLink consists of three primary components that work together to provide caching and remote execution capabilities:

Core Components

Build Clients

Build tools like Bazel, Buck2, Goma, and Reclient interact with NativeLink through the Remote Execution API:
  • Submit build actions to the scheduler
  • Upload input files to the Content Addressable Storage (CAS)
  • Query the Action Cache (AC) for previously computed results
  • Download output artifacts from CAS

Schedulers

The scheduler manages the execution lifecycle of build actions. The primary scheduler implementation handles action queuing, worker matching, and task distribution.
Key Features:
  • Platform property-based worker matching
  • Configurable allocation strategies (LRU/MRU)
  • Action timeout and retry logic
  • Worker health monitoring
Configuration: See schedulers.rs:88-169
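To make the LRU/MRU allocation strategies concrete, here is a minimal sketch of how a pool of idle workers could be ordered by last use. The `WorkerPool` class and its method names are hypothetical and are not taken from NativeLink's scheduler code; the sketch only illustrates the difference between the two strategies.

```python
from collections import OrderedDict


class WorkerPool:
    """Toy pool of idle workers, ordered by when each was last released."""

    def __init__(self, strategy="lru"):
        if strategy not in ("lru", "mru"):
            raise ValueError("strategy must be 'lru' or 'mru'")
        self.strategy = strategy
        # Keys are worker ids; insertion order runs from oldest to newest release.
        self._idle = OrderedDict()

    def release(self, worker_id):
        # A worker that just finished a task becomes the most recently used.
        self._idle.pop(worker_id, None)
        self._idle[worker_id] = None

    def acquire(self):
        # Returns None when no worker is idle.
        if not self._idle:
            return None
        # LRU takes the worker that has been idle longest;
        # MRU takes the worker that was released last.
        worker_id, _ = self._idle.popitem(last=(self.strategy == "mru"))
        return worker_id
```

One plausible reason to choose between them: LRU spreads work evenly across the pool, while MRU keeps a small set of workers busy, which may help when workers hold warm local caches.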

Workers

Worker nodes execute build actions in isolated environments:
  • Connect to the scheduler and advertise their capabilities via platform properties
  • Download action inputs from CAS
  • Execute commands in controlled environments
  • Upload outputs back to CAS
  • Report execution results to the scheduler
Worker Capabilities:
  • Multi-action concurrency (configurable max_inflight_tasks)
  • Resource management (CPU, memory, disk)
  • Precondition scripts for dynamic resource checks
  • Graceful draining and shutdown
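The interaction between a concurrency limit and graceful draining can be sketched as follows. Only the `max_inflight_tasks` name comes from the text above; the `Worker` class and its methods are hypothetical and simplified, and do not represent NativeLink's worker implementation.

```python
class Worker:
    """Toy worker that caps concurrent actions and supports draining."""

    def __init__(self, max_inflight_tasks):
        self.max_inflight_tasks = max_inflight_tasks
        self.inflight = set()
        self.draining = False

    def try_accept(self, action_id):
        # Refuse new work while draining or when every slot is taken.
        if self.draining or len(self.inflight) >= self.max_inflight_tasks:
            return False
        self.inflight.add(action_id)
        return True

    def complete(self, action_id):
        # Free the slot held by a finished action.
        self.inflight.discard(action_id)

    def drain(self):
        # Stop accepting work; actions already running are allowed to finish.
        self.draining = True

    def can_shutdown(self):
        # Safe to exit only once draining has started and nothing is running.
        return self.draining and not self.inflight
```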

Storage Backends

NativeLink provides a flexible storage abstraction supporting multiple backends and composition strategies. See Stores for details.

Data Flow

Remote Execution Flow

Build Cache Flow

When a build tool performs a build:
  1. Hash Computation: Compute action digest from inputs, command, and platform properties
  2. Cache Check: Query Action Cache with digest
  3. Cache Hit: Download outputs from CAS and skip execution
  4. Cache Miss: Execute locally or remotely, then populate cache
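The four steps above can be sketched with plain dictionaries standing in for the Action Cache and CAS. This is a simplified illustration: the Remote Execution API hashes a serialized protobuf `Action` message, whereas this sketch hashes canonical JSON, and the `action_digest` and `build` helpers are hypothetical names.

```python
import hashlib
import json


def action_digest(command, input_digests, platform):
    # Assumption: canonical JSON stands in for the serialized Action protobuf.
    payload = json.dumps(
        {"command": command, "inputs": sorted(input_digests), "platform": platform},
        sort_keys=True,
    ).encode()
    return hashlib.sha256(payload).hexdigest()


def build(action, action_cache, cas, execute):
    """Return (outputs, cache_hit) for one action."""
    digest = action_digest(**action)          # 1. Hash computation
    cached = action_cache.get(digest)         # 2. Cache check
    if cached is not None:                    # 3. Cache hit: fetch outputs, skip execution
        return [cas[d] for d in cached], True
    outputs = execute(action)                 # 4. Cache miss: execute, then populate cache
    output_digests = []
    for blob in outputs:
        blob_digest = hashlib.sha256(blob).hexdigest()
        cas[blob_digest] = blob
        output_digests.append(blob_digest)
    action_cache[digest] = output_digests
    return outputs, False
```

Because the digest covers the command, inputs, and platform properties, changing any of them yields a different key and therefore a cache miss.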

Deployment Patterns

Single-Node Setup

All components run on a single machine. Ideal for local development and CI runners.
  • Scheduler + Worker + Storage on one node
  • In-memory or filesystem storage
  • Minimal configuration

Distributed Cluster

Components are distributed across multiple machines for scalability.
  • Dedicated scheduler nodes
  • Worker pool (tens to thousands of nodes)
  • Shared cloud storage (S3, GCS)
  • Redis for metadata

Hybrid Cloud

Local caching with remote execution.
  • Local CAS/AC stores
  • gRPC scheduler forwarding to cloud
  • FastSlow store for cache tiers
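A fast/slow cache tier can be sketched as a small wrapper over two stores. The class below is a hypothetical illustration using dictionaries as backends. It assumes reads fall through to the slow tier and populate the fast tier, and that writes go to both; NativeLink's actual FastSlow store may behave differently in detail.

```python
class FastSlowStore:
    """Toy two-tier store: a fast local tier in front of a slow shared tier."""

    def __init__(self, fast, slow):
        self.fast = fast
        self.slow = slow

    def get(self, digest):
        # Serve from the fast tier when possible.
        if digest in self.fast:
            return self.fast[digest]
        # Otherwise fall back to the slow tier and warm the fast tier.
        blob = self.slow.get(digest)
        if blob is not None:
            self.fast[digest] = blob
        return blob

    def put(self, digest, blob):
        # Assumption: writes are mirrored to both tiers.
        self.fast[digest] = blob
        self.slow[digest] = blob
```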

Multi-Region

Geographically distributed deployment.
  • Regional schedulers and workers
  • Shared global CAS (S3/GCS)
  • Compression for network efficiency

Communication Protocols

NativeLink implements the following Remote Execution API v2 services:

Execution Service

  • Execute - Submit actions for execution
  • WaitExecution - Monitor execution progress

Content Addressable Storage Service

  • FindMissingBlobs - Check blob existence
  • BatchUpdateBlobs - Upload small blobs
  • BatchReadBlobs - Download small blobs
  • GetTree - Retrieve directory trees
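A common client pattern is to call FindMissingBlobs first and upload only what the server lacks. The sketch below models that pattern with an in-memory stand-in; `InMemoryCas` and `upload_inputs` are hypothetical names, and no real gRPC calls are made.

```python
import hashlib


def digest_of(blob):
    # A digest pairs the content hash with the size in bytes.
    return (hashlib.sha256(blob).hexdigest(), len(blob))


class InMemoryCas:
    """Stand-in for a CAS server, keyed by (hash, size_bytes)."""

    def __init__(self):
        self._blobs = {}

    def find_missing_blobs(self, digests):
        return [d for d in digests if d not in self._blobs]

    def batch_update_blobs(self, blobs_by_digest):
        self._blobs.update(blobs_by_digest)


def upload_inputs(cas, blobs):
    """Upload only the blobs the CAS does not already hold; return the count."""
    by_digest = {digest_of(b): b for b in blobs}
    missing = cas.find_missing_blobs(list(by_digest))
    cas.batch_update_blobs({d: by_digest[d] for d in missing})
    return len(missing)
```

Identical content always maps to the same digest, so repeated uploads of unchanged inputs transfer nothing.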

ByteStream Service

  • Read - Stream blob downloads
  • Write - Stream blob uploads
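ByteStream requests identify blobs by resource name. The sketch below builds the basic uncompressed SHA-256 forms described in the Remote Execution API v2 specification: `{instance_name}/blobs/{hash}/{size}` for reads and `{instance_name}/uploads/{uuid}/blobs/{hash}/{size}` for writes. The helper function names are hypothetical, and compressed-blob variants are not covered.

```python
import hashlib
import uuid


def _prefix(instance_name):
    # An empty instance name contributes no leading path segment.
    return f"{instance_name}/" if instance_name else ""


def read_resource_name(instance_name, blob):
    blob_hash = hashlib.sha256(blob).hexdigest()
    return f"{_prefix(instance_name)}blobs/{blob_hash}/{len(blob)}"


def write_resource_name(instance_name, blob, upload_id=None):
    # Each upload gets a client-chosen UUID; a new one is generated if none is given.
    upload_id = upload_id or str(uuid.uuid4())
    blob_hash = hashlib.sha256(blob).hexdigest()
    return f"{_prefix(instance_name)}uploads/{upload_id}/blobs/{blob_hash}/{len(blob)}"
```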

Action Cache Service

  • GetActionResult - Retrieve cached results
  • UpdateActionResult - Store action results

Capabilities Service

  • GetCapabilities - Query server capabilities

Platform Properties

Platform properties enable fine-grained worker matching. Each supported property name is mapped to a matching rule such as "minimum" or "exact".
Example Configuration:
{
  "supported_platform_properties": {
    "cpu_count": "minimum",
    "cpu_arch": "exact",
    "OSFamily": "exact"
  }
}
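The following sketch shows one way the "minimum" and "exact" rules in this configuration could be evaluated when matching an action to a worker. The `worker_matches` function is hypothetical. It assumes "minimum" means the worker's numeric value must be at least the requested value, "exact" means string equality, and properties outside the supported set cause a rejection; the real scheduler may treat these cases differently.

```python
SUPPORTED = {"cpu_count": "minimum", "cpu_arch": "exact", "OSFamily": "exact"}


def worker_matches(action_props, worker_props, supported=SUPPORTED):
    """Return True if the worker satisfies every property the action requests."""
    for name, wanted in action_props.items():
        rule = supported.get(name)
        if rule is None:
            return False  # Assumption: unknown properties are rejected.
        have = worker_props.get(name)
        if have is None:
            return False  # The worker does not advertise this property.
        if rule == "exact" and have != wanted:
            return False
        if rule == "minimum" and int(have) < int(wanted):
            return False
    return True
```

For example, an action requesting `cpu_count` of 4 would match a worker advertising 8, while an action requesting `cpu_arch` of `arm64` would not match an `x86_64` worker.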

Configuration Files

NativeLink uses JSON5 configuration files that define:
  • Stores: CAS and AC backend configurations
  • Schedulers: Task scheduling and worker management
  • Workers: Execution capabilities and resources
  • Servers: gRPC service endpoints
See the configuration examples for reference deployments.

Performance Characteristics

NativeLink is trusted in production to handle over 1 billion requests per month for customers including Samsung.
Key Performance Features:
  • Content-addressed deduplication eliminates redundant storage
  • Incremental builds reuse cached artifacts
  • Parallel remote execution distributes workload
  • Store composition (compression, dedup, fast/slow tiers)
  • Efficient binary protocols (gRPC + protobuf)

Metrics and Observability

NativeLink provides extensive metrics and tracing:
  • Prometheus Metrics: Component-level performance data
  • OpenTelemetry Tracing: Distributed request tracing
  • Origin Events: Action lifecycle tracking
  • Health Endpoints: Service status monitoring

Next Steps

Build Cache

Learn how build caching accelerates builds

Remote Execution

Understand distributed task execution

Storage Backends

Explore storage options and composition
