Build Cache

Build caching is one of NativeLink’s core features that dramatically reduces build times by storing and reusing results from previous builds. When a build action is requested, NativeLink checks if an identical action has been executed before and returns the cached result instead of re-executing.

How Build Caching Works

The build cache operates on the principle of content-addressable storage, where every build artifact is identified by a cryptographic hash of its content.

Action Digest Computation

An Action Digest uniquely identifies a build action and is computed from:

Command: The exact command to execute (e.g., gcc -c file.c -o file.o)
Input Files: Content hashes of all input files and directories
Environment Variables: Specified environment variables
Platform Properties: Execution requirements (OS, architecture, etc.)
Timeout: Execution timeout configuration

If any of these inputs change, the action digest changes, resulting in a cache miss.
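As a rough illustration of why any input change produces a new digest, the sketch below hashes a canonical encoding of the five inputs listed above. It rests on assumptions: the function name `action_digest`, the JSON encoding, and the placeholder input root value are hypothetical, and a real client would hash a serialized action message rather than JSON.

```python
import hashlib
import json


def action_digest(command, input_root_digest, env, platform, timeout_s):
    """Hash a canonical encoding of everything that defines an action.

    Assumption: canonical JSON stands in for the serialized action message.
    """
    canonical = json.dumps(
        {
            "command": command,
            "input_root_digest": input_root_digest,
            "env": sorted(env.items()),            # order-independent
            "platform": sorted(platform.items()),  # order-independent
            "timeout_s": timeout_s,
        },
        sort_keys=True,
        separators=(",", ":"),
    ).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


base = dict(
    command=["gcc", "-c", "file.c", "-o", "file.o"],
    input_root_digest="placeholder-input-root",  # hypothetical value
    env={"PATH": "/usr/bin"},
    platform={"OSFamily": "linux", "arch": "x86_64"},
    timeout_s=600,
)

same = action_digest(**base)
changed = action_digest(**{**base, "env": {"PATH": "/usr/local/bin"}})
print(same == action_digest(**base))  # identical inputs give an identical digest
print(same != changed)                # one changed variable gives a new digest
```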

Cache Lookup Process
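In outline, a lookup computes the action digest, checks the Action Cache, and executes only on a miss. The sketch below illustrates that flow under simplifying assumptions: a plain dictionary stands in for the Action Cache, `run_action` is a hypothetical executor, and the digest covers only the command line.

```python
import hashlib

action_cache = {}  # action digest -> result (stand-in for the Action Cache)


def run_action(command):
    """Hypothetical executor: pretends to run the command."""
    return {"exit_code": 0, "stdout": "ran " + " ".join(command)}


def execute_with_cache(command):
    """Return (result, was_cache_hit) for the given command."""
    digest = hashlib.sha256("\0".join(command).encode("utf-8")).hexdigest()
    cached = action_cache.get(digest)
    if cached is not None:
        return cached, True       # cache hit: skip execution
    result = run_action(command)  # cache miss: execute, then store
    action_cache[digest] = result
    return result, False


cmd = ["gcc", "-c", "file.c", "-o", "file.o"]
first_result, first_hit = execute_with_cache(cmd)
second_result, second_hit = execute_with_cache(cmd)
```

The first call misses and populates the cache; the second call with the same command is served from it.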

Content Addressable Storage (CAS)

The Content Addressable Storage (CAS) is where all build artifacts are stored, indexed by their content hash.

Digest-Based Addressing

Every blob in CAS is identified by a DigestInfo:

pub struct DigestInfo {
    pub hash: [u8; 32],      // SHA-256 or BLAKE3 hash
    pub size_bytes: u64,     // Size of the content
}
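For readers more comfortable with Python, the following sketch mirrors the struct above and shows how a digest could be derived from a blob. The dataclass and the `digest_of` helper are illustrative names, not part of NativeLink, and only SHA-256 is shown because it ships with the standard library.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class DigestInfo:
    hash: bytes      # 32-byte SHA-256 digest
    size_bytes: int  # size of the content


def digest_of(data: bytes) -> DigestInfo:
    """Compute the (hash, size) pair that addresses a blob."""
    return DigestInfo(hashlib.sha256(data).digest(), len(data))


info = digest_of(b"int main(void) { return 0; }\n")
print(info.hash.hex(), info.size_bytes)
```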

Supported Hash Functions:

SHA-256: Industry standard, widely compatible
BLAKE3: Faster alternative with parallel hashing

SHA-256: Default hash function used by most build tools.

{
  "default_digest_hash_function": "sha256"
}

BLAKE3: High-performance hash function with better parallelism.

{
  "default_digest_hash_function": "blake3"
}

Ensure all clients and workers use the same hash function.

Deduplication

Content addressing provides automatic deduplication:

Identical files share the same digest and are stored only once
Saves storage space especially for common dependencies
Reduces network transfer when artifacts already exist in CAS

Example: If 100 build actions all include the same stdlib.h, it’s stored only once in CAS.
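The deduplication effect can be reproduced with a toy store keyed by content hash. This is a minimal sketch, assuming an in-memory dictionary as the backend; `ContentStore` is a hypothetical name.

```python
import hashlib


class ContentStore:
    """Toy content-addressable store: the key is the hash of the value."""

    def __init__(self):
        self.blobs = {}

    def put(self, data: bytes) -> str:
        key = hashlib.sha256(data).hexdigest()
        self.blobs.setdefault(key, data)  # no-op if already stored
        return key


store = ContentStore()
header = b"#include <stddef.h>\n"

# 100 actions upload the same header; every upload yields the same key.
keys = {store.put(header) for _ in range(100)}
print(len(keys), len(store.blobs))
```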

Action Cache (AC)

The Action Cache maps action digests to their execution results.

ActionResult Structure

An ActionResult contains:

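// Simplified from the Remote Execution API definition;
// symlink and raw-output fields are omitted.
message ActionResult {
  repeated OutputFile output_files = 2;
  repeated OutputDirectory output_directories = 3;
  int32 exit_code = 4;
  Digest stdout_digest = 6;
  Digest stderr_digest = 8;
  ExecutedActionMetadata execution_metadata = 9;  // timestamps, worker info
}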

Key Fields:

Output Files/Directories: Digests of produced artifacts
Exit Code: Command return code
Stdout/Stderr: Digests of captured output
Execution Metadata: Duration, worker info

Cache Validation

NativeLink ensures cache integrity through Completeness Checking:

Completeness Checking Store

When enabled, the CompletenessCheckingSpec wrapper verifies that all output digests referenced in an ActionResult actually exist in the CAS before returning a cache hit.

{
  "completeness_checking": {
    "backend": {
      "filesystem": { ... }
    },
    "cas_store": {
      "ref_store": { "name": "CAS_MAIN_STORE" }
    }
  }
}

Recommended for AC stores to prevent returning incomplete cache entries.
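Conceptually, the check amounts to treating an Action Cache entry as a miss whenever any referenced output is absent from the CAS. The sketch below shows that idea only: the dictionaries, the `output_digests` key, and the function name are assumptions made for illustration, not NativeLink's data model.

```python
def complete_lookup(action_cache, cas, action_digest):
    """Return the cached result only if every output blob is still in the CAS."""
    result = action_cache.get(action_digest)
    if result is None:
        return None  # ordinary cache miss
    if all(d in cas for d in result["output_digests"]):
        return result  # complete entry: safe to return
    return None  # an output was evicted: report a miss instead


cas = {"obj-digest": b"\x7fELF..."}
action_cache = {
    "action-1": {"exit_code": 0, "output_digests": ["obj-digest"]},
    "action-2": {"exit_code": 0, "output_digests": ["evicted-digest"]},
}

hit = complete_lookup(action_cache, cas, "action-1")
miss = complete_lookup(action_cache, cas, "action-2")
```

Reporting a miss for the incomplete entry forces re-execution, which is slower than a hit but avoids handing the client a result it cannot download.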

Cache Hit Optimization Strategies

1. Deterministic Builds

Ensure builds are deterministic to maximize cache hits:

Use Relative Paths

Avoid absolute paths in compiler flags that vary between machines.

Fixed Timestamps

Remove timestamp dependencies from build outputs.

Sorted Inputs

Process inputs in consistent order (e.g., sorted file lists).

Hermetic Environments

Isolate builds from system-specific dependencies.
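Two of these practices, sorted inputs and fixed timestamps, can be demonstrated with a small packaging step. The sketch below is one possible illustration using Python's `tarfile` module; the `archive_digest` helper and the file names are hypothetical.

```python
import hashlib
import io
import tarfile


def archive_digest(files: dict) -> str:
    """Pack files into a tar archive deterministically and hash the result."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name in sorted(files):      # sorted inputs: stable member order
            info = tarfile.TarInfo(name)
            info.size = len(files[name])
            info.mtime = 0              # fixed timestamp: no wall-clock time
            tar.addfile(info, io.BytesIO(files[name]))
    return hashlib.sha256(buf.getvalue()).hexdigest()


# The same files supplied in a different order produce the same digest.
first = archive_digest({"b.o": b"object b", "a.o": b"object a"})
second = archive_digest({"a.o": b"object a", "b.o": b"object b"})
print(first == second)
```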

2. Fine-Grained Actions

Break builds into smaller, focused actions:

Compile individual source files separately
Link as a separate action
Generate headers in dedicated actions

Benefit: Changing one source file only invalidates that file’s action, not the entire build.

3. Input Root Minimization

Include only necessary inputs in the action:

// ❌ Bad: Includes entire source tree
{
  "input_root_digest": "<digest-of-entire-repo>"
}

// ✅ Good: Includes only required files
{
  "input_root_digest": "<digest-of-src-file-and-headers>"
}
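To see why a smaller input root helps, consider a toy root digest computed over file paths and contents. This is a simplified stand-in (real input roots are Merkle trees of directories), and `root_digest` and the file names are assumptions for illustration.

```python
import hashlib


def root_digest(files: dict) -> str:
    """Toy input-root digest: hash of sorted (path, content hash) pairs."""
    h = hashlib.sha256()
    for path in sorted(files):
        file_hash = hashlib.sha256(files[path]).hexdigest()
        h.update(f"{path}\0{file_hash}\n".encode("utf-8"))
    return h.hexdigest()


repo = {
    "src/a.c": b"#include \"a.h\"\n",
    "src/a.h": b"int a(void);\n",
    "docs/README.md": b"v1\n",
}
needed = {p: repo[p] for p in ("src/a.c", "src/a.h")}

whole_before, minimal_before = root_digest(repo), root_digest(needed)

repo["docs/README.md"] = b"v2\n"  # unrelated edit
needed = {p: repo[p] for p in ("src/a.c", "src/a.h")}

whole_after, minimal_after = root_digest(repo), root_digest(needed)
print(whole_before != whole_after)      # whole-repo root is invalidated
print(minimal_before == minimal_after)  # minimal root is unaffected
```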

Cache Storage Backends

NativeLink supports various storage backends for CAS and AC:

Memory
Filesystem
S3 / GCS
Redis

Memory: Ultra-fast in-memory cache.
Use Case: Development, small projects, fast local cache tier
Limitations: Volatile, limited by RAM

{
  "memory": {
    "eviction_policy": {
      "max_bytes": "10gb"
    }
  }
}

Filesystem: Persistent disk-based storage.
Use Case: Local persistent cache, single-node deployments

{
  "filesystem": {
    "content_path": "/var/cache/nativelink/cas",
    "temp_path": "/var/cache/nativelink/tmp",
    "eviction_policy": {
      "max_bytes": "100gb"
    }
  }
}

S3 / GCS: Cloud storage for distributed teams.
Use Case: Multi-machine clusters, CI/CD pipelines

{
  "experimental_cloud_object_store": {
    "provider": "aws",
    "region": "us-east-1",
    "bucket": "my-build-cache"
  }
}

Redis: Fast remote cache for small artifacts.
Use Case: Metadata storage, small blob cache
Limitations: Object size limits (typically 256-512 MB)

{
  "redis_store": {
    "addresses": ["redis://localhost:6379"]
  }
}

Multi-Tier Caching

Combine storage backends for optimal performance:

{
  "fast_slow": {
    "fast": {
      "memory": {
        "eviction_policy": { "max_bytes": "5gb" }
      }
    },
    "slow": {
      "experimental_cloud_object_store": {
        "provider": "aws",
        "bucket": "build-cache"
      }
    }
  }
}

How it Works:

Read: Check fast tier (memory), fallback to slow tier (S3)
Write: Write to both tiers simultaneously
Promotion: Slow tier hits are cached in fast tier

This pattern provides local speed with cloud persistence.
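The read, write, and promotion behavior described above can be sketched with two dictionaries standing in for the tiers. The class below is a hypothetical illustration, not NativeLink's implementation, and it writes to the tiers sequentially rather than concurrently.

```python
class FastSlowStore:
    """Toy two-tier store: dicts stand in for memory and cloud backends."""

    def __init__(self):
        self.fast = {}
        self.slow = {}

    def put(self, key, value):
        # Write: populate both tiers.
        self.fast[key] = value
        self.slow[key] = value

    def get(self, key):
        # Read: try the fast tier first.
        if key in self.fast:
            return self.fast[key]
        # Fallback: consult the slow tier and promote on a hit.
        value = self.slow.get(key)
        if value is not None:
            self.fast[key] = value
        return value


store = FastSlowStore()
store.put("digest-1", b"artifact")
store.fast.clear()                # simulate eviction or a restart
value = store.get("digest-1")     # served from the slow tier
promoted = "digest-1" in store.fast
```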

Cache Eviction Policies

Control cache size with eviction policies:

{
  "eviction_policy": {
    "max_bytes": "100gb",        // Evict when size exceeds
    "evict_bytes": "10gb",       // Evict until 90gb (low watermark)
    "max_seconds": 604800,       // Evict after 7 days
    "max_count": 1000000         // Evict when item count exceeds
  }
}

Eviction Strategy: Least Recently Used (LRU)
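The interaction between `max_bytes` and `evict_bytes` can be sketched as an LRU cache with a low watermark. This is a simplified model under stated assumptions: it ignores `max_seconds` and `max_count`, and the `LruCache` class is hypothetical.

```python
from collections import OrderedDict


class LruCache:
    """Toy LRU cache: evicts down to a low watermark once max_bytes is exceeded."""

    def __init__(self, max_bytes, evict_bytes=0):
        self.max_bytes = max_bytes
        self.low_watermark = max_bytes - evict_bytes
        self.items = OrderedDict()  # oldest access first
        self.size = 0

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)  # mark as most recently used
        return self.items[key]

    def put(self, key, data):
        if key in self.items:
            self.size -= len(self.items.pop(key))
        self.items[key] = data
        self.size += len(data)
        if self.size > self.max_bytes:
            # Evict least recently used entries until under the low watermark.
            while self.items and self.size > self.low_watermark:
                _, evicted = self.items.popitem(last=False)
                self.size -= len(evicted)


cache = LruCache(max_bytes=10, evict_bytes=2)
cache.put("a", b"aaaa")
cache.put("b", b"bbbb")
cache.get("a")           # "a" is now more recently used than "b"
cache.put("c", b"cccc")  # 12 bytes > 10, so "b" is evicted to reach 8
```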

Zero-Byte File Handling

NativeLink optimizes for common cases:

Special Zero-Byte Digests

Empty files have well-known digests:

SHA-256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
BLAKE3: af1349b9f5f9a1a6a0404dea36dcc9499bcb25c9adc112b7cc9a93cae41f3262

These are often handled specially to avoid unnecessary storage/transfer.
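The SHA-256 value can be checked with the standard library, and the special handling can be pictured as a short-circuit before any store access. In this sketch the `fetch` function and the dictionary store are assumptions; BLAKE3 is left out because it is not in Python's standard library.

```python
import hashlib

EMPTY_SHA256 = "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"


def fetch(store: dict, digest_hex: str, size_bytes: int) -> bytes:
    """Return blob contents, answering the empty blob without touching the store."""
    if size_bytes == 0 and digest_hex == EMPTY_SHA256:
        return b""
    return store[digest_hex]


# The well-known digest is simply the hash of zero bytes of input.
matches = hashlib.sha256(b"").hexdigest() == EMPTY_SHA256
empty = fetch({}, EMPTY_SHA256, 0)  # succeeds even though the store is empty
```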

Cache Verification

The VerifySpec wrapper validates uploads:

{
  "verify": {
    "backend": { ... },
    "verify_size": true,    // Check size matches digest
    "verify_hash": true     // Recompute and verify hash
  }
}

Recommendation:

CAS: Enable both verify_size and verify_hash
AC: Disable both (action results are not content-addressed)
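What the two flags check can be sketched as a guard in front of a store's write path. The function below is an illustrative assumption (name, signature, and SHA-256 only), not the wrapper's actual code.

```python
import hashlib


def verified_put(store, digest_hex, size_bytes, data,
                 verify_size=True, verify_hash=True):
    """Store data under its claimed digest after optional size and hash checks."""
    if verify_size and len(data) != size_bytes:
        raise ValueError(f"size mismatch: expected {size_bytes}, got {len(data)}")
    if verify_hash and hashlib.sha256(data).hexdigest() != digest_hex:
        raise ValueError("hash mismatch: content does not match digest")
    store[digest_hex] = data


store = {}
data = b"hello"
digest = hashlib.sha256(data).hexdigest()
verified_put(store, digest, len(data), data)  # accepted

try:
    verified_put(store, digest, len(data), b"hellp")  # same size, wrong content
    rejected = False
except ValueError:
    rejected = True
```

With both checks disabled the function accepts any key, which matches the AC recommendation above: action results are stored under the action digest, not under a hash of their own content.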

Cache Statistics

Monitor cache effectiveness with these key metrics:

Cache Hit Rate: Percentage of actions served from cache
Cache Size: Total bytes stored
Eviction Rate: How often items are evicted
Download/Upload Volume: Network transfer savings

A well-configured cache can achieve 80-95% hit rates for incremental builds.

Best Practices

Use Completeness Checking on AC stores to ensure cache integrity
Enable Verification on CAS uploads to catch corrupt data early
Use Size Partitioning to separate small and large artifacts
Enable Compression for network-backed stores to reduce transfer costs
Monitor Cache Metrics to tune eviction policies
Share Caches across teams to maximize reuse

Next Steps

Storage Backends

Remote Execution