Stack diff snapshots into reusable chains

Every snapshot built by forkd from-image or forkd snapshot is a self-contained triple: memory.bin (full guest RAM), vmstate, and rootfs.ext4. When you build a second snapshot by running pip install pandas on top of a Python base, you store the full 1.5 GiB base again, so three pip install deltas leave you with three copies of the same bytes. Diff-snapshot chains (v0.5) solve this: each layer records only the pages dirtied by its installer step, plus a parent_tag and content-hash edge to the layer below. The daemon walks the chain at spawn time and assembles the memory image in one pass. The recipe maintainer’s iteration loop drops from “re-snapshot the full base” to “snapshot the 12 MB delta.”

This is distinct from the v0.3 BRANCH diff path. v0.3 optimises the pause window when branching a running sandbox. v0.5 chains optimise build time and disk usage by stacking installer deltas against a shared base. Both use Firecracker’s snapshot_type: "Diff" under the hood.

Building a chain with `forkd snapshot-diff`

forkd snapshot-diff spawns a one-shot sandbox from a base tag, runs an installer command in the guest, takes a Diff snapshot of only the dirtied pages, and registers the result with a parent_tag pointing back to the base.

# Build a 3-layer chain off a Python 3.12 base snapshot
forkd snapshot-diff --from py-base --tag py-numpy   --exec "pip install numpy==2.0.2"
forkd snapshot-diff --from py-numpy --tag py-pandas --exec "pip install pandas==2.2.3"
forkd snapshot-diff --from py-pandas --tag py-sklearn \
    --exec "pip install scikit-learn==1.5.0"

Each command:

1. Spawns a one-shot sandbox from --from
2. Waits for the guest agent to respond
3. Runs --exec via the agent’s exec endpoint
4. Pauses the sandbox and takes a Diff snapshot (only the pages dirtied since restore)
5. Writes snapshot.json with parent_tag and parent_content_hash
6. Tears down the sandbox

Flags

Flag	Default	Description
`--from`	(required)	Base snapshot tag to derive from. Must be registered with the daemon.
`--tag`	(required)	Tag for the new diff snapshot.
`--exec`	(required)	Shell command to run in the sandbox (e.g. `"pip install pandas==2.2.3"`). Quoted string, split on whitespace.
`--exec-timeout-secs`	`600`	Wall-clock timeout for the exec step. Increase for large installs (`pip install torch`).
`--daemon-url`	`http://127.0.0.1:8889`	Controller daemon URL. Also reads `FORKD_URL`.
`--daemon-token`	(env)	Bearer token. Also reads `FORKD_TOKEN`.
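
For recipes with more than a couple of layers, it can help to generate the snapshot-diff invocations from a list so that each layer's --from is always the previous layer's --tag. The following is a minimal sketch that only builds argument lists from the flags in the table above. The helper names and the 1200-second timeout are illustrative assumptions, not part of forkd.

```python
import shlex


def snapshot_diff_argv(parent, tag, command, timeout_secs=None):
    """Build the argv for one `forkd snapshot-diff` call using documented flags only."""
    argv = ["forkd", "snapshot-diff", "--from", parent, "--tag", tag, "--exec", command]
    if timeout_secs is not None:
        argv += ["--exec-timeout-secs", str(timeout_secs)]
    return argv


def chain_commands(base, layers):
    """Return one argv per layer; each layer derives from the one before it."""
    commands = []
    parent = base
    for tag, command, timeout_secs in layers:
        commands.append(snapshot_diff_argv(parent, tag, command, timeout_secs))
        parent = tag  # the layer just described becomes the next parent
    return commands


# (tag, installer command, optional timeout); the 1200 s value is an arbitrary example.
LAYERS = [
    ("py-numpy", "pip install numpy==2.0.2", None),
    ("py-pandas", "pip install pandas==2.2.3", None),
    ("py-sklearn", "pip install scikit-learn==1.5.0", 1200),
]

for argv in chain_commands("py-base", LAYERS):
    print(shlex.join(argv))
```

To actually run the layers, each argv could be passed to subprocess.run(argv, check=True), which stops the build at the first failing installer step.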

Spawning from a chain head

Spawning from a chained snapshot is transparent to the caller — the API surface is identical to spawning from a flat base:

# CLI
sudo -E forkd fork --tag py-pandas -n 5 --per-child-netns

# REST
curl -H "Authorization: Bearer $TOKEN" \
     -H 'Content-Type: application/json' \
     -X POST http://127.0.0.1:8889/v1/sandboxes \
     -d '{"snapshot_tag":"py-pandas","n":5,"per_child_netns":true}'
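
The same REST call can be issued from Python with only the standard library. This sketch mirrors the curl example above (same path, headers, and JSON body) and builds the request without sending it. The function name spawn_request is hypothetical, and the response format is not covered here.

```python
import json
import os
import urllib.request


def spawn_request(snapshot_tag, n, per_child_netns,
                  base_url="http://127.0.0.1:8889", token=""):
    """Build the POST /v1/sandboxes request shown in the curl example."""
    body = json.dumps({
        "snapshot_tag": snapshot_tag,
        "n": n,
        "per_child_netns": per_child_netns,
    }).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/v1/sandboxes",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# The chain head is passed exactly like a flat base tag.
req = spawn_request("py-pandas", 5, True, token=os.environ.get("FORKD_TOKEN", ""))
# Sending it would be: urllib.request.urlopen(req)
```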

At restore time the daemon walks the parent_tag chain (py-pandas → py-numpy → py-base), verifies each link’s parent_content_hash against the actual memory.bin SHA-256 on disk, assembles the memory image by copying the base with cp --reflink=auto and then running apply_diff for each link, and hands the assembled file to Firecracker. The chain depth is opaque to the caller.
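
The walk-and-assemble step can be pictured with a small in-memory model. This is a sketch under stated assumptions, not the daemon's implementation: the registry dict, the dirty_pages mapping (page index to page bytes), and the 4 KiB page size are all hypothetical stand-ins for the on-disk snapshot files, and hash verification is left out to keep the focus on ordering.

```python
PAGE_SIZE = 4096  # assumed page size for this toy model


def resolve_chain(registry, head_tag):
    """Follow parent_tag edges from the head and return tags ordered root -> head."""
    chain, seen, tag = [], set(), head_tag
    while tag is not None:
        if tag in seen:
            raise ValueError(f"cycle in chain at {tag!r}")
        if tag not in registry:
            raise KeyError(f"missing snapshot {tag!r}")
        seen.add(tag)
        chain.append(tag)
        tag = registry[tag].get("parent_tag")
    chain.reverse()
    return chain


def assemble_memory(registry, head_tag):
    """Copy the base image, then overlay each link's dirty pages in root -> head order."""
    chain = resolve_chain(registry, head_tag)
    image = bytearray(registry[chain[0]]["memory"])  # stands in for the copy of the base
    for tag in chain[1:]:
        for page_index, page in registry[tag]["dirty_pages"].items():
            start = page_index * PAGE_SIZE
            if len(page) != PAGE_SIZE or start + PAGE_SIZE > len(image):
                raise ValueError(f"bad page {page_index} in {tag!r}")
            image[start:start + PAGE_SIZE] = page  # later links win over earlier ones
    return bytes(image)


registry = {
    "py-base": {"parent_tag": None, "memory": bytes(4 * PAGE_SIZE)},
    "py-numpy": {"parent_tag": "py-base", "dirty_pages": {1: b"\x01" * PAGE_SIZE}},
    "py-pandas": {"parent_tag": "py-numpy",
                  "dirty_pages": {1: b"\x02" * PAGE_SIZE, 3: b"\x03" * PAGE_SIZE}},
}
image = assemble_memory(registry, "py-pandas")
```

Applying links in root-to-head order matters: page 1 is dirtied by both py-numpy and py-pandas in this example, and the head's version is the one that must survive.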

Restore performance by filesystem

Restore latency depends on whether your host filesystem supports reflinks:

Filesystem	`cp` cost (1.5 GiB base)	Diff merge (50 MB)	Chain total	vs flat base
btrfs / xfs / ext4+reflink	~1 ms (metadata only)	~0.4 s	~0.4 s	within 10% ✓
ext4 no-reflink (SATA SSD)	~1.5 s	~0.4 s	~1.9 s	~2× of base
ext4 no-reflink (HDD)	~6 s	~0.4 s	~6.4 s	~6× of base

forkd doctor warns when the host filesystem does not support reflinks. On non-reflink hosts, compact deep chains (see below) to recover flat-base performance.
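
To check a snapshot directory for reflink support independently of forkd doctor, one option on Linux is to attempt a FICLONE ioctl between two scratch files. This is a sketch of that idea only; how forkd doctor performs its check is not documented here. The ioctl number below is the x86-64 and arm64 encoding of FICLONE, and the function name is hypothetical.

```python
import fcntl
import os
import tempfile

FICLONE = 0x40049409  # _IOW(0x94, 9, int) as encoded on x86-64 and arm64 Linux


def supports_reflink(directory):
    """Return True if two files inside `directory` can share extents via FICLONE."""
    with tempfile.TemporaryDirectory(dir=directory) as scratch:
        src_path = os.path.join(scratch, "src")
        dst_path = os.path.join(scratch, "dst")
        with open(src_path, "wb") as src:
            src.write(b"\0" * 4096)
        with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
            try:
                # Ask the filesystem to make dst a clone of src.
                fcntl.ioctl(dst.fileno(), FICLONE, src.fileno())
            except OSError:
                return False  # filesystem (or platform) cannot clone
    return True


print(supports_reflink(tempfile.gettempdir()))
```

Point the function at the directory that holds your snapshots rather than the temp directory, since the two may live on different filesystems.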

Inspecting a chain — `forkd snapshot-info`

Before deleting or compacting a chained snapshot, inspect its depth, parent, and dependents:

forkd snapshot-info py-numpy

tag:                  py-numpy
dir:                  /home/user/.local/share/forkd/snapshots/py-numpy
chain depth:          1
parent_tag:           py-base
parent_content_hash:  b356ee89...
memory.bin:           512.0 MiB logical / 12.3 MiB on disk
vmstate:              21.4 KiB
ancestors:            py-base (root → parent)
dependents:           py-pandas (would be orphaned by `forkd rmi py-numpy` — pass --cascade or --force)

Pass --json to get the raw daemon JSON for programmatic use. The memory.bin physical size (12.3 MiB) vs logical size (512.0 MiB) shows the disk savings from storing only the dirtied pages as a sparse file.
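
The logical versus on-disk distinction is a property of sparse files in general and can be reproduced without forkd. The sketch below creates a 64 MiB sparse file with a single written page and reads both sizes from os.stat. The file name and sizes are arbitrary, and the exact on-disk figure depends on the host filesystem.

```python
import os
import tempfile

PAGE = 4096


def sizes(path):
    """Return (logical bytes, bytes allocated on disk) for a file."""
    st = os.stat(path)
    return st.st_size, st.st_blocks * 512  # st_blocks counts 512-byte units on Linux


with tempfile.TemporaryDirectory() as scratch:
    path = os.path.join(scratch, "memory.bin")
    with open(path, "wb") as f:
        f.truncate(64 * 1024 * 1024)   # 64 MiB logical size, no data written yet
        f.seek(10 * PAGE)
        f.write(b"\xff" * PAGE)        # one "dirtied" page
        f.flush()
        os.fsync(f.fileno())
    logical, on_disk = sizes(path)

print(f"{logical / 2**20:.1f} MiB logical / {on_disk / 2**10:.1f} KiB on disk")
```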

Removing chain links — `forkd rmi`

Removing a snapshot that is the parent_tag of another snapshot is refused by default to prevent orphaned chains:

forkd rmi py-numpy
# HTTP 409: snapshot `py-numpy` is the parent of 1 chained snapshot(s): [py-pandas];
# rerun with `?cascade=true` to delete the whole subtree,
# or `?force=true` to orphan the children (they will fail to restore)

Override with --cascade (delete the entire subtree) or --force (orphan children):

# Delete py-numpy AND py-pandas AND py-sklearn
forkd rmi py-numpy --cascade

# Delete py-numpy and leave py-pandas orphaned (it will fail to restore)
forkd rmi py-numpy --force

--cascade and --force are mutually exclusive.
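
The three outcomes (refuse, cascade, force) can be modelled as a small planning function over the parent_tag edges. This is an illustrative sketch of the rules described above, using a plain dict of tag to parent_tag. The function names are hypothetical and the real daemon logic may differ.

```python
def dependents(parents, tag):
    """Return every tag whose chain passes through `tag`, nearest links first."""
    found, frontier = [], [tag]
    while frontier:
        current = frontier.pop(0)
        children = sorted(t for t, p in parents.items() if p == current)
        found.extend(children)
        frontier.extend(children)
    return found


def plan_rmi(parents, tag, cascade=False, force=False):
    """Return the tags that would be deleted, or raise if the removal is refused."""
    if cascade and force:
        raise ValueError("--cascade and --force are mutually exclusive")
    deps = dependents(parents, tag)
    if deps and not (cascade or force):
        raise RuntimeError(f"snapshot {tag!r} still has dependents: {deps}")
    return [tag] + deps if cascade else [tag]


parents = {
    "py-base": None,
    "py-numpy": "py-base",
    "py-pandas": "py-numpy",
    "py-sklearn": "py-pandas",
}
print(plan_rmi(parents, "py-numpy", cascade=True))
print(plan_rmi(parents, "py-numpy", force=True))
```

With cascade the plan covers the whole subtree below py-numpy; with force only py-numpy is removed and its dependents are left without a parent.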

Compacting a chain — `forkd snapshot-compact`

As chain depth grows, each link incurs a SHA-256 verification cost against the base (approximately 460 ms for 512 MiB at 1.1 GiB/s). Compact a deep chain into a new flat snapshot to eliminate this tax:

forkd snapshot-compact --from py-sklearn --to py-sklearn-flat

The daemon resolves the full chain, verifies every parent content hash, assembles the head’s memory image, and persists the result as a new flat (parentless) snapshot under py-sklearn-flat. The original chain is left intact; run forkd rmi py-sklearn --cascade afterward if you no longer need the individual links. Verify the result:

forkd snapshot-info py-sklearn-flat
# chain depth: 0
# parent_tag:  (none — this is a base snapshot)

Compact when forkd snapshot-info shows chain depth ≥ 3 and you are on a non-reflink filesystem, or when the per-link SHA-256 tax is measurable in your spawn latency. The bench numbers below help calibrate when compacting pays off.

Performance — chain depth vs spawn latency

Benchmarks from bench/chain-spawn/RESULTS-v0.5.md on a 512 MiB base, ext4, i7-12700:

Chain head	Depth	Spawn p50	Per-link tax
base (flat)	0	59 ms	—
`+numpy`	1	751 ms	+692 ms
`+pandas`	2	1 222 ms	+471 ms
`+sklearn`	3	1 668 ms	+446 ms
flat-equivalent (3 pkgs, one diff)	1	746 ms	—

The per-link tax tracks the SHA-256 of the base (~460 ms for 512 MiB at 1.1 GiB/s). The mmap-once-then-incremental verify optimisation is queued for v0.6 and will collapse the per-link tax to near zero. Correctness: 90/90 probe passes (100%) across L1/L2/L3 plus the flat-equivalent.
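
The per-link tax column and the SHA-256 estimate can be reproduced from the numbers in the table. The short calculation below uses only the p50 values and the 512 MiB at 1.1 GiB/s figure quoted above; the helper names are illustrative.

```python
# Spawn p50 per chain head, in milliseconds, copied from the table above.
SPAWN_P50_MS = [("base", 59), ("+numpy", 751), ("+pandas", 1222), ("+sklearn", 1668)]


def per_link_tax(rows):
    """Difference in p50 between each chain head and the one below it."""
    return [(name, ms - prev_ms) for (_, prev_ms), (name, ms) in zip(rows, rows[1:])]


def sha256_ms(size_mib, throughput_gib_per_s):
    """Time to hash `size_mib` at the given throughput, in milliseconds."""
    return size_mib / (throughput_gib_per_s * 1024) * 1000


taxes = per_link_tax(SPAWN_P50_MS)
estimate = sha256_ms(512, 1.1)

for name, tax in taxes:
    print(f"{name}: +{tax} ms")
print(f"one SHA-256 pass over the base: about {estimate:.0f} ms")
```

The estimate comes out at roughly 455 ms, which matches the ~460 ms quoted above and sits close to the +471 ms and +446 ms measured for the second and third links.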

Packing and unpacking chains

forkd pack automatically detects whether a snapshot is a chain head and bundles all ancestor links into a single .tar.zst archive:

# Pack the entire 3-link chain into one file
forkd pack --tag py-pandas --out py-pandas-chain.tar.zst
# ✓ wrote 28.4 MiB (... uncompressed; 19.2× compression)  (3 chain links)

# Unpack on another host — restores all 3 link directories
forkd unpack py-pandas-chain.tar.zst
# ✓ link 'py-base' → 'py-base' at ~/.local/share/forkd/snapshots/py-base/
# ✓ link 'py-numpy' → 'py-numpy' at ...  (chain parent: py-base)
# ✓ link 'py-pandas' → 'py-pandas' at ...  (chain parent: py-numpy)
# ✓ unpacked 3 chain link(s), head = 'py-pandas'

The v2 pack format stores each link under its own <tag>/ directory inside the tarball. The manifest carries a chain[] array ordered root → head with per-link sha256 digests. forkd unpack verifies every file against its digest before moving it into place.
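
Digest verification of an unpacked bundle could look like the following sketch. The manifest layout used here (a chain list of objects with tag and a files mapping of file name to SHA-256 hex digest) is an assumption made for illustration; the text above only states that the manifest carries a chain[] array ordered root → head with per-link sha256 digests. The demo bundle is synthetic.

```python
import hashlib
import os
import tempfile


def sha256_file(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 so large memory images are not loaded whole."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_bundle(root, manifest):
    """Check every listed file; return tags root -> head or raise on the first mismatch."""
    verified = []
    for link in manifest["chain"]:
        for name, expected in link["files"].items():
            actual = sha256_file(os.path.join(root, link["tag"], name))
            if actual != expected:
                raise ValueError(f"{link['tag']}/{name}: digest mismatch")
        verified.append(link["tag"])
    return verified


def build_demo_bundle(root):
    """Create two tiny <tag>/memory.bin files and a matching (hypothetical) manifest."""
    manifest = {"chain": []}
    for tag, payload in [("py-base", b"base"), ("py-numpy", b"numpy delta")]:
        os.makedirs(os.path.join(root, tag))
        path = os.path.join(root, tag, "memory.bin")
        with open(path, "wb") as f:
            f.write(payload)
        manifest["chain"].append({"tag": tag, "files": {"memory.bin": sha256_file(path)}})
    return manifest


with tempfile.TemporaryDirectory() as demo_root:
    print(verify_bundle(demo_root, build_demo_bundle(demo_root)))
```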

On forkd unpack, --tag <override> is only accepted for single-link bundles. Multi-link chain packs always use the tags recorded in the manifest, because a single override is ambiguous when multiple links are involved.

SHA-256 content-hash pinning

Every chained snapshot’s snapshot.json records a parent_content_hash — the SHA-256 of the parent’s memory.bin at chain-build time. At restore time the daemon verifies the parent’s current memory.bin against this hash. If the base was re-snapshotted under the same tag, the restore fails with a clear error:

HTTP 409: chain `py-numpy` references `py-base` content `b356ee89...`,
but parent now has content `fa2c11dd...`;
rebuild with `forkd snapshot-diff --from py-base`

This prevents silent corruption when a base snapshot is updated — the chain must be explicitly rebuilt rather than silently restoring the wrong bytes. Compact first if you want to decouple the chain from the base before updating it.
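
The pinning check itself is a straightforward hash comparison. The sketch below models it with in-memory bytes standing in for the parent's memory.bin; the exception class and function name are hypothetical, and the error text only approximates the daemon message shown above.

```python
import hashlib


class ChainHashMismatch(Exception):
    """Raised when a parent's content no longer matches the hash pinned by its child."""


def check_parent_pin(child_tag, parent_tag, pinned_hash, parent_memory):
    """Compare the parent's current SHA-256 with the hash recorded at chain-build time."""
    actual = hashlib.sha256(parent_memory).hexdigest()
    if actual != pinned_hash:
        raise ChainHashMismatch(
            f"chain `{child_tag}` references `{parent_tag}` content `{pinned_hash[:8]}...`, "
            f"but parent now has content `{actual[:8]}...`"
        )
    return actual


base_v1 = b"guest RAM at chain-build time"
pinned = hashlib.sha256(base_v1).hexdigest()   # what the child's snapshot.json would record

check_parent_pin("py-numpy", "py-base", pinned, base_v1)  # passes: base unchanged

base_v2 = b"guest RAM after re-snapshotting under the same tag"
try:
    check_parent_pin("py-numpy", "py-base", pinned, base_v2)
except ChainHashMismatch as err:
    print(err)
```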
