BRANCH a running sandbox to fork mid-thought agents

The standard forkd workflow is a two-layer tree: a parent snapshot forks N children. BRANCH extends this to a tree of arbitrary depth: any running child can itself be snapshotted and used as the parent for a new generation of grandchildren. The source sandbox pauses for the duration of the snapshot, then resumes; it does not restart. The result is an independent snapshot tag on disk that you can fork from or delete regardless of whether the source is still alive.

This unlocks a class of patterns impossible with static snapshots: fork an agent mid-thought (inheriting its in-progress reasoning state, open files, and loaded model weights), try five different steering hints in parallel, and keep the best branch. In-memory state that could never fit in a prompt, such as a 50 MiB binary blob the source has loaded, is shared with all grandchildren via copy-on-write.

Three BRANCH modes

| Mode | Source pause window | Mechanism | When to use |
| --- | --- | --- | --- |
| Full | 0.5–8 s (dominated by RAM size) | Write entire `memory.bin` synchronously | First BRANCH from a new source; sources ≥ 4 GiB where dirty-page ratio is near 100% |
| Diff (v0.3) | ~200 ms idle source; sub-second for typical agent workloads (30–300 MiB dirty) | Write only pages dirtied since last snapshot (Firecracker `snapshot_type: "Diff"`) | Repeated BRANCHes from the same source; agent with a small working set |
| Live (v0.4) | 56 ms p50 / 64 ms p90 on a 1.5 GiB source | UFFD_WP captures dirty pages asynchronously after vCPU dump; source resumes immediately | Low-latency agent fan-out; fire-and-forget workflows (`wait: false`) |

v0.3.4 multi-BRANCH fix — consecutive BRANCHes on the same source no longer compound pause times. The 6th consecutive Diff BRANCH stays at ~150 ms instead of climbing to 2.7 s (17.6× faster). Each BRANCH’s memory.bin is used as the base for the next diff, requiring zero extra storage.
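
The "When to use" column can be read as a small decision rule. The sketch below is illustrative only and is not part of forkd: `choose_branch_mode` and its parameters are hypothetical names, and the 90% cutoff for a "near 100%" dirty-page ratio is an assumption.

```python
def choose_branch_mode(live_fork_enabled: bool,
                       is_first_branch: bool,
                       ram_gib: float = 0.0,
                       dirty_ratio: float = 0.0) -> str:
    """Return "live", "diff", or "full", following the mode table.

    Hypothetical helper: the thresholds restate the table, they are not
    values read from forkd itself.
    """
    # ASSUMPTION: "near 100%" is interpreted as 90% or more.
    near_full = 0.9

    if live_fork_enabled:
        # Live has the shortest pause, but the source must have been
        # spawned with live-fork support.
        return "live"
    if is_first_branch:
        # The table recommends Full for the first BRANCH from a new source.
        return "full"
    if ram_gib >= 4 and dirty_ratio >= near_full:
        # Large source with almost every page dirty: the table lists Full.
        return "full"
    # Repeated BRANCH from the same source with a small working set.
    return "diff"
```

A caller would pass the result straight through as the `mode` argument of a BRANCH request.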

CLI BRANCH

From-sandbox snapshot

# Full BRANCH (default)
forkd snapshot --from-sandbox <sandbox-id> --tag checkpoint-1

# Diff BRANCH (~200 ms pause)
forkd snapshot --from-sandbox <sandbox-id> --tag checkpoint-1 --diff

# Live BRANCH (~50 ms pause; source must have been spawned with --live-fork)
forkd snapshot --from-sandbox <sandbox-id> --tag checkpoint-1 --live

# Live + fire-and-forget: returns ~10 ms after source resumes
forkd snapshot --from-sandbox <sandbox-id> --tag checkpoint-1 --live --no-wait

The --diff and --live flags are mutually exclusive. --no-wait requires --live.
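
When the CLI is driven from a script, the two flag rules can be checked before the process is launched. This is a minimal sketch under stated assumptions: `build_snapshot_argv` is a hypothetical wrapper, not a forkd API, and it uses only the flags shown above.

```python
from typing import List


def build_snapshot_argv(sandbox_id: str, tag: str,
                        mode: str = "full", wait: bool = True) -> List[str]:
    """Build a `forkd snapshot --from-sandbox` command line.

    Hypothetical helper. Taking a single `mode` argument makes --diff and
    --live mutually exclusive by construction.
    """
    if mode not in ("full", "diff", "live"):
        raise ValueError(f"unknown mode: {mode!r}")
    if not wait and mode != "live":
        # Mirrors the rule that --no-wait requires --live.
        raise ValueError("--no-wait requires --live")

    argv = ["forkd", "snapshot", "--from-sandbox", sandbox_id, "--tag", tag]
    if mode == "diff":
        argv.append("--diff")
    elif mode == "live":
        argv.append("--live")
    if not wait:
        argv.append("--no-wait")
    return argv
```

The returned list can be handed to `subprocess.run(argv, check=True)`.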

Fork grandchildren from the branch tag

forkd fork --tag checkpoint-1 -n 5 --per-child-netns

The branch tag is a first-class snapshot — identical to one built from a Docker image. Spawn as many children as you need.

Python SDK BRANCH

import time

from forkd import Controller

c = Controller()  # reads FORKD_URL and FORKD_TOKEN from env

# Spawn a source sandbox. live_fork=True is required upfront for mode="live" later.
parent = c.spawn_sandboxes("pyagent", n=1, live_fork=True)[0]

# ... drive the parent via in-guest Sandbox, then BRANCH before a risky step ...

# Diff BRANCH: ~200 ms pause, synchronous
branch = c.branch_sandbox(parent["id"], tag="checkpoint-diff", mode="diff")
grandchildren = c.spawn_sandboxes(branch["tag"], n=5)

# Live BRANCH: ~50 ms pause, fire-and-forget
branch = c.branch_sandbox(parent["id"], mode="live", wait=False)
# Returns ~10 ms after the source resumes, with status="writing".
# Poll until the background copy finishes, and stop if it fails:
while True:
    snaps = c.list_snapshots()
    snap = next((s for s in snaps if s["tag"] == branch["tag"]), None)
    status = snap.get("status") if snap else None
    if status == "ready":
        break
    if status == "failed":
        raise RuntimeError(f"BRANCH {branch['tag']} failed during background copy")
    time.sleep(0.5)

grandchildren = c.spawn_sandboxes(branch["tag"], n=5)

TypeScript SDK BRANCH

import { Controller } from '@deeplethe/forkd';

const ctrl = new Controller();  // reads FORKD_URL, FORKD_TOKEN from env

const [parent] = await ctrl.spawnSandboxes({
  snapshotTag: 'pyagent',
  n: 1,
  perChildNetns: true,
  liveFork: true,   // required upfront for mode: 'live' later
});

// ... drive parent ...

// Live BRANCH + fire-and-forget
const branch = await ctrl.branchSandbox(parent.id, { mode: 'live', wait: false });
// branch.status === "writing" at this point

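// Poll until the snapshot is ready; stop if the background copy failed
let snap;
while (true) {
  const snaps = await ctrl.listSnapshots();
  snap = snaps.find(s => s.tag === branch.tag);
  if (snap?.status === 'ready') break;
  if (snap?.status === 'failed') throw new Error(`BRANCH ${branch.tag} failed`);
  await new Promise(r => setTimeout(r, 500));
}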

const kids = await ctrl.spawnSandboxes({ snapshotTag: branch.tag, n: 5 });

`wait: false` fire-and-forget semantics

With mode: "live" and wait: false, the daemon returns a SnapshotInfo with status: "writing" shortly after the source sandbox resumes (about 10 ms after the resume, so roughly 70 ms after the API call once the pause window is included). The background memory copy completes asynchronously; the snapshot’s status field transitions to "ready" (or "failed") once it does. Poll GET /v1/snapshots to detect completion:

TOKEN=$(cat /etc/forkd/token)
while true; do
  STATUS=$(curl -s -H "Authorization: Bearer $TOKEN" \
    http://127.0.0.1:8889/v1/snapshots | \
    jq -r '.[] | select(.tag=="checkpoint-1") | .status')
  echo "status: $STATUS"
  [ "$STATUS" = "ready" ] && break
  [ "$STATUS" = "failed" ] && { echo "background copy failed" >&2; exit 1; }
  sleep 0.5
done

This is about a 200× improvement in round-trip time (RT) for fire-and-forget BRANCH from agent code: the call returns after ~70 ms total while the background copy completes in parallel with continued agent execution.

wait: false is only meaningful with mode: "live". Diff and Full BRANCHes are synchronous: the daemon returns only once the snapshot is fully written and its status is "ready".
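
The polling pattern can be wrapped in a helper that also handles the "failed" status and gives up after a deadline. This is a sketch, not part of the SDK: `wait_for_snapshot`, its parameters, and the 60-second default timeout are assumptions. It takes any callable that returns the snapshot list, so it can be given `c.list_snapshots` from the Python example.

```python
import time
from typing import Callable, Dict, List


def wait_for_snapshot(list_snapshots: Callable[[], List[Dict]],
                      tag: str,
                      timeout_s: float = 60.0,
                      interval_s: float = 0.5) -> Dict:
    """Poll until the snapshot `tag` is "ready" and return its entry.

    Hypothetical helper. Raises RuntimeError if the status becomes
    "failed" and TimeoutError if the deadline passes first.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        snap = next((s for s in list_snapshots() if s.get("tag") == tag), None)
        status = snap.get("status") if snap else None
        if status == "ready":
            return snap
        if status == "failed":
            raise RuntimeError(f"snapshot {tag!r} failed during background copy")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"snapshot {tag!r} not ready after {timeout_s} s")
        time.sleep(interval_s)
```

Usage with the SDK would look like `wait_for_snapshot(c.list_snapshots, branch["tag"])`.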

REST API BRANCH

TOKEN=$(cat /etc/forkd/token)

curl -s -H "Authorization: Bearer $TOKEN" \
     -H 'Content-Type: application/json' \
     -X POST http://127.0.0.1:8889/v1/sandboxes/sb-67a1b3-0000/branch \
     -d '{"tag":"checkpoint-1","mode":"live","wait":false}'

Response (201 Created):

{
  "tag": "checkpoint-1",
  "dir": "/home/user/.local/share/forkd/snapshots/checkpoint-1",
  "created_at_unix": 1717000456,
  "branched_from": "sb-67a1b3-0000",
  "pause_ms": 48,
  "status": "writing"
}

Key response fields:

- branched_from: source sandbox id (for audit / lineage tracing)
- pause_ms: measured source-VM pause window in milliseconds
- status: "writing" (live + no-wait), "ready" (complete), or "failed" (background copy error)
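
For callers that use the REST API directly, the response can be parsed into a typed record. The sketch below covers only the fields shown in the example response. `BranchResponse` and `parse_branch_response` are hypothetical names, and treating `branched_from` and `pause_ms` as optional is an assumption.

```python
import json
from dataclasses import dataclass
from typing import Optional

VALID_STATUSES = ("writing", "ready", "failed")


@dataclass(frozen=True)
class BranchResponse:
    """Hypothetical typed view of the BRANCH response body."""
    tag: str
    dir: str
    created_at_unix: int
    status: str
    # ASSUMPTION: these two may be absent on snapshots not created by BRANCH.
    branched_from: Optional[str] = None
    pause_ms: Optional[int] = None


def parse_branch_response(body: str) -> BranchResponse:
    """Parse the JSON body and reject an unknown status value."""
    data = json.loads(body)
    if data["status"] not in VALID_STATUSES:
        raise ValueError(f"unexpected status: {data['status']!r}")
    return BranchResponse(
        tag=data["tag"],
        dir=data["dir"],
        created_at_unix=data["created_at_unix"],
        status=data["status"],
        branched_from=data.get("branched_from"),
        pause_ms=data.get("pause_ms"),
    )
```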

Error cases: 404 (sandbox not found), 409 (tag already exists or BRANCH already in flight for this tag), 503 (daemon at branch concurrency cap of 4).
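
The three error cases call for different handling: 404 and 409 will not succeed on retry with the same arguments, while 503 reflects a temporary concurrency cap. Below is a minimal sketch of one possible client-side policy. `branch_with_retry` is a hypothetical helper, retrying with exponential backoff on 503 is an assumption rather than documented forkd guidance, and `post_branch` stands in for whatever function issues the POST and returns `(status_code, body)`.

```python
import time
from typing import Callable, Tuple


def branch_with_retry(post_branch: Callable[[], Tuple[int, str]],
                      max_attempts: int = 5,
                      base_delay_s: float = 0.2) -> str:
    """Issue a BRANCH request, retrying only on HTTP 503.

    Hypothetical helper. Returns the response body on 201 Created.
    """
    for attempt in range(max_attempts):
        status, body = post_branch()
        if status == 201:
            return body
        if status == 404:
            raise LookupError("sandbox not found")
        if status == 409:
            raise ValueError("tag already exists or a BRANCH is already in flight for it")
        if status == 503 and attempt < max_attempts - 1:
            # Daemon is at its branch concurrency cap: back off and retry.
            time.sleep(base_delay_s * 2 ** attempt)
            continue
        raise RuntimeError(f"BRANCH failed with HTTP {status}")
    raise RuntimeError("max_attempts must be at least 1")
```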

Demo scenario — LangGraph agent branched mid-thought

A 24-second walkthrough of the branch-and-fan-out pattern:

1. A source agent runs a ReAct loop (LangGraph) processing a travel planning request.
2. The agent is BRANCHed mid-thought, after several turns of reasoning but before the final output.
3. Three grandchildren each receive a different steering hint injected as the next user message.
4. All three produce divergent itineraries while inheriting the same prior reasoning state.

The divergence is real: the source (no hint) picks Nishiki Market for Day 1; all three hinted children independently substitute Arashiyama Bamboo Grove. The model was not told to swap places — each hint perturbed the next LLM call, and the prior reasoning came along unchanged. Full recipe, raw transcripts, and benchmark numbers are in recipes/langgraph-react/.

import time

from forkd import Controller, Sandbox

c = Controller()

# Boot the source agent with live_fork=True
parent = c.spawn_sandboxes("langgraph-react", n=1, live_fork=True)[0]
sb = Sandbox(sandbox_id=parent["id"])

# Run the agent for several turns (it builds up reasoning state)
sb.eval("agent.run('Plan a 3-day trip to Kyoto')")

# BRANCH mid-thought (~50 ms source pause)
branch = c.branch_sandbox(parent["id"], mode="live", wait=False)

# Wait for the branch to be ready before forking from it
while True:
    snap = next((s for s in c.list_snapshots() if s["tag"] == branch["tag"]), None)
    status = snap.get("status") if snap else None
    if status == "ready":
        break
    if status == "failed":
        raise RuntimeError("BRANCH failed during background copy")
    time.sleep(0.5)

# Fan out with 3 different hints
hints = ["Focus on budget options", "Include nature experiences", "Prioritize cultural sites"]
grandchildren = c.spawn_sandboxes(branch["tag"], n=3)
for child, hint in zip(grandchildren, hints):
    # {hint!r} quotes the hint safely even if it contains an apostrophe
    Sandbox(sandbox_id=child["id"]).eval(f"inject_hint({hint!r})")
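
The "keep the best branch" step is left open in the demo. One way to sketch it is to run every grandchild concurrently and select the highest-scoring output. Everything here is hypothetical: `fan_out_and_pick` is not a forkd function, `run_child` stands in for whatever drives a grandchild (for example a `Sandbox(...).eval(...)` call), and `score` is a caller-supplied ranking function.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple


def fan_out_and_pick(children: List[Dict],
                     hints: List[str],
                     run_child: Callable[[Dict, str], str],
                     score: Callable[[str], float]) -> Tuple[Dict, str, str]:
    """Run one hint per child in parallel; return (child, hint, output) of the best.

    Hypothetical helper. Unlike a bare zip(), it rejects a length mismatch
    instead of silently dropping hints or children.
    """
    if not children:
        raise ValueError("need at least one child")
    if len(children) != len(hints):
        raise ValueError("need exactly one hint per child")

    # One thread per child: each call mostly waits on its sandbox.
    with ThreadPoolExecutor(max_workers=len(children)) as pool:
        outputs = list(pool.map(run_child, children, hints))

    best = max(range(len(outputs)), key=lambda i: score(outputs[i]))
    return children[best], hints[best], outputs[best]
```

The losing grandchildren can then be torn down by whatever cleanup call the deployment uses.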

Live-fork prerequisites

Live BRANCH (mode: "live") requires:

- The source sandbox must have been spawned with live_fork: true; the memfd-backed RAM region is set up at spawn time, not at BRANCH time.
- Linux kernel ≥ 5.7 for UFFD_WP (userfaultfd write-protection).
- vm.unprivileged_userfaultfd=1 or CAP_SYS_PTRACE.
- The vendored Firecracker fork at deeplethe/firecracker:forkd-v0.4-mem-backend-shared-v1.12; the mem_backend.backend_type: "Shared" feature is not yet upstream.

Run forkd doctor to verify all four prerequisites. Both uffd_wp and memfd_create checks must show green.

If live-fork prerequisites are not met, fall back to mode: "diff" for a ~200 ms pause window, or mode: "full" (default) for maximum compatibility.
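
Two of the host-side prerequisites can also be checked from a script, for example before deciding which mode to request. This is a rough sketch and not a substitute for forkd doctor: the function names are hypothetical, and it covers only the kernel version and the sysctl, not CAP_SYS_PTRACE or the Firecracker build.

```python
import re
from pathlib import Path
from typing import Optional, Tuple


def parse_kernel_release(release: str) -> Tuple[int, int]:
    """Extract (major, minor) from a release string such as "5.15.0-91-generic"."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if not match:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def kernel_supports_uffd_wp(release: str) -> bool:
    """True if the kernel meets the documented minimum of 5.7."""
    return parse_kernel_release(release) >= (5, 7)


def unprivileged_userfaultfd_enabled(
        path: str = "/proc/sys/vm/unprivileged_userfaultfd") -> Optional[bool]:
    """Return the sysctl state, or None if the file cannot be read."""
    try:
        return Path(path).read_text().strip() == "1"
    except OSError:
        return None
```

On a Linux host, `kernel_supports_uffd_wp(platform.release())` with `import platform` gives the first check. If either check comes back negative, request mode: "diff" or mode: "full" instead.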
