Forking is the core operation in forkd. Given a parent snapshot (memory.bin + vmstate) on disk, the fork operation launches N independent Firecracker processes that each mmap the parent’s memory.bin with MAP_PRIVATE. The Linux kernel implements copy-on-write at the page level: every child starts with zero private pages and allocates new physical pages only as it writes to memory that diverges from the parent. The result is KVM-level hardware isolation per child at a spawn cost closer to fork(2) than to a cold VM boot: 100 children in 101 ms on a 20-vCPU Ubuntu 24.04 host.
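The copy-on-write behaviour of MAP_PRIVATE can be observed without Firecracker. The sketch below is only an illustration of the kernel mechanism, not forkd's code: a temporary file stands in for memory.bin, and the function name cow_demo is made up for this example. It assumes a Linux (or other Unix) host.

```python
import mmap
import os
import tempfile


def cow_demo(n_children: int = 3, size: int = mmap.PAGESIZE * 4):
    """Map one backing file privately n times and write to each mapping.

    Returns (first byte seen by each mapping, first byte of the file on disk).
    """
    # Stand-in for the parent's memory.bin: a file filled with zero bytes.
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "wb") as f:
            f.write(b"\x00" * size)

        with open(path, "rb") as f:
            # Each "child" gets its own private, writable view of the file.
            maps = [
                mmap.mmap(
                    f.fileno(),
                    size,
                    flags=mmap.MAP_PRIVATE,
                    prot=mmap.PROT_READ | mmap.PROT_WRITE,
                )
                for _ in range(n_children)
            ]
            # Writing touches one page per mapping; the kernel copies only
            # that page, and the copy is visible to that mapping alone.
            for i, m in enumerate(maps):
                m[0] = i + 1
            child_views = [m[0] for m in maps]
            for m in maps:
                m.close()

        # The backing file is never modified by private mappings.
        with open(path, "rb") as f:
            parent_first = f.read(1)[0]
        return child_views, parent_first
    finally:
        os.unlink(path)


if __name__ == "__main__":
    print(cow_demo())
```

Each mapping sees only its own write, and the file on disk keeps its original contents, which is the property that lets many children share one parent snapshot.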
CLI fork
Key flags
| Flag | Description |
|---|---|
| --tag | Snapshot tag to fork from. Must exist under ~/.local/share/forkd/snapshots/<tag>/. |
| -n | Number of children to spawn. |
| --per-child-netns | Place each child in its own forkd-child-<i> network namespace. Run sudo bash scripts/netns-setup.sh N first. |
| --memory-limit-mib | Set memory.max on a per-child cgroup v2 leaf. Children exceeding this are OOM-killed. Requires cgroup v2 and write access to /sys/fs/cgroup/forkd/. |
| --live-fork | Boot each child with a memfd-backed RAM region (v0.4). Required if you later want to take a live BRANCH from a child. Requires Linux ≥ 5.7 and the vendored Firecracker fork. |
| --hugepages | Back the memfd with 2 MiB hugepages (only with --live-fork). Reduces TLB pressure during bulk spawn. Requires reserved hugepages on the host. |
| --settle-secs | Seconds to let children run before reporting / shutting down. Default 2. |
| --keep-workdir | Keep /tmp/forkd-fork-<tag>/ after shutdown for post-mortem inspection. |
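When scripting forks, it can help to assemble the argument list in one place and reject invalid flag combinations before launching anything. The following is a minimal sketch, not part of forkd: the helper name build_fork_argv is hypothetical, and the subcommand spelling "forkd fork" is an assumption inferred from this page. Only the flags listed in the table above are used.

```python
from typing import List, Optional


def build_fork_argv(
    tag: str,
    n: int,
    *,
    per_child_netns: bool = False,
    memory_limit_mib: Optional[int] = None,
    live_fork: bool = False,
    hugepages: bool = False,
    settle_secs: Optional[int] = None,
    keep_workdir: bool = False,
) -> List[str]:
    """Build an argv list from the documented fork flags (hypothetical helper)."""
    if n < 1:
        raise ValueError("-n must be at least 1")
    # The flag table states that --hugepages only applies with --live-fork.
    if hugepages and not live_fork:
        raise ValueError("--hugepages is only valid together with --live-fork")

    # Assumption: the subcommand is invoked as "forkd fork".
    argv = ["forkd", "fork", "--tag", tag, "-n", str(n)]
    if per_child_netns:
        argv.append("--per-child-netns")
    if memory_limit_mib is not None:
        argv += ["--memory-limit-mib", str(memory_limit_mib)]
    if live_fork:
        argv.append("--live-fork")
    if hugepages:
        argv.append("--hugepages")
    if settle_secs is not None:
        argv += ["--settle-secs", str(settle_secs)]
    if keep_workdir:
        argv.append("--keep-workdir")
    return argv


if __name__ == "__main__":
    # "pynumpy" is a placeholder snapshot tag.
    print(" ".join(build_fork_argv("pynumpy", 5, memory_limit_mib=256)))
```

The resulting list can be passed to subprocess.run once the subcommand name has been confirmed against your installed forkd version.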
Example — fork 5 children, exec a command in one
Benchmark output — forkd bench
forkd bench runs a representative spawn → exec → branch → fanout → cleanup cycle against a live daemon and prints per-step timing. Use it to verify forkd’s performance on your hardware after any configuration change:
The exec round-trip line is particularly telling: at 22 ms it reuses the already-warmed Python interpreter in PID 1. Compare this to sandbox.commands.run("python3 -c '...'"), which costs ~96 ms because it spawns a cold subprocess that must re-import numpy.
Memory limits with cgroup v2
Pass --memory-limit-mib to set a hard RSS cap per child using the cgroup v2 memory.max knob. This requires:
- cgroup v2 unified hierarchy mounted at /sys/fs/cgroup (forkd doctor checks this)
- Write access to /sys/fs/cgroup/forkd/ (root or a delegated cgroup)

Children that exceed memory.max are OOM-killed by the kernel: their processes terminate and the Firecracker instance exits, but other children are unaffected.
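The two requirements can be checked by hand before a large fork. The sketch below is an approximation under stated assumptions, not what forkd doctor actually runs: it treats the presence of cgroup.controllers at the mount root as the sign of a cgroup v2 unified hierarchy, and it converts a MiB limit to the byte value that memory.max expects. All function names are hypothetical.

```python
import os
from pathlib import Path

MIB = 1024 * 1024


def memory_max_bytes(limit_mib: int) -> int:
    """Convert a --memory-limit-mib style value to bytes for memory.max."""
    if limit_mib <= 0:
        raise ValueError("memory limit must be positive")
    return limit_mib * MIB


def is_cgroup_v2(root: str = "/sys/fs/cgroup") -> bool:
    """A cgroup v2 unified hierarchy exposes cgroup.controllers at its root."""
    return (Path(root) / "cgroup.controllers").is_file()


def can_write_forkd_cgroup(root: str = "/sys/fs/cgroup") -> bool:
    """Check whether the current user may write under <root>/forkd/."""
    forkd_dir = Path(root) / "forkd"
    return forkd_dir.is_dir() and os.access(forkd_dir, os.W_OK)


if __name__ == "__main__":
    print("cgroup v2 mounted:", is_cgroup_v2())
    print("forkd cgroup writable:", can_write_forkd_cgroup())
    print("memory.max for 256 MiB:", memory_max_bytes(256))
```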
Measured CoW overhead at N=100 is 0.12 MiB per child on top of the parent. For a 512 MiB warmed Python+numpy parent with --memory-limit-mib 256, the practical ceiling before hitting vCPU or process-count limits is roughly 50 idle agents per 8 GiB of Pod/host RAM.
Live-fork mode (--live-fork)
Pass --live-fork to opt each child into a memfd-backed RAM region. This is a forward-looking flag: it has no effect at spawn time beyond swapping the memory backend, but it is required upfront if you later want to take a sub-50 ms live BRANCH from that child.
Add --hugepages to back the memfd with 2 MiB hugepages, reducing TLB pressure during bulk spawn and live BRANCH bulk-copy.
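Because --hugepages needs hugepages reserved on the host, it is worth checking how many are free first. The sketch below parses the standard HugePages_Total, HugePages_Free and Hugepagesize fields from /proc/meminfo. It is a host-side check written for this page, with hypothetical function names, and is not a forkd command.

```python
from typing import Dict


def parse_hugepages(meminfo_text: str) -> Dict[str, int]:
    """Extract hugepage counters from the text of /proc/meminfo."""
    wanted = {"HugePages_Total", "HugePages_Free", "Hugepagesize"}
    info: Dict[str, int] = {}
    for line in meminfo_text.splitlines():
        key, sep, rest = line.partition(":")
        if sep and key in wanted:
            # The first token after the colon is the number; Hugepagesize
            # carries a trailing "kB" unit that is ignored here.
            info[key] = int(rest.split()[0])
    return info


def free_hugepage_mib(info: Dict[str, int]) -> int:
    """Free hugepage memory in MiB (Hugepagesize is reported in kB)."""
    return info.get("HugePages_Free", 0) * info.get("Hugepagesize", 0) // 1024


if __name__ == "__main__":
    with open("/proc/meminfo") as f:
        counters = parse_hugepages(f.read())
    print(counters, "free MiB:", free_hugepage_mib(counters))
```

If the free amount is smaller than the combined RAM size of the children you plan to spawn, reserve more hugepages before forking.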
Daemon-managed fork (REST API)
For production deployments, drive forkd through the controller daemon instead of the CLI. Start the daemon, then fork by sending POST /v1/sandboxes. The request body accepts these fields:
- n: 1 to 1000 children
- per_child_netns: place each child in its pre-provisioned forkd-child-<i> namespace
- memory_limit_mib: cgroup v2 memory.max per child (optional)
- live_fork: enable memfd-backed RAM for later live BRANCH (v0.4+, default false)
See the API reference for the SandboxInfo and error shapes.
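A client can validate the request fields before sending them. The sketch below builds a JSON body from the four fields listed above and enforces the documented 1 to 1000 range for n. The helper name is hypothetical, and any field that selects the parent snapshot is omitted because this section does not list one; check the API reference before relying on this shape.

```python
import json
from typing import Optional


def build_sandbox_request(
    n: int,
    *,
    per_child_netns: bool = False,
    memory_limit_mib: Optional[int] = None,
    live_fork: bool = False,
) -> bytes:
    """Encode a POST /v1/sandboxes body from the documented fields."""
    if not 1 <= n <= 1000:
        raise ValueError("n must be between 1 and 1000")
    body = {
        "n": n,
        "per_child_netns": per_child_netns,
        "live_fork": live_fork,
    }
    # memory_limit_mib is optional, so it is sent only when set.
    if memory_limit_mib is not None:
        if memory_limit_mib <= 0:
            raise ValueError("memory_limit_mib must be positive")
        body["memory_limit_mib"] = memory_limit_mib
    return json.dumps(body, sort_keys=True).encode("utf-8")


if __name__ == "__main__":
    print(build_sandbox_request(5, memory_limit_mib=256).decode())
```

The returned bytes can be used as the data argument of urllib.request.Request with a Content-Type: application/json header, pointed at wherever your daemon listens.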
Accessing a sandbox
Once a sandbox is running, communicate with the in-guest agent on port 8888.
Ping
Exec (subprocess)
Eval (warmed interpreter)
eval runs a Python expression against the already-warmed PID-1 interpreter — no subprocess spawn, no import overhead:
The eval path is ~1 ms per call vs ~96 ms for an exec that must cold-import numpy. For high-frequency agent interactions, always prefer eval over exec when your parent snapshot has the relevant module pre-imported.
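The gap between the two paths comes from process startup and imports, and the same effect can be reproduced on any machine. The sketch below is a local analogy only: it does not talk to forkd, and it compares Python's built-in eval in an already-running interpreter against launching a fresh interpreter with subprocess. Absolute numbers will differ from the figures quoted above.

```python
import subprocess
import sys
import time


def time_in_process(expr: str, repeats: int = 5) -> float:
    """Best-of-N seconds to evaluate expr in the running interpreter."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        eval(expr)  # no new process, modules already loaded
        best = min(best, time.perf_counter() - start)
    return best


def time_subprocess(expr: str, repeats: int = 3) -> float:
    """Best-of-N seconds to evaluate expr in a freshly started interpreter."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run(
            [sys.executable, "-c", f"print({expr})"],
            check=True,
            capture_output=True,
        )
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    expr = "sum(range(1000))"
    warm = time_in_process(expr)
    cold = time_subprocess(expr)
    print(f"in-process: {warm * 1e3:.3f} ms, subprocess: {cold * 1e3:.3f} ms")
```

The cold path pays for interpreter startup on every call, and the cost grows further when the command has to import a heavy module such as numpy.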
See /reference/cli/sandbox-commands and /reference/sdk/python for the Python SDK equivalent (Sandbox.eval, Sandbox.commands.run).
Cleanup
Kill specific sandboxes
Sweep orphaned work directories
If forkd crashes or is killed with SIGKILL, temporary work directories under /tmp/forkd-* can accumulate. The cleanup command sweeps them.
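To see what a sweep would touch before running it, you can list the matching directories yourself. The sketch below is a read-only helper written for this page, not forkd's cleanup implementation: the function name is hypothetical, it only matches the /tmp/forkd-* pattern mentioned above, and it deletes nothing.

```python
from pathlib import Path
from typing import List


def find_forkd_workdirs(tmp_root: str = "/tmp") -> List[Path]:
    """Return directories directly under tmp_root whose names start with forkd-."""
    return sorted(p for p in Path(tmp_root).glob("forkd-*") if p.is_dir())


if __name__ == "__main__":
    for workdir in find_forkd_workdirs():
        print(workdir)
```

Directories belonging to sandboxes that are still running would also match this pattern, so treat the list as candidates for inspection rather than as safe to delete.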