forkd: KVM-isolated microVM sandbox runtime for AI agents

forkd is a microVM sandbox runtime built for AI agent fan-out. It boots a Firecracker VM once, warms it with your Python runtime, dependencies, or model weights, then pauses it and saves it to disk as a snapshot. Every child sandbox is a separate Firecracker process that mmaps the parent’s memory image with MAP_PRIVATE. The kernel implements copy-on-write at page granularity, so children share the parent’s resident pages until they write to them. The result is two properties at once: per-child KVM hardware isolation, and a spawn cost closer to fork(2) than to a cold VM boot. Fork 100 children in ~101 ms. BRANCH a running VM mid-thought in under 50 ms.
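The copy-on-write behaviour of a MAP_PRIVATE mapping can be seen in miniature without any VM. The sketch below is not forkd code: it assumes a Unix host, uses a one-page temporary file as a stand-in for the parent's memory image, and treats two private mappings as two "children". The function name `private_view_demo` is made up for illustration.

```python
import mmap
import os
import tempfile

PAGE = mmap.PAGESIZE


def private_view_demo() -> tuple[bytes, bytes, bytes]:
    """Return (child_a_view, child_b_view, bytes_on_disk) after child A writes."""
    # Stand-in for the parent's memory image: one page filled with b"P".
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"P" * PAGE)
        # Two private mappings of the same file play the role of two children.
        child_a = mmap.mmap(fd, PAGE, flags=mmap.MAP_PRIVATE,
                            prot=mmap.PROT_READ | mmap.PROT_WRITE)
        child_b = mmap.mmap(fd, PAGE, flags=mmap.MAP_PRIVATE,
                            prot=mmap.PROT_READ | mmap.PROT_WRITE)
        # Child A diverges: the kernel gives A its own copy of the touched page.
        child_a[0:5] = b"child"
        a_view = child_a[0:5]
        b_view = child_b[0:5]
        child_a.close()
        child_b.close()
        # The backing file is never modified by writes to a private mapping.
        os.lseek(fd, 0, os.SEEK_SET)
        on_disk = os.read(fd, 5)
        return a_view, b_view, on_disk
    finally:
        os.close(fd)
        os.unlink(path)


a, b, disk = private_view_demo()
print(a, b, disk)
```

Only the page that child A wrote to is duplicated; child B and the backing file still see the parent's bytes, which is the property that lets many children share one resident image.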

Key properties

Hardware isolation. Each child is its own Firecracker microVM backed by KVM. Escape requires a hypervisor or kernel vulnerability, not a runc regression.
Warmed runtimes inherit for free. Imports, JIT compilation, model weights, prefetched caches — anything the parent did is already resident in every child. A sandbox.eval("numpy.zeros(5).tolist()") call costs ~1 ms because it reuses the warmed PID 1 interpreter; spawning a cold subprocess to re-import numpy costs ~96 ms.
Real Linux per child. Multi-vCPU, full TCP networking, apt install, outbound HTTPS. Unlike function-level snapshot runtimes that accept a single vCPU and serial-only I/O in exchange for raw spawn speed, forkd children run real Python servers, model inference, or any workload that needs a full kernel.
Multi-tenant by construction. Per-child network namespace, per-child cgroup v2 memory.max, and an independent /dev/urandom stream per child, re-seeded through vmgenid (Linux 5.18+).
Operable. A daemon process owns state, exposes a REST API over Unix or TCP, emits Prometheus /metrics, writes an append-only JSON audit log, and ships a systemd unit.
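The gap between reusing a warmed interpreter and starting a cold one, described under "Warmed runtimes inherit for free", can be approximated on any machine. This is a rough sketch, not forkd's `sandbox.eval`: it uses the standard-library `json` module in place of numpy so that it runs anywhere, and the helper names `time_warm` and `time_cold` are invented for the example. Absolute timings will differ from the figures quoted above.

```python
import json
import subprocess
import sys
import time

EXPR = "json.dumps([0.0] * 5)"


def time_warm(expr: str) -> tuple[str, float]:
    """Evaluate expr in this already-running interpreter (module already imported)."""
    start = time.perf_counter()
    result = eval(expr, {"json": json})
    return result, time.perf_counter() - start


def time_cold(expr: str) -> tuple[str, float]:
    """Start a fresh interpreter that must boot and import before evaluating expr."""
    start = time.perf_counter()
    proc = subprocess.run(
        [sys.executable, "-c", f"import json; print({expr})"],
        capture_output=True, text=True, check=True,
    )
    return proc.stdout.strip(), time.perf_counter() - start


warm_result, warm_s = time_warm(EXPR)
cold_result, cold_s = time_cold(EXPR)
print(f"warm: {warm_s * 1000:.3f} ms, cold: {cold_s * 1000:.3f} ms")
```

Both paths return the same value; the cold path additionally pays for process creation, interpreter start-up, and the import, which is the cost a warmed parent lets every child skip.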

Benchmark comparison

Same Linux host (Ubuntu 24.04, Linux 6.14, 20 vCPU, 30 GiB, KVM). Workload: spawn 100 sandboxes that each run import numpy; numpy.zeros(5).tolist().

Project	Primitive	Cold-start (N=100)	Fork-from-warm	Quotas	Auth / TLS	License
forkd	Firecracker + snapshot CoW	101 ms	✓	cgroup `memory.max`	bearer + rustls	Apache 2.0
CubeSandbox	RustVMM + KVM microVM	1.06 s*	“coming soon”	<5 MiB / instance	not in OSS	Apache 2.0
Daytona	OCI workspace	<90 ms**	✗	per workspace	API keys (platform)	AGPL-3.0
OpenSandbox	Docker / K8s + gVisor / Kata / FC	122 s	✗	via runtime	gateway (k8s)	Apache 2.0
E2B	Firecracker (in infra)	not in OSS	✗	platform	API keys (cloud)	Apache 2.0
BoxLite	KVM / Hypervisor.framework + OCI	113 s	✗ stateful Box	KVM + seccomp	egress policy only	Apache 2.0
Firecracker raw	microVM only	759 ms	manual	n/a	n/a	Apache 2.0
Docker (runc)	OCI container	335 s	✗	cgroups	n/a	Apache 2.0
gVisor (runsc)	userspace kernel	289 s	✗	cgroups	n/a	Apache 2.0

* CubeSandbox wall-clock at N=100 concurrent is the fast-path figure (1056 ± 14 ms over five runs) with pool_default_format_size_list extended to match the template’s writable-layer size. CubeSandbox advertises <60 ms single-instance cold-start on a 96 vCPU host; that was not retested here. See bench/CUBESANDBOX.md for full methodology.
** Daytona’s advertised number; not independently measured (workspace runtime, not a fan-out-comparable shape).

Note also that forkd compares fork-from-warm against other runtimes’ cold-start; they are different operating points by design.
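A fan-out measurement of this shape can be sketched as a small harness that times N concurrent spawns and reports mean and spread over several runs. This is not the harness behind the table: `fake_spawn` is a placeholder that only sleeps, `time_fanout` and `summarize` are illustrative names, and the sketch assumes the ± in the footnote denotes a sample standard deviation, which the source does not state.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def time_fanout(spawn: Callable[[int], object], n: int) -> float:
    """Wall-clock milliseconds to run `spawn` n times concurrently."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n) as pool:
        # Drain the iterator so every spawn has finished before the clock stops.
        list(pool.map(spawn, range(n)))
    return (time.perf_counter() - start) * 1000.0


def summarize(samples_ms: list[float]) -> str:
    """Format samples as 'mean ± sample stdev' in whole milliseconds."""
    mean = statistics.mean(samples_ms)
    spread = statistics.stdev(samples_ms)
    return f"{mean:.0f} ± {spread:.0f} ms over {len(samples_ms)} runs"


def fake_spawn(i: int) -> int:
    """Placeholder for a real sandbox spawn; it only sleeps briefly."""
    time.sleep(0.001)
    return i


runs = [time_fanout(fake_spawn, n=100) for _ in range(5)]
print(summarize(runs))
```

Replacing `fake_spawn` with a call that actually creates a sandbox and runs the workload would turn this into a like-for-like wall-clock measurement for one runtime.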

Where forkd fits

forkd is designed for workloads that spawn many short-lived, isolated sandboxes from a common warmed template:

Code interpreters and Jupyter-kernel sandboxes. Each conversation turn or tool call spawns a fresh kernel; the warmed parent carries the SciPy or ML runtime, collapsing the per-request import numpy / import torch cost to zero across the entire cohort.
Evaluation harnesses. Hundreds of repository checkouts or test rollouts in parallel — SWE-bench-style — without paying Docker cold-start per task.
Per-user code execution at fan-out scale. Many short-lived sandboxes sharing one warmed parent, each child KVM-isolated by construction.
Untrusted-code execution in CI. git clone, pip install, pytest inside a real Linux VM, not a container namespace.
Self-hosted alternative to managed sandbox SaaS. One Linux box with KVM, a single-binary daemon, Apache 2.0 — no per-second cloud fees, no vendor lock-in.

Where forkd is the wrong tool. Function-level snapshot runtimes that give up real Linux (single-vCPU, serial I/O only) beat forkd’s ~100 ms spawn time by an order of magnitude — at the cost of not running real Python servers, apt install, or outbound HTTPS. BoxLite is the right call when each agent owns a persistent, long-lived workspace. CubeSandbox advertises faster pure cold-starts on large hosts.

Status

forkd is alpha. The fork-on-write primitive, controller daemon, REST API, bearer-token auth, audit logging, cgroup memory.max limits, Prometheus metrics, and Python + TypeScript SDKs are in place and exercised by 25 unit and integration tests in CI. On-disk formats and API shapes may still change before 1.0. What is not yet production-ready:

Multi-node scheduling (one daemon = one host).
Default-deny egress on per-child netns (today: shared MASQUERADE rule; users add their own iptables rules per netns for allow-list policy).
cpu.max, io.max, pids.max quotas beyond the existing memory.max.
Third-party security audit.
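For readers unfamiliar with the cgroup v2 control files named above, the sketch below shows the value format the kernel expects in `memory.max`, `cpu.max`, and `pids.max`. It only renders strings and does not write to `/sys/fs/cgroup`. The `Quota` dataclass and `cgroup_v2_files` function are hypothetical and say nothing about how forkd represents or will represent quotas.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Quota:
    """Hypothetical per-child limits; None means 'no limit'."""
    memory_bytes: Optional[int] = None
    cpu_cores: Optional[float] = None
    max_pids: Optional[int] = None
    cpu_period_us: int = 100_000  # kernel default period for cpu.max


def cgroup_v2_files(q: Quota) -> dict[str, str]:
    """Render the contents of the cgroup v2 limit files for a quota."""
    if q.cpu_cores is None:
        cpu = f"max {q.cpu_period_us}"
    else:
        # cpu.max is "<quota_us> <period_us>": runnable microseconds per period.
        cpu = f"{int(q.cpu_cores * q.cpu_period_us)} {q.cpu_period_us}"
    return {
        "memory.max": "max" if q.memory_bytes is None else str(q.memory_bytes),
        "cpu.max": cpu,
        "pids.max": "max" if q.max_pids is None else str(q.max_pids),
    }


limited = cgroup_v2_files(
    Quota(memory_bytes=512 * 1024 * 1024, cpu_cores=1.5, max_pids=256)
)
print(limited)
```

The literal string `max` is how cgroup v2 spells "unlimited" in all three files.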

v0.1.4 contains daemon security fixes. Two HIGH-class validation gaps in POST /v1/sandboxes (path-traversal via snapshot_tag) and packaging/k8s/ (placeholder bearer token) affected versions 0.1.0–0.1.3. Users on those versions should upgrade immediately. Full advisories in docs/SECURITY.md.
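A path-traversal gap of the kind described for snapshot_tag is typically closed by validating the tag against an allow-list pattern and then confirming that the resolved path stays under the snapshot directory. The sketch below illustrates that general technique only. It is not the fix shipped in v0.1.4, and `SNAPSHOT_ROOT`, `TAG_RE`, and `resolve_snapshot` are invented for the example.

```python
import re
from pathlib import Path

# Hypothetical location and tag grammar, chosen for illustration only.
SNAPSHOT_ROOT = Path("/var/lib/forkd/snapshots")
TAG_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]{0,63}")


def resolve_snapshot(tag: str, root: Path = SNAPSHOT_ROOT) -> Path:
    """Map a user-supplied tag to a path under root, or raise ValueError."""
    # First line of defence: allow-list the characters; no slashes, no "..".
    if not TAG_RE.fullmatch(tag) or ".." in tag:
        raise ValueError(f"invalid snapshot tag: {tag!r}")
    # Second line of defence: the resolved path must remain inside root.
    resolved_root = root.resolve()
    candidate = (resolved_root / tag).resolve()
    if not candidate.is_relative_to(resolved_root):
        raise ValueError(f"snapshot tag escapes root: {tag!r}")
    return candidate


print(resolve_snapshot("py311-numpy"))
```

Checking both the syntax and the resolved location means a symlink or an unexpected encoding cannot silently defeat a regex-only check.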

Quickstart

Install forkd, bake a Python snapshot, and fork your first microVMs in under 5 minutes.

How It Works

Deep-dive into the warm-fork lifecycle, copy-on-write memory model, and BRANCH mechanics.
