forkd’s execution model maps cleanly onto Kubernetes, but with a different shape than runtimes that schedule one Pod per sandbox. A single forkd-controller Pod hosts the daemon and all of its child Firecracker processes. The K8s scheduler runs exactly once when the Pod starts, regardless of how many sandboxes you subsequently fork, so there is no per-sandbox scheduling overhead and no per-sandbox Pod churn. This makes forkd better suited to AI agent fan-out workloads than Kata Containers or generic Firecracker-on-K8s designs that require one Pod per VM.
The starter manifest lives at packaging/k8s/forkd-controller.yaml and has been verified end-to-end on k3s on bare-metal Ubuntu 24.04 / Linux 6.14 / KVM.
Node requirements
Each node that will run a forkd-controller Pod must have:
- /dev/kvm present and accessible (VMX or SVM enabled in BIOS / hypervisor settings)
- cgroup v2 unified hierarchy (mount -t cgroup2 cgroup2 /sys/fs/cgroup; the daemon writes to /sys/fs/cgroup/forkd/)
- x86_64 architecture (the starter manifest schedules only onto amd64 nodes)
- A kernel image and parent rootfs reachable on the node, either placed directly on the node filesystem or mounted via a PersistentVolume
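The first three requirements above can be checked with a quick preflight on a candidate node. This is a minimal sketch and not a tool shipped with forkd: the function names are hypothetical, and it skips the kernel image and rootfs check because those paths are site-specific.

```shell
# Hypothetical preflight helpers; each prints one status line and
# returns non-zero when the requirement is not met.

check_kvm() {
  # /dev/kvm must be a character device that the current user can open read/write.
  if [ -c /dev/kvm ] && [ -r /dev/kvm ] && [ -w /dev/kvm ]; then
    echo "kvm: ok"
  else
    echo "kvm: missing or inaccessible"
    return 1
  fi
}

check_cgroup2() {
  # On a cgroup v2 unified hierarchy, cgroup.controllers sits at the mount root.
  if [ -f /sys/fs/cgroup/cgroup.controllers ]; then
    echo "cgroup2: ok"
  else
    echo "cgroup2: not a unified hierarchy"
    return 1
  fi
}

check_arch() {
  # uname -m reports x86_64 on amd64 nodes.
  arch=$(uname -m)
  if [ "$arch" = "x86_64" ]; then
    echo "arch: ok"
  else
    echo "arch: unsupported ($arch)"
    return 1
  fi
}

preflight() {
  # Run every check so all problems are reported in one pass.
  rc=0
  check_kvm || rc=1
  check_cgroup2 || rc=1
  check_arch || rc=1
  return "$rc"
}
```

Calling `preflight` on the node prints one line per requirement and returns non-zero if any check fails, which makes it easy to gate a node-labeling step on the result.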
The manifest's nodeSelector targets kubernetes.io/arch: amd64, which is set automatically on x86_64 nodes by most Kubernetes distributions. To additionally restrict scheduling to nodes with confirmed KVM access, apply a custom label to those nodes and add the same label to the nodeSelector.
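As a sketch of the labeling step, the helpers below wrap `kubectl label`. The label key and value (`forkd.example.com/kvm=true`) and the function names are assumptions made for illustration; nothing in the starter manifest prescribes them.

```shell
# Hypothetical label; choose any key/value that fits your cluster conventions.
KVM_LABEL="${KVM_LABEL:-forkd.example.com/kvm=true}"

label_kvm_node() {
  # $1: node name as shown by `kubectl get nodes`.
  # --overwrite makes the call idempotent if the label already exists.
  kubectl label node "$1" "$KVM_LABEL" --overwrite
}

list_kvm_nodes() {
  # List the nodes that currently carry the label.
  kubectl get nodes -l "$KVM_LABEL"
}
```

After labeling, add the same key and value under the Deployment's nodeSelector, next to kubernetes.io/arch: amd64, so the Pod only lands on labeled nodes.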
Apply the starter manifest
Generate and patch the bearer token
The manifest ships with a placeholder token that the daemon refuses to accept at startup, so a forgotten sed step fails noisily rather than silently. Replace it before applying.
Apply the manifest
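Patching the token and applying the manifest can be sketched together as follows. The function names are hypothetical. The placeholder string is the one the manifest ships with, and the manifest path is the one given above. Whether the Secret stores the value under data or stringData is not stated here; a base64 string of 32 random bytes satisfies the 16-byte minimum in either case.

```shell
MANIFEST="${MANIFEST:-packaging/k8s/forkd-controller.yaml}"
PLACEHOLDER="REPLACE_ME_WITH_32_BYTES_BASE64"

generate_token() {
  # 32 random bytes, base64-encoded into a single 44-character line.
  head -c 32 /dev/urandom | base64 | tr -d '\n'
}

render_manifest() {
  # $1: token. Prints the patched manifest; the file on disk is left untouched.
  # "|" is a safe sed delimiter because base64 output never contains it.
  sed "s|${PLACEHOLDER}|$1|g" "$MANIFEST"
}

apply_manifest() {
  # $1: token. Streams the patched manifest straight into kubectl.
  render_manifest "$1" | kubectl apply -f -
}

# Usage (keep TOKEN somewhere safe; clients need it to authenticate):
#   TOKEN=$(generate_token)
#   apply_manifest "$TOKEN"
```

Rendering to stdout instead of editing the file in place keeps the real token out of the working tree, so the placeholder is never accidentally committed with a live value.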
The Deployment uses strategy: Recreate because forkd holds live VM state and cannot do a rolling update.
Key manifest fields explained
The manifest creates four resources: a Namespace, a Secret (the bearer token), a Deployment, and a ClusterIP Service.
Secret
The Secret is mounted at /etc/forkd/token (mode 0400). The value you substitute must be at least 16 bytes and must not begin with REPLACE_ME or CHANGE_ME; the daemon validates this at startup.
Container args
The daemon listens on all interfaces (0.0.0.0) since the Service and any NetworkPolicy provide the isolation. The audit log is written into the same volume as the state file.
Volumes
| Volume | Mount | Purpose |
|---|---|---|
| kvm (hostPath) | /dev/kvm | Exposes the KVM character device to Firecracker. |
| cgroup (hostPath) | /sys/fs/cgroup | Lets the daemon write per-child memory.max limits. |
| token (Secret) | /etc/forkd | Bearer token, mode 0400. |
| state (emptyDir) | /var/lib/forkd | State file, snapshots, and audit log. Survives container restarts, not Pod restarts. |
For production, replace the emptyDir state volume with a PersistentVolumeClaim so snapshots survive Pod restarts. Without a PVC, you must rebuild snapshots after every Pod restart.
Probes
/healthz is always unauthenticated, so probes work without a credential even when --token-file is set.
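Because /healthz needs no credential, the same endpoint is handy for a manual check from another Pod in the cluster. The sketch below assumes the Service is named forkd-controller in the forkd namespace, that the API listens on port 8889, and that it speaks plain HTTP. Adjust any of these to match your manifest.

```shell
# Assumed Service DNS name, port, and scheme; none are confirmed by this page.
FORKD_HOST="${FORKD_HOST:-forkd-controller.forkd.svc}"
FORKD_PORT="${FORKD_PORT:-8889}"

check_healthz() {
  # -f: fail on HTTP errors, -sS: quiet but still print errors,
  # --max-time: give up after 5 seconds instead of hanging.
  curl -fsS --max-time 5 "http://${FORKD_HOST}:${FORKD_PORT}/healthz"
}
```

A non-zero exit status from `check_healthz` means the daemon is unreachable or unhealthy. No Authorization header is sent, which matches how the kubelet probes behave.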
Sizing guidance
With a 512 MiB warmed Python + numpy parent snapshot, use these rough heuristics for resource requests and limits:
| Resource | Guidance |
|---|---|
| vCPU — actively running agents | ~1 actively-running agent per vCPU. Compute-bound bottleneck. |
| RAM — idle-pooled agents | ~50 idle-pooled agents per 8 GiB Pod RAM. Process-state bottleneck, not memory. |
| CoW overhead per child | 0.12 MiB at N=100 (bench data). Memory rarely caps fan-out — vCPU and process count dominate. |
Run forkd bench --tag <your-tag> --n 20 inside the Pod before committing to a node shape.
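The two ratios in the table can be turned into a back-of-the-envelope estimate for a given Pod shape. This is only arithmetic on the heuristics above (1 active agent per vCPU, 50 idle-pooled agents per 8 GiB); the function name is hypothetical, and measured bench results should override it.

```shell
estimate_capacity() {
  # $1: vCPUs available to the Pod, $2: Pod RAM in GiB (whole numbers).
  vcpus=$1
  ram_gib=$2
  active=$vcpus                   # ~1 actively running agent per vCPU
  idle=$(( ram_gib * 50 / 8 ))    # ~50 idle-pooled agents per 8 GiB
  echo "active=${active} idle_pooled=${idle}"
}
```

For example, `estimate_capacity 16 32` prints `active=16 idle_pooled=200`, suggesting a 16 vCPU / 32 GiB Pod is CPU-bound long before memory becomes the limit.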
Security implications
The manifest uses the simplest secure path (privileged: true). For tighter security at the cost of additional platform work:
- Replace privileged: true with a KVM device plugin (e.g. kubevirt/kvm-device-plugin) so /dev/kvm is granted as a K8s resource rather than a host mount.
- Drop privileged and enumerate only the capabilities you need: NET_ADMIN (tap setup), SYS_ADMIN (cgroup writes).
- Replace emptyDir with a PersistentVolumeClaim backed by fast local storage (NVMe hostPath or a CSI driver with local volumes).
Multi-tenant deployments
Because one forkd-controller Pod has node-level blast radius, multi-tenant deployments must isolate at the node boundary:
- Run one forkd-controller Pod per tenant on dedicated nodes (use node labels + nodeSelector / nodeAffinity + taints + tolerations).
- Each tenant gets their own Namespace, their own Secret, and their own bearer token.
- Apply a NetworkPolicy limiting ingress to port 8889 to only that tenant’s agent backplane.
- Do not co-schedule untrusted tenant pods on the same node as a forkd-controller.
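The NetworkPolicy item above could look roughly like the policy rendered below. This is a sketch under stated assumptions: the policy name, the app: forkd-controller Pod label, and the role: agent-backplane label are all hypothetical, and the backplane Pods are assumed to live in the tenant's own namespace.

```shell
render_networkpolicy() {
  # $1: tenant namespace. Prints a NetworkPolicy manifest to stdout.
  ns=$1
  cat <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: forkd-controller-ingress
  namespace: ${ns}
spec:
  podSelector:
    matchLabels:
      app: forkd-controller        # assumed label on the controller Pod
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              role: agent-backplane  # assumed label on the tenant's backplane Pods
      ports:
        - protocol: TCP
          port: 8889
EOF
}
```

Applying it is a one-liner per tenant, for example `render_networkpolicy tenant-a | kubectl apply -f -`. A podSelector without a namespaceSelector only matches Pods in the policy's own namespace, which fits the one-namespace-per-tenant layout.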
Token management
The manifest ships with token: REPLACE_ME_WITH_32_BYTES_BASE64 in the Secret. The daemon’s validate_token() function rejects any token that:
- Begins with REPLACE_ME or CHANGE_ME
- Is shorter than 16 bytes
Running kubectl apply with the un-patched placeholder causes the daemon to refuse to start: a noisy failure rather than a silent compromise. Always substitute the token before applying.
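A local pre-check that mirrors the two rejection rules can catch a bad token before it ever reaches the cluster. This is an illustrative shell sketch, not the daemon's validate_token() implementation; the function name token_looks_valid is hypothetical.

```shell
token_looks_valid() {
  # $1: candidate token. Prints a verdict and returns non-zero on rejection.
  token=$1
  case "$token" in
    REPLACE_ME*|CHANGE_ME*)
      echo "rejected: placeholder prefix"
      return 1
      ;;
  esac
  # Count bytes rather than characters; wc -c may pad its output, so normalize it.
  nbytes=$(( $(printf '%s' "$token" | wc -c) ))
  if [ "$nbytes" -lt 16 ]; then
    echo "rejected: ${nbytes} bytes, need at least 16"
    return 1
  fi
  echo "ok"
}
```

Running the check on the value you are about to substitute gives the same noisy failure earlier, on your workstation instead of in a crash-looping Pod.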
To rotate the token, update the Secret and restart the controller (kubectl -n forkd rollout restart deployment/forkd-controller). Existing sandboxes inside the Pod are killed on restart; snapshots on the PVC survive.
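The restart described above can be paired with a Secret update in one helper. This sketch assumes the Secret is named forkd-token, which is a hypothetical name; check the manifest for the real one. The token key and the forkd namespace follow the text above.

```shell
NAMESPACE="${NAMESPACE:-forkd}"
SECRET_NAME="${SECRET_NAME:-forkd-token}"   # hypothetical Secret name

rotate_token() {
  # $1: new token. Re-renders the Secret client-side and applies it,
  # which updates the Secret in place if it already exists.
  kubectl -n "$NAMESPACE" create secret generic "$SECRET_NAME" \
    --from-literal=token="$1" --dry-run=client -o yaml \
    | kubectl -n "$NAMESPACE" apply -f -
  # Restart so the daemon reads the new token at startup.
  kubectl -n "$NAMESPACE" rollout restart deployment/forkd-controller
}
```

Since the restart kills running sandboxes, it is worth scheduling rotation for a quiet window and updating clients with the new token at the same time.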
What this manifest does not cover
The starter manifest is a single-node starting point. The following are out of scope for v0.1 and noted here so you know what to add for production:
- DaemonSet shape: for multi-node deployments (one controller per node), convert to a DaemonSet with nodeSelector for KVM-capable nodes.
- netns provisioning: per-child netns (forkd-child-N) must be provisioned by scripts/netns-setup.sh on each node before forking. Wire this as a Pod init container or a separate DaemonSet.
- HPA / autoscaling: each controller instance owns its state; horizontal scaling of the controller itself doesn’t apply. Scale by adding more KVM nodes and more controller instances (one per node).
- NetworkPolicy: lock down port 8889 to your agent backplane only.