Deploy OpenSandbox on Kubernetes with Helm

The OpenSandbox Kubernetes operator extends the platform with custom resources and a controller that manages sandbox environments natively within your cluster. It adds high-throughput batch delivery via the BatchSandbox CRD, pre-warmed resource pools via the Pool CRD, OCI-snapshot-based pause and resume, optional process-based task orchestration, and real-time status monitoring through kubectl, all without requiring changes to the standard OpenSandbox SDK or server API.

Key Features

BatchSandbox CRD

Create and manage one or many identical sandbox replicas from a single YAML manifest or SDK call.

Pool CRD

Maintain pre-warmed compute resources for near-instant sandbox provisioning without cold-start latency.

Pause and Resume

Persist sandbox filesystem state as an OCI rootfs snapshot, release cluster resources, and restore later with the same sandbox ID.

Task Orchestration

Inject heterogeneous process-based tasks into each sandbox in a batch via taskTemplate and shardTaskPatches.

Pod Eviction

Gracefully evict idle pool pods for node maintenance or resource reclamation without disrupting allocated sandboxes.

O(1) Batch Delivery

Delivering 100 sandboxes from a pool completes in under one second. Pooled delivery takes O(1) time with respect to batch size, versus O(N) for sequential provisioning.

Prerequisites

Requirement	Minimum Version
Kubernetes	1.21.1+
Helm	3.x
kubectl	1.11.3+
Go (build from source only)	1.24.0+
Docker (build from source only)	17.03+

If you don’t have access to a Kubernetes cluster, use kind to create a local cluster for testing. Install it with brew install kind (macOS) or winget install Kubernetes.kind (Windows), then run kind create cluster.

Custom Resources

OpenSandbox installs three CRDs into your cluster.

BatchSandbox

The BatchSandbox resource creates and manages one or many sandbox replicas from a pod template. It is the primary workload unit that the OpenSandbox server creates when the Kubernetes runtime is configured. Key spec fields:

Field	Description
`spec.replicas`	Number of sandbox instances to create (use `1` for single sandboxes)
`spec.template`	Kubernetes pod template spec for the sandbox container(s)
`spec.poolRef`	Name of a `Pool` resource to allocate from (omit for non-pooled creation)
`spec.taskTemplate`	Optional default task injected into every sandbox in the batch
`spec.shardTaskPatches`	Per-sandbox task overrides for heterogeneous task distribution
`spec.pause`	Set to `true` to trigger a rootfs snapshot and release cluster resources
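
To show how these fields fit together, the following Python sketch builds a BatchSandbox manifest as a plain dictionary and prints it as JSON, which kubectl accepts as well as YAML. The helper name batch_sandbox_manifest and its parameters are hypothetical and are not part of the OpenSandbox SDK. The rule that at least one of poolRef or template must be set is an assumption, and the field names and apiVersion are the ones used elsewhere in this guide.

```python
import json


def batch_sandbox_manifest(name, replicas=1, pool_ref=None, template=None):
    """Build a BatchSandbox manifest dict (hypothetical helper, not an SDK API)."""
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    # Assumption: a sandbox needs a pod source, either a pool or an inline template.
    if pool_ref is None and template is None:
        raise ValueError("set pool_ref, template, or both")

    spec = {"replicas": replicas}
    if pool_ref is not None:
        spec["poolRef"] = pool_ref
    if template is not None:
        spec["template"] = template

    return {
        "apiVersion": "sandbox.opensandbox.io/v1alpha1",
        "kind": "BatchSandbox",
        "metadata": {"name": name},
        "spec": spec,
    }


manifest = batch_sandbox_manifest("pooled-batch-sandbox", replicas=3, pool_ref="example-pool")
print(json.dumps(manifest, indent=2))
```

The printed JSON can be piped to kubectl apply -f - to create the resource.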

Pool

The Pool resource maintains a buffer of pre-warmed pods that BatchSandbox instances can allocate instantly instead of waiting for cold container starts. Key spec fields:

Field	Description
`spec.template`	Pod template spec for pool members (must match `BatchSandbox` template)
`spec.capacitySpec.bufferMin`	Minimum number of unallocated pods to keep warm
`spec.capacitySpec.bufferMax`	Maximum number of unallocated pods to maintain
`spec.capacitySpec.poolMax`	Hard cap on total pods managed by this pool
`spec.capacitySpec.poolMin`	Minimum total pod count (allocated + buffer)
`spec.scaleStrategy.maxUnavailable`	Rate limit for scaling operations (e.g., `"20%"` or `5`)
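
Because maxUnavailable accepts either a percentage string or an absolute number, it can help to see how such a value might resolve to a pod count. The sketch below is illustrative only: the function name is hypothetical, and both the rounding direction (up) and the base of the percentage (the total pod count) are assumptions, since this guide does not state how the controller interprets them.

```python
def resolve_max_unavailable(value, total):
    """Convert an integer or a percentage string into an absolute pod count.

    Assumption: percentages are taken of `total` and rounded up.
    """
    if isinstance(value, int):
        if value < 0:
            raise ValueError("maxUnavailable must not be negative")
        return value
    if isinstance(value, str) and value.endswith("%"):
        percent = int(value[:-1])
        if not 0 <= percent <= 100:
            raise ValueError("percentage must be between 0 and 100")
        # Integer ceiling division avoids floating-point rounding surprises.
        return (total * percent + 99) // 100
    raise ValueError(f"unsupported maxUnavailable value: {value!r}")


for value in ("20%", 5):
    print(value, "->", resolve_max_unavailable(value, total=20))
```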

SandboxSnapshot (internal)

SandboxSnapshot is an internal CRD used by the controller to track rootfs snapshot operations during pause/resume. You do not create these directly — the controller creates them when BatchSandbox.spec.pause=true.

Helm Deployment

Install the controller from a GitHub Release

Install the OpenSandbox controller Helm chart directly from GitHub Releases. Replace <version> with the desired release (e.g., 0.1.0).

helm install opensandbox-controller \
  https://github.com/opensandbox-group/OpenSandbox/releases/download/helm/opensandbox-controller/<version>/opensandbox-controller-<version>.tgz \
  --namespace opensandbox-system \
  --create-namespace

Customize the installation (optional)

Use --set flags or a values file to tune resource limits, replica count, and log level.

Using --set flags:

helm install opensandbox-controller \
  https://github.com/opensandbox-group/OpenSandbox/releases/download/helm/opensandbox-controller/0.1.0/opensandbox-controller-0.1.0.tgz \
  --namespace opensandbox-system \
  --create-namespace \
  --set controller.replicaCount=2 \
  --set controller.resources.limits.cpu=1000m \
  --set controller.resources.limits.memory=512Mi

Using a values file:

# custom-values.yaml
controller:
  replicaCount: 2
  resources:
    limits:
      cpu: 1000m
      memory: 512Mi
    requests:
      cpu: 100m
      memory: 128Mi
  logLevel: debug

helm install opensandbox-controller \
  https://github.com/opensandbox-group/OpenSandbox/releases/download/helm/opensandbox-controller/0.1.0/opensandbox-controller-0.1.0.tgz \
  --namespace opensandbox-system \
  --create-namespace \
  -f custom-values.yaml

Configure the OpenSandbox server for Kubernetes

Generate a Kubernetes-oriented server config and edit it to point at your cluster.

opensandbox-server init-config ~/.sandbox.toml --example k8s

Key Kubernetes-specific config sections:

Section	Purpose
`[kubernetes]`	`workload_provider` (`batchsandbox` or `agent-sandbox`), `batchsandbox_template_file`
`[agent_sandbox]`	Agent sandbox settings for the `agent-sandbox` provider
`[ingress]`	Ingress gateway mode for routing sandbox traffic
`[secure_runtime]`	gVisor, Kata, or Firecracker runtime class integration

Verify the installation

Check that the controller pod is running and review its logs.

kubectl get pods -n opensandbox-system
kubectl get deployment -n opensandbox-system
kubectl logs -n opensandbox-system -l control-plane=controller-manager -f

Creating BatchSandbox and Pool Resources

Basic non-pooled sandbox

Create a simple batch of two sandboxes without a resource pool:

apiVersion: sandbox.opensandbox.io/v1alpha1
kind: BatchSandbox
metadata:
  name: basic-batch-sandbox
spec:
  replicas: 2
  template:
    spec:
      containers:
      - name: sandbox-container
        image: nginx:latest
        ports:
        - containerPort: 80

kubectl apply -f basic-batch-sandbox.yaml
kubectl get batchsandbox basic-batch-sandbox -o wide

# Check the endpoint annotation once sandboxes are ready
kubectl get batchsandbox basic-batch-sandbox \
  -o jsonpath='{.metadata.annotations.sandbox\.opensandbox\.io/endpoints}'

Resource pool

Create a pool that keeps between 2 and 10 unallocated pods warm, with a total pod count between 5 and 20:

apiVersion: sandbox.opensandbox.io/v1alpha1
kind: Pool
metadata:
  name: example-pool
spec:
  template:
    spec:
      containers:
      - name: sandbox-container
        image: nginx:latest
        ports:
        - containerPort: 80
  capacitySpec:
    bufferMin: 2
    bufferMax: 10
    poolMin: 5
    poolMax: 20
  scaleStrategy:
    maxUnavailable: "20%"

Then allocate from the pool:

apiVersion: sandbox.opensandbox.io/v1alpha1
kind: BatchSandbox
metadata:
  name: pooled-batch-sandbox
spec:
  replicas: 3
  poolRef: example-pool

Pooled sandbox with heterogeneous tasks

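For reinforcement-learning (RL) style workloads, use shardTaskPatches to inject a different task into each sandbox replica. The task executor sidecar must share the process namespace with the sandbox container, so the pod template sets shareProcessNamespace: true and grants the sidecar the SYS_PTRACE capability.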

apiVersion: sandbox.opensandbox.io/v1alpha1
kind: Pool
metadata:
  name: task-example-pool
spec:
  template:
    spec:
      shareProcessNamespace: true
      containers:
      - name: sandbox-container
        image: ubuntu:latest
        command: ["sleep", "3600"]
      - name: task-executor
        image: <task-executor-image>:<tag>
        securityContext:
          capabilities:
            add: ["SYS_PTRACE"]
  capacitySpec:
    bufferMin: 2
    bufferMax: 10
    poolMax: 20
    poolMin: 5

apiVersion: sandbox.opensandbox.io/v1alpha1
kind: BatchSandbox
metadata:
  name: task-batch-sandbox
spec:
  replicas: 2
  poolRef: task-example-pool
  taskTemplate:
    spec:
      process:
        command: ["echo", "Default task"]
  shardTaskPatches:
  - spec:
      process:
        command: ["echo", "Custom task for sandbox 1"]
  - spec:
      process:
        command: ["echo", "Custom task for sandbox 2"]
        args: ["with", "additional", "arguments"]

Resource Pooling

Pools maintain a buffer of pre-warmed, unallocated pods so that BatchSandbox allocation is near-instantaneous. The controller continuously reconciles the pool to maintain bufferMin ≤ unallocated ≤ bufferMax while keeping total pod count within poolMin–poolMax.
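
The invariant above can be sketched as a small calculation. This is a minimal model rather than the controller's actual reconcile loop: the names CapacitySpec and desired_total are hypothetical, the rule that the total-pod bounds take precedence over the buffer bounds is an assumption, and the scaleStrategy.maxUnavailable rate limit is not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CapacitySpec:
    buffer_min: int
    buffer_max: int
    pool_min: int
    pool_max: int


def clamp(value, low, high):
    """Restrict value to the closed interval [low, high]."""
    return max(low, min(value, high))


def desired_total(spec, allocated, unallocated):
    """Return the total pod count a single reconcile pass would aim for."""
    # Keep the warm buffer inside [bufferMin, bufferMax].
    buffer_target = clamp(unallocated, spec.buffer_min, spec.buffer_max)
    # Keep the total inside [poolMin, poolMax]; these bounds win on conflict (assumption).
    return clamp(allocated + buffer_target, spec.pool_min, spec.pool_max)


spec = CapacitySpec(buffer_min=2, buffer_max=10, pool_min=5, pool_max=20)
for allocated, unallocated in [(0, 0), (3, 0), (3, 12), (19, 1)]:
    total = desired_total(spec, allocated, unallocated)
    print(f"allocated={allocated} unallocated={unallocated} -> total={total}")
```

With the example-pool values, an empty pool scales up to poolMin (5), an oversized buffer of 12 is trimmed to bufferMax (total 13), and a nearly full pool is capped at poolMax (20).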

The following benchmark compares provisioning strategies for 100 sandboxes:

Scenario	Total time for 100 sandboxes
Sequential (concurrency=1)	76.35 s
Concurrent (concurrency=10)	23.17 s
Concurrent (concurrency=50)	33.85 s
BatchSandbox with Pool	0.92 s
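
To put these timings in relative terms, the short calculation below divides each non-pooled time by the pooled time. It uses only the figures from the table; the variable names are illustrative.

```python
timings = {
    "Sequential (concurrency=1)": 76.35,
    "Concurrent (concurrency=10)": 23.17,
    "Concurrent (concurrency=50)": 33.85,
}
pool_time = 0.92  # BatchSandbox with Pool, in seconds

# Ratio of each strategy's total time to the pooled total time.
speedups = {name: seconds / pool_time for name, seconds in timings.items()}
for name, factor in speedups.items():
    print(f"{name}: {factor:.1f}x the pooled delivery time")
```

Pooled delivery is roughly 83 times faster than sequential provisioning and about 25 times faster than the best concurrent run in the table.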

Pod Eviction

Pools support graceful pod eviction for node maintenance or resource reclamation. To evict a pod:

Label the target pod with pool.opensandbox.io/evict (any value).
The controller checks whether the pod is allocated to an active BatchSandbox. Allocated pods are protected and skipped.
Idle pods are deleted, triggering the pool to replenish capacity automatically.
Pods labeled for eviction are excluded from new allocations.
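
The eviction rules above can be sketched as a pure function over a simplified pod list. This is an illustration of the described behavior and not the controller's code: each pod is modeled as a dictionary, and the function names are hypothetical.

```python
EVICT_LABEL = "pool.opensandbox.io/evict"


def plan_eviction(pods):
    """Split evict-labeled pods into those to delete and those left alone.

    Each pod is a dict with "name", "labels", and "allocated" keys,
    a simplified stand-in for real Pod objects and allocation state.
    """
    to_delete, protected = [], []
    for pod in pods:
        if EVICT_LABEL not in pod["labels"]:
            continue  # not marked for eviction
        if pod["allocated"]:
            protected.append(pod["name"])  # in use by a BatchSandbox: skipped
        else:
            to_delete.append(pod["name"])  # idle: deleted, pool replenishes
    return to_delete, protected


def allocatable(pods):
    """Idle pods that may serve new allocations; evict-labeled pods are excluded."""
    return [p["name"] for p in pods
            if not p["allocated"] and EVICT_LABEL not in p["labels"]]


pods = [
    {"name": "pool-a", "labels": {EVICT_LABEL: "true"}, "allocated": False},
    {"name": "pool-b", "labels": {EVICT_LABEL: "true"}, "allocated": True},
    {"name": "pool-c", "labels": {}, "allocated": False},
]
print(plan_eviction(pods))
print(allocatable(pods))
```

On a real cluster, the label can be applied with kubectl label pod <pod-name> pool.opensandbox.io/evict=true; the value itself is ignored.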

Custom eviction strategies can be implemented by:

Setting the pool.opensandbox.io/eviction-handler=<handler-name> label on the Pool resource.
Implementing the EvictionHandler interface with NeedsEviction() and Evict() methods.
Registering the handler in the factory function.
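
The handler-plus-factory pattern described above can be illustrated in a language-neutral way. The real EvictionHandler interface lives in the controller and its exact signatures are not shown in this guide, so everything below is a hypothetical Python analogue: the method parameters, the MaxAgeHandler strategy, the "max-age" handler name, and the registry are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Protocol

HANDLER_LABEL = "pool.opensandbox.io/eviction-handler"


class EvictionHandler(Protocol):
    """Python analogue of the NeedsEviction()/Evict() pair (signatures assumed)."""

    def needs_eviction(self, pod: dict) -> bool: ...

    def evict(self, pod: dict) -> None: ...


class MaxAgeHandler:
    """Hypothetical strategy: evict idle pods older than a threshold."""

    def __init__(self, max_age_seconds: int) -> None:
        self.max_age_seconds = max_age_seconds
        self.evicted: List[str] = []

    def needs_eviction(self, pod: dict) -> bool:
        return not pod["allocated"] and pod["age_seconds"] > self.max_age_seconds

    def evict(self, pod: dict) -> None:
        self.evicted.append(pod["name"])  # a real handler would delete the pod


# Factory registry keyed by the handler name carried in the Pool label.
HANDLERS: Dict[str, Callable[[], EvictionHandler]] = {
    "max-age": lambda: MaxAgeHandler(max_age_seconds=3600),
}


def handler_for(pool_labels: dict) -> EvictionHandler:
    """Look up and construct the handler named by the Pool's label."""
    name = pool_labels.get(HANDLER_LABEL)
    if name not in HANDLERS:
        raise KeyError(f"no eviction handler registered for {name!r}")
    return HANDLERS[name]()


handler = handler_for({HANDLER_LABEL: "max-age"})
for pod in [
    {"name": "pool-a", "allocated": False, "age_seconds": 7200},
    {"name": "pool-b", "allocated": False, "age_seconds": 60},
]:
    if handler.needs_eviction(pod):
        handler.evict(pod)
print(handler.evicted)
```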

Task Orchestration

Task orchestration is optional. Sandboxes can be created without tasks at all. When enabled, the task executor sidecar runs process-based tasks inside each sandbox pod.

taskTemplate sets the default task for all replicas in a batch.
shardTaskPatches provides per-index overrides, enabling different tasks in each sandbox of the same batch (heterogeneous distribution).
The task executor sidecar requires SYS_PTRACE capability and shareProcessNamespace: true on the pod spec.
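
The relationship between taskTemplate and shardTaskPatches can be sketched as a per-index merge. The merge semantics used here are assumptions, since this guide does not specify them: the patch at index i is overlaid recursively on the template, lists are replaced rather than concatenated, and replicas without a patch fall back to the template unchanged. The function names are hypothetical.

```python
import copy


def deep_merge(base, patch):
    """Recursively overlay `patch` on `base`; non-dict values are replaced."""
    merged = copy.deepcopy(base)
    for key, value in patch.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def tasks_for_batch(replicas, task_template, shard_task_patches):
    """Return one effective task per replica index."""
    tasks = []
    for index in range(replicas):
        # Replicas beyond the end of the patch list keep the default task (assumption).
        patch = shard_task_patches[index] if index < len(shard_task_patches) else {}
        tasks.append(deep_merge(task_template, patch))
    return tasks


template = {"spec": {"process": {"command": ["echo", "Default task"]}}}
patches = [
    {"spec": {"process": {"command": ["echo", "Custom task for sandbox 1"]}}},
]
tasks = tasks_for_batch(2, template, patches)
for index, task in enumerate(tasks):
    print(index, task["spec"]["process"]["command"])
```

In this run the first replica receives the patched command and the second keeps the default from the template.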

Pause and Resume

OpenSandbox supports pause and resume for single-replica Kubernetes sandboxes by committing the container rootfs to an OCI image snapshot.

Time ────────────────────────────────────────────────────────────>

[Running] → [Pausing] → [Paused] → [Resuming] → [Running]
               |                        |
         commit rootfs           rewrite template images
         push to registry        recreate runtime from snapshot
         release pods/alloc

The public sandbox ID remains stable across pause/resume cycles. See the Pause and Resume guide for full setup instructions including OCI registry secrets and controller flags.

Pause and resume currently requires BatchSandbox.spec.replicas=1. The OpenSandbox server always creates Kubernetes sandboxes with replicas: 1. Direct BatchSandbox CRs with any other replica count will be rejected by the controller when pause is requested.
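
The lifecycle and the single-replica restriction can be modeled as a small state machine. This is a sketch for reasoning about the flow, not the controller's implementation: the state names come from the diagram above, while the function names and the dictionary representation of a BatchSandbox are illustrative.

```python
# Allowed transitions, taken from the lifecycle diagram.
TRANSITIONS = {
    "Running": {"Pausing"},
    "Pausing": {"Paused"},
    "Paused": {"Resuming"},
    "Resuming": {"Running"},
}


def advance(state, target):
    """Move to `target` if the lifecycle allows it, otherwise raise."""
    if target not in TRANSITIONS.get(state, set()):
        raise ValueError(f"illegal transition {state} -> {target}")
    return target


def can_pause(batch_sandbox):
    """Pause is only accepted for single-replica BatchSandbox resources."""
    return batch_sandbox["spec"].get("replicas") == 1


state = "Running"
for target in ("Pausing", "Paused", "Resuming", "Running"):
    state = advance(state, target)
    print(state)

print(can_pause({"spec": {"replicas": 1}}))
print(can_pause({"spec": {"replicas": 2}}))
```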

Monitoring

Use kubectl to monitor pool and sandbox status in real time.

# List all pools and batch sandboxes
kubectl get pools
kubectl get batchsandboxes

# Describe a specific resource for detailed status
kubectl describe pool example-pool
kubectl describe batchsandbox task-batch-sandbox

Key status fields:

Field	Description
`status.replicas`	Total number of replicas requested
`status.readyReplicas`	Number of running, ready sandbox pods
`status.allocatedReplicas`	Pods currently allocated to a `BatchSandbox` (Pool)
`status.availableReplicas`	Unallocated warm pods available for new requests (Pool)
`status.taskStates`	Per-task execution state breakdown
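
These fields lend themselves to simple scripted checks. The sketch below works on the status object as it would appear in the output of kubectl get <resource> -o json; the function names are hypothetical, and treating a missing counter as zero is an assumption.

```python
def summarize(status):
    """Summarize BatchSandbox readiness from its status fields."""
    requested = status.get("replicas", 0)
    ready = status.get("readyReplicas", 0)
    return {
        "requested": requested,
        "ready": ready,
        "all_ready": requested > 0 and ready == requested,
    }


def pool_can_serve(pool_status, replicas):
    """True if the pool has enough unallocated warm pods for a new batch."""
    return pool_status.get("availableReplicas", 0) >= replicas


batch_status = {"replicas": 2, "readyReplicas": 1}
pool_status = {"allocatedReplicas": 3, "availableReplicas": 4}

print(summarize(batch_status))
print(pool_can_serve(pool_status, replicas=3))
print(pool_can_serve(pool_status, replicas=5))
```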

Upgrade and Uninstall

# Upgrade the controller to a new version
helm upgrade opensandbox-controller \
  https://github.com/opensandbox-group/OpenSandbox/releases/download/helm/opensandbox-controller/<new-version>/opensandbox-controller-<new-version>.tgz \
  --namespace opensandbox-system

# Uninstall the controller (does not delete CRDs by default)
helm uninstall opensandbox-controller -n opensandbox-system

Components Deployed on Kubernetes

Component	Deployment Kind	Purpose
Server	Deployment	Lifecycle control plane
Operator/Controller	Deployment	Manages BatchSandbox and Pool CRDs
Ingress	DaemonSet or Deployment	Routes HTTP/WebSocket traffic to sandboxes
Egress	Sidecar (per pod)	Per-sandbox egress policy enforcement
Execd	Built into sandbox images	In-sandbox execution API

Related

Server Deployment: Configure the OpenSandbox server
Architecture Overview: Six-surface architecture and request flow
Components: Server, execd, ingress, and egress in detail
Network Architecture: Kubernetes ingress routing and egress isolation
Pause and Resume: Full snapshot-based pause/resume guide
Secure Container: gVisor and Kata on Kubernetes
