The OpenSandbox Kubernetes operator extends the platform with custom resources and a controller that manages sandbox environments natively within your cluster. It adds high-throughput batch delivery via the BatchSandbox CRD, pre-warmed resource pools via the Pool CRD, OCI-snapshot-based pause and resume, optional process-based task orchestration, and comprehensive real-time monitoring — all without requiring changes to the standard OpenSandbox SDK or server API.

Key Features

BatchSandbox CRD

Create and manage one or many identical sandbox replicas from a single YAML manifest or SDK call.

Pool CRD

Maintain pre-warmed compute resources for near-instant sandbox provisioning without cold-start latency.

Pause and Resume

Persist sandbox filesystem state as an OCI rootfs snapshot, release cluster resources, and restore later with the same sandbox ID.

Task Orchestration

Inject heterogeneous process-based tasks into each sandbox in a batch via taskTemplate and shardTaskPatches.

Pod Eviction

Gracefully evict idle pool pods for node maintenance or resource reclamation without disrupting allocated sandboxes.

O(1) Batch Delivery

Delivering 100 sandboxes from a pool completes in under one second — O(1) time complexity vs. O(N) for sequential provisioning.

Prerequisites

| Requirement | Minimum Version |
| --- | --- |
| Kubernetes | 1.21.1+ |
| Helm | 3.x |
| kubectl | 1.11.3+ |
| Go (build from source only) | 1.24.0+ |
| Docker (build from source only) | 17.03+ |

If you don’t have access to a Kubernetes cluster, use kind to create a local cluster for testing. Install it with brew install kind (macOS) or winget install Kubernetes.kind (Windows), then run kind create cluster.

Custom Resources

OpenSandbox installs three CRDs into your cluster.

BatchSandbox

The BatchSandbox resource creates and manages one or many sandbox replicas from a pod template. It is the primary workload unit that the OpenSandbox server creates when the Kubernetes runtime is configured.

Key spec fields:

| Field | Description |
| --- | --- |
| spec.replicas | Number of sandbox instances to create (use 1 for single sandboxes) |
| spec.template | Kubernetes pod template spec for the sandbox container(s) |
| spec.poolRef | Name of a Pool resource to allocate from (omit for non-pooled creation) |
| spec.taskTemplate | Optional default task injected into every sandbox in the batch |
| spec.shardTaskPatches | Per-sandbox task overrides for heterogeneous task distribution |
| spec.pause | Set to true to trigger a rootfs snapshot and release cluster resources |

Pool

The Pool resource maintains a buffer of pre-warmed pods that BatchSandbox instances can allocate instantly instead of waiting for cold container starts.

Key spec fields:

| Field | Description |
| --- | --- |
| spec.template | Pod template spec for pool members (must match BatchSandbox template) |
| spec.capacitySpec.bufferMin | Minimum number of unallocated pods to keep warm |
| spec.capacitySpec.bufferMax | Maximum number of unallocated pods to maintain |
| spec.capacitySpec.poolMax | Hard cap on total pods managed by this pool |
| spec.capacitySpec.poolMin | Minimum total pod count (allocated + buffer) |
| spec.scaleStrategy.maxUnavailable | Rate limit for scaling operations (e.g., "20%" or 5) |
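
To see how the four capacitySpec fields could interact, here is a minimal Python sketch of one possible sizing rule. It is not the controller's actual algorithm: the names CapacitySpec and desired_total_pods, and the order in which the limits are applied, are assumptions made for illustration. The sketch keeps the unallocated buffer between bufferMin and bufferMax, then keeps the total between poolMin and poolMax.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CapacitySpec:
    """Mirrors the spec.capacitySpec fields of a Pool (illustrative type)."""
    buffer_min: int
    buffer_max: int
    pool_min: int
    pool_max: int


def desired_total_pods(spec: CapacitySpec, allocated: int, current_total: int) -> int:
    """Return a target total pod count (hypothetical rule, not the controller's)."""
    # Clamp the current unallocated buffer into [bufferMin, bufferMax].
    unallocated = current_total - allocated
    buffer = max(spec.buffer_min, min(unallocated, spec.buffer_max))
    # Total is allocated pods plus the buffer, clamped into [poolMin, poolMax].
    total = allocated + buffer
    return max(spec.pool_min, min(total, spec.pool_max))


spec = CapacitySpec(buffer_min=2, buffer_max=10, pool_min=5, pool_max=20)
print(desired_total_pods(spec, allocated=0, current_total=0))    # empty pool
print(desired_total_pods(spec, allocated=3, current_total=18))   # oversized buffer
print(desired_total_pods(spec, allocated=19, current_total=19))  # near the hard cap
```

With these example values, an empty pool is raised to poolMin, an oversized buffer is trimmed to bufferMax, and a nearly full pool stops at poolMax even though that leaves fewer than bufferMin warm pods.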

SandboxSnapshot (internal)

SandboxSnapshot is an internal CRD used by the controller to track rootfs snapshot operations during pause/resume. You do not create these directly — the controller creates them when BatchSandbox.spec.pause=true.

Helm Deployment

1. Install the controller from a GitHub Release

Install the OpenSandbox controller Helm chart directly from GitHub Releases. Replace <version> with the desired release (e.g., 0.1.0).
helm install opensandbox-controller \
  https://github.com/opensandbox-group/OpenSandbox/releases/download/helm/opensandbox-controller/<version>/opensandbox-controller-<version>.tgz \
  --namespace opensandbox-system \
  --create-namespace

2. Customize the installation (optional)

Use --set flags or a values file to tune resource limits, replica count, and log level.
helm install opensandbox-controller \
  https://github.com/opensandbox-group/OpenSandbox/releases/download/helm/opensandbox-controller/0.1.0/opensandbox-controller-0.1.0.tgz \
  --namespace opensandbox-system \
  --create-namespace \
  --set controller.replicaCount=2 \
  --set controller.resources.limits.cpu=1000m \
  --set controller.resources.limits.memory=512Mi

3. Configure the OpenSandbox server for Kubernetes

Generate a Kubernetes-oriented server config and edit it to point at your cluster.
opensandbox-server init-config ~/.sandbox.toml --example k8s
Key Kubernetes-specific config sections:

| Section | Purpose |
| --- | --- |
| [kubernetes] | workload_provider (batchsandbox or agent-sandbox), batchsandbox_template_file |
| [agent_sandbox] | Agent sandbox settings for the agent-sandbox provider |
| [ingress] | Ingress gateway mode for routing sandbox traffic |
| [secure_runtime] | gVisor, Kata, or Firecracker runtime class integration |

4. Verify the installation

Check that the controller pod is running and review its logs.
kubectl get pods -n opensandbox-system
kubectl get deployment -n opensandbox-system
kubectl logs -n opensandbox-system -l control-plane=controller-manager -f

Creating BatchSandbox and Pool Resources

Basic non-pooled sandbox

Create a simple batch of two sandboxes without a resource pool:
apiVersion: sandbox.opensandbox.io/v1alpha1
kind: BatchSandbox
metadata:
  name: basic-batch-sandbox
spec:
  replicas: 2
  template:
    spec:
      containers:
      - name: sandbox-container
        image: nginx:latest
        ports:
        - containerPort: 80
Save the manifest as basic-batch-sandbox.yaml, then apply it and check its status:
kubectl apply -f basic-batch-sandbox.yaml
kubectl get batchsandbox basic-batch-sandbox -o wide

# Check the endpoint annotation once sandboxes are ready
kubectl get batchsandbox basic-batch-sandbox \
  -o jsonpath='{.metadata.annotations.sandbox\.opensandbox\.io/endpoints}'

Resource pool

Create a pool that keeps 2 to 10 unallocated pods warm, with at least 5 and at most 20 pods in total:
apiVersion: sandbox.opensandbox.io/v1alpha1
kind: Pool
metadata:
  name: example-pool
spec:
  template:
    spec:
      containers:
      - name: sandbox-container
        image: nginx:latest
        ports:
        - containerPort: 80
  capacitySpec:
    bufferMin: 2
    bufferMax: 10
    poolMin: 5
    poolMax: 20
  scaleStrategy:
    maxUnavailable: "20%"
Then allocate from the pool:
apiVersion: sandbox.opensandbox.io/v1alpha1
kind: BatchSandbox
metadata:
  name: pooled-batch-sandbox
spec:
  replicas: 3
  poolRef: example-pool

Pooled sandbox with heterogeneous tasks

For reinforcement-learning (RL) style workloads, use shardTaskPatches to inject a different task into each sandbox replica. The task executor sidecar must share the process namespace with the sandbox container, which is why the pool template sets shareProcessNamespace: true.
apiVersion: sandbox.opensandbox.io/v1alpha1
kind: Pool
metadata:
  name: task-example-pool
spec:
  template:
    spec:
      shareProcessNamespace: true
      containers:
      - name: sandbox-container
        image: ubuntu:latest
        command: ["sleep", "3600"]
      - name: task-executor
        image: <task-executor-image>:<tag>
        securityContext:
          capabilities:
            add: ["SYS_PTRACE"]
  capacitySpec:
    bufferMin: 2
    bufferMax: 10
    poolMax: 20
    poolMin: 5
Then create a BatchSandbox that allocates from this pool and assigns a task to each replica:
apiVersion: sandbox.opensandbox.io/v1alpha1
kind: BatchSandbox
metadata:
  name: task-batch-sandbox
spec:
  replicas: 2
  poolRef: task-example-pool
  taskTemplate:
    spec:
      process:
        command: ["echo", "Default task"]
  shardTaskPatches:
  - spec:
      process:
        command: ["echo", "Custom task for sandbox 1"]
  - spec:
      process:
        command: ["echo", "Custom task for sandbox 2"]
        args: ["with", "additional", "arguments"]

Resource Pooling

Pools maintain a buffer of pre-warmed, unallocated pods so that BatchSandbox allocation is near-instantaneous. The controller continuously reconciles the pool to maintain bufferMin ≤ unallocated ≤ bufferMax while keeping the total pod count between poolMin and poolMax.

| Scenario | Total time for 100 sandboxes |
| --- | --- |
| Sequential (concurrency=1) | 76.35 s |
| Concurrent (concurrency=10) | 23.17 s |
| Concurrent (concurrency=50) | 33.85 s |
| BatchSandbox with Pool | 0.92 s |
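
To put the benchmark figures in perspective, the short Python snippet below derives speedup factors and the average per-sandbox latency from the table. The timings are copied from the table as-is; the variable names are illustrative only.

```python
# Total times (seconds) for 100 sandboxes, copied from the table above.
timings = {
    "Sequential (concurrency=1)": 76.35,
    "Concurrent (concurrency=10)": 23.17,
    "Concurrent (concurrency=50)": 33.85,
    "BatchSandbox with Pool": 0.92,
}

pooled = timings["BatchSandbox with Pool"]
# How many times longer each scenario takes compared with pooled delivery.
speedups = {name: seconds / pooled for name, seconds in timings.items()}

for name, factor in speedups.items():
    print(f"{name}: {timings[name]:.2f} s total, {factor:.1f}x the pooled time")

# Average per-sandbox latency for the pooled case, in milliseconds.
per_sandbox_ms = pooled / 100 * 1000
print(f"Pooled delivery: {per_sandbox_ms:.1f} ms per sandbox on average")
```

By this arithmetic, pooled delivery is roughly 83 times faster than sequential provisioning and about 25 times faster than the best concurrent run in the table.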

Pod Eviction

Pools support graceful pod eviction for node maintenance or resource reclamation.

How to evict a pod:
  1. Label the target pod with pool.opensandbox.io/evict (any value).
  2. The controller checks whether the pod is allocated to an active BatchSandbox. Allocated pods are protected and skipped.
  3. Idle pods are deleted, triggering the pool to replenish capacity automatically.
  4. Pods labeled for eviction are excluded from new allocations.
Custom eviction strategies can be implemented by:
  1. Setting the pool.opensandbox.io/eviction-handler=<handler-name> label on the Pool resource.
  2. Implementing the EvictionHandler interface with NeedsEviction() and Evict() methods.
  3. Registering the handler in the factory function.
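
The four eviction steps above can be sketched as a small decision function. This is a Python illustration of the described behavior, not the controller's code (the real EvictionHandler interface belongs to the operator itself). The PoolPod type and plan_eviction function are hypothetical; only the label key pool.opensandbox.io/evict comes from the steps above.

```python
from dataclasses import dataclass, field

EVICT_LABEL = "pool.opensandbox.io/evict"


@dataclass
class PoolPod:
    """Hypothetical, simplified view of a pool pod."""
    name: str
    labels: dict = field(default_factory=dict)
    allocated: bool = False  # True when bound to an active BatchSandbox


def plan_eviction(pods):
    """Split pods into (to_delete, protected, allocatable) following the four steps."""
    to_delete, protected, allocatable = [], [], []
    for pod in pods:
        marked = EVICT_LABEL in pod.labels  # step 1: any label value counts
        if marked and pod.allocated:
            protected.append(pod.name)      # step 2: allocated pods are skipped
        elif marked:
            to_delete.append(pod.name)      # step 3: idle pods are deleted
        elif not pod.allocated:
            allocatable.append(pod.name)    # step 4: only unmarked idle pods take new work
    return to_delete, protected, allocatable


pods = [
    PoolPod("pod-a", {EVICT_LABEL: "true"}),
    PoolPod("pod-b", {EVICT_LABEL: ""}, allocated=True),
    PoolPod("pod-c"),
    PoolPod("pod-d", allocated=True),
]
print(plan_eviction(pods))
```

In this example pod-a is deleted, pod-b is protected because it is allocated, and only pod-c remains eligible for new allocations.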

Task Orchestration

Task orchestration is optional. Sandboxes can be created without tasks at all. When enabled, the task executor sidecar runs process-based tasks inside each sandbox pod.
  • taskTemplate sets the default task for all replicas in a batch.
  • shardTaskPatches provides per-index overrides, enabling different tasks in each sandbox of the same batch (heterogeneous distribution).
  • The task executor sidecar requires SYS_PTRACE capability and shareProcessNamespace: true on the pod spec.
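
As a rough mental model of how taskTemplate and shardTaskPatches combine, the Python sketch below resolves one task per replica index. The merge semantics are an assumption: it deep-merges each patch over the template and falls back to the template when an index has no patch. The actual controller may merge differently, and deep_merge and resolve_tasks are hypothetical helper names.

```python
import copy


def deep_merge(base: dict, patch: dict) -> dict:
    """Return a copy of base with patch applied recursively (patch wins on conflicts)."""
    merged = copy.deepcopy(base)
    for key, value in patch.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def resolve_tasks(replicas: int, task_template: dict, shard_task_patches: list) -> list:
    """Compute one task per replica index; replicas without a patch keep the template."""
    tasks = []
    for index in range(replicas):
        patch = shard_task_patches[index] if index < len(shard_task_patches) else {}
        tasks.append(deep_merge(task_template, patch))
    return tasks


template = {"spec": {"process": {"command": ["echo", "Default task"]}}}
patches = [
    {"spec": {"process": {"command": ["echo", "Custom task for sandbox 1"]}}},
]
for index, task in enumerate(resolve_tasks(2, template, patches)):
    print(index, task["spec"]["process"]["command"])
```

Here replica 0 receives the patched command while replica 1 keeps the default from the template, which is the heterogeneous distribution the bullets describe.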

Pause and Resume

OpenSandbox supports pause and resume for single-replica Kubernetes sandboxes by committing the container rootfs to an OCI image snapshot.
Time ────────────────────────────────────────────────────────────>

[Running] → [Pausing] → [Paused] → [Resuming] → [Running]
               |                        |
         commit rootfs           rewrite template images
         push to registry        recreate runtime from snapshot
         release pods/alloc
The public sandbox ID remains stable across pause/resume cycles. See the Pause and Resume guide for full setup instructions including OCI registry secrets and controller flags.
Pause and resume currently requires BatchSandbox.spec.replicas=1. The OpenSandbox server always creates Kubernetes sandboxes with replicas: 1. Direct BatchSandbox CRs with any other replica count will be rejected by the controller when pause is requested.
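
The lifecycle in the timeline can be modeled as a simple state machine. The Python sketch below is illustrative only: the phase names are taken from the diagram, but whether the controller exposes them as status values, and the rule that a pause can only start from Running, are assumptions. The replicas check mirrors the single-replica restriction described above.

```python
from enum import Enum


class Phase(Enum):
    RUNNING = "Running"
    PAUSING = "Pausing"
    PAUSED = "Paused"
    RESUMING = "Resuming"


# Allowed transitions, read directly off the timeline diagram.
NEXT_PHASE = {
    Phase.RUNNING: Phase.PAUSING,
    Phase.PAUSING: Phase.PAUSED,
    Phase.PAUSED: Phase.RESUMING,
    Phase.RESUMING: Phase.RUNNING,
}


def request_pause(phase: Phase, replicas: int) -> Phase:
    """Start a pause, rejecting multi-replica batches."""
    if replicas != 1:
        raise ValueError("pause requires spec.replicas=1")
    if phase is not Phase.RUNNING:
        raise ValueError(f"cannot pause from {phase.value}")
    return NEXT_PHASE[phase]


def advance(phase: Phase) -> Phase:
    """Move to the next phase in the cycle."""
    return NEXT_PHASE[phase]


# Walk one full pause/resume cycle for a single-replica sandbox.
phase = request_pause(Phase.RUNNING, replicas=1)
while phase is not Phase.RUNNING:
    print(phase.value)
    phase = advance(phase)
```

Calling request_pause with replicas=2 raises a ValueError in this sketch, standing in for the controller's rejection of multi-replica pause requests.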

Monitoring

Use kubectl to monitor pool and sandbox status in real time.
# List all pools and batch sandboxes
kubectl get pools
kubectl get batchsandboxes

# Describe a specific resource for detailed status
kubectl describe pool example-pool
kubectl describe batchsandbox task-batch-sandbox
Key status fields:

| Field | Description |
| --- | --- |
| status.replicas | Total number of replicas requested |
| status.readyReplicas | Number of running, ready sandbox pods |
| status.allocatedReplicas | Pods currently allocated to a BatchSandbox (Pool) |
| status.availableReplicas | Unallocated warm pods available for new requests (Pool) |
| status.taskStates | Per-task execution state breakdown |
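
For scripted checks, the status fields can be read from JSON output. The Python sketch below summarizes readiness from a BatchSandbox document such as the one produced by kubectl get batchsandbox <name> -o json. The sample document is hypothetical and includes only the fields listed above; summarize_batchsandbox is an illustrative helper, not part of OpenSandbox.

```python
import json


def summarize_batchsandbox(resource_json: str) -> str:
    """Summarize readiness from a BatchSandbox JSON document."""
    resource = json.loads(resource_json)
    status = resource.get("status", {})
    replicas = status.get("replicas", 0)
    ready = status.get("readyReplicas", 0)
    name = resource.get("metadata", {}).get("name", "<unknown>")
    # Treat the batch as ready only when every requested replica is ready.
    state = "ready" if replicas > 0 and ready == replicas else "not ready"
    return f"{name}: {ready}/{replicas} replicas ready ({state})"


# Hypothetical sample; real input would come from
# `kubectl get batchsandbox <name> -o json`.
sample = json.dumps({
    "metadata": {"name": "task-batch-sandbox"},
    "status": {"replicas": 2, "readyReplicas": 1},
})
print(summarize_batchsandbox(sample))
```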

Upgrade and Uninstall

# Upgrade the controller to a new version
helm upgrade opensandbox-controller \
  https://github.com/opensandbox-group/OpenSandbox/releases/download/helm/opensandbox-controller/<new-version>/opensandbox-controller-<new-version>.tgz \
  --namespace opensandbox-system

# Uninstall the controller (does not delete CRDs by default)
helm uninstall opensandbox-controller -n opensandbox-system

Components Deployed on Kubernetes

| Component | Deployment Kind | Purpose |
| --- | --- | --- |
| Server | Deployment | Lifecycle control plane |
| Operator/Controller | Deployment | Manages BatchSandbox and Pool CRDs |
| Ingress | DaemonSet or Deployment | Routes HTTP/WebSocket traffic to sandboxes |
| Egress | Sidecar (per pod) | Per-sandbox egress policy enforcement |
| Execd | Built into sandbox images | In-sandbox execution API |