This guide covers common issues you may encounter when operating OpenSandbox, with steps for diagnosing and resolving each one.

Server Issues

Symptoms:
  • Server exits immediately after starting
  • Error messages about configuration
  • Port binding failures
Common Causes & Solutions:
1. Configuration file not found
# Error: Could not load config from ~/.sandbox.toml

# Solution: Create config file
opensandbox-server init-config ~/.sandbox.toml --example docker
2. Port already in use
# Error: Address already in use: 0.0.0.0:8080

# Solution: Check what's using the port
lsof -i :8080

# Kill the process or change port in config
[server]
port = 8081
3. Invalid configuration syntax
# Error: TOML parsing error

# Solution: Validate TOML syntax (tomllib needs Python 3.11+; older versions need the third-party toml package)
python3 -c "import sys, tomllib; tomllib.load(open(sys.argv[1], 'rb'))" ~/.sandbox.toml
4. Docker daemon not accessible
# Error: Cannot connect to Docker daemon

# Solution: Verify Docker is running
docker ps

# Check DOCKER_HOST variable
echo $DOCKER_HOST

# Set correct Docker socket
export DOCKER_HOST="unix:///var/run/docker.sock"
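The port check in step 2 can also be scripted, for example as a preflight step before launching the server. The following is a minimal sketch using only the Python standard library; the function name `port_in_use` is hypothetical and not part of OpenSandbox.

```python
import socket


def port_in_use(port, host="127.0.0.1", timeout=0.5):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising an exception.
        return probe.connect_ex((host, port)) == 0


# Example: 8080 is the port used in the error message above.
# if port_in_use(8080):
#     print("port 8080 is taken; pick another port in [server]")
```

This only detects listeners reachable on the given host address, so `lsof -i :8080` remains the more complete check when you need to know which process holds the port.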
Symptoms:
  • 401 Unauthorized responses
  • “Invalid API key” errors
Solutions:
1. API key not configured
# ~/.sandbox.toml
[server]
api_key = "your-secret-api-key-change-this"
2. Missing header in requests
# Wrong - missing auth header
curl http://localhost:8080/v1/sandboxes

# Correct - include API key header
curl -H "OPEN-SANDBOX-API-KEY: your-secret-api-key" \
  http://localhost:8080/v1/sandboxes
3. Disable auth for development
[server]
# Comment out or remove api_key for local dev
# api_key = "your-secret-api-key"
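When calling the API from a script rather than curl, the same header has to be attached to every request. The sketch below uses the standard library's `urllib`; the helper names `build_request` and `list_sandboxes_status` are hypothetical, while the header name and the `/v1/sandboxes` path are taken from the curl examples above.

```python
import urllib.error
import urllib.request

API_KEY_HEADER = "OPEN-SANDBOX-API-KEY"  # header name from the curl example


def build_request(base_url, path, api_key=None):
    """Build a GET request, attaching the API key header when a key is given."""
    url = base_url.rstrip("/") + "/" + path.lstrip("/")
    headers = {}
    if api_key:
        headers[API_KEY_HEADER] = api_key
    return urllib.request.Request(url, headers=headers)


def list_sandboxes_status(base_url, api_key=None, timeout=5.0):
    """Return the HTTP status code of GET /v1/sandboxes (401 means bad or missing key)."""
    request = build_request(base_url, "/v1/sandboxes", api_key)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urlopen raises HTTPError for 4xx/5xx responses; the code is still useful.
        return err.code
```

A result of 401 from `list_sandboxes_status("http://localhost:8080", "your-secret-api-key")` would point at a key mismatch rather than a missing header.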
Diagnosis:
# Test server health
curl http://localhost:8080/health

# Check server logs
tail -f /var/log/opensandbox/server.log
Solutions:
  • Verify server is fully started (check logs for “Application startup complete”)
  • Check for runtime initialization errors
  • Ensure Docker/Kubernetes runtime is accessible
  • Review log_level in configuration for more details
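If the server is slow to start, a script can poll the health endpoint instead of failing on the first attempt. This is a minimal sketch, assuming that `/health` answers with HTTP 200 once startup is complete (the guide does not state the exact response). The names `http_status` and `wait_for_health` are hypothetical.

```python
import time
import urllib.error
import urllib.request


def http_status(url, timeout=2.0):
    """Return the HTTP status for url, or 0 if the server is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code
    except OSError:
        # Connection refused, DNS failure, timeout, and similar errors.
        return 0


def wait_for_health(url, attempts=10, delay=1.0, probe=http_status, sleep=time.sleep):
    """Probe url up to `attempts` times; return True as soon as it answers 200."""
    for attempt in range(attempts):
        if probe(url) == 200:
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False


# Example: wait_for_health("http://localhost:8080/health")
```

The `probe` and `sleep` parameters exist only so the loop can be exercised without a running server.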

Sandbox Creation Issues

Symptoms:
  • Sandbox never transitions to Running
  • Status remains Pending for extended period
Debugging Steps:
1. Check sandbox status
curl -H "OPEN-SANDBOX-API-KEY: your-api-key" \
  http://localhost:8080/v1/sandboxes/{sandbox_id}
Look for the status.reason and status.message fields.
2. Common causes:
Image pull failures:
{
  "status": {
    "state": "Failed",
    "reason": "IMAGE_PULL_ERROR",
    "message": "Failed to pull image: manifest not found"
  }
}
Solutions:
  • Verify image exists: docker pull python:3.11-slim
  • Check image registry credentials
  • Use full image URI including registry
Resource constraints:
  • Insufficient CPU/memory on host
  • Resource limits too high
  • Pool capacity exceeded
Solutions:
# Check Docker resources
docker info | grep -E "CPUs|Memory"

# Reduce resource limits
"resourceLimits": {
  "cpu": "250m",
  "memory": "256Mi"
}
3. Enable debug logging
[server]
log_level = "DEBUG"
Restart the server and review the detailed logs.
Error:
{
  "detail": "entrypoint must be a non-empty array"
}
Solutions:
1. Empty entrypoint array
// Wrong
{
  "image": {"uri": "python:3.11-slim"},
  "entrypoint": []  // Invalid
}

// Correct
{
  "image": {"uri": "python:3.11-slim"},
  "entrypoint": ["python", "-m", "http.server", "8000"]
}
2. Missing entrypoint entirely
// Must include entrypoint field
{
  "image": {"uri": "python:3.11-slim"},
  "entrypoint": ["sleep", "infinity"],
  "timeout": 3600
}
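A client can catch this error before the request is sent. The sketch below checks a request body against the rule quoted in the error message. The `image.uri` check and the requirement that each entrypoint item is a non-empty string are assumptions based on the examples above, not documented server rules, and `validate_create_request` is a hypothetical helper.

```python
def validate_create_request(body):
    """Return a list of problems found in a sandbox creation body (empty if none)."""
    problems = []

    # Assumption: every example in this guide carries image.uri.
    image = body.get("image")
    if not isinstance(image, dict) or not image.get("uri"):
        problems.append("image.uri is missing")

    # Mirrors the server error: entrypoint must be a non-empty array.
    entrypoint = body.get("entrypoint")
    if not isinstance(entrypoint, list) or not entrypoint:
        problems.append("entrypoint must be a non-empty array")
    elif not all(isinstance(arg, str) and arg for arg in entrypoint):
        # Assumption: items are expected to be non-empty strings.
        problems.append("entrypoint items must be non-empty strings")

    return problems
```

Both failure cases above (an empty array and a missing field) produce the same message, which matches the single error the server reports.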
Error:
NetworkPolicy requires Docker bridge mode and egress.image configuration
Solution:
1. Configure egress sidecar
[runtime]
type = "docker"
execd_image = "opensandbox/execd:v1.0.6"

[egress]
image = "opensandbox/egress:v1.0.1"

[docker]
network_mode = "bridge"  # Required for network policies
2. Pull egress image
docker pull opensandbox/egress:v1.0.1
3. Verify bridge mode
  • Network policies NOT supported in host mode
  • Must use network_mode = "bridge"
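The two preconditions named in the error can be checked against a parsed configuration. This sketch takes a plain dictionary, such as the result of `tomllib.load` on Python 3.11+. It treats a missing `network_mode` as a problem because this guide does not state the default, and `network_policy_problems` is a hypothetical helper.

```python
def network_policy_problems(config):
    """Return reasons why network policies would be rejected (empty if none)."""
    problems = []
    docker = config.get("docker", {})
    egress = config.get("egress", {})

    # Network policies are not supported in host mode.
    if docker.get("network_mode") != "bridge":
        problems.append('[docker] network_mode must be "bridge"')

    # The egress sidecar image must be configured.
    if not egress.get("image"):
        problems.append("[egress] image is not set")

    return problems
```

For the configuration shown in step 1 the function returns an empty list.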
Common Errors:
Invalid CPU format:
// Wrong
"resourceLimits": {"cpu": "500"}

// Correct
"resourceLimits": {"cpu": "500m"}
Invalid memory format:
// Wrong
"resourceLimits": {"memory": "512"}

// Correct
"resourceLimits": {"memory": "512Mi"}
Valid formats:
  • CPU: "100m", "0.5", "1"
  • Memory: "128Mi", "1Gi", "512Mi"
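These formats can be checked before a request is sent. The sketch below is based only on the examples listed here, so it is stricter than the full Kubernetes quantity syntax. A bare number such as "500" is a syntactically valid core count, so the CPU check adds a heuristic, with a hypothetical `max_cores` threshold, to flag values that look like millicores missing the "m" suffix. The names `check_cpu` and `check_memory` are hypothetical.

```python
import re

CPU_PATTERN = re.compile(r"\d+m|\d+(\.\d+)?")    # "100m", "0.5", "1"
MEMORY_PATTERN = re.compile(r"\d+(Ki|Mi|Gi|Ti)")  # "128Mi", "1Gi"


def check_cpu(value, max_cores=64.0):
    """Return None if value looks valid, else a description of the problem."""
    if not CPU_PATTERN.fullmatch(value):
        return f"cpu {value!r} is not in a recognised format"
    # Heuristic (assumption): a very large whole-core count is probably a typo.
    if not value.endswith("m") and float(value) > max_cores:
        return f"cpu {value!r} means {value} whole cores; did you mean {value}m?"
    return None


def check_memory(value):
    """Return None if value has a binary unit suffix, else a description of the problem."""
    if not MEMORY_PATTERN.fullmatch(value):
        return f"memory {value!r} needs a unit suffix such as Mi or Gi"
    return None
```

Adjust `max_cores` to the largest host you run; the default of 64 is arbitrary.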

Runtime Issues

Symptoms:
  • Cannot create sandboxes
  • “Docker daemon not responding” errors
Solutions:
1. Verify Docker is running
# Check Docker service
systemctl status docker

# Start if stopped
sudo systemctl start docker

# Test connection
docker ps
2. Check Docker socket permissions
# Verify socket exists
ls -la /var/run/docker.sock

# Add user to docker group
sudo usermod -aG docker $USER

# Reload groups (or logout/login)
newgrp docker
3. Configure Docker API timeout
[docker]
api_timeout = 300  # Increase timeout to 5 minutes
4. Remote Docker host
# Set Docker host environment variable
export DOCKER_HOST="ssh://user@remote-host"

# Or connect over TCP (port 2375 is unencrypted; use only on a trusted network)
export DOCKER_HOST="tcp://10.0.0.1:2375"
Symptoms:
  • Pods not created
  • Timeout waiting for sandbox
Solutions:
1. Verify kubeconfig
[kubernetes]
kubeconfig_path = "~/.kube/config"
namespace = "opensandbox"
2. Check namespace exists
kubectl get namespace opensandbox

# Create if missing
kubectl create namespace opensandbox
3. Verify RBAC permissions
# Check service account permissions
kubectl auth can-i create pods --namespace opensandbox
kubectl auth can-i get pods --namespace opensandbox
kubectl auth can-i delete pods --namespace opensandbox
4. Check controller logs
kubectl logs -n opensandbox deployment/sandbox-controller
Symptoms:
  • Cannot execute code or commands
  • Ping endpoint timeout
Diagnosis:
# Test execd health
curl http://sandbox-ip:44772/ping

# Check if port is accessible
nc -zv sandbox-ip 44772
Solutions:
1. Verify execd image version
[runtime]
execd_image = "opensandbox/execd:v1.0.6"  # Use latest version
2. Check container logs
docker logs <sandbox-container-id>
3. Verify network connectivity
  • Host mode: Port 44772 directly accessible
  • Bridge mode: Check port mappings and routing
4. Access token authentication
# Include access token header
curl -H "X-EXECD-ACCESS-TOKEN: your-token" \
  http://sandbox-ip:44772/metrics

Networking Issues

Host Mode Issues:
1. Port already in use
# Error: Port 8000 already allocated

# Solution: Only one sandbox at a time in host mode
# Or use bridge mode for multiple sandboxes
2. Firewall blocking access
# Check firewall rules
sudo iptables -L -n

# Allow port (example)
sudo ufw allow 8000/tcp
Bridge Mode Issues:
1. Routing not configured
[docker]
network_mode = "bridge"
host_ip = "10.57.1.91"  # Set when server runs in container
2. Get endpoint URL
curl -H "OPEN-SANDBOX-API-KEY: your-api-key" \
  http://localhost:8080/v1/sandboxes/{sandbox_id}/endpoints/8000
Use the returned endpoint URL instead of accessing the sandbox IP directly.
Symptoms:
  • Network policy not enforced
  • Sidecar container errors
Diagnosis:
# List all containers including sidecars
docker ps -a | grep egress

# Check sidecar logs
docker logs opensandbox-egress-<sandbox-id>
Solutions:
1. Missing egress image
# Pull egress image
docker pull opensandbox/egress:v1.0.1

# Verify image exists
docker images | grep egress
2. Capability conflicts
  • Main container drops NET_ADMIN (required)
  • Sidecar needs NET_ADMIN (automatically granted)
  • Don’t manually override these settings
3. IPv6 disabled warning
  • Normal behavior when egress sidecar is active
  • IPv6 automatically disabled for policy enforcement
Direct Mode (Default):
[ingress]
mode = "direct"  # Docker runtime only
Gateway Mode (Kubernetes):
[ingress]
mode = "gateway"
gateway.address = "*.example.com"
gateway.route.mode = "wildcard"  # or "uri" or "header"
Route Modes:
Wildcard:
URL: <sandbox-id>-<port>.example.com/path
URI:
URL: gateway.example.com/<sandbox-id>/<port>/path
Header:
URL: gateway.example.com
Header: OpenSandbox-Ingress-To: <sandbox-id>-<port>
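The three route modes differ only in where the sandbox ID and port appear in the request. The sketch below builds the URL and headers for each mode from the patterns above. The `https` scheme, the `gateway.` host prefix taken from the examples, and passing the path through unchanged in header mode are assumptions, and `gateway_target` is a hypothetical helper.

```python
def gateway_target(mode, sandbox_id, port, domain="example.com", path="/"):
    """Return (url, headers) for reaching sandbox_id:port through the gateway."""
    path = "/" + path.lstrip("/")
    if mode == "wildcard":
        # <sandbox-id>-<port>.example.com/path
        return f"https://{sandbox_id}-{port}.{domain}{path}", {}
    if mode == "uri":
        # gateway.example.com/<sandbox-id>/<port>/path
        return f"https://gateway.{domain}/{sandbox_id}/{port}{path}", {}
    if mode == "header":
        # Routing is carried by the OpenSandbox-Ingress-To header.
        headers = {"OpenSandbox-Ingress-To": f"{sandbox_id}-{port}"}
        return f"https://gateway.{domain}{path}", headers
    raise ValueError(f"unknown route mode: {mode!r}")
```

Comparing the output with the URL a client actually requests can help confirm that the client and the `gateway.route.mode` setting agree.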

Kubernetes-Specific Issues

Check status:
kubectl describe batchsandbox <name>
Common causes:
1. Pool capacity exceeded
# Check pool status
kubectl get pool <pool-name> -o yaml

# Increase pool capacity
kubectl edit pool <pool-name>
# Update poolMax value
2. Resource quota exceeded
# Check namespace quotas
kubectl describe resourcequota -n opensandbox

# Check node resources
kubectl top nodes
3. Image pull failures
# Check pod events
kubectl get events -n opensandbox --sort-by='.lastTimestamp'

# Verify image pull secrets
kubectl get secrets -n opensandbox
Diagnosis:
# Check pool status
kubectl get pool <pool-name> -o wide

# View detailed status
kubectl describe pool <pool-name>
Solutions:
1. Controller not running
# Check controller pod
kubectl get pods -n opensandbox -l app=sandbox-controller

# View controller logs
kubectl logs -n opensandbox -l app=sandbox-controller
2. Node resource constraints
# Check available node resources
kubectl describe nodes | grep -A 5 "Allocated resources"
3. Adjust pool settings
capacitySpec:
  bufferMin: 2
  bufferMax: 10
  poolMin: 5
  poolMax: 20  # Increase if hitting limit
Symptoms:
  • Tasks stuck in running state
  • Tasks fail immediately
Diagnosis:
# Check task status
kubectl get batchsandbox <name> -o jsonpath='{.status.taskStats}'

# Get task executor logs
kubectl logs <pod-name> -c task-executor
Solutions:
1. Missing task-executor sidecar
# Pool template must include task-executor
spec:
  template:
    spec:
      shareProcessNamespace: true  # Required
      containers:
      - name: sandbox-container
        image: ubuntu:latest
      - name: task-executor
        image: opensandbox/task-executor:latest
        securityContext:
          capabilities:
            add: ["SYS_PTRACE"]  # Required
2. Process namespace not shared
spec:
  shareProcessNamespace: true  # Must be set
3. Task command errors
  • Verify command exists in container
  • Check command syntax
  • Review task executor logs for errors
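The first two requirements can be checked mechanically against a pool definition. This sketch takes a dictionary, for example the output of `kubectl get pool <pool-name> -o json` passed through `json.loads`. It assumes the sidecar is identified by the container name `task-executor` as in the template above, and `task_executor_problems` is a hypothetical helper.

```python
def task_executor_problems(pool):
    """Return problems with the task-executor setup in a pool template (empty if none)."""
    pod_spec = pool.get("spec", {}).get("template", {}).get("spec", {})
    problems = []

    # The sidecar needs to see the sandbox container's processes.
    if pod_spec.get("shareProcessNamespace") is not True:
        problems.append("shareProcessNamespace must be true")

    containers = pod_spec.get("containers") or []
    executor = next((c for c in containers if c.get("name") == "task-executor"), None)
    if executor is None:
        problems.append("no container named task-executor")
    else:
        added = executor.get("securityContext", {}).get("capabilities", {}).get("add", [])
        if "SYS_PTRACE" not in added:
            problems.append("task-executor is missing the SYS_PTRACE capability")

    return problems
```

An empty result only confirms the template shape; command errors inside a task still need the executor logs.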
Solutions:
1. Check directory permissions
ls -la /var/log/sandbox-controller/
ls -ld /var/log/sandbox-controller/
2. Verify file logging enabled
# Controller must be started with
--enable-file-log=true
3. Create log directory
mkdir -p /var/log/sandbox-controller
chmod 755 /var/log/sandbox-controller
4. In Kubernetes
initContainers:
- name: setup-log-dir
  image: busybox
  command: ['sh', '-c', 'mkdir -p /var/log/controller && chmod 755 /var/log/controller']
  volumeMounts:
  - name: log-volume
    mountPath: /var/log/controller

Debugging Techniques

Enable Debug Logging

Server:
[server]
log_level = "DEBUG"
Kubernetes Controller:
./controller --zap-log-level=debug

Docker Debugging

# In server code
import logging
logging.getLogger("docker").setLevel(logging.DEBUG)

Interactive Debugging

VS Code/Cursor:
{
  "version": "0.2.0",
  "configurations": [
    {
      "name": "Python: FastAPI",
      "type": "python",
      "request": "launch",
      "module": "src.main",
      "justMyCode": false,
      "env": {
        "SANDBOX_CONFIG_PATH": "${workspaceFolder}/.sandbox.toml"
      }
    }
  ]
}
Python Breakpoints:
breakpoint()  # Python 3.7+

Collect Diagnostic Information

#!/bin/bash
# Diagnostic collection script
# printf is used for section headers because bash's echo does not expand \n by default.

echo "=== OpenSandbox Diagnostics ==="

# Server version
printf '\n--- Server Version ---\n'
opensandbox-server --version

# Docker info
printf '\n--- Docker Info ---\n'
docker info

# Server logs
printf '\n--- Server Logs (last 50 lines) ---\n'
tail -n 50 /var/log/opensandbox/server.log

# Active sandboxes
printf '\n--- Active Containers ---\n'
docker ps | grep opensandbox

# Network configuration
printf '\n--- Network Config ---\n'
grep -A 5 '\[docker\]' ~/.sandbox.toml

# Resource usage
printf '\n--- System Resources ---\n'
docker stats --no-stream

Common Error Codes

| Error Code | Description | Solution |
| --- | --- | --- |
| IMAGE_PULL_ERROR | Failed to pull container image | Verify image exists and credentials |
| CONTAINER_STARTING | Container is starting | Wait for transition to Running |
| RESOURCE_LIMIT_EXCEEDED | Insufficient resources | Reduce limits or increase host capacity |
| NETWORK_ERROR | Network configuration failed | Check network mode and routing |
| EXPIRED | Sandbox TTL exceeded | Normal - automatic cleanup |
| INVALID_REQUEST_BODY | Malformed API request | Check JSON syntax and required fields |
| FILE_NOT_FOUND | File operation failed | Verify file path exists |
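The table above can be turned into a lookup so that scripts print the suggested fix next to a failed sandbox's status. This is a minimal sketch that assumes the response shape shown in the image pull example earlier in this guide (`status.state`, `status.reason`, `status.message`). The names `ERROR_HINTS` and `explain` are hypothetical.

```python
ERROR_HINTS = {
    "IMAGE_PULL_ERROR": "Verify image exists and credentials",
    "CONTAINER_STARTING": "Wait for transition to Running",
    "RESOURCE_LIMIT_EXCEEDED": "Reduce limits or increase host capacity",
    "NETWORK_ERROR": "Check network mode and routing",
    "EXPIRED": "Normal - automatic cleanup",
    "INVALID_REQUEST_BODY": "Check JSON syntax and required fields",
    "FILE_NOT_FOUND": "Verify file path exists",
}


def explain(body):
    """Summarise a sandbox status response and attach the matching hint."""
    status = body.get("status", {})
    state = status.get("state", "Unknown")
    reason = status.get("reason")
    message = status.get("message", "")
    # Codes missing from the table fall back to a generic pointer.
    hint = ERROR_HINTS.get(reason, "No documented hint; check the server logs")
    return f"{state} / {reason}: {message} -> {hint}"
```

Unknown reason codes are reported rather than raised, since the server may return codes that this table does not list.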

Getting Help

Report Issues

Submit bug reports and feature requests on GitHub

Before Reporting Issues

  1. Check existing issues - Search for similar problems
  2. Collect diagnostics - Use the diagnostic script above
  3. Minimal reproduction - Provide steps to reproduce
  4. Version information - Include server and runtime versions
  5. Configuration - Share relevant config (redact secrets)
  6. Logs - Include relevant log excerpts

Useful Debug Commands

# Check server health
curl http://localhost:8080/health

# List all sandboxes
curl -H "OPEN-SANDBOX-API-KEY: key" http://localhost:8080/v1/sandboxes

# Check Docker daemon
docker info

# View container logs
docker logs <container-id>

# Kubernetes events
kubectl get events --sort-by='.lastTimestamp'

# Pod logs
kubectl logs <pod-name> --all-containers
