
Production Checklist

Before deploying to production:
  • Use environment variables for secrets (never hardcode)
  • Enable TLS/HTTPS via reverse proxy
  • Harden Docker container security settings
  • Configure health checks and restart policies
  • Set up monitoring and observability
  • Use production configuration (config.prod.yaml)
  • Configure CORS allowed origins
  • Set appropriate log levels
  • Enable Prometheus metrics
  • Configure external Redis for state persistence
  • Use non-root user (default in distroless image)
  • Implement rate limiting and request timeouts
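After deploying, a quick smoke test can confirm the main listeners are reachable. This is a minimal sketch using the default ports from this guide (REST API 3111, streams 3112, WebSocket 49134, metrics 9464); adjust if you have changed them:

```shell
# Probe each default port; prints open/CLOSED per port.
for port in 3111 3112 49134 9464; do
  if nc -z 127.0.0.1 "$port"; then
    echo "port $port: open"
  else
    echo "port $port: CLOSED"
  fi
done
```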

Security Hardening

Container Security

The official Docker image already includes several security features:
  • Distroless base: No shell, no package manager, minimal attack surface
  • Non-root user: Runs as nonroot user by default
  • Stripped binary: Debug symbols removed for smaller size
Additional hardening for production:
docker run --read-only --tmpfs /tmp \
  --cap-drop=ALL --cap-add=NET_BIND_SERVICE \
  --security-opt=no-new-privileges:true \
  -v ./config.yaml:/app/config.yaml:ro \
  -p 3111:3111 -p 49134:49134 -p 3112:3112 -p 9464:9464 \
  iiidev/iii:latest
Flags explained:
  • --read-only: Filesystem is read-only except for explicit mounts
  • --tmpfs /tmp: Writable temporary directory in memory
  • --cap-drop=ALL: Drop all Linux capabilities
  • --cap-add=NET_BIND_SERVICE: Re-add only the capability needed to bind privileged ports (below 1024)
  • --security-opt=no-new-privileges:true: Prevent privilege escalation
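The same hardening can be expressed declaratively in Compose. This sketch mirrors the docker run flags above:

```yaml
# docker-compose equivalent of the hardening flags above
services:
  iii:
    image: iiidev/iii:latest
    read_only: true          # --read-only
    tmpfs:
      - /tmp                 # --tmpfs /tmp
    cap_drop:
      - ALL                  # --cap-drop=ALL
    cap_add:
      - NET_BIND_SERVICE     # --cap-add=NET_BIND_SERVICE
    security_opt:
      - no-new-privileges:true
    volumes:
      - ./config.yaml:/app/config.yaml:ro
    ports:
      - "3111:3111"
      - "3112:3112"
      - "49134:49134"
      - "9464:9464"
```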

TLS/HTTPS with Caddy

Use Caddy for automatic HTTPS with Let’s Encrypt:
# docker-compose.prod.yml
services:
  caddy:
    image: caddy:2-alpine
    ports:
      - "80:80"
      - "443:443"
      - "443:443/udp"  # HTTP/3
    volumes:
      - ./Caddyfile:/etc/caddy/Caddyfile:ro
      - caddy_data:/data
      - caddy_config:/config
    depends_on:
      iii:
        condition: service_healthy
    restart: unless-stopped

  iii:
    image: iiidev/iii:latest
    volumes:
      - ./config.prod.yaml:/app/config.yaml:ro
    healthcheck:
      test: ["CMD-SHELL", "nc -z 127.0.0.1 3111 || exit 1"]
      interval: 10s
      timeout: 5s
      retries: 5
      start_period: 10s
    restart: unless-stopped

volumes:
  caddy_data:
  caddy_config:
Caddyfile configuration:
your-domain.com {
	# Automatically provisions TLS certificate from Let's Encrypt
	
	handle /api/* {
		reverse_proxy iii:3111
	}

	handle /streams/* {
		reverse_proxy iii:3112
	}

	handle /ws {
		reverse_proxy iii:49134
	}

	handle {
		reverse_proxy iii:3111
	}
}
Caddy automatically:
  • Obtains TLS certificates from Let’s Encrypt
  • Handles HTTP to HTTPS redirects
  • Renews certificates before expiration
  • Supports HTTP/3 and modern TLS versions

CORS Configuration

In production, restrict CORS to specific origins:
# config.prod.yaml
modules:
  - class: modules::api::RestApiModule
    config:
      host: 0.0.0.0
      port: 3111
      cors:
        allowed_origins:
          - https://your-app.com
          - https://www.your-app.com
        allowed_methods:
          - GET
          - POST
          - PUT
          - DELETE
          - OPTIONS
Never use allowed_origins: ['*'] in production. Always specify exact domains.
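You can verify the policy with a CORS preflight request. A sketch, assuming your domain is already behind the proxy (the /api/example path is illustrative):

```shell
# An allowed origin should return Access-Control-Allow-* headers;
# repeat with a disallowed Origin to confirm they are absent.
curl -si -X OPTIONS https://your-domain.com/api/example \
  -H "Origin: https://your-app.com" \
  -H "Access-Control-Request-Method: POST" \
  | grep -i access-control
```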

Secrets Management

Use environment variables for sensitive configuration:
# config.prod.yaml
modules:
  - class: modules::stream::StreamModule
    config:
      adapter:
        class: modules::stream::adapters::RedisAdapter
        config:
          redis_url: ${REDIS_URL}
  
  - class: modules::queue::QueueModule
    config:
      adapter:
        class: modules::queue::RedisAdapter
        config:
          redis_url: ${REDIS_URL}
Provide secrets via your deployment platform. Docker Compose:
services:
  iii:
    environment:
      - REDIS_URL=redis://:${REDIS_PASSWORD}@redis:6379
    env_file:
      - .env.prod
Kubernetes:
env:
  - name: REDIS_URL
    valueFrom:
      secretKeyRef:
        name: iii-secrets
        key: redis-url
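The iii-secrets Secret referenced by the secretKeyRef above can be created with kubectl, for example:

```shell
# Create the Secret with the key the manifest expects.
# Substitute the real connection string; prefer --from-file or a secrets
# manager to avoid leaving the value in shell history.
kubectl create secret generic iii-secrets \
  --from-literal=redis-url='redis://:your-redis-password@redis:6379'
```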
Docker Secrets (note: Compose mounts the secret as a file at /run/secrets/redis_password inside the container; the ${REDIS_PASSWORD} interpolation below is resolved by Compose from the host environment, not read from the secret file, so the application or an entrypoint must consume the mounted file):
services:
  iii:
    secrets:
      - redis_password
    environment:
      - REDIS_URL=redis://:${REDIS_PASSWORD}@redis:6379

secrets:
  redis_password:
    external: true

Environment Variables

Core Configuration

| Variable    | Description             | Default                | Example                      |
|-------------|-------------------------|------------------------|------------------------------|
| RUST_LOG    | Log level and filtering | info                   | info, debug, warn,iii=debug  |
| STREAM_PORT | Stream API port         | 3112                   | 3112                         |
| REDIS_URL   | Redis connection string | redis://localhost:6379 | redis://:password@redis:6379 |

Observability

| Variable                    | Description             | Default               | Example               |
|-----------------------------|-------------------------|-----------------------|-----------------------|
| OTEL_ENABLED                | Enable OpenTelemetry    | true                  | true, false           |
| OTEL_SERVICE_NAME           | Service name for traces | iii                   | iii-production        |
| SERVICE_VERSION             | Service version         | 0.2.0                 | 1.0.0                 |
| SERVICE_NAMESPACE           | Environment namespace   | production            | staging, prod         |
| OTEL_EXPORTER_TYPE          | Trace exporter type     | memory                | otlp, memory, both    |
| OTEL_EXPORTER_OTLP_ENDPOINT | OTLP collector endpoint | http://localhost:4317 | http://collector:4317 |
| OTEL_LOGS_ENABLED           | Enable OTEL logs        | true                  | true, false           |
| OTEL_LOGS_EXPORTER          | Logs exporter           | memory                | otlp, memory, both    |

Example Production Environment

# .env.prod
RUST_LOG=info
REDIS_URL=redis://:your-redis-password@redis.production.svc:6379
OTEL_ENABLED=true
OTEL_SERVICE_NAME=iii-production
SERVICE_VERSION=1.2.0
SERVICE_NAMESPACE=production
OTEL_EXPORTER_TYPE=otlp
OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector:4317
OTEL_LOGS_ENABLED=true
OTEL_LOGS_EXPORTER=otlp

Health Checks

Docker Health Check

The production compose includes health checks. Note that CMD-SHELL requires a shell inside the container; the distroless image described above ships without one, so this style of check only works with an image variant that includes a shell (or with an external probe instead):
iii:
  healthcheck:
    test: ["CMD-SHELL", "nc -z 127.0.0.1 3111 || exit 1"]
    interval: 10s
    timeout: 5s
    retries: 5
    start_period: 10s

Kubernetes Probes

livenessProbe:
  httpGet:
    path: /health
    port: 3111
  initialDelaySeconds: 10
  periodSeconds: 10
  timeoutSeconds: 5
  failureThreshold: 3

readinessProbe:
  httpGet:
    path: /ready
    port: 3111
  initialDelaySeconds: 5
  periodSeconds: 5
  timeoutSeconds: 3
  failureThreshold: 2
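Assuming the /health and /ready endpoints used by the probes above, you can exercise them manually from a host with access to the service:

```shell
# -f makes curl exit non-zero on HTTP errors, matching probe semantics.
curl -fsS http://localhost:3111/health && echo "live"
curl -fsS http://localhost:3111/ready  && echo "ready"
```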

Monitoring

Prometheus Metrics

The engine exposes Prometheus metrics on port 9464:
scrape_configs:
  - job_name: 'iii'
    static_configs:
      - targets: ['iii:9464']
Available metrics:
  • iii_workers_active: Number of connected workers
  • iii_invocations_total: Total function invocations
  • iii_invocations_error: Failed invocations
  • iii_invocations_duration_seconds: Invocation duration histogram
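The listed counters can drive basic alerting. An illustrative Prometheus alerting rule on the invocation error rate (the 5% threshold and 10m hold are example choices, not recommendations from this project):

```yaml
# prometheus rules file: fire when >5% of invocations fail over 5 minutes
groups:
  - name: iii
    rules:
      - alert: IiiHighInvocationErrorRate
        expr: |
          rate(iii_invocations_error[5m])
            / rate(iii_invocations_total[5m]) > 0.05
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "iii invocation error rate above 5%"
```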

OpenTelemetry Integration

Configure OTLP export for distributed tracing:
modules:
  - class: modules::observability::OtelModule
    config:
      enabled: true
      service_name: ${OTEL_SERVICE_NAME:iii}
      service_version: ${SERVICE_VERSION:1.0.0}
      service_namespace: ${SERVICE_NAMESPACE:production}
      exporter: otlp
      endpoint: ${OTEL_EXPORTER_OTLP_ENDPOINT:http://localhost:4317}
      sampling_ratio: 1.0
      metrics_enabled: true
      metrics_exporter: otlp
      logs_enabled: true
      logs_exporter: otlp

Scaling Strategies

Horizontal Scaling

Multiple Engine Instances: For HTTP traffic, run multiple engine instances behind a load balancer:
services:
  iii:
    image: iiidev/iii:latest
    deploy:
      replicas: 3
    volumes:
      - ./config.prod.yaml:/app/config.yaml:ro
WebSocket worker connections require sticky sessions. Ensure your load balancer supports session affinity for the WebSocket port (49134).
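If NGINX is your load balancer, session affinity for the WebSocket port can be sketched with ip_hash (one simple affinity strategy; the iii-1..iii-3 upstream hostnames are illustrative):

```nginx
# Hash on client IP so each worker keeps reconnecting to the same engine.
upstream iii_ws {
    ip_hash;
    server iii-1:49134;
    server iii-2:49134;
    server iii-3:49134;
}

server {
    listen 49134;
    location / {
        proxy_pass http://iii_ws;
        # Headers required to upgrade the connection to WebSocket
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
}
```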
Worker Scaling: Workers scale independently:
# Run 10 worker processes
for i in {1..10}; do
  node worker.js &
done
Or with Docker:
services:
  worker:
    build: ./worker
    deploy:
      replicas: 10
    environment:
      - III_ENGINE_URL=ws://iii:49134

Resource Limits

Set memory and CPU limits:
services:
  iii:
    image: iiidev/iii:latest
    deploy:
      resources:
        limits:
          cpus: '2'
          memory: 2G
        reservations:
          cpus: '0.5'
          memory: 512M

Logging

Structured Logging

Set RUST_LOG for detailed logging:
# All modules at info level
RUST_LOG=info

# Debug level for iii, warn for everything else
RUST_LOG=warn,iii=debug

# Module-specific filtering
RUST_LOG=info,iii::modules::api=debug,iii::engine=trace

Log Aggregation

Use Docker’s logging driver:
services:
  iii:
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "3"
Or send to external aggregator:
services:
  iii:
    logging:
      driver: "syslog"
      options:
        syslog-address: "tcp://logstash:5000"

Backup and Recovery

State Persistence

For file-based KV stores, back up the data directory:
# Backup
tar -czf backup-$(date +%Y%m%d).tar.gz ./data

# Restore
tar -xzf backup-20260304.tar.gz

Redis Persistence

For Redis-backed modules, configure RDB or AOF:
redis:
  image: redis:7-alpine
  command: redis-server --appendonly yes
  volumes:
    - redis_data:/data

Production Configuration Example

# config.prod.yaml
modules:
  - class: modules::api::RestApiModule
    config:
      host: 0.0.0.0
      port: 3111
      default_timeout: 30000
      concurrency_request_limit: 1024
      cors:
        allowed_origins:
          - https://your-app.com
        allowed_methods:
          - GET
          - POST
          - PUT
          - DELETE
          - OPTIONS

  - class: modules::stream::StreamModule
    config:
      port: ${STREAM_PORT:3112}
      host: 0.0.0.0
      adapter:
        class: modules::stream::adapters::RedisAdapter
        config:
          redis_url: ${REDIS_URL}

  - class: modules::queue::QueueModule
    config:
      adapter:
        class: modules::queue::RedisAdapter
        config:
          redis_url: ${REDIS_URL}

  - class: modules::cron::CronModule

  - class: modules::observability::OtelModule
    config:
      enabled: true
      service_name: ${OTEL_SERVICE_NAME:iii}
      service_version: ${SERVICE_VERSION:1.0.0}
      service_namespace: production
      exporter: otlp
      endpoint: ${OTEL_EXPORTER_OTLP_ENDPOINT}
      sampling_ratio: 0.1  # Sample 10% in production
      metrics_enabled: true
      metrics_exporter: otlp
      logs_enabled: true
      logs_exporter: otlp
      logs_console_output: false  # Disable console logs in prod

Next Steps

Configuration

Deep dive into module configuration

Monitoring

Set up monitoring and observability
