The production configuration targets stability, performance, and scalability. It uses pre-built images from GitHub Container Registry, sets CPU and memory limits for every service, and runs multiple replicas of each worker.

Configuration Overview

The compose.prod.yaml configuration includes:
  • Pre-built images - Uses published images from ghcr.io/hitesh22rana/chronoverse
  • Resource limits - CPU and memory constraints for stability
  • Horizontal scaling - Multiple replicas for worker services
  • Nginx reverse proxy - Single entry point on port 80
  • Production security - No exposed internal ports
  • Optimized settings - Production-grade configurations

Resource Allocation

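The production compose file defines resource profiles for three groups of services. Worker services are further split into a low-resource and a high-resource tier: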

Database Services

  • Limits: 1 CPU, 1 GB RAM
  • Reservations: 0.5 CPU, 512 MB RAM
  • Services: PostgreSQL, ClickHouse, Redis, Kafka

Application Services

  • Limits: 0.25 CPU, 256 MB RAM
  • Reservations: 0.1 CPU, 128 MB RAM
  • Services: users-service, workflows-service, jobs-service, notifications-service, analytics-service, server

Worker Services

Low-Resource Workers (2 replicas each):
  • Limits: 0.5 CPU, 2 GB RAM
  • Reservations: 0.25 CPU, 1 GB RAM
  • Workers: scheduling-worker, workflow-worker, joblogs-processor, analytics-processor
High-Resource Workers (2 replicas each):
  • Limits: 2 CPU, 4 GB RAM
  • Reservations: 1 CPU, 2 GB RAM
  • Workers: execution-worker
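
As a rough sanity check, the reservations listed above can be totalled and compared with the server requirements given later on this page. The sketch below is only arithmetic over the figures in this section. It assumes one container per database and application service, and it ignores services not listed here (for example nginx), so treat the result as a lower bound.

```shell
# Sum reservations as: services * replicas * per-container value.
total_reservations() {
  awk '
    BEGIN {
      # arguments: services, replicas, reserved CPUs, reserved memory (MB)
      add(4, 1, 0.5, 512)    # database services
      add(6, 1, 0.1, 128)    # application services
      add(4, 2, 0.25, 1024)  # low-resource workers
      add(1, 2, 1, 2048)     # high-resource workers
      printf "cpu=%.2f mem_mb=%d\n", cpu, mem
    }
    function add(n, r, c, m) { cpu += n * r * c; mem += n * r * m }
  '
}

total_reservations
```

With the figures above this prints cpu=6.60 mem_mb=15104, that is, about 6.6 CPUs and 14.75 GB reserved, which sits under the 8-core, 16 GB minimum in the server requirements.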

Deploying to Production

1. Prepare the Server

Ensure your production server meets requirements:
  • Docker Engine 20.10+
  • Docker Compose V2
  • 8+ CPU cores
  • 16+ GB RAM
  • 100+ GB disk space
# Update system
sudo apt update && sudo apt upgrade -y

# Install Docker
curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh

# Verify installation
docker --version
docker compose version
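
The hardware minimums listed in this step can also be checked from a script before installing anything. This is a minimal sketch for a Linux host. The function names are hypothetical, and it treats "disk space" as free space on the filesystem holding the current directory, which is an assumption.

```shell
# Return success when ACTUAL >= MINIMUM; print a one-line verdict either way.
require_at_least() {
  name=$1 actual=$2 minimum=$3
  if [ "$actual" -ge "$minimum" ]; then
    echo "ok:   $name = $actual (need $minimum+)"
  else
    echo "FAIL: $name = $actual (need $minimum+)"
    return 1
  fi
}

# Compare this host against the minimums listed above (Linux only).
preflight() {
  status=0
  require_at_least "CPU cores" "$(nproc)" 8 || status=1
  # MemTotal is reported in kB; convert to whole GB, rounding up.
  mem_gb=$(awk '/^MemTotal:/ { printf "%d", ($2 + 1048575) / 1048576 }' /proc/meminfo)
  require_at_least "RAM (GB)" "$mem_gb" 16 || status=1
  # Available space, in GB, on the filesystem holding the current directory.
  disk_gb=$(df -Pk . | awk 'NR == 2 { printf "%d", $4 / 1048576 }')
  require_at_least "free disk (GB)" "$disk_gb" 100 || status=1
  return "$status"
}

# Usage:
#   preflight || echo "host is below the documented minimums"
```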

2. Clone the Repository

git clone https://github.com/your-org/chronoverse.git
cd chronoverse

3. Configure Environment

Change default passwords before deploying to production!
Edit compose.prod.yaml and update:
  • Database passwords (PostgreSQL, ClickHouse)
  • Meilisearch master key
  • Server allowed origins
  • Any other environment-specific settings

4. Start the Stack

# Pull latest images
docker compose -f compose.prod.yaml pull

# Start all services
docker compose -f compose.prod.yaml up -d

# Monitor startup progress
docker compose -f compose.prod.yaml logs -f

5. Verify Deployment

# Check service health
docker compose -f compose.prod.yaml ps

# Count containers reporting healthy (--quiet omits the header row from the count)
docker ps --quiet --filter "health=healthy" | wc -l
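
Health checks take a while to settle after startup, so a single count can be misleading. The sketch below polls until no container is still in the "starting" state and then fails if any container is unhealthy. The function name, the default timeout, and the polling interval are assumptions. Note that docker ps sees every container on the host, not only this stack.

```shell
# Poll until no container reports "starting", then fail if any is "unhealthy".
wait_for_healthy() {
  timeout_s=${1:-300}   # hypothetical default: give up after 5 minutes
  interval_s=${2:-5}    # hypothetical default: poll every 5 seconds
  elapsed=0
  while [ -n "$(docker ps --quiet --filter health=starting)" ]; do
    if [ "$elapsed" -ge "$timeout_s" ]; then
      echo "timed out waiting for health checks" >&2
      return 1
    fi
    sleep "$interval_s"
    elapsed=$((elapsed + interval_s))
  done
  if [ -n "$(docker ps --quiet --filter health=unhealthy)" ]; then
    echo "unhealthy containers:" >&2
    docker ps --filter health=unhealthy --format '{{.Names}}' >&2
    return 1
  fi
  echo "all health checks passing"
}

# Usage:
#   wait_for_healthy 600 10
```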

6. Access the Application

The application is now available on port 80 of your server.
Nginx serves plain HTTP only, so configure DNS and terminate TLS at a load balancer or reverse proxy in front of it.

Nginx Reverse Proxy

The production deployment includes an nginx reverse proxy that routes traffic:

Routing Rules

  • / → Dashboard (port 3000)
  • /api/ → Server API (port 8080)
  • /api/workflows/{id}/jobs/{id}/events → Server-Sent Events (SSE) with special configuration

SSE Configuration

The nginx proxy includes optimized settings for Server-Sent Events:
# SSE specific settings
proxy_buffering off;
proxy_cache off;
proxy_set_header Connection "";
proxy_http_version 1.1;
chunked_transfer_encoding off;

# Prevent timeouts for long-running connections
proxy_read_timeout 24h;
proxy_send_timeout 24h;
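
One way to confirm that these settings take effect is to stream a job's events through nginx with curl and watch whether events arrive one at a time rather than in bursts. The sketch below builds the URL from the routing rule above. The function names are hypothetical, the IDs are placeholders, and any authentication the API requires is not shown.

```shell
# Build the events URL for one job; base URL and both IDs are caller-supplied.
job_events_url() {
  base_url=$1 workflow_id=$2 job_id=$3
  printf '%s/api/workflows/%s/jobs/%s/events\n' "${base_url%/}" "$workflow_id" "$job_id"
}

# Stream events until interrupted. --no-buffer makes curl print each chunk as
# it arrives, so buffering anywhere along the path shows up as visible delay.
stream_job_events() {
  curl --silent --show-error --no-buffer \
    --header 'Accept: text/event-stream' \
    "$(job_events_url "$@")"
}

# Usage (placeholder IDs):
#   stream_job_events http://localhost <workflow-id> <job-id>
```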

Scaling Workers

The production configuration scales workers for high availability:

Current Scaling

  • scheduling-worker: 2 replicas
  • workflow-worker: 2 replicas
  • execution-worker: 2 replicas
  • joblogs-processor: 2 replicas
  • analytics-processor: 2 replicas

Adjusting Replicas

To scale workers up or down, edit compose.prod.yaml:
workflow-worker:
  deploy:
    replicas: 4  # Increase from 2 to 4
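Then apply the change. The --scale flag overrides the scale setting in the Compose file for that run, so keep the two numbers in sync: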
docker compose -f compose.prod.yaml up -d --scale workflow-worker=4
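
After scaling, it can be worth confirming that the expected number of containers is actually running. This is a minimal sketch with hypothetical function and variable names; it counts the running containers that Compose reports for one service.

```shell
COMPOSE_FILE_PATH=${COMPOSE_FILE_PATH:-compose.prod.yaml}  # hypothetical variable name

# Print how many containers are currently running for one service.
running_replicas() {
  docker compose -f "$COMPOSE_FILE_PATH" ps --quiet --status running "$1" | wc -l | tr -d ' '
}

# Succeed only when the running count matches the expected count.
expect_replicas() {
  service=$1 expected=$2
  actual=$(running_replicas "$service")
  if [ "$actual" -eq "$expected" ]; then
    echo "$service: $actual/$expected replicas running"
  else
    echo "$service: $actual/$expected replicas running" >&2
    return 1
  fi
}

# Usage:
#   expect_replicas workflow-worker 4
```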

Security Considerations

Change Default Credentials

CRITICAL: Change all default passwords before production deployment!
Update these values in compose.prod.yaml:
environment:
  POSTGRES_PASSWORD: your-secure-password-here
  CLICKHOUSE_PASSWORD: your-secure-password-here
  MEILI_MASTER_KEY: your-secure-master-key-here
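
Replacement values should be long and random. The sketch below is one way to generate them from the kernel's random source. The function name is hypothetical. Hex output is a deliberate choice here, because it avoids characters such as `$` that Compose would treat as variable interpolation inside the YAML file.

```shell
# Print BYTES random bytes (default 32) as lowercase hex, read from /dev/urandom.
gen_secret() {
  head -c "${1:-32}" /dev/urandom | od -An -v -tx1 | tr -d ' \n'
  echo
}

# Usage: generate one value per credential and paste it into compose.prod.yaml.
#   gen_secret      # 64 hex characters
#   gen_secret 16   # 32 hex characters
```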

Network Isolation

  • All services run on the isolated chronoverse network
  • Only nginx is exposed (port 80)
  • Internal services use mTLS

TLS Certificates

The init-certs service generates:
  • Self-signed CA certificate
  • Service certificates for mTLS
  • Client certificates for database access
  • ED25519 keypair for JWT authentication
For production, consider using CA-signed certificates or integrating with your certificate management system.

Docker Socket Access

The docker-proxy service provides controlled access to the Docker socket:
  • Read-only permissions for most operations
  • Limited to specific API endpoints
  • Used by execution-worker and workflow-worker

Monitoring Production

Service Health

# Check all service health
docker compose -f compose.prod.yaml ps

# View unhealthy services
docker ps --filter "health=unhealthy"

# Check specific service
docker inspect --format='{{.State.Health.Status}}' server

Resource Usage

# View resource consumption
docker stats

# View specific service stats
docker stats server users-service workflows-service

Logs

docker compose -f compose.prod.yaml logs -f

LGTM Observability

Access Grafana for detailed metrics and traces:
# Show which host address, if any, Grafana's port 3000 is published on
docker port lgtm 3000

# Access via SSH tunnel if needed
ssh -L 3000:localhost:3000 user@your-server
Then open http://localhost:3000 in your browser.

Backup and Restore

Backing Up Data

# Backup PostgreSQL
docker exec postgres pg_dump \
  "host=localhost user=primary dbname=chronoverse \
   sslmode=verify-full \
   sslrootcert=/certs/ca/ca.crt \
   sslcert=/certs/clients/client.crt \
   sslkey=/certs/clients/client.key" \
  > chronoverse-$(date +%Y%m%d).sql
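
The restore procedure in the next section expects a volumes-<date>.tar.gz archive, which the SQL dump above does not produce. The sketch below shows one way such an archive could be created. It assumes the volume names and the /data/... layout used by the restore command, and that the stack is stopped first so the files are consistent. The function name is hypothetical.

```shell
# Archive the three data volumes into BACKUP_DIR/volumes-YYYYMMDD.tar.gz.
# Run after "docker compose -f compose.prod.yaml down" for a consistent copy.
backup_volumes() {
  backup_dir=${1:-"$(pwd)/backup"}
  stamp=$(date +%Y%m%d)
  mkdir -p "$backup_dir"
  # Volumes are mounted read-only; paths mirror the restore command so that
  # extracting with "-C /" puts every file back where it came from.
  docker run --rm \
    -v chronoverse_postgres:/data/postgres:ro \
    -v chronoverse_clickhouse:/data/clickhouse:ro \
    -v chronoverse_redis:/data/redis:ro \
    -v "$backup_dir":/backup \
    alpine tar czf "/backup/volumes-$stamp.tar.gz" -C / data
}

# Usage:
#   backup_volumes            # writes ./backup/volumes-<date>.tar.gz
#   backup_volumes /mnt/nas   # hypothetical alternative target directory
```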

Restoring Data

# Stop services
docker compose -f compose.prod.yaml down

# Restore volumes
docker run --rm \
  -v chronoverse_postgres:/data/postgres \
  -v chronoverse_clickhouse:/data/clickhouse \
  -v chronoverse_redis:/data/redis \
  -v "$(pwd)/backup":/backup \
  alpine tar xzf /backup/volumes-20260303.tar.gz -C /

# Start services
docker compose -f compose.prod.yaml up -d

Updating Services

1. Pull Latest Images

docker compose -f compose.prod.yaml pull

2. Stop Services Gracefully

docker compose -f compose.prod.yaml down

3. Start with New Images

docker compose -f compose.prod.yaml up -d

4. Verify Update

# Check service health
docker compose -f compose.prod.yaml ps

# View updated image tags
docker compose -f compose.prod.yaml images
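
The four steps above can be chained so that a failure at any stage stops the sequence. This is a sketch only, with a hypothetical function name. It pulls before stopping anything, so a registry or network failure leaves the running stack untouched.

```shell
# Run the update steps in order, stopping at the first failure.
update_stack() {
  compose_file=${1:-compose.prod.yaml}
  docker compose -f "$compose_file" pull || return 1
  docker compose -f "$compose_file" down || return 1
  docker compose -f "$compose_file" up -d || return 1
  docker compose -f "$compose_file" ps
}

# Usage:
#   update_stack || echo "update stopped early; check the output above"
```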

Troubleshooting Production

Service Won’t Start

# Check service logs
docker compose -f compose.prod.yaml logs <service-name>

# Check resource constraints
docker stats <service-name>

# Inspect service configuration
docker inspect <service-name>

High Resource Usage

# Identify resource-hungry services
docker stats --no-stream --format "table {{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}"

# Adjust resource limits in compose.prod.yaml
# Then recreate services
docker compose -f compose.prod.yaml up -d --force-recreate
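
To narrow down which limits need raising, it may help to list only the containers that are close to their memory limit. The sketch below filters docker stats output with awk. The function name and the 80 percent default are assumptions.

```shell
# Print the names of containers whose memory use is at or above PCT percent
# of their limit (default 80).
containers_over_mem() {
  threshold=${1:-80}
  docker stats --no-stream --format '{{.Name}} {{.MemPerc}}' |
    awk -v t="$threshold" '{ pct = $2; sub(/%$/, "", pct); if (pct + 0 >= t + 0) print $1 }'
}

# Usage:
#   containers_over_mem 90
```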

Database Connection Issues

# Verify database health
docker compose -f compose.prod.yaml ps postgres clickhouse redis

# Test database connectivity
docker exec server ping postgres

# Confirm the certificate files are mounted (this lists them; it does not validate them)
docker exec server ls -la /certs/
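
Listing the directory shows that the files exist but not whether a certificate has expired. The sketch below checks expiry with the openssl CLI on the host. The function name is hypothetical. The usage line assumes that the CA certificate is at /certs/ca/ca.crt inside the server container and that the image provides cat; neither is confirmed here.

```shell
# Succeed if the PEM certificate on stdin is still valid SECONDS from now
# (default 2592000, i.e. 30 days). Fails for expired or unreadable input.
cert_valid_for() {
  openssl x509 -noout -checkend "${1:-2592000}" >/dev/null
}

# Usage (container name and path are assumptions):
#   docker exec server cat /certs/ca/ca.crt | cert_valid_for \
#     || echo "CA certificate expires within 30 days or could not be read"
```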

Worker Not Processing Jobs

# Check worker logs
docker compose -f compose.prod.yaml logs -f workflow-worker

# Verify Kafka connectivity
docker compose -f compose.prod.yaml logs kafka

# Restart workers
docker compose -f compose.prod.yaml restart workflow-worker execution-worker

High Availability

For production high availability:
  1. Deploy across multiple hosts - Use Docker Swarm or Kubernetes
  2. External load balancer - Route traffic across multiple nginx instances
  3. Managed databases - Consider using managed PostgreSQL, Redis, and ClickHouse
  4. Persistent volumes - Use network storage or managed volume solutions
  5. Certificate management - Integrate with Vault or cert-manager

Next Steps

Configuration Reference

Complete environment variable reference

Monitoring Setup

Set up production monitoring

Backup Strategy

Implement backup and disaster recovery

Security Hardening

Production security best practices
