Zerops automatically scales your Node.js applications based on real-time resource usage, ensuring optimal performance and cost efficiency.
How Auto-Scaling Works
Zerops continuously monitors your application and adjusts resources automatically:
- Vertical Scaling - Adjusts CPU, RAM, and disk for each container
- Horizontal Scaling - Adds or removes containers based on load
- Real-time Monitoring - Checks resources every few seconds
- Automatic Optimization - No manual intervention needed
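Both scaling dimensions are configured side by side on a service. A minimal combined sketch with illustrative values (each key is covered in detail in the sections below):

```yaml
services:
  - hostname: app
    type: nodejs@20
    minContainers: 1        # horizontal: always keep one container running
    maxContainers: 4        # horizontal: scale out to four under load
    verticalAutoscaling:    # vertical: per-container resource ranges
      cpuMode: SHARED
      minCpu: 1
      maxCpu: 4
      minRam: 0.5
      maxRam: 4
```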
Vertical Auto-Scaling
Vertical scaling adjusts resources for individual containers.
CPU Modes
Shared CPU (Default)

Containers share CPU with other services. Cost-effective for most applications.

```yaml
services:
  - hostname: app
    type: nodejs@20
    verticalAutoscaling:
      cpuMode: SHARED
      minCpu: 1
      maxCpu: 5
```

Dedicated CPU

Containers get dedicated CPU cores. Best for CPU-intensive workloads.

```yaml
services:
  - hostname: app
    type: nodejs@20
    verticalAutoscaling:
      cpuMode: DEDICATED
      minCpu: 2
      maxCpu: 10
```
Resource Limits
Configure min/max resources:
```yaml
services:
  - hostname: app
    type: nodejs@20
    verticalAutoscaling:
      cpuMode: SHARED
      minCpu: 1       # Minimum CPU cores
      maxCpu: 5       # Maximum CPU cores
      minRam: 0.5     # Minimum RAM (GB)
      maxRam: 8       # Maximum RAM (GB)
      minDisk: 1      # Minimum disk (GB)
      maxDisk: 10     # Maximum disk (GB)
```
Start with conservative limits and adjust based on monitoring data.
Advanced Vertical Scaling
Fine-tune scaling behavior:
```yaml
services:
  - hostname: app
    type: nodejs@20
    verticalAutoscaling:
      cpuMode: DEDICATED
      minCpu: 2
      maxCpu: 8
      minRam: 2
      maxRam: 16
      minDisk: 5
      maxDisk: 50
      startCpuCoreCount: 4    # Initial CPU cores
      minFreeRamGB: 0.5       # Keep at least 0.5 GB RAM free
      minFreeRamPercent: 20   # Keep at least 20% of RAM free
```
Horizontal Auto-Scaling
Horizontal scaling adds or removes containers based on load.
Basic Configuration
```yaml
services:
  - hostname: app
    type: nodejs@20
    minContainers: 2   # Minimum containers (always running)
    maxContainers: 10  # Maximum containers (scale up to 10)
```
Scaling Triggers
Zerops scales horizontally when:
- CPU usage exceeds threshold for 60 seconds
- RAM usage is consistently high
- Traffic increases significantly
- Containers reach resource limits
Scaling Example
1. Normal Load - Your app runs with 2 containers at 40% CPU.
2. Traffic Spike - Traffic increases and CPU reaches 80% on all containers.
3. Scale Up - Zerops adds 2 more containers (total: 4).
4. Load Balanced - Traffic distributes across 4 containers and CPU drops to 50%.
5. Traffic Decreases - After 5 minutes of low usage, Zerops removes the extra containers.
Scaling Configurations by Use Case
Development/Staging
Minimal resources for testing:
```yaml
services:
  - hostname: app
    type: nodejs@20
    minContainers: 1
    maxContainers: 2
    verticalAutoscaling:
      cpuMode: SHARED
      minCpu: 1
      maxCpu: 2
      minRam: 0.5
      maxRam: 2
```
Production (Small)
Small production app with light traffic:
```yaml
services:
  - hostname: app
    type: nodejs@20
    minContainers: 2
    maxContainers: 4
    verticalAutoscaling:
      cpuMode: SHARED
      minCpu: 1
      maxCpu: 4
      minRam: 1
      maxRam: 4
      minDisk: 2
      maxDisk: 10
```
Production (Medium)
Medium traffic with burst capacity:
```yaml
services:
  - hostname: app
    type: nodejs@20
    minContainers: 3
    maxContainers: 8
    verticalAutoscaling:
      cpuMode: DEDICATED
      minCpu: 2
      maxCpu: 6
      minRam: 2
      maxRam: 8
      minDisk: 5
      maxDisk: 20
```
Production (High Traffic)
High-traffic production app:
```yaml
services:
  - hostname: app
    type: nodejs@20
    minContainers: 5
    maxContainers: 10
    verticalAutoscaling:
      cpuMode: DEDICATED
      minCpu: 4
      maxCpu: 10
      minRam: 4
      maxRam: 16
      minDisk: 10
      maxDisk: 50
      startCpuCoreCount: 6
```
Monitoring Scaling
View Metrics in GUI
1. Access service metrics - Go to your service in the Zerops GUI and click Metrics or Dashboard.
2. View resource usage - Monitor:
   - CPU usage per container
   - RAM consumption
   - Disk usage
   - Network traffic
   - Container count over time
3. Adjust scaling - Based on the metrics, adjust your scaling configuration.
Key Metrics to Monitor
- CPU Usage - Should stay between 40% and 70% under normal load.
- RAM Usage - Keep 20-30% free for traffic spikes.
- Container Count - Track how often scaling occurs.
- Response Time - Monitor application performance.
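Response time is easiest to observe from inside the app itself. A minimal sketch of a timing wrapper for async request handlers; the name `timeHandler` and the 500 ms threshold are illustrative choices, not part of any Zerops or Express API:

```javascript
// Hypothetical helper: wraps an async request handler and warns when it runs slowly.
// Uses only Node's built-in high-resolution timer (process.hrtime.bigint).
function timeHandler(handler, slowThresholdMs = 500) {
  return async (req, res) => {
    const start = process.hrtime.bigint();
    try {
      return await handler(req, res);
    } finally {
      const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
      if (elapsedMs > slowThresholdMs) {
        console.warn(`Slow request: ${req.url} took ${elapsedMs.toFixed(1)} ms`);
      }
    }
  };
}

// Usage with Express (illustrative):
// app.get('/users', timeHandler(async (req, res) => { /* handler code */ }));
```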
Optimizing Your Node.js App for Scaling
Use Clustering
Leverage multiple CPU cores:
```javascript
const cluster = require('cluster');
const os = require('os');
const express = require('express');

if (cluster.isPrimary) {
  // Fork one worker per CPU core
  const numWorkers = os.cpus().length;
  for (let i = 0; i < numWorkers; i++) {
    cluster.fork();
  }
  // Replace workers that crash
  cluster.on('exit', (worker) => {
    console.log(`Worker ${worker.id} died, starting a new worker`);
    cluster.fork();
  });
} else {
  const app = express();
  // Your app code here
  app.listen(3000, '0.0.0.0');
}
```
Stateless Applications
Design apps without local state:
Bad - Local State

```javascript
// Don't store state in memory; it is lost when containers
// are added, removed, or restarted
let userSessions = {};

app.post('/login', (req, res) => {
  userSessions[req.body.userId] = {
    loginTime: Date.now()
  };
  res.sendStatus(204);
});
```

Good - External State

```javascript
// Use Redis for state shared across all containers
const redis = require('redis');

// node-redis v4: connect via URL and an explicit connect() call
const client = redis.createClient({ url: 'redis://redis:6379' });
client.connect();

app.post('/login', async (req, res) => {
  await client.set(
    `session:${req.body.userId}`,
    JSON.stringify({ loginTime: Date.now() })
  );
  res.sendStatus(204);
});
```
Optimize Memory Usage
Prevent memory leaks:
```javascript
// Use connection pooling
const { Pool } = require('pg');
const pool = new Pool({
  host: 'db',
  max: 20,                      // Maximum connections per container
  idleTimeoutMillis: 30000,
  connectionTimeoutMillis: 2000
});

// Clean up resources on shutdown
process.on('SIGTERM', async () => {
  await pool.end();
  process.exit(0);
});
```
Handle Graceful Shutdown
Ensure clean shutdowns during scaling:
```javascript
const server = app.listen(3000);

process.on('SIGTERM', () => {
  console.log('SIGTERM received, closing server...');
  server.close(() => {
    console.log('Server closed');
    // Close database connections
    pool.end();
    process.exit(0);
  });

  // Force close after 10 seconds
  setTimeout(() => {
    console.error('Forcing shutdown');
    process.exit(1);
  }, 10000);
});
```
Cost Optimization
Right-Size Resources
Begin with minimal resources:

```yaml
minContainers: 1
maxContainers: 4
minCpu: 1
maxCpu: 3
```
After 1-2 weeks, review metrics and adjust based on actual usage.
Use Shared CPU When Possible
Shared CPU is more cost-effective for most applications.
Scale Down Idle Services
For non-production environments:
```yaml
services:
  - hostname: staging
    type: nodejs@20
    minContainers: 0   # Scale to zero when idle
    maxContainers: 2
```

`minContainers: 0` may cause cold starts. Not recommended for production.
Troubleshooting Scaling
App keeps hitting CPU limits
Solutions:
- Increase `maxCpu`
- Optimize code (use profiling tools)
- Increase `maxContainers` for horizontal scaling
- Consider dedicated CPU mode
Memory usage growing constantly
Possible causes:
- Memory leak in the application
- Insufficient garbage collection

Solutions:
- Profile with `node --inspect`
- Review event listeners and timers
- Increase `maxRam` temporarily
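Alongside profiling, a cheap first check is logging heap usage over time; numbers that only ever grow under steady traffic point to a leak. A minimal sketch using Node's built-in `process.memoryUsage()` (the helper name `heapSnapshot` is illustrative):

```javascript
// Report current memory usage in megabytes, rounded to one decimal place
function heapSnapshot() {
  const { heapUsed, heapTotal, rss } = process.memoryUsage();
  const toMB = (bytes) => Math.round((bytes / 1024 / 1024) * 10) / 10;
  return { heapUsedMB: toMB(heapUsed), heapTotalMB: toMB(heapTotal), rssMB: toMB(rss) };
}

// Log a snapshot every minute; a heapUsedMB that keeps climbing suggests a leak.
// setInterval(() => console.log(heapSnapshot()), 60000);
```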
Containers scale up and down too frequently

Solutions:
- Reduce `maxContainers`
- Increase resources per container
- Optimize application performance
Next Steps
- Deployment Process - Learn about zero-downtime deployments.
- Build Pipeline - Configure your build process.