Running xyOps in production with lots of servers and high job volumes? This guide provides best practices for scaling your deployment to handle enterprise workloads.
Start with Self-Hosting first if you’re new to xyOps deployment. This guide complements those foundational concepts.
Hardware Sizing
Proper hardware provisioning is critical for production xyOps deployments at scale.
CPU Cores
xyOps is multi-process and highly concurrent. More cores improve performance across:
- Job scheduler
- Web server request handling
- Storage I/O operations
- Real-time log compression
Memory (RAM)
Adequate RAM ensures smooth operation and reduces disk I/O:
- Node.js heap space
- In-process caches (storage, lists)
- Storage engine caches (SQLite, Filesystem)
- OS page cache for log files
Storage
- Type: Prefer SSD or NVMe for local Filesystem/SQLite backends
- IOPS: Ensure adequate IOPS for parallel job logs, snapshots, and uploads
- Capacity: Plan for log archives, job history, and monitor time-series data
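Capacity planning for job logs is mostly arithmetic. The sketch below is a rough Python estimate; every input (jobs per day, average log size, retention window, compression ratio) is an assumed figure for illustration, not an xyOps default, and the function name is hypothetical.

```python
def estimate_log_storage_gb(jobs_per_day, avg_log_kb, retention_days, compression_ratio=1.0):
    """Rough disk estimate for retained job logs, in GB (1 GB = 1024**3 bytes).

    compression_ratio is raw size divided by stored size; 1.0 means no compression.
    All inputs are illustrative assumptions, not xyOps defaults.
    """
    if compression_ratio <= 0:
        raise ValueError("compression_ratio must be positive")
    raw_bytes = jobs_per_day * avg_log_kb * 1024 * retention_days
    return raw_bytes / compression_ratio / 1024**3


if __name__ == "__main__":
    # Hypothetical workload: 50,000 jobs/day, 200 KB logs, 30 days kept, 5x compression.
    print(round(estimate_log_storage_gb(50_000, 200, 30, 5.0), 1))
```

Add headroom on top of the result for job history, snapshots, uploads, and monitor time-series data, which this estimate does not cover.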
Network
- Ensure good NIC throughput and low latency between conductors and workers
- For external storage (S3, Redis, MinIO), place conductors in the same region/AZ
- Use load balancers with proper health checks for multi-conductor setups
OS Limits
Memory Configuration
Node.js Heap Size
xyOps honors the NODE_MAX_MEMORY environment variable to set Node’s old-space heap size.
Configure Node.js Memory
Calculate appropriate value
On a 16 GB instance, allocate 8-12 GB to Node.js heap, leaving room for:
- OS and system processes
- Filesystem cache
- External daemons (nginx, database)
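The 16 GB example works out to roughly 50-75% of total RAM. The sketch below applies that share to other instance sizes. Two things here are assumptions rather than statements from this guide: that the same share is sensible on other sizes, and that NODE_MAX_MEMORY takes a value in megabytes (as Node's `--max-old-space-size` flag does). The helper name is hypothetical.

```python
def suggest_node_heap_mb(total_ram_gb, fraction=0.6):
    """Suggest an old-space heap size in MB as a share of total RAM.

    The 0.5-0.75 bounds mirror the 8-12 GB guidance for a 16 GB instance.
    """
    if not 0.5 <= fraction <= 0.75:
        raise ValueError("fraction should stay within 0.5-0.75 to leave headroom")
    return int(total_ram_gb * 1024 * fraction)


if __name__ == "__main__":
    for frac in (0.5, 0.75):
        # Assumption: the environment variable takes a value in megabytes.
        print(f"NODE_MAX_MEMORY={suggest_node_heap_mb(16, frac)}")
```

Pick the lower end of the range when nginx or a database runs on the same host, since those need memory outside the Node.js heap.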
Storage Engine Caching
xyOps uses pixl-server-storage with in-memory caches for JSON records. The tunable settings are:
- Maximum cache size in bytes (default ~100 MB)
- Maximum cached items
- Filesystem cache size in bytes
Multi-Conductor Architecture
Multi-conductor deployments require external shared storage so all conductors see the same state.
See Multi-Conductor with Nginx for detailed setup instructions.
Storage Backend Options
S3 / MinIO
AWS S3 works but has higher latency. MinIO (self-hosted S3) performs better on-prem.
Redis + S3 Hybrid
Common pattern: fast key/value store for JSON documents, object store for binaries.
Ensure Redis persistence (RDB/AOF) is enabled for durability.
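One way to check the persistence requirement before deploying is to inspect the Redis configuration file. The sketch below parses redis.conf text and reports whether AOF (`appendonly yes`) or RDB snapshots (a non-empty `save` directive) are explicitly configured. It looks only at explicit directives, so it does not account for Redis built-in defaults, included files, or runtime `CONFIG SET` changes; the function name is illustrative.

```python
def redis_persistence(conf_text):
    """Return which persistence modes are explicitly enabled in redis.conf text."""
    aof = False
    rdb = False
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(None, 1)
        key = parts[0].lower()
        value = parts[1].strip() if len(parts) > 1 else ""
        if key == "appendonly":
            aof = value.lower() == "yes"
        elif key == "save":
            # `save ""` disables snapshots; `save <seconds> <changes>` enables them.
            rdb = value not in ('""', "''", "")
    return {"aof": aof, "rdb": rdb}


if __name__ == "__main__":
    sample = "appendonly yes\nsave 900 1\n"
    print(redis_persistence(sample))
```

If both values come back False, treat durability as unconfirmed and check the running server before relying on it for xyOps state.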
NFS Shared Filesystem
SQLite works great for single-conductor but cannot be shared across multiple conductors. Switch to a networked backend for multi-conductor.
Performance Tuning
Disable QuickMon at Scale
QuickMon sends per-second metrics from all satellites. At large scale, this adds ingestion load.
monitoring_enabled.
Disable Job Network Monitoring
For servers with tens of thousands of network connections, disable real-time network monitoring during jobs.
Job Throughput
Increase the global job rate limit prudently. It acts as a global e-brake to prevent runaway workflows from overwhelming the system.
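To illustrate what a global rate limit of this kind does, here is a minimal sliding-window limiter in Python. It is a conceptual sketch only, not xyOps's actual implementation; the class name, the 60-second window, and the limit of 3 are assumptions chosen for the example.

```python
import time
from collections import deque


class JobRateLimiter:
    """Allow at most `max_jobs` job starts within any `window_sec` window."""

    def __init__(self, max_jobs, window_sec=60.0, clock=time.monotonic):
        self.max_jobs = max_jobs
        self.window_sec = window_sec
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self.starts = deque()  # timestamps of recently admitted jobs

    def try_start(self):
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self.starts and now - self.starts[0] >= self.window_sec:
            self.starts.popleft()
        if len(self.starts) >= self.max_jobs:
            return False  # e-brake engaged: reject or queue the job
        self.starts.append(now)
        return True


if __name__ == "__main__":
    limiter = JobRateLimiter(max_jobs=3)
    print([limiter.try_start() for _ in range(5)])
```

A limit set too low throttles legitimate bursts, while one set too high lets a runaway workflow saturate workers before anything intervenes, which is why the limit should be raised in measured steps.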
Data Retention
Cap database history sizes to prevent unbounded growth.
Search Performance
Worker threads for file search operations
Automated Backups
Configure nightly API export
Use the nightly API export for critical data. Schedule via cron and store off-host.
See Daily Backups.
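A scheduled export script might look like the sketch below. The export URL, the API key header name, the backup directory, the file extension, and the retention count are all placeholders that this guide does not specify; take the real endpoint and authentication details from Daily Backups.

```python
import datetime
import pathlib
import urllib.request

# Placeholders: substitute the real values for your deployment.
EXPORT_URL = "https://xyops.example.com/REPLACE_WITH_EXPORT_ENDPOINT"
API_KEY_HEADER = "X-API-Key"  # assumed header name
BACKUP_DIR = pathlib.Path("/var/backups/xyops")  # assumed location


def backup_name(day):
    """Date-stamped file name so nightly runs never overwrite each other."""
    return f"xyops-export-{day:%Y-%m-%d}.bin"  # extension is arbitrary


def prune(names, keep):
    """Return the file names to delete, keeping the `keep` most recent."""
    ordered = sorted(names)  # ISO dates sort chronologically
    return ordered[:-keep] if keep > 0 else ordered


def run_export(api_key, dest_dir=BACKUP_DIR, today=None):
    """Download the export and write it to a date-stamped file."""
    today = today or datetime.date.today()
    dest_dir.mkdir(parents=True, exist_ok=True)
    req = urllib.request.Request(EXPORT_URL, headers={API_KEY_HEADER: api_key})
    target = dest_dir / backup_name(today)
    with urllib.request.urlopen(req, timeout=300) as resp, open(target, "wb") as out:
        while chunk := resp.read(1 << 20):  # stream in 1 MiB chunks
            out.write(chunk)
    return target
```

Call `run_export` from a nightly cron job, copy the resulting file off-host, and use `prune` to decide which older local copies to delete.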
Monitoring and Alerting
Critical Error Notifications
Configure system hooks to send alerts for crashes and failed upgrades.
Universal Alert Actions
Configure global alert actions that fire for all alerts.
Security Hardening
Network Access Control
HTTPS/TLS
Set https_header_detect if terminating TLS upstream.
Upload and Connection Limits
Security Headers