CockroachDB is designed to scale horizontally by adding more nodes to your cluster. The database automatically rebalances data across nodes to maintain optimal performance and fault tolerance.
Horizontal Scaling
CockroachDB scales horizontally, meaning you add more nodes rather than upgrading existing hardware. This approach provides:
Linear Scalability
Performance scales nearly linearly with the number of nodes
Automatic Rebalancing
Data automatically redistributes across all nodes
No Downtime
Add or remove nodes without interrupting service
Increased Capacity
Distribute storage and compute across more resources
Adding Nodes to a Cluster
Manual Node Addition
Prepare the New Node
Install CockroachDB on the new machine and ensure it can communicate with existing nodes on port 26257.
Copy Security Certificates
If running in secure mode, copy the CA certificate and create node certificates:
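The certificate and startup steps might look roughly like the sketch below. It only builds the command lines for cockroach cert create-node and cockroach start so they can be reviewed before being run. The hostnames, the certs directory, and the my-safe-directory/ca.key path are placeholder assumptions, not values from this guide.

```python
import shlex


def create_node_cert_command(node_addresses, certs_dir="certs",
                             ca_key="my-safe-directory/ca.key"):
    """Return argv that creates a node certificate and key pair.

    node_addresses should list every hostname or IP address that other
    nodes and clients use to reach the new node.
    """
    if not node_addresses:
        raise ValueError("at least one node address is required")
    return ["cockroach", "cert", "create-node", *node_addresses,
            f"--certs-dir={certs_dir}", f"--ca-key={ca_key}"]


def start_node_command(advertise_addr, join_addrs, certs_dir="certs"):
    """Return argv that starts a node and joins it to an existing cluster."""
    if not join_addrs:
        raise ValueError("at least one existing node is required in --join")
    return ["cockroach", "start", f"--certs-dir={certs_dir}",
            f"--advertise-addr={advertise_addr}",
            f"--join={','.join(join_addrs)}"]


# Hypothetical hosts; 26257 is the inter-node port mentioned above.
cert_cmd = create_node_cert_command(["node4.example.com"])
start_cmd = start_node_command(
    "node4.example.com:26257",
    ["node1.example.com:26257", "node2.example.com:26257"],
)
print(shlex.join(cert_cmd))
print(shlex.join(start_cmd))
```

Passing either list to subprocess.run would execute the command on a machine where the cockroach binary is installed.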
Kubernetes Node Scaling
For Kubernetes deployments, scale the StatefulSet by increasing its replica count.
Removing Nodes from a Cluster
Graceful Node Decommissioning
Before removing a node, decommission it to safely transfer its data:
Initiate Decommissioning
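A minimal sketch of this step follows. It builds the cockroach node decommission and cockroach node status --decommission command lines and adds a simple pre-check that enough nodes remain to hold every replica. The node ID, host, and certs directory are hypothetical, and the pre-check assumes a single cluster-wide replication factor.

```python
def can_decommission(total_nodes, nodes_to_remove, replication_factor=3):
    """Return True if enough nodes remain to hold all replicas of each range."""
    return total_nodes - nodes_to_remove >= replication_factor


def decommission_command(node_ids, host, certs_dir="certs"):
    """Return argv that starts decommissioning the given node IDs."""
    if not node_ids:
        raise ValueError("at least one node ID is required")
    return ["cockroach", "node", "decommission",
            *[str(node_id) for node_id in node_ids],
            f"--certs-dir={certs_dir}", f"--host={host}"]


def decommission_status_command(host, certs_dir="certs"):
    """Return argv that reports decommissioning progress for each node."""
    return ["cockroach", "node", "status", "--decommission",
            f"--certs-dir={certs_dir}", f"--host={host}"]


# Hypothetical example: remove node 4 from a 4-node cluster.
if can_decommission(total_nodes=4, nodes_to_remove=1):
    print(" ".join(decommission_command([4], "node1.example.com:26257")))
    print(" ".join(decommission_status_command("node1.example.com:26257")))
```

The --host value should point at a node that stays in the cluster, so the command keeps working while the target node drains.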
Automatic Rebalancing
CockroachDB automatically rebalances data across nodes based on:
Rebalancing Triggers
- Node Addition: New nodes receive data from existing nodes
- Node Removal: Data moves from decommissioned nodes
- Uneven Distribution: Data shifts to balance storage utilization
- Locality Changes: Data moves closer to where it’s accessed
Monitoring Rebalancing
Monitor Replica Distribution
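One way to watch the distribution is to read the output of cockroach node status --ranges. The sketch below parses that output in CSV form into a per-node range count. The column names id and ranges are assumptions about the CSV header and may differ between versions, and the sample text is made up for illustration.

```python
import csv
import io

# Command that would produce the CSV on a live cluster (hypothetical host).
STATUS_COMMAND = ["cockroach", "node", "status", "--ranges", "--format=csv",
                  "--certs-dir=certs", "--host=node1.example.com:26257"]


def ranges_per_node(csv_text):
    """Map node ID to range count from node status CSV output.

    Assumes the header contains 'id' and 'ranges' columns; other columns
    are ignored.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return {int(row["id"]): int(row["ranges"]) for row in reader}


# Made-up sample standing in for real command output.
sample_output = (
    "id,address,ranges\n"
    "1,node1.example.com:26257,40\n"
    "2,node2.example.com:26257,38\n"
    "3,node3.example.com:26257,42\n"
)
distribution = ranges_per_node(sample_output)
print(distribution)
```

On a real cluster the text would come from subprocess.run(STATUS_COMMAND, capture_output=True, text=True).stdout instead of the sample string.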
Controlling Rebalancing Rate
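Adjust cluster settings to control rebalancing speed, for example kv.snapshot_rebalance.max_rate, which caps the bandwidth used for rebalancing snapshots.
Scaling Considerations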
When to Scale Up
High CPU Utilization
Sustained CPU usage above 70% across nodes
Memory Pressure
Frequent cache evictions or OOM warnings
Storage Capacity
Disk usage exceeding 80% capacity
Query Latency
Increasing query response times
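The two numeric thresholds above can be turned into a simple check, sketched below. It assumes the CPU figures are already averaged over a sustained window and treats "across nodes" as "on every node"; the NodeMetrics type and the function name are hypothetical. Memory pressure and latency trends are left out because the guidance gives no numeric threshold for them.

```python
from dataclasses import dataclass


@dataclass
class NodeMetrics:
    cpu_percent: float        # sustained average CPU utilization
    disk_used_percent: float  # share of store capacity in use


def scale_up_reasons(nodes, cpu_threshold=70.0, disk_threshold=80.0):
    """Return which scale-up signals fire for the given per-node metrics."""
    reasons = []
    # CPU: every node is above the threshold, so the load is cluster-wide.
    if nodes and all(n.cpu_percent > cpu_threshold for n in nodes):
        reasons.append("cpu")
    # Disk: any single node above the threshold is worth acting on.
    if any(n.disk_used_percent > disk_threshold for n in nodes):
        reasons.append("disk")
    return reasons


busy_cluster = [NodeMetrics(82.0, 60.0), NodeMetrics(75.0, 85.0)]
print(scale_up_reasons(busy_cluster))
```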
Scaling Best Practices
Scale in Odd Numbers
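Prefer an odd replication factor (3, 5, 7, etc.) and run at least that many nodes. Raft quorum is computed per range from its replica count, not from the total number of nodes in the cluster.
An even number of replicas adds no fault tolerance: 4 replicas need 3 for a majority and so tolerate only 1 failure, the same as 3 replicas. Likewise, a 4-node cluster with the default replication factor of 3 can still only lose 1 node, same as a 3-node cluster.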
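The quorum arithmetic behind this guidance is short enough to sketch. The function names are illustrative only; the majority rule itself is standard for Raft.

```python
def quorum_size(replicas):
    """Smallest majority of a replica group."""
    return replicas // 2 + 1


def tolerated_failures(replicas):
    """How many replicas can be lost while a majority remains."""
    return replicas - quorum_size(replicas)


for replicas in (3, 4, 5, 6, 7):
    print(replicas, quorum_size(replicas), tolerated_failures(replicas))
```

Going from 3 to 4 replicas raises the quorum from 2 to 3 without raising the number of tolerated failures, which is why the even step buys nothing.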
Scale Gradually
Add one or two nodes at a time and wait for rebalancing to complete before adding more.
Monitor During Scaling
Watch metrics for CPU, memory, disk I/O, and network throughput during scaling operations.
Replication Factor
The replication factor determines how many copies of each data range exist:
View Current Replication Factor
Change Default Replication Factor
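Both steps are SQL statements against the default zone configuration. The sketch below builds the statements and wraps them in a cockroach sql invocation. The host and certs directory are placeholders, and the SHOW syntax shown is the form used in recent versions, so older clusters may need a different spelling.

```python
def show_default_replication_sql():
    """Statement that displays the default zone config, including num_replicas."""
    return "SHOW ZONE CONFIGURATION FROM RANGE default;"


def set_default_replication_sql(num_replicas):
    """Statement that changes the default replication factor."""
    if not isinstance(num_replicas, int) or num_replicas < 1:
        raise ValueError("num_replicas must be a positive integer")
    return f"ALTER RANGE default CONFIGURE ZONE USING num_replicas = {num_replicas};"


def sql_command(statement, host, certs_dir="certs"):
    """Return argv that runs one statement through the cockroach SQL client."""
    return ["cockroach", "sql", f"--certs-dir={certs_dir}",
            f"--host={host}", f"--execute={statement}"]


# Hypothetical host; raises the default from 3 to 5 replicas.
print(sql_command(show_default_replication_sql(), "node1.example.com:26257"))
print(sql_command(set_default_replication_sql(5), "node1.example.com:26257"))
```

Raising the replication factor only helps if the cluster has at least that many nodes to place the replicas on.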
Locality-Aware Scaling
When deploying across multiple regions or availability zones:
Start Node with Locality
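Locality is passed to cockroach start as ordered key=value tiers, from the most inclusive tier to the least. A sketch of building that flag, with a made-up region, zone, and hosts:

```python
def locality_flag(tiers):
    """Build the --locality flag from (key, value) pairs.

    tiers must be ordered from most inclusive (e.g. region) to least
    inclusive (e.g. zone).
    """
    if not tiers:
        raise ValueError("at least one locality tier is required")
    for key, value in tiers:
        if not key or not value or any(c in key + value for c in ",= "):
            raise ValueError(f"invalid locality tier: {key!r}={value!r}")
    return "--locality=" + ",".join(f"{key}={value}" for key, value in tiers)


def start_with_locality_command(advertise_addr, join_addrs, tiers,
                                certs_dir="certs"):
    """Return argv that starts a node with the given locality tiers."""
    return ["cockroach", "start", f"--certs-dir={certs_dir}",
            f"--advertise-addr={advertise_addr}",
            f"--join={','.join(join_addrs)}",
            locality_flag(tiers)]


# Hypothetical region and zone names.
flag = locality_flag([("region", "us-east-1"), ("zone", "us-east-1a")])
print(flag)
```

Using the same tier keys in the same order on every node keeps the localities comparable across the cluster.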
Configure Zone Constraints
Constrain Data to Specific Localities
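Constraints are set through a zone configuration. The sketch below builds an ALTER DATABASE ... CONFIGURE ZONE statement that requires replicas to live on nodes with matching locality tiers. The database name mydb and the region are hypothetical, and the helper only accepts simple identifiers to avoid building unsafe SQL.

```python
import re

_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")
_TIER_PART = re.compile(r"^[A-Za-z0-9_.-]+$")


def constrain_database_sql(database, required_localities):
    """Build a statement that pins a database's replicas to given localities.

    required_localities is a list of (key, value) pairs; each becomes a
    required (+key=value) constraint.
    """
    if not _IDENTIFIER.match(database):
        raise ValueError(f"unsupported database name: {database!r}")
    if not required_localities:
        raise ValueError("at least one locality constraint is required")
    for key, value in required_localities:
        if not _TIER_PART.match(key) or not _TIER_PART.match(value):
            raise ValueError(f"unsupported locality tier: {key!r}={value!r}")
    constraints = ", ".join(f"+{k}={v}" for k, v in required_localities)
    return (f"ALTER DATABASE {database} CONFIGURE ZONE "
            f"USING constraints = '[{constraints}]';")


statement = constrain_database_sql("mydb", [("region", "us-east-1")])
print(statement)
```

A required constraint can only be satisfied if enough nodes in that locality exist to hold the configured number of replicas.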
Performance Monitoring During Scaling
Key Metrics to Monitor
- Replica Count per Store: Should be balanced across stores
- Snapshot Rate: Shows active rebalancing activity
- Disk I/O: Increases during rebalancing
- Network Throughput: Higher during data movement
- Query Latency: Should remain stable during scaling
- Range Count: Total ranges should distribute evenly
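To put a number on "balanced across stores", one option is the spread between the fullest and emptiest store relative to the average. This is only an illustrative metric with made-up sample counts, not the formula the CockroachDB rebalancer uses.

```python
from statistics import mean


def replica_imbalance(replicas_per_store):
    """Return (max - min) / mean of per-store replica counts.

    0.0 means perfectly even; larger values mean a wider spread.
    """
    counts = list(replicas_per_store.values())
    if not counts:
        raise ValueError("no stores given")
    average = mean(counts)
    if average == 0:
        return 0.0
    return (max(counts) - min(counts)) / average


# Made-up counts keyed by store ID.
even = replica_imbalance({1: 100, 2: 100, 3: 100})
skewed = replica_imbalance({1: 120, 2: 100, 3: 80})
print(even, skewed)
```

Sampling this value while a scaling operation runs shows whether the spread is shrinking toward zero as rebalancing proceeds.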
Troubleshooting Scaling Issues
Rebalancing Stuck
Check for Range Issues
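A possible first check is whether any node reports unavailable or under-replicated ranges, since either can hold up rebalancing. The sketch below scans CSV output from cockroach node status --ranges --format=csv. The column names ranges_unavailable and ranges_underreplicated are assumptions about that output, and the sample rows are invented.

```python
import csv
import io


def nodes_with_range_issues(csv_text):
    """Return {node_id: counts} for nodes reporting problem ranges.

    Assumes 'id', 'ranges_unavailable', and 'ranges_underreplicated'
    columns are present in the CSV header.
    """
    issues = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        unavailable = int(row["ranges_unavailable"])
        underreplicated = int(row["ranges_underreplicated"])
        if unavailable or underreplicated:
            issues[int(row["id"])] = {
                "unavailable": unavailable,
                "underreplicated": underreplicated,
            }
    return issues


# Invented sample standing in for real command output.
sample_status = (
    "id,ranges,ranges_unavailable,ranges_underreplicated\n"
    "1,40,0,0\n"
    "2,38,0,3\n"
    "3,42,1,0\n"
)
problems = nodes_with_range_issues(sample_status)
print(problems)
```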
Uneven Distribution
Identify Unevenly Distributed Ranges
Next Steps
Deployment
Learn deployment strategies
Migration
Migrate data to CockroachDB
Backup & Restore
Implement backup strategies