Documentation Index
Fetch the complete documentation index at: https://mintlify.com/nubskr/walrus/llms.txt
Use this file to discover all available pages before exploring further.
Prerequisites
- Minimum nodes: 3 (for Raft quorum)
- Rust: 1.70+ (for manual builds)
- Docker & Docker Compose: Latest (for containerized deployment)
- Network: All nodes must be able to communicate on Raft and client ports
Quick Start: Docker Compose
The fastest way to get a 3-node cluster running locally.

1. Start the Cluster
- Builds the Docker image from the Dockerfile
- Starts 3 nodes: walrus-1, walrus-2, walrus-3
- Exposes client ports: 9091, 9092, 9093
- Creates a bridge network for internal communication
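The compose invocation itself is not reproduced on this page; assuming a standard layout with the compose file in the repository root, the steps above correspond to something like:

```shell
# Build the image and start all three nodes in the background
docker compose up --build -d
```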
2. Wait for Bootstrap
3. Verify Cluster Health
Connect to any node and check metrics:

- current_leader: should be 1 after bootstrap
- state: should be Leader on Node 1, Follower on the others
- membership: should show all 3 nodes
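The exact client invocation is not shown here. The METRICS command is referenced under Monitoring below; assuming a newline-terminated text protocol on the client port, a quick check from the host might look like:

```shell
# Query node 1's metrics over its exposed client port (wire format assumed)
printf 'METRICS\n' | nc 127.0.0.1 9091
```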
4. Test Basic Operations
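The original example commands for this step are missing. A hedged sketch using the STATE command mentioned under Troubleshooting, against the initial "logs" topic (command syntax assumed):

```shell
# Ask node 1 which node currently leads the "logs" topic
printf 'STATE logs\n' | nc 127.0.0.1 9091
```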
5. Shutdown
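A typical teardown, assuming the stock compose file (add -v only if you also want to discard the data volumes):

```shell
docker compose down
```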
Docker Compose Configuration
The docker-compose.yml defines the cluster topology:
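The file itself is not reproduced on this page. The sketch below is illustrative only, inferred from the flags, container names, and ports mentioned here (service hostnames node1-node3, containers walrus-1 through walrus-3); consult the repository's actual docker-compose.yml for the authoritative topology:

```yaml
services:
  node1:
    container_name: walrus-1
    build: .
    privileged: true                  # required for some file system operations
    environment:
      - WALRUS_DISABLE_IO_URING=1     # use mmap instead of io_uring in containers
    command: >
      ./distributed-walrus --node-id=1
      --raft-host=0.0.0.0 --raft-port=6001 --raft-advertise-host=node1
      --client-host=0.0.0.0 --client-port=9091
    ports:
      - "9091:9091"

  node2:
    container_name: walrus-2
    build: .
    privileged: true
    environment:
      - WALRUS_DISABLE_IO_URING=1
    command: >
      ./distributed-walrus --node-id=2
      --raft-host=0.0.0.0 --raft-port=6002 --raft-advertise-host=node2
      --client-host=0.0.0.0 --client-port=9092
      --join=node1:6001
    ports:
      - "9092:9092"

  # node3 (walrus-3, ports 6003/9093) is analogous to node2
```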
Key Configuration Points
- --raft-host=0.0.0.0: Binds the Raft listener to all interfaces (required in containers)
- --raft-advertise-host=node1: Hostname advertised to peers for RPC
- --join=node1:6001: Non-bootstrap nodes join via Node 1’s Raft port
- WALRUS_DISABLE_IO_URING=1: Uses mmap instead of io_uring (container compatibility)
- privileged: true: Required for some file system operations
Manual Deployment
Deploy nodes manually without Docker for production or bare-metal setups.

1. Build the Binary
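The build command was omitted above; the standard Cargo release build, which produces the binary at the path shown below:

```shell
cargo build --release
```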
target/release/distributed-walrus
2. Node 1 (Bootstrap Leader)
Start the first node, which will become the initial Raft leader. It will:

- Bootstrap as the Raft leader
- Create the initial "logs" topic
- Start accepting client connections on :9091
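Assembled from the flag reference below, a plausible Node 1 invocation (the exact flags, paths, and ports here are an assumption, not the project's verbatim example):

```shell
./target/release/distributed-walrus \
  --node-id=1 \
  --data-dir=./data/node1 \
  --raft-host=192.168.1.10 \
  --raft-port=6001 \
  --client-host=0.0.0.0 \
  --client-port=9091
```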
Replace 192.168.1.10 with the actual IP address that other nodes can reach. For local testing, use 127.0.0.1.

3. Node 2 (Join Cluster)
Start the second node and join the cluster. It will:

- Contact Node 1 at 192.168.1.10:6001
- Join as a Raft learner
- Sync metadata from the leader
- Get promoted to a voting member automatically
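Again sketched from the flag reference below (hosts and ports are illustrative assumptions):

```shell
./target/release/distributed-walrus \
  --node-id=2 \
  --data-dir=./data/node2 \
  --raft-host=192.168.1.11 \
  --raft-port=6002 \
  --client-host=0.0.0.0 \
  --client-port=9092 \
  --join=192.168.1.10:6001
```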
4. Node 3 (Join Cluster)
Start the third node the same way as Node 2, using --node-id=3 with its own data directory and ports.

5. Verify Cluster Formation
Check each node’s logs for a successful join.

Configuration Reference
Command-Line Flags
| Flag | Required | Default | Description |
|---|---|---|---|
| `--node-id` | Yes | - | Unique node identifier (1, 2, 3, …) |
| `--data-dir` | No | `./data` | Root directory for storage |
| `--raft-port` | No | 6000 | Port for Raft/internal RPC |
| `--raft-host` | No | 127.0.0.1 | Raft bind address |
| `--raft-advertise-host` | No | (raft-host) | Address advertised to peers |
| `--client-port` | No | 8080 | Client TCP port |
| `--client-host` | No | 127.0.0.1 | Client bind address |
| `--join` | No | - | Address of an existing node to join |
| `--log-file` | No | - | File to write logs (stdout if not set) |
Environment Variables
| Variable | Default | Description |
|---|---|---|
| `RUST_LOG` | info | Log level: debug, info, warn, error |
| `WALRUS_MAX_SEGMENT_ENTRIES` | 1000000 | Entries before segment rollover |
| `WALRUS_MONITOR_CHECK_MS` | 10000 | Monitor loop interval (ms) |
| `WALRUS_DISABLE_IO_URING` | (unset) | Set to 1 to use mmap instead of io_uring |
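Environment variables are passed at launch. An illustrative combination (the values here are examples, not recommendations):

```shell
# Verbose logging, earlier segment rollover, mmap instead of io_uring
RUST_LOG=debug \
WALRUS_MAX_SEGMENT_ENTRIES=500000 \
WALRUS_DISABLE_IO_URING=1 \
./target/release/distributed-walrus --node-id=1
```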
Data Directory Structure
Each node stores its data in separate directories under its --data-dir.

Production Deployment Considerations
Network Configuration
- Firewall rules:
  - Open client ports (9091-9093) for application traffic
  - Open Raft ports (6001-6003) only between cluster nodes
  - Do NOT expose Raft ports to the public internet
- DNS/Hostnames:
  - Use stable hostnames or IPs for --raft-advertise-host
  - Consider using a load balancer for client connections
Hardware Requirements
Minimum per node:

- CPU: 2+ cores
- RAM: 4GB+ (depends on segment size and concurrency)
- Disk: SSD recommended for Walrus WAL performance
- Network: Low latency between nodes (< 10ms ideal for Raft)
Monitoring
Set up monitoring for:

- Raft leader stability (METRICS command)
- Segment rollover frequency
- Write latency (local vs. forwarded)
- Disk usage in data_wal_dir
Backup Strategy
Backup requirements:

- Raft metadata (raft_meta/): Small, critical for cluster state
- User data (user_data/): Large, actual log data
- Snapshot the entire data/ directory per node
- Use Walrus’s built-in snapshot capabilities for sealed segments
- Stream data to object storage for long-term retention
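A minimal cold-backup sketch, assuming the per-node data layout used in the manual deployment above (./data/node1 containing raft_meta/ and user_data/):

```shell
# Archive one node's entire data directory for off-host retention
tar czf walrus-node1-backup.tar.gz -C ./data node1
```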
Hot-Join a New Node
Add a 4th node to an existing 3-node cluster. It will:

- Join as a learner
- Sync metadata from the leader
- Automatically be promoted to a voter after catching up (~60 seconds)
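A hedged sketch of the new node's invocation, following the manual-deployment pattern above (flags from the reference table; hosts and ports are assumptions):

```shell
./target/release/distributed-walrus \
  --node-id=4 \
  --data-dir=./data/node4 \
  --raft-host=192.168.1.13 \
  --raft-port=6004 \
  --client-host=0.0.0.0 \
  --client-port=9094 \
  --join=192.168.1.10:6001
```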
New segments created after the join will distribute leadership across all nodes, including the new one.
Troubleshooting
“No Raft leader known”
Cause: Cluster hasn’t completed bootstrap or has lost quorum.

Solution:

- Check network connectivity between nodes
- Verify at least 2 of 3 nodes are running
- Review logs for election timeout errors
“NotLeaderForPartition” errors
Cause: Node doesn’t hold the lease for the segment.

Solution:

- Wait 100ms for lease sync to complete
- Check STATE <topic> to see the current leader
- Verify Raft metadata is consistent across nodes
“Failed to join cluster”
Cause: Cannot reach the join target.

Solution:

- Verify the --join address is correct
- Ensure the Raft port (6001) is accessible
- Check that Node 1 is fully bootstrapped (wait 20-30 seconds after start)
Ports already in use
Cause: A previous instance was not cleaned up.

Solution: Stop the stale processes and free the ports (for the Docker setup, run docker compose down) before restarting.

Next Steps
Client Protocol
Learn the TCP protocol commands
Failure Recovery
Handle node failures and recovery