Walrus provides configurable consistency models and fsync scheduling to let you tune the trade-off between durability and performance based on your application’s requirements.
ReadConsistency
The ReadConsistency enum controls how read cursors are persisted to disk, affecting exactly-once vs at-least-once delivery semantics.
StrictlyAtOnce
Guarantees exactly-once consumption: every read cursor update is immediately persisted to disk before returning.
- Read cursor persisted after every read_next() call
- Survives process crashes with no message replays
- Holds the reader lock through IO for single-consumption semantics
- Highest durability, lower read throughput
Use cases:
- Financial transactions
- Order processing
- Any system requiring exactly-once delivery
- Critical audit logs
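A minimal sketch of consuming under StrictlyAtOnce. The crate path, the `Walrus::with_options` constructor, and the `Option`-returning `read_next` are assumptions for illustration; the variant names and the `read_next(topic, checkpoint)` signature are from this page:

```rust
use walrus::{FsyncSchedule, ReadConsistency, Walrus}; // crate path assumed

fn main() -> std::io::Result<()> {
    // Constructor name assumed; the variant names are documented above.
    let wal = Walrus::with_options(
        ReadConsistency::StrictlyAtOnce,
        FsyncSchedule::Milliseconds(200),
    )?;

    // checkpoint = true: the cursor is persisted before read_next returns,
    // so a crash after this point never redelivers the entry.
    while let Some(_entry) = wal.read_next("payments", true)? {
        // handle the entry exactly once
    }
    Ok(())
}
```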
AtLeastOnce
Provides at-least-once delivery: read cursors are persisted periodically, every N reads, allowing replays after crashes.
- persist_every: Number of reads between cursor persistence (e.g., 1000)
- Read cursor persisted every N reads
- After crash: May replay up to N messages
- Releases reader lock before IO (allows concurrent readers)
- Higher throughput, relaxed durability
Use cases:
- Event processing (idempotent handlers)
- Metrics aggregation
- Log analysis
- Any system tolerating replays
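The same sketch under AtLeastOnce (constructor assumed as above; treating `persist_every` as a named field is also an assumption):

```rust
use walrus::{FsyncSchedule, ReadConsistency, Walrus}; // crate path assumed

fn main() -> std::io::Result<()> {
    // persist_every = 1000: the cursor hits disk once per 1000 reads,
    // so a crash replays at most the last 1000 entries.
    let wal = Walrus::with_options(
        ReadConsistency::AtLeastOnce { persist_every: 1000 }, // field name assumed
        FsyncSchedule::Milliseconds(200),
    )?;

    while let Some(_entry) = wal.read_next("metrics", true)? {
        // handler must be idempotent: entries can repeat after a crash
    }
    Ok(())
}
```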
Comparison
| Feature | StrictlyAtOnce | AtLeastOnce |
|---|---|---|
| Delivery | Exactly-once | At-least-once |
| Cursor Persistence | Every read | Every N reads |
| Crash Replays | None | Up to N messages |
| Read Throughput | Lower | Higher |
| Concurrency | Serialized reads | Concurrent reads |
| Use Case | Critical data | High throughput |
The checkpoint parameter in read_next(topic, checkpoint) controls whether the cursor advances. Set it to false for non-destructive peeks (no cursor update), or true to consume the entry.
FsyncSchedule
The FsyncSchedule enum controls when write data is flushed to disk, affecting durability guarantees and write performance.
Milliseconds(u64)
Flushes data to disk at regular intervals (default: 200ms).
- Background thread calls fsync() every N milliseconds
- Buffered writes are flushed in batches
- Balances durability and throughput
- Default: 200ms
- Crash before fsync: Lose writes from last N milliseconds
- Example with 500ms: Lose up to 500ms of writes
Use cases:
- General-purpose streaming
- Most production workloads
- Default recommended setting
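For instance, widening the interval widens the loss window proportionally (same assumed constructor; the helper function is illustrative):

```rust
use walrus::{FsyncSchedule, ReadConsistency, Walrus}; // crate path assumed

fn open_streaming_log() -> std::io::Result<Walrus> {
    // 500 ms interval: a crash just before the next background fsync can
    // lose up to the last 500 ms of acknowledged writes.
    Walrus::with_options(
        ReadConsistency::StrictlyAtOnce,
        FsyncSchedule::Milliseconds(500), // loss window ≈ fsync interval
    )
}
```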
SyncEach
Flushes data to disk after every single write (maximum durability).
- Files opened with the O_SYNC flag (on Unix systems)
- Every append_for_topic() call waits for the disk write
- Guarantees data is on disk before returning
- Significantly lower write throughput
- Data loss on crash: none
Use cases:
- Financial ledgers
- Transaction logs
- Any system requiring zero data loss
- Regulatory compliance
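A sketch of the write path this implies (byte-slice payload and `&self` receiver assumed):

```rust
use walrus::Walrus; // crate path assumed

fn record_transaction(wal: &Walrus) -> std::io::Result<()> {
    // Under SyncEach, this call blocks until the entry is physically on
    // disk; once it returns Ok, a crash cannot lose the write.
    wal.append_for_topic("ledger", b"txn:debit:42")
}
```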
NoFsync
Never flushes data to disk explicitly (maximum throughput, no durability).
- Data is written to the OS page cache
- Relies on OS for eventual disk flushes
- Maximum write throughput
- No durability guarantees
- Crash: May lose all unacknowledged writes
- OS decides when to flush (typically 5-30s)
Use cases:
- Development/testing
- Temporary caching
- Non-critical event logs
- Metrics buffering
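Under NoFsync the same call returns as soon as the kernel has the data (sketch, same assumed API):

```rust
use walrus::Walrus; // crate path assumed

fn buffer_metric(wal: &Walrus) -> std::io::Result<()> {
    // Returns once the bytes are in the OS page cache; the kernel flushes
    // on its own schedule (typically 5-30 s), so a crash can lose every
    // write that has not been flushed yet.
    wal.append_for_topic("metrics-buffer", b"cpu:0.42")
}
```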
Fsync Implementation Details
On Linux (FD Backend with io_uring)
When using the FD backend on Linux, fsync operations leverage io_uring for batching:
- Multiple file descriptors fsynced in parallel
- Reduced syscall overhead
- Better utilization of disk I/O
On Other Platforms (Mmap Backend)
Falls back to sequential fsync() calls:
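Conceptually the fallback is a plain loop over the open files; a sketch using only the standard library (not the actual internals):

```rust
use std::fs::File;

// Sequential fallback: one blocking fsync(2) per file, in order,
// instead of submitting them in parallel through io_uring.
fn fsync_all(files: &[File]) -> std::io::Result<()> {
    for file in files {
        file.sync_all()?; // waits for data and metadata to reach disk
    }
    Ok(())
}
```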
Configuration Examples
High Durability (Financial System)
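A sketch, with the same assumed `Walrus::with_options` constructor as above:

```rust
use walrus::{FsyncSchedule, ReadConsistency, Walrus}; // crate path assumed

// Exactly-once reads plus fsync on every write: zero data loss,
// lowest throughput.
fn open_financial_log() -> std::io::Result<Walrus> {
    Walrus::with_options(ReadConsistency::StrictlyAtOnce, FsyncSchedule::SyncEach)
}
```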
High Throughput (Event Processing)
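Relaxing both knobs for throughput (the `persist_every` field name is assumed):

```rust
use walrus::{FsyncSchedule, ReadConsistency, Walrus}; // crate path assumed

// Up to 5000 replayed events and ~1 s of lost writes after a crash,
// in exchange for the highest throughput.
fn open_event_log() -> std::io::Result<Walrus> {
    Walrus::with_options(
        ReadConsistency::AtLeastOnce { persist_every: 5000 },
        FsyncSchedule::Milliseconds(1000),
    )
}
```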
Balanced (Recommended Default)
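The conservative pairing recommended under Best Practices below:

```rust
use walrus::{FsyncSchedule, ReadConsistency, Walrus}; // crate path assumed

// Exactly-once reads with the default 200 ms fsync interval.
fn open_default_log() -> std::io::Result<Walrus> {
    Walrus::with_options(ReadConsistency::StrictlyAtOnce, FsyncSchedule::Milliseconds(200))
}
```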
Development/Testing
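Fastest and least durable, for tests only:

```rust
use walrus::{FsyncSchedule, ReadConsistency, Walrus}; // crate path assumed

// No explicit fsync and infrequent cursor persistence.
fn open_test_log() -> std::io::Result<Walrus> {
    Walrus::with_options(
        ReadConsistency::AtLeastOnce { persist_every: 10_000 },
        FsyncSchedule::NoFsync,
    )
}
```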
Read Cursor Persistence
Checkpoint Behavior
The checkpoint parameter in read operations controls cursor advancement:
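Both calls below use the `read_next(topic, checkpoint)` signature from this page (the `Option` return is an assumption):

```rust
use walrus::Walrus; // crate path assumed

fn peek_then_consume(wal: &Walrus) -> std::io::Result<()> {
    // checkpoint = false: non-destructive peek, the cursor stays put.
    let _peeked = wal.read_next("orders", false)?;

    // checkpoint = true: consume the entry; the cursor advances, and under
    // StrictlyAtOnce it is persisted before this call returns.
    let _consumed = wal.read_next("orders", true)?;
    Ok(())
}
```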
Batch Reads
Batch reads respect the same consistency model as single reads.
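This page doesn't show a dedicated batch API, so as an illustration a batch can be emulated with repeated `read_next` calls (entry type assumed as bytes); the cursor-persistence rules above apply per checkpointed read:

```rust
use walrus::Walrus; // crate path assumed

fn read_batch(wal: &Walrus, max: usize) -> std::io::Result<Vec<Vec<u8>>> {
    let mut batch = Vec::with_capacity(max);
    for _ in 0..max {
        match wal.read_next("events", true)? {
            // Under AtLeastOnce, the cursor still hits disk once per
            // persist_every checkpointed reads, batched or not.
            Some(entry) => batch.push(entry),
            None => break, // topic drained
        }
    }
    Ok(batch)
}
```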
Distributed Consistency
In the distributed system, consistency also involves the following mechanisms.
Write Leases
Only the designated leader can write to each segment.
Metadata Replication
All topology changes (topics, segments, leaders) are replicated via Raft.
Best Practices
Start Conservative
Begin with StrictlyAtOnce and Milliseconds(200):
- Ensures data safety
- Profile to identify bottlenecks
- Relax constraints if needed
Match Your Workload
Choose based on requirements:
- Exactly-once needed? → StrictlyAtOnce
- Idempotent handlers? → AtLeastOnce
- Zero data loss? → SyncEach
- Testing only? → NoFsync
Tune for Throughput
If performance is critical:
- Use AtLeastOnce with a higher persist_every
- Increase the fsync interval (500ms-1s)
- Ensure idempotent processing
Monitor Loss Window
Track fsync interval + buffer depth:
- 200ms fsync + 1s buffer = 1.2s loss window
- Acceptable for most applications
- Adjust based on RPO requirements
Environment Variables
Related Topics
Architecture Overview
Learn about the overall system design and storage engine
Topics and Segments
Understand how data is organized in topics and segments