TSM Tree Structure
The TSM (Time-Structured Merge) tree is implemented in src/tsm/tsm.zig as a two-tier storage system:
In-Memory Cache
The cache layer (src/tsm/cache.zig) uses a skip list for ordered storage:
Cache Structure
- Skip list: 16 levels, O(log n) insertion and search
- Series indexing: HashMap maps series keys to individual skip lists
- Bloom filter: 1024-bit filter for series existence checks
- Point counting: Tracks total points for flush threshold
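A minimal Python sketch of this cache layer — per-series skip lists behind a hash map, plus a point counter. Class names and the p = 0.5 level-promotion probability are assumptions; the real implementation lives in src/tsm/cache.zig.

```python
import random

MAX_LEVEL = 16  # matches the 16-level skip list described above
P = 0.5         # assumed level-promotion probability

class _Node:
    __slots__ = ("ts", "value", "forward")
    def __init__(self, ts, value, level):
        self.ts = ts
        self.value = value
        self.forward = [None] * level  # one forward pointer per level

class SkipList:
    """Ordered map from timestamp to value, O(log n) insert/search."""
    def __init__(self):
        self.head = _Node(None, None, MAX_LEVEL)
        self.level = 1

    def _random_level(self):
        lvl = 1
        while random.random() < P and lvl < MAX_LEVEL:
            lvl += 1
        return lvl

    def insert(self, ts, value):
        update = [self.head] * MAX_LEVEL
        node = self.head
        # Descend from the highest level, recording the rightmost node
        # that precedes the insertion point on each level.
        for i in range(self.level - 1, -1, -1):
            while node.forward[i] and node.forward[i].ts < ts:
                node = node.forward[i]
            update[i] = node
        lvl = self._random_level()
        self.level = max(self.level, lvl)
        new = _Node(ts, value, lvl)
        for i in range(lvl):
            new.forward[i] = update[i].forward[i]
            update[i].forward[i] = new

    def scan(self):
        # O(n) ordered range scan along the bottom-level links.
        node, out = self.head.forward[0], []
        while node:
            out.append((node.ts, node.value))
            node = node.forward[0]
        return out

class Cache:
    """Series key -> skip list, with a counter for the flush threshold."""
    def __init__(self):
        self.series = {}
        self.points = 0

    def insert(self, key, ts, value):
        self.series.setdefault(key, SkipList()).insert(ts, value)
        self.points += 1
```

Inserting out of order still yields an ordered scan, which is what makes the later flush-to-columnar-disk step cheap.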
Data Point Types
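The point representation is not spelled out in this section; a plausible shape, assuming a timestamp plus a tagged value (the actual type set is defined in the Zig source), might look like:

```python
from dataclasses import dataclass
from typing import Union

# Hypothetical tagged value union; the real tag set lives in src/tsm/tsm.zig.
Value = Union[float, int, bool, str]

@dataclass(frozen=True)
class Point:
    timestamp: int  # e.g. epoch nanoseconds (assumed unit)
    value: Value
```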
Cache Operations
Insert: add the point to the per-series skip list, update the Bloom filter on the first write to a series, and bump the point counter.
Automatic Flushing
When the cache reaches MAX_CACHE_POINTS, data is flushed to disk:
- Write cache contents to new disk entry
- Increment entry count
- Clear cache and reset Bloom filter
- Reset point counter
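The four flush steps above can be sketched as follows (MAX_CACHE_POINTS shrunk to 4 for illustration; class and field names are hypothetical):

```python
MAX_CACHE_POINTS = 4  # illustrative; the real threshold is set in the Zig source

class Store:
    """Sketch of the automatic flush path."""
    def __init__(self):
        self.cache = {}         # series key -> list of (ts, value)
        self.cache_points = 0
        self.bloom = 0          # Bloom filter bits, reset on flush
        self.entries = []       # flushed disk entries; len() is the entry count

    def insert(self, key, ts, value):
        self.cache.setdefault(key, []).append((ts, value))
        self.cache_points += 1
        if self.cache_points >= MAX_CACHE_POINTS:
            self.flush()

    def flush(self):
        # 1. Write cache contents to a new disk entry
        self.entries.append({k: sorted(v) for k, v in self.cache.items()})
        # 2. Entry count increments implicitly via len(self.entries)
        # 3. Clear cache and reset Bloom filter
        self.cache = {}
        self.bloom = 0
        # 4. Reset point counter
        self.cache_points = 0
```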
Disk Storage
Disk entries (src/tsm/entry.zig) use a columnar format with compression:
Entry Structure
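A hypothetical sketch of per-entry metadata, inferred from the Bloom filter and query sections below (timestamp range, series index, persisted filter); the actual field names and layout are defined in src/tsm/entry.zig:

```python
from dataclasses import dataclass, field

@dataclass
class EntryMeta:
    min_ts: int                   # timestamp range, checked first during queries
    max_ts: int
    # series key -> (offset, length) into the .dat file (assumed index shape)
    series_index: dict = field(default_factory=dict)
    bloom: bytes = b"\x00" * 128  # persisted 1024-bit Bloom filter
```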
File Format
Each disk entry creates two files:
- _{level}_{name}.dat - compressed data
- _{level}_{name}.idx - indexes and metadata
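A trivial helper showing the naming scheme above (the function name is hypothetical):

```python
def entry_filenames(level: int, name: str) -> tuple[str, str]:
    """Build the data and index file names for a disk entry."""
    return (f"_{level}_{name}.dat", f"_{level}_{name}.idx")
```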
Timestamp Encoding
Two encoding options:
1. Gorilla encoding (better compression)
Value Serialization
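The Gorilla timestamp option and a tagged value serialization can be sketched together. This is a byte-aligned delta-of-delta variant for clarity (real Gorilla encoding packs at the bit level), and the 0x01 tag byte for float64 is an assumption; it does match the ~1-2 bytes/point for timestamps and ~9 bytes for a tagged float64 reported in the benchmark section below.

```python
import struct

def _zigzag(n: int) -> int:
    # Map signed deltas to unsigned ints so small magnitudes stay small.
    return (n << 1) ^ (n >> 63)

def _varint(v: int, out: bytearray) -> None:
    while True:
        b = v & 0x7F
        v >>= 7
        out.append(b | (0x80 if v else 0))
        if not v:
            break

def encode_timestamps(timestamps: list[int]) -> bytes:
    """Gorilla-style delta-of-delta encoding: regular intervals compress
    to one byte per point because the delta-of-delta is zero."""
    out = bytearray()
    prev = prev_delta = 0
    for i, t in enumerate(timestamps):
        if i == 0:
            dod = t  # first timestamp stored in full
        else:
            delta = t - prev
            dod = delta - prev_delta
            prev_delta = delta
        prev = t
        _varint(_zigzag(dod), out)
    return bytes(out)

def encode_value(value: float) -> bytes:
    """One type-tag byte (0x01 for float64 is hypothetical) plus an
    8-byte little-endian IEEE 754 double."""
    return bytes([0x01]) + struct.pack("<d", value)
```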
Bloom Filters
Bloom filters provide fast series existence checks:
- 1024-bit filter size
- Inserted on first write to series
- Persisted to disk with entry
- Checked before expensive disk reads
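A minimal 1024-bit Bloom filter sketch; the hash scheme (slices of a SHA-256 digest) is illustrative and may differ from the Zig implementation:

```python
import hashlib

FILTER_BITS = 1024

class Bloom:
    """1024-bit Bloom filter for series-existence checks."""
    def __init__(self, hashes: int = 3):
        self.bits = 0          # the filter as an integer bitset
        self.hashes = hashes

    def _positions(self, key: str):
        # Derive k bit positions from independent digest slices.
        digest = hashlib.sha256(key.encode()).digest()
        for i in range(self.hashes):
            chunk = digest[i * 4:(i + 1) * 4]
            yield int.from_bytes(chunk, "little") % FILTER_BITS

    def insert(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def may_contain(self, key: str) -> bool:
        # False means definitely absent; True means "probably present".
        return all(self.bits >> pos & 1 for pos in self._positions(key))
```

A negative answer lets the query path skip the disk entry entirely, which is the point of checking the filter before expensive reads.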
Query Execution
Queries span cache and disk transparently:
- Check timestamp range against entry metadata
- Check Bloom filter for series existence
- Look up series in series index
- Read only relevant pages
- Decompress timestamps for range filtering
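The disk-side portion of these steps can be sketched as follows; each entry here is a hypothetical dict with min_ts/max_ts metadata, a Bloom filter (modeled as a set of keys), and a series index:

```python
def query(entries, series_key, t0, t1):
    """Scan disk entries, pruning by metadata before touching data."""
    results = []
    for e in entries:
        # 1. Check timestamp range against entry metadata
        if e["max_ts"] < t0 or e["min_ts"] > t1:
            continue
        # 2. Check Bloom filter for series existence
        if series_key not in e["bloom"]:
            continue
        # 3. Look up series in the series index
        points = e["series"].get(series_key, [])
        # 4./5. Read only relevant data and filter by decoded timestamp
        results.extend((ts, v) for ts, v in points if t0 <= ts <= t1)
    return results
```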
Skip List Performance
The 16-level skip list provides:
- Average O(log n) insertion: ~4-5 comparisons for 1M points
- Average O(log n) search: similar performance
- O(n) range scan: efficient due to linked structure
- Memory overhead: ~32 bytes per node on average (up to 16 forward pointers)
Storage Efficiency
From a billion-point benchmark:
- Total: 9.12 bytes per point
- Breakdown:
- Gorilla-encoded timestamps: ~1-2 bytes/point
- Float64 values: ~8 bytes/point (with type tag)
- Indexes and metadata: Less than 1 byte/point amortized
- Bloom filters: negligible