Overview
Slung’s data model is designed for high-cardinality time series data with flexible tagging and multiple value types. The model is optimized for both write throughput and query performance.
Core Concepts
Series Keys
A series key uniquely identifies a time series and is constructed from a measurement name and its sorted tags.
Series keys are interned by the server: each unique key is stored once and reused across all data points, which reduces memory overhead for high-cardinality datasets.
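The construction described above can be sketched in a few lines. This is an illustration in Python rather than Slung's Zig code, and the serialized layout (measurement followed by comma-separated key=value pairs) is an assumption, as is the helper name make_series_key. The point it shows is that sorting tags makes the key independent of the order in which tags arrive.

```python
def make_series_key(measurement: str, tags: dict[str, str]) -> str:
    """Build a canonical key: the measurement, then tags sorted by tag name.

    The comma-separated layout is assumed for illustration only.
    """
    # Sorting makes the key independent of the order the tags were supplied in.
    parts = [f"{name}={value}" for name, value in sorted(tags.items())]
    return ",".join([measurement, *parts])


a = make_series_key("cpu.usage", {"region": "us-west", "env": "prod"})
b = make_series_key("cpu.usage", {"env": "prod", "region": "us-west"})
print(a)       # cpu.usage,env=prod,region=us-west
print(a == b)  # True
```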
Series Key Interning
The server maintains a series key cache (src/main.zig:107-115).
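As a rough model of what such a cache does, the sketch below stores each distinct key once and returns the shared copy on every later request. It is written in Python for illustration; the class name SeriesKeyCache and its methods are hypothetical and do not mirror the Zig implementation.

```python
class SeriesKeyCache:
    """Stores each distinct series key once and hands back the shared copy."""

    def __init__(self) -> None:
        self._keys: dict[str, str] = {}

    def intern(self, key: str) -> str:
        # setdefault returns the stored object if an equal key already exists,
        # otherwise it stores this one and returns it.
        return self._keys.setdefault(key, key)

    def __len__(self) -> int:
        return len(self._keys)


cache = SeriesKeyCache()
first = cache.intern("cpu.usage,env=prod")
# Build an equal string at run time so it starts out as a separate object.
second = cache.intern("".join(["cpu.usage", ",env=prod"]))
print(first is second)  # True: both names refer to the one stored key
print(len(cache))       # 1
```

Because every caller ends up holding the same object, comparing two interned keys can be an identity check rather than a byte-by-byte comparison.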
Hash-Based Collision Detection
To speed up lookups, Slung hashes series keys with Wyhash and detects hash collisions.
Data Types
DataPoint Structure
A data point consists of a timestamp and a value.
Value Types
Slung supports four value types (src/tsm/cache.zig:27-41): Float, Int, Bool, and Bytes.
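One way to picture a data point with a four-way value type is the Python sketch below. The names DataPoint, timestamp_us, and value_kind are hypothetical, and the mapping onto Python's float, int, bool, and bytes is an assumption made for illustration; the actual definition lives in the Zig source.

```python
from dataclasses import dataclass
from typing import Union

# Illustrative stand-ins for the four value types.
Value = Union[float, int, bool, bytes]


@dataclass(frozen=True)
class DataPoint:
    timestamp_us: int  # microseconds since the Unix epoch
    value: Value


def value_kind(value: Value) -> str:
    """Return which of the four value types a Python value would map to."""
    # bool must be tested before int because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "Bool"
    if isinstance(value, int):
        return "Int"
    if isinstance(value, float):
        return "Float"
    if isinstance(value, bytes):
        return "Bytes"
    raise TypeError(f"unsupported value type: {type(value).__name__}")


point = DataPoint(timestamp_us=1_704_067_200_000_000, value=72.5)
print(value_kind(point.value))  # Float
print(value_kind(True))         # Bool
```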
Binary Wire Format
Data is ingested via WebSocket in a little-endian binary format (see src/main.zig:424-469).
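The frame layout itself is not reproduced here, so the sketch below only demonstrates what little-endian encoding of a timestamp and a float looks like. The two-field record is hypothetical and is not Slung's actual wire format; a real client should follow the layout defined in the source.

```python
import struct

# Hypothetical record, for illustration only (not Slung's actual frame):
# an i64 timestamp in microseconds followed by an f64 value.
# "<" selects little-endian byte order with no padding.
RECORD = struct.Struct("<qd")


def encode(timestamp_us: int, value: float) -> bytes:
    return RECORD.pack(timestamp_us, value)


def decode(buf: bytes) -> tuple[int, float]:
    return RECORD.unpack(buf)


raw = encode(1, 1.5)
print(len(raw))       # 16
print(raw[:8].hex())  # 0100000000000000 (least significant byte first)
print(decode(raw))    # (1, 1.5)
```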
Indexing Strategy
Three-Level Index
Slung maintains three complementary indexes for efficient query routing.
1. Series by Measurement
Maps each measurement name to all associated series.
2. Series by Measurement+Tag
Maps measurement<0x1f>tag (the measurement name and a tag joined by the 0x1f unit separator byte) to the matching series.
3. Series Key Hash
Fast lookup by hash, with collision tracking.
Index Population
Indexes are populated lazily as series are written (src/main.zig:259-291).
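A compact model of the three indexes and their lazy population might look like the following Python sketch. Only series_by_measurement_tag is a name taken from this page; the class, the other attribute names, and the key layout are hypothetical. Python's built-in hash stands in for Wyhash, and each tag is assumed to be indexed as key=value.

```python
from collections import defaultdict

SEP = "\x1f"  # ASCII unit separator between measurement and tag


class SeriesIndex:
    def __init__(self) -> None:
        self.series_by_measurement: dict[str, set[str]] = defaultdict(set)
        self.series_by_measurement_tag: dict[str, set[str]] = defaultdict(set)
        # hash -> every key seen with that hash, so colliding keys stay distinct
        self.series_by_hash: dict[int, list[str]] = defaultdict(list)

    def add(self, measurement: str, tags: dict[str, str]) -> str:
        """Index a series the first time it is written; later writes are no-ops."""
        pairs = [f"{k}={v}" for k, v in sorted(tags.items())]
        key = ",".join([measurement, *pairs])
        bucket = self.series_by_hash[hash(key)]  # stand-in for Wyhash
        if key in bucket:  # compare full keys, not just hashes
            return key
        bucket.append(key)
        self.series_by_measurement[measurement].add(key)
        for pair in pairs:
            self.series_by_measurement_tag[f"{measurement}{SEP}{pair}"].add(key)
        return key


index = SeriesIndex()
index.add("cpu.usage", {"region": "us-west", "env": "prod"})
index.add("cpu.usage", {"region": "us-west", "env": "dev"})
index.add("cpu.usage", {"region": "us-west", "env": "dev"})  # repeat write
print(len(index.series_by_measurement["cpu.usage"]))  # 2
print(sorted(index.series_by_measurement_tag["cpu.usage\x1fenv=dev"]))
# ['cpu.usage,env=dev,region=us-west']
```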
Query Matching
Tag Filter Evaluation
The server evaluates tag filters using set operations (src/main.zig:140-209):
- Start with measurement universe: All series for the measurement
- Evaluate each tag operand: Build a set of matching series
- Apply operators: Combine sets using AND, OR, NOT
- Return intersection: Final set of matching series keys
Example Query Evaluation
Query: AVG:cpu.usage:[region=us-west AND NOT env=dev]
- Get universe: all cpu.usage series
- Get region=us-west set from series_by_measurement_tag
- Get env=dev set and invert (NOT)
- Intersect both sets (AND)
- Return matching series keys
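The evaluation steps above map directly onto set operations. The Python sketch below walks the same example with made-up series keys. The tuple-based expression form and the evaluate function are hypothetical, and query parsing is left out, so the filter is built by hand.

```python
def evaluate(expr: tuple, universe: set[str], by_tag: dict[str, set[str]]) -> set[str]:
    """Evaluate a tag filter expression to a set of series keys."""
    op = expr[0]
    if op == "tag":
        return by_tag.get(expr[1], set()) & universe
    if op == "not":
        # NOT is the complement relative to the measurement's universe.
        return universe - evaluate(expr[1], universe, by_tag)
    if op == "and":
        return evaluate(expr[1], universe, by_tag) & evaluate(expr[2], universe, by_tag)
    if op == "or":
        return evaluate(expr[1], universe, by_tag) | evaluate(expr[2], universe, by_tag)
    raise ValueError(f"unknown operator: {op}")


# Made-up series for the cpu.usage measurement.
universe = {
    "cpu.usage,env=dev,region=us-west",
    "cpu.usage,env=prod,region=us-west",
    "cpu.usage,env=prod,region=eu-central",
}
by_tag = {
    "region=us-west": {
        "cpu.usage,env=dev,region=us-west",
        "cpu.usage,env=prod,region=us-west",
    },
    "env=dev": {"cpu.usage,env=dev,region=us-west"},
}

# [region=us-west AND NOT env=dev]
query = ("and", ("tag", "region=us-west"), ("not", ("tag", "env=dev")))
result = evaluate(query, universe, by_tag)
print(sorted(result))  # ['cpu.usage,env=prod,region=us-west']
```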
Timestamps
Timestamp Format
All timestamps are stored as microsecond-precision i64 values:
- Range: ±292,277 years from the Unix epoch
- Resolution: 1 microsecond
- Format: Signed 64-bit integer
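The range figure follows from the integer width: the largest i64, taken as microseconds and divided by the length of an average Gregorian year (365.2425 days). The Python sketch below reproduces that arithmetic and shows one way a client could convert a timezone-aware datetime to microseconds; the helper name to_micros is hypothetical.

```python
from datetime import datetime, timezone

I64_MAX = 2**63 - 1
MICROS_PER_SECOND = 1_000_000
SECONDS_PER_YEAR = 31_556_952  # average Gregorian year: 365.2425 days

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_micros(dt: datetime) -> int:
    """Convert a timezone-aware datetime to microseconds since the Unix epoch."""
    delta = dt - EPOCH
    # Integer arithmetic avoids float rounding at microsecond resolution.
    return (delta.days * 86_400 + delta.seconds) * MICROS_PER_SECOND + delta.microseconds


range_years = I64_MAX // MICROS_PER_SECOND // SECONDS_PER_YEAR
print(range_years)  # 292277
print(to_micros(datetime(2024, 1, 1, tzinfo=timezone.utc)))  # 1704067200000000
```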
Timestamp Encoding
See TSM Tree for details on Gorilla delta-of-delta encoding.
Memory Optimization
String Interning
Series keys are interned once:
- Reduces memory usage for high-cardinality data
- Single allocation per unique series key
- Pointer equality checks for fast comparison
Tag Scratch Buffer
Tag arrays are reused via a scratch buffer to avoid allocations.
Best Practices
Tag Cardinality
Keep tag cardinality manageable. Each unique tag combination creates a new series. For example:
- Good: host (100 values), region (5 values) = 500 series
- Bad: request_id (millions of values) = millions of series
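The arithmetic behind these numbers is a product: the upper bound on series for a measurement is the number of distinct values of each tag multiplied together. A small Python sketch, with a hypothetical helper name and illustrative cardinalities:

```python
from math import prod


def max_series(tag_cardinalities: dict[str, int]) -> int:
    """Upper bound on series count: the product of distinct values per tag."""
    return prod(tag_cardinalities.values())


print(max_series({"host": 100, "region": 5}))                           # 500
print(max_series({"host": 100, "region": 5, "request_id": 1_000_000}))  # 500000000
```

Adding a single unbounded tag multiplies the existing series count, which is why identifiers such as request IDs are better kept out of tags.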
Tag Naming
Use consistent tag naming:
- Use key=value format: env=prod, region=us-west
- Sort tags alphabetically for consistent series keys
- Avoid whitespace in tag names
Value Types
Choose appropriate value types:
- Use Float for measurements (temperature, CPU usage)
- Use Int for counts (request count, byte count)
- Use Bool for binary states (online/offline)
- Use Bytes sparingly (no compression support)
Next Steps
TSM Tree Storage
Learn how data is stored and compressed on disk