Overview

Slung is a real-time time-series database built in Zig that combines streaming ingestion with historical storage. It processes temporal data streams while maintaining full historical context through an efficient on-disk storage engine.

Core Components

Slung’s architecture consists of four main subsystems:

1. Storage Engine (TSM Tree)

The Time-Structured Merge (TSM) tree is Slung’s storage foundation:
  • In-memory cache: Skip list structure with 16 levels for fast insertion and range queries
  • Disk entries: Compressed columnar storage with page-based indexing
  • Bloom filters: 1024-bit approximate membership query (AMQ) filters for fast series-existence checks
  • Encoding: Gorilla delta-of-delta compression for timestamps, typed values for data
Key configuration:
  • MAX_CACHE_POINTS = 1_000_000: automatic cache-flush threshold
  • max_level = 100_000: maximum disk entry levels
  • page_size = 4096 bytes: disk page size
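
The Bloom-filter check above can be sketched in a few lines. This is an illustrative Python model, not Slung's Zig implementation: the 1024-bit size comes from the text, while the double-hashing scheme and the number of probes are assumptions.

```python
import hashlib

class SeriesBloom:
    """Illustrative 1024-bit Bloom filter for series-existence checks.
    The bit width matches the text; the hashing scheme is an assumption."""

    BITS = 1024
    HASHES = 7  # probes per key (assumed, not from the source)

    def __init__(self):
        self.bits = bytearray(self.BITS // 8)

    def _probes(self, key: str):
        # Derive two 64-bit hashes from one SHA-256 digest, then combine
        # them (classic double hashing) to get HASHES bit positions.
        digest = hashlib.sha256(key.encode()).digest()
        h1 = int.from_bytes(digest[:8], "little")
        h2 = int.from_bytes(digest[8:16], "little") | 1  # odd step
        for i in range(self.HASHES):
            yield (h1 + i * h2) % self.BITS

    def add(self, series: str):
        for bit in self._probes(series):
            self.bits[bit // 8] |= 1 << (bit % 8)

    def might_contain(self, series: str) -> bool:
        # False means definitely absent; True means "possibly present",
        # so a negative answer lets a query skip the disk entry entirely.
        return all(self.bits[b // 8] >> (b % 8) & 1 for b in self._probes(series))
```

A miss costs only a hash and a few bit tests, which is why the filter sits in front of the disk-entry read path.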

2. Query Engine

The query engine provides a DSL for filtering and aggregating time-series data:
OP:SERIES:[TAGS]:[RANGE]
Supported operations:
  • AVG, MIN, MAX, SUM, COUNT
  • Tag filtering with AND, OR, NOT operators
  • Time range expressions: [1m,now], [1700000000,1700000100]
  • Relative time units: seconds, minutes, hours, days, weeks
The engine executes queries across both cache and disk, merging results transparently.

3. WASM Runtime

Slung executes user-defined functions in WebAssembly for real-time stream processing:
  • zware VM: Zig-based WebAssembly interpreter
  • Host functions: 9 exposed functions for querying, writing, and callbacks
  • Lifecycle management: Spawn/teardown with execution isolation
  • Write-back support: HTTP and WebSocket output channels
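
The host-function pattern above can be modeled abstractly. This Python sketch is not the zware API: the names `query` and `write` and the shape of the import table are hypothetical, and it only illustrates the isolation idea that a guest function sees a fixed table of host functions rather than the database itself.

```python
class HostEnv:
    """Illustrative host-side environment for one guest function.
    Function names here are hypothetical; the real runtime exposes
    nine host functions through the WASM import mechanism."""

    def __init__(self, db: dict):
        self.db = db          # stands in for cache + disk entries
        self.outbox = []      # write-back buffer (HTTP/WebSocket in Slung)

    def imports(self) -> dict:
        # The guest sees only this table, never the database directly.
        return {"query": self.db.get, "write": self.outbox.append}

def run_isolated(env: HostEnv, guest_fn, event: dict):
    """One spawn/teardown cycle: hand the guest its imports and the event."""
    return guest_fn(env.imports(), event)
```

A guest that averages recent values and writes the result back only ever touches the `query` and `write` entries of its import table.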

4. Stream Pipeline

Non-blocking streaming architecture:
  • WebSocket channels for real-time event ingestion
  • Stream rx/tx pool manager for connection handling
  • Integration with WASM runtime for processing
  • Query event broadcasting to active live queries
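
The broadcast step can be sketched as a small fan-out hub. This is an illustrative Python model under assumptions: each live query is a predicate plus a bounded per-subscriber buffer, so a slow consumer never back-pressures ingestion.

```python
from collections import deque

class LiveQueryHub:
    """Illustrative fan-out of ingested events to active live queries."""

    def __init__(self, buffer_size: int = 1024):
        self.buffer_size = buffer_size
        self.subscribers = []  # (predicate, buffer) pairs

    def subscribe(self, predicate):
        # Bounded buffer: a slow consumer drops its oldest events
        # instead of blocking the ingestion path.
        buf = deque(maxlen=self.buffer_size)
        self.subscribers.append((predicate, buf))
        return buf

    def broadcast(self, event: dict):
        for predicate, buf in self.subscribers:
            if predicate(event):
                buf.append(event)
```

Each incoming event is tested against every active live query; only matching subscribers see it.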

Data Flow

  1. Ingestion: Events arrive via WebSocket → validated → inserted into cache
  2. Cache management: When cache reaches 1M points → flush to disk entry
  3. Disk storage: Columnar format with separate .dat (data) and .idx (index) files
  4. Query execution:
    • Parse query DSL
    • Filter series by tags and Bloom filter
    • Read from cache + disk entries
    • Merge and aggregate results
  5. WASM processing: Events trigger WASM functions → process → write back
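
Steps 1 and 2 of the flow can be sketched as a threshold check on the write path. The 1M-point threshold comes from the configuration above; the real cache is a 16-level skip list, for which a dict of lists stands in here.

```python
MAX_CACHE_POINTS = 1_000_000  # flush threshold from the configuration above

class WriteCache:
    """Illustrative cache-flush control flow for the ingestion path."""

    def __init__(self, flush_fn, max_points=MAX_CACHE_POINTS):
        self.series = {}
        self.count = 0
        self.flush_fn = flush_fn  # writes a disk entry (.dat + .idx)
        self.max_points = max_points

    def insert(self, key: str, ts: int, value: float):
        self.series.setdefault(key, []).append((ts, value))
        self.count += 1
        if self.count >= self.max_points:
            self.flush_fn(self.series)  # hand the snapshot to the storage engine
            self.series, self.count = {}, 0
```

Flushing in fixed-size batches is what keeps write latency flat: every insert is an in-memory append, and the disk entry is written once per million points.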

Storage Format

Disk entries use a columnar format:
  • Magic header: SLZ01
  • Metadata: row count, timestamps, version
  • Column descriptors: name, offset, size, page count
  • Page descriptors: data offset/size, row range per series
  • Indexes: series index, row index, Bloom filter
  • Footer: metadata + offsets
Two encoding versions:
  • VERSION_DELTA_COMPRESSED (1): Zigzag varint delta encoding
  • VERSION_GORILLA (2): Delta-of-delta bit encoding
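
The version-1 scheme (zigzag varint deltas) fits in a few lines. This Python sketch shows the encoding idea only; framing details such as how columns are delimited on disk are assumptions, not the actual file layout.

```python
def zigzag(n: int) -> int:
    # Interleave signed deltas into unsigned ints: 0,-1,1,-2,2 -> 0,1,2,3,4
    # (assumes values fit in 64 bits, as for nanosecond timestamps).
    return (n << 1) ^ (n >> 63)

def unzigzag(u: int) -> int:
    return (u >> 1) ^ -(u & 1)

def encode_column(values: list[int]) -> bytes:
    """Delta-encode, zigzag, then varint-pack a column of integers."""
    out, prev = bytearray(), 0
    for v in values:
        u = zigzag(v - prev)
        prev = v
        while u >= 0x80:               # 7 payload bits per byte,
            out.append((u & 0x7F) | 0x80)  # high bit = "more bytes follow"
            u >>= 7
        out.append(u)
    return bytes(out)

def decode_column(buf: bytes) -> list[int]:
    vals, prev, i = [], 0, 0
    while i < len(buf):
        u = shift = 0
        while True:
            b = buf[i]; i += 1
            u |= (b & 0x7F) << shift
            shift += 7
            if b < 0x80:
                break
        prev += unzigzag(u)
        vals.append(prev)
    return vals
```

Regularly spaced timestamps shrink dramatically: 100 one-second timestamps take 104 bytes here versus 800 bytes raw, since each delta of 1 encodes as a single byte. Gorilla (version 2) goes further by encoding the change between successive deltas at the bit level, which is ideal when the interval itself is nearly constant.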

Performance Characteristics

From billion-point benchmark (i5-10210U CPU, SAMSUNG MZVLB256 SSD):
  • Write throughput: 1.29M writes/second
  • Write latency: 772ns per point
  • Storage efficiency: 9.12 bytes per point
  • Memory usage: 575 MiB peak for 1B points
  • Query latency: ~160ms for 1M point aggregation
See Performance for detailed benchmarks.
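
The headline numbers are mutually consistent, which a quick arithmetic check confirms:

```python
# 772 ns per point implies roughly 1.30M writes/second.
throughput = 1 / 772e-9
assert 1.25e6 < throughput < 1.35e6   # ~1.295M/s, matching the 1.29M figure

# 9.12 bytes per point over one billion points:
total_gib = 1_000_000_000 * 9.12 / 2**30
assert 8.4 < total_gib < 8.6          # ~8.49 GiB on disk for the full run
```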