Data structures
Entry
Each key-value pair is stored in anEntry node:
hashtable.h
expires_at == 0 is the sentinel for “this key never expires”. Any non-zero value is an absolute Unix timestamp; the key is considered expired once time(NULL) >= expires_at.
HashTable
hashtable.h
size field tracks the bucket array length. count tracks live entries and drives the load factor calculation. resizes is a monotonic counter exposed by the INFO command.
Hash function
RadishDB uses a djb2 variant — a multiply-and-add hash that distributes ASCII keys well in practice:hashtable.c
hash(key) % ht->size. Because size is always a power of two after the first resize, this modulo is computed efficiently.
djb2 is not cryptographic. Key distribution is good for typical workloads, but a carefully crafted set of keys can cause hash collisions. RadishDB does not implement hash randomization.
Separate chaining
When two keys map to the same bucket, they form a singly-linked list through thenext pointer. Lookup walks the chain comparing keys with strcmp until it finds a match or reaches the end.
Auto-resize at 0.75 load factor
Before every insert,ht_set computes the current load factor:
hashtable.c
ht_resize allocates a new bucket array twice as large, re-hashes every existing entry into the new array, and frees the old array. The resizes counter increments with each operation.
This keeps the average chain length short (below 1.0) and lookup time close to O(1) regardless of database size.
Memory management
All strings are heap-allocated viastrdup at insert time. The hash table owns these allocations:
- On
SET: newkeyandvaluestrings arestrdup’d into the entry. - On overwrite: the old
valuestring isfree’d before the new one is stored. - On
DELor expiry: bothkeyandvaluearefree’d and theEntryitself isfree’d. - On
ht_resize: entries are re-linked into new buckets — the strings themselves are not re-allocated.
Complexity
| Operation | Average | Worst case |
|---|---|---|
SET | O(1) amortized | O(n) during resize |
GET | O(1) | O(n) — full chain scan |
DEL | O(1) | O(n) — full chain scan |
TTL | O(1) | O(n) — full chain scan |
| Resize | O(n) | O(n) |
SET holds because resize work is spread across many insertions. Resize itself is O(n) but happens at most O(log n) times over n total inserts.
INFO metrics
TheINFO command exposes the following hash table statistics:
| Field | Description |
|---|---|
keys | Total number of live entries |
keys_with_ttl | Entries that have a non-zero expires_at |
expired_keys | Cumulative count of keys deleted by expiry |
buckets | Current bucket array size |
load_factor | count / size as a float |
resizes | Total number of resize operations |
max_chain | Length of the longest chain in the current table |