Overview

This page presents real benchmark results from the Cowrie test suite across multiple language implementations. All benchmarks measure encode/decode throughput, memory allocations, and payload sizes.

Running Benchmarks

Go

cd go
go test -bench=. -benchmem ../benchmarks/

Python

cd python
python ../benchmarks/bench_python.py

Rust

cd rust
cargo bench

Throughput Results

Approximate throughput across implementations (measured on typical hardware):
| Implementation | Gen1 Encode | Gen1 Decode | Gen2 Encode | Gen2 Decode |
|---|---|---|---|---|
| Go | ~500 MB/s | ~600 MB/s | ~300 MB/s | ~400 MB/s |
| Rust | ~450 MB/s | ~550 MB/s | ~250 MB/s | ~350 MB/s |
| Python | ~15 MB/s | ~20 MB/s | ~8 MB/s | ~12 MB/s |
Key Insights:
  • Gen1 is 1.5-2x faster than Gen2 due to simpler encoding (no dictionary building)
  • Decoding is generally faster than encoding
  • Go and Rust have similar performance characteristics
  • Python is 30-40x slower but sufficient for most applications
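To turn these throughput figures into per-payload latency, divide payload size by throughput. A back-of-the-envelope calculation using the approximate numbers from the table above (these are derived estimates, not additional measurements):

```python
def encode_latency_us(payload_bytes: int, throughput_mb_s: float) -> float:
    """Estimated encode time in microseconds at a given throughput (MB/s)."""
    return payload_bytes / (throughput_mb_s * 1e6) * 1e6

payload = 48_000  # the 48KB large-array payload from the size table below

go_gen1 = encode_latency_us(payload, 500)     # ~96 µs per encode
python_gen1 = encode_latency_us(payload, 15)  # ~3200 µs (~3.2 ms) per encode

print(f"Go Gen1:     ~{go_gen1:.0f} µs")
print(f"Python Gen1: ~{python_gen1:.0f} µs")
```

Even the 30-40x slower Python path stays in the low-millisecond range for payloads of this size, which is what "sufficient for most applications" means in practice.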

Payload Size Comparison

Real size measurements from benchmark suite:
| Payload Type | JSON | Gen1 | Gen2 |
|---|---|---|---|
| Small object (3 fields) | 46 bytes | 35 bytes (76%) | 43 bytes (93%) |
| Large array (1000 objects) | 48KB | 34KB (70%) | 23KB (47%) |
| Float array (10K floats) | 86KB | 80KB (93%) | - |
| Graph shard (100 nodes) | - | - | ~10KB |
Key Insights:
  • Gen2 dictionary coding provides ~50% size reduction for repeated schemas
  • Gen1 saves 10-30% over JSON for most payloads
  • Float arrays show minimal benefit without compression
  • Graph types are most efficient in Gen2

Benchmark Scenarios

Small Object (3 fields)

{
    "name": "Alice",
    "age": 30,
    "score": 3.14159
}
Results:
  • JSON: 46 bytes
  • Gen1: 35 bytes (76% of JSON)
  • Gen2: 43 bytes (93% of JSON)
Gen1 is more efficient for single objects due to no dictionary overhead. Gen2 shines with repeated patterns.
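The 46-byte JSON baseline is easy to reproduce with Python's json module (the default separators include a space after each comma and colon):

```python
import json

obj = {"name": "Alice", "age": 30, "score": 3.14159}

# Default json.dumps separators (", " and ": ") yield the 46-byte
# baseline quoted in the results above.
encoded = json.dumps(obj)
print(len(encoded))  # 46
```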

Large Array (1000 objects)

[
    {"id": 0, "name": "item", "value": 0.0},
    {"id": 1, "name": "item", "value": 0.1},
    // ... 998 more with same schema
]
Results:
  • JSON: 48KB
  • Gen1: 34KB (70% of JSON)
  • Gen2: 23KB (47% of JSON)
Gen2’s dictionary coding encodes the three keys (“id”, “name”, “value”) once, then uses small varints for references. This provides massive savings with repeated schemas.
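Cowrie's actual wire format is not shown here, but the principle behind dictionary coding can be sketched with a toy size estimator: emit the distinct keys once as a table, then replace every key occurrence with a one-byte index (a varint in the real format). The function below is illustrative only, not the real encoder:

```python
import json

def toy_dict_coded_size(objects):
    """Rough size estimate for a toy key-dictionary encoding.

    Illustrative sketch, not Cowrie's real wire format:
    - the key table is emitted once (length byte + UTF-8 bytes per key)
    - each key occurrence in an object becomes a 1-byte index
    - values are approximated by their JSON text length
    """
    keys = sorted({k for obj in objects for k in obj})
    table = sum(1 + len(k.encode()) for k in keys)
    body = sum(
        sum(1 + len(json.dumps(v)) for v in obj.values())
        for obj in objects
    )
    return table + body

objects = [{"id": i, "name": "item", "value": i / 10} for i in range(1000)]

json_size = len(json.dumps(objects))
coded_size = toy_dict_coded_size(objects)
print(json_size, coded_size)  # the dictionary-coded estimate is far smaller
```

Because the three key strings are paid for once instead of 1000 times, the estimate lands well under half the JSON size, in line with the 47% figure above.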

Float Array (10,000 elements)

[0.000, 0.001, 0.002, ..., 9.999]
Results:
  • JSON: 86KB (text representation)
  • Gen1: 80KB (binary float64 array)
  • Gen2: Similar to Gen1
Gen1 automatically promotes homogeneous arrays to typed arrays (Float64Array), providing compact binary encoding. Gen2 provides no additional benefit without compression.
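The ~80KB Gen1 figure follows directly from a packed float64 layout: 10,000 values at 8 bytes each. A sketch with Python's struct module illustrates the layout (not Cowrie's actual framing):

```python
import struct

values = [i / 1000 for i in range(10_000)]  # 0.000, 0.001, ..., 9.999

# Packed little-endian float64 array: exactly 8 bytes per element,
# which is where Gen1's ~80KB figure comes from (framing overhead aside).
packed = struct.pack(f"<{len(values)}d", *values)
print(len(packed))  # 80000
```

The JSON size, by contrast, depends on how many digits each float needs in text form, which is why it varies with the data (86KB in the benchmark above).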

Graph Shard (100 nodes, 200 edges)

GraphShard{
    Nodes: [100 nodes with properties],
    Edges: [200 edges with weights],
    Metadata: {"version": 1}
}
Results:
  • Gen2: ~10KB (dictionary-coded properties)
Graph types leverage dictionary coding for repeated property keys across nodes and edges, achieving excellent compression ratios.

Benchmark Categories

1. Encode/Decode Speed

Measures operations per second for various payload sizes. Example Go results:
BenchmarkGen1EncodeSmall        1000000    1200 ns/op    128 B/op    3 allocs/op
BenchmarkGen1EncodeLarge          1000    1.2 ms/op      65KB/op     1003 allocs/op
BenchmarkGen2EncodeSmall         500000    2400 ns/op    256 B/op    5 allocs/op
BenchmarkGen2EncodeLarge           500    2.5 ms/op      45KB/op     1005 allocs/op

2. Memory Allocations

Gen1 typically requires fewer allocations due to simpler encoding. Gen2 adds overhead for dictionary building but reduces payload size.

3. Graph Type Performance

GraphShard Benchmark (100 nodes, 200 edges):
Gen2 Encode: ~500 µs/op
Gen2 Decode: ~400 µs/op
Graph types are optimized for GNN workloads with efficient batch encoding.

Methodology

All benchmarks follow these principles:
  1. Warmup: 10 iterations before timing to stabilize JIT/caching
  2. Iterations: 1000+ iterations for small payloads, 100+ for large
  3. Measurement: High-resolution timers (perf_counter in Python, time.Now in Go)
  4. Memory: Track allocations and bytes allocated per operation
  5. Realistic Data: Mix of integers, floats, strings, nested structures
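A minimal harness following these principles might look like the sketch below; the `bench` helper is illustrative, not part of the actual benchmark suite:

```python
import json
import time

def bench(fn, *, warmup=10, iterations=1000):
    """Time fn() following the methodology above: warm up first,
    then measure many iterations with a high-resolution timer."""
    for _ in range(warmup):          # 1. warmup to stabilize JIT/caching
        fn()
    start = time.perf_counter()      # 3. high-resolution timer
    for _ in range(iterations):      # 2. 1000+ iterations for small payloads
        fn()
    elapsed = time.perf_counter() - start
    return elapsed / iterations      # mean seconds per operation

# Example: benchmark a stand-in encode step (json here, not Cowrie).
payload = {"name": "Alice", "age": 30, "score": 3.14159}
per_op = bench(lambda: json.dumps(payload))
print(f"{per_op * 1e6:.2f} µs/op")
```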

Benchmark Data Sets

  • Small Object: 3 fields, mixed types (string, int, float)
  • Medium Object: 20 fields with nested structure
  • Large Array: 1000 identical objects (tests dictionary coding)
  • Float Array: 10,000 float64 values (tests typed array promotion)
  • Graph Shard: 100 nodes + 200 edges (tests graph type efficiency)

Performance Tips

  1. Use Gen1 for simple APIs: Faster encode/decode, predictable latency
  2. Use Gen2 for repeated schemas: ~50% size savings on logs, events, bulk data
  3. Array promotion threshold: Arrays with 4+ elements auto-promote to typed arrays
  4. Batch graph operations: Use NodeBatch/EdgeBatch for streaming workloads
  5. Compression: Add gzip/zstd for Gen2 when network bandwidth is limited
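Tip 5 can be demonstrated with the standard library. Highly repetitive payloads like the 1000-object array compress dramatically (a JSON stand-in is used here in place of a real Gen2 buffer):

```python
import gzip
import json

# A repeated-schema payload, standing in for a Gen2 buffer (tip 5 above).
payload = json.dumps(
    [{"id": i, "name": "item", "value": i / 10} for i in range(1000)]
).encode()

compressed = gzip.compress(payload)
ratio = len(compressed) / len(payload)
print(f"{len(payload)} -> {len(compressed)} bytes ({ratio:.0%})")
```

Whether the extra CPU cost is worth it depends on the link: on a bandwidth-limited network the size reduction usually dominates.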

Comparative Analysis

Gen1 vs JSON

  • Encode: Gen1 is 2-3x faster (binary format, no escaping)
  • Decode: Gen1 is 3-5x faster (no parsing, direct binary read)
  • Size: 10-30% smaller (binary integers, no text overhead)

Gen2 vs Gen1

  • Encode: Gen2 is 40-50% slower (dictionary building)
  • Decode: Gen2 is 30-40% slower (dictionary lookups)
  • Size: 20-50% smaller for repeated schemas, similar otherwise

Cowrie vs Protocol Buffers

  • Flexibility: Cowrie supports dynamic schemas, Protobuf requires .proto files
  • Speed: Similar performance for typed data
  • Size: Gen2 with compression is comparable to Protobuf
  • Use case: Cowrie for JSON-like flexibility, Protobuf for strict schemas

Next Steps

  • See Optimization for tuning tips
  • See Comparison for format selection guidance
  • Run benchmarks yourself to validate on your hardware
