Cowrie decoders enforce security limits to prevent denial-of-service attacks, memory exhaustion, and CPU spin attacks. These limits provide defense-in-depth protection beyond basic sanity checks.
Overview
Security limits are enforced at decode time and can be customized via DecodeOptions. Default limits are designed to support large ML workloads while preventing extreme allocations.
Two-Layer Protection
- Sanity Checks (always enforced): Length cannot exceed remaining data
- Security Limits (configurable): Absolute maximums even for well-formed data
// Example: Decoding a string
length := readVarint() // Attacker claims 1GB
// Layer 1: Sanity check
if length > remaining_bytes {
return ErrMalformedLength // Fail fast
}
// Layer 2: Security limit
if length > MaxStringLen {
return ErrStringTooLarge // Prevent legitimate but huge allocation
}
data := read(length) // Safe to allocate
Default Limits
const (
DefaultMaxDepth = 1000 // Maximum nesting depth
DefaultMaxArrayLen = 100_000_000 // 100M elements
DefaultMaxObjectLen = 10_000_000 // 10M fields
DefaultMaxStringLen = 500_000_000 // 500MB strings
DefaultMaxBytesLen = 1_000_000_000 // 1GB bytes (tensors, images, audio)
DefaultMaxExtLen = 100_000_000 // 100MB max extension payload
DefaultMaxDictLen = 10_000_000 // 10M dictionary entries
DefaultMaxHintCount = 10_000 // 10K column hints
DefaultMaxRank = 32 // Maximum tensor rank
)
These defaults support real ML workloads:
- 768-dim embeddings: ~3KB per embedding → ~325K embeddings fit in a single MaxBytesLen (1GB) buffer
- Large language model responses: Multi-paragraph text fits in MaxStringLen
- Graph databases: Millions of nodes/edges fit in MaxArrayLen
DecodeOptions
Configure limits for your use case:
import "github.com/Neumenon/cowrie"
// Use defaults
val, err := cowrie.Decode(data)
// Custom limits
opts := cowrie.DecodeOptions{
MaxDepth: 500, // Limit nesting (JSON bomb protection)
MaxArrayLen: 1_000_000, // Limit array size
MaxObjectLen: 100_000, // Limit object fields
MaxStringLen: 10_000_000, // 10MB strings
MaxBytesLen: 100_000_000, // 100MB binary data
MaxExtLen: 50_000_000, // 50MB extensions
MaxDictLen: 1_000_000, // 1M dictionary keys
MaxHintCount: 1_000, // 1K column hints
MaxRank: 16, // 16D tensors max
}
val, err := cowrie.DecodeWithOptions(data, opts)
Zero Values Use Defaults
opts := cowrie.DecodeOptions{
MaxDepth: 100, // Override
// MaxArrayLen: 0 → Uses DefaultMaxArrayLen (100M)
}
Unlimited (Not Recommended)
opts := cowrie.DecodeOptions{
MaxDepth: -1, // Unlimited (DANGEROUS!)
}
Only use unlimited for trusted input (e.g., internal files).
Limit Descriptions
MaxDepth
Protects against: Nested structure attacks (stack overflow, CPU spin)
// Attack: 1000 levels deep
{"a": {"a": {"a": {"a": ...}}}}
Default: 1000 levels (enough for legitimate data)
Typical values:
- APIs: 50-100 (shallow documents)
- Databases: 500-1000 (complex objects)
- File processing: 1000+ (deeply nested config)
MaxArrayLen
Protects against: Memory exhaustion via huge arrays
// Attack: Claim 1B elements (8GB+ allocation)
Tag(0x06) | count:varint(1000000000) | ...
Default: 100M elements
Typical values:
- APIs: 1M-10M (paginated responses)
- ML workloads: 100M+ (large embedding batches)
- Graphs: 100M+ (large node/edge batches)
Memory impact:
- 100M int64: ~800MB
- 100M float32: ~400MB
- 100M strings: Variable (depends on content)
MaxObjectLen
Protects against: Memory exhaustion via huge objects
// Attack: 10M fields (massive dictionary + object overhead)
Tag(0x07) | count:varint(10000000) | ...
Default: 10M fields
Typical values:
- APIs: 1K-10K fields (reasonable documents)
- Databases: 100K-1M fields (wide tables)
- Analytics: 10M+ fields (event aggregations)
Memory impact:
- 10M fields × 32 bytes/field ≈ 320MB overhead
- Plus dictionary keys (encoded once)
- Plus field values (varies)
MaxStringLen
Protects against: Memory exhaustion via huge strings
// Attack: 1GB string
Tag(0x05) | len:varint(1000000000) | ...
Default: 500MB
Typical values:
- APIs: 1MB-10MB (documents, logs)
- LLM responses: 100MB-500MB (long-form generation)
- Files: 500MB+ (processing large text)
Why 500MB? A GPT-4-scale context (~200K tokens × ~4 bytes/token ≈ 800KB of UTF-8) fits comfortably, and long-form responses can still run to multiple megabytes.
MaxBytesLen
Protects against: Memory exhaustion via binary data (tensors, images, audio)
// Attack: 10GB tensor
Tag(0x20) | ... | dataLen:varint(10000000000) | ...
Default: 1GB
Typical values:
- APIs: 10MB-100MB (small images, embeddings)
- ML workloads: 1GB+ (large tensors, batches)
- Media: 100MB-1GB+ (high-res images, audio)
Examples:
- 768-dim float32 embedding: 3KB
- 10K embeddings: 30MB
- 1M embeddings: 3GB (exceeds default!)
- 1920×1080 JPEG: ~1MB
- 4K raw RGB: 24MB
MaxExtLen
Protects against: Unknown extension payload attacks
// Attack: 1GB unknown extension
Tag(0x0E) | extType:varint | len:varint(1000000000) | ...
Default: 100MB
Typical values:
- Standard: 10MB-100MB (forward compatibility)
- Strict: 1MB (reject large unknown data)
MaxDictLen
Protects against: Dictionary explosion (CPU spin, memory)
// Attack: 10M dictionary keys
DictLen:varint(10000000) | (len:varint | bytes)* | ...
Default: 10M entries (same as MaxObjectLen)
Typical values:
- APIs: 1K-10K keys (typical schemas)
- Large objects: 1M-10M keys (wide tables, many graphs)
Memory impact:
- 10M keys × 20 bytes avg ≈ 200MB dictionary
- Plus hash map overhead
MaxHintCount
Protects against: Column hints CPU spin attack
// Attack: 1M column hints (causes long parsing time)
HintCount:varint(1000000) | (field + type + shape + flags)* | ...
Default: 10K hints
Typical values:
- Standard: 100-1000 columns (wide tables)
- Large: 10K+ columns (ultra-wide analytics)
MaxRank
Protects against: Tensor dimension explosion
// Attack: 255 dimensions (causes huge offset calculations)
Tag(0x20) | dtype | rank:u8(255) | dims:varint*255 | ...
Default: 32 dimensions
Typical values:
- Standard ML: 4-8 dimensions (batches, channels, height, width, etc.)
- Advanced: 16-32 dimensions (attention heads, multiple batches)
Why 32? Enough for complex architectures:
- 4D: [batch, channels, height, width]
- 6D: [batch, time, layers, heads, seq, hidden]
- 32D: Extreme multi-dimensional tensors
Wire limit: u8 max = 255 dimensions (but decoder rejects > MaxRank)
Attack Scenarios
1. Nested Object Bomb
Attack: Deeply nested objects to exhaust stack or spin CPU
{"a":{"a":{"a":{"a": ... 10000 levels}}}}
Protection: MaxDepth limit
opts := cowrie.DecodeOptions{MaxDepth: 100}
_, err := cowrie.DecodeWithOptions(malicious, opts)
// err == cowrie.ErrDepthExceeded
2. Array Length Bomb
Attack: Claim huge array to allocate gigabytes
Tag(0x06) | count:varint(1000000000) | ...
Protection: MaxArrayLen + sanity check
// Decoder checks:
if count > MaxArrayLen {
return ErrArrayTooLarge // Security limit
}
if count > remaining_bytes {
return ErrMalformedLength // Sanity check
}
3. Dictionary Explosion
Attack: 10M dictionary keys to exhaust memory + CPU
DictLen:varint(10000000) | key1 | key2 | ... | key10M | ...
Protection: MaxDictLen + sanity check
if dictLen > MaxDictLen {
return ErrDictTooLarge
}
if dictLen > remaining_bytes {
return ErrMalformedLength
}
4. Decompression Bomb
Attack: 1KB compressed → 10GB decompressed
Flags:0x03 (compressed gzip) | OrigLen:varint(10000000000) | [1KB of compressed data]
Protection: MaxDecompressedSize limit
const MaxDecompressedSize = 256 * 1024 * 1024 // 256MB
limited := io.LimitReader(gzipReader, MaxDecompressedSize+1)
out, err := io.ReadAll(limited)
if err != nil {
return err
}
if len(out) > MaxDecompressedSize {
return cowrie.ErrDecompressedTooLarge
}
See Compression for details.
5. Tensor Rank Bomb
Attack: 255-dimensional tensor to cause overflow in size calculations
Tag(0x20) | dtype | rank:u8(255) | dims:[1,1,1,...,1] | dataLen:varint(1) | [1 byte]
Protection: MaxRank limit
if rank > MaxRank {
return ErrMalformedLength
}
6. Column Hints CPU Spin
Attack: 1M column hints to slow down header parsing
FlagHasColumnHints | HintCount:varint(1000000) | (field + type + shape + flags)*1M | ...
Protection: MaxHintCount limit
if hintCount > MaxHintCount {
return ErrTooManyHints
}
Error Handling
All limit violations return specific errors:
val, err := cowrie.DecodeWithOptions(data, opts)
switch err {
case cowrie.ErrDepthExceeded:
log.Println("Nested too deep")
case cowrie.ErrArrayTooLarge:
log.Println("Array too large")
case cowrie.ErrObjectTooLarge:
log.Println("Object too large")
case cowrie.ErrStringTooLarge:
log.Println("String too large")
case cowrie.ErrBytesTooLarge:
log.Println("Bytes/tensor too large")
case cowrie.ErrExtTooLarge:
log.Println("Extension too large")
case cowrie.ErrDictTooLarge:
log.Println("Dictionary too large")
case cowrie.ErrTooManyHints:
log.Println("Too many column hints")
case cowrie.ErrMalformedLength:
log.Println("Length exceeds remaining data (malicious)")
default:
log.Println("Other error:", err)
}
Recommended Configurations
Public API (Untrusted)
opts := cowrie.DecodeOptions{
MaxDepth: 100, // Shallow documents
MaxArrayLen: 1_000_000, // 1M elements max
MaxObjectLen: 10_000, // 10K fields max
MaxStringLen: 10_000_000, // 10MB strings
MaxBytesLen: 100_000_000, // 100MB binary
MaxExtLen: 10_000_000, // 10MB extensions
MaxDictLen: 10_000, // 10K keys
MaxHintCount: 100, // 100 column hints
MaxRank: 8, // 8D tensors max
OnUnknownExt: cowrie.UnknownExtError, // Reject unknown extensions
}
Profile: Conservative, protects against abuse, suitable for user-facing APIs.
Internal Service (Semi-Trusted)
opts := cowrie.DecodeOptions{
MaxDepth: 500,
MaxArrayLen: 10_000_000,
MaxObjectLen: 100_000,
MaxStringLen: 100_000_000,
MaxBytesLen: 500_000_000,
MaxExtLen: 50_000_000,
MaxDictLen: 100_000,
MaxHintCount: 1_000,
MaxRank: 16,
}
Profile: Moderate, allows larger payloads, suitable for service-to-service communication.
ML Workload (Trusted)
opts := cowrie.DefaultDecodeOptions() // Use generous defaults
// or
opts := cowrie.DecodeOptions{
MaxDepth: 1000,
MaxArrayLen: 100_000_000,
MaxObjectLen: 10_000_000,
MaxStringLen: 500_000_000,
MaxBytesLen: 2_000_000_000, // 2GB for large tensors
MaxExtLen: 100_000_000,
MaxDictLen: 10_000_000,
MaxHintCount: 10_000,
MaxRank: 32,
}
Profile: Permissive, supports large ML payloads, suitable for trusted data pipelines.
Strict Mode (Maximum Security)
opts := cowrie.DecodeOptions{
MaxDepth: 50,
MaxArrayLen: 10_000,
MaxObjectLen: 1_000,
MaxStringLen: 1_000_000, // 1MB
MaxBytesLen: 10_000_000, // 10MB
MaxExtLen: 1_000_000, // 1MB
MaxDictLen: 1_000,
MaxHintCount: 50,
MaxRank: 4,
OnUnknownExt: cowrie.UnknownExtError,
}
Profile: Paranoid, rejects anything unusual, suitable for high-security environments.
Performance
Limit checks have negligible overhead (less than 1% CPU) because they are single fail-fast comparisons:
// Fast: Single comparison
if count > MaxArrayLen {
return ErrArrayTooLarge
}
// No allocation until after limit check
items := make([]*Value, count) // Only if count <= MaxArrayLen
Benchmark (100KB payload):
- No limits: 1.2ms decode
- With limits: 1.21ms decode (~1% overhead)
Limits save time by rejecting malicious payloads early.
Monitoring
Track limit violations to detect attacks:
func DecodeWithMetrics(data []byte) (*cowrie.Value, error) {
val, err := cowrie.Decode(data)
switch err {
case cowrie.ErrArrayTooLarge,
cowrie.ErrObjectTooLarge,
cowrie.ErrStringTooLarge,
cowrie.ErrBytesTooLarge,
cowrie.ErrDictTooLarge:
metrics.Increment("cowrie.limit_exceeded", map[string]string{
"error": err.Error(),
})
log.Warn("Limit exceeded", "error", err, "size", len(data))
}
return val, err
}
Best Practices
- Use Defaults for ML: Default limits support real ML workloads
- Tighten for APIs: Reduce limits for user-facing endpoints
- Monitor Violations: Track ErrXxxTooLarge errors
- Reject Unknown Extensions: Set OnUnknownExt to Error for strict mode
- Combine with Rate Limiting: Limit violations may indicate attack
- Test Edge Cases: Verify your limits with real data
Unknown Extension Behavior
Control how the decoder handles unknown TagExt extensions:
type UnknownExtBehavior int
const (
UnknownExtKeep UnknownExtBehavior = iota // Preserve (default)
UnknownExtSkipAsNull // Skip, return null
UnknownExtError // Error (strict mode)
)
opts := cowrie.DecodeOptions{
OnUnknownExt: cowrie.UnknownExtError, // Reject unknown data
}
Use cases:
- Keep: Forward compatibility, round-trip preservation
- Skip: Ignore unknown extensions silently
- Error: Strict validation, reject unknown data