Jasonisnthappy is designed for high performance. This guide covers optimization techniques, benchmarks, and best practices.

Performance characteristics

Key metrics

Based on production benchmarks:

Bulk inserts

~19,000 docs/sec for 1000-doc batches

Single writes

~8ms per insert with full ACID guarantees

Reads

Sub-millisecond queries on collections with 2500+ documents

Concurrent scaling

Linear scaling with thread count up to core count

ACID compliance

All operations include:
  • Full ACID transaction support
  • MVCC snapshot isolation
  • Write-ahead logging (WAL)
  • Crash recovery
Performance numbers include durability guarantees (fsync). Jasonisnthappy doesn’t sacrifice safety for speed.

Write optimization

Batch operations

Use bulk inserts instead of individual inserts.
// Insert 1000 documents in one transaction
let docs: Vec<Value> = (0..1000)
    .map(|i| json!({"id": i, "name": format!("User {}", i)}))
    .collect();

let ids = users.insert_many(docs)?;  // ~50ms
// Throughput: ~19,000 docs/sec
Optimal batch size: 100-1000 documents per insert_many call. This range balances throughput against memory usage.
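To stay inside that range on larger imports, a document set can be split into fixed-size batches before calling insert_many. This is a minimal std-only sketch; the insert_many call itself is left as a comment since it needs a live collection.

```rust
/// Split a set of `total` documents into insert_many-sized batches,
/// returning the size of each batch (at most `size` documents).
fn batch_sizes(total: usize, size: usize) -> Vec<usize> {
    (0..total)
        .collect::<Vec<_>>()
        .chunks(size)
        .map(|batch| batch.len())
        .collect()
}

fn main() {
    // 4,500 docs: four full batches of 1,000 plus a final batch of 500.
    let sizes = batch_sizes(4_500, 1_000);
    assert_eq!(sizes, vec![1_000, 1_000, 1_000, 1_000, 500]);
    // Each batch then goes through a single transaction:
    // let ids = users.insert_many(batch)?;
    println!("{} batches", sizes.len());
}
```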

Bulk write operations

Use bulk_write for mixed operations.
use jasonisnthappy::Database;
use serde_json::json;

let users = db.collection("users");

// Efficient: single transaction for multiple operations
let result = users.bulk_write()
    .insert(json!({"name": "Alice"}))
    .insert(json!({"name": "Bob"}))
    .update_one("name is \"Alice\"", json!({"age": 30}))
    .delete_many("status is \"inactive\"")
    .execute()?;

println!("Inserted: {}, Updated: {}, Deleted: {}",
    result.inserted_count,
    result.updated_count,
    result.deleted_count
);

Unordered writes

For maximum throughput, use unordered bulk writes.
// Unordered: continue on errors, faster
let result = users.bulk_write()
    .ordered(false)  // Don't stop on first error
    .insert(json!({"name": "User1"}))
    .insert(json!({"name": "User2"}))
    // ... 100s of operations ...
    .execute()?;

if !result.errors.is_empty() {
    eprintln!("Some operations failed: {:?}", result.errors);
}

Checkpoint management

Control when the WAL is checkpointed.
// Default: auto-checkpoint every 1000 WAL frames
let db = Database::open("my.db")?;

// For bulk imports: increase threshold
db.set_auto_checkpoint_threshold(10000);  // Checkpoint less often

// ... bulk insert operations ...

// Manual checkpoint after bulk import
db.checkpoint()?;

// Restore default
db.set_auto_checkpoint_threshold(1000);
Increasing the checkpoint threshold reduces write amplification during bulk imports but uses more WAL space.
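To see why raising the threshold helps, count the auto-checkpoints a bulk import would trigger. This sketch assumes, purely for illustration, one WAL frame per inserted document; the real frame count depends on document size and page layout.

```rust
/// Auto-checkpoints triggered while writing `frames` WAL frames,
/// with a checkpoint fired every `threshold` frames.
fn auto_checkpoints(frames: u64, threshold: u64) -> u64 {
    frames / threshold
}

fn main() {
    // Importing 100,000 frames: the default threshold checkpoints
    // 100 times, the raised threshold only 10 times.
    assert_eq!(auto_checkpoints(100_000, 1_000), 100);
    assert_eq!(auto_checkpoints(100_000, 10_000), 10);
    println!(
        "default: {}, raised: {}",
        auto_checkpoints(100_000, 1_000),
        auto_checkpoints(100_000, 10_000)
    );
}
```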

Read optimization

Use indexes

Indexes are crucial for query performance.
// Slow: scans entire collection
let user = users.find("email is \"alice@example.com\"")?;
// Time: O(n) - 50ms for 10,000 documents
See the Indexes guide for details.
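The gap between an unindexed and an indexed lookup can be illustrated with plain std collections: a linear scan models the unindexed query, and a HashMap models a field index. This is a conceptual sketch, not the database's index implementation.

```rust
use std::collections::HashMap;

/// Unindexed lookup: an O(n) scan over every document.
fn find_by_scan(docs: &[(u32, String)], email: &str) -> Option<u32> {
    docs.iter()
        .find(|(_, e)| e.as_str() == email)
        .map(|(id, _)| *id)
}

/// Build a field index once; lookups are then O(1) on average.
fn build_index(docs: &[(u32, String)]) -> HashMap<String, u32> {
    docs.iter().map(|(id, e)| (e.clone(), *id)).collect()
}

fn main() {
    // 10,000 documents, each with an id and an email field.
    let docs: Vec<(u32, String)> = (0..10_000)
        .map(|i| (i, format!("user{}@example.com", i)))
        .collect();

    let index = build_index(&docs);

    // Both find the same document; the scan touches up to n entries,
    // the index lookup touches a single bucket.
    assert_eq!(find_by_scan(&docs, "user9999@example.com"), Some(9_999));
    assert_eq!(index.get("user9999@example.com").copied(), Some(9_999));
}
```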

Query optimization

Use the most specific query method.
// Best: Direct ID lookup
let user = users.find_by_id("user_123")?;  // 0.01ms

// Good: find_one with index
let user = users.find_one("email is \"alice@example.com\"")?;  // 0.02ms

// Slower: find (returns all matches)
let results = users.find("email is \"alice@example.com\"")?;  // 0.03ms
let user = results.first();

// Slowest: find_all then filter in code
let all_users = users.find_all()?;  // 2ms for 10,000 docs
let user = all_users.iter().find(|u| u["email"] == "alice@example.com");

Projection

Fetch only the fields you need.
let results = users.query()
    .filter("age > 25")
    .project(&["name", "email"])  // Only load 2 fields
    .execute()?;
// Less data transferred from disk

Typed operations

Use typed methods to avoid JSON parsing overhead.
use serde::{Serialize, Deserialize};

#[derive(Serialize, Deserialize)]
struct User {
    name: String,
    email: String,
}

// Faster: direct deserialization
let user: Option<User> = users.find_by_id_typed("user_123")?;

// Slower: JSON Value then application-level parsing
let user_value = users.find_by_id("user_123")?;
let user: User = serde_json::from_value(user_value)?;

Transaction optimization

Minimize transaction scope

Keep transactions short and focused.
let mut tx = db.begin()?;
let users = tx.collection("users");
users.insert(json!({"name": "Alice"}))?;
tx.commit()?;  // Fast commit

Use retryable transactions

Retryable transactions resolve write conflicts automatically by retrying with backoff.
use jasonisnthappy::TransactionConfig;

// Configure retries
db.set_transaction_config(TransactionConfig {
    max_retries: 5,
    retry_backoff_base_ms: 1,
    max_retry_backoff_ms: 100,
});

// Run with automatic retry
let result = db.run_transaction(|tx| {
    let users = tx.collection("users");
    let count = users.count()?;
    
    users.insert(json!({"count": count}))?;
    Ok(count)
})?;
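The delay schedule implied by that config can be sketched as exponential backoff from retry_backoff_base_ms up to max_retry_backoff_ms. The doubling-per-attempt curve is an assumption for illustration; only the base and cap values come from the config above.

```rust
/// Hypothetical backoff: base * 2^attempt milliseconds, capped.
/// (The doubling policy is assumed, not taken from the library.)
fn backoff_ms(attempt: u32, base_ms: u64, cap_ms: u64) -> u64 {
    base_ms
        .saturating_mul(1u64 << attempt.min(63))
        .min(cap_ms)
}

fn main() {
    // base 1 ms, cap 100 ms, max_retries 5: delays for attempts 0..5.
    let delays: Vec<u64> = (0..5).map(|a| backoff_ms(a, 1, 100)).collect();
    assert_eq!(delays, vec![1, 2, 4, 8, 16]);
    println!("{:?}", delays);
}
```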

MVCC benefits

Reads never block writes, and writes never block reads.
use std::thread;
use std::time::Duration;

// Reader thread (never blocked)
let db_clone = db.clone();
let reader = thread::spawn(move || {
    let users = db_clone.collection("users");
    loop {
        let count = users.count().unwrap();
        println!("Count: {}", count);
        thread::sleep(Duration::from_millis(10));
    }
});

// Writer thread (runs concurrently)
let db_clone = db.clone();
let writer = thread::spawn(move || {
    let users = db_clone.collection("users");
    loop {
        users.insert(json!({"data": "test"})).unwrap();
        thread::sleep(Duration::from_millis(100));
    }
});

// Both threads run concurrently without blocking

Memory optimization

Page cache

Configure the page cache size.
use jasonisnthappy::DatabaseOptions;

let db = Database::open_with_options("my.db", DatabaseOptions {
    cache_size: 50_000,  // 50K pages = ~200MB
    ..Default::default()
})?;
Cache sizing guide:
  • Small DBs (< 100MB): 10,000 pages (~40MB)
  • Medium DBs (< 1GB): 25,000 pages (~100MB)
  • Large DBs (> 1GB): 50,000+ pages (~200MB+)
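Those figures are consistent with roughly 4 KiB per cached page, so a given cache_size translates to memory as pages × 4 KiB. The 4 KiB page size here is an assumption inferred from the ~200 MB for 50,000 pages figure above, not a documented constant.

```rust
/// Approximate cache memory in MiB for `pages` cached pages,
/// assuming 4 KiB per page.
fn cache_mib(pages: u64) -> u64 {
    pages * 4096 / (1024 * 1024)
}

fn main() {
    assert_eq!(cache_mib(10_000), 39);   // ~40 MB for small DBs
    assert_eq!(cache_mib(25_000), 97);   // ~100 MB for medium DBs
    assert_eq!(cache_mib(50_000), 195);  // ~200 MB for large DBs
    println!("{} MiB", cache_mib(50_000));
}
```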

Limit result sets

Use pagination to avoid loading large result sets.
// Good: Paginate large result sets
let page_size = 100;
let results = users.query()
    .filter("status is \"active\"")
    .limit(page_size)
    .execute()?;

// Bad: Load all results into memory
let all_results = users.find("status is \"active\"")?;  // Could be millions

Garbage collection

Reclaim space from old MVCC versions.
// Periodically clean up old versions
let stats = db.garbage_collect()?;

println!("Versions removed: {}", stats.versions_removed);
println!("Pages freed: {}", stats.pages_freed);
println!("Bytes freed: {}", stats.bytes_freed);
Garbage collection is safe to run while the database is in use. It only removes versions no longer visible to any transaction.

Disk I/O optimization

SSD vs HDD

Jasonisnthappy performs best on SSDs.
Storage      Random reads   Sequential writes   Recommended
SSD          Excellent      Excellent           ✅ Yes
NVMe SSD     Exceptional    Exceptional         ✅ Best
HDD          Poor           Good                ⚠️ Limited

File placement

Separate database and WAL on different disks (advanced).
// Database on fast SSD
let db = Database::open("/mnt/ssd/app.db")?;

// WAL on separate disk (if supported)
// Note: Current implementation keeps WAL with database

Batch commits

Jasonisnthappy batches commits automatically for better throughput.
// Multiple concurrent writers are batched
let threads: Vec<_> = (0..4)
    .map(|i| {
        let db = db.clone();
        thread::spawn(move || {
            let users = db.collection("users");
            users.insert(json!({"thread": i})).unwrap();
        })
    })
    .collect();

// Commits are batched together for efficiency

Benchmarking

Running benchmarks

# Run all benchmarks
cargo run --release --example bench_all

# Results show:
# - Write throughput
# - Read latency
# - Bulk insert performance
# - Concurrent operations

Custom benchmarks

use std::time::Instant;

let db = Database::open("bench.db")?;
let users = db.collection("users");

// Benchmark inserts
let start = Instant::now();
for i in 0..1000 {
    users.insert(json!({"id": i}))?;
}
let elapsed = start.elapsed();
println!("1000 inserts: {:.2?}", elapsed);
println!("Throughput: {:.0} ops/sec", 1000.0 / elapsed.as_secs_f64());

// Benchmark queries
let start = Instant::now();
for i in 0..1000 {
    users.find_one(&format!("id is {}", i))?;
}
let elapsed = start.elapsed();
println!("1000 queries: {:.2?}", elapsed);
println!("Avg latency: {:.2}ms", elapsed.as_millis() as f64 / 1000.0);

Monitoring

Metrics

Track database performance with built-in metrics.
use jasonisnthappy::Database;

let db = Database::open("my.db")?;

// Get metrics snapshot
let metrics = db.metrics();

println!("Transactions:");
println!("  Begun: {}", metrics.transactions_begun);
println!("  Committed: {}", metrics.transactions_committed);
println!("  Rolled back: {}", metrics.transactions_rolled_back);
println!("  Conflicts: {}", metrics.transaction_conflicts);

println!("\nCache:");
println!("  Hits: {}", metrics.cache_hits);
println!("  Misses: {}", metrics.cache_misses);
let hit_rate = metrics.cache_hits as f64 / 
               (metrics.cache_hits + metrics.cache_misses) as f64 * 100.0;
println!("  Hit rate: {:.1}%", hit_rate);

println!("\nWAL:");
println!("  Frames written: {}", metrics.wal_frames_written);
println!("  Checkpoints: {}", metrics.checkpoints);

Performance tuning

Use metrics to identify bottlenecks.
1. Check cache hit rate

let metrics = db.metrics();
let hit_rate = metrics.cache_hits as f64 /
               (metrics.cache_hits + metrics.cache_misses) as f64;

if hit_rate < 0.9 {
    println!("Cache too small - increase cache_size");
}

2. Monitor transaction conflicts

if metrics.transaction_conflicts > metrics.transactions_committed / 10 {
    println!("High conflict rate - review transaction patterns");
}

3. Track WAL growth

let frame_count = db.frame_count();
if frame_count > 10000 {
    println!("Large WAL - consider checkpoint");
    db.checkpoint()?;
}
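The hit-rate division above yields NaN on a freshly opened database (zero hits and zero misses), so a small guard helper keeps the check safe. The helper is illustrative, not part of the metrics API.

```rust
/// Cache hit rate in [0, 1]; treats an untouched cache as fully hit
/// to avoid a NaN from dividing by zero.
fn cache_hit_rate(hits: u64, misses: u64) -> f64 {
    let total = hits + misses;
    if total == 0 {
        1.0
    } else {
        hits as f64 / total as f64
    }
}

fn main() {
    assert!((cache_hit_rate(900, 100) - 0.9).abs() < 1e-9);
    assert_eq!(cache_hit_rate(0, 0), 1.0);
    if cache_hit_rate(800, 200) < 0.9 {
        println!("Cache too small - increase cache_size");
    }
}
```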

Best practices summary

Do:
  • Use insert_many for bulk inserts (100-1000 docs)
  • Create indexes on frequently queried fields
  • Use find_one when you only need the first result
  • Project only needed fields
  • Keep transactions short
  • Use typed operations for hot paths
  • Configure adequate cache size
  • Run garbage collection periodically
Avoid:
  • Individual inserts in loops
  • Queries without indexes on large collections
  • Loading entire collections with find_all
  • Long-running transactions
  • Excessive checkpoint frequency
  • Too many indexes (slow writes)
  • Running on slow HDDs (prefer SSDs where possible)

Performance checklist

1. Initial setup
  • Choose SSD storage
  • Configure appropriate cache size
  • Set auto-checkpoint threshold
2. Development
  • Create indexes on query fields
  • Use batch operations
  • Project only needed fields
  • Use typed operations where possible
3. Production
  • Monitor metrics (cache hit rate, conflicts)
  • Schedule garbage collection
  • Checkpoint during low traffic
  • Benchmark critical operations
  • Profile slow queries
4. Optimization
  • Add missing indexes
  • Increase cache size if hit rate < 90%
  • Batch writes during bulk imports
  • Use read-only mode for replicas

Real-world performance

E-commerce application

// Product catalog: 100,000 products
// - Indexed: name, category, price
// - Cache: 50,000 pages (~200MB)
// - Storage: NVMe SSD

// Search products: 0.5-2ms
let results = products.query()
    .filter("category is \"electronics\" and price < 500")
    .sort_by("price", SortOrder::Asc)
    .limit(20)
    .execute()?;

// Add to cart: 8-12ms (with ACID)
let cart = db.collection("carts");
cart.insert(json!({
    "user_id": user_id,
    "product_id": product_id,
    "quantity": 1
}))?;

// Checkout (bulk write): 30-50ms
let result = db.run_transaction(|tx| {
    // Update inventory, create order, clear cart
    // ... multiple operations ...
    Ok(order_id)
})?;

Analytics dashboard

// Events: 1M+ documents
// - Aggregation: 100-500ms
// - Cached results: < 1ms

let hourly_stats = events.aggregate()
    .match_("timestamp > 1704067200")
    .group_by("event_type")
    .count("total")
    .execute()?;  // ~200ms for 1M events

Next steps

Indexes

Create indexes to speed up queries

Querying

Optimize query patterns

CRUD operations

Use efficient write operations

Backup and restore

Backup strategies for production