Overview

Beacon implements Redis-based rate limiting to protect the API from abuse and ensure fair usage. The rate limiter uses a sliding window algorithm to track requests per IP address.
Rate limiting is only active when running beacon serve. The CLI commands are not rate limited.

Rate Limit Configuration

Default Limits

src/main.rs:36-37
const RATE_LIMIT_WINDOW_SECONDS: u64 = 60;
const RATE_LIMIT_MAX_REQUESTS: usize = 20;
Current Settings:
  • Window: 60 seconds (1 minute)
  • Max Requests: 20 per window
  • Applies to: /generate and /validate endpoints only

Protected Endpoints

Rate limiting is enforced on:
Endpoint     Method   Rate Limited
/health      GET      No
/generate    POST     Yes (20/min)
/validate    POST     Yes (20/min)
Implementation in src/main.rs:144-146:
if request.uri().path() != "/generate" && request.uri().path() != "/validate" {
    return Ok(next.run(request).await);
}

Implementation Details

Sliding Window Algorithm

Beacon uses Redis sorted sets to implement a sliding window rate limiter:
  1. Key Format: ratelimit:<ip_address>
  2. Sorted Set: Stores request timestamps as both score and value
  3. Cleanup: Old entries outside the window are removed automatically
Implementation (src/main.rs:138-202):
async fn rate_limit_middleware(
    State(state): State<AppState>,
    addr: Option<ConnectInfo<SocketAddr>>,
    request: Request<axum::body::Body>,
    next: Next,
) -> Result<Response, StatusCode> {
    // Skip rate limiting for non-protected endpoints
    if request.uri().path() != "/generate" && request.uri().path() != "/validate" {
        return Ok(next.run(request).await);
    }
    
    // Extract IP address
    let ip = match addr {
        Some(ConnectInfo(a)) => a.ip().to_string(),
        None => {
            // Fallback for proxies
            request.headers()
                .get("x-forwarded-for")
                .and_then(|h| h.to_str().ok())
                .unwrap_or("unknown")
                .to_string()
        }
    };
    
    let key = format!("ratelimit:{}", ip);
    let now = SystemTime::now()
        .duration_since(SystemTime::UNIX_EPOCH)
        .unwrap()
        .as_secs();

    // Redis pipeline for atomic operations
    let mut conn = state.redis_client
        .get_multiplexed_async_connection()
        .await
        .map_err(|e| {
            tracing::error!("Redis connection error: {}", e);
            StatusCode::INTERNAL_SERVER_ERROR
        })?;

    let results: Vec<redis::Value> = redis::pipe()
        .atomic()
        .zrembyscore(&key, 0, (now - RATE_LIMIT_WINDOW_SECONDS) as f64)  // Remove old entries
        .zadd(&key, now, now)                                              // Add current request
        .zcard(&key)                                                       // Count requests in window
        .expire(&key, RATE_LIMIT_WINDOW_SECONDS as i64)                   // Set key expiration
        .query_async(&mut conn)
        .await
        .map_err(|e| {
            tracing::error!("Redis pipeline error: {}", e);
            StatusCode::INTERNAL_SERVER_ERROR
        })?;

    // Extract count from pipeline results
    let count: usize = if results.len() >= 3 {
        match &results[2] {
            redis::Value::Int(c) => *c as usize,
            _ => 0,
        }
    } else {
        0
    };

    if count > RATE_LIMIT_MAX_REQUESTS {
        return Err(StatusCode::TOO_MANY_REQUESTS);
    }
    
    Ok(next.run(request).await)
}

Redis Commands Breakdown

# 1. Remove expired entries (older than 60 seconds)
ZREMRANGEBYSCORE ratelimit:192.168.1.1 0 <now-60>

# 2. Add current request timestamp
ZADD ratelimit:192.168.1.1 <now> <now>

# 3. Count total requests in window
ZCARD ratelimit:192.168.1.1

# 4. Set key expiration (cleanup)
EXPIRE ratelimit:192.168.1.1 60
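
The same four-step window can be sketched in memory, with no Redis, to illustrate the algorithm. `SlidingWindow` below is an illustrative stand-in (a queue of timestamps instead of a sorted set), not part of Beacon's source:

```rust
use std::collections::VecDeque;

const WINDOW_SECONDS: u64 = 60;
const MAX_REQUESTS: usize = 20;

/// In-memory stand-in for the Redis sorted set: a queue of request timestamps.
struct SlidingWindow {
    timestamps: VecDeque<u64>,
}

impl SlidingWindow {
    fn new() -> Self {
        Self { timestamps: VecDeque::new() }
    }

    /// Returns true if a request arriving at `now` (seconds) is allowed.
    fn allow(&mut self, now: u64) -> bool {
        // 1. Drop entries older than the window (ZREMRANGEBYSCORE).
        while let Some(&t) = self.timestamps.front() {
            if t + WINDOW_SECONDS <= now {
                self.timestamps.pop_front();
            } else {
                break;
            }
        }
        // 2. Record the current request (ZADD).
        self.timestamps.push_back(now);
        // 3. Count requests in the window (ZCARD) and compare to the limit.
        //    (Step 4, EXPIRE, has no analogue here: the queue drains naturally.)
        self.timestamps.len() <= MAX_REQUESTS
    }
}

fn main() {
    let mut window = SlidingWindow::new();
    // 20 requests at t=0 are allowed; the 21st is rejected.
    for _ in 0..20 {
        assert!(window.allow(0));
    }
    assert!(!window.allow(0));
    // 61 seconds later the window has slid past all of them.
    assert!(window.allow(61));
    println!("ok");
}
```

One difference from the real implementation: the queue keeps duplicate timestamps, whereas a sorted set keyed on the timestamp collapses same-second requests into a single member.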

IP Address Detection

Beacon extracts the client IP from:
  1. Direct Connection: ConnectInfo<SocketAddr> (Axum middleware)
  2. Proxy Headers: X-Forwarded-For header as fallback
src/main.rs:148-158
let ip = match addr {
    Some(ConnectInfo(a)) => a.ip().to_string(),
    None => {
        request.headers()
            .get("x-forwarded-for")
            .and_then(|h| h.to_str().ok())
            .unwrap_or("unknown")
            .to_string()
    }
};
When behind a proxy/load balancer, remember that X-Forwarded-For is client-supplied and trivially spoofed: only trust entries appended by infrastructure you control, or prefer a single-value header your proxy sets itself, such as X-Real-IP.
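
Note also that X-Forwarded-For may carry a comma-separated chain ("client, proxy1, proxy2"), while the fallback above stores the whole header value as the key. A sketch of taking only the left-most (original client) entry; `client_ip_from_xff` is an illustrative helper, not part of Beacon:

```rust
/// Extract the left-most (original client) address from an
/// X-Forwarded-For value like "203.0.113.7, 10.0.0.2, 10.0.0.3".
/// Illustrative helper; not part of Beacon's source.
fn client_ip_from_xff(header: &str) -> Option<&str> {
    header
        .split(',')
        .map(str::trim)
        .find(|s| !s.is_empty())
}

fn main() {
    assert_eq!(client_ip_from_xff("203.0.113.7, 10.0.0.2"), Some("203.0.113.7"));
    assert_eq!(client_ip_from_xff(""), None);
    println!("ok");
}
```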

Rate Limit Response

When rate limit is exceeded, the API returns:
HTTP/1.1 429 Too Many Requests
Content-Length: 0
No additional headers (Retry-After, X-RateLimit-*) are currently included, but can be added:
if count > RATE_LIMIT_MAX_REQUESTS {
    let retry_after = RATE_LIMIT_WINDOW_SECONDS;
    let response = Response::builder()
        .status(StatusCode::TOO_MANY_REQUESTS)
        .header("Retry-After", retry_after.to_string())
        .header("X-RateLimit-Limit", RATE_LIMIT_MAX_REQUESTS.to_string())
        .header("X-RateLimit-Remaining", "0")
        .header("X-RateLimit-Reset", (now + retry_after).to_string())
        .body(axum::body::Body::empty())
        .unwrap();
    // Return Ok: the middleware's error type is StatusCode, so a full
    // Response carrying headers must go through the Ok arm.
    return Ok(response);
}

Redis Configuration

Required Setup

Beacon requires Redis to be available when running the server:
# Local Redis
REDIS_URL=redis://localhost:6379

# Redis with authentication
REDIS_URL=redis://:password@localhost:6379

# Redis Cloud (TLS)
REDIS_URL=rediss://default:password@redis-12345.cloud.redislabs.com:12345
The server will fail to start if REDIS_URL is not set:
src/main.rs:449
let redis_url = std::env::var("REDIS_URL")
    .context("REDIS_URL must be set")?;

Connection Pooling

Beacon uses redis::Client with multiplexed async connections:
src/main.rs:31-34
#[derive(Clone)]
struct AppState {
    redis_client: Arc<redis::Client>,
}
A multiplexed connection is established on each request; caching one connection in AppState would avoid the per-request setup cost:
let mut conn = state.redis_client
    .get_multiplexed_async_connection()
    .await?;

Persistence

For production, configure Redis persistence to avoid losing rate limit state:
redis.conf
# Append-only file (AOF) persistence
appendonly yes
appendfsync everysec

# RDB snapshots as backup
save 900 1
save 300 10
save 60 10000

Monitoring Rate Limits

Redis CLI Inspection

Check rate limit data for a specific IP:
# Connect to Redis
redis-cli

# List all rate limit keys
KEYS ratelimit:*

# Check request count for an IP
ZCARD ratelimit:192.168.1.1

# View all timestamps for an IP
ZRANGE ratelimit:192.168.1.1 0 -1 WITHSCORES

# Time until key expires
TTL ratelimit:192.168.1.1

Metrics Collection

Integrate with Redis monitoring tools:
  • RedisInsight: Visual inspection of keys
  • Prometheus Redis Exporter: Export metrics to Prometheus
  • CloudWatch/Datadog: Monitor Redis performance
Example Prometheus query:
# Rate-limit checks per second (each check issues one ZCARD)
sum(rate(redis_commands_total{cmd="zcard"}[5m]))

# Per-pattern key counts are not exposed by default; configure the
# exporter to track them (e.g. redis_exporter's --count-keys option)

Customizing Rate Limits

Per-Endpoint Limits

Implement different limits per endpoint:
const RATE_LIMIT_GENERATE: usize = 10;  // 10/min for /generate
const RATE_LIMIT_VALIDATE: usize = 50;  // 50/min for /validate

let max_requests = match request.uri().path() {
    "/generate" => RATE_LIMIT_GENERATE,
    "/validate" => RATE_LIMIT_VALIDATE,
    _ => RATE_LIMIT_MAX_REQUESTS,
};

Per-User Limits

Rate limit by API key or user ID instead of IP:
let user_id = extract_user_id_from_headers(&request)?;
let key = format!("ratelimit:user:{}", user_id);
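
`extract_user_id_from_headers` above is not defined in Beacon; one possible shape, sketched over a plain header value (a real version would read axum's HeaderMap and verify the key against a store):

```rust
/// Hypothetical API-key extraction: accepts "Bearer <key>" or a bare key.
/// Sketch only; Beacon does not currently define this helper.
fn user_id_from_auth_header(value: &str) -> Option<String> {
    let token = value.strip_prefix("Bearer ").unwrap_or(value).trim();
    if token.is_empty() {
        None
    } else {
        Some(token.to_string())
    }
}

fn main() {
    assert_eq!(user_id_from_auth_header("Bearer abc123"), Some("abc123".to_string()));
    assert_eq!(user_id_from_auth_header("abc123"), Some("abc123".to_string()));
    assert_eq!(user_id_from_auth_header(""), None);
    println!("ok");
}
```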

Tiered Rate Limits

Implement different tiers based on payment or subscription:
let tier = get_user_tier(&user_id).await?;
let max_requests = match tier {
    Tier::Free => 20,
    Tier::Pro => 100,
    Tier::Enterprise => 1000,
};

Bypassing Rate Limits

For Self-Hosted Instances

Option 1: Remove rate limiting middleware
let app = Router::new()
    .route("/health", get(health))
    .route("/validate", post(handle_validate))
    .route("/generate", post(handle_generate))
    // .layer(middleware::from_fn_with_state(state.clone(), rate_limit_middleware))  // Commented out
    .with_state(state);
Option 2: Whitelist specific IPs
let whitelist = vec!["127.0.0.1", "10.0.0.0/8", "172.16.0.0/12"];
// Note: contains() is an exact string match, so the CIDR entries
// above need real prefix matching (e.g. via the ipnet crate).
if whitelist.contains(&ip.as_str()) {
    return Ok(next.run(request).await);
}
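
If the whitelist should cover CIDR ranges such as 10.0.0.0/8, exact string comparison is not enough. A std-only IPv4 prefix-match sketch (for production and IPv6 support, a crate like ipnet is the robust choice):

```rust
use std::net::Ipv4Addr;

/// Return true if `ip` falls inside `cidr` (e.g. "10.0.0.0/8").
/// Std-only sketch; a crate like `ipnet` handles IPv6 and edge cases.
fn in_cidr(ip: Ipv4Addr, cidr: &str) -> bool {
    // A bare address is treated as a /32.
    let (net, bits) = match cidr.split_once('/') {
        Some((n, b)) => (n, b),
        None => (cidr, "32"),
    };
    let (Ok(net), Ok(bits)) = (net.parse::<Ipv4Addr>(), bits.parse::<u32>()) else {
        return false;
    };
    if bits > 32 {
        return false;
    }
    let mask = if bits == 0 { 0 } else { u32::MAX << (32 - bits) };
    (u32::from(ip) & mask) == (u32::from(net) & mask)
}

fn main() {
    let whitelist = ["127.0.0.1", "10.0.0.0/8", "172.16.0.0/12"];
    let ip: Ipv4Addr = "10.42.0.7".parse().unwrap();
    assert!(whitelist.iter().any(|c| in_cidr(ip, c)));
    let other: Ipv4Addr = "192.168.1.1".parse().unwrap();
    assert!(!whitelist.iter().any(|c| in_cidr(other, c)));
    println!("ok");
}
```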
Option 3: Increase limits dramatically
const RATE_LIMIT_MAX_REQUESTS: usize = 1000000;  // Effectively unlimited

Error Handling

Redis Connection Failures

If Redis is unavailable, the API returns 500:
let mut conn = state.redis_client
    .get_multiplexed_async_connection()
    .await
    .map_err(|e| {
        tracing::error!("Redis connection error: {}", e);
        StatusCode::INTERNAL_SERVER_ERROR
    })?;
Improvement: fail open (allow requests) if Redis is down, trading rate-limit protection during an outage for availability:
let conn_result = state.redis_client
    .get_multiplexed_async_connection()
    .await;

let mut conn = match conn_result {
    Ok(c) => c,
    Err(e) => {
        tracing::warn!("Redis unavailable, allowing request: {}", e);
        return Ok(next.run(request).await);
    }
};

Performance Considerations

Redis Pipeline Efficiency

Using Redis pipelines reduces round trips:
  • Without Pipeline: 4 round trips (ZREMRANGEBYSCORE, ZADD, ZCARD, EXPIRE)
  • With Pipeline: 1 round trip (all commands batched)
Latency Improvement: ~10ms → ~2ms per request

Memory Usage

Each IP address uses approximately:
Key overhead: ~50 bytes
Timestamp entry: 8 bytes (score) + 8 bytes (value) = 16 bytes/request
Total per IP: 50 + (16 × 20) = 370 bytes
For 10,000 unique IPs: ~3.7 MB
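
This is a back-of-the-envelope estimate; the per-key and per-entry byte counts are the section's assumptions, but the arithmetic itself checks out:

```rust
fn main() {
    const KEY_OVERHEAD: usize = 50; // ~50 bytes per key (assumed)
    const ENTRY_BYTES: usize = 16;  // 8-byte score + 8-byte value
    const MAX_REQUESTS: usize = 20; // a full window for one IP

    let per_ip = KEY_OVERHEAD + ENTRY_BYTES * MAX_REQUESTS;
    assert_eq!(per_ip, 370);

    let total_mb = (per_ip * 10_000) as f64 / 1_000_000.0;
    assert!((total_mb - 3.7).abs() < 1e-9);
    println!("{} bytes/IP, {} MB for 10k IPs", per_ip, total_mb);
}
```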

Key Expiration

Keys automatically expire after 60 seconds of inactivity:
.expire(&key, RATE_LIMIT_WINDOW_SECONDS as i64)
This prevents memory leaks from inactive IPs.

Production Deployment

docker-compose.yml
services:
  redis:
    image: redis:7-alpine
    command: redis-server --appendonly yes --maxmemory 256mb --maxmemory-policy allkeys-lru
    volumes:
      - redis-data:/data
    ports:
      - "6379:6379"
    restart: unless-stopped

volumes:
  redis-data:

Redis Cluster for High Traffic

For > 1000 req/sec, use Redis Cluster or managed services:
  • AWS ElastiCache: Fully managed Redis
  • Redis Enterprise Cloud: Managed Redis with clustering
  • Upstash: Serverless Redis with global replication

Monitoring Checklist

  • Redis memory usage
  • Connection pool saturation
  • Rate limit hit rate (429 responses)
  • Redis command latency
  • Key count growth
  • Failed Redis operations

Next Steps

  • Configuration: configure the Redis URL and connection settings
  • Custom Deployment: deploy with Redis in production
