Rate limiting controls how many requests a client can make to your API within a given time window. Without it, a single misbehaving client can exhaust your server resources, degrade service for everyone else, and drive up infrastructure costs. By enforcing a per-user cap, you protect against abuse, ensure fair use across all clients, and keep operating costs predictable.

How RLaaS enforces rate limits

RLaaS intercepts every incoming HTTP request using a servlet filter — specifically, a Spring OncePerRequestFilter — that runs before any controller logic executes. The filter reads the userId query parameter, checks the current request count against the configured limit in Redis, and either allows the request through or returns a 429 Too Many Requests response immediately.
Filter decision flow
Incoming request
        │
        ▼
RateLimiterFilter (OncePerRequestFilter)
        │
        ├─ Extract userId from query param
        │
        ├─ Call rateLimitingAlgorithm.allowRequest(userId)
        │       │
        │       ├─ allowed=true  ──► filterChain.doFilter() ──► Controller
        │       │
        │       └─ allowed=false ──► HTTP 429 "Try after X seconds"
        │
        └─ Response returned to client
Because the check happens in the filter layer, rate-limited requests never reach your business logic. This keeps your controllers clean and ensures consistent enforcement regardless of which endpoint is called.
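The filter's decision boils down to a per-user counter check. Here is a minimal in-memory sketch of that logic with the Spring servlet plumbing stripped away — the class and method names are illustrative, and in RLaaS the counter lives in Redis rather than in a local map:

```java
import java.util.HashMap;
import java.util.Map;

public class RateLimiterSketch {
    private final int maxRequests;
    // Per-user request counts for the current window. RLaaS stores these
    // in Redis with a TTL; a plain map stands in for Redis here.
    private final Map<String, Integer> counts = new HashMap<>();

    public RateLimiterSketch(int maxRequests) {
        this.maxRequests = maxRequests;
    }

    // Mirrors the filter's decision: return 200 if the request may proceed
    // to the controller, or 429 if the user's quota is exhausted.
    public int handle(String userId) {
        int used = counts.merge(userId, 1, Integer::sum);
        return used <= maxRequests ? 200 : 429;
    }
}
```

Note that each userId accumulates its own count, so one user hitting 429 has no effect on another user's requests — the same isolation property described below.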

Per-user isolation

Each userId gets its own independent Redis key. For the fixed window algorithm, keys follow the pattern rlaas:rate_limit:{userId}:{window}; for the sliding window, a single ZSET key rlaas:rate_limit:{userId} is used per user. This means Alice’s request count has no effect on Bob’s. One user exhausting their quota does not slow down or block other users. The isolation is enforced at the Redis key level — there is no shared counter.
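The key patterns above can be sketched as plain string construction. The helper names here are assumptions for illustration, not part of the RLaaS API; the window number for the fixed window is derived by integer-dividing the epoch time by the window size, so every request in the same window maps to the same key:

```java
public class RateLimitKeys {
    static final String PREFIX = "rlaas:rate_limit";

    // Fixed window: one key per user per window. Requests at epoch seconds
    // 0..59 share window 0, 60..119 share window 1, and so on (for a 60s window).
    static String fixedWindowKey(String userId, long epochSeconds, long windowSizeSeconds) {
        long window = epochSeconds / windowSizeSeconds;
        return PREFIX + ":" + userId + ":" + window;
    }

    // Sliding window: a single ZSET key per user; request timestamps are
    // stored as ZSET entries rather than being encoded into the key.
    static String slidingWindowKey(String userId) {
        return PREFIX + ":" + userId;
    }
}
```

Because the userId is embedded in every key, Alice and Bob can never touch the same counter.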

What happens when a limit is exceeded

When a user exceeds their configured limit, RLaaS returns:
  • HTTP status: 429 Too Many Requests
  • Response body: User is not allowed...Try after X seconds.
The value X is the TTL of the Redis key for that user’s current window — the number of seconds until their quota resets. Clients can read this value to implement retry logic with an appropriate backoff.
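A client can recover X from the response body to schedule its retry. A minimal sketch, assuming the body follows the "Try after X seconds" wording shown above (the parser class and its fallback value are illustrative, not part of any RLaaS client library):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RetryAfterParser {
    private static final Pattern RETRY_AFTER =
            Pattern.compile("Try after (\\d+) seconds");

    // Extracts the number of seconds until the quota resets from a 429
    // response body, or -1 if the body doesn't match, in which case the
    // caller should fall back to a default backoff.
    static long parseRetryAfterSeconds(String body) {
        Matcher m = RETRY_AFTER.matcher(body);
        return m.find() ? Long.parseLong(m.group(1)) : -1;
    }
}
```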

Choosing a window size and limit

You configure rate limiting behavior with two properties:
rate-limiter:
  window-size: 60       # seconds
  max-requests: 100
A few starting points:
  • Public API, moderate traffic: 100 requests per 60 seconds
  • Stricter abuse prevention: 20 requests per 60 seconds, or 100 requests per 300 seconds
  • Internal service-to-service: 1000 requests per 60 seconds
Reduce max-requests to tighten limits without changing the window. Reduce window-size to make quotas reset more frequently — but note that shrinking the window without also shrinking max-requests raises the allowed average rate (100 requests per 30 seconds permits twice the throughput of 100 per 60 seconds). For the smoothest enforcement with no boundary bursts, use the sliding window algorithm.
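The boundary-burst effect mentioned above is easy to demonstrate: under a fixed window, a client can spend its full quota at the end of one window and again at the start of the next. This pure in-memory simulation of the fixed-window check (the real counters live in Redis) shows 200 requests being accepted in a two-second span under a nominal 100-per-60s limit:

```java
import java.util.HashMap;
import java.util.Map;

public class FixedWindowBurstDemo {
    // Replays a sequence of request timestamps (epoch seconds) against a
    // fixed-window counter and returns how many requests were accepted.
    static int accepted(long[] timestamps, int maxRequests, long windowSize) {
        Map<Long, Integer> counts = new HashMap<>();
        int allowed = 0;
        for (long t : timestamps) {
            long window = t / windowSize; // same arithmetic as the window key
            int used = counts.getOrDefault(window, 0);
            if (used < maxRequests) {
                counts.put(window, used + 1);
                allowed++;
            }
        }
        return allowed;
    }

    public static void main(String[] args) {
        // 100 requests at t=59s (window 0) and 100 more at t=61s (window 1):
        // every request passes, doubling the nominal limit across the boundary.
        long[] burst = new long[200];
        for (int i = 0; i < 100; i++) {
            burst[i] = 59;
            burst[100 + i] = 61;
        }
        System.out.println(accepted(burst, 100, 60)); // prints 200
    }
}
```

A sliding window rejects the second batch because it looks back over the trailing 60 seconds rather than resetting the count at the window boundary.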

Fixed Window vs Sliding Window

Compare the two rate-limiting algorithms and their trade-offs.

Check endpoint

See the API reference for the rate limit check endpoint.
