TaskForge API uses Flask-Limiter to protect all endpoints from excessive requests. Limits are enforced per IP address and tracked in memory by default, with optional Redis-backed storage for distributed deployments.

Default limits

Two global limits apply to every endpoint that does not have a more specific limit configured:
| Window | Limit |
| --- | --- |
| Per day | 200 requests |
| Per hour | 50 requests |
These are set by the RATELIMIT_DEFAULT environment variable:
RATELIMIT_DEFAULT=200 per day;50 per hour
Multiple limits are separated by a semicolon. Both limits apply simultaneously — a client is throttled as soon as either threshold is reached.
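The limit-string format can be illustrated with a small parser — a minimal sketch of our own, not TaskForge or Flask-Limiter code:

```python
def parse_rate_limits(spec: str) -> list[tuple[int, str]]:
    """Parse a Flask-Limiter-style limit string, e.g. '200 per day;50 per hour'."""
    limits = []
    for part in spec.split(";"):
        # Each part looks like "<count> per <period>"
        count, _per, period = part.strip().split()
        limits.append((int(count), period))
    return limits

parse_rate_limits("200 per day;50 per hour")
# [(200, 'day'), (50, 'hour')]
```

A client is throttled as soon as any one of the parsed limits is exhausted, so the effective allowance is always the tightest of the set.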

Endpoint-specific limits

Some endpoints apply stricter per-hour limits on top of the global defaults. The more restrictive limit always takes precedence.
| Endpoint | Limit | Reason |
| --- | --- | --- |
| POST /api/auth/register | 5 per hour | Prevents automated account creation |
| POST /api/auth/login | 10 per hour | Slows down brute-force attempts |
| POST /api/auth/change-password | 3 per hour | Limits password-guessing attacks |
| POST /api/tasks | 50 per hour | Prevents bulk task spam |
| POST /api/tags | 20 per hour | Limits tag proliferation |
Read endpoints (GET) and update/delete endpoints (PUT, PATCH, DELETE) are covered only by the global defaults, not endpoint-specific limits.

Rate limit response

When a client exceeds any limit, the API responds with HTTP 429 Too Many Requests. Flask-Limiter’s default error body is returned:
{
  "message": "429 Too Many Requests: 5 per 1 hour"
}
The 429 response body does not use the standard TaskForge { "success": false, "message": "..." } envelope — it comes directly from Flask-Limiter.
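Since both body shapes carry a message field, a client can normalize them with one function. This is a sketch of our own (the helper name and return shape are assumptions, not part of any TaskForge SDK):

```python
def parse_error(status_code: int, body: dict) -> dict:
    """Normalize TaskForge error envelopes and Flask-Limiter 429 bodies."""
    return {
        "rate_limited": status_code == 429,
        # Flask-Limiter bodies omit the "success" key entirely
        "from_envelope": "success" in body,
        "message": body.get("message", ""),
    }

parse_error(429, {"message": "429 Too Many Requests: 5 per 1 hour"})
# {'rate_limited': True, 'from_envelope': False,
#  'message': '429 Too Many Requests: 5 per 1 hour'}
```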

Rate limit headers

When RATELIMIT_HEADERS_ENABLED=true (the default), the API includes informational headers on every response:
| Header | Description |
| --- | --- |
| X-RateLimit-Limit | Maximum number of requests allowed in the current window |
| X-RateLimit-Remaining | Number of requests remaining before the limit is hit |
| X-RateLimit-Reset | Unix timestamp when the current window resets |
| Retry-After | Seconds to wait before retrying (present only on 429 responses) |
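A client can derive a wait time from these headers: prefer Retry-After when present, otherwise fall back to the absolute reset timestamp. A minimal sketch (the function name is ours):

```python
def retry_wait_seconds(headers: dict, now: float) -> float:
    """How long to pause, derived from rate-limit headers (0.0 if none apply)."""
    if "Retry-After" in headers:
        # Relative delay in seconds, only sent on 429 responses
        return float(headers["Retry-After"])
    if "X-RateLimit-Reset" in headers:
        # Absolute Unix timestamp; clamp to zero if already past
        return max(0.0, float(headers["X-RateLimit-Reset"]) - now)
    return 0.0

retry_wait_seconds({"Retry-After": "30"}, now=0.0)            # 30.0
retry_wait_seconds({"X-RateLimit-Reset": "1000"}, now=990.0)  # 10.0
```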

Configuration

All rate limiting settings are controlled via environment variables:
RATELIMIT_ENABLED (boolean)
Enable or disable rate limiting globally. Set to false in the testing configuration. Defaults to true.
RATELIMIT_ENABLED=true
RATELIMIT_STORAGE_URL (string)
Backend used to store rate limit counters. Defaults to memory:// (in-process, not shared across workers). For production deployments with multiple Gunicorn workers, use a Redis URL to share counters.
# In-memory (default, suitable for single-process)
RATELIMIT_STORAGE_URL=memory://

# Redis (recommended for multi-worker production)
RATELIMIT_STORAGE_URL=redis://localhost:6379/0
RATELIMIT_DEFAULT (string)
Semicolon-separated list of default limits applied to every endpoint. Each limit uses the format N per period where period is second, minute, hour, or day.
RATELIMIT_DEFAULT=200 per day;50 per hour
The default memory:// storage is not shared between Gunicorn workers. In production, TaskForge runs with 4 Gunicorn workers, so each worker tracks its own counters independently. Use Redis storage to enforce accurate limits across all workers.
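To see why this matters, consider the worst case with the documented 4-worker deployment and the 50-per-hour default:

```python
workers = 4
configured_hourly_limit = 50

# With memory:// storage each worker keeps its own counter, so a client
# whose requests happen to spread evenly across workers can make up to
# workers * limit requests before every counter trips.
worst_case_per_hour = workers * configured_hourly_limit
print(worst_case_per_hour)  # 200 — 4x the intended limit
```

Redis-backed storage keeps a single shared counter, so the configured limit holds regardless of which worker handles each request.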

Handling 429 responses

Implement retry logic with exponential backoff to gracefully handle rate limit errors. Check the Retry-After header when present to determine exactly how long to wait.
import time
import requests

def request_with_backoff(method, url, max_retries=5, **kwargs):
    """
    Make an HTTP request with exponential backoff on 429 responses.
    """
    for attempt in range(max_retries):
        response = requests.request(method, url, **kwargs)

        if response.status_code != 429:
            return response

        # Respect the Retry-After header if present
        retry_after = response.headers.get("Retry-After")
        if retry_after:
            wait_seconds = int(retry_after)
        else:
            # Exponential backoff: 1s, 2s, 4s, 8s, 16s
            wait_seconds = 2 ** attempt

        # Don't sleep after the final attempt; the last 429 is returned below
        if attempt == max_retries - 1:
            break

        print(f"Rate limited. Retrying in {wait_seconds}s (attempt {attempt + 1}/{max_retries})")
        time.sleep(wait_seconds)

    # Return the last 429 response if all retries are exhausted
    return response


# Example usage
response = request_with_backoff(
    "POST",
    "https://your-api.example.com/api/auth/login",
    json={"email": "[email protected]", "password": "Password123"},
    headers={"Content-Type": "application/json"}
)

if response.status_code == 200:
    data = response.json()
    access_token = data["data"]["access_token"]

Best practices

Cache responses

Cache GET responses locally to reduce the number of requests you make. List endpoints support filtering — request only the data you need.
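A tiny in-process TTL cache is often enough for this. A minimal sketch (our own helper, with an arbitrary 60-second default):

```python
import time

class TTLCache:
    """Tiny in-process cache for GET responses; evicts lazily on read."""

    def __init__(self, ttl_seconds: float = 60.0):
        self.ttl = ttl_seconds
        self._store: dict = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, stored_at = entry
        if time.time() - stored_at >= self.ttl:
            # Entry has expired; drop it and report a miss
            del self._store[key]
            return None
        return value

    def put(self, key, value):
        self._store[key] = (value, time.time())
```

Check the cache before issuing a GET, and store the parsed JSON on a hit-worthy response; each cache hit is one fewer request counted against your window.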

Use refresh tokens

Access tokens expire after 1 hour. Use the refresh token flow (POST /api/auth/refresh) instead of logging in repeatedly — the refresh endpoint has no custom rate limit.
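A simple way to avoid redundant logins is to refresh shortly before the known 1-hour expiry. A sketch, assuming you track the token's expiry timestamp client-side (the helper name and leeway value are ours):

```python
import time

def needs_refresh(expires_at: float, leeway_seconds: int = 60) -> bool:
    """True once the access token is within `leeway_seconds` of its
    1-hour expiry, so the client refreshes proactively instead of
    re-authenticating against the rate-limited login endpoint."""
    return time.time() >= expires_at - leeway_seconds
```

When this returns True, call POST /api/auth/refresh with the refresh token rather than POST /api/auth/login, which allows only 10 attempts per hour.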

Batch updates

Update task fields in a single PUT /api/tasks/{id} call rather than making multiple PATCH requests. Include all changed fields in one request body.

Monitor headers

Read X-RateLimit-Remaining on every response. When it drops to single digits, slow down your request rate proactively rather than waiting to hit a 429.
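That check is easy to express as a guard — a hypothetical helper with an arbitrary threshold of 10:

```python
def should_slow_down(headers: dict, threshold: int = 10) -> bool:
    """True when the remaining quota in the current window is nearly
    exhausted, based on the X-RateLimit-Remaining response header."""
    remaining = headers.get("X-RateLimit-Remaining")
    return remaining is not None and int(remaining) <= threshold

should_slow_down({"X-RateLimit-Remaining": "3"})   # True
should_slow_down({"X-RateLimit-Remaining": "40"})  # False
```

When it returns True, insert a short sleep between requests or pause until the X-RateLimit-Reset timestamp passes.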
