TaskForge API uses Flask-Limiter to protect all endpoints from excessive requests. Limits are enforced per IP address and tracked in memory by default, with optional Redis-backed storage for distributed deployments.

Default limits

Two global limits apply to every endpoint that does not have a more specific limit configured:
| Window | Limit |
| --- | --- |
| Per day | 200 requests |
| Per hour | 50 requests |
These are set by the RATELIMIT_DEFAULT environment variable:
RATELIMIT_DEFAULT=200 per day;50 per hour
Multiple limits are separated by a semicolon. Both limits apply simultaneously — a client is throttled as soon as either threshold is reached.
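The limit-string format can be illustrated with a small parser — a minimal sketch of our own, not TaskForge or Flask-Limiter code:

```python
def parse_rate_limits(spec: str) -> list[tuple[int, str]]:
    """Parse a Flask-Limiter-style limit string, e.g. '200 per day;50 per hour'."""
    limits = []
    for part in spec.split(";"):
        # Each part looks like "<count> per <period>"
        count, _per, period = part.strip().split()
        limits.append((int(count), period))
    return limits

parse_rate_limits("200 per day;50 per hour")
# [(200, 'day'), (50, 'hour')]
```

A client is throttled as soon as any one of the parsed limits is exhausted, so the effective allowance is always the tightest of the set.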

Endpoint-specific limits

Some endpoints apply stricter per-hour limits on top of the global defaults. The more restrictive limit always takes precedence.
| Endpoint | Limit | Reason |
| --- | --- | --- |
| POST /api/auth/register | 5 per hour | Prevents automated account creation |
| POST /api/auth/login | 10 per hour | Slows down brute-force attempts |
| POST /api/auth/change-password | 3 per hour | Limits password-guessing attacks |
| POST /api/tasks | 50 per hour | Prevents bulk task spam |
| POST /api/tags | 20 per hour | Limits tag proliferation |
Read endpoints (GET) and update/delete endpoints (PUT, PATCH, DELETE) are covered only by the global defaults, not endpoint-specific limits.

Rate limit response

When a client exceeds any limit, the API responds with HTTP 429 Too Many Requests. Flask-Limiter’s default error body is returned:
{
  "message": "429 Too Many Requests: 5 per 1 hour"
}
The 429 response body does not use the standard TaskForge { "success": false, "message": "..." } envelope — it comes directly from Flask-Limiter.
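Since both body shapes carry a message field, a client can normalize them with one function. This is a sketch of our own (the helper name and return shape are assumptions, not part of any TaskForge SDK):

```python
def parse_error(status_code: int, body: dict) -> dict:
    """Normalize TaskForge error envelopes and Flask-Limiter 429 bodies."""
    return {
        "rate_limited": status_code == 429,
        # Flask-Limiter bodies omit the "success" key entirely
        "from_envelope": "success" in body,
        "message": body.get("message", ""),
    }

parse_error(429, {"message": "429 Too Many Requests: 5 per 1 hour"})
# {'rate_limited': True, 'from_envelope': False,
#  'message': '429 Too Many Requests: 5 per 1 hour'}
```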

Rate limit headers

When RATELIMIT_HEADERS_ENABLED=true (the default), the API includes informational headers on every response:
| Header | Description |
| --- | --- |
| X-RateLimit-Limit | Maximum number of requests allowed in the current window |
| X-RateLimit-Remaining | Number of requests remaining before the limit is hit |
| X-RateLimit-Reset | Unix timestamp when the current window resets |
| Retry-After | Seconds to wait before retrying (present only on 429 responses) |
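A client can derive a wait time from these headers: prefer Retry-After when present, otherwise fall back to the absolute reset timestamp. A minimal sketch (the function name is ours):

```python
def retry_wait_seconds(headers: dict, now: float) -> float:
    """How long to pause, derived from rate-limit headers (0.0 if none apply)."""
    if "Retry-After" in headers:
        # Relative delay in seconds, only sent on 429 responses
        return float(headers["Retry-After"])
    if "X-RateLimit-Reset" in headers:
        # Absolute Unix timestamp; clamp to zero if already past
        return max(0.0, float(headers["X-RateLimit-Reset"]) - now)
    return 0.0

retry_wait_seconds({"Retry-After": "30"}, now=0.0)            # 30.0
retry_wait_seconds({"X-RateLimit-Reset": "1000"}, now=990.0)  # 10.0
```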

Configuration

All rate limiting settings are controlled via environment variables:
RATELIMIT_ENABLED (boolean)
Enable or disable rate limiting globally. Set to false in the testing configuration. Defaults to true.
RATELIMIT_ENABLED=true
RATELIMIT_STORAGE_URL (string)
Backend used to store rate limit counters. Defaults to memory:// (in-process, not shared across workers). For production deployments with multiple Gunicorn workers, use a Redis URL to share counters.
# In-memory (default, suitable for single-process)
RATELIMIT_STORAGE_URL=memory://

# Redis (recommended for multi-worker production)
RATELIMIT_STORAGE_URL=redis://localhost:6379/0
RATELIMIT_DEFAULT (string)
Semicolon-separated list of default limits applied to every endpoint. Each limit uses the format N per period where period is second, minute, hour, or day.
RATELIMIT_DEFAULT=200 per day;50 per hour
The default memory:// storage is not shared between Gunicorn workers. In production, TaskForge runs with 4 Gunicorn workers, so each worker tracks its own counters independently. Use Redis storage to enforce accurate limits across all workers.
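To see why this matters, consider the worst case with the documented 4-worker deployment and the 50-per-hour default:

```python
workers = 4
configured_hourly_limit = 50

# With memory:// storage each worker keeps its own counter, so a client
# whose requests happen to spread evenly across workers can make up to
# workers * limit requests before every counter trips.
worst_case_per_hour = workers * configured_hourly_limit
print(worst_case_per_hour)  # 200 — 4x the intended limit
```

Redis-backed storage keeps a single shared counter, so the configured limit holds regardless of which worker handles each request.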

Handling 429 responses

Implement retry logic with exponential backoff to gracefully handle rate limit errors. Check the Retry-After header when present to determine exactly how long to wait.
import time
import requests

def request_with_backoff(method, url, max_retries=5, **kwargs):
    """
    Make an HTTP request with exponential backoff on 429 responses.
    """
    for attempt in range(max_retries):
        response = requests.request(method, url, **kwargs)

        if response.status_code != 429:
            return response

        # Respect the Retry-After header if present
        retry_after = response.headers.get("Retry-After")
        if retry_after:
            wait_seconds = int(retry_after)
        else:
            # Exponential backoff: 1s, 2s, 4s, 8s, 16s
            wait_seconds = 2 ** attempt

        # Don't sleep after the final attempt; the last 429 is returned below
        if attempt == max_retries - 1:
            break

        print(f"Rate limited. Retrying in {wait_seconds}s (attempt {attempt + 1}/{max_retries})")
        time.sleep(wait_seconds)

    # Return the last 429 response if all retries are exhausted
    return response


# Example usage
response = request_with_backoff(
    "POST",
    "https://your-api.example.com/api/auth/login",
    json={"email": "[email protected]", "password": "Password123"},
    headers={"Content-Type": "application/json"}
)

if response.status_code == 200:
    data = response.json()
    access_token = data["data"]["access_token"]

Best practices

Cache responses

Cache GET responses locally to reduce the number of requests you make. List endpoints support filtering — request only the data you need.
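A tiny in-process TTL cache is often enough for this. A minimal sketch (our own helper, with an arbitrary 60-second default):

```python
import time

class TTLCache:
    """Tiny in-process cache for GET responses; evicts lazily on read."""

    def __init__(self, ttl_seconds: float = 60.0):
        self.ttl = ttl_seconds
        self._store: dict = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, stored_at = entry
        if time.time() - stored_at >= self.ttl:
            # Entry has expired; drop it and report a miss
            del self._store[key]
            return None
        return value

    def put(self, key, value):
        self._store[key] = (value, time.time())
```

Check the cache before issuing a GET, and store the parsed JSON on a hit-worthy response; each cache hit is one fewer request counted against your window.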

Use refresh tokens

Access tokens expire after 1 hour. Use the refresh token flow (POST /api/auth/refresh) instead of logging in repeatedly — the refresh endpoint has no custom rate limit.
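A simple way to avoid redundant logins is to refresh shortly before the known 1-hour expiry. A sketch, assuming you track the token's expiry timestamp client-side (the helper name and leeway value are ours):

```python
import time

def needs_refresh(expires_at: float, leeway_seconds: int = 60) -> bool:
    """True once the access token is within `leeway_seconds` of its
    1-hour expiry, so the client refreshes proactively instead of
    re-authenticating against the rate-limited login endpoint."""
    return time.time() >= expires_at - leeway_seconds
```

When this returns True, call POST /api/auth/refresh with the refresh token rather than POST /api/auth/login, which allows only 10 attempts per hour.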

Batch updates

Update task fields in a single PUT /api/tasks/{id} call rather than making multiple PATCH requests. Include all changed fields in one request body.

Monitor headers

Read X-RateLimit-Remaining on every response. When it drops to single digits, slow down your request rate proactively rather than waiting to hit a 429.
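That check is easy to express as a guard — a hypothetical helper with an arbitrary threshold of 10:

```python
def should_slow_down(headers: dict, threshold: int = 10) -> bool:
    """True when the remaining quota in the current window is nearly
    exhausted, based on the X-RateLimit-Remaining response header."""
    remaining = headers.get("X-RateLimit-Remaining")
    return remaining is not None and int(remaining) <= threshold

should_slow_down({"X-RateLimit-Remaining": "3"})   # True
should_slow_down({"X-RateLimit-Remaining": "40"})  # False
```

When it returns True, insert a short sleep between requests or pause until the X-RateLimit-Reset timestamp passes.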
