The Chunkr API is a document processing service that converts documents into RAG/LLM-ready data through layout analysis and chunking.

Base URL

All API requests should be made to:
https://api.chunkr.ai

API Versioning

The current API version is v1. All endpoints are prefixed with /api/v1. Example endpoint:
https://api.chunkr.ai/api/v1/task

Request Format

The API accepts requests in two formats:
  • JSON (Recommended): Content-Type: application/json
  • Multipart Form Data: Content-Type: multipart/form-data (deprecated)
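As a rough illustration of the recommended JSON format, the sketch below builds (but does not send) a POST request against the example endpoint shown earlier. The helper name `build_json_request` and the `file` field in the sample payload are hypothetical; the real request schema and authentication headers are described on their own pages.

```python
import json
import urllib.request

BASE_URL = "https://api.chunkr.ai"
API_PREFIX = "/api/v1"


def build_json_request(path, payload, extra_headers=None):
    """Build a JSON POST request for a v1 endpoint without sending it.

    `path` is the endpoint path after the /api/v1 prefix, e.g. "task".
    `payload` is any JSON-serializable object; its fields are up to the caller.
    """
    url = f"{BASE_URL}{API_PREFIX}/{path.lstrip('/')}"
    headers = {"Content-Type": "application/json"}
    # Authentication headers are not covered here; pass them in explicitly.
    headers.update(extra_headers or {})
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Hypothetical payload: the "file" field name is an assumption for illustration.
request = build_json_request("task", {"file": "https://example.com/report.pdf"})
print(request.full_url)
```

Sending the request (for example with `urllib.request.urlopen`) is left out so the sketch stays independent of credentials and network access.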

Response Format

API responses are returned as JSON, with an HTTP status code indicating the outcome:
  • 200 OK - Request successful
  • 400 Bad Request - Invalid request parameters
  • 401 Unauthorized - Authentication failed
  • 404 Not Found - Resource not found
  • 429 Too Many Requests - Rate limit exceeded
  • 500 Internal Server Error - Server error
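A client might map these codes to a message and a retry decision along the following lines. This is only a sketch: the function name `describe_status` is hypothetical, and treating 429 as the only retryable code is an assumption of the example, not documented API behavior.

```python
# Meanings copied from the status code list above.
STATUS_MEANINGS = {
    200: "Request successful",
    400: "Invalid request parameters",
    401: "Authentication failed",
    404: "Resource not found",
    429: "Rate limit exceeded",
    500: "Server error",
}


def describe_status(code):
    """Return (meaning, should_retry) for an HTTP status code."""
    meaning = STATUS_MEANINGS.get(code, "Undocumented status")
    # Assumption: only rate-limited requests are retried automatically.
    should_retry = code == 429
    return meaning, should_retry


print(describe_status(429))
```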

Rate Limits

The API rate-limits requests to ensure fair usage and stable performance.
Chunkr uses a token bucket algorithm, applied separately to each service type:

Service Rate Limits

Service           Default Rate Limit    Configurable
General OCR       5 requests/second     Yes
Segmentation      5 requests/second     Yes
LLM Processing    Varies by model       Yes
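For readers unfamiliar with the token bucket algorithm, here is a minimal generic sketch of the technique. It is not Chunkr's server implementation: the class name, the `capacity` (burst size) parameter, and the injectable `clock` are assumptions made for illustration. Only the 5 requests/second rate comes from the limits listed above.

```python
import time


class TokenBucket:
    """Generic token bucket: tokens refill at `rate` per second up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate            # tokens added per second
        self.capacity = capacity    # maximum tokens held (burst size, assumed)
        self._clock = clock
        self._tokens = float(capacity)
        self._last = clock()

    def try_acquire(self, tokens=1):
        """Take `tokens` if available and return True, otherwise return False."""
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        # Refill according to elapsed time, never exceeding capacity.
        self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)
        if self._tokens >= tokens:
            self._tokens -= tokens
            return True
        return False


# Client-side limiter matching the default 5 requests/second.
bucket = TokenBucket(rate=5, capacity=5)
print(bucket.try_acquire())
```

A client could call `try_acquire()` before each request and wait briefly when it returns False, which keeps traffic under the default limit instead of relying on 429 responses.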

Batch Sizes

To optimize throughput, the API processes requests in batches:
  • General OCR: 30 pages per batch
  • Segmentation: 3 pages per batch
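To get a feel for these batch sizes, the sketch below estimates how many batches a document needs. It assumes pages are simply grouped in order into fixed-size batches; the dictionary keys and the function name `batch_count` are hypothetical.

```python
# Batch sizes from the list above; the key names are illustrative only.
BATCH_SIZES = {"general_ocr": 30, "segmentation": 3}


def batch_count(pages, service):
    """Return the number of batches needed to cover `pages` pages."""
    if pages < 0:
        raise ValueError("pages must be non-negative")
    size = BATCH_SIZES[service]
    # Ceiling division without floating point.
    return (pages + size - 1) // size


# A 100-page document under the stated assumption.
print(batch_count(100, "general_ocr"))   # 4
print(batch_count(100, "segmentation"))  # 34
```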

Rate Limit Headers

Currently, rate limit information is managed server-side. If you exceed the rate limit, the API returns a 429 Too Many Requests response with the message “Usage limit exceeded”. Implement exponential backoff in your retry logic.
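One way to implement the recommended exponential backoff is sketched below. The names `backoff_delays` and `call_with_backoff`, the default delays, and the jitter scheme are all assumptions of this example. `send` stands for any caller-supplied function that performs the request and returns a `(status_code, body)` tuple.

```python
import random
import time


def backoff_delays(max_retries, base_delay=1.0, max_delay=60.0):
    """Yield capped exponential delays: base, 2*base, 4*base, ..."""
    for attempt in range(max_retries):
        yield min(max_delay, base_delay * (2 ** attempt))


def call_with_backoff(send, max_retries=5, base_delay=1.0, max_delay=60.0,
                      sleep=time.sleep, jitter=random.random):
    """Call send() again after each 429 until it succeeds or retries run out."""
    status, body = send()
    for delay in backoff_delays(max_retries, base_delay, max_delay):
        if status != 429:
            break
        # Scale the delay to between 50% and 100% so clients do not retry in lockstep.
        sleep(delay * (0.5 + 0.5 * jitter()))
        status, body = send()
    return status, body


# Simulated server: two rate-limited responses, then success.
responses = iter([(429, "Usage limit exceeded"), (429, "Usage limit exceeded"), (200, "ok")])
print(call_with_backoff(lambda: next(responses), sleep=lambda seconds: None))
```

The `sleep` and `jitter` parameters are injectable so the retry logic can be exercised without real waiting.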

Timeouts

Different operations have different timeout configurations:
  • General OCR: Configurable (no default timeout)
  • Segmentation: Configurable (no default timeout)
  • LLM Processing: 150 seconds default
  • API Request: 600 seconds (10 minutes)

File Size Limits

The API accepts files up to 1 GB by default. Both total request size and in-memory limits are enforced.
  • Max Total Limit: 1 GB (configurable via MAX_TOTAL_LIMIT)
  • Max Memory Limit: 1 GB (configurable via MAX_MEMORY_LIMIT)
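A client can check a file against the default limit before uploading, as in the sketch below. The function name is hypothetical, and interpreting 1 GB as 1024³ bytes is an assumption; a server configured with a different MAX_TOTAL_LIMIT will enforce its own value.

```python
import os

# Assumption: "1 GB" is interpreted as 1024**3 bytes.
DEFAULT_MAX_TOTAL_LIMIT_BYTES = 1024 ** 3


def check_upload_size(path, limit=DEFAULT_MAX_TOTAL_LIMIT_BYTES):
    """Return the file size in bytes, or raise ValueError if it exceeds `limit`."""
    size = os.path.getsize(path)
    if size > limit:
        raise ValueError(f"{path} is {size} bytes, which exceeds the limit of {limit} bytes")
    return size
```

Calling `check_upload_size("report.pdf")` before building the request avoids sending a payload the server would reject.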

Supported File Types

The API automatically detects file types and supports various document formats. Common formats include:
  • PDF documents
  • Images (JPEG, PNG)
  • Other document formats

Health Check

Check the API health and version:
curl https://api.chunkr.ai/health
Response:
OK - Version {version}
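A monitoring script could extract the version from this response as sketched below. The function name `parse_health` is hypothetical, and the format of the version string is not specified here, so the pattern accepts any run of non-whitespace characters.

```python
import re

# Matches the documented response shape "OK - Version {version}".
_HEALTH_PATTERN = re.compile(r"^OK - Version (?P<version>\S+)$")


def parse_health(text):
    """Return the version string from a health response, or None if it does not match."""
    match = _HEALTH_PATTERN.match(text.strip())
    return match.group("version") if match else None


# "1.2.3" is a made-up version used only for illustration.
print(parse_health("OK - Version 1.2.3"))
```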

API Documentation

Interactive API documentation is available at:
  • Swagger UI: https://api.chunkr.ai/swagger-ui/
  • ReDoc: https://api.chunkr.ai/redoc
  • OpenAPI Spec: https://api.chunkr.ai/openapi.json

Next Steps

  • Authentication: Learn how to authenticate your API requests
  • Error Handling: Understand error responses and status codes
  • Task Management: Create and manage document processing tasks
