Quickstart Guide

This guide will help you make your first API request to Chunkr and process a document. We’ll cover both local deployment and using the Cloud API.
This quickstart assumes you have Chunkr running locally. If you haven’t installed it yet, check out the Installation Guide.

Prerequisites

1. Chunkr is Running

Ensure your Chunkr services are up and running:
docker compose ps
You should see all services in the “Up” state.
2. Access the Services

Verify you can access:
  • API: http://localhost:8000
  • Web UI: http://localhost:5173
3. LLM Configuration

Make sure you’ve configured at least one LLM in models.yaml. See the Installation Guide for details.

Making Your First Request

Using the Web UI

The easiest way to get started is using the built-in web interface:
1. Open the Web UI

Navigate to http://localhost:5173 in your browser.
2. Upload a Document

Click the upload area and select a PDF, Word document, PowerPoint, or image file.
3. Configure Processing

Choose your processing options:
  • OCR Strategy: All (process all pages) or Auto (selective)
  • Segmentation Strategy: LayoutAnalysis (detailed) or Page (simple)
  • High Resolution: Enable for better quality (adds ~7s per page)
4. View Results

Watch your document process in real time and explore the structured output.

Using the API

For programmatic access, use the REST API. Here’s how to process a document:
curl -X POST http://localhost:8000/api/v1/task/parse \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "file": "https://example.com/document.pdf",
    "ocr_strategy": "Auto",
    "segmentation_strategy": "LayoutAnalysis",
    "high_resolution": true
  }'
For local development without authentication, you can omit the Authorization header. Authentication is required when deploying to production.

Using Base64 Encoded Files

You can also upload files directly as base64:
Python
import base64
import requests

# Read and encode file
with open('document.pdf', 'rb') as f:
    file_data = base64.b64encode(f.read()).decode('utf-8')
    file_base64 = f"data:application/pdf;base64,{file_data}"

# Create task with base64 file
response = requests.post(
    "http://localhost:8000/api/v1/task/parse",
    json={
        "file": file_base64,
        "file_name": "document.pdf",
        "ocr_strategy": "Auto",
        "segmentation_strategy": "LayoutAnalysis"
    }
)

task = response.json()

Polling for Results

Document processing is asynchronous. Use the task_id returned when you created the task to check status and retrieve results:
curl http://localhost:8000/api/v1/task/{task_id} \
  -H "Authorization: Bearer YOUR_API_KEY"

Understanding the Response

The task response contains rich structured data:
{
  "task_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "Succeeded",
  "created_at": "2026-03-02T10:00:00Z",
  "finished_at": "2026-03-02T10:00:15Z",
  "file_name": "document.pdf",
  "page_count": 5,
  "output": [
    {
      "page_number": 1,
      "segments": [
        {
          "segment_id": "seg_001",
          "segment_type": "Title",
          "content": "<h1>Document Title</h1>",
          "text": "Document Title",
          "bbox": {
            "left": 100,
            "top": 50,
            "width": 400,
            "height": 60
          },
          "confidence": 0.98
        },
        {
          "segment_id": "seg_002",
          "segment_type": "Text",
          "content": "<p>This is the document content...</p>",
          "text": "This is the document content...",
          "bbox": {...}
        }
      ]
    }
  ]
}
  • task_id: Unique identifier for tracking this task
  • status: Current state (Starting, Processing, Succeeded, Failed)
  • output: Array of pages, each containing segments
  • segments: Individual layout elements (Title, Text, Table, Picture, etc.)
  • content: Generated HTML or Markdown based on configuration
  • text: Raw OCR-extracted text
  • bbox: Bounding box coordinates (left, top, width, height)
  • segment_type: Element type (Title, SectionHeader, Text, ListItem, Table, Picture, Caption, Formula, Footnote, PageHeader, PageFooter)
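
As a quick sketch of walking this structure, the following joins the content field of every segment into a single string (here task is a finished task response, e.g. as returned by the wait_for_task helper above):
Python
# Concatenate generated content across all pages and segments
document_content = "\n\n".join(
    segment["content"]
    for page in task["output"]
    for segment in page["segments"]
)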

Configuration Options

OCR Strategy

Controls how OCR is applied:
  • All (default): Run OCR on every page (adds ~0.5s per page)
  • Auto: Run OCR only where needed, reusing the existing text layer when available

Segmentation Strategy

Controls layout analysis:
  • LayoutAnalysis (default): Detect all layout elements with bounding boxes
  • Page: Treat each page as a single segment (faster, less detailed)
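
For example, a fast low-detail pass combines the two lighter options. This is a sketch against the local API; file_url is a placeholder for your document URL:
Python
import requests

# Fast path: reuse existing text layers and skip detailed layout detection
response = requests.post(
    "http://localhost:8000/api/v1/task/parse",
    json={
        "file": file_url,
        "ocr_strategy": "Auto",
        "segmentation_strategy": "Page"
    }
)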

Additional Options

{
  "high_resolution": true,        // Use high-res images (~7s per page)
  "expires_in": 3600,             // Task expiration in seconds
  "error_handling": "Fail",       // "Fail" or "Continue" on errors
  "chunk_processing": {           // Configure semantic chunking
    "target_length": 512
  },
  "segment_processing": {         // Per-segment format configuration
    "table": {
      "format": "Markdown",
      "strategy": "LLM"           // Use LLM for table extraction
    },
    "picture": {
      "format": "Html",
      "strategy": "LLM"           // Generate image descriptions
    }
  }
}
High-resolution processing significantly improves quality but adds ~7 seconds per page. Use it for documents requiring precise extraction.

Common Use Cases

RAG/LLM Pipeline

Extract chunks for embedding and retrieval:
# Process with semantic chunking
response = requests.post(
    "http://localhost:8000/api/v1/task/parse",
    json={
        "file": file_url,
        "chunk_processing": {
            "target_length": 512
        },
        "segment_processing": {
            "text": {"format": "Markdown"}
        }
    }
)

# Extract chunks for embedding
task = wait_for_task(response.json()['task_id'])  # polling helper from “Polling for Results”
for page in task['output']:
    for segment in page['segments']:
        # Embed segment['content'] or segment['text']
        pass

Table Extraction

Extract structured tables with LLM enhancement:
response = requests.post(
    "http://localhost:8000/api/v1/task/parse",
    json={
        "file": file_url,
        "segment_processing": {
            "table": {
                "format": "Markdown",
                "strategy": "LLM"  # AI-enhanced structure
            }
        }
    }
)

Image Description

Generate descriptions for images using a vision-language model (VLM):
response = requests.post(
    "http://localhost:8000/api/v1/task/parse",
    json={
        "file": file_url,
        "segment_processing": {
            "picture": {
                "format": "Html",
                "strategy": "LLM"  # Generate descriptions
            }
        }
    }
)

Next Steps

  • API Reference: Explore all API endpoints and parameters
  • Configuration: Learn about advanced configuration options
  • Installation: Deploy Chunkr to production
  • Examples: See more code examples and use cases

Troubleshooting

If a task fails, check that:
  • Your LLM is configured in models.yaml
  • The file URL is accessible or base64 is valid
  • All required services are running (docker compose ps)
If processing is slow, consider:
  • Using the Auto OCR strategy instead of All
  • Disabling high_resolution if not needed
  • Deploying with GPU support (see Installation)
  • Scaling up worker replicas in compose.yaml
If LLM calls fail, verify:
  • The API key in models.yaml is valid
  • The LLM endpoint is reachable
  • Rate limits aren’t exceeded
  • The model supports the OpenAI-compatible format
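
To test an LLM endpoint directly, you can send it a standard OpenAI-compatible chat completions request. The values below are placeholders; substitute the provider URL, model, and key from your models.yaml:
Python
import os
import requests

# Placeholder endpoint, model, and env var; use the values from your models.yaml
provider_url = "https://api.openai.com/v1/chat/completions"
api_key = os.environ["LLM_API_KEY"]

response = requests.post(
    provider_url,
    headers={"Authorization": f"Bearer {api_key}"},
    json={"model": "gpt-4o", "messages": [{"role": "user", "content": "ping"}]},
    timeout=30,
)
print(response.status_code, response.text)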
