Overview

The PageIndex Cloud API is a hosted service for generating PageIndex tree structures and performing reasoning-based RAG, with no local setup or infrastructure to manage.

API Documentation

View the complete API documentation at docs.pageindex.ai

Available Services

PageIndex Chat Platform

A ChatGPT-style interface for document analysis with human-like retrieval:

Chat Platform

Try the PageIndex Chat Platform at chat.pageindex.ai
Features:
  • Upload and chat with long PDF documents
  • Reasoning-based retrieval with full traceability
  • Multi-document conversations
  • Page and section references in responses
  • No vector database or chunking required

MCP Integration

Integrate PageIndex into Claude, Cursor, or any MCP-enabled agent:

MCP Setup

Set up PageIndex MCP at pageindex.ai/mcp
Features:
  • Native integration with Claude Desktop
  • Works with Cursor IDE
  • Compatible with any MCP-enabled application
  • Full PageIndex reasoning capabilities

REST API

Programmatic access to PageIndex services:

API Quickstart

Get started with the API at docs.pageindex.ai/quickstart
Key Endpoints:
  • Tree structure generation
  • Document upload and processing
  • Reasoning-based search
  • Multi-step retrieval

Deployment Options

Self-Host

Run locally with the open-source repository on GitHub

Cloud Service

Use the hosted Chat Platform, MCP, or REST API

Enterprise

Private or on-premises deployment

Self-Hosted (Open Source)

Pros:
  • Complete control over infrastructure
  • No PageIndex API costs (you pay only for your own OpenAI usage)
  • Customizable processing pipeline
  • No data leaves your environment
Cons:
  • Requires local setup and maintenance
  • Slower processing (without the Cloud Service's optimizations)
  • Manual scaling and monitoring
Best for: Development, testing, custom workflows

Cloud Service

Pros:
  • Instant access, no setup required
  • Optimized processing (faster results)
  • Managed infrastructure
  • Automatic updates and improvements
Cons:
  • Requires API subscription
  • Data sent to PageIndex servers
Best for: Production applications, rapid prototyping

Enterprise Deployment

Pros:
  • Private cloud or on-premises hosting
  • Full data control and compliance
  • Custom SLAs and support
  • Dedicated resources
Cons:
  • Higher cost
  • Requires enterprise contract
Best for: Large organizations, regulated industries

Contact for Enterprise

Contact us for enterprise deployment options

Cloud API vs. Self-Hosted

| Feature | Cloud API | Self-Hosted |
| --- | --- | --- |
| Setup Time | Instant | 5-10 minutes |
| Processing Speed | Optimized | Standard |
| Infrastructure | Managed | Self-managed |
| API Costs | Subscription | OpenAI only |
| Data Privacy | Cloud | Fully private |
| Customization | Limited | Full |
| Scaling | Automatic | Manual |
| Support | Included | Community |

Getting Started with Cloud API

1. Sign Up

Visit the PageIndex Chat Platform at chat.pageindex.ai to create an account.

2. Get API Key

API access is currently in beta. Contact us to request access.
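
Once a key has been issued, a common way to keep it out of source code is to read it from an environment variable. The sketch below assumes the variable name PAGEINDEX_API_KEY (borrowed from the JavaScript example on this page); both helper functions are illustrative and not part of any PageIndex SDK.

```python
import os


def load_api_key(env_var: str = "PAGEINDEX_API_KEY") -> str:
    """Return the API key stored in an environment variable.

    The variable name is an assumption made for illustration.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return key


def auth_headers(api_key: str) -> dict[str, str]:
    """Build the Bearer authorization header used by the examples on this page."""
    return {"Authorization": f"Bearer {api_key}"}
```

Failing fast on a missing key gives a clearer error than an authentication failure returned later by the server.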

3. Quick Example

import os

import requests

# API endpoint (example - see official docs)
url = "https://api.pageindex.ai/v1/process"

# Read the key from the environment (the variable name is an example)
api_key = os.environ["PAGEINDEX_API_KEY"]
headers = {"Authorization": f"Bearer {api_key}"}

# Upload document
with open("document.pdf", "rb") as f:
    response = requests.post(url, files={"file": f}, headers=headers)

# Stop on HTTP errors instead of parsing an error body as a result
response.raise_for_status()

result = response.json()
print(f"Document ID: {result['doc_id']}")
print(f"Tree structure: {result['structure']}")

The above is a simplified example. See the official API documentation for complete endpoints and authentication.

Cloud API Features

Tree Structure Generation

Generate PageIndex tree structures from uploaded documents:
  • Automatic TOC detection and extraction
  • Hierarchical structure with page ranges
  • Optional AI summaries for each section
  • Support for complex document layouts
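
To make "hierarchical structure with page ranges" concrete, here is a minimal sketch of walking such a tree in Python. The field names (title, start_index, end_index, summary, nodes) and the sample data are assumptions for illustration; check the official API documentation for the actual response schema.

```python
from typing import Iterator

# Hypothetical tree; field names and values are illustrative only.
sample_tree = {
    "title": "Annual Report",
    "start_index": 1,
    "end_index": 40,
    "nodes": [
        {"title": "Overview", "start_index": 1, "end_index": 5, "nodes": []},
        {
            "title": "Financial Statements",
            "start_index": 6,
            "end_index": 40,
            "summary": "Balance sheet and income statement.",
            "nodes": [
                {"title": "Income Statement", "start_index": 6, "end_index": 20, "nodes": []},
            ],
        },
    ],
}


def walk(node: dict, depth: int = 0) -> Iterator[tuple[int, str, int, int]]:
    """Yield (depth, title, first_page, last_page) in document order."""
    yield depth, node["title"], node["start_index"], node["end_index"]
    for child in node.get("nodes", []):
        yield from walk(child, depth + 1)


def outline(tree: dict) -> str:
    """Render the tree as an indented outline with page ranges."""
    return "\n".join(
        f"{'  ' * depth}{title} (pp. {first}-{last})"
        for depth, title, first, last in walk(tree)
    )


print(outline(sample_tree))
```

Because each node carries its own page range, a caller can jump straight to the relevant pages without splitting the document into fixed-size chunks.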

Reasoning-Based Search

Perform multi-step reasoning over document structures:
  • Natural language queries
  • Context-aware retrieval
  • Tree search with reasoning steps
  • Traceable results with page references
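
The idea of a tree search that records its reasoning steps can be sketched as follows. In the hosted service a language model presumably decides which branch to follow; here a simple keyword-overlap score stands in for that judgment so the example runs offline. The function names, field names, and sample tree are all hypothetical.

```python
def relevance(query: str, node: dict) -> int:
    """Stand-in for LLM reasoning: count query words found in the node text."""
    text = f"{node['title']} {node.get('summary', '')}".lower()
    return sum(1 for word in set(query.lower().split()) if word in text)


def tree_search(query: str, root: dict) -> dict:
    """Descend from the root, choosing the best-scoring child at each level.

    Returns the selected node's page range plus a trace of visited titles.
    """
    node, trace = root, [root["title"]]
    while node.get("nodes"):
        best = max(node["nodes"], key=lambda child: relevance(query, child))
        if relevance(query, best) == 0:
            break  # no child looks relevant; stop at the current node
        node = best
        trace.append(node["title"])
    return {"pages": (node["start_index"], node["end_index"]), "trace": trace}


# Hypothetical sample tree for demonstration.
tree = {
    "title": "Annual Report", "start_index": 1, "end_index": 40,
    "nodes": [
        {"title": "Risk Factors", "start_index": 1, "end_index": 9, "nodes": []},
        {
            "title": "Financial Results",
            "summary": "Revenue and profit by quarter",
            "start_index": 10, "end_index": 40,
            "nodes": [
                {"title": "Q3 Revenue", "start_index": 10, "end_index": 24, "nodes": []},
                {"title": "Q4 Revenue", "start_index": 25, "end_index": 40, "nodes": []},
            ],
        },
    ],
}

result = tree_search("q4 revenue", tree)
print(result["trace"], result["pages"])
```

The trace is what makes the result auditable: each entry names a section the search chose to enter, and the final page range tells the reader where to verify the answer.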

Multi-Document Support

Work with multiple documents simultaneously:
  • Cross-document search
  • Document comparison
  • Multi-source answers
  • Document relationship analysis
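
One way to picture multi-source answers: run a search per document, then merge the hits while keeping track of which document and page each one came from. The sketch below assumes each per-document search returns a list of hits with page, score, and text fields; those field names, the document IDs, and the sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    doc_id: str
    page: int
    score: float
    text: str


def merge_hits(per_doc: dict[str, list[dict]], top_k: int = 3) -> list[Hit]:
    """Combine per-document hits into one ranked list with source attribution."""
    merged = [
        Hit(doc_id, h["page"], h["score"], h["text"])
        for doc_id, hits in per_doc.items()
        for h in hits
    ]
    # Highest score first; ties broken by document ID and page for stable output.
    merged.sort(key=lambda h: (-h.score, h.doc_id, h.page))
    return merged[:top_k]


# Made-up sample hits for two documents.
per_doc = {
    "report_2023": [{"page": 12, "score": 0.91, "text": "Revenue grew 8%."}],
    "report_2024": [
        {"page": 14, "score": 0.95, "text": "Revenue grew 11%."},
        {"page": 30, "score": 0.40, "text": "Outlook remains stable."},
    ],
}

for hit in merge_hits(per_doc, top_k=2):
    print(f"{hit.doc_id} p.{hit.page}: {hit.text}")
```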

Integration Examples

Python Client

from pageindex_client import PageIndexClient

# Initialize client
client = PageIndexClient(api_key="your_api_key")

# Upload and process document
doc = client.upload("financial_report.pdf")

# Get tree structure
tree = doc.get_structure()

# Perform search
results = doc.search("What were the total revenues in Q4?")
print(results.answer)
print(results.sources)  # Page references

JavaScript/TypeScript

import { PageIndexClient } from '@pageindex/client';

// Initialize client
const client = new PageIndexClient({ apiKey: process.env.PAGEINDEX_API_KEY });

// Upload document
const doc = await client.upload('document.pdf');

// Get structure
const tree = await doc.getStructure();
console.log(tree.structure);

// Search
const results = await doc.search('Summarize the executive summary');
console.log(results.answer);

cURL

# Upload document
curl -X POST https://api.pageindex.ai/v1/upload \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "file=@document.pdf"

# Get structure
curl -X GET https://api.pageindex.ai/v1/documents/{doc_id}/structure \
  -H "Authorization: Bearer YOUR_API_KEY"

# Search
curl -X POST https://api.pageindex.ai/v1/documents/{doc_id}/search \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"query": "What is the main conclusion?"}'
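
For readers who prefer Python without third-party packages, the search call above can be expressed with the standard library. This is a sketch only: the base URL and path mirror the cURL example and may not match the real endpoints, so consult the official documentation before relying on them. The request is built separately from being sent, which makes it easy to inspect.

```python
import json
import urllib.parse
import urllib.request

# Taken from the cURL example above; verify against the official docs.
BASE_URL = "https://api.pageindex.ai/v1"


def build_search_request(doc_id: str, query: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the POST request from the cURL search example."""
    body = json.dumps({"query": query}).encode("utf-8")
    safe_id = urllib.parse.quote(doc_id, safe="")  # escape any unsafe characters
    return urllib.request.Request(
        url=f"{BASE_URL}/documents/{safe_id}/search",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

A call such as send(build_search_request(doc_id, "What is the main conclusion?", api_key)) would then perform the search, assuming the endpoint exists as shown.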

Pricing

PageIndex Cloud API is currently in beta. Pricing will be announced upon general availability.

Beta Access:
  • Limited free tier available
  • Early adopter discounts
  • Usage-based pricing planned

Request Beta Access

Contact us to request beta access to the API

Use Cases

Financial Analysis

  • Analyze earnings reports, 10-Ks, 10-Qs
  • Extract specific metrics and KPIs
  • Compare financial documents

Legal Document Review

  • Navigate complex contracts
  • Extract clauses and obligations
  • Compare legal documents

Research & Academia

  • Analyze research papers
  • Extract methodology and findings
  • Literature review automation

Technical Documentation

  • Search product manuals
  • Find specific procedures
  • Extract technical specifications

Support & Resources

Documentation

Complete API documentation and guides

Discord Community

Join our Discord for support and discussions

GitHub Repository

Open-source code and examples

Blog

Technical articles and updates

Frequently Asked Questions

The Cloud API provides the same core PageIndex functionality with additional optimizations, managed infrastructure, and extra features like multi-document support. The open-source version is great for self-hosting and customization.
The Cloud API is designed to be compatible with the open-source version, so existing workflows can be migrated to it with minimal code changes.
For the Cloud API, documents are processed on our secure servers and can be deleted after processing. For maximum privacy, use the self-hosted version or contact us about enterprise on-premises deployment.
Currently, PDF files are fully supported. Markdown support is available in the open-source version. Additional formats may be added in the future.
For the Cloud API, OpenAI costs are included in the subscription. For self-hosted deployments, you use your own OpenAI API key directly.

Next Steps

1. Try the Chat Platform

Experience PageIndex with the Chat Platform, no setup required

2. Explore Documentation

Read the full API documentation for integration details

3. Request API Access

Contact us to request beta API access

4. Join the Community

Join our Discord for support and discussions
