Introduction

Reasoning-based RAG • No Vector DB • No Chunking • Human-like Retrieval

Are you frustrated with the retrieval accuracy of vector databases on long professional documents? Traditional vector-based RAG relies on semantic similarity rather than true relevance. But similarity ≠ relevance: what retrieval truly needs is relevance, and that requires reasoning. Inspired by AlphaGo, PageIndex is a vectorless, reasoning-based RAG system that builds a hierarchical tree index from a long document and uses LLMs to reason over that index for agentic, context-aware retrieval. Through tree search, it simulates how human experts navigate and extract knowledge from complex documents.

Quick Start

Get PageIndex running on your first PDF in under 5 minutes

Installation

Install PageIndex and set up your environment

Tree Structure

Understand the hierarchical tree index that powers PageIndex

API Reference

Explore configuration options and advanced parameters

How PageIndex Works

PageIndex performs retrieval in two steps:

1. Generate a hierarchical tree index: similar to a “Table of Contents” but optimized for LLMs
2. Perform reasoning-based retrieval: navigate the tree through LLM-powered tree search
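
The two steps above can be pictured with a toy example. The sketch below is not PageIndex's implementation or API: the `Node` schema, `tree_search`, and `keyword_judge` are hypothetical names, the tree is built by hand rather than generated from a PDF, and a simple keyword check stands in for the LLM's relevance judgment.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Node:
    """One document section (hypothetical schema, not PageIndex's)."""
    title: str
    summary: str
    start_page: int
    end_page: int
    children: List["Node"] = field(default_factory=list)


# Step 1: a hand-built tree index, shaped like a table of contents.
tree = Node("Annual Report", "Full report", 1, 40, [
    Node("Business Overview", "Products, markets, strategy", 1, 10),
    Node("Financial Statements", "Income statement, balance sheet, cash flow", 11, 30, [
        Node("Income Statement", "Revenue, expenses, net income", 11, 18),
        Node("Balance Sheet", "Assets, liabilities, equity", 19, 30),
    ]),
    Node("Risk Factors", "Market, legal, and operational risk", 31, 40),
])


def keyword_judge(query: str, node: Node) -> bool:
    """Stand-in for an LLM call: relevant if the query shares a word
    with the node's title or summary."""
    node_words = set(re.findall(r"[a-z]+", f"{node.title} {node.summary}".lower()))
    query_words = set(re.findall(r"[a-z]+", query.lower()))
    return bool(node_words & query_words)


def tree_search(
    query: str,
    node: Node,
    judge: Callable[[str, Node], bool],
    path: Tuple[str, ...] = (),
) -> List[Tuple[Tuple[str, ...], Node]]:
    """Step 2: descend only into children the judge marks relevant and
    return each leaf reached, together with the path that led to it."""
    path = path + (node.title,)
    if not node.children:
        return [(path, node)]
    hits = []
    for child in node.children:
        if judge(query, child):
            hits.extend(tree_search(query, child, judge, path))
    return hits


for path, leaf in tree_search("net income", tree, keyword_judge):
    print(" > ".join(path), f"(pages {leaf.start_page}-{leaf.end_page})")
```

In a real system the judge would be a prompt asking an LLM whether a section is worth exploring for the query. The search logic itself would stay the same shape: whole branches are pruned without ever reading their pages.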

Core Features

Compared to traditional vector-based RAG, PageIndex features:

No Vector DB

Uses document structure and LLM reasoning for retrieval, instead of vector similarity search

No Chunking

Documents are organized into natural sections, not artificial chunks

Human-like Retrieval

Simulates how human experts navigate and extract knowledge from complex documents

Better Explainability

Retrieval is based on reasoning, so every result is traceable and interpretable, with page and section references
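
To make the explainability point concrete, here is a minimal sketch of what a traceable retrieval result could look like. The `Step` record and `format_trace` helper are hypothetical, invented for illustration; they are not PageIndex types, and the sections, page ranges, and reasons are made-up sample data.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Step:
    """One navigation decision (hypothetical record, not a PageIndex type)."""
    section: str
    pages: Tuple[int, int]  # (first_page, last_page)
    reason: str             # in a real system, the LLM's stated rationale


def format_trace(steps: List[Step]) -> str:
    """Render the decisions as a numbered trail with section and page references."""
    lines = []
    for i, step in enumerate(steps, start=1):
        first, last = step.pages
        lines.append(f"{i}. {step.section} (pp. {first}-{last}): {step.reason}")
    return "\n".join(lines)


# Sample data only: an imagined trail for a question about net income.
trace = [
    Step("Financial Statements", (11, 30), "question asks about a reported figure"),
    Step("Income Statement", (11, 18), "net income is reported here"),
]
print(format_trace(trace))
```

A reader can check each step of such a trail against the source document, which a bare similarity score does not allow.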

State-of-the-Art Performance

PageIndex powers a reasoning-based RAG system that achieved 98.7% accuracy on FinanceBench, demonstrating superior performance over vector-based RAG solutions in professional document analysis.

See the Mafin 2.5 benchmark results for detailed comparisons and performance metrics.

Deployment Options

Self-Host

Run locally with the open-source package

Cloud Service

Try instantly with the Chat Platform or integrate via MCP/API

Enterprise

Private or on-prem deployment with dedicated support

Next Steps

Quick Start Guide

Generate your first PageIndex tree structure in minutes

Vectorless RAG Cookbook

See a minimal, hands-on example of reasoning-based RAG
