PageIndex

Reasoning-based RAG • No Vector DB • No Chunking • Human-like Retrieval

Frustrated with vector database retrieval accuracy on long professional documents? Traditional vector-based RAG relies on semantic similarity rather than true relevance. But similarity ≠ relevance: what retrieval truly needs is relevance, and that requires reasoning. Inspired by AlphaGo, PageIndex is a vectorless, reasoning-based RAG system that builds a hierarchical tree index from long documents and uses LLMs to reason over that index for agentic, context-aware retrieval. Through tree search, it simulates how human experts navigate and extract knowledge from complex documents.

Quick Start

Get PageIndex running on your first PDF in under 5 minutes

Installation

Install PageIndex and set up your environment

Tree Structure

Understand the hierarchical tree index that powers PageIndex

API Reference

Explore configuration options and advanced parameters

How PageIndex Works

PageIndex performs retrieval in two steps:
  1. Generate a hierarchical tree index - Similar to a “Table of Contents” but optimized for LLMs
  2. Perform reasoning-based retrieval - Navigate the tree through LLM-powered tree search
[Figure: PageIndex workflow]
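
The two steps above can be illustrated with a minimal sketch. This is not the PageIndex API: the Node schema, the tree_search function, and the keyword_overlap judge are all hypothetical names introduced for illustration. In particular, keyword_overlap is a toy stand-in for the LLM's relevance judgment, and the greedy single-path descent is a simplification of tree search. The index here is built by hand, whereas PageIndex generates it from the document.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Node:
    """One section of the tree index (hypothetical schema, not PageIndex's)."""
    title: str
    summary: str
    start_page: int
    end_page: int
    children: List["Node"] = field(default_factory=list)


def keyword_overlap(query: str, node: Node) -> int:
    """Toy stand-in for the LLM's judgment: count words shared with the query."""
    def tokenize(text: str) -> set:
        return set(re.findall(r"[a-z]+", text.lower()))
    return len(tokenize(query) & tokenize(f"{node.title} {node.summary}"))


def tree_search(
    root: Node,
    query: str,
    judge: Callable[[str, Node], int] = keyword_overlap,
) -> Tuple[Node, List[str]]:
    """Descend from the root, following the best-judged child at each level.

    Returns the selected section and the path of titles that led to it,
    so the retrieval decision can be traced afterwards.
    """
    node, path = root, [root.title]
    while node.children:
        node = max(node.children, key=lambda child: judge(query, child))
        path.append(node.title)
    return node, path


# Step 1: a tiny hand-built tree index, shaped like a table of contents.
report = Node("Annual Report", "Full-year company report", 1, 120, [
    Node("Business Overview", "Products, markets, and strategy", 1, 30),
    Node("Financial Statements",
         "Income statement, balance sheet, and cash flow", 31, 120, [
             Node("Income Statement", "Revenue, expenses, and net income", 31, 40),
             Node("Balance Sheet", "Assets, liabilities, and equity", 41, 55),
         ]),
])

# Step 2: navigate the tree to the section most relevant to the question.
section, path = tree_search(report, "What was net income and revenue growth?")
print(" > ".join(path), f"(pages {section.start_page}-{section.end_page})")
# prints: Annual Report > Financial Statements > Income Statement (pages 31-40)
```

Swapping keyword_overlap for a function that asks an LLM to rate each child section would turn this into reasoning-based navigation. The returned path and page range show how a result can carry section and page references.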

Core Features

Compared to traditional vector-based RAG, PageIndex features:

No Vector DB

Uses document structure and LLM reasoning for retrieval, instead of vector similarity search

No Chunking

Documents are organized into natural sections, not artificial chunks

Human-like Retrieval

Simulates how human experts navigate and extract knowledge from complex documents

Better Explainability

Retrieval is based on reasoning, so every result is traceable and interpretable, with page and section references

State-of-the-Art Performance

PageIndex powers a reasoning-based RAG system that achieved 98.7% accuracy on FinanceBench, outperforming vector-based RAG solutions in professional document analysis.
See the Mafin 2.5 benchmark results for detailed comparisons and performance metrics.

Deployment Options

Self-Host

Run locally with the open-source package

Cloud Service

Try instantly with the Chat Platform or integrate via MCP/API

Enterprise

Private or on-prem deployment with dedicated support

Next Steps

Quick Start Guide

Generate your first PageIndex tree structure in minutes

Vectorless RAG Cookbook

See a minimal, hands-on example of reasoning-based RAG
