Quick Start
Install LiteParse and parse your first document in under 2 minutes.
Library Usage
Use LiteParse as a Node.js library in your application.
CLI Reference
Explore all CLI commands: parse, batch-parse, and screenshot.
API Reference
Full TypeScript API — the LiteParse class, config options, and types.
Key features
Spatial text extraction
Preserves text layout with precise bounding boxes using PDF.js — ideal for structured documents.
Built-in OCR
Tesseract.js is included out of the box. No setup required for scanned documents.
Pluggable OCR servers
Connect EasyOCR, PaddleOCR, or any custom OCR server via a simple HTTP API.
Multi-format input
Automatically converts DOCX, XLSX, PPTX, and images to PDF before parsing.
Screenshot generation
Generate high-quality page screenshots for LLM visual agents.
Runs locally
No cloud dependencies. Everything runs on your machine — Linux, macOS, or Windows.
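To make the spatial text extraction feature concrete, here is a minimal sketch of how bounding boxes let you rebuild reading order from extracted text. The `TextItem` shape and the `toReadingOrder` helper are hypothetical and exist only for illustration; they are not LiteParse's actual types or API, so check the API Reference for the real output format.

```typescript
// Hypothetical shape for one extracted text item (NOT LiteParse's actual type).
interface TextItem {
  text: string;
  x: number; // left edge, in page units
  y: number; // top edge, in page units (assumes origin at top-left)
  width: number;
  height: number;
}

// Group items into lines by vertical position, then order each line left to right.
function toReadingOrder(items: TextItem[], lineTolerance = 2): string[] {
  // Sort top to bottom first so items on the same line end up adjacent.
  const sorted = [...items].sort((a, b) => a.y - b.y || a.x - b.x);
  const lines: TextItem[][] = [];

  for (const item of sorted) {
    const current = lines[lines.length - 1];
    // Treat the item as part of the current line if its top edge is
    // within the tolerance of that line's first item.
    if (current && Math.abs(item.y - current[0].y) <= lineTolerance) {
      current.push(item);
    } else {
      lines.push([item]);
    }
  }

  // Within each line, order by the left edge and join with spaces.
  return lines.map((line) =>
    line
      .sort((a, b) => a.x - b.x)
      .map((item) => item.text)
      .join(" ")
  );
}

// Sample items in scrambled order, with slightly uneven baselines.
const sample: TextItem[] = [
  { text: "Total", x: 50, y: 100.5, width: 30, height: 10 },
  { text: "Invoice", x: 50, y: 20, width: 40, height: 12 },
  { text: "42.00", x: 200, y: 100, width: 30, height: 10 },
  { text: "#1001", x: 100, y: 20.4, width: 35, height: 12 },
];

console.log(toReadingOrder(sample)); // [ 'Invoice #1001', 'Total 42.00' ]
```

The tolerance absorbs small baseline differences between items on the same visual line. A real layout pass would also need to handle columns and rotated text, which this sketch ignores.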
Get started
Need higher accuracy on complex documents — dense tables, multi-column layouts, or handwritten text? Try LlamaParse, the cloud-based document parser built for production pipelines.