CLI
Parse your first document
Pass any PDF (or other supported document format) to lit parse. Output is printed to stdout as plain text with the spatial layout preserved. To save it to a file, redirect stdout, for example by appending > output.txt to the command.
Library
Parse a file
Import LiteParse and call parse() with a file path. The returned result.text contains the full document text with spatial layout preserved across all pages. Per-page data is available in result.pages.
Parse from a Buffer or Uint8Array
You can pass raw bytes instead of a file path. PDF bytes go straight to the parser with zero disk I/O; non-PDF bytes are written to a temp file for format conversion.
The bytes can come from anywhere, for example a file read from disk or the body of an HTTP response.
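To anticipate which path a given byte input will take, you can check whether it starts with the PDF file signature. The sketch below is illustrative only: looksLikePdf is a hypothetical helper, not part of the LiteParse API, and this page does not say how LiteParse itself tells PDF bytes from other formats. It relies only on the fact that PDF files begin with the ASCII bytes "%PDF-".

```typescript
// Hypothetical helper, not part of LiteParse: checks for the PDF file signature.
// PDF files start with the ASCII characters "%PDF-".
const PDF_SIGNATURE: number[] = [0x25, 0x50, 0x44, 0x46, 0x2d];

/** Returns true when the bytes begin with "%PDF-". */
function looksLikePdf(bytes: Uint8Array): boolean {
  if (bytes.length < PDF_SIGNATURE.length) {
    return false; // too short to contain the signature
  }
  return PDF_SIGNATURE.every((expected, i) => bytes[i] === expected);
}

// "%PDF-1.7" as raw bytes: would be treated as PDF input.
const pdfSample = new Uint8Array([0x25, 0x50, 0x44, 0x46, 0x2d, 0x31, 0x2e, 0x37]);

// "PK\x03\x04" is the ZIP header used by formats such as .docx: not a PDF.
const zipSample = new Uint8Array([0x50, 0x4b, 0x03, 0x04]);

const pdfIsPdf = looksLikePdf(pdfSample); // true
const zipIsPdf = looksLikePdf(zipSample); // false
```

A Node.js Buffer is a Uint8Array subclass, so the same check works on the result of a file read as well as on bytes built from an HTTP response body.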
OCR is enabled by default using the built-in Tesseract.js engine, so no setup is required. On the first run, Tesseract downloads language data from the internet. For offline use, set TESSDATA_PREFIX to a directory containing pre-downloaded .traineddata files.
Next steps
Library usage
Explore the full LiteParse API: configuration options, OCR setup, screenshot generation, and more.
CLI reference
Full reference for the lit parse, lit batch-parse, and lit screenshot commands and all their flags.