Function Signature
src/utils/paragraphs.ts:236
Description
Groups text lines into coherent paragraphs while handling both prose and poetry content appropriately. This is the second stage of the paragraph reconstruction pipeline. The function:
- Merges consecutive prose lines into paragraphs based on vertical spacing and line width patterns
- Preserves poetic lines individually to maintain their formatting
- Processes body content and footnotes separately
- Uses enhanced paragraph detection with robust geometry heuristics
Parameters
Array of text lines to group into paragraphs. These are typically the output from mapObservationsToTextLines. Each TextBlock should contain:
- text: The line content
- bbox: Bounding box of the line
- isPoetic: Whether the line is poetry (will not be merged)
- isFootnote: Whether the line is a footnote (processed separately)
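As a rough illustration of the input shape listed for this parameter, the fields can be written as a TypeScript type. This is a sketch only: the names BoundingBoxSketch and TextLineSketch are hypothetical, the bbox layout (x, y, width, height) is an assumption, and whether the flags are optional is also assumed. The library's real TextBlock definition may differ.

```typescript
// Hypothetical shapes inferred from the field list above.
// They are NOT the library's actual type definitions.
type BoundingBoxSketch = {
  x: number; // assumed: left edge
  y: number; // assumed: top edge
  width: number;
  height: number;
};

type TextLineSketch = {
  text: string; // the line content
  bbox: BoundingBoxSketch; // bounding box of the line
  isPoetic?: boolean; // poetry lines are never merged
  isFootnote?: boolean; // footnotes are processed separately
};

// One prose line and one poetic line, as they might arrive from the
// previous pipeline stage.
const proseLine: TextLineSketch = {
  text: "The first line of a prose paragraph",
  bbox: { x: 40, y: 100, width: 520, height: 18 },
};

const poeticLine: TextLineSketch = {
  text: "A single verse that keeps its own line",
  bbox: { x: 120, y: 140, width: 360, height: 18 },
  isPoetic: true,
};
```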
Object-based paragraph detection settings.
Returns
Array of text blocks representing complete paragraphs.
- Prose lines are merged into paragraph-level blocks
- Poetic lines (isPoetic: true) are preserved individually
- Each block contains merged text and a bounding box covering the entire paragraph
- Body content and footnotes are processed separately then concatenated
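The return behaviour described above can be illustrated with a deliberately simplified sketch. It is not the library's implementation: the names Line, groupLines and sketchParagraphs are hypothetical, the bbox layout is assumed, and the only break rule used here is a vertical-gap threshold (maxGapRatio is an invented parameter), whereas the real detector also considers line widths and other heuristics.

```typescript
type BBox = { x: number; y: number; width: number; height: number }; // assumed layout
type Line = { text: string; bbox: BBox; isPoetic?: boolean; isFootnote?: boolean };

// Smallest box that covers both inputs.
const unionBox = (a: BBox, b: BBox): BBox => {
  const x = Math.min(a.x, b.x);
  const y = Math.min(a.y, b.y);
  const right = Math.max(a.x + a.width, b.x + b.width);
  const bottom = Math.max(a.y + a.height, b.y + b.height);
  return { x, y, width: right - x, height: bottom - y };
};

// Hypothetical rule: merge a prose line into the current paragraph when the
// vertical gap to the previous line is at most maxGapRatio * line height.
const groupLines = (lines: Line[], maxGapRatio = 1.0): Line[] => {
  const out: Line[] = [];
  let prev: Line | undefined;
  for (const line of lines) {
    const last = out[out.length - 1];
    if (
      last !== undefined &&
      prev !== undefined &&
      !line.isPoetic &&
      !prev.isPoetic &&
      line.bbox.y - (prev.bbox.y + prev.bbox.height) <= maxGapRatio * prev.bbox.height
    ) {
      // Extend the current paragraph: join text, grow the bounding box.
      out[out.length - 1] = {
        ...last,
        text: `${last.text} ${line.text}`,
        bbox: unionBox(last.bbox, line.bbox),
      };
    } else {
      // Poetic lines and lines after a large gap start a new block.
      out.push({ ...line });
    }
    prev = line;
  }
  return out;
};

// Body and footnotes are grouped separately, then concatenated.
const sketchParagraphs = (lines: Line[]): Line[] => [
  ...groupLines(lines.filter((l) => !l.isFootnote)),
  ...groupLines(lines.filter((l) => l.isFootnote)),
];

const box = (y: number, width = 100): BBox => ({ x: 0, y, width, height: 10 });
const sample: Line[] = [
  { text: "In the name of clarity,", bbox: box(0) },
  { text: "this line continues.", bbox: box(12, 60) },
  { text: "A new paragraph.", bbox: box(40) },
  { text: "Roses are red", bbox: box(52), isPoetic: true },
  { text: "Violets are blue", bbox: box(64), isPoetic: true },
  { text: "1. A note", bbox: box(200), isFootnote: true },
  { text: "continued.", bbox: box(212), isFootnote: true },
];

const paragraphs = sketchParagraphs(sample);
console.log(paragraphs.map((p) => p.text));
```

With this sample, the first two prose lines merge into one block, the third starts a new paragraph because of the larger gap, both poetic lines stay separate, and the two footnote lines merge into a single block placed after the body, giving five blocks in total.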
Example
Example with Poetry
Notes
- The enhanced paragraph detector uses robust geometry: a 75th-percentile (p75) line-width baseline and a robust x baseline
- Poetic lines are never merged; they preserve their original line breaks
- Body content and footnotes are processed separately using the same heuristics
- The function implements a single-break-per-line decision to avoid double increments
- Short-line interaction guards prevent premature paragraph breaks
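To make the p75 width baseline mentioned in these notes more concrete, here is a minimal sketch of how such a baseline could be computed and used to flag short lines. Everything in it is an assumption for illustration: the percentile helper, the linear interpolation method, and the SHORT_RATIO threshold are hypothetical and are not taken from the library.

```typescript
// Percentile with linear interpolation between neighbouring ranks (p in [0, 1]).
const percentile = (values: number[], p: number): number => {
  if (values.length === 0) return 0;
  const sorted = [...values].sort((a, b) => a - b);
  const pos = (sorted.length - 1) * p;
  const lo = Math.floor(pos);
  const hi = Math.ceil(pos);
  return sorted[lo] + (sorted[hi] - sorted[lo]) * (pos - lo);
};

// Line widths for one page: mostly full-width lines plus two short ones.
const widths = [500, 510, 495, 505, 220, 498, 502, 180];

// The 75th percentile approximates a "full line" width without being pulled
// down by the short lines the way a mean would be.
const baseline = percentile(widths, 0.75);

// Hypothetical threshold: a line counts as short below 80% of the baseline.
const SHORT_RATIO = 0.8;
const isShort = widths.map((w) => w < baseline * SHORT_RATIO);

console.log(baseline, isShort);
```

For these widths the baseline is 502.75, and only the 220 and 180 entries are flagged as short. A short line like these is a common hint that a paragraph ends there, which is why a width baseline is useful alongside vertical spacing.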