Paragrafs provides a comprehensive TypeScript API for processing transcripts, aligning ground truth, and working with timestamped text.
API Sections
Transcript Builders
Functions for processing tokens into formatted segments with natural breaks
Ground Truth Alignment
Align AI-generated tokens with human-edited text using longest common subsequence (LCS) matching
Editor Helpers
Utilities for finding tokens based on queries or text selections
Utility Functions
Helper functions for timestamps, punctuation, normalization, and more
Hint Generation
Auto-generate hints from repeated phrases in transcripts (Arabic-first)
Types
TypeScript types and interfaces used throughout the library
Quick Start
Core Concepts
Tokens
A Token represents a single word or phrase with timing information.
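The original code sample for this type is not preserved here, so the following is only a minimal sketch. It assumes a token carries `start` and `end` times in seconds plus its `text`; the field names and the `tokenDuration` helper are illustrative assumptions, and the Types page remains the authoritative definition.

```typescript
// Assumed shape of a token: field names and units are illustrative only.
type Token = {
    start: number; // time the word begins, in seconds
    end: number; // time the word ends, in seconds
    text: string; // the transcribed word or phrase
};

const token: Token = { start: 1, end: 2.5, text: 'Hello' };

// Hypothetical helper (not part of the library): how long the word was spoken for.
const tokenDuration = (t: Token): number => t.end - t.start;

console.log(`${token.text} lasts ${tokenDuration(token)}s`);
```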
Segments
A Segment is a higher-level structure containing multiple tokens.
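As a rough illustration, a segment can be thought of as a token-like span that also keeps the tokens it was built from. The sketch below assumes that shape; `buildSegment` is a hypothetical helper written for this example and is not a function exported by the library.

```typescript
// Assumed shapes: see the Types page for the library's real definitions.
type Token = { start: number; end: number; text: string };
type Segment = Token & { tokens: Token[] };

// Hypothetical helper: derive the segment's span and text from its tokens.
const buildSegment = (tokens: Token[]): Segment => {
    if (tokens.length === 0) {
        throw new Error('A segment needs at least one token');
    }

    return {
        start: tokens[0].start, // the segment starts with its first token
        end: tokens[tokens.length - 1].end, // and ends with its last token
        text: tokens.map((t) => t.text).join(' '),
        tokens,
    };
};

const segment = buildSegment([
    { start: 0, end: 0.4, text: 'Hello' },
    { start: 0.5, end: 1, text: 'world' },
]);

console.log(segment.text, segment.start, segment.end);
```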
Markers
The library uses special markers to indicate segment boundaries:
- SEGMENT_BREAK - Soft break (can be ignored if duration constraints allow)
- ALWAYS_BREAK - Hard break (must create a new segment/line)
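To make the soft/hard distinction concrete, here is a small sketch of how a consumer might group a marked token stream. The marker values, the `groupByMarkers` helper, and the `thresholdSeconds` rule (a soft break is honoured only once the current group has run at least that long) are all assumptions made for illustration; the library's own grouping logic may apply different constraints.

```typescript
type Token = { start: number; end: number; text: string };

// Hypothetical marker values: the library's actual constants may differ.
const SEGMENT_BREAK = 'SEGMENT_BREAK';
const ALWAYS_BREAK = 'ALWAYS_BREAK';

type Marked = Token | typeof SEGMENT_BREAK | typeof ALWAYS_BREAK;

// Hypothetical helper: split a marked stream into groups of tokens.
const groupByMarkers = (items: Marked[], thresholdSeconds: number): Token[][] => {
    const groups: Token[][] = [];
    let current: Token[] = [];

    // Close the current group if it holds any tokens.
    const flush = (): void => {
        if (current.length > 0) {
            groups.push(current);
            current = [];
        }
    };

    for (const item of items) {
        if (item === ALWAYS_BREAK) {
            flush(); // hard break: always starts a new group
        } else if (item === SEGMENT_BREAK) {
            // soft break: ignored while the current group is still short
            const duration = current.length > 0 ? current[current.length - 1].end - current[0].start : 0;
            if (duration >= thresholdSeconds) {
                flush();
            }
        } else {
            current.push(item);
        }
    }

    flush();
    return groups;
};

const groups = groupByMarkers(
    [
        { start: 0, end: 1, text: 'a' },
        SEGMENT_BREAK, // ignored: the group is only 1s long
        { start: 1, end: 2, text: 'b' },
        { start: 2, end: 5, text: 'c' },
        SEGMENT_BREAK, // honoured: the group now spans 5s
        { start: 5, end: 6, text: 'd' },
        ALWAYS_BREAK, // always honoured
        { start: 6, end: 7, text: 'e' },
    ],
    3,
);

console.log(groups.map((g) => g.map((t) => t.text).join(' ')));
```

With a threshold of 3 seconds, the first soft break is skipped and the second is applied, while the hard break splits the stream regardless of duration.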
Arabic-First Design
Many functions include Arabic-specific features:
- Diacritics removal
- Alef normalization (أإآ → ا)
- Ya normalization (ى → ي)
- Tatweel removal (ـ)
- Arabic punctuation support (؟ ؛)
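The normalization steps listed above can be sketched with plain regular expressions. This is an illustrative approximation, not the library's implementation: `normalizeArabic` and `endsWithPunctuation` are hypothetical names, and the exact set of code points the library handles may differ.

```typescript
// Hypothetical helper: apply the four normalization steps listed above.
const normalizeArabic = (text: string): string =>
    text
        .replace(/[\u064B-\u065F\u0670]/g, '') // strip diacritics (tashkeel), incl. superscript alef
        .replace(/\u0640/g, '') // strip tatweel (ـ)
        .replace(/[\u0623\u0625\u0622]/g, '\u0627') // أ إ آ → ا
        .replace(/\u0649/g, '\u064A'); // ى → ي

// Hypothetical helper: treat Arabic ؟ (U+061F) and ؛ (U+061B) like Latin sentence punctuation.
const endsWithPunctuation = (text: string): boolean => /[.!?\u061F\u061B]$/.test(text.trim());

console.log(normalizeArabic('أَحْمَد'));
console.log(endsWithPunctuation('كيف حالك؟'));
```

Normalizing both sides before comparison lets differently vocalized or spelled forms of the same word match, which is useful when comparing machine transcripts against human-edited text.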