Overview
Paragrafs provides a simple API for converting raw AI transcription tokens into properly formatted paragraphs. This guide covers the essential functions you’ll need to get started.
Installation
First, install Paragrafs in your project using your package manager.
Core Workflow
The basic workflow for processing transcriptions involves three main steps:
Estimate segments from tokens
Convert multi-word tokens into segments with word-level timing information.
Mark and combine segments
Process segments to identify natural paragraph breaks based on fillers, gaps, and punctuation.
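The first step above can be sketched without the library to show the idea. This is a minimal illustration, not Paragrafs' implementation: the `Token` and `Segment` shapes and the `estimateWordTimings` helper are hypothetical, and the even split of duration across words is an assumed heuristic.

```typescript
// Hypothetical shapes for illustration; the library's own types may differ.
type Token = { start: number; end: number; text: string };
type Segment = Token & { tokens: Token[] };

// Split a multi-word token into per-word tokens, giving each word an
// equal share of the token's duration (an assumed heuristic).
function estimateWordTimings(token: Token): Segment {
    const words = token.text.split(/\s+/).filter((word) => word.length > 0);
    if (words.length === 0) {
        return { ...token, tokens: [] };
    }
    const step = (token.end - token.start) / words.length;
    const tokens = words.map((text, index) => ({
        start: token.start + index * step,
        end: token.start + (index + 1) * step,
        text,
    }));
    return { ...token, tokens };
}

const segment = estimateWordTimings({ start: 0, end: 4, text: 'The quick brown fox' });
console.log(segment.tokens);
```

With a four-second token of four words, each word receives a one-second slot. Real speech is rarely this uniform, so an estimate like this is only a starting point for later break detection.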
Quick Start Example
Here’s a complete example showing how to process a simple transcription:
Working with Multiple Segments
For more complex transcriptions with multiple segments, use the complete processing pipeline:
Configuration Options
The `markAndCombineSegments` function accepts several options to customize paragraph reconstruction:
| Option | Type | Description |
|---|---|---|
| `fillers` | `string[]` | Words to treat as filler (e.g., “uh”, “umm”) that trigger segment breaks |
| `gapThreshold` | `number` | Minimum time gap in seconds to trigger a segment break |
| `maxSecondsPerSegment` | `number` | Maximum duration in seconds for a single segment |
| `minWordsPerSegment` | `number` | Minimum words required for a segment to stand alone |
| `hints` | `Hints` | Optional multi-word phrase hints for custom break points |
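To make the effect of these options concrete, here is a rough standalone sketch of how `fillers`, `gapThreshold`, and `maxSecondsPerSegment` could drive paragraph breaks. It does not call the library: the `groupTokens` helper and `BreakOptions` type are hypothetical, dropping fillers from the output is an assumption, and `minWordsPerSegment` and `hints` are left out for brevity.

```typescript
type Token = { start: number; end: number; text: string };

// Hypothetical subset of the documented options, for illustration only.
type BreakOptions = {
    fillers: string[];
    gapThreshold: number;
    maxSecondsPerSegment: number;
};

// Group word tokens into paragraphs. A new paragraph starts when a filler
// is seen, when the silence before a word reaches gapThreshold, or when
// adding the word would push the paragraph past maxSecondsPerSegment.
function groupTokens(tokens: Token[], options: BreakOptions): Token[][] {
    const fillers = new Set(options.fillers.map((filler) => filler.toLowerCase()));
    const groups: Token[][] = [];
    let current: Token[] = [];

    const flush = () => {
        if (current.length > 0) {
            groups.push(current);
            current = [];
        }
    };

    for (const token of tokens) {
        // Assumption: fillers are dropped and act as a break point.
        if (fillers.has(token.text.toLowerCase())) {
            flush();
            continue;
        }
        const first = current[0];
        const previous = current[current.length - 1];
        if (first && previous) {
            const gap = token.start - previous.end;
            const duration = token.end - first.start;
            if (gap >= options.gapThreshold || duration > options.maxSecondsPerSegment) {
                flush();
            }
        }
        current.push(token);
    }
    flush();
    return groups;
}

const words: Token[] = [
    { start: 0, end: 0.5, text: 'Hello' },
    { start: 0.5, end: 1, text: 'world.' },
    { start: 1, end: 1.3, text: 'umm' },
    { start: 1.4, end: 2, text: 'Next' },
    { start: 2, end: 2.5, text: 'point.' },
    { start: 5, end: 5.5, text: 'Later' },
];

const paragraphs = groupTokens(words, {
    fillers: ['uh', 'umm'],
    gapThreshold: 2,
    maxSecondsPerSegment: 10,
}).map((group) => group.map((token) => token.text).join(' '));

console.log(paragraphs); // → ['Hello world.', 'Next point.', 'Later']
```

In this sketch the filler "umm" ends the first paragraph, and the 2.5-second silence before "Later" ends the second. Lowering `gapThreshold` produces more, shorter paragraphs; raising it merges them.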
Core Data Types
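As a rough orientation, the two shapes this guide relies on can be sketched as follows. The names `Token` and `Segment`, their fields, and the `isConsistent` helper are assumptions inferred from the workflow described above, not the library's confirmed definitions; check the exported types before relying on them.

```typescript
// Assumed shape: a span of transcribed text with start and end times in seconds.
type Token = {
    start: number;
    end: number;
    text: string;
};

// Assumed shape: a token that also carries its word-level tokens.
type Segment = Token & { tokens: Token[] };

// Hypothetical sanity check: word tokens must be time-ordered,
// non-overlapping, and contained within the segment's bounds.
function isConsistent(segment: Segment): boolean {
    let cursor = segment.start;
    for (const token of segment.tokens) {
        if (token.start < cursor || token.end < token.start || token.end > segment.end) {
            return false;
        }
        cursor = token.end;
    }
    return true;
}

const example: Segment = {
    start: 0,
    end: 2,
    text: 'Hello world',
    tokens: [
        { start: 0, end: 1, text: 'Hello' },
        { start: 1, end: 2, text: 'world' },
    ],
};

console.log(isConsistent(example)); // → true
```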
Understanding the basic types will help you work effectively with Paragrafs.
Next Steps
Timestamped Transcripts
Learn how to create human-readable transcripts with timestamps
Ground Truth Alignment
Align AI tokens with human-edited text
Auto-Hint Generation
Automatically discover repeated phrases
Arabic Support
Work with Arabic text normalization