Breaking Changes (Recent)

This guide covers recent breaking changes in Paragrafs. Please review these changes carefully when upgrading to ensure your code continues to work as expected.

Hints are Normalized by Default

Breaking Change: createHints(...) now uses Arabic-first normalization for matching and mining by default.

What Changed

The createHints function now applies Arabic-first normalization automatically. This normalization is:
  • Diacritics tolerant - removes Arabic diacritical marks
  • Punctuation tolerant - handles various punctuation marks
  • More robust - better matching for Arabic text with variations
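
To make "diacritics tolerant" and "punctuation tolerant" concrete, here is a minimal sketch of this kind of normalization. The helper name normalizeForMatching and the exact character ranges are assumptions chosen for illustration; the library's own rules may differ.

```typescript
// Hypothetical helper for illustration only; not the library's implementation.
// U+064B-U+0652 covers tanween, short vowels, shadda and sukun; U+0670 is the dagger alef.
const ARABIC_DIACRITICS = /[\u064B-\u0652\u0670]/g;
// Arabic comma, semicolon and question mark, plus common Latin punctuation.
const PUNCTUATION = /[\u060C\u061B\u061F.,;:!?]/g;

function normalizeForMatching(text: string): string {
    return text.replace(ARABIC_DIACRITICS, '').replace(PUNCTUATION, '').trim();
}

const withTanween = '\u0645\u0631\u062D\u0628\u0627\u064B'; // مرحباً
const bare = '\u0645\u0631\u062D\u0628\u0627'; // مرحبا
const withComma = '\u0645\u0631\u062D\u0628\u0627\u060C'; // مرحبا،

// All three spellings collapse to the same comparison key.
console.log(normalizeForMatching(withTanween) === bare);
console.log(normalizeForMatching(withComma) === bare);
```

Comparing normalized keys rather than raw strings is what lets a single hint cover several written forms of the same word.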

Impact

If you previously relied on exact string matching for hints, your code may behave differently:
Before (Exact Matching)
import { createHints } from 'paragrafs';

// Previously matched exact strings only
const hints = createHints('مرحباً', 'السلام عليكم');
// Only matched text with exact diacritics and punctuation
After (Normalized Matching)
import { createHints } from 'paragrafs';

// Now uses normalization by default
const hints = createHints('مرحباً', 'السلام عليكم');
// Matches variations: 'مرحبا', 'مرحباً', 'مرحبا،' etc.

Migration Steps

1. Review your hint usage

Identify all places where you use createHints in your codebase:
grep -r "createHints" .
2. Update expectations

If you relied on exact matching, update your code to account for normalized matching. Hints now also match Arabic text that differs from the hint only in diacritics or punctuation.
3. Configure normalization (optional)

If needed, pass explicit normalization options:
const hints = createHints(
  { normalizeAlef: true, removeDiacritics: true },
  'مرحباً',
  'السلام عليكم'
);
4. Test thoroughly

Run your test suite to ensure hint matching behaves as expected:
bun test --coverage
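
When updating expectations (step 2), it can help to list which tokens in a sample transcript would match a hint only because of normalization. The sketch below is one rough way to do that. It uses a simplified stand-in for the normalization and compares single tokens only; findNewlyMatchedTokens and stripMarks are hypothetical names, not part of the Paragrafs API.

```typescript
// Hypothetical audit helper. The normalization below is a simplified stand-in
// (an assumption), not the normalization Paragrafs applies internally.
const stripMarks = (text: string): string =>
    text
        .replace(/[\u064B-\u0652\u0670]/g, '') // Arabic diacritics
        .replace(/[\u060C\u061B\u061F.,;:!?]/g, ''); // Arabic and Latin punctuation

// Returns tokens that differ from the hint as raw strings but equal it after normalization.
function findNewlyMatchedTokens(tokens: string[], hintWord: string): string[] {
    const target = stripMarks(hintWord);
    return tokens.filter((token) => token !== hintWord && stripMarks(token) === target);
}

const hintWord = '\u0645\u0631\u062D\u0628\u0627\u064B'; // مرحباً
const sampleTokens = [
    '\u0645\u0631\u062D\u0628\u0627', // مرحبا (no diacritics)
    '\u0645\u0631\u062D\u0628\u0627\u064B', // مرحباً (identical to the hint)
    '\u0645\u0631\u062D\u0628\u0627\u060C', // مرحبا، (trailing Arabic comma)
    '\u0634\u0643\u0631\u0627', // شكرا (unrelated word)
];

const newlyMatched = findNewlyMatchedTokens(sampleTokens, hintWord);
console.log(newlyMatched.length);
```

Tokens reported by a check like this are the places where segmentation output is most likely to change after upgrading.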

Reference

See src/utils/textUtils.ts:121-156 for implementation details.

ALWAYS_BREAK is a True Hard Boundary

Breaking Change: Segments or lines after an ALWAYS_BREAK must not be merged into previous segments.

What Changed

The ALWAYS_BREAK marker now enforces a strict boundary that cannot be crossed during segment merging operations.

Impact

Previously, some merging logic might have combined segments across ALWAYS_BREAK markers. This is no longer allowed:
Before (Soft Boundary)
const segments = markAndCombineSegments(tokens, options);
// ALWAYS_BREAK could sometimes be crossed during merging
After (Hard Boundary)
const segments = markAndCombineSegments(tokens, options);
// ALWAYS_BREAK creates an absolute boundary
// Segments after ALWAYS_BREAK will never merge with previous segments
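
The difference between a soft and a hard boundary can be illustrated with a toy merge routine. This is a sketch under stated assumptions: the local ALWAYS_BREAK sentinel, the mergeShortLines function, and the word-count rule are all hypothetical and do not reproduce the library's merging logic.

```typescript
// Local sentinel for illustration; an assumption, not imported from the library.
const ALWAYS_BREAK: unique symbol = Symbol('ALWAYS_BREAK');
type Item = string | typeof ALWAYS_BREAK;

// Appends each line to the previous one while the previous line has fewer than
// minWords words, but never merges across an ALWAYS_BREAK sentinel.
function mergeShortLines(items: Item[], minWords: number): string[] {
    const result: string[] = [];
    let canMerge = false; // false at the start and right after a hard break

    for (const item of items) {
        if (item === ALWAYS_BREAK) {
            canMerge = false;
            continue;
        }
        const last = result[result.length - 1];
        if (canMerge && last !== undefined && last.split(' ').length < minWords) {
            result[result.length - 1] = `${last} ${item}`;
        } else {
            result.push(item);
        }
        canMerge = true;
    }
    return result;
}

// 'a' is short enough to absorb the next line, but the hard break forbids it.
const merged = mergeShortLines(['a', ALWAYS_BREAK, 'b', 'c'], 3);
console.log(merged); // [ 'a', 'b c' ]
```

Without the sentinel the same input would collapse into a single line, which is the behavior the hard boundary rules out.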

Migration Steps

1. Review hint usage

If you use createHints() to create custom hints, be aware that matched hints now insert ALWAYS_BREAK markers that cannot be crossed during segment merging.
2. Check segment expectations

Review your code to ensure it doesn’t expect segments to merge across hint boundaries. Hints now create hard boundaries that are strictly preserved.
3. Test segmentation output

Run your segmentation logic and verify that hint-marked boundaries are preserved as expected. If you need softer boundaries, consider adjusting your hint strategy.
4. Validate results

Test your complete transcription workflow to ensure the stricter boundaries produce the desired output.
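
One way to check the steps above is to assert that no output segment spans a known boundary. The sketch below assumes that segments expose numeric start and end times and that you know the start times of the tokens where your hints matched. SegmentLike and findBoundaryViolations are hypothetical names introduced for illustration.

```typescript
// Minimal segment shape assumed for this sketch; real segments may carry more fields.
type SegmentLike = { start: number; end: number; text: string };

// Returns every segment that has a boundary time strictly inside it,
// meaning content after the boundary was merged into earlier content.
function findBoundaryViolations(segments: SegmentLike[], boundaryTimes: number[]): SegmentLike[] {
    return segments.filter((segment) =>
        boundaryTimes.some((time) => segment.start < time && time < segment.end),
    );
}

const boundaryTimes = [5]; // a hint matched the token starting at t = 5

const respected: SegmentLike[] = [
    { start: 0, end: 5, text: 'first part' },
    { start: 5, end: 9, text: 'second part' },
];
const crossed: SegmentLike[] = [{ start: 0, end: 9, text: 'first part second part' }];

console.log(findBoundaryViolations(respected, boundaryTimes).length); // 0
console.log(findBoundaryViolations(crossed, boundaryTimes).length); // 1
```

A check like this can run inside an existing test suite against real segmentation output, failing whenever the list of violations is not empty.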

Reference

See the following for implementation details:

General Migration Tips

Always run your full test suite after upgrading to identify breaking changes that affect your specific use case:
bun test --coverage

Run TypeScript compilation to catch any type-related issues:
bun run build

Check the changelog for a complete list of changes in each version.

Pin to a specific version in your package.json to avoid unexpected breaking changes:
{
  "dependencies": {
    "paragrafs": "1.6.0"
  }
}

Need Help?

Report Issues

Found a bug or need help migrating? Open an issue on GitHub.

API Reference

Review the complete API documentation
