Function signature

export const normalizeArabic = (text: string): string
Normalizes Arabic text for search indexing: strips diacritics, applies Unicode normalization, unifies letter variants, and cleans up whitespace.

Parameters

text
string
required
The input Arabic text to normalize

Returns

normalizedText
string
The normalized string with all transformations applied

What transformations are applied

The function applies the following transformations in order:
  1. Removes tashkeel (diacritics) - Strips all diacritical marks and Quranic symbols
  2. Unicode normalization - Applies NFC (Canonical Decomposition followed by Canonical Composition)
  3. Removes dagger alif and tatweel - Deletes U+0670 (dagger alif) and U+0640 (tatweel/kashida)
  4. Unifies alef variants - Converts إ أ آ ٱ → ا (regular alef)
  5. Unifies hamza variants - Converts ؤ ئ → ء (standalone hamza); an existing ء is left unchanged
  6. Unifies alif maqsura - Converts ى → ي (yaa)
  7. Removes control characters - Strips CRLF and non-Arabic symbols
  8. Normalizes whitespace - Replaces runs of multiple spaces with a single space
  9. Trims whitespace - Removes leading and trailing whitespace
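
The steps above can be sketched as a chain of string replacements. This is a minimal illustration, not the library's source: the name normalizeArabicSketch is hypothetical, and every character range below (which marks count as tashkeel, what counts as a "non-Arabic symbol") is an assumption made for the example.

```typescript
// Illustrative sketch only; NOT the implementation shipped in quran-search-engine.
// All character ranges here are assumptions chosen for this example.

// Step 1 (assumed ranges): Arabic diacritics and Quranic annotation marks.
const TASHKEEL = /[\u0610-\u061A\u064B-\u065F\u06D6-\u06ED]/g;

function normalizeArabicSketch(text: string): string {
  return text
    .replace(TASHKEEL, '') // 1. strip tashkeel and Quranic marks
    .normalize('NFC') // 2. Unicode NFC normalization
    .replace(/[\u0670\u0640]/g, '') // 3. remove dagger alif and tatweel
    .replace(/[\u0622\u0623\u0625\u0671]/g, '\u0627') // 4. alef variants -> plain alef
    .replace(/[\u0624\u0626]/g, '\u0621') // 5. hamza on waw/yaa -> standalone hamza
    .replace(/\u0649/g, '\u064A') // 6. alif maqsura -> yaa
    // 7. Assumption: drop everything that is neither an Arabic letter
    //    (U+0621..U+064A) nor whitespace. Line breaks count as whitespace,
    //    so they survive here and are collapsed in step 8.
    .replace(/[^\u0621-\u064A\s]/g, '')
    .replace(/\s+/g, ' ') // 8. collapse whitespace runs to one space
    .trim(); // 9. trim both ends
}

console.log(normalizeArabicSketch('بِسْمِ ٱللَّهِ'));
// بسم الله
```

The order matters in this sketch: diacritics are stripped before the alef and hamza steps, so combining marks such as hamza above or madda cannot recombine with a base letter during NFC.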

Use case

Use normalizeArabic to prepare user input before searching. It unifies alef variants, removes tashkeel, and normalizes whitespace, so queries match consistently across different writing styles.

Examples

Basic normalization

import { normalizeArabic } from 'quran-search-engine';

const out = normalizeArabic('بِسْمِ ٱللَّهِ');
console.log(out);
// Output: 'بسم الله'

Before and after

import { normalizeArabic } from 'quran-search-engine';

// Input with diacritics and variant characters
const input = 'بِسْمِ ٱللَّهِ ٱلرَّحْمَٰنِ ٱلرَّحِيمِ';
console.log('Before:', input);

const output = normalizeArabic(input);
console.log('After:', output);
// Before: بِسْمِ ٱللَّهِ ٱلرَّحْمَٰنِ ٱلرَّحِيمِ
// After: بسم الله الرحمن الرحيم

Using in search filter

import { normalizeArabic } from 'quran-search-engine';

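export function containsAllTokens(value: string, query: string): boolean {
  // An empty or whitespace-only query normalizes to '' and matches nothing.
  const normalizedQuery = normalizeArabic(query);
  if (!normalizedQuery) return false;

  // The normalized query is trimmed and single-spaced, so no token is empty.
  const tokens = normalizedQuery.split(/\s+/);
  const normalizedValue = normalizeArabic(value);
  // Every query token must occur as a substring of the normalized value.
  return tokens.every((token) => normalizedValue.includes(token));
}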
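
When the same list of values is filtered repeatedly, normalizing each value on every query is wasted work. The sketch below normalizes the values once and reuses them. The names buildTokenFilter and Normalizer are hypothetical and not part of quran-search-engine. The normalizer is passed in as a parameter so the block stays self-contained; in practice normalizeArabic would be the natural argument.

```typescript
// Hypothetical helper for illustration; not part of quran-search-engine.
type Normalizer = (text: string) => string;

function buildTokenFilter(values: string[], normalize: Normalizer) {
  // Normalize every candidate value once, up front.
  const index = values.map((value) => ({ value, normalized: normalize(value) }));

  // The returned function applies the same "all tokens present" check per query.
  return (query: string): string[] => {
    const normalizedQuery = normalize(query);
    if (!normalizedQuery) return [];

    const tokens = normalizedQuery.split(/\s+/);
    return index
      .filter((entry) => tokens.every((token) => entry.normalized.includes(token)))
      .map((entry) => entry.value);
  };
}
```

The trade-off is memory for speed: the normalized copies stay alive as long as the returned filter function does.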

Empty input handling

import { normalizeArabic } from 'quran-search-engine';

const result = normalizeArabic('');
console.log(result);
// Output: ''
