Function signature

export const removeTashkeel = (text: string): string
Removes tashkeel (diacritics) and Quranic marks from Arabic text, and converts wasl alef to regular alef.

Parameters

text
string
required
The input Arabic text containing diacritics and Quranic marks

Returns

cleanText
string
Text without diacritics and Quranic marks

What transformations are applied

The function applies two transformations:
  1. Converts wasl alef to regular alef - Replaces U+0671 (wasl alef) with U+0627 (regular alef)
  2. Removes all diacritical marks - Strips the following Unicode ranges:
    • U+064B - U+065F (Arabic diacritics including fatha, kasra, damma, sukun, shadda, etc.)
    • U+0670 (dagger alef)
    • U+06D6 - U+06DC (Quranic annotation marks)
    • U+06DF - U+06E8 (Quranic pause marks and signs)
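    • U+06EA - U+06FC (Quranic marks; per Unicode, the upper part of this range also holds a few extended letters and the Extended Arabic-Indic digits U+06F0 - U+06F9, so those are stripped too if the range is applied as listed)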
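
The two steps above can be sketched with a pair of regular expressions. This is a minimal illustration built only from the code points listed here, not the library's actual source; the name removeTashkeelSketch and both constants are hypothetical.

```typescript
// Hypothetical sketch of the two documented steps; not the library's source.

// Step 1: wasl alef (U+0671) is mapped to regular alef (U+0627).
const WASL_ALEF = /\u0671/g;

// Step 2: every code point in the documented ranges is deleted.
const TASHKEEL_AND_MARKS =
  /[\u064B-\u065F\u0670\u06D6-\u06DC\u06DF-\u06E8\u06EA-\u06FC]/g;

const removeTashkeelSketch = (text: string): string =>
  text.replace(WASL_ALEF, '\u0627').replace(TASHKEEL_AND_MARKS, '');

// "bismi llahi" with full tashkeel, written with escapes so the marks are visible.
const sample =
  '\u0628\u0650\u0633\u0652\u0645\u0650 \u0671\u0644\u0644\u0651\u064E\u0647\u0650';
console.log(removeTashkeelSketch(sample)); // بسم الله
```

The order of the two replacements does not matter in this sketch, because U+0671 lies outside every removal range.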

Use case

Use removeTashkeel() to strip diacritics (tashkeel) for display or simple comparisons. normalizeArabic() also calls it internally as the first step of its normalization pipeline.
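
A simple comparison of this kind might look like the sketch below. The helper equalsIgnoringTashkeel and the Stripper type are hypothetical names, not part of the library; the stripping function is passed in as an argument so the snippet stays self-contained, and in practice it would be removeTashkeel.

```typescript
// Hypothetical helper: compares two strings after stripping marks from both.
// `strip` is expected to be removeTashkeel from 'quran-search-engine';
// it is a parameter here so the helper has no hard dependency.
type Stripper = (text: string) => string;

const equalsIgnoringTashkeel = (
  a: string,
  b: string,
  strip: Stripper,
): boolean => strip(a) === strip(b);

// Usage, assuming the import shown in the examples on this page:
//   import { removeTashkeel } from 'quran-search-engine';
//   equalsIgnoringTashkeel('بِسْمِ', 'بسم', removeTashkeel);
```

Because removeTashkeel keeps alef variants such as أ and إ, two spellings that differ only in those letters still compare as different with this helper.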

Examples

Basic usage

import { removeTashkeel } from 'quran-search-engine';

const out = removeTashkeel('بِسْمِ ٱللَّهِ');
console.log(out);
// Output: 'بسم الله'

Before and after

import { removeTashkeel } from 'quran-search-engine';

// Input with full tashkeel (diacritics)
const input = 'بِسْمِ ٱللَّهِ ٱلرَّحْمَٰنِ ٱلرَّحِيمِ';
console.log('Before:', input);

const output = removeTashkeel(input);
console.log('After:', output);
// Before: بِسْمِ ٱللَّهِ ٱلرَّحْمَٰنِ ٱلرَّحِيمِ
// After: بسم الله الرحمن الرحيم
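
When the before and after strings look alike on screen, listing their code points makes the removed marks explicit. The helper below is a hypothetical debugging aid, not part of quran-search-engine, and relies only on standard string methods.

```typescript
// Hypothetical debugging helper; not part of quran-search-engine.
// Formats each code point of `text` as U+XXXX.
const toCodePoints = (text: string): string[] =>
  Array.from(
    text,
    (ch) =>
      'U+' + (ch.codePointAt(0) ?? 0).toString(16).toUpperCase().padStart(4, '0'),
  );

// Ba followed by kasra: the kasra (U+0650) falls inside the U+064B - U+065F range.
console.log(toCodePoints('\u0628\u0650').join(' ')); // U+0628 U+0650
```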

Removing Quranic marks

import { removeTashkeel } from 'quran-search-engine';

// Text with various Quranic pause and annotation marks
const textWithMarks = 'وَقُولُوا۟ حِطَّةٌ۬ وَٱدْخُلُوا۟ ٱلْبَابَ سُجَّدًۭا';
const cleaned = removeTashkeel(textWithMarks);
console.log(cleaned);
// Output: وقولوا حطة وادخلوا الباب سجدا

Difference from normalizeArabic()

import { removeTashkeel, normalizeArabic } from 'quran-search-engine';

const text = 'بِسْمِ ٱللَّهِ';

// removeTashkeel removes diacritics and Quranic marks, and converts wasl alef (ٱ) to regular alef (ا)
const tashkeelRemoved = removeTashkeel(text);
console.log('removeTashkeel:', tashkeelRemoved);
// Output: بسم الله

// normalizeArabic does more: removes diacritics + unifies variants + cleans up
const normalized = normalizeArabic(text);
console.log('normalizeArabic:', normalized);
// Output: بسم الله

// The difference is more visible with variant characters:
const textWithVariants = 'إِلَٰهَ إِلَّآ أَنتَ';
console.log('removeTashkeel:', removeTashkeel(textWithVariants));
// Output: إله إلا أنت (keeps alef variants)

console.log('normalizeArabic:', normalizeArabic(textWithVariants));
// Output: اله الا انت (unifies to regular alef)
