Overview
Removes Arabic numeral page markers enclosed in turtle ⦗ ⦘ brackets. These markers are commonly used in Shamela texts to denote page numbers in the original printed edition.
Signature
Parameters
Text potentially containing page markers
Returns
The text with numeric markers replaced by a single space
Behavior
- Matches Arabic numerals (٠-٩) enclosed in ⦗ ⦘ brackets
- Removes up to two preceding whitespace characters (space or \r)
- Removes up to one following whitespace character
- Replaces the entire match with a single space
- Uses the pattern:
/(?: |\r){0,2}⦗[\u0660-\u0669]+⦘(?: |\r)?/g
Example
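The following is a minimal sketch of the documented behavior, applying the pattern quoted under Behavior directly. The function name stripPageMarkers and the sample strings are hypothetical, used only for illustration; the library's own export name is not shown on this page.

```typescript
// Pattern copied from the Behavior section: up to two leading spaces or \r,
// a ⦗…⦘ marker holding Arabic-Indic digits, and at most one trailing space or \r.
const PAGE_MARKER = /(?: |\r){0,2}⦗[\u0660-\u0669]+⦘(?: |\r)?/g;

// Hypothetical stand-in for the documented function.
function stripPageMarkers(text: string): string {
    // Each whole match (marker plus adjacent whitespace) becomes one space.
    return text.replace(PAGE_MARKER, ' ');
}

const sample = 'قال المؤلف ⦗٤٢⦘ رحمه الله';
console.log(stripPageMarkers(sample)); // the marker and its surrounding spaces collapse to one space

// Markers written with Latin digits are outside the pattern and are left as-is.
console.log(stripPageMarkers('foo ⦗42⦘ bar'));
```

Because the replacement is always a single space, a marker at the very start or end of the input leaves a leading or trailing space, which callers may want to trim.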
Arabic Numerals
The function recognizes Arabic-Indic numerals (٠-٩):
| Arabic | Latin | Unicode |
|---|---|---|
| ٠ | 0 | U+0660 |
| ١ | 1 | U+0661 |
| ٢ | 2 | U+0662 |
| ٣ | 3 | U+0663 |
| ٤ | 4 | U+0664 |
| ٥ | 5 | U+0665 |
| ٦ | 6 | U+0666 |
| ٧ | 7 | U+0667 |
| ٨ | 8 | U+0668 |
| ٩ | 9 | U+0669 |
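Since the code points in the table are contiguous, each digit's value is its code point minus U+0660. The sketch below uses that to read a page number out of a marker's digits; arabicIndicToNumber is a hypothetical helper and is not part of the documented API.

```typescript
// Hypothetical helper: converts a run of Arabic-Indic digits to a number.
function arabicIndicToNumber(digits: string): number {
    let value = 0;
    for (let i = 0; i < digits.length; i++) {
        const code = digits.charCodeAt(i);
        // Reject anything outside U+0660..U+0669.
        if (code < 0x0660 || code > 0x0669) {
            throw new RangeError('Not an Arabic-Indic digit at index ' + i);
        }
        value = value * 10 + (code - 0x0660);
    }
    return value;
}

console.log(arabicIndicToNumber('٤٢')); // 42
```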
Use Cases
- Clean display text - Remove page markers before displaying to users
- Search preparation - Remove markers before indexing for search
- Text analysis - Clean text for linguistic analysis
- Export formatting - Remove markers when exporting to other formats
Processing Order
Recommended order in a processing pipeline:
Related Functions
- mapPageCharacterContent() - General character normalization
- removeTagsExceptSpan() - HTML tag removal