
Overview

The optimizeRules function analyzes your rule set and consolidates redundant patterns. This reduces the number of rules and sources while maintaining the same matching behavior.
Optimization is optional, but it is recommended for large rule sets: it improves maintainability and may reduce trie build time.

Basic usage

Import and use optimizeRules before building your trie:
import { optimizeRules, buildTrie } from 'trie-rules';

const rules = [
  { from: ['Source', 'source'], to: 'Target' },
  { from: ['Bukhari', 'al-Bukhari'], to: 'Bukhari' },
  { from: ['Source1', 'Source2', 'Source3'], to: 'Target' },
  { from: ['Source1', 'Source2'], to: 'Target' },
];

const result = optimizeRules(rules);

console.log(result.optimizedRules);
// Consolidated rules with fewer sources and automatic options

console.log(result.savings);
// { sourcesRemoved: N, rulesRemoved: M }

console.log(result.warnings);
// { conflicts: [...], matchTypeConflicts: [...], overwrittenRules: [...] }

// Build the trie from the optimized rules
const trie = buildTrie(result.optimizedRules);

Optimizations performed

The optimizer applies several strategies to consolidate your rules:
1. Case sensitivity consolidation: detects sources that differ only in case and adds casing: CaseSensitivity.Insensitive.
2. Apostrophe normalization: when normalizeApostrophes: true is set, consolidates sources that differ only in apostrophe-like characters.
3. Prefix optimization: detects redundant prefix variations and adds a prefix option instead.
4. Clip pattern optimization: detects leading/trailing apostrophe-like characters and adds clipStartPattern or clipEndPattern options.
5. Subset elimination: removes rules whose sources are a subset of another rule with the same target.
6. Match type consolidation: merges rules with different MatchType values, keeping the most permissive.

Case sensitivity consolidation

Before optimization:
const rules = [
  { from: ['Source', 'source'], to: 'Target' },
];
After optimization:
const result = optimizeRules(rules);
// result.optimizedRules:
[
  {
    from: ['Source'],
    to: 'Target',
    options: { casing: CaseSensitivity.Insensitive },
  },
]
// result.savings.sourcesRemoved: 1
The optimizer automatically detects case variants and adds the appropriate casing option, reducing redundancy.
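
To make the idea concrete, here is a minimal sketch of how case variants could be detected. It is an illustration only, not the library's implementation, and the helper name groupCaseVariants is hypothetical.

```typescript
// Hypothetical helper: groups sources that are identical once lowercased.
// A sketch of the idea only; optimizeRules may work differently internally.
function groupCaseVariants(sources: string[]): string[][] {
  const groups = new Map<string, string[]>();
  for (const source of sources) {
    const key = source.toLowerCase();
    const group = groups.get(key) ?? [];
    group.push(source);
    groups.set(key, group);
  }
  // Keep only groups that contain more than one spelling.
  return Array.from(groups.values()).filter((group) => group.length > 1);
}

const caseGroups = groupCaseVariants(['Source', 'source', 'Other']);
// caseGroups is [['Source', 'source']]
```

Each group found this way is a candidate for collapsing into a single source with a case-insensitive option.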

Prefix optimization

Before optimization:
const rules = [
  { from: ['Bukhari', 'al-Bukhari'], to: 'Bukhari' },
];
After optimization:
const result = optimizeRules(rules);
// result.optimizedRules:
[
  {
    from: ['Bukhari'],
    to: 'Bukhari',
    options: { prefix: 'al-' },
  },
]
// result.savings.sourcesRemoved: 1
The optimizer recognizes the following common Arabic article prefixes: al-, ash-, an-, ar-, as-, ath-, and ad-.
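
The following sketch shows one way such prefixed pairs could be spotted in a source list. The prefix list comes from this page; the helper name findPrefixedPairs and the case-sensitive comparison are assumptions, and the library's own detection may differ.

```typescript
// Prefixes listed in the documentation above.
const ARTICLE_PREFIXES = ['al-', 'ash-', 'an-', 'ar-', 'as-', 'ath-', 'ad-'];

// Hypothetical helper: finds sources that equal another source plus a known prefix.
// Comparison is case-sensitive here, which is an assumption of this sketch.
function findPrefixedPairs(sources: string[]): { base: string; prefix: string }[] {
  const known = new Set(sources);
  const pairs: { base: string; prefix: string }[] = [];
  for (const source of sources) {
    for (const prefix of ARTICLE_PREFIXES) {
      if (!source.startsWith(prefix)) continue;
      const base = source.slice(prefix.length);
      if (base.length > 0 && known.has(base)) pairs.push({ base, prefix });
    }
  }
  return pairs;
}

const prefixPairs = findPrefixedPairs(['Bukhari', 'al-Bukhari']);
// prefixPairs is [{ base: 'Bukhari', prefix: 'al-' }]
```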

Subset elimination

Before optimization:
const rules = [
  { from: ['Source1', 'Source2', 'Source3'], to: 'Target' },
  { from: ['Source1', 'Source2'], to: 'Target' },
];
After optimization:
const result = optimizeRules(rules);
// result.optimizedRules:
[
  { from: ['Source1', 'Source2', 'Source3'], to: 'Target' },
]
// result.savings.rulesRemoved: 1
The second rule is redundant because its sources are a complete subset of the first rule with the same target.
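
The subset check itself is straightforward. The sketch below ignores rule options and uses a simplified rule type, so treat it as an illustration of the logic rather than the optimizer's actual code; dropSubsetRules and SimpleRule are hypothetical names.

```typescript
// Simplified rule shape for this sketch; options are ignored.
type SimpleRule = { from: string[]; to: string };

// Hypothetical helper: drops a rule when another rule with the same target
// already contains every one of its sources.
function dropSubsetRules(rules: SimpleRule[]): SimpleRule[] {
  return rules.filter((rule, index) => {
    return !rules.some((other, otherIndex) => {
      if (otherIndex === index || other.to !== rule.to) return false;
      const covers = rule.from.every((source) => other.from.includes(source));
      if (!covers) return false;
      // When both rules cover each other, keep only the earlier one.
      const identical = other.from.every((source) => rule.from.includes(source));
      return !identical || otherIndex < index;
    });
  });
}

const subsetInput: SimpleRule[] = [
  { from: ['Source1', 'Source2', 'Source3'], to: 'Target' },
  { from: ['Source1', 'Source2'], to: 'Target' },
];
const subsetKept = dropSubsetRules(subsetInput);
// subsetKept contains only the first rule
```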

Apostrophe normalization

When using apostrophe normalization, the optimizer consolidates apostrophe variants:
const rules = [
  { from: ["al-Qur'an", 'al-Qur’an', 'al-Qur`an'], to: 'al-Qurʾān' },
];

const result = optimizeRules(rules, { normalizeApostrophes: true });
// result.optimizedRules:
[
  { from: ["al-Qur'an"], to: 'al-Qurʾān' },
]
// result.savings.sourcesRemoved: 2
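
As a rough sketch of what consolidating apostrophe variants involves, the code below maps a few apostrophe-like characters to a straight apostrophe and removes duplicates. The character set (straight, curly, and backtick) is an assumption; the set the library treats as apostrophe-like may be larger.

```typescript
// Assumed apostrophe-like characters: curly quotes ‘ ’ and the backtick.
// The library's actual set is not documented on this page and may differ.
const APOSTROPHE_LIKE = /[\u2018\u2019`]/g;

// Hypothetical helper: maps every apostrophe-like character to a straight apostrophe.
function normalizeApostropheKey(source: string): string {
  return source.replace(APOSTROPHE_LIKE, "'");
}

// Hypothetical helper: keeps the first source for each normalized spelling.
function dedupeByApostrophe(sources: string[]): string[] {
  const seen = new Set<string>();
  return sources.filter((source) => {
    const key = normalizeApostropheKey(source);
    if (seen.has(key)) return false;
    seen.add(key);
    return true;
  });
}

const apostropheDeduped = dedupeByApostrophe(["al-Qur'an", 'al-Qur\u2019an', 'al-Qur`an']);
// apostropheDeduped is ["al-Qur'an"]
```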

Conflict detection

The optimizer warns when the same source maps to different targets:
const rules = [
  { from: ['Ali'], to: 'ʿAlī' },
  { from: ['Ali'], to: 'ʾAlī' },
];

const result = optimizeRules(rules);

console.log(result.warnings.conflicts);
// [
//   {
//     from: 'Ali',
//     conflictingTo: ['ʿAlī', 'ʾAlī']
//   }
// ]
Conflicts indicate that the same source pattern has multiple different replacement targets. The last rule in the array will take precedence during matching.
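
If you want to run a similar check on your own data before optimizing, a conflict scan can be sketched as follows. The output mirrors the warning shape shown above, but findConflicts and ConflictRule are hypothetical names and this is not the library's implementation.

```typescript
// Simplified rule shape for this sketch.
type ConflictRule = { from: string[]; to: string };

// Hypothetical helper: reports every source that maps to more than one target.
function findConflicts(rules: ConflictRule[]): { from: string; conflictingTo: string[] }[] {
  const targetsBySource = new Map<string, string[]>();
  for (const rule of rules) {
    for (const source of rule.from) {
      const targets = targetsBySource.get(source) ?? [];
      if (!targets.includes(rule.to)) targets.push(rule.to);
      targetsBySource.set(source, targets);
    }
  }
  const conflicts: { from: string; conflictingTo: string[] }[] = [];
  targetsBySource.forEach((targets, source) => {
    if (targets.length > 1) conflicts.push({ from: source, conflictingTo: targets });
  });
  return conflicts;
}

const foundConflicts = findConflicts([
  { from: ['Ali'], to: 'ʿAlī' },
  { from: ['Ali'], to: 'ʾAlī' },
]);
// foundConflicts has one entry for 'Ali' with both targets
```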

Match type conflicts

The optimizer detects and consolidates rules with different match types:
import { optimizeRules, MatchType } from 'trie-rules';

const rules = [
  { from: ['test'], to: 'result', options: { match: MatchType.Any } },
  { from: ['test'], to: 'result', options: { match: MatchType.Whole } },
];

const result = optimizeRules(rules);
// Consolidates to most permissive match type (Any)
// result.optimizedRules:
[
  { from: ['test'], to: 'result', options: { match: MatchType.Any } },
]
When multiple rules have the same source and target but different match types, the optimizer keeps the most permissive type, in the order Any > Whole > Alone.
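
The ordering can be expressed as a simple ranking. The sketch below uses a local string union as a stand-in for MatchType, since the enum's underlying values are not shown on this page; mostPermissive is a hypothetical helper.

```typescript
// Local stand-in for the library's MatchType; the real enum values are not assumed here.
type MatchKind = 'Any' | 'Whole' | 'Alone';

// Higher rank means more permissive, following Any > Whole > Alone.
const PERMISSIVENESS: Record<MatchKind, number> = { Any: 2, Whole: 1, Alone: 0 };

// Hypothetical helper: picks the most permissive kind from a non-empty list.
function mostPermissive(kinds: MatchKind[]): MatchKind {
  if (kinds.length === 0) throw new Error('expected at least one match kind');
  return kinds.reduce((best, kind) =>
    PERMISSIVENESS[kind] > PERMISSIVENESS[best] ? kind : best,
  );
}

const chosenKind = mostPermissive(['Whole', 'Any', 'Alone']);
// chosenKind is 'Any'
```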

Overwritten rules

The optimizer detects rules that would be overwritten in the trie:
const rules = [
  { from: ['test'], to: 'first' },
  { from: ['test'], to: 'second' },
];

const result = optimizeRules(rules);

console.log(result.warnings.overwrittenRules);
// [
//   {
//     from: 'test',
//     kept: { from: ['test'], to: 'second' },
//     discarded: [{ from: ['test'], to: 'first' }]
//   }
// ]
When the same source appears in multiple rules, the last one wins. Consider whether this is intentional or whether you need to resolve the conflict.

Optimization result structure

The OptimizeResult object contains:
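optimizedRules (Rule[]): The optimized array of rules with redundancies removed and options consolidated.
savings (object): Statistics about the optimization process, reported as sourcesRemoved and rulesRemoved.
warnings (object): Warnings about potential issues found during optimization, grouped into conflicts, matchTypeConflicts, and overwrittenRules.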
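
For orientation, the shape can be sketched as TypeScript interfaces. These types are inferred from the examples on this page and are assumptions; the names SketchRule and SketchOptimizeResult are made up for this sketch, so check the library's exported types for the real definitions.

```typescript
// A sketch of the result shape, inferred from the examples on this page.
interface SketchRule {
  from: string[];
  to: string;
  options?: Record<string, unknown>;
}

interface SketchOptimizeResult {
  optimizedRules: SketchRule[];
  savings: { sourcesRemoved: number; rulesRemoved: number };
  warnings: {
    conflicts: { from: string; conflictingTo: string[] }[];
    matchTypeConflicts: unknown[]; // element shape is not shown on this page
    overwrittenRules: { from: string; kept: SketchRule; discarded: SketchRule[] }[];
  };
}

// Hypothetical helper: builds a one-line summary suitable for logs.
function summarizeResult(result: SketchOptimizeResult): string {
  const { sourcesRemoved, rulesRemoved } = result.savings;
  const w = result.warnings;
  const warningCount =
    w.conflicts.length + w.matchTypeConflicts.length + w.overwrittenRules.length;
  return `removed ${sourcesRemoved} sources, ${rulesRemoved} rules; ${warningCount} warnings`;
}
```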

Best practices

Run optimization early

Optimize rules before building the trie to catch issues early and potentially reduce build time.

Review warnings

Always inspect the warnings object to identify and resolve conflicts in your rule set.
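
One way to make this review hard to skip is a guard that fails a build or test run when any warning category is non-empty. The sketch below assumes every property of the warnings object is an array, as in the examples on this page; assertNoWarnings is a hypothetical helper, not part of the library.

```typescript
// Hypothetical CI guard: throws if any warning category is non-empty.
// Assumes each property of `warnings` is an array.
function assertNoWarnings(warnings: Record<string, unknown[]>): void {
  const problems = Object.entries(warnings)
    .filter(([, items]) => items.length > 0)
    .map(([category, items]) => `${category}: ${items.length}`);
  if (problems.length > 0) {
    throw new Error(`Rule set has unresolved warnings (${problems.join(', ')})`);
  }
}

// Passes silently when every category is empty.
assertNoWarnings({ conflicts: [], matchTypeConflicts: [], overwrittenRules: [] });
```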

Track savings

Monitor sourcesRemoved and rulesRemoved to understand the impact of optimization.

Test thoroughly

Verify that optimized rules produce the same results as your original rules.
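
A small comparison harness can help with this. The sketch below is independent of the library: applyOriginal and applyOptimized are placeholders for however you apply each rule set to a string, and the two regex-based replacers are stand-ins used only to show the harness working.

```typescript
// Placeholder type for "apply a rule set to some text".
type Replacer = (text: string) => string;

// Hypothetical harness: returns every sample where the two outputs differ.
function findMismatches(
  samples: string[],
  applyOriginal: Replacer,
  applyOptimized: Replacer,
): { input: string; original: string; optimized: string }[] {
  const mismatches: { input: string; original: string; optimized: string }[] = [];
  for (const input of samples) {
    const original = applyOriginal(input);
    const optimized = applyOptimized(input);
    if (original !== optimized) mismatches.push({ input, original, optimized });
  }
  return mismatches;
}

// Stand-in replacers for demonstration only; they do not use the library.
const viaList: Replacer = (text) => text.replace(/\b(Source|source)\b/g, 'Target');
const viaFlag: Replacer = (text) => text.replace(/\bsource\b/gi, 'Target');

const sampleMismatches = findMismatches(['a Source here', 'SOURCE'], viaList, viaFlag);
// 'SOURCE' is reported: the case-insensitive stand-in matches it, the list version does not.
```

Including inputs such as all-uppercase spellings in your samples is worthwhile, because a consolidated case-insensitive pattern can match spellings that an explicit list of variants did not.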
