Rule optimization

Overview

The optimizeRules function analyzes your rule set and automatically consolidates redundant patterns. This reduces the size of your rules while maintaining the same matching behavior.

Optimization is optional, but it is recommended for large rule sets: it improves maintainability and may reduce trie build time.

Basic usage

Import and use optimizeRules before building your trie:

import { optimizeRules, buildTrie } from 'trie-rules';

const rules = [
  { from: ['Source', 'source'], to: 'Target' },
  { from: ['Bukhari', 'al-Bukhari'], to: 'Bukhari' },
  { from: ['Source1', 'Source2', 'Source3'], to: 'Target' },
  { from: ['Source1', 'Source2'], to: 'Target' },
];

const result = optimizeRules(rules);

console.log(result.optimizedRules);
// Consolidated rules with fewer sources and automatic options

console.log(result.savings);
// { sourcesRemoved: N, rulesRemoved: M }

console.log(result.warnings);
// { conflicts: [...], matchTypeConflicts: [...], overwrittenRules: [...] }

Optimizations performed

The optimizer applies several strategies to consolidate your rules:

Case sensitivity consolidation

Detects sources that differ only in case and adds casing: CaseSensitivity.Insensitive.

Apostrophe normalization

When normalizeApostrophes: true is set, consolidates sources that differ only in apostrophe-like characters.

Prefix optimization

Detects redundant prefix variations and adds a prefix option instead.

Clip pattern optimization

Detects leading/trailing apostrophe-like characters and adds clipStartPattern or clipEndPattern options.

Subset elimination

Removes a rule when all of its sources already appear in another rule with the same target.

Match type consolidation

Merges rules that share a source and target but differ in MatchType, keeping the most permissive value.

Case sensitivity consolidation

Before optimization:

const rules = [
  { from: ['Source', 'source'], to: 'Target' },
];

After optimization:

const result = optimizeRules(rules);
// result.optimizedRules:
[
  {
    from: ['Source'],
    to: 'Target',
    options: { casing: CaseSensitivity.Insensitive },
  },
]
// result.savings.sourcesRemoved: 1

The optimizer automatically detects case variants and adds the appropriate casing option, reducing redundancy.
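One way to picture this step is grouping sources by a lowercased key. The sketch below is a minimal illustration, not the library's implementation: `collapseCaseVariants` is a hypothetical helper, and keeping the first spelling seen is an assumption.

```typescript
// Hypothetical helper, not part of trie-rules.
// Groups sources that differ only in case and keeps the first spelling seen.
function collapseCaseVariants(sources: string[]): { kept: string[]; removed: number } {
  const firstSpelling = new Map<string, string>(); // lowercase key -> first spelling
  for (const source of sources) {
    const key = source.toLowerCase();
    if (!firstSpelling.has(key)) {
      firstSpelling.set(key, source);
    }
  }
  const kept = Array.from(firstSpelling.values());
  return { kept, removed: sources.length - kept.length };
}

const collapsed = collapseCaseVariants(['Source', 'source']);
console.log(collapsed.kept, collapsed.removed); // kept is ['Source'], removed is 1
```

A rule whose sources shrink this way would then need the case-insensitive option so that the dropped variants still match.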

Prefix optimization

Before optimization:

const rules = [
  { from: ['Bukhari', 'al-Bukhari'], to: 'Bukhari' },
];

After optimization:

const result = optimizeRules(rules);
// result.optimizedRules:
[
  {
    from: ['Bukhari'],
    to: 'Bukhari',
    options: { prefix: 'al-' },
  },
]
// result.savings.sourcesRemoved: 1

The optimizer recognizes common Arabic article prefixes: al-, ash-, an-, ar-, as-, ath-, and ad-.
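The detection can be sketched as a search for a prefix that pairs every bare stem with a prefixed twin. This is an illustrative assumption about how such a check might work: `findSharedPrefix` and `ARTICLE_PREFIXES` are hypothetical names, and the library's actual rules for when to collapse may differ.

```typescript
// Assumption: this list mirrors the prefixes named in the text.
const ARTICLE_PREFIXES = ['al-', 'ash-', 'an-', 'ar-', 'as-', 'ath-', 'ad-'];

// Hypothetical helper: returns the stems and prefix when every source is
// either a bare stem or that stem with the prefix attached, else null.
function findSharedPrefix(sources: string[]): { stems: string[]; prefix: string } | null {
  const all = new Set(sources);
  for (const prefix of ARTICLE_PREFIXES) {
    const prefixed = new Set(
      sources.filter((s) => s.startsWith(prefix) && all.has(s.slice(prefix.length))),
    );
    if (prefixed.size === 0) continue;
    const stems = sources.filter((s) => !prefixed.has(s));
    // Collapse only when every remaining stem also has its prefixed twin;
    // otherwise a prefix option would match more than the original sources.
    if (stems.every((stem) => all.has(prefix + stem))) {
      return { stems, prefix };
    }
  }
  return null;
}

console.log(findSharedPrefix(['Bukhari', 'al-Bukhari'])); // stems ['Bukhari'], prefix 'al-'
console.log(findSharedPrefix(['Bukhari', 'al-Bukhari', 'Muslim'])); // null
```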

Subset elimination

Before optimization:

const rules = [
  { from: ['Source1', 'Source2', 'Source3'], to: 'Target' },
  { from: ['Source1', 'Source2'], to: 'Target' },
];

After optimization:

const result = optimizeRules(rules);
// result.optimizedRules:
[
  { from: ['Source1', 'Source2', 'Source3'], to: 'Target' },
]
// result.savings.rulesRemoved: 1

The second rule is redundant because all of its sources already appear in the first rule, which has the same target.
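A subset check like this can be sketched with sets. The `Rule` type below is a simplified stand-in without options, and `removeSubsetRules` is a hypothetical helper. Keeping the earlier copy when two rules have identical sources is an assumption.

```typescript
// Simplified stand-in for the library's rule shape (options omitted).
type Rule = { from: string[]; to: string };

// Hypothetical helper: drops any rule whose sources are all contained in
// another rule with the same target.
function removeSubsetRules(rules: Rule[]): Rule[] {
  return rules.filter((candidate, i) => {
    const candidateSources = new Set(candidate.from);
    const isCovered = rules.some((other, j) => {
      if (i === j || other.to !== candidate.to) return false;
      const otherSources = new Set(other.from);
      const isSubset = candidate.from.every((s) => otherSources.has(s));
      if (!isSubset) return false;
      // Strict superset always covers; for equal sets keep the earlier rule.
      return otherSources.size > candidateSources.size || j < i;
    });
    return !isCovered;
  });
}

const remaining = removeSubsetRules([
  { from: ['Source1', 'Source2', 'Source3'], to: 'Target' },
  { from: ['Source1', 'Source2'], to: 'Target' },
]);
console.log(remaining.length); // 1
```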

Apostrophe normalization

When using apostrophe normalization, the optimizer consolidates apostrophe variants:

const rules = [
  { from: ["al-Qur'an", "al-Qur’an", 'al-Qur`an'], to: 'al-Qurʾān' },
];

const result = optimizeRules(rules, { normalizeApostrophes: true });
// result.optimizedRules:
[
  { from: ["al-Qur'an"], to: 'al-Qurʾān' },
]
// result.savings.sourcesRemoved: 2
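The consolidation above can be pictured as de-duplicating sources after mapping apostrophe-like characters to one canonical form. This is a sketch under assumptions: the character set in `APOSTROPHE_LIKE` is illustrative and may not match the library's list, and both helper names are hypothetical.

```typescript
// Assumption: an illustrative set of apostrophe-like characters
// (left/right single quotes, modifier letter apostrophe, backtick).
const APOSTROPHE_LIKE = /[\u2018\u2019\u02BC`]/g;

// Maps every apostrophe-like character to a straight apostrophe.
function toStraightApostrophes(text: string): string {
  return text.replace(APOSTROPHE_LIKE, "'");
}

// Hypothetical helper: keeps the first source for each normalized spelling.
function dedupeByApostrophe(sources: string[]): string[] {
  const seen = new Set<string>();
  const kept: string[] = [];
  for (const source of sources) {
    const key = toStraightApostrophes(source);
    if (!seen.has(key)) {
      seen.add(key);
      kept.push(source);
    }
  }
  return kept;
}

const variants = ["al-Qur'an", 'al-Qur\u2019an', 'al-Qur`an'];
console.log(dedupeByApostrophe(variants).length); // 1
```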

Conflict detection

The optimizer warns when the same source maps to different targets:

const rules = [
  { from: ['Ali'], to: 'ʿAlī' },
  { from: ['Ali'], to: 'ʾAlī' },
];

const result = optimizeRules(rules);

console.log(result.warnings.conflicts);
// [
//   {
//     from: 'Ali',
//     conflictingTo: ['ʿAlī', 'ʾAlī']
//   }
// ]

Conflicts indicate that the same source pattern has multiple different replacement targets. The last rule in the array will take precedence during matching.
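A conflict report of this shape can be built by collecting the distinct targets for each source. The sketch below is illustrative only: `findConflicts` is a hypothetical helper, `Rule` is a simplified stand-in, and the field names follow the warning output shown above.

```typescript
// Simplified stand-in for the library's rule shape (options omitted).
type Rule = { from: string[]; to: string };
type Conflict = { from: string; conflictingTo: string[] };

// Hypothetical helper: lists every source that maps to more than one target.
function findConflicts(rules: Rule[]): Conflict[] {
  const targetsBySource = new Map<string, string[]>();
  for (const rule of rules) {
    for (const source of rule.from) {
      const targets = targetsBySource.get(source) ?? [];
      if (targets.indexOf(rule.to) === -1) targets.push(rule.to);
      targetsBySource.set(source, targets);
    }
  }
  const conflicts: Conflict[] = [];
  targetsBySource.forEach((targets, source) => {
    if (targets.length > 1) conflicts.push({ from: source, conflictingTo: targets });
  });
  return conflicts;
}

const found = findConflicts([
  { from: ['Ali'], to: 'ʿAlī' },
  { from: ['Ali'], to: 'ʾAlī' },
]);
console.log(found.length); // 1
```

Targets are listed in rule order, so the last entry in `conflictingTo` is the one that would win under the last-rule-wins behavior described above.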

Match type conflicts

The optimizer detects and consolidates rules with different match types:

import { optimizeRules, MatchType } from 'trie-rules';

const rules = [
  { from: ['test'], to: 'result', options: { match: MatchType.Any } },
  { from: ['test'], to: 'result', options: { match: MatchType.Whole } },
];

const result = optimizeRules(rules);
// Consolidates to most permissive match type (Any)
// result.optimizedRules:
[
  { from: ['test'], to: 'result', options: { match: MatchType.Any } },
]

When multiple rules have the same source and target but different match types, the optimizer keeps the most permissive type, in the order Any > Whole > Alone.
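The ordering can be expressed as a rank lookup. In this sketch `MatchKind` is a local string stand-in for the library's MatchType (whose underlying values are not shown here), and `mostPermissive` is a hypothetical helper.

```typescript
// Local stand-in for MatchType; the real enum values are not assumed.
type MatchKind = 'Any' | 'Whole' | 'Alone';

// Higher rank means more permissive, following Any > Whole > Alone.
const PERMISSIVENESS: Record<MatchKind, number> = { Alone: 0, Whole: 1, Any: 2 };

// Hypothetical helper: picks the most permissive kind from a non-empty list.
function mostPermissive(kinds: MatchKind[]): MatchKind {
  if (kinds.length === 0) throw new Error('expected at least one match type');
  return kinds.reduce((best, kind) =>
    PERMISSIVENESS[kind] > PERMISSIVENESS[best] ? kind : best,
  );
}

console.log(mostPermissive(['Whole', 'Any'])); // Any
console.log(mostPermissive(['Alone', 'Whole'])); // Whole
```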

Overwritten rules

The optimizer detects rules that would be overwritten in the trie:

const rules = [
  { from: ['test'], to: 'first' },
  { from: ['test'], to: 'second' },
];

const result = optimizeRules(rules);

console.log(result.warnings.overwrittenRules);
// [
//   {
//     from: 'test',
//     kept: { from: ['test'], to: 'second' },
//     discarded: [{ from: ['test'], to: 'first' }]
//   }
// ]

When the same source appears in multiple rules, the last one wins. Check whether this is intentional or whether the conflict needs to be resolved.
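The last-one-wins bookkeeping can be sketched by recording, for each source, every rule that claims it. This is an illustration under assumptions: `findOverwritten` is a hypothetical helper, `Rule` is a simplified stand-in, and the sketch reports any repeated source, whether or not the targets differ.

```typescript
// Simplified stand-in for the library's rule shape (options omitted).
type Rule = { from: string[]; to: string };
type Overwrite = { from: string; kept: Rule; discarded: Rule[] };

// Hypothetical helper: for each source claimed by several rules, reports the
// last rule as kept and the earlier ones as discarded.
function findOverwritten(rules: Rule[]): Overwrite[] {
  const ownersBySource = new Map<string, Rule[]>();
  for (const rule of rules) {
    // De-duplicate within a rule so a repeated source does not count twice.
    for (const source of Array.from(new Set(rule.from))) {
      const owners = ownersBySource.get(source) ?? [];
      owners.push(rule);
      ownersBySource.set(source, owners);
    }
  }
  const overwritten: Overwrite[] = [];
  ownersBySource.forEach((owners, source) => {
    if (owners.length > 1) {
      overwritten.push({
        from: source,
        kept: owners[owners.length - 1],
        discarded: owners.slice(0, -1),
      });
    }
  });
  return overwritten;
}

const report = findOverwritten([
  { from: ['test'], to: 'first' },
  { from: ['test'], to: 'second' },
]);
console.log(report[0].kept.to); // second
```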

Optimization result structure

The OptimizeResult object contains:

optimizedRules

Rule[]

The optimized array of rules with redundancies removed and options consolidated.

savings

object

Statistics about the optimization process.

sourcesRemoved

number

Number of individual source strings removed from rules.

rulesRemoved

number

Number of complete rules removed (e.g., subset rules).

warnings

object

Warnings about potential issues found during optimization.

conflicts

array

Source values that have conflicting target values.

matchTypeConflicts

array

Source values with conflicting match types.

overwrittenRules

array

Rules that would be overwritten in the trie.
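The fields above can be summarized as a TypeScript shape. This is a sketch inferred from the examples on this page, not the library's exported types: the interface names are hypothetical, the element type of `matchTypeConflicts` is not shown here and is left as `unknown`, and `summarize` is an illustrative helper.

```typescript
// Hypothetical shapes inferred from the examples on this page.
interface RuleShape {
  from: string[];
  to: string;
  options?: Record<string, unknown>;
}

interface OptimizeResultShape {
  optimizedRules: RuleShape[];
  savings: { sourcesRemoved: number; rulesRemoved: number };
  warnings: {
    conflicts: { from: string; conflictingTo: string[] }[];
    matchTypeConflicts: unknown[]; // element shape not documented here
    overwrittenRules: { from: string; kept: RuleShape; discarded: RuleShape[] }[];
  };
}

// Illustrative helper: one-line summary of savings and warning counts.
function summarize(res: OptimizeResultShape): string {
  const { savings, warnings } = res;
  const warningCount =
    warnings.conflicts.length +
    warnings.matchTypeConflicts.length +
    warnings.overwrittenRules.length;
  return `${savings.sourcesRemoved} sources and ${savings.rulesRemoved} rules removed, ${warningCount} warnings`;
}

const sample: OptimizeResultShape = {
  optimizedRules: [{ from: ['Source'], to: 'Target' }],
  savings: { sourcesRemoved: 1, rulesRemoved: 0 },
  warnings: { conflicts: [], matchTypeConflicts: [], overwrittenRules: [] },
};
console.log(summarize(sample)); // 1 sources and 0 rules removed, 0 warnings
```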

Best practices

Run optimization early

Optimize rules before building the trie to reduce build time and catch issues early.

Review warnings

Always inspect the warnings object to identify and resolve conflicts in your rule set.

Track savings

Monitor sourcesRemoved and rulesRemoved to understand the impact of optimization.

Test thoroughly

Verify that optimized rules produce the same results as your original rules.
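One way to run such a check is to feed the same sample texts through a replacer built from the original rules and one built from the optimized rules, then list any differences. The harness below is a sketch: `findMismatches` is a hypothetical helper, and the two stand-in replacers are plain regular-expression functions, not the library's matcher.

```typescript
type Replacer = (text: string) => string;
type Mismatch = { input: string; before: string; after: string };

// Hypothetical harness: returns every sample where the two replacers disagree.
function findMismatches(before: Replacer, after: Replacer, samples: string[]): Mismatch[] {
  const mismatches: Mismatch[] = [];
  for (const input of samples) {
    const expected = before(input);
    const actual = after(input);
    if (expected !== actual) {
      mismatches.push({ input, before: expected, after: actual });
    }
  }
  return mismatches;
}

// Stand-ins: two explicit spellings versus one case-insensitive pattern.
const explicitSpellings: Replacer = (text) => text.replace(/Source|source/g, 'Target');
const caseInsensitive: Replacer = (text) => text.replace(/source/gi, 'Target');

const diffs = findMismatches(explicitSpellings, caseInsensitive, ['Source and source', 'SOURCE']);
console.log(diffs.length); // 1
console.log(diffs[0].input); // SOURCE
```

In these stand-ins the case-insensitive pattern also matches an all-caps spelling that neither original source covered, which is the kind of difference a sample set with unusual casing can surface.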

Next steps

Implement confirmation callbacks for context-aware matching
Review performance characteristics and benchmarks
Explore advanced matching options
