Skip to main content
The fuzzy extension provides functions for approximate string matching and phonetic analysis. It is adapted from the sqlean fuzzy extension.

Edit distance functions

Edit distance functions measure how similar two strings are by counting the minimum number of operations needed to transform one string into the other.
FunctionArgumentsReturnsDescription
fuzzy_leven(A, B)A, B TEXTINTEGERLevenshtein distance between A and B (insertions, deletions, substitutions)
fuzzy_damlev(A, B)A, B TEXTINTEGERDamerau-Levenshtein distance (adds transpositions to Levenshtein)
fuzzy_editdist(A, B)A, B ASCII TEXTINTEGERWeighted edit distance; if A ends with *, treats it as a prefix of B
fuzzy_hamming(A, B)A, B TEXT (equal length)INTEGERHamming distance; returns -1 if strings differ in length
fuzzy_osadist(A, B)A, B TEXTINTEGEROptimal String Alignment distance
fuzzy_jarowin(A, B)A, B TEXTREALJaro-Winkler similarity score between 0.0 (no match) and 1.0 (identical)

Examples

-- Levenshtein: 1 substitution
SELECT fuzzy_leven('kitten', 'sitten');
-- 1

-- Damerau-Levenshtein: transposition counts as 1 (vs 2 for Levenshtein)
SELECT fuzzy_damlev('abc', 'acb');
-- 1

-- Hamming: number of positions that differ
SELECT fuzzy_hamming('hello', 'hellp');
-- 1

-- Jaro-Winkler: similarity score
SELECT fuzzy_jarowin('martha', 'marhta');
-- ~0.961

-- Find approximate matches within distance 2
SELECT word FROM dictionary
WHERE fuzzy_leven(word, 'colour') <= 2;

Phonetic encoding functions

Phonetic encoding functions convert words to codes that represent how they sound, so that words with similar pronunciations match even when spelled differently.
FunctionArgumentsReturnsDescription
fuzzy_soundex(X)X TEXTTEXT or NULLStandard Soundex code (e.g. P532 for phonetics)
fuzzy_rsoundex(X)X TEXTTEXT or NULLRefined Soundex — more granular than standard Soundex
fuzzy_phonetic(X)X TEXTTEXT or NULLPhonetic hash using an alternative phonetic algorithm
fuzzy_caver(X)X TEXTTEXT or NULLCaverphone encoding (10-character fixed-length code)

Examples

-- Standard Soundex
SELECT fuzzy_soundex('phonetics');
-- P532

SELECT fuzzy_soundex('awesome');
-- A250

-- Find names that sound like 'Smith'
SELECT name FROM contacts
WHERE fuzzy_soundex(name) = fuzzy_soundex('Smith');

-- Refined Soundex
SELECT fuzzy_rsoundex('phonetics');
-- P1080603

-- Caverphone
SELECT fuzzy_caver('awesome');
-- AWSM111111

Transliteration and script functions

FunctionArgumentsReturnsDescription
fuzzy_translit(X)X TEXTTEXTConverts non-ASCII Roman characters in X to their closest ASCII equivalents
fuzzy_script(X)X TEXTINTEGERReturns the ISO 15924 numeric code of the dominant script in X

Script codes returned by fuzzy_script

CodeScript
125Hebrew
160Arabic
200Greek
215Latin
220Cyrillic
998Mixed (two or more scripts detected)
999Unknown (no recognized script)

Examples

-- Transliterate accented characters
SELECT fuzzy_translit('Ångström');
-- Angstrom

-- Detect script
SELECT fuzzy_script('Hello');
-- 215 (Latin)

SELECT fuzzy_script('Привет');
-- 220 (Cyrillic)

NULL handling

All fuzzy functions return NULL or an error value when a required argument is NULL. For phonetic functions, an empty string input returns an empty string.

Build docs developers (and LLMs) love