#edit-distance #levenshtein #ascii #flags

agnostic-levenshtein

Levenshtein distance for ASCII or Unicode strings

4 releases

0.1.3 Feb 10, 2025
0.1.2 Feb 3, 2025
0.1.1 Feb 1, 2025
0.1.0 Feb 1, 2025

#670 in Algorithms

Download history 294/week @ 2025-01-28 109/week @ 2025-02-04 62/week @ 2025-02-11

465 downloads per month

MIT license

7KB
104 lines

agnostic-levenshtein

After doing the LeetCode problem on the edit distance between two strings—which in this case refers to the Levenshtein distance—I became quite enamored of the algorithm and wanted to write my own implementation in Rust. That's all this is.

I wrote my version such that the function edit_distance takes three arguments: two &str and a bool flag for "ASCII mode." When this flag is set, the algorithm will work directly with the bytes of the strings, rather than building Vecs of their 32-bit (i.e., UTF-32) char values. This should be faster and avoid some allocation, as far as I understand. I wonder if there might be a more efficient approach for the Unicode case.

Either way, the return value is the Levenshtein distance as u32. It may be worth noting that input strings of length greater than u32::MAX will not yield correct results—though I can hardly imagine that problem arising in practice.

No runtime deps