yanked str_edit_distance

A Rust library for calculating various string distance and similarity metrics

2 releases

0.1.2 Jul 3, 2024
0.1.1 Jun 12, 2024
0.1.0 May 19, 2024

GPL-3.0 license

17KB
146 lines

Str Distance

str_edit_distance is a Rust library for calculating string distance and similarity metrics. The edit-distance metrics count the minimum number of single-character edits required to change one string into the other; the similarity metrics score how closely two strings match.

Features

  • Levenshtein Distance: Measures the minimum number of single-character edits (insertions, deletions, substitutions) required to change one string into the other.
  • Damerau-Levenshtein Distance: Extends the Levenshtein distance by also considering transpositions of two adjacent characters as a single edit.
  • Hamming Distance: Measures the number of positions at which the corresponding characters are different. Note: only works for strings of equal length.
  • Jaro-Winkler Distance: Measures the similarity between two strings, giving extra weight to a shared prefix. Particularly useful for short strings like names.
  • Dice's Coefficient: Measures the similarity between two strings based on bigrams (pairs of adjacent characters).

Installation

Add str_edit_distance to your Cargo.toml:

[dependencies]
str_edit_distance = "0.1"

Then run cargo build to download and compile the crate.

Usage

Levenshtein Distance

use str_edit_distance::levenshtein;

fn main() {
    let distance = levenshtein("kitten", "sitting");
    println!("The Levenshtein distance between 'kitten' and 'sitting' is: {}", distance);
}
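
For readers who want to see what the metric computes, here is a minimal standalone sketch of the classic two-row dynamic-programming approach. The name levenshtein_sketch is hypothetical and this is not the crate's implementation; it assumes the comparison is done per Unicode scalar value (char).

```rust
/// Illustrative Levenshtein distance (hypothetical helper, not the crate's code).
/// Compares strings char by char, i.e. by Unicode scalar value.
pub fn levenshtein_sketch(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();

    // `prev` is the previous row of the DP table. It starts as the row for the
    // empty prefix of `a`: turning "" into b[..j] takes j insertions.
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    let mut curr = vec![0usize; b.len() + 1];

    for i in 1..=a.len() {
        // Turning a[..i] into "" takes i deletions.
        curr[0] = i;
        for j in 1..=b.len() {
            let cost = if a[i - 1] == b[j - 1] { 0 } else { 1 };
            curr[j] = (prev[j] + 1) // deletion
                .min(curr[j - 1] + 1) // insertion
                .min(prev[j - 1] + cost); // substitution or match
        }
        std::mem::swap(&mut prev, &mut curr);
    }
    // After the final swap, `prev` holds the last computed row.
    prev[b.len()]
}
```

With this sketch, levenshtein_sketch("kitten", "sitting") evaluates to 3 (two substitutions and one insertion).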

Damerau-Levenshtein Distance

use str_edit_distance::damerau_levenshtein;

fn main() {
    let distance = damerau_levenshtein("ca", "abc");
    println!("The Damerau-Levenshtein distance between 'ca' and 'abc' is: {}", distance);
}
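
The pair "ca" and "abc" is a common test case because two variants of this metric disagree on it: the unrestricted Damerau-Levenshtein distance is 2, while the restricted variant (optimal string alignment, which never edits a substring twice) gives 3. The README does not say which variant the crate implements. The sketch below shows the restricted variant only; the name osa_distance_sketch is hypothetical and the code is not taken from the crate.

```rust
/// Illustrative optimal string alignment distance (restricted
/// Damerau-Levenshtein). Hypothetical helper, not the crate's code.
pub fn osa_distance_sketch(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    let (n, m) = (a.len(), b.len());

    // d[i][j] is the distance between the first i chars of a and the first j of b.
    let mut d = vec![vec![0usize; m + 1]; n + 1];
    for i in 0..=n {
        d[i][0] = i;
    }
    for j in 0..=m {
        d[0][j] = j;
    }

    for i in 1..=n {
        for j in 1..=m {
            let cost = if a[i - 1] == b[j - 1] { 0 } else { 1 };
            let mut best = (d[i - 1][j] + 1) // deletion
                .min(d[i][j - 1] + 1) // insertion
                .min(d[i - 1][j - 1] + cost); // substitution or match
            // Transposition of two adjacent characters counts as one edit.
            if i > 1 && j > 1 && a[i - 1] == b[j - 2] && a[i - 2] == b[j - 1] {
                best = best.min(d[i - 2][j - 2] + 1);
            }
            d[i][j] = best;
        }
    }
    d[n][m]
}
```

Under this restricted variant, osa_distance_sketch("ca", "abc") is 3 and osa_distance_sketch("ab", "ba") is 1.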

Hamming Distance

use str_edit_distance::hamming;

fn main() {
    let distance = hamming("karolin", "kathrin");
    println!("The Hamming distance between 'karolin' and 'kathrin' is: {}", distance);
}
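
As a rough illustration of the equal-length requirement, the sketch below counts differing positions and returns None when the lengths differ. How the crate itself reports unequal lengths is not documented here, so the Option return type and the name hamming_sketch are assumptions made for this example.

```rust
/// Illustrative Hamming distance (hypothetical helper, not the crate's code).
/// Returns None when the strings have different numbers of chars, since the
/// metric is undefined in that case.
pub fn hamming_sketch(a: &str, b: &str) -> Option<usize> {
    if a.chars().count() != b.chars().count() {
        return None;
    }
    // Count the positions at which the corresponding chars differ.
    Some(a.chars().zip(b.chars()).filter(|(x, y)| x != y).count())
}
```

For "karolin" and "kathrin" the sketch returns Some(3): the strings differ at the third, fourth, and fifth characters.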

Jaro-Winkler Distance

use str_edit_distance::jaro_winkler;

fn main() {
    let distance = jaro_winkler("martha", "marhta");
    println!("The Jaro-Winkler distance between 'martha' and 'marhta' is: {:.3}", distance);
}
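
The README does not state whether jaro_winkler returns a similarity (1.0 for identical strings) or a distance (1 minus the similarity). The sketch below computes the similarity form using the standard definition, with a prefix scale of 0.1 and a prefix capped at 4 characters. The names jaro_sketch and jaro_winkler_sketch are hypothetical, and the code is not taken from the crate.

```rust
/// Illustrative Jaro similarity in [0, 1] (hypothetical helper, not the crate's code).
pub fn jaro_sketch(a: &str, b: &str) -> f64 {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    if a.is_empty() && b.is_empty() {
        return 1.0;
    }
    if a.is_empty() || b.is_empty() {
        return 0.0;
    }

    // Two equal chars match only if they are at most `window` positions apart.
    let window = (a.len().max(b.len()) / 2).saturating_sub(1);
    let mut a_matched = vec![false; a.len()];
    let mut b_matched = vec![false; b.len()];
    let mut matches = 0usize;

    for i in 0..a.len() {
        let lo = i.saturating_sub(window);
        let hi = (i + window + 1).min(b.len());
        for j in lo..hi {
            if !b_matched[j] && a[i] == b[j] {
                a_matched[i] = true;
                b_matched[j] = true;
                matches += 1;
                break;
            }
        }
    }
    if matches == 0 {
        return 0.0;
    }

    // Walk both matched sequences in order and count positions that disagree;
    // half of that count is the number of transpositions.
    let mut out_of_order = 0usize;
    let mut j = 0usize;
    for i in 0..a.len() {
        if !a_matched[i] {
            continue;
        }
        while !b_matched[j] {
            j += 1;
        }
        if a[i] != b[j] {
            out_of_order += 1;
        }
        j += 1;
    }

    let m = matches as f64;
    let t = (out_of_order / 2) as f64;
    (m / a.len() as f64 + m / b.len() as f64 + (m - t) / m) / 3.0
}

/// Illustrative Jaro-Winkler similarity: Jaro plus a bonus for a shared prefix.
pub fn jaro_winkler_sketch(a: &str, b: &str) -> f64 {
    let jaro = jaro_sketch(a, b);
    // Length of the common prefix, capped at 4 chars.
    let prefix = a
        .chars()
        .zip(b.chars())
        .take(4)
        .take_while(|(x, y)| x == y)
        .count();
    jaro + prefix as f64 * 0.1 * (1.0 - jaro)
}
```

For "martha" and "marhta" this gives a Jaro similarity of about 0.944 and a Jaro-Winkler similarity of about 0.961; the corresponding distance would be about 0.039.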

Dice's Coefficient

use str_edit_distance::dice_coefficient;

fn main() {
    let similarity = dice_coefficient("night", "nacht");
    println!("The Dice's Coefficient between 'night' and 'nacht' is: {:.3}", similarity);
}
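
To make the bigram definition concrete, here is a minimal sketch that computes twice the number of shared bigrams divided by the total number of bigrams in both strings. It treats bigrams as a multiset and is case-sensitive; both choices, along with the name dice_sketch and the handling of strings shorter than two characters, are assumptions for this example and may differ from the crate's behavior.

```rust
use std::collections::HashMap;

/// Illustrative Dice coefficient over character bigrams
/// (hypothetical helper, not the crate's code).
pub fn dice_sketch(a: &str, b: &str) -> f64 {
    // Collect the pairs of adjacent chars in a string.
    let bigrams = |s: &str| -> Vec<(char, char)> {
        let chars: Vec<char> = s.chars().collect();
        chars.windows(2).map(|w| (w[0], w[1])).collect()
    };
    let (xa, xb) = (bigrams(a), bigrams(b));

    // Strings with fewer than two chars have no bigrams; fall back to equality.
    if xa.is_empty() || xb.is_empty() {
        return if a == b { 1.0 } else { 0.0 };
    }

    // Count each bigram of `a`, then consume those counts while scanning `b`
    // so that repeated bigrams are only matched as often as they occur.
    let mut counts: HashMap<(char, char), usize> = HashMap::new();
    for g in &xa {
        *counts.entry(*g).or_insert(0) += 1;
    }
    let mut shared = 0usize;
    for g in &xb {
        if let Some(c) = counts.get_mut(g) {
            if *c > 0 {
                *c -= 1;
                shared += 1;
            }
        }
    }

    2.0 * shared as f64 / (xa.len() + xb.len()) as f64
}
```

"night" and "nacht" each have four bigrams and share only "ht", so the sketch returns 2 × 1 / 8 = 0.25.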

No runtime deps