#string #similarity #hamming #levenshtein #jaro

strsim

Implementations of string similarity metrics. Includes Hamming, Levenshtein, OSA, Damerau-Levenshtein, Jaro, Jaro-Winkler, and Sørensen-Dice.

22 releases

Uses old Rust 2015

0.10.0 Jan 31, 2020
0.9.3 Dec 13, 2019
0.9.2 May 9, 2019
0.8.0 Aug 19, 2018
0.2.2 Mar 29, 2015

#22 in Algorithms

Download history (weekly, 2023-02-12 to 2023-05-28): between about 1.08 and 1.46 million downloads per week

4,620,020 downloads per month
Used in 1,512 crates (441 directly)

MIT license

34KB
712 lines

strsim-rs

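Rust implementations of string similarity metrics: Hamming, Levenshtein, optimal string alignment (OSA), Damerau-Levenshtein, Jaro, Jaro-Winkler, and Sørensen-Dice. Levenshtein and Damerau-Levenshtein are available both as distances and as normalized similarities.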

The normalized versions return values between 0.0 and 1.0, where 1.0 means an exact match.
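
As a rough illustration of what normalization means here, the sketch below maps an edit distance onto the 0.0 to 1.0 range by dividing by the longer input's length. The function name `normalized_similarity_sketch` is hypothetical and not part of strsim, and the formula is an assumption; it does agree with the README's own figure of about 0.571 for "kitten" and "sitting" (distance 3, lengths 6 and 7).

```rust
/// Hypothetical helper, not part of strsim: turn an edit distance into a
/// similarity score in [0.0, 1.0], where 1.0 means an exact match.
/// Lengths are assumed to be counted in characters, not bytes.
pub fn normalized_similarity_sketch(distance: usize, len_a: usize, len_b: usize) -> f64 {
    let longest = len_a.max(len_b);
    if longest == 0 {
        // Two empty inputs are treated as identical.
        return 1.0;
    }
    1.0 - (distance as f64) / (longest as f64)
}
```

For example, `normalized_similarity_sketch(3, 6, 7)` evaluates to 1 - 3/7, roughly 0.571.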

There are also generic versions of the functions for non-string inputs.

Installation

strsim is available on crates.io. Add it to your Cargo.toml:

[dependencies]
strsim = "0.10.0"

Usage

See Docs.rs for the full documentation. You can also clone the repo and run $ cargo doc --open.

Examples

extern crate strsim;

use strsim::{hamming, levenshtein, normalized_levenshtein, osa_distance,
             damerau_levenshtein, normalized_damerau_levenshtein, jaro,
             jaro_winkler, sorensen_dice};

fn main() {
    // hamming returns a Result, so the error case has to be handled.
    match hamming("hamming", "hammers") {
        Ok(distance) => assert_eq!(3, distance),
        Err(why) => panic!("{:?}", why)
    }

    assert_eq!(levenshtein("kitten", "sitting"), 3);

    // Normalized scores are floats, so compare within a tolerance.
    assert!((normalized_levenshtein("kitten", "sitting") - 0.571).abs() < 0.001);

    // OSA never edits the same substring twice, so it scores this pair
    // higher than unrestricted Damerau-Levenshtein does.
    assert_eq!(osa_distance("ac", "cba"), 3);

    assert_eq!(damerau_levenshtein("ac", "cba"), 2);

    assert!((normalized_damerau_levenshtein("levenshtein", "löwenbräu") - 0.272).abs() <
            0.001);

    assert!((jaro("Friedrich Nietzsche", "Jean-Paul Sartre") - 0.392).abs() <
            0.001);

    assert!((jaro_winkler("cheeseburger", "cheese fries") - 0.911).abs() <
            0.001);

    assert_eq!(sorensen_dice("web applications", "applications of the web"),
        0.7878787878787878);
}
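
The `hamming` call above is the only one that returns a `Result`. Hamming distance is only defined for inputs of equal length, so a mismatch has to be reported somehow. The following is a minimal sketch of that idea, not strsim's actual implementation; `hamming_sketch` and `LengthMismatch` are hypothetical names introduced for illustration.

```rust
/// Hypothetical error type for this sketch; strsim defines its own.
#[derive(Debug, PartialEq)]
pub enum LengthMismatch {
    DifferentLengths,
}

/// Count the positions at which two strings differ, comparing by character.
/// Returns an error if one string runs out before the other.
pub fn hamming_sketch(a: &str, b: &str) -> Result<usize, LengthMismatch> {
    let mut left = a.chars();
    let mut right = b.chars();
    let mut count = 0;
    loop {
        match (left.next(), right.next()) {
            (Some(x), Some(y)) => {
                if x != y {
                    count += 1;
                }
            }
            // Both iterators ended together: the lengths matched.
            (None, None) => return Ok(count),
            // One side ended early: the distance is undefined.
            _ => return Err(LengthMismatch::DifferentLengths),
        }
    }
}
```

Walking both iterators in lockstep avoids counting characters up front, at the cost of detecting a length mismatch only at the end.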

Using the generic versions of the functions:

extern crate strsim;

use strsim::generic_levenshtein;

fn main() {
    assert_eq!(2, generic_levenshtein(&[1, 2, 3], &[0, 2, 5]));
}
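
To show why a generic version is possible at all, here is a sketch of Levenshtein distance over slices of any element type that supports equality. It uses the standard row-by-row dynamic programming approach and is an illustration only: `generic_levenshtein_sketch` is a hypothetical name, and strsim's own `generic_levenshtein` may have a different signature and implementation.

```rust
/// Hypothetical sketch, not strsim's code: Levenshtein distance between two
/// slices of any type that can be compared for equality.
pub fn generic_levenshtein_sketch<T: PartialEq>(a: &[T], b: &[T]) -> usize {
    // prev[j] is the distance between the items of `a` processed so far
    // and the first j items of `b`.
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for (i, x) in a.iter().enumerate() {
        let mut curr = Vec::with_capacity(b.len() + 1);
        // Turning i + 1 items into an empty sequence takes i + 1 deletions.
        curr.push(i + 1);
        for (j, y) in b.iter().enumerate() {
            let substitution = prev[j] + if x == y { 0 } else { 1 };
            let deletion = prev[j + 1] + 1;
            let insertion = curr[j] + 1;
            curr.push(substitution.min(deletion).min(insertion));
        }
        prev = curr;
    }
    prev[b.len()]
}
```

Strings can be passed by first collecting their characters into a `Vec<char>`, which keeps the comparison per character rather than per byte.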

Contributing

If you have Docker installed and don't want to install Rust itself, you can run $ ./dev for a development CLI.

Benchmarks require a Nightly toolchain. Run $ cargo +nightly bench.

License

MIT

No runtime deps