7 releases (1 stable)
1.0.0 | Jun 23, 2023 |
---|---|
0.11.0-alpha.4 | Mar 27, 2023 |
0.4.1 | Mar 25, 2023 |
0.3.0 | Mar 25, 2023 |
0.1.0 | Jun 28, 2020 |
#21 in #levenshtein
395KB
1.5K
SLoC
Overview
jellyfish is a library for approximate & phonetic matching of strings.
Source: https://github.com/jamesturk/jellyfish
Documentation: https://jamesturk.github.io/jellyfish/
Issues: https://github.com/jamesturk/jellyfish/issues
Included Algorithms
String comparison:
- Levenshtein Distance
- Damerau-Levenshtein Distance
- Jaro Distance
- Jaro-Winkler Distance
- Match Rating Approach Comparison
- Hamming Distance
Phonetic encoding:
- American Soundex
- Metaphone
- NYSIIS (New York State Identification and Intelligence System)
- Match Rating Codex
Example Usage
>>> import jellyfish
>>> jellyfish.levenshtein_distance('jellyfish', 'smellyfish')
2
>>> jellyfish.jaro_distance('jellyfish', 'smellyfish')
0.89629629629629637
>>> jellyfish.damerau_levenshtein_distance('jellyfish', 'jellyfihs')
1
>>> jellyfish.metaphone('Jellyfish')
'JLFX'
>>> jellyfish.soundex('Jellyfish')
'J412'
>>> jellyfish.nysiis('Jellyfish')
'JALYF'
>>> jellyfish.match_rating_codex('Jellyfish')
'JLLFSH'
Dependencies
~4–9MB
~100K SLoC