2 releases

0.2.12 Feb 6, 2024
0.2.11 Nov 29, 2023

#718 in Text processing


68 downloads per month

BSD-3-Clause and GPL-3.0 licenses

28MB
5.5K SLoC

Rust 4.5K SLoC // 0.1% comments
Jupyter Notebooks 681 SLoC // 0.1% comments
Python 449 SLoC // 0.0% comments

semsimian

Semsimian is a package that provides fast semantic similarity calculations for ontologies. It is a Rust library with a Python interface.

It includes implementations of Jaccard and Resnik similarity between terms in an ontology, as well as a method for calculating the similarity between two sets of terms (termset similarity). Other methods will be added in the future.
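To make these measures concrete, here is a minimal pure-Python sketch of how they are commonly defined. It does not use semsimian and is not its implementation. All function names are hypothetical, and several choices are assumptions made for illustration: ancestors exclude the term itself, information content is estimated from descendant counts within the ontology, and termset similarity is the average of best-match Jaccard scores in both directions.

```python
import math
from collections import defaultdict


def build_parents(triples, predicates=("is_a",)):
    """Map each subject to its direct parents over the chosen predicates."""
    parents = defaultdict(set)
    for subject, predicate, obj in triples:
        if predicate in predicates:
            parents[subject].add(obj)
    return parents


def ancestors(term, parents):
    """Proper ancestors of a term (assumption: the term itself is excluded)."""
    seen, stack = set(), [term]
    while stack:
        node = stack.pop()
        for parent in parents.get(node, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def jaccard_similarity(a, b, parents):
    """Shared ancestors divided by all ancestors; 0.0 if neither has any."""
    anc_a, anc_b = ancestors(a, parents), ancestors(b, parents)
    union = anc_a | anc_b
    return len(anc_a & anc_b) / len(union) if union else 0.0


def information_content(triples, parents):
    """IC(t) = log2(N / n_t), where n_t counts t plus all of its descendants."""
    terms = {t for s, _, o in triples for t in (s, o)}
    counts = {t: 1 for t in terms}  # every term counts itself once
    for term in terms:
        for anc in ancestors(term, parents):
            counts[anc] += 1
    return {t: math.log2(len(terms) / n) for t, n in counts.items()}


def resnik_similarity(a, b, parents, ic):
    """IC of the most informative common ancestor; 0.0 if there is none."""
    common = ancestors(a, parents) & ancestors(b, parents)
    return max((ic[t] for t in common), default=0.0)


def termset_similarity(set_a, set_b, parents):
    """Average of best-match Jaccard scores, taken in both directions."""
    def best_match_average(source, target):
        best = [max(jaccard_similarity(s, t, parents) for t in target) for s in source]
        return sum(best) / len(best)
    return (best_match_average(set_a, set_b) + best_match_average(set_b, set_a)) / 2


# Toy ontology, invented for this example.
triples = [
    ("banana", "is_a", "fruit"),
    ("cherry", "is_a", "fruit"),
    ("fruit", "is_a", "food"),
    ("carrot", "is_a", "vegetable"),
    ("vegetable", "is_a", "food"),
]
parents = build_parents(triples)
ic = information_content(triples, parents)

print(jaccard_similarity("banana", "cherry", parents))     # 1.0 (same ancestors)
print(jaccard_similarity("banana", "carrot", parents))     # one shared ancestor out of three
print(resnik_similarity("banana", "cherry", parents, ic))  # 1.0 (IC of "fruit")
print(resnik_similarity("banana", "carrot", parents, ic))  # 0.0 (only the root is shared)
print(termset_similarity(["banana", "cherry"], ["banana", "carrot"], parents))
```

Jaccard only counts overlapping ancestors, while Resnik weights the shared ancestor by how specific it is, so two terms that meet only at the root score 0.0 under Resnik but can still score above zero under Jaccard.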

Semsimian is currently integrated into OAK and the Monarch app to provide fast semantic similarity calculations.

Rust Installation

  • cargo add semsimian

Python Installation

  • Set up your virtual environment of choice.
  • cd semsimian (the root directory of this project)
  • pip install maturin
  • maturin develop
  • python
Python 3.9.16 (main, Jan 11 2023, 10:02:19) 
[Clang 14.0.6 ] :: Anaconda, Inc. on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from semsimian import Semsimian
>>> s = Semsimian([('banana', 'is_a', 'fruit'), ('cherry', 'is_a', 'fruit')])
>>> s.jaccard_similarity('banana', 'cherry')

This should yield a value of 1.0.

Releases

As of version 0.2.11, the semsimian source is released on GitHub, with a corresponding set of Python wheels released to PyPI and a corresponding release on crates.io.

Dependencies

~34–70MB
~1M SLoC