5 releases

0.1.4 Apr 28, 2024
0.1.3 Apr 28, 2024
0.1.2 Apr 28, 2024
0.1.1 Apr 28, 2024
0.1.0 Apr 26, 2024

#73 in Machine learning


315 downloads per month

GPL-2.0 license

24KB
196 lines

csep


Cosine Similarity Embeddings Print

Just as grep (Global Regular Expression Print) takes a regular expression and prints every line that matches it, csep (Cosine Similarity Embeddings Print) takes an input phrase and prints every chunk that is similar to it.
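To make "similar" concrete, here is a minimal sketch of cosine similarity over embedding vectors, written in Rust. It is an illustration only and is not taken from csep's source: the function names `cosine_similarity` and `similar_chunks`, the `threshold` parameter, and the idea of filtering by a fixed cutoff are all assumptions made for the example. In practice the vectors would come from an embedding model such as all-minilm.

```rust
/// Cosine similarity between two embedding vectors (hypothetical helper,
/// not csep's actual code). Returns `None` when the vectors differ in
/// length, are empty, or either one has zero magnitude.
fn cosine_similarity(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() || a.is_empty() {
        return None;
    }
    // Dot product of the two vectors.
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    // Euclidean norms of each vector.
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(dot / (norm_a * norm_b))
}

/// Return the text of every chunk whose embedding scores at least
/// `threshold` against the query embedding (assumed filtering rule).
fn similar_chunks<'a>(
    query: &[f32],
    chunks: &'a [(&'a str, Vec<f32>)],
    threshold: f32,
) -> Vec<&'a str> {
    chunks
        .iter()
        .filter(|(_, emb)| cosine_similarity(query, emb).map_or(false, |s| s >= threshold))
        .map(|(text, _)| *text)
        .collect()
}
```

The score ranges from -1 to 1, with values near 1 meaning the two vectors point in nearly the same direction. Returning `Option` rather than panicking keeps mismatched or all-zero vectors from silently producing `NaN`.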

The goal of this project is to give users command-line access to semantic search in the same way that grep gives them access to regular expressions. This gives you full control over semantic search from the command line on any Unix-like system, and it also lets you use csep in scripts and pipelines. If you combine it with a command-line LLM tool such as chat-gipity or Ollama, you could potentially perform RAG (retrieval-augmented generation) in a simple Unix shell script.

Installation

csep gets its embeddings from Ollama, so you first need to install Ollama and pull the all-minilm model:

ollama pull all-minilm

You can then install csep from a local checkout of the source using:

cargo install --path .

Or you can install the latest published version from crates.io with:

cargo install csep

Dependencies

~16–31MB
~399K SLoC