11 releases (5 breaking)
✓ Uses Rust 2018 edition
|Version|Released|
|0.10.1|Sep 23, 2019|
|0.10.0|Sep 16, 2019|
|0.9.0|Sep 4, 2019|
|0.8.2|Aug 16, 2019|
|0.5.1|Mar 24, 2019|
This is a crate for reading, writing, and using finalfusion embeddings in Rust. The word2vec and GloVe file formats are also supported. Please consult the API documentation for usage information.
Note: this package is still new; its API will change.
A library for reading, writing, and using word embeddings.
finalfusion allows you to read, write, and use word2vec and GloVe embeddings, and to read fastText embeddings. Its native data format, also called finalfusion, has several benefits over the word2vec, GloVe, and fastText formats.
finalfusion embeddings can be read with the read_embeddings method, which expects a reader, such as a BufReader wrapping a File.
Since finalfusion supports various types of vocabularies and embedding matrix (storage) formats, these should be specified as type parameters of the Embeddings type. Typically, however, one wants to read finalfusion embeddings with any type of vocabulary or embedding matrix. For this purpose, the VocabWrap and StorageWrap types are provided, which wrap any type of vocabulary and embedding matrix.
We can thus load embeddings in the finalfusion format and retrieve an embedding as follows:
```rust
use std::fs::File;
use std::io::BufReader;

use finalfusion::prelude::*;

let mut reader = BufReader::new(File::open("testdata/similarity.fifu").unwrap());

// Read the embeddings.
let embeddings: Embeddings<VocabWrap, StorageWrap> =
    Embeddings::read_embeddings(&mut reader)
    .unwrap();

// Look up an embedding.
let embedding = embeddings.embedding("Berlin");
```
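To make the idea of a wrapper type more concrete, the following is a minimal, standalone sketch of how one enum can stand in for several storage types behind a common trait. It is not finalfusion's implementation: the names `Storage`, `DenseStorage`, `QuantizedStorage`, and `AnyStorage` are hypothetical, and the "quantized" variant is a toy scalar scheme chosen only for illustration.

```rust
/// Common lookup interface shared by all storage types (hypothetical).
trait Storage {
    /// Return the embedding stored in row `idx`.
    fn embedding(&self, idx: usize) -> Vec<f32>;
}

/// Hypothetical dense storage: a row-major matrix with `dims` floats per row.
struct DenseStorage {
    data: Vec<f32>,
    dims: usize,
}

/// Hypothetical toy quantized storage: each component is a byte times `scale`.
struct QuantizedStorage {
    codes: Vec<u8>,
    dims: usize,
    scale: f32,
}

impl Storage for DenseStorage {
    fn embedding(&self, idx: usize) -> Vec<f32> {
        self.data[idx * self.dims..(idx + 1) * self.dims].to_vec()
    }
}

impl Storage for QuantizedStorage {
    fn embedding(&self, idx: usize) -> Vec<f32> {
        // Rows must be reconstructed from their codes on every lookup.
        self.codes[idx * self.dims..(idx + 1) * self.dims]
            .iter()
            .map(|&code| f32::from(code) * self.scale)
            .collect()
    }
}

/// Wrapper enum: a single type that can hold either storage variant.
enum AnyStorage {
    Dense(DenseStorage),
    Quantized(QuantizedStorage),
}

impl Storage for AnyStorage {
    fn embedding(&self, idx: usize) -> Vec<f32> {
        // Dispatch to whichever concrete storage is wrapped.
        match self {
            AnyStorage::Dense(storage) => storage.embedding(idx),
            AnyStorage::Quantized(storage) => storage.embedding(idx),
        }
    }
}
```

Code that accepts `AnyStorage` does not need to know at compile time which storage a file contains, which is the convenience the wrapper types above provide for reading arbitrary embeddings.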
To perform analogy or similarity queries on the embedding matrix, we need an embedding matrix that can act as a view. In that case, use StorageViewWrap in place of StorageWrap. StorageViewWrap is only supported for a subset of embedding matrix types; in particular, quantized matrices cannot be used as a view.
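As a rough illustration of why similarity queries want a matrix view, here is a standalone sketch of a nearest-neighbour query over a row-major matrix held in a plain slice. It does not use finalfusion's API: `cosine` and `most_similar` are hypothetical helpers, and the sketch assumes `dims` is non-zero and that `matrix` holds exactly one row of `dims` values per word.

```rust
/// Cosine similarity between two vectors of equal length.
/// Returns 0.0 if either vector has zero norm.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Return up to `limit` words most similar to `query`, best first,
/// or `None` if `query` is not in the vocabulary.
///
/// `matrix` is row-major: row `i` is the embedding of `words[i]`.
fn most_similar<'a>(
    words: &[&'a str],
    matrix: &[f32],
    dims: usize,
    query: &str,
    limit: usize,
) -> Option<Vec<(&'a str, f32)>> {
    let query_idx = words.iter().position(|&w| w == query)?;
    let query_row = &matrix[query_idx * dims..(query_idx + 1) * dims];

    // Score every other row directly against the query row.
    let mut scored: Vec<(&'a str, f32)> = matrix
        .chunks_exact(dims)
        .zip(words)
        .enumerate()
        .filter(|(i, _)| *i != query_idx)
        .map(|(_, (row, &word))| (word, cosine(query_row, row)))
        .collect();

    // Sort by descending similarity and keep the top `limit` results.
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(limit);
    Some(scored)
}
```

The query scans every row in place, which is cheap when the whole matrix is available as contiguous floats. A quantized matrix would first have to reconstruct each row, which gives some intuition for why it cannot serve as a view.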
Consult the documentation of the word2vec module and the other format-specific modules for information on how to read fastText, GloVe, and word2vec embeddings.