5 releases
| Version | Release date |
|---|---|
| 0.1.4 | Oct 8, 2024 |
| 0.1.3 | Jul 16, 2024 |
| 0.1.2 | Jul 4, 2024 |
| 0.1.1 | Jul 2, 2024 |
| 0.1.0 | Nov 18, 2023 |
#1229 in Encoding
2,724 downloads per month
Used in kitoken
8KB
78 lines
SentencePiece model parser generated from the SentencePiece protobuf definition.
SentencePieceModel is the entry point for parsing SentencePiece models and accessing their contents.
```rust
use sentencepiece_model::SentencePieceModel;

let model = SentencePieceModel::from_file("tests/t5-spiece.model")?;
assert_eq!(model.pieces.len(), 32000);
assert_eq!(model.trainer().unwrap().unk_id(), 2);
```
sentencepiece-model
Usage
```toml
[dependencies]
sentencepiece-model = "0.1"
```
sentencepiece-model uses prost-build and protox to generate Rust code from the SentencePiece protobuf definition at build time. Because protox compiles the definition in pure Rust, protoc is not required.
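To give a feel for what the generated code does at runtime, here is a minimal, hand-written sketch of decoding the pieces of a serialized model from the protobuf wire format using only the standard library. It is not the crate's implementation, which is generated by prost. The names `Piece`, `parse_piece`, and `parse_pieces` are hypothetical, and the field numbers (`pieces = 1` in `ModelProto`; `piece = 1` and `score = 2` in `SentencePiece`) are assumed to match the upstream `sentencepiece_model.proto`.

```rust
// Illustrative only: a tiny protobuf wire-format reader, not the crate's API.

/// One vocabulary entry (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
struct Piece {
    text: String,
    score: f32,
}

/// Reads a base-128 varint at `*pos` and advances `*pos` past it.
fn read_varint(buf: &[u8], pos: &mut usize) -> Option<u64> {
    let mut value = 0u64;
    let mut shift = 0u32;
    loop {
        let byte = *buf.get(*pos)?;
        *pos += 1;
        if shift >= 64 {
            return None; // malformed: more than ten bytes
        }
        value |= u64::from(byte & 0x7f) << shift;
        if byte & 0x80 == 0 {
            return Some(value);
        }
        shift += 7;
    }
}

/// Reads a length-delimited payload (wire type 2) and advances `*pos` past it.
fn read_bytes<'a>(buf: &'a [u8], pos: &mut usize) -> Option<&'a [u8]> {
    let len = read_varint(buf, pos)? as usize;
    let end = (*pos).checked_add(len)?;
    let payload = buf.get(*pos..end)?;
    *pos = end;
    Some(payload)
}

/// Skips a field value this sketch does not interpret.
fn skip_field(buf: &[u8], pos: &mut usize, wire_type: u64) -> Option<()> {
    // Fixed-width wire types: 1 is 64-bit, 5 is 32-bit.
    let width = match wire_type {
        0 => return read_varint(buf, pos).map(|_| ()),
        2 => return read_bytes(buf, pos).map(|_| ()),
        1 => 8,
        5 => 4,
        _ => return None, // groups and unknown wire types are not handled
    };
    buf.get(*pos..*pos + width)?;
    *pos += width;
    Some(())
}

/// Decodes one `SentencePiece` message (assumed: piece = 1, score = 2).
fn parse_piece(buf: &[u8]) -> Option<Piece> {
    let mut piece = Piece { text: String::new(), score: 0.0 };
    let mut pos = 0;
    while pos < buf.len() {
        let key = read_varint(buf, &mut pos)?;
        match (key >> 3, key & 7) {
            (1, 2) => piece.text = String::from_utf8(read_bytes(buf, &mut pos)?.to_vec()).ok()?,
            (2, 5) => {
                let raw = buf.get(pos..pos + 4)?;
                piece.score = f32::from_le_bytes([raw[0], raw[1], raw[2], raw[3]]);
                pos += 4;
            }
            (_, wire_type) => skip_field(buf, &mut pos, wire_type)?,
        }
    }
    Some(piece)
}

/// Collects every `pieces` entry (assumed field 1) from a serialized model.
fn parse_pieces(model: &[u8]) -> Option<Vec<Piece>> {
    let mut pieces = Vec::new();
    let mut pos = 0;
    while pos < model.len() {
        let key = read_varint(model, &mut pos)?;
        match (key >> 3, key & 7) {
            (1, 2) => pieces.push(parse_piece(read_bytes(model, &mut pos)?)?),
            (_, wire_type) => skip_field(model, &mut pos, wire_type)?,
        }
    }
    Some(pieces)
}
```

Calling `parse_pieces` on the bytes of a model file would return `None` for truncated or malformed input and a list of pieces otherwise. In practice the crate's `SentencePieceModel` covers the full schema, including the trainer and normalizer specifications that this sketch simply skips.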
Dependencies
~0.3–2.2MB
~38K SLoC