2 releases
| Version | Date |
|---|---|
| 0.0.2 (new) | Dec 19, 2024 |
| 0.0.1 | Dec 12, 2024 |
crossandra-rs
crossandra-rs is a work-in-progress ⚠️, straightforward tokenization library for seamless text processing. It is a simplified Rust implementation of the Python Crossandra library.
Usage
Add this to your Cargo.toml:
[dependencies]
crossandra = "0.0.1"
Import and use like this:
use crossandra::{Tokenizer, common};

fn main() {
    let word_finder = Tokenizer::default()
        .with_patterns(vec![common::WORD.clone()])
        .expect("built-in pattern should be safe");
    let text = "Hello, world!";
    for token in word_finder.tokenize(text).flatten() {
        println!("{:?}", token);
    }
    // Token { name: "word", value: "Hello", position: 0 }
    // Token { name: "word", value: "world", position: 7 }
}
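To make the expected output above easier to read, here is a minimal, dependency-free sketch of what a word-finding pass does conceptually. It is not crossandra's implementation: the `Token` struct and the `find_words` function are hypothetical stand-ins, a "word" is assumed to be a maximal run of alphanumeric characters, and `position` is assumed to be a byte offset into the input.

```rust
/// Hypothetical stand-in for the token shape shown in the output above;
/// this is not the crate's actual type.
#[derive(Debug, PartialEq)]
struct Token<'a> {
    name: &'static str,
    value: &'a str,
    position: usize,
}

/// Scan `text` for maximal runs of alphanumeric characters and report
/// each run together with its starting byte offset.
fn find_words(text: &str) -> Vec<Token<'_>> {
    let mut tokens = Vec::new();
    // Byte offset where the current word started, if we are inside one.
    let mut start: Option<usize> = None;

    for (i, ch) in text.char_indices() {
        if ch.is_alphanumeric() {
            if start.is_none() {
                start = Some(i);
            }
        } else if let Some(s) = start.take() {
            // A non-word character ends the current run.
            tokens.push(Token { name: "word", value: &text[s..i], position: s });
        }
    }

    // Flush a word that runs to the end of the input.
    if let Some(s) = start {
        tokens.push(Token { name: "word", value: &text[s..], position: s });
    }
    tokens
}

fn main() {
    for token in find_words("Hello, world!") {
        println!("{:?}", token);
    }
}
```

In this sketch the comma, space, and exclamation mark are skipped silently, which is why only two tokens come out and the second one starts at offset 7. How the real tokenizer treats unmatched input is described in the crate documentation.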
Documentation
The documentation is available at docs.rs/crossandra.
Acknowledgements
Huge thanks to @Maneren for his invaluable guidance in developing this library 🫶
License
crossandra-rs is licensed under the MIT License.
© trag1c, 2024