8 releases

new 0.1.10 Apr 24, 2025
0.1.9 Apr 10, 2025

#500 in Text processing


714 downloads per month

Custom license

13KB
222 lines

hat-splitter

The hat-splitter crate implements the splitting rule described in the Hierarchical Autoregressive Transformers (HAT) paper. Use it when implementing training or inference for HAT models.

Installation

cargo add hat-splitter

Usage

use hat_splitter::{HATSplitter, Splitter};

let my_hat_splitter = HATSplitter::new();

// Split the text into words; punctuation and leading spaces stay attached.
let words: Vec<String> = my_hat_splitter.split("Hello, world!");
assert_eq!(words, vec!["Hello,", " world!"]);

// Same split, but no chunk exceeds 4 bytes; the result is raw bytes, not strings.
let words: Vec<Vec<u8>> = my_hat_splitter.split_with_limit("Hello, world!", 4);
assert_eq!(words, vec![b"Hell".to_vec(), b"o,".to_vec(), b" wor".to_vec(), b"ld!".to_vec()]);
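Judging from the output above, split_with_limit appears to take the words produced by split and break each one into chunks of at most the given number of bytes. The sketch below reproduces that behaviour on already-split words using only the standard library. The chunk_words function is a hypothetical helper written for illustration; it is not part of hat-splitter and is not the crate's actual implementation.

```rust
/// Hypothetical helper (not part of hat-splitter): breaks each word into
/// chunks of at most `max_bytes` bytes, preserving the original order.
fn chunk_words(words: &[&str], max_bytes: usize) -> Vec<Vec<u8>> {
    // `chunks` panics on a zero chunk size, so fail early with a clear message.
    assert!(max_bytes > 0, "max_bytes must be non-zero");
    words
        .iter()
        // Chunk each word separately so that no chunk spans two words.
        .flat_map(|word| word.as_bytes().chunks(max_bytes))
        .map(|chunk| chunk.to_vec())
        .collect()
}
```

Calling chunk_words(&["Hello,", " world!"], 4) yields the same four byte chunks as the split_with_limit example. Because the cut points are byte offsets, a chunk can end in the middle of a multi-byte UTF-8 character, which is presumably why the limited variant returns Vec<Vec<u8>> rather than Vec<String>.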

Dependencies

~2.5–3.5MB
~59K SLoC