1 unstable release

0.1.1 Feb 26, 2024

#439 in Machine learning

MIT license

17KB
365 lines


Minimal, fast, multi-threaded implementation of the Byte Pair Encoding (BPE) for LLM tokenization

Dependencies

~9–20MB
~272K SLoC