1 unstable release

0.1.1 Feb 26, 2024

#699 in Machine learning

MIT license

17KB
365 lines


Minimal, fast, multi-threaded implementation of the Byte Pair Encoding (BPE) for LLM tokenization

Dependencies

~9–19MB
~279K SLoC