#transformer #cuda #kernel #oxidized #bert #llama #models

oxidized-cuda-kernels

Additional CUDA kernels for Oxidized Transformers

1 unstable release

0.1.1 Mar 22, 2024

#433 in Machine learning

Download history: 117/week @ 2024-03-22 · 18/week @ 2024-03-29 · 5/week @ 2024-04-05

140 downloads per month

MIT/Apache

12KB
256 lines

Oxidized Transformers

Oxidized Transformers is a transformer library for Rust that started out as a port of Curated Transformers. The foundations are in place and several popular models are implemented, but the API is still too volatile to rely on in projects. Keep an eye on the repository, since progress is currently fast.

🧰 Supported Model Architectures

Supported encoder-only models:

  • ALBERT
  • BERT
  • RoBERTa
  • XLM-RoBERTa

Supported decoder-only models:

  • GPT-NeoX
  • Llama 1/2

All supported models can be loaded from the Hugging Face Hub. Float16/bfloat16 models can use FlashAttention v2 on recent CUDA GPUs.
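Loading from the Hub ultimately comes down to fetching a config and a weights file and instantiating the model from them. The snippet below is a minimal sketch of that download step using the general-purpose hf-hub crate; it is not the Oxidized Transformers API itself, and the repository id (bert-base-uncased) and file names are illustrative assumptions.

```rust
// Sketch of fetching a checkpoint from the Hugging Face Hub in Rust.
// Uses the general-purpose `hf-hub` crate, not the Oxidized Transformers API;
// the repository id and file names below are assumptions for illustration.
use hf_hub::api::sync::Api;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let api = Api::new()?;
    // Point at a model repository on the Hub (e.g. an encoder-only BERT model).
    let repo = api.model("bert-base-uncased".to_string());

    // Download (and cache) the config and weights; the returned paths point
    // into the local Hub cache.
    let config_path = repo.get("config.json")?;
    let weights_path = repo.get("model.safetensors")?;

    println!("config:  {}", config_path.display());
    println!("weights: {}", weights_path.display());
    Ok(())
}
```

In Oxidized Transformers this step is handled for you when a model is loaded from the Hub; the sketch only illustrates the underlying download-and-cache mechanism.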

Dependencies

~9.5MB
~204K SLoC