#tensor #machine-learning #flash

candle-flash-attn-v1

Flash Attention v1 layer for the candle ML framework

1 unstable release

0.0.1 Apr 15, 2025

#62 in #flash

63 downloads per month

MIT/Apache

2MB
849 lines

Candle Flash Attention v1 Layer

Flash Attention v2 does not support Turing GPUs (e.g. the T4 and RTX 2080). On those GPUs, this layer can be used as a drop-in replacement for the official Candle flash attention layer in the meantime; see the sketch below.
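A minimal usage sketch follows. It assumes this crate mirrors the official `candle-flash-attn` API, i.e. a top-level `flash_attn(q, k, v, softmax_scale, causal)` function taking half-precision `(batch, seq_len, num_heads, head_dim)` tensors on a CUDA device; the actual export name and shape convention may differ, so check the crate docs.

```rust
use candle_core::{DType, Device, Result, Tensor};

fn main() -> Result<()> {
    // Flash attention kernels run on CUDA; a Turing GPU (T4, RTX 2080)
    // is the target this crate exists for.
    let device = Device::new_cuda(0)?;

    // Q, K, V in half precision, shaped (batch, seq_len, num_heads, head_dim).
    let q = Tensor::randn(0f32, 1f32, (1, 128, 8, 64), &device)?.to_dtype(DType::F16)?;
    let k = q.clone(); // identical tensors are fine for a shape/smoke test
    let v = q.clone();

    // Conventional scaling factor: 1 / sqrt(head_dim).
    let softmax_scale = 1f32 / (64f32).sqrt();

    // Assumed entry point, mirroring candle_flash_attn::flash_attn;
    // verify against this crate's actual documentation.
    let out = candle_flash_attn_v1::flash_attn(&q, &k, &v, softmax_scale, /* causal = */ true)?;
    println!("output shape: {:?}", out.shape());
    Ok(())
}
```

Like the official layer, building this crate compiles CUDA kernels, so a CUDA toolkit compatible with your driver is required at build time.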

Dependencies

~16–23MB
~404K SLoC