1 unstable release: 0.0.1 (Apr 15, 2025)
2MB, 849 lines
Candle Flash Attention v1 Layer
Flash Attention v2 requires Ampere-or-newer GPUs, so it does not run on Turing GPUs (e.g. T4, RTX 2080). Until that changes, this layer can be used as a drop-in replacement for the official Candle flash-attention layer on those cards.
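A minimal sketch of the drop-in usage, assuming this crate exposes a `flash_attn` function mirroring the official `candle-flash-attn` API (`flash_attn(q, k, v, softmax_scale, causal)` over `(batch, seq_len, num_heads, head_dim)` tensors); the crate path `candle_flash_attn_v1` and that signature are assumptions, not confirmed by this listing:

```rust
// Hedged sketch: swaps the official candle_flash_attn call for this crate's
// v1 kernel on Turing GPUs. Requires a CUDA-enabled build of Candle.
use candle_core::{DType, Device, Result, Tensor};

fn attention(q: &Tensor, k: &Tensor, v: &Tensor) -> Result<Tensor> {
    // q/k/v layout assumed: (batch, seq_len, num_heads, head_dim)
    let head_dim = q.dim(3)? as f64;
    let softmax_scale = (1.0 / head_dim.sqrt()) as f32;
    // On Ampere+ you would call the official layer instead:
    //   candle_flash_attn::flash_attn(q, k, v, softmax_scale, /*causal=*/true)
    // Turing fallback via this crate (name and signature assumed):
    candle_flash_attn_v1::flash_attn(q, k, v, softmax_scale, /*causal=*/true)
}

fn main() -> Result<()> {
    let dev = Device::new_cuda(0)?;
    // Flash-attention kernels expect half precision.
    let q = Tensor::randn(0f32, 1.0, (1, 128, 8, 64), &dev)?.to_dtype(DType::F16)?;
    let k = q.clone();
    let v = q.clone();
    let out = attention(&q, &k, &v)?;
    println!("{:?}", out.shape()); // same shape as q
    Ok(())
}
```

Since the call site is a single function swap, switching back to `candle_flash_attn::flash_attn` once v2 supports your GPU is a one-line change.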
Dependencies: ~16–23MB, ~404K SLoC