
candle-layer-norm

Layer Norm layer for the candle ML framework

1 unstable release

0.0.1 Apr 15, 2025


BSD-3-Clause

42KB
835 lines

Candle CUDA Layer Norm

Layer Norm fused operation for the Candle ML framework.

This layer was adapted from https://github.com/Dao-AILab/flash-attention/tree/main/csrc/layer_norm.

It implements fused dropout + residual + LayerNorm, building on Apex's FastLayerNorm.

Major changes:

  • Add residual.
  • Make it work for both pre-norm and post-norm architectures.
  • Support more hidden dimensions (all dimensions divisible by 8, up to 8192).
  • Implement RMSNorm as an option.
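To make the fused operation concrete, here is a minimal, non-fused reference sketch in plain Rust of the math the kernel computes on a single row: add the residual, then apply LayerNorm (or RMSNorm, which skips mean subtraction). Function names, signatures, and the row-at-a-time layout are illustrative only and are not the crate's actual API.

```rust
/// Residual add followed by LayerNorm over one row:
/// z = x + residual; out = (z - mean(z)) / sqrt(var(z) + eps) * gamma + beta
fn layer_norm(x: &[f32], residual: &[f32], gamma: &[f32], beta: &[f32], eps: f32) -> Vec<f32> {
    let n = x.len() as f32;
    // Residual addition happens before normalization (the fused kernel
    // does this in one pass instead of materializing z separately).
    let z: Vec<f32> = x.iter().zip(residual).map(|(a, b)| a + b).collect();
    let mean = z.iter().sum::<f32>() / n;
    let var = z.iter().map(|v| (v - mean) * (v - mean)).sum::<f32>() / n;
    let inv_std = (var + eps).sqrt().recip();
    z.iter()
        .zip(gamma.iter().zip(beta))
        .map(|(v, (g, b))| (v - mean) * inv_std * g + b)
        .collect()
}

/// RMSNorm variant: no mean subtraction, no bias.
/// out = x / sqrt(mean(x^2) + eps) * gamma
fn rms_norm(x: &[f32], gamma: &[f32], eps: f32) -> Vec<f32> {
    let n = x.len() as f32;
    let mean_sq = x.iter().map(|v| v * v).sum::<f32>() / n;
    let inv_rms = (mean_sq + eps).sqrt().recip();
    x.iter().zip(gamma).map(|(v, g)| v * inv_rms * g).collect()
}

fn main() {
    let x = vec![1.0, 2.0, 3.0, 4.0];
    let r = vec![0.5; 4]; // residual branch
    let g = vec![1.0; 4]; // scale (gamma)
    let b = vec![0.0; 4]; // shift (beta)
    println!("layer_norm: {:?}", layer_norm(&x, &r, &g, &b, 1e-5));
    println!("rms_norm:   {:?}", rms_norm(&x, &g, 1e-5));
}
```

In a pre-norm architecture the kernel also returns the pre-normalization sum `x + residual` so it can feed the next residual connection, whereas post-norm only needs the normalized output; the fused kernel avoids the extra memory round-trips a naive sequence of dropout, add, and norm would incur.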

Dependencies

~21MB
~471K SLoC