#onnx #neural-network #tensorflow

tract-linalg

Tiny, no-nonsense, self-contained TensorFlow and ONNX inference

142 releases

Uses new Rust 2024

0.22.0 Aug 26, 2025
0.21.13 May 15, 2025
0.21.12 Apr 10, 2025
0.21.11 Mar 19, 2025
0.2.9 Mar 28, 2019

#1473 in Machine learning

Download history: roughly 4,600–11,700 downloads/week between 2025-08-04 and 2025-11-17

33,158 downloads per month
Used in 45 crates (2 directly)

MIT/Apache

1MB
27K SLoC

Rust: 18K SLoC // 0.0% comments
Templ: 9K SLoC // 0.1% comments
GNU Style Assembly: 13 SLoC // 0.3% comments

tract-linalg

linalg stands for "linear algebra". This is a misnomer: this crate contains the low-level, architecture-dependent optimisations used by tract-core.

Functions

  • MatMatMul: Extended matrix*matrix product:
    • inspired by the GotoBLAS and BLIS micro-kernel approach
    • extended for convolution-friendly addressing (fused im2col)
    • fused output pipeline (min, max, and a few more simple, fast ops)
    • f32*f32 -> f32 (à la sgemm)
    • i8*i8 -> i32 accumulator -> i32 storage
    • i8*i8 -> i32 accumulator -> i8 (with channel zeropoint and scale, and re-quantization pipeline)
  • f32 sigmoid and f32 tanh: at f32 precision, computed by a rational function (no exponentiation)
  • byte-to-byte lookup table
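To make the MatMatMul idea concrete, here is a minimal scalar sketch of a GotoBLAS/BLIS-style micro-kernel with a fused min/max output pipeline. The names (`kernel_4x4`, `panel_a`, `panel_b`) and the packing layout are illustrative assumptions, not tract-linalg's actual API, and a real kernel would use SIMD registers rather than scalar loops.

```rust
/// Multiply a 4xK packed A panel by a Kx4 packed B panel, accumulating in
/// a register-sized tile, then apply a fused min/max clamp before storing
/// to C (row-major 4x4) — the clamp costs no extra pass over memory.
fn kernel_4x4(k: usize, panel_a: &[f32], panel_b: &[f32], c: &mut [f32; 16], min: f32, max: f32) {
    let mut acc = [[0f32; 4]; 4];
    for p in 0..k {
        // A is packed 4 rows per depth step, B is packed 4 columns per depth step,
        // so both panels are read strictly sequentially.
        let a = &panel_a[4 * p..4 * p + 4];
        let b = &panel_b[4 * p..4 * p + 4];
        for i in 0..4 {
            for j in 0..4 {
                acc[i][j] += a[i] * b[j];
            }
        }
    }
    // Fused output pipeline: clamp while spilling the accumulators to C.
    for i in 0..4 {
        for j in 0..4 {
            c[4 * i + j] = acc[i][j].min(max).max(min);
        }
    }
}

fn main() {
    // k = 1: a 4x1 times 1x4 outer product, clamped to [0, 2].
    let mut c = [0f32; 16];
    kernel_4x4(1, &[1.0, 2.0, 3.0, 4.0], &[1.0, 1.0, 1.0, 1.0], &mut c, 0.0, 2.0);
    assert_eq!(c[0], 1.0);  // 1 * 1
    assert_eq!(c[15], 2.0); // 4 * 1, clamped down to 2
}
```

Packing both operands ahead of time is what makes the convolution-friendly addressing possible: an im2col view of the input can be materialised directly into the packed panel format.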
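The i8*i8 -> i8 variant ends with a re-quantization step: the wide i32 accumulator has to be rescaled into the output's quantized domain. A sketch of that final step, assuming the usual affine scheme (real = scale * (q - zero_point)); the function name and parameters are illustrative, not tract-linalg's API:

```rust
/// Map one i32 accumulator value back to i8:
/// out = clamp(round(acc * multiplier) + output_zero_point),
/// where multiplier = (a_scale * b_scale) / c_scale.
fn requantize(acc: i32, multiplier: f32, output_zero_point: i32) -> i8 {
    let scaled = (acc as f32 * multiplier).round() as i32 + output_zero_point;
    scaled.clamp(i8::MIN as i32, i8::MAX as i32) as i8
}

fn main() {
    // Dot product of two i8 vectors accumulates exactly in i32...
    let a: [i8; 4] = [10, 20, 30, 40];
    let b: [i8; 4] = [1, 2, 3, 4];
    let acc: i32 = a.iter().zip(&b).map(|(&x, &y)| x as i32 * y as i32).sum();
    assert_eq!(acc, 300);
    // ...then gets squeezed back to i8 with scale 0.1 and zero point 5.
    assert_eq!(requantize(acc, 0.1, 5), 35); // round(300 * 0.1) + 5
}
```

Production kernels typically replace the f32 multiply with a fixed-point multiplier plus shift so the whole pipeline stays in integer arithmetic; the sketch above keeps the f32 form for clarity.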
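The "rational function, no exponentiation" trick for sigmoid and tanh can be illustrated with a classic Padé-style approximant, tanh(x) ≈ x(27 + x²)/(27 + 9x²). These coefficients are NOT tract's actual polynomial (its approximation is more accurate); they only show the shape of the idea: a ratio of small polynomials plus an input clamp, with no call to exp().

```rust
/// Low-precision rational approximation of tanh (illustrative coefficients).
fn tanh_rational(x: f32) -> f32 {
    // Clamp so the approximation saturates like the real tanh;
    // at |x| = 3 the formula evaluates to exactly +/-1.
    let x = x.clamp(-3.0, 3.0);
    let x2 = x * x;
    x * (27.0 + x2) / (27.0 + 9.0 * x2)
}

/// Sigmoid via the identity sigmoid(x) = (1 + tanh(x/2)) / 2.
fn sigmoid_rational(x: f32) -> f32 {
    0.5 * (1.0 + tanh_rational(0.5 * x))
}

fn main() {
    assert_eq!(sigmoid_rational(0.0), 0.5);
    assert!((tanh_rational(1.0) - 1f32.tanh()).abs() < 0.02);
    assert!(sigmoid_rational(10.0) > 0.95);
}
```

Because the whole evaluation is a handful of multiplies, adds, and one divide, it vectorises trivially, which is what makes the 4n-wide NEON and armv8 implementations in the table below possible.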

Implementations

|                   | generic fallback | armv6, vfp | armv7 neon | armv8 simd | x64 FMA |
|-------------------|------------------|------------|------------|------------|---------|
| MatMatMul f32     | 4x4              |            | 8x4        | 8x8        | 16x6    |
| MatMatMul i8->i8  |                  |            | 8x4        | 8x8        |         |
| MatMatMul i8->i32 |                  |            |            | 8x8        |         |
| sigmoid f32       |                  |            | 4n         | 4n         |         |
| tanh f32          |                  |            | 4n         | 4n         |         |
| byte lookup       |                  |            |            |            |         |

Dependencies

~9–12MB
~232K SLoC