#onnx #tensorflow #neural-network #inference #artificial-intelligence #storage #ops #table

tract-linalg

Tiny, no-nonsense, self-contained, TensorFlow and ONNX inference

by Mathieu Poumeyrol and 68 contributors

140 releases

new 0.21.12	Apr 10, 2025
0.21.11	Mar 19, 2025
0.21.10	Feb 21, 2025
0.21.8	Dec 5, 2024
0.2.9	Mar 28, 2019

#1508 in Machine learning

27,604 downloads per month
Used in 38 crates (2 directly)

MIT/Apache

1MB
27K SLoC

tract-linalg

linalg stands for "linear algebra". This is a misnomer. This crate contains low-level, architecture-dependent optimisations used by tract-core.
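To illustrate what "architecture-dependent" can mean in practice, here is a minimal sketch of run-time kernel selection with a generic fallback. The `Kernel` enum and `select_kernel` function are hypothetical names made up for this example, not tract-linalg's API; only the `is_x86_feature_detected!` macro comes from the Rust standard library.

```rust
/// Hypothetical kernel identifiers, for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Kernel {
    Generic,
    X64Fma,
    Armv8Simd,
}

/// Pick a kernel for the CPU the program is running on.
pub fn select_kernel() -> Kernel {
    // On x86_64, FMA support has to be probed at run time.
    #[cfg(target_arch = "x86_64")]
    {
        if std::arch::is_x86_feature_detected!("fma") {
            return Kernel::X64Fma;
        }
    }
    // On AArch64, Advanced SIMD is part of the baseline, so a
    // compile-time check is enough.
    if cfg!(target_arch = "aarch64") {
        return Kernel::Armv8Simd;
    }
    // Portable code path for everything else.
    Kernel::Generic
}
```

A caller would typically run the selection once at start-up and keep the result (or a function pointer derived from it), so that the hot loops do not repeat the CPU probe.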

Functions

MatMatMul: Extended matrix*matrix product:
- inspired by the GotoBLAS and BLIS micro-kernel approach
- extended for convolution-friendly addressing (fused im2col)
- fused output pipeline (min, max, and a few more simple, fast ops)
- f32*f32 -> f32 (à la sgemm)
- i8*i8 -> i32 accumulator -> i32 storage
- i8*i8 -> i32 accumulator -> i8 (with channel zeropoint and scale, and re-quantization pipeline)
f32 sigmoid and f32 tanh: at f32 precision, by a rational function (no exponentiation)
byte-to-byte lookup table
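The `i8*i8 -> i32 accumulator -> i8` path listed above can be sketched as follows. This is a naive reference loop, not the crate's optimised kernel. The function names, the row-major layout, the per-row scale and the single output zero point are all assumptions chosen for this example.

```rust
/// Requantize one i32 accumulator to i8:
/// scale, round to nearest, add the zero point, then saturate.
fn requantize(acc: i32, scale: f32, zero_point: i32) -> i8 {
    let scaled = (acc as f32 * scale).round() as i32;
    scaled
        .saturating_add(zero_point)
        .clamp(i8::MIN as i32, i8::MAX as i32) as i8
}

/// Row-major (m x k) * (k x n) -> (m x n) product on i8 inputs.
/// Products are accumulated in i32, then each output row (channel)
/// is requantized with its own scale (an assumption of this sketch).
fn qmatmul_i8(
    a: &[i8],
    b: &[i8],
    m: usize,
    k: usize,
    n: usize,
    scales: &[f32],
    zero_point: i32,
) -> Vec<i8> {
    assert_eq!(a.len(), m * k);
    assert_eq!(b.len(), k * n);
    assert_eq!(scales.len(), m);
    let mut c = vec![0i8; m * n];
    for i in 0..m {
        for j in 0..n {
            // Wide accumulator: i8 * i8 products would overflow an i8.
            let mut acc = 0i32;
            for p in 0..k {
                acc += a[i * k + p] as i32 * b[p * n + j] as i32;
            }
            // Fused output step: requantize before storing.
            c[i * n + j] = requantize(acc, scales[i], zero_point);
        }
    }
    c
}
```

Doing the requantization right where the accumulator is produced is the point of a fused output pipeline: the i32 intermediate never has to be written to memory as a full matrix.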

Implementations

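|                   | generic fallback | armv6, vfp | armv7 neon | armv8 simd | x64 FMA |
|-------------------|------------------|------------|------------|------------|---------|
| MatMatMul f32     |                  | 4x4        | 8x4        | 8x8        | 16x6    |
| MatMatMul i8->i8  |                  |            | 8x4        |            | 8x8     |
| MatMatMul i8->i32 |                  |            |            |            | 8x8     |
| sigmoid f32       |                  |            | 4n         | 4n         |         |
| tanh f32          |                  |            | 4n         | 4n         |         |
| byte lookup       |                  |            |            |            |         |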
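Assuming entries such as `4x4` or `8x8` denote the tile of output values computed by one micro-kernel call (the text does not say so explicitly), the BLIS-style scheme can be sketched in portable Rust as below. `MR`, `NR`, `micro_kernel` and `matmul_tiled` are hypothetical names for this example. Real kernels keep the tile in SIMD registers and handle edge tiles, which this sketch leaves out.

```rust
const MR: usize = 4; // tile rows
const NR: usize = 4; // tile columns

/// Accumulate one MR x NR tile as a sum of k outer products.
/// `a_panel` holds k groups of MR values, `b_panel` k groups of NR values.
fn micro_kernel(k: usize, a_panel: &[f32], b_panel: &[f32], tile: &mut [[f32; NR]; MR]) {
    for p in 0..k {
        let a = &a_panel[p * MR..(p + 1) * MR];
        let b = &b_panel[p * NR..(p + 1) * NR];
        for i in 0..MR {
            for j in 0..NR {
                tile[i][j] += a[i] * b[j];
            }
        }
    }
}

/// Row-major (m x k) * (k x n) product. For brevity, m and n must be
/// multiples of the tile size.
fn matmul_tiled(a: &[f32], b: &[f32], m: usize, k: usize, n: usize) -> Vec<f32> {
    assert!(m % MR == 0 && n % NR == 0, "edge tiles are not handled here");
    assert_eq!(a.len(), m * k);
    assert_eq!(b.len(), k * n);
    let mut c = vec![0.0f32; m * n];
    let mut a_panel = vec![0.0f32; k * MR];
    let mut b_panel = vec![0.0f32; k * NR];
    for i0 in (0..m).step_by(MR) {
        // Pack MR rows of A so the kernel reads them contiguously.
        for p in 0..k {
            for i in 0..MR {
                a_panel[p * MR + i] = a[(i0 + i) * k + p];
            }
        }
        for j0 in (0..n).step_by(NR) {
            // Pack NR columns of B the same way.
            for p in 0..k {
                for j in 0..NR {
                    b_panel[p * NR + j] = b[p * n + j0 + j];
                }
            }
            let mut tile = [[0.0f32; NR]; MR];
            micro_kernel(k, &a_panel, &b_panel, &mut tile);
            // Store the finished tile into C.
            for i in 0..MR {
                for j in 0..NR {
                    c[(i0 + i) * n + j0 + j] = tile[i][j];
                }
            }
        }
    }
    c
}
```

The tile shape is what changes from one architecture to the next: a larger tile reuses each loaded value more often, but it only pays off when the whole tile fits in the available vector registers.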

Dependencies

~9–18MB
~228K SLoC

build build.rs
build cc
build half =2.4.1+std +num-traits
build liquid =0.26.8
build liquid-core =0.26.8
build liquid-derive =0.26.8
build smallvec
build time
build unicode-normalization
build walkdir
dev core_affinity 0.8
dev criterion 0.5.1 not wasm wasm
dev proptest not wasm wasm
dev env_logger 0.10
dev libc =0.2.164
dev nu-ansi-term 0.46
