SIMD-vectorized implementations of operations used in neural networks.
These implementations are used as kernels for operations in the `rten` crate.
## Constructing and dispatching operations
The operations are implemented by structs which implement the SIMD operation traits from `rten-simd`. To apply an operation to data, first construct the operation using the struct from this crate, then use a dispatch method from the `SimdOp` or `SimdUnaryOp` traits to execute the operation.
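For instance, using the `Sum` operation from the examples later in this document, the construct-then-dispatch pattern looks like this (a minimal sketch):

```rust
use rten_simd::SimdOp;
use rten_vecmath::Sum;

// Construct the operation with its input, then execute it via `dispatch`.
let total = Sum::new(&[1.0f32, 2.0, 3.0]).dispatch();
assert_eq!(total, 6.0);
```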
## In-place and non-in-place operations
Some operations support either updating data in place or reading input from one slice and writing results to another. For unary operations this is controlled by dispatching with either `map` or `map_mut`. For other operations this is handled by exposing separate constructors for the two cases, such as `Softmax::new` (separate input and output) and `Softmax::new_mut` (in place).
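As a brief sketch of the two forms, based on the `Softmax` constructors used in the examples below:

```rust
use std::mem::MaybeUninit;
use rten_simd::SimdOp;
use rten_vecmath::Softmax;

// In-place: the input slice is overwritten with the result.
let mut vals = [1.0f32, 0.5, 2.0];
Softmax::new_mut(&mut vals).dispatch();

// Separate input and output: the destination starts uninitialized.
let src = [1.0f32, 0.5, 2.0];
let mut dst = [MaybeUninit::uninit(); 3];
let result = Softmax::new(&src, &mut dst[..]).dispatch();
assert_eq!(result.len(), 3);
```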
For operations which use a separate source and destination, the destination is expected to be an uninitialized slice (`&mut [MaybeUninit<T>]`). This allows the caller to control allocation of the buffer and avoid the overhead of initializing elements which the operation will overwrite. The `ExtendInit` trait provides a safe API for the common task of filling a new `Vec` with the result of the operation.
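To make the buffer handling concrete, the sketch below spells out manually what `ExtendInit` wraps, using `Vec::spare_capacity_mut` and an unsafe `Vec::set_len`. It assumes, as in the softmax example later in this document, that `dispatch` returns the slice it initialized:

```rust
use rten_simd::SimdOp;
use rten_vecmath::Softmax;

let data = [1.0f32, 0.5, 2.0];
let mut output: Vec<f32> = Vec::with_capacity(data.len());

// The spare capacity of the Vec is an uninitialized `&mut [MaybeUninit<f32>]`.
let out_uninit = &mut output.spare_capacity_mut()[..data.len()];
let n_init = Softmax::new(&data, out_uninit).dispatch().len();

// SAFETY: `dispatch` initialized the first `n_init` elements.
unsafe { output.set_len(n_init) };
assert_eq!(output.len(), 3);
```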
## Examples
### Applying a vectorized unary function
```rust
use std::mem::MaybeUninit;
use rten_simd::SimdUnaryOp;
use rten_vecmath::Erf;

// Apply the error function to each element of `data`.
let mut data = [1., 0.5, 2.0];
let erf_op = Erf {};
erf_op.map_mut(&mut data);

// Apply the error function to each element of `src`, writing to `dest`.
let src = [1., 0.5, 2.0];
let mut dest = [MaybeUninit::uninit(); 3];
erf_op.map(&src, &mut dest);
```
### Applying softmax in place

This example applies the softmax function in place to a mutable slice.
```rust
use rten_simd::SimdOp;
use rten_vecmath::Softmax;

let mut data = [1., 0.5, 2.0];
Softmax::new_mut(&mut data).dispatch();
```
### Applying softmax with separate input and output buffers
This example reads data from an input and writes to an uninitialized output buffer (`&mut [MaybeUninit<f32>]`), obtained from the uninitialized portion of a `Vec<f32>`. To update the length of the `Vec<f32>` after it is initialized, the helper `ExtendInit` trait is used.
```rust
use rten_simd::SimdOp;
use rten_vecmath::{Softmax, ExtendInit};

let data = [1., 0.5, 2.0];
let mut output = Vec::with_capacity(data.len());
output.extend_init(|output_uninit| {
    // `output_uninit` is the uninitialized part of `output`, as returned by
    // `output.spare_capacity_mut()`.
    //
    // The `dispatch` call initializes it and returns the initialized slice.
    Softmax::new(&data, output_uninit).dispatch()
});
assert_eq!(output.len(), 3);
```
### Computing the sum of a list of floats
```rust
use rten_simd::SimdOp;
use rten_vecmath::Sum;

let data = [1., 0.5, 2.0];
let sum = Sum::new(&data).dispatch();
```
# rten-vecmath

This crate contains SIMD-vectorized kernels ("vectorized math") for various operations used in machine learning models. This includes:

- Math functions such as exp, erf, tanh
- Activation functions such as gelu
- Normalization functions such as softmax and mean-variance normalization
- Reduction functions such as sums and sums of squares

SIMD operations are implemented using portable SIMD types from the `rten-simd` crate.