1 unstable release

0.1.0 Mar 7, 2022

#1475 in Hardware support

MIT license

7KB
111 lines

simd-slice

This crate provides a wrapper type, struct SimdSlice(&[T]), with SIMD-enabled operations.

Relying on the compiler's auto-vectorization can fail for various reasons. Use this crate to enable vectorization explicitly, without boilerplate.
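To make the wrapper-type idea concrete, here is a minimal sketch of how a slice wrapper and an extension trait could fit together. The names SliceWrapper, AsSliceWrapper, and as_wrapper are hypothetical and are not the crate's definitions, and the method bodies are scalar placeholders rather than real SIMD code.

```rust
/// Hypothetical stand-in for a slice wrapper; the crate's real type may differ.
pub struct SliceWrapper<'a, T>(&'a [T]);

/// Hypothetical extension trait that adds `as_wrapper()` to slices
/// (and, through auto-deref, to `Vec<T>`).
pub trait AsSliceWrapper<T> {
    fn as_wrapper(&self) -> SliceWrapper<'_, T>;
}

impl<T> AsSliceWrapper<T> for [T] {
    fn as_wrapper(&self) -> SliceWrapper<'_, T> {
        SliceWrapper(self)
    }
}

impl<'a> SliceWrapper<'a, i32> {
    /// Scalar placeholder: a SIMD version would add several lanes per step.
    /// Wrapping addition is an assumption made here to avoid overflow panics.
    pub fn sum(&self) -> i32 {
        self.0.iter().fold(0_i32, |s, x| s.wrapping_add(*x))
    }

    /// Returns `None` for an empty slice.
    pub fn min(&self) -> Option<i32> {
        self.0.iter().copied().min()
    }

    /// Returns `None` for an empty slice.
    pub fn max(&self) -> Option<i32> {
        self.0.iter().copied().max()
    }
}
```

With the trait in scope, vec![3, 1, 2].as_wrapper().min() evaluates to Some(1), and an empty vector yields None.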

Example

// N is not defined in the original snippet; any length works (1000 is an arbitrary choice).
const N: usize = 1000;

let a: Vec<i32> = (0..N).map(|i| (i % 12345) as i32).collect();

use simd_slice::AsSimdSlice;

// unaligned, odd-length cases are supported
let sum: i32 = a.as_simd_slice().sum(); 
let min: Option<i32> = a.as_simd_slice().min();
let max: Option<i32> = a.as_simd_slice().max();
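One common way to handle odd-length input in a vectorized reduction is to process full fixed-width chunks and then fold in the leftover tail. The sketch below shows that general technique in stable Rust. It is not the crate's implementation: LANES and lanewise_sum are hypothetical names, and wrapping addition is an assumption made to avoid overflow panics.

```rust
/// Number of elements handled per step (hypothetical choice for illustration).
const LANES: usize = 8;

/// Sums a slice with one accumulator per lane, then folds the lanes
/// and the leftover tail into a single result.
fn lanewise_sum(data: &[i32]) -> i32 {
    let mut acc = [0_i32; LANES];
    let chunks = data.chunks_exact(LANES);
    // Elements that do not fill a whole chunk (the odd-length tail).
    let tail = chunks.remainder();

    // Each lane accumulates independently, which gives the compiler
    // a loop shape that maps directly onto vector instructions.
    for chunk in chunks {
        for (a, x) in acc.iter_mut().zip(chunk) {
            *a = a.wrapping_add(*x);
        }
    }

    // Horizontal reduction of the lanes, followed by the scalar tail.
    let lanes_total = acc.iter().fold(0_i32, |s, x| s.wrapping_add(*x));
    tail.iter().fold(lanes_total, |s, x| s.wrapping_add(*x))
}
```

For any input whose sum fits in i32, lanewise_sum returns the same value as data.iter().sum().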

Benchmark

Reduce-sum on a Vec<i32> with 1,000,000 items:

naive:   251.136µs (=1.000, 4039 iterations in 62 samples)
rayon:   108.663µs (=0.997, 9542 iterations in 71 samples)
simd :    52.341µs (=0.998, 20470 iterations in 79 samples)
both1:   148.563µs (=0.999, 7165 iterations in 68 samples)
both2:    61.943µs (=0.992, 16915 iterations in 77 samples)
both3:    82.678µs (=0.997, 12705 iterations in 74 samples)
  • simd = simd-slice
  • both1 = rayon chunks(4) + std::simd
  • both2 = rayon chunks(4096) + std::simd
  • both3 = rayon chunks(slice.len()/num_cores) + std::simd
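The both3 configuration splits the slice into roughly one chunk per core and reduces each chunk separately. The following std-only sketch shows that shape, with std::thread::scope standing in for rayon and a plain wrapping sum standing in for the std::simd reduction. The names chunk_sum and chunked_parallel_sum are hypothetical, and this is not the benchmark's actual code.

```rust
use std::thread;

/// Per-chunk reduction. A scalar wrapping sum is used here as a placeholder
/// for the SIMD reduction used in the benchmark.
fn chunk_sum(chunk: &[i32]) -> i32 {
    chunk.iter().fold(0_i32, |s, x| s.wrapping_add(*x))
}

/// Splits `data` into at most `workers` chunks, sums each chunk on its own
/// scoped thread, and combines the partial sums.
fn chunked_parallel_sum(data: &[i32], workers: usize) -> i32 {
    if data.is_empty() {
        return 0;
    }
    let workers = workers.max(1);
    // Ceiling division, so the slice yields no more than `workers` chunks.
    let chunk_len = (data.len() + workers - 1) / workers;

    thread::scope(|scope| {
        // Spawn every worker before joining any of them.
        let handles: Vec<_> = data
            .chunks(chunk_len)
            .map(|chunk| scope.spawn(move || chunk_sum(chunk)))
            .collect();

        handles
            .into_iter()
            .map(|h| h.join().expect("worker thread panicked"))
            .fold(0_i32, |s, x| s.wrapping_add(x))
    })
}
```

Passing a small chunk length instead (as in both1) creates many more tasks for the same amount of work, which is consistent with both1 being the slowest of the three combined variants in the table above.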

Dependencies

~465KB