9 releases (5 breaking)
Version | Released |
---|---|
0.6.1 | Aug 18, 2020 |
0.6.0 | Jul 14, 2020 |
0.5.1 | May 29, 2020 |
0.5.0 | Apr 19, 2020 |
0.1.1 | Sep 10, 2019 |
Multiversion
Function multiversioning attribute macros for Rust.
What is function multiversioning?
Many CPU architectures have a variety of instruction set extensions that provide additional functionality. Common examples are single instruction, multiple data (SIMD) extensions such as SSE and AVX on x86/x86-64 and NEON on ARM/AArch64. When available, these extended features can significantly speed up some functions. These optional features cannot be compiled into programs unconditionally: executing an unsupported instruction will result in a crash.
Function multiversioning is the practice of compiling multiple versions of a function with various features enabled and safely detecting which version to use at runtime.
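To make the idea concrete, here is a minimal hand-written sketch of multiversioning in plain Rust, without this crate. It assumes only the standard library's `is_x86_feature_detected!` macro and the `#[target_feature]` attribute; the function names (`square_avx`, `square_fallback`) are illustrative, and this is not necessarily the code the macros in this crate generate.

```rust
// Baseline version: compiles and runs on any target.
fn square_fallback(x: &mut [f32]) {
    for v in x.iter_mut() {
        *v *= *v;
    }
}

// AVX version: same body, but the compiler is allowed to emit AVX
// instructions here. Calling it on a CPU without AVX is undefined
// behavior, hence `unsafe`.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[target_feature(enable = "avx")]
unsafe fn square_avx(x: &mut [f32]) {
    for v in x.iter_mut() {
        *v *= *v;
    }
}

// Dispatcher: detects CPU features at runtime and picks a version.
pub fn square(x: &mut [f32]) {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if is_x86_feature_detected!("avx") {
            // SAFETY: AVX support was confirmed by the check above.
            unsafe { square_avx(x) };
            return;
        }
    }
    square_fallback(x);
}

fn main() {
    let mut data = [1.0_f32, 2.0, 3.0, 4.0];
    square(&mut data);
    println!("{:?}", data);
}
```

Writing this by hand for every function and every feature set is repetitive, which is the boilerplate an attribute macro can take over.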
Features
- Dynamic dispatching, using runtime CPU feature detection
- Static dispatching, avoiding repeated feature detection for nested multiversioned functions (and allowing inlining!)
- Support for all functions, including generic and async
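The difference between the first two items can be pictured in plain Rust. The sketch below is an assumption-laden illustration, not the crate's generated code: the names `sum_of_squares` and `square_avx` are hypothetical. The outer function performs feature detection once (dynamic dispatch); inside the AVX version, the nested helper is called directly because the required features are already known to be present (static dispatch), which also leaves the compiler free to inline it.

```rust
// Hypothetical helper compiled with AVX enabled.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[target_feature(enable = "avx")]
unsafe fn square_avx(x: &mut [f32]) {
    for v in x.iter_mut() {
        *v *= *v;
    }
}

// AVX version of the outer function.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[target_feature(enable = "avx")]
unsafe fn sum_of_squares_avx(x: &mut [f32]) -> f32 {
    // Static dispatch: this function only runs when AVX is available,
    // so the AVX helper is called directly, with no second detection.
    unsafe { square_avx(x) };
    x.iter().sum()
}

// Baseline version for targets or CPUs without AVX.
fn sum_of_squares_fallback(x: &mut [f32]) -> f32 {
    for v in x.iter_mut() {
        *v *= *v;
    }
    x.iter().sum()
}

// Dynamic dispatch: the only place where the CPU is queried.
pub fn sum_of_squares(x: &mut [f32]) -> f32 {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if is_x86_feature_detected!("avx") {
            // SAFETY: AVX support was confirmed by the check above.
            return unsafe { sum_of_squares_avx(x) };
        }
    }
    sum_of_squares_fallback(x)
}

fn main() {
    let mut data = [1.0_f32, 2.0, 3.0];
    let total = sum_of_squares(&mut data);
    println!("{:?} sums to {}", data, total);
}
```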
Example
Automatic function multiversioning with the clone attribute, similar to GCC's target_clones attribute:
use multiversion::multiversion;

#[multiversion]
#[clone(target = "[x86|x86_64]+avx")]
#[clone(target = "x86+sse")]
fn square(x: &mut [f32]) {
    for v in x {
        *v *= *v;
    }
}
Manual function multiversioning with the multiversion and target attributes:
use multiversion::{multiversion, target};

#[target("[x86|x86_64]+avx")]
unsafe fn square_avx(x: &mut [f32]) {
    for v in x {
        *v *= *v;
    }
}

#[target("x86+sse")]
unsafe fn square_sse(x: &mut [f32]) {
    for v in x {
        *v *= *v;
    }
}

#[multiversion]
#[specialize(target = "[x86|x86_64]+avx", fn = "square_avx", unsafe = true)]
#[specialize(target = "x86+sse", fn = "square_sse", unsafe = true)]
fn square(x: &mut [f32]) {
    for v in x {
        *v *= *v;
    }
}
License
Multiversion is distributed under the terms of both the MIT license and the Apache License (Version 2.0).
See LICENSE-APACHE and LICENSE-MIT for details.