3 releases (breaking)

new 0.2.0 Mar 11, 2025
0.1.0 Feb 10, 2025
0.0.1 Feb 8, 2025

Apache-2.0

1MB
42K SLoC

RSTSR OpenBLAS device

This crate provides the OpenBLAS device (backend) for RSTSR.

Usage

use rstsr_core::prelude::*;
use rstsr_openblas::DeviceOpenBLAS;

// create a device that uses 16 threads
let device = DeviceOpenBLAS::new(16);
// to use the default number of threads instead, use the following line
// let device = DeviceOpenBLAS::default();

let a = rt::linspace((0.0, 1.0, 1048576, &device)).into_shape([16, 256, 256]);
let b = rt::linspace((1.0, 2.0, 1048576, &device)).into_shape([16, 256, 256]);

// `%` is the matrix multiplication operator; with optimized BLAS this operation is very fast
let c = &a % &b;

// the mean over all elements is also computed in parallel
let c_mean = c.mean_all();

println!("{:?}", c_mean);
assert!((c_mean - 213.2503660477036).abs() < 1e-6);

Important Notes

  • This crate does not link OpenBLAS automatically:

    • Please add -l openblas to RUSTFLAGS, or emit cargo:rustc-link-lib=openblas from build.rs, or do something similar in your project. We do not use the external FFI crates blas or blas-sys, and we do not search for the OpenBLAS library automatically.
    • If the openmp feature is enabled, please also add -l gomp or -l omp to RUSTFLAGS, or emit cargo:rustc-link-lib=gomp or cargo:rustc-link-lib=omp from build.rs, or do something similar in your project. We do not use the external FFI crate openmp-sys, and we do not search for the OpenMP library automatically.
  • If your OpenBLAS is compiled with OpenMP, please enable the openmp feature on either this crate or rstsr-openblas-ffi.

    • In our testing, OpenBLAS built with OpenMP appears to be more efficient than the pthreads build. However, we have decided not to make openmp a default feature for now.
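
The linking notes above could be handled by a build.rs along the following lines. This is a minimal sketch, not something this crate ships: the helper link_directives and the environment variables OPENBLAS_LIB_DIR and OPENMP_LIB are hypothetical names chosen for illustration, and the CARGO_FEATURE_OPENMP check assumes your own crate declares an openmp feature (for example one that forwards to this crate's openmp feature).

```rust
// build.rs (sketch): emit the link directives that this crate does not emit for you.
use std::env;

/// Build the `cargo:` directives for linking OpenBLAS and, optionally, an OpenMP runtime.
/// `omp_lib` is the runtime library name, e.g. "gomp" (GNU) or "omp" (LLVM).
/// This helper is hypothetical; it only assembles strings.
fn link_directives(search_dir: Option<&str>, openmp: bool, omp_lib: &str) -> Vec<String> {
    let mut out = Vec::new();
    if let Some(dir) = search_dir {
        // Extra library search path, needed only if OpenBLAS is not in a default location.
        out.push(format!("cargo:rustc-link-search=native={dir}"));
    }
    out.push("cargo:rustc-link-lib=openblas".to_string());
    if openmp {
        out.push(format!("cargo:rustc-link-lib={omp_lib}"));
    }
    out
}

fn main() {
    // Hypothetical variable: directory that contains the OpenBLAS library.
    let dir = env::var("OPENBLAS_LIB_DIR").ok();
    // Cargo sets CARGO_FEATURE_<NAME> for enabled features of the package being built,
    // so this assumes your own crate declares an `openmp` feature.
    let openmp = env::var_os("CARGO_FEATURE_OPENMP").is_some();
    // Hypothetical variable for choosing the OpenMP runtime; defaults to "gomp".
    let omp_lib = env::var("OPENMP_LIB").unwrap_or_else(|_| "gomp".to_string());

    for line in link_directives(dir.as_deref(), openmp, &omp_lib) {
        println!("{line}");
    }
}
```

Keeping the directive list in a separate function makes the choice between gomp and omp explicit; pick whichever runtime your OpenBLAS build was compiled against.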
