4 releases
0.2.0 | May 24, 2021 |
---|---|
0.1.2 | May 24, 2021 |
0.1.1 | May 21, 2021 |
0.1.0 | May 21, 2021 |
#1237 in Concurrency
Used in 2 crates
(via scanflow)
12KB
154 lines
rayon-tlsctx
Thread local variables for Rayon thread pools
This crate provides a simple ThreadLocalCtx
struct that allows to store efficient thread-local state that gets built by a lambda.
It is incredibly useful in multithreaded processing, where a context needs to be used that is expensive to clone. In the end, there will be no more clones occuring than number of threads in a rayon thread pool.
lib.rs
:
Clone only when it's necessary
This library provides an efficient way to clone values in a rayon thread pool, but usually just once per thread. It cuts down on computation time for potentially expensive cloning operations.
Additional clones can rarely occur when rayon schedules execution of another instance of the same job, recursively. But in the end, there should not be more than 2N clones, for N threads.
Examples
use rayon_tlsctx::ThreadLocalCtx;
use rayon::iter::*;
const NUM_COPIES: usize = 16;
let mut buf: Vec<u16> = (0..!0).collect();
// Create a thread local context with value 0.
let ctx = ThreadLocalCtx::new(|| {
// Simulate expensive operation.
// Since we are building unlocked context,
// the sleeps will occur concurrently.
std::thread::sleep_ms(200);
0
});
let pool = rayon::ThreadPoolBuilder::new().num_threads(64).build().unwrap();
// Run inside a custom thread pool.
pool.install(|| {
// Sum the buffer `NUM_COPIES` times and accumulate the results
// into the threaded pool of counts. Note that the counts may be
// Unevenly distributed.
(0..NUM_COPIES)
.into_par_iter()
.flat_map(|_| buf.par_iter())
.for_each(|i| {
let mut cnt = unsafe { ctx.get() };
*cnt += *i as usize;
});
});
let buf_sum = buf.into_iter().fold(0, |acc, i| acc + i as usize);
// What matters is that the final sum matches the expected value.
assert_eq!(ctx.into_iter().sum::<usize>(), buf_sum * NUM_COPIES);
Dependencies
~1.5MB
~25K SLoC