#synchronization-primitive #async-task #waker #async #atomic #no-alloc

no-std diatomic-waker

An async, lock-free synchronization primitive for task wakeup

5 releases

0.2.3 Sep 8, 2024
0.2.2 Aug 24, 2024
0.2.1 Aug 21, 2024
0.2.0 Jul 28, 2024
0.1.0 Oct 12, 2022

#29 in Concurrency

Download history 15999/week @ 2024-09-19 18009/week @ 2024-09-26 24589/week @ 2024-10-03 25968/week @ 2024-10-10 26833/week @ 2024-10-17 28107/week @ 2024-10-24 26420/week @ 2024-10-31 35917/week @ 2024-11-07 30681/week @ 2024-11-14 32384/week @ 2024-11-21 34982/week @ 2024-11-28 38879/week @ 2024-12-05 45375/week @ 2024-12-12 21085/week @ 2024-12-19 14599/week @ 2024-12-26 26414/week @ 2025-01-02

116,313 downloads per month
Used in 177 crates (6 directly)

MIT/Apache

54KB
609 lines

diatomic-waker

An async, fast synchronization primitives for task wakeup.

Cargo Documentation License

Overview

diatomic-waker is similar to atomic-waker in that it enables concurrent updates and notifications to a wrapped Waker. Unlike the latter, however, it does not use spinlocks[^spinlocks] and is significantly faster, in particular when the consumer needs to be notified periodically rather than just once. It can in particular be used as a very fast, single-consumer eventcount to turn a non-blocking data structure into an asynchronous one (see MPSC channel receiver example).

This library is an offshoot of Asynchronix, an ongoing effort at a high performance asynchronous computation framework for system simulation.

[^spinlocks]: The implementation of AtomicWaker yields to the runtime on contention, which is in effect an executor-mediated spinlock.

Usage

Add this to your Cargo.toml:

[dependencies]
diatomic-waker = "0.2.3"

Features flags

By default, this crate enables the alloc feature to provide the owned WakeSink and WakeSource. It can be made no-std-compatible by specifying default-features = false.

Example

A multi-producer, single-consumer channel of capacity 1 for sending NonZeroUsize values, with an asynchronous receiver:

use std::num::NonZeroUsize;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

use diatomic_waker::{WakeSink, WakeSource};

// The sending side of the channel.
#[derive(Clone)]
struct Sender {
    wake_src: WakeSource,
    value: Arc<AtomicUsize>,
}

// The receiving side of the channel.
struct Receiver {
    wake_sink: WakeSink,
    value: Arc<AtomicUsize>,
}

// Creates an empty channel.
fn channel() -> (Sender, Receiver) {
    let value = Arc::new(AtomicUsize::new(0));
    let wake_sink = WakeSink::new();
    let wake_src = wake_sink.source();
    (
        Sender {
            wake_src,
            value: value.clone(),
        },
        Receiver { wake_sink, value },
    )
}

impl Sender {
    // Sends a value if the channel is empty.
    fn try_send(&self, value: NonZeroUsize) -> bool {
        let success = self
            .value
            .compare_exchange(0, value.get(), Ordering::Relaxed, Ordering::Relaxed)
            .is_ok();
        if success {
            self.wake_src.notify()
        };
        success
    }
}

impl Receiver {
    // Receives a value asynchronously.
    async fn recv(&mut self) -> NonZeroUsize {
        // Wait until the predicate returns `Some(value)`, i.e. when the atomic
        // value becomes non-zero.
        self.wake_sink
            .wait_until(|| NonZeroUsize::new(self.value.swap(0, Ordering::Relaxed)))
            .await
    }
}

Safety

This is a low-level primitive and as such its implementation relies on unsafe. The test suite makes extensive use of Loom to assess its correctness. As amazing as it is, however, Loom is only a tool: it cannot formally prove the absence of data races.

Implementation details

A distinguishing feature of diatomic-waker is its use of two waker storage slots (hence its name) rather than one. This makes it possible to achieve lock-freedom in situations where waker registration and notification are performed concurrently. In the case of concurrent notifications, even though one notifier does hold a notification lock, other notifiers never block: they merely request the holder of the lock to send another notification, which is a wait-free operation.

Compared to atomic-waker, dummy notifications (with no waker registered) are much cheaper. The overall cost of a successful notification (registration + notification itself) is also much cheaper in the common case where the registered/unregistered waker is always the same, because the last waker is always cached to avoid undue cloning. Quantitatively, the costs in terms of atomic Read-Modify-Write (RMW) operations are:

  • dummy notification due to no waker being registered: 1 RMW vs 2 RMWs for atomic-waker,
  • registration of the same waker as the last registered waker + notification: 1+3=4 RMWs vs 3+4=7 RMWs for atomic-waker (this assumes 1 RMW for Waker::wake_by_ref, 2 RMWs for Waker::wake, 1 RMW for Waker::clone).
  • registration of a new waker + notification: 3+3=6 RMWs vs 3+4=7 RMWs for atomic-waker (same assumptions as above + 1 RMW for Waker::drop); this is typically only necessary for the very first registration,
  • very few RMWs and predictable cost on contention due to the absence of spinlocks.

License

This software is licensed under the Apache License, Version 2.0 or the MIT license, at your option.

Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.

No runtime deps