#lazy-evaluation #static

no-std once_cell_serde

Single assignment cells and lazy values

1 stable release

1.17.1 Mar 14, 2023

#2600 in Rust patterns


434 downloads per month
Used in kodept-ast

MIT/Apache

89KB
1K SLoC

This project is https://github.com/matklad/once_cell with serde support as proposed at https://github.com/matklad/once_cell/pull/104


lib.rs:

Overview

once_cell provides two new cell-like types, unsync::OnceCell and sync::OnceCell. A OnceCell can store arbitrary non-Copy types, can be assigned to at most once, and provides direct access to the stored contents. The core API looks roughly like this (and there's much more inside, read on!):

impl<T> OnceCell<T> {
    const fn new() -> OnceCell<T> { ... }
    fn set(&self, value: T) -> Result<(), T> { ... }
    fn get(&self) -> Option<&T> { ... }
}

Note that, like with RefCell and Mutex, the set method requires only a shared reference. Because of the single assignment restriction get can return a &T instead of Ref<T> or MutexGuard<T>.

The sync flavor is thread-safe (that is, implements the Sync trait), while the unsync one is not.
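
As a small illustration of the single-assignment rule, the sketch below walks through a set/get/set sequence. It assumes std::cell::OnceCell (stable since Rust 1.70) as a stand-in, because it exposes the same new/set/get signatures listed above; with this crate the import would presumably be once_cell_serde::unsync::OnceCell instead. The function name set_twice is made up for the example.

```rust
// Stand-in assumption: std's OnceCell mirrors the new/set/get API shown above.
use std::cell::OnceCell;

/// Records what each step of a get/set/set/get sequence observes.
fn set_twice() -> (Option<i32>, Result<(), i32>, Result<(), i32>, Option<i32>) {
    let cell: OnceCell<i32> = OnceCell::new();
    let before = cell.get().copied(); // empty cell: None
    let first = cell.set(1);          // first assignment succeeds
    let second = cell.set(2);         // second assignment hands the value back
    let after = cell.get().copied();  // the first value is still stored
    (before, first, second, after)
}
```

Note that set takes &self throughout, so the cell never needs to be declared mut.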

Recipes

OnceCell might be useful for a variety of patterns.

Safe Initialization of Global Data

use std::{env, io};

use once_cell_serde::sync::OnceCell;

#[derive(Debug)]
pub struct Logger {
    // ...
}
static INSTANCE: OnceCell<Logger> = OnceCell::new();

impl Logger {
    pub fn global() -> &'static Logger {
        INSTANCE.get().expect("logger is not initialized")
    }

    fn from_cli(args: env::Args) -> Result<Logger, io::Error> {
        // ...
    }
}

fn main() {
    let logger = Logger::from_cli(env::args()).unwrap();
    INSTANCE.set(logger).unwrap();
    // use `Logger::global()` from now on
}

Lazy Initialized Global Data

This is essentially the lazy_static! macro, but without a macro.

use std::{sync::Mutex, collections::HashMap};

use once_cell_serde::sync::OnceCell;

fn global_data() -> &'static Mutex<HashMap<i32, String>> {
    static INSTANCE: OnceCell<Mutex<HashMap<i32, String>>> = OnceCell::new();
    INSTANCE.get_or_init(|| {
        let mut m = HashMap::new();
        m.insert(13, "Spica".to_string());
        m.insert(74, "Hoyten".to_string());
        Mutex::new(m)
    })
}
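
To show what the thread-safe flavor buys in this pattern, here is a hedged sketch in which several threads race to call get_or_init and the initializer still runs only once. It assumes std::sync::OnceLock (stable since Rust 1.70) as a stand-in for sync::OnceCell; the names INIT_CALLS, shared_value and race_to_init are invented for the example.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock; // stand-in assumption for sync::OnceCell
use std::thread;

// Counts how many times the initializer closure actually runs.
static INIT_CALLS: AtomicUsize = AtomicUsize::new(0);

fn shared_value() -> &'static String {
    static VALUE: OnceLock<String> = OnceLock::new();
    VALUE.get_or_init(|| {
        INIT_CALLS.fetch_add(1, Ordering::SeqCst);
        "Spica".to_string()
    })
}

/// Spawns `threads` threads that all request the value, then reports
/// how many times the initializer ran.
fn race_to_init(threads: usize) -> usize {
    let handles: Vec<_> = (0..threads)
        .map(|_| thread::spawn(|| shared_value().len()))
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    INIT_CALLS.load(Ordering::SeqCst)
}
```

Threads that lose the race block until the winner has finished initializing, then all of them see the same reference.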

There are also the sync::Lazy and unsync::Lazy convenience types to streamline this pattern:

use std::{sync::Mutex, collections::HashMap};
use once_cell_serde::sync::Lazy;

static GLOBAL_DATA: Lazy<Mutex<HashMap<i32, String>>> = Lazy::new(|| {
    let mut m = HashMap::new();
    m.insert(13, "Spica".to_string());
    m.insert(74, "Hoyten".to_string());
    Mutex::new(m)
});

fn main() {
    println!("{:?}", GLOBAL_DATA.lock().unwrap());
}

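Note that the variable that holds Lazy is declared as static, not const. This is important: using const instead compiles, but behaves incorrectly, because every use of a const item creates a fresh Lazy, so the initializer runs again on each access.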
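
The static versus const difference can be observed directly. The sketch below assumes std::sync::OnceLock as a stand-in cell rather than this crate's Lazy, since the mechanism is the same: a const is instantiated anew at every use site, so anything stored in it is lost. The names GOOD, BAD and the two functions are hypothetical.

```rust
use std::sync::OnceLock; // stand-in assumption; Lazy behaves analogously

static GOOD: OnceLock<u32> = OnceLock::new();
// Compiles, but each mention of BAD below is a brand-new, empty cell.
const BAD: OnceLock<u32> = OnceLock::new();

fn static_keeps_value() -> Option<u32> {
    let _ = GOOD.get_or_init(|| 7);
    GOOD.get().copied() // same cell as the line above
}

fn const_loses_value() -> Option<u32> {
    let _ = BAD.get_or_init(|| 7); // initializes a temporary copy
    BAD.get().copied()             // a different temporary, still empty
}
```

Newer compilers or Clippy may warn about the const declaration, which is a useful hint that static was intended.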

General purpose lazy evaluation

Unlike lazy_static!, Lazy works with local variables.

use once_cell_serde::unsync::Lazy;

fn main() {
    let ctx = vec![1, 2, 3];
    let thunk = Lazy::new(|| {
        ctx.iter().sum::<i32>()
    });
    assert_eq!(*thunk, 6);
}
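
To make the relationship between Lazy and OnceCell more concrete, the following is a minimal sketch of how a single-threaded lazy wrapper could be built on top of a once cell. MiniLazy is a hypothetical type written for this illustration, not this crate's implementation, and it assumes std::cell::OnceCell as the underlying cell. The counter shows that the closure runs on first dereference only.

```rust
use std::cell::{Cell, OnceCell};
use std::ops::Deref;

/// Hypothetical minimal lazy value: a cell plus the not-yet-run initializer.
struct MiniLazy<T, F> {
    cell: OnceCell<T>,
    init: Cell<Option<F>>,
}

impl<T, F: FnOnce() -> T> MiniLazy<T, F> {
    fn new(init: F) -> Self {
        MiniLazy { cell: OnceCell::new(), init: Cell::new(Some(init)) }
    }
}

impl<T, F: FnOnce() -> T> Deref for MiniLazy<T, F> {
    type Target = T;
    fn deref(&self) -> &T {
        // The initializer is taken out and consumed on the first access.
        self.cell.get_or_init(|| {
            let init = self.init.take().expect("initializer already consumed");
            init()
        })
    }
}

/// Returns (calls before use, first read, second read, calls after use).
fn sum_lazily() -> (u32, i32, i32, u32) {
    let calls = Cell::new(0u32);
    let ctx = vec![1, 2, 3];
    let thunk = MiniLazy::new(|| {
        calls.set(calls.get() + 1);
        ctx.iter().sum::<i32>()
    });
    let before = calls.get();
    let first = *thunk;
    let second = *thunk;
    (before, first, second, calls.get())
}
```

Because the closure borrows ctx and calls from the enclosing function, this only works with a local value, which is exactly the case a static-only macro cannot cover.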

If you need a lazy field in a struct, you probably should use OnceCell directly, because that will allow you to access self during initialization.

use std::{fs, path::PathBuf};

use once_cell_serde::unsync::OnceCell;

struct Ctx {
    config_path: PathBuf,
    config: OnceCell<String>,
}

impl Ctx {
    pub fn get_config(&self) -> Result<&str, std::io::Error> {
        let cfg = self.config.get_or_try_init(|| {
            fs::read_to_string(&self.config_path)
        })?;
        Ok(cfg.as_str())
    }
}
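
A self-contained variant of the lazy-field idea, without file I/O, might look like the sketch below. It assumes std::cell::OnceCell as a stand-in and uses the infallible get_or_init, because std's fallible counterpart is not stable at the time of writing. Document and word_count are hypothetical names. The point is that the closure can read other fields through self.

```rust
use std::cell::OnceCell; // stand-in assumption for unsync::OnceCell

struct Document {
    text: String,
    // Computed from `text` on first request, then cached.
    word_count: OnceCell<usize>,
}

impl Document {
    fn new(text: &str) -> Self {
        Document { text: text.to_string(), word_count: OnceCell::new() }
    }

    fn word_count(&self) -> usize {
        // The closure borrows `self.text`, which a stored Lazy closure could not do.
        *self.word_count.get_or_init(|| self.text.split_whitespace().count())
    }
}
```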

Lazily Compiled Regex

This is a regex! macro which takes a string literal and returns an expression that evaluates to a &'static Regex:

macro_rules! regex {
    ($re:literal $(,)?) => {{
        static RE: once_cell_serde::sync::OnceCell<regex::Regex> = once_cell_serde::sync::OnceCell::new();
        RE.get_or_init(|| regex::Regex::new($re).unwrap())
    }};
}

This macro can be useful to avoid the "compile regex on every loop iteration" problem.

Runtime include_bytes!

The include_bytes macro is useful to include test resources, but it slows down test compilation a lot. An alternative is to load the resources at runtime:

use std::path::Path;

use once_cell_serde::sync::OnceCell;

pub struct TestResource {
    path: &'static str,
    cell: OnceCell<Vec<u8>>,
}

impl TestResource {
    pub const fn new(path: &'static str) -> TestResource {
        TestResource { path, cell: OnceCell::new() }
    }
    pub fn bytes(&self) -> &[u8] {
        self.cell.get_or_init(|| {
            let dir = std::env::var("CARGO_MANIFEST_DIR").unwrap();
            let path = Path::new(dir.as_str()).join(self.path);
            std::fs::read(&path).unwrap_or_else(|_err| {
                panic!("failed to load test resource: {}", path.display())
            })
        }).as_slice()
    }
}

static TEST_IMAGE: TestResource = TestResource::new("test_data/lena.png");

#[test]
fn test_sobel_filter() {
    let rgb: &[u8] = TEST_IMAGE.bytes();
    // ...
}

lateinit

A LateInit type for delayed initialization. It is reminiscent of Kotlin's lateinit keyword and allows construction of cyclic data structures:

use once_cell_serde::sync::OnceCell;

pub struct LateInit<T> { cell: OnceCell<T> }

impl<T> LateInit<T> {
    pub fn init(&self, value: T) {
        assert!(self.cell.set(value).is_ok())
    }
}

impl<T> Default for LateInit<T> {
    fn default() -> Self { LateInit { cell: OnceCell::default() } }
}

impl<T> std::ops::Deref for LateInit<T> {
    type Target = T;
    fn deref(&self) -> &T {
        self.cell.get().unwrap()
    }
}

#[derive(Default)]
struct A<'a> {
    b: LateInit<&'a B<'a>>,
}

#[derive(Default)]
struct B<'a> {
    a: LateInit<&'a A<'a>>
}


fn build_cycle() {
    let a = A::default();
    let b = B::default();
    a.b.init(&b);
    b.a.init(&a);
    
    let _a = &a.b.a.b.a;
}

Comparison with std

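| !Sync types         | Access Mode        | Drawbacks                                      |
|---------------------|--------------------|------------------------------------------------|
| Cell<T>             | T                  | requires T: Copy for get                       |
| RefCell<T>          | RefMut<T> / Ref<T> | may panic at runtime                           |
| unsync::OnceCell<T> | &T                 | assignable only once                           |

| Sync types          | Access Mode        | Drawbacks                                      |
|---------------------|--------------------|------------------------------------------------|
| AtomicT             | T                  | works only with certain Copy types             |
| Mutex<T>            | MutexGuard<T>      | may deadlock at runtime, may block the thread  |
| sync::OnceCell<T>   | &T                 | assignable only once, may block the thread     |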

Technically, calling get_or_init will also cause a panic or a deadlock if it recursively calls itself. However, because the assignment can happen only once, such cases should be more rare than equivalents with RefCell and Mutex.

Minimum Supported rustc Version

This crate's minimum supported rustc version is 1.56.0.

If only the std feature is enabled, MSRV will be updated conservatively, supporting at least the latest 8 versions of the compiler. When using other features, like parking_lot, MSRV might be updated more frequently, up to the latest stable. In both cases, increasing MSRV is not considered a semver-breaking change.

Implementation details

The implementation is based on the lazy_static and lazy_cell crates and std::sync::Once. In some sense, once_cell just streamlines and unifies those APIs.

To implement the sync flavor of OnceCell, this crate uses either a custom re-implementation of std::sync::Once or parking_lot::Mutex. This is controlled by the parking_lot feature (disabled by default). Performance is the same for both cases, but the parking_lot based OnceCell<T> is smaller by up to 16 bytes.

This crate uses unsafe.

F.A.Q.

Should I use lazy_static or once_cell?

To a first approximation, once_cell is both more flexible and more convenient than lazy_static and should be preferred.

Unlike once_cell, lazy_static supports a spinlock-based implementation of blocking, which works with #![no_std].

lazy_static has received significantly more real world testing, but once_cell is also a widely used crate.

Should I use the sync or unsync flavor?

Because the Rust compiler checks thread safety for you, it's impossible to accidentally use unsync where sync is required. So, use unsync in single-threaded code and sync in multi-threaded code. It's easy to switch between the two if the code becomes multi-threaded later.

At the moment, unsync has an additional benefit that reentrant initialization causes a panic, which might be easier to debug than a deadlock.
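
The reentrancy panic can be observed with a short sketch. It assumes std::cell::OnceCell as a stand-in, whose documentation also specifies a panic for reentrant initialization; the function name is made up. The initializer calls get_or_init on the same cell, and the resulting panic is caught so it can be reported as a boolean.

```rust
use std::cell::OnceCell; // stand-in assumption for unsync::OnceCell
use std::panic;

/// Returns true if initializing a cell from inside its own initializer panics.
fn reentrant_init_panics() -> bool {
    panic::catch_unwind(|| {
        let cell: OnceCell<u32> = OnceCell::new();
        // The outer initializer fills the cell itself, so the outer call
        // finds it already set when it tries to store its own result.
        cell.get_or_init(|| *cell.get_or_init(|| 1) + 1);
    })
    .is_err()
}
```

The panic message is printed to stderr as usual; the same mistake in a thread-safe cell may instead deadlock, as noted above.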

Does this crate support async?

No, but you can use async_once_cell instead.

Can I bring my own mutex?

There is generic_once_cell to allow just that.

Related crates

Most of this crate's functionality is now available in std: OnceCell and OnceLock have been stable since Rust 1.70, and LazyCell and LazyLock since Rust 1.80. See the tracking issue.

Dependencies

~0–4.5MB