#statistics #standard-deviation #stats #data #floating-point

rolling-stats

Rolling statistics calculations (min/max/mean/std_dev) over arbitrary floating point numbers based on Welford's Online Algorithm

10 releases (6 breaking)

0.7.0 Jun 4, 2023
0.5.1 Jan 14, 2023
0.5.0 Nov 10, 2021
0.4.0 Feb 10, 2021
0.1.0 Feb 13, 2019

#160 in Science


1,209 downloads per month
Used in 4 crates (2 directly)

MIT/Apache

22KB
251 lines

Rust-Rolling-Stats

The rolling-stats library offers rolling statistics calculations (minimum, maximum, mean, standard deviation) over arbitrary floating point numbers. It uses Welford's Online Algorithm for these computations. This crate is no_std compatible.

For more information on the algorithm, see the Wikipedia article "Algorithms for calculating variance".
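To make the idea concrete, here is a minimal sketch of Welford's online update in plain Rust. It is an illustration only and not the crate's implementation: the `Welford` type, its fields, and the choice of the sample (n - 1) standard deviation are all assumptions made for this example.

```rust
/// Illustrative running-statistics accumulator (hypothetical type,
/// not the crate's `Stats`).
#[derive(Debug, Clone, Copy)]
pub struct Welford {
    count: u64,
    mean: f64,
    m2: f64, // running sum of squared deviations from the mean
}

impl Welford {
    pub fn new() -> Self {
        Welford { count: 0, mean: 0.0, m2: 0.0 }
    }

    /// Fold one sample into the running mean and M2 in O(1) time,
    /// without storing the samples seen so far.
    pub fn update(&mut self, x: f64) {
        self.count += 1;
        let delta = x - self.mean; // deviation from the old mean
        self.mean += delta / self.count as f64;
        let delta2 = x - self.mean; // deviation from the new mean
        self.m2 += delta * delta2;
    }

    pub fn count(&self) -> u64 {
        self.count
    }

    pub fn mean(&self) -> f64 {
        self.mean
    }

    /// Sample standard deviation (assumed n - 1 denominator);
    /// returns 0.0 when fewer than two samples have been seen.
    pub fn std_dev(&self) -> f64 {
        if self.count < 2 {
            0.0
        } else {
            (self.m2 / (self.count - 1) as f64).sqrt()
        }
    }
}
```

Because each update touches only three numbers, memory use stays constant regardless of how many samples are processed, and the two-step deviation product avoids the cancellation problems of the naive sum-of-squares formula.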

Usage

Single Thread Example

Below is an example of using rolling-stats in a single-threaded context:

use rolling_stats::Stats;
use rand_distr::{Distribution, Normal};
use rand::SeedableRng;

type T = f64;

const MEAN: T = 0.0;
const STD_DEV: T = 1.0;
const NUM_SAMPLES: usize = 10_000;
const SEED: u64 = 42;

let mut stats: Stats<T> = Stats::new();
let mut rng = rand::rngs::StdRng::seed_from_u64(SEED); // Seed the RNG for reproducibility
let normal = Normal::<T>::new(MEAN, STD_DEV).unwrap();

// Generate random data
let random_data: Vec<T> = (0..NUM_SAMPLES).map(|_x| normal.sample(&mut rng)).collect();

// Update the stats one by one
random_data.iter().for_each(|v| stats.update(*v));

// Print the stats
println!("{}", stats);
// Output: (avg: 0.00, std_dev: 1.00, min: -3.53, max: 4.11, count: 10000)

Multi Thread Example

This example shows how to use rolling-stats in a multi-threaded context with the help of the rayon crate. Each chunk of data gets its own Stats object, and the per-chunk results are merged at the end:

use rolling_stats::Stats;
use rand_distr::{Distribution, Normal};
use rand::SeedableRng;
use rayon::prelude::*;

type T = f64;

const MEAN: T = 0.0;
const STD_DEV: T = 1.0;
const NUM_SAMPLES: usize = 500_000;
const SEED: u64 = 42;
const CHUNK_SIZE: usize = 1000;

let mut rng = rand::rngs::StdRng::seed_from_u64(SEED); // Seed the RNG for reproducibility
let normal = Normal::<T>::new(MEAN, STD_DEV).unwrap();

// Generate random data
let random_data: Vec<T> = (0..NUM_SAMPLES).map(|_x| normal.sample(&mut rng)).collect();

// Update the stats in parallel. New stats objects are created for each chunk of data.
let stats: Vec<Stats<T>> = random_data
    .par_chunks(CHUNK_SIZE) // Multi-threaded parallelization via Rayon
    .map(|chunk| {
        let mut s: Stats<T> = Stats::new();
        chunk.iter().for_each(|v| s.update(*v));
        s
    })
    .collect();

// Ensure the data was split into more than one chunk, so there is something to merge
assert!(stats.len() > 1);

// Accumulate the stats using the reduce method
let merged_stats = stats.into_iter().reduce(|acc, s| acc.merge(&s)).unwrap();

// Print the stats
println!("{}", merged_stats);
// Output: (avg: -0.00, std_dev: 1.00, min: -4.53, max: 4.57, count: 500000)
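Merging works because two partial results can be combined exactly, without revisiting the raw samples. The sketch below shows the pairwise combination formula described in the Wikipedia article on variance algorithms (the parallel variant attributed to Chan et al.). The `Partial` type and its methods are hypothetical names for this illustration; whether the crate's `merge` uses exactly this formulation is an assumption.

```rust
/// Illustrative partial result for one chunk of data (hypothetical type).
#[derive(Debug, Clone, Copy)]
pub struct Partial {
    pub count: u64,
    pub mean: f64,
    pub m2: f64, // sum of squared deviations from `mean`
}

impl Partial {
    /// Accumulate a chunk with Welford's online update.
    pub fn from_slice(data: &[f64]) -> Self {
        let mut p = Partial { count: 0, mean: 0.0, m2: 0.0 };
        for &x in data {
            p.count += 1;
            let delta = x - p.mean;
            p.mean += delta / p.count as f64;
            p.m2 += delta * (x - p.mean);
        }
        p
    }

    /// Combine two partial results into the result for the union
    /// of their samples.
    pub fn merge(&self, other: &Self) -> Self {
        if self.count == 0 {
            return *other;
        }
        if other.count == 0 {
            return *self;
        }
        let n_a = self.count as f64;
        let n_b = other.count as f64;
        let n = n_a + n_b;
        let delta = other.mean - self.mean;
        Partial {
            count: self.count + other.count,
            // Weighted shift of the mean toward the other chunk.
            mean: self.mean + delta * n_b / n,
            // Both chunks' spread plus a term for the gap between their means.
            m2: self.m2 + other.m2 + delta * delta * n_a * n_b / n,
        }
    }
}
```

Since the combination is associative up to floating-point rounding, the chunks can be reduced in any grouping, which is what makes the approach a good fit for a parallel reduce.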

Feature Flags

The following feature flags are available:

  • serde: Enables serialization and deserialization of the Stats struct via the serde crate.
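As a sketch, the flag could be enabled in Cargo.toml like this, assuming the 0.7 release listed above is the one you depend on:

```toml
[dependencies]
rolling-stats = { version = "0.7", features = ["serde"] }
```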

License

The rolling-stats library is dual-licensed under the MIT License and the Apache License 2.0. By opening a pull request, you agree to license your contribution under these terms.

Dependencies

~460–710KB
~14K SLoC