#sorting #parallel #sorter

crumsort

Parallel implementation of crumsort optmized for uniform distributions

1 unstable release

0.1.0 Oct 24, 2022

#1377 in Algorithms

44 downloads per month
Used in forma-render

Apache-2.0

51KB
1K SLoC

crumsort-rs

A parallelized Rust port of crumsort.

The goal of this port is to excel at sorting well-distributed data which is why it is not an exact 1:1 replica.

Temporary caveats

There are a few limitations given some of the constraints when this was originally written:

  • sorts uniform data faster than crumsort, but severly skewed distributions slower (missing crumsort_analyze function)
  • intended as a solution for sorting large (u64 or u128) integers
  • only sorts Copy + Default data as a way to limit the use of unsafe
  • missing un-parallelized version (data needs to implement Send)
  • missing *_by and *_by_key sorting alternatives (data needs to implement Ord)

Please feel free to submit improvements in any of these area by submitting a pull request.

Benchmarks against parallel pdqsort (Rayon)

All banchmarks run with the bench example on an M1 Pro:

cargo run --release --example bench

Uniformly distributed random u32s

Length Algorithm Throughput Improvement
212 pdqsort 32.15Mkeys/s 0.00%
212 crumsort 38.70Mkeys/s 20.39%
216 pdqsort 129.96Mkeys/s 0.00%
216 crumsort 176.95Mkeys/s 36.16%
220 pdqsort 226.31Mkeys/s 0.00%
220 crumsort 368.09Mkeys/s 62.65%
224 pdqsort 227.80Mkeys/s 0.00%
224 crumsort 399.89Mkeys/s 75.54%

Uniformly distributed random u64s

Length Algorithm Throughput Improvement
212 pdqsort 33.18Mkeys/s 0.00%
212 crumsort 40.79Mkeys/s 22.91%
216 pdqsort 151.24Mkeys/s 0.00%
216 crumsort 237.48Mkeys/s 57.02%
220 pdqsort 218.64Mkeys/s 0.00%
220 crumsort 364.79Mkeys/s 66.85%
224 pdqsort 226.83Mkeys/s 0.00%
224 crumsort 385.42Mkeys/s 69.92%

Note

This is not an officially supported Google product.

Dependencies

~1.5MB
~25K SLoC