6 releases
0.1.2 | Jul 3, 2019 |
---|---|
0.1.1 | Jun 25, 2018 |
0.0.3 | Jun 19, 2018 |
#1297 in Algorithms
14KB
200 lines
external_sort
Provides the ability to perform external sorts on structs, which allows for rapid sorting of extremely large data streams.
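For readers new to the term, an external sort generally works in two phases: sort chunks that fit in memory and spill each one to disk as a sorted run, then merge the runs. The sketch below illustrates that general technique using only the standard library. It is not the implementation used by this crate; the function names `external_sort_u32` and `parse_line`, the text-based run files, and the fixed `chunk_len` are all assumptions made for illustration.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;
use std::fs::{self, File};
use std::io::{self, BufRead, BufReader, BufWriter, Lines, Write};
use std::path::{Path, PathBuf};

/// Hypothetical helper: sorts `input` while buffering at most `chunk_len`
/// values at a time. Sorted runs are written as text files under `dir`.
fn external_sort_u32(
    input: impl Iterator<Item = u32>,
    chunk_len: usize,
    dir: &Path,
) -> io::Result<Vec<u32>> {
    assert!(chunk_len > 0, "chunk_len must be positive");
    fs::create_dir_all(dir)?;

    // Phase 1: cut the stream into chunks, sort each chunk, spill it as a run.
    let mut input = input.peekable();
    let mut run_paths: Vec<PathBuf> = Vec::new();
    let mut buf: Vec<u32> = Vec::with_capacity(chunk_len);
    while input.peek().is_some() {
        buf.clear();
        buf.extend(input.by_ref().take(chunk_len));
        buf.sort_unstable();
        let path = dir.join(format!("run_{}.txt", run_paths.len()));
        let mut writer = BufWriter::new(File::create(&path)?);
        for value in &buf {
            writeln!(writer, "{}", value)?;
        }
        writer.flush()?;
        run_paths.push(path);
    }

    // Phase 2: k-way merge. The heap holds one pending value per run;
    // `Reverse` turns the max-heap into a min-heap.
    let mut readers: Vec<Lines<BufReader<File>>> = Vec::new();
    for path in &run_paths {
        readers.push(BufReader::new(File::open(path)?).lines());
    }
    let mut heap = BinaryHeap::new();
    for (i, reader) in readers.iter_mut().enumerate() {
        if let Some(line) = reader.next() {
            heap.push(Reverse((parse_line(&line?)?, i)));
        }
    }
    // Collected into a Vec only to keep the sketch short; a real
    // implementation would stream the merged values instead.
    let mut merged = Vec::new();
    while let Some(Reverse((value, i))) = heap.pop() {
        merged.push(value);
        if let Some(line) = readers[i].next() {
            heap.push(Reverse((parse_line(&line?)?, i)));
        }
    }

    // Close the run files before deleting them.
    drop(readers);
    for path in &run_paths {
        fs::remove_file(path)?;
    }
    Ok(merged)
}

/// Hypothetical helper: parses one line of a run file back into a u32.
fn parse_line(line: &str) -> io::Result<u32> {
    line.trim()
        .parse()
        .map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))
}
```

Calling `external_sort_u32` with an iterator, a chunk length, and a scratch directory returns the values in ascending order while only ever sorting one chunk in memory at a time.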
Usage
Add this to your Cargo.toml:
[dependencies]
external_sort = "^0.1.1"
and this to your crate root:
extern crate external_sort;
Examples
The following example uses external_sort to sort a vector of simple structs.
Note that your struct must implement Ord and Clone, as well as the serde Serialize and Deserialize traits. Additionally, in order for external_sort to track its memory buffer usage, your struct must be able to report its size (via external_sort::ExternallySortable).
extern crate external_sort;
#[macro_use]
extern crate serde_derive;

use external_sort::{ExternalSorter, ExternallySortable};

#[derive(Serialize, Deserialize, Clone, PartialEq, Eq, PartialOrd, Ord)]
struct Num {
    the_num: u32,
}

impl Num {
    fn new(num: u32) -> Num {
        Num { the_num: num }
    }
}

impl ExternallySortable for Num {
    fn get_size(&self) -> u64 {
        // A u32 occupies 4 bytes.
        4
    }
}

fn main() {
    let unsorted = vec![
        Num::new(5),
        Num::new(2),
        Num::new(1),
        Num::new(3),
        Num::new(4),
    ];
    let sorted = vec![
        Num::new(1),
        Num::new(2),
        Num::new(3),
        Num::new(4),
        Num::new(5),
    ];

    // 16 is the memory buffer budget, in the same units that get_size() reports (bytes here).
    let external_sorter = ExternalSorter::new(16, None);
    let iter = external_sorter.sort(unsorted.into_iter()).unwrap();

    // Each item yielded by the iterator is a Result, hence the unwrap().
    for (idx, i) in iter.enumerate() {
        assert_eq!(i.unwrap().the_num, sorted[idx].the_num);
    }
}
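For structs that own heap data, a size report could add the heap bytes to the fixed size of the struct. The sketch below shows one way to do that. To compile without the crate it uses a hypothetical local trait, `ReportsSize`, that mirrors only the `get_size` method shown above; in real use you would implement external_sort::ExternallySortable instead. The `Record` struct is also an assumption, as is the idea that the reported size should approximate in-memory bytes.

```rust
use std::mem::size_of;

// Hypothetical stand-in for external_sort::ExternallySortable, so that this
// sketch compiles on its own. Only the get_size method is mirrored.
trait ReportsSize {
    fn get_size(&self) -> u64;
}

// Hypothetical struct with two heap-allocated fields.
struct Record {
    name: String,
    payload: Vec<u8>,
}

impl ReportsSize for Record {
    fn get_size(&self) -> u64 {
        // Fixed part (the struct itself) plus the heap bytes currently in use.
        // This is an approximation: spare capacity is not counted.
        (size_of::<Record>() + self.name.len() + self.payload.len()) as u64
    }
}
```

An estimate like this does not need to be exact; it only needs to be consistent enough for the memory budget to be meaningful.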
If your struct is unable to report its size, simply return 1 from get_size(), and then pass the number of objects (rather than bytes) that the ExternalSorter should keep in memory when calling ExternalSorter::new().
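To see why returning 1 turns the budget into an object count, it may help to look at how a size budget could be consumed. The sketch below is an illustration only and not the crate's internals: `ReportsSize` is a hypothetical stand-in for external_sort::ExternallySortable, and `fill_buffers` assumes a buffer is closed as soon as the reported sizes reach the budget.

```rust
// Hypothetical stand-in for external_sort::ExternallySortable.
trait ReportsSize {
    fn get_size(&self) -> u64;
}

// Reports its real size in bytes (4 for a u32).
struct ByBytes(u32);
impl ReportsSize for ByBytes {
    fn get_size(&self) -> u64 {
        std::mem::size_of_val(&self.0) as u64
    }
}

// Cannot easily report a size, so every object counts as 1.
#[allow(dead_code)]
struct ByCount(String);
impl ReportsSize for ByCount {
    fn get_size(&self) -> u64 {
        1
    }
}

/// Hypothetical helper: groups items into buffers, closing a buffer once the
/// sum of reported sizes reaches `budget` (an assumed flush policy).
fn fill_buffers<T: ReportsSize>(items: Vec<T>, budget: u64) -> Vec<Vec<T>> {
    let mut buffers = Vec::new();
    let mut current = Vec::new();
    let mut used = 0u64;
    for item in items {
        used += item.get_size();
        current.push(item);
        if used >= budget {
            buffers.push(std::mem::take(&mut current));
            used = 0;
        }
    }
    if !current.is_empty() {
        buffers.push(current);
    }
    buffers
}
```

Under these assumptions, five `ByBytes` values with a budget of 16 split into buffers of 4 and 1 items, while five `ByCount` values with a budget of 3 split into buffers of 3 and 2 items: the same accounting, with the budget read as bytes in the first case and as objects in the second.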
Dependencies
~1.1–2.2MB
~42K SLoC