#suffix-array #suffix #parallel #saca

bin+lib psacak

The pSACAK suffix sorting algorithm for huge in-memory data on multicore machines

1 unstable release

0.1.0 Apr 22, 2020

#5 in #saca

MIT license

110KB
3K SLoC

pSACAK

crates docs

Implementation of the pSACAK suffix sorting algorithm for huge in-memory data on multicore machines, as described in this paper.


lib.rs:

The pSACAK suffix sorting algorithm for huge in-memory data on multicore machines.

The algorithm is described in this paper.

Examples

Create new suffix array.

use psacak::psacak;

let text = b"mississippi";
let suf = psacak(text);

assert_eq!(suf, vec![10, 7, 4, 1, 0, 9, 8, 6, 3, 5, 2]);

Compute the suffix array in a slice.

use psacak::psacak_inplace;

let text = b"mississippi";
let mut suf = vec![0; text.len()];
psacak_inplace(text, &mut suf[..]);

assert_eq!(suf, vec![10, 7, 4, 1, 0, 9, 8, 6, 3, 5, 2]);

Note that the position of sentinel is not presented in the suffix array calculated by pSACAK.

To obtain a suffix array which included the position of sentinel, you can try this:

use psacak::psacak_inplace;

let text = b"mississippi";
let mut suf = vec![0; text.len() + 1];
suf[0] = text.len() as u32;
psacak_inplace(text, &mut suf[1..]);

assert_eq!(suf, vec![11, 10, 7, 4, 1, 0, 9, 8, 6, 3, 5, 2]);

Dependencies

~1.5MB
~31K SLoC