#search-algorithms #text-search #string-search #search #simd #string #text

sliceslice

A fast implementation of single-pattern substring search using SIMD acceleration

8 releases

0.4.3 Jun 29, 2024
0.4.2 Jan 30, 2024
0.4.1 Apr 27, 2022
0.3.1 May 18, 2021
0.1.0 Jul 22, 2020

#69 in Hardware support

Download history 4943/week @ 2024-08-22 5076/week @ 2024-08-29 4755/week @ 2024-09-05 4680/week @ 2024-09-12 4752/week @ 2024-09-19 4190/week @ 2024-09-26 4682/week @ 2024-10-03 5582/week @ 2024-10-10 5518/week @ 2024-10-17 5785/week @ 2024-10-24 5664/week @ 2024-10-31 6688/week @ 2024-11-07 6734/week @ 2024-11-14 6773/week @ 2024-11-21 5303/week @ 2024-11-28 4320/week @ 2024-12-05

24,198 downloads per month

MIT license

255KB
1.5K SLoC

sliceslice

Actions Crate Docs License

A fast implementation of single-pattern substring search using SIMD acceleration, based on the work presented by Wojciech Muła. For a fast multi-pattern substring search algorithm, see instead the aho-corasick crate.

Example

use sliceslice::x86::DynamicAvx2Searcher;

fn main() {
    let searcher = unsafe { DynamicAvx2Searcher::new(b"ipsum".to_owned().into()) };

    assert!(unsafe {
        searcher.search_in(b"Lorem ipsum dolor sit amet, consectetur adipiscing elit")
    });

    assert!(!unsafe {
        searcher.search_in(b"foo bar baz qux quux quuz corge grault garply waldo fred")
    });
}

Benchmarks

We ran the i386 benchmarks on an HP EliteDesk 800 G2 Tower PC with an Intel Core i7-6700 Processor @ 3.40GHz, 16GB of RAM and 512GB of disk space, running Ubuntu 20.04.1 LTS, gcc 9.3.0 and Rust 1.46.0.

Library Version Function Short haystack Long haystack
std 1.46.0 String::find [335.32 ms 335.56 ms 335.83 ms] [344.62 ms 345.01 ms 345.52 ms]
memmem 0.1.1 TwoWaySearcher::search_in [87.927 ms 88.029 ms 88.151 ms] [401.40 ms 401.59 ms 401.81 ms]
twoway 0.2.1 find_bytes [274.60 ms 274.82 ms 275.07 ms] [146.32 ms 146.44 ms 146.58 ms]
sse4-strstr¹ 0.0.0-9308a59 avx2_strstr_v2 [75.389 ms 75.515 ms 75.682 ms] [38.521 ms 38.579 ms 38.649 ms]
sliceslice 0.2.0 DynamicAvx2Searcher::search_in [79.283 ms 79.416 ms 79.596 ms] [35.135 ms 35.181 ms 35.247 ms]

¹ sse4-strstr is not memory safe as it can read beyond the end of the haystack, please see the documentation for sliceslice where this is discussed in more detail.

Benchmarks results column chart

Licensing

Licensed under the MIT license. See the LICENSE file for details.

Dependencies