#string #find #substring-search #pcmpestri #memmem

no-std twoway

Fast substring search for strings and byte strings. Optional SSE4.2 acceleration (if detected at runtime) using pcmpestri. Memchr is the only mandatory dependency. The two-way algorithm is also used by Rust's libstd itself, but here it is additionally exposed for byte strings, uses memchr, and optionally uses an SSE4.2 accelerated version.

11 releases

0.2.1 Oct 9, 2019
0.2.0 Nov 19, 2018
0.1.8 Mar 19, 2018
0.1.7 Nov 18, 2017
0.1.1 Dec 9, 2015

#12 in Algorithms

Download history 3816/week @ 2019-06-26 1785/week @ 2019-07-03 2419/week @ 2019-07-10 3876/week @ 2019-07-17 4213/week @ 2019-07-24 4251/week @ 2019-07-31 5492/week @ 2019-08-07 5166/week @ 2019-08-14 4795/week @ 2019-08-21 4799/week @ 2019-08-28 5473/week @ 2019-09-04 6022/week @ 2019-09-11 6528/week @ 2019-09-18 8611/week @ 2019-09-25 8211/week @ 2019-10-02

23,581 downloads per month
Used in 80 crates (11 directly)

MIT/Apache

80KB
1.5K SLoC

This is my substring search workspace.

Please read the API documentation on docs.rs.

Documentation

Fast substring search for strings and byte strings, using the two-way algorithm.

This is the same code as is included in Rust's libstd to “power” str::find(&str), but here it is exposed with some improvements:

  • Available for byte string searches using &[u8]
  • Having an optional SSE4.2 accelerated version (if detected at runtime) which is even faster.
  • Using memchr for the single byte case, which is ultra fast.

  • twoway::find_bytes(text: &[u8], pattern: &[u8]) -> Option<usize>
  • twoway::rfind_bytes(text: &[u8], pattern: &[u8]) -> Option<usize>
  • twoway::find_str(text: &str, pattern: &str) -> Option<usize>
  • twoway::rfind_str(text: &str, pattern: &str) -> Option<usize>
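
To make the return values of the four functions listed above concrete, here is a minimal sketch of their semantics using only the standard library. These are naive stand-ins with the same signatures, not the crate's implementation: they scan every window instead of using the two-way algorithm, memchr, or SSE4.2. The handling of an empty pattern is an assumption modelled on str::find and str::rfind.

```rust
// Naive stand-ins mirroring the signatures listed above.
// NOT the crate's code: no two-way algorithm, no memchr, no SSE4.2.
// They only illustrate the result: the byte offset of a match, or None.

/// Byte offset of the first occurrence of `pattern` in `text`.
pub fn find_bytes(text: &[u8], pattern: &[u8]) -> Option<usize> {
    if pattern.is_empty() {
        // Assumption: an empty pattern matches at offset 0, like `str::find("")`.
        return Some(0);
    }
    // `windows` yields nothing when the pattern is longer than the text.
    text.windows(pattern.len()).position(|w| w == pattern)
}

/// Byte offset of the last occurrence of `pattern` in `text`.
pub fn rfind_bytes(text: &[u8], pattern: &[u8]) -> Option<usize> {
    if pattern.is_empty() {
        // Assumption: an empty pattern matches at the end, like `str::rfind("")`.
        return Some(text.len());
    }
    text.windows(pattern.len()).rposition(|w| w == pattern)
}

/// String variant: the result is a byte offset into `text`, not a char index.
pub fn find_str(text: &str, pattern: &str) -> Option<usize> {
    find_bytes(text.as_bytes(), pattern.as_bytes())
}

/// String variant of the reverse search.
pub fn rfind_str(text: &str, pattern: &str) -> Option<usize> {
    rfind_bytes(text.as_bytes(), pattern.as_bytes())
}
```

With these stand-ins, find_bytes(b"hello world", b"o") returns Some(4) and rfind_bytes(b"hello world", b"o") returns Some(7). The string variants return the same offsets as str::find and str::rfind on the same inputs.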

Recent Changes

  • 0.2.1
    • Update dev-deps
  • 0.2.0
    • Use std::arch and transparently support SSE4.2 when possible (x86 and x86-64 only) to enable an accelerated implementation of the algorithm. Forward search only. By @RReverser and @bluss
    • Fix a bug in the SSE4.2 algorithm that made it much slower than it should have been, so performance increases as well.
    • Requires Rust 1.27
  • 0.1.8
    • Tweak crate keywords by @tari
    • Only testing and benchmarking changes otherwise (no changes to the crate itself)
  • 0.1.7
    • The crate is optionally no_std. Regular and pcmp both support this mode.
  • 0.1.6
    • The hidden and internal test module set, technically pub, was removed from standard compilation.
  • 0.1.5
    • Update from an odds dependency to using unchecked-index instead (only used by the pcmp feature).
    • The hidden and internal test module tw, technically pub, was removed from standard compilation.
  • 0.1.4
    • Update memchr dependency to 2.0
  • 0.1.3
    • Link to docs.rs docs
    • Drop pcmp's itertools dependency
    • Update nightly code for recent changes
  • 0.1.2
    • Internal improvements to the pcmp module.
  • 0.1.1
    • Add rfind_bytes, rfind_str
  • 0.1.0
    • Initial release
    • Add find_bytes, find_str
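
The 0.2.0 entry above mentions std::arch and transparent SSE4.2 support on x86 and x86-64. The following is a minimal sketch of what runtime dispatch of that kind can look like, using the standard library's is_x86_feature_detected! macro. The names sse42_available, find_portable, find_accelerated and find_with_dispatch are hypothetical, and both search paths are the same naive scan standing in for real code; none of this is taken from the crate's source.

```rust
// Hypothetical sketch of runtime CPU feature dispatch; not the crate's source.

/// True if the running CPU reports SSE4.2 (x86 and x86-64 only).
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
fn sse42_available() -> bool {
    // Standard library macro: checks the CPU at runtime.
    is_x86_feature_detected!("sse4.2")
}

/// On every other architecture the accelerated path is never taken.
#[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
fn sse42_available() -> bool {
    false
}

/// Stand-in for the portable search (a naive scan, for illustration only).
fn find_portable(text: &[u8], pattern: &[u8]) -> Option<usize> {
    if pattern.is_empty() {
        return Some(0); // assumption: empty pattern matches at offset 0
    }
    text.windows(pattern.len()).position(|w| w == pattern)
}

/// Stand-in for an SSE4.2 path. A real one would use pcmpestri intrinsics;
/// here it simply reuses the portable scan so the sketch stays runnable.
fn find_accelerated(text: &[u8], pattern: &[u8]) -> Option<usize> {
    find_portable(text, pattern)
}

/// Picks a path at runtime; both paths must return the same result.
pub fn find_with_dispatch(text: &[u8], pattern: &[u8]) -> Option<usize> {
    if sse42_available() {
        find_accelerated(text, pattern)
    } else {
        find_portable(text, pattern)
    }
}
```

The point of this design is that callers see a single function and the same results on every machine; only the speed differs depending on what the CPU supports.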

License

MIT / APACHE-2.0

Notes

Consider denying 0/n factorizations, see http://lists.gnu.org/archive/html/bug-gnulib/2010-06/msg00184.html

Dependencies

~135KB