9 stable releases
| Version | Released |
|---|---|
| 2.3.7 | Feb 19, 2026 |
| 2.3.6 | Jun 1, 2025 |
| 2.3.5 | Apr 15, 2025 |
| 2.3.0 | |
| 0.2.0 | |
#613 in Text processing
73 downloads per month
34KB
466 lines
whitespace-sifter
use whitespace_sifter::WhitespaceSifter;

// This prints `1.. 2.. 3.. 4.. 5..`.
println!(
    "{}",
    "1.. \n2.. \n\r\n\n3.. \n\n\n4.. \n\n\r\n\n\n5.. \n\n\n\n\n".sift(),
);

// This prints `1..\n2..\n3..\n4..\r\n5..`.
println!(
    "{}",
    "1.. \n2.. \n\r\n3.. \n\n\n4.. \r\n\n\r\n\n5.. \n\n\n\n\n"
        .sift_preserve_newlines(),
);
✨ Sift Duplicate Whitespaces In One Function Call
This crate removes duplicate whitespace within a UTF-8 encoded string.
It also trims the whitespace at the start and end of the string.
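To make the idea concrete, here is a minimal std-only sketch of whitespace collapsing. It is not the crate's implementation; the helper name `collapse_whitespace` is hypothetical, and it assumes that each run of ASCII whitespace is reduced to its first character and that both ends are trimmed.

```rust
/// Hypothetical helper for illustration only (not the crate's code):
/// keeps the first character of every ASCII whitespace run and trims both ends.
fn collapse_whitespace(input: &str) -> String {
    let trimmed = input.trim_matches(|c: char| c.is_ascii_whitespace());
    let mut out = String::with_capacity(trimmed.len());
    let mut in_run = false;
    for c in trimmed.chars() {
        if c.is_ascii_whitespace() {
            // Only the first whitespace character of a run is kept.
            if !in_run {
                out.push(c);
            }
            in_run = true;
        } else {
            out.push(c);
            in_run = false;
        }
    }
    out
}

fn main() {
    // Prints `Hello there!`.
    println!("{}", collapse_whitespace("  Hello \n\n there!\t\t "));
}
```

In real code, prefer the crate's own `sift()`, which also handles CR-LF pairs.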
📈 Crate Comparison
| Crate | Implementation |
|---|---|
| whitespace-sifter | Any AsRef<str> as input, CR-LF compatibility, preserve_newlines |
| collapse | &str input only |
| fast_whitespace_collapse | &str input only, SIMD with fallback for any unsupported rustc target |
| Crate | Whitespace Dictionary | Time | Complete |
|---|---|---|---|
| whitespace-sifter | '\t' \| '\n' \| '\x0C' \| '\r' \| ' ' \| "\r\n" | ~170 µs | ✅ |
| collapse | ' ' \| '\x09'..='\x0d' \| unicode::White_Space(c) | ~270 µs | ✅ |
| fast_whitespace_collapse | ' ' \| '\t' | ~160 µs | ❌ |
Disclaimers:
- I do not know the crate maintainers, nor have I asked for permission to include their crates here.
- As far as I know, there are only three crates dedicated to whitespace sifting/collapse.
- fast_whitespace_collapse was not able to collapse CR-LF and line feeds.
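The "Any AsRef<str> as input" entry in the comparison can be illustrated with a blanket trait implementation. This is only a sketch of the general technique; the trait name `Collapse` is hypothetical and belongs to none of the crates above, and the body uses the standard library's Unicode-aware `split_whitespace` instead of any crate's algorithm.

```rust
/// Hypothetical extension trait, named `Collapse` purely for illustration.
trait Collapse {
    fn collapse(&self) -> String;
}

// Blanket implementation: every type that can be viewed as a `&str`
// (`&str`, `String`, `Box<str>`, ...) gets the method.
impl<T: AsRef<str>> Collapse for T {
    fn collapse(&self) -> String {
        let text: &str = self.as_ref();
        // Joins the non-whitespace pieces with single spaces.
        text.split_whitespace().collect::<Vec<_>>().join(" ")
    }
}

fn main() {
    let owned: String = String::from("c \t d");
    let boxed: Box<str> = "e\n\nf".into();
    println!("{}", "a  b".collapse()); // prints `a b`
    println!("{}", owned.collapse()); // prints `c d`
    println!("{}", boxed.collapse()); // prints `e f`
}
```

A function that accepts only `&str` would need an explicit `.as_str()` or `&*` at some of these call sites.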
⚡️Benchmarks
Performance is a priority; most updates are performance improvements.
The benchmark uses a transcript of the Bee Movie.
Execute these commands to benchmark:
$ git clone https://github.com/JumperBot/whitespace-sifter.git
$ cd whitespace-sifter/bench
$ cargo bench
You should only look for results that look like the following:
Sift/Sift time: [173.00 µs 173.35 µs 173.80 µs]
Sift Preserved/Sift Preserved
time: [185.58 µs 186.11 µs 186.64 µs]
In under 0.0002 seconds; pretty impressive, no?
Go try it on a better machine, I guess.
Benchmark specifications:
- Processor: Intel(R) Core(TM) i5-8365U CPU @ 1.60GHz 1.90 GHz
- Memory: RAM 16.0 GB (15.8 GB usable)
- System: GNU/Linux 6.6.87.2-microsoft-standard-WSL2 x86_64
- Modified: v2.3.7
➕ Dependency
Add this to your project with:
$ cargo add whitespace-sifter
📦️ Installation
Install the binary with:
$ cargo install whitespace-sifter
Use it as usual:
$ echo "Hello there!" | whitespace-sifter
$ cat document.txt | whitespace-sifter --preserve-newlines
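For readers curious how a pipe-friendly filter of this kind can be structured, here is a rough std-only sketch. It is not the source of the crate's binary; `sift_stream` is a hypothetical name, and unlike the crate it simply replaces every run of ASCII whitespace with one space.

```rust
use std::io::{self, Read, Write};

/// Hypothetical filter for illustration (not the crate's binary): reads all
/// of `input`, collapses ASCII whitespace runs into single spaces, and
/// writes the result followed by a newline.
fn sift_stream(mut input: impl Read, mut output: impl Write) -> io::Result<()> {
    let mut text = String::new();
    input.read_to_string(&mut text)?;
    let collapsed = text.split_ascii_whitespace().collect::<Vec<_>>().join(" ");
    writeln!(output, "{collapsed}")
}

fn main() {
    // An in-memory byte slice stands in for piped input here; a real filter
    // would pass `io::stdin().lock()` as the first argument instead.
    sift_stream("Hello    there!\n".as_bytes(), io::stdout().lock())
        .expect("writing to stdout failed");
}
```

Taking `impl Read` and `impl Write` keeps the function easy to exercise with in-memory buffers.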
🔊 Changelog
- Improved Performance
- Minimum Supported Rust Version set to v1.79.0 (starting v2.3.3)
- Crate binary (starting v2.3.6)
- Stricter Tests (starting v2.3.2)
  - Proper UTF-8/Unicode Encoding
  - Regular Sifting
  - Sifting With Leading Whitespaces
  - Documentation Assertion
  - MSRV Verification
  - Compliance Check for Old Versions
- Crate Comparison (starting v2.3.4)
- Benchmark Separation (starting v2.3.5)
📄 Licensing
whitespace-sifter is licensed under the MIT License; a summary of the license is also available.
Dependencies
~1–1.5MB
~27K SLoC