#base64 #simd #codec #decoding #avx2 #simd-accelerated #instructions

bin+lib bs64

SIMD-accelerated Base64 encoding and decoding library

3 releases

0.1.2 Oct 16, 2023
0.1.1 Oct 15, 2023
0.1.0 Oct 15, 2023

#1035 in Encoding

323 downloads per month
Used in 3 crates

MIT/Apache

49KB
747 lines

🚀 Base 64

✨ SIMD-accelerated Base64 for Rust ✨

🌟 Features

  • 💡 Uses AVX2 instructions for super-fast encoding and decoding
  • 🔄 Falls back to an implementation that uses any other available SIMD when AVX2 is unavailable

🎯 Project goals

  • 🔧 Simple, idiomatic API
  • 📦 Sensible defaults
  • ⚡ Fast

Installation

cargo add bs64

Usage

fn main() {
    // Encode a byte slice into a Base64 `String`
    let input = vec![2, 3, 4, 5];
    let output: String = bs64::encode(&input);

    // Decode the Base64 text back into bytes
    let decoded_output = bs64::decode(output.as_bytes());
}

Benchmarks

Run with 100k inputs and 10,000 iterations on an Intel® Core™ i7-1065G7. Results are compared against the base64 and data-encoding crates.

cargo run --features "cli" --release -- -b 100000 -i 10000

Encode

name                     MB/s
🚀 bs64::encode()        4813.70
🚀 bs64::encode_mut()    6579.17
🚀 bs64 fallback         944.18
data_encoding            858.51
data_encoding mut        873.28
base64                   748.02
base64 mut               870.99

Decode

name                     MB/s
🚀 bs64::decode()        3899.26
🚀 bs64::decode_mut()    3965.25
🚀 bs64 fallback         837.17
data_encoding            647.33
data_encoding mut        684.01
base64                   761.68
base64 mut               805.60

Implementation Details

Code was initially ported from https://github.com/lemire/fastbase64

The simple fallback implementation is based on the Chromium implementation from the fastbase64 repo. The Rust implementation chunks the input and processes it with iterators, which makes it easy for the compiler to vectorise the processing.
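To illustrate the chunk-and-iterate style described above, here is a minimal sketch of a scalar encoder. It is not the bs64 source: `encode_chunked` is a hypothetical name, and the sketch assumes the standard RFC 4648 alphabet with `=` padding. Fixed-size chunks let the compiler see that every index is in bounds, which is what tends to help auto-vectorisation.

```rust
/// Standard Base64 alphabet (RFC 4648); assumed here, not taken from bs64.
const ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Hypothetical scalar encoder written in a chunked, iterator-based style.
fn encode_chunked(input: &[u8]) -> String {
    // Pre-fill with '=' so any padding is already in place.
    let mut out = vec![b'='; (input.len() + 2) / 3 * 4];

    let in_chunks = input.chunks_exact(3);
    let tail = in_chunks.remainder();

    // Each 3-byte input chunk maps to exactly one 4-byte output chunk.
    for (src, dst) in in_chunks.zip(out.chunks_exact_mut(4)) {
        let n = ((src[0] as u32) << 16) | ((src[1] as u32) << 8) | (src[2] as u32);
        dst[0] = ALPHABET[((n >> 18) & 63) as usize];
        dst[1] = ALPHABET[((n >> 12) & 63) as usize];
        dst[2] = ALPHABET[((n >> 6) & 63) as usize];
        dst[3] = ALPHABET[(n & 63) as usize];
    }

    // Handle the 1 or 2 leftover bytes; the remaining slots stay as '='.
    if !tail.is_empty() {
        let start = input.len() / 3 * 4;
        let b0 = tail[0] as u32;
        let b1 = tail.get(1).copied().unwrap_or(0) as u32;
        let n = (b0 << 16) | (b1 << 8);
        out[start] = ALPHABET[((n >> 18) & 63) as usize];
        out[start + 1] = ALPHABET[((n >> 12) & 63) as usize];
        if tail.len() == 2 {
            out[start + 2] = ALPHABET[((n >> 6) & 63) as usize];
        }
    }

    String::from_utf8(out).expect("Base64 output is always ASCII")
}
```

With this sketch, `encode_chunked(&[2, 3, 4, 5])` returns `AgMEBQ==`.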

The AVX2 implementation is largely untouched compared with the original fastbase64 implementation.
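This README does not say how the crate switches between the AVX2 path and the fallback. One common pattern in Rust is run-time detection with the standard library's `is_x86_feature_detected!` macro. The sketch below assumes that pattern; `select_backend` is a hypothetical name and not part of the bs64 API.

```rust
/// Hypothetical helper: reports which code path would be chosen on this CPU.
/// A real dispatcher would call the matching encode/decode routine instead
/// of returning a label.
fn select_backend() -> &'static str {
    // The detection macro only exists on x86 targets, so gate it with cfg.
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if std::is_x86_feature_detected!("avx2") {
            return "avx2";
        }
    }
    // Every other target, and x86 CPUs without AVX2, take the fallback path.
    "fallback"
}
```

The check is cheap because the standard library caches the detection result, so a dispatcher of this shape can run on every call.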

The code is optimised for x86_64 and therefore assumes that reasonably large caches are available for storing lookup tables. I created a naive implementation that indexed a static array of valid base64 chars; its performance was only slightly worse than the Chromium LUT implementation, so I may add it as an option for low-memory targets (e.g. embedded).
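The trade-off between the two approaches can be sketched as follows. This is an illustration only: the table layout (`E0`, `E1`) and the function names are assumptions in the general style of LUT-based encoders, not code copied from fastbase64 or bs64. The naive version needs only the 64-byte alphabet, while the LUT version spends 512 bytes on precomputed tables to avoid some shifts and masks.

```rust
/// Standard Base64 alphabet (RFC 4648); 64 bytes of static data.
const ALPHABET: [u8; 64] =
    *b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Table indexed by a whole input byte; yields the char for its top 6 bits.
const fn build_e0() -> [u8; 256] {
    let mut table = [0u8; 256];
    let mut i = 0;
    while i < 256 {
        table[i] = ALPHABET[i >> 2];
        i += 1;
    }
    table
}

/// Table indexed by any byte; yields the char for its low 6 bits.
const fn build_e1() -> [u8; 256] {
    let mut table = [0u8; 256];
    let mut i = 0;
    while i < 256 {
        table[i] = ALPHABET[i & 0x3f];
        i += 1;
    }
    table
}

// Both tables are computed at compile time (2 x 256 bytes).
static E0: [u8; 256] = build_e0();
static E1: [u8; 256] = build_e1();

/// LUT variant: the first and last chars need no shifting or masking.
fn encode3_lut(b: [u8; 3]) -> [u8; 4] {
    [
        E0[b[0] as usize],
        E1[(((b[0] & 0x03) << 4) | (b[1] >> 4)) as usize],
        E1[(((b[1] & 0x0f) << 2) | (b[2] >> 6)) as usize],
        E1[b[2] as usize],
    ]
}

/// Naive variant: indexes the 64-byte alphabet directly.
fn encode3_naive(b: [u8; 3]) -> [u8; 4] {
    [
        ALPHABET[(b[0] >> 2) as usize],
        ALPHABET[(((b[0] & 0x03) << 4) | (b[1] >> 4)) as usize],
        ALPHABET[(((b[1] & 0x0f) << 2) | (b[2] >> 6)) as usize],
        ALPHABET[(b[2] & 0x3f) as usize],
    ]
}
```

Both functions produce the same four characters for any 3-byte group, so on a low-memory target the naive variant could be swapped in without changing the output.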

TODO

  • Integration tests
  • Benchmarking suite
  • Comply with MIME, UTF-7, and other Base64 standards
  • Regression tests + benchmark in GitHub Actions
  • Change default implementation with feature flags
  • Builders for custom configs at runtime

Dependencies

~0.3–1.1MB
~23K SLoC