#twitter #binary-data #data-encoding #character-encoding #tweet #encode #platform


base2048 encoding for efficient encoding of binary data on twitter

6 releases (2 stable)

2.0.2 Oct 3, 2021
1.0.0 Oct 2, 2021
0.2.1 Jul 28, 2020
0.1.1 Jul 28, 2020

#1981 in Encoding

0BSD license

156 lines

rust-base2048   crates_badge actions_badge docs_badge

base2048 encoding for packing data into tweets!

This is an experimental module for encoding chunks of 11 bits into a single character as counted by twitter. This allows you to encode 385 (280 * 11 / 8) bytes in tweet.

The main things this crate put effort into getting right:

  1. The characters display on most platforms.
  2. No right-to-left characters are used.
  3. No weird punctuation characters are included.

See base2048.txt for the ordered list of characters.


base2048 = "2"


// these 189 will never fit in a tweet encoded as hex
let bytes = hex_literal::hex!("0100000001574981a3fb74e6632493fcab62947b07a6c228c2b9d840893ff1e7c4f143723c010000006a47304402201f2fc511e390f5dcecf5f0fcb627faff9c0acec671bf372c49e30b43cab048ff02200a10eefea2f2c7b1c5a1603b73dc4d3175b9a416db0acfedf9bf443c0be219c90121031132f6c2139c199a18bfe1fb7f7eb5d1daaf8d4d2e03bf11e833a13e62268fb5ffffffff01eda54e020000000017a914582e495bd15671cc7344ff54104a4d3e6468fff08700000000");

// but with base2048 you can fit it - twice!
let encoded = base2048::encode(&bytes[..]);
assert_eq!(encoded, "ÅØØÒԾആ১ԍཪƉǧႵషϡဏၾഓπ௫ఇĄ૪൦Ⴏ၍ƜসÍص୷སΰňþҕЙၑήಟဖ௴ͿӻआइԚџവফඤѕળशĹсϗႫॳšķ۹ঙјఓȨёՑʮǴരౡଣౙഗ૩ໜໝŇऔཀඨΑΟɈઉທΣઠගऽइƽ೪ჁಔດևЫѱʟॺଅԻͳʼnଢӸ྾྾ვʭମԙउØØĞଥওȲϦၺழƦʍşဃłФЍ൰১ႭƠØØØ");
assert_eq!(base2048::decode(&encoded), Some(bytes.to_vec()));

Previous Work

This was inspired by the javascript module base2048 but they are not compatible. The main difference is the character list is more curated here to display properly on each platform.

No runtime deps