#chinese #convert #hanzi #traditional #localization #simplified

fast2s

A fast Traditional Chinese to Simplified Chinese conversion library. Built with FST, it is faster than most other libraries.

5 unstable releases

0.3.1 Jun 7, 2022
0.3.0 Nov 3, 2021
0.2.1 Oct 31, 2021
0.2.0 Oct 31, 2021
0.1.0 Oct 31, 2021

#1483 in Text processing

Download history (weekly, 2024-07-20 to 2024-11-02): between 208 and 759 downloads per week.

1,559 downloads per month
Used in 6 crates (4 directly)

MIT license

26KB
223 lines

fast2s

A super-fast tool for converting Traditional Chinese to Simplified Chinese.

Uses FST to build the translation state machine.
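
The table-driven idea behind such a converter can be sketched with nothing but the standard library. The sketch below is an illustration only and is not the crate's implementation: it swaps in a plain std HashMap with greedy longest-match lookup, and the names convert_with_table, demo_table, and max_key_chars are hypothetical. The small table is derived from the usage example in this README; the crate's real dictionary (t2s.txt) is far larger.

```rust
use std::collections::HashMap;

/// Hypothetical demo table: phrase entries take priority over
/// single-character entries because lookup tries longer keys first.
fn demo_table() -> HashMap<&'static str, &'static str> {
    HashMap::from([
        ("企畫", "企划"),
        ("計畫", "计划"),
        ("畫", "画"),
        ("計", "计"),
        ("劃", "划"),
        ("圖", "图"),
    ])
}

/// Hypothetical helper: converts `input` by greedy longest-match lookup.
/// `max_key_chars` is the length, in characters, of the longest key in `table`.
fn convert_with_table(
    input: &str,
    table: &HashMap<&str, &str>,
    max_key_chars: usize,
) -> String {
    // Byte offset and value of every character, so we can slice on char boundaries.
    let chars: Vec<(usize, char)> = input.char_indices().collect();
    let mut out = String::with_capacity(input.len());
    let mut i = 0;
    while i < chars.len() {
        let start = chars[i].0;
        let longest = max_key_chars.min(chars.len() - i);
        let mut matched = false;
        // Try the longest candidate first, then progressively shorter ones.
        for n in (1..=longest).rev() {
            let end = if i + n < chars.len() { chars[i + n].0 } else { input.len() };
            if let Some(rep) = table.get(&input[start..end]) {
                out.push_str(rep);
                i += n;
                matched = true;
                break;
            }
        }
        if !matched {
            // No entry starts here: copy the character through unchanged.
            out.push(chars[i].1);
            i += 1;
        }
    }
    out
}
```

Calling convert_with_table("畫圖", &demo_table(), 2) walks the input once and replaces each character through the table. A finite-state structure avoids the repeated slicing and hashing this sketch performs at every position, which is one reason a state-machine design can be faster.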

Usage:

let t = "企畫 計畫 企劃 計劃 畫圖 畫畫";
let s = fast2s::convert(t);
assert_eq!(&s, "企划 计划 企划 计划 画图 画画");

Benchmark

See simple.rs under the benches directory. I compared the results with opencc-rust, simplet2s-rs, and character_converter. Because character_converter is so slow, I had to reduce the sample size to 10 to avoid waiting too long.

Test result (convert and return new string):

tests  fast2s  simplet2s-rs  opencc-rust  character_conver
zht    188us   729us         5.98ms       1.23s
zhc    169us   941us         6.89ms       2.87s
en     69us    3.31ms        13.99ms      26.11s

Test result (mutate existing string):

tests  fast2s  simplet2s-rs  opencc-rust  character_conver
zht    121us   N/A           N/A          N/A
zhc    139us   N/A           N/A          N/A
en     78us    N/A           N/A          N/A

Note:

  1. The benchmark was run with Rust 1.56.1.
  2. zht loads "math_zht.txt" and translates it; zhc loads "math_zhc.txt" (all Simplified Chinese) and translates it; en loads "math_en.txt" (all English) and translates it.
  3. N/A means not supported.
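
As a rough illustration of why a "mutate existing string" variant is feasible at all: common Traditional and Simplified characters typically occupy the same number of bytes in UTF-8, so a character can be overwritten inside the existing buffer without reallocating. The sketch below is an assumption-laden illustration, not the crate's code; convert_in_place and utf8_len are hypothetical names, and it uses a std HashMap of single characters rather than the crate's data structure.

```rust
use std::collections::HashMap;

/// Byte length of a UTF-8 sequence, read from its first byte.
/// Assumes the input is valid UTF-8 (always true for a Rust String).
fn utf8_len(first: u8) -> usize {
    if first < 0x80 {
        1
    } else if first & 0xE0 == 0xC0 {
        2
    } else if first & 0xF0 == 0xE0 {
        3
    } else {
        4
    }
}

/// Hypothetical helper: overwrites characters inside the string's own buffer.
/// A replacement is applied only when it has the same UTF-8 length as the
/// original character; anything else is left untouched.
fn convert_in_place(s: &mut String, table: &HashMap<char, char>) {
    // Take ownership of the buffer so it can be edited as raw bytes.
    let mut bytes = std::mem::take(s).into_bytes();
    let mut i = 0;
    while i < bytes.len() {
        let len = utf8_len(bytes[i]);
        // Decode the single character that starts at byte offset i.
        let ch = std::str::from_utf8(&bytes[i..i + len])
            .ok()
            .and_then(|t| t.chars().next());
        if let Some(&rep) = ch.and_then(|c| table.get(&c)) {
            if rep.len_utf8() == len {
                rep.encode_utf8(&mut bytes[i..i + len]);
            }
        }
        i += len;
    }
    // Same-length swaps of whole characters keep the buffer valid UTF-8.
    *s = String::from_utf8(bytes).expect("buffer remains valid UTF-8");
}
```

The trade-off in this sketch is that only same-length, single-character replacements are handled; phrase-level rules would need a lookup like the one used for converting into a new string.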

Please do not take these benchmark results at face value; run the benchmark in your own environment. See how to run benchmark.

Credits

t2s.txt is borrowed from simplet2s.

Dependencies

~1–1.6MB
~28K SLoC