#chinese #dictionary #traditional-chinese

bin+lib tocfl

Chinese TOCFL vocabulary by levels

3 releases (breaking)

0.3.0 Oct 8, 2023
0.2.0 Mar 26, 2023
0.1.0 Mar 26, 2023

MIT license

TOCFL

The Test of Chinese as a Foreign Language (TOCFL) (Chinese: 華語文能力測驗; pinyin: Huáyǔwén Nénglì Cèyàn) is a standardized test of Taiwanese Mandarin language proficiency for non-native speakers, including foreign students. Many vocabulary lists are available online, but a lot of them are incomplete, outdated, or behind paywalls.

This repo provides a dataset based on the following source, which is linked from the official TOCFL website:

coct.naer.edu.tw/download/tech_report

Excel Sheet

Vocabulary

Taiwan Chinese Language Proficiency Benchmark Vocabulary List_111-11-14.xlsx

The vocabulary list is great: it gives frequencies for both written and spoken usage. It also provides pinyin, which differentiates entries that share the same character but differ in meaning or pronunciation.
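To illustrate why the pinyin column matters, here is a minimal sketch of a lookup keyed by both the word and its pinyin, so that two readings of the same character stay separate. Everything in it is an assumption made for illustration: the `Entry` and `Dictionary` types, their field names, and the sample frequencies and levels are hypothetical and are not this crate's API or values taken from the spreadsheet.

```rust
use std::collections::HashMap;

/// Hypothetical shape of one vocabulary row (not the crate's actual type).
#[derive(Debug, Clone, PartialEq)]
pub struct Entry {
    pub word: String,
    pub pinyin: String,
    pub level: u8,         // assumed numeric level
    pub written_freq: u32, // frequency in written material
    pub spoken_freq: u32,  // frequency in spoken material
}

/// Index keyed by (word, pinyin) so homographs do not overwrite each other.
pub struct Dictionary {
    entries: HashMap<(String, String), Entry>,
}

impl Dictionary {
    pub fn new(rows: impl IntoIterator<Item = Entry>) -> Self {
        let entries = rows
            .into_iter()
            .map(|e| ((e.word.clone(), e.pinyin.clone()), e))
            .collect();
        Dictionary { entries }
    }

    /// Look up one specific reading of a word.
    pub fn get(&self, word: &str, pinyin: &str) -> Option<&Entry> {
        self.entries.get(&(word.to_string(), pinyin.to_string()))
    }

    /// All readings recorded for a word, sorted by pinyin for stable output.
    pub fn readings(&self, word: &str) -> Vec<&Entry> {
        let mut found: Vec<&Entry> =
            self.entries.values().filter(|e| e.word == word).collect();
        found.sort_by(|a, b| a.pinyin.cmp(&b.pinyin));
        found
    }
}

/// Made-up sample rows: 長 read as "cháng" (long) and "zhǎng" (to grow).
pub fn sample_dictionary() -> Dictionary {
    Dictionary::new(vec![
        Entry {
            word: "長".into(),
            pinyin: "cháng".into(),
            level: 1,
            written_freq: 100,
            spoken_freq: 80,
        },
        Entry {
            word: "長".into(),
            pinyin: "zhǎng".into(),
            level: 3,
            written_freq: 40,
            spoken_freq: 25,
        },
    ])
}
```

Keying on the word alone would keep only one of the two rows, which is why the sketch uses the pair as the key.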

Characters

Taiwan Chinese Language Proficiency Benchmark Chinese Character List_111-09-20.xlsx

Other

https://github.com/tomcumming/tocfl-word-list also provides TOCFL lists, but they seem to be incomplete (or outdated), and the source used to compile them is not entirely clear.
