3 releases
Version | Date |
---|---|
0.2.2 | Feb 19, 2023 |
0.2.1 | Jan 15, 2023 |
0.2.0 | Dec 15, 2021 |
translitRS — Transliterator for Serbian Language
TranslitRS is a command-line utility for transliterating Serbian text between the Cyrillic and Latin scripts. It can process plain text files directly, or run as a filter for the Pandoc document processor (Markdown, HTML, LaTeX, Microsoft Word...).
Usage
Arguments
-i, --input <path>
Read input from file
Default: standard input
-o, --output <path>
Write output to file
Default: standard output
-f, --from <charset>
Convert from character set
Default: latin
-t, --into <charset>
Convert to character set
Default: cyrillic
-d, --skip-digraph
Do not check for digraph exceptions
-u, --force-foreign
Process words with foreign and mixed characters
-l, --force-links
Process hyperlinks, email addresses and units
-p, --pandoc-filter
Run in Pandoc JSON pipe filter mode
-v, --version
Show version and quit
-h, --help
Show usage help and quit
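The options above can be combined in a script to process several files at once. The following is a minimal sketch rather than part of TranslitRS itself: the function name `translit_dir` and the directory layout are assumptions made for illustration, and only the documented `-f`, `-t`, `-i` and `-o` options are used.

```shell
# Hypothetical helper (not shipped with translitrs): transliterate every
# .txt file in a source directory from Latin to Cyrillic, writing the
# results under the same file names in a destination directory.
translit_dir() {
  src_dir=$1
  dst_dir=$2
  mkdir -p "$dst_dir" || return 1
  for src in "$src_dir"/*.txt; do
    # Skip the unexpanded pattern when the directory has no .txt files.
    [ -e "$src" ] || continue
    translitrs -f lat -t cyr -i "$src" -o "$dst_dir/$(basename "$src")" || return 1
  done
}
```

It could be called as `translit_dir ./latin ./cyrillic`. Because input and output default to standard input and standard output, a single file can also be handled in a pipeline by omitting `-i` and `-o`.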
Character sets
Listed below are the available character sets and their shorthand codes:
- Serbian Latin
latin, lat, l
- Serbian Latin (Unicode)
latin8, lat8, l8
- Serbian Cyrillic
cyrillic, cyr, c
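A wrapper script that accepts user input may want to check a charset code before passing it to `-f` or `-t`. The sketch below maps each shorthand listed above to its long name. The function name `normalize_charset` is hypothetical, and this is not how TranslitRS itself parses the codes; it only mirrors the list above.

```shell
# Hypothetical helper: map a charset shorthand to its long name.
# Prints the long name, or returns non-zero for an unknown code.
normalize_charset() {
  case "$1" in
    latin|lat|l)    echo latin ;;
    latin8|lat8|l8) echo latin8 ;;
    cyrillic|cyr|c) echo cyrillic ;;
    *) echo "unknown charset: $1" >&2; return 1 ;;
  esac
}
```

For example, `normalize_charset l8` prints `latin8`, while an unlisted code such as `greek` produces an error message on standard error and a non-zero exit status.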
Pandoc filter mode
When running as a Pandoc filter, the arguments listed above can't be passed directly. Instead, use the following environment variables:
CHARS_FROM=<charset>
Convert from character set
CHARS_INTO=<charset>
Convert to character set
SKIP_DIGRAPH=1
Do not check for digraph exceptions
FORCE_FOREIGN=1
Process words with foreign and mixed characters
FORCE_LINKS=1
Process hyperlinks, email addresses and units
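Since the filter is configured through the environment, the variables can be set for a single Pandoc invocation by prefixing the command. The sketch below wraps that pattern in a function. The name `pandoc_translit` and its argument order are assumptions made for illustration; the variables and the `--filter translitrs` invocation are the ones described in this document.

```shell
# Hypothetical wrapper: run Pandoc with translitrs as a filter.
# Arguments: <from-charset> <into-charset> <input-file> <output-file>
pandoc_translit() {
  from=$1
  into=$2
  input=$3
  output=$4
  # The assignments apply only to this pandoc invocation and are
  # passed on to the filter process that Pandoc starts.
  CHARS_FROM=$from CHARS_INTO=$into \
    pandoc "$input" --filter translitrs -o "$output"
}
```

A call such as `pandoc_translit cyr lat essay.docx essay-latin.docx` would then convert a Cyrillic document to Latin. The optional switches can be added the same way, for example by placing `FORCE_LINKS=1` before `pandoc`.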
Examples
# Transliterate plaintext file from Latin (Unicode) to Cyrillic
translitrs -f lat8 -t cyr -i source.txt -o destination.txt
# Transliterate Microsoft Word document from Cyrillic to Latin
CHARS_FROM=c CHARS_INTO=l pandoc essay.docx --filter translitrs -o essay.docx