
locale_name_code_page

Rust library for looking up the code pages (and thus the legacy encodings) that Windows uses for a given locale

1 unstable release

0.1.0 Sep 13, 2020



Used in 2 crates (via zifu_core)

MIT license


Locale Name to Code Page for Rust


This library maps locale name strings to the code pages that Windows uses for those locales.

For example:

  • In the en-US locale, Windows-1252 (code page id: 1252) is used as the ANSI code page, and CP437 (code page id: 437) is used as the OEM code page.
  • In the ja-JP locale, Shift_JIS (code page id: 932) is used as both the ANSI and the OEM code page.
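
To make the mapping concrete, here is a minimal standalone sketch of the idea, not the crate's actual implementation. The `CodePage` struct is a stand-in (its `ansi` and `oem` field names follow the usage shown in this README, while the `u16` type is an assumption), and `TABLE` and `lookup` are hypothetical names that hold only the two examples above.

```rust
/// Stand-in for the crate's `CodePage` type. The `ansi` / `oem` field names
/// follow this README; the `u16` type is an assumption.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct CodePage {
    pub ansi: u16,
    pub oem: u16,
}

/// Hypothetical two-entry table containing only the examples listed above.
const TABLE: &[(&str, CodePage)] = &[
    ("en-US", CodePage { ansi: 1252, oem: 437 }),
    ("ja-JP", CodePage { ansi: 932, oem: 932 }),
];

/// Returns the code pages for `locale`, or `None` if it is not in the table.
pub fn lookup(locale: &str) -> Option<&'static CodePage> {
    TABLE
        .iter()
        .find(|(name, _)| *name == locale)
        .map(|(_, cp)| cp)
}
```

Returning an `Option` mirrors the `if let Some(...)` pattern used with `get_codepage` in the Usage section: an unknown locale is an ordinary outcome rather than a panic.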

Usage

First, add locale_name_code_page = "<2" to your Cargo.toml.

[dependencies]
# *snip*
locale_name_code_page = "<2"
# *snip*

Then convert locale strings to code pages like this:

use locale_name_code_page::get_codepage;
use locale_name_code_page::cp_table_type::CodePage;

// IConverter is a trait that you have already defined yourself
fn get_converter_instance(codepage: &CodePage) -> Box<dyn IConverter> {
  // build `converter` from `codepage` here
  Box::new(converter)
}

// *snip*

fn main() {
  // *snip*
  if let Some(codepage_ref) = get_codepage(locale_string) {
    let converter = get_converter_instance(codepage_ref);
    // *snip*
  } else {
    eprintln!("Error: {} doesn't represent a valid locale.", locale_string);
    std::process::exit(1);
  }
}

The obtained code page (an instance of locale_name_code_page::cp_table_type::CodePage) can be used as follows:

use locale_name_code_page::get_codepage;

fn main() {
  let en_cp = get_codepage("en-US").unwrap();
  // prints "en-US locale: 1252 (ANSI) / 437 (OEM)"
  println!("en-US locale: {} (ANSI) / {} (OEM)", en_cp.ansi, en_cp.oem);
}
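
The `get_converter_instance` placeholder from the first example could take roughly the following shape. This is only a sketch under stated assumptions: `CodePage` here is a local stand-in for the crate's type (with `u16` fields assumed), and the `IConverter` trait, its `describe` method, and `CodePageConverter` are all hypothetical, since the README leaves their definition to you.

```rust
/// Stand-in for `locale_name_code_page::cp_table_type::CodePage` (assumed shape).
pub struct CodePage {
    pub ansi: u16,
    pub oem: u16,
}

/// Hypothetical converter trait; a real one would also expose encode/decode methods.
pub trait IConverter {
    /// Human-readable description, useful for logging.
    fn describe(&self) -> String;
}

/// Hypothetical converter that remembers which code pages it was built for.
struct CodePageConverter {
    ansi: u16,
    oem: u16,
}

impl IConverter for CodePageConverter {
    fn describe(&self) -> String {
        format!("converter for ANSI cp{} / OEM cp{}", self.ansi, self.oem)
    }
}

/// Builds a boxed converter from the code page ids of a locale.
pub fn get_converter_instance(codepage: &CodePage) -> Box<dyn IConverter> {
    Box::new(CodePageConverter {
        ansi: codepage.ansi,
        oem: codepage.oem,
    })
}
```

Returning `Box<dyn IConverter>` lets the caller stay independent of the concrete converter type, so different code pages can be served by different implementations.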

Source of Information

https://web.archive.org/web/20180104073254/https://www.microsoft.com/resources/msdn/goglobal/default.mspx

FAQ

How can I convert a code page to an encoder/decoder?

Use the following libraries:

ANSI encodings (including CJKV languages)

Combine this library with codepage and encoding_rs: codepage maps a code page number to the corresponding encoding_rs encoding.

OEM encodings (except for CJKV languages)

Use oem_cp.
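
The split above can be sketched as a small routing function. This is an illustration of one reading of the FAQ, not code from any of the crates mentioned: `Backend`, `Usage`, `pick_backend` and the `CJKV_CODE_PAGES` list are hypothetical names, and the list itself is an assumption about which code page ids count as CJKV.

```rust
/// Which library family to use for a given code page (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum Backend {
    /// The `codepage` + `encoding_rs` combination.
    EncodingRs,
    /// The `oem_cp` crate.
    OemCp,
}

/// Whether the caller wants the ANSI or the OEM code page of a locale.
#[derive(Debug, Clone, Copy)]
pub enum Usage {
    Ansi,
    Oem,
}

/// Assumed CJKV code page ids: 932 (Japanese), 936 (Simplified Chinese),
/// 949 (Korean), 950 (Traditional Chinese), 1258 (Vietnamese).
const CJKV_CODE_PAGES: [u16; 5] = [932, 936, 949, 950, 1258];

/// Picks a backend following the FAQ: ANSI pages and CJKV pages go to
/// `encoding_rs`, all remaining OEM pages go to `oem_cp`.
pub fn pick_backend(usage: Usage, code_page: u16) -> Backend {
    match usage {
        Usage::Ansi => Backend::EncodingRs,
        Usage::Oem if CJKV_CODE_PAGES.contains(&code_page) => Backend::EncodingRs,
        Usage::Oem => Backend::OemCp,
    }
}
```

For instance, the OEM page of en-US (437) is routed to `oem_cp`, while the OEM page of ja-JP (932) goes the same way as its ANSI page.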

How can I get the current locale?

Use locale_config.

I want to port this library to other languages.

You can use assets/nls_info.json in your automatic code generation script.

LICENSE

MIT
