#icu4x #localization

i18n_lexer-rizzen-yazston

The i18n_lexer crate of the Internationalisation project

8 releases (4 breaking)

new 0.10.1 Nov 15, 2024
0.10.0 Aug 2, 2024
0.9.2 Jul 23, 2024
0.9.1 Mar 24, 2024
0.6.1 Jul 6, 2023

#272 in Internationalization (i18n)


Used in 5 crates (4 directly)

BSD-3-Clause

85KB
1K SLoC

i18n_lexer

Rizzen Yazston :icu4x: https://github.com/unicode-org/icu4x :url-unicode: https://home.unicode.org/ :DataProvider: https://docs.rs/icu_provider/1.2.0/icu_provider/trait.DataProvider.html :BlobDataProvider: https://docs.rs/icu_provider_blob/1.2.0/icu_provider_blob/struct.BlobDataProvider.html :FsDataProvider: https://docs.rs/icu_provider_fs/1.2.1/icu_provider_fs/struct.FsDataProvider.html :BufferProvider: https://docs.rs/icu_provider/1.2.0/icu_provider/buf/trait.BufferProvider.html :CLDR: https://cldr.unicode.org/

Welcome to the lexer crate of the Internationalisation (i18n) project.

This crate contains three modules:

  • icu: The ICU data provider wrapper.

  • lexer: The lexer iterator.

  • error: The error enums for the other modules.

Features

Available features for i18n_lexer crate:

  • icu_blob: Allow for instances of BlobDataProvider to be used various ICU4X components that supports {BufferProvider}BufferProvider. An alternative provider when the internal data of ICU4X components are insufficient for a particular use case.

  • icu_compiled_data [default]: Allow for the internal data of the various ICU4X components.

  • icu_extended: Include the extended data of various ICU's components.

  • icu_fs: Allow for instances of FsDataProvider to be used various ICU4X components that supports BufferProvider. An alternative provider when the internal data of ICU4X components are insufficient for a particular use case.

  • logging: To provide some logging information.

  • sync: Allow for rust's concurrency capabilities to be used. Use of Arc and Mutex instead Rc and RefCell.

Modules

icu: ICU data provider wrapper

{icu4x}[ICU4X] project (maintained by the {url-unicode}[Unicode Consortium]) data provider helper.

The IcuDataProvider type contains the DataProvider enum of supported implementations of ICU4X {DataProvider}DataProvider. Depending on the features selected, they are: Internal (internally uses the BakedDataProvider), {BlobDataProvider}BlobDataProvider, or {FsDataProvider}FsDataProvider.

When data provider is not Internal and depending on the data provider used, the IcuDataProvider may contain non-locale based data, such as the grapheme cluster segmenter and the selected character properties set data.

IcuDataProvider type is used within the Rc type as Rc<IcuDataProvider> or Arc type as Arc<IcuDataProvider> to prevent unnecessary duplication.

lexer: The Lexer Iterator

The LexerIterator is created with the DataProvider enum of supported data providers. Usually the ICU components includes internal compiled data, thus other data providers are not required. The crate's default feature includes the icu_compiled_data feature. If the internal compiled data is not used, the data repository is just a customised local copy of the {url-unicode}[Unicode Consortium] {CLDR}[CLDR], and usually located in the application's data directory.

Consult the {icu4x}[ICU4X] website for instructions on generating a suitable data repository for the application, by leaving out data that is not used by the application.

The grammar syntax characters along with the string are also passed at time of creating the LexerInterator.

Acknowledgement

Stefano Angeleri for advice on various design aspects of implementing the components of the internationalisation project, and also providing the Italian translation of error message strings.

Dependencies

~4.5–6.5MB
~101K SLoC