#korean #nlp #analyzer #tokenizer

bareun_rs

A Rust library for Bareun, a Korean morphological analyzer

1 unstable release

0.1.0 Jun 7, 2024

#1769 in Text processing

BSD-3-Clause

51KB
826 lines

bareun-rs is an unofficial Rust library for Bareun, a Korean morphological analyzer.
Bareun is a Korean NLP engine that provides tokenization and POS tagging for Korean text.


lib.rs:

bareun_rs::bareun

Provides

  1. A Korean part-of-speech tagger that acts as a Bareun client.
  2. Multiple custom dictionaries, which are kept on your Bareun server.

How to use the documentation

Full documentation for Bareun is available in the installable tarball and in the Docker images.

  • See docs/intro.html in the installable tarball.
  • Or open http://localhost:5757/intro.html after starting the Docker container.

The docstring examples assume that bareun_rs::bareun has been imported as brn:

use bareun_rs::bareun as brn;

Rust has no built-in help function. To view the documentation of a type such as brn::Tagger, generate the API docs with cargo doc --open.

Classes

  • Tagger: the Bareun POS tagger for Korean.
    use bareun_rs::bareun::Tagger;
  • Tagged: wrapper for tagged output.
    use bareun_rs::bareun::Tagged;
  • CustomDict: custom dictionary for Korean.
    use bareun_rs::bareun::CustomDict;

Version

use bareun_rs as brn;
println!("{}", brn::VERSION);
println!("{}", brn::BAREUN_VERSION);

Get bareun

Dependencies

~5–11MB
~126K SLoC