5 releases (breaking)
| Version | Release date |
|---|---|
| 0.5.0 | Sep 11, 2025 |
| 0.4.0 | Sep 10, 2025 |
| 0.3.0 | Sep 10, 2025 |
| 0.2.0 | Sep 10, 2025 |
| 0.1.0 | Sep 10, 2025 |
Module for BUND standard library: Naive Bayes text classifier
A text classification module built on top of the BUND machine learning framework. It is part of the BUND Standard Library (stdlib) and provides a streamlined workflow for training, evaluating, and deploying text classification models. The module is a library, not a standalone application; it is meant to be embedded inside the BUND virtual machine.
Installation
This module requires make and the Rust toolchain to be installed first. After that:
cargo add bund_stdlib_text_classifier
Quick start
Get started with a simple example that classifies text data. First, create a classifier and train it on one text file per category:
:TEST textclassifier.new
:rust "./scripts/rust.txt" textclassifier.train.from_file
"{A} tokens for RUST" format println
:kant "./scripts/kant.txt" textclassifier.train.from_file
"{A} tokens for KANT" format println
:astronomy "./scripts/astronomy.txt" textclassifier.train.from_file
"{A} tokens for ASTRONOMY" format println
:tolstoy "./scripts/tolstoy.txt" textclassifier.train.from_file
"{A} tokens for LEO TOLSTOY" format println
textclassifier.train.finish
Then you can classify any line of text with the trained classifier:
:TEST
"At its simplest, a test in Rust is a function that’s annotated with the test attribute. Attributes are metadata about pieces of Rust code"
textclassifier.classify
The call above returns a DICT value with one score per trained category:
{
"astronomy": 0.8331765363980779,
"kant": 0.9968812285706273,
"rust": 1.0,
"tolstoy": 0.9968812285706273
}
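On the host side, the usual next step is to rank the categories by score. The sketch below is a minimal Rust illustration, assuming the DICT has already been converted into a `HashMap<String, f64>` and that a higher score means a better match (as the example output suggests). The `ranked` helper is hypothetical and is not part of this module.

```rust
use std::collections::HashMap;

/// Hypothetical helper (not part of the module): order categories by score.
/// Assumes a higher score means a better match.
fn ranked(scores: &HashMap<String, f64>) -> Vec<(&str, f64)> {
    let mut v: Vec<(&str, f64)> = scores
        .iter()
        .map(|(name, score)| (name.as_str(), *score))
        .collect();
    // Highest score first; ties are broken alphabetically so the
    // resulting order is deterministic.
    v.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(b.0)));
    v
}
```

With the scores shown above, the first entry is `rust`, followed by `kant` and `tolstoy` (tied) and then `astronomy`. The gap between the top entries is small, so a caller may want to look at the margin rather than only the winner.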
BUND functions exposed in this module
| Name | Stack IN | Stack OUT | Description |
|---|---|---|---|
| textclassifier.new | Classifier name | Classifier name | Create new classifier |
| textclassifier.exists | Classifier name | Classifier name, TRUE/FALSE | Check if classifier exists |
| textclassifier.train.from_file | Classifier name, Category, Filename | Classifier name, Number of tokens | Train classifier from text file |
| textclassifier.train.finish | Classifier name | Classifier name | Finalize classifier training |
| textclassifier.classify | Classifier name, Text for classification | Classifier name, DICT with scores | Classify text string using pre-trained classifier |
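To make the workflow behind these five functions concrete, here is a minimal Rust sketch of a named-classifier registry with a multinomial Naive Bayes scorer. It is an illustration under stated assumptions, not the module's implementation: `Registry`, `Classifier`, `tokenize`, and every method name are hypothetical. Training takes a text string instead of a filename to stay self-contained. Add-one smoothing and a uniform class prior are assumed, and scores are raw log-likelihoods, so they will not match the 0-to-1 values the module returns.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical model of one named classifier (not the module's types).
#[derive(Default)]
struct Classifier {
    counts: HashMap<String, HashMap<String, usize>>, // category -> token -> count
    totals: HashMap<String, usize>,                  // category -> tokens seen
    vocab_size: usize,                               // distinct tokens, set by finish
    finished: bool,
}

/// Hypothetical registry keyed by classifier name.
#[derive(Default)]
struct Registry {
    classifiers: HashMap<String, Classifier>,
}

/// Assumed tokenizer: lowercase alphanumeric runs.
fn tokenize(text: &str) -> Vec<String> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|t| !t.is_empty())
        .map(|t| t.to_lowercase())
        .collect()
}

impl Registry {
    /// Counterpart of textclassifier.new.
    fn new_classifier(&mut self, name: &str) {
        self.classifiers.entry(name.to_string()).or_default();
    }

    /// Counterpart of textclassifier.exists.
    fn exists(&self, name: &str) -> bool {
        self.classifiers.contains_key(name)
    }

    /// Counterpart of textclassifier.train.from_file; returns the token count.
    fn train(&mut self, name: &str, category: &str, text: &str) -> Option<usize> {
        let c = self.classifiers.get_mut(name)?;
        let tokens = tokenize(text);
        let n = tokens.len();
        let per_cat = c.counts.entry(category.to_string()).or_default();
        for t in tokens {
            *per_cat.entry(t).or_insert(0) += 1;
        }
        *c.totals.entry(category.to_string()).or_insert(0) += n;
        Some(n)
    }

    /// Counterpart of textclassifier.train.finish; fixes the vocabulary size.
    fn finish(&mut self, name: &str) -> bool {
        let Some(c) = self.classifiers.get_mut(name) else {
            return false;
        };
        let vocab: HashSet<&String> = c.counts.values().flat_map(|m| m.keys()).collect();
        c.vocab_size = vocab.len();
        c.finished = true;
        true
    }

    /// Counterpart of textclassifier.classify; returns a log-likelihood per
    /// category (higher is better), or None if training was not finished.
    fn classify(&self, name: &str, text: &str) -> Option<HashMap<String, f64>> {
        let c = self.classifiers.get(name)?;
        if !c.finished {
            return None;
        }
        let tokens = tokenize(text);
        let mut scores = HashMap::new();
        for (cat, per_cat) in &c.counts {
            let denom = c.totals[cat] as f64 + c.vocab_size as f64;
            let log_p: f64 = tokens
                .iter()
                .map(|t| ((*per_cat.get(t).unwrap_or(&0) as f64 + 1.0) / denom).ln())
                .sum();
            scores.insert(cat.clone(), log_p);
        }
        Some(scores)
    }
}
```

Keeping the classifier name as the first argument of every method mirrors the table, where each function takes the classifier name on the stack and leaves it there for the next call.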