MiniNN
A minimalist deep learning crate for Rust.
[!NOTE] This crate is still in development and is not ready for production use.
🔧 Setup
You can add the crate with cargo
cargo add mininn
Alternatively, you can manually add it to your project's Cargo.toml like this:
[dependencies]
mininn = "*" # Change the `*` to the current version
✏️ Quick Start: Solving XOR
In this example we will solve the classic XOR problem:
use ndarray::{array, Array1};
use mininn::prelude::*;
fn main() -> NNResult<()> {
let train_data = array![
[0.0, 0.0],
[0.0, 1.0],
[1.0, 0.0],
[1.0, 1.0],
];
let labels = array![[0.0], [1.0], [1.0], [0.0],];
// Create the neural network
let mut nn = NN::new()
.add(Dense::new(2, 3).apply(Act::Tanh))?
.add(Dense::new(3, 1).apply(Act::Tanh))?;
// Set the training configuration
let train_config = TrainConfig::new()
.with_epochs(200)
.with_cost(Cost::MSE)
.with_learning_rate(0.1)
.with_batch_size(2)
.with_verbose(true);
// Train the neural network
let loss = nn.train(train_data.view(), labels.view(), train_config)?;
println!("Predictions:\n");
let predictions: Array1<f32> = train_data
.rows()
.into_iter()
.map(|input| {
let pred = nn.predict(input.view()).unwrap();
let out = if pred[0] >= 0.9 { 1.0 } else { 0.0 };
println!("{} --> {}", input, out);
out
})
.collect();
// Calc metrics using MetricsCalculator
let metrics = MetricsCalculator::new(labels.view(), predictions.view());
println!("\nConfusion matrix:\n{}\n", metrics.confusion_matrix());
println!(
"Accuracy: {}\nRecall: {}\nPrecision: {}\nF1: {}\nLoss: {}",
metrics.accuracy(),
metrics.recall(),
metrics.precision(),
metrics.f1_score(),
loss
);
// Save the model into a HDF5 file
match nn.save("model.h5") {
Ok(_) => println!("Model saved successfully!"),
Err(e) => println!("Error saving model: {}", e),
}
Ok(())
}
Output
Epoch 1/200 - Loss: 0.2636616, Time: 0.000482592 sec
Epoch 2/200 - Loss: 0.265602, Time: 0.000444258 sec
Epoch 3/200 - Loss: 0.26768285, Time: 0.000398091 sec
...
Epoch 198/200 - Loss: 0.0010192227, Time: 0.000600476 sec
Epoch 199/200 - Loss: 0.0009878413, Time: 0.000510074 sec
Epoch 200/200 - Loss: 0.0009578406, Time: 0.000512518 sec
Training Completed!
Total Training Time: 0.11 sec
Predictions:
[0, 0] --> 0
[0, 1] --> 1
[1, 0] --> 1
[1, 1] --> 0
Confusion matrix:
[[2, 0],
[0, 2]]
Accuracy: 1
Recall: 1
Precision: 1
F1: 1
Loss: 0.0009578406
Model saved successfully!
📊 Training and evaluation
Train the model
In order to train the model, you need to provide the training data, the labels, and the training configuration. The training configuration is a struct that contains all the parameters used during training, such as the number of epochs, the cost function, the learning rate, the batch size, the optimizer, and whether to print training progress.
let train_data = array![[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]];
let labels = array![[0.0], [1.0], [1.0], [0.0]];
let loss = nn.train(train_data.view(), labels.view(), TrainConfig::default())?;
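If you need more control than TrainConfig::default(), the configuration can be built explicitly with the same builder methods used in the quick start. This is only a sketch: the values are arbitrary and the optimizer setter is omitted here.
use mininn::prelude::*;
use ndarray::array;

fn main() -> NNResult<()> {
    let train_data = array![[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]];
    let labels = array![[0.0], [1.0], [1.0], [0.0]];

    let mut nn = NN::new()
        .add(Dense::new(2, 3).apply(Act::Tanh))?
        .add(Dense::new(3, 1).apply(Act::Tanh))?;

    // Explicit configuration instead of TrainConfig::default()
    let config = TrainConfig::new()
        .with_epochs(500)
        .with_cost(Cost::MSE)
        .with_learning_rate(0.05)
        .with_batch_size(4)
        .with_verbose(false);

    let loss = nn.train(train_data.view(), labels.view(), config)?;
    println!("Final loss: {}", loss);
    Ok(())
}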
Make predictions with the model
Once the model is trained, you can use it to make predictions on new data. To do this, you need to provide the input data to the predict method.
let input = array![1.0, 2.0];
let output = nn.predict(input.view())?;
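To predict on several inputs at once, you can iterate over the rows of a 2D array just like in the quick start. This is only a sketch; the 0.5 threshold is an illustrative way to turn the raw output into a class label.
use mininn::prelude::*;
use ndarray::{array, Array1};

fn main() -> NNResult<()> {
    let mut nn = NN::new()
        .add(Dense::new(2, 3).apply(Act::Tanh))?
        .add(Dense::new(3, 1).apply(Act::Tanh))?;

    let inputs = array![[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]];

    // Predict every row and map the raw output to a class label
    let predictions: Array1<f32> = inputs
        .rows()
        .into_iter()
        .map(|row| {
            let pred = nn.predict(row.view()).unwrap();
            if pred[0] >= 0.5 { 1.0 } else { 0.0 }
        })
        .collect();

    println!("{}", predictions);
    Ok(())
}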
Metrics
You can also calculate metrics for your models using MetricsCalculator:
let metrics = MetricsCalculator::new(labels.view(), predictions.view());
println!("\nConfusion matrix:\n{}\n", metrics.confusion_matrix());
println!(
"Accuracy: {}\nRecall: {}\nPrecision: {}\nF1: {}\n",
metrics.accuracy(),
metrics.recall(),
metrics.precision(),
metrics.f1_score()
);
This is the output of the iris example:
Confusion matrix:
[[26, 0, 0],
[0, 28, 1],
[0, 2, 18]]
Accuracy: 0.96
Recall: 0.9551724137931035
Precision: 0.960233918128655
F1: 0.9574098218166016
Save and load models
When you already have a trained model, you can save it into an HDF5 file:
nn.save("model.h5").unwrap();
let nn = NN::load("model.h5").unwrap();
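A typical workflow is to train, save, and later reload the model before making predictions. Here is a minimal sketch based on the calls above; the file name is just an example.
use mininn::prelude::*;
use ndarray::array;

fn main() -> NNResult<()> {
    let mut nn = NN::new()
        .add(Dense::new(2, 3).apply(Act::Tanh))?
        .add(Dense::new(3, 1).apply(Act::Tanh))?;

    let train_data = array![[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]];
    let labels = array![[0.0], [1.0], [1.0], [0.0]];
    nn.train(train_data.view(), labels.view(), TrainConfig::default())?;

    // Persist the trained model and load it back from disk
    nn.save("xor.h5")?;
    let mut loaded = NN::load("xor.h5")?;

    // The reloaded model can be used for predictions right away
    let output = loaded.predict(array![1.0, 0.0].view())?;
    println!("{}", output);
    Ok(())
}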
🧰 Built-in Components
The crate defines some default layers, activations and costs that can be used in your model:
Default Layers
For now, the crate only offers these types of layers:
Layer | Description |
---|---|
Dense | Fully connected layer where each neuron connects to every neuron in the previous layer. It computes the weighted sum of inputs, adds a bias term, and applies an optional activation function (e.g., ReLU, Sigmoid). This layer is fundamental for transforming input data in deep learning models. |
Activation | Applies a non-linear transformation (activation function) to its inputs. Common activation functions include ReLU, Sigmoid, Tanh, and Softmax. These functions introduce non-linearity to the model, allowing it to learn complex patterns. |
Flatten | Flattens the input into a 1D array. This layer is useful when the input is a 2D array, but you want to treat it as a 1D array. |
Dropout | Applies dropout, a regularization technique where randomly selected neurons are ignored during training. This helps prevent overfitting by reducing reliance on specific neurons and forces the network to learn more robust features. Dropout is typically used in the training phase and is deactivated during inference. |
[!NOTE] More layers will be added in the future.
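Dense layers are constructed with Dense::new(inputs, outputs), as in the quick start, and the other layers are added to the network the same way through NN::add. The snippet below is only an illustrative sketch: the Activation::new, Flatten::new, and Dropout::new constructor names and arguments are assumptions, so check the crate documentation for the exact signatures.
use mininn::prelude::*;

fn main() -> NNResult<()> {
    // NOTE: Activation::new, Flatten::new and Dropout::new below are assumed
    // constructor names, used only to show how layers are stacked with NN::add;
    // the real signatures may differ.
    let nn = NN::new()
        .add(Flatten::new((28, 28)))?                 // assumed: 2D input flattened to 1D
        .add(Dense::new(784, 128))?                   // fully connected layer
        .add(Activation::new(Act::ReLU))?             // assumed: standalone activation layer
        .add(Dropout::new(0.5))?                      // assumed: drop 50% of neurons while training
        .add(Dense::new(128, 10).apply(Act::Softmax))?;

    println!("Dense layers: {}", nn.extract_layers::<Dense>().unwrap().len());
    Ok(())
}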
Cost functions
The crate also provides a set of cost functions that can be used in the training process. These are represented by the Cost enum:
- Cost::MSE: Mean Squared Error. Measures the average squared difference between the predicted and actual values.
  \text{MSE}(y_p, y) = \frac{1}{n} \sum_{i=1}^{n} (y_{p,i} - y_i)^2
- Cost::MAE: Mean Absolute Error. Measures the average absolute difference between the predicted and actual values.
  \text{MAE}(y_p, y) = \frac{1}{n} \sum_{i=1}^{n} |y_{p,i} - y_i|
- Cost::BCE: Binary Cross-Entropy. Measures the dissimilarity between predicted probabilities and the true binary labels.
  \text{BCE}(y_p, y) = -\frac{1}{n} \sum_{i=1}^{n} \left[ y_i \log(y_{p,i}) + (1 - y_i) \log(1 - y_{p,i}) \right]
- Cost::CCE: Categorical Cross-Entropy. Measures the dissimilarity between the predicted probability distribution and the true class distribution.
  \text{CCE}(y_p, y) = -\frac{1}{n} \sum_{i=1}^{n} y_i \log(y_{p,i})
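As a quick sanity check of the MSE formula: with predictions y_p = [0.9, 0.1] and labels y = [1, 0], \text{MSE} = \frac{1}{2}\left((0.9 - 1)^2 + (0.1 - 0)^2\right) = \frac{0.01 + 0.01}{2} = 0.01.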
Activation functions
The crate provides a set of activation functions that can be used in the Activation layer. These are represented by the Act enum:
- Act::Step: maps the input to 0 if it is negative, and to 1 if it is positive.
  \text{step}(x) = \begin{cases} 0 & \text{if } x < 0 \\ 1 & \text{if } x \geq 0 \end{cases}
- Act::Sigmoid: maps the input to a value between 0 and 1, which can be read as the probability of the positive class.
  \text{sigmoid}(x) = \frac{1}{1 + e^{-x}}
- Act::ReLU: maps the input to 0 if it is negative, and to the input itself if it is positive.
  \text{ReLU}(x) = \max(0, x)
- Act::Tanh: maps the input to a value between -1 and 1 using the hyperbolic tangent.
  \tanh(x) = \frac{e^x - e^{-x}}{e^x + e^{-x}}
- Act::Softmax: maps a vector of inputs to a probability distribution over the possible classes.
  \text{softmax}(x)_i = \frac{e^{x_i}}{\sum_{j=1}^{n} e^{x_j}}
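Activations are attached to a Dense layer with apply, as shown in the quick start. A minimal sketch mixing the built-in functions (the layer sizes are arbitrary):
use mininn::prelude::*;

fn main() -> NNResult<()> {
    // Each Dense layer can use a different built-in activation function
    let nn = NN::new()
        .add(Dense::new(4, 8).apply(Act::ReLU))?
        .add(Dense::new(8, 8).apply(Act::Tanh))?
        .add(Dense::new(8, 3).apply(Act::Softmax))?;

    let dense_layers = nn.extract_layers::<Dense>().unwrap();
    println!("The network has {} dense layers", dense_layers.len());
    Ok(())
}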
🛠️ Customization
One of the main goals of the mininn crate is to provide a flexible and customizable framework for building and training neural networks. This section will cover how to create your own layers, activations, and costs, and how to register them with the framework.
Custom layers
All layers in the network are required to implement the Layer trait. This ensures that users can define their own custom layers while maintaining compatibility with the framework. To fulfill this requirement, every layer must also implement the following traits in addition to Layer:
- Debug: For inspecting and printing layer information.
- Clone: To enable copying of layer instances.
- Serialize and Deserialize: For seamless serialization and deserialization, typically using serde.
Here is a small example of how to create a custom layer:
use mininn::prelude::*;
use serde::{Deserialize, Serialize};
use ndarray::{ArrayD, ArrayViewD};
// The implementation of the custom layer
#[derive(Layer, Debug, Clone, Serialize, Deserialize)]
struct CustomLayer;
impl TrainLayer for CustomLayer {
fn forward(&mut self, input: ArrayViewD<f32>, _mode: &NNMode) -> NNResult<ArrayD<f32>> {
todo!()
}
fn backward(
&mut self,
output_gradient: ArrayViewD<f32>,
_learning_rate: f32,
_optimizer: &Optimizer,
_mode: &NNMode,
) -> NNResult<ArrayD<f32>> {
todo!()
}
}
fn main() {
let nn = NN::new()
.add(CustomLayer).unwrap();
nn.save("custom_layer.h5").unwrap();
}
Custom Activation Functions
You can also create your own activation functions by implementing the ActivationFunction
and Debug
traits.
use mininn::prelude::*;
use ndarray::{ArrayD, ArrayViewD};
#[derive(ActivationFunction, Debug, Clone)]
struct CustomActivation;
impl ActCore for CustomActivation {
fn function(&self, z: &ArrayViewD<f32>) -> ArrayD<f32> {
z.mapv(|x| x.powi(2))
}
fn derivate(&self, z: &ArrayViewD<f32>) -> ArrayD<f32> {
z.mapv(|x| 2. * x)
}
}
fn main() -> NNResult<()> {
    let nn = NN::new()
        .add(Dense::new(2, 3).apply(CustomActivation))?
        .add(Dense::new(3, 1).apply(CustomActivation))?;
    let dense_layers = nn.extract_layers::<Dense>().unwrap();
    assert_eq!(dense_layers.len(), 2);
    assert_eq!(dense_layers[0].activation().unwrap(), "CustomActivation");
    assert_eq!(dense_layers[1].activation().unwrap(), "CustomActivation");
    Ok(())
}
Custom Cost Functions
You can also create your own cost functions by implementing the CostFunction
and Debug
traits.
use mininn::prelude::*;
use ndarray::{array, ArrayD, ArrayViewD};
#[derive(CostFunction, Debug, Clone)]
struct CustomCost;
impl CostCore for CustomCost {
    fn function(&self, y_p: &ArrayViewD<f32>, y: &ArrayViewD<f32>) -> f32 {
        (y - y_p).mapv(f32::abs).mean().unwrap_or(0.)
    }
    fn derivate(&self, y_p: &ArrayViewD<f32>, y: &ArrayViewD<f32>) -> ArrayD<f32> {
        (y_p - y).mapv(f32::signum) / y.len() as f32
    }
}
fn main() {
let mut nn = NN::new()
.add(Dense::new(2, 3).apply(Act::Tanh))
.unwrap()
.add(Dense::new(3, 1).apply(Act::Tanh))
.unwrap();
let train_data = array![
[0.0, 0.0],
[0.0, 1.0],
[1.0, 0.0],
[1.0, 1.0]
];
let labels = array![[0.0], [1.0], [1.0], [0.0]];
let train_config = TrainConfig::new()
    .with_epochs(1)
    .with_learning_rate(0.1)
    .with_cost(CustomCost);
let train_result = nn.train(train_data.view(), labels.view(), train_config);
assert!(train_result.is_ok());
}
Register layers, activations and costs
To use your custom layers, activation functions, or cost functions with the load method, you need to register them first:
fn main() {
// You can use the register builder to register your own layers, activations and costs
Register::new()
.with_layer::<CustomLayer>()
.with_layer::<CustomLayer1>()
.with_activation::<CustomActivation>()
.with_cost::<CustomCost>()
.register();
// Or you can use the register! macro to register your own layers, activations and costs
register!(
layers: [CustomLayer, CustomLayer1],
acts: [CustomActivation],
costs: [CustomCost]
);
let nn = NN::load("custom_layer.h5").unwrap();
for layer in nn.extract_layers::<CustomLayer>().unwrap() {
println!("{}", layer.layer_type())
}
println!("{}", nn.train_config().cost.cost_name());
}
The register! macro can be used to register your layers, activations, and costs, or all of them at once.
register!(
layers: [CustomLayer, CustomLayer1],
acts: [CustomActivation],
costs: [CustomCost]
);
register!(
layers: [CustomLayer],
acts: [CustomActivation]
);
register!(layers: [CustomLayer]);
register!(acts: [CustomActivation]);
register!(costs: [CustomCost]);
📋 Examples
There are multiple examples solving classic ML problems. If you want to see the results, just run these commands:
cargo run --example iris
cargo run --example xor [optional_path_to_model] # If no path is provided, the model won't be saved
cargo run --example mnist [optional_path_to_model] # If no path is provided, the model won't be saved
cargo run --example xor_load_nn <path_to_model>
cargo run --example mnist_load_nn <path_to_model>
📑 Libraries used
- ndarray - For managing N-dimensional arrays.
- ndarray-rand - For generating random N-dimensional arrays.
- serde - For serialization.
- rmp_serde - For MSGPack serialization.
- hdf5 - For model storage.
- dyn-clone - For cloning trait objects.
💻 Contributing
Contributions are welcome! Feel free to open issues or submit pull requests; see CONTRIBUTING.md for more information.
🔑 License
MIT - Created by Paco Algar.