#neural-networks #machine-learning #backpropagation


A GPU accelerated library that creates/trains/runs neural networks in pure safe Rust code

10 releases

Uses new Rust 2021

0.2.2 Aug 2, 2022
0.2.1 Jul 31, 2022
0.1.4 Jul 30, 2022
0.0.2 Jul 23, 2022

#31 in Machine learning

Download history 49/week @ 2022-07-22 212/week @ 2022-07-29 53/week @ 2022-08-05

314 downloads per month

MIT license



Crates.io Crates.io github.com github.com github.com

A GPU accelerated library that creates/trains/runs neural networks in pure safe Rust code.

Architechture overview

Intricate has a layout very similar to popular libraries out there such as Keras.


As said before, similar to Keras from Tensorflow, Intricate defines Models as basically a list of Layers and the definition for "layer" is as follows.


Every layer receives inputs and returns outputs, they must also implement a back_propagate method that will mutate the layer if needed and then return the derivatives of the loss function with respected to the inputs, written with I as the inputs of the layer, E as the loss and O as the outputs of the layer:

dE/dI <- Model <- dE/dO

These layers can be anything you want and just propagates the previous inputs to the next inputs for the next layer or for the outputs of the whole Model.

There are a few activations already implemented, but still many to be implemented.

XoR using Intricate

If you look at the examples/ in the repository you will find XoR implemented using Intricate. The following is basically just that example with some separate explanation.

Setting up the training data

let training_inputs = Vec::from([
    Vec::from([0.0, 0.0]),
    Vec::from([0.0, 1.0]),
    Vec::from([1.0, 0.0]),
    Vec::from([1.0, 1.0]),

let expected_outputs = Vec::from([

Setting up the layers

let mut layers: Vec<Box<dyn Layer<f64>>> = Vec::new();

//                      inputs_amount|outputs_amount
layers.push(Box::new(DenseF64::new(2, 3)));
layers.push(Box::new(TanHF64::new())); // activation functions are layers
layers.push(Box::new(DenseF64::new(3, 1)));

Creating the model with the layers

// Instantiate our model using the layers
let mut xor_model = ModelF64::new(layers);
// mutable because the 'fit' method lets the layers tweak themselves

Fitting our model

    TrainingOptionsF64 {
        learning_rate: 0.1,
        loss_algorithm: Box::new(MeanSquared), // The Mean Squared loss function
        should_print_information: true, // Should be verbose
        instantiate_gpu: false, // Should initialize WGPU Device and Queue for GPU layers
        epochs: 10000,

As you can see it is extremely easy creating these models, and blazingly fast as well.

Although if you wish to do (just like in the actual XoR example) you could write this using the F32 version of numbers which is 30% faster overall and uses half the RAM but at the price of less precision.

How to save and load models

Intricate implements a few functions for each layer that saves and loads the necessary layer information to some file using the savefile crate.

But a layer can save and load the data anyway it sees fit, as long as it does what the trait Layer requires.

Saving the model

To load and save data, as an example, say for the XoR model we trained above, we can just call the save function as such:

xor_model.layers[0].save("xor-model-first-dense.bin", 0).unwrap();
xor_model.layers[2].save("xor-model-second-dense.bin", 0).unwrap();

And we save only the Dense layers here because the Activation layers don't really hold any valuable information, only the Dense layers do.

Loading the model

As for the loading of the data we must create some dummy dense layers and tell them to load their data from the paths created above

let mut first_dense: Box<DenseF32> = Box::new(DenseF32::dummy());
first_dense.load("xor-model-first-dense.bin", 0).unwrap();
let mut second_dense: Box<DenseF32> = Box::new(DenseF32::dummy()); 
second_dense.load("xor-model-second-dense.bin", 0).unwrap();

let mut new_layers: Vec<Box<dyn Layer<f32>>> = Vec::new();

let loaded_xor_model = ModelF32::new(new_layers);

Things to be done still

  • writing some kind of macro to generate the code for f32 and f64 versions of certain structs and traits to not have duplicated code.
  • improve the GPU shaders, perhaps finding a way to send the full unflattened matrices to the GPU instead of sending just a flattened array.
  • create GPU accelerated activations and loss functions as to make everything GPU accelerated.
  • perhaps write some shader to calculate the Model loss to output gradient (derivatives).
  • implement convolutional layers and perhaps even solve some image classification problems in a example
  • add a example that uses GPU acceleration


~236K SLoC