12 releases
Version | Released |
---|---|
0.5.0 | Dec 6, 2022 |
0.4.1 | Sep 3, 2021 |
0.3.3 | Nov 7, 2020 |
0.3.2 | Oct 14, 2020 |
0.1.1 | Nov 2, 2019 |
#205 in Science
Used in krabmaga
120KB, 2K SLoC
Friedrich: Gaussian Process Regression
This library implements Gaussian Process Regression, also known as Kriging, in Rust. Our goal is to provide a solid and well-featured building block for other algorithms (such as Bayesian Optimization).
Gaussian processes can extract a lot of information from their training data and return both a prediction and an uncertainty value for that prediction. Furthermore, they can handle non-linear phenomena, take uncertainty on the inputs into account, and encode a prior on the output.
These properties make Gaussian process regression an algorithm of choice when data is scarce or when uncertainty bars on the output are desirable.
However, the `O(n^3)` complexity of the algorithm (where `n` is the number of training samples) makes the classic implementations unsuitable for large datasets.
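To make the cost concrete, here is a minimal, standard-library-only sketch of the textbook computation behind a single prediction. It is not Friedrich's implementation: the squared-exponential `kernel`, the zero-mean prior, the fixed `noise` value and every function name are assumptions chosen for illustration. The triple loop in `cholesky` is the `O(n^3)` step; once the factor is known, each prediction only needs `O(n^2)` triangular solves.

```rust
/// Squared-exponential kernel for one-dimensional inputs (an illustrative choice).
fn kernel(a: f64, b: f64) -> f64 {
    (-0.5 * (a - b).powi(2)).exp()
}

/// Cholesky factorization K = L * L^T of a symmetric positive-definite matrix.
/// The three nested loops are where the O(n^3) cost comes from.
fn cholesky(k: &[Vec<f64>]) -> Vec<Vec<f64>> {
    let n = k.len();
    let mut l = vec![vec![0.0; n]; n];
    for i in 0..n {
        for j in 0..=i {
            let mut sum = k[i][j];
            for p in 0..j {
                sum -= l[i][p] * l[j][p];
            }
            l[i][j] = if i == j { sum.sqrt() } else { sum / l[j][j] };
        }
    }
    l
}

/// Solves L * x = b by forward substitution (O(n^2)).
fn solve_lower(l: &[Vec<f64>], b: &[f64]) -> Vec<f64> {
    let mut x = vec![0.0; b.len()];
    for i in 0..b.len() {
        let mut sum = b[i];
        for p in 0..i {
            sum -= l[i][p] * x[p];
        }
        x[i] = sum / l[i][i];
    }
    x
}

/// Posterior mean and variance at `x_star`, assuming a zero-mean prior
/// and a noise variance `noise` added to the diagonal of the kernel matrix.
fn predict(xs: &[f64], ys: &[f64], noise: f64, x_star: f64) -> (f64, f64) {
    let n = xs.len();
    let mut k = vec![vec![0.0; n]; n];
    for i in 0..n {
        for j in 0..n {
            k[i][j] = kernel(xs[i], xs[j]);
        }
        k[i][i] += noise;
    }
    let l = cholesky(&k);
    let k_star: Vec<f64> = xs.iter().map(|&x| kernel(x, x_star)).collect();
    let v = solve_lower(&l, &k_star); // v = L^-1 k_star
    let w = solve_lower(&l, ys); // w = L^-1 y
    let mean: f64 = v.iter().zip(&w).map(|(a, b)| a * b).sum();
    let variance = kernel(x_star, x_star) - v.iter().map(|a| a * a).sum::<f64>();
    (mean, variance)
}

fn main() {
    let xs = [0.8, 1.2, 3.8, 4.2];
    let ys = [3.0, 4.0, -2.0, -2.0];
    let (mean, variance) = predict(&xs, &ys, 0.1, 1.0);
    println!("prediction: {} ± {}", mean, variance.sqrt());
}
```

Far from every training point the kernel values vanish, so this sketch falls back to the prior (mean 0, variance 1), which is the behaviour the uncertainty bars are meant to expose.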
Functionalities
This implementation lets you:
- define a Gaussian process with default parameters or using the builder pattern
- train it on multidimensional data
- fit the parameters (kernel, prior and noise) on the training data
- introduce an optional `cholesky_epsilon` to make the Cholesky decomposition infallible in case of badly conditioned problems
- add additional samples efficiently (`O(n^2)`) and refit the process
- predict the mean, variance and covariance matrix for given inputs
- sample the distribution at a given position
- save and load a trained model with serde
(See the todo.md file to get up-to-date information on current developments.)
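Two items in the list above, the epsilon for badly conditioned problems and the `O(n^2)` addition of samples, can be illustrated without the library. The sketch below is an assumption-laden illustration, not Friedrich's code: it uses diagonal jitter (adding `epsilon` to every diagonal entry), which is one common way to rescue a failing factorization and may differ from what `cholesky_epsilon` does internally, and it extends an existing Cholesky factor by one row, which is one standard way to add a sample in `O(n^2)`. The names `try_cholesky`, `cholesky_with_jitter` and `append_sample` are hypothetical.

```rust
/// Cholesky factorization K = L * L^T, returning `None` when a pivot is not
/// strictly positive (the matrix is not numerically positive definite).
fn try_cholesky(k: &[Vec<f64>]) -> Option<Vec<Vec<f64>>> {
    let n = k.len();
    let mut l = vec![vec![0.0; n]; n];
    for i in 0..n {
        for j in 0..=i {
            let mut sum = k[i][j];
            for p in 0..j {
                sum -= l[i][p] * l[j][p];
            }
            if i == j {
                if sum <= 0.0 {
                    return None;
                }
                l[i][j] = sum.sqrt();
            } else {
                l[i][j] = sum / l[j][j];
            }
        }
    }
    Some(l)
}

/// Hypothetical helper: adds `epsilon` to every diagonal entry before factorizing.
fn cholesky_with_jitter(k: &[Vec<f64>], epsilon: f64) -> Option<Vec<Vec<f64>>> {
    let mut shifted = k.to_vec();
    for (i, row) in shifted.iter_mut().enumerate() {
        row[i] += epsilon;
    }
    try_cholesky(&shifted)
}

/// Hypothetical helper: extends the factor `l` of an n x n matrix with one new
/// sample in O(n^2). `k_new` holds the covariances between the new sample and
/// the existing ones, `k_self` its own diagonal entry. Returns `false` (leaving
/// `l` untouched) if the extended matrix is not positive definite.
fn append_sample(l: &mut Vec<Vec<f64>>, k_new: &[f64], k_self: f64) -> bool {
    let n = l.len();
    let mut row = vec![0.0; n + 1];
    // Forward substitution: solve L * row[..n] = k_new.
    for j in 0..n {
        let mut sum = k_new[j];
        for p in 0..j {
            sum -= row[p] * l[j][p];
        }
        row[j] = sum / l[j][j];
    }
    let pivot = k_self - row[..n].iter().map(|v| v * v).sum::<f64>();
    if pivot <= 0.0 {
        return false;
    }
    row[n] = pivot.sqrt();
    for existing in l.iter_mut() {
        existing.push(0.0); // new column above the diagonal stays zero
    }
    l.push(row);
    true
}

fn main() {
    // Two identical training inputs and no noise give a singular kernel matrix.
    let singular = vec![vec![1.0, 1.0], vec![1.0, 1.0]];
    println!("plain: {}", try_cholesky(&singular).is_some());
    println!("with jitter: {}", cholesky_with_jitter(&singular, 1e-6).is_some());

    let mut l = try_cholesky(&[vec![4.0, 2.0], vec![2.0, 2.0]]).expect("positive definite");
    let extended = append_sample(&mut l, &[0.6, 0.5], 3.0);
    println!("extended: {} {:?}", extended, l);
}
```

The trade-off of jitter is that it slightly changes the matrix being factorized, which is why an epsilon of this kind is usually kept small and optional.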
Code sample
use friedrich::gaussian_process::GaussianProcess;
// trains a Gaussian process on a dataset of one-dimensional vectors
let training_inputs = vec![vec![0.8], vec![1.2], vec![3.8], vec![4.2]];
let training_outputs = vec![3.0, 4.0, -2.0, -2.0];
let gp = GaussianProcess::default(training_inputs, training_outputs);
// predicts the mean and variance of a single point
let input = vec![1.];
let mean = gp.predict(&input);
let var = gp.predict_variance(&input);
println!("prediction: {} ± {}", mean, var.sqrt());
// makes several predictions
let inputs = vec![vec![1.0], vec![2.0], vec![3.0]];
let outputs = gp.predict(&inputs);
println!("predictions: {:?}", outputs);
// samples from the distribution
let new_inputs = vec![vec![1.0], vec![2.0]];
let sampler = gp.sample_at(&new_inputs);
let mut rng = rand::thread_rng();
println!("samples: {:?}", sampler.sample(&mut rng));
Inputs
Most methods of this library can currently work with the following `input -> output` pairs:
- `Vec<f64> -> f64`: a single, multidimensional, sample
- `Vec<Vec<f64>> -> Vec<f64>`: each inner vector is a training sample
- `DMatrix<f64> -> DVector<f64>`: using a nalgebra matrix with one row per sample
- `ArrayBase<f64, Ix1> -> f64`: a single sample stored in a ndarray array (using the `friedrich_ndarray` feature)
- `ArrayBase<f64, Ix2> -> Array1<f64>`: each row is a sample (using the `friedrich_ndarray` feature)
The Input trait is provided to add your own pairs.
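As a rough illustration of how an input/output pairing like this can be expressed, here is a standard-library-only sketch. It does not reproduce Friedrich's `Input` trait, whose actual methods should be taken from the crate documentation: the trait `SampleInput`, its methods, the `FlatSamples` container and the stand-in model `sum_of_features` are all hypothetical names invented for this example.

```rust
/// Hypothetical stand-in for an input trait: a type that can be viewed as
/// rows of samples and that knows which output type it is paired with.
trait SampleInput {
    type Output;
    /// One inner vector per sample.
    fn to_rows(&self) -> Vec<Vec<f64>>;
    /// Converts one prediction per sample into the paired output type.
    fn from_predictions(predictions: Vec<f64>) -> Self::Output;
}

/// `Vec<f64> -> f64`: a single sample yields a single value.
impl SampleInput for Vec<f64> {
    type Output = f64;
    fn to_rows(&self) -> Vec<Vec<f64>> {
        vec![self.clone()]
    }
    fn from_predictions(predictions: Vec<f64>) -> Self::Output {
        predictions[0]
    }
}

/// `Vec<Vec<f64>> -> Vec<f64>`: one value per inner vector.
impl SampleInput for Vec<Vec<f64>> {
    type Output = Vec<f64>;
    fn to_rows(&self) -> Vec<Vec<f64>> {
        self.clone()
    }
    fn from_predictions(predictions: Vec<f64>) -> Self::Output {
        predictions
    }
}

/// A user-defined pair: samples stored contiguously, `dim` values per sample
/// (`dim` must be greater than zero).
struct FlatSamples {
    data: Vec<f64>,
    dim: usize,
}

impl SampleInput for FlatSamples {
    type Output = Vec<f64>;
    fn to_rows(&self) -> Vec<Vec<f64>> {
        self.data.chunks(self.dim).map(|c| c.to_vec()).collect()
    }
    fn from_predictions(predictions: Vec<f64>) -> Self::Output {
        predictions
    }
}

/// Stand-in for a trained model: simply sums the features of a sample.
fn sum_of_features(sample: &[f64]) -> f64 {
    sample.iter().sum()
}

/// Generic entry point: works with any type implementing `SampleInput`.
fn predict_with<T: SampleInput>(input: &T, model: fn(&[f64]) -> f64) -> T::Output {
    let predictions = input
        .to_rows()
        .iter()
        .map(|row| model(row.as_slice()))
        .collect();
    T::from_predictions(predictions)
}

fn main() {
    let single = vec![1.0, 2.0];
    let batch = vec![vec![1.0], vec![2.0, 3.0]];
    let flat = FlatSamples { data: vec![1.0, 2.0, 3.0, 4.0], dim: 2 };
    println!("{:?}", predict_with(&single, sum_of_features));
    println!("{:?}", predict_with(&batch, sum_of_features));
    println!("{:?}", predict_with(&flat, sum_of_features));
}
```

The associated `Output` type is what lets one generic method return `f64` for a single sample and a vector for a batch, which is the pattern the pairs listed above suggest.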
Why call it Friedrich?
Gaussian Processes are named after the Gaussian distribution, which is itself named after Carl Friedrich Gauss.
Dependencies
~4MB, ~83K SLoC