### 8 releases

| Version | Release date |
|---|---|
| 0.0.8 (new) | Feb 21, 2021 |
| 0.0.7 | Aug 2, 2020 |
| 0.0.6 | Apr 27, 2020 |
| 0.0.1 | Mar 20, 2020 |


**MIT** license


# ndarray-glm

Rust library for solving linear, logistic, and generalized linear models through
iteratively reweighted least squares, using the `ndarray-linalg` module.

## Status

This package is in early alpha and the interface is likely to undergo many changes. Functionality may change from one release to the next.

The regression algorithm uses iteratively re-weighted least squares (IRLS) with a step-halving procedure applied when the next iteration of guesses does not increase the likelihood.
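
As a rough illustration of IRLS with step halving, the following sketch fits a logistic model with an intercept and a single covariate using only the standard library. It is not this crate's implementation: the function names (`log_likelihood`, `irls_step`, `fit_logistic`), the 2x2 Cramer's-rule solve, the cap of 30 halvings, and the convergence test are all assumptions made for the example.

```rust
/// Log-likelihood of a logistic model with intercept b[0] and slope b[1].
fn log_likelihood(x: &[f64], y: &[bool], b: [f64; 2]) -> f64 {
    x.iter()
        .zip(y)
        .map(|(&xi, &yi)| {
            let eta = b[0] + b[1] * xi;
            let yf = if yi { 1.0 } else { 0.0 };
            // y * eta - ln(1 + e^eta) is the Bernoulli log-likelihood with a logit link.
            yf * eta - (1.0 + eta.exp()).ln()
        })
        .sum()
}

/// One IRLS (Newton) update for the canonical logit link.
/// Returns None if the weighted normal equations are numerically singular.
fn irls_step(x: &[f64], y: &[bool], b: [f64; 2]) -> Option<[f64; 2]> {
    // Accumulate X^T W X (symmetric 2x2) and the score X^T (y - mu).
    let (mut a00, mut a01, mut a11) = (0.0_f64, 0.0_f64, 0.0_f64);
    let (mut g0, mut g1) = (0.0_f64, 0.0_f64);
    for (&xi, &yi) in x.iter().zip(y) {
        let mu = 1.0 / (1.0 + (-(b[0] + b[1] * xi)).exp());
        let w = mu * (1.0 - mu);
        let yf = if yi { 1.0 } else { 0.0 };
        let r = yf - mu;
        a00 += w;
        a01 += w * xi;
        a11 += w * xi * xi;
        g0 += r;
        g1 += r * xi;
    }
    let det = a00 * a11 - a01 * a01;
    if det.abs() < 1e-12 {
        return None;
    }
    // Solve the 2x2 system with Cramer's rule.
    let d0 = (a11 * g0 - a01 * g1) / det;
    let d1 = (a00 * g1 - a01 * g0) / det;
    Some([b[0] + d0, b[1] + d1])
}

/// IRLS with step halving: a proposed update is shrunk toward the current
/// guess until it no longer lowers the likelihood.
fn fit_logistic(x: &[f64], y: &[bool], max_iter: usize, tol: f64) -> Option<[f64; 2]> {
    let mut b = [0.0_f64; 2];
    let mut ll = log_likelihood(x, y, b);
    for _ in 0..max_iter {
        let proposal = irls_step(x, y, b)?;
        let mut step = [proposal[0] - b[0], proposal[1] - b[1]];
        let mut next = proposal;
        let mut next_ll = log_likelihood(x, y, next);
        let mut halvings = 0;
        while (next_ll < ll || next_ll.is_nan()) && halvings < 30 {
            step = [0.5 * step[0], 0.5 * step[1]];
            next = [b[0] + step[0], b[1] + step[1]];
            next_ll = log_likelihood(x, y, next);
            halvings += 1;
        }
        if next_ll < ll || next_ll.is_nan() {
            // No acceptable step was found; keep the current guess.
            return Some(b);
        }
        let converged = next_ll - ll < tol;
        b = next;
        ll = next_ll;
        if converged {
            break;
        }
    }
    Some(b)
}
```

Calling `fit_logistic(&x, &y, 50, 1e-12)` on data whose classes overlap returns `Some([intercept, slope])`. The sketch makes no attempt to handle perfectly separable data, where the maximum-likelihood estimate does not exist.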

Much of the logic is done at the type/trait level to avoid compiling code a user does not need and to allow general implementations that the compiler can optimize in trivial cases.
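
To make the type/trait-level idea concrete, here is a minimal sketch of how a model family can select its link function at compile time. All names (`Link`, `Family`, `Identity`, `Logit`, `LinearFamily`, `LogisticFamily`, `predict`) are hypothetical and chosen for illustration; they are not this crate's API.

```rust
/// Hypothetical link-function trait: maps the linear predictor to the mean.
trait Link {
    fn inverse(eta: f64) -> f64;
}

struct Identity;
struct Logit;

impl Link for Identity {
    // Trivial case: after monomorphization this is a no-op the compiler can inline away.
    #[inline]
    fn inverse(eta: f64) -> f64 {
        eta
    }
}

impl Link for Logit {
    #[inline]
    fn inverse(eta: f64) -> f64 {
        1.0 / (1.0 + (-eta).exp())
    }
}

/// Hypothetical model family that fixes its canonical link as an associated type.
trait Family {
    type CanonicalLink: Link;
}

struct LinearFamily;
struct LogisticFamily;

impl Family for LinearFamily {
    type CanonicalLink = Identity;
}

impl Family for LogisticFamily {
    type CanonicalLink = Logit;
}

/// One generic implementation; only the families actually used get compiled.
fn predict<F: Family>(beta: &[f64], x: &[f64]) -> f64 {
    let eta: f64 = beta.iter().zip(x).map(|(b, xi)| b * xi).sum();
    <F::CanonicalLink as Link>::inverse(eta)
}
```

Because `predict::<LinearFamily>` and `predict::<LogisticFamily>` are separate monomorphized functions, a program that only uses one family does not pay for the other.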

## Prerequisites

The recommended approach is to use a system BLAS implementation. For instance, to install OpenBLAS on Debian/Ubuntu:

```sh
sudo apt update && sudo apt install -y libopenblas-dev
```

Then use this crate with the `openblas-system` feature.

To use an alternative backend or to build a static BLAS implementation, refer to the
`ndarray-linalg` documentation. Use this crate with the appropriate feature flag and
it will be forwarded to `ndarray-linalg`.

## Example

To use in your crate, add the following to your `Cargo.toml`:

```toml
ndarray = { version = "0.14", features = ["blas"] }
ndarray-glm = { version = "0.0.8", features = ["openblas-system"] }
```

An example for linear regression is shown below.

```rust
use ndarray_glm::{array, Linear, ModelBuilder, standardize};
// define some test data
let data_y = array![0.3, 1.3, 0.7];
let data_x = array![[0.1, 0.2], [-0.4, 0.1], [0.2, 0.4]];
// The design matrix can optionally be standardized, where the mean of each independent
// variable is subtracted and each is then divided by the standard deviation of that variable.
let data_x = standardize(data_x);
// The interface takes `ArrayView`s to allow for efficient passing of slices.
let model = ModelBuilder::<Linear>::data(data_y.view(), data_x.view()).build()?;
// L2 (ridge) regularization can be applied with l2_reg().
let fit = model.fit_options().l2_reg(1e-5).fit()?;
// Currently the result is a simple array of the MLE estimators, including the intercept term.
println!("Fit result: {}", fit.result);
```
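
The standardization mentioned in the example's comments can be sketched without any dependencies. The helper below, `standardize_columns`, is a hypothetical stand-in that works on rows of `Vec<f64>`; it assumes the population standard deviation (dividing by `n`) and leaves constant columns centered but unscaled. The crate's own `standardize` may make different choices.

```rust
/// Subtract each column's mean and divide by that column's standard deviation.
/// Assumes every row has the same length as the first row.
fn standardize_columns(rows: &[Vec<f64>]) -> Vec<Vec<f64>> {
    let n = rows.len() as f64;
    let n_cols = rows.first().map_or(0, |r| r.len());
    let mut out = rows.to_vec();
    for j in 0..n_cols {
        let mean = rows.iter().map(|r| r[j]).sum::<f64>() / n;
        // Population variance (assumption): divide by n rather than n - 1.
        let var = rows.iter().map(|r| (r[j] - mean).powi(2)).sum::<f64>() / n;
        let sd = var.sqrt();
        for row in out.iter_mut() {
            // A constant column has sd == 0; center it but skip the division.
            row[j] = if sd > 0.0 {
                (row[j] - mean) / sd
            } else {
                row[j] - mean
            };
        }
    }
    out
}
```

Applied to the three rows of `data_x` above, each output column has mean zero and unit variance up to floating-point rounding.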

For logistic regression, the `y` array data must be boolean, and for Poisson
regression it must be an unsigned integer.
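
One way such per-model restrictions on `y` can be enforced at compile time is an associated domain type. The sketch below is illustrative only: the trait `Response`, the marker types `LogisticResponse` and `PoissonResponse`, the choice of `u32`, and the helper `to_floats` are assumptions, not the crate's actual definitions.

```rust
/// Hypothetical trait tying each model to the type its response values must have.
trait Response {
    type Domain: Copy;
    /// Convert a domain value to the float used in the regression arithmetic.
    fn to_float(y: Self::Domain) -> f64;
}

struct LogisticResponse;
struct PoissonResponse;

impl Response for LogisticResponse {
    type Domain = bool;
    fn to_float(y: bool) -> f64 {
        if y { 1.0 } else { 0.0 }
    }
}

impl Response for PoissonResponse {
    // Counts are non-negative, so an unsigned integer type is used here.
    type Domain = u32;
    fn to_float(y: u32) -> f64 {
        f64::from(y)
    }
}

/// Passing the wrong element type for a model is a compile error, not a runtime check.
fn to_floats<R: Response>(ys: &[R::Domain]) -> Vec<f64> {
    ys.iter().map(|&y| R::to_float(y)).collect()
}
```

With this layout, `to_floats::<LogisticResponse>(&[true, false])` compiles, while passing a slice of floats to the same call does not.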

Custom non-canonical link functions can be defined by the user, although the
interface is not particularly ergonomic. See `tests/custom_link.rs` for examples.

## Features

- Linear regression
- Logistic regression
- Generalized linear model IRLS
- Linear offsets
- Generic over floating point type
- Non-float domain types
- L2 (ridge) Regularization
- L1 (lasso) Regularization
  - An experimental smoothed version with an epsilon tolerance is WIP
- Other exponential family distributions
  - Poisson
  - Binomial (nightly only)
  - Exponential
  - Gamma
  - Inverse Gaussian
- Option for data standardization/normalization
- Weighted and correlated regressions
- Non-canonical link functions
- Goodness-of-fit tests

## Reference

These notes on generalized linear models summarize many of the relevant concepts and provide some additional references.
