#matrix #batch #faer #mask #svd #format #linear-algebra #back-end #rayon

single_algebra

A linear algebra convenience library for the single-rust library. Can be used externally as well.

32 releases (5 breaking)

new 0.6.0 May 5, 2025
0.4.0-alpha.1 Apr 29, 2025
0.2.3-alpha.3 Mar 27, 2025
0.1.1-alpha.5 Dec 23, 2024
0.1.1-alpha.1 Nov 10, 2024

#278 in Math

Download history 1/week @ 2025-01-12 326/week @ 2025-01-19 356/week @ 2025-01-26 437/week @ 2025-02-02 92/week @ 2025-02-09 120/week @ 2025-02-16 15/week @ 2025-02-23 396/week @ 2025-03-02 56/week @ 2025-03-09 110/week @ 2025-03-16 322/week @ 2025-03-23 69/week @ 2025-03-30 8/week @ 2025-04-06 184/week @ 2025-04-13 33/week @ 2025-04-20 201/week @ 2025-04-27

426 downloads per month
Used in single_rust

Custom license

150KB
3.5K SLoC

single-algebra 🧮

A powerful linear algebra and machine learning utilities library for Rust, providing efficient matrix operations, dimensionality reduction, and statistical analysis tools.

Features 🚀

  • Efficient Matrix Operations: Support for both dense and sparse matrices (CSR/CSC formats)
  • Dimensionality Reduction: PCA implementations for both dense and sparse matrices
  • SVD Implementations: Multiple SVD backends including LAPACK and Faer
  • Statistical Analysis: Comprehensive statistical operations with batch processing support
  • Similarity Measures: Collection of distance/similarity metrics for high-dimensional data
  • Masking Support: Selective data processing with boolean masks
  • Parallel Processing: Efficient multi-threaded implementations using Rayon
  • Feature-Rich: Configurable through feature flags for specific needs

Matrix Operations 📊

  • SVD Decomposition: Choose between parallel, LAPACK, or Faer implementations
  • Sparse Matrix Support: Comprehensive operations for CSR and CSC sparse matrix formats
  • Masked Operations: Selective data processing with boolean masks
  • Batch Processing: Statistical operations grouped by batch identifiers
  • Normalization: Row and column normalization with customizable targets

Dimensionality Reduction ⬇️

  • PCA Framework: Flexible implementation with customizable SVD backends
  • Dense Matrix PCA: Optimized implementation for dense matrices
  • Sparse Matrix PCA: Memory-efficient PCA for sparse matrices
  • Masked Sparse PCA: Apply PCA on selected features only
  • Incremental Processing: Support for large datasets that don't fit in memory

Similarity Measures 📏

  • Cosine Similarity: Measure similarity based on the cosine of the angle between vectors
  • Euclidean Similarity: Similarity based on Euclidean distance
  • Pearson Similarity: Measure linear correlation between vectors
  • Manhattan Similarity: Similarity based on Manhattan distance
  • Jaccard Similarity: Measure similarity as intersection over union

Statistical Analysis 📈

  • Basic Statistics: Mean, variance, sum, min/max operations
  • Batch Statistics: Compute statistics grouped by batch identifiers
  • Matrix Variance: Efficient variance calculations for matrices
  • Nonzero Counting: Count non-zero elements in sparse matrices
  • Masked Statistics: Compute statistics on selected rows/columns only

Installation

Add this to your Cargo.toml:

[dependencies]
single-algebra = "0.5.0"

Feature Flags

Enable optional features based on your needs:

[dependencies]
single-algebra = { version = "0.5.0", features = ["lapack", "faer"] }

Available features:

  • smartcore: Enable integration with the SmartCore machine learning library
  • lapack: Use the LAPACK backend for linear algebra operations
  • faer: Use the Faer backend for linear algebra operations
  • simba: Enable SIMD optimizations via simba

Usage Examples

Basic PCA with LAPACK Backend

use ndarray::{Array2, ArrayView2};
use single_algebra::dimred::pca::dense::{PCABuilder, LapackSVD};

// Create a sample matrix
let data = array![[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]];

// Build PCA with LAPACK backend
let mut pca = PCABuilder::new(LapackSVD)
    .n_components(2)
    .center(true)
    .scale(false)
    .build();

// Fit and transform data
pca.fit(data.view()).unwrap();
let transformed = pca.transform(data.view()).unwrap();

// Access results
let components = pca.components().unwrap();
let explained_variance = pca.explained_variance_ratio().unwrap();

Sparse Matrix Operations

use nalgebra_sparse::{CooMatrix, CsrMatrix};
use single_algebra::sparse::MatrixSum;

// Create a sparse matrix
let mut coo = CooMatrix::new(3, 3);
coo.push(0, 0, 1.0);
coo.push(1, 1, 2.0);
coo.push(2, 2, 3.0);
let csr: CsrMatrix<f64> = (&coo).into();

// Calculate column sums
let col_sums: Vec<f64> = csr.sum_col().unwrap();

Batch Processing

use nalgebra_sparse::CsrMatrix;
use single_algebra::sparse::BatchMatrixMean;

// Sample data with batch identifiers
let matrix = create_sparse_matrix();
let batches = vec!["batch1", "batch1", "batch2", "batch2", "batch3"];

// Calculate mean per batch
let batch_means = matrix.mean_batch_col(&batches).unwrap();

// Access results for a specific batch
let batch1_means = batch_means.get("batch1").unwrap();

Similarity Measures

use ndarray::Array1;
use single_algebra::similarity::{SimilarityMeasure, CosineSimilarity};

let a = Array1::from_vec(vec![1.0, 2.0, 3.0]);
let b = Array1::from_vec(vec![4.0, 5.0, 6.0]);

let cosine = CosineSimilarity;
let similarity = cosine.calculate(a.view(), b.view());

Performance Considerations

  • For large matrices, consider using sparse representations (CSR/CSC)
  • Enable the appropriate backend (lapack or faer) based on your needs
  • Use masked operations when working with subsets of data
  • Batch processing can significantly improve performance for grouped operations

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

This project is licensed under the BSD 3-Clause License - see the LICENSE.md file for details.

Acknowledgments

  • The LAPACK integration is built upon the nalgebra-lapack crate
  • Some components are inspired by scikit-learn's implementations
  • The Faer backend leverages the high-performance faer crate

Dependencies

~8–21MB
~354K SLoC