# axonml-tensor

## Overview
axonml-tensor provides the core Tensor type for the AxonML framework. Tensors are N-dimensional arrays with support for automatic broadcasting, efficient memory sharing through views, and device-agnostic operations for machine learning computations.
## Features

- **N-Dimensional Arrays** - Create tensors of arbitrary shape with generic element types (`f32`, `f64`, `i32`, etc.).
- **Automatic Broadcasting** - NumPy-style broadcasting for element-wise operations between tensors of different shapes.
- **Efficient Views** - Zero-copy slicing, transposing, and reshaping through stride manipulation, without data copying.
- **Device Agnostic** - Transparent tensor operations across CPU, CUDA, Vulkan, Metal, and WebGPU backends.
- **Rich Operations** - Comprehensive arithmetic, reduction, activation, and matrix operations, including `matmul` with batching support.
- **Factory Functions** - Convenient tensor creation with `zeros`, `ones`, `rand`, `randn`, `arange`, `linspace`, and more.
- **Optimized Concatenation** - `cat` uses a contiguous memcpy per slice for fast tensor joining along any axis.
- **Single-Pass Variance** - `var_dim` computes variance along a dimension in a single pass with Welford's algorithm.
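The single-pass variance is built on Welford's update, which can be sketched in plain Rust. This is a standalone illustration of the algorithm, not the crate's `var_dim` implementation:

```rust
/// Single-pass mean and population variance via Welford's algorithm.
/// Each element updates the running mean, then accumulates the squared
/// deviation against the *updated* mean, so no second pass is needed.
/// Assumes a non-empty input slice.
fn welford_var(data: &[f64]) -> (f64, f64) {
    let (mut mean, mut m2) = (0.0_f64, 0.0_f64);
    for (i, &x) in data.iter().enumerate() {
        let delta = x - mean;
        mean += delta / (i as f64 + 1.0);
        m2 += delta * (x - mean);
    }
    (mean, m2 / data.len() as f64)
}
```

For `&[1.0, 2.0, 3.0, 4.0]` this yields mean `2.5` and variance `1.25`, matching the two-pass result while touching each element once.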
## Modules

| Module | Description |
|---|---|
| `tensor` | Core `Tensor` struct with arithmetic, reduction, activation, and shape operations |
| `shape` | Shape and stride utilities including broadcasting, reshape, and index computation |
| `creation` | Factory functions for tensor initialization (`zeros`, `ones`, `rand`, `randn`, `arange`, `linspace`, `eye`) |
| `view` | Slicing, indexing, and view operations (`select`, `narrow`, `chunk`, `split`) |
| `ops` | Additional operations including softmax, GELU, comparisons, and clipping |
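The zero-copy views handled by the `view` and `shape` modules come down to stride arithmetic: a transpose swaps shape and strides instead of moving data. A small sketch of that idea (illustrative only, not the crate's internal layout):

```rust
/// Row-major strides for a contiguous tensor: the last axis has stride 1,
/// and each earlier axis strides over the product of the later extents.
fn contiguous_strides(shape: &[usize]) -> Vec<usize> {
    let mut strides = vec![1; shape.len()];
    for i in (0..shape.len().saturating_sub(1)).rev() {
        strides[i] = strides[i + 1] * shape[i + 1];
    }
    strides
}

/// Flat buffer offset of a multi-dimensional index under given strides.
fn offset(index: &[usize], strides: &[usize]) -> usize {
    index.iter().zip(strides).map(|(i, s)| i * s).sum()
}
```

A `[2, 3]` tensor has strides `[3, 1]`; its transpose is the same buffer viewed with shape `[3, 2]` and strides `[1, 3]`, so element `(2, 1)` of the transpose reads offset `2*1 + 1*3 = 5`, which is the original element `(1, 2)`.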
## Usage

Add this to your `Cargo.toml`:

```toml
[dependencies]
axonml-tensor = "0.1.0"
```
### Basic Example

```rust
use axonml_tensor::{Tensor, zeros, ones, randn};

// Create tensors
let a = zeros::<f32>(&[2, 3]);
let b = ones::<f32>(&[2, 3]);
let c = randn::<f32>(&[2, 3]);

// Arithmetic operations
let sum = a.add(&b).unwrap();
let product = b.mul(&c).unwrap();
let scaled = c.mul_scalar(2.0);

// Reductions
let total = scaled.sum();
let average = scaled.mean().unwrap();
let maximum = scaled.max().unwrap();
```
### Shape Operations

```rust
use axonml_tensor::Tensor;

let t = Tensor::<f32>::from_vec(
    vec![1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    &[2, 3]
).unwrap();

// Reshape
let flat = t.reshape(&[-1]).unwrap();       // [6]
let reshaped = t.reshape(&[3, 2]).unwrap();

// Transpose
let transposed = t.t().unwrap();            // [3, 2]

// Squeeze and unsqueeze
let unsqueezed = t.unsqueeze(0).unwrap();             // [1, 2, 3]
let squeezed = unsqueezed.squeeze(Some(0)).unwrap();  // [2, 3]
```
### Matrix Operations

```rust
use axonml_tensor::{Tensor, randn};

// Matrix multiplication
let a = Tensor::<f32>::from_vec(vec![1.0, 2.0, 3.0, 4.0], &[2, 2]).unwrap();
let b = Tensor::<f32>::from_vec(vec![5.0, 6.0, 7.0, 8.0], &[2, 2]).unwrap();
let c = a.matmul(&b).unwrap(); // [2, 2]

// Batched matmul
let batch_a = randn::<f32>(&[4, 2, 3]);
let batch_b = randn::<f32>(&[4, 3, 5]);
let batch_c = batch_a.matmul(&batch_b).unwrap(); // [4, 2, 5]

// Dot product
let v1 = Tensor::<f32>::from_vec(vec![1.0, 2.0, 3.0], &[3]).unwrap();
let v2 = Tensor::<f32>::from_vec(vec![4.0, 5.0, 6.0], &[3]).unwrap();
let dot = v1.dot(&v2).unwrap(); // Scalar tensor
```
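Conceptually, a matmul of an `[m, k]` by a `[k, n]` operand is the classic three-level loop nest over flat row-major buffers, and batched matmul repeats it per batch slice. A naive reference version for intuition only (the crate's `matmul` is the optimized path):

```rust
/// Naive row-major matrix multiply: c[i][j] = sum over p of a[i][p] * b[p][j].
fn matmul_ref(a: &[f32], b: &[f32], m: usize, k: usize, n: usize) -> Vec<f32> {
    let mut c = vec![0.0; m * n];
    for i in 0..m {
        for j in 0..n {
            for p in 0..k {
                c[i * n + j] += a[i * k + p] * b[p * n + j];
            }
        }
    }
    c
}
```

With the 2x2 inputs from the example above it produces `[19.0, 22.0, 43.0, 50.0]`.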
### Activation Functions

```rust
use axonml_tensor::Tensor;

let x = Tensor::<f32>::from_vec(vec![-1.0, 0.0, 1.0, 2.0], &[4]).unwrap();

let relu_out = x.relu(); // [0.0, 0.0, 1.0, 2.0]
let sigmoid_out = x.sigmoid();
let tanh_out = x.tanh();
let gelu_out = x.gelu();
let softmax_out = x.softmax(-1);
```
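Softmax is the one activation above that couples all elements along an axis. The standard numerically stable formulation subtracts the maximum before exponentiating so large inputs cannot overflow; a standalone sketch of the math (not the crate's implementation):

```rust
/// Numerically stable softmax over a 1-D slice:
/// softmax(x)_i = exp(x_i - max(x)) / sum_j exp(x_j - max(x)).
/// Subtracting max(x) leaves the result unchanged mathematically but keeps
/// every exponent <= 0, so exp() never overflows.
fn softmax_ref(xs: &[f64]) -> Vec<f64> {
    let max = xs.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
    let exps: Vec<f64> = xs.iter().map(|x| (x - max).exp()).collect();
    let sum: f64 = exps.iter().sum();
    exps.into_iter().map(|e| e / sum).collect()
}
```

Even for inputs like `[1000.0, 1000.0]`, where a naive `exp` would overflow to infinity, this returns `[0.5, 0.5]`.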
### Broadcasting

```rust
use axonml_tensor::Tensor;

// Automatic broadcasting
let a = Tensor::<f32>::from_vec(vec![1.0, 2.0, 3.0], &[3]).unwrap();
let b = Tensor::<f32>::from_vec(vec![10.0], &[1]).unwrap();
let c = a.add(&b).unwrap(); // [11.0, 12.0, 13.0]

// 2D broadcasting
let matrix = Tensor::<f32>::from_vec(vec![1.0; 6], &[2, 3]).unwrap();
let row = Tensor::<f32>::from_vec(vec![1.0, 2.0, 3.0], &[1, 3]).unwrap();
let result = matrix.add(&row).unwrap(); // [2, 3]
```
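The broadcasting rule itself is small: align the two shapes from the trailing dimension, and each pair of extents must either match or one must be 1 (which stretches to the other). A sketch of that rule in plain Rust (the crate's own broadcasting lives in the `shape` module):

```rust
/// NumPy-style broadcast of two shapes; None if they are incompatible.
fn broadcast_shape(a: &[usize], b: &[usize]) -> Option<Vec<usize>> {
    let n = a.len().max(b.len());
    let mut out = vec![0; n];
    for i in 0..n {
        // Align from the trailing axis; missing leading axes act as extent 1.
        let da = if i < a.len() { a[a.len() - 1 - i] } else { 1 };
        let db = if i < b.len() { b[b.len() - 1 - i] } else { 1 };
        out[n - 1 - i] = match (da, db) {
            (x, y) if x == y => x,
            (1, y) => y,
            (x, 1) => x,
            _ => return None, // incompatible extents
        };
    }
    Some(out)
}
```

This reproduces the examples above: `[3]` with `[1]` broadcasts to `[3]`, and `[2, 3]` with `[1, 3]` broadcasts to `[2, 3]`.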
### Lazy Tensors (novel)

Defer computation and let algebraic optimizations simplify your expression before execution.

```rust
use axonml_tensor::lazy::LazyTensor;
use axonml_tensor::Tensor;

// Build expression tree without executing
let a = LazyTensor::from_tensor(Tensor::from_vec(vec![1.0, 2.0, 3.0], &[3]).unwrap());
let b = LazyTensor::from_tensor(Tensor::from_vec(vec![4.0, 5.0, 6.0], &[3]).unwrap());
let result = a.add(&b).mul_scalar(2.0).neg().neg(); // double negation will be eliminated

// Optimize: constant folding, identity elimination, inverse cancellation
let optimized = result.optimize();

// Execute the optimized expression tree
let tensor = optimized.materialize();
```
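The double-negation elimination mentioned above is a pattern rewrite on the expression tree. A toy version on a hypothetical two-node IR shows the shape of such a rule (the crate's `LazyTensor` node type is richer than this):

```rust
/// Toy expression IR: a leaf value or a negation of a subexpression.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Leaf(f64),
    Neg(Box<Expr>),
}

/// Bottom-up simplification: rewrite Neg(Neg(e)) to e.
fn simplify(e: Expr) -> Expr {
    match e {
        Expr::Neg(inner) => match simplify(*inner) {
            Expr::Neg(x) => *x, // inverse cancellation
            other => Expr::Neg(Box::new(other)),
        },
        leaf => leaf,
    }
}
```

Simplifying bottom-up first means nested pairs cancel all the way down, so `Neg(Neg(Neg(x)))` collapses to `Neg(x)` in one pass.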
## Tests

Run the test suite (98 tests):

```sh
cargo test -p axonml-tensor
```
## License

Licensed under either of:

- Apache License, Version 2.0 (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
- MIT license (LICENSE-MIT or http://opensource.org/licenses/MIT)

at your option.