1 unstable release
0.2.0  Oct 8, 2020 

0.1.0 

kmeans
kmeans is a small and fast library for k-means clustering calculations. Here is a small example, using k-means++ as the initialization method and Lloyd's algorithm as the k-means variant:
use kmeans::*;

fn main() {
    let (sample_cnt, sample_dims, k, max_iter) = (20000, 200, 4, 100);

    // Generate some random data
    let mut samples = vec![0.0f64; sample_cnt * sample_dims];
    samples.iter_mut().for_each(|v| *v = rand::random());

    // Calculate kmeans, using kmean++ as initialization method
    let kmean = KMeans::new(samples, sample_cnt, sample_dims);
    let result = kmean.kmeans_lloyd(k, max_iter, KMeans::init_kmeanplusplus, &KMeansConfig::default());

    println!("Centroids: {:?}", result.centroids);
    println!("ClusterAssignments: {:?}", result.assignments);
    println!("Error: {}", result.distsum);
}
Data structures
For performance reasons, all calculations are done on bare vectors, using handwritten SIMD intrinsics from the packed_simd crate. All vectors are stored row-major, so each sample is stored in a consecutive block of memory.
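To make the row-major layout concrete, here is a minimal sketch of how one sample can be read out of such a flat buffer. The helper names `sample_row` and `squared_distance` are hypothetical and not part of the crate's API, and the distance is computed with plain scalar code rather than SIMD intrinsics.

```rust
/// Returns the `i`-th sample of a row-major buffer as a slice of length `dims`.
/// (Hypothetical helper for illustration; not part of the kmeans crate.)
fn sample_row(samples: &[f64], dims: usize, i: usize) -> &[f64] {
    // Sample i occupies the consecutive range [i * dims, (i + 1) * dims)
    &samples[i * dims..(i + 1) * dims]
}

/// Squared Euclidean distance between two equally sized slices (scalar version).
fn squared_distance(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}
```

Because each sample is one contiguous slice, a distance computation walks memory linearly, which is the kind of access pattern that vectorizes well.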
Supported variants / algorithms
 lloyd (standard kmeans)
 minibatch
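For readers unfamiliar with the lloyd variant, the following is a rough scalar sketch of a single iteration on row-major data. It is an illustration only, not the crate's SIMD implementation; the function name `lloyd_step` and its signature are assumptions made for this example.

```rust
/// One Lloyd iteration on row-major data (hypothetical sketch, not the crate's code).
/// Assigns each sample to its nearest centroid, then moves every centroid to the
/// mean of its assigned samples. Returns the assignments and the sum of squared
/// distances measured against the centroids as they were before the update.
fn lloyd_step(samples: &[f64], dims: usize, centroids: &mut [f64]) -> (Vec<usize>, f64) {
    let k = centroids.len() / dims;
    let mut assignments: Vec<usize> = Vec::with_capacity(samples.len() / dims);
    let mut sums = vec![0.0f64; k * dims];
    let mut counts = vec![0usize; k];
    let mut distsum = 0.0f64;

    for s in samples.chunks_exact(dims) {
        // Assignment step: find the nearest centroid for this sample
        let (mut best, mut best_dist) = (0usize, f64::INFINITY);
        for (c, centroid) in centroids.chunks_exact(dims).enumerate() {
            let d: f64 = s.iter().zip(centroid).map(|(x, y)| (x - y) * (x - y)).sum();
            if d < best_dist {
                best = c;
                best_dist = d;
            }
        }
        assignments.push(best);
        distsum += best_dist;
        counts[best] += 1;
        // Accumulate the sample into the running sum of its cluster
        for (acc, x) in sums[best * dims..(best + 1) * dims].iter_mut().zip(s) {
            *acc += *x;
        }
    }

    // Update step: replace each non-empty cluster's centroid with its mean
    for c in 0..k {
        if counts[c] > 0 {
            for d in 0..dims {
                centroids[c * dims + d] = sums[c * dims + d] / counts[c] as f64;
            }
        }
    }
    (assignments, distsum)
}
```

Repeating this step until the assignments stop changing (or an iteration limit is hit) gives the standard algorithm; a minibatch variant would run the same two steps on a random subset of samples per iteration.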
Supported centroid initialization methods
 k-means++
 random partition
 random sample
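As an illustration of the first method in the list, here is a small scalar sketch of k-means++ seeding on row-major data. It is not the crate's `init_kmeanplusplus`; the name `plusplus_seed` and the `uniform` parameter (a caller-supplied source of values in [0, 1)) are assumptions made so the example needs no external crates.

```rust
/// k-means++ seeding sketch (hypothetical; not the crate's implementation).
/// `uniform` must return values in [0, 1). Returns `k * dims` centroid values,
/// stored row-major like the samples. Assumes `samples` is non-empty.
fn plusplus_seed(
    samples: &[f64],
    dims: usize,
    k: usize,
    mut uniform: impl FnMut() -> f64,
) -> Vec<f64> {
    fn sq_dist(a: &[f64], b: &[f64]) -> f64 {
        a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
    }
    let n = samples.len() / dims;

    // First centroid: a uniformly chosen sample
    let first = ((uniform() * n as f64) as usize).min(n - 1);
    let mut centroids = samples[first * dims..(first + 1) * dims].to_vec();

    // Squared distance from every sample to its nearest centroid chosen so far
    let mut nearest: Vec<f64> = samples
        .chunks_exact(dims)
        .map(|s| sq_dist(s, &centroids[..dims]))
        .collect();

    while centroids.len() < k * dims {
        // Draw a sample with probability proportional to its squared distance
        let total: f64 = nearest.iter().sum();
        let mut threshold = uniform() * total;
        let mut chosen = n - 1; // fallback, e.g. when all distances are zero
        for (i, d) in nearest.iter().enumerate() {
            if threshold < *d {
                chosen = i;
                break;
            }
            threshold -= *d;
        }
        let new_centroid = &samples[chosen * dims..(chosen + 1) * dims];
        centroids.extend_from_slice(new_centroid);

        // Refresh nearest-centroid distances with the newly added centroid
        for (i, s) in samples.chunks_exact(dims).enumerate() {
            nearest[i] = nearest[i].min(sq_dist(s, new_centroid));
        }
    }
    centroids
}
```

By comparison, random sample would simply pick k samples uniformly as centroids, and random partition would assign every sample to a random cluster and use the cluster means; k-means++ spends more work up front to spread the initial centroids apart.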