5 stable releases: 3.8.1 (Feb 22, 2024), 3.8.0 (Feb 13, 2024), 3.7.5 (Jan 30, 2024), 0.1.2

SimSIMD 📏
Hardware-Accelerated Similarity Metrics and Distance Functions
- Zero-dependency header-only C 99 library.
- Handles f64 double-, f32 single-, and f16 half-precision, i8 integral, and binary vectors.
- Targets ARM NEON, SVE, x86 AVX2, AVX-512 (VNNI, FP16) hardware backends.
- Bindings for Python, Rust and JavaScript.
- Up to 200x faster than scipy.spatial.distance and numpy.inner.
- Supports more Python versions than SciPy and NumPy.
- Zero-copy compatible with NumPy, PyTorch, TensorFlow, and other tensors.
- Used in USearch and several DBMS products.
Implemented distance functions include:
- Euclidean (L2), Inner Distance, and Cosine (Angular) spatial distances.
- Hamming (~ Manhattan) and Jaccard (~ Tanimoto) binary distances.
- Kullback-Leibler and Jensen-Shannon divergences for probability distributions.
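For orientation, the spatial and probability metrics listed above can be written out as plain reference functions. This is a minimal pure-Python sketch of the textbook definitions, not SimSIMD's implementation; the function names are illustrative, and conventions such as whether the inner product is reported as a similarity or as a distance, or whether Jensen-Shannon is returned as a divergence or its square root, are assumptions that may differ from the library.

```python
import math

def inner_product(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def sqeuclidean(a, b):
    """Squared Euclidean (L2) distance: sum of squared coordinate differences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def cosine_distance(a, b):
    """One minus the cosine of the angle between a and b (both must be non-zero)."""
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    return 1.0 - inner_product(a, b) / (norm_a * norm_b)

def kl_divergence(p, q):
    """Kullback-Leibler divergence; terms with p_i == 0 contribute nothing.
    Assumes q_i > 0 wherever p_i > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetrized KL against the midpoint distribution."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)
```

Functions like these are useful mainly as a slow ground truth when checking the output of an accelerated kernel on small inputs.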
Technical Insights and related articles:
- Uses Horner's method for polynomial approximations, beating GCC 12 by 119x.
- Uses Arm SVE and x86 AVX-512's masked loads to eliminate tail for-loops.
- Uses AVX-512 FP16 for half-precision operations, which few compilers vectorize.
- Substitutes LibC's sqrt calls with bit-hacks using Jan Kadlec's constant.
- For Python, avoids slow PyBind11, SWIG, and even PyArg_ParseTuple for speed.
- For JavaScript, uses typed arrays and N-API for zero-copy calls.
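Horner's method, mentioned in the list above, evaluates a degree-n polynomial with n multiplications and n additions by nesting the terms. The sketch below shows the technique in pure Python; the choice of a degree-10 Taylor polynomial for exp(x) is purely illustrative and is an assumption, not the polynomial SimSIMD uses.

```python
import math

def horner(coefficients, x):
    """Evaluate c[0] + c[1]*x + ... + c[n]*x**n using nested multiplication."""
    result = 0.0
    for c in reversed(coefficients):
        # Fold in one coefficient per step: one multiply, one add.
        result = result * x + c
    return result

# Illustrative coefficients: Taylor series of exp(x) around 0, up to degree 10.
EXP_COEFFICIENTS = [1.0 / math.factorial(k) for k in range(11)]

def exp_approx(x):
    """Approximate exp(x) near 0 with the polynomial above."""
    return horner(EXP_COEFFICIENTS, x)
```

Because each step is a single multiply-add, the same loop maps naturally onto fused multiply-add instructions in a SIMD kernel.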
Benchmarks
Apple M2 Pro
Given 1000 embeddings from OpenAI Ada API with 1536 dimensions, running on the Apple M2 Pro Arm CPU with NEON support, here's how SimSIMD performs against conventional methods:
| Kind | f32 improvement | f16 improvement | i8 improvement | Conventional method | SimSIMD |
| :--- | ---: | ---: | ---: | :--- | :--- |
| Cosine | 32 x | 79 x | 133 x | scipy.spatial.distance.cosine | cosine |
| Euclidean ² | 5 x | 26 x | 17 x | scipy.spatial.distance.sqeuclidean | sqeuclidean |
| Inner Distance | 2 x | 9 x | 18 x | numpy.inner | inner |
| Jensen Shannon | 31 x | 53 x | | scipy.spatial.distance.jensenshannon | jensenshannon |
Intel Sapphire Rapids
On the Intel Sapphire Rapids platform, SimSIMD was benchmarked against auto-vectorized code using GCC 12. GCC handles single-precision float well, but might not be the best choice for int8 and _Float16 arrays, which have been part of the C language since 2011.
| Kind | GCC 12 f32 | GCC 12 f16 | SimSIMD f16 | f16 improvement |
| :--- | ---: | ---: | ---: | ---: |
| Cosine | 3.28 M/s | 336.29 k/s | 6.88 M/s | 20 x |
| Euclidean ² | 4.62 M/s | 147.25 k/s | 5.32 M/s | 36 x |
| Inner Distance | 3.81 M/s | 192.02 k/s | 5.99 M/s | 31 x |
| Jensen Shannon | 1.18 M/s | 18.13 k/s | 2.14 M/s | 118 x |
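The "f16 improvement" column appears to be the ratio of SimSIMD f16 throughput to GCC 12 f16 throughput, rounded to a whole number. The short check below recomputes it from the figures in the table above; the dictionary keys are shorthand chosen for this sketch.

```python
# Throughput in operations per second, copied from the Sapphire Rapids table.
gcc_f16 = {
    "cosine": 336.29e3,
    "sqeuclidean": 147.25e3,
    "inner": 192.02e3,
    "jensenshannon": 18.13e3,
}
simsimd_f16 = {
    "cosine": 6.88e6,
    "sqeuclidean": 5.32e6,
    "inner": 5.99e6,
    "jensenshannon": 2.14e6,
}

# Improvement factor = SimSIMD throughput divided by the GCC baseline.
improvement = {kind: simsimd_f16[kind] / gcc_f16[kind] for kind in gcc_f16}

for kind, ratio in improvement.items():
    print(f"{kind}: {ratio:.1f} x")
```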
Broader Benchmarking Results:
Using SimSIMD in Python
The package is intended to replace the usage of numpy.inner, numpy.dot, and scipy.spatial.distance.
Aside from drastic performance improvements, SimSIMD significantly improves accuracy in mixed-precision setups.
NumPy and SciPy, processing i8 or f16 vectors, will use the same types for accumulators, while SimSIMD can combine i8 enumeration, i16 multiplication, and i32 accumulation to entirely avoid overflows.
The same applies to processing f16 values with f32 precision.
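To see why widening matters, the sketch below simulates two's-complement wraparound in pure Python and compares a dot product that keeps everything in 8 bits against one that uses 16-bit products and a 32-bit accumulator. It only illustrates the overflow argument; the helper names are hypothetical and this is not how NumPy, SciPy, or SimSIMD are implemented internally.

```python
def wrap_signed(value, bits):
    """Wrap an integer into the signed two's-complement range of the given width."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >= 1 << (bits - 1) else value

def dot_i8_narrow(a, b):
    """Dot product where products and the accumulator both stay in 8 bits."""
    acc = 0
    for x, y in zip(a, b):
        acc = wrap_signed(acc + wrap_signed(x * y, 8), 8)
    return acc

def dot_i8_widened(a, b):
    """i8 inputs, i16 products, i32 accumulator."""
    acc = 0
    for x, y in zip(a, b):
        product = wrap_signed(x * y, 16)      # |x * y| <= 128 * 128 fits in i16
        acc = wrap_signed(acc + product, 32)  # 1536 such products fit in i32
    return acc

a = [100] * 1536
b = [100] * 1536
print(dot_i8_narrow(a, b), dot_i8_widened(a, b))
```

With 1536 dimensions the widened accumulator is bounded by 1536 * 16384, which is far below the i32 limit, so no wraparound can occur.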
Installation
pip install simsimd
Distance Between 2 Vectors
import simsimd
import numpy as np
vec1 = np.random.randn(1536).astype(np.float32)
vec2 = np.random.randn(1536).astype(np.float32)
dist = simsimd.cosine(vec1, vec2)
Supported functions include cosine, inner, sqeuclidean, hamming, and jaccard.
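For the two binary metrics, a pure-Python reference over bit-packed bytes can look like the sketch below. It assumes each byte packs eight binary dimensions and uses the textbook definitions (differing bits for Hamming, one minus intersection over union for Jaccard); whether SimSIMD normalizes these values is not stated here, so treat the exact output convention as an assumption.

```python
def popcount(byte):
    """Number of set bits in an integer."""
    return bin(byte).count("1")

def hamming_bits(a, b):
    """Count of differing bits between two equal-length byte strings."""
    return sum(popcount(x ^ y) for x, y in zip(a, b))

def jaccard_bits(a, b):
    """One minus |intersection| / |union| over the set bits; 0.0 if both are empty."""
    intersection = sum(popcount(x & y) for x, y in zip(a, b))
    union = sum(popcount(x | y) for x, y in zip(a, b))
    return 1.0 - intersection / union if union else 0.0

first = bytes([0b11110000])
second = bytes([0b10100101])
print(hamming_bits(first, second), jaccard_bits(first, second))
```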
Distance Between 2 Batches
batch1 = np.random.randn(100, 1536).astype(np.float32)
batch2 = np.random.randn(100, 1536).astype(np.float32)
dist = simsimd.cosine(batch1, batch2)
If either batch has more than one vector, the other batch must have one or the same number of vectors. If it contains just one, the value is broadcast.
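The pairing rule described above can be sketched in pure Python as follows. The helper names pair_up and rowwise are hypothetical and only model the rule as stated (equal row counts, or one side with a single row that is reused); they say nothing about how SimSIMD implements it.

```python
def sqeuclidean(a, b):
    """Squared Euclidean distance, used here only as a sample metric."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def pair_up(batch1, batch2):
    """Yield row pairs: sizes must match, or one batch must hold a single row."""
    n1, n2 = len(batch1), len(batch2)
    if n1 != n2 and 1 not in (n1, n2):
        raise ValueError(f"cannot pair batches of {n1} and {n2} rows")
    for i in range(max(n1, n2)):
        # A single-row batch is reused (broadcast) for every row of the other.
        yield batch1[i if n1 > 1 else 0], batch2[i if n2 > 1 else 0]

def rowwise(metric, batch1, batch2):
    """Apply the metric to each paired row and collect the results."""
    return [metric(r1, r2) for r1, r2 in pair_up(batch1, batch2)]
```

For example, three rows paired against a single row produce three distances, while batches of three and two rows are rejected.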
All Pairwise Distances
For calculating distances between all possible pairs of rows across two matrices (akin to scipy.spatial.distance.cdist):
matrix1 = np.random.randn(1000, 1536).astype(np.float32)
matrix2 = np.random.randn(10, 1536).astype(np.float32)
distances = simsimd.cdist(matrix1, matrix2, metric="cosine")
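As a mental model, an all-pairs computation returns one row of results per row of the first matrix and one column per row of the second. The pure-Python sketch below shows that shape with a hypothetical cdist_reference helper; it is a slow reference for small inputs, not the library's code path.

```python
import math

def cosine_distance(a, b):
    """One minus the cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def cdist_reference(matrix1, matrix2, metric):
    """Return a len(matrix1) x len(matrix2) nested list of pairwise distances."""
    return [[metric(r1, r2) for r2 in matrix2] for r1 in matrix1]

rows = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
columns = [[1.0, 0.0], [0.0, 2.0]]
result = cdist_reference(rows, columns, cosine_distance)
```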
Multithreading
By default, computations use a single CPU core. To use all CPU cores on Linux systems, add the threads=0 argument. Alternatively, specify a custom number of threads:
distances = simsimd.cdist(matrix1, matrix2, metric="cosine", threads=0)
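The threads=0 convention can be pictured as "resolve zero to the number of available cores, then split the rows of the first matrix across workers". The sketch below models that with the standard library; cdist_threaded is a hypothetical helper, and in CPython pure-Python work like this does not actually run faster with threads, so it demonstrates only the work-splitting idea.

```python
import os
from concurrent.futures import ThreadPoolExecutor

def sqeuclidean(a, b):
    """Squared Euclidean distance, used here only as a sample metric."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def cdist_threaded(matrix1, matrix2, metric, threads=1):
    """All-pairs distances; threads=0 is resolved to the machine's core count."""
    workers = (os.cpu_count() or 1) if threads == 0 else threads

    def one_row(r1):
        # Each task computes one full output row.
        return [metric(r1, r2) for r2 in matrix2]

    if workers == 1:
        return [one_row(r1) for r1 in matrix1]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so rows come back in place.
        return list(pool.map(one_row, matrix1))
```

Splitting by rows keeps each task independent, so no locking is needed when writing results.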
Hardware Backend Capabilities
To view a list of hardware backends that SimSIMD supports:
print(simsimd.get_capabilities())
Using Python API with USearch
Want to use it in Python with USearch?
You can wrap the raw C function pointers of SimSIMD backends into a CompiledMetric and pass it to USearch, similar to how it handles Numba's JIT-compiled code.
from usearch.index import Index, CompiledMetric, MetricKind, MetricSignature
from simsimd import pointer_to_sqeuclidean, pointer_to_cosine, pointer_to_inner

metric = CompiledMetric(
    pointer=pointer_to_cosine("f16"),
    kind=MetricKind.Cos,
    signature=MetricSignature.ArrayArraySize,
)
index = Index(256, metric=metric)
Using SimSIMD in Rust
To install, add the following to your Cargo.toml:
[dependencies]
simsimd = "..."
To use it:
use simsimd::{cosine, sqeuclidean};

fn main() {
    let vector_a = vec![1.0, 2.0, 3.0];
    let vector_b = vec![4.0, 5.0, 6.0];

    let distance = cosine(&vector_a, &vector_b);
    println!("Cosine Distance: {}", distance);

    let distance = sqeuclidean(&vector_a, &vector_b);
    println!("Squared Euclidean Distance: {}", distance);
}
Using SimSIMD in JavaScript
To install, choose one of the following options depending on your environment:
npm install --save simsimd
yarn add simsimd
pnpm add simsimd
bun install simsimd
The package is distributed with prebuilt binaries for Node.js v10 and above for Linux (x86_64, arm64), macOS (x86_64, arm64), and Windows (i386, x86_64).
If your platform is not supported, you can build the package from source via npm run build. This will happen automatically unless you install the package with the --ignore-scripts flag or use Bun.
After you install it, you will be able to call the SimSIMD functions on various TypedArray variants:
const { sqeuclidean, cosine, inner, hamming, jaccard } = require('simsimd');
const vectorA = new Float32Array([1.0, 2.0, 3.0]);
const vectorB = new Float32Array([4.0, 5.0, 6.0]);
const distance = sqeuclidean(vectorA, vectorB);
console.log('Squared Euclidean Distance:', distance);
Using SimSIMD in C
For integration within a CMake-based project, add the following segment to your CMakeLists.txt:
# FetchContent must be included before its commands can be used
include(FetchContent)
FetchContent_Declare(
    simsimd
    GIT_REPOSITORY https://github.com/ashvardanian/simsimd.git
    GIT_SHALLOW TRUE
)
FetchContent_MakeAvailable(simsimd)
If you're aiming to utilize the _Float16 functionality with SimSIMD, ensure your development environment is compatible with C 11.
For other functionalities of SimSIMD, C 99 compatibility will suffice.
A minimal usage example would be:
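#include <simsimd/simsimd.h>

int main() {
    simsimd_f32_t vector_a[1536];
    simsimd_f32_t vector_b[1536];
    // Placeholder values; replace with your own data before computing
    for (int i = 0; i != 1536; ++i) vector_a[i] = 1.0f, vector_b[i] = 2.0f;
    // Calls the AVX-512 backend directly, so the CPU must support it
    simsimd_f32_t distance = simsimd_avx512_f32_cos(vector_a, vector_b, 1536);
    return 0;
}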
All function names follow the same pattern: simsimd_{backend}_{type}_{metric}.
- The backend can be avx512, avx2, neon, or sve.
- The type can be f64, f32, f16, i8, or b8.
- The metric can be cos, ip, l2sq, hamming, jaccard, kl, or js.
In case you want to avoid hard-coding the backend, you can use the simsimd_metric_punned_t to pun the function pointer, and the simsimd_capabilities function to get the available backends at runtime.
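The idea of choosing a backend at runtime can be sketched as a lookup over the naming pattern. The Python model below builds a kernel name from the simsimd_{backend}_{type}_{metric} pattern and picks the first backend the machine reports as available; the preference order, the helper names, and the use of a plain set to stand in for the capabilities query are all assumptions made for illustration, not the library's dispatch logic.

```python
# Hypothetical preference order: wider vector extensions first.
BACKEND_PREFERENCE = ["avx512", "avx2", "sve", "neon"]

def kernel_name(backend, dtype, metric):
    """Compose a function name following simsimd_{backend}_{type}_{metric}."""
    return f"simsimd_{backend}_{dtype}_{metric}"

def pick_kernel(available_backends, dtype, metric):
    """Return the kernel name for the most preferred available backend."""
    for backend in BACKEND_PREFERENCE:
        if backend in available_backends:
            return kernel_name(backend, dtype, metric)
    raise LookupError("no supported backend is available")

print(pick_kernel({"avx2", "avx512"}, "f32", "cos"))
```

In C, the same selection would be done once at startup, storing the chosen function behind a single pointer so the hot loop pays no per-call dispatch cost.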