
sys no-std simsimd

Fastest SIMD-Accelerated Vector Similarity Functions for x86 and Arm

5 stable releases

new 3.8.1 Feb 22, 2024
3.8.0 Feb 13, 2024
3.7.5 Jan 30, 2024
0.1.2 Jan 27, 2024


Apache-2.0

120KB
2K SLoC

C 678 SLoC // 0.1% comments
Python 537 SLoC // 0.1% comments
JavaScript 233 SLoC // 0.0% comments
Rust 209 SLoC
TypeScript 187 SLoC // 0.3% comments
C++ 174 SLoC // 0.0% comments
Go 106 SLoC // 0.2% comments
Jupyter Notebooks 25 SLoC
Shell 7 SLoC

SimSIMD 📏

Hardware-Accelerated Similarity Metrics and Distance Functions

Implemented distance functions include:

  • Euclidean (L2), Inner Distance, and Cosine (Angular) spatial distances.
  • Hamming (~ Manhattan) and Jaccard (~ Tanimoto) binary distances.
  • Kullback-Leibler and Jensen–Shannon divergences for probability distributions.
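For orientation, the metrics listed above can be written out as plain, unoptimized Python. This is a reference sketch of the textbook definitions, not SimSIMD's implementation. The function names mirror the metric names, but the exact conventions are assumptions: SimSIMD may normalize differently, may return an inner distance rather than a raw inner product, and SciPy's jensenshannon returns the square root of the divergence computed here.

```python
import math

def inner(a, b):
    """Inner (dot) product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def sqeuclidean(a, b):
    """Squared Euclidean (L2) distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def cosine(a, b):
    """Cosine distance: one minus the cosine of the angle between a and b."""
    return 1.0 - inner(a, b) / math.sqrt(inner(a, a) * inner(b, b))

def hamming(a, b):
    """Number of differing bits between two equal-length byte strings."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

def jaccard(a, b):
    """One minus (shared set bits / total set bits) of two byte strings."""
    shared = sum(bin(x & y).count("1") for x, y in zip(a, b))
    total = sum(bin(x | y).count("1") for x, y in zip(a, b))
    return 1.0 - shared / total if total else 0.0

def kl(p, q):
    """Kullback-Leibler divergence in nats; assumes q > 0 wherever p > 0."""
    return sum(x * math.log(x / y) for x, y in zip(p, q) if x > 0)

def js(p, q):
    """Jensen-Shannon divergence in nats (not its square root)."""
    mean = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, mean) + 0.5 * kl(q, mean)
```

Binary vectors are passed as bytes objects here, eight dimensions per byte, which is only one possible packing.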


Benchmarks

Apple M2 Pro

Given 1000 embeddings from OpenAI Ada API with 1536 dimensions, running on the Apple M2 Pro Arm CPU with NEON support, here's how SimSIMD performs against conventional methods:

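| Kind           | f32 improvement | f16 improvement | i8 improvement | Conventional method                  | SimSIMD       |
|----------------|-----------------|-----------------|----------------|--------------------------------------|---------------|
| Cosine         | 32 x            | 79 x            | 133 x          | scipy.spatial.distance.cosine        | cosine        |
| Euclidean²     | 5 x             | 26 x            | 17 x           | scipy.spatial.distance.sqeuclidean   | sqeuclidean   |
| Inner Distance | 2 x             | 9 x             | 18 x           | numpy.inner                          | inner         |
| Jensen Shannon | 31 x            | 53 x            |                | scipy.spatial.distance.jensenshannon | jensenshannon |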

Intel Sapphire Rapids

On the Intel Sapphire Rapids platform, SimSIMD was benchmarked against code auto-vectorized by GCC 12. GCC handles single-precision float well, but might not be the best choice for int8 and _Float16 arrays. The _Float16 type was specified in ISO/IEC TS 18661-3 (2015), an extension to C11.

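| Kind           | GCC 12 f32 | GCC 12 f16 | SimSIMD f16 | f16 improvement |
|----------------|------------|------------|-------------|-----------------|
| Cosine         | 3.28 M/s   | 336.29 k/s | 6.88 M/s    | 20 x            |
| Euclidean²     | 4.62 M/s   | 147.25 k/s | 5.32 M/s    | 36 x            |
| Inner Distance | 3.81 M/s   | 192.02 k/s | 5.99 M/s    | 31 x            |
| Jensen Shannon | 1.18 M/s   | 18.13 k/s  | 2.14 M/s    | 118 x           |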
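The "f16 improvement" column appears to be the ratio of SimSIMD f16 throughput to GCC 12 f16 throughput, rounded to the nearest integer. A short check of that arithmetic, using only the figures from the table above (the variable and function names are illustrative):

```python
# (GCC 12 f16, SimSIMD f16) throughput in operations per second, from the table.
throughput = {
    "Cosine": (336.29e3, 6.88e6),
    "Squared Euclidean": (147.25e3, 5.32e6),
    "Inner Distance": (192.02e3, 5.99e6),
    "Jensen Shannon": (18.13e3, 2.14e6),
}

def improvement(baseline, accelerated):
    """Speedup factor, rounded to the nearest whole multiple."""
    return round(accelerated / baseline)

improvements = {kind: improvement(*pair) for kind, pair in throughput.items()}
for kind, factor in improvements.items():
    print(f"{kind}: {factor} x")
```

The computed factors match the 20 x, 36 x, 31 x, and 118 x reported in the table.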


Using SimSIMD in Python

The package is intended to replace the usage of numpy.inner, numpy.dot, and scipy.spatial.distance. Aside from drastic performance improvements, SimSIMD significantly improves accuracy in mixed precision setups. NumPy and SciPy, processing i8 or f16 vectors, will use the same types for accumulators, while SimSIMD can combine i8 enumeration, i16 multiplication, and i32 accumulation to entirely avoid overflows. The same applies to processing f16 values with f32 precision.
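To see why the accumulator width matters, here is a small pure-Python simulation. It is not how NumPy or SimSIMD are implemented; wrap_i8 is a hypothetical helper that mimics two's-complement wraparound of an 8-bit accumulator, and Python's unbounded integers stand in for the wider i16 products and i32 sums.

```python
def wrap_i8(value):
    """Wrap an integer into the signed 8-bit range [-128, 127]."""
    return (value + 128) % 256 - 128

def dot_narrow(a, b):
    """Dot product where every product and partial sum is kept in 8 bits."""
    acc = 0
    for x, y in zip(a, b):
        acc = wrap_i8(acc + wrap_i8(x * y))
    return acc

def dot_widened(a, b):
    """Dot product with widened products and a widened accumulator."""
    acc = 0
    for x, y in zip(a, b):
        acc += x * y  # each i8 product fits in i16; the sum here fits in i32
    return acc

a = [100, 100, 100, 100]
b = [100, 100, 100, 100]
narrow = dot_narrow(a, b)   # 64: every product 10000 wraps to 16
wide = dot_widened(a, b)    # 40000: the mathematically correct result
print(narrow, wide)
```

Even four dimensions are enough for the narrow accumulator to return a meaningless value, which is the failure mode that widened accumulation avoids.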

Installation

pip install simsimd

Distance Between 2 Vectors

import simsimd
import numpy as np

vec1 = np.random.randn(1536).astype(np.float32)
vec2 = np.random.randn(1536).astype(np.float32)
dist = simsimd.cosine(vec1, vec2)

Supported functions include cosine, inner, sqeuclidean, hamming, and jaccard.

Distance Between 2 Batches

batch1 = np.random.randn(100, 1536).astype(np.float32)
batch2 = np.random.randn(100, 1536).astype(np.float32)
dist = simsimd.cosine(batch1, batch2)

If either batch has more than one vector, the other batch must have either one vector or the same number of vectors. If it contains just one, that vector is broadcast against every vector of the other batch.
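The batch-size rule can be sketched as a small helper. The function names are hypothetical and only model the pairing logic described above, not SimSIMD's internals.

```python
def paired_output_count(count_a, count_b):
    """Number of distances produced for two batches under the rule above."""
    if count_a == count_b:
        return count_a
    if count_a == 1:
        return count_b
    if count_b == 1:
        return count_a
    raise ValueError(
        f"incompatible batch sizes: {count_a} and {count_b}"
    )

def pair_rows(batch_a, batch_b):
    """Pair rows one-to-one, repeating a single-row batch as needed."""
    count = paired_output_count(len(batch_a), len(batch_b))
    rows_a = batch_a * count if len(batch_a) == 1 else batch_a
    rows_b = batch_b * count if len(batch_b) == 1 else batch_b
    return list(zip(rows_a, rows_b))

pairs = pair_rows([[1.0, 2.0]], [[3.0, 4.0], [5.0, 6.0], [7.0, 8.0]])
print(len(pairs))  # one query row paired with each of the three rows
```

Two batches of 100 vectors therefore yield 100 distances, and a batch of 1 against a batch of 100 also yields 100, while 3 against 5 is rejected.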

All Pairwise Distances

For calculating distances between all possible pairs of rows across two matrices (akin to scipy.spatial.distance.cdist):

matrix1 = np.random.randn(1000, 1536).astype(np.float32)
matrix2 = np.random.randn(10, 1536).astype(np.float32)
distances = simsimd.cdist(matrix1, matrix2, metric="cosine")
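Unlike the row-by-row batch call, an all-pairs computation produces one distance for every combination of rows. A naive pure-Python reference, assuming the usual cdist layout where the result has one row per row of the first matrix and one column per row of the second (cdist_reference and cosine_distance are illustrative names, not part of SimSIMD):

```python
import math

def cosine_distance(a, b):
    """One minus the cosine of the angle between a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def cdist_reference(rows_a, rows_b, metric=cosine_distance):
    """Distance from every row of rows_a to every row of rows_b."""
    return [[metric(a, b) for b in rows_b] for a in rows_a]

small1 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
small2 = [[1.0, 0.0], [0.0, 2.0]]
reference = cdist_reference(small1, small2)
print(len(reference), len(reference[0]))  # 3 rows, 2 columns
```

With the 1000-row and 10-row matrices above, the same layout would give a 1000 by 10 result.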

Multithreading

By default, computations use a single CPU core. On Linux, pass threads=0 to use all available CPU cores, or pass a positive integer to request a specific number of threads:

distances = simsimd.cdist(matrix1, matrix2, metric="cosine", threads=0)

Hardware Backend Capabilities

To view a list of hardware backends that SimSIMD supports:

print(simsimd.get_capabilities())

Using Python API with USearch

Want to use it in Python with USearch? You can wrap the raw C function pointers of the SimSIMD backends into a CompiledMetric and pass it to USearch, similar to how it handles Numba's JIT-compiled code.

from usearch.index import Index, CompiledMetric, MetricKind, MetricSignature
from simsimd import pointer_to_sqeuclidean, pointer_to_cosine, pointer_to_inner

metric = CompiledMetric(
    pointer=pointer_to_cosine("f16"),
    kind=MetricKind.Cos,
    signature=MetricSignature.ArrayArraySize,
)

index = Index(256, metric=metric)

Using SimSIMD in Rust

To install, add the following to your Cargo.toml:

[dependencies]
simsimd = "..."

To use it:

use simsimd::{cosine, sqeuclidean};

fn main() {
    let vector_a = vec![1.0, 2.0, 3.0];
    let vector_b = vec![4.0, 5.0, 6.0];

    let distance = cosine(&vector_a, &vector_b);
    println!("Cosine Distance: {}", distance);

    let distance = sqeuclidean(&vector_a, &vector_b);
    println!("Squared Euclidean Distance: {}", distance);
}

Using SimSIMD in JavaScript

To install, choose one of the following options depending on your environment:

  • npm install --save simsimd
  • yarn add simsimd
  • pnpm add simsimd
  • bun install simsimd

The package is distributed with prebuilt binaries for Node.js v10 and above for Linux (x86_64, arm64), macOS (x86_64, arm64), and Windows (i386, x86_64).

If your platform is not supported, you can build the package from source via npm run build. This happens automatically unless you install the package with the --ignore-scripts flag or use Bun.

After you install it, you will be able to call the SimSIMD functions on various TypedArray variants:

const { sqeuclidean, cosine, inner, hamming, jaccard } = require('simsimd');

const vectorA = new Float32Array([1.0, 2.0, 3.0]);
const vectorB = new Float32Array([4.0, 5.0, 6.0]);

const distance = sqeuclidean(vectorA, vectorB);
console.log('Squared Euclidean Distance:', distance);

Using SimSIMD in C

For integration within a CMake-based project, add the following segment to your CMakeLists.txt:

FetchContent_Declare(
    simsimd
    GIT_REPOSITORY https://github.com/ashvardanian/simsimd.git
    GIT_SHALLOW TRUE
)
FetchContent_MakeAvailable(simsimd)

If you want to use the _Float16 functionality of SimSIMD, make sure your toolchain supports C11. For everything else, C99 is sufficient. A minimal usage example would be:

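#include <stdio.h>
#include <simsimd/simsimd.h>

int main() {
    simsimd_f32_t vector_a[1536];
    simsimd_f32_t vector_b[1536];
    // Fill both vectors so the kernel never reads uninitialized memory.
    for (int i = 0; i < 1536; ++i) {
        vector_a[i] = 1.0f;
        vector_b[i] = (simsimd_f32_t)(i + 1);
    }
    // Hard-coded backend: this call needs a CPU and build flags with AVX-512.
    simsimd_f32_t distance = simsimd_avx512_f32_cos(vector_a, vector_b, 1536);
    printf("Cosine distance: %f\n", distance);
    return 0;
}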

All function names follow the same pattern: simsimd_{backend}_{type}_{metric}.

  • The backend can be avx512, avx2, neon, or sve.
  • The type can be f64, f32, f16, i8, or b8.
  • The metric can be cos, ip, l2sq, hamming, jaccard, kl, or js.
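The naming pattern is regular enough to generate mechanically. The sketch below only builds candidate names from the three lists above; symbol_name is a hypothetical helper, and not every combination necessarily exists in the library (for example, a bit-level metric such as hamming is unlikely to be defined for f32).

```python
from itertools import product

BACKENDS = ("avx512", "avx2", "neon", "sve")
TYPES = ("f64", "f32", "f16", "i8", "b8")
METRICS = ("cos", "ip", "l2sq", "hamming", "jaccard", "kl", "js")

def symbol_name(backend, dtype, metric):
    """Compose a name following simsimd_{backend}_{type}_{metric}."""
    if backend not in BACKENDS:
        raise ValueError(f"unknown backend: {backend}")
    if dtype not in TYPES:
        raise ValueError(f"unknown type: {dtype}")
    if metric not in METRICS:
        raise ValueError(f"unknown metric: {metric}")
    return f"simsimd_{backend}_{dtype}_{metric}"

# Every name the pattern can produce, whether or not the library defines it.
candidates = [symbol_name(*combo) for combo in product(BACKENDS, TYPES, METRICS)]
print(symbol_name("avx512", "f32", "cos"))
```

The printed name is the same function used in the C example above.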

To avoid hard-coding the backend, you can use the simsimd_metric_punned_t type to pun the function pointer and the simsimd_capabilities function to query the available backends at runtime.
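The idea behind runtime dispatch can be illustrated in a few lines of Python. This is a conceptual sketch only: the capabilities mapping, the preference order, and the "portable" fallback name are all assumptions made for illustration, and the real C API works with simsimd_capabilities and punned function pointers rather than dictionaries.

```python
# Hypothetical preference order, widest vectors first.
PREFERENCE = ("avx512", "avx2", "sve", "neon")

def pick_backend(capabilities, preference=PREFERENCE, fallback="portable"):
    """Return the first preferred backend that the machine reports as available."""
    for name in preference:
        if capabilities.get(name, False):
            return name
    return fallback

def dispatch(capabilities, kernels, fallback_kernel):
    """Select a kernel once, so later calls skip the capability check."""
    backend = pick_backend(capabilities)
    return kernels.get(backend, fallback_kernel)

# Stand-in for a runtime capability query on some x86 machine.
detected = {"avx512": False, "avx2": True, "sve": False, "neon": False}
chosen = pick_backend(detected)
print(chosen)
```

Resolving the kernel once and reusing the result keeps the capability check out of the hot loop, which is the main reason to hold on to a function pointer instead of re-detecting on every call.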
