#optimized #aarch64 #x86-64 #multiplication #cuda #scalar #multiscalar

pasta-msm

Optimized multiscalar multiplication for Pasta moduli for x86_64 and aarch64

5 releases

0.1.4 May 5, 2023
0.1.3 Aug 10, 2022
0.1.2 Jun 26, 2022
0.1.1 May 26, 2022
0.1.0 May 19, 2022

#335 in Machine learning

Download history 108/week @ 2024-01-27 86/week @ 2024-02-03 113/week @ 2024-02-10 163/week @ 2024-02-17 124/week @ 2024-02-24 163/week @ 2024-03-02 165/week @ 2024-03-09 271/week @ 2024-03-16 207/week @ 2024-03-23 189/week @ 2024-03-30 159/week @ 2024-04-06 180/week @ 2024-04-13 121/week @ 2024-04-20 142/week @ 2024-04-27 145/week @ 2024-05-04 132/week @ 2024-05-11

585 downloads per month
Used in 7 crates (4 directly)

Apache-2.0

17KB
323 lines

Pasta Multi-Scalar Multiplication

This is an initial version with a list of planned improvements:

  • parallelize;
  • break down scalars to signed digits to halve the buckets' integration complexity;
  • switch to alternative bucket point representation with faster addition formula;
  • migrate CUDA implementation;
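
The signed-digit item above can be illustrated with a minimal sketch. This is not the crate's code: the names `signed_digits`, `extract_bits` and `reconstruct` are hypothetical, and representing a scalar as little-endian `u64` limbs is an assumption made for the example. Each w-bit window is recoded into a digit in the range -(2^(w-1) - 1) ..= 2^(w-1), so a bucket method needs only 2^(w-1) buckets per window instead of 2^w - 1, because a negative digit can be handled by negating the point before adding it to bucket |d|.

```rust
/// Reads `w` bits of a little-endian limb array starting at bit `start`.
/// Bits beyond the end of the array are treated as zero.
fn extract_bits(limbs: &[u64], start: usize, w: usize) -> u32 {
    let limb = start / 64;
    let shift = start % 64;
    if limb >= limbs.len() {
        return 0;
    }
    let mut v = limbs[limb] >> shift;
    // The window straddles two limbs: pull in the low bits of the next one.
    if shift + w > 64 && limb + 1 < limbs.len() {
        v |= limbs[limb + 1] << (64 - shift);
    }
    (v & ((1u64 << w) - 1)) as u32
}

/// Recodes a scalar into signed base-2^w digits, least significant first.
/// One extra window is emitted so that the final carry always has a home.
pub fn signed_digits(limbs: &[u64], w: usize) -> Vec<i32> {
    assert!((1..=16).contains(&w), "window width must be in 1..=16");
    let windows = limbs.len() * 64 / w + 1;
    let half = 1u32 << (w - 1);
    let mut digits = Vec::with_capacity(windows);
    let mut carry = 0u32;
    for i in 0..windows {
        let raw = extract_bits(limbs, i * w, w) + carry; // in 0..=2^w
        if raw > half {
            // Borrow 2^w from the next window to make this digit non-positive.
            digits.push(raw as i32 - (1i32 << w));
            carry = 1;
        } else {
            digits.push(raw as i32);
            carry = 0;
        }
    }
    debug_assert_eq!(carry, 0);
    digits
}

/// Evaluates sum(d_i * 2^(w*i)); only meant for checking small scalars.
pub fn reconstruct(digits: &[i32], w: usize) -> i128 {
    digits
        .iter()
        .rev()
        .fold(0i128, |acc, &d| acc * (1i128 << w) + d as i128)
}
```

For instance, with w = 4 the scalar 0xFF recodes to the digits -1, 0, 1 (followed by zeros), since -1 + 0·16 + 1·256 = 255.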

To compile with CUDA support, ensure that nvcc, the Nvidia CUDA compiler, is on your program search path. A minimal installation suffices: on Ubuntu, for example, it is enough to install cuda-minimal-build-11-7 instead of the complete cuda package. If your laptop is equipped with a Turing or newer GPU, you are likely to have to compile with --features=cuda-mobile. Caveat lector: the CUDA implementation does not yet adapt to the actual load, so some results may be suboptimal.
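
As a quick way to confirm the nvcc requirement before enabling the CUDA feature, the search path can be probed from Rust. The sketch below is an illustration only and is not taken from the crate's build script; `find_on_path` and `find_nvcc` are hypothetical helper names. It only checks that a file with the given name exists, and does not verify that it is executable or handle a Windows `.exe` suffix.

```rust
use std::env;
use std::ffi::OsStr;
use std::path::PathBuf;

/// Returns the first location in `search_path` that holds a file named `program`.
/// `search_path` uses the platform's PATH syntax (e.g. colon-separated on Linux).
pub fn find_on_path(program: &str, search_path: &OsStr) -> Option<PathBuf> {
    env::split_paths(search_path)
        .map(|dir| dir.join(program))
        .find(|candidate| candidate.is_file())
}

/// Looks for `nvcc` in the directories listed in the `PATH` environment variable.
pub fn find_nvcc() -> Option<PathBuf> {
    let path = env::var_os("PATH")?;
    find_on_path("nvcc", &path)
}
```

If `find_nvcc()` returns `None`, the CUDA toolkit's bin directory is probably missing from `PATH`.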

Dependencies

~1.2–2.5MB
~50K SLoC