#optimized #aarch64 #x86-64 #multiplication #cuda #scalar #multiscalar

pasta-msm

Optimized multiscalar multiplication for Pasta moduli for x86_64 and aarch64

5 releases

0.1.4 May 5, 2023
0.1.3 Aug 10, 2022
0.1.2 Jun 26, 2022
0.1.1 May 26, 2022
0.1.0 May 19, 2022

#339 in Machine learning

Download history: 74/week @ 2023-12-18, 21/week @ 2023-12-25, 145/week @ 2024-01-01, 152/week @ 2024-01-08, 213/week @ 2024-01-15, 109/week @ 2024-01-22, 115/week @ 2024-01-29, 90/week @ 2024-02-05, 130/week @ 2024-02-12, 137/week @ 2024-02-19, 135/week @ 2024-02-26, 176/week @ 2024-03-04, 162/week @ 2024-03-11, 277/week @ 2024-03-18, 200/week @ 2024-03-25, 234/week @ 2024-04-01

885 downloads per month
Used in 6 crates (4 directly)

Apache-2.0

17KB
323 lines

Pasta Multi-Scalar Multiplication

This is an initial version with a list of planned improvements:

  • parallelize;
  • break down scalars to signed digits to halve the buckets' integration complexity;
  • switch to alternative bucket point representation with faster addition formula;
  • migrate CUDA implementation.
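
To illustrate the signed-digit item in the list above, here is a minimal sketch of how a scalar can be split into signed base-2^c digits. It is not this crate's code: the function names `signed_digits` and `recompose` are hypothetical, and it uses toy `u64` scalars rather than the roughly 255-bit Pasta scalars. The point it shows is that signed digits have magnitude at most 2^(c-1), so a bucket method needs about 2^(c-1) buckets per window instead of 2^c - 1, with negative digits handled by negating the point before adding it to its bucket.

```rust
/// Hypothetical helper, not part of pasta-msm: split `scalar` into signed
/// base-2^c digits, least significant first. Every digit d satisfies
/// -(2^(c-1) - 1) <= d <= 2^(c-1), and the digits sum back to `scalar`
/// when weighted by powers of 2^c.
fn signed_digits(scalar: u64, c: u32) -> Vec<i64> {
    assert!((2..=16).contains(&c), "window width is assumed to be 2..=16");
    let base = 1i64 << c;
    let half = base >> 1;
    let mask = base as u128 - 1;
    let mut digits = Vec::new();
    // Widen to u128 so that adding a carry can never overflow.
    let mut rest = scalar as u128;
    while rest != 0 {
        let mut d = (rest & mask) as i64;
        rest >>= c;
        if d > half {
            // Replace d by d - 2^c and carry one into the next window.
            d -= base;
            rest += 1;
        }
        digits.push(d);
    }
    digits
}

/// Hypothetical helper: rebuild the scalar from its signed digits
/// (Horner evaluation, most significant digit first).
fn recompose(digits: &[i64], c: u32) -> i128 {
    digits
        .iter()
        .rev()
        .fold(0i128, |acc, &d| (acc << c) + d as i128)
}
```

For example, `signed_digits(0xFF, 4)` yields `[-1, 0, 1]`, since 255 = 1·16² + 0·16 − 1, whereas the unsigned digits would be `[15, 15]` and would touch the largest bucket twice.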

To compile with CUDA support, ensure that nvcc, the NVIDIA CUDA compiler, is on your program search path. A minimal installation suffices. For example, on Ubuntu it is sufficient to install cuda-minimal-build-11-7 instead of the complete cuda package. If your laptop is equipped with a Turing+ controller, you will likely have to compile with --features=cuda-mobile. Caveat lector: the CUDA implementation does not yet adapt to the actual load, so some results may be suboptimal.

Dependencies

~1.2–2.6MB
~51K SLoC