ssimulacra2-cuda

Ssimulacra2 implementation running on CUDA

1 unstable release

0.1.0 Oct 12, 2024

Used in 2 crates (via turbo-metrics)

MIT license

150KB
3.5K SLoC

An implementation of ssimulacra2 using CUDA.

Features

  • Stays close to the original reference implementation and produces closely matching results.
  • Leverages many custom kernels written in Rust and a few CUDA NPP primitives.
  • Uses CUDA graphs to reduce the overhead of launching the 200+ kernels needed per image pair.
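
To make the CUDA graphs item above more concrete, here is a CPU-only toy model of the "record once, replay many" pattern. It is a sketch for illustration only: it does not call CUDA, it is not this crate's API, and every name in it (`RecordedGraph`, `record`, `replay`, `build_demo_graph`) is hypothetical. With real CUDA, the recording step is typically done through stream capture (`cudaStreamBeginCapture` / `cudaStreamEndCapture`) and the replay through `cudaGraphLaunch`.

```rust
// CPU-only toy model of "record once, replay many".
// NOT the crate's API and NOT real CUDA; all names are hypothetical.

/// One recorded step, standing in for a single kernel launch.
type Step = Box<dyn Fn(&mut [f32])>;

/// A recorded sequence of steps, standing in for an instantiated CUDA graph.
pub struct RecordedGraph {
    steps: Vec<Step>,
}

impl RecordedGraph {
    pub fn new() -> Self {
        Self { steps: Vec::new() }
    }

    /// Record a step instead of running it immediately (like stream capture).
    pub fn record(&mut self, step: impl Fn(&mut [f32]) + 'static) {
        self.steps.push(Box::new(step));
    }

    /// Run every recorded step, in order, from a single host-side call.
    /// Returns the number of steps that ran.
    pub fn replay(&self, buffer: &mut [f32]) -> usize {
        for step in &self.steps {
            step(buffer);
        }
        self.steps.len()
    }
}

/// Build a graph of `n` trivial steps that each add 1.0 to every element.
pub fn build_demo_graph(n: usize) -> RecordedGraph {
    let mut graph = RecordedGraph::new();
    for _ in 0..n {
        graph.record(|buf: &mut [f32]| {
            for x in buf.iter_mut() {
                *x += 1.0;
            }
        });
    }
    graph
}
```

The point of the pattern is that the per-step submission cost is paid once, while building the graph. Each later image pair then needs only one `replay` call, in the same way that one graph launch replaces 200+ individual kernel launches.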

TODO

  • Investigate whether some computations can be changed to speed up processing without deviating too much from the original implementation, possibly as a configurable option.
  • Write more custom kernels. Is it possible to run the whole computation in a single fused kernel launch?
  • Use less memory (currently 500MB for 1080p); a single fused kernel might make this possible.
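
For a rough sense of scale on the memory item above, the following back-of-the-envelope sketch estimates buffer sizes. It rests on assumptions that are not confirmed for this crate: images are stored as three planes of `f32` samples, the metric works on 6 scales that halve each dimension (rounding up), and "500MB" means 500 × 10^6 bytes. The function names are hypothetical.

```rust
// Back-of-the-envelope buffer sizes. Assumptions (not confirmed for this crate):
// 3 planes of f32 per image, each pyramid level halves width and height (rounding up).

/// Bytes for one 3-plane f32 image of the given size.
pub fn image_bytes(width: u64, height: u64) -> u64 {
    const PLANES: u64 = 3;
    const BYTES_PER_SAMPLE: u64 = 4; // size of an f32
    width * height * PLANES * BYTES_PER_SAMPLE
}

/// Bytes for one such image stored at every level of a `scales`-level pyramid.
pub fn pyramid_bytes(width: u64, height: u64, scales: u32) -> u64 {
    let (mut w, mut h) = (width, height);
    let mut total = 0;
    for _ in 0..scales {
        total += image_bytes(w, h);
        // Halve each dimension, rounding up so odd sizes keep their last row/column.
        w = (w + 1) / 2;
        h = (h + 1) / 2;
    }
    total
}

/// How many whole buffers of `buffer_bytes` fit in `budget_bytes`.
pub fn buffers_in_budget(budget_bytes: u64, buffer_bytes: u64) -> u64 {
    budget_bytes / buffer_bytes
}
```

Under these assumptions, `image_bytes(1920, 1080)` is 24,883,200 bytes, so 500MB corresponds to about 20 full-resolution three-plane buffers. The lower pyramid levels add little by comparison, since `pyramid_bytes(1920, 1080, 6)` is 33,170,400 bytes. This suggests most of the footprint comes from the number of full-resolution intermediates kept alive at once, which is what a fused kernel could reduce.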

Credits

Original reference implementation: https://github.com/cloudinary/ssimulacra2

With inspiration from: https://github.com/rust-av/ssimulacra2
