Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor(metal_mont_backend): adapted from https://github.com/geometryxyz/msl-secp256k1 #13

Open
wants to merge 21 commits into
base: main
Choose a base branch
from

Conversation

moven0831
Copy link
Collaborator

Overview

This PR introduces a well-organized structure for further MSM implementation with Metal and adapts a more efficient EC metal library from Zprize 2023 winner, Tal and Koh's work of MSM with WebGPU.

Related Issue

Key Changes

Core Implementation

  1. Refactored Metal MSM Module

    • Added new metal_msm module with host, shader, and utils submodules
      • The metal_msm aims to replace the previous metal directory as a refactored version. The metal directory will be deleted once the metal_msm is implemented.
    • Implemented basic Metal state management and GPU operations
  2. Better EC backend adapted from Zprize 2023 winner

    • BigInt Operations
      • New BigInt and BigIntWide structs with arithmetic operations
      • New Metal shaders for BigInt addition and subtraction
    • Field Operations
      • New field arithmetic operations (addition, subtraction)
      • Implemented BN254 field operations with updated modulus
      • Added Montgomery multiplication implementations (CIOS, modified, optimized)
    • Elliptic Curve Operations
      • Implemented ECPoint class with Jacobian coordinate operations
      • Added functions for point addition and doubling

Testing & Benchmarking

  1. Test Suites

    • BigInt operations tests
    • Field arithmetic tests
    • Montgomery multiplication tests
    • Curve operation tests
  2. Benchmarking

    • Added benchmarking modules for Montgomery multiplication
    • Implemented performance testing for different multiplication variants

For the BN254 curve we're targeting, the modified mont_mul with a 15-bit limb size appears to work best. This makes sense intuitively: with a 15-bit limb size, there are 17 limbs, resulting in 15 * 17 = 255 bits, which is closest to the 254-bit base field modulus of BN254.

Benchmark result:

running 1 test
=== benchmarking mont_mul_modified ===
benchmark for 11-bit limbs took 306ms
benchmark for 12-bit limbs took 174ms
benchmark for 13-bit limbs took 139ms
benchmark for 14-bit limbs took 144ms
benchmark for 15-bit limbs took 128ms
benchmark for 16-bit limbs: Benchmark failed: results do not match expected values

=== benchmarking mont_mul_optimised ===
benchmark for 11-bit limbs took 281ms
benchmark for 12-bit limbs took 172ms
benchmark for 13-bit limbs took 137ms
benchmark for 14-bit limbs took 137ms
benchmark for 15-bit limbs: Benchmark failed: results do not match expected values
benchmark for 16-bit limbs: Benchmark failed: results do not match expected values

=== benchmarking mont_mul_cios ===
benchmark for 11-bit limbs: Benchmark failed: results do not match expected values
benchmark for 12-bit limbs: Benchmark failed: results do not match expected values
benchmark for 13-bit limbs took 368ms
benchmark for 14-bit limbs took 354ms
benchmark for 15-bit limbs took 259ms
benchmark for 16-bit limbs took 129ms
test msm::metal_msm::tests::mont_backend::mont_benchmarks::all_benchmarks ... ok

Utility Functions

  • Added limb conversion utilities from BigInt to arbitrary limb size format in Vec
  • Implemented Montgomery parameter calculations

Dependencies

Updated various dependencies including:

  • num: 0.4.0
  • num-bigint: 0.4.3

Next Steps

  1. Review and merge into main branch
  2. Build new MSM algorithms with metal on top of this ec backend
  3. Consider additional optimizations based on benchmark results

Reference

@moven0831 moven0831 self-assigned this Nov 8, 2024
@moven0831 moven0831 linked an issue Nov 8, 2024 that may be closed by this pull request
Copy link

@yaroslavyaroslav yaroslavyaroslav left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Haven't reviewed any metal file, bc I believe they're just moved rather modified, so I consider them out of scope this particular PR.

These build.rs although being nice to have not a critical part of PR, so in case of tight deadlines if any we can put in a separate PR, I can even take it on my own afterwards.

let mut limb_idx = 0;

for &byte in bytes.iter() {
val |= (byte as u32) << bits;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd consider to operate by a whole word here rather than a single byte. However I think this optimisation should be a part of a separate PR.

pub enum MetalError {
#[error("Couldn't find a system default device for Metal")]
DeviceNotFound(),
#[error("Couldn't create a new Metal library: {0}")]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like this one is used within metalcompiller.rs, so I believe it could be removed due to reasoning we have discussed in DM

use std::string::String;

pub fn compile_metal(path_from_cargo_manifest_dir: &str, input_filename: &str) -> String {
let input_path = PathBuf::from(env!("CARGO_MANIFEST_DIR"))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just saving here all the main points of the talk we have had in DM recently.

The metal building code is pretty similar to what I've made for my internal use in #14 so it's perfectly fine to keep it as it is except for two points:

  1. Move the code in build.rs as will enable it building during the library itself build stage, allowing cargo to handle incremental builds of metal shaders effortlessly.
  2. It worth it to keep built library in target directory with env::var("OUT_DIR") env variable, which could be referenced in the main code base as env!("OUT_DIR")
  3. It's worth it to conditionally append debug symbols generation attribute for profiling as it's made here: https://github.com/zkmopro/gpu-acceleration/pull/14/files#diff-1fb60ca492a52fb05b37909bd795664d0a6c31e77d34f28bc1baa3b7d91093a7R35-R40
  4. When it comes to build.rs you have to specify the list of foreign language files to follow, which is done here https://github.com/zkmopro/gpu-acceleration/pull/14/files#diff-1fb60ca492a52fb05b37909bd795664d0a6c31e77d34f28bc1baa3b7d91093a7R103-R106

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement Optimized Montgomery Multiplication for Metal
2 participants