In this work, we publish a general and fair zkVM benchmark framework, based on prior work by a16z, that compares proving time and energy cost between ZKM (zkMIPS) and other zkVM projects such as RISC Zero (R0) and SP1.
This initial publication focuses on comparing proving time with R0: across the benchmarks, zkMIPS's proving time falls between 76% and 317% of R0's. We also analyze the slow cases and outline upcoming optimizations.
We aim for zkMIPS to become one of the most production-ready zkVMs on the market and the most performant zkVM leveraging the MIPS instruction set.
We seek to investigate the following questions:
Note: For the specific instance types and benchmarks presented here, we implemented zkvm-benchmarks (https://github.com/zkMIPS/zkvm-benchmarks) based on a16z/zkvm-benchmarks, updating R0 to the latest v1.0.5 to ensure the results are directly comparable and fair.
On pure-CPU machines, a zkVM can still exploit the AVX instruction set to speed up Goldilocks field operations; in our previous experience this yields a 6-10% speedup. We enabled it for this benchmark by building with RUSTFLAGS="-C target-cpu=native".
All benchmarks can be found at https://github.com/zkMIPS/zkvm-benchmarks (commit bef9edd). The revme example is at https://github.com/zkMIPS/zkm/tree/main/prover/examples/revme.
CPU instance: AWS r6a.8xlarge, 32 vCPUs, 256 GB RAM, AMD EPYC 7R13 processor
GPU instance: 64 vCPUs, 480 GB RAM, AMD EPYC 9354 32-core processor, 4× NVIDIA GeForce RTX 4090
The segment size of zkMIPS is 262144 (2^18).
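Because the segment size is fixed, the number of segment proofs grows with the program's cycle count. A minimal sketch of this relationship (the function name and API are ours for illustration, not zkMIPS's actual interface):

```rust
// Hypothetical illustration of how a fixed segment size splits an
// execution trace into independently provable chunks.
fn num_segments(total_cycles: u64, segment_size: u64) -> u64 {
    // Ceiling division: the last, partially filled segment still
    // needs its own proof.
    (total_cycles + segment_size - 1) / segment_size
}

fn main() {
    let segment_size = 262_144; // 2^18, the size used in this benchmark
    // e.g. a program that executes 1,000,000 MIPS cycles
    println!("{}", num_segments(1_000_000, segment_size)); // prints 4
}
```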
This comparison shows that zkMIPS is competitive with other top-tier zkVMs in both proving performance and energy cost.
Furthermore, Figure 1 depicts how proving time is distributed across the stages of proof generation:
Looking at the different stages, trace generation takes about 10%-22% of the total time and computing each table's commitment takes 25%-32%, while the CTL (cross-table lookup, built on a GKR-optimized LogUp scheme) time is negligible.
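For reference, the standard LogUp relation underlying such lookups checks, at a random challenge $\alpha$, that the looked-up values $f_j$ and the table entries $t_i$ (with multiplicities $m_i$) satisfy a logarithmic-derivative identity; the exact formulation in ZKM's codebase may differ in detail:

$$\sum_j \frac{1}{\alpha - f_j} \;=\; \sum_i \frac{m_i}{\alpha - t_i}.$$

The GKR protocol lets the prover establish both fractional sums without committing to a per-row column of inverses, which is why the CTL cost is negligible.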
By design, trace generation runs on a single CPU core, but the commitment computations for the different tables (CPU, Arithmetic, etc.) can be executed in parallel on multiple cores, which can cut that portion of the time by roughly two-thirds.
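A minimal sketch of this parallelization pattern, assuming the per-table commitments are independent (the `commit` function here is a stand-in checksum, not ZKM's actual Poseidon-based Merkle commitment):

```rust
use std::thread;

// Stand-in for a per-table commitment; a real prover would build a
// Poseidon Merkle tree over the table here.
fn commit(table: &[u64]) -> u64 {
    table.iter().fold(0u64, |acc, &v| acc.wrapping_mul(31).wrapping_add(v))
}

fn main() {
    // Stand-ins for the CPU, Arithmetic, Memory, ... trace tables.
    let tables: Vec<Vec<u64>> = vec![vec![1, 2, 3], vec![4, 5], vec![6, 7, 8, 9]];

    // Trace generation is sequential, but each table's commitment is
    // independent, so each can run on its own core.
    let commitments: Vec<u64> = thread::scope(|s| {
        let handles: Vec<_> = tables
            .iter()
            .map(|t| s.spawn(move || commit(t)))
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    });

    println!("{:?}", commitments);
}
```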
Proof generation itself takes about 45%. Figure 2 below gives a detailed breakdown: the memory and CPU operations take about 72.8% of the total time, which is consistent with the sizes of their trace tables.
For each STARK proof, the time distribution is shown in Tables 4, 5, and 6 below. It is easy to see that 'compute auxiliary polynomials commitment' and 'compute openings proof' are the most time-consuming steps. The former includes Poseidon-based Merkle hashing and the FFTs that convert polynomials from coefficient form to point-value form; the latter computes the final polynomial at the opening points via polynomial multiplication and division (inversion) operations.
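To illustrate what the FFT step accomplishes, the toy sketch below converts a polynomial from coefficient form to point-value form over the Goldilocks field using naive O(n^2) Horner evaluation; a real prover (e.g. Plonky2) does this in O(n log n) with an FFT over a multiplicative subgroup, which this sketch does not attempt:

```rust
// Goldilocks prime p = 2^64 - 2^32 + 1.
const P: u128 = 0xFFFF_FFFF_0000_0001;

// Evaluate a coefficient-form polynomial at x via Horner's rule,
// reducing mod p at each step.
fn eval(coeffs: &[u128], x: u128) -> u128 {
    coeffs.iter().rev().fold(0u128, |acc, &c| (acc * x + c) % P)
}

fn main() {
    // f(X) = 3 + 2X + X^2 in coefficient form
    let coeffs = [3u128, 2, 1];
    // Evaluating at several points yields the point-value form.
    let points: Vec<u128> = (0..4).map(|x| eval(&coeffs, x)).collect();
    println!("{:?}", points); // [3, 6, 11, 18]
}
```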
Table 4: Time usage for computing a single STARK proof
Table 5: Time usage for computing the auxiliary polynomials commitment
Table 6: Time usage for computing the openings proof
Unlike ZKM, which uses Plonky2 with FFTs and Poseidon over the Goldilocks field, RISC Zero and SP1 use a more efficient hash function and a smaller field, which greatly benefits proving time. We have implemented Plonky2 on GPU and achieved a 3× speedup. Meanwhile, we are looking to integrate a more efficient hash function and a smaller field into Plonky2 in place, which we expect to roughly halve the time and cost.
With the planned optimizations, we are confident that we can significantly improve zkMIPS's performance and bring it ever closer to other leading zkVMs.
We have strived to be as accurate as possible in these benchmark comparisons. If you identify any discrepancies, please reach out to us at contact@zkm.io and we will make the necessary corrections.