halo2-solidity-verifier code analysis

DoHoon Kim·2025년 4월 9일

halo2

목록 보기

2/4

In this post, we will review the EVM verifier for halo2 proof written in Yul assembly done by EZKL.

Code analysis

In the code, we encounter with two types that constitute reusable verifier.

Halo2VerifierReusable struct

#[derive(Template)]
#[template(path = "Halo2VerifierReusable.sol")]
pub(crate) struct Halo2VerifierReusable {
    pub(crate) scheme: BatchOpenScheme,
    pub(crate) vk_const_offsets: HashMap<&'static str, U256>,
}

This struct has two fields, scheme which represents the batch opening scheme used for KZG commitment opening, and vk_const_offsets which represents the mapping from the name of constants to the addresses they are stored in. In this context, constants are the circuit specific constant informations.

Halo2VerifyingKey struct

#[derive(Template)]
#[template(path = "Halo2VerifyingKey.sol")]
pub(crate) struct Halo2VerifyingKey {
    pub(crate) constants: Vec<(&'static str, U256)>,
    pub(crate) fixed_comms: Vec<(U256, U256)>,
    pub(crate) permutation_comms: Vec<(U256, U256)>,
}

This struct has three fields, constants which stores the constant values that are used during proving, fixed_comms and permutation_comms that stores the commitments of fixed columns and permutation polynomials representing equality constraints. Note that each commitment is represented as two EVM words which store the affine coordinate of an elliptic curve point.

Halo2VerifyingKey struct stores some part of circuit information that is related to the layout of the circuit. However, it does not contain the information about custom gates, lookups, and other auxiliary features such as in-circuit challenges.

Halo2VerifyingArtifact struct

#[derive(Template)]
#[template(path = "Halo2VerifyingArtifact.sol")]
pub(crate) struct Halo2VerifyingArtifact {
    pub(crate) constants: Vec<(&'static str, U256)>,
    pub(crate) fixed_comms: Vec<(U256, U256)>,
    pub(crate) permutation_comms: Vec<(U256, U256)>,
    pub(crate) const_expressions: Vec<U256>,
    pub(crate) gate_computations: GateDataEncoded,
    pub(crate) permutation_computations: PermutationDataEncoded,
    pub(crate) lookup_computations: LookupsDataEncoded,
    pub(crate) pcs_computations: PcsDataEncoded,
}

Halo2VerifyingArtifact struct is newly introduced struct type that stores all the information that is required for the verifier.

Memory management in EVM verifier

In Halo2VerifierReusable verifier contract, verifier uses the reserved memory space starting from 0x00 to vk_mptr to carry out gate computations, permutation computations, lookup computations and PCS computations.

Verifier divides memory area into three portions.

0x00 ~ vk_mptr: this is reserved area to carry out challenge generation, gate computations, permutation computations, lookup computations and PCS computations.
vk_mptr ~ vk_mptr + vk length: this area is where whole vk is loaded into
vk_mptr + vk length ~: this area is to store challenges

The above diagram shows how the verifier manages the memory area. The variables that arrows are starting from are the stack variables that point to memory.

There are two extcodecopy in the verifier, making two steps distinguished from one another as following.

Squeezing challenges
Computing quotient polynomials

Since vk area and free static working memory area should never overlap, it is crucial to estimate the size of free static working memory area and set vk_mptr properly.

Due to the way in which EVM verifier manages the memory, it is bit complex to encode the computational data of circuit into Halo2VerifyingArtifact. Let's see how it works.

Creating `VerifyingArtifact`

The following is the layout of VerifyingArtifact.

The purposes of creating Halo2VerifyingArtifact are the following:

Encoding quotient polynomial evaluation data
Encoding data for verifying openings of polynomial commitments

Encoding metadata into `VerifyingArtifact`

Since the encoded data will be stored into EVM memory, the correct offset of each computational data inside memory space should be encoded too.

Determining the offset of each computational data requires two information:

Order in which computational datas are stacked inside VerifyingArtifact
The size of each computational data

Thus, we should create each data (to determine the size of the data) first to determine the correct offset of the data. However, computational datas are composed of pointers, which also point to the challenge space area. As we saw in the EVM verifier's memory management diagram, challenge area is located after VerifyingArtifact, so we cannot determine exact addresses of each pointers before we know the exact size of VerifyingArtifact. This seems like a chicken and egg problem. How can we resolve this?

So the problem is,

We should create computational data itself to determine the offset.

We cannot create "correct" computational data without knowing the exact size of VerifyingArtifact, which we cannot know before actually creating computational data.

Thus, encoder first creates "incorrect" computational data with dummy pointers. "Incorrect" computational data does not encode the valid pointers, but should have exactly the same shape with "correct" computational data. After the encoder creates computational datas and determine their exact sizes, the offsets of each of them can be determined, thus encoded into metadata area.

Encoding data into `VerifyingArtifact`

First, the data that encodes the gate computations, permutation computations, lookup computations and PCS computations should be created. Let's introduce one more type.

Data struct

#[cfg(not(feature = "mv-lookup"))]
#[derive(Debug)]
pub(crate) struct Data {
    pub(crate) challenge_mptr: Ptr,
    pub(crate) theta_mptr: Ptr,

    pub(crate) quotient_comm_cptr: Ptr,
    pub(crate) w_cptr: Ptr,

    pub(crate) fixed_comms: Vec<EcPoint>,
    pub(crate) permutation_comms: HashMap<Column<Any>, EcPoint>,
    pub(crate) advice_comms: Vec<EcPoint>,
    pub(crate) lookup_permuted_comms: Vec<(EcPoint, EcPoint)>,
    pub(crate) permutation_z_comms: Vec<EcPoint>,
    pub(crate) lookup_z_comms: Vec<EcPoint>,
    pub(crate) random_comm: EcPoint,

    pub(crate) challenges: Vec<Word>,

    pub(crate) instance_eval: Word,
    pub(crate) advice_evals: HashMap<(usize, i32), Word>,
    pub(crate) fixed_evals: HashMap<(usize, i32), Word>,
    pub(crate) random_eval: Word,
    pub(crate) permutation_evals: HashMap<Column<Any>, Word>,
    pub(crate) permutation_z_evals: Vec<(Word, Word, Word)>,
    pub(crate) lookup_evals: Vec<(Word, Word, Word, Word, Word)>,

    pub(crate) computed_quotient_comm: EcPoint,
    pub(crate) computed_quotient_eval: Word,
}

So data struct holds all the pointers that points to the commitments and evaluations of the polynomials. Note that two fields, challenge_mptr and theta_mptr each points to the challenges for halo2 phases and $\theta$ challenge sampled by the verifier.

During initialization of Data struct, it stores the position to each commitments and evaluations created during proving, starting from proof_cptr. (Question: does random_comm points to the shuffled polynomial commitments?)
Also, permutation_z_evals and lookup_evals each points to the evaluations of $Z_P$ and lookup ( $A', S', Z_L$ ) polynomials. Recall that - $Z_P$ is evaluated in three different rotations (except for the last set of permutation sets), and lookup polynomials are evaluated at five different rotations ( $A'(X), A'(\omega^{-1} X), S'(X), Z_L(X), Z_L(\omega X)$ ). Thus each field stores the vector of triplet words and vector of quinary words.

Since challenge values are unknown at the time of creating Data struct, Data stores the pointer to each challenges into challenges field.

Now, we will see how computational data is created inside encoder in the following posts.

Before diving into how computational data is created, let's see the initial part of EVM verifier.

Emulating Fiat-Shamir transcript inside EVM verifier

Since EVM verifier should squeeze the challenges, EVM verifier should emulate Fiat-Shamir transcript. EVM verifier uses Keccak256 hash function.

Squeezing challenges

The verifier squeezes challenges in the following order:

Squeeze the challenges for each phase

Squeeze $\theta$

Squeeze $\beta, \gamma$

Squeeze $y$

Squeeze $x$

Squeeze $\zeta$

Squeeze $\nu$

Squeeze $\mu$

During squeezing the challenges,

Computing quotient polynomials

Computing quotient polynomials is consisted of 4 steps:

Instance polynomial evaluations
Gate polynomial evaluations
Permutation polynomial evaluations
Lookup polynomial evaluations

Instance evaluations

EVM verifier currently supports the circuit with one instance column and the query to the instance column should have zero rotation. Thus, the situation is simple. The verifier only needs to compute instance polynomial $I(X)$ from the given instance column values by Lagrange interpolation, and compute the evaluation $I(x)$ .

Let the instance column is given as follows:

[I_0, I_1, \dots, I_{l-1}, 0, \dots, 0]

The instance polynomial can be computed using Lagrange interpolation as follows:

I(X) = \sum_{i=0}^{l-1} I_i \cdot L_i(X)

$L_i(X)$ is Lagrange basis polynomial on the cyclic group of size $n$ (size of the circuit) $\{\omega^i\}_{i=0}^{n-1}$ and is computed as follows:

L_i(X) = \prod_{j=0,j \neq i}^{n-1} \frac{X-\omega^j}{\omega^i-\omega^j} \\ = \frac{X^n-1}{X-\omega^i} \cdot \prod_{j=0,j \neq i}^{n-1} \frac{1}{\omega^i-\omega^j} \\ = \frac{X^n-1}{X-\omega^i} \cdot \prod_{j=0,j \neq i}^{n-1} \frac{\omega^{-i}}{1-\omega^{j-i}} \\ = \frac{X^n-1}{X-\omega^i} \cdot \omega^i \cdot \prod_{j'=1}^{n-1} \frac{1}{1-\omega^{j'}} \\ = \frac{X^n-1}{X-\omega^i} \cdot \frac{\omega^i}{n} \\ = \underbrace{\frac{X^n-1}{n}}_\text{l\_i\_common} \cdot \frac{\omega^i}{X-\omega^i}

During instance evaluation, the first thing EVM verifier does is to store $[x-\omega^i]_{i}$ values starting from x_n_mptr. After, the values are batch inverted and multiplied by $\text{l\_i\_common} \cdot \omega^i$ to get the desired Lagrange basis polynomial evaluations.

After, the verifier computes and stores instance polynomial evaluation along with l_last, l_blind, and l_0 evaluations.

DoHoon Kim

Researcher & Developer

이전 포스트

halo2 proving system anatomy

다음 포스트

halo2-solidity-verifier code analysis

halo2

Code analysis

Memory management in EVM verifier

Creating `VerifyingArtifact`

Encoding metadata into `VerifyingArtifact`

Encoding data into `VerifyingArtifact`

Emulating Fiat-Shamir transcript inside EVM verifier

Squeezing challenges

Computing quotient polynomials

Instance evaluations

halo2 proving system anatomy

Batch opening scheme - BDFG 21

0개의 댓글

halo2-solidity-verifier code analysis

halo2

Code analysis

Memory management in EVM verifier

Creating VerifyingArtifact

Encoding metadata into VerifyingArtifact

Encoding data into VerifyingArtifact

Emulating Fiat-Shamir transcript inside EVM verifier

Squeezing challenges

Computing quotient polynomials

Instance evaluations

halo2 proving system anatomy

Batch opening scheme - BDFG 21

0개의 댓글

Creating `VerifyingArtifact`

Encoding metadata into `VerifyingArtifact`

Encoding data into `VerifyingArtifact`