Batch opening scheme - BDFG 21

DoHoon Kim · April 23, 2025

halo2


In this post, we will review the batch opening scheme used in the halo2 library. The scheme was introduced in the BDFG21 paper.

In this protocol, the prover and the verifier have access to an SRS consisting of group elements from groups that support a non-degenerate bilinear pairing.

Algebraic Group Model

The batch opening scheme is proven to be secure (it has perfect completeness and knowledge soundness) in the Algebraic Group Model (AGM). In the AGM, the adversary $\mathcal{A}$ against the protocol is assumed to be an algebraic adversary. This means that whenever $\mathcal{A}$ outputs an element $A \in \mathbb{G}_i$ for $i = 1, 2$ , it also outputs a vector $v \in \mathbb{F}^k$ such that $A = \langle v, SRS_i \rangle$ , where $SRS_i$ denotes the SRS elements in $\mathbb{G}_i$ .
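As a small illustration, assume the SRS in $\mathbb{G}_1$ consists of powers of a secret $x$ (this shape is an assumption at this point in the post, although it matches the form of the commitments used in the proof later). Then the vector output by an algebraic adversary is simply the coefficient vector of a polynomial $p$, a name introduced here only for illustration:

$$
SRS_1 = ([1]_1, [x]_1, \dots, [x^{d-1}]_1), \qquad A = \langle v, SRS_1 \rangle = \sum_{j=0}^{d-1} v_j \cdot [x^j]_1 = [p(x)]_1, \qquad p(X) := \sum_{j=0}^{d-1} v_j X^j
$$

In other words, every group element the adversary sends comes with a polynomial that explains it, and this is exactly what the extractor reads off.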

Now, let's see the definition of knowledge soundness in algebraic group model.

A protocol $\mathcal{P}$ between the prover $\mathbf{P}$ and the verifier $\mathbf{V}$ for a relation $\mathcal{R}$ has knowledge soundness in the algebraic group model if there exists an efficient extractor $\mathbf{E}$ such that the probability of any algebraic adversary $\mathcal{A}$ winning the following game is negligible.

$\mathcal{A}$ chooses input $x$ and plays the role of $\mathbf{P}$ with input $x$ .

$\mathbf{E}$ given access to all of $\mathcal{A}$ ’s messages during the protocol (including the coefficients of the linear combinations) outputs $\omega$ .

$\mathcal{A}$ wins if
(a) $\mathbf{V}$ accepts at the end of the protocol, and
(b) $(x, \omega) \notin \mathcal{R}$

The extractor is an idealized object used only for proving knowledge soundness; it is not implemented in the real-world case. In short, if the protocol has knowledge soundness in the algebraic group model, then except with negligible probability, whenever the transcript is accepting, the extractor's output satisfies $(x, \omega) \in \mathcal{R}$ .

The rationale behind this definition is that an algebraic adversary who produces an accepting transcript with $(x, \omega) \notin \mathcal{R}$ can be used to solve a discrete-logarithm-type problem and thus break the Q-DLOG assumption. We will revisit this in a future post.

Batch opening scheme

Opening scheme

Let the initial setup be given as follows.

$T = \{ z_1, \dots, z_t \} \subset \mathbb{F}$ and $S_1, \dots, S_k \subset T$ .
$cm_1, \dots, cm_k$ are the commitments to the polynomials $f_1, \dots, f_k$ .
$\{r_i \in \mathbb{F}_{\lt |S_i|} [X]\}_{i \in [k]}$ such that $r_i(z) = f_i(z)$ for each $i \in [k], z \in S_i$ .

$S_i$ is the opening set for the polynomial $f_i$ , and $T = \cup_{i=1}^k S_i$ . $r_i$ is the unique polynomial of degree less than $|S_i|$ that agrees with $f_i$ on $S_i$ .

Let's visit the following two claims. Throughout, $Z_S(X) := \prod_{z \in S} (X - z)$ denotes the vanishing polynomial of a set $S \subset \mathbb{F}$ .

Fix subsets $S \subset T \subset \mathbb{F}$ , and a polynomial $g \in \mathbb{F}_{\lt d}[X]$ . Then $Z_S(X)$ divides $g(X)$ if and only if $Z_T(X)$ divides $Z_{T \backslash S}(X) \cdot g(X)$ .

Fix $F_1, \dots, F_k \in \mathbb{F}_{\lt n}[X]$ . Fix $Z \in \mathbb{F}_{\lt n}[X]$ that decomposes into distinct linear factors over $\mathbb{F}$ . Suppose that for some $i \in [k]$ , $Z \nmid F_i$ . Then, except with probability $k / |\mathbb{F}|$ over uniform $\gamma \in \mathbb{F}$ , $Z$ does not divide $\sum_{i=1}^k \gamma^{i - 1} \cdot F_i$ .
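To make the first claim concrete, here is a minimal Rust sketch over a toy prime field $\mathbb{F}_{97}$ . The modulus and the helper names (`poly_mul`, `vanishing`, `poly_rem`, `divides`) are assumptions made for illustration only; they are not part of halo2. Polynomials are stored as coefficient vectors in ascending degree.

```rust
const P: u64 = 97; // toy prime modulus, chosen only for illustration

/// Multiplies two polynomials (coefficients in ascending degree) over F_P.
fn poly_mul(a: &[u64], b: &[u64]) -> Vec<u64> {
    let mut out = vec![0u64; a.len() + b.len() - 1];
    for (i, x) in a.iter().enumerate() {
        for (j, y) in b.iter().enumerate() {
            out[i + j] = (out[i + j] + x * y) % P;
        }
    }
    out
}

/// Vanishing polynomial Z_S(X) = prod_{s in S} (X - s).
fn vanishing(points: &[u64]) -> Vec<u64> {
    points
        .iter()
        .fold(vec![1], |acc, &s| poly_mul(&acc, &[(P - s % P) % P, 1]))
}

/// Remainder of `a` modulo a monic polynomial `m` (schoolbook long division).
fn poly_rem(a: &[u64], m: &[u64]) -> Vec<u64> {
    let dm = m.len() - 1; // degree of m
    let mut r = a.to_vec();
    while r.len() > dm {
        let lead = *r.last().unwrap();
        let shift = r.len() - 1 - dm;
        // Subtract lead * X^shift * m(X); this cancels the top coefficient.
        for (j, &c) in m.iter().enumerate() {
            r[shift + j] = (r[shift + j] + P - (lead * c) % P) % P;
        }
        r.pop();
    }
    r
}

/// Returns true if the monic polynomial `m` divides `a`.
fn divides(m: &[u64], a: &[u64]) -> bool {
    poly_rem(a, m).iter().all(|&c| c == 0)
}
```

For example, with $S = \{1, 2\}$ and $T = \{1, 2, 3\}$ , the polynomial $g = Z_S(X) \cdot (X + 5)$ satisfies both `divides(&vanishing(&[1, 2]), &g)` and `divides(&vanishing(&[1, 2, 3]), &poly_mul(&vanishing(&[3]), &g))`, while $g' = (X - 1)(X + 5)$ fails both checks, as the claim predicts.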

Using these claims, let's walk through the batch opening scheme.

$\mathbf{V}$ sends random $\gamma \in \mathbb{F}$ .
$\mathbf{P}$ computes the polynomial

$$f(X) := \sum_{i \in [k]} \gamma^{i-1} \cdot Z_{T \backslash S_i}(X) \cdot (f_i(X) - r_i(X))$$

and $h(X) := f(X) / Z_T(X)$ , and sends $W := [h(x)]_1$ .

$\mathbf{V}$ sends random $z \in \mathbb{F}$ .
$\mathbf{P}$ computes the polynomial $L(X) := f_z(X) - Z_T(z) \cdot h(X)$ , where

$$f_z(X) := \sum_{i \in [k]} \gamma^{i-1} \cdot Z_{T \backslash S_i}(z) \cdot (f_i(X) - r_i(z))$$

Since $f_z(z) = f(z) = Z_T(z) \cdot h(z)$ , we have $L(z) = 0$ , so $X - z$ divides $L(X)$ . $\mathbf{P}$ sends $W' := [\frac{L(x)}{x - z}]_1$ .

$\mathbf{V}$ computes

$$F := \sum_{i \in [k]} \gamma^{i-1} \cdot Z_{T \backslash S_i}(z) \cdot (cm_i - [r_i(z)]_1) - Z_T(z) \cdot W$$

and checks

$$e(F, [1]_2) = e(W', [x - z]_2)$$
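The polynomial side of the protocol can be sketched in Rust over a toy prime field. This is only an illustration under stated assumptions: the modulus `P = 101`, the sample polynomials and sets, and all helper names (`simulate`, `div_rem`, `add_scaled`, and so on) are hypothetical and not taken from halo2, and no group elements or pairings are involved. The sketch uses the fact that $r_i$ equals the remainder of $f_i$ modulo $Z_{S_i}$ , and it checks the two facts an honest prover relies on: $Z_T$ divides $f$ , and $L(z) = 0$ .

```rust
const P: u64 = 101; // toy prime modulus, for illustration only

/// Evaluates `f` (coefficients in ascending degree) at `x` with Horner's rule.
fn eval(f: &[u64], x: u64) -> u64 {
    f.iter().rev().fold(0, |acc, &c| (acc * x + c) % P)
}

/// Returns a(X) + s * b(X).
fn add_scaled(a: &[u64], b: &[u64], s: u64) -> Vec<u64> {
    (0..a.len().max(b.len()))
        .map(|i| (a.get(i).copied().unwrap_or(0) + s * b.get(i).copied().unwrap_or(0)) % P)
        .collect()
}

fn mul(a: &[u64], b: &[u64]) -> Vec<u64> {
    let mut out = vec![0; a.len() + b.len() - 1];
    for (i, x) in a.iter().enumerate() {
        for (j, y) in b.iter().enumerate() {
            out[i + j] = (out[i + j] + x * y) % P;
        }
    }
    out
}

/// Z_S(X) = prod_{s in S} (X - s).
fn vanishing(pts: &[u64]) -> Vec<u64> {
    pts.iter().fold(vec![1], |acc, &s| mul(&acc, &[(P - s % P) % P, 1]))
}

/// Quotient and remainder of `a` divided by a monic polynomial `m`.
fn div_rem(a: &[u64], m: &[u64]) -> (Vec<u64>, Vec<u64>) {
    let dm = m.len() - 1;
    let mut r = a.to_vec();
    let mut q = vec![0; a.len().saturating_sub(dm).max(1)];
    while r.len() > dm {
        let lead = *r.last().unwrap();
        let shift = r.len() - 1 - dm;
        q[shift] = lead;
        for (j, &c) in m.iter().enumerate() {
            r[shift + j] = (r[shift + j] + P - (lead * c) % P) % P;
        }
        r.pop();
    }
    (q, r)
}

/// Runs the honest prover's polynomial computations for k = 2.
/// Returns (whether Z_T divides f, the value L(z)).
fn simulate(gamma: u64, z: u64) -> (bool, u64) {
    let (gamma, z) = (gamma % P, z % P);
    let t = [1u64, 2, 3]; // T
    let polys = [vec![3u64, 0, 1, 4], vec![7, 5, 2]]; // f_1, f_2
    let sets = [vec![1u64, 2], vec![2, 3]]; // S_1, S_2
    let (mut f, mut f_z, mut g) = (vec![0], vec![0], 1); // g tracks gamma^{i-1}
    for (f_i, s_i) in polys.iter().zip(sets.iter()) {
        // r_i is the remainder of f_i modulo Z_{S_i}: it agrees with f_i on S_i.
        let r_i = div_rem(f_i, &vanishing(s_i)).1;
        let rest: Vec<u64> = t.iter().copied().filter(|x| !s_i.contains(x)).collect();
        let z_rest = vanishing(&rest); // Z_{T \ S_i}
        // f += gamma^{i-1} * Z_{T\S_i}(X) * (f_i(X) - r_i(X))
        let diff = add_scaled(f_i, &r_i, P - 1);
        f = add_scaled(&f, &mul(&z_rest, &diff), g);
        // f_z += gamma^{i-1} * Z_{T\S_i}(z) * (f_i(X) - r_i(z))
        let diff_z = add_scaled(f_i, &[eval(&r_i, z)], P - 1);
        f_z = add_scaled(&f_z, &diff_z, g * eval(&z_rest, z) % P);
        g = g * gamma % P;
    }
    let z_t = vanishing(&t);
    let (h, rem) = div_rem(&f, &z_t); // h = f / Z_T
    // L(X) = f_z(X) - Z_T(z) * h(X)
    let l = add_scaled(&f_z, &h, (P - eval(&z_t, z)) % P);
    (rem.iter().all(|&c| c == 0), eval(&l, z))
}
```

Calling `simulate(5, 17)` or `simulate(42, 99)` should return `(true, 0)` for any choice of challenges, which is the polynomial identity that the pairing check verifies at the secret point $x$ .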

Proof of knowledge soundness

Let's prove the knowledge soundness of the protocol. First, the algebraic adversary $\mathcal{A}$ chooses the group elements $cm_1, \dots, cm_k$ and plays the role of $\mathbf{P}$ . There is an efficient AGM-extractor that has access to the coefficients in $\mathbb{F}$ that $\mathcal{A}$ used to form $cm_1, \dots, cm_k$ . Let's say

$$cm_i = \sum_{j=0}^{d-1} a_{ij} [x^j]_1$$

The AGM-extractor outputs the polynomials $f_i(X) = \sum_{j=0}^{d-1} a_{ij} X^{j}$ .
$\mathcal{A}$ now participates as $\mathbf{P}$ . After $\mathbf{V}$ sends a random $\gamma \in \mathbb{F}$ , consider the polynomial

$$f(X) := \sum_{i \in [k]} \gamma^{i-1} \cdot Z_{T \backslash S_i}(X) \cdot (f_i(X) - r_i(X))$$

Suppose that there exists some $i^* \in [k]$ such that $Z_{S_{i^*}}(X) \nmid (f_{i^*}(X) - r_{i^*}(X))$ , that is, the claimed opening of $f_{i^*}$ is wrong. Then, by the two claims above, $f(X)$ is not divisible by $Z_T(X)$ except with probability $k / |\mathbb{F}|$ over uniform $\gamma \in \mathbb{F}$ . Suppose that $\gamma$ is not one of these bad values, and $\mathcal{A}$ sends $W := [H(x)]_1$ for a polynomial $H(X)$ of its choice.

$\mathbf{V}$ sends a random $z \in \mathbb{F}$ . Define $L(X) := f_z(X) - Z_T(z) \cdot H(X)$ . Since $f_z(z) = f(z)$ , we have $L(z) = f(z) - Z_T(z) \cdot H(z)$ , which is the evaluation at $z$ of the polynomial $f(X) - Z_T(X) \cdot H(X)$ . This polynomial is non-zero because $Z_T(X)$ does not divide $f(X)$ , so $L(z) \neq 0$ except with negligible probability over uniform $z \in \mathbb{F}$ . Suppose that $z$ is not one of these bad values, and $\mathcal{A}$ sends $W' := [H'(x)]_1$ for a polynomial $H'(X)$ of its choice.

It remains to show that the final pairing check passes only with negligible probability. The pairing check is a randomized test of the identity $f_z(X) - Z_T(z) \cdot H(X) = H'(X) \cdot (X - z)$ at the secret point $x$ . This identity can hold only if $f_z(X) - Z_T(z) \cdot H(X)$ is divisible by $X - z$ , that is, only if $f_z(z) = Z_T(z) \cdot H(z)$ . However, we have just seen that this equality holds only with negligible probability. Thus the pairing check passes only with negligible probability, which proves knowledge soundness.

Implementation

There is an implementation of this batch opening scheme inside halo2. It is worthwhile to review the implementation to get a deeper understanding.

The structs we first encounter are the following: RotationSet and IntermediateSets.

#[derive(Debug, Clone, PartialEq)]
struct RotationSet<F: Field, T: PartialEq + Clone> {
    commitments: Vec<Commitment<F, T>>,
    points: Vec<F>,
}

#[derive(Debug, PartialEq)]
struct IntermediateSets<F: Field, Q: Query<F>> {
    rotation_sets: Vec<RotationSet<F, Q::Commitment>>,
    super_point_set: BTreeSet<F>,
}

IntermediateSets has two fields, rotation_sets and super_point_set. super_point_set holds all the evaluation points $T$ , and rotation_sets holds the groups of polynomials that are opened on the same opening set $S_i$ . Inside each RotationSet, there are the polynomial commitments that are opened against the same opening set (commitments) and the opening points themselves (points).
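The grouping idea can be sketched with plain standard-library types. This is not halo2's `construct_intermediate_sets`; `ToyRotationSet` and `toy_intermediate_sets` are hypothetical names, commitments are replaced by integer ids, and field elements by `u64` values, purely to show how commitments sharing the same opening set end up in one rotation set.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical, simplified stand-in for `RotationSet`:
/// commitments are plain ids and points are `u64` values.
#[derive(Debug, PartialEq)]
struct ToyRotationSet {
    commitments: Vec<usize>,
    points: Vec<u64>,
}

/// Groups (commitment id, point) queries so that commitments opened on the
/// same set of points share one rotation set. Also returns the union T.
fn toy_intermediate_sets(queries: &[(usize, u64)]) -> (Vec<ToyRotationSet>, BTreeSet<u64>) {
    // Collect the opening set S_i of every commitment id.
    let mut point_sets: BTreeMap<usize, BTreeSet<u64>> = BTreeMap::new();
    for &(id, point) in queries {
        point_sets.entry(id).or_default().insert(point);
    }
    // Commitments with an identical S_i land in the same group.
    let mut groups: BTreeMap<BTreeSet<u64>, Vec<usize>> = BTreeMap::new();
    for (id, points) in point_sets {
        groups.entry(points).or_default().push(id);
    }
    // T is the union of all opening sets.
    let super_point_set: BTreeSet<u64> = groups.keys().flatten().copied().collect();
    let rotation_sets = groups
        .into_iter()
        .map(|(points, commitments)| ToyRotationSet {
            commitments,
            points: points.into_iter().collect(),
        })
        .collect();
    (rotation_sets, super_point_set)
}
```

For instance, the queries `[(0, 1), (0, 2), (1, 1), (1, 2), (2, 2), (2, 3)]` produce two rotation sets: commitments 0 and 1 on the points 1 and 2, and commitment 2 on the points 2 and 3, with super point set 1, 2, 3.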

There are two additional intermediate types, CommitmentExtension and RotationSetExtension.

struct CommitmentExtension<'a, C: CurveAffine> {
    commitment: Commitment<C::Scalar, PolynomialPointer<'a, C>>,
    low_degree_equivalent: Polynomial<C::Scalar, Coeff>,
}

struct RotationSetExtension<'a, C: CurveAffine> {
    commitments: Vec<CommitmentExtension<'a, C>>,
    points: Vec<C::Scalar>,
}

CommitmentExtension pairs the commitment to the polynomial $f_i$ (commitment) with the polynomial $r_i$ that agrees with $f_i$ on the opening set (low_degree_equivalent). RotationSetExtension is the same as RotationSet, except that the commitments field holds a vector of CommitmentExtension values.

`create_proof`

With this in mind, let's look into the create_proof function.

In the actual protocol, the verifier samples an additional challenge y, which is used to take a random linear combination of the polynomials in the same RotationSet.

let intermediate_sets = construct_intermediate_sets(queries);
let (rotation_sets, super_point_set) = (
    intermediate_sets.rotation_sets,
    intermediate_sets.super_point_set,
);

let rotation_sets: Vec<RotationSetExtension<E::G1Affine>> = rotation_sets
    .into_par_iter()
    .map(|rotation_set| {
        let commitments: Vec<CommitmentExtension<E::G1Affine>> = rotation_set
            .commitments
            .as_slice()
            .into_par_iter()
            .map(|commitment_data| commitment_data.extend(&rotation_set.points))
            .collect();
        rotation_set.extend(commitments)
    })
    .collect();

First, the prover builds rotation_sets and super_point_set from the queries. It then extends each rotation set by pairing the commitment of each polynomial $f_{i_j}$ with its polynomial $r_{i_j}$ .

Computing $W = [h(x)]_1$

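// Challenge v combines the quotient contributions of the different rotation sets.
let v: ChallengeV<_> = transcript.squeeze_challenge_scalar();

// One quotient polynomial per rotation set, computed in parallel.
#[allow(clippy::needless_collect)]
let quotient_polynomials = rotation_sets
    .as_slice()
    .into_par_iter()
    .map(quotient_contribution)
    .collect::<Vec<_>>();

// h(X) is the sum of the quotient polynomials weighted by successive powers of v.
let h_x: Polynomial<E::Fr, Coeff> = quotient_polynomials
    .into_iter()
    .zip(powers(*v))
    .map(|(poly, power_of_v)| poly * power_of_v)
    .reduce(|acc, poly| acc + &poly)
    .unwrap();

// Commit to h(X) and write the commitment to the transcript.
let h = self.params.commit(&h_x, Blind::default()).to_affine();
transcript.write_point(h)?;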
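The zip, map, reduce pattern above can be tried in isolation with toy types. In this sketch, polynomials are plain coefficient vectors over a toy field $\mathbb{F}_{101}$ , and both `powers` and `combine` are hypothetical stand-ins written for this example; they only mimic the shape of the halo2 code and are not its actual helpers.

```rust
const P: u64 = 101; // toy prime modulus, for illustration only

/// Hypothetical stand-in for a `powers` helper: yields 1, v, v^2, ...
fn powers(v: u64) -> impl Iterator<Item = u64> {
    std::iter::successors(Some(1u64), move |&p| Some(p * (v % P) % P))
}

/// Computes sum_i v^i * polys[i], with coefficients in ascending degree.
fn combine(polys: Vec<Vec<u64>>, v: u64) -> Vec<u64> {
    polys
        .into_iter()
        .zip(powers(v))
        // Scale the i-th polynomial by v^i.
        .map(|(poly, pw)| poly.into_iter().map(|c| c * pw % P).collect::<Vec<u64>>())
        // Add the scaled polynomials coefficient-wise, padding with zeros.
        .reduce(|acc, poly| {
            (0..acc.len().max(poly.len()))
                .map(|i| {
                    (acc.get(i).copied().unwrap_or(0) + poly.get(i).copied().unwrap_or(0)) % P
                })
                .collect()
        })
        .unwrap_or_default()
}
```

With `v = 2`, the polynomials $1 + 2X$ , $3$ and $X^2$ combine to $(1 + 2X) + 2 \cdot 3 + 4 \cdot X^2 = 7 + 2X + 4X^2$ , so `combine(vec![vec![1, 2], vec![3], vec![0, 0, 1]], 2)` returns `[7, 2, 4]`.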

`verify_proof`

For a rotation set $S = \{x_1, \dots, x_{n}\}$ , suppose that we are given the polynomial evaluations on the set as $z_1, \dots, z_{n}$ . The verifier has to reconstruct the polynomial by Lagrange interpolation, so the Lagrange basis polynomials must be computed on the EVM verifier side for each rotation set.

Both the halo2 Rust verifier and the EVM verifier use a memory-efficient algorithm to compute the Lagrange basis polynomials, based on an elementary algebra trick.

The constructed polynomial has the form

$$f(X) = \sum_{j=1}^{n} z_j \cdot \prod_{k=1, k \neq j}^{n} \frac{X-x_k}{x_j-x_k}$$

The Lagrange basis polynomials are $\prod_{k=1, k \neq j}^{n} \frac{X-x_k}{x_j-x_k}$ for each $j$ . What we want to compute is the coefficient form of each Lagrange basis polynomial. Since a basis polynomial over $n$ points has degree $n-1$ , we denote its coefficient form as the row vector $[c_0, \dots, c_{n-1}]$ . The algorithm proceeds inductively on $n$ .

Suppose that $j=1$ and we are trying to compute the numerator $\prod_{k=2}^{n} (X-x_k)$ ; the denominator $\prod_{k=2}^{n} (x_1-x_k)$ is a constant that only rescales the coefficients. Since $\prod_{k=2}^{n} (X-x_k) = (X-x_{n}) \cdot \prod_{k=2}^{n-1} (X-x_k)$ , the coefficient of $X^i, 1 \le i \le n-1$ can be computed as $(\text{coefficient of} \;X^{i-1} \;\text{of} \;\prod_{k=2}^{n-1} (X-x_k)) + (-x_{n}) \cdot (\text{coefficient of} \;X^i \;\text{of} \;\prod_{k=2}^{n-1} (X-x_k))$ . The constant coefficient is $\prod_{i=2}^{n}(-x_i)$ .

Let $[c_0, \dots, c_{n-2}]$ be the coefficient vector of $\prod_{k=2}^{n-1} (X-x_k)$ and $[c_0', \dots, c_{n-1}']$ be the coefficient vector of $\prod_{k=2}^{n} (X-x_k)$ . Then,

$$
\begin{array}{cccccc}
c_0 & c_1 & c_2 & \dots & c_{n-2} & \\
 & c_0 & c_1 & \dots & c_{n-3} & c_{n-2} \\
\hline
c_0' & c_1' & c_2' & \dots & c_{n-2}' & c_{n-1}'
\end{array}
$$

Each coefficient (third row) can be obtained by $(-x_{n}) \cdot (\text{first row}) + (\text{second row})$ , where the second row is the first row shifted one position to the right and empty entries count as zero.
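The inductive step can be sketched in a few lines of Rust. This is a toy version over $\mathbb{F}_{101}$ with `u64` arithmetic, not the halo2 or EVM verifier code; the names `product_coeffs`, `lagrange_basis` and `pow` are assumptions made for this example.

```rust
const P: u64 = 101; // toy prime modulus, for illustration only

/// Modular exponentiation, used for inversion via Fermat's little theorem.
fn pow(mut b: u64, mut e: u64) -> u64 {
    let mut acc = 1;
    while e > 0 {
        if e & 1 == 1 {
            acc = acc * b % P;
        }
        b = b * b % P;
        e >>= 1;
    }
    acc
}

/// Coefficients (ascending degree) of prod_k (X - x_k), built one factor at a time.
fn product_coeffs(xs: &[u64]) -> Vec<u64> {
    let mut c = vec![1u64]; // the empty product is the constant 1
    for &x in xs {
        let neg_x = (P - x % P) % P;
        let mut next = vec![0u64; c.len() + 1];
        for (i, &ci) in c.iter().enumerate() {
            next[i] = (next[i] + neg_x * ci) % P; // (-x) * (first row)
            next[i + 1] = (next[i + 1] + ci) % P; // shifted copy (second row)
        }
        c = next;
    }
    c
}

/// Coefficients of the j-th Lagrange basis polynomial over the points `xs` (0-based j).
fn lagrange_basis(xs: &[u64], j: usize) -> Vec<u64> {
    let others: Vec<u64> = xs
        .iter()
        .enumerate()
        .filter(|&(k, _)| k != j)
        .map(|(_, &x)| x)
        .collect();
    let numerator = product_coeffs(&others);
    // Denominator prod_{k != j} (x_j - x_k), inverted as denom^(P-2).
    let denom = others
        .iter()
        .fold(1, |acc, &x| acc * ((xs[j] + P - x % P) % P) % P);
    let denom_inv = pow(denom, P - 2);
    numerator.iter().map(|&c| c * denom_inv % P).collect()
}
```

For example, `product_coeffs(&[2, 3])` gives `[6, 96, 1]`, that is $X^2 - 5X + 6$ with $-5 \equiv 96 \pmod{101}$ . Only one working vector plus the next row is kept at any time, which is what makes the approach memory-friendly.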
