What problem are we solving?
If you’ve followed private rollups, you’ve watched the same fight play out three different ways and seen all three lose.
- Nightfall. Aggregator needs 144 CPU cores and 750 GB of RAM to assemble a block of just 64 transactions. Each user’s transaction validity proof has to be composed into a single block proof, and the proof composition is what melts the machine.
- Aztec. Tries the same recipe with sharper tooling. Testnet sits at 60 GB of aggregator RAM and 0.2 TPS. Same fundamental bottleneck.
- Intmax2. Inverts the trade: push the proof composition onto the client via PCD. Now the user has to prove their entire receive history, and every recipient has to merge the sender’s proof with their own. Users have to keep syncing for every block, even ones they aren’t in.
The problem underneath all three: combining privacy (per-transaction zkSNARK) with scale (some kind of recursive composition) forces someone to do a lot of zero-knowledge work per transaction. The aggregator has to verify everyone’s proofs and stitch them together. The client has to generate the proofs in the first place.
PlasmaBlind (Daix-Moreux & Zhang, 2026) takes a different angle. The client no longer generates a zkSNARK. They run the transaction validity circuit, extract the R1CS witness, and then perform exactly one folding step — mixing their real witness with a random one. The output is the zero-knowledge proof.
In numbers: ~32ms to produce a transaction proof on a MacBook M1 Max, ~46ms to update a balance proof, ~36,000 TPS in the centralized deployment, ~1,150 TPS in the decentralized one (or ~1,800 TPS with 20-byte truncated nullifiers). No general-purpose zkSNARK on the client, ever.
A glossary before we touch the table
The protocol carries notation from three different worlds — UTXO accounting, Merkle accumulators, and Nova folding. Six symbols recur from here on:
- $(\mathsf{sk}, \mathsf{pk})$: user secret/public key. $\mathsf{pk}$ is derived from $\mathsf{sk}$ using a circuit-friendly PRF (NOT an elliptic-curve scheme; PlasmaBlind keys don’t have to interop with L1).
- $(i, j, k)$: the position of a UTXO. It’s the $i$-th output of the $j$-th transaction in block $k$. Every UTXO has a unique triple.
- $\mathsf{cm}$: a committed UTXO. A Pedersen-style hiding commitment to the plaintext UTXO, with a secret opening. The opening goes to the recipient out-of-band; the commitment goes on-chain (well, into the aggregator’s tree).
- $\mathsf{nf}$: the nullifier for that UTXO. Unique per UTXO, deterministic from the owner’s secret key and the position, and hides the underlying position from anyone but the owner.
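- $(w, x)$: an R1CS witness/instance pair. $w$ is the assignment, $x$ is the public input. The “would-have-been-SNARKed” object.
- $(\mathbb{W}, \mathbb{U})$: a committed-relaxed R1CS pair (Nova’s variant). $\mathbb{W} = (w, e, r_w, r_e)$ carries the witness plus an error term plus randomness; $\mathbb{U} = (\bar{w}, \bar{e}, u, x)$ carries hiding commitments to $w$ and $e$, a scalar $u$, and public inputs $x$. The pair is satisfying when $Az \circ Bz = u \cdot Cz + e$ for $z = (w, x, u)$. The original R1CS is the special case $u = 1$, $e = 0$.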
Four Merkle trees show up:
- $\mathcal{T}^{\mathsf{utxo}}_k$: committed UTXOs in block $k$. Ephemeral, per-block.
- $\mathcal{T}^{\mathsf{pk}}_k$: sender public keys for valid transactions in block $k$ (or a null placeholder for unsettled ones). Ephemeral, per-block.
- $\mathcal{T}^{\mathsf{nf}}$: all historical nullifiers. Global, permanent. Stored as an interval Merkle tree so the aggregator can prove non-membership in a circuit.
- $\mathcal{T}^{\mathsf{blk}}$: block headers (one per block). Global, permanent.
A block header is a triple of Merkle roots, one for each of the first three trees. The block tree’s leaves are headers.
How it stacks up
| System | Aggregator RAM needed | Client proving | Client sync | Client proof object | TPS (centralized) |
|---|---|---|---|---|---|
| Nightfall (64 tx) | 750 GB | full zkSNARK | none | succinct SNARK | low |
| Aztec (testnet) | 60 GB | full zkSNARK | none | succinct SNARK | 0.2 TPS |
| Intmax2 | low | full PCD | every block | recursive proof | high (theory) |
| PlasmaBlind | commodity (benched on 64 GB i9) | ~32 ms | only relevant blocks | folded R1CS pair | ~36,000 TPS |
The trade is clear once you read across the row. PlasmaBlind’s client proof is not succinct — it’s a folded R1CS witness, hundreds of kilobytes — but the client paid milliseconds to make it, and the aggregator can fold them all together cheaply. The succinctness comes back at the end, when the aggregator compresses everything into a final decider SNARK.
Prerequisites. You should know what a zkSNARK is at the black-box level, be familiar with Zcash-style shielded UTXOs (nullifiers, commitments, Merkle accumulators), and have at least heard the words “Nova” and “IVC.” We’ll re-introduce just enough of Nova to make the folding trick land. What we won’t cover: formal soundness proofs, the internals of MicroNova (used as the decider), CycleFold mechanics, or the epoch-based block-tree optimization beyond a one-line mention.
A Refresher: Nova Folding in One Page
To understand the trick, you need committed-relaxed R1CS and one folding step. Both come from Nova (Kothapalli–Setty–Tzialla, CRYPTO 2022).
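R1CS is the constraint system $Az \circ Bz = Cz$ for matrices $A, B, C$ and an assignment $z = (w, x, 1)$. A witness $w$ paired with public input $x$ satisfies the system if the equation holds.
Committed-relaxed R1CS generalizes this in two ways. First, an error term $e$ lets us relax the equation: $Az \circ Bz = u \cdot Cz + e$, where $u$ is a scalar and $z = (w, x, u)$. Set $u = 1$, $e = 0$ and you recover standard R1CS. Second, the witness and error are wrapped in hiding commitments $\bar{w}, \bar{e}$: Pedersen-style commitments with randomness $r_w, r_e$.
The pair becomes $(\mathbb{W}, \mathbb{U})$ with $\mathbb{W} = (w, e, r_w, r_e)$ and $\mathbb{U} = (\bar{w}, \bar{e}, u, x)$.
$\mathbb{W}$ is the secret, $\mathbb{U}$ is the public, and “satisfying” means $\bar{w}, \bar{e}$ are correct commitments and $Az \circ Bz = u \cdot Cz + e$ for $z = (w, x, u)$.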
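Folding takes two satisfying committed-relaxed pairs $(\mathbb{W}_1, \mathbb{U}_1)$ and $(\mathbb{W}_2, \mathbb{U}_2)$ for the same $A, B, C$ and produces one satisfying pair $(\mathbb{W}, \mathbb{U})$. The mechanics:
- Prover computes the cross-term $T = Az_1 \circ Bz_2 + Az_2 \circ Bz_1 - u_1 \cdot Cz_2 - u_2 \cdot Cz_1$ and sends a commitment $\bar{T}$ to it, made with randomness $r_T$.
- Verifier sends a random challenge $\rho$.
- Both parties compute the folded instance: $\bar{w} = \bar{w}_1 + \rho \bar{w}_2$, $\bar{e} = \bar{e}_1 + \rho \bar{T} + \rho^2 \bar{e}_2$, $u = u_1 + \rho u_2$, $x = x_1 + \rho x_2$.
- The prover (privately) computes the folded witness: $w = w_1 + \rho w_2$, $e = e_1 + \rho T + \rho^2 e_2$, and similarly for $r_w, r_e$ with $r_T$.
The output satisfies committed-relaxed R1CS if both inputs did, and (except with negligible probability over the challenge) only if both did. No SNARK was generated. What changed hands: one cross-term commitment $\bar{T}$, one challenge $\rho$, and one linear combination. That’s why folding is fast: it’s a multiscalar multiplication and a couple of vector ops, not a polynomial commitment opening.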
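We’ll write the prover side of this single fold as $(\mathbb{W}, \mathbb{U}, \bar{T}) \leftarrow \mathsf{NIFS.P}((\mathbb{W}_1, \mathbb{U}_1), (\mathbb{W}_2, \mathbb{U}_2))$ and the verifier side as $\mathbb{U} \leftarrow \mathsf{NIFS.V}(\mathbb{U}_1, \mathbb{U}_2, \bar{T})$. Both also take the folding scheme’s accumulator/verifier keys, which are fixed setup data and left implicit here. NIFS stands for “non-interactive folding scheme”; the interactive challenge becomes a Fiat–Shamir hash in the non-interactive form.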
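(Sharp readers comparing the cross-term against the paper will notice a transcription typo there: the paper’s coefficient differs, but the correct cross-term comes from substituting $z = z_1 + \rho z_2$ and $u = u_1 + \rho u_2$, for challenge $\rho$, into $Az \circ Bz = u \cdot Cz + e$ and reads $T = Az_1 \circ Bz_2 + Az_2 \circ Bz_1 - u_1 \cdot Cz_2 - u_2 \cdot Cz_1$, as above.)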
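For readers who want to check the cross-term by hand, here is the substitution written out. The symbols follow the standard Nova presentation, with $z_i = (w_i, x_i, u_i)$ and $e_i = Az_i \circ Bz_i - u_i \cdot Cz_i$; calling the challenge $\rho$ is a naming choice made for this walkthrough, not necessarily the paper’s.

$$
\begin{aligned}
Az \circ Bz &= A(z_1 + \rho z_2) \circ B(z_1 + \rho z_2) \\
            &= Az_1 \circ Bz_1 + \rho\,(Az_1 \circ Bz_2 + Az_2 \circ Bz_1) + \rho^2\, Az_2 \circ Bz_2, \\
u \cdot Cz  &= (u_1 + \rho u_2)(Cz_1 + \rho\, Cz_2) \\
            &= u_1 Cz_1 + \rho\,(u_1 Cz_2 + u_2 Cz_1) + \rho^2\, u_2 Cz_2, \\
Az \circ Bz - u \cdot Cz &= e_1 + \rho\, T + \rho^2 e_2,
\qquad T = Az_1 \circ Bz_2 + Az_2 \circ Bz_1 - u_1 Cz_2 - u_2 Cz_1 .
\end{aligned}
$$

The $\rho^0$ and $\rho^2$ coefficients are exactly the two input error terms, so the only new object the prover has to commit to is the $\rho^1$ coefficient $T$.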
The blinding observation. Nova-style folding has a property the paper calls blinding: if you fold a satisfying pair $(\mathbb{W}, \mathbb{U})$ with a uniformly random satisfying pair $(\mathbb{W}_r, \mathbb{U}_r)$, the folded witness $\mathbb{W}'$ is information-theoretically independent of the input $\mathbb{W}$. The randomness in $\mathbb{W}_r$ masks the input. The paper attributes this observation to HyperNova [24], and it’s the seed of every speed result that follows.
The Trick: Folding-With-Random Is Already Zero-Knowledge
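Here’s the move that makes PlasmaBlind work. The user has built a satisfying R1CS pair $(w, x)$, the assignment they’d normally hand to a zkSNARK prover. Instead of running the prover, they:
- Sample a uniformly random satisfying committed-relaxed pair $(\mathbb{W}_r, \mathbb{U}_r)$.
- Run one folding step against it, with $(\mathbb{W}, \mathbb{U})$ the committed form of $(w, x)$: $(\mathbb{W}', \mathbb{U}', \bar{T}) \leftarrow \mathsf{NIFS.P}((\mathbb{W}, \mathbb{U}), (\mathbb{W}_r, \mathbb{U}_r))$.
- Ship $(\mathbb{W}', \mathbb{U}, \mathbb{U}_r, \bar{T})$.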
That’s the entire “proof generation” the user does. The thing they ship is a folded witness plus enough public data to let the aggregator re-derive the folded instance.
Why it’s zero-knowledge. The blinding property says the folded witness $\mathbb{W}'$ is distributed identically regardless of what $w$ was: the random pair masks the real witness. The aggregator can verify that $(\mathbb{W}', \mathbb{U}')$ satisfies committed-relaxed R1CS, but they can’t recover anything about the original transaction from it.
Why it’s sound. The folding scheme is sound: if $(w, x)$ didn’t satisfy the transaction validity circuit, then with overwhelming probability over the challenge $\rho$, the folded pair doesn’t satisfy committed-relaxed R1CS, and the aggregator catches it.
The catch: the “proof” isn’t succinct. $\mathbb{W}'$ carries a folded witness vector. For the transaction validity circuit, that’s hundreds of kilobytes. We trade a small succinct proof (slow to generate) for a big non-succinct one (fast to generate, but big on the wire). PlasmaBlind makes this trade work by pushing the succinctness step to the aggregator, who has CPU to spare and can amortize it over an entire block.
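The algebra of the fold-with-random step can be sketched in a few lines of Python. This is a toy under loud assumptions: it drops the hiding commitments entirely, samples the challenge directly instead of deriving it by Fiat–Shamir, works over an arbitrary prime field, and uses a made-up two-constraint circuit (`a*b = t`, `t*a = c`). All function names are hypothetical. It only demonstrates that a real pair folded with a random satisfying pair still satisfies relaxed R1CS, and that a bad witness does not.

```python
import secrets

P = 2**61 - 1  # toy prime modulus (assumption; real schemes use a curve's scalar field)

# R1CS for "a*b = t and t*a = c", with z = (a, b, t | c | u).
A = [[1, 0, 0, 0, 0], [0, 0, 1, 0, 0]]
B = [[0, 1, 0, 0, 0], [1, 0, 0, 0, 0]]
C = [[0, 0, 1, 0, 0], [0, 0, 0, 1, 0]]

def matvec(M, z):
    return [sum(m * v for m, v in zip(row, z)) % P for row in M]

def had(a, b):
    """Entry-wise (Hadamard) product mod P."""
    return [(s * t) % P for s, t in zip(a, b)]

def z_of(pair):
    w, e, x, u = pair
    return w + x + [u]

def satisfies(pair):
    """Check Az ∘ Bz == u·Cz + e for a relaxed pair (w, e, x, u)."""
    w, e, x, u = pair
    z = z_of(pair)
    lhs = had(matvec(A, z), matvec(B, z))
    rhs = [(u * c + err) % P for c, err in zip(matvec(C, z), e)]
    return lhs == rhs

def random_satisfying_pair():
    """Sample z uniformly, then solve for the error term that makes it satisfy."""
    w = [secrets.randbelow(P) for _ in range(3)]
    x = [secrets.randbelow(P)]
    u = secrets.randbelow(P)
    z = w + x + [u]
    e = [(ab - u * c) % P
         for ab, c in zip(had(matvec(A, z), matvec(B, z)), matvec(C, z))]
    return (w, e, x, u)

def fold(p1, p2, rho):
    """One folding step: cross-term T, then a random linear combination."""
    (w1, e1, x1, u1), (w2, e2, x2, u2) = p1, p2
    z1, z2 = z_of(p1), z_of(p2)
    cross = [(s + t) % P for s, t in zip(had(matvec(A, z1), matvec(B, z2)),
                                         had(matvec(A, z2), matvec(B, z1)))]
    T = [(k - u1 * c2 - u2 * c1) % P
         for k, c1, c2 in zip(cross, matvec(C, z1), matvec(C, z2))]
    w = [(s + rho * t) % P for s, t in zip(w1, w2)]
    x = [(s + rho * t) % P for s, t in zip(x1, x2)]
    e = [(s + rho * t + rho * rho * v) % P for s, t, v in zip(e1, T, e2)]
    return (w, e, x, (u1 + rho * u2) % P)

def prove(w, x):
    """Client side: lift (w, x) to u = 1, e = 0 and fold once with a random pair."""
    real = (list(w), [0, 0], list(x), 1)
    rho = 1 + secrets.randbelow(P - 1)  # stand-in for the Fiat-Shamir challenge
    return fold(real, random_satisfying_pair(), rho)
```

Calling `prove([3, 4, 12], [36])` yields a pair that passes `satisfies`, while `prove([3, 4, 13], [36])` does not, because the claimed zero error term no longer matches the real residual. The folded witness entries are `w1 + rho * w2` with `w2` uniform, which is the masking the blinding property relies on.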
What’s In a Transaction
Before we follow the proof to the aggregator, let’s pin down what the transaction validity circuit actually checks. PlasmaBlind runs in the UTXO model. A transparent transaction is
$\mathsf{tx} = (\{\mathsf{utxo}^{\mathsf{in}}_a\}, \{\mathsf{utxo}^{\mathsf{out}}_b\})$: input UTXOs, output UTXOs. The sender knows it; nobody else does. The shielded form replaces each input with a nullifier and each output with a committed UTXO: $\overline{\mathsf{tx}} = (\{\mathsf{nf}_a\}, \{\mathsf{cm}_b\})$.
The transaction validity circuit enforces, in plain English:
- $\mathsf{pk}$ is derived from $\mathsf{sk}$ through the PRF.
- For every input UTXO at position $(i, j, k)$: the sender owns it, it’s the $i$-th leaf of the committed UTXO tree of transaction $j$ in block $k$, that block’s header sits in the block tree at the public block-tree root, and the supplied nullifier $\mathsf{nf}$ is correctly derived from the sender’s secret key and the position.
- For every output UTXO: the committed form is a correct opening of the plaintext.
- The values balance: $\sum_a v^{\mathsf{in}}_a = \sum_b v^{\mathsf{out}}_b$.
A handful of Merkle membership proofs (MT.V) and PRF evaluations. The sender hands the private witness and the public input to the circuit, extracts the assignment, and the resulting $(w, x)$ is the pair they fold.
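The checks above can be sketched as ordinary Python, outside any circuit. Everything concrete here is an assumption for illustration: SHA-256 stands in for the circuit-friendly hash, the commitment is hash-based rather than Pedersen-style, the nullifier is taken to be a hash of the secret key and the position, the committed UTXO tree is flattened to one commitment per leaf, and the block-tree membership check is omitted. The helper names are hypothetical.

```python
import hashlib
from dataclasses import dataclass

def H(*parts: bytes) -> bytes:
    """SHA-256 over length-prefixed parts (stand-in for Griffin/Poseidon)."""
    h = hashlib.sha256()
    for p in parts:
        h.update(len(p).to_bytes(4, "big") + p)
    return h.digest()

def merkle_layers(leaves):
    """All layers of a binary Merkle tree; len(leaves) must be a power of two."""
    layers = [list(leaves)]
    while len(layers[-1]) > 1:
        cur = layers[-1]
        layers.append([H(cur[i], cur[i + 1]) for i in range(0, len(cur), 2)])
    return layers

def merkle_path(layers, index):
    """Sibling hashes from leaf to root for the leaf at `index`."""
    path = []
    for layer in layers[:-1]:
        path.append(layer[index ^ 1])
        index //= 2
    return path

def merkle_verify(root, leaf, index, path):
    node = leaf
    for sibling in path:
        node = H(node, sibling) if index % 2 == 0 else H(sibling, node)
        index //= 2
    return node == root

@dataclass(frozen=True)
class Utxo:
    value: int
    owner_pk: bytes

def commit(utxo, opening):
    """Hash-based hiding commitment (assumption; the protocol uses Pedersen-style)."""
    return H(utxo.value.to_bytes(8, "big"), utxo.owner_pk, opening)

def nullifier(sk, position):
    """Assumed derivation: hash of the secret key and the (block, tx, output) triple."""
    return H(sk, *(n.to_bytes(4, "big") for n in position))

def tx_is_valid(sk, pk, inputs, outputs, nullifiers, commitments):
    """inputs: (utxo, opening, position, leaf_index, root, path); outputs: (utxo, opening)."""
    if pk != H(sk) or len(inputs) != len(nullifiers) or len(outputs) != len(commitments):
        return False
    for (utxo, opening, position, leaf_index, root, path), nf in zip(inputs, nullifiers):
        if utxo.owner_pk != pk:
            return False  # sender must own the input
        if not merkle_verify(root, commit(utxo, opening), leaf_index, path):
            return False  # input must sit in the committed UTXO tree
        if nf != nullifier(sk, position):
            return False  # nullifier must be correctly derived
    for (utxo, opening), cm in zip(outputs, commitments):
        if cm != commit(utxo, opening):
            return False  # each output commitment must open to its plaintext
    return sum(i[0].value for i in inputs) == sum(o[0].value for o in outputs)
```

In the real protocol these checks are constraints inside the circuit, and the satisfying assignment, not a boolean, is what the sender extracts and folds.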
Why the Aggregator’s Side Is Hard
The user’s job ends there. The aggregator’s job is harder. Per transaction, the aggregator has to:
- Verify the user’s folded pair. That is, fold the two shipped input instances with the claimed $\bar{T}$ on the verifier side and check that the result matches the shipped folded witness $\mathbb{W}'$. Cheap: one NIFS.V.
- Update three trees. Insert $\mathsf{cm}$s into the committed UTXO tree, insert sender public keys (or a null placeholder) into the public key tree, insert nullifiers into the interval Merkle nullifier tree (which requires finding the right interval and splitting it).
- Produce a block proof. The rollup contract needs a single succinct proof that everything above happened correctly for every transaction in the block.
The naive way to produce that block proof is non-uniform PCD: think of a circuit that branches between “fold this transaction” and “update this tree,” then recursively compose proofs in a tree structure. Two things make non-uniform PCD expensive:
- Non-uniform means different parts of the computation use different step circuits. You either build a giant circuit that contains all of them with branching gadgets, or you use a PCD scheme that natively supports heterogeneous steps — both more expensive than the uniform case.
- PCD itself (vs IVC) handles tree-shaped composition where leaves are external proofs from many provers. The recursion overhead of verifying a folded pair inside a SNARK circuit is the cost center.
This is the rock the 750-GB-RAM aggregators broke on.
The Aggregator’s Trick: Two Linked Uniform IVC Chains
PlasmaBlind’s structural fix is to refuse the PCD framing. The aggregator instead maintains two independent uniform IVC chains running in parallel, and links them with a single line of step-circuit logic.
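Chain 1: the user-chain. A running accumulator $(\mathbb{W}^{\mathsf{user}}, \mathbb{U}^{\mathsf{user}})$ that swallows incoming user folded pairs, one transaction at a time. Every step is the same operation: take the current accumulator, take the user’s $(\mathbb{W}', \mathbb{U}')$, fold them into a new accumulator. Uniform: every transaction looks like every other transaction.
Chain 2: the state-chain. A running pair $(\mathbb{W}^{\mathsf{state}}, \mathbb{U}^{\mathsf{state}})$ accumulating proofs that the per-step state update circuit $F$ ran correctly. The step circuit takes the current state $s$ and a witness $\omega$, and enforces:
- The committed UTXO membership check (the $j$-th leaf of the committed UTXO tree $\mathcal{T}^{\mathsf{utxo}}_k$ is the right output set).
- The public key membership check (the $j$-th leaf of the public key tree $\mathcal{T}^{\mathsf{pk}}_k$ is the sender’s $\mathsf{pk}$).
- The nullifier-tree interval update (find an interval $(\mathsf{lo}, \mathsf{hi})$ such that $\mathsf{lo} < \mathsf{nf} < \mathsf{hi}$, replace it with two new intervals $(\mathsf{lo}, \mathsf{nf})$ and $(\mathsf{nf}, \mathsf{hi})$).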
The step circuit is the same every time. Uniform.
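The interval update in the last bullet is easy to see on a plain sorted list. The sketch below is a minimal model, assuming open intervals of unspent values with sentinel endpoints; it leaves out the Merkle hashing and the in-circuit membership proofs, and the class and method names are hypothetical.

```python
import bisect

class IntervalNullifierSet:
    """Unspent nullifier space as sorted, disjoint open intervals (lo, hi)."""

    def __init__(self, upper_bound):
        # One interval covering everything; 0 and upper_bound act as sentinels.
        self.intervals = [(0, upper_bound)]

    def find_interval(self, nf):
        """Index of the interval with lo < nf < hi, or None if nf is already spent."""
        # (nf,) sorts before every (nf, hi), so this lands on the last lo < nf.
        i = bisect.bisect_left(self.intervals, (nf,)) - 1
        if i >= 0 and nf < self.intervals[i][1]:
            return i
        return None

    def insert(self, nf):
        """Spend nf by splitting its interval in two. Returns False on a double spend."""
        i = self.find_interval(nf)
        if i is None:
            return False
        lo, hi = self.intervals[i]
        self.intervals[i:i + 1] = [(lo, nf), (nf, hi)]
        return True
```

Finding an enclosing interval doubles as the non-membership proof: a spent nullifier is always an endpoint, never strictly inside an interval, so a second `insert` of the same value fails.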
The link. Inside $F$ there’s one extra line:
$$\mathbb{U}^{\mathsf{user}}_{n+1} = \mathsf{NIFS.V}(\mathbb{U}^{\mathsf{user}}_{n}, \mathbb{U}', \bar{T}_n)$$
The state-chain’s step circuit verifies, inside the circuit, that the user-chain’s current accumulator was advanced by folding the same transaction this step is processing. That single NIFS.V call ties the two chains together: it’s impossible to update the trees for a transaction in chain 2 unless chain 1 actually folded that same transaction in.
The full per-transaction trace from the paper (§3.2.2):
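1. $\mathbb{U}' \leftarrow \mathsf{NIFS.V}(\mathbb{U}, \mathbb{U}_r, \bar{T})$: extract the verifier-side folded instance from the user’s shipment $(\mathbb{W}', \mathbb{U}, \mathbb{U}_r, \bar{T})$.
2. Chain 1 update: $(\mathbb{W}^{\mathsf{user}}_{n+1}, \mathbb{U}^{\mathsf{user}}_{n+1}, \bar{T}_n) \leftarrow \mathsf{NIFS.P}((\mathbb{W}^{\mathsf{user}}_n, \mathbb{U}^{\mathsf{user}}_n), (\mathbb{W}', \mathbb{U}'))$.
3. Chain 2 update: fold the state-chain’s running pair $(\mathbb{W}^{\mathsf{state}}_n, \mathbb{U}^{\mathsf{state}}_n)$ with the pair produced by the latest step-circuit evaluation, again via $\mathsf{NIFS.P}$.
4. Evaluate $F$ on $(s_n, \omega_n)$ to get the next state $s_{n+1}$.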
Steps 2 and 3 are both uniform folds. Step 4 is a uniform circuit eval. Step 1 is the aggregator running NIFS.V outside the circuit — also cheap.
Why this beats non-uniform PCD. Two reasons.
- Uniform amortization. Cycle-of-curves overhead (CycleFold) is paid once per chain, not once per step circuit variant. With two uniform chains you have constant overhead. With non-uniform PCD you’d have overhead proportional to the number of step-circuit shapes.
- No branching in the step circuit. A non-uniform circuit needs gadgets that select between branches: every branch’s constraints are present, gated by selectors. PlasmaBlind’s step circuit is just the state update; the user-proof verification is separately accumulated in chain 1 and only checked inside the state circuit via one NIFS.V call.
The result: per-transaction aggregator time ~300 ms end-to-end, even with both chains running.
At the end of the block, the aggregator compresses both running pairs and a final block-tree-update proof into one zkSNARK using MicroNova [32], yielding a proof of about 12 KB. This is the proof that goes on-chain.
Balance Proofs and Instant Exit
The user maintains a third proof locally: $\pi_{\mathsf{bal}}$, an IVC proof over a balance update step circuit $F_{\mathsf{bal}}$. The witness records the user’s running balance and nonce; the step circuit checks each transaction the user is involved in (send or receive), updates the balance, and increments the nonce on sends.
The improvement over Intmax2 is two-fold:
- No sync for irrelevant blocks. For blocks the user isn’t in, they just accumulate the block’s roots into a chained hash — no cryptographic step circuit invocation. They prove membership of those hashed blocks in the block tree only when they need to exit.
- Folding-based IVC, not PCD. Update overhead per relevant block is folding-cheap (~46 ms on M1 Max), not PCD-expensive.
Exiting is non-interactive: present $\pi_{\mathsf{bal}}$ plus block-tree membership proofs for the accumulated block hashes to the rollup contract. The contract verifies them, transfers the L1 balance, and removes the user’s L2 entry.
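The bookkeeping for irrelevant blocks is just a hash chain. A minimal sketch, assuming SHA-256 as the hash, a 32-byte zero value as the starting point, and a header made of three 32-byte roots; the exact encoding and hash PlasmaBlind uses are not specified here, so treat every detail as hypothetical.

```python
import hashlib

GENESIS = bytes(32)  # hypothetical starting value for the chain

def absorb(acc: bytes, header: tuple) -> bytes:
    """Fold one block header (three 32-byte Merkle roots) into the running hash."""
    assert len(header) == 3 and all(len(root) == 32 for root in header)
    return hashlib.sha256(acc + b"".join(header)).digest()

def accumulate(headers) -> bytes:
    """Chain every skipped block's header into a single 32-byte digest."""
    acc = GENESIS
    for header in headers:
        acc = absorb(acc, header)
    return acc
```

Each skipped block costs the client one hash and no step-circuit invocation; the expensive part, proving those headers are in the block tree, is deferred to exit time.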
The Numbers
§4 of the paper benchmarks the Rust implementation (built on the sonobe folding library). Client benchmarks on a MacBook M1 Max / 32 GB; aggregator benchmarks on an Intel i9-12900K / 64 GB.
Client-side proving (Table 1, paper):
| Operation | Griffin (2-to-2) | Poseidon (2-to-2) | Griffin (4-to-4) | Poseidon (4-to-4) |
|---|---|---|---|---|
| Prove transaction validity | 32.8 ms | 51.8 ms | 60.4 ms | 94.0 ms |
| Prove balance update | 43.1 ms | 45.8 ms | 46.4 ms | 48.3 ms |
Griffin is roughly 1/3 faster than Poseidon for transaction validity (smaller circuit), but their balance update times are within 5 ms — the bottleneck there is CycleFold’s recursive overhead, which is hash-function-agnostic. Note: the paper benchmarks Griffin while flagging that recent algebraic attacks have raised questions about Griffin’s security; the numbers are representative of efficiency-comparable safer alternatives.
Aggregator-side block building scales linearly with the number of transactions. At 128 transactions per block, the total time ranges from 34.0 s (Griffin, 2-to-2) to 45.5 s (Poseidon, 4-to-4). Breakdown of the 128-tx Griffin case: local instance folding (51–60%), user instance folding (19–21%), step circuit synthesis (11–22%), user instance validation (7–10%). The state-chain (chain 2) fold dominates, which makes sense — its witness carries the augmented step circuit’s full assignment.
Throughput. The math (paper §4 closing paragraphs) assumes 14 blobs per L1 block × 131,072 B = 1835 KB of available blob space, 12-second L1 slots, a 12 KB decider proof, a 96-byte header, 4-input transactions, and Intmax2’s ~4.15-byte short-id encoding for sender public keys.
- Centralized. Aggregator doesn’t post nullifiers, only sender ids. That leaves room for roughly 439,000 transactions per L1 block → ~36,604 TPS.
- Decentralized. Aggregator must post 4 × 32 bytes of nullifiers per transaction (so different aggregators stay in sync), which leaves ~13,796 tx/block → ~1,149 TPS. Truncate nullifiers to 20 bytes and you get ~1,805 TPS.
The 30× drop from centralized to decentralized is fundamentally a calldata problem, not a proving problem — the prover-side numbers are the same.
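The throughput figures can be re-derived with a few lines of arithmetic. This sketch makes two assumptions that the text does not spell out: the decider proof and header are subtracted from the blob space before dividing, and “12 KB” is read as 12,000 bytes. Under those assumptions it reproduces the three quoted TPS numbers; the decentralized per-block count comes out a couple of transactions below the quoted ~13,796, so the paper’s exact accounting may differ slightly.

```python
BLOB_BYTES = 14 * 131_072   # blob space per L1 block
SLOT_SECONDS = 12           # L1 slot time
PROOF_BYTES = 12_000        # assumption: "12 KB" decider proof read as 12,000 bytes
HEADER_BYTES = 96
SENDER_ID_BYTES = 4.15      # Intmax2-style short id, per transaction
INPUTS_PER_TX = 4

def tx_per_block(nullifier_bytes: int) -> float:
    """Transactions that fit once the proof and header are paid for."""
    payload = BLOB_BYTES - PROOF_BYTES - HEADER_BYTES
    per_tx = SENDER_ID_BYTES + INPUTS_PER_TX * nullifier_bytes
    return payload / per_tx

def tps(nullifier_bytes: int) -> float:
    return tx_per_block(nullifier_bytes) / SLOT_SECONDS

centralized = tps(0)     # no nullifiers posted
decentralized = tps(32)  # full 32-byte nullifiers
truncated = tps(20)      # 20-byte truncated nullifiers
```

The per-transaction byte count is the only thing that changes between the three cases, which is the sense in which the gap is a data problem and not a proving problem.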
Proof sizes grow linearly with block-tree height (the epoch optimization). At height 4 (a small anonymity set), Griffin 2-to-2 is 259 KB. At height 32 (a maximum-sized epoch), Poseidon 4-to-4 hits 1.9 MB. Big for mobile clients, fine for laptop/desktop senders.
Limitations
The honest list, mostly from §3.3 and the structure of the construction.
- Sender public keys are visible on-chain. PlasmaBlind hides amounts and recipients but reveals when each address is transacting. This is a deliberate trade for data availability: without sender ids in the public key tree, the user can’t verify their endorsed transaction was actually included in a block. Stronger sender privacy is possible but needs different DA assumptions.
- Client proofs are big. Hundreds of KB to ~2 MB per transaction. Acceptable for aggregator ingestion (and far smaller than the alternative — sending a full IVC proof history Intmax2-style), but heavy for mobile.
- Decentralized throughput is calldata-bound. ~30× drop from centralized to decentralized comes entirely from posting nullifiers on-chain for cross-aggregator consistency. Not solved by PlasmaBlind.
- Decider choice matters. Centralized throughput math bakes in MicroNova’s 12 KB proof size for the decider. A different decider SNARK changes the headline TPS number.
- Griffin caveat. The benchmarks lean on Griffin for raw speed, but recent algebraic attacks have raised concerns. The paper flags this; production deployments would need a safer arithmetization-oriented hash.
What’s Worth Taking Away
Three ideas, in order of how much they generalize.
1. Folding-with-random is a ZK proof. This is the smallest, most reusable trick in the paper. Any time you have a folding scheme with the blinding property and a relation that can be expressed as committed-relaxed R1CS, you can produce a zero-knowledge proof of satisfiability by doing exactly one folding step against a random satisfying pair. No general-purpose zkSNARK prover ever runs.
2. Two linked uniform IVC chains beat one non-uniform PCD. When you have a computation that mixes external (other parties’ proofs) and local (your tree updates) work with different shapes, the textbook fix is non-uniform PCD. PlasmaBlind shows you can often factor it: run two uniform IVC chains in parallel and add one NIFS.V line to the local chain’s step circuit to verify the external chain folded the same input. The paper notes this is of “independent interest” — it’s the kind of pattern that should show up in other rollup designs.
3. The PCS-style cost flip works for L2s too. PCSs have spent years moving costs from the verifier to the prover via succinct proofs. Folding does the opposite move for L2s: it puts a (small, fast) cost on the client and a (big, but cheap-per-tx) cost on the aggregator. The end-to-end win comes from the aggregator being able to amortize over the whole block.
If you want to chase the threads: read PlasmaFold (the non-private precursor by the same authors), Nova for the folding scheme internals, HyperNova for the blinding-folding observation, MicroNova for the decider, and the sonobe Rust implementation if you want to play with folding directly.
References
- PlasmaBlind (this paper): eprint 2026/634 — Daix-Moreux & Zhang, 2026.
- PlasmaFold: eprint 2025/1300 — Daix-Moreux & Zhang, 2025. Non-private precursor.
- Intmax2: eprint 2023/1082 — Rybakken et al., 2023. The short-id and DA design PlasmaBlind borrows.
- Nightfall: github.com/EYBlockchain/nightfall_4_CE, 2024.
- Nova: Kothapalli, Setty, Tzialla. Recursive zero-knowledge arguments from folding schemes. CRYPTO 2022.
- HyperNova: Kothapalli, Setty. Recursive arguments for customizable constraint systems. CRYPTO 2024.
- CycleFold: Kothapalli, Setty. eprint 2023/1192.
- MicroNova: Zhao, Setty, Cui, Zaverucha. Folding-based arguments with efficient (on-chain) verification. IEEE S&P 2025.
- Zerocash: Ben-Sasson et al. Decentralized anonymous payments from Bitcoin. IEEE S&P 2014.
- Poseidon: Grassi, Khovratovich, Rechberger, Roy, Schofnegger. USENIX Security 2021.
- Griffin: Grassi et al. CRYPTO 2023.
- sonobe: github.com/privacy-ethereum/sonobe. The Rust folding-scheme library PlasmaBlind builds on.