Guide to zero-knowledge proof systems
Once a breakthrough in applied cryptography, zero-knowledge proofs are widely used for anonymizing transactions. They let parties share information in a privacy-preserving way, improve the security of digital systems, and boost the throughput of popular blockchains. Discover how zero-knowledge proofs work in our in-depth guide.
What are zero-knowledge proofs?
Zero-knowledge proofs were first described in "The knowledge complexity of interactive proof systems," a research paper published in 1985. It defines them as "proofs that convey no additional knowledge other than the correctness of the proposition in question." In layman's terms, these methods let us prove that we know a secret without giving it away.
A zero-knowledge protocol connects two parties: the prover and the verifier. The prover wants to prove the validity of a specific claim, and the verifier must confirm it. However, there is a catch: the only thing the verifier may learn is whether the statement is true or not. Such a proof reveals neither the secret behind the claim nor the method of discovering the truth.
Ali Baba's cave
To understand this concept, consider the following variation of a story from "How to Explain Zero-Knowledge Protocols to Your Children," a paper presented at the CRYPTO '89 conference by Jean-Jacques Quisquater et al.
In this hypothetical scenario, Alice (verifier) considers buying a secret code from Bob (prover), but she wants to make sure it is valid. This code unlocks a magic door connecting two paths (A and B) inside a cave. How can Bob convince Alice that he knows the code without giving it away?
Alice stands at the mouth of the cave while Bob enters it using either of the paths. Next, Alice walks inside and shouts to Bob, instructing him to use a specific exit. If Bob knows the secret code, he will always return by the path Alice chose, even if he entered using the other one.
In a single round, a Bob who does not know the code still has a 50% chance of succeeding by luck. Each repetition halves that chance: after n rounds it is 1 in 2^n, so 20 rounds leave less than a one-in-a-million chance of undetected cheating. Cheating becomes practically impossible, and Alice learns that the code is legitimate without ever seeing it.
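To make the odds concrete, here is a minimal Python sketch of the cave experiment. The function names and the simulation itself are illustrative assumptions, not part of the original story: a Bob without the code can only return by the path he entered, so he fails as soon as Alice requests the other one.

```python
import random


def cheating_success_probability(rounds: int) -> float:
    """Chance that a prover who does NOT know the code passes every round."""
    return 0.5 ** rounds


def run_cave_protocol(knows_code: bool, rounds: int, rng: random.Random) -> bool:
    """Simulate the cave experiment; return True if Bob passes every round."""
    for _ in range(rounds):
        entered = rng.choice("AB")    # path Bob takes while Alice waits outside
        requested = rng.choice("AB")  # exit Alice shouts from the entrance
        # Without the code, Bob cannot cross the door and must come back
        # the way he went in, so a mismatch exposes him.
        if not knows_code and entered != requested:
            return False
    return True


if __name__ == "__main__":
    rng = random.Random()
    print("honest Bob passes 20 rounds:", run_cave_protocol(True, 20, rng))
    print("chance of bluffing 20 rounds:", cheating_success_probability(20))
```

An honest Bob always passes, while the chance of bluffing through every round shrinks by half with each repetition.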
Why do we need zero-knowledge proofs?
Consider another example — proving identity or citizenship to a service provider by showing your driver's license or national passport. Although typical, this method is inherently flawed from a security standpoint.
Personally Identifiable Information (PII) is shared with a third party, which then stores it in its centralized database. This is not an ideal solution as identity theft is on the rise — in 2021 alone, around 15 million Americans had their identity stolen. Thus, privacy protection is paramount, and individuals need other means of providing sensitive details.
Zero-knowledge proofs offer a spectrum of alternatives: simple and reliable ways to prove validity quickly and securely. These protocols produce succinct evidence that a statement is true without exposing the witness, the secret data used to create the proof.
In our identity-related scenario, a zero-knowledge proof is the only necessary evidence. If its specific properties hold true, it will convince the verifier that your claim is valid. The same logic applies to a "Where's Waldo?" puzzle, where the challenge is to prove that you have found the character without revealing his location.
How do zero-knowledge proofs work?
Zero-knowledge protocols are based on special algorithms. Taking a statement as an input, they return an output in the form of true or false. In addition, they use commitment schemes that work for both parties.
A prover can specify all the information beforehand and only reveal what corresponds to the verifier's choices. The verifier also uses a commitment scheme — it lets them specify their choices in advance. Zero-knowledge proofs must meet the following requirements:
Completeness
If the statement is true and both parties follow the protocol honestly, the verifier is guaranteed to accept the proof.
Soundness
It is practically impossible to manipulate a zero-knowledge protocol into returning true for an invalid input. A dishonest prover can deceive an honest verifier only with a negligible probability.
Zero-knowledge
A verifier learns nothing beyond whether the statement at hand is true or false. They have no way of deriving the prover's secret input, the witness, from the proof.
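The commitment schemes mentioned at the start of this section can be illustrated with a simple hash-based construction. The sketch below is an assumption for illustration only (real protocols often use algebraic commitments instead): a party publishes the SHA-256 digest of a random nonce plus the value, and later opens the commitment by revealing both.

```python
import hashlib
import hmac
import secrets
from typing import Tuple


def commit(value: bytes) -> Tuple[bytes, bytes]:
    """Return (commitment, nonce). Publish the commitment, keep the nonce."""
    nonce = secrets.token_bytes(32)  # fixed length, so nonce + value is unambiguous
    commitment = hashlib.sha256(nonce + value).digest()
    return commitment, nonce


def verify_opening(commitment: bytes, nonce: bytes, value: bytes) -> bool:
    """Check that (nonce, value) is a valid opening of the commitment."""
    expected = hashlib.sha256(nonce + value).digest()
    return hmac.compare_digest(commitment, expected)


if __name__ == "__main__":
    c, n = commit(b"exit by path A")
    print(verify_opening(c, n, b"exit by path A"))
    print(verify_opening(c, n, b"exit by path B"))
```

The random nonce keeps the value hidden until it is opened, and the hash makes it infeasible to open the same commitment to a different value later.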
Components of zero-knowledge protocols
A zero-knowledge proof comprises three fundamental elements. The descriptions below refer to an interactive zero-knowledge proof, the basis of early protocols. In this model, determining the validity of a statement requires back-and-forth communication between provers and verifiers.
Witness
A prover validates their knowledge of secret information — the witness to the proof. To begin, they randomly pick a question from a set corresponding to their assumed knowledge of the witness. After calculating the answer, they send it to the verifier.
Challenge
One correct answer does not prove actual access to the witness. To ensure the prover has not guessed it, the verifier randomly picks another question from the same set.
Response
Upon receiving the second question, the prover calculates the answer and sends it back to the verifier. The verifier checks it and selects more questions. This interaction repeats multiple times to reduce the possibility of faking knowledge.
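One classic protocol that follows this three-step shape is Schnorr identification, shown below as a hedged sketch rather than the scheme any particular system uses. The parameters P, Q, and G are toy values chosen for readability and are far too small to be secure; the function names are hypothetical. The prover shows knowledge of a secret x such that G^x = y (mod P).

```python
import secrets

# Toy parameters (an assumption for illustration, NOT secure):
# G generates a subgroup of prime order Q in the integers modulo P.
P, Q, G = 23, 11, 2


def keygen():
    """Return (witness x, public value y) with y = G^x mod P."""
    x = secrets.randbelow(Q - 1) + 1
    return x, pow(G, x, P)


def prover_commit():
    """First message: pick a random r and send t = G^r mod P."""
    r = secrets.randbelow(Q)
    return r, pow(G, r, P)


def verifier_challenge():
    """Second message: the verifier picks a random challenge c."""
    return secrets.randbelow(Q)


def prover_respond(x, r, c):
    """Third message: s = r + c*x mod Q, which only works if x is known."""
    return (r + c * x) % Q


def verifier_check(y, t, c, s):
    """Accept if G^s == t * y^c (mod P)."""
    return pow(G, s, P) == (t * pow(y, c, P)) % P


def run_round(x, y):
    """Run one full commit, challenge, response exchange."""
    r, t = prover_commit()
    c = verifier_challenge()
    s = prover_respond(x, r, c)
    return verifier_check(y, t, c, s)


if __name__ == "__main__":
    x, y = keygen()
    print(all(run_round(x, y) for _ in range(20)))
```

With such a tiny group a cheater could guess the challenge about once in 11 tries, which is why the interaction is repeated or, in practice, run in a much larger group.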
Non-interactive zero-knowledge proofs
Any interactive proving process is imperfect, as the prover and the verifier must be available and interact multiple times. It also excludes independent verification, even when the verifier is convinced of the prover's honesty. Convincing another participant would require another set of messages between the two parties.
This problem is solved by non-interactive zero-knowledge proofs proposed by Manuel Blum, Paul Feldman, and Silvio Micali. Their methods include a key that a prover and their verifier share. It lets the prover show their knowledge beyond doubt in a single round without revealing the witness. Here is how non-interactive ZK proofs work.
- Instead of answering multiple questions one by one, a prover passes their secret information to a proving algorithm.
- The algorithm computes a zero-knowledge proof.
- The verifier receives the proof and uses another algorithm to check it.
This model outshines its predecessor in many ways. Aside from raising efficiency, it lets third parties verify the proof if they have access to the key and verification algorithm. These principles underlie modern proof types, which fall into several categories.
Zk-SNARK (Zero-Knowledge Succinct Non-Interactive Argument of Knowledge)
This type uses the shared key mentioned above — a set of public parameters for proof generation and verification approved by both participants. Here is what each of the components stands for.
- Zero-knowledge. A verifier's knowledge of any statement is limited to whether it is true or false. Validation does not require any additional information.
- Succinct. Verification is quick as the proof is smaller than the witness.
- Non-interactive. Only one round of communication between a prover and a verifier occurs.
- Argument. The proof is computationally sound: a prover with realistic computing power has only a negligible chance of cheating, in line with the soundness requirement described above.
- (Of) Knowledge. A prover must access the witness to prove their statement. The likelihood of computing a valid proof otherwise is close to zero.
Risks of using zk-SNARKs
The security of zk-SNARK protocols relies on the generation of public parameters for the shared key. Collectively, they are referred to as the Common Reference String (CRS). In theory, if a dishonest prover learns the randomness (entropy) used to generate the CRS, they can violate computational soundness and compute false proofs.
Multi-party computation (MPC) mitigates this risk: during a trusted setup ceremony, each party contributes a random value to generate the CRS. The resulting parameters remain secure as long as at least one participant is honest and destroys their input (sampled randomness). Only if every contribution leaks can malicious actors exploit the mathematical structure of the CRS.
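As a rough illustration of how each party's random value is folded into the final parameters, consider the toy sketch below. It is an assumption-laden simplification: real ceremonies produce many structured elements and include checks on every contribution, and the tiny group used here offers no security. All names are hypothetical.

```python
from functools import reduce

# Toy parameters (an assumption for illustration, NOT secure):
# G generates a subgroup of prime order Q in the integers modulo P.
P, Q, G = 23, 11, 2


def contribute(current: int, secret: int) -> int:
    """One participant raises the running value to their private exponent."""
    return pow(current, secret, P)


def run_ceremony(participant_secrets) -> int:
    """Chain all contributions, starting from the public generator G."""
    return reduce(contribute, participant_secrets, G)


def combined_trapdoor(participant_secrets) -> int:
    """The hidden exponent behind the final value: the product of all secrets."""
    tau = 1
    for secret in participant_secrets:
        tau = (tau * secret) % Q
    return tau


if __name__ == "__main__":
    inputs = [3, 5, 7]  # each participant's private randomness
    final_value = run_ceremony(inputs)
    print(final_value == pow(G, combined_trapdoor(inputs), P))
```

In this sketch the hidden exponent is the product of all contributions, so reconstructing it requires every single secret.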
Zk-STARK (Zero-Knowledge Scalable Transparent Argument of Knowledge)
These protocols do not require a trusted setup: users do not have to trust the parties generating the public parameters. Larger proofs lead to higher verification overheads compared to zk-SNARKs, but zk-STARKs have two critical advantages:
- Scalability. Zk-STARKs are more efficient and cost-effective for bigger witnesses (datasets). Prover and verifier times in zk-SNARKs grow linearly with the size of the witness. Here, they increase only slightly.
- Transparency. zk-STARKs replace trust with publicly verifiable randomness. As a result, they generate public parameters for proof and verification more transparently.
Bulletproofs
Like zk-STARKs, these proofs work without a trusted setup. They enable confidential crypto transactions and other applications in cryptographic protocols. For instance, a Bulletproof system can be used to prove that some encrypted number falls into a specific range without revealing anything else about the number. As you can see from the chart below, Bulletproofs are slower than both zk-SNARKs and zk-STARKs.
Use cases for a zero-knowledge proof system
This transformative technology has multiple applications, from cryptocurrencies and privacy protection to anonymous voting. Here are the top six applications in 2022.
Privacy-focused blockchains
Conventional payment systems undermine user privacy. In the case of credit card payments, transaction details are visible to banks, payment providers, and other parties, including state authorities. While cryptocurrencies emerged as a means for private, peer-to-peer transactions, these transactions are visible on public blockchains.
Most crypto transactions are not anonymous but pseudonymous. Malicious actors can trace virtual user identities to real-world identities — for instance, when a wallet holder shares their address on social media. On-chain and off-chain data analysis also helps cybercriminals achieve their goal.
Completely anonymous transactions are possible with privacy coins. These cryptocurrencies exist on privacy-focused blockchains, where transaction details (type of asset, its quantity, transaction times, and sender/receiver addresses) are hidden. As they use zero-knowledge technology, nodes validate transactions without accessing transaction data.
For example, zk-SNARKs encode some of the network consensus rules in Zcash, a cryptocurrency built on a fork of Bitcoin's codebase with enhanced security and anonymity. As a result, its transaction validity function determines whether each transaction is valid without revealing any of the calculations involved.
At the initial stage, the information being proved is transformed into its mathematical representation. A transaction validity function turns into a series of the simplest possible operations (addition, subtraction, multiplication, and division) — an arithmetic circuit. You can find a detailed description of this method on the Zcash site.
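To show what such a flattening looks like, here is a small sketch that breaks the hypothetical function x^3 + x + 5 into single-operation gates and evaluates them over a prime field. This is not Zcash's actual circuit; the gate format, wire names, and the modulus are assumptions chosen for illustration.

```python
import operator

FIELD_PRIME = 2**61 - 1  # hypothetical prime modulus for the example


def _field_div(a: int, b: int) -> int:
    """Division in a prime field is multiplication by the modular inverse."""
    return a * pow(b, -1, FIELD_PRIME)


OPS = {
    "add": operator.add,
    "sub": operator.sub,
    "mul": operator.mul,
    "div": _field_div,
}

# Each gate is (operation, left input, right input, output wire).
# Inputs are wire names (strings) or integer constants.
CIRCUIT = [
    ("mul", "x", "x", "x_squared"),
    ("mul", "x_squared", "x", "x_cubed"),
    ("add", "x_cubed", "x", "partial"),
    ("add", "partial", 5, "out"),
]


def evaluate(circuit, inputs):
    """Run every gate in order and return the value of all wires."""
    wires = dict(inputs)
    for op, left, right, out in circuit:
        a = wires[left] if isinstance(left, str) else left
        b = wires[right] if isinstance(right, str) else right
        wires[out] = OPS[op](a, b) % FIELD_PRIME
    return wires


if __name__ == "__main__":
    print(evaluate(CIRCUIT, {"x": 3})["out"])
```

The full set of wire values is what a prover would treat as the witness, while a proof system would show that every gate was computed correctly without revealing them.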
Public blockchains
ZK proofs can also anonymize transactions on public blockchains. For example, they hide users' details on Tornado Cash to enable private transactions on Ethereum. On the flip side, the opt-in nature of such privacy tools makes illicit activity possible.
Identity protection
Using zero-knowledge proofs, individuals validate their identities without disclosing sensitive details. These techniques underpin decentralized identity systems, environments where individuals own their digital identities and manage access to them. For example, one can prove their citizenship without providing any passport details.
Users of decentralized identity frameworks collect verified data about themselves from certified issuers in an identity wallet. Then, they decide what information to share with any requesting third parties.
Authentication
Many personalized online services require sensitive data, such as a user's full name, birth date, and phone number. Furthermore, account holders must create complex passwords and risk losing access if they forget them.
Zero-knowledge proofs simplify authentication for users and entities alike. Instead of exposing personal information, users can present ZK proofs based on public and private inputs. Meanwhile, organizations benefit as they do not have to store massive amounts of user data.
Supply chains
In industries like pharmaceuticals, supply chains include multiple transfers and require secure tracing systems. For example, when drugs are returned to distributors, these entities must verify all serial numbers before reselling. To facilitate this process, the MediLedger Project proposes a confidential chain of custody based on smart contracts and ZK proofs.
Verifiable computation
According to Vitalik Buterin's blockchain trilemma, it is impossible to improve the security, scalability, and decentralization of a platform at the same time. However, verifiable computation improves processing speeds without detriment to security.
This method delegates computation to another entity, which returns the result together with a proof of correct execution. The proof can be checked without repeating the computation, so the quality of verification is not diminished.
Verifiable computation underlies off-chain scaling solutions for Ethereum. Instead of redesigning the core protocol, they make the base layer more efficient using outsourced computation.
- A separate chain executes each transaction, returning a result and a ZK proof (validity proof).
- Ethereum verifies the proof and applies the results to its state immediately, without re-executing the transactions.
- Network congestion decreases, and transaction speeds rise.
- Validity proofs underpin zero-knowledge rollups and validiums, two off-chain scaling solutions for secure scalability.
Drawbacks of using zero-knowledge proofs
A perfect zero-knowledge method does not exist. ZK systems may be bulky and potentially vulnerable to advanced technologies. Another concern is unauthorized access to the secret values that define the parameters of the proof protocol. A malicious party holding them could create false proofs that still look valid to the verifiers.
Expensive equipment
Advanced calculations require specialized hardware that is prohibitively expensive for most individuals. For the same reason, applications using zero-knowledge proofs may charge higher fees.
Proof verification costs
Because the underlying computations are so complex, zero-knowledge proofs are also costly to verify. For example, verification of one zk-SNARK proof for zk-rollups on Ethereum costs around 500,000 gas.
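Since a rollup proof covers a whole batch of transactions, the verification cost is shared among them. The arithmetic below is a back-of-the-envelope sketch: the 500,000 figure comes from the text above, while the batch sizes are hypothetical and other costs (such as publishing transaction data) are ignored.

```python
PROOF_VERIFICATION_GAS = 500_000  # approximate figure quoted in the text


def verification_gas_per_transaction(
    batch_size: int, proof_gas: int = PROOF_VERIFICATION_GAS
) -> float:
    """Spread the fixed proof verification cost across a batch."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return proof_gas / batch_size


if __name__ == "__main__":
    for batch_size in (1, 100, 1_000):  # hypothetical batch sizes
        share = verification_gas_per_transaction(batch_size)
        print(f"{batch_size} transactions: {share:.0f} gas each")
```

The larger the batch, the smaller each transaction's share of the fixed verification cost.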
Trust assumptions
The biggest problem with zk-SNARKs is that the Common Reference String is created once, and multiple parties can reuse it. In a trusted setup ceremony, the honesty of participants is merely assumed, and users have no way of checking it. Zk-STARKs solve this problem via publicly verifiable randomness. Another system, zK-ConSNARK, claims to protect privacy on mainstream blockchains without a trusted setup.
Quantum computing threats
The security of zk-SNARKs is based on elliptic curve cryptography. In the future, quantum computers could render this approach insecure. Zk-STARKs, which rely on collision-resistant hashes instead, are believed to resist this threat as they do not use public key/private key pairings.
Final word
Zero-knowledge proofs allow one party to prove to another party that a statement is true without revealing anything beyond that fact. They increase blockchain throughput, support privacy coins, and have multiple applications beyond the crypto world. Zk-SNARKs, zk-STARKs, and Bulletproofs serve the same purpose: creating a future where users have enhanced anonymity and control over their information.