Zero-knowledge Virtual Machines, the Polaris License, and Vendor Lock-in

March 11, 2021

Originally posted on medium.com/zeroknowledge.

A punch card. Source: Marcin Wichary.

Recently released frameworks for building zero-knowledge applications, such as Zinc and Cairo, allow developers to relatively easily write programs whose execution can be verified while keeping all or some inputs private. In terms of developer-friendliness, this is a step up from writing zero-knowledge circuits, since such frameworks provide a single zero-knowledge virtual machine (ZK VM) circuit that can verify any program of a supported size. Without this option, developers have to write a new circuit per application and handle the complexity of unique proving and verifying keys.

Despite the benefits of such ZK VMs, there are also tradeoffs that I believe developers should carefully consider. In this post, I describe ZK circuits and VMs at a high level, and discuss the benefits and drawbacks of ZK VMs. While ZK VMs provide strong benefits such as easier development, a simpler security model, and greater compatibility with other VMs, I contend that the risk of vendor lock-in with some ZK VMs on the market is subtle but underestimated. With this blog post, I aim to start a discussion about this issue, so that developers can better understand the full set of tradeoffs involved, should they choose to build applications atop the aforementioned ZK VM platforms.

About zero-knowledge virtual machines

To understand ZK VMs, one must first understand ZK circuits, sometimes called arithmetic circuits or algebraic intermediate representations (AIRs). Without delving into technical details, it suffices to say that a circuit is a representation of a program. Given a circuit, a prover can use a zero-knowledge construction like a zk-SNARK or zk-STARK to demonstrate that they have correctly executed this program on a certain set of inputs, without revealing some or all of those inputs. Common circuits for cryptocurrency applications include those which prevent double-withdrawals of funds from zk-rollups and mixers, or those which prove that a coordinator has correctly tallied a set of votes in anti-collusion private voting systems.

An arithmetic circuit. Source: Hartwig Mayer.
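
To make the idea of a circuit as a representation of a program more concrete, here is a minimal sketch in Python. It is an illustration only: the modulus, the gate format, and the wire names are assumptions of mine, and there is no zero-knowledge or cryptography involved. It merely shows a program (out = x*x*x + x + 5) expressed as a fixed list of gates, and a check that a set of wire values satisfies every gate.

```python
# Toy arithmetic circuit over a small prime field (illustration only).
P = 97  # hypothetical small modulus; real proof systems use far larger primes

# Each gate is (operation, left operand, right operand, output wire).
# Operands are wire names or integer constants. Together the gates
# encode the program: out = x*x*x + x + 5
CIRCUIT = [
    ("mul", "x", "x", "x2"),
    ("mul", "x2", "x", "x3"),
    ("add", "x3", "x", "t"),
    ("add", "t", 5, "out"),
]

def value(wires, ref):
    """Resolve an operand: an integer constant or a named wire."""
    return ref % P if isinstance(ref, int) else wires[ref]

def gate_output(op, a, b):
    """Compute the output of a single gate in the field."""
    return (a * b) % P if op == "mul" else (a + b) % P

def witness(x):
    """Prover side: compute every wire value for the input x."""
    wires = {"x": x % P}
    for op, a, b, out in CIRCUIT:
        wires[out] = gate_output(op, value(wires, a), value(wires, b))
    return wires

def satisfies(wires):
    """Check that every gate constraint holds for the given wire values."""
    return all(
        wires[out] == gate_output(op, value(wires, a), value(wires, b))
        for op, a, b, out in CIRCUIT
    )

w = witness(3)
print(w["out"], satisfies(w))  # 35 True
```

In a real system, the prover would not hand over the wire values. Instead, a zk-SNARK or zk-STARK would convince the verifier that some satisfying assignment exists, while keeping the private wires hidden.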

In recent months, some zero-knowledge application frameworks have taken this approach one step further with zero-knowledge virtual machines. A ZK VM is a circuit that executes bytecode. It allows a prover to show that, given a set of inputs, as well as some bytecode, they have correctly executed the program code on said inputs.

A model of a ZK VM. Source: Bobbin Threadbare.

Notably, the bytecode (or program) is not the circuit itself, but part of its inputs. As such, a prover can use the same circuit to create proofs of validity of execution for arbitrary programs, as long as said programs fit the underlying ZK VM circuit.
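
The point that the program is an input rather than the circuit itself can be sketched with a toy stack machine. Everything here is hypothetical: the opcodes, the trace format, and the size limit are assumptions of mine and do not correspond to Cairo, Zinc, or any real ZK VM. The function check_trace plays the role of the single fixed circuit: it never changes, whatever bytecode it is given.

```python
# Toy stack VM (illustration only; no cryptography and no privacy).
PUSH, ADD, MUL = 0, 1, 2  # hypothetical opcodes
MAX_STEPS = 16            # hypothetical "supported size" of a program

def step(stack, instr):
    """Apply one instruction to a stack and return the resulting stack."""
    op, arg = instr
    if op == PUSH:
        return stack + [arg]
    if op not in (ADD, MUL) or len(stack) < 2:
        raise ValueError("invalid instruction or stack underflow")
    a, b = stack[-2], stack[-1]
    return stack[:-2] + [a + b if op == ADD else a * b]

def execute(bytecode):
    """Prover side: run the program, recording the stack after every step."""
    trace = [[]]
    for instr in bytecode:
        trace.append(step(trace[-1], instr))
    return trace

def check_trace(bytecode, trace, claimed_output):
    """One fixed check that works for any program that fits MAX_STEPS."""
    if len(bytecode) > MAX_STEPS or len(trace) != len(bytecode) + 1:
        return False
    if trace[0] != []:
        return False
    try:
        for i, instr in enumerate(bytecode):
            if step(trace[i], instr) != trace[i + 1]:
                return False
    except ValueError:
        return False
    return trace[-1] == [claimed_output]

# Two different programs, checked by the same unchanged function.
prog_a = [(PUSH, 3), (PUSH, 4), (ADD, None), (PUSH, 5), (MUL, None)]  # (3 + 4) * 5
prog_b = [(PUSH, 2), (PUSH, 2), (MUL, None), (PUSH, 2), (MUL, None)]  # 2 * 2 * 2
print(check_trace(prog_a, execute(prog_a), 35))  # True
print(check_trace(prog_b, execute(prog_b), 8))   # True
```

In an actual ZK VM the verifier would not receive the trace. A proof would attest that a valid trace exists for the given bytecode and inputs, which is what allows some or all of the inputs to remain private.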

Prior work on ZK VMs

ZK VMs are not new. In 2013, Eli Ben-Sasson and others introduced TinyRAM:

A random-access machine tailored for efficient verification of nondeterministic computations… This system can be used to prove the correct execution of C programs, using our TinyRAM port of the GCC compiler.

As smart-contract-enabled blockchains rose in prominence, researchers found privacy-focused use cases for ZK VMs, such as Hawk, which described “private smart contracts” as early as 2016:

A Hawk programmer can write a private smart contract in an intuitive manner without having to implement cryptography, and our compiler automatically generates an efficient cryptographic protocol where contractual parties interact with the blockchain, using cryptographic primitives such as zero-knowledge proofs. (emphasis added)

ZK VMs found further development through systems like Aleo, which follows the ZEXE model of private computation. Aleo provides a language, Leo, to write programs which are included in transactions; said programs accept input records and produce output records:

Each Leo .leo file is compiled into a program. Each program lives in a record. Each record lives in a transaction. An Aleo transaction spends two old records: old_record_0, old_record_1 and creates two new records: new_record_0, new_record_1. (source)

More recently, Starkware Industries released Cairo, a STARK-based Turing-complete language:

Cairo is the first production-grade proof system implementing a Turing Complete von Neumann Architecture: each Cairo program P resides in the virtual machine’s memory, alongside the data D processed by it. Cairo comes with a single AIR (and thus a single Verifier — in a smart contract, WebAssembly, etc.) that can verify any Cairo program. Namely, the Cairo AIR verifies the computational integrity of executing P on D, and the correctness of the post-execution state of the system. (source)

Matter Labs also released a ZK VM and Zinc, a language that compiles into bytecode which the VM can execute:

Zinc code is compiled into bytecode which can be run by Zinc VM. Zinc VM is a virtual machine that serves three purposes: executing arbitrary computations, generating zero-knowledge proof of performed computations, and verification of the provided proof without knowing the input data.

Moreover, Matter Labs has stated that they have an upcoming implementation of the Ethereum Virtual Machine in zero-knowledge, which allows developers to deploy an “existing EVM codebase with minimum modifications”.

Up to this point, this post has only described the history and current state of ZK VMs. In the next section, I present my views on the benefits and drawbacks of developing programs that target ZK VMs rather than application-specific circuits.

Benefits and drawbacks

This section discusses several arguments for and against the use of ZK VMs from the perspective of a developer who wishes to build applications that use zero-knowledge proofs. I assume that this developer already has experience with writing circuits (such as using arkworks or the circom language), and is considering switching to writing code that targets a ZK VM.

Ease of development

Argument. A ZK VM abstracts away some tricky aspects of writing circuits or AIRs. While developers should still understand the underlying system on which their code will run for security reasons, it is theoretically easier to write code for a VM than to wire a circuit by hand. An analogy is that writing Solidity is easier than writing Yul, though developers must understand how the EVM works to avoid security issues.

Objection. A language that targets a ZK VM is not inherently easier to use than one which targets circuits or AIRs. Also, a highly developer-friendly circuit construction language can provide the same abstractions and flexibility.

Reply. In terms of developer effectiveness, small- or medium-sized programs stand to benefit the most from targeting a ZK VM, provided that the zero-knowledge portion of the system is simple enough.

Simpler proving and verification key management

Argument. Since a ZK VM is a single circuit that accepts arbitrary program bytecode, developers do not need to handle the separate proving and verifying keys that each application-specific circuit would require. Assuming that the maintainers of the ZK VM circuit have securely and correctly generated its keys and have made them publicly available, anyone can verify these keys against the circuit. Developers can then use just one set of proving and verifying keys without needing to write additional code to manage them.
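
The bookkeeping difference can be sketched as follows. This is a deliberately simplified model that assumes a proof system in which keys are bound to a circuit. The function toy_setup is a hypothetical stand-in that merely hashes a circuit description. It is not a real key generation procedure, and all circuit and program names are made up for illustration.

```python
import hashlib

def toy_setup(circuit_description):
    """Hypothetical stand-in for per-circuit key generation (NOT a real setup)."""
    digest = hashlib.sha256(circuit_description.encode()).hexdigest()[:16]
    return {"proving_key": "pk-" + digest, "verifying_key": "vk-" + digest}

# Application-specific circuits: one key pair per circuit to generate,
# distribute, and keep track of.
app_circuits = ["mixer-withdraw", "rollup-transfer", "vote-tally"]  # made-up names
per_app_keys = {name: toy_setup(name) for name in app_circuits}

# ZK VM: one circuit and one key pair. Each application is just bytecode,
# passed to the VM circuit as an input.
vm_keys = toy_setup("zk-vm")  # made-up name
programs = ["mixer.bytecode", "rollup.bytecode", "tally.bytecode"]  # made-up names
keys_for_program = {name: vm_keys for name in programs}

distinct_app = {k["verifying_key"] for k in per_app_keys.values()}
distinct_vm = {k["verifying_key"] for k in keys_for_program.values()}
print(len(distinct_app), len(distinct_vm))  # 3 1
```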

Objection. The security of programs written for a ZK VM will inherit any underlying security flaws of the ZK VM circuit. This creates a single point of failure, which may not be acceptable in some circumstances or for some use cases.

Reply. It is better to have a well-designed and thoroughly audited ZK VM which enables a wide array of ZK applications, rather than a large number of ZK applications with a wide range of possible security flaws. The above objection can therefore be overcome if sufficient resources are applied to strengthen the security of said ZK VM.

Compatibility with other VMs

Argument. It may be possible to share code between another VM and a ZK VM. This allows developers to relatively easily port applications like smart contracts from, for instance, the EVM, to a ZK VM and thereby enjoy higher scalability through zk-rollup constructions.

Objection. It is highly non-trivial to port the EVM to a zero-knowledge circuit. Even if ZK VM authors pull off this task, they may also have to handle the difficult task of keeping up with upgrades to the EVM, such as new precompiles or new opcodes. Developers may therefore prefer to write code that directly targets circuits or AIRs, rather than for a ZK VM which may take time to mature.

Reply. A compromise can be reached, such that the ZK VM only implements a subset of the EVM, so it can fulfill most use cases.

Vendor lock-in

I believe that a developer should base their decision of whether they should use a ZK VM rather than write an application-specific circuit on their own needs and preferences. There is, however, one additional subtle and under-discussed drawback: the risk of vendor lock-in.

Vendor lock-in occurs when the developers of an application rely on tools, frameworks, or platforms which a third party controls so tightly that developers cannot easily switch. As long as developers build on open-source platforms, they should not face this issue. Yet, the current selection of ZK VMs available is small, and some ZK VM authors have indicated that they may exercise a greater degree of control over their platforms than is usually seen in fully open-source projects. For instance, the Polaris license, released by Aztec and Starkware, states:

[Company / Foundation Name] grants you (“Licensee”) a license to use, modify, and redistribute the Prover, but only (a) for Non-Commercial Use, or (b) for Commercial Use only where every Proof generated by the Prover is submitted only to a Polaris Verifier.

According to a blog post about the license:

… StarkWare plans to release source code for its STARK prover; Aztec will use the same Polaris license for its PLONK provers… Informally, the Polaris license says that anyone may use and modify the Prover code, including for commercial use, as long as proofs generated by it are submitted to one of the white-listed Polaris Verifiers. A white-listed Verifier is a smart contract address appearing on an append-only list, which means that StarkWare may only add Verifiers to that list, but never remove them.

Starkware claims that this license helps eliminate platform risk:

For developers relying on the Prover code, this license eliminates the platform risk presented by technology providers such as StarkWare and Aztec. Platform risk is a major concern for developers, whether those platforms are quasi-monopolies or technology upstarts: Dominant platforms such as Facebook, Google, Facebook and Twitter pose a platform risk to developers, as they may unilaterally modify the terms of use of their platform, or shut it down altogether.

The Polaris license greatly reduces platform risk as the verifiers’ terms of use are immutable:

Polaris Verifiers are not merely guaranteed to exist indefinitely — their terms of use, whether gas-like or not, will also be made immutable by the blockchain, and thus provide developers with a stable business foundation to build upon.

According to the license, as long as a Polaris Verifier exists on-chain, anyone can freely use it to verify proofs. Since these smart contracts are immutable, as long as the terms of said verifier are acceptable, a platform user can trust that the platform owner (such as Starkware or Aztec) cannot pull the rug out from under a team and deny it access to the verifier.

Yet, assuming that only the platform owner has the ability to add verifiers to the on-chain list, a degree of centralisation remains. If the prover and verifier have to be upgraded (such as to add an optimisation or to fix a security bug), teams still solely rely on the platform owner to do so. Moreover, nothing can stop the platform owner from imposing new fees for using upgraded verifiers. Given how fast-moving, niche, and complex zero-knowledge technology is, I think that it is likely that such upgrades will be necessary, so teams will have to trust the platform owner to act reasonably.
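
To illustrate the shape of this guarantee and its limit, here is a toy model of an append-only list with a single owner. It is not the actual Polaris contract. The class, its methods, and the addresses are hypothetical, and it assumes (as I do above) that only the platform owner can add verifiers.

```python
class VerifierList:
    """Toy append-only allowlist: entries can be added, never removed."""

    def __init__(self, owner):
        self._owner = owner
        self._verifiers = []

    def add(self, caller, verifier_address):
        """Only the owner may append a verifier to the list."""
        if caller != self._owner:
            raise PermissionError("only the owner may add verifiers")
        if verifier_address not in self._verifiers:
            self._verifiers.append(verifier_address)

    def is_listed(self, verifier_address):
        """Anyone may check whether a verifier is on the list."""
        return verifier_address in self._verifiers

    # There is deliberately no method for removing a verifier.

registry = VerifierList(owner="platform-owner")
registry.add("platform-owner", "0xVerifierV1")  # hypothetical address

try:
    # A team that needs an upgraded verifier cannot list one itself.
    registry.add("app-team", "0xVerifierV2")
    team_added = True
except PermissionError:
    team_added = False

print(registry.is_listed("0xVerifierV1"), team_added)  # True False
```

The first verifier can never be delisted, which is the guarantee the license offers. Listing any upgraded verifier, however, remains entirely at the owner's discretion.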

I understand that platform owners of ZK VMs may be justified in creating this license to ensure business sustainability, and I can see how it is a compromise between full lock-in and full software freedom. My point is simply that teams should be aware that the only way to completely eliminate platform risk is to rely on fully open-source ZK VMs. In conclusion, application developers are best served if they are made fully aware of the tradeoffs involved, however subtle.

Concluding thoughts

ZK VMs have advanced rapidly in recent years. Today, developers who want to use zero-knowledge proofs have many more options than hand-written circuits, ZK VMs chief among them. While ZK VMs bring a number of benefits, they also come with tradeoffs, and one of the most under-discussed is vendor lock-in. While licenses like the Polaris license purport to be a way for platforms to sustain themselves without overly exposing application developers to platform risk or lock-in, application developers should be aware of the tradeoffs as well as the benefits when choosing how to integrate zero-knowledge circuits into their stack.

Finally, please note that this post only contains my personal opinions on the topic and does not reflect the views of other organizations or individuals.