Custody Game in Eth 2.0 and MPC Implementations

Ethereum 2.0

Ethereum is one of the most widely used blockchains in the world. After about 5 years of development, it is now entering its fourth stage: Serenity, also known as Ethereum 2.0 or often abbreviated Eth 2.0.

Eth 2.0 will be the most ambitious upgrade so far, and it is designed to improve almost every aspect of the decentralized system. Two major problems of the current Ethereum network will be solved, which are scalability and sustainability.

Custody Game in Eth 2.0 and MPC Implementations — Part I

Scalability

The capacity of the current Ethereum network is about 20 transactions per second, which is far from satisfying hundreds of applications. Many scaling options have been proposed in the blockchain community, and Eth 2.0 settled on sharding for layer one.

Simply put, sharding enables to divide the system into manageable smaller parts (shard chains), and each shard processes concurrently and independently. Finally, the results of each shard will be connected through corsslinking to the beacon chain, which will be discussed later.

Take a simple example, suppose a block in the system contains 3 transactions. In the current Ethereum system, every node, say node A, B, C, has to verify all the transactions. If the least powerful node takes 3 seconds to verify the block, the throughput of the system will be 1 transaction per second. Clearly, the scalability of the system is subject to the capacity of one single node.

In sharding, we could distribute the transactions across different shard chains and let each node verify one chain. The transactions are also divided into 3 parts and separately verified in each shard chains. Suppose each transaction could be verified in 1 second, then throughput of the system is 3 transactions per second! Furthermore, each node does not need to store all the data in the chain, they are only responsible for specific shard chains for period of time.

In the real world, each shard of course has to be verified by many validators. If you are interested in learning more, check out the Sharding FAQ (https://github.com/PlatONnetwork/proof_of_custody). Eth 2.0 aims to have 1024 shard chains (with 64 enabled from the beginning), and the throughput is expected to be improved more than 1000 times. In addition to mechanisms in layer two, the performance will be improved further.

Proof of Stake

Ethereum now, like Bitcoin, relies on Proof of Work (PoW) to ensure consensus. In a PoW system, the miners make use of computing resources to solve cryptographic puzzles, and get rewarded for finding solutions. Security comes due to the computational difficulty of the puzzle. This is marred by centralization and monopolization in the hands of a small number of mining pools, because there are economies of scale in mining.

There are more problems with mining/PoW than just centralization. It also leads to huge waste (and therefore environmental pollution) as the computation demands lots of energy; this energy itself actually provides the security, so this problem is hard to solve in the PoW paradigm. Further, since you can only reward, but never punish participants for bad behaviour (as they are anonymous), the cost of security is much higer than necessary.

Finally, for sharding, Proof of Work cannot work as the miners don’t have identities and therefore cannot be “assigned” to shards in any meaningful way.

To solve these problems, Eth 2.0 will switch to a Proof of Stake (PoS) system, which is called Casper. In a PoS system, the mining procedure is replaced by a “voting” system; validators need to stake 32 ETH to be able to participate and be able to vote. To achieve consensus, the validators take turns proposing and voting on the the next block. The Ethereum Proof of Stake FAQ (https://github.com/ethereum/wiki/wiki/Proof-of-Stake-FAQ) says:

The blockchain keeps track of a set of validators, and anyone who holds the blockchain’s base cryptocurrency (in Ethereum’s case, Ether) can become a validator by sending a special type of transaction that locks up their Ether into a deposit. The process of creating and agreeing to new blocks is then done through a consensus algorithm that all current validators can participate in.

The beacon chain will be the ”backbone“ of Eth 2.0. It will store and maintain the registration of validators, process cross-shard communications and finality gadget. All shard chains are following the beacon chain at all times. ETH holders who stake 32ETH (fixed) could be a validator. Every slot of an epoch, committees are randomly selected from the beacon chain for different shards. The validators are evenly divided into committees and each committee consists of at least 128 validators. The committee is responsible to produce blocks for a particular shard chain.

On the flip side, validators will be penalized for undesired and dishonest behaviors, the most severe punishment being slashing, where up to the whole deposit of a validator can be burned. Less severe penalized behaviors, for example, include not working in its turn, attest to a block that does not get finalized. Very high penalties (slashing) are dealt out for equivocating (signing to concurrent blocks) or signing incorrect computations.

The data availability problem

The data availability problem is highly related to fraud proofs, briefly shown as follows. More details see this post (https://dankradfeist.de/ethereum/2019/12/20/data-availability-checks.html).

Fraud proofs

In the above description of nodes (without sharding) in blockchain, we actually mean full nodes. Full nodes produce the chain, download all the data in every node, and verify all the transactions and states. A full node requires the machine to be equipped with large amounts of memory, computation power and bandwidth. This can be too costly for constrained devices like mobile phones.

Light nodes or light clients are cheap alternatives to full nodes. They connect to at least one full node and only download the block headers and blocks the are interested in. They trust the full node to check the validity of the data, and assume the malicious full nodes can not make an invalid chain.

Fraud proofs are a mechanism for the light node to reduce the security risk from invalid chains. Fraud proofs are produced by the honest full node, whenever it finds some inconsistent state, and sound an “alarm” for light nodes. This fraud proof is small and can be quickly distributed on the network, and proves with certainty that some chain is faulty; light nodes can then completely ignore this chain and are protected from the inconsistency it introduces.

For example, if the full node finds a transaction t is faulty, and the states before and after it are s_in and s_out, respectively. The full node then constructs the fraud proof as (s_in, t, s_out) together with the Merkle proof. The proof is very “light” to broadcast and to verify. The light node checks the Merkle proof that the (faulty) transaction is in the block, and also checks the invalidity of the state transition through the triplet.

The data availability problem says that, what if some malicious full node signed a block header, but does not publish some of the data, in particular data that contains an invalid transaction (for example, a transaction stealing someone’s money and transferring it to another account)? In this case an honest full node can not generate the fraud proof, since the data necessary to create the proof is missing.

Data availablity in sharding

Thedata availability problem is also important in sharding. As explained before, the validators in Eth 2.0 will not verify all blocks, and do cannot be expected to download all data, in order for sharding to be useful and reduce the load on individual validators. Shard blocks will be validated by the committees and only a commitment will be stored into the beacon chain. From this perspective, the validators are actually light nodes on most shards, except for the ones they are doing active work on.

In more details, the data availability problem in sharding is depicted as in the following diagram.

The shard data is merklised into Merkle root (Actually, it is the crosslink data root, we use merkle root for simplicity).
A proposer produces a block and signs the Merkle root.
Other validators vote for this block and sign on it. The BLS signatures could be aggregated into one single signature.
Once the number of the signature exceeds the threshold, the signed Merkle root is added to the Beacon chain.

Validators working on other shards do not know the entire data, and cannot download it, otherwise it would eliminate the advantages of sharding. The data availability problem in this setting is how to verify that the data in shard 1 is actually available to any full note wanting to download and verify it.

The custody game

Eth2.0 assumes that ⅔ of validators are honest, and will assign validators to shards in such a way that if this is true, it will never happen that an unavailable or incorrect block can be included in a crosslink. However, what does honest mean? There may be some validators that are “honest but lazy”: Given that most of the time, nobody is trying to cheat, it may be tempting to never actually verify anything, and just sign any incoming block header. Maybe to be a bit safer, you can wait for some signatures and only then jump on. You still get the rewards, but you have to do barely any work.

If this happens, an attacker can rely on these validators to “boost” their invalid blocks. That would be terrible for the overall health of the system. So we want to avoid “honest but lazy” validators as much as possible. That is the aim of the custody game.

It does not in itself fully solve the data availability problem, for that, data availability checks are needed. However, it does make sure that at least the validators in shard 1 that signed the block will all have the data.

Informally, in the proof of custody game, each validator has to additionally compute a custody bit. The custody bit can only be computed by the validator with some “secret” key (the custody key) and the data. After they publish the custody key, anyone can verify the custody bit with the data. If they discover an invalid custody bit, they can challenge this on chain. If the challengers are correct, then they will get rewarded and the custody bit generator will be slashed.

There are a few key points in proof of custody

The custody key should be deterministically computed from the validator key, so that fewer on-chain operations are necessary to commit to new keys. It will be generated periodically, and will have to be published when the custody period is over, anyone can check the validity of the custody key. It is an ephemeral key as it is only valid for one custody period (in Eth2.0, that is around nine days)
Without the custody key and the data, the custody bit cannot be guessed better than random
Anyone could check the validity of the custody bit using the custody key and the data.

In Eth 2.0, the public key of validators will be a BLS public key. When staking their ETH to become validators, operators will commit to one private-public key pair. For each custody period (9 days), the validators have the ability to generate an ephemeral custody key ek. The ephemeral keys are actually BLS signatures of a counter (the number of current epoch). Note that since BLS signatures are deterministic, all ephemeral keys are predetermined from the public key. No one can compute the ephemeral keys except the validators themselves.

The custody bit is computed using a mix function, although the specific form of function is under discussion, the specification leans to use MPC-friendly constructions, see eth2.0-specs(https://github.com/ethereum/eth2.0-specs/blob/dev/specs/phase1/custody-game.md#misc). In general it outputs the custody bit as b = mix(ek, D), where D is the block data.

The current construction of mix makes use of Universal Hash Function (UHF) and Legendre PRF (Pseudo-Random Function). These are both keyed functions and deterministic. Given an ephemeral key, it could be represented as two elements (ek_0, ek_1). Then the validator computes the custody bit as Leg_PRF(ek_0, UHF(ek_0, ek_1, D)). See the following diagram.

Simply put, the UHF is used to expand the domain (as usually used in cryptography) and avoid outsourcing (only the key and data owner can compute the UHF function). The adoption of the Legendre PRF is that it can be computed extremely efficiently in a secure Multi-Party Computation (MPC) protocol, as well as for the randomness of custody bit. Some details can be found in this post (https://ethresear.ch/t/using-the-legendre-symbol-as-a-prf-for-the-proof-of-custody/5169), and we will also give more details and explanations in the following posts.

MPC Friendlyness

One of the design goals of Eth2.0 is to make it MPC friendly, because (a) this can bring extra security by allowing operators to distribute their validator across several machines and even different data centers, avoiding single points of failure and (b) it allows people with less capital to take part in Eth2.0 validation by forming a trustless pool. So the proof of custody should also be MPC-friendly, and this is the main reason for using the Legendre PRF. This possibly enables new staking business or other interesting applications. See here (https://slideslive.com/38920085/ethereum-20-trustless-staking-pools) for further details.

PlatON initiates a project, supported by a grant from the Ethereum Foundation, to implement and optimize proof of custody in an MPC protocol. Please find the current code in GitHub(https://github.com/PlatONnetwork/proof_of_custody). More details come soon, stay tuned!