Ethereum 2.0 Phase 0 -- The Beacon Chain

Welcome to the Ethereum 2.0 Annotated Specification!

All the text that appears with a side-bar like this is my added commentary; everything else is directly from the original specification document.

This document concerns the Phase 0 beacon chain state transition function. Think of it as "The Yellow Paper for the Beacon Chain". It's basically the brains of the operation. There's a good deal more to building a fully functioning client, such as networking, fork choice, and validator behaviour: other documents cover these. There are also common "standards" around components like SSZ and APIs. For all the rest of it (databases, architecture, optimisations, metrics, Eth1 data handling, ...), the best source is probably a client implementation.

Other useful resources:

Notice: This document is a work-in-progress for researchers and implementers.

I wonder if this notice will ever be removed 😂

Introduction

This document represents the specification for Phase 0 of Ethereum 2.0 -- The Beacon Chain.

Phase 0 of Ethereum 2.0 (Eth2) is concerned only with the beacon chain. It implements a self-sustaining Proof of Stake network, but not much more. Phase 1 will bring sharding of blockchain data, and Phase 2 will bring execution engines that consume that data.

At the core of Ethereum 2.0 is a system chain called the "beacon chain". The beacon chain stores and manages the registry of validators. In the initial deployment phases of Ethereum 2.0, the only mechanism to become a validator is to make a one-way ETH transaction to a deposit contract on Ethereum 1.0. Activation as a validator happens when Ethereum 1.0 deposit receipts are processed by the beacon chain, the activation balance is reached, and a queuing process is completed. Exit is either voluntary or done forcibly as a penalty for misbehavior. The primary source of load on the beacon chain is "attestations". Attestations are simultaneously availability votes for a shard block (Phase 1) and proof-of-stake votes for a beacon block (Phase 0).

Eth2 is not merely an upgrade to Ethereum 1.

The original plan had been to manage both sharding and proof of stake via smart contracts on the existing Ethereum Mainnet. Eventually, it became apparent that this approach would limit throughput and constrain the scope for innovation in Eth2. So, around June 2018, it was decided to implement Eth2 as a separate blockchain, with strong economic ties to Eth1, and to come up with a transition plan to eventually import Eth1 into the new Eth2 chain.

The actors in Phase 0 of Eth2 are beacon nodes and validators. Beacon nodes maintain the state of the beacon chain, manage validators, apply rewards and penalties, maintain randomness, and determine the canonical chain with finality. Validators propose beacon chain blocks, and vote for blocks by making attestations. That's pretty much it in Phase 0. In Phase 1, the role of validators will be expanded to deal with shards.

This Beacon Chain specification describes the beacon nodes only. Expected validator behaviour is described in a separate document which is not formally a specification: correct behaviour for validators is essentially inferred from the specification of the beacon chain.

Notation

Code snippets appearing in this style are to be interpreted as Python 3 code.

It's a little controversial, but in its final form, the Ethereum 2.0 specification is almost entirely made up of Python code. This has a couple of advantages, the greatest being that the specification is executable, meaning that test vectors can be generated directly from the spec document, and in the event of consensus failure, the spec can be executed as the final arbiter of which fork is correct. It is also more amenable to formal verification of various sorts. You can download the specification directly as a Python package, and run code with it.

Unfortunately, a downside of all this is that all narrative and intuition has been removed: it is difficult for normal human beings to understand the spec only by reading the spec. (Though there are recent signs of this being reversed.)

Anyway, I'm here to help. Welcome to The Annotated Spec 😄

Custom types

Right at the top of the spec we find the important concepts laid out. Each of the types in the following table relates to something fundamental about the construction of the Ethereum 2.0 beacon chain.

SSZ is the encoding method used to pass data between clients and is described in a separate specification. Here it can be thought of as just a data type.

All integers are unsigned 64 bit numbers throughout the spec. We fought on the side of signedness in the integer wars of February 2019 but lost 😢. This means that preserving the order of operations is critical in some parts of the spec to avoid inadvertently underflowing.
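To make the underflow point concrete, here is a minimal sketch. Plain Python integers never underflow, so the helpers `checked_add` and `checked_sub` are hypothetical (they are not in the spec) and simply emulate uint64 behaviour by raising when a result leaves the valid range. The balance, penalty and reward values are made up for illustration.

```python
UINT64_MAX = 2**64 - 1


def checked_add(a: int, b: int) -> int:
    """Add two uint64 values, raising if the result would overflow."""
    result = a + b
    if result > UINT64_MAX:
        raise OverflowError("uint64 overflow")
    return result


def checked_sub(a: int, b: int) -> int:
    """Subtract b from a, raising if the result would be negative."""
    if b > a:
        raise OverflowError("uint64 underflow")
    return a - b


balance, penalty, reward = 5, 7, 10

# Adding first keeps every intermediate value non-negative.
safe = checked_sub(checked_add(balance, reward), penalty)

# Subtracting first underflows, even though the final result would be valid.
try:
    checked_add(checked_sub(balance, penalty), reward)
    underflowed = False
except OverflowError:
    underflowed = True
```

Both orderings describe the same arithmetic, but only the first is safe with unsigned integers.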

We define the following Python custom types for type hinting and readability:

I'm just going to give a brief intro here to each of these; we'll be seeing a lot more of them later.

Name SSZ equivalent Description
Slot uint64 a slot number

Time is divided into slots of 12 seconds. Exactly one beacon chain block is supposed to be proposed per slot, by a validator randomly selected to do so.

Epoch uint64 an epoch number

Thirty-two slots make an epoch.

Epoch boundaries are the points at which the chain can be justified and finalised (by the Casper FFG mechanism), and they are also the points at which validator balances are updated, validator committees get shuffled, and validator exits, entries, and slashings are processed. That is, the main state-transition work is performed per epoch, not per slot.

Fun fact: Epochs were originally called Cycles.
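The relationship between slots and epochs is plain integer arithmetic. The sketch below mirrors the spec's `compute_epoch_at_slot` and `compute_start_slot_at_epoch` helpers using bare Python integers rather than the `Slot` and `Epoch` types; `is_epoch_start` is a hypothetical helper added only for illustration.

```python
SLOTS_PER_EPOCH = 32


def compute_epoch_at_slot(slot: int) -> int:
    """Return the epoch number that contains the given slot."""
    return slot // SLOTS_PER_EPOCH


def compute_start_slot_at_epoch(epoch: int) -> int:
    """Return the first slot of the given epoch."""
    return epoch * SLOTS_PER_EPOCH


def is_epoch_start(slot: int) -> bool:
    """Hypothetical helper: True when the slot is the first of its epoch."""
    return slot % SLOTS_PER_EPOCH == 0
```

For example, slot 31 is the last slot of epoch 0, and slot 32 is the first slot of epoch 1.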

CommitteeIndex uint64 a committee index at a slot

Validators are organised into committees that collectively vote (make attestations) on blocks. Each committee is active at exactly one slot per epoch, but several committees are active at each slot. This type indexes into the list of committees active at a slot.

In Phase 0, validators are members of only one type of committee, and they are shuffled between committees every epoch. The role of the committee is to attest to the beacon block proposed by the selected member of the committee. In Phase 1 persistent committees will be introduced that will attest to shard data blocks and are shuffled slowly.

ValidatorIndex uint64 a validator registry index

Every validator that enters the system is consecutively assigned a unique validator index number that is permanent, remaining even after the validator exits. This is necessary as the validator's balance is associated with its index, so it needs to be preserved even if the validator exits, since there is no mechanism in Phase 0 to transfer that balance elsewhere.

Gwei uint64 an amount in Gwei

All Ether amounts are specified in units of Gwei ($\smash{10^9}$ Wei). This is basically a hack to avoid having to use integers wider than 64 bits ($\smash{2^{64}}$ Wei is only 18 Ether) to store balances and in calculations. Even so, in some places care needs to be taken to avoid arithmetic overflow when dealing with Ether calculations.
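The arithmetic behind this choice is easy to check. The variable names below are mine, chosen for illustration.

```python
WEI_PER_GWEI = 10**9
GWEI_PER_ETHER = 10**9
UINT64_MAX = 2**64 - 1

# Largest whole-Ether balance representable if a uint64 counted Wei.
max_ether_counting_wei = UINT64_MAX // (WEI_PER_GWEI * GWEI_PER_ETHER)

# Largest whole-Ether balance representable when a uint64 counts Gwei.
max_ether_counting_gwei = UINT64_MAX // GWEI_PER_ETHER
```

Counting in Wei caps a balance at 18 Ether, which is less than a single validator's stake. Counting in Gwei raises the cap to roughly 18 billion Ether.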

Root Bytes32 a Merkle root

Merkle roots are ubiquitous in the Eth2 protocol. They are a very succinct and tamper-proof way of representing a lot of data: blocks are summarised by their Merkle roots; state is summarised by its Merkle root; the list of Eth1 deposits is summarised by its Merkle root; the digital signature of messages is calculated from the Merkle root of the data structure contained by the message. Here's a primer in case this is all new to you.

Version Bytes4 a fork version number

It is expected that the protocol will get updated/upgraded from time to time: a process commonly known as a "hard-fork". Unlike Eth1, Eth2 has an in-protocol concept of a version number. This is used, for example, to prevent votes from validators on one fork (that maybe haven't upgraded yet) being counted on a different fork.

Its recommended use is described in the Ethereum 2.0 networking specification.

DomainType Bytes4 a domain type

This is just a cryptographic nicety: messages intended for different purposes are tagged with different domains before being hashed and possibly signed. It's a kind of name-spacing to avoid clashes; probably unnecessary, but considered best-practice. Seven domain types are defined in Phase 0.

ForkDigest Bytes4 a digest of the current fork data

The unique chain identifier, based on information from genesis and the current fork Version. It is calculated in compute_fork_digest. As per the comment there, "4-bytes suffices for practical separation of forks/chains".

ForkDigest is used extensively in the Ethereum 2.0 networking specification.

Domain Bytes32 a signature domain

Domain is the concatenation of the DomainType and the first 28 bytes of the fork data root. It is used when verifying any messages from a validator: the message needs to have been sent with the correct domain and fork version.
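Here is a rough sketch of how ForkDigest and Domain fit together. It assumes that the fork data root is the SSZ hash tree root of a two-field container (fork version, genesis validators root), which for two 32-byte chunks reduces to a single SHA-256 over their concatenation. The function names follow the spec's naming, but the example domain type and the all-zero genesis validators root are placeholders of mine.

```python
from hashlib import sha256


def compute_fork_data_root(current_version: bytes, genesis_validators_root: bytes) -> bytes:
    """Merkle root of (version, genesis_validators_root), assuming two 32-byte chunks."""
    assert len(current_version) == 4 and len(genesis_validators_root) == 32
    version_chunk = current_version.ljust(32, b"\x00")  # right-pad to one chunk
    return sha256(version_chunk + genesis_validators_root).digest()


def compute_fork_digest(current_version: bytes, genesis_validators_root: bytes) -> bytes:
    """The first 4 bytes of the fork data root."""
    return compute_fork_data_root(current_version, genesis_validators_root)[:4]


def compute_domain(domain_type: bytes, fork_version: bytes, genesis_validators_root: bytes) -> bytes:
    """DomainType (4 bytes) followed by the first 28 bytes of the fork data root."""
    assert len(domain_type) == 4
    fork_data_root = compute_fork_data_root(fork_version, genesis_validators_root)
    return domain_type + fork_data_root[:28]


example_domain_type = bytes.fromhex("01000000")  # placeholder domain type
genesis_fork_version = bytes.fromhex("00000000")
placeholder_root = bytes(32)                     # placeholder, not a real root

domain = compute_domain(example_domain_type, genesis_fork_version, placeholder_root)
fork_digest = compute_fork_digest(genesis_fork_version, placeholder_root)
```

Changing either the fork version or the genesis validators root changes both the digest and every domain, which is what keeps messages from different forks and chains apart.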

BLSPubkey Bytes48 a BLS12-381 public key

BLS is the digital signature scheme used by Eth2. It has some very nice properties, in particular the ability to aggregate signatures. This means that many validators can sign the same message (for example, that they support block X), and these signatures can all be efficiently aggregated into a single signature for verification. The ability to do this efficiently makes Eth2 practical as a protocol. Several other protocols have adopted or will adopt BLS, such as Zcash, Chia, Dfinity and Algorand. We are using the BLS signature scheme based on the BLS12-381 elliptic curve.

The BLSPubkey type holds a validator's public key, or the aggregation of several validators' public keys. This is used to verify messages that are claimed to have come from that validator or group of validators.

In our implementation, BLS public keys are elliptic curve points from the BLS12-381 $\smash{G_1}$ group, thus are 48 bytes long when compressed.

BLSSignature Bytes96 a BLS12-381 signature

As above, we are using BLS (Boneh-Lynn-Shacham) signatures over the BLS12-381 (Barreto-Lynn-Scott) elliptic curve in order to sign messages between participants. As with all digital signature schemes, this guarantees both the identity of the sender and the integrity of the contents of any message.

In our implementation, BLS signatures are elliptic curve points from the BLS12-381 $\smash{G_2}$ group, thus are 96 bytes long when compressed.

Constants

The following values are (non-configurable) constants used throughout the specification.

The distinction between "constants" and "configuration values" is not always clean, and things have moved back and forth between them at times. These are things that are expected never to change for the beacon chain, no matter what fork or test network it is running.

Name Value
GENESIS_SLOT Slot(0)

The very first slot number for the beacon chain is zero.

This might seem uncontroversial (except perhaps to Fortran programmers), but it actually featured heavily in the Great Integer Wars mentioned previously. The issue was that calculations on unsigned integers might have negative intermediate values, which could cause problems. A proposed work-around for this was to start the chain at a non-zero slot number. This was initially chosen to be 2^19, then 2^63, then 2^32, and finally back to zero. In my view, this madness suggests that we should have been using signed integers all along 🙄.

GENESIS_EPOCH Epoch(0)

Similar to the above, but widely used in the beacon chain spec. When the chain starts, it starts at epoch zero.

FAR_FUTURE_EPOCH Epoch(2**64 - 1)

A candidate for the dullest constant. It's used as a default initialiser for validators' activation and exit times before they are properly set. No epoch number is bigger than this one.

BASE_REWARDS_PER_EPOCH uint64(4)

BASE_REWARDS_PER_EPOCH is the number of distinct things that attesting validators get rewarded for in each epoch. Namely, creating attestations with (1) matching source block, (2) matching target block, (3) matching chain head, and (4) inclusion delay - i.e. getting attestations included quickly in the beacon chain. (1) and (2) relate to the Casper FFG finality gadget, and (3) relates to the LMD GHOST fork choice rule.

DEPOSIT_CONTRACT_TREE_DEPTH uint64(2**5) (= 32)

DEPOSIT_CONTRACT_TREE_DEPTH specifies the size of the (sparse) Merkle tree used by the Eth1 deposit contract to store deposits made. With a value of 32, this allows for $\smash{2^{32}}$ = 4.3 billion deposits. Given that the minimum deposit is 1 Ether, that number is clearly enough for quite a while.

JUSTIFICATION_BITS_LENGTH uint64(4)

As an optimisation to Casper FFG (the process by which finality is conferred on epochs), Eth2 uses a "k-finality" rule. We will describe this properly when we look at processing justification and finalisation. For now, this constant is just the number of bits we need to store in state to implement k-finality. For k = 2 we need to track the justification status of the last four epochs.

ENDIANNESS 'little'

Endianness refers to the order of bytes in the binary representation of a number: most significant byte first is big-endian; least significant byte first is little-endian. For the most part these details are hidden by compilers, and we don't need to worry about endianness. But endianness matters when converting between integers and bytes, which is relevant to shuffling and proposer selection, the RANDAO, and when serialising with SSZ.
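A quick illustration using Python's built-in integer conversions. The value is arbitrary.

```python
value = 1_000_000  # 0x0F4240

# Least significant byte first.
little = value.to_bytes(8, "little")

# Most significant byte first.
big = value.to_bytes(8, "big")

# Decoding with the wrong byte order silently gives a different number.
round_trip = int.from_bytes(little, "little")
misread = int.from_bytes(little, "big")
```

Here `little.hex()` is `40420f0000000000` and `big.hex()` is `00000000000f4240`. Nothing flags the mistake when bytes are decoded with the wrong order, which is why the spec pins the byte order down as a constant.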

The spec began life as big-endian, but the Nimbus team from Status got it changed to little-endian to better suit the low-power processors they are targeting. SSZ was changed first, and then the rest of the spec followed.

Configuration

In the normal course of things, configuration parameters might not be the most exciting part of a specification. However, to gain an understanding of these parameters is to gain a huge insight into what kind of beast we are dealing with in the beacon chain.

You'll notice that most of them are powers of two. There's no huge significance to this. Computer scientists think it's neat, and it ensures that things cleanly divide other things in general. Justin Drake believes that it helps to minimise bike-shedding.

Some of the configuration parameters below are quite technical and perhaps obscure. I'll use the opportunity here to introduce some concepts, and more detailed explanation will follow when they are used later in the spec.

Note: The default mainnet configuration values are included here for spec-design purposes. The different configurations for mainnet, testnets, and YAML-based testing can be found in the configs/constant_presets directory. These configurations are updated for releases and may be out of sync during dev changes.

To facilitate easier initial interoperability testing and testnets, a much lighter-weight minimal configuration was defined. This runs more quickly, with much lower resource use, than the below mainnet configuration: it has fewer, smaller committees, less shuffling, 6s rather than 12s slots, 8-slot rather than 32-slot epochs, and so on. The final beacon chain will be deployed with the mainnet configuration parameters below.

Misc

Name Value
ETH1_FOLLOW_DISTANCE uint64(2**10) (= 1,024)

This is the minimum depth of block on the Ethereum 1 chain that can be considered by the Eth2 chain: it applies to the Genesis process and the processing of deposits by validators. In practice the depth is applied as a time delay: this value is multiplied by the target average Eth1 block time, SECONDS_PER_ETH1_BLOCK, and only Eth1 blocks at least that old are considered.

The value of ETH1_FOLLOW_DISTANCE is very conservatively set to ensure that we never end up with some block that we've relied on subsequently being reorganised out of the Ethereum 1 chain.

MAX_COMMITTEES_PER_SLOT uint64(2**6) (= 64)

Validators are organised into committees to do their work. At any one time, each validator is a member of exactly one beacon chain committee, and is called on to make an attestation exactly once per epoch. (An attestation is a vote for a beacon chain block that has been proposed for a slot.)

In the beacon chain, the 64 committees active in a slot effectively act as a single committee as far as the fork-choice rule is concerned. They all vote on the proposed block for the slot, and their votes/attestations are pooled. (In a similar way, all committees active during an epoch act effectively as a single committee as far as justification and finalisation are concerned.)

The number 64 is designed to map onto one committee per shard once Phase 1 is deployed, since these committees will also vote on shard crosslinks.

TARGET_COMMITTEE_SIZE uint64(2**7) (= 128)

To achieve a desirable level of security, committees need to be larger than a certain size. This is to make it infeasible for an attacker to randomly end up with a majority in a committee even if they control a significant number of validators. This target is a kind of lower-bound on committee size. If there are not enough validators to make all committees have at least 128 members, then, as a first measure, the number of committees per slot is reduced to maintain this minimum. Only if there are fewer than SLOTS_PER_EPOCH * TARGET_COMMITTEE_SIZE = 4096 validators in total will the committee size be reduced below TARGET_COMMITTEE_SIZE. With so few validators, the system would be insecure in any case.

See the note below for how this value 128 was arrived at.
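The rule described above can be sketched in a few lines. This follows the logic of the spec's `get_committee_count_per_slot`, simplified (my assumption) to take a validator count rather than the beacon state; `average_committee_size` is a hypothetical helper for illustration.

```python
SLOTS_PER_EPOCH = 32
MAX_COMMITTEES_PER_SLOT = 64
TARGET_COMMITTEE_SIZE = 128


def get_committee_count_per_slot(active_validator_count: int) -> int:
    """Number of committees per slot: between 1 and MAX_COMMITTEES_PER_SLOT."""
    return max(1, min(
        MAX_COMMITTEES_PER_SLOT,
        active_validator_count // SLOTS_PER_EPOCH // TARGET_COMMITTEE_SIZE,
    ))


def average_committee_size(active_validator_count: int) -> float:
    """Hypothetical helper: validators divided evenly over an epoch's committees."""
    committees_per_epoch = SLOTS_PER_EPOCH * get_committee_count_per_slot(active_validator_count)
    return active_validator_count / committees_per_epoch
```

With 16,384 validators there are four committees per slot of 128 members each. Below 4,096 validators the count is pinned at one committee per slot and committee sizes fall under the target.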

MAX_VALIDATORS_PER_COMMITTEE uint64(2**11) (= 2,048)

This is just used for sizing some data structures, and is not particularly interesting. Reaching this limit would imply over 4 million active validators (2,048 × 64 × 32 = 4,194,304), staked with a total of about 134 million Ether, which exceeds the total supply today.

MIN_PER_EPOCH_CHURN_LIMIT uint64(2**2) (= 4)

Validators are allowed to exit the system and cease validating, and new validators may apply to join at any time. For interesting reasons, a design decision was made to apply a rate-limit to entries (activations) and exits. Basically, it is important in proof of stake protocols that the validator set not change too quickly.

In the normal case, a validator is able to exit fairly swiftly: it just needs to wait MAX_SEED_LOOKAHEAD (currently four) epochs. However, if there are large numbers of validators wishing to exit at the same time, a queue forms with a limited number of exits allowed per epoch. The minimum number of exits per epoch (the minimum "churn") is MIN_PER_EPOCH_CHURN_LIMIT, so that validators can always eventually exit. The actual allowed churn per epoch is calculated in conjunction with CHURN_LIMIT_QUOTIENT.

The same applies to new validator activations, once a validator has been marked as eligible for activation.

CHURN_LIMIT_QUOTIENT uint64(2**16) (= 65,536)

This is used in conjunction with MIN_PER_EPOCH_CHURN_LIMIT to calculate the actual number of validator exits and activations allowed per epoch. The number of exits allowed is max(MIN_PER_EPOCH_CHURN_LIMIT, n // CHURN_LIMIT_QUOTIENT), where n is the number of active validators. The same applies to activations.
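As a sketch, the formula above looks like this. The spec's `get_validator_churn_limit` takes the beacon state; here, as a simplification of mine, it takes the active validator count directly.

```python
MIN_PER_EPOCH_CHURN_LIMIT = 4
CHURN_LIMIT_QUOTIENT = 65_536


def get_validator_churn_limit(active_validator_count: int) -> int:
    """Maximum number of exits (or activations) allowed per epoch."""
    return max(MIN_PER_EPOCH_CHURN_LIMIT, active_validator_count // CHURN_LIMIT_QUOTIENT)
```

The limit stays at the minimum of 4 until there are 327,680 active validators (5 × 65,536), at which point it rises to 5.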

SHUFFLE_ROUND_COUNT uint64(90)

The beacon chain implements a rather interesting way of shuffling validators in order to select committees, called the "swap-or-not shuffle". This shuffle proceeds in rounds, and the degree of shuffling is determined by the number of rounds: SHUFFLE_ROUND_COUNT. The time taken to shuffle is linear in the number of rounds, so for light-weight, non-mainnet configurations, the number of rounds can be reduced.

The value 90 was introduced in Vitalik's initial commit without explanation. The original paper describing the shuffling technique seems to suggest that a cryptographically safe number of rounds is $6\log{N}$. With 90 rounds, then, we should be good for shuffling 3.3 million validators, which is close to the maximum number possible (given the Ether supply).
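The 3.3 million figure can be reproduced by inverting the formula, assuming (my reading) that the logarithm is the natural log. The helper name `rounds_needed` is hypothetical.

```python
import math

SHUFFLE_ROUND_COUNT = 90


def rounds_needed(validator_count: float) -> float:
    """Rounds suggested by the 6 * ln(N) rule of thumb."""
    return 6 * math.log(validator_count)


# Invert 6 * ln(N) = 90 to find the largest N covered by 90 rounds.
max_validators = math.exp(SHUFFLE_ROUND_COUNT / 6)
```

This gives e^15, a little under 3.3 million validators.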

The main advantage of using this shuffling method is that light clients and others that are interested in only a small number of the committees at any time can compute only the committees they need without having to shuffle the entire set of active validators. This can be a big saving on computational resources. See compute_shuffled_index.
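To show how a single index can be shuffled without touching the rest of the list, here is a sketch of swap-or-not for one index, modelled on the spec's `compute_shuffled_index`. It uses SHA-256 from the standard library as the hash and plain integers in place of the spec's types. The all-zero seed is a placeholder.

```python
from hashlib import sha256

SHUFFLE_ROUND_COUNT = 90


def compute_shuffled_index(index: int, index_count: int, seed: bytes) -> int:
    """Return the shuffled position of `index` among `index_count` items."""
    assert index < index_count
    for current_round in range(SHUFFLE_ROUND_COUNT):
        round_byte = current_round.to_bytes(1, "little")
        # Each round picks a pivot; `index` is paired with its mirror `flip`.
        pivot = int.from_bytes(sha256(seed + round_byte).digest()[0:8], "little") % index_count
        flip = (pivot + index_count - index) % index_count
        # Both members of a pair look up the same bit, keyed by the larger one.
        position = max(index, flip)
        source = sha256(seed + round_byte + (position // 256).to_bytes(4, "little")).digest()
        byte = source[(position % 256) // 8]
        bit = (byte >> (position % 8)) % 2
        # Swap or not, depending on the pseudo-random bit.
        index = flip if bit else index
    return index


seed = bytes(32)  # placeholder seed
shuffled = [compute_shuffled_index(i, 100, seed) for i in range(100)]
```

Because both members of a pair consult the same bit, each round is a valid permutation, so `shuffled` contains every index from 0 to 99 exactly once. A light client that cares about one committee only needs to call the function for that committee's indices.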

For more on the mechanics of the swap-or-not shuffle, check out my explainer.

MIN_GENESIS_ACTIVE_VALIDATOR_COUNT uint64(2**14) (= 16,384)

MIN_GENESIS_ACTIVE_VALIDATOR_COUNT is the minimum number of full validator stakes that must have been deposited before the beacon chain can start producing blocks. The number is chosen to ensure a degree of security. It allows for four 128 member committees per slot, rather than the 64 eventually desired to support Phase 1. But fewer validators means higher rewards per validator, so it is designed to attract early participants to get things bootstrapped.

MIN_GENESIS_ACTIVE_VALIDATOR_COUNT used to be much higher (65,536 = 2 million Ether staked), but was reduced when MIN_GENESIS_TIME, below, was added.

MIN_GENESIS_TIME uint64(1578009600) (Jan 3, 2020)

MIN_GENESIS_TIME is the earliest date that the beacon chain can start, and will be adjusted as we near deployment. It's currently set to January the 3rd, 2020 as that is the 11th anniversary of the Bitcoin genesis. It had been suggested that it would be cool to start up on that date, but it was never remotely practical.

Having a MIN_GENESIS_TIME allows us to start the chain with fewer validators than was previously thought necessary. The previous plan was to start the chain as soon as there were MIN_GENESIS_ACTIVE_VALIDATOR_COUNT validators staked. But there were concerns that with a lowish initial validator count, a single entity could form the majority of them and then act to prevent other validators from entering (a "gatekeeper attack"). A minimum genesis time allows time for all intending depositors to make their deposits before they could be excluded by a gatekeeper attack.

HYSTERESIS_QUOTIENT uint64(4)
HYSTERESIS_DOWNWARD_MULTIPLIER uint64(1)
HYSTERESIS_UPWARD_MULTIPLIER uint64(5)

These parameters relate to the way that effective balance is changed (see EFFECTIVE_BALANCE_INCREMENT below). As described there, effective balance for a validator follows changes to the actual balance in a step-wise way, with hysteresis. This is to ensure that it does not change very often.

The original hysteresis design had an unintended effect that might have encouraged stakers to over-deposit or make multiple deposits in order to maintain a balance above 32 Ether at all times. This is because, if a validator's balance were to drop below 32 Ether soon after depositing, however briefly, the effective balance would immediately drop to 31 Ether, and would take a long time to recover. This would result in a 3% reduction in rewards for a period.

This problem was addressed by making the hysteresis configurable via these parameters. Specifically, these settings mean:

  1. if a validator's balance falls 0.25 Ether below its effective balance, then its effective balance is reduced by 1 Ether
  2. if a validator's balance rises 1.25 Ether above its effective balance, then its effective balance is increased by 1 Ether

These calculations are done in process_final_updates().
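The two rules above can be sketched as a single function. The logic follows the effective balance update in the spec, but `update_effective_balance` is a hypothetical stand-alone wrapper of mine that takes and returns Gwei amounts rather than mutating validator records.

```python
EFFECTIVE_BALANCE_INCREMENT = 10**9   # 1 Ether in Gwei
MAX_EFFECTIVE_BALANCE = 32 * 10**9    # 32 Ether in Gwei
HYSTERESIS_QUOTIENT = 4
HYSTERESIS_DOWNWARD_MULTIPLIER = 1
HYSTERESIS_UPWARD_MULTIPLIER = 5


def update_effective_balance(balance: int, effective_balance: int) -> int:
    """Return the new effective balance given the actual balance (all in Gwei)."""
    hysteresis_increment = EFFECTIVE_BALANCE_INCREMENT // HYSTERESIS_QUOTIENT
    downward_threshold = hysteresis_increment * HYSTERESIS_DOWNWARD_MULTIPLIER  # 0.25 Ether
    upward_threshold = hysteresis_increment * HYSTERESIS_UPWARD_MULTIPLIER      # 1.25 Ether
    if (
        balance + downward_threshold < effective_balance
        or effective_balance + upward_threshold < balance
    ):
        # Round down to a whole increment and cap at the maximum.
        return min(balance - balance % EFFECTIVE_BALANCE_INCREMENT, MAX_EFFECTIVE_BALANCE)
    return effective_balance
```

A validator with an effective balance of 32 Ether keeps it while its actual balance stays at or above 31.75 Ether. After dropping to 31 Ether effective, it needs an actual balance above 32.25 Ether to get back to 32.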

  • For the safety of committees, TARGET_COMMITTEE_SIZE exceeds the recommended minimum committee size of 111; with sufficient active validators (at least SLOTS_PER_EPOCH * TARGET_COMMITTEE_SIZE), the shuffling algorithm ensures committee sizes of at least TARGET_COMMITTEE_SIZE. (Unbiasable randomness with a Verifiable Delay Function (VDF) will improve committee robustness and lower the safe minimum committee size.)

Given a proportion of the validators controlled by an attacker, what is the probability that the attacker ends up controlling a 2/3 majority in a randomly selected committee drawn from the full set of validators? This is what Vitalik looks at in the presentation, and where the 111 number comes from (a $\smash{2^{-40}}$ chance, one-in-a-trillion, of an attacker with 1/3 of the validators gaining by chance a 2/3 majority in any one committee).
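The calculation can be approximated with a binomial tail. This is my own sketch, not the method from the presentation: it assumes committee seats are drawn independently (sampling with replacement), and it takes "2/3 majority" to mean at least two thirds of the seats. The function name is hypothetical.

```python
from fractions import Fraction
from math import comb, log2


def takeover_probability(committee_size: int, attacker_share: Fraction) -> Fraction:
    """P(attacker holds at least 2/3 of the seats) under a binomial model."""
    threshold = (2 * committee_size + 2) // 3  # ceil(2n / 3)
    honest_share = 1 - attacker_share
    return sum(
        comb(committee_size, k) * attacker_share**k * honest_share**(committee_size - k)
        for k in range(threshold, committee_size + 1)
    )


p_111 = takeover_probability(111, Fraction(1, 3))
log2_p_111 = log2(float(p_111))
```

Under these assumptions `log2_p_111` should land close to the -40 quoted above, and the probability falls quickly as the committee size grows.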

Another issue is that the randomness that we are using (a RANDAO) is not unbiasable. If an attacker happens to control a number of block proposers at the end of an epoch, they can decide to reveal or not to reveal their blocks, gaining one bit of influence per validator on the next random number. This might allow an attacker to gain more control in the next round and so on. In this way, an attacker can gain some influence over committee selection. Having a good lower-bound on committee size (TARGET_COMMITTEE_SIZE) helps to defend against this. Alternatively, we could use an unbiasable source of randomness such as a verifiable delay function (VDF). Use of a VDF is not currently planned for Eth2, but may be implemented in future.

Gwei values

Name Value
MIN_DEPOSIT_AMOUNT Gwei(2**0 * 10**9) (= 1,000,000,000)

MIN_DEPOSIT_AMOUNT is not actually used anywhere within the Phase 0 Beacon Chain Specification document. Where it is used is within the deposit contract to be deployed to the Ethereum 1 chain. Any amount less than this value sent to the deposit contract is reverted. [TODO - the Vyper contract has been deprecated; link to the Solidity version.]

MAX_EFFECTIVE_BALANCE Gwei(2**5 * 10**9) (= 32,000,000,000)

There is a concept of "effective balance" for validators: whatever a validator's actual total stake (balance), its voting power is weighted by its effective balance, even if it has much more at stake. Effective balance is also the amount on which all rewards, penalties, and slashings are calculated: it's used a lot in the protocol.

The MAX_EFFECTIVE_BALANCE is the highest effective balance that a validator can have: 32 Ether. Any balance above this is ignored. Note that this means that staking rewards don't compound in the usual case (unless our balance somehow falls below 32 Ether at some point).

There is a discussion in the Design Rationale of why 32 Ether was chosen as the staking amount. In short, we want enough validators to keep the chain both alive and secure under attack, but not so many that the message overhead on the network becomes too high.

EJECTION_BALANCE Gwei(2**4 * 10**9) (= 16,000,000,000)

If a validator's balance falls below 16 Ether then it is exited from the system (kicked out of the active validator set). This is most likely to happen as a result of the "quadratic leak" which gradually reduces the balances of inactive validators in order to maintain the liveness of the beacon chain.

EFFECTIVE_BALANCE_INCREMENT Gwei(2**0 * 10**9) (= 1,000,000,000)

Throughout the protocol, a quantity called "effective balance" is used instead of the validators' actual balances. Effective balance tracks the actual balance, with two differences: (1) effective balance is capped at MAX_EFFECTIVE_BALANCE no matter how high the actual balance of a validator is, and (2) effective balance is much coarser-grained - it changes only in steps of EFFECTIVE_BALANCE_INCREMENT rather than Gwei.

This discretisation of balance is designed to reduce the amount of hashing required when making state updates. As we shall see, validators' actual balances are stored as a contiguous list in BeaconState. This is easy to update. Effective balances are stored in the individual validator records and are more costly to update (more hashing required). So we try to update effective balances relatively infrequently.

Effective balance is changed according to a process with hysteresis to avoid situations where it changes frequently. See HYSTERESIS_QUOTIENT above.

You can read more about effective balance in the Design Rationale and in this article.

Initial values

Name Value
GENESIS_FORK_VERSION Version('0x00000000')

Forks/upgrades are expected, if only when we move to Phase 1. This is the fork version the beacon chain starts with at its "Genesis" event: the point at which the chain first starts producing blocks.

BLS_WITHDRAWAL_PREFIX Bytes1('0x00')

Not actually used in this core beacon chain spec, but used in the deposit contract spec.

Validators need to register two public/private key pairs. The signing key is used constantly for signing attestations and blocks. The withdrawal key will be used in future after a validator has exited to allow the validator's Ether balance to be transferred to an Eth2 account. The withdrawal credentials are stored in the validator's record so that, in future, the owner of the validator can lay claim to the original stake and accrued rewards. The withdrawal credentials are the 32 byte SHA256 hash of the validator's withdrawal public key, with the first byte set to BLS_WITHDRAWAL_PREFIX as a version number, in case of future changes.
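As a sketch, the construction looks like this. The function name `compute_withdrawal_credentials` is hypothetical, and the 48-byte value used below is a dummy placeholder rather than a valid BLS public key.

```python
from hashlib import sha256

BLS_WITHDRAWAL_PREFIX = b"\x00"


def compute_withdrawal_credentials(withdrawal_pubkey: bytes) -> bytes:
    """Prefix byte followed by the last 31 bytes of SHA256(pubkey)."""
    assert len(withdrawal_pubkey) == 48  # compressed BLS public key length
    return BLS_WITHDRAWAL_PREFIX + sha256(withdrawal_pubkey).digest()[1:]


dummy_pubkey = bytes(range(48))  # placeholder, not a real key
credentials = compute_withdrawal_credentials(dummy_pubkey)
```

The result is always 32 bytes. Overwriting the first byte of the hash costs 8 bits of collision resistance, in exchange for a version tag.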

Time parameters

Name Value Unit Duration
GENESIS_DELAY uint64(172800) seconds 2 days

The GENESIS_DELAY is a grace period to allow nodes and node operators time to prepare for the Genesis event. The Genesis event cannot occur before MIN_GENESIS_TIME. If there are not MIN_GENESIS_ACTIVE_VALIDATOR_COUNT registered validators sufficiently in advance of MIN_GENESIS_TIME, then Genesis will occur GENESIS_DELAY seconds after enough validators have been registered.

The Genesis event (beacon chain start) was originally designed to take place at midnight UTC, even for testnets, which was not always convenient. This has now been changed. Once we're past MIN_GENESIS_TIME - GENESIS_DELAY, Genesis could end up being at any time of the day, depending on when the last deposit needed comes in.
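The interplay between the three genesis parameters can be sketched as follows. This is a simplification of mine: it assumes the candidate genesis time is the timestamp of the triggering Eth1 block plus GENESIS_DELAY, and it takes the validator count as a plain number. Both function names are hypothetical.

```python
from datetime import datetime, timezone

MIN_GENESIS_TIME = 1578009600
MIN_GENESIS_ACTIVE_VALIDATOR_COUNT = 2**14
GENESIS_DELAY = 172800  # 2 days in seconds


def candidate_genesis_time(eth1_timestamp: int) -> int:
    """Genesis time implied by the Eth1 block that triggers genesis."""
    return eth1_timestamp + GENESIS_DELAY


def is_valid_genesis(eth1_timestamp: int, active_validator_count: int) -> bool:
    """Genesis needs both a late enough time and enough validators."""
    if candidate_genesis_time(eth1_timestamp) < MIN_GENESIS_TIME:
        return False
    return active_validator_count >= MIN_GENESIS_ACTIVE_VALIDATOR_COUNT


# Decode the configured minimum genesis time for a sanity check.
earliest_genesis = datetime.fromtimestamp(MIN_GENESIS_TIME, tz=timezone.utc)
```

The earliest Eth1 block that can trigger genesis is therefore one stamped exactly MIN_GENESIS_TIME - GENESIS_DELAY, and `earliest_genesis` decodes to midnight UTC on 3 January 2020.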

SECONDS_PER_SLOT uint64(12) seconds 12 seconds

This used to be 6 seconds, but is now 12, and has previously had other values. The main limiting factors in shortening this are the time necessary for block proposals to propagate among committees, and for validators to communicate and aggregate their votes for the block.

This slot length has to account for shard blocks as well in later phases. There was some discussion around having the beacon chain and shards on differing cadences, but the latest Phase 1 design tightly couples the beacon chain with the shards. Shard blocks under the new proposal are much larger, which led to the lengthening of the slot to 12 seconds.

There is a general intention to shorten this in future, perhaps to [8 seconds](https://github.com/ethereum/eth2.0-specs/issues/1890#issue-638024803), if it proves possible to do this in practice.

SECONDS_PER_ETH1_BLOCK uint64(14) seconds 14 seconds

The assumed block interval on the Eth1 chain, used when calculating how long we will wait before trusting that an Eth1 block will not be reorganised.

The average Eth1 block time since January 2020 has actually been nearer 13 seconds, but never mind. The net effect is that we will be going a little deeper back in the Eth1 chain than ETH1_FOLLOW_DISTANCE would suggest, which ought to be safer.
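A quick worked calculation of the effect. The 13 second figure is the approximate observed block time mentioned above, and the variable names are mine.

```python
ETH1_FOLLOW_DISTANCE = 2**10
SECONDS_PER_ETH1_BLOCK = 14

# The follow distance expressed as a time delay.
follow_seconds = ETH1_FOLLOW_DISTANCE * SECONDS_PER_ETH1_BLOCK
follow_hours = follow_seconds / 3600

# Approximate number of blocks actually covered at the observed block time.
observed_block_time = 13
effective_depth_blocks = follow_seconds // observed_block_time
```

The delay is 14,336 seconds, just under four hours. At 13 seconds per block that covers about 1,102 blocks rather than the nominal 1,024.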

MIN_ATTESTATION_INCLUSION_DELAY uint64(2**0) (= 1) slots 12 seconds

A design goal of Eth2 is not to heavily disadvantage validators that are running on lower-spec systems, or, conversely, to reduce any advantage gained by running on high-spec systems.

One aspect of performance is network bandwidth. When a validator becomes the block proposer, it needs to gather attestations from the rest of its committee. On a low-bandwidth link, this takes longer, and could result in the proposer not being able to include as many past attestations as other better-connected validators might, thus receiving lower rewards.

MIN_ATTESTATION_INCLUSION_DELAY was an attempt to "level the playing field" by setting a minimum number of slots before an attestation can be included in a beacon block. It was originally set at 4, with a 6 second slot time, allowing 24 seconds for attestations to propagate around the network.

It was later set to one (attestations are included as early as possible) and, now that we are crosslinking shards every slot, this is the only value that makes sense. So it exists today as a kind of relic of the earlier design.

The current slot time of 12 seconds (see above) is assumed to allow sufficient time for attestations to propagate and be aggregated within one slot. If this proves not to be the case, then it may be lengthened later.

SLOTS_PER_EPOCH uint64(2**5) (= 32) slots 6.4 minutes

When slots were six seconds, there were 64 slots per epoch. So the time between epoch boundaries is unchanged compared with the original design.

As a reminder, epoch transitions are where the heavy beacon chain state-transition calculation occurs, so we don't want them too close together. On the other hand, they are also the targets for finalisation, so we don't want them too far apart.

MIN_SEED_LOOKAHEAD uint64(2**0) (= 1) epochs 6.4 minutes

A random seed is used to select all the committees and proposers for an epoch. Every epoch, the beacon chain accumulates randomness from proposers via the RANDAO and stores it. The seed for the current epoch is based on the RANDAO output from MIN_SEED_LOOKAHEAD + 1 epochs ago. With MIN_SEED_LOOKAHEAD set to one, the effect is that we can know the seed for the current epoch and the next epoch, but not beyond (since the next-but-one epoch depends on randomness from the current epoch that hasn't been accumulated yet).
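To make the lookahead arithmetic concrete, here is a minimal sketch. The helper names are mine rather than the spec's, and I'm ignoring both the circular-buffer indexing that the real seed calculation uses and the edge cases near genesis.

```python
MIN_SEED_LOOKAHEAD = 1  # epochs


def randao_source_epoch(epoch: int) -> int:
    """Epoch whose accumulated RANDAO mix feeds the seed for `epoch` (hypothetical helper)."""
    return epoch - MIN_SEED_LOOKAHEAD - 1


def latest_epoch_with_known_seed(current_epoch: int) -> int:
    """Furthest epoch whose seed can already be computed during `current_epoch`.

    The mix for epoch N is only complete once epoch N has ended, so during
    `current_epoch` the newest complete mix is that of `current_epoch - 1`.
    """
    newest_complete_mix = current_epoch - 1
    return newest_complete_mix + MIN_SEED_LOOKAHEAD + 1


current = 100
print(randao_source_epoch(current))           # 98
print(latest_epoch_with_known_seed(current))  # 101, that is, the next epoch
```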

This mechanism is designed to allow sufficient time for committee members to find each other on the peer-to-peer network, and in Phase 1 to sync up any data they will need. But preventing committee makeup being known too far ahead limits the opportunity for coordinated collusion between validators.

MAX_SEED_LOOKAHEAD uint64(2**2) (= 4) epochs 25.6 minutes

The above notwithstanding, if an attacker has a large proportion of the stake, or is, for example, able to DoS block proposers for a while, then it might be possible for the attacker to predict the output of the RANDAO further ahead than MIN_SEED_LOOKAHEAD would normally allow. In that case, the attacker might be able to manipulate the make-up of committees advantageously by performing judicious exits and activations of their validators.

To prevent this, we assume a maximum feasible lookahead that an attacker might achieve (that is, this parameter) and delay all activations and exits by this amount. With MAX_SEED_LOOKAHEAD set to 4, if only 10% of validators are online and honest, then the chance that an attacker can succeed in forecasting the seed beyond MAX_SEED_LOOKAHEAD - MIN_SEED_LOOKAHEAD = 3 epochs is $\smash{0.9^{3\times32}}$, which is about 1 in 25,000.
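The 1 in 25,000 figure is easy to check. This is just the arithmetic from the paragraph above, under its stated assumption that the attacker needs every proposer in the window to be under its control or silenced.

```python
SLOTS_PER_EPOCH = 32
MIN_SEED_LOOKAHEAD = 1
MAX_SEED_LOOKAHEAD = 4

honest_online_fraction = 0.10  # the assumption made in the text above
extra_epochs = MAX_SEED_LOOKAHEAD - MIN_SEED_LOOKAHEAD

# Every slot in the window must fall to the attacker.
slots_to_control = extra_epochs * SLOTS_PER_EPOCH
p_success = (1 - honest_online_fraction) ** slots_to_control

print(f"slots to control: {slots_to_control}")
print(f"success probability: {p_success:.2e} (about 1 in {round(1 / p_success):,})")
```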

MIN_EPOCHS_TO_INACTIVITY_PENALTY uint64(2**2) (= 4) epochs 25.6 minutes

The inactivity penalty is discussed below. This parameter sets the time until it kicks in: if the last finalised epoch is longer ago than this, then the beacon chain starts operating in "leak" mode. In this mode, participating validators no longer get rewarded, and validators that are not participating get penalised.

EPOCHS_PER_ETH1_VOTING_PERIOD uint64(2**5) (= 32) epochs ~3.4 hours

In order to safely onboard new validators, the beacon chain needs to take a view on what the Eth1 chain looks like. This is done by collecting votes from beacon block proposers - they are expected to consult an available Eth1 client in order to construct their vote.

EPOCHS_PER_ETH1_VOTING_PERIOD * SLOTS_PER_EPOCH is the total number of votes for Eth1 blocks that are collected. As soon as more than half of this number of votes are for the same Eth1 block, that block is adopted by the beacon chain and deposit processing can continue.
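As a minimal sketch of the threshold, assuming (as I read the spec's Eth1 data processing) that the comparison is a strict majority of all possible votes in the period, not just of the votes cast so far. The function name is mine.

```python
SLOTS_PER_EPOCH = 32
EPOCHS_PER_ETH1_VOTING_PERIOD = 32

VOTES_PER_PERIOD = EPOCHS_PER_ETH1_VOTING_PERIOD * SLOTS_PER_EPOCH  # 1024


def has_majority(votes_for_block: int) -> bool:
    """True once strictly more than half of the period's possible votes agree."""
    return votes_for_block * 2 > VOTES_PER_PERIOD


print(has_majority(512))  # False: exactly half is not enough
print(has_majority(513))  # True
```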

Rules for how validators select the right block to vote for are set out in the validator guide. ETH1_FOLLOW_DISTANCE is the minimum depth of block that can be considered: this is very conservatively set to ensure that we never end up with some block that we've relied on subsequently being reorganised out of the Ethereum 1 chain. For a detailed analysis of these parameters, see this ethresear.ch post.

SLOTS_PER_HISTORICAL_ROOT uint64(2**13) (= 8,192) slots ~27 hours

There have been several redesigns of the way the beacon chain stores its past history. The current design is a double batched accumulator. The block root and state root for every slot are stored in the state for SLOTS_PER_HISTORICAL_ROOT slots. When those lists are full, both are merkleised into a single Merkle root, which is added to the ever-growing state.historical_roots list.

MIN_VALIDATOR_WITHDRAWABILITY_DELAY uint64(2**8) (= 256) epochs ~27 hours

Once a validator has made it through the exit queue it can stop participating. However, its funds remain locked for the duration of MIN_VALIDATOR_WITHDRAWABILITY_DELAY. In Phase 0 this is to allow some time for any slashable behaviour to be detected and reported so that the validator can still be penalised (in which case the validator's withdrawable time is pushed EPOCHS_PER_SLASHINGS_VECTOR into the future). In Phase 1 this delay will also allow for shard rewards to be credited and for proof of custody challenges to be mounted.

Note that in Phases 0 and 1 there is no mechanism to withdraw a validator's balance in any case. But being in a "withdrawable" state means that a validator has now fully exited from the protocol.

SHARD_COMMITTEE_PERIOD uint64(2**8) (= 256) epochs ~27 hours

This really anticipates Phase 1. The idea is that it's bad for the stability of longer-lived shard committees if validators can appear and disappear very rapidly. Therefore, a validator cannot initiate a voluntary exit until SHARD_COMMITTEE_PERIOD epochs after it is activated. Note that it could still be ejected by slashing before this time.
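For anyone who wants to check the Duration column, the conversions are straightforward. A small sketch using the values from the table above; the dictionary and helper are mine.

```python
SECONDS_PER_SLOT = 12
SLOTS_PER_EPOCH = 32
SECONDS_PER_EPOCH = SECONDS_PER_SLOT * SLOTS_PER_EPOCH  # 384 s = 6.4 minutes

# Parameter values in epochs, taken from the table above.
EPOCH_PARAMETERS = {
    "MIN_SEED_LOOKAHEAD": 1,
    "MAX_SEED_LOOKAHEAD": 4,
    "MIN_EPOCHS_TO_INACTIVITY_PENALTY": 4,
    "EPOCHS_PER_ETH1_VOTING_PERIOD": 32,
    "MIN_VALIDATOR_WITHDRAWABILITY_DELAY": 256,
    "SHARD_COMMITTEE_PERIOD": 256,
}


def epochs_to_hours(epochs: int) -> float:
    return epochs * SECONDS_PER_EPOCH / 3600


for name, epochs in EPOCH_PARAMETERS.items():
    print(f"{name}: {epochs} epochs = {epochs_to_hours(epochs):.2f} hours")

# SLOTS_PER_HISTORICAL_ROOT is measured in slots rather than epochs.
print(f"SLOTS_PER_HISTORICAL_ROOT: {8192 * SECONDS_PER_SLOT / 3600:.2f} hours")
```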

State list lengths

The following parameters set the sizes of some lists in the beacon chain state. Some lists have natural sizes; others, such as the validator registry, need an explicit maximum size to help with SSZ serialisation.

Name Value Unit Duration
EPOCHS_PER_HISTORICAL_VECTOR uint64(2**16) (= 65,536) epochs ~0.8 years

This is the number of epochs of previous RANDAO mixes that are stored (one per epoch). Having access to past RANDAO mixes allows historical shufflings to be recalculated. Since Validator records keep track of the activation and exit epochs of all past validators, we can thus reconstitute past committees as far back as we have the RANDAO values. This information can be used for slashing long-past attestations, for example. It is not clear how the value of this parameter was decided.

EPOCHS_PER_SLASHINGS_VECTOR uint64(2**13) (= 8,192) epochs ~36 days

In the epoch in which a misbehaving validator is slashed, its effective balance is added to an accumulator in the state. In this way, the state.slashings list tracks the total effective balance of all validators slashed during the last EPOCHS_PER_SLASHINGS_VECTOR epochs.

At a time EPOCHS_PER_SLASHINGS_VECTOR // 2 after being slashed, a further penalty is applied to the slashed validator, based on the total amount of value slashed during the 4096 epochs before and the 4096 epochs after it was originally slashed.

The idea of this is to disproportionately punish coordinated attacks, in which many validators break the slashing conditions at the same time, while only lightly penalising validators that get slashed by making a mistake. Early designs for Eth2 would always slash a validator's entire deposit.

HISTORICAL_ROOTS_LIMIT uint64(2**24) (= 16,777,216) historical roots ~52,262 years

Every SLOTS_PER_HISTORICAL_ROOT slots, the list of block roots and the list of state roots are merkleised and added to the state.historical_roots list. This is sized so that it is possible to store these roots for the entire past history of the chain. Although this list is effectively unbounded, it grows by only about 10 KB per year.

Storing past roots like this allows historical Merkle proofs to be constructed if required.
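A quick back-of-envelope check of the sizing, assuming one 32-byte root per SLOTS_PER_HISTORICAL_ROOT slots and a 365.25-day year (the year length is my choice of convention).

```python
SECONDS_PER_SLOT = 12
SLOTS_PER_HISTORICAL_ROOT = 2**13
HISTORICAL_ROOTS_LIMIT = 2**24
ROOT_SIZE_BYTES = 32
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumption: Julian year

seconds_per_root = SLOTS_PER_HISTORICAL_ROOT * SECONDS_PER_SLOT
roots_per_year = SECONDS_PER_YEAR / seconds_per_root
bytes_per_year = roots_per_year * ROOT_SIZE_BYTES
years_to_fill = HISTORICAL_ROOTS_LIMIT / roots_per_year

print(f"one root every {seconds_per_root / 3600:.1f} hours")
print(f"growth: {bytes_per_year:,.0f} bytes per year")
print(f"list is full after about {years_to_fill:,.0f} years")
```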

VALIDATOR_REGISTRY_LIMIT uint64(2**40) (= 1,099,511,627,776) validators

Every time the Eth1 deposit contract processes a deposit from a new validator (as identified by its public key), a new entry is appended to the state.validators list.

In the current design, validators are never removed from this list, even after exiting from being a validator. This is largely because there is nowhere yet to send a validator's remaining deposit and staking rewards, so they continue to need to be tracked in the beacon chain.

The maximum length of this list is VALIDATOR_REGISTRY_LIMIT, which is one trillion, so we ought to be OK for a while, especially given that the minimum deposit amount is 1 Ether.

Rewards and penalties

Name Value
BASE_REWARD_FACTOR uint64(2**6) (= 64)

This is the big knob to turn to change the issuance rate of Eth2. Almost all validator rewards are calculated in terms of a "base reward" which is calculated as,

effective_balance * BASE_REWARD_FACTOR // integer_squareroot(total_balance) // BASE_REWARDS_PER_EPOCH

where effective_balance is the individual validator's current effective balance and total_balance is the sum of the effective balances of all active validators.

Thus, the total validator rewards per epoch (the Eth2 issuance rate) could in principle be tuned by increasing or decreasing BASE_REWARD_FACTOR.
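Here is the formula as runnable Python, as a sketch only. BASE_REWARDS_PER_EPOCH is not listed in this section; I'm assuming the Phase 0 value of 4. The standard library's `math.isqrt` stands in for the spec's integer_squareroot.

```python
from math import isqrt

BASE_REWARD_FACTOR = 2**6
BASE_REWARDS_PER_EPOCH = 4  # assumption: Phase 0 spec value, not shown in this section
GWEI_PER_ETH = 10**9


def base_reward(effective_balance: int, total_balance: int) -> int:
    """Base reward in Gwei per epoch, mirroring the formula above."""
    return effective_balance * BASE_REWARD_FACTOR // isqrt(total_balance) // BASE_REWARDS_PER_EPOCH


# Quadrupling the total stake roughly halves the per-validator reward,
# because of the inverse square root.
small_stake = base_reward(32 * GWEI_PER_ETH, 250_000 * GWEI_PER_ETH)
large_stake = base_reward(32 * GWEI_PER_ETH, 1_000_000 * GWEI_PER_ETH)
print(small_stake, large_stake)
```

Since BASE_REWARD_FACTOR is a straight multiplier in the numerator, doubling it would double every base reward, which is why it is the knob for issuance.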

WHISTLEBLOWER_REWARD_QUOTIENT uint64(2**9) (= 512)

One reward amount that is not tied to the base reward is the whistleblower reward. This is a reward for providing a proof that a proposer or attester has violated a slashing condition. The whistleblower reward is set at $\smash{\frac{1}{512}}$ of the effective balance of the slashed validator.

PROPOSER_REWARD_QUOTIENT uint64(2**3) (= 8)

The whistleblower reward can optionally be divided between the reporter and the proposer that includes the report in a block in the ratio 7:1. That is, the proposer receives a proportion 1/PROPOSER_REWARD_QUOTIENT of the reward.

In the Phase 0 spec, however, the whistleblower reward always gets awarded in its entirety to the block proposer including the report, ignoring this parameter. In general, it's quite hard to avoid whistleblowing reports being stolen by block proposers in any case. zkProofs might help one day.
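A minimal sketch of the 7:1 split, for the case where the reporter and the proposer are different parties. The function name is mine; in Phase 0, as noted above, both shares end up with the block proposer.

```python
from typing import Tuple

WHISTLEBLOWER_REWARD_QUOTIENT = 2**9
PROPOSER_REWARD_QUOTIENT = 2**3
GWEI_PER_ETH = 10**9


def whistleblower_split(slashed_effective_balance: int) -> Tuple[int, int]:
    """Return (reporter_share, proposer_share) in Gwei under the 7:1 split."""
    total = slashed_effective_balance // WHISTLEBLOWER_REWARD_QUOTIENT
    proposer_share = total // PROPOSER_REWARD_QUOTIENT
    return total - proposer_share, proposer_share


reporter, proposer = whistleblower_split(32 * GWEI_PER_ETH)
print(reporter, proposer)  # 54687500 7812500
```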

INACTIVITY_PENALTY_QUOTIENT uint64(2**24) (= 16,777,216)

If the beacon chain hasn't finalised an epoch for longer than MIN_EPOCHS_TO_INACTIVITY_PENALTY epochs, then it enters "leak" mode. In this mode, any validator that does not vote (or votes for an incorrect target) is penalised an amount each epoch of effective_balance * finality_delay // INACTIVITY_PENALTY_QUOTIENT. The effect of this is the quadratic leak described below.
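The per-epoch penalty is simple to compute. This sketch covers only the leak term given above and ignores the other penalties a non-participating validator receives; the function name is mine.

```python
INACTIVITY_PENALTY_QUOTIENT = 2**24
MIN_EPOCHS_TO_INACTIVITY_PENALTY = 4
GWEI_PER_ETH = 10**9


def inactivity_penalty(effective_balance: int, finality_delay: int) -> int:
    """Per-epoch leak penalty in Gwei; zero while the chain is finalising normally."""
    if finality_delay <= MIN_EPOCHS_TO_INACTIVITY_PENALTY:
        return 0
    return effective_balance * finality_delay // INACTIVITY_PENALTY_QUOTIENT


for delay in (4, 5, 100, 4096):
    print(delay, inactivity_penalty(32 * GWEI_PER_ETH, delay))
```

Because the penalty grows linearly with finality_delay, the total leaked grows with the square of the time since finality, hence "quadratic".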

MIN_SLASHING_PENALTY_QUOTIENT uint64(2**5) (= 32)

When a validator is first convicted of a slashable offence, an initial penalty is applied. This is calculated as,

validator.effective_balance // MIN_SLASHING_PENALTY_QUOTIENT

Thus, the initial slashing penalty is between 0.5 Ether and 1 Ether depending on the validator's effective balance (which is between 16 and 32 Ether; note that effective balance is denominated in Gwei).

A further slashing penalty is applied later based on the total amount of balance slashed during a period of EPOCHS_PER_SLASHINGS_VECTOR.

  • The INACTIVITY_PENALTY_QUOTIENT equals INVERSE_SQRT_E_DROP_TIME**2 where INVERSE_SQRT_E_DROP_TIME := 2**12 epochs (about 18 days) is the time it takes the inactivity penalty to reduce the balance of non-participating validators to about 1/sqrt(e) ~= 60.6%. Indeed, the balance retained by offline validators after n epochs is about (1 - 1/INACTIVITY_PENALTY_QUOTIENT)**(n**2/2); so after INVERSE_SQRT_E_DROP_TIME epochs, it is roughly (1 - 1/INACTIVITY_PENALTY_QUOTIENT)**(INACTIVITY_PENALTY_QUOTIENT/2) ~= 1/sqrt(e).

The idea for the inactivity leak (aka the quadratic leak) was proposed in the original Casper FFG paper. The problem it addresses is that, if a large fraction of the validator set were to go offline at the same time, it would not be possible to continue finalising blocks, since a 2/3 majority of the whole validator set is required for finalisation.

In order to recover, the inactivity leak gradually reduces the stakes of validators who are not making attestations until, eventually, the remaining participating validators control 2/3 of the remaining stake. They can then begin to finalise checkpoints once again.

In the calculation here, we are solving the (discrete form of the) differential equation $\smash{\frac{dB}{dt}=-\frac{Bt}{\alpha}}$, where $B$ is the balance and $\alpha$ is the value of INACTIVITY_PENALTY_QUOTIENT, and the amount leaked at each step increases in proportion to the time $t$ since finality. The solution to this differential equation is $\smash{B(t)=B_0e^{-t^2/2\alpha}}$. From this it can be confirmed that INVERSE_SQRT_E_DROP_TIME, the time taken to reduce starting balances by a factor of $\smash{e^{\frac{1}{2}}}$, is $t=\sqrt{\alpha}$. The second half of the spec paragraph is just a calculus-avoiding way of expressing the same thing.

With these parameters, it would take about 21.4 days for a validator to leak half its deposit and then be ejected for falling below the EJECTION_BALANCE.
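Both numbers can be reproduced from the closed-form solution, and cross-checked with a crude discrete simulation. The simulation is my simplification: it leaks from the actual balance every epoch, whereas the spec works from the effective balance.

```python
import math

INACTIVITY_PENALTY_QUOTIENT = 2**24
MINUTES_PER_EPOCH = 6.4


def epochs_to_retain(fraction: float) -> float:
    """Solve B0 * exp(-t**2 / (2 * alpha)) = fraction * B0 for t."""
    return math.sqrt(-2 * INACTIVITY_PENALTY_QUOTIENT * math.log(fraction))


def simulate_epochs_to_retain(fraction: float) -> int:
    """Discrete version: in epoch t the balance loses a proportion t / alpha."""
    balance, t = 1.0, 0
    while balance > fraction:
        t += 1
        balance *= 1 - t / INACTIVITY_PENALTY_QUOTIENT
    return t


def to_days(epochs: float) -> float:
    return epochs * MINUTES_PER_EPOCH / (60 * 24)


print(f"1/sqrt(e): {epochs_to_retain(math.exp(-0.5)):.0f} epochs")
print(f"half, closed form: {to_days(epochs_to_retain(0.5)):.1f} days")
print(f"half, simulated:   {to_days(simulate_epochs_to_retain(0.5)):.1f} days")
```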

This inactivity penalty mechanism is designed to protect the chain long-term in the face of catastrophic events (sometimes referred to as the ability to survive World War III). The result might be that the beacon chain could permanently split into two independent chains either side of a network partition, but this is assumed to be a reasonable outcome for any problem that can't be fixed in a few weeks. In this sense, the beacon chain technically prioritises availability over consistency. (You can't have both.)

Max operations per block

Name Value
MAX_PROPOSER_SLASHINGS 2**4 (= 16)
MAX_ATTESTER_SLASHINGS 2**1 (= 2)
MAX_ATTESTATIONS 2**7 (= 128)
MAX_DEPOSITS 2**4 (= 16)
MAX_VOLUNTARY_EXITS 2**4 (= 16)

These parameters are used to size lists in the beacon block bodies for the purposes of SSZ serialisation, as well as constraining the maximum size of beacon blocks so that they can propagate efficiently, and avoid DoS attacks.

With these settings, the maximum size of a beacon block (before compression) is 123,016 bytes. By far the largest object is the AttesterSlashing, at up to 33,216 bytes. However, a single attester slashing can be used to slash many misbehaving validators at the same time (assuming that in an attack, many validators would make the same conflicting vote).

With some assumptions on average behaviour and compressibility, this leads to an average block size of around 36 KBytes, compressing down to 22 KBytes, in the worst case (with the maximum number of validators, and the maximum average number of possible slashings).

Some calculations to support the above can be found for each of the containers in the next section. They can also be found in this spreadsheet (the numbers are a bit out of date). Protolambda has a script for calculating all the Eth2 container minimum and maximum sizes.

Domain types

Name Value
DOMAIN_BEACON_PROPOSER DomainType('0x00000000')
DOMAIN_BEACON_ATTESTER DomainType('0x01000000')
DOMAIN_RANDAO DomainType('0x02000000')
DOMAIN_DEPOSIT DomainType('0x03000000')
DOMAIN_VOLUNTARY_EXIT DomainType('0x04000000')
DOMAIN_SELECTION_PROOF DomainType('0x05000000')
DOMAIN_AGGREGATE_AND_PROOF DomainType('0x06000000')

These domain types are used in two ways: for signatures and for seeds.

As a cryptographic nicety, each of the protocol's five signature types is augmented with the appropriate Domain before being signed:

  • Signed block proposals incorporate DOMAIN_BEACON_PROPOSER
  • Signed attestations incorporate DOMAIN_BEACON_ATTESTER
  • RANDAO reveals are BLS signatures, and use DOMAIN_RANDAO
  • Deposit data messages from Ethereum 1 incorporate DOMAIN_DEPOSIT
  • Validator voluntary exit messages incorporate DOMAIN_VOLUNTARY_EXIT

In each case, except for Eth1 deposits, the fork version is also incorporated before signing. Deposits are valid across forks, but other messages are not. Note that this would allow validators to participate, if they wish, in two independent forks of the beacon chain without fear of being slashed.

In addition, the first two domains are also used to separate the seeds for random number generation. The original motivation was to avoid occasional collisions between Phase 0 committees and Phase 1 persistent committees. So, when computing the beacon block proposer, DOMAIN_BEACON_PROPOSER is hashed into the seed, and when computing committees, DOMAIN_BEACON_ATTESTER is hashed into the seed.

The last two domains were introduced to implement attestation subnet validations for denial of service resistance. They are not part of the consensus-critical state-transition. In short, each slot, validators are selected to aggregate attestations from their committee. The selection is done based on the validator's signature over the slot number, mixing in DOMAIN_SELECTION_PROOF. Then the validator signs the whole aggregated attestation using DOMAIN_AGGREGATE_AND_PROOF. See the Honest Validator spec for more on this.

Containers

The following types are SimpleSerialize (SSZ) containers.

We're about to see our first Python code in the executable spec. For specification purposes, these Container data structures are just Python data classes that are derived from the base SSZ Container class.

SSZ is the serialisation and merkleisation format used everywhere in Eth2. It is not self-describing, so you need to know what you are unpacking when deserialising. SSZ deals with basic types and composite types. Classes like the below are handled as SSZ containers, a composite type defined as an "ordered heterogeneous collection of values".

Implementations will obviously use their own paradigms to represent these data structures (we use a combination of Java classes and interfaces).

[TODO: check sizes of containers against Proto's script.]

Note: The definitions are ordered topologically to facilitate execution of the spec.

Note: Fields missing in container instantiations default to their zero value.

In the below, for most of the containers, I've shown the size along with the working out. If you prefer your information programmatically generated, see this from Protolambda (for spec v0.12.x).

Misc dependencies

Fork

class Fork(Container):
    previous_version: Version
    current_version: Version
    epoch: Epoch  # Epoch of latest fork

Fork data is stored in the BeaconState to indicate the current and previous fork versions. The fork version gets incorporated into the cryptographic domain in order to invalidate messages from validators on other forks. The previous fork version and the epoch of the change are stored so that pre-fork messages can still be validated (at least until the next fork).

Note that this is all about manual, protocol forks, and nothing to do with the fork-choice rule.

Fixed size: 4 + 4 + 8 = 16 bytes

ForkData

class ForkData(Container):
    current_version: Version
    genesis_validators_root: Root

Only used in compute_fork_data_root(). This is used when distinguishing between chains for the purpose of peer-to-peer gossip, and for domain separation. By including both the current fork version and the genesis validators root, we can cleanly distinguish between, say, mainnet and a testnet. They might both have the same fork history, but the genesis validators roots will differ.

Version is the datatype for a fork version number.
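To show how the fork data ends up in a signing domain, here is a sketch using only `hashlib`. It relies on my understanding that the domain is the 4-byte domain type followed by the first 28 bytes of the fork data root, and that the SSZ root of this two-field container is the SHA-256 hash of its two 32-byte chunks. The version and genesis roots below are placeholders, not real network values.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def fork_data_root(current_version: bytes, genesis_validators_root: bytes) -> bytes:
    """Root of a two-field ForkData container.

    Each field occupies one 32-byte chunk (the 4-byte Version is right-padded
    with zeros), and the root of two chunks is the hash of their concatenation.
    """
    assert len(current_version) == 4 and len(genesis_validators_root) == 32
    version_chunk = current_version.ljust(32, b"\x00")
    return sha256(version_chunk + genesis_validators_root)


def compute_domain(domain_type: bytes, fork_version: bytes, genesis_validators_root: bytes) -> bytes:
    """32-byte domain: 4-byte domain type, then 28 bytes of the fork data root."""
    assert len(domain_type) == 4
    return domain_type + fork_data_root(fork_version, genesis_validators_root)[:28]


DOMAIN_BEACON_ATTESTER = bytes.fromhex("01000000")

# Placeholder inputs for illustration only.
example_version = bytes.fromhex("00000001")
genesis_root_a = sha256(b"genesis validators A")
genesis_root_b = sha256(b"genesis validators B")

domain_a = compute_domain(DOMAIN_BEACON_ATTESTER, example_version, genesis_root_a)
domain_b = compute_domain(DOMAIN_BEACON_ATTESTER, example_version, genesis_root_b)
print(domain_a.hex())
print(domain_a != domain_b)  # same fork version, different genesis: different domains
```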

Checkpoint

class Checkpoint(Container):
    epoch: Epoch
    root: Root

Checkpoints are the points of justification or finalisation by the Casper FFG protocol. They are used by validators in creating AttestationData votes, and also for recording the status of recent checkpoints in BeaconState.

As per the Casper paper, checkpoints contain a height, and a block root. In this implementation of Casper FFG, checkpoints occur whenever the slot number is a multiple of SLOTS_PER_EPOCH, thus they correspond to epoch numbers. In particular, checkpoint $N$ is the first slot of epoch $N$. The genesis block is Checkpoint 0, and starts off both justified and finalised.

Thus, the root element here is the block root of the first block in the epoch. (This might be the block root of an earlier block if some slots have been skipped, that is, if there are no blocks for those slots.)

Once a checkpoint has been finalised, the slot it points to and all prior slots will never be reverted.
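The skipped-slot rule is easier to see in code. In this sketch the mapping from slot to block root is a hypothetical stand-in for the chain, with no entry for empty slots, and the roots are just labels.

```python
SLOTS_PER_EPOCH = 32


def start_slot_of_epoch(epoch: int) -> int:
    return epoch * SLOTS_PER_EPOCH


def checkpoint_root(epoch: int, block_roots_by_slot: dict) -> str:
    """Root of the latest block at or before the first slot of `epoch`."""
    slot = start_slot_of_epoch(epoch)
    while slot not in block_roots_by_slot:
        if slot == 0:
            raise KeyError("no block found at or before the checkpoint slot")
        slot -= 1  # walk back over skipped slots
    return block_roots_by_slot[slot]


# Slots 63 and 64 are empty, so checkpoint 2 points at the block from slot 62.
chain = {0: "genesis", 31: "block31", 32: "block32", 62: "block62", 65: "block65"}
print(checkpoint_root(1, chain))  # block32
print(checkpoint_root(2, chain))  # block62
```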

Fixed size: 8 + 32 = 40 bytes

Validator

class Validator(Container):
    pubkey: BLSPubkey
    withdrawal_credentials: Bytes32  # Commitment to pubkey for withdrawals
    effective_balance: Gwei  # Balance at stake
    slashed: boolean
    # Status epochs
    activation_eligibility_epoch: Epoch  # When criteria for activation were met
    activation_epoch: Epoch
    exit_epoch: Epoch
    withdrawable_epoch: Epoch  # When validator can withdraw funds

This is the datastructure that stores (almost) all the information about each individual validator.

Validators' actual balances are stored separately in the BeaconState structure, and only the slowly changing "effective balance" is stored here. This is because actual balances are liable to change quite frequently (every epoch): the way that Eth2 calculates state roots means that only the parts that change need to be recalculated; the roots of unchanged parts can be cached. Separating out the validator balances potentially means that only 1/15th (8/121) as much data needs to be rehashed every epoch compared to storing them here, which is an important optimisation.

A validator's record is created when its deposit is first processed. Sending multiple deposits does not create multiple validator records: deposits with the same public key are aggregated in one record. Validator records never expire in Phase 0; they are stored permanently, even after the validator has exited the system. Thus there is a 1:1 mapping between a validator's index in the list and the identity of the validator (validator records are only ever appended to the list).

Also stored in Validator:

  • pubkey serves as both the unique identity of the validator and the means of cryptographically verifying messages purporting to have been signed by it. The public key is stored raw, unlike in Eth1, where it is hashed to form the account address. This is to allow public keys to be aggregated for verifying aggregated attestations.
  • Validators actually have two private/public key pairs, the one above used for signing protocol messages, and a separate "withdrawal key". withdrawal_credentials is a commitment generated from the validator's withdrawal key so that, at some time in the future, a validator can prove it owns the funds and will be able to withdraw them. Storing the hash of the public key rather than the key itself saves a few bytes (16 bytes).
  • effective_balance is a topic of its own that we've touched upon already, and will discuss more fully when we look at process_final_updates.
  • slashed indicates that a validator has been slashed, that is, punished for violating the slashing conditions. A validator can only be slashed once.
  • The remaining values are the epochs in which the validator changed, or is due to change state.

A detailed explanation of the stages in a validator's lifecycle is here, and we'll be covering it in detail as we work through the beacon chain logic. But, in simplified form, progress is as follows:

  1. A 32 Eth deposit has been made on the Eth1 chain. No validator record exists yet.
  2. The deposit is processed by the beacon chain at some slot. A validator record is created with all epoch fields set to FAR_FUTURE_EPOCH.
  3. At the end of the epoch, the activation_eligibility_epoch is set to the next epoch.
  4. After the epoch activation_eligibility_epoch has been finalised, the validator is added to the activation queue by setting its activation_epoch appropriately, taking into account the per-epoch churn limit and MAX_SEED_LOOKAHEAD.
  5. On reaching activation_epoch the validator becomes active, and should carry out its duties.
  6. At any time after SHARD_COMMITTEE_PERIOD epochs, a validator may request a voluntary exit. exit_epoch is set according to the validator's position in the exit queue and MAX_SEED_LOOKAHEAD, and withdrawable_epoch is set MIN_VALIDATOR_WITHDRAWABILITY_DELAY epochs after that.
  7. From exit_epoch onwards the validator is no longer active. There is no mechanism for exited validators to rejoin: exiting is permanent.
  8. After withdrawable_epoch, the validator's balance can in principle be withdrawn, although there is no mechanism for doing this in Phase 0.

The above does not account for slashing or forced exits due to low balance.
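The simplified lifecycle can be read straight off the four epoch fields. A sketch, with stage labels that are mine rather than the spec's, and with FAR_FUTURE_EPOCH assumed to be the spec's maximum uint64 value.

```python
from dataclasses import dataclass

FAR_FUTURE_EPOCH = 2**64 - 1  # assumption: the spec's "not yet set" marker


@dataclass
class ValidatorEpochs:
    """Just the four status epochs from the Validator container."""
    activation_eligibility_epoch: int = FAR_FUTURE_EPOCH
    activation_epoch: int = FAR_FUTURE_EPOCH
    exit_epoch: int = FAR_FUTURE_EPOCH
    withdrawable_epoch: int = FAR_FUTURE_EPOCH


def lifecycle_stage(v: ValidatorEpochs, epoch: int) -> str:
    """Coarse status label for a validator at a given epoch."""
    if v.activation_eligibility_epoch == FAR_FUTURE_EPOCH:
        return "deposited"
    if epoch < v.activation_epoch:
        return "pending activation"
    if epoch < v.exit_epoch:
        return "active"
    if epoch < v.withdrawable_epoch:
        return "exited"
    return "withdrawable"


v = ValidatorEpochs(activation_eligibility_epoch=10, activation_epoch=15,
                    exit_epoch=300, withdrawable_epoch=556)
for epoch in (12, 15, 299, 300, 556):
    print(epoch, lifecycle_stage(v, epoch))
```

Like the list above, this ignores slashing and ejection for low balance.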

Fixed size: 48 + 32 + 8 + 1 + 4 * 8 = 121 bytes

AttestationData

class AttestationData(Container):
    slot: Slot
    index: CommitteeIndex
    # LMD GHOST vote
    beacon_block_root: Root
    # FFG vote
    source: Checkpoint
    target: Checkpoint

Eth2 relies on a combination of two different consensus mechanisms: LMD GHOST keeps the chain moving, and Casper FFG brings finalisation. These are documented in the Gasper paper. Attestations from (committees of) validators are used to provide votes simultaneously for each of these consensus mechanisms.

This class is the fundamental unit of attestation data.

  • slot: each active validator should be making exactly one attestation per epoch. Validators have an assigned slot for their attestation, and it is recorded here.
  • index: there can be several committees active in a single slot. This is the number of the committee that the validator belongs to in that slot. It is used to reconstruct the committee and to check that the attesting validator is a member. Ideally, all (or the majority at least) of the attestations in a slot from one committee will be identical, and can therefore be aggregated into a smaller number of attestations.
  • beacon_block_root is the validator's vote on the best block for that slot after locally running the LMD GHOST fork-choice rule.
  • source is the validator's opinion of the best currently justified checkpoint for the Casper FFG finalisation process.
  • target is the validator's opinion of the block at the start of the current epoch, also for Casper FFG finalisation.

This AttestationData structure gets wrapped up into several other similar but distinct structures:

  • Attestation: This is the form in which attestations normally make their way around the network. It is signed and aggregatable, and the list of validators making this attestation is compressed into a bitlist.
  • IndexedAttestation: Used primarily for attester slashing, it is signed and aggregated, with the list of attesting validators being an uncompressed list of indices.
  • PendingAttestation: After having their validity checked during block processing, these are stored in the beacon state pending processing at the end of the epoch. The signature is not stored, and the list of attesting validators is compressed into a bitlist.

Fixed size: 8 + 8 + 32 + 2 * 40 = 128 bytes

IndexedAttestation

class IndexedAttestation(Container):
    attesting_indices: List[ValidatorIndex, MAX_VALIDATORS_PER_COMMITTEE]
    data: AttestationData
    signature: BLSSignature

This is one of the forms in which aggregated attestations (combined identical attestations from multiple validators in the same committee) are handled.

Attestations and IndexedAttestations contain essentially the same information. The difference is that the list of attesting validators is stored uncompressed in IndexedAttestations. That is, each attesting validator is referenced by its global validator index, and non-attesting validators are not included. To be valid, the validator indices must be unique and sorted, and the signature must be an aggregate signature from exactly the listed set of validators.

IndexedAttestations are primarily used when reporting attester slashing. An Attestation can be converted to an IndexedAttestation using get_indexed_attestation().
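The conversion and the index check look roughly like this. It's a sketch with hypothetical helper names; signature verification is left out entirely, and the non-empty check is my reading of the spec rather than something stated above.

```python
from typing import List


def attesting_indices(committee: List[int], aggregation_bits: List[bool]) -> List[int]:
    """Expand a per-committee bitlist into sorted global validator indices."""
    assert len(committee) == len(aggregation_bits)
    return sorted(index for index, bit in zip(committee, aggregation_bits) if bit)


def indices_are_valid(indices: List[int]) -> bool:
    """Non-empty and strictly increasing, so sorted and free of duplicates."""
    return len(indices) > 0 and indices == sorted(set(indices))


committee = [907, 12, 4401, 256]   # global indices, in shuffled committee order
bits = [True, False, True, True]   # which committee members signed
indices = attesting_indices(committee, bits)
print(indices)                         # [256, 907, 4401]
print(indices_are_valid(indices))      # True
print(indices_are_valid([907, 256]))   # False: not sorted
```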

Max size: 8 * 2048 + 128 + 96 = 16,608 bytes

PendingAttestation

class PendingAttestation(Container):
    aggregation_bits: Bitlist[MAX_VALIDATORS_PER_COMMITTEE]
    data: AttestationData
    inclusion_delay: Slot
    proposer_index: ValidatorIndex

Attestations received in blocks are verified and then temporarily stored in beacon state in the form of PendingAttestations, pending further processing at the end of the epoch.

A PendingAttestation is an Attestation minus the signature, plus a couple of fields related to reward calculation:

  • inclusion_delay is the number of slots between the attestation having been made and it being included in a beacon block by the block proposer. Validators are rewarded for getting their attestations included in blocks, but the reward declines in inverse proportion to the inclusion delay. This incentivises swift attesting and communicating by validators.
  • proposer_index is the block proposer that included the attestation. The block proposer gets a micro reward for every validator's attestation it includes, not just for the aggregate attestation as a whole. This incentivises efficient finding and packing of aggregations, since the number of aggregate attestations per block is capped.

Taken together, these rewards ought to incentivise the whole network to collaborate to do efficient attestation aggregation (proposers want to include only well-aggregated attestations; validators want to get their attestations included, so will ensure that they get well aggregated).
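In numbers, the two incentives work something like the following. This is a sketch of the arithmetic as I understand it from the spec's inclusion delay rewards, for a single attester with a base reward that has already been calculated; the function name is mine.

```python
PROPOSER_REWARD_QUOTIENT = 2**3


def inclusion_rewards(base_reward: int, inclusion_delay: int):
    """Return (attester_reward, proposer_reward) in Gwei for one included attestation.

    `inclusion_delay` is assumed to be at least 1.
    """
    proposer_reward = base_reward // PROPOSER_REWARD_QUOTIENT
    max_attester_reward = base_reward - proposer_reward
    return max_attester_reward // inclusion_delay, proposer_reward


for delay in (1, 2, 4, 32):
    print(delay, inclusion_rewards(16_000, delay))
```

The proposer's share does not depend on the delay, but the attester's share halves if inclusion slips from one slot to two.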

Max size: 2048 / 8 + 128 + 8 + 8 = 400 bytes

Eth1Data

class Eth1Data(Container):
    deposit_root: Root
    deposit_count: uint64
    block_hash: Bytes32

Proposers include their view of the Eth1 chain in blocks, and this is how they do it. The beacon chain stores these up as votes in beacon state until there is a majority consensus, and then the winner is committed to beacon state. This is to allow the processing of Eth1 deposits, and creates a simple "honest-majority" one-way bridge from Eth1 to Eth2. The 1/2 majority assumption for this (rather than 2/3 for committees) is considered safe as the number of validators voting each time is large, at EPOCHS_PER_ETH1_VOTING_PERIOD * SLOTS_PER_EPOCH (1024).

  • deposit_root is the result of the get_deposit_root() method of the Eth1 deposit contract after executing the Eth1 block being voted on: it's the root of the (sparse) Merkle tree of deposits.
  • deposit_count is the number of deposits in the deposit contract at that point, the result of the get_deposit_count() method on the contract. This will be equal to or greater than (if there are pending unprocessed deposits) the value of state.eth1_deposit_index.
  • block_hash is the block hash of the Eth1 block being voted for. This doesn't have any current use within the Eth2 protocol, but is "too potentially useful to not throw in there", to quote Danny Ryan.

Fixed size: 32 + 8 + 32 = 72 bytes

HistoricalBatch

class HistoricalBatch(Container):
    block_roots: Vector[Root, SLOTS_PER_HISTORICAL_ROOT]
    state_roots: Vector[Root, SLOTS_PER_HISTORICAL_ROOT]

This is used to implement part of the double batched accumulator for the past history of the chain. Once SLOTS_PER_HISTORICAL_ROOT block roots and the same number of state roots have been accumulated in the beacon state, they are put in a HistoricalBatch object and the hash tree root of that is appended to the historical_roots list in beacon state. The corresponding block and state root lists in the beacon state are circular buffers, so their entries are then simply overwritten as new roots arrive. See process_final_updates.
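A toy version of the batching step, using plain SHA-256 Merkle trees in place of a real SSZ library. I've shrunk the vectors to 8 entries to keep it quick, and the leaf values are made up; the point is only the shape of the calculation.

```python
import hashlib
from typing import List


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(leaves: List[bytes]) -> bytes:
    """Root of a binary Merkle tree over 32-byte leaves (count must be a power of two)."""
    assert len(leaves) > 0 and len(leaves) & (len(leaves) - 1) == 0
    layer = leaves
    while len(layer) > 1:
        layer = [sha256(layer[i] + layer[i + 1]) for i in range(0, len(layer), 2)]
    return layer[0]


def historical_batch_root(block_roots: List[bytes], state_roots: List[bytes]) -> bytes:
    """Root of the two-field container: hash of the two vectors' roots."""
    return sha256(merkle_root(block_roots) + merkle_root(state_roots))


DEMO_SLOTS_PER_HISTORICAL_ROOT = 8  # the real value is 8,192
block_roots = [sha256(b"block %d" % slot) for slot in range(DEMO_SLOTS_PER_HISTORICAL_ROOT)]
state_roots = [sha256(b"state %d" % slot) for slot in range(DEMO_SLOTS_PER_HISTORICAL_ROOT)]

historical_roots = []  # stands in for state.historical_roots
historical_roots.append(historical_batch_root(block_roots, state_roots))
print(historical_roots[0].hex())
```

Half a megabyte of roots is thereby reduced to a single 32-byte entry, against which Merkle proofs for any individual slot can later be checked.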

Fixed size: 2 * 32 * 8192 = 524,288 bytes

DepositMessage

class DepositMessage(Container):
    pubkey: BLSPubkey
    withdrawal_credentials: Bytes32
    amount: Gwei

The basic information necessary to either add a validator to the registry, or to top up an existing validator's stake.

pubkey is the unique public key of the validator. If it is already present in the registry (the list of validators in beacon state) then amount is added to its balance. Otherwise a new Validator entry is appended to the list and credited with amount.
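The add-or-top-up logic, stripped of everything else, is sketched below. The Registry class is a toy stand-in for the validator list and balances in beacon state, and the sketch skips the signature and Merkle proof checks that real deposit processing performs.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Registry:
    """Toy stand-in for the validators and balances lists in beacon state."""
    pubkeys: List[bytes] = field(default_factory=list)
    balances: List[int] = field(default_factory=list)  # Gwei


def apply_deposit(registry: Registry, pubkey: bytes, amount: int) -> int:
    """Top up an existing validator or append a new one; returns its index."""
    if pubkey in registry.pubkeys:
        index = registry.pubkeys.index(pubkey)
        registry.balances[index] += amount
    else:
        registry.pubkeys.append(pubkey)
        registry.balances.append(amount)
        index = len(registry.pubkeys) - 1
    return index


registry = Registry()
first = apply_deposit(registry, b"key-A", 32 * 10**9)
second = apply_deposit(registry, b"key-B", 32 * 10**9)
top_up = apply_deposit(registry, b"key-A", 1 * 10**9)
print(first, second, top_up, registry.balances)
```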

See the Validator class for info on withdrawal_credentials.

There are two protections that DepositMessages get at different points:

  1. They are stored, pending processing, in beacon state as DepositData. This includes the validator's BLS signature so that the authenticity of the DepositMessage can be verified before a validator is added.
  2. DepositData is included in beacon blocks as a Deposit, which adds a Merkle proof that the data has been registered with the Eth1 deposit contract.

Fixed size: 48 + 32 + 8 = 88 bytes

DepositData

class DepositData(Container):
    pubkey: BLSPubkey
    withdrawal_credentials: Bytes32
    amount: Gwei
    signature: BLSSignature  # Signing over DepositMessage

A signed DepositMessage. The comment says that the signing is done over DepositMessage. What actually happens is that a DepositMessage is constructed from the first three fields; the root of that is combined with DOMAIN_DEPOSIT in a SigningData object; finally the root of this is signed and included in DepositData.

Fixed size: 48 + 32 + 8 + 96 = 184 bytes

BeaconBlockHeader

class BeaconBlockHeader(Container):
    slot: Slot
    proposer_index: ValidatorIndex
    parent_root: Root
    state_root: Root
    body_root: Root

A standalone version of a beacon block header: BeaconBlocks contain their own header. It is identical to BeaconBlock, except that body is replaced by body_root. It is BeaconBlock-lite.

BeaconBlockHeader is stored in beacon state to record the last processed block header. This is used to ensure that we always proceed along a continuous chain of blocks that always point to their predecessor (it's a blockchain, yo!). See process_block_header().

The signed version is used in proposer slashings.

Fixed size: 2 * 8 + 3 * 32 = 112 bytes

SigningData

class SigningData(Container):
    object_root: Root
    domain: Domain

This is just a convenience class used only in compute_signing_root to calculate the hash tree root of an object along with a domain. That root is the message data that gets signed with a BLS signature. The SigningData object itself is never stored or transmitted.
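Because SigningData has exactly two fields, its hash tree root reduces to a single SHA256 over the two field roots concatenated. The sketch below assumes that Domain is a 32-byte type; compute_signing_root_sketch is my own name and the inputs are dummy values.

```python
from hashlib import sha256


def compute_signing_root_sketch(object_root: bytes, domain: bytes) -> bytes:
    """Root of a two-field container whose fields are each one 32-byte chunk."""
    assert len(object_root) == 32 and len(domain) == 32
    return sha256(object_root + domain).digest()


# Dummy inputs: the same object signed under two different domains
object_root = sha256(b"some object").digest()
domain_a = b"\x00" * 32
domain_b = b"\x03" + b"\x00" * 31
root_a = compute_signing_root_sketch(object_root, domain_a)
root_b = compute_signing_root_sketch(object_root, domain_b)
```

The two roots differ, which is the point of mixing in the domain: a signature made for one purpose cannot be replayed for another.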

Beacon operations

ProposerSlashing

class ProposerSlashing(Container):
    signed_header_1: SignedBeaconBlockHeader
    signed_header_2: SignedBeaconBlockHeader

ProposerSlashings may be included in blocks to demonstrate that a validator has broken the rules and ought to be slashed. Proposers receive a reward for correctly submitting these.

In this case, the rule is that a validator may not propose two different blocks at the same height, and the payload is the signed headers of the two blocks that evidence the crime. The signatures on the SignedBeaconBlockHeaders are checked to verify that they were both signed by the accused validator.

Fixed size: 2 * 208 = 416 bytes
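The rule can be sketched as a comparison of the two headers. This is only the structural part of the check: signature verification and the test that the proposer is still slashable are left out, and Header is a hypothetical stand-in for BeaconBlockHeader.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Header:
    # Hypothetical stand-in for BeaconBlockHeader
    slot: int
    proposer_index: int
    parent_root: bytes
    state_root: bytes
    body_root: bytes


def is_proposer_slashing_evidence(header_1: Header, header_2: Header) -> bool:
    """Same slot, same proposer, but different header contents."""
    return (
        header_1.slot == header_2.slot
        and header_1.proposer_index == header_2.proposer_index
        and header_1 != header_2
    )
```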

AttesterSlashing

class AttesterSlashing(Container):
    attestation_1: IndexedAttestation
    attestation_2: IndexedAttestation

AttesterSlashings may be included in blocks to demonstrate that a group of validators has broken the rules and ought to be slashed. Proposers receive a reward for correctly submitting these.

The contents of the IndexedAttestations are checked against the attester slashing conditions in is_slashable_attestation_data(). If there is a violation, then any validator that attested to both attestation_1 and attestation_2 is slashed, see process_attester_slashing.

AttesterSlashings are potentially very large since they could in principle list the indices of all the validators in a committee. On the other hand, many validators can be slashed as a result of a single report.

Max size: 2 * 16,608 = 33,216 bytes
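Working out who gets slashed amounts to intersecting the two index lists. A minimal sketch, with slashable_indices as a hypothetical helper; the real processing also checks that each validator is still slashable before penalising it.

```python
def slashable_indices(attesting_indices_1, attesting_indices_2):
    """Validators that signed both attestations, in ascending index order."""
    return sorted(set(attesting_indices_1).intersection(attesting_indices_2))
```

For example, slashable_indices([1, 3, 5, 7], [3, 4, 5]) picks out validators 3 and 5 only.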

Attestation

class Attestation(Container):
    aggregation_bits: Bitlist[MAX_VALIDATORS_PER_COMMITTEE]
    data: AttestationData
    signature: BLSSignature

This is the form in which attestations make their way around the network. It is designed to be easily aggregatable: Attestations containing identical AttestationData can be combined into a single attestation by aggregating the signatures.

Attestations contain the same information as IndexedAttestations, but use knowledge of the validator committees at slots to compress the list of attesting validators down to a bitlist. Thus, Attestations are about 1/35th of the size of IndexedAttestations.

Max size: 2048 / 8 + 128 + 96 = 480 bytes
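The compression works because everyone can compute the committee for a given slot, so one bit per committee member is enough. A minimal sketch of the expansion back to global indices, assuming the committee is already known; attesting_indices is a hypothetical helper and the committee below is made up.

```python
def attesting_indices(committee, aggregation_bits):
    """Expand a bitlist over a committee into sorted global validator indices."""
    assert len(committee) == len(aggregation_bits)
    return sorted(index for index, bit in zip(committee, aggregation_bits) if bit)


# Made-up committee of four validators, three of whom attested
committee = [42, 7, 19, 3]
bits = [True, False, True, True]
indices = attesting_indices(committee, bits)
```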

Deposit

class Deposit(Container):
    proof: Vector[Bytes32, DEPOSIT_CONTRACT_TREE_DEPTH + 1]  # Merkle path to deposit root
    data: DepositData

Used to include deposit data from wannabe validators in beacon blocks so that they can be processed into beacon state.

The proof is a Merkle proof constructed by the block proposer that the DepositData corresponds to the previously agreed deposit root of the Eth1 contract's deposit tree. It is verified in process_deposit() by is_valid_merkle_branch().

Fixed size: 32 * (32 + 1) + 184 = 1240 bytes

VoluntaryExit

class VoluntaryExit(Container):
    epoch: Epoch  # Earliest epoch when voluntary exit can be processed
    validator_index: ValidatorIndex

Voluntary exit messages are how a validator signals that it wants to cease being a validator. A block that includes an exit before its epoch fails processing, so validators should buffer any future-dated exits they see before putting them in a block.

VoluntaryExit objects are never used naked; they are always wrapped up into a SignedVoluntaryExit object.

Fixed size: 8 + 8 = 16 bytes

Beacon blocks

BeaconBlockBody

class BeaconBlockBody(Container):
    randao_reveal: BLSSignature
    eth1_data: Eth1Data  # Eth1 data vote
    graffiti: Bytes32  # Arbitrary data
    # Operations
    proposer_slashings: List[ProposerSlashing, MAX_PROPOSER_SLASHINGS]
    attester_slashings: List[AttesterSlashing, MAX_ATTESTER_SLASHINGS]
    attestations: List[Attestation, MAX_ATTESTATIONS]
    deposits: List[Deposit, MAX_DEPOSITS]
    voluntary_exits: List[SignedVoluntaryExit, MAX_VOLUNTARY_EXITS]

From a beacon node's point of view, only two things on this page really matter: the BeaconBlock and the BeaconState. The former is how the latter gets updated. The BeaconBlockBody is the business part of a BeaconBlock.

A beacon block is proposed by a validator when its (randomly selected) turn comes. There ought to be exactly one beacon block per slot if things are running correctly.

Always present:

  • randao_reveal: the block is invalid if this does not verify correctly against the proposer's public key. This is the block proposer's contribution to the beacon chain's randomness. It is generated by the proposer signing the current epoch number (combined with DOMAIN_RANDAO) with its private key. To the best of anyone's knowledge, the result is indistinguishable from random. This gets mixed into the beacon state RANDAO.
  • See Eth1Data for eth1_data. In principle, this is mandatory, but it is not checked, and there is no penalty for making it up.
  • graffiti is left free for the proposer to insert whatever data it wishes. It has no protocol level significance. Can be left as zero.

Optional, with rewards for inclusion:

  • proposer_slashings: up to MAX_PROPOSER_SLASHINGS ProposerSlashings may be included. There is a reward of up to 0.0625 Ether for each validator slashed as a result, all accruing to the block proposer.
  • attester_slashings: up to MAX_ATTESTER_SLASHINGS AttesterSlashings may be included. There is a reward of up to 0.0625 Ether for each validator slashed as a result, all accruing to the block proposer.
  • attestations: up to MAX_ATTESTATIONS (aggregated) Attestations may be included. The block proposer is incentivised to include well-packed aggregate attestations, as it receives a micro reward for each unique good attestation. In a perfect world, with perfectly aggregated attestations, MAX_ATTESTATIONS would be equal to MAX_COMMITTEES_PER_SLOT. In our configuration it is double. This allows for some imperfectly aggregated attestations, and to catch up after skip slots.

Mandatory, no rewards for inclusion:

  • deposits: if the block does not contain either all the outstanding Deposits, or MAX_DEPOSITS of them in deposit order, then it is invalid.

Optional, no rewards for inclusion:

  • voluntary_exits: up to MAX_VOLUNTARY_EXITS SignedVoluntaryExits may be included.

Max size: 96 + 72 + 32 + 416 * 16 + 33,216 * 2 + 480 * 128 + 1240 * 16 + 112 * 16 = 156,360 bytes
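The mandatory deposits rule above boils down to a single count check. A minimal sketch, assuming MAX_DEPOSITS is 16 (the multiplier used in the size sum); expected_deposit_count is a hypothetical helper.

```python
MAX_DEPOSITS = 16  # assumed value


def expected_deposit_count(eth1_deposit_count: int, eth1_deposit_index: int) -> int:
    """Number of deposits a valid block must contain, given the agreed Eth1 data."""
    outstanding = eth1_deposit_count - eth1_deposit_index
    return min(MAX_DEPOSITS, outstanding)
```

A block whose deposits list has any other length would fail processing.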

BeaconBlock

class BeaconBlock(Container):
    slot: Slot
    proposer_index: ValidatorIndex
    parent_root: Root
    state_root: Root
    body: BeaconBlockBody

BeaconBlock just adds some blockchain paraphernalia to BeaconBlockBody.

slot is the slot the block is proposed for. proposer_index was added to avoid a potential DoS vector and to allow clients without full access to the state to still know useful things. parent_root is used to make sure that this block is a direct child of the last block we processed. In order to calculate state_root, the proposer is expected to run the state transition on the block before propagating it. After the beacon node has processed the block, the state roots are compared to ensure they match. This seems to be the mechanism for tying the whole system together and making sure that all validators and beacon nodes are always working off the same version of state (absent any short-term forks).

If any of these are incorrect, then the block is invalid with respect to the current beacon state and will be ignored.

Max size = 8 + 8 + 32 + 32 + 156,360 = 156,440 bytes

Beacon state

BeaconState

class BeaconState(Container):
    # Versioning
    genesis_time: uint64
    genesis_validators_root: Root
    slot: Slot
    fork: Fork
    # History
    latest_block_header: BeaconBlockHeader
    block_roots: Vector[Root, SLOTS_PER_HISTORICAL_ROOT]
    state_roots: Vector[Root, SLOTS_PER_HISTORICAL_ROOT]
    historical_roots: List[Root, HISTORICAL_ROOTS_LIMIT]
    # Eth1
    eth1_data: Eth1Data
    eth1_data_votes: List[Eth1Data, EPOCHS_PER_ETH1_VOTING_PERIOD * SLOTS_PER_EPOCH]
    eth1_deposit_index: uint64
    # Registry
    validators: List[Validator, VALIDATOR_REGISTRY_LIMIT]
    balances: List[Gwei, VALIDATOR_REGISTRY_LIMIT]
    # Randomness
    randao_mixes: Vector[Bytes32, EPOCHS_PER_HISTORICAL_VECTOR]
    # Slashings
    slashings: Vector[Gwei, EPOCHS_PER_SLASHINGS_VECTOR]  # Per-epoch sums of slashed effective balances
    # Attestations
    previous_epoch_attestations: List[PendingAttestation, MAX_ATTESTATIONS * SLOTS_PER_EPOCH]
    current_epoch_attestations: List[PendingAttestation, MAX_ATTESTATIONS * SLOTS_PER_EPOCH]
    # Finality
    justification_bits: Bitvector[JUSTIFICATION_BITS_LENGTH]  # Bit set for every recent justified epoch
    previous_justified_checkpoint: Checkpoint  # Previous epoch snapshot
    current_justified_checkpoint: Checkpoint
    finalized_checkpoint: Checkpoint

All roads lead to the BeaconState. Maintaining this is the sole purpose of all the apparatus in all of these documents. This state is the focus of consensus among the beacon nodes: it is what everybody, eventually, must agree on.

Eth2's beacon state is monolithic: everything is bundled into the one state object (sometimes referred to as the "God object"). Some have argued for more granular approaches that might be more efficient, but at least the current approach is simple.

Let's break this thing down...

# Versioning
genesis_time: uint64
genesis_validators_root: Root
slot: Slot
fork: Fork

How do we know which chain we're on, and where we are on it? This information ought to be sufficient. A path back to the genesis block would also do.

genesis_validators_root is calculated at Genesis time (when the chain starts) and is fixed for the life of the chain. This, combined with the fork identifier, should serve to uniquely identify the chain that we are on.

The fork choice rule uses genesis_time to work out what slot we're in.

The fork element is updated at hard forks (not related to the fork choice rule) to invalidate blocks and attestations from validators not following the new fork.

# History
latest_block_header: BeaconBlockHeader
block_roots: Vector[Root, SLOTS_PER_HISTORICAL_ROOT]
state_roots: Vector[Root, SLOTS_PER_HISTORICAL_ROOT]
historical_roots: List[Root, HISTORICAL_ROOTS_LIMIT]

latest_block_header is only used to make sure that the next block we process is a direct descendent. It's a blockchain thing.

Past block_roots and state_roots are stored in lists here until the lists are full. Once they are full, the Merkle root is taken of both the lists together and appended to historical_roots. historical_roots effectively grows without bound (HISTORICAL_ROOTS_LIMIT is large), but only at a rate of 10KB per year. Keeping this data is useful for light clients, and also allows Merkle proofs to be created against past states, for example historical deposit data.
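The 10KB per year figure can be checked with a little arithmetic. This sketch assumes 12-second slots, which is not stated in this section; the other numbers come from the sizes quoted above.

```python
SECONDS_PER_SLOT = 12  # assumed slot time
SLOTS_PER_HISTORICAL_ROOT = 8192
ROOT_SIZE_BYTES = 32
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60

slots_per_year = SECONDS_PER_YEAR / SECONDS_PER_SLOT
# One 32-byte root is appended each time the block and state root lists fill up
batches_per_year = slots_per_year / SLOTS_PER_HISTORICAL_ROOT
bytes_per_year = batches_per_year * ROOT_SIZE_BYTES
```

That comes out at a little over 320 roots per year, or roughly 10KB.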

# Eth1
eth1_data: Eth1Data
eth1_data_votes: List[Eth1Data, EPOCHS_PER_ETH1_VOTING_PERIOD * SLOTS_PER_EPOCH]
eth1_deposit_index: uint64

eth1_data is the latest agreed upon state of the Eth1 chain and deposit contract. eth1_data_votes accumulates Eth1Data from blocks until there is an overall majority in favour of one Eth1 state. If a majority is not achieved by the time the list is full then it is cleared down and starts again. eth1_deposit_index is the total number of deposits that have been processed by the beacon chain (which is greater than or equal to the number of validators, as a deposit can top-up the balance of an existing validator).

# Registry
validators: List[Validator, VALIDATOR_REGISTRY_LIMIT]
balances: List[Gwei, VALIDATOR_REGISTRY_LIMIT]

The registry of Validators and their balances. The balances list is separated out as it changes much more frequently than the validators list. Roughly speaking, balances of active validators are updated every epoch, while the validators list has only minor updates per epoch. When combined with SSZ tree hashing, this results in a big saving in the amount of data to be rehashed on registry updates.

# Randomness
randao_mixes: Vector[Bytes32, EPOCHS_PER_HISTORICAL_VECTOR]

Past randao mixes are stored in a fixed-size circular list for EPOCHS_PER_HISTORICAL_VECTOR epochs (~290 days). These can be used to recalculate past committees, which allows slashing of historical attestations. See EPOCHS_PER_HISTORICAL_VECTOR for more information.
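A "circular list" here just means indexing by epoch modulo the vector length. The sketch below assumes EPOCHS_PER_HISTORICAL_VECTOR is 65536, with 32 slots per epoch and 12-second slots; none of these values is given in this section. randao_mix_position is a hypothetical helper.

```python
EPOCHS_PER_HISTORICAL_VECTOR = 65536  # assumed value
SLOTS_PER_EPOCH = 32  # assumed value
SECONDS_PER_SLOT = 12  # assumed value


def randao_mix_position(epoch: int) -> int:
    """Position in the circular vector that holds the mix for ``epoch``."""
    return epoch % EPOCHS_PER_HISTORICAL_VECTOR


# How long before an entry is overwritten by a later epoch
retention_days = EPOCHS_PER_HISTORICAL_VECTOR * SLOTS_PER_EPOCH * SECONDS_PER_SLOT / 86400
```

Under those assumptions the retention works out at about 291 days, in line with the figure above.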

# Slashings
slashings: Vector[Gwei, EPOCHS_PER_SLASHINGS_VECTOR]

A fixed-size circular list of past slashed amounts. Each epoch, the total effective balance of all validators slashed in that epoch is stored as an entry in this list. When the final slashing penalty for a slashed validator is calculated, it is weighted with the sum of this list. This is intended to more heavily penalise mass slashings during a window of time, which is more likely to be a coordinated attack.
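To get a feel for the weighting, here is a minimal sketch of the proportional penalty. The multiplier of 3, the 1 Ether increment, and the exact order of the integer divisions are my assumptions about process_slashings, so check them against the spec before relying on them; correlation_penalty is a hypothetical helper.

```python
ETH = 10**9  # Gwei per Ether
EFFECTIVE_BALANCE_INCREMENT = 1 * ETH  # assumed value
SLASHING_MULTIPLIER = 3  # assumed multiplier


def correlation_penalty(effective_balance: int, slashings_sum: int, total_balance: int) -> int:
    """Penalty in Gwei, scaled by the share of total stake slashed in the window."""
    adjusted = min(slashings_sum * SLASHING_MULTIPLIER, total_balance)
    numerator = effective_balance // EFFECTIVE_BALANCE_INCREMENT * adjusted
    return numerator // total_balance * EFFECTIVE_BALANCE_INCREMENT
```

With 3200 Ether staked in total, a lone slashed validator with 32 Ether gets no extra penalty from this term, while the same validator loses its full 32 Ether if half the stake is slashed within the same window.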

# Attestations
previous_epoch_attestations: List[PendingAttestation, MAX_ATTESTATIONS * SLOTS_PER_EPOCH]
current_epoch_attestations: List[PendingAttestation, MAX_ATTESTATIONS * SLOTS_PER_EPOCH]

These are pending attestations accumulated from blocks, but not yet processed by the beacon chain at the end of an epoch. current_epoch_attestations have a target that is the epoch we are currently in. These are just stored. All rewards and finality calculations are based on previous_epoch_attestations, which are last epoch's current_epoch_attestations plus any new ones received that target the previous epoch.

# Finality
justification_bits: Bitvector[JUSTIFICATION_BITS_LENGTH]
previous_justified_checkpoint: Checkpoint
current_justified_checkpoint: Checkpoint
finalized_checkpoint: Checkpoint

Eth2 uses the Casper FFG finality mechanism, with a k-finality optimisation, where k = 2. These are the data that need to be tracked in order to apply the finality rules.

  • justification_bits is only four bits long. It tracks the justification status of the last four epochs: 1 if justified, 0 if not. This is used when calculating whether we can finalise an epoch.
  • Outside of the finality calculations, previous_justified_checkpoint and current_justified_checkpoint are only used to filter attestations being added into attestations lists discussed above: attestations need to have the matching source parameter.
  • finalized_checkpoint: the network has agreed that the beacon chain state at or before that epoch will never be reverted. So, for one thing, the fork choice rule doesn't need to go back any further than this. The Casper FFG mechanism is specifically constructed so that two conflicting finalized checkpoints cannot be created without at least one third of validators being slashed.
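The four justification bits behave like a small shift register. A minimal sketch of the ageing step at the end of each epoch, using a plain Python list in place of the Bitvector; it assumes that position 0 tracks the most recent epoch, and shift_justification_bits is a hypothetical helper.

```python
JUSTIFICATION_BITS_LENGTH = 4


def shift_justification_bits(bits):
    """Age every bit by one epoch; the newest position starts out unjustified."""
    assert len(bits) == JUSTIFICATION_BITS_LENGTH
    return [False] + list(bits[: JUSTIFICATION_BITS_LENGTH - 1])
```

After the shift, the newest bits are set again if the previous and current epochs turn out to be justified, and the oldest bit simply falls off the end.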

TODO: calculate sizes of fixed, bounded, and unbounded parts

Fun fact: there was a period during which beacon state was split into "crystallized state" and "active state". The active state was constantly changing; the crystallized state changed only once per epoch (or what passed for epochs back then). Separating out the fast-changing state from the slower-changing state was an attempt to avoid having to constantly rehash the whole state every slot. With the introduction of SSZ tree hashing, this was no longer necessary, as the roots of the slower changing parts could simply be cached, which was a nice simplification. There remains an echo of this approach, however, in the splitting out of validator balances into a different structure.

Signed envelopes

The following are just wrappers for a more basic type with an added signature.

SignedVoluntaryExit

class SignedVoluntaryExit(Container):
    message: VoluntaryExit
    signature: BLSSignature

Voluntary exits are currently signed using the validator's online signing key. There is some discussion about changing this to also allow signing of voluntary exits with the validator's offline withdrawal key.

Fixed size: 16 + 96 = 112 bytes

SignedBeaconBlock

class SignedBeaconBlock(Container):
    message: BeaconBlock
    signature: BLSSignature

BeaconBlocks are signed by the block proposer and unwrapped for block processing.

Max size: 156,440 + 96 = 156,536 bytes

SignedBeaconBlockHeader

class SignedBeaconBlockHeader(Container):
    message: BeaconBlockHeader
    signature: BLSSignature

This is used only when reporting proposer slashing, via the ProposerSlashing container.

Through the magic of SSZ hash tree roots, a valid signature for a SignedBeaconBlock is also a valid signature for a SignedBeaconBlockHeader. Proposer slashing makes use of this to save space in slashing reports.

Fixed size: 112 + 96 = 208 bytes
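The "magic" is that a container's hash tree root is the Merkle root of its fields' roots, so a block and its header hash the same five leaves. Here is a minimal sketch under simplifying assumptions: dummy field values, a hand-rolled merkleisation for containers whose fields each fit in one chunk, and container_root and uint64_chunk as hypothetical helpers rather than SSZ library calls.

```python
from hashlib import sha256

ZERO_CHUNK = b"\x00" * 32


def uint64_chunk(n: int) -> bytes:
    """A uint64 as a 32-byte chunk: little-endian, right-padded with zeros."""
    return n.to_bytes(8, "little") + b"\x00" * 24


def container_root(field_roots) -> bytes:
    """Merkle root over the field roots, zero-padded to a power of two."""
    layer = list(field_roots)
    while len(layer) & (len(layer) - 1):
        layer.append(ZERO_CHUNK)
    while len(layer) > 1:
        layer = [sha256(layer[i] + layer[i + 1]).digest() for i in range(0, len(layer), 2)]
    return layer[0]


# Dummy values; body_root stands in for hash_tree_root(body)
slot, proposer_index = 1234, 56
parent_root = sha256(b"parent").digest()
state_root = sha256(b"state").digest()
body_root = sha256(b"body").digest()

# BeaconBlock: the fifth leaf is the root of the body field
block_root = container_root(
    [uint64_chunk(slot), uint64_chunk(proposer_index), parent_root, state_root, body_root]
)
# BeaconBlockHeader: the fifth leaf is the body_root field itself
header_root = container_root(
    [uint64_chunk(slot), uint64_chunk(proposer_index), parent_root, state_root, body_root]
)
```

The two leaf lists are identical, so the roots are too, and a signature over one signing root verifies against the other.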

Helper functions

Note: The definitions below are for specification purposes and are not necessarily optimal implementations.

This note is super important for implementers! There are many, many optimisations of the below routines that are being used in practice: a naive implementation is impractically slow for mainnet configurations. As long as the optimised code produces identical results to the code here, then all is fine.

Math

integer_squareroot

def integer_squareroot(n: uint64) -> uint64:
    """
    Return the largest integer ``x`` such that ``x**2 <= n``.
    """
    x = n
    y = (x + 1) // 2
    while y < x:
        x = y
        y = (x + n // x) // 2
    return x

Validator rewards scale with the reciprocal of the square root of the total active balance of all validators. This is calculated in get_base_reward(), and is the only place this function is used. Newton's method is used which has pretty good convergence properties, but implementations may use any method that gives identical results.
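As an example of "any method that gives identical results", the spec routine can be checked against Python's built-in math.isqrt (available from Python 3.8). This sketch uses plain Python ints rather than uint64, so it says nothing about overflow behaviour in fixed-width implementations.

```python
import math


def integer_squareroot(n: int) -> int:
    """Newton's method, transcribed from the spec but using plain ints."""
    x = n
    y = (x + 1) // 2
    while y < x:
        x = y
        y = (x + n // x) // 2
    return x


# Compare the two methods over a range of small inputs
agree = all(integer_squareroot(n) == math.isqrt(n) for n in range(10_000))
```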

xor

def xor(bytes_1: Bytes32, bytes_2: Bytes32) -> Bytes32:
    """
    Return the exclusive-or of two 32-byte strings.
    """
    return Bytes32(a ^ b for a, b in zip(bytes_1, bytes_2))

The bitwise xor of two 32-byte quantities is defined here in terms of Python's behaviour.

This is only used in process_randao when mixing in the new randao reveal.

Fun fact: if you xor two byte types in Java, the result is a 32 bit (signed) integer :man_facepalming: This is one reason we need to define the "obvious" here. But mainly, because the spec is executable, we need to tell Python what it doesn't already know.
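Here is a minimal sketch of the mixing step, using plain bytes in place of Bytes32. The reveal is a dummy 96-byte value rather than a real BLS signature, and hashing the reveal before mixing is my reading of process_randao, so treat that as an assumption.

```python
from hashlib import sha256


def xor(bytes_1: bytes, bytes_2: bytes) -> bytes:
    """Bytewise exclusive-or of two equal-length byte strings."""
    return bytes(a ^ b for a, b in zip(bytes_1, bytes_2))


current_mix = bytes(32)  # dummy starting mix of all zeros
reveal = b"\x01" * 96  # dummy stand-in for a randao_reveal
new_mix = xor(current_mix, sha256(reveal).digest())
```

One handy property is that xor is its own inverse: applying the same value twice returns the original mix.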

uint_to_bytes

def uint_to_bytes(n: uint) -> bytes is a function for serializing the uint type object to bytes in ENDIANNESS-endian. The expected length of the output is the byte-length of the uint type.

[TODO update for v0.12.2] For the most part, integers are integers and bytes are bytes, and they don't mix much. But there are a few places where we need to convert from integers to bytes in Phase 0:

The result of this conversion is dependent on our arbitrary choice of endianness: that is, how we choose to represent integers as strings of bytes. For Eth2, we have chosen little-endian: see the discussion of ENDIANNESS for more background.

bytes_to_uint64

def bytes_to_uint64(data: bytes) -> uint64:
    """
    Return the integer deserialization of ``data`` interpreted as ``ENDIANNESS``-endian.
    """
    return uint64(int.from_bytes(data, ENDIANNESS))

[TODO update for v0.12.2] bytes_to_uint64() is the inverse of uint_to_bytes(), and is used by the shuffling algorithm.

It's also used in the validator specification when selecting validators to aggregate attestations.
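A minimal sketch of the two conversions with plain Python ints. Since a plain int carries no width, the byte_length parameter is my addition; in the spec the length comes from the uint type itself.

```python
ENDIANNESS = "little"


def uint_to_bytes(n: int, byte_length: int) -> bytes:
    """Serialise ``n`` as ``byte_length`` bytes, least significant byte first."""
    return n.to_bytes(byte_length, ENDIANNESS)


def bytes_to_uint64(data: bytes) -> int:
    """Inverse of uint_to_bytes for inputs of up to eight bytes."""
    return int.from_bytes(data, ENDIANNESS)
```

With little-endian encoding, 258 (0x0102) as a four-byte value comes out with the 0x02 byte first, followed by 0x01 and two zero bytes.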

Crypto

hash

def hash(data: bytes) -> Bytes32 is SHA256.

SHA256 was chosen as the protocol's base hash algorithm for easier cross-chain interoperability: many other chains use SHA256, and Eth1 has a SHA256 precompile.

There was lots of discussion about this at the time. The original plan had been to use the BLAKE2b-512 hash function (a modern hash function that's faster than SHA3) and move to a STARK/SNARK friendly hash function at some point (such as MiMC). However, to keep interoperability with Eth1, in particular for the implementation of the deposit contract, the hash function was changed to Keccak256. Finally, we settled on SHA256.

hash_tree_root

def hash_tree_root(object: SSZSerializable) -> Root is a function for hashing objects into a single root by utilizing a hash tree structure, as defined in the SSZ spec.

The development of the hash tree procedure has been transformative for the Eth2 specification, and it's now used everywhere.

The naive way to create a digest of a data structure is to linearise it and then just run a hash function over the result. In tree hashing, the basic idea is to treat each element of an ordered, compound data structure as the leaf of a Merkle tree, recursively if necessary until a primitive type is reached, and to return the Merkle root of the resulting tree.

At first sight, this looks quite inefficient: twice as much data needs to be hashed when tree hashing, and actual speeds are 4-6 times slower compared with the linear hash. However, it's good for supporting light clients, because it allows Merkle proofs to be constructed easily for subsets of the full state.

The breakthrough insight was realising that much of the re-hashing work can be cached: if part of the state data structure has not changed, that part does not need to be re-hashed: the whole subtree can be replaced with its cached hash. This turns out to be a huge efficiency boost, allowing the previous design, with cumbersome separate crystallised and active state, to be simplified into a single state object.
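To illustrate the caching idea, here is a toy Merkle tree that keeps all of its intermediate layers. When one leaf changes, only the nodes on the path from that leaf to the root are re-hashed. This is a sketch over a flat list of 32-byte leaves, not the full SSZ procedure, and build_layers and update_leaf are hypothetical helpers.

```python
from hashlib import sha256


def build_layers(leaves):
    """All layers of the tree, from the leaves (layer 0) up to the root."""
    layers = [list(leaves)]
    while len(layers[-1]) > 1:
        prev = layers[-1]
        layers.append([sha256(prev[i] + prev[i + 1]).digest() for i in range(0, len(prev), 2)])
    return layers


def update_leaf(layers, index, new_leaf):
    """Replace one leaf, re-hash only its path to the root, return the hash count."""
    layers[0][index] = new_leaf
    hashes = 0
    for depth in range(1, len(layers)):
        index //= 2
        left = layers[depth - 1][2 * index]
        right = layers[depth - 1][2 * index + 1]
        layers[depth][index] = sha256(left + right).digest()
        hashes += 1
    return hashes


leaves = [i.to_bytes(32, "little") for i in range(1024)]
layers = build_layers(leaves)
hashes_needed = update_leaf(layers, 5, b"\xff" * 32)
```

With 1024 leaves, the update costs 10 hashes instead of the 1023 needed to rebuild the tree from scratch.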

[TODO find some explainer, or insert a diagram]

BLS Signatures

Eth2 makes use of BLS signatures as specified in the IETF draft BLS specification draft-irtf-cfrg-bls-signature-02 but uses Hashing to Elliptic Curves - draft-irtf-cfrg-hash-to-curve-07 instead of draft-irtf-cfrg-hash-to-curve-06. Specifically, eth2 uses the BLS_SIG_BLS12381G2_XMD:SHA-256_SSWU_RO_POP_ ciphersuite which implements the following interfaces:

  • def Sign(SK: int, message: Bytes) -> BLSSignature
  • def Verify(PK: BLSPubkey, message: Bytes, signature: BLSSignature) -> bool
  • def Aggregate(signatures: Sequence[BLSSignature]) -> BLSSignature
  • def FastAggregateVerify(PKs: Sequence[BLSPubkey], message: Bytes, signature: BLSSignature) -> bool
  • def AggregateVerify(PKs: Sequence[BLSPubkey], messages: Sequence[Bytes], signature: BLSSignature) -> bool

Within these specifications, BLS signatures are treated as a module for notational clarity, thus to verify a signature bls.Verify(...) is used.

BLS is the digital signature scheme used by Eth2. It has some very nice properties, in particular the ability to aggregate signatures. This means that many validators can sign the same message (for example, that they support block X), and these signatures can all be efficiently aggregated into a single signature for verification. The ability to do this efficiently makes Eth2 practical as a protocol.

Several other protocols have adopted or will adopt BLS, such as Zcash, Chia, Dfinity and Algorand. We are using the BLS signature scheme based on the BLS12-381 elliptic curve. By implementing the new standard for BLS signatures, we hope that interoperability between chains will be easier in future.

Note: The non-standard configuration of the BLS and hash to curve specs is temporary and will be resolved once IETF releases BLS spec draft 3.

The hash-to-curve draft standard is actually up to version 08 now. However, at least for our purposes, there are no substantive changes from version 07.

Predicates

is_active_validator

def is_active_validator(validator: Validator, epoch: Epoch) -> bool:
    """
    Check if ``validator`` is active.
    """
    return validator.activation_epoch <= epoch < validator.exit_epoch

Validators don't explicitly track their state (eligible for activation, active, exited, withdrawable - the exception being whether they have been slashed or not). Instead, a validator's state is calculated by checking fields in the Validator record that store the epoch numbers of state transitions.

In this case, if the validator was activated in the past and has not yet exited, then it is active.

This is used a few times in the spec, most notably in get_active_validator_indices which returns a list of all active validators at an epoch.
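A minimal sketch of how the epoch fields drive the active set. ValidatorSketch keeps only the two fields needed here, the value of FAR_FUTURE_EPOCH is assumed to be the largest uint64, and active_indices is a simplified stand-in for get_active_validator_indices.

```python
from dataclasses import dataclass

FAR_FUTURE_EPOCH = 2**64 - 1  # assumed value


@dataclass
class ValidatorSketch:
    # Hypothetical cut-down Validator record
    activation_epoch: int = FAR_FUTURE_EPOCH
    exit_epoch: int = FAR_FUTURE_EPOCH


def is_active(validator: ValidatorSketch, epoch: int) -> bool:
    return validator.activation_epoch <= epoch < validator.exit_epoch


def active_indices(validators, epoch: int):
    """Indices of all validators that are active at ``epoch``."""
    return [i for i, v in enumerate(validators) if is_active(v, epoch)]


registry = [
    ValidatorSketch(activation_epoch=0),  # active since genesis
    ValidatorSketch(activation_epoch=5, exit_epoch=10),  # active for epochs 5 to 9
    ValidatorSketch(),  # deposited but not yet activated
]
```

Note that the exit epoch is exclusive: the second validator is active in epoch 9 but not in epoch 10.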

is_eligible_for_activation_queue

def is_eligible_for_activation_queue(validator: Validator) -> bool:
    """
    Check if ``validator`` is eligible to be placed into the activation queue.
    """
    return (
        validator.activation_eligibility_epoch == FAR_FUTURE_EPOCH
        and validator.effective_balance == MAX_EFFECTIVE_BALANCE
    )

When a new deposit has been processed with a previously unseen public key, a new Validator record is created with all the state-transition fields set to the default value of FAR_FUTURE_EPOCH.

During epoch processing, eligible validators are marked as eligible for activation by setting the validator.activation_eligibility_epoch.

It is possible to deposit any amount over MIN_DEPOSIT_AMOUNT (currently 1 Ether) into the deposit contract. However, validators do not become eligible for activation until their effective balance is equal to MAX_EFFECTIVE_BALANCE, which corresponds to an actual balance of 32 Ether or more.

is_eligible_for_activation

def is_eligible_for_activation(state: BeaconState, validator: Validator) -> bool:
    """
    Check if ``validator`` is eligible for activation.
    """
    return (
        # Placement in queue is finalized
        validator.activation_eligibility_epoch <= state.finalized_checkpoint.epoch
        # Has not yet been activated
        and validator.activation_epoch == FAR_FUTURE_EPOCH
    )

Once a validator is_eligible_for_activation_queue(), its activation_eligibility_epoch is set to the next epoch, but its activation_epoch is not yet set.

To avoid any ambiguity or confusion on the validator side about its state, we wait until its activation eligibility epoch has been finalised before adding it to the activation queue by setting its activation_epoch. Otherwise, it might at one point become active, and then the beacon chain could flip to a fork in which it is not active.

is_slashable_validator

def is_slashable_validator(validator: Validator, epoch: Epoch) -> bool:
    """
    Check if ``validator`` is slashable.
    """
    return (not validator.slashed) and (validator.activation_epoch <= epoch < validator.withdrawable_epoch)

Used by process_proposer_slashing and process_attester_slashing.

Validators can be slashed only once: the flag Validator.slashed is set on the first occasion.

An unslashed validator remains eligible to be slashed from when it becomes active right up until it becomes withdrawable. This is some time (MIN_VALIDATOR_WITHDRAWABILITY_DELAY) after it has exited from being a validator and ceased validation duties.

is_slashable_attestation_data

def is_slashable_attestation_data(data_1: AttestationData, data_2: AttestationData) -> bool:
    """
    Check if ``data_1`` and ``data_2`` are slashable according to Casper FFG rules.
    """
    return (
        # Double vote
        (data_1 != data_2 and data_1.target.epoch == data_2.target.epoch) or
        # Surround vote
        (data_1.source.epoch < data_2.source.epoch and data_2.target.epoch < data_1.target.epoch)
    )

Used by process_attester_slashing to check that the two sets of alleged conflicting attestation data in an AttesterSlashing do in fact qualify as slashable.

There are two ways for validators to get slashed under Casper FFG:

  1. A double vote: voting more than once for the same target epoch, or
  2. A surround vote: the source-target interval of one attestation entirely contains the source-target interval of a second attestation from the same validator(s). The reporting block proposer needs to take care to order the IndexedAttestations within the AttesterSlashing object so that the first surrounds the second. (The opposite ordering also describes a slashable offence, but is not checked for here.)
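The two conditions, and the ordering caveat, can be tried out with a cut-down data type. Vote is a hypothetical stand-in for AttestationData that keeps only the epochs and a block root.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vote:
    # Hypothetical stand-in for AttestationData
    source_epoch: int
    target_epoch: int
    beacon_block_root: bytes = b""


def is_slashable(vote_1: Vote, vote_2: Vote) -> bool:
    double_vote = vote_1 != vote_2 and vote_1.target_epoch == vote_2.target_epoch
    surround_vote = (
        vote_1.source_epoch < vote_2.source_epoch
        and vote_2.target_epoch < vote_1.target_epoch
    )
    return double_vote or surround_vote


outer = Vote(source_epoch=1, target_epoch=6)
inner = Vote(source_epoch=2, target_epoch=5)
```

Here is_slashable(outer, inner) holds but is_slashable(inner, outer) does not, which is why the reporter must put the surrounding attestation first.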

is_valid_indexed_attestation

def is_valid_indexed_attestation(state: BeaconState, indexed_attestation: IndexedAttestation) -> bool:
    """
    Check if ``indexed_attestation`` is not empty, has sorted and unique indices and has a valid aggregate signature.
    """
    # Verify indices are sorted and unique
    indices = indexed_attestation.attesting_indices
    if len(indices) == 0 or not indices == sorted(set(indices)):
        return False
    # Verify aggregate signature
    pubkeys = [state.validators[i].pubkey for i in indices]
    domain = get_domain(state, DOMAIN_BEACON_ATTESTER, indexed_attestation.data.target.epoch)
    signing_root = compute_signing_root(indexed_attestation.data, domain)
    return bls.FastAggregateVerify(pubkeys, signing_root, indexed_attestation.signature)

This is used in attestation processing and attester slashing processing.

IndexedAttestations differ from Attestations in that the latter record the contributing validators in a bitlist and the former explicitly list the global indices of the contributing validators.

An IndexedAttestation passes this validity test only if,

  1. There is at least one validator index present.
  2. The list of validators contains no duplicates (the Python set function performs deduplication).
  3. The indices of the validators are sorted. (It's not clear to me why this is required. It's used in the duplicate check here, but that could just be replaced by checking the set size.)
  4. Its aggregated signature verifies against the aggregated public keys of the listed validators.

Verifying the signature uses the magic of aggregated BLS signatures. The indexed attestation contains a BLS signature that is supposed to be the combined individual signatures of each of the validators listed in the attestation. This is verified by passing it to bls.FastAggregateVerify() along with the list of public keys from the same validators. The verification succeeds only if exactly the same set of validators signed the message (signing_root) as are in the list of public keys. Note that get_domain() mixes in the fork version, so that attestations are not valid across forks.

No check is done here that the attesting_indices (which are the global validator indices) are all members of the correct committee for this attestation. In process_attestation() they must be, by construction. In process_attester_slashing it doesn't matter: any validator signing conflicting attestations is liable to be slashed.

is_valid_merkle_branch

def is_valid_merkle_branch(leaf: Bytes32, branch: Sequence[Bytes32], depth: uint64, index: uint64, root: Root) -> bool:
    """
    Check if ``leaf`` at ``index`` verifies against the Merkle ``root`` and ``branch``.
    """
    value = leaf
    for i in range(depth):
        if index // (2**i) % 2:
            value = hash(branch[i] + value)
        else:
            value = hash(value + branch[i])
    return value == root

The classic algorithm for verifying a Merkle branch. Nodes are iteratively hashed as the tree is traversed from leaf to root. At each level, the corresponding bit of index selects whether we are the right or the left child of our parent. The result should match the given root of the tree.

This proves that we know that leaf is the value at position index in the list of leaves, and we know the whole structure of the rest of the tree, as summarised in branch.

We use this function in process_deposit to check whether the deposit data we've received is correct or not.
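
To see the verification in action, here is a minimal, self-contained sketch that builds a tiny eight-leaf tree and checks a branch against it. It assumes the spec's hash is SHA-256; build_branch and the toy leaves are made up for the illustration and are not part of the spec.

```python
from hashlib import sha256
from typing import List, Sequence, Tuple


def h(data: bytes) -> bytes:
    return sha256(data).digest()


def is_valid_merkle_branch(leaf: bytes, branch: Sequence[bytes], depth: int, index: int, root: bytes) -> bool:
    value = leaf
    for i in range(depth):
        if index // (2**i) % 2:
            value = h(branch[i] + value)  # we are the right child
        else:
            value = h(value + branch[i])  # we are the left child
    return value == root


def build_branch(leaves: List[bytes], index: int) -> Tuple[bytes, List[bytes]]:
    """Hypothetical helper: (root, branch) for a power-of-two list of leaves."""
    branch = []
    layer = list(leaves)
    while len(layer) > 1:
        branch.append(layer[index ^ 1])  # the sibling node at this level
        layer = [h(layer[i] + layer[i + 1]) for i in range(0, len(layer), 2)]
        index //= 2
    return layer[0], branch


leaves = [bytes([i]) * 32 for i in range(8)]
root, branch = build_branch(leaves, 5)
print(is_valid_merkle_branch(leaves[5], branch, 3, 5, root))
```

Only the leaf, its position, and one sibling hash per level are needed; the verifier never sees the other leaves.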

Misc

compute_shuffled_index

def compute_shuffled_index(index: uint64, index_count: uint64, seed: Bytes32) -> uint64:
    """
    Return the shuffled index corresponding to ``seed`` (and ``index_count``).
    """
    assert index < index_count

    # Swap or not (https://link.springer.com/content/pdf/10.1007%2F978-3-642-32009-5_1.pdf)
    # See the 'generalized domain' algorithm on page 3
    for current_round in range(SHUFFLE_ROUND_COUNT):
        pivot = bytes_to_uint64(hash(seed + uint_to_bytes(uint8(current_round)))[0:8]) % index_count
        flip = (pivot + index_count - index) % index_count
        position = max(index, flip)
        source = hash(
            seed
            + uint_to_bytes(uint8(current_round))
            + uint_to_bytes(uint32(position // 256))
        )
        byte = uint8(source[(position % 256) // 8])
        bit = (byte >> (position % 8)) % 2
        index = flip if bit else index

    return index

Selecting random, distinct committees of validators is a big part of Eth2. This is done by shuffling. Now, if you have a list of objects, shuffling it is a well understood problem in computer science.

Notice, however, that this routine manages to shuffle a single index to a new location, knowing only the total length of the list. That is, it is oblivious. To shuffle the whole list, this routine needs to be called once per validator index in the list (note: optimisations are available for doing this on batches of validators). By construction, each input index maps to a distinct output index; thus, when applied to all indices in the list, it results in a permutation, also called a shuffling.

Why do this rather than a simpler, more efficient, conventional shuffle? It's all about light clients. Beacon nodes will generally need to know the whole shuffling, but light clients will often be interested only in a small number of committees. Using this technique allows the composition of a single committee to be calculated without having to shuffle the entire set: a big saving on time and memory.

As stated in the code comments, this is an implementation of the "swap-or-not" shuffle, described in this paper. The search for a good shuffling algorithm for Eth2 is described in this issue, and swap-or-not is identified in this one, with the corresponding pull request here. For details on the mechanics of the swap-or-not shuffle (with diagrams!), check out my explainer.

The algorithm breaks down as follows. For each iteration (each round), we start with a current index.

  1. Pseudo-randomly select a pivot. This is a 64-bit integer based on the seed and current round number. This domain is large enough that any non-uniformity caused by taking the modulus in the next step is entirely negligible.
  2. Use pivot to find another index in the list of validators, flip. We can see why it is called a "pivot": index and flip end up equally far from pivot / 2, but on opposite sides of it (taking into account wrap-around in the list). That is, flip is index reflected in pivot / 2 if we lay out the list in a line.
  3. Calculate a single pseudo-random bit based on the seed, the current round number, and some bytes from either index or flip depending on which is greater.
  4. If our bit is zero, we keep index unchanged; if it is one, we set index to flip.

We are effectively swapping cards in a deck based on a deterministic algorithm.

The way that position is broken down is worth noting:

  • Bits 0-2 (3 bits) are used to select a single bit from the eight bits of byte.
  • Bits 3-7 (5 bits) are used to select a single byte from the thirty-two bytes of source.
  • Bits 8-39 (32 bits) are used in generating source. Note that the upper two bytes of this will always be zero in practice, due to limits on the number of active validators.

SHUFFLE_ROUND_COUNT is, and always has been, 90 in the mainnet configuration, as explained there.

Another nice feature of the swap-or-not shuffle is that it is also easy to invert: just start current_round at SHUFFLE_ROUND_COUNT - 1 and decrease to 0 rather than vice-versa to get back the original position.
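
Both the permutation property and the inversion trick can be checked with a small standalone sketch. It assumes the spec's hash is SHA-256 and that integers are serialised little-endian; the rounds parameter and the forward/backward wrappers are additions of mine for the illustration, not spec functions.

```python
from hashlib import sha256
from typing import Iterable

SHUFFLE_ROUND_COUNT = 90  # mainnet value


def sha(data: bytes) -> bytes:
    return sha256(data).digest()


def shuffled_index(index: int, index_count: int, seed: bytes, rounds: Iterable[int]) -> int:
    assert index < index_count
    for current_round in rounds:
        round_byte = current_round.to_bytes(1, "little")
        pivot = int.from_bytes(sha(seed + round_byte)[0:8], "little") % index_count
        flip = (pivot + index_count - index) % index_count
        position = max(index, flip)  # identical for index and flip
        source = sha(seed + round_byte + (position // 256).to_bytes(4, "little"))
        byte = source[(position % 256) // 8]
        bit = (byte >> (position % 8)) % 2
        index = flip if bit else index
    return index


def forward(index: int, index_count: int, seed: bytes) -> int:
    return shuffled_index(index, index_count, seed, range(SHUFFLE_ROUND_COUNT))


def backward(index: int, index_count: int, seed: bytes) -> int:
    # Same rounds, run from SHUFFLE_ROUND_COUNT - 1 down to 0.
    return shuffled_index(index, index_count, seed, reversed(range(SHUFFLE_ROUND_COUNT)))


seed = b"\x01" * 32
n = 50
shuffled = [forward(i, n, seed) for i in range(n)]
print(sorted(shuffled) == list(range(n)))  # every output index appears once
```

The inversion works because each round is its own inverse: index and flip map to each other and share the same position, so they see the same decision bit.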

compute_shuffled_index is used by compute_committee and compute_proposer_index. In practice, full beacon node implementations will run this once per epoch with an optimised version that shuffles the whole list, and cache the result of that for the epoch.

compute_proposer_index

def compute_proposer_index(state: BeaconState, indices: Sequence[ValidatorIndex], seed: Bytes32) -> ValidatorIndex:
    """
    Return from ``indices`` a random index sampled by effective balance.
    """
    assert len(indices) > 0
    MAX_RANDOM_BYTE = 2**8 - 1
    i = uint64(0)
    total = uint64(len(indices))
    while True:
        candidate_index = indices[compute_shuffled_index(i % total, total, seed)]
        random_byte = hash(seed + uint_to_bytes(uint64(i // 32)))[i % 32]
        effective_balance = state.validators[candidate_index].effective_balance
        if effective_balance * MAX_RANDOM_BYTE >= MAX_EFFECTIVE_BALANCE * random_byte:
            return candidate_index
        i += 1

There is exactly one beacon block proposer per slot, selected randomly from among all the active validators. The seed parameter is set in get_beacon_proposer_index based on the epoch and slot. Note that there is a small but finite probability of the same validator being called on to propose a block more than once in an epoch.

A validator's chance of being the proposer is weighted by its effective balance: a validator with a 32 Ether effective balance is twice as likely to be chosen as a validator with a 16 Ether effective balance.

In order to account for the need to weight by effective balance, this is a try-and-increment algorithm. A counter i starts at zero. This counter does double duty:

  • First i is used to uniformly select a candidate proposer with probability $1/N$, where $N$ is the number of active validators. This is done by using the compute_shuffled_index routine to shuffle index i to a new location, which is then the candidate_index.
  • Then i is used to generate a pseudo-random byte using the hash function as a seeded PRNG with at least 256 bits of output. The lower 5 bits of i select a byte from the hash output, and the upper bits salt the seed. (An obvious optimisation is that the output of the hash changes only once every 32 iterations.)

The if test is where the weighting by effective balance is done. If the candidate has MAX_EFFECTIVE_BALANCE, it will always pass this test and be returned as the proposer. If the candidate has a fraction of MAX_EFFECTIVE_BALANCE then that fraction is the probability of being returned as proposer.

If the candidate is not chosen, then i is incremented and we try again. Since the minimum effective balance is half of the maximum, this ought to terminate fairly swiftly. In the worst case, all validators have 16 Ether effective balance and each candidate has a 50% chance of being rejected, so there is about a one in a million chance ($2^{-20}$) of 20 candidates in a row being rejected.
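
The weighting can be checked by brute force over all 256 values of the random byte. This is a sketch under the assumption that MAX_EFFECTIVE_BALANCE is 32 Ether expressed in Gwei (the mainnet value); acceptance_probability is a hypothetical helper, not a spec function.

```python
MAX_EFFECTIVE_BALANCE = 32 * 10**9  # Gwei, mainnet value (assumed here)
MAX_RANDOM_BYTE = 2**8 - 1


def acceptance_probability(effective_balance: int) -> float:
    """Fraction of the 256 possible random bytes for which the candidate passes the test."""
    accepted = sum(
        1
        for random_byte in range(256)
        if effective_balance * MAX_RANDOM_BYTE >= MAX_EFFECTIVE_BALANCE * random_byte
    )
    return accepted / 256


p_full = acceptance_probability(32 * 10**9)
p_half = acceptance_probability(16 * 10**9)
# Chance of 20 rejections in a row when every validator has 16 Ether.
p_twenty_rejections = (1 - p_half) ** 20
print(p_full, p_half, p_twenty_rejections)
```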

Note that this dependence on the validators' effective balances, which are updated at the end of each epoch, means that proposer assignments are valid only in the current epoch. This is different from committee assignments, which are valid with a one epoch look-ahead.

compute_committee

def compute_committee(indices: Sequence[ValidatorIndex],
                      seed: Bytes32,
                      index: uint64,
                      count: uint64) -> Sequence[ValidatorIndex]:
    """
    Return the committee corresponding to ``indices``, ``seed``, ``index``, and committee ``count``.
    """
    start = (len(indices) * index) // count
    end = (len(indices) * (index + 1)) // count
    return [indices[compute_shuffled_index(uint64(i), uint64(len(indices)), seed)] for i in range(start, end)]

get_beacon_committee uses this to find the specific members of one of the committees at a slot.

Every epoch, a fresh set of committees is generated; during an epoch, the committees are stable.

Looking at the parameters in reverse order:

  • count is the total number of committees in an epoch. This is SLOTS_PER_EPOCH times the output of get_committee_count_per_slot().
  • index is the committee number within the epoch, running from 0 to count - 1.
  • seed is the seed value for computing the pseudo-random shuffling, based on the epoch number and a domain parameter (get_beacon_committee() uses DOMAIN_BEACON_ATTESTER).
  • indices is the list of validators eligible for inclusion in committees, namely the whole list of indices of active validators.

Random sampling among the validators is done by taking a contiguous slice of array indices from start to end and seeing where each one gets shuffled to by compute_shuffled_index(). Note that uint64(i) is a type-cast in the above: it just turns i into a uint64 type for input into the shuffling. The output value of the shuffling is then used as an index into the indices list. There is much here that client implementations will optimise with caching and batch operations.

It may not be immediately obvious, but not all committees returned will be the same size (can vary by one), and every validator in indices will be a member of exactly one committee. As we increment index from zero, clearly start for index == j + 1 is end for index == j, so there are no gaps. In addition, the highest index is count - 1, so every validator in indices finds its way into a committee.
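
The slicing arithmetic is easy to check on its own. The sketch below reproduces only the start and end calculation (no shuffling); committee_bounds is a hypothetical helper introduced for the illustration.

```python
from typing import List, Tuple


def committee_bounds(validator_count: int, count: int) -> List[Tuple[int, int]]:
    """Hypothetical helper: the (start, end) slice for each committee index."""
    return [
        ((validator_count * index) // count, (validator_count * (index + 1)) // count)
        for index in range(count)
    ]


bounds = committee_bounds(10, 3)
sizes = [end - start for start, end in bounds]
print(bounds, sizes)
```

With 10 validators and 3 committees the slices are contiguous and cover everything, and the sizes differ by at most one.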

In Phase 1, this function will also be used to generate long-lived committees for shards, and light client committees. By mixing different domains into the seed in get_seed(), different shufflings and therefore different committees will be selected for the same epoch.

compute_epoch_at_slot

def compute_epoch_at_slot(slot: Slot) -> Epoch:
    """
    Return the epoch number at ``slot``.
    """
    return Epoch(slot // SLOTS_PER_EPOCH)

This is trivial enough that I won't explain it 😀

But note that it does rely on GENESIS_SLOT and GENESIS_EPOCH being zero.

compute_start_slot_at_epoch

def compute_start_slot_at_epoch(epoch: Epoch) -> Slot:
    """
    Return the start slot of ``epoch``.
    """
    return Slot(epoch * SLOTS_PER_EPOCH)

The first slot of an epoch. See remarks above.

compute_activation_exit_epoch

def compute_activation_exit_epoch(epoch: Epoch) -> Epoch:
    """
    Return the epoch during which validator activations and exits initiated in ``epoch`` take effect.
    """
    return Epoch(epoch + 1 + MAX_SEED_LOOKAHEAD)

When queuing validators for activation or exit in process_registry_updates() and initiate_validator_exit() respectively, the activation or exit is delayed until the next epoch, plus MAX_SEED_LOOKAHEAD epochs, currently 4.

See MAX_SEED_LOOKAHEAD for the details, but in short it is designed to make it extremely hard for an attacker to manipulate the make up of committees via activations and exits.

compute_fork_data_root

def compute_fork_data_root(current_version: Version, genesis_validators_root: Root) -> Root:
    """
    Return the 32-byte fork data root for the ``current_version`` and ``genesis_validators_root``.
    This is used primarily in signature domains to avoid collisions across forks/chains.
    """
    return hash_tree_root(ForkData(
        current_version=current_version,
        genesis_validators_root=genesis_validators_root,
    ))

The fork data root serves as a unique identifier for the chain that we are on. genesis_validators_root identifies our unique genesis event, and current_version our own hard fork subsequent to that genesis event. This is useful, for example, to differentiate between a testnet and mainnet: both might have the same fork versions, but will definitely have different genesis validator roots.

It is used by compute_fork_digest() and compute_domain.

compute_fork_digest

def compute_fork_digest(current_version: Version, genesis_validators_root: Root) -> ForkDigest:
    """
    Return the 4-byte fork digest for the ``current_version`` and ``genesis_validators_root``.
    This is a digest primarily used for domain separation on the p2p layer.
    4-bytes suffices for practical separation of forks/chains.
    """
    return ForkDigest(compute_fork_data_root(current_version, genesis_validators_root)[:4])

Just the first four bytes of the fork data root as a ForkDigest type.

Used extensively in the Ethereum 2.0 networking specification.

compute_domain

def compute_domain(domain_type: DomainType, fork_version: Version=None, genesis_validators_root: Root=None) -> Domain:
    """
    Return the domain for the ``domain_type`` and ``fork_version``.
    """
    if fork_version is None:
        fork_version = GENESIS_FORK_VERSION
    if genesis_validators_root is None:
        genesis_validators_root = Root()  # all bytes zero by default
    fork_data_root = compute_fork_data_root(fork_version, genesis_validators_root)
    return Domain(domain_type + fork_data_root[:28])

When dealing with signed messages, the signature "domains" are separated according to three independent factors:

  1. All signatures include a DomainType relevant to the message's purpose, which is just some cryptographic hygiene in case the same message is to be signed for different purposes at any point.
  2. All signatures except those on deposit messages include the fork version. This ensures that messages across different forks of the chain become invalid, and that validators won't be slashed for signing attestations on two different chains (this is allowed).
  3. And, now, the root hash of the validator Merkle tree at Genesis is included. Along with the fork version this gives a unique identifier for our chain.

This function is mainly used by get_domain(). It is also used in deposit processing, in which case fork_version and genesis_validators_root take their default values since deposits are valid across forks.
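
Here is a rough standalone sketch of how the fork data root, the fork digest, and the domain fit together. It relies on my understanding that the SSZ hash tree root of a two-field container is the SHA-256 hash of its two 32-byte chunks, with the 4-byte version right-padded with zeros. The function names are shortened, the domain type values are taken from the spec's domain types table, and the all-zero GENESIS_FORK_VERSION is the mainnet value; treat all of these as assumptions of the sketch.

```python
from hashlib import sha256

GENESIS_FORK_VERSION = bytes(4)                      # mainnet value (assumed)
DOMAIN_BEACON_ATTESTER = bytes.fromhex("01000000")
DOMAIN_DEPOSIT = bytes.fromhex("03000000")


def fork_data_root(current_version: bytes, genesis_validators_root: bytes) -> bytes:
    # Two fields, so two 32-byte chunks hashed together.
    assert len(current_version) == 4 and len(genesis_validators_root) == 32
    return sha256(current_version.ljust(32, b"\x00") + genesis_validators_root).digest()


def fork_digest(current_version: bytes, genesis_validators_root: bytes) -> bytes:
    return fork_data_root(current_version, genesis_validators_root)[:4]


def domain(domain_type: bytes,
           fork_version: bytes = GENESIS_FORK_VERSION,
           genesis_validators_root: bytes = bytes(32)) -> bytes:
    # 4 bytes of domain type followed by 28 bytes of fork data root.
    return domain_type + fork_data_root(fork_version, genesis_validators_root)[:28]


deposit_domain = domain(DOMAIN_DEPOSIT)  # defaults, as in deposit processing
chain_a = domain(DOMAIN_BEACON_ATTESTER, GENESIS_FORK_VERSION, b"\x11" * 32)
chain_b = domain(DOMAIN_BEACON_ATTESTER, GENESIS_FORK_VERSION, b"\x22" * 32)
print(deposit_domain.hex())
```

The two made-up genesis roots stand in for, say, a testnet and mainnet sharing a fork version: the domains differ, so signatures cannot be replayed between them.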

Fun fact: this function looks pretty simple, but I found a subtle bug in the way tests were generated in a previous implementation. Linus's law.

compute_signing_root

def compute_signing_root(ssz_object: SSZObject, domain: Domain) -> Root:
    """
    Return the signing root for the corresponding signing data.
    """
    return hash_tree_root(SigningData(
        object_root=hash_tree_root(ssz_object),
        domain=domain,
    ))

This is a pre-processor for signing objects with BLS signatures:

  1. calculate the hash tree root of the object
  2. combine the hash tree root with the Domain inside a temporary SigningData object
  3. return the hash tree root of that, which is the data to be signed.

The domain is usually the output of get_domain(), which mixes in the cryptographic domain, the fork version, and the genesis validators root to the message hash. For deposits, it is the output of compute_domain(), ignoring the fork version and genesis validators root.

This is exactly equivalent to adding the domain to an object and taking the hash tree root of the whole thing. Indeed, this function used to be called compute_domain_wrapper_root.
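
As a sketch only: since SigningData has just two 32-byte fields, I'd expect its hash tree root to reduce to a single SHA-256 hash over the two values concatenated. The function below takes the object's root as an input rather than computing it, and the sample values are made up.

```python
from hashlib import sha256


def signing_root(object_root: bytes, domain: bytes) -> bytes:
    # Two 32-byte chunks (object_root, domain) hashed together.
    assert len(object_root) == 32 and len(domain) == 32
    return sha256(object_root + domain).digest()


object_root = b"\x42" * 32                        # made-up message root
attester_domain = bytes.fromhex("01000000") + bytes(28)
proposer_domain = bytes.fromhex("00000000") + bytes(28)
root_a = signing_root(object_root, attester_domain)
root_b = signing_root(object_root, proposer_domain)
print(root_a != root_b)
```

The same object signed under two different domains yields two different signing roots, which is the whole point of the exercise.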

Beacon state accessors

The massive BeaconState object gets passed around everywhere, so it's simple to access stored data directly. The following functions are simple wrappers that do some amount of processing on the beacon state data to be returned.

get_current_epoch

def get_current_epoch(state: BeaconState) -> Epoch:
    """
    Return the current epoch.
    """
    return compute_epoch_at_slot(state.slot)

A getter for the current epoch, as calculated by compute_epoch_at_slot().

get_previous_epoch

def get_previous_epoch(state: BeaconState) -> Epoch:
    """
    Return the previous epoch (unless the current epoch is ``GENESIS_EPOCH``).
    """
    current_epoch = get_current_epoch(state)
    return GENESIS_EPOCH if current_epoch == GENESIS_EPOCH else Epoch(current_epoch - 1)

Return the previous epoch number as an Epoch type. Returns GENESIS_EPOCH if we are in the GENESIS_EPOCH: it has no prior, and we don't do negative numbers.

get_block_root

def get_block_root(state: BeaconState, epoch: Epoch) -> Root:
    """
    Return the block root at the start of a recent ``epoch``.
    """
    return get_block_root_at_slot(state, compute_start_slot_at_epoch(epoch))

The Casper FFG part of consensus deals in Checkpoints that are the first slot of an epoch. get_block_root is a specialised version of get_block_root_at_slot() that only returns the block root of the checkpoint, given an epoch.

get_block_root_at_slot

def get_block_root_at_slot(state: BeaconState, slot: Slot) -> Root:
    """
    Return the block root at a recent ``slot``.
    """
    assert slot < state.slot <= slot + SLOTS_PER_HISTORICAL_ROOT
    return state.block_roots[slot % SLOTS_PER_HISTORICAL_ROOT]

Recent block roots are stored in a circular list in state, with a length of SLOTS_PER_HISTORICAL_ROOT (currently ~27 hours).

get_block_root_at_slot is used by get_matching_head_attestations(), and in turn when assigning rewards for good LMD GHOST consensus votes.

get_randao_mix

def get_randao_mix(state: BeaconState, epoch: Epoch) -> Bytes32:
    """
    Return the randao mix at a recent ``epoch``.
    """
    return state.randao_mixes[epoch % EPOCHS_PER_HISTORICAL_VECTOR]

Randao mixes are stored in a circular list of length EPOCHS_PER_HISTORICAL_VECTOR. They are used when calculating the seed for assigning beacon proposers and committees.

get_active_validator_indices

def get_active_validator_indices(state: BeaconState, epoch: Epoch) -> Sequence[ValidatorIndex]:
    """
    Return the sequence of active validator indices at ``epoch``.
    """
    return [ValidatorIndex(i) for i, v in enumerate(state.validators) if is_active_validator(v, epoch)]

Steps through the entire list of validators and returns the list of only the active ones (that is, validators that have been activated but not exited, as determined by is_active_validator()).

This function is heavily used and I'd expect it to be memoised in practice.

get_validator_churn_limit

def get_validator_churn_limit(state: BeaconState) -> uint64:
    """
    Return the validator churn limit for the current epoch.
    """
    active_validator_indices = get_active_validator_indices(state, get_current_epoch(state))
    return max(MIN_PER_EPOCH_CHURN_LIMIT, uint64(len(active_validator_indices)) // CHURN_LIMIT_QUOTIENT)

The "churn limit" applies when activating and exiting validators and acts as a rate-limit on changes to the validator set. The value of this function provides the number of validators that may become active in an epoch, and the number of validators that may exit in an epoch.

Some small amount of churn is always allowed, set by MIN_PER_EPOCH_CHURN_LIMIT, and the amount of per-epoch churn allowed increases by one for every extra CHURN_LIMIT_QUOTIENT validators that are currently active (once the minimum has been exceeded).
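
For a feel of the numbers, here is a small sketch using the mainnet configuration values, MIN_PER_EPOCH_CHURN_LIMIT = 4 and CHURN_LIMIT_QUOTIENT = 65536. It takes the active validator count directly rather than a BeaconState, which is a simplification of mine.

```python
MIN_PER_EPOCH_CHURN_LIMIT = 4    # 2**2, mainnet value
CHURN_LIMIT_QUOTIENT = 65536     # 2**16, mainnet value


def churn_limit(active_validator_count: int) -> int:
    return max(MIN_PER_EPOCH_CHURN_LIMIT, active_validator_count // CHURN_LIMIT_QUOTIENT)


for count in (100_000, 327_679, 327_680, 655_360):
    print(count, churn_limit(count))
```

The limit stays at the minimum of 4 until there are 327,680 active validators, and rises by one for every further 65,536.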

get_seed

def get_seed(state: BeaconState, epoch: Epoch, domain_type: DomainType) -> Bytes32:
    """
    Return the seed at ``epoch``.
    """
    mix = get_randao_mix(state, Epoch(epoch + EPOCHS_PER_HISTORICAL_VECTOR - MIN_SEED_LOOKAHEAD - 1))  # Avoid underflow
    return hash(domain_type + uint_to_bytes(epoch) + mix)

Used in get_beacon_committee() and get_beacon_proposer_index to provide the random input for computing proposers and committees. domain_type is DOMAIN_BEACON_ATTESTER or DOMAIN_BEACON_PROPOSER respectively.

Randao mixes are stored in a circular list of length EPOCHS_PER_HISTORICAL_VECTOR. The seed for an epoch is based on the randao mix stored for epoch - MIN_SEED_LOOKAHEAD - 1 (that is, its final value, which is fixed MIN_SEED_LOOKAHEAD epochs before the epoch in question begins). This is to limit the forward visibility of randomness: see the explanation there.

The seed returned is not based only on the domain and the randao mix, but the epoch number is also added in. This is to handle the pathological case of no blocks being seen for more than two epochs, in which case we run out of randao updates. Adding in the epoch number means that fresh committees and proposers can continue to be selected.
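
The effect of mixing in the epoch number can be seen in a standalone sketch. It assumes SHA-256 as the hash, little-endian 8-byte serialisation of the epoch, and the mainnet values EPOCHS_PER_HISTORICAL_VECTOR = 65536 and MIN_SEED_LOOKAHEAD = 1; mix_index and seed_for are hypothetical names, and the list of identical mixes models the pathological no-blocks case.

```python
from hashlib import sha256
from typing import Sequence

EPOCHS_PER_HISTORICAL_VECTOR = 2**16  # mainnet value (assumed)
MIN_SEED_LOOKAHEAD = 1                # mainnet value (assumed)
DOMAIN_BEACON_ATTESTER = bytes.fromhex("01000000")


def mix_index(epoch: int) -> int:
    # Adding the vector length first keeps the unsigned arithmetic from underflowing.
    return (epoch + EPOCHS_PER_HISTORICAL_VECTOR - MIN_SEED_LOOKAHEAD - 1) % EPOCHS_PER_HISTORICAL_VECTOR


def seed_for(randao_mixes: Sequence[bytes], epoch: int, domain_type: bytes) -> bytes:
    mix = randao_mixes[mix_index(epoch)]
    return sha256(domain_type + epoch.to_bytes(8, "little") + mix).digest()


# No blocks for a long time: every stored mix is the same stale value.
stale_mixes = [b"\xaa" * 32] * EPOCHS_PER_HISTORICAL_VECTOR
seed_10 = seed_for(stale_mixes, 10, DOMAIN_BEACON_ATTESTER)
seed_11 = seed_for(stale_mixes, 11, DOMAIN_BEACON_ATTESTER)
print(seed_10 != seed_11)
```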

get_committee_count_per_slot

def get_committee_count_per_slot(state: BeaconState, epoch: Epoch) -> uint64:
    """
    Return the number of committees in each slot for the given ``epoch``.
    """
    return max(uint64(1), min(
        MAX_COMMITTEES_PER_SLOT,
        uint64(len(get_active_validator_indices(state, epoch))) // SLOTS_PER_EPOCH // TARGET_COMMITTEE_SIZE,
    ))

Every slot in a given epoch has the same number of beacon committees, as calculated by this function.

There is always at least one committee per slot, and never more than MAX_COMMITTEES_PER_SLOT, currently 64.

Subject to these constraints, the actual number of committees per slot is $N / 4096$, where $N$ is the total number of active validators.

The intended behaviour looks like this:

  1. The ideal case is that there are MAX_COMMITTEES_PER_SLOT = 64 committees per slot. This maps to one committee per slot per shard in Phase 1: these committees will be responsible for voting on shard crosslinks. There must be at least 262,144 active validators to achieve this.
  2. If there are fewer active validators, then the number of committees per slot is reduced below 64 in order to maintain a minimum committee size of TARGET_COMMITTEE_SIZE = 128. In this case, not every shard will get crosslinked at every slot in Phase 1.
  3. Finally, only if the number of active validators falls below 4096 will the committee size be reduced to less than 128. This is the point at which there is only one beacon committee per slot. But, at this point, the chain basically has no meaningful security in any case.
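
The three regimes above can be checked numerically. This sketch takes the active validator count directly instead of a BeaconState and uses the constants quoted in the text plus SLOTS_PER_EPOCH = 32; average_committee_size is a hypothetical helper.

```python
MAX_COMMITTEES_PER_SLOT = 64
SLOTS_PER_EPOCH = 32
TARGET_COMMITTEE_SIZE = 128


def committees_per_slot(active_validator_count: int) -> int:
    return max(1, min(
        MAX_COMMITTEES_PER_SLOT,
        active_validator_count // SLOTS_PER_EPOCH // TARGET_COMMITTEE_SIZE,
    ))


def average_committee_size(active_validator_count: int) -> float:
    """Hypothetical helper: validators divided evenly over all committees in an epoch."""
    return active_validator_count / (SLOTS_PER_EPOCH * committees_per_slot(active_validator_count))


for n in (2_048, 4_096, 131_072, 262_144, 1_000_000):
    print(n, committees_per_slot(n), average_committee_size(n))
```

Above 262,144 validators the committee count is pinned at 64, so the committees simply grow; below 4,096 the count is pinned at 1, so they shrink below the target.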

get_beacon_committee

def get_beacon_committee(state: BeaconState, slot: Slot, index: CommitteeIndex) -> Sequence[ValidatorIndex]:
    """
    Return the beacon committee at ``slot`` for ``index``.
    """
    epoch = compute_epoch_at_slot(slot)
    committees_per_slot = get_committee_count_per_slot(state, epoch)
    return compute_committee(
        indices=get_active_validator_indices(state, epoch),
        seed=get_seed(state, epoch, DOMAIN_BEACON_ATTESTER),
        index=(slot % SLOTS_PER_EPOCH) * committees_per_slot + index,
        count=committees_per_slot * SLOTS_PER_EPOCH,
    )

Beacon committees vote on the beacon block at each slot via attestations. There are up to MAX_COMMITTEES_PER_SLOT beacon committees per slot, and each committee is active exactly once per epoch.

This function returns the list of committee members given a slot number and an index within that slot to select the desired committee, relying on compute_committee() to do the heavy lifting.

Note that, since this uses get_seed(), we can obtain committees only up to EPOCHS_PER_HISTORICAL_VECTOR epochs into the past (minus MIN_SEED_LOOKAHEAD).

get_beacon_committee is used by get_attesting_indices() and process_attestation() when processing attestations coming from a committee, and by validators when checking their committee assignments and aggregation duties.

get_beacon_proposer_index

def get_beacon_proposer_index(state: BeaconState) -> ValidatorIndex:
    """
    Return the beacon proposer index at the current slot.
    """
    epoch = get_current_epoch(state)
    seed = hash(get_seed(state, epoch, DOMAIN_BEACON_PROPOSER) + uint_to_bytes(state.slot))
    indices = get_active_validator_indices(state, epoch)
    return compute_proposer_index(state, indices, seed)

Each slot, exactly one of the active validators is pseudo-randomly assigned to be the proposer of the beacon block for that slot. The probability of being selected is weighted by the validator's effective balance in compute_proposer_index().

The chosen block proposer does not need to be a member of one of the beacon committees for that slot: it is chosen from the entire set of active validators for that epoch.

Since the randao seed is updated only once per epoch, the slot number is mixed into the seed using a hash to get a different proposer at each slot. There is a chance of the same proposer being selected in two consecutive slots, or more than once per epoch: if every validator has the same effective balance, then the probability of being selected in a particular slot is simply $1/N$ independent of any other slot, where $N$ is the number of active validators in the epoch corresponding to the slot.

get_total_balance

def get_total_balance(state: BeaconState, indices: Set[ValidatorIndex]) -> Gwei:
    """
    Return the combined effective balance of the ``indices``.
    ``EFFECTIVE_BALANCE_INCREMENT`` Gwei minimum to avoid divisions by zero.
    Math safe up to ~10B ETH, after which this overflows uint64.
    """
    return Gwei(max(EFFECTIVE_BALANCE_INCREMENT, sum([state.validators[index].effective_balance for index in indices])))

A simple utility to return the total balance of all validators in the list, indices, passed in.

Side observation: there is an interesting example of some fragility in the spec lurking here. This function used to return a minimum of 1 Gwei to avoid a potential division by zero in get_attestation_deltas(). However, that function was modified to avoid a possible overflow condition, without modifying this function, which introduced the possibility of a division by zero. This was later fixed by returning a minimum of EFFECTIVE_BALANCE_INCREMENT. But, Yay! for lots of eyes on the spec.

get_total_active_balance

def get_total_active_balance(state: BeaconState) -> Gwei:
    """
    Return the combined effective balance of the active validators.
    Note: ``get_total_balance`` returns ``EFFECTIVE_BALANCE_INCREMENT`` Gwei minimum to avoid divisions by zero.
    """
    return get_total_balance(state, set(get_active_validator_indices(state, get_current_epoch(state))))

Uses get_total_balance() to calculate the sum of the effective balances of all active validators in the current epoch.

This quantity is frequently used in the spec. For example, Casper FFG uses the total active balance to judge whether the 2/3 majority threshold of attestations has been reached in justification and finalisation. And it is a fundamental part of the calculation of rewards and penalties, where the base reward is made proportional to the reciprocal of the square root of the total active balance: validator rewards are higher when little balance is at stake (few active validators) and lower when much balance is at stake (many active validators).

Total active balance does not change during an epoch, so is a great candidate for being cached.

get_domain

def get_domain(state: BeaconState, domain_type: DomainType, epoch: Epoch=None) -> Domain:
    """
    Return the signature domain (fork version concatenated with domain type) of a message.
    """
    epoch = get_current_epoch(state) if epoch is None else epoch
    fork_version = state.fork.previous_version if epoch < state.fork.epoch else state.fork.current_version
    return compute_domain(domain_type, fork_version, state.genesis_validators_root)

For the science behind domains, see Domain types and compute_domain().

With the exception of DOMAIN_DEPOSIT, domains are always combined with the fork version before being used in signature generation. This is to distinguish messages for different chains, and ensure that validators don't get slashed if they choose to participate on two independent forks. (That is, deliberate forks, aka hard-forks. Participating on both branches of temporary consensus forks is punishable: that's basically the whole point of slashing.)

get_indexed_attestation

def get_indexed_attestation(state: BeaconState, attestation: Attestation) -> IndexedAttestation:
    """
    Return the indexed attestation corresponding to ``attestation``.
    """
    attesting_indices = get_attesting_indices(state, attestation.data, attestation.aggregation_bits)

    return IndexedAttestation(
        attesting_indices=sorted(attesting_indices),
        data=attestation.data,
        signature=attestation.signature,
    )

Just a wrapper converting an Attestation into an IndexedAttestation.

Attestations are aggregatable, which means that attestations from multiple validators making the same vote can be rolled up into a single attestation through the magic of BLS signature aggregation. However, in order to be able to verify the signature later, a record needs to be kept of which validators actually contributed to the attestation. This is so that those validators' public keys can be aggregated.

The Attestation type uses a bitlist to indicate whether a member of the attesting committee contributed to the attestation. This is to minimise the size. The IndexedAttestation type explicitly lists the global validator indices of contributing validators. Note that the list of indices is sorted: an attestation is invalid if not.

The conversion between the list formats is performed by get_attesting_indices(), below.

get_attesting_indices

def get_attesting_indices(state: BeaconState,
                          data: AttestationData,
                          bits: Bitlist[MAX_VALIDATORS_PER_COMMITTEE]) -> Set[ValidatorIndex]:
    """
    Return the set of attesting indices corresponding to ``data`` and ``bits``.
    """
    committee = get_beacon_committee(state, data.slot, data.index)
    return set(index for i, index in enumerate(committee) if bits[i])

Lists of validators within committees occur in two forms in the specification:

  • Compressed into a bitlist, in which each bit represents the presence or absence of a validator from a particular committee. The committee is referenced by slot and committee index within that slot. This is how sets of validators are represented in Attestations.
  • An explicit list of validator indices, as in IndexedAttestations.

get_attesting_indices() converts from the former representation to the latter. The slot number and the committee index are provided by the AttestationData and are used to reconstruct the committee members via get_beacon_committee(), and the bitlist will have come from an Attestation.
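
A toy example of the conversion, with a made-up four-member committee standing in for the output of get_beacon_committee() and a plain list of booleans standing in for the bitlist:

```python
from typing import Sequence, Set


def attesting_indices(committee: Sequence[int], bits: Sequence[bool]) -> Set[int]:
    # Bit i refers to the i-th member of the committee, not to validator i.
    return set(index for i, index in enumerate(committee) if bits[i])


committee = [42, 7, 19, 3]          # global validator indices, in committee order
bits = [True, False, True, True]    # which committee positions contributed
participants = attesting_indices(committee, bits)
indexed = sorted(participants)      # the form stored in an IndexedAttestation
print(indexed)
```

The bitlist is meaningless without the committee it refers to, which is why the slot and committee index have to travel with it in the AttestationData.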

Beacon state mutators

increase_balance

def increase_balance(state: BeaconState, index: ValidatorIndex, delta: Gwei) -> None:
    """
    Increase the validator balance at index ``index`` by ``delta``.
    """
    state.balances[index] += delta

This and decrease_balance() are the only places in the spec where validator balances are modified: it's a nod towards encapsulation.

Two separate functions are needed for changing validator balances (one for increasing and one for decreasing) because we are using only unsigned integers, remember.

Fun fact: A typo around this led to our one and only consensus failure at the initial client interop event. You see, unsigned integers induce bugs!

decrease_balance

def decrease_balance(state: BeaconState, index: ValidatorIndex, delta: Gwei) -> None:
    """
    Decrease the validator balance at index ``index`` by ``delta``, with underflow protection.
    """
    state.balances[index] = 0 if delta > state.balances[index] else state.balances[index] - delta

The counterpart to increase_balance(). This one has extra work to do to check for unsigned int underflow. Balances may not go negative.
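
To see why the guard matters, here is a rough sketch that simulates fixed-width uint64 arithmetic with plain Python integers. The naive_decrease() helper is hypothetical: it exists only to show the wrap-around that a language with wrapping unsigned subtraction would produce. safe_decrease() mirrors the logic of decrease_balance().

```python
UINT64_MAX = 2**64 - 1


def naive_decrease(balance: int, delta: int) -> int:
    """Subtract the way a wrapping uint64 would: underflow wraps around."""
    return (balance - delta) % (UINT64_MAX + 1)


def safe_decrease(balance: int, delta: int) -> int:
    """Saturating subtraction, mirroring decrease_balance()."""
    return 0 if delta > balance else balance - delta


wrapped = naive_decrease(5, 7)  # a huge balance appears from nowhere
clamped = safe_decrease(5, 7)   # the balance bottoms out at zero
```

Without the check, a penalty larger than the balance would leave the validator with a balance close to 2**64 Gwei rather than zero.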

initiate_validator_exit

def initiate_validator_exit(state: BeaconState, index: ValidatorIndex) -> None:
    """
    Initiate the exit of the validator with index ``index``.
    """
    # Return if validator already initiated exit
    validator = state.validators[index]
    if validator.exit_epoch != FAR_FUTURE_EPOCH:
        return

    # Compute exit queue epoch
    exit_epochs = [v.exit_epoch for v in state.validators if v.exit_epoch != FAR_FUTURE_EPOCH]
    exit_queue_epoch = max(exit_epochs + [compute_activation_exit_epoch(get_current_epoch(state))])
    exit_queue_churn = len([v for v in state.validators if v.exit_epoch == exit_queue_epoch])
    if exit_queue_churn >= get_validator_churn_limit(state):
        exit_queue_epoch += Epoch(1)

    # Set validator exit epoch and withdrawable epoch
    validator.exit_epoch = exit_queue_epoch
    validator.withdrawable_epoch = Epoch(validator.exit_epoch + MIN_VALIDATOR_WITHDRAWABILITY_DELAY)

Exits may be initiated voluntarily, or as a result of being slashed, or by dropping below the EJECTION_BALANCE threshold.

A dynamic "churn limit" caps the number of validators that may exit per epoch. This is calculated by get_validator_churn_limit(). The mechanism for enforcing this is the exit queue: the validator's exit_epoch is set such that it is at the end of the queue. (Per the spec, the queue is not a separate data structure, but is continually re-calculated from the exit epochs of all validators: I expect there are some optimisations to be had around this in actual implementations.)
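
The queue behaviour can be sketched with a stripped-down validator record. This is an illustration under simplifying assumptions, not a client implementation: earliest_exit_epoch stands in for compute_activation_exit_epoch(get_current_epoch(state)), churn_limit stands in for get_validator_churn_limit(state), and the numbers are invented.

```python
from dataclasses import dataclass
from typing import List

FAR_FUTURE_EPOCH = 2**64 - 1


@dataclass
class Validator:
    exit_epoch: int = FAR_FUTURE_EPOCH


def next_exit_epoch(validators: List[Validator], earliest_exit_epoch: int, churn_limit: int) -> int:
    """Return the epoch at the tail of the exit queue, following the spec logic."""
    exit_epochs = [v.exit_epoch for v in validators if v.exit_epoch != FAR_FUTURE_EPOCH]
    queue_epoch = max(exit_epochs + [earliest_exit_epoch])
    churn = sum(1 for v in validators if v.exit_epoch == queue_epoch)
    if churn >= churn_limit:
        queue_epoch += 1  # the tail epoch is full, so spill into the next one
    return queue_epoch


validators = [Validator() for _ in range(10)]
assigned = []
for v in validators[:5]:
    v.exit_epoch = next_exit_epoch(validators, earliest_exit_epoch=100, churn_limit=2)
    assigned.append(v.exit_epoch)
```

With a churn limit of 2, five exits requested in the same epoch are spread over epochs 100, 100, 101, 101 and 102.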

An exiting validator is expected to continue with its proposing and attesting duties until exit_epoch has passed, and will continue to receive rewards and penalties accordingly.

In addition, an exited validator remains eligible to be slashed until its withdrawable_epoch, which is set to MIN_VALIDATOR_WITHDRAWABILITY_DELAY epochs after its exit_epoch. This is to allow some extra time for any slashable offences by the validator to be detected and reported.

slash_validator

def slash_validator(state: BeaconState,
                    slashed_index: ValidatorIndex,
                    whistleblower_index: ValidatorIndex=None) -> None:
    """
    Slash the validator with index ``slashed_index``.
    """
    epoch = get_current_epoch(state)
    initiate_validator_exit(state, slashed_index)
    validator = state.validators[slashed_index]
    validator.slashed = True
    validator.withdrawable_epoch = max(validator.withdrawable_epoch, Epoch(epoch + EPOCHS_PER_SLASHINGS_VECTOR))
    state.slashings[epoch % EPOCHS_PER_SLASHINGS_VECTOR] += validator.effective_balance
    decrease_balance(state, slashed_index, validator.effective_balance // MIN_SLASHING_PENALTY_QUOTIENT)

    # Apply proposer and whistleblower rewards
    proposer_index = get_beacon_proposer_index(state)
    if whistleblower_index is None:
        whistleblower_index = proposer_index
    whistleblower_reward = Gwei(validator.effective_balance // WHISTLEBLOWER_REWARD_QUOTIENT)
    proposer_reward = Gwei(whistleblower_reward // PROPOSER_REWARD_QUOTIENT)
    increase_balance(state, proposer_index, proposer_reward)
    increase_balance(state, whistleblower_index, Gwei(whistleblower_reward - proposer_reward))

Both proposer slashings and attester slashings end up here when a report of a slashable offence has been verified during block processing.

When a validator is slashed, several things happen immediately:

  • The validator is processed for exit via initiate_validator_exit(), so it joins the exit queue.
  • It is also marked as slashed. This information is used when calculating rewards and penalties: while being exited, whatever it does, a slashed validator receives penalties as if it had failed to propose or attest, including the inactivity leak if applicable.
  • Normally, as part of the exit process, the withdrawable_epoch for a validator (the point at which a validator's stake is in principle unlocked) is set to MIN_VALIDATOR_WITHDRAWABILITY_DELAY epochs after it exits. When a validator is slashed, a much longer period of lock-up applies, namely EPOCHS_PER_SLASHINGS_VECTOR. This is to allow a further, potentially much greater, slashing penalty to be applied later once the chain knows how many validators have been slashed together around this time.
  • The effective balance of the validator is added to the accumulated balances of validators slashed this epoch, and stored in the circular list, state.slashings. This will be used by the slashing penalty calculation mentioned in the previous point.
  • An initial "slap on the wrist" slashing penalty of the validator's effective balance (in Gwei) divided by the MIN_SLASHING_PENALTY_QUOTIENT is applied. With current values, this is a maximum of 1 Ether. As above, a potentially larger penalty will be applied later depending on how many other validators were slashed concurrently.
  • The proposer including the slashing proof receives a reward.

In short, a slashed validator receives an initial minor penalty, can expect to receive a further penalty later, and is marked for exit.
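
To put the two lock-up periods side by side, here is a small sketch. The constant values are assumptions (they are not stated in this section and may differ between spec versions), and withdrawable_epoch() is a hypothetical helper that combines the logic of initiate_validator_exit() and slash_validator().

```python
from typing import Optional

# Assumed values; check the constants of the spec version in use.
SLOTS_PER_EPOCH = 32
SECONDS_PER_SLOT = 12
MIN_VALIDATOR_WITHDRAWABILITY_DELAY = 256
EPOCHS_PER_SLASHINGS_VECTOR = 8192


def withdrawable_epoch(exit_epoch: int, slashed_at_epoch: Optional[int] = None) -> int:
    """Epoch at which the stake unlocks, for a normal or a slashed exit."""
    withdrawable = exit_epoch + MIN_VALIDATOR_WITHDRAWABILITY_DELAY
    if slashed_at_epoch is not None:
        withdrawable = max(withdrawable, slashed_at_epoch + EPOCHS_PER_SLASHINGS_VECTOR)
    return withdrawable


def epochs_to_days(epochs: int) -> float:
    return epochs * SLOTS_PER_EPOCH * SECONDS_PER_SLOT / 86400


normal = withdrawable_epoch(exit_epoch=1005)
slashed = withdrawable_epoch(exit_epoch=1005, slashed_at_epoch=1000)
lockup_days = epochs_to_days(slashed - 1000)
```

Under these assumed values the voluntary exit unlocks at epoch 1261, while the slashed validator stays locked until epoch 9192, a little over 36 days after the slashing.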

Note that the whistleblower_index defaults to None in the parameter list. This is never used in Phase 0, with the result that the proposer that included the slashing gets the entire reward; there is no separate whistleblower reward for reporting proposer or attester slashings. One reason is simply that reports are too easy to steal: if I report a slashable event to a block proposer, there is nothing to prevent that proposer claiming the report as its own. We could introduce some fancy ZK protocol to make this trustless, but this is what we're going with for now. In Phase 1, whistleblower rewards in the proof-of-custody game may use this functionality.
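
A worked example of the arithmetic in slash_validator() may help. The quotient values below are assumptions chosen to match the "maximum of 1 Ether" figure above; they are not stated in this section and should be checked against the spec version in use.

```python
GWEI_PER_ETH = 10**9

# Assumed constant values (not stated in this section).
MAX_EFFECTIVE_BALANCE = 32 * GWEI_PER_ETH
MIN_SLASHING_PENALTY_QUOTIENT = 32
WHISTLEBLOWER_REWARD_QUOTIENT = 512
PROPOSER_REWARD_QUOTIENT = 8

effective_balance = MAX_EFFECTIVE_BALANCE

# Initial penalty applied to the slashed validator.
initial_penalty = effective_balance // MIN_SLASHING_PENALTY_QUOTIENT

# Reward split, as computed in slash_validator().
whistleblower_reward = effective_balance // WHISTLEBLOWER_REWARD_QUOTIENT
proposer_reward = whistleblower_reward // PROPOSER_REWARD_QUOTIENT

# With no separate whistleblower, the proposer collects both parts.
proposer_total = proposer_reward + (whistleblower_reward - proposer_reward)
```

With these values the slashed validator immediately loses 1 ETH, and the proposer that included the slashing gains 0.0625 ETH in total.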

As a final note, here and in deposit processing are the only places in the Phase 0 specification where validator balances are updated outside epoch processing.

Genesis

Genesis is the moment at which all the clients simultaneously start processing the beacon chain. All being well, we will only ever have to do this once for real!

All the clients need to be in agreement about the timing of the genesis event, and also about the contents of the genesis block. The genesis event occurs once two pre-conditions have been satisfied:

  1. MIN_GENESIS_TIME must have passed. There's also a GENESIS_DELAY that applies in some circumstances.
  2. Sufficient valid deposits must have been made into the Eth1 deposit contract to activate MIN_GENESIS_ACTIVE_VALIDATOR_COUNT validators. [TODO - point to the new Solidity contract once it has been uploaded to the repo].

Information about both of these is drawn from the existing Eth1 chain as the source of truth.

Before the Ethereum 2.0 genesis has been triggered, and for every Ethereum 1.0 block, let candidate_state = initialize_beacon_state_from_eth1(eth1_block_hash, eth1_timestamp, deposits) where:

  • eth1_block_hash is the hash of the Ethereum 1.0 block
  • eth1_timestamp is the Unix timestamp corresponding to eth1_block_hash
  • deposits is the sequence of all deposits, ordered chronologically, up to (and including) the block with hash eth1_block_hash

Ahead of MIN_GENESIS_TIME, some Eth2 beacon nodes should be up and running, and monitoring the Eth1 chain. No Eth2 beacon blocks are being produced yet.

From here until is_valid_genesis_state() returns True, each Eth1 block that is produced is run through the initialize_beacon_state_from_eth1() function.

Eth1 blocks must only be considered once they are at least SECONDS_PER_ETH1_BLOCK * ETH1_FOLLOW_DISTANCE seconds old (i.e. eth1_timestamp + SECONDS_PER_ETH1_BLOCK * ETH1_FOLLOW_DISTANCE <= current_unix_time). Due to this constraint, if GENESIS_DELAY < SECONDS_PER_ETH1_BLOCK * ETH1_FOLLOW_DISTANCE, then the genesis_time can happen before the time/state is first known. Values should be configured to avoid this case.

The SECONDS_PER_ETH1_BLOCK*ETH1_FOLLOW_DISTANCE constraint is a heuristic intended to ensure that any Eth1 block we rely on is not later reorganised out of the Eth1 chain. Its value is set pretty conservatively: about 4 hours.
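
The "about 4 hours" figure can be checked with a short calculation. The two configuration values are assumptions (they are not given in this section), and is_eth1_block_usable() is a hypothetical helper that restates the constraint quoted above.

```python
# Assumed configuration values (not stated in this section).
SECONDS_PER_ETH1_BLOCK = 14
ETH1_FOLLOW_DISTANCE = 1024

follow_seconds = SECONDS_PER_ETH1_BLOCK * ETH1_FOLLOW_DISTANCE
follow_hours = follow_seconds / 3600


def is_eth1_block_usable(eth1_timestamp: int, current_unix_time: int) -> bool:
    """True once the Eth1 block is old enough to be considered."""
    return eth1_timestamp + follow_seconds <= current_unix_time
```

With these values the follow distance is 14336 seconds, just under four hours.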

def initialize_beacon_state_from_eth1(eth1_block_hash: Bytes32,
                                      eth1_timestamp: uint64,
                                      deposits: Sequence[Deposit]) -> BeaconState:
    fork = Fork(
        previous_version=GENESIS_FORK_VERSION,
        current_version=GENESIS_FORK_VERSION,
        epoch=GENESIS_EPOCH,
    )
    state = BeaconState(
        genesis_time=eth1_timestamp + GENESIS_DELAY,
        fork=fork,
        eth1_data=Eth1Data(block_hash=eth1_block_hash, deposit_count=len(deposits)),
        latest_block_header=BeaconBlockHeader(body_root=hash_tree_root(BeaconBlockBody())),
        randao_mixes=[eth1_block_hash] * EPOCHS_PER_HISTORICAL_VECTOR,  # Seed RANDAO with Eth1 entropy
    )

    # Process deposits
    leaves = list(map(lambda deposit: deposit.data, deposits))
    for index, deposit in enumerate(deposits):
        deposit_data_list = List[DepositData, 2**DEPOSIT_CONTRACT_TREE_DEPTH](*leaves[:index + 1])
        state.eth1_data.deposit_root = hash_tree_root(deposit_data_list)
        process_deposit(state, deposit)

    # Process activations
    for index, validator in enumerate(state.validators):
        balance = state.balances[index]
        validator.effective_balance = min(balance - balance % EFFECTIVE_BALANCE_INCREMENT, MAX_EFFECTIVE_BALANCE)
        if validator.effective_balance == MAX_EFFECTIVE_BALANCE:
            validator.activation_eligibility_epoch = GENESIS_EPOCH
            validator.activation_epoch = GENESIS_EPOCH

    # Set genesis validators root for domain separation and chain versioning
    state.genesis_validators_root = hash_tree_root(state.validators)

    return state

So, Eth1 blocks are used in turn to repeatedly try to construct a valid genesis beacon state as follows.

  1. The beacon state timestamp is set to the Eth1 block's time stamp plus the GENESIS_DELAY. By the constraint above, this will be in the future.
  2. A few other genesis constants are set. Notably, the Randao is seeded from the Eth1 block hash. The latest_block_header field is derived from an empty BeaconBlockBody - that is, all its fields default to their zero values as defined in the SSZ specification.
  3. All the deposits into the Eth1 contract up to and including this block are processed. These are provided as a list, which can be derived from the receipts generated by the deposit contract. Some of these may be invalid, for example having an invalid BLS signature. These are ignored; the deposit is lost forever. Some may be partial or repeated deposits: this is fine and the total deposit for each validator is totted up in process_deposit().
  4. Validators that have an effective balance of MAX_EFFECTIVE_BALANCE (i.e. 32 Ether) are marked to become active at the start of the GENESIS_EPOCH.
  5. The hash tree root of the validator states becomes a permanent identifier for this chain in the form of genesis_validators_root. This is used by ForkData, which in turn is used whenever this chain needs to be distinguished from another chain.

Note: The ETH1 block with eth1_timestamp meeting the minimum genesis active validator count criteria can also occur before MIN_GENESIS_TIME.

There are two ways in which the genesis process can play out. Consider the point in time MIN_GENESIS_TIME - GENESIS_DELAY.

  • If sufficient Eth1 deposits to activate MIN_GENESIS_ACTIVE_VALIDATOR_COUNT validators have been made by that time, then genesis will occur at the timestamp of the first Eth1 block after that time plus GENESIS_DELAY, which is likely to be a few seconds after MIN_GENESIS_TIME. It will include all validators registered to this point, which can be in excess of MIN_GENESIS_ACTIVE_VALIDATOR_COUNT.
  • Otherwise, genesis occurs GENESIS_DELAY seconds after the timestamp of the block containing the deposit that activates the MIN_GENESIS_ACTIVE_VALIDATOR_COUNTth validator. Genesis will include all validators registered up to and including this block (which might be MIN_GENESIS_ACTIVE_VALIDATOR_COUNT or perhaps slightly over if the block has multiple deposits).

Recall that, in both these cases, there is also an interval of SECONDS_PER_ETH1_BLOCK*ETH1_FOLLOW_DISTANCE seconds between the deposit hitting the Eth1 chain and being picked up by the beacon nodes. Finally, note that the activation queue that normally applies for onboarding new validators is not used pre-genesis.
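
The two cases can be sketched as a scan over Eth1 blocks. Everything here is a simplification: the constants are made-up round numbers, each block is reduced to a (timestamp, cumulative count of fully funded validators) pair, and find_genesis_time() is a hypothetical stand-in for repeatedly building a candidate state and testing it with is_valid_genesis_state().

```python
from typing import List, Optional, Tuple

# Hypothetical values chosen only to keep the numbers readable.
MIN_GENESIS_TIME = 10_000
GENESIS_DELAY = 1_000
MIN_GENESIS_ACTIVE_VALIDATOR_COUNT = 3


def find_genesis_time(eth1_blocks: List[Tuple[int, int]]) -> Optional[int]:
    """Return the genesis time set by the first qualifying Eth1 block, if any."""
    for timestamp, validator_count in eth1_blocks:
        genesis_time = timestamp + GENESIS_DELAY
        if genesis_time >= MIN_GENESIS_TIME and validator_count >= MIN_GENESIS_ACTIVE_VALIDATOR_COUNT:
            return genesis_time
    return None


# Case 1: enough deposits arrive early, so timing is the binding condition.
early = find_genesis_time([(8_000, 3), (8_990, 4), (9_004, 5)])
# Case 2: deposits arrive late, so the validator count is the binding condition.
late = find_genesis_time([(9_004, 1), (12_000, 2), (12_500, 3)])
```

In the first case genesis lands at 10004, a few seconds after MIN_GENESIS_TIME, and picks up all five validators. In the second it lands at 13500, GENESIS_DELAY seconds after the block that brought in the third validator.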

Genesis state

Let genesis_state = candidate_state whenever is_valid_genesis_state(candidate_state) is True for the first time.

def is_valid_genesis_state(state: BeaconState) -> bool:
    if state.genesis_time < MIN_GENESIS_TIME:
        return False
    if len(get_active_validator_indices(state, GENESIS_EPOCH)) < MIN_GENESIS_ACTIVE_VALIDATOR_COUNT:
        return False
    return True

This function simply checks the criteria above. The beacon nodes continually prepare candidate beacon genesis states until this function returns True. The genesis event will take place at least GENESIS_DELAY seconds later, using the genesis state that first flips this function's output to True.

We keep on adding new validators while this function returns False. That is, while we are more than GENESIS_DELAY seconds before MIN_GENESIS_TIME, or while we don't yet have MIN_GENESIS_ACTIVE_VALIDATOR_COUNT validators. Thus the total number of genesis validators can't necessarily be known ahead of time.

At the moment when this function first returns True, we are then able to calculate the exact genesis time, the genesis block, and the genesis state root. The GENESIS_DELAY is designed to allow node operators time to verify these parameters between themselves (everyone should agree!), and to configure any non-validating nodes, such as boot nodes, with these quantities so that they do not need to rely on an Eth1 node. (But nodes with validators always need to access an Eth1 node.)

Note: The is_valid_genesis_state function (including MIN_GENESIS_TIME and MIN_GENESIS_ACTIVE_VALIDATOR_COUNT) is a placeholder for testing. It has yet to be finalized by the community, and can be updated as necessary.

Genesis block

Let genesis_block = BeaconBlock(state_root=hash_tree_root(genesis_state)).

This is not explicitly used elsewhere in the spec. However, it is what the block in the first slot should reference as its "parent". This can be seen in the process_slot() function, where, if state.latest_block_header.state_root is empty it is replaced by the actual state root. This can happen only during the first slot.

Beacon chain state transition function

TODO

The post-state corresponding to a pre-state state and a signed block signed_block is defined as state_transition(state, signed_block). State transitions that trigger an unhandled exception (e.g. a failed assert or an out-of-range list access) are considered invalid. State transitions that cause a uint64 overflow or underflow are also considered invalid.

The use of asserts in the spec as-is is a little weird, and slightly controversial. The essential thing to remember is that, if you hit an assert at any point while processing a block, the whole transition needs to be aborted and reset to the original state. With the spec as written, this potentially means undoing already done state updates, or keeping a copy of the former state around to revert to if necessary.
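
One way to handle this is the "keep a copy" option, sketched below on a toy state. This is an assumption about how a client might structure things, not something the spec prescribes: ToyState, toy_transition() and apply_block() are all hypothetical names, and a real client would likely use something cheaper than a deep copy.

```python
import copy
from dataclasses import dataclass, field
from typing import List


@dataclass
class ToyState:
    slot: int = 0
    balances: List[int] = field(default_factory=list)


def toy_transition(state: ToyState, new_slot: int, deltas: List[int]) -> None:
    """Hypothetical stand-in for state_transition(); mutates state in place."""
    state.slot = new_slot
    for i, delta in enumerate(deltas):
        state.balances[i] += delta
        assert state.balances[i] >= 0  # a failed assert invalidates the block


def apply_block(state: ToyState, new_slot: int, deltas: List[int]) -> ToyState:
    """Run the transition on a copy; return the original state if it fails."""
    candidate = copy.deepcopy(state)
    try:
        toy_transition(candidate, new_slot, deltas)
    except (AssertionError, IndexError):
        return state  # the pre-state is untouched, so nothing needs undoing
    return candidate


pre = ToyState(slot=1, balances=[10, 10])
ok = apply_block(pre, 2, [5, -3])
bad = apply_block(pre, 2, [5, -30])
```

The second call fails part-way through, after the slot and the first balance have already been changed on the copy, yet the caller still gets back the unmodified pre-state.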

def state_transition(state: BeaconState, signed_block: SignedBeaconBlock, validate_result: bool=True) -> BeaconState:
    block = signed_block.message
    # Process slots (including those with no blocks) since block
    process_slots(state, block.slot)
    # Verify signature
    if validate_result:
        assert verify_block_signature(state, signed_block)
    # Process block
    process_block(state, block)
    # Verify state root
    if validate_result:
        assert block.state_root == hash_tree_root(state)
    # Return post-state
    return state

Beacon chain state is advanced every slot, with extra processing at the end of each epoch. However, state updates are driven by the receipt of valid blocks. Although, in the ideal case, there is a block for every slot, in practice one or more slots can be skipped, for example if the proposers are offline. (If the beacon node is serving validators, however, it will need to keep its state up to date irrespective of whether blocks arrive or not.)

So the first thing that happens in the state transition function is that the state is brought up to date with respect to the block received (the comment in the code seems inaccurate).

[TODO - complete this thought] In principle, a beacon node can hang around doing nothing and just catch up when a block comes. If no blocks are received for a number of slots, nothing happens. When a block is finally received, all slots and epochs are processed up to date.
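
The catch-up behaviour can be illustrated by counting how many epoch transitions a late block triggers. This sketch follows the loop structure of process_slots() but replaces the real per-slot and per-epoch work with comments; SLOTS_PER_EPOCH is an assumed value and catch_up() is a hypothetical name.

```python
SLOTS_PER_EPOCH = 32  # assumed value, not stated in this section


def catch_up(state_slot: int, block_slot: int) -> int:
    """Advance from state_slot to block_slot; return the epoch transitions run."""
    assert state_slot < block_slot
    epochs_processed = 0
    while state_slot < block_slot:
        # process_slot(state) would cache state and block roots here.
        if (state_slot + 1) % SLOTS_PER_EPOCH == 0:
            epochs_processed += 1  # process_epoch(state) would run here
        state_slot += 1
    return epochs_processed


# A block arrives for slot 100 while the state is still at slot 30.
epochs = catch_up(30, 100)
```

In this example three epoch transitions (at the ends of slots 31, 63 and 95) are processed in one go before the block itself is applied.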

[TODO: explain validate_result.]

def verify_block_signature(state: BeaconState, signed_block: SignedBeaconBlock) -> bool:
    proposer = state.validators[signed_block.message.proposer_index]
    signing_root = compute_signing_root(signed_block.message, get_domain(state, DOMAIN_BEACON_PROPOSER))
    return bls.Verify(proposer.pubkey, signing_root, signed_block.signature)

Simply checks that the signature on the block matches the block's contents and the public key of the claimed proposer of the block. This ensures that blocks cannot be forged or tampered with in transit. All the public keys for validators are stored in the Validators list in state. See domain types for DOMAIN_BEACON_PROPOSER.

def process_slots(state: BeaconState, slot: Slot) -> None:
    assert state.slot < slot
    while state.slot < slot:
        process_slot(state)
        # Process epoch on the start slot of the next epoch
        if (state.slot + 1) % SLOTS_PER_EPOCH == 0:
            process_epoch(state)
        state.slot = Slot(state.slot + 1)

TODO

def process_slot(state: BeaconState) -> None:
    # Cache state root
    previous_state_root = hash_tree_root(state)
    state.state_roots[state.slot % SLOTS_PER_HISTORICAL_ROOT] = previous_state_root
    # Cache latest block header state root
    if state.latest_block_header.state_root == Bytes32():
        state.latest_block_header.state_root = previous_state_root
    # Cache block root
    previous_block_root = hash_tree_root(state.latest_block_header)
    state.block_roots[state.slot % SLOTS_PER_HISTORICAL_ROOT] = previous_block_root

TODO

Epoch processing

def process_epoch(state: BeaconState) -> None:
    process_justification_and_finalization(state)
    process_rewards_and_penalties(state)
    process_registry_updates(state)
    process_slashings(state)
    process_final_updates(state)

TODO

Helper functions

def get_matching_source_attestations(state: BeaconState, epoch: Epoch) -> Sequence[PendingAttestation]:
    assert epoch in (get_previous_epoch(state), get_current_epoch(state))
    return state.current_epoch_attestations if epoch == get_current_epoch(state) else state.previous_epoch_attestations

TODO

def get_matching_target_attestations(state: BeaconState, epoch: Epoch) -> Sequence[PendingAttestation]:
    return [
        a for a in get_matching_source_attestations(state, epoch)
        if a.data.target.root == get_block_root(state, epoch)
    ]

TODO

def get_matching_head_attestations(state: BeaconState, epoch: Epoch) -> Sequence[PendingAttestation]:
    return [
        a for a in get_matching_target_attestations(state, epoch)
        if a.data.beacon_block_root == get_block_root_at_slot(state, a.data.slot)
    ]

TODO

def get_unslashed_attesting_indices(state: BeaconState,
                                    attestations: Sequence[PendingAttestation]) -> Set[ValidatorIndex]:
    output = set()  # type: Set[ValidatorIndex]
    for a in attestations:
        output = output.union(get_attesting_indices(state, a.data, a.aggregation_bits))
    return set(filter(lambda index: not state.validators[index].slashed, output))

TODO

def get_attesting_balance(state: BeaconState, attestations: Sequence[PendingAttestation]) -> Gwei:
    """
    Return the combined effective balance of the set of unslashed validators participating in ``attestations``.
    Note: ``get_total_balance`` returns ``EFFECTIVE_BALANCE_INCREMENT`` Gwei minimum to avoid divisions by zero.
    """
    return get_total_balance(state, get_unslashed_attesting_indices(state, attestations))

TODO

Justification and finalization

def process_justification_and_finalization(state: BeaconState) -> None:
    if get_current_epoch(state) <= GENESIS_EPOCH + 1:
        return

    previous_epoch = get_previous_epoch(state)
    current_epoch = get_current_epoch(state)
    old_previous_justified_checkpoint = state.previous_justified_checkpoint
    old_current_justified_checkpoint = state.current_justified_checkpoint

    # Process justifications
    state.previous_justified_checkpoint = state.current_justified_checkpoint
    state.justification_bits[1:] = state.justification_bits[:JUSTIFICATION_BITS_LENGTH - 1]
    state.justification_bits[0] = 0b0
    matching_target_attestations = get_matching_target_attestations(state, previous_epoch)  # Previous epoch
    if get_attesting_balance(state, matching_target_attestations) * 3 >= get_total_active_balance(state) * 2:
        state.current_justified_checkpoint = Checkpoint(epoch=previous_epoch,
                                                        root=get_block_root(state, previous_epoch))
        state.justification_bits[1] = 0b1
    matching_target_attestations = get_matching_target_attestations(state, current_epoch)  # Current epoch
    if get_attesting_balance(state, matching_target_attestations) * 3 >= get_total_active_balance(state) * 2:
        state.current_justified_checkpoint = Checkpoint(epoch=current_epoch,
                                                        root=get_block_root(state, current_epoch))
        state.justification_bits[0] = 0b1

    # Process finalizations
    bits = state.justification_bits
    # The 2nd/3rd/4th most recent epochs are justified, the 2nd using the 4th as source
    if all(bits[1:4]) and old_previous_justified_checkpoint.epoch + 3 == current_epoch:
        state.finalized_checkpoint = old_previous_justified_checkpoint
    # The 2nd/3rd most recent epochs are justified, the 2nd using the 3rd as source
    if all(bits[1:3]) and old_previous_justified_checkpoint.epoch + 2 == current_epoch:
        state.finalized_checkpoint = old_previous_justified_checkpoint
    # The 1st/2nd/3rd most recent epochs are justified, the 1st using the 3rd as source
    if all(bits[0:3]) and old_current_justified_checkpoint.epoch + 2 == current_epoch:
        state.finalized_checkpoint = old_current_justified_checkpoint
    # The 1st/2nd most recent epochs are justified, the 1st using the 2nd as source
    if all(bits[0:2]) and old_current_justified_checkpoint.epoch + 1 == current_epoch:
        state.finalized_checkpoint = old_current_justified_checkpoint

TODO

Rewards and penalties

Helpers

def get_base_reward(state: BeaconState, index: ValidatorIndex) -> Gwei:
    total_balance = get_total_active_balance(state)
    effective_balance = state.validators[index].effective_balance
    return Gwei(effective_balance * BASE_REWARD_FACTOR // integer_squareroot(total_balance) // BASE_REWARDS_PER_EPOCH)

TODO

def get_proposer_reward(state: BeaconState, attesting_index: ValidatorIndex) -> Gwei:
    return Gwei(get_base_reward(state, attesting_index) // PROPOSER_REWARD_QUOTIENT)

TODO

def get_finality_delay(state: BeaconState) -> uint64:
    return get_previous_epoch(state) - state.finalized_checkpoint.epoch

TODO

def is_in_inactivity_leak(state: BeaconState) -> bool:
    return get_finality_delay(state) > MIN_EPOCHS_TO_INACTIVITY_PENALTY

TODO

def get_eligible_validator_indices(state: BeaconState) -> Sequence[ValidatorIndex]:
    previous_epoch = get_previous_epoch(state)
    return [
        ValidatorIndex(index) for index, v in enumerate(state.validators)
        if is_active_validator(v, previous_epoch) or (v.slashed and previous_epoch + 1 < v.withdrawable_epoch)
    ]

TODO

def get_attestation_component_deltas(state: BeaconState,
                                     attestations: Sequence[PendingAttestation]
                                     ) -> Tuple[Sequence[Gwei], Sequence[Gwei]]:
    """
    Helper with shared logic for use by get source, target, and head deltas functions
    """
    rewards = [Gwei(0)] * len(state.validators)
    penalties = [Gwei(0)] * len(state.validators)
    total_balance = get_total_active_balance(state)
    unslashed_attesting_indices = get_unslashed_attesting_indices(state, attestations)
    attesting_balance = get_total_balance(state, unslashed_attesting_indices)
    for index in get_eligible_validator_indices(state):
        if index in unslashed_attesting_indices:
            increment = EFFECTIVE_BALANCE_INCREMENT  # Factored out from balance totals to avoid uint64 overflow
            if is_in_inactivity_leak(state):
                # Since full base reward will be canceled out by inactivity penalty deltas,
                # optimal participation receives full base reward compensation here.
                rewards[index] += get_base_reward(state, index)
            else:
                reward_numerator = get_base_reward(state, index) * (attesting_balance // increment)
                rewards[index] += reward_numerator // (total_balance // increment)
        else:
            penalties[index] += get_base_reward(state, index)
    return rewards, penalties

TODO

Components of attestation deltas

def get_source_deltas(state: BeaconState) -> Tuple[Sequence[Gwei], Sequence[Gwei]]:
    """
    Return attester micro-rewards/penalties for source-vote for each validator.
    """
    matching_source_attestations = get_matching_source_attestations(state, get_previous_epoch(state))
    return get_attestation_component_deltas(state, matching_source_attestations)

TODO

def get_target_deltas(state: BeaconState) -> Tuple[Sequence[Gwei], Sequence[Gwei]]:
    """
    Return attester micro-rewards/penalties for target-vote for each validator.
    """
    matching_target_attestations = get_matching_target_attestations(state, get_previous_epoch(state))
    return get_attestation_component_deltas(state, matching_target_attestations)

TODO

def get_head_deltas(state: BeaconState) -> Tuple[Sequence[Gwei], Sequence[Gwei]]:
    """
    Return attester micro-rewards/penalties for head-vote for each validator.
    """
    matching_head_attestations = get_matching_head_attestations(state, get_previous_epoch(state))
    return get_attestation_component_deltas(state, matching_head_attestations)

TODO

def get_inclusion_delay_deltas(state: BeaconState) -> Tuple[Sequence[Gwei], Sequence[Gwei]]:
    """
    Return proposer and inclusion delay micro-rewards/penalties for each validator.
    """
    rewards = [Gwei(0) for _ in range(len(state.validators))]
    matching_source_attestations = get_matching_source_attestations(state, get_previous_epoch(state))
    for index in get_unslashed_attesting_indices(state, matching_source_attestations):
        attestation = min([
            a for a in matching_source_attestations
            if index in get_attesting_indices(state, a.data, a.aggregation_bits)
        ], key=lambda a: a.inclusion_delay)
        rewards[attestation.proposer_index] += get_proposer_reward(state, index)
        max_attester_reward = get_base_reward(state, index) - get_proposer_reward(state, index)
        rewards[index] += Gwei(max_attester_reward // attestation.inclusion_delay)

    # No penalties associated with inclusion delay
    penalties = [Gwei(0) for _ in range(len(state.validators))]
    return rewards, penalties

TODO

def get_inactivity_penalty_deltas(state: BeaconState) -> Tuple[Sequence[Gwei], Sequence[Gwei]]:
    """
    Return inactivity reward/penalty deltas for each validator.
    """
    penalties = [Gwei(0) for _ in range(len(state.validators))]
    if is_in_inactivity_leak(state):
        matching_target_attestations = get_matching_target_attestations(state, get_previous_epoch(state))
        matching_target_attesting_indices = get_unslashed_attesting_indices(state, matching_target_attestations)
        for index in get_eligible_validator_indices(state):
            # If validator is performing optimally this cancels all rewards for a neutral balance
            base_reward = get_base_reward(state, index)
            penalties[index] += Gwei(BASE_REWARDS_PER_EPOCH * base_reward - get_proposer_reward(state, index))
            if index not in matching_target_attesting_indices:
                effective_balance = state.validators[index].effective_balance
                penalties[index] += Gwei(effective_balance * get_finality_delay(state) // INACTIVITY_PENALTY_QUOTIENT)

    # No rewards associated with inactivity penalties
    rewards = [Gwei(0) for _ in range(len(state.validators))]
    return rewards, penalties

TODO

get_attestation_deltas

def get_attestation_deltas(state: BeaconState) -> Tuple[Sequence[Gwei], Sequence[Gwei]]:
    """
    Return attestation reward/penalty deltas for each validator.
    """
    source_rewards, source_penalties = get_source_deltas(state)
    target_rewards, target_penalties = get_target_deltas(state)
    head_rewards, head_penalties = get_head_deltas(state)
    inclusion_delay_rewards, _ = get_inclusion_delay_deltas(state)
    _, inactivity_penalties = get_inactivity_penalty_deltas(state)

    rewards = [
        source_rewards[i] + target_rewards[i] + head_rewards[i] + inclusion_delay_rewards[i]
        for i in range(len(state.validators))
    ]

    penalties = [
        source_penalties[i] + target_penalties[i] + head_penalties[i] + inactivity_penalties[i]
        for i in range(len(state.validators))
    ]

    return rewards, penalties

TODO

Fun fact: my colleague, Herman Junge, qualified for the first Eth2 bug bounty for discovering a potential arithmetic overflow in a previous version of this function.

process_rewards_and_penalties

def process_rewards_and_penalties(state: BeaconState) -> None:
    if get_current_epoch(state) == GENESIS_EPOCH:
        return

    rewards, penalties = get_attestation_deltas(state)
    for index in range(len(state.validators)):
        increase_balance(state, ValidatorIndex(index), rewards[index])
        decrease_balance(state, ValidatorIndex(index), penalties[index])

TODO

Registry updates

def process_registry_updates(state: BeaconState) -> None:
    # Process activation eligibility and ejections
    for index, validator in enumerate(state.validators):
        if is_eligible_for_activation_queue(validator):
            validator.activation_eligibility_epoch = get_current_epoch(state) + 1

        if is_active_validator(validator, get_current_epoch(state)) and validator.effective_balance <= EJECTION_BALANCE:
            initiate_validator_exit(state, ValidatorIndex(index))

    # Queue validators eligible for activation and not yet dequeued for activation
    activation_queue = sorted([
        index for index, validator in enumerate(state.validators)
        if is_eligible_for_activation(state, validator)
        # Order by the sequence of activation_eligibility_epoch setting and then index
    ], key=lambda index: (state.validators[index].activation_eligibility_epoch, index))
    # Dequeued validators for activation up to churn limit
    for index in activation_queue[:get_validator_churn_limit(state)]:
        validator = state.validators[index]
        validator.activation_epoch = compute_activation_exit_epoch(get_current_epoch(state))

TODO

Slashings

def process_slashings(state: BeaconState) -> None:
    epoch = get_current_epoch(state)
    total_balance = get_total_active_balance(state)
    for index, validator in enumerate(state.validators):
        if validator.slashed and epoch + EPOCHS_PER_SLASHINGS_VECTOR // 2 == validator.withdrawable_epoch:
            increment = EFFECTIVE_BALANCE_INCREMENT  # Factored out from penalty numerator to avoid uint64 overflow
            penalty_numerator = validator.effective_balance // increment * min(sum(state.slashings) * 3, total_balance)
            penalty = penalty_numerator // total_balance * increment
            decrease_balance(state, ValidatorIndex(index), penalty)

TODO
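A worked example may help with the arithmetic above. The sketch below repeats the penalty calculation with plain integers, assuming EFFECTIVE_BALANCE_INCREMENT is 10**9 Gwei (1 ETH); `correlated_penalty` is a hypothetical name. It also checks the comment about overflow: multiplying a full 32 ETH effective balance by a large total balance would exceed 2**64, while the factored form stays within range.

```python
EFFECTIVE_BALANCE_INCREMENT = 10**9  # Gwei; assumed configuration value


def correlated_penalty(effective_balance: int, slashings_sum: int, total_balance: int) -> int:
    """Hypothetical helper repeating the penalty arithmetic from process_slashings."""
    increment = EFFECTIVE_BALANCE_INCREMENT
    # The slashed total is tripled, but capped at the total active balance
    adjusted = min(slashings_sum * 3, total_balance)
    penalty_numerator = effective_balance // increment * adjusted
    # Integer division rounds the result down to a whole number of increments
    return penalty_numerator // total_balance * increment


ETH = 10**9
total = 1_000_000 * ETH  # one million ETH staked

# 10% of stake slashed in the window: exact share is 9.6 ETH, rounded down to 9
partial = correlated_penalty(32 * ETH, 100_000 * ETH, total)
# 40% slashed: the cap applies and the whole effective balance is taken
full = correlated_penalty(32 * ETH, 400_000 * ETH, total)

# Without factoring out the increment the product would not fit in a uint64
unfactored_product = (32 * ETH) * total
factored_product = 32 * total
```

Note that the rounding always favours the validator: the penalty is a whole multiple of the increment, never more than the proportional share.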

Final updates

def process_final_updates(state: BeaconState) -> None:
    current_epoch = get_current_epoch(state)
    next_epoch = Epoch(current_epoch + 1)
    # Reset eth1 data votes
    if next_epoch % EPOCHS_PER_ETH1_VOTING_PERIOD == 0:
        state.eth1_data_votes = []
    # Update effective balances with hysteresis
    for index, validator in enumerate(state.validators):
        balance = state.balances[index]
        HYSTERESIS_INCREMENT = EFFECTIVE_BALANCE_INCREMENT // HYSTERESIS_QUOTIENT
        DOWNWARD_THRESHOLD = HYSTERESIS_INCREMENT * HYSTERESIS_DOWNWARD_MULTIPLIER
        UPWARD_THRESHOLD = HYSTERESIS_INCREMENT * HYSTERESIS_UPWARD_MULTIPLIER
        if (
            balance + DOWNWARD_THRESHOLD < validator.effective_balance
            or validator.effective_balance + UPWARD_THRESHOLD < balance
        ):
            validator.effective_balance = min(balance - balance % EFFECTIVE_BALANCE_INCREMENT, MAX_EFFECTIVE_BALANCE)
    # Reset slashings
    state.slashings[next_epoch % EPOCHS_PER_SLASHINGS_VECTOR] = Gwei(0)
    # Set randao mix
    state.randao_mixes[next_epoch % EPOCHS_PER_HISTORICAL_VECTOR] = get_randao_mix(state, current_epoch)
    # Set historical root accumulator
    if next_epoch % (SLOTS_PER_HISTORICAL_ROOT // SLOTS_PER_EPOCH) == 0:
        historical_batch = HistoricalBatch(block_roots=state.block_roots, state_roots=state.state_roots)
        state.historical_roots.append(hash_tree_root(historical_batch))
    # Rotate current/previous epoch attestations
    state.previous_epoch_attestations = state.current_epoch_attestations
    state.current_epoch_attestations = []

TODO
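The hysteresis rule above is easier to follow with numbers. This sketch assumes the mainnet-style constants shown in the code (quotient 4, downward multiplier 1, upward multiplier 5), which give thresholds of 0.25 ETH downward and 1.25 ETH upward; `updated_effective_balance` is a hypothetical name.

```python
# Assumed configuration values, in Gwei where applicable
EFFECTIVE_BALANCE_INCREMENT = 10**9
MAX_EFFECTIVE_BALANCE = 32 * 10**9
HYSTERESIS_QUOTIENT = 4
HYSTERESIS_DOWNWARD_MULTIPLIER = 1
HYSTERESIS_UPWARD_MULTIPLIER = 5


def updated_effective_balance(balance: int, effective_balance: int) -> int:
    """Hypothetical helper: return the effective balance after the epoch update."""
    hysteresis_increment = EFFECTIVE_BALANCE_INCREMENT // HYSTERESIS_QUOTIENT
    downward_threshold = hysteresis_increment * HYSTERESIS_DOWNWARD_MULTIPLIER  # 0.25 ETH
    upward_threshold = hysteresis_increment * HYSTERESIS_UPWARD_MULTIPLIER      # 1.25 ETH
    if (
        balance + downward_threshold < effective_balance
        or effective_balance + upward_threshold < balance
    ):
        # Round down to a whole increment and cap at the maximum
        return min(balance - balance % EFFECTIVE_BALANCE_INCREMENT, MAX_EFFECTIVE_BALANCE)
    return effective_balance


# Balance dips to 31.8 ETH: within 0.25 ETH of 32, so no change
small_dip = updated_effective_balance(31_800_000_000, 32_000_000_000)
# Balance dips to 31.74 ETH: more than 0.25 ETH below, so drop to 31
large_dip = updated_effective_balance(31_740_000_000, 32_000_000_000)
# Balance recovers to 32.2 ETH: not yet 1.25 ETH above 31, so no change
small_rise = updated_effective_balance(32_200_000_000, 31_000_000_000)
# Balance reaches 32.26 ETH: just over the upward threshold, so back to 32
large_rise = updated_effective_balance(32_260_000_000, 31_000_000_000)
```

The asymmetry means an effective balance falls quickly but needs a clear margin before it rises again, which limits how often it changes.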

Block processing

def process_block(state: BeaconState, block: BeaconBlock) -> None:
    process_block_header(state, block)
    process_randao(state, block.body)
    process_eth1_data(state, block.body)
    process_operations(state, block.body)

TODO

Block header

def process_block_header(state: BeaconState, block: BeaconBlock) -> None:
    # Verify that the slots match
    assert block.slot == state.slot
    # Verify that the block is newer than latest block header
    assert block.slot > state.latest_block_header.slot
    # Verify that proposer index is the correct index
    assert block.proposer_index == get_beacon_proposer_index(state)
    # Verify that the parent matches
    assert block.parent_root == hash_tree_root(state.latest_block_header)
    # Cache current block as the new latest block
    state.latest_block_header = BeaconBlockHeader(
        slot=block.slot,
        proposer_index=block.proposer_index,
        parent_root=block.parent_root,
        state_root=Bytes32(),  # Overwritten in the next process_slot call
        body_root=hash_tree_root(block.body),
    )

    # Verify proposer is not slashed
    proposer = state.validators[block.proposer_index]
    assert not proposer.slashed

TODO

RANDAO

def process_randao(state: BeaconState, body: BeaconBlockBody) -> None:
    epoch = get_current_epoch(state)
    # Verify RANDAO reveal
    proposer = state.validators[get_beacon_proposer_index(state)]
    signing_root = compute_signing_root(epoch, get_domain(state, DOMAIN_RANDAO))
    assert bls.Verify(proposer.pubkey, signing_root, body.randao_reveal)
    # Mix in RANDAO reveal
    mix = xor(get_randao_mix(state, epoch), hash(body.randao_reveal))
    state.randao_mixes[epoch % EPOCHS_PER_HISTORICAL_VECTOR] = mix

TODO
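The mixing step above can be tried out on its own. This is a minimal sketch that assumes the spec's hash is SHA-256 and uses placeholder byte strings where real BLS signatures would be; `mix_in_reveal` is a hypothetical name, and signature verification is deliberately left out.

```python
import hashlib


def xor(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def mix_in_reveal(current_mix: bytes, randao_reveal: bytes) -> bytes:
    """Hypothetical helper: fold the hash of a reveal into the current mix."""
    return xor(current_mix, hashlib.sha256(randao_reveal).digest())


mix0 = bytes(32)
reveal_a = b"\x01" * 96  # placeholder bytes, not a real BLS signature
reveal_b = b"\x02" * 96  # placeholder bytes, not a real BLS signature

# XOR is commutative, so the order in which reveals arrive does not matter
mix_ab = mix_in_reveal(mix_in_reveal(mix0, reveal_a), reveal_b)
mix_ba = mix_in_reveal(mix_in_reveal(mix0, reveal_b), reveal_a)

# Mixing the same reveal twice cancels out
round_trip = mix_in_reveal(mix_in_reveal(mix0, reveal_a), reveal_a)
```

Hashing the reveal first gives a 32-byte value regardless of the signature length, so it lines up with the 32-byte mix.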

Eth1 data

def process_eth1_data(state: BeaconState, body: BeaconBlockBody) -> None:
    state.eth1_data_votes.append(body.eth1_data)
    if state.eth1_data_votes.count(body.eth1_data) * 2 > EPOCHS_PER_ETH1_VOTING_PERIOD * SLOTS_PER_EPOCH:
        state.eth1_data = body.eth1_data

TODO
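One detail of the condition above is worth spelling out: the threshold is more than half of the slots in the whole voting period, not more than half of the votes cast so far. The sketch below assumes 32 epochs per voting period and 32 slots per epoch (1024 slots in total); `has_majority` is a hypothetical name and the vote values are plain strings for illustration.

```python
from typing import List

EPOCHS_PER_ETH1_VOTING_PERIOD = 32  # assumed configuration value
SLOTS_PER_EPOCH = 32                # assumed configuration value


def has_majority(votes: List[str], candidate: str) -> bool:
    """Hypothetical helper: does `candidate` hold more than half the period's slots?"""
    slots_in_period = EPOCHS_PER_ETH1_VOTING_PERIOD * SLOTS_PER_EPOCH
    return votes.count(candidate) * 2 > slots_in_period


# Exactly half of the 1024 slots is not enough
exactly_half = has_majority(["a"] * 512, "a")
# 513 votes is a strict majority, whatever else was voted for
strict_majority = has_majority(["a"] * 513 + ["b"] * 100, "a")
# Unanimous early votes do not count until they pass the threshold
early_unanimous = has_majority(["a"] * 200, "a")
```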

Operations

def process_operations(state: BeaconState, body: BeaconBlockBody) -> None:
    # Verify that outstanding deposits are processed up to the maximum number of deposits
    assert len(body.deposits) == min(MAX_DEPOSITS, state.eth1_data.deposit_count - state.eth1_deposit_index)

    def for_ops(operations: Sequence[Any], fn: Callable[[BeaconState, Any], None]) -> None:
        for operation in operations:
            fn(state, operation)

    for_ops(body.proposer_slashings, process_proposer_slashing)
    for_ops(body.attester_slashings, process_attester_slashing)
    for_ops(body.attestations, process_attestation)
    for_ops(body.deposits, process_deposit)
    for_ops(body.voluntary_exits, process_voluntary_exit)

TODO
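The first assertion above fixes exactly how many deposits a block must contain. A small sketch, assuming MAX_DEPOSITS is 16; `expected_deposit_count` is a hypothetical name.

```python
MAX_DEPOSITS = 16  # assumed configuration value


def expected_deposit_count(deposit_count: int, eth1_deposit_index: int) -> int:
    """Hypothetical helper: number of deposits a valid block must include."""
    outstanding = deposit_count - eth1_deposit_index
    return min(MAX_DEPOSITS, outstanding)


nothing_pending = expected_deposit_count(deposit_count=100, eth1_deposit_index=100)
few_pending = expected_deposit_count(deposit_count=100, eth1_deposit_index=95)
many_pending = expected_deposit_count(deposit_count=100, eth1_deposit_index=10)
```

Because the check is an equality, a proposer can neither skip outstanding deposits nor include more than the maximum.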

Proposer slashings

def process_proposer_slashing(state: BeaconState, proposer_slashing: ProposerSlashing) -> None:
    header_1 = proposer_slashing.signed_header_1.message
    header_2 = proposer_slashing.signed_header_2.message

    # Verify header slots match
    assert header_1.slot == header_2.slot
    # Verify header proposer indices match
    assert header_1.proposer_index == header_2.proposer_index
    # Verify the headers are different
    assert header_1 != header_2
    # Verify the proposer is slashable
    proposer = state.validators[header_1.proposer_index]
    assert is_slashable_validator(proposer, get_current_epoch(state))
    # Verify signatures
    for signed_header in (proposer_slashing.signed_header_1, proposer_slashing.signed_header_2):
        domain = get_domain(state, DOMAIN_BEACON_PROPOSER, compute_epoch_at_slot(signed_header.message.slot))
        signing_root = compute_signing_root(signed_header.message, domain)
        assert bls.Verify(proposer.pubkey, signing_root, signed_header.signature)

    slash_validator(state, header_1.proposer_index)

TODO

Attester slashings

def process_attester_slashing(state: BeaconState, attester_slashing: AttesterSlashing) -> None:
    attestation_1 = attester_slashing.attestation_1
    attestation_2 = attester_slashing.attestation_2
    assert is_slashable_attestation_data(attestation_1.data, attestation_2.data)
    assert is_valid_indexed_attestation(state, attestation_1)
    assert is_valid_indexed_attestation(state, attestation_2)

    slashed_any = False
    indices = set(attestation_1.attesting_indices).intersection(attestation_2.attesting_indices)
    for index in sorted(indices):
        if is_slashable_validator(state.validators[index], get_current_epoch(state)):
            slash_validator(state, index)
            slashed_any = True
    assert slashed_any

TODO
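The selection of validators to slash can be sketched separately from the signature checks. Only validators that appear in both attestations and are still slashable are affected. `slashable_in_both` is a hypothetical name, and the `is_slashable` callback stands in for is_slashable_validator.

```python
from typing import Callable, List, Sequence


def slashable_in_both(
    indices_1: Sequence[int],
    indices_2: Sequence[int],
    is_slashable: Callable[[int], bool],
) -> List[int]:
    """Hypothetical helper: indices that signed both attestations and can be slashed."""
    common = set(indices_1).intersection(indices_2)
    # Sorting makes the processing order deterministic
    return [index for index in sorted(common) if is_slashable(index)]


# Validators 4, 5 and 9 signed both; assume 5 has already been slashed
to_slash = slashable_in_both([1, 4, 5, 9], [4, 5, 8, 9], lambda index: index != 5)
# No overlap: the spec's final assertion would reject this operation
no_overlap = slashable_in_both([1, 2], [3, 4], lambda index: True)
```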

Attestations

def process_attestation(state: BeaconState, attestation: Attestation) -> None:
    data = attestation.data
    assert data.target.epoch in (get_previous_epoch(state), get_current_epoch(state))
    assert data.target.epoch == compute_epoch_at_slot(data.slot)
    assert data.slot + MIN_ATTESTATION_INCLUSION_DELAY <= state.slot <= data.slot + SLOTS_PER_EPOCH
    assert data.index < get_committee_count_per_slot(state, data.target.epoch)

    committee = get_beacon_committee(state, data.slot, data.index)
    assert len(attestation.aggregation_bits) == len(committee)

    pending_attestation = PendingAttestation(
        data=data,
        aggregation_bits=attestation.aggregation_bits,
        inclusion_delay=state.slot - data.slot,
        proposer_index=get_beacon_proposer_index(state),
    )

    if data.target.epoch == get_current_epoch(state):
        assert data.source == state.current_justified_checkpoint
        state.current_epoch_attestations.append(pending_attestation)
    else:
        assert data.source == state.previous_justified_checkpoint
        state.previous_epoch_attestations.append(pending_attestation)

    # Verify signature
    assert is_valid_indexed_attestation(state, get_indexed_attestation(state, attestation))

TODO
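The inclusion window in the third assertion can be checked with a few numbers. This sketch assumes MIN_ATTESTATION_INCLUSION_DELAY is 1 and SLOTS_PER_EPOCH is 32; `in_inclusion_window` is a hypothetical name.

```python
MIN_ATTESTATION_INCLUSION_DELAY = 1  # assumed configuration value
SLOTS_PER_EPOCH = 32                 # assumed configuration value


def in_inclusion_window(attestation_slot: int, state_slot: int) -> bool:
    """Hypothetical helper: may an attestation for `attestation_slot` be included now?"""
    earliest = attestation_slot + MIN_ATTESTATION_INCLUSION_DELAY
    latest = attestation_slot + SLOTS_PER_EPOCH
    return earliest <= state_slot <= latest


same_slot = in_inclusion_window(100, 100)   # too early
next_slot = in_inclusion_window(100, 101)   # earliest allowed
last_slot = in_inclusion_window(100, 132)   # latest allowed
too_late = in_inclusion_window(100, 133)    # one slot past the window
```

Under these assumptions an attestation has a window of 32 slots in which it can be included, starting one slot after the slot it attests to.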

Deposits

def get_validator_from_deposit(state: BeaconState, deposit: Deposit) -> Validator:
    amount = deposit.data.amount
    effective_balance = min(amount - amount % EFFECTIVE_BALANCE_INCREMENT, MAX_EFFECTIVE_BALANCE)

    return Validator(
        pubkey=deposit.data.pubkey,
        withdrawal_credentials=deposit.data.withdrawal_credentials,
        activation_eligibility_epoch=FAR_FUTURE_EPOCH,
        activation_epoch=FAR_FUTURE_EPOCH,
        exit_epoch=FAR_FUTURE_EPOCH,
        withdrawable_epoch=FAR_FUTURE_EPOCH,
        effective_balance=effective_balance,
    )

[TODO: new in v0.12.2]
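The effective balance calculation in get_validator_from_deposit rounds the deposit down to a whole increment and caps it. A quick sketch, assuming an increment of 1 ETH and a maximum of 32 ETH; `initial_effective_balance` is a hypothetical name.

```python
EFFECTIVE_BALANCE_INCREMENT = 10**9   # Gwei; assumed configuration value
MAX_EFFECTIVE_BALANCE = 32 * 10**9    # Gwei; assumed configuration value


def initial_effective_balance(amount: int) -> int:
    """Hypothetical helper: effective balance assigned to a new validator."""
    rounded_down = amount - amount % EFFECTIVE_BALANCE_INCREMENT
    return min(rounded_down, MAX_EFFECTIVE_BALANCE)


from_1_7_eth = initial_effective_balance(1_700_000_000)
from_32_5_eth = initial_effective_balance(32_500_000_000)
from_40_eth = initial_effective_balance(40_000_000_000)
below_one_eth = initial_effective_balance(999_999_999)
```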

def process_deposit(state: BeaconState, deposit: Deposit) -> None:
    # Verify the Merkle branch
    assert is_valid_merkle_branch(
        leaf=hash_tree_root(deposit.data),
        branch=deposit.proof,
        depth=DEPOSIT_CONTRACT_TREE_DEPTH + 1,  # Add 1 for the List length mix-in
        index=state.eth1_deposit_index,
        root=state.eth1_data.deposit_root,
    )

    # Deposits must be processed in order
    state.eth1_deposit_index += 1

    pubkey = deposit.data.pubkey
    amount = deposit.data.amount
    validator_pubkeys = [v.pubkey for v in state.validators]
    if pubkey not in validator_pubkeys:
        # Verify the deposit signature (proof of possession) which is not checked by the deposit contract
        deposit_message = DepositMessage(
            pubkey=deposit.data.pubkey,
            withdrawal_credentials=deposit.data.withdrawal_credentials,
            amount=deposit.data.amount,
        )
        domain = compute_domain(DOMAIN_DEPOSIT)  # Fork-agnostic domain since deposits are valid across forks
        signing_root = compute_signing_root(deposit_message, domain)
        if not bls.Verify(pubkey, signing_root, deposit.data.signature):
            return

        # Add validator and balance entries
        state.validators.append(get_validator_from_deposit(state, deposit))
        state.balances.append(amount)
    else:
        # Increase balance by deposit amount
        index = ValidatorIndex(validator_pubkeys.index(pubkey))
        increase_balance(state, index, amount)

TODO
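To give a feel for the Merkle branch check at the top of process_deposit, here is a toy version on a tree of four leaves. It is a sketch under assumptions: the verification loop is modelled on the spec's is_valid_merkle_branch helper, the hash is assumed to be SHA-256, and the leaves are arbitrary bytes rather than real deposit data roots.

```python
import hashlib
from typing import Sequence


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def is_valid_merkle_branch(
    leaf: bytes, branch: Sequence[bytes], depth: int, index: int, root: bytes
) -> bool:
    """Sketch modelled on the spec helper: hash up the tree from leaf to root."""
    value = leaf
    for i in range(depth):
        # Bit i of the index says whether we are the right or left child at level i
        if index // (2**i) % 2:
            value = sha256(branch[i] + value)
        else:
            value = sha256(value + branch[i])
    return value == root


# Toy tree with four leaves (depth 2)
leaves = [sha256(bytes([n])) for n in range(4)]
node_01 = sha256(leaves[0] + leaves[1])
node_23 = sha256(leaves[2] + leaves[3])
root = sha256(node_01 + node_23)

# The proof for leaf 2 is its sibling at each level, bottom up
proof_for_leaf_2 = [leaves[3], node_01]
valid = is_valid_merkle_branch(leaves[2], proof_for_leaf_2, depth=2, index=2, root=root)
# The same proof fails if the claimed index is wrong
wrong_index = is_valid_merkle_branch(leaves[2], proof_for_leaf_2, depth=2, index=3, root=root)
```

Using state.eth1_deposit_index as the index is what forces deposits to be processed strictly in order: a proof for any other position would not verify.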

Voluntary exits

def process_voluntary_exit(state: BeaconState, signed_voluntary_exit: SignedVoluntaryExit) -> None:
    voluntary_exit = signed_voluntary_exit.message
    validator = state.validators[voluntary_exit.validator_index]
    # Verify the validator is active
    assert is_active_validator(validator, get_current_epoch(state))
    # Verify exit has not been initiated
    assert validator.exit_epoch == FAR_FUTURE_EPOCH
    # Exits must specify an epoch when they become valid; they are not valid before then
    assert get_current_epoch(state) >= voluntary_exit.epoch
    # Verify the validator has been active long enough
    assert get_current_epoch(state) >= validator.activation_epoch + SHARD_COMMITTEE_PERIOD
    # Verify signature
    domain = get_domain(state, DOMAIN_VOLUNTARY_EXIT, voluntary_exit.epoch)
    signing_root = compute_signing_root(voluntary_exit, domain)
    assert bls.Verify(validator.pubkey, signing_root, signed_voluntary_exit.signature)
    # Initiate exit
    initiate_validator_exit(state, voluntary_exit.validator_index)

TODO
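The two epoch checks above can be illustrated with numbers. This sketch assumes SHARD_COMMITTEE_PERIOD is 256 epochs and ignores the activity, prior-exit and signature checks; `passes_epoch_checks` is a hypothetical name.

```python
SHARD_COMMITTEE_PERIOD = 256  # epochs; assumed configuration value


def passes_epoch_checks(current_epoch: int, activation_epoch: int, exit_message_epoch: int) -> bool:
    """Hypothetical helper covering only the two epoch conditions of a voluntary exit."""
    # The exit message is not valid before the epoch it names
    message_is_valid_now = current_epoch >= exit_message_epoch
    # The validator must have been active for a minimum period
    active_long_enough = current_epoch >= activation_epoch + SHARD_COMMITTEE_PERIOD
    return message_is_valid_now and active_long_enough


ready = passes_epoch_checks(current_epoch=300, activation_epoch=10, exit_message_epoch=290)
too_new = passes_epoch_checks(current_epoch=200, activation_epoch=10, exit_message_epoch=100)
post_dated = passes_epoch_checks(current_epoch=300, activation_epoch=10, exit_message_epoch=301)
```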