This work applies to an outdated version of the specification and is unmaintained. Please see my Upgrading Ethereum book for the latest, including an updated annotated specification.

Ethereum 2.0 Phase 0 -- The Beacon Chain

Welcome to the Ethereum 2.0 Annotated Specification!

All the text that appears with a side-bar like this is my added commentary; everything else is directly from the original specification document.

This document concerns the Phase 0 beacon chain state transition function. Think of it as "The Yellow Paper for the Beacon Chain". It's basically the brains of the operation. There's a good deal more to building a fully functioning client, such as networking, fork choice, and validator behaviour: other documents cover these. There are also common "standards" around components like SSZ and APIs. For all the rest of it (databases, architecture, optimisations, metrics, Eth1 data handling, ...), the best source is probably a client implementation.

Other useful resources:

Vitalik's annotated specification

Phase 0 for Humans [v0.10.0]

Serenity Design Rationale

Phase 0 design notes (Justin Drake)

eth2.info

Ethereum 2.0 Devs Handbook and FAQs

Formal beacon chain specification in Dafny

Introduction

This document represents the specification for Phase 0 of Ethereum 2.0 -- The Beacon Chain.

Phase 0 of Ethereum 2.0 (Eth2) is concerned only with the beacon chain. It implements a self-sustaining Proof of Stake network, but not much more. Phase 1 will bring sharding of blockchain data, and Phase 2 will bring execution engines that consume that data.

At the core of Ethereum 2.0 is a system chain called the "beacon chain". The beacon chain stores and manages the registry of validators. In the initial deployment phases of Ethereum 2.0, the only mechanism to become a validator is to make a one-way ETH transaction to a deposit contract on Ethereum 1.0. Activation as a validator happens when Ethereum 1.0 deposit receipts are processed by the beacon chain, the activation balance is reached, and a queuing process is completed. Exit is either voluntary or done forcibly as a penalty for misbehavior. The primary source of load on the beacon chain is "attestations". Attestations are simultaneously availability votes for a shard block (Phase 1) and proof-of-stake votes for a beacon block (Phase 0).

Eth2 is not merely an upgrade to Ethereum 1.

The original plan had been to manage both sharding and proof of stake via smart contracts on the existing Ethereum Mainnet. Eventually, it became apparent that this approach would limit throughput and constrain the scope for innovation in Eth2. So, around June 2018, it was decided to implement Eth2 as a separate blockchain, with strong economic ties to Eth1, and to come up with a transition plan to eventually import Eth1 into the new Eth2 chain.

The actors in Phase 0 of Eth2 are beacon nodes and validators. Beacon nodes maintain the state of the beacon chain, manage validators, apply rewards and penalties, maintain randomness, and determine the canonical chain with finality. Validators propose beacon chain blocks, and vote for blocks by making attestations. That's pretty much it in Phase 0. In Phase 1, the role of validators will be expanded to deal with shards.

This Beacon Chain specification describes the beacon nodes only. Expected validator behaviour is described in a separate document which is not formally a specification: correct behaviour for validators is essentially inferred from the specification of the beacon chain.

Notation

Code snippets appearing in this style are to be interpreted as Python 3 code.

It's a little controversial, but in its final form, the Ethereum 2.0 specification is almost entirely made up of Python code. This has a couple of advantages, the greatest being that the specification is executable, meaning that test vectors can be generated directly from the spec document, and in the event of consensus failure, the spec can be executed as the final arbiter of which fork is correct. It is also more amenable to formal verification of various sorts. You can download the specification directly as a Python package, and run code with it.

Unfortunately, a downside of all this is that all narrative and intuition has been removed: it is difficult for normal human beings to understand the spec only by reading the spec. (Though there are recent signs of this being reversed.)

Anyway, I'm here to help. Welcome to The Annotated Spec 😄

Custom types

Right at the top of the spec we find the important concepts laid out. Each of the types in the following table relates to something fundamental about the construction of the Ethereum 2.0 beacon chain.

SSZ is the encoding method used to pass data between clients and is described in a separate specification. Here it can be thought of as just a data type.

All integers are unsigned 64 bit numbers throughout the spec. We fought on the side of signedness in the integer wars of February 2019 but lost 😢. This means that preserving the order of operations is critical in some parts of the spec to avoid inadvertently underflowing.
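To make the order-of-operations point concrete, here is a minimal sketch that emulates checked uint64 arithmetic with plain Python integers. The helper names `checked_add` and `checked_sub` and the sample values are hypothetical, chosen only for illustration; they are not taken from the spec.

```python
UINT64_MAX = 2**64 - 1


def checked_sub(a: int, b: int) -> int:
    """Subtract as a uint64 would: raise instead of going negative."""
    if b > a:
        raise ValueError("uint64 underflow")
    return a - b


def checked_add(a: int, b: int) -> int:
    """Add as a uint64 would: raise instead of exceeding 2**64 - 1."""
    if a + b > UINT64_MAX:
        raise ValueError("uint64 overflow")
    return a + b


# Hypothetical values: the final result (2) is representable either way,
# but only one ordering keeps the intermediate value non-negative.
balance, penalty, reward = 5, 7, 4

# Adding first: 5 + 4 = 9, then 9 - 7 = 2.
safe = checked_sub(checked_add(balance, reward), penalty)

# Subtracting first passes through 5 - 7, which a uint64 cannot hold.
try:
    checked_sub(balance, penalty)
    underflowed = False
except ValueError:
    underflowed = True
```

Both orderings are mathematically equivalent over the integers; only the intermediate values differ, which is exactly what matters with unsigned types.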

We define the following Python custom types for type hinting and readability:

I'm just going to give a brief intro here to each of these; we'll be seeing a lot more of them later.

Name	SSZ equivalent	Description
`Slot`	`uint64`	a slot number

Time is divided into slots of 12 seconds. Exactly one beacon chain block is supposed to be proposed per slot, by a validator randomly selected to do so.


`Epoch`	`uint64`	an epoch number

Thirty-two slots make an epoch.

Epoch boundaries are the points at which the chain can be justified and finalised (by the Casper FFG mechanism), and they are also the points at which validator balances are updated, validator committees get shuffled, and validator exits, entries, and slashings are processed. That is, the main state-transition work is performed per epoch, not per slot.
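The relationship between slots and epochs is just integer division. The sketch below uses plain Python integers in place of the `Slot` and `Epoch` types; the function names follow the spec's helper naming, while `is_epoch_start` is a hypothetical helper added for illustration.

```python
SECONDS_PER_SLOT = 12
SLOTS_PER_EPOCH = 32


def compute_epoch_at_slot(slot: int) -> int:
    """Return the epoch number that contains the given slot."""
    return slot // SLOTS_PER_EPOCH


def compute_start_slot_at_epoch(epoch: int) -> int:
    """Return the first slot of the given epoch."""
    return epoch * SLOTS_PER_EPOCH


def is_epoch_start(slot: int) -> bool:
    """Hypothetical helper: True when the slot is the first of its epoch."""
    return slot % SLOTS_PER_EPOCH == 0


# Length of one epoch in seconds and in minutes.
seconds_per_epoch = SECONDS_PER_SLOT * SLOTS_PER_EPOCH
minutes_per_epoch = seconds_per_epoch / 60
```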

Fun fact: Epochs were originally called Cycles.


`CommitteeIndex`	`uint64`	a committee index at a slot

Validators are organised into committees that collectively vote (make attestations) on blocks. Each committee is active at exactly one slot per epoch, but several committees are active at each slot. This type indexes into the list of committees active at a slot.

In Phase 0, validators are members of only one type of committee, and they are shuffled between committees every epoch. The role of the committee is to attest to the beacon block proposed by the selected member of the committee. In Phase 1 persistent committees will be introduced that will attest to shard data blocks and are shuffled slowly.


`ValidatorIndex`	`uint64`	a validator registry index

Every validator that enters the system is consecutively assigned a unique validator index number that is permanent, remaining even after the validator exits. This is necessary as the validator's balance is associated with its index, so it needs to be preserved even if the validator exits, since there is no mechanism in Phase 0 to transfer that balance elsewhere.


`Gwei`	`uint64`	an amount in Gwei

All Ether amounts are specified in units of Gwei ($\smash{10^9}$ Wei). This is basically a hack to avoid having to use integers wider than 64 bits ($\smash{2^{64}}$ Wei is only 18 Ether) to store balances and in calculations. Even so, in some places care needs to be taken to avoid arithmetic overflow when dealing with Ether calculations.
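A quick back-of-the-envelope check of that claim, as a sketch using only the unit definitions above (the variable names are illustrative):

```python
UINT64_MAX = 2**64 - 1
WEI_PER_GWEI = 10**9
GWEI_PER_ETHER = 10**9

# Largest balance a uint64 could hold if it counted Wei, in Ether.
max_ether_if_wei = UINT64_MAX / (WEI_PER_GWEI * GWEI_PER_ETHER)

# Largest whole-Ether balance a uint64 can hold when it counts Gwei.
max_ether_if_gwei = UINT64_MAX // GWEI_PER_ETHER
```

Counting in Wei tops out at roughly 18.4 Ether, less than a single validator's stake; counting in Gwei leaves room for about 18 billion Ether.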


`Root`	`Bytes32`	a Merkle root

Merkle roots are ubiquitous in the Eth2 protocol. They are a very succinct and tamper-proof way of representing a lot of data: blocks are summarised by their Merkle roots; state is summarised by its Merkle root; the list of Eth1 deposits is summarised by its Merkle root; the digital signature of messages is calculated from the Merkle root of the data structure contained by the message. Here's a primer in case this is all new to you.


`Version`	`Bytes4`	a fork version number

It is expected that the protocol will get updated/upgraded from time to time: a process commonly known as a "hard-fork". Unlike Eth1, Eth2 has an in-protocol concept of a version number. This is used, for example, to prevent votes from validators on one fork (that maybe haven't upgraded yet) being counted on a different fork.

Its recommended use is described in the Ethereum 2.0 networking specification.


`DomainType`	`Bytes4`	a domain type

This is just a cryptographic nicety: messages intended for different purposes are tagged with different domains before being hashed and possibly signed. It's a kind of name-spacing to avoid clashes; probably unnecessary, but considered best-practice. Seven domain types are defined in Phase 0.


`ForkDigest`	`Bytes4`	a digest of the current fork data

The unique chain identifier, based on information from genesis and the current fork Version. It is calculated in compute_fork_digest. As per the comment there, "4-bytes suffices for practical separation of forks/chains".

ForkDigest is used extensively in the Ethereum 2.0 networking specification.


`Domain`	`Bytes32`	a signature domain

Domain is the concatenation of the DomainType and the first 28 bytes of the fork data root. It is used when verifying any messages from a validator—the message needs to have been sent with the correct domain and fork version.
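The byte layout described above can be sketched as follows. This only illustrates the concatenation: the real fork data root is an SSZ hash tree root, which is stood in for here by a placeholder SHA256 digest, and both the function name `compute_domain_sketch` and the 4-byte tag value are assumptions made for the example.

```python
import hashlib


def compute_domain_sketch(domain_type: bytes, fork_data_root: bytes) -> bytes:
    """Concatenate a 4-byte domain type with the first 28 bytes of a 32-byte root."""
    assert len(domain_type) == 4
    assert len(fork_data_root) == 32
    return domain_type + fork_data_root[:28]


# Arbitrary 4-byte tag, used purely for illustration.
example_domain_type = bytes.fromhex("01000000")

# Placeholder only: NOT a real SSZ hash tree root of the fork data.
placeholder_root = hashlib.sha256(b"placeholder fork data").digest()

domain = compute_domain_sketch(example_domain_type, placeholder_root)
```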


`BLSPubkey`	`Bytes48`	a BLS12-381 public key

BLS is the digital signature scheme used by Eth2. It has some very nice properties, in particular the ability to aggregate signatures. This means that many validators can sign the same message (for example, that they support block X), and these signatures can all be efficiently aggregated into a single signature for verification. The ability to do this efficiently makes Eth2 practical as a protocol. Several other protocols have adopted or will adopt BLS, such as Zcash, Chia, Dfinity and Algorand. We are using the BLS signature scheme based on the BLS12-381 elliptic curve.

The BLSPubkey type holds a validator's public key, or the aggregation of several validators' public keys. This is used to verify messages that are claimed to have come from that validator or group of validators.

In our implementation, BLS public keys are elliptic curve points from the BLS12-381 $\smash{G_1}$ group, thus are 48 bytes long when compressed.


`BLSSignature`	`Bytes96`	a BLS12-381 signature

As above, we are using BLS (Boneh-Lynn-Shacham) signatures over the BLS12-381 (Barreto-Lynn-Scott) elliptic curve in order to sign messages between participants. As with all digital signature schemes, this guarantees both the identity of the sender and the integrity of the contents of any message.

In our implementation, BLS signatures are elliptic curve points from the BLS12-381 $\smash{G_2}$ group, thus are 96 bytes long when compressed.

Constants

The following values are (non-configurable) constants used throughout the specification.

The distinction between "constants" and "configuration values" is not always clean, and things have moved back and forth between them at times. These are things that are expected never to change for the beacon chain, no matter what fork or test network it is running.

Name	Value
`GENESIS_SLOT`	`Slot(0)`

The very first slot number for the beacon chain is zero.

This might seem uncontroversial (except perhaps to Fortran programmers), but it actually featured heavily in the Great Integer Wars mentioned previously. The issue was that calculations on unsigned integers might have negative intermediate values, which could cause problems. A proposed work-around for this was to start the chain at a non-zero slot number. This was initially chosen to be 2^19, then 2^63, then 2^32, and finally back to zero. In my view, this madness suggests that we should have been using signed integers all along 🙄.


`GENESIS_EPOCH`	`Epoch(0)`

Similar to the above, but widely used in the beacon chain spec. When the chain starts, it starts at epoch zero.


`FAR_FUTURE_EPOCH`	`Epoch(2**64 - 1)`

A candidate for the dullest constant. It's used as a default initialiser for validators' activation and exit times before they are properly set. No epoch number is bigger than this one.


`BASE_REWARDS_PER_EPOCH`	`uint64(4)`

BASE_REWARDS_PER_EPOCH is the number of distinct things that attesting validators get rewarded for in each epoch. Namely, creating attestations with (1) matching source block, (2) matching target block, (3) matching chain head, and (4) inclusion delay - i.e. getting attestations included quickly in the beacon chain. (1) and (2) relate to the Casper FFG finality gadget, (3) and (4) relate to the LMD GHOST fork choice rule.


`DEPOSIT_CONTRACT_TREE_DEPTH`	`uint64(2**5)` (= 32)

DEPOSIT_CONTRACT_TREE_DEPTH specifies the size of the (sparse) Merkle tree used by the Eth1 deposit contract to store deposits made. With a value of 32, this allows for $\smash{2^{32}}$ = 4.3 billion deposits. Given that the minimum deposit is 1 Ether, that number is clearly enough for quite a while.


`JUSTIFICATION_BITS_LENGTH`	`uint64(4)`

As an optimisation to Casper FFG—the process by which finality is conferred on epochs—Eth2 uses a "k-finality" rule. We will describe this properly when we look at processing justification and finalisation. For now, this constant is just the number of bits we need to store in state to implement k-finality. For k = 2 we need to track the justification status of the last four epochs.


`ENDIANNESS`	`'little'`

Endianness refers to the order of bytes in the binary representation of a number: most significant byte first is big-endian; least significant byte first is little-endian. For the most part these details are hidden by compilers, and we don't need to worry about endianness. But endianness matters when converting between integers and bytes, which is relevant to shuffling and proposer selection, the RANDAO, and when serialising with SSZ.
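A small illustration using Python's built-in integer conversions shows why the choice matters when converting between integers and bytes; the 8-byte width matches a uint64, and the variable names are illustrative.

```python
value = 1

# Least significant byte first.
little = value.to_bytes(8, "little")

# Most significant byte first.
big = value.to_bytes(8, "big")

# Reading bytes back with the wrong byte order silently gives a different number.
round_trip = int.from_bytes(little, "little")
misread = int.from_bytes(little, "big")
```

Here `misread` comes out as 2**56 rather than 1, so both sides of any conversion must agree on the byte order.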

The spec began life as big-endian, but the Nimbus team from Status got it changed to little-endian to better suit the low-power processors they are targeting. SSZ was changed first, and then the rest of the spec followed.

Configuration

In the normal course of things, configuration parameters might not be the most exciting part of a specification. However, to gain an understanding of these parameters is to gain a huge insight into what kind of beast we are dealing with in the beacon chain.

You'll notice that most of them are powers of two. There's no huge significance to this. Computer scientists think it's neat, and it ensures that things cleanly divide other things in general. Justin Drake believes that it helps to minimise bike-shedding.

Some of the configuration parameters below are quite technical and perhaps obscure. I'll use the opportunity here to introduce some concepts, and more detailed explanation will follow when they are used later in the spec.

Note: The default mainnet configuration values are included here for illustrative purposes. The different configurations for mainnet, testnets, and YAML-based testing can be found in the configs/constant_presets directory.

To facilitate easier initial interoperability testing and testnets, a much lighter-weight minimal configuration was defined. This runs more quickly, with much lower resource use, than the below mainnet configuration: it has fewer, smaller committees, less shuffling, 6s rather than 12s slots, 8-slot rather than 32-slot epochs, and so on. The final beacon chain was deployed with the mainnet configuration parameters below.

Misc

Name	Value
`ETH1_FOLLOW_DISTANCE`	`uint64(2**11)` (= 2,048)

This is the minimum depth of block on the Ethereum 1 chain that can be considered by the Eth2 chain: it applies to the Genesis process and the processing of deposits by validators. The Eth1 chain depth is estimated by multiplying this value by the target average Eth1 block time, SECONDS_PER_ETH1_BLOCK.

The value of ETH1_FOLLOW_DISTANCE is not based on the expected depth of any reorgs of the Eth1 chain, which are rarely if ever more than 2-3 blocks deep. It is about providing time to respond to an incident on the Eth1 chain such as a consensus failure between clients.

This parameter was increased from 1024 to 2048 blocks for the beacon chain mainnet, to allow devs more time to respond if there were any trouble on the Eth1 chain.


`MAX_COMMITTEES_PER_SLOT`	`uint64(2**6)` (= 64)

Validators are organised into committees to do their work. At any one time, each validator is a member of exactly one beacon chain committee, and is called on to make an attestation exactly once per epoch. (An attestation is a vote for a beacon chain block that has been proposed for a slot.)

In the beacon chain, the 64 committees active in a slot effectively act as a single committee as far as the fork-choice rule is concerned. They all vote on the proposed block for the slot, and their votes/attestations are pooled. (In a similar way, all committees active during an epoch act effectively as a single committee as far as justification and finalisation are concerned.)

The number 64 is designed to map onto one committee per shard once Phase 1 is deployed, since these committees will also vote on shard crosslinks.


`TARGET_COMMITTEE_SIZE`	`uint64(2**7)` (= 128)

To achieve a desirable level of security, committees need to be larger than a certain size. This is to make it infeasible for an attacker to randomly end up with a majority in a committee even if they control a significant number of validators. This target is a kind of lower-bound on committee size. If there are not enough validators to make all committees have at least 128 members, then, as a first measure, the number of committees per slot is reduced to maintain this minimum. Only if there are fewer than SLOTS_PER_EPOCH * TARGET_COMMITTEE_SIZE = 4096 validators in total will the committee size be reduced below TARGET_COMMITTEE_SIZE. With so few validators, the system would be insecure in any case.

See the note below for how this value 128 was arrived at.
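The way the number of committees per slot shrinks before the committee size does can be sketched like this. The function name follows the spec's helper, but as a simplification it takes the active validator count directly rather than a state and an epoch; `approx_committee_size` is a hypothetical helper that ignores rounding between committees.

```python
SLOTS_PER_EPOCH = 32
MAX_COMMITTEES_PER_SLOT = 64
TARGET_COMMITTEE_SIZE = 128


def get_committee_count_per_slot(active_validator_count: int) -> int:
    """Number of committees per slot: at least 1, at most MAX_COMMITTEES_PER_SLOT."""
    return max(1, min(
        MAX_COMMITTEES_PER_SLOT,
        active_validator_count // SLOTS_PER_EPOCH // TARGET_COMMITTEE_SIZE,
    ))


def approx_committee_size(active_validator_count: int) -> int:
    """Hypothetical helper: average committee size, rounded down."""
    committees_per_epoch = get_committee_count_per_slot(active_validator_count) * SLOTS_PER_EPOCH
    return active_validator_count // committees_per_epoch
```

With 4,096 validators there is one committee of 128 per slot; with 2,048 there is still one committee per slot, but its size falls to 64.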


`MAX_VALIDATORS_PER_COMMITTEE`	`uint64(2**11)` (= 2,048)

This is just used for sizing some data structures, and is not particularly interesting. Reaching this limit would imply over 4 million active validators, staked with a total of around 134 million Ether ($\smash{2^{27}}$), which exceeds the total supply today.
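The arithmetic behind that limit, as a sketch using the mainnet values quoted in this section (variable names are illustrative):

```python
MAX_VALIDATORS_PER_COMMITTEE = 2048
MAX_COMMITTEES_PER_SLOT = 64
SLOTS_PER_EPOCH = 32
MAX_EFFECTIVE_BALANCE_ETHER = 32

# Every validator sits in exactly one committee per epoch, so the ceiling is
# committee size x committees per slot x slots per epoch.
max_active_validators = (
    MAX_VALIDATORS_PER_COMMITTEE * MAX_COMMITTEES_PER_SLOT * SLOTS_PER_EPOCH
)

# Stake implied if every one of them holds a full 32 Ether.
max_staked_ether = max_active_validators * MAX_EFFECTIVE_BALANCE_ETHER
```

This gives 4,194,304 validators and 134,217,728 Ether, which is $\smash{2^{27}}$, or 128 × $\smash{2^{20}}$.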


`MIN_PER_EPOCH_CHURN_LIMIT`	`uint64(2**2)` (= 4)

Validators are allowed to exit the system and cease validating, and new validators may apply to join at any time. For interesting reasons, a design decision was made to apply a rate-limit to entries (activations) and exits. Basically, it is important in proof of stake protocols that the validator set not change too quickly.

In the normal case, a validator is able to exit fairly swiftly: it just needs to wait MAX_SEED_LOOKAHEAD (currently four) epochs. However, if there are large numbers of validators wishing to exit at the same time, a queue forms with a limited number of exits allowed per epoch. The minimum number of exits per epoch (the minimum "churn") is MIN_PER_EPOCH_CHURN_LIMIT, so that validators can always eventually exit. The actual allowed churn per epoch is calculated in conjunction with CHURN_LIMIT_QUOTIENT.

The same applies to new validator activations, once a validator has been marked as eligible for activation.


`CHURN_LIMIT_QUOTIENT`	`uint64(2**16)` (= 65,536)

This is used in conjunction with MIN_PER_EPOCH_CHURN_LIMIT to calculate the actual number of validator exits and activations allowed per epoch. The number of exits allowed is max(MIN_PER_EPOCH_CHURN_LIMIT, n // CHURN_LIMIT_QUOTIENT), where n is the number of active validators. The same applies to activations.
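The formula above translates directly into code. In this sketch the function takes the active validator count as a plain integer, which is a simplification for illustration; `epochs_per_day` is derived from the 12 second slots and 32-slot epochs described elsewhere in this document.

```python
MIN_PER_EPOCH_CHURN_LIMIT = 4
CHURN_LIMIT_QUOTIENT = 65536
SECONDS_PER_SLOT = 12
SLOTS_PER_EPOCH = 32


def get_churn_limit(active_validator_count: int) -> int:
    """Maximum exits (or activations) allowed per epoch."""
    return max(MIN_PER_EPOCH_CHURN_LIMIT, active_validator_count // CHURN_LIMIT_QUOTIENT)


# 86400 seconds per day divided by 384 seconds per epoch.
epochs_per_day = 86400 // (SECONDS_PER_SLOT * SLOTS_PER_EPOCH)

# Exits per day while the minimum churn limit applies.
min_exits_per_day = get_churn_limit(16384) * epochs_per_day
```

The limit stays at the minimum of 4 until there are 327,680 active validators, at which point it becomes 5.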


`SHUFFLE_ROUND_COUNT`	`uint64(90)`

The beacon chain implements a rather interesting way of shuffling validators in order to select committees, called the "swap-or-not shuffle". This shuffle proceeds in rounds, and the degree of shuffling is determined by the number of rounds: SHUFFLE_ROUND_COUNT. The time taken to shuffle is linear in the number of rounds, so for light-weight, non-mainnet configurations, the number of rounds can be reduced.

The value 90 was introduced in Vitalik's initial commit without explanation. The original paper describing the shuffling technique seems to suggest that a cryptographically safe number of rounds is $6\log{N}$. With 90 rounds, then, we should be good for shuffling 3.3 million validators, which is close to the maximum number possible (given the Ether supply).
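The 3.3 million figure can be reproduced by inverting $6\log{N} = 90$. The sketch below assumes the logarithm is the natural logarithm, since that is the reading that yields the number quoted above.

```python
import math

SHUFFLE_ROUND_COUNT = 90

# Solve 6 * ln(N) = rounds for N (assumption: natural logarithm).
max_list_size = math.exp(SHUFFLE_ROUND_COUNT / 6)

# Going the other way: rounds suggested for a list of 3.3 million.
rounds_for_3_3_million = 6 * math.log(3_300_000)
```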

The main advantage of using this shuffling method is that light clients and others that are interested in only a small number of the committees at any time can compute only the committees they need without having to shuffle the entire set of active validators. This can be a big saving on computational resources. See compute_shuffled_index.
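To show how a single index can be shuffled without touching the rest of the list, here is a sketch of the swap-or-not computation in plain Python. It is modelled on the spec's compute_shuffled_index, but uses `hashlib` and built-in integers in place of the spec's own types and helpers, so treat it as an illustration rather than a reference implementation. The seed value is arbitrary.

```python
import hashlib

SHUFFLE_ROUND_COUNT = 90


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def compute_shuffled_index(index: int, index_count: int, seed: bytes) -> int:
    """Return the shuffled position of one index, out of index_count, for a seed."""
    assert index < index_count
    for current_round in range(SHUFFLE_ROUND_COUNT):
        round_byte = current_round.to_bytes(1, "little")
        # The pivot depends only on the seed and the round number.
        pivot = int.from_bytes(sha256(seed + round_byte)[0:8], "little") % index_count
        # Each index is paired with its mirror image around the pivot.
        flip = (pivot + index_count - index) % index_count
        # Both members of a pair consult the same bit, so they agree on swapping.
        position = max(index, flip)
        source = sha256(seed + round_byte + (position // 256).to_bytes(4, "little"))
        byte = source[(position % 256) // 8]
        bit = (byte >> (position % 8)) % 2
        index = flip if bit else index
    return index


# Arbitrary 32-byte seed for demonstration.
seed = b"\x2a" * 32
shuffled = [compute_shuffled_index(i, 100, seed) for i in range(100)]
```

Because each round pairs up indices and either swaps both or neither, the overall mapping is a permutation, and any one entry of `shuffled` could have been computed on its own.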

For more on the mechanics of the swap-or-not shuffle, check out my explainer.


`MIN_GENESIS_ACTIVE_VALIDATOR_COUNT`	`uint64(2**14)` (= 16,384)

MIN_GENESIS_ACTIVE_VALIDATOR_COUNT is the minimum number of full validator stakes that must have been deposited before the beacon chain can start producing blocks. The number is chosen to ensure a degree of security. It allows for four 128 member committees per slot, rather than the 64 eventually desired to support Phase 1. But fewer validators means higher rewards per validator, so it is designed to attract early participants to get things bootstrapped.

MIN_GENESIS_ACTIVE_VALIDATOR_COUNT used to be much higher (65,536 = 2 million Ether staked), but was reduced when MIN_GENESIS_TIME, below, was added.


`MIN_GENESIS_TIME`	`uint64(1606824000)` (Dec 1, 2020, 12pm UTC)

MIN_GENESIS_TIME is the earliest date that the beacon chain can start.

Having a MIN_GENESIS_TIME allows us to start the chain with fewer validators than was previously thought necessary. The previous plan was to start the chain as soon as there were MIN_GENESIS_ACTIVE_VALIDATOR_COUNT validators staked. But there were concerns that with a lowish initial validator count, a single entity could form the majority of them and then act to prevent other validators from entering (a "gatekeeper attack"). A minimum genesis time allows time for all intending depositors to make their deposits before they could be excluded by a gatekeeper attack.

In the event, the beacon chain started at 12:00:23 UTC on the 1st of December 2020. The extra 23 seconds comes from the timestamp of the first Eth1 block to meet the genesis criteria, block 11320899. I like to think of this as a little remnant of proof of work forever embedded in the beacon chain's history.


`HYSTERESIS_QUOTIENT`	`uint64(4)`
`HYSTERESIS_DOWNWARD_MULTIPLIER`	`uint64(1)`
`HYSTERESIS_UPWARD_MULTIPLIER`	`uint64(5)`

These parameters relate to the way that effective balance is changed (see EFFECTIVE_BALANCE_INCREMENT below). As described there, effective balance for a validator follows changes to the actual balance in a step-wise way, with hysteresis. This is to ensure that it does not change very often.

The original hysteresis design had an unintended effect that might have encouraged stakers to over-deposit or make multiple deposits in order to maintain a balance above 32 Ether at all times. This is because, if a validator's balance were to drop below 32 Ether soon after depositing, however briefly, the effective balance would immediately drop to 31 Ether, and would take a long time to recover. This would result in a 3% reduction in rewards for a period.

This problem was addressed by making the hysteresis configurable via these parameters. Specifically, these settings mean:

if a validator's balance falls 0.25 Ether below its effective balance, then its effective balance is reduced by 1 Ether

if a validator's balance rises 1.25 Ether above its effective balance, then its effective balance is increased by 1 Ether

These calculations are done in process_final_updates().
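The two rules above can be sketched as a single function over Gwei amounts. This is a simplified model of the update: it works on plain integers rather than validator records, and the function name `updated_effective_balance` is hypothetical.

```python
EFFECTIVE_BALANCE_INCREMENT = 10**9
MAX_EFFECTIVE_BALANCE = 32 * 10**9
HYSTERESIS_QUOTIENT = 4
HYSTERESIS_DOWNWARD_MULTIPLIER = 1
HYSTERESIS_UPWARD_MULTIPLIER = 5


def updated_effective_balance(balance: int, effective_balance: int) -> int:
    """Return the new effective balance (Gwei) given the actual balance (Gwei)."""
    hysteresis_increment = EFFECTIVE_BALANCE_INCREMENT // HYSTERESIS_QUOTIENT
    downward_threshold = hysteresis_increment * HYSTERESIS_DOWNWARD_MULTIPLIER  # 0.25 Ether
    upward_threshold = hysteresis_increment * HYSTERESIS_UPWARD_MULTIPLIER  # 1.25 Ether
    if (
        balance + downward_threshold < effective_balance
        or effective_balance + upward_threshold < balance
    ):
        # Round the balance down to a whole increment, capped at the maximum.
        return min(balance - balance % EFFECTIVE_BALANCE_INCREMENT, MAX_EFFECTIVE_BALANCE)
    return effective_balance


# A small dip below 32 Ether leaves the effective balance alone...
small_dip = updated_effective_balance(31_800_000_000, 32_000_000_000)
# ...but falling more than 0.25 Ether below it costs a full Ether.
big_dip = updated_effective_balance(31_740_000_000, 32_000_000_000)
# Recovering requires rising more than 1.25 Ether above the effective balance.
partial_recovery = updated_effective_balance(32_200_000_000, 31_000_000_000)
full_recovery = updated_effective_balance(32_260_000_000, 31_000_000_000)
```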

For the safety of committees, TARGET_COMMITTEE_SIZE exceeds the recommended minimum committee size of 111; with sufficient active validators (at least SLOTS_PER_EPOCH * TARGET_COMMITTEE_SIZE), the shuffling algorithm ensures committee sizes of at least TARGET_COMMITTEE_SIZE. (Unbiasable randomness with a Verifiable Delay Function (VDF) will improve committee robustness and lower the safe minimum committee size.)

Given a proportion of the validators controlled by an attacker, what is the probability that the attacker ends up controlling a 2/3 majority in a randomly selected committee drawn from the full set of validators? This is what Vitalik looks at in the presentation, and where the 111 number comes from (a $\smash{2^{-40}}$ chance, one-in-a-trillion, of an attacker with 1/3 of the validators gaining by chance a 2/3 majority in any one committee).
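The probability in question is a binomial tail, which can be computed exactly. This sketch assumes each committee seat is filled independently with probability equal to the attacker's share (a reasonable approximation when the validator set is much larger than the committee), and takes "a 2/3 majority" to mean at least two thirds of the seats, rounded up. Both are assumptions of the example rather than details from the presentation.

```python
from fractions import Fraction
from math import comb


def majority_probability(committee_size: int, attacker_share: Fraction, threshold: int) -> Fraction:
    """Probability that at least `threshold` of the seats go to the attacker."""
    p = attacker_share
    n = committee_size
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(threshold, n + 1)
    )


committee_size = 111
# Ceiling of two thirds of the committee, using integer arithmetic.
threshold = -(-2 * committee_size // 3)

probability = float(majority_probability(committee_size, Fraction(1, 3), threshold))
```

Using `Fraction` keeps the sum exact until the final conversion, which avoids any doubt about floating-point error at such small magnitudes.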

Another issue is that the randomness that we are using (a RANDAO) is not unbiasable. If an attacker happens to control a number of block proposers at the end of an epoch, they can decide to reveal or not to reveal their blocks, gaining one bit of influence per validator on the next random number. This might allow an attacker to gain more control in the next round and so on. In this way, an attacker can gain some influence over committee selection. Having a good lower-bound on committee size (TARGET_COMMITTEE_SIZE) helps to defend against this. Alternatively, we could use an unbiasable source of randomness such as a verifiable delay function (VDF). Use of a VDF is not currently planned for Eth2, but may be implemented in future.

Gwei values

Name	Value
`MIN_DEPOSIT_AMOUNT`	`Gwei(2**0 * 10**9)` (= 1,000,000,000)

MIN_DEPOSIT_AMOUNT is not actually used anywhere within the Phase 0 Beacon Chain Specification document. Where it is used is within the deposit contract that was deployed to the Ethereum 1 chain. Any amount less than this value sent to the deposit contract is reverted.


`MAX_EFFECTIVE_BALANCE`	`Gwei(2**5 * 10**9)` (= 32,000,000,000)

There is a concept of "effective balance" for validators: whatever a validator's actual total stake (balance), its voting power is weighted by its effective balance, even if it has much more at stake. Effective balance is also the amount on which all rewards, penalties, and slashings are calculated, so it's used a lot in the protocol.

The MAX_EFFECTIVE_BALANCE is the highest effective balance that a validator can have: 32 Ether. Any balance above this is ignored. Note that this means that staking rewards don't compound in the usual case (unless our balance somehow falls below 32 Ether at some point).

There is a discussion in the Design Rationale of why 32 Ether was chosen as the staking amount. In short, we want enough validators to keep the chain both alive and secure under attack, but not so many that the message overhead on the network becomes too high.


`EJECTION_BALANCE`	`Gwei(2**4 * 10**9)` (= 16,000,000,000)

If a validator's effective balance falls to 16 Ether or below then it is exited from the system (kicked out of the active validator set). This is most likely to happen as a result of the "inactivity leak" which gradually reduces the balances of inactive validators in order to maintain the liveness of the beacon chain.

Note that the dependence on effective balance means that the validator is queued for ejection as soon as its actual balance falls to 16.75 Ether.


`EFFECTIVE_BALANCE_INCREMENT`	`Gwei(2**0 * 10**9)` (= 1,000,000,000)

Throughout the protocol, a quantity called "effective balance" is used instead of the validators' actual balances. Effective balance tracks the actual balance, with two differences: (1) effective balance is capped at MAX_EFFECTIVE_BALANCE no matter how high the actual balance of a validator is, and (2) effective balance is much coarser-grained: it changes only in steps of EFFECTIVE_BALANCE_INCREMENT rather than single Gwei.

This discretisation of balance is designed to reduce the amount of hashing required when making state updates. As we shall see, validators' actual balances are stored as a contiguous list in BeaconState. This is easy to update. Effective balances are stored in the individual validator records and are more costly to update (more hashing required). So we try to update effective balances relatively infrequently.

Effective balance is changed according to a process with hysteresis to avoid situations where it changes frequently. See HYSTERESIS_QUOTIENT above.

You can read more about effective balance in the Design Rationale and in this article.

Initial values

Name	Value
`GENESIS_FORK_VERSION`	`Version('0x00000000')`

Forks/upgrades are expected, if only when we move to Phase 1. This is the fork version the beacon chain starts with at its "Genesis" event: the point at which the chain first starts producing blocks.


`BLS_WITHDRAWAL_PREFIX`	`Bytes1('0x00')`

Not actually used in this core beacon chain spec, but used in the deposit contract spec.

Validators need to register two public/private key pairs. The signing key is used constantly for signing attestations and blocks. The withdrawal key will be used in future after a validator has exited to allow the validator's Ether balance to be transferred to an Eth2 account. The withdrawal credentials are stored in the validator's record so that, in future, the owner of the validator can lay claim to the original stake and accrued rewards. The withdrawal credentials are the 32 byte SHA256 hash of the validator's withdrawal public key, with the first byte set to BLS_WITHDRAWAL_PREFIX as a version number, in case of future changes.
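The construction of the withdrawal credentials can be sketched as follows. The function name `bls_withdrawal_credentials` is hypothetical, and the 48 bytes used as a public key are dummy data, not a valid BLS12-381 key.

```python
import hashlib

BLS_WITHDRAWAL_PREFIX = bytes.fromhex("00")


def bls_withdrawal_credentials(withdrawal_pubkey: bytes) -> bytes:
    """SHA256 of the public key, with the first byte replaced by the prefix."""
    assert len(withdrawal_pubkey) == 48
    return BLS_WITHDRAWAL_PREFIX + hashlib.sha256(withdrawal_pubkey).digest()[1:]


# Dummy bytes standing in for a compressed public key (illustration only).
dummy_pubkey = bytes(range(48))
credentials = bls_withdrawal_credentials(dummy_pubkey)
```

Overwriting the first byte rather than prepending one keeps the credentials at exactly 32 bytes, at the cost of 8 bits of the hash.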

Time parameters

Name	Value	Unit	Duration
`GENESIS_DELAY`	`uint64(604800)`	seconds	7 days

The GENESIS_DELAY is a grace period to allow nodes and node operators time to prepare for the Genesis event. The Genesis event cannot occur before MIN_GENESIS_TIME. If there are not MIN_GENESIS_ACTIVE_VALIDATOR_COUNT registered validators sufficiently in advance of MIN_GENESIS_TIME, then Genesis will occur GENESIS_DELAY seconds after enough validators have been registered.

The Genesis event (beacon chain start) was originally designed to take place at midnight UTC, even for testnets, which was not always convenient. This has now been changed. Once we're past MIN_GENESIS_TIME - GENESIS_DELAY, Genesis could end up being at any time of the day, depending on when the last deposit needed comes in. In the event, genesis occurred at 12:00:23 UTC on the 1st of December 2020, according to the timestamp of Ethereum block number 11320899 plus GENESIS_DELAY.
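The timing rule can be sketched in a few lines. The Eth1 timestamp below is not quoted from the chain; it is derived from the genesis time stated above (MIN_GENESIS_TIME plus 23 seconds) minus GENESIS_DELAY. The helper names and the validator count of 20,000 are hypothetical, used only to exercise the conditions.

```python
from datetime import datetime, timezone

GENESIS_DELAY = 604800
MIN_GENESIS_TIME = 1606824000
MIN_GENESIS_ACTIVE_VALIDATOR_COUNT = 16384


def candidate_genesis_time(eth1_timestamp: int) -> int:
    """Genesis time implied by the timestamp of a given Eth1 block."""
    return eth1_timestamp + GENESIS_DELAY


def genesis_conditions_met(genesis_time: int, active_validator_count: int) -> bool:
    """Hypothetical helper: both the time and the validator count must suffice."""
    return (
        genesis_time >= MIN_GENESIS_TIME
        and active_validator_count >= MIN_GENESIS_ACTIVE_VALIDATOR_COUNT
    )


# Derived value: 12:00:23 UTC on 1 December 2020, minus seven days.
eth1_timestamp = MIN_GENESIS_TIME + 23 - GENESIS_DELAY

genesis_time = candidate_genesis_time(eth1_timestamp)
genesis_utc = datetime.fromtimestamp(genesis_time, tz=timezone.utc).isoformat()
```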


`SECONDS_PER_SLOT`	`uint64(12)`	seconds	12 seconds

This used to be 6 seconds, but is now 12, and has previously had other values. The main limiting factors in shortening this are the time necessary for block proposals to propagate among committees, and for validators to communicate and aggregate their votes for the block.

This slot length has to account for shard blocks as well in later phases. There was some discussion around having the beacon chain and shards on differing cadences, but the latest Phase 1 design tightly couples the beacon chain with the shards. Shard blocks under the new proposal are much larger, which led to the lengthening of the slot to 12 seconds.

There is a general intention to shorten this in future, perhaps to [8 seconds](https://github.com/ethereum/eth2.0-specs/issues/1890#issue-638024803), if it proves possible to do this in practice.


`SECONDS_PER_ETH1_BLOCK`	`uint64(14)`	seconds	14 seconds

The assumed block interval on the Eth1 chain, used when calculating how long we will wait before trusting that an Eth1 block will not be reorganised.

The average Eth1 block time since January 2020 has actually been nearer 13 seconds, but never mind. The net effect is that we will be going a little deeper back in the Eth1 chain than ETH1_FOLLOW_DISTANCE would suggest, which ought to be safer.
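Putting numbers on this, as a sketch: the 13 second figure is the approximate observed block time mentioned above, and the variable names are illustrative.

```python
ETH1_FOLLOW_DISTANCE = 2048
SECONDS_PER_ETH1_BLOCK = 14
OBSERVED_SECONDS_PER_ETH1_BLOCK = 13  # approximate, per the commentary above

# The follow distance expressed as a time window.
follow_seconds = ETH1_FOLLOW_DISTANCE * SECONDS_PER_ETH1_BLOCK
follow_hours = follow_seconds / 3600

# How many blocks that window covers if blocks really arrive every 13 seconds.
blocks_at_observed_rate = follow_seconds // OBSERVED_SECONDS_PER_ETH1_BLOCK
```

The window is just under eight hours, which at 13 second blocks is about 2,205 blocks rather than 2,048.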


`MIN_ATTESTATION_INCLUSION_DELAY`	`uint64(2**0)` (= 1)	slots	12 seconds

A design goal of Eth2 is not to heavily disadvantage validators that are running on lower-spec systems, or, conversely, to reduce any advantage gained by running on high-spec systems.

One aspect of performance is network bandwidth. When a validator becomes the block proposer, it needs to gather attestations from the rest of its committee. On a low-bandwidth link, this takes longer, and could result in the proposer not being able to include as many past attestations as other better-connected validators might, thus receiving lower rewards.

MIN_ATTESTATION_INCLUSION_DELAY was an attempt to "level the playing field" by setting a minimum number of slots before an attestation can be included in a beacon block. It was originally set at 4, with a 6 second slot time, allowing 24 seconds for attestations to propagate around the network.

It was later set to one—attestations are included as early as possible—and, now we are crosslinking shards every slot, this is the only value that makes sense. So it exists today as a kind of relic of the earlier design.

The current slot time of 12 seconds (see above) is assumed to allow sufficient time for attestations to propagate and be aggregated within one slot. If this proves not to be the case, then it may be lengthened later.


`SLOTS_PER_EPOCH`	`uint64(2**5)` (= 32)	slots	6.4 minutes

When slots were six seconds, there were 64 slots per epoch. So the time between epoch boundaries is unchanged compared with the original design.

As a reminder, epoch transitions are where the heavy beacon chain state-transition calculation occurs, so we don't want them too close together. On the other hand, they are also the targets for finalisation, so we don't want them too far apart.


`MIN_SEED_LOOKAHEAD`	`uint64(2**0)` (= 1)	epochs	6.4 minutes

A random seed is used to select all the committees and proposers for an epoch. Every epoch, the beacon chain accumulates randomness from proposers via the RANDAO and stores it. The seed for the current epoch is based on the RANDAO output from MIN_SEED_LOOKAHEAD + 1 epochs ago. With MIN_SEED_LOOKAHEAD set to one, the effect is that we can know the seed for the current epoch and the next epoch, but not beyond (since the next-but-one epoch depends on randomness from the current epoch that hasn't been accumulated yet).

This mechanism is designed to allow sufficient time for committee members to find each other on the peer-to-peer network, and in Phase 1 to sync up any data they will need. But preventing committee makeup being known too far ahead limits the opportunity for coordinated collusion between validators.
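To make the lookahead arithmetic concrete, here is a minimal sketch. The helper names are mine rather than the spec's, and it ignores the modular indexing the spec uses for its RANDAO storage as well as edge cases near genesis.

```python
MIN_SEED_LOOKAHEAD = 1  # epochs


def randao_epoch_for_seed(epoch: int) -> int:
    """Epoch whose completed RANDAO mix feeds the seed for `epoch` (hypothetical helper)."""
    return epoch - MIN_SEED_LOOKAHEAD - 1


def latest_epoch_with_known_seed(current_epoch: int) -> int:
    """Furthest epoch whose seed is already fixed during `current_epoch` (hypothetical helper).

    A RANDAO mix is only complete once its epoch has ended, so the newest
    complete mix belongs to current_epoch - 1.
    """
    newest_complete_mix = current_epoch - 1
    return newest_complete_mix + MIN_SEED_LOOKAHEAD + 1
```

So during epoch 10 the seed for epoch 11 is already determined (it comes from epoch 9), while the seed for epoch 12 still depends on epoch 10's unfinished randomness.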


`MAX_SEED_LOOKAHEAD`	`uint64(2**2)` (= 4)	epochs	25.6 minutes

The above notwithstanding, if an attacker has a large proportion of the stake, or is, for example, able to DoS block proposers for a while, then it might be possible for the attacker to predict the output of the RANDAO further ahead than MIN_SEED_LOOKAHEAD would normally allow. In which case the attacker might be able to manipulate the makeup of committees advantageously by performing judicious exits and activations of their validators.

To prevent this, we assume a maximum feasible lookahead that an attacker might achieve (that is, this parameter) and delay all activations and exits by this amount. With MAX_SEED_LOOKAHEAD set to 4, if only 10% of validators are online and honest, then the chance that an attacker can succeed in forecasting the seed beyond MAX_SEED_LOOKAHEAD - MIN_SEED_LOOKAHEAD = 3 epochs is $\smash{0.9^{3\times32}}$, which is about 1 in 25,000.
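The "1 in 25,000" figure is easy to check. This sketch assumes, as the calculation above implicitly does, that the attacker independently controls or silences each slot's proposer with probability 0.9; the function name is mine.

```python
SLOTS_PER_EPOCH = 32
MIN_SEED_LOOKAHEAD = 1
MAX_SEED_LOOKAHEAD = 4


def lookahead_success_probability(attacker_fraction: float) -> float:
    """Chance that every proposer in the extra lookahead window is attacker-controlled."""
    extra_epochs = MAX_SEED_LOOKAHEAD - MIN_SEED_LOOKAHEAD
    return attacker_fraction ** (extra_epochs * SLOTS_PER_EPOCH)


p = lookahead_success_probability(0.9)
one_in = 1 / p  # a little under 25,000
```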


`MIN_EPOCHS_TO_INACTIVITY_PENALTY`	`uint64(2**2)` (= 4)	epochs	25.6 minutes

The inactivity penalty is discussed below. This parameter sets the time until it kicks in: if the last finalised epoch is longer ago than this, then the beacon chain starts operating in "leak" mode. In this mode, participating validators no longer get rewarded, and validators that are not participating get penalised.


`EPOCHS_PER_ETH1_VOTING_PERIOD`	`uint64(2**6)` (= 64)	epochs	~6.8 hours

In order to safely onboard new validators, the beacon chain needs to take a view on what the Eth1 chain looks like. This is done by collecting votes from beacon block proposers - they are expected to consult an available Eth1 client in order to construct their vote.

EPOCHS_PER_ETH1_VOTING_PERIOD * SLOTS_PER_EPOCH is the total number of votes for Eth1 blocks that are collected. As soon as more than half of this number of votes are for the same Eth1 block, that block is adopted by the beacon chain and deposit processing can continue.
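The threshold check can be sketched as follows. This is an illustration rather than the spec's code: votes are represented as plain strings, the function name is mine, and I have used a strict majority (more than half of the slots in the period), which is my reading of the Eth1 data processing in the spec.

```python
from collections import Counter
from typing import Optional, Sequence

EPOCHS_PER_ETH1_VOTING_PERIOD = 2**6
SLOTS_PER_EPOCH = 2**5
VOTES_PER_PERIOD = EPOCHS_PER_ETH1_VOTING_PERIOD * SLOTS_PER_EPOCH  # 2048


def winning_eth1_vote(votes: Sequence[str]) -> Optional[str]:
    """Return the Eth1 data that has a strict majority of the period's slots, if any."""
    for candidate, count in Counter(votes).items():
        if count * 2 > VOTES_PER_PERIOD:
            return candidate
    return None
```

Note that the threshold is measured against the whole voting period, not against the votes cast so far, so skipped slots effectively count against every candidate.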

Rules for how validators select the right block to vote for are set out in the validator guide. ETH1_FOLLOW_DISTANCE is the minimum depth of block that can be considered: this is very conservatively set to ensure that we never end up with some block that we've relied on subsequently being reorganised out of the Ethereum 1 chain. For a detailed analysis of these parameters, see this ethresear.ch post.

This parameter was increased from 32 to 64 epochs for the beacon chain mainnet, to allow devs more time to respond if there were any trouble on the Eth1 chain.


`SLOTS_PER_HISTORICAL_ROOT`	`uint64(2**13)` (= 8,192)	slots	~27 hours

There have been several redesigns of the way the beacon chain stores its past history. The current design is a double batched accumulator. The block root and state root for every slot are stored in the state for SLOTS_PER_HISTORICAL_ROOT slots. When that list is full, both lists are merkleised into a single Merkle root, which is added to the ever-growing state.historical_roots list.


`MIN_VALIDATOR_WITHDRAWABILITY_DELAY`	`uint64(2**8)` (= 256)	epochs	~27 hours

Once a validator has made it through the exit queue it can stop participating. However, its funds remain locked for the duration of MIN_VALIDATOR_WITHDRAWABILITY_DELAY. In Phase 0 this is to allow some time for any slashable behaviour to be detected and reported so that the validator can still be penalised (in which case the validator's withdrawable time is pushed EPOCHS_PER_SLASHINGS_VECTOR into the future). In Phase 1 this delay will also allow for shard rewards to be credited and for proof of custody challenges to be mounted.

Note that in Phases 0 and 1 there is no mechanism to withdraw a validator's balance in any case. But being in a "withdrawable" state means that a validator has now fully exited from the protocol.


`SHARD_COMMITTEE_PERIOD`	`uint64(2**8)` (= 256)	epochs	~27 hours

This really anticipates Phase 1. The idea is that it's bad for the stability of longer-lived shard committees if validators can appear and disappear very rapidly. Therefore, a validator cannot initiate a voluntary exit until SHARD_COMMITTEE_PERIOD epochs after it is activated. Note that it could still be ejected by slashing before this time.
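The durations quoted for the time parameters above all follow from the slot length and the number of slots per epoch. A small sketch (the helper names are mine, not the spec's) reproduces a few of them:

```python
SECONDS_PER_SLOT = 12
SLOTS_PER_EPOCH = 2**5


def slots_to_seconds(slots: int) -> int:
    """Wall-clock length of a number of slots."""
    return slots * SECONDS_PER_SLOT


def epochs_to_seconds(epochs: int) -> int:
    """Wall-clock length of a number of epochs."""
    return slots_to_seconds(epochs * SLOTS_PER_EPOCH)


epoch_minutes = epochs_to_seconds(1) / 60                  # SLOTS_PER_EPOCH: 6.4 minutes
max_lookahead_minutes = epochs_to_seconds(2**2) / 60       # MAX_SEED_LOOKAHEAD: 25.6 minutes
eth1_voting_hours = epochs_to_seconds(2**6) / 3600         # EPOCHS_PER_ETH1_VOTING_PERIOD: ~6.8 hours
historical_root_hours = slots_to_seconds(2**13) / 3600     # SLOTS_PER_HISTORICAL_ROOT: ~27 hours
withdrawability_hours = epochs_to_seconds(2**8) / 3600     # MIN_VALIDATOR_WITHDRAWABILITY_DELAY: ~27 hours
```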

State list lengths

The following parameters set the sizes of some lists in the beacon chain state. Some lists have natural sizes, others such as the validator registry need an explicit maximum size to help with SSZ serialisation.

Name	Value	Unit	Duration
`EPOCHS_PER_HISTORICAL_VECTOR`	`uint64(2**16)` (= 65,536)	epochs	~0.8 years

This is the number of epochs of previous RANDAO mixes that are stored (one per epoch). Having access to past RANDAO mixes allows historical shufflings to be recalculated. Since Validator records keep track of the activation and exit epochs of all past validators, we can thus reconstitute past committees as far back as we have the RANDAO values. This information can be used for slashing long-past attestations, for example. It is not clear how the value of this parameter was decided.


`EPOCHS_PER_SLASHINGS_VECTOR`	`uint64(2**13)` (= 8,192)	epochs	~36 days

In the epoch in which a misbehaving validator is slashed, its effective balance is added to an accumulator in the state. In this way, the state.slashings list tracks the total effective balance of all validators slashed during the last EPOCHS_PER_SLASHINGS_VECTOR epochs.

At a time EPOCHS_PER_SLASHINGS_VECTOR // 2 after being slashed, a further penalty is applied to the slashed validator, based on the total amount of value slashed during the 4096 epochs before and the 4096 epochs after it was originally slashed.

The idea of this is to disproportionately punish coordinated attacks, in which many validators break the slashing conditions at the same time, while only lightly penalising validators that get slashed by making a mistake. Early designs for Eth2 would always slash a validator's entire deposit.


`HISTORICAL_ROOTS_LIMIT`	`uint64(2**24)` (= 16,777,216)	historical roots	~52,262 years

Every SLOTS_PER_HISTORICAL_ROOT slots, the list of block roots and the list of state roots are merkleised and added to the state.historical_roots list. This is sized so that it is possible to store these roots for the entire past history of the chain. Although this list is effectively unbounded, it grows at only around 10 KB per year.

Storing past roots like this allows historical Merkle proofs to be constructed if required.
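As a sanity check on the growth rate and the ~52,262 year figure, here is the arithmetic. I am assuming a year of 365.25 days, which is what reproduces the number in the table.

```python
SECONDS_PER_SLOT = 12
SLOTS_PER_HISTORICAL_ROOT = 2**13
HISTORICAL_ROOTS_LIMIT = 2**24
BYTES_PER_ROOT = 32
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumption: Julian year

# One 32-byte root is appended every SLOTS_PER_HISTORICAL_ROOT slots.
seconds_per_root = SLOTS_PER_HISTORICAL_ROOT * SECONDS_PER_SLOT
roots_per_year = SECONDS_PER_YEAR / seconds_per_root
bytes_per_year = roots_per_year * BYTES_PER_ROOT
years_to_fill = HISTORICAL_ROOTS_LIMIT / roots_per_year
```

This gives about 321 roots, or roughly 10 KB, per year.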


`VALIDATOR_REGISTRY_LIMIT`	`uint64(2**40)` (= 1,099,511,627,776)	validators

Every time the Eth1 deposit contract processes a deposit from a new validator (as identified by its public key), a new entry is appended to the state.validators list.

In the current design, validators are never removed from this list, even after exiting from being a validator. This is largely because there is nowhere yet to send a validator's remaining deposit and staking rewards, so they continue to need to be tracked in the beacon chain.

The maximum length of this list is VALIDATOR_REGISTRY_LIMIT, which is about 1.1 trillion, so we ought to be OK for a while, especially given that the minimum deposit amount is 1 Ether.

Rewards and penalties

Name	Value
`BASE_REWARD_FACTOR`	`uint64(2**6)` (= 64)

This is the big knob to turn to change the issuance rate of Eth2. Almost all validator rewards are calculated in terms of a "base reward" which is calculated as,

effective_balance * BASE_REWARD_FACTOR // integer_squareroot(total_balance) // BASE_REWARDS_PER_EPOCH

where effective_balance is the individual validator's current effective balance and total_balance is the sum of the effective balances of all active validators.

Thus, the total validator rewards per epoch (the Eth2 issuance rate) could in principle be tuned by increasing or decreasing BASE_REWARD_FACTOR.
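A worked example of the base reward formula may help. This is a sketch, not the spec's function: it takes balances as plain integers in Gwei rather than reading them from the state, and BASE_REWARDS_PER_EPOCH = 4 is a constant defined elsewhere in the spec that I am assuming here. The total balance of 400,000 ETH is chosen only because it has an exact integer square root in Gwei.

```python
from math import isqrt

BASE_REWARD_FACTOR = 2**6
BASE_REWARDS_PER_EPOCH = 4  # assumption: defined elsewhere in the spec
GWEI_PER_ETH = 10**9


def base_reward(effective_balance: int, total_balance: int) -> int:
    """Base reward in Gwei, following the formula above (all integer arithmetic)."""
    return effective_balance * BASE_REWARD_FACTOR // isqrt(total_balance) // BASE_REWARDS_PER_EPOCH


# A 32 ETH validator when 400,000 ETH is staked in total.
reward = base_reward(32 * GWEI_PER_ETH, 400_000 * GWEI_PER_ETH)  # 25,600 Gwei
```

Because of the square root, quadrupling the total stake only halves each validator's base reward.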


`WHISTLEBLOWER_REWARD_QUOTIENT`	`uint64(2**9)` (= 512)

One reward amount that is not tied to the base reward is the whistleblower reward. This is a reward for providing a proof that a proposer or attester has violated a slashing condition. The whistleblower reward is set at $\smash{\frac{1}{512}}$ of the effective balance of the slashed validator.


`PROPOSER_REWARD_QUOTIENT`	`uint64(2**3)` (= 8)

This is mostly used to apportion rewards between attesters and proposers when including attestations in blocks. For each included attestation, 7/8 of the reward goes to the attester, and 1/8 to the proposer that includes it. See get_proposer_reward() and get_inclusion_delay_deltas().

In addition, the above whistleblower reward can optionally be divided between the reporter of the slashing offence and the proposer that includes the report in a block. In principle, the reporter of the slashing receives 7/8 of the reward, and the block proposer 1/8. In the Phase 0 spec, however, the whistleblower reward always gets awarded in its entirety to the block proposer, ignoring this parameter. It's quite hard to avoid whistleblowing reports being stolen by block proposers, so this makes sense, although zkProofs might help one day.
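The split can be sketched like this. The function and its boolean flag are my own illustration of the behaviour described above, not the spec's interface; amounts are in Gwei.

```python
from typing import Tuple

WHISTLEBLOWER_REWARD_QUOTIENT = 2**9
PROPOSER_REWARD_QUOTIENT = 2**3


def split_whistleblower_reward(effective_balance: int,
                               reporter_is_proposer: bool = True) -> Tuple[int, int]:
    """Return (proposer_amount, reporter_amount) for one slashing (hypothetical helper).

    In Phase 0 the reporter is always taken to be the block proposer, so the
    proposer collects both shares.
    """
    whistleblower_reward = effective_balance // WHISTLEBLOWER_REWARD_QUOTIENT
    proposer_cut = whistleblower_reward // PROPOSER_REWARD_QUOTIENT
    reporter_cut = whistleblower_reward - proposer_cut
    if reporter_is_proposer:
        return whistleblower_reward, 0
    return proposer_cut, reporter_cut
```

For a slashed validator with a 32 ETH effective balance, the total reward is 62,500,000 Gwei (0.0625 ETH). If the roles were ever separated, the proposer would receive 7,812,500 Gwei and the reporter 54,687,500 Gwei.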


`INACTIVITY_PENALTY_QUOTIENT`	`uint64(2**26)` (= 67,108,864)

If the beacon chain hasn't finalised an epoch for longer than MIN_EPOCHS_TO_INACTIVITY_PENALTY epochs, then it enters "leak" mode. In this mode, any validator that does not vote (or votes for an incorrect target) is penalised an amount each epoch of effective_balance * finality_delay // INACTIVITY_PENALTY_QUOTIENT. The effect of this is the inactivity leak described below.

This value was increased from 2**24 for the beacon chain launch, with the intention of resetting it in a hardfork after a few months. The goal is to penalise validators less severely in case of non-finalisation due to implementation problems in the early days.


`MIN_SLASHING_PENALTY_QUOTIENT`	`uint64(2**7)` (= 128)

When a validator is first convicted of a slashable offence, an initial penalty is applied. This is calculated as,

validator.effective_balance // MIN_SLASHING_PENALTY_QUOTIENT

Thus, the initial slashing penalty is between 0.125 Ether and 0.25 Ether depending on the validator's effective balance (which is between 16 and 32 Ether; note that effective balance is denominated in Gwei).

A further slashing penalty is applied later based on the total amount of balance slashed during a period of EPOCHS_PER_SLASHINGS_VECTOR.

This value was increased from 2**5 for the beacon chain launch, with the intention of resetting it in a hardfork after a few months. The goal is to punish validators less severely in case of slashing due to implementation problems in the early days.


`PROPORTIONAL_SLASHING_MULTIPLIER`	`uint64(1)`

See below.

The INACTIVITY_PENALTY_QUOTIENT equals INVERSE_SQRT_E_DROP_TIME**2 where INVERSE_SQRT_E_DROP_TIME := 2**13 epochs (about 36 days) is the time it takes the inactivity penalty to reduce the balance of non-participating validators to about 1/sqrt(e) ~= 60.6%. Indeed, the balance retained by offline validators after n epochs is about (1 - 1/INACTIVITY_PENALTY_QUOTIENT)**(n**2/2); so after INVERSE_SQRT_E_DROP_TIME epochs, it is roughly (1 - 1/INACTIVITY_PENALTY_QUOTIENT)**(INACTIVITY_PENALTY_QUOTIENT/2) ~= 1/sqrt(e). Note this value will be upgraded to 2**24 after Phase 0 mainnet stabilizes to provide a faster recovery in the event of an inactivity leak.

The idea for the inactivity leak (aka the quadratic leak) was proposed in the original Casper FFG paper. The problem it addresses is that, if a large fraction of the validator set were to go offline at the same time, it would not be possible to continue finalising checkpoints, since a 2/3 majority of the whole validator set is required for finalisation. We've seen this happen on testnets when the participation metric falls below 67%.

In order to recover, the inactivity leak gradually reduces the stakes of validators who are not making attestations until, eventually, the remaining participating validators control 2/3 of the remaining stake. They can then begin to finalise checkpoints once again.

In the calculation here, we are solving (a discrete form of) the differential equation $\smash{\frac{dB}{dt}=-\frac{Bt}{\alpha}}$, where $B$ is the balance and $\alpha$ is the value of INACTIVITY_PENALTY_QUOTIENT, and the amount leaked at each step increases in proportion to the time $t$ since finality. The solution to this differential equation is $\smash{B(t)=B_0e^{-t^2/2\alpha}}$. From this it can be confirmed that INVERSE_SQRT_E_DROP_TIME, the time taken to reduce balances to $\smash{e^{-\frac{1}{2}}}$ of their starting values, is $t=\sqrt{\alpha}$. The second half of the spec paragraph is just a calculus-avoiding way of expressing the same thing.

With these parameters, it would take about 43 days for a validator to leak half its deposit and then be ejected for falling to the EJECTION_BALANCE threshold. (Calculated as $t = \sqrt{2\alpha\ln{2}}$ epochs for the balance to fall by half.) Note that non-participating validators don't have to be ejected to restore finality: it is sufficient to reduce their stakes until the participating validators hold 2/3 of the total.
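These figures can be checked numerically. The sketch below is a simplification under stated assumptions: it uses floating point, leaks from the actual balance rather than the stepped effective balance, and ignores the other penalties a non-participating validator would incur. The function names are mine.

```python
import math

INACTIVITY_PENALTY_QUOTIENT = 2**26
SECONDS_PER_EPOCH = 32 * 12


def remaining_fraction(epochs: int) -> float:
    """Fraction of balance left after leaking for `epochs` epochs (discrete form)."""
    balance = 1.0
    for t in range(1, epochs + 1):
        balance -= balance * t / INACTIVITY_PENALTY_QUOTIENT
    return balance


def epochs_to_halve() -> float:
    """Closed-form time for the balance to fall by half: sqrt(2 * alpha * ln 2)."""
    return math.sqrt(2 * INACTIVITY_PENALTY_QUOTIENT * math.log(2))


after_drop_time = remaining_fraction(2**13)  # close to 1/sqrt(e), about 0.606
days_to_halve = epochs_to_halve() * SECONDS_PER_EPOCH / 86400  # about 43 days
```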

This inactivity penalty mechanism is designed to protect the chain long-term in the face of catastrophic events (sometimes referred to as the ability to survive World War III). The result might be that the beacon chain could permanently split into two independent chains either side of a network partition, and this is assumed to be a reasonable outcome for any problem that can't be fixed in a few weeks. In this sense, the beacon chain technically prioritises availability over consistency. (You can't have both.)

The PROPORTIONAL_SLASHING_MULTIPLIER is set to 1 at initial mainnet launch, resulting in one-third of the minimum accountable safety margin in the event of a finality attack. After Phase 0 mainnet stabilizes, this value will be upgraded to 3 to provide the maximal minimum accountable safety margin.

When a validator has been slashed, a further penalty is later applied to the validator based on how many other validators were slashed during a window of size EPOCHS_PER_SLASHINGS_VECTOR epochs centred on that slashing event (approximately 18 days before and after).

The proportion of the validator's remaining effective balance that will be subtracted is calculated as, PROPORTIONAL_SLASHING_MULTIPLIER multiplied by the sum of the effective balances of the slashed validators in the window, divided by the total effective balance of all validators. The idea of this mechanism is to punish accidents lightly (in which only a small number of validators were slashed) and attacks heavily (where many validators coordinated to double vote).

To finalise conflicting checkpoints, at least a third of the balance must have voted for both. That's why the "natural" setting of PROPORTIONAL_SLASHING_MULTIPLIER is three: those slashed validators will lose their entire stakes due to this clear attack.
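Here is a sketch of that calculation with plain integers in Gwei. It follows my reading of the spec's slashings processing, including the rounding to whole increments; EFFECTIVE_BALANCE_INCREMENT is defined elsewhere in the spec, and the function signature is my own.

```python
PROPORTIONAL_SLASHING_MULTIPLIER = 1
EFFECTIVE_BALANCE_INCREMENT = 10**9  # Gwei; assumption: defined elsewhere in the spec
ETH = 10**9  # Gwei per Ether


def proportional_slashing_penalty(effective_balance: int,
                                  total_slashed: int,
                                  total_balance: int,
                                  multiplier: int = PROPORTIONAL_SLASHING_MULTIPLIER) -> int:
    """Second-stage slashing penalty in Gwei (hypothetical helper)."""
    # The slashed fraction is scaled by the multiplier but capped at 100%.
    adjusted_total_slashed = min(total_slashed * multiplier, total_balance)
    # Work in whole increments to avoid overflowing uint64 in real clients.
    increments = effective_balance // EFFECTIVE_BALANCE_INCREMENT
    return increments * adjusted_total_slashed // total_balance * EFFECTIVE_BALANCE_INCREMENT


# One third of a 300,000 ETH validator set slashed within the window.
launch_penalty = proportional_slashing_penalty(32 * ETH, 100_000 * ETH, 300_000 * ETH)
natural_penalty = proportional_slashing_penalty(32 * ETH, 100_000 * ETH, 300_000 * ETH, multiplier=3)
# A lone accident: a single 32 ETH validator slashed.
accident_penalty = proportional_slashing_penalty(32 * ETH, 32 * ETH, 300_000 * ETH)
```

With the launch multiplier of 1 the attacker in this example loses 10 ETH; with the "natural" value of 3 it loses all 32 ETH; the lone accident rounds down to no additional penalty.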

Max operations per block

Name	Value
`MAX_PROPOSER_SLASHINGS`	`2**4` (= 16)
`MAX_ATTESTER_SLASHINGS`	`2**1` (= 2)
`MAX_ATTESTATIONS`	`2**7` (= 128)
`MAX_DEPOSITS`	`2**4` (= 16)
`MAX_VOLUNTARY_EXITS`	`2**4` (= 16)

These parameters are used to size lists in the beacon block bodies for the purposes of SSZ serialisation, as well as constraining the maximum size of beacon blocks so that they can propagate efficiently, and avoid DoS attacks.

With these settings, the maximum size of a beacon block (before compression) is 123,016 bytes. By far the largest object is the AttesterSlashing, at up to 33,216 bytes. However, a single attester slashing can be used to slash many misbehaving validators at the same time (assuming that in an attack, many validators would make the same conflicting vote).

With some assumptions on average behaviour and compressibility, this leads to an average block size of around 36 KBytes, compressing down to 22 KBytes, in the worst case (with the maximum number of validators, and the maximum average number of possible slashings).

Some calculations to support the above can be found for each of the containers in the next section. Also on this spreadsheet (numbers are a bit out of date). Protolambda has a script for calculating all the Eth2 container minimum and maximum sizes.

Some comments on the chosen values:

I have suggested elsewhere reducing MAX_DEPOSITS from sixteen to one.

At first sight there looks to be a disparity between the number of proposer slashings and the number of attester slashings that may be included in a block. But note that an attester slashing (a) can be much larger than a proposer slashing, and (b) as noted above, can result in many more validators getting slashed than a proposer slashing.

MAX_ATTESTATIONS is double the value of MAX_COMMITTEES_PER_SLOT. This allows there to be an empty slot (no block proposal), yet still include all the attestations for the empty slot in the next slot, since, ideally, each committee produces a single aggregate attestation.

Domain types

Name	Value
`DOMAIN_BEACON_PROPOSER`	`DomainType('0x00000000')`
`DOMAIN_BEACON_ATTESTER`	`DomainType('0x01000000')`
`DOMAIN_RANDAO`	`DomainType('0x02000000')`
`DOMAIN_DEPOSIT`	`DomainType('0x03000000')`
`DOMAIN_VOLUNTARY_EXIT`	`DomainType('0x04000000')`
`DOMAIN_SELECTION_PROOF`	`DomainType('0x05000000')`
`DOMAIN_AGGREGATE_AND_PROOF`	`DomainType('0x06000000')`

These domain types are used in two ways: for signatures and for seeds.

As a cryptographic nicety, each of the protocol's five signature types is augmented with the appropriate Domain before being signed:

Signed block proposals incorporate DOMAIN_BEACON_PROPOSER

Signed attestations incorporate DOMAIN_BEACON_ATTESTER

RANDAO reveals are BLS signatures, and use DOMAIN_RANDAO

Deposit data messages from Ethereum 1 incorporate DOMAIN_DEPOSIT

Validator voluntary exit messages incorporate DOMAIN_VOLUNTARY_EXIT

In each case, except for Eth1 deposits, the fork version is also incorporated before signing. Deposits are valid across forks, but other messages are not. Note that this would allow validators to participate, if they wish, in two independent forks of the beacon chain without fear of being slashed.

In addition, the first two domains are also used to separate the seeds for random number generation. The original motivation was to avoid occasional collisions between Phase 0 committees and Phase 1 persistent committees. So, when computing the beacon block proposer, DOMAIN_BEACON_PROPOSER is hashed into the seed, and when computing committees, DOMAIN_BEACON_ATTESTER is hashed into the seed.

The last two domains were introduced to implement attestation subnet validations for denial of service resistance. They are not part of the consensus-critical state-transition. In short, each slot, validators are selected to aggregate attestations from their committee. The selection is done based on the validator's signature over the slot number, mixing in DOMAIN_SELECTION_PROOF. Then the validator signs the whole aggregated attestation using DOMAIN_AGGREGATE_AND_PROOF. See the Honest Validator spec for more on this.
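To show how a domain type, the fork version and the genesis validators root get combined, here is a self-contained sketch. It is my reconstruction, not a copy of the spec's code: the hash tree root of the two-field ForkData container is computed by hand as the SHA-256 hash of its two 32-byte leaves, and the genesis validators root used in the example is a made-up placeholder.

```python
from hashlib import sha256

DOMAIN_BEACON_PROPOSER = bytes.fromhex("00000000")
DOMAIN_BEACON_ATTESTER = bytes.fromhex("01000000")


def compute_fork_data_root(current_version: bytes, genesis_validators_root: bytes) -> bytes:
    """Hash tree root of ForkData: two 32-byte leaves hashed together."""
    assert len(current_version) == 4 and len(genesis_validators_root) == 32
    version_leaf = current_version.ljust(32, b"\x00")  # short values are right-padded with zeros
    return sha256(version_leaf + genesis_validators_root).digest()


def compute_domain(domain_type: bytes, fork_version: bytes, genesis_validators_root: bytes) -> bytes:
    """32-byte domain: 4-byte domain type followed by 28 bytes of the fork data root."""
    assert len(domain_type) == 4
    fork_data_root = compute_fork_data_root(fork_version, genesis_validators_root)
    return domain_type + fork_data_root[:28]


placeholder_root = b"\x11" * 32  # hypothetical genesis validators root, for illustration only
attester_domain = compute_domain(DOMAIN_BEACON_ATTESTER, bytes.fromhex("00000000"), placeholder_root)
post_fork_domain = compute_domain(DOMAIN_BEACON_ATTESTER, bytes.fromhex("01000000"), placeholder_root)
```

The two domains share the same first four bytes but differ thereafter, which is what makes an attestation signed on one fork invalid on the other.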

Containers

The following types are SimpleSerialize (SSZ) containers.

We're about to see our first Python code in the executable spec. For specification purposes, these Container data structures are just Python data classes that are derived from the base SSZ Container class.

SSZ is the serialisation and merkleisation format used everywhere in Eth2. It is not self-describing, so you need to know what you are unpacking when deserialising. SSZ deals with basic types and composite types. Classes like the below are handled as SSZ containers, a composite type defined as an "ordered heterogeneous collection of values".

Implementations will obviously use their own paradigms to represent these data structures (we use a combination of Java classes and interfaces).

[TODO: check sizes of containers against Proto's script.]

Note: The definitions are ordered topologically to facilitate execution of the spec.

Note: Fields missing in container instantiations default to their zero value.

In the below, for most of the containers, I've shown the size along with the working out. If you prefer your information programmatically generated, see this from Protolambda (for spec v0.12.x).

Misc dependencies

`Fork`

class Fork(Container):
    previous_version: Version
    current_version: Version
    epoch: Epoch  # Epoch of latest fork

Fork data is stored in the BeaconState to indicate the current and previous fork versions. The fork version gets incorporated into the cryptographic domain in order to invalidate messages from validators on other forks. The previous fork version and the epoch of the change are stored so that pre-fork messages can still be validated (at least until the next fork).

Note that this is all about manual, protocol forks, and nothing to do with the fork-choice rule.

Fixed size: 4 + 4 + 8 = 16 bytes

`ForkData`

class ForkData(Container):
    current_version: Version
    genesis_validators_root: Root

Only used in compute_fork_data_root(). This is used when distinguishing between chains for the purpose of peer-to-peer gossip, and for domain separation. By including both the current fork version and the genesis validators root, we can cleanly distinguish between, say, mainnet and a testnet. They might both have the same fork history, but the genesis validators roots will differ.

Version is the datatype for a fork version number.

`Checkpoint`

class Checkpoint(Container):
    epoch: Epoch
    root: Root

Checkpoints are the points of justification or finalisation by the Casper FFG protocol. They are used by validators in creating AttestationData votes, and also for recording the status of recent checkpoints in BeaconState.

As per the Casper paper, checkpoints contain a height, and a block root. In this implementation of Casper FFG, checkpoints occur whenever the slot number is a multiple of SLOTS_PER_EPOCH, thus they correspond to epoch numbers. In particular, checkpoint $N$ is the first slot of epoch $N$. The genesis block is Checkpoint 0, and starts off both justified and finalised.

Thus, the root element here is the block root of the first block in the epoch. (This might be the block root of an earlier block if some slots have been skipped, that is, if there are no blocks for those slots.)

Once a checkpoint has been finalised, the slot it points to and all prior slots will never be reverted.

Fixed size: 8 + 32 = 40 bytes
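The relationship between checkpoints, epochs and slots, including the skipped-slot case, can be sketched as follows. The first two functions mirror helpers of the same names in the spec; checkpoint_root and its dictionary of block roots are my own illustration and assume the genesis block at slot 0 is always present.

```python
from typing import Dict

SLOTS_PER_EPOCH = 32


def compute_epoch_at_slot(slot: int) -> int:
    """Epoch that a slot belongs to."""
    return slot // SLOTS_PER_EPOCH


def compute_start_slot_at_epoch(epoch: int) -> int:
    """First slot of an epoch, which is where checkpoint `epoch` sits."""
    return epoch * SLOTS_PER_EPOCH


def checkpoint_root(block_roots: Dict[int, bytes], epoch: int) -> bytes:
    """Root of the latest block at or before the checkpoint slot (hypothetical helper)."""
    slot = compute_start_slot_at_epoch(epoch)
    while slot > 0 and slot not in block_roots:
        slot -= 1  # the checkpoint slot was skipped, so fall back to an earlier block
    return block_roots[slot]
```

For example, if slot 96 (the start of epoch 3) has no block but slot 95 does, checkpoint 3 carries the root of the block from slot 95.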

`Validator`

class Validator(Container):
    pubkey: BLSPubkey
    withdrawal_credentials: Bytes32  # Commitment to pubkey for withdrawals
    effective_balance: Gwei  # Balance at stake
    slashed: boolean
    # Status epochs
    activation_eligibility_epoch: Epoch  # When criteria for activation were met
    activation_epoch: Epoch
    exit_epoch: Epoch
    withdrawable_epoch: Epoch  # When validator can withdraw funds

This is the datastructure that stores (almost) all the information about each individual validator.

Validators' actual balances are stored separately in the BeaconState structure, and only the slowly changing "effective balance" is stored here. This is because actual balances are liable to change quite frequently (every epoch): the way that Eth2 calculates state roots means that only the parts that change need to be recalculated; the roots of unchanged parts can be cached. Separating out the validator balances potentially means that only 1/15th (8/121) as much data needs to be rehashed every epoch compared to storing them here, which is an important optimisation.

A validator's record is created when its deposit is first processed. Sending multiple deposits does not create multiple validator records: deposits with the same public key are aggregated in one record. Validator records never expire in Phase 0; they are stored permanently, even after the validator has exited the system. Thus there is a 1:1 mapping between a validator's index in the list and the identity of the validator (validator records are only ever appended to the list).

Also stored in Validator:

pubkey serves as both the unique identity of the validator and the means of cryptographically verifying messages purporting to have been signed by it. The public key is stored raw, unlike in Eth1, where it is hashed to form the account address. This is to allow public keys to be aggregated for verifying aggregated attestations.

Validators actually have two private/public key pairs, the one above used for signing protocol messages, and a separate "withdrawal key". withdrawal_credentials is a commitment generated from the validator's withdrawal key so that, at some time in the future, a validator can prove it owns the funds and will be able to withdraw them. Storing the hash of the public key rather than the key itself saves a few bytes (16 bytes).

effective_balance is a topic of its own that we've touched upon already, and will discuss more fully when we look at process_final_updates.

slashed indicates that a validator has been slashed, that is, punished for violating the slashing conditions. A validator can only be slashed once.

The remaining values are the epochs in which the validator changed, or is due to change state.

A detailed explanation of the stages in a validator's lifecycle is here, and we'll be covering it in detail as we work through the beacon chain logic. But, in simplified form, progress is as follows:

A 32 Eth deposit has been made on the Eth1 chain. No validator record exists yet.

The deposit is processed by the beacon chain at some slot. A validator record is created with all epoch fields set to FAR_FUTURE_EPOCH.

At the end of the epoch, the activation_eligibility_epoch is set to the next epoch.

After the epoch activation_eligibility_epoch has been finalised, the validator is added to the activation queue by setting its activation_epoch appropriately, taking into account the per-epoch churn limit and MAX_SEED_LOOKAHEAD.

On reaching activation_epoch the validator becomes active, and should carry out its duties.

At any time after SHARD_COMMITTEE_PERIOD epochs, a validator may request a voluntary exit. exit_epoch is set according to the validator's position in the exit queue and MAX_SEED_LOOKAHEAD, and withdrawable_epoch is set MIN_VALIDATOR_WITHDRAWABILITY_DELAY epochs after that.

From exit_epoch onwards the validator is no longer active. There is no mechanism for exited validators to rejoin: exiting is permanent.

After withdrawable_epoch, the validator's balance can in principle be withdrawn, although there is no mechanism for doing this in Phase 0.

The above does not account for slashing or forced exits due to low balance.
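As a rough illustration of how the four epoch fields encode the lifecycle above, here is a sketch. The dataclass is a cut-down stand-in for the real Validator container, the stage names are my own labels, and, like the list above, it ignores slashing and ejection.

```python
from dataclasses import dataclass

FAR_FUTURE_EPOCH = 2**64 - 1


@dataclass
class ValidatorEpochs:
    """Only the status epochs of a validator record (hypothetical stand-in)."""
    activation_eligibility_epoch: int = FAR_FUTURE_EPOCH
    activation_epoch: int = FAR_FUTURE_EPOCH
    exit_epoch: int = FAR_FUTURE_EPOCH
    withdrawable_epoch: int = FAR_FUTURE_EPOCH


def lifecycle_stage(v: ValidatorEpochs, epoch: int) -> str:
    """Classify a validator at `epoch`, checking the latest stages first."""
    if epoch >= v.withdrawable_epoch:
        return "withdrawable"
    if epoch >= v.exit_epoch:
        return "exited"
    if epoch >= v.activation_epoch:
        return "active"  # activation_epoch <= epoch < exit_epoch
    if v.activation_epoch != FAR_FUTURE_EPOCH:
        return "queued for activation"
    if v.activation_eligibility_epoch != FAR_FUTURE_EPOCH:
        return "eligible for activation"
    return "deposited"
```

A validator with activation_epoch 20, exit_epoch 300 and withdrawable_epoch 556 is therefore active for epochs 20 to 299 inclusive.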

Fixed size: 48 + 32 + 8 + 1 + 4 * 8 = 121 bytes

`AttestationData`

class AttestationData(Container):
    slot: Slot
    index: CommitteeIndex
    # LMD GHOST vote
    beacon_block_root: Root
    # FFG vote
    source: Checkpoint
    target: Checkpoint

Eth2 relies on a combination of two different consensus mechanisms: LMD GHOST keeps the chain moving, and Casper FFG brings finalisation. These are documented in the Gasper paper. Attestations from (committees of) validators are used to provide votes simultaneously for each of these consensus mechanisms.

This class is the fundamental unit of attestation data.

slot: each active validator should be making exactly one attestation per epoch. Validators have an assigned slot for their attestation, and it is recorded here.

index: there can be several committees active in a single slot. This is the number of the committee that the validator belongs to in that slot. It is used to reconstruct the committee and to check that the attesting validator is a member. Ideally, all (or the majority at least) of the attestations in a slot from one committee will be identical, and can therefore be aggregated into a smaller number of attestations.

beacon_block_root is the validator's vote on the best block for that slot after locally running the LMD GHOST fork-choice rule.

source is the validator's opinion of the best currently justified checkpoint for the Casper FFG finalisation process.

target is the validator's opinion of the block at the start of the current epoch, also for Casper FFG finalisation.

This AttestationData structure gets wrapped up into several other similar but distinct structures:

Attestation: This is the form in which attestations normally make their way around the network. It is signed and aggregatable, and the list of validators making this attestation is compressed into a bitlist.

IndexedAttestation: Used primarily for attester slashing, it is signed and aggregated, with the list of attesting validators being an uncompressed list of indices.

PendingAttestation: After having their validity checked during block processing, these are stored in the beacon state pending processing at the end of the epoch. The signature is not stored, and the list of attesting validators is compressed into a bitlist.

Fixed size: 8 + 8 + 32 + 2 * 40 = 128 bytes
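The fixed sizes worked out by hand so far can be reproduced mechanically. This is only a toy calculator for fixed-size containers, with the byte widths of the basic types written in by me; it is no substitute for a real SSZ library.

```python
# Serialised widths in bytes of the basic and byte-vector types used so far.
BASIC_SIZES = {
    "Version": 4, "Epoch": 8, "Slot": 8, "CommitteeIndex": 8, "Gwei": 8,
    "boolean": 1, "Root": 32, "Bytes32": 32, "BLSPubkey": 48,
}

# Field types of each container, in declaration order.
CONTAINERS = {
    "Fork": ["Version", "Version", "Epoch"],
    "ForkData": ["Version", "Root"],
    "Checkpoint": ["Epoch", "Root"],
    "Validator": ["BLSPubkey", "Bytes32", "Gwei", "boolean",
                  "Epoch", "Epoch", "Epoch", "Epoch"],
    "AttestationData": ["Slot", "CommitteeIndex", "Root", "Checkpoint", "Checkpoint"],
}


def fixed_size(type_name: str) -> int:
    """Serialised size of a fixed-size type: the sum of its fields, recursively."""
    if type_name in BASIC_SIZES:
        return BASIC_SIZES[type_name]
    return sum(fixed_size(field) for field in CONTAINERS[type_name])


sizes = {name: fixed_size(name) for name in CONTAINERS}
# Share of a Validator record that a balance would occupy if it were stored inline.
balance_share = BASIC_SIZES["Gwei"] / sizes["Validator"]  # 8/121, about 1/15
```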

`IndexedAttestation`

class IndexedAttestation(Container):
    attesting_indices: List[ValidatorIndex, MAX_VALIDATORS_PER_COMMITTEE]
    data: AttestationData
    signature: BLSSignature

This is one of the forms in which aggregated attestations—combined identical attestations from multiple validators in the same committee—are handled.

Attestations and IndexedAttestations contain essentially the same information. The difference being that the list of attesting validators is stored uncompressed in IndexedAttestations. That is, each attesting validator is referenced by its global validator index, and non-attesting validators are not included. To be valid, the validator indices must be unique and sorted, and the signature must be an aggregate signature from exactly the listed set of validators.

IndexedAttestations are primarily used when reporting attester slashing. An Attestation can be converted to an IndexedAttestation using get_indexed_attestation().

Max size: 8 * 2048 + 128 + 96 = 16,608 bytes
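To make the relationship between the two forms concrete, here is a minimal sketch of expanding a committee bitlist into the sorted index list. The function name `indices_from_bits` and the sample committee are invented for this illustration; the real conversion is get_indexed_attestation(), which also needs the beacon state in order to look up the committee.

```python
from typing import List, Sequence

def indices_from_bits(committee: Sequence[int], aggregation_bits: Sequence[bool]) -> List[int]:
    """Return the sorted global indices of committee members whose bit is set."""
    assert len(committee) == len(aggregation_bits)
    return sorted(index for index, bit in zip(committee, aggregation_bits) if bit)

# A hypothetical committee of five validators, in committee (shuffled) order
committee = [407, 12, 9001, 33, 250]
bits = [True, False, True, True, False]
attesting_indices = indices_from_bits(committee, bits)
```

Sorting gives the unique, ordered list that the validity rule above asks for, and non-attesting validators simply drop out.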

`PendingAttestation`

class PendingAttestation(Container):
    aggregation_bits: Bitlist[MAX_VALIDATORS_PER_COMMITTEE]
    data: AttestationData
    inclusion_delay: Slot
    proposer_index: ValidatorIndex

Attestations received in blocks are verified and then temporarily stored in beacon state in the form of PendingAttestations, pending further processing at the end of the epoch.

A PendingAttestation is an Attestation minus the signature, plus a couple of fields related to reward calculation:

inclusion_delay is the number of slots between the attestation having been made and it being included in a beacon block by the block proposer. Validators are rewarded for getting their attestations included in blocks, but the reward declines in inverse proportion to the inclusion delay. This incentivises swift attesting and communicating by validators.

proposer_index is the block proposer that included the attestation. The block proposer gets a micro reward for every validator's attestation it includes, not just for the aggregate attestation as a whole. This incentivises efficient finding and packing of aggregations, since the number of aggregate attestations per block is capped.

Taken together, these rewards ought to incentivise the whole network to collaborate to do efficient attestation aggregation (proposers want to include only well-aggregated attestations; validators want to get their attestations included, so will ensure that they get well aggregated).
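A minimal sketch of the "inverse proportion" idea only: the function name and the figure of 1200 are made up for illustration, and the real reward calculation in the spec involves more than this single division.

```python
def inclusion_delay_reward(max_attester_reward: int, inclusion_delay: int) -> int:
    """Reward that declines in inverse proportion to the inclusion delay (integer division)."""
    # Assumption for this sketch: the earliest possible inclusion is one slot later
    assert inclusion_delay >= 1
    return max_attester_reward // inclusion_delay

# Arbitrary illustrative reward of 1200 units, included after 1, 2, 3 and 4 slots
rewards = [inclusion_delay_reward(1200, delay) for delay in (1, 2, 3, 4)]
```

Under these assumptions, an attestation included after two slots earns half of what it would have earned after one.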

Max size: 2048 / 8 + 128 + 8 + 8 = 400 bytes

`Eth1Data`

class Eth1Data(Container):
    deposit_root: Root
    deposit_count: uint64
    block_hash: Bytes32

Proposers include their view of the Eth1 chain in blocks, and this is how they do it. The beacon chain stores these up as votes in beacon state until there is a majority consensus, and then the winner is committed to beacon state. This is to allow the processing of Eth1 deposits, and creates a simple "honest-majority" one-way bridge from Eth1 to Eth2. The 1/2 majority assumption for this (rather than 2/3 for committees) is considered safe as the number of validators voting each time is large: one proposer votes in each of the EPOCHS_PER_ETH1_VOTING_PERIOD * SLOTS_PER_EPOCH slots of a voting period.

deposit_root is the result of the get_deposit_root() method of the Eth1 deposit contract after executing the Eth1 block being voted on—it's the root of the (sparse) Merkle tree of deposits.

deposit_count is the number of deposits in the deposit contract at that point, the result of the get_deposit_count method on the contract. This will be equal to or greater than (if there are pending unprocessed deposits) the value of state.eth1_deposit_index.

block_hash is the block hash of the Eth1 block being voted for. This doesn't have any current use within the Eth2 protocol, but is "too potentially useful to not throw in there", to quote Danny Ryan.

Fixed size: 32 + 8 + 32 = 72 bytes
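The voting rule can be sketched roughly as follows. The helper name `winning_eth1_vote` and the toy period length are assumptions for this example, and votes are represented as plain strings rather than Eth1Data objects. Note that in this sketch the majority is measured against the length of the whole voting period, not against the number of votes cast so far.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence

def winning_eth1_vote(votes: Sequence[Hashable], period_slots: int) -> Optional[Hashable]:
    """Return the vote held by more than half of the slots in the period, if any."""
    if not votes:
        return None
    candidate, count = Counter(votes).most_common(1)[0]
    return candidate if count * 2 > period_slots else None

PERIOD_SLOTS = 8  # toy value, far smaller than the real voting period
early = winning_eth1_vote(["A", "B", "A", "A"], PERIOD_SLOTS)           # 3 of 8: not yet
later = winning_eth1_vote(["A", "B", "A", "A", "A", "A"], PERIOD_SLOTS)  # 5 of 8: majority
```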

`HistoricalBatch`

class HistoricalBatch(Container):
    block_roots: Vector[Root, SLOTS_PER_HISTORICAL_ROOT]
    state_roots: Vector[Root, SLOTS_PER_HISTORICAL_ROOT]

This is used to implement part of the double batched accumulator for the past history of the chain. Once SLOTS_PER_HISTORICAL_ROOT block roots and the same number of state roots have been accumulated in the beacon state, they are put in a HistoricalBatch object and the hash tree root of that is appended to the historical_roots list in beacon state. The corresponding block and state root lists in the beacon state are circular and just get overwritten in the next period. See process_final_updates.

Fixed size: 2 * 32 * 8192 = 524,288 bytes (but never stored as-is, only its hash tree root.)
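Here is a toy simulation of the circular lists and the accumulator, as a sketch only. It uses a period of 4 slots rather than 8192, and `stand_in_root` is a plain SHA256 over the concatenated roots, which is explicitly not the SSZ hash tree root that the real protocol uses.

```python
import hashlib
from typing import List

SLOTS_PER_HISTORICAL_ROOT = 4  # toy value; the real constant is 8192

def stand_in_root(block_roots: List[bytes], state_roots: List[bytes]) -> bytes:
    """NOT hash_tree_root: a plain SHA256 over the concatenated roots, for illustration only."""
    return hashlib.sha256(b"".join(block_roots) + b"".join(state_roots)).digest()

block_roots = [bytes(32)] * SLOTS_PER_HISTORICAL_ROOT  # circular buffer
state_roots = [bytes(32)] * SLOTS_PER_HISTORICAL_ROOT  # circular buffer
historical_roots: List[bytes] = []                     # append-only accumulator

for slot in range(10):
    # Overwrite the oldest entry in each circular buffer
    block_roots[slot % SLOTS_PER_HISTORICAL_ROOT] = hashlib.sha256(b"block%d" % slot).digest()
    state_roots[slot % SLOTS_PER_HISTORICAL_ROOT] = hashlib.sha256(b"state%d" % slot).digest()
    # When a period completes, commit the full buffers to the accumulator
    if (slot + 1) % SLOTS_PER_HISTORICAL_ROOT == 0:
        historical_roots.append(stand_in_root(block_roots, state_roots))
```

After ten slots, two full periods have completed, so the accumulator holds two roots while the buffers stay at their fixed length.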

`DepositMessage`

class DepositMessage(Container):
    pubkey: BLSPubkey
    withdrawal_credentials: Bytes32
    amount: Gwei

The basic information necessary to either add a validator to the registry, or to top up an existing validator's stake.

pubkey is the unique public key of the validator. If it is already present in the registry (the list of validators in beacon state) then amount is added to its balance. Otherwise a new Validator entry is appended to the list and credited with amount.

See the Validator class for info on withdrawal_credentials.

There are two protections that DepositMessages get at different points:

They are stored, pending processing, in beacon state as DepositData. This includes the validator's BLS signature so that the authenticity of the DepositMessage can be verified before a validator is added.

DepositData is included in beacon blocks as a Deposit, which adds a Merkle proof that the data has been registered with the Eth1 deposit contract.

Fixed size: 48 + 32 + 8 = 88 bytes

`DepositData`

class DepositData(Container):
    pubkey: BLSPubkey
    withdrawal_credentials: Bytes32
    amount: Gwei
    signature: BLSSignature  # Signing over DepositMessage

A signed DepositMessage. The comment says that the signing is done over DepositMessage. What actually happens is that a DepositMessage is constructed from the first three fields; the root of that is combined with DOMAIN_DEPOSIT in a SigningData object; finally the root of this is signed and included in DepositData.

Fixed size: 48 + 32 + 8 + 96 = 184 bytes

`BeaconBlockHeader`

class BeaconBlockHeader(Container):
    slot: Slot
    proposer_index: ValidatorIndex
    parent_root: Root
    state_root: Root
    body_root: Root

A standalone version of a beacon block header: BeaconBlocks contain their own header. It is identical to BeaconBlock, except that body is replaced by body_root. It is BeaconBlock-lite.

BeaconBlockHeader is stored in beacon state to record the last processed block header. This is used to ensure that we always proceed along a continuous chain of blocks that always point to their predecessor (it's a blockchain, yo!). See process_block_header().

The signed version is used in proposer slashings.

Fixed size: 2 * 8 + 3 * 32 = 112 bytes

`SigningData`

class SigningData(Container):
    object_root: Root
    domain: Domain

This is just a convenience class used only in compute_signing_root to calculate the hash tree root of an object along with a domain. That root is the message data that gets signed with a BLS signature. The SigningData object itself is never stored or transmitted.
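Because SigningData has just two fields, and assuming both are 32 bytes wide, its hash tree root reduces to a single SHA256 over the two values concatenated. The sketch below shows only that final step. The function name `signing_root` and the placeholder inputs are mine; in particular, `object_root` here is not a real SSZ root and `domain` is not a real domain.

```python
import hashlib

def signing_root(object_root: bytes, domain: bytes) -> bytes:
    """Root of a SigningData-like pair of 32-byte fields.

    With exactly two 32-byte leaves, the Merkle root is a single SHA256
    over their concatenation.
    """
    assert len(object_root) == 32 and len(domain) == 32
    return hashlib.sha256(object_root + domain).digest()

object_root = hashlib.sha256(b"some object").digest()  # placeholder, not a real SSZ root
domain = bytes([3, 0, 0, 0]) + bytes(28)                # placeholder 32-byte domain
message = signing_root(object_root, domain)
```

Changing the domain changes the message that is signed, which is what stops a signature made for one purpose being replayed for another.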

Beacon operations

`ProposerSlashing`

class ProposerSlashing(Container):
    signed_header_1: SignedBeaconBlockHeader
    signed_header_2: SignedBeaconBlockHeader

ProposerSlashings may be included in blocks to demonstrate that a validator has broken the rules and ought to be slashed. Proposers receive a reward for correctly submitting these.

In this case, the rule is that a validator may not propose two different blocks at the same height, and the payload is the signed headers of the two blocks that evidence the crime. The signatures on the SignedBeaconBlockHeaders are checked to verify that they were both signed by the accused validator.

Fixed size: 2 * 208 = 416 bytes
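The structural part of the check can be sketched as below. `Header` is a cut-down stand-in for BeaconBlockHeader and `is_conflicting_proposal` is a name invented for this example. Signature verification, and the check that the proposer is still slashable, are deliberately left out.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Header:
    """Cut-down stand-in for BeaconBlockHeader (roots as raw bytes)."""
    slot: int
    proposer_index: int
    parent_root: bytes
    state_root: bytes
    body_root: bytes

def is_conflicting_proposal(header_1: Header, header_2: Header) -> bool:
    """True if the two headers are different proposals by one validator for one slot."""
    return (
        header_1.slot == header_2.slot
        and header_1.proposer_index == header_2.proposer_index
        and header_1 != header_2
    )

a = Header(100, 7, bytes(32), b"\x01" * 32, b"\x02" * 32)
b = Header(100, 7, bytes(32), b"\x01" * 32, b"\x03" * 32)  # same slot, different body
c = Header(101, 7, bytes(32), b"\x01" * 32, b"\x02" * 32)  # different slot
```

Here `a` and `b` together would be evidence of an offence, while `a` and `c` would not, and neither would two copies of `a`.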

`AttesterSlashing`

class AttesterSlashing(Container):
    attestation_1: IndexedAttestation
    attestation_2: IndexedAttestation

AttesterSlashings may be included in blocks to demonstrate that a group of validators has broken the rules and ought to be slashed. Proposers receive a reward for correctly submitting these.

The contents of the IndexedAttestations are checked against the attester slashing conditions in is_slashable_attestation_data(). If there is a violation, then any validator that attested to both attestation_1 and attestation_2 is slashed, see process_attester_slashing().

AttesterSlashings are potentially very large since they could in principle list the indices of all the validators in a committee. On the other hand, many validators can be slashed as a result of a single report.

Max size: 2 * 16,608 = 33,216 bytes

`Attestation`

class Attestation(Container):
    aggregation_bits: Bitlist[MAX_VALIDATORS_PER_COMMITTEE]
    data: AttestationData
    signature: BLSSignature

This is the form in which attestations make their way around the network. It is designed to be easily aggregatable: Attestations containing identical AttestationData can be combined into a single attestation by aggregating the signatures.

Attestations contain the same information as IndexedAttestations, but use knowledge of the validator committees at slots to compress the list of attesting validators down to a bitlist. Thus, Attestations are about 1/35th of the size of IndexedAttestations.

Max size: 2048 / 8 + 128 + 96 = 480 bytes

`Deposit`

class Deposit(Container):
    proof: Vector[Bytes32, DEPOSIT_CONTRACT_TREE_DEPTH + 1]  # Merkle path to deposit root
    data: DepositData

Used to include deposit data from wannabe validators in beacon blocks so that they can be processed into beacon state.

The proof is a Merkle proof constructed by the block proposer that the DepositData corresponds to the previously agreed deposit root of the Eth1 contract's deposit tree. It is verified in process_deposit() by is_valid_merkle_branch().

Fixed size: 32 * (32 + 1) + 184 = 1240 bytes

`VoluntaryExit`

class VoluntaryExit(Container):
    epoch: Epoch  # Earliest epoch when voluntary exit can be processed
    validator_index: ValidatorIndex

Voluntary exit messages are how a validator signals that it wants to cease being a validator. They are ignored by the beacon chain if they are included in blocks before epoch, so nodes should buffer any future-dated exits they see before putting them in a block.

VoluntaryExit objects are never used naked; they are always wrapped up into a SignedVoluntaryExit object.

Fixed size: 8 + 8 = 16 bytes

Beacon blocks

`BeaconBlockBody`

class BeaconBlockBody(Container):
    randao_reveal: BLSSignature
    eth1_data: Eth1Data  # Eth1 data vote
    graffiti: Bytes32  # Arbitrary data
    # Operations
    proposer_slashings: List[ProposerSlashing, MAX_PROPOSER_SLASHINGS]
    attester_slashings: List[AttesterSlashing, MAX_ATTESTER_SLASHINGS]
    attestations: List[Attestation, MAX_ATTESTATIONS]
    deposits: List[Deposit, MAX_DEPOSITS]
    voluntary_exits: List[SignedVoluntaryExit, MAX_VOLUNTARY_EXITS]

From a beacon node's point of view, only two things on this page really matter: the BeaconBlock and the BeaconState. The former is how the latter gets updated. The BeaconBlockBody is the business part of a BeaconBlock.

A beacon block is proposed by a validator when its (randomly selected) turn comes. There ought to be exactly one beacon block per slot if things are running correctly.

Always present:

randao_reveal: the block is invalid if this does not verify correctly against the proposer's public key. This is the block proposer's contribution to the beacon chain's randomness. It is generated by the proposer signing the current epoch number (combined with DOMAIN_RANDAO) with its private key. To the best of anyone's knowledge, the result is indistinguishable from random. This gets mixed into the beacon state RANDAO.

See Eth1Data for eth1_data. In principle, this is mandatory, but it is not checked, and there is no penalty for making it up.

graffiti is left free for the proposer to insert whatever data it wishes. It has no protocol-level significance, and can be left as zero.

Optional, with rewards for inclusion:

proposer_slashings: up to MAX_PROPOSER_SLASHINGS ProposerSlashings may be included. There is a reward of up to 0.0625 Ether for each validator slashed as a result, all accruing to the block proposer.

attester_slashings: up to MAX_ATTESTER_SLASHINGS AttesterSlashings may be included. There is a reward of up to 0.0625 Ether for each validator slashed as a result, all accruing to the block proposer.

attestations: up to MAX_ATTESTATIONS (aggregated) Attestations may be included. The block proposer is incentivised to include well-packed aggregate attestations, as it receives a micro reward for each unique good attestation. In a perfect world, with perfectly aggregated attestations, MAX_ATTESTATIONS would be equal to MAX_COMMITTEES_PER_SLOT. In our configuration it is double. This allows for some imperfectly aggregated attestations, and to catch up after skip slots.

Mandatory, no rewards for inclusion:

deposits: if the block does not contain either all the outstanding Deposits, or MAX_DEPOSITS of them in deposit order, then it is invalid.

Optional, no rewards for inclusion:

voluntary_exits: up to MAX_VOLUNTARY_EXITS SignedVoluntaryExits may be included.

Max size: 96 + 72 + 32 + 416 * 16 + 33,216 * 2 + 480 * 128 + 1240 * 16 + 112 * 16 = 156,360 bytes

`BeaconBlock`

class BeaconBlock(Container):
    slot: Slot
    proposer_index: ValidatorIndex
    parent_root: Root
    state_root: Root
    body: BeaconBlockBody

BeaconBlock just adds some blockchain paraphernalia to BeaconBlockBody.

slot is the slot the block is proposed for. proposer_index was added to avoid a potential DoS vector and to allow clients without full access to the state to still know useful things. parent_root is used to make sure that this block is a direct child of the last block we processed. In order to calculate state_root, the proposer is expected to run the state transition on the block before propagating it. After the beacon node has processed the block, the state roots are compared to ensure they match. This seems to be the mechanism for tying the whole system together and making sure that all validators and beacon nodes are always working off the same version of state (absent any short-term forks).

If any of these are incorrect, then the block is invalid with respect to the current beacon state and will be ignored.

Max size: 8 + 8 + 32 + 32 + 156,360 = 156,440 bytes

Beacon state

`BeaconState`

class BeaconState(Container):
    # Versioning
    genesis_time: uint64
    genesis_validators_root: Root
    slot: Slot
    fork: Fork
    # History
    latest_block_header: BeaconBlockHeader
    block_roots: Vector[Root, SLOTS_PER_HISTORICAL_ROOT]
    state_roots: Vector[Root, SLOTS_PER_HISTORICAL_ROOT]
    historical_roots: List[Root, HISTORICAL_ROOTS_LIMIT]
    # Eth1
    eth1_data: Eth1Data
    eth1_data_votes: List[Eth1Data, EPOCHS_PER_ETH1_VOTING_PERIOD * SLOTS_PER_EPOCH]
    eth1_deposit_index: uint64
    # Registry
    validators: List[Validator, VALIDATOR_REGISTRY_LIMIT]
    balances: List[Gwei, VALIDATOR_REGISTRY_LIMIT]
    # Randomness
    randao_mixes: Vector[Bytes32, EPOCHS_PER_HISTORICAL_VECTOR]
    # Slashings
    slashings: Vector[Gwei, EPOCHS_PER_SLASHINGS_VECTOR]  # Per-epoch sums of slashed effective balances
    # Attestations
    previous_epoch_attestations: List[PendingAttestation, MAX_ATTESTATIONS * SLOTS_PER_EPOCH]
    current_epoch_attestations: List[PendingAttestation, MAX_ATTESTATIONS * SLOTS_PER_EPOCH]
    # Finality
    justification_bits: Bitvector[JUSTIFICATION_BITS_LENGTH]  # Bit set for every recent justified epoch
    previous_justified_checkpoint: Checkpoint  # Previous epoch snapshot
    current_justified_checkpoint: Checkpoint
    finalized_checkpoint: Checkpoint

All roads lead to the BeaconState. Maintaining this is the sole purpose of all the apparatus in all of these documents. This state is the focus of consensus among the beacon nodes: it is what everybody, eventually, must agree on.

Eth2's beacon state is monolithic: everything is bundled into the one state object (sometimes referred to as the "God object"). Some have argued for more granular approaches that might be more efficient, but at least the current approach is simple.

Let's break this thing down...

    # Versioning
    genesis_time: uint64
    genesis_validators_root: Root
    slot: Slot
    fork: Fork

How do we know which chain we're on, and where we are on it? This information ought to be sufficient. A path back to the genesis block would also do.

genesis_validators_root is calculated at Genesis time (when the chain starts) and is fixed for the life of the chain. This, combined with the fork identifier, should serve to uniquely identify the chain that we are on.

The fork choice rule uses genesis_time to work out what slot we're in.

The fork element is updated at hard forks (not related to the fork choice rule) to invalidate blocks and attestations from validators not following the new fork.

    # History
    latest_block_header: BeaconBlockHeader
    block_roots: Vector[Root, SLOTS_PER_HISTORICAL_ROOT]
    state_roots: Vector[Root, SLOTS_PER_HISTORICAL_ROOT]
    historical_roots: List[Root, HISTORICAL_ROOTS_LIMIT]

latest_block_header is only used to make sure that the next block we process is a direct descendant. It's a blockchain thing.

Past block_roots and state_roots are stored in lists here until the lists are full. Once they are full, the Merkle root is taken of both the lists together and appended to historical_roots. historical_roots effectively grows without bound (HISTORICAL_ROOTS_LIMIT is large), but only at a rate of 10KB per year. Keeping this data is useful for light clients, and also allows Merkle proofs to be created against past states, for example historical deposit data.
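The growth-rate figure can be checked with a little arithmetic. This sketch assumes 12-second slots (SECONDS_PER_SLOT is not defined in this part of the document) and the SLOTS_PER_HISTORICAL_ROOT value of 8192 used in the HistoricalBatch size calculation.

```python
SECONDS_PER_SLOT = 12             # assumed mainnet value
SLOTS_PER_HISTORICAL_ROOT = 8192
BYTES_PER_ROOT = 32

# One root is appended to historical_roots each time the circular lists fill up
seconds_per_batch = SECONDS_PER_SLOT * SLOTS_PER_HISTORICAL_ROOT  # a little over 27 hours
batches_per_year = 365.25 * 24 * 60 * 60 / seconds_per_batch
bytes_per_year = batches_per_year * BYTES_PER_ROOT
```

That works out at a little over three hundred roots per year, or roughly 10KB.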

    # Eth1
    eth1_data: Eth1Data
    eth1_data_votes: List[Eth1Data, EPOCHS_PER_ETH1_VOTING_PERIOD * SLOTS_PER_EPOCH]
    eth1_deposit_index: uint64

eth1_data is the latest agreed upon state of the Eth1 chain and deposit contract. eth1_data_votes accumulates Eth1Data from blocks until there is an overall majority in favour of one Eth1 state. If a majority is not achieved by the time the list is full then it is cleared down and starts again. eth1_deposit_index is the total number of deposits that have been processed by the beacon chain (which is greater than or equal to the number of validators, as a deposit can top up the balance of an existing validator).

    # Registry
    validators: List[Validator, VALIDATOR_REGISTRY_LIMIT]
    balances: List[Gwei, VALIDATOR_REGISTRY_LIMIT]

The registry of Validators and their balances. The balances list is separated out as it changes both more broadly and more frequently than the validators list. Roughly speaking, balances of active validators are updated every epoch, while the validators list has only minor updates per epoch. When combined with SSZ tree hashing, this results in a big saving in the amount of data to be rehashed on registry updates.

    # Randomness
    randao_mixes: Vector[Bytes32, EPOCHS_PER_HISTORICAL_VECTOR]

Past randao mixes are stored in a fixed-size circular list for EPOCHS_PER_HISTORICAL_VECTOR epochs (~290 days). These can be used to recalculate past committees, which allows slashing of historical attestations. See EPOCHS_PER_HISTORICAL_VECTOR for more information.

    # Slashings
    slashings: Vector[Gwei, EPOCHS_PER_SLASHINGS_VECTOR]

A fixed-size circular list of past slashed amounts. Each epoch, the total effective balance of all validators slashed in that epoch is stored as an entry in this list. When the final slashing penalty for a slashed validator is calculated, it is weighted with the sum of this list. This is intended to more heavily penalise mass slashings during a window of time, which is more likely to be a coordinated attack.
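A rough sketch of how weighting by the sum of the list plays out. The function name, the `multiplier` parameter, and all of the balances are assumptions made for this illustration rather than values taken from the spec text above; the point is only that the penalty scales with the fraction of total stake slashed within the window.

```python
from typing import List

EFFECTIVE_BALANCE_INCREMENT = 10**9  # 1 Ether in Gwei (assumed value)
ETH = 10**9

def correlated_penalty(effective_balance: int, slashings: List[int],
                       total_balance: int, multiplier: int) -> int:
    """Penalty proportional to the share of total stake slashed in the window."""
    adjusted = min(sum(slashings) * multiplier, total_balance)
    numerator = effective_balance // EFFECTIVE_BALANCE_INCREMENT * adjusted
    return numerator // total_balance * EFFECTIVE_BALANCE_INCREMENT

total = 1_000_000 * ETH  # hypothetical total active balance
# A lone slashing: only this validator's 32 Ether appears in the list
lone = correlated_penalty(32 * ETH, [32 * ETH], total, multiplier=1)
# A mass slashing: a quarter of all stake was slashed in the same window
mass = correlated_penalty(32 * ETH, [250_000 * ETH], total, multiplier=1)
```

With these numbers the lone offender's extra penalty rounds down to nothing, while a validator caught up in the mass slashing loses 8 Ether, a quarter of its effective balance.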

    # Attestations
    previous_epoch_attestations: List[PendingAttestation, MAX_ATTESTATIONS * SLOTS_PER_EPOCH]
    current_epoch_attestations: List[PendingAttestation, MAX_ATTESTATIONS * SLOTS_PER_EPOCH]

These are pending attestations accumulated from blocks, but not yet processed by the beacon chain at the end of an epoch. current_epoch_attestations have a target that is the epoch we are currently in. These are just stored. All rewards and finality calculations are based on previous_epoch_attestations, which are last epoch's current_epoch_attestations plus any new ones received that target the previous epoch.

    # Finality
    justification_bits: Bitvector[JUSTIFICATION_BITS_LENGTH]
    previous_justified_checkpoint: Checkpoint
    current_justified_checkpoint: Checkpoint
    finalized_checkpoint: Checkpoint

Eth2 uses the Casper FFG finality mechanism, with a k-finality optimisation, where k = 2. These are the data that need to be tracked in order to apply the finality rules.

justification_bits is only four bits long. It tracks the justification status of the last four epochs: 1 if justified, 0 if not. This is used when calculating whether we can finalise an epoch.

Outside of the finality calculations, previous_justified_checkpoint and current_justified_checkpoint are only used to filter attestations being added into attestations lists discussed above: attestations need to have the matching source parameter.

finalized_checkpoint: the network has agreed that the beacon chain state at or before that epoch will never be reverted. So, for one thing, the fork choice rule doesn't need to go back any further than this. The Casper FFG mechanism is specifically constructed so that two conflicting finalized checkpoints cannot be created without at least one third of validators being slashed.

TODO: calculate sizes of fixed, bounded, and unbounded parts

Fun fact: there was a period during which beacon state was split into "crystallized state" and "active state". The active state was constantly changing; the crystallized state changed only once per epoch (or what passed for epochs back then). Separating out the fast-changing state from the slower-changing state was an attempt to avoid having to constantly rehash the whole state every slot. With the introduction of SSZ tree hashing, this was no longer necessary, as the roots of the slower changing parts could simply be cached, which was a nice simplification. There remains an echo of this approach, however, in the splitting out of validator balances into a different structure.

Signed envelopes

The following are just wrappers for a more basic type with an added signature.

`SignedVoluntaryExit`

class SignedVoluntaryExit(Container):
    message: VoluntaryExit
    signature: BLSSignature

A voluntary exit is currently signed with the validator's online signing key. There is some discussion about changing this to also allow signing of a voluntary exit with the validator's offline withdrawal key.

Fixed size: 16 + 96 = 112 bytes

`SignedBeaconBlock`

class SignedBeaconBlock(Container):
    message: BeaconBlock
    signature: BLSSignature

BeaconBlocks are signed by the block proposer and unwrapped for block processing.

Max size: 156,440 + 96 = 156,536 bytes

`SignedBeaconBlockHeader`

class SignedBeaconBlockHeader(Container):
    message: BeaconBlockHeader
    signature: BLSSignature

This is used only when reporting proposer slashing, via the ProposerSlashing container.

Through the magic of SSZ hash tree roots, a valid signature for a SignedBeaconBlock is also a valid signature for a SignedBeaconBlockHeader. Proposer slashing makes use of this to save space in slashing reports.

Fixed size: 112 + 96 = 208 bytes

Helper functions

Note: The definitions below are for specification purposes and are not necessarily optimal implementations.

This note is super important for implementers! There are many, many optimisations of the below routines that are being used in practice: a naive implementation is impractically slow for mainnet configurations. As long as the optimised code produces identical results to the code here, then all is fine.

Math

`integer_squareroot`

def integer_squareroot(n: uint64) -> uint64:
    """
    Return the largest integer ``x`` such that ``x**2 <= n``.
    """
    x = n
    y = (x + 1) // 2
    while y < x:
        x = y
        y = (x + n // x) // 2
    return x

Validator rewards scale with the reciprocal of the square root of the total active balance of all validators. This is calculated in get_base_reward(), and is the only place this function is used. Newton's method is used, which has pretty good convergence properties, but implementations may use any method that gives identical results.
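One way to check an alternative implementation for "identical results" is to compare it against the spec routine over a range of inputs. The sketch below copies the routine using plain Python ints (so there is no uint64 type and no overflow behaviour to worry about) and compares it with the standard library's math.isqrt(), which needs Python 3.8 or later. The sample values are arbitrary.

```python
import math

def integer_squareroot(n: int) -> int:
    """Plain-int copy of the spec routine: largest x such that x**2 <= n."""
    x = n
    y = (x + 1) // 2
    while y < x:
        x = y
        y = (x + n // x) // 2
    return x

# Small edge cases, plus a value on the scale of a total balance in Gwei
samples = [0, 1, 2, 3, 4, 15, 16, 17, 10**9, 32 * 10**9 * 300_000]
results = [integer_squareroot(n) for n in samples]
matches = all(integer_squareroot(n) == math.isqrt(n) for n in samples)
```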

`xor`

def xor(bytes_1: Bytes32, bytes_2: Bytes32) -> Bytes32:
    """
    Return the exclusive-or of two 32-byte strings.
    """
    return Bytes32(a ^ b for a, b in zip(bytes_1, bytes_2))

The bitwise xor of two 32-byte quantities is defined here in terms of Python's behaviour.

This is only used in process_randao when mixing in the new randao reveal.

Fun fact: if you xor two byte types in Java, the result is a 32 bit (signed) integer :man_facepalming: This is one reason we need to define the "obvious" here. But mainly, because the spec is executable, we need to tell Python what it doesn't already know.
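A small standalone illustration of the mixing step, using plain `bytes` in place of Bytes32. The two input values are placeholders produced by hashing arbitrary strings, not real randao data.

```python
import hashlib

def xor(bytes_1: bytes, bytes_2: bytes) -> bytes:
    """Bytewise exclusive-or of two equal-length byte strings."""
    assert len(bytes_1) == len(bytes_2)
    return bytes(a ^ b for a, b in zip(bytes_1, bytes_2))

old_mix = hashlib.sha256(b"previous mix").digest()      # placeholder 32-byte value
reveal_hash = hashlib.sha256(b"randao reveal").digest()  # placeholder 32-byte value
new_mix = xor(old_mix, reveal_hash)
```

Since xor is its own inverse, applying the same value a second time recovers the original mix, and xor with all zeros changes nothing.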

`uint_to_bytes`

def uint_to_bytes(n: uint) -> bytes is a function for serializing the uint type object to bytes in ENDIANNESS-endian. The expected length of the output is the byte-length of the uint type.

For the most part, integers are integers and bytes are bytes, and they don't mix much. But there are a few places where we need to convert from integers to bytes in Phase 0:

several times in the compute_shuffled_index() algorithm

in get_seed() to mix the epoch number into the randao mix

in compute_proposer_index in the algorithm to select a proposer weighted by stake

in get_beacon_proposer_index() to mix the slot number into the per-epoch randao seed

The result of this conversion is dependent on our arbitrary choice of endianness: that is, how we choose to represent integers as strings of bytes. For Eth2, we have chosen little-endian: see the discussion of ENDIANNESS for more background.
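To see what the choice of endianness means in practice, here is a sketch using Python's built-in integer conversions. The `length` parameter is my addition, needed because plain Python ints carry no type width; the spec's version takes the width from the uint type.

```python
def uint_to_bytes(n: int, length: int = 8) -> bytes:
    """Little-endian serialisation of a non-negative integer into `length` bytes."""
    return n.to_bytes(length, "little")

def bytes_to_uint64(data: bytes) -> int:
    """Inverse of the above for 8-byte inputs."""
    return int.from_bytes(data, "little")

little = uint_to_bytes(258)       # 258 = 0x0102: least significant byte comes first
big = (258).to_bytes(8, "big")    # the same number, most significant byte first
```

The two encodings are byte-for-byte reversals of each other, which is why the choice has to be pinned down somewhere.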

`bytes_to_uint64`

def bytes_to_uint64(data: bytes) -> uint64:
    """
    Return the integer deserialization of ``data`` interpreted as ``ENDIANNESS``-endian.
    """
    return uint64(int.from_bytes(data, ENDIANNESS))

bytes_to_uint64() is the inverse of uint_to_bytes(), and is used by the shuffling algorithm.

It's also used in the validator specification when selecting validators to aggregate attestations.

Crypto

`hash`

def hash(data: bytes) -> Bytes32 is SHA256.

SHA256 was chosen as the protocol's base hash algorithm for easier cross-chain interoperability: many other chains use SHA256, and Eth1 has a SHA256 precompile.

There was lots of discussion about this at the time. The original plan had been to use the BLAKE2b-512 hash function—that being a modern hash function that's faster than SHA3—and move to a STARK/SNARK friendly hash function at some point (such as MiMC). However, to keep interoperability with Eth1, in particular for the implementation of the deposit contract, the hash function was changed to Keccak256. Finally, we settled on SHA256.

`hash_tree_root`

def hash_tree_root(object: SSZSerializable) -> Root is a function for hashing objects into a single root by utilizing a hash tree structure, as defined in the SSZ spec.

The development of the hash tree procedure has been transformative for the Eth2 specification, and it's now used everywhere.

The naive way to create a digest of a data structure is to linearise it and then just run a hash function over the result. In tree hashing, the basic idea is to treat each element of an ordered, compound data structure as the leaf of a Merkle tree, recursively if necessary until a primitive type is reached, and to return the Merkle root of the resulting tree.

At first sight, this looks quite inefficient: twice as much data needs to be hashed when tree hashing, and actual speeds are 4-6 times slower compared with the linear hash. However, it's good for supporting light clients, because it allows Merkle proofs to be constructed easily for subsets of the full state.

The breakthrough insight was realising that much of the re-hashing work can be cached: if part of the state data structure has not changed, that part does not need to be re-hashed: the whole subtree can be replaced with its cached hash. This turns out to be a huge efficiency boost, allowing the previous design, with cumbersome separate crystallised and active state, to be simplified into a single state object.
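The caching idea can be sketched with a plain binary Merkle tree over eight 32-byte leaves. This is a simplification rather than the full SSZ procedure: there is no padding, no length mix-in, and no handling of nested types, and the leaf values are arbitrary.

```python
import hashlib
from typing import List

def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(chunks: List[bytes]) -> bytes:
    """Root of a binary Merkle tree over 32-byte chunks (count must be a power of two)."""
    assert len(chunks) > 0 and len(chunks) & (len(chunks) - 1) == 0
    layer = list(chunks)
    while len(layer) > 1:
        # Hash adjacent pairs to form the next layer up
        layer = [sha256(layer[i] + layer[i + 1]) for i in range(0, len(layer), 2)]
    return layer[0]

leaves = [sha256(bytes([i])) for i in range(8)]
root_before = merkle_root(leaves)

# Cache the root of the untouched left half, then change one leaf in the right half
cached_left = merkle_root(leaves[:4])
leaves[6] = sha256(b"changed")
root_after = sha256(cached_left + merkle_root(leaves[4:]))
```

Only the right-hand subtree is rehashed after the change, yet the result is the same as recomputing the whole tree from scratch.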

[TODO find some explainer, or insert a diagram]

BLS Signatures

Eth2 makes use of BLS signatures as specified in the IETF draft BLS specification draft-irtf-cfrg-bls-signature-04. Specifically, eth2 uses the BLS_SIG_BLS12381G2_XMD:SHA-256_SSWU_RO_POP_ ciphersuite which implements the following interfaces:

def Sign(SK: int, message: Bytes) -> BLSSignature
def Verify(PK: BLSPubkey, message: Bytes, signature: BLSSignature) -> bool
def Aggregate(signatures: Sequence[BLSSignature]) -> BLSSignature
def FastAggregateVerify(PKs: Sequence[BLSPubkey], message: Bytes, signature: BLSSignature) -> bool
def AggregateVerify(PKs: Sequence[BLSPubkey], messages: Sequence[Bytes], signature: BLSSignature) -> bool

Within these specifications, BLS signatures are treated as a module for notational clarity, thus to verify a signature bls.Verify(...) is used.

BLS is the digital signature scheme used by Eth2. It has some very nice properties, in particular the ability to aggregate signatures. This means that many validators can sign the same message (for example, that they support block X), and these signatures can all be efficiently aggregated into a single signature for verification. The ability to do this efficiently makes Eth2 practical as a protocol.

Several other protocols have adopted or will adopt BLS, such as Zcash, Chia, Dfinity and Algorand. We are using the BLS signature scheme based on the BLS12-381 elliptic curve. By implementing the new standard for BLS signatures, we hope that interoperability between chains will be easier in future.

Predicates

`is_active_validator`

def is_active_validator(validator: Validator, epoch: Epoch) -> bool:
    """
    Check if ``validator`` is active.
    """
    return validator.activation_epoch <= epoch < validator.exit_epoch

Validators don't explicitly track their state (eligible for activation, active, exited, withdrawable - the exception being whether they have been slashed or not). Instead, a validator's state is calculated by checking fields in the Validator record that store the epoch numbers of state transitions.

In this case, if the validator was activated in the past and has not yet exited, then it is active.

This is used a few times in the spec, most notably in get_active_validator_indices which returns a list of all active validators at an epoch.
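A self-contained sketch of how the predicate behaves over a validator's lifetime. The `Validator` class here keeps only the two fields involved, `active_indices` is a simplified stand-in for get_active_validator_indices, and the registry contents are invented.

```python
from dataclasses import dataclass
from typing import List

FAR_FUTURE_EPOCH = 2**64 - 1

@dataclass
class Validator:
    """Only the two fields that is_active_validator() looks at."""
    activation_epoch: int = FAR_FUTURE_EPOCH
    exit_epoch: int = FAR_FUTURE_EPOCH

def is_active_validator(validator: Validator, epoch: int) -> bool:
    return validator.activation_epoch <= epoch < validator.exit_epoch

def active_indices(validators: List[Validator], epoch: int) -> List[int]:
    """Indices of all validators that are active at the given epoch."""
    return [i for i, v in enumerate(validators) if is_active_validator(v, epoch)]

registry = [
    Validator(activation_epoch=0),                  # active since genesis, no exit scheduled
    Validator(activation_epoch=10, exit_epoch=20),  # active for epochs 10 to 19 inclusive
    Validator(),                                    # deposited but never activated
]
```

Note the half-open interval: a validator is active in its activation epoch, but not in its exit epoch.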

`is_eligible_for_activation_queue`

def is_eligible_for_activation_queue(validator: Validator) -> bool:
    """
    Check if ``validator`` is eligible to be placed into the activation queue.
    """
    return (
        validator.activation_eligibility_epoch == FAR_FUTURE_EPOCH
        and validator.effective_balance == MAX_EFFECTIVE_BALANCE
    )

When a new deposit has been processed with a previously unseen public key, a new Validator record is created with all the state-transition fields set to the default value of FAR_FUTURE_EPOCH.

During epoch processing, eligible validators are marked as eligible for activation by setting the validator.activation_eligibility_epoch.

It is possible to deposit any amount over MIN_DEPOSIT_AMOUNT (currently 1 Ether) into the deposit contract. However, validators do not become eligible for activation until their effective balance is equal to MAX_EFFECTIVE_BALANCE, which corresponds to an actual balance of 32 Ether or more.
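A short sketch of the predicate applied to three hypothetical validators. The cut-down `Validator` class and the example balances are assumptions for illustration; MAX_EFFECTIVE_BALANCE is taken to be 32 Ether expressed in Gwei.

```python
from dataclasses import dataclass

FAR_FUTURE_EPOCH = 2**64 - 1
MAX_EFFECTIVE_BALANCE = 32 * 10**9  # Gwei

@dataclass
class Validator:
    """Only the two fields that the predicate looks at."""
    activation_eligibility_epoch: int = FAR_FUTURE_EPOCH
    effective_balance: int = 0

def is_eligible_for_activation_queue(validator: Validator) -> bool:
    return (
        validator.activation_eligibility_epoch == FAR_FUTURE_EPOCH
        and validator.effective_balance == MAX_EFFECTIVE_BALANCE
    )

partial = Validator(effective_balance=16 * 10**9)  # only 16 Ether deposited so far
full = Validator(effective_balance=32 * 10**9)     # fully funded, not yet marked eligible
queued = Validator(activation_eligibility_epoch=50, effective_balance=32 * 10**9)
```

Only `full` passes: `partial` is underfunded, and `queued` has already been marked eligible so is not picked up a second time.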

`is_eligible_for_activation`

def is_eligible_for_activation(state: BeaconState, validator: Validator) -> bool:
    """
    Check if ``validator`` is eligible for activation.
    """
    return (
        # Placement in queue is finalized
        validator.activation_eligibility_epoch <= state.finalized_checkpoint.epoch
        # Has not yet been activated
        and validator.activation_epoch == FAR_FUTURE_EPOCH
    )

Once a validator is_eligible_for_activation_queue(), its activation_eligibility_epoch is set to the next epoch, but its activation_epoch is not yet set.

To avoid any ambiguity or confusion on the validator side about its state, we wait until its activation eligibility epoch has been finalised before adding it to the activation queue by setting its activation_epoch. Otherwise, it might at one point become active, and then the beacon chain could flip to a fork in which it is not active.

`is_slashable_validator`

def is_slashable_validator(validator: Validator, epoch: Epoch) -> bool:
    """
    Check if ``validator`` is slashable.
    """
    return (not validator.slashed) and (validator.activation_epoch <= epoch < validator.withdrawable_epoch)

Used by process_proposer_slashing() and process_attester_slashing().

Validators can be slashed only once: the flag Validator.slashed is set on the first occasion.

An unslashed validator remains eligible to be slashed from when it becomes active right up until it becomes withdrawable. This is some time (MIN_VALIDATOR_WITHDRAWABILITY_DELAY) after it has exited from being a validator and ceased validation duties.
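
To make the slashable window concrete, here is a minimal sketch using plain integers rather than spec types. The helper name, the example epochs, and the value of 256 epochs for MIN_VALIDATOR_WITHDRAWABILITY_DELAY (which I believe is the mainnet setting) are assumptions for illustration only.

```python
MIN_VALIDATOR_WITHDRAWABILITY_DELAY = 256  # epochs; assumed mainnet value


def is_slashable(slashed: bool, activation_epoch: int, withdrawable_epoch: int, epoch: int) -> bool:
    # Same predicate as is_slashable_validator(), applied to plain values.
    return (not slashed) and (activation_epoch <= epoch < withdrawable_epoch)


# Made-up lifecycle: activated at epoch 10, exited at epoch 100.
activation_epoch = 10
exit_epoch = 100
withdrawable_epoch = exit_epoch + MIN_VALIDATOR_WITHDRAWABILITY_DELAY  # 356

before_activation = is_slashable(False, activation_epoch, withdrawable_epoch, 9)
while_active = is_slashable(False, activation_epoch, withdrawable_epoch, 50)
after_exit = is_slashable(False, activation_epoch, withdrawable_epoch, 200)
once_withdrawable = is_slashable(False, activation_epoch, withdrawable_epoch, 356)
already_slashed = is_slashable(True, activation_epoch, withdrawable_epoch, 50)
```

Only while_active and after_exit come out true: the validator is still accountable between exiting and becoming withdrawable.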

`is_slashable_attestation_data`

def is_slashable_attestation_data(data_1: AttestationData, data_2: AttestationData) -> bool:
    """
    Check if ``data_1`` and ``data_2`` are slashable according to Casper FFG rules.
    """
    return (
        # Double vote
        (data_1 != data_2 and data_1.target.epoch == data_2.target.epoch) or
        # Surround vote
        (data_1.source.epoch < data_2.source.epoch and data_2.target.epoch < data_1.target.epoch)
    )

Used by process_attester_slashing() to check that the two sets of alleged conflicting attestation data in an AttesterSlashing do in fact qualify as slashable.

There are two ways for validators to get slashed under Casper FFG:

A double vote: voting more than once for the same target epoch, or

A surround vote: the source–target interval of one attestation entirely contains the source–target of a second attestation from the same validator(s). The reporting block proposer needs to take care to order the IndexedAttestations within the AttesterSlashing object so that the first surrounds the second. (The opposite ordering also describes a slashable offence, but is not checked for here.)
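
The two conditions can be tried out with a toy model. This is only a sketch: the Vote tuple is a hypothetical stand-in for AttestationData that keeps just the source epoch, the target epoch, and a label representing everything else in the attestation data.

```python
from typing import NamedTuple


class Vote(NamedTuple):
    source: int       # source checkpoint epoch
    target: int       # target checkpoint epoch
    rest: str = "a"   # stand-in for the remaining attestation data fields


def is_slashable(v1: Vote, v2: Vote) -> bool:
    # Double vote: different data, same target epoch.
    double = v1 != v2 and v1.target == v2.target
    # Surround vote: v1's source-target interval strictly contains v2's.
    surround = v1.source < v2.source and v2.target < v1.target
    return double or surround


double_vote = is_slashable(Vote(1, 5, "a"), Vote(1, 5, "b"))
same_vote_twice = is_slashable(Vote(1, 5, "a"), Vote(1, 5, "a"))
surrounding_first = is_slashable(Vote(1, 6), Vote(2, 5))
surrounded_first = is_slashable(Vote(2, 5), Vote(1, 6))
```

The last two calls show the ordering sensitivity: the pair is detected only when the surrounding vote is passed first.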

`is_valid_indexed_attestation`

def is_valid_indexed_attestation(state: BeaconState, indexed_attestation: IndexedAttestation) -> bool:
    """
    Check if ``indexed_attestation`` is not empty, has sorted and unique indices and has a valid aggregate signature.
    """
    # Verify indices are sorted and unique
    indices = indexed_attestation.attesting_indices
    if len(indices) == 0 or not indices == sorted(set(indices)):
        return False
    # Verify aggregate signature
    pubkeys = [state.validators[i].pubkey for i in indices]
    domain = get_domain(state, DOMAIN_BEACON_ATTESTER, indexed_attestation.data.target.epoch)
    signing_root = compute_signing_root(indexed_attestation.data, domain)
    return bls.FastAggregateVerify(pubkeys, signing_root, indexed_attestation.signature)

This is used in attestation processing and attester slashing processing.

IndexedAttestations differ from Attestations in that the latter record the contributing validators in a bitlist and the former explicitly list the global indices of the contributing validators.

An IndexedAttestation passes this validity test only if,

There is at least one validator index present.

The list of validators contains no duplicates (the Python set function performs deduplication).

The indices of the validators are sorted. (It's not clear to me why this is required. It's used in the duplicate check here, but that could just be replaced by checking the set size.)

Its aggregated signature verifies against the aggregated public keys of the listed validators.
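
The index checks (everything except the signature) can be illustrated on plain Python lists. This is a sketch with hypothetical helper names; the second function is the set-size alternative I mention above, which would accept unsorted input.

```python
def has_valid_indices(indices: list) -> bool:
    # Non-empty, sorted and free of duplicates, as in the spec's check.
    return len(indices) > 0 and indices == sorted(set(indices))


def has_unique_indices(indices: list) -> bool:
    # Alternative check: non-empty and duplicate-free, but order is ignored.
    return len(indices) > 0 and len(set(indices)) == len(indices)


empty_ok = has_valid_indices([])
sorted_ok = has_valid_indices([1, 2, 3])
unsorted_ok = has_valid_indices([1, 3, 2])
duplicate_ok = has_valid_indices([1, 1, 2])
unsorted_unique_ok = has_unique_indices([1, 3, 2])
```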

Verifying the signature uses the magic of aggregated BLS signatures. The indexed attestation contains a BLS signature that is supposed to be the combined individual signatures of each of the validators listed in the attestation. This is verified by passing it to bls.FastAggregateVerify() along with the list of public keys from the same validators. The verification succeeds only if exactly the same set of validators signed the message (signing_root) as are in the list of public keys. Note that get_domain() mixes in the fork version, so that attestations are not valid across forks.

No check is done here that the attesting_indices (which are the global validator indices) are all members of the correct committee for this attestation. In process_attestation() they must be, by construction. In process_attester_slashing() it doesn't matter: any validator signing conflicting attestations is liable to be slashed.

`is_valid_merkle_branch`

def is_valid_merkle_branch(leaf: Bytes32, branch: Sequence[Bytes32], depth: uint64, index: uint64, root: Root) -> bool:
    """
    Check if ``leaf`` at ``index`` verifies against the Merkle ``root`` and ``branch``.
    """
    value = leaf
    for i in range(depth):
        if index // (2**i) % 2:
            value = hash(branch[i] + value)
        else:
            value = hash(value + branch[i])
    return value == root

The classic algorithm for verifying a Merkle branch. Nodes are iteratively hashed as the tree is traversed from leaves to root. The bits of index select whether we are the right or left child of our parent at each level. The result should match the given root of the tree.

This proves that we know that leaf is the value at position index in the list of leaves, and we know the whole structure of the rest of the tree, as summarised in branch.

We use this function in process_deposit to check whether the deposit data we've received is correct or not.
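
Here is a small worked example: a sketch that builds a four-leaf tree by hand and verifies one branch. It assumes SHA-256 for hash() (named hash32 here to avoid shadowing the Python built-in), and the leaf values are made up.

```python
from hashlib import sha256


def hash32(data: bytes) -> bytes:
    return sha256(data).digest()


def is_valid_merkle_branch(leaf, branch, depth, index, root) -> bool:
    value = leaf
    for i in range(depth):
        if index // (2**i) % 2:
            # We are a right child: the sibling goes on the left.
            value = hash32(branch[i] + value)
        else:
            # We are a left child: the sibling goes on the right.
            value = hash32(value + branch[i])
    return value == root


# A depth-2 tree over four made-up leaves.
leaves = [bytes([i]) * 32 for i in range(4)]
node_01 = hash32(leaves[0] + leaves[1])
node_23 = hash32(leaves[2] + leaves[3])
root = hash32(node_01 + node_23)

# Branch for leaf 2: its sibling leaf 3, then the neighbouring subtree root.
branch = [leaves[3], node_01]

good_proof = is_valid_merkle_branch(leaves[2], branch, 2, 2, root)
wrong_index = is_valid_merkle_branch(leaves[2], branch, 2, 3, root)
wrong_leaf = is_valid_merkle_branch(leaves[0], branch, 2, 2, root)
```

Index 2 is binary 10, so the leaf is a left child at the bottom level and a right child at the level above.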

Misc

`compute_shuffled_index`

def compute_shuffled_index(index: uint64, index_count: uint64, seed: Bytes32) -> uint64:
    """
    Return the shuffled index corresponding to ``seed`` (and ``index_count``).
    """
    assert index < index_count

    # Swap or not (https://link.springer.com/content/pdf/10.1007%2F978-3-642-32009-5_1.pdf)
    # See the 'generalized domain' algorithm on page 3
    for current_round in range(SHUFFLE_ROUND_COUNT):
        pivot = bytes_to_uint64(hash(seed + uint_to_bytes(uint8(current_round)))[0:8]) % index_count
        flip = (pivot + index_count - index) % index_count
        position = max(index, flip)
        source = hash(
            seed
            + uint_to_bytes(uint8(current_round))
            + uint_to_bytes(uint32(position // 256))
        )
        byte = uint8(source[(position % 256) // 8])
        bit = (byte >> (position % 8)) % 2
        index = flip if bit else index

    return index

Selecting random, distinct committees of validators is a big part of Eth2. This is done by shuffling. Now, if you have a list of objects, shuffling it is a well understood problem in computer science.

Notice, however, that this routine manages to shuffle a single index to a new location, knowing only the total length of the list. That is, it is oblivious. To shuffle the whole list, this routine needs to be called once per validator index in the list (note: optimisations are available for doing this on batches of validators). By construction, each input index maps to a distinct output index; thus, when applied to all indices in the list, it results in a permutation, also called a shuffling.

Why do this rather than a simpler, more efficient, conventional shuffle? It's all about light clients. Beacon nodes will generally need to know the whole shuffling, but light clients will often be interested only in a small number of committees. Using this technique allows the composition of a single committee to be calculated without having to shuffle the entire set: a big saving on time and memory.

As stated in the code comments, this is an implementation of the "swap-or-not" shuffle, described in this paper. The search for a good shuffling algorithm for Eth2 is described in this issue, and swap-or-not is identified in this one, with the corresponding pull request here. For details on the mechanics of the swap-or-not shuffle (with diagrams!), check out my explainer.

The algorithm breaks down as follows. For each iteration (each round), we start with a current index.

Pseudo-randomly select a pivot. This is a 64-bit integer based on the seed and current round number. This domain is large enough that any non-uniformity caused by taking the modulus in the next step is entirely negligible.

Use pivot to find another index in the list of validators, flip. We can see why it is called a "pivot": index and flip end up equally far from pivot / 2, but on opposite sides of it (taking into account wrap-around in the list). That is, flip is index reflected in pivot / 2 if we lay out the list in a line.

Calculate a single pseudo-random bit based on the seed, the current round number, and some bytes from either index or flip depending on which is greater.

If our bit is zero, we keep index unchanged; if it is one, we set index to flip.

We are effectively swapping cards in a deck based on a deterministic algorithm.

The way that position is broken down is worth noting:

Bits 0-2 (3 bits) are used to select a single bit from the eight bits of byte.

Bits 3-7 (5 bits) are used to select a single byte from the thirty-two bytes of source.

Bits 8-39 (32 bits) are used in generating source. Note that the upper two bytes of this will always be zero in practice, due to limits on the number of active validators.

SHUFFLE_ROUND_COUNT is, and always has been, 90 in the mainnet configuration, as explained there.

Another nice feature of the swap-or-not shuffle is that it is also easy to invert: just start current_round at SHUFFLE_ROUND_COUNT - 1 and decrease to 0 rather than vice-versa to get back the original position.
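
The permutation and inversion properties are easy to check experimentally. The following is a standalone sketch, assuming SHA-256 for hash() and little-endian integer encoding; compute_unshuffled_index is my own hypothetical name for the inverse, and the seed is made up.

```python
from hashlib import sha256

SHUFFLE_ROUND_COUNT = 90


def hash32(data: bytes) -> bytes:
    return sha256(data).digest()


def shuffle_round(index: int, index_count: int, seed: bytes, current_round: int) -> int:
    # One swap-or-not round; applying the same round twice returns the input.
    round_byte = current_round.to_bytes(1, "little")
    pivot = int.from_bytes(hash32(seed + round_byte)[0:8], "little") % index_count
    flip = (pivot + index_count - index) % index_count
    position = max(index, flip)
    source = hash32(seed + round_byte + (position // 256).to_bytes(4, "little"))
    byte = source[(position % 256) // 8]
    bit = (byte >> (position % 8)) % 2
    return flip if bit else index


def compute_shuffled_index(index: int, index_count: int, seed: bytes) -> int:
    assert index < index_count
    for current_round in range(SHUFFLE_ROUND_COUNT):
        index = shuffle_round(index, index_count, seed, current_round)
    return index


def compute_unshuffled_index(index: int, index_count: int, seed: bytes) -> int:
    # Same rounds, run in reverse order.
    assert index < index_count
    for current_round in reversed(range(SHUFFLE_ROUND_COUNT)):
        index = shuffle_round(index, index_count, seed, current_round)
    return index


seed = bytes([42]) * 32
count = 100
shuffled = [compute_shuffled_index(i, count, seed) for i in range(count)]
is_permutation = sorted(shuffled) == list(range(count))
round_trips = all(compute_unshuffled_index(s, count, seed) == i for i, s in enumerate(shuffled))
```

Inversion works because index and flip compute the same position, and hence the same bit, so each individual round is its own inverse.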

compute_shuffled_index is used by compute_committee and compute_proposer_index. In practice, full beacon node implementations will run this once per epoch with an optimised version that shuffles the whole list, and cache the result of that for the epoch.

`compute_proposer_index`

def compute_proposer_index(state: BeaconState, indices: Sequence[ValidatorIndex], seed: Bytes32) -> ValidatorIndex:
    """
    Return from ``indices`` a random index sampled by effective balance.
    """
    assert len(indices) > 0
    MAX_RANDOM_BYTE = 2**8 - 1
    i = uint64(0)
    total = uint64(len(indices))
    while True:
        candidate_index = indices[compute_shuffled_index(i % total, total, seed)]
        random_byte = hash(seed + uint_to_bytes(uint64(i // 32)))[i % 32]
        effective_balance = state.validators[candidate_index].effective_balance
        if effective_balance * MAX_RANDOM_BYTE >= MAX_EFFECTIVE_BALANCE * random_byte:
            return candidate_index
        i += 1

There is exactly one beacon block proposer per slot, selected randomly from among all the active validators. The seed parameter is set in get_beacon_proposer_index based on the epoch and slot. Note that there is a small but finite probability of the same validator being called on to propose a block more than once in an epoch.

A validator's chance of being the proposer is weighted by its effective balance: a validator with a 32 Ether effective balance is twice as likely to be chosen as a validator with a 16 Ether effective balance.

In order to account for the need to weight by effective balance, this is a try-and-increment algorithm. A counter i starts at zero. This counter does double duty:

First, i is used to uniformly select a candidate proposer with probability $1/N$, where $N$ is the number of active validators. This is done by using the compute_shuffled_index routine to shuffle index i to a new location, which is then the candidate_index.

Then i is used to generate a pseudo-random byte using the hash function as a seeded PRNG with at least 256 bits of output. The lower 5 bits of i select a byte from the hash output, and the upper bits salt the seed. (An obvious optimisation is that the output of the hash changes only once every 32 iterations.)

The if test is where the weighting by effective balance is done. If the candidate has MAX_EFFECTIVE_BALANCE, it will always pass this test and be returned as the proposer. If the candidate has a fraction of MAX_EFFECTIVE_BALANCE then that fraction is the probability of being returned as proposer.

If the candidate is not chosen, then i is incremented and we try again. Since the minimum effective balance is half of the maximum, this ought to terminate fairly swiftly. In the worst case, all validators have 16 Ether effective balance and the chance of having to do another iteration is 50%, in which case there is a one in a million chance of having to do 20 iterations.
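
The acceptance probabilities can be checked by brute force over all 256 possible values of random_byte. This is a sketch with a hypothetical helper name; balances are in Gwei.

```python
MAX_RANDOM_BYTE = 2**8 - 1
MAX_EFFECTIVE_BALANCE = 32 * 10**9  # Gwei


def acceptance_probability(effective_balance: int) -> float:
    # Fraction of the 256 possible random bytes that pass the spec's test.
    accepted = sum(
        1
        for random_byte in range(256)
        if effective_balance * MAX_RANDOM_BYTE >= MAX_EFFECTIVE_BALANCE * random_byte
    )
    return accepted / 256


p_full = acceptance_probability(32 * 10**9)
p_three_quarters = acceptance_probability(24 * 10**9)
p_half = acceptance_probability(16 * 10**9)

# Chance of twenty consecutive rejections when every candidate has 16 Ether.
p_twenty_rejections = (1 - p_half) ** 20
```

With these inputs the probabilities come out as 1, 0.75 and 0.5, and twenty rejections in a row has probability $2^{-20}$, or roughly one in a million.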

Note that this dependence on the validators' effective balances, which are updated at the end of each epoch, means that proposer assignments are valid only in the current epoch. This is different from committee assignments, which are valid with a one epoch look-ahead.

`compute_committee`

def compute_committee(indices: Sequence[ValidatorIndex],
                      seed: Bytes32,
                      index: uint64,
                      count: uint64) -> Sequence[ValidatorIndex]:
    """
    Return the committee corresponding to ``indices``, ``seed``, ``index``, and committee ``count``.
    """
    start = (len(indices) * index) // count
    end = (len(indices) * uint64(index + 1)) // count
    return [indices[compute_shuffled_index(uint64(i), uint64(len(indices)), seed)] for i in range(start, end)]

get_beacon_committee uses this to find the specific members of one of the committees at a slot.

Every epoch, a fresh set of committees is generated; during an epoch, the committees are stable.

Looking at the parameters in reverse order:

count is the total number of committees in an epoch. This is SLOTS_PER_EPOCH times the output of get_committee_count_per_slot().

index is the committee number within the epoch, running from 0 to count - 1.

seed is the seed value for computing the pseudo-random shuffling, based on the epoch number and a domain parameter (get_beacon_committee() uses DOMAIN_BEACON_ATTESTER).

indices is the list of validators eligible for inclusion in committees, namely the whole list of indices of active validators.

Random sampling among the validators is done by taking a contiguous slice of array indices from start to end and seeing where each one gets shuffled to by compute_shuffled_index(). Note that uint64(i) is a type-cast in the above: it just turns i into a uint64 type for input into the shuffling. The output value of the shuffling is then used as an index into the indices list. There is much here that client implementations will optimise with caching and batch operations.

It may not be immediately obvious, but not all committees returned will be the same size (can vary by one), and every validator in indices will be a member of exactly one committee. As we increment index from zero, clearly start for index == j + 1 is end for index == j, so there are no gaps. In addition, the highest index is count - 1, so every validator in indices finds its way into a committee.
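
A quick way to convince yourself is to compute the slice boundaries for a small case. This sketch uses a hypothetical helper and made-up numbers (10 validators, 3 committees), and leaves out the shuffling since it does not affect the sizes.

```python
def committee_bounds(validator_count: int, index: int, count: int) -> tuple:
    # Same start/end arithmetic as compute_committee().
    start = (validator_count * index) // count
    end = (validator_count * (index + 1)) // count
    return start, end


validator_count, committee_count = 10, 3
bounds = [committee_bounds(validator_count, j, committee_count) for j in range(committee_count)]
sizes = [end - start for start, end in bounds]

# Each committee starts where the previous one ended.
contiguous = all(bounds[j][1] == bounds[j + 1][0] for j in range(committee_count - 1))
```

Here bounds is [(0, 3), (3, 6), (6, 10)]: the sizes differ by at most one and together cover every position exactly once.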

In Phase 1, this function will also be used to generate long-lived committees for shards, and light client committees. By mixing different domains into the seed in get_seed(), different shufflings and therefore different committees will be selected for the same epoch.

`compute_epoch_at_slot`

def compute_epoch_at_slot(slot: Slot) -> Epoch:
    """
    Return the epoch number at ``slot``.
    """
    return Epoch(slot // SLOTS_PER_EPOCH)

This is trivial enough that I won't explain it 😀

But note that it does rely on GENESIS_SLOT and GENESIS_EPOCH being zero.

`compute_start_slot_at_epoch`

def compute_start_slot_at_epoch(epoch: Epoch) -> Slot:
    """
    Return the start slot of ``epoch``.
    """
    return Slot(epoch * SLOTS_PER_EPOCH)

The first slot of an epoch. See remarks above.

`compute_activation_exit_epoch`

def compute_activation_exit_epoch(epoch: Epoch) -> Epoch:
    """
    Return the epoch during which validator activations and exits initiated in ``epoch`` take effect.
    """
    return Epoch(epoch + 1 + MAX_SEED_LOOKAHEAD)

When queuing validators for activation or exit in process_registry_updates() and initiate_validator_exit() respectively, the activation or exit is delayed until the next epoch, plus MAX_SEED_LOOKAHEAD epochs, currently 4.

See MAX_SEED_LOOKAHEAD for the details, but in short it is designed to make it extremely hard for an attacker to manipulate the make up of committees via activations and exits.

`compute_fork_data_root`

def compute_fork_data_root(current_version: Version, genesis_validators_root: Root) -> Root:
    """
    Return the 32-byte fork data root for the ``current_version`` and ``genesis_validators_root``.
    This is used primarily in signature domains to avoid collisions across forks/chains.
    """
    return hash_tree_root(ForkData(
        current_version=current_version,
        genesis_validators_root=genesis_validators_root,
    ))

The fork data root serves as a unique identifier for the chain that we are on. genesis_validators_root identifies our unique genesis event, and current_version our own hard fork subsequent to that genesis event. This is useful, for example, to differentiate between a testnet and mainnet: both might have the same fork versions, but will definitely have different genesis validator roots.

It is used by compute_fork_digest() and compute_domain.

`compute_fork_digest`

def compute_fork_digest(current_version: Version, genesis_validators_root: Root) -> ForkDigest:
    """
    Return the 4-byte fork digest for the ``current_version`` and ``genesis_validators_root``.
    This is a digest primarily used for domain separation on the p2p layer.
    4-bytes suffices for practical separation of forks/chains.
    """
    return ForkDigest(compute_fork_data_root(current_version, genesis_validators_root)[:4])

Just the first four bytes of the fork data root as a ForkDigest type.

Used extensively in the Ethereum 2.0 networking specification.

`compute_domain`

def compute_domain(domain_type: DomainType, fork_version: Version=None, genesis_validators_root: Root=None) -> Domain:
    """
    Return the domain for the ``domain_type`` and ``fork_version``.
    """
    if fork_version is None:
        fork_version = GENESIS_FORK_VERSION
    if genesis_validators_root is None:
        genesis_validators_root = Root()  # all bytes zero by default
    fork_data_root = compute_fork_data_root(fork_version, genesis_validators_root)
    return Domain(domain_type + fork_data_root[:28])

When dealing with signed messages, the signature "domains" are separated according to three independent factors:

All signatures include a DomainType relevant to the message's purpose, which is just some cryptographic hygiene in case the same message is to be signed for different purposes at any point.

All signatures except those on deposit messages include the fork version. This ensures that messages across different forks of the chain become invalid, and that validators won't be slashed for signing attestations on two different chains (this is allowed).

And, now, the root hash of the validator Merkle tree at Genesis is included. Along with the fork version this gives a unique identifier for our chain.

This function is mainly used by get_domain(). It is also used in deposit processing, in which case fork_version and genesis_validators_root take their default values since deposits are valid across forks.
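
To show how the pieces fit together, here is a sketch that works on raw bytes. It relies on my understanding that hash_tree_root of the two-field ForkData container reduces to a single SHA-256 over two 32-byte chunks, with the 4-byte version zero-padded on the right. The domain type values are the ones I believe the spec defines, and the genesis validators root and fork version are made up.

```python
from hashlib import sha256

GENESIS_FORK_VERSION = bytes.fromhex("00000000")
DOMAIN_BEACON_ATTESTER = bytes.fromhex("01000000")
DOMAIN_DEPOSIT = bytes.fromhex("03000000")


def compute_fork_data_root(current_version: bytes, genesis_validators_root: bytes) -> bytes:
    # Assumed equivalent of hash_tree_root(ForkData(...)): two 32-byte chunks, one hash.
    assert len(current_version) == 4 and len(genesis_validators_root) == 32
    return sha256(current_version.ljust(32, b"\x00") + genesis_validators_root).digest()


def compute_domain(domain_type: bytes, fork_version=None, genesis_validators_root=None) -> bytes:
    if fork_version is None:
        fork_version = GENESIS_FORK_VERSION
    if genesis_validators_root is None:
        genesis_validators_root = bytes(32)  # all bytes zero by default
    fork_data_root = compute_fork_data_root(fork_version, genesis_validators_root)
    return domain_type + fork_data_root[:28]


example_root = bytes([0xAB]) * 32          # made-up genesis validators root
example_version = bytes.fromhex("00000001")  # made-up fork version

attester_domain = compute_domain(DOMAIN_BEACON_ATTESTER, example_version, example_root)
deposit_domain = compute_domain(DOMAIN_DEPOSIT)  # defaults, as in deposit processing
fork_digest = compute_fork_data_root(example_version, example_root)[:4]
```

The resulting domain is always 32 bytes: the 4-byte domain type followed by the first 28 bytes of the fork data root, the first four of which are the fork digest.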

Fun fact: this function looks pretty simple, but I found a subtle bug in the way tests were generated in a previous implementation. Linus's law.

`compute_signing_root`

def compute_signing_root(ssz_object: SSZObject, domain: Domain) -> Root:
    """
    Return the signing root for the corresponding signing data.
    """
    return hash_tree_root(SigningData(
        object_root=hash_tree_root(ssz_object),
        domain=domain,
    ))

This is a pre-processor for signing objects with BLS signatures:

calculate the hash tree root of the object

combine the hash tree root with the Domain inside a temporary SigningData object

return the hash tree root of that, which is the data to be signed.

The domain is usually the output of get_domain(), which mixes in the cryptographic domain, the fork version, and the genesis validators root to the message hash. For deposits, it is the output of compute_domain(), ignoring the fork version and genesis validators root.

This is exactly equivalent to adding the domain to an object and taking the hash tree root of the whole thing. Indeed, this function used to be called compute_domain_wrapper_root.
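
As a sketch of the final step only: since SigningData has two 32-byte fields, I believe its hash tree root reduces to a single SHA-256 over the two roots concatenated. The function name below is hypothetical, the object root is a stand-in rather than a real SSZ root, and the domains are simplified to a domain type followed by zeros.

```python
from hashlib import sha256


def signing_root_from_object_root(object_root: bytes, domain: bytes) -> bytes:
    # Assumed equivalent of hash_tree_root(SigningData(object_root, domain)).
    assert len(object_root) == 32 and len(domain) == 32
    return sha256(object_root + domain).digest()


# Stand-in for hash_tree_root(ssz_object); not a real SSZ root.
object_root = sha256(b"some object").digest()

# Two simplified domains differing only in the domain type.
proposer_domain = bytes.fromhex("00000000") + bytes(28)
attester_domain = bytes.fromhex("01000000") + bytes(28)

root_as_proposer = signing_root_from_object_root(object_root, proposer_domain)
root_as_attester = signing_root_from_object_root(object_root, attester_domain)
```

The same object yields different signing roots under different domains, which is what prevents a signature made for one purpose being replayed for another.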

Beacon state accessors

The massive BeaconState object gets passed around everywhere, so it's simple to access stored data directly. The following functions are simple wrappers that do some amount of processing on the beacon state data to be returned.

`get_current_epoch`

def get_current_epoch(state: BeaconState) -> Epoch:
    """
    Return the current epoch.
    """
    return compute_epoch_at_slot(state.slot)

A getter for the current epoch, as calculated by compute_epoch_at_slot().

`get_previous_epoch`

def get_previous_epoch(state: BeaconState) -> Epoch:
    """
    Return the previous epoch (unless the current epoch is ``GENESIS_EPOCH``).
    """
    current_epoch = get_current_epoch(state)
    return GENESIS_EPOCH if current_epoch == GENESIS_EPOCH else Epoch(current_epoch - 1)

Return the previous epoch number as an Epoch type. Returns GENESIS_EPOCH if we are in the GENESIS_EPOCH: it has no prior, and we don't do negative numbers.

`get_block_root`

def get_block_root(state: BeaconState, epoch: Epoch) -> Root:
    """
    Return the block root at the start of a recent ``epoch``.
    """
    return get_block_root_at_slot(state, compute_start_slot_at_epoch(epoch))

The Casper FFG part of consensus deals in Checkpoints that are the first slot of an epoch. get_block_root is a specialised version of get_block_root_at_slot() that only returns the block root of the checkpoint, given an epoch.

`get_block_root_at_slot`

def get_block_root_at_slot(state: BeaconState, slot: Slot) -> Root:
    """
    Return the block root at a recent ``slot``.
    """
    assert slot < state.slot <= slot + SLOTS_PER_HISTORICAL_ROOT
    return state.block_roots[slot % SLOTS_PER_HISTORICAL_ROOT]

Recent block roots are stored in a circular list in state, with a length of SLOTS_PER_HISTORICAL_ROOT (currently ~27 hours).
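
The circular-list behaviour can be sketched without a full state object. The helper names are hypothetical, and the values of 8192 for SLOTS_PER_HISTORICAL_ROOT and 12 seconds per slot are what I believe the mainnet configuration uses.

```python
SLOTS_PER_HISTORICAL_ROOT = 8192  # assumed mainnet value
SECONDS_PER_SLOT = 12             # assumed mainnet value


def is_retrievable(state_slot: int, slot: int) -> bool:
    # The condition asserted by get_block_root_at_slot().
    return slot < state_slot <= slot + SLOTS_PER_HISTORICAL_ROOT


def root_position(slot: int) -> int:
    # Where the root for ``slot`` lives in state.block_roots.
    return slot % SLOTS_PER_HISTORICAL_ROOT


state_slot = 10_000
newest = is_retrievable(state_slot, 9_999)
oldest = is_retrievable(state_slot, 1_808)
overwritten = is_retrievable(state_slot, 1_807)
current = is_retrievable(state_slot, 10_000)

# Slots 1807 and 9999 share a position, which is why 1807 is no longer available.
same_position = root_position(1_807) == root_position(9_999)

history_hours = SLOTS_PER_HISTORICAL_ROOT * SECONDS_PER_SLOT / 3600
```

Under these assumptions history_hours is about 27.3.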

get_block_root_at_slot is used by get_matching_head_attestations(), and in turn when assigning rewards for good LMD GHOST consensus votes.

`get_randao_mix`

def get_randao_mix(state: BeaconState, epoch: Epoch) -> Bytes32:
    """
    Return the randao mix at a recent ``epoch``.
    """
    return state.randao_mixes[epoch % EPOCHS_PER_HISTORICAL_VECTOR]

Randao mixes are stored in a circular list of length EPOCHS_PER_HISTORICAL_VECTOR. They are used when calculating the seed for assigning beacon proposers and committees.

`get_active_validator_indices`

def get_active_validator_indices(state: BeaconState, epoch: Epoch) -> Sequence[ValidatorIndex]:
    """
    Return the sequence of active validator indices at ``epoch``.
    """
    return [ValidatorIndex(i) for i, v in enumerate(state.validators) if is_active_validator(v, epoch)]

Steps through the entire list of validators and returns the list of only the active ones (that is, validators that have been activated but not exited, as returned by is_active_validator()).

This function is heavily used and I'd expect it to be memoised in practice.

`get_validator_churn_limit`

def get_validator_churn_limit(state: BeaconState) -> uint64:
    """
    Return the validator churn limit for the current epoch.
    """
    active_validator_indices = get_active_validator_indices(state, get_current_epoch(state))
    return max(MIN_PER_EPOCH_CHURN_LIMIT, uint64(len(active_validator_indices)) // CHURN_LIMIT_QUOTIENT)

The "churn limit" applies when activating and exiting validators and acts as a rate-limit on changes to the validator set. The value of this function provides the number of validators that may become active in an epoch, and the number of validators that may exit in an epoch.

Some small amount of churn is always allowed, set by MIN_PER_EPOCH_CHURN_LIMIT, and the amount of per-epoch churn allowed increases by one for every extra CHURN_LIMIT_QUOTIENT validators that are currently active (once the minimum has been exceeded).
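
Some example numbers, as a sketch: the values MIN_PER_EPOCH_CHURN_LIMIT = 4 and CHURN_LIMIT_QUOTIENT = 65536 are what I believe the mainnet configuration uses, and the helper takes a plain validator count rather than a state.

```python
MIN_PER_EPOCH_CHURN_LIMIT = 4    # assumed mainnet value
CHURN_LIMIT_QUOTIENT = 2**16     # assumed mainnet value


def churn_limit(active_validator_count: int) -> int:
    # Same arithmetic as get_validator_churn_limit().
    return max(MIN_PER_EPOCH_CHURN_LIMIT, active_validator_count // CHURN_LIMIT_QUOTIENT)


small_set = churn_limit(100_000)       # floor applies
just_below = churn_limit(327_679)      # still at the floor
at_threshold = churn_limit(327_680)    # 5 * 65536: first increase
larger_set = churn_limit(500_000)
```

With these values the limit stays at 4 until there are 327,680 active validators, then steps up by one for each further 65,536.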

`get_seed`

def get_seed(state: BeaconState, epoch: Epoch, domain_type: DomainType) -> Bytes32:
    """
    Return the seed at ``epoch``.
    """
    mix = get_randao_mix(state, Epoch(epoch + EPOCHS_PER_HISTORICAL_VECTOR - MIN_SEED_LOOKAHEAD - 1))  # Avoid underflow
    return hash(domain_type + uint_to_bytes(epoch) + mix)

Used in get_beacon_committee() and get_beacon_proposer_index to provide the random input for computing proposers and committees. domain_type is DOMAIN_BEACON_ATTESTER or DOMAIN_BEACON_PROPOSER respectively.

Randao mixes are stored in a circular list of length EPOCHS_PER_HISTORICAL_VECTOR. The seed for an epoch is based on the randao mix from MIN_SEED_LOOKAHEAD epochs ago. This is to limit the forward visibility of randomness: see the explanation there.

The seed returned is not based only on the domain and the randao mix, but the epoch number is also added in. This is to handle the pathological case of no blocks being seen for more than two epochs, in which case we run out of randao updates. Adding in the epoch number means that fresh committees and proposers can continue to be selected.
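
The index arithmetic and the effect of mixing in the epoch can be seen in a standalone sketch. It assumes SHA-256 for hash(), little-endian 8-byte encoding of the epoch, and the values EPOCHS_PER_HISTORICAL_VECTOR = 65536 and MIN_SEED_LOOKAHEAD = 1 that I believe are the mainnet settings. The function takes the list of mixes directly rather than a state.

```python
from hashlib import sha256

EPOCHS_PER_HISTORICAL_VECTOR = 2**16  # assumed mainnet value
MIN_SEED_LOOKAHEAD = 1                # assumed mainnet value
DOMAIN_BEACON_PROPOSER = bytes.fromhex("00000000")
DOMAIN_BEACON_ATTESTER = bytes.fromhex("01000000")


def mix_index(epoch: int) -> int:
    # Adding EPOCHS_PER_HISTORICAL_VECTOR first keeps the value non-negative.
    return (epoch + EPOCHS_PER_HISTORICAL_VECTOR - MIN_SEED_LOOKAHEAD - 1) % EPOCHS_PER_HISTORICAL_VECTOR


def get_seed(randao_mixes: list, epoch: int, domain_type: bytes) -> bytes:
    mix = randao_mixes[mix_index(epoch)]
    return sha256(domain_type + epoch.to_bytes(8, "little") + mix).digest()


# Pathological case: every stored mix is identical (no randao updates).
stale_mixes = [bytes(32)] * EPOCHS_PER_HISTORICAL_VECTOR

seed_epoch_5 = get_seed(stale_mixes, 5, DOMAIN_BEACON_ATTESTER)
seed_epoch_6 = get_seed(stale_mixes, 6, DOMAIN_BEACON_ATTESTER)
proposer_seed_epoch_5 = get_seed(stale_mixes, 5, DOMAIN_BEACON_PROPOSER)
```

Even with identical mixes, the seeds differ from epoch to epoch and from domain to domain.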

`get_committee_count_per_slot`

def get_committee_count_per_slot(state: BeaconState, epoch: Epoch) -> uint64:
    """
    Return the number of committees in each slot for the given ``epoch``.
    """
    return max(uint64(1), min(
        MAX_COMMITTEES_PER_SLOT,
        uint64(len(get_active_validator_indices(state, epoch))) // SLOTS_PER_EPOCH // TARGET_COMMITTEE_SIZE,
    ))

Every slot in a given epoch has the same number of beacon committees, as calculated by this function.

There is always at least one committee per slot, and never more than MAX_COMMITTEES_PER_SLOT, currently 64.

Subject to these constraints, the actual number of committees per slot is $N / 4096$, where $N$ is the total number of active validators.

The intended behaviour looks like this:

The ideal case is that there are MAX_COMMITTEES_PER_SLOT = 64 committees per slot. This maps to one committee per slot per shard in Phase 1—these committees will be responsible for voting on shard crosslinks. There must be at least 262,144 active validators to achieve this.

If there are fewer active validators, then the number of committees per slot is reduced below 64 in order to maintain a minimum committee size of TARGET_COMMITTEE_SIZE = 128. In this case, not every shard will get crosslinked at every slot in Phase 1.

Finally, only if the number of active validators falls below 4096 will the committee size be reduced to less than 128. This is the point at which there is only one beacon committee per slot. But, at this point, the chain basically has no meaningful security in any case.
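
The three regimes can be checked numerically. This sketch takes a plain validator count rather than a state, uses SLOTS_PER_EPOCH = 32 together with the constants quoted above, and the helper names are hypothetical.

```python
MAX_COMMITTEES_PER_SLOT = 64
SLOTS_PER_EPOCH = 32
TARGET_COMMITTEE_SIZE = 128


def committees_per_slot(active_validator_count: int) -> int:
    # Same arithmetic as get_committee_count_per_slot().
    return max(1, min(
        MAX_COMMITTEES_PER_SLOT,
        active_validator_count // SLOTS_PER_EPOCH // TARGET_COMMITTEE_SIZE,
    ))


def average_committee_size(active_validator_count: int) -> float:
    return active_validator_count / (SLOTS_PER_EPOCH * committees_per_slot(active_validator_count))


ideal = committees_per_slot(262_144)        # 64 committees of 128
capped = committees_per_slot(300_000)       # still 64, committees grow instead
reduced = committees_per_slot(100_000)      # fewer committees per slot
minimum = committees_per_slot(4_096)        # one committee of 128
tiny = committees_per_slot(2_000)           # one committee, smaller than 128
```

For 100,000 validators this gives 24 committees per slot of about 130 members each; for 2,000 validators it gives a single committee per slot of about 62 members.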

`get_beacon_committee`

def get_beacon_committee(state: BeaconState, slot: Slot, index: CommitteeIndex) -> Sequence[ValidatorIndex]:
    """
    Return the beacon committee at ``slot`` for ``index``.
    """
    epoch = compute_epoch_at_slot(slot)
    committees_per_slot = get_committee_count_per_slot(state, epoch)
    return compute_committee(
        indices=get_active_validator_indices(state, epoch),
        seed=get_seed(state, epoch, DOMAIN_BEACON_ATTESTER),
        index=(slot % SLOTS_PER_EPOCH) * committees_per_slot + index,
        count=committees_per_slot * SLOTS_PER_EPOCH,
    )

Beacon committees vote on the beacon block at each slot via attestations. There are up to MAX_COMMITTEES_PER_SLOT beacon committees per slot, and each committee is active exactly once per epoch.

This function returns the list of committee members given a slot number and an index within that slot to select the desired committee, relying on compute_committee() to do the heavy lifting.

Note that, since this uses get_seed(), we can obtain committees only up to EPOCHS_PER_HISTORICAL_VECTOR epochs into the past (minus MIN_SEED_LOOKAHEAD).

get_beacon_committee is used by get_attesting_indices() and process_attestation() when processing attestations coming from a committee, and by validators when checking their committee assignments and aggregation duties.
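
The mapping from (slot, index) to the epoch-wide committee number that is passed to compute_committee() can be sketched as follows. The helper name and the example values (4 committees per slot) are made up.

```python
SLOTS_PER_EPOCH = 32


def epoch_committee_number(slot: int, index: int, committees_per_slot: int) -> tuple:
    # Returns (index, count) as passed to compute_committee().
    assert index < committees_per_slot
    committee_number = (slot % SLOTS_PER_EPOCH) * committees_per_slot + index
    count = committees_per_slot * SLOTS_PER_EPOCH
    return committee_number, count


first = epoch_committee_number(slot=64, index=0, committees_per_slot=4)
middle = epoch_committee_number(slot=70, index=3, committees_per_slot=4)
last = epoch_committee_number(slot=95, index=3, committees_per_slot=4)
```

Slots 64 to 95 make up epoch 2, and its 128 committees are numbered 0 to 127 in slot order.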

`get_beacon_proposer_index`

def get_beacon_proposer_index(state: BeaconState) -> ValidatorIndex:
    """
    Return the beacon proposer index at the current slot.
    """
    epoch = get_current_epoch(state)
    seed = hash(get_seed(state, epoch, DOMAIN_BEACON_PROPOSER) + uint_to_bytes(state.slot))
    indices = get_active_validator_indices(state, epoch)
    return compute_proposer_index(state, indices, seed)

Each slot, exactly one of the active validators is pseudo-randomly assigned to be the proposer of the beacon block for that slot. The probability of being selected is weighted by the validator's effective balance in compute_proposer_index().

The chosen block proposer does not need to be a member of one of the beacon committees for that slot: it is chosen from the entire set of active validators for that epoch.

Since the randao seed is updated only once per epoch, the slot number is mixed into the seed using a hash to get a different proposer at each slot. There is a chance of the same proposer being selected in two consecutive slots, or more than once per epoch: if every validator has the same effective balance, then the probability of being selected in a particular slot is simply $1/N$ independent of any other slot, where $N$ is the number of active validators in the epoch corresponding to the slot.

`get_total_balance`

def get_total_balance(state: BeaconState, indices: Set[ValidatorIndex]) -> Gwei:
    """
    Return the combined effective balance of the ``indices``.
    ``EFFECTIVE_BALANCE_INCREMENT`` Gwei minimum to avoid divisions by zero.
    Math safe up to ~10B ETH, afterwhich this overflows uint64.
    """
    return Gwei(max(EFFECTIVE_BALANCE_INCREMENT, sum([state.validators[index].effective_balance for index in indices])))

A simple utility to return the total balance of all validators in the list, indices, passed in.

Side observation: there is an interesting example of some fragility in the spec lurking here. This function used to return a minimum of 1 Gwei to avoid a potential division by zero in get_attestation_deltas(). However, that function was modified to avoid a possible overflow condition, without modifying this function, which introduced the possibility of a division by zero. This was later fixed by returning a minimum of EFFECTIVE_BALANCE_INCREMENT. But, Yay! for lots of eyes on the spec.

`get_total_active_balance`

def get_total_active_balance(state: BeaconState) -> Gwei:
    """
    Return the combined effective balance of the active validators.
    Note: ``get_total_balance`` returns ``EFFECTIVE_BALANCE_INCREMENT`` Gwei minimum to avoid divisions by zero.
    """
    return get_total_balance(state, set(get_active_validator_indices(state, get_current_epoch(state))))

Uses get_total_balance() to calculate the sum of the effective balances of all active validators in the current epoch.

This quantity is frequently used in the spec. For example, Casper FFG uses the total active balance to judge whether the 2/3 majority threshold of attestations has been reached in justification and finalisation. And it is a fundamental part of the calculation of rewards and penalties, where the base reward is made proportional to the reciprocal of the square root of the total active balance: validator rewards are higher when little balance is at stake (few active validators) and lower when much balance is at stake (many active validators).
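
The inverse square root relationship can be illustrated with a sketch of the base reward arithmetic. The formula and the constants BASE_REWARD_FACTOR = 64 and BASE_REWARDS_PER_EPOCH = 4 are how I recall get_base_reward() being defined elsewhere in the spec, so treat them as assumptions here; the validator counts are made up.

```python
from math import isqrt

BASE_REWARD_FACTOR = 64        # assumed value
BASE_REWARDS_PER_EPOCH = 4     # assumed value
MAX_EFFECTIVE_BALANCE = 32 * 10**9  # Gwei


def base_reward(effective_balance: int, total_active_balance: int) -> int:
    # Assumed form of the spec's base reward calculation, in Gwei.
    return effective_balance * BASE_REWARD_FACTOR // isqrt(total_active_balance) // BASE_REWARDS_PER_EPOCH


small_total = 16_384 * MAX_EFFECTIVE_BALANCE   # 16,384 full validators
large_total = 4 * small_total                  # four times as much at stake

reward_small = base_reward(MAX_EFFECTIVE_BALANCE, small_total)
reward_large = base_reward(MAX_EFFECTIVE_BALANCE, large_total)
ratio = reward_small / reward_large
```

Quadrupling the total active balance roughly halves each validator's base reward.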

Total active balance does not change during an epoch, so is a great candidate for being cached.

`get_domain`

def get_domain(state: BeaconState, domain_type: DomainType, epoch: Epoch=None) -> Domain:
    """
    Return the signature domain (fork version concatenated with domain type) of a message.
    """
    epoch = get_current_epoch(state) if epoch is None else epoch
    fork_version = state.fork.previous_version if epoch < state.fork.epoch else state.fork.current_version
    return compute_domain(domain_type, fork_version, state.genesis_validators_root)

For the science behind domains, see Domain types and compute_domain().

With the exception of DOMAIN_DEPOSIT, domains are always combined with the fork version before being used in signature generation. This is to distinguish messages for different chains, and ensure that validators don't get slashed if they choose to participate on two independent forks. (That is, deliberate forks, aka hard-forks. Participating on both branches of temporary consensus forks is punishable: that's basically the whole point of slashing.)
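
To make the fork version selection concrete, here is a small sketch of just that one line of get_domain(). ToyFork and the version bytes are made up for illustration; only the comparison against the fork epoch mirrors the spec code above.

```python
from dataclasses import dataclass


@dataclass
class ToyFork:
    """Simplified stand-in for the spec's Fork container."""
    previous_version: bytes
    current_version: bytes
    epoch: int  # first epoch at which current_version applies


def fork_version_at(fork: ToyFork, epoch: int) -> bytes:
    # Messages about epochs before the fork are signed with the old version
    return fork.previous_version if epoch < fork.epoch else fork.current_version


# Hypothetical fork at epoch 100; version values are invented
fork = ToyFork(
    previous_version=bytes.fromhex("00000000"),
    current_version=bytes.fromhex("01000000"),
    epoch=100,
)
```

So a message concerning epoch 99 gets the previous version mixed into its domain, and one concerning epoch 100 or later gets the current version.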

`get_indexed_attestation`

def get_indexed_attestation(state: BeaconState, attestation: Attestation) -> IndexedAttestation:
    """
    Return the indexed attestation corresponding to ``attestation``.
    """
    attesting_indices = get_attesting_indices(state, attestation.data, attestation.aggregation_bits)

    return IndexedAttestation(
        attesting_indices=sorted(attesting_indices),
        data=attestation.data,
        signature=attestation.signature,
    )

Just a wrapper converting an Attestation into an IndexedAttestation.

Attestations are aggregatable, which means that attestations from multiple validators making the same vote can be rolled up into a single attestation through the magic of BLS signature aggregation. However, in order to be able to verify the signature later, a record needs to be kept of which validators actually contributed to the attestation. This is so that those validators' public keys can be aggregated.

The Attestation type uses a bitlist to indicate whether a member of the attesting committee contributed to the attestation. This is to minimise the size. The IndexedAttestation type explicitly lists the global validator indices of contributing validators. Note that the list of indices is sorted: an attestation is invalid if not.

The conversion between the list formats is performed by get_attesting_indices(), below.

`get_attesting_indices`

def get_attesting_indices(state: BeaconState,
                          data: AttestationData,
                          bits: Bitlist[MAX_VALIDATORS_PER_COMMITTEE]) -> Set[ValidatorIndex]:
    """
    Return the set of attesting indices corresponding to ``data`` and ``bits``.
    """
    committee = get_beacon_committee(state, data.slot, data.index)
    return set(index for i, index in enumerate(committee) if bits[i])

Lists of validators within committees occur in two forms in the specification:

Compressed into a bitlist, in which each bit represents the presence or absence of a validator from a particular committee. The committee is referenced by slot and committee index within that slot. This is how sets of validators are represented in Attestations.

An explicit list of validator indices, as in IndexedAttestations.

get_attesting_indices() converts from the former representation to the latter. The slot number and the committee index are provided by the AttestationData and are used to reconstruct the committee members via get_beacon_committee(), and the bitlist will have come from an Attestation.
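
The conversion is easy to see with a toy committee. In this sketch the committee contents and the bits are invented, and the length check is my own addition for safety (in the spec, the equivalent check lives elsewhere).

```python
from typing import Sequence, Set


def attesting_indices_from_bits(committee: Sequence[int], bits: Sequence[bool]) -> Set[int]:
    # One bit per committee member, in committee order
    assert len(bits) == len(committee)
    return {index for i, index in enumerate(committee) if bits[i]}


# A made-up committee of four global validator indices
committee = [917, 12, 405, 88]
# Members at committee positions 0, 2 and 3 contributed to the aggregate
bits = [True, False, True, True]

indices = attesting_indices_from_bits(committee, bits)
# IndexedAttestation needs the indices sorted
sorted_indices = sorted(indices)
```

Note that the bit positions refer to positions within the committee, not to global validator indices, which is why the committee must be reconstructed before the bitlist means anything.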

Beacon state mutators

`increase_balance`

def increase_balance(state: BeaconState, index: ValidatorIndex, delta: Gwei) -> None:
    """
    Increase the validator balance at index ``index`` by ``delta``.
    """
    state.balances[index] += delta

This and decrease_balance() are the only places in the spec where validator balances are modified—it's a nod towards encapsulation.

Two separate functions are needed for changing validator balances (one for increasing and one for decreasing) because we are using only unsigned integers, remember.

Fun fact: A typo around this led to our one and only consensus failure at the initial client interop event. You see, unsigned integers induce bugs!

`decrease_balance`

def decrease_balance(state: BeaconState, index: ValidatorIndex, delta: Gwei) -> None:
    """
    Decrease the validator balance at index ``index`` by ``delta``, with underflow protection.
    """
    state.balances[index] = 0 if delta > state.balances[index] else state.balances[index] - delta

The counterpart to increase_balance(). This one has extra work to do to check for unsigned int underflow. Balances may not go negative.
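
To see why the check matters, here is a sketch contrasting what a naive subtraction would do in a language with wrapping 64-bit unsigned arithmetic against the saturating behaviour the spec asks for. Both function names are mine, and Python integers don't wrap, so the wrap-around is emulated with a modulus.

```python
UINT64_MODULUS = 2**64


def wrapping_sub_u64(balance: int, delta: int) -> int:
    """Emulates unchecked uint64 subtraction (what we must avoid)."""
    return (balance - delta) % UINT64_MODULUS


def saturating_sub_u64(balance: int, delta: int) -> int:
    """Same logic as decrease_balance(): clamp at zero."""
    return 0 if delta > balance else balance - delta
```

With a balance of 5 Gwei and a penalty of 7 Gwei, the wrapping version would hand the validator an astronomically large balance; the saturating version leaves it at zero.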

`initiate_validator_exit`

def initiate_validator_exit(state: BeaconState, index: ValidatorIndex) -> None:
    """
    Initiate the exit of the validator with index ``index``.
    """
    # Return if validator already initiated exit
    validator = state.validators[index]
    if validator.exit_epoch != FAR_FUTURE_EPOCH:
        return

    # Compute exit queue epoch
    exit_epochs = [v.exit_epoch for v in state.validators if v.exit_epoch != FAR_FUTURE_EPOCH]
    exit_queue_epoch = max(exit_epochs + [compute_activation_exit_epoch(get_current_epoch(state))])
    exit_queue_churn = len([v for v in state.validators if v.exit_epoch == exit_queue_epoch])
    if exit_queue_churn >= get_validator_churn_limit(state):
        exit_queue_epoch += Epoch(1)

    # Set validator exit epoch and withdrawable epoch
    validator.exit_epoch = exit_queue_epoch
    validator.withdrawable_epoch = Epoch(validator.exit_epoch + MIN_VALIDATOR_WITHDRAWABILITY_DELAY)

Exits may be initiated voluntarily, or as a result of being slashed, or by dropping to the EJECTION_BALANCE threshold.

In all cases, a dynamic "churn limit" caps the number of validators that may exit per epoch. This is calculated by get_validator_churn_limit(). The mechanism for enforcing this is the exit queue: the validator's exit_epoch is set such that it is at the end of the queue. (Per the spec, the queue is not a separate data structure, but is continually re-calculated from the exit epochs of all validators: I expect there are some optimisations to be had around this in actual implementations.)

An exiting validator is expected to continue with its proposing and attesting duties until exit_epoch has passed, and will continue to receive rewards and penalties accordingly.

In addition, an exited validator remains eligible to be slashed until its withdrawable_epoch, which is set to MIN_VALIDATOR_WITHDRAWABILITY_DELAY epochs after its exit_epoch. This is to allow some extra time for any slashable offences by the validator to be detected and reported.
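
Here is a stripped-down sketch of the exit queue calculation, working on a bare list of exit epochs rather than the full validator registry. The churn limit is passed in directly, and I have assumed that compute_activation_exit_epoch() adds 1 + MAX_SEED_LOOKAHEAD with MAX_SEED_LOOKAHEAD = 4; treat those values as illustrative.

```python
from typing import List

FAR_FUTURE_EPOCH = 2**64 - 1
MAX_SEED_LOOKAHEAD = 4  # assumed value


def compute_activation_exit_epoch(epoch: int) -> int:
    return epoch + 1 + MAX_SEED_LOOKAHEAD


def next_exit_epoch(exit_epochs: List[int], current_epoch: int, churn_limit: int) -> int:
    # Validators that have already been assigned an exit epoch
    queued = [e for e in exit_epochs if e != FAR_FUTURE_EPOCH]
    # The end of the queue, or the earliest allowed exit epoch if the queue is short
    exit_queue_epoch = max(queued + [compute_activation_exit_epoch(current_epoch)])
    churn = sum(1 for e in queued if e == exit_queue_epoch)
    # If that epoch is already full, spill over into the next one
    if churn >= churn_limit:
        exit_queue_epoch += 1
    return exit_queue_epoch


# Five validators ask to exit during epoch 10, with a churn limit of 4
exit_epochs = [FAR_FUTURE_EPOCH] * 6
for i in range(5):
    exit_epochs[i] = next_exit_epoch(exit_epochs, current_epoch=10, churn_limit=4)
```

The first four validators are assigned epoch 15, and the fifth is pushed back to epoch 16 because epoch 15 is full.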

`slash_validator`

def slash_validator(state: BeaconState,
                    slashed_index: ValidatorIndex,
                    whistleblower_index: ValidatorIndex=None) -> None:
    """
    Slash the validator with index ``slashed_index``.
    """
    epoch = get_current_epoch(state)
    initiate_validator_exit(state, slashed_index)
    validator = state.validators[slashed_index]
    validator.slashed = True
    validator.withdrawable_epoch = max(validator.withdrawable_epoch, Epoch(epoch + EPOCHS_PER_SLASHINGS_VECTOR))
    state.slashings[epoch % EPOCHS_PER_SLASHINGS_VECTOR] += validator.effective_balance
    decrease_balance(state, slashed_index, validator.effective_balance // MIN_SLASHING_PENALTY_QUOTIENT)

    # Apply proposer and whistleblower rewards
    proposer_index = get_beacon_proposer_index(state)
    if whistleblower_index is None:
        whistleblower_index = proposer_index
    whistleblower_reward = Gwei(validator.effective_balance // WHISTLEBLOWER_REWARD_QUOTIENT)
    proposer_reward = Gwei(whistleblower_reward // PROPOSER_REWARD_QUOTIENT)
    increase_balance(state, proposer_index, proposer_reward)
    increase_balance(state, whistleblower_index, Gwei(whistleblower_reward - proposer_reward))

Both proposer slashings and attester slashings end up here when a report of a slashable offence has been verified during block processing.

When a validator is slashed, several things happen immediately:

The validator is processed for exit via initiate_validator_exit(), so it joins the exit queue.

It is also marked as slashed. This information is used when calculating rewards and penalties: while being exited, whatever it does, a slashed validator receives penalties as if it had failed to propose or attest, including the inactivity leak if applicable.

Normally, as part of the exit process, the withdrawable_epoch for a validator (the point at which a validator's stake is in principle unlocked) is set to MIN_VALIDATOR_WITHDRAWABILITY_DELAY epochs after it exits. When a validator is slashed, a much longer period of lock-up applies, namely EPOCHS_PER_SLASHINGS_VECTOR. This is to allow a further, potentially much greater, slashing penalty to be applied later once the chain knows how many validators have been slashed together around this time. Strictly, this postponement of the withdrawable epoch is twice as long as required to apply the extra penalty, which is applied half-way through this period. Slashed validators continue to accrue attestation penalties until they become withdrawable.

The effective balance of the validator is added to the accumulated balances of validators slashed this epoch, and stored in the circular list, state.slashings. This will be used by the slashing penalty calculation mentioned in the previous point.

An initial "slap on the wrist" slashing penalty of the validator's effective balance (in Gwei) divided by the MIN_SLASHING_PENALTY_QUOTIENT is applied. With current values, this is a maximum of 0.25 Ether. As above, a potentially larger penalty will be applied later depending on how many other validators were slashed concurrently.

The proposer including the slashing proof receives a reward.

In short, a slashed validator receives an initial minor penalty, can expect to receive a further penalty later, and is marked for exit.
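
To put numbers on the immediate effects, here is the arithmetic from slash_validator() pulled out into a standalone sketch. The quotient values are the ones I believe were current for this version of the spec (128, 512 and 8); treat them as assumptions, and the function name is mine.

```python
from typing import Dict

GWEI_PER_ETH = 10**9
MIN_SLASHING_PENALTY_QUOTIENT = 128   # assumed value
WHISTLEBLOWER_REWARD_QUOTIENT = 512   # assumed value
PROPOSER_REWARD_QUOTIENT = 8          # assumed value


def slashing_amounts(effective_balance: int) -> Dict[str, int]:
    """All amounts in Gwei, using the same integer division as the spec."""
    initial_penalty = effective_balance // MIN_SLASHING_PENALTY_QUOTIENT
    whistleblower_reward = effective_balance // WHISTLEBLOWER_REWARD_QUOTIENT
    proposer_reward = whistleblower_reward // PROPOSER_REWARD_QUOTIENT
    return {
        "initial_penalty": initial_penalty,
        "proposer_reward": proposer_reward,
        "whistleblower_share": whistleblower_reward - proposer_reward,
    }


amounts = slashing_amounts(32 * GWEI_PER_ETH)
```

For a validator with a 32 Ether effective balance this gives an initial penalty of 0.25 Ether and a total reward of 0.0625 Ether. Since the whistleblower is always the proposer in Phase 0, both parts of the reward land on the same validator.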

Note that the whistleblower_index defaults to None in the parameter list. This is never used in Phase 0, with the result that the proposer that included the slashing gets the entire reward; there is no separate whistleblower reward for reporting proposer or attester slashings. One reason is simply that reports are too easy to steal: if I report a slashable event to a block proposer, there is nothing to prevent that proposer claiming the report as its own. We could introduce some fancy ZK protocol to make this trustless, but this is what we're going with for now. In Phase 1, whistleblower rewards in the proof-of-custody game may use this functionality.

As a final note, here and in deposit processing are the only places in the Phase 0 specification where validator balances are updated outside epoch processing.

Genesis

Genesis is the moment at which all the clients simultaneously start processing the beacon chain. All being well, we will only ever have to do this once for real!

All the clients need to be in agreement about the timing of the genesis event, and also about the contents of the genesis block. The genesis event occurs once two pre-conditions have been satisfied:

MIN_GENESIS_TIME must have passed. There's also a GENESIS_DELAY that applies in some circumstances.

Sufficient valid deposits must have been made into the Eth1 deposit contract to activate MIN_GENESIS_ACTIVE_VALIDATOR_COUNT validators.

Information about both of these is drawn from the existing Eth1 chain as the source of truth.

Before the Ethereum 2.0 genesis has been triggered, and for every Ethereum 1.0 block, let candidate_state = initialize_beacon_state_from_eth1(eth1_block_hash, eth1_timestamp, deposits) where:

eth1_block_hash is the hash of the Ethereum 1.0 block
eth1_timestamp is the Unix timestamp corresponding to eth1_block_hash
deposits is the sequence of all deposits, ordered chronologically, up to (and including) the block with hash eth1_block_hash

Ahead of MIN_GENESIS_TIME, some Eth2 beacon nodes should be up and running, and monitoring the Eth1 chain. No Eth2 beacon blocks are being produced yet.

From here until is_valid_genesis_state() returns True, each Eth1 block that is produced is run through the initialize_beacon_state_from_eth1() function.

Eth1 blocks must only be considered once they are at least SECONDS_PER_ETH1_BLOCK * ETH1_FOLLOW_DISTANCE seconds old (i.e. eth1_timestamp + SECONDS_PER_ETH1_BLOCK * ETH1_FOLLOW_DISTANCE <= current_unix_time). Due to this constraint, if GENESIS_DELAY < SECONDS_PER_ETH1_BLOCK * ETH1_FOLLOW_DISTANCE, then the genesis_time can happen before the time/state is first known. Values should be configured to avoid this case.

The SECONDS_PER_ETH1_BLOCK*ETH1_FOLLOW_DISTANCE constraint is a heuristic intended to ensure that any Eth1 block we rely on is not later reorganised out of the Eth1 chain. Its value is set pretty conservatively: about 4 hours.

def initialize_beacon_state_from_eth1(eth1_block_hash: Bytes32,
                                      eth1_timestamp: uint64,
                                      deposits: Sequence[Deposit]) -> BeaconState:
    fork = Fork(
        previous_version=GENESIS_FORK_VERSION,
        current_version=GENESIS_FORK_VERSION,
        epoch=GENESIS_EPOCH,
    )
    state = BeaconState(
        genesis_time=eth1_timestamp + GENESIS_DELAY,
        fork=fork,
        eth1_data=Eth1Data(block_hash=eth1_block_hash, deposit_count=len(deposits)),
        latest_block_header=BeaconBlockHeader(body_root=hash_tree_root(BeaconBlockBody())),
        randao_mixes=[eth1_block_hash] * EPOCHS_PER_HISTORICAL_VECTOR,  # Seed RANDAO with Eth1 entropy
    )

    # Process deposits
    leaves = list(map(lambda deposit: deposit.data, deposits))
    for index, deposit in enumerate(deposits):
        deposit_data_list = List[DepositData, 2**DEPOSIT_CONTRACT_TREE_DEPTH](*leaves[:index + 1])
        state.eth1_data.deposit_root = hash_tree_root(deposit_data_list)
        process_deposit(state, deposit)

    # Process activations
    for index, validator in enumerate(state.validators):
        balance = state.balances[index]
        validator.effective_balance = min(balance - balance % EFFECTIVE_BALANCE_INCREMENT, MAX_EFFECTIVE_BALANCE)
        if validator.effective_balance == MAX_EFFECTIVE_BALANCE:
            validator.activation_eligibility_epoch = GENESIS_EPOCH
            validator.activation_epoch = GENESIS_EPOCH

    # Set genesis validators root for domain separation and chain versioning
    state.genesis_validators_root = hash_tree_root(state.validators)

    return state

So, Eth1 blocks are used in turn to repeatedly try to construct a valid genesis beacon state as follows.

The beacon state timestamp is set to the Eth1 block's timestamp plus the GENESIS_DELAY. By the constraint above, this will be in the future.

A few other genesis constants are set. Notably, the Randao is seeded from the Eth1 block hash. The latest_block_header field is derived from an empty BeaconBlockBody - that is, all its fields default to their zero values as defined in the SSZ specification.

All the deposits into the Eth1 contract up to and including this block are processed. These are provided as a list, which can be derived from the receipts generated by the deposit contract. Some of these may be invalid, for example having an invalid BLS signature. These are ignored; the deposit is lost forever. Some may be partial or repeated deposits: this is fine and the total deposit for each validator is totted up in process_deposit().

Validators that have an effective balance of MAX_EFFECTIVE_BALANCE (i.e. 32 Ether) are marked to become active at the start of the GENESIS_EPOCH.

The hash tree root of the validator states becomes a permanent identifier for this chain in the form of genesis_validators_root. This is used by ForkData, which in turn is used whenever this chain needs to be distinguished from another chain.

Note: The ETH1 block with eth1_timestamp meeting the minimum genesis active validator count criteria can also occur before MIN_GENESIS_TIME.

There are two ways in which the genesis process can play out. Consider the point in time MIN_GENESIS_TIME - GENESIS_DELAY.

If sufficient Eth1 deposits to activate MIN_GENESIS_ACTIVE_VALIDATOR_COUNT validators have been made by that time, then genesis will occur at the timestamp of the first Eth1 block after that time plus GENESIS_DELAY, which is likely to be a few seconds after MIN_GENESIS_TIME. It will include all validators registered to this point, which can be in excess of MIN_GENESIS_ACTIVE_VALIDATOR_COUNT.

Otherwise, genesis occurs GENESIS_DELAY seconds after the timestamp of the block containing the deposit that activates the MIN_GENESIS_ACTIVE_VALIDATOR_COUNTth validator. Genesis will include all validators registered up to and including this block (which might be MIN_GENESIS_ACTIVE_VALIDATOR_COUNT or perhaps slightly over if the block has multiple deposits).

Recall that, in both these cases, there is also an interval of SECONDS_PER_ETH1_BLOCK*ETH1_FOLLOW_DISTANCE seconds between the deposit hitting the Eth1 chain and being picked up by the beacon nodes. Finally, note that the activation queue that normally applies for onboarding new validators is not used pre-genesis.
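
The two cases can be sketched with toy numbers. Here each Eth1 block is reduced to a pair of its timestamp and the cumulative number of fully funded validators as of that block. The function name and all the values are invented for illustration, and the follow distance is ignored.

```python
from typing import List, Optional, Tuple


def find_genesis_time(blocks: List[Tuple[int, int]],
                      min_genesis_time: int,
                      genesis_delay: int,
                      min_validators: int) -> Optional[int]:
    """Return the genesis time implied by the first qualifying Eth1 block, if any."""
    for timestamp, validator_count in blocks:
        genesis_time = timestamp + genesis_delay
        # Mirrors the two checks in is_valid_genesis_state()
        if genesis_time >= min_genesis_time and validator_count >= min_validators:
            return genesis_time
    return None


# Case 1: enough validators well before MIN_GENESIS_TIME - GENESIS_DELAY (= 900 here)
early_deposits = [(880, 3), (895, 4), (905, 5)]
# Case 2: the validator count threshold is reached only after that point
late_deposits = [(905, 1), (950, 2), (990, 3)]

case_1 = find_genesis_time(early_deposits, min_genesis_time=1000, genesis_delay=100, min_validators=3)
case_2 = find_genesis_time(late_deposits, min_genesis_time=1000, genesis_delay=100, min_validators=3)
```

In the first case genesis is at 1005, just after MIN_GENESIS_TIME, with five validators. In the second it is at 1090, GENESIS_DELAY after the block that brought in the third validator.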

Genesis state

Let genesis_state = candidate_state whenever is_valid_genesis_state(candidate_state) is True for the first time.

def is_valid_genesis_state(state: BeaconState) -> bool:
    if state.genesis_time < MIN_GENESIS_TIME:
        return False
    if len(get_active_validator_indices(state, GENESIS_EPOCH)) < MIN_GENESIS_ACTIVE_VALIDATOR_COUNT:
        return False
    return True

This function simply checks the criteria above. The beacon nodes continually prepare candidate beacon genesis states until this function returns True. The genesis event will take place at least GENESIS_DELAY seconds later, using the genesis state that first flips this function's output to True.

We keep on adding new validators while this function returns False. That is, while we are more than GENESIS_DELAY seconds before MIN_GENESIS_TIME, or while we don't yet have MIN_GENESIS_ACTIVE_VALIDATOR_COUNT validators. Thus the total number of genesis validators can't necessarily be known ahead of time.

At the moment when this function first returns True, we are then able to calculate the exact genesis time, the genesis block, and the genesis state root. The GENESIS_DELAY is designed to allow node operators time to verify these parameters between themselves (everyone should agree!), and to configure any non-validating nodes, such as boot nodes, with these quantities so that they do not need to rely on an Eth1 node. (But nodes with validators always need to access an Eth1 node.)

Genesis block

Let genesis_block = BeaconBlock(state_root=hash_tree_root(genesis_state)).

This is not explicitly used elsewhere in the spec. However, it is what the block in the first slot should reference as its "parent". This can be seen in the process_slot() function, where, if state.latest_block_header.state_root is empty it is replaced by the actual state root. This can happen only during the first slot.

Beacon chain state transition function

TODO

The post-state corresponding to a pre-state state and a signed block signed_block is defined as state_transition(state, signed_block). State transitions that trigger an unhandled exception (e.g. a failed assert or an out-of-range list access) are considered invalid. State transitions that cause a uint64 overflow or underflow are also considered invalid.

The use of asserts in the spec as-is is a little weird, and slightly controversial. The essential thing to remember is that, if you hit an assert at any point while processing a block, the whole transition needs to be aborted and reset to the original state. With the spec as written, this potentially means undoing already done state updates, or keeping a copy of the former state around to revert to if necessary.
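
One way to get the "abort and reset" behaviour is to run the transition on a copy and only adopt the copy if nothing blew up. The sketch below does this with a toy dictionary state; try_transition() and the two toy block functions are hypothetical, and a real client would likely use something much cheaper than a deep copy of the whole state.

```python
import copy
from typing import Callable, Dict, Tuple

ToyState = Dict[str, int]


def try_transition(state: ToyState,
                   transition: Callable[[ToyState], None]) -> Tuple[ToyState, bool]:
    """Apply transition to a copy; return (post_state, True) or (original state, False)."""
    working = copy.deepcopy(state)
    try:
        transition(working)
    except (AssertionError, IndexError):
        # A failed assert or out-of-range access makes the whole transition invalid
        return state, False
    return working, True


def valid_block(state: ToyState) -> None:
    state["slot"] += 1


def invalid_block(state: ToyState) -> None:
    state["slot"] += 1  # a partial update happens first...
    assert state["slot"] == 0, "toy validity check that always fails here"
```

Because invalid_block() only ever touches the working copy, its partial update never leaks into the state we carry on with.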

def state_transition(state: BeaconState, signed_block: SignedBeaconBlock, validate_result: bool=True) -> None:
    block = signed_block.message
    # Process slots (including those with no blocks) since block
    process_slots(state, block.slot)
    # Verify signature
    if validate_result:
        assert verify_block_signature(state, signed_block)
    # Process block
    process_block(state, block)
    # Verify state root
    if validate_result:
        assert block.state_root == hash_tree_root(state)

Beacon chain state is advanced every slot, with extra processing at the end of each epoch. However, state updates are driven by the receipt of valid blocks. Although, in the ideal case, there is a block for every slot, in practice one or more slots can be skipped, for example if the proposers are offline. (If the beacon node is serving validators, however, it will need to keep its state up to date irrespective of whether blocks arrive or not.)

So the first thing that happens in the state transition function is that the state is brought up to date with respect to the block received (the comment in the code seems inaccurate).

[TODO - complete this thought] In principle, a beacon node can hang around doing nothing and just catch up when a block comes. If no blocks are received for a number of slots, nothing happens. When a block is finally received, all slots and epochs are processed up to date.

[TODO: explain validate_result.]

def verify_block_signature(state: BeaconState, signed_block: SignedBeaconBlock) -> bool:
    proposer = state.validators[signed_block.message.proposer_index]
    signing_root = compute_signing_root(signed_block.message, get_domain(state, DOMAIN_BEACON_PROPOSER))
    return bls.Verify(proposer.pubkey, signing_root, signed_block.signature)

Simply checks that the signature on the block matches the block's contents and the public key of the claimed proposer of the block. This ensures that blocks cannot be forged or tampered with in transit. All the public keys for validators are stored in the Validators list in state. See domain types for DOMAIN_BEACON_PROPOSER.

def process_slots(state: BeaconState, slot: Slot) -> None:
    assert state.slot < slot
    while state.slot < slot:
        process_slot(state)
        # Process epoch on the start slot of the next epoch
        if (state.slot + 1) % SLOTS_PER_EPOCH == 0:
            process_epoch(state)
        state.slot = Slot(state.slot + 1)

Updates the state from its current slot up to the given slot number assuming that all the intermediate slots are empty (they do not contain blocks). Iteratively calls process_slot() to apply the empty slot state-transition.

Empty slot processing is extremely lightweight, but the epoch transitions require the full rewards and penalties, and justification/finalisation apparatus.
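
To see when the expensive part kicks in, here is the loop from process_slots() with everything except the slot counter removed. It records the slots at whose end process_epoch() would be called. The function name is mine, and SLOTS_PER_EPOCH = 32 is assumed.

```python
from typing import List

SLOTS_PER_EPOCH = 32  # assumed value


def epoch_transition_slots(start_slot: int, target_slot: int) -> List[int]:
    """Slots in [start_slot, target_slot) after which epoch processing would run."""
    assert start_slot < target_slot
    transitions = []
    slot = start_slot
    while slot < target_slot:
        # Same condition as in process_slots(): the last slot of an epoch
        if (slot + 1) % SLOTS_PER_EPOCH == 0:
            transitions.append(slot)
        slot += 1
    return transitions
```

Catching up from slot 30 to slot 33 triggers a single epoch transition (at the end of slot 31), whereas catching up from slot 31 to slot 96 triggers three.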

def process_slot(state: BeaconState) -> None:
    # Cache state root
    previous_state_root = hash_tree_root(state)
    state.state_roots[state.slot % SLOTS_PER_HISTORICAL_ROOT] = previous_state_root
    # Cache latest block header state root
    if state.latest_block_header.state_root == Bytes32():
        state.latest_block_header.state_root = previous_state_root
    # Cache block root
    previous_block_root = hash_tree_root(state.latest_block_header)
    state.block_roots[state.slot % SLOTS_PER_HISTORICAL_ROOT] = previous_block_root

Apply the "empty slot" state-transition (except for updating the slot number, and end-of-epoch processing).

This is almost trivial and consists only of calculating the updated state and block hash tree roots (as necessary), and storing them in the historical lists in the state.

TODO: explain state_root == Bytes32()

SLOTS_PER_HISTORICAL_ROOT is a multiple of SLOTS_PER_EPOCH, so there is no danger of overwriting the circular lists of state_roots and block_roots. These will be dealt with correctly during epoch processing.

Epoch processing

def process_epoch(state: BeaconState) -> None:
    process_justification_and_finalization(state)
    process_rewards_and_penalties(state)
    process_registry_updates(state)
    process_slashings(state)
    process_final_updates(state)

All major updates to the state occur after the last slot of each epoch, during epoch processing. With the exception of slashing and deposit processing, epoch processing is the only place in the Phase 0 specification where validator balances are modified. Note that this is likely to change in future.

process_justification_and_finalization() uses the stored attestations to update the justified and finalised checkpoints, as per Casper FFG.

process_rewards_and_penalties() calculates and applies all rewards and penalties for making attestations and for proposing blocks, as well as the inactivity leak.

process_registry_updates() manages the queues for validators that are to be activated or exited.

process_slashings() applies any correlated slashing penalties to previously slashed validators.

process_final_updates() is mostly just housekeeping, including updating validators' effective balances.

Helper functions

def get_matching_source_attestations(state: BeaconState, epoch: Epoch) -> Sequence[PendingAttestation]:
    assert epoch in (get_previous_epoch(state), get_current_epoch(state))
    return state.current_epoch_attestations if epoch == get_current_epoch(state) else state.previous_epoch_attestations

TODO

def get_matching_target_attestations(state: BeaconState, epoch: Epoch) -> Sequence[PendingAttestation]:
    return [
        a for a in get_matching_source_attestations(state, epoch)
        if a.data.target.root == get_block_root(state, epoch)
    ]

TODO

def get_matching_head_attestations(state: BeaconState, epoch: Epoch) -> Sequence[PendingAttestation]:
    return [
        a for a in get_matching_target_attestations(state, epoch)
        if a.data.beacon_block_root == get_block_root_at_slot(state, a.data.slot)
    ]

TODO

def get_unslashed_attesting_indices(state: BeaconState,
                                    attestations: Sequence[PendingAttestation]) -> Set[ValidatorIndex]:
    output = set()  # type: Set[ValidatorIndex]
    for a in attestations:
        output = output.union(get_attesting_indices(state, a.data, a.aggregation_bits))
    return set(filter(lambda index: not state.validators[index].slashed, output))

TODO

def get_attesting_balance(state: BeaconState, attestations: Sequence[PendingAttestation]) -> Gwei:
    """
    Return the combined effective balance of the set of unslashed validators participating in ``attestations``.
    Note: ``get_total_balance`` returns ``EFFECTIVE_BALANCE_INCREMENT`` Gwei minimum to avoid divisions by zero.
    """
    return get_total_balance(state, get_unslashed_attesting_indices(state, attestations))

TODO

Justification and finalization

def process_justification_and_finalization(state: BeaconState) -> None:
    # Initial FFG checkpoint values have a `0x00` stub for `root`.
    # Skip FFG updates in the first two epochs to avoid corner cases that might result in modifying this stub.
    if get_current_epoch(state) <= GENESIS_EPOCH + 1:
        return

    previous_epoch = get_previous_epoch(state)
    current_epoch = get_current_epoch(state)
    old_previous_justified_checkpoint = state.previous_justified_checkpoint
    old_current_justified_checkpoint = state.current_justified_checkpoint

    # Process justifications
    state.previous_justified_checkpoint = state.current_justified_checkpoint
    state.justification_bits[1:] = state.justification_bits[:JUSTIFICATION_BITS_LENGTH - 1]
    state.justification_bits[0] = 0b0
    matching_target_attestations = get_matching_target_attestations(state, previous_epoch)  # Previous epoch
    if get_attesting_balance(state, matching_target_attestations) * 3 >= get_total_active_balance(state) * 2:
        state.current_justified_checkpoint = Checkpoint(epoch=previous_epoch,
                                                        root=get_block_root(state, previous_epoch))
        state.justification_bits[1] = 0b1
    matching_target_attestations = get_matching_target_attestations(state, current_epoch)  # Current epoch
    if get_attesting_balance(state, matching_target_attestations) * 3 >= get_total_active_balance(state) * 2:
        state.current_justified_checkpoint = Checkpoint(epoch=current_epoch,
                                                        root=get_block_root(state, current_epoch))
        state.justification_bits[0] = 0b1

    # Process finalizations
    bits = state.justification_bits
    # The 2nd/3rd/4th most recent epochs are justified, the 2nd using the 4th as source
    if all(bits[1:4]) and old_previous_justified_checkpoint.epoch + 3 == current_epoch:
        state.finalized_checkpoint = old_previous_justified_checkpoint
    # The 2nd/3rd most recent epochs are justified, the 2nd using the 3rd as source
    if all(bits[1:3]) and old_previous_justified_checkpoint.epoch + 2 == current_epoch:
        state.finalized_checkpoint = old_previous_justified_checkpoint
    # The 1st/2nd/3rd most recent epochs are justified, the 1st using the 3rd as source
    if all(bits[0:3]) and old_current_justified_checkpoint.epoch + 2 == current_epoch:
        state.finalized_checkpoint = old_current_justified_checkpoint
    # The 1st/2nd most recent epochs are justified, the 1st using the 2nd as source
    if all(bits[0:2]) and old_current_justified_checkpoint.epoch + 1 == current_epoch:
        state.finalized_checkpoint = old_current_justified_checkpoint

TODO
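
Two mechanical pieces of the function above can be shown in isolation: the shifting of the justification bits, and the 2/3 supermajority test done purely in integer arithmetic. This is only a sketch using a plain Python list in place of the spec's Bitvector; the function names are mine, and JUSTIFICATION_BITS_LENGTH = 4 is assumed.

```python
from typing import List

JUSTIFICATION_BITS_LENGTH = 4  # assumed value


def shift_justification_bits(bits: List[bool]) -> List[bool]:
    """Age every bit by one epoch; bit 0 (the current epoch) starts out unset."""
    assert len(bits) == JUSTIFICATION_BITS_LENGTH
    return [False] + bits[:JUSTIFICATION_BITS_LENGTH - 1]


def is_supermajority(attesting_balance: int, total_active_balance: int) -> bool:
    # attesting / total >= 2/3, rearranged to avoid any division
    return attesting_balance * 3 >= total_active_balance * 2
```

Cross-multiplying like this avoids the rounding that integer division would introduce: 66 out of 100 is not a supermajority, but 67 out of 100 is.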

Rewards and penalties

Helpers

def get_base_reward(state: BeaconState, index: ValidatorIndex) -> Gwei:
    total_balance = get_total_active_balance(state)
    effective_balance = state.validators[index].effective_balance
    return Gwei(effective_balance * BASE_REWARD_FACTOR // integer_squareroot(total_balance) // BASE_REWARDS_PER_EPOCH)

The base reward (which I call $B$ throughout this doc) is the fundamental determiner of the issuance rate of the beacon chain: all attester and proposer rewards are calculated from this. As I noted under BASE_REWARD_FACTOR, this is the big knob to turn if we wish to increase or decrease the total reward for participating in Eth2.

The base reward for a validator is proportional to its effective balance, because that is how its votes are weighted in influencing the chain. It is inversely proportional to the square root of the total balance of all active validators. This means that, as the number $N$ of validators increases, the reward per validator decreases as $\smash{\frac{1}{\sqrt{N}}}$, and the overall issuance per epoch increases as $\smash{\sqrt{N}}$.

The decrease with increasing $N$ in per-validator rewards is a form of price discovery: the idea is that an equilibrium will be found where the total number of validators results in a reward similar to returns available elsewhere for similar risk. The Eth2 Launch Pad has a graph that shows how this translates into expected APR for running a validator for different total amounts of ETH staked.

A different curve could have been chosen for the rewards profile. For example, the inverse of total balance rather than its square root would keep total issuance constant. Vitalik justifies the inverse square root approach and discusses the trade-offs here.
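
The square root scaling is easy to check numerically. The sketch below reimplements the get_base_reward() formula on plain integers, using math.isqrt in place of integer_squareroot(), and assumes BASE_REWARD_FACTOR = 64 and BASE_REWARDS_PER_EPOCH = 4. The validator counts are picked so that the square roots come out exact.

```python
import math

GWEI_PER_ETH = 10**9
BASE_REWARD_FACTOR = 64      # assumed value
BASE_REWARDS_PER_EPOCH = 4   # assumed value


def base_reward(effective_balance: int, total_active_balance: int) -> int:
    """Same arithmetic as get_base_reward(); all amounts in Gwei."""
    return (effective_balance * BASE_REWARD_FACTOR
            // math.isqrt(total_active_balance)
            // BASE_REWARDS_PER_EPOCH)


def max_issuance_per_epoch(validator_count: int) -> int:
    """Upper bound on issuance if every validator holds 32 ETH and earns all base rewards."""
    balance = 32 * GWEI_PER_ETH
    b = base_reward(balance, validator_count * balance)
    return validator_count * BASE_REWARDS_PER_EPOCH * b


b_small = base_reward(32 * GWEI_PER_ETH, 312_500 * 32 * GWEI_PER_ETH)    # 10 million ETH staked
b_large = base_reward(32 * GWEI_PER_ETH, 1_250_000 * 32 * GWEI_PER_ETH)  # 40 million ETH staked
```

Quadrupling the number of validators halves the base reward per validator (5120 Gwei down to 2560 Gwei) and doubles the maximum issuance per epoch (6.4 ETH up to 12.8 ETH), as expected from the $\smash{\sqrt{N}}$ behaviour.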

BASE_REWARDS_PER_EPOCH multiplied by get_base_reward() is the total value of an attestation (when all attesters participate, i.e. $B = B'$). There are four components:

$B$ for getting the Casper FFG source vote right.

$B$ for getting the Casper FFG target vote right.

$B$ for getting the LMD GHOST head vote right.

$B$ divided between the attester and the proposer for getting the attestation included quickly.

def get_proposer_reward(state: BeaconState, attesting_index: ValidatorIndex) -> Gwei:
    return Gwei(get_base_reward(state, attesting_index) // PROPOSER_REWARD_QUOTIENT)

For each attestation included in a block, the block proposer receives a reward $\smash{\frac{B}{8}}$, where $B$ is the base reward of the validator that made the attestation, and PROPOSER_REWARD_QUOTIENT is 8. (The attesting validator will receive the remaining $\smash{\frac{7B}{8}}$, reduced by the inclusion delay.)

Proposer rewards are allocated in get_inclusion_delay_deltas().

def get_finality_delay(state: BeaconState) -> uint64:
    return get_previous_epoch(state) - state.finalized_checkpoint.epoch

Returns the number of epochs that have elapsed since the last finalised checkpoint. Anything over zero here indicates a delay in optimal finalisation: ideally, a checkpoint is justified at the end of its epoch, and finalised at the end of the subsequent epoch.

Used by is_in_inactivity_leak(), and in get_inactivity_penalty_deltas() to calculate the inactivity leak.

def is_in_inactivity_leak(state: BeaconState) -> bool:
    return get_finality_delay(state) > MIN_EPOCHS_TO_INACTIVITY_PENALTY

If the beacon chain has not managed to finalise a checkpoint for more than MIN_EPOCHS_TO_INACTIVITY_PENALTY epochs (four), then the chain enters the inactivity leak.
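The comparison is strict, which is easy to miss. The following is a minimal sketch that works on plain integers rather than a BeaconState; the function names and signatures are my own simplification, not the spec's.

```python
MIN_EPOCHS_TO_INACTIVITY_PENALTY = 4  # the value quoted in the text above


def finality_delay(previous_epoch: int, finalized_epoch: int) -> int:
    """Epochs between the last finalised checkpoint and the previous epoch."""
    return previous_epoch - finalized_epoch


def in_inactivity_leak(previous_epoch: int, finalized_epoch: int) -> bool:
    """True only when the delay strictly exceeds the threshold."""
    return finality_delay(previous_epoch, finalized_epoch) > MIN_EPOCHS_TO_INACTIVITY_PENALTY


print(in_inactivity_leak(14, 10))  # delay of exactly four epochs
print(in_inactivity_leak(15, 10))  # delay of five epochs
```

A delay of exactly four epochs does not trigger the leak; a delay of five does.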

This is used in get_attestation_component_deltas() and get_inactivity_penalty_deltas().

def get_eligible_validator_indices(state: BeaconState) -> Sequence[ValidatorIndex]:
    previous_epoch = get_previous_epoch(state)
    return [
        ValidatorIndex(index) for index, v in enumerate(state.validators)
        if is_active_validator(v, previous_epoch) or (v.slashed and previous_epoch + 1 < v.withdrawable_epoch)
    ]

Returns a list of validators that were either active during the last epoch (and therefore should have made an attestation) or are marked as slashed but not yet withdrawable.

Slashed validators are excluded when attestation rewards are calculated. Including them in this list ensures that they are always penalised for not attesting (even if they do attest correctly up to the epoch of their exit). The penalty goes beyond the slashed validator's exit epoch, right up to its withdrawable epoch. Thus, a slashed validator will accrue full penalties for EPOCHS_PER_SLASHINGS_VECTOR epochs after the slashing is processed (~36 days), as set in slash_validator(). This is not strictly necessary (a bigger initial penalty could simply have been applied), but I imagine that it does increase the perceived pain of having been slashed.

def get_attestation_component_deltas(state: BeaconState,
                                     attestations: Sequence[PendingAttestation]
                                     ) -> Tuple[Sequence[Gwei], Sequence[Gwei]]:
    """
    Helper with shared logic for use by get source, target, and head deltas functions
    """
    rewards = [Gwei(0)] * len(state.validators)
    penalties = [Gwei(0)] * len(state.validators)
    total_balance = get_total_active_balance(state)
    unslashed_attesting_indices = get_unslashed_attesting_indices(state, attestations)
    attesting_balance = get_total_balance(state, unslashed_attesting_indices)
    for index in get_eligible_validator_indices(state):
        if index in unslashed_attesting_indices:
            increment = EFFECTIVE_BALANCE_INCREMENT  # Factored out from balance totals to avoid uint64 overflow
            if is_in_inactivity_leak(state):
                # Since full base reward will be canceled out by inactivity penalty deltas,
                # optimal participation receives full base reward compensation here.
                rewards[index] += get_base_reward(state, index)
            else:
                reward_numerator = get_base_reward(state, index) * (attesting_balance // increment)
                rewards[index] += reward_numerator // (total_balance // increment)
        else:
            penalties[index] += get_base_reward(state, index)
    return rewards, penalties

This is a utility function that, given a list of attestations, allocates rewards to those validators that participated and penalties to those validators that didn't. It is used by get_source_deltas(), get_target_deltas(), and get_head_deltas().

Every active validator is expected to make an attestation exactly once per epoch, and the function cycles through them in turn, applying either a reward or a penalty. The list of validators to consider is provided by get_eligible_validator_indices(): these are the active validators and slashed validators that have not reached their withdrawable epoch. Meanwhile, get_unslashed_attesting_indices() provides the list of validators that participated in the provided attestations. Validators appearing in the intersection of the lists receive a reward; the remainder receive a penalty. Note that the latter list excludes slashed validators, so these are always penalised.

Those validators that did not participate in one of the attestations simply receive a penalty of get_base_reward() ($B$).

Those validators that participated in one of the attestations receive a reward. There are two cases:

In the normal case, the reward is $B'$, which is the base reward $B$ scaled by the proportion of total active balance that made the same vote. Thus, if participation rates are below 100%, rewards are proportionally lower.

If the inactivity leak is active, the reward is simply the base reward, $B$. This will be exactly cancelled out later in get_inactivity_penalty_deltas(), so the net maximum reward for attesting during an inactivity leak period is zero.
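The two cases can be sketched as a single function. This is a simplified illustration on plain integers: component_reward() is a hypothetical name, and the balances in the example are made up.

```python
EFFECTIVE_BALANCE_INCREMENT = 10**9  # 1 ETH in Gwei


def component_reward(base_reward: int, attesting_balance: int,
                     total_balance: int, in_leak: bool) -> int:
    """Reward for one correct vote (source, target, or head)."""
    if in_leak:
        # Full B; it is cancelled out again by the inactivity penalty deltas
        return base_reward
    # B' = B * attesting_balance / total_balance, computed in whole increments
    # in the same way as the spec, which does this to avoid uint64 overflow
    increment = EFFECTIVE_BALANCE_INCREMENT
    return base_reward * (attesting_balance // increment) // (total_balance // increment)


ETH = 10**9
print(component_reward(10_000, 800 * ETH, 1_000 * ETH, in_leak=False))
print(component_reward(10_000, 800 * ETH, 1_000 * ETH, in_leak=True))
```

With 80% of the balance making the same vote, the normal-case reward is 80% of $B$; during a leak it is the full $B$.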

It may not be immediately obvious why the reward $B'$ is set to vary with the participation rate. This is about discouragement attacks (see also this nice explainer). In short, with this mechanism, validators are incentivised to help each other out (e.g. by forwarding gossip messages, or aggregating attestations well) rather than to attack or censor one another.

Components of attestation deltas

Every attestation is eligible for four micro rewards:

Voting for the correct Casper FFG source checkpoint (in the view of this beacon node)

Voting for the correct Casper FFG target checkpoint

Voting for the correct LMD GHOST head block

Being included in a block quickly

For the first three of these, there is a reward for voting correctly and a penalty for voting incorrectly. There is also a reward for block proposers that is calculated under the fourth.

The following get_*_deltas() functions organise the calculation of these micro rewards and penalties. They are accumulated in get_attestation_deltas() and applied in process_rewards_and_penalties().

All of the rewards and penalties are applied based on attestations collected during the previous epoch. Since epoch processing takes place at the end of each epoch, this means that the attestations were included in blocks between 33 and 64 slots ago.

There's plenty of low-hanging fruit for optimisation in all of the rewards processing: for example, combining functions so as to walk the attestation list only once.

def get_source_deltas(state: BeaconState) -> Tuple[Sequence[Gwei], Sequence[Gwei]]:
    """
    Return attester micro-rewards/penalties for source-vote for each validator.
    """
    matching_source_attestations = get_matching_source_attestations(state, get_previous_epoch(state))
    return get_attestation_component_deltas(state, matching_source_attestations)

Computes rewards for getting the Casper FFG source checkpoint vote correct.

The matching source attestations returned by get_matching_source_attestations() are those that voted for the correct source checkpoint. That is, all the ones in the state's previous_epoch_attestations list.

get_attestation_component_deltas() is a common utility function for allocating rewards to validators based on a list of attestations.

def get_target_deltas(state: BeaconState) -> Tuple[Sequence[Gwei], Sequence[Gwei]]:
    """
    Return attester micro-rewards/penalties for target-vote for each validator.
    """
    matching_target_attestations = get_matching_target_attestations(state, get_previous_epoch(state))
    return get_attestation_component_deltas(state, matching_target_attestations)

Computes rewards for getting the Casper FFG target checkpoint vote correct.

Matching target attestations returned by get_matching_target_attestations() are those that voted for both the correct source checkpoint and the correct target checkpoint.

def get_head_deltas(state: BeaconState) -> Tuple[Sequence[Gwei], Sequence[Gwei]]:
    """
    Return attester micro-rewards/penalties for head-vote for each validator.
    """
    matching_head_attestations = get_matching_head_attestations(state, get_previous_epoch(state))
    return get_attestation_component_deltas(state, matching_head_attestations)

Computes rewards for getting the LMD GHOST head vote correct.

Matching head attestations returned by get_matching_head_attestations() are those that voted for all three of the correct source checkpoint, the correct target checkpoint, and the correct head block.

def get_inclusion_delay_deltas(state: BeaconState) -> Tuple[Sequence[Gwei], Sequence[Gwei]]:
    """
    Return proposer and inclusion delay micro-rewards/penalties for each validator.
    """
    rewards = [Gwei(0) for _ in range(len(state.validators))]
    matching_source_attestations = get_matching_source_attestations(state, get_previous_epoch(state))
    for index in get_unslashed_attesting_indices(state, matching_source_attestations):
        attestation = min([
            a for a in matching_source_attestations
            if index in get_attesting_indices(state, a.data, a.aggregation_bits)
        ], key=lambda a: a.inclusion_delay)
        rewards[attestation.proposer_index] += get_proposer_reward(state, index)
        max_attester_reward = Gwei(get_base_reward(state, index) - get_proposer_reward(state, index))
        rewards[index] += Gwei(max_attester_reward // attestation.inclusion_delay)

    # No penalties associated with inclusion delay
    penalties = [Gwei(0) for _ in range(len(state.validators))]
    return rewards, penalties

Two things are going on in this function: attester rewards are calculated for having got attestations included quickly in blocks, and proposer rewards are calculated for having included attestations in blocks. Aside from including slashing reports, this is the only place in which block proposers receive a reward.

The relevant attestations are those that were included in a block during the previous epoch and voted for the correct source checkpoint. That is, all of the attestations in the state's previous_epoch_attestations list. This list contains PendingAttestation objects, which are aggregate Attestations, minus the signature, plus inclusion delay info and the index of the proposer that included them. See process_attestation().

Each active, unslashed, validator is expected to have made exactly one attestation in the epoch. However, multiple copies of that attestation may be present in the list, from different aggregations or different blocks. This is taken care of by selecting only the attestation with the minimum inclusion delay for each validator. As ever, slashed but still non-exited validators do not receive this reward.

Each attestation is assigned a base reward $B$ by get_base_reward(), proportional to the attester's effective balance. The base reward is divided between the proposer and the attester as follows:

The proposer that included the attestation in a block receives $\smash{\frac{1}{8}B}$, based on PROPOSER_REWARD_QUOTIENT being 8 in get_proposer_reward().

The attester receives the remaining $\smash{\frac{7}{8}B}$, scaled by the reciprocal of the delay in getting the attestation included in a block. The earliest an attestation may be included is with 1 slot's delay (MIN_ATTESTATION_INCLUSION_DELAY), in which case the attester receives the full $\smash{\frac{7}{8}B}$. For two slots' delay, $\smash{\frac{7}{16}B}$. For three slots' delay, $\smash{\frac{7}{24}B}$, and so on. The maximum inclusion delay (enforced by process_attestation()) is 32 slots.
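The split can be illustrated with a small sketch. split_inclusion_reward() is a hypothetical helper working on a single attester's base reward, and the base reward of 8000 Gwei is just a convenient number.

```python
from typing import Tuple

PROPOSER_REWARD_QUOTIENT = 8


def split_inclusion_reward(base_reward: int, inclusion_delay: int) -> Tuple[int, int]:
    """Return (proposer_reward, attester_reward) for one included attestation."""
    proposer_reward = base_reward // PROPOSER_REWARD_QUOTIENT
    max_attester_reward = base_reward - proposer_reward
    # The attester's share is scaled by the reciprocal of the inclusion delay
    return proposer_reward, max_attester_reward // inclusion_delay


for delay in (1, 2, 3, 32):
    print(delay, split_inclusion_reward(8_000, delay))
```

The proposer's share does not depend on the delay; only the attester's share shrinks as inclusion is delayed.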

This mechanism has been very effective in getting client implementations to improve the quality and speed of their attestation generation. Before the beaconcha.in block explorer made a version of this metric visible, clients were often scoring well below 90% on the metric. Client improvements mean that it is now rare to find scores less than 99%, and more commonly 100%, on the mainnet beacon chain.

def get_inactivity_penalty_deltas(state: BeaconState) -> Tuple[Sequence[Gwei], Sequence[Gwei]]:
    """
    Return inactivity reward/penalty deltas for each validator.
    """
    penalties = [Gwei(0) for _ in range(len(state.validators))]
    if is_in_inactivity_leak(state):
        matching_target_attestations = get_matching_target_attestations(state, get_previous_epoch(state))
        matching_target_attesting_indices = get_unslashed_attesting_indices(state, matching_target_attestations)
        for index in get_eligible_validator_indices(state):
            # If validator is performing optimally this cancels all rewards for a neutral balance
            base_reward = get_base_reward(state, index)
            penalties[index] += Gwei(BASE_REWARDS_PER_EPOCH * base_reward - get_proposer_reward(state, index))
            if index not in matching_target_attesting_indices:
                effective_balance = state.validators[index].effective_balance
                penalties[index] += Gwei(effective_balance * get_finality_delay(state) // INACTIVITY_PENALTY_QUOTIENT)

    # No rewards associated with inactivity penalties
    rewards = [Gwei(0) for _ in range(len(state.validators))]
    return rewards, penalties

If the beacon chain is finalising correctly, then this function is a no-op. But if the chain is in an inactivity leak regime, as determined by is_in_inactivity_leak(), then the extra penalties are calculated here.

Two things are happening:

All eligible validators (that is, active validators, plus slashed validators that are not yet withdrawable) receive a penalty constructed so as to exactly negate the maximum reward available for attesting (both for correct votes, and optimal inclusion). So, during the inactivity leak, perfectly attesting validators receive no net reward and no net penalty.

All eligible validators that failed to vote for the correct Casper FFG target in the previous epoch, or are marked as slashed, receive a further penalty that increases linearly with the number of epochs since finalisation. Note that a validator may be online and attesting, but it is nonetheless subject to the inactivity leak unless both its source and target votes are correct. Slashed (but not exited) validators receive this penalty to avoid self-slashing becoming a way to avoid the leak.
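Both parts can be sketched on plain integers. The function names are hypothetical, and the value of INACTIVITY_PENALTY_QUOTIENT (2**26) is my assumption of the Phase 0 mainnet configuration, so it is passed as a parameter in case it differs.

```python
BASE_REWARDS_PER_EPOCH = 4
PROPOSER_REWARD_QUOTIENT = 8
INACTIVITY_PENALTY_QUOTIENT = 2**26  # assumed value; check the constants section


def max_attester_reward_in_leak(base_reward: int) -> int:
    """3B for source, target and head, plus 7/8 B for inclusion after one slot."""
    return 3 * base_reward + (base_reward - base_reward // PROPOSER_REWARD_QUOTIENT)


def inactivity_penalty(base_reward: int, effective_balance: int, finality_delay: int,
                       voted_correct_target: bool,
                       quotient: int = INACTIVITY_PENALTY_QUOTIENT) -> int:
    """Penalty applied to one eligible validator while the leak is active."""
    # Part one: cancel the maximum attester reward
    penalty = BASE_REWARDS_PER_EPOCH * base_reward - base_reward // PROPOSER_REWARD_QUOTIENT
    # Part two: the leak itself, growing linearly with the finality delay
    if not voted_correct_target:
        penalty += effective_balance * finality_delay // quotient
    return penalty


B, EFF = 9_050, 32 * 10**9
print(max_attester_reward_in_leak(B) - inactivity_penalty(B, EFF, 10, True))
print(inactivity_penalty(B, EFF, 10, False) - inactivity_penalty(B, EFF, 10, True))
```

The first printed value is the net result for a perfect attester; the second is the extra amount leaked by a validator that missed the target vote.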

See below for how this fits into the big picture, and above for further discussion of the inactivity leak.

[TODO: explain why max rewards are limited to zero during this period - something to do with discouragement attacks iirc...]

Note that, unlike attester rewards, proposer rewards are not cancelled out. So it is possible for validators to make a net positive reward during an inactivity leak, based solely on rewards for proposing blocks.

`get_attestation_deltas`

def get_attestation_deltas(state: BeaconState) -> Tuple[Sequence[Gwei], Sequence[Gwei]]:
    """
    Return attestation reward/penalty deltas for each validator.
    """
    source_rewards, source_penalties = get_source_deltas(state)
    target_rewards, target_penalties = get_target_deltas(state)
    head_rewards, head_penalties = get_head_deltas(state)
    inclusion_delay_rewards, _ = get_inclusion_delay_deltas(state)
    _, inactivity_penalties = get_inactivity_penalty_deltas(state)

    rewards = [
        source_rewards[i] + target_rewards[i] + head_rewards[i] + inclusion_delay_rewards[i]
        for i in range(len(state.validators))
    ]

    penalties = [
        source_penalties[i] + target_penalties[i] + head_penalties[i] + inactivity_penalties[i]
        for i in range(len(state.validators))
    ]

    return rewards, penalties

Rewards and penalties for all validators are applied once per epoch, and are accumulated from their various component parts here. Below, I'll try to summarise everything in one place.

Attestation rewards are calculated from a base reward, call it $B$, calculated as follows:
effective_balance * BASE_REWARD_FACTOR // integer_squareroot(total_balance) // BASE_REWARDS_PER_EPOCH
where effective_balance is the effective balance of the validator making the attestation. For the source, target, and head rewards, $B$ is scaled by the attesting balance divided by the total balance: call it $B'$ (which is less than or equal to $B$, thus, individual rewards decrease if participation is below 100%). In the following, "correct" should be understood as meaning "correct according to the processing node's current view of the chain".

When the chain is operating normally (not in inactivity leak), attesters receive the following.

A reward of $B'$ for having the right source Casper FFG vote; a penalty of $B$ for having the wrong source vote.

A reward of $B'$ for having the right target Casper FFG vote; a penalty of $B$ for having the wrong target vote.

A reward of $B'$ for having the right head LMD GHOST vote; a penalty of $B$ for having the wrong head vote.

Getting the source vote wrong implies getting the target vote wrong, and getting the target vote wrong implies getting the head vote wrong, so the possibilities boil down to the following cases and net rewards.

Incorrect source: reward $0$, penalty $3B$

Correct source, incorrect target: reward $B'$, penalty $2B$

Correct source and target, incorrect head: reward $2B'$, penalty $B$

Correct source, target and head: reward $3B'$, penalty $0$

In addition to the above, there is a reward for being included quickly (but no associated penalty). The reward for the proposer including the attestation in a block is taken from this inclusion delay reward. With PROPOSER_REWARD_QUOTIENT set to 8, the maximum an attestation can receive is $\smash{\frac{7}{8}B}$. This is then scaled with the reciprocal of the number of slots that inclusion was delayed: $\smash{\frac{7}{8}B}$ for a one slot delay (the minimum); $\smash{\frac{7}{16}B}$ for a two slot delay; $\smash{\frac{7}{24}B}$ for a three slot delay, and so on. After 32 slots delay the attestation is no longer valid.

So there are in total 4 reward components, which fact is used in the calculation of $B$ via BASE_REWARDS_PER_EPOCH.

Finally, the total reward and penalty for an attestation comprises the sum of the four elements above. It is a maximum net reward of $\smash{3B'+\frac{7}{8}B}$, and a maximum net penalty of $3B$.
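As a rough sketch of the normal (non-leak) case, the following combines the four components for a single attester. It is a simplification: attester_deltas() is a hypothetical function, and a single value of $B'$ is used for all three votes, whereas in the spec the scaling is computed separately for source, target, and head.

```python
from typing import Tuple

PROPOSER_REWARD_QUOTIENT = 8


def attester_deltas(b: int, b_prime: int, source_ok: bool, target_ok: bool,
                    head_ok: bool, inclusion_delay: int) -> Tuple[int, int]:
    """Return (reward, penalty) for one attester outside the inactivity leak."""
    reward, penalty = 0, 0
    # A correct target needs a correct source; a correct head needs both
    votes = (source_ok, source_ok and target_ok, source_ok and target_ok and head_ok)
    for ok in votes:
        if ok:
            reward += b_prime
        else:
            penalty += b
    # The inclusion delay reward applies only to matching source attestations
    if source_ok:
        reward += (b - b // PROPOSER_REWARD_QUOTIENT) // inclusion_delay
    return reward, penalty


print(attester_deltas(8_000, 8_000, True, True, True, 1))     # perfect attestation
print(attester_deltas(8_000, 6_000, True, True, False, 2))    # wrong head, 75% participation
print(attester_deltas(8_000, 8_000, False, False, False, 1))  # missed or wrong source
```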

The above is modified as follows when the chain is not finalising and the inactivity leak is occurring:

$B'$ is set to $B$

All validators expected to have made an attestation (all active validators) receive an extra penalty of $\smash{3B+\frac{7}{8}B}$. This means that if they attest perfectly their net balance remains unchanged.

All validators not making a correct target vote receive an extra penalty that increases with time, of effective_balance * get_finality_delay(state) // INACTIVITY_PENALTY_QUOTIENT.

The last of these is the inactivity leak, designed to regain finality in the event that many validators go offline. Note that a validator may be online and attesting, but it is nonetheless subject to the inactivity leak unless both its source and target votes are correct.

Proposer rewards are comparatively simple. The proposer receives an amount $\smash{\frac{1}{8}B}$ for each attestation it includes in the block inside an aggregate attestation. The fraction comes from PROPOSER_REWARD_QUOTIENT, and $B$ is calculated from the attester's effective balance for each attestation. As the total number of validators increases, block proposals become less frequent for each validator, but the reward gets proportionately larger since there are more attestations to include. The ratio of expected attestation rewards to expected proposer rewards for a validator is about 31:1 (i.e. 3.1% of rewards from block proposals).
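The 31:1 ratio can be checked with exact fractions. This sketch assumes full participation, inclusion of every attestation after one slot, and that each validator is equally likely to propose, so that on average a validator collects the proposer reward for one attestation per epoch.

```python
from fractions import Fraction

PROPOSER_REWARD_QUOTIENT = 8

# Everything is expressed as a multiple of the base reward B
proposer_share = Fraction(1, PROPOSER_REWARD_QUOTIENT)  # 1/8 B per attestation included
max_attester_share = 3 + (1 - proposer_share)           # 3B for votes plus 7/8 B for inclusion

ratio = max_attester_share / proposer_share
proposal_fraction = proposer_share / (max_attester_share + proposer_share)

print(ratio)              # attester rewards relative to proposer rewards
print(proposal_fraction)  # share of total rewards that comes from proposing
```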

Slashing aside, the above is everything there is to know about rewards and penalties 😅

Fun fact: my (erstwhile) colleague, Herman Junge, qualified for the first Eth2 bug bounty for discovering a potential arithmetic overflow in a previous version of this function.

`process_rewards_and_penalties`

def process_rewards_and_penalties(state: BeaconState) -> None:
    # No rewards are applied at the end of `GENESIS_EPOCH` because rewards are for work done in the previous epoch
    if get_current_epoch(state) == GENESIS_EPOCH:
        return

    rewards, penalties = get_attestation_deltas(state)
    for index in range(len(state.validators)):
        increase_balance(state, ValidatorIndex(index), rewards[index])
        decrease_balance(state, ValidatorIndex(index), penalties[index])

Re there being no rewards at the end of the genesis epoch, the get_*_deltas() functions all reference get_previous_epoch(), so rewards processing needs to be deferred until the end of the epoch after the genesis epoch.

Having chosen to use unsigned integers throughout (grumble), each validator's balance is visited twice: once to add rewards; once to subtract penalties. Penalties and rewards are accumulated from all their various components in get_attestation_deltas().
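A minimal sketch of why the order and the two passes matter. These versions of increase_balance() and decrease_balance() take a plain integer balance rather than the state and a validator index, which is my simplification; the saturating behaviour of the decrease is the important part.

```python
def increase_balance(balance: int, delta: int) -> int:
    """Add a reward to a balance."""
    return balance + delta


def decrease_balance(balance: int, delta: int) -> int:
    """Subtract a penalty, saturating at zero rather than underflowing."""
    return 0 if delta > balance else balance - delta


def apply_deltas(balance: int, reward: int, penalty: int) -> int:
    """Rewards are added first, then penalties are subtracted."""
    return decrease_balance(increase_balance(balance, reward), penalty)


print(apply_deltas(10, 5, 3))
print(apply_deltas(10, 5, 20))
```

With signed integers a single net delta could be applied in one pass; with unsigned integers the reward and the penalty have to be kept separate.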

Registry updates

def process_registry_updates(state: BeaconState) -> None:
    # Process activation eligibility and ejections
    for index, validator in enumerate(state.validators):
        if is_eligible_for_activation_queue(validator):
            validator.activation_eligibility_epoch = get_current_epoch(state) + 1

        if is_active_validator(validator, get_current_epoch(state)) and validator.effective_balance <= EJECTION_BALANCE:
            initiate_validator_exit(state, ValidatorIndex(index))

    # Queue validators eligible for activation and not yet dequeued for activation
    activation_queue = sorted([
        index for index, validator in enumerate(state.validators)
        if is_eligible_for_activation(state, validator)
        # Order by the sequence of activation_eligibility_epoch setting and then index
    ], key=lambda index: (state.validators[index].activation_eligibility_epoch, index))
    # Dequeued validators for activation up to churn limit
    for index in activation_queue[:get_validator_churn_limit(state)]:
        validator = state.validators[index]
        validator.activation_epoch = compute_activation_exit_epoch(get_current_epoch(state))

The list of validator records is sometimes called the Registry.

TODO

Slashings

def process_slashings(state: BeaconState) -> None:
    epoch = get_current_epoch(state)
    total_balance = get_total_active_balance(state)
    adjusted_total_slashing_balance = min(sum(state.slashings) * PROPORTIONAL_SLASHING_MULTIPLIER, total_balance)
    for index, validator in enumerate(state.validators):
        if validator.slashed and epoch + EPOCHS_PER_SLASHINGS_VECTOR // 2 == validator.withdrawable_epoch:
            increment = EFFECTIVE_BALANCE_INCREMENT  # Factored out from penalty numerator to avoid uint64 overflow
            penalty_numerator = validator.effective_balance // increment * adjusted_total_slashing_balance
            penalty = penalty_numerator // total_balance * increment
            decrease_balance(state, ValidatorIndex(index), penalty)

Slashing penalties are applied in two stages: the first stage is in slash_validator(), immediately on detection; the second stage is here.

In slash_validator() the withdrawable epoch is set EPOCHS_PER_SLASHINGS_VECTOR in the future, so in this function we are considering all slashed validators that are halfway to their withdrawable epoch. Equivalently, they were slashed EPOCHS_PER_SLASHINGS_VECTOR // 2 epochs ago (about 18 days).

To calculate the additional slashing penalty, we do the following:

Find the sum of the effective balances (at the time of the slashing) of all validators that were slashed in the previous EPOCHS_PER_SLASHINGS_VECTOR epochs (36 days). These are stored in a vector in the state.

Multiply the sum by the PROPORTIONAL_SLASHING_MULTIPLIER.

Divide the result by the total effective balance of all active validators.

This gives us the proportion of each slashed validator's effective balance to be subtracted in this stage. If it is greater than one, the validator's whole remaining effective balance is forfeit.

If only a single validator were slashed within the 36 days, then this secondary penalty is tiny. If one-third of validators were slashed (the minimum required to finalise conflicting blocks), then, with PROPORTIONAL_SLASHING_MULTIPLIER set to one, each slashed validator would lose one third of its effective balance. When PROPORTIONAL_SLASHING_MULTIPLIER is reset to three, a successful chain attack will result in the attackers losing their entire stakes.

Note that, due to the way the integer arithmetic is constructed in this routine, in particular the factoring out of increment, the result of this calculation will be zero whenever adjusted_total_slashing_balance is less than total_balance divided by the validator's effective balance measured in increments: that is, less than 1/32 of total_balance for a validator with a 32 ETH effective balance. In other words, the penalty is rounded down to the nearest whole amount of Ether. This is discussed here and here, but the consequence is that when there are few slashings there is no extra correlated slashing penalty at all, which is probably a good thing.
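The arithmetic is easy to try out in isolation. Here is a sketch of the per-validator calculation on plain integers; correlated_slashing_penalty() is a hypothetical name, and the scenario of 300,000 validators with 32 ETH each is made up for illustration.

```python
EFFECTIVE_BALANCE_INCREMENT = 10**9  # 1 ETH in Gwei


def correlated_slashing_penalty(effective_balance: int, sum_slashings: int,
                                total_balance: int, multiplier: int = 1) -> int:
    """Second-stage slashing penalty for one validator, as in process_slashings()."""
    adjusted_total_slashing_balance = min(sum_slashings * multiplier, total_balance)
    increment = EFFECTIVE_BALANCE_INCREMENT
    penalty_numerator = effective_balance // increment * adjusted_total_slashing_balance
    return penalty_numerator // total_balance * increment


ETH = 10**9
total = 300_000 * 32 * ETH

print(correlated_slashing_penalty(32 * ETH, 32 * ETH, total))       # a lone slashing
print(correlated_slashing_penalty(32 * ETH, total // 3, total))     # one third slashed
print(correlated_slashing_penalty(32 * ETH, total // 3, total, 3))  # same, multiplier of three
```

The results are always whole multiples of 1 ETH, and a lone slashing incurs no second-stage penalty at all.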

Final updates

def process_final_updates(state: BeaconState) -> None:
    current_epoch = get_current_epoch(state)
    next_epoch = Epoch(current_epoch + 1)
    # Reset eth1 data votes
    if next_epoch % EPOCHS_PER_ETH1_VOTING_PERIOD == 0:
        state.eth1_data_votes = []
    # Update effective balances with hysteresis
    for index, validator in enumerate(state.validators):
        balance = state.balances[index]
        HYSTERESIS_INCREMENT = uint64(EFFECTIVE_BALANCE_INCREMENT // HYSTERESIS_QUOTIENT)
        DOWNWARD_THRESHOLD = HYSTERESIS_INCREMENT * HYSTERESIS_DOWNWARD_MULTIPLIER
        UPWARD_THRESHOLD = HYSTERESIS_INCREMENT * HYSTERESIS_UPWARD_MULTIPLIER
        if (
            balance + DOWNWARD_THRESHOLD < validator.effective_balance
            or validator.effective_balance + UPWARD_THRESHOLD < balance
        ):
            validator.effective_balance = min(balance - balance % EFFECTIVE_BALANCE_INCREMENT, MAX_EFFECTIVE_BALANCE)
    # Reset slashings
    state.slashings[next_epoch % EPOCHS_PER_SLASHINGS_VECTOR] = Gwei(0)
    # Set randao mix
    state.randao_mixes[next_epoch % EPOCHS_PER_HISTORICAL_VECTOR] = get_randao_mix(state, current_epoch)
    # Set historical root accumulator
    if next_epoch % (SLOTS_PER_HISTORICAL_ROOT // SLOTS_PER_EPOCH) == 0:
        historical_batch = HistoricalBatch(block_roots=state.block_roots, state_roots=state.state_roots)
        state.historical_roots.append(hash_tree_root(historical_batch))
    # Rotate current/previous epoch attestations
    state.previous_epoch_attestations = state.current_epoch_attestations
    state.current_epoch_attestations = []

Mostly straightforward tidying up at the end of an epoch. But two things are worth a comment.

First, the update to effective balances. Each validator's balance is represented twice in the state: once accurately in a list separate from validator records, and once in a coarse-grained format within the validator's record. Only effective balances are used in calculations within the spec, but rewards and penalties are applied only to actual balances. This routine is where effective balances are updated once per epoch.

A hysteresis mechanism is used when calculating the effective balance of a validator when its actual balance changes. See HYSTERESIS_QUOTIENT for more discussion of this, and the values of the related constants. With the current values, a validator's effective balance drops to X ETH when its actual balance drops below X.75 ETH, and increases to Y ETH when its actual balance rises above Y.25 ETH. The hysteresis mechanism ensures that effective balances change infrequently, which means that the list of validator records needs to be re-hashed only infrequently, saving on work.
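The thresholds can be checked with a small sketch of the update rule for a single validator. The constant values below (a quotient of 4, with downward and upward multipliers of 1 and 5) are assumed; they are consistent with the X.75 and Y.25 figures above. updated_effective_balance() is a hypothetical name.

```python
GWEI_PER_ETH = 10**9
EFFECTIVE_BALANCE_INCREMENT = GWEI_PER_ETH
MAX_EFFECTIVE_BALANCE = 32 * GWEI_PER_ETH
# Assumed values; see HYSTERESIS_QUOTIENT and friends in the constants section
HYSTERESIS_QUOTIENT = 4
HYSTERESIS_DOWNWARD_MULTIPLIER = 1
HYSTERESIS_UPWARD_MULTIPLIER = 5


def updated_effective_balance(balance: int, effective_balance: int) -> int:
    """Return the validator's effective balance after the end-of-epoch update."""
    hysteresis_increment = EFFECTIVE_BALANCE_INCREMENT // HYSTERESIS_QUOTIENT
    downward_threshold = hysteresis_increment * HYSTERESIS_DOWNWARD_MULTIPLIER
    upward_threshold = hysteresis_increment * HYSTERESIS_UPWARD_MULTIPLIER
    if (balance + downward_threshold < effective_balance
            or effective_balance + upward_threshold < balance):
        return min(balance - balance % EFFECTIVE_BALANCE_INCREMENT, MAX_EFFECTIVE_BALANCE)
    return effective_balance  # inside the hysteresis band: no change


print(updated_effective_balance(31_800_000_000, 32 * GWEI_PER_ETH))  # stays at 32 ETH
print(updated_effective_balance(31_740_000_000, 32 * GWEI_PER_ETH))  # drops to 31 ETH
print(updated_effective_balance(32_260_000_000, 31 * GWEI_PER_ETH))  # rises to 32 ETH
```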

Second, the historical roots accumulator is updated. This implements part of the double batched accumulator for the past history of the chain. Once SLOTS_PER_HISTORICAL_ROOT block roots and the same number of state roots have been accumulated in the beacon state, they are put in a HistoricalBatch object and the hash tree root of that is appended to the historical_roots list in beacon state. The corresponding block and state root lists in the beacon state are circular and just get overwritten in the next period.

Storing past roots like this allows historical Merkle proofs to be constructed if required.

Block processing

def process_block(state: BeaconState, block: BeaconBlock) -> None:
    process_block_header(state, block)
    process_randao(state, block.body)
    process_eth1_data(state, block.body)
    process_operations(state, block.body)

Simply the tasks that the beacon node performs in order to process a block and update the state. If any of the functions called contains an assert failure, then the entire block is invalid, and any state changes must be rolled back.

process_operations covers the processing of any slashings (proposer and attester) in the block, any attestations, any deposits, and any voluntary exits.

Block header

def process_block_header(state: BeaconState, block: BeaconBlock) -> None:
    # Verify that the slots match
    assert block.slot == state.slot
    # Verify that the block is newer than latest block header
    assert block.slot > state.latest_block_header.slot
    # Verify that proposer index is the correct index
    assert block.proposer_index == get_beacon_proposer_index(state)
    # Verify that the parent matches
    assert block.parent_root == hash_tree_root(state.latest_block_header)
    # Cache current block as the new latest block
    state.latest_block_header = BeaconBlockHeader(
        slot=block.slot,
        proposer_index=block.proposer_index,
        parent_root=block.parent_root,
        state_root=Bytes32(),  # Overwritten in the next process_slot call
        body_root=hash_tree_root(block.body),
    )

    # Verify proposer is not slashed
    proposer = state.validators[block.proposer_index]
    assert not proposer.slashed

A straightforward list of validity conditions for the block header data.

TODO: explain state_root=Bytes32()

RANDAO

def process_randao(state: BeaconState, body: BeaconBlockBody) -> None:
    epoch = get_current_epoch(state)
    # Verify RANDAO reveal
    proposer = state.validators[get_beacon_proposer_index(state)]
    signing_root = compute_signing_root(epoch, get_domain(state, DOMAIN_RANDAO))
    assert bls.Verify(proposer.pubkey, signing_root, body.randao_reveal)
    # Mix in RANDAO reveal
    mix = xor(get_randao_mix(state, epoch), hash(body.randao_reveal))
    state.randao_mixes[epoch % EPOCHS_PER_HISTORICAL_VECTOR] = mix

A good source of randomness is foundational to the operation of the beacon chain. Security of the protocol depends significantly on unpredictably and uniformly selecting block proposers and attesting committees. The very name "beacon chain" was inspired by Dfinity's concept of a randomness beacon.

The current mechanism for providing randomness is a RANDAO, in which each block proposer provides some randomness and all the contributions are mixed together over the course of an epoch. This is not unbiasable (a malicious proposer may choose to skip a block if it is to its advantage to do so), but is good enough. In future, Ethereum might use a verifiable delay function (VDF) to provide unbiasable randomness.

Early designs had the validators pre-committing to "hash onions", peeling off one layer of hashing at each block proposal. This was changed to using a BLS signature over the epoch number as the entropy source. Using signatures is both a simplification, and an enabler for multi-party validators such as in decentralised staking pools.

The process_randao() function simply uses the proposer's public key to verify that the RANDAO reveal in the block is indeed the epoch number signed with the proposer's private key. It then mixes the hash of the reveal into the current epoch's RANDAO accumulator. The hash is used in order to reduce the signature down from 96 to 32 bytes, and to make it uniform. EPOCHS_PER_HISTORICAL_VECTOR past values of the RANDAO accumulator at the ends of epochs are stored in the state.

From Justin Drake's notes:

Using xor in process_randao is (slightly) more secure than using hash. To illustrate why, imagine an attacker can grind randomness in the current epoch such that two of his validators are the last proposers, in a different order, in two resulting samplings of the next epochs. The commutativity of xor makes those two samplings equivalent, hence reducing the attacker’s grinding opportunity for the next epoch versus hash (which is not commutative). The strict security improvement may simplify the derivation of RANDAO security formal lower bounds.
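The commutativity point can be checked with a small sketch. The reveals below are made-up byte strings rather than real signatures, and mix_hash is a hypothetical hash-chaining alternative that I have invented purely for comparison; it is not something the spec defines.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def mix_xor(mix: bytes, reveal: bytes) -> bytes:
    """The spec's approach: xor the hash of the reveal into the mix."""
    return xor(mix, h(reveal))


def mix_hash(mix: bytes, reveal: bytes) -> bytes:
    """Hypothetical alternative: chain the mix and the reveal through the hash."""
    return h(mix + reveal)


start = bytes(32)
reveal_a = b"\xaa" * 96  # placeholder reveal from validator A
reveal_b = b"\xbb" * 96  # placeholder reveal from validator B

# Apply the two reveals in both orders under each scheme.
xor_ab = mix_xor(mix_xor(start, reveal_a), reveal_b)
xor_ba = mix_xor(mix_xor(start, reveal_b), reveal_a)
hash_ab = mix_hash(mix_hash(start, reveal_a), reveal_b)
hash_ba = mix_hash(mix_hash(start, reveal_b), reveal_a)
```

With xor, both orderings give the same mix, so swapping the order of the attacker's two proposals yields no extra outcome to choose from. With hash chaining, the two orderings give different results.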

Note that the assert statement means that the whole block is invalid if the RANDAO reveal is incorrectly formed.

Eth1 data

def process_eth1_data(state: BeaconState, body: BeaconBlockBody) -> None:
    state.eth1_data_votes.append(body.eth1_data)
    if state.eth1_data_votes.count(body.eth1_data) * 2 > EPOCHS_PER_ETH1_VOTING_PERIOD * SLOTS_PER_EPOCH:
        state.eth1_data = body.eth1_data

Blocks contain Eth1Data which is supposed to be the proposer's best view of the Eth1 chain and the deposit contract at the time. There is no incentive to get this data correct, nor any penalty for it being incorrect.

If more than half of the slots in a voting period of EPOCHS_PER_ETH1_VOTING_PERIOD epochs (6.8 hours) contain the same vote from their proposers, then that Eth1 data is committed to the beacon state. This updates the chain's view of the deposit contract, and new deposits since the last update will start being processed.
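To make the threshold concrete, here is a small sketch of the majority test. The constant values are assumed mainnet values (64 epochs, 32 slots per epoch, 12 second slots), and has_majority is a hypothetical helper that operates on a plain list rather than the beacon state.

```python
EPOCHS_PER_ETH1_VOTING_PERIOD = 64  # assumed mainnet value
SLOTS_PER_EPOCH = 32                # assumed mainnet value
SECONDS_PER_SLOT = 12               # assumed mainnet value


def has_majority(votes: list, candidate) -> bool:
    """True if candidate has votes from more than half the slots in the period."""
    return votes.count(candidate) * 2 > EPOCHS_PER_ETH1_VOTING_PERIOD * SLOTS_PER_EPOCH


slots_in_period = EPOCHS_PER_ETH1_VOTING_PERIOD * SLOTS_PER_EPOCH
period_hours = slots_in_period * SECONDS_PER_SLOT / 3600

exactly_half = ["A"] * (slots_in_period // 2)
just_over_half = exactly_half + ["A"]
```

Note that the comparison is against the total number of slots in the period, not the number of votes cast so far. Empty slots therefore count against every candidate.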

This mechanism has proved to be fragile in the past, but appears to be workable if not perfect.

Operations

def process_operations(state: BeaconState, body: BeaconBlockBody) -> None:
    # Verify that outstanding deposits are processed up to the maximum number of deposits
    assert len(body.deposits) == min(MAX_DEPOSITS, state.eth1_data.deposit_count - state.eth1_deposit_index)

    def for_ops(operations: Sequence[Any], fn: Callable[[BeaconState, Any], None]) -> None:
        for operation in operations:
            fn(state, operation)

    for_ops(body.proposer_slashings, process_proposer_slashing)
    for_ops(body.attester_slashings, process_attester_slashing)
    for_ops(body.attestations, process_attestation)
    for_ops(body.deposits, process_deposit)
    for_ops(body.voluntary_exits, process_voluntary_exit)

Just a dispatcher for handling the various optional contents in a block.

Deposits are optional only in the sense that some blocks have them and some don't. However, if, according to the beacon chain's view of the Eth1 chain, there are deposits pending, then the block must include them, otherwise the block is invalid. On the face of it, this suggests that it is important for a block proposer to have access to an Eth1 node, so as to be able to obtain the deposit data. In practice, this turns out to be not so important.
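The deposit-count rule can be sketched as follows. MAX_DEPOSITS = 16 is the assumed mainnet value, and expected_deposit_count is a hypothetical helper that takes the two counters directly rather than reading them from the state.

```python
MAX_DEPOSITS = 16  # assumed mainnet value


def expected_deposit_count(deposit_count: int, eth1_deposit_index: int) -> int:
    """Number of deposits a block must contain, given the pending backlog."""
    return min(MAX_DEPOSITS, deposit_count - eth1_deposit_index)


nothing_pending = expected_deposit_count(100, 100)
small_backlog = expected_deposit_count(105, 100)
large_backlog = expected_deposit_count(200, 100)
```

A block carrying any other number of deposits than this fails the assert and is invalid.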

Proposer slashings

def process_proposer_slashing(state: BeaconState, proposer_slashing: ProposerSlashing) -> None:
    header_1 = proposer_slashing.signed_header_1.message
    header_2 = proposer_slashing.signed_header_2.message

    # Verify header slots match
    assert header_1.slot == header_2.slot
    # Verify header proposer indices match
    assert header_1.proposer_index == header_2.proposer_index
    # Verify the headers are different
    assert header_1 != header_2
    # Verify the proposer is slashable
    proposer = state.validators[header_1.proposer_index]
    assert is_slashable_validator(proposer, get_current_epoch(state))
    # Verify signatures
    for signed_header in (proposer_slashing.signed_header_1, proposer_slashing.signed_header_2):
        domain = get_domain(state, DOMAIN_BEACON_PROPOSER, compute_epoch_at_slot(signed_header.message.slot))
        signing_root = compute_signing_root(signed_header.message, domain)
        assert bls.Verify(proposer.pubkey, signing_root, signed_header.signature)

    slash_validator(state, header_1.proposer_index)

A ProposerSlashing is a proof that a proposer has signed two blocks at the same height. Up to MAX_PROPOSER_SLASHINGS of them may be included in a block. It contains the evidence in the form of a pair of SignedBeaconBlockHeaders.

The proof is simple: the two proposals come from the same slot, have the same proposer, but differ in one or more of parent_root, state_root, or body_root. In addition, they were both signed by the proposer. The conflicting blocks do not need to be valid: any pair of headers that meet the criteria, irrespective of the blocks' contents, are liable to be slashed.
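The structural part of the proof (everything except the signature checks and the slashability check) might be sketched like this. ToyHeader is a simplified stand-in for BeaconBlockHeader, and is_conflicting_pair is a hypothetical helper name.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyHeader:
    """Simplified stand-in for BeaconBlockHeader."""
    slot: int
    proposer_index: int
    parent_root: bytes
    state_root: bytes
    body_root: bytes


def is_conflicting_pair(h1: ToyHeader, h2: ToyHeader) -> bool:
    """Same slot, same proposer, but different contents. Signatures not checked."""
    return (
        h1.slot == h2.slot
        and h1.proposer_index == h2.proposer_index
        and h1 != h2
    )


a = ToyHeader(100, 7, b"p", b"s", b"body-1")
b = ToyHeader(100, 7, b"p", b"s", b"body-2")      # same slot, different body
c = ToyHeader(101, 7, b"p", b"s", b"body-2")      # different slot
```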

As ever, the assert statements ensure that the containing block is invalid if it contains any invalid slashing claims.

Fun fact: the first slashing to occur on the beacon chain was a proposer slashing. Two clients running side-by-side with the same keys will often produce the same attestations since the protocol is designed to encourage that. Independently producing the same block is very unlikely as blocks contain much more data.

Attester slashings

def process_attester_slashing(state: BeaconState, attester_slashing: AttesterSlashing) -> None:
    attestation_1 = attester_slashing.attestation_1
    attestation_2 = attester_slashing.attestation_2
    assert is_slashable_attestation_data(attestation_1.data, attestation_2.data)
    assert is_valid_indexed_attestation(state, attestation_1)
    assert is_valid_indexed_attestation(state, attestation_2)

    slashed_any = False
    indices = set(attestation_1.attesting_indices).intersection(attestation_2.attesting_indices)
    for index in sorted(indices):
        if is_slashable_validator(state.validators[index], get_current_epoch(state)):
            slash_validator(state, index)
            slashed_any = True
    assert slashed_any

AttesterSlashings are similar to proposer slashings in that they just provide the evidence of the two aggregate IndexedAttestations that conflict with each other. Up to MAX_ATTESTER_SLASHINGS of them may be included in a block.

The validity checking is done by is_slashable_attestation_data(), which checks the double vote and surround vote conditions, and by is_valid_indexed_attestation() which verifies the signatures on the attestations.

Any validators that appear in both attestations are slashed. If no validator is slashed, then the attester slashing claim was not valid after all, and therefore its containing block is invalid.
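A minimal sketch of the two slashing conditions and the intersection step, under simplifying assumptions: ToyAttData keeps only the source epoch, target epoch, and a head root, and the helper names here are hypothetical. Signature verification and the per-validator slashability check are left out.

```python
from typing import List, NamedTuple


class ToyAttData(NamedTuple):
    """Simplified stand-in for AttestationData."""
    source_epoch: int
    target_epoch: int
    head_root: str


def is_double_vote(d1: ToyAttData, d2: ToyAttData) -> bool:
    """Two different votes for the same target epoch."""
    return d1 != d2 and d1.target_epoch == d2.target_epoch


def is_surround_vote(d1: ToyAttData, d2: ToyAttData) -> bool:
    """The first vote strictly surrounds the second."""
    return d1.source_epoch < d2.source_epoch and d2.target_epoch < d1.target_epoch


def is_slashable(d1: ToyAttData, d2: ToyAttData) -> bool:
    return is_double_vote(d1, d2) or is_surround_vote(d1, d2)


def common_indices(indices_1: List[int], indices_2: List[int]) -> List[int]:
    """Validators that signed both attestations, in ascending order."""
    return sorted(set(indices_1) & set(indices_2))


double = is_slashable(ToyAttData(1, 2, "a"), ToyAttData(1, 2, "b"))
surround = is_slashable(ToyAttData(1, 5, "a"), ToyAttData(2, 4, "b"))
reversed_surround = is_slashable(ToyAttData(2, 4, "b"), ToyAttData(1, 5, "a"))
honest = is_slashable(ToyAttData(1, 2, "a"), ToyAttData(2, 3, "b"))
```

In this sketch the surround check is one-directional: the surrounding attestation must be the first of the pair, so whoever assembles the slashing needs to order the two attestations accordingly.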

Examples: a double vote attester slashing; surround vote attester slashings.

Attestations

def process_attestation(state: BeaconState, attestation: Attestation) -> None:
    data = attestation.data
    assert data.target.epoch in (get_previous_epoch(state), get_current_epoch(state))
    assert data.target.epoch == compute_epoch_at_slot(data.slot)
    assert data.slot + MIN_ATTESTATION_INCLUSION_DELAY <= state.slot <= data.slot + SLOTS_PER_EPOCH
    assert data.index < get_committee_count_per_slot(state, data.target.epoch)

    committee = get_beacon_committee(state, data.slot, data.index)
    assert len(attestation.aggregation_bits) == len(committee)

    pending_attestation = PendingAttestation(
        data=data,
        aggregation_bits=attestation.aggregation_bits,
        inclusion_delay=state.slot - data.slot,
        proposer_index=get_beacon_proposer_index(state),
    )

    if data.target.epoch == get_current_epoch(state):
        assert data.source == state.current_justified_checkpoint
        state.current_epoch_attestations.append(pending_attestation)
    else:
        assert data.source == state.previous_justified_checkpoint
        state.previous_epoch_attestations.append(pending_attestation)

    # Verify signature
    assert is_valid_indexed_attestation(state, get_indexed_attestation(state, attestation))

Attestations (in the form of aggregate attestations) that appear in blocks have only correctness checks performed on them before being added to the state. Proper attestation processing happens at epoch boundaries.

The checks:

The target vote of the attestation must be either the previous epoch's checkpoint or the current epoch's checkpoint.

The target checkpoint and the attestation's slot must belong to the same epoch.

The attestation must be at least MIN_ATTESTATION_INCLUSION_DELAY slots old, which is one slot. So this condition rules out attestations from the current or future slots.

The attestation must be no older than SLOTS_PER_EPOCH slots, which is 32.

The attestation must come from a committee that existed when the attestation was created.

The size of the committee and the size of the aggregate must match (aggregation_bits).

The source vote of the attestation must match the justified checkpoint that is in the state that corresponds to its target vote.

The (aggregate) signature on the attestation must be valid and must correspond to the aggregated public keys of the validators that it claims to be signed by.

Having passed all the tests, the aggregate attestation is appended to a list in the state corresponding to its target vote: either this epoch or last epoch. These lists are stored for processing at the epoch transition. The inclusion_delay parameter in the PendingAttestation object is populated with the number of slots between the attestation's slot and the slot of the block containing it, which is between 1 and 32. This will be used in rewards processing to incentivise swift inclusion of attestations. The proposer of the block containing the attestation is also saved so that the proposer reward can be correctly allocated.
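The inclusion window and the resulting delay can be sketched as below. The constants are the assumed mainnet values, and inclusion_delay is a hypothetical helper that returns None where the spec would fail an assert.

```python
from typing import Optional

MIN_ATTESTATION_INCLUSION_DELAY = 1  # assumed mainnet value
SLOTS_PER_EPOCH = 32                 # assumed mainnet value


def inclusion_delay(attestation_slot: int, block_slot: int) -> Optional[int]:
    """Delay in slots if the attestation may be included, otherwise None."""
    earliest = attestation_slot + MIN_ATTESTATION_INCLUSION_DELAY
    latest = attestation_slot + SLOTS_PER_EPOCH
    if earliest <= block_slot <= latest:
        return block_slot - attestation_slot
    return None


same_slot = inclusion_delay(100, 100)   # too new
next_slot = inclusion_delay(100, 101)   # best case
last_chance = inclusion_delay(100, 132)
too_old = inclusion_delay(100, 133)
```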

Recall that failing an assertion means that the entire block is invalid, and any state changes must be rolled back. Thus, there's no significance to the signature check formally occurring after the state is updated here.

Fun fact: Barnabé Monnot did a study of the first 1000 epochs of the beacon chain and found that storing individual attestations rather than aggregate attestations would have taken almost 73 times as much space.

Deposits

def get_validator_from_deposit(state: BeaconState, deposit: Deposit) -> Validator:
    amount = deposit.data.amount
    effective_balance = min(amount - amount % EFFECTIVE_BALANCE_INCREMENT, MAX_EFFECTIVE_BALANCE)

    return Validator(
        pubkey=deposit.data.pubkey,
        withdrawal_credentials=deposit.data.withdrawal_credentials,
        activation_eligibility_epoch=FAR_FUTURE_EPOCH,
        activation_epoch=FAR_FUTURE_EPOCH,
        exit_epoch=FAR_FUTURE_EPOCH,
        withdrawable_epoch=FAR_FUTURE_EPOCH,
        effective_balance=effective_balance,
    )

Create a newly initialised validator object from a deposit. This was factored out of process_deposit() for better code reuse between phases 0 and 1.
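The effective balance calculation rounds the deposit down to a whole number of increments and then caps it. A small sketch, assuming the mainnet values of 1 ETH (10^9 Gwei) for the increment and 32 ETH for the cap; initial_effective_balance is a hypothetical helper name.

```python
EFFECTIVE_BALANCE_INCREMENT = 10**9   # Gwei, assumed mainnet value
MAX_EFFECTIVE_BALANCE = 32 * 10**9    # Gwei, assumed mainnet value


def initial_effective_balance(amount: int) -> int:
    """Round the deposit amount down to a whole increment, then apply the cap."""
    rounded_down = amount - amount % EFFECTIVE_BALANCE_INCREMENT
    return min(rounded_down, MAX_EFFECTIVE_BALANCE)


full_deposit = initial_effective_balance(32 * 10**9)
one_gwei_short = initial_effective_balance(32 * 10**9 - 1)
over_deposit = initial_effective_balance(40 * 10**9)
```

A deposit that is a single Gwei short of 32 ETH starts with an effective balance of only 31 ETH.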

def process_deposit(state: BeaconState, deposit: Deposit) -> None:
    # Verify the Merkle branch
    assert is_valid_merkle_branch(
        leaf=hash_tree_root(deposit.data),
        branch=deposit.proof,
        depth=DEPOSIT_CONTRACT_TREE_DEPTH + 1,  # Add 1 for the List length mix-in
        index=state.eth1_deposit_index,
        root=state.eth1_data.deposit_root,
    )

    # Deposits must be processed in order
    state.eth1_deposit_index += 1

    pubkey = deposit.data.pubkey
    amount = deposit.data.amount
    validator_pubkeys = [v.pubkey for v in state.validators]
    if pubkey not in validator_pubkeys:
        # Verify the deposit signature (proof of possession) which is not checked by the deposit contract
        deposit_message = DepositMessage(
            pubkey=deposit.data.pubkey,
            withdrawal_credentials=deposit.data.withdrawal_credentials,
            amount=deposit.data.amount,
        )
        domain = compute_domain(DOMAIN_DEPOSIT)  # Fork-agnostic domain since deposits are valid across forks
        signing_root = compute_signing_root(deposit_message, domain)
        if not bls.Verify(pubkey, signing_root, deposit.data.signature):
            return

        # Add validator and balance entries
        state.validators.append(get_validator_from_deposit(state, deposit))
        state.balances.append(amount)
    else:
        # Increase balance by deposit amount
        index = ValidatorIndex(validator_pubkeys.index(pubkey))
        increase_balance(state, index, amount)

Process a deposit from a block: if the deposit is valid, either a new validator is created or the deposit amount is added to an existing validator.

The call to is_valid_merkle_branch() ensures that it is not possible to fake a deposit. The state.eth1_data.deposit_root from the deposit contract has been agreed on by the beacon chain and includes all pending deposits visible to the beacon chain. The deposit contains a Merkle proof that it is included in that root. The state.eth1_deposit_index counter ensures that deposits are processed in order. In short, the proposer provides leaf and branch, but neither index nor root.
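To show how a Merkle branch check of this kind works, here is a toy example on a four-leaf tree. It assumes SHA-256 and made-up leaves, and it leaves out the extra level for the list length mix-in that the real deposit tree has. check_merkle_branch is my own sketch and should not be read as the spec's exact helper.

```python
import hashlib
from typing import Sequence


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def check_merkle_branch(leaf: bytes, branch: Sequence[bytes],
                        depth: int, index: int, root: bytes) -> bool:
    """Hash from the leaf up to the root, using the index bits to pick sides."""
    value = leaf
    for i in range(depth):
        if (index >> i) & 1:
            value = h(branch[i] + value)   # we are the right-hand child
        else:
            value = h(value + branch[i])   # we are the left-hand child
    return value == root


# Build a toy tree over four made-up leaves.
leaves = [h(bytes([i])) for i in range(4)]
node_01 = h(leaves[0] + leaves[1])
node_23 = h(leaves[2] + leaves[3])
toy_root = h(node_01 + node_23)

# The branch for leaf 2 is its sibling (leaf 3), then the sibling subtree.
branch_for_2 = [leaves[3], node_01]
```

The same leaf and branch fail the check at any other index. This is why taking the index from the state, rather than from the proposer, forces deposits to be processed in order.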

Deposits are signed with the private key of the depositing validator, and the corresponding public key is included in the deposit data. This constitutes a "proof of possession" of the private key, and prevents nastiness like the rogue key attack.

If the Merkle branch check fails, then the whole block is invalid. However, individual deposits can be invalid (if the signature check fails) without invalidating the block.

Note that it is not possible to change a validator's withdrawal credentials after the initial deposit: the withdrawal credentials of subsequent deposits for the same validator are ignored; only the credentials appearing on the initial deposit are stored on the beacon chain. This is an important security measure. If an attacker steals a validator's signing key, we don't want them to be able to change the withdrawal credentials in order to steal the stake for themselves.
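The top-up behaviour can be sketched with a toy registry keyed by public key. ToyValidator and apply_deposit are hypothetical names, the Merkle proof and signature checks are omitted, and strings stand in for real keys and credentials.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class ToyValidator:
    withdrawal_credentials: str
    balance: int


def apply_deposit(registry: Dict[str, ToyValidator], pubkey: str,
                  withdrawal_credentials: str, amount: int) -> None:
    """Create a validator on first deposit; otherwise only top up the balance."""
    if pubkey not in registry:
        registry[pubkey] = ToyValidator(withdrawal_credentials, amount)
    else:
        # Credentials on a top-up deposit are ignored entirely.
        registry[pubkey].balance += amount


registry: Dict[str, ToyValidator] = {}
apply_deposit(registry, "pubkey-1", "owner-credentials", 32)
apply_deposit(registry, "pubkey-1", "attacker-credentials", 1)
```

After the second deposit the balance has grown, but the stored credentials are still those from the first deposit.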

Voluntary exits

def process_voluntary_exit(state: BeaconState, signed_voluntary_exit: SignedVoluntaryExit) -> None:
    voluntary_exit = signed_voluntary_exit.message
    validator = state.validators[voluntary_exit.validator_index]
    # Verify the validator is active
    assert is_active_validator(validator, get_current_epoch(state))
    # Verify exit has not been initiated
    assert validator.exit_epoch == FAR_FUTURE_EPOCH
    # Exits must specify an epoch when they become valid; they are not valid before then
    assert get_current_epoch(state) >= voluntary_exit.epoch
    # Verify the validator has been active long enough
    assert get_current_epoch(state) >= validator.activation_epoch + SHARD_COMMITTEE_PERIOD
    # Verify signature
    domain = get_domain(state, DOMAIN_VOLUNTARY_EXIT, voluntary_exit.epoch)
    signing_root = compute_signing_root(voluntary_exit, domain)
    assert bls.Verify(validator.pubkey, signing_root, signed_voluntary_exit.signature)
    # Initiate exit
    initiate_validator_exit(state, voluntary_exit.validator_index)

A voluntary exit is submitted by a validator to indicate that it wishes to cease being an active validator. A node receives voluntary exit messages via gossip or via its API.

Most of the checks are straightforward, as per the comments in the code. Notes:

A voluntary exit is invalid if it is included in a block before its given epoch, and it makes that whole block invalid, so nodes should buffer any future-dated exits they see rather than putting them in a block straight away.

A validator must have been active for at least SHARD_COMMITTEE_PERIOD epochs (27 hours). See the notes on SHARD_COMMITTEE_PERIOD for the rationale.

Voluntary exits are signed with the validator's usual signing key. There is some discussion about changing this to also allow signing of a voluntary exit with the validator's withdrawal key.

If the voluntary exit message is valid then the validator is added to the exit queue by calling initiate_validator_exit().
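The non-signature checks can be gathered into a single predicate, sketched below. ToyValidator and can_exit are hypothetical names, SHARD_COMMITTEE_PERIOD = 256 is the assumed mainnet value, and the function returns False where the spec would fail an assert.

```python
from dataclasses import dataclass

FAR_FUTURE_EPOCH = 2**64 - 1
SHARD_COMMITTEE_PERIOD = 256  # epochs, assumed mainnet value


@dataclass
class ToyValidator:
    activation_epoch: int
    exit_epoch: int = FAR_FUTURE_EPOCH


def can_exit(validator: ToyValidator, exit_message_epoch: int, current_epoch: int) -> bool:
    """Apply the voluntary exit checks, except for signature verification."""
    is_active = validator.activation_epoch <= current_epoch < validator.exit_epoch
    return (
        is_active
        and validator.exit_epoch == FAR_FUTURE_EPOCH          # exit not yet initiated
        and current_epoch >= exit_message_epoch               # not future-dated
        and current_epoch >= validator.activation_epoch + SHARD_COMMITTEE_PERIOD
    )


veteran = ToyValidator(activation_epoch=10)
already_exiting = ToyValidator(activation_epoch=10, exit_epoch=400)
```

For example, can_exit(veteran, 300, 299) is False because the exit is future-dated, while the same message passes once the current epoch reaches 300.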

At present it is not possible for a validator to exit and re-enter, but this functionality may be introduced in future.