# Understanding the Yellow Paper's EVM Specifications

The Yellow Paper is the formal specification for Ethereum. Except where amended by the EIP process, it contains the exact description of how everything works. It is written as a mathematical paper, which includes terminology programmers may not find familiar. In this paper you learn how to read it, and by extension other related mathematical papers.

## Which Yellow Paper?

Like almost everything else in Ethereum, the Yellow Paper evolves over time. To be able to refer to a specific version, I uploaded the current version at writing. The section, page, and equation numbers I use will refer to that version. It is a good idea to have it open in a different window while reading this document.

### Why the EVM?

I am writing this a few months before The Merge. The Merge will significantly change the way blocks are handled, making that part of the current yellow paper of historical interest. On the other hand, the EVM is mostly unaffected by The Merge.

## 9 Execution model

This section (p. 12-14) includes most of the definition of the EVM.

The term *system state* includes everything you need to know about the system to run it. In a typical computer, this means the memory, content of registers, etc.

A Turing machine is a computational model. Essentially, it is a simplified version of a computer, which is proved to have the same ability to run computations that a normal computer can (everything that a computer can calculate a Turing machine can calculate and vice versa). This model makes it easier to prove various theorems about what is and what isn't comnputable.

The term Turing-complete means a computer that can run the same calculations as a Turing machine. Turing machines can get into infinite loops, and the EVM cannot because it would run out of gas, so it's only quasi-Turing-complete.

## 9.1 Basics

This section gives the basics of the EVM and how it compares with other computational models.

A stack machine is a computer that stores intermediate data not in registers, but in a stack. Stack machine is the preferred architecture for virtual machines because it is easy to implement. This ease of implementation is especially important in the case of the EVM because it means that bugs, and security vulnerabilities, are a lot less likely.

The memory is a byte array, which means every memory location is a single byte. This means that when you write a word (which is 256 bits) to memory it covers 32 (256/8) different memory locations. For example, if you execute this Yul code:

1mstore(0, 0x60A7)2

It writes zeros to locations 0-29, 0x60 to 30, and 0xA7 to 31.

The Von Neumann architecture specifies that the program to be executed and the data which it processes are stored in the same memory. This is a bad idea from the security perspective because it allows program code to be modified, so the EVM never stores the currently running code in memory, it is always in a different memory that is ROM (read only memory). There are only two cases code that will be executed in the future comes from memory, in both cases because the code needs to come from a different piece of code, so it *has* to come from memory (or storage, but that would be too expensive).

- When a contract creates another contract (using
`CREATE`

or`CREATE2`

), the code for the contract constructor comes from memory. - During the creation of
*any*contract, the constructor code runs and then returns with the code of the actual contract, also from memory.

The term exceptional execution means an exception that causes the execution of the current contract to halt.

## 9.2 Fees overview

This section explains how the gas fees are calculated. There are three costs:

### Opcode cost

The inherent cost of the specific opcode. To get this value, find the cost group of the opcode in Appendix H (p. 28, under equation (327)), and find the cost group in equation (324). This gives you a cost function, which in most cases uses parameters from Appendix G (p. 27).

For example, the opcode `CALLDATACOPY`

is a member of group *W _{copy}*. The opcode cost for that group is

*G*. Looking at Appendix G, we see that both constants are 3, which gives us

_{verylow}+G_{copy}×⌈μ_{s}[2]÷32⌉*3+3×⌈μ*.

_{s}[2]÷32⌉We still need to decipher the expression *⌈μ _{s}[2]÷32⌉*. The outmost part,

*⌈ \<value> ⌉*is the ceiling function, a function that given a value returns the smallest integer that is still not smaller than the value. For example,

*⌈2.5⌉ = ⌈3⌉ = 3*. The inner part is

*μ*. Looking at section 3 (Conventions) on p. 3,

_{s}[2]÷32*μ*is the machine state. The machine state is defined in section 9.4.1 on p. 13. According to that section, one of the machine state parameters is

*s*for the stack. Putting it all together, it seems that

*μ*is location #2 in the stack. Looking at the opcode, location #2 in the stack is the size of the data in bytes. Looking at the other opcodes in group W

_{s}[2]_{copy},

`CODECOPY`

and `RETURNDATACOPY`

, they also have a size of data in the same location. So *⌈μ*is the number of 32 byte words required to store the data being copied. Putting everything together, the inherent cost of

_{s}[2]÷32⌉`CALLDATACOPY`

is 3 gas plus 3 per word of data being copied.### Running cost

The cost of running the code we're calling.

- In the case of
`CREATE`

and`CREATE2`

, the constructor for the new contract. - In the case of
`CALL`

,`CALLCODE`

,`STATICCALL`

, or`DELEGATECALL`

, the contract we call.

### Expanding memory cost

The cost of expanding memory (if necessary).

In equation 324, this value is written as *C _{mem}(μ_{i}')-C_{mem}(μ_{i})*. Looking at section 9.4.1 again, we see that

*μ*is the number of words in memory. So

_{i}*μ*is the number of words in memory before the opcode and

_{i}*μ*is the number of words in memory after the opcode.

_{i}'The function *C _{mem}* is defined in equation 326:

*C*.

_{mem}(a) = G_{memory}× a + ⌊a^{2}÷ 512⌋*⌊x⌋*is the floor function, a function that given a value returns the largest integer that is still not larger than the value. For example,

*⌊2.5⌋ = ⌊2⌋ = 2.*When

*a < √512*,

*a*, and the result of the floor function is zero. So for the first 22 words (704 bytes), the cost rises linearly with the number of memory words required. Beyond that point

^{2}< 512*⌊a*is positive. When the memory required is high enough the gas cost is proportional to the square of the amount of memory.

^{2}÷ 512⌋## 9.3 Execution environment

The execution environment is a tuple, *I*, that includes information that isn't part of the blockchain state or the EVM.

Parameter | Opcode to access the data | Solidity code to access the data |
---|---|---|

I_{a} | `ADDRESS` | `address(this)` |

I_{o} | `ORIGIN` | `tx.origin` |

I_{p} | `GASPRICE` | `tx.gasprice` |

I_{d} | `CALLDATALOAD` , etc. | `msg.data` |

I_{s} | `CALLER` | `msg.sender` |

I_{v} | `CALLVALUE` | `msg.value` |

I_{b} | `CODECOPY` | `address(this).code` |

I_{H} | Block header fields, such as `NUMBER` and `DIFFICULTY` | `block.number` , `block.difficulty` , etc. |

I_{e} | Depth of the call stack for calls between contracts (including contract creation) | |

I_{w} | Is the EVM allowed to change state, or is it running statically |

A few other parameters are necessary to understand the rest of section 9:

Parameter | Defined in section | Meaning |
---|---|---|

σ | 2 (p. 2, equation 1) | The state of the blockchain |

g | 9.3 (p. 13) | Remaining gas |

A | 6.1 (p. 8) | Accrued substate (changes scheduled for when the transaction ends) |

o | 9.3 (p. 13) | Output - the returned result in the case of internal transaction (when one contract calls another) and calls to view functions (when you are just asking for information, so there is no need to wait for a transaction) |

## 9.4 Execution overview

Now that have all the preliminaries, we can finally start working on how the EVM works.

Equations 137-142 give us the initial conditions for running the EVM:

Symbol | Initial value | Meaning |
---|---|---|

μ_{g} | g | Gas remaining |

μ_{pc} | 0 | Program counter, the address of the next instruction to execute |

μ_{m} | (0, 0, ...) | Memory, initialized to all zeros |

μ_{i} | 0 | Highest memory location used |

μ_{s} | () | The stack, initially empty |

μ_{o} | ∅ | The output, empty set until and unless we stop either with return data (`RETURN` or `REVERT` ) or without it (`STOP` or `SELFDESTRUCT` ). |

Equation 143 tells us there are four possible conditions at each point in time during execution, and what to do with them:

- If
*Z(σ,μ,A,I)*, it means that we have encountered an abnormal condition. In that case, the new state is identical to the old one (except gas gets burned) - If the opcode is
`REVERT`

, the new state is the same as the old state, some gas is lost, and we have output to return. - If there is any output, (meaning we are at a
`RETURN`

), the state is the new state and returns the output. - If we aren't at one of the end conditions, continue running.

## 9.4.1 Machine State

This section explains the machine state in greater detail. It specifies that *w* is the current opcode. If *μ _{pc}* is less than

*||I*, the length of the code, then that byte (

_{b}||*I*) is the opcode. Otherwise, the opcode is defined as

_{b}[μ_{pc}]`STOP`

.As this is a stack machine, we need to keep track of the number of items popped out (*δ*) and pushed in (*α*) by each opcode.

## 9.4.2 Exceptional Halting

This section defines the *Z* function, which specifies when we have an abnormal termination. This is a Boolean function, so it uses *∨* for a logical or and *∧* for a logical and.

We have an exceptional halt if any of these conditions is true:

As we saw in section 9.2,*μ*_{g}< C(σ,μ,A,I)*C*is the function that specifies the gas cost. There isn't enough gas left to cover the next opcode.If the number of items popped for an opcode is undefined, then the opcode itself is undefined.*δ*_{w}=∅Stack underflow, not enough items in the stack for the current opcode.*|| μ*_{s}|| < δ_{w}The opcode is*w = JUMP ∧ μ*_{s}[0]∉D(I_{b})`JUMP`

and the address is not a`JUMPDEST`

. Jumps are*only*valid when the destination is a`JUMPDEST`

.The opcode is*w = JUMPI ∧ μ*_{s}[1]≠0 ∧ μ_{s}[0] ∉ D(I_{b})`JUMPI`

, the condition is true (non zero) so the jump should happen, and the address is not a`JUMPDEST`

. Jumps are*only*valid when the destination is a`JUMPDEST`

.The opcode is*w = RETURNDATACOPY ∧ μ*_{s}[1]+μ_{s}[2]>|| μ_{o}||`RETURNDATACOPY`

. In this opcode stack element*μ*is the offset to read from in the return data buffer, and stack element_{s}[1]*μ*is the length of data. This condition occurs when you try to read beyond the end of the return data buffer. Note that there isn't a similar condition for the calldata or for the code itself. When you try to read beyond the end of those buffers you just get zeros._{s}[2]*|| μ*_{s}|| - δ_{w}+ α_{w}> 1024Stack overflow. If running the opcode will result in a stack of over 1024 items, abort.

Are we running statically (¬ is negation and*¬I*_{w}∧ W(w,μ)*I*is true when we are allowed to change the blockchain state)? If so, and we're trying a state changing operation, it can't happen._{w}The function

*W(w,μ)*is defined later in equation 150.*W(w,μ)*is true if one of these conditions is true:These opcodes change the state, either by creating a new contract, storing a value, or destroying the current contract.*w ∈ {CREATE, CREATE2, SSTORE, SELFDESTRUCT}*If we are called statically we cannot emit log entries. The log opcodes are all in the range between*LOG0≤w ∧ w≤LOG4*`LOG0`

(A0) and`LOG4`

(A4). The number after the log opcode specifies how many topics the log entry contains.You can call another contract when you're static, but if you do you cannot transfer ETH to it.*w=CALL ∧ μ*_{s}[2]≠0

You cannot run*w = SSTORE ∧ μ*_{g}≤ G_{callstipend}`SSTORE`

unless you have more than G_{callstipend}(defined as 2300 in Appendix G) gas.

## 9.4.3 Jump Destination Validity

Here we formally define what are the `JUMPDEST`

opcodes. We cannot just look for byte value 0x5B, because it might be inside a PUSH (and therefore data and not an opcode).

In equation (153) we define a function, *N(i,w)*. The first parameter, *i*, is the opcode's location. The second, *w*, is the opcode itself. If *w∈[PUSH1, PUSH32]* that means the opcode is a PUSH (square brackets define a range that includes the endpoints). If that case the next opcode is at *i+2+(w−PUSH1)*. For `PUSH1`

we need to advance by two bytes (the PUSH itself and the one byte value), for `PUSH2`

we need to advance by three bytes because it's a two byte value, etc. All other EVM opcodes are just one byte long, so in all other cases *N(i,w)=i+1*.

This function is used in equation (152) to define *D _{J}(c,i)*, which is the set of all valid jump destinations in code

*c*, starting with opcode location

*i*. This function is defined recursively. If

*i≥||c||*, that means that we're at or after the end of the code. We are not going to find any more jump destinations, so just return the empty set.

In all other cases we look at the rest of the code by going to the next opcode and getting the set starting from it. *c[i]* is the current opcode, so *N(i,c[i])* is the location of the next opcode. *D _{J}(c,N(i,c[i]))* is therefore the set of valid jump destinations that starts at the next opcode. If the current opcode isn't a

`JUMPDEST`

, just return that set. If it is `JUMPDEST`

, include it in the result set and return that.## 9.4.4 Normal halting

The halting function *H*, can return three types of values.

- If we aren't in a halt opcode, return
*∅*, the empty set. By convention, this value is interpreted as Boolean false. - If we have a halt opcode that doesn't produce output (either
`STOP`

or`SELFDESTRUCT`

), return a sequence of size zero bytes as the return value. Note that this is very different from the empty set. This value means that the EVM really did halt, just there's no return data to read. - If we have a halt opcode that does produce output (either
`RETURN`

or`REVERT`

), return the sequence of bytes specified by that opcode. This sequence is taken from memory, the value at the top of the stack (*μ*) is the first byte, and the value after it (_{s}[0]*μ*) is the length._{s}[1]

## H.2 Instruction set

Before we go to the final subsection of the EVM, 9.5, let's look at the instructions themselves. They are defined in Appendix H.2 which starts on p. 29. Anything that is not specified as changing with that specific opcode is expected to stay the same. Variables that do change are specified with as \<something>′.

For example, let's look at the `ADD`

opcode.

Value | Mnemonic | δ | α | Description |
---|---|---|---|---|

0x01 | ADD | 2 | 1 | Addition operation. |

μ′_{s}[0] ≡ μ_{s}[0] + μ_{s}[1] |

*δ* is the number of values we pop from the stack. In this case two, because we are adding the top two values.

*α* is the number of values we push back. In this case one, the sum.

So the new stack top (*μ′ _{s}[0]*) is the sum of the old stack top (

*μ*) and the old value below it (

_{s}[0]*μ*).

_{s}[1]Instead of going over all the opcodes with an "eyes glaze over list", This article explains only those opcodes that introduce something new.

Value | Mnemonic | δ | α | Description |
---|---|---|---|---|

0x20 | KECCAK256 | 2 | 1 | Compute Keccak-256 hash. |

μ′_{s}[0] ≡ KEC(μ_{m}[μ_{s}[0] . . . (μ_{s}[0] + μ_{s}[1] − 1)]) | ||||

μ′_{i} ≡ M(μ_{i},μ_{s}[0],μ_{s}[1]) |

This is the first opcode that accesses memory (in this case, read only). However, it might expand beyond the current limits of the memory, so we need to update *μ _{i}.* We do this using the

*M*function defined in equation 328 on p. 29.

Value | Mnemonic | δ | α | Description |
---|---|---|---|---|

0x31 | BALANCE | 1 | 1 | Get balance of the given account. |

... |

The address whose balance we need to find is *μ _{s}[0] mod 2^{160}*. The top of the stack is the address, but because addresses are only 160 bits, we calculate the value modulo 2

^{160}.

If *σ[μ _{s}[0] mod 2^{160}] ≠ ∅*, it means that there is information about this address. In that case,

*σ[μ*is the balance for that address. If

_{s}[0] mod 2^{160}]_{b}*σ[μ*, it means that this address is uninitialized and the balance is zero. You can see the list of account information fields in section 4.1 on p. 4.

_{s}[0] mod 2^{160}] = ∅The second equation, *A' _{a} ≡ A_{a} ∪ {μ_{s}[0] mod 2^{160}}*, is related to the difference in cost between access to warm storage (storage that has recently been accessed and is likely to be cached) and cold storage (storage that hasn't been accessed and is likely to be in slower storage that is more expensive to retrieve).

*A*is the list of addresses previously accessed by the transaction, which should therefore be cheaper to access, as defined in section 6.1 on p. 8. You can read more about this subject in EIP-2929.

_{a}Value | Mnemonic | δ | α | Description |
---|---|---|---|---|

0x8F | DUP16 | 16 | 17 | Duplicate 16th stack item. |

μ′_{s}[0] ≡ μ_{s}[15] |

Note that to use any stack item, we need to pop it, which means we also need to pop all the stack items on top of it. In the case of `DUP<n>`

and `SWAP<n>`

, this means having to pop and then push up to sixteen values.

## 9.5 The execution cycle

Now that we have all the parts, we can finally understand how the execution cycle of the EVM is documented.

Equation (155) says that given the state:

*σ*(global blockchain state)*μ*(EVM state)*A*(substate, changes to happen when the transaction ends)*I*(execution environment)

The new state is *(σ', μ', A', I')*.

Equations (156)-(158) define the stack and the change in it due to an opcode (*μ _{s}*). Equation (159) is the change in gas (

*μ*). Equation (160) is the change in the program counter (

_{g}*μ*). Finally, equations (161)-(164) specify that the other parameters stay the same, unless explicitly changed by the opcode.

_{pc}With this the EVM is fully defined.

## Conclusion

Mathematical notation is precise and has allowed the Yellow Paper to specify every detail of Ethereum. However, it does have some drawbacks:

- It can only be understood by humans, which means that compliance tests must be written manually.
- Programmers understand computer code. They may or may not understand mathematical notation.

Maybe for these reasons, the newer consensus layer specs are written in Python. There are execution layer specs in Python, but they are not complete. Until and unless the entire Yellow Paper is also translated to Python or a similar language, the Yellow Paper will continue in service, and it is helpful to be able to read it.