Reading the REVM Source: Interpreter

liumuhui, 3 months ago (03-20), #Blockchain

Tags: EVM, Revm

Interpreter

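If REVM is a computer, the Interpreter is its CPU.
It interprets and executes the bytecode to accomplish what the transaction sets out to do.
Everything covered so far was preparation at most, supplying what the Interpreter needs in order to run.
Only once you understand it have you really gotten started with REVM.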

An Interpreter lives inside each Frame; every new Frame creates a new Interpreter.
The transaction as a whole does not share a single Interpreter.

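Let's start with the types in the struct definition. Some were covered earlier and are only summarized here.
The ones not yet covered are expanded below.
Once the role of each type is clear, the execution logic that follows is just operations on these types.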

GasParams Already covered in the Frame article; it stores the static and dynamic gas costs of each Opcode.
Bytecode
Gas
Stack
ReturnData
Memory A shared memory buffer; each Frame is assigned a range of it for temporary data.
Input
RuntimeFlag
Extend

// crates/interpreter/src/interpreter.rs
#[derive(Debug, Clone)]
#[cfg_attr(feature = "serde", derive(serde::Serialize, serde::Deserialize))]
pub struct Interpreter<WIRE: InterpreterTypes = EthInterpreter> {
    /// Gas table for dynamic gas constants.
    pub gas_params: GasParams,
    /// Bytecode being executed.
    pub bytecode: WIRE::Bytecode,
    /// Gas tracking for execution costs.
    pub gas: Gas,
    /// EVM stack for computation.
    pub stack: WIRE::Stack,
    /// Buffer for return data from calls.
    pub return_data: WIRE::ReturnData,
    /// EVM memory for data storage.
    pub memory: WIRE::Memory,
    /// Input data for current execution context.
    pub input: WIRE::Input,
    /// Runtime flags controlling execution behavior.
    pub runtime_flag: WIRE::RuntimeFlag,
    /// Extended functionality and customizations.
    pub extend: WIRE::Extend,
}

ExtBytecode

The field type above is Bytecode, but what is actually used is ExtBytecode.
ExtBytecode is a wrapper around Bytecode that takes over part of the Interpreter's work.

Bytecode

First, the Bytecode part.
Bytecode is not a struct but an enum with two variants:

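Eip7702Bytecode The EIP-7702 type, used by accounts that delegate their code (an EOA pointing at contract code).
- delegated_address The delegation target, i.e. the address whose code runs when this account is called.
- version The version; currently only version 0 exists.
- raw The 23-byte bytecode: 0xef01 (fixed magic) + 00 (version) + address (the 20-byte delegated address).
LegacyAnalyzedBytecode Legacy bytecode (the contract bytecode in use today).
- bytecode The bytecode.
- original_len The original length (STOP (0x00) bytes are appended to bytecode so that pc can never point outside it).
- jump_table JUMP may only target a JUMPDEST instruction, so all valid jump targets are scanned and stored ahead of time.
  - jump_table does not store a list of JUMPDEST positions.
  - It is a bitvec<u8> with one bit per byte of bytecode, all initially 0, with the bit at each JUMPDEST position set to 1. Details follow below.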

// crates/bytecode/src/bytecode.rs
pub enum Bytecode {
    /// EIP-7702 delegated bytecode
    Eip7702(Arc<Eip7702Bytecode>),
    /// The bytecode has been analyzed for valid jump destinations.
    LegacyAnalyzed(Arc<LegacyAnalyzedBytecode>),
}
// crates/bytecode/src/eip7702.rs
pub struct Eip7702Bytecode {
    pub delegated_address: Address,
    pub version: u8,
    pub raw: Bytes,
}
// crates/bytecode/src/legacy/analyzed.rs
pub struct LegacyAnalyzedBytecode {
    bytecode: Bytes,
    original_len: usize,
    jump_table: JumpTable,
}

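The implementation of Eip7702Bytecode is simple, mostly plain getters.
new builds the raw layout described above from an address; new_raw validates and parses an existing raw value.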

impl Eip7702Bytecode {
    pub fn new_raw(raw: Bytes) -> Result<Self, Eip7702DecodeError> {
        if raw.len() != 23 {
            return Err(Eip7702DecodeError::InvalidLength);
        }
        if !raw.starts_with(&EIP7702_MAGIC_BYTES) {
            return Err(Eip7702DecodeError::InvalidMagic);
        }

        // Only supported version is version 0.
        if raw[2] != EIP7702_VERSION {
            return Err(Eip7702DecodeError::UnsupportedVersion);
        }

        Ok(Self {
            delegated_address: Address::new(raw[3..].try_into().unwrap()),
            version: raw[2],
            raw,
        })
    }
    pub fn new(address: Address) -> Self {
        let mut raw = EIP7702_MAGIC_BYTES.to_vec();
        raw.push(EIP7702_VERSION);
        raw.extend(&address);
        Self {
            delegated_address: address,
            version: EIP7702_VERSION,
            raw: raw.into(),
        }
    }
    pub fn raw(&self) -> &Bytes {
        &self.raw
    }
    pub fn address(&self) -> Address {
        self.delegated_address
    }
    pub fn version(&self) -> u8 {
        self.version
    }
}

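Now the implementation of LegacyAnalyzedBytecode.

Only analyze_legacy needs attention; everything else is a getter.
Jump to the actual implementation in super::analysis::analyze_legacy.
This function builds the jump_table mentioned earlier, which is a bitvec<u8>.

Note that this is a bitvec, not a vec. This is a trap, and I got it wrong at first.
What it stores is a bitmap backed by u8 values, so one u8 holds 8 booleans.
All operations on jump_table are therefore bit operations; keep that in mind. The JUMPDEST flag for pc 8 lives in the second byte (index 1), not in the byte at index 8.
Bits inside a byte are also numbered from the least significant end (Lsb0): pc 8 is the lowest (rightmost) bit of the second byte, not its highest (leftmost) bit.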
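
To make the bit layout concrete, here is a minimal sketch of an Lsb0 bitmap written with plain std types. JumpBitmap and its methods are hypothetical names chosen for illustration, not the bitvec or REVM API; the sketch only assumes the ordering described above, where bit pc % 8 of byte pc / 8 carries the flag.

```rust
/// Minimal bitmap with least-significant-bit-first ordering inside each byte.
/// Hypothetical illustration type; not the `bitvec` or REVM API.
pub struct JumpBitmap {
    bytes: Vec<u8>,
}

impl JumpBitmap {
    /// Creates a bitmap able to hold `len` bits, all cleared.
    pub fn new(len: usize) -> Self {
        Self { bytes: vec![0u8; len.div_ceil(8)] }
    }

    /// Marks `pc` as a valid jump destination.
    pub fn set(&mut self, pc: usize) {
        // Byte index is pc / 8, bit index inside that byte is pc % 8 (Lsb0).
        self.bytes[pc / 8] |= 1u8 << (pc % 8);
    }

    /// Returns true if `pc` was marked; positions past the end are never valid.
    pub fn is_valid(&self, pc: usize) -> bool {
        match self.bytes.get(pc / 8) {
            Some(byte) => (*byte >> (pc % 8)) & 1 == 1,
            None => false,
        }
    }

    /// Exposes the raw bytes so the layout can be inspected.
    pub fn as_bytes(&self) -> &[u8] {
        &self.bytes
    }
}
```

Marking pc 8 in a 16-bit map leaves the first byte at 0 and sets the second byte to 0b0000_0001, which is the "lowest bit of the second byte" case from the text.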

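let range = bytecode.as_ptr_range(); returns the start pointer and the end pointer of bytecode.
iterator is the current pointer position.
The while loop dereferences iterator to read the Opcode at that position.
- If the Opcode is JUMPDEST, the corresponding position in jumps is set to true.
  - set_unchecked sets the given position to true.
  - iterator.offset_from_unsigned(start) computes iterator - start.
- Otherwise, check whether the Opcode is PUSH1-PUSH32 (0x60 - 0x7f).
  - If it is, iterator advances by push_offset + 2.
After the loop, 0x00 (STOP) is appended to bytecode so that PC cannot run out of bounds during execution.
- let padding = (iterator as usize) - (end as usize) + (opcode != opcode::STOP) as usize
  - This computes how many 0x00 bytes to append. Why it is not always exactly one:
    - Without PUSH instructions iterator simply increments, so it ends exactly at end.
    - If the last Opcode is a PUSH (for example PUSH32) and the bytecode is truncated so that its data is incomplete, iterator jumps past end.
    - The formula handles this case: the iterator - end missing data bytes are filled with 0x00, plus one more STOP unless the last Opcode already is STOP.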

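Note: opcode.wrapping_sub(opcode::PUSH1) works because PUSH1-PUSH32 are consecutive Opcodes. If subtracting PUSH1 yields a value in 0-31, the Opcode is one of PUSH1-PUSH32; any smaller Opcode wraps around to a large value.
PUSH instructions are special: their data is encoded directly after the instruction. The x bytes following PUSHx are its data.
So for a PUSH instruction the next iterator is iterator + (push_offset + 1) + 1.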
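
The walk described above can also be sketched with indices instead of raw pointers. This is a simplified reading aid under stated assumptions, not REVM's code: analyze_sketch is a hypothetical name, the jump table is a Vec<bool> rather than a bitmap, and only the opcode values 0x00, 0x5b and 0x60 are taken from the text.

```rust
const STOP: u8 = 0x00;
const JUMPDEST: u8 = 0x5b;
const PUSH1: u8 = 0x60;

/// Index-based sketch of the jump-destination analysis (hypothetical helper).
/// Returns one flag per original byte plus the padded bytecode.
pub fn analyze_sketch(code: &[u8]) -> (Vec<bool>, Vec<u8>) {
    if code.is_empty() {
        return (Vec::new(), vec![STOP]);
    }
    let mut jumps = vec![false; code.len()];
    let mut pc = 0usize;
    let mut last_opcode = STOP;
    while pc < code.len() {
        last_opcode = code[pc];
        if last_opcode == JUMPDEST {
            jumps[pc] = true;
            pc += 1;
        } else {
            // Values 0..=31 mean PUSH1..=PUSH32; smaller opcodes wrap to >= 160.
            let push_offset = last_opcode.wrapping_sub(PUSH1);
            if push_offset < 32 {
                // Skip the opcode byte plus its (push_offset + 1) data bytes.
                pc += push_offset as usize + 2;
            } else {
                pc += 1;
            }
        }
    }
    // pc - code.len() is the number of data bytes missing from a truncated PUSH;
    // one extra STOP is appended unless the last opcode already is STOP.
    let padding = (pc - code.len()) + usize::from(last_opcode != STOP);
    let mut padded = code.to_vec();
    padded.resize(code.len() + padding, STOP);
    (jumps, padded)
}
```

For the single byte 0x60 (a PUSH1 with its data cut off) the result is three bytes long: the missing data byte is filled with 0x00 and a STOP follows. A 0x5b that sits inside PUSH data is skipped and never marked as a jump target.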

// crates/bytecode/src/legacy/analyzed.rs
impl LegacyAnalyzedBytecode {
    pub fn analyze(bytecode: Bytes) -> Self {
        let original_len = bytecode.len();
        let (jump_table, padded_bytecode) = super::analysis::analyze_legacy(bytecode);
        Self::new(padded_bytecode, original_len, jump_table)
    }
}

// crates/bytecode/src/legacy/analysis.rs
pub fn analyze_legacy(bytecode: Bytes) -> (JumpTable, Bytes) {
    if bytecode.is_empty() {
        return (JumpTable::default(), Bytes::from_static(&[opcode::STOP]));
    }

    let mut jumps: BitVec<u8> = bitvec![u8, Lsb0; 0; bytecode.len()];
    let range = bytecode.as_ptr_range();
    let start = range.start;
    let mut iterator = start;
    let end = range.end;
    let mut opcode = 0;

    while iterator < end {
        opcode = unsafe { *iterator };
        if opcode == opcode::JUMPDEST {
            // SAFETY: Jumps are max length of the code
            unsafe { jumps.set_unchecked(iterator.offset_from_unsigned(start), true) }
            iterator = unsafe { iterator.add(1) };
        } else {
            let push_offset = opcode.wrapping_sub(opcode::PUSH1);
            if push_offset < 32 {
                // SAFETY: Iterator access range is checked in the while loop
                iterator = unsafe { iterator.add(push_offset as usize + 2) };
            } else {
                // SAFETY: Iterator access range is checked in the while loop
                iterator = unsafe { iterator.add(1) };
            }
        }
    }

    let padding = (iterator as usize) - (end as usize) + (opcode != opcode::STOP) as usize;
    let bytecode = if padding > 0 {
        let mut padded = Vec::with_capacity(bytecode.len() + padding);
        padded.extend_from_slice(&bytecode);
        padded.resize(padded.len() + padding, 0);
        Bytes::from(padded)
    } else {
        bytecode
    };

    (JumpTable::new(jumps), bytecode)
}

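That covers the implementations of Eip7702Bytecode and LegacyAnalyzedBytecode.
Next, the implementation of Bytecode itself.
The functions are self-explanatory, so only the list of functions and what they do is given here.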

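new() - creates a new default Bytecode: analyzed legacy bytecode containing a single STOP opcode
legacy_jump_table() - returns the jump table if this is analyzed legacy bytecode, otherwise None
hash_slow() - computes the Keccak-256 hash of the bytecode (returns KECCAK_EMPTY if it is empty)
is_eip7702() - whether the bytecode is of the EIP-7702 type
new_legacy(raw) - creates new legacy bytecode from raw bytes (analyzed automatically)
new_raw(bytecode) - creates bytecode from raw bytes, detecting the type (EIP-7702 or legacy); panics on a malformed format
new_raw_checked(bytes) - creates bytecode from raw bytes, detecting the type; returns Err on a malformed format
new_eip7702(address) - creates new EIP-7702 delegation bytecode from an address
new_analyzed(bytecode, original_len, jump_table) - directly creates analyzed legacy bytecode (the jump table must be supplied)
bytecode() - returns a reference to the bytecode (&Bytes)
bytecode_ptr() - returns a raw pointer to the executable bytecode
bytes() - returns a clone of the bytecode (Bytes)
bytes_ref() - returns a reference to the raw bytes (&Bytes)
bytes_slice() - returns the raw bytes as a slice (&[u8])
original_bytes() - returns the original bytecode (unpadded, cloned)
original_byte_slice() - returns the original bytecode as a slice (unpadded)
len() - returns the length of the original bytecode
is_empty() - whether the bytecode is empty
iter_opcodes() - returns an iterator over the opcodes in the bytecode (skipping immediates)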

ExtBytecode

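As usual, the struct first:

instruction_pointer The instruction pointer, similar to iterator in analyze_legacy above.
continue_execution Whether the current Frame should keep executing (false once it has finished).
bytecode_hash The keccak256 hash of the current bytecode.
action The result handed back: whether the current Frame finishes or a new Frame must be created.
base The bytecode being executed.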

// crates/interpreter/src/interpreter/ext_bytecode.rs
pub struct ExtBytecode {
    instruction_pointer: *const u8,
    continue_execution: bool,
    bytecode_hash: Option<B256>,
    pub action: Option<InterpreterAction>,
    base: Bytecode,
}

Key implementation

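pc The program counter: the position in the bytecode that is currently being executed.
- Computed as instruction_pointer minus the start of the bytecode.
opcode The Opcode currently being executed.
relative_jump A relative jump: moves by offset from where instruction_pointer currently points.
absolute_jump An absolute jump: moves to offset counted from the start of the bytecode.
is_valid_legacy_jump Whether a jump is legal, checked by looking the target up in the JumpTable.
read_u16 Reads 2 bytes from where instruction_pointer points.
- Only used by the EOF format, which is not enabled yet.
read_u8 Reads 1 byte from where instruction_pointer points; used to fetch the Opcode.
read_slice Reads the given number of bytes from where instruction_pointer points; used by the PUSH Opcodes.
read_offset_u16 Reads 2 bytes from the given offset. Only used by the EOF format, which is not enabled yet.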

// crates/interpreter/src/interpreter/ext_bytecode.rs
impl Jumps for ExtBytecode {
    fn relative_jump(&mut self, offset: isize) {
        self.instruction_pointer = unsafe { self.instruction_pointer.offset(offset) };
    }
    fn absolute_jump(&mut self, offset: usize) {
        self.instruction_pointer = unsafe { self.base.bytes_ref().as_ptr().add(offset) };
    }
    fn is_valid_legacy_jump(&mut self, offset: usize) -> bool {
        self.base
            .legacy_jump_table()
            .expect("Panic if not legacy")
            .is_valid(offset)
    }
    fn opcode(&self) -> u8 {
        // SAFETY: `instruction_pointer` always point to bytecode.
        unsafe { *self.instruction_pointer }
    }
    fn pc(&self) -> usize {
        unsafe {
            self.instruction_pointer
                .offset_from_unsigned(self.base.bytes_ref().as_ptr())
        }
    }
}
impl Immediates for ExtBytecode {
    fn read_u16(&self) -> u16 {
        unsafe { read_u16(self.instruction_pointer) }
    }
    fn read_u8(&self) -> u8 {
        unsafe { *self.instruction_pointer }
    }
    fn read_slice(&self, len: usize) -> &[u8] {
        unsafe { core::slice::from_raw_parts(self.instruction_pointer, len) }
    }
    fn read_offset_u16(&self, offset: isize) -> u16 {
        unsafe {
            read_u16(
                self.instruction_pointer
                    // Offset for max_index that is one byte
                    .offset(offset),
            )
        }
    }
}
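
The pointer arithmetic in these two impl blocks can be pictured with an index-based cursor. The sketch below is a stand-in built on assumptions: Cursor is a hypothetical type, and it checks bounds instead of relying on the STOP padding. It only illustrates how pc, relative_jump, absolute_jump and read_slice relate to each other.

```rust
/// Index-based stand-in for the pointer-based cursor (hypothetical type).
pub struct Cursor {
    code: Vec<u8>,
    /// Plays the role of `instruction_pointer - start_of_bytecode`.
    pc: usize,
}

impl Cursor {
    pub fn new(code: Vec<u8>) -> Self {
        Self { code, pc: 0 }
    }

    /// Current position, counted from the start of the bytecode.
    pub fn pc(&self) -> usize {
        self.pc
    }

    /// Byte under the cursor.
    pub fn opcode(&self) -> u8 {
        self.code[self.pc]
    }

    /// Moves relative to the current position; `offset` may be negative.
    pub fn relative_jump(&mut self, offset: isize) {
        self.pc = self
            .pc
            .checked_add_signed(offset)
            .expect("relative jump left the bytecode");
    }

    /// Moves to a position counted from the start of the bytecode.
    pub fn absolute_jump(&mut self, offset: usize) {
        self.pc = offset;
    }

    /// Reads `len` bytes starting at the cursor, e.g. the data of a PUSH.
    pub fn read_slice(&self, len: usize) -> &[u8] {
        &self.code[self.pc..self.pc + len]
    }
}
```

With the bytes 0x60 0x2a 0x5b 0x00, reading the opcode, stepping by 1 and calling read_slice(1) yields the PUSH1 data 0x2a; absolute_jump(3) lands on the trailing STOP regardless of where the cursor was before.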

Gas

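Tracks the gas consumed and the gas remaining while the current Frame executes.

The struct:

limit The Gas Limit available to the current Frame, not to the whole transaction.
remaining What is left of that gas limit.
refunded The refund (clearing Storage, for example, earns a refund).
memory Tracks the cost of memory expansion. Expanding memory in the EVM costs gas, and the cost is not linear: the more you expand, the more each expansion costs.

The implementation is simple, so just a summary:

new(limit) creates a new Gas instance with remaining equal to limit
new_spent(limit) creates an exhausted Gas instance with remaining = 0
limit() returns the gas limit of the current frame
memory() returns an immutable reference to MemoryGas
memory_mut() returns a mutable reference to MemoryGas
refunded() returns the accumulated refund
spent() returns the gas consumed so far (limit - remaining)
used() returns the gas actually used (spent - refunded)
spent_sub_refunded() same as used(): consumption after subtracting the refund
remaining() returns the gas still available
erase_cost(returned) adds gas back; used to return unused gas when a child call returns
spend_all() consumes all remaining gas, setting remaining = 0
record_refund(refund) records a refund value (may be positive or negative)
set_final_refund(is_london) sets the final refund, applying the EIP-3529 cap
set_refund(refund) overwrites the refund value directly
set_spent(spent) sets the amount of gas consumed directly
record_cost(cost) deducts gas; returns true on success, false if there is not enough
record_cost_unsafe(cost) unchecked deduction; performs the subtraction even when gas is insufficient and returns whether it ran out of gas (OOG)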

// crates/interpreter/src/gas.rs
pub struct Gas {
    limit: u64,
    remaining: u64,
    refunded: i64,
    memory: MemoryGas,
}
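
A simplified model may help show how these methods fit together. The sketch below is not the REVM Gas type: GasSketch is a hypothetical name, memory gas is left out, and the refund cap assumes the usual rule of spent / 5 from London on (EIP-3529) and spent / 2 before.

```rust
/// Simplified gas counter (hypothetical; memory gas tracking omitted).
#[derive(Debug, Clone, Copy)]
pub struct GasSketch {
    limit: u64,
    remaining: u64,
    refunded: i64,
}

impl GasSketch {
    pub fn new(limit: u64) -> Self {
        Self { limit, remaining: limit, refunded: 0 }
    }

    pub fn remaining(&self) -> u64 {
        self.remaining
    }

    pub fn refunded(&self) -> i64 {
        self.refunded
    }

    /// Gas consumed so far: limit - remaining.
    pub fn spent(&self) -> u64 {
        self.limit - self.remaining
    }

    /// Deducts `cost`; leaves the counter untouched and returns false if it does not fit.
    pub fn record_cost(&mut self, cost: u64) -> bool {
        match self.remaining.checked_sub(cost) {
            Some(left) => {
                self.remaining = left;
                true
            }
            None => false,
        }
    }

    /// Gives back gas that a child call did not use.
    pub fn erase_cost(&mut self, returned: u64) {
        self.remaining += returned;
    }

    /// Records a refund; the value may be negative.
    pub fn record_refund(&mut self, refund: i64) {
        self.refunded += refund;
    }

    /// Caps the refund at spent / 5 (London and later) or spent / 2 (earlier).
    pub fn set_final_refund(&mut self, is_london: bool) {
        let quotient: u64 = if is_london { 5 } else { 2 };
        let capped = (self.refunded.max(0) as u64).min(self.spent() / quotient);
        self.refunded = capped as i64;
    }
}
```

With 20,000 gas spent and a recorded refund of 15,000, the final refund is capped at 4,000 under the London rule and at 10,000 under the older rule.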

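Now the gas rules for expanding Memory.

words_num The number of words; a word here is 32 bytes, so 64 bytes corresponds to 2.
expansion_cost The memory expansion cost accumulated so far.
set_words_num Sets the word count and the expansion cost, and returns the difference between the new and the old cost.
record_new_len Records a new memory length and returns the additional gas cost, or None.
- Key formula: crate::gas::calc::memory_gas(new_num, linear_cost, quadratic_cost)
- The implementation of memory_gas: linear_cost.saturating_mul(num_words).saturating_add(num_words.saturating_mul(num_words) / quadratic_cost)
- That is, linear_cost × words + words² / quadratic_cost
- linear_cost and quadratic_cost are fixed values.
- This function is not what actually gets called; the real computation is in crates/interpreter/src/gas/params.rs, but the formula is the same.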

// crates/interpreter/src/gas.rs
pub struct MemoryGas {
    pub words_num: usize,
    pub expansion_cost: u64,
}

impl MemoryGas {
    #[inline]
    pub const fn new() -> Self {
        Self {
            words_num: 0,
            expansion_cost: 0,
        }
    }

    #[inline]
    pub fn set_words_num(&mut self, words_num: usize, mut expansion_cost: u64) -> Option<u64> {
        self.words_num = words_num;
        core::mem::swap(&mut self.expansion_cost, &mut expansion_cost);
        self.expansion_cost.checked_sub(expansion_cost)
    }

    #[inline]
    pub fn record_new_len(
        &mut self,
        new_num: usize,
        linear_cost: u64,
        quadratic_cost: u64,
    ) -> Option<u64> {
        if new_num <= self.words_num {
            return None;
        }
        self.words_num = new_num;
        let mut cost = crate::gas::calc::memory_gas(new_num, linear_cost, quadratic_cost);
        core::mem::swap(&mut self.expansion_cost, &mut cost);
        Some(self.expansion_cost - cost)
    }
}
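
To see how the quadratic term grows, the formula can be evaluated on its own. The sketch below re-implements the formula quoted above as a free function; expansion_delta is a hypothetical helper, and the constants 3 and 512 used in the note afterwards are the standard Ethereum memory-gas values, assumed here rather than read from REVM's GasParams.

```rust
/// Total cost of holding `num_words` 32-byte words of memory:
/// linear_cost * words + words^2 / quadratic_cost.
pub fn memory_gas(num_words: usize, linear_cost: u64, quadratic_cost: u64) -> u64 {
    let num_words = num_words as u64;
    linear_cost
        .saturating_mul(num_words)
        .saturating_add(num_words.saturating_mul(num_words) / quadratic_cost)
}

/// Extra gas charged when memory grows from `old_words` to `new_words`
/// (hypothetical helper). Returns None when memory does not grow.
pub fn expansion_delta(
    old_words: usize,
    new_words: usize,
    linear_cost: u64,
    quadratic_cost: u64,
) -> Option<u64> {
    if new_words <= old_words {
        return None;
    }
    let new_total = memory_gas(new_words, linear_cost, quadratic_cost);
    let old_total = memory_gas(old_words, linear_cost, quadratic_cost);
    Some(new_total - old_total)
}
```

With linear_cost = 3 and quadratic_cost = 512, 32 words (1 KiB) cost 98 gas in total, while 1024 words (32 KiB) cost 5120, of which 2048 comes from the quadratic term. Only the difference to the previously paid total is charged on each expansion.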

U256

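There are two ways to order data in memory:
Big-endian: the most significant byte is at the lowest address, the least significant byte at the highest address.
Little-endian: the least significant byte is at the lowest address, the most significant byte at the highest address.
In what follows, low addresses are on the left and high addresses on the right.

U256 is not part of the REVM source; it comes from a crate written by alloy.
Here is the U256 type.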

// primitives-1.5.0/src/aliases.rs
macro_rules! int_aliases {
    ($($unsigned:ident, $signed:ident<$BITS:literal, $LIMBS:literal>),* $(,)?) => {$(
        #[doc = concat!($BITS, "-bit [unsigned integer type][Uint], consisting of ", $LIMBS, ", 64-bit limbs.")]
        pub type $unsigned = Uint<$BITS, $LIMBS>;

        #[doc = concat!($BITS, "-bit [signed integer type][Signed], consisting of ", $LIMBS, ", 64-bit limbs.")]
        pub type $signed = Signed<$BITS, $LIMBS>;

        const _: () = assert!($LIMBS == ruint::nlimbs($BITS));
    )*};
}
int_aliases! {
    U8,   I8<  8, 1>,
    // ...
    U256, I256<256, 4>,
    // ...
    U512, I512<512, 8>,
}

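After macro expansion, the U256 entry becomes pub type U256 = Uint<256, 4>; plus a compile-time assertion that checks the limb count.

In Uint<256, 4>, 256 is the bit length and 4 is the number of limbs.
One limb is 64 bits, so 256 bits is 4 limbs. A remainder of fewer than 64 bits still counts as one limb.
limbs = CEIL(BITS / 64).
You can think of a U256 as 4 u64 values put together.

This is where it gets confusing.
Each limb is a single u64 built from 8 bytes of the big-endian input, so inside a limb the byte significance follows the big-endian reading order, most significant byte first (how the u64 itself sits in memory depends on the machine's native endianness).
When the 4 limbs are assembled into a U256, however, the limbs themselves are in little-endian order: the least significant limb comes first and the most significant limb sits at the highest address.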

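Take the value 0x0102030405060708_0910111213141516_1718192021222324_2526272829303132 as an example.
The underscores are only there for readability.

Split into 4 u64 values, that is
0x0102030405060708, 0x0910111213141516, 0x1718192021222324, 0x2526272829303132.
After splitting, each piece still reads in big-endian order, most significant digits first.

The order in which they are actually stored on the stack is
0x2526272829303132, 0x1718192021222324, 0x0910111213141516, 0x0102030405060708
that is, little-endian limb order: the least significant limb is at the lowest address.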

pub type U256 = Uint<256, 4>;
const _: () = assert!(4 == ruint::nlimbs(256));  // checks that the limb count is correct
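
The limb ordering from the example can be reproduced with plain arrays. This is only a sketch of the layout: be_word_to_limbs and limbs_to_be_word are hypothetical helpers, and [u64; 4] stands in for the limb storage of Uint<256, 4>.

```rust
/// Splits a 32-byte big-endian word into four u64 limbs, least significant limb first.
/// Hypothetical helper for illustration.
pub fn be_word_to_limbs(word: [u8; 32]) -> [u64; 4] {
    let mut limbs = [0u64; 4];
    // rchunks_exact walks the word from its right end, i.e. from the
    // least significant bytes, so limb 0 receives the last 8 bytes.
    for (limb, chunk) in limbs.iter_mut().zip(word.rchunks_exact(8)) {
        let mut buf = [0u8; 8];
        buf.copy_from_slice(chunk);
        *limb = u64::from_be_bytes(buf);
    }
    limbs
}

/// Inverse direction: rebuilds the 32-byte big-endian word from the limbs.
pub fn limbs_to_be_word(limbs: [u64; 4]) -> [u8; 32] {
    let mut word = [0u8; 32];
    for (i, limb) in limbs.iter().enumerate() {
        // Limb 0 occupies the last 8 bytes, limb 3 the first 8 bytes.
        let end = 32 - 8 * i;
        word[end - 8..end].copy_from_slice(&limb.to_be_bytes());
    }
    word
}
```

Feeding in the 32 bytes of the example value gives the limbs 0x2526272829303132, 0x1718192021222324, 0x0910111213141516, 0x0102030405060708 in that order, and the inverse function restores the original bytes.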

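With that in mind, look at the push_slice_ code below.
It first splits the input left to right into 32-byte chunks, one per U256.
It then splits each chunk right to left into u64 values and writes them out.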

let words = slice.chunks_exact(32);
// whatever is left over after the full 32-byte chunks
let partial_last_word = words.remainder();

for word in words {
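    // rchunks_exact (note the leading r) iterates from right to left, because of the little-endian limb order described above
    // i is never reset: the stack is backed by a Vec, i.e. one contiguous block of memory,
    // so writing simply continues at the next position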
    for l in word.rchunks_exact(8) {
        dst.add(i).write(u64::from_be_bytes(l.try_into().unwrap()));
        i += 1;
    }
}

Stack

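Everyone knows how a stack behaves: last in, first out.
The EVM is a stack machine, so all computation, data passing, and control flow go through the stack.

The actual execution process of the virtual machine is left for later, after the basic types have been introduced.
REVM models the stack with a Vec; every item on the stack is 256 bits (32 bytes).
As far as the EVM is concerned, stack values are big-endian: the most significant byte comes first (lowest address) and the least significant byte last. How REVM lays a U256 out internally is a separate matter, described in the U256 section.
The maximum stack depth is 1024.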

pub struct Stack {
    data: Vec<U256>,
}

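Most of the functions are simple, so only push_slice_ is discussed.
The argument is a slice of u8, i.e. a byte array.
One stack slot stores 32 bytes of data, so the function has to work out how many slots the input occupies.
The detailed logic is easier to explain as comments in the code.

Data in the EVM is big-endian: the most significant byte comes first (index 0) and the least significant byte last (index 31).
But a U256 in REVM consists of 4 u64 limbs, and those 4 limbs are in little-endian order. Keep this in mind.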

// crates/interpreter/src/interpreter/stack.rs
fn push_slice_(&mut self, slice: &[u8]) -> bool {
    if slice.is_empty() {
        return true;
    }
    // Number of stack slots (32-byte words) the input needs.
    let n_words = slice.len().div_ceil(32);
    // Stack length after the push; the stack itself still has its old length here.
    let new_len = self.data.len() + n_words;
    if new_len > STACK_LIMIT {
        return false;
    }
    debug_assert!(self.data.capacity() >= new_len);
    unsafe {
        // self.data.as_mut_ptr() points at element 0; adding the stack length gives the slot just above the current top.
        let dst = self.data.as_mut_ptr().add(self.data.len()).cast::<u64>();
        self.data.set_len(new_len);
        let mut i = 0;
        // Left to right, every 32 bytes form one chunk.
        let words = slice.chunks_exact(32);
        // Whatever is left over after the full 32-byte chunks.
        let partial_last_word = words.remainder();

        for word in words {
            // rchunks_exact (note the leading r) iterates from right to left, because of the little-endian limb order described above.
            // i is never reset: the stack is backed by a Vec, i.e. one contiguous block of memory,
            // so writing simply continues at the next position.
            for l in word.rchunks_exact(8) {
                dst.add(i).write(u64::from_be_bytes(l.try_into().unwrap()));
                i += 1;
            }
        }
        if partial_last_word.is_empty() {
            return true;
        }

        // The leftover of fewer than 32 bytes is grouped into 8-byte pieces,
        // written the same way as above.
        let limbs = partial_last_word.rchunks_exact(8);
        let partial_last_limb = limbs.remainder();
        for l in limbs {
            dst.add(i).write(u64::from_be_bytes(l.try_into().unwrap()));
            i += 1;
        }
        // Fewer than 8 bytes remain.
        if !partial_last_limb.is_empty() {
            // An array of eight zeroed u8 values.
            let mut tmp = [0u8; 8];
            // Copy the remaining bytes right-aligned into tmp (zero padding on the left),
            // then convert to u64 and write it (the high bytes stay 0, preserving the big-endian value).
            tmp[8 - partial_last_limb.len()..].copy_from_slice(partial_last_limb);
            dst.add(i).write(u64::from_be_bytes(tmp));
            i += 1;
        }
        debug_assert_eq!(i.div_ceil(4), n_words, "wrote too much");
        // A U256 is 4 u64 limbs; if the last word was not written completely, zero the rest.
        let m = i % 4; // 32 / 8
        if m != 0 {
            dst.add(i).write_bytes(0, 4 - m);
        }
    }
    true
}
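
A safe equivalent makes the alignment of short inputs easier to see. The sketch below is not the REVM implementation: slice_to_words is a hypothetical helper that returns the words instead of writing them onto a stack, and it left-pads a short trailing chunk rather than zero-filling limbs afterwards, which under the rules described above should produce the same limbs.

```rust
/// Converts a byte slice into stack words, each word being four u64 limbs
/// (least significant limb first). Hypothetical helper for illustration.
/// A trailing chunk shorter than 32 bytes is right-aligned, i.e. zero-extended on the left.
pub fn slice_to_words(slice: &[u8]) -> Vec<[u64; 4]> {
    let mut words = Vec::with_capacity(slice.len().div_ceil(32));
    // chunks(32) yields the full 32-byte chunks first, then the shorter remainder.
    for chunk in slice.chunks(32) {
        let mut padded = [0u8; 32];
        padded[32 - chunk.len()..].copy_from_slice(chunk);

        let mut limbs = [0u64; 4];
        // Right to left: the last 8 bytes become limb 0.
        for (limb, bytes) in limbs.iter_mut().zip(padded.rchunks_exact(8)) {
            let mut buf = [0u8; 8];
            buf.copy_from_slice(bytes);
            *limb = u64::from_be_bytes(buf);
        }
        words.push(limbs);
    }
    words
}
```

A single byte 0x01, as pushed by PUSH1, becomes the word with limbs [1, 0, 0, 0]. A 33-byte input becomes two words: the first 32 bytes fill one word, and the last byte is zero-extended into the second.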

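What the other functions do:

len() returns the number of items currently on the stack
is_empty() checks whether the stack is empty
data() returns an immutable reference to the underlying data
data_mut() returns a mutable reference to the underlying data
into_data() consumes the stack and returns the underlying Vec
peek(n) looks at the n-th item from the top without popping it
pop() pops the top item; returns a StackUnderflow error if the stack is empty
pop_unsafe() pops the top item without checks (unsafe)
top_unsafe() returns a mutable reference to the top item (unsafe)
popn::<N>() pops the top N items and returns them as an array (unsafe)
popn_top::<POPN>() pops POPN items and also returns a mutable reference to the new top (unsafe)
push(value) pushes a U256 value; returns false if the stack is full
push_slice(slice) pushes a byte slice split into 32-byte words; returns a Result
push_slice_(slice) internal method; pushes a byte slice and returns a bool
dup(n) copies the n-th item from the top and pushes it (DUP1~DUP16)
swap(n) swaps the top item with the item n positions below it (SWAP1~SWAP16)
exchange(n, m) swaps the items at positions n and n+m (the EXCHANGE instruction)
set(n, val) sets the value at the n-th position from the top
clear() removes all items from the stack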
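
The indexing behind dup and swap is easy to get off by one, so here is a toy model. ToyStack is hypothetical and stores u64 values instead of U256; the index conventions follow the EVM definitions of DUPn (n = 1 duplicates the top) and SWAPn (swap the top with the item n positions below it), and may not match the exact signatures in REVM.

```rust
pub const STACK_LIMIT: usize = 1024;

/// Toy stack over u64 values (hypothetical; the real stack holds U256 words).
pub struct ToyStack {
    data: Vec<u64>,
}

impl ToyStack {
    pub fn new() -> Self {
        Self { data: Vec::new() }
    }

    /// Pushes a value; returns false when the stack is already full.
    pub fn push(&mut self, value: u64) -> bool {
        if self.data.len() == STACK_LIMIT {
            return false;
        }
        self.data.push(value);
        true
    }

    pub fn pop(&mut self) -> Option<u64> {
        self.data.pop()
    }

    /// DUPn: copies the n-th item from the top (n = 1 is the top itself).
    pub fn dup(&mut self, n: usize) -> bool {
        let len = self.data.len();
        if n == 0 || n > len || len == STACK_LIMIT {
            return false;
        }
        let value = self.data[len - n];
        self.data.push(value);
        true
    }

    /// SWAPn: exchanges the top with the item n positions below it.
    pub fn swap(&mut self, n: usize) -> bool {
        let len = self.data.len();
        if n == 0 || n >= len {
            return false;
        }
        self.data.swap(len - 1, len - 1 - n);
        true
    }

    /// Bottom-to-top view of the stack contents.
    pub fn data(&self) -> &[u64] {
        &self.data
    }
}
```

Starting from 1, 2, 3 (3 on top), dup(1) followed by dup(4) leaves 1, 2, 3, 3, 1; swap(1) and then swap(4) turn that into 3, 2, 3, 1, 1.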

ReturnData

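The buffer holding the data returned by the most recent CALL / RETURN. It is read by RETURNDATACOPY / RETURNDATALOAD, and is also used for the data the current frame outputs via RETURN.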

pub struct ReturnDataImpl(pub Bytes);
impl ReturnData for ReturnDataImpl {
    fn buffer(&self) -> &Bytes {
        &self.0
    }

    fn set_buffer(&mut self, bytes: Bytes) {
        self.0 = bytes;
    }
}

InputsImpl

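The input context of the current call frame.

target_address The target address, meaning the address whose state is actually modified. If contract A CALLs contract B, B's state is modified; if contract A DELEGATECALLs contract B, A's state is modified.
bytecode_address The bytecode address, i.e. the address of the contract whose code is executed.
caller_address The address that made this call.
input Of type CallInput, covered in detail in the Frame article.
call_value The amount of Wei sent with this call.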

pub struct InputsImpl {
    pub target_address: Address,
    pub bytecode_address: Option<Address>,
    pub caller_address: Address,
    pub input: CallInput,
    pub call_value: U256,
}

impl InputsTr for InputsImpl {
    fn target_address(&self) -> Address {
        self.target_address
    }

    fn caller_address(&self) -> Address {
        self.caller_address
    }

    fn bytecode_address(&self) -> Option<&Address> {
        self.bytecode_address.as_ref()
    }

    fn input(&self) -> &CallInput {
        &self.input
    }

    fn call_value(&self) -> U256 {
        self.call_value
    }
}
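
The difference between CALL and DELEGATECALL shows up directly in how these fields would be filled for the child frame. The sketch below follows the general EVM rules, not REVM's own construction code; Addr, FrameInputs and both functions are hypothetical, and the input data is left out.

```rust
/// Hypothetical stand-in for Address.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Addr(pub u8);

/// Reduced version of the inputs of one frame (input data omitted).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct FrameInputs {
    pub target_address: Addr,
    pub bytecode_address: Addr,
    pub caller_address: Addr,
    pub call_value: u128,
}

/// Child inputs when the frame `parent` executes CALL to `callee` with `value`.
pub fn call_inputs(parent: &FrameInputs, callee: Addr, value: u128) -> FrameInputs {
    FrameInputs {
        // The callee runs its own code against its own state.
        target_address: callee,
        bytecode_address: callee,
        // The account executing the CALL becomes the caller.
        caller_address: parent.target_address,
        call_value: value,
    }
}

/// Child inputs when the frame `parent` executes DELEGATECALL to `callee`.
pub fn delegatecall_inputs(parent: &FrameInputs, callee: Addr) -> FrameInputs {
    FrameInputs {
        // State changes still land on the parent's account...
        target_address: parent.target_address,
        // ...but the code comes from the callee.
        bytecode_address: callee,
        // Caller and value are inherited unchanged.
        caller_address: parent.caller_address,
        call_value: parent.call_value,
    }
}
```

If X calls A and A then uses DELEGATECALL on B, the child frame runs B's code with target_address A and caller_address X; with a plain CALL, both target_address and bytecode_address would be B and the caller would be A.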

RuntimeFlags

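is_static: when true, the current execution context is read-only. If contract A calls contract B via STATICCALL, is_static is true and any operation that tries to modify state during that call fails.
spec_id: the EVM hard-fork version.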

// crates/interpreter/src/interpreter/runtime_flags.rs
pub struct RuntimeFlags {
    pub is_static: bool,
    pub spec_id: SpecId,
}

impl RuntimeFlag for RuntimeFlags {
    fn is_static(&self) -> bool {
        self.is_static
    }

    fn spec_id(&self) -> SpecId {
        self.spec_id
    }
}
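
How such a flag is typically consulted can be shown with a tiny model. Everything in the sketch is hypothetical (Flags, Halt, sstore_sketch, and a HashMap as storage); it assumes the EVM rule that a static context is inherited by every nested call and that a write inside it halts with a static-call violation.

```rust
use std::collections::HashMap;

/// Reduced halt reason (hypothetical).
#[derive(Debug, PartialEq, Eq)]
pub enum Halt {
    StateChangeDuringStaticCall,
}

/// Reduced runtime flags (hypothetical; spec_id omitted).
#[derive(Debug, Clone, Copy)]
pub struct Flags {
    pub is_static: bool,
}

impl Flags {
    /// Flags for a child frame: once static, every nested call stays static.
    pub fn child(self, is_staticcall: bool) -> Flags {
        Flags { is_static: self.is_static || is_staticcall }
    }
}

/// SSTORE-like handler that refuses to write in a static context.
pub fn sstore_sketch(
    flags: Flags,
    storage: &mut HashMap<u64, u64>,
    key: u64,
    value: u64,
) -> Result<(), Halt> {
    if flags.is_static {
        return Err(Halt::StateChangeDuringStaticCall);
    }
    storage.insert(key, value);
    Ok(())
}
```

A frame entered through STATICCALL gets is_static = true, and so does any frame it spawns with an ordinary call, so a write attempted anywhere below it is rejected and the storage stays unchanged.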

Host

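Host wraps all the functions the Interpreter needs during execution.
Skim the names and they will all look familiar, yet rather mixed together.
That is because this trait is just a layer of wrapping; the calls actually land on methods of the inner fields.

In REVM, the type that implements the Host trait is Context.
The calls end up in the methods of Block, Transaction, Config, Database, Journal, Log and so on inside Context.
Most of these methods were covered earlier, so they are skipped here.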

// crates/context/interface/src/host.rs
pub trait Host {
    fn basefee(&self) -> U256;
    fn blob_gasprice(&self) -> U256;
    fn gas_limit(&self) -> U256;
    fn difficulty(&self) -> U256;
    fn prevrandao(&self) -> Option<U256>;
    fn block_number(&self) -> U256;
    fn timestamp(&self) -> U256;
    fn beneficiary(&self) -> Address;
    fn chain_id(&self) -> U256;
    fn effective_gas_price(&self) -> U256;
    fn caller(&self) -> Address;
    fn blob_hash(&self, number: usize) -> Option<U256>;
    fn max_initcode_size(&self) -> usize;
    fn block_hash(&mut self, number: u64) -> Option<B256>;
    fn selfdestruct(
        &mut self,
        address: Address,
        target: Address,
        skip_cold_load: bool,
    ) -> Result<StateLoad<SelfDestructResult>, LoadError>;
    fn log(&mut self, log: Log);
    fn sstore_skip_cold_load(
        &mut self,
        address: Address,
        key: StorageKey,
        value: StorageValue,
        skip_cold_load: bool,
    ) -> Result<StateLoad<SStoreResult>, LoadError>;
    fn sstore(
        &mut self,
        address: Address,
        key: StorageKey,
        value: StorageValue,
    ) -> Option<StateLoad<SStoreResult>> {
        self.sstore_skip_cold_load(address, key, value, false).ok()
    }
    fn sload_skip_cold_load(
        &mut self,
        address: Address,
        key: StorageKey,
        skip_cold_load: bool,
    ) -> Result<StateLoad<StorageValue>, LoadError>;
    fn sload(&mut self, address: Address, key: StorageKey) -> Option<StateLoad<StorageValue>> {
        self.sload_skip_cold_load(address, key, false).ok()
    }
    fn tstore(&mut self, address: Address, key: StorageKey, value: StorageValue);
    fn tload(&mut self, address: Address, key: StorageKey) -> StorageValue;
    fn load_account_info_skip_cold_load(
        &mut self,
        address: Address,
        load_code: bool,
        skip_cold_load: bool,
    ) -> Result<AccountInfoLoad<'_>, LoadError>;
    #[inline]
    fn balance(&mut self, address: Address) -> Option<StateLoad<U256>> {
        self.load_account_info_skip_cold_load(address, false, false)
            .ok()
            .map(|load| load.into_state_load(|i| i.balance))
    }
    #[inline]
    fn load_account_delegated(&mut self, address: Address) -> Option<StateLoad<AccountLoad>> {
        let account = self
            .load_account_info_skip_cold_load(address, true, false)
            .ok()?;

        let mut account_load = StateLoad::new(
            AccountLoad {
                is_delegate_account_cold: None,
                is_empty: account.is_empty,
            },
            account.is_cold,
        );
        if let Some(Bytecode::Eip7702(code)) = &account.code {
            let address = code.address();
            let delegate_account = self
                .load_account_info_skip_cold_load(address, true, false)
                .ok()?;
            account_load.data.is_delegate_account_cold = Some(delegate_account.is_cold);
            account_load.data.is_empty = delegate_account.is_empty;
        }

        Some(account_load)
    }
    #[inline]
    fn load_account_code(&mut self, address: Address) -> Option<StateLoad<Bytes>> {
        self.load_account_info_skip_cold_load(address, true, false)
            .ok()
            .map(|load| {
                load.into_state_load(|i| {
                    i.code
                        .as_ref()
                        .map(|b| b.original_bytes())
                        .unwrap_or_default()
                })
            })
    }
    #[inline]
    fn load_account_code_hash(&mut self, address: Address) -> Option<StateLoad<B256>> {
        self.load_account_info_skip_cold_load(address, false, false)
            .ok()
            .map(|load| {
                load.into_state_load(|i| {
                    if i.is_empty() {
                        B256::ZERO
                    } else {
                        i.code_hash
                    }
                })
            })
    }
}

// crates/context/src/context.rs
impl<
        BLOCK: Block,
        TX: Transaction,
        CFG: Cfg,
        DB: Database,
        JOURNAL: JournalTr<Database = DB>,
        CHAIN,
        LOCAL: LocalContextTr,
    > Host for Context<BLOCK, TX, CFG, DB, JOURNAL, CHAIN, LOCAL>
{
    ...
}
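
The trait above repeatedly pairs one required method with a provided convenience wrapper (sload calling sload_skip_cold_load, for instance). A miniature version of that pattern is sketched below. MiniHost, MapHost and the reduced LoadError are hypothetical, and the behaviour of skip_cold_load (fail instead of performing a cold load) is an assumption made for the example, not something the listing confirms.

```rust
use std::collections::HashMap;

/// Reduced error type (hypothetical).
#[derive(Debug, PartialEq, Eq)]
pub struct LoadError;

/// Miniature of the pattern: one required method, one provided wrapper.
pub trait MiniHost {
    /// Required: the implementor decides how a load is performed.
    fn sload_skip_cold_load(&mut self, key: u64, skip_cold_load: bool) -> Result<u64, LoadError>;

    /// Provided: never skips cold loads and flattens the error into None.
    fn sload(&mut self, key: u64) -> Option<u64> {
        self.sload_skip_cold_load(key, false).ok()
    }
}

/// Toy implementor backed by a map plus a list of already-touched (warm) keys.
pub struct MapHost {
    pub storage: HashMap<u64, u64>,
    pub warm: Vec<u64>,
}

impl MiniHost for MapHost {
    fn sload_skip_cold_load(&mut self, key: u64, skip_cold_load: bool) -> Result<u64, LoadError> {
        let is_cold = !self.warm.contains(&key);
        // Assumption for this sketch: a skipped cold load is reported as an error.
        if is_cold && skip_cold_load {
            return Err(LoadError);
        }
        if is_cold {
            self.warm.push(key);
        }
        Ok(self.storage.get(&key).copied().unwrap_or(0))
    }
}
```

An implementor only has to supply the required method; callers that do not care about cold-load skipping use sload and get an Option back.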

InstructionTable

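InstructionTable is an array of Instruction values that stores, for each Opcode, its static gas and the function pointer that implements it.
During execution, when the Interpreter reaches an Opcode, it looks up the handler in the InstructionTable and runs it.

First the definition of the Instruction struct.
It has just two fields, fn_ and static_gas.
fn(InstructionContext<'_, H, W>) is a function pointer taking one argument of type InstructionContext<'_, H, W>.
static_gas is the static gas cost of the Opcode.

Next the definition of the InstructionContext struct.
Two fields, interpreter and host.
interpreter is an Interpreter<ITy> where ITy must implement InterpreterTypes; this is the Interpreter this article is about.
host is only bounded by ?Sized here.
A Sized type has a size known at compile time. ?Sized lifts that default requirement, so the type may also be dynamically sized, for example str (not &str).
Presumably this leaves room for the differing needs of other chains, which is why the Host trait is not required directly at this point. REVM has a lot of abstraction.

On to the implementation of Instruction.
impl<W: InterpreterTypes, H: Host + ?Sized> Instruction<W, H>: here H is required to implement Host.
The methods are simple and are not covered in detail.

The body of instruction_table is straightforward but long, so it is skipped here.
It returns an array of length 256; the array index is the Opcode, and the entry stored there is the corresponding Instruction.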

// crates/interpreter/src/instruction_context.rs
pub struct InstructionContext<'a, H: ?Sized, ITy: InterpreterTypes> {
    /// Reference to the interpreter containing execution state (stack, memory, gas, etc).
    pub interpreter: &'a mut Interpreter<ITy>,
    /// Reference to the host interface for accessing external blockchain state.
    pub host: &'a mut H,
}

// crates/interpreter/src/instructions.rs
pub struct Instruction<W: InterpreterTypes, H: ?Sized> {
    fn_: fn(InstructionContext<'_, H, W>),
    static_gas: u64,
}
impl<W: InterpreterTypes, H: Host + ?Sized> Instruction<W, H> {
    #[inline]
    pub const fn new(fn_: fn(InstructionContext<'_, H, W>), static_gas: u64) -> Self {
        Self { fn_, static_gas }
    }
    #[inline]
    pub const fn unknown() -> Self {
        Self {
            fn_: control::unknown,
            static_gas: 0,
        }
    }
    #[inline(always)]
    pub fn execute(self, ctx: InstructionContext<'_, H, W>) {
        (self.fn_)(ctx)
    }
    #[inline(always)]
    pub const fn static_gas(&self) -> u64 {
        self.static_gas
    }
}

pub type InstructionTable<W, H> = [Instruction<W, H>; 256];

#[inline]
pub const fn instruction_table<WIRE: InterpreterTypes, H: Host>() -> [Instruction<WIRE, H>; 256] {
    const { instruction_table_impl::<WIRE, H>() }
}
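
The table-driven dispatch can be reduced to a few lines. The sketch below is a toy under stated assumptions: Machine, Instr, table and step are hypothetical, values are u64 instead of U256, and only STOP (0x00) and ADD (0x01, static gas 3) are filled in, with every other slot pointing at an "unknown opcode" handler.

```rust
/// Toy machine state (hypothetical).
pub struct Machine {
    pub stack: Vec<u64>,
    pub gas: u64,
    pub halted: bool,
}

/// One table entry: handler function pointer plus static gas.
#[derive(Clone, Copy)]
pub struct Instr {
    pub func: fn(&mut Machine),
    pub static_gas: u64,
}

fn unknown(m: &mut Machine) {
    m.halted = true;
}

fn stop(m: &mut Machine) {
    m.halted = true;
}

fn add(m: &mut Machine) {
    match (m.stack.pop(), m.stack.pop()) {
        (Some(a), Some(b)) => m.stack.push(a.wrapping_add(b)),
        // Not enough operands: halt instead of continuing.
        _ => m.halted = true,
    }
}

/// Builds the 256-entry table; the array index is the opcode.
pub fn table() -> [Instr; 256] {
    let mut t = [Instr { func: unknown, static_gas: 0 }; 256];
    t[0x00] = Instr { func: stop, static_gas: 0 };
    t[0x01] = Instr { func: add, static_gas: 3 };
    t
}

/// Executes one opcode: charge static gas, then call the handler.
pub fn step(m: &mut Machine, table: &[Instr; 256], opcode: u8) {
    let instr = table[opcode as usize];
    if m.gas < instr.static_gas {
        m.halted = true;
        return;
    }
    m.gas -= instr.static_gas;
    (instr.func)(m);
}
```

Because a u8 opcode can only take 256 values, indexing the array never goes out of bounds, and unassigned opcodes simply resolve to the unknown handler.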

InstructionResult

InstructionResult holds the result returned by executing an Opcode.
There is nothing special to explain here, so only the meaning of each enum variant is listed.

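Stop hit the STOP opcode (0x00); the current call frame stops normally and returns no data (the default value)
Return executed the RETURN opcode (0xf3); the current call frame returns normally, handing the given memory range back to the caller as return data
SelfDestruct executed the SELFDESTRUCT opcode (0xff); the contract is marked for self-destruction, its balance goes to the given beneficiary, and the contract code is deleted at the end of the transaction
Revert executed the REVERT opcode (0xfd); the current call frame is rolled back and the given memory range is returned to the caller as the revert reason
CallTooDeep the call depth exceeded the EVM limit (1024 levels), so CALL / CREATE and similar operations cannot proceed
OutOfFunds on a transfer or CALL, the sender's balance does not cover the value being sent
CreateInitCodeStartingEF00 the initcode of a CREATE / CREATE2 starts with 0xEF00 (the EOF magic) and is treated as invalid
InvalidEOFInitCode EOF-format initcode is invalid (malformed, incomplete sections, etc.)
InvalidExtDelegateCallTarget the target of an EXTDELEGATECALL (an EOF instruction) is not a valid EOF contract
OutOfGas out of gas (generic); raised whenever a gas deduction exceeds the remaining gas
MemoryOOG out of gas while expanding memory (because of the quadratic cost)
MemoryLimitOOG memory expansion exceeded the configured memory limit (memory_limit)
PrecompileOOG out of gas while running a precompile (precompiles have their own gas rules)
InvalidOperandOOG an invalid operand caused a gas overflow or an OOG during the computation
ReentrancySentryOOG out of gas in the reentrancy sentry check
OpcodeNotFound hit an unknown or unimplemented opcode (reserved, or not activated in the current fork)
CallNotAllowedInsideStatic a CALL that transfers value was attempted inside a static call (STATICCALL)
StateChangeDuringStaticCall a state-modifying operation (such as SSTORE, CREATE, SELFDESTRUCT) was attempted during a static call, violating the static rule
InvalidFEOpcode hit the designated invalid opcode 0xfe (INVALID)
InvalidJump the target of a JUMP / JUMPI is not a JUMPDEST (0x5b), or is out of bounds
NotActivated tried to use a feature or opcode not activated in the current fork (e.g. an EOF opcode in a legacy fork)
StackUnderflow stack underflow (not enough items on the stack for POP, ADD, etc.)
StackOverflow stack overflow (PUSH etc. would take the stack depth past 1024)
OutOfOffset invalid memory/storage offset (out of range or too large)
CreateCollision the target address of a CREATE / CREATE2 already exists (address collision)
OverflowPayment the payment overflows (the transfer would push the recipient's balance past the u256 maximum)
PrecompileError internal error in a precompile (illegal input, failed computation, etc.)
NonceOverflow nonce overflow (u64::MAX)
CreateContractSizeLimit the created contract code exceeds the size limit (EIP-170)
CreateContractStartingWithEF the created contract code starts with 0xEF (EIP-3541 rule, avoiding conflicts with EOF)
CreateInitCodeSizeLimit the initcode exceeds the size limit (EIP-3860)
FatalExternalError the external database or another part of the host environment returned a fatal error (failed db read/write, etc.)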

pub enum InstructionResult {
    #[default]
    Stop = 1, // Start at 1 so that `Result<(), _>::Ok(())` is 0.
    Return,
    SelfDestruct,

    Revert = 0x10,
    CallTooDeep,
    OutOfFunds,
    CreateInitCodeStartingEF00,
    InvalidEOFInitCode,
    InvalidExtDelegateCallTarget,

    // Error Codes
    OutOfGas = 0x20,
    MemoryOOG,
    MemoryLimitOOG,
    PrecompileOOG,
    InvalidOperandOOG,
    ReentrancySentryOOG,
    OpcodeNotFound,
    CallNotAllowedInsideStatic,
    StateChangeDuringStaticCall,
    InvalidFEOpcode,
    InvalidJump,
    NotActivated,
    StackUnderflow,
    StackOverflow,
    OutOfOffset,
    CreateCollision,
    OverflowPayment,
    PrecompileError,
    NonceOverflow,
    CreateContractSizeLimit,
    CreateContractStartingWithEF,
    CreateInitCodeSizeLimit,
    FatalExternalError,
}
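
The explicit discriminants split the enum into three numeric groups (from 1, from 0x10, from 0x20). The sketch below assumes, without the listing confirming it, that these groups line up with what is_ok, is_revert and is_error report; ResultKind is a hypothetical enum holding only a few of the variants.

```rust
/// Reduced copy of the enum with the same discriminant layout (hypothetical).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u8)]
pub enum ResultKind {
    // Success group, starting at 1.
    Stop = 1,
    Return,
    SelfDestruct,
    // Revert group, starting at 0x10.
    Revert = 0x10,
    CallTooDeep,
    OutOfFunds,
    // Error group, starting at 0x20.
    OutOfGas = 0x20,
    InvalidJump,
    StackUnderflow,
}

impl ResultKind {
    /// True for the success group.
    pub const fn is_ok(self) -> bool {
        matches!(self, Self::Stop | Self::Return | Self::SelfDestruct)
    }

    /// True for the revert group.
    pub const fn is_revert(self) -> bool {
        matches!(self, Self::Revert | Self::CallTooDeep | Self::OutOfFunds)
    }

    /// Everything that is neither ok nor a revert counts as an error.
    pub const fn is_error(self) -> bool {
        !self.is_ok() && !self.is_revert()
    }
}
```

Variants without an explicit value continue counting from the previous one, so Return is 2, CallTooDeep is 0x11 and InvalidJump is 0x21.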

InterpreterAction

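InterpreterAction is the action that the Interpreter of the current Frame asks the layer above to perform after executing an Opcode.

NewFrame Create a new Frame. Requested after executing the Call- and Create-related Opcodes.
Return The Interpreter has finished executing.
The implementation is simple and is skipped here.

FrameInput was covered earlier and is not repeated.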

#[derive(Clone, Debug, PartialEq, Eq)]
#[cfg_attr(feature = "serde", derive(serde::Serialize, serde::Deserialize))]
pub enum InterpreterAction {
    NewFrame(FrameInput),
    Return(InterpreterResult),
}
impl InterpreterAction {
    #[inline]
    pub fn is_call(&self) -> bool {
        matches!(self, InterpreterAction::NewFrame(FrameInput::Call(..)))
    }
    #[inline]
    pub fn is_create(&self) -> bool {
        matches!(self, InterpreterAction::NewFrame(FrameInput::Create(..)))
    }
    #[inline]
    pub fn is_return(&self) -> bool {
        matches!(self, InterpreterAction::Return { .. })
    }
    #[inline]
    pub fn gas_mut(&mut self) -> Option<&mut Gas> {
        match self {
            InterpreterAction::Return(result) => Some(&mut result.gas),
            _ => None,
        }
    }
    #[inline]
    pub fn into_result_return(self) -> Option<InterpreterResult> {
        match self {
            InterpreterAction::Return(result) => Some(result),
            _ => None,
        }
    }
    #[inline]
    pub fn instruction_result(&self) -> Option<InstructionResult> {
        match self {
            InterpreterAction::Return(result) => Some(result.result),
            _ => None,
        }
    }
    #[inline]
    pub fn new_frame(frame_input: FrameInput) -> Self {
        Self::NewFrame(frame_input)
    }
    #[inline]
    pub fn new_halt(result: InstructionResult, gas: Gas) -> Self {
        Self::Return(InterpreterResult::new(result, Bytes::new(), gas))
    }
    #[inline]
    pub fn new_return(result: InstructionResult, output: Bytes, gas: Gas) -> Self {
        Self::Return(InterpreterResult::new(result, output, gas))
    }
    #[inline]
    pub fn new_stop() -> Self {
        Self::Return(InterpreterResult::new(
            InstructionResult::Stop,
            Bytes::new(),
            Gas::new(0),
        ))
    }
}
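
How a driver might react to the two variants can be sketched with a toy frame stack. Nothing below is REVM code: Program, Action, Frame and execute are hypothetical, and the "program" is just a tree whose leaves return numbers. The sketch only illustrates the control flow of pushing a frame on NewFrame and popping one on Return.

```rust
/// Toy program: a frame either returns a value or calls children and sums them.
#[derive(Debug, Clone)]
pub enum Program {
    Leaf(u64),
    Sum(Vec<Program>),
}

/// What a frame asks its driver to do next (mirrors NewFrame / Return).
pub enum Action {
    NewFrame(Program),
    Return(u64),
}

struct Frame {
    pending: Vec<Program>,
    acc: u64,
}

impl Frame {
    fn new(program: Program) -> Self {
        match program {
            Program::Leaf(value) => Frame { pending: Vec::new(), acc: value },
            Program::Sum(mut children) => {
                // Reverse so that pop() hands out the children in their original order.
                children.reverse();
                Frame { pending: children, acc: 0 }
            }
        }
    }

    /// Runs until the frame needs a child frame or is finished.
    fn run(&mut self) -> Action {
        match self.pending.pop() {
            Some(child) => Action::NewFrame(child),
            None => Action::Return(self.acc),
        }
    }

    /// Receives the result of the child frame that just returned.
    fn return_result(&mut self, value: u64) {
        self.acc += value;
    }
}

/// Drives the frame stack; returns the final value and the deepest stack seen.
pub fn execute(root: Program) -> (u64, usize) {
    let mut frames = vec![Frame::new(root)];
    let mut max_depth: usize = 1;
    loop {
        let action = frames.last_mut().expect("frame stack is not empty").run();
        match action {
            Action::NewFrame(input) => {
                frames.push(Frame::new(input));
                max_depth = max_depth.max(frames.len());
            }
            Action::Return(value) => {
                frames.pop();
                match frames.last_mut() {
                    Some(parent) => parent.return_result(value),
                    None => return (value, max_depth),
                }
            }
        }
    }
}
```

The driver never recurses: nested calls only grow the Vec of frames, which is the same reason a call depth limit can be enforced by checking the length of that stack.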

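Now the InterpreterResult part.

result The result of executing the instruction; the type was covered above.
output The data returned by the instruction.
- On RETURN / REVERT: the data from memory that is returned or reverted with (the return value or the revert reason).
- On a successful CALL / CREATE: the data returned by the child call, if any.
- Otherwise: usually empty Bytes (an empty byte array).
gas Gas usage; the type was covered above.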

// crates/interpreter/src/interpreter.rs
pub struct InterpreterResult {
    pub result: InstructionResult,
    pub output: Bytes,
    pub gas: Gas,
}

impl InterpreterResult {
    pub fn new(result: InstructionResult, output: Bytes, gas: Gas) -> Self {
        Self {
            result,
            output,
            gas,
        }
    }
    #[inline]
    pub const fn is_ok(&self) -> bool {
        self.result.is_ok()
    }
    #[inline]
    pub const fn is_revert(&self) -> bool {
        self.result.is_revert()
    }
    #[inline]
    pub const fn is_error(&self) -> bool {
        self.result.is_error()
    }
}

Outcome

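The return value of a Frame.

result The InterpreterResult of the frame, described above.
memory_offset The range in the parent call frame's memory into which the return data is written.
was_precompile_called Whether this child call executed a precompile.
precompile_call_logs If the child call was a precompile and emitted log events, they are stored here.
address The address of the newly created contract (Some(address) on success, None on failure).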

// crates/interpreter/src/interpreter_action/call_outcome.rs
pub struct CallOutcome {
    pub result: InterpreterResult,
    pub memory_offset: Range<usize>,
    pub was_precompile_called: bool,
    pub precompile_call_logs: Vec<Log>,
}
// crates/interpreter/src/interpreter_action/create_outcome.rs
pub struct CreateOutcome {
    pub result: InterpreterResult,
    pub address: Option<Address>,
}
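
What memory_offset is for can be shown with a small helper. The function below is hypothetical and follows the general EVM rule for CALL return data (copy at most as many bytes as the window requested by the caller can hold); it assumes the parent's memory already covers the window.

```rust
use std::ops::Range;

/// Copies a child's output into the parent's memory window, truncating to the
/// window size. Returns the number of bytes written (hypothetical helper).
pub fn write_return_data(memory: &mut [u8], memory_offset: Range<usize>, output: &[u8]) -> usize {
    // Never write more than the caller asked for, nor more than was returned.
    let len = memory_offset.len().min(output.len());
    let start = memory_offset.start;
    memory[start..start + len].copy_from_slice(&output[..len]);
    len
}
```

If the child returns fewer bytes than the window holds, the rest of the window keeps its previous contents; if it returns more, the surplus is not written to memory here.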