本文介绍了一个MIPS模拟器的设计,支持MIPS ELF程序的加载、执行和段生成。模拟器通过逐步执行指令并检查退出条件,支持常规执行和段分割两种模式。内存管理采用4KB页面并通过哈希树计算镜像ID,优化了修改页面的哈希计算,提升了性能。主要数据结构包括仿真状态、内存和段信息。
模拟器主要实现了MIPS指令集的仿真操作,并提供了运行MIPS ELF程序和生成段的接口。所有代码可以在zkm/emulator
目录下找到。
MIPS程序的执行过程如下:(左边为常规执行过程,右边为执行过程中带有段分割的过程)
主要步骤如下:
split_seg
过程。如果退出条件被触发,进入split_seg + exit
过程。主数据结构包括:
pub struct InstrumentedState {
/// state stores the state of the MIPS emulator
pub state: Box<State>,
/// writer for stdout
stdout_writer: Box<dyn Write>,
/// writer for stderr
stderr_writer: Box<dyn Write>,
pub pre_segment_id: u32,
pre_pc: u32,
pre_image_id: [u8; 32],
pre_hash_root: [u8; 32],
block_path: String,
pre_input: Vec<Vec<u8>>,
pre_input_ptr: usize,
pre_public_values: Vec<u8>,
pre_public_values_ptr: usize,
}
State: 维护仿真系统的MIPS架构状态(寄存器、内存、堆指针等)。
pub struct State {
pub memory: Box<Memory>,
/// the 32 general purpose registers of MIPS.
pub registers: [u32; 32],
/// the pc register stores the current execution instruction address.
pub pc: u32,
/// the next pc stores the next execution instruction address.
next_pc: u32,
/// the hi register stores the multiplier/divider result high(remainder) part.
hi: u32,
/// the low register stores the multiplier/divider result low(quotient) part.
lo: u32,
/// heap handles the mmap syscall.
heap: u32,
/// brk handles the brk syscall
brk: u32,
/// tlb addr
local_user: u32,
/// step tracks the total step has been executed.
pub step: u64,
pub total_step: u64,
/// cycle tracks the total cycle has been executed.
pub cycle: u64,
pub total_cycle: u64,
/// A stream of input values (global to the entire program).
pub input_stream: Vec<Vec<u8>>,
/// A ptr to the current position in the input stream incremented by HINT_READ opcode.
pub input_stream_ptr: usize,
/// A stream of public values from the program (global to entire program).
pub public_values_stream: Vec<u8>,
/// A ptr to the current position in the public values stream, incremented when reading from public_values_stream.
pub public_values_stream_ptr: usize,
pub exited: bool,
pub exit_code: u8,
dump_info: bool,
}
Memory: 维护系统当前的内存镜像以及当前段的访问追踪信息。
pub struct Memory {
/// page index -> cached page
pages: BTreeMap<u32, Rc<RefCell<CachedPage>>>,
// two caches: we often read instructions from one page, and do memory things with another page.
// this prevents map lookups each instruction
last_page_keys: [Option<u32>; 2],
last_page: [Option<Rc<RefCell<CachedPage>>>; 2],
// for implement std::io::Read trait
addr: u32,
count: u32,
rtrace: BTreeMap<u32, [u8; PAGE_SIZE]>,
wtrace: [BTreeMap<u32, Rc<RefCell<CachedPage>>>; 3],
}
* **Segment**: 维护段相关信息。
```rust
pub struct Segment {
pub mem_image: BTreeMap<u32, u32>, // initial memory image of segment
pub pc: u32, // initial pc
pub segment_id: u32, // segment id
pub pre_image_id: [u8; 32], // image id of segment pre state
pub pre_hash_root: [u8; 32], // hash root of segment pre memory image
pub image_id: [u8; 32], // image id of segment post state
pub page_hash_root: [u8; 32], // hash root of segment post memory image
pub end_pc: u32, // end pc
pub step: u64, // step number of cur segment
pub input_stream: Vec<Vec<u8>>,
pub input_stream_ptr: usize,
pub public_values_stream: Vec<u8>,
pub public_values_stream_ptr: usize,
}
模拟器使用指令解析方法执行指令:首先获取指令,然后根据指令编码解析并执行相应的函数,更新系统的状态/内存。主要代码:mips_step()
可以在state.rs
中找到。
支持的ISA可以在mips_isa
中找到。
内存以哈希树的形式组织,页面(4KB)作为节点。哈希页面的起始地址为0x8000000,程序地址空间为0\~0x8000000,根哈希页面地址为0x81020000,如下所示。
页面哈希和镜像ID的计算过程如下:
为了减少页面哈希更新的频率,在指令执行过程中会记录已修改的内存页面。因此,在镜像ID计算过程中,只需递归更新这些已修改的内存页面的哈希页面,计算根哈希并获得镜像ID。
如果觉得我的文章对您有用,请随意打赏。你的支持将鼓励我继续创作!