├── README.md ├── bin ├── apic.bin ├── callgate.bin ├── hvcall.bin ├── iret.bin ├── popfs.bin ├── popss.bin ├── rdmsr.bin ├── realmode.bin ├── retf.bin ├── syscall.bin ├── sysenter.bin ├── taskswitch_call.bin ├── taskswitch_iret.bin ├── taskswitch_iret_s.bin ├── taskswitch_jmp.bin ├── taskswitch_vector.bin └── wrmsr.bin └── scripts ├── example_hypercall.py ├── example_lapic.py ├── example_msr.py ├── example_realmode.py ├── example_rum.py ├── example_taskswitch.py ├── example_vmxon.py └── vmstate.py /README.md: -------------------------------------------------------------------------------- 1 | # About HyperFuzzer 2 | 3 | HyperFuzzer is an efficient hybrid fuzzer for [Microsoft's Hyper-V hypervisor](https://docs.microsoft.com/en-us/windows-server/virtualization/hyper-v/hyper-v-technology-overview). 4 | HyperFuzzer loads a complete VM state, traces the hypervisor's execution using Intel Processor Trace, and continuously 5 | mutates the VM state based on code coverage and symbolic execution until it hits a critical bug. HyperFuzzer is deployed 6 | inside Microsoft to test daily builds of the Hyper-V hypervisor, and has found 11 critical hypervisor bugs to date (as of 8/2/2021). 7 | You can find more technical details of HyperFuzzer in our CCS'21 paper. 8 | 9 | Please note that HyperFuzzer is _not_ available outside of Microsoft at the moment. The purpose of this repo is to 10 | release the fuzzing seeds, which are generic and not specific to Hyper-V, to foster future research on hypervisor fuzzing. 11 | 12 | # Fuzzing Seeds 13 | 14 | This repo contains the fuzzing seeds HyperFuzzer uses to drive the testing. A fuzzing seed is a complete VM state 15 | (registers + memory) that triggers a VMEXIT on executing its very first instruction and then terminates. HyperFuzzer 16 | relies on these fuzzing seeds to cover different functionalities of the hypervisor (e.g., task switch and APIC 17 | emulation), and mutates them with the goal of causing crashes or assertion violations. 
18 | 19 | ## Binary Format 20 | 21 | Each fuzzing input has a fixed-sized register region followed by a variable-length physical memory region. 22 | The data format of the fuzzing input file is as follows: 23 | 24 | ``` 25 | #pragma pack(1) 26 | 27 | typedef struct _VM_STATE { 28 | REG_FILE RegFile; 29 | UINT8 PhysicalMemory[0]; // till the end of the file 30 | } VM_STATE; 31 | ``` 32 | 33 | The data structure of `REG_FILE` is defined below. 34 | 35 | ``` 36 | #pragma pack(1) 37 | 38 | typedef struct _REG_SEG64 { 39 | UINT64 Base; 40 | UINT32 Limit; 41 | UINT16 Selector; 42 | UINT16 Attributes; 43 | } REG_SEG64; 44 | 45 | typedef struct _REG_TABLE64 { 46 | UINT64 Base; 47 | UINT16 Limit; 48 | } REG_TABLE64; 49 | 50 | typedef struct _REG_FILE { 51 | UINT64 Rax; 52 | UINT64 Rcx; 53 | UINT64 Rdx; 54 | UINT64 Rbx; 55 | UINT64 Rsp; 56 | UINT64 Rbp; 57 | UINT64 Rsi; 58 | UINT64 Rdi; 59 | UINT64 R8; 60 | UINT64 R9; 61 | UINT64 R10; 62 | UINT64 R11; 63 | UINT64 R12; 64 | UINT64 R13; 65 | UINT64 R14; 66 | UINT64 R15; 67 | UINT64 Rip; 68 | UINT32 Eflags; 69 | REG_SEG64 Es; 70 | REG_SEG64 Cs; 71 | REG_SEG64 Ss; 72 | REG_SEG64 Ds; 73 | REG_SEG64 Fs; 74 | REG_SEG64 Gs; 75 | REG_SEG64 Tr; 76 | REG_TABLE64 Idtr; 77 | REG_TABLE64 Gdtr; 78 | UINT32 Cr0; 79 | UINT64 Cr2; 80 | UINT64 Cr3; 81 | UINT32 Cr4; 82 | UINT64 Dr0; 83 | UINT64 Dr1; 84 | UINT64 Dr2; 85 | UINT64 Dr3; 86 | UINT32 Dr6; 87 | UINT32 Dr7; 88 | UINT32 SysenterCs; 89 | UINT64 SysenterEip; 90 | UINT64 SysenterEsp; 91 | UINT32 Efer; 92 | UINT64 KernelGsBase; 93 | UINT64 Star; 94 | UINT64 Lstar; 95 | UINT64 Cstar; 96 | UINT32 Sfmask; 97 | } REG_FILE; 98 | ``` 99 | 100 | ## Seed Generation 101 | 102 | We construct the fuzzing seeds by using a set of Python2 scripts in the `scripts/` folder: 103 | 104 | * `vmstate.py` defines key x86 data structures and various utility functions to help construct VM states. 105 | * `example_taskswitch.py` generates VM states to test the hypervisor's emulation of hardware task switch. 
106 | * `example_rum.py` generates VM states to test the hypervisor's restricted user mode implementation. 107 | * `example_lapic.py` generates VM states to test the hypervisor's APIC emulation. 108 | * `example_msr.py` generates VM states to test the hypervisor's MSR virtualization. 109 | * ... 110 | 111 | We also place the final binary files generated by those scripts in the `bin/` folder. 112 | -------------------------------------------------------------------------------- /bin/apic.bin: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MSRSSP/hyperfuzzer-seeds/2bf303c5ebb194ed5a2951d82facf8806d2212e1/bin/apic.bin -------------------------------------------------------------------------------- /bin/callgate.bin: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MSRSSP/hyperfuzzer-seeds/2bf303c5ebb194ed5a2951d82facf8806d2212e1/bin/callgate.bin -------------------------------------------------------------------------------- /bin/hvcall.bin: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MSRSSP/hyperfuzzer-seeds/2bf303c5ebb194ed5a2951d82facf8806d2212e1/bin/hvcall.bin -------------------------------------------------------------------------------- /bin/iret.bin: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MSRSSP/hyperfuzzer-seeds/2bf303c5ebb194ed5a2951d82facf8806d2212e1/bin/iret.bin -------------------------------------------------------------------------------- /bin/popfs.bin: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MSRSSP/hyperfuzzer-seeds/2bf303c5ebb194ed5a2951d82facf8806d2212e1/bin/popfs.bin -------------------------------------------------------------------------------- /bin/popss.bin: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/MSRSSP/hyperfuzzer-seeds/2bf303c5ebb194ed5a2951d82facf8806d2212e1/bin/popss.bin -------------------------------------------------------------------------------- /bin/rdmsr.bin: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MSRSSP/hyperfuzzer-seeds/2bf303c5ebb194ed5a2951d82facf8806d2212e1/bin/rdmsr.bin -------------------------------------------------------------------------------- /bin/realmode.bin: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MSRSSP/hyperfuzzer-seeds/2bf303c5ebb194ed5a2951d82facf8806d2212e1/bin/realmode.bin -------------------------------------------------------------------------------- /bin/retf.bin: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MSRSSP/hyperfuzzer-seeds/2bf303c5ebb194ed5a2951d82facf8806d2212e1/bin/retf.bin -------------------------------------------------------------------------------- /bin/syscall.bin: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MSRSSP/hyperfuzzer-seeds/2bf303c5ebb194ed5a2951d82facf8806d2212e1/bin/syscall.bin -------------------------------------------------------------------------------- /bin/sysenter.bin: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MSRSSP/hyperfuzzer-seeds/2bf303c5ebb194ed5a2951d82facf8806d2212e1/bin/sysenter.bin -------------------------------------------------------------------------------- /bin/taskswitch_call.bin: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MSRSSP/hyperfuzzer-seeds/2bf303c5ebb194ed5a2951d82facf8806d2212e1/bin/taskswitch_call.bin 
-------------------------------------------------------------------------------- /bin/taskswitch_iret.bin: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MSRSSP/hyperfuzzer-seeds/2bf303c5ebb194ed5a2951d82facf8806d2212e1/bin/taskswitch_iret.bin -------------------------------------------------------------------------------- /bin/taskswitch_iret_s.bin: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MSRSSP/hyperfuzzer-seeds/2bf303c5ebb194ed5a2951d82facf8806d2212e1/bin/taskswitch_iret_s.bin -------------------------------------------------------------------------------- /bin/taskswitch_jmp.bin: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MSRSSP/hyperfuzzer-seeds/2bf303c5ebb194ed5a2951d82facf8806d2212e1/bin/taskswitch_jmp.bin -------------------------------------------------------------------------------- /bin/taskswitch_vector.bin: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MSRSSP/hyperfuzzer-seeds/2bf303c5ebb194ed5a2951d82facf8806d2212e1/bin/taskswitch_vector.bin -------------------------------------------------------------------------------- /bin/wrmsr.bin: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MSRSSP/hyperfuzzer-seeds/2bf303c5ebb194ed5a2951d82facf8806d2212e1/bin/wrmsr.bin -------------------------------------------------------------------------------- /scripts/example_hypercall.py: -------------------------------------------------------------------------------- 1 | import os 2 | import sys 3 | import struct 4 | import argparse 5 | from ctypes import * 6 | from vmstate import * 7 | 8 | class HV_HYPERCALL_INPUT_PRIVATE(Structure): 9 | _fields_ = [('CallCode', c_uint64, 14), 10 | ('IsIsolated', c_uint64, 1), 11 | 
('IsExtended', c_uint64, 1), 12 | ('IsFast', c_uint64, 1), 13 | ('VariableHeaderSize', c_uint64, 9), 14 | ('Reserved1', c_uint64, 5), 15 | ('IsNested', c_uint64, 1), 16 | ('CountOfElements', c_uint64, 12), 17 | ('Reserved2', c_uint64, 4), 18 | ('RepStartIndex', c_uint64, 12), 19 | ('Reserved3', c_uint64, 4)] 20 | 21 | assert sizeof(HV_HYPERCALL_INPUT_PRIVATE) == 8 22 | 23 | class HYPERSEED_CORPUS(Structure): 24 | _pack_ = 1 25 | _fields_ = [('CallCode', c_uint32), 26 | ('VariableHeaderSizeInBytes', c_uint32), 27 | ('Type', c_uint8), 28 | ('Abi', c_uint8), 29 | ('CountOfElements', c_uint64), 30 | ('InputSize', c_uint64)] 31 | 32 | assert sizeof(HYPERSEED_CORPUS) == 26 33 | 34 | def generate_seeds(seedfile): 35 | states = [] 36 | while True: 37 | # get one hypercall payload from the seedfile 38 | buf = seedfile.read(sizeof(HYPERSEED_CORPUS)) 39 | if len(buf) != sizeof(HYPERSEED_CORPUS): 40 | assert not buf 41 | break 42 | corpus = HYPERSEED_CORPUS.from_buffer(bytearray(buf)) 43 | control = HV_HYPERCALL_INPUT_PRIVATE() 44 | control.CallCode = corpus.CallCode 45 | control.VariableHeaderSize = (corpus.VariableHeaderSizeInBytes + 7) >> 3 46 | control.CountOfElements = corpus.CountOfElements 47 | rawinput = seedfile.read(corpus.InputSize) 48 | assert len(rawinput) == corpus.InputSize 49 | assert len(rawinput) <= 0x1000 50 | # initialize the VM state 51 | state = VMState(0x86) 52 | state.setup_gdt() 53 | # inject vmcall + int3 54 | code = '\x0f\x01\xc1' + '\xcc' 55 | state.regs.rip.value = state.memory.allocate(len(code)) 56 | state.memory.write(state.regs.rip.value, code) 57 | # setup vmcall parameters 58 | state.regs.rax.value, state.regs.rdx.value = struct.unpack(' 0x1000: 61 | addr = state.memory.allocate(len(rawinput), 0x1000) 62 | else: 63 | addr = state.memory.allocate(len(rawinput), 8) 64 | assert addr < 0xffffffff 65 | state.memory.write(addr, rawinput) 66 | # make input/output GPA point to the same address 67 | state.regs.rcx.value = addr 68 | 
state.regs.rsi.value = addr 69 | # save the state 70 | states.append(state) 71 | return states 72 | 73 | if __name__ == '__main__': 74 | parser = argparse.ArgumentParser() 75 | parser.add_argument('-i', required = True, type = argparse.FileType('rb'), metavar = '/path/to/seed.bin', help = 'Input generated by hyperseed.exe') 76 | parser.add_argument('-o', type = str, dest = 'path', required = True, metavar = '/path/to/save/folder', help = 'Where to save the seeds') 77 | args = parser.parse_args() 78 | # ensure an output directory is provided 79 | if not os.path.isdir(args.path): 80 | print '%s must be a directory' % args.path 81 | sys.exit(0) 82 | # generate the VM states 83 | states = generate_seeds(args.i) 84 | for i in range(len(states)): 85 | state = states[i] 86 | with open('%s/hc%06d.bin' % (args.path, i), 'wb') as f: 87 | f.write(state.raw()) 88 | -------------------------------------------------------------------------------- /scripts/example_lapic.py: -------------------------------------------------------------------------------- 1 | import sys 2 | import random 3 | import struct 4 | import argparse 5 | import os 6 | from vmstate import * 7 | 8 | APICBASE = 0xFEE00000 9 | 10 | READOFF = (0x20, 0x23, 0x30, 0x80, 0xb0, 0xa0, 0xd0, 0xd3, 0xe0, 0xe3, 11 | 0xf0, 0x100, 0x110, 0x120, 0x130, 0x140, 0x150, 0x160, 0x170, 12 | 0x180, 0x190, 0x1a0, 0x1b0, 0x1c0, 0x1d0, 0x1e0, 0x1f0, 0x200, 13 | 0x210, 0x220, 0x230, 0x240, 0x250, 0x260, 0x270, 0x280, 0x300, 14 | 0x310, 0x320, 0x330, 0x340, 0x350, 0x360, 0x370, 0x380, 0x390, 15 | 0x3e0, 0x2f0) 16 | 17 | WRITEOFF = (0x20, 0x80, 0xb0, 0xd0, 0xd3, 0xe0, 0xe3, 0xf0, 0x280, 0x300, 18 | 0x310, 0x320, 0x330, 0x340, 0x350, 0x360, 0x370, 0x380, 0x390, 19 | 0x3e0, 0x3f0, 0x2f0) 20 | 21 | rand32 = lambda: random.randint(0, 0xffffffff) 22 | 23 | def init_state(): 24 | state = VMState(0x86) 25 | state.setup_gdt() 26 | addr = state.memory.allocate(64) 27 | state.regs.rsp.value = addr + 64 28 | state.regs.rcx.value = 1 # loop once 
for string instructions 29 | return state 30 | 31 | def load(state, code): 32 | code += '\xcc' * 16 # append an INT3 ladder to stop 33 | addr = state.memory.allocate(len(code)) 34 | state.memory.write(addr, code) 35 | state.regs.rip.value = addr 36 | return state 37 | 38 | def alu_write(opcode): 39 | states = [] 40 | for off in WRITEOFF: 41 | state = init_state() 42 | state.regs.rax.value = APICBASE + off 43 | state.regs.rbx.value = rand32() 44 | states.append(load(state, struct.pack('> 16) & 0xffff 100 | self.selector = selector 101 | self.type = 0b110 102 | self.d = d 103 | self.s = 0 104 | self.dpl = dpl 105 | self.p = p 106 | 107 | def offset(self): 108 | return self.offset0_15 | (self.offset16_31 << 16) 109 | 110 | assert sizeof(IntGateDesc32) == 8 111 | 112 | class IntGateDesc64(IntGateDesc32): 113 | _pack_ = 1 114 | _fields_ = [('offset32_63', c_uint32), 115 | ('rsvd', c_uint32)] 116 | 117 | def __init__(self, offset = 0, selector = 0, d = 1, dpl = 0, p = 0): 118 | self.offset32_63 = (offset >> 32) & 0xffffffff 119 | IntGateDesc32.__init__(self, offset & 0xffffffff, selector, d, dpl, p) 120 | 121 | def offset(self): 122 | return IntGateDesc32.offset(self) | (self.offset32_63 << 32) 123 | 124 | assert sizeof(IntGateDesc64) == 16 125 | 126 | class TrapGateDesc32(IntGateDesc32): 127 | def __init__(self, offset = 0, selector = 0, d = 1, dpl = 0, p = 0): 128 | IntGateDesc32.__init__(self, offset, selector, d, dpl, p) 129 | self.type = 0b111 130 | 131 | assert sizeof(TrapGateDesc32) == 8 132 | 133 | class TrapGateDesc64(TrapGateDesc32): 134 | _pack_ = 1 135 | _fields_ = [('offset32_63', c_uint32), 136 | ('rsvd', c_uint32)] 137 | 138 | def __init__(self, offset = 0, selector = 0, d = 1, dpl = 0, p = 0): 139 | self.offset32_63 = (offset >> 32) & 0xffffffff 140 | TrapGateDesc32.__init__(self, offset & 0xffffffff, selector, d, dpl, p) 141 | 142 | def offset(self): 143 | return TrapGateDesc32.offset(self) | (self.offset32_63 << 32) 144 | 145 | assert 
sizeof(TrapGateDesc64) == 16 146 | 147 | class CallGateDesc32(Structure): 148 | _pack_ = 1 149 | _fields_ = [('offset0_15', c_uint16), 150 | ('selector', c_uint16), 151 | ('param_count', c_uint8, 5), 152 | ('rsvd0', c_uint8, 3), 153 | ('type', c_uint8, 4), 154 | ('s', c_uint8, 1), 155 | ('dpl', c_uint8, 2), 156 | ('p', c_uint8, 1), 157 | ('offset16_31', c_uint16)] 158 | 159 | def __init__(self, offset = 0, selector = 0, param_count = 0, dpl = 0, p = 0): 160 | self.offset0_15 = offset & 0xffff 161 | self.offset16_31 = (offset >> 16) & 0xffff 162 | self.selector = selector 163 | self.type = 0b1100 164 | self.s = 0 165 | self.param_count = param_count 166 | self.dpl = dpl 167 | self.p = p 168 | 169 | def offset(self): 170 | return self.offset0_15 | (self.offset16_31 << 16) 171 | 172 | assert sizeof(CallGateDesc32) == 8 173 | 174 | class CallGateDesc64(CallGateDesc32): 175 | _pack_ = 1 176 | _fields_ = [('offset32_63', c_uint32), 177 | ('rsvd', c_uint32)] 178 | 179 | def __init__(self, offset = 0, selector = 0, param_count = 0, dpl = 0, p = 0): 180 | self.offset32_63 = (offset >> 32) & 0xffffffff 181 | CallGateDesc32.__init__(self, offset & 0xffffffff, selector, param_count, dpl, p) 182 | 183 | def offset(self): 184 | return CallGateDesc32.offset(self) | (self.offset32_63 << 32) 185 | 186 | assert sizeof(CallGateDesc64) == 16 187 | 188 | class TaskGateDesc32(Structure): 189 | _pack_ = 1 190 | _fields_ = [('rsvd0', c_uint16), 191 | ('selector', c_uint16), 192 | ('rsvd1', c_uint8), 193 | ('type', c_uint8, 4), 194 | ('s', c_uint8, 1), 195 | ('dpl', c_uint8, 2), 196 | ('p', c_uint8, 1), 197 | ('rsvd2', c_uint16)] 198 | 199 | def __init__(self, selector = 0, dpl = 0, p = 0): 200 | self.selector = selector 201 | self.type = 0b0101 202 | self.dpl = dpl 203 | self.p = p 204 | 205 | assert sizeof(TaskGateDesc32) == 8 206 | 207 | class SegDesc32(Structure): 208 | _pack_ = 1 209 | _fields_ = [('limit0_15', c_uint16), 210 | ('base0_15', c_uint16), 211 | ('base16_23', c_uint8), 212 
| ('type', c_uint8, 4), 213 | ('s', c_uint8, 1), 214 | ('dpl', c_uint8, 2), 215 | ('p', c_uint8, 1), 216 | ('limit16_19', c_uint8, 4), 217 | ('avl', c_uint8, 1), 218 | ('l', c_uint8, 1), 219 | ('db', c_uint8, 1), 220 | ('g', c_uint8, 1), 221 | ('base24_31', c_uint8)] 222 | 223 | def __init__(self, base = 0, limit = 0, type = 0, s = 0, dpl = 0, p = 0, avl = 0, l = 0, db = 0, g = 0): 224 | self.base0_15 = base & 0xffff 225 | self.base16_23 = (base >> 16) & 0xff 226 | self.base24_31 = (base >> 24) & 0xff 227 | self.limit0_15 = limit & 0xffff 228 | self.limit16_19 = (limit >> 16) & 0xf 229 | self.type = type 230 | self.s = s 231 | self.dpl = dpl 232 | self.p = p 233 | self.avl = avl 234 | self.l = l 235 | self.db = db 236 | self.g = g 237 | 238 | def base(self): 239 | return self.base0_15 | (self.base16_23 << 16) | (self.base24_31 << 24) 240 | 241 | def limit(self): 242 | return self.limit0_15 | (self.limit16_19 << 16) 243 | 244 | assert sizeof(SegDesc32) == 8 245 | 246 | class TssDesc32(SegDesc32): 247 | def __init__(self, base = 0, limit = 0, b = 0, dpl = 0, p = 0, avl = 0, g = 0): 248 | type = 0b1001 | (b << 1) 249 | SegDesc32.__init__(self, base, limit, type, 0, dpl, p, avl, 0, 0, g) 250 | 251 | assert sizeof(TssDesc32) == 8 252 | 253 | class TssDesc64(TssDesc32): 254 | _pack_ = 1 255 | _fields_ = [('base32_63', c_uint32), 256 | ('rsvd', c_uint32)] 257 | 258 | def __init__(self, base = 0, limit = 0, b = 0, dpl = 0, p = 0, avl = 0, g = 0): 259 | self.base32_63 = (base >> 32) & 0xffffffff 260 | TssDesc32.__init__(self, base & 0xffffffff, limit, b, dpl, p, avl, g) 261 | 262 | def base(self): 263 | return TssDesc32.base(self) | (self.base32_63 << 32) 264 | 265 | assert sizeof(TssDesc64) == 16 266 | 267 | class PDE32(Structure): 268 | _pack_ = 1 269 | _fields_ = [('p', c_uint32, 1), 270 | ('w', c_uint32, 1), 271 | ('u', c_uint32, 1), 272 | ('pwt', c_uint32, 1), 273 | ('pcd', c_uint32, 1), 274 | ('a', c_uint32, 1), 275 | ('rsvd0', c_uint32, 1), 276 | ('ps', c_uint32, 1), 
277 | ('rsvd1', c_uint32, 4), 278 | ('pfn', c_uint32, 20)] 279 | 280 | assert sizeof(PDE32) == 4 281 | 282 | class PTE32(Structure): 283 | _pack_ = 1 284 | _fields_ = [('p', c_uint32, 1), 285 | ('w', c_uint32, 1), 286 | ('u', c_uint32, 1), 287 | ('pwt', c_uint32, 1), 288 | ('pcd', c_uint32, 1), 289 | ('a', c_uint32, 1), 290 | ('d', c_uint32, 1), 291 | ('pat', c_uint32, 1), 292 | ('g', c_uint32, 1), 293 | ('rsvd1', c_uint32, 3), 294 | ('pfn', c_uint32, 20)] 295 | 296 | assert sizeof(PTE32) == 4 297 | 298 | class PML4E(Structure): 299 | _pack_ = 1 300 | _fields_ = [('p', c_uint64, 1), 301 | ('w', c_uint64, 1), 302 | ('u', c_uint64, 1), 303 | ('pwt', c_uint64, 1), 304 | ('pcd', c_uint64, 1), 305 | ('a', c_uint64, 1), 306 | ('rsvd0', c_uint64, 6), 307 | ('pfn', c_uint64, 40), 308 | ('rsvd1', c_uint64, 11), 309 | ('xd', c_uint64, 1)] 310 | 311 | assert sizeof(PML4E) == 8 312 | 313 | class PDPTE(Structure): 314 | _pack_ = 1 315 | _fields_ = [('p', c_uint64, 1), 316 | ('w', c_uint64, 1), 317 | ('u', c_uint64, 1), 318 | ('pwt', c_uint64, 1), 319 | ('pcd', c_uint64, 1), 320 | ('a', c_uint64, 1), 321 | ('d', c_uint64, 1), 322 | ('ps', c_uint64, 1), 323 | ('g', c_uint64, 1), 324 | ('rsvd0', c_uint64, 3), 325 | ('pfn', c_uint64, 40), 326 | ('rsvd1', c_uint64, 11), 327 | ('xd', c_uint64, 1)] 328 | 329 | assert sizeof(PDPTE) == 8 330 | 331 | class PDE64(Structure): 332 | _pack_ = 1 333 | _fields_ = [('p', c_uint64, 1), 334 | ('w', c_uint64, 1), 335 | ('u', c_uint64, 1), 336 | ('pwt', c_uint64, 1), 337 | ('pcd', c_uint64, 1), 338 | ('a', c_uint64, 1), 339 | ('rsvd0', c_uint64, 1), 340 | ('ps', c_uint64, 1), 341 | ('rsvd1', c_uint64, 4), 342 | ('pfn', c_uint64, 40), 343 | ('rsvd2', c_uint64, 11), 344 | ('xd', c_uint64, 1)] 345 | 346 | assert sizeof(PDE64) == 8 347 | 348 | class PTE64(Structure): 349 | _pack_ = 1 350 | _fields_ = [('p', c_uint64, 1), 351 | ('w', c_uint64, 1), 352 | ('u', c_uint64, 1), 353 | ('pwt', c_uint64, 1), 354 | ('pcd', c_uint64, 1), 355 | ('a', c_uint64, 1), 
356 | ('d', c_uint64, 1), 357 | ('pat', c_uint64, 1), 358 | ('g', c_uint64, 1), 359 | ('rsvd0', c_uint64, 3), 360 | ('pfn', c_uint64, 40), 361 | ('rsvd2', c_uint64, 7), 362 | ('pkey', c_uint64, 4), 363 | ('xd', c_uint64, 1)] 364 | 365 | assert sizeof(PTE64) == 8 366 | 367 | class RegCr0(Structure): 368 | _pack_ = 1 369 | _fields_ = [('PE', c_uint32, 1), 370 | ('MP', c_uint32, 1), 371 | ('EM', c_uint32, 1), 372 | ('TS', c_uint32, 1), 373 | ('ET', c_uint32, 1), 374 | ('NE', c_uint32, 1), 375 | ('rsvd0', c_uint32, 10), 376 | ('WP', c_uint32, 1), 377 | ('rsvd1', c_uint32, 1), 378 | ('AM', c_uint32, 1), 379 | ('rsvd2', c_uint32, 10), 380 | ('NW', c_uint32, 1), 381 | ('CD', c_uint32, 1), 382 | ('PG', c_uint32, 1)] 383 | 384 | assert sizeof(RegCr0) == 4 385 | 386 | class RegCr4(Structure): 387 | _pack_ = 1 388 | _fields_ = [('VME', c_uint32, 1), 389 | ('PVI', c_uint32, 1), 390 | ('TSD', c_uint32, 1), 391 | ('DE', c_uint32, 1), 392 | ('PSE', c_uint32, 1), 393 | ('PAE', c_uint32, 1), 394 | ('MCE', c_uint32, 1), 395 | ('PGE', c_uint32, 1), 396 | ('PCE', c_uint32, 1), 397 | ('OSFXSR', c_uint32, 1), 398 | ('OSXMMEXCPT', c_uint32, 1), 399 | ('UMIP', c_uint32, 1), 400 | ('rsvd0', c_uint32, 1), 401 | ('VMXE', c_uint32, 1), 402 | ('SMXE', c_uint32, 1), 403 | ('rsvd1', c_uint32, 1), 404 | ('FSGSBASE', c_uint32, 1), 405 | ('PCIDE', c_uint32, 1), 406 | ('OSXSAVE', c_uint32, 1), 407 | ('rsvd2', c_uint32, 1), 408 | ('SMEP', c_uint32, 1), 409 | ('SMAP', c_uint32, 1), 410 | ('PKE', c_uint32, 1), 411 | ('rsvd3', c_uint32, 9)] 412 | 413 | assert sizeof(RegCr4) == 4 414 | 415 | class RegEflags(Structure): 416 | _pack_ = 1 417 | _fields_ = [('CF', c_uint32, 1), 418 | ('one', c_uint32, 1), 419 | ('PF', c_uint32, 1), 420 | ('rsvd0', c_uint32, 1), 421 | ('AF', c_uint32, 1), 422 | ('rsvd1', c_uint32, 1), 423 | ('ZF', c_uint32, 1), 424 | ('SF', c_uint32, 1), 425 | ('TF', c_uint32, 1), 426 | ('IF', c_uint32, 1), 427 | ('DF', c_uint32, 1), 428 | ('OF', c_uint32, 1), 429 | ('IOPL', c_uint32, 2), 430 
| ('NT', c_uint32, 1), 431 | ('rsvd2', c_uint32, 1), 432 | ('RF', c_uint32, 1), 433 | ('VM', c_uint32, 1), 434 | ('AC', c_uint32, 1), 435 | ('VIF', c_uint32, 1), 436 | ('VIP', c_uint32, 1), 437 | ('ID', c_uint32, 1), 438 | ('rsvd3', c_uint32, 10)] 439 | 440 | def __init__(self): 441 | self.one = 1 442 | 443 | assert sizeof(RegEflags) == 4 444 | 445 | class RegEfer(Structure): 446 | _fields_ = [('SCE', c_uint32, 1), 447 | ('rsvd0', c_uint32, 7), 448 | ('LME', c_uint32, 1), 449 | ('rsvd1', c_uint32, 1), 450 | ('LMA', c_uint32, 1), 451 | ('NXE', c_uint32, 1), 452 | ('rsvd2', c_uint32, 20)] 453 | 454 | assert sizeof(RegEfer) == 4 455 | 456 | class Reg32(Structure): 457 | _fields_ = [('value', c_uint32)] 458 | 459 | assert sizeof(Reg32) == 4 460 | 461 | class Reg64(Structure): 462 | _pack_ = 1 463 | _fields_ = [('value', c_uint64)] 464 | 465 | assert sizeof(Reg64) == 8 466 | 467 | class Reg128(Structure): 468 | _pack_ = 1 469 | _fields_ = [('low', c_uint64), 470 | ('high', c_uint64)] 471 | 472 | assert sizeof(Reg128) == 16 473 | 474 | class RegTable32(Structure): 475 | _pack_ = 1 476 | _fields_ = [('base', c_uint32), 477 | ('limit', c_uint16)] 478 | 479 | assert sizeof(RegTable32) == 6 480 | 481 | class RegTable64(Structure): 482 | _pack_ = 1 483 | _fields_ = [('base', c_uint64), 484 | ('limit', c_uint16)] 485 | 486 | assert sizeof(RegTable64) == 10 487 | 488 | class RegSeg32(Structure): 489 | _pack_ = 1 490 | _fields_ = [('base', c_uint32), 491 | ('limit', c_uint32), 492 | ('selector', c_uint16), 493 | ('type', c_uint16, 4), 494 | ('s', c_uint16, 1), 495 | ('dpl', c_uint16, 2), 496 | ('p', c_uint16, 1), 497 | ('rsvd0', c_uint16, 4), 498 | ('avl', c_uint16, 1), 499 | ('l', c_uint16, 1), 500 | ('db', c_uint16, 1), 501 | ('g', c_uint16, 1)] 502 | 503 | assert sizeof(RegSeg32) == 12 504 | 505 | class RegSeg64(Structure): 506 | _pack_ = 1 507 | _fields_ = [('base', c_uint64), 508 | ('limit', c_uint32), 509 | ('selector', c_uint16), 510 | ('type', c_uint16, 4), 511 | ('s', 
c_uint16, 1), 512 | ('dpl', c_uint16, 2), 513 | ('p', c_uint16, 1), 514 | ('rsvd0', c_uint16, 4), 515 | ('avl', c_uint16, 1), 516 | ('l', c_uint16, 1), 517 | ('db', c_uint16, 1), 518 | ('g', c_uint16, 1)] 519 | 520 | assert sizeof(RegSeg64) == 16 521 | 522 | class RegFile(Structure): 523 | _pack_ = 1 524 | _fields_ = [('rax', Reg64), 525 | ('rcx', Reg64), 526 | ('rdx', Reg64), 527 | ('rbx', Reg64), 528 | ('rsp', Reg64), 529 | ('rbp', Reg64), 530 | ('rsi', Reg64), 531 | ('rdi', Reg64), 532 | ('r8', Reg64), 533 | ('r9', Reg64), 534 | ('r10', Reg64), 535 | ('r11', Reg64), 536 | ('r12', Reg64), 537 | ('r13', Reg64), 538 | ('r14', Reg64), 539 | ('r15', Reg64), 540 | ('rip', Reg64), 541 | ('eflags', RegEflags), 542 | ('es', RegSeg64), 543 | ('cs', RegSeg64), 544 | ('ss', RegSeg64), 545 | ('ds', RegSeg64), 546 | ('fs', RegSeg64), 547 | ('gs', RegSeg64), 548 | ('tr', RegSeg64), 549 | ('idtr', RegTable64), 550 | ('gdtr', RegTable64), 551 | ('cr0', RegCr0), 552 | ('cr2', Reg64), 553 | ('cr3', Reg64), 554 | ('cr4', RegCr4), 555 | ('dr0', Reg64), 556 | ('dr1', Reg64), 557 | ('dr2', Reg64), 558 | ('dr3', Reg64), 559 | ('dr6', Reg32), 560 | ('dr7', Reg32), 561 | ('sysentercs', Reg32), 562 | ('sysentereip', Reg64), 563 | ('sysenteresp', Reg64), 564 | ('efer', RegEfer), 565 | ('kernelgsbase', Reg64), 566 | ('star', Reg64), 567 | ('lstar', Reg64), 568 | ('cstar', Reg64), 569 | ('sfmask', Reg32)] 570 | 571 | def __init__(self): 572 | for field_info in self._fields_: 573 | setattr(self, field_info[0], field_info[1]()) 574 | 575 | class Memory(bytearray): 576 | def allocate(self, size, alignment = 1): 577 | addr = (len(self) + alignment - 1) / alignment * alignment 578 | self.extend('\x00' * (addr + size - len(self))) 579 | return addr 580 | 581 | def write(self, addr, content): 582 | assert addr + len(content) <= len(self) 583 | self[addr:addr + len(content)] = content 584 | 585 | def read(self, addr, size): 586 | assert addr + size <= len(self) 587 | return self[addr:addr + size] 
588 | 589 | class VMState(object): 590 | def __init__(self, arch = 0x86): 591 | assert arch in (0x86, 0x64), 'Unsupported architecture: %x' % arch 592 | self.memory = Memory() 593 | self.regs = RegFile() 594 | self.regs.cr0.PE = 1 595 | if arch == 0x64: 596 | self.regs.efer.SCE = 1 597 | self.regs.efer.LME = 1 598 | self.regs.efer.LMA = 1 599 | self.regs.efer.NXE = 1 600 | 601 | def setup_real(self): 602 | ''' 603 | Setup registers for real-mode execution. 604 | ''' 605 | assert self.regs.efer.LMA == 0 606 | # setup segment selector registers 607 | for (reg, s, type) in [(self.regs.cs, 1, 0b1011), 608 | (self.regs.ds, 1, 0b0011), 609 | (self.regs.es, 1, 0b0011), 610 | (self.regs.fs, 1, 0b0011), 611 | (self.regs.gs, 1, 0b0011), 612 | (self.regs.ss, 1, 0b0011), 613 | (self.regs.tr, 0, 0b1011)]: 614 | reg.limit = 0xffff 615 | reg.type = type 616 | reg.s = s 617 | reg.p = 1 618 | # setup table registers 619 | for reg in [self.regs.idtr, self.regs.gdtr]: 620 | reg.limit = 0xffff 621 | # disable protected mode 622 | self.regs.cr0.PE = 0 623 | 624 | def setup_paging(self): 625 | ''' 626 | Setup an identity mapping (VA == PA) with full accesses. 
627 | ''' 628 | assert self.regs.cr0.PG == 0 629 | if self.regs.efer.LMA == 0: 630 | # allocate a page table directory 631 | pgdiraddr = self.memory.allocate(PGSIZE, PGSIZE) 632 | # setup identity mapping for [0, 4GB) 633 | for i in range(PGSIZE / sizeof(PDE32)): 634 | pde = PDE32.from_buffer(self.memory, pgdiraddr + i * sizeof(PDE32)) 635 | pde.p = 1 636 | pde.w = 1 637 | pde.u = 1 638 | pde.ps = 1 # mark as a 4MB large page 639 | pde.pfn = (i << 10) 640 | # setup cr3 641 | self.regs.cr3.value = pgdiraddr 642 | else: 643 | # allocate a PML4 and a PDPT 644 | pml4addr = self.memory.allocate(PGSIZE, PGSIZE) 645 | pdptaddr = self.memory.allocate(PGSIZE, PGSIZE) 646 | # make the first PML4 entry point to the PDPT 647 | pml4e = PML4E.from_buffer(self.memory, pml4addr) 648 | pml4e.p = 1 649 | pml4e.w = 1 650 | pml4e.u = 1 651 | pml4e.pfn = (pdptaddr >> 12) 652 | # setup identity mapping for [0, 512GB) 653 | for i in range(PGSIZE / sizeof(PDPTE)): 654 | pdpte = PDPTE.from_buffer(self.memory, pdptaddr + i * sizeof(PDPTE)) 655 | pdpte.p = 1 656 | pdpte.w = 1 657 | pdpte.u = 1 658 | pdpte.ps = 1 # mark as a 1GB large page (requires hardware support) 659 | pdpte.pfn = (i << 18) 660 | # PAE is required for 4-level paging 661 | self.regs.cr4.PAE = 1 662 | # setup cr3 663 | self.regs.cr3.value = pml4addr 664 | # enable large page support 665 | self.regs.cr4.PSE = 1 666 | # turn on paging 667 | self.regs.cr0.PG = 1 668 | 669 | def load_seg(self, reg, selector): 670 | ''' 671 | Load the segment register and update its cache accordingly. 
672 | ''' 673 | assert (selector & 0b100) == 0, 'LDT is not supported yet' 674 | assert selector + sizeof(SegDesc32) - 1 <= self.regs.gdtr.limit 675 | desc_addr = self.regs.gdtr.base + (selector & ~0b111) 676 | desc = SegDesc32.from_buffer(self.memory, desc_addr) 677 | if desc.s == 0: 678 | if desc.type == 0b1100: 679 | desc = (CallGateDesc64 if self.regs.efer.LMA else CallGateDesc32).from_buffer(self.memory, desc_addr) 680 | elif desc.type == 0b1011 or desc.type == 0b1001: 681 | desc = (TssDesc64 if self.regs.efer.LMA else TssDesc32).from_buffer(self.memory, desc_addr) 682 | else: 683 | raise NotImplementedError 684 | reg.base = desc.base() 685 | reg.limit = desc.limit() if not desc.g else (desc.limit() * PGSIZE + PGSIZE - 1) 686 | reg.selector = selector 687 | reg.type = desc.type 688 | reg.s = desc.s 689 | reg.dpl = desc.dpl 690 | reg.p = desc.p 691 | reg.avl = desc.avl 692 | reg.l = desc.l 693 | reg.db = desc.db 694 | reg.g = desc.g 695 | 696 | def setup_gdt(self): 697 | ''' 698 | Setup the Global Descriptor Table (GDT) using flat memory model. 699 | The constructed GDT will be like [NULL, KT, UT, KD, UD, TSS], and 700 | all the segment registers are initialized to refer to KT/KD. 701 | If you wish to setup a customized GDT, please do it yourself. 
702 | ''' 703 | assert self.regs.gdtr.base == 0 704 | assert self.regs.gdtr.limit == 0 705 | # create a task state segment 706 | long_mode = self.regs.efer.LMA 707 | tss_size = sizeof(TSS64) if long_mode else sizeof(TSS32) 708 | tss_addr = self.memory.allocate(tss_size) 709 | # GDT always starts with a NULL descriptor 710 | gdt = [SegDesc32(), # NULL 711 | SegDesc32(0, 0xfffff, 0b1011, 1, 0, 1, 0, long_mode, 1 - long_mode, 1), # KT 712 | SegDesc32(0, 0xfffff, 0b1011, 1, 3, 1, 0, long_mode, 1 - long_mode, 1), # UT 713 | SegDesc32(0, 0xfffff, 0b0011, 1, 0, 1, 0, 0, 1, 1), # KD 714 | SegDesc32(0, 0xfffff, 0b0011, 1, 3, 1, 0, 0, 1, 1)] # UD 715 | # add a TSS descriptor to GDT based on the arch 716 | if long_mode: 717 | gdt.append(TssDesc64(tss_addr, tss_size - 1, 1, 0, 1, 0, 0)) 718 | else: 719 | gdt.append(TssDesc32(tss_addr, tss_size - 1, 1, 0, 1, 0, 0)) 720 | # allocate GDT from the memory 721 | gdt_size = sum([sizeof(desc) for desc in gdt]) 722 | gdt_addr = self.memory.allocate(gdt_size) 723 | # initialize the GDT layout accordingly 724 | self.memory.write(gdt_addr, ''.join([str(bytearray(desc)) for desc in gdt])) 725 | # update gdtr to point to the GDT in memory 726 | self.regs.gdtr.base = gdt_addr 727 | self.regs.gdtr.limit = gdt_size - 1 728 | # update segment registers 729 | self.load_seg(self.regs.cs, 0x8) 730 | self.load_seg(self.regs.ds, 0x18) 731 | self.load_seg(self.regs.es, 0x18) 732 | self.load_seg(self.regs.ss, 0x18) 733 | self.load_seg(self.regs.tr, 0x28) 734 | 735 | def setup_idt(self, descs): 736 | ''' 737 | Setup the Interrupt Descriptor Table given a list of IDT descriptors. 
738 | ''' 739 | # convert the descriptors into raw bytes 740 | raw = ''.join([str(bytearray(desc)) for desc in descs]) 741 | # allocate IDT and set it up accordingly 742 | idt_size = len(raw) 743 | idt_addr = self.memory.allocate(idt_size, 8) 744 | self.memory.write(idt_addr, raw) 745 | # update idtr to point to the IDT 746 | self.regs.idtr.base = idt_addr 747 | self.regs.idtr.limit = idt_size - 1 748 | 749 | def raw(self): 750 | ''' 751 | Convert the current VM state to raw bytes. 752 | ''' 753 | return bytearray(self.regs) + bytearray(self.memory) 754 | 755 | def dump(self, showreg = True, showmem = False): 756 | ''' 757 | Dump the current register/memory state. 758 | ''' 759 | if showreg: 760 | print '==================== REGISTER STATE =====================' 761 | print 762 | for field_info in self.regs._fields_: 763 | print '%s: %s' % (field_info[0], dumps(getattr(self.regs, field_info[0]))) 764 | print 765 | if showmem: 766 | print '===================== MEMORY STATE ======================' 767 | print 768 | for addr in range(0, len(self.memory), 16): 769 | remaining = len(self.memory) - addr 770 | content = self.memory.read(addr, 16 if remaining > 16 else remaining) 771 | print '%08x: %s' % (addr, ' '.join(map(lambda b: '%02x' % b, content))) 772 | print 773 | 774 | if __name__ == '__main__': 775 | if len(sys.argv) < 2: 776 | print 'usage: %s state.bin' % sys.argv[0] 777 | sys.exit(0) 778 | raw = bytearray(open(sys.argv[1], 'rb').read()) 779 | state = VMState() 780 | state.regs = type(state.regs).from_buffer(raw) 781 | state.memory = Memory(raw[sizeof(state.regs):]) 782 | state.dump(True, True) 783 | --------------------------------------------------------------------------------
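The binary format described in the README (a packed `REG_FILE` followed by raw physical memory) can also be read without `vmstate.py`. The following is a minimal sketch, not part of the repo: it assumes Python 3 and little-endian encoding, and the names `REG_FILE_FORMAT`, `REG_FILE_SIZE`, and `split_seed` are hypothetical helpers introduced here for illustration. The field order follows the `REG_FILE` definition in the README.

```python
import struct

# Little-endian with no padding ("<"), mirroring "#pragma pack(1)" in the README.
# All names in this block are illustrative and do not exist in the repo.
_SEG = "QIHH"   # REG_SEG64: Base, Limit, Selector, Attributes (16 bytes)
_TABLE = "QH"   # REG_TABLE64: Base, Limit (10 bytes)

REG_FILE_FORMAT = (
    "<"
    + "16Q"       # Rax .. R15, in the order listed in REG_FILE
    + "Q"         # Rip
    + "I"         # Eflags
    + _SEG * 7    # Es, Cs, Ss, Ds, Fs, Gs, Tr
    + _TABLE * 2  # Idtr, Gdtr
    + "IQQI"      # Cr0, Cr2, Cr3, Cr4
    + "4Q"        # Dr0 .. Dr3
    + "II"        # Dr6, Dr7
    + "IQQ"       # SysenterCs, SysenterEip, SysenterEsp
    + "I"         # Efer
    + "4Q"        # KernelGsBase, Star, Lstar, Cstar
    + "I"         # Sfmask
)
REG_FILE_SIZE = struct.calcsize(REG_FILE_FORMAT)  # 396 bytes for this layout


def split_seed(raw):
    """Split a raw seed into (tuple of register fields, physical memory bytes)."""
    if len(raw) < REG_FILE_SIZE:
        raise ValueError("seed is shorter than REG_FILE")
    regs = struct.unpack_from(REG_FILE_FORMAT, raw, 0)
    return regs, raw[REG_FILE_SIZE:]
```

With this layout, `Rip` is the field at index 16 of the returned tuple, and everything after the first `REG_FILE_SIZE` bytes is the guest physical memory image starting at address 0. For example, `regs, mem = split_seed(open('bin/apic.bin', 'rb').read())` would load one of the seeds from the `bin/` folder.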