├── LICENSE ├── Makefile ├── README.md ├── genmap.py ├── parse_dram.py └── srcs ├── boot ├── boot_ap.asm ├── boot_bsp.asm ├── irqs.asm ├── program.asm ├── program_chrome.asm ├── program_copygroup.asm ├── program_defender.asm ├── program_svg.asm └── program_word.asm ├── data └── pqueue.asm ├── defines.asm ├── disk └── ide.asm ├── disp └── console.asm ├── dstruc └── flut.asm ├── emu ├── ide.asm └── win.asm ├── fuzzers ├── defender.asm ├── generic.asm ├── pdf.asm └── word.asm ├── io └── serial.asm ├── memory_map ├── mm └── mm.asm ├── net ├── falktp.asm ├── i825xx.asm ├── tags └── x540.asm ├── os └── win32.asm ├── time └── time.asm └── vm └── snapshot.asm /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2020 Gamozo Labs, LLC 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 
22 | -------------------------------------------------------------------------------- /Makefile: -------------------------------------------------------------------------------- 1 | all: atropos.vfd 2 | copy brownie.vfd y:\tftpd\fleeb.bin 3 | 4 | bochs: all 5 | @"D:\emulation\bochs\bochs-20141121-msvc-src\bochs-20141121\bochs.exe" -f bochsrc.bxrc 6 | 7 | atropos.vfd: 8 | nasm -f bin -o brownie.vfd srcs/boot/boot_bsp.asm 9 | 10 | clean: 11 | -@del brownie.vfd 12 | 13 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Falkervisor (brownie) 2 | 3 | This is a hypervisor for fuzzing. It runs on bare metal (not a driver), and runs on AMD fam15h machines. It's pretty cool but there are so many issues with this version, but it's still fun to look at and try to use. 4 | 5 | This is one of the first versions of falkervisor. It was used to find bugs in Chrome sandbox, Windows Defender, Word (RTF), and probably some other random crap between 2014-2015. Since I didn't use version control I'm probably missing pieces, but this actually builds and should run on any AMD fam15h machine. It should be able to boot up single-core OSes right off IDE based disk, and take snapshots via proprietary falktp which I don't have the server for anymore, so you'd have to reverse it. You also need an Intel x540 for this to run. 6 | 7 | It was quickly dropped in favor of C once I became more sane. It is the foundation of most of the concepts used in my modern version of falkervisor, which is now written in Rust. 8 | 9 | Fun fact, this is still the version I use for snapshots as it's my only hypervisor with IOMMU support! 
10 | 11 | There's some cool historical shit in here: 12 | - It's all assembly, cause I was/am dumb 13 | - It has NUMA support 14 | - It has CoW support so minimal memory is used to fuzz 15 | - It quickly restores to a snapshot by walking dirty bits 16 | - It has IOMMU support but uses hardcoded e820 tables for guests xD 17 | - It has many different forms of support for reading/writing guest memory 18 | - It can run live OSes under it, or run from a snapshot downloaded from network 19 | - It uses many different types of coverage 20 | - It builds instantly, just run `nmake` or `make` 21 | 22 | I'd be impressed if someone got this to run and take a snapshot. It has all the code here, but some tweaks would need to be made for your specific hardware. 23 | -------------------------------------------------------------------------------- /genmap.py: -------------------------------------------------------------------------------- 1 | mmio_ranges = [ 2 | [0xD9000000, 0xDFFFFFFF], 3 | [0xD6000000, 0xD8FFFFFF], 4 | [0xE0000000, 0xEFFFFFFF], 5 | [0x000A0000, 0x000BFFFF], 6 | [0xF0000000, 0xFFFFFFFF], 7 | [0x00700000, 0x00E0FFFF]] 8 | 9 | def is_mmio(addr): 10 | for ra in mmio_ranges: 11 | if addr >= ra[0] and (addr + 4095) <= ra[1]: 12 | return True 13 | 14 | return False 15 | 16 | e820_map = [ 17 | [0x0, 0x9e400, 0x1], 18 | [0x9e400, 0x1c00, 0x2], 19 | [0xe8000, 0x18000, 0x2], 20 | [0x100000, 0xcfd60000, 0x1], 21 | ] 22 | 23 | def e820(addr): 24 | for ra in e820_map: 25 | if addr >= ra[0] and (addr + 4096) < (ra[0] + ra[1]): 26 | return ra[2] 27 | 28 | return 2 29 | 30 | da_map = [] 31 | start = 0 32 | chain = 0 33 | 34 | for addr in range(0, 1024 * 1024 * 1024 * 256, 4096): 35 | if is_mmio(addr) == False and e820(addr) == 1: 36 | if chain == 0: 37 | start = addr 38 | 39 | chain = 1 40 | else: 41 | da_len = addr - start 42 | if chain: 43 | da_map.append([start, da_len]) 44 | 45 | chain = 0 46 | 47 | #print is_mmio(addr), e820(addr) 48 | 49 | i = 1 50 | for map in da_map: 51 | 
print "\t dq 0x%.8x, 0x%.8x, 0x1, 0x%x" % (map[0], map[1], i) 52 | i += 1 53 | 54 | -------------------------------------------------------------------------------- /parse_dram.py: -------------------------------------------------------------------------------- 1 | datas = [[3, 0x10270000, 0, 0], 2 | [0x10280003, 0x20270001, 0, 0], 3 | [0x20280003, 0x30270002, 0, 0], 4 | [0x30280003, 0x40260003, 0, 0]] 5 | 6 | for data in datas: 7 | dram_base = ((data[2] & 0xff) << 40) | ((data[0] & 0xffff0000) << (24 - 16)) 8 | intlv_en = data[0] & (3 << 8) 9 | we = data[0] & (1 << 1) 10 | re = data[0] & (1 << 0) 11 | 12 | dram_limit = ((data[3] & 0xff) << 40) | ((data[1] & 0xffff0000) << (24 - 16)) | 0xFFFFFF 13 | intlv_sel = data[1] & (3 << 8) 14 | dst_node = data[1] & 3 15 | 16 | print "DRAM Base: %.16x\n" \ 17 | "DRAM Limit: %.16x\n" \ 18 | "Interleave En: %d\n" \ 19 | "Write En: %d\n" \ 20 | "Read En: %d\n" \ 21 | "Interleave Sel: %d\n" \ 22 | "Dest node: %d" % (dram_base, dram_limit, intlv_en, we, re, intlv_sel, dst_node) 23 | print 24 | 25 | -------------------------------------------------------------------------------- /srcs/boot/boot_ap.asm: -------------------------------------------------------------------------------- 1 | [bits 16] 2 | 3 | ; boot_ap 4 | ; 5 | ; Summary: 6 | ; 7 | ; This is the real mode entry point for all APs. This function sets the A20 8 | ; line, loads the GDT, and goes into protected mode. 9 | ; 10 | ; Optimization: 11 | ; 12 | ; Readability 13 | ; 14 | boot_ap: 15 | cli 16 | 17 | ; Blindly set the A20 line 18 | in al, 0x92 19 | or al, 2 20 | out 0x92, al 21 | 22 | ; Load the gdt (for 32-bit protected mode) 23 | lgdt [gdt] 24 | 25 | ; Set the protection bit 26 | mov eax, cr0 27 | or al, (1 << 0) 28 | mov cr0, eax 29 | 30 | ; We go to protected land now! 31 | jmp 0x0008:ap_pmland 32 | 33 | [bits 32] 34 | 35 | ; ap_pmland 36 | ; 37 | ; Summary: 38 | ; 39 | ; This is the protected mode entry point for all APs. 
This function sets up 40 | ; paging, enables long mode, and loads the 64-bit gdt. 41 | ; 42 | ; Optimization: 43 | ; 44 | ; Readability 45 | ; 46 | ap_pmland: 47 | ; Set up all data selectors 48 | mov ax, 0x10 49 | mov es, ax 50 | mov ds, ax 51 | mov fs, ax 52 | mov ss, ax 53 | mov gs, ax 54 | 55 | ; Enable SSE 56 | mov eax, cr0 57 | btr eax, 2 ; Disable CR0.EM 58 | bts eax, 1 ; Enable CR0.MP 59 | mov cr0, eax 60 | 61 | ; Enable OSFXSR and OSXSAVE and OSXMMEXCPT 62 | mov eax, cr4 63 | bts eax, 9 64 | bts eax, 18 65 | bts eax, 10 66 | mov cr4, eax 67 | 68 | ; Disable paging 69 | mov eax, cr0 70 | and eax, 0x7FFFFFFF 71 | mov cr0, eax 72 | 73 | ; Set up CR3 74 | mov edi, 0x00100000 75 | mov cr3, edi 76 | 77 | ; Enable PAE 78 | mov eax, cr4 79 | or eax, (1 << 5) 80 | mov cr4, eax 81 | 82 | ; Enable long mode 83 | mov ecx, 0xC0000080 84 | rdmsr 85 | or eax, (1 << 8) 86 | wrmsr 87 | 88 | ; Enable paging 89 | mov eax, cr0 90 | or eax, (1 << 31) 91 | mov cr0, eax 92 | 93 | ; Load the 64-bit GDT and jump to the long mode code 94 | lgdt [gdt64] 95 | jmp 0x08:lmland 96 | 97 | align 8 98 | gdt64_base: 99 | dq 0x0000000000000000 100 | dq 0x0020980000000000 101 | dq 0x0000900000000000 102 | 103 | gdt64: 104 | .len: dw (gdt64 - gdt64_base) - 1 105 | .base: dq gdt64_base 106 | 107 | [bits 64] 108 | 109 | init_pic: 110 | push rax 111 | 112 | ; Start PIC init 113 | mov al, (PIC_INIT | PIC_ICW4) 114 | out MPIC_CTRL, al 115 | out SPIC_CTRL, al 116 | 117 | ; IRQ 0-7 now based at int IRQ07_MAP (32) 118 | ; IRQ 8-F now based at int IRQ8F_MAP (40) 119 | mov al, IRQ07_MAP 120 | out MPIC_DATA, al 121 | mov al, IRQ8F_MAP 122 | out SPIC_DATA, al 123 | 124 | ; Inform the MPIC about the SPIC and inform the SPIC about the cascade 125 | mov al, 0x04 126 | out MPIC_DATA, al 127 | mov al, 0x02 128 | out SPIC_DATA, al 129 | 130 | ; Set 8086 mode 131 | mov al, PIC_8086 132 | out MPIC_DATA, al 133 | out SPIC_DATA, al 134 | 135 | ; Zero out the masks 136 | xor al, al 137 | out MPIC_DATA, al 138 | out 
SPIC_DATA, al 139 | 140 | pop rax 141 | ret 142 | 143 | ; copy_kern_to_pnm 144 | ; 145 | ; Summary: 146 | ; 147 | ; This function relocates the entire kernel to PNM address space 148 | ; 149 | ; Parameters: 150 | ; 151 | ; None 152 | ; 153 | ; Alignment: 154 | ; 155 | ; None 156 | ; 157 | ; Returns: 158 | ; 159 | ; None 160 | ; 161 | ; Smashes: 162 | ; 163 | ; None 164 | ; 165 | ; Optimization 166 | ; 167 | ; Readability 168 | ; 169 | copy_kern_to_pnm: 170 | push rcx 171 | push rsi 172 | push rdi 173 | 174 | mov rdi, kern_size 175 | bamp_alloc rdi 176 | 177 | push rdi 178 | 179 | mov rcx, kern_size 180 | mov rsi, boot_bsp 181 | rep movsb 182 | 183 | pop rbx 184 | pop rdi 185 | pop rsi 186 | pop rcx 187 | ret 188 | 189 | ; init_pnm 190 | ; 191 | ; Summary: 192 | ; 193 | ; This function initializes the PDPTEs for the PNM 194 | ; 195 | ; Parameters: 196 | ; 197 | ; None 198 | ; 199 | ; Alignment: 200 | ; 201 | ; None 202 | ; 203 | ; Returns: 204 | ; 205 | ; None 206 | ; 207 | ; Smashes: 208 | ; 209 | ; None 210 | ; 211 | ; Optimization 212 | ; 213 | ; Readability 214 | ; 215 | init_pnm: 216 | push rax 217 | push rbx 218 | push rcx 219 | push rdx 220 | push rdi 221 | push rbp 222 | push r8 223 | 224 | mov rbx, MEMORY_MAP_LOC + 0x20 225 | movzx rcx, word [MEMORY_MAP_LOC] 226 | .for_each_map: 227 | ; If the mapping is not type 1, fail! 228 | mov eax, dword [rbx + 0x10] 229 | cmp rax, 1 230 | jne .go_next_map 231 | 232 | ; get the base and 1GB align it 233 | mov rax, qword [rbx + 0x00] 234 | add rax, (1 * 1024 * 1024 * 1024 - 1) 235 | and rax, ~(1 * 1024 * 1024 * 1024 - 1) 236 | 237 | ; get the length of the ALIGNED remainder. This could be negative. Since 238 | ; we do signed compares, this is fine! 
239 | mov rdx, qword [rbx + 0x00] ; base 240 | add rdx, qword [rbx + 0x08] ; length 241 | sub rdx, rax 242 | 243 | .while_1G_left: 244 | ; If we don't have 1GB left, go to the next mapping 245 | cmp rdx, (1 * 1024 * 1024 * 1024) 246 | jl short .go_next_map 247 | 248 | ; If we're below 5GB physical, loop without alloc 249 | mov rbp, (5 * 1024 * 1024 * 1024) 250 | cmp rax, rbp 251 | jl short .go_next_1gb 252 | 253 | mov rbp, rax 254 | add rbp, (1 * 1024 * 1024 * 1024 - 1) 255 | 256 | ; rax now points to the base physical address of a 1GB aligned page 257 | ; rbp points to the last byte available in this 1GB page 258 | 259 | lea rdi, [rel dram_routing_table] 260 | lea r8, [rdi + 0x20 * MAX_NODES] 261 | .find_dram: 262 | ; Check if end_of_this_map > end_of_dram_for_node 263 | cmp rbp, [rdi + 0x08] 264 | ja short .next_dram 265 | 266 | ; Check if base_of_this_map < base_of_dram_for_node 267 | cmp rax, [rdi + 0x00] 268 | jb short .next_dram 269 | 270 | jmp short .found_dram 271 | 272 | .next_dram: 273 | add rdi, 0x20 274 | cmp rdi, r8 275 | jl short .find_dram 276 | 277 | jmp short .go_next_1gb 278 | 279 | .found_dram: 280 | ; Get the next PDPTE location for this node and populate it 281 | mov rbp, qword [rdi + 0x18] 282 | mov qword [rbp], rax 283 | or qword [rbp], 0x83 ; Present, writable, page size 284 | 285 | ; Increment the PDPTE pointer 286 | add qword [rdi + 0x18], 8 287 | 288 | .go_next_1gb: 289 | add rax, (1 * 1024 * 1024 * 1024) 290 | sub rdx, (1 * 1024 * 1024 * 1024) 291 | jmp short .while_1G_left 292 | 293 | .go_next_map: 294 | add rbx, 0x20 295 | dec rcx 296 | jnz .for_each_map 297 | 298 | pop r8 299 | pop rbp 300 | pop rdi 301 | pop rdx 302 | pop rcx 303 | pop rbx 304 | pop rax 305 | ret 306 | 307 | ; fetch_dram_info 308 | ; 309 | ; Summary: 310 | ; 311 | ; This function populates the global table which contains DRAM routing rules. 
312 | ; We use this information to set up the PNMs 313 | ; 314 | fetch_dram_info: 315 | push rax 316 | push rbx 317 | push rcx 318 | push rdx 319 | push rsi 320 | push rdi 321 | push rbp 322 | push r15 323 | 324 | ; Get the MMIO address for the processor PCIE config space 325 | mov ecx, 0xc0010058 326 | rdmsr 327 | shl rdx, 32 328 | or rdx, rax 329 | and rdx, ~0x3F 330 | mov r15, rdx 331 | 332 | lea rsi, [rel dram_routing_table] 333 | xor ebp, ebp 334 | .per_node: 335 | ; Bus 0, Device 18, Function 1 336 | mov eax, [r15 + rbp*8 + 0x040 + ((0 << 20) | (0x18 << 15) | (0x1 << 12))] 337 | mov ebx, [r15 + rbp*8 + 0x044 + ((0 << 20) | (0x18 << 15) | (0x1 << 12))] 338 | mov ecx, [r15 + rbp*8 + 0x140 + ((0 << 20) | (0x18 << 15) | (0x1 << 12))] 339 | mov edx, [r15 + rbp*8 + 0x144 + ((0 << 20) | (0x18 << 15) | (0x1 << 12))] 340 | 341 | ; eax - DRAM base low 342 | ; ebx - DRAM limit low 343 | ; ecx - DRAM base high 344 | ; edx - DRAM limit high 345 | 346 | ; Check for readable 347 | bt eax, 0 348 | jnc short .next_node 349 | 350 | ; Check for writable 351 | bt eax, 1 352 | jnc short .next_node 353 | 354 | ; Check for node interleave 355 | test eax, (3 << 8) 356 | jnz short .interleave_warning 357 | 358 | ; Get the node route 359 | mov edi, ebx 360 | and edi, 7 361 | mov qword [rsi + 0x10], rdi 362 | 363 | ; Get the low part from DRAM base low 364 | and eax, 0xffff0000 365 | shl rax, (24 - 16) 366 | 367 | ; Combine the high and low base parts 368 | mov edi, ecx 369 | and edi, 0xff 370 | shl rdi, 40 371 | or rdi, rax 372 | 373 | mov qword [rsi + 0x00], rdi ; DRAM base 374 | 375 | ; Get the low part from DRAM limit low 376 | and ebx, 0xffff0000 377 | shl rbx, (24 - 16) 378 | 379 | ; Combine the high and low limit parts 380 | mov edi, edx 381 | and edi, 0xff 382 | shl rdi, 40 383 | or rdi, rbx 384 | or rdi, 0xFFFFFF 385 | 386 | mov qword [rsi + 0x08], rdi ; DRAM limit 387 | 388 | .next_node: 389 | add rsi, 0x20 390 | inc ebp 391 | cmp ebp, MAX_NODES 392 | jb .per_node 393 | 394 | 
pop r15 395 | pop rbp 396 | pop rdi 397 | pop rsi 398 | pop rdx 399 | pop rcx 400 | pop rbx 401 | pop rax 402 | ret 403 | 404 | .interleave_warning: 405 | mov rdi, 0xb8000 406 | lea rbx, [rel .ilw] 407 | mov rcx, 58 408 | call outstr 409 | cli 410 | hlt 411 | 412 | .ilw: db "Node interleaving is enabled, please disable from the BIOS" 413 | 414 | align 16 415 | dram_routing_table: 416 | ; base, limit, node to route to, next PDPTE 417 | ; If the limit is zero the entry is invalid 418 | dq 0, 0, 0, 0x100103000 419 | dq 0, 0, 0, 0x100104000 420 | dq 0, 0, 0, 0x100105000 421 | dq 0, 0, 0, 0x100106000 422 | dq 0, 0, 0, 0x100107000 423 | dq 0, 0, 0, 0x100108000 424 | dq 0, 0, 0, 0x100109000 425 | dq 0, 0, 0, 0x10010a000 426 | 427 | ; fetch_mmio_info 428 | fetch_mmio_info: 429 | push rax 430 | push rbx 431 | push rcx 432 | push rdx 433 | push rsi 434 | push rbp 435 | push r10 436 | push r15 437 | 438 | ; Get the MMIO address for the processor PCIE config space 439 | mov ecx, 0xc0010058 440 | rdmsr 441 | shl rdx, 32 442 | or rdx, rax 443 | and rdx, ~0x3F 444 | mov r15, rdx 445 | 446 | lea rsi, [rel mmio_routing_table] 447 | xor ebp, ebp 448 | .per_node: 449 | ; Bus 0, Device 18, Function 1 450 | mov eax, [r15 + rbp*8 + 0x080 + ((0 << 20) | (0x18 << 15) | (0x1 << 12))] 451 | mov ebx, [r15 + rbp*8 + 0x084 + ((0 << 20) | (0x18 << 15) | (0x1 << 12))] 452 | mov ecx, [r15 + rbp*8 + 0x180 + ((0 << 20) | (0x18 << 15) | (0x1 << 12))] 453 | 454 | ; eax - MMIO base low 455 | ; ebx - MMIO limit low 456 | ; ecx - MMIO base/limit high 457 | 458 | ; Skip blank mappings 459 | test eax, 0x3 460 | jz short .next_node 461 | 462 | ; Get the low part from MMIO base low 463 | and eax, 0xffffff00 464 | shl rax, (16 - 8) 465 | 466 | ; Combine the high and low base parts 467 | mov r10d, ecx 468 | and r10d, 0xff 469 | shl r10, 40 470 | or r10, rax 471 | 472 | ; Save the base 473 | mov qword [rsi + 0], r10 474 | 475 | ; Get the low part from MMIO limit low 476 | and ebx, 0xffffff00 477 | shl rbx, (16 
- 8) 478 | 479 | ; Combine the high and low limit parts 480 | bextr r10d, ecx, 0x0810 481 | shl r10, 40 482 | or r10, rbx 483 | or r10, 0xFFFF 484 | 485 | ; Save the limit 486 | mov qword [rsi + 8], r10 487 | 488 | .next_node: 489 | add rsi, 0x10 490 | inc ebp 491 | cmp ebp, 12 492 | jb .per_node 493 | 494 | pop r15 495 | pop r10 496 | pop rbp 497 | pop rsi 498 | pop rdx 499 | pop rcx 500 | pop rbx 501 | pop rax 502 | ret 503 | 504 | align 16 505 | mmio_routing_table: 506 | times (12 * 2) dq 0 507 | 508 | ; Initialize the fs segment. 509 | init_globals: 510 | push rax 511 | push rbx 512 | push rcx 513 | push rdx 514 | 515 | ; This is a fixed allocation. We compensate for this by properly 516 | ; initializing bamp_addrs to not start here 517 | mov rbx, 0x0000010000000000 518 | 519 | mov eax, ebx 520 | bextr rdx, rbx, 0x2020 521 | mov ecx, 0xC0000100 ; FS.base MSR 522 | wrmsr 523 | 524 | pop rdx 525 | pop rcx 526 | pop rbx 527 | pop rax 528 | ret 529 | 530 | ; Initialize the global values 531 | populate_globals: 532 | push rbx 533 | push rcx 534 | push rdx 535 | push rsi 536 | 537 | mov rbx, 0x0000010000000000 538 | 539 | ; Zero out the global table 540 | push rax 541 | push rcx 542 | push rdi 543 | mov rdi, rbx 544 | mov rcx, GLOBAL_STORAGE 545 | xor eax, eax 546 | rep stosb 547 | pop rdi 548 | pop rcx 549 | pop rax 550 | 551 | mov qword [fs:globals.fs_base], rbx 552 | 553 | mov rbx, 0x50000000000 554 | mov qword [fs:globals.next_free_vaddr], rbx 555 | 556 | mov rbx, 0x0000010000000000 + GLOBAL_STORAGE ; Node 0 base (less globals) 557 | mov qword [fs:globals.bamp_addr + 0x00], rbx 558 | mov rbx, 0x0000018000000000 ; Node 1 base 559 | mov qword [fs:globals.bamp_addr + 0x08], rbx 560 | mov rbx, 0x0000020000000000 ; Node 2 base 561 | mov qword [fs:globals.bamp_addr + 0x10], rbx 562 | mov rbx, 0x0000028000000000 ; Node 3 base 563 | mov qword [fs:globals.bamp_addr + 0x18], rbx 564 | mov rbx, 0x0000030000000000 ; Node 4 base 565 | mov qword [fs:globals.bamp_addr + 0x20], 
rbx 566 | mov rbx, 0x0000038000000000 ; Node 5 base 567 | mov qword [fs:globals.bamp_addr + 0x28], rbx 568 | mov rbx, 0x0000040000000000 ; Node 6 base 569 | mov qword [fs:globals.bamp_addr + 0x30], rbx 570 | mov rbx, 0x0000048000000000 ; Node 7 base 571 | mov qword [fs:globals.bamp_addr + 0x38], rbx 572 | 573 | lea rsi, [rel dram_routing_table] 574 | mov ecx, 0 575 | .for_each_node: 576 | mov rbx, qword [rsi + 0x18] 577 | and rbx, 0xfff 578 | shl rbx, 27 ; Multiply by (1GB / 8). This gives us the number of bytes in 579 | ; this nodes pool 580 | 581 | mov rdx, qword [fs:globals.bamp_addr + rcx*8] 582 | add rdx, rbx 583 | and rdx, ~((1024 * 1024 * 1024) - 1) 584 | mov qword [fs:globals.bamp_ends + rcx*8], rdx 585 | 586 | add rsi, 0x20 587 | inc ecx 588 | cmp ecx, MAX_NODES 589 | jb short .for_each_node 590 | 591 | pop rsi 592 | pop rdx 593 | pop rcx 594 | pop rbx 595 | ret 596 | 597 | ; qwait 598 | ; 599 | ; Summary: 600 | ; 601 | ; This is a shitty spinloop used for delaying execution for INIT-SIPI-SIPIs 602 | ; 603 | qwait: 604 | push rcx 605 | 606 | mov rcx, 1000000 607 | .lewp: 608 | dec rcx 609 | jnz short .lewp 610 | 611 | pop rcx 612 | ret 613 | 614 | boot_aps: 615 | push rax 616 | push rbx 617 | 618 | ; Send INIT 619 | mov eax, 0x000C4500 620 | mov ebx, 0xFEE00300 621 | mov dword [rbx], eax 622 | call qwait 623 | 624 | ; Send SIPI #1 625 | mov eax, 0x000C4609 626 | mov dword [rbx], eax 627 | call qwait 628 | 629 | ; Send SIPI #2 630 | mov dword [rbx], eax 631 | call qwait 632 | 633 | pop rbx 634 | pop rax 635 | ret 636 | 637 | create_cephys: 638 | push rax 639 | push rcx 640 | push rdi 641 | 642 | mov rdi, 0x100102000 643 | 644 | ; Set up the 1GB PDPTEs for the cephys map 645 | ; Present, writable, page size 646 | mov rax, 0x83 ; Low 32-bits 647 | mov ecx, 512 648 | .set1Gentry_cephys: 649 | mov qword [rdi + 0], rax ; Low bits 650 | 651 | add rax, (1 * 1024 * 1024 * 1024) 652 | add rdi, 8 653 | dec ecx 654 | jnz short .set1Gentry_cephys 655 | 656 | pop rdi 
657 | pop rcx 658 | pop rax 659 | ret 660 | 661 | switch_cr3: 662 | push rdi 663 | 664 | mov rdi, 0x100100000 665 | 666 | mov dword [rdi + 0x00], 0x00102003 ; cephys 667 | mov dword [rdi + 0x04], 0x00000001 ; cephys high 668 | mov dword [rdi + 0x08], 0x00102003 ; cephys 669 | mov dword [rdi + 0x0c], 0x00000001 ; cephys high 670 | mov dword [rdi + 0x10], 0x00103003 ; PNM node 0 671 | mov dword [rdi + 0x14], 0x00000001 ; PNM node 0 high 672 | mov dword [rdi + 0x18], 0x00104003 ; PNM node 1 673 | mov dword [rdi + 0x1c], 0x00000001 ; PNM node 1 high 674 | mov dword [rdi + 0x20], 0x00105003 ; PNM node 2 675 | mov dword [rdi + 0x24], 0x00000001 ; PNM node 2 high 676 | mov dword [rdi + 0x28], 0x00106003 ; PNM node 3 677 | mov dword [rdi + 0x2c], 0x00000001 ; PNM node 3 high 678 | mov dword [rdi + 0x30], 0x00107003 ; PNM node 4 679 | mov dword [rdi + 0x34], 0x00000001 ; PNM node 4 high 680 | mov dword [rdi + 0x38], 0x00108003 ; PNM node 5 681 | mov dword [rdi + 0x3c], 0x00000001 ; PNM node 5 high 682 | mov dword [rdi + 0x40], 0x00109003 ; PNM node 6 683 | mov dword [rdi + 0x44], 0x00000001 ; PNM node 6 high 684 | mov dword [rdi + 0x48], 0x0010a003 ; PNM node 7 685 | mov dword [rdi + 0x4c], 0x00000001 ; PNM node 7 high 686 | 687 | mov cr3, rdi 688 | wbinvd 689 | 690 | pop rdi 691 | ret 692 | 693 | create_relocated_idt: 694 | push rax 695 | push rbx 696 | push rcx 697 | push rdx 698 | push rsi 699 | push rbp 700 | 701 | ; Allocate room for the IDT and the entries 702 | mov rbx, (256 * 16) + 8 + 2 703 | bamp_alloc rbx 704 | mov rsi, rbx 705 | 706 | ; Skip ahead past the actual IDT 707 | add rbx, (8 + 2) 708 | 709 | ; Create the IDT structure 710 | mov word [rbx - 0x0a], 4095 ; Limit 711 | mov qword [rbx - 0x08], rbx ; Base 712 | 713 | ; Allocate room for the dispatchers 714 | ; 50 push rax 715 | ; 53 push rbx 716 | ; 48 b8 mov rax, 717 | ; 48 bb mov rbx, 718 | ; ff e3 jmp rbx 719 | mov rdx, ((1 + 1 + 10 + 10 + 2) * 256) 720 | bamp_alloc rdx 721 | 722 | ; Install exception 
handlers, [0, 32) 723 | lea rax, [rel relocated_handler] 724 | xor ecx, ecx 725 | .install_exception_handlers: 726 | ; Create the dispatcher 727 | mov word [rdx + 0x00 + 0], 0x5350 728 | mov word [rdx + 0x00 + 2], 0xb848 729 | mov qword [rdx + 0x02 + 2], rcx 730 | mov word [rdx + 0x0a + 2], 0xbb48 731 | mov qword [rdx + 0x0c + 2], rax 732 | mov word [rdx + 0x14 + 2], 0xe3ff 733 | 734 | mov word [rbx + 0], dx ; offset 15..0 735 | mov word [rbx + 2], 0x0008 ; Segment selector 736 | mov byte [rbx + 4], 0x00 ; ist 737 | mov byte [rbx + 5], 0x8E ; type 738 | 739 | bextr rbp, rdx, 0x1010 740 | mov word [rbx + 6], bp ; offset 31..16 741 | 742 | bextr rbp, rdx, 0x2020 743 | mov dword [rbx + 0x8], ebp ; offset 63..32 744 | mov dword [rbx + 0xc], 0 ; reserved 745 | 746 | add rdx, (1 + 1 + 10 + 10 + 2) 747 | add rbx, 16 748 | inc ecx 749 | cmp ecx, 32 750 | jl short .install_exception_handlers 751 | 752 | ; Install interrupt handlers, [32, 256) 753 | mov rcx, 32 754 | .install_interrupt_handlers: 755 | imul rbx, rcx, 16 756 | 757 | lea rdx, [rel user_handler] 758 | mov word [rsi + (8 + 2) + rbx + 0x0], dx 759 | mov word [rsi + (8 + 2) + rbx + 0x2], 0x0008 760 | mov byte [rsi + (8 + 2) + rbx + 0x4], 0x00 761 | mov byte [rsi + (8 + 2) + rbx + 0x5], 0x8E 762 | bextr rbp, rdx, 0x1010 763 | mov word [rsi + (8 + 2) + rbx + 0x6], bp 764 | bextr rbp, rdx, 0x2020 765 | mov dword [rsi + (8 + 2) + rbx + 0x8], ebp 766 | mov dword [rsi + (8 + 2) + rbx + 0xc], 0 767 | 768 | inc ecx 769 | cmp ecx, 256 770 | jl short .install_interrupt_handlers 771 | 772 | ; Install the node local IBS handler 773 | lea rdx, [rel ibs_handler] 774 | mov word [rsi + (8 + 2) + (16 * 0x2) + 0x0], dx 775 | mov word [rsi + (8 + 2) + (16 * 0x2) + 0x2], 0x0008 776 | mov byte [rsi + (8 + 2) + (16 * 0x2) + 0x4], 0x00 777 | mov byte [rsi + (8 + 2) + (16 * 0x2) + 0x5], 0x8E 778 | bextr rbp, rdx, 0x1010 779 | mov word [rsi + (8 + 2) + (16 * 0x2) + 0x6], bp 780 | bextr rbp, rdx, 0x2020 781 | mov dword [rsi + (8 + 2) + (16 
* 0x2) + 0x8], ebp 782 | mov dword [rsi + (8 + 2) + (16 * 0x2) + 0xc], 0 783 | 784 | ; Program the IBS APIC vector to deliver NMIs 785 | mov eax, 0xFEE00500 786 | mov dword [eax], (4 << 8) 787 | 788 | lidt [rsi] 789 | 790 | pop rbp 791 | pop rsi 792 | pop rdx 793 | pop rcx 794 | pop rbx 795 | pop rax 796 | ret 797 | 798 | ; rdx <- MMIO address for processor PCIE config space 799 | amd_fam15h_fetch_pcie_mmio: 800 | push rax 801 | push rcx 802 | 803 | ; Get the MMIO address for the processor PCIE config space 804 | mov ecx, 0xc0010058 805 | rdmsr 806 | shl rdx, 32 807 | or rdx, rax 808 | and rdx, ~0x3F 809 | 810 | pop rcx 811 | pop rax 812 | ret 813 | 814 | falkrand: 815 | movdqu xmm15, [gs:thread_local.xs_seed] 816 | aesenc xmm15, xmm15 817 | movdqu [gs:thread_local.xs_seed], xmm15 818 | ret 819 | 820 | ; rcx -> Lower bound 821 | ; rdx -> Upper bound 822 | ; rax <- Random number in range [lower, upper] 823 | randexp: 824 | push rdx 825 | 826 | call randuni 827 | 828 | mov rdx, rax 829 | call randuni 830 | 831 | pop rdx 832 | ret 833 | 834 | ; rcx -> Lower bound 835 | ; rdx -> Upper bound 836 | ; rax <- Random number in range [lower, upper] 837 | randuni: 838 | push rdx 839 | push rbp 840 | push rsi 841 | push r15 842 | 843 | ; Default to returning the lower bound 844 | mov rax, rcx 845 | 846 | ; If the lower bound is greater than or equal to the upper bound, either 847 | ; there is no range, or the range is invalid, return the lower bound. 848 | cmp rcx, rdx 849 | jae short .done 850 | 851 | ; rbp <- Range 852 | mov rbp, rdx 853 | sub rbp, rcx 854 | inc rbp 855 | 856 | ; Get the number of set bits in range. If only one bit is set, use a mask, 857 | ; otherwise, use a div. 
858 | popcnt rsi, rbp 859 | cmp rsi, 1 860 | jne short .use_div 861 | 862 | .use_mask: 863 | dec rbp ; Create mask 864 | call xorshift64 865 | and r15, rbp 866 | lea rax, [rcx + r15] 867 | jmp short .done 868 | 869 | .use_div: 870 | call xorshift64 871 | xor rdx, rdx 872 | mov rax, r15 873 | div rbp 874 | lea rax, [rcx + rdx] 875 | 876 | .done: 877 | pop r15 878 | pop rsi 879 | pop rbp 880 | pop rdx 881 | ret 882 | 883 | xorshift64: 884 | XMMPUSH xmm15 885 | 886 | call falkrand 887 | movq r15, xmm15 888 | 889 | XMMPOP xmm15 890 | ret 891 | 892 | falkseed: 893 | push rax 894 | push rdx 895 | 896 | XMMPUSH xmm15 897 | 898 | pinsrq xmm15, rsp, 0 899 | 900 | rdtsc 901 | shl rdx, 32 902 | or rdx, rax 903 | bts rdx, 63 904 | btr rdx, 62 905 | pinsrq xmm15, rdx, 1 906 | 907 | aesenc xmm15, xmm15 908 | aesenc xmm15, xmm15 909 | aesenc xmm15, xmm15 910 | aesenc xmm15, xmm15 911 | 912 | movdqu [gs:thread_local.xs_seed], xmm15 913 | 914 | XMMPOP xmm15 915 | 916 | pop rdx 917 | pop rax 918 | ret 919 | 920 | init_lwp: 921 | push rax 922 | push rcx 923 | push rdx 924 | 925 | ; Set up xcr0 to save all state (FPU, MMX, SSE, AVX, and LWP) 926 | mov edx, 0x40000000 927 | mov eax, 0x00000007 928 | xor ecx, ecx 929 | xsetbv 930 | 931 | ; Get the LWP features 932 | mov eax, 0x8000001c 933 | cpuid 934 | 935 | ; Write the LWP features to the LWP_CFG MSR 936 | mov rax, rdx 937 | xor rdx, rdx 938 | mov ecx, 0xc0000105 939 | wrmsr 940 | 941 | pop rdx 942 | pop rcx 943 | pop rax 944 | ret 945 | 946 | lmland: 947 | cli 948 | 949 | xor ax, ax 950 | mov es, ax 951 | mov ds, ax 952 | mov gs, ax 953 | mov ss, ax 954 | mov fs, ax 955 | 956 | lidt [idt] 957 | 958 | ; Enable the local APIC (software enable bit in the SIVR) 959 | mov ebx, 0xFEE000F0 ; Spurious Interrupt Vector Register 960 | mov eax, dword [rbx] 961 | or eax, 0x100 962 | mov dword [rbx], eax 963 | 964 | ; Get a unique core ID 965 | mov rax, 1 966 | lock xadd qword [proc_id], rax 967 | 968 | ; Each core gets its own quarter-line on the screen, obviously this would 969 | 
; cause problems if there are more than 100 cores 970 | imul rdi, rax, (40) 971 | add rdi, 0xb8000 972 | 973 | ; Each core gets a unique 64kB stack 974 | imul rsp, rax, 0x10000 ; 64kB stack per core 975 | add rsp, 0x00504000 976 | 977 | test rax, rax 978 | jnz short .not_bsp 979 | 980 | ; We've entered LM as the BSP, we need to set up the PNM and start the 981 | ; APs 982 | call create_cephys 983 | call fetch_dram_info 984 | call fetch_mmio_info 985 | call init_pnm 986 | call init_globals 987 | call populate_globals 988 | 989 | ; Get the rdtsc increment frequency in MHz and store it 990 | push rax 991 | call amd_fam15h_sw_p0_freq 992 | mov qword [fs:globals.rdtsc_freq], rax 993 | pop rax 994 | 995 | call i825xx_init 996 | call x540_init 997 | 998 | ; Initialize the per_node_kern_code struct 999 | mov qword [fs:globals.per_node_kern_code + node_struct.orig_data], boot_bsp 1000 | mov qword [fs:globals.per_node_kern_code + node_struct.data_len], kern_size 1001 | 1002 | call boot_aps 1003 | 1004 | .not_bsp: 1005 | ; Set up the FS segment for this core 1006 | call init_globals 1007 | call init_per_core_storage 1008 | call init_lwp 1009 | 1010 | push rax 1011 | 1012 | ; Get a copy of the kernel on this node 1013 | mov rbx, [fs:globals.fs_base] 1014 | lea rbx, [rbx + globals.per_node_kern_code] 1015 | mov rbx, [rbx + node_struct.orig_data] 1016 | 1017 | pop rax 1018 | 1019 | ; Jump to the kernel code 1020 | add rbx, (.cores_init_bamp - boot_bsp) 1021 | jmp rbx 1022 | 1023 | .cores_init_bamp: 1024 | ; At this point we are relocated 1025 | 1026 | ; Allocate a 32MB stack from the BAMP 1027 | mov rbx, (1024 * 1024 * 32) 1028 | bamp_alloc rbx 1029 | add rbx, (1024 * 1024 * 32) 1030 | mov rsp, rbx 1031 | 1032 | mov rbx, cr0 1033 | btr rbx, 30 ; Enable cache 1034 | btr rbx, 29 ; Enable write through 1035 | bts rbx, 5 ; Enable numeric error 1036 | mov cr0, rbx 1037 | 1038 | ; Increment the counter of number of cores up and relocated 1039 | mov rbx, 1 1040 | lock xadd qword 
[cores_reloc], rbx 1041 | 1042 | ; Wait for all cores to relocate 1043 | mov rbx, qword [num_cores] 1044 | .wait_for_reloc: 1045 | pause 1046 | cmp rbx, qword [cores_reloc] 1047 | jne short .wait_for_reloc 1048 | 1049 | call switch_cr3 1050 | 1051 | ; Relocate the GDT 1052 | lea rbx, [rel gdt64] 1053 | lea rcx, [rel gdt64_base] 1054 | mov [rbx + 2], rcx 1055 | lgdt [rbx] 1056 | 1057 | .next: 1058 | call falkseed 1059 | call x540_init_local_rx 1060 | call x540_init_local_tx 1061 | call i825xx_init_thread_local 1062 | call create_relocated_idt 1063 | 1064 | ; Jump to the actual program to run! 1065 | jmp program 1066 | 1067 | -------------------------------------------------------------------------------- /srcs/boot/boot_bsp.asm: -------------------------------------------------------------------------------- 1 | [org 0x7C00] 2 | [bits 16] 3 | 4 | %define PXE 5 | 6 | %include "srcs/defines.asm" 7 | 8 | ; boot_bsp 9 | ; 10 | ; Summary: 11 | ; 12 | ; This is where the BIOS initially passes over execution. Here we just ensure 13 | ; interrupts are disabled and segments are zeroed out. After this we quickly 14 | ; go into rmland to continue our real mode needs. 15 | ; 16 | ; Optimization: 17 | ; 18 | ; Readability 19 | ; 20 | boot_bsp: 21 | ; Disable interrupts until we need them 22 | cli 23 | cld 24 | 25 | ; Clear all segments 26 | xor ax, ax 27 | mov es, ax 28 | mov ds, ax 29 | mov fs, ax 30 | mov gs, ax 31 | mov ss, ax 32 | 33 | ; Set up the stack 34 | mov sp, 0x7000 35 | 36 | ; Ensure cs is zero 37 | jmp 0x0000:rmland 38 | 39 | ; rmland 40 | ; 41 | ; Summary: 42 | ; 43 | ; This is the BSP real-mode environment where we ensure we use everything 44 | ; we need from the BIOS, such as setting the video mode and loading other 45 | ; sectors, and then we continue on setting the A20 line and transitioning into 46 | ; protected mode. 
47 | ; 48 | ; Optimization: 49 | ; 50 | ; Readability 51 | ; 52 | rmland: 53 | ; Go into VGA mode 3 (80x25 16-colour mode) 54 | mov ax, 0x0003 55 | int 0x10 56 | 57 | %ifndef PXE 58 | ; Drive reset 59 | xor ah, ah 60 | int 0x13 61 | jc short .halt 62 | 63 | xor cx, cx 64 | mov es, cx ; Store es for the segment we read to 65 | mov ax, 0x0212 ; Read 18 sectors 66 | mov cx, 0x0001 ; Read from track 0, sector 1 67 | xor dh, dh ; Read from head 0 on the drive booted from 68 | mov bx, 0x7C00 ; Read to 0000:7C00 69 | int 0x13 ; Read the sectors 70 | cli ; Hyper-V... what the fuck are you doing m8? 71 | jc short .halt 72 | 73 | xor cx, cx 74 | mov es, cx ; Store es for the segment we read to 75 | mov ax, 0x0212 ; Read 18 sectors 76 | mov cx, 0x0001 ; Read from track 0, sector 1 77 | mov dh, 1 ; Read from head 1 on the drive booted from 78 | mov bx, 0xA000 ; Read to 0000:A000 79 | int 0x13 ; Read the sectors 80 | cli ; Hyper-V... what the fuck are you doing m8? 81 | jc short .halt 82 | %endif 83 | 84 | ; Blindly set the A20 line 85 | in al, 0x92 86 | or al, 2 87 | out 0x92, al 88 | 89 | ; Get the e820 memory map 90 | call gen_memory_map 91 | 92 | ; Load the gdt (for 32-bit protected mode) 93 | lgdt [gdt] 94 | 95 | ; Set the protection bit 96 | mov eax, cr0 97 | or al, (1 << 0) 98 | mov cr0, eax 99 | 100 | ; We go to protected land now!
101 | jmp 0x0008:pmland 102 | 103 | .halt: 104 | hlt 105 | jmp short .halt 106 | 107 | %define CONFIG_PORT 0x2e 108 | 109 | %define INDEX_PORT 0x2e 110 | %define DATA_PORT 0x2f 111 | 112 | %define START_CONFIG_KEY 0x55 113 | %define END_CONFIG_KEY 0xaa 114 | 115 | %define RUNTIME_REGS_IO_PORT 0x400 116 | 117 | gen_memory_map: 118 | ; Set the entry count to 0 119 | mov word [MEMORY_MAP_LOC], 0 120 | 121 | mov di, MEMORY_MAP_LOC + 0x20 122 | xor ebx, ebx 123 | 124 | get_e820: 125 | ; Update the count 126 | inc word [MEMORY_MAP_LOC] 127 | 128 | mov eax, 0x0000E820 129 | mov ecx, 0x20 130 | mov edx, 0x534D4150 131 | int 0x15 132 | jc e820_end 133 | 134 | cmp eax, 0x534D4150 135 | jne e820_end 136 | 137 | add di, 0x20 138 | cmp ebx, 0 139 | jne get_e820 140 | 141 | e820_end: 142 | ret 143 | 144 | ; ----------------------------------------------------------------------- 145 | 146 | ; 32-bit protected mode GDT 147 | 148 | align 8 149 | gdt_base: 150 | dq 0x0000000000000000 151 | dq 0x00CF9A000000FFFF 152 | dq 0x00CF92000000FFFF 153 | 154 | gdt: 155 | dw (gdt - gdt_base) - 1 156 | dd gdt_base 157 | 158 | ; ----------------------------------------------------------------------- 159 | 160 | [bits 32] 161 | 162 | ; pmland 163 | ; 164 | ; Summary: 165 | ; 166 | ; This is our BSP protected mode landing point. Here we set up data selectors, 167 | ; enable the IO APIC, and send INIT-SIPI-SIPI to all other APs. 
168 | ; 169 | ; Optimization: 170 | ; 171 | ; Readability 172 | ; 173 | pmland: 174 | ; Set up all data selectors 175 | mov ax, 0x10 176 | mov es, ax 177 | mov ds, ax 178 | mov fs, ax 179 | mov ss, ax 180 | mov gs, ax 181 | 182 | ; Zero out the page table 183 | mov edi, 0x00100000 184 | mov cr3, edi 185 | mov ecx, (0x0010b000 - 0x00100000) 186 | xor eax, eax 187 | rep stosb 188 | 189 | ; Identity map 190 | ; 191 | ; 0x00100000 - 0x00101000 - PML4T (512 GB pages) 192 | ; 0x00101000 - 0x00102000 - PDPT ( 1 GB pages) 193 | 194 | ; cephys (Cache enabled physical) 195 | ; 196 | ; 0x00102000 - 0x00103000 - PDPT ( 1 GB pages) 197 | 198 | ; Set up PML4T 199 | mov edi, cr3 200 | mov dword [edi + 0x00], 0x00101003 ; Identity map 201 | mov dword [edi + 0x04], 0x00000000 ; Identity map 202 | mov dword [edi + 0x08], 0x00102003 ; cephys 203 | mov dword [edi + 0x0c], 0x00000001 ; cephys high 204 | mov dword [edi + 0x10], 0x00103003 ; PNM node 0 205 | mov dword [edi + 0x14], 0x00000001 ; PNM node 0 high 206 | mov dword [edi + 0x18], 0x00104003 ; PNM node 1 207 | mov dword [edi + 0x1c], 0x00000001 ; PNM node 1 high 208 | mov dword [edi + 0x20], 0x00105003 ; PNM node 2 209 | mov dword [edi + 0x24], 0x00000001 ; PNM node 2 high 210 | mov dword [edi + 0x28], 0x00106003 ; PNM node 3 211 | mov dword [edi + 0x2c], 0x00000001 ; PNM node 3 high 212 | mov dword [edi + 0x30], 0x00107003 ; PNM node 4 213 | mov dword [edi + 0x34], 0x00000001 ; PNM node 4 high 214 | mov dword [edi + 0x38], 0x00108003 ; PNM node 5 215 | mov dword [edi + 0x3c], 0x00000001 ; PNM node 5 high 216 | mov dword [edi + 0x40], 0x00109003 ; PNM node 6 217 | mov dword [edi + 0x44], 0x00000001 ; PNM node 6 high 218 | mov dword [edi + 0x48], 0x0010a003 ; PNM node 7 219 | mov dword [edi + 0x4c], 0x00000001 ; PNM node 7 high 220 | add edi, 0x1000 221 | 222 | ; Set up the 1GB PDPTEs for the identity map 223 | ; Present, writable, write through, cache disable, page size 224 | mov eax, 0x83 ; Low 32-bits 225 | xor edx, edx ; High 
32-bits 226 | mov ecx, 512 227 | .set1Gentry: 228 | mov dword [edi + 0], eax ; Low bits 229 | mov dword [edi + 4], edx ; High bits 230 | 231 | add eax, (1 * 1024 * 1024 * 1024) 232 | adc edx, 0 233 | 234 | add edi, 8 235 | dec ecx 236 | jnz short .set1Gentry 237 | 238 | jmp ap_pmland 239 | 240 | ; Must be aligned to ensure safe to RW from multiple threads 241 | ; 242 | ; This is the only place you can legally store globals outside of the fs 243 | ; segment. When cores_reloc == num_cores these all become invalid and illegal 244 | ; to operate on! We relocate entirely and thus globals are forbidden after all 245 | ; cores have relocated! You can access the values read only by using the rel 246 | ; prefix as they have been relocated properly 247 | align 8 248 | proc_id: dq 0 249 | cores_reloc: dq 0 250 | num_cores: dq NUM_CORES 251 | 252 | times 510-($-$$) db 0 253 | dw 0xAA55 254 | 255 | %include "srcs/boot/irqs.asm" 256 | 257 | times 0x1400-($-$$) db 0 258 | 259 | %include "srcs/boot/boot_ap.asm" 260 | 261 | times 0x2400-($-$$) db 0 262 | 263 | %include "srcs/mm/mm.asm" 264 | %include "srcs/io/serial.asm" 265 | %include "srcs/disp/console.asm" 266 | %include "srcs/time/time.asm" 267 | %include "srcs/boot/program.asm" 268 | %include "srcs/net/i825xx.asm" 269 | %include "srcs/net/x540.asm" 270 | %include "srcs/net/falktp.asm" 271 | %include "srcs/data/pqueue.asm" 272 | %include "srcs/disk/ide.asm" 273 | %include "srcs/dstruc/flut.asm" 274 | %include "srcs/os/win32.asm" 275 | %include "srcs/emu/ide.asm" 276 | %include "srcs/vm/snapshot.asm" 277 | %include "srcs/fuzzers/generic.asm" 278 | ;%include "srcs/fuzzers/defender.asm" 279 | %include "srcs/fuzzers/pdf.asm" 280 | ;%include "srcs\fuzzers\word.asm" 281 | 282 | kern_size: equ ($-$$) 283 | 284 | %ifndef PXE 285 | times (1474560)-($-$$) db 0 286 | %else 287 | times (32 * 1024)-($-$$) db 0 288 | %endif 289 | 290 | -------------------------------------------------------------------------------- /srcs/boot/irqs.asm: 
-------------------------------------------------------------------------------- 1 | [bits 64] 2 | 3 | idt_base: 4 | times (256 * 8) db 0 5 | 6 | idt: 7 | .limit: dw (idt - idt_base) - 1 8 | .base: dq idt_base 9 | 10 | %define EXCEPT_REPORT_MAGIC 0x215f25d0 11 | 12 | struc except_report 13 | .magic: resd 1 14 | .vec: resq 1 15 | .rsp00: resq 1 16 | .rsp08: resq 1 17 | .rsp10: resq 1 18 | .rsp18: resq 1 19 | .kern: resq 1 20 | .cr2: resq 1 21 | .stack: resb 1024 22 | endstruc 23 | 24 | ; Sends exception reports via the network. Use create_relocated_idt to set up 25 | ; your idt to use this. 26 | relocated_handler: 27 | mov [gs:thread_local.vm_ctxt + vm_ctxt.vec], rax 28 | pop rbx 29 | pop rax 30 | 31 | jmp panic 32 | 33 | user_handler: 34 | push rax 35 | push rdx 36 | 37 | mov dx, 0x20 38 | mov al, 0x20 39 | out dx, al 40 | 41 | pop rdx 42 | pop rax 43 | iretq 44 | 45 | ibs_handler: 46 | iretq 47 | 48 | -------------------------------------------------------------------------------- /srcs/boot/program.asm: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gamozolabs/falkervisor_beta/48e83a56badcf53dba94a5ab224eb37775a2f37d/srcs/boot/program.asm -------------------------------------------------------------------------------- /srcs/boot/program_chrome.asm: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gamozolabs/falkervisor_beta/48e83a56badcf53dba94a5ab224eb37775a2f37d/srcs/boot/program_chrome.asm -------------------------------------------------------------------------------- /srcs/boot/program_copygroup.asm: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gamozolabs/falkervisor_beta/48e83a56badcf53dba94a5ab224eb37775a2f37d/srcs/boot/program_copygroup.asm -------------------------------------------------------------------------------- /srcs/boot/program_defender.asm: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/gamozolabs/falkervisor_beta/48e83a56badcf53dba94a5ab224eb37775a2f37d/srcs/boot/program_defender.asm -------------------------------------------------------------------------------- /srcs/boot/program_svg.asm: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gamozolabs/falkervisor_beta/48e83a56badcf53dba94a5ab224eb37775a2f37d/srcs/boot/program_svg.asm -------------------------------------------------------------------------------- /srcs/boot/program_word.asm: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gamozolabs/falkervisor_beta/48e83a56badcf53dba94a5ab224eb37775a2f37d/srcs/boot/program_word.asm -------------------------------------------------------------------------------- /srcs/data/pqueue.asm: -------------------------------------------------------------------------------- 1 | [bits 64] 2 | 3 | ; rcx -> ID of spinlock to acquire 4 | acquire_spinlock: 5 | push rbx 6 | 7 | ; Acquire a lock 8 | mov rbx, 1 9 | lock xadd qword [fs:globals.spinlocks_lock + rcx*8], rbx 10 | 11 | ; Spin until we're the chosen one 12 | .spin: 13 | pause 14 | cmp rbx, qword [fs:globals.spinlocks_release + rcx*8] 15 | jne short .spin 16 | 17 | pop rbx 18 | ret 19 | 20 | ; rcx -> ID of spinlock to release 21 | release_spinlock: 22 | ; Release the lock 23 | inc qword [fs:globals.spinlocks_release + rcx*8] 24 | 25 | ret 26 | 27 | ; rbx -> Array of pointers to sort (based on element) 28 | ; rcx -> Number of elements in the array 29 | ; rdx -> Index of 64-bit signed value to sort by in the pointer 30 | heapsort: 31 | push rcx 32 | push rsi 33 | push rdi 34 | push r8 35 | push r9 36 | 37 | call heapify 38 | 39 | ; end := count - 1 40 | dec rcx 41 | .lewp: 42 | ; while end > 0 43 | cmp rcx, 0 44 | jle short .end 45 | 46 | ; swap(a[end], a[0]) 47 | 
mov r8, qword [rbx + rcx*8] 48 | mov r9, qword [rbx] 49 | mov qword [rbx + rcx*8], r9 50 | mov qword [rbx], r8 51 | 52 | dec rcx 53 | 54 | mov rsi, 0 55 | mov rdi, rcx 56 | call siftdown 57 | 58 | jmp short .lewp 59 | 60 | .end: 61 | pop r9 62 | pop r8 63 | pop rdi 64 | pop rsi 65 | pop rcx 66 | ret 67 | 68 | ; rbx -> Array of pointers to sort (based on element) 69 | ; rcx -> Number of elements in the array 70 | ; rdx -> Byte index of 64-bit value to sort by in the pointer 71 | heapify: 72 | push rsi 73 | push rdi 74 | 75 | ; start <- floor((count - 2) / 2) 76 | mov rsi, rcx 77 | sub rsi, 2 78 | shr rsi, 1 79 | 80 | ; end <- count - 1 81 | mov rdi, rcx 82 | dec rdi 83 | 84 | .lewp: 85 | cmp rsi, 0 86 | jl short .end 87 | 88 | call siftdown 89 | 90 | dec rsi 91 | jmp short .lewp 92 | 93 | .end: 94 | pop rdi 95 | pop rsi 96 | ret 97 | 98 | ; rbx -> Array of pointers to sort (based on element) 99 | ; rdx -> Byte index of 64-bit value to sort by in the pointer 100 | ; rsi -> 'start' 101 | ; rdi -> 'end' 102 | siftdown: 103 | push rsi 104 | push rbp 105 | push r8 106 | push r9 107 | push r10 108 | push r11 109 | 110 | .lewp: 111 | ; rbp := start * 2 + 1 112 | mov rbp, rsi 113 | shl rbp, 1 114 | add rbp, 1 115 | 116 | cmp rbp, rdi 117 | jg short .end 118 | 119 | mov r10, rsi 120 | 121 | ; rbp - child 122 | ; rsi - root 123 | ; r10 - swap 124 | ; r11 - child + 1 125 | 126 | mov r8, qword [rbx + r10*8] 127 | mov r8, qword [r8 + rdx] 128 | mov r9, qword [rbx + rbp*8] 129 | 130 | ; if a[swap] < a[child] 131 | ; swap := child 132 | cmp r8, qword [r9 + rdx] 133 | jnl short .no_chillins 134 | 135 | ; swap := child 136 | mov r10, rbp 137 | 138 | .no_chillins: 139 | ; child + 1 140 | mov r11, rbp 141 | inc r11 142 | 143 | cmp r11, rdi 144 | jg short .right_child_is_not_greater 145 | 146 | mov r8, qword [rbx + r10*8] 147 | mov r8, qword [r8 + rdx] 148 | mov r9, qword [rbx + r11*8] 149 | cmp r8, qword [r9 + rdx] 150 | jnl short .right_child_is_not_greater 151 | 152 | ; swap := 
child + 1 153 | mov r10, r11 154 | 155 | .right_child_is_not_greater: 156 | ; if swap == root: return 157 | cmp r10, rsi 158 | je short .end 159 | 160 | ; swap(root, swap) 161 | mov r8, qword [rbx + rsi*8] 162 | mov r9, qword [rbx + r10*8] 163 | mov qword [rbx + rsi*8], r9 164 | mov qword [rbx + r10*8], r8 165 | 166 | ; root = swap 167 | mov rsi, r10 168 | 169 | jmp short .lewp 170 | 171 | .end: 172 | pop r11 173 | pop r10 174 | pop r9 175 | pop r8 176 | pop rbp 177 | pop rsi 178 | ret 179 | 180 | -------------------------------------------------------------------------------- /srcs/defines.asm: -------------------------------------------------------------------------------- 1 | %define SERIAL_PORT 0x3f8 2 | 3 | %define DMA_BUFFER_ZONE 0x60000000 4 | 5 | %define MPIC_CTRL 0x20 6 | %define MPIC_DATA 0x21 7 | %define SPIC_CTRL 0xA0 8 | %define SPIC_DATA 0xA1 9 | 10 | %define PIC_ICW4 0x01 11 | %define PIC_INIT 0x10 12 | 13 | %define PIC_8086 0x01 14 | 15 | %define IRQ07_MAP 0x20 16 | %define IRQ8F_MAP 0x28 17 | 18 | %define MEMORY_MAP_LOC 0x500 19 | 20 | %define MAX_NODES 8 21 | %define MAX_CORES 64 22 | %define NUM_CORES 64 23 | 24 | %define VM_STATE_DUMP 0xe7c928b1 25 | 26 | ; Must be a page-size multiple! 27 | %define GLOBAL_STORAGE (1024 * 1024) 28 | 29 | %define MAX_NUM_MODULES 512 30 | 31 | %macro XMMPUSH 1 32 | sub rsp, 16 33 | movdqu [rsp], %1 34 | %endmacro 35 | 36 | %macro XMMPOP 1 37 | movdqu %1, [rsp] 38 | add rsp, 16 39 | %endmacro 40 | 41 | %macro DBGPRINT 1 42 | push rdi 43 | push rdx 44 | mov rdx, %1 45 | call per_core_screen 46 | call outhexq 47 | pop rdx 48 | pop rdi 49 | %endmacro 50 | 51 | ;%define ASLR 52 | 53 | ; bamp_alloc 54 | ; 55 | ; Summary: 56 | ; 57 | ; This is the core allocator for the system. It is a simple linear allocator 58 | ; which is thread safe. 
All allocations are ensured to be 4KB aligned 59 | ; 60 | ; Parameters: 61 | ; 62 | ; %1 - Number of bytes to allocate 63 | ; 64 | ; Alignment: 65 | ; 66 | ; None 67 | ; 68 | ; Returns: 69 | ; 70 | ; %1 - Address to allocated memory 71 | ; 72 | ; Smashes: 73 | ; 74 | ; %1 - Return value 75 | ; 76 | ; Optimization 77 | ; 78 | ; Speed 79 | ; 80 | %macro bamp_alloc 1 81 | %if %1 != rsi 82 | push rsi 83 | mov rsi, %1 84 | %endif 85 | 86 | call bamp_alloc_int 87 | 88 | %if %1 != rsi 89 | mov %1, rsi 90 | pop rsi 91 | %endif 92 | %endmacro 93 | 94 | %macro rand_alloc 1 95 | %if %1 != rsi 96 | push rsi 97 | mov rsi, %1 98 | %endif 99 | 100 | call rand_alloc_int 101 | 102 | %if %1 != rsi 103 | mov %1, rsi 104 | pop rsi 105 | %endif 106 | %endmacro 107 | 108 | %macro mixed_alloc 1 109 | %if %1 != rax 110 | push rax 111 | %endif 112 | %if %1 != rcx 113 | push rcx 114 | %endif 115 | 116 | mov rcx, %1 117 | call mm_mixed_alloc 118 | mov %1, rax 119 | 120 | %if %1 != rcx 121 | pop rcx 122 | %endif 123 | %if %1 != rax 124 | pop rax 125 | %endif 126 | %endmacro 127 | 128 | struc node_struct 129 | .orig_data: resq 1 ; Original data pointer 130 | .data_len: resq 1 ; Length of data (in bytes) 131 | .node_data: resq MAX_NODES ; Each nodes pointer to data 132 | .node_race: resq MAX_NODES ; Locks for each node 133 | endstruc 134 | 135 | struc i825xx_dev 136 | .pcireq: resq 1 137 | .mmio_base: resq 1 138 | .rx_ring_base: resq 1 139 | .tx_ring_base: resq 1 140 | .tx_tail: resq 1 141 | .rx_tail: resq 1 142 | .rxed_count: resq 1 143 | endstruc 144 | 145 | -------------------------------------------------------------------------------- /srcs/disk/ide.asm: -------------------------------------------------------------------------------- 1 | [bits 64] 2 | 3 | ; pci_get_ide 4 | ; 5 | ; Summary: 6 | ; 7 | ; This function enumerates all PCI devices and returns the PCI request needed 8 | ; for the first IDE device encountered. 
9 | ; 10 | ; Parameters: 11 | ; 12 | ; None 13 | ; 14 | ; Alignment: 15 | ; 16 | ; None 17 | ; 18 | ; Returns: 19 | ; 20 | ; on Success: rax = PCI request 21 | ; on Failure: rax = 0 22 | ; 23 | ; Smashes: 24 | ; 25 | ; rax - Return value 26 | ; 27 | ; Optimization: 28 | ; 29 | ; Readability 30 | ; 31 | pci_get_ide: 32 | push rbx 33 | push rdx 34 | 35 | sub rsp, 0x10 36 | 37 | ; rsp + 0x00 L2 | Bus number 38 | ; rsp + 0x02 L2 | Device number 39 | ; rsp + 0x04 L2 | Function number 40 | 41 | mov word [rsp + 0x00], 0xFF 42 | .for_bus: 43 | mov word [rsp + 0x02], 0x1F 44 | .for_device: 45 | mov word [rsp + 0x04], 0x07 46 | .for_func: 47 | ; Bus 48 | movzx eax, byte [rsp + 0x00] 49 | shl eax, 5 50 | 51 | ; Device 52 | or al, byte [rsp + 0x02] 53 | shl eax, 3 54 | 55 | ; Function 56 | or al, byte [rsp + 0x04] 57 | shl eax, 8 58 | 59 | ; Enable bit 60 | or eax, 0x80000000 61 | 62 | ; Save the bus:device.func query into ebx 63 | mov ebx, eax 64 | 65 | ; Request the vendor ID and device ID 66 | mov dx, 0x0CF8 67 | out dx, eax 68 | mov dx, 0x0CFC 69 | in eax, dx 70 | 71 | ; If the vendor ID is 0xFFFF, then this bus:device.func does not exist 72 | cmp ax, 0xFFFF 73 | je short .next_device 74 | 75 | ; Query register 0x08, which contains the class code and 76 | ; subclass 77 | mov eax, ebx 78 | or eax, 0x08 79 | mov dx, 0x0CF8 80 | out dx, eax 81 | mov dx, 0x0CFC 82 | in eax, dx 83 | 84 | ; Check if it's an IDE device (class 0x01 subclass 0x01) 85 | shr eax, 16 86 | cmp ax, 0x0101 87 | jne short .next_device 88 | 89 | mov eax, ebx 90 | jmp short .ret 91 | 92 | .next_device: 93 | dec word [rsp + 0x04] 94 | jns short .for_func 95 | 96 | dec word [rsp + 0x02] 97 | jns short .for_device 98 | 99 | dec word [rsp + 0x00] 100 | jns short .for_bus 101 | 102 | xor rax, rax 103 | .ret: 104 | add rsp, 0x10 105 | pop rdx 106 | pop rbx 107 | ret 108 | 109 | ; Channel 1 (primary) 110 | %define IOADDR1_BASE 0x1f0 111 | %define IOADDR2_BASE 0x3f6 112 | 113 | ; Channel 2
(secondary) 114 | ;%define IOADDR1_BASE 0x170 115 | ;%define IOADDR2_BASE 0x376 116 | 117 | ; Channel 3 118 | ;%define IOADDR1_BASE 0x1e8 119 | ;%define IOADDR2_BASE 0x3e6 120 | 121 | ; Channel 4 122 | ;%define IOADDR1_BASE 0x168 123 | ;%define IOADDR2_BASE 0x366 124 | 125 | ; This is the structure describing the IDE controller 126 | hdd: 127 | .BAR0: dd 0 128 | .BAR1: dd 0 129 | .BAR2: dd 0 130 | .BAR3: dd 0 131 | .BAR4: dd 0 132 | .BAR5: dd 0 133 | 134 | .BUS_MASTER: dw 0 135 | 136 | .CH0_DATA: dw IOADDR1_BASE + 0 ; RW Primary 137 | .CH0_ERROR: dw IOADDR1_BASE + 1 ; RO Primary 138 | .CH0_FEATURES: dw IOADDR1_BASE + 1 ; WO Primary 139 | .CH0_SECCOUNT0: dw IOADDR1_BASE + 2 ; RW Primary 140 | .CH0_SECCOUNT1: dw IOADDR1_BASE + 2 ; RW Secondary 141 | .CH0_LBA0: dw IOADDR1_BASE + 3 ; RW Primary 142 | .CH0_LBA3: dw IOADDR1_BASE + 3 ; RW Secondary 143 | .CH0_LBA1: dw IOADDR1_BASE + 4 ; RW Primary 144 | .CH0_LBA4: dw IOADDR1_BASE + 4 ; RW Secondary 145 | .CH0_LBA2: dw IOADDR1_BASE + 5 ; RW Primary 146 | .CH0_LBA5: dw IOADDR1_BASE + 5 ; RW Secondary 147 | .CH0_HDDEVSEL: dw IOADDR1_BASE + 6 ; RW Primary 148 | .CH0_COMMAND: dw IOADDR1_BASE + 7 ; WO Primary 149 | .CH0_STATUS: dw IOADDR1_BASE + 7 ; RO Primary 150 | 151 | .CH0_CONTROL: dw IOADDR2_BASE ; WO 152 | .CH0_ALTSTATUS: dw IOADDR2_BASE ; RO 153 | ;.CH0_DEVADDRESS: dw 0x3F9 ; Not supported 154 | 155 | .CH0_MST_IDENTIFY: times 512 db 0 ; 512-byte identify value 156 | .CH0_MST_BYTE_SIZE: dq 0 ; Size of the drive in bytes 157 | 158 | ; ide_wait_busy 159 | ; 160 | ; Summary: 161 | ; 162 | ; This function polls the IDE drive and blocks until it is no longer busy. 
163 | ; 164 | ; Parameters: 165 | ; 166 | ; None 167 | ; 168 | ; Alignment: 169 | ; 170 | ; None 171 | ; 172 | ; Returns: 173 | ; 174 | ; None 175 | ; 176 | ; Smashes: 177 | ; 178 | ; None 179 | ; 180 | ; Optimization: 181 | ; 182 | ; Readability 183 | ; 184 | ide_wait_busy: 185 | push rax 186 | push rdx 187 | 188 | mov dx, word [rel hdd.CH0_STATUS] 189 | .busy: 190 | in al, dx 191 | test al, 0x80 192 | jnz short .busy 193 | 194 | pop rdx 195 | pop rax 196 | ret 197 | 198 | ; load_hdd 199 | ; 200 | ; Summary: 201 | ; 202 | ; This function scans for the first IDE controller out of all the PCI devices 203 | ; and then loads the BARs into global struct 'hdd'. 204 | ; 205 | ; Parameters: 206 | ; 207 | ; None 208 | ; 209 | ; Alignment: 210 | ; 211 | ; None 212 | ; 213 | ; Returns: 214 | ; 215 | ; on Success: CF = 0 216 | ; on Failure: CF = 1 217 | ; 218 | ; Smashes: 219 | ; 220 | ; None 221 | ; 222 | ; Optimization: 223 | ; 224 | ; Readability 225 | ; 226 | load_hdd: 227 | push rax 228 | push rbx 229 | push rcx 230 | push rdx 231 | 232 | ; Scan PCI devices for an IDE controller 233 | call pci_get_ide 234 | test rax, rax 235 | jz .fail 236 | 237 | ; Cache the PCI request 238 | mov ebx, eax 239 | 240 | ; Get the header type 241 | or eax, 0x0C 242 | mov dx, 0x0CF8 243 | out dx, eax 244 | mov dx, 0x0CFC 245 | in eax, dx 246 | 247 | ; Check if the header type 0x00 248 | shr eax, 16 249 | and al, 0xFF 250 | jnz short .fail 251 | 252 | ; Load BARs 0-5 into the hdd structure 253 | xor ecx, ecx 254 | .fetch_bars: 255 | ; Calculate the PCI request 256 | mov eax, ebx 257 | add eax, 0x10 258 | add eax, ecx 259 | 260 | ; Request BAR[ecx] 261 | mov dx, 0x0CF8 262 | out dx, eax 263 | mov dx, 0x0CFC 264 | in eax, dx 265 | 266 | ; Store the bar 267 | lea rbx, [rel hdd.BAR0] 268 | mov dword [rbx + rcx], eax 269 | 270 | add ecx, 4 271 | cmp ecx, 0x14 272 | jle short .fetch_bars 273 | 274 | clc 275 | jmp short .ret 276 | .fail: 277 | stc 278 | .ret: 279 | pop rdx 280 | pop rcx 281 | pop 
rbx 282 | pop rax 283 | ret 284 | 285 | ; init_hdd 286 | ; 287 | ; Summary: 288 | ; 289 | ; This function is responsible for finding IDE controllers (via load_hdd), 290 | ; populating all elements of global struct 'hdd', and preparing the IDE 291 | ; controller for use. We currently only support the primary IDE controller; 292 | ; the secondary one is not used. 293 | ; 294 | ; Parameters: 295 | ; 296 | ; None 297 | ; 298 | ; Alignment: 299 | ; 300 | ; None 301 | ; 302 | ; Returns: 303 | ; 304 | ; on Success: CF = 0 305 | ; on Failure: CF = 1 306 | ; 307 | ; Smashes: 308 | ; 309 | ; None 310 | ; 311 | ; Optimization: 312 | ; 313 | ; Readability 314 | ; 315 | init_hdd: 316 | push rax 317 | push rcx 318 | push rdx 319 | 320 | call load_hdd 321 | jc .fail 322 | 323 | ; Ensure that the primary IDE controller is the one we expect by checking 324 | ; that the BARs are either 0 or 1. 325 | ;cmp dword [rel hdd.BAR0], 0x1 326 | ;ja .fail 327 | ;cmp dword [rel hdd.BAR1], 0x1 328 | ;ja .fail 329 | 330 | ; Validate and save the bus master IO port 331 | mov eax, dword [rel hdd.BAR4] 332 | test eax, 1 333 | jz .fail 334 | and eax, 0xFFFFFFFC 335 | mov word [rel hdd.BUS_MASTER], ax 336 | 337 | ; Disable interrupts on CH0 338 | mov dx, word [rel hdd.CH0_CONTROL] 339 | mov al, 2 340 | out dx, al 341 | 342 | ; Select 'master' drive with LBA addressing mode 343 | mov dx, word [rel hdd.CH0_HDDEVSEL] 344 | mov al, 0xE0 345 | out dx, al 346 | 347 | ; Send identify command 348 | mov dx, word [rel hdd.CH0_COMMAND] 349 | mov al, 0xEC 350 | out dx, al 351 | 352 | ; Wait for the IDE controller to not be busy 353 | call ide_wait_busy 354 | 355 | ; Load the identify data 356 | push rdi 357 | lea rdi, [rel hdd.CH0_MST_IDENTIFY] 358 | mov dx, word [rel hdd.CH0_DATA] 359 | mov rcx, (512 / 4) 360 | rep insd 361 | pop rdi 362 | 363 | ; Ensure the device takes LBA48 addressing 364 | mov edx, dword [rel hdd.CH0_MST_IDENTIFY + 164] 365 | test edx, (1 << 26) 366 | jz short .fail 367 | 368 | ; Get the
size of the drive 369 | mov edx, dword [rel hdd.CH0_MST_IDENTIFY + 200] 370 | shl rdx, 9 371 | mov qword [rel hdd.CH0_MST_BYTE_SIZE], rdx 372 | 373 | clc 374 | jmp short .ret 375 | .fail: 376 | stc 377 | .ret: 378 | pop rdx 379 | pop rcx 380 | pop rax 381 | ret 382 | 383 | ; ide_pio_read_sectors 384 | ; 385 | ; Summary: 386 | ; 387 | ; This function reads cx sectors starting at sector rbx into buffer specified 388 | ; by r8. The read is performed on channel 0, master drive. 389 | ; 390 | ; Parameters: 391 | ; 392 | ; rbx - Sector to read from 393 | ; cx - Number of sectors to read 394 | ; r8 - Buffer to read into 395 | ; 396 | ; Alignment: 397 | ; 398 | ; None 399 | ; 400 | ; Returns: 401 | ; 402 | ; None 403 | ; 404 | ; Smashes: 405 | ; 406 | ; None 407 | ; 408 | ; Optimization: 409 | ; 410 | ; Readability 411 | ; 412 | ide_pio_read_sectors: 413 | push rax 414 | push rbx 415 | push rcx 416 | push rdx 417 | push rsi 418 | push rdi 419 | 420 | push rcx 421 | mov rcx, SPINLOCK_DISK 422 | call acquire_spinlock 423 | pop rcx 424 | 425 | mov dx, word [rel hdd.CH0_HDDEVSEL] 426 | mov al, 0xE0 427 | out dx, al 428 | 429 | rol rbx, 24 430 | 431 | ; Set the sector counts and the LBA48 address 432 | mov dx, word [rel hdd.CH0_SECCOUNT1] 433 | mov al, ch 434 | out dx, al 435 | 436 | mov dx, word [rel hdd.CH0_LBA5] 437 | mov al, bl 438 | out dx, al 439 | rol rbx, 8 440 | 441 | mov dx, word [rel hdd.CH0_LBA4] 442 | mov al, bl 443 | out dx, al 444 | rol rbx, 8 445 | 446 | mov dx, word [rel hdd.CH0_LBA3] 447 | mov al, bl 448 | out dx, al 449 | rol rbx, 8 450 | 451 | mov dx, word [rel hdd.CH0_SECCOUNT0] 452 | mov al, cl 453 | out dx, al 454 | 455 | mov dx, word [rel hdd.CH0_LBA2] 456 | mov al, bl 457 | out dx, al 458 | rol rbx, 8 459 | 460 | mov dx, word [rel hdd.CH0_LBA1] 461 | mov al, bl 462 | out dx, al 463 | rol rbx, 8 464 | 465 | mov dx, word [rel hdd.CH0_LBA0] 466 | mov al, bl 467 | out dx, al 468 | 469 | mov dx, word [rel hdd.CH0_COMMAND] 470 | mov al, 0x24 471 | out dx, al 
472 | 473 | mov rdi, r8 474 | mov si, cx 475 | .lewp: 476 | mov dx, word [rel hdd.CH0_STATUS] 477 | in al, dx 478 | in al, dx 479 | in al, dx 480 | in al, dx 481 | 482 | call ide_wait_busy 483 | 484 | mov dx, word [rel hdd.CH0_STATUS] 485 | in al, dx 486 | 487 | ; ERR 488 | test al, 0x01 489 | jnz short .err 490 | 491 | ; DF 492 | test al, 0x20 493 | jnz short .err 494 | 495 | ; !DRQ 496 | test al, 0x08 497 | jz short .err 498 | 499 | mov dx, word [rel hdd.CH0_DATA] 500 | mov rcx, (512 / 4) 501 | rep insd 502 | 503 | .continue: 504 | dec si 505 | jnz short .lewp 506 | 507 | .err: 508 | push rcx 509 | mov rcx, SPINLOCK_DISK 510 | call release_spinlock 511 | pop rcx 512 | 513 | pop rdi 514 | pop rsi 515 | pop rdx 516 | pop rcx 517 | pop rbx 518 | pop rax 519 | ret 520 | 521 | .errz: 522 | pop rdi 523 | pop rsi 524 | pop rdx 525 | pop rcx 526 | pop rbx 527 | pop rax 528 | 529 | mov rdi, 0xb8000 530 | mov rdx, 0x1337133713371337 531 | call outhexq 532 | 533 | mov rdi, 0xb8000 + (80 * 2 * 1) 534 | mov rdx, rbx 535 | call outhexq 536 | 537 | mov rdi, 0xb8000 + (80 * 2 * 2) 538 | mov rdx, r8 539 | call outhexq 540 | hlt 541 | 542 | align 16 543 | prd: 544 | .buf: dd DMA_BUFFER_ZONE 545 | .size: dw 0x0000 546 | .flags: dw 0x8000 547 | 548 | ide_dma_read_sectors: 549 | push rax 550 | push rbx 551 | push rcx 552 | push rdx 553 | push rsi 554 | push rdi 555 | 556 | mov rdi, DMA_BUFFER_ZONE 557 | mov rcx, (64 * 1024) 558 | mov al, 0xc0 559 | rep stosb 560 | 561 | mov dword [rel prd.buf], DMA_BUFFER_ZONE 562 | mov word [rel prd.size], 0 563 | mov word [rel prd.flags], 0x8000 564 | 565 | ; Set the address of the PRD 566 | mov dx, word [rel hdd.BUS_MASTER] 567 | add dx, 0x4 568 | lea rbx, [rel prd] 569 | call bamp_get_phys 570 | out dx, eax 571 | 572 | ; Disable the device, set it to read mode 573 | mov dx, word [rel hdd.BUS_MASTER] 574 | mov al, 0x0 575 | out dx, al 576 | 577 | ; Clear the status register 578 | mov dx, word [rel hdd.BUS_MASTER] 579 | add dx, 0x2 580 | mov al, 
0x0 581 | out dx, al 582 | 583 | push rdi 584 | mov dx, word [rel hdd.BUS_MASTER] 585 | add dx, 2 586 | in al, dx 587 | mov rdi, 0xb8000 588 | ;movzx rdx, al 589 | call outhexq 590 | pop rdi 591 | cli 592 | hlt 593 | 594 | ; Enable the device 595 | mov dx, word [rel hdd.BUS_MASTER] 596 | mov al, 0x1 597 | out dx, al 598 | 599 | mov cx, 128 600 | 601 | mov dx, word [rel hdd.CH0_HDDEVSEL] 602 | mov al, 0xE0 603 | out dx, al 604 | 605 | rol rbx, 24 606 | 607 | ; Set the sector counts and the LBA48 address 608 | ;mov dx, word [hdd.CH0_SECCOUNT1] 609 | ;mov al, ch 610 | ;out dx, al 611 | 612 | ;mov dx, word [hdd.CH0_LBA5] 613 | ;mov al, bl 614 | ;out dx, al 615 | rol rbx, 8 616 | 617 | ;mov dx, word [hdd.CH0_LBA4] 618 | ;mov al, bl 619 | ;out dx, al 620 | rol rbx, 8 621 | 622 | ;mov dx, word [hdd.CH0_LBA3] 623 | ;mov al, bl 624 | ;out dx, al 625 | rol rbx, 8 626 | 627 | mov dx, word [rel hdd.CH0_SECCOUNT0] 628 | mov al, cl 629 | out dx, al 630 | 631 | mov dx, word [rel hdd.CH0_LBA2] 632 | mov al, bl 633 | out dx, al 634 | rol rbx, 8 635 | 636 | mov dx, word [rel hdd.CH0_LBA1] 637 | mov al, bl 638 | out dx, al 639 | rol rbx, 8 640 | 641 | mov dx, word [rel hdd.CH0_LBA0] 642 | mov al, bl 643 | out dx, al 644 | 645 | sti 646 | mov dx, word [rel hdd.CH0_COMMAND] 647 | mov al, 0xC8 648 | out dx, al 649 | 650 | mov rdi, DMA_BUFFER_ZONE 651 | .lewp: 652 | push rdi 653 | mov dx, word [rel hdd.BUS_MASTER] 654 | add dx, 2 655 | in al, dx 656 | mov rdi, 0xb8000 657 | movzx rdx, al 658 | call outhexq 659 | pop rdi 660 | 661 | cmp dword [rdi], 0xc0c0c0c0 662 | je short .lewp 663 | 664 | mov dx, word [rel hdd.BUS_MASTER] 665 | add dx, 2 666 | in al, dx 667 | 668 | mov dx, word [rel hdd.BUS_MASTER] 669 | mov al, 0x00 670 | out dx, al 671 | 672 | mov rdi, r8 673 | mov rsi, DMA_BUFFER_ZONE 674 | mov rcx, (64 * 1024) ; 64kB 675 | rep movsb 676 | 677 | pop rdi 678 | pop rsi 679 | pop rdx 680 | pop rcx 681 | pop rbx 682 | pop rax 683 | ret 684 | 685 | ; hash_falkquot 686 | ; 687 | ; 
Summary: 688 | ; 689 | ; This function performs the falkquot hashing operation on data in rbx with a 690 | ; length of r11. 691 | ; 692 | ; Parameters: 693 | ; 694 | ; rbx - Pointer to data to hash 695 | ; r11 - Length of data to hash 696 | ; 697 | ; Alignment: 698 | ; 699 | ; None 700 | ; 701 | ; Returns: 702 | ; 703 | ; rax - Hash 704 | ; 705 | ; Smashes: 706 | ; 707 | ; rax - Hash 708 | ; 709 | ; Optimization: 710 | ; 711 | ; Readability (with a hint of speed) 712 | ; 713 | hash_falkquot: 714 | push rbx 715 | push rcx 716 | push rdx 717 | push r8 718 | push r10 719 | push r11 720 | 721 | mov r10, 0x1337133713371337 722 | .lewp: 723 | mov al, byte [rbx] 724 | mov dl, al 725 | 726 | ; n 727 | and al, 0xF 728 | add al, 17 729 | 730 | ; k 731 | shr dl, 4 732 | add dl, 13 733 | 734 | mov r8, r10 735 | mov cl, dl 736 | shl r8, cl 737 | xor r10, r8 ; r10 ^= (r10 << k) 738 | 739 | mov r8, r10 740 | mov cl, al 741 | shr r8, cl 742 | xor r10, r8 ; r10 ^= (r10 >> n) 743 | 744 | mov r8, r10 745 | shl r8, 43 746 | xor r10, r8 ; r10 ^= (r10 << 43) 747 | 748 | inc rbx 749 | dec r11 750 | jnz short .lewp 751 | 752 | mov rax, r10 753 | 754 | pop r11 755 | pop r10 756 | pop r8 757 | pop rdx 758 | pop rcx 759 | pop rbx 760 | ret 761 | 762 | ; read_via_ide_pio 763 | ; 764 | ; Summary: 765 | ; 766 | ; This function reads a FALKQUOT filesystem from IDE controller 0 master drive. 767 | ; The drive is initialized here as well. 
768 | ; 769 | ; Parameters: 770 | ; 771 | ; r8 - Buffer to load file into 772 | ; 773 | ; Alignment: 774 | ; 775 | ; None 776 | ; 777 | ; Returns: 778 | ; 779 | ; on Success: CF = 0 and r11 = number of bytes loaded 780 | ; on Failure: CF = 1 and r11 = Smashed 781 | ; 782 | ; Smashes: 783 | ; 784 | ; r11 - Return value 785 | ; 786 | ; Optimization: 787 | ; 788 | ; Readability 789 | ; 790 | read_via_ide_pio: 791 | push rax 792 | push rbx 793 | push rcx 794 | push rdx 795 | push r8 796 | push r10 797 | push r13 798 | 799 | call init_hdd 800 | jc .fail 801 | 802 | ; Read the initial sector to get the length of the falkquot 803 | mov cx, 0x1 804 | xor rbx, rbx 805 | call ide_pio_read_sectors 806 | 807 | ; Validate the signature 808 | mov r10, r8 809 | mov r11, 0x544F55514B4C4146 ; 'FALKQUOT' 810 | cmp qword [r10], r11 811 | jne .fail 812 | 813 | ; Size in bytes of data 814 | mov r11, qword [r10 + 0x08] 815 | mov r13, r11 816 | shr r13, 9 817 | add r13, 10 818 | 819 | ; Read the rest of the sectors we need in chunks of at most 0xFF sectors 820 | 821 | ; r8 - Pointer to buffer 822 | ; r13 - Number of sectors left to read 823 | ; rbx - Sector to read 824 | ; rcx - Number of sectors to read this iteration (max of 0xFF) 825 | lea r8, [r10] 826 | mov rbx, 0 827 | .lewp: 828 | ; Read in chunks of at most 0xFF sectors (LBA48 PIO itself allows up to 0xFFFF) 829 | mov rcx, 0xFF 830 | cmp r13, rcx 831 | jbe short .no_max 832 | jmp short .read 833 | 834 | .no_max: 835 | mov rcx, r13 836 | .read: 837 | call ide_pio_read_sectors 838 | 839 | ; Add to the sector offset 840 | add rbx, rcx 841 | 842 | ; Calculate number of sectors left to read 843 | sub r13, rcx 844 | 845 | ; Increment our buffer 846 | shl rcx, 9 847 | add r8, rcx 848 | 849 | ; Print out the number of sectors remaining 850 | push rdx 851 | push rdi 852 | mov rdi, 0xb8000 + (80 * 2 * 0) 853 | mov rdx, r13 854 | call outhexq 855 | pop rdi 856 | pop rdx 857 | 858 | ; Continue if we have sectors left 859 | test r13, r13 860 | jnz short .lewp 861 | 862 | ; Generate the hash 863
| lea rbx, [r10 + 0x18] 864 | call hash_falkquot 865 | 866 | ; Ensure the hash matches 867 | cmp rax, qword [r10 + 0x10] 868 | jne short .fail 869 | 870 | clc 871 | jmp short .ret 872 | .fail: 873 | stc 874 | .ret: 875 | pop r13 876 | pop r10 877 | pop r8 878 | pop rdx 879 | pop rcx 880 | pop rbx 881 | pop rax 882 | ret 883 | 884 | ; read_via_ide_dma 885 | ; 886 | ; Summary: 887 | ; 888 | ; This function reads a FALKQUOT filesystem from IDE controller 0 master drive. 889 | ; The drive is initialized here as well. 890 | ; 891 | ; Parameters: 892 | ; 893 | ; None 894 | ; 895 | ; Alignment: 896 | ; 897 | ; None 898 | ; 899 | ; Returns: 900 | ; 901 | ; on Success: CF = 0 and r11 = number of bytes loaded 902 | ; on Failure: CF = 1 and r11 = Smashed 903 | ; 904 | ; Smashes: 905 | ; 906 | ; r11 - Return value 907 | ; 908 | ; Optimization: 909 | ; 910 | ; Readability 911 | ; 912 | read_via_ide_dma: 913 | push rax 914 | push rbx 915 | push rdx 916 | push r8 917 | push r10 918 | push r13 919 | 920 | call init_hdd 921 | jc .fail 922 | 923 | ; Read the initial sector to get the length of the falkquot 924 | xor rbx, rbx 925 | call ide_dma_read_sectors 926 | 927 | ; Validate the signature 928 | mov r10, r8 929 | mov r11, 0x544F55514B4C4146 ; 'FALKQUOT' 930 | cmp qword [r10], r11 931 | jne .fail 932 | 933 | mov rdx, 0x1234DE 934 | mov rdi, 0xb8000 + (80 * 2 * 5) 935 | call outhexq 936 | cli 937 | hlt 938 | 939 | ; Size in bytes of data 940 | mov r11, qword [r10 + 0x08] 941 | mov r13, r11 942 | shr r13, 9 943 | add r13, 10 944 | 945 | ; Read the rest of the sectors we need in 0xFFFF chunks 946 | 947 | ; r8 - Pointer to buffer 948 | ; r13 - Number of sectors to read 949 | ; rbx - Sector to read 950 | lea r8, [r10] 951 | mov rbx, 0 952 | .lewp: 953 | mov cx, 128 954 | call ide_dma_read_sectors 955 | 956 | ; Add to the sector offset 957 | add rbx, (64 * 1024) / 512 958 | 959 | ; Calculate number of sectors left to read 960 | sub r13, (64 * 1024) / 512 961 | 962 | ; Increment our 
buffer 963 | add r8, (64 * 1024) 964 | 965 | ; Print out the number of sectors remaining 966 | push rdx 967 | push rdi 968 | mov rdx, r13 969 | call outhexq 970 | pop rdi 971 | pop rdx 972 | 973 | ; Continue if we have sectors left 974 | cmp r13, 0 975 | jg short .lewp 976 | 977 | ; Generate the hash 978 | lea rbx, [r10 + 0x18] 979 | call hash_falkquot 980 | 981 | ; Ensure the hash matches 982 | cmp rax, qword [r10 + 0x10] 983 | jne short .fail 984 | 985 | clc 986 | jmp short .ret 987 | .fail: 988 | stc 989 | .ret: 990 | pop r13 991 | pop r10 992 | pop r8 993 | pop rdx 994 | pop rbx 995 | pop rax 996 | ret 997 | 998 | -------------------------------------------------------------------------------- /srcs/disp/console.asm: -------------------------------------------------------------------------------- 1 | [bits 64] 2 | 3 | ; sets rdi to point to the calling core's designated display location 4 | per_core_screen: 5 | imul rdi, qword [gs:thread_local.core_id], 40 6 | add rdi, 0xb8000 7 | ret 8 | 9 | hexlut: db "0123456789ABCDEF" 10 | 11 | ; outhexq 12 | ; 13 | ; Summary: 14 | ; 15 | ; This function writes out hexlified value rdx to the screen (pointed to by 16 | ; rdi). rdi is incremented by 32 as `stosw` is used to write the bytes.
17 | ; 18 | ; Parameters: 19 | ; 20 | ; rdx - Number to hexlify and display 21 | ; rdi - Pointer to buffer to receive characters (usually the screen) 22 | ; 23 | ; Alignment: 24 | ; 25 | ; None 26 | ; 27 | ; Returns: 28 | ; 29 | ; None 30 | ; 31 | ; Smashes: 32 | ; 33 | ; rdi - Incremented by 32 34 | ; 35 | ; Optimization 36 | ; 37 | ; Readability 38 | ; 39 | outhexq: 40 | push rax 41 | push rbx 42 | push rcx 43 | ; We don't need to save rdx because 16 rotates of 4 bits return it to its original value 44 | 45 | mov cl, 16 46 | .lewp: 47 | rol rdx, 4 48 | 49 | mov rax, rdx 50 | and rax, 0xF 51 | lea rbx, [rel hexlut] 52 | mov al, byte [rbx + rax] 53 | mov ah, 0x0F 54 | stosw 55 | 56 | dec cl 57 | jnz short .lewp 58 | 59 | pop rcx 60 | pop rbx 61 | pop rax 62 | ret 63 | 64 | ; output xmm0 to screen in hex form 65 | ; rdi -> Screen pointer 66 | ; xmm0 -> Data to display 67 | ; rdi <- Incremented screen pointer 68 | outhex128: 69 | push rdx 70 | 71 | pextrq rdx, xmm0, 1 72 | call outhexq 73 | pextrq rdx, xmm0, 0 74 | call outhexq 75 | 76 | pop rdx 77 | ret 78 | 79 | ; outdecq 80 | ; 81 | ; Summary: 82 | ; 83 | ; This function writes out decimalified value rdx to the screen (pointed to by 84 | ; rdi). rdi is incremented by 40 as `stosw` is used to write the bytes.
85 | ; 86 | ; Parameters: 87 | ; 88 | ; rdx - Number to decimalify and display 89 | ; rdi - Pointer to buffer to receive characters (usually the screen) 90 | ; 91 | ; Alignment: 92 | ; 93 | ; None 94 | ; 95 | ; Returns: 96 | ; 97 | ; None 98 | ; 99 | ; Smashes: 100 | ; 101 | ; rdi - Incremented by 40 102 | ; 103 | ; Optimization 104 | ; 105 | ; Readability 106 | ; 107 | outdecq: 108 | push rax 109 | push rbx 110 | push rcx 111 | push rdx 112 | push r8 113 | push r9 114 | 115 | std 116 | add rdi, 38 117 | 118 | mov cl, 20 119 | mov r8, rdx 120 | mov r9, 10 121 | .lewp: 122 | xor rdx, rdx 123 | mov rax, r8 124 | div r9 125 | mov r8, rax 126 | 127 | lea rbx, [rel hexlut] 128 | mov al, byte [rbx + rdx] 129 | mov ah, 0x0F 130 | stosw 131 | 132 | dec cl 133 | jnz short .lewp 134 | 135 | add rdi, 42 136 | cld 137 | 138 | pop r9 139 | pop r8 140 | pop rdx 141 | pop rcx 142 | pop rbx 143 | pop rax 144 | ret 145 | 146 | ; outdecqsz 147 | ; 148 | ; Summary: 149 | ; 150 | ; This function writes out decimalified value rdx to the screen (pointed to by 151 | ; rdi). rdi is incremented by (sz * 2) as `stosw` is used to write the bytes.
152 | ; 153 | ; Parameters: 154 | ; 155 | ; rcx - Number of digits to display (from right to left), 2 means 'nn' 156 | ; rdx - Number to decimalify and display 157 | ; rdi - Pointer to buffer to receive characters (usually the screen) 158 | ; 159 | ; Alignment: 160 | ; 161 | ; None 162 | ; 163 | ; Returns: 164 | ; 165 | ; None 166 | ; 167 | ; Smashes: 168 | ; 169 | ; rdi - Incremented by (rcx * 2) 170 | ; 171 | ; Optimization 172 | ; 173 | ; Readability 174 | ; 175 | outdecqsz: 176 | push rax 177 | push rbx 178 | push rcx 179 | push rdx 180 | push r8 181 | push r9 182 | 183 | std 184 | lea rdi, [rdi + rcx*2 - 2] 185 | push rcx 186 | 187 | ; Passed in 188 | ;mov cl, 20 189 | mov r8, rdx 190 | mov r9, 10 191 | .lewp: 192 | xor rdx, rdx 193 | mov rax, r8 194 | div r9 195 | mov r8, rax 196 | 197 | lea rbx, [rel hexlut] 198 | mov al, byte [rbx + rdx] 199 | mov ah, 0x0F 200 | stosw 201 | 202 | dec cl 203 | jnz short .lewp 204 | 205 | pop rcx 206 | lea rdi, [rdi + rcx*2 + 2] 207 | cld 208 | 209 | pop r9 210 | pop r8 211 | pop rdx 212 | pop rcx 213 | pop rbx 214 | pop rax 215 | ret 216 | 217 | ; outfixedb10d4 218 | ; 219 | ; Summary: 220 | ; 221 | ; This function writes out decimalified value rdx to the screen (pointed to by 222 | ; rdi). rdi is incremented by 42 as `stosw` is used to write the bytes.
223 | ; The output print will be based on a base 10 fixed point representation with 224 | ; 4 digits (n.xxxx) of fixed pointness 225 | ; 226 | ; Parameters: 227 | ; 228 | ; rdx - Number to decimalify and display 229 | ; rdi - Pointer to buffer to receive characters (usually the screen) 230 | ; 231 | ; Alignment: 232 | ; 233 | ; None 234 | ; 235 | ; Returns: 236 | ; 237 | ; None 238 | ; 239 | ; Smashes: 240 | ; 241 | ; rdi - Incremented by 42 242 | ; 243 | ; Optimization 244 | ; 245 | ; Readability 246 | ; 247 | outfixedb10d4: 248 | push rax 249 | push rbx 250 | push rcx 251 | push rdx 252 | push r8 253 | push r9 254 | 255 | std 256 | add rdi, 40 257 | 258 | mov cl, 20 259 | mov r8, rdx 260 | mov r9, 10 261 | .lewp: 262 | xor rdx, rdx 263 | mov rax, r8 264 | div r9 265 | mov r8, rax 266 | 267 | lea rbx, [rel hexlut] 268 | mov al, byte [rbx + rdx] 269 | mov ah, 0x0F 270 | stosw 271 | 272 | cmp cl, (20 - 3) 273 | jne short .not_decimal_time 274 | 275 | mov al, '.' 276 | stosw 277 | 278 | .not_decimal_time: 279 | dec cl 280 | jnz short .lewp 281 | 282 | add rdi, 44 283 | cld 284 | 285 | pop r9 286 | pop r8 287 | pop rdx 288 | pop rcx 289 | pop rbx 290 | pop rax 291 | ret 292 | 293 | ; outdecqi 294 | ; 295 | ; Summary: 296 | ; 297 | ; This function writes out decimalified value rdx to the screen (pointed to by 298 | ; rdi). rdi is incremented by 42 as `stosw` is used to write the bytes.
299 | ; This is the signed version of outdecq 300 | ; 301 | ; Parameters: 302 | ; 303 | ; rdx - Number to decimalify and display 304 | ; rdi - Pointer to buffer to receive characters (usually the screen) 305 | ; 306 | ; Alignment: 307 | ; 308 | ; None 309 | ; 310 | ; Returns: 311 | ; 312 | ; None 313 | ; 314 | ; Smashes: 315 | ; 316 | ; rdi - Incremented by 42 317 | ; 318 | ; Optimization 319 | ; 320 | ; Readability 321 | ; 322 | outdecqi: 323 | push rax 324 | push rbx 325 | push rcx 326 | push rdx 327 | push r8 328 | push r9 329 | push r10 330 | 331 | xor r10, r10 332 | 333 | bt rdx, 63 334 | jnc short .positive 335 | 336 | bts r10, 0 337 | neg rdx 338 | 339 | .positive: 340 | std 341 | add rdi, 40 342 | 343 | mov cl, 20 344 | mov r8, rdx 345 | mov r9, 10 346 | .lewp: 347 | xor rdx, rdx 348 | mov rax, r8 349 | div r9 350 | mov r8, rax 351 | 352 | lea rbx, [rel hexlut] 353 | mov al, byte [rbx + rdx] 354 | mov ah, 0x0F 355 | stosw 356 | 357 | dec cl 358 | jnz short .lewp 359 | 360 | test r10, r10 361 | jz short .noneg 362 | 363 | mov ax, 0x0F2D 364 | stosw 365 | 366 | jmp short .done 367 | .noneg: 368 | mov ax, 0x0F2B 369 | stosw 370 | 371 | .done: 372 | add rdi, 44 373 | cld 374 | 375 | pop r10 376 | pop r9 377 | pop r8 378 | pop rdx 379 | pop rcx 380 | pop rbx 381 | pop rax 382 | ret 383 | 384 | ; outdecexp 385 | ; 386 | ; Summary: 387 | ; 388 | ; This function writes out the decimal representation of the exponent style 389 | ; float. The number is input in rdx, and the exponent is input in rcx.
390 | ; rbx will specify the number of digits after the decimal 391 | ; 392 | ; TODO: This function does not support exponents >0 nor does it support very 393 | ; negative exponents 394 | ; 395 | ; Parameters: 396 | ; 397 | ; rcx - Exponent to use during displaying 398 | ; rdx - Number to decimalify and display 399 | ; rdi - Pointer to buffer to receive characters (usually the screen) 400 | ; 401 | ; Alignment: 402 | ; 403 | ; None 404 | ; 405 | ; Returns: 406 | ; 407 | ; None 408 | ; 409 | ; Smashes: 410 | ; 411 | ; rdi - Incremented to point to the next element after the printed string 412 | ; 413 | ; Optimization 414 | ; 415 | ; Readability 416 | ; 417 | outdecexp: 418 | push rax 419 | push rbx 420 | push rcx 421 | push rdx 422 | push r8 423 | push r9 424 | push r10 425 | push r11 426 | 427 | xor r10, r10 428 | 429 | bt rdx, 63 430 | jnc short .positive 431 | 432 | bts r10, 0 433 | neg rdx 434 | 435 | .positive: 436 | push rcx 437 | 438 | mov rcx, 40 439 | xor al, al 440 | rep stosb 441 | 442 | pop rcx 443 | 444 | std 445 | 446 | mov r11, -1 447 | mov r8, rdx 448 | mov r9, 10 449 | .lewp: 450 | xor rdx, rdx 451 | mov rax, r8 452 | div r9 453 | mov r8, rax 454 | 455 | lea rbx, [rel hexlut] 456 | mov al, byte [rbx + rdx] 457 | mov ah, 0x0F 458 | stosw 459 | 460 | cmp r11, rcx 461 | jne short .dont_print_dec 462 | 463 | mov ax, 0x0F2E 464 | stosw 465 | 466 | .dont_print_dec: 467 | dec r11 468 | test r8, r8 469 | jnz short .lewp 470 | 471 | test r10, r10 472 | jz short .noneg 473 | 474 | mov ax, 0x0F2D 475 | stosw 476 | 477 | jmp short .done 478 | .noneg: 479 | mov ax, 0x0F2B 480 | stosw 481 | 482 | .done: 483 | add rdi, 44 484 | cld 485 | 486 | pop r11 487 | pop r10 488 | pop r9 489 | pop r8 490 | pop rdx 491 | pop rcx 492 | pop rbx 493 | pop rax 494 | ret 495 | 496 | ; outdouble 497 | ; 498 | ; Summary: 499 | ; 500 | ; This function writes out the double value in xmm0 to the screen in its 501 | ; decimal form.
502 | ; 503 | ; Parameters: 504 | ; 505 | ; xmm0 - Number to decimalify and display 506 | ; rdi - Pointer to buffer to recieve characters (usually the screen) 507 | ; 508 | ; Alignment: 509 | ; 510 | ; None 511 | ; 512 | ; Returns: 513 | ; 514 | ; None 515 | ; 516 | ; Smashes: 517 | ; 518 | ; rdi - Incremented by 40 519 | ; 520 | ; Optimization 521 | ; 522 | ; Readability 523 | ; 524 | outdouble: 525 | push rax 526 | push rbx 527 | push rcx 528 | push rdx 529 | push r8 530 | push r9 531 | push r10 532 | push r11 533 | push r12 534 | push r13 535 | 536 | movq rdx, xmm0 537 | 538 | ; rax - Mantissa 539 | mov rax, rdx 540 | mov rbx, 0xFFFFFFFFFFFFF 541 | and rax, rbx 542 | 543 | ; rbx - Exponent with bias applied 544 | mov rbx, rdx 545 | shr rbx, 52 546 | and rbx, 0x7FF 547 | sub rbx, 1023 548 | 549 | xor r8, r8 ; decimalExponent 550 | mov r9, 1 ; sideMultiplicator 551 | 552 | .for_each_exponent_pos: 553 | cmp rbx, 0 554 | jle short .done_exponent_pos 555 | 556 | shl r9, 1 557 | 558 | mov r10, r9 559 | shr r10, 30 560 | jz short .no_overflow 561 | 562 | push rax 563 | push rdx 564 | add r8, 3 565 | mov r11, 1000 566 | xor rdx, rdx 567 | mov rax, r9 568 | div r11 569 | mov r9, rax 570 | pop rdx 571 | pop rax 572 | 573 | .no_overflow: 574 | dec rbx 575 | jmp short .for_each_exponent_pos 576 | 577 | .done_exponent_pos: 578 | .for_each_exponent_neg: 579 | cmp rbx, 0 580 | jge short .done_exponent_neg 581 | 582 | mov r11, r9 583 | shr r11, 30 584 | jz short .no_overflow_neg 585 | 586 | imul r9, r9, 10 587 | sub r8, 1 588 | 589 | .no_overflow_neg: 590 | shr r9, 1 591 | 592 | inc rbx 593 | jmp short .for_each_exponent_neg 594 | 595 | .done_exponent_neg: 596 | ; rax - Mantissa 597 | ; r8 - decimalExponent 598 | ; r9 - sideMultiplicator 599 | ; r10 - betweenResult 600 | ; r11 - fraction 601 | ; r12 - bit 602 | mov r10, r9 603 | mov r11, 2 604 | xor r12, r12 605 | 606 | .frac: 607 | cmp r12, 52 608 | jge short .done_frac 609 | 610 | mov rcx, 51 611 | sub rcx, r12 612 | bt rax, 
rcx 613 | jnc .cont_frac 614 | 615 | .while_fraction: 616 | push rax 617 | push rdx 618 | xor rdx, rdx 619 | mov rax, r9 620 | div r11 621 | mov r13, rdx 622 | pop rdx 623 | pop rax 624 | 625 | cmp r13, 0 626 | jle .done_while 627 | 628 | mov r13, r10 629 | shr r13, 58 630 | jnz short .done_while 631 | 632 | imul r10, r10, 10 633 | imul r9, r9, 10 634 | sub r8, 1 635 | jmp short .while_fraction 636 | 637 | .done_while: 638 | push rax 639 | push rdx 640 | xor rdx, rdx 641 | mov rax, r9 642 | div r11 643 | mov r13, rax 644 | pop rdx 645 | pop rax 646 | 647 | add r10, r13 648 | 649 | .cont_frac: 650 | add r12, 1 651 | shl r11, 1 652 | jmp short .frac 653 | 654 | .done_frac: 655 | bt rdx, 63 656 | jnc short .not_signed 657 | 658 | ;neg r10 659 | 660 | .not_signed: 661 | ; now r10 contains the number and r8 contains the base-10 exponent 662 | 663 | ; for example, if we were converting '123.3489' 664 | ; r10 would contain: +1233488999999999998 665 | ; r8 would contain: -16 666 | 667 | mov rcx, r8 668 | mov rdx, r10 669 | call outdecexp 670 | 671 | pop r13 672 | pop r12 673 | pop r11 674 | pop r10 675 | pop r9 676 | pop r8 677 | pop rdx 678 | pop rcx 679 | pop rbx 680 | pop rax 681 | ret 682 | 683 | ; dump_memory_map 684 | ; 685 | ; Summary: 686 | ; 687 | ; This function dumps the 0xE820 memory map contents to the screen.
688 | ; 689 | ; Parameters: 690 | ; 691 | ; None 692 | ; 693 | ; Alignment: 694 | ; 695 | ; None 696 | ; 697 | ; Returns: 698 | ; 699 | ; None 700 | ; 701 | ; Smashes: 702 | ; 703 | ; None 704 | ; 705 | ; Optimization 706 | ; 707 | ; Readability 708 | ; 709 | dump_memory_map: 710 | push rbx 711 | push rcx 712 | push rdx 713 | push rdi 714 | 715 | ; Set cursor to top left 716 | mov rdi, 0xb8000 717 | 718 | mov rbx, MEMORY_MAP_LOC + 0x20 719 | movzx rcx, word [MEMORY_MAP_LOC] 720 | .lewp: 721 | ; Print the base address 722 | mov rdx, qword [rbx] 723 | call outhexq 724 | add rdi, 2 725 | 726 | ; Print the length 727 | mov rdx, qword [rbx + 0x08] 728 | call outhexq 729 | add rdi, 2 730 | 731 | ; Print the type 732 | mov edx, dword [rbx + 0x10] 733 | call outhexq 734 | add rdi, (80 * 2) - ((16 * 2) * 3) - 4 735 | 736 | add rbx, 0x20 737 | dec rcx 738 | jnz short .lewp 739 | 740 | pop rdi 741 | pop rdx 742 | pop rcx 743 | pop rbx 744 | ret 745 | 746 | dump_mmio_map: 747 | push rax 748 | push rcx 749 | push rdx 750 | push rdi 751 | push rbp 752 | 753 | mov rdi, 0xb8000 754 | 755 | lea rax, [rel mmio_routing_table] 756 | mov rcx, 12 757 | .lewp: 758 | mov rdx, qword [rax + 0] ; Base 759 | mov rbp, qword [rax + 8] ; Limit 760 | 761 | ; If the limit does not exist then go to the next table 762 | test rbp, rbp 763 | jz short .next_table 764 | 765 | mov rdx, rdx 766 | call outhexq 767 | add rdi, 2 768 | 769 | mov rdx, rbp 770 | call outhexq 771 | add rdi, (80 * 2) - ((16 * 2) * 2) - 2 772 | 773 | .next_table: 774 | add rax, 0x10 775 | dec rcx 776 | jnz short .lewp 777 | 778 | pop rbp 779 | pop rdi 780 | pop rdx 781 | pop rcx 782 | pop rax 783 | ret 784 | 785 | %macro printstr_nl 2 786 | push rbx 787 | push rcx 788 | jmp .%1_end 789 | .%1: db %2 790 | .%1_end: 791 | lea rbx, [rel .%1] 792 | mov rcx, (.%1_end - .%1) 793 | call outstr 794 | add rdi, (80 * 2) - ((.%1_end - .%1) * 2) 795 | pop rcx 796 | pop rbx 797 | %endmacro 798 | 799 | %macro printstr 2 800 | push rbx 801 | push 
rcx 802 | jmp .%1_end 803 | .%1: db %2 804 | .%1_end: 805 | lea rbx, [rel .%1] 806 | mov rcx, (.%1_end - .%1) 807 | call outstr 808 | pop rcx 809 | pop rbx 810 | %endmacro 811 | 812 | ; outstr 813 | ; 814 | ; Summary: 815 | ; 816 | ; This function prints out the string pointed to by rbx, of rcx length 817 | ; 818 | ; Parameters: 819 | ; 820 | ; rbx - Pointer to string 821 | ; rcx - Length of the string to print 822 | ; rdi - Location to print to (the screen) 823 | ; 824 | ; Smashes: 825 | ; 826 | ; rdi - Incremented by (2 * rcx) 827 | ; 828 | ; Optimization: 829 | ; 830 | ; Readability 831 | ; 832 | outstr: 833 | push rax 834 | push rbx 835 | push rcx 836 | 837 | mov ah, 0x0F 838 | .lewp: 839 | mov al, [rbx] 840 | stosw 841 | inc rbx 842 | dec rcx 843 | jnz short .lewp 844 | 845 | pop rcx 846 | pop rbx 847 | pop rax 848 | ret 849 | 850 | -------------------------------------------------------------------------------- /srcs/dstruc/flut.asm: -------------------------------------------------------------------------------- 1 | [bits 64] 2 | 3 | %define FLUT_BIN_SIZE 2 4 | 5 | ; rsi <- flut vector 6 | flut_alloc: 7 | push rax 8 | push rcx 9 | push rdi 10 | 11 | ; Check if we're out of flut entries 12 | cmp qword [gs:thread_local.flut_pages_rem], 0 13 | jne short .dont_alloc_page 14 | 15 | ; Allocate a whole page 16 | mov rsi, 4096 17 | rand_alloc rsi 18 | 19 | ; Zero out the page 20 | mov rdi, rsi 21 | mov ecx, (4096 / 8) 22 | xor eax, eax 23 | rep stosq 24 | 25 | ; Save the new flut pages 26 | mov qword [gs:thread_local.flut_pages], rsi 27 | mov qword [gs:thread_local.flut_pages_rem], 4096 / ((1 << FLUT_BIN_SIZE)*8) 28 | 29 | .dont_alloc_page: 30 | ; Consume one flut allocation 31 | mov rsi, ((1 << FLUT_BIN_SIZE) * 8) 32 | xadd qword [gs:thread_local.flut_pages], rsi 33 | dec qword [gs:thread_local.flut_pages_rem] 34 | 35 | pop rdi 36 | pop rcx 37 | pop rax 38 | ret 39 | 40 | ; rcx -> flut 41 | ; xmm5 -> hash 42 | ; rcx <- flut entry or pointer to
flut entry to fill 43 | ; CF <- Set if flut needs to be filled 44 | flut_fetch_or_lock: 45 | push rax 46 | push rbx 47 | push rdx 48 | push rsi 49 | push rbp 50 | 51 | movq rdx, xmm5 52 | 53 | mov rbx, (128 / FLUT_BIN_SIZE) 54 | .for_each_bin: 55 | mov ebp, edx 56 | and ebp, ((1 << FLUT_BIN_SIZE) - 1) 57 | ror rdx, FLUT_BIN_SIZE 58 | 59 | ; Check if this entry has been filled 60 | mov rsi, qword [rcx + rbp*8] 61 | cmp rsi, 2 62 | jae short .filled 63 | 64 | ; The entry has not been filled, atomically try to lock it down so we 65 | ; can fill it! 66 | xor eax, eax 67 | mov esi, 1 68 | lock cmpxchg qword [rcx + rbp*8], rsi 69 | jne short .wait_for_fill 70 | 71 | ; We succeeded in obtaining the lock. If this is the last bin in the hash 72 | ; we want to return to the user to fill in this last piece. 73 | lea rax, [rcx + rbp*8] 74 | cmp rbx, 1 75 | je short .done_needs_fill 76 | 77 | ; We obtained the lock and we are not at the last level, allocate another 78 | ; array and place it in the flut. 79 | call flut_alloc 80 | 81 | mov qword [rcx + rbp*8], rsi 82 | jmp short .filled 83 | 84 | .wait_for_fill: 85 | ; We lost the race in locking, wait for the caller with the lock to fill 86 | ; it in and then treat it as a filled entry. 87 | pause 88 | cmp qword [rcx + rbp*8], 1 89 | jbe short .wait_for_fill 90 | mov rsi, qword [rcx + rbp*8] 91 | 92 | .filled: 93 | cmp rbx, (64 / FLUT_BIN_SIZE) + 1 94 | jne short .dont_grab_upper 95 | 96 | pextrq rdx, xmm5, 1 97 | 98 | .dont_grab_upper: 99 | mov rcx, rsi 100 | dec rbx 101 | jnz short .for_each_bin 102 | 103 | ; At this point the entry of the flut was filled in, return the value 104 | ; in the flut and clear CF.
105 | mov rax, rcx 106 | 107 | clc 108 | jmp short .done 109 | .done_needs_fill: 110 | stc 111 | .done: 112 | mov rcx, rax 113 | pop rbp 114 | pop rsi 115 | pop rdx 116 | pop rbx 117 | pop rax 118 | ret 119 | 120 | ; rcx -> flut 121 | ; rax <- random entry from flut 122 | global flut_random 123 | flut_random: 124 | push rbx 125 | push rcx 126 | push rdx 127 | push rbp 128 | push rdi 129 | 130 | XMMPUSH xmm15 131 | 132 | call falkrand 133 | movq rbp, xmm15 134 | 135 | xor ebx, ebx 136 | .lewp: 137 | mov eax, ebp 138 | and eax, ((1 << FLUT_BIN_SIZE) - 1) 139 | ror rbp, FLUT_BIN_SIZE 140 | 141 | mov edi, (1 << FLUT_BIN_SIZE) 142 | .try_find_next: 143 | cmp qword [rcx + rax*8], 2 144 | jae short .found 145 | 146 | inc eax 147 | and eax, ((1 << FLUT_BIN_SIZE) - 1) 148 | dec edi 149 | jnz short .try_find_next 150 | jmp short .fail 151 | 152 | .found: 153 | mov rcx, [rcx + rax*8] 154 | inc ebx 155 | cmp ebx, (64 / FLUT_BIN_SIZE) 156 | jne short .dont_get_high 157 | 158 | pextrq rbp, xmm15, 1 159 | 160 | .dont_get_high: 161 | cmp ebx, (128 / FLUT_BIN_SIZE) 162 | jb short .lewp 163 | 164 | mov rax, rcx 165 | jmp short .done 166 | .fail: 167 | xor eax, eax 168 | .done: 169 | XMMPOP xmm15 170 | 171 | pop rdi 172 | pop rbp 173 | pop rdx 174 | pop rcx 175 | pop rbx 176 | ret 177 | 178 | struc fht_table 179 | .entries: resq 1 ; Number of used entries in this table 180 | .bits: resq 1 ; Number of bits in this table 181 | .table: resq 1 ; Pointer to hash table 182 | .ents: resq 1 ; Pointer to list of hashes 183 | endstruc 184 | 185 | struc fht_entry 186 | .hash: resq 2 187 | .data: resq 1 188 | .pad: resq 1 189 | endstruc 190 | 191 | struc fht_list_entry 192 | .hash: resq 2 193 | endstruc 194 | 195 | ; rcx -> Number of bits 196 | ; rcx <- Hash table base 197 | fht_create: 198 | push rax 199 | push rdi 200 | push rbp 201 | 202 | ; Allocate the table header 203 | mov rbp, fht_table_size 204 | rand_alloc rbp 205 | 206 | ; Initialize the table header 207 | mov qword [rbp + 
fht_table.entries], 0 208 | mov qword [rbp + fht_table.bits], rcx 209 | 210 | ; Calculate the table size 211 | mov eax, 1 212 | shl rax, cl 213 | 214 | ; Allocate and zero the hash table 215 | imul rdi, rax, fht_entry_size 216 | mov rcx, rdi 217 | mixed_alloc rdi 218 | call bzero 219 | mov [rbp + fht_table.table], rdi 220 | 221 | ; Allocate and zero the hash list table 222 | imul rdi, rax, fht_list_entry_size 223 | mov rcx, rdi 224 | mixed_alloc rdi 225 | call bzero 226 | mov [rbp + fht_table.ents], rdi 227 | 228 | mov rcx, rbp 229 | 230 | pop rbp 231 | pop rdi 232 | pop rax 233 | ret 234 | 235 | ; rcx -> Pointer to hash table 236 | ; rax <- Random entry (or zero if no entries are present) 237 | fht_random: 238 | push rbx 239 | push rcx 240 | push rdx 241 | push r15 242 | 243 | XMMPUSH xmm5 244 | 245 | cmp qword [rcx + fht_table.entries], 0 246 | je short .fail 247 | 248 | ; Pick a random entry 249 | call xorshift64 250 | xor rdx, rdx 251 | mov rax, r15 252 | div qword [rcx + fht_table.entries] 253 | 254 | ; Calculate the entry offset 255 | mov rbx, [rcx + fht_table.ents] 256 | imul rdx, fht_list_entry_size 257 | 258 | ; Fetch the random hash. If it is zero, fail. 259 | movdqu xmm5, [rbx + rdx + fht_list_entry.hash] 260 | ptest xmm5, xmm5 261 | jz short .fail 262 | 263 | ; Look up the hash. This will always succeed.
264 | call fht_fetch_or_lock 265 | mov rax, rcx 266 | jmp short .done 267 | 268 | .fail: 269 | xor rax, rax 270 | .done: 271 | XMMPOP xmm5 272 | 273 | pop r15 274 | pop rdx 275 | pop rcx 276 | pop rbx 277 | ret 278 | 279 | ; rcx -> Pointer to hash table 280 | ; xmm5 -> Hash 281 | ; rcx <- Pointer to entry or entry (depending on CF) 282 | ; CF <- Set if this is a new entry we must populate 283 | fht_fetch_or_lock: 284 | push rax 285 | push rbx 286 | push rdx 287 | push rdi 288 | push rbp 289 | 290 | XMMPUSH xmm4 291 | 292 | ; Save off the hash table pointer 293 | mov rbp, rcx 294 | 295 | mov rbx, [rbp + fht_table.table] 296 | mov rcx, [rbp + fht_table.bits] 297 | 298 | ; rbx now points to the start of the hash table vector 299 | ; rcx is now the number of bits in the hash table 300 | 301 | ; Get the low 64-bits of the hash 302 | movq rdx, xmm5 303 | 304 | ; Calculate the mask 305 | mov eax, 1 306 | shl rax, cl 307 | dec rax 308 | 309 | ; Mask the hash 310 | and rdx, rax 311 | 312 | ; Calculate the byte offset into the hash table 313 | imul rdx, fht_entry_size 314 | 315 | .next_entry: 316 | ; Look up this entry in the table 317 | movdqu xmm4, [rbx + rdx + fht_entry.hash] 318 | 319 | ; If this bin is empty, try to fill it 320 | ptest xmm4, xmm4 321 | jz short .empty 322 | 323 | ; If the hashes match, we have an entry! 324 | pxor xmm4, xmm5 325 | ptest xmm4, xmm4 326 | jz short .found 327 | 328 | ; The bin was not empty, nor did our hash match. This is a collision case. 329 | ; Go to the next entry (linear probing) 330 | add rdx, fht_entry_size 331 | and rdx, rax 332 | jmp short .next_entry 333 | 334 | .empty: 335 | ; The bin was empty, try to win the race to fill it. 
336 | lea rdi, [rbx + rdx + fht_entry.hash] 337 | 338 | ; We did not find an entry, try to atomically populate this entry 339 | push rax 340 | push rbx 341 | push rcx 342 | push rdx 343 | 344 | ; Compare part 345 | xor edx, edx 346 | xor eax, eax 347 | 348 | ; Exchange part 349 | pextrq rcx, xmm5, 1 350 | pextrq rbx, xmm5, 0 351 | 352 | lock cmpxchg16b [rdi] 353 | 354 | ; If we lost, rdx:rax is the 128-bit value that we lost to. Store this in 355 | ; xmm4. 356 | pinsrq xmm4, rdx, 1 357 | pinsrq xmm4, rax, 0 358 | 359 | pop rdx 360 | pop rcx 361 | pop rbx 362 | pop rax 363 | je short .won_race 364 | 365 | ; We lost the race. Check if the hash matches (we could lose the race to 366 | ; a collision case). 367 | pxor xmm4, xmm5 368 | ptest xmm4, xmm4 369 | jz short .found 370 | 371 | ; We lost the race, and it was a collision. Go to the next entry. 372 | add rdx, fht_entry_size 373 | and rdx, rax 374 | jmp short .next_entry 375 | 376 | .won_race: 377 | ; We won the race! Return the address of the data to fill. 378 | lea rcx, [rbx + rdx + fht_entry.data] 379 | 380 | ; Get this entry's ID 381 | mov edi, 1 382 | lock xadd qword [rbp + fht_table.entries], rdi 383 | 384 | ; Add this hash to the hash list 385 | mov rbx, [rbp + fht_table.ents] 386 | imul rdi, fht_list_entry_size 387 | movdqu [rbx + rdi + fht_list_entry.hash], xmm5 388 | 389 | XMMPOP xmm4 390 | 391 | pop rbp 392 | pop rdi 393 | pop rdx 394 | pop rbx 395 | pop rax 396 | stc 397 | ret 398 | 399 | .found: 400 | ; Fetch the data. If it is zero, loop until it is not.
401 | mov rcx, [rbx + rdx + fht_entry.data] 402 | test rcx, rcx 403 | jz short .found 404 | 405 | XMMPOP xmm4 406 | 407 | pop rbp 408 | pop rdi 409 | pop rdx 410 | pop rbx 411 | pop rax 412 | clc 413 | ret 414 | 415 | -------------------------------------------------------------------------------- /srcs/emu/ide.asm: -------------------------------------------------------------------------------- 1 | [bits 64] 2 | 3 | struc ide_emu_state 4 | .length: resq 0 ; in sectors 5 | .sector: resq 0 6 | .cylinder: resq 0 7 | endstruc 8 | 9 | ide_handle: 10 | ; out handler, in handler 11 | dq ide_unhandled, ide_unhandled ; 0x1f0 12 | dq ide_handled, ide_unhandled ; 0x1f1 13 | dq ide_1f2_out, ide_unhandled ; 0x1f2 14 | dq ide_1f3_out, ide_unhandled ; 0x1f3 15 | dq ide_1f4_out, ide_unhandled ; 0x1f4 16 | dq ide_1f5_out, ide_unhandled ; 0x1f5 17 | dq ide_1f6_out, ide_unhandled ; 0x1f6 18 | dq ide_1f7_out, ide_1f7_in ; 0x1f7 19 | 20 | ide_1f7_out: 21 | push r10 22 | 23 | mov r10, (1 << 31) | (0 << 8) | 0x71 24 | mov qword [rax + VMCB.eventinj], r10 25 | 26 | pop r10 27 | clc 28 | ret 29 | 30 | ide_1f7_in: 31 | push rbx 32 | 33 | ; We require that this is an 8-bit operand 34 | ; Since we know this is not a string, rep, or imm operation. We know for 35 | ; sure this operation is an: 'out dx, al' and nothing else!!! 36 | bt dword [rax + VMCB.exitinfo1], 4 37 | jnc .unhandled 38 | 39 | ; Status byte (RDY and DRQ bits set) 40 | mov byte [rax + VMCB.rax], (1 << 6) | (1 << 3) 41 | 42 | clc 43 | pop rbx 44 | ret 45 | 46 | .unhandled: 47 | stc 48 | pop rbx 49 | ret 50 | 51 | ide_1f2_out: 52 | push rbx 53 | 54 | ; We require that this is an 8-bit operand 55 | ; Since we know this is not a string, rep, or imm operation. We know for 56 | ; sure this operation is an: 'out dx, al' and nothing else!!! 
57 | bt dword [rax + VMCB.exitinfo1], 4 58 | jnc .unhandled 59 | 60 | ; Sector count 61 | movzx ebx, byte [rax + VMCB.rax] 62 | mov qword [gs:thread_local.ide_emu_state + ide_emu_state.length], rbx 63 | 64 | clc 65 | pop rbx 66 | ret 67 | 68 | .unhandled: 69 | stc 70 | pop rbx 71 | ret 72 | 73 | ide_1f3_out: 74 | push rbx 75 | 76 | ; We require that this is an 8-bit operand 77 | ; Since we know this is not a string, rep, or imm operation. We know for 78 | ; sure this operation is an: 'out dx, al' and nothing else!!! 79 | bt dword [rax + VMCB.exitinfo1], 4 80 | jnc .unhandled 81 | 82 | ; Sector 83 | movzx ebx, byte [rax + VMCB.rax] 84 | mov qword [gs:thread_local.ide_emu_state + ide_emu_state.sector], rbx 85 | 86 | clc 87 | pop rbx 88 | ret 89 | 90 | .unhandled: 91 | stc 92 | pop rbx 93 | ret 94 | 95 | ide_1f4_out: 96 | push rbx 97 | 98 | ; We require that this is an 8-bit operand 99 | ; Since we know this is not a string, rep, or imm operation. We know for 100 | ; sure this operation is an: 'out dx, al' and nothing else!!! 101 | bt dword [rax + VMCB.exitinfo1], 4 102 | jnc .unhandled 103 | 104 | ; Low cylinder byte 105 | movzx ebx, byte [rax + VMCB.rax] 106 | mov byte [gs:thread_local.ide_emu_state + ide_emu_state.cylinder], bl 107 | 108 | clc 109 | pop rbx 110 | ret 111 | 112 | .unhandled: 113 | stc 114 | pop rbx 115 | ret 116 | 117 | ide_1f5_out: 118 | push rbx 119 | 120 | ; We require that this is an 8-bit operand 121 | ; Since we know this is not a string, rep, or imm operation. We know for 122 | ; sure this operation is an: 'out dx, al' and nothing else!!! 
123 | bt dword [rax + VMCB.exitinfo1], 4 124 | jnc .unhandled 125 | 126 | ; High cylinder byte 127 | movzx ebx, byte [rax + VMCB.rax] 128 | mov byte [gs:thread_local.ide_emu_state + ide_emu_state.cylinder + 1], bl 129 | 130 | clc 131 | pop rbx 132 | ret 133 | 134 | .unhandled: 135 | stc 136 | pop rbx 137 | ret 138 | 139 | ide_1f6_out: 140 | ; We require that this is an 8-bit operand 141 | ; Since we know this is not a string, rep, or imm operation. We know for 142 | ; sure this operation is an: 'out dx, al' and nothing else!!! 143 | bt dword [rax + VMCB.exitinfo1], 4 144 | jnc .unhandled 145 | 146 | ; Reset state 147 | mov qword [gs:thread_local.ide_emu_state + ide_emu_state.length], 0 148 | mov qword [gs:thread_local.ide_emu_state + ide_emu_state.sector], 0 149 | mov qword [gs:thread_local.ide_emu_state + ide_emu_state.cylinder], 0 150 | 151 | clc 152 | ret 153 | 154 | .unhandled: 155 | stc 156 | ret 157 | 158 | ; rax -> VMCB 159 | ; CF <- Set if access unhandled, clear if handled 160 | ide_io: 161 | push rbx 162 | push rcx 163 | 164 | ; We currently do not support string accesses or rep accesses 165 | test dword [rax + VMCB.exitinfo1], (3 << 2) 166 | jnz .unhandled 167 | 168 | ; Make sure this port is an IDE port 169 | bextr rbx, qword [rax + VMCB.exitinfo1], 0x1010 170 | 171 | cmp rbx, 0xeff0 172 | je .handled 173 | cmp rbx, 0xeff2 174 | je .handled 175 | cmp rbx, 0xeff4 176 | je .handled 177 | 178 | cmp rbx, 0x1f0 179 | jb .unhandled 180 | cmp rbx, 0x1f7 181 | ja .unhandled 182 | 183 | ; Fetch whether this is an in or out 184 | mov ecx, dword [rax + VMCB.exitinfo1] 185 | and ecx, 1 186 | shl ecx, 3 ; Multiply by 0x8 187 | 188 | ; Dispatch this IO access 189 | sub rbx, 0x1f0 190 | shl rbx, 4 ; Multiply by 0x10 191 | call [rel ide_handle + rbx + rcx] ; XXX: Can we use rel like this?
192 | 193 | ; Might be handled or unhandled, what we call must set/unset CF 194 | pop rcx 195 | pop rbx 196 | ret 197 | 198 | .handled: 199 | clc 200 | pop rcx 201 | pop rbx 202 | ret 203 | 204 | .unhandled: 205 | stc 206 | pop rcx 207 | pop rbx 208 | ret 209 | 210 | ide_unhandled: 211 | stc 212 | ret 213 | 214 | ide_handled: 215 | clc 216 | ret 217 | 218 | -------------------------------------------------------------------------------- /srcs/emu/win.asm: -------------------------------------------------------------------------------- 1 | ; rsi -> VM snapshot 2 | apply_fileio_breakpoint: 3 | push rax 4 | push rcx 5 | push rdx 6 | push rdi 7 | push rsi 8 | 9 | lea rdi, [rsi + vm_snapshot.physical_memory + 0x820] 10 | lea rsi, [rel .fileio_sig] 11 | mov rcx, 64 12 | mov rdx, VM_MEMORY_SIZE 13 | call memmem 14 | test rax, rax ; not found 15 | jz panic 16 | 17 | mov byte [rax + 0], 0xcc ; int3 18 | mov byte [rax + 1], 0xc3 ; ret 19 | 20 | pop rsi 21 | pop rdi 22 | pop rdx 23 | pop rcx 24 | pop rax 25 | ret 26 | 27 | .fileio_sig: 28 | db 0x48, 0x89, 0x5C, 0x24, 0x18, 0x55, 0x56, 0x57, 0x41, 0x54, 0x41, 0x55, 0x41, 0x56, 0x41, 0x57, 29 | db 0x48, 0x81, 0xEC, 0x80, 0x00, 0x00, 0x00, 0x48, 0x8B, 0x05, 0xC2, 0x47, 0x03, 0x00, 0x48, 0x33, 30 | db 0xC4, 0x48, 0x89, 0x44, 0x24, 0x78, 0x48, 0x8B, 0x59, 0x40, 0x48, 0x8B, 0xBA, 0xB8, 0x00, 0x00, 31 | db 0x00, 0x4C, 0x8B, 0xF2, 0x4C, 0x8B, 0x63, 0x10, 0x44, 0x8B, 0x7F, 0x08, 0x48, 0x8B, 0xE9, 0xF0, 32 | .fileio_sig_size: equ ($ - .fileio_sig) 33 | 34 | handle_fileio: 35 | push rbx 36 | push rcx 37 | push rdi 38 | push rsi 39 | push rbp 40 | push r8 41 | push r9 42 | 43 | ; We expect SystemBuffer to be NULL 44 | push rdx 45 | lea rdx, [rdx + 0x18] ; IRP->AssociatedIrp.SystemBuffer 46 | call mm_read_guest_qword 47 | test rdx, rdx 48 | jnz panic 49 | pop rdx 50 | 51 | ; We expect UserBuffer to be NULL 52 | push rdx 53 | lea rdx, [rdx + 0x70] ; IRP->UserBuffer 54 | call mm_read_guest_qword 55 | test rdx, rdx 56 | jnz panic 57 | pop 
rdx 58 | 59 | ; Get IRP->MdlAddress->MappedSystemVa 60 | push rdx 61 | lea rdx, [rdx + 0x08] ; IRP->MdlAddress 62 | call mm_read_guest_qword 63 | test rdx, rdx 64 | jz panic 65 | 66 | lea rdx, [rdx + 0x18] ; MdlAddress->MappedSystemVa 67 | call mm_read_guest_qword 68 | mov rdi, rdx 69 | test rdx, rdx 70 | jz panic 71 | pop rdx 72 | 73 | ; Get the IrpSp 74 | push rdx 75 | lea rdx, [rdx + 0xb8] ; IRP->Tail.CurrentStackLocation 76 | call mm_read_guest_qword 77 | mov rbp, rdx 78 | test rdx, rdx 79 | pop rdx 80 | jz panic 81 | 82 | ; rdi - Guest virtual address to read sectors into 83 | ; rbp - Address of the IrpSp 84 | 85 | ; Get the IrpSp->MajorFunction and make sure this is an IRP_MJ_READ 86 | push rdx 87 | lea rdx, [rbp + 0x00] ; IrpSp->MajorFunction 88 | call mm_read_guest_qword 89 | cmp dl, 0x3 ; IRP_MJ_READ 90 | jne panic 91 | pop rdx 92 | 93 | ; Get IrpSp->Parameters.Read.ByteOffset 94 | push rdx 95 | lea rdx, [rbp + 0x18] ; IrpSp->Parameters.Read.ByteOffset 96 | call mm_read_guest_qword 97 | mov rbx, rdx 98 | pop rdx 99 | 100 | ; Get IrpSp->Parameters.Read.Length 101 | push rdx 102 | lea rdx, [rbp + 0x08] ; IrpSp->Parameters.Read.Length 103 | call mm_read_guest_qword 104 | mov rcx, rdx 105 | pop rdx 106 | 107 | ; rbx - Disk byte offset to file 108 | ; rcx - Length to read from disk (in bytes) 109 | ; rdi - Guest virtual address to read sectors into 110 | ; rbp - Address of the IrpSp 111 | 112 | ; Offset and size must be 512-byte aligned 113 | test rbx, 0x1ff 114 | jnz panic 115 | test rcx, 0x1ff 116 | jnz panic 117 | 118 | ; Divide down offset and size to now be sector counts 119 | shr rbx, 9 120 | mov r9, rcx 121 | shr r9, 9 122 | 123 | ; r9 - Length to read in sectors 124 | 125 | ; If there are no sectors to read, skip the reading 126 | test r9, r9 127 | jz short .nothing_to_do 128 | 129 | push rcx 130 | sub rsp, 512 131 | .lewp: 132 | ; Read the sector 133 | mov rcx, 1 134 | mov r8, rsp 135 | ;call ide_pio_read_sectors 136 | 137 | ; Copy the sector into VM 
memory 138 | mov rsi, rsp 139 | mov rcx, 512 140 | ;call mm_copy_to_guest_vm_vmcb 141 | 142 | add rdi, 512 143 | inc rbx 144 | dec r9 145 | jnz short .lewp 146 | add rsp, 512 147 | pop rcx 148 | 149 | .nothing_to_do: 150 | ; IRP->IoStatus.Status = STATUS_SUCCESS 151 | push rdx 152 | xor rbx, rbx 153 | lea rdx, [rdx + 0x30] 154 | call mm_write_guest_qword 155 | pop rdx 156 | 157 | ; IRP->IoStatus.Information = Number of bytes read 158 | push rdx 159 | mov rbx, rcx 160 | lea rdx, [rdx + 0x38] 161 | call mm_write_guest_qword 162 | pop rdx 163 | 164 | mov qword [rax + VMCB.rax], 0 ; STATUS_SUCCESS 165 | 166 | pop r9 167 | pop r8 168 | pop rbp 169 | pop rsi 170 | pop rdi 171 | pop rcx 172 | pop rbx 173 | 174 | ; IoCompleteRequest(irp, IO_NO_INCREMENT); 175 | mov rcx, 0xfffff80107935d44 176 | add qword [rax + VMCB.rip], rcx 177 | mov rcx, rdx ; rcx = IRP 178 | mov rdx, 0 ; rdx = IO_NO_INCREMENT 179 | ret 180 | 181 | -------------------------------------------------------------------------------- /srcs/fuzzers/defender.asm: -------------------------------------------------------------------------------- 1 | [bits 64] 2 | 3 | corrupt_quantum: 4 | push rax 5 | push rbx 6 | push rcx 7 | push rdx 8 | push rsi 9 | push rdi 10 | push rbp 11 | push r8 12 | push r9 13 | push r10 14 | push r11 15 | push r12 16 | push r13 17 | push r14 18 | push r15 19 | 20 | XMMPUSH xmm5 21 | 22 | call start_log 23 | 24 | call rand_dict_entry 25 | mov rsi, rbx 26 | mov rdi, qword [gs:thread_local.rtf_fuzz] 27 | mov rcx, qword [fs:globals.per_node_rtf + node_struct.data_len] 28 | rep movsb 29 | 30 | %ifndef ENABLE_FUZZING 31 | jmp .done 32 | %endif 33 | 34 | call xorshift64 35 | test r15, 0x3 36 | jz .create_new_fuzz 37 | 38 | .use_coverage: 39 | mov r10, -1 40 | mov r11, 0 41 | mov r12, 64 42 | .try_another: 43 | dec r12 44 | jz short .do_the_copy 45 | 46 | mov rcx, qword [fs:globals.coverage_fht] 47 | call fht_random 48 | test rax, rax 49 | jz .try_another 50 | 51 | cmp qword [rax + 
bb_struc.count], r10 52 | jae short .try_another 53 | 54 | mov r10, qword [rax + bb_struc.count] 55 | mov r11, rax 56 | jmp short .try_another 57 | 58 | .do_the_copy: 59 | test r11, r11 60 | jz .create_new_fuzz 61 | 62 | movdqu xmm5, [r11 + bb_struc.input_hash] 63 | call input_entry_from_hash 64 | 65 | mov rsi, rdx 66 | mov rdi, qword [gs:thread_local.rtf_fuzz] 67 | mov rcx, qword [fs:globals.per_node_rtf + node_struct.data_len] 68 | rep movsb 69 | 70 | .create_new_fuzz: 71 | %if 0 72 | mov rdi, qword [gs:thread_local.rtf_fuzz] 73 | 74 | call xorshift64 75 | mov rcx, r15 76 | and rcx, 0xff 77 | test rcx, rcx 78 | jz short .no_byte_corrupt 79 | .lewp: 80 | call xorshift64 81 | xor rdx, rdx 82 | mov rax, r15 83 | div qword [fs:globals.per_node_rtf + node_struct.data_len] 84 | 85 | call xorshift64 86 | mov byte [rdi + rdx], r15b 87 | 88 | dec rcx 89 | jnz short .lewp 90 | .no_byte_corrupt: 91 | %endif 92 | 93 | %if 0 94 | call xorshift64 95 | mov rcx, r15 96 | and rcx, 0xf 97 | test rcx, rcx 98 | jz short .no_corp_corrupt 99 | .cc_corrupt: 100 | call corrupt_from_corpus 101 | dec rcx 102 | jnz short .cc_corrupt 103 | .no_corp_corrupt: 104 | %endif 105 | 106 | %if 1 107 | call corpymem 108 | call corpymem 109 | call corpymem 110 | call corpymem 111 | %endif 112 | 113 | .done: 114 | call stop_log 115 | add qword [gs:thread_local.time_corrupt], rdx 116 | 117 | XMMPOP xmm5 118 | 119 | pop r15 120 | pop r14 121 | pop r13 122 | pop r12 123 | pop r11 124 | pop r10 125 | pop r9 126 | pop r8 127 | pop rbp 128 | pop rdi 129 | pop rsi 130 | pop rdx 131 | pop rcx 132 | pop rbx 133 | pop rax 134 | ret 135 | 136 | inject_corrupt_bytes: 137 | push rcx 138 | push rdx 139 | push rdi 140 | push rsi 141 | push r10 142 | push r11 143 | 144 | ; Hook 1 - UfsIoCache::Read() 145 | ; hook at mpengine+0x???? 
146 | 147 | ; r10 - Offset 148 | ; r11 - Buffer 149 | ; r12d - Bytes read 150 | ; 151 | ; [rsp + 0x78 + 0x10] - int64_t off 152 | ; [rsp + 0x78 + 0x18] - void *buf 153 | 154 | ; Get the offset 155 | mov rdx, [rax + VMCB.rsp] 156 | add rdx, 0x78 + 0x10 157 | call mm_read_guest_qword 158 | mov r10, rdx 159 | 160 | ; Get the buffer 161 | mov rdx, [rax + VMCB.rsp] 162 | add rdx, 0x78 + 0x18 163 | call mm_read_guest_qword 164 | mov r11, rdx 165 | 166 | ; Calculate bread + offset 167 | mov rsi, r12 168 | add rsi, r10 169 | 170 | ; (length + offset) must not exceed file length 171 | cmp rsi, qword [fs:globals.per_node_rtf + node_struct.data_len] 172 | ja short .dont_corrupt 173 | 174 | ; Allocate room on the stack for the data 175 | sub rsp, r12 176 | 177 | ; Read the read contents from the VM 178 | mov rdi, rsp 179 | mov rsi, r11 180 | mov rcx, r12 181 | call mm_copy_from_guest_vm_vmcb 182 | 183 | ; Check if the contents from the VM match the original input file at the 184 | ; specified offset 185 | mov rdi, rsp 186 | mov rsi, qword [gs:thread_local.rtf_orig] 187 | add rsi, r10 188 | mov rcx, r12 189 | rep cmpsb 190 | jne short .dont_corrupt_free 191 | 192 | mov rsi, qword [gs:thread_local.rtf_fuzz] 193 | add rsi, r10 194 | mov rdi, r11 195 | mov rcx, r12 196 | call mm_copy_to_guest_vm_vmcb 197 | 198 | .dont_corrupt_free: 199 | add rsp, r12 200 | .dont_corrupt: 201 | pop r11 202 | pop r10 203 | pop rsi 204 | pop rdi 205 | pop rdx 206 | pop rcx 207 | ret 208 | 209 | -------------------------------------------------------------------------------- /srcs/fuzzers/generic.asm: -------------------------------------------------------------------------------- 1 | [bits 64] 2 | 3 | %define CRASH_CLASS_NULL 0 ; vec #PF addr [0, 16KB) 4 | %define CRASH_CLASS_NEG 1 ; vec #PF addr (-16KB, 0) 5 | %define CRASH_CLASS_INVAL 2 ; vec #PF addr [16KB, -16KB) 6 | %define CRASH_CLASS_ASCII 3 ; vec #GP 7 | %define CRASH_CLASS_OTHER 4 ; vec !(#GP || #PF) 8 | 9 | ; rax -> VMCB 10 | ; rcx <- crash 
classification based on VMCB data 11 | classify_crash: 12 | cmp qword [rax + VMCB.exitcode], 0x4d ; #GP 13 | je short .class_gp 14 | 15 | cmp qword [rax + VMCB.exitcode], 0x4e ; #PF 16 | je short .class_pf 17 | 18 | mov ecx, CRASH_CLASS_OTHER 19 | ret 20 | 21 | .class_gp: 22 | mov ecx, CRASH_CLASS_ASCII 23 | ret 24 | 25 | .class_pf: 26 | ; rcx = abs(addr) 27 | mov rcx, qword [rax + VMCB.exitinfo2] 28 | cmp rcx, 0 29 | jge short .dont_neg 30 | neg rcx 31 | .dont_neg: 32 | ; if abs(addr) >= 16KB then it's an inval access 33 | cmp rcx, (16 * 1024) 34 | jge short .inval 35 | 36 | ; if addr < 0 then it's a neg else it's a null 37 | cmp qword [rax + VMCB.exitinfo2], 0 38 | jl short .neg 39 | 40 | mov ecx, CRASH_CLASS_NULL 41 | ret 42 | 43 | .neg: 44 | mov ecx, CRASH_CLASS_NEG 45 | ret 46 | 47 | .inval: 48 | mov rcx, CRASH_CLASS_INVAL 49 | ret 50 | 51 | ; rcx -> Maximum number of times to invoke (inclusive) 52 | ; rbp -> Function to invoke (no parameters) 53 | invoke_random: 54 | push rax 55 | push rcx 56 | push rdx 57 | push r15 58 | 59 | inc rcx 60 | 61 | call xorshift64 62 | xor rdx, rdx 63 | mov rax, r15 64 | div rcx 65 | 66 | test rdx, rdx 67 | jz short .done 68 | 69 | .invoke: 70 | call rbp 71 | dec rdx 72 | jnz short .invoke 73 | 74 | .done: 75 | pop r15 76 | pop rdx 77 | pop rcx 78 | pop rax 79 | ret 80 | 81 | corrupt_from_corpus: 82 | push rax 83 | push rbx 84 | push rcx 85 | push rdx 86 | push rsi 87 | push rdi 88 | push rbp 89 | push r15 90 | 91 | mov rcx, 4 92 | mov rdx, 256 93 | call randexp 94 | mov rcx, rax 95 | 96 | ; Get a random entry in the dictionary 97 | push rcx 98 | push rsi 99 | call rand_pdf 100 | mov rbp, rcx 101 | pop rsi 102 | pop rcx 103 | 104 | ; Generate a random number [0, sizeof(dict_entry)) 105 | call xorshift64 106 | xor rdx, rdx 107 | mov rax, r15 108 | div rbp 109 | 110 | ; rsi = random pointer in ftar 111 | ; rcx = MIN(rcx, bytes_remaining_in_ftar) 112 | lea rsi, [rbx + rdx] 113 | sub rbp, rdx 114 | cmp rcx, rbp 115 | cmova rcx, rbp 
116 | 117 | ; Get the fuzz image pointer 118 | mov rdi, qword [gs:thread_local.fuzz_input] 119 | 120 | ; Generate a random number [0, sizeof(fuzz_image)) 121 | call xorshift64 122 | xor rdx, rdx 123 | mov rax, r15 124 | mov rbp, qword [gs:thread_local.fuzz_input_len] 125 | div rbp 126 | 127 | ; rdi = random pointer in fuzz image 128 | ; rcx = MIN(rcx, bytes_remaining_in_fuzz_image) 129 | lea rdi, [rdi + rdx] 130 | sub rbp, rdx 131 | cmp rcx, rbp 132 | cmova rcx, rbp 133 | 134 | ; Copy from a random place in the ftar to a random place in the fuzz image 135 | rep movsb 136 | 137 | pop r15 138 | pop rbp 139 | pop rdi 140 | pop rsi 141 | pop rdx 142 | pop rcx 143 | pop rbx 144 | pop rax 145 | ret 146 | 147 | corpymem: 148 | push rax 149 | push rbx 150 | push rcx 151 | push rdx 152 | push rdi 153 | push rsi 154 | push rbp 155 | push r10 156 | push r15 157 | 158 | ; Pick a random size 159 | mov rcx, 8 160 | mov rdx, 15 161 | call randexp 162 | mov rcx, rax 163 | 164 | ; Get the fuzz image pointer 165 | mov rdi, qword [gs:thread_local.fuzz_input] 166 | 167 | ; Generate a random number [0, sizeof(fuzz_image)) 168 | call xorshift64 169 | xor rdx, rdx 170 | mov rax, r15 171 | mov rbp, qword [gs:thread_local.fuzz_input_len] 172 | div rbp 173 | 174 | ; rsi = random pointer in fuzz image 175 | ; rcx = MIN(rcx, bytes_remaining_in_fuzz_image) 176 | lea rsi, [rdi + rdx] 177 | sub rbp, rdx 178 | mov r10, rbp 179 | cmp rcx, rbp 180 | cmova rcx, rbp 181 | mov rbp, rcx 182 | 183 | ; rsi - Random pointer in fuzz image 184 | ; rbp - Bytes to compare 185 | ; r10 - Number of bytes left in the fuzz image 186 | 187 | push rsi 188 | call rand_pdf 189 | pop rsi 190 | 191 | call xorshift64 192 | xor rdx, rdx 193 | mov rax, r15 194 | div rcx 195 | 196 | lea rbx, [rbx + rdx] 197 | sub rcx, rdx 198 | 199 | ; rbx - Random pointer in dict entry 200 | ; rcx - Bytes remaining in dict entry 201 | 202 | mov rdi, rbx 203 | mov rdx, rcx 204 | mov rsi, rsi 205 | mov rcx, rbp 206 | call memmem 207 | test 
rax, rax 208 | jz short .no_replace 209 | 210 | push rax 211 | push rcx 212 | push rdx 213 | mov rcx, 32 214 | mov rdx, 256 215 | call randexp 216 | mov r15, rax 217 | pop rdx 218 | pop rcx 219 | pop rax 220 | 221 | cmp r15, r10 222 | cmova r15, r10 223 | 224 | mov rdi, rsi 225 | mov rsi, rax 226 | mov rcx, r15 227 | rep movsb 228 | 229 | .no_replace: 230 | pop r15 231 | pop r10 232 | pop rbp 233 | pop rsi 234 | pop rdi 235 | pop rdx 236 | pop rcx 237 | pop rbx 238 | pop rax 239 | ret 240 | 241 | ; xmm5 <- Hash of input 242 | input_create_entry: 243 | push rcx 244 | push rdi 245 | push rsi 246 | 247 | mov rdi, qword [gs:thread_local.fuzz_input] 248 | mov rsi, qword [gs:thread_local.fuzz_input_len] 249 | call falkhash 250 | 251 | mov rcx, qword [fs:globals.input_fht] 252 | call fht_fetch_or_lock 253 | jnc short .already_present_input 254 | 255 | ; This input hasn't been saved yet, save it! 256 | mov rdi, input_entry_size 257 | rand_alloc rdi 258 | 259 | mov rsi, qword [gs:thread_local.fuzz_input_len] 260 | rand_alloc rsi 261 | mov [rdi + input_entry.input], rsi 262 | 263 | mov rsi, qword [gs:thread_local.fuzz_input_len] 264 | rand_alloc rsi 265 | mov [rdi + input_entry.maps], rsi 266 | 267 | mov rsi, qword [gs:thread_local.fuzz_input_len] 268 | mov [rdi + input_entry.len], rsi 269 | 270 | push rcx 271 | push rdi 272 | mov rdi, [rdi + input_entry.input] 273 | mov rsi, [gs:thread_local.fuzz_input] 274 | mov rcx, [gs:thread_local.fuzz_input_len] 275 | rep movsb 276 | pop rdi 277 | pop rcx 278 | 279 | push rcx 280 | push rdi 281 | mov rdi, [rdi + input_entry.maps] 282 | mov rsi, [gs:thread_local.fuzz_maps] 283 | mov rcx, [gs:thread_local.fuzz_input_len] 284 | rep movsb 285 | pop rdi 286 | pop rcx 287 | 288 | mov qword [rcx], rdi 289 | 290 | .already_present_input: 291 | pop rsi 292 | pop rdi 293 | pop rcx 294 | ret 295 | 296 | ; xmm5 -> Hash of input 297 | ; rdx <- Pointer to input_entry structure 298 | input_entry_from_hash: 299 | push rcx 300 | 301 | mov rcx, qword 
[fs:globals.input_fht] 302 | call fht_fetch_or_lock 303 | jnc short .already_present_input 304 | 305 | ; We should NEVER hit this. Someone tried to look up a hash that is not 306 | ; present! 307 | jmp panic 308 | 309 | .already_present_input: 310 | mov rdx, rcx 311 | pop rcx 312 | ret 313 | 314 | ; rsi -> Page of memory 315 | add_breakpoints: 316 | push rsi 317 | push rdi 318 | push rcx 319 | push rax 320 | 321 | mov rax, rsi 322 | 323 | lea rdi, [rel needle] 324 | lea rsi, [rsi + 0x452] 325 | mov rcx, 16 326 | rep cmpsb 327 | jne short .dont_bp 328 | 329 | mov dword [rax + 0x452], 0xcccccccc 330 | 331 | .dont_bp: 332 | pop rax 333 | pop rcx 334 | pop rdi 335 | pop rsi 336 | ret 337 | 338 | needle: 339 | db 0x8B, 0xC8, 0xBA, 0x02, 0x00, 0x00, 0x00, 0x81, 0xF9, 0x06, 0x00, 0x00 340 | db 0xD0, 0x74, 0x41, 0x81 341 | 342 | ; r8 -> File to munch 343 | ; r9 -> Size of file to munch 344 | ; r10 -> Prefix 345 | ; r11 -> Prefix length 346 | ; r12 -> Suffix 347 | ; r13 -> Suffix length 348 | ; rbp <- Pointer to prefix in file 349 | ; rcx <- Size of munched region (zero if no match found) 350 | munch: 351 | push rax 352 | push rbx 353 | push rdx 354 | push rdi 355 | push rsi 356 | push r9 357 | 358 | ; Look for the prefix in the file 359 | mov rdi, r8 360 | mov rdx, r9 361 | mov rsi, r10 362 | mov rcx, r11 363 | call memmem 364 | test rax, rax 365 | jz short .prefix_not_found 366 | 367 | ; Save off the prefix found address 368 | mov rbp, rax 369 | 370 | ; Calculate the offset where the prefix was found, then the number of 371 | ; bytes left in the file 372 | sub rax, r8 373 | sub r9, rax 374 | 375 | ; Look for the suffix after the prefix 376 | mov rdi, rbp 377 | mov rdx, r9 378 | mov rsi, r12 379 | mov rcx, r13 380 | call memmem 381 | test rax, rax 382 | jz short .suffix_not_found 383 | 384 | ; Jump to after the suffix 385 | add rax, r13 386 | 387 | ; Calculate the size of the region to munch 388 | sub rax, rbp 389 | 390 | ; Set up the return value 391 | mov rbp, rbp 392 | 
mov rcx, rax 393 | 394 | jmp short .done 395 | 396 | .suffix_not_found: 397 | .prefix_not_found: 398 | xor rbp, rbp 399 | xor rcx, rcx 400 | 401 | .done: 402 | pop r9 403 | pop rsi 404 | pop rdi 405 | pop rdx 406 | pop rbx 407 | pop rax 408 | ret 409 | 410 | -------------------------------------------------------------------------------- /srcs/fuzzers/pdf.asm: -------------------------------------------------------------------------------- 1 | [bits 64] 2 | 3 | corrupt_pdf: 4 | push rax 5 | push rbx 6 | push rcx 7 | push rdx 8 | push rsi 9 | push rdi 10 | push rbp 11 | push r8 12 | push r9 13 | push r10 14 | push r11 15 | push r12 16 | push r13 17 | push r14 18 | push r15 19 | 20 | XMMPUSH xmm5 21 | 22 | call start_log 23 | 24 | ; Pick a random base PDF input 25 | call rand_pdf 26 | mov [gs:thread_local.fuzz_input_len], rcx 27 | 28 | mov rdi, [gs:thread_local.fuzz_maps] 29 | mov rsi, rsi 30 | mov rcx, [gs:thread_local.fuzz_input_len] 31 | rep movsb 32 | 33 | mov rdi, [gs:thread_local.fuzz_input] 34 | mov rsi, rbx 35 | mov rcx, [gs:thread_local.fuzz_input_len] 36 | rep movsb 37 | 38 | call xorshift64 39 | test r15, 0x7 40 | jz .create_new_fuzz 41 | 42 | %ifdef ENABLE_COVERAGE_FEEDBACK 43 | .use_coverage: 44 | mov r10, -1 45 | mov r11, 0 46 | mov r12, 8 47 | .try_another: 48 | dec r12 49 | jz short .do_the_copy 50 | 51 | ; Pick a random coverage entry 52 | mov rcx, qword [fs:globals.coverage_fht] 53 | call fht_random 54 | test rax, rax 55 | jz .try_another 56 | 57 | ; If this entry is not more rare, try another entry 58 | cmp qword [rax + bb_struc.count], r10 59 | jae short .try_another 60 | 61 | ; Save off this rare entry 62 | mov r10, qword [rax + bb_struc.count] 63 | mov r11, rax 64 | jmp short .try_another 65 | 66 | .do_the_copy: 67 | ; If we didn't find any entries, start a new fuzz 68 | test r11, r11 69 | jz .create_new_fuzz 70 | 71 | ; Fetch the input associated with this coverage entry 72 | movdqu xmm5, [r11 + bb_struc.input_hash] 73 | call
input_entry_from_hash 74 | 75 | ; Copy the coverage entry to the fuzz_input 76 | mov rsi, [rdx + input_entry.input] 77 | mov rdi, [gs:thread_local.fuzz_input] 78 | mov rcx, [rdx + input_entry.len] 79 | rep movsb 80 | 81 | ; Copy the coverage entry maps to fuzz_maps 82 | mov rsi, [rdx + input_entry.maps] 83 | mov rdi, [gs:thread_local.fuzz_maps] 84 | mov rcx, [rdx + input_entry.len] 85 | rep movsb 86 | 87 | ; Set the new fuzz input length 88 | mov rcx, qword [rdx + input_entry.len] 89 | mov [gs:thread_local.fuzz_input_len], rcx 90 | 91 | jmp .create_new_fuzz 92 | %endif 93 | 94 | .create_new_fuzz: 95 | %ifdef ENABLE_FUZZING 96 | call xorshift64 97 | mov r14, r15 98 | and r14, 0xf 99 | inc r14 100 | 101 | .corrupt_stuff: 102 | mov r8, [gs:thread_local.fuzz_input] 103 | mov r9, [gs:thread_local.fuzz_input_len] 104 | mov r10, [gs:thread_local.fuzz_maps] 105 | 106 | call xorshift64 107 | xor rdx, rdx 108 | mov rax, r15 109 | div r9 110 | 111 | mov r11, rdx 112 | movzx edi, byte [r10 + r11] 113 | 114 | mov rbp, 100 115 | .find_match: 116 | call rand_pdf 117 | call xorshift64 118 | xor rdx, rdx 119 | mov rax, r15 120 | div rcx 121 | 122 | mov r12, rdx 123 | 124 | movzx ecx, byte [rsi + r12] 125 | cmp edi, ecx 126 | je short .match 127 | 128 | dec rbp 129 | jnz .find_match 130 | jmp .no_match 131 | 132 | .match: 133 | xor r15, r15 134 | lea rbp, [rsi + r12] 135 | .lewp: 136 | cmp r15, 0x100 137 | jae short .end 138 | cmp dil, byte [rbp + r15] 139 | jne short .end 140 | inc r15 141 | jmp short .lewp 142 | .end: 143 | mov rbp, r15 144 | call xorshift64 145 | xor rdx, rdx 146 | mov rax, r15 147 | div rbp 148 | inc rdx 149 | 150 | push rsi 151 | push rdi 152 | push rcx 153 | lea rsi, [r8 + r11] 154 | lea rdi, [r8 + r11] 155 | add rdi, rdx 156 | mov rcx, r9 157 | sub rcx, r11 158 | call memcpy 159 | pop rcx 160 | pop rdi 161 | pop rsi 162 | 163 | push rsi 164 | push rdi 165 | push rcx 166 | lea rsi, [r10 + r11] 167 | lea rdi, [r10 + r11] 168 | add rdi, rdx 169 | mov rcx, r9 170 
| sub rcx, r11 171 | call memcpy 172 | pop rcx 173 | pop rdi 174 | pop rsi 175 | 176 | add qword [gs:thread_local.fuzz_input_len], rdx 177 | 178 | lea rsi, [rbx + r12] ; source 179 | lea rcx, [ r8 + r11] ; destination in the input 180 | lea rbp, [r10 + r11] ; destination in the map 181 | .hax: 182 | mov al, [rsi] 183 | mov [rcx], al 184 | mov [rbp], dil 185 | 186 | inc rbp 187 | inc rsi 188 | inc rcx 189 | dec rdx 190 | jnz short .hax 191 | .no_match: 192 | 193 | dec r14 194 | jnz .corrupt_stuff 195 | 196 | mov rcx, 8 197 | lea rbp, [rel corrupt_from_corpus] 198 | call invoke_random 199 | 200 | mov rcx, 8 201 | lea rbp, [rel corpymem] 202 | call invoke_random 203 | %endif 204 | 205 | .done: 206 | call stop_log 207 | add qword [gs:thread_local.time_corrupt], rdx 208 | 209 | XMMPOP xmm5 210 | 211 | pop r15 212 | pop r14 213 | pop r13 214 | pop r12 215 | pop r11 216 | pop r10 217 | pop r9 218 | pop r8 219 | pop rbp 220 | pop rdi 221 | pop rsi 222 | pop rdx 223 | pop rcx 224 | pop rbx 225 | pop rax 226 | ret 227 | 228 | inject_pdf: 229 | push rcx 230 | push rdi 231 | push rsi 232 | 233 | ; rdx - PDF buffer base 234 | ; r8 - PDF length 235 | 236 | mov rdi, rdx 237 | mov rsi, qword [gs:thread_local.fuzz_input] 238 | mov rcx, qword [gs:thread_local.fuzz_input_len] 239 | call mm_copy_to_guest_vm_vmcb 240 | 241 | mov r8, qword [gs:thread_local.fuzz_input_len] 242 | 243 | pop rsi 244 | pop rdi 245 | pop rcx 246 | ret 247 | 248 | ; rbx <- Pointer to random entry in dict 249 | ; rcx <- Size of random entry 250 | ; rsi <- Fuzz map 251 | rand_pdf: 252 | push rax 253 | push rdx 254 | push rbp 255 | push r15 256 | 257 | mov rbx, [fs:globals.fs_base] 258 | lea rbx, [rbx + globals.per_node_pdfs] 259 | call per_node_data 260 | mov rbp, rax 261 | 262 | mov rbx, [fs:globals.fs_base] 263 | lea rbx, [rbx + globals.per_node_pdfmaps] 264 | call per_node_data 265 | mov rsi, rax 266 | 267 | .try_another: 268 | ; Pick a random entry in the dict 269 | call xorshift64 270 | xor rdx, rdx 271 | 
mov rax, r15 272 | div qword [rbp] 273 | 274 | imul rdx, 0x10 275 | add rdx, 8 276 | 277 | mov rbx, [rbp + rdx + 0] ; File offset 278 | mov rcx, [rbp + rdx + 8] ; File size 279 | 280 | test rcx, rcx 281 | jz short .try_another 282 | 283 | lea rsi, [rsi + rbx] 284 | lea rbx, [rbp + rbx] 285 | 286 | pop r15 287 | pop rbp 288 | pop rdx 289 | pop rax 290 | ret 291 | 292 | -------------------------------------------------------------------------------- /srcs/fuzzers/word.asm: -------------------------------------------------------------------------------- 1 | [bits 64] 2 | 3 | ; al -> Character 4 | ; ZF <- Set if allowed, else not set 5 | rtf_cw_is_allowed_character: 6 | push rcx 7 | push rdi 8 | 9 | lea rdi, [rel rtf_cw_allowed_character] 10 | mov rcx, (rtf_cw_allowed_character.end - rtf_cw_allowed_character) 11 | repne scasb 12 | 13 | pop rdi 14 | pop rcx 15 | ret 16 | 17 | rtf_cw_allowed_character: 18 | ; Control words must be lowercase. They can be followed by a number. This 19 | ; number can be negative, thus we also can include '-'s. 20 | ; 21 | ; If a space is hit, the parsing is terminated but the space is included 22 | ; in the control word. 23 | ; 24 | ; When a non-allowed character is hit, the parsing is terminated and the 25 | ; character is not stored in the control word.
26 | ; 27 | db ' -abcdefghijklmnopqrstuvwxyz0123456789' 28 | .end: 29 | 30 | ; rbx -> Rtf file 31 | ; rbp -> Rtf file length 32 | ; r15 -> Seed 33 | ; rbx <- Pointer to control word (null if failure) 34 | ; rbp <- Length of control word 35 | ; r15 <- Updated seed 36 | rtf_get_random_bracket: 37 | push rax 38 | push rcx 39 | push rdx 40 | push rsi 41 | push rdi 42 | push r8 43 | push r9 44 | 45 | ; Find the size of the bracket database in entries 46 | xor rdx, rdx 47 | mov rax, qword [fs:globals.per_node_bktdb + node_struct.data_len] 48 | mov rbp, bracket_size 49 | div rbp 50 | mov rbp, rax 51 | 52 | ; Pick a random entry in the bracket DB 53 | call xorshift64 54 | xor rdx, rdx 55 | mov rax, r15 56 | div rbp 57 | 58 | ; Calculate the array offset in the bracket DB 59 | imul rdx, rdx, bracket_size 60 | 61 | ; Get the actual bracket db entry 62 | push rax 63 | push rbx 64 | mov rbx, [fs:globals.fs_base] 65 | lea rbx, [rbx + globals.per_node_bktdb] 66 | call per_node_data 67 | mov rdi, rax 68 | pop rbx 69 | pop rax 70 | 71 | lea rdi, [rdi + rdx] 72 | 73 | add rbx, qword [rdi + bracket.idx] 74 | mov rbp, qword [rdi + bracket.len] 75 | 76 | pop r9 77 | pop r8 78 | pop rdi 79 | pop rsi 80 | pop rdx 81 | pop rcx 82 | pop rax 83 | ret 84 | 85 | word_minimize: 86 | push rax 87 | push rcx 88 | push rdx 89 | push rsi 90 | push rdi 91 | push r14 92 | push r15 93 | 94 | mov rcx, SPINLOCK_MINIMIZE 95 | call acquire_spinlock 96 | 97 | mov rsi, qword [fs:globals.minimize_input] 98 | mov rdi, qword [gs:thread_local.rtf_fuzz] 99 | mov rcx, qword [fs:globals.per_node_rtf + node_struct.data_len] 100 | rep movsb 101 | 102 | mov rcx, SPINLOCK_MINIMIZE 103 | call release_spinlock 104 | 105 | call xorshift64 106 | mov r14, r15 107 | and r14, 0x7 108 | inc r14 109 | 110 | ; Calculate the number of bytes left to minimize 111 | mov rsi, qword [gs:thread_local.rtf_fuzz] 112 | mov rcx, qword [fs:globals.per_node_rtf + node_struct.data_len] 113 | call trailnull 114 | mov rcx, qword 
[fs:globals.per_node_rtf + node_struct.data_len] 115 | sub rcx, rax 116 | mov qword [gs:thread_local.rtf_bcrp], rcx 117 | test rcx, rcx 118 | jz .bail 119 | 120 | .do_more_minimize: 121 | %if 1 122 | call xorshift64 123 | test r15, 0x7 124 | jnz short .dont_minimize_bracket 125 | 126 | call xorshift64 127 | xor rdx, rdx 128 | mov rax, r15 129 | mov rcx, qword [gs:thread_local.rtf_bcrp] 130 | div rcx 131 | 132 | ; Parser state. 0 - looking for {, 1 - looking for } 133 | xor r15, r15 134 | 135 | mov rsi, qword [gs:thread_local.rtf_fuzz] 136 | add rsi, rdx 137 | mov rcx, qword [fs:globals.per_node_rtf + node_struct.data_len] 138 | sub rcx, rdx 139 | .fill_bracket: 140 | test r15, r15 141 | jnz short .looking_for_end 142 | 143 | cmp byte [rsi], '{' 144 | jne short .next_byte 145 | 146 | mov r15, 1 147 | 148 | .looking_for_end: 149 | ; If we're at the end, make our length 1 so that we end after this. This 150 | ; allows us to write over the closing bracket with a 0, and continue to 151 | ; exit after writing it. 
152 | cmp byte [rsi], '}' 153 | cmove rcx, r15 154 | 155 | mov byte [rsi], 0 156 | 157 | .next_byte: 158 | inc rsi 159 | dec rcx 160 | jnz short .fill_bracket 161 | 162 | .dont_minimize_bracket: 163 | %endif 164 | %if 1 165 | ; Get the offset to start to zero 166 | call xorshift64 167 | xor rdx, rdx 168 | mov rax, r15 169 | mov rcx, qword [gs:thread_local.rtf_bcrp] 170 | div rcx 171 | 172 | ; Calculate pointer and length remaining 173 | mov rdi, qword [gs:thread_local.rtf_fuzz] 174 | add rdi, rdx 175 | mov rcx, qword [fs:globals.per_node_rtf + node_struct.data_len] 176 | sub rcx, rdx 177 | 178 | cmp rcx, 8 179 | jl short .skip_zeroing 180 | 181 | ; Get a length to start to zero 182 | call xorshift64 183 | xor rdx, rdx 184 | mov rax, r15 185 | mov rcx, 8 186 | div rcx 187 | 188 | ; Zero it out 189 | xor eax, eax 190 | mov rcx, rdx 191 | rep stosb 192 | %endif 193 | 194 | .skip_zeroing: 195 | dec r14 196 | jnz .do_more_minimize 197 | 198 | .done: 199 | call xorshift64 200 | mov r14, r15 201 | and r14, 0x1f 202 | inc r14 203 | 204 | .do_some_compress: 205 | ; Get the offset to compress 206 | call xorshift64 207 | xor rdx, rdx 208 | mov rax, r15 209 | mov rcx, qword [gs:thread_local.rtf_bcrp] 210 | div rcx 211 | 212 | test rdx, rdx 213 | jz .dont_compress 214 | 215 | ; Calculate pointer and length remaining 216 | mov rdi, qword [gs:thread_local.rtf_fuzz] 217 | add rdi, rdx 218 | mov rcx, qword [fs:globals.per_node_rtf + node_struct.data_len] 219 | sub rcx, rdx 220 | mov rsi, rdi 221 | 222 | cmp byte [rdi - 1], 0 223 | jne .dont_compress 224 | 225 | .find_first_zero: 226 | dec rdi 227 | cmp byte [rdi], 0 228 | je .find_first_zero 229 | 230 | inc rdi 231 | mov rdx, rsi 232 | sub rdx, rdi 233 | 234 | ; rdi now has scanned backwards to the first zero 235 | ; rsi points to the initial random point we sampled 236 | ; rcx is the length of the rest of the file 237 | rep movsb 238 | 239 | mov rdi, qword [gs:thread_local.rtf_fuzz] 240 | add rdi, qword [fs:globals.per_node_rtf + 
node_struct.data_len] 241 | sub rdi, rdx 242 | mov rcx, rdx 243 | xor eax, eax 244 | rep stosb 245 | 246 | .dont_compress: 247 | dec r14 248 | jnz short .do_some_compress 249 | 250 | .bail: 251 | ; Calculate the number of different quadwords 252 | mov rsi, qword [gs:thread_local.rtf_fuzz] 253 | mov rcx, qword [fs:globals.per_node_rtf + node_struct.data_len] 254 | call trailnull 255 | mov qword [gs:thread_local.rtf_bcrp], rax 256 | call nullcount 257 | mov qword [gs:thread_local.rtf_null], rax 258 | 259 | pop r15 260 | pop r14 261 | pop rdi 262 | pop rsi 263 | pop rdx 264 | pop rcx 265 | pop rax 266 | ret 267 | 268 | word_fuzz: 269 | push rax 270 | push rbx 271 | push rcx 272 | push rdx 273 | push rsi 274 | push rdi 275 | push rbp 276 | push r8 277 | push r9 278 | push r10 279 | push r11 280 | push r12 281 | push r13 282 | push r14 283 | push r15 284 | 285 | XMMPUSH xmm5 286 | 287 | %ifdef ENABLE_LOGGING 288 | call start_log 289 | %endif 290 | 291 | call xorshift64 292 | test r15, 0xF 293 | jz .create_new_fuzz 294 | 295 | mov r10, -1 296 | mov r11, 0 297 | mov r12, 16 298 | .try_another: 299 | dec r12 300 | jz short .do_the_copy 301 | 302 | mov rcx, qword [fs:globals.coverage_fht] 303 | call fht_random 304 | test rax, rax 305 | jz .try_another 306 | 307 | cmp qword [rax + bb_struc.count], r10 308 | jae short .try_another 309 | 310 | mov r10, qword [rax + bb_struc.count] 311 | mov r11, rax 312 | jmp short .try_another 313 | 314 | .do_the_copy: 315 | test r11, r11 316 | jz .create_new_fuzz 317 | 318 | movdqu xmm5, [r11 + bb_struc.input_hash] 319 | call input_entry_from_hash 320 | 321 | mov rsi, rdx 322 | mov rdi, qword [gs:thread_local.rtf_fuzz] 323 | mov rcx, qword [fs:globals.per_node_rtf + node_struct.data_len] 324 | rep movsb 325 | 326 | jmp .create_new_fuzz 327 | 328 | .create_new_fuzz: 329 | mov rbx, qword [fs:globals.per_node_rtf + node_struct.data_len] 330 | mov qword [gs:thread_local.fuzz_size], rbx 331 | 332 | .do_fuzz: 333 | %ifdef BRACKET_FUZZ 334 | mov 
rdi, qword [gs:thread_local.rtf_fuzz] 335 | add rdi, DONT_CORRUPT_FIRST 336 | mov rsi, qword [gs:thread_local.fuzz_size] 337 | sub rsi, DONT_CORRUPT_FIRST 338 | cmp rsi, 0 339 | jle panic 340 | 341 | call xorshift64 342 | mov r10, r15 343 | and r10, 0xff 344 | test r10, r10 345 | jz .end_bracket_fuzz 346 | .corrupt_bracket: 347 | mov rbx, [fs:globals.fs_base] 348 | lea rbx, [rbx + globals.per_node_fuzzdat] 349 | call per_node_data 350 | mov rbx, rax 351 | 352 | mov rbp, qword [fs:globals.per_node_fuzzdat + node_struct.data_len] 353 | call rtf_get_random_bracket 354 | test rbx, rbx 355 | jz short .next_bracket 356 | 357 | ; Get a random offset in the RTF 358 | call xorshift64 359 | xor rdx, rdx 360 | mov rax, r15 361 | div rsi 362 | 363 | ; Get the number of bytes remaining in the file and make sure we have 364 | ; room to inject this control word 365 | mov r8, rsi 366 | sub r8, rdx 367 | cmp r8, rbp 368 | jl short .next_bracket 369 | 370 | ; Copy the control word into the fuzz input 371 | push rcx 372 | push rsi 373 | push rdi 374 | lea rdi, [rdi + rdx] 375 | mov rsi, rbx 376 | mov rcx, rbp 377 | rep movsb 378 | pop rdi 379 | pop rsi 380 | pop rcx 381 | 382 | .next_bracket: 383 | dec r10 384 | jnz .corrupt_bracket 385 | 386 | .end_bracket_fuzz: 387 | %endif 388 | 389 | %ifdef CW_FUZZ 390 | call xorshift64 391 | mov r13, r15 392 | and r13, 0xff 393 | test r13, r13 394 | jz .end_cw_fuzz 395 | .do_another_cw_fuzz: 396 | mov rcx, qword [gs:thread_local.fuzz_size] 397 | sub rcx, DONT_CORRUPT_FIRST 398 | cmp rcx, 0 399 | jle panic 400 | 401 | ; Randomly pick a place to corrupt 402 | call xorshift64 403 | xor rdx, rdx 404 | mov rax, r15 405 | div rcx 406 | 407 | ; Calculate the length remaining 408 | sub rcx, rdx 409 | 410 | ; Calculate the pointer to corrupt 411 | mov rbx, [gs:thread_local.rtf_fuzz] 412 | add rbx, rdx 413 | add rbx, DONT_CORRUPT_FIRST 414 | 415 | ; Calculate number of entries in ctrldb 416 | xor rdx, rdx 417 | mov rax, qword [fs:globals.per_node_ctrldb + 
node_struct.data_len] 418 | mov rbp, ctrl_size 419 | div rbp 420 | mov rsi, rax 421 | 422 | push rax 423 | push rbx 424 | mov rbx, [fs:globals.fs_base] 425 | lea rbx, [rbx + globals.per_node_fuzzdat] 426 | call per_node_data 427 | mov rdi, rax 428 | pop rbx 429 | pop rax 430 | 431 | ; rbx - Thing to fuzz 432 | ; rcx - Room remaining to fuzz 433 | ; rsi - Number of db entries 434 | ; rdi - Pointer to parsed_rtf 435 | ; r8 - Counter 436 | ; r9 - Entry to fuzz around 437 | 438 | ; Select a random entry 439 | call xorshift64 440 | xor rdx, rdx 441 | mov rax, r15 442 | div rsi 443 | mov r9, rdx 444 | 445 | ; How many control words do we want to place in a row? 446 | call xorshift64 447 | mov r8, r15 448 | and r8, 0xf 449 | inc r8 450 | .place_control_word: 451 | ; Select a random entry around the base entry 452 | call xorshift64 453 | mov r10, r15 454 | and r10, 0xff 455 | sub r10, 128 456 | add r10, r9 457 | 458 | ; If we're OOB, stop 459 | cmp r10, rsi 460 | jge .done_fuzzing 461 | cmp r10, 0 462 | jl .done_fuzzing 463 | 464 | imul r10, r10, ctrl_size 465 | 466 | ; Get the entry 467 | push rax 468 | push rbx 469 | mov rbx, [fs:globals.fs_base] 470 | lea rbx, [rbx + globals.per_node_ctrldb] 471 | call per_node_data 472 | mov r14, rax 473 | pop rbx 474 | pop rax 475 | lea r14, [r14 + r10] 476 | 477 | ; Make sure we have room for this entry 478 | mov r11, qword [r14 + ctrl.len] 479 | add r11, 256 480 | cmp r11, rcx 481 | jge .done_fuzzing 482 | 483 | call xorshift64 484 | test r15, 0x7 485 | jnz short .dont_open 486 | 487 | mov byte [rbx], '{' 488 | inc rbx 489 | dec rcx 490 | 491 | .dont_open: 492 | call xorshift64 493 | test r15, 0x7 494 | jnz short .dont_close 495 | 496 | mov byte [rbx], '}' 497 | inc rbx 498 | dec rcx 499 | 500 | .dont_close: 501 | 502 | push rcx 503 | push rdi 504 | push rsi 505 | 506 | mov rsi, qword [r14 + ctrl.idx] 507 | lea rsi, [rdi + rsi] 508 | mov rdi, rbx 509 | mov rcx, qword [r14 + ctrl.len] 510 | rep movsb 511 | 512 | pop rsi 513 | pop rdi 
514 | pop rcx 515 | 516 | mov r11, qword [r14 + ctrl.len] 517 | add rbx, r11 518 | sub rcx, r11 519 | 520 | cmp qword [r14 + ctrl.num], 0 521 | je short .next 522 | 523 | call xorshift64 524 | mov r10, r15 525 | and r10, 0x3 526 | inc r10 527 | .for_each_num: 528 | call xorshift64 529 | xor rdx, rdx 530 | mov rax, r15 531 | mov rbp, 10 532 | div rbp 533 | 534 | add dl, 0x30 535 | mov byte [rbx], dl 536 | inc rbx 537 | dec rcx 538 | 539 | dec r10 540 | jnz short .for_each_num 541 | 542 | .next: 543 | dec r8 544 | jnz .place_control_word 545 | 546 | dec r13 547 | jnz .do_another_cw_fuzz 548 | 549 | .end_cw_fuzz: 550 | %endif 551 | 552 | .done_fuzzing: 553 | %ifdef ENABLE_LOGGING 554 | call stop_log 555 | add qword [gs:thread_local.time_corrupt], rdx 556 | %endif 557 | 558 | XMMPOP xmm5 559 | 560 | pop r15 561 | pop r14 562 | pop r13 563 | pop r12 564 | pop r11 565 | pop r10 566 | pop r9 567 | pop r8 568 | pop rbp 569 | pop rdi 570 | pop rsi 571 | pop rdx 572 | pop rcx 573 | pop rbx 574 | pop rax 575 | ret 576 | 577 | -------------------------------------------------------------------------------- /srcs/io/serial.asm: -------------------------------------------------------------------------------- 1 | [bits 64] 2 | 3 | ; init_serial 4 | ; 5 | ; Summary: 6 | ; 7 | ; This function configures the serial port SERIAL_PORT for 115,200 baud 8 | ; operation with no parity and one stop bit. 
9 | ; 10 | ; Parameters: 11 | ; 12 | ; None 13 | ; 14 | ; Alignment: 15 | ; 16 | ; None 17 | ; 18 | ; Returns: 19 | ; 20 | ; None 21 | ; 22 | ; Smashes: 23 | ; 24 | ; None 25 | ; 26 | ; Optimization: 27 | ; 28 | ; Readability 29 | ; 30 | init_serial: 31 | push rax 32 | push rdx 33 | 34 | ; Disable all interrupts 35 | xor al, al 36 | mov dx, (SERIAL_PORT + 1) 37 | out dx, al 38 | 39 | ; Set DLAB 40 | mov al, 0b10000000 41 | mov dx, (SERIAL_PORT + 3) 42 | out dx, al 43 | 44 | mov al, 0b00000001 45 | mov dx, (SERIAL_PORT + 0) 46 | out dx, al 47 | 48 | mov al, 0b00000000 49 | mov dx, (SERIAL_PORT + 1) 50 | out dx, al 51 | 52 | mov al, 0b00000011 53 | mov dx, (SERIAL_PORT + 3) 54 | out dx, al 55 | ; End Set Baud 56 | 57 | ; Disable FIFO 58 | mov al, 0x00 59 | mov dx, (SERIAL_PORT + 2) 60 | out dx, al 61 | 62 | ; Enable DTR 63 | mov al, 0b00001011 64 | mov dx, (SERIAL_PORT + 4) 65 | out dx, al 66 | 67 | ; Enable data available interrupt 68 | mov al, 0b00000001 69 | mov dx, (SERIAL_PORT + 1) 70 | out dx, al 71 | 72 | pop rdx 73 | pop rax 74 | ret 75 | 76 | ; ser_int 77 | ; 78 | ; Summary: 79 | ; 80 | ; This function handles IRQ4, the serial port interrupt for COM1 81 | ; 82 | ser_int: 83 | push rax 84 | push rdx 85 | push rdi 86 | 87 | mov dx, SERIAL_PORT 88 | out dx, al 89 | 90 | pop rdi 91 | pop rdx 92 | pop rax 93 | ret 94 | 95 | -------------------------------------------------------------------------------- /srcs/memory_map: -------------------------------------------------------------------------------- 1 | Physical Map: 2 | 3 | E820 memory map: 4 | 0x00000000`00000500 - 0x00000000`xxxxxxxx : Unbounded, likely < 1KB 5 | 6 | Physical page table 7 | 0x00000000`00100000 - 0x00000000`00102000 : 8 KB Cache disable, WT 8 | 0x00000001`00102000 - 0x00000001`00103000 : 4 KB Cache enable (cephys) 9 | 10 | PNM page table(s) 11 | 0x00000001`00103000 - 0x00000001`00104000 : 4 KB (at most) Node 0 12 | 0x00000001`00104000 - 0x00000001`00105000 : 4 KB (at most) Node 1 13 | 0x00000001`00105000 
- 0x00000001`00106000 : 4 KB (at most) Node 2 14 | 0x00000001`00106000 - 0x00000001`00107000 : 4 KB (at most) Node 3 15 | 0x00000001`00107000 - 0x00000001`00108000 : 4 KB (at most) Node 4 16 | 0x00000001`00108000 - 0x00000001`00109000 : 4 KB (at most) Node 5 17 | 0x00000001`00109000 - 0x00000001`0010a000 : 4 KB (at most) Node 6 18 | 0x00000001`0010a000 - 0x00000001`0010b000 : 4 KB (at most) Node 7 19 | 20 | Everything below physical 8MB is not touched by anything using the BAMP 21 | unless it directly uses the low 512GB mapping to touch devices, or just 22 | buggy code. 23 | 24 | Generic use memory mapped to the PNM 25 | 0x00000000`00800000 - 0x********`******** : Varies 26 | 27 | Virtual Map: 28 | 29 | Identity mapped cache disabled, write through section 30 | 0x00000000`00000000 - 0x00000080`00000000 : 512 GB 31 | 32 | Identity mapped full caching 33 | 0x00000080`00000000 - 0x00000100`00000000 : 512 GB 34 | 35 | Per node memory (PNM): 36 | no aslr: ((node_id + 2) << 39) - 0x********`******** : Varies 37 | 38 | Example: Node 0 @ 0x10000000000 39 | Node 1 @ 0x18000000000 40 | Node 2 @ 0x20000000000 41 | Node 3 @ 0x28000000000 42 | Node 4 @ 0x30000000000 43 | Node 5 @ 0x38000000000 44 | Node 6 @ 0x40000000000 45 | Node 7 @ 0x48000000000 46 | 47 | -------------------------------------------------------------------------------- /srcs/net/falktp.asm: -------------------------------------------------------------------------------- 1 | %define FALKTP_REQ 0x957f977c 2 | %define FALKTP_PUSH 0xda8fa024 3 | %define FALKTP_DAT 0xfe004afa 4 | 5 | %define FALKTP_TRANSMIT 0xcd9c04e7 6 | %define FALKTP_CHUNK_REQ 0xb58dc3f7 7 | %define FALKTP_DATA 0xcb4303e4 8 | %define FALKTP_DATA_DONE 0x665e0643 9 | 10 | ; Must be divisible by 8 11 | %define FALKTP_DATA_SIZE (8192) 12 | %define FALKTP_PACKETS_PER_CHUNK (256) 13 | %define FALKTP_CHUNK_SIZE (FALKTP_DATA_SIZE * FALKTP_PACKETS_PER_CHUNK) 14 | 15 | ;%define FALKTP_USE_HASHES 16 | 17 | struc falktp_txmit 18 | .magic: resq 1 19 | 
.padding: resd 1 20 | .req_id: resq 1 21 | .size: resq 1 22 | .hash: resq 2 23 | endstruc 24 | 25 | struc falktp_chunk_req 26 | .magic: resd 1 27 | .padding: resd 1 28 | .req_id: resq 1 29 | .chunk_id: resq 1 30 | endstruc 31 | 32 | struc falktp_data 33 | .magic: resd 1 34 | .padding: resd 1 35 | .req_id: resq 1 36 | .chunk_id: resq 1 37 | .seq_id: resq 1 38 | .data: resb FALKTP_DATA_SIZE 39 | endstruc 40 | 41 | struc falktp_data_done 42 | .magic: resd 1 43 | .padding: resd 1 44 | .req_id: resq 1 45 | .chunk_id: resq 1 46 | endstruc 47 | 48 | struc falktp_req 49 | .magic: resd 1 50 | .req_id: resq 1 51 | .file_id: resq 1 52 | endstruc 53 | 54 | struc falktp_push 55 | .magic: resd 1 56 | .req_id: resq 1 57 | .file_len: resq 1 58 | .hash: resq 2 ; 128-bit falkhash 59 | endstruc 60 | 61 | struc falktp_dat 62 | .magic: resd 1 63 | .req_id: resq 1 64 | .seq_id: resq 1 65 | .data: resb FALKTP_DATA_SIZE 66 | endstruc 67 | 68 | ; r10 -> vaddr to memory to send 69 | ; r11 -> size to send 70 | ; r15 -> request ID 71 | ; rdx -> chunk ID to send 72 | handle_chunk_req: 73 | push rax 74 | push rcx 75 | push rdi 76 | push rsi 77 | push r10 78 | push r11 79 | sub rsp, falktp_data_size 80 | 81 | ; Initialize the packet template 82 | mov dword [rsp + falktp_data.magic], FALKTP_DATA 83 | mov qword [rsp + falktp_data.req_id], r15 84 | mov qword [rsp + falktp_data.chunk_id], rdx 85 | mov qword [rsp + falktp_data.seq_id], 0 86 | lea rdi, [rsp + falktp_data.data] 87 | mov rcx, FALKTP_DATA_SIZE 88 | call bzero 89 | 90 | ; Bounds check the chunk ID 91 | imul rax, rdx, FALKTP_CHUNK_SIZE 92 | add r10, rax 93 | cmp rax, r11 94 | jae short .fail 95 | 96 | ; r11 = file_size - chunk_offset 97 | sub r11, rax 98 | 99 | ; r11 = min(r11, FALKTP_CHUNK_SIZE) 100 | cmp r11, FALKTP_CHUNK_SIZE 101 | jbe short .dont_cap 102 | mov r11, FALKTP_CHUNK_SIZE 103 | .dont_cap: 104 | 105 | ; r10 is the pointer to the data to send 106 | ; r11 is the size in bytes to send 107 | .transmit_data: 108 | mov rcx, r11 109 
| cmp rcx, FALKTP_DATA_SIZE 110 | jbe short .dont_cap_data 111 | mov rcx, FALKTP_DATA_SIZE 112 | .dont_cap_data: 113 | mov rdx, rcx 114 | 115 | lea rdi, [rsp + falktp_data.data] 116 | mov rsi, r10 117 | rep movsb 118 | 119 | mov rbx, rsp 120 | mov rcx, falktp_data_size 121 | call x540_send_packet 122 | 123 | inc qword [rsp + falktp_data.seq_id] 124 | add r10, rdx 125 | sub r11, rdx 126 | jnz short .transmit_data 127 | 128 | .fail: 129 | add rsp, falktp_data_size 130 | pop r11 131 | pop r10 132 | pop rsi 133 | pop rdi 134 | pop rcx 135 | pop rax 136 | ret 137 | 138 | ; r10 -> Vaddr to memory to send 139 | ; r11 -> Size to send 140 | falktp_transmit: 141 | push rax 142 | push rbx 143 | push rcx 144 | push rdx 145 | push rsi 146 | push rbp 147 | push r12 148 | push r14 149 | push r15 150 | XMMPUSH xmm5 151 | sub rsp, falktp_data 152 | 153 | ;push rdi 154 | ;push rsi 155 | ;mov rdi, r10 156 | ;mov rsi, r11 157 | ;call falkhash 158 | ;pop rsi 159 | ;pop rdi 160 | 161 | .retry: 162 | ; Generate a random request id 163 | call xorshift64 164 | 165 | mov dword [rsp + falktp_txmit.magic], FALKTP_TRANSMIT 166 | mov qword [rsp + falktp_txmit.req_id], r15 167 | mov qword [rsp + falktp_txmit.size], r11 168 | movdqu [rsp + falktp_txmit.hash], xmm5 169 | mov rbx, rsp 170 | mov rcx, falktp_txmit_size 171 | call x540_send_packet 172 | 173 | ; Track how long we've waited for this push so we can timeout 174 | mov rax, 5000000 175 | call rdtsc_future 176 | mov rdx, rax 177 | mov r12, 50000 178 | 179 | jmp .init_push 180 | 181 | .wait_for_push: 182 | call x540_rx_advance 183 | .init_push: 184 | ; Check if there was a timeout 185 | call rdtsc64 186 | cmp rax, rdx 187 | jb short .no_timeout 188 | 189 | ; If we timed out 100 times, retry the entire file from the start 190 | cmp r12, 100 191 | jae short .retry 192 | 193 | ; Otherwise, send a falktp_data_done 194 | mov dword [rsp + falktp_data_done.magic], FALKTP_DATA_DONE 195 | mov qword [rsp + falktp_data_done.req_id], r15 196 | mov qword 
[rsp + falktp_data_done.chunk_id], r14 197 | mov rbx, rsp 198 | mov rcx, falktp_data_done_size 199 | call x540_send_packet 200 | 201 | ; Update the timeout 202 | mov rax, 10000 203 | call rdtsc_future 204 | mov rdx, rax 205 | 206 | ; Increment the number of timeouts 207 | inc r12 208 | 209 | .no_timeout: 210 | call x540_probe_rx_udp 211 | 212 | test rsi, rsi 213 | jz short .init_push 214 | 215 | cmp rbp, falktp_chunk_req_size 216 | jne short .wait_for_push 217 | 218 | cmp dword [rsi + falktp_chunk_req.magic], FALKTP_CHUNK_REQ 219 | jne short .wait_for_push 220 | 221 | cmp qword [rsi + falktp_chunk_req.req_id], r15 222 | jne short .wait_for_push 223 | 224 | mov rdx, [rsi + falktp_chunk_req.chunk_id] 225 | mov r14, [rsi + falktp_chunk_req.chunk_id] 226 | call x540_rx_advance 227 | 228 | ; Special case. If the chunk_id was leet, transfer was successful! 229 | mov r12, 0x1337133713371337 230 | cmp r12, rdx 231 | je .done 232 | 233 | call handle_chunk_req 234 | 235 | ; Update the timeout 236 | xor r12, r12 237 | mov rax, 10000 238 | call rdtsc_future 239 | mov rdx, rax 240 | jmp .init_push 241 | 242 | .done: 243 | add rsp, falktp_data 244 | XMMPOP xmm5 245 | pop r15 246 | pop r14 247 | pop r12 248 | pop rbp 249 | pop rsi 250 | pop rdx 251 | pop rcx 252 | pop rbx 253 | pop rax 254 | ret 255 | 256 | struc zfalktp_push 257 | .magic: resq 1 ; 0x00 258 | .req_id: resq 1 ; 0x08 259 | .file_len: resq 1 ; 0x10 260 | .padding: resq 1 261 | .hash: resq 2 262 | endstruc 263 | 264 | struc zfalktp_dat 265 | .magic: resq 1 ; 0x00 266 | .req_id: resq 1 ; 0x08 267 | .data: resb 8192 ; 0x18 268 | endstruc 269 | 270 | ; r10 - Snapshot 271 | ; r11 - Snapshot length 272 | falktp_send: 273 | push rax 274 | push rbx 275 | push rcx 276 | push rdx 277 | push rdi 278 | push r8 279 | push r11 280 | push r12 281 | push r13 282 | push r14 283 | push r15 284 | 285 | XMMPUSH xmm5 286 | 287 | sub rsp, zfalktp_dat_size 288 | 289 | ;mov rdi, r10 290 | ;mov rsi, r11 291 | ;call falkhash 292 | 293 | test 
r11, r11 294 | jz .done 295 | 296 | call xorshift64 297 | mov qword [rsp + zfalktp_push.magic], 0x1A8FA024 298 | mov qword [rsp + zfalktp_push.req_id], r15 299 | mov qword [rsp + zfalktp_push.file_len], r11 300 | 301 | mov rbx, rsp 302 | mov rcx, zfalktp_push_size 303 | call x540_send_packet 304 | 305 | xor r8, r8 ; sent 306 | .lewp: 307 | cmp r8, r11 308 | jae .done 309 | 310 | ; r14 = min(length - sent, 8192) 311 | mov r12, 8192 312 | mov r14, r11 313 | sub r14, r8 314 | cmp r14, r12 315 | cmova r14, r12 316 | 317 | mov qword [rsp + zfalktp_dat.magic], 0x1E004AFA 318 | mov qword [rsp + zfalktp_dat.req_id], r15 319 | 320 | lea rdi, [rsp + zfalktp_dat.data] 321 | lea rsi, [r10 + r8] 322 | mov rcx, r14 323 | rep movsb 324 | 325 | mov rbx, rsp 326 | lea rcx, [r14 + zfalktp_dat.data] 327 | call x540_send_packet 328 | 329 | ; Target ~390MB/s 330 | mov ecx, 20 331 | call rdtsc_sleep 332 | 333 | add r8, r14 334 | jmp .lewp 335 | 336 | .done: 337 | add rsp, zfalktp_dat_size 338 | XMMPOP xmm5 339 | pop r15 340 | pop r14 341 | pop r13 342 | pop r12 343 | pop r11 344 | pop r8 345 | pop rdi 346 | pop rdx 347 | pop rcx 348 | pop rbx 349 | pop rax 350 | ret 351 | 352 | ; r13 -> File ID to request 353 | ; rsi <- Pointer to file data 354 | ; rbp <- Size of file in bytes 355 | falktp_pull: 356 | push rax 357 | push rbx 358 | push rcx 359 | push rdx 360 | push rdi 361 | push r8 362 | push r11 363 | push r12 364 | push r13 365 | push r14 366 | push r15 367 | 368 | XMMPUSH xmm5 369 | XMMPUSH xmm7 370 | 371 | sub rsp, 4096 372 | 373 | ; This is where we store the allocation. If we retry, we don't reallocate. 
374 | xor r11, r11 375 | 376 | .retry: 377 | call xorshift64 378 | 379 | ; Send the request 380 | mov dword [rsp + falktp_req.magic], FALKTP_REQ 381 | mov qword [rsp + falktp_req.req_id], r15 382 | mov qword [rsp + falktp_req.file_id], r13 383 | mov rbx, rsp 384 | mov rcx, falktp_req_size 385 | call x540_send_packet 386 | 387 | ; Track how long we've waited for this push so we can time out 388 | call rdtsc_uptime 389 | mov rdx, rax 390 | 391 | jmp .init_push 392 | 393 | .wait_for_push: 394 | call x540_rx_advance 395 | .init_push: 396 | call rdtsc_uptime 397 | sub rax, rdx 398 | cmp rax, 30000000 399 | jge short .retry 400 | 401 | call x540_probe_rx_udp 402 | 403 | test rsi, rsi 404 | jz short .init_push 405 | 406 | cmp rbp, falktp_push_size 407 | jne short .wait_for_push 408 | 409 | cmp dword [rsi + falktp_push.magic], FALKTP_PUSH 410 | jne short .wait_for_push 411 | 412 | cmp qword [rsi + falktp_push.req_id], r15 413 | jne short .wait_for_push 414 | 415 | mov r14, qword [rsi + falktp_push.file_len] 416 | test r14, r14 417 | jz .wait_for_push 418 | 419 | movdqu xmm7, [rsi + falktp_push.hash] 420 | 421 | ; We're done with this packet now 422 | call x540_rx_advance 423 | 424 | ; Round up an integer division to determine the number of chunks we need 425 | xor rdx, rdx 426 | lea rax, [r14 + (FALKTP_DATA_SIZE - 1)] 427 | mov rcx, FALKTP_DATA_SIZE 428 | div rcx 429 | mov r8, rax 430 | 431 | test r11, r11 432 | jnz short .dont_realloc 433 | 434 | imul rbx, rax, FALKTP_DATA_SIZE 435 | bt r13, 63 436 | jc short .not_mixed 437 | mixed_alloc rbx 438 | jmp short .alloc_dec 439 | .not_mixed: 440 | bamp_alloc rbx 441 | .alloc_dec: 442 | mov r11, rbx 443 | .dont_realloc: 444 | mov rbx, r11 445 | 446 | ; Track how long we've waited for this chunk so we can time out 447 | mov rax, 5000000 448 | call rdtsc_future 449 | mov rdx, rax 450 | 451 | ; At this point rbx points to the destination for the received data 452 | ; rax is the size of the file in chunks 453 | ; r8 is the number of 
remaining chunks 454 | ; r14 is the size of the file in bytes 455 | jmp .init_chunk 456 | 457 | .wait_for_chunks: 458 | call x540_rx_advance 459 | .init_chunk: 460 | ; Timeout after 5 seconds 461 | call rdtsc64 462 | cmp rax, rdx 463 | jge .retry 464 | 465 | call x540_probe_rx_udp 466 | 467 | test rsi, rsi 468 | jz short .init_chunk 469 | 470 | cmp rbp, falktp_dat_size 471 | jne short .wait_for_chunks 472 | 473 | cmp dword [rsi + falktp_dat.magic], FALKTP_DAT 474 | jne short .wait_for_chunks 475 | 476 | cmp qword [rsi + falktp_dat.req_id], r15 477 | jne short .wait_for_chunks 478 | 479 | mov rax, 5000000 480 | call rdtsc_future 481 | mov rdx, rax 482 | 483 | ; Bounds check the sequence ID 484 | cmp qword [rsi + falktp_dat.seq_id], rax 485 | jae short .wait_for_chunks 486 | 487 | imul rdi, qword [rsi + falktp_dat.seq_id], FALKTP_DATA_SIZE 488 | add rdi, rbx 489 | add rsi, falktp_dat.data 490 | mov rcx, (FALKTP_DATA_SIZE / 8) 491 | rep movsq 492 | 493 | ; We're done with the packet 494 | call x540_rx_advance 495 | 496 | dec r8 497 | jnz short .init_chunk 498 | 499 | %ifdef FALKTP_USE_HASHES 500 | ; Make sure the hashes match 511 | mov rdi, rbx 512 | mov rsi, r14 513 | call falkhash 514 | pxor xmm5, xmm7 515 | ptest xmm5, xmm5 516 | jnz .retry 517 | %endif 518 | 519 | mov rsi, rbx 520 | mov rbp, r14 521 | 522 | .end: 523 | add rsp, 4096 524 | 525 | XMMPOP xmm7 526 | XMMPOP xmm5 527 | 528 | pop r15 529 | pop r14 530 | pop r13 531 | pop r12 532 | pop r11 533 | pop r8 534 | pop rdi 535 | pop rdx 536 | pop rcx 537 | pop rbx 538 | pop rax 539 | ret 540 | 541 | ; rcx -> buffer 542 | ; rdx -> length (in bytes) 543 | ; eax <- crc32c (polynomial 0x11EDC6F41) 544 | crc32c: 545 | push rcx 546 | push rdx 547 | 548 | mov eax, -1 549 | 550 | ; Calculate the CRC32 in chunks of 8 bytes at a time 551 | .lewp_chunk: 552 | ; If we have fewer 
than 8 bytes left, try to process the rest individually 553 | cmp rdx, 8 554 | jb short .individual 555 | 556 | ; Generate the CRC32 on an 8 byte chunk, try again! 557 | crc32 rax, qword [rcx] 558 | add rcx, 8 559 | sub rdx, 8 560 | jmp short .lewp_chunk 561 | 562 | .individual: 563 | test rdx, rdx 564 | jz short .end 565 | 566 | ; Finish the CRC32 byte by byte 567 | .lewp: 568 | crc32 rax, byte [rcx] 569 | inc rcx 570 | dec rdx 571 | jnz short .lewp 572 | 573 | .end: 574 | ; Done! 575 | xor eax, -1 576 | 577 | pop rdx 578 | pop rcx 579 | ret 580 | 581 | ; A chunk_size of 0x50 is ideal for AMD fam 15h platforms, which is what this 582 | ; was optimized and designed for. If you change this value, you have to 583 | ; manually add/remove movdqus and aesencs from the core loop. 584 | %define FALKHASH_CHUNK_SIZE 0x50 585 | 586 | ; rdi -> data 587 | ; rsi -> len 588 | ; xmm5 <- 128-bit hash 589 | falkhash: 590 | push rax 591 | push rcx 592 | push rdi 593 | push rsi 594 | push rbp 595 | 596 | XMMPUSH xmm0 597 | XMMPUSH xmm1 598 | XMMPUSH xmm2 599 | XMMPUSH xmm3 600 | XMMPUSH xmm4 601 | 602 | sub rsp, FALKHASH_CHUNK_SIZE 603 | 604 | ; Add the seed to the length 605 | mov rbp, rsi 606 | add rbp, 0x13371337 607 | 608 | ; Place the length+seed for both the low and high 64-bits into xmm5, 609 | ; our hash output. 610 | pinsrq xmm5, rbp, 0 611 | inc rbp 612 | pinsrq xmm5, rbp, 1 613 | 614 | .lewp: 615 | ; If we have less than a chunk, copy the partial chunk to the stack. 
616 | cmp rsi, FALKHASH_CHUNK_SIZE 617 | jb short .pad_last_chunk 618 | 619 | .continue: 620 | ; Read 5 pieces from memory into xmms 621 | movdqu xmm0, [rdi + 0x00] 622 | movdqu xmm1, [rdi + 0x10] 623 | movdqu xmm2, [rdi + 0x20] 624 | movdqu xmm3, [rdi + 0x30] 625 | movdqu xmm4, [rdi + 0x40] 626 | 627 | ; Mix all pieces into xmm0 628 | aesenc xmm0, xmm1 629 | aesenc xmm0, xmm2 630 | aesenc xmm0, xmm3 631 | aesenc xmm0, xmm4 632 | 633 | ; Finalize xmm0 by mixing with itself 634 | aesenc xmm0, xmm0 635 | 636 | ; Mix in xmm0 to the hash 637 | aesenc xmm5, xmm0 638 | 639 | ; Go to the next chunk, fall through if we're done. 640 | add rdi, FALKHASH_CHUNK_SIZE 641 | sub rsi, FALKHASH_CHUNK_SIZE 642 | jnz short .lewp 643 | jmp short .done 644 | 645 | .pad_last_chunk: 646 | ; Fill the stack with 0xff's, this is our padding 647 | push rdi 648 | lea rdi, [rsp + 8] 649 | mov eax, -1 650 | mov ecx, FALKHASH_CHUNK_SIZE 651 | rep stosb 652 | pop rdi 653 | 654 | ; Copy the remainder of data to the stack 655 | mov rcx, rsi 656 | mov rsi, rdi 657 | mov rdi, rsp 658 | rep movsb 659 | 660 | ; Make our data now come from the stack, and set the size to one chunk. 661 | mov rdi, rsp 662 | mov rsi, FALKHASH_CHUNK_SIZE 663 | 664 | jmp short .continue 665 | 666 | .done: 667 | ; Finalize the hash. This is required at least once to pass 668 | ; Combination 0x8000000 and Combination 0x0000001. Need more than 1 to 669 | ; pass the Seed tests. We do 4 because they're pretty much free. 670 | ; Maybe we should actually use the seed better? Nah, more finalizing! 
671 | aesenc xmm5, xmm5 672 | aesenc xmm5, xmm5 673 | aesenc xmm5, xmm5 674 | aesenc xmm5, xmm5 675 | 676 | add rsp, FALKHASH_CHUNK_SIZE 677 | 678 | XMMPOP xmm4 679 | XMMPOP xmm3 680 | XMMPOP xmm2 681 | XMMPOP xmm1 682 | XMMPOP xmm0 683 | 684 | pop rbp 685 | pop rsi 686 | pop rdi 687 | pop rcx 688 | pop rax 689 | ret 690 | 691 | -------------------------------------------------------------------------------- /srcs/net/i825xx.asm: -------------------------------------------------------------------------------- 1 | [bits 64] 2 | 3 | ; i825xx_fetch_pci 4 | ; 5 | ; Summary: 6 | ; 7 | ; This function enumerates all PCI devices and returns the PCI request needed 8 | ; for the first device encountered with VID 8086 and DID 10d3 9 | ; 10 | ; Parameters: 11 | ; 12 | ; None 13 | ; 14 | ; Alignment: 15 | ; 16 | ; None 17 | ; 18 | ; Returns: 19 | ; 20 | ; on Success: rax = PCI request 21 | ; on Failure: rax = 0 22 | ; 23 | ; Smashes: 24 | ; 25 | ; rax - Return value 26 | ; 27 | ; Optimization: 28 | ; 29 | ; Readability 30 | ; 31 | i825xx_fetch_pci: 32 | push rbx 33 | push rdx 34 | 35 | sub rsp, 0x10 36 | 37 | ; rsp + 0x00 L2 | Bus number 38 | ; rsp + 0x02 L2 | Device number 39 | ; rsp + 0x04 L2 | Function number 40 | 41 | mov word [rsp + 0x00], 0xFF 42 | .for_bus: 43 | mov word [rsp + 0x02], 0x1F 44 | .for_device: 45 | mov word [rsp + 0x04], 0x07 46 | .for_func: 47 | ; Bus 48 | movzx eax, byte [rsp + 0x00] 49 | shl eax, 5 50 | 51 | ; Device 52 | or al, byte [rsp + 0x02] 53 | shl eax, 3 54 | 55 | ; Function 56 | or al, byte [rsp + 0x04] 57 | shl eax, 8 58 | 59 | ; Enable bit 60 | or eax, 0x80000000 61 | 62 | ; Save the bus:device.func query into ebx 63 | mov ebx, eax 64 | 65 | ; Request the vendor ID and device ID 66 | mov dx, 0x0CF8 67 | out dx, eax 68 | mov dx, 0x0CFC 69 | in eax, dx 70 | 71 | ; If the vendor ID is 0xFFFF, then this bus:device.func does not exist 72 | cmp ax, 0xFFFF 73 | je short .next_device 74 | 75 | ; Query register 0x00, it's the one that 
contains the VID and DID 76 | mov eax, ebx 77 | mov dx, 0x0CF8 78 | out dx, eax 79 | mov dx, 0x0CFC 80 | in eax, dx 81 | 82 | cmp eax, 0x10d38086 83 | jne short .next_device 84 | 85 | mov eax, ebx 86 | jmp short .ret 87 | 88 | .next_device: 89 | dec word [rsp + 0x04] 90 | jns short .for_func 91 | 92 | dec word [rsp + 0x02] 93 | jns short .for_device 94 | 95 | dec word [rsp + 0x00] 96 | jns short .for_bus 97 | 98 | xor rax, rax 99 | .ret: 100 | add rsp, 0x10 101 | pop rdx 102 | pop rbx 103 | ret 104 | 105 | i825xx_init: 106 | push rax 107 | push rcx 108 | push rdx 109 | 110 | call i825xx_fetch_pci 111 | test rax, rax 112 | jz .done 113 | 114 | mov qword [fs:globals.i825xx_dev + i825xx_dev.pcireq], rax 115 | 116 | or rax, 0x10 ; BAR0 (PCI config offset 0x10) 117 | mov dx, 0x0CF8 118 | out dx, eax 119 | mov dx, 0x0CFC 120 | in eax, dx 121 | and eax, ~1 122 | 123 | mov qword [fs:globals.i825xx_dev + i825xx_dev.mmio_base], rax 124 | 125 | ; Init procedure according to 82574L docs 126 | ; 127 | ; 1. Disable interrupts 128 | ; 2. Issue global reset and do general config 129 | ; 3. Setup the PHY and link 130 | ; 4. Init the stat counters 131 | ; 5. Enable RX 132 | ; 6. Enable TX 133 | ; 7. 
Enable interrupts 134 | 135 | ; Disable all interrupts by writing all Fs to the Interrupt Mask Clear 136 | ; Register (IMC) 137 | mov rdx, qword [fs:globals.i825xx_dev + i825xx_dev.mmio_base] 138 | mov dword [rdx + 0xD8], -1 139 | 140 | ; Issue a full reset by setting RST in the CTRL register 141 | mov rdx, qword [fs:globals.i825xx_dev + i825xx_dev.mmio_base] 142 | or dword [rdx + 0x00], (1 << 26) 143 | 144 | ; Disable all interrupts by writing all Fs to the Interrupt Mask Clear 145 | ; Register (IMC) 146 | mov rdx, qword [fs:globals.i825xx_dev + i825xx_dev.mmio_base] 147 | mov dword [rdx + 0xD8], -1 148 | 149 | ; If we're not using XOFF flow control, zero FCAH, FCAL, and FCT 150 | mov rdx, qword [fs:globals.i825xx_dev + i825xx_dev.mmio_base] 151 | mov dword [rdx + 0x002C], 0 ; FCAH 152 | mov dword [rdx + 0x0028], 0 ; FCAL 153 | mov dword [rdx + 0x0030], 0 ; FCT 154 | or dword [rdx + 0x5B00], (1 << 22) ; GCR bit 22 should be set 155 | 156 | ; Enable the MAC and PHY to be all auto negotiated/resolved 157 | ; Set CTRL.FRCDPLX = 0b, CTRL.FRCSPD = 0b, CTRL.ASDE = 0b, CTRL.SLU = 1b, 158 | ; CTRL.RFCE = 1b, CTRL.TFCE = 1b 159 | mov rdx, qword [fs:globals.i825xx_dev + i825xx_dev.mmio_base] 160 | and dword [rdx + 0x00], ~((1 << 12) | (1 << 11) | (1 << 5)) 161 | or dword [rdx + 0x00], ((1 << 6) | (1 << 27) | (1 << 28)) 162 | 163 | ; Zero out all MTA entries 164 | xor rcx, rcx 165 | mov rdx, qword [fs:globals.i825xx_dev + i825xx_dev.mmio_base] 166 | .zero_mta: 167 | mov dword [rdx + 0x5200 + rcx*4], 0 168 | inc rcx 169 | cmp rcx, 128 170 | jl short .zero_mta 171 | 172 | call i825xx_init_rx 173 | call i825xx_init_tx 174 | 175 | mov rdx, qword [fs:globals.i825xx_dev + i825xx_dev.mmio_base] 176 | mov dword [rdx + 0x14], (0 << 2) | 1 177 | .wait_for_done: 178 | bt dword [rdx + 0x14], 1 179 | jnc .wait_for_done 180 | mov eax, dword [rdx + 0x14] 181 | shr eax, 16 182 | mov word [fs:globals.hw_mac_address + 0], ax 183 | mov rdx, qword [fs:globals.i825xx_dev + i825xx_dev.mmio_base] 
184 | mov dword [rdx + 0x14], (1 << 2) | 1 185 | .wait_for_done1: 186 | bt dword [rdx + 0x14], 1 187 | jnc .wait_for_done1 188 | mov eax, dword [rdx + 0x14] 189 | shr eax, 16 190 | mov word [fs:globals.hw_mac_address + 2], ax 191 | mov rdx, qword [fs:globals.i825xx_dev + i825xx_dev.mmio_base] 192 | mov dword [rdx + 0x14], (2 << 2) | 1 193 | .wait_for_done2: 194 | bt dword [rdx + 0x14], 1 195 | jnc .wait_for_done2 196 | mov eax, dword [rdx + 0x14] 197 | shr eax, 16 198 | mov word [fs:globals.hw_mac_address + 4], ax 199 | 200 | .done: 201 | pop rdx 202 | pop rcx 203 | pop rax 204 | ret 205 | 206 | %define I825XX_NUM_RX 512 207 | 208 | i825xx_init_rx: 209 | push rax 210 | push rbx 211 | push rcx 212 | push rdx 213 | push rbp 214 | 215 | ; Allocate ring descriptor space 216 | mov rbx, (16 * I825XX_NUM_RX) 217 | bamp_alloc rbx 218 | call bamp_get_phys 219 | 220 | mov qword [fs:globals.i825xx_dev + i825xx_dev.rx_ring_base], rax 221 | 222 | ; rbp - Base address 223 | ; rdx - Base address pointer 224 | ; rcx - Counter 225 | mov rbp, rax 226 | mov rdx, rax 227 | xor rcx, rcx 228 | .setup_rx: 229 | ; Allocate 8k for each RX descriptor 230 | mov rbx, (8192 + 16) 231 | bamp_alloc rbx 232 | call bamp_get_phys 233 | 234 | mov qword [rdx + 0x00], rax ; Address 235 | mov qword [rdx + 0x08], 0 ; Status 236 | 237 | add rdx, 16 238 | inc rcx 239 | cmp rcx, I825XX_NUM_RX 240 | jl short .setup_rx 241 | 242 | mov rdx, qword [fs:globals.i825xx_dev + i825xx_dev.mmio_base] 243 | 244 | ; Set up the high and low parts of the address 245 | mov rax, rbp 246 | shr rax, 32 247 | mov dword [rdx + 0x02804], eax ; RDBAH0 248 | mov dword [rdx + 0x02800], ebp ; RDBAL0 249 | 250 | ; Set up the length of the receive ring buffer 251 | mov dword [rdx + 0x02808], (I825XX_NUM_RX * 16) ; RDLEN0 252 | 253 | ; Set up the head and tail pointers 254 | mov dword [rdx + 0x02810], 0 ; Head 255 | mov dword [rdx + 0x02818], (I825XX_NUM_RX - 1) ; Tail 256 | 257 | ; Store the current read position 258 | mov qword 
[fs:globals.i825xx_dev + i825xx_dev.rx_tail], 0 259 | 260 | ; Set the RX control register 261 | ; Enable the following: 262 | ; SBP - Store bad packets (store packets with CRC errors) 263 | ; UPE - Unicast promiscuous enable 264 | ; MPE - Multicast promiscuous enable 265 | ; LPE - Long packet enable 266 | ; BAM - Accept broadcast packets 267 | ; BSIZE - Set packet size to 8k 268 | ; BSEX - Needed to get packet size to 8k 269 | ; SECRC - Strip ethernet CRC from packets 270 | ; EN - Enable RX! 271 | mov dword [rdx + 0x100], ((1 << 2) | (1 << 3) | (1 << 4) | (1 << 5) | (1 << 15) | (2 << 16) | (1 << 25) | (1 << 26) | (1 << 1)) 272 | 273 | pop rbp 274 | pop rdx 275 | pop rcx 276 | pop rbx 277 | pop rax 278 | ret 279 | 280 | ; This must be a power of two! 281 | %define I825XX_NUM_TX 512 282 | 283 | i825xx_init_tx: 284 | push rax 285 | push rbx 286 | push rcx 287 | push rdx 288 | push rbp 289 | 290 | ; Allocate ring descriptor space 291 | mov rbx, (16 * I825XX_NUM_TX) 292 | bamp_alloc rbx 293 | call bamp_get_phys 294 | 295 | mov qword [fs:globals.i825xx_dev + i825xx_dev.tx_ring_base], rax 296 | 297 | ; rbp - Base address 298 | ; rdx - Base address pointer 299 | ; rcx - Counter 300 | mov rbp, rax 301 | mov rdx, rax 302 | xor rcx, rcx 303 | .setup_tx: 304 | push rax 305 | 306 | mov rbx, 8192 307 | bamp_alloc rbx 308 | call bamp_get_phys 309 | 310 | mov qword [rdx + 0x00], rax ; Address 311 | mov qword [rdx + 0x08], 0 ; Status 312 | bts qword [rdx + 0x08], 32 ; Set DD bit in status to signify this is avail 313 | ; for use. 
314 | 315 | pop rax 316 | 317 | add rdx, 16 318 | inc rcx 319 | cmp rcx, I825XX_NUM_TX 320 | jl short .setup_tx 321 | 322 | mov rdx, qword [fs:globals.i825xx_dev + i825xx_dev.mmio_base] 323 | 324 | ; Set up the high and low parts of the address 325 | mov rax, rbp 326 | shr rax, 32 327 | mov dword [rdx + 0x03804], eax ; TDBAH0 328 | mov dword [rdx + 0x03800], ebp ; TDBAL0 329 | 330 | ; Set up the length of the TX ring buffer 331 | mov dword [rdx + 0x03808], (I825XX_NUM_TX * 16) ; TDLEN0 332 | 333 | ; Set up the head and tail pointers 334 | mov dword [rdx + 0x03810], 0 ; Head 335 | mov dword [rdx + 0x03818], 0 ; Tail 336 | 337 | ; Enable the device and also pad short packets 338 | mov dword [rdx + 0x400], ((1 << 1) | (1 << 3)) 339 | 340 | mov qword [fs:globals.i825xx_dev + i825xx_dev.tx_tail], 0 341 | 342 | pop rbp 343 | pop rdx 344 | pop rcx 345 | pop rbx 346 | pop rax 347 | ret 348 | 349 | struc i825xx_tx_desc 350 | .address: resq 1 351 | .length: resw 1 352 | .cso: resb 1 353 | .cmd: resb 1 354 | .sta: resb 1 355 | .css: resb 1 356 | .special: resw 1 357 | endstruc 358 | 359 | ; This must be initialized before we relocate. After that it is read-only 360 | ; and must be accessed RIP-relative 361 | udp_template: 362 | .eth: 363 | .dest: db 0x2c, 0x41, 0x38, 0xa2, 0x6c, 0xf5 364 | .src: db 0x00, 0x1b, 0x21, 0x34, 0x02, 0x19 365 | .type: db 0x08, 0x00 ; IP 366 | 367 | .ip: 368 | .ver: db 0x45 369 | .svc: db 0x00 370 | .len: db 0x05, 0xd0 ; 28 + payload_len 371 | .ident: db 0x0e, 0x5f 372 | .flags: db 0x00 373 | .frag: db 0x00 374 | .ttl: db 0x80 375 | .proto: db 0x11 ; UDP 376 | .chk: db 0x00, 0x00 377 | .srcip: db 0xc0, 0xa8, 0x08, 0x08 ; 192.168.8.8 378 | .destip: db 0xc0, 0xa8, 0x08, 0x01 ; 192.168.8.1 379 | 380 | .udp: 381 | .src_port: db 0x41, 0x00 ; 0x4100 382 | .dest_port: db 0x41, 0x00 ; 0x4100 383 | .ulen: db 0x00, 0x00 ; 8 + payload_len 384 | .chksum: db 0x00, 0x00 385 | .end: 386 | 387 | udp_template_len: equ (udp_template.end - udp_template) 388 | 389 | ; r15 ->
Packet 390 | update_ipv4_checksum: 391 | push rax 392 | push rbx 393 | push rcx 394 | push rdx 395 | push r15 396 | 397 | ; Zero out the checksum for calculation 398 | mov word [r15 + (udp_template.chk - udp_template)], 0 399 | 400 | lea rbx, [r15 + (udp_template.ip - udp_template)] 401 | mov rcx, 20 402 | xor rax, rax 403 | 404 | .lewp: 405 | movzx rdx, word [rbx] 406 | xchg dh, dl 407 | add rax, rdx 408 | 409 | add rbx, 2 410 | sub rcx, 2 411 | jnz short .lewp 412 | 413 | mov rdx, rax 414 | shr rdx, 16 415 | add rax, rdx 416 | not rax 417 | 418 | xchg al, ah 419 | mov word [r15 + (udp_template.chk - udp_template)], ax 420 | 421 | pop r15 422 | pop rdx 423 | pop rcx 424 | pop rbx 425 | pop rax 426 | ret 427 | 428 | i825xx_tx_acquire_spinlock: 429 | push rbx 430 | 431 | ; Acquire a lock 432 | mov rbx, 1 433 | lock xadd qword [fs:globals.i825xx_tx_lock], rbx 434 | 435 | ; Spin until we're the chosen one 436 | .spin: 437 | cmp rbx, qword [fs:globals.i825xx_tx_release] 438 | jne short .spin 439 | 440 | pop rbx 441 | ret 442 | 443 | i825xx_tx_release_spinlock: 444 | ; Release the lock 445 | inc qword [fs:globals.i825xx_tx_release] 446 | 447 | ret 448 | 449 | i825xx_send_packets: 450 | push rax 451 | push rbx 452 | push rcx 453 | push rdx 454 | 455 | mov rdx, qword [fs:globals.i825xx_dev + i825xx_dev.tx_ring_base] 456 | mov rcx, qword [fs:globals.i825xx_dev + i825xx_dev.mmio_base] 457 | 458 | call i825xx_tx_acquire_spinlock 459 | 460 | mov eax, dword [rcx + 0x03818] 461 | .lewp: 462 | ; Check if this packet has DD set to 0, if so, send it 463 | mov ebx, eax 464 | shl ebx, 4 465 | test byte [rdx + rbx + i825xx_tx_desc.sta], 1 466 | jnz short .done 467 | 468 | ; Update tail pointer 469 | add eax, 1 470 | and eax, (I825XX_NUM_TX - 1) 471 | jmp .lewp 472 | 473 | .done: 474 | mov dword [rcx + 0x03818], eax 475 | 476 | call i825xx_tx_release_spinlock 477 | pop rdx 478 | pop rcx 479 | pop rbx 480 | pop rax 481 | ret 482 | 483 | ; rax <- Pointer to descriptor 484 | 
i825xx_get_tx_descriptor: 485 | push rbx 486 | push rdx 487 | 488 | ; Fetch the next available descriptor 489 | mov rdx, qword [fs:globals.i825xx_dev + i825xx_dev.tx_ring_base] 490 | mov rbx, 1 491 | lock xadd qword [fs:globals.i825xx_dev + i825xx_dev.tx_tail], rbx 492 | and rbx, (I825XX_NUM_TX - 1) 493 | shl rbx, 4 494 | 495 | ; Due to the size of the ring being larger than the number of cores, we 496 | ; do not need to have a lock on descriptors at this point. 497 | 498 | ; Wait for the DD flag to be set, signifying this can be reused. 499 | .lewp: 500 | pause 501 | test byte [rdx + rbx + i825xx_tx_desc.sta], 1 502 | jz short .lewp 503 | 504 | lea rax, [rdx + rbx] 505 | 506 | pop rdx 507 | pop rbx 508 | ret 509 | 510 | ; rbx -> Packet (virtual address) 511 | ; rcx -> Length 512 | i825xx_send_packet: 513 | push rax 514 | push rbx 515 | push rcx 516 | push rdx 517 | push r14 518 | push r15 519 | 520 | call i825xx_get_tx_descriptor 521 | mov r14, rax 522 | mov r15, [r14] 523 | 524 | ; Copy the udp header to the payload 525 | push rdi 526 | push rsi 527 | push rcx 528 | mov rcx, udp_template_len 529 | mov rdi, r15 530 | lea rsi, [rel udp_template] 531 | rep movsb 532 | pop rcx 533 | pop rsi 534 | pop rdi 535 | 536 | ; Copy the packet to the payload 537 | push rdi 538 | push rsi 539 | push rcx 540 | lea rdi, [r15 + udp_template_len] 541 | mov rsi, rbx 542 | rep movsb 543 | pop rcx 544 | pop rsi 545 | pop rdi 546 | 547 | mov rdx, rcx ; IP len 548 | mov rax, rcx ; UDP len 549 | 550 | add dx, 28 551 | xchg dl, dh ; byte swap 552 | mov [r15 + (udp_template.len - udp_template)], dx 553 | 554 | add ax, 8 555 | xchg al, ah ; byte swap 556 | mov [r15 + (udp_template.ulen - udp_template)], ax 557 | 558 | call update_ipv4_checksum 559 | 560 | ; Update to send the packet, and add the header length 561 | add rcx, udp_template_len 562 | 563 | mov word [r14 + i825xx_tx_desc.length], cx 564 | mov byte [r14 + i825xx_tx_desc.cmd], ((1 << 3) | 3) 565 | mov byte [r14 + 
i825xx_tx_desc.sta], 0 566 | call i825xx_send_packets 567 | 568 | pop r15 569 | pop r14 570 | pop rdx 571 | pop rcx 572 | pop rbx 573 | pop rax 574 | ret 575 | 576 | i825xx_init_thread_local: 577 | push rbx 578 | push rcx 579 | push rdi 580 | push rsi 581 | 582 | ; Update the template src MAC and the last two bytes of the src IP 583 | mov edx, dword [fs:globals.hw_mac_address] 584 | mov dword [rel udp_template.src], edx 585 | mov dx, word [fs:globals.hw_mac_address + 4] 586 | mov word [rel udp_template.src + 4], dx 587 | mov word [rel udp_template.srcip + 2], dx 588 | 589 | pop rsi 590 | pop rdi 591 | pop rcx 592 | pop rbx 593 | ret 594 | 595 | ; rbx <- Buffer to read into (must be 4k) 596 | i825xx_poll_rx: 597 | push rax 598 | push rbx 599 | push rcx 600 | push rdx 601 | push r15 602 | 603 | mov r15, rbx 604 | 605 | ; Only one thread at a time here. Don't block if you don't get through 606 | mov rax, 1 607 | lock xadd qword [fs:globals.i825xx_rx_poll_lock], rax 608 | test rax, rax 609 | jnz .end 610 | 611 | ; Get the rx entry 612 | mov rax, qword [fs:globals.i825xx_dev + i825xx_dev.rx_tail] 613 | mov rdx, qword [fs:globals.i825xx_dev + i825xx_dev.rx_ring_base] 614 | shl rax, 4 615 | mov rbx, qword [rdx + rax + 0] ; pointer 616 | 617 | .lewp: 618 | mov rcx, qword [rdx + rax + 8] ; flags/status 619 | 620 | ; Wait until a packet is present here 621 | bt rcx, 32 622 | jnc short .lewp 623 | 624 | and rcx, 0xFFFF 625 | 626 | ; rbx now holds the rxed packet 627 | ; rcx holds the length 628 | 629 | inc qword [fs:globals.i825xx_dev + i825xx_dev.rxed_count] 630 | 631 | mov rcx, 0x1E15BAC81612B164 632 | cmp qword [rbx + udp_template_len], rcx 633 | jne short .dont_reboot 634 | 635 | mov al, 0xFE 636 | mov dx, 0x64 637 | out dx, al 638 | 639 | .dont_reboot: 640 | 641 | ; Copy the received packet to the caller-provided buffer 642 | push rdi 643 | push rsi 644 | push rcx 645 | mov rdi, r15 646 | mov rsi, rbx 647 | mov rcx, (2048 / 8) 648 | rep movsq 649 | pop rcx 650 | pop rsi 651 | pop rdi 652 | 653 |
.next_packet: 654 | ; Update our internal head 655 | inc qword [fs:globals.i825xx_dev + i825xx_dev.rx_tail] 656 | and qword [fs:globals.i825xx_dev + i825xx_dev.rx_tail], (512 - 1) 657 | 658 | ; Put the packet we just read back up for storage 659 | mov qword [rdx + rax + 8], 0 ; Clear out the flags and status 660 | 661 | mov eax, dword [fs:globals.i825xx_dev + i825xx_dev.rx_tail] 662 | 663 | mov rdx, qword [fs:globals.i825xx_dev + i825xx_dev.mmio_base] 664 | mov dword [rdx + 0x02818], eax ; Tail 665 | 666 | .end: 667 | lock dec qword [fs:globals.i825xx_rx_poll_lock] 668 | pop r15 669 | pop rdx 670 | pop rcx 671 | pop rbx 672 | pop rax 673 | ret 674 | 675 | -------------------------------------------------------------------------------- /srcs/net/tags: -------------------------------------------------------------------------------- 1 | !_TAG_FILE_FORMAT 2 /extended format; --format=1 will not append ;" to lines/ 2 | !_TAG_FILE_SORTED 1 /0=unsorted, 1=sorted, 2=foldcase/ 3 | !_TAG_PROGRAM_AUTHOR Darren Hiebert /dhiebert@users.sourceforge.net/ 4 | !_TAG_PROGRAM_NAME Exuberant Ctags // 5 | !_TAG_PROGRAM_URL http://ctags.sourceforge.net /official site/ 6 | !_TAG_PROGRAM_VERSION 5.8 // 7 | endstruc .\falktp.asm /^endstruc$/;" l 8 | endstruc .\i825xx.asm /^endstruc$/;" l 9 | endstruc .\x540.asm /^endstruc$/;" l 10 | falktp_pull .\falktp.asm /^falktp_pull:$/;" l 11 | i825xx_fetch_pci .\i825xx.asm /^i825xx_fetch_pci:$/;" l 12 | i825xx_get_tx_descriptor .\i825xx.asm /^i825xx_get_tx_descriptor:$/;" l 13 | i825xx_init .\i825xx.asm /^i825xx_init:$/;" l 14 | i825xx_init_rx .\i825xx.asm /^i825xx_init_rx:$/;" l 15 | i825xx_init_thread_local .\i825xx.asm /^i825xx_init_thread_local:$/;" l 16 | i825xx_init_tx .\i825xx.asm /^i825xx_init_tx:$/;" l 17 | i825xx_poll_rx .\i825xx.asm /^i825xx_poll_rx:$/;" l 18 | i825xx_send_packet .\i825xx.asm /^i825xx_send_packet:$/;" l 19 | i825xx_send_packets .\i825xx.asm /^i825xx_send_packets:$/;" l 20 | i825xx_tx_acquire_spinlock .\i825xx.asm 
/^i825xx_tx_acquire_spinlock:$/;" l 21 | i825xx_tx_release_spinlock .\i825xx.asm /^i825xx_tx_release_spinlock:$/;" l 22 | shitsum .\falktp.asm /^shitsum:$/;" l 23 | struc .\falktp.asm /^struc falktp_chunk$/;" l 24 | struc .\falktp.asm /^struc falktp_dat$/;" l 25 | struc .\falktp.asm /^struc falktp_dat_done$/;" l 26 | struc .\falktp.asm /^struc falktp_fin$/;" l 27 | struc .\falktp.asm /^struc falktp_push$/;" l 28 | struc .\falktp.asm /^struc falktp_push_ack$/;" l 29 | struc .\falktp.asm /^struc falktp_req$/;" l 30 | struc .\i825xx.asm /^struc i825xx_tx_desc$/;" l 31 | struc .\x540.asm /^struc x540_tx_desc$/;" l 32 | udp_template .\i825xx.asm /^udp_template:$/;" l 33 | udp_template_10g .\x540.asm /^udp_template_10g:$/;" l 34 | udp_template_10g_len .\x540.asm /^udp_template_10g_len: equ (udp_template_10g.end - udp_template_10g)$/;" d 35 | udp_template_len .\i825xx.asm /^udp_template_len: equ (udp_template.end - udp_template)$/;" d 36 | update_ipv4_checksum .\i825xx.asm /^update_ipv4_checksum:$/;" l 37 | x540_fetch_pci .\x540.asm /^x540_fetch_pci:$/;" l 38 | x540_init .\x540.asm /^x540_init:$/;" l 39 | x540_init_local_rx .\x540.asm /^x540_init_local_rx:$/;" l 40 | x540_init_local_tx .\x540.asm /^x540_init_local_tx:$/;" l 41 | x540_poll_rx_int .\x540.asm /^x540_poll_rx_int:$/;" l 42 | x540_poll_rx_raw .\x540.asm /^x540_poll_rx_raw:$/;" l 43 | x540_poll_rx_udp .\x540.asm /^x540_poll_rx_udp:$/;" l 44 | x540_probe_rx_udp .\x540.asm /^x540_probe_rx_udp:$/;" l 45 | x540_rx_advance .\x540.asm /^x540_rx_advance:$/;" l 46 | x540_send_packet .\x540.asm /^x540_send_packet:$/;" l 47 | x540_send_packets .\x540.asm /^x540_send_packets:$/;" l 48 | x540_validate_udp .\x540.asm /^x540_validate_udp:$/;" l 49 | -------------------------------------------------------------------------------- /srcs/net/x540.asm: -------------------------------------------------------------------------------- 1 | [bits 64] 2 | 3 | %define X540_BASE_PORT 1400 4 | 5 | ; x540_fetch_pci 6 | ; 7 | ; Summary: 8 | 
; 9 | ; This function enumerates all PCI devices and returns the PCI request needed 10 | ; for the first device encountered with VID 8086 and DID 1528 11 | ; 12 | ; Parameters: 13 | ; 14 | ; None 15 | ; 16 | ; Alignment: 17 | ; 18 | ; None 19 | ; 20 | ; Returns: 21 | ; 22 | ; on Success: rax = PCI request on success 23 | ; on Failure: rax = 0 24 | ; 25 | ; Smashes: 26 | ; 27 | ; rax - Return value 28 | ; 29 | ; Optimization: 30 | ; 31 | ; Readability 32 | ; 33 | x540_fetch_pci: 34 | push rbx 35 | push rdx 36 | 37 | sub rsp, 0x10 38 | 39 | ; rsp + 0x00 L2 | Bus number 40 | ; rsp + 0x02 L2 | Device number 41 | ; rsp + 0x04 L2 | Function number 42 | 43 | mov word [rsp + 0x00], 0xFF 44 | .for_bus: 45 | mov word [rsp + 0x02], 0x1F 46 | .for_device: 47 | mov word [rsp + 0x04], 0x07 48 | .for_func: 49 | ; Bus 50 | movzx eax, byte [rsp + 0x00] 51 | shl eax, 5 52 | 53 | ; Device 54 | or al, byte [rsp + 0x02] 55 | shl eax, 3 56 | 57 | ; Function 58 | or al, byte [rsp + 0x04] 59 | shl eax, 8 60 | 61 | ; Enable bit 62 | or eax, 0x80000000 63 | 64 | ; Save the bus:device.func query into ebx 65 | mov ebx, eax 66 | 67 | ; Request the vendor ID and device ID 68 | mov dx, 0x0CF8 69 | out dx, eax 70 | mov dx, 0x0CFC 71 | in eax, dx 72 | 73 | ; If the vendor ID is 0xFFFF, then this bus:device.func does not exist 74 | cmp ax, 0xFFFF 75 | je short .next_device 76 | 77 | ; Query register 0x00, it's the one that contains the VID and DID 78 | mov eax, ebx 79 | mov dx, 0x0CF8 80 | out dx, eax 81 | mov dx, 0x0CFC 82 | in eax, dx 83 | 84 | cmp eax, 0x15288086 85 | jne short .next_device 86 | 87 | mov eax, ebx 88 | jmp short .ret 89 | 90 | .next_device: 91 | dec word [rsp + 0x04] 92 | jns short .for_func 93 | 94 | dec word [rsp + 0x02] 95 | jns short .for_device 96 | 97 | dec word [rsp + 0x00] 98 | jns short .for_bus 99 | 100 | xor rax, rax 101 | .ret: 102 | add rsp, 0x10 103 | pop rdx 104 | pop rbx 105 | ret 106 | 107 | x540_init: 108 | push rax 109 | push rcx 110 | push rdx 111 | 112 | call 
x540_fetch_pci 113 | test rax, rax 114 | jz short .fail 115 | 116 | or rax, 0x10 ; BAR0 117 | mov dx, 0x0CF8 118 | out dx, eax 119 | mov dx, 0x0CFC 120 | in eax, dx 121 | and eax, ~0xF 122 | 123 | mov qword [fs:globals.x540_mmio_base], rax 124 | mov edx, eax 125 | 126 | ; Disable all interrupts by writing Fs to the Extended Interrupt Mask 127 | ; Clear Register (EIMC) 128 | mov dword [rdx + 0x0888], 0x7fffffff 129 | 130 | ; Do a device reset by setting RST in the CTRL register 131 | or dword [rdx + 0x0000], (1 << 26) 132 | 133 | ; Wait for the reset to complete by waiting for the RST bit to clear as 134 | ; well as then wait 10ms after the reset is complete. According to 135 | ; documentation this sleep is suggested for a smooth initialization 136 | .wait_for_reset: 137 | test dword [rdx + 0x0000], (1 << 26) 138 | jnz short .wait_for_reset 139 | mov ecx, 20000 140 | call rdtsc_sleep 141 | 142 | ; Disable all interrupts by writing Fs to the Extended Interrupt Mask 143 | ; Clear Register (EIMC) 144 | mov dword [rdx + 0x0888], 0x7fffffff 145 | 146 | ; Enable jumbo frames up to 9018 bytes (including CRC) 147 | bts dword [rdx + 0x4240], 2 ; HLREG0 148 | mov dword [rdx + 0x4268], (9018 << 16) ; MAXFRS 149 | 150 | ; Enable transmit path DMA 151 | or dword [rdx + 0x4a80], (1 << 0) 152 | 153 | ; Set the RX control register 154 | ; Enable the following: 155 | ; UPE - Unicast promiscuous enable 156 | ; MPE - Multicast promiscuous enable 157 | ; BAM - Accept broadcast packets 158 | mov dword [rdx + 0x5080], ((1 << 9) | (1 << 8) | (1 << 10)) 159 | 160 | ; Enable snooping globally to prevent cache incoherency 161 | or dword [rdx + 0x0018], (1 << 16) 162 | 163 | .fail: 164 | pop rdx 165 | pop rcx 166 | pop rax 167 | ret 168 | 169 | %define X540_NUM_RX 1024 170 | 171 | x540_init_local_rx: 172 | push rax 173 | push rbx 174 | push rcx 175 | push rdx 176 | push rbp 177 | push r8 178 | 179 | ; Calculate the offset for this cores filter. 
180 | imul r8, qword [gs:thread_local.core_id], 0x4 181 | 182 | ; Calculate the source and dest port for the filter. Bits 15:0 are the 183 | ; source port, and bits 31:16 are the dest port. Both stored as big endian. 184 | mov eax, dword [gs:thread_local.core_id] 185 | add eax, X540_BASE_PORT 186 | xchg al, ah ; Byte swap 187 | mov ebx, eax 188 | shl ebx, 16 189 | or ebx, eax ; Or the value in twice shifted 16 so both source and dest 190 | ; are filled in. 191 | 192 | mov rdx, qword [fs:globals.x540_mmio_base] 193 | mov dword [rdx + 0xE000 + r8], 0x9608400a ; Source filter 10.64.8.150 194 | mov dword [rdx + 0xE200 + r8], 0x9808400a ; Dest filter 10.64.8.152 195 | mov dword [rdx + 0xE400 + r8], ebx ; Source/dest port filter. 196 | ; Both source and dest are the 197 | ; core id. 198 | 199 | ; Set up which rx queue is associated with this filter. 200 | mov eax, dword [gs:thread_local.core_id] 201 | shl eax, 21 202 | mov [rdx + 0xE800 + r8], eax 203 | 204 | ; Filter control: 205 | ; Filter using 4 tuples (source IP, dest IP, dest port, protocol) 206 | ; UDP protocol 207 | ; Highest priority 208 | ; Enable filter 209 | ; Do not use pool field. 210 | mov dword [rdx + 0xE600 + r8], (1 << 0) | (7 << 2) | (1 << 31) | (1 << 30) | (1 << 27) 211 | 212 | ; Calculate the offset for this cores ring descriptors. 
213 | imul r8, qword [gs:thread_local.core_id], 0x40 214 | 215 | ; Allocate ring descriptor space 216 | mov rcx, (16 * X540_NUM_RX) 217 | call mm_alloc_contig_phys 218 | 219 | mov qword [gs:thread_local.x540_rx_ring_base], rax 220 | 221 | ; rbp - Base address 222 | ; rdx - Base address pointer 223 | ; rcx - Counter 224 | mov rbp, rax 225 | mov rdx, rax 226 | xor rcx, rcx 227 | .setup_rx: 228 | ; Allocate 12k for each RX descriptor 229 | push rcx 230 | mov rcx, (12 * 1024) 231 | call mm_alloc_contig_phys 232 | pop rcx 233 | 234 | mov qword [rdx + 0x00], rax ; Address 235 | mov qword [rdx + 0x08], 0 ; Status 236 | 237 | add rdx, 16 238 | inc rcx 239 | cmp rcx, X540_NUM_RX 240 | jl short .setup_rx 241 | 242 | mov rdx, qword [fs:globals.x540_mmio_base] 243 | 244 | ; Set up the SRRCTL 245 | ; 10KB buffer size 246 | ; 256 byte header buffer (default) 247 | ; Use legacy descriptor 248 | ; Drop packets when queue is full 249 | mov dword [rdx + 0x01014 + r8], (10 << 0) | (4 << 8) | (1 << 28) 250 | 251 | ; Set up the high and low parts of the address 252 | mov rax, rbp 253 | shr rax, 32 254 | mov dword [rdx + 0x01004 + r8], eax ; RDBAH0 255 | mov dword [rdx + 0x01000 + r8], ebp ; RDBAL0 256 | 257 | ; Set up the length of the receive ring buffer 258 | mov dword [rdx + 0x01008 + r8], (X540_NUM_RX * 16) ; RDLEN0 259 | 260 | ; Store the current read position 261 | mov qword [gs:thread_local.x540_rx_head], 0 262 | 263 | ; Enable the ring and poll until it becomes enabled 264 | or dword [rdx + 0x01028 + r8], (1 << 25) 265 | .lewp: 266 | bt dword [rdx + 0x01028 + r8], 25 267 | jnc short .lewp 268 | 269 | ; Bump the tail descriptor 270 | mov dword [rdx + 0x01018 + r8], (X540_NUM_RX - 1) ; RDT (tail pointer) 271 | 272 | ; Enable RX 273 | mov dword [rdx + 0x3000], (1 << 0) 274 | 275 | pop r8 276 | pop rbp 277 | pop rdx 278 | pop rcx 279 | pop rbx 280 | pop rax 281 | ret 282 | 283 | ; This must be a power of two!
284 | %define X540_NUM_TX 1024 285 | 286 | x540_init_local_tx: 287 | push rax 288 | push rbx 289 | push rcx 290 | push rdx 291 | push rbp 292 | push r8 293 | 294 | ; Calculate the offset for this cores ring descriptors. 295 | imul r8, qword [gs:thread_local.core_id], 0x40 296 | 297 | ; Allocate ring descriptor space 298 | mov rcx, (16 * X540_NUM_TX) 299 | call mm_alloc_contig_phys 300 | 301 | mov qword [gs:thread_local.x540_tx_ring_base], rax 302 | 303 | ; rbp - Base address 304 | ; rdx - Base address pointer 305 | ; rcx - Counter 306 | mov rbp, rax 307 | mov rdx, rax 308 | xor rcx, rcx 309 | .setup_tx: 310 | push rax 311 | 312 | push rcx 313 | mov rcx, (12 * 1024) 314 | call mm_alloc_contig_phys 315 | pop rcx 316 | 317 | mov qword [rdx + 0x00], rax ; Address 318 | mov qword [rdx + 0x08], 0 ; Status 319 | bts qword [rdx + 0x08], 32 ; Set DD bit in status to signify this is avail 320 | ; for use. 321 | 322 | pop rax 323 | 324 | add rdx, 16 325 | inc rcx 326 | cmp rcx, X540_NUM_TX 327 | jl short .setup_tx 328 | 329 | mov rdx, qword [fs:globals.x540_mmio_base] 330 | 331 | ; Set up the high and low parts of the address 332 | mov rax, rbp 333 | shr rax, 32 334 | mov dword [rdx + 0x6004 + r8], eax ; TDBAH0 335 | mov dword [rdx + 0x6000 + r8], ebp ; TDBAL0 336 | 337 | ; Set up the length of the TX ring buffer 338 | mov dword [rdx + 0x6008 + r8], (X540_NUM_TX * 16) ; TDLEN0 339 | 340 | ; Enable transmit queue 341 | or dword [rdx + 0x6028 + r8], (1 << 25) 342 | 343 | ; Set up the tail pointer 344 | mov dword [rdx + 0x6018 + r8], 0 ; Tail 345 | 346 | mov qword [gs:thread_local.x540_tx_tail], 0 347 | mov qword [gs:thread_local.cur_port], 0 348 | 349 | pop r8 350 | pop rbp 351 | pop rdx 352 | pop rcx 353 | pop rbx 354 | pop rax 355 | ret 356 | 357 | struc x540_tx_desc 358 | .address: resq 1 359 | .length: resw 1 360 | .cso: resb 1 361 | .cmd: resb 1 362 | .sta: resb 1 363 | .css: resb 1 364 | .special: resw 1 365 | endstruc 366 | 367 | x540_send_packets: 368 | push rax 369 | 
push rbx 370 | push rcx 371 | push rdx 372 | 373 | ; Calculate the offset for this cores ring descriptors. 374 | imul rdx, qword [gs:thread_local.core_id], 0x40 375 | 376 | mov rcx, qword [fs:globals.x540_mmio_base] 377 | 378 | mov eax, dword [rcx + 0x6018 + rdx] 379 | inc eax 380 | and eax, (X540_NUM_TX - 1) 381 | mov [rcx + 0x6018 + rdx], eax 382 | 383 | pop rdx 384 | pop rcx 385 | pop rbx 386 | pop rax 387 | ret 388 | 389 | ; This must be initialized before we relocate and after such is read only 390 | ; and must be accessed relative 391 | udp_template_10g: 392 | .eth: 393 | .dest: db 0xa0, 0x36, 0x9f, 0x55, 0x00, 0x5e ; worky.fast.gl.lan 394 | .src: db 0xa0, 0x36, 0x9f, 0x55, 0x04, 0x00 ; archivey.fast.gl.lan 395 | .type: db 0x08, 0x00 ; IP 396 | 397 | .ip: 398 | .ver: db 0x45 399 | .svc: db 0x00 400 | .len: db 0x05, 0xd0 ; 28 + payload_len 401 | .ident: db 0x0e, 0x5f 402 | .flags: db 0x00 403 | .frag: db 0x00 404 | .ttl: db 0x80 405 | .proto: db 0x11 ; UDP 406 | .chk: db 0x00, 0x00 407 | .srcip: db 0x0a, 0x40, 0x08, 0x98 ; 10.64.8.152 408 | .destip: db 0x0a, 0x40, 0x08, 0x96 ; 10.64.8.150 409 | 410 | .udp: 411 | .src_port: db 0x41, 0x00 ; 0x4100 412 | .dest_port: db 0x41, 0x00 ; 0x4100 413 | .ulen: db 0x00, 0x00 ; 8 + payload_len 414 | .chksum: db 0x00, 0x00 415 | .end: 416 | 417 | udp_template_10g_len: equ (udp_template_10g.end - udp_template_10g) 418 | 419 | ; rbx -> Packet (virtual address) 420 | ; rcx -> Length 421 | x540_send_packet: 422 | push rax 423 | push rbx 424 | push rcx 425 | push rdx 426 | push r14 427 | push r15 428 | 429 | ; Fetch the next available descriptor 430 | push rbx 431 | mov rdx, qword [gs:thread_local.x540_tx_ring_base] 432 | mov rbx, 1 433 | xadd qword [gs:thread_local.x540_tx_tail], rbx 434 | and rbx, (X540_NUM_TX - 1) 435 | shl rbx, 4 436 | 437 | ; Wait for the DD flag to be set, signifying this can be reused. 
438 | .lewp: 439 | test byte [rdx + rbx + x540_tx_desc.sta], 1 440 | jz short .lewp 441 | 442 | lea rax, [rdx + rbx] 443 | pop rbx 444 | 445 | mov r14, rax 446 | mov r15, [r14] 447 | 448 | ; Copy the udp header to the payload 449 | push rdi 450 | push rsi 451 | push rcx 452 | mov rcx, udp_template_10g_len 453 | mov rdi, r15 454 | lea rsi, [rel udp_template_10g] 455 | rep movsb 456 | pop rcx 457 | pop rsi 458 | pop rdi 459 | 460 | ; Copy the packet to the payload 461 | push rdi 462 | push rsi 463 | push rcx 464 | lea rdi, [r15 + udp_template_10g_len] 465 | mov rsi, rbx 466 | rep movsb 467 | pop rcx 468 | pop rsi 469 | pop rdi 470 | 471 | ; Place in this cores port 472 | mov eax, X540_BASE_PORT 473 | xchg al, ah 474 | mov [r15 + (udp_template_10g.dest_port - udp_template_10g)], ax 475 | 476 | ; Place in the core ID as the source port so we can track who reported 477 | ; what. 478 | mov eax, dword [gs:thread_local.core_id] 479 | xchg al, ah 480 | mov [r15 + (udp_template_10g.src_port - udp_template_10g)], ax 481 | 482 | mov rdx, rcx ; IP len 483 | mov rax, rcx ; UDP len 484 | 485 | add dx, 28 486 | xchg dl, dh ; byte swap 487 | mov [r15 + (udp_template_10g.len - udp_template_10g)], dx 488 | 489 | add ax, 8 490 | xchg al, ah ; byte swap 491 | mov [r15 + (udp_template_10g.ulen - udp_template_10g)], ax 492 | 493 | call update_ipv4_checksum 494 | 495 | ; Update to send the packet, and add the header length 496 | add rcx, udp_template_10g_len 497 | 498 | mov word [r14 + x540_tx_desc.length], cx 499 | mov byte [r14 + x540_tx_desc.cmd], ((1 << 3) | 3) 500 | mov byte [r14 + x540_tx_desc.sta], 0 501 | call x540_send_packets 502 | 503 | pop r15 504 | pop r14 505 | pop rdx 506 | pop rcx 507 | pop rbx 508 | pop rax 509 | ret 510 | 511 | ; rsi <- Packet contents (zero if packet not present) 512 | ; rbp <- Packet size 513 | x540_probe_rx_udp: 514 | push rax 515 | push rdx 516 | 517 | ; Get the rx entry 518 | mov rdx, qword [gs:thread_local.x540_rx_ring_base] 519 | mov eax, dword 
[gs:thread_local.x540_rx_head] 520 | shl eax, 4 521 | 522 | ; Zero out the return value to indicate no packet. 523 | xor esi, esi 524 | 525 | ; Check if there is a packet, if there isn't return immediately 526 | test dword [rdx + rax + 8 + 4], 1 527 | jz short .done 528 | 529 | ; Get the packet 530 | call x540_poll_rx_raw 531 | call x540_validate_udp 532 | jnc short .done 533 | 534 | ; We got a packet, but it wasn't a valid UDP packet. Return zero. 535 | xor esi, esi 536 | 537 | ; Discard the packet 538 | call x540_rx_advance 539 | 540 | .done: 541 | pop rdx 542 | pop rax 543 | ret 544 | 545 | ; rsi -> Pointer to raw packet to validate 546 | ; rbp -> Length of raw packet to validate 547 | ; rsi <- Pointer to UDP contents 548 | ; rbp <- Length of UDP contents 549 | ; CF <- Set if the packet is not a valid UDP packet 550 | x540_validate_udp: 551 | push rax 552 | 553 | ; If this packet is < the size of a UDP packet, 554 | sub ebp, udp_template_10g_len 555 | jb short .invalid 556 | 557 | ; Compute the UDP length 558 | movzx eax, word [rsi + (udp_template_10g.ulen - udp_template_10g)] 559 | xchg al, ah 560 | 561 | ; UDP length must be at least 8 bytes 562 | sub ax, 8 563 | jb short .invalid 564 | 565 | ; Make sure that the UDP payload length fits in the actual packet length 566 | cmp ebp, eax 567 | jb short .invalid 568 | 569 | ; Return the pointer to the packet contents 570 | add rsi, udp_template_10g_len 571 | 572 | ; Return the UDP length - 8 (the actual packet length) 573 | mov ebp, eax 574 | 575 | clc 576 | jmp short .end 577 | 578 | .invalid: 579 | stc 580 | .end: 581 | pop rax 582 | ret 583 | 584 | ; Poll until the next valid UDP packet comes in over the network. When it does 585 | ; return a pointer to the contents of the UDP packet in rsi, and the length 586 | ; in rbp. 
587 | ; 588 | ; rsi <- Pointer to UDP packet contents 589 | ; rbp <- Size of UDP packet contents in bytes 590 | x540_poll_rx_udp: 591 | push rcx 592 | 593 | xor ecx, ecx 594 | call x540_poll_rx_int 595 | 596 | pop rcx 597 | ret 598 | 599 | ; Poll until the next packet comes in over the network. Return pointer to raw 600 | ; packet in rsi and length in rbp. 601 | ; 602 | ; rsi <- Pointer to raw packet 603 | ; rbp <- Size of raw packet in bytes 604 | x540_poll_rx_raw: 605 | push rcx 606 | 607 | mov ecx, 1 608 | call x540_poll_rx_int 609 | 610 | pop rcx 611 | ret 612 | 613 | ; rcx -> If zero, poll until first UDP packet. Else poll until first packet. 614 | ; UDP mode - Return first valid UDP packet. rsi points to the UDP 615 | ; contents and rbp is the length of the contents. 616 | ; Raw mode - Return first packet. rsi points to the raw packet contents, 617 | ; rbp is the length of the raw packet. 618 | ; rsi <- Pointer to packet contents 619 | ; rbp <- Packet size in bytes 620 | x540_poll_rx_int: 621 | push rax 622 | push rbx 623 | push rdx 624 | 625 | jmp short .first_try 626 | 627 | .try_again: 628 | ; If we're retrying, we need to put the last packet back up for use. 
629 | call x540_rx_advance 630 | 631 | .first_try: 632 | ; Get the rx entry 633 | mov rdx, qword [gs:thread_local.x540_rx_ring_base] 634 | mov eax, dword [gs:thread_local.x540_rx_head] 635 | shl eax, 4 636 | 637 | .lewp: 638 | ; Wait until a packet is present here by polling the DD bit 639 | test dword [rdx + rax + 8 + 4], 1 640 | jz short .lewp 641 | 642 | ; Increment the number of packets this core has received 643 | inc qword [gs:thread_local.x540_packets_rx] 644 | 645 | mov rsi, qword [rdx + rax + 0] ; pointer to packet contents 646 | movzx ebp, word [rdx + rax + 8] ; packet length 647 | 648 | %if 0 649 | push rdx 650 | push rdi 651 | call per_core_screen 652 | mov rdx, qword [gs:thread_local.x540_packets_rx] 653 | call outhexq 654 | pop rdi 655 | pop rdx 656 | %endif 657 | 658 | ; Enable these next few lines to report all rxed packets over the network 659 | ;push rbx 660 | ;push rcx 661 | ;mov rbx, rsi 662 | ;mov rcx, rbp 663 | ;call x540_send_packet 664 | ;pop rcx 665 | ;pop rbx 666 | 667 | ; Check if we're in UDP or raw mode 668 | test rcx, rcx 669 | jnz short .end 670 | 671 | ; If we're in UDP mode, check if this is a valid UDP packet. If it is not 672 | ; wait for another packet. 673 | call x540_validate_udp 674 | jc short .try_again 675 | 676 | .end: 677 | pop rdx 678 | pop rbx 679 | pop rax 680 | ret 681 | 682 | ; Increment the rx head. This will put the last packet we read back up for 683 | ; use. 684 | x540_rx_advance: 685 | push rax 686 | push rbx 687 | push rdx 688 | 689 | ; Calculate the offset for this cores ring descriptors. 
690 | mov ebx, dword [gs:thread_local.core_id] 691 | shl ebx, 6 692 | 693 | ; Get the rx entry 694 | mov rdx, qword [gs:thread_local.x540_rx_ring_base] 695 | mov eax, dword [gs:thread_local.x540_rx_head] 696 | shl eax, 4 697 | 698 | ; Put the packet we just read back up for storage 699 | mov qword [rdx + rax + 8], 0 ; Clear out the flags and status 700 | 701 | mov eax, dword [gs:thread_local.x540_rx_head] 702 | mov rdx, qword [fs:globals.x540_mmio_base] 703 | mov dword [rdx + 0x01018 + rbx], eax ; Tail 704 | 705 | ; Update our internal head 706 | inc eax 707 | and eax, (X540_NUM_RX - 1) 708 | mov dword [gs:thread_local.x540_rx_head], eax 709 | 710 | pop rdx 711 | pop rbx 712 | pop rax 713 | ret 714 | 715 | -------------------------------------------------------------------------------- /srcs/os/win32.asm: -------------------------------------------------------------------------------- 1 | [bits 64] 2 | 3 | struc modlist 4 | ; module is loaded at [base, end] 5 | .base: resq 1 6 | .end: resq 1 7 | 8 | .hash: resq 1 9 | 10 | .namelen: resq 1 ; Name length in bytes 11 | .name: resb 512 ; utf16 name of module 12 | endstruc 13 | 14 | win32_construct_modlist: 15 | push rax 16 | push rbx 17 | push rcx 18 | push rdx 19 | push rdi 20 | push rsi 21 | push rbp 22 | 23 | ; If we have already initialized the modlist, do nothing 24 | cmp qword [gs:thread_local.modlist_count], 0 25 | jne .end 26 | 27 | ; Deref teb->ProcessEnvironmentBlock 28 | mov rdx, [rax + VMCB.gs_base] 29 | add rdx, 0x60 30 | call mm_read_guest_qword 31 | 32 | ; Deref peb->Ldr (struct _PEB_LDR_DATA) 33 | add rdx, 0x18 34 | call mm_read_guest_qword 35 | 36 | ; Deref ldr->InLoadOrderLinks (struct _LDR_DATA_TABLE_ENTRY) 37 | add rdx, 0x10 38 | 39 | ; Fetch the blink and store it, we stop iterating when we hit this 40 | push rdx 41 | add rdx, 8 42 | call mm_read_guest_qword 43 | mov rbp, rdx 44 | pop rdx 45 | 46 | .for_each_module: 47 | ; Deref the _LDR_DATA_TABLE_ENTRY to move to the next entry. 
48 | call mm_read_guest_qword 49 | 50 | ; Test if it's the end of the list 51 | test rdx, rdx 52 | jz .end 53 | cmp rdx, rbp 54 | je .end 55 | 56 | ; Allocate a new modlist entry 57 | mov rbx, modlist_size 58 | bamp_alloc rbx 59 | 60 | ; Look up the base address 61 | push rdx 62 | add rdx, 0x30 63 | call mm_read_guest_qword 64 | test rdx, rdx 65 | jz .for_each_module 66 | mov [rbx + modlist.base], rdx 67 | mov [rbx + modlist.end], rdx 68 | pop rdx 69 | 70 | ; Look up the image size 71 | push rdx 72 | add rdx, 0x40 73 | call mm_read_guest_qword 74 | test rdx, rdx 75 | jz .for_each_module 76 | add qword [rbx + modlist.end], rdx 77 | dec qword [rbx + modlist.end] 78 | pop rdx 79 | 80 | ; Look up the image dll name size 81 | push rdx 82 | add rdx, 0x58 83 | call mm_read_guest_qword 84 | and edx, 0xffff 85 | cmp edx, 512 86 | ja .for_each_module 87 | mov [rbx + modlist.namelen], rdx 88 | pop rdx 89 | 90 | ; Fetch the image dll name 91 | push rdx 92 | add rdx, 0x60 93 | call mm_read_guest_qword 94 | mov rsi, rdx 95 | lea rdi, [rbx + modlist.name] 96 | mov rcx, [rbx + modlist.namelen] 97 | call mm_copy_from_guest_vm_vmcb 98 | pop rdx 99 | 100 | ; Fetch the BaseNameHashValue 101 | push rdx 102 | add rdx, 0x108 103 | call mm_read_guest_qword 104 | 105 | ; Since this value is a dword, shift it left by 32. We then use the bottom 106 | ; 32 bits (which are now zero) to add our relative offset to calculate our 107 | ; relative hash.
	shl rdx, 32
	mov [rbx + modlist.hash], rdx
	pop rdx

	; Add this new DLL to the modlist
	mov rdi, qword [gs:thread_local.modlist_count]
	mov rsi, qword [gs:thread_local.gs_base]
	lea rsi, [rsi + thread_local.modlist + rdi*8]
	mov qword [rsi], rbx
	inc qword [gs:thread_local.modlist_count]

	jmp .for_each_module

.end:
	pop rbp
	pop rsi
	pop rdi
	pop rdx
	pop rcx
	pop rbx
	pop rax
	ret

; rbx -> RIP to resolve
; xmm5 <- Symhash to use
win32_symhash:
	push rdi
	push rsi
	push rbp

	call win32_resolve_symbol
	test rbp, rbp
	jz short .trash

	mov rbp, qword [rbp + modlist.hash]

	pinsrq xmm5, rbp, 0
	pinsrq xmm5, rbx, 1
	aesenc xmm5, xmm5
	aesenc xmm5, xmm5
	aesenc xmm5, xmm5
	aesenc xmm5, xmm5

	jmp short .end

.trash:
	xorps xmm5, xmm5
.end:
	pop rbp
	pop rsi
	pop rdi
	ret

; rbx -> RIP to resolve
; rbx <- Module offset or original RIP
; rbp <- Pointer to modlist entry. If this is zero, symbol could not be
;        resolved and thus rbx is left unchanged.
win32_resolve_symbol:
	push rcx
	push rdx

	cmp qword [gs:thread_local.modlist_count], 0
	je short .end

	xor rcx, rcx
	mov rdx, qword [gs:thread_local.gs_base]
	lea rdx, [rdx + thread_local.modlist]
.lewp:
	mov rbp, qword [rdx + rcx*8]

	cmp rbx, qword [rbp + modlist.base]
	jb short .next

	cmp rbx, qword [rbp + modlist.end]
	ja short .next

	; We resolved the symbol!
	sub rbx, qword [rbp + modlist.base]
	jmp short .end_found

.next:
	inc rcx
	cmp rcx, qword [gs:thread_local.modlist_count]
	jb short .lewp

.end:
	xor rbp, rbp
.end_found:
	pop rdx
	pop rcx
	ret


--------------------------------------------------------------------------------
/srcs/time/time.asm:
--------------------------------------------------------------------------------
[bits 64]

init_pit:
	push rax

	; Set the PIT mode: channel 0, lobyte/hibyte access, mode 2 (rate
	; generator), binary counting
	mov al, 0x34
	out 0x43, al

	; Lower bits
	mov al, 0xFF
	out 0x40, al

	; Upper bits
	mov al, 0xFF
	out 0x40, al

	pop rax
	ret

; ecx -> Hardware P-state [0, 7] (not checked for validity!)
; rax <- Frequency of the corresponding P-state in MHz
amd_fam15h_hw_freq:
	push rcx
	push rdx

	; Compute the P-state MSR which corresponds to this hardware P-state
	add ecx, 0xc0010064
	rdmsr
	bextr ecx, eax, 0x0306 ; CpuDid, divisor ID
	and eax, 0x3f          ; CpuFid, frequency multiplier

	; CoreCOF in MHz = ((CpuFid + 0x10) / (2^CpuDid)) * 100

	; Compute (CpuFid + 0x10) * 100 MHz
	add eax, 0x10
	imul rax, 100

	; Divide by 2^CpuDid
	shr rax, cl

	pop rdx
	pop rcx
	ret

; rax <- Current frequency in MHz
amd_fam15h_cur_freq:
	push rcx
	push rdx

	; Get the COFVID status, which contains the current hardware P-state
	mov ecx, 0xc0010071
	rdmsr
	bextr ecx, eax, 0x0310

	call amd_fam15h_hw_freq

	pop rdx
	pop rcx
	ret

; rax <- Software P0 frequency in MHz
amd_fam15h_sw_p0_freq:
	push rcx

	call amd_fam15h_fetch_pcie_mmio

	; Bus 0, Device 18, Function 4
	; Get the number of boost states, this will be the hardware P-state that
	; corresponds to software P0
	bextr ecx, [rdx + 0x15c + ((0 << 20) | (0x18 << 15) | (0x4 << 12))], 0x0302

	call amd_fam15h_hw_freq

	pop rcx
	ret

rdtsc64:
	push rdx

	rdtsc
	shl rdx, 32
	or rax, rdx

	pop rdx
	ret

; We use this function to calculate what the value of rdtsc will be x
; microseconds in the future. This way we don't have to do a rdtsc and div
; by calling rdtsc_uptime, we just rdtsc.
; rax -> Target number of microseconds to wait
; rax <- Value rdtsc will be after the specified number of microseconds
rdtsc_future:
	push rcx
	push rdx

	; Get number of cycles in n microseconds
	xor rdx, rdx
	mul qword [fs:globals.rdtsc_freq]
	mov rcx, rax

	; Get current time
	rdtsc
	shl rdx, 32
	or rdx, rax

	; Add the current time and the delay together
	lea rax, [rcx + rdx]

	pop rdx
	pop rcx
	ret

; rax <- Number of microseconds since rdtsc init
rdtsc_uptime:
	push rdx

	rdtsc
	shl rdx, 32
	or rax, rdx

	xor rdx, rdx
	div qword [fs:globals.rdtsc_freq]

	pop rdx
	ret

; rcx -> Number of microseconds to sleep
rdtsc_sleep:
	push rax
	push rcx
	push rdx

	; Get number of timestamp ticks that corresponds to this request
	imul rcx, qword [fs:globals.rdtsc_freq]

	; Get the target timestamp counter that we're done sleeping at
	rdtsc
	shl rdx, 32
	or rdx, rax
	add rcx, rdx

.wait_rdtsc:
	rdtsc
	shl rdx, 32
	or rdx, rax
	cmp rdx, rcx
	jb short .wait_rdtsc

	pop rdx
	pop rcx
	pop rax
	ret


--------------------------------------------------------------------------------
/srcs/vm/snapshot.asm:
--------------------------------------------------------------------------------
[bits 64]

; This is called in a raw vm state.
; You must save the actual GPRs here.
save_vm_snapshot:
	push rax
	push rbx
	push rcx
	push rdx
	push rsi
	push rdi
	push rbp
	push r8
	push r9
	push r10
	push r11
	push r12
	push r13
	push r14
	push r15

	push rax
	push rbx

	; Use the old snapshot memory
	mov rbx, qword [fs:globals.vm_snapshot]
	test rbx, rbx
	jnz short .create_snapshot

	; Allocate room in virtual memory space for the biggest possible snapshot
	mov rbx, VM_MEMORY_SIZE + (VM_MAX_PAGES * 8) + vm_snapshot_size
	add rbx, 0xFFF
	and rbx, ~0xFFF
	lock xadd qword [fs:globals.next_free_vaddr], rbx
	mov qword [fs:globals.vm_snapshot_mem], rbx

	; Allocate room for our vm snapshot
	mov rbx, vm_snapshot_size
	bamp_alloc rbx
	mov qword [fs:globals.vm_snapshot], rbx

.create_snapshot:
	mov [rbx + vm_snapshot.rcx], rcx
	mov [rbx + vm_snapshot.rdx], rdx
	mov [rbx + vm_snapshot.rsi], rsi
	mov [rbx + vm_snapshot.rdi], rdi
	mov [rbx + vm_snapshot.rbp], rbp
	mov [rbx + vm_snapshot.r8], r8
	mov [rbx + vm_snapshot.r9], r9
	mov [rbx + vm_snapshot.r10], r10
	mov [rbx + vm_snapshot.r11], r11
	mov [rbx + vm_snapshot.r12], r12
	mov [rbx + vm_snapshot.r13], r13
	mov [rbx + vm_snapshot.r14], r14
	mov [rbx + vm_snapshot.r15], r15

	mov rcx, rbx
	pop rbx

	; rcx now points to the vm_snapshot
	mov [rcx + vm_snapshot.rbx], rbx
	mov rbx, cr8
	mov [rcx + vm_snapshot.cr8], rbx

	; Save off xcr0
	push rcx
	mov ecx, 0
	xgetbv
	pop rcx
	shl rdx, 32
	or rdx, rax
	mov [rcx + vm_snapshot.xcr0], rdx

	; Save off debug registers
	mov rbx, dr0
	mov [rcx + vm_snapshot.dr0], rbx
	mov rbx, dr1
	mov [rcx + vm_snapshot.dr1], rbx
	mov rbx, dr2
	mov [rcx + vm_snapshot.dr2], rbx
	mov rbx, dr3
	mov [rcx + vm_snapshot.dr3], rbx

	; Save off
	; the physical memory size of this system
	mov rbx, VM_MEMORY_SIZE
	mov qword [rcx + vm_snapshot.pmem_size], rbx

	; Set up xcr0 to save all state (FPU, MMX, SSE, AVX, and LWP)
	mov edx, 0x40000000
	mov eax, 0x00000007
	push rcx
	mov ecx, 0
	xsetbv
	pop rcx
	xsave [rcx + vm_snapshot.xsave]

	pop rax ; Restore vmcb

	; Save off vmcb
	push rcx
	mov rsi, rax
	lea rdi, [rcx + vm_snapshot.vmcb]
	mov rcx, 4096 / 8
	rep movsq
	pop rcx

	; Map vm_snapshot_mem[0:0x3000] -> vm_snapshot[0:0x3000]
	push rax
	lea rbx, [rcx + 0x0000]
	call bamp_get_phys
	lea rbp, [rax + 3]
	mov rbx, qword [fs:globals.vm_snapshot_mem]
	lea rbx, [rbx + 0x0000]
	mov rdx, cr3
	call mm_map_4k

	lea rbx, [rcx + 0x1000]
	call bamp_get_phys
	lea rbp, [rax + 3]
	mov rbx, qword [fs:globals.vm_snapshot_mem]
	lea rbx, [rbx + 0x1000]
	mov rdx, cr3
	call mm_map_4k

	lea rbx, [rcx + 0x2000]
	call bamp_get_phys
	lea rbp, [rax + 3]
	mov rbx, qword [fs:globals.vm_snapshot_mem]
	lea rbx, [rbx + 0x2000]
	mov rdx, cr3
	call mm_map_4k
	pop rax

	; Allocate the zero page we return for MMIO addresses
	push rax
	call alloc_zero_4k
	mov r11, rax
	pop rax

	mov rbx, 0
	mov r10, VM_MEMORY_SIZE
.lewp:
	call probe_memory_dest
	test rdx, rdx
	jz short .is_ram

	push rbx
	lea rbp, [r11 + 3]
	add rbx, qword [fs:globals.vm_snapshot_mem]
	add rbx, 0x3000
	mov rdx, cr3
	call mm_map_4k
	pop rbx

	jmp short .next_page

.is_ram:
	push rax
	push rbx
	mov rdx, [rax + VMCB.n_cr3]
	call mm_get_phys

	lea rbp, [rax + 3]
	add rbx, qword [fs:globals.vm_snapshot_mem]
	add rbx, 0x3000
	mov rdx, cr3
	call mm_map_4k
	pop rbx
	pop rax

.next_page:
	add rbx, 4096
	cmp rbx, r10
	jb short .lewp

	mov r10, qword [fs:globals.vm_snapshot_mem]
	mov r11, (vm_snapshot_size + VM_MEMORY_SIZE)
	call falktp_send

.resume:
	pop r15
	pop r14
	pop r13
	pop r12
	pop r11
	pop r10
	pop r9
	pop r8
	pop rbp
	pop rdi
	pop rsi
	pop rdx
	pop rcx
	pop rbx
	pop rax

	jmp launch_svm.inject_debug


--------------------------------------------------------------------------------
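As a rough illustration of the contiguous view that `save_vm_snapshot` in `srcs/vm/snapshot.asm` builds before calling `falktp_send`, the sketch below models the page-mapping loop in Python, the language of the repo's helper scripts. It is not part of the repository: `snapshot_view`, `is_mmio`, and `demo` are hypothetical names, `is_mmio` stands in for the `probe_memory_dest` check, and `HEADER_SIZE` is an assumption taken from the three 4 KiB header pages the assembly maps at the start of `vm_snapshot_mem`.

```python
PAGE_SIZE = 0x1000
# Assumption: the vm_snapshot header occupies the first three 4 KiB pages of
# the view, matching the three mm_map_4k calls at offsets 0x0000..0x2000.
HEADER_SIZE = 0x3000


def snapshot_view(vm_memory_size, is_mmio):
    """Yield (view_offset, source) for every 4 KiB guest-physical page.

    source is "ram" when the page is backed by guest memory and "zero" when
    the shared zero page is mapped in its place (the MMIO case).
    """
    for gpa in range(0, vm_memory_size, PAGE_SIZE):
        source = "zero" if is_mmio(gpa) else "ram"
        # Guest-physical address gpa appears at HEADER_SIZE + gpa in the view.
        yield HEADER_SIZE + gpa, source


def demo():
    # Hypothetical four-page guest whose third page is an MMIO hole.
    return list(snapshot_view(4 * PAGE_SIZE, lambda gpa: gpa == 0x2000))
```

With this hypothetical four-page guest, `demo()` places guest RAM at view offsets 0x3000, 0x4000, and 0x6000 and the zero page at 0x5000, which is why a single linear send of the header plus `VM_MEMORY_SIZE` bytes can cover the whole snapshot.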