├── .gitignore ├── LICENSE.md ├── README.md ├── example_pictures ├── aarch64_user.jpg ├── aarch64_w_or_x.jpg ├── aarch64_x.jpg ├── cache_list.jpg ├── x86_64_exec_filter.jpg └── x86_64_user_space.jpg ├── pt.py ├── pt ├── __init__.py ├── machine.py ├── pt.py ├── pt_aarch64_definitions.py ├── pt_aarch64_parse.py ├── pt_arch_backend.py ├── pt_common.py ├── pt_constants.py ├── pt_register.py ├── pt_riscv64_parse.py ├── pt_x86_64_definitions.py ├── pt_x86_64_parse.py └── pt_x86_msr.py ├── pt_gdb ├── __init__.py └── pt_gdb.py ├── pt_host.py ├── pt_host ├── pt_host_read_cr3.bcc └── pt_host_read_physmem.bcc ├── pyproject.toml └── tests └── integration_tests ├── Dockerfile.package ├── Dockerfile.runtests ├── Makefile ├── build_package.sh ├── common.sh ├── custom_kernels ├── .gitignore ├── arm │ └── 64_bit │ │ ├── Makefile │ │ ├── boot.asm │ │ ├── entry.c │ │ └── linker.ld ├── common │ └── common.h └── x86 │ ├── 64_bit │ ├── Makefile │ ├── boot.asm │ ├── entry.c │ └── linker.ld │ └── common │ └── common_x86.h ├── pt_utils.py ├── run_integration_tests.py ├── run_tests.sh └── vm_utils.py /.gitignore: -------------------------------------------------------------------------------- 1 | *.swp 2 | __pycache__ 3 | -------------------------------------------------------------------------------- /LICENSE.md: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2021 Martin Radev 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # gdb-pt-dump 2 | 3 | `gdb-pt-dump` is a gdb script to enhance debugging of a QEMU-based virtual machine. 4 | 5 | The repository also includes `pt_host` which is a BPF program that allows for examining the page tables of a Linux process. 6 | 7 | ## Supported architectures 8 | 9 | Supported architectures: `x86-64`, `x86-32`, `aarch64`, `riscv64`. 10 | 11 | ## Features 12 | 13 | * Dumping a page table from a specific guest physical address. 14 | * Merging semantically-similar contiguous memory. 15 | * Provide detailed architectural information: writeable, executable, U/S, cacheable, write-back, XN, PXN, etc 16 | * Cache collected information for future filtering and printing. 17 | * Filter page table information via page attributes (x, w, u, s, ...) 
and virtual addresses (before, after, between) 18 | * Search memory very fast using `/proc/QEMU_PID/mem`. Search for string, u8, u4 19 | Search is applied after filter. 20 | * Filter-out search results by address alignment. Useful for filtering-out SLAB allocations. 21 | * Try to determine KASLR information by examining the address space. 22 | * Find virtual memory aliases. 23 | * Dump host page tables via `pt_host.py` 24 | 25 | ## How to use 26 | 27 | The script is standalone. 28 | 29 | For now, do `source PATH_TO_PT_DUMP/pt.py`. 30 | 31 | For details, just do `help pt` in gdb. 32 | 33 | ## Examples 34 | 35 | ![x86_64: Only user space pages](example_pictures/x86_64_user_space.jpg "x86_64 only user space") 36 | 37 | ![x86_64: Only executable pages](example_pictures/x86_64_exec_filter.jpg "x86_64 only executable") 38 | 39 | ![aarch64: User space accessible pages](example_pictures/aarch64_user.jpg "Aarch64 user space accessible") 40 | 41 | ![aarch64: write or executable pages](example_pictures/aarch64_w_or_x.jpg "Aarch64 write or executable pages") 42 | 43 | ![aarch64: only executable pages](example_pictures/aarch64_x.jpg "Aarch64 only executable pages") 44 | 45 | ![Saved page tables](example_pictures/cache_list.jpg "Show saved page tables") 46 | 47 | ## Possible issues 48 | 49 | Old QEMU versions seem to not provide access to privileged registers like cr3. 50 | Thus, the page table address would need to be retrieved in some other way. 51 | 52 | -------------------------------------------------------------------------------- /example_pictures/aarch64_user.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/martinradev/gdb-pt-dump/9df79f3931bbe1549051b30d9c65e5b250d7a9a0/example_pictures/aarch64_user.jpg -------------------------------------------------------------------------------- /example_pictures/aarch64_w_or_x.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/martinradev/gdb-pt-dump/9df79f3931bbe1549051b30d9c65e5b250d7a9a0/example_pictures/aarch64_w_or_x.jpg -------------------------------------------------------------------------------- /example_pictures/aarch64_x.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/martinradev/gdb-pt-dump/9df79f3931bbe1549051b30d9c65e5b250d7a9a0/example_pictures/aarch64_x.jpg -------------------------------------------------------------------------------- /example_pictures/cache_list.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/martinradev/gdb-pt-dump/9df79f3931bbe1549051b30d9c65e5b250d7a9a0/example_pictures/cache_list.jpg -------------------------------------------------------------------------------- /example_pictures/x86_64_exec_filter.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/martinradev/gdb-pt-dump/9df79f3931bbe1549051b30d9c65e5b250d7a9a0/example_pictures/x86_64_exec_filter.jpg -------------------------------------------------------------------------------- /example_pictures/x86_64_user_space.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/martinradev/gdb-pt-dump/9df79f3931bbe1549051b30d9c65e5b250d7a9a0/example_pictures/x86_64_user_space.jpg -------------------------------------------------------------------------------- 
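
The README's "How to use" section above only says to `source` the script and run `help pt`. As a hedged illustration of what a session might look like (the flag names come from the argument parser in `pt/pt.py` below; the paths and addresses are made up for the example):

```
(gdb) source /path/to/gdb-pt-dump/pt.py
(gdb) help pt                       # full option reference
(gdb) pt                            # dump and merge the current page table
(gdb) pt -filter w x                # only pages that are both writeable and executable
(gdb) pt -filter u -after 0xffff000000000000
(gdb) pt -ss "Linux version"        # string search over the (filtered) mappings
(gdb) pt -walk 0xffffffff81000000   # print the individual translation stages for one VA
(gdb) pt -save                      # cache the parsed tables; `pt -list` shows the cache
```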
/pt.py: -------------------------------------------------------------------------------- 1 | import sys 2 | import os 3 | 4 | # A hack to import the other files without placing the files in the modules directory. 5 | dirname = os.path.dirname(os.path.abspath(__file__)) 6 | sys.path.insert(1, dirname) 7 | 8 | from pt_gdb import PageTableDumpGdbFrontend 9 | 10 | PageTableDumpGdbFrontend() 11 | -------------------------------------------------------------------------------- /pt/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/martinradev/gdb-pt-dump/9df79f3931bbe1549051b30d9c65e5b250d7a9a0/pt/__init__.py -------------------------------------------------------------------------------- /pt/machine.py: -------------------------------------------------------------------------------- 1 | from abc import ABC 2 | from abc import abstractmethod 3 | 4 | class Machine(ABC): 5 | 6 | def __init__(self): 7 | pass 8 | 9 | @abstractmethod 10 | def read_register(self, register_name): 11 | raise Exception("Unimplemented") 12 | 13 | @abstractmethod 14 | def read_physical_memory(self, physical_address, length): 15 | raise Exception("Unimplemented") 16 | 17 | -------------------------------------------------------------------------------- /pt/pt.py: -------------------------------------------------------------------------------- 1 | import sys 2 | import argparse 3 | import os 4 | import subprocess 5 | import tempfile 6 | import string 7 | import random 8 | import traceback 9 | 10 | from pt.pt_common import * 11 | 12 | class PageTableDump(): 13 | 14 | def __init__(self, machine_backend, arch_backend, needs_pid = False): 15 | self.machine_backend = machine_backend 16 | self.arch_backend = arch_backend 17 | 18 | self.parser = argparse.ArgumentParser() 19 | self.parser.add_argument("-save", action="store_true") 20 | self.parser.add_argument("-list", action="store_true") 21 | self.parser.add_argument("-clear", action="store_true") 22 | self.parser.add_argument("-ss", nargs='+', type=lambda s: str(s)) 23 | self.parser.add_argument("-sb", nargs='+', type=lambda s: b"".join([int(s[u:u+2], 16).to_bytes(1, 'little') for u in range(0, len(s), 2)])) 24 | self.parser.add_argument("-s8", nargs='+', type=lambda s: int(s, 0)) 25 | self.parser.add_argument("-s4", nargs='+', type=lambda s: int(s, 0)) 26 | self.parser.add_argument("-range", nargs=2, type=lambda s: int(s, 0)) 27 | self.parser.add_argument("-after", nargs=1, type=lambda s: int(s, 0)) 28 | self.parser.add_argument("-before", nargs=1, type=lambda s: int(s, 0)) 29 | self.parser.add_argument("-has", nargs=1, type=lambda s: int(s, 0)) 30 | self.parser.add_argument("-align", nargs='+', type=lambda s: int(s, 0)) 31 | self.parser.add_argument("-kaslr", action="store_true") 32 | self.parser.add_argument("-kaslr_leaks", action="store_true") 33 | self.parser.add_argument("-info", action="store_true") 34 | self.parser.add_argument("-walk", nargs=1, type=lambda s: int(s, 0)) 35 | self.parser.add_argument("-phys_verbose", action="store_true") 36 | self.parser.add_argument("-filter", nargs="+") 37 | self.parser.add_argument("-o", nargs=1) 38 | self.parser.add_argument("-find_alias", action="store_true") 39 | self.parser.add_argument("-force_traverse_all", action="store_true") 40 | self.parser.add_argument("-read_phys", nargs=2, help="hex physical address and length") 41 | self.parser.add_argument("-read_virt", nargs=2, help="hex virtual address and length") 42 | 43 | if needs_pid: 44 | 
self.parser.add_argument("-pid", type=int, required = True) 45 | 46 | if self.arch_backend.get_arch() == "x86_64" or self.arch_backend.get_arch() == "x86_32": 47 | self.parser.add_argument("-cr3", nargs=1) 48 | 49 | if self.arch_backend.get_arch() == "aarch64": 50 | self.parser.add_argument("-ttbr0_el1", nargs=1) 51 | self.parser.add_argument("-ttbr1_el1", nargs=1) 52 | 53 | if self.arch_backend.get_arch() == "riscv64": 54 | self.parser.add_argument("-satp", nargs=1) 55 | 56 | self.cache = dict() 57 | 58 | def print_cache(self): 59 | print("Cache:") 60 | for address in self.cache: 61 | print(f"\t{hex(address)}") 62 | 63 | def handle_command_wrapper(self, argv): 64 | args = None 65 | try: 66 | args = self.parser.parse_args(argv) 67 | except: 68 | return None 69 | 70 | saved_stdout = None 71 | if args.o: 72 | saved_stdout = sys.stdout 73 | sys.stdout = open(args.o[0], "w+") 74 | 75 | try: 76 | self.handle_command(args) 77 | except Exception as e: 78 | print(f"Exception: {str(e)}") 79 | print(f"Stack trace:\n{traceback.format_exc()}") 80 | finally: 81 | if saved_stdout: 82 | sys.stdout.close() 83 | sys.stdout = saved_stdout 84 | 85 | def read_virt_memory(self, virt_ranges, virt_addr, len): 86 | phys_blocks = [] 87 | for r in virt_ranges: 88 | if r.va + r.page_size <= virt_addr: 89 | continue 90 | if r.va >= virt_addr + len: 91 | break 92 | 93 | acc = 0 94 | for (phys_addr, phys_extent) in zip(r.phys, r.sizes): 95 | if r.va + acc >= virt_addr + len: 96 | break 97 | acc += phys_extent 98 | if r.va + acc < virt_addr: 99 | continue 100 | start_delta = max(virt_addr - (acc + r.va - phys_extent), 0) 101 | phys_addr_fixed = phys_addr + start_delta 102 | phys_extent_fixed = min(virt_addr + len + phys_extent - (r.va + acc), phys_extent) - start_delta 103 | phys_blocks.append((phys_addr_fixed, phys_extent_fixed)) 104 | 105 | remaining = len 106 | data = b"" 107 | for (pa, extent) in phys_blocks: 108 | to_read = min(remaining, extent) 109 | data += self.machine_backend.read_physical_memory(pa, to_read) 110 | remaining -= to_read 111 | if remaining == 0: 112 | break 113 | return data 114 | 115 | def handle_command(self, args): 116 | if args.list: 117 | self.print_cache() 118 | return 119 | 120 | if args.clear: 121 | self.cache = dict() 122 | return 123 | 124 | to_search = None 125 | to_search_num = 0x100000000 126 | if args.ss: 127 | to_search = args.ss[0].encode("ascii") 128 | if len(args.ss) > 1: 129 | to_search_num = int(args.ss[1], 0) 130 | if args.sb: 131 | to_search = args.sb[0] 132 | if len(args.sb) > 1: 133 | to_search_num = int.from_bytes(args.sb[1], 'little') 134 | elif args.s8: 135 | to_search = args.s8[0].to_bytes(8, 'little') 136 | if len(args.s8) > 1: 137 | to_search_num = int(args.s8[1], 0) 138 | elif args.s4: 139 | to_search = args.s4[0].to_bytes(4, 'little') 140 | if len(args.s4) > 1: 141 | to_search_num = int(args.s4[1], 0) 142 | 143 | requires_page_table_parsing = True 144 | if args.info: 145 | requires_page_table_parsing = False 146 | 147 | if args.walk: 148 | requires_page_table_parsing = False 149 | 150 | if args.read_phys: 151 | requires_page_table_parsing = False 152 | 153 | page_ranges = None 154 | page_ranges_filtered = None 155 | if requires_page_table_parsing: 156 | page_ranges = self.arch_backend.parse_tables(self.cache, args) 157 | compound_filter, (min_address, max_address) = self.parse_filter_args(args) 158 | page_ranges_filtered = list(filter(compound_filter, page_ranges)) 159 | # Perform cut-off of start and end. 
160 | # Only the first and last page entry need to be potentially modified because they were already filtered 161 | if len(page_ranges_filtered) >= 1: 162 | if min_address: 163 | page_ranges_filtered[0].cut_after(min_address) 164 | if max_address: 165 | page_ranges_filtered[-1].cut_before(max_address) 166 | 167 | 168 | if to_search: 169 | if page_ranges_filtered: 170 | aligned_to = args.align[0] if args.align else 1 171 | aligned_offset = args.align[1] if args.align and len(args.align) == 2 else 0 172 | search_results = search_memory(self.machine_backend, page_ranges_filtered, to_search, to_search_num, aligned_to, aligned_offset) 173 | for entry in search_results: 174 | print("Found at " + hex(entry[0]) + " in " + entry[1].to_string(args.phys_verbose)) 175 | else: 176 | print("Not found") 177 | elif args.walk: 178 | walk = self.arch_backend.walk(args.walk[0]) 179 | print(walk) 180 | elif args.kaslr: 181 | self.arch_backend.print_kaslr_information(page_ranges) 182 | elif args.kaslr_leaks: 183 | def inner_find_leaks(x, off): 184 | top = (x >> (off * 8)).to_bytes(8 - off, 'little') 185 | num_entries = 10 186 | entries = search_memory(self.machine_backend, page_ranges_filtered, top, num_entries, 1, 0) 187 | if entries: 188 | print(f"Search for {hex(x)}") 189 | for entry in entries: 190 | print("Found at " + hex(entry[0] - off) + " in " + entry[1].to_string(args.phys_verbose)) 191 | leaks = self.arch_backend.print_kaslr_information(page_ranges, False) 192 | if leaks: 193 | inner_find_leaks(leaks[0], 3) 194 | inner_find_leaks(leaks[1], 5) 195 | elif args.info: 196 | self.arch_backend.print_stats() 197 | elif args.find_alias: 198 | find_aliases(page_ranges, args.phys_verbose) 199 | elif args.read_phys: 200 | phys_addr = int(args.read_phys[0], 0) 201 | phys_len = int(args.read_phys[1], 0) 202 | sys.stdout.buffer.write(self.machine_backend.read_physical_memory(phys_addr, phys_len)) 203 | elif args.read_virt: 204 | virt_addr = int(args.read_virt[0], 0) 205 | virt_len = int(args.read_virt[1], 0) 206 | data = self.read_virt_memory(page_ranges, virt_addr, virt_len) 207 | sys.stdout.buffer.write(data) 208 | else: 209 | self.arch_backend.print_table(page_ranges_filtered, phys_verbose=args.phys_verbose) 210 | 211 | def parse_filter_args(self, args): 212 | filters = [] 213 | min_address = 0 214 | max_address = 2 ** 64 215 | if args.range: 216 | filters.append(lambda page: page.va >= args.range[0] and page.va <= args.range[1]) 217 | min_address = max(args.range[0], min_address) 218 | max_address = min(args.range[1], max_address) 219 | 220 | if args.has: 221 | filters.append(lambda page: args.has[0] >= page.va and args.has[0] < page.va + page.page_size) 222 | 223 | if args.after: 224 | filters.append(lambda page: args.after[0] < page.va + page.page_size) 225 | min_address = max(args.after[0], min_address) 226 | else: 227 | min_address = None 228 | 229 | if args.before: 230 | filters.append(lambda page: args.before[0] > page.va) 231 | max_address = min(args.before[0], max_address) 232 | else: 233 | max_address = None 234 | 235 | if args.filter: 236 | # First, we have to determine if user/superuser filter flag was set 237 | # This is necessary at least for aarch64 where the AP bits provide many possibilities. 
238 | 239 | has_superuser_filter = False 240 | has_user_filter = False 241 | for f in args.filter: 242 | if f == "s": 243 | has_superuser_filter = True 244 | if f == "u": 245 | has_user_filter = True 246 | if not has_superuser_filter and not has_user_filter: 247 | has_superuser_filter = True 248 | has_user_filter = True 249 | for f in args.filter: 250 | if f == "w": 251 | filters.append(self.arch_backend.get_filter_is_writeable(has_superuser_filter, has_user_filter)) 252 | elif f == "_w": 253 | filters.append(self.arch_backend.get_filter_is_not_writeable(has_superuser_filter, has_user_filter)) 254 | elif f == "x": 255 | filters.append(self.arch_backend.get_filter_is_executable(has_superuser_filter, has_user_filter)) 256 | elif f == "_x": 257 | filters.append(self.arch_backend.get_filter_is_not_executable(has_superuser_filter, has_user_filter)) 258 | elif f == "w|x" or f == "x|w": 259 | filters.append(self.arch_backend.get_filter_is_writeable_or_executable(has_superuser_filter, has_user_filter)) 260 | elif f == "u": 261 | filters.append(self.arch_backend.get_filter_is_user_page(has_superuser_filter, has_user_filter)) 262 | elif f == "s": 263 | filters.append(self.arch_backend.get_filter_is_superuser_page(has_superuser_filter, has_user_filter)) 264 | elif f == "ro": 265 | filters.append(self.arch_backend.get_filter_is_read_only_page(has_superuser_filter, has_user_filter)) 266 | elif f in ["wb", "_wb", "uc", "_uc"]: 267 | filters.append(self.arch_backend.get_filter_architecture_specific(f, has_superuser_filter, has_user_filter)) 268 | else: 269 | print(f"Unknown filter: {f}") 270 | return 271 | 272 | return (create_compound_filter(filters), (min_address, max_address)) 273 | 274 | -------------------------------------------------------------------------------- /pt/pt_aarch64_definitions.py: -------------------------------------------------------------------------------- 1 | from pt.pt_register import * 2 | 3 | # Used the `Armv8, for Armv8-A architecture profile` manual. 4 | # I hope this doesn't break any license. Please don't sue :( 5 | 6 | class PT_TCR(PT_Register): 7 | def __init__(self, machine): 8 | super(PT_TCR, self).__init__(machine, "TCR_EL1", "Translation Control Register EL1 (TCR EL1)") 9 | self.add_range("T0SZ", 0, 5, lambda x: f"{x} bits are truncated. TTBR0_EL1 addresses {64 - x} bits.") 10 | self.add_range("EPD0", 7, 7, PT_Decipher_Meaning_Match( \ 11 | {0: "Perform translation table walk using TTBR0_EL1", \ 12 | 1: "A TLB miss on an address translated from TTBR0_EL1 generates a Translation fault. No translation table walk is performed."})) 13 | self.add_range("IRGN0", 8, 9, PT_Decipher_Meaning_Passthrough) 14 | self.add_range("ORGN0", 10, 11, PT_Decipher_Meaning_Passthrough) 15 | self.add_range("SH0", 12, 13, PT_Decipher_Meaning_Match( \ 16 | {0b00: "Non-shareable.", \ 17 | 0b01: "Reserved.", \ 18 | 0b10: "Outer Shareable.", \ 19 | 0b11: "Inner Shareable."})) 20 | self.add_range("TG0", 14, 15, PT_Decipher_Meaning_Match( \ 21 | {0b00: "4 KiB TTBR0_EL1 granule size", \ 22 | 0b01: "64 KiB TTBR0_EL1 granule size.", \ 23 | 0b10: "16 KiB TTBR0_EL1 granule size."})) 24 | self.add_range("T1SZ", 16, 21, lambda x: f"{x} bits are truncated. 
TTBR1_EL1 addresses {64 - x} bits.") 25 | self.add_range("A1", 22, 22, PT_Decipher_Meaning_Match( \ 26 | {0: "TTBR0_EL1.ASID defines the ASID.", \ 27 | 1: "TTBR1_EL1.ASID defines the ASID."})) 28 | self.add_range("EPD1", 23, 23, PT_Decipher_Meaning_Match( \ 29 | {0: "Perform translation table walk using TTBR1_EL1", \ 30 | 1: "A TLB miss on an address translated from TTBR1_EL1 generates a Translation fault. No translation table walk is performed."})) 31 | self.add_range("IRGN1", 24, 25, PT_Decipher_Meaning_Passthrough) 32 | self.add_range("ORGN1", 26, 27, PT_Decipher_Meaning_Passthrough) 33 | self.add_range("SH1", 28, 29, PT_Decipher_Meaning_Match( \ 34 | {0b00: "Non-shareable.", \ 35 | 0b01: "Reserved.", \ 36 | 0b10: "Outer Shareable.", \ 37 | 0b11: "Inner Shareable."})) 38 | self.add_range("TG1", 30, 31, PT_Decipher_Meaning_Match( \ 39 | {0b01: "16 KiB TTBR1_EL1 granule size", \ 40 | 0b10: "4 KiB TTBR1_EL1 granule size.", \ 41 | 0b11: "64 KiB TTBR1_EL1 granule size."})) 42 | self.add_range("IPS", 32, 34, PT_Decipher_Meaning_Match( \ 43 | {0b000: "32 bits, 4 GB.", \ 44 | 0b001: "36 bits, 64 GB.", \ 45 | 0b010: "40 bits, 1 TB.", \ 46 | 0b011: "42 bits, 4 TB.", \ 47 | 0b100: "44 bits, 16 TB.", \ 48 | 0b101: "48 bits, 256 TB.", \ 49 | 0b110: "52 bits, 4 PB."})) 50 | self.add_range("AS", 36, 36, PT_Decipher_Meaning_Match( \ 51 | {0: "8 bit - the upper 8 bits of TTBR0_EL1 and TTBR1_EL1 are ignored by hardware.", \ 52 | 1: "16 bit - the upper 16 bits of TTBR0_EL1 and TTBR1_EL1 are used for allocation and matching in the TLB."})) 53 | self.add_range("TBI0", 37, 37, PT_Decipher_Meaning_Match( \ 54 | {0: "Top Byte used in the address calculation.", 55 | 1: "Top Byte ignored in the address calculation."})) 56 | self.add_range("TBI1", 38, 38, PT_Decipher_Meaning_Match( \ 57 | {0: "Top Byte used in the address calculation.", 58 | 1: "Top Byte ignored in the address calculation."})) 59 | 60 | -------------------------------------------------------------------------------- /pt/pt_aarch64_parse.py: -------------------------------------------------------------------------------- 1 | from pt.pt_common import * 2 | from pt.pt_aarch64_definitions import * 3 | from pt.pt_arch_backend import PTArchBackend 4 | from pt.pt_constants import * 5 | from pt.machine import * 6 | 7 | import math 8 | 9 | PT_AARCH64_4KB_PAGE = PT_SIZE_4K 10 | PT_AARCH64_16KB_PAGE = PT_SIZE_16K 11 | PT_AARCH64_64KB_PAGE = PT_SIZE_64K 12 | 13 | def is_user_readable(block): 14 | return block.permissions == 0b11 or block.permissions == 0b01 15 | 16 | def is_kernel_readable(block): 17 | return True 18 | 19 | def is_user_writeable(block): 20 | return block.permissions == 0b01 21 | 22 | def is_kernel_writeable(block): 23 | return block.permissions == 0b01 or block.permissions == 0b00 24 | 25 | def is_user_executable(block): 26 | return (not block.xn) 27 | 28 | def is_kernel_executable(block): 29 | return not block.pxn 30 | 31 | class Aarch64_Block(CommonPage): 32 | def __init__(self, va, phys, size, xn, pxn, permissions): 33 | self.va = va 34 | self.page_size = size 35 | self.xn = xn 36 | self.pxn = pxn 37 | self.permissions = permissions 38 | self.phys = [phys] 39 | self.sizes = [size] 40 | 41 | def to_string(self, phys_verbose): 42 | varying_str = None 43 | if phys_verbose: 44 | fmt = f"{{:>{PrintConfig.va_len}}} : {{:>{PrintConfig.page_size_len}}} : {{:>{PrintConfig.phys_len}}} " 45 | varying_str = fmt.format(hex(self.va), hex(self.page_size), hex(self.phys[0])) 46 | else: 47 | fmt = f"{{:>{PrintConfig.va_len}}} : 
{{:>{PrintConfig.page_size_len}}} " 48 | varying_str = fmt.format(hex(self.va), hex(self.page_size)) 49 | 50 | uspace_writeable = is_user_writeable(self) 51 | kspace_writeable = is_kernel_writeable(self) 52 | uspace_readable = is_user_readable(self) 53 | kspace_readable = is_kernel_readable(self) 54 | uspace_executable = is_user_executable(self) 55 | kspace_executable = is_kernel_executable(self) 56 | delim = bcolors.YELLOW + " " + bcolors.ENDC 57 | uspace_color = select_color(uspace_writeable, uspace_executable, uspace_readable) 58 | uspace_str = uspace_color + f" R:{int(uspace_readable)} W:{int(uspace_writeable)} X:{int(uspace_executable)} " + bcolors.ENDC 59 | kspace_color = select_color(kspace_writeable, kspace_executable, kspace_readable) 60 | kspace_str = kspace_color + f" R:{int(kspace_readable)} W:{int(kspace_writeable)} X:{int(kspace_executable)} " + bcolors.ENDC 61 | s = f"{varying_str}" + delim + uspace_str + delim + kspace_str 62 | return s 63 | 64 | def pwndbg_is_writeable(self): 65 | return is_user_writeable(self) or is_kernel_writeable(self) 66 | 67 | def pwndbg_is_executable(self): 68 | return is_user_executable(self) or is_kernel_executable(self) 69 | 70 | class Aarch64_Table(): 71 | def __init__(self, pa, va, pxn, xn, permissions): 72 | self.va = va 73 | self.pa = pa 74 | self.permissions = permissions 75 | self.pxn = pxn 76 | self.xn = xn 77 | 78 | def aarch64_semantically_similar(p1: Aarch64_Block, p2: Aarch64_Block) -> bool: 79 | return p1.xn == p2.xn and p1.pxn == p2.pxn and p1.permissions == p2.permissions 80 | 81 | class PT_Aarch64_Backend(PTArchBackend): 82 | def __init__(self, machine): 83 | self.machine = machine 84 | self.init_registers() 85 | 86 | def init_registers(self): 87 | self.pt_tcr = PT_TCR(self.machine) 88 | 89 | def print_stats(self): 90 | print(self.pt_tcr.check()) 91 | 92 | def get_arch(self): 93 | return "aarch64" 94 | 95 | def walk(self, va): 96 | pt_walk = PageTableWalkInfo(va) 97 | 98 | # Use canonical form to determine which page table to use. 
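        # Bit 63 of the canonical VA selects the base register: addresses with the
        # top bit clear are translated through TTBR0_EL1 (typically user space),
        # addresses with it set are translated through TTBR1_EL1 (typically kernel).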
99 | top_bit_index = 63 100 | granule_size = None 101 | as_limit = None 102 | 103 | table_addr = None 104 | if va & (1 << top_bit_index) == 0: 105 | # top bit is not set, so this is a userspace address 106 | granule_size = self.determine_ttbr0_granule_size() 107 | as_limit = self.determine_ttbr0_address_space_limit() 108 | table_addr = extract_no_shift(self.get_ttbr0_el1(), 0, 47) 109 | pt_walk.add_register_stage("TTBR0_EL1", table_addr) 110 | else: 111 | granule_size = self.determine_ttbr1_granule_size() 112 | as_limit = self.determine_ttbr1_address_space_limit() 113 | table_addr = extract_no_shift(self.get_ttbr1_el1(), 0, 47) 114 | pt_walk.add_register_stage("TTBR1_EL1", table_addr) 115 | 116 | entry_size = 8 117 | bits_per_stage = int(math.log2(granule_size / entry_size)) 118 | start_index = int(math.log2(granule_size)) 119 | ranges = reversed([(base, base + bits_per_stage - 1) for base in range(start_index, as_limit, bits_per_stage)]) 120 | 121 | for (index, r) in enumerate(ranges): 122 | page_pa = table_addr & ~0xfff 123 | entry_index = extract(va, r[0], r[1]) 124 | entry_page_pa = page_pa + entry_index * entry_size 125 | entry_value = int.from_bytes(self.machine.read_physical_memory(entry_page_pa, entry_size), 'little') 126 | entry_value_pa = extract_no_shift(entry_value, 0, 47) 127 | entry_value_pa_no_meta = extract_no_shift(entry_value, 12, 47) 128 | meta_bits = extract_no_shift(entry_value, 0, 11) 129 | 130 | pt_walk.add_stage(f"Level{index}", entry_index, entry_value_pa_no_meta, meta_bits) 131 | 132 | bit1and2 = extract(entry_value, 0, 1) 133 | is_valid = bit1and2 != 0 134 | if not is_valid: 135 | pt_walk.set_faulted() 136 | break 137 | 138 | is_block = bit1and2 == 0x1 139 | if is_block: 140 | break 141 | table_addr = entry_value_pa 142 | 143 | return pt_walk 144 | 145 | def get_filter_is_writeable(self, has_superuser_filter, has_user_filter): 146 | if has_superuser_filter == True and has_user_filter == False: 147 | return lambda p: is_kernel_writeable(p) 148 | elif has_superuser_filter == False and has_user_filter == True: 149 | return lambda p: is_user_writeable(p) 150 | else: 151 | return lambda p: is_kernel_writeable(p) or is_user_writeable(p) 152 | 153 | def get_filter_is_not_writeable(self, has_superuser_filter, has_user_filter): 154 | if has_superuser_filter == True and has_user_filter == False: 155 | return lambda p: not is_kernel_writeable(p) 156 | elif has_superuser_filter == False and has_user_filter == True: 157 | return lambda p: not is_user_writeable(p) 158 | else: 159 | return lambda p: not is_kernel_writeable(p) and not is_user_writeable(p) 160 | 161 | def get_filter_is_executable(self, has_superuser_filter, has_user_filter): 162 | if has_superuser_filter == True and has_user_filter == False: 163 | return lambda p: is_kernel_executable(p) 164 | elif has_superuser_filter == False and has_user_filter == True: 165 | return lambda p: is_user_executable(p) 166 | else: 167 | return lambda p: is_user_executable(p) or is_kernel_executable(p) 168 | 169 | def get_filter_is_not_executable(self, has_superuser_filter, has_user_filter): 170 | if has_superuser_filter == True and has_user_filter == False: 171 | return lambda p: not is_kernel_executable(p) 172 | elif has_superuser_filter == False and has_user_filter == True: 173 | return lambda p: not is_user_executable(p) 174 | else: 175 | return lambda p: not is_user_executable(p) and not is_kernel_executable(p) 176 | 177 | def get_filter_is_writeable_or_executable(self, has_superuser_filter, has_user_filter): 178 | if 
has_superuser_filter == True and has_user_filter == False: 179 | return lambda p: is_kernel_writeable(p) or is_kernel_executable(p) 180 | elif has_superuser_filter == False and has_user_filter == True: 181 | return lambda p: is_user_writeable(p) or is_user_executable(p) 182 | else: 183 | return lambda p: is_kernel_writeable(p) or is_kernel_executable(p) or \ 184 | is_user_writeable(p) or is_user_executable(p) 185 | 186 | def get_filter_is_user_page(self, has_superuser_filter, has_user_filter): 187 | return lambda p: is_user_writeable(p) or is_user_readable(p) or is_user_executable(p) 188 | 189 | def get_filter_is_superuser_page(self, has_superuser_filter, has_user_filter): 190 | return lambda p: is_kernel_writeable(p) or is_kernel_readable(p) or is_kernel_executable(p) 191 | 192 | def get_filter_is_read_only_page(self, has_superuser_filter, has_user_filter): 193 | l_kernel = lambda p: (not is_kernel_writeable(p) and not is_kernel_executable(p)) and is_kernel_readable(p) 194 | l_user = lambda p: (not is_user_writeable(p) and not is_user_executable(p)) and is_user_readable(p) 195 | if has_superuser_filter == True and has_user_filter == False: 196 | return l_kernel 197 | elif has_superuser_filter == False and has_user_filter == True: 198 | return l_user 199 | else: 200 | return lambda p: l_user(p) or l_kernel(p) 201 | 202 | def get_filter_architecture_specific(self, filter_name, has_superuser_filter, has_user_filter): 203 | raise exception(f"Uknown filter {filter_name}") 204 | 205 | def get_ttbr0_el1(self): 206 | return self.machine.read_register("$TTBR0_EL1") 207 | 208 | def get_ttbr1_el1(self): 209 | return self.machine.read_register("$TTBR1_EL1") 210 | 211 | def determine_ttbr0_granule_size(self): 212 | tb0_granule_size = None 213 | tg0 = self.pt_tcr.TG0 214 | if tg0 == 0b00: 215 | tb0_granule_size = PT_AARCH64_4KB_PAGE 216 | elif tg0 == 0b01: 217 | tb0_granule_size = PT_AARCH64_64KB_PAGE 218 | elif tg0 == 0b10: 219 | tb0_granule_size = PT_AARCH64_16KB_PAGE 220 | else: 221 | raise Exception(f"Unknown TG0 value {tg0}") 222 | 223 | return tb0_granule_size 224 | 225 | def determine_ttbr1_granule_size(self): 226 | tb1_granule_size = None 227 | tg1 = self.pt_tcr.TG1 228 | if tg1 == 0b10: 229 | tb1_granule_size = PT_AARCH64_4KB_PAGE 230 | elif tg1 == 0b11: 231 | tb1_granule_size = PT_AARCH64_64KB_PAGE 232 | elif tg1 == 0b01: 233 | tb1_granule_size = PT_AARCH64_16KB_PAGE 234 | else: 235 | raise Exception(f"Unknown TG1 value {tg1}") 236 | 237 | return tb1_granule_size 238 | 239 | def determine_ttbr0_address_space_limit(self): 240 | return 64 - self.pt_tcr.T0SZ 241 | 242 | def determine_ttbr1_address_space_limit(self): 243 | return 64 - self.pt_tcr.T1SZ 244 | 245 | def aarch64_parse_entries(self, tbl, level_range, as_size, granule, is_last_level): 246 | # lvl starts from one to be in sync with the armv7 docs 247 | entries = [] 248 | start_bit = int(math.log2(granule)) 249 | 250 | try: 251 | entries = split_range_into_int_values(read_arbitrary_page(self.machine, tbl.pa, granule), 8) 252 | except Exception: 253 | pass 254 | 255 | tables = [] 256 | blocks = [] 257 | for i, pa in enumerate(entries): 258 | is_valid = bool(pa & 0x1) 259 | if is_valid: 260 | bit1 = extract(pa, 1, 1) 261 | bit1and2 = extract(pa, 0, 1) 262 | contiguous_bit = extract(pa, 52, 52) 263 | is_block_or_page = (bit1and2 == 1) or is_last_level 264 | is_table = (not is_block_or_page) 265 | address_contrib = (i << level_range[0]) 266 | child_va = tbl.va | address_contrib 267 | if is_table: 268 | next_level_address = 
extract_no_shift(pa, start_bit, 47) 269 | permissions = extract(pa, 61, 62) 270 | xn = (extract(pa, 60, 60) == 0x1) | tbl.xn 271 | pxn = extract(pa, 59, 59) == 0x1 | tbl.pxn 272 | tables.append(Aarch64_Table(next_level_address, child_va, pxn, xn, permissions)) 273 | else: 274 | xn = (extract(pa, 54, 54) == 0x1) | tbl.xn 275 | pxn = (extract(pa, 53, 53) == 0x1) | tbl.pxn 276 | phys_addr = extract_no_shift(pa, start_bit, 47) 277 | permissions = extract(pa, 6, 7) 278 | size = (1 << level_range[0]) 279 | blocks.append(Aarch64_Block(child_va, phys_addr, size, xn, pxn, permissions)) 280 | 281 | return tables, blocks 282 | 283 | def arm_traverse_table(self, pt_addr, as_size, granule_size, leading_bit): 284 | num_entries_in_page = int(granule_size / 8) 285 | level_bit_coverage = int(math.log2(num_entries_in_page)) 286 | low_bit_inclusive = int(math.log2(granule_size)) 287 | top_bit_inclusive = as_size - 1 288 | 289 | table_ranges = list(reversed([(low, min(low + level_bit_coverage, top_bit_inclusive)) for low in range(low_bit_inclusive, top_bit_inclusive, level_bit_coverage)])) 290 | 291 | root = Aarch64_Table(pt_addr, 0, 0, 0, 0) 292 | tables = [root] 293 | all_blocks = [] 294 | for (level, address_range) in enumerate(table_ranges): 295 | is_last_level = (level + 1) == len(table_ranges) 296 | new_tables = [] 297 | for table in tables: 298 | cur_tables, cur_blocks = self.aarch64_parse_entries(table, address_range, as_size, granule_size, is_last_level) 299 | new_tables.extend(cur_tables) 300 | all_blocks.extend(cur_blocks) 301 | tables = new_tables 302 | 303 | if leading_bit == 1: 304 | for block in all_blocks: 305 | block.va = make_canonical(block.va | (1<{PrintConfig.va_len}}} : {{:>{PrintConfig.page_size_len}}} : {{:>{PrintConfig.phys_len}}} |" 385 | varying_str = fmt.format("Address", "Length", "Phys") 386 | else: 387 | fmt = f"{{:>{PrintConfig.va_len}}} : {{:>{PrintConfig.page_size_len}}} |" 388 | varying_str = fmt.format("Address", "Length") 389 | print(bcolors.BLUE + varying_str + " User space " + " | Kernel space " + bcolors.ENDC) 390 | for block in table: 391 | print(block.to_string(phys_verbose)) 392 | 393 | -------------------------------------------------------------------------------- /pt/pt_arch_backend.py: -------------------------------------------------------------------------------- 1 | from abc import ABC 2 | from abc import abstractmethod 3 | 4 | class PTArchBackend(ABC): 5 | 6 | @abstractmethod 7 | def get_arch(self): 8 | pass 9 | 10 | @abstractmethod 11 | def get_filter_is_writeable(self, has_superuser_filter, has_user_filter): 12 | pass 13 | 14 | @abstractmethod 15 | def get_filter_is_not_writeable(self, has_superuser_filter, has_user_filter): 16 | pass 17 | 18 | @abstractmethod 19 | def get_filter_is_executable(self, has_superuser_filter, has_user_filter): 20 | pass 21 | 22 | @abstractmethod 23 | def get_filter_is_not_executable(self, has_superuser_filter, has_user_filter): 24 | pass 25 | 26 | @abstractmethod 27 | def get_filter_is_writeable_or_executable(self, has_superuser_filter, has_user_filter): 28 | pass 29 | 30 | @abstractmethod 31 | def get_filter_is_user_page(self, has_superuser_filter, has_user_filter): 32 | pass 33 | 34 | @abstractmethod 35 | def get_filter_is_superuser_page(self, has_superuser_filter, has_user_filter): 36 | pass 37 | 38 | @abstractmethod 39 | def get_filter_is_read_only_page(self, has_superuser_filter, has_user_filter): 40 | pass 41 | 42 | @abstractmethod 43 | def get_filter_is_read_only_page(self, has_superuser_filter, has_user_filter): 44 | 
pass 45 | 46 | @abstractmethod 47 | def get_filter_architecture_specific(self, filter_name, has_superuser_filter, has_user_filter): 48 | pass 49 | 50 | @abstractmethod 51 | def parse_tables(self, cache, args): 52 | pass 53 | 54 | @abstractmethod 55 | def print_table(self, table, phys_verbose): 56 | pass 57 | 58 | @abstractmethod 59 | def print_kaslr_information(self, table, phys_verbose): 60 | pass 61 | 62 | @abstractmethod 63 | def print_stats(self): 64 | pass 65 | 66 | @abstractmethod 67 | def walk(self, va): 68 | pass 69 | 70 | -------------------------------------------------------------------------------- /pt/pt_common.py: -------------------------------------------------------------------------------- 1 | from collections import namedtuple 2 | import copy 3 | 4 | class bcolors: 5 | RED = '\033[41m' 6 | BLUE = '\033[44m' 7 | GREEN = '\033[42m' 8 | CYAN = '\033[106m' 9 | MAGENTA = '\033[45m' 10 | BLACK = '\033[40m' 11 | YELLOW = '\033[103m' 12 | LGREY = '\033[47m' 13 | ENDC = '\033[0m' 14 | 15 | def extract(value, s, e): 16 | return extract_no_shift(value, s, e) >> s 17 | 18 | def extract_no_shift(value, s, e): 19 | mask = ((1<<(e + 1))-1) & ~((1<> shift) & 0x1 43 | mask = ((((2**64)-1) >> shift) * bit) << shift 44 | return va | mask 45 | 46 | PagePrintSettings = namedtuple('PagePrintSettings', ['va_len', 'page_size_len', 'phys_len']) 47 | PrintConfig = PagePrintSettings(va_len = 18, page_size_len = 14, phys_len = 12) 48 | 49 | class CommonPage(): 50 | 51 | def cut_after(self, cut_addr): 52 | i = 0 53 | off = 0 54 | while i < len(self.phys): 55 | if cut_addr < self.va + off + self.sizes[i]: 56 | break 57 | off += self.sizes[i] 58 | i += 1 59 | if i > 0: 60 | self.phys = self.phys[i:] 61 | self.sizes = self.sizes[i:] 62 | delta = 0 63 | if len(self.phys) >= 1 and cut_addr >= self.va: 64 | delta = cut_addr - (self.va + off) 65 | self.sizes[0] = self.sizes[0] - delta 66 | self.phys[0] = self.phys[0] + delta 67 | self.page_size = self.page_size - delta - off 68 | self.va = max(self.va, cut_addr) 69 | 70 | def cut_before(self, cut_addr): 71 | i = len(self.phys) - 1 72 | off = 0 73 | while i >= 0: 74 | if self.va < cut_addr: 75 | break 76 | off += self.sizes[i] 77 | i -= 1 78 | if i > 0: 79 | self.phys = self.phys[:i] 80 | self.sizes = self.sizes[:i] 81 | delta = 0 82 | if len(self.phys) >= 1: 83 | delta = max(0, (self.va + self.page_size - off) - cut_addr) 84 | self.sizes[-1] = self.sizes[-1] - delta 85 | self.page_size = min(self.page_size, cut_addr - self.va) 86 | 87 | def read_memory(self, machine): 88 | memory = b"" 89 | for phys_range_start, phys_range_size in zip(self.phys, self.sizes): 90 | memory += machine.read_physical_memory(phys_range_start, phys_range_size) 91 | return memory 92 | 93 | class Page(CommonPage): 94 | def __init__(self): 95 | self.va = None 96 | self.page_size = None 97 | self.w = None 98 | self.x = None 99 | self.s = None 100 | self.wb = None 101 | self.uc = None 102 | self.phys = None 103 | self.sizes = None 104 | 105 | def pwndbg_is_writeable(self): 106 | return self.w 107 | 108 | def pwndbg_is_executable(self): 109 | return self.x 110 | 111 | def to_string(self, phys_verbose): 112 | prefix = "" 113 | if not self.s: 114 | prefix = bcolors.CYAN + " " + bcolors.ENDC 115 | elif self.s: 116 | prefix = bcolors.MAGENTA + " " + bcolors.ENDC 117 | 118 | varying_str = None 119 | if phys_verbose: 120 | fmt = f"{{:>{PrintConfig.va_len}}} : {{:>{PrintConfig.page_size_len}}} : {{:>{PrintConfig.phys_len}}}" 121 | varying_str = fmt.format(hex(self.va), hex(self.page_size), 
hex(self.phys[0])) 122 | else: 123 | fmt = f"{{:>{PrintConfig.va_len}}} : {{:>{PrintConfig.page_size_len}}}" 124 | varying_str = fmt.format(hex(self.va), hex(self.page_size)) 125 | 126 | s = f"{varying_str} | W:{int(self.w)} X:{int(self.x)} S:{int(self.s)} UC:{int(self.uc)} WB:{int(self.wb)}" 127 | 128 | res = "" 129 | if self.x and self.w: 130 | res = prefix + bcolors.BLUE + " " + s + bcolors.ENDC 131 | elif self.w and not self.x: 132 | res = prefix + bcolors.GREEN + " " + s + bcolors.ENDC 133 | elif self.x: 134 | res = prefix + bcolors.RED + " " + s + bcolors.ENDC 135 | else: 136 | res = prefix + " " + s 137 | 138 | return res 139 | 140 | def merge_cont_pages(pages, func_semantic_sim, require_physical_contiguity): 141 | if len(pages) <= 1: 142 | return pages 143 | 144 | # Here I am just going to abuse the Page structure to contain the range 145 | merged_pages = [] 146 | cur_page = copy.copy(pages[0]) 147 | for page in pages[1:]: 148 | 149 | merge_pages = True 150 | if not (cur_page.va + cur_page.page_size == page.va and func_semantic_sim(cur_page, page)): 151 | merge_pages = False 152 | 153 | if require_physical_contiguity and not (cur_page.phys[-1] + cur_page.sizes[-1] == page.phys[0]): 154 | merge_pages = False 155 | 156 | if merge_pages: 157 | cur_page.page_size = cur_page.page_size + page.page_size 158 | if cur_page.phys[-1] + cur_page.sizes[-1] == page.phys[0]: 159 | # Depending on the flag require_physical_contiguity, the extended ranges may or may not be physically contiguous 160 | assert(len(page.phys) == 1) 161 | assert(len(page.sizes) == 1) 162 | cur_page.sizes[-1] = cur_page.sizes[-1] + page.page_size 163 | else: 164 | # If not, then add a new entry 165 | cur_page.phys.extend(page.phys) 166 | cur_page.sizes.extend(page.sizes) 167 | else: 168 | merged_pages.append(cur_page) 169 | cur_page = copy.copy(page) 170 | merged_pages.append(cur_page) 171 | return merged_pages 172 | 173 | def optimize(gig_pages, mb_pages, kb_pages, func_semantic_sim, require_physical_contiguity): 174 | pages = sorted(gig_pages + mb_pages + kb_pages, key = lambda p: p.va) 175 | opt = merge_cont_pages(pages, func_semantic_sim, require_physical_contiguity) 176 | return opt 177 | 178 | def select_color(w, x, r): 179 | if x and w: 180 | return bcolors.BLUE 181 | if x: 182 | return bcolors.RED 183 | if w: 184 | return bcolors.GREEN 185 | if r: 186 | return bcolors.LGREY 187 | return bcolors.BLACK 188 | 189 | def create_compound_filter(filters): 190 | def apply_filters(p): 191 | res = True 192 | for func in filters: 193 | res = res and func(p) 194 | return res 195 | return apply_filters 196 | 197 | def search_memory(machine, page_ranges, to_search, to_search_num, aligned_to, aligned_offset): 198 | done_searching = False 199 | for range in page_ranges: 200 | if done_searching: 201 | break 202 | 203 | data = None 204 | try: 205 | data = range.read_memory(machine) 206 | except OSError: 207 | pass 208 | 209 | if data is not None: 210 | idx = 0 211 | while True: 212 | idx = data.find(to_search, idx) 213 | if idx != -1: 214 | if (idx - aligned_offset) % aligned_to == 0: 215 | yield (range.va + idx, range) 216 | to_search_num = to_search_num - 1 217 | if to_search_num == 0: 218 | done_searching = True 219 | break 220 | idx = idx + 1 221 | else: 222 | break 223 | return None 224 | 225 | def find_aliases(virtual_page_ranges, phys_verobse): 226 | # First collect the physical ranges, aka inverse virtual map 227 | phys_ranges = [] 228 | i = 0 229 | for range in virtual_page_ranges: 230 | virtual_page_range_base = range.va 
231 | off = 0 232 | for phys_range, phys_range_size in zip(range.phys, range.sizes): 233 | phys_ranges.append((phys_range, phys_range + phys_range_size, virtual_page_range_base + off)) 234 | off = off + phys_range_size 235 | 236 | # Sort the physical ranges 237 | phys_ranges = sorted(phys_ranges, key=lambda key: key[0]) 238 | 239 | # TODO 240 | # We could use bisect here to speed-up 241 | # The first loop can be simplified 242 | # The object copy is a hack 243 | # The check for previous occ is not elegant 244 | overlaps_dict = {} 245 | for range in virtual_page_ranges: 246 | base_va = range.va 247 | off = 0 248 | for phys_range, phys_range_size in zip(range.phys, range.sizes): 249 | phys_range_end = phys_range + phys_range_size 250 | for saved_range in phys_ranges: 251 | if saved_range[0] > phys_range_end: 252 | break 253 | beg = max(phys_range, saved_range[0]) 254 | end = min(phys_range_end, saved_range[1]) 255 | va = base_va + off + (beg - phys_range) 256 | if beg < end and va != saved_range[2]: 257 | key = (beg, end) 258 | # Make copy and clean-up 259 | range_copy = copy.copy(range) 260 | range_copy.phys = None 261 | range_copy.size = None 262 | range_copy.va = va 263 | range_copy.page_size = end - beg 264 | if key in overlaps_dict: 265 | found = False 266 | for tmp in overlaps_dict[key]: 267 | if tmp.va == va: 268 | found = True 269 | break 270 | if not found: 271 | overlaps_dict[key].append(range_copy) 272 | else: 273 | overlaps_dict[key] = [range_copy] 274 | off = off + phys_range_size 275 | 276 | # Print the found aliases 277 | for key in overlaps_dict.keys(): 278 | overlaps = overlaps_dict[key] 279 | if len(overlaps) > 1: 280 | print(f"Phys: {hex(key[0])} - {hex(key[1])}") 281 | overlap_len = key[1] - key[0] 282 | for overlap in overlaps: 283 | print(" " * 4 + overlap.to_string(phys_verbose)) 284 | print("") 285 | 286 | class PageTableWalkInfo(): 287 | 288 | def __init__(self, va): 289 | self.va = va 290 | self.faulted = False 291 | self.stages = [] 292 | 293 | def add_register_stage(self, register_name, register_value): 294 | self.base_register = (register_name, register_value) 295 | 296 | def add_stage(self, stage_str, table_index, entry_value_without_meta, meta_bits): 297 | self.stages.append((stage_str, table_index, entry_value_without_meta, meta_bits)) 298 | 299 | def set_faulted(self): 300 | self.faulted = True 301 | 302 | def __str__(self): 303 | s = "" 304 | 305 | s += f"Page table walk for VA = {hex(self.va)}\n" 306 | s += "-" * 43 + "\n" 307 | 308 | s += f"{self.base_register[0]} = {hex(self.base_register[1])}\n" 309 | 310 | for (stage_index, stage_entry) in enumerate(self.stages): 311 | stage_str, table_index, entry_value_without_meta, meta_bits = stage_entry 312 | stage_index = stage_index + 1 313 | mapping_string = " " * stage_index * 2 + f"{stage_str}[{table_index}] = {hex(entry_value_without_meta)}" 314 | flags_string = f"Flags 0x{meta_bits:03x}" 315 | s += mapping_string.ljust(34) + "| " + flags_string + "\n" 316 | 317 | if self.faulted: 318 | s += "\n!!! 
Last stage faulted !!!\n" 319 | 320 | return s 321 | -------------------------------------------------------------------------------- /pt/pt_constants.py: -------------------------------------------------------------------------------- 1 | 2 | PT_SIZE_4K = 4 * 1024 3 | PT_SIZE_16K = 16 * 1024 4 | PT_SIZE_64K = 64 * 1024 5 | PT_SIZE_1MIB = 1024 * 1024 6 | PT_SIZE_2MIB = 2 * PT_SIZE_1MIB 7 | PT_SIZE_32MIB = 32 * PT_SIZE_1MIB 8 | PT_SIZE_512MIB = 512 * PT_SIZE_1MIB 9 | PT_SIZE_1GIB = 1024 * 1024 * 1024 10 | PT_SIZE_64GIB = 64 * PT_SIZE_1GIB 11 | PT_SIZE_512GIB = 512 * PT_SIZE_1GIB 12 | PT_SIZE_1TB = 1024 * PT_SIZE_1GIB 13 | PT_SIZE_4TB = 4 * PT_SIZE_1TB 14 | PT_SIZE_128TB = 128 * PT_SIZE_1TB 15 | -------------------------------------------------------------------------------- /pt/pt_register.py: -------------------------------------------------------------------------------- 1 | from collections import namedtuple 2 | from pt.pt_common import * 3 | 4 | PT_Register_Range = namedtuple('PT_Register_Range', ['name', 'low', 'high', 'func']) 5 | 6 | class PT_Register_State: 7 | def __init__(self, short_name, name, kv): 8 | self.short_name = short_name 9 | self.name = name 10 | self.kv = kv 11 | 12 | def __str__(self): 13 | s = "" 14 | total = 148 15 | s += bcolors.BLUE + f"{self.short_name} ({self.name}):".ljust(total) + bcolors.ENDC + "\n" 16 | delim = "|" 17 | for key in self.kv: 18 | value, low, high, res = self.kv[key] 19 | s += f" {key}".ljust(10) + " (" + f"{low}".rjust(2) + ":" + f"{high}".rjust(2) + ") = " + hex(res).rjust(4) + " " + delim + f" {value} ".ljust(128) + "\n" 20 | s += "-" * total + "\n" 21 | return s 22 | 23 | def get_value(self, key): 24 | return self.kv[key][3] 25 | 26 | class PT_Decipher_Meaning_Match: 27 | def __init__(self, kv): 28 | self.kv = kv 29 | 30 | def __call__(self, key): 31 | return self.kv[key] 32 | 33 | PT_Decipher_Meaning_Passthrough = lambda x: x 34 | 35 | class PT_Register: 36 | def __init__(self, machine, register, name): 37 | self.machine = machine 38 | self.register = register 39 | self.name = name 40 | self.ranges_dict = {} 41 | 42 | def add_range(self, name, low, high, decipher_meaning): 43 | self.ranges_dict[name] = PT_Register_Range(name = name, low = low, high = high, func = decipher_meaning) 44 | 45 | def check(self): 46 | reg_value = self.machine.read_register(f"${self.register}") 47 | kv = dict() 48 | for key in self.ranges_dict: 49 | r = self.ranges_dict[key] 50 | res = extract(reg_value, r.low, r.high) 51 | kv[r.name] = (r.func(res), r.low, r.high, res) 52 | return PT_Register_State(self.register, self.name, kv) 53 | 54 | def __getattr__(self, attr): 55 | return self.check().get_value(str(attr)) 56 | 57 | -------------------------------------------------------------------------------- /pt/pt_riscv64_parse.py: -------------------------------------------------------------------------------- 1 | from pt.pt_common import * 2 | from pt.pt_arch_backend import PTArchBackend 3 | from pt.pt_constants import * 4 | 5 | import math 6 | 7 | class Riscv64_Page(CommonPage): 8 | def __init__(self, va, phys, size, readable, writeable, executable, user): 9 | self.va = va 10 | self.page_size = size 11 | self.r = readable 12 | self.w = writeable 13 | self.x = executable 14 | self.s = not user 15 | self.phys = [phys] 16 | self.sizes = [size] 17 | 18 | def pwndbg_is_writeable(self): 19 | return self.w 20 | 21 | def pwndbg_is_executable(self): 22 | return self.x 23 | 24 | def to_string(self, phys_verbose): 25 | prefix = "" 26 | if not self.s: 27 | prefix = 
bcolors.CYAN + " " + bcolors.ENDC 28 | elif self.s: 29 | prefix = bcolors.MAGENTA + " " + bcolors.ENDC 30 | 31 | varying_str = None 32 | if phys_verbose: 33 | fmt = f"{{:>{PrintConfig.va_len}}} : {{:>{PrintConfig.page_size_len}}} : {{:>{PrintConfig.phys_len}}}" 34 | varying_str = fmt.format(hex(self.va), hex(self.page_size), hex(self.phys[-1])) 35 | else: 36 | fmt = f"{{:>{PrintConfig.va_len}}} : {{:>{PrintConfig.page_size_len}}}" 37 | varying_str = fmt.format(hex(self.va), hex(self.page_size)) 38 | s = f"{varying_str} | W:{int(self.w)} X:{int(self.x)} R:{int(self.r)} S:{int(self.s)}" 39 | 40 | res = "" 41 | if self.x and self.w: 42 | res = prefix + bcolors.BLUE + " " + s + bcolors.ENDC 43 | elif self.w and not self.x: 44 | res = prefix + bcolors.GREEN + " " + s + bcolors.ENDC 45 | elif self.x: 46 | res = prefix + bcolors.RED + " " + s + bcolors.ENDC 47 | else: 48 | res = prefix + " " + s 49 | 50 | return res 51 | 52 | class PT_RiscV64_Backend(PTArchBackend): 53 | def __init__(self, machine): 54 | self.machine = machine 55 | 56 | def get_arch(self): 57 | return "riscv64" 58 | 59 | def riscv64_semantically_similar(p1, p2) -> bool: 60 | return p1.x == p2.x and p1.w == p2.w and p1.r == p2.r and p1.s == p2.s 61 | 62 | def parse_entries(self, table, as_size, lvl): 63 | dirs = [] 64 | pages = [] 65 | entries = [] 66 | try: 67 | entries = split_range_into_int_values(read_page(self.machine, table.phys[0]), 8) 68 | except: 69 | pass 70 | for i, pa in enumerate(entries): 71 | valid = extract(pa, 0, 0) 72 | if valid: 73 | is_leaf = (extract(pa, 1, 3) != 0) 74 | address_contrib = (i << (as_size - lvl * 9)) 75 | child_va = table.va | address_contrib 76 | phys_addr = extract(pa, 10, 53) << 12 77 | if is_leaf: 78 | size = 1 << (as_size - lvl * 9) 79 | readable = extract(pa, 1, 1) 80 | writeable = extract(pa, 2, 2) 81 | executable = extract(pa, 3, 3) 82 | user_accessible = extract(pa, 4, 4) 83 | pages.append(Riscv64_Page(child_va, phys_addr, size, readable, writeable, executable, user_accessible)) 84 | else: 85 | dirs.append(Riscv64_Page(child_va, phys_addr, None, None, None, None, None)) 86 | return dirs, pages 87 | 88 | def traverse_table(self, pt_addr, as_size): 89 | root = Riscv64_Page(0, pt_addr, 0, 0, 0, 0, 0) 90 | directories, leafs = self.parse_entries(root, as_size, lvl=1) 91 | 92 | lvl = 2 93 | while len(directories) != 0: 94 | directories_cur_lvl = [] 95 | for tmp_tb in directories: 96 | tmp_dirs, tmp_leafs = self.parse_entries(tmp_tb, as_size, lvl=lvl) 97 | directories_cur_lvl.extend(tmp_dirs) 98 | leafs.extend(tmp_leafs) 99 | lvl = lvl + 1 100 | directories = directories_cur_lvl 101 | 102 | for leaf in leafs: 103 | leaf.va = make_canonical(leaf.va, as_size) 104 | 105 | return leafs 106 | 107 | def print_stats(self): 108 | raise Exception("Unimplemented") 109 | 110 | def get_address_space_size_from_mode(self, mode_value): 111 | if mode_value == 8: 112 | return 39 113 | elif mode_value == 9: 114 | return 48 115 | elif mode_value == 10: 116 | return 57 117 | elif mode_value == 11: 118 | return 64 119 | else: 120 | raise Exception(f"Unknown mode: {hex(mode_value)}") 121 | 122 | def walk(self, va): 123 | entry_size = 8 124 | num_entries_per_page = int(4096 / entry_size) 125 | bits_per_level = int(math.log2(num_entries_per_page)) 126 | 127 | satp = self.machine.read_register("$satp") 128 | pt_addr = extract(satp, 0, 43) << 12 129 | mode_value = extract(satp, 60, 63) 130 | as_size = self.get_address_space_size_from_mode(mode_value) 131 | 132 | pt_walk = PageTableWalkInfo(va) 133 | 
pt_walk.add_register_stage("satp", pt_addr) 134 | 135 | iter = 0 136 | while True: 137 | top_bit = as_size - 1 - iter * bits_per_level 138 | low_bit = top_bit - bits_per_level + 1 139 | entry_index = extract(va, low_bit, top_bit) 140 | entry_page_pa = pt_addr + entry_index * entry_size 141 | try: 142 | entry_value = int.from_bytes(self.machine.read_physical_memory(entry_page_pa, entry_size), 'little') 143 | except: 144 | pt_walk.set_faulted() 145 | break 146 | entry_value_pa_no_meta = (extract(entry_value, 10, 53)) << 12 147 | meta_bits = extract_no_shift(entry_value, 0, 9) 148 | pt_walk.add_stage(f"Level{iter}", entry_index, entry_value_pa_no_meta, meta_bits) 149 | 150 | if extract(meta_bits, 0, 0) == 0: 151 | # Not present 152 | pt_walk.set_faulted() 153 | break 154 | 155 | is_leaf = (extract(meta_bits, 1, 3) != 0) 156 | if is_leaf: 157 | break 158 | 159 | pt_addr = entry_value_pa_no_meta 160 | iter += 1 161 | 162 | return pt_walk 163 | 164 | def get_filter_is_writeable(self, has_superuser_filter, has_user_filter): 165 | return lambda p: p.w 166 | 167 | def get_filter_is_not_writeable(self, has_superuser_filter, has_user_filter): 168 | return lambda p: not p.w 169 | 170 | def get_filter_is_executable(self, has_superuser_filter, has_user_filter): 171 | return lambda p: p.x 172 | 173 | def get_filter_is_not_executable(self, has_superuser_filter, has_user_filter): 174 | return lambda p: not p.x 175 | 176 | def get_filter_is_writeable_or_executable(self, has_superuser_filter, has_user_filter): 177 | return lambda p: p.w or p.x 178 | 179 | def get_filter_is_user_page(self, has_superuser_filter, has_user_filter): 180 | return lambda p: not p.s 181 | 182 | def get_filter_is_superuser_page(self, has_superuser_filter, has_user_filter): 183 | return lambda p: p.s 184 | 185 | def get_filter_is_read_only_page(self, has_superuser_filter, has_user_filter): 186 | return lambda p: p.r and not p.w and not p.x 187 | 188 | def get_filter_architecture_specific(self, filter_name, has_superuser_filter, has_user_filter): 189 | raise exception(f"Uknown filter {filter_name}") 190 | 191 | def parse_tables(self, cache, args): 192 | satp = args.satp 193 | requires_physical_contiguity = args.phys_verbose 194 | 195 | if satp: 196 | satp = int(satp[0], 16) 197 | else: 198 | satp = self.machine.read_register("$satp") 199 | 200 | all_blocks = None 201 | 202 | if satp in cache: 203 | all_blocks = cache[satp] 204 | else: 205 | mode_value = extract(satp, 60, 63) 206 | as_size = self.get_address_space_size_from_mode(mode_value) 207 | pt_base = extract(satp, 0, 43) << 12 208 | all_blocks = self.traverse_table(pt_base, as_size) 209 | all_blocks = optimize([], [], all_blocks, PT_RiscV64_Backend.riscv64_semantically_similar, requires_physical_contiguity) 210 | 211 | if args.save: 212 | cache[satp] = all_blocks 213 | 214 | return all_blocks 215 | 216 | def print_kaslr_information(self, table, should_print = True, phys_verbose = False): 217 | return None 218 | 219 | def print_table(self, table, phys_verbose): 220 | varying_str = None 221 | if phys_verbose: 222 | fmt = f" {{:>{PrintConfig.va_len}}} : {{:>{PrintConfig.page_size_len}}} : {{:>{PrintConfig.phys_len}}}" 223 | varying_str = fmt.format("Address", "Length", "Phys") 224 | else: 225 | fmt = f" {{:>{PrintConfig.va_len}}} : {{:>{PrintConfig.page_size_len}}}" 226 | varying_str = fmt.format("Address", "Length") 227 | print(bcolors.BLUE + varying_str + " Permissions " + bcolors.ENDC) 228 | for page in table: 229 | print(page.to_string(phys_verbose)) 230 | return None 231 | 232 | 
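
The RISC-V backend above derives everything from the `satp` CSR: bits 60-63 select the paging mode and bits 0-43 hold the physical page number of the root table. A minimal standalone sketch of that decoding (not part of the repository; the sample `satp` value is invented):

```python
def decode_satp(satp):
    mode = (satp >> 60) & 0xf                        # satp.MODE, bits 60-63
    root_pa = (satp & ((1 << 44) - 1)) << 12         # satp.PPN (bits 0-43) -> root table PA
    va_bits = {8: 39, 9: 48, 10: 57, 11: 64}[mode]   # Sv39 / Sv48 / Sv57; 11 -> 64 as in
    return root_pa, va_bits                          # get_address_space_size_from_mode

# Hypothetical Sv39 satp: mode = 8, root table at physical address 0x80000000.
root_pa, va_bits = decode_satp(0x8000000000080000)
print(hex(root_pa), va_bits)  # 0x80000000 39
```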
-------------------------------------------------------------------------------- /pt/pt_x86_64_definitions.py: -------------------------------------------------------------------------------- 1 | from pt.pt_common import * 2 | 3 | class PML5_Entry(): 4 | def __init__(self, value, index): 5 | self.present = is_present(value) 6 | self.writeable = is_writeable(value) 7 | self.supervisor = is_supervisor(value) 8 | self.writeback = is_writeback(value) 9 | self.cacheable = is_cacheable(value) 10 | self.accessed = is_accessed(value) 11 | self.available = is_available(value) 12 | self.nx = is_nx(value) 13 | self.pml4 = get_pml4_base(value) 14 | self.raw = value 15 | self.virt_part = make_canonical(index << 48, 57) 16 | 17 | def __str__(self): 18 | res = (f"{hex(self.pdp)}: " 19 | f"P:{int(self.present)} " 20 | f"W:{int(self.writeable)} " 21 | f"S:{int(self.supervisor)} " 22 | f"WB:{int(self.writeback)} " 23 | f"UC:{int(not self.cacheable)} " 24 | f"A:{int(self.accessed)} " 25 | f"AVL:{int(self.available)} " 26 | f"NX:{int(self.nx)}") 27 | return res 28 | 29 | class PML4_Entry(): 30 | def __init__(self, value, parent_va, index): 31 | self.present = is_present(value) 32 | self.writeable = is_writeable(value) 33 | self.supervisor = is_supervisor(value) 34 | self.writeback = is_writeback(value) 35 | self.cacheable = is_cacheable(value) 36 | self.accessed = is_accessed(value) 37 | self.available = is_available(value) 38 | self.nx = is_nx(value) 39 | self.pdp = get_pdp_base(value) 40 | self.raw = value 41 | self.virt_part = make_canonical((index << 39) | parent_va, 57 if parent_va != 0 else 48) 42 | 43 | def __str__(self): 44 | res = (f"{hex(self.pdp)}: " 45 | f"P:{int(self.present)} " 46 | f"W:{int(self.writeable)} " 47 | f"S:{int(self.supervisor)} " 48 | f"WB:{int(self.writeback)} " 49 | f"UC:{int(not self.cacheable)} " 50 | f"A:{int(self.accessed)} " 51 | f"AVL:{int(self.available)} " 52 | f"NX:{int(self.nx)}") 53 | return res 54 | 55 | class PDP_Entry(): 56 | def __init__(self, value, parent_va, index): 57 | self.present = is_present(value) 58 | self.writeable = is_writeable(value) 59 | self.supervisor = is_supervisor(value) 60 | self.writeback = is_writeback(value) 61 | self.cacheable = is_cacheable(value) 62 | self.accessed = is_accessed(value) 63 | self.virt_part = (index << 30) | parent_va 64 | self.large_page = is_large_page(value) # This means it's a leaf 65 | if self.large_page: 66 | self.dirty = is_dirty(value) 67 | self.glob = True 68 | self.pd = extract_no_shift(value, 30, 51) 69 | else: 70 | self.pd = get_pdp_base(value) 71 | self.nx = is_nx(value) 72 | 73 | def __str__(self): 74 | res = (f"{hex(self.pd)}: " 75 | f"P:{int(self.present)} " 76 | f"W:{int(self.writeable)} " 77 | f"S:{int(self.supervisor)} " 78 | f"WB:{int(self.writeback)} " 79 | f"UC:{int(not self.cacheable)} " 80 | f"A:{int(self.accessed)} ") 81 | if self.large_page: 82 | res += (f"D:{int(self.dirty)} " 83 | f"G:{int(self.glob)} " 84 | f"NX:{int(self.nx)}") 85 | return res 86 | 87 | class PD_Entry(): 88 | def __init__(self, value, parent_va, index, pde_shift): 89 | self.present = is_present(value) 90 | self.writeable = is_writeable(value) 91 | self.supervisor = is_supervisor(value) 92 | self.writeback = is_writeback(value) 93 | self.cacheable = is_cacheable(value) 94 | self.accessed = is_accessed(value) 95 | self.virt_part = (index << pde_shift) | parent_va 96 | self.big_page = is_big_page(value) # This means it's a leaf 97 | if self.big_page: 98 | self.dirty = is_dirty(value) 99 | self.glob = True 100 | self.pat = 
is_pat(value) 101 | # TODO 102 | self.pt = extract_no_shift(value, 20, 51) 103 | else: 104 | self.pt = get_pdp_base(value) 105 | self.page_size = 1 << pde_shift 106 | self.nx = is_nx(value) 107 | 108 | def __str__(self): 109 | res = (f"{hex(self.pt)}: " 110 | f"P:{int(self.present)} " 111 | f"W:{int(self.writeable)} " 112 | f"S:{int(self.supervisor)} " 113 | f"WB:{int(self.writeback)} " 114 | f"UC:{int(not self.cacheable)} " 115 | f"A:{int(self.accessed)} ") 116 | if self.big_page: 117 | res += (f"D:{int(self.dirty)} " 118 | f"G:{int(self.glob)} " 119 | f"NX:{int(self.nx)}") 120 | return res 121 | 122 | class PT_Entry(): 123 | def __init__(self, value, parent_va, index): 124 | self.present = is_present(value) 125 | self.writeable = is_writeable(value) 126 | self.supervisor = is_supervisor(value) 127 | self.writeback = is_writeback(value) 128 | self.cacheable = is_cacheable(value) 129 | self.accessed = is_accessed(value) 130 | self.dirty = is_dirty(value) 131 | self.glob = True 132 | self.pat = is_pat(value) 133 | self.pt = extract_no_shift(value, 12, 51) 134 | self.virt = (index << 12) | parent_va 135 | self.nx = is_nx(value) 136 | 137 | def __str__(self): 138 | res = (f"{hex(self.pt)}: " 139 | f"P:{int(self.present)} " 140 | f"W:{int(self.writeable)} " 141 | f"S:{int(self.supervisor)} " 142 | f"WB:{int(self.writeback)} " 143 | f"UC:{int(not self.cacheable)} " 144 | f"A:{int(self.accessed)} " 145 | f"D:{int(self.dirty)} " 146 | f"PAT:{int(self.pat)} " 147 | f"G:{int(self.glob)} " 148 | f"NX:{int(self.nx)}") 149 | return res 150 | 151 | def create_page_from_pte(pte: PT_Entry) -> Page: 152 | page = Page() 153 | page.va = pte.virt 154 | page.page_size = 4096 155 | page.w = pte.writeable 156 | page.x = not pte.nx 157 | page.s = pte.supervisor 158 | page.uc = not pte.cacheable 159 | page.wb = pte.writeback 160 | page.phys = [pte.pt] 161 | page.sizes = [page.page_size] 162 | return page 163 | 164 | def create_page_from_pde(pde: PD_Entry) -> Page: 165 | page = Page() 166 | page.va = pde.virt_part 167 | page.page_size = pde.page_size 168 | page.w = pde.writeable 169 | page.x = not pde.nx 170 | page.s = pde.supervisor 171 | page.uc = not pde.cacheable 172 | page.wb = pde.writeback 173 | page.phys = [pde.pt] 174 | page.sizes = [page.page_size] 175 | return page 176 | 177 | def create_page_from_pdpe(pdpe: PDP_Entry) -> Page: 178 | page = Page() 179 | page.va = pdpe.virt_part 180 | page.page_size = 1024 * 1024 * 1024 181 | page.w = pdpe.writeable 182 | page.x = not pdpe.nx 183 | page.s = pdpe.supervisor 184 | page.uc = not pdpe.cacheable 185 | page.wb = pdpe.writeback 186 | page.phys = [pdpe.pd] 187 | page.sizes = [page.page_size] 188 | return page 189 | 190 | def is_present(addr): 191 | return (addr & 0x1) != 0 192 | 193 | def is_writeable(addr): 194 | return (addr & 0x2) != 0 195 | 196 | def is_supervisor(addr): 197 | return (addr & 0x4) == 0 198 | 199 | def is_writeback(addr): 200 | return (addr & 0x8) == 0 201 | 202 | def is_cacheable(addr): 203 | return (addr & 0x10) == 0 204 | 205 | def is_accessed(addr): 206 | return (addr & 0x10) == 1 207 | 208 | def is_dirty(addr): 209 | return ((addr >> 6) & 0x1) == 0 210 | 211 | def is_available(addr): 212 | return ((addr >> 9) & 0x3) != 0 213 | 214 | def is_nx(addr): 215 | return (addr & (1<<63)) != 0 216 | 217 | def get_pdp_base(addr): 218 | return extract_no_shift(addr, 12, 51) 219 | 220 | def get_pml4_base(addr): 221 | return extract_no_shift(addr, 12, 51) 222 | 223 | # One gigabyte-large page. 
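# Bit 7 is the PS (page-size) flag: set in a PDPTE it marks a 1 GiB leaf mapping,
# set in a PDE it marks a 2 MiB (4 MiB without PAE) leaf; when clear, the entry
# points to the next table level.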
224 | def is_large_page(addr): 225 | return (addr >> 0x7) & 0x1 226 | 227 | # Either two-mb- or four-mb-large page. 228 | def is_big_page(addr): 229 | return (addr >> 7) & 0x1 230 | 231 | def is_pat(addr): 232 | return (addr >> 12) & 0x1 233 | 234 | def is_global(addr): 235 | return (addr >> 0x8) & 0x1 236 | 237 | def is_pat(addr): 238 | return (addr >> 12) & 0x1 239 | 240 | def rwxs_semantically_similar(p1: Page, p2: Page) -> bool: 241 | return p1.w == p2.w and p1.x == p2.x and p1.s == p2.s and p1.wb == p2.wb and p1.uc == p2.uc 242 | 243 | -------------------------------------------------------------------------------- /pt/pt_x86_64_parse.py: -------------------------------------------------------------------------------- 1 | from pt.pt_x86_64_definitions import * 2 | from pt.pt_x86_msr import * 3 | from pt.pt_common import * 4 | from pt.pt_constants import * 5 | from pt.pt_arch_backend import PTArchBackend 6 | from abc import ABC 7 | from abc import abstractmethod 8 | 9 | import math 10 | 11 | class PT_x86_Common_Backend(): 12 | 13 | def get_filter_is_writeable(self, has_superuser_filter, has_user_filter): 14 | return lambda p: p.w 15 | 16 | def get_filter_is_not_writeable(self, has_superuser_filter, has_user_filter): 17 | return lambda p: not p.w 18 | 19 | def get_filter_is_executable(self, has_superuser_filter, has_user_filter): 20 | return lambda p: p.x 21 | 22 | def get_filter_is_not_executable(self, has_superuser_filter, has_user_filter): 23 | return lambda p: not p.x 24 | 25 | def get_filter_is_writeable_or_executable(self, has_superuser_filter, has_user_filter): 26 | return lambda p: p.x or p.w 27 | 28 | def get_filter_is_user_page(self, has_superuser_filter, has_user_filter): 29 | return lambda p: not p.s 30 | 31 | def get_filter_is_superuser_page(self, has_superuser_filter, has_user_filter): 32 | return lambda p: p.s 33 | 34 | def get_filter_is_read_only_page(self, has_superuser_filter, has_user_filter): 35 | return lambda p: not p.x and not p.w 36 | 37 | def get_filter_architecture_specific(self, filter_name, has_superuser_filter, has_user_filter): 38 | if filter_name == "wb": 39 | return lambda p: p.wb 40 | elif filter_name == "_wb": 41 | return lambda p: not p.wb 42 | elif filter_name == "uc": 43 | return lambda p: p.uc 44 | elif filter_name == "_uc": 45 | return lambda p: not p.uc 46 | else: 47 | return None 48 | 49 | def print_table(self, table, phys_verbose): 50 | varying_str = None 51 | if phys_verbose: 52 | fmt = f" {{:>{PrintConfig.va_len}}} : {{:>{PrintConfig.page_size_len}}} : {{:>{PrintConfig.phys_len}}} |" 53 | varying_str = fmt.format("Address", "Length", "Phys") 54 | else: 55 | fmt = f" {{:>{PrintConfig.va_len}}} : {{:>{PrintConfig.page_size_len}}} |" 56 | varying_str = fmt.format("Address", "Length") 57 | print(bcolors.BLUE + varying_str + " Permissions " + bcolors.ENDC) 58 | for page in table: 59 | print(page.to_string(phys_verbose)) 60 | 61 | def print_stats(self): 62 | print(self.pt_cr0.check()) 63 | print(self.pt_cr4.check()) 64 | 65 | @abstractmethod 66 | def get_arch(self): 67 | raise NotImplementedError("") 68 | 69 | def walk(self, va): 70 | 71 | if self.has_paging_enabled() == False: 72 | raise Exception("Paging is not enabled") 73 | 74 | entry_size = self.get_entry_size() 75 | num_entries_per_page = int(4096 / entry_size) 76 | bits_per_level = int(math.log2(num_entries_per_page)) 77 | 78 | pse = self.retrieve_pse() 79 | pse_ignore = self.get_arch() == "x86_64" 80 | 81 | pt_addr = self.machine.read_register("$cr3") 82 | 83 | pt_walk = 
PageTableWalkInfo(va) 84 | pt_walk.add_register_stage("CR3", pt_addr) 85 | 86 | top_va = None 87 | stages = None 88 | if self.is_long_mode_enabled(): 89 | if self.has_level_5_paging_enabled(): 90 | stages = ["PML5", "PML4", "PDP", "PD", "PT"] 91 | top_va_bit = 56 92 | else: 93 | stages = ["PML4", "PDP", "PD", "PT"] 94 | top_va_bit = 47 95 | else: 96 | stages = ["PD", "PT"] 97 | top_va_bit = 31 98 | 99 | cur_phys_addr = pt_addr 100 | for (stage_index, stage_str) in enumerate(stages): 101 | page_pa = cur_phys_addr & ~0xFFF 102 | entry_index = extract(va, top_va_bit - bits_per_level + 1, top_va_bit) 103 | entry_page_pa = page_pa + entry_index * entry_size 104 | entry_value = int.from_bytes(self.machine.read_physical_memory(entry_page_pa, entry_size), 'little') 105 | entry_value_pa_no_meta = extract_no_shift(entry_value, 12, 47) 106 | meta_bits = extract_no_shift(entry_value, 0, 11) 107 | 108 | pt_walk.add_stage(stage_str, entry_index, entry_value_pa_no_meta, meta_bits) 109 | 110 | if not is_present(entry_value): 111 | pt_walk.set_faulted() 112 | break 113 | 114 | if is_big_page(entry_value) and (pse or pse_ignore): 115 | break 116 | 117 | cur_phys_addr = entry_value 118 | top_va_bit = top_va_bit - bits_per_level 119 | 120 | return pt_walk 121 | 122 | def print_kaslr_information(self, table, should_print=True, phys_verbose=False): 123 | potential_base_filter = lambda p: p.x and p.s and p.phys[0] % PT_SIZE_2MIB == 0 124 | tmp = list(filter(potential_base_filter, table)) 125 | found_page = None 126 | 127 | for page in tmp: 128 | first_byte = self.machine.read_physical_memory(page.phys[0], 1) 129 | if first_byte[0] == 0x48: 130 | found_page = page 131 | break 132 | 133 | stdout_output = "" 134 | kaslr_addresses = [] 135 | if found_page: 136 | stdout_output += "Found virtual image base:\n" 137 | stdout_output += "\tVirt: " + found_page.to_string(phys_verbose) + "\n" 138 | stdout_output += "\tPhys: " + hex(found_page.phys[0]) + "\n" 139 | kaslr_addresses.append(found_page.va) 140 | first_bytes = self.machine.read_physical_memory(page.phys[0], 32) 141 | page_ranges_subset = filter(lambda page: not page.x and page.s and page.va % PT_SIZE_2MIB == 0, table) 142 | search_res_iter = search_memory(self.machine, page_ranges_subset, first_bytes, 1, 1, 0) 143 | try: 144 | search_res = next(search_res_iter) 145 | stdout_output += "Found phys map base:\n" 146 | phys_map_virt_base = search_res[0] - found_page.phys[0] 147 | phys_map_range = next(range for range in table if range.va >= phys_map_virt_base and phys_map_virt_base < range.va + range.page_size) 148 | stdout_output += "\tVirt: " + hex(phys_map_virt_base) + " in " + phys_map_range.to_string(phys_verbose) + "\n" 149 | kaslr_addresses.append(phys_map_virt_base) 150 | except StopIteration: 151 | print("Phys map was not found") 152 | else: 153 | stdout_output = "Failed to find KASLR info" 154 | if should_print: 155 | print(stdout_output) 156 | return kaslr_addresses 157 | 158 | def retrieve_pse(self): 159 | return (self.machine.read_register("$cr4") >> 4) & 0x1 == 0x1 160 | 161 | def retrieve_pae(self): 162 | return (self.machine.read_register("$cr4") >> 5) & 0x1 == 0x1 163 | 164 | def has_paging_enabled(self): 165 | return (self.machine.read_register("$cr0") >> 31) & 0x1 == 0x1 166 | 167 | def has_level_5_paging_enabled(self): 168 | return (self.machine.read_register("$cr4") >> 12) & 0x1 == 0x1 169 | 170 | def parse_pml5(self, addr, force_traverse_all): 171 | entries = [] 172 | entry_size = 8 173 | try: 174 | values = 
split_range_into_int_values(read_page(self.machine, addr), entry_size) 175 | except: 176 | return entries 177 | pml5_cache = {} 178 | for u, value in enumerate(values): 179 | if (value & 0x1) != 0: # Page must be present 180 | if force_traverse_all or value not in pml5_cache: 181 | entry = PML5_Entry(value, u) 182 | entries.append(entry) 183 | pml5_cache[value] = entry 184 | return entries 185 | 186 | def parse_pml5es(self, pml5es, force_traverse_all, entry_size): 187 | entries = [] 188 | for pml5e in pml5es: 189 | pdpe = self.parse_pml4(pml5e, force_traverse_all) 190 | entries.extend(pdpe) 191 | return entries 192 | 193 | def parse_pml4(self, pml5e, force_traverse_all): 194 | entries = [] 195 | entry_size = 8 196 | try: 197 | values = split_range_into_int_values(read_page(self.machine, pml5e.pml4), entry_size) 198 | except: 199 | return entries 200 | pml4_cache = {} 201 | for u, value in enumerate(values): 202 | if (value & 0x1) != 0: # Page must be present 203 | if force_traverse_all or value not in pml4_cache: 204 | entry = PML4_Entry(value, pml5e.virt_part, u) 205 | entries.append(entry) 206 | pml4_cache[value] = entry 207 | return entries 208 | 209 | def parse_pml4es(self, pml4es, force_traverse_all, entry_size): 210 | entries = [] 211 | for pml4e in pml4es: 212 | pdpe = self.parse_pdp(pml4e, force_traverse_all, 4096, entry_size) 213 | entries.extend(pdpe) 214 | return entries 215 | 216 | def parse_pdp(self, pml4e, force_traverse_all, size, entry_size): 217 | entries = [] 218 | try: 219 | values = split_range_into_int_values(self.machine.read_physical_memory(pml4e.pdp, size), entry_size) 220 | except: 221 | return entries 222 | pdp_cache = {} 223 | for u, value in enumerate(values): 224 | if (value & 0x1) != 0: 225 | if force_traverse_all or value not in pdp_cache: 226 | entry = PDP_Entry(value, pml4e.virt_part, u) 227 | entries.append(entry) 228 | pdp_cache[value] = entry 229 | return entries 230 | 231 | def parse_pdpes(self, pdpes, force_traverse_all, entry_size, pde_shift): 232 | entries = [] 233 | pages = [] 234 | for pdpe in pdpes: 235 | if pdpe.large_page == False: 236 | pdes = self.parse_pd(pdpe, force_traverse_all, entry_size, pde_shift) 237 | entries.extend(pdes) 238 | else: 239 | page = create_page_from_pdpe(pdpe) 240 | pages.append(page) 241 | return entries, pages 242 | 243 | def parse_pd(self, pdpe, force_traverse_all, entry_size, pde_shift): 244 | entries = [] 245 | try: 246 | values = split_range_into_int_values(read_page(self.machine, pdpe.pd), entry_size) 247 | except: 248 | return entries 249 | pd_cache = {} 250 | for u, value in enumerate(values): 251 | if (value & 0x1) != 0: 252 | if force_traverse_all or value not in pd_cache: 253 | entry = PD_Entry(value, pdpe.virt_part, u, pde_shift) 254 | entries.append(entry) 255 | pd_cache[value] = entry 256 | return entries 257 | 258 | def parse_pdes(self, pdes, entry_size=8): 259 | entries = [] 260 | pages = [] 261 | for pde in pdes: 262 | if pde.big_page == False: 263 | ptes = self.parse_pt(pde, entry_size) 264 | entries.extend(ptes) 265 | else: 266 | page = create_page_from_pde(pde) 267 | pages.append(page) 268 | return entries, pages 269 | 270 | def parse_pt(self, pde, entry_size=8): 271 | entries = [] 272 | try: 273 | values = split_range_into_int_values(read_page(self.machine, pde.pt), entry_size) 274 | except: 275 | return entries 276 | for u, value in enumerate(values): 277 | if (value & 0x1) != 0: 278 | entry = PT_Entry(value, pde.virt_part, u) 279 | entries.append(entry) 280 | return entries 281 | 282 | class 
PT_x86_64_Backend(PT_x86_Common_Backend, PTArchBackend): 283 | 284 | def get_arch(self): 285 | return "x86_64" 286 | 287 | def __init__(self, machine): 288 | self.machine = machine 289 | self.init_registers() 290 | 291 | def init_registers(self): 292 | self.pt_cr0 = PT_CR0(self.machine) 293 | self.pt_cr4 = PT_CR4(self.machine) 294 | 295 | def is_long_mode_enabled(self): 296 | try: 297 | efer = self.machine.read_register("$efer") 298 | long_mode_enabled = bool((efer >> 8) & 0x1) 299 | return long_mode_enabled 300 | except: 301 | # EFER does not exist for a 32-bit target machine 302 | return False 303 | 304 | def get_entry_size(self): 305 | if self.is_long_mode_enabled(): 306 | return 8 307 | else: 308 | pae = self.retrieve_pae() 309 | return 8 if pae else 4 310 | 311 | def get_pde_shift(self): 312 | if self.is_long_mode_enabled(): 313 | return 21 314 | else: 315 | pse = self.retrieve_pse() 316 | pae = self.retrieve_pae() 317 | if pse and pae: 318 | # PSE is ignored when PAE is available. 319 | return 21 320 | elif not pse and pae: 321 | # Only PAE. Page size is 2MiB 322 | return 21 323 | elif pse and not pae: 324 | # Only PSE. Page size is 4MiB. 325 | return 22 326 | elif not pse and not pae: 327 | # Default. 328 | # Manual suggests this shouldn't be possible because the page extension bit in the pde would be ignored. 329 | # Yet, QEMU doesn't respect this rule and here we are. 330 | return 21 331 | 332 | def parse_tables(self, cache, args): 333 | # Check that paging is enabled, otherwise no point to continue. 334 | if self.has_paging_enabled() == False: 335 | raise Exception("Paging is not enabled") 336 | 337 | requires_physical_contiguity = args.phys_verbose 338 | pt_addr = None 339 | if args.cr3: 340 | pt_addr = int(args.cr3[0], 16) 341 | else: 342 | pt_addr = self.machine.read_register("$cr3") 343 | # TODO: Check if these attribute bits in the cr3 need to be respected. 
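# Note: CR3 bits 3 (PWT) and 4 (PCD) only affect caching of the top-level table,
# and with CR4.PCIDE=1 the low 12 bits carry the PCID; none of these change where
# the table lives, so masking off bits 0-11 below is enough to get its physical address.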
pt_addr = pt_addr & (~0xfff) 345 | 346 | page_ranges = None 347 | 348 | if pt_addr in cache: 349 | page_ranges = cache[pt_addr] 350 | elif self.is_long_mode_enabled(): 351 | pde_shift = self.get_pde_shift() 352 | entry_size = self.get_entry_size() 353 | pml4es = [] 354 | if self.has_level_5_paging_enabled(): 355 | pml5es = self.parse_pml5(pt_addr, args.force_traverse_all) 356 | pml4es = self.parse_pml5es(pml5es, args.force_traverse_all, entry_size) 357 | else: 358 | pml4es = self.parse_pml4(PML5_Entry(pt_addr, 0), args.force_traverse_all) 359 | 360 | pdpes = self.parse_pml4es(pml4es, args.force_traverse_all, entry_size) 361 | pdes, large_pages = self.parse_pdpes(pdpes, args.force_traverse_all, entry_size, pde_shift) 362 | ptes, big_pages = self.parse_pdes(pdes) 363 | small_pages = [] 364 | for pte in ptes: 365 | small_pages.append(create_page_from_pte(pte)) 366 | page_ranges = optimize(large_pages, big_pages, small_pages, rwxs_semantically_similar, requires_physical_contiguity) 367 | else: 368 | pae = self.retrieve_pae() 369 | pde_shift = self.get_pde_shift() 370 | entry_size = self.get_entry_size() 371 | 372 | pdpes = None 373 | if pae: 374 | dummy_pml4 = PML4_Entry(pt_addr, 0, 0) 375 | num_entries = 4 376 | pdpes = self.parse_pdp(dummy_pml4, args.force_traverse_all, num_entries * entry_size, entry_size) 377 | else: 378 | pdpes = [PDP_Entry(pt_addr, 0, 0)] 379 | 380 | pdes, large_pages = self.parse_pdpes(pdpes, args.force_traverse_all, entry_size, pde_shift) 381 | ptes, big_pages = self.parse_pdes(pdes, entry_size) 382 | small_pages = [] 383 | for pte in ptes: 384 | small_pages.append(create_page_from_pte(pte)) 385 | page_ranges = optimize(large_pages, big_pages, small_pages, rwxs_semantically_similar, requires_physical_contiguity) 386 | 387 | # Cache the page table if caching is set. 388 | # Caching happens before the filter is applied.
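# The masked table address is also the cache key, so the `pt -list` / `pt -clear`
# options described in the frontend help below operate on exactly these saved entries.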
389 | if args.save: 390 | cache[pt_addr] = page_ranges 391 | 392 | return page_ranges 393 | 394 | -------------------------------------------------------------------------------- /pt/pt_x86_msr.py: -------------------------------------------------------------------------------- 1 | from pt.pt_register import * 2 | 3 | class PT_CR0(PT_Register): 4 | def __init__(self, machine): 5 | super(PT_CR0, self).__init__(machine, "cr0", "Control Register 0") 6 | self.add_range("PE (Protected Mode Enable)", 0, 0, PT_Decipher_Meaning_Match({0: "Protected mode", 1: "Real mode"})) 7 | self.add_range("MP (Monitor co-processor)", 1, 1, PT_Decipher_Meaning_Passthrough) 8 | self.add_range("EM (Emulation)", 2, 2, PT_Decipher_Meaning_Match({1: "No x87 FPU present", 0: "x87 FPU present"})) 9 | self.add_range("TS (Task switched)", 3, 3, PT_Decipher_Meaning_Passthrough) 10 | self.add_range("ET (Extension type)", 4, 4, PT_Decipher_Meaning_Passthrough) 11 | self.add_range("NE (Numeric error)", 5, 5, PT_Decipher_Meaning_Passthrough) 12 | self.add_range("WP (Write protect)", 16, 16, PT_Decipher_Meaning_Passthrough) 13 | self.add_range("AM (Alignment mask)", 18, 18, PT_Decipher_Meaning_Passthrough) 14 | self.add_range("NW (Not write-through)", 29, 29, PT_Decipher_Meaning_Passthrough) 15 | self.add_range("CD (Cache disable)", 30, 30, PT_Decipher_Meaning_Passthrough) 16 | self.add_range("PG (Paging)", 31, 31, PT_Decipher_Meaning_Passthrough) 17 | 18 | class PT_CR4(PT_Register): 19 | def __init__(self, machine): 20 | super(PT_CR4, self).__init__(machine, "cr4", "Control Register 4") 21 | self.add_range("VME (Virtual 8086 Mode Extensions)", 0, 0, PT_Decipher_Meaning_Passthrough) 22 | self.add_range("PVI (Protected-model virtual interrupts)", 1, 1, PT_Decipher_Meaning_Passthrough) 23 | self.add_range("TSD (Time Stamp Disable)", 2, 2, PT_Decipher_Meaning_Passthrough) 24 | self.add_range("DE (Debugging Extensions)", 3, 3, PT_Decipher_Meaning_Passthrough) 25 | self.add_range("PSE (Page Size Extension)", 4, 4, PT_Decipher_Meaning_Passthrough) 26 | self.add_range("PAE (Physical Address Extension)", 5, 5, PT_Decipher_Meaning_Passthrough) 27 | self.add_range("MCE (Machine Check Exception)", 6, 6, PT_Decipher_Meaning_Passthrough) 28 | self.add_range("PGE (Page Global Enabled)", 7, 7, PT_Decipher_Meaning_Passthrough) 29 | self.add_range("PCE (Performance Monitor Counter Enable)", 8, 8, PT_Decipher_Meaning_Passthrough) 30 | self.add_range("OSFXSR", 9, 9, PT_Decipher_Meaning_Passthrough) 31 | self.add_range("OSXMMEXCPT", 10, 10, PT_Decipher_Meaning_Passthrough) 32 | self.add_range("UMIP (User mode instruction prevention)", 11, 11, PT_Decipher_Meaning_Passthrough) 33 | self.add_range("LA57", 12, 12, PT_Decipher_Meaning_Passthrough) 34 | self.add_range("VMXE", 13, 13, PT_Decipher_Meaning_Passthrough) 35 | self.add_range("SMXE", 14, 14, PT_Decipher_Meaning_Passthrough) 36 | self.add_range("FSGSBASE", 16, 16, PT_Decipher_Meaning_Passthrough) 37 | self.add_range("PCIDE", 17, 17, PT_Decipher_Meaning_Passthrough) 38 | self.add_range("OSXSAVE", 18, 18, PT_Decipher_Meaning_Passthrough) 39 | self.add_range("SMEP", 20, 20, PT_Decipher_Meaning_Passthrough) 40 | self.add_range("SMAP", 21, 21, PT_Decipher_Meaning_Passthrough) 41 | self.add_range("PKE", 22, 22, PT_Decipher_Meaning_Passthrough) 42 | 43 | -------------------------------------------------------------------------------- /pt_gdb/__init__.py: -------------------------------------------------------------------------------- 1 | from pt_gdb.pt_gdb import PageTableDumpGdbFrontend 2 | 
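
The CR4 bit positions registered above are the same ones the x86 backend consults when choosing the paging mode (retrieve_pse, retrieve_pae, has_level_5_paging_enabled in pt_x86_64_parse.py). Below is a small, self-contained sketch of those checks against a raw CR4 value, for reference only; the helper name is made up and it is not part of the package.

def sketch_decode_cr4_paging_bits(cr4):
    # Bit positions match the PT_CR4 ranges above.
    return {
        "PSE":  bool((cr4 >> 4) & 1),    # 4 MiB pages in legacy 32-bit paging
        "PAE":  bool((cr4 >> 5) & 1),    # 64-bit entries, 2 MiB large pages
        "LA57": bool((cr4 >> 12) & 1),   # 5-level paging
        "SMEP": bool((cr4 >> 20) & 1),
        "SMAP": bool((cr4 >> 21) & 1),
    }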
-------------------------------------------------------------------------------- /pt_gdb/pt_gdb.py: -------------------------------------------------------------------------------- 1 | from pt.machine import * 2 | from pt.pt import * 3 | from pt.pt_x86_64_parse import * 4 | from pt.pt_aarch64_parse import * 5 | from pt.pt_riscv64_parse import * 6 | 7 | import gdb 8 | import os 9 | import subprocess 10 | 11 | class QemuGdbMachine(Machine): 12 | 13 | def __init__(self): 14 | self.pid = QemuGdbMachine.get_qemu_pid() 15 | self.file = os.open(f"/proc/{self.pid}/mem", os.O_RDONLY) 16 | 17 | def __del__(self): 18 | if self.file: 19 | os.close(self.file) 20 | 21 | def read_physical_memory(self, physical_address, length): 22 | res = gdb.execute(f"monitor gpa2hva {hex(physical_address)}", to_string = True) 23 | 24 | # It's not possible to pread large sizes, so let's break the request 25 | # into a few smaller ones. 26 | max_block_size = 1024 * 1024 * 256 27 | try: 28 | hva = int(res.split(" ")[-1], 16) 29 | data = b"" 30 | for offset in range(0, length, max_block_size): 31 | length_to_read = min(length - offset, max_block_size) 32 | block = os.pread(self.file, length_to_read, hva + offset) 33 | data += block 34 | return data 35 | except Exception as e: 36 | msg = f"Physical address ({hex(physical_address)}, +{hex(length)}) is not accessible. Reason: {e}. gpa2hva result: {res}" 37 | raise OSError(msg) 38 | 39 | def search_pids_for_file(pids, filename): 40 | for pid in pids: 41 | fd_dir = f"/proc/{pid}/fd" 42 | 43 | try: 44 | for fd in os.listdir(fd_dir): 45 | if os.readlink(f"{fd_dir}/{fd}") == filename: 46 | return pid 47 | except FileNotFoundError: 48 | # Either the process has gone or fds are changing, not our pid 49 | pass 50 | except PermissionError: 51 | # Evade processes owned by other users 52 | pass 53 | 54 | return None 55 | 56 | @staticmethod 57 | def get_qemu_pid(): 58 | out = subprocess.check_output(["pgrep", "qemu-system"], encoding="utf8") 59 | pids = out.strip().split('\n') 60 | 61 | if len(pids) == 1: 62 | return int(pids[0], 10) 63 | 64 | # We add a chardev file backend (we dont add a fronted, so it doesn't affect 65 | # the guest). We can then look through proc to find which process has the file 66 | # open. This approach is agnostic to namespaces (pid, network and mount). 67 | chardev_id = "gdb-pt-dump" + '-' + ''.join(random.choices(string.ascii_letters, k=16)) 68 | with tempfile.NamedTemporaryFile() as t: 69 | gdb.execute(f"monitor chardev-add file,id={chardev_id},path={t.name}") 70 | ret = QemuGdbMachine.search_pids_for_file(pids, t.name) 71 | gdb.execute(f"monitor chardev-remove {chardev_id}") 72 | 73 | if not ret: 74 | raise Exception("Could not find qemu pid") 75 | 76 | return int(ret, 10) 77 | 78 | def read_register(self, register_name): 79 | return int(gdb.parse_and_eval(register_name).cast(gdb.lookup_type("unsigned long"))) 80 | 81 | class PageTableDumpGdbFrontend(gdb.Command): 82 | """ 83 | GDB pt-dump: command for inspecting VM page tables. 84 | Arguments: 85 | -filter FILTER [FILTER ...] 86 | Specify filters for the recorded pages. 87 | x86_64 Supported filters: 88 | w: is writeable. 89 | x: is executable 90 | w|x: is writeable or executable 91 | ro: read-only 92 | u: user-space page 93 | s: supervisor page 94 | wb: write-back 95 | uc: uncacheable 96 | 97 | aarch64- and riscv64-supported filters: 98 | w: is writeable. 
99 | x: is executable 100 | w|x: is writeable or executable 101 | ro: read-only 102 | u: user-space page 103 | s: supervisor page 104 | 105 | -range START_ADDR END_ADDR 106 | Will select only virtual memory ranges which start at a position in [START_ADDR, END_ADDR] 107 | -has ADDR 108 | Will select only virtual memory ranges which contain ADDR 109 | -before ADDR 110 | Will select virtual memory ranges which start <ADDR 111 | -after ADDR 112 | Will select virtual memory ranges which start >=ADDR 113 | -ss "STRING" 114 | Searches for the string STRING in the ranges after filtering 115 | -sb BYTESTRING 116 | Searches for the byte-string BYTESTRING in the ranges after filtering 117 | -s8 VALUE 118 | Searches for the value VALUE in the ranges after filtering 119 | VALUE should fit in 8 bytes. 120 | -s4 VALUE 121 | Searches for the value VALUE in the ranges after filtering 122 | VALUE should fit in 4 bytes. 123 | -align ALIGNMENT [OFFSET] 124 | When searching, it will print out addresses which are aligned to ALIGNMENT. 125 | If offset is provided, then the check would be performed as (ADDR - OFFSET) % ALIGNMENT. 126 | It can be useful when searching for content in a particular SLAB. 127 | -kaslr 128 | Print KASLR-relevant information like the image offsets and phys map base. 129 | -kaslr_leaks 130 | Searches for values which disclose KASLR offsets. 131 | -save 132 | Cache the recorded page table for that address after traversing the hierarchy. 133 | This will yield speed-up when printing the page table again. 134 | -list 135 | List the cached page tables. 136 | -clear 137 | Clear all saved page tables. 138 | -info 139 | Print arch register information. 140 | -o FILE_NAME 141 | Store the output from the current command to a file with name FILE_NAME. 142 | This may be useful when a lot of data is produced, e.g. a full page table. 143 | -find_alias 144 | Experimental feature and currently slow. Searches for aliased ranges in virtual memory. 145 | Ranges are aliased if they point to the same physical memory. This can be useful if one 146 | is searching for R/RX memory which is writeable through some other address. 147 | Another interesting option is to find aliases for memory mapped in user space and kernel space. 148 | TODO: This feature will be reworked for usability and performance in the near future. 149 | -force_traverse_all 150 | Forces the traversal of any page table entry (pml4, pdp, ...) even if a duplicate entry has 151 | already been traversed. Using this option bypasses an optimization which discards already 152 | traversed duplicate entries. Expect that using this option would render pt unusable for 153 | Windows VMs. 154 | -phys_verbose 155 | Prints the start physical address for the printed virtual ranges. This argument further 156 | restricts the merging of virtual ranges by requiring that merged ranges also be 157 | physically contiguous. Using this option leads to more verbose output. 158 | 159 | Architecture-specific arguments: 160 | - X86-32 / X86-64 161 | `-cr3 HEX_ADDR` 162 | The GPA of the page table. If not used, the script will use the architectural 163 | register (e.g. cr3). 164 | 165 | - aarch64 166 | `-ttbr0_el1 HEX_ADDR` 167 | The GPA of the TTBR0_EL1 register. 168 | `-ttbr1_el1 HEX_ADDR` 169 | The GPA of the TTBR1_EL1 register. 170 | 171 | - riscv64 172 | `-satp HEX_ADDR` 173 | The GPA of the SATP register. 174 | 175 | Example usage: 176 | `pt -save -filter s w|x wb` 177 | Traverse the current page table and then save it.
When returning the result, 178 | filter the pages to be marked as supervisor, be writeable or executable, and marked as 179 | write-back. 180 | `pt -filter w x` 181 | Traverse the current page table and print out mappings which are both writeable and 182 | executable. 183 | `pt -cr3 0x4000` 184 | Traverse the page table at guest physical address 0x4000. Don't save it. 185 | `pt -save -kaslr` 186 | Traverse page tables, save them and print kaslr information. 187 | `pt -ss "Linux 4."` 188 | Search for the string Linux. 189 | `pt -sb da87374107` 190 | Search for the byte-string da87374107. 191 | `pt -s8 0xaabbccdd` 192 | Search for the 8-byte-long value 0xaabbccdd. 193 | `pt -has 0xffffffffaaf629f7` 194 | Print information about the mapping which covers the address 0xffffffffaaf629f7. 195 | """ 196 | 197 | def __init__(self): 198 | super(PageTableDumpGdbFrontend, self).__init__("pt", gdb.COMMAND_USER) 199 | self.pid = -1 200 | self.pt = None 201 | 202 | def lazy_init(self): 203 | 204 | # Create machine backend 205 | machine_backend = QemuGdbMachine() 206 | 207 | # Create arch backend 208 | arch = gdb.execute("show architecture", to_string = True) 209 | arch_backend = None 210 | if "aarch64" in arch: 211 | arch_backend = PT_Aarch64_Backend(machine_backend) 212 | elif "x86" in arch or "x64" in arch or "i386" in arch: 213 | arch_backend = PT_x86_64_Backend(machine_backend) 214 | elif "riscv:rv64" in arch: 215 | arch_backend = PT_RiscV64_Backend(machine_backend) 216 | else: 217 | raise Exception(f"Unknown arch. Message: {arch}") 218 | 219 | # Bring-up pt_dump 220 | self.pt = PageTableDump(machine_backend, arch_backend) 221 | 222 | def invoke(self, arg, from_tty): 223 | try: 224 | curr_pid = QemuGdbMachine.get_qemu_pid() 225 | if curr_pid != self.pid: 226 | self.lazy_init() 227 | except Exception as e: 228 | print("Cannot get qemu-system pid", e) 229 | return 230 | 231 | argv = gdb.string_to_argv(arg) 232 | self.pt.handle_command_wrapper(argv) 233 | -------------------------------------------------------------------------------- /pt_host.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | 3 | from pt.machine import * 4 | from pt.pt import * 5 | from pt.pt_x86_64_parse import * 6 | 7 | from bcc import BPF 8 | import ctypes 9 | 10 | import sys 11 | 12 | def read_address_from_kallsyms(searched_symb_name): 13 | """ 14 | Returns the VA of a given kernel symbol. 
15 | """ 16 | f = open("/proc/kallsyms", "r") 17 | data = f.read() 18 | f.close() 19 | for line in data.split("\n"): 20 | symb_addr, symb_type, symb_name = line.split(" ") 21 | if searched_symb_name == symb_name: 22 | return int(symb_addr, 16) 23 | return None 24 | 25 | syscall = ctypes.CDLL(None).syscall 26 | read_memory_program = None 27 | read_cr3_program = None 28 | 29 | def init_global_state(target_pid): 30 | global read_memory_program 31 | global read_cr3_program 32 | 33 | page_offset_base = read_address_from_kallsyms("page_offset_base") 34 | 35 | with open("pt_host/pt_host_read_physmem.bcc", "r") as read_memory_bpf_program_src_file: 36 | read_memory_bpf_program_src = read_memory_bpf_program_src_file.read() 37 | read_memory_bpf_program_src = read_memory_bpf_program_src.replace("$PAGE_OFFSET_BASE", hex(page_offset_base)) 38 | read_memory_program = BPF(text=read_memory_bpf_program_src) 39 | 40 | with open("pt_host/pt_host_read_cr3.bcc", "r") as read_cr3_program_src_file: 41 | read_cr3_program_src = read_cr3_program_src_file.read() 42 | read_cr3_program_src = read_cr3_program_src.replace("$TARGET_PID", str(target_pid)) 43 | read_cr3_program_src = read_cr3_program_src.replace("$PT_HOST_PID", str(os.getpid())) 44 | read_cr3_program_src = read_cr3_program_src.replace("$PAGE_OFFSET_BASE", hex(page_offset_base)) 45 | read_cr3_program = BPF(text=read_cr3_program_src) 46 | 47 | def read_cr3(target_pid): 48 | """ 49 | Retrieves the value of the CR3 register for a given PID. 50 | 51 | The CR3 value is retrieved from mm_struct::pgd. 52 | """ 53 | cr3 = None 54 | 55 | # Invoke 'pidfd_open' which will cause 'pidfd_create' to be called. 56 | pidfd = os.pidfd_open(int(target_pid)) 57 | os.close(pidfd) 58 | 59 | # The trace must have been triggered 60 | (_, _, _, _, _, msg_b) = read_cr3_program.trace_fields() 61 | msg = msg_b.decode('utf8') 62 | assert(msg == "Done") 63 | 64 | # Read back the CR3 value for the target process. 65 | cr3 = read_cr3_program["out_cr3"][0].value 66 | return cr3 67 | 68 | class HostMachine(Machine): 69 | """ 70 | Implementation of the pt-dump 'Machine' interface. 71 | """ 72 | 73 | def __init__(self, target_pid): 74 | self.target_pid = target_pid 75 | 76 | def __del__(self): 77 | pass 78 | 79 | def read_physical_memory(self, physical_address, length): 80 | block_size = 0x1000 81 | data = b"" 82 | for block_off in range(0, length, block_size): 83 | block_len = min(block_size, length - block_off) 84 | data += read_phys_memory(physical_address + block_off, block_len) 85 | return data 86 | 87 | def read_register(self, register_name): 88 | value = None 89 | if register_name == "$cr3": 90 | value = read_cr3(self.target_pid) 91 | elif register_name == "$cr4": 92 | value = (1 << 4) | (1 << 5) | (1 << 20) | (1 << 21) 93 | elif register_name == "$cr0": 94 | value = (1 << 0) | (1 << 16) | (1 << 31) 95 | elif register_name == "$efer": 96 | value = (1 << 8) 97 | else: 98 | raise Exception("Unimplemented register: " + register_name) 99 | return value 100 | 101 | class HostFrontend(): 102 | """ 103 | Combines the machine, arch and pt_dump implementations. 
104 | """ 105 | 106 | def __init__(self, target_pid): 107 | 108 | # Create machine backend 109 | machine_backend = HostMachine(target_pid) 110 | 111 | # Create arch backend 112 | arch_backend = PT_x86_64_Backend(machine_backend) 113 | 114 | # Bring-up pt_dump 115 | self.pt = PageTableDump(machine_backend, arch_backend, needs_pid = True) 116 | 117 | def invoke(self, arg): 118 | self.pt.handle_command_wrapper(arg) 119 | 120 | class Page(ctypes.Structure): 121 | """ 122 | Data structure for holding one 4K page. 123 | """ 124 | _fields_ = [('data', ctypes.c_char * 4096)] 125 | 126 | _attached = False 127 | 128 | def read_phys_memory(addr, len): 129 | """ 130 | Reads 'len' bytes from the physical address in 'addr' 131 | """ 132 | 133 | assert(len <= 0x1000) 134 | 135 | pid = os.getpid() 136 | 137 | # Attach once dynamically 138 | global _attached 139 | if not _attached: 140 | read_memory_program.get_table("in_pid")[ctypes.c_uint32(0)] = ctypes.c_uint32(pid) 141 | read_memory_program.attach_kprobe(event=read_memory_program.get_syscall_fnname("madvise"), fn_name="syscall__madvise") 142 | _attached = True 143 | 144 | read_memory_program.get_table("in_phys_addr")[ctypes.c_uint32(0)] = ctypes.c_uint64(addr) 145 | read_memory_program.get_table("in_phys_len")[ctypes.c_uint32(0)] = ctypes.c_uint32(len) 146 | 147 | # Use madvise as a driver to trigger the point for reading physical memory 148 | try: 149 | madvise_syscall = 28 150 | syscall(madvise_syscall, 0x1337, 0x1337) 151 | except Exception as e: 152 | print(e) 153 | pass 154 | 155 | # Sanity check that execution of the point finished successfully 156 | (_, _, _, _, _, msg_b) = read_memory_program.trace_fields() 157 | msg = msg_b.decode('utf8') 158 | assert(msg == "Done") 159 | 160 | # Read back the physical memory from the shared array. 161 | data = bytearray(read_memory_program["out_block"][0].data) 162 | return data[:len] 163 | 164 | def main(): 165 | 166 | # This is a necessary hack to WAR the disparity of argument handling in pt_host and pt_dump 167 | # pt_host needs to take a mandatory "-pid" argument, while "pt_dump" does not need it. 168 | # pt_host does not know of the arguments handled by pt_dump. 169 | # 170 | # The hack is to do the pid arg parsing here once to get the pid, but then still add 171 | # pid to argparse in pt_dump. 
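# Example invocation (assuming root/BPF privileges): `sudo ./pt_host.py -pid 1234 -filter w x`.
# Everything other than -pid is forwarded unchanged to the pt_dump argument parser below.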
172 | pid_index = None 173 | for (index, arg_value) in enumerate(sys.argv): 174 | if arg_value == "-pid" and (index + 1) < len(sys.argv): 175 | pid_index = index 176 | break 177 | pid = None 178 | if pid_index == None: 179 | pid = -1 180 | else: 181 | pid = int(sys.argv[pid_index + 1]) 182 | 183 | # Initialize state 184 | init_global_state(target_pid=pid) 185 | frontend = HostFrontend(target_pid=pid) 186 | 187 | # Invoke pt_dump 188 | frontend.invoke(sys.argv[1:]) 189 | 190 | if __name__ == "__main__": 191 | main() 192 | 193 | -------------------------------------------------------------------------------- /pt_host/pt_host_read_cr3.bcc: -------------------------------------------------------------------------------- 1 | 2 | #include 3 | #include 4 | #include 5 | 6 | BPF_ARRAY(out_cr3, u64, 1); 7 | 8 | KFUNC_PROBE(pidfd_create, struct pid *pid, unsigned int flags) { 9 | // Retrieve the passed arguments from userspace 10 | const u32 pt_host_pid = $PT_HOST_PID; 11 | 12 | const u32 target_pid = $TARGET_PID; 13 | 14 | // Check if this trace happened due to an invocation from the pt_host script 15 | struct task_struct *current_task = (struct task_struct *)bpf_get_current_task(); 16 | if (current_task == NULL) { 17 | return 0; 18 | } 19 | 20 | u32 current_task_pid = 0; 21 | if (bpf_probe_read(¤t_task_pid, sizeof(current_task_pid), &(current_task->pid)) < 0) { 22 | return 0; 23 | } 24 | if (current_task_pid != pt_host_pid) { 25 | return 0; 26 | } 27 | 28 | struct hlist_node *first_task_node = NULL; 29 | if (bpf_probe_read(&first_task_node, sizeof(first_task_node), &(pid->tasks[PIDTYPE_TGID].first)) < 0) { 30 | return 0; 31 | } 32 | struct task_struct *target_task = hlist_entry(first_task_node, struct task_struct, pid_links[PIDTYPE_TGID]); 33 | 34 | // Get the task_struct of the target process 35 | if (!target_task) { 36 | return 0; 37 | } 38 | 39 | struct mm_struct *mm = NULL; 40 | if (bpf_probe_read(&mm, sizeof(mm), &(target_task->mm)) < 0) { 41 | return 0; 42 | } 43 | 44 | unsigned long cr3 = 0; 45 | if (bpf_probe_read(&cr3, sizeof(cr3), &(mm->pgd)) < 0) { 46 | return 0; 47 | } 48 | 49 | unsigned long page_offset_base = 0; 50 | if (bpf_probe_read(&page_offset_base, 8, (void *)$PAGE_OFFSET_BASE) < 0) { 51 | return 0; 52 | } 53 | unsigned long cr3_phys_address = cr3 - page_offset_base; 54 | u32 index = 0; 55 | u64 *ptr = out_cr3.lookup(&index); 56 | if (!ptr) { 57 | return 0; 58 | } 59 | *ptr = cr3_phys_address; 60 | bpf_trace_printk("Done\n"); 61 | return 0; 62 | } 63 | -------------------------------------------------------------------------------- /pt_host/pt_host_read_physmem.bcc: -------------------------------------------------------------------------------- 1 | #include 2 | #include 3 | 4 | #define MAX_LEN ( 4096 ) 5 | #define MAX_ELEMENTS ( MAX_LEN / 8 ) 6 | 7 | 8 | struct Page { 9 | u8 data[4096]; 10 | }; 11 | 12 | BPF_ARRAY(out_page_memory, u64, MAX_ELEMENTS); 13 | BPF_ARRAY(in_phys_addr, u64, 1); 14 | BPF_ARRAY(in_phys_len, u32, 1); 15 | BPF_ARRAY(in_pid, u32, 1); 16 | BPF_ARRAY(out_block, struct Page, 1); 17 | 18 | int syscall__madvise(struct pt_regs *regs, unsigned long start, size_t advise_len, int behavior) { 19 | unsigned int zero = 0; 20 | struct task_struct *task = (struct task_struct *)bpf_get_current_task(); 21 | if (task == NULL) { 22 | return 1; 23 | } 24 | u32 *pid_ptr = in_pid.lookup(&zero); 25 | if (!pid_ptr) { 26 | return 1; 27 | } 28 | 29 | unsigned int task_pid = 0; 30 | if (bpf_probe_read(&task_pid, sizeof(task_pid), &(task->pid)) < 0) { 31 | return 1; 32 | } 33 | if 
(task_pid != *pid_ptr) { 34 | return 1; 35 | } 36 | 37 | if (start != 0x1337 || advise_len != 0x1337) { 38 | return 1; 39 | } 40 | 41 | unsigned long page_offset_base = 0; 42 | if (bpf_probe_read(&page_offset_base, 8, (void *)$PAGE_OFFSET_BASE) < 0) { 43 | return 1; 44 | } 45 | u32 *len_ptr = in_phys_len.lookup(&zero); 46 | if (!len_ptr) { 47 | return 1; 48 | } 49 | const u32 len = *len_ptr; 50 | if (len > MAX_LEN) 51 | { 52 | return 1; 53 | } 54 | u64 *phys_addr_ptr = in_phys_addr.lookup(&zero); 55 | if (!phys_addr_ptr) { 56 | return 1; 57 | } 58 | 59 | struct Page *block_ptr = out_block.lookup(&zero); 60 | if (!block_ptr) { 61 | return 1; 62 | } 63 | 64 | u8 *phys_addr = 0x0; 65 | phys_addr += *phys_addr_ptr; 66 | phys_addr += page_offset_base; 67 | if (bpf_probe_read(block_ptr, len, (const void *)phys_addr) != 0) { 68 | bpf_trace_printk("Failed to read physical memory\n"); 69 | return 1; 70 | } 71 | 72 | bpf_trace_printk("Done\n"); 73 | return 0; 74 | } 75 | 76 | -------------------------------------------------------------------------------- /pyproject.toml: -------------------------------------------------------------------------------- 1 | [tool.poetry] 2 | name = "pt" 3 | version = "1.0.0" 4 | description = "`gdb-pt-dump` is a gdb script to examine the address space of a QEMU-based virtual machine." 5 | authors = ["martinradev "] 6 | license = "MIT" 7 | readme = "README.md" 8 | packages = [ 9 | { include = "pt" }, 10 | { include = "pt_gdb" }, 11 | ] 12 | 13 | [tool.poetry.dependencies] 14 | python = "^3.8" 15 | 16 | 17 | [build-system] 18 | requires = ["poetry-core"] 19 | build-backend = "poetry.core.masonry.api" 20 | -------------------------------------------------------------------------------- /tests/integration_tests/Dockerfile.package: -------------------------------------------------------------------------------- 1 | FROM ubuntu:24.04 AS build 2 | 3 | RUN apt-get update && \ 4 | DEBIAN_FRONTEND=noninteractive \ 5 | apt-get install -y \ 6 | make \ 7 | nasm \ 8 | gcc-12 \ 9 | gcc-12-aarch64-linux-gnu \ 10 | zstd 11 | 12 | RUN mkdir -p /build 13 | ADD custom_kernels /build/custom_kernels 14 | COPY Makefile /build/Makefile 15 | ADD test_images /build/test_images 16 | WORKDIR /build 17 | RUN make -j$(nproc) 18 | RUN ZSTD_CLEVEL=6 tar -I zstd -cf test_images.tar.zst test_images 19 | 20 | FROM scratch AS export 21 | COPY --from=build /build/test_images.tar.zst test_images.tar.zst 22 | 23 | -------------------------------------------------------------------------------- /tests/integration_tests/Dockerfile.runtests: -------------------------------------------------------------------------------- 1 | FROM ubuntu:22.04 AS build 2 | 3 | RUN apt-get update && \ 4 | DEBIAN_FRONTEND=noninteractive \ 5 | apt-get install -y \ 6 | qemu-system \ 7 | gdb \ 8 | gdb-multiarch \ 9 | python3 \ 10 | python3-pip \ 11 | python3-pytest \ 12 | python3-pytest-xdist \ 13 | python3-pytest-timeout 14 | 15 | ARG UID=0 16 | ARG GID=0 17 | ARG GROUPNAME=testgroup 18 | ARG USERNAME=testuser 19 | 20 | # RUN groupadd -g ${GID} testgroup 21 | RUN if ! getent group ${GID} >/dev/null; then \ 22 | groupadd -g ${GID} ${GROUPNAME}; \ 23 | fi 24 | 25 | # Create user if it does not exist 26 | RUN if ! 
id -u ${UID} >/dev/null 2>&1; then \ 27 | useradd -m -u ${UID} -g ${GID} -s /bin/bash ${USERNAME}; \ 28 | fi 29 | 30 | RUN mkdir -p /gdb-pt-dump && chown -R ${GID}:${UID} /gdb-pt-dump 31 | USER ${UID} 32 | WORKDIR /gdb-pt-dump/tests/integration_tests 33 | ENV GDB_PT_DUMP_TESTS_LOGFILE=/tmp/log.txt 34 | CMD ./run_tests.sh --skip_download --logfile $GDB_PT_DUMP_TESTS_LOGFILE 35 | 36 | -------------------------------------------------------------------------------- /tests/integration_tests/Makefile: -------------------------------------------------------------------------------- 1 | 2 | all: custom_kernels 3 | 4 | custom_kernels: custom_kernels_x86_64 custom_kernels_arm_64 5 | 6 | custom_kernels_x86_64: 7 | make -C custom_kernels/x86/64_bit/ 8 | mkdir -p images/custom_kernels/x86_64/ 9 | cp custom_kernels/x86/64_bit/*.bin images/custom_kernels/x86_64/ 10 | 11 | custom_kernels_arm_64: 12 | make -C custom_kernels/arm/64_bit/ 13 | mkdir -p images/custom_kernels/arm_64/ 14 | cp custom_kernels/arm/64_bit/*.bin images/custom_kernels/arm_64/ 15 | 16 | clean: 17 | make -C custom_kernels/x86/64_bit/ clean 18 | make -C custom_kernels/arm/64_bit/ clean 19 | -------------------------------------------------------------------------------- /tests/integration_tests/build_package.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | . common.sh 4 | 5 | download_latest 6 | docker build -t gdb_pt_dump_package_tests -f Dockerfile.package . 7 | -------------------------------------------------------------------------------- /tests/integration_tests/common.sh: -------------------------------------------------------------------------------- 1 | 2 | download_latest() { 3 | if [[ -d test_images ]]; then 4 | echo "Test images already downloaded. Skipping..." 5 | return 0 6 | fi 7 | source_url="https://github.com/martinradev/gdb-pt-dump/releases/download/test_binary_images_v1/test_images.tar.zst" 8 | wget "${source_url}" 9 | tar -xf test_images.tar.zst 10 | rm test_images.tar.zst 11 | return 0 12 | } 13 | -------------------------------------------------------------------------------- /tests/integration_tests/custom_kernels/.gitignore: -------------------------------------------------------------------------------- 1 | *.o 2 | *.bin 3 | *.elf 4 | -------------------------------------------------------------------------------- /tests/integration_tests/custom_kernels/arm/64_bit/Makefile: -------------------------------------------------------------------------------- 1 | 2 | test_list = \ 3 | test_granularity_4k \ 4 | test_granularity_16k \ 5 | test_granularity_64k 6 | 7 | tests_arm64: $(foreach test_name,$(test_list),$(addsuffix .bin,$(test_name))) 8 | 9 | define generate_test 10 | 11 | $(1).bin: boot.o entry_$(1).o linker.ld 12 | echo "Creating $(1)" 13 | aarch64-linux-gnu-ld -Bstatic -nostdlib -Tlinker.ld -o tmp_arm_test_entry_$(1).elf boot.o entry_$(1).o 14 | aarch64-linux-gnu-objcopy -O binary tmp_arm_test_entry_$(1).elf arm_test_entry_$(1).bin 15 | 16 | entry_$(1).o: entry.c ../../common/common.h 17 | aarch64-linux-gnu-gcc -DGDB_PT_DUMP_TEST=$(1) entry.c -I./ -I../../common -Wall -Wextra -O3 -fno-builtin -no-pie -fno-PIE -ffreestanding -nostdlib -c -o entry_$(1).o 18 | 19 | endef 20 | 21 | $(foreach test_name,$(test_list),$(eval $(call generate_test,$(test_name)))) 22 | 23 | boot.o: boot.asm 24 | aarch64-linux-gnu-as boot.asm -o boot.o 25 | 26 | clean: 27 | find . -name "*.elf" -exec rm {} \; 28 | find . -name "*.o" -exec rm {} \; 29 | find . 
-name "*.bin" -exec rm {} \; 30 | -------------------------------------------------------------------------------- /tests/integration_tests/custom_kernels/arm/64_bit/boot.asm: -------------------------------------------------------------------------------- 1 | .extern entry 2 | .global _start 3 | .section .boot 4 | _start: 5 | mov x30, 0x800000 6 | mov sp, x30 7 | bl entry 8 | hang: 9 | b hang 10 | -------------------------------------------------------------------------------- /tests/integration_tests/custom_kernels/arm/64_bit/entry.c: -------------------------------------------------------------------------------- 1 | #include 2 | #include 3 | #include "common.h" 4 | 5 | volatile unsigned int * const UART0_DR = (unsigned int *) 0x09000000; 6 | 7 | typedef enum 8 | { 9 | Granularity_T0_4K = 0U, 10 | Granularity_T0_16K = 2U, 11 | Granularity_T0_64K = 1U, 12 | } Granularity_T0; 13 | 14 | typedef enum 15 | { 16 | Granularity_T1_4K = 2U, 17 | Granularity_T1_16K = 1U, 18 | Granularity_T1_64K = 3U, 19 | } Granularity_T1; 20 | 21 | typedef enum 22 | { 23 | Descriptor_Invalid = 0U, 24 | Descriptor_Table = 3U, 25 | Descriptor_TableEntry = 3U, 26 | Descriptor_Block = 1U, 27 | } Descriptor; 28 | 29 | 30 | static uint64_t get_t0sz(uint8_t sz) 31 | { 32 | return (uint64_t)sz; 33 | } 34 | 35 | static uint64_t get_t1sz(uint8_t sz) 36 | { 37 | return ((uint64_t)sz << 16U); 38 | } 39 | 40 | static uint64_t get_tg0(Granularity_T0 gran) 41 | { 42 | return (uint64_t)gran << 14U; 43 | } 44 | 45 | static uint64_t get_tg1(Granularity_T1 gran) 46 | { 47 | return (uint64_t)gran << 30U; 48 | } 49 | 50 | static uint64_t construct_tcr(uint8_t t0sz, uint8_t t1sz, Granularity_T0 gran0, Granularity_T1 gran1) 51 | { 52 | return get_t0sz(t0sz) | get_t1sz(t1sz) | get_tg0(gran0) | get_tg1(gran1); 53 | } 54 | 55 | static uint64_t construct_desc(Descriptor desc, uint64_t phys, uint8_t ap, bool pxn, bool xn) 56 | { 57 | uint64_t desc_entry = (uint64_t)desc; 58 | desc_entry |= (phys & ~0xFFFUL); 59 | desc_entry |= (1ULL << 10u); // AF 60 | desc_entry |= (((uint64_t)ap & 0x3U) << 6U); 61 | desc_entry |= ((uint64_t)pxn << 53U); 62 | desc_entry |= ((uint64_t)xn << 54U); 63 | return desc_entry; 64 | } 65 | 66 | static void print(const char *s) { 67 | while(*s != '\0') { 68 | *UART0_DR = (unsigned int)(*s++); 69 | } 70 | } 71 | 72 | static void write_to_paging_regs(uint64_t tcr_el1, uint32_t ttbr0_el1, uint32_t ttbr1_el1) 73 | { 74 | asm volatile( 75 | "dsb ish\n\t" 76 | "isb\n\t" 77 | 78 | // Update addressing and page table state 79 | "msr tcr_el1, %0\n\t" 80 | "msr ttbr0_el1, %1\n\t" 81 | "msr ttbr1_el1, %2\n\t" 82 | 83 | // Update MAIR 84 | "mov x0, 0x000004FF\n\t" 85 | "msr mair_el1, x0\n\t" 86 | "dsb ish\n\t" 87 | "isb\n\t" 88 | 89 | // enable mmu 90 | "mrs x0, sctlr_el1\n\t" 91 | "orr x0, x0, #0x1\n\t" 92 | "msr sctlr_el1, x0\n\t" 93 | : 94 | : "r"(tcr_el1), "r"(ttbr0_el1), "r"(ttbr1_el1) 95 | : "x0" 96 | ); 97 | } 98 | 99 | static const uint64_t test_base = 0x40300000ULL; 100 | 101 | static void test_granularity_4k() 102 | { 103 | uint64_t *ptr = (uint64_t *)test_base; 104 | memset((void *)ptr, 0, 0x20000); 105 | const uint64_t num_entries = 512; 106 | const uint64_t granule_size = 0x1000UL; 107 | ptr[0] = construct_desc(Descriptor_Block, 0, 0, false, false); // cover first 512GB 108 | ptr[1] = construct_desc(Descriptor_Table, test_base + granule_size, 0, false, false); 109 | ptr[num_entries - 1] = construct_desc(Descriptor_Block, 0, 0, false, true); 110 | ptr[num_entries] = construct_desc(Descriptor_Table, test_base + 
granule_size * 2U, 0, false, false); 111 | ptr[num_entries + 8] = construct_desc(Descriptor_Table, test_base + granule_size * 4U, 0, false, false); 112 | ptr[num_entries * 2] = construct_desc(Descriptor_Table, test_base + granule_size * 3U, 0, false, false); 113 | ptr[num_entries * 3] = construct_desc(Descriptor_TableEntry, 0, 0x1, true, true); 114 | ptr[num_entries * 3 + 1] = construct_desc(Descriptor_TableEntry, 0, 0, false, false); 115 | ptr[num_entries * 3 + 2] = construct_desc(Descriptor_TableEntry, 0, 0, false, false); 116 | ptr[num_entries * 3 + 3] = construct_desc(Descriptor_TableEntry, 0, 0, false, true); 117 | ptr[num_entries * 4] = construct_desc(Descriptor_Block, 0, 0x11, true, true); 118 | ptr[num_entries * 4 + 1] = construct_desc(Descriptor_Block, 0, 0x10, true, true); 119 | ptr[num_entries * 4 + 2] = construct_desc(Descriptor_Block, 0, 0x01, true, true); 120 | ptr[num_entries * 4 + 3] = construct_desc(Descriptor_Block, 0, 0x01, false, true); 121 | ptr[num_entries * 4 + 4] = construct_desc(Descriptor_Block, 0, 0x01, true, false); 122 | ptr[num_entries * 4 + 5] = construct_desc(Descriptor_Block, 0, 0x01, true, false); 123 | ptr[num_entries * 4 + 6] = construct_desc(Descriptor_Block, 0, 0x01, true, false); 124 | ptr[num_entries * 4 + 7] = construct_desc(Descriptor_Block, 0, 0x01, true, false); 125 | ptr[num_entries * 5 - 1] = construct_desc(Descriptor_Block, 0, 0x01, true, false); 126 | 127 | const uint64_t tcr = construct_tcr(16, 16, Granularity_T0_4K, Granularity_T1_4K); 128 | write_to_paging_regs(tcr, test_base, test_base); 129 | } 130 | 131 | static void test_granularity_16k() 132 | { 133 | const uint64_t num_entries = 512 * 4; 134 | const uint64_t granularity = 16 * 1024; 135 | uint64_t *ptr = (uint64_t *)test_base; 136 | memset((void *)ptr, 0, granularity * 5); 137 | ptr[0] = construct_desc(Descriptor_Block, 0, 0, false, false); // map 128TiB 138 | ptr[1] = construct_desc(Descriptor_Table, test_base + granularity, 0, false, false); 139 | ptr[num_entries] = construct_desc(Descriptor_Block, 0, 0, true, false); // map 64G 140 | ptr[num_entries + 2] = construct_desc(Descriptor_Table, test_base + granularity * 2, 0, true, false); 141 | ptr[num_entries * 2 + 1] = construct_desc(Descriptor_Block, 0, 0, true, false); // Map 32M 142 | ptr[num_entries * 2 + 4] = construct_desc(Descriptor_Table, test_base + granularity * 3, 0, true, false); 143 | ptr[num_entries * 3 + 8] = construct_desc(Descriptor_TableEntry, 0, 0, true, false); // map 16K 144 | 145 | const uint64_t tcr = construct_tcr(16, 16, Granularity_T0_16K, Granularity_T1_16K); 146 | write_to_paging_regs(tcr, test_base, test_base); 147 | } 148 | 149 | static void test_granularity_64k() 150 | { 151 | const uint64_t num_entries = 512 * 16; 152 | const uint64_t granularity = 64 * 1024; 153 | uint64_t *ptr = (uint64_t *)test_base; 154 | memset((void *)ptr, 0, 0x20000); 155 | ptr[0] = construct_desc(Descriptor_Block, 0, 0, false, false); // Map first 4TiB 156 | ptr[1] = construct_desc(Descriptor_Block, 0, 0, false, false); 157 | ptr[4] = construct_desc(Descriptor_Table, test_base + granularity, 0, false, false); 158 | ptr[num_entries - 1] = construct_desc(Descriptor_Block, 0, 0, false, true); 159 | ptr[num_entries] = construct_desc(Descriptor_Block, 0, 0, false, false); // Map another 512MiB 160 | ptr[num_entries + 1] = construct_desc(Descriptor_Table, test_base + granularity * 2, 0, false, false); 161 | ptr[num_entries * 2] = construct_desc(Descriptor_TableEntry, 0, 0, true, true); 162 | ptr[num_entries * 2 + num_entries - 1] = 
construct_desc(Descriptor_TableEntry, 0, 0, true, true); 163 | ptr[num_entries * 2 + 2] = construct_desc(Descriptor_TableEntry, 0, 0, false, false); 164 | ptr[num_entries * 3 - 1] = construct_desc(Descriptor_TableEntry, 0, 0, false, false); 165 | 166 | const uint64_t tcr = construct_tcr(16, 16, Granularity_T0_64K, Granularity_T1_64K); 167 | write_to_paging_regs(tcr, test_base, test_base); 168 | } 169 | 170 | static void test_complete(void) 171 | { 172 | print("Done\n"); 173 | while(1) { 174 | volatile int a = 0xcafefe; 175 | (void)a; 176 | asm volatile("yield" :::); 177 | } 178 | } 179 | 180 | static void no_test_executed(void) 181 | { 182 | print("Test not found\n"); 183 | while(1) { 184 | volatile int a = 0xdeaddead; 185 | (void)a; 186 | asm volatile("yield" :::); 187 | } 188 | } 189 | 190 | void entry() 191 | { 192 | // setup_initial_pt(); 193 | #define DISPATCH(test) \ 194 | do { \ 195 | if (GDB_PT_DUMP_TEST == test) { \ 196 | print("Running: " #test "\n"); \ 197 | test(); \ 198 | test_complete(); \ 199 | } \ 200 | } while (0) 201 | 202 | print("Searching for test...\n"); 203 | DISPATCH(test_granularity_4k); 204 | DISPATCH(test_granularity_16k); 205 | DISPATCH(test_granularity_64k); 206 | 207 | no_test_executed(); 208 | } 209 | -------------------------------------------------------------------------------- /tests/integration_tests/custom_kernels/arm/64_bit/linker.ld: -------------------------------------------------------------------------------- 1 | ENTRY(_start) 2 | 3 | MEMORY 4 | { 5 | flash : ORIGIN = 0 LENGTH = 8M 6 | } 7 | 8 | SECTIONS 9 | { 10 | .text : 11 | { 12 | boot.o(.boot) 13 | entry*.o 14 | } > flash 15 | } 16 | -------------------------------------------------------------------------------- /tests/integration_tests/custom_kernels/common/common.h: -------------------------------------------------------------------------------- 1 | #pragma once 2 | 3 | #include 4 | 5 | void memset(volatile unsigned char *buf, int v, size_t sz) 6 | { 7 | for (size_t i = 0; i < sz; ++i) { 8 | buf[i] = v; 9 | } 10 | } 11 | 12 | void memcpy(volatile unsigned char *buf, volatile unsigned char *src, size_t sz) 13 | { 14 | for (size_t i = 0; i < sz; ++i) { 15 | buf[i] = src[i]; 16 | } 17 | } 18 | 19 | size_t strlen(volatile unsigned char *buf) 20 | { 21 | size_t i = 0; 22 | while (buf[i++]); 23 | return i - 1; 24 | } 25 | 26 | int strcmp(const char *left, const char *right) 27 | { 28 | while (1) { 29 | char l = *left; 30 | char r = *right; 31 | if (l != r) { 32 | return (int)l - (int)r; 33 | } 34 | if (l == 0) { 35 | break; 36 | } 37 | } 38 | return 0; 39 | } 40 | 41 | void write_to_addresses(void *addresses[], size_t n, void *value, size_t value_size) 42 | { 43 | for (size_t i = 0; i < n; ++i) { 44 | volatile void *ptr = addresses[i]; 45 | memcpy(ptr, value, value_size); 46 | } 47 | memset(value, 0, value_size); 48 | } 49 | -------------------------------------------------------------------------------- /tests/integration_tests/custom_kernels/x86/64_bit/Makefile: -------------------------------------------------------------------------------- 1 | 2 | test_list = \ 3 | setup_2mb_page_table_simple \ 4 | setup_4k_page_table_complex \ 5 | setup_4k_page_table_simple 6 | 7 | tests_x86_64: $(foreach test_name,$(test_list),$(addsuffix .bin,$(test_name))) 8 | 9 | define generate_test 10 | 11 | $(1).bin: boot.o entry_$(1).o linker.ld 12 | echo "Creating $(1)" 13 | ld -m elf_x86_64 -Tlinker.ld -o x86_64_test_entry_$(1).bin entry_$(1).o boot.o 14 | 15 | entry_$(1).o: entry.c ../../common/common.h 
../common/common_x86.h 16 | gcc -DGDB_PT_DUMP_TEST=$(1) entry.c -I./ -I../../common/ -I../common/ -Wall -Wextra -Werror -O3 -fno-builtin -m64 -no-pie -fno-PIE -ffreestanding -nostdlib -c -o entry_$(1).o 17 | 18 | endef 19 | 20 | $(foreach test_name,$(test_list),$(eval $(call generate_test,$(test_name)))) 21 | 22 | boot.o: boot.asm 23 | nasm boot.asm -f elf64 -o boot.o 24 | 25 | clean: 26 | find . -name "*.o" -exec rm {} \; 27 | find . -name "*.bin" -exec rm {} \; 28 | -------------------------------------------------------------------------------- /tests/integration_tests/custom_kernels/x86/64_bit/boot.asm: -------------------------------------------------------------------------------- 1 | kernel_offset equ 0x8000 2 | 3 | section .boot 4 | [bits 16] 5 | 6 | 7 | ; Copied from https://wiki.osdev.org/Entering_Long_Mode_Directly 8 | 9 | %define PAGE_PRESENT (1 << 0) 10 | %define PAGE_WRITE (1 << 1) 11 | 12 | %define CODE_SEG 0x0008 13 | %define DATA_SEG 0x0010 14 | 15 | ALIGN 4 16 | IDT: 17 | .Length dw 0 18 | .Base dd 0 19 | 20 | switch_to_long_mode: 21 | 22 | mov bp, 0x8000 23 | mov sp, bp 24 | 25 | mov bx, kernel_offset 26 | mov dh, 10 27 | call load_test 28 | 29 | ; Zero out the 16KiB buffer. 30 | ; Since we are doing a rep stosd, count should be bytes/4. 31 | 32 | push di ; REP STOSD alters DI. 33 | mov ecx, 0x1000 34 | xor eax, eax 35 | cld 36 | rep stosd 37 | pop di ; Get DI back. 38 | 39 | ; Build the Page Map Level 4. 40 | ; es:di points to the Page Map Level 4 table. 41 | lea eax, [es:di + 0x1000] ; Put the address of the Page Directory Pointer Table in to EAX. 42 | or eax, PAGE_PRESENT | PAGE_WRITE ; Or EAX with the flags - present flag, writable flag. 43 | mov [es:di], eax ; Store the value of EAX as the first PML4E. 44 | 45 | 46 | ; Build the Page Directory Pointer Table. 47 | lea eax, [es:di + 0x2000] ; Put the address of the Page Directory in to EAX. 48 | or eax, PAGE_PRESENT | PAGE_WRITE ; Or EAX with the flags - present flag, writable flag. 49 | mov [es:di + 0x1000], eax ; Store the value of EAX as the first PDPTE. 50 | 51 | 52 | ; Build the Page Directory. 53 | lea eax, [es:di + 0x3000] ; Put the address of the Page Table in to EAX. 54 | or eax, PAGE_PRESENT | PAGE_WRITE ; Or EAX with the flags - present flag, writeable flag. 55 | mov [es:di + 0x2000], eax ; Store to value of EAX as the first PDE. 56 | 57 | 58 | push di ; Save DI for the time being. 59 | lea di, [di + 0x3000] ; Point DI to the page table. 60 | mov eax, PAGE_PRESENT | PAGE_WRITE ; Move the flags into EAX - and point it to 0x0000. 61 | 62 | 63 | ; Build the Page Table. 64 | .loop_page_table: 65 | mov [es:di], eax 66 | add eax, 0x1000 67 | add di, 8 68 | cmp eax, 0x200000 ; If we did all 2MiB, end. 69 | jb .loop_page_table 70 | 71 | pop di ; Restore DI. 72 | 73 | ; Disable IRQs 74 | mov al, 0xFF ; Out 0xFF to 0xA1 and 0x21 to disable all IRQs. 75 | out 0xA1, al 76 | out 0x21, al 77 | 78 | nop 79 | nop 80 | 81 | lidt [IDT] ; Load a zero length IDT so that any NMI causes a triple fault. 82 | 83 | ; Enter long mode. 84 | mov eax, 10100000b ; Set the PAE and PGE bit. 85 | mov cr4, eax 86 | 87 | mov edx, edi ; Point CR3 at the PML4. 88 | mov cr3, edx 89 | 90 | mov ecx, 0xC0000080 ; Read from the EFER MSR. 91 | rdmsr 92 | 93 | or eax, 0x00000100 ; Set the LME bit. 94 | wrmsr 95 | 96 | mov ebx, cr0 ; Activate long mode - 97 | or ebx,0x80000001 ; - by enabling paging and protection simultaneously. 98 | mov cr0, ebx 99 | 100 | lgdt [GDT.Pointer] ; Load GDT.Pointer defined below. 
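; Note: at this point CR4.PAE/PGE, EFER.LME and CR0.PG/PE have all been set,
; so the far jump below reloads CS from the 64-bit code descriptor and
; completes the switch into long mode.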
101 | 102 | jmp CODE_SEG:long_mode ; Load CS with 64 bit segment and flush the instruction cache 103 | 104 | load_test: 105 | pusha 106 | push dx ; number of sectors (input parameter) 107 | 108 | mov ah, 0x02 ; read function 109 | mov al, dh ; number of sectors 110 | mov dl, 0x00 ; drive number 111 | mov dh, 0x00 ; head number 112 | mov ch, 0x00 ; cylinder number 113 | mov cl, 0x03 ; sector number 114 | 115 | ; read data to [es:bx] 116 | int 0x13 117 | jc error ; carry bit is set -> error 118 | 119 | pop dx 120 | cmp al, dh ; read correct number of sectors 121 | jne error 122 | 123 | popa 124 | ret 125 | 126 | error: 127 | mov bl, 0xff 128 | jmp error 129 | 130 | ; Global Descriptor Table 131 | GDT: 132 | .Null: 133 | dq 0x0000000000000000 ; Null Descriptor - should be present. 134 | 135 | .Code: 136 | dq 0x00209A0000000000 ; 64-bit code descriptor (exec/read). 137 | dq 0x0000920000000000 ; 64-bit data descriptor (read/write). 138 | 139 | ALIGN 4 140 | dw 0 ; Padding to make the "address of the GDT" field aligned on a 4-byte boundary 141 | 142 | .Pointer: 143 | dw $ - GDT - 1 ; 16-bit Size (Limit) of GDT. 144 | dd GDT ; 32-bit Base Address of GDT. (CPU will zero extend to 64-bit) 145 | 146 | 147 | [BITS 64] 148 | long_mode: 149 | 150 | mov ax, DATA_SEG 151 | 152 | mov ds, ax 153 | mov es, ax 154 | mov fs, ax 155 | mov gs, ax 156 | mov ss, ax 157 | 158 | extern entry 159 | jmp entry 160 | 161 | times 510 - ($ - $$) db 0 162 | db 0x55, 0xaa 163 | -------------------------------------------------------------------------------- /tests/integration_tests/custom_kernels/x86/64_bit/entry.c: -------------------------------------------------------------------------------- 1 | #include 2 | #include 3 | 4 | #define WIDTH 80 5 | #define HEIGHT 64 6 | 7 | #include "common.h" 8 | #include "common_x86.h" 9 | 10 | void setup_2mb_page_table_simple(); 11 | void setup_4k_page_table_simple(); 12 | void setup_4k_page_table_complex(); 13 | void test_complete(); 14 | void no_test_executed(); 15 | 16 | #define NUM_ENTRIES_PER_PAGE (4096 / sizeof(size_t)) 17 | _Static_assert(sizeof(size_t) == 8, "Expected that size_t is 8 bytes for 64-bit builds"); 18 | 19 | void entry() { 20 | map_all(1024 * 1024 * 1024); 21 | setup_serial(); 22 | 23 | #define DISPATCH(test) \ 24 | do { \ 25 | if (GDB_PT_DUMP_TEST == test) { \ 26 | WRITE_MSG("Running: " #test); \ 27 | test(); \ 28 | test_complete(); \ 29 | } \ 30 | } while (0) 31 | 32 | WRITE_MSG("Searching for test..."); 33 | DISPATCH(setup_2mb_page_table_simple); 34 | DISPATCH(setup_4k_page_table_complex); 35 | DISPATCH(setup_4k_page_table_simple); 36 | 37 | no_test_executed(); 38 | } 39 | 40 | void test_complete() { 41 | WRITE_MSG("Done"); \ 42 | while(1) { 43 | volatile int a = 0xcafefe; 44 | (void)a; 45 | } 46 | } 47 | 48 | void no_test_executed() { 49 | WRITE_MSG("Test not found"); \ 50 | while(1) { 51 | volatile int a = 0xdeaddead; 52 | (void)a; 53 | } 54 | } 55 | 56 | // The memory ranges are as follows: 57 | // 2mb rx 58 | // 2mb rxw 59 | // 2mb rx, user 60 | // 2mb rx, wb=0 61 | // 2mb rx, uc=1 62 | // 4mb rx 63 | void setup_2mb_page_table_simple() 64 | { 65 | pml4_t pml4 = (pml4_t)0x10000; 66 | pdp_t pdp = (pdp_t)0x11000; 67 | pd_t pd = (pd_t)0x12000; 68 | pml4[0] = (pml4e) { .page_frame = page_frame(pa(pdp)), .present = 1, .write = 1}; 69 | pdp[0] = (pdpe) { .page_frame = page_frame(pa(pd)), .present = 1, .write = 1}; 70 | pd[0] = (pde) { .page_frame = page_frame(0x0), .present = 1, .page_size = 1}; 71 | pd[1] = (pde) { .page_frame = page_frame(TWO_MB), .write = 1, 
.present = 1, .page_size = 1 }; 72 | pd[2] = (pde) { .page_frame = page_frame(TWO_MB * 2), .user = 1, .present = 1, .page_size = 1 }; 73 | pd[3] = (pde) { .page_frame = page_frame(TWO_MB * 3), .pwt = 1, .present = 1, .page_size = 1 }; 74 | pd[4] = (pde) { .page_frame = page_frame(TWO_MB * 4), .pcd = 1, .present = 1, .page_size = 1 }; 75 | pd[5] = (pde) { .page_frame = page_frame(TWO_MB * 5), .accessed = 1, .present = 1, .page_size = 1 }; 76 | pd[6] = (pde) { .page_frame = page_frame(TWO_MB * 6), .global = 1, .present = 1, .page_size = 1 }; 77 | 78 | wr_cr3(pa(pml4)); 79 | } 80 | 81 | // The memory ranges are as follows: 82 | // 4 mb rwx 83 | // 2 mb rx, user 84 | // 2 mib rx, wb=0 85 | // 2 mib rx, uc=1 86 | // 4 mib + 4kb rx 87 | // gap 4kb 88 | // 4kb rx 89 | // gap 2032kb 90 | // 4kb rwx 91 | void setup_4k_page_table_simple() 92 | { 93 | pml4_t pml4 = (pml4_t)0x10000; 94 | pdp_t pdp = (pdp_t)0x11000; 95 | pd_t pd = (pd_t)0x12000; 96 | pml4[0] = (pml4e) { .page_frame = page_frame(pa(pdp)), .present = 1, .write = 1 }; 97 | pdp[0] = (pdpe) { .page_frame = page_frame(pa(pd)), .present = 1, .write = 1 }; 98 | pd[0] = (pde) { .page_frame = page_frame(0x0), .present = 1, .page_size = 1, .write = 1 }; 99 | pd[1] = (pde) { .page_frame = page_frame(TWO_MB), .present = 1, .write = 1, .page_size = 1 }; 100 | pd[2] = (pde) { .page_frame = page_frame(TWO_MB * 2), .present = 1, .user = 1, .page_size = 1 }; 101 | pd[3] = (pde) { .page_frame = page_frame(TWO_MB * 3), .pwt = 1, .present = 1, .page_size = 1 }; 102 | pd[4] = (pde) { .page_frame = page_frame(TWO_MB * 4), .pcd = 1, .present = 1, .page_size = 1 }; 103 | pd[5] = (pde) { .page_frame = page_frame(TWO_MB * 5), .global = 1, .present = 1, .page_size = 1 }; 104 | pd[6] = (pde) { .page_frame = page_frame(TWO_MB * 6), .global = 1, .present = 1, .page_size = 1 }; 105 | 106 | pt_t pt = (pt_t)0x13000; 107 | memset(va_ptr(pt), 0, 0x1000); 108 | 109 | pd[7] = (pde) { .page_frame = page_frame(pa(pt)), .present = 1 }; 110 | pt[0] = (pte) { .page_frame = page_frame(0), .present = 1 }; 111 | pt[2] = (pte) { .page_frame = page_frame(0x1000), .present = 1 }; 112 | pt[511] = (pte) { .page_frame = page_frame(0x200000), .present = 1, .write = 1 }; 113 | 114 | wr_cr3(pa(pml4)); 115 | } 116 | 117 | // The memory ranges are as follows: 118 | // 516 kb rx 119 | // gap 4kb 120 | // 360 kb rx 121 | // 4 kb rx, user 122 | // 396 kb rx 123 | // 4kb rwx 124 | // 36 kb rx 125 | // gap 4kb 126 | // 452 kb rx 127 | // 4kb rwx 128 | // 220 rw 129 | // 4kb rwx 130 | // 36 kb rx 131 | // gap 4kb 132 | // 4kb rx, user 133 | // 254 mb rx 134 | void setup_4k_page_table_complex() 135 | { 136 | pml4_t pml4 = (pml4_t)0x100000; 137 | pdp_t pdp = (pdp_t)0x101000; 138 | pd_t pd = (pd_t)0x102000; 139 | pt_t pt = (pt_t)0x103000; 140 | 141 | pml4[0] = (pml4e) { .page_frame = page_frame(pa(pdp)), .present = 1, .write = 1 }; 142 | pdp[0] = (pdpe) { .page_frame = page_frame(pa(pd)), .present = 1, .write = 1 }; 143 | 144 | for (unsigned i = 0; i < 128; ++i) 145 | { 146 | pd[i] = (pde) { .page_frame = page_frame(pa(&pt[NUM_ENTRIES_PER_PAGE * i])), .present = 1, .write = 1 }; 147 | for (unsigned j = 0; j < NUM_ENTRIES_PER_PAGE; ++j) 148 | { 149 | pt[i * NUM_ENTRIES_PER_PAGE + j] = (pte) { .page_frame = page_frame(i * TWO_MB + j * FOUR_KB), .present = 1 }; 150 | } 151 | } 152 | 153 | pt[129] = (pte) { .page_frame = page_frame(0), .present = 0 }; 154 | pt[220] = (pte) { .page_frame = page_frame(0), .present = 1, .user = 1 }; 155 | pt[320] = (pte) { .page_frame = page_frame(0), .present = 1, .write = 
1 }; 156 | pt[330] = (pte) { .page_frame = page_frame(0), .present = 0 }; 157 | pt[444] = (pte) { .page_frame = page_frame(0), .present = 1, .write = 1 }; 158 | pt[500] = (pte) { .page_frame = page_frame(0), .present = 1, .write = 1 }; 159 | pt[NUM_ENTRIES_PER_PAGE - 2] = (pte) { .page_frame = page_frame(0), .present = 0 }; 160 | pt[NUM_ENTRIES_PER_PAGE - 1] = (pte) { .page_frame = page_frame(0), .present = 1, .user = 1 }; 161 | 162 | wr_cr3(pa(pml4)); 163 | } 164 | 165 | -------------------------------------------------------------------------------- /tests/integration_tests/custom_kernels/x86/64_bit/linker.ld: -------------------------------------------------------------------------------- 1 | ENTRY(_start) 2 | OUTPUT_FORMAT("binary") 3 | 4 | SECTIONS 5 | { 6 | . = 0x7c00; 7 | .boot : { boot.o(.boot) } 8 | . = 0x8000; 9 | .kernel : { entry*(.*) } 10 | } 11 | -------------------------------------------------------------------------------- /tests/integration_tests/custom_kernels/x86/common/common_x86.h: -------------------------------------------------------------------------------- 1 | #pragma once 2 | 3 | #include 4 | #include 5 | 6 | #define ONE_KB (1U * 1024U) 7 | #define FOUR_KB (ONE_KB * 4U) 8 | #define TWO_MB (2U * 1024U * 1024U) 9 | #define ONE_GIB (1024U * 1024U * 1024U) 10 | 11 | typedef volatile uint64_t* table_t; 12 | 13 | typedef struct { 14 | uint64_t present : 1; // Page table present flag 15 | uint64_t write : 1; // Read/write flag 16 | uint64_t user : 1; // User/supervisor flag 17 | uint64_t pwt : 1; // Page-level write-through flag 18 | uint64_t pcd : 1; // Page-level cache disable flag 19 | uint64_t accessed : 1; // Accessed flag 20 | uint64_t ignored : 1; // Accessed flag 21 | uint64_t page_size : 1; // Must be 0 for PDPE 22 | uint64_t ignored2 : 4; // Global page flag 23 | uint64_t page_frame : 40; // Physical address of the 4-KByte page table 24 | uint64_t reserved3 : 11; // Reserved, set to 0 25 | uint64_t nx : 1; // Execute disable flag 26 | } pml4e; 27 | typedef volatile pml4e* pml4_t; 28 | 29 | typedef struct { 30 | uint64_t present : 1; // Page table present flag 31 | uint64_t write : 1; // Read/write flag 32 | uint64_t user : 1; // User/supervisor flag 33 | uint64_t pwt : 1; // Page-level write-through flag 34 | uint64_t pcd : 1; // Page-level cache disable flag 35 | uint64_t accessed : 1; // Accessed flag 36 | uint64_t dirty : 1; // Accessed flag 37 | uint64_t page_size : 1; // Must be 0 for PDPE 38 | uint64_t global : 1; // Global page flag 39 | uint64_t available : 3; // Available for software use 40 | uint64_t page_frame : 40; // Physical address of the 4-KByte page table 41 | uint64_t reserved3 : 11; // Reserved, set to 0 42 | uint64_t nx : 1; // Execute disable flag 43 | } pdpe; 44 | typedef volatile pdpe* pdp_t; 45 | 46 | typedef struct { 47 | uint64_t present : 1; // Page present in memory 48 | uint64_t write : 1; // Writeable 49 | uint64_t user : 1; // User-mode accessible 50 | uint64_t pwt : 1; // Write-Through caching 51 | uint64_t pcd : 1; // Cache Disabled 52 | uint64_t accessed : 1; // Page has been accessed 53 | uint64_t dirty : 1; // Page has been written to 54 | uint64_t page_size : 1; // 1 if this entry maps a 2MiB page 55 | uint64_t global : 1; // If set, the page won't be flushed from the TLB on CR3 writes 56 | uint64_t reserved : 3; // Reserved bits (must be 0) 57 | uint64_t page_frame : 40; // Physical address of the 4KB page frame or 1GB page frame 58 | uint64_t reserved3 : 11; // Reserved bits (must be 0) 59 | uint64_t nx : 1; // No 
Execute (NX) bit 60 | } pde; 61 | typedef volatile pde* pd_t; 62 | 63 | typedef struct { 64 | uint64_t present : 1; // Page present in memory 65 | uint64_t write : 1; // Writeable 66 | uint64_t user : 1; // User-mode accessible 67 | uint64_t pwt : 1; // Write-Through caching 68 | uint64_t pcd : 1; // Cache Disabled 69 | uint64_t accessed : 1; // Page has been accessed 70 | uint64_t dirty : 1; // Page has been written to 71 | uint64_t pat : 1; // 1 if this entry maps a 2MiB page 72 | uint64_t global : 1; // If set, the page won't be flushed from the TLB on CR3 writes 73 | uint64_t reserved : 3; // Reserved bits (must be 0) 74 | uint64_t page_frame : 40; // Physical address of the 4KB page frame or 1GB page frame 75 | uint64_t reserved2 : 11; // Reserved 76 | uint64_t nx : 1; // NX 77 | } pte; 78 | typedef volatile pte* pt_t; 79 | 80 | inline uint64_t page_frame(const uint64_t addr) { 81 | return addr >> 12; 82 | } 83 | 84 | inline uint64_t pa(volatile const void *const addr) { 85 | return (uint64_t)addr; 86 | } 87 | 88 | inline void *va_ptr(volatile void *addr) { 89 | return (void *)addr; 90 | } 91 | 92 | inline void outb(uint16_t port, uint8_t value) { 93 | // Inline assembly to use the out instruction 94 | asm volatile ("outb %0, %1" : : "a"(value), "Nd"(port)); 95 | } 96 | 97 | inline uint8_t inb(uint16_t port) { 98 | uint8_t data; 99 | asm volatile ("inb %w1, %b0" : "=a"(data) : "Nd"(port)); 100 | return data; 101 | } 102 | 103 | #define IO_PORT 0x3f8 104 | 105 | static void setup_serial() { 106 | // Copied from https://stackoverflow.com/questions/69481715/initialize-serial-port-with-x86-assembly 107 | outb(IO_PORT + 1, 0x00); // Disable all interrupts 108 | outb(IO_PORT + 3, 0x80); // Enable DLAB (set baud rate divisor) 109 | outb(IO_PORT + 0, 0x03); // Set divisor to 3 (lo byte) 38400 baud 110 | outb(IO_PORT + 1, 0x00); // (hi byte) 111 | outb(IO_PORT + 3, 0x03); // 8 bits, no parity, one stop bit 112 | outb(IO_PORT + 2, 0xC7); // Enable FIFO, clear them, with 14-byte threshold 113 | outb(IO_PORT + 4, 0x0B); // IRQs enabled, RTS/DSR set 114 | outb(IO_PORT + 4, 0x1E); // Set in loopback mode, test the serial chip 115 | outb(IO_PORT + 0, 0xAE); // Test serial chip (send byte 0xAE and check if serial returns same byte) 116 | outb(IO_PORT + 4, 0x0F); 117 | } 118 | 119 | static inline void write_byte_to_serial(uint8_t data) { 120 | // Wait for the serial port to be ready to accept data 121 | while ((inb(IO_PORT + 5) & 0x20) == 0); 122 | 123 | // Write the byte to the serial port 124 | outb(IO_PORT, data); 125 | } 126 | 127 | static inline void write_str_to_serial(const char *data, size_t n) { 128 | for (size_t i = 0; i < n; ++i) 129 | { 130 | char c = data[i]; 131 | write_byte_to_serial((uint8_t)c); 132 | } 133 | write_byte_to_serial('\n'); 134 | } 135 | 136 | #define VIDMEM_ADDR 0xb8000 137 | 138 | static uint32_t screen_line; 139 | 140 | void write_to_screen(const char *const msg, unsigned msg_len) 141 | { 142 | volatile uint16_t *vidmem = (volatile uint16_t *)VIDMEM_ADDR; 143 | for (unsigned i = 0; i < msg_len; ++i) 144 | { 145 | uint16_t value = (uint8_t)msg[i]; 146 | value |= 0x0f00; 147 | vidmem[screen_line * 80 + i] = value; 148 | } 149 | } 150 | 151 | #define WRITE_MSG(MSG) \ 152 | do { \ 153 | write_to_screen(MSG, sizeof(MSG) - 1), screen_line++; \ 154 | write_str_to_serial(MSG, sizeof(MSG) - 1); \ 155 | } while (0) 156 | 157 | void wr_cr3(const uint64_t new_cr3) { 158 | asm volatile( 159 | "movq %0, %%cr3\n\t" 160 | : 161 | : "r"(new_cr3) 162 | : "memory" 163 | ); 164 | } 165 | 
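// Note: writing CR3 (wr_cr3 above) switches to the new top-level table and
// flushes all non-global TLB entries, so mappings built by the tests take
// effect immediately; rd_cr3 below reads back the current page-table root.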
166 | unsigned long rd_cr3() { 167 | unsigned long value; 168 | asm volatile( 169 | "movq %%cr3, %0\n\t" 170 | : "=r"(value) 171 | : 172 | : "memory" 173 | ); 174 | return value; 175 | } 176 | 177 | void wr_cr0(unsigned long new_cr0) { 178 | asm volatile( 179 | "movq %0, %%cr0\n\t" 180 | : 181 | : "r"(new_cr0) 182 | : "memory" 183 | ); 184 | } 185 | 186 | unsigned long rd_cr0() { 187 | unsigned long value; 188 | asm volatile( 189 | "movq %%cr0, %0\n\t" 190 | : "=r"(value) 191 | : 192 | : "memory" 193 | ); 194 | return value; 195 | } 196 | 197 | void map_all(size_t size) { 198 | pml4_t pml4 = (pml4_t)0x20000; 199 | pdp_t pdp = (pdp_t)0x21000; 200 | pml4[0] = (pml4e) { .page_frame = page_frame(pa(pdp)), .present = 1, .write = 1 }; 201 | for (unsigned i = 0; i < size / ONE_GIB; ++i) 202 | { 203 | pdp[i] = (pdpe) { .page_frame = page_frame(ONE_GIB * i), .present = 1, .write = 1, .page_size = 1 }; 204 | } 205 | wr_cr3(pa(pml4)); 206 | } 207 | 208 | -------------------------------------------------------------------------------- /tests/integration_tests/pt_utils.py: -------------------------------------------------------------------------------- 1 | 2 | from abc import ABC, abstractmethod 3 | import re 4 | 5 | class MetaFlags(ABC): 6 | def __init__(self): 7 | pass 8 | 9 | @abstractmethod 10 | def executable(self): 11 | raise Exception("Unimplemented") 12 | 13 | @abstractmethod 14 | def writeable(self): 15 | raise Exception("Unimplemented") 16 | 17 | @abstractmethod 18 | def user_accessible(self): 19 | raise Exception("Unimplemented") 20 | 21 | @abstractmethod 22 | def user_writeable(self): 23 | raise Exception("Unimplemented") 24 | 25 | @abstractmethod 26 | def user_executable(self): 27 | raise Exception("Unimplemented") 28 | 29 | @abstractmethod 30 | def super_accessible(self): 31 | raise Exception("Unimplemented") 32 | 33 | @abstractmethod 34 | def super_writeable(self): 35 | raise Exception("Unimplemented") 36 | 37 | @abstractmethod 38 | def super_executable(self): 39 | raise Exception("Unimplemented") 40 | 41 | 42 | class MetaFlagsX86(MetaFlags): 43 | def __init__(self, w, x, s, uc, wb): 44 | super().__init__() 45 | self.w = w 46 | self.x = x 47 | self.s = s 48 | self.uc = uc 49 | self.wb = wb 50 | 51 | def __eq__(self, other): 52 | fields = ["w", "x", "s", "uc", "wb"] 53 | return all(getattr(self, attr) == getattr(other, attr) for attr in fields) 54 | 55 | def executable(self): 56 | return self.x 57 | 58 | def writeable(self): 59 | return self.w 60 | 61 | def user_accessible(self): 62 | return not self.s 63 | 64 | def user_writeable(self): 65 | return self.user_accessible() and self.w 66 | 67 | def user_executable(self): 68 | return self.user_accessible() and self.w 69 | 70 | def super_accessible(self): 71 | return self.s 72 | 73 | def super_writeable(self): 74 | return self.super_accessible() and self.w 75 | 76 | def super_executable(self): 77 | return self.super_accessible() and self.x 78 | 79 | def __str__(self): 80 | # TODO 81 | return "UNIMPLEMENTED" 82 | 83 | class MetaFlagsArm64(MetaFlags): 84 | def __init__(self, ur, uw, ux, sr, sw, sx): 85 | super().__init__() 86 | self.ur = ur 87 | self.uw = uw 88 | self.ux = ux 89 | self.sr = sr 90 | self.sw = sw 91 | self.sx = sx 92 | 93 | def __eq__(self, other): 94 | fields = ["ur", "uw", "ux", "sr", "sw", "sx"] 95 | return all(getattr(self, attr) == getattr(other, attr) for attr in fields) 96 | 97 | def executable(self): 98 | return self.ux or self.sx 99 | 100 | def writeable(self): 101 | return self.uw or self.sw 102 | 103 | def 
user_accessible(self): 104 | return any([self.ur, self.uw, self.ux]) 105 | 106 | def user_writeable(self): 107 | return self.uw 108 | 109 | def user_executable(self): 110 | return self.ux 111 | 112 | def super_accessible(self): 113 | return any([self.sr, self.sw, self.sx]) 114 | 115 | def super_writeable(self): 116 | return self.sw 117 | 118 | def super_executable(self): 119 | return self.sx 120 | 121 | def __str__(self): 122 | # TODO 123 | return "UNIMPLEMENTED" 124 | 125 | class MetaFlagsRiscv(MetaFlags): 126 | def __init__(self, r, w, x, s): 127 | super().__init__() 128 | self.r = r 129 | self.w = w 130 | self.x = x 131 | self.s = s 132 | 133 | def __eq__(self, other): 134 | fields = ["r", "w", "x", "s"] 135 | return all(getattr(self, attr) == getattr(other, attr) for attr in fields) 136 | 137 | def __str__(self): 138 | # TODO 139 | return "UNIMPLEMENTED" 140 | 141 | def executable(self): 142 | return self.x 143 | 144 | def writeable(self): 145 | return self.w 146 | 147 | def user_accessible(self): 148 | return any([self.r, self.w, self.x]) and not self.s 149 | 150 | def user_writeable(self): 151 | return self.w and not self.s 152 | 153 | def user_executable(self): 154 | return self.w and not self.s 155 | 156 | def super_accessible(self): 157 | return any([self.r, self.w, self.x]) and self.s 158 | 159 | def super_writeable(self): 160 | return self.w and self.s 161 | 162 | def super_executable(self): 163 | return self.x and self.s 164 | 165 | class VirtRange(): 166 | def __init__(self, va_start, length, flags): 167 | self.va_start = va_start 168 | self.length = length 169 | self.flags = flags 170 | 171 | def __eq__(self, other): 172 | fields = ["va_start", "length", "flags"] 173 | return all(getattr(self, attr) == getattr(other, attr) for attr in fields) 174 | 175 | def __str__(self): 176 | return f"VA_start {hex(self.va_start)}, VA_length: {hex(self.length)}, Flags: {self.flags}" 177 | 178 | class Occurrence(): 179 | def __init__(self, occ_va, virt_range): 180 | self.occ_va = occ_va 181 | self.virt_range = virt_range 182 | 183 | def ansi_escape(line): 184 | ansi_escape = re.compile(r'\x1B(?:[@-Z\\-_]|\[[0-?]*[ -/]*[@-~])') 185 | return ansi_escape.sub('', line) 186 | 187 | def _parse_va_range_x86(line): 188 | pattern = r"\s*([0-9a-fA-Fx]+)\s*:\s*([0-9a-fA-Fx]+)\s*\|\s*W:(\d+)\s*X:(\d+)\s*S:(\d+)\s*UC:(\d+)\s*WB:(\d+)" 189 | line = ansi_escape(line) 190 | match = re.match(pattern, line) 191 | if match: 192 | range_va, range_size, flag_w, flag_x, flag_s, flag_uc, flag_wb = match.groups() 193 | flags = MetaFlagsX86(w=bool(int(flag_w)), x=bool(int(flag_x)), s=bool(int(flag_s)), uc=bool(int(flag_uc)), wb=bool(int(flag_wb))) 194 | virt_range = VirtRange(int(range_va, 16), int(range_size, 16), flags) 195 | return virt_range 196 | return None 197 | 198 | def _parse_va_range_arm_64(line): 199 | pattern = r"\s*([0-9a-fA-Fx]+)\s*:\s*([0-9a-fA-Fx]+)\s*\|\s*W:(\d+)\s*X:(\d+)\s*S:(\d+)\s*UC:(\d+)\s*WB:(\d+)" 200 | pattern = r"\s*([0-9a-fA-Fx]+)\s*:\s*([0-9a-fA-Fx]+)\s*R:(\d+)\s+W:(\d+)\s+X:(\d+)\s+R:(\d+)\s+W:(\d+)\s+X:(\d+)" 201 | line = ansi_escape(line) 202 | match = re.match(pattern, line) 203 | if match: 204 | range_va, range_size, flag_user_r, flag_user_w, flag_user_x, flag_super_r, flag_super_w, flag_super_x = match.groups() 205 | flags = MetaFlagsArm64( \ 206 | ur=bool(int(flag_user_r)), uw=bool(int(flag_user_w)), ux=bool(int(flag_user_x)), \ 207 | sr=bool(int(flag_super_r)), sw=bool(int(flag_super_w)), sx=bool(int(flag_super_x))) 208 | virt_range = VirtRange(int(range_va, 16), 
int(range_size, 16), flags) 209 | return virt_range 210 | return None 211 | 212 | def _parse_va_range_riscv(line): 213 | pattern = r"\s*([0-9a-fA-Fx]+)\s*:\s*([0-9a-fA-Fx]+)\s*\|\s*W:(\d+)\s*X:(\d+)\s*R:(\d+)\s*S:(\d+)" 214 | line = ansi_escape(line) 215 | match = re.match(pattern, line) 216 | if match: 217 | range_va, range_size, flag_w, flag_x, flag_r, flag_s = match.groups() 218 | flags = MetaFlagsRiscv(w=bool(int(flag_w)), x=bool(int(flag_x)), s=bool(int(flag_s)), r=bool(int(flag_r))) 219 | virt_range = VirtRange(int(range_va, 16), int(range_size, 16), flags) 220 | return virt_range 221 | return None 222 | 223 | def parse_va_ranges(arch, command_output): 224 | func = None 225 | if arch == "x86_64": 226 | func = _parse_va_range_x86 227 | elif arch == "arm_64": 228 | func = _parse_va_range_arm_64 229 | elif arch == "riscv": 230 | func = _parse_va_range_riscv 231 | else: 232 | raise Exception("Unknown architecture") 233 | 234 | lines = command_output.split("\n") 235 | ranges = [] 236 | for line in lines: 237 | range_info = func(line) 238 | if range_info: 239 | ranges.append(range_info) 240 | return ranges 241 | 242 | def _parse_occurrences_x86(command_output): 243 | occ_lines = command_output.split("\n") 244 | pattern = r"Found at (\w+) in\s+(\w+)\s+:\s+(\w+)\s+\|\s+W:(\d+)\s+X:(\d+)\s+S:(\d+)\s+UC:(\d+)\s+WB:(\d+)" 245 | occs = [] 246 | for line in occ_lines: 247 | line = ansi_escape(line) 248 | match = re.match(pattern, line) 249 | if match: 250 | found_at, range_va, range_size, flag_w, flag_x, flag_s, flag_uc, flag_wb = match.groups() 251 | flags = MetaFlagsX86(w=bool(int(flag_w)), x=bool(int(flag_x)), s=bool(int(flag_s)), uc=bool(int(flag_uc)), wb=bool(int(flag_wb))) 252 | virt_range = VirtRange(int(range_va, 16), int(range_size, 16), flags) 253 | occ = Occurrence(int(found_at, 16), virt_range) 254 | occs.append(occ) 255 | return occs 256 | 257 | def _parse_occurrences_arm_64(command_output): 258 | occ_lines = command_output.split("\n") 259 | pattern = r"Found at (\w+) in\s+(\w+)\s+:\s+(\w+)\s+R:(\d+)\s+W:(\d+)\s+X:(\d+)\s+R:(\d+)\s+W:(\d+)\s+X:(\d+)" 260 | occs = [] 261 | for line in occ_lines: 262 | line = ansi_escape(line) 263 | match = re.match(pattern, line) 264 | if match: 265 | found_at, range_va, range_size, flag_user_r, flag_user_w, flag_user_x, flag_super_r, flag_super_w, flag_super_x = match.groups() 266 | flags = MetaFlagsArm64( \ 267 | ur=bool(int(flag_user_r)), uw=bool(int(flag_user_w)), ux=bool(int(flag_user_x)), \ 268 | sr=bool(int(flag_super_r)), sw=bool(int(flag_super_w)), sx=bool(int(flag_super_x))) 269 | virt_range = VirtRange(int(range_va, 16), int(range_size, 16), flags) 270 | occ = Occurrence(int(found_at, 16), virt_range) 271 | occs.append(occ) 272 | return occs 273 | 274 | def _parse_occurrences_riscv(command_output): 275 | occ_lines = command_output.split("\n") 276 | pattern = r"Found at (\w+) in\s+(\w+)\s+:\s+(\w+)\s+\|\s+W:(\d+)\s+X:(\d+)\s+R:(\d+)\s+S:(\d+)" 277 | occs = [] 278 | for line in occ_lines: 279 | line = ansi_escape(line) 280 | match = re.match(pattern, line) 281 | if match: 282 | found_at, range_va, range_size, flag_w, flag_x, flag_r, flag_s = match.groups() 283 | flags = MetaFlagsRiscv(w=bool(int(flag_w)), x=bool(int(flag_x)), s=bool(int(flag_s)), r=bool(int(flag_r))) 284 | virt_range = VirtRange(int(range_va, 16), int(range_size, 16), flags) 285 | occ = Occurrence(int(found_at, 16), virt_range) 286 | occs.append(occ) 287 | return occs 288 | 289 | def parse_occurrences(arch, command_output): 290 | if arch == "x86_64": 291 | return 
_parse_occurrences_x86(command_output) 292 | elif arch == "arm_64": 293 | return _parse_occurrences_arm_64(command_output) 294 | elif arch == "riscv": 295 | return _parse_occurrences_riscv(command_output) 296 | else: 297 | raise Exception("Unknown architecture") 298 | -------------------------------------------------------------------------------- /tests/integration_tests/run_integration_tests.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env -S python3 -m pytest 2 | 3 | import os 4 | import copy 5 | import re 6 | import pytest 7 | import sys 8 | import filecmp 9 | 10 | from vm_utils import * 11 | from pt_utils import * 12 | 13 | def verify_all_search_occurrences(monitor, occs, mem_len, checker): 14 | for occ in occs: 15 | memory = monitor.read_virt_memory(occ.occ_va, mem_len) 16 | assert(checker(occ, memory)) 17 | 18 | def get_all_arch(): 19 | return ["x86_64", "arm_64", "riscv"] 20 | 21 | def get_all_images(): 22 | return [("x86_64", "linux_x86_64"), ("x86_64", "linux_x86_64_la57"), ("arm_64", "linux_arm_64_4k"), ("arm_64", "linux_arm_64_4k_kpti"), ("arm_64", "linux_arm_64_64k"), ("riscv", "linux_riscv")] 23 | 24 | def create_resources(arch_name, linux_image, kaslr): 25 | vm = create_linux_vm(arch_name, linux_image) 26 | if "la57" in linux_image: 27 | vm.start(kaslr=kaslr, la57=True) 28 | else: 29 | vm.start(kaslr=kaslr) 30 | vm.wait_for_shell() 31 | gdb = GdbCommandExecutor(vm) 32 | monitor = QemuMonitorExecutor(vm) 33 | memory_flatview = monitor.get_memory_flat_view() 34 | 35 | if bool(os.getenv("GDB_PT_DUMP_TESTS_PAUSE_AFTER_BOOT")) == True: 36 | print("Sleeping...") 37 | time.sleep(10000) 38 | 39 | return (vm, gdb, monitor, memory_flatview) 40 | 41 | def get_qemu_version(): 42 | try: 43 | # Run the `qemu-system-x86_64 --version` command 44 | result = subprocess.run( 45 | ["qemu-system-x86_64", "--version"], 46 | stdout=subprocess.PIPE, 47 | stderr=subprocess.PIPE, 48 | text=True 49 | ) 50 | if result.returncode == 0: 51 | # Extract the version number from the output 52 | version_line = result.stdout.splitlines()[0] 53 | version = version_line.split()[3] # Extract the version (e.g., '8.2.2') 54 | return version 55 | else: 56 | raise Exception(f"Error: {result.stderr.strip()}") 57 | except FileNotFoundError: 58 | raise Exception("QEMU is not installed or qemu-system-x86_64 is not in the PATH.") 59 | 60 | def normalize_version(version, length=4): 61 | """ 62 | Normalize a version string by padding it to the required length with zeros. 63 | 64 | :param version: The version string to normalize (e.g., "8.2"). 65 | :param length: The target number of components in the version. 66 | :return: A normalized version string (e.g., "8.2.0"). 67 | """ 68 | parts = version.split('.') 69 | while len(parts) < length: 70 | parts.append('0') # Pad with zeros 71 | return '.'.join(parts[:length]) 72 | 73 | def compare_versions(version, target): 74 | """ 75 | Compares two version strings. 76 | 77 | :param version: The current version string (e.g., "8.2.1"). 78 | :param target: The target version string (e.g., "8.2.2"). 79 | :return: "above", "below", or "equal". 
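    Example: compare_versions("8.2.1", "8.2.2") returns "below", and
    compare_versions("7.3", "7.3.1") also returns "below" after normalization.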
80 | """ 81 | version = normalize_version(version) 82 | target = normalize_version(target) 83 | version_parts = list(map(int, version.split('.'))) 84 | target_parts = list(map(int, target.split('.'))) 85 | 86 | for v, t in zip(version_parts, target_parts): 87 | if v > t: 88 | return "above" 89 | elif v < t: 90 | return "below" 91 | # If all parts compared so far are equal, check the length of the version 92 | if len(version_parts) > len(target_parts): 93 | return "above" 94 | elif len(version_parts) < len(target_parts): 95 | return "below" 96 | return "equal" 97 | 98 | _ARE_ARM64_CUSTOM_IMAGES_BROKEN=None 99 | def are_arm64_custom_images_broken(): 100 | assert(compare_versions("8.2.1", "8.2.2") == "below") 101 | assert(compare_versions("8.2.2", "8.2.2") == "equal") 102 | assert(compare_versions("8.2.3", "8.2.2") == "above") 103 | assert(compare_versions("7.3", "8.2.2") == "below") 104 | assert(compare_versions("7.3", "7.3.1") == "below") 105 | 106 | global _ARE_ARM64_CUSTOM_IMAGES_BROKEN 107 | if _ARE_ARM64_CUSTOM_IMAGES_BROKEN == None: 108 | qemu_version = get_qemu_version() 109 | res = compare_versions(qemu_version, "8.2.2") 110 | if res in ["above", "equal"]: 111 | # Something gets broken with the custom kernels with QEMU version 8.2.2 112 | # The custom kernels work fine on 6.2.0 113 | _ARE_ARM64_CUSTOM_IMAGES_BROKEN = True 114 | else: 115 | _ARE_ARM64_CUSTOM_IMAGES_BROKEN = False 116 | 117 | return _ARE_ARM64_CUSTOM_IMAGES_BROKEN 118 | 119 | @pytest.fixture 120 | def create_resources_fixture_nokaslr(arch_name, linux_image): 121 | vm, gdb, monitor, memory_flatview = create_resources(arch_name, linux_image, kaslr=False) 122 | yield (vm, gdb, monitor, memory_flatview) 123 | monitor.stop() 124 | gdb.stop() 125 | vm.stop() 126 | 127 | @pytest.fixture 128 | def create_resources_fixture(arch_name, linux_image): 129 | vm, gdb, monitor, memory_flatview = create_resources(arch_name, linux_image, kaslr=True) 130 | yield (vm, gdb, monitor, memory_flatview) 131 | monitor.stop() 132 | gdb.stop() 133 | vm.stop() 134 | 135 | @pytest.mark.parametrize("arch_name, linux_image", get_all_images()) 136 | def test_pt_smoke(create_resources_fixture, arch_name, linux_image): 137 | vm, gdb, monitor, memory_flatview = create_resources_fixture 138 | res = gdb.run_cmd("pt") 139 | 140 | @pytest.mark.parametrize("arch_name, linux_image", get_all_images()) 141 | def test_pt_filter_smoke(create_resources_fixture, arch_name, linux_image): 142 | vm, gdb, monitor, memory_flatview = create_resources_fixture 143 | gdb.run_cmd("pt") 144 | gdb.run_cmd("pt -filter x") 145 | gdb.run_cmd("pt -filter w") 146 | gdb.run_cmd("pt -filter ro") 147 | gdb.run_cmd("pt -filter w|x") 148 | gdb.run_cmd("pt -filter u") 149 | gdb.run_cmd("pt -filter s") 150 | gdb.run_cmd("pt -filter w x") 151 | 152 | @pytest.mark.parametrize("arch_name, linux_image", get_all_images()) 153 | def test_pt_kaslr_smoke(create_resources_fixture, arch_name, linux_image): 154 | vm, gdb, monitor, memory_flatview = create_resources_fixture 155 | res = gdb.run_cmd("pt -kaslr") 156 | 157 | @pytest.mark.parametrize("arch_name, linux_image", get_all_images()) 158 | def test_pt_phys_verbose_smoke(create_resources_fixture, arch_name, linux_image): 159 | vm, gdb, monitor, memory_flatview = create_resources_fixture 160 | res = gdb.run_cmd("pt -phys_verbose") 161 | 162 | def _test_pt_search(vm, gdb, monitor, search_command, mem_len, checker): 163 | res = gdb.run_cmd(search_command) 164 | monitor.pause() 165 | occs = parse_occurrences(vm.get_arch(), res.output) 166 | 
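    # The search must report at least one occurrence; each reported hit is then
    # read back through the QEMU monitor and checked against the expected bytes.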
assert(len(occs) > 0) 167 | verify_all_search_occurrences(monitor, occs, mem_len, checker) 168 | 169 | @pytest.mark.parametrize("arch_name, linux_image", get_all_images()) 170 | def test_pt_search_string(create_resources_fixture, arch_name, linux_image): 171 | vm, gdb, monitor, memory_flatview = create_resources_fixture 172 | checker = lambda _, mem: mem == b"Linux" 173 | _test_pt_search(vm, gdb, monitor, "pt -ss Linux", 5, checker) 174 | 175 | @pytest.mark.parametrize("arch_name, linux_image", get_all_images()) 176 | def test_pt_search_s4(create_resources_fixture, arch_name, linux_image): 177 | vm, gdb, monitor, memory_flatview = create_resources_fixture 178 | checker = lambda _, mem: mem == b"\x41\x41\x41\x41" 179 | _test_pt_search(vm, gdb, monitor, "pt -s4 0x41414141", 4, checker) 180 | 181 | @pytest.mark.parametrize("arch_name, linux_image", get_all_images()) 182 | def test_pt_search_s8(create_resources_fixture, arch_name, linux_image): 183 | vm, gdb, monitor, memory_flatview = create_resources_fixture 184 | checker = lambda _, mem: mem == b"\xfe\xff\xff\xff\xff\xff\xff\xff" 185 | _test_pt_search(vm, gdb, monitor, "pt -s8 0xfffffffffffffffe", 8, checker) 186 | 187 | @pytest.mark.parametrize("arch_name, linux_image", get_all_images()) 188 | def test_pt_range_exists(create_resources_fixture, arch_name, linux_image): 189 | vm, gdb, monitor, memory_flatview = create_resources_fixture 190 | res = gdb.run_cmd(f"pt") 191 | monitor.pause() 192 | 193 | ranges = parse_va_ranges(arch_name, res.output) 194 | assert(len(ranges) > 0) 195 | for r in ranges: 196 | if check_if_belongs_to_io_or_rom(monitor, memory_flatview, r.va_start): 197 | print(f"Skipping {str(r)}") 198 | continue 199 | 200 | print("Range base is", hex(r.va_start)) 201 | addr = r.va_start 202 | assert(check_va_exists(monitor, memory_flatview, addr)) 203 | 204 | # BUG: for some reason qemu does not allow reading these physical addresses 205 | if r.va_start == 0xffff800010010000 or r.va_start == 0xffff800010030000 or r.va_start == 0xffffffc008010000 or r.va_start == 0xffffffc008030000 or r.va_start == 0xfffffe0008020000 or r.va_start == 0xfffffe0008040000 or r.va_start == 0xfffffe0008060000 or r.va_start == 0xfffffe00084e0000 or r.va_start == 0xff20000000245000 or r.va_start == 0xff2000000024d000: 206 | print(f"Skip reading {hex(r.va_start)} due to a weird qemu bug") 207 | continue 208 | 209 | addr = r.va_start + int(r.length / 2) 210 | assert(check_va_exists(monitor, memory_flatview, addr)) 211 | 212 | addr = r.va_start + r.length - 4 213 | assert(check_va_exists(monitor, memory_flatview, addr)) 214 | 215 | @pytest.mark.parametrize("arch_name, linux_image", get_all_images()) 216 | def test_pt_walk_many_ranges(create_resources_fixture, arch_name, linux_image): 217 | vm, gdb, monitor, memory_flatview = create_resources_fixture 218 | res = gdb.run_cmd(f"pt") 219 | monitor.pause() 220 | 221 | ranges = parse_va_ranges(arch_name, res.output) 222 | assert(len(ranges) > 0) 223 | for r in ranges[0:16]: 224 | print(f"Trying {hex(r.va_start)} {hex(r.length)}") 225 | output = gdb.run_cmd(f"pt -walk {hex(r.va_start)}") 226 | assert("Last stage faulted" not in output.output) 227 | 228 | output = gdb.run_cmd(f"pt -walk {hex(int(r.va_start + r.length / 2))}") 229 | assert("Last stage faulted" not in output.output) 230 | 231 | output = gdb.run_cmd(f"pt -walk {hex(r.va_start + r.length - 1)}") 232 | assert("Last stage faulted" not in output.output) 233 | 234 | @pytest.mark.parametrize("arch_name, linux_image", get_all_images()) 235 | def 
test_pt_walk_first_stage_fault(create_resources_fixture, arch_name, linux_image): 236 | vm, gdb, monitor, memory_flatview = create_resources_fixture 237 | res = gdb.run_cmd(f"pt") 238 | monitor.pause() 239 | 240 | ranges = parse_va_ranges(arch_name, res.output) 241 | assert(len(ranges) > 0) 242 | 243 | unmapped_address = ranges[0].va_start - 0x100 244 | output = gdb.run_cmd(f"pt -walk {hex(unmapped_address)}") 245 | assert("Last stage faulted" in output.output) 246 | 247 | def _test_pt_filter_common(vm, gdb, monitor, executions): 248 | for (_cmd, _check) in executions: 249 | res = gdb.run_cmd(_cmd) 250 | monitor.pause() 251 | 252 | ranges = parse_va_ranges(vm.get_arch(), res.output) 253 | assert(len(ranges) > 0) 254 | for r in ranges: 255 | assert(_check(r)) 256 | 257 | @pytest.mark.parametrize("arch_name, linux_image", get_all_images()) 258 | def test_pt_filter_executable(create_resources_fixture, arch_name, linux_image): 259 | vm, gdb, monitor, memory_flatview = create_resources_fixture 260 | executions = [("pt -filter x", lambda r: r.flags.executable()), ("pt -filter _x", lambda r: not r.flags.executable())] 261 | _test_pt_filter_common(vm, gdb, monitor, executions) 262 | 263 | @pytest.mark.parametrize("arch_name, linux_image", get_all_images()) 264 | def test_pt_filter_writeable(create_resources_fixture, arch_name, linux_image): 265 | vm, gdb, monitor, memory_flatview = create_resources_fixture 266 | executions = [("pt -filter w", lambda r: r.flags.writeable()), ("pt -filter _w", lambda r: not r.flags.writeable())] 267 | _test_pt_filter_common(vm, gdb, monitor, executions) 268 | 269 | @pytest.mark.parametrize("arch_name, linux_image", get_all_images()) 270 | def test_pt_filter_read_only(create_resources_fixture, arch_name, linux_image): 271 | vm, gdb, monitor, memory_flatview = create_resources_fixture 272 | executions = [("pt -filter ro", lambda r: (not r.flags.user_executable() and not r.flags.user_writeable()) or (not r.flags.super_executable() and not r.flags.super_writeable()))] 273 | _test_pt_filter_common(vm, gdb, monitor, executions) 274 | 275 | @pytest.mark.parametrize("arch_name, linux_image", get_all_images()) 276 | def test_pt_filter_user_accessible(create_resources_fixture, arch_name, linux_image): 277 | if arch_name == "arm_64" and "kpti" in linux_image: 278 | # BUG: needs another kernel image 279 | pytest.skip("User ranges are never visible because user page table is unmapped") 280 | vm, gdb, monitor, memory_flatview = create_resources_fixture 281 | executions = [("pt -filter u", lambda r: r.flags.user_accessible())] 282 | _test_pt_filter_common(vm, gdb, monitor, executions) 283 | 284 | @pytest.mark.parametrize("arch_name, linux_image", get_all_images()) 285 | def test_pt_filter_kernel_only_accessible(create_resources_fixture, arch_name, linux_image): 286 | if arch_name == "arm_64" and "kpti" in linux_image: 287 | # BUG: needs another kernel image 288 | pytest.skip("The _s would result into 0 ranges because user page table is unmapped.") 289 | executions = [("pt -filter s", lambda r: r.flags.super_accessible())] 290 | vm, gdb, monitor, memory_flatview = create_resources_fixture 291 | _test_pt_filter_common(vm, gdb, monitor, executions) 292 | 293 | @pytest.mark.parametrize("arch_name, linux_image", get_all_images()) 294 | def test_pt_filter_multiple_filters_user(create_resources_fixture, arch_name, linux_image): 295 | if arch_name == "arm_64" and "kpti" in linux_image: 296 | # BUG: needs another kernel image 297 | pytest.skip("This would result into 0 ranges because user 
page table is unmapped.") 298 | executions = [ \ 299 | ("pt -filter w u", lambda r: r.flags.user_writeable()), \ 300 | ] 301 | vm, gdb, monitor, memory_flatview = create_resources_fixture 302 | _test_pt_filter_common(vm, gdb, monitor, executions) 303 | 304 | @pytest.mark.parametrize("arch_name, linux_image", get_all_images()) 305 | def test_pt_filter_multiple_filters_super(create_resources_fixture, arch_name, linux_image): 306 | executions = [ \ 307 | ("pt -filter w s", lambda r: r.flags.super_writeable()), \ 308 | ("pt -filter x s", lambda r: r.flags.super_executable()), \ 309 | ] 310 | vm, gdb, monitor, memory_flatview = create_resources_fixture 311 | _test_pt_filter_common(vm, gdb, monitor, executions) 312 | 313 | @pytest.mark.parametrize("arch_name, linux_image", get_all_images()) 314 | def test_pt_filter_and_search(create_resources_fixture, arch_name, linux_image): 315 | checker = lambda occ, data: occ.virt_range.flags.writeable() == True and data == b"Linux" 316 | vm, gdb, monitor, memory_flatview = create_resources_fixture 317 | _test_pt_search(vm, gdb, monitor, "pt -ss Linux -filter w", 5, checker) 318 | 319 | checker = lambda occ, data: \ 320 | ((occ.virt_range.flags.user_executable() == False and occ.virt_range.flags.user_executable() == False) or \ 321 | (occ.virt_range.flags.super_executable() == False and occ.virt_range.flags.super_executable() == False)) and \ 322 | data == b"Linux" 323 | _test_pt_search(vm, gdb, monitor, "pt -ss Linux -filter ro", 5, checker) 324 | 325 | @pytest.mark.parametrize("arch_name, linux_image", get_all_images()) 326 | def test_pt_range_command(create_resources_fixture, arch_name, linux_image): 327 | vm, gdb, monitor, memory_flatview = create_resources_fixture 328 | 329 | res = gdb.run_cmd("pt") 330 | ranges = parse_va_ranges(arch_name, res.output) 331 | assert(len(ranges) > 10) 332 | 333 | r0, r1, r2, r3 = ranges[0:4] 334 | 335 | # cover first range only 336 | res = gdb.run_cmd(f"pt -range {r0.va_start} {r0.va_start + r0.length - 1}") 337 | subranges = parse_va_ranges(arch_name, res.output) 338 | assert(len(subranges) == 1) 339 | assert(subranges[0] == r0) 340 | 341 | # cover first range partially 342 | res = gdb.run_cmd(f"pt -range {r0.va_start + 0x1} {r0.va_start + r0.length - 1}") 343 | subranges = parse_va_ranges(arch_name, res.output) 344 | assert(len(subranges) == 0) 345 | 346 | # cover range 2 partially 347 | res = gdb.run_cmd(f"pt -range {r0.va_start + 0x1} {r1.va_start}") 348 | subranges = parse_va_ranges(arch_name, res.output) 349 | assert(len(subranges) == 1) 350 | assert(subranges[0] == r1) 351 | 352 | # cover ranges 0, 1, 2 353 | res = gdb.run_cmd(f"pt -range {r0.va_start} {r2.va_start}") 354 | subranges = parse_va_ranges(arch_name, res.output) 355 | assert(len(subranges) == 3) 356 | assert(subranges[0] == r0) 357 | assert(subranges[1] == r1) 358 | assert(subranges[2] == r2) 359 | 360 | # cover ranges 1, 2, 3 361 | res = gdb.run_cmd(f"pt -range {r1.va_start} {r3.va_start}") 362 | subranges = parse_va_ranges(arch_name, res.output) 363 | assert(len(subranges) == 3) 364 | assert(subranges[0] == r1) 365 | assert(subranges[1] == r2) 366 | assert(subranges[2] == r3) 367 | 368 | # end before start 369 | res = gdb.run_cmd(f"pt -range 0x40000 0x30000") 370 | subranges = parse_va_ranges(arch_name, res.output) 371 | assert(len(subranges) == 0) 372 | 373 | @pytest.mark.parametrize("arch_name, linux_image", get_all_images()) 374 | def test_pt_has_command(create_resources_fixture, arch_name, linux_image): 375 | vm, gdb, monitor, memory_flatview = 
create_resources_fixture 376 | 377 | res = gdb.run_cmd("pt") 378 | ranges = parse_va_ranges(arch_name, res.output) 379 | assert(len(ranges) > 10) 380 | 381 | r0, r1, r2, r3 = ranges[0:4] 382 | 383 | res = gdb.run_cmd(f"pt -has {r0.va_start}") 384 | subranges = parse_va_ranges(arch_name, res.output) 385 | assert(len(subranges) == 1) 386 | assert(subranges[0] == r0) 387 | 388 | res = gdb.run_cmd(f"pt -has {r0.va_start + 1}") 389 | subranges = parse_va_ranges(arch_name, res.output) 390 | assert(len(subranges) == 1) 391 | assert(subranges[0] == r0) 392 | 393 | res = gdb.run_cmd(f"pt -has {r0.va_start + r0.length - 1}") 394 | subranges = parse_va_ranges(arch_name, res.output) 395 | assert(len(subranges) == 1) 396 | assert(subranges[0] == r0) 397 | 398 | res = gdb.run_cmd(f"pt -has {r1.va_start}") 399 | subranges = parse_va_ranges(arch_name, res.output) 400 | assert(len(subranges) == 1) 401 | assert(subranges[0] == r1) 402 | 403 | res = gdb.run_cmd(f"pt -has {ranges[-1].va_start + ranges[-1].length - 1}") 404 | subranges = parse_va_ranges(arch_name, res.output) 405 | assert(len(subranges) == 1) 406 | assert(subranges[0] == ranges[-1]) 407 | 408 | res = gdb.run_cmd(f"pt -has {ranges[-1].va_start + ranges[-1].length}") 409 | subranges = parse_va_ranges(arch_name, res.output) 410 | assert(len(subranges) == 0) 411 | 412 | @pytest.mark.parametrize("arch_name, linux_image", get_all_images()) 413 | def test_pt_before_command(create_resources_fixture, arch_name, linux_image): 414 | if arch_name == "arm_64": 415 | # BUG: needs gdb_pt_dump fix 416 | pytest.skip("The gdb_pt_dump aarch64 backend does not implement cut_before and cut_after.") 417 | 418 | vm, gdb, monitor, memory_flatview = create_resources_fixture 419 | 420 | res = gdb.run_cmd("pt") 421 | ranges = parse_va_ranges(arch_name, res.output) 422 | assert(len(ranges) > 10) 423 | 424 | r0, r1, r2, r3 = ranges[0:4] 425 | 426 | res = gdb.run_cmd(f"pt -before {r0.va_start}") 427 | subranges = parse_va_ranges(arch_name, res.output) 428 | assert(len(subranges) == 0) 429 | 430 | res = gdb.run_cmd(f"pt -before {r0.va_start + r0.length}") 431 | subranges = parse_va_ranges(arch_name, res.output) 432 | assert(len(subranges) == 1) 433 | assert(subranges[0] == r0) 434 | 435 | res = gdb.run_cmd(f"pt -before {r0.va_start + 0x100}") 436 | subranges = parse_va_ranges(arch_name, res.output) 437 | assert(len(subranges) == 1) 438 | r_tmp = copy.deepcopy(r0) 439 | r_tmp.length = 0x100 440 | print(r0.va_start, subranges[0].va_start, subranges[0].length) 441 | assert(subranges[0] == r_tmp) 442 | 443 | res = gdb.run_cmd(f"pt -before {r2.va_start + r2.length}") 444 | subranges = parse_va_ranges(arch_name, res.output) 445 | assert(len(subranges) == 3) 446 | assert(subranges[0] == r0) 447 | assert(subranges[1] == r1) 448 | assert(subranges[2] == r2) 449 | 450 | res = gdb.run_cmd(f"pt -before {r3.va_start + r3.length - 0x100}") 451 | subranges = parse_va_ranges(arch_name, res.output) 452 | assert(len(subranges) == 4) 453 | assert(subranges[0] == r0) 454 | assert(subranges[1] == r1) 455 | assert(subranges[2] == r2) 456 | r_tmp = copy.deepcopy(r3) 457 | r_tmp.length = r3.length - 0x100 458 | assert(subranges[3] == r_tmp) 459 | 460 | res = gdb.run_cmd(f"pt -before {ranges[-1].va_start + ranges[-1].length}") 461 | subranges = parse_va_ranges(arch_name, res.output) 462 | assert(subranges == ranges) 463 | 464 | @pytest.mark.parametrize("arch_name, linux_image", get_all_images()) 465 | def test_pt_after_command(create_resources_fixture, arch_name, linux_image): 466 | if arch_name == 
"arm_64": 467 | # BUG: needs gdb_pt_dump fix 468 | pytest.skip("The gdb_pt_dump aarch64 backend does not implement cut_before and cut_after.") 469 | 470 | vm, gdb, monitor, memory_flatview = create_resources_fixture 471 | 472 | res = gdb.run_cmd("pt") 473 | ranges = parse_va_ranges(arch_name, res.output) 474 | assert(len(ranges) > 10) 475 | 476 | res = gdb.run_cmd(f"pt -after {ranges[-1].va_start}") 477 | subranges = parse_va_ranges(arch_name, res.output) 478 | assert(subranges == [ranges[-1]]) 479 | 480 | res = gdb.run_cmd(f"pt -after {ranges[-1].va_start + ranges[-1].length}") 481 | subranges = parse_va_ranges(arch_name, res.output) 482 | assert(subranges == []) 483 | 484 | res = gdb.run_cmd(f"pt -after {ranges[-1].va_start + 0x100}") 485 | subranges = parse_va_ranges(arch_name, res.output) 486 | r_tmp = copy.deepcopy(ranges[-1]) 487 | r_tmp.va_start += 0x100 488 | r_tmp.length = ranges[-1].length - 0x100 489 | assert(subranges == [r_tmp]) 490 | 491 | res = gdb.run_cmd(f"pt -after {ranges[0].va_start}") 492 | subranges = parse_va_ranges(arch_name, res.output) 493 | assert(subranges == ranges) 494 | 495 | @pytest.mark.parametrize("arch_name, linux_image", get_all_images()) 496 | def test_pt_before_after_combination(create_resources_fixture, arch_name, linux_image): 497 | if arch_name == "arm_64": 498 | # BUG: needs gdb_pt_dump fix 499 | pytest.skip("The gdb_pt_dump aarch64 backend does not implement cut_before and cut_after.") 500 | 501 | vm, gdb, monitor, memory_flatview = create_resources_fixture 502 | 503 | res = gdb.run_cmd("pt") 504 | ranges = parse_va_ranges(arch_name, res.output) 505 | assert(len(ranges) > 10) 506 | 507 | res = gdb.run_cmd(f"pt -after {ranges[0].va_start} -before {ranges[-1].va_start + ranges[-1].length}") 508 | subranges = parse_va_ranges(arch_name, res.output) 509 | assert(subranges == ranges) 510 | 511 | res = gdb.run_cmd(f"pt -after {ranges[1].va_start} -before {ranges[-1].va_start}") 512 | subranges = parse_va_ranges(arch_name, res.output) 513 | assert(subranges == ranges[1:-1]) 514 | 515 | res = gdb.run_cmd(f"pt -after {ranges[2].va_start} -before {ranges[3].va_start}") 516 | subranges = parse_va_ranges(arch_name, res.output) 517 | assert(subranges == [ranges[2]]) 518 | 519 | res = gdb.run_cmd(f"pt -after {ranges[2].va_start} -before {ranges[3].va_start + ranges[3].length}") 520 | subranges = parse_va_ranges(arch_name, res.output) 521 | assert(subranges == ranges[2:4]) 522 | 523 | res = gdb.run_cmd(f"pt -after {ranges[2].va_start + 0x200} -before {ranges[4].va_start + 0x300}") 524 | subranges = parse_va_ranges(arch_name, res.output) 525 | 526 | r2_tmp = copy.deepcopy(ranges[2]) 527 | r2_tmp.va_start += 0x200 528 | r2_tmp.length = r2_tmp.length - 0x200 529 | r4_tmp = copy.deepcopy(ranges[4]) 530 | r4_tmp.length = 0x300 531 | assert(subranges == [r2_tmp, ranges[3], r4_tmp]) 532 | 533 | @pytest.mark.parametrize("arch_name, linux_image", get_all_images()) 534 | def test_pt_kaslr(create_resources_fixture_nokaslr, arch_name, linux_image): 535 | if arch_name == "riscv": 536 | pytest.skip("KASLR commands not supported with riscv") 537 | if "la57" in linux_image: 538 | pytest.skip("The test needs to be updated to have the correct hardcoded base addresses when LA57 is enabled") 539 | virt_pattern = re.compile(r'Virt:\s+([0-9a-fA-Fx]+)') 540 | phys_pattern = re.compile(r'Phys:\s+([0-9a-fA-Fx]+)') 541 | 542 | vm, gdb, monitor, memory_flatview = create_resources_fixture_nokaslr 543 | 544 | res = gdb.run_cmd("pt -kaslr") 545 | output = ansi_escape(res.output) 546 | 
virt_matches = virt_pattern.findall(output) 547 | phys_matches = phys_pattern.findall(output) 548 | 549 | assert(int(virt_matches[0], 16) in vm.get_default_base_image_kaddr()) 550 | assert(int(phys_matches[0], 16) in vm.get_default_base_image_paddr()) 551 | 552 | # BUG: not implemented in gdb_pt_dump for aarch64 553 | if arch_name != "arm_64": 554 | assert(int(virt_matches[1], 16) == vm.get_default_physmap_kaddr()) 555 | 556 | vm.stop() 557 | gdb.stop() 558 | 559 | for u in range(4): 560 | vm.start(kaslr=True) 561 | vm.wait_for_shell() 562 | res = gdb.run_cmd("pt -kaslr") 563 | output = ansi_escape(res.output) 564 | virt_matches = virt_pattern.findall(output) 565 | phys_matches = phys_pattern.findall(output) 566 | assert(int(virt_matches[0], 16) != 0) 567 | assert(int(phys_matches[0], 16) != 0) 568 | 569 | # BUG: not implemented in gdb_pt_dump for aarch64 570 | if arch_name != "arm_64": 571 | assert(int(virt_matches[1], 16) != 0) 572 | 573 | vm.stop() 574 | gdb.stop() 575 | 576 | def get_custom_binaries(): 577 | custom_x86_64 = [("x86_64", bin) for bin in get_x86_64_binary_names()] 578 | custom_arm_64 = [("arm_64", bin) for bin in get_arm_64_binary_names()] 579 | return custom_x86_64 + custom_arm_64 580 | 581 | @pytest.fixture 582 | def create_custom_resources_fixture(arch_name, image_name): 583 | if arch_name == "arm_64" and are_arm64_custom_images_broken(): 584 | pytest.skip("Custom images are broken on QEMU Aarch64 8.2.2 (VM faults)") 585 | vm = create_custom_vm(arch_name, image_name) 586 | vm.start() 587 | vm.wait_for_string_on_line(b"Done") 588 | gdb = GdbCommandExecutor(vm) 589 | monitor = QemuMonitorExecutor(vm) 590 | memory_flatview = monitor.get_memory_flat_view() 591 | 592 | if bool(os.getenv("GDB_PT_DUMP_TESTS_PAUSE_AFTER_BOOT")) == True: 593 | print("Sleeping...") 594 | time.sleep(10000) 595 | 596 | yield (vm, gdb, monitor, memory_flatview) 597 | gdb.stop() 598 | monitor.stop() 599 | vm.stop() 600 | 601 | @pytest.mark.parametrize("arch_name, image_name", get_custom_binaries()) 602 | def test_golden_images(request, create_custom_resources_fixture, arch_name, image_name): 603 | vm, gdb, monitor, flatview = create_custom_resources_fixture 604 | test_name = request.node.name 605 | generated_image_name = "/tmp/.gdb_pt_dump_{}".format(image_name) 606 | print("Generated image path is {}".format(generated_image_name)) 607 | gdb.run_cmd("pt -o {}".format(generated_image_name)) 608 | 609 | generated_data = None 610 | with open(generated_image_name, "r") as generated_file: 611 | generated_data = generated_file.read() 612 | 613 | golden_image = os.path.join(ImageContainer().get_custom_kernels_golden_images(arch_name), image_name) 614 | expected_data = None 615 | with open(golden_image, "r") as golden_image_file: 616 | expected_data = golden_image_file.read() 617 | 618 | assert(expected_data == generated_data) 619 | 620 | @pytest.mark.parametrize("arch_name, image_name", get_custom_binaries()) 621 | def test_phys_verbose_golden_images(request, create_custom_resources_fixture, arch_name, image_name): 622 | if arch_name == "arm_64": 623 | pytest.skip("phys verbose not correct for arm64") 624 | vm, gdb, monitor, flatview = create_custom_resources_fixture 625 | test_name = request.node.name 626 | generated_image_name = "/tmp/.gdb_pt_dump_phys_verbose_{}".format(image_name) 627 | print("Generated image path is {}".format(generated_image_name)) 628 | gdb.run_cmd("pt -phys_verbose -o {}".format(generated_image_name)) 629 | 630 | generated_data = None 631 | with open(generated_image_name, "r") as 
generated_file: 632 | generated_data = generated_file.read() 633 | 634 | golden_image = os.path.join(ImageContainer().get_custom_kernels_golden_images(arch_name), "phys_verbose_{}".format(image_name)) 635 | expected_data = None 636 | with open(golden_image, "r") as golden_image_file: 637 | expected_data = golden_image_file.read() 638 | 639 | assert(expected_data == generated_data) 640 | 641 | @pytest.mark.parametrize("arch_name, image_name", get_custom_binaries()) 642 | def test_pt_walk_golden_images(request, create_custom_resources_fixture, arch_name, image_name): 643 | if arch_name == "arm_64": 644 | pytest.skip("golden images are not present") 645 | vm, gdb, monitor, flatview = create_custom_resources_fixture 646 | test_name = request.node.name 647 | generated_image_name = "/tmp/.gdb_pt_dump_pt_walk_{}".format(image_name) 648 | print("Generated image path is {}".format(generated_image_name)) 649 | gdb.run_cmd("pt -walk 0x2000 -o {}".format(generated_image_name)) 650 | 651 | generated_data = None 652 | with open(generated_image_name, "r") as generated_file: 653 | generated_data = generated_file.read() 654 | 655 | golden_image = os.path.join(ImageContainer().get_custom_kernels_golden_images(arch_name), "pt_walk_{}".format(image_name)) 656 | expected_data = None 657 | with open(golden_image, "r") as golden_image_file: 658 | expected_data = golden_image_file.read() 659 | 660 | assert(expected_data == generated_data) 661 | 662 | def test_pt_la57(): 663 | vm = VM_X86_64(ImageContainer().get_linux_x86_64()) 664 | vm.start() 665 | 666 | time.sleep(100) 667 | 668 | def test_pt_x86_32(): 669 | vm = VM_X86_64(ImageContainer().get_kolibri_x86_32(), fda_name = "kolibri.img") 670 | vm.start() 671 | 672 | time.sleep(30) 673 | 674 | gdb = GdbCommandExecutor(vm) 675 | res = gdb.run_cmd("pt") 676 | ranges = parse_va_ranges("x86_64", res.output) 677 | assert(len(ranges) > 0) 678 | 679 | monitor = QemuMonitorExecutor(vm) 680 | 681 | for r in ranges: 682 | addr = r.va_start 683 | data = monitor.read_virt_memory(addr, 4) 684 | assert(len(data) == 4) 685 | 686 | res = gdb.run_cmd(f"pt -walk {hex(ranges[0].va_start)}") 687 | assert("Last stage faulted" not in res.output) 688 | 689 | res = gdb.run_cmd("pt -ss Kolibri") 690 | occs = parse_occurrences("x86_64", res.output) 691 | assert(len(occs) > 1) 692 | 693 | gdb.stop() 694 | monitor.stop() 695 | vm.stop() 696 | 697 | def test_pt_i386(): 698 | vm = VM_X86_32(ImageContainer().get_kolibri_x86_32(), fda_name = "kolibri.img") 699 | vm.start() 700 | 701 | time.sleep(30) 702 | 703 | gdb = GdbCommandExecutor(vm) 704 | res = gdb.run_cmd("pt") 705 | ranges = parse_va_ranges("x86_64", res.output) 706 | assert(len(ranges) > 0) 707 | 708 | monitor = QemuMonitorExecutor(vm) 709 | 710 | for r in ranges: 711 | addr = r.va_start 712 | data = monitor.read_virt_memory(addr, 4) 713 | assert(len(data) == 4) 714 | 715 | res = gdb.run_cmd(f"pt -walk {hex(ranges[0].va_start)}") 716 | assert("Last stage faulted" not in res.output) 717 | 718 | res = gdb.run_cmd("pt -ss Kolibri") 719 | occs = parse_occurrences("x86_64", res.output) 720 | assert(len(occs) > 1) 721 | 722 | gdb.stop() 723 | monitor.stop() 724 | vm.stop() 725 | 726 | @pytest.mark.parametrize("arch_name, linux_image", get_all_images()) 727 | def test_pt_read_virt_memory(create_resources_fixture_nokaslr, arch_name, linux_image): 728 | if arch_name == "arm_64": 729 | pytest.skip("pt command fails on arm_64 for some reason, needs debugging") 730 | vm, gdb, monitor, memory_flatview = create_resources_fixture_nokaslr 731 | 732 | 
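    # For the first few reported ranges, compare `pt -read_virt` output with gdb's own
    # `dump binary memory` of the same [va_start, va_start + length) region; the two
    # dumps are expected to be byte-identical.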
res = gdb.run_cmd("pt") 733 | ranges = parse_va_ranges(arch_name, res.output) 734 | 735 | for r in ranges[:10]: 736 | pt_read_virt_filename = tempfile.mktemp() 737 | gdb_dump_filename = tempfile.mktemp() 738 | gdb.run_cmd(f"pt -read_virt {hex(r.va_start)} {r.length} -o {pt_read_virt_filename}") 739 | gdb.run_cmd(f"dump binary memory {gdb_dump_filename} {hex(r.va_start)} {hex(r.va_start + r.length)}") 740 | assert(filecmp.cmp(gdb_dump_filename, pt_read_virt_filename, shallow=False)) 741 | 742 | if __name__ == "__main__": 743 | print("This code should be invoked via 'pytest':", file=sys.stderr) 744 | print("") 745 | print(" pytest run_integration_tests.py") 746 | print("") 747 | 748 | -------------------------------------------------------------------------------- /tests/integration_tests/run_tests.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | set -e 4 | 5 | print_usage() { 6 | echo "Example usage: $0 --logfile /tmp/log.txt --use_docker" 7 | echo "" 8 | echo " Parameters:" 9 | echo "" 10 | echo " --logfile " 11 | echo " Specifies a logfile for storing stdout and stderr" 12 | echo " output from the test run." 13 | echo "" 14 | echo " --use_docker" 15 | echo " Specifies that tests must be executed within a docker" 16 | echo " container. This is useful for running the tests in a" 17 | echo " clean environment with minimal dependencies installed." 18 | echo "" 19 | echo " --skip_download" 20 | echo " Skips downloading the image files." 21 | } 22 | 23 | logfile="" 24 | use_docker="" 25 | skip_download="" 26 | 27 | # Parse arguments manually 28 | while [[ $# -gt 0 ]]; do 29 | case "$1" in 30 | --logfile) 31 | if [[ -n "$2" && "$2" != "--"* ]]; then 32 | logfile="$2" 33 | shift 2 34 | else 35 | echo "Error: --logfile requires a value." 36 | exit 1 37 | fi 38 | ;; 39 | --use_docker) 40 | use_docker="1" 41 | shift 42 | ;; 43 | --skip_download) 44 | skip_download="1" 45 | shift 46 | ;; 47 | --help|-h) 48 | print_usage 49 | exit 0 50 | ;; 51 | *) 52 | echo "Error: Unknown option $1" 53 | print_usage 54 | exit 1 55 | ;; 56 | esac 57 | done 58 | 59 | # Check if --logfile was provided 60 | if [[ -z "${logfile}" ]]; then 61 | echo "Error: --logfile is a mandatory argument." 62 | echo "Usage: $0 --logfile " 63 | exit 1 64 | fi 65 | 66 | . common.sh 67 | 68 | if [[ -z "${skip_download}" ]]; then 69 | download_latest 70 | fi 71 | 72 | if [[ ! -z "${use_docker}" ]]; then 73 | project_path=$(git rev-parse --show-toplevel) 74 | integration_tests_dir="${project_path}/tests/integration_tests/" 75 | cd "${project_path}" 76 | 77 | uid=$(id -u) 78 | gid=$(id -g) 79 | 80 | docker build --build-arg UID=${uid} --build-arg GID=${gid} -f "${integration_tests_dir}/Dockerfile.runtests" -t gdb_pt_dump_run_tests . 81 | 82 | mkdir -p $(dirname "${logfile}") 83 | touch "${logfile}" 84 | fullpath=$(realpath "${logfile}") 85 | 86 | docker run --volume "${project_path}:/gdb-pt-dump:ro" --volume "${fullpath}:${fullpath}:rw" -e "GDB_PT_DUMP_TESTS_LOGFILE=${logfile}" -ti gdb_pt_dump_run_tests 87 | exit 0 88 | fi 89 | 90 | echo "Storing output in logfile: \"${logfile}\"" 91 | 92 | # Use half ot the available CPUs to avoid excessively high memory usage. 
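# (Illustrative note, not part of the original script: on a single-core runner the
#  integer division below evaluates to 0, so a guard along the lines of
#      num_jobs=$(( $(nproc) / 2 )); if [ "${num_jobs}" -lt 1 ]; then num_jobs=1; fi
#  may be worth adding before the value is passed to pytest's -n option.)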
93 | num_jobs=$(($(nproc) / 2)) 94 | 95 | timeout_limit=120 96 | 97 | ./run_integration_tests.py -o "cache_dir=/tmp" -v -n ${num_jobs} --timeout ${timeout_limit} 2>&1 | tee ${logfile} 98 | -------------------------------------------------------------------------------- /tests/integration_tests/vm_utils.py: -------------------------------------------------------------------------------- 1 | import os 2 | import sys 3 | import subprocess 4 | import re 5 | import time 6 | import socket 7 | import tempfile 8 | import shutil 9 | import shlex 10 | from abc import ABC, abstractmethod 11 | 12 | class SocketAllocator: 13 | def __init__(self): 14 | self._socket_dir = tempfile.mkdtemp() 15 | 16 | def __del__(self): 17 | #self.cleanup_all_sockets() 18 | pass 19 | 20 | def cleanup_all_sockets(self): 21 | shutil.rmtree(self._socket_dir) 22 | 23 | def alloc_monitor_socket(self): 24 | return self._alloc_socket("monitor") 25 | 26 | def alloc_gdb_socket(self): 27 | return self._alloc_socket("gdb") 28 | 29 | def _alloc_socket(self, type_name): 30 | return tempfile.mkstemp(prefix="{}_".format(type_name), dir=self._socket_dir)[1] 31 | 32 | GlobalSocketAllocator = SocketAllocator() 33 | 34 | # TODO: add images to github and write a downloader script 35 | class ImageContainer: 36 | def __init__(self): 37 | self.images_dir = os.path.join(os.path.dirname(os.path.abspath(__file__)), "test_images") 38 | 39 | def get_linux_image(self, image_name): 40 | return os.path.join(self.images_dir, image_name) 41 | 42 | def get_linux_x86_64(self): 43 | return os.path.join(self.images_dir, "linux_x86_64") 44 | 45 | def get_linux_riscv(self): 46 | return os.path.join(self.images_dir, "linux_riscv") 47 | 48 | def get_kolibri_x86_32(self): 49 | return os.path.join(self.images_dir, "kolibri_x86_32") 50 | 51 | def get_linux_arm_64(self): 52 | return os.path.join(self.images_dir, "linux_arm_64") 53 | 54 | def get_custom_kernels_x86_64(self): 55 | return os.path.join(self.images_dir, "custom_kernels", "x86_64") 56 | 57 | def get_custom_kernels_arm_64(self): 58 | return os.path.join(self.images_dir, "custom_kernels", "arm_64") 59 | 60 | def get_custom_kernels_golden_images(self, arch_name): 61 | if arch_name == "x86_64": 62 | return os.path.join(self.images_dir, "custom_kernels_golden_images", "x86_64") 63 | elif arch_name == "arm_64": 64 | return os.path.join(self.images_dir, "custom_kernels_golden_images", "arm_64") 65 | else: 66 | raise Exception(f"Unknown arch {arch_name}") 67 | 68 | class VM(ABC): 69 | def __init__(self, arch): 70 | self.vm_proc = None 71 | self.arch = arch 72 | self.qemu_monitor_path = GlobalSocketAllocator.alloc_monitor_socket() 73 | self.print_uart = bool(os.getenv("GDB_PT_DUMP_TESTS_PRINT_UART")) 74 | self.qemu_gdb_path = GlobalSocketAllocator.alloc_gdb_socket() 75 | 76 | def start(self, cmd): 77 | if bool(os.getenv("GDB_PT_DUMP_TESTS_PRINT_VM_LAUNCH_CMD")): 78 | print(f"Executing command: {' '.join(shlex.quote(arg) for arg in cmd)}") 79 | self.vm_proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE) 80 | 81 | def get_arch(self): 82 | return self.arch 83 | 84 | @abstractmethod 85 | def get_default_base_image_kaddr(self): 86 | raise Exception("Not implemented") 87 | 88 | @abstractmethod 89 | def get_default_base_image_paddr(self): 90 | raise Exception("Not implemented") 91 | 92 | @abstractmethod 93 | def get_default_physmap_kaddr(self): 94 | raise Exception("Not implemented") 95 | 96 | @abstractmethod 97 | def get_fixed_known_address(self): 98 | raise Exception("Not implemented") 99 | 100 
|     def get_qemu_monitor_path(self):
101 |         return self.qemu_monitor_path
102 | 
103 |     def get_qemu_gdb_path(self):
104 |         return self.qemu_gdb_path
105 | 
106 |     def wait_for_string_on_line(self, string):
107 |         line = b""
108 |         while True:
109 |             b = self.vm_proc.stdout.read(1)
110 |             if b == b"":
111 |                 continue
112 |             if self.print_uart:
113 |                 sys.stdout.write(b.decode("utf-8"))
114 |             line += b
115 |             if b == b"\n":
116 |                 if line[:-1] == string:
117 |                     return
118 |                 line = b""
119 | 
120 |     def wait_for_shell(self, shell_symbol="~"):  # 'shell_symbol' is currently unused; returns once the "Boot took ... seconds" line is seen
121 |         line = b""
122 |         while True:
123 |             b = self.vm_proc.stdout.read(1)
124 |             if b == b"":
125 |                 continue
126 |             if self.print_uart:
127 |                 sys.stdout.write(b.decode("utf-8"))
128 |             line += b
129 |             if b == b"\n":
130 |                 match = re.search(b'Boot took (.*) seconds', line)
131 |                 if match:
132 |                     return
133 |                 line = b""
134 | 
135 |     def stop(self):
136 |         if self.vm_proc:
137 |             self.vm_proc.kill()
138 |             self.vm_proc.wait()
139 |             self.vm_proc = None
140 | 
141 |     def is_alive(self):
142 |         if not self.vm_proc:
143 |             return False
144 |         return self.vm_proc.poll() == None
145 | 
146 | class VM_X86_64(VM):
147 |     def __init__(self, image_dir, fda_name=None):
148 |         super().__init__(arch="x86_64")
149 |         self.image_dir = image_dir
150 |         self.fda_name = fda_name
151 | 
152 |     def start(self, memory_mib=256, kvm=False, smep=True, smap=True, kaslr=True, svm=False, la57=False, num_cores=1):
153 |         cmd = []
154 |         cmd.extend(["qemu-system-x86_64"])
155 | 
156 |         cpu_options = []
157 |         if kvm:
158 |             cpu_options.append("kvm64")
159 |         else:
160 |             if la57:
161 |                 cpu_options.append("qemu64,+la57")
162 |             else:
163 |                 cpu_options.append("qemu64")
164 | 
165 |         if smep:
166 |             cpu_options.append("+smep")
167 |         else:
168 |             cpu_options.append("-smep")
169 | 
170 |         if smap:
171 |             cpu_options.append("+smap")
172 |         else:
173 |             cpu_options.append("-smap")
174 | 
175 |         if svm:
176 |             cpu_options.append("+svm")
177 | 
178 |         cmd.extend(["-cpu", ",".join(cpu_options)])
179 | 
180 |         if self.fda_name == None:
181 |             kernel_image = os.path.join(self.image_dir, "kernel.img")
182 |             cmd.extend(["-kernel", kernel_image])
183 | 
184 |             initrd_image = os.path.join(self.image_dir, "initrd.img")
185 |             cmd.extend(["-initrd", initrd_image])
186 | 
187 |             boot_string = "console=ttyS0 oops=panic ip=dhcp root=/dev/ram rdinit=/init quiet"
188 |             if kaslr:
189 |                 boot_string += " kaslr"
190 |             else:
191 |                 boot_string += " nokaslr"
192 | 
193 |             cmd.extend(["-append", boot_string])
194 |         else:
195 |             # This path is taken for the custom images
196 |             cmd.extend(["-fda", os.path.join(self.image_dir, self.fda_name)])
197 | 
198 |         cmd.extend(["-m", str(memory_mib)])
199 | 
200 |         cmd.extend(["-monitor", f"unix:{self.get_qemu_monitor_path()},server,nowait"])
201 | 
202 |         cmd.extend(["-gdb", f"unix:{self.get_qemu_gdb_path()},server,nowait"])
203 | 
204 |         cmd.extend(["-nographic", "-snapshot", "-no-reboot"])
205 | 
206 |         cmd.extend(["-smp", str(num_cores)])
207 | 
208 |         super().start(cmd)
209 | 
210 |     def get_default_base_image_kaddr(self):
211 |         return [0xffffffff81000000]
212 | 
213 |     def get_default_base_image_paddr(self):
214 |         return [0x1000000]
215 | 
216 |     def get_default_physmap_kaddr(self):
217 |         return 0xffff888000000000
218 | 
219 |     def get_fixed_known_address(self):
220 |         return 0xffffffff81000000
221 | 
222 | class VM_X86_32(VM):
223 |     def __init__(self, image_dir, fda_name=None):
224 |         super().__init__(arch="x86_32")
225 |         self.image_dir = image_dir
226 |         self.fda_name = fda_name
227 | 
228 |     def start(self, memory_mib=256, kaslr=True, num_cores=1):
229 |         cmd = []
230 | 
cmd.extend(["qemu-system-i386"]) 231 | 232 | if self.fda_name == None: 233 | kernel_image = os.path.join(self.image_dir, "kernel.img") 234 | cmd.extend(["-kernel", kernel_image]) 235 | 236 | initrd_image = os.path.join(self.image_dir, "initrd.img") 237 | cmd.extend(["-initrd", initrd_image]) 238 | 239 | boot_string = "console=ttyS0 oops=panic ip=dhcp root=/dev/ram rdinit=/init quiet" 240 | if kaslr: 241 | boot_string += " kaslr" 242 | else: 243 | boot_string += " nokaslr" 244 | 245 | cmd.extend(["-append", boot_string]) 246 | else: 247 | # This path is taken for the custom images 248 | cmd.extend(["-fda", os.path.join(self.image_dir, self.fda_name)]) 249 | 250 | cmd.extend(["-m", str(memory_mib)]) 251 | 252 | cmd.extend(["-monitor", f"unix:{self.get_qemu_monitor_path()},server,nowait"]) 253 | 254 | cmd.extend(["-gdb", f"unix:{self.get_qemu_gdb_path()},server,nowait"]) 255 | 256 | cmd.extend(["-nographic", "-snapshot", "-no-reboot"]) 257 | 258 | cmd.extend(["-smp", str(num_cores)]) 259 | 260 | super().start(cmd) 261 | 262 | def get_default_base_image_kaddr(self): 263 | raise Exception("Unimplemented") 264 | 265 | def get_default_base_image_paddr(self): 266 | raise Exception("Unimplemented") 267 | 268 | def get_default_physmap_kaddr(self): 269 | raise Exception("Unimplemented") 270 | 271 | def get_fixed_known_address(self): 272 | raise Exception("Unimplemented") 273 | 274 | class VM_Arm_64(VM): 275 | def __init__(self, image_dir, bios_name=None, has_kernel=True): 276 | super().__init__(arch="arm_64") 277 | self.image_dir = image_dir 278 | self.bios_name = bios_name 279 | self.has_kernel = has_kernel 280 | 281 | def start(self, memory_mib=256, kaslr=True, num_cores=1): 282 | cmd = [] 283 | cmd.extend(["qemu-system-aarch64"]) 284 | 285 | cpu_options = [] 286 | cpu_options.append("cortex-a57") 287 | cmd.extend(["-cpu", ",".join(cpu_options)]) 288 | 289 | cmd.extend(["-machine", "virt"]) 290 | 291 | if self.bios_name: 292 | cmd.extend(["-bios", os.path.join(self.image_dir, self.bios_name)]) 293 | 294 | if self.has_kernel: 295 | kernel_image = os.path.join(self.image_dir, "kernel.img") 296 | cmd.extend(["-kernel", kernel_image]) 297 | 298 | initrd_image = os.path.join(self.image_dir, "initrd.img") 299 | cmd.extend(["-initrd", initrd_image]) 300 | 301 | boot_string = "root=/dev/ram rdinit=/init" 302 | if kaslr: 303 | boot_string += " kaslr" 304 | else: 305 | boot_string += " nokaslr" 306 | 307 | cmd.extend(["-append", boot_string]) 308 | 309 | cmd.extend(["-m", str(memory_mib)]) 310 | 311 | cmd.extend(["-monitor", f"unix:{self.get_qemu_monitor_path()},server,nowait"]) 312 | 313 | cmd.extend(["-gdb", f"unix:{self.get_qemu_gdb_path()},server,nowait"]) 314 | 315 | cmd.extend(["-nographic", "-snapshot", "-no-reboot"]) 316 | 317 | cmd.extend(["-smp", str(num_cores)]) 318 | 319 | super().start(cmd) 320 | 321 | # TODO 322 | # The addresses are not correct when LA57 is enabled 323 | def get_default_base_image_kaddr(self): 324 | return [0xffff800010000000, 0xffffffc008010000, 0xfffffe0008010000] 325 | 326 | def get_default_base_image_paddr(self): 327 | return [0x40200000, 0x40210000] 328 | 329 | def get_default_physmap_kaddr(self): 330 | raise Exception("Unknown") 331 | 332 | def get_fixed_known_address(self): 333 | return 0xfffffffe00000000 334 | 335 | class VM_Riscv(VM): 336 | def __init__(self, image_dir): 337 | super().__init__(arch="riscv") 338 | self.image_dir = image_dir 339 | 340 | def start(self, memory_mib=64, kvm=False, kaslr=True, num_cores=1): 341 | cmd = [] 342 | 
cmd.extend(["qemu-system-riscv64"]) 343 | 344 | cpu_options = [] 345 | if kvm: 346 | cpu_options.append("kvm64") 347 | else: 348 | cpu_options.append("qemu64") 349 | 350 | cmd.extend(["-cpu", "rv64"]) 351 | 352 | kernel_image = os.path.join(self.image_dir, "kernel.img") 353 | cmd.extend(["-kernel", kernel_image]) 354 | 355 | initrd_image = os.path.join(self.image_dir, "initrd.img") 356 | cmd.extend(["-initrd", initrd_image]) 357 | 358 | cmd.extend(["-machine", "virt"]) 359 | 360 | boot_string = "root=/dev/ram rdinit=/init console=ttyS0 " 361 | if kaslr: 362 | boot_string += " kaslr" 363 | else: 364 | boot_string += " nokaslr" 365 | 366 | cmd.extend(["-append", boot_string]) 367 | 368 | cmd.extend(["-m", str(memory_mib)]) 369 | 370 | cmd.extend(["-monitor", f"unix:{self.get_qemu_monitor_path()},server,nowait"]) 371 | 372 | cmd.extend(["-gdb", f"unix:{self.get_qemu_gdb_path()},server,nowait"]) 373 | 374 | cmd.extend(["-nographic", "-snapshot", "-no-reboot"]) 375 | 376 | cmd.extend(["-smp", str(num_cores)]) 377 | 378 | super().start(cmd) 379 | 380 | def get_default_base_image_kaddr(self): 381 | raise Exception("Unimplemented") 382 | 383 | def get_default_base_image_paddr(self): 384 | raise Exception("Unimplemented") 385 | 386 | def get_default_physmap_kaddr(self): 387 | raise Exception("Unimplemented") 388 | 389 | def get_fixed_known_address(self): 390 | raise Exception("Unimplemented") 391 | 392 | class FlatViewRange: 393 | 394 | def __init__(self, range_start, range_end, range_type): 395 | self.range_start = range_start 396 | self.range_end = range_end 397 | self.range_type = range_type 398 | 399 | def is_memory_backed(self): 400 | return self.range_type in ["ram"] 401 | 402 | def is_rom(self): 403 | return self.range_type in ["rom"] 404 | 405 | def is_io(self): 406 | return self.range_type in ["i/o"] 407 | 408 | def __str__(self): 409 | return f"VA_start: {hex(self.range_start)}, VA_end: {hex(self.range_end)}, Type: {self.range_type}" 410 | 411 | class VmFlatView: 412 | 413 | def __init__(self, tree_data): 414 | self._tree_data = tree_data 415 | self._ranges = [] 416 | 417 | pattern = "^([0-9a-f]{16})-([0-9a-f]{16}) \\(prio \d, (.+)\\): (.+)$" 418 | res = "" 419 | for line in tree_data.split("\n"): 420 | line = line.strip() 421 | matching = re.match(pattern, line) 422 | if matching: 423 | memory_type = matching.group(3) 424 | if memory_type in ["ram", "rom"]: 425 | range_start = int(matching.group(1), 16) 426 | range_end = int(matching.group(2), 16) + 1 427 | self._ranges.append(FlatViewRange(range_start, range_end, memory_type)) 428 | 429 | def find_range(self, pa): 430 | for r in self._ranges: 431 | if pa >= r.range_start and pa < r.range_end: 432 | return r 433 | return None 434 | 435 | def find_prev_range(self, pa): 436 | prev = None 437 | for r in self._ranges: 438 | if r.range_end <= pa: 439 | prev = r 440 | return prev 441 | 442 | class QemuMonitorExecutor: 443 | 444 | def __init__(self, vm): 445 | self.socket = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) 446 | self.socket.connect(vm.get_qemu_monitor_path()) 447 | self._read_until("(qemu)") 448 | 449 | def stop(self): 450 | if self.socket: 451 | self.socket.shutdown(socket.SHUT_WR) 452 | self.socket.close() 453 | 454 | def _read_until(self, until): 455 | buf = "" 456 | while True: 457 | b = self.socket.recv(1) 458 | if b != b'': 459 | buf += b.decode() 460 | if buf.endswith(until): 461 | break 462 | return buf 463 | 464 | def _send_command(self, command): 465 | self.socket.send(command.encode() + b"\n") 466 | res = 
self._read_until("(qemu)")[:-7] 467 | res = res[res.find("\n"):] 468 | return res 469 | 470 | def get_memory_flat_view(self): 471 | tree = self._send_command("info mtree -f") 472 | return VmFlatView(tree) 473 | 474 | def gva2gpa(self, addr): 475 | res = self._send_command(f"gva2gpa {hex(addr)}").strip() 476 | matching = re.match("gpa: (.+)", res) 477 | if matching == None: 478 | return None 479 | res = matching.group(1) 480 | gpa_addr = int(res, 16) 481 | return gpa_addr 482 | 483 | def read_virt_memory(self, addr, len): 484 | data = self._read_memory(addr, len, True) 485 | return data 486 | 487 | def _read_memory(self, addr, len, is_virt): 488 | exec = "" 489 | if is_virt: 490 | exec = "memsave" 491 | else: 492 | exec = "pmemsave" 493 | filename = tempfile.mktemp() 494 | if os.path.isfile(filename): 495 | os.remove(filename) 496 | 497 | self._send_command(f"{exec} {addr} {len} \"{filename}\"") 498 | 499 | data = b"" 500 | for u in range(3): 501 | try: 502 | with open(filename, "rb") as f: 503 | data = f.read() 504 | break 505 | except: 506 | time.sleep(0.1) 507 | os.remove(filename) 508 | return data 509 | 510 | def pause(self): 511 | self._send_command("stop") 512 | 513 | def resume(self): 514 | self._send_command("cont") 515 | 516 | class GdbCommandExecutor: 517 | 518 | class Result: 519 | def __init__(self, output, elapsed): 520 | self.output = output 521 | self.elapsed = elapsed 522 | 523 | def __init__(self, vm): 524 | self.gdb_server_path = vm.get_qemu_gdb_path() 525 | self.use_multiarch = vm.get_arch() != "x86_64" 526 | self.script_root_pt = os.path.abspath(os.path.join(os.path.dirname(os.path.abspath(__file__)), "../", "../", "pt.py")) 527 | 528 | # Start a GDB process immediately so that 529 | self._start_gdb_process() 530 | 531 | def __del__(self): 532 | self.stop() 533 | 534 | def stop(self): 535 | if self._subproc: 536 | self._subproc.kill() 537 | self._subproc.wait() 538 | self._subproc = None 539 | 540 | def _start_gdb_process(self): 541 | cmd = [] 542 | if self.use_multiarch: 543 | cmd.extend(["gdb-multiarch"]) 544 | else: 545 | cmd.extend(["gdb"]) 546 | 547 | cmd.extend(["-n"]) 548 | cmd.extend(["-q"]) 549 | cmd.extend(["-ex", "set confirm off"]) 550 | cmd.extend(["-ex", "set pagination off"]) 551 | cmd.extend(["-ex", "source {}".format(self.script_root_pt)]) 552 | cmd.extend(["-ex", f"target remote {self.gdb_server_path}"]) 553 | 554 | self._subproc = subprocess.Popen(cmd, stderr=subprocess.STDOUT, stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True, bufsize=0) 555 | 556 | def _read_until(self, until_str): 557 | buf = "" 558 | while True: 559 | b = self._subproc.stdout.read(1) 560 | if b != "": 561 | buf += b 562 | if buf.endswith(until_str): 563 | break 564 | return buf 565 | 566 | def run_cmd(self, pt_cmd): 567 | if not self._subproc: 568 | self._start_gdb_process() 569 | 570 | t1 = time.time() 571 | self._read_until("(gdb)") 572 | self._subproc.stdin.write(pt_cmd + "\n") 573 | self._subproc.stdin.write("p \"(DONE)\"\n") 574 | res = self._read_until("(DONE)") 575 | res = res[:res.rfind("\n")] 576 | t2 = time.time() 577 | 578 | if "error" in res.lower() or "exception" in res.lower(): 579 | raise Exception(f"Executuing command failed: '{res}'") 580 | elapsed = t2 - t1 581 | return GdbCommandExecutor.Result(res, elapsed) 582 | 583 | def create_linux_vm(arch_name, image_name = None): 584 | if arch_name == "x86_64": 585 | image = ImageContainer().get_linux_image(image_name) if image_name is not None else ImageContainer().get_linux_x86_64() 586 | return VM_X86_64(image) 
587 |     elif arch_name == "arm_64":
588 |         image = ImageContainer().get_linux_image(image_name) if image_name is not None else ImageContainer().get_linux_arm_64()
589 |         return VM_Arm_64(image)
590 |     elif arch_name == "riscv":
591 |         image = ImageContainer().get_linux_image(image_name) if image_name is not None else ImageContainer().get_linux_riscv()
592 |         return VM_Riscv(image)
593 |     else:
594 |         raise Exception(f"Unknown arch {arch_name}")
595 | 
596 | def create_custom_vm(arch_name, image_name):
597 |     if arch_name == "x86_64":
598 |         return VM_X86_64(image_dir = ImageContainer().get_custom_kernels_x86_64(), fda_name = image_name)
599 |     elif arch_name == "arm_64":
600 |         return VM_Arm_64(image_dir = ImageContainer().get_custom_kernels_arm_64(), bios_name = image_name, has_kernel = False)
601 |     else:
602 |         raise Exception(f"Unknown arch {arch_name}")
603 | 
604 | def get_x86_64_binary_names():
605 |     image_folder = ImageContainer().get_custom_kernels_x86_64()
606 |     files = [file for file in os.listdir(image_folder) if file.endswith(".bin")]
607 |     return files
608 | 
609 | def get_arm_64_binary_names():
610 |     image_folder = ImageContainer().get_custom_kernels_arm_64()
611 |     files = [file for file in os.listdir(image_folder) if file.endswith(".bin")]
612 |     # Filter out 16k granule
613 |     files = [file for file in files if "16k" not in file]
614 |     return files
615 | 
616 | def check_va_exists(monitor, flatview, va):
617 |     data = monitor.read_virt_memory(va, 4)
618 |     if len(data) == 4:
619 |         # This is the common case that the memory is accessible
620 |         return True
621 | 
622 |     pa = monitor.gva2gpa(va)
623 |     if pa == None:
624 |         print("Qemu failed to translate the GVA altogether")
625 |         print("This probably means that the page-table parsing or range merging is incorrect")
626 |         return False
627 | 
628 |     r = flatview.find_range(pa)
629 |     if r == None:
630 |         if prev_range := flatview.find_prev_range(pa):
631 |             if prev_range.is_io() or prev_range.is_rom():
632 |                 page_aligned_pa = pa & 0xFF_FF_FF_FF_FF_FF_F0_00  # clear the low 12 bits (page offset)
633 |                 if page_aligned_pa < prev_range.range_end:
634 |                     return True
635 |         print(f"Failed to find the PA ({hex(pa)}) in the collected flatview ranges")
636 |         print("This can mean the whole page is part of IO/ROM and only part of the address is accessible")
637 |         return False
638 | 
639 |     if r.is_io():
640 |         # If it's not IO, it should be either RAM or ROM.
641 |         assert(r.is_memory_backed())
642 | 
643 |         # IO may only implement part of the physical memory range to be accessible, so accesses will fail
644 |         # even if technically the physical page exists.
645 |         return True
646 | 
647 |     # Anything else that's not handled is a failure to access memory
648 |     return False
649 | 
650 | def check_if_belongs_to_io_or_rom(monitor, flatview, va):
651 |     pa = monitor.gva2gpa(va)
652 |     if pa == None:
653 |         print("Qemu failed to translate the GVA altogether")
654 |         print("This probably means that the page-table parsing or range merging is incorrect")
655 |         return False
656 | 
657 |     r = flatview.find_range(pa)
658 |     if r != None:
659 |         return r.is_io() or r.is_rom()
660 | 
661 |     return False
662 | 
663 | 
--------------------------------------------------------------------------------