├── .gitattributes ├── .gitignore ├── LICENSE ├── README.md ├── distfromfunc.py ├── gamedata_checker.py ├── isgoodsig.py ├── makesig.py ├── makesigfromhere.py ├── nameresetter.py ├── netprop_importer.py ├── sigfind.py ├── sigsmasher.py ├── structfiller.py ├── symbolsmasher.py ├── vtable_io.py └── vtable_structs.py /.gitattributes: -------------------------------------------------------------------------------- 1 | # Auto detect text files and perform LF normalization 2 | * text=auto 3 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | 2 | .vscode/* 3 | iconbuster.py 4 | form_test.py 5 | *.bak 6 | capdisasm.py 7 | vtest* 8 | namecollisions.py -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2023 John Mascagni 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # IDA Scripts 2 | Some random IDA scripts I wrote 3 | 4 | ## V2.0 5 | 6 | These scripts were heavily modified on 8/16/2023. For a full writeup on the new changes, see [here](https://github.com/Scags/IDA-Scripts/pull/2). 7 | 8 | ### distfromfunc.py ### 9 | 10 | Get the offset of the cursor address from the start of its function. Useful for byte patching. 11 | 12 | ### gamedata_checker.py ### 13 | 14 | Name says it all, but this verifies SourceMod gamedata files. This requires Valve's VDF library; install it with `pip install vdf`. 15 | 16 | Has a few quirks with it at the moment: 17 | - It does not support multi-line comments within gamedata files, nor will it support multiple instances of `#default` keys. Parsing core SourceMod gamedata files is essentially verboten. 18 | - VTable functions that are stripped cannot be verified, obviously. 19 | - Function overloads tend to mess up VTable offset checking; e.g. `GiveNamedItem`. 20 | - Offset checking is variably difficult depending on naming conventions. If the gamedata key name is not named exactly the same as the function name, it will not be found; e.g. `OnTakeDamage` -> `CBaseEntity::OnTakeDamage` and `CTFPlayer::OnTakeDamage` -> `CBaseEntity::OnTakeDamage` but `TakeDamage` != `CBaseEntity::OnTakeDamage`. 21 | 22 | 23 | ### isgoodsig.py ### 24 | 25 | Takes a SourceMod (or any) signature input and detects if it's unique or not. 
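Before searching, the signature scripts in this repo turn a SourceMod byte string into a space-separated pattern with `?` wildcards. The snippet below is a minimal standalone sketch of that normalization step, not the scripts' exact code: the helper name `sourcemod_sig_to_ida_pattern` is hypothetical, and it splits into byte tokens instead of using chained string replaces. Like the scripts, it treats every `2A` byte as a wildcard, so a literal `0x2A` in the signature is also wildcarded.

```python
def sourcemod_sig_to_ida_pattern(sig: str) -> str:
    """Convert a SourceMod byte string into a space-separated search pattern.

    Hypothetical helper for illustration; it does not call any IDA API.
    """
    # Turn each "\x" escape into a separator, then split into byte tokens.
    tokens = sig.replace("\\x", " ").split()
    # SourceMod uses the byte 2A ("*") as its wildcard; the search pattern uses "?".
    return " ".join("?" if tok.upper() == "2A" else tok.upper() for tok in tokens)


print(sourcemod_sig_to_ida_pattern(r"\x55\x8B\xEC\x2A\x2A\x2A\x2A\x56"))
# prints: 55 8B EC ? ? ? ? 56
```

A signature counts as good only when the resulting pattern matches exactly one location in the executable segments.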
26 | 27 | 28 | ### makesig.py ### 29 | 30 | Python translation of [makesig](https://github.com/alliedmodders/sourcemod/blob/master/tools/ida_scripts/makesig.idc). 31 | 32 | Optionally, install pyperclip with `pip install pyperclip` to automatically copy any signatures to your clipboard when running. 33 | 34 | 35 | ### makesigfromhere.py ### 36 | 37 | Creates a signature from the cursor offset. Useful for byte patching. 38 | 39 | 40 | ### nameresetter.py ### 41 | 42 | Resets the name of every function in IDA's database. Does not include library or external functions. 43 | 44 | 45 | ### netprop_importer.py ### 46 | 47 | Imports netprops and owner classes as structs and struct members into IDA's DB. Only works with the XML file provided by sm_dump_netprops_xml. Datatables only work most of the time. You should also use the proper netprop dump for your OS, or else you will be very confused. 48 | 49 | 50 | ### sigfind.py ### 51 | 52 | Takes a SourceMod (or any) signature and jumps you to the function it's for. If it's a bad signature, then you won't go anywhere. 53 | 54 | 55 | ### sigsmasher.py ### 56 | 57 | Makes SourceMod ready signatures for every function in IDA's database. Yes, this will take a long, long time. Requires PyYAML so you'll need to `pip install pyyaml`. You have the option of only generating signatures for typed functions so this works very well with the Symbol Smasher. 58 | 59 | 60 | ### structfiller.py ### 61 | 62 | Sanitizes undefined struct members as if IDA had parsed a header file. Each structure will have its undefined members replaced with a one-byte-sized member in order to prevent pseudocode from falling apart. Only makes sense to use it after running the netprop importer. 63 | 64 | 65 | ### symbolsmasher.py ### 66 | 67 | Renames functions in a stripped library database based on unique string cross-references. 
68 | 69 | Running the script presents 2 options: you can read and export data from the current database, or you can import and write data into it. 70 | 71 | If you're working in the database of a library that has symbols, you should run it in read mode and export the data to a file. That file is what you later import into a stripped binary. 72 | 73 | When on Windows or another stripped database, run the script in write mode and select the file you exported earlier. A solid number of functions should be typed within a few seconds. 74 | 75 | This works well with the Signature Smasher. However, to save you an hour or so, I publicly host dumps of most Source games [here](http://scag.site.nfoservers.com/sigdump). 76 | 77 | ### vtable_io.py ### 78 | 79 | Imports and exports virtual tables. Run it through a Linux binary to export to a file, then run it through a Windows binary to import those VTables into the database. This is similar to [Asherkin's VTable Dumper](https://asherkin.github.io/vtable/) but doesn't suffer from the pitfalls of multiple inheritance. Since it doesn't have those liabilities, its function typing will almost always be perfect. 80 | 81 | #### Features #### 82 | This script is slightly heavy and has features that warrant explanation. Features can be freely enabled/disabled in the popup form that opens when you run the script. Your chosen feature options are kept in the IDA registry and will persist. 83 | 84 | **Parse type strings** 85 | - Sometimes IDA fails to properly analyze Windows RTTI Type Descriptor objects. Because of this, there won't be a reference from certain type descriptors to std::type_info, which is required for the script to work. 86 | - If this feature is enabled, then the string names of the type descriptors will be parsed in order to discover the unreferenced type descriptors. This will be done alongside the normal script function. 
87 | - If you notice that there are multiple functions of the same name or classes that have virtual functions that aren't typed, consider enabling this. 88 | - It should be harmless to keep on regardless, but it is disabled by default. 89 | - This problem only seemed to be present in NMRiH. 90 | 91 | **Skip vtable size mismatches** 92 | - The script is *almost* perfect. On rare occasion, it will fail to properly prepare a Windows translation of a Linux virtual table. 93 | - If this option is enabled, then function typing is skipped for any virtual table with a size mismatch. 94 | - Enabled by default. 95 | 96 | **Comment reused functions** 97 | - Windows builds often optimize shorter and simpler functions and reuse them across multiple virtual tables. This means that it would be redundant to rename these functions over and over again. 98 | - If enabled, virtual table declarations instead place a comment on the function's reference. 99 | - Enabled by default. 100 | 101 | **Export options** 102 | - Should be self-explanatory, but the script is able to export the Linux and Windows virtual tables to a file. 103 | - This is a .json file and is organized to be readable. 104 | - The format of the export file is as follows: 105 | ```json 106 | "classname" 107 | { 108 | "[this-offset] vtable-offset function-name" 109 | } 110 | ``` 111 | - Linux offsets are denoted with `L` and Windows with `W`. If the function is not present in a certain OS, then that index is empty. 112 | - Exporting is optional, and if it is not enabled, then the export file path option can be safely ignored. 113 | 114 | ### vtable_structs.py ### 115 | 116 | Runs through virtual tables and creates structs for them. Use at your own risk since it screws up referencing members through pseudocode. 
-------------------------------------------------------------------------------- /distfromfunc.py: -------------------------------------------------------------------------------- 1 | import idc 2 | import idaapi 3 | 4 | def main(): 5 | addr = idaapi.get_screen_ea() 6 | if addr == idc.BADADDR: 7 | print("Make sure you are in a function!") 8 | idaapi.beep() 9 | return 10 | 11 | func = idaapi.get_func(addr) 12 | if func is None: 13 | print("Make sure you are in a function!") 14 | idaapi.beep() 15 | return 16 | 17 | funcname = idaapi.get_name(func.start_ea) 18 | demangled = idaapi.demangle_name(funcname, idc.get_inf_attr(idc.INF_SHORT_DN)) 19 | print(f"{demangled or funcname}:") 20 | print(f"Offset from {func.start_ea:08X} to {addr:08X} = {addr - func.start_ea} ({addr - func.start_ea:#X})") 21 | 22 | main() -------------------------------------------------------------------------------- /gamedata_checker.py: -------------------------------------------------------------------------------- 1 | import idautils 2 | import idaapi 3 | import idc 4 | import vdf 5 | 6 | from sys import version_info 7 | 8 | OS_Linux = 0 9 | OS_Win = 1 10 | 11 | def get_os(): 12 | ftype = idaapi.get_file_type_name() 13 | if "PE" in ftype: 14 | return OS_Win 15 | elif "ELF" in ftype: 16 | return OS_Linux 17 | return -1 18 | 19 | def osstr(os): 20 | if os == OS_Linux: 21 | return "linux" 22 | elif os == OS_Win: 23 | return "windows" 24 | return "unknown" 25 | 26 | def checksig(sig): 27 | if sig[0] == '@': 28 | # Just check for existence of this mangled name 29 | return idc.get_name_ea_simple(sig[1:]) != idc.BADADDR 30 | 31 | sig = sig.replace(r"\x", " ").replace("2A", "?").replace("2a", "?").replace("\\", "").strip() 32 | 33 | # Get the first segment that is executable to use its addresses for parse_binpat_str 34 | endea = idc.BADADDR 35 | for segea in idautils.Segments(): 36 | s = idaapi.getseg(segea) 37 | if s.perm & idaapi.SEGPERM_EXEC: 38 | segstart = segea 39 | # Set the end ea to the end of 
the last executable segment 40 | # Speed isn't as important in this script, so reading any extra X 41 | # segments is fine 42 | if endea == idc.BADADDR or endea < segstart + s.size(): 43 | endea = segstart + s.size() 44 | 45 | count = 0 46 | addr = 0 47 | addr = idaapi.find_binary(addr, endea, sig, 0, idc.SEARCH_DOWN|idc.SEARCH_NEXT) 48 | while count < 2 and addr != idc.BADADDR: 49 | count = count + 1 50 | if count > 1: 51 | break 52 | addr = idaapi.find_binary(addr, endea, sig, 0, idc.SEARCH_DOWN|idc.SEARCH_NEXT) 53 | 54 | return count == 1 55 | 56 | # bin_search3 breaks after 15 or so bytes or something, idk man 57 | # binpat = idaapi.compiled_binpat_vec_t() 58 | # idaapi.parse_binpat_str(binpat, segstart, sig, 16, idaapi.get_default_encoding_idx(idaapi.get_encoding_bpu_by_name("UTF-8"))) 59 | 60 | # count = 0 61 | # addr = 0 62 | # addr, _ = idaapi.bin_search3(addr, endea, binpat, idaapi.BIN_SEARCH_FORWARD) 63 | # while addr != idc.BADADDR: 64 | # count += 1 65 | # if count > 1: 66 | # break 67 | 68 | # # +1 because the search finds itself 69 | # addr, _ = idaapi.bin_search3(addr + 1, endea, binpat, idaapi.BIN_SEARCH_FORWARD) 70 | 71 | # return count == 1 72 | 73 | def get_bcompat_items(d): 74 | return d.iteritems() if version_info[0] <= 2 else d.items() 75 | 76 | # Unfortunately I don't care too much about overtly complex gamedata files 77 | # If you have multiple #default's in you first subsection or you have #default 78 | # anywhere else other than that first subsection, you're SOL. Sorry Silvers :c 79 | def get_gamedir(kv): 80 | # If we've got multiple games supported, let's just ask 81 | if len(kv.items()) > 1: 82 | gamedir = idaapi.ask_str("", 0, "There are multiple games supported by this file. 
Which game directory is this for?") 83 | # Not in the basic game shit, check for support in default 84 | if gamedir is not None and gamedir not in kv.keys(): 85 | default = kv.get("#default") 86 | # There's a default entry, check for supported 87 | if default: 88 | supported = default.get("#supported") 89 | if supported: 90 | if gamedir in supported.values(): 91 | return gamedir 92 | return "" 93 | return "#default" 94 | return "" 95 | else: 96 | # 1 item, see if it's a default 97 | gamedir = list(kv.keys())[0] 98 | if gamedir == "#default": 99 | default = kv["#default"] 100 | # If it has multiple supports, check and see if we're in there 101 | supported = default.get("#supported") 102 | if supported: 103 | if len(supported.items()) > 1: 104 | gamedir = idaapi.ask_str("", 0, "There are multiple games supported by this file. Which game directory is this for?") 105 | if gamedir is not None and gamedir in default["#supported"].values(): 106 | return gamedir 107 | return "" 108 | return list(supported.values())[0] 109 | return "#default" 110 | 111 | return gamedir 112 | 113 | def get_voffs(name): 114 | os = get_os() 115 | if os == OS_Linux: 116 | mangled = "_ZTV{}{}".format(len(name), name) 117 | offset = 8 118 | else: 119 | mangled = "??_7{}@@6B@".format(name) 120 | offset = 0 121 | 122 | addr = idc.get_name_ea_simple(mangled) 123 | if addr != idc.BADADDR: 124 | addr += offset 125 | return addr 126 | 127 | def read_vtable(funcname, ea): 128 | funcs = {} 129 | offset = 0 130 | while ea != idc.BADADDR: 131 | if idaapi.inf_is_64bit(): 132 | offs = idaapi.get_qword(ea) 133 | else: 134 | offs = idaapi.get_dword(ea) 135 | 136 | if not idaapi.is_code(idaapi.get_full_flags(offs)): 137 | break 138 | 139 | name = idc.get_name(offs, idaapi.GN_VISIBLE) 140 | demangled = idc.demangle_name(name, idc.get_inf_attr(idc.INF_SHORT_DN)) 141 | if demangled == None: 142 | demangled = name 143 | 144 | if "(" in demangled: 145 | demangled = demangled[:demangled.find("(")] 146 | funcs[demangled.lower()] 
= offset 147 | 148 | offset += 1 149 | ea = idaapi.next_head(ea, idc.BADADDR) 150 | 151 | # We've got a list of function names, let's do this really shittily because idk any other way 152 | 153 | # This is a good programmer who makes their gamedata the proper way :) 154 | offs = funcs.get(funcname.lower(), -1) 155 | if offs != -1: 156 | return offs 157 | 158 | # Often done but sometimes there are subclass types thrown in, save those too 159 | if "::" in funcname: 160 | funcname = funcname[funcname.find("::")+2:] 161 | 162 | # Try by exact function name 163 | funcnames = {} 164 | for key, value in get_bcompat_items(funcs): 165 | # Function overloads can fuck right off 166 | s = key[key.find("::")+2:].lower() if "::" in key else key.lower() 167 | funcnames[s.lower()] = value 168 | 169 | offs = funcnames.get(funcname.lower(), -1) 170 | # Second best way, exact function name 171 | if offs != -1: 172 | return offs 173 | 174 | return -1 175 | # Anything else near here is either some random mem offset or some other crap 176 | # possibilities = [key for key in funcnames.keys() if funcname in key] 177 | # return [found for found in funcnames[x] for x in possibilities] 178 | 179 | # So we've a few options with finding appropriate vtable offsets 180 | # Option 1: Check and see if they use the optimal naming sequence "Type::Function" and revel in that 181 | # If we can't deduce that exactly, try option 2 182 | # Option 2: They must've used just the function name, run through every function that has a name like that 183 | # and perform option 1 on each 184 | # Windows can suck a wiener on this one 185 | def try_get_voffset(funcname): 186 | if "(" in funcname: 187 | funcname = funcname[:funcname.find("(")] 188 | if "::" in funcname: 189 | # Option 1 190 | typename = funcname[:funcname.find("::")] 191 | voffs = get_voffs(typename) 192 | offs = -1 193 | if voffs != idc.BADADDR: 194 | offs = read_vtable(funcname, voffs) 195 | if offs != -1: 196 | return offs 197 | 198 | funcname = 
funcname[funcname.find("::")+2:] 199 | 200 | # Let's chug along all of these functions, woohoo for option 2! 201 | for func in idautils.Functions(): 202 | name = idc.get_name(func, idaapi.GN_VISIBLE) 203 | if not name or funcname not in name: # funcname should only be a plain function decl, so it would be unfettered in a mangled name 204 | continue 205 | 206 | demangled = idc.demangle_name(name, idc.get_inf_attr(idc.INF_SHORT_DN)) 207 | if demangled == None: 208 | continue 209 | 210 | demname = demangled 211 | if "::" in demname: 212 | demname = demname[demname.find("::")+2:] 213 | if "(" in demname: 214 | demname = demname[:demname.find("(")] 215 | 216 | if funcname == demname: # Here's an exact match, let's get the type name then read the vtable 217 | if "::" not in demangled: # Okay, so someone somewhere is an idiot and managed to provide an offset name that is the 218 | continue # same name as some non-class function and this will manage to catch that 219 | 220 | typename = demangled[:demangled.find("::")] 221 | voffs = get_voffs(typename) 222 | if voffs != idc.BADADDR: 223 | offs = read_vtable(funcname, voffs) 224 | if offs != -1: 225 | return offs 226 | 227 | return -1 # Your naming conventions suck and you should feel bad. Or this is Windows and you should still feel bad 228 | 229 | def main(): 230 | kv = None 231 | filereq = idaapi.ask_file(0, "*.txt", "Select a gamedata file") 232 | if filereq is None: 233 | return 234 | 235 | # Try and capture the huge exception that happens if there are multi-line comments 236 | # Why does vdfparse print the entire file? Lol 237 | try: 238 | with open(filereq) as f: 239 | kv = vdf.load(f) 240 | except Exception as e: 241 | idaapi.warning("Could not load file!\nSee console for details") 242 | import traceback 243 | traceback.print_exception(type(e), e, e.__traceback__) 244 | if "vdf.parse: unexpected EOF" in str(e): 245 | print("[Gamedata Checker] This is likely due to multi-line comments in the gamedata file. 
Try removing them and try again") 246 | return 247 | 248 | if kv == None: 249 | idaapi.warning("Could not load file!") 250 | return 251 | 252 | kv = list(kv.values())[0] 253 | os = get_os() 254 | gamedir = get_gamedir(kv) 255 | if not gamedir: 256 | idaapi.warning("Could not find game directory in file") 257 | return 258 | 259 | kv = kv[gamedir] 260 | found = { 261 | "Signatures": {}, 262 | "Offsets": {} 263 | } 264 | 265 | signatures = kv.get("Signatures") 266 | if signatures: 267 | for name, handle in signatures.items(): 268 | s = handle.get(osstr(os)) 269 | if s: 270 | found["Signatures"][name] = checksig(s) 271 | 272 | offsets = kv.get("Offsets") 273 | if offsets:# and os != "windows": 274 | for name, handle in offsets.items(): 275 | offset = handle.get(osstr(os), -1) 276 | if offset != -1: 277 | found["Offsets"][name] = [offset, try_get_voffset(name)] 278 | 279 | checkmark = u"\u2713".encode("utf8") if version_info[0] <= 2 else "✓" 280 | 281 | # Format the output string so it's pretty 282 | try: 283 | maxlen = max([len(key) for key in found["Signatures"].keys()]) 284 | except: 285 | maxlen = 0 286 | if maxlen: 287 | # Align maxlen to 4 288 | if maxlen % 4 != 0: 289 | maxlen += 4 - (maxlen % 4) 290 | 291 | print("Signatures:") 292 | for key, value in get_bcompat_items(found["Signatures"]): 293 | print(f"\t{key:{maxlen}}{checkmark if value else 'INVALID'}") 294 | 295 | try: 296 | maxlen = max([len(key) for key in found["Offsets"].keys()]) 297 | except: 298 | maxlen = 0 299 | if maxlen: 300 | # Align maxlen to 4 301 | if maxlen % 4 != 0: 302 | maxlen += 4 - (maxlen % 4) 303 | 304 | # Trial and error and trial and error and trial and 305 | print(f"Offsets:{'Gamedata':>{maxlen + 9}}{'Current':>12}{'Status':>12}") 306 | for key, value in get_bcompat_items(found["Offsets"]): 307 | s = f"\t{key:{maxlen}}" 308 | foundval = value[1] 309 | status = checkmark 310 | if isinstance(value[1], list): 311 | status = checkmark if value[0] in value[1] else 'X' 312 | elif 
int(value[0]) != int(value[1]): 313 | status = 'X' 314 | if value[1] == -1: 315 | foundval = "N/A" 316 | 317 | s += f"{value[0]:<12} {foundval:<12} {status:<12}" 318 | 319 | print(s) 320 | 321 | main() -------------------------------------------------------------------------------- /isgoodsig.py: -------------------------------------------------------------------------------- 1 | import idc 2 | import idaapi 3 | import idautils 4 | 5 | def main(): 6 | bytesig = idaapi.ask_str("", 0, "Insert signature: ") 7 | 8 | sig = bytesig.replace(r"\x", " ").replace("2A", "?").replace("2a", "?").strip() 9 | 10 | count = checksig(sig) 11 | if not count: 12 | print(r"INVALID: {}".format(bytesig)) 13 | print("Could not find any matching signatures for input") 14 | elif count == 1: 15 | print(r"VALID: {}".format(bytesig)) 16 | else: 17 | print(r"INVALID: {}".format(bytesig)) 18 | print("Found {} instances of input signature".format(count)) 19 | 20 | def checksig(sig): 21 | # Get the first segment that is executable to use its addresses for parse_binpat_str 22 | endea = idc.BADADDR 23 | for segea in idautils.Segments(): 24 | s = idaapi.getseg(segea) 25 | if s.perm & idaapi.SEGPERM_EXEC: 26 | segstart = segea 27 | # Set the end ea to the end of the last executable segment 28 | # Speed isn't as important in this script, so reading any extra X 29 | # segments is fine 30 | if endea == idc.BADADDR or endea < segstart + s.size(): 31 | endea = segstart + s.size() 32 | break 33 | 34 | count = 0 35 | addr = 0 36 | addr = idaapi.find_binary(addr, endea, sig, 0, idc.SEARCH_DOWN|idc.SEARCH_NEXT) 37 | while addr != idc.BADADDR: 38 | count = count + 1 39 | addr = idaapi.find_binary(addr, endea, sig, 0, idc.SEARCH_DOWN|idc.SEARCH_NEXT) 40 | 41 | return count 42 | 43 | # bin_search3 breaks after 15 or so bytes or something, idk man 44 | # binpat = idaapi.compiled_binpat_vec_t() 45 | # idaapi.parse_binpat_str(binpat, segstart, sig, 16, 
idaapi.get_default_encoding_idx(idaapi.get_encoding_bpu_by_name("UTF-8"))) 46 | 47 | # count = 0 48 | # addr = 0 49 | # addr, _ = idaapi.bin_search3(addr, endea, binpat, idaapi.BIN_SEARCH_FORWARD) 50 | # while addr != idc.BADADDR: 51 | # count += 1 52 | 53 | # # +1 because the search finds itself 54 | # addr, _ = idaapi.bin_search3(addr + 1, endea, binpat, idaapi.BIN_SEARCH_FORWARD) 55 | 56 | # return count 57 | 58 | main() -------------------------------------------------------------------------------- /makesig.py: -------------------------------------------------------------------------------- 1 | import idc 2 | import idautils 3 | import idaapi 4 | 5 | def print_wildcards(count): 6 | return "?" * count 7 | 8 | def is_good_sig(sig, mask): 9 | search = " ".join('?' if m == '?' else b for b, m in zip(sig.strip().split(), mask)) 10 | 11 | # Get the first segment that is executable to use its addresses for parse_binpat_str 12 | endea = idc.BADADDR 13 | for segea in idautils.Segments(): 14 | s = idaapi.getseg(segea) 15 | if s.perm & idaapi.SEGPERM_EXEC: 16 | segstart = segea 17 | # Set the end ea to the end of the last executable segment 18 | # Speed isn't as important in this script, so reading any extra X 19 | # segments is fine 20 | if endea == idc.BADADDR or endea < segstart + s.size(): 21 | endea = segstart + s.size() 22 | 23 | count = 0 24 | addr = 0 25 | # Ever just deprecate something and provide 0 documentation on what to use instead? 
26 | addr = idaapi.find_binary(addr, endea, search, 0, idc.SEARCH_DOWN|idc.SEARCH_NEXT) 27 | while addr != idc.BADADDR: 28 | count = count + 1 29 | if count > 1: 30 | break 31 | addr = idaapi.find_binary(addr, endea, search, 0, idc.SEARCH_DOWN|idc.SEARCH_NEXT) 32 | 33 | return count == 1 34 | 35 | # binpat = idaapi.compiled_binpat_vec_t() 36 | # idaapi.parse_binpat_str(binpat, segstart, search, 16, idaapi.get_encoding_bpu_by_name("UTF-8")) 37 | 38 | # count = 0 39 | # addr = 0 40 | # addr, _ = idaapi.bin_search3(addr, endea, binpat, idaapi.BIN_SEARCH_FORWARD) 41 | 42 | # while addr != idc.BADADDR: 43 | # count += 1 44 | # if count > 1: 45 | # break 46 | 47 | # # +1 because the search finds itself 48 | # addr, _ = idaapi.bin_search3(addr + 1, endea, binpat, idaapi.BIN_SEARCH_FORWARD) 49 | 50 | 51 | # return count == 1 52 | 53 | def makesig(ea, sz = -1): 54 | name = idc.get_name(ea, idaapi.GN_VISIBLE) 55 | 56 | sig = "" 57 | mask = "" 58 | found = 0 59 | done = 0 60 | 61 | addr = ea 62 | end = ea + sz if sz != -1 else idc.BADADDR 63 | while addr != idc.BADADDR and (sz == -1 or addr < ea + sz): 64 | info = idaapi.insn_t() 65 | if not idaapi.decode_insn(info, addr): 66 | print(f"Failed to decode instruction at {addr:#X}?") 67 | idaapi.beep() 68 | return 69 | 70 | sig += " ".join(f"{idaapi.get_byte(addr+i):02X}" for i in range(info.size)) + " " 71 | 72 | done = 0 73 | if info.Op1.type in (idaapi.o_near, idaapi.o_far): 74 | insnsz = 2 if idaapi.get_byte(addr) == 0x0F else 1 75 | mask += f"{'x' * insnsz}{print_wildcards(info.size - insnsz)}" 76 | done = 1 77 | elif info.Op1.type == idaapi.o_reg and info.Op2.type == idaapi.o_mem and info.Op2.addr != idc.BADADDR: 78 | mask += f"{'x' * info.Op2.offb}{print_wildcards(info.size - info.Op2.offb)}" 79 | done = 1 80 | 81 | if not done: # Unknown, just wildcard addresses 82 | i = 0 83 | while i < info.size: 84 | loc = addr + i 85 | if ((idc.get_fixup_target_type(loc) & 0x0F) == idaapi.FIXUP_OFF32): 86 | mask += print_wildcards(4) 
87 | i += 3 88 | elif (idc.get_fixup_target_type(loc) & 0x0F) == idaapi.FIXUP_OFF64: 89 | mask += print_wildcards(8) 90 | i += 7 91 | else: 92 | mask += 'x' 93 | 94 | i += 1 95 | 96 | if is_good_sig(sig, mask): 97 | found = 1 98 | break 99 | 100 | addr = idaapi.next_head(addr, end) 101 | 102 | if found == 0: 103 | print(sig) 104 | print("Ran out of bytes to create unique signature.") 105 | idaapi.beep() 106 | return 107 | 108 | sig = sig.strip() 109 | csig = r"\x" + sig.replace(" ", r"\x") 110 | 111 | align = len("Wildcarded Bytes: ") 112 | wildcarded = f"{'Wildcarded Bytes:':<{align}} {' '.join('?' if m == '?' else b for b, m in zip(sig.split(), mask))}\n" if "?" in mask else "" 113 | smsig = r"\x" + r"\x".join("2A" if m == "?" else b for b, m in zip(sig.split(), mask)) 114 | 115 | print("==================================================") 116 | print( 117 | f"Signature for {name}:\n" 118 | f"{'Mask:':<{align}} {mask}\n" 119 | f"{'Bytes:':<{align}} {sig}\n" 120 | f"{wildcarded}" 121 | f"{'Byte String:':<{align}} {csig}\n" 122 | f"{'SourceMod':<{align}} {smsig}" 123 | ) 124 | 125 | try: 126 | import pyperclip 127 | pyperclip.copy(smsig) 128 | print(f"SourceMod signature copied to clipboard") 129 | except: 130 | print("'pip install pyperclip' to automatically copy the SourceMod signature to your clipboard") 131 | return csig 132 | 133 | def main(): 134 | addr = idaapi.get_screen_ea() 135 | func = idaapi.get_func(addr) 136 | if addr == idc.BADADDR or func is None: 137 | print("Make sure you are in a function!") 138 | idaapi.beep() 139 | return 140 | 141 | makesig(func.start_ea, func.end_ea - func.start_ea) 142 | 143 | main() -------------------------------------------------------------------------------- /makesigfromhere.py: -------------------------------------------------------------------------------- 1 | import idc 2 | import idautils 3 | import idaapi 4 | 5 | def get_dt_size(dtype): 6 | return { 7 | idaapi.dt_byte: 1, 8 | idaapi.dt_word: 2, 9 | idaapi.dt_dword: 
4, 10 | idaapi.dt_float: 4, 11 | idaapi.dt_double: 8, 12 | }.get(dtype, -1) 13 | 14 | def print_wildcards(count): 15 | return "?" * count 16 | 17 | def is_good_sig(sig, mask): 18 | search = " ".join('?' if m == '?' else b for b, m in zip(sig.strip().split(), mask)) 19 | 20 | # Get the first segment that is executable to use its addresses for parse_binpat_str 21 | endea = idc.BADADDR 22 | for segea in idautils.Segments(): 23 | s = idaapi.getseg(segea) 24 | if s.perm & idaapi.SEGPERM_EXEC: 25 | segstart = segea 26 | # Set the end ea to the end of the last executable segment 27 | # Speed isn't as important in this script, so reading any extra X 28 | # segments is fine 29 | if endea == idc.BADADDR or endea < segstart + s.size(): 30 | endea = segstart + s.size() 31 | 32 | count = 0 33 | addr = 0 34 | addr = idaapi.find_binary(addr, endea, search, 0, idc.SEARCH_DOWN|idc.SEARCH_NEXT) 35 | while addr != idc.BADADDR: 36 | count = count + 1 37 | if count > 1: 38 | break 39 | addr = idaapi.find_binary(addr, endea, search, 0, idc.SEARCH_DOWN|idc.SEARCH_NEXT) 40 | 41 | return count == 1 42 | 43 | # binpat = idaapi.compiled_binpat_vec_t() 44 | # idaapi.parse_binpat_str(binpat, segstart, search, 16, idaapi.get_encoding_bpu_by_name("UTF-8")) 45 | 46 | # count = 0 47 | # addr = 0 48 | # addr, _ = idaapi.bin_search3(addr, endea, binpat, idaapi.BIN_SEARCH_FORWARD) 49 | 50 | # while addr != idc.BADADDR: 51 | # count += 1 52 | # if count > 1: 53 | # break 54 | 55 | # # +1 because the search finds itself 56 | # addr, _ = idaapi.bin_search3(addr + 1, endea, binpat, idaapi.BIN_SEARCH_FORWARD) 57 | 58 | 59 | # return count == 1 60 | 61 | def makesig(ea, sz=-1): 62 | func = idaapi.get_func(ea) 63 | name = idc.get_name(func.start_ea, idaapi.GN_VISIBLE) 64 | 65 | sig = "" 66 | mask = "" 67 | found = 0 68 | done = 0 69 | 70 | addr = ea 71 | end = ea + sz if sz != -1 else idc.BADADDR 72 | while addr != idc.BADADDR and (sz == -1 or addr < ea + sz): 73 | info = idaapi.insn_t() 74 | if not 
idaapi.decode_insn(info, addr): 75 | print(f"Failed to decode instruction at {addr:#X}?") 76 | idaapi.beep() 77 | return 78 | 79 | sig += " ".join(f"{idaapi.get_byte(addr+i):02X}" for i in range(info.size)) + " " 80 | 81 | done = 0 82 | if info.Op1.type in (idaapi.o_near, idaapi.o_far): 83 | insnsz = 2 if idaapi.get_byte(addr) == 0x0F else 1 84 | mask += f"{'x' * insnsz}{print_wildcards(info.size - insnsz)}" 85 | done = 1 86 | elif info.Op1.type == idaapi.o_reg and info.Op2.type == idaapi.o_mem and info.Op2.addr != idc.BADADDR: 87 | mask += f"{'x' * info.Op2.offb}{print_wildcards(info.size - info.Op2.offb)}" 88 | done = 1 89 | 90 | if not done: # Unknown, just wildcard addresses 91 | i = 0 92 | while i < info.size: 93 | loc = addr + i 94 | if ((idc.get_fixup_target_type(loc) & 0x0F) == idaapi.FIXUP_OFF32): 95 | mask += print_wildcards(4) 96 | i += 3 97 | elif (idc.get_fixup_target_type(loc) & 0x0F) == idaapi.FIXUP_OFF64: 98 | mask += print_wildcards(8) 99 | i += 7 100 | else: 101 | mask += 'x' 102 | 103 | i += 1 104 | 105 | if is_good_sig(sig, mask): 106 | found = 1 107 | break 108 | 109 | addr = idaapi.next_head(addr, end) 110 | 111 | if found == 0: 112 | print(sig) 113 | print("Ran out of bytes to create unique signature.") 114 | idaapi.beep() 115 | return 116 | 117 | sig = sig.strip() 118 | csig = r"\x" + sig.replace(" ", r"\x") 119 | 120 | align = len("Wildcarded Bytes: ") 121 | wildcarded = f"{'Wildcarded Bytes:':<{align}} {' '.join('?' if m == '?' else b for b, m in zip(sig.split(), mask))}\n" if "?" in mask else "" 122 | smsig = r"\x" + r"\x".join("2A" if m == "?" 
else b for b, 123 | m in zip(sig.split(), mask)) 124 | 125 | print("==================================================") 126 | print( 127 | f"Signature for {name} + {ea - func.start_ea} ({ea - func.start_ea:#x}):\n" 128 | f"{'Mask:':<{align}} {mask}\n" 129 | f"{'Bytes:':<{align}} {sig}\n" 130 | f"{wildcarded}" 131 | f"{'Byte String:':<{align}} {csig}\n" 132 | f"{'SourceMod':<{align}} {smsig}" 133 | ) 134 | 135 | try: 136 | import pyperclip 137 | pyperclip.copy(smsig) 138 | print(f"SourceMod signature copied to clipboard") 139 | except: 140 | print("'pip install pyperclip' to automatically copy the SourceMod signature to your clipboard") 141 | return csig 142 | 143 | def main(): 144 | ea = idaapi.get_screen_ea() 145 | func = idaapi.get_func(ea) 146 | if ea == idc.BADADDR or func is None: 147 | print("Make sure you are in a function!") 148 | idaapi.beep() 149 | return 150 | 151 | sz = func.end_ea - ea 152 | 153 | makesig(ea, sz) 154 | 155 | main() -------------------------------------------------------------------------------- /nameresetter.py: -------------------------------------------------------------------------------- 1 | import idc 2 | import idautils 3 | import idaapi 4 | 5 | def main(): 6 | count = 0 7 | for segstart in idautils.Segments(): 8 | segend = idaapi.getseg(segstart).end_ea 9 | for fea in idautils.Functions(segstart, segend): 10 | flags = idaapi.get_full_flags(fea) 11 | if not (flags & idc.FF_NAME): 12 | continue 13 | 14 | fflags = idc.get_func_attr(fea, idc.FUNCATTR_FLAGS) 15 | if fflags & idaapi.FUNC_LIB: 16 | continue 17 | 18 | if idc.set_name(fea, ""): 19 | count += 1 20 | 21 | print(f"Successfully renamed {count} functions") 22 | 23 | main() -------------------------------------------------------------------------------- /netprop_importer.py: -------------------------------------------------------------------------------- 1 | import idautils 2 | import idaapi 3 | import idc 4 | import ctypes 5 | import time 6 | 7 | from math import ceil 8 | 9 
| import xml.etree.ElementTree as et 10 | 11 | from dataclasses import dataclass 12 | from enum import Enum 13 | 14 | if idaapi.inf_is_64bit(): 15 | ea_t = ctypes.c_uint64 16 | FF_PTR = idc.FF_QWORD 17 | else: 18 | ea_t = ctypes.c_uint32 19 | FF_PTR = idc.FF_DWORD 20 | 21 | class DataCache(object): 22 | tablecache = {} 23 | 24 | class SendPropType(Enum): 25 | DPT_Int = 0 26 | DPT_Float = 1 27 | DPT_Vector = 2 28 | DPT_VectorXY = 3 29 | DPT_String = 4 30 | DPT_Array = 5 31 | DPT_DataTable = 6 32 | DPT_Int64 = 7 33 | 34 | class SendFlags(Enum): 35 | UNSIGNED = 1 << 0 36 | COORD = 1 << 1 37 | NOSCALE = 1 << 2 38 | ROUNDDOWN = 1 << 3 39 | ROUNDUP = 1 << 4 40 | NORMAL = 1 << 5 41 | EXCLUDE = 1 << 6 42 | XYZE = 1 << 7 43 | INSIDEARRAY = 1 << 8 44 | PROXY_ALWAYS_YES = 1 << 9 45 | CHANGES_OFTEN = 1 << 10 46 | IS_A_VECTOR_ELEM = 1 << 11 47 | COLLAPSIBLE = 1 << 12 48 | COORD_MP = 1 << 13 49 | COORD_MP_LOWPRECISION = 1 << 14 50 | COORD_MP_INTEGRAL = 1 << 15 51 | VARINT = NORMAL 52 | ENCODED_AGAINST_TICKCOUNT = 1 << 16 53 | 54 | @dataclass(frozen=True) 55 | class SendProp: 56 | name: str 57 | type: int #SendPropType 58 | offset: int 59 | bits: int 60 | flags: int 61 | table: 'SendTable' = None 62 | 63 | def __repr__(self): 64 | # Use id() with table or else infinite recursion 65 | return f"SendProp(name={self.name}, type={self.type}, offset={self.offset}, bits={self.bits}, flags={self.flags}, table={id(self.table):#x})" 66 | 67 | def add_to_struc(self, struc, offset): 68 | # So, unfortunately, it doesn't seem to be possible to implement baseclasses 69 | # while also keeping vtables intact. This might actually be possible as it can be done 70 | # with IDA's header parser, but this might not be exposed to the API. 71 | # Implementing baseclasses with seamless vtable integration is a TODO 72 | # The framework is more or less here, so if I manage to figure that out it won't 73 | # be that difficult to implement 74 | # Might have to do with optinfo_t pointing to the proper vtable? 
Dunno 75 | # if self.table is not None: 76 | # baseclass = DataCache.struccache.get(self.table.classname, None) 77 | # if baseclass is None: 78 | # self.table.create_struc() 79 | 80 | # baseclass = DataCache.struccache[self.table.classname] 81 | 82 | if self.table is not None: 83 | # Array 84 | # We *could* parse these and implement them as embedded classes/arrays 85 | # but there's no guarantee that we would get a proper size, which could 86 | # cause some really poor results 87 | # There's a good chance that more array data is actually in the inner table's m_pExtraData 88 | # Mayhaps a SourceMod PR for another time 89 | if not self.table.name.startswith("_ST_"): 90 | # Bad hack but catches arrays 91 | if self.table.name == self.name: 92 | if self.offset != 0: 93 | self.table.add_array_to_struc(struc, offset + self.offset) 94 | return 95 | else: 96 | self.table.add_to_struc(struc, offset + self.offset) 97 | 98 | # Offset is 0 so we die 99 | if self.offset == 0: 100 | return 101 | 102 | curroffset = self.offset + offset 103 | 104 | currmem = idaapi.get_member(struc, curroffset) 105 | if currmem is not None: 106 | # print(f"Member {self.name} already exists in {idaapi.get_struc_name(struc.id)}") 107 | return 108 | 109 | idaflags, sz = self.calc_sz() 110 | tinfo = self.get_tinfo() 111 | targetname = idaapi.validate_name(self.name, idaapi.VNT_IDENT) 112 | 113 | serr = idaapi.add_struc_member(struc, targetname, curroffset, idaflags, None, sz) 114 | if serr != idaapi.STRUC_ERROR_MEMBER_OK: 115 | # I really don't wanna deal with these silly subclasses 116 | if serr < idaapi.STRUC_ERROR_MEMBER_OFFSET: 117 | print(f"Could not add struct member {idaapi.get_struc_name(struc.id)}.{targetname} at {curroffset} ({curroffset:#x})! 
Error {serr}") 118 | return 119 | 120 | currmem = idaapi.get_member(struc, curroffset) 121 | if tinfo is not None: 122 | idaapi.set_member_tinfo(struc, currmem, 0, tinfo, 0) 123 | elif self.flags and self.flags & SendFlags.UNSIGNED.value: 124 | currinfo = idaapi.tinfo_t() 125 | if idaapi.get_member_tinfo(currinfo, currmem): 126 | currinfo.change_sign(idaapi.type_unsigned) 127 | idaapi.set_member_tinfo(struc, currmem, 0, currinfo, 0) 128 | 129 | def calc_sz(self): 130 | if self.type == SendPropType.DPT_Float.value: 131 | return idc.FF_FLOAT, 4 132 | elif self.type == SendPropType.DPT_Int64.value: 133 | return idc.FF_QWORD, 8 134 | elif self.type == SendPropType.DPT_String.value: 135 | return FF_PTR, ctypes.sizeof(ea_t) 136 | elif self.type == SendPropType.DPT_Vector.value: 137 | # Returning FF_STRUCT doesn't work because the proper opinfo_t needs to be set 138 | # but this can be cheesed by just setting it to FF_DWORD and setting the tinfo after 139 | return idc.FF_DWORD, 12 #idc.FF_STRUCT 140 | 141 | absmax = ceil(self.bits/8.0) 142 | if absmax == 1: 143 | flags = idc.FF_BYTE 144 | numbytes = 1 145 | elif absmax == 2: 146 | flags = idc.FF_WORD 147 | numbytes = 2 148 | else: 149 | flags = idc.FF_DWORD 150 | numbytes = 4 151 | 152 | return flags, numbytes 153 | 154 | def get_tinfo(self): 155 | return { 156 | SendPropType.DPT_Vector.value: VECTOR, 157 | # SendPropType.DPT_Int.value: idaapi.tinfo_t(idaapi.BT_INT), 158 | SendPropType.DPT_Float.value: idaapi.tinfo_t(idaapi.BT_FLOAT), 159 | # SendPropType.DPT_String.value: idaapi.tinfo_t(idaapi.BT_PTR), 160 | SendPropType.DPT_Int64.value: idaapi.tinfo_t(idaapi.BT_INT64), 161 | }.get(self.type, None) 162 | 163 | @dataclass 164 | class SendTable: 165 | name: str 166 | props: list[SendProp] 167 | # For mapping to a "C"-class 168 | # I'm gonna assume that there'll be some game that won't suffice with a "replace DT_ with C" method, 169 | # so we have SendTable objects point to their actual class name 170 | classname: str 171 | 
172 | @staticmethod 173 | def create(elem:et.Element, classname=None): 174 | name = elem.attrib["name"] 175 | 176 | # Check if we've already cached this table, update classname if so 177 | # because if this is true, then its classname is surely missing 178 | if name in DataCache.tablecache: 179 | if classname is not None: 180 | DataCache.tablecache[name].classname = classname 181 | return DataCache.tablecache[name] 182 | 183 | props = [] 184 | for p in elem: 185 | pname = p.attrib["name"] 186 | 187 | # Collect and format the fields 188 | stype = p.find("type").text if p.find("type") != None else None 189 | ptype = str_to_dt_type(stype) 190 | sflags = p.find("flags").text if p.find("flags") != None else None 191 | flags = str_to_sendflags(sflags) 192 | offset = int(p.find("offset").text) if p.find("offset") != None else None 193 | bits = int(p.find("bits").text) if p.find("bits") != None else None 194 | ptable = SendTable.create(p.find("sendtable")) if p.find("sendtable") != None else None 195 | 196 | # Append a new prop 197 | props.append(SendProp(pname, ptype, offset, bits, flags, ptable)) 198 | 199 | # Cache and return 200 | DataCache.tablecache[name] = SendTable(name, props, classname) 201 | return DataCache.tablecache[name] 202 | 203 | def create_struc(self): 204 | struc = add_struc_ex(self.classname) 205 | 206 | self.add_to_struc(struc, 0) 207 | 208 | #DataCache.struccache[self.classname] = struc 209 | 210 | def add_to_struc(self, struc, offset): 211 | for prop in self.props: 212 | prop.add_to_struc(struc, offset) 213 | 214 | def add_array_to_struc(self, struc, offset): 215 | if offset == 0: 216 | return 217 | 218 | idaflags, sz = self.props[0].calc_sz() 219 | if len(self.props) > 1: 220 | sz = (self.props[1].offset - self.props[0].offset) 221 | idaflags = sz_to_idaflags(sz) 222 | 223 | sz *= len(self.props) 224 | 225 | tinfo = self.props[0].get_tinfo() 226 | targetname = idaapi.validate_name(self.name, idaapi.VNT_IDENT) 227 | 228 | serr = 
idaapi.add_struc_member(struc, targetname, offset, idaflags, None, sz) 229 | if serr != idaapi.STRUC_ERROR_MEMBER_OK: 230 | # I really don't wanna deal with these silly subclasses 231 | if serr < idaapi.STRUC_ERROR_MEMBER_OFFSET: 232 | print(f"Could not add struct member {idaapi.get_struc_name(struc.id)}.{targetname} at {offset} ({offset:#x})! Error {serr}") 233 | return 234 | 235 | currmem = idaapi.get_member(struc, offset) 236 | if tinfo is not None: 237 | idaapi.set_member_tinfo(struc, currmem, 0, tinfo, 0) 238 | elif self.props[0].flags and self.props[0].flags & SendFlags.UNSIGNED.value: 239 | currinfo = idaapi.tinfo_t() 240 | if idaapi.get_member_tinfo(currinfo, currmem): 241 | currinfo.change_sign(idaapi.type_unsigned) 242 | idaapi.set_member_tinfo(struc, currmem, 0, currinfo, 0) 243 | 244 | @dataclass(frozen=True) 245 | class ServerClass: 246 | name: str 247 | sendtable: SendTable 248 | 249 | @staticmethod 250 | def create(elem: et.Element, classname): 251 | sendtable = elem.find("sendtable") 252 | return ServerClass(classname, SendTable.create(sendtable, classname)) 253 | 254 | def create_struc(self): 255 | self.sendtable.create_struc() 256 | 257 | 258 | # Idiot proof IDA wait box 259 | class WaitBox: 260 | buffertime = 0.0 261 | shown = False 262 | msg = "" 263 | 264 | @staticmethod 265 | def _show(msg): 266 | WaitBox.msg = msg 267 | if WaitBox.shown: 268 | idaapi.replace_wait_box(msg) 269 | else: 270 | idaapi.show_wait_box(msg) 271 | WaitBox.shown = True 272 | 273 | @staticmethod 274 | def show(msg, buffertime=0.1): 275 | if msg == WaitBox.msg: 276 | return 277 | 278 | if buffertime > 0.0: 279 | if time.time() - WaitBox.buffertime < buffertime: 280 | return 281 | WaitBox.buffertime = time.time() 282 | WaitBox._show(msg) 283 | 284 | @staticmethod 285 | def hide(): 286 | if WaitBox.shown: 287 | idaapi.hide_wait_box() 288 | WaitBox.shown = False 289 | 290 | VECTOR = None 291 | 292 | def str_to_dt_type(t): 293 | return { 294 | "int": 
SendPropType.DPT_Int.value, 295 | "float": SendPropType.DPT_Float.value, 296 | "vector": SendPropType.DPT_Vector.value, 297 | "string": SendPropType.DPT_String.value, 298 | "array": SendPropType.DPT_Array.value, 299 | "datatable": SendPropType.DPT_DataTable.value, 300 | "int64": SendPropType.DPT_Int64.value 301 | }.get(t, None) 302 | 303 | def str_to_sendflags(s): 304 | if not s: 305 | return s 306 | 307 | splode = s.split("|") 308 | d = { 309 | "Unsigned": SendFlags.UNSIGNED.value, 310 | "Coord": SendFlags.COORD.value, 311 | "NoScale": SendFlags.NOSCALE.value, 312 | "RoundDown": SendFlags.ROUNDDOWN.value, 313 | "RoundUp": SendFlags.ROUNDUP.value, 314 | "VarInt": SendFlags.NORMAL.value, 315 | "Normal": SendFlags.NORMAL.value, 316 | "Exclude": SendFlags.EXCLUDE.value, 317 | "XYZE": SendFlags.XYZE.value, 318 | "InsideArray": SendFlags.INSIDEARRAY.value, 319 | "AlwaysProxy": SendFlags.PROXY_ALWAYS_YES.value, 320 | "ChangesOften": SendFlags.CHANGES_OFTEN.value, 321 | "VectorElem": SendFlags.IS_A_VECTOR_ELEM.value, 322 | "Collapsible": SendFlags.COLLAPSIBLE.value, 323 | "CoordMP": SendFlags.COORD_MP.value, 324 | "CoordMPLowPrec": SendFlags.COORD_MP_LOWPRECISION.value, 325 | "CoordMpIntegral": SendFlags.COORD_MP_INTEGRAL.value, 326 | } 327 | flags = 0 328 | for fl in splode: 329 | flags |= d.get(fl, 0) 330 | 331 | return flags 332 | 333 | def sz_to_idaflags(sz): 334 | return { 335 | 1: idc.FF_BYTE, 336 | 2: idc.FF_WORD, 337 | 4: idc.FF_DWORD, 338 | 8: idc.FF_QWORD 339 | }.get(sz, 1) 340 | 341 | 342 | def add_struc_ex(name): 343 | strucid = idaapi.get_struc_id(name) 344 | if strucid == idc.BADADDR: 345 | strucid = idaapi.add_struc(idc.BADADDR, name) 346 | 347 | return idaapi.get_struc(strucid) 348 | 349 | def calcszdata(sz): 350 | absmax = ceil(sz/8.0) 351 | if absmax == 1: 352 | flags = idc.FF_BYTE 353 | numbytes = 1 354 | elif absmax == 2: 355 | flags = idc.FF_WORD 356 | numbytes = 2 357 | else: 358 | flags = idc.FF_DWORD 359 | numbytes = 4 360 | 361 | return flags, 
numbytes 362 | 363 | # Fix SM's bad xml structure 364 | def fix_xml(data): 365 | for i in range(len(data)): 366 | data[i] = data[i].replace('""', '"') 367 | 368 | data[3] = "<root>\n" 369 | data.append("</root>\n") 370 | return data 371 | 372 | # Make Vector and QAngle structs to keep things sane 373 | def make_basic_structs(): 374 | strucid = idaapi.get_struc_id("Vector") 375 | if strucid == idc.BADADDR: 376 | struc = idaapi.get_struc(idaapi.add_struc(idc.BADADDR, "Vector")) 377 | idaapi.add_struc_member(struc, "x", idc.BADADDR, idc.FF_FLOAT|idc.FF_DATA, None, 4) 378 | idaapi.add_struc_member(struc, "y", idc.BADADDR, idc.FF_FLOAT|idc.FF_DATA, None, 4) 379 | idaapi.add_struc_member(struc, "z", idc.BADADDR, idc.FF_FLOAT|idc.FF_DATA, None, 4) 380 | strucid = idaapi.get_struc_id("Vector") 381 | 382 | global VECTOR 383 | VECTOR = idaapi.tinfo_t() 384 | if idaapi.guess_tinfo(VECTOR, strucid) == idaapi.GUESS_FUNC_FAILED: 385 | VECTOR = None 386 | 387 | strucid = idaapi.get_struc_id("QAngle") 388 | if strucid == idc.BADADDR: 389 | struc = idaapi.get_struc(idaapi.add_struc(idc.BADADDR, "QAngle")) 390 | idaapi.add_struc_member(struc, "x", idc.BADADDR, idc.FF_FLOAT|idc.FF_DATA, None, 4) 391 | idaapi.add_struc_member(struc, "y", idc.BADADDR, idc.FF_FLOAT|idc.FF_DATA, None, 4) 392 | idaapi.add_struc_member(struc, "z", idc.BADADDR, idc.FF_FLOAT|idc.FF_DATA, None, 4) 393 | 394 | def main(): 395 | data = None 396 | try: 397 | fopen = idaapi.ask_file(0, "*.xml", "Select a file to import") 398 | if fopen is None: 399 | return 400 | 401 | idaapi.set_ida_state(idaapi.st_Work) 402 | WaitBox.show("Parsing XML") 403 | with open(fopen) as f: 404 | data = f.readlines() 405 | 406 | if data is None: 407 | idaapi.set_ida_state(idaapi.st_Ready) 408 | return 409 | 410 | make_basic_structs() 411 | 412 | try: 413 | # SM <= 1.10 has bad XML, so assume it's correct first, then try to fix it 414 | tree = et.fromstringlist(data) 415 | except: 416 | fix_xml(data) 417 | tree = et.fromstringlist(data) 418 | 419 | if
tree is None: 420 | idaapi.warning("Something bad happened :(") 421 | idaapi.set_ida_state(idaapi.st_Ready) 422 | return 423 | 424 | WaitBox.show("Creating ServerClasses") 425 | classes = {} 426 | for cls in tree: 427 | classname = cls.attrib["name"] 428 | classes[classname] = ServerClass.create(cls, classname) 429 | 430 | idaapi.begin_type_updating(idaapi.UTP_STRUCT) 431 | 432 | WaitBox.show("Adding struct members") 433 | for classname, serverclass in classes.items(): 434 | serverclass.create_struc() 435 | 436 | print("Done!") 437 | except: 438 | import traceback 439 | traceback.print_exc() 440 | print("Please file a bug report with supporting information at https://github.com/Scags/IDA-Scripts/issues") 441 | idaapi.beep() 442 | 443 | WaitBox.hide() 444 | idaapi.end_type_updating(idaapi.UTP_STRUCT) 445 | idaapi.set_ida_state(idaapi.st_Ready) 446 | 447 | main() -------------------------------------------------------------------------------- /sigfind.py: -------------------------------------------------------------------------------- 1 | import idc 2 | import idaapi 3 | import idautils 4 | 5 | def getsigloc(sig): 6 | # Get the first segment that is executable to use its addresses for parse_binpat_str 7 | endea = idc.BADADDR 8 | for segea in idautils.Segments(): 9 | s = idaapi.getseg(segea) 10 | if s.perm & idaapi.SEGPERM_EXEC: 11 | segstart = segea 12 | # Set the end ea to the end of the last executable segment 13 | # Speed isn't as important in this script, so reading any extra X 14 | # segments is fine 15 | if endea == idc.BADADDR or endea < segstart + s.size(): 16 | endea = segstart + s.size() 17 | break 18 | 19 | count = 0 20 | first = idaapi.find_binary(0, endea, sig, 0, idc.SEARCH_DOWN|idc.SEARCH_NEXT) 21 | addr = first 22 | while addr != idc.BADADDR: 23 | count = count + 1 24 | addr = idaapi.find_binary(addr, endea, sig, 0, idc.SEARCH_DOWN|idc.SEARCH_NEXT) 25 | 26 | return first, count 27 | 28 | # binpat = idaapi.compiled_binpat_vec_t() 29 | # # This returns 
false but it works? 30 | # idaapi.parse_binpat_str(binpat, segstart, sig, 16, idaapi.get_default_encoding_idx(idaapi.get_encoding_bpu_by_name("UTF-8"))) 31 | # addr, _ = idaapi.bin_search3(0, endea, binpat, idaapi.BIN_SEARCH_FORWARD) 32 | # return addr 33 | 34 | 35 | def main(): 36 | bytesig = idaapi.ask_str("", 0, "Insert signature: ") 37 | if bytesig is None: 38 | return 39 | 40 | sig = bytesig.replace(r"\x", " ").replace("2A", "?").replace("2a", "?").strip() 41 | 42 | loc, count = getsigloc(sig) 43 | if loc != idc.BADADDR: 44 | idaapi.jumpto(loc) 45 | if count > 1: 46 | print(f"Found {count} instances of signature. Jumping to first at {loc:#X}") 47 | else: 48 | # Beep, nothing found 49 | idaapi.beep() 50 | 51 | main() -------------------------------------------------------------------------------- /sigsmasher.py: -------------------------------------------------------------------------------- 1 | import idautils 2 | import idc 3 | import idaapi 4 | import yaml 5 | import time 6 | 7 | from math import floor 8 | 9 | MAX_SIG_LENGTH = 512 10 | 11 | # Change to 1 to have a very optimized makesig 12 | # Will produce usable signatures but they'll be a bit more volatile 13 | # since they rely on the position of the function in the binary 14 | # Searches up to the end of the function instead of the end of the .text segment 15 | ABSOLUTE_OPTIMIZATION = 0 16 | 17 | # Write-only trie for signatures 18 | # This is slightly faster than constantly running search_binary as 19 | # common signature prologues will be caught early and more quickly 20 | class Trie(object): 21 | def __init__(self): 22 | self.root = {} 23 | 24 | def add(self, data): 25 | node = self.root 26 | for d in data: 27 | if d not in node: 28 | node[d] = {} 29 | node = node[d] 30 | 31 | def find(self, data): 32 | node = self.root 33 | for d in data: 34 | if d not in node: 35 | return False 36 | node = node[d] 37 | return True 38 | 39 | def __contains__(self, data): 40 | return self.find(data) 41 | 42 | TRIE =
Trie() 43 | 44 | # Idiot proof IDA wait box 45 | 46 | 47 | class WaitBox: 48 | buffertime = 0.0 49 | shown = False 50 | msg = "" 51 | 52 | @staticmethod 53 | def _show(msg): 54 | WaitBox.msg = msg 55 | if WaitBox.shown: 56 | idaapi.replace_wait_box(msg) 57 | else: 58 | idaapi.show_wait_box(msg) 59 | WaitBox.shown = True 60 | 61 | @staticmethod 62 | def show(msg, buffertime=0.1): 63 | if msg == WaitBox.msg: 64 | return 65 | 66 | if buffertime > 0.0: 67 | if time.time() - WaitBox.buffertime < buffertime: 68 | return 69 | WaitBox.buffertime = time.time() 70 | WaitBox._show(msg) 71 | 72 | @staticmethod 73 | def hide(): 74 | if WaitBox.shown: 75 | idaapi.hide_wait_box() 76 | WaitBox.shown = False 77 | 78 | FUNCS_SEGEND = idc.BADADDR 79 | def calc_sigstop(): 80 | endea = idc.BADADDR 81 | for segea in idautils.Segments(): 82 | s = idaapi.getseg(segea) 83 | if s.perm & idaapi.SEGPERM_EXEC: 84 | segstart = segea 85 | # Set the end ea to the end of the last executable segment 86 | # Speed isn't as important in this script, so reading any extra X 87 | # segments is fine 88 | if endea == idc.BADADDR or endea < segstart + s.size(): 89 | endea = segstart + s.size() 90 | 91 | return endea 92 | 93 | def is_good_sig(sig, funcend): 94 | if sig in TRIE: 95 | return False 96 | 97 | bytesig = " ".join(sig) 98 | 99 | endea = funcend if ABSOLUTE_OPTIMIZATION else FUNCS_SEGEND 100 | count = 0 101 | addr = 0 102 | addr = idaapi.find_binary(addr, endea, bytesig, 0, idc.SEARCH_DOWN|idc.SEARCH_NEXT) 103 | while count < 2 and addr != idc.BADADDR: 104 | count = count + 1 105 | addr = idaapi.find_binary(addr, endea, bytesig, 0, idc.SEARCH_DOWN|idc.SEARCH_NEXT) 106 | 107 | # Good sig, add it to the trie 108 | if count == 1: 109 | TRIE.add(sig) 110 | return True 111 | 112 | return False 113 | 114 | def makesigfast(func): 115 | addr = func.start_ea 116 | found = 0 117 | 118 | sig = [] 119 | while addr != idc.BADADDR: 120 | info = idaapi.insn_t() 121 | if not idaapi.decode_insn(info, addr): 122 | 
return None 123 | 124 | done = 0 125 | if info.Op1.type in (idaapi.o_near, idaapi.o_far): 126 | insnsz = 2 if idaapi.get_byte(addr) == 0x0F else 1 127 | sig += [f"{idaapi.get_byte(addr+i):02X}" for i in range(insnsz)] + ["?"] * (info.size - insnsz) 128 | done = 1 129 | elif info.Op1.type == idaapi.o_reg and info.Op2.type == idaapi.o_mem and info.Op2.addr != idc.BADADDR: 130 | sig += [f"{idaapi.get_byte(addr+i):02X}" for i in range(info.Op2.offb)] + ["?"] * (info.size - info.Op2.offb) 131 | done = 1 132 | 133 | if not done: # Unknown, just wildcard addresses 134 | i = 0 135 | while i < info.size: 136 | loc = addr + i 137 | if ((idc.get_fixup_target_type(loc) & 0x0F) == idaapi.FIXUP_OFF32): 138 | sig += ["?"] * 4 139 | i += 3 140 | elif (idc.get_fixup_target_type(loc) & 0x0F) == idaapi.FIXUP_OFF64: 141 | sig += ["?"] * 8 142 | i += 7 143 | else: 144 | sig += [f"{idaapi.get_byte(addr+i):02X}"] 145 | 146 | i += 1 147 | 148 | # Escape the evil functions that break everything 149 | if len(sig) > MAX_SIG_LENGTH: 150 | return "Signature is too long!" 151 | # Save milliseconds and only check for good sigs after a fewish bytes 152 | # Trust me, it matters 153 | elif len(sig) >= 5 and is_good_sig(sig, func.end_ea): 154 | found = 1 155 | break 156 | 157 | addr = idc.next_head(addr, func.end_ea) 158 | 159 | if found == 0: 160 | return "Ran out of bytes!" 161 | 162 | smsig = r"\x" + r"\x".join(sig) 163 | smsig = smsig.replace("?", "2A") 164 | return smsig 165 | 166 | def main(): 167 | try: 168 | root = {} 169 | 170 | f = idaapi.ask_file(1, "*.yml", "Choose a file to save to") 171 | if not f: 172 | return 173 | 174 | skip = idaapi.ask_yn(1, "Skip unnamed functions (e.g. 
ones that start with \"sub_\")?") 175 | if skip == -1: 176 | return 177 | 178 | idaapi.set_ida_state(idaapi.st_Work) 179 | global FUNCS_SEGEND 180 | FUNCS_SEGEND = calc_sigstop() 181 | 182 | funcs = list(idautils.Functions()) 183 | siglist = [] 184 | 185 | for i in range(len(funcs)): 186 | fea = funcs[i] 187 | flags = idaapi.get_full_flags(fea) 188 | if not idaapi.is_func(flags): 189 | continue 190 | 191 | if skip and not idaapi.has_name(flags): 192 | continue 193 | 194 | func = idaapi.get_func(fea) 195 | # Thunks and lib funcs 196 | if func.flags & (idaapi.FUNC_LIB | idaapi.FUNC_THUNK): 197 | continue 198 | 199 | funcname = idaapi.get_name(fea) 200 | unmangled = idaapi.demangle_name(funcname, idaapi.MNG_SHORT_FORM) 201 | if unmangled is not None: 202 | # Skip jmp stubs 203 | if unmangled.startswith("j_"): 204 | continue 205 | 206 | # Nullsub 207 | if unmangled.startswith("nullsub"): 208 | continue 209 | 210 | siglist.append(func) 211 | 212 | totalcount = len(siglist) 213 | actualstarttime = time.time() 214 | sigcount = 0 215 | for i, func in enumerate(siglist): 216 | funcname = idaapi.get_name(func.start_ea) 217 | unmangled = idaapi.demangle_name(funcname, idaapi.MNG_SHORT_FORM) 218 | if unmangled is None: 219 | unmangled = funcname 220 | 221 | sig = makesigfast(func) 222 | root[unmangled] = {"mangled": funcname, "signature": sig} 223 | 224 | if sig: 225 | sigcount += (0 if "!" 
in sig else 1) 226 | 227 | # Unfortunately, sigging takes progressively longer the further along the function list 228 | # this goes, as makesig() searches from top to bottom while functions are ordered from top to bottom 229 | # So this isn't really accurate but w/e 230 | 231 | totaltime = time.time() - actualstarttime 232 | count = i + 1 233 | avgtime = totaltime / count 234 | eta = int(avgtime * (totalcount - count)) 235 | etastr = time.strftime("%H:%M:%S", time.gmtime(eta)) 236 | 237 | WaitBox.show(f"Evaluated {count} out of {totalcount} ({floor(i / float(totalcount) * 100.0 * 10.0) / 10.0}%)\nETA: {etastr}") 238 | 239 | WaitBox.show("Saving to file") 240 | with open(f, "w") as f: 241 | yaml.safe_dump(root, f, default_flow_style=False, width=999999) 242 | 243 | totaltime = time.strftime("%H:%M:%S", time.gmtime(time.time() - actualstarttime)) 244 | print(f"Successfully generated {sigcount} signatures from {totalcount} functions in {totaltime}") 245 | except: 246 | import traceback 247 | traceback.print_exc() 248 | print("Please file a bug report with supporting information at https://github.com/Scags/IDA-Scripts/issues") 249 | idaapi.beep() 250 | 251 | idaapi.set_ida_state(idaapi.st_Ready) 252 | WaitBox.hide() 253 | 254 | # import cProfile 255 | # cProfile.run("main()", "sigsmasher.prof") 256 | main() 257 | -------------------------------------------------------------------------------- /structfiller.py: -------------------------------------------------------------------------------- 1 | import idc 2 | import idautils 3 | import idaapi 4 | import time 5 | 6 | from math import floor 7 | 8 | # Idiot proof IDA wait box 9 | class WaitBox: 10 | buffertime = 0.0 11 | shown = False 12 | msg = "" 13 | 14 | @staticmethod 15 | def _show(msg): 16 | WaitBox.msg = msg 17 | if WaitBox.shown: 18 | idaapi.replace_wait_box(msg) 19 | else: 20 | idaapi.show_wait_box(msg) 21 | WaitBox.shown = True 22 | 23 | @staticmethod 24 | def show(msg, buffertime=0.1): 25 | if msg == 
WaitBox.msg: 26 | return 27 | 28 | if buffertime > 0.0: 29 | if time.time() - WaitBox.buffertime < buffertime: 30 | return 31 | WaitBox.buffertime = time.time() 32 | WaitBox._show(msg) 33 | 34 | @staticmethod 35 | def hide(): 36 | if WaitBox.shown: 37 | idaapi.hide_wait_box() 38 | WaitBox.shown = False 39 | 40 | def main(): 41 | try: 42 | idaapi.begin_type_updating(idaapi.UTP_STRUCT) 43 | maxstructs = idaapi.get_struc_qty() 44 | i = idaapi.get_first_struc_idx() 45 | while i < maxstructs: 46 | WaitBox.show(f"{floor(i / float(maxstructs) * 100.0 * 10.0) / 10.0}%") 47 | strucid = idaapi.get_struc_by_idx(i) 48 | struc = idaapi.get_struc(strucid) 49 | k = 0 50 | struclen = idaapi.get_max_offset(struc) 51 | while k < struclen: 52 | member = idaapi.get_member(struc, k) 53 | if not member: 54 | idaapi.add_struc_member(struc, f"field_{k:X}", k, idc.FF_BYTE, None, 1) 55 | k += 1 56 | else: 57 | k += idaapi.get_member_size(member) 58 | 59 | i += 1 60 | 61 | print("Done!") 62 | except: 63 | import traceback 64 | traceback.print_exc() 65 | print("Please file a bug report with supporting information at https://github.com/Scags/IDA-Scripts/issues") 66 | idaapi.beep() 67 | 68 | WaitBox.hide() 69 | idaapi.end_type_updating(idaapi.UTP_STRUCT) 70 | 71 | main() -------------------------------------------------------------------------------- /symbolsmasher.py: -------------------------------------------------------------------------------- 1 | import idc 2 | import idautils 3 | import idaapi 4 | import json 5 | 6 | import time 7 | from sys import version_info 8 | 9 | # Are we reading this DB or writing to it. 
Not to be confused with reading from/writing to the work file 10 | Mode_Invalid = -1 11 | Mode_Write = 0 12 | Mode_Read = 1 13 | 14 | DEBUG = 0 15 | 16 | # Idiot proof IDA wait box 17 | class WaitBox: 18 | buffertime = 0.0 19 | shown = False 20 | msg = "" 21 | 22 | @staticmethod 23 | def _show(msg): 24 | WaitBox.msg = msg 25 | if WaitBox.shown: 26 | idaapi.replace_wait_box(msg) 27 | else: 28 | idaapi.show_wait_box(msg) 29 | WaitBox.shown = True 30 | 31 | @staticmethod 32 | def show(msg, buffertime = 0.1): 33 | if msg == WaitBox.msg: 34 | return 35 | 36 | if buffertime > 0.0: 37 | if time.time() - WaitBox.buffertime < buffertime: 38 | return 39 | WaitBox.buffertime = time.time() 40 | WaitBox._show(msg) 41 | 42 | @staticmethod 43 | def hide(): 44 | if WaitBox.shown: 45 | idaapi.hide_wait_box() 46 | WaitBox.shown = False 47 | 48 | def get_action(): 49 | return idaapi.ask_buttons("Reading from", "Writing to", "", 0, "What action are we performing on this database?") 50 | 51 | def get_file(action): 52 | forsaving, rw, s = (1, "w", "write to") if action == Mode_Read else (0, "r", "read from") 53 | fname = "*.json" 54 | f = idaapi.ask_file(forsaving, fname, "Choose a file to {}".format(s)) 55 | 56 | return open(f, rw) if f else None 57 | 58 | # Show how many functions we've found 59 | FOUND_FUNCS = set() 60 | 61 | # Format: 62 | # "String Name": 63 | # { 64 | # "_ZN8Function5Name", 65 | # "_ZN8Function6Name2", 66 | # etc... 
67 | # } 68 | def build_xref_dict(strings): 69 | xrefs = {} 70 | for s in strings: 71 | xrefs[str(s)] = [] 72 | 73 | for xref in idautils.XrefsTo(s.ea): 74 | funcname = idaapi.get_func_name(xref.frm) 75 | if funcname is None: 76 | continue 77 | 78 | node = xrefs[str(s)] 79 | node.append(funcname) 80 | xrefs[str(s)] = node 81 | 82 | # Empty, trash, we don't want it 83 | if not len(xrefs[str(s)]): 84 | del xrefs[str(s)] 85 | 86 | return xrefs 87 | 88 | # Format: 89 | # "_ZN8Function5Name": 90 | # { 91 | # "str1", 92 | # "str2", 93 | # "str1", 94 | # } 95 | def build_data_dict(strdict): 96 | funcs = {} 97 | for s, value in get_bcompat_iter(strdict): 98 | for funcname in value: 99 | node = funcs.get(funcname, []) 100 | node.append(s) 101 | funcs[funcname] = node 102 | return funcs 103 | 104 | def read_strs(strings, file): 105 | WaitBox.show("Reading strings", True) 106 | # Build an organized dictionary of the string data we can get 107 | strdict = build_xref_dict(strings) 108 | # Then reorient it around functions, then dump it 109 | funcdict = build_data_dict(strdict) 110 | WaitBox.show("Dumping to file", True) 111 | # Running the script in write mode will build a similar dict then compare the two through functions 112 | json.dump(funcdict, file, indent = 4, sort_keys = True) 113 | 114 | def get_bcompat_iter(d): 115 | return d.items() if version_info[0] >= 3 else d.iteritems() 116 | 117 | def get_bcompat_keys(d): 118 | return d.keys() if version_info[0] >= 3 else d.iterkeys() 119 | 120 | def write_exact_comp(strdict, funcdict, myfuncs): 121 | global FOUND_FUNCS 122 | WaitBox.show("Writing exact comparisons") 123 | count = 0 124 | 125 | for strippedname, strippedlist in get_bcompat_iter(strdict): 126 | if not idaapi.get_func_name(myfuncs[strippedname]).startswith("sub_"): 127 | continue 128 | 129 | possibilities = [] 130 | strippedlist = sorted(strippedlist) 131 | for symname, symlist in get_bcompat_iter(funcdict): 132 | if strippedlist == sorted(symlist): 133 | 
possibilities.append(str(symname)) 134 | else: 135 | continue 136 | 137 | if len(possibilities) >= 2: 138 | break 139 | 140 | if len(possibilities) != 1: 141 | continue 142 | 143 | if possibilities[0] not in FOUND_FUNCS and possibilities[0] not in myfuncs: 144 | # print(idaapi.get_func_name(myfuncs[strippedname])) 145 | idc.set_name(myfuncs[strippedname], possibilities[0], idaapi.SN_FORCE) 146 | count += 1 147 | 148 | FOUND_FUNCS.add(possibilities[0]) 149 | WaitBox.show("Writing exact comparisons") 150 | elif DEBUG: 151 | print("{} is probably wrong!".format(idc.demangle_name(possibilities[0], idc.get_inf_attr(idc.INF_SHORT_DN)))) 152 | 153 | return count 154 | 155 | def write_simple_comp(strdict, funcdict, myfuncs, liw = True): 156 | global FOUND_FUNCS 157 | s = "symboled in stripped" if liw else "stripped in symboled" 158 | WaitBox.show("Writing simple comparisons ({})".format(s)) 159 | count = 0 160 | 161 | for strippedname, strippedlist in get_bcompat_iter(strdict): 162 | if not idaapi.get_func_name(myfuncs[strippedname]).startswith("sub_"): 163 | continue 164 | 165 | possibilities = [] 166 | for symname, symlist in get_bcompat_iter(funcdict): 167 | if liw: 168 | if all(val in strippedlist for val in symlist): 169 | possibilities.append(str(symname)) 170 | else: 171 | continue 172 | else: 173 | if all(val in symlist for val in strippedlist): 174 | possibilities.append(str(symname)) 175 | else: 176 | continue 177 | 178 | if len(possibilities) >= 2: 179 | break 180 | 181 | if len(possibilities) != 1: 182 | continue 183 | 184 | if possibilities[0] not in FOUND_FUNCS and possibilities[0] not in myfuncs: 185 | idc.set_name(myfuncs[strippedname], possibilities[0], idaapi.SN_FORCE) 186 | count += 1 187 | 188 | FOUND_FUNCS.add(possibilities[0]) 189 | WaitBox.show("Writing simple comparisons ({})".format(s)) 190 | elif DEBUG: 191 | print("{} is probably wrong!".format(idc.demangle_name(possibilities[0], idc.get_inf_attr(idc.INF_SHORT_DN)))) 192 | 193 | return count 194 
| 195 | def get_bin_funcs(): 196 | seg = idaapi.get_segm_by_name(".text") 197 | return {idaapi.get_func_name(ea): ea for ea in idautils.Functions(seg.start_ea, seg.end_ea)} 198 | 199 | # So to prevent bad things, we're going to destroy any functions that have the exact same string xrefs 200 | # This is to protect against inlining but ultimately fails as this compares direct values 201 | # Foo() could call inlined Bar() twice which would fuck this up 202 | # What to do, what to do... 203 | def clean_data_dict(strdict): 204 | pass 205 | # resultant = {} 206 | # for key, value in get_bcompat_iter(strdict): 207 | # if sorted(value) not in resultant.values(): 208 | # resultant[key] = sorted(value) 209 | # 210 | # strdict = resultant 211 | 212 | def write_symbols(strings, file): 213 | WaitBox.show("Loading file", True) 214 | funcdict = json.load(file) 215 | if not funcdict: 216 | idaapi.warning("Could not load function data from file") 217 | return 0, 0, 0 218 | 219 | strdict = build_data_dict(build_xref_dict(strings)) 220 | clean_data_dict(strdict) 221 | myfuncs = get_bin_funcs() 222 | 223 | # Writing uniques is much more liable to produce bad typing 224 | # Unique, one-off strings seem to be inlined much more often, so it's 225 | # better to use the simple comparison technique 226 | # This will reduce the number of types, but the reduced types 227 | # would've been wrong or duplicated anyways 228 | # strdict = write_uniques(strings, funcdict["Uniques"]) 229 | 230 | # A good test is to just simply compare xrefs 231 | # If a function references "fizzbuzz" 2 times and "foobar" once and it's the only function 232 | # that does anything like that, chances are that we found something to smash 233 | exact_count = write_exact_comp(strdict, funcdict, myfuncs) 234 | 235 | # Since a lot of functions that have good strings have inlined strings in them, let's just look for containment 236 | # If "fizz", "buzz", and "foo" exist in Bar::Foo which has "fizz", "buzz", "foo", and "fizzbuzz" for
example 237 | # Obviously we're only checking for 1 instance 238 | liw = write_simple_comp(strdict, funcdict, myfuncs) # Symboled strings in stripped 239 | wil = write_simple_comp(strdict, funcdict, myfuncs, False) # Stripped strings in symboled 240 | 241 | # TODO IDEAS; 242 | # - Dance around some function xrefs. By now, a solid chunk of them should have symboled names (a few thousand at least) 243 | # A unique set of named xrefs could guarantee something 244 | # Would need a new section in the data file (to and from) 245 | return exact_count, liw, wil 246 | 247 | def main(): 248 | try: 249 | action = get_action() 250 | if action == Mode_Invalid: 251 | return 252 | 253 | file = get_file(action) 254 | if file is None: 255 | return 256 | 257 | # strings = get_strs() 258 | strings = list(idautils.Strings()) 259 | if action == Mode_Read: 260 | read_strs(strings, file) 261 | print("Done!") 262 | else: 263 | c1, c2, c3 = write_symbols(strings, file) 264 | print("Successfully typed {} functions".format(len(FOUND_FUNCS))) 265 | print("\t- {} Exact\n\t- {} Symboled in stripped\n\t- {} Stripped in symboled".format(c1, c2, c3)) 266 | except: 267 | import traceback 268 | traceback.print_exc() 269 | print("Please file a bug report with supporting information at https://github.com/Scags/IDA-Scripts/issues") 270 | idaapi.beep() 271 | 272 | WaitBox.hide() 273 | file.close() 274 | 275 | main() -------------------------------------------------------------------------------- /vtable_io.py: -------------------------------------------------------------------------------- 1 | import idc 2 | import idautils 3 | import idaapi 4 | import json 5 | import ctypes 6 | import time 7 | import re 8 | 9 | from dataclasses import dataclass 10 | 11 | if idaapi.inf_is_64bit(): 12 | ea_t = ctypes.c_uint64 13 | ptr_t = ctypes.c_int64 14 | get_ptr = idaapi.get_qword 15 | FF_PTR = idc.FF_QWORD 16 | else: 17 | ea_t = ctypes.c_uint32 18 | ptr_t = ctypes.c_int32 19 | get_ptr = idaapi.get_dword 20 | FF_PTR 
= idc.FF_DWORD 21 | 22 | # Calling these a lot so we'll speed up the invocations by manually implementing them here 23 | def is_off(f): return (f & (idc.FF_0OFF|idc.FF_1OFF)) != 0 24 | def is_code(f): return (f & idaapi.MS_CLS) == idc.FF_CODE 25 | def has_any_name(f): return (f & idc.FF_ANYNAME) != 0 26 | def is_ptr(f): return (f & idaapi.MS_CLS) == idc.FF_DATA and (f & idaapi.DT_TYPE) == FF_PTR 27 | 28 | # Let's go https://www.blackhat.com/presentations/bh-dc-07/Sabanal_Yason/Paper/bh-dc-07-Sabanal_Yason-WP.pdf 29 | 30 | _RTTICompleteObjectLocator_fields = [ 31 | ("signature", ctypes.c_uint32), # signature 32 | ("offset", ctypes.c_uint32), # offset of this vtable in complete class (from top) 33 | ("cdOffset", ctypes.c_uint32), # offset of constructor displacement 34 | ("pTypeDescriptor", ctypes.c_uint32), # ref TypeDescriptor 35 | ("pClassHierarchyDescriptor", ctypes.c_uint32), # ref RTTIClassHierarchyDescriptor 36 | ] 37 | 38 | if idaapi.inf_is_64bit(): 39 | _RTTICompleteObjectLocator_fields.append(("pSelf", ctypes.c_uint32)) # ref to object's base 40 | 41 | class RTTICompleteObjectLocator(ctypes.Structure): 42 | _fields_ = _RTTICompleteObjectLocator_fields 43 | 44 | 45 | class TypeDescriptor(ctypes.Structure): 46 | _fields_ = [ 47 | ("pVFTable", ctypes.c_uint32), # reference to RTTI's vftable 48 | ("spare", ctypes.c_uint32), # internal runtime reference 49 | ("name", ctypes.c_uint8), # type descriptor name (no varstruct needed since we don't use this) 50 | ] 51 | 52 | 53 | class RTTIClassHierarchyDescriptor(ctypes.Structure): 54 | _fields_ = [ 55 | ("signature", ctypes.c_uint32), # signature 56 | ("attribs", ctypes.c_uint32), # attributes 57 | ("numBaseClasses", ctypes.c_uint32), # # of items in the array of base classes 58 | ("pBaseClassArray", ctypes.c_uint32), # ref BaseClassArray 59 | ] 60 | 61 | 62 | class RTTIBaseClassDescriptor(ctypes.Structure): 63 | _fields_ = [ 64 | ("pTypeDescriptor", ctypes.c_uint32), # ref TypeDescriptor 65 | ("numContainedBases", 
ctypes.c_uint32), # # of sub elements within base class array 66 | ("mdisp", ctypes.c_uint32), # member displacement 67 | ("pdisp", ctypes.c_uint32), # vftable displacement 68 | ("vdisp", ctypes.c_uint32), # displacement within vftable 69 | ("attributes", ctypes.c_uint32), # base class attributes 70 | ("pClassDescriptor", ctypes.c_uint32), # ref RTTIClassHierarchyDescriptor 71 | ] 72 | 73 | 74 | class base_class_type_info(ctypes.Structure): 75 | _fields_ = [ 76 | ("basetype", ea_t), # Base class type 77 | ("offsetflags", ea_t), # Offset and info 78 | ] 79 | 80 | 81 | class class_type_info(ctypes.Structure): 82 | _fields_ = [ 83 | ("pVFTable", ea_t), # reference to RTTI's vftable (__class_type_info) 84 | ("pName", ea_t), # ref to type name 85 | ] 86 | 87 | # I don't think this is right, but every case I found looked to be correct 88 | # This might be a vtable? IDA sometimes says it is but not always 89 | # Plus sometimes the flags member is 0x1, so it's not a thisoffs. Weird 90 | class pointer_type_info(class_type_info): 91 | _fields_ = [ 92 | ("flags", ea_t), # Flags or something else 93 | ("pType", ea_t), # ref to type 94 | ] 95 | 96 | class si_class_type_info(class_type_info): 97 | _fields_ = [ 98 | ("pParent", ea_t), # ref to parent type 99 | ] 100 | 101 | class vmi_class_type_info(class_type_info): 102 | _fields_ = [ 103 | ("flags", ctypes.c_uint32), # flags 104 | ("basecount", ctypes.c_uint32), # # of base classes 105 | ("pBaseArray", base_class_type_info), # array of BaseClassArray 106 | ] 107 | 108 | def create_vmi_class_type_info(ea): 109 | bytestr = idaapi.get_bytes(ea, ctypes.sizeof(vmi_class_type_info)) 110 | tinfo = vmi_class_type_info.from_buffer_copy(bytestr) 111 | 112 | # Since this is a varstruct, we create a dynamic class with the proper size and type and return it instead 113 | class vmi_class_type_info_dynamic(class_type_info): 114 | _fields_ = [ 115 | ("flags", ctypes.c_uint32), 116 | ("basecount", ctypes.c_uint32), 117 | ("pBaseArray", 
base_class_type_info * tinfo.basecount), 118 | ] 119 | 120 | return vmi_class_type_info_dynamic 121 | 122 | 123 | # Steps to retrieve vtables on Windows (MSVC): 124 | # 1. Get RTTI's vftable (??_7type_info@@6B@) 125 | # 2. Iterate over its xrefs to, which are all TypeDescriptor objects 126 | # a. Of course don't load up the function that uses it 127 | # 3. At each xref load up xrefs to again 128 | # a. There should be at least 2, the important ones are RTTICompleteObjectLocator's AKA COL (there can be more than 1) 129 | # b. To discern which one is which, just see if there's a label at the address 130 | # - If there is, then that one is RTTIClassHierarchyDescriptor, so skip it 131 | # 4. The current ea position at each xref should be at RTTICompleteObjectLocator::pTypeDescriptor, so subtract 12 to get to the beginning of the struct 132 | # 5. Find xrefs to each. There should only be one, and it should be its vtable 133 | # a. Each COL has an offset which shows where its vtable starts, so running too far over the table will be easier to detect 134 | # 135 | # Steps to retrieve vtables on Linux (GCC and maybe Clang) 136 | # 1. Get RTTI's vftable (_ZTVN10__cxxabiv117__class_type_infoE, 137 | # _ZTVN10__cxxabiv120__si_class_type_infoE, and _ZTVN10__cxxabiv121__vmi_class_type_infoE) 138 | # 2. First, before doing anything, shove each xref of type_info object into some sort of structure 139 | # a. There's no easy way to cheese discerning which xref is the actual vtable, unless we want to start parsing IDA comments 140 | # 3. Once each type_info object and their references are loaded, get the xrefs from each pVFTable 141 | # 4. There will probably be more than one xref. 142 | # a. To discern which one is a vtable, if the xref lies in another type_info object, then it's not a vtable 143 | # b.
The remaining xref(s) is indeed a vtable 144 | 145 | # Class for windows type info, helps organize things 146 | @dataclass(frozen=True) 147 | class WinTI(object): 148 | typedesc: int 149 | name: str 150 | cols: list[int] 151 | vtables: list[int] 152 | 153 | # Class for function lists (what is held in the json) 154 | @dataclass(frozen=True) 155 | class FuncList: 156 | thisoffs: int 157 | funcs: list#[VFunc] 158 | 159 | # Idiot proof IDA wait box 160 | class WaitBox: 161 | buffertime = 0.0 162 | shown = False 163 | msg = "" 164 | 165 | @staticmethod 166 | def _show(msg): 167 | WaitBox.msg = msg 168 | if WaitBox.shown: 169 | idaapi.replace_wait_box(msg) 170 | else: 171 | idaapi.show_wait_box(msg) 172 | WaitBox.shown = True 173 | 174 | @staticmethod 175 | def show(msg, buffertime=0.1): 176 | if msg == WaitBox.msg: 177 | return 178 | 179 | if buffertime > 0.0: 180 | if time.time() - WaitBox.buffertime < buffertime: 181 | return 182 | WaitBox.buffertime = time.time() 183 | WaitBox._show(msg) 184 | 185 | @staticmethod 186 | def hide(): 187 | if WaitBox.shown: 188 | idaapi.hide_wait_box() 189 | WaitBox.shown = False 190 | 191 | # Virtual class tree 192 | class VClass(object): 193 | def __init__(self, *args, **kwargs): 194 | self.name = kwargs.get("name", "") 195 | # dict[classname, VClass] 196 | self.baseclasses = kwargs.get("baseclasses", {}) 197 | # Same as Linux json, dict[thisoffs, funcs] 198 | self.vfuncs = kwargs.get("vfuncs", {}) 199 | # Written to when writing to Windows, dict[thisoffs, [VFunc]] 200 | self.vfuncnames = kwargs.get("vfuncnames", {}) 201 | # Exists solely to speed up checking for inherited functions 202 | self.postnames = set() 203 | 204 | def __str__(self): 205 | return f"{self.name} (baseclasses = {self.baseclasses}, vfuncs = {self.vfuncs})" 206 | 207 | def parse(self, colea, wintable): 208 | col = get_class_from_ea(RTTICompleteObjectLocator, colea) 209 | thisoffs = col.offset 210 | 211 | # Already parsed 212 | if self.name in wintable.keys(): 213 | 
if thisoffs in wintable[self.name].vfuncs.keys(): 214 | return 215 | 216 | 217 | # In 64-bit PEs, the COL references itself, remove this 218 | xrefs = list(idautils.XrefsTo(colea)) 219 | if idaapi.inf_is_64bit(): 220 | for n in range(len(xrefs)-1, -1, -1): 221 | if xrefs[n].frm == colea + RTTICompleteObjectLocator.pSelf.offset: 222 | del xrefs[n] 223 | 224 | if len(xrefs) != 1: 225 | print(f"[VTABLE IO] Multiple vtables point to same COL - {self.name} at {colea:#x}") 226 | return 227 | 228 | vtable = xrefs[0].frm + ctypes.sizeof(ea_t) 229 | self.vfuncs[thisoffs] = parse_vtable_addresses(vtable) 230 | 231 | # TODO; This is created for each function in the json and for each function in each vtable 232 | # This clearly does this for multiple of each function, so there needs to be a way to 233 | # cache each function and reuse it for each vtable 234 | # Possible pain point is differentiating between inheritedness 235 | @dataclass 236 | class VFunc: 237 | ea: int # Address to this function 238 | vaddr: int # Address to this function's reference in its vtable 239 | mangledname: str 240 | inheritid: int 241 | name: str 242 | postname: str 243 | sname: str 244 | 245 | @staticmethod 246 | def create(ea=idc.BADADDR, mangledname="", inheritid=-1, vaddr=idc.BADADDR): 247 | name = "" 248 | postname = "" 249 | sname = "" 250 | if mangledname: 251 | name = idaapi.demangle_name(mangledname, idaapi.MNG_LONG_FORM) or mangledname 252 | if name: 253 | postname = get_func_postname(name) 254 | sname = postname.split("(")[0] 255 | return VFunc(ea, vaddr, mangledname, inheritid, name, postname, sname) 256 | 257 | class VOptions(object): 258 | StringMethod = 1 << 0 259 | SkipMismatches = 1 << 1 260 | CommentReusedFunctions = 1 << 2 261 | 262 | DoNotExport = 0 263 | ExportNormal = 1 264 | ExportOnly = 2 265 | 266 | # Form for script options 267 | class VForm(idaapi.Form): 268 | 269 | def __init__(self): 270 | idaapi.Form.__init__(self, r"""STARTITEM 0 271 | BUTTON YES* Go 272 | BUTTON 
CANCEL Cancel 273 | VTable IO 274 | {FormChangeCb} 275 | <#Browse#Select a file to import from :{iFileImport}> 276 | <##Import options##Parse type strings (for hashed type info):{rStringMethod}> | <##Export options##Do not export:{rDoNotExport}> 277 | | 278 | {cImportOptions}> | {cExportOptions}> 279 | <#Browse#Select a file to export to (ignored if unchecked):{iFileExport}> 280 | """, { 281 | "FormChangeCb": idaapi.Form.FormChangeCb(self.OnFormChange), 282 | "iFileImport": idaapi.Form.FileInput(open=True, value=idaapi.reg_read_string("vtable_io", "iFileImport", "*.json"), swidth=50), 283 | "cImportOptions": idaapi.Form.ChkGroupControl( 284 | ("rStringMethod", "rSkipMismatches", "rComment"), value=idaapi.reg_read_int("vtable_io", VOptions.SkipMismatches | VOptions.CommentReusedFunctions, "cImportOptions") 285 | ), 286 | "cExportOptions": idaapi.Form.RadGroupControl( 287 | ("rDoNotExport", "rExportNormal", "rExportOnly"), value=idaapi.reg_read_int("vtable_io", VOptions.DoNotExport, "cExportOptions") 288 | ), 289 | "iFileExport": idaapi.Form.FileInput(save=True, value=idaapi.reg_read_string("vtable_io", "iFileExport", "*.json"), swidth=50), 290 | }) 291 | 292 | def OnFormChange(self, fid): 293 | # print(fid) 294 | return 1 295 | 296 | @staticmethod 297 | def init_options(): 298 | f = VForm() 299 | f, _ = f.Compile() 300 | go = f.Execute() 301 | if not go: 302 | return None 303 | 304 | options = VOptions() 305 | for control in f.controls.keys(): 306 | if control != "FormChangeCb": 307 | currval = getattr(f, control).value 308 | setattr(options, control, currval) 309 | if isinstance(currval, str): 310 | idaapi.reg_write_string("vtable_io", currval, control) 311 | elif isinstance(currval, int): 312 | idaapi.reg_write_int("vtable_io", currval, control) 313 | else: 314 | print(f"Unsupported type for {control} - {type(currval)}") 315 | 316 | f.Free() 317 | return options 318 | 319 | OS_Linux = 0 320 | OS_Win = 1 321 | 322 | FUNCS = 0 323 | EXPORTS = 0 324 | 325 | VOPTIONS 
= None 326 | 327 | def get_os(): 328 | ftype = idaapi.get_file_type_name() 329 | if "ELF" in ftype: 330 | return OS_Linux 331 | elif "PE" in ftype: 332 | return OS_Win 333 | return -1 334 | 335 | # Read a ctypes class from an ea 336 | def get_class_from_ea(classtype, ea): 337 | bytestr = idaapi.get_bytes(ea, ctypes.sizeof(classtype)) 338 | return classtype.from_buffer_copy(bytestr) 339 | 340 | def rva_to_ea(ea): 341 | if idaapi.inf_is_64bit(): 342 | return idaapi.get_imagebase() + ea 343 | return ea 344 | 345 | # Anything past Classname:: 346 | # Thank you CTFPlayer::SOCacheUnsubscribed... 347 | def get_func_postname(name): 348 | retname = name 349 | template = 0 350 | iterback = 0 351 | for i, c in enumerate(retname): 352 | if c == "<": 353 | template += 1 354 | elif c == ">": 355 | template -= 1 356 | # Find ( and break if we're not in a template 357 | elif c == "(" and template == 0: 358 | iterback = i 359 | break 360 | 361 | # Run backwards from ( until we hit a :: 362 | for i in range(iterback, -1, -1): 363 | if retname[i] == ":": 364 | retname = retname[i+1:] 365 | break 366 | 367 | return retname 368 | 369 | def parse_vtable_names(ea): 370 | funcs = [] 371 | 372 | while ea != idc.BADADDR: 373 | # Using flags sped this up by a lot 374 | # Went from 4 secs to ~1.3 375 | flags = idaapi.get_full_flags(ea) 376 | if not is_off(flags) or not is_ptr(flags): 377 | break 378 | 379 | if idaapi.has_name(flags): 380 | break 381 | 382 | offs = get_ptr(ea) 383 | fflags = idaapi.get_full_flags(offs) 384 | if not idaapi.is_func(fflags): 385 | break 386 | 387 | name = idaapi.get_name(offs) 388 | funcs.append(name) 389 | 390 | ea = idaapi.next_head(ea, idc.BADADDR) 391 | return funcs 392 | 393 | def parse_vtable_addresses(ea): 394 | funcs = [] 395 | 396 | while ea != idc.BADADDR: 397 | flags = idaapi.get_full_flags(ea) 398 | if not is_off(flags) or not is_ptr(flags): 399 | break 400 | 401 | offs = get_ptr(ea) 402 | fflags = idaapi.get_full_flags(offs) 403 | if not 
has_any_name(fflags): 404 | break 405 | 406 | # if not idaapi.is_func(fflags):# or not idaapi.has_name(fflags): 407 | # Sometimes IDA doesn't think a function is a function 408 | # This is all CSteamWorksGameStatsUploader's fault :( 409 | if not is_code(fflags): 410 | break 411 | 412 | funcs.append(VFunc.create(ea=offs, vaddr=ea)) 413 | 414 | ea = idaapi.next_head(ea, idc.BADADDR) 415 | return funcs 416 | 417 | def parse_si_tinfo(ea, tinfos): 418 | for xref in idautils.XrefsTo(ea): 419 | tinfo = get_class_from_ea(si_class_type_info, xref.frm) 420 | tinfos[xref.frm + si_class_type_info.pParent.offset] = tinfo.pParent 421 | 422 | def parse_pointer_tinfo(ea, tinfos): 423 | for xref in idautils.XrefsTo(ea): 424 | tinfo = get_class_from_ea(pointer_type_info, xref.frm) 425 | tinfos[xref.frm + pointer_type_info.pType.offset] = tinfo.pType 426 | 427 | def parse_vmi_tinfo(ea, tinfos): 428 | for xref in idautils.XrefsTo(ea): 429 | tinfotype = create_vmi_class_type_info(xref.frm) 430 | tinfo = get_class_from_ea(tinfotype, xref.frm) 431 | 432 | for i in range(tinfo.basecount): 433 | offset = vmi_class_type_info.pBaseArray.offset + i * ctypes.sizeof(base_class_type_info) 434 | basetinfo = get_class_from_ea(base_class_type_info, xref.frm + offset) 435 | tinfos[xref.frm + offset + base_class_type_info.basetype.offset] = basetinfo.basetype 436 | 437 | def get_tinfo_vtables(ea, tinfos, vtables): 438 | if ea == idc.BADADDR: 439 | return 440 | 441 | for tinfoxref in idautils.XrefsTo(ea, idaapi.XREF_DATA): 442 | count = 0 443 | mangled = idaapi.get_name(tinfoxref.frm) 444 | demangled = idc.demangle_name(mangled, idaapi.MNG_LONG_FORM) 445 | if demangled is None: 446 | print(f"[VTABLE IO] Invalid name at {tinfoxref.frm:#x}") 447 | continue 448 | 449 | classname = demangled[len("`typeinfo for'"):] 450 | for xref in idautils.XrefsTo(tinfoxref.frm, idaapi.XREF_DATA): 451 | if xref.frm not in tinfos.keys(): 452 | # If address lies in a function 453 | if 
idaapi.is_func(idaapi.get_full_flags(xref.frm)): 454 | continue 455 | 456 | count += 1 457 | vtables[classname] = vtables.get(classname, []) + [xref.frm] 458 | 459 | def read_vtables_linux(): 460 | f = idaapi.ask_file(1, "*.json", "Select a file to export to") 461 | if not f: 462 | return 463 | 464 | WaitBox.show("Parsing typeinfo") 465 | 466 | # Step 1 and 2, crawl xrefs and stick the inherited class type infos into a structure 467 | # After this, we can run over the xrefs again and see which xrefs come from another structure 468 | # The remaining xrefs are either vtables or weird math in a function 469 | xreftinfos = {} 470 | 471 | def getparse(name, fn, quiet=False): 472 | tinfo = idc.get_name_ea_simple(name) 473 | if tinfo == idc.BADADDR and not quiet: 474 | print(f"[VTABLE IO] Type info {name} not found. Skipping...") 475 | return None 476 | 477 | if fn is not None: 478 | fn(tinfo, xreftinfos) 479 | return tinfo 480 | 481 | # Don't need to parse base classes 482 | tinfo = getparse("_ZTVN10__cxxabiv117__class_type_infoE", None) 483 | tinfo_pointer = getparse("_ZTVN10__cxxabiv119__pointer_type_infoE", parse_pointer_tinfo, True) 484 | tinfo_si = getparse("_ZTVN10__cxxabiv120__si_class_type_infoE", parse_si_tinfo) 485 | tinfo_vmi = getparse("_ZTVN10__cxxabiv121__vmi_class_type_infoE", parse_vmi_tinfo) 486 | 487 | if len(xreftinfos) == 0: 488 | print("[VTABLE IO] No type infos found. 
Are you sure you're in a C++ binary?") 489 | return 490 | 491 | # Step 3, crawl xrefs to again and if the xref is not in the type info structure, then it's a vtable 492 | WaitBox.show("Discovering vtables") 493 | vtables = {} 494 | get_tinfo_vtables(tinfo, xreftinfos, vtables) 495 | get_tinfo_vtables(tinfo_pointer, xreftinfos, vtables) 496 | get_tinfo_vtables(tinfo_si, xreftinfos, vtables) 497 | get_tinfo_vtables(tinfo_vmi, xreftinfos, vtables) 498 | 499 | # Now, we have a list of vtables and their respective classes 500 | WaitBox.show("Parsing vtables") 501 | jsondata = parse_vtables(vtables) 502 | 503 | WaitBox.show("Writing to file") 504 | with open(f, "w") as f: 505 | json.dump(jsondata, f, indent=4, sort_keys=True) 506 | 507 | def parse_ti(ea, tis): 508 | typedesc = ea 509 | flags = idaapi.get_full_flags(ea) 510 | if is_code(flags): 511 | return 512 | 513 | name = idc.get_name(ea) 514 | if not name: 515 | return 516 | 517 | # Pointer type 518 | # I have no idea what this is but it is not what we want 519 | if name.startswith("??_R0P"): 520 | return 521 | 522 | try: 523 | classname = idaapi.demangle_name(name, idaapi.MNG_SHORT_FORM) 524 | classname = classname.removeprefix("class ") 525 | classname = classname.removeprefix("struct TypeDescriptor ") 526 | classname = classname.removesuffix(" `RTTI Type Descriptor'") 527 | classname = classname.strip() 528 | except: 529 | print(f"[VTABLE IO] Invalid vtable name at {ea:#x}") 530 | return 531 | 532 | if classname in tis.keys(): 533 | return 534 | 535 | cols = [] 536 | vtables = [] 537 | 538 | # Then figure out which xref is a/the COL 539 | for xref in idautils.XrefsTo(typedesc): 540 | ea = xref.frm 541 | flags = idaapi.get_full_flags(ea) 542 | 543 | # Dynamic cast 544 | if is_code(flags): 545 | continue 546 | 547 | name = idaapi.get_name(ea) 548 | # Class type descriptor and/or random global data 549 | # Kind of a hack but let's assume no one will rename these 550 | if name and (name.startswith("??_R1") or 
name.startswith("off_")): 551 | continue 552 | 553 | ea -= 4 554 | name = idaapi.get_name(ea) 555 | # Catchable types 556 | if name and name.startswith("__CT"): 557 | continue 558 | 559 | # COL 560 | ea -= 8 561 | workaround = False 562 | if idaapi.is_unknown(idaapi.get_full_flags(ea)): 563 | print(f"[VTABLE IO] Possible COL is unknown at {ea:#x}. This may be an unreferenced vtable. Trying workaround...") 564 | # This might be a bug with IDA, but sometimes the COL isn't analyzed 565 | # If there's still a reference, then we can still trace back 566 | # If there is a list of functions (or even just one), then it's probably a vtable, 567 | # but we'll still warn the user that it might be garbage 568 | refs = list(idautils.XrefsTo(ea)) 569 | if len(refs) == 1: 570 | vtable = refs[0].frm + ctypes.sizeof(ea_t) 571 | tryfunc = get_ptr(vtable + ctypes.sizeof(ea_t)) 572 | funcflags = idaapi.get_full_flags(tryfunc) 573 | if idaapi.is_func(funcflags): 574 | print(f" - Workaround successful. Please verify that {vtable:#x} is a vtable.") 575 | workaround = True 576 | 577 | if not workaround: 578 | print(" - Workaround failed. Skipping...") 579 | continue 580 | 581 | name = idaapi.get_name(ea) 582 | if not workaround and (not name or not name.startswith("??_R4")): 583 | print(f"[VTABLE IO] Invalid name at {ea:#x}. Possible unwind info.
Ignoring...") 584 | continue 585 | 586 | # In 64-bit PEs, the COL references itself, remove this 587 | refs = list(idautils.XrefsTo(ea)) 588 | if idaapi.inf_is_64bit(): 589 | for n in range(len(refs)-1, -1, -1): 590 | if refs[n].frm == ea + RTTICompleteObjectLocator.pSelf.offset: 591 | del refs[n] 592 | 593 | # Now that we have the COL, we can use it to find the vtable that utilizes it and its thisoffs 594 | # We need to use this later because of overloads so we cache it in a list 595 | if len(refs) != 1: 596 | print(f"[VTABLE IO] Multiple vtables point to same COL - {name} at {ea:#x}") 597 | continue 598 | 599 | cols.append(ea) 600 | vtable = refs[0].frm + ctypes.sizeof(ea_t) 601 | vtables.append(vtable) 602 | 603 | # Can have RTTI without a vtable 604 | tis[classname] = WinTI(typedesc, classname, cols, vtables) 605 | 606 | 607 | def read_ti_win(): 608 | # Step 1, get the vftable of type_info 609 | type_info = idc.get_name_ea_simple("??_7type_info@@6B@") 610 | if type_info == idc.BADADDR: 611 | # If type_info doesn't exist as a label, we might still be able to snipe it with the string method 612 | strings = list(idautils.Strings()) 613 | for s in strings: 614 | if str(s) == ".?AVtype_info@@": 615 | ea = s.ea - TypeDescriptor.name.offset 616 | type_info = rva_to_ea(idaapi.get_wide_dword(ea)) 617 | 618 | print("[VTABLE IO] type_info not found. 
Are you sure you're in a C++ binary?") 619 | return None 620 | 621 | tis = {} 622 | 623 | # Step 2, get all xrefs to type_info 624 | # Get type descriptor 625 | for typedesc in idautils.XrefsTo(type_info): 626 | parse_ti(typedesc.frm, tis) 627 | 628 | # In some cases, IDA fails to reference some type descriptors with type_info 629 | # Not exactly sure why, but it lists the ea of type_info as a "hash" when in reality it isn't 630 | # A workaround for this is to parse type descriptor strings (".?AV*"), load up their references, and 631 | # walk backwards to the start of what is supposed to be the type descriptor, and ensure that 632 | # its DWORD is the type_info vtable 633 | # We also make this an optional feature because it's slow in older IDA versions and not necessarily needed 634 | # I only found this to be a problem in NMRIH, so it appears to be rare 635 | if VOPTIONS.cImportOptions & VOptions.StringMethod: 636 | WaitBox.show("Performing string parsing") 637 | string_method(type_info, tis) 638 | 639 | return tis 640 | 641 | def string_method(type_info, tis): 642 | for string in idautils.Strings(): 643 | sstr = str(string) 644 | if not sstr.startswith(".?AV"): 645 | continue 646 | 647 | ea = string.ea 648 | ea -= TypeDescriptor.name.offset 649 | trytinfo = rva_to_ea(idaapi.get_wide_dword(ea)) 650 | # This is a weird string that isn't a part of a type descriptor 651 | if trytinfo != type_info: 652 | continue 653 | 654 | parse_ti(ea, tis) 655 | 656 | 657 | def parse_vtables(vtables): 658 | jsondata = {} 659 | ptrsize = ctypes.sizeof(ea_t) 660 | for classname, tables in vtables.items(): 661 | # We don't *need* to do any sort of sorting in Linux and can just capture the thisoffset 662 | # The Windows side of the script can organize later 663 | for ea in tables: 664 | thisoffs = get_ptr(ea - ptrsize) 665 | 666 | funcs = parse_vtable_names(ea + ptrsize) 667 | # Can be zero if there's an xref in the global offset table (.got) section 668 | # Fortunately the
parse_vtable function doesn't grab anything from there 669 | if funcs: 670 | classdata = jsondata.get(classname, {}) 671 | classdata[ptr_t(thisoffs).value] = funcs 672 | jsondata[classname] = classdata 673 | 674 | return jsondata 675 | 676 | # See if the thunk is actually a thunk and jumps to 677 | # a function in the vtable 678 | def is_thunk(thunkfunc, targetfuncs): 679 | ea = thunkfunc.ea 680 | func = idaapi.get_func(ea) 681 | funcend = func.end_ea 682 | 683 | # if funcend - ea > 20: # Highest I've seen is 13 opcodes but this works ig 684 | # return False 685 | 686 | addr = idc.next_head(ea, funcend) 687 | 688 | if addr == idc.BADADDR: 689 | return False 690 | 691 | b = idaapi.get_byte(addr) 692 | if b in (0xEB, 0xE9): 693 | insn = idaapi.insn_t() 694 | idaapi.decode_insn(insn, addr) 695 | jmpaddr = insn.Op1.addr 696 | return any(jmpaddr == i.ea for i in targetfuncs) 697 | 698 | return False 699 | 700 | def build_export_table(linuxtables, wintables): 701 | # Table is built mainly for readability but having one that is actually parsable would 702 | # be a cool idea for the future 703 | exporttable = {} 704 | # Save Linux only tables for exporting too 705 | winless = {k: linuxtables[k] for k in linuxtables.keys() - wintables.keys()} 706 | global EXPORTS 707 | for classname, wintable in wintables.items(): 708 | linuxtable = linuxtables.get(classname, None) 709 | if linuxtable is None: 710 | continue 711 | 712 | # Sort and int-ify Linux again 713 | newlinuxtable = [(abs(int(k)), v) for k, v in linuxtable.items()] 714 | newlinuxtable.sort(key=lambda x: x[0]) 715 | 716 | exportnode = [] 717 | purecalls = [] 718 | for currlinuxitems, currwinitems in zip(newlinuxtable, wintable.items()): 719 | lthisoffs, ltable = currlinuxitems 720 | wthisoffs, wtable = currwinitems 721 | 722 | windiscovered = set() 723 | prepend = f"[L{lthisoffs}/W{wthisoffs}]" 724 | for i, mangledname in enumerate(ltable): 725 | # Save for later 726 | if mangledname.startswith("__cxa"): 727 | # 
print(f"Found purecall {classname}::{mangledname} at {i}") 728 | purecalls.append(i) 729 | continue 730 | 731 | winidx = -1 732 | for j, winfunc in enumerate(wtable): 733 | if mangledname == winfunc.mangledname: 734 | winidx = j 735 | windiscovered.add(j) 736 | break 737 | 738 | s = f"L{i}" 739 | if winidx != -1: 740 | s = f"{s:<8}W{winidx}" 741 | 742 | if not mangledname.startswith("sub_"): 743 | shortname = idaapi.demangle_name(mangledname, idaapi.MNG_SHORT_FORM) or "purecall" 744 | else: 745 | shortname = mangledname 746 | newprepend = f"{prepend:<20}{s:<8}" 747 | s = f"{newprepend:<36}{shortname}" 748 | exportnode.append(s) 749 | 750 | # Purecalls are a bit special 751 | # We can't just grab the Linux index and use it for Windows 752 | # So we 1: do this after everything else is done, and 2: find the first 753 | # Windows purecall after the last purecall we found for each one 754 | # in the Linux table 755 | # This is kinda hard to test edge cases, but we'll assume this works 756 | lastidx = 0 757 | for i in purecalls: 758 | winidx = -1 759 | for j, winfunc in enumerate(wtable[lastidx:]): 760 | if winfunc.mangledname == "__cxa_pure_virtual": 761 | winidx = j + lastidx 762 | break 763 | 764 | s = f"L{i}" 765 | if winidx != -1: 766 | s = f"{s:<8}W{winidx}" 767 | 768 | shortname = idaapi.demangle_name(mangledname, idaapi.MNG_SHORT_FORM) or "purecall" 769 | newprepend = f"{prepend:<20}{s:<8}" 770 | s = f"{newprepend:<36}{shortname}" 771 | exportnode.insert(i, s) 772 | lastidx = winidx+1 773 | windiscovered.add(winidx) 774 | 775 | # For thunks, figure out which Windows indices were not discovered and add them 776 | # Inherited table might be out of order but we favor Linux anyways 777 | for j, winfunc in enumerate(wtable): 778 | if j not in windiscovered: 779 | dummy = "" 780 | s = f"W{j}" 781 | 782 | shortname = idaapi.demangle_name(winfunc.mangledname, idaapi.MNG_SHORT_FORM) or "purecall" 783 | newprepend = f"{prepend:<20}{dummy:<8}{s:<8}" 784 | s = 
f"{newprepend:<36}{shortname}" 785 | exportnode.append(s) 786 | 787 | EXPORTS += 1 788 | exporttable[classname] = exportnode 789 | 790 | # Export Linux only tables 791 | for classname, linuxtable in winless.items(): 792 | # Sort and int-ify Linux again 793 | newlinuxtable = [(abs(int(k)), v) for k, v in linuxtable.items()] 794 | newlinuxtable.sort(key=lambda x: x[0]) 795 | exportnode = [] 796 | for thisoffs, table in newlinuxtable: 797 | prepend = f"[L{thisoffs}]" 798 | for i, mangledname in enumerate(table): 799 | shortname = idaapi.demangle_name(mangledname, idaapi.MNG_SHORT_FORM) or "purecall" 800 | newprepend = f"{prepend:<20}L{i:<8}" 801 | s = f"{newprepend:<36}{shortname}" 802 | exportnode.append(s) 803 | 804 | EXPORTS += 1 805 | exporttable[classname] = exportnode 806 | return exporttable 807 | 808 | def read_vtables_win(classname, ti, wintable, baseclasses): 809 | if classname in wintable.keys(): 810 | return 811 | 812 | vclass = wintable.get(classname, VClass(name=classname, baseclasses=baseclasses)) 813 | for colea in ti.cols: 814 | vclass.parse(colea, wintable) 815 | 816 | wintable[classname] = vclass 817 | 818 | def read_tinfo_win(classname, ti, winti, wintable, baseclasses): 819 | # Strange cases where there is a base class descriptor with no vtable 820 | if classname not in winti.keys(): 821 | return 822 | 823 | if classname in wintable.keys(): 824 | return 825 | 826 | # No COLs, but we still keep the type in the wintable 827 | if not ti.cols: 828 | wintable[classname] = VClass(name=classname, baseclasses=baseclasses) 829 | return 830 | 831 | # So essentially we just run through each base class in the hierarchy descriptor 832 | # and recursively parse the base classes of the base classes 833 | # Sort of like a reverse insertion sort only not really a sort 834 | for colea in ti.cols: 835 | col = get_class_from_ea(RTTICompleteObjectLocator, colea) 836 | hierarchydesc = get_class_from_ea(RTTIClassHierarchyDescriptor, 
rva_to_ea(col.pClassHierarchyDescriptor)) 837 | numitems = hierarchydesc.numBaseClasses 838 | arraystart = rva_to_ea(hierarchydesc.pBaseClassArray) 839 | 840 | # Go backwards because we should start parsing from the basest base class 841 | for i in range(numitems - 1, -1, -1): 842 | offset = arraystart + i * ctypes.sizeof(ctypes.c_uint32) 843 | descea = rva_to_ea(idaapi.get_wide_dword(offset)) 844 | parentname = idaapi.demangle_name(idaapi.get_name(descea), idaapi.MNG_SHORT_FORM) 845 | if not parentname: 846 | # Another undefining IDA moment 847 | # print(f"[VTABLE IO] Invalid parent name at {offset:#x}") 848 | typedesc = rva_to_ea(idaapi.get_wide_dword(descea)) 849 | parentname = idaapi.demangle_name(idaapi.get_name(typedesc), idaapi.MNG_SHORT_FORM) 850 | 851 | # Should be impossible since this is the type descriptor 852 | if not parentname: 853 | print(f"[VTABLE IO] Invalid parent name at {offset:#x} - type descriptor at {typedesc:#x}") 854 | continue 855 | 856 | parentname = parentname.removeprefix("class ") 857 | parentname = parentname.removeprefix("struct TypeDescriptor ") 858 | parentname = parentname.removesuffix(" `RTTI Type Descriptor'") 859 | else: 860 | parentname = parentname[:parentname.find("::`RTTI Base Class Descriptor")] 861 | 862 | # End of the line 863 | if i == 0: 864 | read_vtables_win(classname, winti[parentname], wintable, baseclasses) 865 | elif parentname in winti.keys(): 866 | read_tinfo_win(parentname, winti[parentname], winti, wintable, baseclasses) 867 | # Once again relying on dicts being ordered 868 | baseclasses[parentname] = wintable[parentname] 869 | 870 | def gen_win_tables(winti): 871 | # So first we start looping windows typeinfos because 872 | # we're going to go from the COL -> ClassHierarchyDescriptor -> BaseClassArray 873 | # The reason why we're doing this is because of subclass overloads 874 | # For a history lesson, see 
https://github.com/Scags/IDA-Scripts/blob/125f1877a24da48062e62efcfb7d8a63e3bd939b/vtable_io.py#L251-L263 875 | # We're going to fix this by writing (and thus caching the names of) the baseclasses of classes first 876 | # This way, we'll be able to know the classname and the virtual functions contained therein, 877 | # and thus we will know if there is an overload that exists in a subclass 878 | # This relies on the fact that dicts are ordered in Python 3.7+ 879 | # If you're running anything older, replace wintables with an OrderedDict 880 | 881 | # Same format as linuxtables 882 | # {classname: VClass(classname, {thisoffs: [vfunc...], ...}, ...), ...} 883 | wintables = {} 884 | for classname, ti in winti.items(): 885 | read_tinfo_win(classname, ti, winti, wintables, {}) 886 | 887 | return wintables 888 | 889 | def fix_windows_classname(classname): 890 | # Double pointers are spaced... 891 | classnamefix = classname.replace("* *", "**") 892 | 893 | # References/pointers that are const are spaced... 
894 | classnamefix = classnamefix.replace("const &", "const&") 895 | classnamefix = classnamefix.replace("const *", "const*") 896 | 897 | # And true/false is instead replaced with 1/0 898 | def replacer(m): 899 | # Avoid replacing 1s and 0s that are a part of classnames 900 | # Thanks ChatGPT 901 | return re.sub(r"(?<=\W)1(?=\W)", "true", re.sub(r"(?<=\W)0(?=\W)", "false", m.group())) 902 | classnamefix = re.sub(r"<[^>]+>", replacer, classnamefix) 903 | 904 | # Other quirks are inline structs and templated enums 905 | # which are pretty much impossible to deduce 906 | return classnamefix 907 | 908 | # Idk why but sometimes pointers have a mind of their own 909 | def fix_windows_classname2(classname): 910 | return classname.replace(" *", "*") 911 | 912 | def fix_win_overloads(linuxitems, winitems, vclass, functable): 913 | for i in range(min(len(linuxitems), len(winitems))): 914 | currfuncs = linuxitems[i].funcs 915 | vfuncs = [] 916 | for u in range(len(currfuncs)): 917 | f = VFunc.create(mangledname=currfuncs[u]) 918 | for j, baseclass in enumerate(vclass.baseclasses.values()): 919 | if f.postname in baseclass.postnames: 920 | f.inheritid = j 921 | break 922 | 923 | # Unbelievable hack right here 924 | # Looks like pointers are getting shoved next to their types instead of spaced sometimes 925 | # Not entirely sure what causes this. 926 | # CAI_BaseNPC::CanStandOn(CBaseEntity*) vs CBaseEntity::CanStandOn(CBaseEntity *) 927 | # Maybe it's the difference in the types of the pointers and this? 
928 | trystr = f.postname 929 | breakout = False 930 | for k in range(trystr.count(" *")): 931 | trystr = trystr.replace(" *", "*", 1) 932 | if trystr in baseclass.postnames: 933 | f.inheritid = j 934 | f.postname = trystr 935 | breakout = True 936 | break 937 | 938 | if breakout: 939 | break 940 | 941 | vfuncs.append(f) 942 | 943 | # Remove Linux's extra dtor 944 | for u, f in enumerate(vfuncs): 945 | if "::~" in f.name: 946 | del vfuncs[u] 947 | break 948 | 949 | # Windows does overloads backwards, reverse them 950 | funcnameset = set() 951 | u = 0 952 | while u < len(vfuncs): 953 | f = vfuncs[u] 954 | 955 | if f.mangledname.startswith("__cxa"):# or f.mangledname.startswith("_ZThn") or f.mangledname.startswith("_ZTv"): 956 | u += 1 957 | continue 958 | 959 | if not f.name: 960 | u += 1 961 | continue 962 | 963 | # This is an overload, we take the function name here, and push it somewhere else 964 | if f.sname in funcnameset: 965 | # Find the first index of the overload 966 | firstidx = -1 967 | for k in range(u): 968 | if vfuncs[k].sname == f.sname: 969 | firstidx = k 970 | break 971 | 972 | if firstidx == -1: 973 | print(f"[VTABLE IO] An impossibility has occurred. 
\"{f.sname}\" ({f.mangledname}, {f.name}) is in funcnameset but there is no possible overload.") 974 | 975 | overloadfunc = vfuncs[firstidx] 976 | if overloadfunc.inheritid != f.inheritid: 977 | # Although this function is an overload, it was created in a subclass 978 | # So we don't move it 979 | u += 1 980 | continue 981 | 982 | # Remove the current func from the list 983 | del vfuncs[u] 984 | # And insert it into the first index 985 | vfuncs.insert(firstidx, f) 986 | u += 1 987 | continue 988 | 989 | funcnameset.add(f.sname) 990 | u += 1 991 | 992 | for f in vfuncs: 993 | vclass.postnames.add(f.postname) 994 | functable[linuxitems[i].thisoffs] = vfuncs 995 | 996 | def thunk_dance(winitems, vclass, functable): 997 | # Now it's time for thunk dancing 998 | mainltable = functable[0] 999 | mainwtable = winitems[0].funcs 1000 | for currlinuxitems, currwinitems in zip(functable.items(), winitems): 1001 | thisoffs, ltable = currlinuxitems 1002 | wtable = currwinitems.funcs 1003 | if thisoffs == 0: 1004 | continue 1005 | 1006 | # Remove any extra dtors from this table 1007 | dtorcount = 0 1008 | for i, f in enumerate(ltable): 1009 | if "::~" in f.name: 1010 | dtorcount += 1 1011 | if dtorcount > 1: 1012 | del ltable[i] 1013 | break 1014 | 1015 | i = 0 1016 | while i < len(mainltable): 1017 | f = mainltable[i] 1018 | if f.mangledname.startswith("__cxa"): 1019 | i += 1 1020 | continue 1021 | 1022 | # I shouldn't need to do this, but destructors are wonky 1023 | if i == 0 and "::~" in f.name: 1024 | i += 1 1025 | continue 1026 | 1027 | if not f.postname: 1028 | i += 1 1029 | continue 1030 | 1031 | # Windows skips the vtable function if it's implementation is in the thunks 1032 | # A way to check if this is true is to see which thunks are actually thunks 1033 | # Then we just pop its name from the main table, since it's no longer there 1034 | thunkidx = -1 1035 | for u in range(len(ltable)): 1036 | if ltable[u].postname == f.postname: 1037 | thunkidx = u 1038 | break 1039 | 
1040 | if thunkidx != -1: 1041 | try: 1042 | # We can't exactly see if the possible thunk jumps to a certain function (mainwtable[i]) because 1043 | # it's impossible to know what that function even is, so we instead check to see if 1044 | # it jumps into any function in the main vtable which is good enough 1045 | if not is_thunk(wtable[thunkidx], mainwtable): 1046 | ltable[thunkidx] = mainltable[i] 1047 | del mainltable[i] 1048 | continue 1049 | except: 1050 | print(f"[VTABLE IO] Anomalous thunk: {vclass.name}::{f.postname}, mainwtable {len(mainwtable)} wtable {len(wtable)} thunkidx {thunkidx} thisoffs {thisoffs}") 1051 | pass 1052 | i += 1 1053 | 1054 | # Update current linux table 1055 | functable[thisoffs] = ltable 1056 | 1057 | # Update main table 1058 | functable[0] = mainltable 1059 | 1060 | def prep_linux_vtables(linuxitems, winitems, vclass): 1061 | functable = {} 1062 | 1063 | fix_win_overloads(linuxitems, winitems, vclass, functable) 1064 | 1065 | # No thunks, we are done 1066 | if min(len(linuxitems), len(winitems)) == 1: 1067 | return functable 1068 | 1069 | thunk_dance(winitems, vclass, functable) 1070 | 1071 | # Ready to write 1072 | return functable 1073 | 1074 | def merge_tables(functable, winitems): 1075 | for items in zip(functable.items(), winitems): 1076 | # Should probably make this unpacking/packing more efficient 1077 | currlitems, currwitems = items 1078 | _, ltable = currlitems 1079 | wtable = currwitems.funcs 1080 | 1081 | for i, f in enumerate(ltable): 1082 | targetname = f.mangledname 1083 | # Purecall, which should already be handled on the Windows side 1084 | if targetname.startswith("__cxa"): 1085 | continue 1086 | 1087 | # Size mismatch, skip it 1088 | try: 1089 | currfunc = wtable[i] 1090 | except: 1091 | continue 1092 | targetaddr = currfunc.ea 1093 | 1094 | flags = idaapi.get_full_flags(targetaddr) 1095 | # Already typed 1096 | if idaapi.has_name(flags): 1097 | if VOPTIONS.cImportOptions & VOptions.CommentReusedFunctions: 1098 | 
# If it's a Windows optimization (nullsubs, etc), 1099 | # add a comment with the actual name 1100 | # There's gotta be a way to rename the reference but not the function 1101 | currmangledname = idaapi.get_name(targetaddr) 1102 | currname = idaapi.demangle_name(currmangledname, idaapi.MNG_LONG_FORM) 1103 | if not currname or currname != f.name: 1104 | # Use short name for cmt since that's what IDA uses 1105 | shortname = idaapi.demangle_name(f.mangledname, idaapi.MNG_SHORT_FORM) 1106 | idaapi.set_cmt(currfunc.vaddr, shortname, False) 1107 | continue 1108 | 1109 | func = idaapi.get_func(targetaddr) 1110 | # Not actually a function somehow 1111 | if not func: 1112 | continue 1113 | 1114 | # A library function (should already have a name) 1115 | if func.flags & idaapi.FUNC_LIB: 1116 | continue 1117 | 1118 | idaapi.set_name(targetaddr, targetname, idaapi.SN_FORCE) 1119 | global FUNCS 1120 | FUNCS += 1 1121 | 1122 | def compare_tables(wintables, linuxtables): 1123 | functables = {} 1124 | for classname, vclass in wintables.items(): 1125 | if not vclass.vfuncs: 1126 | continue 1127 | 1128 | linuxtable = linuxtables.get(classname, {}) 1129 | if not linuxtable: 1130 | # Some weird Windows quirks 1131 | classnamefix = fix_windows_classname(classname) 1132 | linuxtable = linuxtables.get(classnamefix, {}) 1133 | if not linuxtable: 1134 | # Another very weird quirk 1135 | classnamefix = fix_windows_classname2(classnamefix) 1136 | linuxtable = linuxtables.get(classnamefix, {}) 1137 | if not linuxtable: 1138 | # print(f"[VTABLE IO] {classname}{f' (tried {classnamefix})' if classname != classnamefix else ''} not found in Linux tables. 
Skipping...") 1139 | continue 1140 | 1141 | winitems = list(FuncList(x[0], x[1]) for x in vclass.vfuncs.items()) 1142 | # Sort by thisoffs, smallest first 1143 | winitems.sort(key=lambda x: x.thisoffs) 1144 | 1145 | # Convert the string thisoffs to int 1146 | # Linux thisoffses are negative, abs them 1147 | linuxitems = [FuncList(abs(int(k)), v) for k, v in linuxtable.items()] 1148 | linuxitems.sort(key=lambda x: x.thisoffs) 1149 | 1150 | # If there's a size mismatch (very rare), then most likely IDA failed to analyze 1151 | # a certain vtable, so we can't continue given the high probability of catastrophic failure 1152 | if len(winitems) != len(linuxitems): 1153 | print(f"[VTABLE IO] {classname} vtable # mismatch - L{len(linuxitems)} W{len(winitems)}. Skipping...") 1154 | continue 1155 | 1156 | functable = prep_linux_vtables(linuxitems, winitems, vclass) 1157 | 1158 | skip = False 1159 | for items in zip(functable.items(), winitems): 1160 | currlinuxitems, currwinitems = items 1161 | thisoffs, ltable = currlinuxitems 1162 | if len(ltable) != len(currwinitems.funcs): 1163 | print(f"[VTABLE IO] WARNING: {vclass.name} vtable [W{currwinitems.thisoffs}/L{thisoffs}] may be wrong! L{len(ltable)} - W{len(currwinitems.funcs)} = {len(ltable) - len(currwinitems.funcs)}", end="") 1164 | if VOPTIONS.cImportOptions & VOptions.SkipMismatches: 1165 | print(". Skipping...") 1166 | skip = True 1167 | break 1168 | else: 1169 | print() 1170 | 1171 | if skip: 1172 | continue 1173 | 1174 | functables[classname] = functable 1175 | 1176 | # Write! 
1177 | if VOPTIONS.cExportOptions != VOptions.ExportOnly: 1178 | merge_tables(functable, winitems) 1179 | 1180 | return functables 1181 | 1182 | def write_vtables(): 1183 | WaitBox.show("Importing file") 1184 | linuxtables = None 1185 | try: 1186 | with open(VOPTIONS.iFileImport) as f: 1187 | linuxtables = json.load(f) 1188 | except FileNotFoundError as e: 1189 | print(f"[VTABLE IO] File {VOPTIONS.iFileImport} not found.") 1190 | return 1191 | 1192 | if not linuxtables: 1193 | return 1194 | 1195 | WaitBox.show("Parsing Windows typeinfo") 1196 | winti = read_ti_win() 1197 | if winti is None: 1198 | return 1199 | 1200 | WaitBox.show("Generating windows vtables") 1201 | wintables = gen_win_tables(winti) 1202 | 1203 | if not wintables: 1204 | return 1205 | 1206 | WaitBox.show("Comparing vtables") 1207 | functables = compare_tables(wintables, linuxtables) 1208 | 1209 | if VOPTIONS.cExportOptions in (VOptions.ExportOnly, VOptions.ExportNormal): 1210 | if VOPTIONS.iFileExport is None or VOPTIONS.iFileExport == "*.json": 1211 | print("[VTABLE IO] No export file specified.") 1212 | return 1213 | 1214 | WaitBox.show("Writing to file") 1215 | exporttable = build_export_table(linuxtables, functables) 1216 | with open(VOPTIONS.iFileExport, "w") as f: 1217 | json.dump(exporttable, f, indent=4, sort_keys=True) 1218 | 1219 | 1220 | def main(): 1221 | os = get_os() 1222 | if os == -1: 1223 | print(f"Unsupported OS?: {idaapi.get_file_type_name()}") 1224 | idaapi.beep() 1225 | return 1226 | 1227 | try: 1228 | if os == OS_Linux: 1229 | read_vtables_linux() 1230 | print("Done!") 1231 | elif os == OS_Win: 1232 | global VOPTIONS 1233 | VOPTIONS = VForm.init_options() 1234 | if not VOPTIONS: 1235 | return 1236 | 1237 | write_vtables() 1238 | if FUNCS: 1239 | print(f"[VTABLE IO] Successfully typed {FUNCS} virtual functions") 1240 | else: 1241 | print("[VTABLE IO] No functions were typed") 1242 | 1243 | if EXPORTS: 1244 | print(f"[VTABLE IO] Successfully exported {EXPORTS} virtual tables") 
1245 | 1246 | if FUNCS == 0 and EXPORTS == 0: 1247 | idaapi.beep() 1248 | except: 1249 | import traceback 1250 | traceback.print_exc() 1251 | print("Please file a bug report with supporting information at https://github.com/Scags/IDA-Scripts/issues") 1252 | idaapi.beep() 1253 | 1254 | WaitBox.hide() 1255 | 1256 | # import cProfile 1257 | # cProfile.run("main()", "vtable_io.prof") 1258 | main() -------------------------------------------------------------------------------- /vtable_structs.py: -------------------------------------------------------------------------------- 1 | import idc 2 | import idautils 3 | import idaapi 4 | import ctypes 5 | import time 6 | 7 | from dataclasses import dataclass 8 | 9 | OS_Linux = 0 10 | OS_Win = 1 11 | 12 | if idaapi.inf_is_64bit(): 13 | ea_t = ctypes.c_uint64 14 | ptr_t = ctypes.c_int64 15 | get_ptr = idaapi.get_qword 16 | FF_PTR = idc.FF_QWORD 17 | else: 18 | ea_t = ctypes.c_uint32 19 | ptr_t = ctypes.c_int32 20 | get_ptr = idaapi.get_dword 21 | FF_PTR = idc.FF_DWORD 22 | 23 | def is_ptr(f): return (f & idaapi.MS_CLS) == idc.FF_DATA and (f & idaapi.DT_TYPE) == FF_PTR 24 | def is_off(f): return (f & (idc.FF_0OFF|idc.FF_1OFF)) != 0 25 | 26 | 27 | _RTTICompleteObjectLocator_fields = [ 28 | ("signature", ctypes.c_uint32), # signature 29 | ("offset", ctypes.c_uint32), # offset of this vtable in complete class (from top) 30 | ("cdOffset", ctypes.c_uint32), # offset of constructor displacement 31 | ("pTypeDescriptor", ctypes.c_uint32), # ref TypeDescriptor 32 | ("pClassHierarchyDescriptor", ctypes.c_uint32), # ref RTTIClassHierarchyDescriptor 33 | ] 34 | 35 | if idaapi.inf_is_64bit(): 36 | _RTTICompleteObjectLocator_fields.append(("pSelf", ctypes.c_uint32)) # ref to object's base 37 | 38 | class RTTICompleteObjectLocator(ctypes.Structure): 39 | _fields_ = _RTTICompleteObjectLocator_fields 40 | 41 | 42 | class TypeDescriptor(ctypes.Structure): 43 | _fields_ = [ 44 | ("pVFTable", ctypes.c_uint32), # reference to RTTI's vftable 45 | 
("spare", ctypes.c_uint32), # internal runtime reference 46 | ("name", ctypes.c_uint8), # type descriptor name (no varstruct needed since we don't use this) 47 | ] 48 | 49 | 50 | class RTTIClassHierarchyDescriptor(ctypes.Structure): 51 | _fields_ = [ 52 | ("signature", ctypes.c_uint32), # signature 53 | ("attribs", ctypes.c_uint32), # attributes 54 | ("numBaseClasses", ctypes.c_uint32), # # of items in the array of base classes 55 | ("pBaseClassArray", ctypes.c_uint32), # ref BaseClassArray 56 | ] 57 | 58 | 59 | class RTTIBaseClassDescriptor(ctypes.Structure): 60 | _fields_ = [ 61 | ("pTypeDescriptor", ctypes.c_uint32), # ref TypeDescriptor 62 | ("numContainedBases", ctypes.c_uint32), # # of sub elements within base class array 63 | ("mdisp", ctypes.c_uint32), # member displacement 64 | ("pdisp", ctypes.c_uint32), # vftable displacement 65 | ("vdisp", ctypes.c_uint32), # displacement within vftable 66 | ("attributes", ctypes.c_uint32), # base class attributes 67 | ("pClassDescriptor", ctypes.c_uint32), # ref RTTIClassHierarchyDescriptor 68 | ] 69 | 70 | 71 | class base_class_type_info(ctypes.Structure): 72 | _fields_ = [ 73 | ("basetype", ea_t), # Base class type 74 | ("offsetflags", ea_t), # Offset and info 75 | ] 76 | 77 | 78 | class class_type_info(ctypes.Structure): 79 | _fields_ = [ 80 | ("pVFTable", ea_t), # reference to RTTI's vftable (__class_type_info) 81 | ("pName", ea_t), # ref to type name 82 | ] 83 | 84 | # I don't think this is right, but every case I found looked to be correct 85 | # This might be a vtable? IDA sometimes says it is but not always 86 | # Plus sometimes the flags member is 0x1, so it's not a thisoffs. 
Weird 87 | class pointer_type_info(class_type_info): 88 | _fields_ = [ 89 | ("flags", ea_t), # Flags or something else 90 | ("pType", ea_t), # ref to type 91 | ] 92 | 93 | class si_class_type_info(class_type_info): 94 | _fields_ = [ 95 | ("pParent", ea_t), # ref to parent type 96 | ] 97 | 98 | class vmi_class_type_info(class_type_info): 99 | _fields_ = [ 100 | ("flags", ctypes.c_uint32), # flags 101 | ("basecount", ctypes.c_uint32), # # of base classes 102 | ("pBaseArray", base_class_type_info), # array of BaseClassArray 103 | ] 104 | 105 | def create_vmi_class_type_info(ea): 106 | bytestr = idaapi.get_bytes(ea, ctypes.sizeof(vmi_class_type_info)) 107 | tinfo = vmi_class_type_info.from_buffer_copy(bytestr) 108 | 109 | # Since this is a varstruct, we create a dynamic class with the proper size and type and return it instead 110 | class vmi_class_type_info_dynamic(class_type_info): 111 | _fields_ = [ 112 | ("flags", ctypes.c_uint32), 113 | ("basecount", ctypes.c_uint32), 114 | ("pBaseArray", base_class_type_info * tinfo.basecount), 115 | ] 116 | 117 | return vmi_class_type_info_dynamic 118 | 119 | # Idiot proof IDA wait box 120 | class WaitBox: 121 | buffertime = 0.0 122 | shown = False 123 | msg = "" 124 | 125 | @staticmethod 126 | def _show(msg): 127 | WaitBox.msg = msg 128 | if WaitBox.shown: 129 | idaapi.replace_wait_box(msg) 130 | else: 131 | idaapi.show_wait_box(msg) 132 | WaitBox.shown = True 133 | 134 | @staticmethod 135 | def show(msg, buffertime=0.1): 136 | if msg == WaitBox.msg: 137 | return 138 | 139 | if buffertime > 0.0: 140 | if time.time() - WaitBox.buffertime < buffertime: 141 | return 142 | WaitBox.buffertime = time.time() 143 | WaitBox._show(msg) 144 | 145 | @staticmethod 146 | def hide(): 147 | if WaitBox.shown: 148 | idaapi.hide_wait_box() 149 | WaitBox.shown = False 150 | STRUCTS = 0 151 | 152 | class InfoCache(object): 153 | tinfos = {} 154 | vfuncs = {} 155 | 156 | # Class for windows type info, helps organize things 157 | 
@dataclass(frozen=True) 158 | class WinTI(object): 159 | typedesc: int 160 | name: str 161 | cols: list[int] 162 | vtables: list[int] 163 | 164 | @dataclass 165 | class VFuncRef: 166 | ea: int # Address to this function 167 | mangledname: str 168 | name: str 169 | postname: str 170 | sname: str 171 | 172 | @staticmethod 173 | def create(ea=idc.BADADDR, mangledname=""): 174 | if InfoCache.vfuncs.get(ea): 175 | return InfoCache.vfuncs[ea] 176 | 177 | name = "" 178 | postname = "" 179 | sname = "" 180 | if mangledname: 181 | name = idaapi.demangle_name(mangledname, idaapi.MNG_SHORT_FORM) 182 | if name: 183 | postname = get_func_postname(name) 184 | sname = postname.split("(")[0] 185 | else: 186 | postname = mangledname 187 | sname = mangledname 188 | 189 | vfunc = VFuncRef(ea, mangledname, name, postname, sname) 190 | InfoCache.vfuncs[ea] = vfunc 191 | return vfunc 192 | 193 | @dataclass(frozen=True) 194 | class VFunc: 195 | funcref: VFuncRef 196 | vaddr: int # Address to this function's reference in its vtable 197 | 198 | @staticmethod 199 | def create(vaddr): 200 | ea = get_ptr(vaddr) 201 | ref = InfoCache.vfuncs.get(ea, VFuncRef.create(ea=ea, mangledname=idaapi.get_name(ea))) 202 | return VFunc(ref, vaddr) 203 | 204 | 205 | def get_os(): 206 | ftype = idaapi.get_file_type_name() 207 | if "ELF" in ftype: 208 | return OS_Linux 209 | elif "PE" in ftype: 210 | return OS_Win 211 | return -1 212 | 213 | # Read a ctypes class from an ea 214 | def get_class_from_ea(classtype, ea): 215 | bytestr = idaapi.get_bytes(ea, ctypes.sizeof(classtype)) 216 | return classtype.from_buffer_copy(bytestr) 217 | 218 | def add_struc_ex(name): 219 | strucid = idaapi.get_struc_id(name) 220 | if strucid == idc.BADADDR: 221 | strucid = idaapi.add_struc(idc.BADADDR, name) 222 | 223 | return strucid 224 | 225 | # Anything past Classname:: 226 | # Thank you CTFPlayer::SOCacheUnsubscribed... 
227 | def get_func_postname(name): 228 | retname = name 229 | template = 0 230 | iterback = 0 231 | for i, c in enumerate(retname): 232 | if c == "<": 233 | template += 1 234 | elif c == ">": 235 | template -= 1 236 | # Find ( and break if we're not in a template 237 | elif c == "(" and template == 0: 238 | iterback = i 239 | break 240 | 241 | # Run backwards from ( until we hit a :: 242 | for i in range(iterback, -1, -1): 243 | if retname[i] == ":": 244 | retname = retname[i+1:] 245 | break 246 | 247 | return retname 248 | 249 | def rva_to_ea(ea): 250 | if idaapi.inf_is_64bit(): 251 | return idaapi.get_imagebase() + ea 252 | return ea 253 | 254 | def parse_si_tinfo(ea, tinfos): 255 | for xref in idautils.XrefsTo(ea): 256 | tinfo = get_class_from_ea(si_class_type_info, xref.frm) 257 | tinfos[xref.frm + si_class_type_info.pParent.offset] = tinfo.pParent 258 | 259 | 260 | def parse_pointer_tinfo(ea, tinfos): 261 | for xref in idautils.XrefsTo(ea): 262 | tinfo = get_class_from_ea(pointer_type_info, xref.frm) 263 | tinfos[xref.frm + pointer_type_info.pType.offset] = tinfo.pType 264 | 265 | 266 | def parse_vmi_tinfo(ea, tinfos): 267 | for xref in idautils.XrefsTo(ea): 268 | tinfotype = create_vmi_class_type_info(xref.frm) 269 | tinfo = get_class_from_ea(tinfotype, xref.frm) 270 | 271 | for i in range(tinfo.basecount): 272 | offset = vmi_class_type_info.pBaseArray.offset + i * ctypes.sizeof(base_class_type_info) 273 | basetinfo = get_class_from_ea(base_class_type_info, xref.frm + offset) 274 | tinfos[xref.frm + offset + base_class_type_info.basetype.offset] = basetinfo.basetype 275 | 276 | def get_tinfo_vtables(ea, tinfos, vtables): 277 | if ea == idc.BADADDR: 278 | return 279 | 280 | for tinfoxref in idautils.XrefsTo(ea, idaapi.XREF_DATA): 281 | count = 0 282 | mangled = idaapi.get_name(tinfoxref.frm) 283 | demangled = idc.demangle_name(mangled, idaapi.MNG_LONG_FORM) 284 | if demangled is None: 285 | print(f"[VTABLE STRUCTS] Invalid name at {tinfoxref.frm:#x}") 286 | 
continue 287 | 288 | classname = demangled[len("`typeinfo for'"):] 289 | for xref in idautils.XrefsTo(tinfoxref.frm, idaapi.XREF_DATA): 290 | if xref.frm not in tinfos.keys(): 291 | # If address lies in a function 292 | if idaapi.is_func(idaapi.get_full_flags(xref.frm)): 293 | continue 294 | 295 | count += 1 296 | vtables[classname] = vtables.get(classname, []) + [xref.frm] 297 | 298 | 322 | def parse_vtables(vtables): 323 | jsondata = {} 324 | ptrsize = ctypes.sizeof(ea_t) 325 | for classname, tables in vtables.items(): 326 | # We don't *need* to do any sort of sorting in Linux and can just capture the thisoffset 327 | # The Windows side of the script can organize later 328 | for ea in tables: 329 | thisoffs = get_ptr(ea - ptrsize) 330 | 331 | funcs = parse_vtable(ea + ptrsize) 332 | # Can be zero if there's an xref in the global offset table (.got) section 333 | # Fortunately the parse_vtable function doesn't grab anything from there 334 | if funcs: 335 | classdata = jsondata.get(classname, {}) 336 | classdata[ptr_t(thisoffs).value] = funcs 337 | jsondata[classname] = classdata 338 | 339 | return jsondata 340 | 341 | def parse_vtable(ea): 342 | funcs = [] 343 | 344 | while ea != idc.BADADDR: 345 | 
# Using flags sped this up by a lot 346 | # Went from 4 secs to ~1.3 347 | flags = idaapi.get_full_flags(ea) 348 | if not is_off(flags) or not is_ptr(flags): 349 | break 350 | 351 | if get_os() == OS_Linux and idaapi.has_name(flags): 352 | break 353 | 354 | offs = get_ptr(ea) 355 | fflags = idaapi.get_full_flags(offs) 356 | if not idaapi.is_code(fflags): 357 | break 358 | 359 | if get_os() == OS_Win and not idaapi.has_any_name(fflags): 360 | break 361 | 362 | vfunc = VFunc.create(ea) 363 | # Invalid name, so this can be a "sub_", purecall, or an optimized function 364 | # So to keep vtable_io compat, we grab the comment instead and update the names 365 | if not vfunc.funcref.name: 366 | cmt = idaapi.get_cmt(ea, False) 367 | if cmt and "::" in cmt: 368 | vfunc.funcref.mangledname = None 369 | vfunc.funcref.name = cmt 370 | vfunc.funcref.postname = get_func_postname(vfunc.funcref.name) 371 | vfunc.funcref.sname = vfunc.funcref.postname.split("(")[0] 372 | 373 | funcs.append(vfunc) 374 | 375 | ea = idaapi.next_head(ea, idc.BADADDR) 376 | return funcs 377 | 378 | def calc_member_tinfo(vfunc): 379 | cached = InfoCache.tinfos.get(vfunc.funcref.ea, None) 380 | if cached is not None: 381 | return cached 382 | 383 | # Get the type info of the function if it's present 384 | # In Windows, you can't get the actual tinfo so you can only guess 385 | # and use the rudimentary type info 386 | tinfo = idaapi.tinfo_t() 387 | if not idaapi.get_tinfo(tinfo, vfunc.funcref.ea): 388 | if idaapi.guess_tinfo(tinfo, vfunc.funcref.ea) == idaapi.GUESS_FUNC_FAILED: 389 | tinfo = None 390 | 391 | if tinfo is not None: 392 | tinfo.create_ptr(tinfo) 393 | 394 | InfoCache.tinfos[vfunc.funcref.ea] = tinfo 395 | return tinfo 396 | 397 | 398 | def create_structs(data): 399 | # Now this is an awesome API function that we most certainly need 400 | idaapi.begin_type_updating(idaapi.UTP_STRUCT) 401 | 402 | for classname, vtables in data.items(): 403 | classstrucid = add_struc_ex(classname) 404 | 
classstruc = idaapi.get_struc(classstrucid) 405 | for thisoffs, vfuncs in vtables.items(): 406 | thisoffs = abs(thisoffs) 407 | postfix = f"_{thisoffs:04X}" if thisoffs != 0 else "" 408 | structype = f"{classname}{postfix}{idaapi.VTBL_SUFFIX}" 409 | structype = idaapi.validate_name(structype, idaapi.VNT_TYPE, idaapi.SN_IDBENC) 410 | 411 | vtablestrucid = add_struc_ex(structype) 412 | vtablestruc = idaapi.get_struc(vtablestrucid) 413 | for i, vfunc in enumerate(vfuncs): 414 | offs = i * ctypes.sizeof(ea_t) 415 | targetname = vfunc.funcref.sname 416 | 417 | currmem = idaapi.get_member(vtablestruc, offs) 418 | if currmem: 419 | # memname = idaapi.get_member_name(currmem.id) 420 | # # Can have a postfix so we use in operator 421 | # if targetname in memname: 422 | # if not currmem.has_ti(): 423 | # tinfo = calc_member_tinfo(vfunc) 424 | # if tinfo is not None: 425 | # idaapi.set_member_tinfo(vtablestruc, currmem, 0, tinfo, 0) 426 | # continue 427 | 428 | # # Sadly if you reorganize a vtable and move a function up, this will fail 429 | # # and you'll have an unneeded postfix 430 | # if not idaapi.set_name(currmem.id, targetname, idaapi.SN_NOCHECK): 431 | # newname = f"{targetname}_{offs:x}" 432 | # if not idaapi.set_name(currmem.id, newname, idaapi.SN_NOCHECK): 433 | # print(f"Failed to set name for {classname}::{vfunc.funcref.sname} ({targetname}) at offset {offs:#x}") 434 | # continue 435 | 436 | # tinfo = calc_member_tinfo(vfunc) 437 | # if tinfo is not None: 438 | # idaapi.set_member_tinfo(vtablestruc, currmem, 0, tinfo, 0) 439 | continue 440 | 441 | else: 442 | opinfo = idaapi.opinfo_t() 443 | # I don't think this does anything 444 | opinfo.ri.flags = idaapi.REF_OFF64 if idaapi.inf_is_64bit() else idaapi.REF_OFF32 445 | opinfo.ri.target = vfunc.funcref.ea 446 | opinfo.ri.base = 0 447 | opinfo.ri.tdelta = 0 448 | 449 | serr = idaapi.add_struc_member(vtablestruc, targetname, offs, FF_PTR|idc.FF_0OFF, opinfo, ctypes.sizeof(ea_t)) 450 | # Failed, so there was either an 
invalid name or a name collision 451 | if serr == idaapi.STRUC_ERROR_MEMBER_NAME: 452 | targetname = idaapi.validate_name(targetname, idaapi.VNT_IDENT, idaapi.SN_IDBENC) 453 | serr = idaapi.add_struc_member(vtablestruc, targetname, offs, FF_PTR|idc.FF_0OFF, opinfo, ctypes.sizeof(ea_t)) 454 | if serr == idaapi.STRUC_ERROR_MEMBER_NAME: 455 | targetname = f"{targetname}_{offs:X}" 456 | serr = idaapi.add_struc_member(vtablestruc, targetname, offs, FF_PTR|idc.FF_0OFF, opinfo, ctypes.sizeof(ea_t)) 457 | 458 | if serr != idaapi.STRUC_ERROR_MEMBER_OK: 459 | print(vtablestruc, vtablestrucid) 460 | print(f"Failed to add member {classname}::{vfunc.funcref.sname} ({targetname}) at offset {offs:#x} -> {serr}") 461 | continue 462 | 463 | tinfo = calc_member_tinfo(vfunc) 464 | if tinfo is not None: 465 | mem = idaapi.get_member(vtablestruc, offs) 466 | idaapi.set_member_tinfo(vtablestruc, mem, 0, tinfo, 0) 467 | 468 | vmember = idaapi.get_member(classstruc, thisoffs) 469 | if not vmember: 470 | if idaapi.add_struc_member(classstruc, f"{idaapi.VTBL_MEMNAME}{postfix}", thisoffs, idc.FF_DATA | FF_PTR, None, ctypes.sizeof(ea_t)) == idaapi.STRUC_ERROR_MEMBER_OK: 471 | global STRUCTS 472 | STRUCTS += 1 473 | tinfo = idaapi.tinfo_t() 474 | if idaapi.guess_tinfo(tinfo, vtablestrucid) != idaapi.GUESS_FUNC_FAILED: 475 | mem = idaapi.get_member(classstruc, thisoffs) 476 | tinfo.create_ptr(tinfo) 477 | idaapi.set_member_tinfo(classstruc, mem, 0, tinfo, 0) 478 | 479 | def read_vtables_linux(): 480 | WaitBox.show("Parsing typeinfo") 481 | 482 | # Step 1 and 2, crawl xrefs and stick the inherited class type infos into a structure 483 | # After this, we can run over the xrefs again and see which xrefs come from another structure 484 | # The remaining xrefs are either vtables or weird math in a function 485 | xreftinfos = {} 486 | 487 | def getparse(name, fn, quiet=False): 488 | tinfo = idc.get_name_ea_simple(name) 489 | if tinfo == idc.BADADDR and not quiet: 490 | print(f"[VTABLE STRUCTS] Type 
info {name} not found. Skipping...") 491 | return idc.BADADDR 492 | 493 | if fn is not None: 494 | fn(tinfo, xreftinfos) 495 | return tinfo 496 | 497 | # Don't need to parse base classes 498 | tinfo = getparse("_ZTVN10__cxxabiv117__class_type_infoE", None) 499 | tinfo_pointer = getparse("_ZTVN10__cxxabiv119__pointer_type_infoE", parse_pointer_tinfo, True) 500 | tinfo_si = getparse("_ZTVN10__cxxabiv120__si_class_type_infoE", parse_si_tinfo) 501 | tinfo_vmi = getparse("_ZTVN10__cxxabiv121__vmi_class_type_infoE", parse_vmi_tinfo) 502 | 503 | if len(xreftinfos) == 0: 504 | print("[VTABLE STRUCTS] No type infos found. Are you sure you're in a C++ binary?") 505 | return 506 | 507 | # Step 3, crawl the xrefs again; if an xref is not in the type info structure, then it's a vtable 508 | WaitBox.show("Discovering vtables") 509 | vtables = {} 510 | get_tinfo_vtables(tinfo, xreftinfos, vtables) 511 | get_tinfo_vtables(tinfo_pointer, xreftinfos, vtables) 512 | get_tinfo_vtables(tinfo_si, xreftinfos, vtables) 513 | get_tinfo_vtables(tinfo_vmi, xreftinfos, vtables) 514 | 515 | # Now, we have a list of vtables and their respective classes 516 | WaitBox.show("Parsing vtables") 517 | data = parse_vtables(vtables) 518 | 519 | WaitBox.show("Creating structs") 520 | create_structs(data) 521 | 522 | def parse_ti(ea, tis): 523 | typedesc = ea 524 | flags = idaapi.get_full_flags(ea) 525 | if idaapi.is_code(flags): 526 | return 527 | 528 | try: 529 | classname = idaapi.demangle_name(idc.get_name(ea), idaapi.MNG_SHORT_FORM) 530 | classname = classname.removeprefix("class ") 531 | classname = classname.removeprefix("struct TypeDescriptor ") 532 | classname = classname.removesuffix(" `RTTI Type Descriptor'") 533 | except: 534 | print(f"[VTABLE STRUCTS] Invalid vtable name at {ea:#x}") 535 | return 536 | 537 | if classname in tis.keys(): 538 | return 539 | 540 | vtables = [] 541 | 542 | # Then figure out which xref is a/the COL 543 | for xref in idautils.XrefsTo(typedesc): 544 | ea = xref.frm 545 
| flags = idaapi.get_full_flags(ea) 546 | 547 | # Dynamic cast 548 | if idaapi.is_code(flags): 549 | continue 550 | 551 | name = idaapi.get_name(ea) 552 | # Class type descriptor and/or random global data 553 | # Kind of a hack but let's assume no one will rename these 554 | if name and (name.startswith("??_R1") or name.startswith("off_")): 555 | continue 556 | 557 | ea -= 4 558 | name = idaapi.get_name(ea) 559 | # Catchable types 560 | if name and name.startswith("__CT"): 561 | continue 562 | 563 | # COL 564 | ea -= 8 565 | workaround = False 566 | if idaapi.is_unknown(idaapi.get_full_flags(ea)): 567 | print(f"[VTABLE STRUCTS] Possible COL is unknown at {ea:#x}. This may be an unreferenced vtable. Trying workaround...") 568 | # This might be a bug with IDA, but sometimes the COL isn't analyzed 569 | # If there's still a reference, then we can still trace back 570 | # If there is a list of functions (or even just one), then it's probably a vtable, 571 | # but we'll still warn the user that it might be garbage 572 | refs = list(idautils.XrefsTo(ea)) 573 | if len(refs) == 1: 574 | vtable = refs[0].frm + ctypes.sizeof(ea_t) 575 | tryfunc = get_ptr(vtable + ctypes.sizeof(ea_t)) 576 | funcflags = idaapi.get_full_flags(tryfunc) 577 | if idaapi.is_func(funcflags): 578 | print(f" - Workaround successful. Please assure that {vtable:#x} is a vtable.") 579 | workaround = True 580 | 581 | if not workaround: 582 | print(" - Workaround failed. Skipping...") 583 | continue 584 | 585 | name = idaapi.get_name(ea) 586 | if not workaround and (not name or not name.startswith("??_R4")): 587 | print(f"[VTABLE STRUCTS] Invalid name at {ea:#x}. Possible unwind info. 
Ignoring...") 588 | continue 589 | 590 | # In 64-bit PEs, the COL references itself, remove this 591 | refs = list(idautils.XrefsTo(ea)) 592 | if idaapi.inf_is_64bit(): 593 | for n in range(len(refs)-1, -1, -1): 594 | if refs[n].frm == ea + RTTICompleteObjectLocator.pSelf.offset: 595 | del refs[n] 596 | 597 | # Now that we have the COL, we can use it to find the vtable that utilizes it and its thisoffs 598 | if len(refs) != 1: 599 | print(f"[VTABLE STRUCTS] Multiple vtables point to same COL - {name} at {ea:#x}") 600 | continue 601 | 602 | vtable = refs[0].frm + ctypes.sizeof(ea_t) 603 | thisoffs = idaapi.get_dword(ea + RTTICompleteObjectLocator.offset.offset) 604 | vtables.append((thisoffs, vtable)) 605 | 606 | # Can have RTTI without a vtable 607 | tis[classname] = {thisoffs: parse_vtable(vaddr) for thisoffs, vaddr in vtables} 608 | 609 | def string_method(type_info, vtabledata): 610 | for string in idautils.Strings(): 611 | sstr = str(string) 612 | if not sstr.startswith(".?AV"): 613 | continue 614 | 615 | ea = string.ea 616 | ea -= TypeDescriptor.name.offset 617 | trytinfo = rva_to_ea(idaapi.get_wide_dword(ea)) 618 | # This is a weird string that isn't a part of a type descriptor 619 | if trytinfo != type_info: 620 | continue 621 | 622 | parse_ti(ea, vtabledata) 623 | 624 | def read_ti_win(): 625 | # Step 1, get the vftable of type_info 626 | type_info = idc.get_name_ea_simple("??_7type_info@@6B@") 627 | if type_info == idc.BADADDR: 628 | # If type_info doesn't exist as a label, we might still be able to snipe it with the string method 629 | strings = list(idautils.Strings()) 630 | for s in strings: 631 | if str(s) == ".?AVtype_info@@": 632 | ea = s.ea - TypeDescriptor.name.offset 633 | type_info = rva_to_ea(idaapi.get_wide_dword(ea)) 634 | 635 | if type_info == idc.BADADDR: 636 | print("[VTABLE STRUCTS] type_info not found. 
Are you sure you're in a C++ binary?") 637 | return None 638 | 639 | vtabledata = {} 640 | 641 | # Step 2, get all xrefs to type_info 642 | # Get type descriptor 643 | for typedesc in idautils.XrefsTo(type_info): 644 | parse_ti(typedesc.frm, vtabledata) 645 | 646 | # In some cases, the IDA either fails to reference some type descriptors with type_info 647 | # Not exactly sure why, but it lists the ea of type_info as a "hash" when in reality it isn't 648 | # A workaround for this is to parse type descriptor strings (".?AV*"), load up their references, and 649 | # walk backwards to the start of what is supposed to be the type descriptor, and assure that 650 | # its DWORD is the type_info vtable 651 | # I only found this to be a problem in NMRIH, so it appears to be rare 652 | WaitBox.show("Performing string parsing") 653 | string_method(type_info, vtabledata) 654 | 655 | return vtabledata 656 | 657 | def read_vtables_win(): 658 | WaitBox.show("Parsing Windows typeinfo") 659 | data = read_ti_win() 660 | 661 | if data is None: 662 | return 663 | 664 | WaitBox.show("Creating structs") 665 | create_structs(data) 666 | 667 | def main(): 668 | os = get_os() 669 | try: 670 | if os == OS_Linux: 671 | read_vtables_linux() 672 | elif os == OS_Win: 673 | read_vtables_win() 674 | else: 675 | print(f"Unsupported OS?: {idaapi.get_file_type_name()}") 676 | idaapi.beep() 677 | 678 | if STRUCTS: 679 | print(f"Successfully imported {STRUCTS} virtual structures") 680 | else: 681 | print("No virtual structures imported") 682 | idaapi.beep() 683 | except: 684 | import traceback 685 | traceback.print_exc() 686 | print("Please file a bug report with supporting information at https://github.com/Scags/IDA-Scripts/issues") 687 | idaapi.beep() 688 | 689 | idaapi.end_type_updating(idaapi.UTP_STRUCT) 690 | WaitBox.hide() 691 | 692 | # import cProfile 693 | # cProfile.run("main()", "vtable_structs.prof") 694 | main() 695 | 
--------------------------------------------------------------------------------
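A note on the offset arithmetic in `parse_ti` above: `ea -= 4` followed by `ea -= 8` steps back 12 bytes from the type descriptor reference to the start of the Complete Object Locator (COL), and the 64-bit branch drops the COL's reference to itself. The sketch below reproduces that arithmetic outside IDA. It assumes the usual MSVC COL layout of 32-bit fields; `CompleteObjectLocatorSketch`, `col_start_from_typedesc_xref`, and `drop_self_reference` are hypothetical names used only for illustration, and the script's real `RTTICompleteObjectLocator` is defined outside this section and may differ.

```python
import ctypes


# Assumed layout of the MSVC RTTI Complete Object Locator. Every field is
# modelled as a 32-bit value; pSelf is only present in 64-bit images.
# Hypothetical stand-in for the script's RTTICompleteObjectLocator.
class CompleteObjectLocatorSketch(ctypes.Structure):
    _fields_ = [
        ("signature", ctypes.c_uint32),
        ("offset", ctypes.c_uint32),            # read as "thisoffs" in parse_ti
        ("cdOffset", ctypes.c_uint32),
        ("pTypeDescriptor", ctypes.c_uint32),   # where the xref to the type descriptor originates
        ("pClassDescriptor", ctypes.c_uint32),
        ("pSelf", ctypes.c_uint32),             # reference back to the COL start (64-bit only)
    ]


def col_start_from_typedesc_xref(xref_ea):
    """Step back from the pTypeDescriptor field to the start of the COL."""
    return xref_ea - CompleteObjectLocatorSketch.pTypeDescriptor.offset


def drop_self_reference(col_ea, xref_sources):
    """Remove the xref that comes from the COL's own pSelf field."""
    self_ea = col_ea + CompleteObjectLocatorSketch.pSelf.offset
    return [ea for ea in xref_sources if ea != self_ea]


if __name__ == "__main__":
    col = 0x1000
    xref = col + CompleteObjectLocatorSketch.pTypeDescriptor.offset
    print(hex(col_start_from_typedesc_xref(xref)))
    print([hex(ea) for ea in drop_self_reference(col, [col + 20, 0x2000])])
```

Using the ctypes field descriptors (`.offset`) rather than literal numbers keeps the arithmetic tied to the structure definition, which is the same approach the script takes for `pSelf` and `offset`.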