├── .gitattributes
├── .gitignore
├── LICENSE
├── README.md
├── distfromfunc.py
├── gamedata_checker.py
├── isgoodsig.py
├── makesig.py
├── makesigfromhere.py
├── nameresetter.py
├── netprop_importer.py
├── sigfind.py
├── sigsmasher.py
├── structfiller.py
├── symbolsmasher.py
├── vtable_io.py
└── vtable_structs.py
/.gitattributes:
--------------------------------------------------------------------------------
1 | # Auto detect text files and perform LF normalization
2 | * text=auto
3 |
--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
1 |
2 | .vscode/*
3 | iconbuster.py
4 | form_test.py
5 | *.bak
6 | capdisasm.py
7 | vtest*
8 | namecollisions.py
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 |
3 | Copyright (c) 2023 John Mascagni
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # IDA Scripts
2 | Some random IDA scripts I wrote
3 |
4 | ## V2.0
5 |
6 | These scripts were heavily modified on 8/16/2023. For a full writeup on the new changes, see [here](https://github.com/Scags/IDA-Scripts/pull/2).
7 |
8 | ### distfromfunc.py ###
9 |
10 | Gets the offset of the cursor address from the start of its containing function. Useful for byte patching.
11 |
12 | ### gamedata_checker.py ###
13 |
14 | Name says it all, but this verifies SourceMod gamedata files. This requires Valve's VDF library, install it with `pip install vdf`.
15 |
16 | Has a few quirks with it at the moment:
17 | - It does not support multi-line comments within gamedata files nor will it support multiple instances of `#default` keys. Parsing core SourceMod gamedata files is essentially verboten.
18 | - VTable functions that are stripped cannot be verified, obviously.
19 | - Function overloads tend to mess up VTable offset checking; e.g. `GiveNamedItem`.
20 | - Offset checking is variably difficult depending on naming conventions. If the gamedata key name is not named exactly the same as the function name, it will not be found; e.g. `OnTakeDamage` -> `CBaseEntity::OnTakeDamage` and `CTFPlayer::OnTakeDamage` -> `CBaseEntity::OnTakeDamage` but `TakeDamage` != `CBaseEntity::OnTakeDamage`.
21 |
22 |
23 | ### isgoodsig.py ###
24 |
25 | Takes a SourceMod (or any) signature input and detects if it's unique or not.
26 |
27 |
28 | ### makesig.py ###
29 |
30 | Python translation of [makesig](https://github.com/alliedmodders/sourcemod/blob/master/tools/ida_scripts/makesig.idc).
31 |
32 | Optionally, install pyperclip with `pip install pyperclip` to automatically copy any signatures to your clipboard when running.
33 |
34 |
35 | ### makesigfromhere.py ###
36 |
37 | Creates a signature from the cursor offset. Useful for byte patching.
38 |
39 |
40 | ### nameresetter.py ###
41 |
42 | Resets the name of every function in IDA's database. Does not include library or external functions.
43 |
44 |
45 | ### netprop_importer.py ###
46 |
47 | Imports netprops and owner classes as structs and struct members into IDA's DB. Only works with the XML file provided by sm_dump_netprops_xml. Datatables only work most of the time. You should also use the proper netprop dump for your OS, or else you will be very confused.
48 |
49 |
50 | ### sigfind.py ###
51 |
52 | Takes a SourceMod (or any) signature and jumps you to the function it's for. If it's a bad signature, then you won't go anywhere.
53 |
54 |
55 | ### sigsmasher.py ###
56 |
57 | Makes SourceMod ready signatures for every function in IDA's database. Yes, this will take a long, long time. Requires PyYAML so you'll need to `pip install pyyaml`. You have the option of only generating signatures for typed functions so this works very well with the Symbol Smasher.
58 |
59 |
60 | ### structfiller.py ###
61 |
62 | Sanitizes undefined struct members as if IDA had parsed a header file. Each structure will have its undefined members replaced with a one-byte-sized member in order to prevent pseudocode from falling apart. Only makes sense to use it after running the netprop importer.
63 |
64 |
65 | ### symbolsmasher.py ###
66 |
67 | Renames functions in a stripped library database based on unique string cross-references.
68 |
69 | Running the script presents 2 options: you can read and export data from the current database, or you can import and write data into it.
70 |
71 | If you're on a symbol library, you should run it in read mode and export it to a file. This file is what is used to import back into a stripped binary.
72 |
73 | When on Windows or another stripped database, run the script in write mode and select the file you exported earlier. A solid amount of functions should be typed within a few seconds.
74 |
75 | This works well with the Signature Smasher. However, to save you an hour or so, I publicly host dumps of most Source games [here](http://scag.site.nfoservers.com/sigdump).
76 |
77 | ### vtable_io.py ###
78 |
79 | Imports and exports virtual tables. Run it through a Linux binary to export to a file, then run it through a Windows binary to import those VTables into the database. This is similar to [Asherkin's VTable Dumper](https://asherkin.github.io/vtable/) but doesn't suffer from the pitfalls of multiple inheritance. Since it doesn't have those liabilities, its function typing will almost always be perfect.
80 |
81 | #### Features ####
82 | This script is slightly heavy and has features that warrant explanation. Features can be freely enabled/disabled in the popup form that opens when you run the script. Desired feature options are kept in the IDA registry and will persist.
83 |
84 | **Parse type strings**
85 | - Sometimes IDA fails to properly analyze Windows RTTI Type Descriptor objects. Because of this, there won't be a reference from certain type descriptors to std::type_info, which is required for the script to work.
86 | - If this feature is enabled, then the string names of the type descriptors will be parsed in order to discover the unreferenced type descriptors. This will be done alongside the normal script function.
87 | - If you notice that there are multiple functions of the same name or classes that have virtual functions that aren't typed, consider enabling this.
88 | - It should be harmless to keep on regardless, but it is disabled by default.
89 | - This problem only seemed to be present in NMRiH.
90 |
91 | **Skip vtable size mismatches**
92 | - The script is *almost* perfect. On rare occasion, it will fail to properly prepare a Windows translation of a Linux virtual table.
93 | - If this option is enabled, then any size mismatches will forego function typing.
94 | - Enabled by default.
95 |
96 | **Comment reused functions**
97 | - Windows oftentimes optimizes shorter and simpler functions and reuses them across multiple virtual tables. This means that it would be redundant to rename these functions over and over again.
98 | - If enabled, virtual table declarations instead emplace a comment on the function's reference.
99 | - Enabled by default.
100 |
101 | **Export options**
102 | - Should be self-explanatory, but the script is able to export the Linux and Windows virtual tables to a file.
103 | - This is a .json file and is organized to be readable.
104 | - The format of the export file is as follows:
105 | ```json
106 | "classname"
107 | {
108 | "[this-offset] vtable-offset function-name"
109 | }
110 | ```
111 | - Linux offsets are denoted with `L` and Windows with `W`. If the function is not present in a certain OS, then that index is empty.
112 | - Exporting is optional, and if it is not enabled, then the export file path option can be safely ignored.
113 |
114 | ### vtable_structs.py ###
115 |
116 | Runs through virtual tables and creates structs for them. Use at your own risk since it screws up referencing members through pseudocode.
--------------------------------------------------------------------------------
/distfromfunc.py:
--------------------------------------------------------------------------------
1 | import idc
2 | import idaapi
3 |
4 | def main():
5 |     addr = idaapi.get_screen_ea()
6 |     if addr == idc.BADADDR:
7 |         print("Make sure you are in a function!")
8 |         idaapi.beep()
9 |         return
10 |
11 |     func = idaapi.get_func(addr)
12 |     if func is None:
13 |         print("Make sure you are in a function!")
14 |         idaapi.beep()
15 |         return
16 |
17 |     funcname = idaapi.get_name(func.start_ea)
18 |     demangled = idaapi.demangle_name(funcname, idc.get_inf_attr(idc.INF_SHORT_DN))
19 |     print(f"{demangled or funcname}:")
20 |     print(f"Offset from {func.start_ea:08X} to {addr:08X} = {addr - func.start_ea} ({addr - func.start_ea:#X})")
21 |
22 | main()
--------------------------------------------------------------------------------
/gamedata_checker.py:
--------------------------------------------------------------------------------
1 | import idautils
2 | import idaapi
3 | import idc
4 | import vdf
5 |
6 | from sys import version_info
7 |
8 | OS_Linux = 0
9 | OS_Win = 1
10 |
11 | def get_os():
12 | ftype = idaapi.get_file_type_name()
13 | if "PE" in ftype:
14 | return OS_Win
15 | elif "ELF" in ftype:
16 | return OS_Linux
17 | return -1
18 |
19 | def osstr(os):
20 | if os == OS_Linux:
21 | return "linux"
22 | elif os == OS_Win:
23 | return "windows"
24 | return "unknown"
25 |
26 | def checksig(sig):
27 | if sig[0] == '@':
28 | # Just check for existence of this mangled name
29 | return idc.get_name_ea_simple(sig[1:]) != idc.BADADDR
30 |
31 | sig = sig.replace(r"\x", " ").replace("2A", "?").replace("2a", "?").replace("\\", "").strip()
32 |
33 | # Get the first segment that is executable to use its addresses for parse_binpat_str
34 | endea = idc.BADADDR
35 | for segea in idautils.Segments():
36 | s = idaapi.getseg(segea)
37 | if s.perm & idaapi.SEGPERM_EXEC:
38 | segstart = segea
39 | # Set the end ea to the end of the last executable segment
40 | # Speed isn't as important in this script, so reading any extra X
41 | # segments is fine
42 | if endea == idc.BADADDR or endea < segstart + s.size():
43 | endea = segstart + s.size()
44 |
45 | count = 0
46 | addr = 0
47 | addr = idaapi.find_binary(addr, endea, sig, 0, idc.SEARCH_DOWN|idc.SEARCH_NEXT)
48 | while count < 2 and addr != idc.BADADDR:
49 | count = count + 1
50 | if count > 1:
51 | break
52 | addr = idaapi.find_binary(addr, endea, sig, 0, idc.SEARCH_DOWN|idc.SEARCH_NEXT)
53 |
54 | return count == 1
55 |
56 | # bin_search3 breaks after 15 or so bytes or something, idk man
57 | # binpat = idaapi.compiled_binpat_vec_t()
58 | # idaapi.parse_binpat_str(binpat, segstart, sig, 16, idaapi.get_default_encoding_idx(idaapi.get_encoding_bpu_by_name("UTF-8")))
59 |
60 | # count = 0
61 | # addr = 0
62 | # addr, _ = idaapi.bin_search3(addr, endea, binpat, idaapi.BIN_SEARCH_FORWARD)
63 | # while addr != idc.BADADDR:
64 | # count += 1
65 | # if count > 1:
66 | # break
67 |
68 | # # +1 because the search finds itself
69 | # addr, _ = idaapi.bin_search3(addr + 1, endea, binpat, idaapi.BIN_SEARCH_FORWARD)
70 |
71 | # return count == 1
72 |
73 | def get_bcompat_items(d):
74 | return d.iteritems() if version_info[0] <= 2 else d.items()
75 |
76 | # Unfortunately I don't care too much about overtly complex gamedata files
77 | # If you have multiple #default's in you first subsection or you have #default
78 | # anywhere else other than that first subsection, you're SOL. Sorry Silvers :c
79 | def get_gamedir(kv):
80 | # If we've got multiple games supported, let's just ask
81 | if len(kv.items()) > 1:
82 | gamedir = idaapi.ask_str("", 0, "There are multiple games supported by this file. Which game directory is this for?")
83 | # Not in the basic game shit, check for support in default
84 | if gamedir is not None and gamedir not in kv.keys():
85 | default = kv.get("#default")
86 | # There's a default entry, check for supported
87 | if default:
88 | supported = kv.get("#supported")
89 | if supported:
90 | if gamedir in supported.values():
91 | return gamedir
92 | return ""
93 | return "#default"
94 | return ""
95 | else:
96 | # 1 item, see if it's a default
97 | gamedir = list(kv.keys())[0]
98 | if gamedir == "#default":
99 | default = kv["#default"]
100 | # If it has multiple supports, check and see if we're in there
101 | supported = kv.get("#supported")
102 | if supported:
103 | if len(supported.items()) > 1:
104 | gamedir = idaapi.ask_str("", 0, "There are multiple games supported by this file. Which game directory is this for?")
105 | if gamedir is not None and gamedir in default["#supported"].values():
106 | return gamedir
107 | return ""
108 | return list(supported.values())[0]
109 | return "#default"
110 |
111 | return gamedir
112 |
113 | def get_voffs(name):
114 | os = get_os()
115 | if os == OS_Linux:
116 | mangled = "_ZTV{}{}".format(len(name), name)
117 | offset = 8
118 | else:
119 | mangled = "??_7{}@@6B@".format(name)
120 | offset = 0
121 |
122 | addr = idc.get_name_ea_simple(mangled)
123 | if addr != idc.BADADDR:
124 | addr += offset
125 | return addr
126 |
127 | def read_vtable(funcname, ea):
128 | funcs = {}
129 | offset = 0
130 | while ea != idc.BADADDR:
131 | if idaapi.inf_is_64bit():
132 | offs = idaapi.get_qword(ea)
133 | else:
134 | offs = idaapi.get_dword(ea)
135 |
136 | if not idaapi.is_code(idaapi.get_full_flags(offs)):
137 | break
138 |
139 | name = idc.get_name(offs, idaapi.GN_VISIBLE)
140 | demangled = idc.demangle_name(name, idc.get_inf_attr(idc.INF_SHORT_DN))
141 | if demangled == None:
142 | demangled = name
143 |
144 | if "(" in demangled:
145 | demangled = demangled[:demangled.find("(")]
146 | funcs[demangled.lower()] = offset
147 |
148 | offset += 1
149 | ea = idaapi.next_head(ea, idc.BADADDR)
150 |
151 | # We've got a list of function names, let's do this really shittily because idk any other way
152 |
153 | # This is a good programmer who makes their gamedata the proper way :)
154 | offs = funcs.get(funcname.lower(), -1)
155 | if offs != -1:
156 | return offs
157 |
158 | # Often done but sometimes there are subclass types thrown in, save those too
159 | if "::" in funcname:
160 | funcname = funcname[funcname.find("::")+2:]
161 |
162 | # Try by exact function name
163 | funcnames = {}
164 | for key, value in get_bcompat_items(funcs):
165 | # Function overloads can fuck right off
166 | s = key[key.find("::")+2:].lower() if "::" in key else key.lower()
167 | funcnames[s.lower()] = value
168 |
169 | offs = funcnames.get(funcname.lower(), -1)
170 | # Second best way, exact function name
171 | if offs != -1:
172 | return offs
173 |
174 | return -1
175 | # Anything else near here is either some random mem offset or some other crap
176 | # possibilities = [key for key in funcnames.keys() if funcname in key]
177 | # return [found for found in funcnames[x] for x in possibilities]
178 |
179 | # So we've a few options with finding appropriate vtable offsets
180 | # Option 1: Check and see if they use the optimal naming sequence "Type::Function" and revel in that
181 | # If we can't deduce that exactly, try option 2
182 | # Option 2: They must've used just the function name, run through every function that has a name like that
183 | # and perform option 1 on each
184 | # Windows can suck a wiener on this one
185 | def try_get_voffset(funcname):
186 | if "(" in funcname:
187 | funcname = funcname[:funcname.find("(")]
188 | if "::" in funcname:
189 | # Option 1
190 | typename = funcname[:funcname.find("::")]
191 | voffs = get_voffs(typename)
192 | offs = -1
193 | if voffs != idc.BADADDR:
194 | offs = read_vtable(funcname, voffs)
195 | if offs != -1:
196 | return offs
197 |
198 | funcname = funcname[funcname.find("::")+2:]
199 |
200 | # Let's chug along all of these functions, woohoo for option 2!
201 | for func in idautils.Functions():
202 | name = idc.get_name(func, idaapi.GN_VISIBLE)
203 | if not name or funcname not in name: # funcname should only be a plain function decl, so it would be unfettered in a mangled name
204 | continue
205 |
206 | demangled = idc.demangle_name(name, idc.get_inf_attr(idc.INF_SHORT_DN))
207 | if demangled == None:
208 | continue
209 |
210 | demname = demangled
211 | if "::" in demname:
212 | demname = demname[demname.find("::")+2:]
213 | if "(" in demname:
214 | demname = demname[:demname.find("(")]
215 |
216 | if funcname == demname: # Here's an exact match, let's get the type name then read the vtable
217 | if "::" not in demangled: # Okay, so someone somewhere is an idiot and managed to provide an offset name that is the
218 | continue # same name as some non-class function and this will manage to catch that
219 |
220 | typename = demangled[:demangled.find("::")]
221 | voffs = get_voffs(typename)
222 | if voffs != idc.BADADDR:
223 | offs = read_vtable(funcname, voffs)
224 | if offs != -1:
225 | return offs
226 |
227 | return -1 # Your naming conventions suck and you should feel bad. Or this is Windows and you should still feel bad
228 |
229 | def main():
230 | kv = None
231 | filereq = idaapi.ask_file(0, "*.txt", "Select a gamedata file")
232 | if filereq is None:
233 | return
234 |
235 | # Try and capture the huge exception that happens if there are multi-line comments
236 | # Why does vdfparse print the entire file? Lol
237 | try:
238 | with open(filereq) as f:
239 | kv = vdf.load(f)
240 | except Exception as e:
241 | idaapi.warning("Could not load file!\nSee console for details")
242 | import traceback
243 | traceback.print_exception(type(e), e, e.__traceback__)
244 | if "vdf.parse: unexpected EOF" in str(e):
245 | print("[Gamedata Checker] This is likely due to multi-line comments in the gamedata file. Try removing them and try again")
246 | return
247 |
248 | if kv == None:
249 | idaapi.warning("Could not load file!")
250 | return
251 |
252 | kv = list(kv.values())[0]
253 | os = get_os()
254 | gamedir = get_gamedir(kv)
255 | if not gamedir:
256 | idaapi.warning("Could not find game directory in file")
257 | return
258 |
259 | kv = kv[gamedir]
260 | found = {
261 | "Signatures": {},
262 | "Offsets": {}
263 | }
264 |
265 | signatures = kv.get("Signatures")
266 | if signatures:
267 | for name, handle in signatures.items():
268 | s = handle.get(osstr(os))
269 | if s:
270 | found["Signatures"][name] = checksig(s)
271 |
272 | offsets = kv.get("Offsets")
273 | if offsets:# and os != "windows":
274 | for name, handle in offsets.items():
275 | offset = handle.get(osstr(os), -1)
276 | if offset != -1:
277 | found["Offsets"][name] = [offset, try_get_voffset(name)]
278 |
279 | checkmark = u"\u2713".encode("utf8") if version_info[0] <= 2 else "✓"
280 |
281 | # Format the output string so it's pretty
282 | try:
283 | maxlen = max([len(key) for key in found["Signatures"].keys()])
284 | except:
285 | maxlen = 0
286 | if maxlen:
287 | # Align maxlen to 4
288 | if maxlen % 4 != 0:
289 | maxlen += 4 - (maxlen % 4)
290 |
291 | print("Signatures:")
292 | for key, value in get_bcompat_items(found["Signatures"]):
293 | print(f"\t{key:{maxlen}}{checkmark if value else 'INVALID'}")
294 |
295 | try:
296 | maxlen = max([len(key) for key in found["Offsets"].keys()])
297 | except:
298 | maxlen = 0
299 | if maxlen:
300 | # Align maxlen to 4
301 | if maxlen % 4 != 0:
302 | maxlen += 4 - (maxlen % 4)
303 |
304 | # Trial and error and trial and error and trial and
305 | print(f"Offsets:{'Gamedata':>{maxlen + 9}}{'Current':>12}{'Status':>12}")
306 | for key, value in get_bcompat_items(found["Offsets"]):
307 | s = f"\t{key:{maxlen}}"
308 | foundval = value[1]
309 | status = checkmark
310 | if isinstance(value[1], list):
311 | status = checkmark if value[0] in value[1] else 'X'
312 | elif int(value[0]) != int(value[1]):
313 | status = 'X'
314 | if value[1] == -1:
315 | foundval = "N/A"
316 |
317 | s += f"{value[0]:<12} {foundval:<12} {status:<12}"
318 |
319 | print(s)
320 |
321 | main()
--------------------------------------------------------------------------------
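The vtable symbol lookup in `get_voffs` above can be sketched outside IDA as plain string formatting. This is a minimal illustration, not part of the repository: the helper name `vtable_symbol` is hypothetical, and, like the script, it assumes a simple class name with no namespaces or templates.

```python
def vtable_symbol(classname: str, os_name: str) -> str:
    """Build the mangled vtable symbol name that get_voffs looks up.

    Hypothetical helper for illustration; only handles plain,
    non-namespaced, non-templated class names.
    """
    if os_name == "linux":
        # Itanium ABI: _ZTV + <length of name> + <name>
        return "_ZTV{}{}".format(len(classname), classname)
    if os_name == "windows":
        # MSVC: ??_7 + <name> + @@6B@
        return "??_7{}@@6B@".format(classname)
    raise ValueError("unsupported OS: {!r}".format(os_name))


print(vtable_symbol("CBaseEntity", "linux"))
print(vtable_symbol("CTFPlayer", "windows"))
```

On Linux the script then skips 8 bytes past the symbol before reading function pointers, while on Windows it reads from the symbol address directly.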
/isgoodsig.py:
--------------------------------------------------------------------------------
1 | import idc
2 | import idaapi
3 | import idautils
4 |
5 | def main():
6 |     bytesig = idaapi.ask_str("", 0, "Insert signature: ")
7 |     if not bytesig:  # Dialog was cancelled or left empty
8 |         return
9 |     sig = bytesig.replace(r"\x", " ").replace("2A", "?").replace("2a", "?").strip()
10 |     count = checksig(sig)
11 |     if not count:
12 |         print(r"INVALID: {}".format(bytesig))
13 |         print("Could not find any matching signatures for input")
14 |     elif count == 1:
15 |         print(r"VALID: {}".format(bytesig))
16 |     else:
17 |         print(r"INVALID: {}".format(bytesig))
18 |         print("Found {} instances of input signature".format(count))
19 |
20 | def checksig(sig):
21 |     # Get the first segment that is executable to use its addresses for parse_binpat_str
22 |     endea = idc.BADADDR
23 |     for segea in idautils.Segments():
24 |         s = idaapi.getseg(segea)
25 |         if s.perm & idaapi.SEGPERM_EXEC:
26 |             segstart = segea
27 |             # Set the end ea to the end of the last executable segment
28 |             # Speed isn't as important in this script, so reading any extra X
29 |             # segments is fine
30 |             if endea == idc.BADADDR or endea < segstart + s.size():
31 |                 endea = segstart + s.size()
32 |             break
33 |
34 |     count = 0
35 |     addr = 0
36 |     addr = idaapi.find_binary(addr, endea, sig, 0, idc.SEARCH_DOWN|idc.SEARCH_NEXT)
37 |     while addr != idc.BADADDR:
38 |         count = count + 1
39 |         addr = idaapi.find_binary(addr, endea, sig, 0, idc.SEARCH_DOWN|idc.SEARCH_NEXT)
40 |
41 |     return count
42 |
43 |     # bin_search3 breaks after 15 or so bytes or something, idk man
44 |     # binpat = idaapi.compiled_binpat_vec_t()
45 |     # idaapi.parse_binpat_str(binpat, segstart, sig, 16, idaapi.get_default_encoding_idx(idaapi.get_encoding_bpu_by_name("UTF-8")))
46 |
47 |     # count = 0
48 |     # addr = 0
49 |     # addr, _ = idaapi.bin_search3(addr, endea, binpat, idaapi.BIN_SEARCH_FORWARD)
50 |     # while addr != idc.BADADDR:
51 |     #     count += 1
52 |
53 |     #     # +1 because the search finds itself
54 |     #     addr, _ = idaapi.bin_search3(addr + 1, endea, binpat, idaapi.BIN_SEARCH_FORWARD)
55 |
56 |     # return count
57 |
58 | main()
--------------------------------------------------------------------------------
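The signature clean-up that isgoodsig.py performs before searching can be sketched without IDA. This is a standalone illustration under stated assumptions: the helper name `sm_sig_to_ida_pattern` is hypothetical, it only handles `\xNN` escaped byte strings, and unlike the script it normalizes hex digits to uppercase.

```python
import re


def sm_sig_to_ida_pattern(sig: str) -> str:
    """Convert a SourceMod-style byte string into a space-separated search pattern.

    Hypothetical helper for illustration; not part of the repository.
    """
    # Collect every two-digit hex byte that follows a literal "\x".
    hex_bytes = re.findall(r"\\x([0-9A-Fa-f]{2})", sig)
    # The scripts treat byte 2A as a wildcard and search for it as '?'.
    return " ".join("?" if b.upper() == "2A" else b.upper() for b in hex_bytes)


print(sm_sig_to_ida_pattern(r"\x55\x8B\xEC\x2A\x2A\x5D"))
```

Matching whole bytes with a regular expression avoids touching anything that is not a `\xNN` escape, at the cost of silently ignoring malformed input.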
/makesig.py:
--------------------------------------------------------------------------------
1 | import idc
2 | import idautils
3 | import idaapi
4 |
5 | def print_wildcards(count):
6 | return "?" * count
7 |
8 | def is_good_sig(sig, mask):
9 | search = " ".join('?' if m == '?' else b for b, m in zip(sig.strip().split(), mask))
10 |
11 | # Get the first segment that is executable to use its addresses for parse_binpat_str
12 | endea = idc.BADADDR
13 | for segea in idautils.Segments():
14 | s = idaapi.getseg(segea)
15 | if s.perm & idaapi.SEGPERM_EXEC:
16 | segstart = segea
17 | # Set the end ea to the end of the last executable segment
18 | # Speed isn't as important in this script, so reading any extra X
19 | # segments is fine
20 | if endea == idc.BADADDR or endea < segstart + s.size():
21 | endea = segstart + s.size()
22 |
23 | count = 0
24 | addr = 0
25 | # Ever just deprecate something and provide 0 documentation on what to use instead?
26 | addr = idaapi.find_binary(addr, endea, search, 0, idc.SEARCH_DOWN|idc.SEARCH_NEXT)
27 | while addr != idc.BADADDR:
28 | count = count + 1
29 | if count > 1:
30 | break
31 | addr = idaapi.find_binary(addr, endea, search, 0, idc.SEARCH_DOWN|idc.SEARCH_NEXT)
32 |
33 | return count == 1
34 |
35 | # binpat = idaapi.compiled_binpat_vec_t()
36 | # idaapi.parse_binpat_str(binpat, segstart, search, 16, idaapi.get_encoding_bpu_by_name("UTF-8"))
37 |
38 | # count = 0
39 | # addr = 0
40 | # addr, _ = idaapi.bin_search3(addr, endea, binpat, idaapi.BIN_SEARCH_FORWARD)
41 |
42 | # while addr != idc.BADADDR:
43 | # count += 1
44 | # if count > 1:
45 | # break
46 |
47 | # # +1 because the search finds itself
48 | # addr, _ = idaapi.bin_search3(addr + 1, endea, binpat, idaapi.BIN_SEARCH_FORWARD)
49 |
50 |
51 | # return count == 1
52 |
53 | def makesig(ea, sz = -1):
54 | name = idc.get_name(ea, idaapi.GN_VISIBLE)
55 |
56 | sig = ""
57 | mask = ""
58 | found = 0
59 | done = 0
60 |
61 | addr = ea
62 | end = ea + sz if sz != -1 else idc.BADADDR
63 | while addr != idc.BADADDR and (sz == -1 or addr < ea + sz):
64 | info = idaapi.insn_t()
65 | if not idaapi.decode_insn(info, addr):
66 | print(f"Failed to decode instruction at {addr:#X}?")
67 | idaapi.beep()
68 | return
69 |
70 | sig += " ".join(f"{idaapi.get_byte(addr+i):02X}" for i in range(info.size)) + " "
71 |
72 | done = 0
73 | if info.Op1.type in (idaapi.o_near, idaapi.o_far):
74 | insnsz = 2 if idaapi.get_byte(addr) == 0x0F else 1
75 | mask += f"{'x' * insnsz}{print_wildcards(info.size - insnsz)}"
76 | done = 1
77 | elif info.Op1.type == idaapi.o_reg and info.Op2.type == idaapi.o_mem and info.Op2.addr != idc.BADADDR:
78 | mask += f"{'x' * info.Op2.offb}{print_wildcards(info.size - info.Op2.offb)}"
79 | done = 1
80 |
81 | if not done: # Unknown, just wildcard addresses
82 | i = 0
83 | while i < info.size:
84 | loc = addr + i
85 | if ((idc.get_fixup_target_type(loc) & 0x0F) == idaapi.FIXUP_OFF32):
86 | mask += print_wildcards(4)
87 | i += 3
88 | elif (idc.get_fixup_target_type(loc) & 0x0F) == idaapi.FIXUP_OFF64:
89 | mask += print_wildcards(8)
90 | i += 7
91 | else:
92 | mask += 'x'
93 |
94 | i += 1
95 |
96 | if is_good_sig(sig, mask):
97 | found = 1
98 | break
99 |
100 | addr = idaapi.next_head(addr, end)
101 |
102 | if found == 0:
103 | print(sig)
104 | print("Ran out of bytes to create unique signature.")
105 | idaapi.beep()
106 | return
107 |
108 | sig = sig.strip()
109 | csig = r"\x" + sig.replace(" ", r"\x")
110 |
111 | align = len("Wildcarded Bytes: ")
112 | wildcarded = f"{'Wildcarded Bytes:':<{align}} {' '.join('?' if m == '?' else b for b, m in zip(sig.split(), mask))}\n" if "?" in mask else ""
113 | smsig = r"\x" + r"\x".join("2A" if m == "?" else b for b, m in zip(sig.split(), mask))
114 |
115 | print("==================================================")
116 | print(
117 | f"Signature for {name}:\n"
118 | f"{'Mask:':<{align}} {mask}\n"
119 | f"{'Bytes:':<{align}} {sig}\n"
120 | f"{wildcarded}"
121 | f"{'Byte String:':<{align}} {csig}\n"
122 | f"{'SourceMod':<{align}} {smsig}"
123 | )
124 |
125 | try:
126 | import pyperclip
127 | pyperclip.copy(smsig)
128 | print(f"SourceMod signature copied to clipboard")
129 | except:
130 | print("'pip install pyperclip' to automatically copy the SourceMod signature to your clipboard")
131 | return csig
132 |
133 | def main():
134 | addr = idaapi.get_screen_ea()
135 | func = idaapi.get_func(addr)
136 | if addr == idc.BADADDR or func is None:
137 | print("Make sure you are in a function!")
138 | idaapi.beep()
139 | return
140 |
141 | makesig(func.start_ea, func.end_ea - func.start_ea)
142 |
143 | main()
--------------------------------------------------------------------------------
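The last formatting step in `makesig` above, turning a byte string and its mask into a SourceMod signature, can be sketched as a pure function. This is a minimal illustration: the name `to_sourcemod_sig` and the length check are assumptions added here, not part of the repository.

```python
def to_sourcemod_sig(sig: str, mask: str) -> str:
    """Combine space-separated hex bytes and an 'x'/'?' mask into a SourceMod signature.

    Hypothetical helper for illustration; mirrors the smsig expression in makesig.
    """
    byte_list = sig.split()
    # Assumption for this sketch: reject mismatched input instead of truncating.
    if len(byte_list) != len(mask):
        raise ValueError("mask must have exactly one character per byte")
    # Wildcarded bytes are emitted as 2A; all other bytes are kept verbatim.
    return r"\x" + r"\x".join("2A" if m == "?" else b for b, m in zip(byte_list, mask))


print(to_sourcemod_sig("E8 11 22 33 44 55", "x????x"))
```

Here the four operand bytes of the call are wildcarded because they hold a relative address that changes between builds, while the opcode and the following byte stay fixed.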
/makesigfromhere.py:
--------------------------------------------------------------------------------
1 | import idc
2 | import idautils
3 | import idaapi
4 |
5 | def get_dt_size(dtype):
6 | return {
7 | idaapi.dt_byte: 1,
8 | idaapi.dt_word: 2,
9 | idaapi.dt_dword: 4,
10 | idaapi.dt_float: 4,
11 | idaapi.dt_double: 8,
12 | }.get(dtype, -1)
13 |
14 | def print_wildcards(count):
15 | return "?" * count
16 |
17 | def is_good_sig(sig, mask):
18 | search = " ".join('?' if m == '?' else b for b, m in zip(sig.strip().split(), mask))
19 |
20 | # Get the first segment that is executable to use its addresses for parse_binpat_str
21 | endea = idc.BADADDR
22 | for segea in idautils.Segments():
23 | s = idaapi.getseg(segea)
24 | if s.perm & idaapi.SEGPERM_EXEC:
25 | segstart = segea
26 | # Set the end ea to the end of the last executable segment
27 | # Speed isn't as important in this script, so reading any extra X
28 | # segments is fine
29 | if endea == idc.BADADDR or endea < segstart + s.size():
30 | endea = segstart + s.size()
31 |
32 | count = 0
33 | addr = 0
34 | addr = idaapi.find_binary(addr, endea, search, 0, idc.SEARCH_DOWN|idc.SEARCH_NEXT)
35 | while addr != idc.BADADDR:
36 | count = count + 1
37 | if count > 1:
38 | break
39 | addr = idaapi.find_binary(addr, endea, search, 0, idc.SEARCH_DOWN|idc.SEARCH_NEXT)
40 |
41 | return count == 1
42 |
43 | # binpat = idaapi.compiled_binpat_vec_t()
44 | # idaapi.parse_binpat_str(binpat, segstart, search, 16, idaapi.get_encoding_bpu_by_name("UTF-8"))
45 |
46 | # count = 0
47 | # addr = 0
48 | # addr, _ = idaapi.bin_search3(addr, endea, binpat, idaapi.BIN_SEARCH_FORWARD)
49 |
50 | # while addr != idc.BADADDR:
51 | # count += 1
52 | # if count > 1:
53 | # break
54 |
55 | # # +1 because the search finds itself
56 | # addr, _ = idaapi.bin_search3(addr + 1, endea, binpat, idaapi.BIN_SEARCH_FORWARD)
57 |
58 |
59 | # return count == 1
60 |
61 | def makesig(ea, sz=-1):
62 | func = idaapi.get_func(ea)
63 | name = idc.get_name(func.start_ea, idaapi.GN_VISIBLE)
64 |
65 | sig = ""
66 | mask = ""
67 | found = 0
68 | done = 0
69 |
70 | addr = ea
71 | end = ea + sz if sz != -1 else idc.BADADDR
72 | while addr != idc.BADADDR and (sz == -1 or addr < ea + sz):
73 | info = idaapi.insn_t()
74 | if not idaapi.decode_insn(info, addr):
75 | print(f"Failed to decode instruction at {addr:#X}?")
76 | idaapi.beep()
77 | return
78 |
79 | sig += " ".join(f"{idaapi.get_byte(addr+i):02X}" for i in range(info.size)) + " "
80 |
81 | done = 0
82 | if info.Op1.type in (idaapi.o_near, idaapi.o_far):
83 | insnsz = 2 if idaapi.get_byte(addr) == 0x0F else 1
84 | mask += f"{'x' * insnsz}{print_wildcards(info.size - insnsz)}"
85 | done = 1
86 | elif info.Op1.type == idaapi.o_reg and info.Op2.type == idaapi.o_mem and info.Op2.addr != idc.BADADDR:
87 | mask += f"{'x' * info.Op2.offb}{print_wildcards(info.size - info.Op2.offb)}"
88 | done = 1
89 |
90 | if not done: # Unknown, just wildcard addresses
91 | i = 0
92 | while i < info.size:
93 | loc = addr + i
94 | if ((idc.get_fixup_target_type(loc) & 0x0F) == idaapi.FIXUP_OFF32):
95 | mask += print_wildcards(4)
96 | i += 3
97 | elif (idc.get_fixup_target_type(loc) & 0x0F) == idaapi.FIXUP_OFF64:
98 | mask += print_wildcards(8)
99 | i += 7
100 | else:
101 | mask += 'x'
102 |
103 | i += 1
104 |
105 | if is_good_sig(sig, mask):
106 | found = 1
107 | break
108 |
109 | addr = idaapi.next_head(addr, end)
110 |
111 | if found == 0:
112 | print(sig)
113 | print("Ran out of bytes to create unique signature.")
114 | idaapi.beep()
115 | return
116 |
117 | sig = sig.strip()
118 | csig = r"\x" + sig.replace(" ", r"\x")
119 |
120 | align = len("Wildcarded Bytes: ")
121 | wildcarded = f"{'Wildcarded Bytes:':<{align}} {' '.join('?' if m == '?' else b for b, m in zip(sig.split(), mask))}\n" if "?" in mask else ""
122 | smsig = r"\x" + r"\x".join("2A" if m == "?" else b for b,
123 | m in zip(sig.split(), mask))
124 |
125 | print("==================================================")
126 | print(
127 | f"Signature for {name} + {ea - func.start_ea} ({ea - func.start_ea:#x}):\n"
128 | f"{'Mask:':<{align}} {mask}\n"
129 | f"{'Bytes:':<{align}} {sig}\n"
130 | f"{wildcarded}"
131 | f"{'Byte String:':<{align}} {csig}\n"
132 | f"{'SourceMod:':<{align}} {smsig}"
133 | )
134 |
135 | try:
136 | import pyperclip
137 | pyperclip.copy(smsig)
138 | print("SourceMod signature copied to clipboard")
139 | except Exception:
140 | print("'pip install pyperclip' to automatically copy the SourceMod signature to your clipboard")
141 | return csig
142 |
143 | def main():
144 | ea = idaapi.get_screen_ea()
145 | func = idaapi.get_func(ea)
146 | if ea == idc.BADADDR or func is None:
147 | print("Make sure you are in a function!")
148 | idaapi.beep()
149 | return
150 |
151 | sz = func.end_ea - ea
152 |
153 | makesig(ea, sz)
154 |
155 | main()
--------------------------------------------------------------------------------
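The SourceMod output step at the end of makesig() in makesigfromhere.py can be tried outside IDA. The following is a minimal sketch, assuming the same inputs the script builds (space-separated hex bytes plus a mask with one `x` or `?` per byte); the helper name `to_sourcemod_sig` is hypothetical and not part of the repository.

```python
def to_sourcemod_sig(byte_str: str, mask: str) -> str:
    """Render bytes as \\xNN escapes, replacing masked bytes with \\x2A."""
    tokens = byte_str.split()
    if len(tokens) != len(mask):
        # The script builds both strings in lockstep; this guard is only for standalone use
        raise ValueError("mask must have exactly one character per byte")
    return "".join(r"\x" + ("2A" if m == "?" else b) for b, m in zip(tokens, mask))


if __name__ == "__main__":
    # A call instruction whose 4-byte relative target is wildcarded
    print(to_sourcemod_sig("55 8B EC E8 12 34 56 78", "xxxx????"))
```

Because `\x2A` doubles as the wildcard in this format, a real `2A` byte in the function body is indistinguishable from a wildcard once the signature is printed.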
/nameresetter.py:
--------------------------------------------------------------------------------
1 | import idc
2 | import idautils
3 | import idaapi
4 |
5 | def main():
6 | count = 0
7 | for segstart in idautils.Segments():
8 | segend = idaapi.getseg(segstart).end_ea
9 | for fea in idautils.Functions(segstart, segend):
10 | flags = idaapi.get_full_flags(fea)
11 | if not (flags & idc.FF_NAME):
12 | continue
13 |
14 | fflags = idc.get_func_attr(fea, idc.FUNCATTR_FLAGS)
15 | if fflags & idaapi.FUNC_LIB:
16 | continue
17 |
18 | if idc.set_name(fea, ""):
19 | count += 1
20 |
21 | print(f"Successfully reset the names of {count} functions")
22 |
23 | main()
--------------------------------------------------------------------------------
/netprop_importer.py:
--------------------------------------------------------------------------------
1 | import idautils
2 | import idaapi
3 | import idc
4 | import ctypes
5 | import time
6 |
7 | from math import ceil
8 |
9 | import xml.etree.ElementTree as et
10 |
11 | from dataclasses import dataclass
12 | from enum import Enum
13 |
14 | if idaapi.inf_is_64bit():
15 | ea_t = ctypes.c_uint64
16 | FF_PTR = idc.FF_QWORD
17 | else:
18 | ea_t = ctypes.c_uint32
19 | FF_PTR = idc.FF_DWORD
20 |
21 | class DataCache(object):
22 | tablecache = {}
23 |
24 | class SendPropType(Enum):
25 | DPT_Int = 0
26 | DPT_Float = 1
27 | DPT_Vector = 2
28 | DPT_VectorXY = 3
29 | DPT_String = 4
30 | DPT_Array = 5
31 | DPT_DataTable = 6
32 | DPT_Int64 = 7
33 |
34 | class SendFlags(Enum):
35 | UNSIGNED = 1 << 0
36 | COORD = 1 << 1
37 | NOSCALE = 1 << 2
38 | ROUNDDOWN = 1 << 3
39 | ROUNDUP = 1 << 4
40 | NORMAL = 1 << 5
41 | EXCLUDE = 1 << 6
42 | XYZE = 1 << 7
43 | INSIDEARRAY = 1 << 8
44 | PROXY_ALWAYS_YES = 1 << 9
45 | CHANGES_OFTEN = 1 << 10
46 | IS_A_VECTOR_ELEM = 1 << 11
47 | COLLAPSIBLE = 1 << 12
48 | COORD_MP = 1 << 13
49 | COORD_MP_LOWPRECISION = 1 << 14
50 | COORD_MP_INTEGRAL = 1 << 15
51 | VARINT = NORMAL
52 | ENCODED_AGAINST_TICKCOUNT = 1 << 16
53 |
54 | @dataclass(frozen=True)
55 | class SendProp:
56 | name: str
57 | type: int #SendPropType
58 | offset: int
59 | bits: int
60 | flags: int
61 | table: 'SendTable' = None
62 |
63 | def __repr__(self):
64 | # Use id() with table or else infinite recursion
65 | return f"SendProp(name={self.name}, type={self.type}, offset={self.offset}, bits={self.bits}, flags={self.flags}, table={id(self.table):#x})"
66 |
67 | def add_to_struc(self, struc, offset):
68 | # So, unfortunately, it doesn't seem to be possible to implement baseclasses
69 | # while also keeping vtables intact. This might actually be possible as it can be done
70 | # with IDA's header parser, but this might not be exposed to the API.
71 | # Implementing baseclasses with seamless vtable integration is a TODO
72 | # The framework is more or less here, so if I manage to figure that out it won't
73 | # be that difficult to implement
74 | # Might have to do with opinfo_t pointing to the proper vtable? Dunno
75 | # if self.table is not None:
76 | # baseclass = DataCache.struccache.get(self.table.classname, None)
77 | # if baseclass is None:
78 | # self.table.create_struc()
79 |
80 | # baseclass = DataCache.struccache[self.table.classname]
81 |
82 | if self.table is not None:
83 | # Array
84 | # We *could* parse these and implement them as embedded classes/arrays
85 | # but there's no guarantee that we would get a proper size, which could
86 | # cause some really poor results
87 | # There's a good chance that more array data is actually in the inner table's m_pExtraData
88 | # Mayhaps a SourceMod PR for another time
89 | if not self.table.name.startswith("_ST_"):
90 | # Bad hack but catches arrays
91 | if self.table.name == self.name:
92 | if self.offset != 0:
93 | self.table.add_array_to_struc(struc, offset + self.offset)
94 | return
95 | else:
96 | self.table.add_to_struc(struc, offset + self.offset)
97 |
98 | # Offset is 0 so we die
99 | if self.offset == 0:
100 | return
101 |
102 | curroffset = self.offset + offset
103 |
104 | currmem = idaapi.get_member(struc, curroffset)
105 | if currmem is not None:
106 | # print(f"Member {self.name} already exists in {idaapi.get_struc_name(struc.id)}")
107 | return
108 |
109 | idaflags, sz = self.calc_sz()
110 | tinfo = self.get_tinfo()
111 | targetname = idaapi.validate_name(self.name, idaapi.VNT_IDENT)
112 |
113 | serr = idaapi.add_struc_member(struc, targetname, curroffset, idaflags, None, sz)
114 | if serr != idaapi.STRUC_ERROR_MEMBER_OK:
115 | # I really don't wanna deal with these silly subclasses
116 | if serr < idaapi.STRUC_ERROR_MEMBER_OFFSET:
117 | print(f"Could not add struct member {idaapi.get_struc_name(struc.id)}.{targetname} at {curroffset} ({curroffset:#x})! Error {serr}")
118 | return
119 |
120 | currmem = idaapi.get_member(struc, curroffset)
121 | if tinfo is not None:
122 | idaapi.set_member_tinfo(struc, currmem, 0, tinfo, 0)
123 | elif self.flags and self.flags & SendFlags.UNSIGNED.value:
124 | currinfo = idaapi.tinfo_t()
125 | if idaapi.get_member_tinfo(currinfo, currmem):
126 | currinfo.change_sign(idaapi.type_unsigned)
127 | idaapi.set_member_tinfo(struc, currmem, 0, currinfo, 0)
128 |
129 | def calc_sz(self):
130 | if self.type == SendPropType.DPT_Float.value:
131 | return idc.FF_FLOAT, 4
132 | elif self.type == SendPropType.DPT_Int64.value:
133 | return idc.FF_QWORD, 8
134 | elif self.type == SendPropType.DPT_String.value:
135 | return FF_PTR, ctypes.sizeof(ea_t)
136 | elif self.type == SendPropType.DPT_Vector.value:
137 | # Returning FF_STRUCT doesn't work because the proper opinfo_t needs to be set
138 | # but this can be cheesed by just setting it to FF_DWORD and setting the tinfo after
139 | return idc.FF_DWORD, 12 #idc.FF_STRUCT
140 |
141 | absmax = ceil(self.bits/8.0)
142 | if absmax == 1:
143 | flags = idc.FF_BYTE
144 | numbytes = 1
145 | elif absmax == 2:
146 | flags = idc.FF_WORD
147 | numbytes = 2
148 | else:
149 | flags = idc.FF_DWORD
150 | numbytes = 4
151 |
152 | return flags, numbytes
153 |
154 | def get_tinfo(self):
155 | return {
156 | SendPropType.DPT_Vector.value: VECTOR,
157 | # SendPropType.DPT_Int.value: idaapi.tinfo_t(idaapi.BT_INT),
158 | SendPropType.DPT_Float.value: idaapi.tinfo_t(idaapi.BT_FLOAT),
159 | # SendPropType.DPT_String.value: idaapi.tinfo_t(idaapi.BT_PTR),
160 | SendPropType.DPT_Int64.value: idaapi.tinfo_t(idaapi.BT_INT64),
161 | }.get(self.type, None)
162 |
163 | @dataclass
164 | class SendTable:
165 | name: str
166 | props: list[SendProp]
167 | # For mapping to a "C"-class
168 | # I'm gonna assume that there'll be some game that won't suffice with a "replace DT_ with C" method,
169 | # so we have SendTable objects point to their actual class name
170 | classname: str
171 |
172 | @staticmethod
173 | def create(elem:et.Element, classname=None):
174 | name = elem.attrib["name"]
175 |
176 | # Check if we've already cached this table, update classname if so
177 | # because if this is true, then its classname is surely missing
178 | if name in DataCache.tablecache:
179 | if classname is not None:
180 | DataCache.tablecache[name].classname = classname
181 | return DataCache.tablecache[name]
182 |
183 | props = []
184 | for p in elem:
185 | pname = p.attrib["name"]
186 |
187 | # Collect and format the fields
188 | stype = p.find("type").text if p.find("type") is not None else None
189 | ptype = str_to_dt_type(stype)
190 | sflags = p.find("flags").text if p.find("flags") is not None else None
191 | flags = str_to_sendflags(sflags)
192 | offset = int(p.find("offset").text) if p.find("offset") is not None else None
193 | bits = int(p.find("bits").text) if p.find("bits") is not None else None
194 | ptable = SendTable.create(p.find("sendtable")) if p.find("sendtable") is not None else None
195 |
196 | # Append a new prop
197 | props.append(SendProp(pname, ptype, offset, bits, flags, ptable))
198 |
199 | # Cache and return
200 | DataCache.tablecache[name] = SendTable(name, props, classname)
201 | return DataCache.tablecache[name]
202 |
203 | def create_struc(self):
204 | struc = add_struc_ex(self.classname)
205 |
206 | self.add_to_struc(struc, 0)
207 |
208 | #DataCache.struccache[self.classname] = struc
209 |
210 | def add_to_struc(self, struc, offset):
211 | for prop in self.props:
212 | prop.add_to_struc(struc, offset)
213 |
214 | def add_array_to_struc(self, struc, offset):
215 | if offset == 0:
216 | return
217 |
218 | idaflags, sz = self.props[0].calc_sz()
219 | if len(self.props) > 1:
220 | sz = (self.props[1].offset - self.props[0].offset)
221 | idaflags = sz_to_idaflags(sz)
222 |
223 | sz *= len(self.props)
224 |
225 | tinfo = self.props[0].get_tinfo()
226 | targetname = idaapi.validate_name(self.name, idaapi.VNT_IDENT)
227 |
228 | serr = idaapi.add_struc_member(struc, targetname, offset, idaflags, None, sz)
229 | if serr != idaapi.STRUC_ERROR_MEMBER_OK:
230 | # I really don't wanna deal with these silly subclasses
231 | if serr < idaapi.STRUC_ERROR_MEMBER_OFFSET:
232 | print(f"Could not add struct member {idaapi.get_struc_name(struc.id)}.{targetname} at {offset} ({offset:#x})! Error {serr}")
233 | return
234 |
235 | currmem = idaapi.get_member(struc, offset)
236 | if tinfo is not None:
237 | idaapi.set_member_tinfo(struc, currmem, 0, tinfo, 0)
238 | elif self.props[0].flags and self.props[0].flags & SendFlags.UNSIGNED.value:
239 | currinfo = idaapi.tinfo_t()
240 | if idaapi.get_member_tinfo(currinfo, currmem):
241 | currinfo.change_sign(idaapi.type_unsigned)
242 | idaapi.set_member_tinfo(struc, currmem, 0, currinfo, 0)
243 |
244 | @dataclass(frozen=True)
245 | class ServerClass:
246 | name: str
247 | sendtable: SendTable
248 |
249 | @staticmethod
250 | def create(elem: et.Element, classname):
251 | sendtable = elem.find("sendtable")
252 | return ServerClass(classname, SendTable.create(sendtable, classname))
253 |
254 | def create_struc(self):
255 | self.sendtable.create_struc()
256 |
257 |
258 | # Idiot proof IDA wait box
259 | class WaitBox:
260 | buffertime = 0.0
261 | shown = False
262 | msg = ""
263 |
264 | @staticmethod
265 | def _show(msg):
266 | WaitBox.msg = msg
267 | if WaitBox.shown:
268 | idaapi.replace_wait_box(msg)
269 | else:
270 | idaapi.show_wait_box(msg)
271 | WaitBox.shown = True
272 |
273 | @staticmethod
274 | def show(msg, buffertime=0.1):
275 | if msg == WaitBox.msg:
276 | return
277 |
278 | if buffertime > 0.0:
279 | if time.time() - WaitBox.buffertime < buffertime:
280 | return
281 | WaitBox.buffertime = time.time()
282 | WaitBox._show(msg)
283 |
284 | @staticmethod
285 | def hide():
286 | if WaitBox.shown:
287 | idaapi.hide_wait_box()
288 | WaitBox.shown = False
289 |
290 | VECTOR = None
291 |
292 | def str_to_dt_type(t):
293 | return {
294 | "int": SendPropType.DPT_Int.value,
295 | "float": SendPropType.DPT_Float.value,
296 | "vector": SendPropType.DPT_Vector.value,
297 | "string": SendPropType.DPT_String.value,
298 | "array": SendPropType.DPT_Array.value,
299 | "datatable": SendPropType.DPT_DataTable.value,
300 | "int64": SendPropType.DPT_Int64.value
301 | }.get(t, None)
302 |
303 | def str_to_sendflags(s):
304 | if not s:
305 | return s
306 |
307 | splode = s.split("|")
308 | d = {
309 | "Unsigned": SendFlags.UNSIGNED.value,
310 | "Coord": SendFlags.COORD.value,
311 | "NoScale": SendFlags.NOSCALE.value,
312 | "RoundDown": SendFlags.ROUNDDOWN.value,
313 | "RoundUp": SendFlags.ROUNDUP.value,
314 | "VarInt": SendFlags.NORMAL.value,
315 | "Normal": SendFlags.NORMAL.value,
316 | "Exclude": SendFlags.EXCLUDE.value,
317 | "XYZE": SendFlags.XYZE.value,
318 | "InsideArray": SendFlags.INSIDEARRAY.value,
319 | "AlwaysProxy": SendFlags.PROXY_ALWAYS_YES.value,
320 | "ChangesOften": SendFlags.CHANGES_OFTEN.value,
321 | "VectorElem": SendFlags.IS_A_VECTOR_ELEM.value,
322 | "Collapsible": SendFlags.COLLAPSIBLE.value,
323 | "CoordMP": SendFlags.COORD_MP.value,
324 | "CoordMPLowPrec": SendFlags.COORD_MP_LOWPRECISION.value,
325 | "CoordMpIntegral": SendFlags.COORD_MP_INTEGRAL.value,
326 | }
327 | flags = 0
328 | for fl in splode:
329 | flags |= d.get(fl, 0)
330 |
331 | return flags
332 |
333 | def sz_to_idaflags(sz):
334 | return {
335 | 1: idc.FF_BYTE,
336 | 2: idc.FF_WORD,
337 | 4: idc.FF_DWORD,
338 | 8: idc.FF_QWORD
339 | }.get(sz, 1)
340 |
341 |
342 | def add_struc_ex(name):
343 | strucid = idaapi.get_struc_id(name)
344 | if strucid == idc.BADADDR:
345 | strucid = idaapi.add_struc(idc.BADADDR, name)
346 |
347 | return idaapi.get_struc(strucid)
348 |
349 | def calcszdata(sz):
350 | absmax = ceil(sz/8.0)
351 | if absmax == 1:
352 | flags = idc.FF_BYTE
353 | numbytes = 1
354 | elif absmax == 2:
355 | flags = idc.FF_WORD
356 | numbytes = 2
357 | else:
358 | flags = idc.FF_DWORD
359 | numbytes = 4
360 |
361 | return flags, numbytes
362 |
363 | # Fix SM's bad xml structure
364 | def fix_xml(data):
365 | for i in range(len(data)):
366 | data[i] = data[i].replace('""', '"')
367 |
368 | data[3] = "\n"
369 | data.append("\n")
370 | return data
371 |
372 | # Make Vector and QAngle structs to keep things sane
373 | def make_basic_structs():
374 | strucid = idaapi.get_struc_id("Vector")
375 | if strucid == idc.BADADDR:
376 | struc = idaapi.get_struc(idaapi.add_struc(idc.BADADDR, "Vector"))
377 | idaapi.add_struc_member(struc, "x", idc.BADADDR, idc.FF_FLOAT|idc.FF_DATA, None, 4)
378 | idaapi.add_struc_member(struc, "y", idc.BADADDR, idc.FF_FLOAT|idc.FF_DATA, None, 4)
379 | idaapi.add_struc_member(struc, "z", idc.BADADDR, idc.FF_FLOAT|idc.FF_DATA, None, 4)
380 | strucid = idaapi.get_struc_id("Vector")
381 |
382 | global VECTOR
383 | VECTOR = idaapi.tinfo_t()
384 | if idaapi.guess_tinfo(VECTOR, strucid) == idaapi.GUESS_FUNC_FAILED:
385 | VECTOR = None
386 |
387 | strucid = idaapi.get_struc_id("QAngle")
388 | if strucid == idc.BADADDR:
389 | struc = idaapi.get_struc(idaapi.add_struc(idc.BADADDR, "QAngle"))
390 | idaapi.add_struc_member(struc, "x", idc.BADADDR, idc.FF_FLOAT|idc.FF_DATA, None, 4)
391 | idaapi.add_struc_member(struc, "y", idc.BADADDR, idc.FF_FLOAT|idc.FF_DATA, None, 4)
392 | idaapi.add_struc_member(struc, "z", idc.BADADDR, idc.FF_FLOAT|idc.FF_DATA, None, 4)
393 |
394 | def main():
395 | data = None
396 | try:
397 | fopen = idaapi.ask_file(0, "*.xml", "Select a file to import")
398 | if fopen is None:
399 | return
400 |
401 | idaapi.set_ida_state(idaapi.st_Work)
402 | WaitBox.show("Parsing XML")
403 | with open(fopen) as f:
404 | data = f.readlines()
405 |
406 | if data is None:
407 | idaapi.set_ida_state(idaapi.st_Ready)
408 | return
409 |
410 | make_basic_structs()
411 |
412 | try:
413 | # SM <= 1.10 has bad XML; assume it's correct first, then try to fix it
414 | tree = et.fromstringlist(data)
415 | except:
416 | fix_xml(data)
417 | tree = et.fromstringlist(data)
418 |
419 | if tree is None:
420 | idaapi.warning("Something bad happened :(")
421 | idaapi.set_ida_state(idaapi.st_Ready)
422 | return
423 |
424 | WaitBox.show("Creating ServerClasses")
425 | classes = {}
426 | for cls in tree:
427 | classname = cls.attrib["name"]
428 | classes[classname] = ServerClass.create(cls, classname)
429 |
430 | idaapi.begin_type_updating(idaapi.UTP_STRUCT)
431 |
432 | WaitBox.show("Adding struct members")
433 | for classname, serverclass in classes.items():
434 | serverclass.create_struc()
435 |
436 | print("Done!")
437 | except:
438 | import traceback
439 | traceback.print_exc()
440 | print("Please file a bug report with supporting information at https://github.com/Scags/IDA-Scripts/issues")
441 | idaapi.beep()
442 |
443 | WaitBox.hide()
444 | idaapi.end_type_updating(idaapi.UTP_STRUCT)
445 | idaapi.set_ida_state(idaapi.st_Ready)
446 |
447 | main()
--------------------------------------------------------------------------------
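The flag parsing in str_to_sendflags() from netprop_importer.py boils down to OR-ing together the bits named in a `|`-separated string. Here is a minimal standalone sketch of that idea, assuming only three of the flags and using `enum.IntFlag` instead of the script's plain `Enum`; `SendFlag`, `FLAG_NAMES`, and `parse_flags` are illustrative names, not the script's own.

```python
from enum import IntFlag


class SendFlag(IntFlag):
    # Subset of the flags listed in the script, same bit positions
    UNSIGNED = 1 << 0
    COORD = 1 << 1
    NOSCALE = 1 << 2


# Names as they appear in the netprop dump, mapped to their bits
FLAG_NAMES = {
    "Unsigned": SendFlag.UNSIGNED,
    "Coord": SendFlag.COORD,
    "NoScale": SendFlag.NOSCALE,
}


def parse_flags(text):
    """Turn 'Unsigned|NoScale' into a combined bitmask; unknown names are ignored."""
    flags = SendFlag(0)
    for name in (text or "").split("|"):
        flags |= FLAG_NAMES.get(name, SendFlag(0))
    return flags


if __name__ == "__main__":
    parsed = parse_flags("Unsigned|NoScale")
    print(int(parsed), SendFlag.UNSIGNED in parsed)
```

With `IntFlag`, membership tests such as `SendFlag.UNSIGNED in parsed` replace the `flags & SendFlags.UNSIGNED.value` pattern, at the cost of diverging from the integer-only fields the script stores.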
/sigfind.py:
--------------------------------------------------------------------------------
1 | import idc
2 | import idaapi
3 | import idautils
4 |
5 | def getsigloc(sig):
6 | # Get the first segment that is executable to use its addresses for parse_binpat_str
7 | endea = idc.BADADDR
8 | for segea in idautils.Segments():
9 | s = idaapi.getseg(segea)
10 | if s.perm & idaapi.SEGPERM_EXEC:
11 | segstart = segea
12 | # Set the end ea to the end of the last executable segment
13 | # Speed isn't as important in this script, so reading any extra X
14 | # segments is fine
15 | if endea == idc.BADADDR or endea < segstart + s.size():
16 | endea = segstart + s.size()
17 | break
18 |
19 | count = 0
20 | first = idaapi.find_binary(0, endea, sig, 0, idc.SEARCH_DOWN|idc.SEARCH_NEXT)
21 | addr = first
22 | while addr != idc.BADADDR:
23 | count = count + 1
24 | addr = idaapi.find_binary(addr, endea, sig, 0, idc.SEARCH_DOWN|idc.SEARCH_NEXT)
25 |
26 | return first, count
27 |
28 | # binpat = idaapi.compiled_binpat_vec_t()
29 | # # This returns false but it works?
30 | # idaapi.parse_binpat_str(binpat, segstart, sig, 16, idaapi.get_default_encoding_idx(idaapi.get_encoding_bpu_by_name("UTF-8")))
31 | # addr, _ = idaapi.bin_search3(0, endea, binpat, idaapi.BIN_SEARCH_FORWARD)
32 | # return addr
33 |
34 |
35 | def main():
36 | bytesig = idaapi.ask_str("", 0, "Insert signature: ")
37 | if bytesig is None:
38 | return
39 |
40 | sig = bytesig.replace(r"\x", " ").replace("2A", "?").replace("2a", "?").strip()
41 |
42 | loc, count = getsigloc(sig)
43 | if loc != idc.BADADDR:
44 | idaapi.jumpto(loc)
45 | if count > 1:
46 | print(f"Found {count} instances of signature. Jumping to first at {loc:#X}")
47 | else:
48 | # Beep, nothing found
49 | idaapi.beep()
50 |
51 | main()
--------------------------------------------------------------------------------
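The input normalization in sigfind.py's main() (turning a SourceMod-style signature into a space-separated pattern with `?` wildcards) can be sketched without IDA. This is a minimal version under the assumption that the input uses `\xNN` escapes; the function name `sourcemod_to_ida_pattern` is hypothetical, and unlike the script it compares whole tokens case-insensitively rather than doing substring replacement.

```python
def sourcemod_to_ida_pattern(bytesig: str) -> str:
    """Convert r'\\x55\\x2A' style input into '55 ?' style search text."""
    tokens = bytesig.replace(r"\x", " ").split()
    # \x2A is the SourceMod wildcard, so it becomes '?' in the search pattern
    return " ".join("?" if t.upper() == "2A" else t.upper() for t in tokens)


if __name__ == "__main__":
    print(sourcemod_to_ida_pattern(r"\x55\x8B\xEC\x2A\x2a"))
```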
/sigsmasher.py:
--------------------------------------------------------------------------------
1 | import idautils
2 | import idc
3 | import idaapi
4 | import yaml
5 | import time
6 |
7 | from math import floor
8 |
9 | MAX_SIG_LENGTH = 512
10 |
11 | # Change to 1 to have a very optimized makesig
12 | # Will produce usable signatures, but they'll be a bit more volatile
13 | # since they rely on the position of the function in the binary
14 | # Searches up to the end of the function instead of the end of the .text segment
15 | ABSOLUTE_OPTIMIZATION = 0
16 |
17 | # Write-only trie for signatures
18 | # This is slightly faster than constantly running find_binary as
19 | # common signature prologues will be caught early and more quickly
20 | class Trie(object):
21 | def __init__(self):
22 | self.root = {}
23 |
24 | def add(self, data):
25 | node = self.root
26 | for d in data:
27 | if d not in node:
28 | node[d] = {}
29 | node = node[d]
30 |
31 | def find(self, data):
32 | node = self.root
33 | for d in data:
34 | if d not in node:
35 | return False
36 | node = node[d]
37 | return True
38 |
39 | def __contains__(self, data):
40 | return self.find(data)
41 |
42 | TRIE = Trie()
43 |
44 | # Idiot proof IDA wait box
45 |
46 |
47 | class WaitBox:
48 | buffertime = 0.0
49 | shown = False
50 | msg = ""
51 |
52 | @staticmethod
53 | def _show(msg):
54 | WaitBox.msg = msg
55 | if WaitBox.shown:
56 | idaapi.replace_wait_box(msg)
57 | else:
58 | idaapi.show_wait_box(msg)
59 | WaitBox.shown = True
60 |
61 | @staticmethod
62 | def show(msg, buffertime=0.1):
63 | if msg == WaitBox.msg:
64 | return
65 |
66 | if buffertime > 0.0:
67 | if time.time() - WaitBox.buffertime < buffertime:
68 | return
69 | WaitBox.buffertime = time.time()
70 | WaitBox._show(msg)
71 |
72 | @staticmethod
73 | def hide():
74 | if WaitBox.shown:
75 | idaapi.hide_wait_box()
76 | WaitBox.shown = False
77 |
78 | FUNCS_SEGEND = idc.BADADDR
79 | def calc_sigstop():
80 | endea = idc.BADADDR
81 | for segea in idautils.Segments():
82 | s = idaapi.getseg(segea)
83 | if s.perm & idaapi.SEGPERM_EXEC:
84 | segstart = segea
85 | # Set the end ea to the end of the last executable segment
86 | # Speed isn't as important in this script, so reading any extra X
87 | # segments is fine
88 | if endea == idc.BADADDR or endea < segstart + s.size():
89 | endea = segstart + s.size()
90 |
91 | return endea
92 |
93 | def is_good_sig(sig, funcend):
94 | if sig in TRIE:
95 | return False
96 |
97 | bytesig = " ".join(sig)
98 |
99 | endea = funcend if ABSOLUTE_OPTIMIZATION else FUNCS_SEGEND
100 | count = 0
101 | addr = 0
102 | addr = idaapi.find_binary(addr, endea, bytesig, 0, idc.SEARCH_DOWN|idc.SEARCH_NEXT)
103 | while count < 2 and addr != idc.BADADDR:
104 | count = count + 1
105 | addr = idaapi.find_binary(addr, endea, bytesig, 0, idc.SEARCH_DOWN|idc.SEARCH_NEXT)
106 |
107 | # Good sig, add it to the trie
108 | if count == 1:
109 | TRIE.add(sig)
110 | return True
111 |
112 | return False
113 |
114 | def makesigfast(func):
115 | addr = func.start_ea
116 | found = 0
117 |
118 | sig = []
119 | while addr != idc.BADADDR:
120 | info = idaapi.insn_t()
121 | if not idaapi.decode_insn(info, addr):
122 | return None
123 |
124 | done = 0
125 | if info.Op1.type in (idaapi.o_near, idaapi.o_far):
126 | insnsz = 2 if idaapi.get_byte(addr) == 0x0F else 1
127 | sig += [f"{idaapi.get_byte(addr+i):02X}" for i in range(insnsz)] + ["?"] * (info.size - insnsz)
128 | done = 1
129 | elif info.Op1.type == idaapi.o_reg and info.Op2.type == idaapi.o_mem and info.Op2.addr != idc.BADADDR:
130 | sig += [f"{idaapi.get_byte(addr+i):02X}" for i in range(info.Op2.offb)] + ["?"] * (info.size - info.Op2.offb)
131 | done = 1
132 |
133 | if not done: # Unknown, just wildcard addresses
134 | i = 0
135 | while i < info.size:
136 | loc = addr + i
137 | if ((idc.get_fixup_target_type(loc) & 0x0F) == idaapi.FIXUP_OFF32):
138 | sig += ["?"] * 4
139 | i += 3
140 | elif (idc.get_fixup_target_type(loc) & 0x0F) == idaapi.FIXUP_OFF64:
141 | sig += ["?"] * 8
142 | i += 7
143 | else:
144 | sig += [f"{idaapi.get_byte(addr+i):02X}"]
145 |
146 | i += 1
147 |
148 | # Escape the evil functions that break everything
149 | if len(sig) > MAX_SIG_LENGTH:
150 | return "Signature is too long!"
151 | # Save milliseconds and only check for good sigs after a fewish bytes
152 | # Trust me, it matters
153 | elif len(sig) >= 5 and is_good_sig(sig, func.end_ea):
154 | found = 1
155 | break
156 |
157 | addr = idc.next_head(addr, func.end_ea)
158 |
159 | if found == 0:
160 | return "Ran out of bytes!"
161 |
162 | smsig = r"\x" + r"\x".join(sig)
163 | smsig = smsig.replace("?", "2A")
164 | return smsig
165 |
166 | def main():
167 | try:
168 | root = {}
169 |
170 | f = idaapi.ask_file(1, "*.yml", "Choose a file to save to")
171 | if not f:
172 | return
173 |
174 | skip = idaapi.ask_yn(1, "Skip unnamed functions (e.g. ones that start with \"sub_\")?")
175 | if skip == -1:
176 | return
177 |
178 | idaapi.set_ida_state(idaapi.st_Work)
179 | global FUNCS_SEGEND
180 | FUNCS_SEGEND = calc_sigstop()
181 |
182 | funcs = list(idautils.Functions())
183 | siglist = []
184 |
185 | for i in range(len(funcs)):
186 | fea = funcs[i]
187 | flags = idaapi.get_full_flags(fea)
188 | if not idaapi.is_func(flags):
189 | continue
190 |
191 | if skip and not idaapi.has_name(flags):
192 | continue
193 |
194 | func = idaapi.get_func(fea)
195 | # Thunks and lib funcs
196 | if func.flags & (idaapi.FUNC_LIB | idaapi.FUNC_THUNK):
197 | continue
198 |
199 | funcname = idaapi.get_name(fea)
200 | unmangled = idaapi.demangle_name(funcname, idaapi.MNG_SHORT_FORM)
201 | if unmangled is not None:
202 | # Skip jmp stubs
203 | if unmangled.startswith("j_"):
204 | continue
205 |
206 | # Nullsub
207 | if unmangled.startswith("nullsub"):
208 | continue
209 |
210 | siglist.append(func)
211 |
212 | totalcount = len(siglist)
213 | actualstarttime = time.time()
214 | sigcount = 0
215 | for i, func in enumerate(siglist):
216 | funcname = idaapi.get_name(func.start_ea)
217 | unmangled = idaapi.demangle_name(funcname, idaapi.MNG_SHORT_FORM)
218 | if unmangled is None:
219 | unmangled = funcname
220 |
221 | sig = makesigfast(func)
222 | root[unmangled] = {"mangled": funcname, "signature": sig}
223 |
224 | if sig:
225 | sigcount += (0 if "!" in sig else 1)
226 |
227 | # Unfortunately, sigging takes progressively longer the further along the function list
228 | # this goes, as makesig() searches from top to bottom while functions are ordered from top to bottom
229 | # So this isn't really accurate but w/e
230 |
231 | totaltime = time.time() - actualstarttime
232 | count = i + 1
233 | avgtime = totaltime / count
234 | eta = int(avgtime * (totalcount - count))
235 | etastr = time.strftime("%H:%M:%S", time.gmtime(eta))
236 |
237 | WaitBox.show(f"Evaluated {count} out of {totalcount} ({floor(i / float(totalcount) * 100.0 * 10.0) / 10.0}%)\nETA: {etastr}")
238 |
239 | WaitBox.show("Saving to file")
240 | with open(f, "w") as f:
241 | yaml.safe_dump(root, f, default_flow_style=False, width=999999)
242 |
243 | totaltime = time.strftime("%H:%M:%S", time.gmtime(time.time() - actualstarttime))
244 | print(f"Successfully generated {sigcount} signatures from {totalcount} functions in {totaltime}")
245 | except:
246 | import traceback
247 | traceback.print_exc()
248 | print("Please file a bug report with supporting information at https://github.com/Scags/IDA-Scripts/issues")
249 | idaapi.beep()
250 |
251 | idaapi.set_ida_state(idaapi.st_Ready)
252 | WaitBox.hide()
253 |
254 | # import cProfile
255 | # cProfile.run("main()", "sigsmasher.prof")
256 | main()
257 |
--------------------------------------------------------------------------------
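The Trie in sigsmasher.py acts as a prefix filter: once a unique signature has been recorded, any candidate that is a prefix of it must also match that earlier function, so it cannot be unique and the binary search can be skipped. A minimal sketch of that check, runnable outside IDA, follows; `PrefixTrie` and `has_prefix` are illustrative names, and the byte tokens are made up.

```python
class PrefixTrie:
    """Nested-dict trie over signature tokens such as '55', '8B', '?'."""

    def __init__(self):
        self.root = {}

    def add(self, tokens):
        node = self.root
        for tok in tokens:
            node = node.setdefault(tok, {})

    def has_prefix(self, tokens):
        """True if tokens is a prefix of (or equal to) any stored signature."""
        node = self.root
        for tok in tokens:
            if tok not in node:
                return False
            node = node[tok]
        return True


if __name__ == "__main__":
    trie = PrefixTrie()
    trie.add(["55", "8B", "EC", "?", "?"])
    print(trie.has_prefix(["55", "8B"]))        # shared prologue, already covered
    print(trie.has_prefix(["55", "8B", "FF"]))  # diverges, needs a real search
```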
/structfiller.py:
--------------------------------------------------------------------------------
1 | import idc
2 | import idautils
3 | import idaapi
4 | import time
5 |
6 | from math import floor
7 |
8 | # Idiot proof IDA wait box
9 | class WaitBox:
10 | buffertime = 0.0
11 | shown = False
12 | msg = ""
13 |
14 | @staticmethod
15 | def _show(msg):
16 | WaitBox.msg = msg
17 | if WaitBox.shown:
18 | idaapi.replace_wait_box(msg)
19 | else:
20 | idaapi.show_wait_box(msg)
21 | WaitBox.shown = True
22 |
23 | @staticmethod
24 | def show(msg, buffertime=0.1):
25 | if msg == WaitBox.msg:
26 | return
27 |
28 | if buffertime > 0.0:
29 | if time.time() - WaitBox.buffertime < buffertime:
30 | return
31 | WaitBox.buffertime = time.time()
32 | WaitBox._show(msg)
33 |
34 | @staticmethod
35 | def hide():
36 | if WaitBox.shown:
37 | idaapi.hide_wait_box()
38 | WaitBox.shown = False
39 |
40 | def main():
41 | try:
42 | idaapi.begin_type_updating(idaapi.UTP_STRUCT)
43 | maxstructs = idaapi.get_struc_qty()
44 | i = idaapi.get_first_struc_idx()
45 | while i < maxstructs:
46 | WaitBox.show(f"{floor(i / float(maxstructs) * 100.0 * 10.0) / 10.0}%")
47 | strucid = idaapi.get_struc_by_idx(i)
48 | struc = idaapi.get_struc(strucid)
49 | k = 0
50 | struclen = idaapi.get_max_offset(struc)
51 | while k < struclen:
52 | member = idaapi.get_member(struc, k)
53 | if not member:
54 | idaapi.add_struc_member(struc, f"field_{k:X}", k, idc.FF_BYTE, None, 1)
55 | k += 1
56 | else:
57 | k += idaapi.get_member_size(member)
58 |
59 | i += 1
60 |
61 | print("Done!")
62 | except:
63 | import traceback
64 | traceback.print_exc()
65 | print("Please file a bug report with supporting information at https://github.com/Scags/IDA-Scripts/issues")
66 | idaapi.beep()
67 |
68 | WaitBox.hide()
69 | idaapi.end_type_updating(idaapi.UTP_STRUCT)
70 |
71 | main()
--------------------------------------------------------------------------------
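The inner loop of structfiller.py walks a struct byte by byte, skipping over existing members and adding a one-byte filler wherever nothing is defined. The same walk can be sketched on plain data, assuming a dict that maps each member's start offset to its size; `find_gaps` is a hypothetical helper and does not touch IDA's struct API.

```python
def find_gaps(members, size):
    """Return every offset below size that no member covers.

    members maps a member's start offset to its size in bytes.
    """
    gaps = []
    k = 0
    while k < size:
        if k in members:
            k += members[k]  # jump past the existing member
        else:
            gaps.append(k)   # the script adds a 1-byte field_{k:X} here
            k += 1
    return gaps


if __name__ == "__main__":
    # Two 4-byte members at offsets 0 and 8 leave bytes 4..7 undefined
    print(find_gaps({0: 4, 8: 4}, 12))
```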
/symbolsmasher.py:
--------------------------------------------------------------------------------
1 | import idc
2 | import idautils
3 | import idaapi
4 | import json
5 |
6 | import time
7 | from sys import version_info
8 |
9 | # Are we reading this DB or writing to it? Not to be confused with reading from/writing to the work file
10 | Mode_Invalid = -1
11 | Mode_Write = 0
12 | Mode_Read = 1
13 |
14 | DEBUG = 0
15 |
16 | # Idiot proof IDA wait box
17 | class WaitBox:
18 | buffertime = 0.0
19 | shown = False
20 | msg = ""
21 |
22 | @staticmethod
23 | def _show(msg):
24 | WaitBox.msg = msg
25 | if WaitBox.shown:
26 | idaapi.replace_wait_box(msg)
27 | else:
28 | idaapi.show_wait_box(msg)
29 | WaitBox.shown = True
30 |
31 | @staticmethod
32 | def show(msg, buffertime = 0.1):
33 | if msg == WaitBox.msg:
34 | return
35 |
36 | if buffertime > 0.0:
37 | if time.time() - WaitBox.buffertime < buffertime:
38 | return
39 | WaitBox.buffertime = time.time()
40 | WaitBox._show(msg)
41 |
42 | @staticmethod
43 | def hide():
44 | if WaitBox.shown:
45 | idaapi.hide_wait_box()
46 | WaitBox.shown = False
47 |
48 | def get_action():
49 | return idaapi.ask_buttons("Reading from", "Writing to", "", 0, "What action are we performing on this database?")
50 |
51 | def get_file(action):
52 | forsaving, rw, s = (1, "w", "write to") if action == Mode_Read else (0, "r", "read from")
53 | fname = "*.json"
54 | f = idaapi.ask_file(forsaving, fname, "Choose a file to {}".format(s))
55 |
56 | return open(f, rw) if f else None
57 |
58 | # Show how many functions we've found
59 | FOUND_FUNCS = set()
60 |
61 | # Format:
62 | # "String Name":
63 | # [
64 | # "_ZN8Function5Name",
65 | # "_ZN8Function6Name2",
66 | # etc...
67 | # ]
68 | def build_xref_dict(strings):
69 | xrefs = {}
70 | for s in strings:
71 | xrefs[str(s)] = []
72 |
73 | for xref in idautils.XrefsTo(s.ea):
74 | funcname = idaapi.get_func_name(xref.frm)
75 | if funcname is None:
76 | continue
77 |
78 | node = xrefs[str(s)]
79 | node.append(funcname)
80 | xrefs[str(s)] = node
81 |
82 | # Empty, trash, we don't want it
83 | if not len(xrefs[str(s)]):
84 | del xrefs[str(s)]
85 |
86 | return xrefs
87 |
88 | # Format:
89 | # "_ZN8Function5Name":
90 | # [
91 | # "str1",
92 | # "str2",
93 | # "str1",
94 | # ]
95 | def build_data_dict(strdict):
96 | funcs = {}
97 | for s, value in get_bcompat_iter(strdict):
98 | for funcname in value:
99 | node = funcs.get(funcname, [])
100 | node.append(s)
101 | funcs[funcname] = node
102 | return funcs
103 |
104 | def read_strs(strings, file):
105 | WaitBox.show("Reading strings", 0.0)
106 | # Build an organized dictionary of the string data we can get
107 | strdict = build_xref_dict(strings)
108 | # Then reorient it around functions, then dump it
109 | funcdict = build_data_dict(strdict)
110 | WaitBox.show("Dumping to file", 0.0)
111 | # Running the script in write mode will build a similar dict then compare the two through functions
112 | json.dump(funcdict, file, indent = 4, sort_keys = True)
113 |
114 | def get_bcompat_iter(d):
115 | return d.items() if version_info[0] >= 3 else d.iteritems()
116 |
117 | def get_bcompat_keys(d):
118 | return d.keys() if version_info[0] >= 3 else d.iterkeys()
119 |
120 | def write_exact_comp(strdict, funcdict, myfuncs):
121 | global FOUND_FUNCS
122 | WaitBox.show("Writing exact comparisons")
123 | count = 0
124 |
125 | for strippedname, strippedlist in get_bcompat_iter(strdict):
126 | if not idaapi.get_func_name(myfuncs[strippedname]).startswith("sub_"):
127 | continue
128 |
129 | possibilities = []
130 | strippedlist = sorted(strippedlist)
131 | for symname, symlist in get_bcompat_iter(funcdict):
132 | if strippedlist == sorted(symlist):
133 | possibilities.append(str(symname))
134 | else:
135 | continue
136 |
137 | if len(possibilities) >= 2:
138 | break
139 |
140 | if len(possibilities) != 1:
141 | continue
142 |
143 | if possibilities[0] not in FOUND_FUNCS and possibilities[0] not in myfuncs:
144 | # print(idaapi.get_func_name(myfuncs[strippedname]))
145 | idc.set_name(myfuncs[strippedname], possibilities[0], idaapi.SN_FORCE)
146 | count += 1
147 |
148 | FOUND_FUNCS.add(possibilities[0])
149 | WaitBox.show("Writing exact comparisons")
150 | elif DEBUG:
151 | print("{} is probably wrong!".format(idc.demangle_name(possibilities[0], idc.get_inf_attr(idc.INF_SHORT_DN))))
152 |
153 | return count
154 |
155 | def write_simple_comp(strdict, funcdict, myfuncs, liw = True):
156 | global FOUND_FUNCS
157 | s = "symboled in stripped" if liw else "stripped in symboled"
158 | WaitBox.show("Writing simple comparisons ({})".format(s))
159 | count = 0
160 |
161 | for strippedname, strippedlist in get_bcompat_iter(strdict):
162 | if not idaapi.get_func_name(myfuncs[strippedname]).startswith("sub_"):
163 | continue
164 |
165 | possibilities = []
166 | for symname, symlist in get_bcompat_iter(funcdict):
167 | if liw:
168 | if all(val in strippedlist for val in symlist):
169 | possibilities.append(str(symname))
170 | else:
171 | continue
172 | else:
173 | if all(val in symlist for val in strippedlist):
174 | possibilities.append(str(symname))
175 | else:
176 | continue
177 |
178 | if len(possibilities) >= 2:
179 | break
180 |
181 | if len(possibilities) != 1:
182 | continue
183 |
184 | if possibilities[0] not in FOUND_FUNCS and possibilities[0] not in myfuncs:
185 | idc.set_name(myfuncs[strippedname], possibilities[0], idaapi.SN_FORCE)
186 | count += 1
187 |
188 | FOUND_FUNCS.add(possibilities[0])
189 | WaitBox.show("Writing simple comparisons ({})".format(s))
190 | elif DEBUG:
191 | print("{} is probably wrong!".format(idc.demangle_name(possibilities[0], idc.get_inf_attr(idc.INF_SHORT_DN))))
192 |
193 | return count
194 |
195 | def get_bin_funcs():
196 | seg = idaapi.get_segm_by_name(".text")
197 | return {idaapi.get_func_name(ea): ea for ea in idautils.Functions(seg.start_ea, seg.end_ea)}
198 |
199 | # To prevent bad matches, the plan is to drop any functions that share the exact same string xrefs
200 | # This is meant to protect against inlining, but it ultimately fails because it compares direct values
201 | # Foo() could call an inlined Bar() twice, which would throw the comparison off
202 | # What to do, what to do... (this function is currently a no-op)
203 | def clean_data_dict(strdict):
204 | pass
205 | # resultant = {}
206 | # for key, value in get_bcompat_iter(strdict):
207 | # if sorted(value) not in resultant.values():
208 | # resultant[key] = sorted(value)
209 | #
210 | # strdict = resultant
211 |
212 | def write_symbols(strings, file):
213 | WaitBox.show("Loading file", True)
214 | funcdict = json.load(file)
215 | if not funcdict:
216 | idaapi.warning("Could not load function data from file")
217 | return
218 |
219 | strdict = build_data_dict(build_xref_dict(strings))
220 | clean_data_dict(strdict)
221 | myfuncs = get_bin_funcs()
222 |
223 | # Writing uniques is much more liable to produce bad typing
224 | # Unique, one-off strings seem to be inlined much more often, so it's
225 | # better to use the simple comparison technique
226 | # This will reduce the number of types, but the reduced types
227 | # would've been wrong or duplicated anyway
228 | # strdict = write_uniques(strings, funcdict["Uniques"])
229 |
230 | # A good test is to just simply compare xrefs
231 | # If a function references "fizzbuzz" 2 times and "foobar" once and its the only function
232 | # that does anything like that, chances are that we found something to smash
233 | exact_count = write_exact_comp(strdict, funcdict, myfuncs)
234 |
235 | # Since a lot of functions that have good strings also have inlined strings in them, let's just look for containment
236 | # For example, "fizz", "buzz", and "foo" all exist in Bar::Foo, which has "fizz", "buzz", "foo", and "fizzbuzz"
237 | # Obviously we're only checking for 1 instance
238 | liw = write_simple_comp(strdict, funcdict, myfuncs) # Symboled strings in stripped
239 | wil = write_simple_comp(strdict, funcdict, myfuncs, False) # Stripped strings in symboled
240 |
241 | # TODO IDEAS;
242 | # - Dance around some function xrefs. By now, a solid chunk of them should have symboled names (a few thousand at least)
243 | # A unique set of named xrefs could guarantee something
244 | # Would need a new section in the data file (to and from)
245 | return exact_count, liw, wil
246 |
247 | def main():
248 | try:
249 | action = get_action()
250 | if action == Mode_Invalid:
251 | return
252 |
253 | file = get_file(action)
254 | if file is None:
255 | return
256 |
257 | # strings = get_strs()
258 | strings = list(idautils.Strings())
259 | if action == Mode_Read:
260 | read_strs(strings, file)
261 | print("Done!")
262 | else:
263 | c1, c2, c3 = write_symbols(strings, file)
264 | print("Successfully typed {} functions".format(len(FOUND_FUNCS)))
265 | print("\t- {} Exact\n\t- {} Symboled in stripped\n\t- {} Stripped in symboled".format(c1, c2, c3))
266 | except Exception:
267 | import traceback
268 | traceback.print_exc()
269 | print("Please file a bug report with supporting information at https://github.com/Scags/IDA-Scripts/issues")
270 | idaapi.beep()
271 |
272 | WaitBox.hide()
273 | file.close()
274 |
275 | main()
--------------------------------------------------------------------------------
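The matching rules used by symbolsmasher.py above (exact comparison of sorted string lists, then containment) can be tried outside of IDA with plain dictionaries. The following is a minimal sketch, not the script's own code: `match_exact`, `match_contained`, and the sample data are hypothetical and exist only for illustration. It keeps the same rule that a stripped function is matched only when exactly one symboled candidate fits, and it leaves out the `FOUND_FUNCS` bookkeeping and the actual renaming.

```python
from collections import Counter

def match_exact(stripped, symboled):
    """Match when exactly one symbol references the same strings the same number of times."""
    matches = {}
    for sname, sstrings in stripped.items():
        want = Counter(sstrings)
        candidates = [sym for sym, symstrings in symboled.items()
                      if Counter(symstrings) == want]
        # Ambiguous (2+) or missing (0) candidates are skipped
        if len(candidates) == 1:
            matches[sname] = candidates[0]
    return matches

def match_contained(stripped, symboled):
    """Match when exactly one symbol's strings all appear in the stripped function."""
    matches = {}
    for sname, sstrings in stripped.items():
        have = set(sstrings)
        candidates = [sym for sym, symstrings in symboled.items()
                      if set(symstrings) <= have]
        if len(candidates) == 1:
            matches[sname] = candidates[0]
    return matches

# Made-up data in the same shape as the dumped JSON: name -> list of strings
symboled = {
    "_ZN3Foo3BarEv": ["fizzbuzz", "fizzbuzz", "foobar"],
    "_ZN3Foo3BazEv": ["fizz", "buzz"],
}
stripped = {
    "sub_1000": ["foobar", "fizzbuzz", "fizzbuzz"],
    "sub_2000": ["fizz", "buzz", "extra"],
    "sub_3000": ["unrelated"],
}

exact = match_exact(stripped, symboled)
contained = match_contained(stripped, symboled)
print(exact)
print(contained)
```

In this sample, `sub_2000` carries an extra string (as if something had been inlined into it), so only the containment pass can pair it with a symbol.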
/vtable_io.py:
--------------------------------------------------------------------------------
1 | import idc
2 | import idautils
3 | import idaapi
4 | import json
5 | import ctypes
6 | import time
7 | import re
8 |
9 | from dataclasses import dataclass
10 |
11 | if idaapi.inf_is_64bit():
12 | ea_t = ctypes.c_uint64
13 | ptr_t = ctypes.c_int64
14 | get_ptr = idaapi.get_qword
15 | FF_PTR = idc.FF_QWORD
16 | else:
17 | ea_t = ctypes.c_uint32
18 | ptr_t = ctypes.c_int32
19 | get_ptr = idaapi.get_dword
20 | FF_PTR = idc.FF_DWORD
21 |
22 | # Calling these a lot so we'll speed up the invocations by manually implementing them here
23 | def is_off(f): return (f & (idc.FF_0OFF|idc.FF_1OFF)) != 0
24 | def is_code(f): return (f & idaapi.MS_CLS) == idc.FF_CODE
25 | def has_any_name(f): return (f & idc.FF_ANYNAME) != 0
26 | def is_ptr(f): return (f & idaapi.MS_CLS) == idc.FF_DATA and (f & idaapi.DT_TYPE) == FF_PTR
27 |
28 | # Let's go https://www.blackhat.com/presentations/bh-dc-07/Sabanal_Yason/Paper/bh-dc-07-Sabanal_Yason-WP.pdf
29 |
30 | _RTTICompleteObjectLocator_fields = [
31 | ("signature", ctypes.c_uint32), # signature
32 | ("offset", ctypes.c_uint32), # offset of this vtable in complete class (from top)
33 | ("cdOffset", ctypes.c_uint32), # offset of constructor displacement
34 | ("pTypeDescriptor", ctypes.c_uint32), # ref TypeDescriptor
35 | ("pClassHierarchyDescriptor", ctypes.c_uint32), # ref RTTIClassHierarchyDescriptor
36 | ]
37 |
38 | if idaapi.inf_is_64bit():
39 | _RTTICompleteObjectLocator_fields.append(("pSelf", ctypes.c_uint32)) # ref to object's base
40 |
41 | class RTTICompleteObjectLocator(ctypes.Structure):
42 | _fields_ = _RTTICompleteObjectLocator_fields
43 |
44 |
45 | class TypeDescriptor(ctypes.Structure):
46 | _fields_ = [
47 | ("pVFTable", ctypes.c_uint32), # reference to RTTI's vftable
48 | ("spare", ctypes.c_uint32), # internal runtime reference
49 | ("name", ctypes.c_uint8), # type descriptor name (no varstruct needed since we don't use this)
50 | ]
51 |
52 |
53 | class RTTIClassHierarchyDescriptor(ctypes.Structure):
54 | _fields_ = [
55 | ("signature", ctypes.c_uint32), # signature
56 | ("attribs", ctypes.c_uint32), # attributes
57 | ("numBaseClasses", ctypes.c_uint32), # # of items in the array of base classes
58 | ("pBaseClassArray", ctypes.c_uint32), # ref BaseClassArray
59 | ]
60 |
61 |
62 | class RTTIBaseClassDescriptor(ctypes.Structure):
63 | _fields_ = [
64 | ("pTypeDescriptor", ctypes.c_uint32), # ref TypeDescriptor
65 | ("numContainedBases", ctypes.c_uint32), # # of sub elements within base class array
66 | ("mdisp", ctypes.c_uint32), # member displacement
67 | ("pdisp", ctypes.c_uint32), # vftable displacement
68 | ("vdisp", ctypes.c_uint32), # displacement within vftable
69 | ("attributes", ctypes.c_uint32), # base class attributes
70 | ("pClassDescriptor", ctypes.c_uint32), # ref RTTIClassHierarchyDescriptor
71 | ]
72 |
73 |
74 | class base_class_type_info(ctypes.Structure):
75 | _fields_ = [
76 | ("basetype", ea_t), # Base class type
77 | ("offsetflags", ea_t), # Offset and info
78 | ]
79 |
80 |
81 | class class_type_info(ctypes.Structure):
82 | _fields_ = [
83 | ("pVFTable", ea_t), # reference to RTTI's vftable (__class_type_info)
84 | ("pName", ea_t), # ref to type name
85 | ]
86 |
87 | # I don't think this is right, but every case I found looked to be correct
88 | # This might be a vtable? IDA sometimes says it is but not always
89 | # Plus sometimes the flags member is 0x1, so it's not a thisoffs. Weird
90 | class pointer_type_info(class_type_info):
91 | _fields_ = [
92 | ("flags", ea_t), # Flags or something else
93 | ("pType", ea_t), # ref to type
94 | ]
95 |
96 | class si_class_type_info(class_type_info):
97 | _fields_ = [
98 | ("pParent", ea_t), # ref to parent type
99 | ]
100 |
101 | class vmi_class_type_info(class_type_info):
102 | _fields_ = [
103 | ("flags", ctypes.c_uint32), # flags
104 | ("basecount", ctypes.c_uint32), # # of base classes
105 | ("pBaseArray", base_class_type_info), # array of BaseClassArray
106 | ]
107 |
108 | def create_vmi_class_type_info(ea):
109 | bytestr = idaapi.get_bytes(ea, ctypes.sizeof(vmi_class_type_info))
110 | tinfo = vmi_class_type_info.from_buffer_copy(bytestr)
111 |
112 | # Since this is a varstruct, we create a dynamic class with the proper size and type and return it instead
113 | class vmi_class_type_info_dynamic(class_type_info):
114 | _fields_ = [
115 | ("flags", ctypes.c_uint32),
116 | ("basecount", ctypes.c_uint32),
117 | ("pBaseArray", base_class_type_info * tinfo.basecount),
118 | ]
119 |
120 | return vmi_class_type_info_dynamic
121 |
122 |
123 | # Steps to retrieve vtables on Windows (MSVC):
124 | # 1. Get RTTI's vftable (??_7type_info@@6B@)
125 | # 2. Iterate over xrefs to, which are all TypeDescriptor objects
126 | # a. Of course don't load up the function that uses it
127 | # 3. At each xref load up xrefs to again
128 | # a. There should be at least 2; the important ones are the RTTICompleteObjectLocators AKA COLs (there can be more than 1)
129 | # b. To discern which one is which, just see if there's a label at the address
130 | # - If there is, then that one is RTTIClassHierarchyDescriptor, so skip it
131 | # 4. The current ea position at each xref should be at RTTICompleteObjectLocator::pTypeDescriptor, so subtract 12 to get to the beginning of the struct
132 | # 5. Find xrefs to each. There should only be one, and it should be its vtable
133 | # a. Each COL has an offset which shows where its vtable starts, so running too far over the table will be easier to detect
134 | #
135 | # Steps to retrieve vtables on Linux (GCC and maybe Clang)
136 | # 1. Get RTTI's vftable (_ZTVN10__cxxabiv117__class_type_infoE,
137 | # _ZTVN10__cxxabiv120__si_class_type_infoE, and _ZTVN10__cxxabiv121__vmi_class_type_infoE)
138 | # 2. First, before doing anything, shove each xref of type_info object into some sort of structure
139 | # a. There's no easy way to cheese discerning which xref is the actual vtable, unless we want to start parsing IDA comments
140 | # 3. Once each type_info object and their references are loaded, get the xrefs from each pVFTable
141 | # 4. There will probably be more than one xref.
142 | # a. To discern which one is a vtable, if the xref lies in another type_info object, then it's not a vtable
143 | # b. The remaining xref(s) is indeed a vtable
144 |
145 | # Class for windows type info, helps organize things
146 | @dataclass(frozen=True)
147 | class WinTI(object):
148 | typedesc: int
149 | name: str
150 | cols: list[int]
151 | vtables: list[int]
152 |
153 | # Class for function lists (what is held in the json)
154 | @dataclass(frozen=True)
155 | class FuncList:
156 | thisoffs: int
157 | funcs: list#[VFunc]
158 |
159 | # Idiot proof IDA wait box
160 | class WaitBox:
161 | buffertime = 0.0
162 | shown = False
163 | msg = ""
164 |
165 | @staticmethod
166 | def _show(msg):
167 | WaitBox.msg = msg
168 | if WaitBox.shown:
169 | idaapi.replace_wait_box(msg)
170 | else:
171 | idaapi.show_wait_box(msg)
172 | WaitBox.shown = True
173 |
174 | @staticmethod
175 | def show(msg, buffertime=0.1):
176 | if msg == WaitBox.msg:
177 | return
178 |
179 | if buffertime > 0.0:
180 | if time.time() - WaitBox.buffertime < buffertime:
181 | return
182 | WaitBox.buffertime = time.time()
183 | WaitBox._show(msg)
184 |
185 | @staticmethod
186 | def hide():
187 | if WaitBox.shown:
188 | idaapi.hide_wait_box()
189 | WaitBox.shown = False
190 |
191 | # Virtual class tree
192 | class VClass(object):
193 | def __init__(self, *args, **kwargs):
194 | self.name = kwargs.get("name", "")
195 | # dict[classname, VClass]
196 | self.baseclasses = kwargs.get("baseclasses", {})
197 | # Same as Linux json, dict[thisoffs, funcs]
198 | self.vfuncs = kwargs.get("vfuncs", {})
199 | # Written to when writing to Windows, dict[thisoffs, [VFunc]]
200 | self.vfuncnames = kwargs.get("vfuncnames", {})
201 | # Exists solely to speed up checking for inherited functions
202 | self.postnames = set()
203 |
204 | def __str__(self):
205 | return f"{self.name} (baseclasses = {self.baseclasses}, vfuncs = {self.vfuncs})"
206 |
207 | def parse(self, colea, wintable):
208 | col = get_class_from_ea(RTTICompleteObjectLocator, colea)
209 | thisoffs = col.offset
210 |
211 | # Already parsed
212 | if self.name in wintable.keys():
213 | if thisoffs in wintable[self.name].vfuncs.keys():
214 | return
215 |
216 |
217 | # In 64-bit PEs, the COL references itself, remove this
218 | xrefs = list(idautils.XrefsTo(colea))
219 | if idaapi.inf_is_64bit():
220 | for n in range(len(xrefs)-1, -1, -1):
221 | if xrefs[n].frm == colea + RTTICompleteObjectLocator.pSelf.offset:
222 | del xrefs[n]
223 |
224 | if len(xrefs) != 1:
225 | print(f"[VTABLE IO] Multiple vtables point to same COL - {self.name} at {colea:#x}")
226 | return
227 |
228 | vtable = xrefs[0].frm + ctypes.sizeof(ea_t)
229 | self.vfuncs[thisoffs] = parse_vtable_addresses(vtable)
230 |
231 | # TODO: A VFunc is created for each function in the json and for each function in each vtable
232 | # This creates duplicates of the same function, so there needs to be a way to
233 | # cache each function and reuse it for each vtable
234 | # A possible pain point is telling inherited functions apart
235 | @dataclass
236 | class VFunc:
237 | ea: int # Address to this function
238 | vaddr: int # Address to this function's reference in its vtable
239 | mangledname: str
240 | inheritid: int
241 | name: str
242 | postname: str
243 | sname: str
244 |
245 | @staticmethod
246 | def create(ea=idc.BADADDR, mangledname="", inheritid=-1, vaddr=idc.BADADDR):
247 | name = ""
248 | postname = ""
249 | sname = ""
250 | if mangledname:
251 | name = idaapi.demangle_name(mangledname, idaapi.MNG_LONG_FORM) or mangledname
252 | if name:
253 | postname = get_func_postname(name)
254 | sname = postname.split("(")[0]
255 | return VFunc(ea, vaddr, mangledname, inheritid, name, postname, sname)
256 |
257 | class VOptions(object):
258 | StringMethod = 1 << 0
259 | SkipMismatches = 1 << 1
260 | CommentReusedFunctions = 1 << 2
261 |
262 | DoNotExport = 0
263 | ExportNormal = 1
264 | ExportOnly = 2
265 |
266 | # Form for script options
267 | class VForm(idaapi.Form):
268 |
269 | def __init__(self):
270 | idaapi.Form.__init__(self, r"""STARTITEM 0
271 | BUTTON YES* Go
272 | BUTTON CANCEL Cancel
273 | VTable IO
274 | {FormChangeCb}
275 | <#Browse#Select a file to import from :{iFileImport}>
276 | <##Import options##Parse type strings (for hashed type info):{rStringMethod}> | <##Export options##Do not export:{rDoNotExport}>
277 | |
278 | {cImportOptions}> | {cExportOptions}>
279 | <#Browse#Select a file to export to (ignored if unchecked):{iFileExport}>
280 | """, {
281 | "FormChangeCb": idaapi.Form.FormChangeCb(self.OnFormChange),
282 | "iFileImport": idaapi.Form.FileInput(open=True, value=idaapi.reg_read_string("vtable_io", "iFileImport", "*.json"), swidth=50),
283 | "cImportOptions": idaapi.Form.ChkGroupControl(
284 | ("rStringMethod", "rSkipMismatches", "rComment"), value=idaapi.reg_read_int("vtable_io", VOptions.SkipMismatches | VOptions.CommentReusedFunctions, "cImportOptions")
285 | ),
286 | "cExportOptions": idaapi.Form.RadGroupControl(
287 | ("rDoNotExport", "rExportNormal", "rExportOnly"), value=idaapi.reg_read_int("vtable_io", VOptions.DoNotExport, "cExportOptions")
288 | ),
289 | "iFileExport": idaapi.Form.FileInput(save=True, value=idaapi.reg_read_string("vtable_io", "iFileExport", "*.json"), swidth=50),
290 | })
291 |
292 | def OnFormChange(self, fid):
293 | # print(fid)
294 | return 1
295 |
296 | @staticmethod
297 | def init_options():
298 | f = VForm()
299 | f, _ = f.Compile()
300 | go = f.Execute()
301 | if not go:
302 | return None
303 |
304 | options = VOptions()
305 | for control in f.controls.keys():
306 | if control != "FormChangeCb":
307 | currval = getattr(f, control).value
308 | setattr(options, control, currval)
309 | if isinstance(currval, str):
310 | idaapi.reg_write_string("vtable_io", currval, control)
311 | elif isinstance(currval, int):
312 | idaapi.reg_write_int("vtable_io", currval, control)
313 | else:
314 | print(f"Unsupported type for {control} - {type(currval)}")
315 |
316 | f.Free()
317 | return options
318 |
319 | OS_Linux = 0
320 | OS_Win = 1
321 |
322 | FUNCS = 0
323 | EXPORTS = 0
324 |
325 | VOPTIONS = None
326 |
327 | def get_os():
328 | ftype = idaapi.get_file_type_name()
329 | if "ELF" in ftype:
330 | return OS_Linux
331 | elif "PE" in ftype:
332 | return OS_Win
333 | return -1
334 |
335 | # Read a ctypes class from an ea
336 | def get_class_from_ea(classtype, ea):
337 | bytestr = idaapi.get_bytes(ea, ctypes.sizeof(classtype))
338 | return classtype.from_buffer_copy(bytestr)
339 |
340 | def rva_to_ea(ea):
341 | if idaapi.inf_is_64bit():
342 | return idaapi.get_imagebase() + ea
343 | return ea
344 |
345 | # Anything past Classname::
346 | # Thank you CTFPlayer::SOCacheUnsubscribed...
347 | def get_func_postname(name):
348 | retname = name
349 | template = 0
350 | iterback = 0
351 | for i, c in enumerate(retname):
352 | if c == "<":
353 | template += 1
354 | elif c == ">":
355 | template -= 1
356 | # Find ( and break if we're not in a template
357 | elif c == "(" and template == 0:
358 | iterback = i
359 | break
360 |
361 | # Run backwards from ( until we hit a ::
362 | for i in range(iterback, -1, -1):
363 | if retname[i] == ":":
364 | retname = retname[i+1:]
365 | break
366 |
367 | return retname
368 |
369 | def parse_vtable_names(ea):
370 | funcs = []
371 |
372 | while ea != idc.BADADDR:
373 | # Using flags sped this up by a lot
374 | # Went from 4 secs to ~1.3
375 | flags = idaapi.get_full_flags(ea)
376 | if not is_off(flags) or not is_ptr(flags):
377 | break
378 |
379 | if idaapi.has_name(flags):
380 | break
381 |
382 | offs = get_ptr(ea)
383 | fflags = idaapi.get_full_flags(offs)
384 | if not idaapi.is_func(fflags):
385 | break
386 |
387 | name = idaapi.get_name(offs)
388 | funcs.append(name)
389 |
390 | ea = idaapi.next_head(ea, idc.BADADDR)
391 | return funcs
392 |
393 | def parse_vtable_addresses(ea):
394 | funcs = []
395 |
396 | while ea != idc.BADADDR:
397 | flags = idaapi.get_full_flags(ea)
398 | if not is_off(flags) or not is_ptr(flags):
399 | break
400 |
401 | offs = get_ptr(ea)
402 | fflags = idaapi.get_full_flags(offs)
403 | if not has_any_name(fflags):
404 | break
405 |
406 | # if not idaapi.is_func(fflags):# or not idaapi.has_name(fflags):
407 | # Sometimes IDA doesn't think a function is a function
408 | # This is all CSteamWorksGameStatsUploader's fault :(
409 | if not is_code(fflags):
410 | break
411 |
412 | funcs.append(VFunc.create(ea=offs, vaddr=ea))
413 |
414 | ea = idaapi.next_head(ea, idc.BADADDR)
415 | return funcs
416 |
417 | def parse_si_tinfo(ea, tinfos):
418 | for xref in idautils.XrefsTo(ea):
419 | tinfo = get_class_from_ea(si_class_type_info, xref.frm)
420 | tinfos[xref.frm + si_class_type_info.pParent.offset] = tinfo.pParent
421 |
422 | def parse_pointer_tinfo(ea, tinfos):
423 | for xref in idautils.XrefsTo(ea):
424 | tinfo = get_class_from_ea(pointer_type_info, xref.frm)
425 | tinfos[xref.frm + pointer_type_info.pType.offset] = tinfo.pType
426 |
427 | def parse_vmi_tinfo(ea, tinfos):
428 | for xref in idautils.XrefsTo(ea):
429 | tinfotype = create_vmi_class_type_info(xref.frm)
430 | tinfo = get_class_from_ea(tinfotype, xref.frm)
431 |
432 | for i in range(tinfo.basecount):
433 | offset = vmi_class_type_info.pBaseArray.offset + i * ctypes.sizeof(base_class_type_info)
434 | basetinfo = get_class_from_ea(base_class_type_info, xref.frm + offset)
435 | tinfos[xref.frm + offset + base_class_type_info.basetype.offset] = basetinfo.basetype
436 |
437 | def get_tinfo_vtables(ea, tinfos, vtables):
438 | if ea == idc.BADADDR:
439 | return
440 |
441 | for tinfoxref in idautils.XrefsTo(ea, idaapi.XREF_DATA):
442 | count = 0
443 | mangled = idaapi.get_name(tinfoxref.frm)
444 | demangled = idc.demangle_name(mangled, idaapi.MNG_LONG_FORM)
445 | if demangled is None:
446 | print(f"[VTABLE IO] Invalid name at {tinfoxref.frm:#x}")
447 | continue
448 |
449 | classname = demangled[len("`typeinfo for'"):]
450 | for xref in idautils.XrefsTo(tinfoxref.frm, idaapi.XREF_DATA):
451 | if xref.frm not in tinfos.keys():
452 | # If address lies in a function
453 | if idaapi.is_func(idaapi.get_full_flags(xref.frm)):
454 | continue
455 |
456 | count += 1
457 | vtables[classname] = vtables.get(classname, []) + [xref.frm]
458 |
459 | def read_vtables_linux():
460 | f = idaapi.ask_file(1, "*.json", "Select a file to export to")
461 | if not f:
462 | return
463 |
464 | WaitBox.show("Parsing typeinfo")
465 |
466 | # Step 1 and 2, crawl xrefs and stick the inherited class type infos into a structure
467 | # After this, we can run over the xrefs again and see which xrefs come from another structure
468 | # The remaining xrefs are either vtables or weird math in a function
469 | xreftinfos = {}
470 |
471 | def getparse(name, fn, quiet=False):
472 | tinfo = idc.get_name_ea_simple(name)
473 | if tinfo == idc.BADADDR and not quiet:
474 | print(f"[VTABLE IO] Type info {name} not found. Skipping...")
475 | return None
476 |
477 | if fn is not None:
478 | fn(tinfo, xreftinfos)
479 | return tinfo
480 |
481 | # Don't need to parse base classes
482 | tinfo = getparse("_ZTVN10__cxxabiv117__class_type_infoE", None)
483 | tinfo_pointer = getparse("_ZTVN10__cxxabiv119__pointer_type_infoE", parse_pointer_tinfo, True)
484 | tinfo_si = getparse("_ZTVN10__cxxabiv120__si_class_type_infoE", parse_si_tinfo)
485 | tinfo_vmi = getparse("_ZTVN10__cxxabiv121__vmi_class_type_infoE", parse_vmi_tinfo)
486 |
487 | if len(xreftinfos) == 0:
488 | print("[VTABLE IO] No type infos found. Are you sure you're in a C++ binary?")
489 | return
490 |
491 | # Step 3, crawl xrefs to again and if the xref is not in the type info structure, then it's a vtable
492 | WaitBox.show("Discovering vtables")
493 | vtables = {}
494 | get_tinfo_vtables(tinfo, xreftinfos, vtables)
495 | get_tinfo_vtables(tinfo_pointer, xreftinfos, vtables)
496 | get_tinfo_vtables(tinfo_si, xreftinfos, vtables)
497 | get_tinfo_vtables(tinfo_vmi, xreftinfos, vtables)
498 |
499 | # Now, we have a list of vtables and their respective classes
500 | WaitBox.show("Parsing vtables")
501 | jsondata = parse_vtables(vtables)
502 |
503 | WaitBox.show("Writing to file")
504 | with open(f, "w") as f:
505 | json.dump(jsondata, f, indent=4, sort_keys=True)
506 |
507 | def parse_ti(ea, tis):
508 | typedesc = ea
509 | flags = idaapi.get_full_flags(ea)
510 | if is_code(flags):
511 | return
512 |
513 | name = idc.get_name(ea)
514 | if not name:
515 | return
516 |
517 | # Pointer type
518 | # I have no idea what this is but it is not what we want
519 | if name.startswith("??_R0P"):
520 | return
521 |
522 | try:
523 | classname = idaapi.demangle_name(name, idaapi.MNG_SHORT_FORM)
524 | classname = classname.removeprefix("class ")
525 | classname = classname.removeprefix("struct TypeDescriptor ")
526 | classname = classname.removesuffix(" `RTTI Type Descriptor'")
527 | classname = classname.strip()
528 | except:
529 | print(f"[VTABLE IO] Invalid vtable name at {ea:#x}")
530 | return
531 |
532 | if classname in tis.keys():
533 | return
534 |
535 | cols = []
536 | vtables = []
537 |
538 | # Then figure out which xref is a/the COL
539 | for xref in idautils.XrefsTo(typedesc):
540 | ea = xref.frm
541 | flags = idaapi.get_full_flags(ea)
542 |
543 | # Dynamic cast
544 | if is_code(flags):
545 | continue
546 |
547 | name = idaapi.get_name(ea)
548 | # Class type descriptor and/or random global data
549 | # Kind of a hack but let's assume no one will rename these
550 | if name and (name.startswith("??_R1") or name.startswith("off_")):
551 | continue
552 |
553 | ea -= 4
554 | name = idaapi.get_name(ea)
555 | # Catchable types
556 | if name and name.startswith("__CT"):
557 | continue
558 |
559 | # COL
560 | ea -= 8
561 | workaround = False
562 | if idaapi.is_unknown(idaapi.get_full_flags(ea)):
563 | print(f"[VTABLE IO] Possible COL is unknown at {ea:#x}. This may be an unreferenced vtable. Trying workaround...")
564 | # This might be a bug with IDA, but sometimes the COL isn't analyzed
565 | # If there's still a reference, then we can still trace back
566 | # If there is a list of functions (or even just one), then it's probably a vtable,
567 | # but we'll still warn the user that it might be garbage
568 | refs = list(idautils.XrefsTo(ea))
569 | if len(refs) == 1:
570 | vtable = refs[0].frm + ctypes.sizeof(ea_t)
571 | tryfunc = get_ptr(vtable + ctypes.sizeof(ea_t))
572 | funcflags = idaapi.get_full_flags(tryfunc)
573 | if idaapi.is_func(funcflags):
574 | print(f" - Workaround successful. Please ensure that {vtable:#x} is a vtable.")
575 | workaround = True
576 |
577 | if not workaround:
578 | print(" - Workaround failed. Skipping...")
579 | continue
580 |
581 | name = idaapi.get_name(ea)
582 | if not workaround and (not name or not name.startswith("??_R4")):
583 | print(f"[VTABLE IO] Invalid name at {ea:#x}. Possible unwind info. Ignoring...")
584 | continue
585 |
586 | # In 64-bit PEs, the COL references itself, remove this
587 | refs = list(idautils.XrefsTo(ea))
588 | if idaapi.inf_is_64bit():
589 | for n in range(len(refs)-1, -1, -1):
590 | if refs[n].frm == ea + RTTICompleteObjectLocator.pSelf.offset:
591 | del refs[n]
592 |
593 | # Now that we have the COL, we can use it to find the vtable that utilizes it and its thisoffs
594 | # We need to use this later because of overloads so we cache it in a list
595 | if len(refs) != 1:
596 | print(f"[VTABLE IO] Multiple vtables point to same COL - {name} at {ea:#x}")
597 | continue
598 |
599 | cols.append(ea)
600 | vtable = refs[0].frm + ctypes.sizeof(ea_t)
601 | vtables.append(vtable)
602 |
603 | # Can have RTTI without a vtable
604 | tis[classname] = WinTI(typedesc, classname, cols, vtables)
605 |
606 |
607 | def read_ti_win():
608 | # Step 1, get the vftable of type_info
609 | type_info = idc.get_name_ea_simple("??_7type_info@@6B@")
610 | if type_info == idc.BADADDR:
611 | # If type_info doesn't exist as a label, we might still be able to snipe it with the string method
612 | strings = list(idautils.Strings())
613 | for s in strings:
614 | if str(s) == ".?AVtype_info@@":
615 | ea = s.ea - TypeDescriptor.name.offset
616 | type_info = rva_to_ea(idaapi.get_wide_dword(ea))
617 | if type_info == idc.BADADDR: # Only bail out if the string fallback also failed
618 | print("[VTABLE IO] type_info not found. Are you sure you're in a C++ binary?")
619 | return None
620 |
621 | tis = {}
622 |
623 | # Step 2, get all xrefs to type_info
624 | # Get type descriptor
625 | for typedesc in idautils.XrefsTo(type_info):
626 | parse_ti(typedesc.frm, tis)
627 |
628 | # In some cases, IDA fails to reference some type descriptors with type_info
629 | # Not exactly sure why, but it lists the ea of type_info as a "hash" when in reality it isn't
630 | # A workaround for this is to parse type descriptor strings (".?AV*"), load up their references,
631 | # walk backwards to the start of what is supposed to be the type descriptor, and ensure that
632 | # its DWORD is the type_info vtable
633 | # We also make this an optional feature because it's slow in older IDA versions and not necessarily needed
634 | # I only found this to be a problem in NMRIH, so it appears to be rare
635 | if VOPTIONS.cImportOptions & VOptions.StringMethod:
636 | WaitBox.show("Performing string parsing")
637 | string_method(type_info, tis)
638 |
639 | return tis
640 |
641 | def string_method(type_info, tis):
642 | for string in idautils.Strings():
643 | sstr = str(string)
644 | if not sstr.startswith(".?AV"):
645 | continue
646 |
647 | ea = string.ea
648 | ea -= TypeDescriptor.name.offset
649 | trytinfo = rva_to_ea(idaapi.get_wide_dword(ea))
650 | # This is a weird string that isn't a part of a type descriptor
651 | if trytinfo != type_info:
652 | continue
653 |
654 | parse_ti(ea, tis)
655 |
656 |
657 | def parse_vtables(vtables):
658 | jsondata = {}
659 | ptrsize = ctypes.sizeof(ea_t)
660 | for classname, tables in vtables.items():
661 | # We don't *need* to do any sort of sorting in Linux and can just capture the thisoffset
662 | # The Windows side of the script can organize later
663 | for ea in tables:
664 | thisoffs = get_ptr(ea - ptrsize)
665 |
666 | funcs = parse_vtable_names(ea + ptrsize)
667 | # Can be empty if there's an xref in the global offset table (.got) section
668 | # Fortunately the parse_vtable_names function doesn't grab anything from there
669 | if funcs:
670 | classdata = jsondata.get(classname, {})
671 | classdata[ptr_t(thisoffs).value] = funcs
672 | jsondata[classname] = classdata
673 |
674 | return jsondata
675 |
676 | # See if the thunk is actually a thunk and jumps to
677 | # a function in the vtable
678 | def is_thunk(thunkfunc, targetfuncs):
679 | ea = thunkfunc.ea
680 | func = idaapi.get_func(ea)
681 | funcend = func.end_ea
682 |
683 | # if funcend - ea > 20: # Highest I've seen is 13 opcodes but this works ig
684 | # return False
685 |
686 | addr = idc.next_head(ea, funcend)
687 |
688 | if addr == idc.BADADDR:
689 | return False
690 |
691 | b = idaapi.get_byte(addr)
692 | if b in (0xEB, 0xE9):
693 | insn = idaapi.insn_t()
694 | idaapi.decode_insn(insn, addr)
695 | jmpaddr = insn.Op1.addr
696 | return any(jmpaddr == i.ea for i in targetfuncs)
697 |
698 | return False
699 |
700 | def build_export_table(linuxtables, wintables):
701 | # Table is built mainly for readability but having one that is actually parsable would
702 | # be a cool idea for the future
703 | exporttable = {}
704 | # Save Linux only tables for exporting too
705 | winless = {k: linuxtables[k] for k in linuxtables.keys() - wintables.keys()}
706 | global EXPORTS
707 | for classname, wintable in wintables.items():
708 | linuxtable = linuxtables.get(classname, None)
709 | if linuxtable is None:
710 | continue
711 |
712 | # Sort and int-ify Linux again
713 | newlinuxtable = [(abs(int(k)), v) for k, v in linuxtable.items()]
714 | newlinuxtable.sort(key=lambda x: x[0])
715 |
716 | exportnode = []
717 | purecalls = []
718 | for currlinuxitems, currwinitems in zip(newlinuxtable, wintable.items()):
719 | lthisoffs, ltable = currlinuxitems
720 | wthisoffs, wtable = currwinitems
721 |
722 | windiscovered = set()
723 | prepend = f"[L{lthisoffs}/W{wthisoffs}]"
724 | for i, mangledname in enumerate(ltable):
725 | # Save for later
726 | if mangledname.startswith("__cxa"):
727 | # print(f"Found purecall {classname}::{mangledname} at {i}")
728 | purecalls.append(i)
729 | continue
730 |
731 | winidx = -1
732 | for j, winfunc in enumerate(wtable):
733 | if mangledname == winfunc.mangledname:
734 | winidx = j
735 | windiscovered.add(j)
736 | break
737 |
738 | s = f"L{i}"
739 | if winidx != -1:
740 | s = f"{s:<8}W{winidx}"
741 |
742 | if not mangledname.startswith("sub_"):
743 | shortname = idaapi.demangle_name(mangledname, idaapi.MNG_SHORT_FORM) or "purecall"
744 | else:
745 | shortname = mangledname
746 | newprepend = f"{prepend:<20}{s:<8}"
747 | s = f"{newprepend:<36}{shortname}"
748 | exportnode.append(s)
749 |
750 | # Purecalls are a bit special
751 | # We can't just grab the Linux index and use it for Windows
752 | # So we 1: do this after everything else is done, and 2: find the first
753 | # Windows purecall after the last purecall we found for each one
754 | # in the Linux table
755 | # This is kinda hard to test edge cases, but we'll assume this works
756 | lastidx = 0
757 | for i in purecalls:
758 | winidx = -1
759 | for j, winfunc in enumerate(wtable[lastidx:]):
760 | if winfunc.mangledname == "__cxa_pure_virtual":
761 | winidx = j + lastidx
762 | break
763 |
764 | s = f"L{i}"
765 | if winidx != -1:
766 | s = f"{s:<8}W{winidx}"
767 |
768 | shortname = "purecall"  # mangledname is stale here (left over from the loop above)
769 | newprepend = f"{prepend:<20}{s:<8}"
770 | s = f"{newprepend:<36}{shortname}"
771 | exportnode.insert(i, s)
772 | lastidx = winidx + 1 if winidx != -1 else lastidx  # don't rewind the search when no match was found
773 | windiscovered.add(winidx)
774 |
775 | # For thunks, figure out which Windows indices were not discovered and add them
776 | # Inherited table might be out of order but we favor Linux anyways
777 | for j, winfunc in enumerate(wtable):
778 | if j not in windiscovered:
779 | dummy = ""
780 | s = f"W{j}"
781 |
782 | shortname = idaapi.demangle_name(winfunc.mangledname, idaapi.MNG_SHORT_FORM) or "purecall"
783 | newprepend = f"{prepend:<20}{dummy:<8}{s:<8}"
784 | s = f"{newprepend:<36}{shortname}"
785 | exportnode.append(s)
786 |
787 | EXPORTS += 1
788 | exporttable[classname] = exportnode
789 |
790 | # Export Linux only tables
791 | for classname, linuxtable in winless.items():
792 | # Sort and int-ify Linux again
793 | newlinuxtable = [(abs(int(k)), v) for k, v in linuxtable.items()]
794 | newlinuxtable.sort(key=lambda x: x[0])
795 | exportnode = []
796 | for thisoffs, table in newlinuxtable:
797 | prepend = f"[L{thisoffs}]"
798 | for i, mangledname in enumerate(table):
799 | shortname = idaapi.demangle_name(mangledname, idaapi.MNG_SHORT_FORM) or "purecall"
800 | newprepend = f"{prepend:<20}L{i:<8}"
801 | s = f"{newprepend:<36}{shortname}"
802 | exportnode.append(s)
803 |
804 | EXPORTS += 1
805 | exporttable[classname] = exportnode
806 | return exporttable
807 |
808 | def read_vtables_win(classname, ti, wintable, baseclasses):
809 | if classname in wintable.keys():
810 | return
811 |
812 | vclass = VClass(name=classname, baseclasses=baseclasses)  # classname is known to be absent from wintable here
813 | for colea in ti.cols:
814 | vclass.parse(colea, wintable)
815 |
816 | wintable[classname] = vclass
817 |
818 | def read_tinfo_win(classname, ti, winti, wintable, baseclasses):
819 | # Strange cases where there is a base class descriptor with no vtable
820 | if classname not in winti.keys():
821 | return
822 |
823 | if classname in wintable.keys():
824 | return
825 |
826 | # No COLs, but we still keep the type in the wintable
827 | if not ti.cols:
828 | wintable[classname] = VClass(name=classname, baseclasses=baseclasses)
829 | return
830 |
831 | # So essentially we just run through each base class in the hierarchy descriptor
832 | # and recursively parse the base classes of the base classes
833 | # Sort of like a reverse insertion sort only not really a sort
834 | for colea in ti.cols:
835 | col = get_class_from_ea(RTTICompleteObjectLocator, colea)
836 | hierarchydesc = get_class_from_ea(RTTIClassHierarchyDescriptor, rva_to_ea(col.pClassHierarchyDescriptor))
837 | numitems = hierarchydesc.numBaseClasses
838 | arraystart = rva_to_ea(hierarchydesc.pBaseClassArray)
839 |
840 | # Go backwards because we should start parsing from the basest base class
841 | for i in range(numitems - 1, -1, -1):
842 | offset = arraystart + i * ctypes.sizeof(ctypes.c_uint32)
843 | descea = rva_to_ea(idaapi.get_wide_dword(offset))
844 | parentname = idaapi.demangle_name(idaapi.get_name(descea), idaapi.MNG_SHORT_FORM)
845 | if not parentname:
846 | # Another undefining IDA moment
847 | # print(f"[VTABLE IO] Invalid parent name at {offset:#x}")
848 | typedesc = rva_to_ea(idaapi.get_wide_dword(descea))
849 | parentname = idaapi.demangle_name(idaapi.get_name(typedesc), idaapi.MNG_SHORT_FORM)
850 |
851 | # Should be impossible since this is the type descriptor
852 | if not parentname:
853 | print(f"[VTABLE IO] Invalid parent name at {offset:#x} - type descriptor at {typedesc:#x}")
854 | continue
855 |
856 | parentname = parentname.removeprefix("class ")
857 | parentname = parentname.removeprefix("struct TypeDescriptor ")
858 | parentname = parentname.removesuffix(" `RTTI Type Descriptor'")
859 | else:
860 | parentname = parentname.split("::`RTTI Base Class Descriptor", 1)[0]  # leaves the name intact if the marker is missing
861 |
862 | # End of the line
863 | if i == 0:
864 | read_vtables_win(classname, winti[parentname], wintable, baseclasses)
865 | elif parentname in winti.keys():
866 | read_tinfo_win(parentname, winti[parentname], winti, wintable, baseclasses)
867 | # Once again relying on dicts being ordered
868 | baseclasses[parentname] = wintable[parentname]
869 |
870 | def gen_win_tables(winti):
871 | # So first we start looping windows typeinfos because
872 | # we're going to go from the COL -> ClassHierarchyDescriptor -> BaseClassArray
873 | # The reason why we're doing this is because of subclass overloads
874 | # For a history lesson, see https://github.com/Scags/IDA-Scripts/blob/125f1877a24da48062e62efcfb7d8a63e3bd939b/vtable_io.py#L251-L263
875 | # We're going to fix this by writing (and thus caching the names of) the baseclasses of classes first
876 | # This way, we'll be able to know the classname and the virtual functions contained therein,
877 | # and thus we will know if there is an overload that exists in a subclass
878 | # This relies on the fact that dicts are ordered in Python 3.7+
879 | # If you're running an older Python version, replace wintables with an OrderedDict
880 |
881 | # Same format as linuxtables
882 | # {classname: VClass(classname, {thisoffs: [vfunc...], ...}, ...})
883 | wintables = {}
884 | for classname, ti in winti.items():
885 | read_tinfo_win(classname, ti, winti, wintables, {})
886 |
887 | return wintables
888 |
889 | def fix_windows_classname(classname):
890 | # Double pointers are spaced...
891 | classnamefix = classname.replace("* *", "**")
892 |
893 | # References/pointers that are const are spaced...
894 | classnamefix = classnamefix.replace("const &", "const&")
895 | classnamefix = classnamefix.replace("const *", "const*")
896 |
897 | # And true/false is instead replaced with 1/0
898 | def replacer(m):
899 | # Avoid replacing 1s and 0s that are a part of classnames
900 | # Thanks ChatGPT
901 | return re.sub(r"(?<=\W)1(?=\W)", "true", re.sub(r"(?<=\W)0(?=\W)", "false", m.group()))
902 | classnamefix = re.sub(r"<[^>]+>", replacer, classnamefix)
903 |
904 | # Other quirks are inline structs and templated enums
905 | # which are pretty much impossible to deduce
906 | return classnamefix
907 |
908 | # Idk why but sometimes pointers have a mind of their own
909 | def fix_windows_classname2(classname):
910 | return classname.replace(" *", "*")
911 |
912 | def fix_win_overloads(linuxitems, winitems, vclass, functable):
913 | for i in range(min(len(linuxitems), len(winitems))):
914 | currfuncs = linuxitems[i].funcs
915 | vfuncs = []
916 | for u in range(len(currfuncs)):
917 | f = VFunc.create(mangledname=currfuncs[u])
918 | for j, baseclass in enumerate(vclass.baseclasses.values()):
919 | if f.postname in baseclass.postnames:
920 | f.inheritid = j
921 | break
922 |
923 | # Unbelievable hack right here
924 | # Looks like pointers are getting shoved next to their types instead of spaced sometimes
925 | # Not entirely sure what causes this.
926 | # CAI_BaseNPC::CanStandOn(CBaseEntity*) vs CBaseEntity::CanStandOn(CBaseEntity *)
927 | # Maybe it's the difference in the types of the pointers and this?
928 | trystr = f.postname
929 | breakout = False
930 | for k in range(trystr.count(" *")):
931 | trystr = trystr.replace(" *", "*", 1)
932 | if trystr in baseclass.postnames:
933 | f.inheritid = j
934 | f.postname = trystr
935 | breakout = True
936 | break
937 |
938 | if breakout:
939 | break
940 |
941 | vfuncs.append(f)
942 |
943 | # Remove Linux's extra dtor
944 | for u, f in enumerate(vfuncs):
945 | if "::~" in f.name:
946 | del vfuncs[u]
947 | break
948 |
949 | # Windows does overloads backwards, reverse them
950 | funcnameset = set()
951 | u = 0
952 | while u < len(vfuncs):
953 | f = vfuncs[u]
954 |
955 | if f.mangledname.startswith("__cxa"):# or f.mangledname.startswith("_ZThn") or f.mangledname.startswith("_ZTv"):
956 | u += 1
957 | continue
958 |
959 | if not f.name:
960 | u += 1
961 | continue
962 |
963 | # This is an overload, we take the function name here, and push it somewhere else
964 | if f.sname in funcnameset:
965 | # Find the first index of the overload
966 | firstidx = -1
967 | for k in range(u):
968 | if vfuncs[k].sname == f.sname:
969 | firstidx = k
970 | break
971 |
972 | if firstidx == -1:
973 | print(f"[VTABLE IO] An impossibility has occurred. \"{f.sname}\" ({f.mangledname}, {f.name}) is in funcnameset but there is no possible overload.")
974 |
975 | overloadfunc = vfuncs[firstidx]
976 | if overloadfunc.inheritid != f.inheritid:
977 | # Although this function is an overload, it was created in a subclass
978 | # So we don't move it
979 | u += 1
980 | continue
981 |
982 | # Remove the current func from the list
983 | del vfuncs[u]
984 | # And insert it into the first index
985 | vfuncs.insert(firstidx, f)
986 | u += 1
987 | continue
988 |
989 | funcnameset.add(f.sname)
990 | u += 1
991 |
992 | for f in vfuncs:
993 | vclass.postnames.add(f.postname)
994 | functable[linuxitems[i].thisoffs] = vfuncs
995 |
996 | def thunk_dance(winitems, vclass, functable):
997 | # Now it's time for thunk dancing
998 | mainltable = functable[0]
999 | mainwtable = winitems[0].funcs
1000 | for currlinuxitems, currwinitems in zip(functable.items(), winitems):
1001 | thisoffs, ltable = currlinuxitems
1002 | wtable = currwinitems.funcs
1003 | if thisoffs == 0:
1004 | continue
1005 |
1006 | # Remove any extra dtors from this table
1007 | dtorcount = 0
1008 | for i, f in enumerate(ltable):
1009 | if "::~" in f.name:
1010 | dtorcount += 1
1011 | if dtorcount > 1:
1012 | del ltable[i]
1013 | break
1014 |
1015 | i = 0
1016 | while i < len(mainltable):
1017 | f = mainltable[i]
1018 | if f.mangledname.startswith("__cxa"):
1019 | i += 1
1020 | continue
1021 |
1022 | # I shouldn't need to do this, but destructors are wonky
1023 | if i == 0 and "::~" in f.name:
1024 | i += 1
1025 | continue
1026 |
1027 | if not f.postname:
1028 | i += 1
1029 | continue
1030 |
1031 | # Windows skips the vtable function if its implementation is in the thunks
1032 | # A way to check if this is true is to see which thunks are actually thunks
1033 | # Then we just pop its name from the main table, since it's no longer there
1034 | thunkidx = -1
1035 | for u in range(len(ltable)):
1036 | if ltable[u].postname == f.postname:
1037 | thunkidx = u
1038 | break
1039 |
1040 | if thunkidx != -1:
1041 | try:
1042 | # We can't exactly see if the possible thunk jumps to a certain function (mainwtable[i]) because
1043 | # it's impossible to know what that function even is, so we instead check to see if
1044 | # it jumps into any function in the main vtable which is good enough
1045 | if not is_thunk(wtable[thunkidx], mainwtable):
1046 | ltable[thunkidx] = mainltable[i]
1047 | del mainltable[i]
1048 | continue
1049 | except Exception:
1050 | print(f"[VTABLE IO] Anomalous thunk: {vclass.name}::{f.postname}, mainwtable {len(mainwtable)} wtable {len(wtable)} thunkidx {thunkidx} thisoffs {thisoffs}")
1051 | pass
1052 | i += 1
1053 |
1054 | # Update current linux table
1055 | functable[thisoffs] = ltable
1056 |
1057 | # Update main table
1058 | functable[0] = mainltable
1059 |
1060 | def prep_linux_vtables(linuxitems, winitems, vclass):
1061 | functable = {}
1062 |
1063 | fix_win_overloads(linuxitems, winitems, vclass, functable)
1064 |
1065 | # No thunks, we are done
1066 | if min(len(linuxitems), len(winitems)) == 1:
1067 | return functable
1068 |
1069 | thunk_dance(winitems, vclass, functable)
1070 |
1071 | # Ready to write
1072 | return functable
1073 |
1074 | def merge_tables(functable, winitems):
1075 | for items in zip(functable.items(), winitems):
1076 | # Should probably make this unpacking/packing more efficient
1077 | currlitems, currwitems = items
1078 | _, ltable = currlitems
1079 | wtable = currwitems.funcs
1080 |
1081 | for i, f in enumerate(ltable):
1082 | targetname = f.mangledname
1083 | # Purecall, which should already be handled on the Windows side
1084 | if targetname.startswith("__cxa"):
1085 | continue
1086 |
1087 | # Size mismatch, skip it
1088 | try:
1089 | currfunc = wtable[i]
1090 | except IndexError:
1091 | continue
1092 | targetaddr = currfunc.ea
1093 |
1094 | flags = idaapi.get_full_flags(targetaddr)
1095 | # Already typed
1096 | if idaapi.has_name(flags):
1097 | if VOPTIONS.cImportOptions & VOptions.CommentReusedFunctions:
1098 | # If it's a Windows optimization (nullsubs, etc),
1099 | # add a comment with the actual name
1100 | # There's gotta be a way to rename the reference but not the function
1101 | currmangledname = idaapi.get_name(targetaddr)
1102 | currname = idaapi.demangle_name(currmangledname, idaapi.MNG_LONG_FORM)
1103 | if not currname or currname != f.name:
1104 | # Use short name for cmt since that's what IDA uses
1105 | shortname = idaapi.demangle_name(f.mangledname, idaapi.MNG_SHORT_FORM)
1106 | idaapi.set_cmt(currfunc.vaddr, shortname, False)
1107 | continue
1108 |
1109 | func = idaapi.get_func(targetaddr)
1110 | # Not actually a function somehow
1111 | if not func:
1112 | continue
1113 |
1114 | # A library function (should already have a name)
1115 | if func.flags & idaapi.FUNC_LIB:
1116 | continue
1117 |
1118 | idaapi.set_name(targetaddr, targetname, idaapi.SN_FORCE)
1119 | global FUNCS
1120 | FUNCS += 1
1121 |
1122 | def compare_tables(wintables, linuxtables):
1123 | functables = {}
1124 | for classname, vclass in wintables.items():
1125 | if not vclass.vfuncs:
1126 | continue
1127 |
1128 | linuxtable = linuxtables.get(classname, {})
1129 | if not linuxtable:
1130 | # Some weird Windows quirks
1131 | classnamefix = fix_windows_classname(classname)
1132 | linuxtable = linuxtables.get(classnamefix, {})
1133 | if not linuxtable:
1134 | # Another very weird quirk
1135 | classnamefix = fix_windows_classname2(classnamefix)
1136 | linuxtable = linuxtables.get(classnamefix, {})
1137 | if not linuxtable:
1138 | # print(f"[VTABLE IO] {classname}{f' (tried {classnamefix})' if classname != classnamefix else ''} not found in Linux tables. Skipping...")
1139 | continue
1140 |
1141 | winitems = [FuncList(thisoffs, funcs) for thisoffs, funcs in vclass.vfuncs.items()]
1142 | # Sort by thisoffs, smallest first
1143 | winitems.sort(key=lambda x: x.thisoffs)
1144 |
1145 | # Convert the string thisoffs to int
1146 | # Linux thisoffses are negative, abs them
1147 | linuxitems = [FuncList(abs(int(thisoffs)), funcs) for thisoffs, funcs in linuxtable.items()]
1148 | linuxitems.sort(key=lambda x: x.thisoffs)
1149 |
1150 | # If there's a size mismatch (very rare), then most likely IDA failed to analyze
1151 | # a certain vtable, so we can't continue given the high probability of catastrophic failure
1152 | if len(winitems) != len(linuxitems):
1153 | print(f"[VTABLE IO] {classname} vtable # mismatch - L{len(linuxitems)} W{len(winitems)}. Skipping...")
1154 | continue
1155 |
1156 | functable = prep_linux_vtables(linuxitems, winitems, vclass)
1157 |
1158 | skip = False
1159 | for items in zip(functable.items(), winitems):
1160 | currlinuxitems, currwinitems = items
1161 | thisoffs, ltable = currlinuxitems
1162 | if len(ltable) != len(currwinitems.funcs):
1163 | print(f"[VTABLE IO] WARNING: {vclass.name} vtable [W{currwinitems.thisoffs}/L{thisoffs}] may be wrong! L{len(ltable)} - W{len(currwinitems.funcs)} = {len(ltable) - len(currwinitems.funcs)}", end="")
1164 | if VOPTIONS.cImportOptions & VOptions.SkipMismatches:
1165 | print(". Skipping...")
1166 | skip = True
1167 | break
1168 | else:
1169 | print()
1170 |
1171 | if skip:
1172 | continue
1173 |
1174 | functables[classname] = functable
1175 |
1176 | # Write!
1177 | if VOPTIONS.cExportOptions != VOptions.ExportOnly:
1178 | merge_tables(functable, winitems)
1179 |
1180 | return functables
1181 |
1182 | def write_vtables():
1183 | WaitBox.show("Importing file")
1184 | linuxtables = None
1185 | try:
1186 | with open(VOPTIONS.iFileImport) as f:
1187 | linuxtables = json.load(f)
1188 | except FileNotFoundError:
1189 | print(f"[VTABLE IO] File {VOPTIONS.iFileImport} not found.")
1190 | return
1191 |
1192 | if not linuxtables:
1193 | return
1194 |
1195 | WaitBox.show("Parsing Windows typeinfo")
1196 | winti = read_ti_win()
1197 | if winti is None:
1198 | return
1199 |
1200 | WaitBox.show("Generating windows vtables")
1201 | wintables = gen_win_tables(winti)
1202 |
1203 | if not wintables:
1204 | return
1205 |
1206 | WaitBox.show("Comparing vtables")
1207 | functables = compare_tables(wintables, linuxtables)
1208 |
1209 | if VOPTIONS.cExportOptions in (VOptions.ExportOnly, VOptions.ExportNormal):
1210 | if VOPTIONS.iFileExport is None or VOPTIONS.iFileExport == "*.json":
1211 | print("[VTABLE IO] No export file specified.")
1212 | return
1213 |
1214 | WaitBox.show("Writing to file")
1215 | exporttable = build_export_table(linuxtables, functables)
1216 | with open(VOPTIONS.iFileExport, "w") as f:
1217 | json.dump(exporttable, f, indent=4, sort_keys=True)
1218 |
1219 |
1220 | def main():
1221 | os = get_os()
1222 | if os == -1:
1223 | print(f"Unsupported OS?: {idaapi.get_file_type_name()}")
1224 | idaapi.beep()
1225 | return
1226 |
1227 | try:
1228 | if os == OS_Linux:
1229 | read_vtables_linux()
1230 | print("Done!")
1231 | elif os == OS_Win:
1232 | global VOPTIONS
1233 | VOPTIONS = VForm.init_options()
1234 | if not VOPTIONS:
1235 | return
1236 |
1237 | write_vtables()
1238 | if FUNCS:
1239 | print(f"[VTABLE IO] Successfully typed {FUNCS} virtual functions")
1240 | else:
1241 | print("[VTABLE IO] No functions were typed")
1242 |
1243 | if EXPORTS:
1244 | print(f"[VTABLE IO] Successfully exported {EXPORTS} virtual tables")
1245 |
1246 | if FUNCS == 0 and EXPORTS == 0:
1247 | idaapi.beep()
1248 | except:
1249 | import traceback
1250 | traceback.print_exc()
1251 | print("Please file a bug report with supporting information at https://github.com/Scags/IDA-Scripts/issues")
1252 | idaapi.beep()
1253 |
1254 | WaitBox.hide()
1255 |
1256 | # import cProfile
1257 | # cProfile.run("main()", "vtable_io.prof")
1258 | main()
--------------------------------------------------------------------------------
/vtable_structs.py:
--------------------------------------------------------------------------------
1 | import idc
2 | import idautils
3 | import idaapi
4 | import ctypes
5 | import time
6 |
7 | from dataclasses import dataclass
8 |
9 | OS_Linux = 0
10 | OS_Win = 1
11 |
12 | if idaapi.inf_is_64bit():
13 | ea_t = ctypes.c_uint64
14 | ptr_t = ctypes.c_int64
15 | get_ptr = idaapi.get_qword
16 | FF_PTR = idc.FF_QWORD
17 | else:
18 | ea_t = ctypes.c_uint32
19 | ptr_t = ctypes.c_int32
20 | get_ptr = idaapi.get_dword
21 | FF_PTR = idc.FF_DWORD
22 |
23 | def is_ptr(f): return (f & idaapi.MS_CLS) == idc.FF_DATA and (f & idaapi.DT_TYPE) == FF_PTR
24 | def is_off(f): return (f & (idc.FF_0OFF|idc.FF_1OFF)) != 0
25 |
26 |
27 | _RTTICompleteObjectLocator_fields = [
28 | ("signature", ctypes.c_uint32), # signature
29 | ("offset", ctypes.c_uint32), # offset of this vtable in complete class (from top)
30 | ("cdOffset", ctypes.c_uint32), # offset of constructor displacement
31 | ("pTypeDescriptor", ctypes.c_uint32), # ref TypeDescriptor
32 | ("pClassHierarchyDescriptor", ctypes.c_uint32), # ref RTTIClassHierarchyDescriptor
33 | ]
34 |
35 | if idaapi.inf_is_64bit():
36 | _RTTICompleteObjectLocator_fields.append(("pSelf", ctypes.c_uint32)) # ref to object's base
37 |
38 | class RTTICompleteObjectLocator(ctypes.Structure):
39 | _fields_ = _RTTICompleteObjectLocator_fields
40 |
41 |
42 | class TypeDescriptor(ctypes.Structure):
43 | _fields_ = [
44 | ("pVFTable", ctypes.c_uint32), # reference to RTTI's vftable
45 | ("spare", ctypes.c_uint32), # internal runtime reference
46 | ("name", ctypes.c_uint8), # type descriptor name (no varstruct needed since we don't use this)
47 | ]
48 |
49 |
50 | class RTTIClassHierarchyDescriptor(ctypes.Structure):
51 | _fields_ = [
52 | ("signature", ctypes.c_uint32), # signature
53 | ("attribs", ctypes.c_uint32), # attributes
54 | ("numBaseClasses", ctypes.c_uint32), # # of items in the array of base classes
55 | ("pBaseClassArray", ctypes.c_uint32), # ref BaseClassArray
56 | ]
57 |
58 |
59 | class RTTIBaseClassDescriptor(ctypes.Structure):
60 | _fields_ = [
61 | ("pTypeDescriptor", ctypes.c_uint32), # ref TypeDescriptor
62 | ("numContainedBases", ctypes.c_uint32), # # of sub elements within base class array
63 | ("mdisp", ctypes.c_uint32), # member displacement
64 | ("pdisp", ctypes.c_uint32), # vftable displacement
65 | ("vdisp", ctypes.c_uint32), # displacement within vftable
66 | ("attributes", ctypes.c_uint32), # base class attributes
67 | ("pClassDescriptor", ctypes.c_uint32), # ref RTTIClassHierarchyDescriptor
68 | ]
69 |
70 |
71 | class base_class_type_info(ctypes.Structure):
72 | _fields_ = [
73 | ("basetype", ea_t), # Base class type
74 | ("offsetflags", ea_t), # Offset and info
75 | ]
76 |
77 |
78 | class class_type_info(ctypes.Structure):
79 | _fields_ = [
80 | ("pVFTable", ea_t), # reference to RTTI's vftable (__class_type_info)
81 | ("pName", ea_t), # ref to type name
82 | ]
83 |
84 | # I don't think this is right, but every case I found looked to be correct
85 | # This might be a vtable? IDA sometimes says it is but not always
86 | # Plus sometimes the flags member is 0x1, so it's not a thisoffs. Weird
87 | class pointer_type_info(class_type_info):
88 | _fields_ = [
89 | ("flags", ea_t), # Flags or something else
90 | ("pType", ea_t), # ref to type
91 | ]
92 |
93 | class si_class_type_info(class_type_info):
94 | _fields_ = [
95 | ("pParent", ea_t), # ref to parent type
96 | ]
97 |
98 | class vmi_class_type_info(class_type_info):
99 | _fields_ = [
100 | ("flags", ctypes.c_uint32), # flags
101 | ("basecount", ctypes.c_uint32), # # of base classes
102 | ("pBaseArray", base_class_type_info), # array of BaseClassArray
103 | ]
104 |
105 | def create_vmi_class_type_info(ea):
106 | bytestr = idaapi.get_bytes(ea, ctypes.sizeof(vmi_class_type_info))
107 | tinfo = vmi_class_type_info.from_buffer_copy(bytestr)
108 |
109 | # Since this is a varstruct, we create a dynamic class with the proper size and type and return it instead
110 | class vmi_class_type_info_dynamic(class_type_info):
111 | _fields_ = [
112 | ("flags", ctypes.c_uint32),
113 | ("basecount", ctypes.c_uint32),
114 | ("pBaseArray", base_class_type_info * tinfo.basecount),
115 | ]
116 |
117 | return vmi_class_type_info_dynamic
118 |
119 | # Idiot proof IDA wait box
120 | class WaitBox:
121 | buffertime = 0.0
122 | shown = False
123 | msg = ""
124 |
125 | @staticmethod
126 | def _show(msg):
127 | WaitBox.msg = msg
128 | if WaitBox.shown:
129 | idaapi.replace_wait_box(msg)
130 | else:
131 | idaapi.show_wait_box(msg)
132 | WaitBox.shown = True
133 |
134 | @staticmethod
135 | def show(msg, buffertime=0.1):
136 | if msg == WaitBox.msg:
137 | return
138 |
139 | if buffertime > 0.0:
140 | if time.time() - WaitBox.buffertime < buffertime:
141 | return
142 | WaitBox.buffertime = time.time()
143 | WaitBox._show(msg)
144 |
145 | @staticmethod
146 | def hide():
147 | if WaitBox.shown:
148 | idaapi.hide_wait_box()
149 | WaitBox.shown = False
150 | STRUCTS = 0
151 |
152 | class InfoCache(object):
153 | tinfos = {}
154 | vfuncs = {}
155 |
156 | # Class for windows type info, helps organize things
157 | @dataclass(frozen=True)
158 | class WinTI(object):
159 | typedesc: int
160 | name: str
161 | cols: list[int]
162 | vtables: list[int]
163 |
164 | @dataclass
165 | class VFuncRef:
166 | ea: int # Address to this function
167 | mangledname: str
168 | name: str
169 | postname: str
170 | sname: str
171 |
172 | @staticmethod
173 | def create(ea=idc.BADADDR, mangledname=""):
174 | if InfoCache.vfuncs.get(ea):
175 | return InfoCache.vfuncs[ea]
176 |
177 | name = ""
178 | postname = ""
179 | sname = ""
180 | if mangledname:
181 | name = idaapi.demangle_name(mangledname, idaapi.MNG_SHORT_FORM)
182 | if name:
183 | postname = get_func_postname(name)
184 | sname = postname.split("(")[0]
185 | else:
186 | postname = mangledname
187 | sname = mangledname
188 |
189 | vfunc = VFuncRef(ea, mangledname, name, postname, sname)
190 | InfoCache.vfuncs[ea] = vfunc
191 | return vfunc
192 |
193 | @dataclass(frozen=True)
194 | class VFunc:
195 | funcref: VFuncRef
196 | vaddr: int # Address to this function's reference in its vtable
197 |
198 | @staticmethod
199 | def create(vaddr):
200 | ea = get_ptr(vaddr)
201 | ref = VFuncRef.create(ea=ea, mangledname=idaapi.get_name(ea))  # create() already consults InfoCache.vfuncs
202 | return VFunc(ref, vaddr)
203 |
204 |
205 | def get_os():
206 | ftype = idaapi.get_file_type_name()
207 | if "ELF" in ftype:
208 | return OS_Linux
209 | elif "PE" in ftype:
210 | return OS_Win
211 | return -1
212 |
213 | # Read a ctypes class from an ea
214 | def get_class_from_ea(classtype, ea):
215 | bytestr = idaapi.get_bytes(ea, ctypes.sizeof(classtype))
216 | return classtype.from_buffer_copy(bytestr)
217 |
218 | def add_struc_ex(name):
219 | strucid = idaapi.get_struc_id(name)
220 | if strucid == idc.BADADDR:
221 | strucid = idaapi.add_struc(idc.BADADDR, name)
222 |
223 | return strucid
224 |
225 | # Anything past Classname::
226 | # Thank you CTFPlayer::SOCacheUnsubscribed...
227 | def get_func_postname(name):
228 | retname = name
229 | template = 0
230 | iterback = 0
231 | for i, c in enumerate(retname):
232 | if c == "<":
233 | template += 1
234 | elif c == ">":
235 | template -= 1
236 | # Find ( and break if we're not in a template
237 | elif c == "(" and template == 0:
238 | iterback = i
239 | break
240 |
241 | # Run backwards from ( until we hit a ::
242 | for i in range(iterback, -1, -1):
243 | if retname[i] == ":":
244 | retname = retname[i+1:]
245 | break
246 |
247 | return retname
248 |
249 | def rva_to_ea(ea):
250 | if idaapi.inf_is_64bit():
251 | return idaapi.get_imagebase() + ea
252 | return ea
253 |
254 | def parse_si_tinfo(ea, tinfos):
255 | for xref in idautils.XrefsTo(ea):
256 | tinfo = get_class_from_ea(si_class_type_info, xref.frm)
257 | tinfos[xref.frm + si_class_type_info.pParent.offset] = tinfo.pParent
258 |
259 |
260 | def parse_pointer_tinfo(ea, tinfos):
261 | for xref in idautils.XrefsTo(ea):
262 | tinfo = get_class_from_ea(pointer_type_info, xref.frm)
263 | tinfos[xref.frm + pointer_type_info.pType.offset] = tinfo.pType
264 |
265 |
266 | def parse_vmi_tinfo(ea, tinfos):
267 | for xref in idautils.XrefsTo(ea):
268 | tinfotype = create_vmi_class_type_info(xref.frm)
269 | tinfo = get_class_from_ea(tinfotype, xref.frm)
270 |
271 | for i in range(tinfo.basecount):
272 | offset = vmi_class_type_info.pBaseArray.offset + i * ctypes.sizeof(base_class_type_info)
273 | basetinfo = get_class_from_ea(base_class_type_info, xref.frm + offset)
274 | tinfos[xref.frm + offset + base_class_type_info.basetype.offset] = basetinfo.basetype
275 |
276 | def get_tinfo_vtables(ea, tinfos, vtables):
277 | if ea == idc.BADADDR:
278 | return
279 |
280 | for tinfoxref in idautils.XrefsTo(ea, idaapi.XREF_DATA):
281 | count = 0
282 | mangled = idaapi.get_name(tinfoxref.frm)
283 | demangled = idc.demangle_name(mangled, idaapi.MNG_LONG_FORM)
284 | if demangled is None:
285 | print(f"[VTABLE STRUCTS] Invalid name at {tinfoxref.frm:#x}")
286 | continue
287 |
288 | classname = demangled[len("`typeinfo for'"):]
289 | for xref in idautils.XrefsTo(tinfoxref.frm, idaapi.XREF_DATA):
290 | if xref.frm not in tinfos.keys():
291 | # If address lies in a function
292 | if idaapi.is_func(idaapi.get_full_flags(xref.frm)):
293 | continue
294 |
295 | count += 1
296 | vtables[classname] = vtables.get(classname, []) + [xref.frm]
297 |
298 |
322 | def parse_vtables(vtables):
323 | jsondata = {}
324 | ptrsize = ctypes.sizeof(ea_t)
325 | for classname, tables in vtables.items():
326 | # We don't *need* to do any sort of sorting in Linux and can just capture the thisoffset
327 | # The Windows side of the script can organize later
328 | for ea in tables:
329 | thisoffs = get_ptr(ea - ptrsize)
330 |
331 | funcs = parse_vtable(ea + ptrsize)
332 | # Can be zero if there's an xref in the global offset table (.got) section
333 | # Fortunately the parse_vtable function doesn't grab anything from there
334 | if funcs:
335 | classdata = jsondata.get(classname, {})
336 | classdata[ptr_t(thisoffs).value] = funcs
337 | jsondata[classname] = classdata
338 |
339 | return jsondata
340 |
341 | def parse_vtable(ea):
342 | funcs = []
343 |
344 | while ea != idc.BADADDR:
345 | # Using flags sped this up by a lot
346 | # Went from 4 secs to ~1.3
347 | flags = idaapi.get_full_flags(ea)
348 | if not is_off(flags) or not is_ptr(flags):
349 | break
350 |
351 | if get_os() == OS_Linux and idaapi.has_name(flags):
352 | break
353 |
354 | offs = get_ptr(ea)
355 | fflags = idaapi.get_full_flags(offs)
356 | if not idaapi.is_code(fflags):
357 | break
358 |
359 | if get_os() == OS_Win and not idaapi.has_any_name(fflags):
360 | break
361 |
362 | vfunc = VFunc.create(ea)
363 | # Invalid name, so this can be a "sub_", purecall, or an optimized function
364 | # So to keep vtable_io compat, we grab the comment instead and update the names
365 | if not vfunc.funcref.name:
366 | cmt = idaapi.get_cmt(ea, False)
367 | if cmt and "::" in cmt:
368 | vfunc.funcref.mangledname = None
369 | vfunc.funcref.name = cmt
370 | vfunc.funcref.postname = get_func_postname(vfunc.funcref.name)
371 | vfunc.funcref.sname = vfunc.funcref.postname.split("(")[0]
372 |
373 | funcs.append(vfunc)
374 |
375 | ea = idaapi.next_head(ea, idc.BADADDR)
376 | return funcs
377 |
378 | def calc_member_tinfo(vfunc):
379 | cached = InfoCache.tinfos.get(vfunc.funcref.ea, None)
380 | if cached is not None:
381 | return cached
382 |
383 | # Get the type info of the function if it's present
384 | # In Windows, you can't get the actual tinfo so you can only guess
385 | # and use the rudimentary type info
386 | tinfo = idaapi.tinfo_t()
387 | if not idaapi.get_tinfo(tinfo, vfunc.funcref.ea):
388 | if idaapi.guess_tinfo(tinfo, vfunc.funcref.ea) == idaapi.GUESS_FUNC_FAILED:
389 | tinfo = None
390 |
391 | if tinfo is not None:
392 | tinfo.create_ptr(tinfo)
393 |
394 | InfoCache.tinfos[vfunc.funcref.ea] = tinfo
395 | return tinfo
396 |
397 |
398 | def create_structs(data):
399 | # Now this is an awesome API function that we most certainly need
400 | idaapi.begin_type_updating(idaapi.UTP_STRUCT)
401 |
402 | for classname, vtables in data.items():
403 | classstrucid = add_struc_ex(classname)
404 | classstruc = idaapi.get_struc(classstrucid)
405 | for thisoffs, vfuncs in vtables.items():
406 | thisoffs = abs(thisoffs)
407 | postfix = f"_{thisoffs:04X}" if thisoffs != 0 else ""
408 | structype = f"{classname}{postfix}{idaapi.VTBL_SUFFIX}"
409 | structype = idaapi.validate_name(structype, idaapi.VNT_TYPE, idaapi.SN_IDBENC)
410 |
411 | vtablestrucid = add_struc_ex(structype)
412 | vtablestruc = idaapi.get_struc(vtablestrucid)
413 | for i, vfunc in enumerate(vfuncs):
414 | offs = i * ctypes.sizeof(ea_t)
415 | targetname = vfunc.funcref.sname
416 |
417 | currmem = idaapi.get_member(vtablestruc, offs)
418 | if currmem:
419 | # memname = idaapi.get_member_name(currmem.id)
420 | # # Can have a postfix so we use in operator
421 | # if targetname in memname:
422 | # if not currmem.has_ti():
423 | # tinfo = calc_member_tinfo(vfunc)
424 | # if tinfo is not None:
425 | # idaapi.set_member_tinfo(vtablestruc, currmem, 0, tinfo, 0)
426 | # continue
427 |
428 | # # Sadly if you reorganize a vtable and move a function up, this will fail
429 | # # and you'll have an unneeded postfix
430 | # if not idaapi.set_name(currmem.id, targetname, idaapi.SN_NOCHECK):
431 | # newname = f"{targetname}_{offs:x}"
432 | # if not idaapi.set_name(currmem.id, newname, idaapi.SN_NOCHECK):
433 | # print(f"Failed to set name for {classname}::{vfunc.funcref.sname} ({targetname}) at offset {offs:#x}")
434 | # continue
435 |
436 | # tinfo = calc_member_tinfo(vfunc)
437 | # if tinfo is not None:
438 | # idaapi.set_member_tinfo(vtablestruc, currmem, 0, tinfo, 0)
439 | continue
440 |
441 | else:
442 | opinfo = idaapi.opinfo_t()
443 | # I don't think this does anything
444 | opinfo.ri.flags = idaapi.REF_OFF64 if idaapi.inf_is_64bit() else idaapi.REF_OFF32
445 | opinfo.ri.target = vfunc.funcref.ea
446 | opinfo.ri.base = 0
447 | opinfo.ri.tdelta = 0
448 |
449 | serr = idaapi.add_struc_member(vtablestruc, targetname, offs, FF_PTR|idc.FF_0OFF, opinfo, ctypes.sizeof(ea_t))
450 | # Failed, so there was either an invalid name or a name collision
451 | if serr == idaapi.STRUC_ERROR_MEMBER_NAME:
452 | targetname = idaapi.validate_name(targetname, idaapi.VNT_IDENT, idaapi.SN_IDBENC)
453 | serr = idaapi.add_struc_member(vtablestruc, targetname, offs, FF_PTR|idc.FF_0OFF, opinfo, ctypes.sizeof(ea_t))
454 | if serr == idaapi.STRUC_ERROR_MEMBER_NAME:
455 | targetname = f"{targetname}_{offs:X}"
456 | serr = idaapi.add_struc_member(vtablestruc, targetname, offs, FF_PTR|idc.FF_0OFF, opinfo, ctypes.sizeof(ea_t))
457 |
458 | if serr != idaapi.STRUC_ERROR_MEMBER_OK:
459 | print(vtablestruc, vtablestrucid)
460 | print(f"Failed to add member {classname}::{vfunc.funcref.sname} ({targetname}) at offset {offs:#x} -> {serr}")
461 | continue
462 |
463 | tinfo = calc_member_tinfo(vfunc)
464 | if tinfo is not None:
465 | mem = idaapi.get_member(vtablestruc, offs)
466 | idaapi.set_member_tinfo(vtablestruc, mem, 0, tinfo, 0)
467 |
468 | vmember = idaapi.get_member(classstruc, thisoffs)
469 | if not vmember:
470 | if idaapi.add_struc_member(classstruc, f"{idaapi.VTBL_MEMNAME}{postfix}", thisoffs, idc.FF_DATA | FF_PTR, None, ctypes.sizeof(ea_t)) == idaapi.STRUC_ERROR_MEMBER_OK:
471 | global STRUCTS
472 | STRUCTS += 1
473 | tinfo = idaapi.tinfo_t()
474 | if idaapi.guess_tinfo(tinfo, vtablestrucid) != idaapi.GUESS_FUNC_FAILED:
475 | mem = idaapi.get_member(classstruc, thisoffs)
476 | tinfo.create_ptr(tinfo)
477 | idaapi.set_member_tinfo(classstruc, mem, 0, tinfo, 0)
478 |
479 | def read_vtables_linux():
480 | WaitBox.show("Parsing typeinfo")
481 |
482 | # Step 1 and 2, crawl xrefs and stick the inherited class type infos into a structure
483 | # After this, we can run over the xrefs again and see which xrefs come from another structure
484 | # The remaining xrefs are either vtables or weird math in a function
485 | xreftinfos = {}
486 |
487 | def getparse(name, fn, quiet=False):
488 | tinfo = idc.get_name_ea_simple(name)
489 | if tinfo == idc.BADADDR and not quiet:
490 | print(f"[VTABLE STRUCTS] Type info {name} not found. Skipping...")
491 | return None
492 |
493 | if fn is not None:
494 | fn(tinfo, xreftinfos)
495 | return tinfo
496 |
497 | # Don't need to parse base classes
498 | tinfo = getparse("_ZTVN10__cxxabiv117__class_type_infoE", None)
499 | tinfo_pointer = getparse("_ZTVN10__cxxabiv119__pointer_type_infoE", parse_pointer_tinfo, True)
500 | tinfo_si = getparse("_ZTVN10__cxxabiv120__si_class_type_infoE", parse_si_tinfo)
501 | tinfo_vmi = getparse("_ZTVN10__cxxabiv121__vmi_class_type_infoE", parse_vmi_tinfo)
502 |
503 | if len(xreftinfos) == 0:
504 | print("[VTABLE STRUCTS] No type infos found. Are you sure you're in a C++ binary?")
505 | return
506 |
507 | # Step 3, crawl the xrefs again; if an xref is not in the type info structure, then it's a vtable
508 | WaitBox.show("Discovering vtables")
509 | vtables = {}
510 | get_tinfo_vtables(tinfo, xreftinfos, vtables)
511 | get_tinfo_vtables(tinfo_pointer, xreftinfos, vtables)
512 | get_tinfo_vtables(tinfo_si, xreftinfos, vtables)
513 | get_tinfo_vtables(tinfo_vmi, xreftinfos, vtables)
514 |
515 | # Now, we have a list of vtables and their respective classes
516 | WaitBox.show("Parsing vtables")
517 | data = parse_vtables(vtables)
518 |
519 | WaitBox.show("Creating structs")
520 | create_structs(data)
521 |
522 | def parse_ti(ea, tis):
523 | typedesc = ea
524 | flags = idaapi.get_full_flags(ea)
525 | if idaapi.is_code(flags):
526 | return
527 |
528 | try:
529 | classname = idaapi.demangle_name(idc.get_name(ea), idaapi.MNG_SHORT_FORM)
530 | classname = classname.removeprefix("class ")
531 | classname = classname.removeprefix("struct TypeDescriptor ")
532 | classname = classname.removesuffix(" `RTTI Type Descriptor'")
533 | except Exception:
534 | print(f"[VTABLE STRUCTS] Invalid vtable name at {ea:#x}")
535 | return
536 |
537 | if classname in tis.keys():
538 | return
539 |
540 | vtables = []
541 |
542 | # Then figure out which xref is a/the COL
543 | for xref in idautils.XrefsTo(typedesc):
544 | ea = xref.frm
545 | flags = idaapi.get_full_flags(ea)
546 |
547 | # Dynamic cast
548 | if idaapi.is_code(flags):
549 | continue
550 |
551 | name = idaapi.get_name(ea)
552 | # Class type descriptor and/or random global data
553 | # Kind of a hack but let's assume no one will rename these
554 | if name and (name.startswith("??_R1") or name.startswith("off_")):
555 | continue
556 |
557 | ea -= 4
558 | name = idaapi.get_name(ea)
559 | # Catchable types
560 | if name and name.startswith("__CT"):
561 | continue
562 |
563 | # COL
564 | ea -= 8
565 | workaround = False
566 | if idaapi.is_unknown(idaapi.get_full_flags(ea)):
567 | print(f"[VTABLE STRUCTS] Possible COL is unknown at {ea:#x}. This may be an unreferenced vtable. Trying workaround...")
568 | # This might be a bug with IDA, but sometimes the COL isn't analyzed
569 | # If there's still a reference, then we can still trace back
570 | # If there is a list of functions (or even just one), then it's probably a vtable,
571 | # but we'll still warn the user that it might be garbage
572 | refs = list(idautils.XrefsTo(ea))
573 | if len(refs) == 1:
574 | vtable = refs[0].frm + ctypes.sizeof(ea_t)
575 | tryfunc = get_ptr(vtable + ctypes.sizeof(ea_t))
576 | funcflags = idaapi.get_full_flags(tryfunc)
577 | if idaapi.is_func(funcflags):
578 | print(f" - Workaround successful. Please verify that {vtable:#x} is a vtable.")
579 | workaround = True
580 |
581 | if not workaround:
582 | print(" - Workaround failed. Skipping...")
583 | continue
584 |
585 | name = idaapi.get_name(ea)
586 | if not workaround and (not name or not name.startswith("??_R4")):
587 | print(f"[VTABLE STRUCTS] Invalid name at {ea:#x}. Possible unwind info. Ignoring...")
588 | continue
589 |
590 | # In 64-bit PEs, the COL references itself, remove this
591 | refs = list(idautils.XrefsTo(ea))
592 | if idaapi.inf_is_64bit():
593 | for n in range(len(refs)-1, -1, -1):
594 | if refs[n].frm == ea + RTTICompleteObjectLocator.pSelf.offset:
595 | del refs[n]
596 |
597 | # Now that we have the COL, we can use it to find the vtable that utilizes it and its thisoffs
598 | if len(refs) != 1:
599 | print(f"[VTABLE STRUCTS] Expected exactly one vtable reference to COL, found {len(refs)} - {name} at {ea:#x}")
600 | continue
601 |
602 | vtable = refs[0].frm + ctypes.sizeof(ea_t)
603 | thisoffs = idaapi.get_dword(ea + RTTICompleteObjectLocator.offset.offset)
604 | vtables.append((thisoffs, vtable))
605 |
606 | # Can have RTTI without a vtable
607 | tis[classname] = {thisoffs: parse_vtable(vaddr) for thisoffs, vaddr in vtables}
608 |
609 | def string_method(type_info, vtabledata):
610 | for string in idautils.Strings():
611 | sstr = str(string)
612 | if not sstr.startswith(".?AV"):
613 | continue
614 |
615 | ea = string.ea
616 | ea -= TypeDescriptor.name.offset
617 | trytinfo = rva_to_ea(idaapi.get_wide_dword(ea))
618 | # This is a weird string that isn't a part of a type descriptor
619 | if trytinfo != type_info:
620 | continue
621 |
622 | parse_ti(ea, vtabledata)
623 |
624 | def read_ti_win():
625 | # Step 1, get the vftable of type_info
626 | type_info = idc.get_name_ea_simple("??_7type_info@@6B@")
627 | if type_info == idc.BADADDR:
628 | # If type_info doesn't exist as a label, we might still be able to snipe it with the string method
629 | strings = list(idautils.Strings())
630 | for s in strings:
631 | if str(s) == ".?AVtype_info@@":
632 | ea = s.ea - TypeDescriptor.name.offset
633 | type_info = rva_to_ea(idaapi.get_wide_dword(ea))
634 |
635 | if type_info == idc.BADADDR:
636 | print("[VTABLE STRUCTS] type_info not found. Are you sure you're in a C++ binary?")
637 | return None
638 |
639 | vtabledata = {}
640 |
641 | # Step 2, get all xrefs to type_info
642 | # Get type descriptor
643 | for typedesc in idautils.XrefsTo(type_info):
644 | parse_ti(typedesc.frm, vtabledata)
645 |
646 | # In some cases, IDA fails to create references from some type descriptors to type_info
647 | # Not exactly sure why, but it lists the ea of type_info as a "hash" when in reality it isn't
648 | # A workaround for this is to parse type descriptor strings (".?AV*"), load up their references, and
649 | # walk backwards to the start of what is supposed to be the type descriptor, and ensure that
650 | # its first DWORD is the type_info vtable
651 | # I only found this to be a problem in NMRIH, so it appears to be rare
652 | WaitBox.show("Performing string parsing")
653 | string_method(type_info, vtabledata)
654 |
655 | return vtabledata
656 |
657 | def read_vtables_win():
658 | WaitBox.show("Parsing Windows typeinfo")
659 | data = read_ti_win()
660 |
661 | if data is None:
662 | return
663 |
664 | WaitBox.show("Creating structs")
665 | create_structs(data)
666 |
667 | def main():
668 | os = get_os()
669 | try:
670 | if os == OS_Linux:
671 | read_vtables_linux()
672 | elif os == OS_Win:
673 | read_vtables_win()
674 | else:
675 | print(f"Unsupported OS?: {idaapi.get_file_type_name()}")
676 | idaapi.beep()
677 |
678 | if STRUCTS:
679 | print(f"Successfully imported {STRUCTS} virtual structures")
680 | else:
681 | print("No virtual structures imported")
682 | idaapi.beep()
683 | except:
684 | import traceback
685 | traceback.print_exc()
686 | print("Please file a bug report with supporting information at https://github.com/Scags/IDA-Scripts/issues")
687 | idaapi.beep()
688 |
689 | idaapi.end_type_updating(idaapi.UTP_STRUCT)
690 | WaitBox.hide()
691 |
692 | # import cProfile
693 | # cProfile.run("main()", "vtable_structs.prof")
694 | main()
695 |
--------------------------------------------------------------------------------
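
The Windows path in `parse_ti` above walks backwards from an xref on a type descriptor (`ea -= 4`, then `ea -= 8`) to reach the start of the Complete Object Locator (COL), then reads `thisoffs` through `RTTICompleteObjectLocator.offset.offset`. The sketch below illustrates that arithmetic outside of IDA. It is not the script's own definition: the field layout is an assumption based on the commonly documented 64-bit MSVC RTTI layout, and `col_start_from_xref` and `read_thisoffs` are hypothetical helper names.

```python
import ctypes
import struct


class RTTICompleteObjectLocator(ctypes.LittleEndianStructure):
    # Assumed 64-bit MSVC layout: every field is 32 bits wide
    # (the "pointer" fields are image-relative offsets).
    _fields_ = [
        ("signature", ctypes.c_uint32),
        ("offset", ctypes.c_uint32),            # offset of this vtable within the object (thisoffs)
        ("cdOffset", ctypes.c_uint32),
        ("pTypeDescriptor", ctypes.c_uint32),   # the field an xref to the type descriptor comes from
        ("pClassDescriptor", ctypes.c_uint32),
        ("pSelf", ctypes.c_uint32),             # self-reference, 64-bit images only
    ]


def col_start_from_xref(xref_from):
    """Hypothetical helper: step back from the pTypeDescriptor field to the COL start."""
    return xref_from - RTTICompleteObjectLocator.pTypeDescriptor.offset


def read_thisoffs(image, col_start):
    """Hypothetical helper: read the COL 'offset' field out of a raw byte image."""
    col = RTTICompleteObjectLocator.from_buffer_copy(image, col_start)
    return col.offset


# Build a fake image containing one COL at byte 16 with thisoffs == 8.
image = bytearray(64)
struct.pack_into("<6I", image, 16, 1, 8, 0, 0x1000, 0x2000, 0x3000)

xref_from = 16 + RTTICompleteObjectLocator.pTypeDescriptor.offset
col_start = col_start_from_xref(xref_from)
thisoffs = read_thisoffs(bytes(image), col_start)
```

Under this assumed layout, `pTypeDescriptor` sits 12 bytes into the structure, which matches the two decrements (4 + 8) in `parse_ti`. The `pSelf` field is also why the 64-bit branch has to drop one xref before counting the vtables that point at the COL.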