├── .gitattributes
├── README.md
└── enoki.py
/.gitattributes:
--------------------------------------------------------------------------------
1 | # Auto detect text files and perform LF normalization
2 | * text=auto
3 |
4 | # Custom for Visual Studio
5 | *.cs diff=csharp
6 |
7 | # Standard to msysgit
8 | *.doc diff=astextplain
9 | *.DOC diff=astextplain
10 | *.docx diff=astextplain
11 | *.DOCX diff=astextplain
12 | *.dot diff=astextplain
13 | *.DOT diff=astextplain
14 | *.pdf diff=astextplain
15 | *.PDF diff=astextplain
16 | *.rtf diff=astextplain
17 | *.RTF diff=astextplain
18 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # What is _Enoki_ ?
2 | The _Enoki_ script is a wrapper class for [IDAPython](https://www.hex-rays.com/products/ida/support/idapython_docs/). It groups various useful functions for reverse engineering non-standard
3 | and/or uncommon binaries. Many of the scripts currently available online are geared towards malware analysis of Windows [Portable Executable (PE)
4 | files](https://en.wikipedia.org/wiki/Portable_Executable) and as such, most of their features target Intel-based systems and focus on detecting or
5 | deobfuscating malicious code in well-known file formats. _Enoki_ seeks to provide a set of basic functions for analysis of binaries, memory maps
6 | or other non-malware oriented files for reverse engineering purposes.
7 |
8 | ## Summary
9 |
10 | The _Enoki_ script is a wrapper around many IDAPython functions and is designed for analysts conducting reverse engineering on
11 | non-standard and uncommon files such as firmware of embedded devices or unknown files from industrial control systems (ICS). _Enoki_ provides
12 | additional shortcut functions for extracting, searching and analyzing machine code, useful when IDA has issues parsing the file
13 | or detecting the actual processor.
14 |
15 | ## Usage
16 |
17 | To use _Enoki_ with [IDA](https://www.hex-rays.com/products/ida/), simply load the _enoki.py_ file into IDA. An instance of the _Enoki_ object will automatically be created in the ```e``` variable or you can create your own
18 | instance using the following command in the interpreter:
19 |
20 | ```
21 | e = Enoki()
22 | ```
23 |
24 | Call any function you need on the instance, for example:
25 |
26 | ```
27 | Python>hex(e.current_file_offset())
28 | 0x74fc
29 | ```
30 |
31 | ## Examples
32 |
33 | This section provides some examples of the functionality provided by the _Enoki_ script. More details can be found in the wiki of the project.
34 |
35 | ### Find a byte string
36 |
37 | One of the functions provided by _Enoki_ is ```find_byte_string```, which allows the analyst to search for a specific sequence of bytes or words in the machine
38 | code. The function returns all locations where the byte string was found in the searched range.
39 |
40 | ```
41 | Python>e.find_byte_string(ScreenEA(), ScreenEA() + 0x1000, "7980 ????")
42 | [150, 155, 173, 198, 208]
43 | ```
44 |
45 | If you need the output in hexadecimal addresses, simply wrap the result using the ```hex()``` function:
46 |
47 | ```
48 | Python>[hex(i) for i in e.find_byte_string(ScreenEA(), ScreenEA() + 0x1000, "7980 ????")]
49 | ['0x96', '0x9b', '0xad', '0xc6', '0xd0']
50 | ```
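
To make the pattern syntax concrete outside of IDA, the following is a minimal standalone sketch (Python 3, no IDA required) of a wildcard byte search. It is not how _Enoki_ or IDA's `FindBinary` performs the search: the names `pattern_to_regex` and `find_all` are hypothetical, and the sketch assumes each pair of hex digits is one byte in written order and that `??` matches any single byte. IDA's own pattern parsing, including the byte order of multi-byte values, may differ.

```python
import re

def pattern_to_regex(pattern):
    """Translate a hex pattern such as "7980 ????" into a compiled bytes regex."""
    hex_digits = pattern.replace(" ", "")
    if len(hex_digits) % 2:
        raise ValueError("pattern must contain whole bytes")
    parts = []
    for i in range(0, len(hex_digits), 2):
        pair = hex_digits[i:i + 2]
        if pair == "??":
            parts.append(b".")  # wildcard: any single byte
        else:
            parts.append(re.escape(bytes([int(pair, 16)])))
    # DOTALL so that the wildcard also matches the newline byte 0x0A
    return re.compile(b"".join(parts), re.DOTALL)

def find_all(data, pattern):
    """Return the offsets of every non-overlapping match of pattern in data."""
    return [m.start() for m in pattern_to_regex(pattern).finditer(data)]

blob = bytes.fromhex("00 79 80 12 34 ff 79 80 ab cd")
print(find_all(blob, "7980 ????"))  # prints [1, 6]
```

The returned values are offsets into the byte buffer, which plays the role of the address list returned by ```find_byte_string``` in the examples above.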
51 |
52 | ### Compare two code ranges for similarity
53 |
54 | Another feature is the ```compare_code``` function, which compares the similarity of two code segments. This function
55 | takes two arrays of opcodes or assembly instructions and calculates the similarity of the two sequences. In the example below,
56 | the similarity is only 11%, meaning the two code segments are quite different.
57 |
58 | ```
59 | Python>c1 = e.get_words_between(0x2C00, 0x2CFF)
60 | Python>c2 = e.get_words_between(0x8000, 0x80FF)
61 | Python>e.compare_code(c1, c2)
62 | 0.11328125
63 | ```
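
The body of ```compare_code``` is not reproduced in this README. Since the script imports `difflib`, one plausible way to obtain such a score is `difflib.SequenceMatcher`. The sketch below is an assumption for illustration only, and the helper name `similarity` is hypothetical:

```python
import difflib

def similarity(seq_a, seq_b):
    """Return a ratio in [0.0, 1.0] describing how alike two sequences are."""
    return difflib.SequenceMatcher(None, seq_a, seq_b).ratio()

# Two short word sequences that differ only in their last element.
words_a = [0x7980, 0x0001, 0x4500, 0x4885]
words_b = [0x7980, 0x0001, 0x4500, 0x4886]
print(similarity(words_a, words_b))  # prints 0.75
```

A ratio of 1.0 means the two sequences are identical and 0.0 means they share no elements, so a value such as 0.11 indicates very little common code.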
64 |
65 | Other functions are available within _Enoki_ and more details can be found in the comments of the script or in the future wiki of the project.
66 |
67 |
68 | ## References
69 |
70 | If you find this script useful for your projects or research, please add a reference or link to this project to help make it better.
71 |
72 | - __URL:__
73 | - [Enoki](https://github.com/InfectedPacket/Idacraft), https://github.com/InfectedPacket/Idacraft
74 | - __Reference (Chicago):__
75 | - Racicot, Jonathan. 2016. Enoki (version 1.0.2). Windows/Mac/Linux. Ottawa, Canada.
76 | - __Reference (IEEE):__
77 | - J. Racicot, Enoki. Ottawa, Canada, 2016.
78 |
79 |
--------------------------------------------------------------------------------
/enoki.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 | # Copyright (C) 2015 Jonathan Racicot
3 | #
4 | # This program is free software: you can redistribute it and/or modify
5 | # it under the terms of the GNU General Public License as published by
6 | # the Free Software Foundation, either version 3 of the License, or
7 | # (at your option) any later version.
8 | #
9 | # This program is distributed in the hope that it will be useful,
10 | # but WITHOUT ANY WARRANTY; without even the implied warranty of
11 | # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
12 | # GNU General Public License for more details.
13 | #
14 | # You should have received a copy of the GNU General Public License
15 | # along with this program. If not, see <http://www.gnu.org/licenses/>.
16 | #
17 | # If you use this program and find it useful, please include a link
18 | # or reference to the project's page in your program and/or document.
19 | #
20 | # Reference (Chicago):
21 | # Racicot, Jonathan. 2016. Enoki (version 1.0.2). Windows/Mac/Linux. Ottawa, Canada.
22 | # Reference (IEEE):
23 | # J. Racicot, Enoki. Ottawa, Canada, 2016.
24 | #
25 | #
26 | # Jonathan Racicot
27 | # infectedpacket@gmail.com
28 | # 2016-01-10
29 | # https://github.com/infectedpacket
30 | #//////////////////////////////////////////////////////////////////////////////
31 | #
32 | #//////////////////////////////////////////////////////////////////////////////
33 | # Imports
34 | #//////////////////////////////////////////////////////////////////////////////
35 | #
36 | import re
37 | import idc
38 | import idaapi
39 | import difflib
40 | import idautils
41 | import logging
42 | from time import strftime
43 | #//////////////////////////////////////////////////////////////////////////////
44 | # Enoki class
45 | #//////////////////////////////////////////////////////////////////////////////
46 | class Enoki(object):
47 | """
48 | Description:
49 | Provides wrapping functions around IDAPython to analyze
50 | and format structures for unknown/difficult architectures.
51 |
52 | Notes:
53 | Tested on IDA Pro v6.5
54 |
55 | Author:
56 | Jonathan Racicot
57 |
58 | Date:
59 | Created: 2015-10-14
60 | Updated: 2016-01-10
61 | """
62 |
63 | VERSION = "1.0.0"
64 |
65 | #Specifies a 16bit segment
66 | SEG_16 = 0
67 | #Specifies a 32bit segment
68 | SEG_32 = 1
69 | #Specifies a 64bit segment
70 | SEG_64 = 2
71 |
72 | #Segment bitness to use when none has been specified.
73 | DEFAULT_SEGMENT_SIZE = SEG_16
74 |
75 | #Specifies a DATA segment
76 | SEG_DATA = "DATA"
77 | #Specifies a CODE segment
78 | SEG_CODE = "CODE"
79 |
80 | SEG_TYPE_CODE = 2
81 | SEG_TYPE_DATA = 3
82 |
83 | #Used for assessing returns from IDA functions calls.
84 | FAIL = 0
85 | SUCCESS = 1
86 |
87 | # Basic colors
88 | RED = 0x0000FF
89 | GREEN = 0x00FF00
90 | BLUE = 0xFF0000
91 | YELLOW = 0x00FFFF
92 | WHITE = 0xFFFFFF
93 | BLACK = 0x000000
94 | CYAN = 0xFFFF00
95 | # Fancy colors
96 | ABSOLUTE_ZERO = 0xBA4800
97 | AFRICAN_VIOLET = 0xBE84B2
98 | ALIZARIN_CRIMSON = 0x3626E3
99 | AMBER = 0x00BFFF
100 | APPLE_GREEN = 0x00B68D
101 | AZURE = 0xFF7F00
102 | BABY_BLUE = 0xF0CF89
103 | BABY_PINK = 0xC2C2F4
104 | BONE = 0xE3DAC9
105 | CADMIUM_ORANGE = 0x2D87ED
106 | CITRINE = 0xE4D00A
107 | CADET_BLUE = 0x5F9EA0
108 | CHAMOISEE = 0xA0785A
109 |
110 | logger = logging.getLogger(__name__)
111 |
112 | def __init__(self):
113 | """
114 | Constructor of the Enoki engine. Does nothing.
115 | """
116 | pass
117 |
118 | def vers(self):
119 | return Enoki.VERSION
120 |
121 | def make_comment(self, _ea, _comment):
122 | """
123 | Creates a comment at the given address.
124 |
125 | @param _ea: The address where the comment will be created.
126 | @param _comment: The comment
127 | @return Enoki.SUCCESS if the comment was created successfully,
128 | Enoki.FAIL otherwise.
129 | """
130 | return idc.MakeComm(_ea, _comment)
131 |
132 | def clear_comment(self, _ea):
133 | """
134 | Removes any comment at the given address.
135 |
136 | @param _ea: The address where the comment will be removed.
137 | @return Enoki.SUCCESS if the comment was removed successfully,
138 | Enoki.FAIL otherwise.
139 | """
140 | return self.make_comment(_ea, "")
141 |
142 | def clear_all_comments(self, _startea, _endea):
143 | """
144 | Removes all comments between the given addresses.
145 |
146 | @param _startea: The start address.
147 | @param _endea: The end address.
148 | @return Enoki.SUCCESS if all the comments were removed,
149 | Enoki.FAIL otherwise.
150 | """
151 | if (_startea != BADADDR and _endea != BADADDR):
152 | curea = _startea
153 | error = Enoki.SUCCESS
154 | while (curea < _endea):
155 | r = self.clear_comment(curea)
156 | curea = idc.NextHead(curea)
157 | if (r == Enoki.FAIL):
158 | error = Enoki.FAIL
159 | return error
160 |
161 | def clear_function_comments(self, _funcea):
162 | """
163 | Removes all comments in the function at the specified address.
164 |
165 | @param _funcea: An address within the function
166 | @return Enoki.SUCCESS if all the comments were removed,
167 | Enoki.FAIL otherwise.
168 | """
169 | func = self.get_function_at(_funcea)
170 | if (func):
171 | return self.clear_all_comments(func.startEA, func.endEA)
172 | else:
173 | return Enoki.FAIL
174 |
175 | def append_comment(self, _ea, _comment):
176 | """
177 | Appends a new comment to an instruction at the specified address.
178 |
179 | @param _ea: The address where the comment will be appended.
180 | @param _comment: The comment
181 | @return Enoki.SUCCESS if the comment was created successfully,
182 | Enoki.FAIL otherwise.
183 | """
184 | if (_ea != BADADDR):
185 | cur_comment = Comment(_ea)
186 | if (cur_comment != None and len(cur_comment) > 0):
187 | comment = "{:s}\n{:s}".format(cur_comment, _comment)
188 | else:
189 | comment = _comment
190 | return self.make_comment(_ea, comment)
191 | return Enoki.FAIL
192 |
193 | def make_repeat_comment(self, _ea, _comment):
194 | """
195 | Creates a repeatable comment at the given address.
196 |
197 | @param _ea: The address where the comment will be created.
198 | @param _comment: The comment
199 | @return Enoki.SUCCESS if the comment was created successfully,
200 | Enoki.FAIL otherwise.
201 | """
202 | return idc.MakeRptCmt(_ea, _comment)
203 |
204 | def backup_database(self):
205 | """
206 | Backup the database to a file similar to
207 | IDA's snapshot function.
208 | """
209 | time_string = strftime('%Y%m%d%H%M%S')
210 | file = idc.GetInputFile()
211 | if not file:
212 | raise NoInputFileException('No input file provided')
213 | input_file = file.rsplit('.', 1)[0]
214 | backup_file = "{:s}_{:s}.idb".format(input_file, time_string)
215 | idc.SaveBase(backup_file, idaapi.DBFL_BAK)
216 |
217 | def create_segment(self, _startea, _endea, _name,
218 | _type, _segsize=DEFAULT_SEGMENT_SIZE):
219 | """
220 | Creates a segment between provided addresses.
221 |
222 | @param _startea: The start address of the segment.
223 | @param _endea: The end address of the segment.
224 | @param _name: Name to be given to the new segment.
225 | @param _type: Either idaapi.SEG_CODE to specify a code
226 | segment or idaapi.SEG_DATA for a segment containing data.
227 | @param _segsize: Bitness of the segment, e.g. 16, 32 or 64 bit.
228 | """
229 | r = idc.AddSeg(_startea, _endea, 0, _segsize, 1, 2)
230 | if (r == Enoki.SUCCESS):
231 | idc.RenameSeg(_startea, _name)
232 | return idc.SetSegmentType(_startea, _type)
233 | else:
234 | return Enoki.FAIL
235 |
236 | def get_segment(self, _ea):
237 | return idaapi.getseg(_ea)
238 |
239 | def get_segment_type(self, _ea):
240 | return self.get_seg_attribute(_ea, idc.SEGATTR_TYPE)
241 |
242 | def segment_is_code(self, _segea):
243 | return self.get_segment_type(_segea) == self.SEG_TYPE_CODE
244 |
245 | def segment_is_data(self, _segea):
246 | return self.get_segment_type(_segea) == self.SEG_TYPE_DATA
247 |
248 | def get_seg_attribute(self, _segea, _attr):
249 | """
250 | Gets an attribute of the segment at the given address. The available
251 | attributes are:
252 | SEGATTR_START starting address
253 | SEGATTR_END ending address
254 | SEGATTR_ALIGN alignment
255 | SEGATTR_COMB combination
256 | SEGATTR_PERM permissions
257 | SEGATTR_BITNESS bitness (0: 16, 1: 32, 2: 64 bit segment)
258 | SEGATTR_FLAGS segment flags
259 | SEGATTR_SEL segment selector
260 | SEGATTR_ES default ES value
261 | SEGATTR_CS default CS value
262 | SEGATTR_SS default SS value
263 | SEGATTR_DS default DS value
264 | SEGATTR_FS default FS value
265 | SEGATTR_GS default GS value
266 | SEGATTR_TYPE segment type
267 | SEGATTR_COLOR segment color
268 | @param _segea Address within the segment to be queried.
269 | @param _attr The attribute to retrieve. This is one of the values listed above.
270 | @return The value of the attribute.
271 | """
272 | return idc.GetSegmentAttr(_segea, _attr)
273 |
274 | def create_selector(self, _sel, _value):
275 | return idc.SetSelector(_sel, _value)
276 |
277 | def create_data_segment(self, _startea, _endea, _name,
278 | _segsize=DEFAULT_SEGMENT_SIZE):
279 | """
280 | Wrapper around the create_segment function to
281 | create a new data segment.
282 | @param _startea: The start address of the segment.
283 | @param _endea: The end address of the segment.
284 | @param _name: Name to be given to the new segment.
285 | @param _segsize: Bitness of the segment, e.g. 16, 32 or 64 bit.
286 | """
287 | r = self.create_segment(_startea, _endea, _name, idaapi.SEG_DATA, _segsize)
288 | if (r == Enoki.SUCCESS):
289 | return self.set_seg_class_data(_startea)
290 | return Enoki.FAIL
291 |
292 | def create_code_segment(self, _startea, _endea, _name,
293 | _segsize=DEFAULT_SEGMENT_SIZE):
294 | """
295 | Wrapper around the create_segment function to
296 | create a new code segment.
297 | @param _startea: The start address of the segment.
298 | @param _endea: The end address of the segment.
299 | @param _name: Name to be given to the new segment.
300 | @param _segsize: Bitness of the segment, e.g. 16, 32 or 64 bit.
301 | """
302 | r = self.create_segment(_startea, _endea, _name, idaapi.SEG_CODE, _segsize)
303 | if (r == Enoki.SUCCESS):
304 | return self.set_seg_class_code(_startea)
305 | return Enoki.FAIL
306 |
307 | def set_seg_selector(self, _segea, _sel):
308 | return self.set_seg_attribute(_segea, SEGATTR_SEL, _sel)
309 |
310 | def set_seg_align_para(self, _segea):
311 | """
312 | Sets the alignment of the segment at the given address to 'paragraph',
313 | i.e. 16 bytes.
314 |
315 | @param _segea Address within the segment to be modified.
316 | """
317 | return idc.SegAlign(_segea, saRelPara)
318 |
319 | def set_seg_class_code(self, _segea):
320 | """
321 | Sets the class of the segment at the given address as containing code.
322 |
323 | @param _segea Address within the segment to be modified.
324 | """
325 | return self.set_seg_class(_segea, "CODE")
326 |
327 | def set_seg_class_data(self, _segea):
328 | """
329 | Sets the class of the segment at the given address as containing data.
330 |
331 | @param _segea Address within the segment to be modified.
332 | """
333 | return self.set_seg_class(_segea, "DATA")
334 |
335 | def set_seg_class(self, _segea, _type):
336 | """
337 | Sets the class of the segment at the given address.
338 |
339 | @param _segea Address within the segment to be modified.
340 | """
341 | return idc.SegClass(_segea, _type)
342 |
343 | def set_seg_attribute(self, _segea, _attr, _value):
344 | """
345 | Sets an attribute to the segment at the given address. The available
346 | attributes are:
347 | SEGATTR_START starting address
348 | SEGATTR_END ending address
349 | SEGATTR_ALIGN alignment
350 | SEGATTR_COMB combination
351 | SEGATTR_PERM permissions
352 | SEGATTR_BITNESS bitness (0: 16, 1: 32, 2: 64 bit segment)
353 | SEGATTR_FLAGS segment flags
354 | SEGATTR_SEL segment selector
355 | SEGATTR_ES default ES value
356 | SEGATTR_CS default CS value
357 | SEGATTR_SS default SS value
358 | SEGATTR_DS default DS value
359 | SEGATTR_FS default FS value
360 | SEGATTR_GS default GS value
361 | SEGATTR_TYPE segment type
362 | SEGATTR_COLOR segment color
363 | @param _segea Address within the segment to be modified.
364 | @param _attr The attribute to change. This is one of the values listed above.
365 | @param _value The value of the attribute.
366 | """
367 | return idc.SetSegmentAttr(_segea, _attr, _value)
368 |
369 | def create_string_at(self, _startea, _unicode=False, _terminator="00"):
370 | """
371 | Creates a StringItem object at the specified location.
372 | @param _startea The start address of the string
373 | @param _unicode Specifies whether the string is ASCII or Unicode
374 | @param _terminator Specify the terminator character of a sequence. Default is
375 | "00"
376 | """
377 | # Gets the address of the closest terminator byte/word
378 | strend = self.find_next_byte_string(_startea, _terminator)
379 | if strend is not None and strend != idaapi.BADADDR:
380 | strlen = strend - _startea
381 | if (_unicode):
382 | result = idaapi.make_ascii_string(_startea, strlen, idaapi.ACFOPT_UTF8)
383 | else:
384 | result = idaapi.make_ascii_string(_startea, strlen, idaapi.ACFOPT_ASCII)
385 | if (result == Enoki.FAIL):
386 | print("[-] Failed to create a string at 0x{:x} to 0x{:x}.".format(_startea, strend+1))
387 | return Enoki.FAIL
388 | return Enoki.SUCCESS
389 | return Enoki.FAIL
390 |
391 | def current_file_offset(self):
392 | """
393 | Returns the file offset, i.e. absolute offset from the beginning of the file,
394 | of the currently selected address.
395 | @return The absolute offset of the selected address.
396 | """
397 | return idaapi.get_fileregion_offset(ScreenEA())
398 |
399 | def min_file_offset(self):
400 | """
401 | Returns the minimum file offset, i.e. absolute offset of the beginning of the file/memory.
402 | @return The absolute minimum offset of the loaded code.
403 | """
404 | return idaapi.get_fileregion_offset(MinEA())
405 |
406 | def max_file_offset(self):
407 | """
408 | Returns the maximum file offset, i.e. absolute offset of the end of the file/memory.
409 | @return The absolute maximum offset of the loaded code.
410 | """
411 | return idaapi.get_fileregion_offset(MaxEA())
412 |
413 | def get_byte_at(self, _ea):
414 | return idc.Byte(_ea)
415 |
416 | def get_word_at(self, _ea):
417 | return idc.Word(_ea)
418 |
419 | def get_dword_at(self, _ea):
420 | return idc.Dword(_ea)
421 |
422 | def get_all_bytes_between(self, _startea, _endea):
423 | """
424 | Returns all bytes between the given addresses.
425 | @param _startea The starting address
426 | @param _endea The ending address
427 | @return A list containing all bytes between the given addresses.
428 | """
429 | bytes = []
430 | if (_startea != BADADDR and _endea != BADADDR):
431 | curea = _startea
432 | while (curea < _endea):
433 | bytes.append(self.get_byte_at(curea))
434 | curea = NextHead(curea)
435 |
436 | return bytes
437 |
438 | def get_all_words_between(self, _startea, _endea):
439 | """
440 | Returns all words between the given addresses.
441 | @param _startea The starting address
442 | @param _endea The ending address
443 | @return A list containing all words between the given addresses.
444 | """
445 | words = []
446 | if (_startea != BADADDR and _endea != BADADDR):
447 | curea = _startea
448 | while (curea < _endea):
449 | words.append(self.get_word_at(curea))
450 | curea = NextHead(curea)
451 |
452 | return words
453 |
454 | def get_all_strings(self, _filter='',
455 | _encoding=(idautils.Strings.STR_UNICODE | idautils.Strings.STR_C)):
456 | """
457 | Retrieves all strings from the current file matching the
458 | regular expression specified in the filter parameter. If no
459 | filter value is provided, all IDA string objects with the specified encoding
460 | are returned. To access only the strings and display them in the interpreter,
461 | consult the show_all_strings function.
462 |
463 | Values for the _encoding parameters includes:
464 | - Strings.STR_UNICODE
465 | - Strings.STR_C
466 |
467 | Values for the _encoding parameter can be combined using the |
468 | operator. Example:
469 |
470 | _encoding=(Strings.STR_UNICODE | Strings.STR_C)
471 |
472 | @param _filter Regular expression to filter unneeded strings.
473 | @param _encoding Specifies the type of strings to seek.
474 | @return A list of IDA string objects
475 | """
476 | strings = []
477 | string_finder = idautils.Strings(False)
478 | string_finder.setup(strtypes=_encoding)
479 |
480 | for index, string in enumerate(string_finder):
481 | s = str(string)
482 | if len(_filter) > 0 and len(s) > 0:
483 | if re.search(_filter, s):
484 | strings.append(string)
485 | else:
486 | strings.append(string)
487 | return strings
488 |
489 | def show_all_strings(self, _filter='',
490 | _encoding=(idautils.Strings.STR_UNICODE | idautils.Strings.STR_C)):
491 | """
492 | This function will display the address and the strings found in the
493 | file. This function differs from get_all_strings by printing the results
494 | into the interpreter and only the strings are returned, while the
495 | get_all_strings function returns the IDA string objects.
496 |
497 | @param _filter Regular expression to filter unneeded strings.
498 | @param _encoding Specifies the type of strings to seek.
499 | @return A list of strings
500 | """
501 | strings = []
502 | strings_objs = self.get_all_strings(_filter, _encoding)
503 | for s in strings_objs:
504 | strings.append(str(s))
505 | print("[>]\t0x{:x}: {:s}".format(s.ea, str(s)))
506 | return strings
507 |
508 | def get_string_at(self, _ea):
509 | """
510 | Returns the string, if any, at the specified address.
511 | @param _ea Address of the string
512 | @return The string at the specified address.
513 | """
514 | if (_ea != BADADDR):
515 | stype = idc.GetStringType(_ea)
516 | return idc.GetString(_ea, strtype=stype)
517 | return ""
518 |
519 | def get_all_comments_at(self, _ea):
520 | """
521 | Returns both normal and repeatable comments at
522 | the specified address. If both are present, a single
523 | string is returned, with the two comments separated by a
524 | colon (:).
525 |
526 | @param _ea: Address from which to retrieve the comments
527 | @return: A string containing both normal and repeatable comments,
528 | or an empty string if no comments are found.
529 | """
530 | normal_comment = self.get_normal_comment_at(_ea)
531 | rpt_comment = self.get_repeat_comment(_ea)
532 | comment = normal_comment
533 |
534 | if (comment and rpt_comment):
535 | comment += ":" + rpt_comment
536 |
537 | return comment
538 |
539 | def get_normal_comment_at(self, _ea):
540 | comment = idc.Comment(_ea)
541 | if not comment:
542 | comment = ""
543 |
544 | return comment
545 |
546 | def get_repeat_comment(self, _ea):
547 | comment = idc.RptCmt(_ea)
548 | if not comment:
549 | comment = ""
550 |
551 | return comment
552 |
553 | def get_ea(self, _name):
554 | """
555 | Returns the address of a named location. Returns idc.BADADDR if no
556 | address matches the supplied name, or Enoki.FAIL if the name is empty.
557 | @param _name Name of the location.
558 | @return The address corresponding to the name.
559 | """
560 | if (len(_name) > 0):
561 | return idc.LocByName(_name)
562 | return Enoki.FAIL
563 |
564 | def get_ea_label(self, _ea):
565 | """
566 | Returns the label of an address if any. Returns an empty string
567 | if no label is assigned to the address.
568 | @param _ea Address of the location.
569 | @return The label set to the address if any, empty string otherwise.
570 | """
571 | return idc.Name(_ea)
572 |
573 | def get_disasm(self, _ea):
574 | """
575 | Returns the disassembled code at the specified address.
576 | @param _ea Address of the opcode to disassemble.
577 | @return String containing the disassembled code.
578 | """
579 | return idc.GetDisasm(_ea)
580 |
581 | def get_mnemonic(self, _ea):
582 | """
583 | Returns the instruction at the specified address.
584 | @param _ea The address from which to retrieve the instruction.
585 | @return String containing the mnemonic of the instruction.
586 | """
587 | return idc.GetMnem(_ea)
588 |
589 | def get_first_segment(self):
590 | """
591 | Returns the address of the first defined
592 | segment of the file.
593 |
594 | @return: Start address of the first segment or
595 | idc.BADADDR if no segments are defined
596 | """
597 | return idc.FirstSeg()
598 |
599 | def get_next_segment(self, _ea):
600 | """
601 | Returns the address of the segment following the one defined
602 | at the given address.
603 |
604 | @param _ea: Address of the current segment.
605 |
606 | @return: Start address of the next segment or
607 | idc.BADADDR if there is no next segment
608 | """
609 | return idc.NextSeg(_ea)
610 |
611 | def get_segment_name(self, _ea):
612 | """
613 | Returns the name of the segment at the specified address.
614 | @param _ea An address within the segment
615 | @return String containing the name of the segment.
616 | """
617 | return idc.Segname(_ea)
618 |
619 | def get_segment_start(self, _ea):
620 | """
621 | Returns the starting address of the segment located at the specified
622 | address
623 | @param _ea An address within the segment
624 | @return long The starting address of the segment.
625 | """
626 | return idc.SegStart(_ea)
627 |
628 | def get_segment_end(self, _ea):
629 | """
630 | Returns the ending address of the segment located at the specified
631 | address
632 | @param _ea An address within the segment
633 | @return long The ending address of the segment.
634 | """
635 | return idc.SegEnd(_ea)
636 |
637 | def find_next_byte_string(self, _startea, _bytestr, _fileOffset = False,
638 | _bitness=DEFAULT_SEGMENT_SIZE):
639 | """
640 | This function searches for text representing bytes and/or words in the
641 | machine code of the file from a start address. This function is built on top of the native
642 | FindBinary function. The search is conducted starting at the specified address and downward
643 | for the provided byte string.
644 |
645 | Example:
646 | e.find_next_byte_string(ScreenEA(), "0000 FFFF ???? 0000")
647 |
648 | @param _startea Starting address of the search
649 | @param _bytestr String to search for
650 | @param _fileOffset If True, return the file offset of the match instead of
651 | its address
652 | @param _bitness Specifies the bitness of the segment.
653 | @return The address (or file offset) of the byte string found. None is returned when the start address is invalid; idc.BADADDR is returned when the search fails and _fileOffset is False.
654 | """
655 | offset = None
656 | ea = _startea
657 | if ea == idaapi.BADADDR:
658 | print ("[-] Failed to retrieve starting address.")
659 | offset = None
660 | else:
661 | block = FindBinary(ea, SEARCH_DOWN | SEARCH_CASE, _bytestr, _bitness)
662 | if (block == idc.BADADDR):
663 | offset = None
664 | if _fileOffset:
665 | offset = idaapi.get_fileregion_offset(block)
666 | else:
667 | offset = block
668 | return offset
669 |
670 | def find_byte_string(self, _startea, _endea, _bytestr,
671 | _fileOffsets = False, _showmsg = False):
672 | """
673 | This function searches for text representing bytes and/or words in the
674 | machine code of the file between 2 addresses. This function is built on top of the native
675 | FindBinary function. The search is conducted starting at the specified address and downward
676 | for the provided byte string.
677 |
678 | Example:
679 | e.find_byte_string(0x4000, 0x8000, "FF FF AA AA FF FF", True)
680 |
681 | @param _startea Starting address of the search
682 | @param _endea Ending address of the search
683 | @param _bytestr String to search for
684 | @param _fileOffsets If True, return file offsets instead of addresses
685 | for the matches found
686 | @param _showmsg Specifies if the function should print a message with the results
687 | @return An array of addresses corresponding to the start of the byte string. If none found,
688 | returns an empty array.
689 | """
690 | try:
691 | offsets = []
692 | ea = _startea
693 | if ea == idaapi.BADADDR:
694 | print ("[-] Failed to retrieve starting address.")
695 | return []
696 | else:
697 | block = FindBinary(ea, SEARCH_DOWN | SEARCH_CASE, _bytestr, 16)
698 | if (block == idc.BADADDR):
699 | print("[-] Byte string '{:s}' not found.".format(_bytestr))
700 |
701 | while (block != idc.BADADDR and block < _endea):
702 | block_file_offset = idaapi.get_fileregion_offset(block)
703 | if _fileOffsets:
704 | offsets.append(block_file_offset)
705 | else:
706 | offsets.append(block)
707 | next_block_offset = idaapi.get_fileregion_ea(block_file_offset+4)
708 | if (_showmsg):
709 | print("[+] Byte string '{:s}' found at offset 0x{:X}, file offset 0x{:X}.".format(
710 | _bytestr,
711 | block,
712 | block_file_offset))
713 | block = FindBinary(next_block_offset, SEARCH_DOWN | SEARCH_CASE, _bytestr, 16)
714 | return offsets
715 | except Exception as e:
716 | print("[-] An error occurred while seeking byte string {:s}: {:s}".format(_bytestr, str(e)))
717 | return []
718 |
719 | def get_code_ranges(self, _startea, _endea, _prolog, _epilog):
720 | """
721 | This function will extract all the machine opcodes located between the
722 | provided code boundaries in the prescribed range.
723 |
724 | Example: TODO
725 |
726 | m = e.get_code_ranges(MinEA(), MaxEA(), "4500 4885", "4886 0090")
727 | print(m)
728 | [[0x2C00, 0x2C15], [0x2C16, 0x2C38]]
729 |
730 | @param _startea The start address of the range to look for code
731 | segment
732 | @param _endea The end address of the range to look for code
733 | segment
734 | @param _prolog Starting byte string of the code segment to look for.
735 | @param _epilog Ending byte string of the code segment to look for.
736 | @return matrix containing the starting and ending addresses of the code
737 | segment found.
738 | """
739 | segments = []
740 | if (_startea != BADADDR and _endea != BADADDR):
741 | prolog_offsets = self.find_byte_string(_startea, _endea, _prolog, False)
742 | for offset_idx in range(0, len(prolog_offsets)):
743 | epilog_offset = self.find_next_byte_string(
744 | prolog_offsets[offset_idx],
745 | _epilog)
746 | if epilog_offset is not None and epilog_offset != idc.BADADDR:
747 | segments.append([prolog_offsets[offset_idx], epilog_offset])
748 | return segments
749 |
750 | def get_instruction_tokens(self, _ea):
751 | """
752 | Returns the tokens of the disassembled instruction at the specified address.
753 |
754 | Example:
755 | ...
756 | 0x2C00: pop r1 ; Pops stack into R1 register
757 | ...
758 | s = get_instruction_tokens(0x2C00)
759 | print(s)
760 | ['pop', 'r1', ';', 'Pops', 'stack', 'into', 'R1', 'register']
761 |
762 | @param _ea Address of the instruction to disassemble
763 | @return Array of string containing the tokens of the disassembled instruction.
764 | """
765 | if (_ea != BADADDR):
766 | return filter(None, GetDisasm(_ea).split(" "))
767 |
768 | def get_function_at(self, _ea):
769 | """
770 | Returns the function object at the specified address.
771 | @param _ea An address within the function
772 | @return The native IDA function object at the given address.
773 | """
774 | if (_ea != BADADDR):
775 | return idaapi.get_func(_ea)
776 | else:
777 | return None
778 |
779 | def set_function_name_at(self, _funcea, _name):
780 | """
781 | Sets the name of the function located at the specified address,
782 | if any.
783 |
784 | @param _funcea An address within the function
785 | @param _name The new name of the function. Cannot be empty.
786 | @return Enoki.SUCCESS or Enoki.FAIL on error.
787 | """
788 | if (_funcea != BADADDR and len(_name) > 0):
789 | func = self.get_function_at(_funcea)
790 | if (func):
791 | return idc.MakeName(func.startEA, _name)
792 | return Enoki.FAIL
793 |
794 | def get_function_name_at(self, _ea):
795 | """
796 | Returns the name of the function at the given address if one is
797 | defined. Returns an empty string if no function is defined at the
798 | address.
799 | @param _ea An address within the function
800 | @return The name of the function or an empty string.
801 | """
802 | return GetFunctionName(_ea)
803 |
804 | def get_function_disasm(self, _ea):
805 | """
806 | This function retrieves all of the disassembled and tokenized instructions
807 | of the function located at the specified address.
808 |
809 | Example:
810 | ...
811 | 0x2C00: pop r1
812 | 0x2C01: load acc, 0
813 | 0x2C03: jmp 0x2C0A
814 | ...
815 | s = get_function_disasm(0x2C00)
816 | print(s)
817 | [['pop', 'r1'], ['load', 'acc,', '0'], ['jmp', '0x2C0A']]
818 |
819 | Note that the tokenization is done using white spaces only, so any commas will remain
820 | as part of the token.
821 |
822 | @param _ea An address within the function.
823 | @return A matrix (list of lists) containing the tokenized instructions of the
824 | function at the specified address, one list of tokens per instruction.
825 |
826 | """
827 | matrix_disasm = []
828 | if (_ea != BADADDR):
829 | current_func = self.get_function_at(_ea)
830 | if (current_func):
831 | func_start = current_func.startEA
832 | func_end = current_func.endEA
833 | curea = func_start
834 | while(curea < func_end):
835 | inst_tokens = self.get_instruction_tokens(curea)
836 | matrix_disasm.append(inst_tokens)
837 | curea = NextHead(curea)
838 | else:
839 | print("[-] No function found at 0x{:x}.".format(_ea))
840 | return matrix_disasm
841 |
842 | def get_function_disasm_with_ea(self, _ea):
843 | """
844 | This function retrieves all of the disassembled and tokenized instructions
845 | of the function located at the specified address, each paired with its address.
846 |
847 | Example:
848 | ...
849 | 0x2C00: pop r1
850 | 0x2C01: load acc, 0
851 | 0x2C03: jmp 0x2C0A
852 | ...
853 | s = get_function_disasm_with_ea(0x2C00)
854 | print(s)
855 | [(0x2C00, ['pop', 'r1']), (0x2C01, ['load', 'acc,', '0']), (0x2C03, ['jmp', '0x2C0A'])]
856 |
857 | Note that the tokenization is done using white spaces only, so any commas will remain
858 | as part of the token.
859 |
860 | @param _ea An address within the function.
861 | @return A list of tuples, each containing the address of an instruction and the
862 | list of tokens of that instruction, for the function at the specified address.
863 |
864 | """
865 | matrix_disasm = []
866 | if (_ea != BADADDR):
867 | current_func = self.get_function_at(_ea)
868 | if (current_func):
869 | func_start = current_func.startEA
870 | func_end = current_func.endEA
871 | curea = func_start
872 | while(curea < func_end):
873 | inst_tokens = self.get_instruction_tokens(curea)
874 | matrix_disasm.append((curea, inst_tokens))
875 | curea = NextHead(curea)
876 | else:
877 | print("[-] No function found at 0x{:x}.".format(_ea))
878 | return matrix_disasm
879 |
880 | def compare_code(self, _code1, _code2):
881 | """
882 | The compare_code function provides a similarity ratio between the provided code
883 | segments. It does so by using the SequenceMatcher from the difflib module, which
884 | returns a value between 0 and 1, 0 indicating 2 completely different segments and 1
885 | specifying identical code segments.
886 |
887 | @param _code1 First code segment to compare
888 | @param _code2 Second code segment to compare
889 | @return double A value between 0 and 1 indicating the degree of similarity between the
890 | 2 code segments.
891 | """
892 | sm=difflib.SequenceMatcher(None,_code1,_code2,autojunk=False)
893 | r = sm.ratio()
894 | return r
895 |
896 | def compare_functions(self, _ea_func1, _ea_func2):
897 | """
898 | Compares the code of 2 functions using the compare_code function.
899 |
900 | @param _ea_func1 Address within the first function to compare
901 | @param _ea_func2 Address within the second function to compare
902 | @return double A value between 0 and 1, 0 indicating 2 completely different
903 | functions and 1 specifying identical functions.
904 | """
905 | l1 = self.get_function_instructions(_ea_func1)
906 | l2 = self.get_function_instructions(_ea_func2)
907 | return self.compare_code(l1, l2)
908 |
909 | def get_function_instructions(self, _ea):
910 | """
911 | Retrieves the instructions, without operands, of the function located at the
912 | specified address.
913 |
914 | Example:
915 | ...
916 | 0x2C00: pop r1
917 | 0x2C01: load acc, 0
918 | 0x2C03: jmp 0x2C0A
919 | ...
920 | s = e.get_function_instructions(0x2C00)
921 | print(s)
922 | ['pop', 'load', 'jmp']
923 |
924 | @param _ea Address within the function
925 | @return Array of strings representing the instructions of the function.
926 | """
927 | instr = []
928 | if (_ea != BADADDR):
929 | instr_matrix = self.get_function_disasm(_ea)
930 | for line in instr_matrix:
931 | instr.append(line[0])
932 | return instr
933 |
934 | def get_all_functions_instr(self, _startea, _endea):
935 | """
936 | Extracts the instructions of all functions located between the provided
937 | start and end addresses. Returns a dictionary in the format
938 | <"FunctionName", ['i1', 'i2', ..., 'in']>
939 |
940 | @param _startea Starting address
941 | @param _endea Ending address
942 | @return A dictionary object. The keys are the name of the functions found
943 | within the boundaries, while the value is the array of instructions
944 | for the function.
945 | """
946 | f_instr = {}
947 | curEA = _startea
948 | func = self.get_function_at(_startea)
949 | # Walk function by function; stop once no further function exists.
950 | while (curEA != BADADDR and curEA <= _endea):
951 | name = GetFunctionName(curEA)
952 | i = self.get_function_instructions(curEA)
953 | f_instr[name] = i
954 | func = idaapi.get_next_func(curEA)
955 | curEA = func.startEA if func else BADADDR
956 | return f_instr
957 |
958 | def get_all_functions(self, _startea, _endea):
959 | """
960 | Gets all function objects between the provided start and end
961 | addresses. Returns a dictionary in the format <"FunctionName", FunctionObject>.
962 |
963 | @param _startea Starting address
964 | @param _endea Ending address
965 | @return A dictionary object. The keys are the name of the functions found
966 | within the boundaries, while the value is the native Function object
967 | of IDA.
968 | """
969 | functions = {}
970 | curEA = _startea
971 | func = self.get_function_at(curEA)
972 | if (func):
973 | while (curEA <= _endea):
974 | name = GetFunctionName(curEA)
975 | functions[name] = func
976 | func = idaapi.get_next_func(curEA)
977 | if (func):
978 | curEA = func.startEA
979 | else:
980 | break  # No further function; NextHead()'s result was discarded, looping forever
981 | return functions
982 |
983 | def get_all_func_instr_seg(self, _ea=ScreenEA()):
984 | """
985 | Returns the instructions of all the functions in the segment specified by the provided
986 | address. Returns a dictionary in the format <"FunctionName", ['i1', 'i2', ..., 'in']>.
987 |
988 | @param _ea An address within the segment. Defaults to ScreenEA(), which is evaluated
989 | once, when the class is defined, rather than at each call.
990 | @return A dictionary object. The keys are the name of the functions found
991 | within the segment, while the value is the array of instructions
992 | for the function.
993 | """
994 | return self.get_all_functions_instr(SegStart(_ea), SegEnd(_ea))
995 |
996 | def get_closest_previous_instr(self, _ea, _instruction, _max=20):
997 | """
998 | Find the closest instruction matching the specified instructions above the
999 | specified address.
1000 |
1001 | Example:
1002 | 0x2C00 lacl #FFh
1003 | 0x2C01 sacl *+
1004 | 0x2C02 sbrk #5
1005 | 0x2C03 lar ar1, *-
1006 | 0x2C04 call SUB_02CC4
1007 | ...
1008 | Python>e.get_closest_previous_instr(0x2C04, "lac")
1009 | (11264, 'lacl #FF')
1010 |
1011 | If found, the function will return the address of the matching instruction
1012 | and the matching instruction. You can specify a maximum number of instructions
1013 | to look at before giving up by setting the _max argument, which is set to
1014 | 20 by default.
1015 |
1016 | @param _ea The reference address to search from
1017 | @param _instruction A regular expression to match the required instruction
1018 | @param _max Maximum number of instructions to look at before giving up.
1019 | @return A tuple containing the address and the matching instruction.
1020 | """
1021 | found_ins = (BADADDR, "")
1022 | if (_ea != BADADDR):
1023 | step = 0
1024 | curea = _ea
1025 | found = False
1026 | while (step < _max and not found):
1027 | ins = GetMnem(curea)
1028 | if (re.search(_instruction, ins)):
1029 | found_ins = (curea, self.get_disasm(curea))
1030 | found = True
1031 | step += 1
1032 | curea = PrevHead(curea)
1033 |
1034 | return found_ins
1035 |
1036 | def get_closest_next_instr(self, _ea, _instruction, _max=20):
1037 | """
1038 | Find the closest instruction matching the specified instructions below the
1039 | specified address.
1040 |
1041 | Example:
1042 | 0x2C00 lacl #FFh
1043 | 0x2C01 sacl *+
1044 | 0x2C02 sbrk #5
1045 | 0x2C03 lar ar1, *-
1046 | 0x2C04 call SUB_02CC4
1047 | ...
1048 | Python>e.get_closest_next_instr(0x2C00, "call")
1049 | (11268, 'call SUB_02CC4')
1050 |
1051 | If found, the function will return the address of the matching instruction
1052 | and the matching instruction. You can specify a maximum number of instructions
1053 | to look at before giving up by setting the _max argument, which is set to
1054 | 20 by default.
1055 |
1056 | @param _ea The reference address to search from
1057 | @param _instruction A regular expression to match the required instruction
1058 | @param _max Maximum number of instructions to look at before giving up.
1059 | @return A tuple containing the address and the matching instruction.
1060 | """
1061 | found_ins = (BADADDR, "")
1062 | if (_ea != BADADDR):
1063 | step = 0
1064 | curea = _ea
1065 | found = False
1066 | while (step < _max and not found):
1067 | ins = GetMnem(curea)
1068 | if (re.search(_instruction, ins)):
1069 | found_ins = (curea, self.get_disasm(curea))
1070 | found = True
1071 | step += 1
1072 | curea = NextHead(curea)
1073 |
1074 | return found_ins
1075 |
1076 | def get_similarity_ratios(self, func1, func2):
1077 | """
1078 | Calculates the similarity ratios between 2 sets of functions and returns
1079 | a matrix of the results. The matrix is in the following format:
1080 |
1081 | [
1082 | ["f11", "f12", r1]
1083 | ["f21", "f22", r2]
1084 | ...
1085 | ["fn1", "fn2", rn]
1086 | ]
1087 |
1088 | Note: this function can take a while to complete and was not designed for
1089 | efficiency. O(n^2)
1090 |
1091 | @param func1 First set of functions to compare (dictionary of name -> instructions)
1092 | @param func2 Second set of functions to compare (dictionary of name -> instructions)
1093 | @return Matrix of similarity ratios for each function compared.
1094 | """
1095 | ratios = []
1096 | for f1, l1 in func1.iteritems():
1097 | for f2, l2 in func2.iteritems():
1098 | r = self.compare_code(l1, l2)
1099 | ratios.append([f1, f2, r])
1100 | return ratios
1101 |
1102 | def get_similarity_func(self, ratios, threshold=1.0):
1103 | """
1104 | Returns a matrix of similarity vectors with ratios greater than or equal
1105 | to the specified threshold.
1106 |
1107 | Example:
1108 |
1109 | ratios = [
1110 | ["f11", "f12", 1.0],
1111 | ["f21", "f22", 0.64],
1112 | ["f31", "f32", 0.85]
1113 | ]
1114 |
1115 | m = e.get_similarity_func(ratios, 0.9)
1116 | print(m)
1117 | [["f11", "f12", 1.0]]
1118 |
1119 | @param ratios Matrix of ratios as returned by function get_similarity_ratios
1120 | @param threshold Minimum threshold desired. Default value is 1.0
1121 | @return Matrix of similarity ratios with ratio greater than or equal to the specified threshold.
1122 | """
1123 | funcs = []
1124 | for r in ratios:
1125 | if (r[2] >= threshold):
1126 | #print("[+] Similarity between '{:s}' and '{:s}': {:f}.".format(r[0], r[1], r[2]))
1127 | funcs.append(r)
1128 | return funcs
1129 |
1130 | def function_is_leaf(self, _funcea):
1131 | """
1132 | Verifies if the function at the specified address is a leaf function, i.e.
1133 | it does not make any call to other functions.
1134 |
1135 | @param _funcea An address within the function
1136 | @return True if the function at the address contains no call instructions.
1137 | """
1138 | # Retrieves the calls made by the function at _funcea:
1139 | near_calls = self.get_functions_called_by(_funcea, _display=False)
1140 | return len(near_calls) == 0
1141 |
1142 | def get_functions_called_by(self, _funcea, _display=True):
1143 | """
1144 | Get all functions directly called by the function at the given address. This function
1145 | only extracts functions called at the first level, i.e. this function is not recursive.
1146 | Returns a matrix containing the address originating the call, the destination address
1147 | and the name of the function/address called.
1148 |
1149 | Example:
1150 | ...
1151 | 0x2C00: pop r1
1152 | 0x2C01: load acc, 0
1153 | 0x2C03: call 0x2CC0
1154 | 0x2C05: load acc, 27h
1155 | 0x2C07: call 0x2D78
1156 | 0x2C09: push r1
1157 | 0x2C0A: ret
1158 | ...
1159 |
1160 | m = e.get_functions_called_by(0x2C00)
1161 | print(m)
1162 | [[0x2C03, 0x2CC0, 'SUB__02CC0'],[0x2C07, 0x2D78, 'SUB__02D78']]
1163 |
1164 | @param _funcea Address within the function
1165 | @param _display If True, display the results at the console.
1166 | @return Matrix containing the source, destination and name of the functions called.
1167 | """
1168 | # Retrieves the function at _funcea:
1169 | func = self.get_function_at(_funcea)
1170 | if (not func): return []  # No function at _funcea; nothing to report.
1171 | startea = func.startEA
1172 | endea = func.endEA
1173 | # EA index:
1174 | curea = startea
1175 | # Results here:
1176 | near_calls = []
1177 | while (curea < endea):
1178 | for xref in XrefsFrom(curea):
1179 | # Code 17 is the code for 'Code_Near_Call' type of XREF
1180 | if (xref.type == 17):
1181 | # Add the current address, the address of the call and the
1182 | # name of the function called.
1183 | call_info = [xref.frm, xref.to, GetFunctionName(xref.to)]
1184 | near_calls.append(call_info)
1185 | if (_display):
1186 | print("[*] 0x{:x}: {:s} -> {:s}.".format(
1187 | call_info[0],
1188 | GetFunctionName(call_info[0]),
1189 | GetFunctionName(call_info[1])))
1190 | # Next instruction in the function
1191 | curea = NextHead(curea)
1192 | return near_calls
1193 |
1194 | def get_function_flowchart(self, _funcea):
1195 | """
1196 | Returns the flowchart of the function specified at the given address.
1197 |
1198 | @param _funcea An address within the function
1199 | @return A FlowChart object or Enoki.FAIL if the address given is invalid,
1200 | or no function was found at the address.
1201 | """
1202 | if (_funcea != BADADDR):
1203 | func = self.get_function_at(_funcea)
1204 | if (func):
1205 | return idaapi.FlowChart(func)
1206 | return Enoki.FAIL
1207 |
1208 | def get_func_block_bounds(self, _funcea):
1209 | """
1210 | Returns all the code blocks of a given function, i.e. code segment
1211 | between branches/returns and other jumps except for calls.
1212 |
1213 | Example:
1214 | 0x2C00 pop *
1215 | 0x2C01 load r1, *+
1216 | ...
1217 | 0x2C15 jmp 0x2C20
1218 | 0x2C16 call 0x03D0
1219 | ...
1220 | 0x2C20 jne r2, 0x2C3D
1221 | ...
1222 |
1223 | Python>c_blks = e.get_func_block_bounds(0x2C00)
1224 | Python>c_blks
1225 | [(0x2C00, 0x2C15), (0x2C15, 0x2C20), ...]
1226 |
1227 | @param _funcea An address within the function
1228 | @return A list of tuples containing the start of the block (inclusive) and the
1229 | end of the block (exclusive). Returns an empty list on error.
1230 | """
1231 | blks = []
1232 | fc = self.get_function_flowchart(_funcea)
1233 | if (fc != Enoki.FAIL):
1234 | for blk in fc:
1235 | blks.append((blk.startEA, blk.endEA))
1236 | return blks
1237 |
1238 | def get_block_at(self, _funcea):
1239 | """
1240 | Retrieves the code block at the given address
1241 | @param _funcea An address within the function
1242 | @return A tuple containing the boundaries of the corresponding code block.
1243 | returns (BADADDR, BADADDR) if none found.
1244 | """
1245 | found = (BADADDR, BADADDR)
1246 | if (_funcea != BADADDR):
1247 | blks = self.get_func_block_bounds(_funcea)
1248 | if (len(blks) > 0):
1249 | for (b_start, b_end) in blks:
1250 | if (_funcea >= b_start and _funcea < b_end):
1251 | return (b_start, b_end)
1252 | return found
1253 |
1254 | def get_all_sub_functions_called(self, _funcea, _level=0, _visited=None):
1255 | """
1256 | Get all functions directly and indirectly called by the function at the given address.
1257 | This function is recursive and will seek all sub function calls as well, therefore this
1258 | function can be time consuming to complete.
1259 | Returns a matrix containing the address originating the call, the destination address
1260 | and the name of the function/address called and the depth of the call from the initial
1261 | function.
1262 |
1263 | Example:
1264 | ...
1265 | 0x2C00: pop r1
1266 | 0x2C01: load acc, 0
1267 | 0x2C03: call 0x2CC0
1268 | 0x2C05: load acc, 27h
1269 | 0x2C07: call 0x2D78
1270 | 0x2C09: push r1
1271 | 0x2C0A: ret
1272 | ...
1273 | 0x2CC0 SUB__02CC0:
1274 | 0x2CC0 pop r1
1275 | 0x2CC1 load acc, 00
1276 | 0x2CC2 call 0x3DEE
1277 | ...
1278 |
1279 | m = e.get_all_sub_functions_called(0x2C00)
1280 | print(m)
1281 | [[0x2C03, 0x2CC0, 'SUB__02CC0', 0],[0x2CC2, 0x3DEE, 'SUB__03DEE', 1],
1282 | [0x2C07, 0x2D78, 'SUB__02D78', 0]]
1283 |
1284 | @param _funcea Address within the function (_level and _visited are used by the recursion)
1285 | @return Matrix containing the source, destination, name of the functions called and
1286 | the depth relative to the first function.
1287 | """
1288 | if (_visited is None): _visited = []  # Fresh list per top-level call, not a shared default
1289 | func = self.get_function_at(_funcea)
1290 | # Make sure a function object was extracted
1291 | if (not func):
1292 | print("[-] Error getting function at 0x{:x}.".format(_funcea))
1293 | return []
1294 | # Boundaries:
1295 | startea = func.startEA
1296 | endea = func.endEA
1297 | # EA index:
1298 | curea = startea
1299 | # Results here:
1300 | near_calls = []
1301 | while (curea < endea):
1302 | for xref in XrefsFrom(curea):
1303 | # Code 17 is the code for 'Code_Near_Call' type of XREF
1304 | if (xref.type == 17):
1305 | # Add the current address, the address of the call and the
1306 | # name of the function called along with the depth.
1307 | fname = GetFunctionName(xref.to)
1308 | if fname not in _visited:
1309 | _visited.append(fname)
1310 | call_info = [xref.frm, xref.to, fname, _level]
1311 | print("[*]{:s}0x{:x}: {:s} -> {:s}.".format(
1312 | " " * _level,
1313 | call_info[0],
1314 | self.get_function_name_at(call_info[0]),
1315 | self.get_function_name_at(call_info[1])))
1316 | sub_calls = self.get_all_sub_functions_called(xref.to, _level+1, _visited)
1317 | # Add calls to current ones
1318 | near_calls.append(call_info)
1319 | if (len(sub_calls) > 0):
1320 | near_calls += sub_calls
1321 |
1322 | # Next instruction in the function
1323 | curea = NextHead(curea)
1324 | return near_calls
1325 |
1326 | def get_functions_leading_to(self, _funcea):
1327 | """
1328 | This function returns all the functions calling the function at the
1329 | provided address. This function is not recursive and only returns the
1330 | first depth of function calling. Returns a matrix containing the address
1331 | originating the call, the destination address and the name of the
1332 | function/address called.
1333 |
1334 | Example:
1335 | ...
1336 | 0x2C00: MAIN:
1337 | 0x2C00: pop r1
1338 | 0x2C01: load acc, 0
1339 | 0x2C03: call 0x2CC0
1340 | 0x2C05: load acc, 27h
1341 | 0x2C07: call 0x2D78
1342 | 0x2C09: push r1
1343 | 0x2C0A: ret
1344 | ...
1345 | 0x2CC0 SUB__02CC0:
1346 | 0x2CC0 pop r1
1347 | 0x2CC1 load acc, 00
1348 | 0x2CC2 call 0x3DEE
1349 | ...
1350 |
1351 | m = e.get_functions_leading_to(0x2CC0)
1352 | print(m)
1353 | [[0x2C03, 0x2CC0, 'SUB__02CC0']]
1354 |
1355 | @param _funcea Address within the function
1356 | @return Matrix containing the source, destination, name of the functions calling the
1357 | function.
1358 | """
1359 | # Retrieves the function at _funcea:
1360 | func = self.get_function_at(_funcea)
1361 | if (not func): return []  # No function at _funcea; nothing to report.
1362 | startea = func.startEA
1363 | endea = func.endEA
1364 | # EA index:
1365 | curea = startea
1366 | # Results here:
1367 | near_calls = []
1368 | while (curea < endea):
1369 | for xref in XrefsTo(curea):
1370 | # Code 17 is the code for 'Code_Near_Call' type of XREF
1371 | if (xref.type == 17):
1372 | # Add the current address, the address of the call and the
1373 | # name of the function called.
1374 | call_info = [xref.frm, xref.to, GetFunctionName(xref.to)]
1375 | near_calls.append(call_info)
1376 | print("[*] 0x{:x}: {:s} -> {:s}.".format(
1377 | call_info[0],
1378 | GetFunctionName(call_info[0]),
1379 | GetFunctionName(call_info[1])))
1380 | # Next instruction in the function
1381 | curea = NextHead(curea)
1382 | return near_calls
1383 |
1384 | def color_all_functions_from(self, _funcea, _color):
1385 | """
1386 | Sets the background color of all functions and sub functions called from the
1387 | root function specified at the given address, i.e. this function is recursive.
1388 | This function can be used to trace the call tree of a function. The function
1389 | will return a matrix of function calls as returned by the
1390 | get_all_sub_functions_called function if it succeeds.
1391 |
1392 | Note: You may need to scroll around/refresh the GUI for the change to take
1393 | effect. The background will remain as the default color otherwise.
1394 |
1395 | The value of the color must be in the following format: 0xBBGGRR. Some colors
1396 | are defined in the header of the Enoki class.
1397 |
1398 | Example:
1399 | m = e.color_all_functions_from(0x2CC0, Enoki.BABY_BLUE)
1400 | print(m)
1401 | [[0x2C00, 0x2CC0, 'MAIN']]
1402 |
1403 | Unlike the get_all_sub_functions_called function, this function will
1404 | also change the background color in the GUI.
1405 |
1406 | @param _funcea Address within the root function
1407 | @param _color The background color to set.
1408 | @return Matrix containing the source, destination, name and depth of the functions
1409 | called from the root function. Enoki.FAIL otherwise.
1410 | """
1411 | if (_funcea != BADADDR):
1412 | fct_calls = self.get_all_sub_functions_called(_funcea, _visited=[])
1413 | if (len(fct_calls) > 0):
1414 | for fcall in fct_calls:
1415 | self.set_function_color(fcall[0], _color)
1416 | self.set_function_color(fcall[1], _color)
1417 | return fct_calls
1418 | else:
1419 | return Enoki.FAIL
1420 |
1421 | def set_function_color(self, _funcea, _color):
1422 | """
1423 | Sets the background color of the function at the specified address. The value
1424 | of the color must be in the following format: 0xBBGGRR. Some colors
1425 | are defined in the header of the Enoki class.
1426 |
1427 | Example:
1428 | Red:
1429 | e.set_function_color(0x2C00, 0x0000FF)
1430 |
1431 | Blue:
1432 | e.set_function_color(0x2C00, 0xFF0000)
1433 |
1434 | Yellow:
1435 | e.set_function_color(0x2C00, Enoki.YELLOW)
1436 |
1437 | @param _funcea Address within the function
1438 | @param _color The background color to set.
1439 | @return Enoki.SUCCESS if the background color was changed. Enoki.FAIL otherwise.
1440 | """
1441 | if (_funcea != BADADDR):
1442 | idc.SetColor(_funcea, CIC_FUNC, _color)
1443 | return Enoki.SUCCESS
1444 | return Enoki.FAIL
1445 |
1446 | def get_bytes_between(self, _startea, _endea):
1447 | """
1448 | Returns bytes located between the provided start and end addresses.
1449 |
1450 | @param _startea The start address
1451 | @param _endea The end address
1452 | @return An array of bytes located between the addresses specified.
1453 | """
1454 | bytes = []
1455 | if (_startea != BADADDR and _endea != BADADDR):
1456 | curea = _startea
1457 | while (curea <= _endea):
1458 | b = idaapi.get_byte(curea)
1459 | bytes.append(b)
1460 | curea += 1
1461 | return bytes
1462 |
1463 | def get_words_between(self, _startea, _endea):
1464 | """
1465 | Returns words located between the provided start and end addresses.
1466 |
1467 | @param _startea The start address
1468 | @param _endea The end address
1469 | @return An array of words located between the addresses specified.
1470 | """
1471 | words = []
1472 | if (_startea != BADADDR and _endea != BADADDR):
1473 | curea = _startea
1474 | while (curea <= _endea):
1475 | w = idaapi.get_16bit(curea)
1476 | words.append(w)
1477 | curea += 1
1478 | return words
1479 |
1480 | def get_disasm_between(self, _startea, _endea):
1481 | """
1482 | Returns a list of disassembled code between the two addresses
1483 | provided.
1484 |
1485 | Example:
1486 | Python>a = e.get_disasm_between(0x2C00, 0x2C10)
1487 | Python>a
1488 | ['pop ar0', 'sar ar0, *', 'sar ar1, *', 'lar ar0, #106', ...]
1489 |
1490 | @param _startea The starting address of the section
1491 | @param _endea The ending address of the section
1492 | @return A list of instructions, returns an empty list ([]) if an error occurred.
1493 | """
1494 | lines = []
1495 | if (_startea != BADADDR and _endea != BADADDR):
1496 | if (_startea > _endea):
1497 | t = _startea
1498 | _startea = _endea
1499 | _endea = t
1500 | curea = _startea
1501 |
1502 | while (curea <= _endea):
1503 | disasm = self.get_disasm(curea)
1504 | lines.append(disasm)
1505 | curea = NextHead(curea)
1506 | return lines
1507 |
1508 | def get_disasm_function_line(self, _funcea):
1509 | """
1510 | Returns a list of disassembled instructions from the function at the
1511 | given address.
1512 |
1513 | Example:
1514 | Python>a = e.get_disasm_function_line(0x2CD0)
1515 | Python>a
1516 | ['popd *+', 'sar ar0, *+', 'sar ar1, *', ...]
1517 |
1518 | @param _funcea Address within the function
1519 | @return A list of instructions, returns an empty list ([]) if an error occurred.
1520 | """
1521 | if (_funcea != BADADDR):
1522 | func = self.get_function_at(_funcea)
1523 | if (func):
1524 | return self.get_disasm_between(func.startEA, func.endEA-1)
1525 | return []
1526 |
1527 | def get_disasm_all_functions_from(self, _funcea):
1528 | """
1529 | Retrieves all the disassembled codes of the function at the specified
1530 | address and all functions called from the function. This function is recursive
1531 | and can take a while to complete. Depending on the complexity of the root function,
1532 | it may also take considerable memory resources.
1533 |
1534 | If successful, this function returns a dictionary. The keys are the name
1535 | of the functions and the values are list of strings containing the instructions
1536 | of the function.
1537 |
1538 | Example:
1539 | Python>a = e.get_disasm_all_functions_from(0x2C00)
1540 | Python>print(a)
1541 | {'sub_2C00': ['popd *+', 'sar ar0, *+', 'sar ar1, ...],
1542 | ...
1543 | 'sub_23CC': ['popd *+', 'sar ar0, *+', 'sar ar1, ...] }
1544 |
1545 | @param _funcea Address within the function
1546 | @return a dictionary using the key-value pair ("function_name", [instructions])
1547 | """
1548 | fdisasm = {}
1549 | if (_funcea != BADADDR):
1550 | froot_disasm = self.get_disasm_function_line(_funcea)
1551 | froot_name = GetFunctionName(_funcea)
1552 | fdisasm[froot_name] = froot_disasm
1553 | fcalled = self.get_all_sub_functions_called(_funcea, _visited=[])
1554 | print(fcalled)
1555 | if (len(fcalled) > 0):
1556 | print("[*] Retrieving assembly from {:d} function(s).".format(len(fcalled)))
1557 | for finfo in fcalled:
1558 | fea = finfo[1]
1559 | fname = finfo[2]
1560 | fcode = self.get_disasm_function_line(fea)
1561 | fdisasm[fname] = fcode
1562 | return fdisasm
1563 |
1564 | def function_find_all(self, _funcea, _criteria):
1565 | """
1566 | Retrieves all instructions within the specified function that match
1567 | the strings provided in the list '_criteria'.
1568 |
1569 | Example:
1570 | Python>r = e.function_find_all(0xA8BF, ["popd", "#15Ah"])
1571 | Python>print(r)
1572 | ['popd *+ ; Pop Top of Stack',
1573 | 'ldp #15Ah ']
1574 |
1575 | @param _funcea Address within the function to search
1576 | @param _criteria A list of regular expressions to match against
1577 | every instruction in the function.
1578 | @return A list of instructions matching the provided search criteria
1579 | """
1580 | found_ins = []
1581 | if (_funcea != BADADDR):
1582 | if (not type(_criteria) in [list, tuple]):
1583 | _criteria = [_criteria]
1584 |
1585 | fdisasm = self.get_disasm_function_line(_funcea)
1586 | if (len(fdisasm) > 0):
1587 | for ins in fdisasm:
1588 | for crit in _criteria:
1589 | if (re.search(crit, ins)):
1590 | found_ins.append(ins)
1591 | return found_ins
1592 |
1593 | def function_find_all_ea(self, _funcea, _criteria):
1594 | """
1595 | Retrieves all instructions within the specified function that match
1596 | the strings provided in the list '_criteria' along with the address the
1597 | matching instruction was found.
1598 |
1599 | Example:
1600 | Python>r = e.function_find_all_ea(0xA8BF, ["popd", "#15Ah"])
1601 | Python>print(r)
1602 | [(0xA8C5, 'popd *+ ; Pop Top of Stack'),
1603 | (0xA8D9, 'ldp #15Ah ')]
1604 |
1605 | @param _funcea Address within the function to search
1606 | @param _criteria A list of regular expressions to match against
1607 | every instruction in the function.
1608 | @return A list of (address, instruction) tuples matching the provided search criteria
1609 | """
1610 | found_ins = []
1611 | if (_funcea != BADADDR):
1612 | if (not type(_criteria) in [list, tuple]):
1613 | _criteria = [_criteria]
1614 |
1615 | func = self.get_function_at(_funcea)
1616 | curea = func.startEA if func else BADADDR
1617 | while (func and curea < func.endEA):
1618 | ins_disasm = self.get_disasm(curea)
1619 |
1620 | for c in _criteria:
1621 | if (re.search(c, ins_disasm)):
1622 | found_ins.append((curea, ins_disasm))
1623 |
1624 | curea = NextHead(curea)
1625 |
1626 | return found_ins
1627 |
1628 | def function_contains_all(self, _funcea, _criteria):
1629 | """
1630 | Verifies if ALL the regular expressions in the _criteria arguments
1631 | have a matching instruction in the function at the given address. If one
1632 | or more of the regular expressions included do not match any instruction,
1633 | this function will return False.
1634 |
1635 | Example:
1636 | popd *+
1637 | sar ar0, *+
1638 | sar ar1, *
1639 | lar ar0, #1
1640 | lar ar0, *0+, ar2 ;(dseg:0001)
1641 | ...
1642 |
1643 | Python>e.function_contains_all(0xBFDC, ["popd", "lar\\s+ar"])
1644 | True
1645 | Python>e.function_contains_all(0xBFDC, ["popd", "lar\\s+ar7"])
1646 | False
1647 |
1648 | @param _funcea Address within the function to search
1649 | @param _criteria A list of regular expressions to match against each instruction of
1650 | the function.
1651 | @return True if all regular expressions were matched, False otherwise.
1652 | """
1653 | if (_funcea != BADADDR):
1654 | if (not type(_criteria) in [list, tuple]):
1655 | _criteria = [_criteria]
1656 |
1657 | fdisasm = self.get_disasm_function_line(_funcea)
1658 |
1659 | if (len(fdisasm) > 0):
1660 | for crit in _criteria:
1661 | idx = 0
1662 | matched = False
1663 |
1664 | while (idx < len(fdisasm) and not matched):
1665 | ins = fdisasm[idx]
1666 | if (re.search(crit, ins)):
1667 | matched = True
1668 |
1669 | idx += 1
1670 |
1671 | if (not matched):
1672 | return False
1673 |
1674 | return True
1675 | return False
1676 |
1677 | def find_all_functions_contain(self, _criteria, _startea=MinEA(), _endea=MaxEA()):
1678 | """
1679 | This function will look for all functions between the given boundaries that contain
1680 | instructions matching all regular expressions in the given list.
1681 |
1682 | Example:
1683 | Python>e.find_all_functions_contain(["popd", "lar\\s+ar"], _startea=0x8F59, _endea=0x9000)
1684 | ['sub_8F42', 'sub_8F97', 'sub_8FE8', ...]
1685 |
1686 | Python>e.find_all_functions_contain(["popd", "lar\\s+ar7"], _startea=0x8000, _endea=0x8FFF)
1687 | []
1688 |
1689 | @param _criteria A list of regular expressions to match against each instruction of
1690 | the function.
1691 | @param _startea The starting address of the search. If no value is specified, MinEA() is
1692 | used.
1693 | @param _endea The ending address of the search. Defaults to MaxEA().
1694 | @return A list containing the names of the matching functions.
1695 | """
1696 | found = []
1697 | f = self.get_function_at(_startea)
1698 | while (f and f.startEA <= _endea):
1699 | fname = GetFunctionName(f.startEA)
1700 | if (self.function_contains_all(f.startEA, _criteria)):
1701 | found.append(fname)
1702 | f = idaapi.get_next_func(f.startEA)
1703 | return found
1704 |
1705 | def search_code_all_functions_from(self, _funcea, _search):
1706 | """
1707 | This function searches all the disassembly of the function at the
1708 | given address and all functions called from it for the specified
1709 | regular expression.
1710 |
1711 | Example:
1712 | Python>a = e.search_code_all_functions_from(0x0800, "1EFh")
1713 | Python>print(a)
1714 | [('sub_0800', 'lacc #1EFh, *+'), ('sub_0800', 'lacc #1EFh, *+')]
1715 |
1716 | In the example above, two LACC instructions containing "1EFh" are found
1717 | in the same function, hence the function appears twice in the results.
1718 |
1719 | @param _funcea Address within the function to search
1720 | @param _search A regular expression to search for in the disassembled code.
1721 | @return A list of tuples containing the name of the function and the
1722 | matching instructions.
1723 | """
1724 | results = []
1725 | if (_funcea != BADADDR):
1726 | disasm = self.get_disasm_all_functions_from(_funcea)
1727 | for fname, fcode in disasm.items():
1728 | for ins in fcode:
1729 | if re.search(_search, ins):
1730 | results.append((fname, ins))
1731 | return results
1732 |
1733 | def find_similar_functions_in_tree(self, _funcea, _startea, _threshold=1.0):
1734 | """
1735 | Attempts to find other functions similar to the one specified in the call tree
1736 | of the given function.
1737 |
1738 | This function accepts the address of a reference function and walks the call tree
1739 | rooted at the second address provided. The instructions of both functions are compared
1740 | and, if their similarity is greater than or equal to the specified threshold, the function
1741 | from the call tree is added to the results returned.
1742 |
1743 | The function returns a matrix in the following format:
1744 | [
1745 | [address1, name1, ratio1],
1746 | ...
1747 | [addressN, nameN, ratioN]
1748 | ]
1749 |
1750 | @param _funcea Address within the reference function to compare against
1751 | @param _startea Address of the starting function of the call tree
1752 | @param _threshold Minimum similarity ratio required to keep a function (default 1.0)
1753 | @return A matrix containing the address, name and ratio of the functions found.
1754 | """
1755 | results = []
1756 | if (_funcea != BADADDR and _startea != BADADDR):
1757 | tree = self.get_all_sub_functions_called(_startea, _visited=[])
1758 | for fcall in tree:
1759 | fcalled_ea = fcall[1]
1760 | fcalled_name = fcall[2]
1761 | ratio = self.compare_functions(_funcea, fcalled_ea)
1762 | if (ratio >= _threshold):
1763 | results.append([fcalled_ea, fcalled_name, ratio])
1764 |
1765 | return results
1766 |
1767 | def save_range_to_file(self, _startea, _endea, _file):
1768 | """
1769 | Saves the chunk of bytes between the given start and end addresses into
1770 | the given file.
1771 |
1772 | @param _startea The starting address of the chunk
1773 | @param _endea The ending address of the chunk
1774 | @param _file Name of the file to write.
1775 | @return Enoki.SUCCESS if the file was written successfully, Enoki.FAIL
1776 | otherwise.
1777 | """
1778 | if (_startea != BADADDR and _endea != BADADDR):
1779 | try:
1780 | chunk = bytearray(idc.GetManyBytes(_startea, ((_endea-_startea)+1)*2))
1781 | print("Exporting {:d}-byte chunk from 0x{:05x} to 0x{:05x} to {:s}.".format(len(chunk), _startea, _endea, _file))
1782 | with open(_file, "wb") as f:
1783 | f.write(chunk)
1784 | except Exception as e:
1785 | print("[-] Error while writing file: {:s}.".format(str(e)))
1786 | return Enoki.FAIL
1787 | return Enoki.SUCCESS
1788 |
1789 | e = Enoki()
1790 | print("[+] Enoki {:s} loaded successfully.".format(e.vers()))
--------------------------------------------------------------------------------
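The matching rule used by function_contains_all (every regular expression must match at least one instruction) can be tried outside of IDA. The sketch below is only an illustration of that rule: contains_all and the sample disasm list are hypothetical names that are not part of Enoki, and the list stands in for what get_disasm_function_line is assumed to return.

```python
import re


def contains_all(lines, criteria):
    """Return True if every regex in criteria matches at least one line."""
    # Accept a single pattern as well as a list or tuple of patterns.
    if not isinstance(criteria, (list, tuple)):
        criteria = [criteria]
    # Mirror the original behaviour: an empty disassembly never matches.
    if not lines:
        return False
    # Each criterion needs at least one matching instruction.
    return all(any(re.search(crit, ins) for ins in lines) for crit in criteria)


# Hypothetical disassembly lines, copied from the docstring example.
disasm = [
    "popd *+",
    "sar ar0, *+",
    "sar ar1, *",
    "lar ar0, #1",
    "lar ar0, *0+, ar2",
]

print(contains_all(disasm, ["popd", r"lar\s+ar"]))   # True
print(contains_all(disasm, ["popd", r"lar\s+ar7"]))  # False
```

Using raw strings (r"lar\s+ar") avoids the doubled backslashes shown in the IDA console examples.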