├── .gitattributes
├── README.md
└── enoki.py


/.gitattributes:
--------------------------------------------------------------------------------
 1 | # Auto detect text files and perform LF normalization
 2 | * text=auto
 3 | 
 4 | # Custom for Visual Studio
 5 | *.cs     diff=csharp
 6 | 
 7 | # Standard to msysgit
 8 | *.doc	 diff=astextplain
 9 | *.DOC	 diff=astextplain
10 | *.docx diff=astextplain
11 | *.DOCX diff=astextplain
12 | *.dot  diff=astextplain
13 | *.DOT  diff=astextplain
14 | *.pdf  diff=astextplain
15 | *.PDF	 diff=astextplain
16 | *.rtf	 diff=astextplain
17 | *.RTF	 diff=astextplain
18 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
 1 | # What is _Enoki_ ?
 2 | The _Enoki_ script is a wrapper class for [IDAPython](https://www.hex-rays.com/products/ida/support/idapython_docs/). It groups various useful functions for reverse engineering of non-standard 
 3 | and/or uncommon binaries. Many of the scripts currently available online are geared towards malware analysis of Windows [Portable Executable (PE)
 4 | files](https://en.wikipedia.org/wiki/Portable_Executable). As such, most of their functionality targets Intel-based systems and focuses on detecting or
 5 | deobfuscating malicious code in well-known file formats. _Enoki_ seeks to provide a set of basic functions for analysis of binaries, memory maps
 6 | or other non-malware-oriented files for reverse engineering purposes.
 7 | 
 8 | ## Summary
 9 | 
10 | The _Enoki_ script is a wrapper around many IDAPython functions and is designed for analysts conducting reverse engineering on
11 | non-standard and uncommon files such as firmware of embedded devices or unknown files from ICS systems. _Enoki_ provides
12 | additional shortcut functions for extracting, searching and analyzing machine code, useful when IDA has issues parsing
13 | or detecting the actual processor.
14 | 
15 | ## Usage
16 | 
17 | To use _Enoki_ with [IDA](https://www.hex-rays.com/products/ida/), load the _enoki.py_ file into IDA. An instance of the _Enoki_ object will automatically be created in the ```e``` variable, or you can create your own
18 | instance using the following command in the interpreter:
19 | 
20 | ```
21 | e = Enoki()
22 | ```
23 | 
24 | Call any required function through the instance, for example:
25 | 
26 | ```
27 | Python>hex(e.current_file_offset())
28 | 0x74fc
29 | ```
30 | 
31 | ## Examples
32 | 
33 | This section provides some examples of the functionality provided by the _Enoki_ script. More details can be found by consulting the wiki of the project.
34 | 
35 | ### Find a byte string
36 | 
37 | One of the functions provided by _Enoki_ is ```find_byte_string```, which allows the analyst to search for a specific sequence of bytes or words in the machine
38 | code. The function returns all locations where the byte string has been found in the range searched. 
39 | 
40 | ```
41 | Python>e.find_byte_string(ScreenEA(), ScreenEA() + 0x1000, "7980 ????")
42 | [150, 155, 173, 198, 208]
43 | ```
44 | 
45 | If you need the output as hexadecimal addresses, wrap each result with the ```hex()``` function:
46 | 
47 | ```
48 | Python>[hex(i) for i in e.find_byte_string(ScreenEA(), ScreenEA() + 0x1000, "7980 ????")]
49 | ['0x96', '0x9b', '0xad', '0xc6', '0xd0']
50 | ```
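
To make the pattern syntax concrete, here is a minimal standalone sketch of how a byte string such as `"7980 ????"` can be matched against raw bytes. It is not how _Enoki_ does it (the script relies on IDA's native search); the helper names `compile_pattern` and `find_pattern` are hypothetical, and the sketch targets plain Python 3 outside IDA.

```python
import re


def compile_pattern(pattern):
    """Turn a hex pattern such as "7980 ????" into a compiled bytes regex.

    Each pair of hex digits matches one byte; "??" matches any byte.
    (Hypothetical helper, not part of Enoki.)
    """
    digits = pattern.replace(" ", "")
    if len(digits) % 2:
        raise ValueError("pattern must describe whole bytes")
    parts = []
    for i in range(0, len(digits), 2):
        pair = digits[i:i + 2]
        if pair == "??":
            parts.append(b".")  # wildcard byte
        else:
            parts.append(re.escape(bytes([int(pair, 16)])))
    # DOTALL so that the wildcard also matches the newline byte 0x0A.
    return re.compile(b"".join(parts), re.DOTALL)


def find_pattern(data, pattern, start=0, end=None):
    """Return every offset in data[start:end] where the pattern begins."""
    end = len(data) if end is None else end
    regex = compile_pattern(pattern)
    offsets = []
    pos = start
    while True:
        match = regex.search(data, pos, end)
        if match is None:
            break
        offsets.append(match.start())
        pos = match.start() + 1  # allow overlapping matches
    return offsets


sample = bytes.fromhex("0079801234ff7980abcd")
print(find_pattern(sample, "7980 ????"))
```

Advancing by one byte after each hit keeps overlapping matches, which is a design choice of this sketch rather than a statement about IDA's search behavior.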
51 | 
52 | ### Compare two code ranges for similarity
53 | 
54 | Another functionality available is to compare the similarity of two code segments via the ```compare_code``` function. This function
55 | will take two arrays of opcodes or assembly instructions and calculate the similarity of the sequence. In the example below, 
56 | the similarity is only 11%, meaning the 2 code segments are quite different.
57 | 
58 | ```
59 | Python>c1 = e.get_words_between(0x2C00, 0x2CFF)
60 | Python>c2 = e.get_words_between(0x8000, 0x80FF)
61 | Python>e.compare_code(c1, c2)
62 | 0.11328125
63 | ```
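
As a rough illustration of what a similarity score between two opcode sequences can look like, the sketch below uses `difflib.SequenceMatcher` from the standard library (the script imports `difflib`, but whether `compare_code` computes its score this way is an assumption). The function name `similarity` and the sample word lists are made up for the example.

```python
import difflib


def similarity(seq_a, seq_b):
    """Return a ratio in [0.0, 1.0]; 1.0 means the sequences are identical.

    Hypothetical stand-in for a code comparison, based on
    difflib.SequenceMatcher; not taken from Enoki itself.
    """
    return difflib.SequenceMatcher(None, seq_a, seq_b).ratio()


# Two short, made-up sequences of 16-bit words.
code_a = [0x4500, 0x4885, 0x0090]
code_b = [0x4500, 0x4885, 0x1234]
print(similarity(code_a, code_b))
```

The ratio is twice the number of matching elements divided by the total number of elements in both sequences, so two sequences sharing two of three words score about 0.67.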
64 | 
65 | Other functions are available within _Enoki_ and more details can be found in the comments of the script or in the future wiki of the project.
66 | 
67 | 
68 | ## References
69 | 
70 | If you find this script useful for your projects or research, please add a reference or link to this project to help make it better.
71 | 
72 | - __URL:__
73 |   - [Enoki](https://github.com/InfectedPacket/Idacraft), https://github.com/InfectedPacket/Idacraft
74 | - __Reference (Chicago):__
75 |   - Racicot, Jonathan. 2016. Enoki (version 1.0.2). Windows/Mac/Linux. Ottawa, Canada.
76 | - __Reference (IEEE):__
77 |   - J. Racicot, Enoki. Ottawa, Canada, 2016.
78 |   
79 | 


--------------------------------------------------------------------------------
/enoki.py:
--------------------------------------------------------------------------------
   1 | #!/usr/bin/env python
   2 | # Copyright (C) 2015 Jonathan Racicot
   3 | #
   4 | # This program is free software: you can redistribute it and/or modify
   5 | # it under the terms of the GNU General Public License as published by
   6 | # the Free Software Foundation, either version 3 of the License, or
   7 | # (at your option) any later version.
   8 | #
   9 | # This program is distributed in the hope that it will be useful,
  10 | # but WITHOUT ANY WARRANTY; without even the implied warranty of
  11 | # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
  12 | # GNU General Public License for more details.
  13 | #
  14 | # You should have received a copy of the GNU General Public License
  15 | # along with this program. If not, see <http://www.gnu.org/licenses/>.
  16 | #
  17 | # If you use this program and find it useful, please include a link
  18 | # or reference to the project's page in your program and/or document.
  19 | #
  20 | # Reference (Chicago):
  21 | #  Racicot, Jonathan. 2016. Enoki (version 1.0.2). Windows/Mac/Linux. Ottawa, Canada.
  22 | # Reference (IEEE):
  23 | #  J. Racicot, Enoki. Ottawa, Canada, 2016.
  24 | #
  25 | # </copyright>
  26 | # <author>Jonathan Racicot</author>
  27 | # <email>infectedpacket@gmail.com</email>
  28 | # <date>2016-01-10</date>
  29 | # <url>https://github.com/infectedpacket</url>
  30 | #//////////////////////////////////////////////////////////////////////////////
  31 | #
  32 | #//////////////////////////////////////////////////////////////////////////////
  33 | # Imports
  34 | #//////////////////////////////////////////////////////////////////////////////
  35 | #
  36 | import re
  37 | import idc
  38 | import idaapi
  39 | import difflib
  40 | import idautils
  41 | import logging
  42 | from time import strftime
  43 | #//////////////////////////////////////////////////////////////////////////////
  44 | # Enoki class
  45 | #//////////////////////////////////////////////////////////////////////////////
  46 | class Enoki(object):
  47 | 	"""
  48 | 		Description: 
  49 | 			Provides wrapping functions around IDAPython to analyze
  50 | 		and format structures for unknown/difficult architectures.
  51 | 		
  52 | 		Notes:
  53 | 			Tested on IDA Pro v6.5
  54 | 			
  55 | 		Author:
  56 | 			Jonathan Racicot
  57 | 			
  58 | 		Date:
  59 | 			Created: 2015-10-14
  60 | 			Updated: 2016-01-10
  61 | 	"""
  62 | 	
  63 | 	VERSION = "1.0.0"
  64 | 	
  65 | 	#Specifies a 16bit segment
  66 | 	SEG_16	=	0
  67 | 	#Specifies a 32bit segment
  68 | 	SEG_32	=	1
  69 | 	#Specifies a 64bit segment
  70 | 	SEG_64	=	2
  71 | 
  72 | 	#Segment bitness to use when none has been specified.
  73 | 	DEFAULT_SEGMENT_SIZE = SEG_16
  74 | 	
  75 | 	#Specifies a DATA segment
  76 | 	SEG_DATA = "DATA"
  77 | 	#Specifies a CODE segment
  78 | 	SEG_CODE = "CODE"
  79 | 
  80 | 	SEG_TYPE_CODE = 2
  81 | 	SEG_TYPE_DATA = 3		
  82 | 	
  83 | 	#Used for assessing returns from IDA functions calls.
  84 | 	FAIL = 0
  85 | 	SUCCESS = 1
  86 | 	
  87 | 	# Basic colors
  88 | 	RED 	= 0x0000FF
  89 | 	GREEN 	= 0x00FF00
  90 | 	BLUE 	= 0xFF0000
  91 | 	YELLOW  = 0x00FFFF
  92 | 	WHITE 	= 0xFFFFFF
  93 | 	BLACK	= 0x000000
  94 | 	CYAN	= 0xFFFF00
  95 | 	# Fancy colors
  96 | 	ABSOLUTE_ZERO 		= 0xBA4800
  97 | 	AFRICAN_VIOLET		= 0xBE84B2
  98 | 	ALIZARIN_CRIMSON 	= 0x3626E3
  99 | 	AMBER		=	0x00BFFF
 100 | 	APPLE_GREEN	=	0x00B68D
 101 | 	AZURE		=	0xFF7F00
 102 | 	BABY_BLUE	=	0xF0CF89
 103 | 	BABY_PINK	=	0xC2C2F4
 104 | 	BONE		=	0xE3DAC9
 105 | 	CADMIUM_ORANGE = 0x2D87ED
 106 | 	CITRINE 	=	0xE4D00A
 107 | 	CADET_BLUE	=	0x5F9EA0
 108 | 	CHAMOISEE	=	0xA0785A
 109 | 
 110 | 	logger = logging.getLogger(__name__)
 111 | 
 112 | 	def __init__(self):
 113 | 		"""
 114 | 		Constructor of the Enoki engine. Does nothing.
 115 | 		"""
 116 | 		pass
 117 | 		
 118 | 	def vers(self):
 119 | 		return Enoki.VERSION
 120 | 		
 121 | 	def make_comment(self, _ea, _comment):
 122 | 		"""
 123 | 		Creates a comment at the given address.
 124 | 		
 125 | 		@param _ea: The address where the comment will be created.
 126 | 		@param _comment: The comment
 127 | 		@return Enoki.SUCCESS if the comment was created successfully,
 128 | 		Enoki.FAIL otherwise.
 129 | 		"""	
 130 | 		return idc.MakeComm(_ea, _comment)
 131 | 		
 132 | 	def clear_comment(self, _ea):
 133 | 		"""
 134 | 		Removes any comment at the given address.
 135 | 		
 136 | 		@param _ea: The address where the comment will be removed.
 137 | 		@return Enoki.SUCCESS if the comment was removed successfully,
 138 | 		Enoki.FAIL otherwise.		
 139 | 		"""
 140 | 		return self.make_comment(_ea, "")
 141 | 		
 142 | 	def clear_all_comments(self, _startea, _endea):
 143 | 		"""
 144 | 		Removes all comments between the given addresses.
 145 | 		
 146 | 		@param _startea: The start address.
 147 | 		@param _endea: The end address.
 148 | 		@return Enoki.SUCCESS if all the comments were removed,
 149 | 		Enoki.FAIL otherwise.		
 150 | 		"""	
 151 | 		error = Enoki.FAIL
 152 | 		if (_startea != idc.BADADDR and _endea != idc.BADADDR):
 153 | 			curea = _startea
 154 | 			error = Enoki.SUCCESS
 155 | 			while (curea < _endea):
 156 | 				if (self.clear_comment(curea) == Enoki.FAIL):
 157 | 					error = Enoki.FAIL
 158 | 				curea = idc.NextHead(curea)
 159 | 		return error
 160 | 
 161 | 	def clear_function_comments(self, _funcea):
 162 | 		"""
 163 | 		Removes all comments in the function at the specified address.
 164 | 		
 165 | 		@param _funcea: An address within the function
 166 | 		@return Enoki.SUCCESS if all the comments were removed,
 167 | 		Enoki.FAIL otherwise.		
 168 | 		"""		
 169 | 		func = self.get_function_at(_funcea)
 170 | 		if (func):
 171 | 			return self.clear_all_comments(func.startEA, func.endEA)
 172 | 		else:
 173 | 			return Enoki.FAIL
 174 | 		
 175 | 	def append_comment(self, _ea, _comment):
 176 | 		"""
 177 | 		Appends a new comment to an instruction at the specified address.
 178 | 		
 179 | 		@param _ea: The address where the comment will be appended.
 180 | 		@param _comment: The comment
 181 | 		@return Enoki.SUCCESS if the comment was created successfully,
 182 | 		Enoki.FAIL otherwise.		
 183 | 		"""
 184 | 		if (_ea != idc.BADADDR):
 185 | 			cur_comment = idc.Comment(_ea)
 186 | 			if (cur_comment != None and len(cur_comment) > 0):
 187 | 				comment = "{:s}\n{:s}".format(cur_comment, _comment)
 188 | 			else:
 189 | 				comment = _comment
 190 | 			return self.make_comment(_ea, comment)
 191 | 		return Enoki.FAIL
 192 | 		
 193 | 	def make_repeat_comment(self, _ea, _comment):
 194 | 		"""
 195 | 		Creates a repeatable comment at the given address.
 196 | 		
 197 | 		@param _ea: The address where the comment will be created.
 198 | 		@param _comment: The comment
 199 | 		@return Enoki.SUCCESS if the comment was created successfully,
 200 | 		Enoki.FAIL otherwise.
 201 | 		"""	
 202 | 		return idc.MakeRptCmt(_ea, _comment)
 203 | 
 204 | 	def backup_database(self):
 205 | 		""" 
 206 | 			Backup the database to a file similar to 
 207 | 			IDA's snapshot function. 
 208 | 		"""
 209 | 		time_string = strftime('%Y%m%d%H%M%S')
 210 | 		input_path = idc.GetInputFile()
 211 | 		if not input_path:
 212 | 			raise NoInputFileException('No input file provided')
 213 | 		input_file = input_path.rsplit('.', 1)[0]
 214 | 		backup_file = "{:s}_{:s}.idb".format(input_file, time_string)
 215 | 		idc.SaveBase(backup_file, idaapi.DBFL_BAK)  
 216 | 		
 217 | 	def create_segment(self, _startea, _endea, _name, 
 218 | 		_type, _segsize=DEFAULT_SEGMENT_SIZE):
 219 | 		"""
 220 | 			Creates a segment between provided addresses.
 221 | 			
 222 | 			@param _startea: The start address of the segment.
 223 | 			@param _endea: The end address of the segment.
 224 | 			@param _name: Name to be given to the new segment.
 225 | 			@param _type: Either idaapi.SEG_CODE to specify a code
 226 | 				segment or idaapi.SEG_DATA for a segment containing data.
 227 | 			@param _segsize: Bitness of the segment, e.g. 16, 32 or 64 bit.
 228 | 		"""		
 229 | 		r = idc.AddSeg(_startea, _endea, 0, _segsize, 1, 2)
 230 | 		if (r == Enoki.SUCCESS):
 231 | 			idc.RenameSeg(_startea, _name) 
 232 | 			return idc.SetSegmentType(_startea, _type)
 233 | 		else:
 234 | 			return Enoki.FAIL
 235 |   
 236 | 	def get_segment(self, _ea):
 237 | 		return idaapi.getseg(_ea)
 238 |   
 239 | 	def get_segment_type(self, _ea):
 240 | 		return self.get_seg_attribute(_ea, idc.SEGATTR_TYPE)
 241 |   
 242 | 	def segment_is_code(self, _segea):
 243 | 		return self.get_segment_type(_segea) == self.SEG_TYPE_CODE
 244 |   
 245 | 	def segment_is_data(self, _segea):
 246 | 		return self.get_segment_type(_segea) == self.SEG_TYPE_DATA  
 247 |   
 248 | 	def get_seg_attribute(self, _segea, _attr):
 249 | 		"""
 250 | 		Gets an attribute of the segment at the given address. The available
 251 | 		attributes are:
 252 | 		  SEGATTR_START          starting address
 253 | 		  SEGATTR_END            ending address
 254 | 		  SEGATTR_ALIGN          alignment
 255 | 		  SEGATTR_COMB           combination
 256 | 		  SEGATTR_PERM           permissions
 257 | 		  SEGATTR_BITNESS        bitness (0: 16, 1: 32, 2: 64 bit segment)
 258 | 		  SEGATTR_FLAGS          segment flags
 259 | 		  SEGATTR_SEL            segment selector
 260 | 		  SEGATTR_ES             default ES value
 261 | 		  SEGATTR_CS             default CS value
 262 | 		  SEGATTR_SS             default SS value
 263 | 		  SEGATTR_DS             default DS value
 264 | 		  SEGATTR_FS             default FS value
 265 | 		  SEGATTR_GS             default GS value
 266 | 		  SEGATTR_TYPE           segment type
 267 | 		  SEGATTR_COLOR          segment color
 268 | 		@param _segea Address within the segment to be queried.
 269 | 		@param _attr The attribute to read. This is one of the values listed above.
 270 | 		@return The value of the attribute.
 271 | 		"""
 272 | 		return idc.GetSegmentAttr(_segea, _attr)  
 273 |   
 274 | 	def create_selector(self, _sel, _value):
 275 | 		return idc.SetSelector(_sel, _value)
 276 |   
 277 | 	def create_data_segment(self, _startea, _endea, _name,
 278 | 		_segsize=DEFAULT_SEGMENT_SIZE):
 279 | 		"""
 280 | 			Wrapper around the create_segment function to 
 281 | 			create a new data segment.
 282 | 			@param _startea: The start address of the segment.
 283 | 			@param _endea: The end address of the segment.
 284 | 			@param _name: Name to be given to the new segment.
 285 | 			@param _segsize: Bitness of the segment, e.g. 16, 32 or 64 bit.			
 286 | 		"""		
 287 | 		r = self.create_segment(_startea, _endea, _name, idaapi.SEG_DATA, _segsize)
 288 | 		if (r == Enoki.SUCCESS):
 289 | 			return self.set_seg_class_data(_startea)
 290 | 		return Enoki.FAIL
 291 | 		
 292 | 	def create_code_segment(self, _startea, _endea, _name, 
 293 | 		_segsize=DEFAULT_SEGMENT_SIZE):
 294 | 		"""
 295 | 			Wrapper around the create_segment function to 
 296 | 			create a new code segment.
 297 | 			@param _startea: The start address of the segment.
 298 | 			@param _endea: The end address of the segment.
 299 | 			@param _name: Name to be given to the new segment.
 300 | 			@param _segsize: Bitness of the segment, e.g. 16, 32 or 64 bit.			
 301 | 		"""			
 302 | 		r = self.create_segment(_startea, _endea, _name, idaapi.SEG_CODE, _segsize) 
 303 | 		if (r == Enoki.SUCCESS):
 304 | 			return self.set_seg_class_code(_startea)
 305 | 		return Enoki.FAIL
 306 | 		
 307 | 	def set_seg_selector(self, _segea, _sel):
 308 | 		return self.set_seg_attribute(_segea, idc.SEGATTR_SEL, _sel)
 309 | 		
 310 | 	def set_seg_align_para(self, _segea):
 311 | 		"""
 312 | 		Sets the alignment of the segment at the given address to 'paragraph', 
 313 | 		i.e. 16 bytes.
 314 | 		
 315 | 		@param _segea Address within the segment to be modified.
 316 | 		"""
 317 | 		return idc.SegAlign(_segea, idc.saRelPara)
 318 | 		
 319 | 	def set_seg_class_code(self, _segea):
 320 | 		"""
 321 | 		Sets the class of the segment at the given address as containing code. 
 322 | 		
 323 | 		@param _segea Address within the segment to be modified.
 324 | 		"""	
 325 | 		return self.set_seg_class(_segea, "CODE")
 326 | 		
 327 | 	def set_seg_class_data(self, _segea):
 328 | 		"""
 329 | 		Sets the class of the segment at the given address as containing data. 
 330 | 		
 331 | 		@param _segea Address within the segment to be modified.
 332 | 		"""		
 333 | 		return self.set_seg_class(_segea, "DATA")
 334 | 		
 335 | 	def set_seg_class(self, _segea, _type):
 336 | 		"""
 337 | 		Sets the class of the segment at the given address. 
 338 | 		@param _segea Address within the segment to be modified.
 339 | 		@param _type Name of the class, e.g. "CODE" or "DATA".
 340 | 		"""		
 341 | 		return idc.SegClass(_segea, _type)
 342 | 		
 343 | 	def set_seg_attribute(self, _segea, _attr, _value):
 344 | 		"""
 345 | 		Sets an attribute to the segment at the given address. The available
 346 | 		attributes are:
 347 | 		  SEGATTR_START          starting address
 348 | 		  SEGATTR_END            ending address
 349 | 		  SEGATTR_ALIGN          alignment
 350 | 		  SEGATTR_COMB           combination
 351 | 		  SEGATTR_PERM           permissions
 352 | 		  SEGATTR_BITNESS        bitness (0: 16, 1: 32, 2: 64 bit segment)
 353 | 		  SEGATTR_FLAGS          segment flags
 354 | 		  SEGATTR_SEL            segment selector
 355 | 		  SEGATTR_ES             default ES value
 356 | 		  SEGATTR_CS             default CS value
 357 | 		  SEGATTR_SS             default SS value
 358 | 		  SEGATTR_DS             default DS value
 359 | 		  SEGATTR_FS             default FS value
 360 | 		  SEGATTR_GS             default GS value
 361 | 		  SEGATTR_TYPE           segment type
 362 | 		  SEGATTR_COLOR          segment color
 363 | 		@param _segea Address within the segment to be modified.
 364 | 		@param _attr The attribute to change. This is one of the values listed above.
 365 | 		@param _value The value of the attribute.
 366 | 		"""
 367 | 		return idc.SetSegmentAttr(_segea, _attr, _value)
 368 | 		
 369 | 	def create_string_at(self, _startea, _unicode=False, _terminator="00"):
 370 | 		"""
 371 | 		Creates a StringItem object at the specified location.
 372 | 		@param _startea The start address of the string
 373 | 		@param _unicode Specifies whether the string is ASCII or Unicode
 374 | 		@param _terminator Specify the terminator character of a sequence. Default is
 375 | 				"00"
 376 | 		"""
 377 | 		# Gets the address of the closest terminator byte/word
 378 | 		strend = self.find_next_byte_string(_startea, _terminator)
 379 | 		if strend is not None and strend != idaapi.BADADDR:
 380 | 			strlen = strend - _startea
 381 | 			if (_unicode):
 382 | 				result = idaapi.make_ascii_string(_startea, strlen, idaapi.ACFOPT_UTF8)
 383 | 			else:
 384 | 				result = idaapi.make_ascii_string(_startea, strlen, idaapi.ACFOPT_ASCII)
 385 | 			if (result == Enoki.FAIL):
 386 | 				print("[-] Failed to create a string at 0x{:x} to 0x{:x}.".format(_startea, strend+1))
 387 | 				return Enoki.FAIL
 388 | 			return Enoki.SUCCESS
 389 | 		return Enoki.FAIL
 390 | 		
 391 | 	def current_file_offset(self):
 392 | 		"""
 393 | 		Returns the file offset, i.e. absolute offset from the beginning of the file,
 394 | 		of the currently selected address.
 395 | 		@return The absolute offset of the selected address.
 396 | 		"""
 397 | 		return idaapi.get_fileregion_offset(ScreenEA())
 398 | 	
 399 | 	def min_file_offset(self):
 400 | 		"""
 401 | 		Returns the minimum file offset, i.e. absolute offset of the beginning of the file/memory.
 402 | 		@return The absolute minimum offset of the loaded code.
 403 | 		"""	
 404 | 		return idaapi.get_fileregion_offset(MinEA())
 405 | 
 406 | 	def max_file_offset(self):
 407 | 		"""
 408 | 		Returns the maximum file offset, i.e. absolute offset of the end of the file/memory.
 409 | 		@return The absolute maximum offset of the loaded code.
 410 | 		"""		
 411 | 		return idaapi.get_fileregion_offset(MaxEA())  
 412 |   
 413 | 	def get_byte_at(self, _ea):
 414 | 		return idc.Byte(_ea)
 415 |   
 416 | 	def get_word_at(self, _ea):
 417 | 		return idc.Word(_ea)
 418 |   
 419 | 	def get_dword_at(self, _ea):
 420 | 		return idc.Dword(_ea)
 421 | 		
 422 | 	def get_all_bytes_between(self, _startea, _endea):
 423 | 		"""
 424 | 		Returns all bytes between the given addresses.
 425 | 		@param _startea The starting address
 426 | 		@param _endea The ending address
 427 | 		@return A list containing all bytes between the given addresses.
 428 | 		"""		
 429 | 		bytes = []
 430 | 		if (_startea != BADADDR and _endea != BADADDR):
 431 | 			curea = _startea
 432 | 			while (curea < _endea):
 433 | 				bytes.append(self.get_byte_at(curea))
 434 | 				curea = NextHead(curea)
 435 |   
 436 | 		return bytes		
 437 | 		
 438 | 	def get_all_words_between(self, _startea, _endea):
 439 | 		"""
 440 | 		Returns all words between the given addresses.
 441 | 		@param _startea The starting address
 442 | 		@param _endea The ending address
 443 | 		@return A list containing all words between the given addresses.
 444 | 		"""		
 445 | 		words = []
 446 | 		if (_startea != BADADDR and _endea != BADADDR):
 447 | 			curea = _startea
 448 | 			while (curea < _endea):
 449 | 				words.append(self.get_word_at(curea))
 450 | 				curea = NextHead(curea)
 451 |   
 452 | 		return words		
 453 | 		
 454 | 	def get_all_strings(self, _filter='', 
 455 | 		_encoding=(idautils.Strings.STR_UNICODE | idautils.Strings.STR_C)):
 456 | 		"""
 457 | 		Retrieves all strings from the current file matching the
 458 | 		regular expression specified in the filter parameter. If no
 459 | 		filter value is provided, all IDA string objects with the specified encoding
 460 | 		are returned. To access only the strings and display them in the interpreter,
 461 | 		consult the show_all_strings function.
 462 | 		
 463 | 		Values for the _encoding parameter include:
 464 | 		- Strings.STR_UNICODE
 465 | 		- Strings.STR_C
 466 | 		
 467 | 		Values for the _encoding parameter can be combined using the |
 468 | 		operator. Example:
 469 | 		
 470 | 		_encoding=(Strings.STR_UNICODE | Strings.STR_C)
 471 | 		
 472 | 		@param _filter Regular expression to filter unneeded strings.
 473 | 		@param _encoding Specifies the type of strings to seek.
 474 | 		@return A list of IDA string objects
 475 | 		"""		
 476 | 		strings = []
 477 | 		string_finder = idautils.Strings(False)
 478 | 		string_finder.setup(strtypes=_encoding)
 479 | 		
 480 | 		for index, string in enumerate(string_finder):
 481 | 			s = str(string)
 482 | 			if len(_filter) > 0 and len(s) > 0:
 483 | 				if re.search(_filter, s):
 484 | 					strings.append(string)
 485 | 			else:
 486 | 				strings.append(string)
 487 | 		return strings
 488 |   
 489 | 	def show_all_strings(self, _filter='', 
 490 | 		_encoding=(idautils.Strings.STR_UNICODE | idautils.Strings.STR_C)):
 491 | 		"""
 492 | 		This function will display the address and the strings found in the
 493 | 		file. This function differs from get_all_strings by printing the results
 494 | 		into the interpreter and only the strings are returned, while the 
 495 | 		get_all_strings function returns the IDA string objects.
 496 | 	
 497 | 		@param _filter Regular expression to filter unneeded strings.
 498 | 		@param _encoding Specifies the type of strings to seek.
 499 | 		@return A list of strings
 500 | 		"""
 501 | 		strings = []
 502 | 		strings_objs = self.get_all_strings(_filter, _encoding)
 503 | 		for s in strings_objs:
 504 | 			strings.append(str(s))
 505 | 			print("[>]\t0x{:x}: {:s}".format(s.ea, str(s)))
 506 | 		return strings
 507 |   
 508 | 	def get_string_at(self, _ea):
 509 | 		"""
 510 | 		Returns the string, if any, at the specified address.
 511 | 		@param _ea Address of the string
 512 | 		@return The string at the specified address.
 513 | 		"""	
 514 | 		if (_ea != BADADDR):
 515 | 			stype = idc.GetStringType(_ea)
 516 | 			return idc.GetString(_ea, strtype=stype)  
 517 | 		return ""
 518 |   
 519 | 	def get_all_comments_at(self, _ea):
 520 | 		"""
 521 | 		Returns both normal and repeatable comments at
 522 | 		the specified address. If both are present, a single
 523 | 		string is returned, with the two comments separated by a
 524 | 		colon (:)
 525 | 		
 526 | 		@param _ea: Address from which to retrieve the comments
 527 | 		@return: A string containing both normal and repeatable comments,
 528 | 			or an empty string if no comments are found.
 529 | 		"""	
 530 | 		normal_comment = self.get_normal_comment_at(_ea)
 531 | 		rpt_comment = self.get_repeat_comment(_ea)
 532 | 		comment = normal_comment
 533 | 
 534 | 		if (comment and rpt_comment):
 535 | 			comment += ":" + rpt_comment
 536 | 		
 537 | 		return comment
 538 |   
 539 | 	def get_normal_comment_at(self, _ea):
 540 | 		comment = idc.Comment(_ea)
 541 | 		if not comment:
 542 | 			comment = ""
 543 | 		
 544 | 		return comment
 545 |   
 546 | 	def get_repeat_comment(self, _ea):
 547 | 		comment = idc.RptCmt(_ea)
 548 | 		if not comment:
 549 | 			comment = ""
 550 | 		
 551 | 		return comment
 552 | 
 553 | 	def get_ea(self, _name):
 554 | 		"""
 555 | 		Returns the address of a named location. Returns idc.BADADDR
 556 | 		if no address matches the supplied name, or Enoki.FAIL if the name is empty.
 557 | 		@param _name Name of the location.
 558 | 		@return The address corresponding to the name.
 559 | 		"""
 560 | 		if (len(_name) > 0):
 561 | 			return idc.LocByName(_name)
 562 | 		return Enoki.FAIL
 563 | 		
 564 | 	def get_ea_label(self, _ea):
 565 | 		"""
 566 | 		Returns the label of an address if any. Returns an empty string
 567 | 		if no label is assigned to the address.
 568 | 		@param _ea Address of the location.
 569 | 		@return The label set to the address if any, empty string otherwise.
 570 | 		"""	
 571 | 		return idc.Name(_ea)
 572 | 		
 573 | 	def get_disasm(self, _ea):
 574 | 		"""
 575 | 		Returns the disassembled code at the specified address.
 576 | 		@param _ea Address of the opcode to disassemble.
 577 | 		@return String containing the disassembled code.
 578 | 		"""
 579 | 		return idc.GetDisasm(_ea)
 580 |   
 581 | 	def get_mnemonic(self, _ea):
 582 | 		"""
 583 | 		Returns the instruction at the specified address.
 584 | 		@param _ea The address from which to retrieve the instruction.
 585 | 		@return String containing the mnemonic of the instruction.
 586 | 		"""
 587 | 		return idc.GetMnem(_ea)		
 588 | 		
 589 | 	def get_first_segment(self):
 590 | 		"""
 591 | 		Returns the address of the first defined 
 592 | 		segment of the file.
 593 | 
 594 | 		@return: Start address of the first segment or 
 595 | 			idc.BADADDR if no segments are defined
 596 | 		"""	
 597 | 		return idc.FirstSeg()
 598 | 		
 599 | 	def get_next_segment(self, _ea):
 600 | 		"""
 601 | 		Returns the address of the segment following the one defined 
 602 | 		at the given address.
 603 | 
 604 | 		@param _ea: Address of the current segment.
 605 | 		
 606 | 		@return: Start address of the next segment or 
 607 | 			idc.BADADDR if there is no next segment
 608 | 		"""	
 609 | 		return idc.NextSeg(_ea)			
 610 | 		
 611 | 	def get_segment_name(self, _ea):
 612 | 		"""
 613 | 		Returns the name of the segment at the specified address.
 614 | 		@param _ea An address within the segment
 615 | 		@return String containing the name of the segment.
 616 | 		"""
 617 | 		return idc.Segname(_ea)
 618 | 		
 619 | 	def get_segment_start(self, _ea):
 620 | 		"""
 621 | 		Returns the starting address of the segment located at the specified
 622 | 		address
 623 | 		@param _ea An address within the segment
 624 | 		@return long The starting address of the segment.
 625 | 		"""
 626 | 		return idc.SegStart(_ea)
 627 | 		
 628 | 	def get_segment_end(self, _ea):
 629 | 		"""
 630 | 		Returns the ending address of the segment located at the specified
 631 | 		address
 632 | 		@param _ea An address within the segment
 633 | 		@return long The ending address of the segment.
 634 | 		"""	
 635 | 		return idc.SegEnd(_ea)	
 636 | 
 637 | 	def find_next_byte_string(self, _startea, _bytestr, _fileOffset = False, 
 638 | 		_bitness=DEFAULT_SEGMENT_SIZE):
 639 | 		"""
 640 | 		This function searches for text representing bytes and/or words in the
 641 | 		machine code of the file from a start address. This function is built on top of the native 
 642 | 		FindBinary function. The search is conducted starting at the specified address and downward
 643 | 		for the provided byte string. 
 644 | 		
 645 | 		Example:
 646 | 		e.find_next_byte_string(ScreenEA(), "0000 FFFF ???? 0000")
 647 | 		
 648 | 		@param _startea Starting address of the search
 649 | 		@param _bytestr String to search for
 650 | 		@param _fileOffset Specifies whether to return found addresses as relative or absolute
 651 | 				offsets
 652 | 		@param _bitness Specifies the bitness of the segment.
 653 | 		@return The offset of the byte string found, or idc.BADADDR if there is no search result.
 654 | 		"""
 655 | 		offset = idc.BADADDR
 656 | 		ea = _startea
 657 | 		if ea == idaapi.BADADDR:
 658 | 			print ("[-] Failed to retrieve starting address.")
 659 | 		else:
 660 | 			block = FindBinary(ea, SEARCH_DOWN | SEARCH_CASE, _bytestr, _bitness)
 661 | 			if (block == idc.BADADDR):
 662 | 				# No match: keep the sentinel that callers in this class test for.
 663 | 				offset = idc.BADADDR
 664 | 			elif _fileOffset:
 665 | 				offset = idaapi.get_fileregion_offset(block)
 666 | 			else:
 667 | 				offset = block
 668 | 		return offset
 669 | 		
 670 | 	def find_byte_string(self, _startea, _endea, _bytestr, 
 671 | 		_fileOffsets = False, _showmsg = False):
 672 | 		"""
 673 | 		This function searches for text representing bytes and/or words in the
 674 | 		machine code of the file between 2 addresses. This function is built on top of the native 
 675 | 		FindBinary function. The search is conducted starting at the specified address and downward
 676 | 		for the provided byte string. 
 677 | 		
 678 | 		Example:
 679 | 		e.find_byte_string(0x4000, 0x8000, "FF FF AA AA FF FF", True)
 680 | 		
 681 | 		@param _startea Starting address of the search
 682 | 		@param _endea Ending address of the search
 683 | 		@param _bytestr String to search for
 684 | 		@param _fileOffsets Specifies whether to return found addresses as relative or absolute
 685 | 				offsets
 686 | 		@param _showmsg Specifies if the function should print a message with the results
 687 | 		@return An array of addresses corresponding to the start of the byte string. If none found,
 688 | 				returns an empty array.
 689 | 		"""		
 690 | 		try:
 691 | 			offsets = []
 692 | 			ea = _startea
 693 | 			if ea == idaapi.BADADDR:
 694 | 				print ("[-] Failed to retrieve starting address.")
 695 | 				return []
 696 | 			else:
 697 | 				block = FindBinary(ea, SEARCH_DOWN | SEARCH_CASE, _bytestr, 16)
 698 | 				if (block == idc.BADADDR):
 699 | 					print("[-] Byte string '{:s}' not found.".format(_bytestr))
 700 | 					
 701 | 				while (block != idc.BADADDR and block < _endea):
 702 | 					block_file_offset = idaapi.get_fileregion_offset(block)
 703 | 					if _fileOffsets:
 704 | 						offsets.append(block_file_offset)
 705 | 					else:
 706 | 						offsets.append(block)
 707 | 					next_block_offset = idaapi.get_fileregion_ea(block_file_offset+4)
 708 | 					if (_showmsg):
 709 | 						print("[+] Byte string '{:s}' found at offset 0x{:X}, file offset 0x{:X}.".format(
 710 | 							_bytestr,
 711 | 							block,
 712 | 							block_file_offset))
 713 | 					block = FindBinary(next_block_offset, SEARCH_DOWN | SEARCH_CASE, _bytestr, 16)
 714 | 				return offsets
 715 | 		except Exception as e:
 716 | 			print("[-] An error occurred while seeking byte string {:s}: {:s}".format(_bytestr, str(e)))
 717 | 			return []
 718 | 			
 719 | 	def get_code_ranges(self, _startea, _endea, _prolog, _epilog):
 720 | 		"""
 721 | 		This function will extract all the machine opcodes located between the
 722 | 		provided code boundaries in the prescribed range.
 723 | 		
 724 | 		Example: TODO
 725 | 		
 726 | 		m = e.get_code_ranges(MinEA(), MaxEA(), "4500 4885", "4886 0090")
 727 | 		print(m)
 728 | 		[[0x2C00, 0x2C15], [0x2C16, 0x2C38]]
 729 | 		
 730 | 		@param _startea The start address of the range to look for code
 731 | 			segment
 732 | 		@param _endea The end address of the range to look for code
 733 | 			segment
 734 | 		@param _prolog Starting byte string of the code segment to look for.
 735 | 		@param _epilog Ending byte string of the code segment to look for.
 736 | 		@return matrix containing the starting and ending addresses of the code
 737 | 			segment found.
 738 | 		"""
 739 | 		segments = []
 740 | 		if (_startea != BADADDR and _endea != BADADDR):
 741 | 			prolog_offsets = self.find_byte_string(_startea, _endea, _prolog, False)
 742 | 			for offset_idx in range(0, len(prolog_offsets)):
 743 | 				epilog_offset = self.find_next_byte_string(
 744 | 					prolog_offsets[offset_idx],
 745 | 					_epilog)
 746 | 				if epilog_offset is not None and epilog_offset != idc.BADADDR:
 747 | 					segments.append([prolog_offsets[offset_idx], epilog_offset])
 748 | 		return segments
 749 | 		
 750 | 	def get_instruction_tokens(self, _ea):
 751 | 		"""
 752 | 		Returns the tokens of the disassembled instruction at the specified address.
 753 | 		
 754 | 		Example:
 755 | 		...
 756 | 		0x2C00:	pop r1	; Pops stack into R1 register
 757 | 		...
 758 | 		s = e.get_instruction_tokens(0x2C00)
 759 | 		print(s)
 760 | 		['pop', 'r1', ';', 'Pops', 'stack', 'into', 'R1', 'register']
 761 | 		
 762 | 		@param _ea Address of the instruction to disassemble
 763 | 		@return Array of strings containing the tokens of the disassembled instruction.
 764 | 		"""
 765 | 		if (_ea != BADADDR):
 766 | 			return [t for t in GetDisasm(_ea).split(" ") if t]
 767 | 		
 768 | 	def get_function_at(self, _ea):
 769 | 		"""
 770 | 		Returns the function object at the specified address.
 771 | 		@param _ea An address within the function
 772 | 		@return The native IDA function object at the given address.
 773 | 		"""
 774 | 		if (_ea != BADADDR):
 775 | 			return idaapi.get_func(_ea)
 776 | 		else:
 777 | 			return None
 778 | 		
 779 | 	def set_function_name_at(self, _funcea, _name):
 780 | 		"""
 781 | 		Sets the name of the function located at the specified address,
 782 | 		if any.
 783 | 		
 784 | 		@param _funcea An address within the function
 785 | 		@param _name The new name of the function. Cannot be empty.
 786 | 		@return Enoki.SUCCESS or Enoki.FAIL on error.
 787 | 		"""
 788 | 		if (_funcea != BADADDR and len(_name) > 0):
 789 | 			func = self.get_function_at(_funcea)
 790 | 			if (func):
 791 | 				return idc.MakeName(func.startEA, _name)
 792 | 		return Enoki.FAIL	
 793 | 		
 794 | 	def get_function_name_at(self, _ea):
 795 | 		"""
 796 | 		Returns the name of the function at the given address if one is
 797 | 		defined. Returns an empty string if no function is defined at the
 798 | 		address.
 799 | 		@param _ea An address within the function
 800 | 		@return The name of the function or an empty string.
 801 | 		"""
 802 | 		return GetFunctionName(_ea)
 803 | 		
 804 | 	def get_function_disasm(self, _ea):
 805 | 		"""
 806 | 		This function retrieves all of the disassembled and tokenized instructions 
 807 | 		of the function located at the specified address.
 808 | 		
 809 | 		Example:
 810 | 		...
 811 | 		0x2C00:	pop r1
 812 | 		0x2C01: load acc, 0
 813 | 		0x2C03: jmp 0x2C0A
 814 | 		...
 815 | 		s = e.get_function_disasm(0x2C00)
 816 | 		print(s)
 817 | 		[['pop', 'r1'], ['load', 'acc,', '0'], ['jmp', '0x2C0A']]
 818 | 		
 819 | 		Note that the tokenization is done using white spaces only, so any commas will remain
 820 | 		as part of the token.
 821 | 		
 822 | 		@param _ea An address within the function.
 823 | 		@return A matrix (list of lists) containing the tokenized instructions of the
 824 | 		function at the specified address, one list of tokens per instruction.
 825 | 		
 826 | 		"""
 827 | 		matrix_disasm = []
 828 | 		if (_ea != BADADDR):
 829 | 			current_func = self.get_function_at(_ea)
 830 | 			if (current_func):
 831 | 				func_start = current_func.startEA
 832 | 				func_end = current_func.endEA
 833 | 				curea = func_start
 834 | 				while(curea < func_end):
 835 | 					inst_tokens = self.get_instruction_tokens(curea)
 836 | 					matrix_disasm.append(inst_tokens)
 837 | 					curea = NextHead(curea)				
 838 | 			else:
 839 | 				print("[-] No function found at 0x{:x}.".format(_ea))
 840 | 		return matrix_disasm		
 841 | 		
 842 | 	def get_function_disasm_with_ea(self, _ea):
 843 | 		"""
 844 | 		This function retrieves all of the disassembled and tokenized instructions
 845 | 		of the function located at the specified address, along with their addresses.
 846 | 		
 847 | 		Example:
 848 | 		...
 849 | 		0x2C00:	pop r1
 850 | 		0x2C01: load acc, 0
 851 | 		0x2C03: jmp 0x2C0A
 852 | 		...
 853 | 		s = e.get_function_disasm_with_ea(0x2C00)
 854 | 		print(s)
 855 | 		[(0x2C00, ['pop', 'r1']), (0x2C01, ['load', 'acc,', '0']), (0x2C03, ['jmp', '0x2C0A'])]
 856 | 		
 857 | 		Note that the tokenization is done using white spaces only, so any commas will remain
 858 | 		as part of the token.
 859 | 		
 860 | 		@param _ea An address within the function.
 861 | 		@return A list of tuples, each containing the address of an instruction of the
 862 | 		function at the specified address and the list of its tokens.
 863 | 		
 864 | 		"""
 865 | 		matrix_disasm = []
 866 | 		if (_ea != BADADDR):
 867 | 			current_func = self.get_function_at(_ea)
 868 | 			if (current_func):
 869 | 				func_start = current_func.startEA
 870 | 				func_end = current_func.endEA
 871 | 				curea = func_start
 872 | 				while(curea < func_end):
 873 | 					inst_tokens = self.get_instruction_tokens(curea)
 874 | 					matrix_disasm.append((curea, inst_tokens))
 875 | 					curea = NextHead(curea)				
 876 | 			else:
 877 | 				print("[-] No function found at 0x{:x}.".format(_ea))
 878 | 		return matrix_disasm
 879 | 	
 880 | 	def compare_code(self, _code1, _code2):
 881 | 		"""
 882 | 		The compare_code function provides a similarity ratio between the provided code
 883 | 		segments. It does so by using the SequenceMatcher from the difflib module, which
 884 | 		returns a value between 0 and 1, 0 indicating 2 completely different segments and 1
 885 | 		indicating identical code segments.
 886 | 		
 887 | 		@param _code1 First code segment to compare
 888 | 		@param _code2 Second code segment to compare
 889 | 		@return double A value between 0 and 1 indicating the degree of similarity between the
 890 | 		2 code segments.
 891 | 		"""
 892 | 		sm = difflib.SequenceMatcher(None, _code1, _code2, autojunk=False)
 893 | 		r = sm.ratio()
 894 | 		return r
 895 | 	
 896 | 	def compare_functions(self, _ea_func1, _ea_func2):
 897 | 		"""
 898 | 		Compares the code of 2 functions using the compare_code function. 
 899 | 		
 900 | 		@param _ea_func1 Address within the first function to compare
 901 | 		@param _ea_func2 Address within the second function to compare
 902 | 		@return double A value between 0 and 1, 0 indicating 2 completely different 
 903 | 		functions and 1 specifying identical functions.
 904 | 		"""
 905 | 		l1 = self.get_function_instructions(_ea_func1)
 906 | 		l2 = self.get_function_instructions(_ea_func2)
 907 | 		return self.compare_code(l1, l2)
 908 | 		
 909 | 	def get_function_instructions(self, _ea):
 910 | 		"""
 911 | 		Retrieves the instructions, without operands, of the function located at the
 912 | 		specified address.
 913 | 		
 914 | 		Example:
 915 | 		...
 916 | 		0x2C00:	pop r1
 917 | 		0x2C01: load acc, 0
 918 | 		0x2C03: jmp 0x2C0A		
 919 | 		...
 920 | 		s = e.get_function_instructions(0x2C00)
 921 | 		print(s)
 922 | 		['pop', 'load', 'jmp']
 923 | 				
 924 | 		@param _ea Address within the function
 925 | 		@return Array of strings representing the instructions of the function.
 926 | 		"""
 927 | 		instr = []
 928 | 		if (_ea != BADADDR):
 929 | 			instr_matrix = self.get_function_disasm(_ea)
 930 | 			for line in instr_matrix:
 931 | 				instr.append(line[0])
 932 | 		return instr
 933 | 		
 934 | 	def get_all_functions_instr(self, _startea, _endea):
 935 | 		"""
 936 | 		Extracts the instructions of all functions located between the provided
 937 | 		start and end addresses. Returns a dictionary in the format 
 938 | 		<"FunctionName", ['i1', 'i2', ..., 'in']>
 939 | 		
 940 | 		@param _startea Starting address
 941 | 		@param _endea Ending address
 942 | 		@return A dictionary object. The keys are the name of the functions found
 943 | 				within the boundaries, while the value is the array of instructions
 944 | 				for the function.
 945 | 		"""
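 946 | 		f_instr = {}
 947 | 		func = self.get_function_at(_startea)
 948 | 		if (not func):
 949 | 			# _startea is not inside a function: start from the next one, if any.
 950 | 			func = idaapi.get_next_func(_startea)
 951 | 		# get_next_func returns None after the last function, which ends the loop.
 952 | 		while (func and func.startEA <= _endea):
 953 | 			name = GetFunctionName(func.startEA)
 954 | 			f_instr[name] = self.get_function_instructions(func.startEA)
 955 | 			func = idaapi.get_next_func(func.startEA)
 956 | 		return f_instr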
 957 | 		
 958 | 	def get_all_functions(self, _startea, _endea):
 959 | 		"""
 960 | 		Gets all function objects between the provided start and end
 961 | 		addresses. Returns a dictionary in the format <"FunctionName", FunctionObject>.
 962 | 		
 963 | 		@param _startea Starting address
 964 | 		@param _endea Ending address
 965 | 		@return A dictionary object. The keys are the name of the functions found
 966 | 				within the boundaries, while the value is the native Function object
 967 | 				of IDA.
 968 | 		"""
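 969 | 		functions = {}
 970 | 		func = self.get_function_at(_startea)
 971 | 		if (not func):
 972 | 			# _startea is not inside a function: start from the next one, if any.
 973 | 			func = idaapi.get_next_func(_startea)
 974 | 		# Walk the functions in address order. get_next_func returns None after
 975 | 		# the last function, which ends the loop.
 976 | 		while (func and func.startEA <= _endea):
 977 | 			# Each function is stored under its own name.
 978 | 			name = GetFunctionName(func.startEA)
 979 | 			functions[name] = func
 980 | 			func = idaapi.get_next_func(func.startEA)
 981 | 		return functions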
 982 | 		
 983 | 	def get_all_func_instr_seg(self, _ea=None):
 984 | 		"""
 985 | 		Returns the instructions of all the functions in the segment containing the
 986 | 		provided address, as a dictionary in the format <"FunctionName", ['i1', ..., 'in']>.
 987 | 		
 988 | 		@param _ea An address within the segment. Default is the segment of the current
 989 | 				instruction.
 990 | 		@return A dictionary object. The keys are the names of the functions found in the
 991 | 		segment, while the values are the arrays of instructions of each function.
 992 | 		"""
 993 | 		_ea = ScreenEA() if _ea is None else _ea
 994 | 		return self.get_all_functions_instr(SegStart(_ea), SegEnd(_ea))
 995 | 	
 996 | 	def get_closest_previous_instr(self, _ea, _instruction, _max=20):
 997 | 		"""
 998 | 		Find the closest instruction matching the specified instructions above the 
 999 | 		specified address. 
1000 | 		
1001 | 		Example:
1002 | 		0x2C00 lacl  #FFh
1003 | 		0x2C01 sacl  *+
1004 | 		0x2C02 sbrk  #5
1005 | 		0x2C03 lar   ar1, *-
1006 | 		0x2C04 call SUB_02CC4
1007 | 		...
1008 | 		Python>e.get_closest_previous_instr(0x2C04, "lac")
1009 | 		(11264, 'lacl    #FF')
1010 | 		
1011 | 		If found, the function will return the address of the matching instruction
1012 | 		and the matching instruction. You can specify a maximum number of instructions
1013 | 		to look at before giving up by setting the _max argument, which is set to
1014 | 		20 by default.
1015 | 		
1016 | 		@param _ea The reference address to search from
1017 | 		@param _instruction A regular expression to match the required instruction
1018 | 		@param _max Maximum number of instructions to look at before giving up.
1019 | 		@return A tuple containing the address and the matching instruction.
1020 | 		"""
1021 | 		found_ins = (BADADDR, "")
1022 | 		if (_ea != BADADDR):
1023 | 			step = 0
1024 | 			curea = _ea
1025 | 			found = False
1026 | 			while (step < _max and not found):
1027 | 				ins = GetMnem(curea)
1028 | 				if (re.search(_instruction, ins)):
1029 | 					found_ins = (curea, self.get_disasm(curea))
1030 | 					found = True
1031 | 				step += 1
1032 | 				curea = PrevHead(curea)
1033 | 				
1034 | 		return found_ins
1035 | 	
1036 | 	def get_closest_next_instr(self, _ea, _instruction, _max=20):
1037 | 		"""
1038 | 		Finds the closest instruction matching the specified instruction below the
1039 | 		specified address.
1040 | 		
1041 | 		Example:
1042 | 		0x2C00 lacl  #FFh
1043 | 		0x2C01 sacl  *+
1044 | 		0x2C02 sbrk  #5
1045 | 		0x2C03 lar   ar1, *-
1046 | 		0x2C04 call SUB_02CC4
1047 | 		...
1048 | 		Python>e.get_closest_next_instr(0x2C00, "call")
1049 | 		(11268, 'call    SUB_02CC4')
1050 | 		
1051 | 		If found, the function will return the address of the matching instruction
1052 | 		and the matching instruction. You can specify a maximum number of instructions
1053 | 		to look at before giving up by setting the _max argument, which is set to
1054 | 		20 by default.
1055 | 		
1056 | 		@param _ea The reference address to search from
1057 | 		@param _instruction A regular expression to match the required instruction
1058 | 		@param _max Maximum number of instructions to look at before giving up.
1059 | 		@return A tuple containing the address and the matching instruction.
1060 | 		"""
1061 | 		found_ins = (BADADDR, "")
1062 | 		if (_ea != BADADDR):
1063 | 			step = 0
1064 | 			curea = _ea
1065 | 			found = False
1066 | 			while (step < _max and not found):
1067 | 				ins = GetMnem(curea)
1068 | 				if (re.search(_instruction, ins)):
1069 | 					found_ins = (curea, self.get_disasm(curea))
1070 | 					found = True
1071 | 				step += 1
1072 | 				curea = NextHead(curea)
1073 | 				
1074 | 		return found_ins	
1075 | 	
1076 | 	def get_similarity_ratios(self, func1, func2):
1077 | 		"""
1078 | 		Calculates the similarity ratios between 2 sets of functions and returns 
1079 | 		a matrix of the results. The matrix is in the following format:
1080 | 		
1081 | 		[
1082 | 		["f11", "f12", r1],
1083 | 		["f21", "f22", r2],
1084 | 		...
1085 | 		["fn1", "fn2", rn]
1086 | 		]
1087 | 		
1088 | 		Note: this function can take a while to complete and was not designed for
1089 | 		efficiency. Every function of func1 is compared to every function of func2: O(n^2).
1090 | 		
1091 | 		@param func1 First set of functions to compare, as returned by get_all_functions_instr
1092 | 		@param func2 Second set of functions to compare.
1093 | 		@return Matrix of similarity ratios for each pair of functions compared.
1094 | 		"""
1095 | 		ratios = []
1096 | 		for f1, l1 in func1.items():
1097 | 			for f2, l2 in func2.items():
1098 | 				r = self.compare_code(l1, l2)
1099 | 				ratios.append([f1, f2, r])
1100 | 		return ratios
1101 | 		
1102 | 	def get_similarity_func(self, ratios, threshold=1.0):
1103 | 		"""
1104 | 		Returns a matrix of similarity vectors with ratios greater than or equal
1105 | 		to the specified threshold.
1106 | 		
1107 | 		Example:
1108 | 		
1109 | 		ratios = [
1110 | 		["f11", "f12", 1.0],
1111 | 		["f21", "f22", 0.64],
1112 | 		["f31", "f32", 0.85]
1113 | 		]		
1114 | 		
1115 | 		m = e.get_similarity_func(ratios, 0.9)
1116 | 		print(m)
1117 | 		[["f11", "f12", 1.0]]	
1118 | 		
1119 | 		@param ratios Matrix of ratios as returned by function get_similarity_ratios
1120 | 		@param threshold Minimum threshold desired. Default value is 1.0
1121 | 		@return Matrix of similarity ratios with a ratio greater than or equal to the specified threshold.
1122 | 		"""
1123 | 		funcs = []
1124 | 		for r in ratios:
1125 | 			if (r[2] >= threshold):
1126 | 				#print("[+] Similarity between '{:s}' and '{:s}': {:f}.".format(r[0], r[1], r[2]))
1127 | 				funcs.append(r)
1128 | 		return funcs
1129 | 		
1130 | 	def function_is_leaf(self, _funcea):
1131 | 		"""
1132 | 		Verifies if the function at the specified address is a leaf function, i.e.
1133 | 		it does not make any call to other functions.
1134 | 		
1135 | 		@param _funcea An address within the function
1136 | 		@return True if the function at the address contains no call instructions.
1137 | 		"""
1138 | 		# Retrieves the calls made by the function at _funcea:
1139 | 		near_calls = self.get_functions_called_by(_funcea, False)
1140 | 		return len(near_calls) == 0
1141 | 		
1142 | 	def get_functions_called_by(self, _funcea, _display=True):
1143 | 		"""
1144 | 		Gets all functions directly called by the function at the given address. This function
1145 | 		only extracts functions called at the first level, i.e. this function is not recursive.
1146 | 		Returns a matrix containing the address originating the call, the destination address
1147 | 		and the name of the function/address called.
1148 | 		
1149 | 		Example:
1150 | 		...
1151 | 		0x2C00:	pop r1
1152 | 		0x2C01: load acc, 0
1153 | 		0x2C03: call 0x2CC0		
1154 | 		0x2C05: load acc, 27h
1155 | 		0x2C07: call 0x2D78
1156 | 		0x2C09: push r1
1157 | 		0x2C0A: ret
1158 | 		...
1159 | 		
1160 | 		m = e.get_functions_called_by(0x2C00)
1161 | 		print(m)
1162 | 		[[0x2C03, 0x2CC0, 'SUB__02CC0'],[0x2C07, 0x2D78, 'SUB__02D78']]
1163 | 		
1164 | 		@param _funcea Address within the function
1165 | 		@param _display If True, display the results at the console.
1166 | 		@return Matrix containing the source, destination and name of the functions called.
1167 | 		"""
1168 | 		# Results here:
1169 | 		near_calls = []
1170 | 		# Retrieves the function at _funcea:
1171 | 		func = self.get_function_at(_funcea)
1172 | 		if (not func):
1173 | 			# No function at this address, so there are no calls to report.
1174 | 			return near_calls
1175 | 		# EA index and end boundary:
1176 | 		curea, endea = func.startEA, func.endEA
1177 | 		while (curea < endea):
1178 | 			for xref in XrefsFrom(curea):
1179 | 				# Code 17 (idaapi.fl_CN) is the code for the 'Code_Near_Call' type of XREF
1180 | 				if (xref.type == 17):
1181 | 					# Add the current address, the address of the call and the 
1182 | 					# name of the function called.
1183 | 					call_info = [xref.frm, xref.to, GetFunctionName(xref.to)]
1184 | 					near_calls.append(call_info)
1185 | 					if (_display):
1186 | 						print("[*] 0x{:x}: {:s} -> {:s}.".format(
1187 | 							call_info[0], 
1188 | 							GetFunctionName(call_info[0]), 
1189 | 							GetFunctionName(call_info[1])))
1190 | 			# Next instruction in the function
1191 | 			curea = NextHead(curea)
1192 | 		return near_calls
1193 | 		
1194 | 	def get_function_flowchart(self, _funcea):
1195 | 		"""
1196 | 		Returns the flowchart of the function specified at the given address.
1197 | 		
1198 | 		@param _funcea An address within the function
1199 | 		@return A FlowChart object or Enoki.FAIL if the address given is invalid,
1200 | 		or no function was found at the address.
1201 | 		"""
1202 | 		if (_funcea != BADADDR):
1203 | 			func = self.get_function_at(_funcea)
1204 | 			if (func):
1205 | 				return idaapi.FlowChart(func)
1206 | 		return Enoki.FAIL
1207 | 		
1208 | 	def get_func_block_bounds(self, _funcea):
1209 | 		"""
1210 | 		Returns all the code blocks of a given function, i.e. code segments
1211 | 		between branches/returns and other jumps except for calls.
1212 | 		
1213 | 		Example:
1214 | 		0x2C00 pop *
1215 | 		0x2C01 load r1, *+
1216 | 		...
1217 | 		0x2C15 jmp 0x2C20
1218 | 		0x2C16 call 0x03D0
1219 | 		...
1220 | 		0x2C20 jne r2, 0x2C3D
1221 | 		...
1222 | 		
1223 | 		Python>c_blks = e.get_func_block_bounds(0x2C00)
1224 | 		Python>c_blks
1225 | 		[(0x2C00, 0x2C16), (0x2C16, 0x2C20), ...]
1226 | 		
1227 | 		@param _funcea An address within the function
1228 | 		@return A list of tuples containing the start of the block (inclusive) and the 
1229 | 		end of the block (exclusive). Returns an empty list on error.
1230 | 		"""
1231 | 		blks = []
1232 | 		fc = self.get_function_flowchart(_funcea)
1233 | 		if (fc != Enoki.FAIL):
1234 | 			for blk in fc:
1235 | 				blks.append((blk.startEA, blk.endEA))
1236 | 		return blks		
1237 | 		
1238 | 	def get_block_at(self, _funcea):
1239 | 		"""
1240 | 		Retrieves the code block containing the given address.
1241 | 		@param _funcea An address within the function
1242 | 		@return A tuple containing the boundaries of the corresponding code block.
1243 | 		Returns (BADADDR, BADADDR) if none is found.
1244 | 		"""
1245 | 		found = (BADADDR, BADADDR)
1246 | 		if (_funcea != BADADDR):
1247 | 			blks = self.get_func_block_bounds(_funcea)
1248 | 			if (len(blks) > 0):
1249 | 				for (b_start, b_end) in blks:
1250 | 					if (_funcea >= b_start and _funcea < b_end):
1251 | 						return (b_start, b_end)
1252 | 		return found		
1253 | 		
1254 | 	def get_all_sub_functions_called(self, _funcea, _level=0, _visited=None):
1255 | 		"""
1256 | 		Gets all functions directly and indirectly called by the function at the given address.
1257 | 		This function is recursive and follows the calls made by every sub function as well, so
1258 | 		it can be time-consuming to complete. Returns a matrix containing the address originating
1259 | 		the call, the destination address, the name of the function/address called and the depth
1260 | 		of the call from the initial function.
1261 | 		
1262 | 		Example:
1263 | 		...
1264 | 		0x2C00:	pop r1
1265 | 		0x2C01: load acc, 0
1266 | 		0x2C03: call 0x2CC0
1267 | 		0x2C05: load acc, 27h
1268 | 		0x2C07: call 0x2D78
1269 | 		0x2C09: push r1
1270 | 		0x2C0A: ret
1271 | 		...
1272 | 		0x2CC0 SUB__02CC0:
1273 | 		0x2CC0  pop r1
1274 | 		0x2CC1  load acc, 00
1275 | 		0x2CC2  call 0x3DEE
1276 | 		...
1277 | 		
1278 | 		m = e.get_all_sub_functions_called(0x2C00)
1279 | 		print(m)
1280 | 		[[0x2C03, 0x2CC0, 'SUB__02CC0', 0],[0x2CC2, 0x3DEE, 'SUB__03DEE', 1],
1281 | 		 [0x2C07, 0x2D78, 'SUB__02D78', 0]]
1282 | 		
1283 | 		@param _funcea Address within the function
1284 | 		@param _level Depth of the current call. Leave at 0 for the root function.
1285 | 		@param _visited Names of the functions already visited. Leave unset for a new search.
1286 | 		@return Matrix containing the source, destination, name of the functions called and
1287 | 		the depth relative to the first function.
1288 | 		"""
1289 | 		# A default of [] would be shared between calls, so the list is created here.
1290 | 		if (_visited is None):
1291 | 			_visited = []
1292 | 		# Retrieves the function at _funcea:
1293 | 		func = self.get_function_at(_funcea)
1294 | 		if (not func):
1295 | 			print("[-] Error getting function at 0x{:x}.".format(_funcea))
1296 | 			return []
1297 | 		# EA index and end boundary:
1298 | 		curea, endea = func.startEA, func.endEA
1299 | 		# Results here:
1300 | 		near_calls = []
1301 | 		while (curea < endea):
1302 | 			for xref in XrefsFrom(curea):
1303 | 				# Code 17 (idaapi.fl_CN) is the code for the 'Code_Near_Call' type of XREF
1304 | 				if (xref.type == 17):
1305 | 					# Add the current address, the address of the call and the 
1306 | 					# name of the function called along with the depth.
1307 | 					fname = GetFunctionName(xref.to)
1308 | 					if not fname in _visited:
1309 | 						_visited.append(fname)
1310 | 						call_info = [xref.frm, xref.to, fname, _level]	
1311 | 						print("[*]{:s}0x{:x}: {:s} -> {:s}.".format(
1312 | 							" " * _level,
1313 | 							call_info[0], 
1314 | 							self.get_function_name_at(call_info[0]), 
1315 | 							self.get_function_name_at(call_info[1])))		
1316 | 						sub_calls = self.get_all_sub_functions_called(xref.to, _level+1, _visited)
1317 | 						# Add calls to current ones
1318 | 						near_calls.append(call_info)
1319 | 						if (len(sub_calls) > 0):
1320 | 							near_calls += sub_calls
1321 | 						
1322 | 			# Next instruction in the function
1323 | 			curea = NextHead(curea)
1324 | 		return near_calls		
1325 | 		
1326 | 	def get_functions_leading_to(self, _funcea):
1327 | 		"""
1328 | 		This function returns all the functions calling the function at the 
1329 | 		provided address. This function is not recursive and only returns the
1330 | 		first depth of function calling. Returns a matrix containing the address 
1331 | 		originating the call, the destination address and the name of the 
1332 | 		function/address called.
1333 | 		
1334 | 		Example:
1335 | 		...
1336 | 		0x2C00: MAIN:
1337 | 		0x2C00:	pop r1
1338 | 		0x2C01: load acc, 0
1339 | 		0x2C03: call 0x2CC0		
1340 | 		0x2C05: load acc, 27h
1341 | 		0x2C07: call 0x2D78
1342 | 		0x2C09: push r1
1343 | 		0x2C0A: ret
1344 | 		...
1345 | 		0x2CC0 SUB__02CC0:
1346 | 		0x2CC0  pop r1
1347 | 		0x2CC1  load acc, 00
1348 | 		0x2CC2  call 0x3DEE
1349 | 		...
1350 | 		
1351 | 		m = e.get_functions_leading_to(0x2CC0)
1352 | 		print(m)
1353 | 		[[0x2C03, 0x2CC0, 'SUB__02CC0']]
1354 | 		
1355 | 		@param _funcea Address within the function
1356 | 		@return Matrix containing the source, destination and name of the called function for
1357 | 		each call made to the function.
1358 | 		"""	
1359 | 		# Results here:
1360 | 		near_calls = []
1361 | 		# Retrieves the function at _funcea:
1362 | 		func = self.get_function_at(_funcea)
1363 | 		if (not func):
1364 | 			# No function at this address, so there are no callers to report.
1365 | 			return near_calls
1366 | 		# EA index and end boundary:
1367 | 		curea, endea = func.startEA, func.endEA
1368 | 		while (curea < endea):
1369 | 			for xref in XrefsTo(curea):
1370 | 				# Code 17 (idaapi.fl_CN) is the code for the 'Code_Near_Call' type of XREF
1371 | 				if (xref.type == 17):
1372 | 					# Add the address originating the call, the address called and the
1373 | 					# name of the function called.
1374 | 					call_info = [xref.frm, xref.to, GetFunctionName(xref.to)]
1375 | 					near_calls.append(call_info)
1376 | 					print("[*] 0x{:x}: {:s} -> {:s}.".format(
1377 | 						call_info[0],
1378 | 						GetFunctionName(call_info[0]),
1379 | 						GetFunctionName(call_info[1])))
1380 | 			# Next instruction in the function
1381 | 			curea = NextHead(curea)
1382 | 		return near_calls
1383 | 		
1384 | 	def color_all_functions_from(self, _funcea, _color):
1385 | 		"""
1386 | 		Sets the background color of all functions and sub functions called from the
1387 | 		root function specified at the given address, i.e. this function is recursive.
1388 | 		This function can be used to trace the call tree of a function. The function
1389 | 		will return a matrix of function calls as returned by the
1390 | 		get_all_sub_functions_called function if it succeeds. 
1391 | 		
1392 | 		Note: You may need to scroll around/refresh the GUI for the change to take
1393 | 		effect. The background will remain as the default color otherwise.
1394 | 		
1395 | 		The value of the color must be in the following format: 0xBBGGRR. Some colors
1396 | 		are defined in the header of the Enoki class.
1397 | 		
1398 | 		Example:
1399 | 		m = e.color_all_functions_from(0x2C00, Enoki.BABY_BLUE)
1400 | 		print(m)
1401 | 		[[0x2C03, 0x2CC0, 'SUB__02CC0', 0], ...]
1402 | 		
1403 | 		Unlike the get_all_sub_functions_called function, this function will
1404 | 		also change the background color in the GUI. 
1405 | 		
1406 | 		@param _funcea Address within the root function
1407 | 		@param _color The background color to set.
1408 | 		@return Matrix containing the source, destination, name and depth of the functions
1409 | 		called from the root function. Enoki.FAIL otherwise.
1410 | 		"""
1411 | 		if (_funcea != BADADDR):
1412 | 			fct_calls = self.get_all_sub_functions_called(_funcea, _visited=[])
1413 | 			if (len(fct_calls) > 0):
1414 | 				for fcall in fct_calls:
1415 | 					self.set_function_color(fcall[0], _color)
1416 | 					self.set_function_color(fcall[1], _color)
1417 | 			return fct_calls
1418 | 		else:
1419 | 			return Enoki.FAIL
1420 | 		
1421 | 	def set_function_color(self, _funcea, _color):
1422 | 		"""
1423 | 		Sets the background color of the function at the specified address. The value
1424 | 		of the color must be in the following format: 0xBBGGRR. Some colors
1425 | 		are defined in the header of the Enoki class.
1426 | 		
1427 | 		Example:
1428 | 		Red:
1429 | 		e.set_function_color(0x2C00, 0x0000FF)
1430 | 		
1431 | 		Blue:
1432 | 		e.set_function_color(0x2C00, 0xFF0000)
1433 | 		
1434 | 		Yellow:
1435 | 		e.set_function_color(0x2C00, Enoki.YELLOW)
1436 | 		
1437 | 		@param _funcea Address within the function
1438 | 		@param _color The background color to set.
1439 | 		@return Enoki.SUCCESS if the background color was changed. Enoki.FAIL otherwise.
1440 | 		"""
1441 | 		if (_funcea != BADADDR):
1442 | 			idc.SetColor(_funcea, CIC_FUNC, _color)
1443 | 			return Enoki.SUCCESS
1444 | 		return Enoki.FAIL		
1445 | 
1446 | 	def get_bytes_between(self, _startea, _endea):
1447 | 		"""
1448 | 		Returns bytes located between the provided start and end addresses.
1449 | 		
1450 | 		@param _startea The start address
1451 | 		@param _endea The end address
1452 | 		@return An array of bytes located between the addresses specified.
1453 | 		"""
1454 | 		data = []
1455 | 		if (_startea != BADADDR and _endea != BADADDR):
1456 | 			curea = _startea
1457 | 			while (curea <= _endea):
1458 | 				b = idaapi.get_byte(curea)
1459 | 				data.append(b)
1460 | 				curea += 1
1461 | 		return data
1462 | 	
1463 | 	def get_words_between(self, _startea, _endea):
1464 | 		"""
1465 | 		Returns words located between the provided start and end addresses.
1466 | 		
1467 | 		@param _startea The start address
1468 | 		@param _endea The end address
1469 | 		@return An array of words located between the addresses specified.
1470 | 		"""
1471 | 		words = []
1472 | 		if (_startea != BADADDR and _endea != BADADDR):
1473 | 			curea = _startea
1474 | 			while (curea <= _endea):
1475 | 				w = idaapi.get_16bit(curea)
1476 | 				words.append(w)
1477 | 				curea += 1
1478 | 		return words
1479 | 	
1480 | 	def get_disasm_between(self, _startea, _endea):
1481 | 		"""
1482 | 		Returns a list of disassembled code between the two addresses
1483 | 		provided.
1484 | 		
1485 | 		Example:
1486 | 		Python>a = e.get_disasm_between(0x2C00, 0x2C10)
1487 | 		Python>a
1488 | 		['pop     ar0', 'sar     ar0, *', 'sar     ar1, *', 'lar     ar0, #106', ...]
1489 | 		
1490 | 		@param _startea The starting address of the section
1491 | 		@param _endea The ending address of the section
1492 | 		@return A list of instructions, returns an empty list ([]) if an error occurred.
1493 | 		"""
1494 | 		lines = []
1495 | 		if (_startea != BADADDR and _endea != BADADDR):
1496 | 			if (_startea > _endea):
1497 | 				t = _startea
1498 | 				_startea = _endea
1499 | 				_endea = t
1500 | 			curea = _startea
1501 | 			
1502 | 			while (curea <= _endea):
1503 | 				disasm = self.get_disasm(curea)
1504 | 				lines.append(disasm)
1505 | 				curea = NextHead(curea)
1506 | 		return lines	
1507 | 	
1508 | 	def get_disasm_function_line(self, _funcea):
1509 | 		"""
1510 | 		Returns a list of disassembled instructions from the function at the
1511 | 		given address. 
1512 | 		
1513 | 		Example:
1514 | 		Python>a = e.get_disasm_function_line(0x2CD0)
1515 | 		Python>a
1516 | 		['popd    *+', 'sar     ar0, *+', 'sar     ar1, *', ...]
1517 | 		
1518 | 		@param _funcea Address within the function
1519 | 		@return A list of instructions, returns an empty list ([]) if an error occurred.
1520 | 		"""		
1521 | 		if (_funcea != BADADDR):
1522 | 			func = self.get_function_at(_funcea)
1523 | 			if (func):
1524 | 				return self.get_disasm_between(func.startEA, func.endEA-1)
1525 | 		return []
1526 | 	
1527 | 	def get_disasm_all_functions_from(self, _funcea):
1528 | 		"""
1529 | 		Retrieves the disassembled code of the function at the specified
1530 | 		address and of all functions called from the function. This function is recursive
1531 | 		and can take a while to complete. Depending on the complexity of the root function,
1532 | 		it may also take considerable memory resources.
1533 | 		
1534 | 		If successful, this function returns a dictionary. The keys are the names
1535 | 		of the functions and the values are lists of strings containing the instructions
1536 | 		of each function.
1537 | 		
1538 | 		Example:
1539 | 		Python>a = e.get_disasm_all_functions_from(0x2C00)
1540 | 		Python>print(a)
1541 | 		{'sub_2C00': ['popd    *+', 'sar     ar0, *+', 'sar     ar1, *', ...],
1542 | 		 ...
1543 | 		 'sub_23CC': ['popd    *+', 'sar     ar0, *+', 'sar     ar1, *', ...] }
1544 | 		
1545 | 		@param _funcea Address within the function
1546 | 		@return a dictionary using the key-value pair ("function_name", [instructions])
1547 | 		"""
1548 | 		fdisasm = {}
1549 | 		if (_funcea != BADADDR):
1550 | 			froot_disasm = self.get_disasm_function_line(_funcea)
1551 | 			froot_name = GetFunctionName(_funcea)
1552 | 			fdisasm[froot_name] = froot_disasm
1553 | 			fcalled = self.get_all_sub_functions_called(_funcea, _visited=[])
1554 | 			print(fcalled)
1555 | 			if (len(fcalled) > 0):
1556 | 				print("[*] Retrieving assembly from {:d} function(s).".format(len(fcalled)))
1557 | 				for finfo in fcalled:
1558 | 					fea = finfo[1]
1559 | 					fname = finfo[2]
1560 | 					fcode = self.get_disasm_function_line(fea)
1561 | 					fdisasm[fname] = fcode
1562 | 		return fdisasm
1563 | 	
1564 | 	def function_find_all(self, _funcea, _criteria):
1565 | 		"""
1566 | 		Retrieves all instructions within the specified function that match
1567 | 		the regular expressions provided in the list '_criteria'.
1568 | 		
1569 | 		Example:
1570 | 		Python>r = e.function_find_all(0xA8BF, ["popd", "#15Ah"])
1571 | 		Python>print(r)
1572 | 		['popd    *+              ; Pop Top of Stack', 
1573 | 		 'ldp     #15Ah           ']
1574 | 		
1575 | 		@param _funcea Address within the function to search
1576 | 		@param _criteria A list of regular expressions to match against
1577 | 			every instruction in the function.
1578 | 		@return A list of instructions matching the provided search criteria
1579 | 		"""
1580 | 		found_ins = []
1581 | 		if (_funcea != BADADDR):
1582 | 			if (not type(_criteria) in [list, tuple]):
1583 | 				_criteria = [_criteria]
1584 | 				
1585 | 			fdisasm = self.get_disasm_function_line(_funcea)
1586 | 			if (len(fdisasm) > 0):
1587 | 				for ins in fdisasm:
1588 | 					for crit in _criteria:
1589 | 						if (re.search(crit, ins)):
1590 | 							found_ins.append(ins)
1591 | 		return found_ins	
1592 | 	
1593 | 	def function_find_all_ea(self, _funcea, _criteria):
1594 | 		"""
1595 | 		Retrieves all instructions within the specified function that match
1596 | 		the regular expressions provided in the list '_criteria', along with the
1597 | 		address at which each matching instruction was found.
1598 | 		
1599 | 		Example:
1600 | 		Python>r = e.function_find_all_ea(0xA8BF, ["popd", "#15Ah"])
1601 | 		Python>print(r)
1602 | 		[(0xA8C5, 'popd    *+              ; Pop Top of Stack'), 
1603 | 		 (0xA8D9, 'ldp     #15Ah           ')]
1604 | 		
1605 | 		@param _funcea Address within the function to search
1606 | 		@param _criteria A list of regular expressions to match against
1607 | 			every instruction in the function.
1608 | 		@return A list of (address, instruction) tuples matching the provided search criteria
1609 | 		"""	
1610 | 		found_ins = []
1611 | 		if (_funcea != BADADDR):
1612 | 			if (not type(_criteria) in [list, tuple]):
1613 | 				_criteria = [_criteria]
1614 | 				
1615 | 			func = self.get_function_at(_funcea)
1616 | 			curea = func.startEA if func else BADADDR
1617 | 			while (func and curea < func.endEA):
1618 | 				ins_disasm = self.get_disasm(curea)
1619 | 				
1620 | 				for c in _criteria:
1621 | 					if (re.search(c, ins_disasm)):
1622 | 						found_ins.append((curea, ins_disasm))
1623 | 				
1624 | 				curea = NextHead(curea)
1625 | 				
1626 | 		return found_ins	
1627 | 		
1628 | 	def function_contains_all(self, _funcea, _criteria):
1629 | 		"""
1630 | 		Verifies that ALL the regular expressions in the _criteria argument
1631 | 		have a matching instruction in the function at the given address. If one
1632 | 		or more of the regular expressions do not match any instruction,
1633 | 		this function returns False.
1634 | 		
1635 | 		Example:
1636 | 			popd    *+
1637 | 			sar     ar0, *+
1638 | 			sar     ar1, *
1639 | 			lar     ar0, #1
1640 | 			lar     ar0, *0+, ar2 ;(dseg:0001)
1641 | 			...
1642 | 			
1643 | 			Python>e.function_contains_all(0xBFDC, ["popd", "lar\\s+ar"])
1644 | 			True
1645 | 			Python>e.function_contains_all(0xBFDC, ["popd", "lar\\s+ar7"])
1646 | 			False		
1647 | 			
1648 | 		@param _funcea Address within the function to search
1649 | 		@param _criteria A list of regular expressions to match against each instruction of
1650 | 				the function.
1651 | 		@return True if all regular expressions were matched, False otherwise.
1652 | 		"""
1653 | 		if (_funcea != BADADDR):
1654 | 			if (not type(_criteria) in [list, tuple]):
1655 | 				_criteria = [_criteria]
1656 | 
1657 | 			fdisasm = self.get_disasm_function_line(_funcea)
1658 | 			
1659 | 			if (len(fdisasm) > 0):
1660 | 				for crit in _criteria:
1661 | 					idx = 0
1662 | 					matched = False
1663 | 					
1664 | 					while (idx < len(fdisasm) and not matched):
1665 | 						ins = fdisasm[idx]
1666 | 						if (re.search(crit, ins)):
1667 | 							matched = True
1668 | 							
1669 | 						idx += 1
1670 | 						
1671 | 					if (not matched):
1672 | 						return False
1673 | 						
1674 | 				return True
1675 | 		return False
1676 | 	
1677 | 	def find_all_functions_contain(self, _criteria, _startea=MinEA(), _endea=MaxEA()):
1678 | 		"""
1679 | 		Looks for all functions between the given boundaries that contain
1680 | 		instructions matching all regular expressions in the given list.
1681 | 		
1682 | 		Example:
1683 | 		Python>e.find_all_functions_contain(["popd", "lar\\s+ar"], _startea=0x8F59, _endea=0x9000)
1684 | 		['sub_8F42', 'sub_8F97', 'sub_8FE8', ...]
1685 | 
1686 | 		Python>e.find_all_functions_contain(["popd", "lar\\s+ar7"], _startea=0x8000, _endea=0x8FFF)
1687 | 		[]
1688 | 
1689 | 		@param _criteria A list of regular expressions to match against each instruction of
1690 | 				the function.
1691 | 		@param _startea The starting address of the search. Defaults to MinEA().
1692 | 		@param _endea The ending address of the search. Defaults to MaxEA().
1693 | 		@return A list with the names of the functions matching all criteria, or an
1694 | 				empty list ([]) if none matched.
1695 | 		"""
1696 | 		found = []
1697 | 		f = self.get_function_at(_startea) or idaapi.get_next_func(_startea)
1698 | 		while (f and f.startEA <= _endea):
1699 | 			fname = GetFunctionName(f.startEA)
1700 | 			if (self.function_contains_all(f.startEA, _criteria)):
1701 | 				found.append(fname)
1702 | 			f = idaapi.get_next_func(f.startEA)
1703 | 		return found
1704 | 	
1705 | 	def search_code_all_functions_from(self, _funcea, _search):
1706 | 		"""
1707 | 		This function searches all the disassembly of the function at the 
1708 | 		given address and all functions called from it for the specified 
1709 | 		regular expression.
1710 | 		
1711 | 		Example:
1712 | 		Python>a = e.search_code_all_functions_from(0x0800, "1EFh")
1713 | 		Python>print(a)
1714 | 		[('sub_0800', 'lacc #1EFh, *+'), ('sub_0800', 'lacc #1EFh, *+')]
1715 | 		
1716 | 		In the example above, two LACC instructions containing "1EFh" are found
1717 | 		in the same function, hence the function appears twice in the results.
1718 | 		
1719 | 		@param _funcea Address within the function to search
1720 | 		@param _search A regular expression to search for in the disassembled code.
1721 | 		@return A list of tuples containing the name of the function and the
1722 | 		matching instructions.
1723 | 		"""
1724 | 		results = []
1725 | 		if (_funcea != BADADDR):
1726 | 			disasm = self.get_disasm_all_functions_from(_funcea)
1727 | 			for fname, fcode in disasm.iteritems():
1728 | 				for ins in fcode:
1729 | 					if re.search(_search, ins):
1730 | 						results.append((fname, ins))
1731 | 		return results
1732 | 	
1733 | 	def find_similar_functions_in_tree(self, _funcea, _startea, _threshold=1.0):
1734 | 		"""
1735 | 		Attempts to find functions similar to the specified one within the call tree
1736 | 		of another function.
1737 | 		
1738 | 		This function accepts the address of a reference function and walks the call tree
1739 | 		rooted at the second address provided. Each function in the tree is compared with
1740 | 		the reference function; if their similarity ratio is at or above the specified
1741 | 		threshold, the function from the call tree is added to the results.
1742 | 		
1743 | 		The function returns a matrix in the following format:
1744 | 		[
1745 | 		 [<address1>, <name1>, <ratio1>],
1746 | 		 ...
1747 | 		 [<addressN>, <nameN>, <ratioN>]
1748 | 		]
1749 | 		
1750 | 		@param _funcea Address within the reference function to compare against
1751 | 		@param _startea Address of the starting function of the call tree
1752 | 		@param _threshold Minimum similarity ratio required to report a function
1753 | 		@return A matrix containing the address, name and ratio of the functions found.
1754 | 		"""
1755 | 		results = []
1756 | 		if (_funcea != BADADDR):
1757 | 			tree = self.get_all_sub_functions_called(_startea, _visited=[])
1758 | 			for fcall in tree:
1759 | 				fcalled_ea = fcall[1]
1760 | 				fcalled_name = fcall[2]
1761 | 				ratio = self.compare_functions(_funcea, fcalled_ea)
1762 | 				if (ratio >= _threshold):
1763 | 					results.append([fcalled_ea, fcalled_name, ratio])
1764 | 			
1765 | 		return results	
1766 | 	
1767 | 	def save_range_to_file(self, _startea, _endea, _file):
1768 | 		"""
1769 | 		Saves the chunk of bytes between the given start and end addresses into
1770 | 		the given file.
1771 | 		
1772 | 		@param _startea The starting address of the chunk
1773 | 		@param _endea The ending address of the chunk
1774 | 		@param _file Name of the file to write.
1775 | 		@return Enoki.SUCCESS if the file was written successfully, Enoki.FAIL
1776 | 				otherwise.
1777 | 		"""
1778 | 		if (_startea != BADADDR and _endea != BADADDR):
1779 | 			try:
1780 | 				chunk = bytearray(idc.GetManyBytes(_startea, ((_endea-_startea)+1)*2))
1781 | 				print("Exporting {:d}-byte chunk (0x{:05x} to 0x{:05x}) to {:s}.".format(len(chunk), _startea, _endea, _file))
1782 | 				with open(_file, "wb") as f:
1783 | 					f.write(chunk)
1784 | 				return Enoki.SUCCESS
1785 | 			except Exception as e:
1786 | 				print("[-] Error while writing file: {:s}.".format(e.message))
1787 | 		return Enoki.FAIL
1788 | 
1789 | e = Enoki()
1790 | print("[+] Enoki {:s} loaded successfully.".format(e.vers()))


--------------------------------------------------------------------------------