├── Clear_All_Instruction_Colors.py
├── Minimize_Automatic_Function_Comments.py
├── README.md
├── Highlight_Target_Instructions.py
├── Label_Dynamically_Resolved_Iat_Entries.py
├── LICENSE
├── Utils.py
└── Preview_Function_Capabilities.py


/Clear_All_Instruction_Colors.py:
--------------------------------------------------------------------------------
 1 | # Clears all colors applied to instructions in program
 2 | #@author https://AGDCServices.com
 3 | #@category AGDCservices
 4 | #@keybinding
 5 | #@menupath
 6 | #@toolbar
 7 | 
 8 | '''
 9 | Removes all highlight colors from current program.
10 | Applied highlighting colors are saved with the ghidra file.
11 | This script can be used to remove the colors prior to exporting
12 | and sharing the ghidra database so that the highlight colors
13 | don't clash with different color schemes used by coworkers
14 | '''
15 | 
16 | instructions = currentProgram.getListing().getInstructions(True)
17 | for curInstr in instructions:
18 |     clearBackgroundColor(curInstr.getAddress())
19 | 


--------------------------------------------------------------------------------
/Minimize_Automatic_Function_Comments.py:
--------------------------------------------------------------------------------
 1 | # Adds a short repeatable comment to all functions to hide the automatic function comment
 2 | #@author https://AGDCServices.com
 3 | #@category AGDCservices
 4 | #@keybinding
 5 | #@menupath
 6 | #@toolbar
 7 | 
 8 | '''
 9 | Adds a single space as a repeatable comment to all functions
10 | within the current program.  By default, Ghidra adds a function
11 | prototype as a repeatable comment to all functions.  These comments
12 | are very long, which forces the code block to expand to its maximum
13 | size within the graph view.  These default comments do not add any real value
14 | and decrease the amount of code that can be seen in the graph view.
15 | 
16 | Currently, there is no way to turn this option off.  A workaround is
17 | to replace the repeatable comment with a single space so that you don't
18 | see any comment by default, and the code block is not expanded out to
19 | its maximum size because of the long function prototype comment.
20 | '''
21 | 
22 | REPEATABLE_COMMENT = ghidra.program.model.listing.CodeUnit.REPEATABLE_COMMENT
23 | listing = currentProgram.getListing()
24 | 
25 | commentCount = 0
26 | for func in listing.getFunctions(True):
27 |     listing.getCodeUnitAt(func.getEntryPoint()).setComment(REPEATABLE_COMMENT, ' ')
28 |     commentCount += 1
29 | 
30 | print('Set {:d} repeatable function comments to a single space to prevent automatic function comments from being displayed'.format(commentCount))


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
 1 | # Ghidra Scripts
 2 | Custom scripts to make analyzing malware easier in Ghidra
 3 | ## Installation
 4 | Add these scripts to your Ghidra scripts directory:
 5 | 1. Open any file in Ghidra for analysis
 6 | 2. Select the Window / Script Manager menu
 7 | 3. Click the "Script Directories" icon in the upper right toolbar
 8 | 4. Add the directory where your scripts are located via the green plus sign
 9 | 5. All scripts will show up under the AGDCservices folder
10 | ## Clear_All_Instruction_Colors.py
11 | Removes all highlight colors from current program.  Applied highlighting colors are saved with the ghidra file.
12 | This script can be used to remove the colors prior to exporting and sharing the ghidra database so that the highlight colors don't clash with different color schemes used by coworkers. See script header for more usage details.
13 | ## Preview_Function_Capabilities.py
14 | This script will name all unidentified functions with a nomenclature that provides a preview of important capabilities included within the function and all child functions.
15 | 
16 | The script includes a list of hardcoded important API calls. The script will locate all calls contained in the unidentified function and its child functions. For any of the calls which match the hardcoded API call list, a shorthand name will be applied to indicate which category of important call is contained within the function.
17 | 
18 | The naming nomenclature is based on capability and does not identify specific APIs. By keeping the syntax short and limited to capability, you can get a preview of all the important capabilities within a function without the name becoming enormous. See script header for more details.
19 | 
20 | For a video demonstration of this script, view the video "Ghidra Script To Name Function From Capabilities" on the AGDC Services channel of youtube, https://youtu.be/s5weitGaKLw
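
The capability naming idea can be illustrated outside Ghidra with a minimal sketch. Everything in it is an assumption made for illustration: the `API_CATEGORIES` table, the `capability_name` function, the `f_<categories>_<address>` name format, and the dictionary inputs standing in for Ghidra's call graph. The real script ships its own API list and nomenclature, described in its header.

```python
# Hypothetical API-to-category table; the real script has its own hardcoded list.
API_CATEGORIES = {
    'CreateFileA': 'file',
    'WriteFile': 'file',
    'connect': 'net',
    'RegSetValueExA': 'reg',
}


def capability_name(func_addr, call_graph, api_calls):
    """Build a name from the capability categories reachable from func_addr.

    call_graph maps a function address to the child function addresses it calls.
    api_calls maps a function address to the API names it calls directly.
    """
    seen, stack, categories = set(), [func_addr], set()
    while stack:
        cur = stack.pop()
        if cur in seen:
            continue  # guards against recursion in the call graph
        seen.add(cur)
        for api in api_calls.get(cur, ()):
            if api in API_CATEGORIES:
                categories.add(API_CATEGORIES[api])
        stack.extend(call_graph.get(cur, ()))
    # hypothetical name format: sorted category tags followed by the address
    suffix = '_'.join(sorted(categories)) or 'none'
    return 'f_{}_{:x}'.format(suffix, func_addr)
```

With a function at `0x1000` that calls `connect` and a child at `0x2000` that calls `WriteFile`, this sketch produces `f_file_net_1000`: the parent's name reflects capabilities found anywhere beneath it, and the name stays short because only categories are recorded.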
21 | ## Highlight_Target_Instructions.py
22 | Script to search all instructions in current program looking for target instructions of interest.  When found,
23 | a defined highlighting color will be applied to make it easy to identify target instructions.  Target instructions are things like call instructions, potential crypto operations, pointer instructions, etc.  Highlighting instructions of interest decreases the chance of missing important instructions when skimming malware code. See script header for more usage details.
24 | 
25 | **Default color choices are made to work with the AGDC_codeBrowser_##.tool.  They can be changed to fit any coloring schema by modifying the defined color constants at the top of the script**
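
The selection rules can be tried on plain mnemonic strings without Ghidra. The sketch below mirrors the mnemonic checks in the script under a few assumptions: the `classify_mnemonic` name and the category strings are hypothetical, operands are passed as simple strings, and the `cmp reg, 0x100` RC4 check is left out because it depends on Ghidra operand types.

```python
# Mnemonic sets taken from Highlight_Target_Instructions.py; the function
# name and category strings below are hypothetical.
MATH_MNEMONICS = {'sar', 'sal', 'shr', 'shl', 'ror', 'rol',
                  'idiv', 'div', 'imul', 'mul', 'not'}
STRING_STEMS = ('scas', 'movs', 'stos')


def classify_mnemonic(mnem, operands=()):
    """Return 'call', 'pointer', 'crypto', 'string', or None."""
    mnem = mnem.lower()
    category = None
    if mnem == 'call':
        category = 'call'
    if mnem == 'lea':
        category = 'pointer'
    # xor of a register with itself only zeroes it, so flag other xors only
    if mnem == 'xor' and len(operands) == 2 and operands[0] != operands[1]:
        category = 'crypto'
    if mnem in MATH_MNEMONICS:
        category = 'crypto'
    # exclude conditional moves (cmovs) and sign-extending moves (movsx)
    if (not mnem.startswith('c') and not mnem.endswith('x')
            and any(stem in mnem for stem in STRING_STEMS)):
        category = 'string'
    return category
```

For example, `classify_mnemonic('xor', ('eax', 'ebx'))` returns `'crypto'` while `classify_mnemonic('xor', ('eax', 'eax'))` returns `None`. In the script itself each category maps to one of the color constants.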
26 | ## Minimize_Automatic_Function_Comments.py
27 | Adds a single space as a repeatable comment to all functions within the current program.  By default, Ghidra adds a function prototype as a repeatable comment to all functions.  These comments are very long, which forces the code block to expand to its maximum size within the graph view.  These default comments do not add any real value and decrease the amount of code that can be seen in the graph view.
28 | 
29 | Currently, there is no way to turn this option off.  A workaround is to replace the repeatable comment with a single space so that you don't see any comment by default, and the code block is not expanded out to
30 | its maximum size because of the long function prototype comment. See script header for more usage details.
31 | ## Utils.py
32 | A number of commonly used convenience functions to aid in rapid scripting, e.g. Get_Operand_As_Immediate_Value, Get_Next_Target_Instruction, Get_Bytes_List, etc. See script header for more usage details.
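
Several of these helpers exist because Ghidra hands back bytes as signed values (`0xfe` is read as `-2`). A minimal sketch of the conversion, using a hypothetical standalone function name rather than the module's own `Get_Bytes_List`, which also performs the memory read:

```python
def to_unsigned_bytes(signed_bytes):
    """Map Java-style signed bytes (-128..127) to unsigned values (0..255)."""
    # masking with 0xff keeps the low 8 bits, i.e. the two's complement pattern
    return [b & 0xff for b in signed_bytes]
```

Calling `to_unsigned_bytes([-2, 0, 127, -128])` gives `[254, 0, 127, 128]`, so comparisons against hex constants such as `0xfe` behave as expected.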
33 | ## Label_Dynamically_Resolved_Iat_Entries.py
34 | Script to aid in reverse engineering files that dynamically resolve imports. Script will search program for all dynamically resolved imports and label them with the appropriate API name pulled from a provided labeled IAT dump file.  Only resolved imports stored in global variables will be identified. This script will not label every resolved global variable, but only those that are used inside a call instruction.
35 | 
36 | The labeled IAT dump file must be generated by an associated program, "Dump_Labeled_Iat_Memory.exe". This program is located in another repo on this github site called "Misc Malware Anaysis Tools".  See script header for more usage details.
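
As a rough illustration of how the dump file is consumed: the script reads one `iatRva<TAB>apiString` entry per line, with the RVA parsed as hexadecimal and added to the program's image base. The sketch below reproduces that lookup table in plain Python. The `parse_labeled_iat_dump` name and the blank-line handling are assumptions for illustration, not part of the script.

```python
def parse_labeled_iat_dump(lines, image_base):
    """Map absolute address -> API name from 'iatRva<TAB>apiString' lines."""
    labels = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # assumption: tolerate blank lines in the dump
        rva, api_name = line.split('\t', 1)
        # the RVA is hexadecimal text; add the image base to get the address
        labels[image_base + int(rva, 16)] = api_name
    return labels
```

With an image base of `0x400000`, the line `2000<TAB>CreateFileA` maps address `0x402000` to `CreateFileA`. Any call target not present in the table is reported as unresolved by the script.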
37 | 
38 | 


--------------------------------------------------------------------------------
/Highlight_Target_Instructions.py:
--------------------------------------------------------------------------------
  1 | # Highlights target instructions using custom colors for easy identification
  2 | #@author https://AGDCServices.com
  3 | #@category AGDCservices
  4 | #@keybinding
  5 | #@menupath
  6 | #@toolbar
  7 | 
  8 | '''
  9 | Script will search all instructions in current program
 10 | looking for target instructions of interest.  When found,
 11 | a defined highlighting color will be applied to make it
 12 | easy to identify target instructions.
 13 | 
 14 | Default color choices are made to work with the
 15 | AGDC_codeBrowser_14.tool.  They can be changed to fit any
 16 | coloring schema by modifying the defined color constants
 17 | at the top of the script.
 18 | '''
 19 | 
 20 | from java.awt import Color
 21 | 
 22 | 
 23 | # define RGB colors for target instructions
 24 | 
 25 | # color_default sets non-target instructions colors
 26 | # needed to account for bug in graph view
 27 | COLOR_DEFAULT = Color(255,255,255) # white
 28 | COLOR_CALL = Color(255, 220, 220) #light red
 29 | COLOR_POINTER = Color(200, 240, 255) # blue
 30 | COLOR_CRYPTO = Color(245, 205, 255) # violet
 31 | COLOR_STRING_OPERATION = Color(180,230,170) # green
 32 | 
 33 | #
 34 | # additional unused colors
 35 | #
 36 | # Color(255,255,180) #yellow
 37 | # Color(220,255,200) #very light green
 38 | # Color(255,200,100) #orange 
 39 | # Color(220, 220, 220) #light grey
 40 | # Color(195, 195, 195) # dark grey
 41 | 
 42 | 
 43 | 
 44 | REG_TYPE = 512
 45 | 
 46 | 
 47 | # loop through all program instructions searching
 48 | # for target instructions.  when found, apply defined
 49 | # color
 50 | instructions = currentProgram.getListing().getInstructions(True)
 51 | for curInstr in instructions:
 52 | 
 53 |     bIsTargetInstruction = False
 54 | 
 55 |     curMnem = curInstr.getMnemonicString().lower()
 56 | 
 57 |     # color call instructions
 58 |     if curMnem == 'call':
 59 |         bIsTargetInstruction = True
 60 |         setBackgroundColor(curInstr.getAddress(), COLOR_CALL)
 61 | 
 62 | 
 63 |     # color lea instructions
 64 |     if curMnem == 'lea':
 65 |         bIsTargetInstruction = True
 66 |         setBackgroundColor(curInstr.getAddress(), COLOR_POINTER)
 67 | 
 68 | 
 69 |     #
 70 |     # color suspected crypto instructions
 71 |     #
 72 | 
 73 |     # xor that does not zero out the register
 74 |     if (curMnem == 'xor') and (curInstr.getOpObjects(0) != curInstr.getOpObjects(1)):
 75 |         bIsTargetInstruction = True
 76 |         setBackgroundColor(curInstr.getAddress(), COLOR_CRYPTO)
 77 | 
 78 | 
 79 |     # common RC4 instructions
 80 |     if (curMnem == 'cmp') and (curInstr.getOperandType(0) == REG_TYPE) and (curInstr.getOpObjects(1)[0].toString() == '0x100'):
 81 |         bIsTargetInstruction = True
 82 |         setBackgroundColor(curInstr.getAddress(), COLOR_CRYPTO)
 83 | 
 84 |     # misc math operations
 85 |     mathInstrList = ['sar', 'sal', 'shr', 'shl', 'ror', 'rol', 'idiv', 'div', 'imul', 'mul', 'not']
 86 |     if curMnem in mathInstrList:
 87 |         bIsTargetInstruction = True
 88 |         setBackgroundColor(curInstr.getAddress(), COLOR_CRYPTO)
 89 | 
 90 | 
 91 |     #
 92 |     #
 93 |     #
 94 | 
 95 | 
 96 | 
 97 |     # color string operations
 98 |     #  skip mnemonics that start with 'c' (conditional moves, e.g. cmovs) or end with 'x' (sign-extending moves, e.g. movsx)
 99 |     if (curMnem.startswith('c') == False) and (curMnem.endswith('x') == False) and ( ('scas' in curMnem) or ('movs' in curMnem) or ('stos' in curMnem) ):
100 |         bIsTargetInstruction = True
101 |         setBackgroundColor(curInstr.getAddress(), COLOR_STRING_OPERATION)
102 | 
103 | 
104 | 
105 | 
106 |     # fixes ghidra bug in graph mode where if a color is applied to the first instruction of a code block
107 |     # the color will also be applied to the rest of the instructions in that code block
108 |     # by setting the color to every line that's not a target instruction to the default color,
109 |     # target colors should be applied accurately
110 |     # error only appears to be in graph view.  colors will be correctly applied in flat view, but incorrect in graph view
111 |     # if you just clear the colors instead of setting all the colors to the default color,
112 |     # the error will still occur.  In this case, it may get fixed by redrawing the graph,
113 |     # but you will have to redraw the graph every time you come across an error
114 |     if bIsTargetInstruction == False:
115 |         setBackgroundColor(curInstr.getAddress(), COLOR_DEFAULT)
116 | 
117 | 
118 | 
119 | 


--------------------------------------------------------------------------------
/Label_Dynamically_Resolved_Iat_Entries.py:
--------------------------------------------------------------------------------
  1 | #Find dynamically resolved IAT locations and apply labels from input file
  2 | #@author https://AGDCServices.com
  3 | #@category AGDCservices
  4 | #@keybinding
  5 | #@menupath
  6 | #@toolbar
  7 | #@toolbar
  8 | 
  9 | '''
 10 | Script will search program for all dynamically resolved
 11 | imports and label them with the appropriate API name pulled
 12 | from a provided labeled IAT dump file.  Only resolved imports
 13 | stored in global variables will be identified. This script will
 14 | not label every resolved global variable, but only those that 
 15 | are used inside a call instruction
 16 | 
 17 | The labeled IAT dump file must be generated by an associated
 18 | program, "Dump_Labeled_Iat_Memory.exe". This program is located
 19 | in another repo on this github site called "Misc Malware Anaysis Tools"
 20 | 
 21 | usage:
 22 |     Run file inside a debugger up to the point where all 
 23 |     dynamically resolved imports are resolved.  At that point,
 24 |     run the associated "Dump_Labeled_Iat_Memory.exe" to create
 25 |     the labeled Iat dump file.
 26 |     
 27 |     Once you have the labeled IAT dump file, run this script.
 28 |     The script must be run prior to renaming any of the global
 29 |     IAT variables.  The script will not overwrite any manually
 30 |     named global variables.
 31 | 
 32 | '''
 33 | 
 34 | 
 35 | def main():
 36 | 
 37 |     try:
 38 |         fileObject = askFile('Select Labeled Iat Dump File', 'Open')
 39 |     except:
 40 |         print('file could not be opened')
 41 |         quit()
 42 | 
 43 |     iatList = Get_Dynamically_Resolved_Iat_Addresses()
 44 |     Label_Dynamically_Resolved_Iat_Addresses(iatList, fileObject.getPath())
 45 |     
 46 |     
 47 | def Get_Dynamically_Resolved_Iat_Addresses():
 48 |     '''
 49 |     function will search current program for all 
 50 |     calls to unresolved global variables and return
 51 |     a list of all the global variable addresses.
 52 |     '''
 53 |     
 54 |     instructions = currentProgram.getListing().getInstructions(True)
 55 |     iatSet = set()
 56 |     for curInstr in instructions:
 57 |         curMnem = curInstr.getMnemonicString().lower()
 58 |         if curMnem == 'call':
 59 |             operandRef = curInstr.getOperandReferences(0)
 60 |             if len(operandRef) != 0:
 61 |                 operandRefEa = operandRef[0].getToAddress()
 62 |                 curLabel = getSymbolAt(operandRefEa)
 63 |                 if curLabel != None:  # accounts for non memory references
 64 |                     if curLabel.getName().lower().startswith( ('dat_', 'byte_', 'word_', 'dword_', 'qword_') ):
 65 |                         iatSet.add(operandRefEa)
 66 |                         
 67 |     
 68 |     return list(iatSet)
 69 | 
 70 |         
 71 | def Label_Dynamically_Resolved_Iat_Addresses(iatList, labeledIatDumpFileName):
 72 |     '''
 73 |     function will read in file with a format of:
 74 |     iatRva\tapiString 
 75 |     each address in the iatList will be checked to 
 76 |     see if there is an entry in the labeled Iat Dump File.
 77 |     If so, the iat label will be set to the api string 
 78 |     from the input file
 79 |     
 80 |     iatList should be list of address objects
 81 |     '''
 82 |     
 83 |     with open(labeledIatDumpFileName, 'r') as fp:
 84 |         labeledIatList = fp.read().splitlines()
 85 |     
 86 |     imageBase = currentProgram.getImageBase().getOffset()
 87 |     labeledIatDict = dict()
 88 |     for i in labeledIatList:
 89 |         curRva, curIatLabel = i.split('\t')
 90 |         labeledIatDict[imageBase + int(curRva, 16)] = curIatLabel
 91 |     
 92 |     labeledCount = 0
 93 |     unresolvedList = []
 94 |     for entry in iatList:
 95 |         curIatLabel = labeledIatDict.get(entry.getOffset(), None)
 96 |         if curIatLabel != None:
 97 |             getSymbolAt(entry).setName(curIatLabel, ghidra.program.model.symbol.SourceType.USER_DEFINED)
 98 |             labeledCount += 1
 99 |         else:
100 |             unresolvedList.append('could not resolve address 0x{:x}'.format(entry.getOffset()))
101 |     
102 |     print('labeled {:d} dynamically resolved IAT entries'.format(labeledCount))
103 |     
104 |     if len(unresolvedList) != 0:
105 |         print('[*] ERROR, was not able to resolve {:d} entries'.format(len(unresolvedList)))
106 |         print('\n'.join(unresolvedList))    
107 |  
108 |             
109 |             
110 |             
111 |             
112 | if __name__ == '__main__':
113 |     main()


--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
  1 |                                  Apache License
  2 |                            Version 2.0, January 2004
  3 |                         http://www.apache.org/licenses/
  4 | 
  5 |    TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
  6 | 
  7 |    1. Definitions.
  8 | 
  9 |       "License" shall mean the terms and conditions for use, reproduction,
 10 |       and distribution as defined by Sections 1 through 9 of this document.
 11 | 
 12 |       "Licensor" shall mean the copyright owner or entity authorized by
 13 |       the copyright owner that is granting the License.
 14 | 
 15 |       "Legal Entity" shall mean the union of the acting entity and all
 16 |       other entities that control, are controlled by, or are under common
 17 |       control with that entity. For the purposes of this definition,
 18 |       "control" means (i) the power, direct or indirect, to cause the
 19 |       direction or management of such entity, whether by contract or
 20 |       otherwise, or (ii) ownership of fifty percent (50%) or more of the
 21 |       outstanding shares, or (iii) beneficial ownership of such entity.
 22 | 
 23 |       "You" (or "Your") shall mean an individual or Legal Entity
 24 |       exercising permissions granted by this License.
 25 | 
 26 |       "Source" form shall mean the preferred form for making modifications,
 27 |       including but not limited to software source code, documentation
 28 |       source, and configuration files.
 29 | 
 30 |       "Object" form shall mean any form resulting from mechanical
 31 |       transformation or translation of a Source form, including but
 32 |       not limited to compiled object code, generated documentation,
 33 |       and conversions to other media types.
 34 | 
 35 |       "Work" shall mean the work of authorship, whether in Source or
 36 |       Object form, made available under the License, as indicated by a
 37 |       copyright notice that is included in or attached to the work
 38 |       (an example is provided in the Appendix below).
 39 | 
 40 |       "Derivative Works" shall mean any work, whether in Source or Object
 41 |       form, that is based on (or derived from) the Work and for which the
 42 |       editorial revisions, annotations, elaborations, or other modifications
 43 |       represent, as a whole, an original work of authorship. For the purposes
 44 |       of this License, Derivative Works shall not include works that remain
 45 |       separable from, or merely link (or bind by name) to the interfaces of,
 46 |       the Work and Derivative Works thereof.
 47 | 
 48 |       "Contribution" shall mean any work of authorship, including
 49 |       the original version of the Work and any modifications or additions
 50 |       to that Work or Derivative Works thereof, that is intentionally
 51 |       submitted to Licensor for inclusion in the Work by the copyright owner
 52 |       or by an individual or Legal Entity authorized to submit on behalf of
 53 |       the copyright owner. For the purposes of this definition, "submitted"
 54 |       means any form of electronic, verbal, or written communication sent
 55 |       to the Licensor or its representatives, including but not limited to
 56 |       communication on electronic mailing lists, source code control systems,
 57 |       and issue tracking systems that are managed by, or on behalf of, the
 58 |       Licensor for the purpose of discussing and improving the Work, but
 59 |       excluding communication that is conspicuously marked or otherwise
 60 |       designated in writing by the copyright owner as "Not a Contribution."
 61 | 
 62 |       "Contributor" shall mean Licensor and any individual or Legal Entity
 63 |       on behalf of whom a Contribution has been received by Licensor and
 64 |       subsequently incorporated within the Work.
 65 | 
 66 |    2. Grant of Copyright License. Subject to the terms and conditions of
 67 |       this License, each Contributor hereby grants to You a perpetual,
 68 |       worldwide, non-exclusive, no-charge, royalty-free, irrevocable
 69 |       copyright license to reproduce, prepare Derivative Works of,
 70 |       publicly display, publicly perform, sublicense, and distribute the
 71 |       Work and such Derivative Works in Source or Object form.
 72 | 
 73 |    3. Grant of Patent License. Subject to the terms and conditions of
 74 |       this License, each Contributor hereby grants to You a perpetual,
 75 |       worldwide, non-exclusive, no-charge, royalty-free, irrevocable
 76 |       (except as stated in this section) patent license to make, have made,
 77 |       use, offer to sell, sell, import, and otherwise transfer the Work,
 78 |       where such license applies only to those patent claims licensable
 79 |       by such Contributor that are necessarily infringed by their
 80 |       Contribution(s) alone or by combination of their Contribution(s)
 81 |       with the Work to which such Contribution(s) was submitted. If You
 82 |       institute patent litigation against any entity (including a
 83 |       cross-claim or counterclaim in a lawsuit) alleging that the Work
 84 |       or a Contribution incorporated within the Work constitutes direct
 85 |       or contributory patent infringement, then any patent licenses
 86 |       granted to You under this License for that Work shall terminate
 87 |       as of the date such litigation is filed.
 88 | 
 89 |    4. Redistribution. You may reproduce and distribute copies of the
 90 |       Work or Derivative Works thereof in any medium, with or without
 91 |       modifications, and in Source or Object form, provided that You
 92 |       meet the following conditions:
 93 | 
 94 |       (a) You must give any other recipients of the Work or
 95 |           Derivative Works a copy of this License; and
 96 | 
 97 |       (b) You must cause any modified files to carry prominent notices
 98 |           stating that You changed the files; and
 99 | 
100 |       (c) You must retain, in the Source form of any Derivative Works
101 |           that You distribute, all copyright, patent, trademark, and
102 |           attribution notices from the Source form of the Work,
103 |           excluding those notices that do not pertain to any part of
104 |           the Derivative Works; and
105 | 
106 |       (d) If the Work includes a "NOTICE" text file as part of its
107 |           distribution, then any Derivative Works that You distribute must
108 |           include a readable copy of the attribution notices contained
109 |           within such NOTICE file, excluding those notices that do not
110 |           pertain to any part of the Derivative Works, in at least one
111 |           of the following places: within a NOTICE text file distributed
112 |           as part of the Derivative Works; within the Source form or
113 |           documentation, if provided along with the Derivative Works; or,
114 |           within a display generated by the Derivative Works, if and
115 |           wherever such third-party notices normally appear. The contents
116 |           of the NOTICE file are for informational purposes only and
117 |           do not modify the License. You may add Your own attribution
118 |           notices within Derivative Works that You distribute, alongside
119 |           or as an addendum to the NOTICE text from the Work, provided
120 |           that such additional attribution notices cannot be construed
121 |           as modifying the License.
122 | 
123 |       You may add Your own copyright statement to Your modifications and
124 |       may provide additional or different license terms and conditions
125 |       for use, reproduction, or distribution of Your modifications, or
126 |       for any such Derivative Works as a whole, provided Your use,
127 |       reproduction, and distribution of the Work otherwise complies with
128 |       the conditions stated in this License.
129 | 
130 |    5. Submission of Contributions. Unless You explicitly state otherwise,
131 |       any Contribution intentionally submitted for inclusion in the Work
132 |       by You to the Licensor shall be under the terms and conditions of
133 |       this License, without any additional terms or conditions.
134 |       Notwithstanding the above, nothing herein shall supersede or modify
135 |       the terms of any separate license agreement you may have executed
136 |       with Licensor regarding such Contributions.
137 | 
138 |    6. Trademarks. This License does not grant permission to use the trade
139 |       names, trademarks, service marks, or product names of the Licensor,
140 |       except as required for reasonable and customary use in describing the
141 |       origin of the Work and reproducing the content of the NOTICE file.
142 | 
143 |    7. Disclaimer of Warranty. Unless required by applicable law or
144 |       agreed to in writing, Licensor provides the Work (and each
145 |       Contributor provides its Contributions) on an "AS IS" BASIS,
146 |       WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
147 |       implied, including, without limitation, any warranties or conditions
148 |       of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
149 |       PARTICULAR PURPOSE. You are solely responsible for determining the
150 |       appropriateness of using or redistributing the Work and assume any
151 |       risks associated with Your exercise of permissions under this License.
152 | 
153 |    8. Limitation of Liability. In no event and under no legal theory,
154 |       whether in tort (including negligence), contract, or otherwise,
155 |       unless required by applicable law (such as deliberate and grossly
156 |       negligent acts) or agreed to in writing, shall any Contributor be
157 |       liable to You for damages, including any direct, indirect, special,
158 |       incidental, or consequential damages of any character arising as a
159 |       result of this License or out of the use or inability to use the
160 |       Work (including but not limited to damages for loss of goodwill,
161 |       work stoppage, computer failure or malfunction, or any and all
162 |       other commercial damages or losses), even if such Contributor
163 |       has been advised of the possibility of such damages.
164 | 
165 |    9. Accepting Warranty or Additional Liability. While redistributing
166 |       the Work or Derivative Works thereof, You may choose to offer,
167 |       and charge a fee for, acceptance of support, warranty, indemnity,
168 |       or other liability obligations and/or rights consistent with this
169 |       License. However, in accepting such obligations, You may act only
170 |       on Your own behalf and on Your sole responsibility, not on behalf
171 |       of any other Contributor, and only if You agree to indemnify,
172 |       defend, and hold each Contributor harmless for any liability
173 |       incurred by, or claims asserted against, such Contributor by reason
174 |       of your accepting any such warranty or additional liability.
175 | 
176 |    END OF TERMS AND CONDITIONS
177 | 
178 |    APPENDIX: How to apply the Apache License to your work.
179 | 
180 |       To apply the Apache License to your work, attach the following
181 |       boilerplate notice, with the fields enclosed by brackets "[]"
182 |       replaced with your own identifying information. (Don't include
183 |       the brackets!)  The text should be enclosed in the appropriate
184 |       comment syntax for the file format. We also recommend that a
185 |       file or class name and description of purpose be included on the
186 |       same "printed page" as the copyright notice for easier
187 |       identification within third-party archives.
188 | 
189 |    Copyright [yyyy] [name of copyright owner]
190 | 
191 |    Licensed under the Apache License, Version 2.0 (the "License");
192 |    you may not use this file except in compliance with the License.
193 |    You may obtain a copy of the License at
194 | 
195 |        http://www.apache.org/licenses/LICENSE-2.0
196 | 
197 |    Unless required by applicable law or agreed to in writing, software
198 |    distributed under the License is distributed on an "AS IS" BASIS,
199 |    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
200 |    See the License for the specific language governing permissions and
201 |    limitations under the License.
202 | 


--------------------------------------------------------------------------------
/Utils.py:
--------------------------------------------------------------------------------
  1 | from __main__ import *
  2 | 
  3 | '''
  4 | Utility module of common helper functions used
  5 | in building Ghidra scripts
  6 | 
  7 | Contained function prototypes below:
  8 |     Get_Bytes_List(targetEa, nLen)
  9 |     Get_Bytes_String(targetEa, nLen)
 10 |     Get_Ascii_String(targetEa)
 11 |     Set_Bytes_String(targetEa, patchStr)
 12 |     Get_Call_Xrefs_To(targetEa)
 13 |     Get_Prev_Target_Instruction(curInstr, mnem, N, MAX_INSTRUCTIONS = 9999)
 14 |     Get_Next_Target_Instruction(curInstr, mnem, N, MAX_INSTRUCTIONS = 9999)
 15 |     Get_Operand_As_Address(targetInstr, operandIndex)
 16 |     Get_Operand_As_Immediate_Value(targetInstr, operandIndex)
 17 |     Get_Operand_As_String(targetInstr, operandIndex)
 18 | 
 19 | '''
 20 | 
 21 | def Get_Bytes_List(targetEa, nLen):
 22 |     '''
 23 |     gets the bytes from memory, treating them as unsigned bytes.
 24 |     ghidra returns read bytes as signed, which is not what
 25 |     you normally want when reading memory, e.g. if you call
 26 |     getBytes on a byte 0xfe, you won't get 0xfe, you'll get -2.
 27 |     this may or may not be an issue depending on the operation you
 28 |     are performing, e.g. comparing a byte that is read as a
 29 |     negative value against the expected unsigned hex value
 30 |     will fail (-2 != 0xfe).  If you're using
 31 |     the byte to patch the program, it may work ok.
 32 | 
 33 |     returns result as a list
 34 |     '''
 35 | 
 36 |     signedList = list(getBytes(targetEa, nLen))
 37 |     unsignedList = []
 38 |     for curByte in signedList:
 39 |         if curByte < 0:
 40 |             uByte = (0xff - abs(curByte) + 1)
 41 |         else:
 42 |             uByte = curByte
 43 |         unsignedList.append(uByte)
 44 | 
 45 |     return unsignedList
 46 | 
 47 | def Get_Bytes_String(targetEa, nLen):
 48 |     '''
 49 |     gets the bytes from memory, treating them as unsigned bytes.
 50 |     ghidra returns read bytes as signed, which is not what
 51 |     you normally want when reading memory, e.g. if you call
 52 |     getBytes on a byte 0xfe, you won't get 0xfe, you'll get -2.
 53 |     this may or may not be an issue depending on the operation you
 54 |     are performing, e.g. comparing a byte that is read as a
 55 |     negative value against the expected unsigned hex value
 56 |     will fail (-2 != 0xfe).  If you're using
 57 |     the byte to patch the program, it may work ok.
 58 | 
 59 |     returns result as a string
 60 |     '''
 61 | 
 62 |     signedList = list(getBytes(targetEa, nLen))
 63 |     unsignedList = []
 64 |     for curByte in signedList:
 65 |         if curByte < 0:
 66 |             uByte = (0xff - abs(curByte) + 1)
 67 |         else:
 68 |             uByte= curByte
 69 |         unsignedList.append(chr(uByte))
 70 | 
 71 |     return ''.join(unsignedList)
 72 | 
 73 | 
 74 | def Get_Ascii_String(targetEa):
 75 |     '''
 76 |     returns the null terminated ascii string starting
 77 |     at targetEa.  Returns a string object and does not
 78 |     include the terminating null character
 79 | 
 80 |     targetEa must be an address object
 81 |     '''
 82 | 
 83 |     result = ''
 84 |     i = 0
 85 |     while True:
 86 |         curByte = chr(getByte(targetEa.add(i)) & 0xff)  # mask because getByte returns a signed value and chr() rejects negatives
 87 |         if curByte == chr(0): break
 88 |         result += curByte
 89 |         i += 1
 90 | 
 91 |     return result
 92 | 
 93 | def Set_Bytes_String(targetEa, patchStr):
 94 |     '''
 95 |     writes the patchStr to targetEa
 96 |     uses a loop with setByte instead of setBytes
 97 |     to avoid having to deal with bytearray in jython
 98 |     '''
 99 |     
100 |     for i, v in enumerate(patchStr):
101 |         setByte(targetEa.add(i), ord(v))
102 |         
103 | 
104 | def Get_Call_Xrefs_To(targetEa):
105 |     '''
106 |     returns list of addresses which call the targetEa
107 | 
108 |     '''
109 | 
110 |     callEaList = []
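111 |     for ref in getReferencesTo(targetEa):
112 |         refInstr = getInstructionAt(ref.getFromAddress())  # None if the reference does not come from an instruction
113 |         if (refInstr != None) and (refInstr.getMnemonicString().lower() == 'call'): callEaList.append(ref.getFromAddress())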
114 | 
115 |     return callEaList
116 | 
117 | def Get_Prev_Target_Instruction(curInstr, mnem, N, MAX_INSTRUCTIONS = 9999):
118 |     '''
119 |     gets N'th previous target instruction from the curInstr
120 |     function will only go back MAX_INSTRUCTIONS
121 |     function will not search outside of current function if the
122 |     current instruction is inside a defined function
123 |     returns None on failure
124 |     '''
125 | 
126 | 
127 |     # get address set of current function to use in determining if prev instruction
128 |     # is outside of current function
129 |     try:
130 |         funcBody = getFunctionContaining(curInstr.getAddress()).getBody()
131 |     except:
132 |         funcBody = None
133 | 
134 | 
135 |     # get Nth prev instruction
136 |     totalInstructionCount = 0
137 |     targetInstructionCount = 0
138 |     while (totalInstructionCount < MAX_INSTRUCTIONS) and (targetInstructionCount < N):
139 |         curInstr = curInstr.getPrevious()
140 | 
141 |         if curInstr == None: break
142 |         if funcBody != None:
143 |             if funcBody.contains(curInstr.getAddress()) == False: break
144 | 
145 |         if curInstr.getMnemonicString().lower() == mnem.lower(): targetInstructionCount += 1
146 | 
147 |         totalInstructionCount += 1
148 | 
149 | 
150 |     # return the results
151 |     if targetInstructionCount == N:
152 |         result = curInstr
153 |     else:
154 |         result = None
155 | 
156 |     return result
157 | 
158 | def Get_Next_Target_Instruction(curInstr, mnem, N, MAX_INSTRUCTIONS = 9999):
159 |     '''
160 |     gets N'th next target instruction from the curInstr
161 |     function will only go forward MAX_INSTRUCTIONS
162 |     function will not search outside of current function if the
163 |     current instruction is inside defined function
164 |     returns None on failure
165 |     '''
166 | 
167 |     # get address set of current function to use in determining if next instruction
168 |     # is outside of current function
169 |     try:
170 |         funcBody = getFunctionContaining(curInstr.getAddress()).getBody()
171 |     except:
172 |         funcBody = None
173 | 
174 | 
175 |     # get Nth next instruction
176 |     totalInstructionCount = 0
177 |     targetInstructionCount = 0
178 |     while (totalInstructionCount < MAX_INSTRUCTIONS) and (targetInstructionCount < N):
179 |         curInstr = curInstr.getNext()
180 | 
181 |         if curInstr == None: break
182 |         if funcBody != None:
183 |             if funcBody.contains(curInstr.getAddress()) == False: break
184 | 
185 |         if curInstr.getMnemonicString().lower() == mnem.lower(): targetInstructionCount += 1
186 | 
187 |         totalInstructionCount += 1
188 | 
189 | 
190 |     # return the results
191 |     if targetInstructionCount == N:
192 |         result = curInstr
193 |     else:
194 |         result = None
195 | 
196 |     return result
197 | 
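
The counting logic shared by Get_Prev_Target_Instruction and Get_Next_Target_Instruction can be tried outside Ghidra. The sketch below is an illustration only: `nth_previous_match` is a hypothetical helper that walks a plain list of mnemonic strings instead of Ghidra instruction objects, and it leaves out the function-boundary check. It keeps the same two limits as the originals, a cap on instructions scanned and a target match count.

```python
def nth_previous_match(mnemonics, start_index, target, n, max_instructions=9999):
    """Return the index of the n'th earlier entry equal to target, or None."""
    scanned = 0
    matched = 0
    index = start_index
    while scanned < max_instructions and matched < n:
        index -= 1
        if index < 0:
            break  # ran off the start of the listing
        if mnemonics[index].lower() == target.lower():
            matched += 1
        scanned += 1
    # Only report success when the requested number of matches was reached.
    return index if matched == n else None


listing = ['push', 'mov', 'push', 'push', 'call']
call_index = 4
first_push = nth_previous_match(listing, call_index, 'push', 1)
third_push = nth_previous_match(listing, call_index, 'push', 3)
too_far = nth_previous_match(listing, call_index, 'push', 3, max_instructions=2)
print(first_push, third_push, too_far)
```

Here the third `push` before the call sits at index 0, and the same query returns None once the scan is capped at two instructions.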
198 | def Get_Operand_As_Address(targetInstr, operandIndex):
199 |     '''
200 |     returns the value for the operandIndex operand of the
201 |     target instruction treated as an address.  if the
202 |     target operand can not be treated as an address,
203 |     returns None.  operandIndex starts at 0
204 | 
205 |     If this is called on jumps or calls, the final
206 |     address jumped to / called will be returned
207 | 
208 |     There are no real checks for validity and it's up to
209 |     the author to ensure the target operand should be an address
210 | 
211 |     '''
212 | 
213 |     # error check
214 |     if operandIndex >= targetInstr.getNumOperands():
215 |         print('[*] Error in Get_Operand_As_Address.  operandIndex is too large at {:s}'.format(targetInstr.getAddress().toString()))
216 |         return None
217 |     elif targetInstr.getNumOperands() == 0:
218 |         return None
219 | 
220 | 
221 |     operand = targetInstr.getOpObjects(operandIndex)[0]
222 |     if type(operand) == ghidra.program.model.scalar.Scalar:
223 |         targetValue = toAddr(operand.getValue())
224 |     elif type(operand) == ghidra.program.model.address.GenericAddress:
225 |         targetValue = operand
226 |     else:
227 |         targetValue = None
228 | 
229 |     return targetValue
230 | 
231 | def Get_Operand_As_Immediate_Value(targetInstr, operandIndex):
232 |     '''
233 |     returns the value for the operandIndex operand of the target instruction
234 |     if the target operand is not an immediate value, the function will attempt
235 |     to find where the variable was previously set.  It will ONLY search within
236 |     the current function to find where the variable was previously set.
237 |     if operand value can not be determined, returns None
238 |     operandIndex starts at 0
239 |     '''
240 | 
241 |     # operand types are typically different if operand is
242 |     # used in a call versus not a call and if there is a
243 |     # reference or not
244 |     OP_TYPE_IMMEDIATE = 16384
245 |     OP_TYPE_NO_CALL_REG = 512
246 |     OP_TYPE_NO_CALL_STACK = 4202496
247 |     # global variables have numerous reference types
248 |     # unsure how to differentiate the different types
249 | 
250 | 
251 |     # error check
252 |     if operandIndex >= targetInstr.getNumOperands():
253 |         print('[*] Error in Get_Operand_As_Immediate_Value.  operandIndex is too large at {:s}'.format(targetInstr.getAddress().toString()))
254 |         return None
255 |     elif targetInstr.getNumOperands() == 0:
256 |         return None
257 | 
258 | 
259 |     # get address set of current function to use in determining
260 |     # if prev instruction is outside of current function
261 |     try:
262 |         funcBody = getFunctionContaining(targetInstr.getAddress()).getBody()
263 |     except:
264 |         funcBody = None
265 | 
266 | 
267 |     # find the actual operand value
268 |     targetValue = None
269 |     opType = targetInstr.getOperandType(operandIndex)
270 |     # if operand is a direct number
271 |     if opType == OP_TYPE_IMMEDIATE:
272 |         targetValue = targetInstr.getOpObjects(operandIndex)[0].getValue()
273 |     # else if operand is a register
274 |     elif opType == OP_TYPE_NO_CALL_REG:
275 |         regName = targetInstr.getOpObjects(operandIndex)[0].getName().lower()
276 | 
277 |         # search for previous location where register value was set
278 |         curInstr = targetInstr
279 |         while True:
280 |             curInstr = curInstr.getPrevious()
281 | 
282 |             # check to make sure curInstr is valid
283 |             if curInstr == None: break
284 |             if funcBody != None:
285 |                 if funcBody.contains(curInstr.getAddress()) == False: break
286 | 
287 |             # check different variations of how register values get set
288 |             curMnem = curInstr.getMnemonicString().lower()
289 |             if (curMnem == 'mov') and (curInstr.getOperandType(0) == OP_TYPE_NO_CALL_REG):
290 |                 if curInstr.getOpObjects(0)[0].getName().lower() == regName:
291 |                     if curInstr.getOperandType(1) == OP_TYPE_IMMEDIATE:
292 |                         targetValue = curInstr.getOpObjects(1)[0].getValue()
293 |                     elif curInstr.getOperandType(1) == OP_TYPE_NO_CALL_REG:
294 |                         targetValue = Get_Operand_As_Immediate_Value(curInstr, 1)
295 |                     break
296 |             elif (curMnem == 'xor'):
297 |                 operand1 = curInstr.getOpObjects(0)[0]
298 |                 operand2 = curInstr.getOpObjects(1)[0]
299 |                 op1Type = curInstr.getOperandType(0)
300 |                 op2Type = curInstr.getOperandType(1)
301 | 
302 |                 if (op1Type == OP_TYPE_NO_CALL_REG) and (op2Type == OP_TYPE_NO_CALL_REG):
303 |                     if (operand1.getName().lower() == regName) and (operand2.getName().lower() == regName):
304 |                         targetValue = 0
305 |                         break
306 |             elif (curMnem == 'pop') and (curInstr.getOperandType(0) == OP_TYPE_NO_CALL_REG):
307 |                 if curInstr.getOpObjects(0)[0].getName().lower() == regName:
308 |                     # find previous push
309 |                     # NOTE: assumes previous push corresponds to pop but
310 |                     # will fail if there is a function call in-between
311 |                     tmpCurInstr = curInstr.getPrevious()
312 |                     while True:
313 |                         # check to make sure tmpCurInstr is valid
314 |                         if tmpCurInstr == None: break
315 |                         if funcBody != None:
316 |                             if funcBody.contains(tmpCurInstr.getAddress()) == False: break
317 | 
318 |                         if tmpCurInstr.getMnemonicString().lower() == 'push':
319 |                             if tmpCurInstr.getOperandType(0) == OP_TYPE_IMMEDIATE:
320 |                                 targetValue = tmpCurInstr.getOpObjects(0)[0].getValue()
321 |                             break
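322 |                         tmpCurInstr = tmpCurInstr.getPrevious()  # step back, otherwise the loop never advances past a non-push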
323 |                     # break out of outer while loop
324 |                     break
325 |     # if operand is a stack variable
326 |     elif opType == OP_TYPE_NO_CALL_STACK:
327 |         stackOffset = targetInstr.getOperandReferences(operandIndex)[0].getStackOffset()
328 | 
329 |         # search for previous location where stack variable value was set
330 |         curInstr = targetInstr
331 |         while True:
332 |             curInstr = curInstr.getPrevious()
333 | 
334 |             # check to make sure curInstr is valid
335 |             if curInstr == None: break
336 |             if funcBody != None:
337 |                 if funcBody.contains(curInstr.getAddress()) == False: break
338 | 
339 |             # find where stack variable was set
340 |             curMnem = curInstr.getMnemonicString().lower()
341 |             if (curMnem == 'mov') and (curInstr.getOperandType(0) == OP_TYPE_NO_CALL_STACK):
342 |                 if curInstr.getOperandReferences(0)[0].getStackOffset() == stackOffset:
343 |                     if curInstr.getOperandType(1) == OP_TYPE_IMMEDIATE:
344 |                         targetValue = curInstr.getOpObjects(1)[0].getValue()
345 |                     break
346 | 
347 | 
348 | 
349 | 
350 |     return targetValue
351 | 
352 | def Get_Operand_As_String(targetInstr, operandIndex):
353 |     '''
354 |     returns the value for the operandIndex operand of the
355 |     target instruction treated as a string.
356 |     operandIndex starts at 0
357 | 
358 |     If this is called on jumps or calls, the final
359 |     address jumped to / called will be returned
360 | 
361 |     '''
362 | 
363 |     # error check
364 |     if operandIndex >= targetInstr.getNumOperands():
365 |         print('[*] Error in Get_Operand_As_String.  operandIndex is too large at {:s}'.format(targetInstr.getAddress().toString()))
366 |         return None
367 |     elif targetInstr.getNumOperands() == 0:
368 |         return None
369 | 
370 | 
371 |     operand = targetInstr.getOpObjects(operandIndex)[0]
372 | 
373 |     return operand.toString()
374 | 
375 | 
376 | 
377 | 
378 | 
379 | 
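
The signed-byte behavior described in the Get_Bytes_List and Get_Bytes_String docstrings can be reproduced without Ghidra. The sketch below is a standalone illustration and not part of Utils.py: `to_unsigned_branch` and `to_unsigned` are hypothetical helper names. It suggests that masking with 0xff gives the same result as the branch-based conversion used in the module.

```python
def to_unsigned_branch(signed_byte):
    # Same arithmetic as the loop in Get_Bytes_List: map -128..-1 onto 128..255.
    if signed_byte < 0:
        return 0xff - abs(signed_byte) + 1
    return signed_byte


def to_unsigned(signed_byte):
    # Equivalent shortcut: keep only the low 8 bits.
    return signed_byte & 0xff


# A Java byte, as returned by getBytes, lies in the range -128..127.
signed_values = [-2, -128, -1, 0, 127]
unsigned_values = [to_unsigned(b) for b in signed_values]
print(unsigned_values)
```

The mask form is shorter, while the branch form spells out the two's complement arithmetic; both agree over the whole signed byte range.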


--------------------------------------------------------------------------------
/Preview_Function_Capabilities.py:
--------------------------------------------------------------------------------
  1 | # Names unidentified functions with a nomenclature that provides a preview of included capabilities within the function
  2 | #@author https://AGDCServices.com
  3 | #@category AGDCservices
  4 | #@keybinding
  5 | #@menupath
  6 | #@toolbar
  7 | 
  8 | '''
  9 | This script will name all unidentified functions with a nomenclature
 10 | that provides a preview of important capabilities included within the
 11 | function and all child functions.
 12 | 
 13 | The script includes a list of hardcoded important API calls. The
 14 | script will locate all calls contained in the unidentified function
 15 | and its child functions. For any of the calls which match
 16 | the hardcoded API call list, a shorthand name will be applied to
 17 | indicate which category of important call is contained within the function.
 18 | 
 19 | The naming nomenclature is based on capability and does not identify
 20 | specific APIs. By keeping the syntax short and just for capability,
 21 | you can get a preview of all the important capabilities within a function
 22 | without having the name get enormous.
 23 | 
 24 | The naming convention is as follows:
 25 | - all functions automatically named will start with f_p__
 26 | - a function will only be renamed if it starts with either the
 27 |   Ghidra default function name, or this script's default function name.
 28 |   If any other name is found, it is expected the function was either
 29 |   manually named or identified by a library signature, and it is
 30 |   assumed those names are more accurate than the automated preview name.
 31 | - each category will be separated by a double underscore
 32 | - within each category, a specific capability is identified by a
 33 |   single "preview" letter.
 34 | - if the preview letter is uppercase, it means the capability
 35 |   was found in the current function. If the preview letter is
 36 |   lowercase, it means the capability was found somewhere in a
 37 |   child function.
 38 | - the last entry of the preview name will be the function address.
 39 |   This is because Ghidra allows duplicate names, but when a name is
 40 |   selected, all copies are highlighted based only on the name.
 41 |   Because you often get duplicates of the preview name, adding the
 42 |   function's address to the end will make each name unique so you can
 43 |   easily differentiate functions with the same base preview name.
 44 | 
 45 | One exception to the naming convention is functions which are the
 46 | start of a thread. These functions will only have the category
 47 | TS applied and will not contain any capability preview.
 48 | Because the thread starts are almost like mini-programs, this
 49 | identifier is used just to identify the starting functions so you
 50 | can manually review them to determine the general capabilities.
 51 | 
 52 | The preview letters are all single characters that are typically
 53 | the first letter of the capability. The categories and preview
 54 | letters used are below. To see the specific API calls that
 55 | correspond to each capability, see the list at the top of the
 56 | function Build_New_Func_Name().
 57 | 
 58 | 
 59 | TS = thread start (no further capability preview will be applied)
 60 | 
 61 | netw = networking functionality
 62 |   b = build
 63 |   c = connect
 64 |   l = listen
 65 |   s = send
 66 |   r = receive
 67 |   t = terminate
 68 |   m = modify
 69 | 
 70 | reg = registry functionality
 71 |   h = handle
 72 |   r = read
 73 |   w = write
 74 |   d = delete
 75 | 
 76 | file = file processing functionality
 77 |   h = handle
 78 |   r = read
 79 |   w = write
 80 |   d = delete
 81 |   c = copy
 82 |   m = move
 83 |   e = enumerate
 84 | 
 85 | proc = process manipulation functionality
 86 |   h = handle
 87 |   e = enumerate
 88 |   c = create
 89 |   t = terminate
 90 |   r = read process memory
 91 |   w = write process memory
 92 | 
 93 | serv = service manipulation functionality
 94 |   h = handle
 95 |   c = create
 96 |   d = delete
 97 |   s = start
 98 |   r = read
 99 |   w = write
100 | 
101 | thread = thread functionality
102 |   c = create
103 |   o = open
104 |   s = suspend
105 |   r = resume
106 | 
107 | str = string manipulation functionality
108 |   c = compare
109 | 
110 | zc = there were no call instructions in the function
111 | 
112 | xref = number of cross references for the function
113 | 
114 | '''
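
The naming convention described in the docstring above can be sketched in plain Python. This is an assumption-laden illustration, not the script's Build_New_Func_Name: `build_preview_name` and its parameters are hypothetical, only two categories are included, and the TS, zc and xref entries are left out, so the exact string the real script produces may differ.

```python
import collections


def build_preview_name(address, local_caps, child_caps, prefix='f_p__'):
    """Compose a preview name from per-category capability letters.

    local_caps / child_caps map a category (e.g. 'netw') to the set of
    lowercase preview letters found in the function itself / in its children.
    """
    # Fixed category and letter order so the same capabilities always
    # produce the same name.
    category_letters = collections.OrderedDict()
    category_letters['netw'] = ['b', 'c', 'l', 's', 'r', 't', 'm']
    category_letters['file'] = ['h', 'r', 'w', 'd', 'c', 'm', 'e']

    parts = []
    for category, letters in category_letters.items():
        found = ''
        for letter in letters:
            if letter in local_caps.get(category, ()):
                found += letter.upper()   # capability in this function
            elif letter in child_caps.get(category, ()):
                found += letter           # capability only in a child
        if found:
            parts.append(category + found)

    parts.append(address)  # address last, to keep duplicate names unique
    return prefix + '__'.join(parts)


name = build_preview_name('00401000',
                          local_caps={'netw': {'c', 's'}},
                          child_caps={'netw': {'r'}, 'file': {'w'}})
print(name)
```

With these inputs the name reads as: connects and sends in this function, receives in a child, writes a file in a child.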
115 | 
116 | 
117 | import re
118 | import collections
119 | 
120 | 
121 | GHIDRA_FUNC_PREFIX = 'FUN_'
122 | CUSTOM_AUTO_FUNC_PREFIX = 'f_p__'
123 | CUSTOM_AUTO_THREAD_FUNC_PREFIX  = 'f_p__TS__'
124 | 
125 | OP_TYPE_PUSH_REGISTER = 512
126 | #OP_TYPE_CALL_REGISTER_NO_REFERENCE = 516
127 | #OP_TYPE_CALL_REGISTER_WITH_REFERENCE = 8708
128 | OP_TYPE_CALL_STATIC_FUNCTION = 8256
129 | OP_TYPE_CALL_DATA_VARIABLE = 8324 # with or without known reference
130 | #OP_TYPE_CALL_STACK_VARIABLE = 4202500
131 | 
132 | 
133 | 
134 | def main():
135 | 
136 |     print('{:s}\n{:s}'.format('=' * 100, 'Function_Preview Script Starting'))
137 | 
138 | 
139 |     
140 |     #
141 |     # rename thread start functions
142 |     # do this first because it can create new functions:
143 |     # thread start functions are often not analyzed
144 |     # by default
145 |     #
146 | 
147 |     # get initial thread starts
148 |     threadRootsList = Get_Thread_Roots()
149 | 
150 |     # rename thread starts with auto name
151 |     for rootEa in threadRootsList:
152 |         newFuncName = '{:s}{:s}{:s}'.format(CUSTOM_AUTO_THREAD_FUNC_PREFIX , GHIDRA_FUNC_PREFIX, rootEa.toString())
153 | 
154 |         curFunc = getFunctionAt(rootEa)
155 |         if curFunc == None:
156 |             createFunction(rootEa, newFuncName)
157 |         else:
158 |             curFunc.setName(newFuncName, ghidra.program.model.symbol.SourceType.USER_DEFINED)
159 | 
160 | 
161 |     
162 |     #
163 |     # get list of all functions to rename and leaf nodes.  Get leaf nodes by
164 |     # checking if each function is a parent. leaf nodes will not be parent functions
165 |     # ignore library / thunk functions
166 |     #
167 | 
168 | 
169 |     # start with all unidentified functions, i.e. all functions that start with the
170 |     # Ghidra standard function prefix or this script's custom function prefix
171 |     # assume any other function name was either named from a library signature or manually by
172 |     # a user, and you don't want to overwrite those function names.  Also ignore thunk functions
173 |     # skip thread start functions because having all of the target functionality added to the 
174 |     # thread function name is generally overkill.
175 |     funcList = [f for f in currentProgram.getListing().getFunctions(True) if f.getName().startswith( (GHIDRA_FUNC_PREFIX, CUSTOM_AUTO_FUNC_PREFIX) ) and not f.getName().startswith(CUSTOM_AUTO_THREAD_FUNC_PREFIX)]
176 |     funcList = [f for f in funcList[:] if f.isThunk() == False]
177 | 
178 |     # identify all parent nodes within unidentified function set
179 |     parentNodes = set()
180 |     for curFunc in funcList:
181 |         curParentNodes = curFunc.getCallingFunctions(monitor)
182 |         parentNodes.update(curParentNodes)
183 | 
184 | 
185 |     # store all functions that are not a parent as a leaf node
186 |     leafNodes = [f for f in funcList if f not in parentNodes ]
187 | 
188 | 
189 |     
190 |     #
191 |     # recursively apply renaming to unidentified functions starting from leaf nodes
192 |     # up through parents.  This will ensure child functionality is propagated
193 |     # up through the parent functions
194 |     #
195 |     # do recursively until no changes are made.  This will ensure that all of the
196 |     # child function capabilities are propagated up through the parents
197 |     #
198 |     while True:
199 |         funcRenamedCount = 0
200 |         nodesTraversed = set()
201 |         curNodes = leafNodes[:]
202 |         while True:
203 | 
204 |             # rename each function in current level of nodes
205 |             parentNodes = set()
206 |             for curFunc in curNodes:
207 | 
208 |                 # rename function and track if new name is actually different than old name
209 |                 # this count is used to determine when to finish recursively renaming functions
210 |                 oldFuncName = curFunc.getName()
211 |                 newFuncNameProposed = Build_New_Func_Name(curFunc)
212 |                 curFunc.setName(newFuncNameProposed, ghidra.program.model.symbol.SourceType.USER_DEFINED)
213 |                 newFuncNameActual = curFunc.getName()
214 |                 if oldFuncName != newFuncNameActual: funcRenamedCount += 1
215 | 
216 |                 # add current function into nodesTraversed so you can check for infinite loops
217 |                 nodesTraversed.add(curFunc)
218 | 
219 |                 # get parent nodes that are in the unidentified functions list
220 |                 # ignore any parents not in that list assuming they are library
221 |                 # calls or other functions we don't want to overwrite
222 |                 curParentNodes = curFunc.getCallingFunctions(monitor)
223 |                 parentNodes.update( curParentNodes & set(funcList) )
224 | 
225 |                 # remove any already traversed functions from parentNodes to eliminate infinite loops
226 |                 parentNodes = parentNodes - nodesTraversed
227 | 
228 | 
229 |             # inner while loop exit condition
230 |             if len(parentNodes) == 0: break
231 | 
232 |             # copy parentNodes to curNodes to rename in next iteration of loop
233 |             curNodes = parentNodes.copy()
234 | 
235 |         # outer while loop exit condition
236 |         if funcRenamedCount == 0: break
237 | 
238 | 
239 |     print('{:s}\n{:s}'.format('Function_Preview Script Completed', '=' * 100))
240 | 
241 | 
242 | 
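
The leaf-to-parent renaming loop in main() amounts to propagating capabilities up a call graph until a full pass makes no change. The sketch below shows that idea on plain dictionaries. Everything in it is hypothetical: `propagate_capabilities`, the `callers` and `own_caps` inputs, and the capability strings are illustration names, and it pushes capability sets upward rather than rebuilding function names as the script does.

```python
def propagate_capabilities(callers, own_caps):
    """Push each function's capabilities up to its callers until nothing changes.

    callers  : dict mapping function name -> set of functions that call it
    own_caps : dict mapping function name -> set of capabilities found locally
    """
    all_funcs = set(own_caps)
    # Every function starts with only what was found in its own body.
    caps = dict((f, set(c)) for f, c in own_caps.items())

    # Leaf nodes are functions that never appear as a caller.
    parents = set()
    for f in all_funcs:
        parents.update(callers.get(f, set()))
    leaves = [f for f in all_funcs if f not in parents]

    while True:
        changed = 0
        traversed = set()
        current = list(leaves)
        while current:
            next_level = set()
            for f in current:
                traversed.add(f)
                for parent in callers.get(f, set()) & all_funcs:
                    before = len(caps[parent])
                    caps[parent] |= caps[f]
                    if len(caps[parent]) != before:
                        changed += 1
                    next_level.add(parent)
            # Skip nodes already visited in this pass so call cycles terminate.
            current = sorted(next_level - traversed)
        if changed == 0:
            break
    return caps


callers = {'send_data': {'worker'}, 'worker': {'main'}, 'main': set()}
own_caps = {'send_data': {'netwS'}, 'worker': {'fileR'}, 'main': set()}
result = propagate_capabilities(callers, own_caps)
print(sorted(result['main']))
```

As in main(), the traversed set guards against call cycles within a pass, and the outer loop repeats until a pass changes nothing.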
243 | 
244 | def Get_Prev_Target_Instruction(curInstr, mnem, N, MAX_INSTRUCTIONS = 9999):
245 |     '''
246 |     gets N'th previous target instruction from the curInstr
247 |     function will only go back MAX_INSTRUCTIONS
248 |     function will not search outside of current function if the
249 |     current instruction is inside a defined function
250 |     returns None on failure
251 |     '''
252 | 
253 | 
254 |     # get address set of current function to use in determining if prev instruction
255 |     # is outside of current function
256 |     try:
257 |         funcBody = getFunctionContaining(curInstr.getAddress()).getBody()
258 |     except:
259 |         funcBody = None
260 | 
261 | 
262 |     # get Nth prev instruction
263 |     totalInstructionCount = 0
264 |     targetInstructionCount = 0
265 |     while (totalInstructionCount < MAX_INSTRUCTIONS) and (targetInstructionCount < N):
266 |         curInstr = curInstr.getPrevious()
267 | 
268 |         if curInstr == None: break
269 |         if funcBody != None:
270 |             if funcBody.contains(curInstr.getAddress()) == False: break
271 | 
272 |         if curInstr.getMnemonicString().lower() == mnem.lower(): targetInstructionCount += 1
273 | 
274 |         totalInstructionCount += 1
275 | 
276 | 
277 |     # return the results
278 |     if targetInstructionCount == N:
279 |         result = curInstr
280 |     else:
281 |         result = None
282 | 
283 |     return result
284 | 
285 | 
286 | 
287 | 
288 | 
289 | 
290 | def Get_Thread_Roots():
291 |     '''
292 |     returns a list of addresses of the root functions for all threads
293 |     found in the program
294 |     '''
295 | 
296 |     # list of thread creation functions
297 |     funcNamesList = ['CreateThread', '_beginthreadex', '__beginthreadex', '_beginthread', '__beginthread']
298 | 
299 |     # go through every thread create option
300 |     threadStartEaSet = set()
301 |     for funcName in funcNamesList:
302 |         # set the index of the thread start argument because its position depends on the API used
303 |         argIndex = 1 if funcName.lstrip('_') == 'beginthread' else 3
304 | 
305 |         # get list of API references
306 |         funcList = list(currentProgram.getSymbolTable().getSymbols(funcName))
307 |         if len(funcList) == 0: continue
308 | 
309 |         # get all references to target function
310 |         funcReferences = funcList[0].getReferences()
311 | 
312 |         for ref in funcReferences:
313 | 
314 |             # if reference location is a call instruction
315 |             if 'call' not in ref.getReferenceType().getName().lower(): continue
316 | 
317 |             # find the actual thread start function
318 |             refInstr = getInstructionAt(ref.getFromAddress())
319 |             mnemInstr = Get_Prev_Target_Instruction(refInstr, 'push', argIndex, 10)
320 |             if mnemInstr == None: continue
321 | 
322 | 
323 |             # get thread start address
324 |             if mnemInstr.getOperandType(0) == OP_TYPE_PUSH_REGISTER:
325 |                 # if thread start was a register, look for root address where register
326 |                 # value was set
327 |                 regStr = mnemInstr.getRegister(0).getName().lower()
328 |                 for i in range(5):
329 |                     mnemInstr = Get_Prev_Target_Instruction(mnemInstr, 'mov', 1, 10)
330 |                     if mnemInstr == None: break
331 | 
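332 |                     if (mnemInstr.getRegister(0) != None) and (mnemInstr.getRegister(0).getName().lower() == regStr):
333 |                         refs = mnemInstr.getOperandReferences(1)  # may be empty if the source operand has no reference
334 |                         if (len(refs) > 0) and (getFunctionContaining(refs[0].getToAddress()) != None): threadStartEaSet.add(refs[0].getToAddress())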
335 | 
336 |                         break
337 |             else:
338 |                 # assume normal push offset; skip pushes that carry no reference
339 |                 refs = mnemInstr.getOperandReferences(0)
340 |                 if len(refs) > 0: threadStartEaSet.add(refs[0].getToAddress())
341 | 
342 | 
343 | 
344 |     return threadStartEaSet
345 | 
346 | 
347 | 
348 | def Build_New_Func_Name(func):
349 |     '''
350 |     returns a string for naming the function based on the
351 |     target functionality found.
352 |     functionality is split into categories.  Each category uses
353 |     single-letter identifiers to indicate a generic capability for that category
354 |     e.g. netwCSR = network category, connect, send, and receive capabilities
355 |     '''
356 | 
357 |     # use ordered dictionary so that categories are always printed
358 |     # in the same order
359 |     categoryNomenclatureDict = collections.OrderedDict()
360 |     categoryNomenclatureDict['netw'] = ['b','c','l','s','r','t','m']
361 |     categoryNomenclatureDict['reg'] = ['h','r','w','d']
362 |     categoryNomenclatureDict['file'] = ['h','r','w','d','c','m','e']
363 |     categoryNomenclatureDict['proc'] = ['h','e','c','t','r','w']
364 |     categoryNomenclatureDict['serv'] = ['h','c','d','s','r','w']
365 |     categoryNomenclatureDict['thread'] = ['c','o','s','r']
366 |     categoryNomenclatureDict['str'] = ['c']
367 | 
368 | 
369 | 
370 |     # for dictionary, list only the basenames, leave off prefixes of '_'
371 |     # and any suffix such as Ex, ExA, etc.  These will be stripped from
372 |     # the functions called to account for all variations
373 |     apiPurposeDict = {
374 |         'socket':'netwB',
375 | 
376 |         #'WSAStartup':'netwC',
377 |         'connect':'netwC',
378 |         'InternetOpen':'netwC',
379 |         'InternetConnect':'netwC',
380 |         'InternetOpenURL':'netwC',
381 |         'HttpOpenRequest':'netwC',
382 |         'WinHttpConnect':'netwC',
383 |         'WinHttpOpenRequest':'netwC',
384 | 
385 |         'bind':'netwL',
386 |         'listen':'netwL',
387 |         'accept':'netwL',
388 | 
389 |         'send':'netwS',
390 |         'sendto':'netwS',
391 |         'InternetWriteFile':'netwS',
392 |         'HttpSendRequest':'netwS',
393 |         'WSASend':'netwS',
394 |         'WSASendTo':'netwS',
395 |         'WinHttpSendRequest':'netwS',
396 |         'WinHttpWriteData':'netwS',
397 | 
398 |         'recv':'netwR',
399 |         'recvfrom':'netwR',
400 |         'InternetReadFile':'netwR',
401 |         'HttpReceiveHttpRequest':'netwR',
402 |         'WSARecv':'netwR',
403 |         'WSARecvFrom':'netwR',
404 |         'WinHttpReceiveResponse':'netwR',
405 |         'WinHttpReadData':'netwR',
406 |         'URLDownloadToFile':'netwR',
407 | 
408 |         'inet_addr':'netwM',
409 |         'htons':'netwM',
410 |         'htonl':'netwM',
411 |         'ntohs':'netwM',
412 |         'ntohl':'netwM',
413 | 
414 |         # too common due to error conditions
415 |         # basically becomes background noise
416 |         #
417 |         #'closesocket':'netwT',
418 |         #'shutdown':'netwT',
419 | 
420 | 
421 |         'RegOpenKey':'regH',
422 | 
423 |         'RegQueryValue':'regR',
424 |         'RegGetValue':'regR',
425 |         'RegEnumValue':'regR',
426 | 
427 |         'RegSetValue':'regW',
428 |         'RegSetKeyValue':'regW',
429 | 
430 |         'RegDeleteValue':'regD',
431 |         'RegDeleteKey':'regD',
432 |         'RegDeleteKeyValue':'regD',
433 | 
434 |         'RegCreateKey':'regC',
435 | 
436 |         'CreateFile':'fileH',
437 |         'fopen':'fileH',
438 | 
439 |         'fscan':'fileR',
440 |         'fgetc':'fileR',
441 |         'fgets':'fileR',
442 |         'fread':'fileR',
443 |         'ReadFile':'fileR',
444 | 
445 |         'flushfilebuffers':'fileW',
446 |         'fprintf':'fileW',
447 |         'fputc':'fileW',
448 |         'fputs':'fileW',
449 |         'fwrite':'fileW',
450 |         'WriteFile':'fileW',
451 | 
452 |         'DeleteFile':'fileD',
453 | 
454 |         'CopyFile':'fileC',
455 | 
456 |         'MoveFile':'fileM',
457 | 
458 |         'FindFirstFile':'fileE',
459 |         'FindNextFile':'fileE',
460 | 
461 |         'strcmp':'strC',
462 |         'strncmp':'strC',
463 |         'stricmp':'strC',
464 |         'wcsicmp':'strC',
465 |         'mbsicmp':'strC',
466 |         'lstrcmp':'strC',
467 |         'lstrcmpi':'strC',
468 | 
469 |         'OpenService':'servH',
470 | 
471 |         'QueryServiceStatus':'servR',
472 |         'QueryServiceConfig':'servR',
473 | 
474 |         'ChangeServiceConfig':'servW',
475 |         'ChangeServiceConfig2':'servW',
476 | 
477 |         'CreateService':'servC',
478 | 
479 |         'DeleteService':'servD',
480 | 
481 |         'StartService':'servS',
482 | 
483 |         'CreateToolhelp32Snapshot':'procE',
484 |         'Process32First':'procE',
485 |         'Process32Next':'procE',
486 | 
487 |         'OpenProcess':'procH',
488 | 
489 |         'CreateProcess':'procC',
490 |         'CreateProcessAsUser':'procC',
491 |         'CreateProcessWithLogon':'procC',
492 |         'CreateProcessWithToken':'procC',
493 |         'ShellExecute':'procC',
494 | 
495 |         # too common due to error conditions
496 |         # they basically become background noise
497 |         #
498 |         #'ExitProcess':'procT',
499 |         #'TerminateProcess':'procT',
500 | 
501 |         'ReadProcessMemory':'procR',
502 | 
503 |         'WriteProcessMemory':'procW',
504 | 
505 |         'CreateThread':'threadC',
506 |         'beginthread':'threadC',
507 |         'beginthreadex':'threadC', # EXCEPTION: include ex because it's lowercase and won't be caught by case-sensitive suffix stripper routine later
508 | 
509 |         'OpenThread':'threadO',
510 | 
511 |         'SuspendThread':'threadS',
512 | 
513 |         'ResumeThread':'threadR',
514 | 
515 |     }
516 | 
517 | 
518 |     # get function info
519 |     funcOrigName = func.getName()
520 |     funcAddressSet = func.getBody()
521 | 
522 |     # get number of references to current function (used as the xref count in the name)
523 |     refToCount = getSymbolAt(func.getEntryPoint()).getReferenceCount()
524 | 
525 |     # get all calls in current function
526 |     callList = []
527 |     curInstr = getInstructionAt(func.getEntryPoint())
528 |     while ( (curInstr != None) and (funcAddressSet.contains(curInstr.getAddress()) == True) ):
529 |         if curInstr.getMnemonicString().lower() == 'call': callList.append(curInstr)
530 |         curInstr = curInstr.getNext()
531 | 
532 | 
533 | 
534 |     # remove any recursive calls, otherwise the function's own functionality
535 |     # would also be treated as child functionality and appended to the child
536 |     # portion of the name
537 |     recursiveList = []
538 |     for curCall in callList:
539 |         curOpRef = curCall.getOperandReferences(0)
540 | 
541 |         # skip calls to registers or any type that doesn't store address information
542 |         if len(curOpRef) == 0: continue
543 | 
544 |         # check operand reference to make sure it's not recursive
545 |         if curOpRef[0].getToAddress().equals(func.getEntryPoint()) == True:
546 |             recursiveList.append(curCall)
547 |     callList = list(set(callList) - set(recursiveList))
548 | 
549 | 
550 | 
551 |     # if no calls, return appropriate response
552 |     if len(callList) == 0:
553 |         # check if function is a thunk
554 |         if func.isThunk() == True:
555 |             callList.append(getInstructionAt(func.getEntryPoint()))
556 |         else:
557 |             # otherwise, return the zero call (zc) name
558 |             return '{:s}zc_{:s}{:s}__xref_{:02d}'.format(CUSTOM_AUTO_FUNC_PREFIX, GHIDRA_FUNC_PREFIX, func.getEntryPoint().toString(), refToCount)
559 | 
560 | 
561 |     #
562 |     # if calls are found, try to identify functionality
563 |     #
564 |     apiUsed = set()
565 | 
566 |     # process calls with external reference
567 |     for curCall in callList:
568 |         if curCall.getExternalReference(0) != None:
569 |             # extract API basename to ignore prefix/suffix, e.g. _, Ex, ExA
570 |             curApiName = curCall.getExternalReference(0).getLabel()
571 |             pattern = '^(?:FID_conflict:)?(?:_)*(?P<baseName>.+?)(?:A|W|Ex|ExA|ExW)?(?:@[a-fA-F0-9]+)?$'
572 |             match = re.search(pattern, curApiName)
573 |             curApiName = match.group('baseName')
574 | 
575 |             # add current API name to summary set
576 |             apiUsed.add(curApiName)
577 | 
578 | 
579 |     # process calls to statically linked functions
580 |     for curCall in callList:
581 |         if curCall.getOperandType(0) == OP_TYPE_CALL_STATIC_FUNCTION:
582 |             curApiName = getFunctionAt(curCall.getReferencesFrom()[0].getToAddress()).getName()
583 |             if curApiName.startswith((GHIDRA_FUNC_PREFIX, CUSTOM_AUTO_FUNC_PREFIX, CUSTOM_AUTO_THREAD_FUNC_PREFIX )) == False:
584 |                 # extract API basename to ignore prefix/suffix, e.g. _, Ex, ExA
585 |                 pattern = '^(?:FID_conflict:)?(?:_)*(?P<baseName>.+?)(?:A|W|Ex|ExA|ExW)?(?:@[a-fA-F0-9]+)?$'
586 |                 match = re.search(pattern, curApiName)
587 |                 curApiName = match.group('baseName')
588 | 
589 |                 # add current API name to summary set
590 |                 apiUsed.add(curApiName)
591 | 
592 | 
593 |     # process calls to function pointers stored in data variables
594 |     for curCall in callList:
595 |         if curCall.getOperandType(0) == OP_TYPE_CALL_DATA_VARIABLE:
596 |             curOpEa = curCall.getReferencesFrom()[0].getToAddress()
597 |             curData = getDataAt(curOpEa)
598 | 
599 |             # getDataAt should return a data object for both defined and undefined data,
600 |             # but it sometimes appears to return None for undefined data
601 |             if curData == None: curData = getUndefinedDataAt(curOpEa)
602 | 
603 |             # get the data variable label
604 |             if curData.getExternalReference(0) != None:
605 |                 curApiName = curData.getExternalReference(0).getLabel()
606 |             else:
607 |                 curApiName = curData.getLabel()
608 | 
609 | 
610 |             if curApiName.lower().startswith(('dat_', 'byte_', 'word_', 'dword_', 'qword_')) == False:
611 |                 # extract API basename to ignore prefix/suffix, e.g. _, Ex, ExA
612 |                 pattern = '^(?:FID_conflict:)?(?:_)*(?P<baseName>.+?)(?:A|W|Ex|ExA|ExW)?(?:@[a-fA-F0-9]+)?$'
613 |                 match = re.search(pattern, curApiName)
614 |                 curApiName = match.group('baseName')
615 | 
616 |                 # add current API name to summary set
617 |                 apiUsed.add(curApiName)
618 | 
619 | 
620 | 
621 | 
622 |     # map APIs called to the functionality used for naming
623 |     implementedApiPurpose = set()
624 |     for entry in apiUsed:
625 |         if entry in apiPurposeDict: implementedApiPurpose.add(apiPurposeDict[entry])
626 | 
627 | 
628 |     # identify functionality from child functions already renamed by this script
629 |     # this will allow api usage to propagate up to the root function
630 |     childFunctionImplementedApiPurpose = dict()
631 |     for curCall in callList:
632 |         if curCall.getOperandType(0) == OP_TYPE_CALL_STATIC_FUNCTION:
633 |             curApiName = getFunctionAt(curCall.getReferencesFrom()[0].getToAddress()).getName()
634 |             if curApiName.startswith(CUSTOM_AUTO_FUNC_PREFIX) == True:
635 | 
636 |                 # pull out api capabilities based on naming convention
637 |                 for category in categoryNomenclatureDict:
638 |                     pattern = category + '_' + '([a-zA-Z]+)_?([a-zA-Z]+)?'
639 |                     match = re.search(pattern, curApiName)
640 | 
641 |                     # if category is found, save into results
642 |                     if match is not None:
643 |                         apiPurpose = set()
644 |                         if match.group(1) is not None: apiPurpose.update(list(match.group(1).lower()))
645 |                         if match.group(2) is not None: apiPurpose.update(list(match.group(2).lower()))
646 |                         if category in childFunctionImplementedApiPurpose:
647 |                             childFunctionImplementedApiPurpose[category].update(apiPurpose)
648 |                         else:
649 |                             childFunctionImplementedApiPurpose[category] = apiPurpose
650 | 
651 | 
652 | 
653 |     #
654 |     # create function name based on API functionality found
655 |     #
656 | 
657 |     newFuncNamePurpose = ''
658 | 
659 |     # for each category, loop through all the nomenclature symbols
660 |     # if the symbol is found in the current function, add it to the parent string
661 |     # if the symbol is found in a child function, add it to child string
662 |     for category in categoryNomenclatureDict:
663 | 
664 |         # build the symbol list for the parent function
665 |         parentStr = ''
666 |         for symbol in categoryNomenclatureDict[category]:
667 |             if (category + symbol.upper()) in implementedApiPurpose:
668 |                 parentStr += symbol.upper()
669 | 
670 | 
671 |         # build the symbol list for the child functions
672 |         childStr = ''
673 |         if category in childFunctionImplementedApiPurpose:
674 |             for symbol in categoryNomenclatureDict[category]:
675 |                 if symbol.lower() in childFunctionImplementedApiPurpose[category]:
676 |                     childStr += symbol.lower()
677 | 
678 |         # combine the parent / child symbol list into one final string
679 |         if (len(parentStr) > 0) or (len(childStr) > 0):
680 |             newFuncNamePurpose = newFuncNamePurpose + category
681 |             if len(parentStr) > 0: newFuncNamePurpose = newFuncNamePurpose + '_' + parentStr
682 |             if len(childStr) > 0: newFuncNamePurpose = newFuncNamePurpose + '_' + childStr
683 |             newFuncNamePurpose = newFuncNamePurpose + '__'
684 | 
685 | 
686 | 
687 | 
688 | 
689 | 
690 |     # build the final function name
691 |     if len(newFuncNamePurpose) > 0:
692 |         # targeted functionality found
693 |         finalFuncName = '{:s}{:s}xref_{:02d}_{:s}'.format(CUSTOM_AUTO_FUNC_PREFIX, newFuncNamePurpose, refToCount, func.getEntryPoint().toString())
694 |     else:
695 |         # no targeted functionality identified
696 |         finalFuncName = '{:s}{:s}{:s}__xref_{:02d}'.format(CUSTOM_AUTO_FUNC_PREFIX, GHIDRA_FUNC_PREFIX, func.getEntryPoint().toString(), refToCount)
697 | 
698 | 
699 | 
700 |     return finalFuncName
701 | 
702 | 
703 | 
704 | 
705 | 
706 | 
707 | if __name__ == '__main__':
708 |     main()
709 | 
710 | 
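
The base-name extraction used three times in the function above can be tried outside Ghidra. The following is a minimal sketch, assuming plain Python with only the `re` module. The pattern is copied from the script; `base_api_name`, `purpose_of`, and `SAMPLE_PURPOSES` are hypothetical names introduced here for illustration and are not part of the script.

```python
import re

# Pattern copied from the script: optional 'FID_conflict:' prefix, leading
# underscores, lazily matched base name, optional A/W/Ex/ExA/ExW suffix,
# optional stdcall '@N' decoration.
API_NAME_PATTERN = re.compile(
    r'^(?:FID_conflict:)?(?:_)*(?P<baseName>.+?)(?:A|W|Ex|ExA|ExW)?(?:@[a-fA-F0-9]+)?$'
)

# Hypothetical three-entry excerpt of apiPurposeDict, for illustration only.
SAMPLE_PURPOSES = {
    'CreateFile': 'fileH',
    'RegOpenKey': 'regH',
    'beginthreadex': 'threadC',
}


def base_api_name(label):
    """Return the label with prefix/suffix decoration removed."""
    match = API_NAME_PATTERN.search(label)
    return match.group('baseName') if match else label


def purpose_of(label):
    """Look up the purpose code for a decorated label, or None if unknown."""
    return SAMPLE_PURPOSES.get(base_api_name(label))


if __name__ == '__main__':
    for label in ('CreateFileW', 'RegOpenKeyExA', '_beginthreadex',
                  'FID_conflict:_CreateFileA@28', 'WSASend'):
        print(label, '->', base_api_name(label), purpose_of(label))
```

Because the suffix alternation is case-sensitive, `_beginthreadex` keeps its lowercase `ex`, which is why the dictionary carries a separate `beginthreadex` key. The lookup is also case-sensitive, so dictionary keys need to match the casing of the stripped API name.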


--------------------------------------------------------------------------------