├── .gitignore ├── A Survey On Automated Dynamic Malware Analysis Evasion and Counter-Evasion ├── A Survey On Automated Dynamic Malware Analysis Evasion and Counter-Evasion.pdf ├── README.md ├── ROOTS Presentation.pdf └── ShmooCon 2017 Presentation.pdf ├── AVLeak ├── AVLeak Black Hat Presentation.pdf ├── README.md └── USENIX WOOT16 AVLeak Paper.pdf ├── Ghidra ├── PCodeMallocDemo │ ├── MallocTrace.java │ ├── README.md │ ├── mallocexample │ ├── mallocexample.c │ └── output.txt └── README.md ├── README.md └── Reverse Engineering Windows Defender ├── JavaScript Engine ├── README.md └── Reverse Engineering-Windows Defender JavaScript Engine.pdf ├── README.md ├── Virus Bulletin Retrospective └── VB2018 Presentation.pdf └── Windows Binary Emulator ├── BHUSA - DEFCON - Alexei-Bulazel-Reverse-Engineering-Windows-Defender-Revision-3.pdf └── README.md /.gitignore: -------------------------------------------------------------------------------- 1 | *~ -------------------------------------------------------------------------------- /A Survey On Automated Dynamic Malware Analysis Evasion and Counter-Evasion/A Survey On Automated Dynamic Malware Analysis Evasion and Counter-Evasion.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/0xAlexei/Publications/bce8a02e258219d773cecc3589421d5a49604bd8/A Survey On Automated Dynamic Malware Analysis Evasion and Counter-Evasion/A Survey On Automated Dynamic Malware Analysis Evasion and Counter-Evasion.pdf -------------------------------------------------------------------------------- /A Survey On Automated Dynamic Malware Analysis Evasion and Counter-Evasion/README.md: -------------------------------------------------------------------------------- 1 | # A Survey On Automated Dynamic Malware Analysis Evasion and Counter-Evasion: PC, Mobile, and Web 2 | 3 | *Alexei Bulazel and Bulent Yener* 4 | 5 | *Published at The First Reversing and Offensive-oriented Trends Symposium 
(ROOTS 2017), Vienna, Austria* 6 | 7 | 8 | Automated dynamic malware analysis systems are important in combating the proliferation of modern malware. Unfortunately, malware can often easily detect and evade these systems. Competition between malware authors and analysis system developers has pushed each to continually evolve their tactics for countering the other. In this paper we systematically review i) "fingerprint"-based evasion techniques against automated dynamic malware analysis systems for PC, mobile, and web, ii) evasion detection, iii) evasion mitigation, and iv) offensive and defensive evasion case studies. We also discuss difficulties in experimental evaluation, highlight future directions in offensive and defensive research, and briefly survey related topics in anti-analysis. 9 | 10 | [ShmooCon 2018 Video]: https://www.youtube.com/watch?v=KtX9wap-LWY 11 | -------------------------------------------------------------------------------- /A Survey On Automated Dynamic Malware Analysis Evasion and Counter-Evasion/ROOTS Presentation.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/0xAlexei/Publications/bce8a02e258219d773cecc3589421d5a49604bd8/A Survey On Automated Dynamic Malware Analysis Evasion and Counter-Evasion/ROOTS Presentation.pdf -------------------------------------------------------------------------------- /A Survey On Automated Dynamic Malware Analysis Evasion and Counter-Evasion/ShmooCon 2017 Presentation.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/0xAlexei/Publications/bce8a02e258219d773cecc3589421d5a49604bd8/A Survey On Automated Dynamic Malware Analysis Evasion and Counter-Evasion/ShmooCon 2017 Presentation.pdf -------------------------------------------------------------------------------- /AVLeak/AVLeak Black Hat Presentation.pdf: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/0xAlexei/Publications/bce8a02e258219d773cecc3589421d5a49604bd8/AVLeak/AVLeak Black Hat Presentation.pdf -------------------------------------------------------------------------------- /AVLeak/README.md: -------------------------------------------------------------------------------- 1 | # AVLeak 2 | 3 | *In collaboration with Jeremy Blackthorne, Andrew Fasano, Patrick Biernat, and Dr. Bulent Yener of Rensselaer Polytechnic Institute* 4 | 5 | AVLeak is a tool for fingerprinting consumer antivirus emulators through automated black box testing. AVLeak can be used to extract fingerprints from AV emulators that malware may use to detect that it is being analyzed and subsequently evade detection. These fingerprints include environmental artifacts, OS API behavioral inconsistencies, emulation of network connectivity, timing inconsistencies, process introspection, and CPU emulator "red pills." 6 | 7 | Emulator fingerprints may be discovered through painstaking binary reverse engineering, or with time-consuming black box testing using binaries that conditionally choose to behave benignly or drop malware based on the emulated environment. AVLeak significantly advances upon prior approaches to black box testing, allowing researchers to extract emulator fingerprints in just a few seconds, and to script out testing using powerful APIs. 8 | 9 | AVLeak will be demoed live, showing real-world fingerprints discovered using the tool that can be used to detect and evade popular consumer AVs including Kaspersky, the Bitdefender engine (licensed out to 20+ other AV products), AVG, and VBA. This survey of emulation detection methods is the most comprehensive examination of the topic ever presented in one place. 
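The black box testing described above can be pictured as a yes/no oracle: each test run answers one question about the emulated environment, and many answers together reconstruct a value. The sketch below is a purely simulated illustration of that idea and is not AVLeak's implementation. `OneBitOracleDemo`, `SimulatedEmulator`, `bitIsSet`, and `recover` are hypothetical names introduced here, the artifact string is made up, and treating each test as yielding exactly one bit is an assumption drawn from the description above. No real antivirus product or API is involved.

```java
import java.nio.charset.StandardCharsets;

// Purely illustrative simulation; every name in this file is hypothetical.
final class OneBitOracleDemo {

    // Stand-in for an emulated environment that holds some artifact (e.g. a host name).
    static final class SimulatedEmulator {
        private final byte[] artifact;
        private int queries = 0;

        SimulatedEmulator(String artifact) {
            this.artifact = artifact.getBytes(StandardCharsets.US_ASCII);
        }

        // One "test run": answers a single yes/no question about the artifact.
        boolean bitIsSet(int byteIndex, int bit) {
            queries++;
            return ((artifact[byteIndex] >> bit) & 1) == 1;
        }

        // Simplification: the length is handed out directly instead of being probed.
        int length() { return artifact.length; }

        int queries() { return queries; }
    }

    // Rebuild the artifact from one-bit answers, eight queries per byte.
    static String recover(SimulatedEmulator emulator) {
        byte[] out = new byte[emulator.length()];
        for (int i = 0; i < out.length; i++) {
            int value = 0;
            for (int bit = 0; bit < 8; bit++) {
                if (emulator.bitIsSet(i, bit)) {
                    value |= 1 << bit;
                }
            }
            out[i] = (byte) value;
        }
        return new String(out, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        SimulatedEmulator emulator = new SimulatedEmulator("emulated-host");
        System.out.println(recover(emulator) + " (" + emulator.queries() + " queries)");
    }
}
```

In this toy model a 13-character artifact costs 104 queries, which gives a feel for why doing such tests by hand is time-consuming and why scripting them matters.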
10 | 11 | [Black Hat 2016 Video]: https://www.youtube.com/watch?v=a6yOwvFds78 12 | -------------------------------------------------------------------------------- /AVLeak/USENIX WOOT16 AVLeak Paper.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/0xAlexei/Publications/bce8a02e258219d773cecc3589421d5a49604bd8/AVLeak/USENIX WOOT16 AVLeak Paper.pdf -------------------------------------------------------------------------------- /Ghidra/PCodeMallocDemo/MallocTrace.java: -------------------------------------------------------------------------------- 1 | //Analysis to find arguments passed to malloc() 2 | //Blog post about this script: https://www.riverloopsecurity.com/blog/2019/05/pcode/ 3 | //@author Alexei Bulazel 4 | //@category PresentationDemo 5 | //@keybinding 6 | //@menupath 7 | //@toolbar 8 | 9 | import ghidra.app.decompiler.DecompInterface; 10 | import ghidra.app.decompiler.DecompileOptions; 11 | import ghidra.app.decompiler.DecompileResults; 12 | import ghidra.app.script.GhidraScript; 13 | import ghidra.framework.options.ToolOptions; 14 | import ghidra.framework.plugintool.PluginTool; 15 | import ghidra.framework.plugintool.util.OptionsService; 16 | import ghidra.util.exception.InvalidInputException; 17 | import ghidra.util.exception.NotFoundException; 18 | import ghidra.util.exception.NotYetImplementedException; 19 | import ghidra.program.model.symbol.*; 20 | import ghidra.program.model.listing.*; 21 | import ghidra.program.model.pcode.*; 22 | import ghidra.program.model.address.*; 23 | 24 | import java.util.ArrayList; 25 | import java.util.HashSet; 26 | import java.util.Iterator; 27 | 28 | 29 | public class MallocTrace extends GhidraScript { 30 | 31 | private DecompInterface decomplib; 32 | 33 | //class for node in a source-sink flow 34 | class FlowInfo { 35 | public long constValue; 36 | private boolean isParent; 37 | private boolean isChild; 38 | private Function function; 39 | 
private Function targetFunction; 40 | private ArrayList<FlowInfo> children = new ArrayList<>(); 41 | private ArrayList<FlowInfo> parents = new ArrayList<>(); 42 | 43 | private Address callSiteAddress; 44 | private int argIdx; 45 | 46 | FlowInfo(long constValue){ 47 | this.constValue = constValue; 48 | } 49 | 50 | FlowInfo(Function function){ 51 | this.function = function; 52 | this.isChild = true; 53 | } 54 | 55 | FlowInfo(Function function, Function targetFunction, Address callSiteAddress, int argIdx){ 56 | this.function = function; 57 | this.callSiteAddress = callSiteAddress; 58 | this.targetFunction = targetFunction; 59 | this.argIdx = argIdx; 60 | 61 | this.isParent = true; 62 | } 63 | 64 | public void appendNewParent(FlowInfo parent) { 65 | this.parents.add(parent); 66 | printf("Adding new parent... \n"); 67 | } 68 | 69 | public void appendNewChild(FlowInfo child) { 70 | this.children.add(child); 71 | printf("Adding new child...\n"); 72 | } 73 | 74 | public boolean isParent() { return isParent; } 75 | 76 | public boolean isChild() { return isChild; } 77 | 78 | public ArrayList<FlowInfo> getChildren() { return children; } 79 | 80 | public ArrayList<FlowInfo> getParents() { return parents; } 81 | 82 | public Function getFunction() { return function; } 83 | 84 | public Function getTargetFunction() { return targetFunction; } 85 | 86 | public Address getAddress() { return callSiteAddress;} 87 | 88 | public int getArgIdx() { return argIdx;} 89 | 90 | 91 | } 92 | 93 | // child class representing variables / flows that are phi inputs, i.e., any PhiFlow object 94 | // is directly an input to a MULTIEQUAL phi node 95 | class PhiFlow extends FlowInfo{ 96 | PhiFlow(long newConstValue){ 97 | super(newConstValue); 98 | } 99 | 100 | PhiFlow(Function newFunction){ 101 | super(newFunction); 102 | } 103 | 104 | PhiFlow(Function newFunction, Function newTargetFunction, Address newAddr, int newArgIdx){ 105 | super(newFunction, newTargetFunction, newAddr, newArgIdx); 106 | } 107 | } 108 | 109 | //child class for 
representing our "sink" function 110 | class Sink extends FlowInfo{ 111 | Sink(Function newFunction,Function newTargetFunction, Address newAddr){ 112 | //TODO add support for different param indices if we want to support functions other than malloc() 113 | super(newFunction, newTargetFunction, newAddr, 0); 114 | super.isParent = false; //hacky 115 | } 116 | } 117 | 118 | 119 | 120 | public HighFunction decompileFunction(Function f) { 121 | HighFunction hfunction = null; 122 | 123 | try { 124 | DecompileResults dRes = decomplib.decompileFunction(f, decomplib.getOptions().getDefaultTimeout(), getMonitor()); 125 | 126 | hfunction = dRes.getHighFunction(); 127 | } 128 | catch (Exception exc) { 129 | printf("EXCEPTION IN DECOMPILATION!\n"); 130 | exc.printStackTrace(); 131 | } 132 | 133 | return hfunction; 134 | } 135 | 136 | /* 137 | This function analyzes a function called on the way to determining an input to our sink 138 | e.g.: 139 | 140 | int x = calledFunction(); 141 | sink(x); 142 | 143 | We find the function, then find all of its RETURN p-code ops, and analyze backwards from 144 | the varnode associated with the RETURN value. 
145 | 146 | Weird edge case: we can't handle funcs that are just wrappers around other functions, e.g.: 147 | func(){ 148 | return rand() 149 | }; 150 | */ 151 | private void analyzeCalledFunction(FlowInfo path, Function f, boolean isPhi) 152 | throws NotYetImplementedException, InvalidInputException, NotFoundException { 153 | 154 | FlowInfo newFlow = null; 155 | if (!isPhi) { 156 | newFlow = new FlowInfo(f); 157 | path.appendNewChild(newFlow); 158 | } 159 | else { 160 | newFlow = new PhiFlow(f); 161 | path.appendNewChild(newFlow); 162 | } 163 | 164 | HighFunction hfunction = decompileFunction(f); 165 | if(hfunction == null) { 166 | printf("Failed to decompile function!"); 167 | return; 168 | } 169 | 170 | printf("Function %s entry @ 0x%x\n", 171 | f.getName(), 172 | f.getEntryPoint().getOffset()); 173 | 174 | Iterator<PcodeOpAST> ops = hfunction.getPcodeOps(); 175 | 176 | //Loop through the function's p-code ops, looking for RETURN 177 | while (ops.hasNext() && !monitor.isCancelled()) { 178 | PcodeOpAST pcodeOpAST = ops.next(); 179 | 180 | if (pcodeOpAST.getOpcode() != PcodeOp.RETURN) { 181 | continue; 182 | } 183 | //from here on, we are dealing with a PcodeOp.RETURN 184 | 185 | long returnAddress = 0; //long, so 64-bit addresses are not truncated 186 | if ( pcodeOpAST.getSeqnum() != null){ 187 | returnAddress = pcodeOpAST.getSeqnum().getTarget().getOffset(); 188 | 189 | printf("Found %s return @ 0x%x\n", 190 | f.getName(), 191 | returnAddress); 192 | } 193 | 194 | //get the varnode for the function's return value 195 | Varnode returnedValue = pcodeOpAST.getInput(1); 196 | 197 | if (returnedValue == null) { 198 | printf("--> Could not resolve return value from %s\n", f.getName()); 199 | return; 200 | } 201 | 202 | //if we had a phi earlier, it's been logged, so going forward we set isPhi back to false 203 | processOneVarnode(newFlow, f, returnedValue, false); 204 | } 205 | 206 | printf("\n\n\n\n"); 207 | } 208 | 209 | /* 210 | Given a function, analyze all sites where it is called, looking at how the parameter at 
the call 211 | site specified by paramSlot is derived. This is for situations where we determine that a varnode 212 | we are looking at is a parameter to the current function - we then have to analyze all sites where 213 | that function is called to determine possible values for that parameter. 214 | */ 215 | private FlowInfo analyzeCallSites(FlowInfo path, Function function, int paramSlot, boolean isPhi) 216 | throws InvalidInputException, NotYetImplementedException, NotFoundException { 217 | 218 | ReferenceIterator referencesTo = currentProgram.getReferenceManager().getReferencesTo(function.getEntryPoint()); 219 | 220 | FlowInfo currentPath = null; 221 | 222 | for (Reference currentReference : referencesTo) { 223 | 224 | Address fromAddr = currentReference.getFromAddress(); 225 | Function callingFunction = getFunctionContaining(fromAddr); 226 | 227 | if (callingFunction == null) { 228 | printf("Could not get calling function @ 0x%x\n", fromAddr.getOffset()); 229 | continue; 230 | } 231 | 232 | printf("analyzeCallSites(..., %s, ...) - found calling function @ 0x%x [%s]\n", 233 | function.getName(), 234 | fromAddr.getOffset(), 235 | callingFunction.getName()); 236 | 237 | //if the reference is a CALL 238 | if (currentReference.getReferenceType() == RefType.UNCONDITIONAL_CALL) { 239 | printf("found unconditional call %s -> %s\n", 240 | getFunctionContaining(currentReference.getFromAddress()).getName(), 241 | function.getName()); 242 | 243 | /* 244 | Heavily based off of code at ShowConstantUse.java:729. 
Previously I had a very hacky callsite 245 | discovery algorithm here 246 | */ 247 | HighFunction hfunction = decompileFunction(callingFunction); 248 | 249 | //get the p-code ops at the address of the reference 250 | Iterator<PcodeOpAST> ops = hfunction.getPcodeOps(fromAddr.getPhysicalAddress()); 251 | 252 | //now loop over the p-code ops looking for the CALL operation 253 | while(ops.hasNext() && !monitor.isCancelled()) { 254 | 255 | PcodeOpAST currentOp = ops.next(); 256 | 257 | if (currentOp.getOpcode() == PcodeOp.CALL) { 258 | Address parentAddress = currentOp.getSeqnum().getTarget(); 259 | 260 | FlowInfo parentNode = null; 261 | 262 | //get the function which is called by the CALL operation 263 | Function targetFunction = getFunctionAt(currentOp.getInput(0).getAddress()); 264 | 265 | //construct and add the appropriate node to our path 266 | 267 | if (!isPhi) { 268 | parentNode = new FlowInfo(function, targetFunction, parentAddress, paramSlot); 269 | } 270 | else { 271 | parentNode = new PhiFlow(function, targetFunction, parentAddress, paramSlot); 272 | } 273 | 274 | //dispatch to analysis of the particular function callsite we are examining to determine how the parameter is defined 275 | currentPath = analyzeFunctionCallSite(parentNode, getFunctionContaining(currentReference.getFromAddress()), currentOp, paramSlot); 276 | 277 | path.appendNewParent(currentPath); 278 | } 279 | } 280 | } 281 | 282 | } 283 | 284 | return path; 285 | } 286 | 287 | 288 | 289 | 290 | 291 | /* 292 | 293 | This function handles one varnode 294 | 295 | If the varnode is a constant, we are done: create a constant node and return 296 | 297 | If the varnode is associated with a parameter to the function, we then find each 298 | site where the function is called, and analyze how the parameter varnode at the 299 | corresponding index is derived for each call of the function 300 | 301 | If the varnode is not constant or a parameter, we get the p-code op which defines it, 302 | and then recursively 
trace the one or more input varnodes of that p-code op (tracing backwards), 303 | and see how they are defined 304 | 305 | */ 306 | private FlowInfo processOneVarnode(FlowInfo path, Function f, Varnode v, boolean isPhi) 307 | throws NotYetImplementedException, InvalidInputException, NotFoundException { 308 | 309 | if (v.isAddress()) { 310 | println("TODO handle addresses"); 311 | } 312 | 313 | //If the varnode is constant, we are done, save it off 314 | if ( v.isConstant()) { 315 | printf("\t\t\tprocessOneVarnode: Addr or Constant! - %s\n", v.toString()); 316 | 317 | long value = v.getOffset(); 318 | 319 | //either it's just a constant, or an input to a phi... 320 | if (!isPhi) { 321 | FlowInfo terminal = new FlowInfo(value); 322 | path.appendNewChild(terminal); 323 | } 324 | else { 325 | PhiFlow terminalPhi = new PhiFlow(value); 326 | path.appendNewChild(terminalPhi); 327 | } 328 | 329 | //done! return 330 | return path; 331 | } 332 | 333 | /* 334 | check if this varnode is in fact a parameter to the current function 335 | 336 | we retrieve the high-level decompiler variable associated with the varnode 337 | and check if it is an instance of HighParam, a child class of HighVariable 338 | representing a function parameter. This seems like an unnecessarily complex 339 | way of figuring out if a given varnode is a parameter, but I found examples 340 | of doing it this way in officially-published plugins bundled with Ghidra, 341 | and I couldn't figure out a better way to do it 342 | */ 343 | 344 | HighVariable hvar = v.getHigh(); 345 | 346 | if (hvar instanceof HighParam) { 347 | printf("Varnode is function parameter -> parameter #%d... %s\n", 348 | ((HighParam)hvar).getSlot(), //the parameter index 349 | v.toString()); 350 | 351 | //ok, so we do have a function parameter. 
Now we want to analyze all 352 | //sites in the binary where this function is called, seeing how the varnode 353 | //at the parameter index we are examining is derived 354 | path = analyzeCallSites(path, f, ((HighParam)hvar).getSlot(), isPhi); 355 | 356 | return path; 357 | } 358 | 359 | /* 360 | varnode is not a constant, or associated with a param 361 | 362 | In this case, we get the p-code operation 363 | which defines the varnode, and analyze it. We are tracing backwards, for example: 364 | if we had "varnode x = a + b", we will be given the p-code operation "a + b"... 365 | from there, we recursively trace further back, seeing how varnode "a" is defined 366 | and how varnode "b" is defined. 367 | 368 | As we trace backwards, we might terminate on one of the cases handled above, the 369 | varnode ultimately resolving into some constant, or resolving into a parameter to 370 | the current function. 371 | 372 | It is possible that we have something like "varnode x = function_a()" - in this 373 | case (a PcodeOp.CALL), we want to trace into that function. At that function, we'll 374 | start tracing backwards from the varnode(s) associated with the function's RETURN 375 | p-code op(s), in order to figure out how the return value is constructed. This happens 376 | in analyzeCalledFunction 377 | 378 | 379 | --- 380 | 381 | Additionally, it's possible that this varnode is defined by a MULTIEQUAL p-code 382 | operation. This is an operation inserted by Ghidra's decompilation analysis when it 383 | is creating a static single assignment representation of the p-code. You 384 | will not see it in the regular listing view if you enable p-code view. 385 | 386 | MULTIEQUAL is the Ghidra p-code operation used to implement phi nodes in Static Single 387 | Assignment. Briefly, this operation is used to select from different assignments to the 388 | same variable along different control flow paths. 
Consider the following example: 389 | 390 | var x = 5; 391 | if (a){ 392 | x = 6; 393 | } 394 | else if (b){ 395 | x = 7; 396 | } 397 | y = x + 5; 398 | 399 | The final line, after the control flow statements in the middle, makes use of the "x" variable, 400 | which can obtain values at three different points in the code above. SSA conversion will rename 401 | each of these variables (each called "x" in the code above) upon assignment, say to "x1", "x2", 402 | and "x3", respectively. After renaming these variables, there is no such variable named "x" anymore, 403 | so the final line needs to be corrected to indicate that what is being denoted by "x" could 404 | actually refer to any of the three renamed variables just created. "x = MULTIEQUAL(x1,x2,x3)" 405 | defines a new variable, "x", which could take the value of any of those three variables. 406 | 407 | In any case, what we want to do here, is look at each varnode which is input to this MULTIEQUAL, 408 | and trace backwards from it. In the example above, we'd visit "x = 5", "x = 6", and "x = 7". Because 409 | each of these constants is a possible value of x which unifies in this MULTIEQUAL, we set the isPhi flag 410 | as we trace backwards. This lets our analysis know that whatever value we get for x is associated with our 411 | phi. This information will be displayed to the user when they get the print out later. 412 | */ 413 | 414 | //get the p-code op defining the varnode 415 | PcodeOp def = v.getDef(); 416 | 417 | if(def == null) { 418 | printf("NULL DEF!\n"); 419 | return path; 420 | } 421 | 422 | /* 423 | This is a very hacky way of getting the concrete virtual address 424 | associated with a given p-code operation (e.g, the address of the 425 | actual instruction that underlies it.) Best I can tell, this isn't something 426 | that we're officially supposed to do - in some cases, a p-code operation may 427 | not have a concrete instruction in the binary behind it. 
For debugging purposes while 428 | developing this script, I used this code, but I don't think it's "correct" or 429 | the "right way" to do things 430 | */ 431 | if (def.getSeqnum().getTarget() != null) { 432 | printf("0x%x - ", def.getSeqnum().getTarget().getOffset()); 433 | } 434 | 435 | printf("processOneVarnode: %s\n", def.toString()); 436 | 437 | //get the enum value of the p-code operation that defines our varnode 438 | int opcode = def.getOpcode(); 439 | 440 | /* 441 | Switch on the opcode enum value. Note that this script doesn't support 442 | all possible p-code operations, just common ones that I encountered while 443 | writing code to test this script 444 | 445 | see Ghidra's included docs/languages/html/pcodedescription.htm for a listing 446 | of p-code operations, and check the "next" link at the bottom for even more 447 | */ 448 | switch (opcode) { 449 | 450 | /* 451 | Handle p-code ops that take one input. We just pass through here, 452 | analyzing the single varnode that the p-code operation takes. 453 | 454 | For example, see "NOT EAX" here. Our output varnode is just the bitwise negation 455 | of the input varnode. So upon seeing an INT_NEGATE p-code operation, we just 456 | examine the single varnode that is its input 457 | 458 | malloc(~return3()); 459 | 460 | 004008a9 NOT EAX 461 | EAX = INT_NEGATE EAX 462 | 004008ab CDQE 463 | RAX = INT_SEXT EAX 464 | 004008ad MOV RDI,RAX 465 | RDI = COPY RAX 466 | 004008b0 CALL malloc 467 | RSP = INT_SUB RSP, 8:8 468 | STORE ram(RSP), 0x4008b5:8 469 | CALL *[ram]0x400550:8 470 | 471 | 472 | */ 473 | case PcodeOp.INT_NEGATE: 474 | case PcodeOp.INT_ZEXT: 475 | case PcodeOp.INT_SEXT: 476 | case PcodeOp.CAST: 477 | case PcodeOp.COPY: { 478 | processOneVarnode(path, f, def.getInput(0), isPhi); 479 | break; 480 | } 481 | 482 | /* 483 | Handle p-code ops that take two inputs. 
484 | 485 | The output (our current varnode) = "(pcodeop input1 input2)" or "input1 [pcodeop] input2": 486 | 487 | Because we are not tracing out all the values that affect values going into our sink function, 488 | just terminating constants and function calls, we don't log constants associated with these operations 489 | 490 | So if we had a current varnode x: 491 | 492 | "x = y + 5" would result in us calling processOneVarnode(y) but ignoring that "5" 493 | 494 | "x = y + z" would result in us calling processOneVarnode(y) and processOneVarnode(z) 495 | 496 | */ 497 | case PcodeOp.INT_ADD: 498 | case PcodeOp.INT_SUB: 499 | case PcodeOp.INT_MULT: 500 | case PcodeOp.INT_DIV: 501 | case PcodeOp.INT_AND: 502 | case PcodeOp.INT_OR: 503 | case PcodeOp.INT_XOR: { 504 | if (!def.getInput(0).isConstant()) { 505 | //only process if not constant 506 | processOneVarnode(path,f, def.getInput(0), isPhi); 507 | } 508 | if(!def.getInput(1).isConstant()) { 509 | //only process if not constant 510 | processOneVarnode(path,f, def.getInput(1), isPhi); 511 | } 512 | break; 513 | } 514 | 515 | 516 | /* 517 | Handle CALL p-code ops by analyzing the functions that they call 518 | */ 519 | case PcodeOp.CALL:{ 520 | printf("Located source - call to %x [%s]\n", 521 | def.getInput(0).getAddress().getOffset(), 522 | getFunctionAt(def.getInput(0).getAddress()).getName()); 523 | 524 | Function pf = getFunctionAt(def.getInput(0).getAddress()); 525 | 526 | analyzeCalledFunction(path, pf, isPhi); 527 | break; 528 | } 529 | 530 | /* 531 | p-code representation of a PHI operation. 532 | 533 | So here we choose one varnode from a number of incoming varnodes. 
534 | 535 | In this case, we want to explore each varnode that the phi handles. 536 | We need to propagate phi status to each of them as well 537 | 538 | See documentation at /docs/languages/html/additionalpcode.html 539 | */ 540 | case PcodeOp.MULTIEQUAL:{ 541 | printf("Processing a MULTIEQUAL with %d inputs", def.getInputs().length); 542 | 543 | //visit each input to the MULTIEQUAL 544 | for (int i = 0; i < def.getInputs().length; i++) { 545 | //we set isPhi = true, as we trace each of the phi inputs 546 | processOneVarnode(path,f, def.getInput(i), true); 547 | } 548 | break; 549 | } 550 | 551 | /* 552 | This is a p-code op that may be inserted during the decompiler's 553 | construction of SSA form. To be honest, I don't completely understand 554 | this p-code op's purpose 555 | 556 | See documentation at /docs/languages/html/additionalpcode.html 557 | */ 558 | case PcodeOp.INDIRECT:{ 559 | printf("USED In INDIRECT --> output %s\n", def.getOutput().toString()); 560 | 561 | PcodeOp[] pc = getInstructionAt(v.getPCAddress()).getPcode(); 562 | 563 | for (int i = 0; i < pc.length; i++) { 564 | printf("PC%d -> %s\n", i, pc[i].toString()); 565 | if(pc[i].getOpcode() == PcodeOp.CALL) { 566 | printf("INDIRECT Associated with call @ %x (%s)\n", pc[i].getInput(0).getOffset(), 567 | getFunctionContaining(pc[i].getInput(0).getAddress()).getName()); 568 | } 569 | } 570 | 571 | /* 572 | I'm not sure if I'm doing the right thing handling INDIRECT in this way, 573 | but I found it being used when handling global variables, and inserting this 574 | call to processOneVarnode allows us to resolve further back 575 | */ 576 | processOneVarnode(path,f, def.getInput(1), isPhi); 577 | break; 578 | } 579 | 580 | /* 581 | Two more p-code operations which take two inputs 582 | */ 583 | case PcodeOp.PIECE: 584 | case PcodeOp.PTRSUB: { 585 | processOneVarnode(path,f, def.getInput(0), isPhi); 586 | processOneVarnode(path,f, def.getInput(1), isPhi); 587 | break; 588 | } 589 | 590 | //throw an 
exception when encountering a p-code op we don't support 591 | default: { 592 | throw new NotYetImplementedException("Support for PcodeOp " + def.toString() + " not implemented"); 593 | } 594 | } 595 | 596 | return path; 597 | } 598 | 599 | 600 | /* 601 | This function handles analysis of a particular callsite for a function we are looking at - 602 | we start out knowing we want to analyze a particular input to the function, e.g., the second parameter, 603 | then find all call sites in the binary where that function is called (see getFunctionCallSitePCodeOps), 604 | and then call this function, passing it the p-code op for the CALL that dispatches to the function, as 605 | well as the parameter index that we want to examine. 606 | 607 | This function then finds the varnode associated with that particular index, and either saves it (if it 608 | is a constant value), or passes it off to processOneVarnode to be analyzed 609 | 610 | */ 611 | public FlowInfo analyzeFunctionCallSite(FlowInfo path, Function f, PcodeOpAST callPCOp, int paramIndex) 612 | throws InvalidInputException, NotYetImplementedException, NotFoundException { 613 | 614 | if (callPCOp.getOpcode() != PcodeOp.CALL) { 615 | throw new InvalidInputException("PCodeOp that is not CALL passed in to function expecting CALL only"); 616 | } 617 | 618 | Varnode calledFunc = callPCOp.getInput(0); 619 | 620 | if (calledFunc == null || !calledFunc.isAddress()) { 621 | println("call, but not address!"); 622 | return null; 623 | } 624 | 625 | Address pa = callPCOp.getSeqnum().getTarget(); 626 | 627 | int numParams = callPCOp.getNumInputs(); 628 | 629 | /* 630 | the number of p-code operation varnode inputs here is one more than the number of parameters 631 | being passed to the function when called (input 0 is the call target) 632 | 633 | Note that these parameters only become associated with the CALL p-code op during 634 | decompiler analysis. They are not present in the raw p-code. 
635 | */ 636 | printf("\nCall @ 0x%x [%s] to 0x%x [%s] (%d pcodeops)\n", 637 | pa.getOffset(), 638 | f.getName(), 639 | calledFunc.getAddress().getOffset(), 640 | getFunctionAt(calledFunc.getAddress()).getName(), 641 | numParams); 642 | 643 | //input #0 is the call target address; skip it and start at input 1, which is the 0th parameter 644 | for (int i = 1; i < numParams; i++) { 645 | 646 | //this function is called with param index starting at 0, so we subtract 1 from the input # 647 | if(i - 1 == paramIndex) { 648 | //ok, we have the parameter of interest 649 | Varnode parm = callPCOp.getInput(i); 650 | 651 | if (parm == null) { 652 | printf("\tNULL param #%d??\n", i); 653 | continue; 654 | } 655 | 656 | printf("\tParameter #%d - %s @ 0x%x\n", 657 | i, 658 | parm.toString(), 659 | parm.getAddress().getOffset()); 660 | 661 | //if we have a constant parameter, save that. We are done here 662 | if(parm.isConstant()) { 663 | long value = parm.getOffset(); 664 | 665 | printf("\t\tisConstant: %d\n", value); 666 | 667 | FlowInfo newFlowConst = new FlowInfo(value); 668 | path.appendNewChild(newFlowConst); 669 | } 670 | else{ 671 | path = processOneVarnode(path,f, parm, false); //isPhi = false 672 | } 673 | } 674 | } 675 | return path; 676 | } 677 | 678 | /* 679 | Within a function "f", look for all p-code operations associated with a call to a specified 680 | function, calledFunctionName 681 | 682 | Return a list of these p-code CALL sites 683 | */ 684 | public ArrayList<PcodeOpAST> getFunctionCallSitePCodeOps(Function f, String calledFunctionName){ 685 | 686 | ArrayList<PcodeOpAST> pcodeOpCallSites = new ArrayList<>(); 687 | 688 | HighFunction hfunction = decompileFunction(f); 689 | if(hfunction == null) { 690 | printf("ERROR: Failed to decompile function!\n"); 691 | return null; 692 | } 693 | 694 | Iterator<PcodeOpAST> ops = hfunction.getPcodeOps(); 695 | 696 | //iterate over all p-code ops in the function 697 | while (ops.hasNext() && !monitor.isCancelled()) { 698 | PcodeOpAST pcodeOpAST = ops.next(); 699 | 700 | if 
(pcodeOpAST.getOpcode() == PcodeOp.CALL) { 701 | 702 | //current p-code op is a CALL 703 | //get the address CALL-ed 704 | Varnode calledVarnode = pcodeOpAST.getInput(0); 705 | 706 | if (calledVarnode == null || !calledVarnode.isAddress()) { 707 | printf("ERROR: call, but not to address!"); 708 | continue; 709 | } 710 | 711 | //if the CALL is to our function, save this callsite 712 | if( getFunctionAt(calledVarnode.getAddress()).getName().compareTo(calledFunctionName) == 0) { 713 | pcodeOpCallSites.add(pcodeOpAST); 714 | } 715 | } 716 | } 717 | return pcodeOpCallSites; 718 | } 719 | 720 | /* 721 | set up the decompiler 722 | */ 723 | private DecompInterface setUpDecompiler(Program program) { 724 | DecompInterface decompInterface = new DecompInterface(); 725 | 726 | DecompileOptions options; 727 | options = new DecompileOptions(); 728 | PluginTool tool = state.getTool(); 729 | if (tool != null) { 730 | OptionsService service = tool.getService(OptionsService.class); 731 | if (service != null) { 732 | ToolOptions opt = service.getOptions("Decompiler"); 733 | options.grabFromToolAndProgram(null, opt, program); 734 | } 735 | } 736 | decompInterface.setOptions(options); 737 | 738 | decompInterface.toggleCCode(true); 739 | decompInterface.toggleSyntaxTree(true); 740 | decompInterface.setSimplificationStyle("decompile"); 741 | 742 | return decompInterface; 743 | } 744 | 745 | /* 746 | pretty print a path to a sink 747 | */ 748 | private void pprintPathInternal(FlowInfo path, String pattern) { 749 | 750 | //if we have a phi, add the phi character at the correct column 751 | if (path instanceof PhiFlow) { 752 | pattern += "Ø"; 753 | } 754 | 755 | //print the child/parent/phi pattern of "-", "+", and "Ø" 756 | if (!pattern.isEmpty()) { //compare string contents, not object references 757 | printf("%s", pattern); 758 | } 759 | 760 | //Our sink function 761 | if(path instanceof Sink) { 762 | printf("SINK: call to %s in %s @ 0x%x\n", 763 | path.getTargetFunction().getName(), 764 | getFunctionContaining(path.getAddress()).getName(), 
765 | path.getAddress().getOffset()); 766 | } 767 | //a "parent" - a function that calls the previous function 768 | else if(path.isParent()) { 769 | printf("P: call %s -> %s @ 0x%x - param #%d\n", 770 | getFunctionContaining(path.getAddress()).getName(), 771 | path.getTargetFunction().getName(), 772 | path.getAddress().getOffset(), 773 | path.getArgIdx()); 774 | } 775 | //a "child" - a function that the current function calls 776 | else if (path.isChild()){ 777 | printf("C: %s\n", path.getFunction().getName()); 778 | } 779 | //if we don't have a function, we have a terminal constant 780 | if (path.function == null) { 781 | printf("CONST: %d (0x%x)\n", path.constValue, path.constValue); 782 | } 783 | 784 | //now print all of this node's children 785 | for (int i = 0; i < path.getChildren().size(); i++) { 786 | pprintPathInternal(path.getChildren().get(i), pattern + "-"); 787 | } 788 | 789 | //now print all of this node's parents 790 | for (int j = 0; j < path.getParents().size(); j++) { 791 | pprintPathInternal(path.getParents().get(j), pattern + "+"); 792 | } 793 | } 794 | 795 | /* 796 | Wrapper for pprintPathInternal 797 | */ 798 | public void pprintPath(FlowInfo path) { 799 | pprintPathInternal(path, ""); 800 | } 801 | 802 | 803 | public void run() throws Exception { 804 | 805 | //malloc is an easy function to look at, as it takes a single integer argument 806 | String sinkFunctionName = "malloc"; 807 | 808 | 809 | decomplib = setUpDecompiler(currentProgram); 810 | 811 | if(!decomplib.openProgram(currentProgram)) { 812 | printf("Decompiler error: %s\n", decomplib.getLastMessage()); 813 | return; 814 | } 815 | 816 | Reference[] sinkFunctionReferences; 817 | HashSet<Function> functionsCallingSinkFunction = new HashSet<Function>(); 818 | 819 | 820 | //iterator over all functions in the program 821 | FunctionIterator functionManager = currentProgram.getFunctionManager().getFunctions(true); 822 | 823 | for (Function function : functionManager) { 824 | /* 825 | Look for the function with 
sinkFunctionName (malloc). 826 | 827 | Unfortunately, we can't look the function up by name as the FlatAPI function 828 | getFunction(java.lang.String name) is deprecated 829 | */ 830 | if (function.getName().equals(sinkFunctionName)) { 831 | 832 | printf("Found sink function %s @ 0x%x\n", 833 | sinkFunctionName, 834 | function.getEntryPoint().getOffset()); 835 | 836 | sinkFunctionReferences = getReferencesTo(function.getEntryPoint()); 837 | 838 | //Now find all references to this function 839 | for (Reference currentSinkFunctionReference : sinkFunctionReferences) { 840 | printf("\tFound %s reference @ 0x%x (%s)\n", 841 | sinkFunctionName, 842 | currentSinkFunctionReference.getFromAddress().getOffset(), 843 | currentSinkFunctionReference.getReferenceType().getName()); 844 | 845 | //get the function where the current reference occurs (hopefully it is a function) 846 | Function callingFunction = getFunctionContaining(currentSinkFunctionReference.getFromAddress()); 847 | 848 | //Only save *unique* calling functions which are not thunks 849 | if (callingFunction != null && 850 | !callingFunction.isThunk() && 851 | !functionsCallingSinkFunction.contains(callingFunction) ) { 852 | functionsCallingSinkFunction.add(callingFunction); 853 | } 854 | } 855 | } 856 | } 857 | 858 | printf("\nFound %d functions calling sink function\n", functionsCallingSinkFunction.size()); 859 | for (Function currentFunction : functionsCallingSinkFunction) { 860 | printf("\t-> %s\n", currentFunction.toString()); 861 | } 862 | 863 | ArrayList<FlowInfo> paths = new ArrayList<FlowInfo>(); 864 | 865 | //iterate through each unique function which references our sink function 866 | for (Function currentFunction : functionsCallingSinkFunction) { 867 | 868 | //get all sites in the function where we CALL the sink 869 | ArrayList<PcodeOpAST> callSites = getFunctionCallSitePCodeOps(currentFunction, sinkFunctionName); 870 | 871 | printf("\nFound %d sink function call sites in %s\n", 872 | callSites.size(), 873 | 
currentFunction.getName()); 874 | 875 | //for each CALL, figure out the inputs into the sink function 876 | for (PcodeOpAST callSite : callSites) { 877 | 878 | Address pa = callSite.getSeqnum().getTarget(); 879 | 880 | Function targetFunction = getFunctionContaining(callSite.getInput(0).getAddress()); 881 | 882 | Sink sink = new Sink(currentFunction, targetFunction, pa); 883 | 884 | //for now we pass in 0 for param idx because we only care about input #0 to malloc 885 | FlowInfo currentPath = analyzeFunctionCallSite(sink, currentFunction, callSite, 0); 886 | 887 | paths.add(currentPath); 888 | } 889 | } 890 | 891 | // Done! Now pretty print. Ideally, here, we would instead render a graph, but Ghidra 892 | // does not come with a GraphProvider interface :( 893 | 894 | printf("\n\n\n\n\n---------------------\n\nPRINTING OUTPUTS\n\n\n\n"); 895 | for (FlowInfo path : paths) { 896 | pprintPath(path); 897 | printf("\n\n\n-------------\n\n\n"); 898 | } 899 | } 900 | } -------------------------------------------------------------------------------- /Ghidra/PCodeMallocDemo/README.md: -------------------------------------------------------------------------------- 1 | # Working With Ghidra's P-Code To Identify Vulnerable Function Calls by Alexei Bulazel 2 | ## Originally posted at https://www.riverloopsecurity.com/blog/2019/05/pcode/ 3 | 4 | For those unfamiliar with the tool, [Ghidra](https://ghidra-sre.org/) is an interactive reverse engineering tool developed by the US National Security Agency, comparable in functionality to tools such as [Binary Ninja](https://binary.ninja) and [IDA Pro](https://www.hex-rays.com/). After years of development internally at NSA, [Ghidra was released open source](https://github.com/NationalSecurityAgency/ghidra/) to the public in [March 2019 at RSA](https://www.rsaconference.com/events/us19/agenda/sessions/16608-come-get-your-free-nsa-reverse-engineering-tool). 
5 | 6 | My script leverages Ghidra's "p-code" intermediate representation to trace inputs to `malloc` through functions and interprocedural calls. 7 | Calls to `malloc` are of obvious interest to vulnerability researchers looking for bugs in binary software - if a user-controlled input can somehow affect the size parameter passed to the function, it may be possible for the user to pass in an argument triggering integer overflow during the calculation of allocation size, leading to memory corruption. 8 | 9 | If you want to follow along with the code, I've published it at: https://github.com/0xAlexei/INFILTRATE2019/tree/master/PCodeMallocDemo. See the discussion later in "Running The Script" for instructions on how to run it with your local copy of Ghidra. 10 | 11 | 12 | ## Inspiration 13 | The inspiration for this demo came from watching [Sophia d’Antoine, Peter LaFosse, and Rusty Wagner’s "Be A Binary Rockstar"](https://vimeo.com/215511922) at INFILTRATE 2017, a presentation on Binary Ninja. 14 | During that presentation, fellow [River Loop Security team member Sophia d'Antoine]({{< relref "supermicro-validation-1.md" >}}) demonstrated a [script to find calls to memcpy with unsafe arguments](https://github.com/trailofbits/binjascripts/blob/master/abstractanalysis/binja_memcpy.py), leveraging Binary Ninja’s intermediate language representations of assembly code. 15 | I figured I would create a Ghidra script to do something similar, but when I found that it wouldn’t be as simple as just calling a function like `get_parameter_at`, I began digging into Ghidra’s code and the plugin examples published by NSA with Ghidra. 16 | I ended up with the proof of concept script discussed in this post. 17 | While this script might not be ready for real world 0day discovery, it should give you a sense of working with Ghidra's scripting APIs, p-code intermediate representation, and built-in support for program analysis. 
18 | 19 | 20 | ## P-Code 21 | P-code is Ghidra's intermediate representation / intermediate language (IR/IL) for assembly language instructions. 22 | Ghidra "lifts" assembly instructions of various disparate architectures into p-code, allowing reverse engineers to more easily develop automated analyses that work with assembly code. 23 | 24 | P-code abstracts away the complexities of working with various CPU architectures - x86's plethora of instructions and prefixes, MIPS' delay slots, ARM's conditional instructions, etc. - and presents reverse engineers with a common, simplified instruction set to work with. P-code lifting is a one-to-many translation: a single assembly instruction may be lifted into one or more p-code operations. 25 | 26 | For a simple example, see how an x86 `MOV` instruction translates into a single `COPY` p-code operation: 27 | 28 | ``` 29 | MOV RAX,RSI 30 | RAX = COPY RSI 31 | ``` 32 | 33 | In a more complex case, a `SHR` instruction expands out into 29 p-code operations. 34 | Note how calculations for x86 flags (`CF`, `OF`, `SF`, and `ZF`) are made explicit. 
35 | 36 | ``` 37 | SHR RAX,0x3f 38 | $Ub7c0:4 = INT_AND 63:4, 63:4 39 | $Ub7d0:8 = COPY RAX 40 | RAX = INT_RIGHT RAX, $Ub7c0 41 | $U33e0:1 = INT_NOTEQUAL $Ub7c0, 0:4 42 | $U33f0:4 = INT_SUB $Ub7c0, 1:4 43 | $U3400:8 = INT_RIGHT $Ub7d0, $U33f0 44 | $U3410:8 = INT_AND $U3400, 1:8 45 | $U3430:1 = INT_NOTEQUAL $U3410, 0:8 46 | $U3440:1 = BOOL_NEGATE $U33e0 47 | $U3450:1 = INT_AND $U3440, CF 48 | $U3460:1 = INT_AND $U33e0, $U3430 49 | CF = INT_OR $U3450, $U3460 50 | $U3490:1 = INT_EQUAL $Ub7c0, 1:4 51 | $U34b0:1 = INT_SLESS $Ub7d0, 0:8 52 | $U34c0:1 = BOOL_NEGATE $U3490 53 | $U34d0:1 = INT_AND $U34c0, OF 54 | $U34e0:1 = INT_AND $U3490, $U34b0 55 | OF = INT_OR $U34d0, $U34e0 56 | $U2e00:1 = INT_NOTEQUAL $Ub7c0, 0:4 57 | $U2e20:1 = INT_SLESS RAX, 0:8 58 | $U2e30:1 = BOOL_NEGATE $U2e00 59 | $U2e40:1 = INT_AND $U2e30, SF 60 | $U2e50:1 = INT_AND $U2e00, $U2e20 61 | SF = INT_OR $U2e40, $U2e50 62 | $U2e80:1 = INT_EQUAL RAX, 0:8 63 | $U2e90:1 = BOOL_NEGATE $U2e00 64 | $U2ea0:1 = INT_AND $U2e90, ZF 65 | $U2eb0:1 = INT_AND $U2e00, $U2e80 66 | ZF = INT_OR $U2ea0, $U2eb0 67 | 68 | ``` 69 | 70 | P-code itself is generated with SLEIGH, a processor specification language for Ghidra which provides the tool with both disassembly information (e.g., the sequence of bytes `89 d8` means `MOV EAX, EBX`), *and* semantic information (`MOV EAX, EBX` has the p-code semantics `EAX = COPY EBX`). After lifting up to raw p-code (i.e., the direct translation to p-code), additional follow-on analysis may enhance the p-code, transforming it by adding additional metadata to instructions (e.g., the `CALL` p-code operation only has a call target address in raw p-code form, but may gain parameters associated with the function call after analysis), and adding additional analysis-derived instructions not present in raw p-code, such as `MULTIEQUAL`, representing a phi-node (more on that later), or `PTRSUB`, for pointer arithmetic producing a pointer to a subcomponent of a data type. 
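
To make the pairing of disassembly and semantic information concrete, here is a minimal, self-contained Java sketch. It is only a toy lookup table: it is not SLEIGH syntax and it does not use Ghidra's API. The `SleighToy` class, its `Entry` type, and the `lift` helper are hypothetical names invented for illustration. The byte sequences are ordinary x86-64 encodings of the two `MOV` instructions mentioned in this post, and the p-code strings are copied from the examples above.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SleighToy {

    // One entry pairs the two kinds of information a processor
    // specification supplies: how bytes disassemble, and what they mean.
    public static final class Entry {
        public final String disassembly;
        public final List<String> pcode;

        Entry(String disassembly, String... pcode) {
            this.disassembly = disassembly;
            this.pcode = Arrays.asList(pcode);
        }
    }

    // Hypothetical table keyed by instruction bytes written as hex text.
    private static final Map<String, Entry> TABLE = new HashMap<>();
    static {
        TABLE.put("89 d8", new Entry("MOV EAX, EBX", "EAX = COPY EBX"));
        TABLE.put("48 89 f0", new Entry("MOV RAX,RSI", "RAX = COPY RSI"));
    }

    // "Lift" a byte sequence; returns null for bytes the toy table does not know.
    public static Entry lift(String hexBytes) {
        return TABLE.get(hexBytes);
    }

    public static void main(String[] args) {
        for (String bytes : new String[] {"89 d8", "48 89 f0"}) {
            Entry e = lift(bytes);
            System.out.println(bytes + " -> " + e.disassembly);
            // A real instruction may expand to many operations; these two need one each.
            for (String op : e.pcode) {
                System.out.println("    " + op);
            }
        }
    }
}
```

The `pcode` field is a list rather than a single string because lifting is one-to-many: an instruction like the `SHR` above would need a multi-entry list in the same slot.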
71 | 72 | During analysis, the code is also lifted into [static single assignment (SSA) form](https://en.wikipedia.org/wiki/Static_single_assignment_form), a representation wherein each variable is only assigned a value once. 73 | 74 | P-code operates over `varnodes` - quoting from the Ghidra documentation: "A varnode is a generalization of either a register or a memory location. It is represented by the formal triple: an address space, an offset into the space, and a size. Intuitively, a varnode is a contiguous sequence of bytes in some address space that can be treated as a single value. All manipulation of data by p-code operations occurs on varnodes." 75 | 76 | For readers interested in learning more, Ghidra ships with p-code documentation at `docs/languages/html/pcoderef.html`. Additionally, someone has posted the Ghidra decompiler Doxygen docs (included in the decompiler's source) at https://ghidra-decompiler-docs.netlify.com/index.html. 77 | 78 | 79 | ## This Script 80 | This script identifies inputs to `malloc()` by tracing backwards from the variable given to the function in order to figure out how that variable obtains its value, terminating in either a constant value or an external function call. Along the way, each function call that the value passes through is logged - either where it is returned by a function, or passed as an incoming parameter to a function call. The specific operations along the way that can constrain (e.g., checking equality or comparisons) or modify (e.g., arithmetic or bitwise operations) the values are currently not logged or processed in this proof of concept. 81 | 82 | Calls to `malloc` can go badly in a variety of ways, for example, [if an allocation size of zero is passed in](https://openwall.info/wiki/_media/people/jvanegue/files/woot10.pdf), or if an integer overflow occurs on the way to calculating the number of bytes to allocate. 
In general, we can expect one of these types of bugs to be more likely if user input is able to somehow affect the value passed to `malloc`. For example, if the user is able to specify a number of elements to allocate, and that value is then multiplied by `sizeof(element)`, there may be a chance of an integer overflow. If this script is able to determine that user input taken a few function calls before a call to `malloc` ends up passed to the function call, this code path may be worth auditing by a human vulnerability researcher. 83 | 84 | Understanding where allocations of static, non-user controlled sizes are used is also interesting, as exploit developers looking to turn discovered [heap vulnerabilities](http://security.cs.rpi.edu/courses/binexp-spring2015/lectures/17/10_lecture.pdf) into exploits may need to manipulate heap layout with ["heap grooms"](https://googleprojectzero.blogspot.com/2015/06/what-is-good-memory-corruption.html) relying on specific patterns of controlled allocations and deallocations. 85 | 86 | Note that while I've chosen to build this script around analysis of `malloc`, as it is a simple function that just takes a single integer argument, the same sort of analysis could be very easily adapted to look for other vulnerable function call patterns, such as `memcpy` with user controlled lengths or buffers on the stack, or `system` or `exec`-family functions ingesting user input. 87 | 88 | 89 | ## Running The Script 90 | 91 | I've published the script, a test binary and its source code, and the output I receive when running the script over the binary at https://github.com/0xAlexei/INFILTRATE2019/tree/master/PCodeMallocDemo 92 | 93 | You can run the script by putting it in your Ghidra scripts directory (default `$USER_HOME/ghidra_scripts`), opening Ghidra's Script Manager window, and then looking for it in a folder labeled "INFILTRATE". 
The green arrow "Run Script" button at the top of the Script Manager window will then run the script, with output printed to the console. 94 | 95 | I'd also add that because the script simply prints output to the console, it can be run with Ghidra's command line ["headless mode"](https://ghidra-sre.org/InstallationGuide.html#RunHeadless) as well, to print its output to your command line terminal. 96 | 97 | ## Algorithm 98 | 99 | The script begins by looking for every function that references `malloc`. Then, for each of these functions, we look for each `CALL` p-code operation targeting `malloc` inside that function. Analysis then begins, looking at the sole parameter to `malloc` (`size_t size`). This parameter is a `varnode`, a generalized representation of a value in the program. After Ghidra's [data-flow analysis](https://en.wikipedia.org/wiki/Data-flow_analysis) has run, we can use `varnode`'s `getDef()` method to retrieve the p-code operation which defines it - e.g., for the statement `a = b + c`, if we asked for the operation defining `a`, we'd get `b + c`. From here, we can recursively trace backwards, asking what operations define `varnode`s `b` and `c` in that p-code expression, then what operations define their parents, and so on. 100 | 101 | Eventually, we might arrive at the discovery that one of these parents is a constant value, that a value is derived from a function call, or that a value comes from a parameter to the function. In the case that analysis determines that a constant is the ultimate origin value behind the value passed in to `malloc`, we can simply save the constant and terminate analysis of the particular code path under examination. Otherwise, we have to trace into called functions, and consider possible callsites for functions that call the current function under analysis. 
Along the way, for each function we traverse in a path to a terminal constant value or external function (where we cannot go any further), we save a node in our path to be printed out to the user at the end. 102 | 103 | 104 | ### Analyzing Inside Function Calls 105 | 106 | The value passed to `malloc` may ultimately derive from a function call, e.g.: 107 | 108 | ``` 109 | int x = getNumber(); 110 | 111 | malloc(x+5); 112 | ``` 113 | 114 | In this case, we would analyze `getNumber`, finding each `RETURN` p-code operation in the function, and analyzing the "input1" varnode associated with it, which represents the value the function returns (on x86, this would be the value in `EAX` at time of function return). Note that similar to the association of function parameters with `CALL` p-code operations, return values are only associated with `RETURN` p-code operations *after* analysis, and are not present in raw p-code. 115 | 116 | For example: 117 | 118 | ``` 119 | int getNumber(){ 120 | int number = atoi("8"); 121 | 122 | number = number + 10; 123 | 124 | return number; 125 | } 126 | ``` 127 | 128 | In the above code snippet, our analysis would trace backwards from the return, to the addition, and finally to a call to `atoi`, so we could add `atoi` as a node in the path determining the source of input to `malloc`. This analysis may be applied recursively until a terminating value of a constant or external function call is encountered. 129 | 130 | ### Phi Nodes 131 | 132 | Discussing analysis of values returned by called functions gives us an opportunity to consider "phi nodes". In the above example, there's only a single path for how `number` can be defined: first `atoi`, then `+ 10`. 
But what if, instead, we had: 133 | 134 | ``` 135 | int getNumber(){ 136 | 137 | int number; 138 | 139 | if (rand() > 100){ 140 | number = 10; 141 | } 142 | else { 143 | number = 20; 144 | } 145 | 146 | return number; 147 | } 148 | ``` 149 | 150 | Now, it's not so clear what `number`'s definition is at time of function return - it could be 10 or 20. A "phi node" can be used to represent the point in the program at which, going forward, `number` will possess either value 10 or 20. Ghidra's own analysis will insert a `MULTIEQUAL` operation (not present in the raw p-code) at the point where `number` is used, but could have either value 10 or 20 (you can imagine this operation as happening between the closing brace of the `else` block and the `return`). The `MULTIEQUAL` operation tells us that going forward, `number` can have one value out of a set of possible values defined in previous basic blocks (the `if` and `else` paths). 151 | 152 | Representing the function in static single assignment form, it can be better understood as: 153 | 154 | ``` 155 | int getNumber(){ 156 | 157 | if (rand() > 100){ 158 | number1 = 10; 159 | } 160 | else { 161 | number2 = 20; 162 | } 163 | 164 | number3 = MULTIEQUAL(number1, number2); 165 | 166 | return number3; 167 | } 168 | ``` 169 | 170 | `number1` and `number2` represent SSA instantiations of `number`, and we've inserted a `MULTIEQUAL` operation before the return, indicating that the return value (`number3`) will be one of these prior two values. `MULTIEQUAL` is not constrained to only taking two values; for example, if we had five values which `number` could take before return, we could have `number6 = MULTIEQUAL(number1, number2, number3, number4, number5);`. 171 | 172 | We can handle `MULTIEQUAL` p-code operations by noting that the next node we append to our path will be a phi input, and should be marked accordingly. 
When we print out paths at the end, inputs to the same phi will be marked as such so that end users know that each value is a possible input to the phi. 173 | 174 | ### Analyzing Parent Calls 175 | 176 | In addition to analyzing functions *called* by our current function, our analysis must consider functions *calling* our current function, as values passed to `malloc` could be dependent on parameters to our function. For example: 177 | 178 | ``` 179 | void doMalloc(int int1, int size){ 180 | ... 181 | malloc(size); 182 | } 183 | 184 | ... 185 | 186 | doMalloc(8, 5); 187 | ... 188 | doMalloc(10, 7); 189 | ``` 190 | 191 | In cases like these, we will search for each location in the binary where our current function (`doMalloc`) is called, and analyze the parameter passed to the function which affects the value passed to our target function. In the above case, analysis would return that 5 and 7 are both possible values for `size` in a call to `doMalloc`. 192 | 193 | As our analysis simply considers each site in the binary where the current function we are analyzing is called, it can make mistakes in analysis, because it is not "*context sensitive*". Analysis does not consider the specific context in which functions are called, which can lead to inaccuracies in cases that seem obvious. 
For example, if we have a function: 194 | 195 | ``` 196 | int returnArg0(int arg0){ 197 | return arg0; 198 | } 199 | ``` 200 | 201 | And this function is called in several places: 202 | 203 | ``` 204 | int x = returnArg0(9); 205 | 206 | int y = returnArg0(7); 207 | 208 | printf("%d", returnArg0(8)); 209 | 210 | malloc(returnArg0(11)); 211 | ``` 212 | While to us it's very obvious that the call to `malloc` will receive argument 11, our context-insensitive analysis considers every site in the program in which `returnArg0` is called, so it will return 9, 7, 8, and 11 all as possible values for the argument at this call to `malloc`. 213 | 214 | As with analysis of called functions, analysis of calling functions may be applied recursively. Further, these analyses may be interwoven with one another if, for example, a function is invoked with a parameter derived from a call into another function. 215 | 216 | 217 | ## Ghidra Resources 218 | 219 | 220 | I found Ghidra's included plugins [`ShowConstantUse.java`](https://github.com/NationalSecurityAgency/ghidra/blob/master/Ghidra/Features/Decompiler/ghidra_scripts/ShowConstantUse.java) and [`WindowsResourceReference.java`](https://github.com/NationalSecurityAgency/ghidra/blob/master/Ghidra/Features/Decompiler/ghidra_scripts/WindowsResourceReference.java) very helpful when working with p-code and the Ghidra decompiler. I borrowed some code from these scripts when building this script, and consulted them extensively. 221 | 222 | 223 | ## Output 224 | The data we’re dealing with is probably best visualized with a graph of connected nodes. 
[Unfortunately, the publicly released version of Ghidra does not currently have the necessary external "GraphService" needed to work with graphs](https://github.com/NationalSecurityAgency/ghidra/issues/174), as can be observed by running Ghidra’s included scripts [`GraphAST.java`](https://github.com/NationalSecurityAgency/ghidra/blob/master/Ghidra/Features/Decompiler/ghidra_scripts/GraphAST.java), [`GraphASTAndFlow.java`](https://github.com/NationalSecurityAgency/ghidra/blob/master/Ghidra/Features/Decompiler/ghidra_scripts/GraphASTAndFlow.java), and [`GraphSelectedAST.java`](https://github.com/NationalSecurityAgency/ghidra/blob/master/Ghidra/Features/Decompiler/ghidra_scripts/GraphSelectedAST.java) (a popup alert informs the user "GraphService not found: Please add a graph service provider to your tool"). 225 | 226 | Without a graph provider, I resorted to using ASCII depictions of flow to our "sink" function of `malloc`. Each line of output represents a node on the way to malloc. A series of lines before the node’s value represents how it is derived, with `-` representing a value coming from within a function (either because it returns a constant or calls an external function), `+` representing a value coming from a function parameter, and `Ø` being printed when a series of nodes are inputs to a phi-node. `C:` is used to denote a called "child" function call, `P:` for a calling "parent", and `CONST:` for a terminal constant value. 227 | 228 | For example: 229 | 230 | ``` 231 | int return3(){ 232 | return 3; 233 | } 234 | 235 | … 236 | malloc(return3()); 237 | … 238 | ``` 239 | 240 | Here, we have a call into return3 denoted by `-`, and then inside of that function, a terminal constant value of `3`. 241 | 242 | ``` 243 | SINK: call to malloc in analyzefun @ 0x4008f6 244 | -C: return3 245 | --CONST: 3 (0x3) 246 | ``` 247 | 248 | In a more complex case: 249 | 250 | ``` 251 | int returnmynumberplus5(int x){ 252 | return x+5; 253 | } 254 | ... 
255 | malloc(returnmynumberplus5(10) | 7); 256 | .... 257 | 258 | 259 | SINK: call to malloc in analyzefun @ 0x40091a 260 | -C: returnmynumberplus5 261 | -+P: call analyzefun -> returnmynumberplus5 @ 0x40090d - param #0 262 | -+-CONST: 10 (0xA) 263 | ``` 264 | 265 | Here we have a call into `returnmynumberplus5` denoted with `-`, then `-+` denoting that the return value for `returnmynumberplus5` is derived from a parameter passed to it by a calling "parent" function, and then finally `-+-` for the final constant value of 10 which was determined to be the ultimate terminating constant in this flow to the sink function. 266 | This is a somewhat contrived example, as the script considers all possible callsites for `returnmynumberplus5`, and would in fact list constants (or other values) passed to the function throughout the entire program, if there were other sites where it was invoked - an example of the script not being context sensitive. 267 | 268 | Finally, let's take a look at a case where a phi node is involved: 269 | 270 | ``` 271 | int phidemo(){ 272 | int x = 0; 273 | if (rand() > 100){ 274 | x = 100; 275 | } 276 | else if (rand() > 200){ 277 | x = 700; 278 | } 279 | return x; 280 | } 281 | 282 | ... 283 | malloc(phidemo()); 284 | ... 285 | 286 | 287 | SINK: call to malloc in analyzefun @ 0x4008b0 288 | -C: phidemo 289 | --ØCONST: 100 (0x64) 290 | --ØCONST: 0 (0x0) 291 | --ØCONST: 700 (0x2bc) 292 | 293 | ``` 294 | In this case, we see that the call to `malloc` is the result of a call to `phidemo`. At the next level deeper, we print `-` followed by `Ø`, indicating the three constant values displayed are all phi node inputs, with only one used in returning from `phidemo`. 295 | 296 | 297 | ## Limitations and Future Work 298 | 299 | After all that discussion of what this script can do, we should address the various things that it cannot. 
This proof of concept script has a number of limitations, including: 300 | 301 | * Transfers of control flow between functions not based on `CALL` p-code ops with explicitly resolved targets. This includes use of direct jumps to other functions, transfer through function pointers, or C++ vtables 302 | * Handling pointers 303 | * Recursive functions 304 | * Programs using p-code operations that we do not support 305 | * Context sensitive analysis 306 | * etc... 307 | 308 | That said, implementing support for these other constructions should be possible and fairly easy. Beyond growing out more robust support for various program constructions, there are many other directions this code could be taken in: 309 | * Adding support for actually logging all operations along the way, e.g., letting the user know that the value parsed by `atoi()` is then multiplied by `8`, compared against `0x100`, and then has `2` added to it 310 | * Integrating an SMT solver to allow for more complex analyses of possible values 311 | * Adding context sensitivity 312 | * Modeling process address space 313 | 314 | ## Conclusion 315 | 316 | I hope this blog post has been insightful in elucidating how Ghidra's powerful scripting API, intermediate representation, and built-in data flow analysis can be leveraged together for program analysis. With this script, I've only scratched the surface of what is possible with Ghidra; I hope we'll see more public research on what the tool can do. 317 | 318 | I know this script isn't perfect; please do reach out if you find it useful or have suggestions for improvement. 319 | 320 | > If you have questions about your reverse engineering and security analysis, consider [contacting our team]({{< relref "contact.md" >}}) of experienced security experts to learn more about what you can do. 321 | >
322 | > If you have questions or comments about Ghidra, p-code, training, or otherwise want to get in touch, you can email re-training@riverloopsecurity.com, or contact me directly via open DMs on Twitter at https://twitter.com/0xAlexei 323 | 324 | -------------------------------------------------------------------------------- /Ghidra/PCodeMallocDemo/mallocexample: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/0xAlexei/Publications/bce8a02e258219d773cecc3589421d5a49604bd8/Ghidra/PCodeMallocDemo/mallocexample -------------------------------------------------------------------------------- /Ghidra/PCodeMallocDemo/mallocexample.c: -------------------------------------------------------------------------------- 1 | #include 2 | #include 3 | #include 4 | #include 5 | #include 6 | 7 | 8 | int getNum(){ 9 | return atoi("10"); 10 | } 11 | 12 | 13 | int g = 7; 14 | 15 | int oneOf3(){ 16 | if(rand() > 1){ 17 | return 1; 18 | } 19 | 20 | else if (rand() > 2){ 21 | return 2; 22 | } 23 | return 3; 24 | } 25 | 26 | int getNumber(){ 27 | int x = 6; 28 | if ( rand() < 9) 29 | return 9; 30 | else if (rand() < 100){ 31 | return atoi("8"); 32 | } 33 | else if (rand() == 200){ 34 | return rand(); 35 | } 36 | else if (rand() == 700){ 37 | return oneOf3(); 38 | } 39 | else if (rand() == 9000){ 40 | return 77; 41 | } 42 | return x+5; 43 | } 44 | 45 | 46 | int getNumber2(){ 47 | return getNumber() + atoi("8"); 48 | } 49 | 50 | 51 | int return3(){ 52 | return 3; 53 | } 54 | 55 | int getrand(){ 56 | return rand() + rand(); 57 | } 58 | 59 | 60 | int getarg1(int j){ 61 | return j+4; 62 | } 63 | 64 | int phidemo(){ 65 | int x = 0; 66 | if (rand() > 100){ 67 | x = 100; 68 | } 69 | else if (rand() > 200){ 70 | x = 700; 71 | } 72 | return x; 73 | } 74 | 75 | int analyzefun(char * string, int z){ 76 | int x = 9; 77 | int y = 10; 78 | 79 | 80 | malloc(5); 81 | malloc(x); 82 | malloc(x + y); //ghidra's analysis figures out that 
this is 19 83 | 84 | malloc(return3()); 85 | 86 | malloc(getNumber()+65); 87 | 88 | malloc(getNumber2()); 89 | 90 | malloc(y + return3()); 91 | 92 | malloc(strlen(string)); 93 | 94 | malloc(z); 95 | 96 | malloc(getarg1(888888)); 97 | 98 | malloc(phidemo()); 99 | 100 | int a = 0x4444; 101 | if (rand() > 2){ 102 | a = z; 103 | } 104 | malloc(a); 105 | } 106 | 107 | 108 | int analyzefun2(){ 109 | malloc(5); 110 | } 111 | 112 | int analyzefun3(){ 113 | malloc(rand()); 114 | } 115 | 116 | 117 | int intermediatefunc2(char * string, int y){ 118 | analyzefun(string, y); 119 | } 120 | 121 | int main(int argv, char ** argc) { 122 | 123 | analyzefun("foo", 77); 124 | 125 | int zz = recv(0, NULL, 0, 0); 126 | analyzefun("bar", zz); 127 | 128 | int yy = getrand(); 129 | int jj = getarg1(5); 130 | intermediatefunc2("bar", yy + jj); 131 | 132 | intermediatefunc2("bar", 100); 133 | 134 | intermediatefunc2("baz", 100 + getNum()); 135 | intermediatefunc2("baz", 99 + getNum()); 136 | 137 | analyzefun2(); 138 | analyzefun3(); 139 | 140 | 141 | } -------------------------------------------------------------------------------- /Ghidra/PCodeMallocDemo/output.txt: -------------------------------------------------------------------------------- 1 | MallocTrace.java> Running... 
2 | Found sink function malloc @ 0x400510 3 | Found malloc reference @ 0x4007bb (UNCONDITIONAL_CALL) 4 | Found malloc reference @ 0x4007c8 (UNCONDITIONAL_CALL) 5 | Found malloc reference @ 0x4007da (UNCONDITIONAL_CALL) 6 | Found malloc reference @ 0x4007ee (UNCONDITIONAL_CALL) 7 | Found malloc reference @ 0x400805 (UNCONDITIONAL_CALL) 8 | Found malloc reference @ 0x400819 (UNCONDITIONAL_CALL) 9 | Found malloc reference @ 0x400834 (UNCONDITIONAL_CALL) 10 | Found malloc reference @ 0x400848 (UNCONDITIONAL_CALL) 11 | Found malloc reference @ 0x400855 (UNCONDITIONAL_CALL) 12 | Found malloc reference @ 0x400869 (UNCONDITIONAL_CALL) 13 | Found malloc reference @ 0x40087d (UNCONDITIONAL_CALL) 14 | Found malloc reference @ 0x4008a1 (UNCONDITIONAL_CALL) 15 | Found malloc reference @ 0x4008b2 (UNCONDITIONAL_CALL) 16 | Found malloc reference @ 0x4008c8 (UNCONDITIONAL_CALL) 17 | Found sink function malloc @ 0x602020 18 | Found malloc reference @ 0x601030 (DATA) 19 | Found malloc reference @ 0x400510 (COMPUTED_CALL_TERMINATOR) 20 | 21 | Found 3 functions calling sink function 22 | -> analyzefun2 23 | -> analyzefun3 24 | -> analyzefun 25 | 26 | Found 1 sink function call sites in analyzefun2 27 | 28 | Call @ 0x4008b2 [analyzefun2] to 0x400510 [malloc] (2 pcodeops) 29 | Parameter #1 - (const, 0x5, 8) @ 0x5 30 | isConstant: 5 31 | Adding new child... 32 | 33 | Found 1 sink function call sites in analyzefun3 34 | 35 | Call @ 0x4008c8 [analyzefun3] to 0x400510 [malloc] (2 pcodeops) 36 | Parameter #1 - (register, 0x0, 8) @ 0x0 37 | 0x4008c3 - processOneVarnode: (register, 0x0, 8) INT_SEXT (register, 0x0, 4) 38 | 0x4008be - processOneVarnode: (register, 0x0, 4) CALL (ram, 0x400530, 8) 39 | Located source - call to 400530 [rand] 40 | Adding new child... 41 | Function rand entry @ 0x400530 42 | Found rand return @ 0x400530 43 | 0x400530 - processOneVarnode: (register, 0x0, 4) CALL (ram, 0x602030, 8) 44 | Located source - call to 602030 [rand] 45 | Adding new child... 
46 | Function rand entry @ 0x602030 47 | Found rand return @ 0x602030 48 | --> Could not resolve return value from rand 49 | 50 | 51 | 52 | 53 | 54 | Found 12 sink function call sites in analyzefun 55 | 56 | Call @ 0x4007bb [analyzefun] to 0x400510 [malloc] (2 pcodeops) 57 | Parameter #1 - (const, 0x5, 8) @ 0x5 58 | isConstant: 5 59 | Adding new child... 60 | 61 | Call @ 0x4007c8 [analyzefun] to 0x400510 [malloc] (2 pcodeops) 62 | Parameter #1 - (const, 0x9, 8) @ 0x9 63 | isConstant: 9 64 | Adding new child... 65 | 66 | Call @ 0x4007da [analyzefun] to 0x400510 [malloc] (2 pcodeops) 67 | Parameter #1 - (const, 0x13, 8) @ 0x13 68 | isConstant: 19 69 | Adding new child... 70 | 71 | Call @ 0x4007ee [analyzefun] to 0x400510 [malloc] (2 pcodeops) 72 | Parameter #1 - (register, 0x0, 8) @ 0x0 73 | 0x4007e9 - processOneVarnode: (register, 0x0, 8) INT_SEXT (register, 0x0, 4) 74 | 0x4007e4 - processOneVarnode: (register, 0x0, 4) CALL (ram, 0x400727, 8) 75 | Located source - call to 400727 [return3] 76 | Adding new child... 77 | Function return3 entry @ 0x400727 78 | Found return3 return @ 0x400731 79 | 0x40072b - processOneVarnode: (register, 0x0, 8) COPY (const, 0x3, 8) 80 | processOneVarnode: Addr or Constant! - (const, 0x3, 8) 81 | Adding new child... 82 | 83 | 84 | 85 | 86 | 87 | Call @ 0x400805 [analyzefun] to 0x400510 [malloc] (2 pcodeops) 88 | Parameter #1 - (register, 0x0, 8) @ 0x0 89 | 0x400800 - processOneVarnode: (register, 0x0, 8) INT_SEXT (register, 0x0, 4) 90 | 0x4007fd - processOneVarnode: (register, 0x0, 4) INT_ADD (register, 0x0, 4) , (const, 0x41, 4) 91 | 0x4007f8 - processOneVarnode: (register, 0x0, 4) CALL (ram, 0x400683, 8) 92 | Located source - call to 400683 [getNumber] 93 | Adding new child... 
94 | Function getNumber entry @ 0x400683 95 | Found getNumber return @ 0x4006fe 96 | 0x4006fd - processOneVarnode: (register, 0x0, 8) MULTIEQUAL (register, 0x0, 8) , (register, 0x0, 8) , (register, 0x0, 8) , (register, 0x0, 8) , (register, 0x0, 8) , (register, 0x0, 8) 97 | Processing a MULTIEQUAL with 6 inputs0x40069c - processOneVarnode: (register, 0x0, 8) COPY (const, 0x9, 8) 98 | processOneVarnode: Addr or Constant! - (const, 0x9, 8) 99 | Adding new child... 100 | 0x4006b2 - processOneVarnode: (register, 0x0, 8) PIECE (register, 0x4, 4) , (register, 0x0, 4) 101 | 0x4006b2 - processOneVarnode: (register, 0x4, 4) INDIRECT (const, 0x0, 4) , (const, 0x2f, 4) 102 | USED In INDIRECT --> output (register, 0x4, 4) 103 | PC0 -> (register, 0x20, 8) INT_SUB (register, 0x20, 8) , (const, 0x8, 8) 104 | PC1 -> --- STORE (const, 0x131, 8) , (register, 0x20, 8) , (const, 0x4006b7, 8) 105 | PC2 -> --- CALL (ram, 0x400520, 8) 106 | INDIRECT Associated with call @ 400520 (atoi) 107 | processOneVarnode: Addr or Constant! - (const, 0x2f, 4) 108 | Adding new child... 109 | 0x4006b2 - processOneVarnode: (register, 0x0, 4) CALL (ram, 0x400520, 8) , (unique, 0x100000ac, 8) 110 | Located source - call to 400520 [atoi] 111 | Adding new child... 112 | Function atoi entry @ 0x400520 113 | Found atoi return @ 0x400520 114 | 0x400520 - processOneVarnode: (register, 0x0, 4) CALL (ram, 0x602028, 8) , (register, 0x38, 8) 115 | Located source - call to 602028 [atoi] 116 | Adding new child... 
117 | Function atoi entry @ 0x602028 118 | Found atoi return @ 0x602028 119 | --> Could not resolve return value from atoi 120 | 121 | 122 | 123 | 124 | 0x4006c5 - processOneVarnode: (register, 0x0, 8) PIECE (register, 0x4, 4) , (register, 0x0, 4) 125 | 0x4006c5 - processOneVarnode: (register, 0x4, 4) INDIRECT (const, 0x0, 4) , (const, 0x3d, 4) 126 | USED In INDIRECT --> output (register, 0x4, 4) 127 | PC0 -> (register, 0x20, 8) INT_SUB (register, 0x20, 8) , (const, 0x8, 8) 128 | PC1 -> --- STORE (const, 0x131, 8) , (register, 0x20, 8) , (const, 0x4006ca, 8) 129 | PC2 -> --- CALL (ram, 0x400530, 8) 130 | INDIRECT Associated with call @ 400530 (rand) 131 | processOneVarnode: Addr or Constant! - (const, 0x3d, 4) 132 | Adding new child... 133 | 0x4006c5 - processOneVarnode: (register, 0x0, 4) CALL (ram, 0x400530, 8) 134 | Located source - call to 400530 [rand] 135 | Adding new child... 136 | Function rand entry @ 0x400530 137 | Found rand return @ 0x400530 138 | 0x400530 - processOneVarnode: (register, 0x0, 4) CALL (ram, 0x602030, 8) 139 | Located source - call to 602030 [rand] 140 | Adding new child... 141 | Function rand entry @ 0x602030 142 | Found rand return @ 0x602030 143 | --> Could not resolve return value from rand 144 | 145 | 146 | 147 | 148 | 0x4006dd - processOneVarnode: (register, 0x0, 8) CALL (ram, 0x400656, 8) 149 | Located source - call to 400656 [oneOf3] 150 | Adding new child... 151 | Function oneOf3 entry @ 0x400656 152 | Found oneOf3 return @ 0x400682 153 | 0x400681 - processOneVarnode: (register, 0x0, 8) MULTIEQUAL (register, 0x0, 8) , (register, 0x0, 8) , (register, 0x0, 8) 154 | Processing a MULTIEQUAL with 3 inputs0x400664 - processOneVarnode: (register, 0x0, 8) COPY (const, 0x1, 8) 155 | processOneVarnode: Addr or Constant! - (const, 0x1, 8) 156 | Adding new child... 157 | 0x400675 - processOneVarnode: (register, 0x0, 8) COPY (const, 0x2, 8) 158 | processOneVarnode: Addr or Constant! - (const, 0x2, 8) 159 | Adding new child... 
160 | 0x40067c - processOneVarnode: (register, 0x0, 8) COPY (const, 0x3, 8) 161 | processOneVarnode: Addr or Constant! - (const, 0x3, 8) 162 | Adding new child... 163 | 164 | 165 | 166 | 167 | 0x4006f0 - processOneVarnode: (register, 0x0, 8) COPY (const, 0x4d, 8) 168 | processOneVarnode: Addr or Constant! - (const, 0x4d, 8) 169 | Adding new child... 170 | 0x4006fa - processOneVarnode: (register, 0x0, 8) COPY (const, 0xb, 8) 171 | processOneVarnode: Addr or Constant! - (const, 0xb, 8) 172 | Adding new child... 173 | 174 | 175 | 176 | 177 | 178 | Call @ 0x400819 [analyzefun] to 0x400510 [malloc] (2 pcodeops) 179 | Parameter #1 - (register, 0x0, 8) @ 0x0 180 | 0x400814 - processOneVarnode: (register, 0x0, 8) INT_SEXT (register, 0x0, 4) 181 | 0x40080f - processOneVarnode: (register, 0x0, 4) CALL (ram, 0x4006ff, 8) 182 | Located source - call to 4006ff [getNumber2] 183 | Adding new child... 184 | Function getNumber2 entry @ 0x4006ff 185 | Found getNumber2 return @ 0x400726 186 | 0x40071e - processOneVarnode: (register, 0x0, 8) COPY (register, 0x0, 8) 187 | 0x40071e - processOneVarnode: (register, 0x0, 8) INT_ZEXT (unique, 0x10000042, 4) 188 | 0x40071e - processOneVarnode: (unique, 0x10000042, 4) CAST (register, 0x0, 4) 189 | 0x40071e - processOneVarnode: (register, 0x0, 4) INT_ADD (register, 0x0, 4) , (register, 0x0, 4) 190 | 0x400719 - processOneVarnode: (register, 0x0, 4) CALL (ram, 0x400520, 8) , (unique, 0x1000003a, 8) 191 | Located source - call to 400520 [atoi] 192 | Adding new child... 193 | Function atoi entry @ 0x400520 194 | Found atoi return @ 0x400520 195 | 0x400520 - processOneVarnode: (register, 0x0, 4) CALL (ram, 0x602028, 8) , (register, 0x38, 8) 196 | Located source - call to 602028 [atoi] 197 | Adding new child... 
198 | Function atoi entry @ 0x602028 199 | Found atoi return @ 0x602028 200 | --> Could not resolve return value from atoi 201 | 202 | 203 | 204 | 205 | 0x40070d - processOneVarnode: (register, 0x0, 4) CALL (ram, 0x400683, 8) 206 | Located source - call to 400683 [getNumber] 207 | Adding new child... 208 | Function getNumber entry @ 0x400683 209 | Found getNumber return @ 0x4006fe 210 | 0x4006fd - processOneVarnode: (register, 0x0, 8) MULTIEQUAL (register, 0x0, 8) , (register, 0x0, 8) , (register, 0x0, 8) , (register, 0x0, 8) , (register, 0x0, 8) , (register, 0x0, 8) 211 | Processing a MULTIEQUAL with 6 inputs0x40069c - processOneVarnode: (register, 0x0, 8) COPY (const, 0x9, 8) 212 | processOneVarnode: Addr or Constant! - (const, 0x9, 8) 213 | Adding new child... 214 | 0x4006b2 - processOneVarnode: (register, 0x0, 8) PIECE (register, 0x4, 4) , (register, 0x0, 4) 215 | 0x4006b2 - processOneVarnode: (register, 0x4, 4) INDIRECT (const, 0x0, 4) , (const, 0x2f, 4) 216 | USED In INDIRECT --> output (register, 0x4, 4) 217 | PC0 -> (register, 0x20, 8) INT_SUB (register, 0x20, 8) , (const, 0x8, 8) 218 | PC1 -> --- STORE (const, 0x131, 8) , (register, 0x20, 8) , (const, 0x4006b7, 8) 219 | PC2 -> --- CALL (ram, 0x400520, 8) 220 | INDIRECT Associated with call @ 400520 (atoi) 221 | processOneVarnode: Addr or Constant! - (const, 0x2f, 4) 222 | Adding new child... 223 | 0x4006b2 - processOneVarnode: (register, 0x0, 4) CALL (ram, 0x400520, 8) , (unique, 0x100000ac, 8) 224 | Located source - call to 400520 [atoi] 225 | Adding new child... 226 | Function atoi entry @ 0x400520 227 | Found atoi return @ 0x400520 228 | 0x400520 - processOneVarnode: (register, 0x0, 4) CALL (ram, 0x602028, 8) , (register, 0x38, 8) 229 | Located source - call to 602028 [atoi] 230 | Adding new child... 
231 | Function atoi entry @ 0x602028 232 | Found atoi return @ 0x602028 233 | --> Could not resolve return value from atoi 234 | 235 | 236 | 237 | 238 | 0x4006c5 - processOneVarnode: (register, 0x0, 8) PIECE (register, 0x4, 4) , (register, 0x0, 4) 239 | 0x4006c5 - processOneVarnode: (register, 0x4, 4) INDIRECT (const, 0x0, 4) , (const, 0x3d, 4) 240 | USED In INDIRECT --> output (register, 0x4, 4) 241 | PC0 -> (register, 0x20, 8) INT_SUB (register, 0x20, 8) , (const, 0x8, 8) 242 | PC1 -> --- STORE (const, 0x131, 8) , (register, 0x20, 8) , (const, 0x4006ca, 8) 243 | PC2 -> --- CALL (ram, 0x400530, 8) 244 | INDIRECT Associated with call @ 400530 (rand) 245 | processOneVarnode: Addr or Constant! - (const, 0x3d, 4) 246 | Adding new child... 247 | 0x4006c5 - processOneVarnode: (register, 0x0, 4) CALL (ram, 0x400530, 8) 248 | Located source - call to 400530 [rand] 249 | Adding new child... 250 | Function rand entry @ 0x400530 251 | Found rand return @ 0x400530 252 | 0x400530 - processOneVarnode: (register, 0x0, 4) CALL (ram, 0x602030, 8) 253 | Located source - call to 602030 [rand] 254 | Adding new child... 255 | Function rand entry @ 0x602030 256 | Found rand return @ 0x602030 257 | --> Could not resolve return value from rand 258 | 259 | 260 | 261 | 262 | 0x4006dd - processOneVarnode: (register, 0x0, 8) CALL (ram, 0x400656, 8) 263 | Located source - call to 400656 [oneOf3] 264 | Adding new child... 265 | Function oneOf3 entry @ 0x400656 266 | Found oneOf3 return @ 0x400682 267 | 0x400681 - processOneVarnode: (register, 0x0, 8) MULTIEQUAL (register, 0x0, 8) , (register, 0x0, 8) , (register, 0x0, 8) 268 | Processing a MULTIEQUAL with 3 inputs0x400664 - processOneVarnode: (register, 0x0, 8) COPY (const, 0x1, 8) 269 | processOneVarnode: Addr or Constant! - (const, 0x1, 8) 270 | Adding new child... 271 | 0x400675 - processOneVarnode: (register, 0x0, 8) COPY (const, 0x2, 8) 272 | processOneVarnode: Addr or Constant! - (const, 0x2, 8) 273 | Adding new child... 
274 | 0x40067c - processOneVarnode: (register, 0x0, 8) COPY (const, 0x3, 8) 275 | processOneVarnode: Addr or Constant! - (const, 0x3, 8) 276 | Adding new child... 277 | 278 | 279 | 280 | 281 | 0x4006f0 - processOneVarnode: (register, 0x0, 8) COPY (const, 0x4d, 8) 282 | processOneVarnode: Addr or Constant! - (const, 0x4d, 8) 283 | Adding new child... 284 | 0x4006fa - processOneVarnode: (register, 0x0, 8) COPY (const, 0xb, 8) 285 | processOneVarnode: Addr or Constant! - (const, 0xb, 8) 286 | Adding new child... 287 | 288 | 289 | 290 | 291 | 292 | 293 | 294 | 295 | 296 | Call @ 0x400834 [analyzefun] to 0x400510 [malloc] (2 pcodeops) 297 | Parameter #1 - (register, 0x0, 8) @ 0x0 298 | 0x40082f - processOneVarnode: (register, 0x0, 8) INT_SEXT (register, 0x0, 4) 299 | 0x40082d - processOneVarnode: (register, 0x0, 4) INT_ADD (register, 0x0, 4) , (const, 0xa, 4) 300 | 0x400823 - processOneVarnode: (register, 0x0, 4) CALL (ram, 0x400727, 8) 301 | Located source - call to 400727 [return3] 302 | Adding new child... 303 | Function return3 entry @ 0x400727 304 | Found return3 return @ 0x400731 305 | 0x40072b - processOneVarnode: (register, 0x0, 8) COPY (const, 0x3, 8) 306 | processOneVarnode: Addr or Constant! - (const, 0x3, 8) 307 | Adding new child... 308 | 309 | 310 | 311 | 312 | 313 | Call @ 0x400848 [analyzefun] to 0x400510 [malloc] (2 pcodeops) 314 | Parameter #1 - (register, 0x0, 8) @ 0x0 315 | 0x400840 - processOneVarnode: (register, 0x0, 8) CALL (ram, 0x4004f0, 8) , (register, 0x38, 8) 316 | Located source - call to 4004f0 [strlen] 317 | Adding new child... 318 | Function strlen entry @ 0x4004f0 319 | Found strlen return @ 0x4004f0 320 | 0x4004f0 - processOneVarnode: (register, 0x0, 8) CALL (ram, 0x602008, 8) , (register, 0x38, 8) 321 | Located source - call to 602008 [strlen] 322 | Adding new child... 
323 | Function strlen entry @ 0x602008 324 | Found strlen return @ 0x602008 325 | --> Could not resolve return value from strlen 326 | 327 | 328 | 329 | 330 | 331 | Call @ 0x400855 [analyzefun] to 0x400510 [malloc] (2 pcodeops) 332 | Parameter #1 - (register, 0x0, 8) @ 0x0 333 | 0x400850 - processOneVarnode: (register, 0x0, 8) INT_SEXT (register, 0x30, 4) 334 | Varnode is function parameter -> parameter #1... (register, 0x30, 4) 335 | Could not get calling function @ 0x0 336 | Could not get calling function @ 0x400ac4 337 | Could not get calling function @ 0x400c78 338 | analyzeCallSites(..., analyzefun, ...) - found calling function @ 0x4008eb [intermediatefunc2] 339 | found unconditional call intermediatefunc2 -> analyzefun 340 | 341 | Call @ 0x4008eb [intermediatefunc2] to 0x400799 [analyzefun] (4 pcodeops) 342 | Parameter #2 - (register, 0x30, 8) @ 0x30 343 | 0x4008e6 - processOneVarnode: (register, 0x30, 8) INT_ZEXT (register, 0x30, 4) 344 | Varnode is function parameter -> parameter #1... (register, 0x30, 4) 345 | Could not get calling function @ 0x0 346 | Could not get calling function @ 0x400adc 347 | Could not get calling function @ 0x400cd8 348 | analyzeCallSites(..., intermediatefunc2, ...) - found calling function @ 0x400965 [main] 349 | found unconditional call main -> intermediatefunc2 350 | 351 | Call @ 0x400965 [main] to 0x4008d0 [intermediatefunc2] (4 pcodeops) 352 | Parameter #2 - (register, 0x30, 8) @ 0x30 353 | 0x40095e - processOneVarnode: (register, 0x30, 8) INT_ZEXT (register, 0x0, 4) 354 | 0x40095c - processOneVarnode: (register, 0x0, 4) INT_ADD (register, 0x0, 4) , (register, 0x0, 4) 355 | 0x40094e - processOneVarnode: (register, 0x0, 4) CALL (ram, 0x400750, 8) , (const, 0x5, 8) 356 | Located source - call to 400750 [getarg1] 357 | Adding new child... 
358 | Function getarg1 entry @ 0x400750 359 | Found getarg1 return @ 0x40075e 360 | 0x40075a - processOneVarnode: (register, 0x0, 8) COPY (register, 0x0, 8) 361 | 0x40075a - processOneVarnode: (register, 0x0, 8) INT_ZEXT (register, 0x0, 4) 362 | 0x40075a - processOneVarnode: (register, 0x0, 4) INT_ADD (register, 0x38, 4) , (const, 0x4, 4) 363 | Varnode is function parameter -> parameter #0... (register, 0x38, 4) 364 | Could not get calling function @ 0x0 365 | Could not get calling function @ 0x400ab4 366 | Could not get calling function @ 0x400c38 367 | analyzeCallSites(..., getarg1, ...) - found calling function @ 0x40085f [analyzefun] 368 | found unconditional call analyzefun -> getarg1 369 | 370 | Call @ 0x40085f [analyzefun] to 0x400750 [getarg1] (2 pcodeops) 371 | Parameter #1 - (const, 0xd9038, 8) @ 0xd9038 372 | isConstant: 888888 373 | Adding new child... 374 | Adding new parent... 375 | analyzeCallSites(..., getarg1, ...) - found calling function @ 0x40094e [main] 376 | found unconditional call main -> getarg1 377 | 378 | Call @ 0x40094e [main] to 0x400750 [getarg1] (2 pcodeops) 379 | Parameter #1 - (const, 0x5, 8) @ 0x5 380 | isConstant: 5 381 | Adding new child... 382 | Adding new parent... 383 | 384 | 385 | 386 | 387 | 0x400941 - processOneVarnode: (register, 0x0, 4) CALL (ram, 0x400732, 8) 388 | Located source - call to 400732 [getrand] 389 | Adding new child... 390 | Function getrand entry @ 0x400732 391 | Found getrand return @ 0x40074f 392 | 0x400747 - processOneVarnode: (register, 0x0, 8) COPY (register, 0x0, 8) 393 | 0x400747 - processOneVarnode: (register, 0x0, 8) INT_ZEXT (unique, 0x1000003a, 4) 394 | 0x400747 - processOneVarnode: (unique, 0x1000003a, 4) CAST (register, 0x0, 4) 395 | 0x400747 - processOneVarnode: (register, 0x0, 4) INT_ADD (register, 0x0, 4) , (register, 0x0, 4) 396 | 0x400742 - processOneVarnode: (register, 0x0, 4) CALL (ram, 0x400530, 8) 397 | Located source - call to 400530 [rand] 398 | Adding new child... 
399 | Function rand entry @ 0x400530 400 | Found rand return @ 0x400530 401 | 0x400530 - processOneVarnode: (register, 0x0, 4) CALL (ram, 0x602030, 8) 402 | Located source - call to 602030 [rand] 403 | Adding new child... 404 | Function rand entry @ 0x602030 405 | Found rand return @ 0x602030 406 | --> Could not resolve return value from rand 407 | 408 | 409 | 410 | 411 | 0x40073b - processOneVarnode: (register, 0x0, 4) CALL (ram, 0x400530, 8) 412 | Located source - call to 400530 [rand] 413 | Adding new child... 414 | Function rand entry @ 0x400530 415 | Found rand return @ 0x400530 416 | 0x400530 - processOneVarnode: (register, 0x0, 4) CALL (ram, 0x602030, 8) 417 | Located source - call to 602030 [rand] 418 | Adding new child... 419 | Function rand entry @ 0x602030 420 | Found rand return @ 0x602030 421 | --> Could not resolve return value from rand 422 | 423 | 424 | 425 | 426 | 427 | 428 | 429 | 430 | Adding new parent... 431 | analyzeCallSites(..., intermediatefunc2, ...) - found calling function @ 0x400974 [main] 432 | found unconditional call main -> intermediatefunc2 433 | 434 | Call @ 0x400974 [main] to 0x4008d0 [intermediatefunc2] (3 pcodeops) 435 | Parameter #2 - (const, 0x64, 8) @ 0x64 436 | isConstant: 100 437 | Adding new child... 438 | Adding new parent... 439 | analyzeCallSites(..., intermediatefunc2, ...) - found calling function @ 0x40098d [main] 440 | found unconditional call main -> intermediatefunc2 441 | 442 | Call @ 0x40098d [main] to 0x4008d0 [intermediatefunc2] (3 pcodeops) 443 | Parameter #2 - (register, 0x30, 8) @ 0x30 444 | 0x400986 - processOneVarnode: (register, 0x30, 8) INT_ZEXT (register, 0x0, 4) 445 | 0x400983 - processOneVarnode: (register, 0x0, 4) INT_ADD (register, 0x0, 4) , (const, 0x64, 4) 446 | 0x40097e - processOneVarnode: (register, 0x0, 4) CALL (ram, 0x400646, 8) 447 | Located source - call to 400646 [getNum] 448 | Adding new child... 
449 | Function getNum entry @ 0x400646 450 | Found getNum return @ 0x400655 451 | --> Could not resolve return value from getNum 452 | Adding new parent... 453 | analyzeCallSites(..., intermediatefunc2, ...) - found calling function @ 0x4009a6 [main] 454 | found unconditional call main -> intermediatefunc2 455 | 456 | Call @ 0x4009a6 [main] to 0x4008d0 [intermediatefunc2] (3 pcodeops) 457 | Parameter #2 - (register, 0x30, 8) @ 0x30 458 | 0x40099f - processOneVarnode: (register, 0x30, 8) INT_ZEXT (register, 0x0, 4) 459 | 0x40099c - processOneVarnode: (register, 0x0, 4) INT_ADD (register, 0x0, 4) , (const, 0x63, 4) 460 | 0x400997 - processOneVarnode: (register, 0x0, 4) CALL (ram, 0x400646, 8) 461 | Located source - call to 400646 [getNum] 462 | Adding new child... 463 | Function getNum entry @ 0x400646 464 | Found getNum return @ 0x400655 465 | --> Could not resolve return value from getNum 466 | Adding new parent... 467 | Adding new parent... 468 | analyzeCallSites(..., analyzefun, ...) - found calling function @ 0x40090c [main] 469 | found unconditional call main -> analyzefun 470 | 471 | Call @ 0x40090c [main] to 0x400799 [analyzefun] (3 pcodeops) 472 | Parameter #2 - (const, 0x4d, 8) @ 0x4d 473 | isConstant: 77 474 | Adding new child... 475 | Adding new parent... 476 | analyzeCallSites(..., analyzefun, ...) - found calling function @ 0x400937 [main] 477 | found unconditional call main -> analyzefun 478 | 479 | Call @ 0x400937 [main] to 0x400799 [analyzefun] (3 pcodeops) 480 | Parameter #2 - (register, 0x30, 8) @ 0x30 481 | 0x400930 - processOneVarnode: (register, 0x30, 8) INT_AND (register, 0x0, 8) , (const, 0xffffffff, 8) 482 | 0x400925 - processOneVarnode: (register, 0x0, 8) CALL (ram, 0x4004e0, 8) , (const, 0x0, 4) , (const, 0x0, 8) , (const, 0x0, 8) , (const, 0x0, 4) 483 | Located source - call to 4004e0 [recv] 484 | Adding new child... 
485 | Function recv entry @ 0x4004e0 486 | Found recv return @ 0x4004e0 487 | 0x4004e0 - processOneVarnode: (register, 0x0, 8) CALL (ram, 0x602000, 8) , (register, 0x38, 4) , (register, 0x30, 8) , (register, 0x10, 8) , (register, 0x8, 4) 488 | Located source - call to 602000 [recv] 489 | Adding new child... 490 | Function recv entry @ 0x602000 491 | Found recv return @ 0x602000 492 | --> Could not resolve return value from recv 493 | 494 | 495 | 496 | 497 | Adding new parent... 498 | 499 | Call @ 0x400869 [analyzefun] to 0x400510 [malloc] (2 pcodeops) 500 | Parameter #1 - (register, 0x0, 8) @ 0x0 501 | 0x400864 - processOneVarnode: (register, 0x0, 8) INT_SEXT (register, 0x0, 4) 502 | 0x40085f - processOneVarnode: (register, 0x0, 4) CALL (ram, 0x400750, 8) , (const, 0xd9038, 8) 503 | Located source - call to 400750 [getarg1] 504 | Adding new child... 505 | Function getarg1 entry @ 0x400750 506 | Found getarg1 return @ 0x40075e 507 | 0x40075a - processOneVarnode: (register, 0x0, 8) COPY (register, 0x0, 8) 508 | 0x40075a - processOneVarnode: (register, 0x0, 8) INT_ZEXT (register, 0x0, 4) 509 | 0x40075a - processOneVarnode: (register, 0x0, 4) INT_ADD (register, 0x38, 4) , (const, 0x4, 4) 510 | Varnode is function parameter -> parameter #0... (register, 0x38, 4) 511 | Could not get calling function @ 0x0 512 | Could not get calling function @ 0x400ab4 513 | Could not get calling function @ 0x400c38 514 | analyzeCallSites(..., getarg1, ...) - found calling function @ 0x40085f [analyzefun] 515 | found unconditional call analyzefun -> getarg1 516 | 517 | Call @ 0x40085f [analyzefun] to 0x400750 [getarg1] (2 pcodeops) 518 | Parameter #1 - (const, 0xd9038, 8) @ 0xd9038 519 | isConstant: 888888 520 | Adding new child... 521 | Adding new parent... 522 | analyzeCallSites(..., getarg1, ...) 
- found calling function @ 0x40094e [main] 523 | found unconditional call main -> getarg1 524 | 525 | Call @ 0x40094e [main] to 0x400750 [getarg1] (2 pcodeops) 526 | Parameter #1 - (const, 0x5, 8) @ 0x5 527 | isConstant: 5 528 | Adding new child... 529 | Adding new parent... 530 | 531 | 532 | 533 | 534 | 535 | Call @ 0x40087d [analyzefun] to 0x400510 [malloc] (2 pcodeops) 536 | Parameter #1 - (register, 0x0, 8) @ 0x0 537 | 0x400878 - processOneVarnode: (register, 0x0, 8) INT_SEXT (register, 0x0, 4) 538 | 0x400873 - processOneVarnode: (register, 0x0, 4) CALL (ram, 0x40075f, 8) 539 | Located source - call to 40075f [phidemo] 540 | Adding new child... 541 | Function phidemo entry @ 0x40075f 542 | Found phidemo return @ 0x400798 543 | 0x400794 - processOneVarnode: (register, 0x0, 8) COPY (register, 0x0, 8) 544 | 0x400794 - processOneVarnode: (register, 0x0, 8) INT_ZEXT (stack, 0xfffffffffffffff4, 4) 545 | 0x400794 - processOneVarnode: (stack, 0xfffffffffffffff4, 4) MULTIEQUAL (stack, 0xfffffffffffffff4, 4) , (stack, 0xfffffffffffffff4, 4) , (stack, 0xfffffffffffffff4, 4) 546 | Processing a MULTIEQUAL with 3 inputs0x400778 - processOneVarnode: (stack, 0xfffffffffffffff4, 4) COPY (const, 0x64, 4) 547 | processOneVarnode: Addr or Constant! - (const, 0x64, 4) 548 | Adding new child... 549 | 0x400767 - processOneVarnode: (stack, 0xfffffffffffffff4, 4) COPY (const, 0x0, 4) 550 | processOneVarnode: Addr or Constant! - (const, 0x0, 4) 551 | Adding new child... 552 | 0x40078d - processOneVarnode: (stack, 0xfffffffffffffff4, 4) COPY (const, 0x2bc, 4) 553 | processOneVarnode: Addr or Constant! - (const, 0x2bc, 4) 554 | Adding new child... 
555 | 556 | 557 | 558 | 559 | 560 | Call @ 0x4008a1 [analyzefun] to 0x400510 [malloc] (2 pcodeops) 561 | Parameter #1 - (register, 0x0, 8) @ 0x0 562 | 0x40089c - processOneVarnode: (register, 0x0, 8) INT_SEXT (stack, 0xffffffffffffffec, 4) 563 | 0x400899 - processOneVarnode: (stack, 0xffffffffffffffec, 4) MULTIEQUAL (stack, 0xffffffffffffffec, 4) , (unique, 0x10000169, 4) 564 | Processing a MULTIEQUAL with 2 inputs0x400882 - processOneVarnode: (stack, 0xffffffffffffffec, 4) COPY (const, 0x4444, 4) 565 | processOneVarnode: Addr or Constant! - (const, 0x4444, 4) 566 | Adding new child... 567 | 0x400896 - processOneVarnode: (unique, 0x10000169, 4) COPY (register, 0x30, 4) 568 | Varnode is function parameter -> parameter #1... (register, 0x30, 4) 569 | Could not get calling function @ 0x0 570 | Could not get calling function @ 0x400ac4 571 | Could not get calling function @ 0x400c78 572 | analyzeCallSites(..., analyzefun, ...) - found calling function @ 0x4008eb [intermediatefunc2] 573 | found unconditional call intermediatefunc2 -> analyzefun 574 | 575 | Call @ 0x4008eb [intermediatefunc2] to 0x400799 [analyzefun] (4 pcodeops) 576 | Parameter #2 - (register, 0x30, 8) @ 0x30 577 | 0x4008e6 - processOneVarnode: (register, 0x30, 8) INT_ZEXT (register, 0x30, 4) 578 | Varnode is function parameter -> parameter #1... (register, 0x30, 4) 579 | Could not get calling function @ 0x0 580 | Could not get calling function @ 0x400adc 581 | Could not get calling function @ 0x400cd8 582 | analyzeCallSites(..., intermediatefunc2, ...) 
- found calling function @ 0x400965 [main] 583 | found unconditional call main -> intermediatefunc2 584 | 585 | Call @ 0x400965 [main] to 0x4008d0 [intermediatefunc2] (4 pcodeops) 586 | Parameter #2 - (register, 0x30, 8) @ 0x30 587 | 0x40095e - processOneVarnode: (register, 0x30, 8) INT_ZEXT (register, 0x0, 4) 588 | 0x40095c - processOneVarnode: (register, 0x0, 4) INT_ADD (register, 0x0, 4) , (register, 0x0, 4) 589 | 0x40094e - processOneVarnode: (register, 0x0, 4) CALL (ram, 0x400750, 8) , (const, 0x5, 8) 590 | Located source - call to 400750 [getarg1] 591 | Adding new child... 592 | Function getarg1 entry @ 0x400750 593 | Found getarg1 return @ 0x40075e 594 | 0x40075a - processOneVarnode: (register, 0x0, 8) COPY (register, 0x0, 8) 595 | 0x40075a - processOneVarnode: (register, 0x0, 8) INT_ZEXT (register, 0x0, 4) 596 | 0x40075a - processOneVarnode: (register, 0x0, 4) INT_ADD (register, 0x38, 4) , (const, 0x4, 4) 597 | Varnode is function parameter -> parameter #0... (register, 0x38, 4) 598 | Could not get calling function @ 0x0 599 | Could not get calling function @ 0x400ab4 600 | Could not get calling function @ 0x400c38 601 | analyzeCallSites(..., getarg1, ...) - found calling function @ 0x40085f [analyzefun] 602 | found unconditional call analyzefun -> getarg1 603 | 604 | Call @ 0x40085f [analyzefun] to 0x400750 [getarg1] (2 pcodeops) 605 | Parameter #1 - (const, 0xd9038, 8) @ 0xd9038 606 | isConstant: 888888 607 | Adding new child... 608 | Adding new parent... 609 | analyzeCallSites(..., getarg1, ...) - found calling function @ 0x40094e [main] 610 | found unconditional call main -> getarg1 611 | 612 | Call @ 0x40094e [main] to 0x400750 [getarg1] (2 pcodeops) 613 | Parameter #1 - (const, 0x5, 8) @ 0x5 614 | isConstant: 5 615 | Adding new child... 616 | Adding new parent... 617 | 618 | 619 | 620 | 621 | 0x400941 - processOneVarnode: (register, 0x0, 4) CALL (ram, 0x400732, 8) 622 | Located source - call to 400732 [getrand] 623 | Adding new child... 
624 | Function getrand entry @ 0x400732 625 | Found getrand return @ 0x40074f 626 | 0x400747 - processOneVarnode: (register, 0x0, 8) COPY (register, 0x0, 8) 627 | 0x400747 - processOneVarnode: (register, 0x0, 8) INT_ZEXT (unique, 0x1000003a, 4) 628 | 0x400747 - processOneVarnode: (unique, 0x1000003a, 4) CAST (register, 0x0, 4) 629 | 0x400747 - processOneVarnode: (register, 0x0, 4) INT_ADD (register, 0x0, 4) , (register, 0x0, 4) 630 | 0x400742 - processOneVarnode: (register, 0x0, 4) CALL (ram, 0x400530, 8) 631 | Located source - call to 400530 [rand] 632 | Adding new child... 633 | Function rand entry @ 0x400530 634 | Found rand return @ 0x400530 635 | 0x400530 - processOneVarnode: (register, 0x0, 4) CALL (ram, 0x602030, 8) 636 | Located source - call to 602030 [rand] 637 | Adding new child... 638 | Function rand entry @ 0x602030 639 | Found rand return @ 0x602030 640 | --> Could not resolve return value from rand 641 | 642 | 643 | 644 | 645 | 0x40073b - processOneVarnode: (register, 0x0, 4) CALL (ram, 0x400530, 8) 646 | Located source - call to 400530 [rand] 647 | Adding new child... 648 | Function rand entry @ 0x400530 649 | Found rand return @ 0x400530 650 | 0x400530 - processOneVarnode: (register, 0x0, 4) CALL (ram, 0x602030, 8) 651 | Located source - call to 602030 [rand] 652 | Adding new child... 653 | Function rand entry @ 0x602030 654 | Found rand return @ 0x602030 655 | --> Could not resolve return value from rand 656 | 657 | 658 | 659 | 660 | 661 | 662 | 663 | 664 | Adding new parent... 665 | analyzeCallSites(..., intermediatefunc2, ...) - found calling function @ 0x400974 [main] 666 | found unconditional call main -> intermediatefunc2 667 | 668 | Call @ 0x400974 [main] to 0x4008d0 [intermediatefunc2] (3 pcodeops) 669 | Parameter #2 - (const, 0x64, 8) @ 0x64 670 | isConstant: 100 671 | Adding new child... 672 | Adding new parent... 673 | analyzeCallSites(..., intermediatefunc2, ...) 
- found calling function @ 0x40098d [main] 674 | found unconditional call main -> intermediatefunc2 675 | 676 | Call @ 0x40098d [main] to 0x4008d0 [intermediatefunc2] (3 pcodeops) 677 | Parameter #2 - (register, 0x30, 8) @ 0x30 678 | 0x400986 - processOneVarnode: (register, 0x30, 8) INT_ZEXT (register, 0x0, 4) 679 | 0x400983 - processOneVarnode: (register, 0x0, 4) INT_ADD (register, 0x0, 4) , (const, 0x64, 4) 680 | 0x40097e - processOneVarnode: (register, 0x0, 4) CALL (ram, 0x400646, 8) 681 | Located source - call to 400646 [getNum] 682 | Adding new child... 683 | Function getNum entry @ 0x400646 684 | Found getNum return @ 0x400655 685 | --> Could not resolve return value from getNum 686 | Adding new parent... 687 | analyzeCallSites(..., intermediatefunc2, ...) - found calling function @ 0x4009a6 [main] 688 | found unconditional call main -> intermediatefunc2 689 | 690 | Call @ 0x4009a6 [main] to 0x4008d0 [intermediatefunc2] (3 pcodeops) 691 | Parameter #2 - (register, 0x30, 8) @ 0x30 692 | 0x40099f - processOneVarnode: (register, 0x30, 8) INT_ZEXT (register, 0x0, 4) 693 | 0x40099c - processOneVarnode: (register, 0x0, 4) INT_ADD (register, 0x0, 4) , (const, 0x63, 4) 694 | 0x400997 - processOneVarnode: (register, 0x0, 4) CALL (ram, 0x400646, 8) 695 | Located source - call to 400646 [getNum] 696 | Adding new child... 697 | Function getNum entry @ 0x400646 698 | Found getNum return @ 0x400655 699 | --> Could not resolve return value from getNum 700 | Adding new parent... 701 | Adding new parent... 702 | analyzeCallSites(..., analyzefun, ...) - found calling function @ 0x40090c [main] 703 | found unconditional call main -> analyzefun 704 | 705 | Call @ 0x40090c [main] to 0x400799 [analyzefun] (3 pcodeops) 706 | Parameter #2 - (const, 0x4d, 8) @ 0x4d 707 | isConstant: 77 708 | Adding new child... 709 | Adding new parent... 710 | analyzeCallSites(..., analyzefun, ...) 
- found calling function @ 0x400937 [main] 711 | found unconditional call main -> analyzefun 712 | 713 | Call @ 0x400937 [main] to 0x400799 [analyzefun] (3 pcodeops) 714 | Parameter #2 - (register, 0x30, 8) @ 0x30 715 | 0x400930 - processOneVarnode: (register, 0x30, 8) INT_AND (register, 0x0, 8) , (const, 0xffffffff, 8) 716 | 0x400925 - processOneVarnode: (register, 0x0, 8) CALL (ram, 0x4004e0, 8) , (const, 0x0, 4) , (const, 0x0, 8) , (const, 0x0, 8) , (const, 0x0, 4) 717 | Located source - call to 4004e0 [recv] 718 | Adding new child... 719 | Function recv entry @ 0x4004e0 720 | Found recv return @ 0x4004e0 721 | 0x4004e0 - processOneVarnode: (register, 0x0, 8) CALL (ram, 0x602000, 8) , (register, 0x38, 4) , (register, 0x30, 8) , (register, 0x10, 8) , (register, 0x8, 4) 722 | Located source - call to 602000 [recv] 723 | Adding new child... 724 | Function recv entry @ 0x602000 725 | Found recv return @ 0x602000 726 | --> Could not resolve return value from recv 727 | 728 | 729 | 730 | 731 | Adding new parent... 
732 | 733 | 734 | 735 | 736 | 737 | --------------------- 738 | 739 | PRINTING OUTPUTS 740 | 741 | 742 | 743 | SINK: call to malloc in analyzefun2 @ 0x4008b2 744 | -CONST: 5 (0x5) 745 | 746 | 747 | 748 | ------------- 749 | 750 | 751 | SINK: call to malloc in analyzefun3 @ 0x4008c8 752 | -C: rand 753 | --C: rand 754 | 755 | 756 | 757 | ------------- 758 | 759 | 760 | SINK: call to malloc in analyzefun @ 0x4007bb 761 | -CONST: 5 (0x5) 762 | 763 | 764 | 765 | ------------- 766 | 767 | 768 | SINK: call to malloc in analyzefun @ 0x4007c8 769 | -CONST: 9 (0x9) 770 | 771 | 772 | 773 | ------------- 774 | 775 | 776 | SINK: call to malloc in analyzefun @ 0x4007da 777 | -CONST: 19 (0x13) 778 | 779 | 780 | 781 | ------------- 782 | 783 | 784 | SINK: call to malloc in analyzefun @ 0x4007ee 785 | -C: return3 786 | --CONST: 3 (0x3) 787 | 788 | 789 | 790 | ------------- 791 | 792 | 793 | SINK: call to malloc in analyzefun @ 0x400805 794 | -C: getNumber 795 | --ØCONST: 9 (0x9) 796 | --ØCONST: 47 (0x2f) 797 | --ØC: atoi 798 | --Ø-C: atoi 799 | --ØCONST: 61 (0x3d) 800 | --ØC: rand 801 | --Ø-C: rand 802 | --ØC: oneOf3 803 | --Ø-ØCONST: 1 (0x1) 804 | --Ø-ØCONST: 2 (0x2) 805 | --Ø-ØCONST: 3 (0x3) 806 | --ØCONST: 77 (0x4d) 807 | --ØCONST: 11 (0xb) 808 | 809 | 810 | 811 | ------------- 812 | 813 | 814 | SINK: call to malloc in analyzefun @ 0x400819 815 | -C: getNumber2 816 | --C: atoi 817 | ---C: atoi 818 | --C: getNumber 819 | ---ØCONST: 9 (0x9) 820 | ---ØCONST: 47 (0x2f) 821 | ---ØC: atoi 822 | ---Ø-C: atoi 823 | ---ØCONST: 61 (0x3d) 824 | ---ØC: rand 825 | ---Ø-C: rand 826 | ---ØC: oneOf3 827 | ---Ø-ØCONST: 1 (0x1) 828 | ---Ø-ØCONST: 2 (0x2) 829 | ---Ø-ØCONST: 3 (0x3) 830 | ---ØCONST: 77 (0x4d) 831 | ---ØCONST: 11 (0xb) 832 | 833 | 834 | 835 | ------------- 836 | 837 | 838 | SINK: call to malloc in analyzefun @ 0x400834 839 | -C: return3 840 | --CONST: 3 (0x3) 841 | 842 | 843 | 844 | ------------- 845 | 846 | 847 | SINK: call to malloc in analyzefun @ 0x400848 848 | -C: strlen 849 | 
--C: strlen 850 | 851 | 852 | 853 | ------------- 854 | 855 | 856 | SINK: call to malloc in analyzefun @ 0x400855 857 | +P: call intermediatefunc2 -> analyzefun @ 0x4008eb - param #1 858 | ++P: call main -> intermediatefunc2 @ 0x400965 - param #1 859 | ++-C: getarg1 860 | ++-+P: call analyzefun -> getarg1 @ 0x40085f - param #0 861 | ++-+-CONST: 888888 (0xd9038) 862 | ++-+P: call main -> getarg1 @ 0x40094e - param #0 863 | ++-+-CONST: 5 (0x5) 864 | ++-C: getrand 865 | ++--C: rand 866 | ++---C: rand 867 | ++--C: rand 868 | ++---C: rand 869 | ++P: call main -> intermediatefunc2 @ 0x400974 - param #1 870 | ++-CONST: 100 (0x64) 871 | ++P: call main -> intermediatefunc2 @ 0x40098d - param #1 872 | ++-C: getNum 873 | ++P: call main -> intermediatefunc2 @ 0x4009a6 - param #1 874 | ++-C: getNum 875 | +P: call main -> analyzefun @ 0x40090c - param #1 876 | +-CONST: 77 (0x4d) 877 | +P: call main -> analyzefun @ 0x400937 - param #1 878 | +-C: recv 879 | +--C: recv 880 | 881 | 882 | 883 | ------------- 884 | 885 | 886 | SINK: call to malloc in analyzefun @ 0x400869 887 | -C: getarg1 888 | -+P: call analyzefun -> getarg1 @ 0x40085f - param #0 889 | -+-CONST: 888888 (0xd9038) 890 | -+P: call main -> getarg1 @ 0x40094e - param #0 891 | -+-CONST: 5 (0x5) 892 | 893 | 894 | 895 | ------------- 896 | 897 | 898 | SINK: call to malloc in analyzefun @ 0x40087d 899 | -C: phidemo 900 | --ØCONST: 100 (0x64) 901 | --ØCONST: 0 (0x0) 902 | --ØCONST: 700 (0x2bc) 903 | 904 | 905 | 906 | ------------- 907 | 908 | 909 | SINK: call to malloc in analyzefun @ 0x4008a1 910 | -ØCONST: 17476 (0x4444) 911 | +ØP: call intermediatefunc2 -> analyzefun @ 0x4008eb - param #1 912 | +Ø+P: call main -> intermediatefunc2 @ 0x400965 - param #1 913 | +Ø+-C: getarg1 914 | +Ø+-+P: call analyzefun -> getarg1 @ 0x40085f - param #0 915 | +Ø+-+-CONST: 888888 (0xd9038) 916 | +Ø+-+P: call main -> getarg1 @ 0x40094e - param #0 917 | +Ø+-+-CONST: 5 (0x5) 918 | +Ø+-C: getrand 919 | +Ø+--C: rand 920 | +Ø+---C: rand 921 | 
+Ø+--C: rand 922 | +Ø+---C: rand 923 | +Ø+P: call main -> intermediatefunc2 @ 0x400974 - param #1 924 | +Ø+-CONST: 100 (0x64) 925 | +Ø+P: call main -> intermediatefunc2 @ 0x40098d - param #1 926 | +Ø+-C: getNum 927 | +Ø+P: call main -> intermediatefunc2 @ 0x4009a6 - param #1 928 | +Ø+-C: getNum 929 | +ØP: call main -> analyzefun @ 0x40090c - param #1 930 | +Ø-CONST: 77 (0x4d) 931 | +ØP: call main -> analyzefun @ 0x400937 - param #1 932 | +Ø-C: recv 933 | +Ø--C: recv 934 | 935 | 936 | 937 | ------------- 938 | 939 | 940 | MallocTrace.java> Finished! 941 | -------------------------------------------------------------------------------- /Ghidra/README.md: -------------------------------------------------------------------------------- 1 | # Ghidra -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | Slides and papers from conference presentations and publications 2 | -------------------------------------------------------------------------------- /Reverse Engineering Windows Defender/JavaScript Engine/README.md: -------------------------------------------------------------------------------- 1 | 2 | # Reverse Engineering Windows Defender's JavaScript Engine 3 | 4 | *Presented at REcon Brussels and Jailbreak Security Summit* 5 | 6 | Windows Defender’s `MpEngine.dll` implements the core of Defender’s functionality in an enormous ~11 MB, 30,000+ function DLL. In this presentation, we’ll look at the ~1,200 functions that comprise Defender’s proprietary JavaScript engine, which is used for analyzing potentially malicious JS code. Defender implements a full JS engine, though it is significantly simpler than the engines found in modern web browsers, so it is a tractable target for reverse engineering from binary. 
7 | 8 | We’ll cover reverse engineering the JS engine, including how it works (types, memory management, JS/ECMAScript features, integration with Defender’s antivirus system, etc.), building tooling to interact with it, non-security JS runtime bugs, anti-analysis tricks for malicious scripts, and a bit on the engine’s attack surface for exploitation. 9 | 10 | We’ll conclude by considering other attack surface within the remaining 98% of this enormous binary. 11 | -------------------------------------------------------------------------------- /Reverse Engineering Windows Defender/JavaScript Engine/Reverse Engineering-Windows Defender JavaScript Engine.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/0xAlexei/Publications/bce8a02e258219d773cecc3589421d5a49604bd8/Reverse Engineering Windows Defender/JavaScript Engine/Reverse Engineering-Windows Defender JavaScript Engine.pdf -------------------------------------------------------------------------------- /Reverse Engineering Windows Defender/README.md: -------------------------------------------------------------------------------- 1 | # Reverse Engineering Windows Defender 2 | 3 | Presentation materials on reverse engineering Windows Defender's JavaScript Engine and Windows Binary Emulator 4 | -------------------------------------------------------------------------------- /Reverse Engineering Windows Defender/Virus Bulletin Retrospective/VB2018 Presentation.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/0xAlexei/Publications/bce8a02e258219d773cecc3589421d5a49604bd8/Reverse Engineering Windows Defender/Virus Bulletin Retrospective/VB2018 Presentation.pdf -------------------------------------------------------------------------------- /Reverse Engineering Windows Defender/Windows Binary Emulator/BHUSA - DEFCON - 
Alexei-Bulazel-Reverse-Engineering-Windows-Defender-Revision-3.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/0xAlexei/Publications/bce8a02e258219d773cecc3589421d5a49604bd8/Reverse Engineering Windows Defender/Windows Binary Emulator/BHUSA - DEFCON - Alexei-Bulazel-Reverse-Engineering-Windows-Defender-Revision-3.pdf -------------------------------------------------------------------------------- /Reverse Engineering Windows Defender/Windows Binary Emulator/README.md: -------------------------------------------------------------------------------- 1 | 2 | # Reverse Engineering Windows Defender's Windows Binary Emulator 3 | 4 | *Presented at REcon Montreal, Black Hat USA, and DEF CON* 5 | 6 | Windows Defender's `mpengine.dll` implements the core of Defender antivirus' functionality in an enormous ~11 MB, 30,000+ function DLL. 7 | 8 | In this presentation, we'll look at Defender's emulator for analysis of potentially malicious Windows PE binaries on the endpoint. To the best of my knowledge, there has never been a conference talk or publication on reverse engineering the internals of any antivirus binary emulator before. 9 | 10 | I'll cover a range of topics including emulator internals (bytecode to intermediate language lifting and execution; memory management; Windows API emulation; NT kernel emulation; file system and registry emulation; integration with Defender's antivirus features; the virtual environment; etc.); how I built custom tooling to assist in reverse engineering and attacking the emulator; tricks that malicious binaries can use to evade or subvert analysis; and attack surface within the emulator. 11 | 12 | Code available at 13 | --------------------------------------------------------------------------------