├── .gitignore
├── README.md
├── __init__.py
├── asm
│   ├── base1.asm
│   └── helloworld.asm
├── bin
│   ├── cb-replay_mod
│   ├── qemu-cgc
│   ├── qemu_bb_wrap.sh
│   ├── qemu_launcher.sh
│   └── qemu_singlestep_wrap.sh
├── cgc_pin_tracer
│   ├── cgc_pin_tracer.cpp
│   ├── libcgc_pin.h
│   ├── makefile
│   ├── makefile.rules
│   ├── pin_wrap.sh
│   └── test.sh
├── cgrex
│   ├── CGRexAnalysis.py
│   ├── Exceptions.py
│   ├── Fortifier.py
│   ├── MiasmPatcher.py
│   ├── PinManager.py
│   ├── QemuTracer.py
│   ├── VagrantManager.py
│   ├── __init__.py
│   └── utils.py
├── fortify.py
├── main.py
├── reqs.txt
└── tests
    └── 0b32aa01
        ├── 0b32aa01_01
        ├── 0b32aa01_01.xml
        └── 0b32aa01_02.xml

/.gitignore: -------------------------------------------------------------------------------- 1 | *~ 2 | *.swp 3 | *.o 4 | 5 | # Byte-compiled / optimized / DLL files 6 | __pycache__/ 7 | *.py[cod] 8 | # C extensions 9 | *.so 10 | # Distribution / packaging 11 | .Python 12 | env/ 13 | build/ 14 | develop-eggs/ 15 | dist/ 16 | downloads/ 17 | eggs/ 18 | lib/ 19 | lib64/ 20 | parts/ 21 | sdist/ 22 | var/ 23 | *.egg-info/ 24 | .installed.cfg 25 | *.egg 26 | # PyInstaller 27 | # Usually these files are written by a python script from a template 28 | # before PyInstaller builds the exe, so as to inject date/other infos into it. 
29 | *.manifest 30 | *.spec 31 | # Installer logs 32 | pip-log.txt 33 | pip-delete-this-directory.txt 34 | # Unit test / coverage reports 35 | htmlcov/ 36 | .tox/ 37 | .coverage 38 | .cache 39 | nosetests.xml 40 | coverage.xml 41 | # Translations 42 | *.mo 43 | *.pot 44 | # Django stuff: 45 | *.log 46 | # Sphinx documentation 47 | docs/_build/ 48 | # PyBuilder 49 | target/ 50 | 51 | .idea 52 | 53 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # CGrex 2 | **WARNING: this project is obsolete.** 3 | 4 | **CGrex was used only during the CGC qualifier event, for the final event we used [Patcherex](https://github.com/shellphish/patcherex).** 5 | 6 | CGrex is a targeted patcher for CGC binaries. 7 | 8 | We used it (together with fidget) just for the CGC qualifier event. 9 | 10 | CGrex takes as an input a CGC binary and a list of POVs and it generates a binary that (supposedly) it is not vulnerable anymore to those POVs. 11 | 12 | In a nutshell, CGrex works by injecting code that "abuses" return values of the `random` and `fdwait` syscalls to detect if the soon-to-be-accessed memory regions are allocated. 13 | 14 | 15 | ## Installation 16 | ```bash 17 | # install requirements 18 | sudo apt-get install socat 19 | 20 | # create a virtualenv 21 | mkvirtualenv cgrex 22 | 23 | # download cgrex 24 | git clone https://github.com/mechaphish/cgrex.git 25 | 26 | # install Python requirements 27 | cd cgrex 28 | pip install -r reqs.txt 29 | 30 | #install miasm 31 | git clone https://github.com/cea-sec/miasm.git 32 | cd miasm 33 | pip install -e . 34 | cd .. 35 | ``` 36 | 37 | ## Usage 38 | ```bash 39 | main.py --binary= --out= ... 
40 | ``` 41 | Example: 42 | ```bash 43 | ./main.py --binary tests/0b32aa01/0b32aa01_01 --out /tmp/0b32aa01_01_cgrex1 tests/0b32aa01/0b32aa01_01.xml tests/0b32aa01/0b32aa01_02.xml 44 | ``` 45 | `/tmp/0b32aa01_01_cgrex1` is now "immune" to the two POVs. 46 | For instance, it should not segfault with the following input: 47 | ```bash 48 | python -c 'print "A"*100' | bin/qemu-cgc /tmp/0b32aa01_01_cgrex1 49 | ``` 50 | 51 | ## How does CGrex work? 52 | During the CGC qualification event, an input that generated a crash (encoded in a POV) was considered a vulnerability. The goal of CGrex is just to generate a binary that does not crash when given a previously crashing input. 53 | 54 | CGrex works in five steps. 55 | 56 | 1) Run the CGC binary against a given POV (using `bin/qemu-cgc`). 57 | 58 | 2) Detect the instruction pointer where the POV generates a crash (the "culprit instruction"). 59 | 60 | 3) Extract the symbolic expression of the memory accesses performed by the "culprit instruction" (by using miasm). 61 | 62 | 4) Generate "checking" code that dynamically: 63 | 64 | * Computes the memory accesses that the "culprit instruction" is going to perform. 65 | 66 | * Verifies that these memory accesses are within allocated memory regions (and so the "culprit instruction" is not going to crash). To determine whether some memory is allocated, CGrex "abuses" the return values of the `random` and `fdwait` syscalls. 67 | 68 | * Calls `exit` if a memory access outside allocated memory is detected. 69 | 70 | 5) Inject the "checking" code. 71 | 72 | Steps 1 to 5 are repeated until the binary no longer crashes on any of the provided POVs. 
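The page arithmetic behind step 4 can be sketched in Python. This is a minimal illustration, not code from the repository: it mirrors what the `CGREX_memcheck_and_exit` routine in `asm/base1.asm` does with `eax` (address) and `ebx` (size), assuming 4 KiB pages as implied by the `0xfffff000` mask. The names `PAGE_MASK` and `pages_to_check` are hypothetical.

```python
# Hypothetical helper names; only the arithmetic is taken from asm/base1.asm.
PAGE_MASK = 0xFFFFF000  # same mask as in the assembly; implies 4 KiB pages


def pages_to_check(address, size):
    """Return the page bases holding the first and last byte of an access.

    Mirrors the register arithmetic in CGREX_memcheck_and_exit:
    first = address & mask, last = (address + size - 1) & mask.
    """
    first_page = address & PAGE_MASK
    last_page = (address + size - 1) & PAGE_MASK
    return first_page, last_page


if __name__ == "__main__":
    # A 4-byte access starting 2 bytes before a page boundary spans two pages.
    first, last = pages_to_check(0x08048FFE, 4)
    print(hex(first), hex(last))
```

Only these two pages are probed, so, as in the assembly, an access spanning more than two pages would leave the pages in between unchecked.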
73 | 74 | 75 | -------------------------------------------------------------------------------- /__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/mechaphish/cgrex/8ce02e323646fc83e61db8eb6b4dc7c0dfa0ae03/__init__.py -------------------------------------------------------------------------------- /asm/base1.asm: -------------------------------------------------------------------------------- 1 | USE32 2 | org {code_loaded_address} 3 | 4 | _saved_esp: 5 | db 0x0,0x0,0x0,0x0 6 | _saved_eax: 7 | db 0x0,0x0,0x0,0x0 8 | 9 | CGREX_memcheck_and_exit_ptr: 10 | db 0xc,0x0,0x0,0x9 11 | 12 | CGREX_memcheck_and_exit: ;eax=address,ebx=size,ecx=flags([write][read]) 13 | pusha 14 | mov edx,eax 15 | and edx,0xfffff000 16 | mov esi,eax 17 | add esi,ebx 18 | dec esi 19 | and esi,0xfffff000 20 | 21 | mov eax,edx 22 | call _memcheck_and_exit_int 23 | mov eax,esi 24 | call _memcheck_and_exit_int 25 | 26 | popa 27 | ret 28 | 29 | CGREX_print_eax: 30 | pusha 31 | mov ecx,32 32 | mov ebx,eax 33 | _print_reg_loop: 34 | rol ebx,4 35 | mov edi,ebx 36 | and edi,0x0000000f 37 | lea eax,[_print_hex_array+edi] 38 | mov ebp,ebx 39 | mov ebx,0x1 40 | call _print 41 | mov ebx,ebp 42 | sub ecx,4 43 | jnz _print_reg_loop 44 | mov eax,_new_line 45 | mov ebx,1 46 | call _print 47 | popa 48 | ret 49 | 50 | 51 | CGREX_exit: 52 | pusha 53 | mov eax,1 ;_terminate 54 | ;TODO return something related to detour point (but return value is ANDed with 0xff) 55 | mov ebx,0x85 ;133 56 | int 0x80 ;this may actually not terminate due to the pin counter-hack 57 | popa 58 | ret 59 | 60 | _memcheck_and_exit_int: ;eax=address,ecx=flags([write][read]) 61 | pusha 62 | ;int3 63 | mov ebp,eax 64 | xor edx,edx 65 | mov edi,ecx 66 | and edi,0x00000001 67 | test edi,edi 68 | je _out1 69 | call _test_read 70 | test eax,eax 71 | jne _out1 72 | call CGREX_exit 73 | _out1: 74 | ;int3 75 | mov eax,ebp 76 | mov edi,ecx 77 | and edi,0x00000002 78 | test 
edi,edi 79 | je _out2 80 | call _test_write 81 | test eax,eax 82 | jne _out2 83 | call CGREX_exit 84 | _out2 85 | popa 86 | ret 87 | 88 | _test_read: 89 | ;call CGREX_print_eax 90 | pusha 91 | cmp eax, 0x1000 92 | jb _fail_read 93 | mov esi,eax 94 | mov eax,4 ;fdwait 95 | xor ebx,ebx 96 | dec ebx ;nfds<0 97 | xor ecx,ecx 98 | xor edx,edx 99 | mov edi,0x0 ;passing syscall arguments in edi does not seem to work! 100 | int 0x80 101 | ;jmp _iloop 102 | ;int3 103 | 104 | cmp eax,3 ;EFAULT 105 | jne _fail_read 106 | xor eax,eax 107 | inc eax 108 | jmp _end_test_read 109 | _fail_read: 110 | xor eax,eax 111 | _end_test_read: 112 | 113 | mov [_garbage_area],eax 114 | popa 115 | mov eax,[_garbage_area] 116 | ret 117 | 118 | _test_write: 119 | ;call CGREX_print_eax 120 | pusha 121 | cmp eax, 0x1000 122 | jb _fail_write 123 | mov edx,eax 124 | mov eax,7 ;random 125 | xor ebx,ebx 126 | xor ecx,ecx 127 | 128 | mov edi,[edx] ; FIXME hack that assumes that this area is 4-byte readable 129 | mov [_garbage_area],edi 130 | int 0x80 131 | mov edi,[_garbage_area] 132 | mov [edx],edi 133 | 134 | test eax,eax 135 | jne _fail_write 136 | xor eax,eax 137 | inc eax 138 | jmp _end_test_write 139 | _fail_write: 140 | xor eax,eax 141 | _end_test_write: 142 | 143 | mov [_garbage_area],eax 144 | popa 145 | mov eax,[_garbage_area] 146 | ret 147 | 148 | 149 | _print: ;eax=buf,ebx=len 150 | pusha 151 | mov ecx,eax 152 | mov edx,ebx 153 | mov eax,0x2 154 | mov ebx,0x1 155 | mov esi,0x0 156 | int 0x80 157 | popa 158 | ret 159 | 160 | 161 | _new_line: 162 | db 0xa 163 | _print_hex_array: 164 | db '0','1','2','3','4','5','6','7','8','9','a','b','c','d','e','f' 165 | _garbage_area: 166 | db 0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x0 167 | db 0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x0 168 | 169 | 170 | ; === END_BASE 171 | -------------------------------------------------------------------------------- /asm/helloworld.asm: 
-------------------------------------------------------------------------------- 1 | USE32 2 | org {code_loaded_address} 3 | 4 | _start 5 | pusha 6 | mov eax, _str_hello 7 | call _prints 8 | 9 | popa 10 | jmp {code_return} 11 | 12 | ;_iloop: 13 | ; jmp _iloop 14 | 15 | _prints ;eax=buf(null terminates) 16 | pusha 17 | xor ebx,ebx 18 | xor edx,edx 19 | xor ecx,ecx 20 | _prints_loop 21 | mov ebx,eax 22 | add ebx,ecx 23 | mov dl,[ebx] 24 | inc ecx 25 | test edx,edx 26 | jne _prints_loop 27 | dec ecx 28 | mov ebx,ecx 29 | call _print 30 | popa 31 | ret 32 | 33 | _print ;eax=buf,ebx=len 34 | pusha 35 | mov ecx,eax 36 | mov edx,ebx 37 | mov eax,0x2 38 | mov ebx,0x1 39 | mov esi,0x0 40 | int 0x80 41 | popa 42 | ret 43 | 44 | _str_hello: 45 | db '=== Hello world!',0xa,0x00 46 | 47 | 48 | 49 | -------------------------------------------------------------------------------- /bin/cb-replay_mod: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | 3 | """ 4 | CB POV / Poll communication verification tool 5 | 6 | Copyright (C) 2014 - Brian Caswell 7 | 8 | Permission is hereby granted, free of charge, to any person obtaining a copy 9 | of this software and associated documentation files (the "Software"), to deal 10 | in the Software without restriction, including without limitation the rights 11 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 12 | copies of the Software, and to permit persons to whom the Software is 13 | furnished to do so, subject to the following conditions: 14 | 15 | The above copyright notice and this permission notice shall be included in 16 | all copies or substantial portions of the Software. 17 | 18 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 19 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 20 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE 21 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 22 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 23 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN 24 | THE SOFTWARE. 25 | 26 | This tool allows for deterministic communication to a CGC Challenge Binary 27 | using a communication spec [0] defined in XML. Results are logged in the TAP 28 | format [1]. 29 | 30 | 0 - file:///usr/share/cgc-docs/replay.dtd 31 | 1 - http://testanything.org/ 32 | """ 33 | 34 | import argparse 35 | import multiprocessing 36 | import signal 37 | import re 38 | import socket 39 | import struct 40 | import time 41 | import select 42 | import defusedxml.ElementTree as ET 43 | 44 | 45 | class RegexMatch(object): 46 | """ Simple wrapper for handling regexes in Throw. 47 | 48 | Attributes: 49 | group: which re group to use when extracting data 50 | regex: The compiled re to be evaluated 51 | 52 | """ 53 | def __init__(self, regex, group=None): 54 | if group is None: 55 | group = 0 56 | 57 | self.regex = regex 58 | self.group = group 59 | 60 | def match(self, data): 61 | """ 62 | Match the compiled regular expression 63 | 64 | Arguments: 65 | data: Data to match 66 | 67 | Returns: 68 | Result of the re.match call 69 | 70 | Raises 71 | None 72 | """ 73 | 74 | return self.regex.match(data) 75 | 76 | 77 | class _ValueStr(str): 78 | """ Wrapper class, used to specify the string is meant to be a 'key' in the 79 | Throw.values key/value store.""" 80 | pass 81 | 82 | 83 | class TimeoutException(Exception): 84 | """ Exception to be used by Timeout(), to allow catching of timeout 85 | exceptions """ 86 | pass 87 | 88 | 89 | class TestFailure(Exception): 90 | """ Exception to be used by Throw(), to allow catching of test failures """ 91 | pass 92 | 93 | 94 | class Timeout(object): 95 | """ Timeout - A class to use within 'with' for timing out a block via 96 | exceptions and alarm.""" 97 | 98 | def 
__init__(self, seconds): 99 | self.seconds = seconds 100 | 101 | @staticmethod 102 | def cb_handle_timeout(signum, frame): 103 | """ SIGALRM signal handler callback """ 104 | raise TimeoutException("timed out") 105 | 106 | def __enter__(self): 107 | if self.seconds: 108 | signal.signal(signal.SIGALRM, self.cb_handle_timeout) 109 | signal.alarm(self.seconds) 110 | 111 | def __exit__(self, exit_type, exit_value, traceback): 112 | if self.seconds: 113 | signal.alarm(0) 114 | 115 | 116 | class Throw(object): 117 | """Throw - Perform the interactions with a CB 118 | 119 | This class implements the basic methods to interact with a CB, verifying 120 | the interaction works as expected. 121 | 122 | Usage: 123 | a = Throw((source_ip, source_port), (target_ip, target_port), POV, 124 | timeout, should_debug) 125 | a.run() 126 | 127 | Attributes: 128 | source: tuple of host and port for the outbound connection 129 | target: tuple of host and port for the CB 130 | 131 | count: Number of actions performed 132 | 133 | debug: Is debugging enabled 134 | 135 | failed: Number of actions that did not work as expected 136 | 137 | passed: Number of actions that worked as expected 138 | 139 | pov: POV, as defined by POV() 140 | 141 | sock: TCP Socket to the CB 142 | 143 | timeout: connection timeout 144 | 145 | values: Variable dictionary 146 | 147 | logs: all of the output from the interactions 148 | """ 149 | def __init__(self, source, target, pov, timeout, debug): 150 | self.source = source 151 | self.target = target 152 | self.count = 0 153 | self.failed = 0 154 | self.passed = 0 155 | self.pov = pov 156 | self.debug = debug 157 | self.sock = None 158 | self.timeout = timeout 159 | self.values = {} 160 | self.logs = [] 161 | self._read_buffer = '' 162 | 163 | def is_ok(self, expected, result, message): 164 | """ Verifies 'expected' is equal to 'result', logging results in TAP 165 | format 166 | 167 | Args: 168 | expected: Expected value 169 | result: Action value 170 | message: 
String describing the action being evaluated 171 | 172 | Returns: 173 | length: If the 'expected' result is a string, returns the length of 174 | the string, otherwise 0 175 | 176 | Raises: 177 | None 178 | """ 179 | 180 | if isinstance(expected, _ValueStr): 181 | message += ' (expanded from %s)' % repr(expected) 182 | if expected not in self.values: 183 | message += ' value not provided' 184 | self.log_fail(message) 185 | return 0 186 | expected = self.values[expected] 187 | 188 | if isinstance(expected, str): 189 | if result.startswith(expected): 190 | self.log_ok(message) 191 | return len(expected) 192 | else: 193 | if result == expected: 194 | self.log_ok(message) 195 | return 0 196 | 197 | if self.debug: 198 | self.log('expected: %s' % repr(expected)) 199 | self.log('result: %s' % repr(result)) 200 | 201 | self.log_fail(message) 202 | return 0 203 | 204 | def is_not(self, expected, result, message): 205 | """ Verifies 'expected' is not equal to 'result', logging results in 206 | TAP format 207 | 208 | Args: 209 | expected: Expected value 210 | result: Action value 211 | message: String describing the action being evaluated 212 | 213 | Returns: 214 | length: If the 'expected' result is a string, returns the length of 215 | the string, otherwise 0 216 | 217 | Raises: 218 | None 219 | """ 220 | if isinstance(expected, _ValueStr): 221 | message += ' (expanded from %s)' % repr(expected) 222 | if expected not in self.values: 223 | message += ' value not provided' 224 | self.log_fail(message) 225 | return 0 226 | expected = self.values[expected] 227 | 228 | if isinstance(expected, str): 229 | if not result.startswith(expected): 230 | self.log_ok(message) 231 | return len(expected) 232 | else: 233 | if result != expected: 234 | self.log_ok(message) 235 | return 0 236 | 237 | if self.debug: 238 | self.log('these are expected to be different:') 239 | self.log('expected: %s' % repr(expected)) 240 | self.log('result: %s' % repr(result)) 241 | self.log_fail(message) 242 | 
return 0 243 | 244 | def log_ok(self, message): 245 | """ Log a test that passed in the TAP format 246 | 247 | Args: 248 | message: String describing the action that 'passed' 249 | 250 | Returns: 251 | None 252 | 253 | Raises: 254 | None 255 | """ 256 | self.passed += 1 257 | self.count += 1 258 | self.logs.append("ok %d - %s" % (self.count, message)) 259 | 260 | def log_fail(self, message): 261 | """ Log a test that failed in the TAP format 262 | 263 | Args: 264 | message: String describing the action that 'failed' 265 | 266 | Returns: 267 | None 268 | 269 | Raises: 270 | TestFailure: always, after the failure has been logged 271 | """ 272 | self.failed += 1 273 | self.count += 1 274 | self.logs.append("not ok %d - %s" % (self.count, message)) 275 | raise TestFailure('failed: %s' % message) 276 | 277 | def log(self, message): 278 | """ Log diagnostic information in the TAP format 279 | 280 | Args: 281 | message: String being logged 282 | 283 | Returns: 284 | None 285 | 286 | Raises: 287 | None 288 | """ 289 | self.logs.append("# %s" % message) 290 | 291 | def sleep(self, value): 292 | """ Sleep a specified amount 293 | 294 | Args: 295 | value: Amount of time to sleep, in seconds (passed to time.sleep) 296 | 297 | Returns: 298 | None 299 | 300 | Raises: 301 | None 302 | """ 303 | time.sleep(value) 304 | self.log_ok("slept %f" % value) 305 | 306 | def declare(self, values): 307 | """ Declare variables for use within the current CB communication 308 | iteration 309 | 310 | Args: 311 | values: Dictionary of key/value pair values to be set 312 | 313 | Returns: 314 | None 315 | 316 | Raises: 317 | None 318 | """ 319 | self.values.update(values) 320 | 321 | set_values = [repr(x) for x in values.keys()] 322 | self.log_ok("set values: %s" % ', '.join(set_values)) 323 | 324 | def _perform_match(self, match, data, invert=False): 325 | """ Validate the data read from the CB is as expected 326 | 327 | Args: 328 | match: Pre-parsed expression to validate the data from the CB 329 | data: Data read from the CB 330 | 331 | Returns: 332 | 
None 333 | 334 | Raises: 335 | None 336 | """ 337 | offset = 0 338 | for item in match: 339 | if isinstance(item, str): 340 | if invert: 341 | offset += self.is_not(item, data[offset:], 342 | 'match: not string') 343 | else: 344 | offset += self.is_ok(item, data[offset:], 'match: string') 345 | elif hasattr(item, 'match'): 346 | match = item.match(data[offset:]) 347 | if match: 348 | if invert: 349 | if self.debug: 350 | self.log('pattern: %s' % repr(item.pattern)) 351 | self.log('data: %s' % repr(data[offset:])) 352 | self.log_fail('match: not pcre') 353 | else: 354 | self.log_ok('match: pcre') 355 | offset += match.end() 356 | else: 357 | if invert: 358 | self.log_ok('match: not pcre') 359 | else: 360 | if self.debug: 361 | self.log('pattern: %s' % repr(item.pattern)) 362 | self.log('data: %s' % repr(data[offset:])) 363 | self.log_fail('match: pcre') 364 | else: 365 | raise Exception('unknown match type: %s' % repr(item)) 366 | 367 | def _perform_expr(self, expr, key, data): 368 | """ Extract a value from the value read from the CB using 'slice' or 369 | 'pcre' 370 | 371 | Args: 372 | expr: Pre-parsed expression to extract the value 373 | key: Key to store the value in the instance iteration 374 | data: Data read from the CB 375 | 376 | Returns: 377 | None 378 | 379 | Raises: 380 | None 381 | """ 382 | value = None 383 | 384 | # self.log('PERFORMING EXPR (%s): %s' % (key, repr(expr))) 385 | # self.log('DATA: %s' % repr(data)) 386 | if isinstance(expr, slice): 387 | value = data[expr] 388 | elif isinstance(expr, RegexMatch): 389 | match = expr.match(data) 390 | if match: 391 | try: 392 | value = match.group(expr.group) 393 | except IndexError: 394 | self.log_fail('match group unavailable') 395 | else: 396 | self.log_fail('match failed') 397 | 398 | else: 399 | self.log_fail('unknown expr type: %s' % repr(expr)) 400 | 401 | if value is not None: 402 | self.values[key] = value 403 | if self.debug: 404 | self.log('set %s to %s' % (key, value.encode('hex'))) 405 | 
self.log_ok('set %s' % (key)) 406 | 407 | def _read_len(self, read_len): 408 | """ 409 | Always read at least 4096 byte chunks. Because reasons? 410 | """ 411 | if len(self._read_buffer) >= read_len: 412 | data = self._read_buffer[:read_len] 413 | self._read_buffer = self._read_buffer[read_len:] 414 | return data 415 | 416 | data = [self._read_buffer] 417 | data_len = len(self._read_buffer) 418 | while data_len < read_len: 419 | left = read_len - data_len 420 | data_read = self.sock.recv(max(4096, left)) 421 | if len(data_read) == 0: 422 | self.log_fail('recv failed') 423 | self._read_buffer = ''.join(data) 424 | return '' 425 | 426 | data.append(data_read) 427 | data_len += len(data_read) 428 | 429 | data = ''.join(data) 430 | self._read_buffer = data[read_len:] 431 | return data[:read_len] 432 | 433 | def _read_delim(self, delim): 434 | while delim not in self._read_buffer: 435 | data_read = self.sock.recv(4096) 436 | if len(data_read) == 0: 437 | self.log_fail('recv failed') 438 | return '' 439 | self._read_buffer += data_read 440 | 441 | depth = self._read_buffer.index(delim) + len(delim) 442 | data = self._read_buffer[:depth] 443 | self._read_buffer = self._read_buffer[depth:] 444 | return data 445 | 446 | def read(self, read_args): 447 | """ Read data from the CB, validating the results 448 | 449 | Args: 450 | read_args: Dictionary of arguments 451 | 452 | Returns: 453 | None 454 | 455 | Raises: 456 | Exception: if 'expr' argument is provided and 'assign' is not 457 | """ 458 | data = '' 459 | try: 460 | if 'length' in read_args: 461 | data = self._read_len(read_args['length']) 462 | self.is_ok(read_args['length'], len(data), 'read length') 463 | elif 'delim' in read_args: 464 | data = self._read_delim(read_args['delim']) 465 | except socket.error: 466 | self.log_fail('recv failed') 467 | 468 | if 'echo' in read_args and self.debug: 469 | assert read_args['echo'] in ['yes', 'no', 'ascii'] 470 | 471 | if 'yes' == read_args['echo']: 472 | self.log('received %s' 
% data.encode('hex')) 473 | elif 'ascii' == read_args['echo']: 474 | self.log('received %s' % repr(data)) 475 | 476 | if 'match' in read_args: 477 | self._perform_match(read_args['match']['values'], data, 478 | read_args['match']['invert']) 479 | 480 | if 'expr' in read_args: 481 | assert 'assign' in read_args 482 | self._perform_expr(read_args['expr'], read_args['assign'], data) 483 | 484 | def write(self, args): 485 | """ Write data to the CB 486 | 487 | Args: 488 | args: Dictionary of arguments 489 | 490 | Returns: 491 | None 492 | 493 | Raises: 494 | None 495 | """ 496 | data = [] 497 | for value in args['value']: 498 | if isinstance(value, _ValueStr): 499 | if value not in self.values: 500 | self.log_fail('write failed: %s not available' % value) 501 | return 502 | data.append(self.values[value]) 503 | else: 504 | data.append(value) 505 | to_send = ''.join(data) 506 | 507 | if self.debug: 508 | if args['echo'] == 'yes': 509 | self.log('writing: %s' % to_send.encode('hex')) 510 | elif args['echo'] == 'ascii': 511 | self.log('writing: %s' % repr(to_send)) 512 | 513 | try: 514 | total_sent = 0 515 | while total_sent < len(to_send): 516 | sent = self.sock.send(to_send[total_sent:]) 517 | if sent == 0: 518 | self.log_fail('write failed. 
wrote %d of %d bytes' % 519 | (total_sent, len(to_send))) 520 | return 521 | total_sent += sent 522 | self.log_ok('write: sent %d bytes' % len(to_send)) 523 | except socket.error: 524 | self.log_fail('write failed') 525 | 526 | def run(self): 527 | """ Iteratively execute each of the actions within the POV 528 | 529 | Args: 530 | None 531 | 532 | Returns: 533 | None 534 | 535 | Raises: 536 | AssertionError: if a POV action is not in the pre-defined methods 537 | """ 538 | 539 | self.log('%s - %s' % (self.pov.name, self.pov.filename)) 540 | 541 | methods = { 542 | 'sleep': self.sleep, 543 | 'declare': self.declare, 544 | 'read': self.read, 545 | 'write': self.write, 546 | } 547 | 548 | self.sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM) 549 | self.sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1) 550 | self.sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, 551 | struct.pack('ii', 1, 5)) 552 | self.sock.bind(self.source) 553 | self.sock.connect(self.target) 554 | # = socket.create_connection(self.target) 555 | self.log('connected to %s' % repr(self.target)) 556 | 557 | for method, arguments in self.pov: 558 | assert method in methods, "%s not in methods" % method 559 | try: 560 | methods[method](arguments) 561 | except TestFailure: 562 | self.log('stopping due to failure') 563 | return 564 | 565 | ctime = time.time() 566 | while True: 567 | #print time.time(),ctime 568 | if time.time()-ctime > 10.0: 569 | break 570 | ready,_,err_sock = select.select([self.sock], [], [], 1.0) 571 | if ready[0]: 572 | data = self.sock.recv(10000) 573 | if data == "": 574 | break 575 | print "remaining data in cb-replay_mod:" 576 | print repr(data) 577 | else: 578 | break 579 | 580 | time.sleep(1.0) 581 | self.sock.close() 582 | 583 | 584 | def dump(self): 585 | """ Log information for the current POV iteraction 586 | 587 | Args: 588 | None 589 | 590 | Returns: 591 | None 592 | 593 | Raises: 594 | None 595 | """ 596 | if len(self.values) and self.debug: 597 | 
self.log("variables at end of interaction: ") 598 | for key in self.values: 599 | self.log("%s : %s" % (repr(key), repr(self.values[key]))) 600 | self.log('tests passed: %d' % self.passed) 601 | self.log('tests failed: %d' % self.failed) 602 | 603 | 604 | class POV(object): 605 | """POV - Parse a POV xml file 606 | 607 | This class implements a parser for the CGC POV/Poll format defined in 608 | replay.dtd. 609 | 610 | Usage: 611 | p = POV() 612 | p.parse(filename) 613 | p.dump() 614 | 615 | Attributes: 616 | name: Name of the CB 617 | 618 | filename: Filename of the CB definition 619 | 620 | _steps: List of iteractions of a CB 621 | 622 | _variables: List of variables used during CB interaction 623 | """ 624 | def __init__(self): 625 | self.filename = None 626 | self.name = None 627 | self._steps = [] 628 | self._variables = [] 629 | 630 | def __iter__(self): 631 | """ Iterate over iteractions in a POV 632 | 633 | Args: 634 | None 635 | 636 | Returns: 637 | None 638 | 639 | Raises: 640 | None 641 | """ 642 | for step in self._steps: 643 | yield step 644 | 645 | @staticmethod 646 | def compile_hex_match(data): 647 | """ Convert a string of hex values to their ascii value, skipping 648 | whitespace 649 | 650 | Args: 651 | data: Hex string 652 | 653 | Returns: 654 | None 655 | 656 | Raises: 657 | None 658 | """ 659 | for i in [' ', '\n', '\r', '\t']: 660 | data = data.replace(i, '') 661 | return data.decode('hex') 662 | 663 | @staticmethod 664 | def compile_pcre(data): 665 | """ Compile a PCRE regular express for later use 666 | 667 | Args: 668 | data: String to be compiled 669 | 670 | Returns: 671 | None 672 | 673 | Raises: 674 | None 675 | """ 676 | pattern = re.compile(data, re.DOTALL) 677 | return RegexMatch(pattern) 678 | 679 | @staticmethod 680 | def compile_slice(data): 681 | """ Parse a slice XML element, into simplified Python slice format 682 | (:). 
683 | 684 | Args: 685 | data: XML element defining a slice 686 | 687 | Returns: 688 | None 689 | 690 | Raises: 691 | AssertionError: If the tag text is not empty 692 | AssertionError: If the tag name is not 'slice' 693 | """ 694 | assert data.tag == 'slice' 695 | assert data.text is None 696 | begin = int(POV.get_attribute(data, 'begin', '0')) 697 | end = POV.get_attribute(data, 'end', None) 698 | if end is not None: 699 | end = int(end) 700 | return slice(begin, end) 701 | 702 | @staticmethod 703 | def compile_string_match(data): 704 | """ Parse a string into an 'asciic' format, for easy use. Allows for 705 | \\r, \\n, \\t, \\\\, and hex values specified via C Style \\x notation. 706 | 707 | Args: 708 | data: String to be parsed into a 'asciic' supported value. 709 | 710 | Returns: 711 | None 712 | 713 | Raises: 714 | AssertionError: if either of two characters following '\\x' are not 715 | hexidecimal values 716 | Exception: if the escaped value is not one of the supported escaped 717 | strings (See above) 718 | """ 719 | # \\, \r, \n, \t \x(HEX)(HEX) 720 | data = str(data) # no unicode support 721 | state = 0 722 | out = [] 723 | chars = {'n': '\n', 'r': '\r', 't': '\t', '\\': '\\'} 724 | hex_chars = '0123456789abcdef' 725 | hex_tmp = '' 726 | for val in data: 727 | if state == 0: 728 | if val != '\\': 729 | out.append(val) 730 | continue 731 | state = 1 732 | elif state == 1: 733 | if val in chars: 734 | out.append(chars[val]) 735 | state = 0 736 | continue 737 | elif val == 'x': 738 | state = 2 739 | else: 740 | raise Exception('invalid asciic string (%s)' % repr(data)) 741 | elif state == 2: 742 | assert val.lower() in hex_chars 743 | hex_tmp = val 744 | state = 3 745 | else: 746 | assert val.lower() in hex_chars 747 | hex_tmp += val 748 | out.append(hex_tmp.decode('hex')) 749 | hex_tmp = '' 750 | state = 0 751 | return ''.join(out) 752 | 753 | @staticmethod 754 | def compile_string(data_type, data): 755 | """ Converts a string from a specified format into 
the converted into 756 | an optimized form for later use 757 | 758 | Args: 759 | data_type: Which 'compiler' to use 760 | data: String to be 'compiled' 761 | 762 | Returns: 763 | None 764 | 765 | Raises: 766 | None 767 | """ 768 | funcs = { 769 | 'pcre': POV.compile_pcre, 770 | 'asciic': POV.compile_string_match, 771 | 'hex': POV.compile_hex_match, 772 | } 773 | return funcs[data_type](data) 774 | 775 | @staticmethod 776 | def get_child(data, name): 777 | """ Retrieve the specified 'BeautifulSoup' child from the current 778 | element 779 | 780 | Args: 781 | data: Current element that should be searched 782 | name: Name of child element to be returned 783 | 784 | Returns: 785 | child: BeautifulSoup element 786 | 787 | Raises: 788 | AssertionError: if a child with the specified name is not contained 789 | in the specified element 790 | """ 791 | child = data.findChild(name) 792 | assert child is not None 793 | return child 794 | 795 | @staticmethod 796 | def get_attribute(data, name, default=None, allowed=None): 797 | """ Return the named attribute from the current element. 798 | 799 | Args: 800 | data: Element to read the named attribute 801 | name: Name of attribute 802 | default: Optional default value to be returne if the attribute is 803 | not provided 804 | allowed: Optional list of allowed values 805 | 806 | Returns: 807 | None 808 | 809 | Raises: 810 | AssertionError: if the value is not in the specified allowed values 811 | """ 812 | value = default 813 | if name in data.attrib: 814 | value = data.attrib[name] 815 | if allowed is not None: 816 | assert value in allowed 817 | return value 818 | 819 | def add_variable(self, name): 820 | """ Add a variable the POV interaction 821 | 822 | This allows for insurance of runtime access of initialized variables 823 | during parse time. 
824 | 825 | Args: 826 | name: Name of variable 827 | 828 | Returns: 829 | None 830 | 831 | Raises: 832 | None 833 | """ 834 | if name not in self._variables: 835 | self._variables.append(name) 836 | 837 | def has_variable(self, name): 838 | """ Verify a variable has been defined 839 | 840 | Args: 841 | name: Name of variable 842 | 843 | Returns: 844 | None 845 | 846 | Raises: 847 | None 848 | """ 849 | return name in self._variables 850 | 851 | def add_step(self, step_type, data): 852 | """ Add a step to the POV iteraction sequence 853 | 854 | Args: 855 | step_type: Type of interaction 856 | data: Data for the interaction 857 | 858 | Returns: 859 | None 860 | 861 | Raises: 862 | AssertionError: if the step_type is not one of the pre-defined 863 | types 864 | """ 865 | assert step_type in ['declare', 'sleep', 'read', 'write'] 866 | self._steps.append((step_type, data)) 867 | 868 | def parse_delay(self, data): 869 | """ Parse a 'delay' interaction XML element 870 | 871 | Args: 872 | data: XML Element defining the 'delay' iteraction 873 | 874 | Returns: 875 | None 876 | 877 | Raises: 878 | AssertionError: if there is not only one child in the 'delay' 879 | element 880 | """ 881 | self.add_step('sleep', float(data.text) / 1000) 882 | 883 | def parse_decl(self, data): 884 | """ Parse a 'decl' interaction XML element 885 | 886 | Args: 887 | data: XML Element defining the 'decl' iteraction 888 | 889 | Returns: 890 | None 891 | 892 | Raises: 893 | AssertionError: If there is not two children in the 'decl' element 894 | AssertionError: If the 'var' child element is not defined 895 | AssertionError: If the 'var' child element does not have only one 896 | child 897 | AssertionError: If the 'value' child element is not defined 898 | AssertionError: If the 'value' child element does not have only one 899 | child 900 | """ 901 | assert len(data) == 2 902 | assert data[0].tag == 'var' 903 | key = data[0].text 904 | 905 | values = [] 906 | assert data[1].tag == 'value' 907 | 
assert len(data[1]) > 0 908 | for item in data[1]: 909 | values.append(self.parse_data(item)) 910 | 911 | value = ''.join(values) 912 | 913 | self.add_variable(key) 914 | self.add_step('declare', {key: value}) 915 | 916 | def parse_assign(self, data): 917 | """ Parse an 'assign' XML element 918 | 919 | Args: 920 | data: XML Element defining the 'assign' iteraction 921 | 922 | Returns: 923 | None 924 | 925 | Raises: 926 | AssertionError: If the 'var' element is not defined 927 | AssertionError: If the 'var' element does not have only one child 928 | AssertionError: If the 'pcre' or 'slice' element of the 'assign' 929 | element is not defined 930 | """ 931 | 932 | assert data.tag == 'assign' 933 | assert data[0].tag == 'var' 934 | assign = data[0].text 935 | self.add_variable(assign) 936 | 937 | if data[1].tag == 'pcre': 938 | expression = POV.compile_string('pcre', data[1].text) 939 | group = POV.get_attribute(data[1], 'group', '0') 940 | expression.group = int(group) 941 | 942 | elif data[1].tag == 'slice': 943 | expression = POV.compile_slice(data[1]) 944 | else: 945 | raise Exception("unknown expr tag: %s" % data[1].tag) 946 | 947 | return assign, expression 948 | 949 | def parse_read(self, data): 950 | """ Parse a 'read' interaction XML element 951 | 952 | Args: 953 | data: XML Element defining the 'read' iteraction 954 | 955 | Returns: 956 | None 957 | 958 | Raises: 959 | AssertionError: If the 'delim' element is defined, it does not have 960 | only one child 961 | AssertionError: If the 'length' element is defined, it does not 962 | have only one child 963 | AssertionError: If both 'delim' and 'length' are specified 964 | AssertionError: If neither 'delim' and 'length' are specified 965 | AssertionError: If the 'match' element is defined, it does not have 966 | only one child 967 | AssertionError: If the 'timeout' element is defined, it does not 968 | have only one child 969 | """ 970 | # 971 | # 972 | 973 | # defaults 974 | read_args = {'timeout': 0} 975 | 
976 | # yay, pass by reference. this allows us to just return when we're out 977 | # of sub-elements. 978 | self.add_step('read', read_args) 979 | 980 | read_args['echo'] = POV.get_attribute(data, 'echo', 'no', ['yes', 'no', 981 | 'ascii']) 982 | 983 | assert len(data) > 0 984 | 985 | children = data.getchildren() 986 | 987 | read_until = children.pop(0) 988 | 989 | if read_until.tag == 'length': 990 | read_args['length'] = int(read_until.text) 991 | elif read_until.tag == 'delim': 992 | read_args['delim'] = self.parse_data(read_until, 'asciic', 993 | ['asciic', 'hex']) 994 | else: 995 | raise Exception('invalid first argument') 996 | 997 | if len(children) == 0: 998 | return 999 | current = children.pop(0) 1000 | 1001 | if current.tag == 'match': 1002 | invert = False 1003 | if POV.get_attribute(current, 'invert', 'false', 1004 | ['false', 'true']) == 'true': 1005 | invert = True 1006 | 1007 | assert len(current) > 0 1008 | 1009 | values = [] 1010 | for item in current: 1011 | if item.tag == 'data': 1012 | values.append(self.parse_data(item, 'asciic', 1013 | ['asciic', 'hex'])) 1014 | elif item.tag == 'pcre': 1015 | values.append(POV.compile_string('pcre', item.text)) 1016 | elif item.tag == 'var': 1017 | values.append(_ValueStr(item.text)) 1018 | else: 1019 | raise Exception('invalid data.match element name: %s' % 1020 | item.tag) 1021 | 1022 | read_args['match'] = {'invert': invert, 'values': values} 1023 | 1024 | if len(children) == 0: 1025 | return 1026 | current = children.pop(0) 1027 | 1028 | if current.tag == 'assign': 1029 | assign, expr = self.parse_assign(current) 1030 | read_args['assign'] = assign 1031 | read_args['expr'] = expr 1032 | if len(children) == 0: 1033 | return 1034 | current = children.pop(0) 1035 | 1036 | assert current.tag == 'timeout', "%s tag, not 'timeout'" % current.tag 1037 | read_args['timeout'] = int(current.text) 1038 | 1039 | @staticmethod 1040 | def parse_data(data, default=None, formats=None): 1041 | """ Parse a 'data'
element' 1042 | 1043 | Args: 1044 | data: XML Element defining the 'data' item 1045 | formats: Allowed formats 1046 | 1047 | Returns: 1048 | A 'normalized' string 1049 | 1050 | Raises: 1051 | AssertionError: If element is not named 'data' 1052 | AssertionError: If the element has more than one child 1053 | """ 1054 | 1055 | if formats is None: 1056 | formats = ['asciic', 'hex'] 1057 | 1058 | if default is None: 1059 | default = 'asciic' 1060 | 1061 | assert data.tag in ['data', 'delim', 'value'] 1062 | assert len(data.text) > 0 1063 | data_format = POV.get_attribute(data, 'format', default, formats) 1064 | return POV.compile_string(data_format, data.text) 1065 | 1066 | def parse_write(self, data): 1067 | """ Parse a 'write' interaction XML element 1068 | 1069 | Args: 1070 | data: XML Element defining the 'write' iteraction 1071 | 1072 | Returns: 1073 | None 1074 | 1075 | Raises: 1076 | AssertionError: If any of the child elements do not have the name 1077 | 'data' 1078 | AssertionError: If any of the 'data' elements have more than one 1079 | child 1080 | """ 1081 | # 1082 | # 1083 | # 1084 | 1085 | # self._add_variables(name) 1086 | 1087 | values = [] 1088 | assert len(data) > 0 1089 | for val in data: 1090 | if val.tag == 'data': 1091 | values.append(self.parse_data(val)) 1092 | else: 1093 | assert val.tag == 'var' 1094 | assert self.has_variable(val.text) 1095 | values.append(_ValueStr(val.text)) 1096 | 1097 | echo = POV.get_attribute(data, 'echo', 'no', ['yes', 'no', 'ascii']) 1098 | self.add_step('write', {'value': values, 'echo': echo}) 1099 | 1100 | def parse(self, raw_data, filename=None): 1101 | """ Parse the specified replay XML 1102 | 1103 | Args: 1104 | raw_data: Raw XML to be parsed 1105 | 1106 | Returns: 1107 | None 1108 | 1109 | Raises: 1110 | AssertionError: If the XML file has more than top-level children 1111 | (Expected: pov and doctype) 1112 | AssertionError: If the first child is not a Doctype instance 1113 | AssertionError: If the doctype does 
not specify the replay.dtd 1114 | AssertionError: If the second child is not named 'pov' 1115 | AssertionError: If the 'pov' element has more than two elements 1116 | AssertionError: If the 'pov' element does not contain a 'cbid' 1117 | element 1118 | AssertionError: If the 'cbid' element value is blank 1119 | """ 1120 | 1121 | self.filename = filename 1122 | 1123 | tree = ET.fromstring(raw_data) 1124 | assert tree.tag == 'pov' 1125 | assert len(tree) == 2 1126 | 1127 | assert tree[0].tag == 'cbid' 1128 | assert tree[0].text is not None and len(tree[0].text) > 0 1129 | self.name = tree[0].text 1130 | 1131 | assert tree[1].tag == 'replay' 1132 | 1133 | parse_fields = { 1134 | 'decl': self.parse_decl, 1135 | 'read': self.parse_read, 1136 | 'write': self.parse_write, 1137 | 'delay': self.parse_delay, 1138 | } 1139 | 1140 | for replay_element in tree[1]: 1141 | assert replay_element.tag in parse_fields 1142 | parse_fields[replay_element.tag](replay_element) 1143 | 1144 | def dump(self): 1145 | """ Print the steps in the POV, via repr 1146 | 1147 | Args: 1148 | None 1149 | 1150 | Returns: 1151 | None 1152 | 1153 | Raises: 1154 | None 1155 | """ 1156 | for step in self._steps: 1157 | print repr(step) 1158 | 1159 | 1160 | class Results(object): 1161 | """ Class to handle gathering result stats from Throw() instances """ 1162 | def __init__(self): 1163 | self.passed = 0 1164 | self.failed = 0 1165 | self.errors = 0 1166 | self.full_passed = 0 1167 | 1168 | def cb_pov_result(self, results): 1169 | """ 1170 | Throw() result callback 1171 | 1172 | Arguments: 1173 | results: tuple containing the number of results passed, failed, and 1174 | a list of logs 1175 | 1176 | Returns: 1177 | None 1178 | 1179 | Raises: 1180 | None 1181 | """ 1182 | got_passed, got_failed, got_logs = results 1183 | print '\n'.join(got_logs) 1184 | self.passed += got_passed 1185 | self.failed += got_failed 1186 | if got_failed > 0: 1187 | self.errors += 1 1188 | else: 1189 | self.full_passed += 1 1190 | 1191 | 1192 | def run_pov(src,
dst, pov_info, timeout, debug): 1193 | """ 1194 | Parse and Throw a POV/Poll 1195 | 1196 | Arguments: 1197 | src: IP/Port tuple for the source of the connection 1198 | dst: IP/Port tuple for the destination of the connection 1199 | pov_info: content/filename tuple of the POV 1200 | timeout: How long the POV communication is allowed to take 1201 | debug: Flag to enable debug logs 1202 | 1203 | Returns: 1204 | The number of passed tests 1205 | The number of failed tests 1206 | A list containing the logs 1207 | 1208 | Raises: 1209 | Exception if parsing the POV times out 1210 | """ 1211 | 1212 | xml, filename = pov_info 1213 | pov = POV() 1214 | try: 1215 | with Timeout(30): 1216 | pov.parse(xml, filename=filename) 1217 | except TimeoutException: 1218 | raise Exception("parsing %s timed out" % filename) 1219 | 1220 | thrower = Throw(src, dst, pov, timeout, debug) 1221 | try: 1222 | with Timeout(timeout): 1223 | thrower.run() 1224 | except TimeoutException: 1225 | thrower.log_fail('pov timed out') 1226 | thrower.dump() 1227 | 1228 | return thrower.passed, thrower.failed, thrower.logs 1229 | 1230 | 1231 | def main(): 1232 | """ Parse and Throw the POVs """ 1233 | parser = argparse.ArgumentParser(description='Send CGC Polls and POVs') 1234 | required = parser.add_argument_group(title='required arguments') 1235 | required.add_argument('--host', required=True, type=str, 1236 | help='IP address of CB server') 1237 | required.add_argument('--port', required=True, type=int, 1238 | help='PORT of the listening CB') 1239 | required.add_argument('files', metavar='xml_file', type=str, nargs='+', 1240 | help='POV/Poll XML file') 1241 | parser.add_argument('--source_host', required=False, type=str, default='', 1242 | help='Source IP address to use in connections') 1243 | parser.add_argument('--source_port', required=False, type=int, 1244 | default=0, help='Source port to use in connections') 1245 | parser.add_argument('--concurrent', required=False, type=int, default=1, 1246 | 
help='Number of Polls/POVs to throw concurrently') 1247 | parser.add_argument('--timeout', required=False, type=int, default=None, 1248 | help='Connect timeout') 1249 | parser.add_argument('--failure_ok', required=False, action='store_true', 1250 | default=False, 1251 | help='Failures for this test are accepted') 1252 | parser.add_argument('--debug', required=False, action='store_true', 1253 | default=False, help='Enable debugging output') 1254 | args = parser.parse_args() 1255 | 1256 | assert args.concurrent > 0, "Concurrent count must be at least 1" 1257 | 1258 | povs = [] 1259 | for pov_filename in args.files: 1260 | pov_xml = [] 1261 | with open(pov_filename, 'rb') as pov_fh: 1262 | pov_xml.append(pov_fh.read()) 1263 | 1264 | for xml in pov_xml: 1265 | povs.append((xml, pov_filename)) 1266 | 1267 | result_handler = Results() 1268 | pool = multiprocessing.Pool(args.concurrent) 1269 | pool_responses = [] 1270 | for pov in povs: 1271 | pov_args = ((args.source_host, args.source_port), 1272 | (args.host, args.port), pov, args.timeout, args.debug) 1273 | pool_response = pool.apply_async(run_pov, args=pov_args, 1274 | callback=result_handler.cb_pov_result) 1275 | pool_responses.append(pool_response) 1276 | 1277 | for response in pool_responses: 1278 | response.get() 1279 | 1280 | pool.close() 1281 | pool.join() 1282 | 1283 | print "# total tests passed: %d" % result_handler.passed 1284 | print "# total tests failed: %d" % result_handler.failed 1285 | print "# polls passed: %d" % result_handler.full_passed 1286 | print "# polls failed: %d" % result_handler.errors 1287 | 1288 | if args.failure_ok: 1289 | return 0 1290 | else: 1291 | return result_handler.errors != 0 1292 | 1293 | if __name__ == "__main__": 1294 | exit(main()) 1295 | -------------------------------------------------------------------------------- /bin/qemu-cgc: --------------------------------------------------------------------------------
https://raw.githubusercontent.com/mechaphish/cgrex/8ce02e323646fc83e61db8eb6b4dc7c0dfa0ae03/bin/qemu-cgc -------------------------------------------------------------------------------- /bin/qemu_bb_wrap.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | set -x 4 | 5 | ./qemu-cgc -D "$2/qemu_log.txt" -d exec,circular,in_asm $1 6 | 7 | -------------------------------------------------------------------------------- /bin/qemu_launcher.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | set -x 4 | 5 | echo "qemu_launcher.sh START" 6 | 7 | echo "$1 $2 $4" 8 | socat -d -d TCP-LISTEN:0,bind=localhost,reuseaddr EXEC:"timeout -k 70 65 $1 $2 $4" & 9 | PID=$! 10 | 11 | #wait for nc opening the port 12 | netstat -ltunp 2>/dev/null | grep " $PID/" > /dev/null 13 | STATUS=$? 14 | while [ $STATUS -eq 1 ] 15 | do 16 | sleep 1 17 | netstat -ltunp 2>/dev/null | grep " $PID/" > /dev/null 18 | STATUS=$? 19 | echo "waiting for socat..." 
20 | done 21 | #echo `cat /proc/$PID/cmdline` ### 22 | 23 | PORT=`netstat -ltunp 2>/dev/null | grep " $PID/" | awk -F':' '{print $2}' | awk -F' ' '{print $1}'` 24 | echo "port is $PORT" 25 | ./cb-replay_mod --timeout 60 --host 127.0.0.1 --port $PORT $3 26 | 27 | sleep 1 #giving qemu time to write everything, this is bad, but other solutions are bad too 28 | kill -9 $PID 29 | 30 | echo "qemu_launcher.sh END" 31 | 32 | -------------------------------------------------------------------------------- /bin/qemu_singlestep_wrap.sh: -------------------------------------------------------------------------------- 1 | #!/bin/sh 2 | 3 | ./qemu-cgc 2>"$2/qemu_stderr.txt" -singlestep -D "$2/qemu_log.txt" -d exec,in_asm,circular $1 4 | 5 | 6 | -------------------------------------------------------------------------------- /cgc_pin_tracer/cgc_pin_tracer.cpp: -------------------------------------------------------------------------------- 1 | /** CGC Instruction Count Example 2 | * @author Lok Yan 3 | * @data 25 Aug 2014 4 | 5 | * Note that this was only tested on 32bit systems - we might need to fix some 6 | * copying operations 7 | **/ 8 | 9 | #include 10 | #include 11 | #include "pin.H" 12 | 13 | static UINT64 icount = 0; 14 | static ADDRINT LastIP1 = 0; 15 | static ADDRINT LastIP2 = 0; 16 | static ADDRINT LastBB1addr = 0; 17 | static ADDRINT LastBB1size = 0; 18 | static ADDRINT LastBB2addr = 0; 19 | static ADDRINT LastBB2size = 0; 20 | static UINT32 Signal=0; 21 | static UINT32 ExitCode=0; 22 | static ADDRINT DEBUG = 0; 23 | 24 | /********************************************/ 25 | /** START OF CGC SYS_CALL EMULATOR SECTION **/ 26 | /********************************************/ 27 | 28 | //#include 29 | #include 30 | #include // for dup, write, etc. 
31 | #include 32 | #include // for mmap 33 | #include 34 | #include //for fopen 35 | 36 | //Include the cgc definitions 37 | #include "libcgc_pin.h" 38 | 39 | //a mode that passes the syscalls directly through to the kernel 40 | KNOB KnobModePassthrough(KNOB_MODE_WRITEONCE, "pintool", 41 | "p", "1", "Syscall passthrough mode"); 42 | 43 | //a mode that emulates the system calls using file inputs 44 | KNOB KnobModeEmulation(KNOB_MODE_WRITEONCE, "pintool", 45 | "e", "0", "Syscall emulation mode"); 46 | KNOB KnobFDMap(KNOB_MODE_APPEND, "pintool", 47 | "fd", "" , "File descriptor number to file mappings, \n" 48 | "\t e.g., -fd 0,mystdin will use 'mystdin' as input to fd 0\n" 49 | "\t -fd 0,mystdin -fd 1,mystdout will use 'mystdin' as input to 0 and 'mystdout' to 1"); 50 | KNOB KnobRandFile(KNOB_MODE_WRITEONCE, "pintool", 51 | "rand", "/dev/urandom", "Filename for the source of random bytes"); 52 | 53 | //NOTE: We are going to use a MAP for now - but perhaps a clean implementation will have 54 | // a smaller footprint 55 | /* -- unordered_map is unsupported in cgc vm 56 | #include 57 | typedef std::unordered_map fd_map_t; 58 | */ 59 | 60 | #include 61 | typedef std::map fd_map_t; 62 | fd_map_t cgc_fds; //faster than a regular map 63 | 64 | FILE* randfd = NULL; 65 | 66 | VOID cgc_cleanup_files() 67 | { 68 | //close all of the open files 69 | while (!cgc_fds.empty()) 70 | { 71 | fd_map_t::iterator it = cgc_fds.begin(); 72 | fclose(it->second); 73 | cgc_fds.erase(it); 74 | } 75 | 76 | if (randfd != NULL) 77 | { 78 | fclose(randfd); 79 | } 80 | } 81 | 82 | /** 83 | * This function is called before the analysis target is loaded 84 | **/ 85 | VOID cgc_init(VOID* v) 86 | { 87 | //0. Make sure that the size of fd_set and cgc_fd_set are the same 88 | if (sizeof(fd_set) != sizeof(cgc_fd_set)) 89 | { 90 | cerr << "ERROR!!! The sizeof native fd_set and cgc_fd_set are not the same" << endl; 91 | cgc_cleanup_files(); 92 | exit(-2); 93 | } 94 | 95 | //1. 
First, we want to open up the randomness source 96 | //The default value for KnobRandFile is /dev/urandom so KnobRandFile should always be well 97 | // defined 98 | randfd = fopen(KnobRandFile.Value().c_str(), "rb"); 99 | if (randfd == NULL) 100 | { 101 | cerr << "The random source file [" << KnobRandFile.Value() << "] could not be opened" << endl; 102 | cgc_cleanup_files(); 103 | exit(-1); 104 | } 105 | 106 | //2. Next we want to see if emulation mode is enabled, if so then open up the rest of the files 107 | //if we are going to emulate, then we would like to load the files corresponding 108 | // to the defined fd numbers 109 | if (KnobModeEmulation) 110 | { 111 | //NOTE: All files are opened for read/write and in binary mode 112 | for (size_t i = 0; i < KnobFDMap.NumberOfValues(); i++) 113 | { 114 | const string& str = KnobFDMap.Value(i); 115 | std::string::size_type commaPos = str.find(','); 116 | 117 | int fdnum = 0; 118 | 119 | /** std::stoi doesn't seem to exist in the cgc VM either -- so switching to strtol 120 | try 121 | { 122 | fdnum = std::stoi(str.substr(0, commaPos - 1)); 123 | } 124 | catch (std::invalid_argument e) 125 | { 126 | cerr << "Invalid argument received [" << str << "]" << endl; 127 | cgc_cleanup_files(); 128 | exit(-1); 129 | } 130 | **/ 131 | 132 | const char* cstr = str.c_str(); 133 | char* tempp = NULL; 134 | 135 | fdnum = strtol(cstr, &tempp, 10); 136 | //I am counting on some initial tests that shows strtol will stop at 137 | // the , if conversion was good and it will go beyond it if conversion 138 | // was unsuccessful 139 | if ( (tempp != (cstr + commaPos)) ) 140 | { 141 | cerr << "Could not convert the fd in [" << str << "]" << endl; 142 | cgc_cleanup_files(); 143 | exit(-1); 144 | } 145 | 146 | FILE* fp = fopen(str.substr(commaPos + 1).c_str(), "r+b"); 147 | if (fp == NULL) 148 | { 149 | cerr << "Could not open file [" << str.substr(commaPos+1) << "] for rw" << endl; 150 | cgc_cleanup_files(); 151 | exit(-1); 152 | } 153 | 154 | 
if (cgc_fds.find(fdnum) != cgc_fds.end()) 155 | { 156 | //if it exists already then error 157 | cerr << "The file for fd [" << fdnum << "] has already been defined" << endl; 158 | cgc_cleanup_files(); 159 | exit(-1); 160 | } 161 | 162 | //everything checks out so add in the new entry 163 | cgc_fds[fdnum] = fp; 164 | }//end for KnobFDMap.NumberOfValues 165 | }//END if emulation 166 | 167 | /** INSERT PINTOOLS INITIALIZATIONS HERE **/ 168 | /** END PINTOOLS INITIALIZATION SECTION **/ 169 | } 170 | 171 | /** 172 | * This function is called when the program ends 173 | * That is BEFORE the terminate system call is called 174 | **/ 175 | VOID cgc_cleanup(INT32 code, VOID* v) 176 | { 177 | cgc_cleanup_files(); 178 | } 179 | 180 | 181 | 182 | #define CGC_SET_RETURN(_ctx, _val) PIN_SetContextReg(_ctx, REG_EAX, _val) 183 | 184 | /** 185 | * This function is called if the system call number is 1 (_terminate) 186 | * Notice that we don't need to do anything here because 1 is also 187 | * the system call number for sys_exit in 32bit linux. 188 | * If the host is 64bit, then we will need to change the sys_call number 189 | * to the proper value. Might have to fix the register values as well. 190 | **/ 191 | VOID emulate_terminate(CONTEXT* ctx) 192 | { 193 | ADDRINT curIP = PIN_GetContextReg(ctx, REG_EIP); 194 | DEBUG = curIP; 195 | 196 | if((curIP & 0xffff0000)==0x9000000){ 197 | PIN_SetContextReg(ctx, REG_EIP, curIP + 2); 198 | PIN_ExecuteAt(ctx); //execute after the instruction 199 | } 200 | } 201 | 202 | /** 203 | * Emulation function for sys call 2 (transmit) 204 | * The basic idea is that, we can either pass this call through to the host 205 | * in which case we will need to worry about the file descriptors being 206 | * shared between the pintool and the analysis target.
We get around this 207 | * problem by writing our own wrapper launcher that will preallocate the 208 | * file descriptors that the analysis target might need before transferring 209 | * control to the pintool. This way, the fds used by the pintools will be 210 | * above the ones needed by the CB. 211 | * The other method is to emulate the file descriptors. This works by changing 212 | * all transmits into fwrites into the files that the user have defined 213 | * using the -fd #,filename arguments. If an argument was not defined, then 214 | * we default to passthrough mode (e.g. when user only passes in -fd 0,input.txt 215 | * we will emulate receive from fd 0 with a fread from input.txt but we will 216 | * just pass transmits to fd 1 right into stdout.) 217 | **/ 218 | VOID emulate_transmit(CONTEXT* ctx) 219 | { 220 | cgc_size_t stemp = 0; 221 | 222 | size_t stret = 0; 223 | ssize_t sstret = 0; 224 | 225 | if (ctx == NULL) 226 | { 227 | return; 228 | } 229 | 230 | //int transmit(int fd, const void *buf, size_t count, size_t *tx_bytes); 231 | ADDRINT fd = PIN_GetContextReg(ctx, REG_EBX); 232 | ADDRINT buf = PIN_GetContextReg(ctx, REG_ECX); 233 | ADDRINT count = PIN_GetContextReg(ctx, REG_EDX); 234 | ADDRINT tx_bytes = PIN_GetContextReg(ctx, REG_ESI); 235 | 236 | 237 | /** 238 | EBADF fd is not a valid file descriptor 239 | or is not open. 240 | EFAULT buf or tx_bytes points to an 241 | invalid address. 242 | **/ 243 | 244 | //TODO: Make sure that the error logic is the same as in the kernel 245 | // For example, right now we make sure that tx_bytes is writeable first 246 | // before actually calling write. 
247 | 248 | if ((void*)tx_bytes != NULL) 249 | { 250 | if (PIN_SafeCopy((void*)(&stemp), (void*)(tx_bytes), sizeof(cgc_size_t)) != sizeof(cgc_size_t)) 251 | { 252 | CGC_SET_RETURN(ctx, -CGC_EFAULT); 253 | goto SKIP_INT; 254 | } 255 | } 256 | 257 | //we will try the emulation mode first - since not all fds might be defined 258 | // for the ones that are not defined - we want to just pass through 259 | 260 | if (KnobModeEmulation) 261 | { 262 | //in this case, we want to send it out to a file instead of using write 263 | fd_map_t::iterator it = cgc_fds.find(fd); 264 | if (it != cgc_fds.end()) //if it exists then process 265 | { 266 | stret = fwrite((void*)buf, 1, (size_t)count, it->second); 267 | 268 | //NOTE: is there anything to do with the return value? 269 | if (stret < count) 270 | { 271 | //TODO:what to return if there is an error? 272 | } 273 | 274 | if ((void*)tx_bytes != NULL) 275 | { 276 | stret = PIN_SafeCopy((void*)tx_bytes, (void*)(&stret), sizeof(cgc_size_t)); //this should work 277 | } 278 | 279 | CGC_SET_RETURN(ctx, 0); 280 | goto SKIP_INT; 281 | } 282 | 283 | //if the entry is not found then just pass it through 284 | } 285 | 286 | //PASSTHROUGH MODE - KnobModePassthrough is always TRUE 287 | //if we are just passing it through then call write 288 | sstret = write(fd, (void*)buf, count); 289 | if (sstret >= 0) 290 | { 291 | CGC_SET_RETURN(ctx, 0); // we wrote something so set the return to 0 292 | 293 | if ((void*)tx_bytes != NULL) 294 | { 295 | stret = PIN_SafeCopy((void*)tx_bytes, (void*)(&sstret), sizeof(cgc_size_t)); //this should work 296 | } 297 | } 298 | else // an error occurred 299 | { 300 | switch (sstret) 301 | { 302 | case (-EINVAL): 303 | case (-EBADF): 304 | { 305 | CGC_SET_RETURN(ctx, -CGC_EBADF); 306 | break; 307 | } 308 | default: 309 | { 310 | CGC_SET_RETURN(ctx, -CGC_EFAULT); 311 | break; 312 | } 313 | } 314 | } 315 | 316 | SKIP_INT: 317 | ADDRINT curIP = PIN_GetContextReg(ctx, REG_EIP); 318 | 319 | //int 0x80 is cd 80 in hex which 
is two bytes 320 | //get the parameters off the stack first 321 | PIN_SetContextReg(ctx, REG_EIP, curIP + 2); 322 | PIN_ExecuteAt(ctx); //execute after the instruction 323 | } 324 | 325 | /** 326 | * Function to emulate syscall 3 (receive) 327 | * See the comments for emulate_transmit 328 | **/ 329 | VOID emulate_receive(CONTEXT* ctx) 330 | { 331 | cgc_size_t stemp = 0; 332 | size_t stret = 0; 333 | ssize_t sstret = 0; 334 | 335 | if (ctx == NULL) 336 | { 337 | return; 338 | } 339 | 340 | //int receive(int fd, void *buf, size_t count, size_t *rx_bytes) 341 | 342 | ADDRINT fd = PIN_GetContextReg(ctx, REG_EBX); 343 | ADDRINT buf = PIN_GetContextReg(ctx, REG_ECX); 344 | ADDRINT count = PIN_GetContextReg(ctx, REG_EDX); 345 | ADDRINT rx_bytes = PIN_GetContextReg(ctx, REG_ESI); 346 | 347 | /** 348 | EBADF fd is not a valid file descriptor 349 | or is not open. 350 | EFAULT buf or rx_bytes points to an 351 | invalid address. 352 | **/ 353 | 354 | //TODO: Make sure that the error logic is the same as in the kernel 355 | // For example, right now we make sure that rx_bytes is writeable first 356 | // before actually calling write. 357 | 358 | if ((void*)rx_bytes != NULL) 359 | { 360 | if (PIN_SafeCopy((void*)(&stemp), (void*)(rx_bytes), sizeof(cgc_size_t)) != sizeof(cgc_size_t)) 361 | { 362 | CGC_SET_RETURN(ctx, -CGC_EFAULT); 363 | goto SKIP_INT; 364 | } 365 | } 366 | 367 | if (KnobModeEmulation) 368 | { 369 | //in this case, we want to read from a file 370 | fd_map_t::iterator it = cgc_fds.find(fd); 371 | if (it != cgc_fds.end()) //if it exists then process 372 | { 373 | stret = fread((void*)buf, 1, (size_t)count, it->second); 374 | 375 | if (stret == 0) //if there is an error 376 | { 377 | if (!feof(it->second)) 378 | { 379 | //TODO: What to do if there is an error - and not just the end of file? 380 | } 381 | } 382 | //NOTE: is there anything to do with the return value? 
383 | if ((void*)rx_bytes != NULL) 384 | { 385 | stret = PIN_SafeCopy((void*)rx_bytes, (void*)(&stret), sizeof(cgc_size_t)); //this should work 386 | } 387 | 388 | CGC_SET_RETURN(ctx, 0); 389 | goto SKIP_INT; 390 | } 391 | 392 | //if the entry is not found then just pass it through 393 | } 394 | 395 | //PASSTHROUGH MODE 396 | //if we are just passing it through then call read 397 | sstret = read(fd, (void*)buf, count); 398 | if (sstret >= 0) 399 | { 400 | CGC_SET_RETURN(ctx, 0); // the read succeeded so set the return to 0 401 | 402 | if ((void*)rx_bytes != NULL) 403 | { 404 | stret = PIN_SafeCopy((void*)rx_bytes, (void*)(&sstret), sizeof(cgc_size_t)); //this should work 405 | } 406 | } 407 | else // an error occurred 408 | { 409 | switch (sstret) 410 | { 411 | case (-EINVAL): 412 | case (-EBADF): 413 | { 414 | CGC_SET_RETURN(ctx, -CGC_EBADF); 415 | break; 416 | } 417 | default: 418 | { 419 | CGC_SET_RETURN(ctx, -CGC_EFAULT); 420 | break; 421 | } 422 | } 423 | } 424 | 425 | SKIP_INT: 426 | ADDRINT curIP = PIN_GetContextReg(ctx, REG_EIP); 427 | 428 | //int 0x80 is cd 80 in hex which is two bytes 429 | //get the parameters off the stack first 430 | PIN_SetContextReg(ctx, REG_EIP, curIP + 2); 431 | PIN_ExecuteAt(ctx); //execute after the instruction 432 | } 433 | 434 | 435 | 436 | int CGC_FD_IS_SET_EMPTY(cgc_fd_set* set) 437 | { 438 | cgc_size_t i = 0; 439 | if (set == NULL) 440 | { 441 | return (1); 442 | } 443 | 444 | for (i = 0; i < (CGC_FD_SETSIZE / CGC_NFDBITS); i++) 445 | { 446 | if (set->_fd_bits[i] != 0) 447 | { 448 | return (0); 449 | } 450 | } 451 | 452 | return (1); 453 | } 454 | 455 | /** 456 | * This emulates syscall 4 (fdwait) 457 | * Since fdwait is just like select, we will pass through 458 | * the parameters directly to select. This assumes that 459 | * the definitions for fd_set are the same between the CGC binary 460 | * and the host system. We do a quick and dirty check by 461 | * making sure that they are the same size.
The definitions for FD_SETSIZE and 462 | * CGC_FD_SETSIZE might be different. 463 | * If we are running in emulation mode, then what we do is just 464 | * set the file descriptors ourselves as long as the fd number 465 | * to file map is defined. If it is not defined then we pass 466 | * the undefined fds to select. 467 | **/ 468 | VOID emulate_fdwait(CONTEXT* ctx) 469 | { 470 | int iret = 0; 471 | int numReady = 0; 472 | size_t stret = 0; 473 | 474 | cgc_fd_set tempReadSet; 475 | cgc_fd_set* pTempReadSet = NULL; 476 | cgc_fd_set tempWriteSet; 477 | cgc_fd_set* pTempWriteSet = NULL; 478 | 479 | cgc_fd_set retReadSet; 480 | cgc_fd_set* pRetReadSet = NULL; 481 | cgc_fd_set retWriteSet; 482 | cgc_fd_set* pRetWriteSet = NULL; 483 | 484 | /** 485 | int fdwait(int nfds, fd_set *readfds, fd_set *writefds, const struct timeval *timeout, 486 | int *readyfds); 487 | **/ 488 | 489 | ADDRINT nfds = PIN_GetContextReg(ctx, REG_EBX); 490 | ADDRINT readfds = PIN_GetContextReg(ctx, REG_ECX); 491 | ADDRINT writefds = PIN_GetContextReg(ctx, REG_EDX); 492 | ADDRINT timeout = PIN_GetContextReg(ctx, REG_ESI); 493 | ADDRINT readyfds = PIN_GetContextReg(ctx, REG_EDI); 494 | 495 | /** 496 | EBADF an invalid file descriptor was 497 | given in one of the sets (per‐ 498 | haps a file descriptor that was 499 | already closed, or one on which 500 | an error has occurred). 501 | EINVAL nfds is negative or the value 502 | contained within *timeout is 503 | invalid. 504 | 505 | EFAULT One of the arguments readfds, 506 | writefds, timeout, readyfds 507 | points to an invalid address. 508 | ENOMEM unable to allocate memory for 509 | internal tables. 
510 | **/ 511 | 512 | 513 | if ((int)nfds < 0) 514 | { 515 | CGC_SET_RETURN(ctx, -CGC_EINVAL); 516 | goto SKIP_INT; 517 | } 518 | 519 | //NOTE: We are enforcing the less than CGC_FD_SETSIZE which might not be what 520 | // the kernel is doing 521 | if ((int)nfds > CGC_FD_SETSIZE) 522 | { 523 | CGC_SET_RETURN(ctx, -CGC_EINVAL); 524 | goto SKIP_INT; 525 | } 526 | 527 | //first we make a copy of the fd_set lists 528 | if ((cgc_fd_set*)readfds != NULL) 529 | { 530 | pTempReadSet = &tempReadSet; 531 | stret = PIN_SafeCopy((void*)pTempReadSet, (void*)readfds, sizeof(cgc_fd_set)); 532 | if (stret != sizeof(cgc_fd_set)) 533 | { 534 | CGC_SET_RETURN(ctx, -CGC_EFAULT); 535 | goto SKIP_INT; 536 | } 537 | 538 | //we also want the retReadSet to be zeroed out 539 | // it will be set as fds are ready 540 | pRetReadSet = &retReadSet; 541 | CGC_FD_ZERO(pRetReadSet); 542 | } 543 | 544 | if ((cgc_fd_set*)writefds != NULL) 545 | { 546 | pTempWriteSet = &tempWriteSet; 547 | stret = PIN_SafeCopy((void*)pTempWriteSet, (void*)writefds, sizeof(cgc_fd_set)); 548 | if (stret != sizeof(cgc_fd_set)) 549 | { 550 | CGC_SET_RETURN(ctx, -CGC_EFAULT); 551 | goto SKIP_INT; 552 | } 553 | 554 | pRetWriteSet = &retWriteSet; 555 | CGC_FD_ZERO(pRetWriteSet); 556 | } 557 | 558 | //now we should have pTempReadSet and pTrempWriteSet pointing to copies 559 | // of the corresponding sets - OR NULL 560 | 561 | //So lets go through emulation mode first and see if there are any fds that 562 | // are already mapped 563 | if (KnobModeEmulation) 564 | { 565 | for (int i = 0; (i < (int)nfds); i++) 566 | { 567 | if (cgc_fds.find(i) != cgc_fds.end()) 568 | { 569 | //if the file exists - then remove the set bit now 570 | if (pTempReadSet != NULL) 571 | { 572 | //if we are watching this particular fd 573 | if (CGC_FD_ISSET(i, pTempReadSet)) 574 | { 575 | //clear the corresponding bit in case we need to call select later 576 | CGC_FD_CLR(i, pTempReadSet); 577 | //but also set the same bit in the return set 578 | 
CGC_FD_SET(i, pRetReadSet); 579 | //increment the number of Ready fds 580 | numReady++; 581 | } 582 | } 583 | 584 | //do the same for the write set 585 | if (pTempWriteSet != NULL) 586 | { 587 | //if we are watching this particular fd 588 | if (CGC_FD_ISSET(i, pTempWriteSet)) 589 | { 590 | //clear the corresponding bit in case we need to call select later 591 | CGC_FD_CLR(i, pTempWriteSet); 592 | //but also set the same bit in the return set 593 | CGC_FD_SET(i, pRetWriteSet); 594 | numReady++; 595 | } 596 | } 597 | } 598 | } 599 | 600 | //At this point in time - temp*Set should be the left over any real fds that we don't have a 601 | // file mapping to. We don't change nfds because one of the unmapped fds could be at the end 602 | //TODO: Finally, note that the above logic WILL NOT WORK for read-only or write-only fds 603 | // such as 0, 1, and 2. They will be counted twice since all files are opened as read and 604 | // writeable. 605 | } 606 | 607 | //PASSTHROUGH MODE 608 | /** 609 | static int asmlinkage 610 | cgcos_fdwait(int nfds, fd_set __user *readfds, fd_set __user *writefds, 611 | struct timeval __user *timeout, int __user *readyfds) { 612 | int res; 613 | if (readyfds != NULL && 614 | !access_ok(VERIFY_WRITE, readyfds, sizeof(*readyfds))) 615 | return (-EFAULT); 616 | 617 | res = sys_select(nfds, readfds, writefds, NULL, timeout); 618 | 619 | if (res < 0) 620 | return (res); 621 | 622 | if (readyfds != NULL && copy_to_user(readyfds, &res, sizeof(*readyfds))) 623 | return (-EFAULT); 624 | 625 | return (0); 626 | } 627 | **/ 628 | 629 | //if its emulation mode - then we want to skip this if the fdsets are now empty 630 | if (CGC_FD_IS_SET_EMPTY(pTempReadSet) && CGC_FD_IS_SET_EMPTY(pTempWriteSet)) 631 | { 632 | //if they are both empty then just skip the select step but to make sure iret is 0 633 | iret = 0; 634 | } 635 | else 636 | { 637 | iret = select((int)nfds, (fd_set*)pTempReadSet, (fd_set*)pTempWriteSet, NULL, (struct timeval*)timeout); 638 | } 639 | 
640 | //if its emulation mode we need to combine the results 641 | if (iret < 0) 642 | { 643 | //either way - if select failed then an error occurred so we will just ignore 644 | // all of temporary work we did above with the temporary knobs 645 | CGC_SET_RETURN(ctx, iret); 646 | goto SKIP_INT; 647 | } 648 | else 649 | { 650 | if (KnobModeEmulation && (numReady > 0)) 651 | { 652 | if (iret > 0) 653 | { 654 | //if its emulation mode AND we have some fds that are set AND select set some more 655 | // then we need to merge the previous results with the ones from tempRead and WriteSets 656 | for (cgc_size_t i = 0; i < (CGC_FD_SETSIZE / CGC_NFDBITS); i++) 657 | { 658 | if ( (pRetReadSet != NULL) && (pTempReadSet != NULL) ) //the two points should be consistent 659 | { 660 | pRetReadSet->_fd_bits[i] |= pTempReadSet->_fd_bits[i]; 661 | } 662 | if ( (pRetWriteSet != NULL) && (pTempWriteSet != NULL) ) 663 | { 664 | pRetWriteSet->_fd_bits[i] |= pTempWriteSet->_fd_bits[i]; 665 | } 666 | } 667 | } 668 | 669 | //We also need to update the total number 670 | numReady += iret; 671 | } 672 | else 673 | { 674 | //since KnobModeEmulation was not set or there weren't anything of interest 675 | // we can just set the return pointers to the tempSets that we passed into select 676 | pRetReadSet = pTempReadSet; 677 | pRetWriteSet = pTempWriteSet; 678 | numReady = iret; 679 | } 680 | 681 | //by now pRetReadSet and pRetWriteSet should have all of the bits for ready fds 682 | // set and numReady has the number of fds that are ready 683 | 684 | //Lets copy back to the user 685 | if ( ((void*)readfds != NULL) && (pRetReadSet != NULL) ) //once again these should be consistently NULL or non NULL at the same time. 
686 | { 687 | stret = PIN_SafeCopy((void*)readfds, (void*)pRetReadSet, sizeof(cgc_fd_set)); 688 | if (stret != sizeof(cgc_fd_set)) 689 | { 690 | CGC_SET_RETURN(ctx, -CGC_EFAULT); 691 | goto SKIP_INT; 692 | } 693 | } 694 | 695 | if ( ((void*)writefds != NULL) && (pRetWriteSet != NULL) ) //once again these should be consistently NULL or non NULL at the same time. 696 | { 697 | stret = PIN_SafeCopy((void*)writefds, (void*)pRetWriteSet, sizeof(cgc_fd_set)); 698 | if (stret != sizeof(cgc_fd_set)) 699 | { 700 | CGC_SET_RETURN(ctx, -CGC_EFAULT); 701 | goto SKIP_INT; 702 | } 703 | } 704 | 705 | if ((int*)readyfds != NULL) 706 | { 707 | stret = PIN_SafeCopy((void*)readyfds, (void*)(&numReady), sizeof(int)); 708 | if (stret != sizeof(int)) 709 | { 710 | CGC_SET_RETURN(ctx, -CGC_EFAULT); 711 | goto SKIP_INT; 712 | } 713 | } 714 | 715 | CGC_SET_RETURN(ctx, 0); //success 716 | }//end else error from select 717 | 718 | SKIP_INT: 719 | ADDRINT curIP = PIN_GetContextReg(ctx, REG_EIP); 720 | 721 | //int 0x80 is cd 80 in hex which is two bytes 722 | //get the parameters off the stack first 723 | PIN_SetContextReg(ctx, REG_EIP, curIP + 2); 724 | PIN_ExecuteAt(ctx); //execute after the instruction 725 | } 726 | 727 | /** 728 | * allocate is a wrapper of sorts for mmap so we just pass 729 | * this one through mmap. There is no difference between emulation 730 | * and passthrough mode at the moment. 731 | * As one would expect, and is well documented, the allocation 732 | * behavior is going to be different between the CGC binary running 733 | * in PIN and one running natively by itself. 
734 | **/ 735 | VOID emulate_allocate(CONTEXT* ctx) 736 | { 737 | ADDRINT temp = 0; 738 | void* p = NULL; 739 | 740 | if (ctx == NULL) 741 | { 742 | return; 743 | } 744 | 745 | //int allocate(size_t length, int is_X, void **addr) 746 | 747 | ADDRINT len = PIN_GetContextReg(ctx, REG_EBX); 748 | ADDRINT is_X = PIN_GetContextReg(ctx, REG_ECX); 749 | ADDRINT addr = PIN_GetContextReg(ctx, REG_EDX); 750 | 751 | /** 752 | EINVAL length is zero. 753 | EINVAL length is too large. 754 | EFAULT addr points to an invalid 755 | address. 756 | ENOMEM No memory is available or the 757 | process' maximum number of 758 | allocations would have been 759 | exceeded. 760 | **/ 761 | 762 | if (len == 0) 763 | { 764 | CGC_SET_RETURN(ctx, -CGC_EINVAL); 765 | goto SKIP_INT; 766 | } 767 | 768 | if ((void*)addr == NULL) 769 | { 770 | CGC_SET_RETURN(ctx, -CGC_EFAULT); 771 | goto SKIP_INT; 772 | } 773 | 774 | //try to read from the target address first to see if the memory address is valid 775 | if (PIN_SafeCopy(&temp, (void*)(addr), sizeof(ADDRINT)) != sizeof(ADDRINT)) 776 | { 777 | CGC_SET_RETURN(ctx, -CGC_EFAULT); 778 | goto SKIP_INT; 779 | } 780 | 781 | //if we are here then the addr is valid so lets call mmap 782 | p = mmap(0, len, PROT_READ | PROT_WRITE | (is_X ? 
PROT_EXEC : 0), MAP_PRIVATE | MAP_ANON, -1, 0); 783 | 784 | if (p == (void*)(-1)) 785 | { 786 | CGC_SET_RETURN(ctx, -CGC_EINVAL); 787 | } 788 | else 789 | { 790 | PIN_SafeCopy((void*)addr, &p, sizeof(void*)); 791 | CGC_SET_RETURN(ctx, 0); 792 | } 793 | 794 | SKIP_INT: 795 | ADDRINT curIP = PIN_GetContextReg(ctx, REG_EIP); 796 | 797 | //int 0x80 is cd 80 in hex which is two bytes 798 | //get the parameters off the stack first 799 | PIN_SetContextReg(ctx, REG_EIP, curIP + 2); 800 | PIN_ExecuteAt(ctx); //execute after the instruction 801 | } 802 | 803 | /** 804 | * See emulate_allocate for more info 805 | **/ 806 | VOID emulate_deallocate(CONTEXT* ctx) 807 | { 808 | int ret = 0; 809 | 810 | if (ctx == NULL) 811 | { 812 | return; 813 | } 814 | 815 | //int deallocate(void *addr, size_t length) 816 | 817 | ADDRINT addr = PIN_GetContextReg(ctx, REG_EBX); 818 | ADDRINT len = PIN_GetContextReg(ctx, REG_ECX); 819 | 820 | /** 821 | EINVAL addr is not page aligned. 822 | EINVAL length is zero. 823 | EINVAL any part of the region being 824 | deallocated is outside the 825 | valid address range of the 826 | process. 
827 | **/ 828 | 829 | if ( (addr & (~CGC_PAGE_MASK)) || (len == 0) ) 830 | { 831 | CGC_SET_RETURN(ctx, -CGC_EINVAL); 832 | goto SKIP_INT; 833 | } 834 | 835 | ret = munmap((void*)addr, len); 836 | 837 | if (ret != 0) 838 | { 839 | CGC_SET_RETURN(ctx, -CGC_EINVAL); 840 | } 841 | else 842 | { 843 | CGC_SET_RETURN(ctx, 0); 844 | } 845 | 846 | SKIP_INT: 847 | ADDRINT curIP = PIN_GetContextReg(ctx, REG_EIP); 848 | 849 | //int 0x80 is cd 80 in hex which is two bytes 850 | //get the parameters off the stack first 851 | PIN_SetContextReg(ctx, REG_EIP, curIP + 2); 852 | PIN_ExecuteAt(ctx); //execute after the instruction 853 | } 854 | 855 | /** 856 | * According to the kernel documentation there are three 857 | * ways to get_random_bytes - the first is the kernel function 858 | * with the same name, the second is from the user space through 859 | * /dev/random and the third is from /dev/urandom 860 | * Since there isn't a direct sys_call for random, we will emulate 861 | * this from the userspace. 862 | * According to the source in linux-source-3.13.2-cgc/drivers/char/random.c 863 | * get_random_bytes calls extract_entropy using the nonblocking_pool 864 | * /dev/random calls the userspace version using the blocking_pool 865 | * /dev/urandom calls it using the nonblocking_pool 866 | * and so, we will use /dev/urandom as the randomness source 867 | * No matter the mode, randfd should be pointing to the right file or 868 | * /dev/urandom 869 | **/ 870 | VOID emulate_random(CONTEXT* ctx) 871 | { 872 | size_t stret = 0; 873 | //int random(void *buf, size_t count, size_t *rnd_bytes) 874 | 875 | ADDRINT buf = PIN_GetContextReg(ctx, REG_EBX); 876 | ADDRINT count = PIN_GetContextReg(ctx, REG_ECX); 877 | ADDRINT rnd_bytes = PIN_GetContextReg(ctx, REG_EDX); 878 | 879 | /** 880 | EINVAL count is invalid. 881 | EFAULT buf or rnd_bytes points to an 882 | invalid address. 
883 | */ 884 | 885 | stret = fread((void*)buf, 1, (size_t)count, randfd); 886 | if (stret < count) //an error has occurred 887 | { 888 | //first we check to see if eof has been reached 889 | if (feof(randfd)) //this should only happen if randfd is a regular file, /dev/urandom should not give us an eof 890 | { 891 | size_t bytesLeft = (size_t) count - stret; 892 | fseek(randfd, 0, SEEK_SET); //go back to the beginning of the file 893 | stret = fread((void*)(buf + stret), 1, bytesLeft, randfd); //read again 894 | if (stret < bytesLeft) //another error - then just die 895 | { 896 | //TODO: How to die? Neither EINVAL nor EFAULT seems to be the right thing to do 897 | } 898 | else 899 | { 900 | //success so set the return value and go 901 | goto SUCCESS; //not really needed 902 | } 903 | } 904 | else 905 | { 906 | //NOTE: We will just assume that buf is wrong 907 | CGC_SET_RETURN(ctx, -CGC_EFAULT); 908 | goto SKIP_INT; 909 | } 910 | } 911 | 912 | SUCCESS: 913 | if ((void*)rnd_bytes != NULL) 914 | { 915 | //rnd_bytes is optional, so only copy the count back when it was supplied 916 | stret = PIN_SafeCopy((void*)rnd_bytes, (void*)(&count), sizeof(cgc_size_t)); 917 | } 918 | //always set the return value on success, even when rnd_bytes is NULL 919 | CGC_SET_RETURN(ctx, 0); 920 | SKIP_INT: 921 | ADDRINT curIP = PIN_GetContextReg(ctx, REG_EIP); 922 | 923 | //int 0x80 is cd 80 in hex which is two bytes 924 | //get the parameters off the stack first 925 | PIN_SetContextReg(ctx, REG_EIP, curIP + 2); 926 | PIN_ExecuteAt(ctx); //execute after the instruction 927 | } 928 | 929 | /** 930 | * Our own small little syscall handler 931 | **/ 932 | VOID cgc_syscallHandler(CONTEXT* ctx) 933 | { 934 | if (ctx == NULL) 935 | { 936 | return; 937 | } 938 | 939 | //Check the syscall number 940 | switch(PIN_GetContextReg(ctx, REG_EAX)) 941 | { 942 | case (_TERMINATE): 943 | { 944 | emulate_terminate(ctx); 945 | break; 946 | } 947 | case (_TRANSMIT): 948 | { 949 | emulate_transmit(ctx); 950 | break; 951 | } 952 | case (_RECEIVE): 953 | { 954 | emulate_receive(ctx); 955 | break; 956 | } 957 | case (_FDWAIT): 958 | { 959 | emulate_fdwait(ctx); 
960 | break; 961 | } 962 | case (_ALLOCATE): 963 | { 964 | emulate_allocate(ctx); 965 | break; 966 | } 967 | case (_DEALLOCATE): 968 | { 969 | emulate_deallocate(ctx); 970 | break; 971 | } 972 | case (_RANDOM): 973 | { 974 | emulate_random(ctx); 975 | break; 976 | } 977 | default: 978 | { 979 | //TODO: Right now we don't do anything, 980 | // meaning we just pass the syscall through 981 | //This is not the right behavior, since an 982 | // undefined cgc syscall is actually defined 983 | // on Linux which is the context that PIN is running under 984 | break; 985 | } 986 | } 987 | } 988 | 989 | /** 990 | * We need an instruction handler so we can skip the int 0x80 instructions 991 | **/ 992 | VOID cgc_instrumentInstruction(INS ins, VOID* v) 993 | { 994 | //NOTE: Instead of using INS_isSyscall we will look for int 0x80 instead 995 | //We could use INT_isInterrupt as well, but that covers more opcodes 996 | // than just INT Immediate 997 | if (INS_Opcode(ins) == XED_ICLASS_INT) 998 | { 999 | if ( (INS_OperandIsImmediate(ins, 0)) //its an immediate operand 1000 | && (INS_OperandImmediate(ins, 0) == 0x80) //and its 0x80 1001 | ) 1002 | { 1003 | 1004 | INS_InsertCall(ins, IPOINT_BEFORE, (AFUNPTR)cgc_syscallHandler, 1005 | IARG_CALL_ORDER, CALL_ORDER_FIRST, //NOTE: We want to be called first, but you can change it 1006 | IARG_CONTEXT, 1007 | IARG_END 1008 | ); 1009 | 1010 | //NOTE: We don't just delete the instruction at this point - we will update 1011 | // the PC and then call PIN_ExecuteAt to bypass these instructions later 1012 | } 1013 | } 1014 | } 1015 | 1016 | /********************************************/ 1017 | /** END OF EMULATION SECTION **/ 1018 | /********************************************/ 1019 | 1020 | /*BEGIN_LEGAL 1021 | Intel Open Source License 1022 | 1023 | Copyright (c) 2002-2014 Intel Corporation. All rights reserved. 
1024 | 1025 | Redistribution and use in source and binary forms, with or without 1026 | modification, are permitted provided that the following conditions are 1027 | met: 1028 | 1029 | Redistributions of source code must retain the above copyright notice, 1030 | this list of conditions and the following disclaimer. Redistributions 1031 | in binary form must reproduce the above copyright notice, this list of 1032 | conditions and the following disclaimer in the documentation and/or 1033 | other materials provided with the distribution. Neither the name of 1034 | the Intel Corporation nor the names of its contributors may be used to 1035 | endorse or promote products derived from this software without 1036 | specific prior written permission. 1037 | 1038 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS 1039 | ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT 1040 | LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR 1041 | A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE INTEL OR 1042 | ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, 1043 | SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT 1044 | LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, 1045 | DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY 1046 | THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT 1047 | (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 1048 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
1049 | END_LEGAL */ 1050 | 1051 | ofstream OutFile; 1052 | 1053 | 1054 | 1055 | // This function is called before every instruction is executed 1056 | VOID InstructionLevelTrace(CONTEXT * ctx) { 1057 | icount++; 1058 | LastIP2 = LastIP1; 1059 | LastIP1 = PIN_GetContextReg(ctx, REG_EIP); 1060 | //DEBUG = PIN_GetContextReg(ctx, REG_ECX);// + PIN_GetContextReg(ctx, REG_ECX) -84; 1061 | } 1062 | // Pin calls this function every time a new instruction is encountered 1063 | VOID InstructionCallback(INS ins, VOID *v) 1064 | { 1065 | INS_InsertCall(ins, IPOINT_BEFORE, (AFUNPTR)InstructionLevelTrace,IARG_CONTEXT, IARG_END); 1066 | } 1067 | 1068 | VOID BBLevelTrace(ADDRINT address,UINT32 size){ 1069 | LastBB2addr = LastBB1addr; 1070 | LastBB2size = LastBB1size; 1071 | LastBB1addr = address; 1072 | LastBB1size = size; 1073 | } 1074 | VOID TraceCallback(TRACE trace, VOID *v) 1075 | { 1076 | //a trace is not a bb! 1077 | //instrument every basic block in the trace 1078 | for (BBL bbl = TRACE_BblHead(trace); BBL_Valid(bbl); bbl = BBL_Next(bbl)) 1079 | { 1080 | BBL_InsertCall(bbl, IPOINT_BEFORE, (AFUNPTR)BBLevelTrace, IARG_ADDRINT, BBL_Address(bbl), IARG_UINT32, BBL_Size(bbl), IARG_END); 1081 | } 1082 | } 1083 | 1084 | bool HandleSig(THREADID tid, INT32 sig, CONTEXT *ctxt, BOOL hasHandler, const EXCEPTION_INFO *pExceptInfo, VOID *v){ 1085 | Signal = sig; 1086 | //OutFile << "sss: " << sig << endl; 1087 | if(sig==0xd){ //0xd == 13 == SIGPIPE 1088 | return 0; 1089 | }else{ 1090 | return 1; 1091 | } 1092 | } 1093 | 1094 | KNOB<string> KnobOutputFile(KNOB_MODE_WRITEONCE, "pintool", 1095 | "o", "cgc_pin_tracer_results.out", "specify output file name"); 1096 | 1097 | // This function is called when the application exits 1098 | VOID Fini(INT32 code, VOID *v) 1099 | { 1100 | ExitCode=code; 1101 | 1102 | // Write to a file since cout and cerr may be closed by the application 1103 | //TODO it would be cool to write in json here, but I do not want to deal with C++ craziness 1104 | OutFile.setf(ios::showbase); 1105 | 
OutFile << "Count: " << icount << endl; 1106 | OutFile << hex; 1107 | OutFile << "LastIP1: " << LastIP1 << endl; 1108 | OutFile << "LastIP2: " << LastIP2 << endl; 1109 | OutFile << "LastBB1addr: " << LastBB1addr << endl; 1110 | OutFile << "LastBB1size: " << LastBB1size << endl; 1111 | OutFile << "LastBB2addr: " << LastBB2addr << endl; 1112 | OutFile << "LastBB2size: " << LastBB2size << endl; 1113 | OutFile << "Signal: " << Signal << endl; 1114 | OutFile << "ExitCode: " << ExitCode << endl; 1115 | OutFile << "DEBUG: " << DEBUG << endl; 1116 | OutFile.close(); 1117 | 1118 | //usleep(5000000); 1119 | } 1120 | 1121 | /* ===================================================================== */ 1122 | /* Print Help Message */ 1123 | /* ===================================================================== */ 1124 | 1125 | INT32 Usage() 1126 | { 1127 | cerr << "This tool counts the number of dynamic instructions executed" << endl; 1128 | cerr << endl << KNOB_BASE::StringKnobSummary() << endl; 1129 | return -1; 1130 | } 1131 | 1132 | /* ===================================================================== */ 1133 | /* Main */ 1134 | /* ===================================================================== */ 1135 | /* argc, argv are the entire command line: pin -t -- ... 
*/ 1136 | /* ===================================================================== */ 1137 | 1138 | int main(int argc, char * argv[]) 1139 | { 1140 | 1141 | //cout << "PIN STARTED!"; 1142 | 1143 | // Initialize pin 1144 | if (PIN_Init(argc, argv)) return Usage(); 1145 | 1146 | OutFile.open(KnobOutputFile.Value().c_str()); 1147 | 1148 | PIN_AddApplicationStartFunction(cgc_init, NULL); 1149 | 1150 | /** ADD CGC CALLBACK **/ 1151 | INS_AddInstrumentFunction(cgc_instrumentInstruction, NULL); 1152 | 1153 | INS_AddInstrumentFunction(InstructionCallback, 0); 1154 | TRACE_AddInstrumentFunction(TraceCallback, 0); 1155 | unsigned int sig; 1156 | for(sig=1;sig<32;sig++){ 1157 | PIN_InterceptSignal(sig, (INTERCEPT_SIGNAL_CALLBACK)HandleSig,NULL); 1158 | } 1159 | // Register Fini to be called when the application exits 1160 | PIN_AddFiniFunction(Fini, 0); 1161 | PIN_AddFiniFunction(cgc_cleanup, 0); 1162 | 1163 | // Start the program, never returns 1164 | PIN_StartProgram(); 1165 | 1166 | return 0; 1167 | } 1168 | 1169 | 1170 | -------------------------------------------------------------------------------- /cgc_pin_tracer/libcgc_pin.h: -------------------------------------------------------------------------------- 1 | #ifndef _LIBCGC_PIN_H 2 | #define _LIBCGC_PIN_H 3 | 4 | //NOTE: I just prefixed everything with CGC or cgc 5 | // so that these declarations are different from the standard (e.g. linux) 6 | // ones. 
7 | 8 | #define CGC_STDIN 0 9 | #define CGC_STDOUT 1 10 | #define CGC_STDERR 2 11 | 12 | #ifndef NULL 13 | #define NULL ((void *)0) 14 | #endif 15 | 16 | typedef long unsigned int cgc_size_t; 17 | typedef long signed int cgc_ssize_t; 18 | 19 | #define CGC_SSIZE_MAX 2147483647 20 | #define CGC_SIZE_MAX 4294967295 21 | #define CGC_FD_SETSIZE 1024 22 | 23 | typedef long int cgc_fd_mask; 24 | 25 | #define CGC_NFDBITS (8 * sizeof(cgc_fd_mask)) 26 | 27 | typedef struct { 28 | cgc_fd_mask _fd_bits[CGC_FD_SETSIZE / CGC_NFDBITS]; 29 | } cgc_fd_set; 30 | 31 | #define CGC_FD_ZERO(set) \ 32 | do { \ 33 | cgc_size_t __i; \ 34 | for (__i = 0; __i < (CGC_FD_SETSIZE / CGC_NFDBITS); __i++) \ 35 | (set)->_fd_bits[__i] = 0; \ 36 | } while (0) 37 | #define CGC_FD_SET(b, set) \ 38 | ((set)->_fd_bits[b / CGC_NFDBITS] |= (1 << (b & (CGC_NFDBITS - 1)))) 39 | #define CGC_FD_CLR(b, set) \ 40 | ((set)->_fd_bits[b / CGC_NFDBITS] &= ~(1 << (b & (CGC_NFDBITS - 1)))) 41 | #define CGC_FD_ISSET(b, set) \ 42 | ((set)->_fd_bits[b / CGC_NFDBITS] & (1 << (b & (CGC_NFDBITS - 1)))) 43 | 44 | struct cgc_timeval { 45 | int tv_sec; 46 | int tv_usec; 47 | }; 48 | 49 | #define CGC_EBADF 1 50 | #define CGC_EFAULT 2 51 | #define CGC_EINVAL 3 52 | #define CGC_ENOMEM 4 53 | #define CGC_ENOSYS 5 54 | #define CGC_EPIPE 6 55 | 56 | //Some additional definitions 57 | #define _TERMINATE 1 58 | #define _TRANSMIT 2 59 | #define _RECEIVE 3 60 | #define _FDWAIT 4 61 | #define _ALLOCATE 5 62 | #define _DEALLOCATE 6 63 | #define _RANDOM 7 64 | 65 | #define CGC_PAGE_SIZE 4096 66 | #define CGC_PAGE_MASK (~(CGC_PAGE_SIZE - 1)) 67 | 68 | #endif /* _LIBCGC_PIN_H */ 69 | 70 | 71 | 72 | -------------------------------------------------------------------------------- /cgc_pin_tracer/makefile: -------------------------------------------------------------------------------- 1 | ############################################################## 2 | # 3 | # DO NOT EDIT THIS FILE! 
4 | # 5 | ############################################################## 6 | 7 | # If the tool is built out of the kit, PIN_ROOT must be specified in the make invocation and point to the kit root. 8 | ifdef PIN_ROOT 9 | CONFIG_ROOT := $(PIN_ROOT)/source/tools/Config 10 | else 11 | CONFIG_ROOT := ../Config 12 | endif 13 | include $(CONFIG_ROOT)/makefile.config 14 | include makefile.rules 15 | include $(TOOLS_ROOT)/Config/makefile.default.rules 16 | 17 | ############################################################## 18 | # 19 | # DO NOT EDIT THIS FILE! 20 | # 21 | ############################################################## 22 | -------------------------------------------------------------------------------- /cgc_pin_tracer/makefile.rules: -------------------------------------------------------------------------------- 1 | ############################################################## 2 | # 3 | # This file includes all the test targets as well as all the 4 | # non-default build rules and test recipes. 5 | # 6 | ############################################################## 7 | 8 | 9 | ############################################################## 10 | # 11 | # Test targets 12 | # 13 | ############################################################## 14 | 15 | ###### Place all generic definitions here ###### 16 | 17 | # This defines tests which run tools of the same name. This is simply for convenience to avoid 18 | # defining the test name twice (once in TOOL_ROOTS and again in TEST_ROOTS). 19 | # Tests defined here should not be defined in TOOL_ROOTS and TEST_ROOTS. 20 | TEST_TOOL_ROOTS := cgc_pin_tracer 21 | 22 | # This defines the tests to be run that were not already defined in TEST_TOOL_ROOTS. 23 | TEST_ROOTS := 24 | 25 | # This defines a list of tests that should run in the "short" sanity. Tests in this list must also 26 | # appear either in the TEST_TOOL_ROOTS or the TEST_ROOTS list. 
27 | # If the entire directory should be tested in sanity, assign TEST_TOOL_ROOTS and TEST_ROOTS to the 28 | # SANITY_SUBSET variable in the tests section below (see example in makefile.rules.tmpl). 29 | SANITY_SUBSET := 30 | 31 | # This defines the tools which will be run during the tests, and were not already defined in 32 | # TEST_TOOL_ROOTS. 33 | TOOL_ROOTS := 34 | 35 | # This defines the static analysis tools which will be run during the tests. They should not 36 | # be defined in TEST_TOOL_ROOTS. If a test with the same name exists, it should be defined in 37 | # TEST_ROOTS. 38 | # Note: Static analysis tools are in fact executables linked with the Pin Static Analysis Library. 39 | # This library provides a subset of the Pin APIs which allows the tool to perform static analysis 40 | # of an application or dll. Pin itself is not used when this tool runs. 41 | SA_TOOL_ROOTS := 42 | 43 | # This defines all the applications that will be run during the tests. 44 | APP_ROOTS := 45 | 46 | # This defines any additional object files that need to be compiled. 47 | OBJECT_ROOTS := 48 | 49 | # This defines any additional dlls (shared objects), other than the pintools, that need to be compiled. 50 | DLL_ROOTS := 51 | 52 | # This defines any static libraries (archives), that need to be built. 53 | LIB_ROOTS := 54 | 55 | 56 | ############################################################## 57 | # 58 | # Test recipes 59 | # 60 | ############################################################## 61 | 62 | # This section contains recipes for tests other than the default. 63 | # See makefile.default.rules for the default test rules. 
64 | # All tests in this section should adhere to the naming convention: .test 65 | 66 | 67 | ############################################################## 68 | # 69 | # Build rules 70 | # 71 | ############################################################## 72 | 73 | # This section contains the build rules for all binaries that have special build rules. 74 | # See makefile.default.rules for the default build rules. 75 | -------------------------------------------------------------------------------- /cgc_pin_tracer/pin_wrap.sh: -------------------------------------------------------------------------------- 1 | #!/bin/sh 2 | 3 | #TODO adapt to multi-binary programs 4 | 5 | ./pin_binary -t obj-ia32/cgc_pin_tracer.so -- ./binary 6 | 7 | -------------------------------------------------------------------------------- /cgc_pin_tracer/test.sh: -------------------------------------------------------------------------------- 1 | #!/bin/sh 2 | echo "test.sh START" 3 | PORT=10000 4 | #a problem is that we cannot have any output from pin 5 | #not to interfere with the testing 6 | nc -e ./pin_wrap.sh -l 127.0.0.1 -p $PORT & 7 | PID=$! 8 | 9 | #wait for nc opening the port 10 | netstat -ltun | grep ":$PORT" > /dev/null 11 | STATUS=$? 12 | while [ $STATUS -eq 1 ] 13 | do 14 | sleep 1 15 | netstat -ltun | grep ":$PORT" > /dev/null 16 | STATUS=$? 17 | echo "waiting for netcat..." 18 | done 19 | #echo `cat /proc/$PID/cmdline` ### 20 | 21 | cb-replay --host 127.0.0.1 --port 10000 $1 22 | 23 | echo "waiting $PID" 24 | #echo `cat /proc/$PID/cmdline` 25 | #wait for nc termination 26 | wait $PID 27 | echo "test.sh END" 28 | 29 | -------------------------------------------------------------------------------- /cgrex/CGRexAnalysis.py: -------------------------------------------------------------------------------- 1 | 2 | import shutil 3 | import os 4 | import time 5 | import timeout_decorator 6 | 7 | from . 
import utils 8 | from Fortifier import Fortifier 9 | from QemuTracer import QemuTracer 10 | from MiasmPatcher import MiasmPatcher 11 | 12 | import angr 13 | 14 | 15 | import logging 16 | l = logging.getLogger('cgrex.CGRexAnalysis') 17 | 18 | 19 | class CGRexAnalysisException(Exception): 20 | pass 21 | 22 | 23 | class CGRexAnalysis: 24 | 25 | max_patching_tries = 5 26 | 27 | def __init__(self,binary_fname,out_fname,povs_fname): 28 | self.binary_fname = binary_fname 29 | self.out_fname = out_fname 30 | assert len(povs_fname)>0 31 | self.povs_fname = povs_fname 32 | 33 | self.tracer = QemuTracer() 34 | self.patcher = MiasmPatcher() 35 | 36 | @timeout_decorator.timeout(60*10) 37 | def create_cfg(self,p): 38 | ctime = time.time() 39 | l.debug("creating cfg...") 40 | cfg = p.analyses.CFG() 41 | cfg.normalize() 42 | l.debug("cfg done (%s)!"%(time.time()-ctime)) 43 | return cfg 44 | 45 | 46 | def run(self): 47 | def gen_name(td,cb_num,ntry): 48 | return os.path.join(td,os.path.basename(self.binary_fname)+"_%06d_%06d"%(cb_num,ntry)) 49 | 50 | 51 | l.info("working on binary: %s",self.binary_fname) 52 | base_code = utils.compile_asm_template("base1.asm",{'code_loaded_address':hex(Fortifier.fortify_segment1_base)}) 53 | cb = Fortifier(self.binary_fname, base_code) 54 | 55 | crashes = 0 56 | with utils.tempdir() as td: 57 | 58 | total_tries = 0 59 | cb_num = -1 60 | for pov in self.povs_fname: 61 | cb_num+=1 62 | ntry = 0 63 | cb.save(gen_name(td,cb_num,ntry)) 64 | shutil.copy(gen_name(td,cb_num,ntry),self.out_fname+"_tmp"+"_%06d_%06d"%(cb_num,ntry)) 65 | 66 | while True: 67 | l.info("working on pov: %s, try: %d",pov,ntry) 68 | cb = Fortifier(gen_name(td,cb_num,ntry)) 69 | 70 | trace = self.tracer.trace_pov(pov,gen_name(td,cb_num,ntry)) 71 | if trace == None: 72 | if ntry == 0: 73 | l.info("pov did not generate any crash: %s",pov) 74 | #FIXME how should we handle this? 
75 | #maybe we want to generate an exception only if total_tries == 0 76 | #because in other cases a patch for a previous pov may have fixed the currently tested one 77 | break 78 | else: 79 | l.info("cb is now immune to %s. after %d patches",pov,ntry) 80 | break 81 | 82 | angr_project = angr.Project(gen_name(td,cb_num,ntry)) 83 | try: 84 | angr_cfg = self.create_cfg(angr_project) 85 | except timeout_decorator.TimeoutError: 86 | l.error("cfg timeout!") 87 | angr_cfg = None 88 | #FIXME (how can we handle this) 89 | 90 | crashes += 1 91 | if ntry > CGRexAnalysis.max_patching_tries: 92 | raise CGRexAnalysisException("too many patching tries (%d)"%CGRexAnalysis.max_patching_tries) 93 | 94 | patch_info = self.patcher.generate_patch_info(trace,cb,angr_cfg) 95 | patch = self.patcher.add_code_to_patch_info(patch_info,cb) 96 | 97 | print hex(Fortifier.fortify_segment1_base+len(cb.injected_code)) 98 | cb.insert_detour(Fortifier.fortify_segment1_base+len(cb.injected_code),patch) 99 | 100 | ntry += 1 101 | total_tries += 1 102 | cb.save(gen_name(td,cb_num,ntry)) 103 | shutil.copy(gen_name(td,cb_num,ntry),self.out_fname+"_tmp"+"_%06d_%06d"%(cb_num,ntry)) 104 | 105 | if crashes==0: 106 | raise CGRexAnalysisException("no pov crashed, tested povs: %s"%repr(self.povs_fname)) 107 | 108 | l.info("cb is now immune to %s after %d patches",repr(self.povs_fname),total_tries) 109 | shutil.copy(gen_name(td,cb_num,ntry),self.out_fname) 110 | 111 | 112 | -------------------------------------------------------------------------------- /cgrex/Exceptions.py: -------------------------------------------------------------------------------- 1 | 2 | class InvalidVAddrException(Exception): 3 | pass 4 | 5 | 6 | 7 | class NasmException(Exception): 8 | pass 9 | 10 | 11 | 12 | class DetourException(Exception): 13 | pass 14 | 15 | -------------------------------------------------------------------------------- /cgrex/Fortifier.py: 
-------------------------------------------------------------------------------- 1 | 2 | import utils 3 | import struct 4 | import os 5 | 6 | from Exceptions import * 7 | 8 | 9 | 10 | import logging 11 | l = logging.getLogger('cgrex.Fortifier') 12 | 13 | 14 | class FortifierException(Exception): 15 | pass 16 | class DetourException(Exception): 17 | pass 18 | 19 | class Fortifier: 20 | 21 | fortify_segment1_base = 0x09000000 22 | fortify_segment2_base = 0x09100000 23 | fortified_tag = "FORTIFIED\x00" #should not be longer than 0x20 24 | 25 | 26 | def __init__(self,fname,base_code=""): 27 | self.fname = fname 28 | self.ocontent = open(fname,"rb").read() 29 | etype = utils.exe_type(self.ocontent) 30 | assert etype != None 31 | if etype == "ELF": 32 | self.ocontent = utils.elf_to_cgc(self.ocontent) 33 | self.ncontent = self.ocontent 34 | self.segments = None 35 | if self.has_fortify_segment(): 36 | self.injected_code = self.get_injected_code() 37 | self.first_patch = False 38 | else: 39 | self.setup_headers() 40 | self.injected_code = base_code 41 | self.first_patch = True 42 | 43 | 44 | def save(self,nname,both_formats=False): 45 | if self.first_patch: 46 | self.set_fortify_segment(self.injected_code) 47 | else: 48 | self.update_fortify_segment(self.injected_code) 49 | if both_formats: 50 | self.ncontent = utils.cgc_to_elf(self.ncontent) 51 | open(nname+"_elf","wb").write(self.ncontent) 52 | os.chmod(nname+"_elf",0755) 53 | self.ncontent = utils.elf_to_cgc(self.ncontent) 54 | open(nname+"_cgc","wb").write(self.ncontent) 55 | os.chmod(nname+"_cgc",0755) 56 | else: 57 | open(nname,"wb").write(self.ncontent) 58 | os.chmod(nname,0755) #rwxr-xr-x 59 | 60 | 61 | def pflags_to_perms(self,p_flags): 62 | PF_X = (1 << 0) 63 | PF_W = (1 << 1) 64 | PF_R = (1 << 2) 65 | 66 | perms = "" 67 | if p_flags & PF_R: 68 | perms = perms + "R" 69 | if p_flags & PF_W: 70 | perms = perms + "W" 71 | if p_flags & PF_X: 72 | perms = perms + "X" 73 | return perms 74 | 75 | 76 | def 
dump_segments(self,tprint=False): 77 | #from: https://github.com/CyberGrandChallenge/readcgcef/blob/master/readcgcef-minimal.py 78 | header_size = 16 + 2*2 + 4*5 + 2*6 79 | buf = self.ncontent[0:header_size] 80 | (cgcef_type, cgcef_machine, cgcef_version, cgcef_entry, cgcef_phoff, 81 | cgcef_shoff, cgcef_flags, cgcef_ehsize, cgcef_phentsize, cgcef_phnum, 82 | cgcef_shentsize, cgcef_shnum, cgcef_shstrndx) = struct.unpack("= vaddr_start_page and address < vaddr_end_page: 121 | return self.pflags_to_perms(s[6]) 122 | return "" 123 | 124 | 125 | def has_fortify_segment(self): 126 | segments = self.dump_segments() 127 | segment_vaddrs = [s[2] for s in segments] 128 | if Fortifier.fortify_segment1_base in segment_vaddrs: 129 | return True 130 | else: 131 | return False 132 | 133 | 134 | def setup_headers(self): 135 | if self.has_fortify_segment(): 136 | return 137 | 138 | segments = self.dump_segments() 139 | 140 | #align size of the entire ELF 141 | self.ncontent = utils.pad_str(self.ncontent,0x10) 142 | #change pointer to program headers to point at the end of the elf 143 | self.ncontent = utils.str_overwrite(self.ncontent,struct.pack("= vaddr_start_page and maddress < vaddr_end_page: 172 | #regardless to what is written in the header, everything is page aligned (for mmap reasons) 173 | #print hex(paddr_page),hex(vaddr_start_page),hex(vaddr_end_page) 174 | return (maddress - vaddr_start_page) + paddr_page 175 | raise(InvalidVAddrException(hex(maddress))) 176 | 177 | 178 | def get_memory_translation_list(self,address,size,permissive=False): 179 | start = address 180 | end = address+size-1 #we will take the byte at end 181 | #print hex(start),hex(end) 182 | start_p = address & 0xfffffff000 183 | end_p = end & 0xfffffff000 184 | if start_p==end_p: 185 | return [(self.maddress_to_baddress(start),self.maddress_to_baddress(end)+1)] 186 | else: 187 | first_page_baddress = self.maddress_to_baddress(start) 188 | mlist = [] 189 | 
mlist.append((first_page_baddress,(first_page_baddress & 0xfffffff000)+0x1000)) 190 | nstart = (start & 0xfffffff000)+0x1000 191 | try: 192 | while nstart != end_p: 193 | mlist.append((self.maddress_to_baddress(nstart),self.maddress_to_baddress(nstart)+0x1000)) 194 | nstart += 0x1000 195 | mlist.append((self.maddress_to_baddress(nstart),self.maddress_to_baddress(end)+1)) 196 | except InvalidVAddrException, e: 197 | if permissive: 198 | return mlist 199 | else: 200 | raise e 201 | return mlist 202 | 203 | 204 | def get_maddress(self,address,size,permissive=False): 205 | mem = "" 206 | for start,end in self.get_memory_translation_list(address,size,permissive): 207 | #print "-",hex(start),hex(end) 208 | mem += self.ncontent[start:end] 209 | return mem 210 | 211 | 212 | def patch_bin(self,address,new_content): 213 | ndata_pos = 0 214 | for start,end in self.get_memory_translation_list(address,len(new_content)): 215 | #print "-",hex(start),hex(end) 216 | ndata = new_content[ndata_pos:ndata_pos+(end-start)] 217 | self.ncontent = utils.str_overwrite(self.ncontent,ndata,start) 218 | ndata_pos += len(ndata) 219 | 220 | 221 | def insert_detour(self,target,patch): 222 | def check_if_movable(instruction): 223 | #the idea here is that an instruction is movable if and only if 224 | #it has the same string representation when disassembled at different offsets 225 | def bytes_to_comparable_str(ibytes,offset): 226 | return " ".join(utils.instruction_to_str(utils.decompile(ibytes,offset)[0]).split()[2:]) 227 | 228 | instruction_bytes = str(instruction.bytes) 229 | pos1 = bytes_to_comparable_str(instruction_bytes,0x0) 230 | pos2 = bytes_to_comparable_str(instruction_bytes,0x07f00000) 231 | pos3 = bytes_to_comparable_str(instruction_bytes,0xfe000000) 232 | print pos1,pos2,pos3 233 | if pos1 == pos2 and pos2 == pos3: 234 | return True 235 | else: 236 | return False 237 | 238 | culprit_address = patch['culprit_address'] 239 | bbstart = patch['bbstart'] 240 | bbsize = patch['bbsize'] 241 | 
patch_code = patch['code'] 242 | 243 | l.debug("inserting detour for patch: %s"%(map(hex,(bbstart,bbsize,culprit_address)))) 244 | 245 | detour_size = 5 246 | detour_attempts = range(-1*detour_size,0+1) 247 | one_byte_nop = '\x90' 248 | 249 | #get movable_instructions in the bb 250 | original_bbcode = self.get_maddress(bbstart,bbsize) 251 | instructions = utils.decompile(original_bbcode,bbstart) 252 | assert any([culprit_address == i.address for i in instructions]) 253 | 254 | #the last instruction may be not movable (a direct call or jmp) 255 | #given the definition of bb, only the last instruction may be non-movable 256 | #TODO moving an indirect call or a ret is still scary and should be tested, 257 | #because performing a call not from the origianl position changes what is going on the stack 258 | if check_if_movable(instructions[-1]): 259 | movable_instructions = instructions 260 | else: 261 | movable_instructions = instructions[:-1] 262 | 263 | if len(movable_instructions)==0: 264 | raise DetourException("No movable instructions found") 265 | movable_bb_start = movable_instructions[0].address 266 | movable_bb_size = reduce(lambda t,n:t+len(str(n.bytes)),movable_instructions,0) 267 | print "movable_bb_size:",movable_bb_size 268 | print "movable bb instructions:" 269 | print "\n".join([utils.instruction_to_str(i) for i in movable_instructions]) 270 | 271 | #find a spot for the detour 272 | detour_pos = None 273 | for pos in detour_attempts: 274 | detour_start = culprit_address + pos 275 | detour_end = detour_start + detour_size - 1 276 | if detour_start >= movable_bb_start and detour_end < (movable_bb_start + movable_bb_size): 277 | detour_pos = detour_start 278 | break 279 | if detour_pos == None: 280 | raise DetourException("No space in bb",hex(bbstart),hex(bbsize),hex(movable_bb_start),hex(movable_bb_size)) 281 | else: 282 | print "detour fits at",hex(detour_pos) 283 | detour_overwritten_bytes = range(detour_pos,detour_pos+detour_size) 284 | #print "ob"," 
".join(map(hex,detour_overwritten_bytes)) 285 | 286 | #detect overwritten instruction 287 | for i in movable_instructions: 288 | if len(set(detour_overwritten_bytes).intersection(set(range(i.address,i.address+len(i.bytes)))))>0: 289 | if i.address < culprit_address: 290 | i.overwritten = "pre" 291 | elif i.address == culprit_address: 292 | i.overwritten = "culprit" 293 | else: 294 | i.overwritten = "post" 295 | else: 296 | i.overwritten = "out" 297 | print "\n".join([utils.instruction_to_str(i) for i in movable_instructions]) 298 | assert any([i.overwritten!="out" for i in movable_instructions]) 299 | 300 | #patch bb code 301 | for i in movable_instructions: 302 | if i.overwritten != "out": 303 | self.patch_bin(i.address,one_byte_nop*len(i.bytes)) 304 | detour_jmp_code = utils.compile_jmp(detour_pos,target) 305 | self.patch_bin(detour_pos,detour_jmp_code) 306 | patched_bbcode = self.get_maddress(bbstart,bbsize) 307 | patched_bbinstructions = utils.decompile(patched_bbcode,bbstart) 308 | print "patched bb instructions:" 309 | print "\n".join([utils.instruction_to_str(i) for i in patched_bbinstructions]) 310 | 311 | #create injected_code (pre, injected, culprit, post, jmp_back) 312 | injected_code = "" 313 | injected_code += "\n"+"nop\n"*5+"\n" 314 | injected_code += "\n".join([utils.capstone_to_nasm(i) for i in movable_instructions if i.overwritten=='pre'])+"\n" 315 | injected_code += "; --- custom code start\n"+patch['code']+"\n"+"; --- custom code end\n"+"\n" 316 | injected_code += "\n".join([utils.capstone_to_nasm(i) for i in movable_instructions if i.overwritten=='culprit'])+"\n" 317 | injected_code += "\n".join([utils.capstone_to_nasm(i) for i in movable_instructions if i.overwritten=='post'])+"\n" 318 | jmp_back_target = None 319 | for i in reversed(movable_instructions): #jmp back to the one after the last byte of the last non-out 320 | if i.overwritten != "out": 321 | jmp_back_target = i.address+len(str(i.bytes)) 322 | break 323 | assert jmp_back_target != 
None 324 | injected_code += "jmp %s"%hex(int(jmp_back_target))+"\n" 325 | injected_code = "\n".join([line for line in injected_code.split("\n") if line!= ""]) #removing blank lines as a pro 326 | print "injected code:" 327 | print injected_code 328 | 329 | self.injected_code += utils.compile_asm(injected_code,base=Fortifier.fortify_segment1_base+len(self.injected_code)) 330 | 331 | 332 | def get_injected_code(self): 333 | assert self.has_fortify_segment() 334 | segments = self.dump_segments() 335 | fortify_segment_info = [s for s in segments if s[2] == Fortifier.fortify_segment1_base][0] 336 | return self.ncontent[fortify_segment_info[1]:] 337 | 338 | 339 | def set_fortify_segment(self,code_segment): 340 | 341 | assert self.ncontent[0x34:0x34+len(self.fortified_tag)] == self.fortified_tag 342 | 343 | code_segment = utils.pad_str(code_segment,0x10) 344 | start_new_segment = len(utils.pad_str(self.ncontent + " "*0x20,0x1000)) 345 | code_segment_header = (1, start_new_segment, self.fortify_segment1_base, self.fortify_segment1_base, \ 346 | len(code_segment), len(code_segment), 0x7, 0x0) #RWX 347 | data_segment_header = (1, 0, self.fortify_segment2_base, self.fortify_segment2_base, \ 348 | 0, 0x1000, 0x6, 0x0) #RW 349 | self.ncontent = utils.str_overwrite(self.ncontent,struct.pack(" shadow_stack_start 42 | ''' 43 | 44 | ''' 45 | pusha 46 | lea eax, [...] 
47 | mov ebx,4 48 | mov ecx,3 49 | call CGREX_memcheck_and_exit 50 | popa 51 | ''' 52 | 53 | # this is the worst code I have ever written in my life :-) 54 | def protect_access(self,tstr,access,size,final_deref=False): 55 | def format_str(tstr): 56 | tstr = tstr.replace("_init","").replace("(","").replace(")","") 57 | tstr = ' '.join(tstr.split()) 58 | tstr = tstr.replace("*"," * ").replace("+"," + ") 59 | return tstr 60 | 61 | 62 | def compile_token(token): 63 | token = token.strip().lower() 64 | if token == "esp": 65 | token = "[{_saved_esp}]" 66 | if token == "eax": 67 | token = "[{_saved_eax}]" 68 | return token 69 | 70 | def find_nested(tstr,n): 71 | level = 0 72 | offsets = [] 73 | start = -1 74 | for i,c in enumerate(tstr): 75 | if c == "(": 76 | level += 1 77 | if c == ")": 78 | if level == n: 79 | stop = i 80 | offsets.append((start,stop)) 81 | start = -1 82 | level -= 1 83 | if level == n and start == -1: 84 | start = i 85 | return offsets 86 | 87 | 88 | l.debug("original expression: %s"%tstr) 89 | 90 | 91 | #moving nested expressions to the end (compiled first) 92 | #cannot work with multiple nested expressions, but it should never be the case 93 | nested = find_nested(tstr,2) 94 | if len(nested)>1: 95 | raise MiasmPatcherException("expression with multiple nesting: %s"%tstr) 96 | for s,e in nested: 97 | if tstr[s-1] == "+" or tstr[s-1] == "*": 98 | tstr = tstr[:s-1]+tstr[e+1:]+tstr[s-1:e+1] 99 | else: 100 | tstr = tstr[:s]+tstr[e+2:]+tstr[e+1]+tstr[s-1:e+1] 101 | 102 | tstr = format_str(tstr) 103 | inner_assembly = [] 104 | tokens = list(reversed(tstr.split())) 105 | l.debug("tokenized expression: %s"%repr(tstr)) 106 | 107 | inner_assembly.append("mov eax, %s"%compile_token(tokens[0])) 108 | for token1,token2 in zip(tokens[1:],tokens[2:])[::2]: 109 | if token1 == "+": 110 | inner_assembly.append("add eax, %s"%compile_token(token2)) 111 | elif token1 == "*": 112 | inner_assembly.append("imul eax, %s"%compile_token(token2)) 113 | else: 114 | raise
MiasmPatcherException("found weird token: %s"%token1) 115 | if final_deref: 116 | inner_assembly += ["mov eax, [eax]"] 117 | final_assembly = ["pusha"]+inner_assembly+["mov ebx,%d"%size,"mov ecx,%d"%access]+\ 118 | ["call [{CGREX_memcheck_and_exit_ptr}]","popa"] 119 | 120 | patch_str = "\n".join(final_assembly) 121 | return patch_str 122 | 123 | 124 | #TODO many wild assumptions about what can and cannot appear in a single instruction 125 | def parse_reg_diff(self,miasm_str): 126 | patches = [] 127 | for line in miasm_str.split("\n"): 128 | if line.strip() == "": 129 | continue 130 | if ("X " in line or line.startswith("XMM") or line.startswith("zf ")) and "[" in line: 131 | patches.append(self.protect_access(line.split("[")[1].split("]")[0],1,4)) 132 | if line.startswith("EIP"): 133 | if("[" in line): 134 | patches.append(self.protect_access(line.split("[")[1].split("]")[0],1,4)) 135 | patches.append(self.protect_access(line.split("[")[1].split("]")[0],1,4,True)) 136 | else: 137 | patches.append(self.protect_access(line.split(" ")[1],1,4)) 138 | return patches 139 | 140 | 141 | def parse_mem_diff(self,miasm_str): 142 | patches = [] 143 | for line in miasm_str.split("\n"): 144 | if line.strip() == "": 145 | continue 146 | if "] " in line: 147 | patches.append(self.protect_access(line.split("[")[1].split("]")[0],3,4)) #FIXME permission should be 2, but because of the test_write problem I need it to be also readable 148 | return patches 149 | 150 | 151 | def generate_patch_info(self,trace_info,cb,cfg): 152 | patch_info = {} 153 | 154 | if "LastIP2" not in trace_info: 155 | last_bb_data = cb.get_maddress(trace_info["LastBB1addr"],trace_info["LastBB1size"]) 156 | last_instruction = utils.decompile(last_bb_data,trace_info["LastBB1addr"])[-1] 157 | trace_info["LastIP2"] = int(last_instruction.address) 158 | l.debug("LastIP2 (from last bb): %s"%hex(trace_info["LastIP2"])) 159 | 160 | if ('X' in cb.get_memory_permissions(trace_info["LastIP1"])): 161 | culprit_address =
trace_info["LastIP1"] 162 | l.debug("using LastIP1: %s"%hex(culprit_address)) 163 | elif (trace_info["LastIP2"] != None and 'X' in cb.get_memory_permissions(trace_info["LastIP2"])): 164 | culprit_address = trace_info["LastIP2"] 165 | l.debug("using LastIP2: %s"%hex(culprit_address)) 166 | 167 | if culprit_address >= trace_info["LastBB1addr"] and \ 168 | culprit_address < (trace_info["LastBB1addr"] + trace_info["LastBB1size"]): 169 | bbstart = trace_info["LastBB1addr"] 170 | bbsize = trace_info["LastBB1size"] 171 | elif trace_info["LastBB2addr"] != None and \ 172 | culprit_address >= trace_info["LastBB2addr"] and \ 173 | culprit_address < (trace_info["LastBB2addr"] + trace_info["LastBB2size"]): 174 | bbstart = trace_info["LastBB2addr"] 175 | bbsize = trace_info["LastBB2size"] 176 | 177 | patch_info['bbstart'] = bbstart 178 | patch_info['bbsize'] = bbsize 179 | if cfg != None: 180 | l.info("culprit address to angr %s"%hex(culprit_address)) 181 | angr_bb = cfg.get_any_node(culprit_address, is_syscall=False, anyaddr=True) 182 | if angr_bb != None: 183 | if angr_bb.size != None and angr_bb.size != 0: 184 | angr_bbstart = int(angr_bb.addr) 185 | angr_bbsize = int(angr_bb.size) 186 | if angr_bbstart >= bbstart and (angr_bbstart+angr_bbsize <= bbstart+bbsize): 187 | patch_info['bbstart'] = angr_bbstart 188 | patch_info['bbsize'] = angr_bbsize 189 | if angr_bbstart!=bbstart or angr_bbsize!=bbsize: 190 | l.info("basicblocks do not match angr: %s-%s qemu: %s-%s"%(hex(angr_bbstart),angr_bbsize,hex(bbstart),bbsize)) 191 | else: 192 | l.info("angr basicblock matches angr: %s-%s qemu: %s-%s"%(hex(angr_bbstart),angr_bbsize,hex(bbstart),bbsize)) 193 | else: 194 | l.info("angr basicblock is outside angr: %s-%s qemu: %s-%s"%(hex(angr_bbstart),angr_bbsize,hex(bbstart),bbsize)) 195 | #TODO we may want to check if this is because of a partial overwrite 196 | else: 197 | l.info("angr basicblock problem (size is None or 0)") 198 | else: 199 | l.info("angr basicblock problem (bb is None)") 200
| else: 201 | l.info("angr cfg is None") 202 | 203 | 204 | patch_info['culprit_address'] = culprit_address 205 | 206 | return patch_info 207 | 208 | 209 | 210 | def add_code_to_patch_info(self,patch_info,cb): 211 | culprit_address = patch_info['culprit_address'] 212 | bbsize = patch_info['bbsize'] 213 | bbstart = patch_info['bbstart'] 214 | culprit_upto_bb_limit = cb.get_maddress(culprit_address,bbsize - (culprit_address-bbstart)) 215 | #get only culprit 216 | culprit_instrucion = utils.decompile(culprit_upto_bb_limit)[0] 217 | culprit = culprit_upto_bb_limit[:culprit_instrucion.size] 218 | #culprit = utils.compile_asm("mov ecx, byte [ebp]") 219 | 220 | dec = utils.decompile(culprit,culprit_address)[0] 221 | l.debug("the culprit is: %s %s %s"%(hex(culprit_address),culprit.encode('hex'),utils.instruction_to_str(dec))) 222 | #culprit = utils.compile_asm("call [100+ecx*2+esp]") 223 | 224 | 225 | stdout = StringIO.StringIO() 226 | stderr = StringIO.StringIO() 227 | #this is terrible: since I did not want to parse miasm expressions, I just parse the string 228 | with utils.redirect_stdout(stdout,stderr): 229 | sb = self.execc(culprit) 230 | 231 | patches = [] 232 | with utils.redirect_stdout(stdout,stderr): 233 | sb.dump_id() 234 | l.debug("raw miasm results regs:\n%s\n"%stdout.getvalue()) 235 | patches += self.parse_reg_diff(stdout.getvalue()) 236 | stdout = StringIO.StringIO() 237 | stderr = StringIO.StringIO() 238 | with utils.redirect_stdout(stdout,stderr): 239 | sb.dump_mem() 240 | l.debug("raw miasm results mem:\n%s\n"%stdout.getvalue()) 241 | patches += self.parse_mem_diff(stdout.getvalue()) 242 | 243 | patches = ["mov [{_saved_esp}],esp","mov [{_saved_eax}],eax","mov esp, {_shadow_stack}"]+patches+["mov esp, [{_saved_esp}]"] 244 | 245 | l.debug("fixing the culprit: %s %s"%(dec.mnemonic, dec.op_str)) 246 | l.debug("with:\n"+"\n".join(patches)) 247 | 248 | substitution_dict = { 249 | "_saved_esp":hex(Fortifier.fortify_segment1_base), 250 | 
"_saved_eax":hex(Fortifier.fortify_segment1_base+4), 251 | "CGREX_memcheck_and_exit_ptr":hex(Fortifier.fortify_segment1_base+8), 252 | "_shadow_stack":hex(Fortifier.fortify_segment2_base+0xf00) 253 | } 254 | fixed_patches = [line.format(**substitution_dict) for line in patches] 255 | l.debug("fixed patches:\n"+"\n".join(fixed_patches)) 256 | patch_info["code"] = "\n".join(fixed_patches) 257 | return patch_info 258 | 259 | 260 | -------------------------------------------------------------------------------- /cgrex/PinManager.py: -------------------------------------------------------------------------------- 1 | 2 | 3 | import os 4 | import re 5 | import utils 6 | import tempfile 7 | import shutil 8 | import contextlib 9 | import json 10 | import time 11 | import sys 12 | from distutils.spawn import find_executable 13 | 14 | 15 | class PinManager: 16 | 17 | pin_download_link = "http://software.intel.com/sites/landingpage/pintool/downloads/pin-2.14-67254-gcc.4.4.7-linux.tar.gz" 18 | pin_installation_folder = "/vagrant/pin" 19 | pin_executable = os.path.join(pin_installation_folder,"pin") 20 | pin_module = "cgc_pin_tracer" 21 | pin_module_folder = os.path.join(os.path.split(os.path.split(os.path.abspath(__file__))[0])[0],pin_module) 22 | 23 | def __init__(self,vagrant_manager,pin_installation_folder=None): 24 | if pin_installation_folder != None: 25 | self.pin_installation_folder = pin_installation_folder 26 | self.vgm = vagrant_manager 27 | 28 | res = self.vgm.exec_cmd(["stat",self.pin_executable]) 29 | if res[2] != 0: 30 | print "cannot find pin executable (inside the vm) in:",self.pin_executable 31 | print "download it from:",self.pin_download_link 32 | print "unpack it here (inside the vm):",self.pin_installation_folder 33 | sys.exit(1) 34 | 35 | 36 | def trace_exploit(self,executable_fname,exploit_fname): 37 | #TODO make the pintool directly output json 38 | def naive_parser(keys,content): 39 | res = {} 40 | for line in content.split("\n"): 41 | for k in keys: 
42 | if line.startswith(k): 43 | res[k]=int(line.split(":")[1].strip(),16) 44 | return res 45 | 46 | with self.vgm.get_shared_tmpdir() as tf: 47 | executable_cgc_tmp_fname = os.path.join(tf,os.path.basename(executable_fname)+"_cgc") 48 | executable_elf_tmp_fname = os.path.join(tf,os.path.basename(executable_fname)+"_elf") 49 | exploit_tmp_fname = os.path.join(tf,os.path.basename(exploit_fname)) 50 | result_tmp_fname = os.path.join(tf,"pin_module","cgc_pin_tracer_results.out") 51 | pinlog_tmp_fname = os.path.join(tf,"pin_module","pin.log") 52 | pin_module_tmp_fname = os.path.join(tf,"pin_module") 53 | 54 | shutil.copyfile(executable_fname,executable_cgc_tmp_fname) 55 | shutil.copyfile(executable_fname,executable_elf_tmp_fname) 56 | shutil.copyfile(exploit_fname,exploit_tmp_fname) 57 | shutil.copytree(self.pin_module_folder,pin_module_tmp_fname) 58 | 59 | res = self.vgm.exec_cmd([ 60 | ["cd",pin_module_tmp_fname], 61 | ["make","PIN_ROOT="+self.pin_installation_folder,"clean"], 62 | ["make","PIN_ROOT="+self.pin_installation_folder] 63 | ]) 64 | if res[2] != 0: 65 | print "error %d while compiling the pin module"%res[2] 66 | print res[0] 67 | print res[1] 68 | return None 69 | 70 | ''' 71 | The wrapping made by cgc-server (or cgc-test since it uses cgc-server) is incompatible with pin. 72 | In fact, it, for instance, forbid to open new files and a lot of other bad stuff. 73 | Netcat can do the same. 
74 | from pin_module: nc -e ./pin_wrap.sh -l 127.0.0.1 -p 10000 & 75 | from tf: cb-replay --host 127.0.0.1 --port 10000 pov-1.xml 76 | I do not know how to pass an argument to pin_wrap.sh, just create and use a hardcoded link 77 | ''' 78 | #set permissions and links 79 | #I use links since I do not know how to pass parameters to netcat -e 80 | #It seems that cb-replay does not like links 81 | #the big assumption is that the test is going to make the execution of the program end AND crash 82 | res = self.vgm.exec_cmd([ 83 | ["killall","test.sh"], 84 | ["cgc2elf",executable_elf_tmp_fname], 85 | ["chmod","755",executable_cgc_tmp_fname], 86 | ["chmod","755",executable_elf_tmp_fname], 87 | ["ln","-s",executable_elf_tmp_fname,os.path.join(pin_module_tmp_fname,"binary")], 88 | ["ln","-s",self.pin_executable,os.path.join(pin_module_tmp_fname,"pin_binary")], 89 | ["cd",pin_module_tmp_fname], 90 | ["./test.sh",exploit_tmp_fname] 91 | ]) 92 | #TODO adapt for multi-binary programs 93 | #TODO check if it actually crashed (and timeout if no response) 94 | #TODO the current pintool is tracing at instruction level: a better solution would be to 95 | #first trace at bb level and rerun tracing at instruction level only within the crashing bb 96 | print "===","TEST RESULTS:" 97 | print "=","STDOUT:\n",res[0].strip() 98 | print "=","STDERR:\n",res[1].strip() 99 | print "=","RETURN CODE:",res[2] 100 | 101 | if(os.path.exists(pinlog_tmp_fname)): 102 | pinlog_res = open(pinlog_tmp_fname).read() 103 | else: 104 | pinlog_res = None 105 | if(os.path.exists(result_tmp_fname)): 106 | raw_res = open(result_tmp_fname).read() 107 | else: 108 | raw_res = None #no result file: report failure below instead of raising NameError 109 | raw_input() 110 | 111 | if pinlog_res != None: 112 | print "=","PIN LOG:" 113 | print pinlog_res 114 | 115 | if raw_res != None: 116 | print "=","PIN RESULT:" 117 | print raw_res 118 | res_keys = ["LastIP1","LastIP2","LastBB1addr","LastBB1size","LastBB2addr","LastBB2size","Signal","ExitCode"] 119 | res =
naive_parser(res_keys,raw_res) 120 | return res 121 | else: 122 | return None 123 | 124 | 125 | 126 | 127 | -------------------------------------------------------------------------------- /cgrex/QemuTracer.py: -------------------------------------------------------------------------------- 1 | 2 | import os 3 | from . import utils 4 | 5 | import logging 6 | l = logging.getLogger('cgrex.QemuTracer') 7 | 8 | 9 | 10 | class QemuTracer: 11 | self_path = os.path.dirname(os.path.abspath(__file__)) 12 | shm_folder = "/dev/shm" 13 | qemu_circular_buffer_split = "==========SPLIT==========\n" 14 | qemu_signal_log_line = "qemu: uncaught target signal" 15 | qemu_launcher_timeout = 75 16 | 17 | 18 | def __init__(self): 19 | pass 20 | 21 | 22 | def uncircle_string(self,tstr): 23 | l,_,r = tstr.partition(QemuTracer.qemu_circular_buffer_split) 24 | return r+l 25 | 26 | 27 | def trace_pov(self,pov,cb_fname): 28 | def line_to_bbinfo(line): 29 | if line.startswith("IN:"): 30 | p = line.split("[")[1].split("]")[0] 31 | bbstart = int(p.split(",")[0],16) 32 | bbsize = int(p.split(",")[1],16) 33 | elif line.startswith("Trace "): 34 | plist = line.split()[2:4] 35 | bbstart = int(plist[0],16) 36 | bbsize = int(plist[1],10) 37 | return (bbstart,bbsize) 38 | 39 | 40 | def remove_dups(seq): 41 | noDupes = [] 42 | [noDupes.append(i) for i in reversed(seq) if not noDupes.count(i)] 43 | return noDupes 44 | 45 | 46 | with utils.tempdir(os.path.join(QemuTracer.shm_folder,"QemuTracer")) as td: 47 | args = ["timeout", 48 | "-k", 49 | str(QemuTracer.qemu_launcher_timeout+5), 50 | str(QemuTracer.qemu_launcher_timeout), 51 | "./qemu_launcher.sh", 52 | "./qemu_bb_wrap.sh", 53 | os.path.abspath(cb_fname),os.path.abspath(pov),td] 54 | res = utils.exec_cmd(args,cwd=os.path.join(QemuTracer.self_path,"../bin/")) 55 | l.debug("running: %s"%" ".join(args)) 56 | l.debug("results:") 57 | l.debug(res[0]) 58 | l.debug(res[1]) 59 | l.debug(res[2]) 60 | qemu_log = 
self.uncircle_string(open(os.path.join(td,"qemu_log.txt")).read()) 61 | l.debug(qemu_log) 62 | 63 | signal_lines = [line for line in qemu_log.split("\n") if line.startswith(QemuTracer.qemu_signal_log_line)] 64 | l.debug("signal lines:%s"%repr(signal_lines)) 65 | if len(signal_lines) == 0: 66 | return None 67 | 68 | trace_info = {} 69 | signal_line = signal_lines[0] 70 | trace_info["Signal"] = int(signal_line.split(QemuTracer.qemu_signal_log_line)[1].split()[0]) 71 | trace_info["LastIP1"] = int(signal_line.split("[")[1].split("]")[0],16) 72 | 73 | to_be_parsed_lines = [line for line in qemu_log.split("\n") if \ 74 | (line.startswith("Trace ") or line.startswith("IN:"))] 75 | 76 | itrace = remove_dups([line_to_bbinfo(line) for line in to_be_parsed_lines]) 77 | 78 | trace_info["LastBB1addr"],trace_info["LastBB1size"] = itrace[0] 79 | trace_info["LastBB2addr"],trace_info["LastBB2size"] = itrace[1] 80 | 81 | l.debug("pov_trace: %s"%" - ".join([k+":"+hex(trace_info[k]) for k in trace_info.keys()])) 82 | return trace_info 83 | 84 | 85 | -------------------------------------------------------------------------------- /cgrex/VagrantManager.py: -------------------------------------------------------------------------------- 1 | 2 | import os 3 | import re 4 | import utils 5 | import tempfile 6 | import shutil 7 | import contextlib 8 | from distutils.spawn import find_executable 9 | 10 | class VagrantManager: 11 | 12 | shared_folder_tag = "cgcrex_shared_tmp" 13 | 14 | def __init__(self,vgfile=None): 15 | assert (vgfile == None or os.path.exists(vgfile)) 16 | self.vgfile = vgfile 17 | self.vagrant_cmd = find_executable("vagrant") 18 | 19 | if self.vgfile!=None: 20 | self.vgfolder = os.path.dirname(os.path.realpath(self.vgfile)) 21 | else: 22 | self.vgfolder="/tmp" 23 | 24 | #TODO get these two by reading elf.vgfil 25 | self.shared_dir = self.vgfolder 26 | self.shared_remote_dir = "/vagrant" 27 | 28 | self.start_vm_if_necessary() 29 | 30 | 31 | @contextlib.contextmanager 32 | 
def get_shared_tmpdir(self,auto_delete=True): 33 | ''' 34 | create a temporary folder, shared with the vm 35 | ''' 36 | prefix = os.path.join(self.shared_dir,VagrantManager.shared_folder_tag) 37 | tmpdir = tempfile.mkdtemp(prefix=prefix) 38 | try: 39 | yield tmpdir 40 | finally: 41 | if auto_delete: 42 | shutil.rmtree(tmpdir) 43 | 44 | 45 | def start_vm_if_necessary(self): 46 | if self.vgfile==None or self.check_vm_status()=="running": 47 | return 48 | 49 | print "+++","the vm is down, powering it up" 50 | res = utils.exec_cmd([self.vagrant_cmd] + ["up"],cwd=self.vgfolder) 51 | assert self.check_vm_status()=="running","the vm did not start: %s" % repr(res) 52 | 53 | 54 | def check_vm_status(self): 55 | assert self.vgfile!=None 56 | 57 | #TODO consider other cases: not existing, ... 58 | res = utils.exec_cmd([self.vagrant_cmd] + ["status"],cwd=self.vgfolder) 59 | for line in res[0].split("\n"): 60 | line = line.strip() 61 | if line.startswith("default"): 62 | if "running" in line: 63 | return "running" 64 | return "non-running" 65 | 66 | 67 | def quote(self,s): 68 | #from shlex.quote 69 | if not s: 70 | return "''" 71 | # use single quotes, and put single quotes into double quotes 72 | # the string $'b is then quoted as '$'"'"'b' 73 | return "'" + s.replace("'", "'\"'\"'") + "'" 74 | 75 | 76 | def translate_and_quote(self,s): 77 | #this is somehow a heuristic, but it should be good 78 | if self.vgfile!=None: 79 | sep = os.path.sep 80 | bname = os.path.realpath(s) 81 | 82 | if bname.startswith(os.path.join(self.shared_dir,VagrantManager.shared_folder_tag)): 83 | #this is a path and it is inside a shared dir: convert it 84 | inside_path = bname[len(self.shared_dir)+1:] 85 | s = os.path.join(self.shared_remote_dir,inside_path) 86 | 87 | return self.quote(s) 88 | 89 | 90 | 91 | def exec_cmd(self,args,force_machine_up=False,debug=False): 92 | ''' 93 | execute one or more commands within the Vagrant vm, if a self.vgfile!=None 94 | if args is a list of lists, multiple 
commands are executed in bash 95 | ''' 96 | #TODO test inside vagrant (vgfile == None) 97 | 98 | if len(args)>0 and type(args[0])==list: 99 | #at the end every arg will be quoted twice 100 | targs = ";".join([" ".join([self.translate_and_quote(a) for a in c]) for c in args]) 101 | processed_args = ["bash","-c"] + [targs] 102 | else: 103 | processed_args = args 104 | 105 | 106 | if self.vgfile == None: 107 | #we are running inside the vm, just do normal execution 108 | res = utils.exec_cmd(processed_args,debug=debug) 109 | return res 110 | 111 | if force_machine_up: 112 | #checking the status every time slows down a lot 113 | #the machine should be (from when __init__ is called) 114 | if self.check_vm_status() != "running": 115 | self.start_vm_if_necessary() 116 | 117 | inner_args = [self.translate_and_quote(a) for a in processed_args] 118 | full_args = [self.vagrant_cmd,"ssh","--"] + inner_args 119 | #implicitly this seems to be called with shell=True 120 | res = utils.exec_cmd(full_args,cwd=self.vgfolder,debug=debug) 121 | return res 122 | 123 | 124 | 125 | 126 | -------------------------------------------------------------------------------- /cgrex/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/mechaphish/cgrex/8ce02e323646fc83e61db8eb6b4dc7c0dfa0ae03/cgrex/__init__.py -------------------------------------------------------------------------------- /cgrex/utils.py: -------------------------------------------------------------------------------- 1 | 2 | import subprocess 3 | import contextlib 4 | import yaml 5 | import tempfile 6 | import shutil 7 | import os 8 | import capstone 9 | import struct 10 | import sys 11 | 12 | from Exceptions import * 13 | 14 | 15 | ELF_HEADER = "7f45 4c46 0101 0100 0000 0000 0000 0000".replace(" ","").decode('hex') 16 | CGC_HEADER = "7f43 4743 0101 0143 014d 6572 696e 6f00".replace(" ","").decode('hex') 17 | 18 | #adapted from: 19 | 
#http://stackoverflow.com/questions/18666816/using-python-to-dump-hexidecimals-into-yaml 20 | def representer(dumper, data): 21 | return yaml.ScalarNode('tag:yaml.org,2002:int', hex(data)) 22 | def ydump(*args,**kwargs): 23 | kwargs['width']=1000000 #we like long lines 24 | res = yaml._dump(*args,**kwargs) 25 | return res.strip() 26 | yaml.add_representer(int, representer) 27 | yaml._dump = yaml.dump 28 | yaml.dump = ydump 29 | #the output of yaml_hex.dump() can still be loaded using the standard yaml module 30 | yaml_hex = yaml 31 | 32 | def str_overwrite(tstr,new,pos=None): 33 | if pos == None: 34 | pos = len(tstr) 35 | return tstr[:pos] + new + tstr[pos+len(new):] 36 | 37 | 38 | def pad_str(tstr,align,pad="\x00"): 39 | str_len = len(tstr) 40 | if str_len % align == 0: 41 | return tstr 42 | else: 43 | return tstr + pad * (align - (str_len%align)) 44 | 45 | 46 | def elf_to_cgc(tstr): 47 | assert(tstr.startswith(ELF_HEADER)) 48 | return str_overwrite(tstr,CGC_HEADER,0) 49 | 50 | 51 | def cgc_to_elf(tstr): 52 | assert(tstr.startswith(CGC_HEADER)) 53 | return str_overwrite(tstr,ELF_HEADER,0) 54 | 55 | 56 | def exe_type(tstr): 57 | if tstr.startswith(ELF_HEADER): 58 | return "ELF" 59 | elif tstr.startswith(CGC_HEADER): 60 | return "CGC" 61 | else: 62 | return None 63 | 64 | @contextlib.contextmanager 65 | def tempdir(prefix='/tmp/python_tmp'): 66 | """A context manager for creating and then deleting a temporary directory.""" 67 | tmpdir = tempfile.mkdtemp(prefix=prefix) 68 | try: 69 | yield tmpdir 70 | finally: 71 | #pass 72 | shutil.rmtree(tmpdir) 73 | 74 | 75 | def exec_cmd(args,cwd=None,shell=False,debug=False): 76 | #debug = True 77 | if debug: 78 | print "EXECUTING:",repr(args),cwd,shell 79 | 80 | pipe = subprocess.PIPE 81 | p = subprocess.Popen(args,cwd=cwd,shell=shell,stdout=pipe,stderr=pipe) 82 | std = p.communicate() 83 | retcode = p.poll() 84 | res = (std[0],std[1],retcode) 85 | 86 | if debug: 87 | print "RESULT:",repr(res) 88 | 89 | return res 90 | 91 | 92 
| def compile_asm_template(template_name,substitution_dict): 93 | formatted_template_content = get_asm_template(template_name,substitution_dict) 94 | return compile_asm(formatted_template_content) 95 | 96 | 97 | def get_asm_template(template_name,substitution_dict): 98 | project_basedir = os.path.sep.join(os.path.abspath(__file__).split(os.path.sep)[:-2]) 99 | template_fname = os.path.join(project_basedir,"asm",template_name) 100 | template_content = open(template_fname).read() 101 | formatted_template_content = template_content.format(**substitution_dict) 102 | return formatted_template_content 103 | 104 | 105 | def instruction_to_str(instruction,print_bytes=True): 106 | if print_bytes: 107 | pbytes = str(instruction.bytes).encode('hex').rjust(16) 108 | else: 109 | pbytes = "" 110 | return "0x%x %s:\t%s\t%s %s" %(instruction.address, pbytes, instruction.mnemonic, instruction.op_str, 111 | "{"+instruction.overwritten+"}" if hasattr(instruction,'overwritten') else "") 112 | 113 | def capstone_to_nasm(instruction): 114 | tstr = "db " 115 | tstr += ",".join([hex(struct.unpack("B",b)[0]) for b in str(instruction.bytes)]) 116 | tstr += " ;"+instruction_to_str(instruction,print_bytes=False) 117 | return tstr 118 | 119 | 120 | def decompile(code,offset=0x0): 121 | md = capstone.Cs(capstone.CS_ARCH_X86, capstone.CS_MODE_32) 122 | return list(md.disasm(code, offset)) 123 | 124 | 125 | def compile_jmp(origin,target): 126 | jmp_str = ''' 127 | USE32 128 | org {code_loaded_address} 129 | 130 | jmp {target} 131 | '''.format(**{'code_loaded_address':hex(origin),'target':hex(target)}) 132 | return compile_asm(jmp_str) 133 | 134 | 135 | def get_multiline_str(): 136 | print "[press Ctrl+C to exit]" 137 | input_list = [] 138 | try: 139 | while True: 140 | input_str = raw_input() 141 | input_list.append(input_str) 142 | except KeyboardInterrupt: 143 | pass 144 | print "" 145 | return "\n".join(input_list) 146 | 147 | 148 | def compile_asm(code,base=None): 149 | with tempdir() as td: 
150 | asm_fname = os.path.join(td,"asm.s") 151 | bin_fname = os.path.join(td,"bin.o") 152 | 153 | fp = open(asm_fname,'wb') 154 | fp.write("bits 32\n") 155 | if base != None: 156 | fp.write("org %s\n" % hex(base)) 157 | fp.write(code) 158 | fp.close() 159 | 160 | res = exec_cmd("nasm -o %s %s"%(bin_fname,asm_fname),shell=True) 161 | if res[2] != 0: 162 | print "NASM error:" 163 | print res[0] 164 | print res[1] 165 | print open(asm_fname,'r').read() 166 | raise NasmException 167 | 168 | compiled = open(bin_fname).read() 169 | 170 | return compiled 171 | 172 | @contextlib.contextmanager 173 | def redirect_stdout(new_target1,new_target2): 174 | old_target1, sys.stdout = sys.stdout, new_target1 # replace sys.stdout 175 | old_target2, sys.stderr = sys.stderr, new_target2 176 | 177 | try: 178 | yield (new_target1,new_target2) # run some code with the replaced stdout 179 | finally: 180 | sys.stdout = old_target1 # restore to the previous value 181 | sys.stderr = old_target2 182 | 183 | 184 | -------------------------------------------------------------------------------- /fortify.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/python 2 | 3 | import sys 4 | import os 5 | import shutil 6 | 7 | import cgrex.utils as utils 8 | from cgrex.Fortifier import Fortifier 9 | from cgrex.VagrantManager import VagrantManager 10 | 11 | 12 | 13 | def self_memory_oep_test(): 14 | fname = sys.argv[1] 15 | 16 | ff = Fortifier(fname) 17 | #print ff.dump_segments() 18 | 19 | assert not ff.has_fortify_segment(),"%s already fortified"%fname 20 | if not ff.has_fortify_segment(): 21 | ff.setup_headers() 22 | 23 | oep = ff.get_oep() 24 | print "--- original oep",oep,repr(hex(oep)) 25 | ff.set_oep(Fortifier.fortify_segment1_base) 26 | 27 | 28 | injected_code = utils.compile_asm_template("memory_scanner.asm", 29 | {'code_loaded_address':hex(Fortifier.fortify_segment1_base),'code_return':hex(oep)}) 30 | ff.set_fortify_segment(injected_code) 31 | 
ff.save(fname+"_cgrex") 32 | 33 | ''' 34 | vgm = VagrantManager(sys.argv[2]) 35 | with vgm.get_shared_tmpdir() as sd: 36 | save_fname = os.path.join(sd,os.path.basename(fname)+"_cgrex") 37 | ff.save(save_fname) 38 | res = vgm.exec_cmd(["exec",save_fname],debug=True) 39 | raw_input() 40 | ''' 41 | 42 | def inject_helloworld_test(): 43 | fname = sys.argv[1] 44 | 45 | ff = Fortifier(fname) 46 | #print ff.dump_segments() 47 | 48 | assert not ff.has_fortify_segment(),"%s already fortified"%fname 49 | if not ff.has_fortify_segment(): 50 | ff.setup_headers() 51 | 52 | oep = ff.get_oep() 53 | print "--- original oep",oep,repr(hex(oep)) 54 | ff.set_oep(Fortifier.fortify_segment1_base) 55 | 56 | injected_code = utils.compile_asm_template("helloworld.asm", 57 | {'code_loaded_address':hex(Fortifier.fortify_segment1_base),'code_return':hex(oep)}) 58 | ff.set_fortify_segment(injected_code) 59 | ff.save(fname+"_cgrex") 60 | 61 | 62 | if __name__ == "__main__": 63 | 64 | #./fortify.py ../../cgc/vm/cgc/shared/CADET_00001 65 | #self_memory_oep_test() 66 | #inject_helloworld_test() 67 | 68 | fname = sys.argv[1] 69 | ff = Fortifier(fname) 70 | assert not ff.has_fortify_segment(),"%s already fortified"%fname 71 | if not ff.has_fortify_segment(): 72 | ff.setup_headers() 73 | ff.set_fortify_segment("\x90"*1000) 74 | 75 | ff.dump_segments() 76 | print ff.get_maddress(0x8048f00,0x200).encode('hex') 77 | 78 | 79 | 80 | 81 | 82 | 83 | 84 | -------------------------------------------------------------------------------- /main.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | 3 | import os 4 | import docopt 5 | import sys 6 | from cgrex.CGRexAnalysis import CGRexAnalysis 7 | 8 | 9 | import logging 10 | logging.root.addHandler(logging.StreamHandler(sys.stdout)) 11 | l = logging.getLogger('cgrex.main') 12 | 13 | ARGS = ''' 14 | Usage: 15 | main.py --binary= --out= ... 
16 | ''' 17 | 18 | 19 | 20 | def main(): 21 | args = docopt.docopt(ARGS) 22 | l.debug(repr(args)) 23 | 24 | 25 | if os.path.isdir(args[""][0]): 26 | povlist = os.listdir(args[""][0]) 27 | else: 28 | povlist = args[""] 29 | 30 | cga = CGRexAnalysis(args["--binary"],args["--out"],args[""]) 31 | cga.run() 32 | 33 | 34 | if __name__ == "__main__": 35 | logging.getLogger('cgrex.main').setLevel(logging.DEBUG) 36 | logging.getLogger('cgrex.CGRexAnalysis').setLevel(logging.DEBUG) 37 | logging.getLogger('cgrex.QemuTracer').setLevel(logging.DEBUG) 38 | logging.getLogger('cgrex.MiasmPatcher').setLevel(logging.DEBUG) 39 | logging.getLogger('cgrex.Fortifier').setLevel(logging.DEBUG) 40 | logging.getLogger('angr.analyses.cfg').setLevel(logging.DEBUG) 41 | 42 | sys.exit(main()) 43 | 44 | 45 | 46 | ''' 47 | ./main.py --binary tests/0b32aa01/0b32aa01_01 --out /tmp/0b32aa01_01_cgrex1 tests/0b32aa01/0b32aa01_01.xml tests/0b32aa01/0b32aa01_02.xml 48 | ''' 49 | 50 | 51 | -------------------------------------------------------------------------------- /reqs.txt: -------------------------------------------------------------------------------- 1 | capstone 2 | docopt 3 | ElfEsteem 4 | pyparsing 5 | PyYAML 6 | defusedxml 7 | termcolor 8 | timeout-decorator 9 | angr 10 | -------------------------------------------------------------------------------- /tests/0b32aa01/0b32aa01_01: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/mechaphish/cgrex/8ce02e323646fc83e61db8eb6b4dc7c0dfa0ae03/tests/0b32aa01/0b32aa01_01 -------------------------------------------------------------------------------- /tests/0b32aa01/0b32aa01_01.xml: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | service 5 | 6 | 31\nWelcome to Palindrome Finder\n\n 7 | 37\tPlease enter a possible palindrome: 8 | 9 | 
\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00 10 | 11 | 12 | 13 | -------------------------------------------------------------------------------- /tests/0b32aa01/0b32aa01_02.xml: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | service 5 | 6 | 31\nWelcome to Palindrome Finder\n\n 7 | 37\tPlease enter a possible palindrome: 8 | 9 | aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaabaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa 10 | 11 | 12 | 13 | --------------------------------------------------------------------------------