├── CREDITS ├── LICENSE ├── README.md ├── analyzer ├── __init__.py ├── coverage │ ├── coverage.cpp │ ├── makefile │ └── makefile.rules └── pin.py ├── blockcache.py ├── campaign.py ├── choronzon.py ├── chromosome ├── __init__.py ├── chromosome.py ├── deserializer.py ├── factory.py ├── gene.py ├── parsers │ ├── PNG.py │ └── __init__.py └── serializer.py ├── configuration.py ├── disassembler ├── __init__.py ├── disassembler.py └── prepare.py ├── evaluator.py ├── fuzzers ├── __init__.py ├── mutators.py ├── recombinators.py └── strategy.py ├── settings ├── __init__.py ├── iview.py ├── pngcheck.py ├── system.py └── winsystem.py ├── tracer.py └── world.py /CREDITS: -------------------------------------------------------------------------------- 1 | Developers: 2 | 3 | Zisis Sialveras 4 | Nikos Naziridis 5 | 6 | Beta testers (unordered): 7 | 8 | Richard Johnson 9 | Aleksandar Nikolich 10 | Joxean Koret 11 | Ben Nagy 12 | Georgi Geshev 13 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | Choronzon - An evolutionary knowledge-based fuzzer 2 | 3 | Copyright (c) 2016 Zisis Sialveras 4 | Copyright (c) 2016 Nikos Naziridis 5 | Copyright (c) 2016 CENSUS, S.A. (http://www.census-labs.com/) 6 | All rights reserved. 7 | 8 | Redistribution and use in source and binary forms, with or without 9 | modification, are permitted provided that the following conditions 10 | are met: 11 | 12 | 1. Redistributions of source code must retain the above copyright 13 | notice, this list of conditions and the following disclaimer. 14 | 2. Redistributions in binary form must reproduce the above copyright 15 | notice, this list of conditions and the following disclaimer in the 16 | documentation and/or other materials provided with the distribution. 17 | 3. 
The names of the authors and copyright holders may not be used to 18 | endorse or promote products derived from this software without 19 | specific prior written permission. 20 | 21 | THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, 22 | INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY 23 | AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL 24 | THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, 25 | EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, 26 | PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; 27 | OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, 28 | WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR 29 | OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF 30 | ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 31 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Choronzon - An evolutionary knowledge-based fuzzer 2 | 3 | ## Introduction 4 | 5 | This document aims to explain in brief the theory behind **Choronzon**. 6 | Moreover, it provides details about its internals and how one can extend 7 | **Choronzon** to meet new requirements. An overview of the architecture of 8 | **Choronzon** was initially presented at the [ZeroNights 2015 9 | Conference](http://2015.zeronights.org/). A 10 | [recording](https://www.youtube.com/watch?v=WafsYOCl8hQ) of the presentation and 11 | the [slide deck](https://census-labs.com/media/choronzon-zeronights-2015.pdf) 12 | are also available. 13 | 14 | **Choronzon** is an evolutionary fuzzer. It tries to imitate the evolutionary 15 | process in order to keep producing better results. 
To achieve this, 16 | it has an evaluation system to classify which of the fuzzed files are 17 | interesting and which should be dropped. 18 | 19 | Moreover, **Choronzon** is a knowledge-based fuzzer. It uses user-defined 20 | information to read and write files of the targeted file format. To become 21 | familiar with **Choronzon's** terminology, you should consider that each file is 22 | represented by a **chromosome**. Users should describe the elementary structure 23 | of the file format under consideration. A high-level overview of the file format 24 | is preferred instead of describing every detail and aspect of it. Each one of 25 | those user-defined elementary structures is considered a **gene**. Each 26 | chromosome contains a tree of genes and it is able to build the corresponding 27 | file from it. 28 | 29 | **Choronzon** is divided into three subsystems: the **Tracer** module, 30 | the **Chromosome** module and the fuzzer. 31 | 32 | Briefly, the **Chromosome** component is used to describe the target file 33 | format. Users are able to write their own modules to support new or 34 | custom formats. As a test-case, a PNG module is provided with **Choronzon**. 35 | 36 | On the other hand, the **Tracer** component is responsible for monitoring the target 37 | application and collecting various information about its execution. This version 38 | of **Choronzon** uses [Intel's 39 | Pin](https://software.intel.com/en-us/articles/pin-a-dynamic-binary-instrumentation-tool) 40 | binary instrumentation tool in order to log the basic blocks that were visited 41 | during the execution. However, **Choronzon** is able to support other tracing 42 | backends as well. Also keep in mind that in the next version of Choronzon, Pin is 43 | going to be replaced due to its staggering performance impact. 44 | 45 | Lastly, the fuzzer component is used to alter the contents of the files to be 46 | tested. The module contains a corpus of **Mutators** and **Recombinators**. 
47 | **Mutators** simply change the file the way common fuzzers do. For example, 48 | they perform byte flipping, byte swapping, random byte mutation and so on. But 49 | **Choronzon** has another feature that is not that common across fuzzers. 50 | **Recombinators** use the information about the structure of the file 51 | format, provided by the **Chromosome** module, in order to perform intelligent 52 | fuzzing. 53 | 54 | ## Chromosome 55 | 56 | In the directory `chromosome/parsers` you can find the file `PNG.py`. This 57 | Python module describes the PNG file format to the fuzzer. You may add your 58 | custom modules for other file formats in this directory. 59 | 60 | The fundamental idea behind the **Chromosome** subsystem is to convert the 61 | initial seed files using a **Deserializer** into a tree of **Genes**. At some 62 | point, the (fuzzed) **Genes** will be written into a file, using a 63 | **Serializer**. 64 | 65 | Consider that in **Choronzon** the aim of the parser module is to provide 66 | *the elementary structure* of the file format, instead of 67 | every minor detail. This will help the fuzzer to construct files that are 68 | mostly sane, avoiding early exits from the target application. 69 | Additionally, this approach saves time, because describing every aspect 70 | of the file format is time-consuming and introduces significant development 71 | overhead. 72 | 73 | ### How to write a custom parser 74 | 75 | A new parser module must import: 76 | 77 | * chromosome.gene.AbstractGene 78 | * chromosome.serializer.BaseSerializer 79 | * chromosome.deserializer.BaseDeserializer 80 | 81 | and it must implement 82 | 83 | * a **Gene** class derived from **chromosome.gene.AbstractGene**, 84 | * a **Serializer** class derived from **chromosome.serializer.BaseSerializer**, 85 | * and a **Deserializer** class derived from 86 | **chromosome.deserializer.BaseDeserializer**. 
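A minimal parser module following the recipe above might look like the sketch below. Everything in it is hypothetical: the toy "TLV" format (one tag byte, one length byte, then the payload) and all class names are invented for illustration, and the three base classes are local stand-ins for the real `chromosome.gene.AbstractGene`, `chromosome.serializer.BaseSerializer` and `chromosome.deserializer.BaseDeserializer`, so the example runs on its own. A real parser would import and subclass Choronzon's own bases, and `chromosome/parsers/PNG.py` remains the authoritative example.

```python
import struct

# Stand-ins for Choronzon's real base classes, so this sketch is
# self-contained; a real parser module would instead do e.g.:
#   from chromosome.gene import AbstractGene
class AbstractGene(object):      # stand-in for chromosome.gene.AbstractGene
    def serialize(self):
        raise NotImplementedError

class BaseSerializer(object):    # stand-in for chromosome.serializer.BaseSerializer
    pass

class BaseDeserializer(object):  # stand-in for chromosome.deserializer.BaseDeserializer
    pass

class TLVGene(AbstractGene):
    '''One elementary structure (a tag-length-value record) of the toy format.'''
    def __init__(self, tag, payload):
        self.tag = tag
        self.payload = payload

    def serialize(self):
        # rebuild the raw bytes of this record: tag byte, length byte, payload
        return struct.pack('BB', self.tag, len(self.payload)) + self.payload

class TLVSerializer(BaseSerializer):
    def serialize(self, genes):
        # concatenate every gene back into a (mostly sane) file image
        return b''.join(gene.serialize() for gene in genes)

class TLVDeserializer(BaseDeserializer):
    def deserialize(self, data):
        # split a valid file into its elementary structures (a flat
        # "tree" of genes here; a real format may nest them)
        genes, off = [], 0
        while off + 2 <= len(data):
            tag, length = struct.unpack_from('BB', data, off)
            genes.append(TLVGene(tag, data[off + 2:off + 2 + length]))
            off += 2 + length
        return genes
```

Round-tripping a seed file through the deserializer and serializer should reproduce it byte for byte; that is a quick sanity check worth doing for any new parser module.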
87 | 88 | In the example shipped with **Choronzon**, each **PNGGene** corresponds to a PNG 89 | chunk. Generally, you may think of a **Gene** as an *elementary data structure* 90 | of the target format. Each **Chromosome** is composed of a tree of **Genes**, 91 | and represents a unique file. Each **Gene** must be able to produce a byte string 92 | that contains its data combined with the data of the lower **Genes** in the 93 | tree. 94 | 95 | The **PNGSerializer** must be able to produce a (mostly sane) file when 96 | a list of **Genes** is given to it. On the other hand, **PNGDeserializer** must 97 | be able to parse a *valid* file of the target format and deserialize it to a 98 | tree of **Genes**. 99 | 100 | Check `chromosome/parsers/PNG.py` for a commented example for the PNG format. 101 | 102 | ## Tracer 103 | 104 | The **Tracer** module is used to disassemble the target application (and/or one 105 | or more of its libraries). In this version of **Choronzon** this is achieved 106 | with IDA. We used this approach because we can correlate any interesting 107 | information from the fuzzing campaign with our IDBs. However, we may drop the 108 | dependency on IDA in the near future in order to make **Choronzon** more 109 | portable and accessible. 110 | 111 | A file is tested against an application with the help of a Pin utility. In the 112 | `analyzer/coverage` directory there's the source code of this Pin tool, which 113 | injects hooks at the beginning of each basic block of the target application. 114 | When the execution is finished, we correlate each basic block that was hit with 115 | the corresponding basic block from the binary. Thus, we're able to calculate metrics that are 116 | valuable for us (coverage etc.). 117 | 118 | ## Fuzzer 119 | 120 | The **Fuzzer** component uses the **Chromosome** representation to fuzz a 121 | file. As mentioned earlier, there are two fuzzing methods in **Choronzon**. 
122 | 123 | For the first method, **Choronzon** gets the content from one or more genes 124 | and applies one of the **Mutators**. **Mutators** implement common but effective 125 | fuzzing methods like random byte mutation, high bit set, byte swapping 126 | and many more. You may also write your own custom mutators and add them in 127 | `fuzzers/mutators.py`. 128 | 129 | The second fuzzing method is called recombination. **Recombinators** are 130 | used to change the structure of the file. Here's an example with the 131 | PNG format. 132 | 133 | PNG files are composed of consecutive chunks that contain four fields: 134 | 135 | * length, 136 | * chunk's type, 137 | * chunk's data, 138 | * and a CRC. 139 | 140 | Let's assume we have a PNG file that only has IHDR, IDAT and IEND chunks. Its 141 | structure would look like the following: 142 | 143 | [ PNG signature ] [ IHDR ] [ IDAT ] [ IEND ] 144 | 145 | Since **Choronzon** is aware of the basic structures (i.e. the PNG chunks), 146 | it is able to alter their sequence. After a successful recombination the fuzzed 147 | PNG output file can look like this: 148 | 149 | [ PNG signature ] [ IDAT ] [ IHDR ] [ IEND ] 150 | 151 | **Choronzon** contains many more recombination strategies that make it 152 | able to cope even with complicated file formats. 153 | 154 | ## Installation 155 | 156 | **Choronzon** has been tested with Python 2.7, Pin 3, IDA Pro 6.6 to 6.9, 157 | on Ubuntu 16.04 LTS (Linux kernel 4.4) and Windows 10. 158 | 159 | In order to run it you'll need to install the sortedcontainers Python package. 160 | You may find it [here](https://pypi.python.org/pypi/sortedcontainers) or install 161 | it via pip. 162 | 163 | Moreover, **Choronzon** needs IDA Pro (actually, its terminal version). 
The 164 | path of IDA Pro should be specified in your configuration file like this: 165 | 166 | ``` 167 | DisassemblerPath = 'C:\\Program Files (x86)\\IDA 6.6' 168 | ``` 169 | 170 | It has been tested successfully with IDA Pro 6.6, 6.7, 6.8 and 6.9. 171 | 172 | **Choronzon's** coverage Pin tool is located at `analyzer/coverage` and must be 173 | compiled. You may want to check Pin's documentation for details, or you can 174 | perform the following steps: 175 | 176 | 1. Copy the `coverage.cpp` and `makefile.rules` files to 177 | `/path/to/pin/source/tools/MyPinTool` 178 | 2. Run `make`. If you're on Windows you should run the Visual 179 | Studio command line, and use the `make` utility and its dependencies from 180 | [Cygwin](https://www.cygwin.com/) 181 | 3. Copy back to `/path/to/choronzon/analysis/coverage` the newly created 182 | `obj-intel64` directory (or `obj-ia32` for 32-bit systems) 183 | 184 | ## Configuration 185 | 186 | In order to fuzz with **Choronzon**, you must provide a configuration 187 | file. In the `settings` directory there is an example of **Choronzon's** 188 | configuration. 189 | 190 | -------------------------------------------------------------------------------- /analyzer/__init__.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | 3 | from pin import * 4 | -------------------------------------------------------------------------------- /analyzer/coverage/coverage.cpp: -------------------------------------------------------------------------------- 1 | #include "pin.H" 2 | 3 | #include <stdio.h> 4 | #include <stdlib.h> 5 | 6 | #ifdef TARGET_LINUX 7 | 8 | #include <string.h> 9 | #include <unistd.h> 10 | #include <fcntl.h> 11 | #include <signal.h> 12 | 13 | #elif TARGET_WINDOWS 14 | namespace WIN32_API { 15 | #include <Windows.h> 16 | } 17 | #endif 18 | 19 | /* Definitions of pintool's data structures. */ 20 | 21 | // struct that holds information 22 | // about an image. 
23 | typedef struct 24 | { 25 | ADDRINT low; 26 | ADDRINT high; 27 | INT32 loaded; 28 | char *path; 29 | UINT32 index; 30 | } image_t; 31 | 32 | // whitelist type 33 | typedef struct 34 | { 35 | image_t *list; 36 | off_t len; 37 | } whitelist_t; 38 | 39 | typedef struct { 40 | UINT64 image_index; 41 | UINT64 bbl; 42 | } node_t; 43 | 44 | typedef struct 45 | { 46 | UINT64 size; 47 | node_t *start; 48 | node_t *curr; // points to the first *empty* node 49 | node_t *end; // points right *after* the allocated region 50 | UINT32 guard; 51 | } bucket_t; 52 | 53 | /* Declaration of global variables */ 54 | uint64_t bbls_count; 55 | #ifdef TARGET_LINUX 56 | int pipeHandle; 57 | #elif TARGET_WINDOWS 58 | WIN32_API::HANDLE pipeHandle; 59 | WIN32_API::HANDLE TimeoutEvent; 60 | volatile WIN32_API::BOOL IsProcessRunning = true; 61 | volatile WIN32_API::BOOL IsProcessSignaled = false; 62 | PIN_LOCK *PinThreadLock; 63 | PIN_THREAD_UID InternalPINThreadUid; 64 | #else 65 | #error "This operating system is not supported yet." 
66 | #endif 67 | 68 | #define IS_BUCKET_FULL(bkt) (bkt->curr >= bkt->end) 69 | bucket_t *bucket; 70 | whitelist_t whitelist; 71 | 72 | VOID write_to_pipe(VOID *, size_t); 73 | 74 | bucket_t * 75 | init_bucket(int pipe_size) 76 | { 77 | bucket_t *bkt = (bucket_t *)malloc(sizeof(bucket_t)); 78 | if(bkt == NULL) 79 | { 80 | perror("malloc"); 81 | return NULL; 82 | } 83 | 84 | bkt->size = (pipe_size >> 1) / sizeof(node_t); 85 | bkt->start = bkt->curr = (node_t *)malloc(bkt->size * sizeof(node_t)); 86 | if(bkt->start == NULL) 87 | { 88 | perror("malloc"); 89 | return NULL; 90 | } 91 | bkt->end = bkt->start + bkt->size; 92 | bkt->guard = 0x41424344; 93 | return bkt; 94 | } 95 | 96 | /* Whitelisting */ 97 | INT32 98 | wht_insert_image(image_t *image) 99 | { 100 | image_t *list = whitelist.list; 101 | off_t i = 0; 102 | for (i=0; i < whitelist.len; i++) 103 | { 104 | char *stub = (char *)strstr( 105 | image->path, 106 | list[i].path 107 | ); 108 | 109 | // no match, check the next entry 110 | if (stub == NULL) 111 | continue; 112 | 113 | image->index = i; 114 | // copy image metadata to whitelist 115 | memcpy(list+i, image, sizeof(image_t)); 116 | return 0; 117 | } 118 | 119 | return -1; 120 | } 121 | 122 | image_t * 123 | wht_find_image(ADDRINT bbl) 124 | { 125 | image_t *list = whitelist.list; 126 | off_t i = 0; 127 | for (i=0; i < whitelist.len; i++) 128 | { 129 | if (!list[i].loaded) 130 | continue; 131 | 132 | if (list[i].low <= bbl && list[i].high >= bbl) 133 | return list+i; 134 | } 135 | 136 | return NULL; 137 | } 138 | 139 | /* 140 | * Initializes the whitelist struct and fills 141 | * it with image stubs. 142 | * 143 | * wht_insert_image() will overwrite the matching 144 | * stub with the loaded image. 145 | */ 146 | VOID 147 | wht_init(KNOB<string> *img_list) 148 | { 149 | // Initialize the whitelist structure, 150 | // set the length and allocate the space 151 | // required. 
152 | whitelist.len = img_list->NumberOfValues(); 153 | whitelist.list = (image_t *)malloc( 154 | whitelist.len * sizeof(image_t) 155 | ); 156 | 157 | // TODO: needs some better error 158 | // handling. 159 | if (!whitelist.list) 160 | LOG("Could not allocate whitelist\n"); 161 | 162 | // build an empty stub that 163 | // will be overwritten by the 164 | // matching image if that image 165 | // is loaded. 166 | image_t *p; 167 | off_t i = 0; 168 | image_t stub; 169 | 170 | stub.low = 0; 171 | stub.high = 0; 172 | stub.loaded = 0; 173 | stub.path = NULL; 174 | 175 | for (i=0; i < whitelist.len; i++) 176 | { 177 | stub.path = (char *)img_list->Value(i).c_str(); 178 | stub.index = i; 179 | p = whitelist.list + i; 180 | memcpy(p, &stub, sizeof(image_t)); 181 | } 182 | } 183 | 184 | VOID 185 | wht_free() 186 | { 187 | free(whitelist.list); 188 | } 189 | 190 | /* Instrumentation */ 191 | VOID 192 | img_load(IMG img, VOID *v) 193 | { 194 | image_t image; 195 | image.path = (char *)IMG_Name(img).c_str(); 196 | image.low = IMG_LowAddress(img); 197 | image.high = IMG_HighAddress(img); 198 | image.loaded = 1; 199 | 200 | LOG("[+] Image "); 201 | LOG(image.path); 202 | if(!wht_insert_image(&image)) 203 | LOG(" loaded successfully\n"); 204 | else 205 | LOG(" skipped\n"); 206 | } 207 | 208 | VOID 209 | img_unload(IMG img, VOID *v) 210 | { 211 | ADDRINT low = IMG_LowAddress(img); 212 | image_t *i = wht_find_image(low); 213 | if (i) 214 | { 215 | LOG("[+] Unloading image "); 216 | LOG(i->path); 217 | LOG("\n"); 218 | i->loaded = 0; 219 | } 220 | } 221 | 222 | VOID 223 | bbl_hit_handler(image_t *img, ADDRINT ip) 224 | { 225 | #ifdef TARGET_WINDOWS 226 | PIN_GetLock(PinThreadLock, 0); 227 | if(IsProcessSignaled) 228 | { PIN_ReleaseLock(PinThreadLock); return; } // release the lock on the early return 229 | #endif /* TARGET_WINDOWS */ 230 | 231 | if(IS_BUCKET_FULL(bucket)) 232 | { 233 | write_to_pipe(bucket->start, (bucket->end - bucket->start) * sizeof(node_t)); 234 | bucket->curr = bucket->start; 235 | } 236 | 237 | bucket->curr->image_index = img->index; 
238 | bucket->curr->bbl = (UINT64)(ip - img->low); 239 | bucket->curr++; 240 | 241 | bbls_count++; 242 | 243 | #ifdef TARGET_WINDOWS 244 | PIN_ReleaseLock(PinThreadLock); 245 | #endif /* TARGET_WINDOWS */ 246 | } 247 | 248 | VOID 249 | trace_callback(TRACE trace, VOID *v) 250 | { 251 | // get trace's address and check 252 | // if the image it belongs to has 253 | // been whitelisted. 254 | ADDRINT addr = TRACE_Address(trace); 255 | image_t *im = wht_find_image(addr); 256 | if (!im) 257 | return; 258 | 259 | // add instrumentation to all basic blocks 260 | // in the current trace. 261 | BBL bbl = TRACE_BblHead(trace); 262 | for (; BBL_Valid(bbl); bbl = BBL_Next(bbl)) 263 | { 264 | addr = BBL_Address(bbl); 265 | 266 | // enable tracing of this 267 | // basic block. 268 | BBL_InsertCall( 269 | bbl, 270 | IPOINT_ANYWHERE, 271 | (AFUNPTR)bbl_hit_handler, 272 | IARG_FAST_ANALYSIS_CALL, 273 | IARG_PTR, 274 | im, 275 | IARG_ADDRINT, 276 | addr, 277 | IARG_END 278 | ); 279 | } 280 | } 281 | 282 | void 283 | context_change_cb(THREADID thridx, CONTEXT_CHANGE_REASON reason, const CONTEXT *from, CONTEXT *to, INT32 info, VOID *v) 284 | { 285 | switch(reason) 286 | { 287 | case CONTEXT_CHANGE_REASON_FATALSIGNAL: 288 | if(IS_BUCKET_FULL(bucket)) 289 | { 290 | write_to_pipe(bucket->start, (bucket->end - bucket->start) * sizeof(node_t)); 291 | bucket->curr = bucket->start; 292 | } 293 | bucket->curr->image_index = 0xFFFFFFFFFFFFFFFF; 294 | bucket->curr->bbl = (UINT64)info; 295 | bucket->curr++; 296 | break; 297 | case CONTEXT_CHANGE_REASON_EXCEPTION: 298 | #define IS_FATAL_EXCEPTION(ex) ((ex & 0xC0000000) == 0xC0000000) 299 | if(IS_FATAL_EXCEPTION(info)) 300 | { 301 | if(IS_BUCKET_FULL(bucket)) 302 | { 303 | write_to_pipe(bucket->start, (bucket->end - bucket->start) * sizeof(node_t)); 304 | bucket->curr = bucket->start; 305 | } 306 | bucket->curr->image_index = 0xFFFFFFFFFFFFFFFF; 307 | bucket->curr->bbl = (UINT64)info; 308 | bucket->curr++; 309 | } 310 | break; 311 | default: 312 | 
break; 313 | } 314 | } 315 | 316 | /* Windows specific code */ 317 | #ifdef TARGET_WINDOWS 318 | 319 | #define EVENT_WAIT_TIMEOUT 500 320 | /* Checks whether a timeout event has been triggered. If that's the case, 321 | * it flushes the data from the buckets into the pipe and returns. 322 | */ 323 | void 324 | CheckTerminationEvent(void *arg) 325 | { 326 | using namespace WIN32_API; 327 | LOG("New thread has spawned.\n"); 328 | while(IsProcessRunning) 329 | { 330 | if(WaitForSingleObject(TimeoutEvent, EVENT_WAIT_TIMEOUT) == WAIT_OBJECT_0) 331 | { 332 | LOG("Event was set.\n"); 333 | PIN_GetLock(PinThreadLock, 0); 334 | 335 | if(IS_BUCKET_FULL(bucket)) 336 | { 337 | write_to_pipe(bucket->start, (bucket->end - bucket->start) * sizeof(node_t)); 338 | bucket->curr = bucket->start; 339 | } 340 | bucket->curr->image_index = 0xFFFFFFFFFFFFFFFF; 341 | // SIGUSR2, process terminated due to a timeout event 342 | bucket->curr->bbl = (uint64_t)(0x0000000c); 343 | bucket->curr++; 344 | 345 | IsProcessSignaled = true; 346 | IsProcessRunning = false; 347 | PIN_ReleaseLock(PinThreadLock); 348 | PIN_ExitApplication(0); 349 | } 350 | } 351 | } 352 | 353 | /* Signal the internal pin thread that the process is exiting and 354 | * wait for the thread to terminate. 
355 | */ 356 | VOID 357 | TerminateInternalPINThreads(VOID *dummy) 358 | { 359 | LOG("Waiting for CheckTerminationEvent\n"); 360 | IsProcessRunning = false; 361 | PIN_WaitForThreadTermination(InternalPINThreadUid, PIN_INFINITE_TIMEOUT, NULL); 362 | LOG("CheckTerminationEvent has finished.\n"); 363 | } 364 | #endif /* TARGET_WINDOWS */ 365 | 366 | /* IPC */ 367 | VOID 368 | write_to_pipe(VOID *buffer, size_t count) 369 | { 370 | #ifdef TARGET_LINUX 371 | ssize_t bytes_written = 0; 372 | while(count > 0) 373 | { 374 | bytes_written = write(pipeHandle, buffer, count); 375 | if(bytes_written < 0) 376 | { 377 | perror("write"); 378 | exit(1); 379 | } 380 | else 381 | { 382 | buffer = (char *)buffer + bytes_written; count -= bytes_written; /* write() may be partial */ 383 | } 384 | } 385 | #elif TARGET_WINDOWS 386 | { 387 | using namespace WIN32_API; 388 | DWORD NumberOfBytesWritten; 389 | NumberOfBytesWritten = 0; 390 | WriteFile(pipeHandle, buffer, count, &NumberOfBytesWritten, NULL); 391 | if(NumberOfBytesWritten != count) 392 | { 393 | LOG("WriteFile failed ?\n"); 394 | } 395 | } 396 | #else 397 | #error "This operating system is not supported." 
398 | #endif 399 | } 400 | 401 | #ifdef TARGET_LINUX 402 | int 403 | get_pipe_max_size() 404 | { 405 | FILE *fp; 406 | int pipe_max_size = 0; 407 | 408 | fp = fopen("/proc/sys/fs/pipe-max-size", "r"); 409 | if(fp == NULL) 410 | return 0; 411 | 412 | if(fscanf(fp, "%u", &pipe_max_size) != 1) 413 | { 414 | pipe_max_size = 0; 415 | 416 | fprintf(stderr, "Unable to read /proc/sys/fs/pipe-max-size properly\n"); 417 | } 418 | 419 | return pipe_max_size; 420 | } 421 | #endif /* TARGET_LINUX */ 422 | 423 | char *fix_pipe_name(const char *name) 424 | { 425 | char *p; 426 | if(name[0] == '\\' && name[1] == '\\') { 427 | p = (char *)name; 428 | } else { 429 | p = (char *)malloc(strlen(name) + 0x20); 430 | sprintf(p, "\\\\.\\pipe\\%s", name); 431 | } 432 | return p; 433 | } 434 | 435 | int 436 | init_fifo(const char *fifoname) 437 | { 438 | #ifdef TARGET_LINUX 439 | int pipe_max_size; 440 | pipe_max_size = get_pipe_max_size(); 441 | LOG("init_fifo"); 442 | if(!pipe_max_size) 443 | return 0; 444 | 445 | LOG("opening handle"); 446 | if((pipeHandle = open(fifoname, O_WRONLY)) < 0) { 447 | perror("open"); 448 | return 0; 449 | } 450 | 451 | if(fcntl(pipeHandle, F_SETPIPE_SZ, pipe_max_size) < 0) { 452 | perror("fcntl"); 453 | return 0; 454 | } 455 | 456 | return pipe_max_size; 457 | #elif TARGET_WINDOWS 458 | { 459 | using namespace WIN32_API; 460 | #define WINDOWS_PIPE_SIZE 0x8000 461 | char *pipename = fix_pipe_name(fifoname); 462 | LOG(pipename); 463 | pipeHandle = CreateNamedPipe(pipename, 464 | PIPE_ACCESS_OUTBOUND, 465 | PIPE_TYPE_BYTE | PIPE_WAIT, 466 | 1, 467 | WINDOWS_PIPE_SIZE, 468 | WINDOWS_PIPE_SIZE, 469 | 0, 470 | NULL); 471 | 472 | if(pipeHandle == INVALID_HANDLE_VALUE) { 473 | LOG("CreateNamedPipeA failed.\n"); 474 | return 0; 475 | } 476 | 477 | if(!ConnectNamedPipe(pipeHandle, NULL)) { 478 | if(GetLastError() != ERROR_PIPE_CONNECTED) { 479 | LOG("ConnectNamedPipe failed.\n"); 480 | return 0; 481 | } 482 | } 483 | return WINDOWS_PIPE_SIZE; 484 | } 485 | #else 486 | 
#error "This option is not supported yet." 487 | #endif 488 | } 489 | 490 | void 491 | write_header() 492 | { 493 | uint8_t image_count; 494 | uint8_t *header; 495 | unsigned int header_size, pos; 496 | 497 | image_count = (uint8_t)whitelist.len; 498 | header_size = 1; 499 | header = NULL; 500 | 501 | for(uint8_t i = 0; i < image_count; i++) 502 | header_size += strlen(whitelist.list[i].path) + 2; 503 | 504 | header = (uint8_t*)malloc(header_size); 505 | *header = image_count; 506 | pos = 1; 507 | 508 | for(uint8_t i = 0; i < image_count; i++) 509 | { 510 | *(uint16_t *)(header + pos) = (uint16_t)strlen(whitelist.list[i].path); 511 | pos += 2; 512 | memcpy(header + pos, whitelist.list[i].path, strlen(whitelist.list[i].path)); 513 | pos += strlen(whitelist.list[i].path); 514 | } 515 | 516 | // header layout: [image count - 1 byte] then, per image: [name length - 2 bytes][image name - not NUL-terminated] 517 | write_to_pipe(header, header_size); free(header); 518 | } 519 | 520 | /* 521 | * Main and usage 522 | */ 523 | KNOB<string> 524 | knob_database( 525 | KNOB_MODE_WRITEONCE, 526 | "pintool", 527 | "o", "", 528 | "specify an output file that will be generated from the target executable" 529 | ); 530 | 531 | KNOB<string> 532 | knob_event( 533 | KNOB_MODE_WRITEONCE, 534 | "pintool", 535 | "e", "", 536 | "windows only - the name of the event"); 537 | 538 | KNOB<string> 539 | knob_whitelist( 540 | KNOB_MODE_APPEND, 541 | "pintool", 542 | "wht", "", 543 | "list of image names to instrument" 544 | ); 545 | 546 | void 547 | pin_finish(INT32 code, VOID *v) 548 | { 549 | char buffer[100]; 550 | snprintf(buffer, sizeof(buffer), "pin_finish, bbls hit: %llu\n", (unsigned long long)bbls_count); 551 | LOG(buffer); 552 | write_to_pipe(bucket->start, (bucket->curr - bucket->start) * sizeof(node_t)); 553 | #ifdef TARGET_LINUX 554 | LOG("closing fifo"); 555 | close(pipeHandle); 556 | #elif TARGET_WINDOWS 557 | CloseHandle(pipeHandle); 558 | CloseHandle(TimeoutEvent); 559 | #else 560 | #error "This system is not supported." 
561 | #endif 562 | } 563 | 564 | INT32 565 | usage() 566 | { 567 | printf("This tool traces all the basic blocks " 568 | "and routines that are accessed during execution\n"); 569 | return -1; 570 | } 571 | 572 | int 573 | main(int argc, char **argv) { 574 | int pipe_size; 575 | 576 | if(PIN_Init(argc, argv)) { 577 | LOG("PIN_Init() failed.\n"); 578 | return usage(); 579 | } 580 | 581 | pipe_size = init_fifo(knob_database.Value().c_str()); 582 | 583 | if(!pipe_size) { 584 | LOG("init_fifo() failed\n"); 585 | fprintf(stderr, "Unable to make a fifo.\n"); 586 | return -1; 587 | } 588 | LOG("pipe ok\n"); 589 | 590 | #ifdef TARGET_WINDOWS 591 | if(!knob_event.Value().size()) { 592 | LOG("Error in arguments (event was not set).\n"); 593 | return usage(); 594 | } 595 | 596 | /* On Windows, the instrumentation timeout is specified by an event object */ 597 | { 598 | using namespace WIN32_API; 599 | 600 | TimeoutEvent = CreateEventA(NULL, TRUE, FALSE, knob_event.Value().c_str()); 601 | 602 | if(TimeoutEvent == NULL) { 603 | LOG("CreateEventA failed.\n"); 604 | fprintf(stderr, "CreateEventA failed %lu.", GetLastError()); 605 | return -2; 606 | } 607 | 608 | LOG("event ok\n"); 609 | } 610 | #endif /* TARGET_WINDOWS */ 611 | 612 | 613 | bucket = init_bucket(pipe_size); 614 | LOG("bucket ok\n"); 615 | wht_init(&knob_whitelist); 616 | LOG("whitelist ok\n"); 617 | 618 | write_header(); 619 | LOG("write_header ok\n"); 620 | 621 | IMG_AddInstrumentFunction(img_load, NULL); 622 | IMG_AddUnloadFunction(img_unload, NULL); 623 | 624 | #ifdef TARGET_WINDOWS 625 | /* 626 | * On Windows, in order to signal to the pintool that the instrumentation 627 | * should stop, we're using an event object. A new thread is created to 628 | * watch if the event was set. On the other hand, on Linux, a SIGUSR2 629 | * signal is sent to the process. 
630 | */ 631 | PinThreadLock = (PIN_LOCK *)malloc(sizeof(PIN_LOCK)); 632 | if(PinThreadLock == NULL) { 633 | LOG("Allocating PinThreadLock failed.\n"); 634 | return usage(); 635 | } 636 | PIN_InitLock(PinThreadLock); 637 | THREADID tid; 638 | tid = PIN_SpawnInternalThread(CheckTerminationEvent, NULL, 0, &InternalPINThreadUid); 639 | if(tid == INVALID_THREADID) { 640 | LOG("PIN_SpawnInternalThread failed.\n"); 641 | return -2; 642 | } 643 | PIN_AddPrepareForFiniFunction(TerminateInternalPINThreads, NULL); 644 | #endif /* TARGET_WINDOWS */ 645 | 646 | PIN_AddContextChangeFunction(context_change_cb, 0); 647 | 648 | TRACE_AddInstrumentFunction(trace_callback, NULL); 649 | 650 | // cleanup code 651 | PIN_AddFiniFunction(pin_finish, NULL); 652 | 653 | // never returns 654 | PIN_StartProgram(); 655 | 656 | return 0; 657 | } 658 | -------------------------------------------------------------------------------- /analyzer/coverage/makefile: -------------------------------------------------------------------------------- 1 | ############################################################## 2 | # 3 | # DO NOT EDIT THIS FILE! 4 | # 5 | ############################################################## 6 | 7 | # If the tool is built out of the kit, PIN_ROOT must be specified in the make invocation and point to the kit root. 8 | ifdef PIN_ROOT 9 | CONFIG_ROOT := $(PIN_ROOT)/source/tools/Config 10 | else 11 | CONFIG_ROOT := ../Config 12 | endif 13 | include $(CONFIG_ROOT)/makefile.config 14 | include makefile.rules 15 | include $(TOOLS_ROOT)/Config/makefile.default.rules 16 | 17 | ############################################################## 18 | # 19 | # DO NOT EDIT THIS FILE! 
20 | # 21 | ############################################################## 22 | -------------------------------------------------------------------------------- /analyzer/coverage/makefile.rules: -------------------------------------------------------------------------------- 1 | ############################################################## 2 | # 3 | # This file includes all the test targets as well as all the 4 | # non-default build rules and test recipes. 5 | # 6 | ############################################################## 7 | 8 | 9 | ############################################################## 10 | # 11 | # Test targets 12 | # 13 | ############################################################## 14 | 15 | ###### Place all generic definitions here ###### 16 | 17 | # This defines tests which run tools of the same name. This is simply for convenience to avoid 18 | # defining the test name twice (once in TOOL_ROOTS and again in TEST_ROOTS). 19 | # Tests defined here should not be defined in TOOL_ROOTS and TEST_ROOTS. 20 | TEST_TOOL_ROOTS := coverage 21 | 22 | # This defines the tests to be run that were not already defined in TEST_TOOL_ROOTS. 23 | TEST_ROOTS := 24 | 25 | # This defines the tools which will be run during the the tests, and were not already defined in 26 | # TEST_TOOL_ROOTS. 27 | TOOL_ROOTS := 28 | 29 | # This defines the static analysis tools which will be run during the the tests. They should not 30 | # be defined in TEST_TOOL_ROOTS. If a test with the same name exists, it should be defined in 31 | # TEST_ROOTS. 32 | # Note: Static analysis tools are in fact executables linked with the Pin Static Analysis Library. 33 | # This library provides a subset of the Pin APIs which allows the tool to perform static analysis 34 | # of an application or dll. Pin itself is not used when this tool runs. 35 | SA_TOOL_ROOTS := 36 | 37 | # This defines all the applications that will be run during the tests. 
38 | APP_ROOTS := 39 | 40 | # This defines any additional object files that need to be compiled. 41 | OBJECT_ROOTS := 42 | 43 | # This defines any additional dlls (shared objects), other than the pintools, that need to be compiled. 44 | DLL_ROOTS := 45 | 46 | # This defines any static libraries (archives), that need to be built. 47 | LIB_ROOTS := 48 | 49 | ###### Define the sanity subset ###### 50 | 51 | # This defines the list of tests that should run in sanity. It should include all the tests listed in 52 | # TEST_TOOL_ROOTS and TEST_ROOTS excluding only unstable tests. 53 | SANITY_SUBSET := $(TEST_TOOL_ROOTS) $(TEST_ROOTS) 54 | 55 | 56 | ############################################################## 57 | # 58 | # Test recipes 59 | # 60 | ############################################################## 61 | 62 | # This section contains recipes for tests other than the default. 63 | # See makefile.default.rules for the default test rules. 64 | # All tests in this section should adhere to the naming convention: .test 65 | 66 | 67 | ############################################################## 68 | # 69 | # Build rules 70 | # 71 | ############################################################## 72 | 73 | # This section contains the build rules for all binaries that have special build rules. 74 | # See makefile.default.rules for the default build rules. 75 | -------------------------------------------------------------------------------- /analyzer/pin.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | 3 | ''' 4 | pin.py provides a driver class for Intel's binary 5 | instrumentation framework PIN. The exposed API allows a 6 | user to trace and monitor the execution of any given binary 7 | executable image, including shared libraries. 
8 | ''' 9 | 10 | import platform 11 | import os 12 | import sys 13 | import time 14 | import shlex 15 | import signal 16 | import subprocess 17 | import threading 18 | import ctypes 19 | import random 20 | 21 | EVENT_ALL_ACCESS = 0x1F0003 22 | EVENT_MODIFY_STATE = 0x0002 23 | 24 | NMPWAIT_USE_DEFAULT_WAIT = 0x0 25 | NMPWAIT_WAIT_FOREVER = 0xFFFFFFFF 26 | 27 | class PinRunner(object): 28 | ''' 29 | Base PIN driver class. It provides a generic interface 30 | to PIN's functionality. This includes the execution of a 31 | binary under PIN's monitoring and the injection of any pintool 32 | in the process. 33 | ''' 34 | pintool = None 35 | cmd_template = None 36 | process = None 37 | timeout = None 38 | event_name = None 39 | timer = None 40 | 41 | def __init__(self, timeout=20): 42 | self.timeout = timeout 43 | if platform.system() == 'Linux': 44 | self.cmd_template = 'pin -t %s %s' 45 | elif platform.system() == 'Windows': 46 | self.cmd_template = 'pin.exe -t %s %s' 47 | 48 | 49 | def handler(self): 50 | ''' 51 | the signal/event alarm handler. This code is executed 52 | when the alarm goes off. 53 | ''' 54 | if self.process.poll() != None: 55 | return 56 | try: 57 | if platform.system() == 'Linux': 58 | self.process.send_signal(signal.SIGUSR2) 59 | elif platform.system() == 'Windows': 60 | event_object = ctypes.windll.kernel32.OpenEventA( 61 | EVENT_ALL_ACCESS, False, self.event_name) 62 | 63 | ctypes.windll.kernel32.SetEvent(event_object) 64 | except OSError, ex: 65 | print '[!] ERROR: ', ex 66 | 67 | def set_alarm(self, seconds=None): 68 | ''' 69 | sets a thread timer that calls the handler() function.
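The timer used here is the watchdog that enforces the run timeout. A minimal standalone sketch of the same `threading.Timer` re-arm pattern (the `Watchdog` class and its names are illustrative, not part of pin.py):

```python
import threading

class Watchdog(object):
    """Fires `handler` once `seconds` after (re)arming, unless rearmed first."""
    def __init__(self, handler):
        self.handler = handler
        self.timer = None

    def set_alarm(self, seconds):
        # cancel any pending alarm before arming a new one,
        # mirroring PinRunner.set_alarm()
        if self.timer is not None:
            self.timer.cancel()
        self.timer = threading.Timer(float(seconds), self.handler)
        self.timer.start()

fired = []
dog = Watchdog(lambda: fired.append(True))
dog.set_alarm(0.05)
dog.timer.join()      # wait for the alarm to go off
print(fired)          # [True]
```

Re-arming before the alarm fires cancels the pending timer, exactly what set_alarm() does before each new run.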
70 | ''' 71 | if seconds == None: 72 | return 73 | 74 | if self.timer != None: 75 | self.timer.cancel() 76 | 77 | self.timer = threading.Timer(float(seconds), self.handler) 78 | self.timer.start() 79 | 80 | def craft_command(self, pintool, arguments=''): 81 | ''' 82 | creates a command string based on the user settings, 83 | default values for the platform and the specified 84 | target binary. 85 | ''' 86 | if not pintool: 87 | raise ValueError('pintool not provided') 88 | if not os.path.exists(pintool): 89 | raise IOError( 90 | 'pintool %s does not exist' 91 | % pintool 92 | ) 93 | cmd = None 94 | if platform.system() == 'Linux': 95 | cmd = shlex.split( 96 | self.cmd_template % (pintool, arguments) 97 | ) 98 | elif platform.system() == 'Windows': 99 | cmd = self.cmd_template % (pintool, arguments) 100 | return cmd 101 | 102 | def run(self, pintool, *args): 103 | ''' 104 | runs the user-provided pintool with a user-provided 105 | argument list. 106 | ''' 107 | pintool_args = None 108 | if args and type(args) == tuple: 109 | pintool_args = ' '.join(args) 110 | else: 111 | pintool_args = '' 112 | # craft command string 113 | cmd = self.craft_command(pintool, pintool_args) 114 | # call pin with the crafted command 115 | 116 | try: 117 | self.set_alarm(self.timeout) 118 | 119 | with open(os.devnull, 'w') as nullfp: 120 | print 'Calling: %s' % cmd 121 | self.process = subprocess.Popen(cmd, 122 | stdout=nullfp, stderr=nullfp) 123 | 124 | ### self.process.wait() 125 | 126 | except subprocess.CalledProcessError: 127 | return 128 | 129 | class Coverage(PinRunner): 130 | ''' 131 | This is a specific PIN module that is optimized to be 132 | used with the coverage.so pintool that accompanies Choronzon. 133 | It makes use of OS provided pipes and events to collect 134 | the execution information as quickly as possible.
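On Linux, craft_command() fills the `pin -t <pintool> <args>` template and tokenizes it with shlex so that Popen receives an argument vector. A hedged illustration of the result (the pintool path and target command below are made up):

```python
import shlex

cmd_template = 'pin -t %s %s'                # Linux template from PinRunner
pintool = '/opt/choronzon/coverage.so'       # hypothetical pintool path
arguments = '-o output.dmp -wht "target" -- /usr/bin/target input.png'

# shlex.split honours the shell-style quoting around "target"
cmd = shlex.split(cmd_template % (pintool, arguments))
print(cmd[:3])   # ['pin', '-t', '/opt/choronzon/coverage.so']
```

On Windows the command is left as a single string, which Popen also accepts.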
135 | ''' 136 | def __init__(self, 137 | pintool='analyzer/coverage/obj-intel64/coverage.so', 138 | timeout=20 139 | ): 140 | self.pintool = os.path.abspath(pintool) 141 | super(Coverage, self).__init__(timeout) 142 | 143 | def _run(self, execmd, output, whitelist): 144 | ''' 145 | execmd is the command string to run 146 | the pintool on. 147 | ''' 148 | # if os.path.exists(output): 149 | # os.remove(output) 150 | 151 | quoted_whitelist = [] 152 | for image in whitelist: 153 | quoted_whitelist.append('\"%s\"' % os.path.basename(image)) 154 | 155 | # print '[+] Running pintool...' 156 | if platform.system() == 'Linux': 157 | return super(Coverage, self).run( 158 | self.pintool, 159 | '-o %s -wht %s -- %s' 160 | % (output, ' -wht '.join(quoted_whitelist), execmd) 161 | ) 162 | elif platform.system() == 'Windows': 163 | self.event_name = 'Global\\event%s' % str( 164 | random.randint(0, 0xFFFFFFFFFFFFFFFF) 165 | ) 166 | return super(Coverage, self).run( 167 | self.pintool, 168 | '-o %s -e %s -wht %s -- %s' 169 | % ( 170 | output, 171 | self.event_name, 172 | ' -wht '.join(quoted_whitelist), 173 | execmd 174 | ) 175 | ) 176 | 177 | def _pre_run(self, output): 178 | if platform.system() == 'Linux': 179 | os.mkfifo(output) 180 | 181 | def _post_run(self, output): 182 | if platform.system() == 'Windows': 183 | ready = 0 184 | while not ready: 185 | ready = ctypes.windll.kernel32.WaitNamedPipeA( 186 | output, 187 | NMPWAIT_USE_DEFAULT_WAIT 188 | ) 189 | 190 | def run(self, execmd, output='output.dmp', whitelist=[]): 191 | basename_output = output 192 | if platform.system() == 'Windows': 193 | output = '\\\\.\\pipe\\%s' % basename_output 194 | self._pre_run(output) 195 | self._run(execmd, basename_output, whitelist) 196 | self._post_run(output) 197 | 198 | return output 199 | 200 | -------------------------------------------------------------------------------- /blockcache.py: -------------------------------------------------------------------------------- 1 | ''' 2 |
blockcache.py provides a uniform API for Choronzon to 3 | store and correlate execution traces. It makes use of an 4 | external dependency/module called sortedcontainers. 5 | ''' 6 | 7 | import sortedcontainers as sc 8 | 9 | class BlockCache(object): 10 | ''' 11 | Basic block cache implementation. It is based on a 12 | sorted dictionary. 13 | ''' 14 | cache = None 15 | total = None 16 | 17 | 18 | def __init__(self): 19 | self.cache = sc.SortedDict() 20 | self.total = 0x0 21 | 22 | def yield_bbls(self): 23 | ''' 24 | returns a generator for all basic blocks inside 25 | the cache. 26 | ''' 27 | for start_ea, end_ea in self.cache.itervalues(): 28 | yield (start_ea, end_ea) 29 | 30 | def get_count(self): 31 | ''' 32 | Return the number of basic blocks in this image 33 | ''' 34 | return float(self.total) 35 | 36 | def add_bbl(self, key, value): 37 | ''' 38 | Add a BBL with start address `key'; `value' is the (startEA, endEA) tuple 39 | ''' 40 | if key == value[0]: 41 | self.total += 1 42 | self.cache[key] = value 43 | 44 | def get_bbl(self, bbl): 45 | ''' 46 | Get a BBL (tuple with (startEA, endEA)) 47 | ''' 48 | return self.cache[bbl] 49 | 50 | def is_cached(self, bbl): 51 | ''' 52 | aux: returns True if the bbl exists inside 53 | the block cache already. 54 | ''' 55 | return bbl in self.cache 56 | 57 | def get_cached(self, bbl): 58 | ''' 59 | if the bbl is cached then it retrieves the 60 | cached value; if it's not, it adds it 61 | to the cache and then returns the value. 62 | ''' 63 | if self.is_cached(bbl): 64 | return self.get_bbl(bbl) 65 | 66 | bindex = self.cache.bisect(bbl) 67 | bstart = self.cache.iloc[bindex] 68 | left, right = self.get_bbl(bstart) 69 | if left < bbl and right > bbl: 70 | self.add_bbl(bbl, (left, right)) 71 | return (left, right) 72 | else: 73 | return None 74 | 75 | @classmethod 76 | def parse_idmp(cls, idmp_iterable): 77 | ''' 78 | parse the output of the disassembler module.
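The dump consumed here is a plain-text format with `#IMAGE#`, `#FUNCTIONS#` and `#BBLS#` section markers; only the BBL section (comma-separated hex triplets) actually feeds the cache. A standalone sketch of the same sectioned parsing (the sample dump lines below are invented):

```python
def parse_bbls(lines):
    # mirrors the BlockCache.parse_idmp loop: track the current section
    # marker and collect (start, end) pairs from the #BBLS# section only
    mode, bbls = None, {}
    for line in lines:
        if '#' in line:
            mode = 'bbls' if '#BBLS#' in line else 'other'
        elif mode == 'bbls':
            start, end, _ = line.split(',')
            bbls[int(start, 16)] = (int(start, 16), int(end, 16))
    return bbls

dump = ['#IMAGE#', 'target.so', '#BBLS#',
        '400000,400010,0', '400010,400022,0']
print(parse_bbls(dump))
```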
79 | ''' 80 | mode = None 81 | cache = cls() 82 | 83 | for line in idmp_iterable: 84 | if '#' in line: 85 | if '#IMAGE#' in line: 86 | mode = 'image' 87 | elif '#FUNCTIONS#' in line: 88 | mode = 'functions' 89 | elif '#BBLS#' in line: 90 | mode = 'bbls' 91 | elif mode == 'image': 92 | pass 93 | elif mode == 'functions': 94 | pass 95 | elif mode == 'bbls': 96 | start, end, _ = line.split(',') 97 | start = int(start, 16) 98 | end = int(end, 16) 99 | cache.add_bbl(start, (start, end)) 100 | 101 | return cache 102 | -------------------------------------------------------------------------------- /campaign.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | 3 | import os 4 | import time 5 | import shutil 6 | import random 7 | import platform 8 | 9 | class Singleton(type): 10 | ''' 11 | Assign this class as the __metaclass__ member of a class and it will 12 | convert it to a singleton class. 13 | ''' 14 | def __call__(cls, *args, **kwargs): 15 | try: 16 | return cls.__instance 17 | except AttributeError: 18 | cls.__instance = super(Singleton, cls).__call__(*args, **kwargs) 19 | return cls.__instance 20 | 21 | 22 | class Campaign(object): 23 | ''' 24 | A singleton class for managing files and directories in a campaign.
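Because of the Singleton metaclass, every later `Campaign(...)` call returns the first instance. A minimal demonstration of the same caching logic, written with Python 3 metaclass syntax (the project itself uses the Python 2 `__metaclass__` attribute, and `Config` is a hypothetical example class):

```python
class Singleton(type):
    # same logic as campaign.Singleton: cache and reuse the first instance
    def __call__(cls, *args, **kwargs):
        try:
            return cls.__instance
        except AttributeError:
            cls.__instance = super(Singleton, cls).__call__(*args, **kwargs)
            return cls.__instance

class Config(metaclass=Singleton):       # hypothetical example class
    def __init__(self, value=None):
        self.value = value

a = Config(value=1)
b = Config(value=2)    # constructor args of later calls are ignored
print(a is b)          # True
print(a.value)         # 1
```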
25 | ''' 26 | __metaclass__ = Singleton 27 | campaign_id = None 28 | work_dir = None 29 | temp_dir = None 30 | local_dir = None 31 | shared_dir = None 32 | campaign_dir = None 33 | files = None 34 | shared_files = None 35 | chromo_files = None 36 | 37 | def __init__(self, campaign_id=None, work_dir='.'): 38 | if self.campaign_id == None: 39 | self.files = [] 40 | self.shared_files = [] 41 | self.chromo_files = {} 42 | self.work_dir = self.__checkfilename(work_dir) 43 | self.new_campaign(campaign_id) 44 | 45 | def log(self, msg): 46 | self.logfp.write('%s\n' % msg) 47 | self.logfp.flush() 48 | 49 | def copy_directory(self, input_path, name=None): 50 | ''' 51 | Takes as input the path of the directory that holds the initial 52 | seed files and copies its contents into a directory (by default 53 | named after the input directory) inside the campaign directory 54 | ''' 55 | if not name or name == None: 56 | name = os.path.basename(input_path) 57 | 58 | path = self.create_directory(name) 59 | for filename in os.listdir(input_path): 60 | name = os.path.join(input_path, filename) 61 | with open(name, 'rb') as fin: 62 | with open(os.path.join(path, filename), 'wb') as fout: 63 | fout.write(fin.read()) 64 | return path 65 | 66 | def __checkfilename(self, directory): 67 | ''' 68 | aux: raises an exception if the path does not exist, otherwise 69 | returns the absolute filepath. 70 | ''' 71 | if not os.path.exists(directory): 72 | raise IOError( 73 | 'File "%s" does not exist' 74 | % directory 75 | ) 76 | return os.path.abspath(directory) 77 | 78 | def new_id(self): 79 | ''' 80 | crafts a unique id for a campaign by combining a timestamp 81 | and a random number. 82 | ''' 83 | newid = random.random().as_integer_ratio()[0] 84 | newid += time.time().as_integer_ratio()[1] 85 | return 'campaign-%d' % newid 86 | 87 | def new_campaign(self, campaign_id=None): 88 | ''' 89 | creates a new directory for the new campaign. The name of the 90 | directory is the new campaign id.
91 | ''' 92 | if campaign_id and campaign_id != None: 93 | self.campaign_id = campaign_id 94 | else: 95 | self.campaign_id = self.new_id() 96 | 97 | self.campaign_dir = os.path.join(self.work_dir, self.campaign_id) 98 | self.temp_dir = self.create_directory('.tmp') 99 | self.local_dir = self.create_directory('.local') 100 | self.chromo_dir = self.create_directory('.chromo') 101 | self.logfp = open(os.path.join(self.campaign_dir, 'log.txt'), 'a') 102 | self.log('Log opened for writing at %s' % time.ctime()) 103 | 104 | def get_chromosome(self, uid): 105 | ''' 106 | returns the full path of the chromosome file inside the 107 | campaign directory. 108 | ''' 109 | if uid not in self.chromo_files: 110 | raise KeyError('Could not find chromosome: %s' % uid) 111 | 112 | return self.chromo_files[uid] 113 | 114 | def add_chromosome(self, uid, data): 115 | ''' 116 | inserts a chromosome in the chromosome directory and it 117 | updates the path to the file. 118 | ''' 119 | path = os.path.join(self.chromo_dir, '%s' % uid) 120 | path = os.path.abspath(path) 121 | with open(path, 'wb') as fout: 122 | fout.write(data) 123 | self.chromo_files[uid] = path 124 | return path 125 | 126 | def delete_chromosome(self, uid): 127 | ''' 128 | removes the given uid from the dictionary of the chromo 129 | files as well as the file it points to. 130 | ''' 131 | if uid not in self.chromo_files: 132 | raise KeyError('Could not find chromosome: %s' % uid) 133 | 134 | os.remove(self.chromo_files[uid]) 135 | del self.chromo_files[uid] 136 | 137 | def cleanup(self): 138 | ''' 139 | deletes the campaign directory. 140 | ''' 141 | shutil.rmtree(self.campaign_dir) 142 | 143 | def copy_to_campaign(self, filename): 144 | ''' 145 | aux: copies the filename to the campaign_dir directory. 
146 | ''' 147 | name = os.path.basename(filename) 148 | with open(filename, 'rb') as fin: 149 | newpath = os.path.join( 150 | self.campaign_dir, 151 | name 152 | ) 153 | with open(newpath, 'wb') as fout: 154 | fout.write(fin.read()) 155 | 156 | def create_shared_directory(self, abspath): 157 | ''' 158 | Sets the path of the shared directory that will be used for 159 | communicating with other Choronzon instances. If the directory 160 | does not exist, it will create it. 161 | ''' 162 | self.shared_files = [] 163 | if not os.path.exists(abspath): 164 | os.makedirs(abspath) 165 | self.shared_dir = abspath 166 | return abspath 167 | 168 | def create_directory(self, path): 169 | ''' 170 | Creates a directory (if it does not already exist) 171 | and returns its path. 172 | ''' 173 | path = os.path.join(self.campaign_dir, path) 174 | if not os.path.exists(path): 175 | os.makedirs(path) 176 | return path 177 | 178 | def delete_pipe(self, pipe_name): 179 | ''' 180 | Deletes a named pipe. 181 | ''' 182 | if platform.system() == 'Linux': 183 | try: 184 | os.unlink(pipe_name) 185 | except OSError as oserr: 186 | print '[!] ERROR: Could not delete pipe:', oserr 187 | 188 | elif platform.system() == 'Windows': 189 | # Windows deletes the pipe automatically when all handles to it 190 | # are closed. So there's nothing to delete. 191 | pass 192 | 193 | def create_pipe(self, seedid): 194 | ''' 195 | This function crafts a valid name for a named pipe depending on the 196 | underlying operating system. 197 | ''' 198 | name = None 199 | if platform.system() == 'Linux': 200 | pipe_absname = os.path.join(self.temp_dir, 'pipe%s' % seedid) 201 | while os.path.exists(pipe_absname): 202 | pipe_absname = os.path.join(self.temp_dir, 203 | 'pipe%s' % str(random.randint(0, 0xFFFFFFFF))) 204 | name = pipe_absname 205 | elif platform.system() == 'Windows': 206 | # name = '\\\\.\\pipe%s' % seedid 207 | name = 'pipe%s' % seedid 208 | # should check here, if the pipe already exists.
209 | # however, Windows automatically deletes the pipe if there isn't 210 | # any open handle to it, so theoretically there's no problem here. 211 | return name 212 | 213 | def create(self, filename, data=None): 214 | ''' 215 | create a new file inside the temporary directory. 216 | if data is defined, write the data into the new file. 217 | ''' 218 | if filename in self.files: 219 | raise ValueError( 220 | 'File "%s" already exists.' 221 | % filename 222 | ) 223 | name = os.path.basename(filename) 224 | self.files.append(name) 225 | filepath = self.get(name) 226 | if data != None: 227 | with open( 228 | filepath, 229 | 'wb' 230 | ) as fout: 231 | fout.write(data) 232 | 233 | return filepath 234 | 235 | def copy_from_shared(self, filename): 236 | ''' 237 | Copies a pickled file (usually a pickled chromosome object) from the 238 | shared directory to a local directory inside the campaign. Notice 239 | that, if a file with the same name was copied previously, then it 240 | does nothing. 241 | ''' 242 | if not self.already_processed(filename): 243 | self.shared_files.append(filename) 244 | with open(os.path.join(self.shared_dir, filename), 'rb') as fin: 245 | with open(os.path.join(self.local_dir, filename), 'wb') as fout: 246 | fout.write(fin.read()) 247 | return os.path.join(self.local_dir, filename) 248 | 249 | def already_processed(self, filename): 250 | ''' 251 | Returns True if this fuzzer has already processed the specific 252 | chromosome pointed to by `filename'. Processed means that either 253 | this instance of the fuzzer dumped the chromosome in the shared 254 | directory or the chromosome was imported from the shared directory. 255 | ''' 256 | return filename in self.shared_files 257 | 258 | def dump_to_shared(self, filename, bytestring): 259 | ''' 260 | Dumps a bytestring into the shared directory and into a local 261 | directory.
262 | ''' 263 | if filename not in self.shared_files: 264 | self.shared_files.append(filename) 265 | path = os.path.join(self.shared_dir, filename) 266 | localpath = os.path.join(self.local_dir, filename) 267 | with open(path, 'wb') as fout: 268 | fout.write(bytestring) 269 | with open(localpath, 'wb') as fout: 270 | fout.write(bytestring) 271 | 272 | def get(self, filename): 273 | ''' 274 | retrieve a file already add()ed in the campaign. 275 | ''' 276 | if filename not in self.files: 277 | raise IndexError( 278 | 'File "%s" was not found' 279 | % filename 280 | ) 281 | return os.path.join( 282 | self.campaign_dir, 283 | filename 284 | ) 285 | 286 | def add_to(self, out, inp): 287 | ''' 288 | Copies the `inp' file into the `out' directory. 289 | ''' 290 | outfull = os.path.join(out, os.path.basename(inp)) 291 | with open(inp, 'rb') as fin: 292 | with open(outfull, 'wb') as fout: 293 | fout.write(fin.read()) 294 | return outfull 295 | 296 | def add(self, filename): 297 | ''' 298 | copy a file inside the campaign directory. 299 | ''' 300 | name = os.path.basename(filename) 301 | if name not in self.files: 302 | self.copy_to_campaign(filename) 303 | self.files.append(name) 304 | return self.get(name) 305 | 306 | -------------------------------------------------------------------------------- /choronzon.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | import sys 3 | import os 4 | import argparse 5 | 6 | import world 7 | import chromosome 8 | import fuzzers.strategy as strategy 9 | import configuration 10 | import campaign 11 | import tracer 12 | import evaluator 13 | 14 | 15 | class Choronzon(object): 16 | ''' 17 | https://en.wikipedia.org/wiki/Choronzon 18 | ''' 19 | configuration = None 20 | campaign = None 21 | population = None 22 | strategy = None 23 | tracer = None 24 | evaluator = None 25 | 26 | def __init__(self, configfile=None): 27 | ''' 28 | Initialization method of Choronzon.
Reads the configuration, 29 | instantiates objects of the vital classes, builds and 30 | analyzes the first generation of chromosomes by reading 31 | the initial population provided to the fuzzer. 32 | ''' 33 | # configuration is a singleton 34 | self.configuration = configuration.Configuration(configfile) 35 | self.campaign = campaign.Campaign(self.configuration['CampaignName']) 36 | 37 | seedpath = self.campaign.copy_directory( 38 | self.configuration['InitialPopulation'], name='seedfiles') 39 | 40 | self.tracer = tracer.Tracer() 41 | self.strategy = strategy.FuzzingStrategy() 42 | self.population = world.Population(self.tracer.cache) 43 | self.evaluator = evaluator.Evaluator(self.tracer.cache) 44 | 45 | try: 46 | self.sharedpath = self.campaign.create_shared_directory( 47 | self.configuration['ChromosomeShared']) 48 | except: 49 | self.sharedpath = None 50 | 51 | # Initialize factory for building chromosome 52 | # and the proxy for computing the fitness. 53 | chromosomes = chromosome.Factory.build(seedpath) 54 | for chromo in chromosomes: 55 | self.population.add_chromosome(chromo) 56 | 57 | self.analyze() 58 | 59 | def _grab_from_shared(self): 60 | ''' 61 | This function looks for files in the shared directory. If a 62 | file has not already been processed, it uses it to build a new 63 | Chromosome and import it into the current population.
64 | ''' 65 | listed_files = os.listdir(self.sharedpath) 66 | 67 | for curr in listed_files: 68 | if not self.campaign.already_processed(curr): 69 | abspath = self.campaign.copy_from_shared(curr) 70 | # build an empty chromosome, which will be filled with the 71 | # contents of the file from the shared directory 72 | new_chromo = chromosome.Factory.build_empty() 73 | new_chromo.load_chromosome(abspath) 74 | self.population.add_chromosome(new_chromo) 75 | # this is for updating the generation trace 76 | self.population.add_trace(new_chromo.uid, new_chromo.trace) 77 | 78 | def fuzz(self): 79 | ''' 80 | Each time the fuzz method is called, a new epoch begins. 81 | The function picks random couples of chromosomes from the 82 | population and applies recombination and mutation algorithms 83 | to them. Finally, the new (fuzzed) chromosomes are imported 84 | into the new generation. 85 | ''' 86 | self.population.new_epoch() 87 | self.campaign.log('Fuzzing of chromosomes has begun.') 88 | 89 | # This is to keep the family tree of the chromosomes 90 | for male, female in self.population.get_couple_from_previous(True): 91 | # assume that the UID is colliding with another chromosome's UID 92 | uid_collision = True 93 | 94 | maleclone = male.clone() 95 | femaleclone = female.clone() 96 | 97 | # assign new UIDs to the new chromosomes until they are unique 98 | while uid_collision: 99 | if self.population.does_exist(maleclone.uid) == False and \ 100 | self.population.does_exist(femaleclone.uid) == False: 101 | uid_collision = False 102 | else: 103 | maleclone.new_uid() 104 | femaleclone.new_uid() 105 | 106 | son, daughter = self.strategy.recombine(maleclone, femaleclone) 107 | 108 | self.population.add_chromosome(son) 109 | self.population.add_chromosome(daughter) 110 | 111 | self.campaign.log('The stage of fuzzing is finished') 112 | 113 | if 'KeepGenerations' in self.configuration and \ 114 | self.configuration['KeepGenerations']: 115 | gpath =
self.campaign.create_directory( 116 | '%s' % self.population.epoch 117 | ) 118 | for chromo in self.population.get_all_from_current(): 119 | path = os.path.join( 120 | '%s' % gpath, 121 | '%s' % chromo.uid 122 | ) 123 | with open(path, 'wb') as fout: 124 | fout.write(chromo.serialize()) 125 | 126 | def evaluate_fuzzers(self): 127 | ''' 128 | assigns credits to the combinations of mutators and recombinators 129 | that have generated elite chromosomes (they have survived through 130 | the elitism step to the next generation). For every such chromosome, 131 | the pair is more likely to be chosen based on the lottery selector. 132 | ''' 133 | 134 | involved = {} 135 | 136 | # for every derived chromosome (it has a fuzzer) assign it a fuzzer 137 | # score of 0. This step only considers chromosomes that have been 138 | # involved in this fuzzing cycle. 139 | for chromo in self.population.get_all_from_previous(): 140 | if chromo.fuzzer == None: 141 | continue 142 | if chromo.fuzzer not in involved: 143 | involved[chromo.fuzzer] = 0 144 | 145 | # for every chromosome that is involved (has been generated in this 146 | # step) and has survived, increase its fuzzer's score by one 147 | for chromo in self.population.get_all_from_current(): 148 | if chromo.fuzzer == None: 149 | continue 150 | if chromo.fuzzer not in involved: 151 | involved[chromo.fuzzer] = 0 152 | else: 153 | involved[chromo.fuzzer] += 1 154 | 155 | # update the strategy instance with the new fuzzer scores, so that 156 | # the mutator/combinator pair is more likely to be chosen again 157 | for fuzzer, score in involved.iteritems(): 158 | if score > 0: 159 | self.strategy.good(fuzzer, score) 160 | 161 | # not sure if negative feedback will help, so ignore 162 | # for now 163 | #else: 164 | # self.strategy.bad(fuzzer) 165 | 166 | def analyze(self): 167 | ''' 168 | Analyze the corpus of the current generation, by instrumenting the 169 | execution using the Tracer class. 
170 | ''' 171 | 172 | self.campaign.log('Analysis of chromosomes.') 173 | self.campaign.log('Current generation has %d chromosomes.' % ( 174 | len(self.population.current)) 175 | ) 176 | crashed_uids = [] 177 | 178 | for chromo in self.population.get_all_from_current(): 179 | newfile = self.campaign.create( 180 | '%s' % chromo.uid, 181 | chromo.serialize() 182 | ) 183 | self.campaign.log('Analyzing %s' % chromo.uid) 184 | trace = self.tracer.analyze('%s' % chromo.uid) 185 | self.campaign.log('Analysis of %s finished' % chromo.uid) 186 | 187 | # if the fuzzed file triggered a bug (yay!!), remove it from the 188 | # population, since it may trigger the same bug again and again 189 | if trace.has_crashed: 190 | crash_dir = self.campaign.create_directory('crashes') 191 | path = os.path.join(crash_dir, '%s' % chromo.uid) 192 | with open(path, 'wb') as fout: 193 | fout.write(chromo.serialize()) 194 | self.campaign.log('CRASH! :)') 195 | self.campaign.log('The trigger file is saved at %s.' % path) 196 | crashed_uids.append(chromo.uid) 197 | else: 198 | self.population.add_trace(chromo.uid, trace) 199 | try: 200 | os.unlink(newfile) 201 | except: 202 | # Sometimes newfile is still used by the OS and unlink 203 | # raises an exception. We just ignore this. 204 | pass 205 | 206 | # Removal of chromosomes must be done after iterating over the current 207 | # generation. Otherwise, Python will raise a RuntimeError exception. 208 | for uid in crashed_uids: 209 | self.population.delete_chromosome(uid) 210 | 211 | if self.sharedpath != None: 212 | self._grab_from_shared() 213 | 214 | self.campaign.log('Evaluation stage of chromosomes has begun') 215 | self.evaluator.evaluate(self.population) 216 | 217 | # new epoch will be created in elitism function 218 | self.population.elitism() 219 | self.campaign.log('Elite generation contains %d chromosomes.'
% ( 220 | len(self.population.current) 221 | )) 222 | 223 | if len(self.population.current) < 2: 224 | raise ValueError('Elitism resulted in just one chromosome. '\ 225 | 'Usually this is due to a bad initial corpus (limited '\ 226 | 'seedfiles provided or identical seedfiles) or there is a '\ 227 | 'problem with the instrumented binary. For example, '\ 228 | 'maybe the basic blocks that are visited in every run are '\ 229 | 'the same.') 230 | self.evaluate_fuzzers() 231 | 232 | # if there are multiple instances of choronzon, dump the chromosomes 233 | # from the elite generation to the shared directory. 234 | if self.sharedpath != None: 235 | for chromo in self.population.get_all_from_current(): 236 | filename = str(chromo.uid) 237 | if not self.campaign.already_processed(filename): 238 | self.campaign.dump_to_shared(filename, 239 | chromo.dumps_chromosome()) 240 | 241 | elite_dir = self.campaign.create_directory('%s' % self.population.epoch) 242 | 243 | if 'KeepGenerations' in self.configuration \ 244 | and self.configuration['KeepGenerations']: 245 | for chromo in self.population.get_all_from_current(): 246 | path = os.path.join(elite_dir, '%s' % chromo.uid) 247 | with open(path, 'wb') as fout: 248 | fout.write(chromo.serialize()) 249 | 250 | def start(self): 251 | while True: 252 | self.fuzz() 253 | self.analyze() 254 | 255 | def stop(self): 256 | print '[+] Bye! :)' 257 | 258 | def main(args): 259 | print '[+] Choronzon fuzzer v.0.1' 260 | print '[+] starting campaign...'
261 | 262 | choronzon = Choronzon(args.config) 263 | 264 | try: 265 | choronzon.start() 266 | except KeyboardInterrupt: 267 | choronzon.stop() 268 | 269 | return 0 270 | 271 | if __name__ == '__main__': 272 | parser = argparse.ArgumentParser( 273 | description='Choronzon v0.1\nAn evolutionary knowledge-based fuzzer' 274 | ) 275 | parser.add_argument( 276 | 'config', 277 | help='/path/to/config/file.py' 278 | ) 279 | arguments = parser.parse_args() 280 | sys.exit(main(arguments)) 281 | -------------------------------------------------------------------------------- /chromosome/__init__.py: -------------------------------------------------------------------------------- 1 | # Each parser class MUST implement the following methods: 2 | # parse: which will parse the given file 3 | # get_genes: a generator that yields all genes of the file 4 | # get_filter_manager: a filter_manager for the chromosome 5 | # Also an important thing about the parsers is that they should 6 | # initialize the filter for each gene. 7 | # 8 | # Each gene class must implement the following methods: 9 | 10 | from factory import * 11 | from chromosome import * 12 | -------------------------------------------------------------------------------- /chromosome/chromosome.py: -------------------------------------------------------------------------------- 1 | ''' 2 | The Chromosome module. 3 | ''' 4 | import os 5 | import copy 6 | import random 7 | import cPickle 8 | 9 | class Chromosome(object): 10 | ''' 11 | The Chromosome class represents a deserialized file. 12 | 13 | Each chromosome class contains various information about the file. 14 | For example, the fitness and the metrics, which indicate how "favorable" 15 | the file is for the fuzzing process. 16 | 17 | The basic structures of the file format are defined by genes. Genes form 18 | a tree-like structure, in which the children nodes are usually 19 | sub-structures of their parents.
To get a better understanding of this, 20 | in the XML format you may think of the outer tag as a top-level 21 | gene and the inner tag as a child gene. 22 | 23 | Notice that each chromosome contains only the top level genes of a file. 24 | The relationship between the genes is not stored in Chromosome class. 25 | 26 | Additionally, a Chromosome is able to parse this tree of genes and 27 | serialize them in a file using the serializer given as argument to 28 | Chromosome's constructor. 29 | ''' 30 | trace = None 31 | genes = None 32 | serializer = None 33 | deserializer = None 34 | metrics = None 35 | fitness = None 36 | fuzzer = None 37 | uid = None 38 | processed = None 39 | 40 | def __init__(self, serializer=None, deserializer=None): 41 | self.genes = list() 42 | self.serializer = serializer() 43 | self.deserializer = deserializer() 44 | self.metrics = {} 45 | self.fitness = 0.0 46 | self.uid = self.new_uid() 47 | self.processed = False 48 | 49 | def __len__(self): 50 | return len(self.genes) 51 | 52 | def __str__(self): 53 | return self.serialize() 54 | 55 | def new_uid(self): 56 | ''' 57 | Assign a new random UID to the chromosome. 58 | ''' 59 | self.uid = random.randint(0, 0xFFFFFFFFFFFFFFFF) 60 | return self.uid 61 | 62 | def clone(self): 63 | ''' 64 | Clone the chromosome object, but assign a new unique identifier 65 | to the new chromosome. 66 | ''' 67 | newchr = copy.deepcopy(self) 68 | newchr.new_uid() 69 | return newchr 70 | 71 | def set_metrics(self, met): 72 | ''' 73 | Set the metrics to the chromosome. 74 | ''' 75 | self.metrics = met 76 | 77 | def get_metrics(self): 78 | ''' 79 | Get the metrics from the chromosome. 80 | ''' 81 | return self.metrics 82 | 83 | def set_fitness(self, fit): 84 | ''' 85 | Sets the fitness to the current chromosome. 86 | ''' 87 | self.fitness = fit 88 | 89 | def get_fitness(self): 90 | ''' 91 | Gets the fitness from the current chromosome.
92 | ''' 93 | return self.fitness 94 | 95 | def get_genes(self): 96 | ''' 97 | Returns only the root nodes of the genes in the current chromosome. 98 | ''' 99 | return self.genes 100 | 101 | def get_all_genes(self): 102 | ''' 103 | Returns all genes of the current chromosome. 104 | ''' 105 | return self._get_all_descendants(self.get_genes()) 106 | 107 | def _get_all_descendants(self, genes): 108 | ''' 109 | Returns a list with all descendant genes (including the input genes) 110 | ''' 111 | descendants = [] 112 | 113 | for parent_gene in genes: 114 | # add itself in the list 115 | descendants.append(parent_gene) 116 | # for every child of the current gene 117 | for child in parent_gene.get_children(): 118 | # if the child of the current gene contains children, 119 | # call _get_all_descendants recursively 120 | if child.children_number() > 0: 121 | descendants.extend( 122 | self._get_all_descendants( 123 | child.get_children() 124 | ) 125 | ) 126 | else: 127 | descendants.append(child) 128 | 129 | return descendants 130 | 131 | def _internal_find_parent(self, root, target): 132 | ''' 133 | A recursive depth-first search algorithm. Given a root node 134 | it returns the parent gene of the target. Returns None if the 135 | parent could not be found. 136 | ''' 137 | parent = None 138 | 139 | if target in root.get_children(): 140 | return root 141 | 142 | for child in root.get_children(): 143 | parent = self._internal_find_parent(child, target) 144 | if parent != None: 145 | break 146 | 147 | return parent 148 | 149 | def find_parent(self, child): 150 | ''' 151 | Finds and returns the parent of the given gene. If the gene is 152 | a root level gene, it returns None. On the other hand, if 153 | the gene does not belong to this chromosome, it raises 154 | a ValueError exception.
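The parent lookup here is a recursive walk over each root gene's child tree. A self-contained sketch of the same search with a stub node type (the real Gene class lives in chromosome/gene.py; this `Node` stub only models get_children()):

```python
class Node(object):
    # stub standing in for a Gene; only the child list matters here
    def __init__(self, name, children=None):
        self.name = name
        self.children = children or []

    def get_children(self):
        return self.children

def find_parent(root, target):
    # mirrors Chromosome._internal_find_parent: return the node whose
    # child list contains `target`, searching depth-first
    if target in root.get_children():
        return root
    for child in root.get_children():
        parent = find_parent(child, target)
        if parent is not None:
            return parent
    return None

leaf = Node('leaf')
mid = Node('mid', [leaf])
root = Node('root', [mid])
print(find_parent(root, leaf).name)   # mid
```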
155 | ''' 156 | if child in self.genes: 157 | return None 158 | 159 | parent = None 160 | 161 | for gene in self.genes: 162 | parent = self._internal_find_parent(gene, child) 163 | if parent != None: 164 | break 165 | 166 | if parent != None: 167 | return parent 168 | else: 169 | raise ValueError('Unable to find parent of gene.') 170 | 171 | def replace_gene(self, target, new): 172 | ''' 173 | Replaces the target gene with new. Returns the replaced gene. 174 | ''' 175 | old = None 176 | 177 | if target in self.genes: 178 | index = self.genes.index(target) 179 | old = self.genes[index] 180 | self.genes[index] = new 181 | else: 182 | parent = self.find_parent(target) 183 | old = parent.replace_child(target, new) 184 | return old 185 | 186 | def remove_gene(self, target): 187 | ''' 188 | Removes a Gene from the chromosome. If the Gene does not exist, 189 | it raises a ValueError exception. 190 | ''' 191 | parent = self.find_parent(target) 192 | if parent != None: 193 | parent.remove_child(target) 194 | else: 195 | self.genes.remove(target) 196 | 197 | def add_gene(self, gene): 198 | ''' 199 | Appends a top level gene in the chromosome. 200 | ''' 201 | self.genes.append(gene) 202 | 203 | def deserialize(self, filepath): 204 | ''' 205 | Reads in a file and generates a list of genes. It uses a 206 | user-defined deserializer. 207 | ''' 208 | self.genes = self.deserializer.deserialize(filepath) 209 | 210 | def serialize(self): 211 | ''' 212 | Returns a bytestring that is the used as input to the target 213 | application. It uses a user-defined serializer. 214 | ''' 215 | return self.serializer.serialize(self.genes) 216 | 217 | def dumps_chromosome(self, protocol=-1): 218 | ''' 219 | It returns pickled bytestring containing the important attributes 220 | of the chromosome that are needed in order to write the chromosome 221 | in a file and restore it later or restore it from another Choronzon 222 | instance. 
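The dump/load pair described above boils down to pickling a fixed list of attributes and unpacking it again. A Python 3 sketch of the same round trip (Choronzon itself is Python 2 and uses `cPickle`; the field names here mirror the ones in `dumps_chromosome` but the dict wrapper is illustrative):

```python
import pickle

state = {'genes': ['a', 'b'], 'metrics': {'cov': 0.5}, 'uid': 42, 'trace': None}

def dumps_state(state, protocol=pickle.HIGHEST_PROTOCOL):
    # pickle only the fields needed to resurrect the object later
    important = [state['genes'], state['metrics'], state['uid'], state['trace']]
    return pickle.dumps(important, protocol)

def loads_state(blob):
    # unpack in the same fixed order the dump used
    genes, metrics, uid, trace = pickle.loads(blob)
    return {'genes': genes, 'metrics': metrics, 'uid': uid, 'trace': trace}

restored = loads_state(dumps_state(state))
```

Because the attributes are stored as a positional list, both sides must agree on the order; adding a field means bumping both the dump and the load in lockstep.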
223 | ''' 224 | important = [self.genes, self.metrics, self.uid, self.trace] 225 | return cPickle.dumps(important, protocol) 226 | 227 | def dump_chromosome(self, path, protocol=-1): 228 | ''' 229 | Dumps the pickled bytestring into a file, indicated by path. 230 | ''' 231 | if not os.path.exists(path): 232 | raise IOError('Could not find path: %s' % path) 233 | 234 | with open(path, 'wb') as fout: 235 | fout.write(self.dumps_chromosome(protocol)) 236 | 237 | def loads_chromosome(self, data): 238 | ''' 239 | Restores a chromosome from a pickled string. 240 | ''' 241 | self.genes, self.metrics, self.uid, self.trace = cPickle.loads(data) 242 | 243 | def load_chromosome(self, path): 244 | ''' 245 | Restores a chromosome from a pickled file. 246 | ''' 247 | if not os.path.exists(path): 248 | raise IOError('Could not find path: %s' % path) 249 | 250 | with open(path, 'rb') as fin: 251 | self.genes, self.metrics, self.uid, self.trace = cPickle.load(fin) 252 | -------------------------------------------------------------------------------- /chromosome/deserializer.py: -------------------------------------------------------------------------------- 1 | class BaseDeserializer(object): 2 | def __init__(self): 3 | pass 4 | 5 | def deserialize(self, filepath): 6 | genes = list() 7 | return genes 8 | -------------------------------------------------------------------------------- /chromosome/factory.py: -------------------------------------------------------------------------------- 1 | import os 2 | import importlib 3 | import string 4 | import chromosome 5 | import configuration 6 | import campaign 7 | 8 | class Factory(object): 9 | ''' 10 | Factory class generates chromosomes using user-defined 11 | parsers specified in the configuration. 
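The factory's parser lookup that follows is the standard `importlib` + `getattr` idiom: a configuration string names both the module and (by suffix convention) the classes inside it. A self-contained sketch of the idiom, using the stdlib `json` module in place of a Choronzon parser module:

```python
import importlib

def load_class(module_name, class_name):
    # import the module by its dotted name, then fetch the class by
    # name, mirroring Factory._load_parser()'s '%sSerializer' lookup
    module = importlib.import_module(module_name)
    return getattr(module, class_name)

# stand-in for load_class('chromosome.parsers.PNG', 'PNGSerializer')
decoder_cls = load_class('json', 'JSONDecoder')
decoder = decoder_cls()
```

If the configured name does not resolve, `import_module` raises `ImportError` and `getattr` raises `AttributeError`, so a bad `Parser` setting fails loudly at startup rather than mid-campaign.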
12 | ''' 13 | 14 | directory = None 15 | configuration = None 16 | serializer = None 17 | deserializer = None 18 | 19 | def __init__(self, seeddir): 20 | self.configuration = configuration.Configuration() 21 | self.campaign = campaign.Campaign() 22 | if seeddir is not None: 23 | self.directory = self._check_path(seeddir) 24 | self._load_parser() 25 | 26 | 27 | def _check_path(self, path): 28 | ''' 29 | aux: checks that a file or directory exists and 30 | returns the absolute path. 31 | ''' 32 | path = os.path.abspath(path) 33 | if not os.path.exists(path): 34 | raise IOError('Seed directory "%s" does not exist' % path) 35 | return path 36 | 37 | def _load_parser(self): 38 | ''' 39 | attempt to load the module and the appropriate classes 40 | according to the "Parser" configuration setting. 41 | ''' 42 | if self.configuration is None: 43 | raise configuration.ConfigurationError( 44 | 'Configuration is not loaded' 45 | ) 46 | self.parser = importlib.import_module( 47 | 'chromosome.parsers.%s' % self.configuration['Parser'] 48 | ) 49 | self.serializer = getattr( 50 | self.parser, 51 | '%sSerializer' % self.configuration['Parser'] 52 | ) 53 | self.deserializer = getattr( 54 | self.parser, 55 | '%sDeserializer' % self.configuration['Parser'] 56 | ) 57 | 58 | def _generate_chromosome(self, fname): 59 | ''' 60 | Parses the given file, appends the genes that were 61 | found to a new chromosome and returns it. 62 | ''' 63 | chromo = chromosome.Chromosome( 64 | serializer=self.serializer, 65 | deserializer=self.deserializer 66 | ) 67 | if fname is not None: 68 | print '[!] Parsing: %s (%s)' % (fname, chromo.uid) 69 | self.campaign.log('Parsing: %s (%s)' % (fname, chromo.uid)) 70 | chromo.deserialize(fname) 71 | return chromo 72 | 73 | def generate(self): 74 | ''' 75 | A generator that yields one chromosome 76 | for each file in the chosen path.
77 | ''' 78 | for root, _, files in os.walk(self.directory): 79 | for fname in files: 80 | yield self._generate_chromosome( 81 | os.path.join(root, fname) 82 | ) 83 | 84 | @classmethod 85 | def build_empty(klass): 86 | obj = klass(None) 87 | return obj._generate_chromosome(None) 88 | 89 | @classmethod 90 | def build(cls, seed_dir): 91 | ''' 92 | aux: automate the process of parsing seed files. 93 | ''' 94 | o = cls(seed_dir) 95 | return o.generate() 96 | -------------------------------------------------------------------------------- /chromosome/gene.py: -------------------------------------------------------------------------------- 1 | class AbstractGene(object): 2 | data = None 3 | children = None 4 | 5 | def __init__(self): 6 | self.data = '' 7 | self.children = [] 8 | 9 | def get_data(self): 10 | ''' 11 | Returns the fuzzable data of the gene. 12 | ''' 13 | return self.data 14 | 15 | def set_data(self, data): 16 | ''' 17 | sets the data of this gene 18 | ''' 19 | self.data = data 20 | 21 | def add_children(self, new): 22 | self.children.extend(new) 23 | 24 | def add_child(self, child, index=None): 25 | ''' 26 | Adds a new child gene to the current gene. 27 | ''' 28 | if index is None: 29 | self.children.append(child) 30 | else: 31 | self.children.insert(index, child) 32 | 33 | def get_children(self): 34 | ''' 35 | Returns all the gene's children. If the current gene 36 | has no children, then it returns an empty list. 37 | ''' 38 | return self.children 39 | 40 | def remove_child(self, target): 41 | self.children.remove(target) 42 | 43 | def replace_child(self, target, new): 44 | ''' 45 | Replaces a child with a new one. This function is not recursive, 46 | which means it does not search among the descendants of the gene's 47 | children. It returns the gene that was replaced.
48 | ''' 49 | index = self.children.index(target) 50 | old = self.children[index] 51 | self.children[index] = new 52 | return old 53 | 54 | def children_number(self): 55 | return len(self.children) 56 | 57 | def anomaly(self): 58 | ''' 59 | Decides whether this gene is fuzzable or not. 60 | True means that this gene should not be fuzzed. 61 | ''' 62 | return False 63 | 64 | def mutate(self, mutator): 65 | ''' 66 | uses a Mutator object to corrupt some of its data. 67 | ''' 68 | data = mutator.mutate(self.get_data()) 69 | self.set_data(data) 70 | 71 | def serialize(self): 72 | ''' 73 | serializes its data and its children's data into 74 | a bytestring. 75 | ''' 76 | data = self.get_data() 77 | if data is None: 78 | data = '' 79 | for child in self.children: 80 | data += child.serialize() 81 | return data 82 | 83 | def is_equal(self, other): 84 | ''' 85 | dummy 86 | ''' 87 | return False 88 | 89 | def __str__(self): 90 | ''' 91 | Convert the gene to a bytestring. 92 | ''' 93 | return self.serialize() 94 | 95 | -------------------------------------------------------------------------------- /chromosome/parsers/PNG.py: -------------------------------------------------------------------------------- 1 | import os 2 | import zlib 3 | import math 4 | import struct 5 | import copy 6 | 7 | import chromosome.gene as gene 8 | import chromosome.serializer as serializer 9 | import chromosome.deserializer as deserializer 10 | 11 | PNG_SIGNATURE = '\x89\x50\x4e\x47\x0d\x0a\x1a\x0a' 12 | 13 | 14 | class PNGGene(gene.AbstractGene): 15 | ''' 16 | The PNGGene represents a PNG chunk. 17 | 18 | Using the PNGDeserializer, we read the contents of a PNG file, 19 | and hold them in memory. Each PNG chunk corresponds to a PNGGene 20 | object. The contents of the PNG chunk are fuzzed in memory. We have 21 | the capability to fuzz specific parts of the chunk's contents. For 22 | example, it is useless to fuzz the CRC field of a PNG chunk.
23 | ''' 24 | def __init__(self, chunk): 25 | super(PNGGene, self).__init__() 26 | self.length = chunk['length'] 27 | self.name = chunk['name'] 28 | self.data = chunk['data'] 29 | self.crc = chunk['crc'] 30 | 31 | def anomaly(self): 32 | ''' 33 | If anomaly returns True, then the current 34 | gene should not be fuzzed. 35 | ''' 36 | if self.length == 0: 37 | return True 38 | else: 39 | return False 40 | 41 | def is_equal(self, other): 42 | ''' 43 | To identify PNG chunks of the same type. 44 | ''' 45 | if not isinstance(other, self.__class__): 46 | return False 47 | 48 | if self.name == other.name and PNGGene.asciiname(self.name) != 'IEND': 49 | return True 50 | else: 51 | return False 52 | 53 | # This function must be implemented in order to serialize the gene. 54 | def serialize(self): 55 | ''' 56 | This function is called to serialize the in-memory data of a PNG chunk. 57 | ''' 58 | self.fix_crc() 59 | 60 | bytestring = '' 61 | chunk_data = super(PNGGene, self).serialize() 62 | 63 | bytestring += struct.pack('>I', len(chunk_data)) 64 | bytestring += struct.pack('>I', self.name) 65 | bytestring += chunk_data 66 | bytestring += struct.pack('>I', self.crc) 67 | 68 | return bytestring 69 | 70 | def fix_crc(self): 71 | ''' 72 | re-calculates the gene's CRC checksum. 73 | ''' 74 | checksum = zlib.crc32( 75 | struct.pack('>I', self.name) 76 | ) 77 | self.crc = zlib.crc32( 78 | self.data, checksum 79 | ) & 0xffffffff 80 | 81 | @staticmethod 82 | def asciiname(chunkname): 83 | ''' 84 | Converts a chunk name to ASCII and returns it. 85 | ''' 86 | return '%c%c%c%c' % ( 87 | (chunkname >> 24) & 0xFF, 88 | (chunkname >> 16) & 0xFF, 89 | (chunkname >> 8) & 0xFF, 90 | (chunkname & 0xFF) 91 | ) 92 | 93 | 94 | class PNGSerializer(serializer.BaseSerializer): 95 | ''' 96 | The PNG serializer. 97 | 98 | This class is used to serialize a tree of PNGGenes into a file. Since 99 | PNG is just a chunk-based format, there is not a tree of genes, but 100 | a list of genes.
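fix_crc() above implements the PNG rule that a chunk's CRC-32 covers the 4-byte chunk type followed by the chunk data, but not the length field. A quick standalone check of that rule: the empty IEND chunk always checksums to 0xAE426082, which is why every valid PNG file ends with the same four CRC bytes.

```python
import struct
import zlib

def chunk_crc(name, data=b''):
    # seed the CRC with the 4-byte chunk type, then run it over the
    # data, masking to an unsigned 32-bit value as fix_crc() does
    checksum = zlib.crc32(name)
    return zlib.crc32(data, checksum) & 0xffffffff

crc = chunk_crc(b'IEND')               # IEND has no data
packed = struct.pack('>I', crc)        # big-endian, as written to the file
```

Corrupting a chunk in memory and then recomputing the CRC, as the serializer does, is what keeps mutated files from being rejected by the target's checksum validation before the interesting parsing code is ever reached.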
During the serialization, the CRC of each chunk is 101 | fixed and some chunks, which are required to be compressed, are 102 | deflated using zlib. 103 | ''' 104 | def __init__(self): 105 | super(PNGSerializer, self).__init__() 106 | 107 | @staticmethod 108 | def deflate_idat_chunks(genes): 109 | ''' 110 | deflate_idat_chunks takes as input a number of genes. Data stored 111 | in IDAT genes is collected into a bytestring and compressed 112 | using the zlib module. The compressed bytestring is then divided 113 | again and copied back into the genes. This function returns a list 114 | with the deflated genes. Keep in mind that it works on a deep 115 | copy of the genes given as input, so the data in the genes passed 116 | as argument is left untouched. 117 | ''' 118 | indices = list() 119 | deflated_genes = copy.deepcopy(genes) 120 | datastream = str() 121 | 122 | for idx, curr_gene in enumerate(genes): 123 | if PNGGene.asciiname(curr_gene.name) == 'IDAT': 124 | indices.append(idx) 125 | datastream += curr_gene.get_data() 126 | 127 | comp = zlib.compress(datastream) 128 | idatno = len(indices) 129 | 130 | if idatno > 0: 131 | chunk_len = int(math.ceil(float(len(comp)) / float(idatno))) 132 | 133 | for cnt, index in enumerate(indices): 134 | start = cnt * chunk_len 135 | if index != indices[-1]: 136 | deflated_genes[index].set_data( 137 | comp[start : start+chunk_len]) 138 | else: 139 | deflated_genes[index].set_data( 140 | comp[start : ] 141 | ) 142 | deflated_genes[index].length = len( 143 | deflated_genes[index].get_data() 144 | ) 145 | 146 | return deflated_genes 147 | 148 | def serialize(self, genes): 149 | ''' 150 | This method serializes each of the genes given as argument. The 151 | serialized bytestring of each gene is appended to a buffer 152 | that contains the PNG header. The bytestring of the whole PNG 153 | is returned.
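The redistribution step in deflate_idat_chunks is plain ceil-division chunking: the compressed stream is cut into as many pieces as there are IDAT chunks, with the final piece absorbing any remainder. A standalone sketch of that arithmetic:

```python
import math

def split_evenly(blob, n):
    '''Split a bytestring into n pieces; the last piece takes the remainder.'''
    chunk_len = int(math.ceil(len(blob) / float(n)))
    pieces = []
    for i in range(n):
        start = i * chunk_len
        if i < n - 1:
            pieces.append(blob[start:start + chunk_len])
        else:
            pieces.append(blob[start:])   # remainder goes to the last piece
    return pieces

pieces = split_evenly(b'0123456789', 3)   # chunk_len = ceil(10/3) = 4
```

Concatenating the pieces reproduces the original stream, which is the property the serializer relies on: PNG decoders treat consecutive IDAT chunks as one continuous zlib stream, so only the concatenation has to be valid.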
154 | ''' 155 | bytestring = PNG_SIGNATURE 156 | deflated_genes = PNGSerializer.deflate_idat_chunks(genes) 157 | bytestring += super(PNGSerializer, self).serialize(deflated_genes) 158 | return bytestring 159 | 160 | 161 | class PNGDeserializer(deserializer.BaseDeserializer): 162 | ''' 163 | A parser for PNG files. 164 | 165 | This class is used to parse the chunks of a PNG file and construct 166 | PNGGene objects with the contents of the chunks. Moreover, the 167 | deserializer will decompress the zipped data in order to 168 | fuzz it directly in memory. 169 | ''' 170 | fsize = None 171 | fstream = None 172 | chunks = None 173 | 174 | def __init__(self): 175 | super(PNGDeserializer, self).__init__() 176 | self.fsize = 0 177 | self.fstream = None 178 | self.chunks = list() 179 | 180 | def deserialize(self, filename): 181 | ''' 182 | Parses the chosen PNG file. 183 | ''' 184 | # initialize input file 185 | genes = list() 186 | 187 | # open and read PNG header 188 | self._prepare(filename) 189 | self._parse_signature() 190 | 191 | # parse data chunks 192 | for chunk in self._parse_chunks(): 193 | self.chunks.append(chunk) 194 | 195 | # decompress IDAT chunks (zlib streams) 196 | self._inflate_idat_chunks() 197 | 198 | # initialize gene list with the inflated chunks 199 | for chunk in self.chunks: 200 | genes.append(PNGGene(chunk)) 201 | 202 | self.fstream.close() 203 | self.fsize = 0 204 | self.chunks = list() 205 | 206 | return genes 207 | 208 | def _inflate_idat_chunks(self): 209 | ''' 210 | This method takes all the IDAT PNG chunks that were read and 211 | decompresses their data using the zlib module.
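The chunk-walking loop in _parse_chunks() can be exercised against an in-memory PNG skeleton: each chunk is a big-endian length, a 4-byte type, the payload, and a CRC. A hedged sketch reading from io.BytesIO instead of a file (the 13 zero bytes stand in for a real IHDR payload):

```python
import io
import struct
import zlib

def build_chunk(name, data=b''):
    # length + type + data + CRC(type || data), all big-endian
    crc = zlib.crc32(data, zlib.crc32(name)) & 0xffffffff
    return struct.pack('>I', len(data)) + name + data + struct.pack('>I', crc)

signature = b'\x89PNG\r\n\x1a\n'
blob = signature + build_chunk(b'IHDR', b'\x00' * 13) + build_chunk(b'IEND')

stream = io.BytesIO(blob)
assert stream.read(8) == signature      # signature check, as _parse_signature

# walk the stream until the end, one chunk header at a time
chunks = []
size = len(blob)
while stream.tell() < size:
    length, = struct.unpack('>I', stream.read(4))
    name = stream.read(4)
    data = stream.read(length)
    crc, = struct.unpack('>I', stream.read(4))
    chunks.append((name, length, crc))
```

The loop terminates on the stream position reaching the file size, exactly as the deserializer compares `fstream.tell()` against the precomputed file size.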
212 | ''' 213 | datastream = str() 214 | indices = list() 215 | 216 | for idx, chunk in enumerate(self.chunks): 217 | if PNGGene.asciiname(chunk['name']) == 'IDAT': 218 | datastream += chunk['data'] 219 | indices.append(idx) 220 | 221 | decomp = zlib.decompress(datastream) 222 | 223 | idatno = len(indices) 224 | chunk_len = int(math.ceil(float(len(decomp)) / float(idatno))) 225 | 226 | for cnt, index in enumerate(indices): 227 | start = cnt * chunk_len 228 | 229 | if index != indices[-1]: 230 | self.chunks[index]['data'] = decomp[start : start + chunk_len] 231 | else: 232 | self.chunks[index]['data'] = decomp[start:] 233 | 234 | self.chunks[index]['length'] = len(self.chunks[index]['data']) 235 | 236 | 237 | def _parse_signature(self): 238 | ''' 239 | The first 8 bytes of every PNG image must be the signature. 240 | ''' 241 | signature = self.fstream.read(8) 242 | assert signature == PNG_SIGNATURE 243 | 244 | def _parse_chunks(self): 245 | ''' 246 | A generator that parses all chunks of the chosen PNG image. 247 | ''' 248 | index = 0 249 | while self.fsize > self.fstream.tell(): 250 | index += 1 251 | chunk = dict() 252 | chunk['index'] = index 253 | chunk['length'], = struct.unpack('>I', self.fstream.read(4)) 254 | chunk['name'], = struct.unpack('>I', self.fstream.read(4)) 255 | chunk['data'] = self.fstream.read(chunk['length']) 256 | chunk['crc'], = struct.unpack('>I', self.fstream.read(4)) 257 | 258 | yield chunk 259 | 260 | def _get_filesize(self): 261 | ''' 262 | Returns the file size. 263 | ''' 264 | where = self.fstream.tell() 265 | self.fstream.seek(0, 2) 266 | size = self.fstream.tell() 267 | self.fstream.seek(where, 0) 268 | return size 269 | 270 | def _prepare(self, filename): 271 | ''' 272 | Preparation before parsing. 273 | ''' 274 | if not os.path.isfile(filename): 275 | raise IOError('%s is not a regular file.'
% filename) 276 | 277 | self.chunks = list() 278 | self.fstream = open(filename, 'rb') 279 | self.fsize = self._get_filesize() 280 | 281 | -------------------------------------------------------------------------------- /chromosome/parsers/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/CENSUS/choronzon/d702c318e2292a061da57d7ba5f88d4c4b0f5256/chromosome/parsers/__init__.py -------------------------------------------------------------------------------- /chromosome/serializer.py: -------------------------------------------------------------------------------- 1 | class BaseSerializer(object): 2 | ''' 3 | API for parsers. 4 | ''' 5 | def __init__(self): 6 | pass 7 | 8 | def serialize(self, genes): 9 | data = '' 10 | for gene in genes: 11 | data += gene.serialize() 12 | return data 13 | -------------------------------------------------------------------------------- /configuration.py: -------------------------------------------------------------------------------- 1 | ''' 2 | Provides a singleton class that initializes and stores 3 | all the settings for the currently loaded campaign. It is 4 | shared with all of Choronzon's components. 5 | ''' 6 | 7 | import os 8 | import sys 9 | import imp 10 | import contextlib 11 | 12 | class Singleton(type): 13 | ''' 14 | Basic singleton class recipe. Use this as a 15 | metaclass for any other normal class definition and 16 | it will become a singleton. 17 | ''' 18 | def __call__(cls, *args, **kwargs): 19 | try: 20 | return cls.__instance 21 | except AttributeError: 22 | cls.__instance = super(Singleton, cls).__call__(*args, **kwargs) 23 | return cls.__instance 24 | 25 | class Configuration(object): 26 | ''' 27 | Singleton that provides a uniform API for Choronzon's 28 | components to look up various settings and configurations. 
29 | ''' 30 | __metaclass__ = Singleton 31 | 32 | def __init__(self, configfile): 33 | if not os.path.exists(configfile): 34 | raise IOError('Configuration file does not exist.') 35 | 36 | self.configfile = os.path.abspath(configfile) 37 | self.module = self.import_program_as_module('%s' % self.configfile) 38 | 39 | def __contains__(self, item): 40 | try: 41 | getattr(self.module, item) 42 | return True 43 | except AttributeError: 44 | return False 45 | 46 | @contextlib.contextmanager 47 | def preserve_value(self, namespace, name): 48 | """ A context manager to preserve, then restore, the specified binding. 49 | 50 | :param namespace: The namespace object (e.g. a class or dict) 51 | containing the name binding. 52 | :param name: The name of the binding to be preserved. 53 | :yield: None. 54 | 55 | When the context manager is entered, the current value bound to 56 | `name` in `namespace` is saved. When the context manager is 57 | exited, the binding is re-established to the saved value. 58 | 59 | """ 60 | saved_value = getattr(namespace, name) 61 | yield 62 | setattr(namespace, name, saved_value) 63 | 64 | 65 | def make_module_from_file(self, module_name, module_filepath): 66 | """ 67 | Make a new module object from the source code in specified file. 68 | 69 | :param module_name: The name of the resulting module object. 70 | :param module_filepath: The filesystem path to open for 71 | reading the module's Python source. 72 | :return: The module object. 73 | 74 | The Python import mechanism is not used. No cached bytecode 75 | file is created, and no entry is placed in `sys.modules`. 
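The singleton recipe above caches the instance on the class itself at the first call. Under Python 3 the metaclass is attached with the `metaclass=` keyword instead of the `__metaclass__` attribute this (Python 2) codebase uses; a minimal sketch with a hypothetical `Config` class:

```python
class Singleton(type):
    def __call__(cls, *args, **kwargs):
        # construct on the first call, then hand back the cached instance
        try:
            return cls.__instance
        except AttributeError:
            cls.__instance = super(Singleton, cls).__call__(*args, **kwargs)
            return cls.__instance

class Config(metaclass=Singleton):
    def __init__(self, path=None):
        self.path = path

first = Config('/tmp/settings.py')
second = Config('ignored')   # arguments after the first call are ignored
```

This is why `Factory` can call `configuration.Configuration()` with no arguments: the singleton was already constructed, with its config file, earlier in startup, and subsequent calls just return it.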
76 | 77 | """ 78 | py_source_open_mode = 'U' 79 | py_source_description = (".py", py_source_open_mode, imp.PY_SOURCE) 80 | 81 | with open(module_filepath, py_source_open_mode) as module_file: 82 | with self.preserve_value(sys, 'dont_write_bytecode'): 83 | sys.dont_write_bytecode = True 84 | module = imp.load_module( 85 | module_name, module_file, module_filepath, 86 | py_source_description) 87 | 88 | return module 89 | 90 | 91 | def import_program_as_module(self, program_filepath): 92 | """ 93 | Import module from program file `program_filepath`. 94 | 95 | :param program_filepath: The full filesystem path to the program. 96 | This name will be used for both the source file to read, and 97 | the resulting module name. 98 | :return: The module object. 99 | 100 | A program file has an arbitrary name; it is not suitable to 101 | create a corresponding bytecode file alongside. So the creation 102 | of bytecode is suppressed during the import. 103 | 104 | The module object will also be added to `sys.modules`. 105 | 106 | """ 107 | module_name = os.path.basename(program_filepath) 108 | 109 | module = self.make_module_from_file(module_name, program_filepath) 110 | sys.modules[module_name] = module 111 | 112 | return module 113 | 114 | def __getitem__(self, name): 115 | return getattr(self.module, name) 116 | 117 | def __setitem__(self, name, value): 118 | setattr(self.module, name, value) 119 | 120 | 121 | class ConfigurationError(Exception): 122 | ''' 123 | Exception for errors in the configuration file. 
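The `imp` module used by `make_module_from_file` is deprecated (and removed in Python 3.12); the modern equivalent is `importlib.util`. A hedged sketch of the same load-a-config-file-as-a-module trick, including the bytecode suppression, with a throwaway temp file standing in for a real Choronzon configuration:

```python
import importlib.util
import os
import sys
import tempfile

def module_from_file(module_name, module_filepath):
    '''Build a module object from an arbitrary source file path.'''
    spec = importlib.util.spec_from_file_location(module_name, module_filepath)
    module = importlib.util.module_from_spec(spec)
    # suppress bytecode caching while executing the module body,
    # mirroring the preserve_value() dance in configuration.py
    saved = sys.dont_write_bytecode
    sys.dont_write_bytecode = True
    try:
        spec.loader.exec_module(module)
    finally:
        sys.dont_write_bytecode = saved
    return module

# demonstrate with a throwaway "configuration" file
with tempfile.NamedTemporaryFile('w', suffix='.py', delete=False) as tmp:
    tmp.write('Parser = "PNG"\nFitnessAlgorithms = {"BasicBlockCoverage": 1.0}\n')
    path = tmp.name

config = module_from_file(os.path.basename(path), path)
os.unlink(path)
```

As in the original, the module is built directly from the file path, so the config file can live anywhere and carry any name, which `imp.find_module`-style dotted imports would not allow.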
124 | ''' 125 | pass 126 | -------------------------------------------------------------------------------- /disassembler/__init__.py: -------------------------------------------------------------------------------- 1 | from disassembler import * 2 | -------------------------------------------------------------------------------- /disassembler/disassembler.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | 3 | ''' 4 | disassembler.py provides an interface to IDA Pro's 5 | functionality. This module exposes a class that can disassemble 6 | a given binary into a list of basic blocks, using IDA's command 7 | line interface. 8 | ''' 9 | 10 | import os 11 | import platform 12 | import subprocess 13 | 14 | class Disassembler(object): 15 | ''' 16 | This is the disassembler's base class. 17 | ''' 18 | disassembler_path = None 19 | def __init__(self, dispath): 20 | # The path to disassembler binary 21 | self.disassembler_path = dispath 22 | if not os.path.exists(dispath): 23 | raise IOError('Disassembler could not be found at %s' % dispath) 24 | 25 | def get_disassembler_path(self): 26 | ''' 27 | Returns the path of the disassembler's binary. 28 | ''' 29 | return self.disassembler_path 30 | 31 | def disassemble(self, binary, output): 32 | ''' 33 | This method should be overridden with the implemention of each 34 | disassembler. 35 | ''' 36 | raise NotImplementedError( 37 | 'Do not call this method of this class directly.') 38 | 39 | class IDADisassembler(Disassembler): 40 | ''' 41 | A generic IDA Pro CLI driver class. 42 | ''' 43 | def get_ida_runnable(self, exe): 44 | ''' 45 | returns the appropriate IDA binary according to 46 | platform and target architecture. 
47 | ''' 48 | system = platform.system() 49 | # platform.architecture = arch, linkage 50 | arch, _ = platform.architecture( 51 | exe, 52 | bits='64bit' 53 | ) 54 | 55 | runnable = '' 56 | if system == 'Linux': 57 | runnable = 'idal' 58 | elif system == 'Windows': 59 | runnable = 'idaw' 60 | else: 61 | raise OSError('Unsupported system "%s".' % system) 62 | 63 | if arch == '64bit': 64 | runnable += '64' 65 | if system == 'Windows': 66 | runnable += '.exe' 67 | 68 | return runnable 69 | 70 | def _run_ida(self, exe, script='', output='.'): 71 | ''' 72 | crafts the command and executes it in order to 73 | retrieve the list of basic blocks given a binary. 74 | ''' 75 | xecut = self.get_ida_runnable(exe) 76 | ida = os.path.join(self.disassembler_path, xecut) 77 | scriptcmd = '%s -o %s' % (script, output) 78 | script = os.path.abspath(script) 79 | 80 | cmdlist = [] 81 | cmdlist.append(ida) # first argument, ida's path 82 | cmdlist.append('-A') 83 | cmdlist.append('-L"%s"' % os.path.join(output, "log.txt")) 84 | cmdlist.append('-S"%s"' % scriptcmd) 85 | cmdlist.append('%s' % exe) 86 | 87 | print '[-] command line: ', ' '.join(cmdlist) 88 | 89 | proc = None 90 | if platform.system() == 'Linux': 91 | proc = subprocess.Popen(' '.join(cmdlist), shell=True) 92 | elif platform.system() == 'Windows': 93 | proc = subprocess.Popen(' '.join(cmdlist)) 94 | 95 | proc.wait() 96 | 97 | def disassemble(self, blob, output='.'): 98 | ''' 99 | wrapper that calls the appropriate functions in 100 | order to expose the class' functionality. 
101 | ''' 102 | self._run_ida(blob, 'disassembler/prepare.py', output) 103 | 104 | dump = os.path.join(output, '%s.idmp' % blob) 105 | 106 | with open(dump, 'r') as fin: 107 | for line in fin: 108 | yield line 109 | -------------------------------------------------------------------------------- /disassembler/prepare.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | 3 | ''' 4 | this script is meant to be run inside IDA Pro, 5 | as an IDAPython script. 6 | ''' 7 | 8 | import os 9 | import idc 10 | import idaapi 11 | 12 | def find_functions(): 13 | ''' 14 | yields all functions in the form a 2-tuple: 15 | 16 | (function_address, function_name) 17 | 18 | function_address is a RELATIVE offset from the 19 | image base. 20 | ''' 21 | # get image base from IDA 22 | image_base = idaapi.get_imagebase() 23 | 24 | # iterate through all functions in the executable. 25 | for func_ea in Functions(MinEA(), MaxEA()): 26 | # craft the routine record 27 | func_name = GetFunctionName(func_ea) 28 | funcaddr = func_ea - image_base 29 | yield funcaddr, func_name 30 | 31 | def find_bbls(function_ea): 32 | ''' 33 | yields all basic blocks that belong to the 34 | given function. The blocks are returned in 35 | a 2-tuple like: 36 | 37 | (start_address, end_address) 38 | 39 | Both start and end address are RELATIVE offsets 40 | from the image base. 
41 | ''' 42 | 43 | # get image base from IDA 44 | image_base = idaapi.get_imagebase() 45 | function_ea += image_base 46 | 47 | # get flow chart from IDA 48 | flow_chart = idaapi.FlowChart( 49 | idaapi.get_func(function_ea), 50 | flags=idaapi.FC_PREDS 51 | ) 52 | 53 | # iterate through all basic blocks in 54 | # the current routine 55 | for block in flow_chart: 56 | start_addr = block.startEA - image_base 57 | end_addr = block.endEA - image_base 58 | if start_addr != end_addr: 59 | yield start_addr, end_addr 60 | 61 | def write(stream, msg): 62 | stream.write('%s\n' % msg) 63 | stream.flush() 64 | 65 | def get_image(): 66 | name = idc.GetInputFile() 67 | base = idaapi.get_imagebase() 68 | return base, name 69 | 70 | def dump_all(output): 71 | with open(output, 'w') as fout: 72 | print '[+] Dumping image...' 73 | write(fout, '##IMAGE##') 74 | base, name = get_image() 75 | write(fout, '%s,%s' % (base, name)) 76 | 77 | print '[+] Dumping all functions...' 78 | write(fout, '##FUNCTIONS##') 79 | functions = find_functions() 80 | for fea, fname in functions: 81 | write(fout, '%s,%s' % (fea, fname)) 82 | 83 | print '[+] Dumping all basic blocks...' 84 | write(fout, '##BBLS##') 85 | functions = find_functions() 86 | for fea, fname in functions: 87 | for start, end in find_bbls(fea): 88 | write( 89 | fout, '0x%x,0x%x,%s' % ( 90 | start, 91 | end, 92 | fname 93 | ) 94 | ) 95 | 96 | def wait_until_ready(): 97 | ''' 98 | first thing you should wait until IDA has parsed 99 | the executable. 100 | ''' 101 | print "[+] Waiting for auto-analysis to finish..." 
102 | # wait for the autoanalysis to finish 103 | idc.Wait() 104 | 105 | def prepare_output(path): 106 | idb_name = os.path.basename('%s.idmp' % idc.GetInputFile()) 107 | path = os.path.abspath(path) 108 | return os.path.join(path, idb_name) 109 | 110 | def dump(path): 111 | out = prepare_output(path) 112 | wait_until_ready() 113 | print '[+] Dumping everything on: %s' % out 114 | dump_all(out) 115 | 116 | def main(args): 117 | dump(args.output) 118 | return 0 119 | 120 | if __name__ == '__main__': 121 | import sys 122 | import argparse 123 | 124 | parser = argparse.ArgumentParser( 125 | description="a tool that analyzes an executable file or \ 126 | an .idb/.i64 and dumps everything on a file." 127 | ) 128 | 129 | parser.add_argument( 130 | "-o", 131 | "--output", 132 | help='the directory to save the output file, default is /tmp/.', 133 | default="/tmp/" 134 | ) 135 | 136 | args = parser.parse_args(idc.ARGV[1:]) 137 | 138 | sys.exit(main(args)) 139 | -------------------------------------------------------------------------------- /evaluator.py: -------------------------------------------------------------------------------- 1 | ''' 2 | evaluator.py contains code that divides the chromosomes 3 | of a generation to elite and non-interesting. Also, it's responsible 4 | for metric normalization and fitness calculation. 5 | ''' 6 | 7 | import sortedcontainers as sc 8 | 9 | import configuration 10 | import campaign 11 | 12 | class Metric(object): 13 | ''' 14 | Base Metric class that is inherited by every 15 | user specified metric algorithm. 
16 | ''' 17 | trace = None 18 | value = None 19 | 20 | def __init__(self, chromo): 21 | ''' 22 | normal initializer 23 | ''' 24 | self.chromo = chromo 25 | self.value = 0.0 26 | 27 | def get_normal(self, **kwargs): 28 | ''' 29 | returns the normalized metric value 30 | ''' 31 | return self.value 32 | 33 | @classmethod 34 | def calculate(cls, chromo, **kwargs): 35 | ''' 36 | automatic wrapper that returns the value 37 | of get_normal(). 38 | ''' 39 | obj = cls(chromo) 40 | return obj.get_normal(**kwargs) 41 | 42 | class BasicBlockCoverage(Metric): 43 | ''' 44 | Returns the percentage of the total basic blocks that 45 | were hit across all images. 46 | ''' 47 | def get_normal(self, **kwargs): 48 | if 'cache' not in kwargs: 49 | raise KeyError('Cache not found') 50 | 51 | unique_trace = self.chromo.trace.get_unique_total() 52 | count = 0x0 53 | for img in kwargs['cache']: 54 | count += kwargs['cache'][img].get_count() 55 | 56 | if count == 0x0: 57 | return 0.0 58 | 59 | return unique_trace / float(count) 60 | 61 | class UniversalPathUniqueness(Metric): 62 | ''' 63 | Returns the percentage of the bbls in the trace of the given chromosome 64 | that were not hit by any other chromosome in the population.
65 | ''' 66 | def get_normal(self, **kwargs): 67 | # assume that this chromosome is in the current generation 68 | other = kwargs['previous'] 69 | this = kwargs['current'] 70 | 71 | # check if the assumption is correct 72 | if kwargs['previous'] is not None: 73 | if self.chromo in kwargs['previous']: 74 | this = kwargs['previous'] 75 | other = kwargs['current'] 76 | 77 | # holds the unique basic blocks per image (key) 78 | unique = {} 79 | 80 | # if other is not None, this isn't the first generation 81 | if other is not None: 82 | # unique will hold all the bbls that were hit by this chromo 83 | # and were not hit by the other generation 84 | for img, uniq in self.chromo.trace.get_difference_per_image( 85 | other.trace 86 | ): 87 | unique[img] = uniq 88 | else: 89 | # if this is the first generation, unique corresponds to 90 | # all the bbls of the trace 91 | for img in self.chromo.trace.images: 92 | unique[img] = sc.SortedSet( 93 | self.chromo.trace.set_per_image[img] 94 | ) 95 | # iterate through all chromos in this generation (except itself) 96 | for chromo in this: 97 | if chromo.uid == self.chromo.uid: 98 | continue 99 | for img in chromo.trace.images: 100 | # remove from unique the bbls that were hit by other 101 | # chromosomes in this generation 102 | unique[img] -= chromo.trace.set_per_image[img] 103 | 104 | # faults will be equal to the basic blocks that exist only in this trace 105 | faults = 0x0 106 | for img in unique: 107 | faults += len(unique[img]) 108 | 109 | return faults / float(self.chromo.trace.get_unique_total()) 110 | 111 | class GenerationUniqueness(Metric): 112 | ''' 113 | Returns the percentage of the bbls in the trace of the given chromosome 114 | that were not hit by any chromosome of the other generation.
115 | '''
116 | def get_normal(self, **kwargs):
117 | # other is `previous' if the chromosome belongs to the
118 | # current generation; it's `current' if the chromosome
119 | # belongs to the previous generation
120 | other = kwargs['previous']
121 |
122 | if kwargs['previous'] != None:
123 | if self.chromo in kwargs['previous']:
124 | other = kwargs['current']
125 |
126 | # if other == None, this is the first generation
127 | if other == None:
128 | return 1.0
129 |
130 | unique = {}
131 |
132 | for img, uniq in self.chromo.trace.get_difference_per_image(
133 | other.trace
134 | ):
135 | unique[img] = uniq
136 |
137 | faults = 0x0
138 | for img in unique:
139 | faults += len(unique[img])
140 |
141 | return faults / float(self.chromo.trace.get_unique_total())
142 |
143 | class CodeCommonality(Metric):
144 | '''
145 | The ratio of total BBL hits to unique BBLs hit (how often code is revisited)
146 | '''
147 | def get_normal(self, **kwargs):
148 | unique_trace = self.chromo.trace.get_unique_total()
149 | total_trace = self.chromo.trace.get_total()
150 | if total_trace == 0x0:
151 | return 0.0
152 | return total_trace / float(unique_trace)
153 |
154 | class Evaluator(object):
155 | '''
156 | Evaluator is the top-level management class
157 | that handles calling the appropriate functions
158 | and incorporates the logic of the evaluation.
159 | '''
160 | cache = None
161 | configuration = None
162 | weights = None
163 | algorithms = None
164 | population = None
165 |
166 | def __init__(self, cache, configfile=None):
167 | self.cache = cache
168 | self.configuration = configuration.Configuration(configfile)
169 | self.campaign = campaign.Campaign()
170 | self.load_metric_algorithms(
171 | self.configuration['FitnessAlgorithms']
172 | )
173 |
174 | def load_metric_algorithms(self, algorithms=None):
175 | '''
176 | accepts a dictionary of the algorithm class names and
177 | their matching weights, and loads them into this instance
178 | by looking up the names in the module globals.
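A minimal sketch of the name-to-class resolution (hypothetical standalone code; the loader itself does the equivalent lookup through globals(), replaced here by an explicit registry so the sketch is self-contained):

```python
# Hypothetical metric classes standing in for the real ones above.
class BasicBlockCoverage(object):
    pass

class CodeCommonality(object):
    pass

# The configuration maps metric class names to fitness weights.
algorithms = {'BasicBlockCoverage': 0.6, 'CodeCommonality': 0.4}

# Resolve each configured name to its class object.
registry = {cls.__name__: cls for cls in (BasicBlockCoverage, CodeCommonality)}
loaded = {name: registry[name] for name in algorithms}
```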
179 | ''' 180 | if algorithms == None: 181 | algorithms = {} 182 | self.weights = algorithms 183 | self.algorithms = {} 184 | for name in algorithms: 185 | self.algorithms[name] = globals()[name] 186 | 187 | def calculate_metrics(self, chromo): 188 | ''' 189 | use the implemented algorithms above to 190 | calculate the metrics for a given chromosome. 191 | ''' 192 | previous = None 193 | if self.population.previous != None: 194 | previous = self.population.previous 195 | 196 | metrics = {} 197 | 198 | # This is because we want to log the metrics for each chromosome 199 | for name in self.algorithms: 200 | algo = self.algorithms[name] 201 | metric = algo.calculate( 202 | chromo, 203 | cache=self.cache, 204 | previous=previous, 205 | current=self.population.current 206 | ) 207 | metrics[name] = metric 208 | 209 | return metrics 210 | 211 | def calculate_previous_gen_metrics(self): 212 | ''' 213 | calculates and sets the (non normalized) metrics 214 | for each individual chromosome. 215 | ''' 216 | if self.population.previous == None: 217 | return 218 | 219 | for chromo in self.population.previous.get_all(): 220 | metrics = self.calculate_metrics(chromo) 221 | self.population.previous.set_metrics(chromo.uid, metrics) 222 | 223 | def calculate_current_gen_metrics(self): 224 | ''' 225 | calculates and sets the metrics for each 226 | individual chromosome. 
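Sketch of the per-chromosome metrics dict this builds (hypothetical standalone code, with the metric algorithms reduced to plain callables):

```python
# Hypothetical stand-ins: each algorithm maps a chromosome to a raw value.
algorithms = {
    'BasicBlockCoverage': lambda chromo: chromo['bbls_hit'] / 100.0,
    'GenerationUniqueness': lambda chromo: chromo['unique_bbls'] / 10.0,
}

chromo = {'bbls_hit': 42, 'unique_bbls': 3}

# One raw (non-normalized) value per configured metric, keyed by name.
metrics = {name: algo(chromo) for name, algo in algorithms.items()}
```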
227 | ''' 228 | for chromo in self.population.current.get_all(): 229 | metrics = self.calculate_metrics(chromo) 230 | self.population.current.set_metrics(chromo.uid, metrics) 231 | 232 | def get_population_max_metrics(self): 233 | ''' 234 | returns the maximum value for each metric 235 | ''' 236 | if self.population.previous == None: 237 | return self.population.current.max_metrics 238 | 239 | globmax = {} 240 | for name, prev in self.population.previous.max_metrics.iteritems(): 241 | curr = self.population.current.max_metrics[name] 242 | globmax[name] = max(prev, curr) 243 | 244 | return globmax 245 | 246 | def get_population_min_metrics(self): 247 | ''' 248 | returns the minimum value for each metric 249 | ''' 250 | if self.population.previous == None: 251 | return self.population.current.min_metrics 252 | 253 | globmin = {} 254 | for name, prev in self.population.previous.min_metrics.iteritems(): 255 | curr = self.population.current.min_metrics[name] 256 | globmin[name] = min(prev, curr) 257 | 258 | return globmin 259 | 260 | def get_normalized_metrics(self): 261 | ''' 262 | normalizes the metrics retrieved for each chromosome 263 | in the population (previous AND current generation) 264 | using the classical: 265 | 266 | x_norm = (x - xmin) / (xmax - xmin) 267 | ''' 268 | globmax = self.get_population_max_metrics() 269 | globmin = self.get_population_min_metrics() 270 | 271 | maxmin = {} 272 | for name in globmax: 273 | val = float(globmax[name] - globmin[name]) 274 | if val == 0.0: 275 | maxmin[name] = 1 276 | else: 277 | maxmin[name] = val 278 | 279 | current = {} 280 | 281 | # this applies to both current and previous 282 | # current[chromo.uid][metric_name] = metric_value 283 | for chromo in self.population.current.get_all(): 284 | current[chromo.uid] = {} 285 | for name in chromo.metrics: 286 | current[chromo.uid][name] = ( 287 | chromo.metrics[name] - globmin[name] 288 | ) / maxmin[name] 289 | 290 | previous = {} 291 | if self.population.previous != None: 
292 | for chromo in self.population.previous.get_all(): 293 | previous[chromo.uid] = {} 294 | for name in chromo.metrics: 295 | previous[chromo.uid][name] = ( 296 | chromo.metrics[name] - globmin[name] 297 | ) / maxmin[name] 298 | 299 | return previous, current 300 | 301 | def calculate_fitness(self, metrics): 302 | ''' 303 | uses the weights provided in the configuration 304 | to calculate the individual fitness of a 305 | chromosome. 306 | ''' 307 | fitness = 0.0 308 | 309 | for name in metrics: 310 | weight = self.weights[name] 311 | fitness += weight * metrics[name] 312 | 313 | return fitness 314 | 315 | def set_population_fitness(self): 316 | ''' 317 | uses the normalized metrics to compute 318 | the fitness for both the previous and 319 | the current generation. It then proceeds 320 | to set the fitness for every chromosome in 321 | the population. 322 | ''' 323 | previous, current = self.get_normalized_metrics() 324 | 325 | self.campaign.log('From the previous generation') 326 | for chromo_uid in previous: 327 | fitness = self.calculate_fitness(previous[chromo_uid]) 328 | self.campaign.log('Uid: %s, fitness: %f' % (chromo_uid, fitness)) 329 | self.population.set_previous_fitness( 330 | chromo_uid, fitness 331 | ) 332 | 333 | self.campaign.log('From the current generation') 334 | for chromo_uid in current: 335 | fitness = self.calculate_fitness(current[chromo_uid]) 336 | self.campaign.log('Uid: %s, fitness: %f' % (chromo_uid, fitness)) 337 | self.population.set_fitness( 338 | chromo_uid, fitness 339 | ) 340 | 341 | def evaluate(self, population): 342 | ''' 343 | computes the metrics for every chromosome 344 | in the current generation. 345 | 346 | Then it normalizes the metrics and calculates 347 | the fitness for every chromosome in the 348 | *population*. 349 | 350 | This means that the results are normalized 351 | for both previous and current generations. 
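The normalization and weighting pipeline can be sketched as follows (hypothetical standalone Python 3, for a single metric named 'cov' across a three-chromosome population):

```python
# Raw values of one metric across the whole population.
raw = [0.2, 0.5, 0.8]
weights = {'cov': 1.0}

lo, hi = min(raw), max(raw)
span = (hi - lo) or 1.0  # guard: all-equal values would divide by zero

# x_norm = (x - xmin) / (xmax - xmin)
normalized = [(x - lo) / span for x in raw]

# fitness is the weighted sum of the normalized metrics (one metric here)
fitness = [weights['cov'] * x for x in normalized]
```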
352 | '''
353 | self.campaign.log('Evaluating the population.')
354 | self.population = population
355 | self.calculate_previous_gen_metrics()
356 | self.calculate_current_gen_metrics()
357 | self.set_population_fitness()
358 |
359 | if self.population.previous == None:
360 | self.population.current.clear_metrics()
361 |
362 | return True
363 |
-------------------------------------------------------------------------------- /fuzzers/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/CENSUS/choronzon/d702c318e2292a061da57d7ba5f88d4c4b0f5256/fuzzers/__init__.py -------------------------------------------------------------------------------- /fuzzers/mutators.py: --------------------------------------------------------------------------------
1 | '''
2 | The mutator module contains a wide range of mutators. You can implement
3 | your own mutators in this module as well. All mutators must inherit from
4 | the Mutator class and implement the mutate method. Choronzon will
5 | eventually call every mutator, passing as parameters a bytestring that
6 | needs to be fuzzed and a small integer. Every mutator must return the
7 | fuzzed data to the caller.
8 | '''
9 | import random
10 | import re
11 |
12 | class Mutator(object):
13 | '''
14 | This is the Mutator base class.
15 | '''
16 | def __init__(self):
17 | pass
18 |
19 | def mutate(self, data, howmany=0):
20 | '''
21 | data is the bytestring that will be mutated. howmany is usually a
22 | small integer chosen randomly by Choronzon. The fuzzed bytestring
23 | must be returned to the caller of this function.
24 | '''
25 | return data
26 |
27 |
28 | class QuotedTextualNumberMutator(Mutator):
29 | '''
30 | Scans the bytestring in order to find numbers that are inside quotes.
31 | If there is such a number in the bytestring, it replaces it with a
32 | value from 0 to 0xFFFFFFFF.
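A simplified take on the same idea (hypothetical standalone Python 3; the mutator below fuzzes only a few randomly chosen matches rather than all of them):

```python
import random
import re

random.seed(7)  # deterministic for the example

data = '<node width="640" height="480"/>'
# Replace every quoted decimal number with a random 32-bit value.
fuzzed = re.sub(r'"\d+"',
                lambda m: '"%d"' % random.randint(0, 0xFFFFFFFF),
                data)
```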
33 | '''
34 | def __init__(self):
35 | super(QuotedTextualNumberMutator, self).__init__()
36 |
37 | def _coinflip(self, probability):
38 | ''' returns true with probability 1/(probability + 1) '''
39 | return random.randint(0, probability) == 0
40 |
41 | def mutate(self, data, attribs=1):
42 | pattern = re.compile('\"\d+\"')
43 | fuzzed = ''
44 | to_be_fuzzed = []
45 | matched = []
46 |
47 | for match in pattern.finditer(data):
48 | matched.append(match.span())
49 |
50 | if len(matched) == 0 or attribs == 0:
51 | return data
52 |
53 | if len(matched) < attribs:
54 | attribs = len(matched)
55 |
56 | # first, choose randomly which of the matched patterns will be fuzzed
57 | for _ in xrange(attribs):
58 | target = random.choice(matched)
59 | to_be_fuzzed.append(target)
60 | matched.remove(target)
61 |
62 | # change the matched patterns back to front; otherwise the
63 | # indices in to_be_fuzzed would need to be recalculated in
64 | # every iteration (sort, since random.choice picks in any order)
65 | to_be_fuzzed.sort(reverse=True)
66 | for start, end in to_be_fuzzed:
67 | fuzzed = '%s\"%d\"%s' % (data[:start],
68 | random.randint(0, 0xFFFFFFFF),
69 | data[end:])
70 | data = fuzzed
71 | return data
72 |
73 |
74 | class RemoveLines(Mutator):
75 | '''
76 | Removes a number of lines.
77 | '''
78 | def __init__(self):
79 | super(RemoveLines, self).__init__()
80 |
81 | def mutate(self, data, to_be_removed=1):
82 | lines = data.split('\n')
83 |
84 | if len(lines) < to_be_removed:
85 | return ''
86 |
87 | for _ in xrange(to_be_removed):
88 | line = random.choice(lines)
89 | lines.remove(line)
90 |
91 | return '\n'.join(lines)
92 |
93 |
94 | class RepeatLine(Mutator):
95 | '''
96 | Duplicates a line.
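The line-oriented mutators share this split/modify/join pattern; a sketch (hypothetical standalone Python 3):

```python
import random

random.seed(1)
data = 'alpha\nbeta\ngamma'
lines = data.split('\n')

# Duplicate one randomly chosen line in place.
index = random.randint(0, len(lines) - 1)
lines.insert(index, lines[index])
fuzzed = '\n'.join(lines)
```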
97 | '''
98 | def __init__(self):
99 | super(RepeatLine, self).__init__()
100 |
101 | def mutate(self, data, repeat=1):
102 | lines = data.split('\n')
103 |
104 | if len(lines) < 1:
105 | return data
106 |
107 | index = random.randint(0, len(lines) - 1)
108 | target_line = lines[index]
109 |
110 | for _ in xrange(repeat):
111 | lines.insert(index, target_line)
112 |
113 | return '\n'.join(lines)
114 |
115 |
116 | class SwapLines(Mutator):
117 | '''
118 | Grabs two lines and swaps them.
119 | '''
120 | def __init__(self):
121 | super(SwapLines, self).__init__()
122 |
123 | def mutate(self, data, _=1):
124 | lines = data.split('\n')
125 | if len(lines) < 2:
126 | return data
127 |
128 | index1 = random.randint(0, len(lines) - 1)
129 | index2 = random.randint(0, len(lines) - 1)
130 |
131 | tmp = lines[index1]
132 | lines[index1] = lines[index2]
133 | lines[index2] = tmp
134 |
135 | return '\n'.join(lines)
136 |
137 |
138 | class SwapAdjacentLines(Mutator):
139 | '''
140 | Swap two adjacent lines.
141 | '''
142 | def __init__(self):
143 | super(SwapAdjacentLines, self).__init__()
144 |
145 | def mutate(self, data, howmany=1):
146 | lines = data.split('\n')
147 | if len(lines) < 2:
148 | return data
149 |
150 | for _ in xrange(howmany):
151 | index = random.randint(0, len(lines) - 2)
152 | tmp = lines[index]
153 | lines[index] = lines[index + 1]
154 | lines[index + 1] = tmp
155 |
156 | return '\n'.join(lines)
157 |
158 |
159 | class PurgeMutator(Mutator):
160 | '''
161 | Deletes everything.
162 | '''
163 | def __init__(self):
164 | super(PurgeMutator, self).__init__()
165 |
166 | def mutate(self, data, _=0):
167 | return ''
168 |
169 |
170 | class SwapByte(Mutator):
171 | '''
172 | Grabs two bytes randomly and swaps them.
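The slicing approach can be sketched like this (hypothetical standalone Python 3, using bytes instead of the str bytestrings used here):

```python
import random

random.seed(3)
data = b'ABCDEF'

# Pick two distinct positions and rebuild the buffer with the
# bytes at those positions exchanged.
i, j = sorted(random.sample(range(len(data)), 2))
fuzzed = (data[:i] + data[j:j + 1] +
          data[i + 1:j] + data[i:i + 1] +
          data[j + 1:])
```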
173 | '''
174 | def __init__(self):
175 | super(SwapByte, self).__init__()
176 |
177 | def mutate(self, data, _=2):
178 | fuzzed = ''
179 |
180 | if len(data) < 2:
181 | return data
182 |
183 | rnd1 = random.randint(0, len(data) - 1)
184 | if rnd1 >= 1:
185 | rnd2 = random.randint(0, rnd1 - 1)
186 | else:
187 | rnd2 = random.randint(rnd1 + 1, len(data) - 1)
188 |
189 | min_rnd = min(rnd1, rnd2)
190 | max_rnd = max(rnd1, rnd2)
191 |
192 | byte1 = data[min_rnd]
193 | byte2 = data[max_rnd]
194 |
195 | fuzzed = data[:min_rnd]
196 | fuzzed += byte2
197 | fuzzed += data[min_rnd + 1:max_rnd]
198 | fuzzed += byte1
199 | fuzzed += data[max_rnd + 1:]
200 |
201 | return fuzzed
202 |
203 |
204 | class SwapWord(Mutator):
205 | '''
206 | Grabs two words and swaps them.
207 | '''
208 | def __init__(self):
209 | super(SwapWord, self).__init__()
210 |
211 | def mutate(self, data, _=4):
212 | fuzzed = ''
213 | if len(data) < 4:
214 | return data
215 |
216 | rnd1 = random.randint(0, len(data) - 2)
217 |
218 | if rnd1 >= 2:
219 | rnd2 = random.randint(0, rnd1 - 2)
220 | elif rnd1 + 2 <= len(data) - 2:
221 | rnd2 = random.randint(rnd1 + 2, len(data) - 2)
222 | else:
223 | return data
224 |
225 | min_rnd = min(rnd1, rnd2)
226 | max_rnd = max(rnd1, rnd2)
227 |
228 | word1 = data[min_rnd:min_rnd + 2]
229 | word2 = data[max_rnd:max_rnd + 2]
230 |
231 | # place each word at the other's position to actually swap them
232 | fuzzed = data[:min_rnd]
233 | fuzzed += word2
234 | fuzzed += data[min_rnd + 2:max_rnd]
235 | fuzzed += word1
236 | fuzzed += data[max_rnd + 2:]
237 |
238 | return fuzzed
239 |
240 |
241 | class ByteNullifier(Mutator):
242 | '''
243 | Replaces one (or more) bytes of the bytestring with \x00.
244 | ''' 245 | def __init__(self): 246 | super(ByteNullifier, self).__init__() 247 | 248 | def mutate(self, data, _=1): 249 | fuzzed = '' 250 | if len(data) == 0: 251 | return data 252 | index = random.randint(0, len(data) - 1) 253 | 254 | fuzzed = '%s\x00%s' % (data[:index], data[index + 1:]) 255 | return fuzzed 256 | 257 | 258 | class IncreaseByOneMutator(Mutator): 259 | ''' 260 | Increases the value of one (or more) byte(s) by one. 261 | ''' 262 | def __init__(self): 263 | super(IncreaseByOneMutator, self).__init__() 264 | 265 | def mutate(self, data, howmany=1): 266 | if len(data) == 0: 267 | return data 268 | 269 | if len(data) < howmany: 270 | howmany = random.randint(1, len(data)) 271 | 272 | fuzzed = data 273 | 274 | for _ in xrange(howmany): 275 | index = random.randint(0, len(data) - 1) 276 | if ord(data[index]) != 0xFF: 277 | fuzzed = '%s%c%s' % ( 278 | data[:index], 279 | ord(data[index]) + 1, 280 | data[index + 1:] 281 | ) 282 | else: 283 | fuzzed = '%s\x00%s' % ( 284 | data[:index], 285 | data[index + 1:] 286 | ) 287 | 288 | data = fuzzed 289 | 290 | return fuzzed 291 | 292 | 293 | class DecreaseByOneMutator(Mutator): 294 | ''' 295 | Decreases the value of one (or more) byte(s) by one. 296 | ''' 297 | def __init__(self): 298 | super(DecreaseByOneMutator, self).__init__() 299 | 300 | def mutate(self, data, howmany=1): 301 | if len(data) == 0: 302 | return data 303 | 304 | if len(data) < howmany: 305 | howmany = random.randint(0, len(data) - 1) 306 | 307 | fuzzed = data 308 | for _ in xrange(howmany): 309 | index = random.randint(0, len(data) - 1) 310 | if ord(data[index]) != 0: 311 | fuzzed = '%s%c%s' % ( 312 | data[:index], 313 | ord(data[index]) - 1, 314 | data[index + 1:] 315 | ) 316 | else: 317 | fuzzed = '%s\xFF%s' % ( 318 | data[:index], 319 | data[index + 1:] 320 | ) 321 | data = fuzzed 322 | return fuzzed 323 | 324 | 325 | class ProgressiveIncreaseMutator(Mutator): 326 | ''' 327 | Increases the value of many consecutive bytes progressively. 
328 | Specifically, the first byte will be increased by one, the second by
329 | two, the third by three and so on.
330 | '''
331 | def __init__(self):
332 | super(ProgressiveIncreaseMutator, self).__init__()
333 |
334 | def mutate(self, data, howmany=8):
335 | if len(data) < howmany:
336 | return data
337 |
338 | index = random.randint(0, len(data) - howmany)
339 | buf = ''
340 | fuzzed = ''
341 |
342 | for addend, curr in enumerate(xrange(index, index + howmany), 1):
343 | # wrap around so that the result stays within one byte
344 | value = (ord(data[curr]) + addend) & 0xFF
345 | buf += chr(value)
346 |
347 | fuzzed = '%s%s%s' % (data[:index], buf, data[index + howmany:])
348 | return fuzzed
349 |
350 |
351 | class ProgressiveDecreaseMutator(Mutator):
352 | '''
353 | Decreases the value of many consecutive bytes progressively.
354 | Specifically, the first byte will be decreased by one, the second by
355 | two, the third by three and so on.
356 | '''
357 | def __init__(self):
358 | super(ProgressiveDecreaseMutator, self).__init__()
359 |
360 | def mutate(self, data, howmany=8):
361 | if len(data) < howmany:
362 | return data
363 | index = random.randint(0, len(data) - howmany)
364 | buf = ''
365 | fuzzed = ''
366 |
367 | for subtrahend, curr in enumerate(xrange(index, index + howmany), 1):
368 | # wrap around below zero so that the result stays
369 | # within one byte
370 | value = (ord(data[curr]) - subtrahend) & 0xFF
371 | buf += chr(value)
372 |
373 | fuzzed = '%s%s%s' % (data[:index], buf, data[index + howmany:])
374 | return fuzzed
375 |
376 |
377 | class SwapDword(Mutator):
378 | '''
379 | Grabs two dwords from the bytestring and swaps them.
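Word and dword swaps generalize to any chunk width; a sketch with aligned, non-overlapping chunks (hypothetical standalone Python 3):

```python
import random

random.seed(11)
data = b'AAAABBBBCCCCDDDD'
width = 4  # dword-sized chunks

# Choose two distinct aligned offsets and swap the chunks there.
offsets = [k * width for k in range(len(data) // width)]
i, j = sorted(random.sample(offsets, 2))
fuzzed = (data[:i] + data[j:j + width] +
          data[i + width:j] + data[i:i + width] +
          data[j + width:])
```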
380 | '''
381 | def __init__(self):
382 | super(SwapDword, self).__init__()
383 |
384 | def mutate(self, data, _=8):
385 | fuzzed = ''
386 |
387 | if len(data) < 8:
388 | return data
389 |
390 | rnd1 = random.randint(0, len(data) - 4)
391 |
392 | if rnd1 >= 4:
393 | rnd2 = random.randint(0, rnd1 - 4)
394 | elif rnd1 + 4 <= len(data) - 4:
395 | rnd2 = random.randint(rnd1 + 4, len(data) - 4)
396 | else:
397 | return data
398 |
399 | min_rnd = min(rnd1, rnd2)
400 | max_rnd = max(rnd1, rnd2)
401 |
402 | dword1 = data[min_rnd:min_rnd + 4]
403 | dword2 = data[max_rnd:max_rnd + 4]
404 |
405 | # place each dword at the other's position to actually swap them
406 | fuzzed = data[:min_rnd]
407 | fuzzed += dword2
408 | fuzzed += data[min_rnd + 4:max_rnd]
409 | fuzzed += dword1
410 | fuzzed += data[max_rnd + 4:]
411 |
412 | return fuzzed
413 |
414 |
415 | class SetHighBitFromByte(Mutator):
416 | '''
417 | Sets the high bit of a byte.
418 | '''
419 | def __init__(self):
420 | super(SetHighBitFromByte, self).__init__()
421 |
422 | def mutate(self, data, _=1):
423 | fuzzed = data
424 |
425 | if len(data) > 0:
426 | index = random.randint(0, len(data) - 1)
427 | byte = ord(data[index])
428 | byte |= 0x80
429 | fuzzed = data[:index]
430 | fuzzed += chr(byte)
431 | fuzzed += data[index + 1:]
432 |
433 | return fuzzed
434 |
435 |
436 | class DuplicateByte(Mutator):
437 | '''
438 | Duplicates one (or more) randomly chosen bytes in the bytestring.
439 | '''
440 | def __init__(self):
441 | super(DuplicateByte, self).__init__()
442 |
443 | def mutate(self, data, howmany=1):
444 | fuzzed = data
445 |
446 | if len(data) == 0:
447 | return data
448 | howmany = min(howmany, len(data))
449 | for _ in xrange(howmany):
450 | index = random.randint(0, len(data) - 1)
451 | byte = data[index]
452 | fuzzed = data[:index]
453 | fuzzed += byte
454 | fuzzed += data[index:]
455 | data = fuzzed
456 | return fuzzed
457 |
458 |
459 | class RemoveByte(Mutator):
460 | '''
461 | Removes one (or more) randomly chosen bytes from the bytestring.
462 | '''
463 | def __init__(self):
464 | super(RemoveByte, self).__init__()
465 |
466 | def mutate(self, data, _=1):
467 | fuzzed = data
468 | if len(data):
469 | index = random.randint(0, len(data) - 1)
470 | fuzzed = data[:index]
471 | fuzzed += data[index + 1:]
472 | return fuzzed
473 |
474 |
475 | class RandomByteMutator(Mutator):
476 | '''
477 | The old-time classic random byte mutator.
478 | '''
479 | def __init__(self):
480 | super(RandomByteMutator, self).__init__()
481 |
482 | def mutate(self, data, howmany=5):
483 | if len(data) < 2:
484 | return data
485 | for _ in xrange(howmany):
486 | tmp = random.randint(0, len(data) - 1)
487 | data = '%s%c%s' % (
488 | data[:tmp],
489 | random.randint(0, 0xFF),
490 | data[tmp+1:]
491 | )
492 | return data
493 |
494 |
495 | class AddRandomData(Mutator):
496 | '''
497 | Adds some random bytes into the bytestring.
498 | '''
499 | def __init__(self):
500 | super(AddRandomData, self).__init__()
501 |
502 | def mutate(self, data, howmany=2):
503 | fuzzed = ''
504 | additional = ''
505 | for _ in xrange(howmany):
506 | additional += '%c' % (random.randint(0, 0xFF))
507 |
508 | index = random.randint(0, len(data))
509 |
510 | fuzzed = data[:index]
511 | fuzzed += additional
512 | fuzzed += data[index:]
513 |
514 | return fuzzed
515 |
516 |
517 | class NullMutator(Mutator):
518 | '''
519 | Does absolutely nothing.
520 | '''
521 | def __init__(self):
522 | super(NullMutator, self).__init__()
523 |
-------------------------------------------------------------------------------- /fuzzers/recombinators.py: --------------------------------------------------------------------------------
1 | '''
2 | Recombination is a feature of Choronzon that is not common amongst other
3 | fuzzers. A prerequisite of recombination is knowledge of the basic
4 | structure of the file format. Thus, recombinators are built upon
5 | the gene/chromosome system of Choronzon.
Recombinators use this knowledge, 6 | and try to alter the structure of the file instead of simply mutating their 7 | bits and bytes. 8 | ''' 9 | import random 10 | import copy 11 | import fuzzers.mutators as mutators 12 | 13 | class Recombinator(object): 14 | ''' 15 | This is the recombinator's base class. 16 | ''' 17 | def __init__(self): 18 | pass 19 | 20 | def choose_genes(self, chr1, chr2): 21 | ''' 22 | Returns randomly one gene of each chromosome. 23 | ''' 24 | all1 = chr1.get_all_genes() 25 | all2 = chr2.get_all_genes() 26 | 27 | if len(all1) < 1 or len(all2) < 1: 28 | return None, None 29 | 30 | return ( 31 | random.choice(all1), 32 | random.choice(all2) 33 | ) 34 | 35 | def mutate(self, gene, mutator=None): 36 | ''' 37 | Fuzz a gene. 38 | ''' 39 | if gene.anomaly(): 40 | return gene 41 | 42 | # If the mutator is not specified, use the random byte mutator. 43 | # This behaviour may change in the future. 44 | if mutator == None: 45 | mutator = mutators.RandomByteMutator() 46 | 47 | gene.mutate(mutator) 48 | return gene 49 | 50 | def recombine(self, chr1, chr2, mutator=None): 51 | ''' 52 | The recombination should be implemented in this function. 53 | ''' 54 | return chr1, chr2 55 | 56 | class NullRecombinator(Recombinator): 57 | ''' 58 | Just fuzz one gene of each chromosome. Do not recombine. 59 | ''' 60 | def __init__(self): 61 | super(NullRecombinator, self).__init__() 62 | 63 | def recombine(self, chr1, chr2, mutator=None): 64 | gene1, gene2 = self.choose_genes(chr1, chr2) 65 | if gene1 == None or gene2 == None: 66 | return chr1, chr2 67 | 68 | self.mutate(gene1, mutator) 69 | self.mutate(gene2, mutator) 70 | 71 | return chr1, chr2 72 | 73 | class ChildrenSelector(Recombinator): 74 | ''' 75 | This class is a Selector, which means that implements only the 76 | choose_genes function. 
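Sketch of the non-root selection (hypothetical standalone Python 3; genes are reduced to strings, with the root genes known in advance):

```python
import random

random.seed(5)

# Hypothetical flattened gene tree of one chromosome.
all_genes = ['hdr', 'ihdr', 'idat', 'chunk1', 'chunk2']
roots = ['hdr', 'ihdr']  # top-level genes only

# Walk a random permutation and keep the first non-root gene.
child = next(g for g in random.sample(all_genes, len(all_genes))
             if g not in roots)
```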
77 | ''' 78 | def __init__(self): 79 | super(ChildrenSelector, self).__init__() 80 | 81 | def choose_genes(self, chr1, chr2): 82 | ''' 83 | Picks one non-root node from each chromosome/ 84 | ''' 85 | # get_all_genes returns all genes 86 | genes1 = chr1.get_all_genes() 87 | # get_genes returns only the root nodes 88 | parents1 = chr1.get_genes() 89 | child1 = None 90 | 91 | for random_gene in random.sample(genes1, len(genes1)): 92 | if random_gene not in parents1: 93 | child1 = random_gene 94 | break 95 | 96 | genes2 = chr2.get_all_genes() 97 | parents2 = chr2.get_genes() 98 | child2 = None 99 | 100 | for random_gene in random.sample(genes2, len(genes2)): 101 | if random_gene not in parents2: 102 | child2 = random_gene 103 | break 104 | 105 | return child1, child2 106 | 107 | 108 | class SimilarGeneSelector(Recombinator): 109 | ''' 110 | Selects similar genes from two chromosomes 111 | ''' 112 | 113 | def __init__(self): 114 | super(SimilarGeneSelector, self).__init__() 115 | 116 | def choose_genes(self, chr1, chr2): 117 | genes = chr1.get_all_genes() 118 | for gene1 in random.sample(genes, len(genes)): 119 | for gene2 in chr2.get_all_genes(): 120 | if gene2.is_equal(gene1): 121 | return gene1, gene2 122 | return None, None 123 | 124 | 125 | class ParentChildrenSwap(ChildrenSelector, Recombinator): 126 | ''' 127 | Changes the hierarchy (children - parent) for one gene 128 | in each chromosome. 129 | ''' 130 | def __init__(self): 131 | super(ParentChildrenSwap, self).__init__() 132 | 133 | def recombine(self, chr1, chr2, mutator=None): 134 | child1, child2 = self.choose_genes(chr1, chr2) 135 | if child1 == None or child2 == None: 136 | return chr1, chr2 137 | 138 | parent1 = chr1.find_parent(child1) 139 | # index-th position of the parent's children list points to 140 | # the selected child. 
keep this for later 141 | index = parent1.children.index(child1) 142 | 143 | siblings = parent1.children 144 | # move the children's ancestors to the parent 145 | parent1.children = child1.children 146 | # and set the children of the parent (siblings of the child) 147 | # as ancestors of the child 148 | child1.children = siblings 149 | child1.children[index] = parent1 150 | 151 | parent2 = chr2.find_parent(child2) 152 | index = parent2.children.index(child2) 153 | 154 | siblings = parent2.children 155 | parent2.children = child2.children 156 | child2.children = siblings 157 | child2.children[index] = parent2 158 | 159 | return chr1, chr2 160 | 161 | class ShuffleSiblings(ChildrenSelector, Recombinator): 162 | ''' 163 | Chooses two non-root nodes of each chromosome and shuffle them and 164 | their siblings. 165 | ''' 166 | def __init__(self): 167 | super(ShuffleSiblings, self).__init__() 168 | 169 | def recombine(self, chr1, chr2, mutator=None): 170 | child1, child2 = self.choose_genes(chr1, chr2) 171 | if child1 == None or child2 == None: 172 | return chr1, chr2 173 | 174 | parent1 = chr1.find_parent(child1) 175 | parent2 = chr2.find_parent(child2) 176 | 177 | random.shuffle(parent1.children) 178 | random.shuffle(parent2.children) 179 | 180 | return chr1, chr2 181 | 182 | 183 | class RandomGeneSwapRecombinator(Recombinator): 184 | ''' 185 | Chooses a gene randomly from each chromosome and swaps them. 
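The swap itself can be sketched as follows (hypothetical standalone Python 3; chromosomes are reduced to flat lists of gene payloads):

```python
import copy
import random

random.seed(2)

# Hypothetical chromosomes holding gene payloads.
chr1 = [b'IHDR', b'IDAT', b'IEND']
chr2 = [b'fmt ', b'data']

i = random.randrange(len(chr1))
j = random.randrange(len(chr2))

# Copy before swapping so neither chromosome aliases the other's gene.
gene1 = copy.deepcopy(chr1[i])
gene2 = copy.deepcopy(chr2[j])
chr1[i], chr2[j] = gene2, gene1
```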
186 | '''
187 | def __init__(self):
188 | super(RandomGeneSwapRecombinator, self).__init__()
189 |
190 | def recombine(self, chr1, chr2, mutator=None):
191 | old_gene1, old_gene2 = self.choose_genes(chr1, chr2)
192 | if old_gene1 == None or old_gene2 == None:
193 | return chr1, chr2
194 |
195 | # deep copy so the mutated genes do not alias the originals
196 | gene1 = copy.deepcopy(old_gene1)
197 | gene2 = copy.deepcopy(old_gene2)
198 |
199 | gene1 = self.mutate(gene1, mutator)
200 | gene2 = self.mutate(gene2, mutator)
201 |
202 | chr2.replace_gene(old_gene2, gene1)
203 | chr1.replace_gene(old_gene1, gene2)
204 |
205 | return chr1, chr2
206 |
207 | class RemoveGeneRecombinator(Recombinator):
208 | '''
209 | Removes one randomly chosen gene from each chromosome.
210 | '''
211 | def __init__(self):
212 | super(RemoveGeneRecombinator, self).__init__()
213 |
214 | def recombine(self, chr1, chr2, mutator=None):
215 | old_gene1, old_gene2 = self.choose_genes(chr1, chr2)
216 | if old_gene1 == None or old_gene2 == None:
217 | return chr1, chr2
218 |
219 | chr1.remove_gene(old_gene1)
220 | chr2.remove_gene(old_gene2)
221 |
222 | return chr1, chr2
223 |
224 | class DuplicateGeneRecombinator(Recombinator):
225 | '''
226 | Duplicates and mutates one gene from each chromosome.
227 | '''
228 | def __init__(self):
229 | super(DuplicateGeneRecombinator, self).__init__()
230 |
231 | def recombine(self, chr1, chr2, mutator=None):
232 | old_gene1, old_gene2 = self.choose_genes(chr1, chr2)
233 | if old_gene1 == None or old_gene2 == None:
234 | return chr1, chr2
235 |
236 | gene1 = copy.deepcopy(old_gene1)
237 | gene2 = copy.deepcopy(old_gene2)
238 |
239 | self.mutate(gene1, mutator)
240 | self.mutate(gene2, mutator)
241 |
242 | # insert each mutated copy next to the gene it was cloned from
243 | parent1 = chr1.find_parent(old_gene1)
244 | if parent1 == None:
245 | index = chr1.genes.index(old_gene1)
246 | chr1.genes.insert(index, gene1)
247 | else:
248 | parent1.add_child(gene1)
249 |
250 | parent2 = chr2.find_parent(old_gene2)
251 | if parent2 == None:
252 | index = chr2.genes.index(old_gene2)
253 | chr2.genes.insert(index, gene2)
254 | else:
255 | parent2.add_child(gene2)
256 |
257 | return chr1, chr2
258 |
259 | class AdditiveSimilarGeneCrossOver(SimilarGeneSelector, Recombinator):
260 | '''
261 | Finds one similar gene in each chromosome. Inserts each one
262 | ''' 263 | def __init__(self): 264 | super(AdditiveSimilarGeneCrossOver, self).__init__() 265 | 266 | def recombine(self, chr1, chr2, mutator=None): 267 | old_gene1, old_gene2 = self.choose_genes(chr1, chr2) 268 | if old_gene1 == None or old_gene2 == None: 269 | return chr1, chr2 270 | 271 | gene1 = copy.deepcopy(old_gene1) 272 | gene2 = copy.deepcopy(old_gene2) 273 | 274 | self.mutate(gene1, mutator) 275 | self.mutate(gene2, mutator) 276 | 277 | parent1 = chr1.find_parent(old_gene1) 278 | if parent1 == None: 279 | index = chr1.genes.index(old_gene1) 280 | chr1.genes.insert(index, gene2) 281 | else: 282 | parent1.add_child(gene2) 283 | 284 | parent2 = chr2.find_parent(old_gene2) 285 | if parent2 == None: 286 | index = chr2.genes.index(old_gene2) 287 | chr2.genes.insert(index, gene1) 288 | else: 289 | parent2.add_child(gene1) 290 | 291 | return chr1, chr2 292 | 293 | 294 | class SimilarGeneSwapRecombinator(SimilarGeneSelector, 295 | RandomGeneSwapRecombinator): 296 | ''' 297 | Chooses one gene randomly from one parent and then it searches if 298 | there's a identical one in the other parent. If this is true, 299 | it swaps them. Otherwise, it swaps two genes randomly. 300 | ''' 301 | def __init__(self): 302 | super(SimilarGeneSwapRecombinator, self).__init__() 303 | 304 | class RandomGeneInsertRecombinator(Recombinator): 305 | ''' 306 | Chooses a Gene randomly from one chromosome and inserts it 307 | to the other (randomly again). It does the same to the 308 | other chromosome. 
309 | '''
310 | def __init__(self):
311 | super(RandomGeneInsertRecombinator, self).__init__()
312 |
313 | def recombine(self, chr1, chr2, mutator=None):
314 | old_gene1, old_gene2 = self.choose_genes(chr1, chr2)
315 | if old_gene1 == None or old_gene2 == None:
316 | return chr1, chr2
317 | gene1, gene2 = copy.deepcopy(old_gene1), copy.deepcopy(old_gene2)
318 |
319 | gene1 = self.mutate(gene1, mutator)
320 | gene2 = self.mutate(gene2, mutator)
321 |
322 | parent1 = chr1.find_parent(old_gene1)
323 | if parent1 == None:
324 | # could not find the parent of gene1; it is a root node,
325 | # so just insert the new gene after the chosen one.
326 | index = chr1.genes.index(old_gene1)
327 | chr1.genes.insert(index, gene2)
328 | else:
329 | # insert the fuzzed gene from chr2 as a child
330 | parent1.add_child(gene2)
331 |
332 | parent2 = chr2.find_parent(old_gene2)
333 | if parent2 == None:
334 | index = chr2.genes.index(old_gene2)
335 | chr2.genes.insert(index, gene1)
336 | else:
337 | parent2.add_child(gene1)
338 |
339 | return chr1, chr2
340 |
341 | class SimilarGeneInsertRecombinator(SimilarGeneSelector,
342 | RandomGeneInsertRecombinator):
343 | '''
344 | Selects similar genes from the two chromosomes and
345 | inserts each one into the other chromosome.
346 | '''
347 |
348 | def __init__(self):
349 | super(SimilarGeneInsertRecombinator, self).__init__()
350 |
-------------------------------------------------------------------------------- /fuzzers/strategy.py: --------------------------------------------------------------------------------
1 | import random
2 | import bisect
3 | import fuzzers.recombinators as recombinators
4 | import fuzzers.mutators as mutators
5 | from configuration import Configuration
6 | import world
7 |
8 | class WeightedSelector(world.NaiveSelector):
9 | '''
10 | This class implements a selection algorithm. Each item can be
11 | assigned a weight in order to control how likely it is
12 | to be selected.
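Weighted choice can also be done with cumulative sums and bisect, which is the idea the Lottery class further down builds on; a sketch (hypothetical standalone Python 3):

```python
import bisect
import random

random.seed(9)

# Hypothetical players; a player's score is its number of tickets.
scores = {'a': 1, 'b': 3, 'c': 6}
players = list(scores)

# Cumulative ticket boundaries: [1, 4, 10] for the scores above.
cumulative = []
total = 0
for player in players:
    total += scores[player]
    cumulative.append(total)

# Draw a ticket and map it back to its owner.
ticket = random.randrange(total)  # 0 .. 9
winner = players[bisect.bisect(cumulative, ticket)]
```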
13 | ''' 14 | def __init__(self, objlist, initial=0): 15 | super(WeightedSelector, self).__init__(objlist) 16 | 17 | def select(self): 18 | ''' 19 | Selects a random object from the object list 20 | based on the weights that have been set. 21 | ''' 22 | if not self.objdict.keys(): 23 | return None 24 | 25 | while True: 26 | objkey = random.choice(self.objdict.keys()) 27 | if self.unfair_coinflip(self.objdict[objkey]): 28 | return objkey 29 | 30 | def set_weight(self, key, weight): 31 | self.objdict[key] = weight 32 | 33 | def get_weight(self, key): 34 | return self.objdict[key] 35 | 36 | class Lottery(object): 37 | ''' 38 | The Lottery class is responsible for randomly selecting a winner from a 39 | pool of players, by randomly choosing a ticket from a pool of tickets. 40 | The number of tickets in the pool of tickets is the sum of the scores 41 | of the individual players. This means that the player's score is 42 | actually the number of tickets this player has "bought". 43 | ''' 44 | players = None 45 | tickets = None 46 | ticket_number = None 47 | 48 | def __init__(self): 49 | self.players = [] 50 | self.tickets = [] 51 | self.ticket_number = 0x0 52 | 53 | def join(self, player, score): 54 | self.players.append(player) 55 | self.tickets.append( 56 | self.ticket_number 57 | ) 58 | self.ticket_number += score 59 | 60 | def choose_ticket(self): 61 | return random.randrange( 62 | 0, self.ticket_number 63 | ) 64 | 65 | def choose_winner(self): 66 | ticket = self.choose_ticket() 67 | index = bisect.bisect( 68 | self.tickets, 69 | ticket 70 | ) 71 | return self.players[index-1] 72 | 73 | @classmethod 74 | def run(cls, players): 75 | ''' 76 | players is an iterable of dicts, each carrying a 'score' key. 77 | ''' 78 | obj = cls() 79 | for player in players: 80 | obj.join(player, player['score']) 81 | return obj.choose_winner() 82 | 83 | class FuzzingStrategy(object): 84 | configuration = None 85 | recombinators = None 86 | mutators = None 87 | candidates = None 88 | 89 | def __init__(self): 90 | self.configuration 
= Configuration() 91 | self.initialize_recombinators() 92 | self.initialize_mutators() 93 | self.generate_candidates() 94 | 95 | def initialize_recombinators(self): 96 | self.recombinators = dict() 97 | for recombinator in self.configuration['Recombinators']: 98 | self.recombinators[recombinator] = getattr( 99 | recombinators, 100 | recombinator 101 | )() 102 | 103 | def initialize_mutators(self): 104 | self.mutators = dict() 105 | for mutator in self.configuration['Mutators']: 106 | self.mutators[mutator] = getattr( 107 | mutators, 108 | mutator 109 | )() 110 | 111 | def generate_candidates(self): 112 | self.candidates = dict() 113 | for rname, recombinator in self.recombinators.iteritems(): 114 | for mname, mutator in self.mutators.iteritems(): 115 | cid = '%s_%s' % (rname, mname) 116 | candidate = dict() 117 | candidate['cid'] = cid 118 | candidate['recombinator'] = recombinator 119 | candidate['mutator'] = mutator 120 | candidate['score'] = 0x1 121 | self.candidates[cid] = candidate 122 | 123 | def good(self, cid, score=1): 124 | this = self.candidates[cid]['score'] 125 | self.candidates[cid]['score'] = max(this, score) 126 | 127 | def bad(self, cid, score=1): 128 | if self.candidates[cid]['score'] > 1: 129 | self.candidates[cid]['score'] -= score 130 | 131 | def select_candidate(self): 132 | return Lottery.run(self.candidates.values()) 133 | 134 | def recombine(self, male, female): 135 | candidate = self.select_candidate() 136 | 137 | # XXX: implement a system that will be able to pretty-print the scores 138 | # of the mutators/recombinators. 
139 | 140 | mutator = candidate['mutator'] 141 | recombinator = candidate['recombinator'] 142 | 143 | son, daughter = recombinator.recombine( 144 | male, 145 | female, 146 | mutator 147 | ) 148 | 149 | son.fuzzer = candidate['cid'] 150 | daughter.fuzzer = candidate['cid'] 151 | 152 | return son, daughter 153 | -------------------------------------------------------------------------------- /settings/__init__.py: -------------------------------------------------------------------------------- 1 | import platform 2 | if platform.system() == 'Linux': 3 | from system import * 4 | elif platform.system() == 'Windows': 5 | from winsystem import * 6 | -------------------------------------------------------------------------------- /settings/iview.py: -------------------------------------------------------------------------------- 1 | # Name of the campaign 2 | CampaignName = 'iview-campaign' 3 | 4 | # Name of the parser module. The parser module must be 5 | # in the chromosome/parsers directory. 6 | Parser = 'PNG' 7 | 8 | # The path of the initial corpus 9 | InitialPopulation = 'C:\\tmp\\png' 10 | 11 | # The fitness algorithms that will be used by Choronzon 12 | # and the weight of each one. Currently, two algorithms 13 | # are implemented, the BasicBlockCoverage and CodeCommonality. 14 | FitnessAlgorithms = { 15 | 'BasicBlockCoverage': 0.5, 16 | 'CodeCommonality': 0.3 17 | } 18 | 19 | # A tuple with the Recombinators that will be used during the fuzzing. 20 | # Users are encouraged to comment out the algorithms that they think 21 | # are not effective when fuzzing a specific target format. However, 22 | # Choronzon has an internal evaluation system that favours the more 23 | # effective algorithms. 
24 | Recombinators = ( 25 | 'AdditiveSimilarGeneCrossOver', 26 | 'DuplicateGeneRecombinator', 27 | 'RemoveGeneRecombinator', 28 | 'RemoveGeneRecombinator', 29 | 'ShuffleSiblings', 30 | 'ParentChildrenSwap', 31 | 'SimilarGeneSwapRecombinator', 32 | 'RandomGeneSwapRecombinator', 33 | 'RandomGeneInsertRecombinator', 34 | ) 35 | 36 | # A tuple with the Mutators that will be used during the fuzzing. 37 | Mutators = ( 38 | 'RandomByteMutator', 39 | 'AddRandomData', 40 | 'RandomByteMutator', 41 | 'RemoveByte', 42 | 'SwapAdjacentLines', 43 | 'SwapLines', 44 | 'RepeatLine', 45 | 'RemoveLines', 46 | 'QuotedTextualNumberMutator', 47 | 'PurgeMutator', 48 | 'SwapWord', 49 | 'SwapDword', 50 | ) 51 | 52 | # If KeepGenerations is True, the seed files of each generation will be stored 53 | # in the campaign directory. Keep in mind, though, that this may lead to running 54 | # out of free space if the fuzzer runs for a long time. 55 | KeepGenerations = True 56 | 57 | # The name of the disassembler module that will be used. 58 | # Currently, only IDA is supported. 59 | Disassembler = 'IDADisassembler' 60 | DisassemblerPath = 'C:\\Program Files (x86)\\IDA 6.9' 61 | 62 | # The command that will be executed to test the target application. 63 | # Note that %s will be replaced by the fuzzed file. 64 | Command = '\"C:\\Program Files\\IrfanView\\i_view64.exe\" %s' 65 | 66 | # A tuple with the modules that will be instrumented in order to collect 67 | # stats to calculate the fitness. The full path of each module is required. 68 | # Please note that Whitelist must be a tuple even if there is only one module. 
69 | Whitelist = ('C:\\Program Files\\IrfanView\\i_view64.exe',) 70 | -------------------------------------------------------------------------------- /settings/pngcheck.py: -------------------------------------------------------------------------------- 1 | CampaignName = 'pngcheck-campaign' 2 | Parser = 'PNG' 3 | InitialPopulation = '/tmp/png/' 4 | 5 | FitnessAlgorithms = { 6 | 'BasicBlockCoverage': 0.6, 7 | 'CodeCommonality': 0.4 8 | } 9 | 10 | Recombinators = ( 11 | 'Recombinator', 12 | 'NullRecombinator', 13 | 'ChildrenSelector', 14 | 'SimilarGeneSelector', 15 | 'ParentChildrenSwap', 16 | 'ShuffleSiblings', 17 | 'RandomGeneSwapRecombinator', 18 | 'RemoveGeneRecombinator', 19 | 'DuplicateGeneRecombinator', 20 | 'AdditiveSimilarGeneCrossOver', 21 | 'SimilarGeneSwapRecombinator', 22 | 'RandomGeneInsertRecombinator', 23 | 'SimilarGeneInsertRecombinator' 24 | ) 25 | 26 | Mutators = ( 27 | 'QuotedTextualNumberMutator', 28 | 'RemoveLines', 29 | 'RepeatLine', 30 | 'SwapLines', 31 | 'SwapAdjacentLines', 32 | 'PurgeMutator', 33 | 'SwapByte', 34 | 'SwapWord', 35 | 'ByteNullifier', 36 | 'IncreaseByOneMutator', 37 | 'DecreaseByOneMutator', 38 | 'ProgressiveIncreaseMutator', 39 | 'ProgressiveDecreaseMutator', 40 | 'SwapDword', 41 | 'SetHighBitFromByte', 42 | 'DuplicateByte', 43 | 'RemoveByte', 44 | 'RandomByteMutator', 45 | 'AddRandomData', 46 | 'NullMutator' 47 | ) 48 | 49 | Disassembler = 'IDADisassembler' 50 | DisassemblerPath = '/home/user/ida-6.9' 51 | 52 | KeepGenerations = True 53 | 54 | # Pintool related settings 55 | Timeout = 10 56 | 57 | Command = '/usr/bin/pngcheck %s' 58 | Whitelist = ('/usr/bin/pngcheck',) 59 | -------------------------------------------------------------------------------- /settings/system.py: -------------------------------------------------------------------------------- 1 | ida_script = './disassembler/prepare.py' 2 | pintool = './analyzer/coverage/obj-intel64/coverage.so' 3 | 4 | 
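The `Command` settings above embed the fuzzed file via a `%s` placeholder. As a hedged sketch (the helper name `build_argv` is illustrative and not part of Choronzon), this is how such a template could be expanded and split into an argv list while honouring the double quotes used around paths containing spaces:

```python
import shlex

def build_argv(command_template, fuzzed_path):
    # Substitute the fuzzed file into the '%s' placeholder, then
    # split the result into an argv list. shlex keeps a quoted
    # executable path (e.g. "C:\Program Files\...") as one token.
    return shlex.split(command_template % fuzzed_path)

# e.g. with the pngcheck settings:
argv = build_argv('/usr/bin/pngcheck %s', '/tmp/png/seed.png')
# argv == ['/usr/bin/pngcheck', '/tmp/png/seed.png']
```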
-------------------------------------------------------------------------------- /settings/winsystem.py: -------------------------------------------------------------------------------- 1 | ida_script = '.\\disassembler\\prepare.py' 2 | pintool = '.\\analyzer\\coverage\\obj-intel64\\coverage.dll' 3 | -------------------------------------------------------------------------------- /tracer.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | 3 | import os 4 | import struct 5 | import sortedcontainers as sc 6 | import settings 7 | 8 | import campaign 9 | import disassembler 10 | import analyzer 11 | import configuration 12 | import blockcache as bcache 13 | 14 | class Trace(object): 15 | images = None 16 | # bbls_per_image = None 17 | set_per_image = None 18 | functions = None 19 | trace = None 20 | total = None 21 | has_crashed = None 22 | 23 | def __init__(self): 24 | self.has_crashed = False 25 | self.images = [] 26 | self.total = 0x0 27 | #self.bbls_per_image = {} 28 | self.set_per_image = {} 29 | 30 | def add_image(self, image): 31 | ''' 32 | Adds a new image name into the trace. 33 | ''' 34 | self.images.append(image) 35 | #self.bbls_per_image[image] = [] 36 | self.set_per_image[image] = sc.SortedSet() 37 | 38 | def add_bbl(self, image, bbl): 39 | ''' 40 | Adds a new basic block into the trace. 41 | ''' 42 | # The bbl given as input in this function is 43 | # taken from the block cache. Technically, this means 44 | # that it corresponds to the basic blocks of IDA. 45 | #self.bbls_per_image[image].append(bbl) 46 | self.set_per_image[image].add(bbl) 47 | self.total += 1 48 | 49 | def get_total(self): 50 | ''' 51 | Returns the total number of basic blocks in the trace. 52 | ''' 53 | return self.total 54 | 55 | def get_unique_total(self): 56 | ''' 57 | Returns the total number of unique basic blocks 58 | of all images, hit by the current trace. 
59 | ''' 60 | count = 0x0 61 | for img in self.set_per_image: 62 | count += len(self.set_per_image[img]) 63 | return count 64 | 65 | def get_difference_per_image(self, trace): 66 | ''' 67 | This function yields tuples of the image name 68 | and the set difference between this trace object 69 | and the trace object given as argument. 70 | ''' 71 | assert len(self.set_per_image) == len(trace.set_per_image) 72 | for img in self.images: 73 | this = self.set_per_image[img] 74 | other = trace.set_per_image[img] 75 | yield img, this - other 76 | 77 | def get_similarity(self, trace): 78 | ''' 79 | Returns the ratio of basic blocks of this trace that are missing 80 | from the given trace. A return value of 0.0 means every basic block 81 | of this trace also appears in the other; 1.0 means they share none. 82 | ''' 83 | assert len(self.set_per_image) == len(trace.set_per_image) 84 | faults = 0x0 85 | for img in self.images: 86 | this = self.set_per_image[img] 87 | other = trace.set_per_image[img] 88 | faults += len(this - other) 89 | return faults / float(self.get_unique_total()) 90 | 91 | def update(self, trace): 92 | ''' 93 | Updates the trace. 94 | ''' 95 | for img in trace.images: 96 | if img not in self.images: 97 | self.add_image(img) 98 | self.set_per_image[img].update(trace.set_per_image[img]) 99 | self.total += trace.total 100 | 101 | class Tracer(object): 102 | cache = None 103 | campaign = None 104 | analyzer = None 105 | disassembler = None 106 | configuration = None 107 | 108 | def __init__(self, configfile=None): 109 | self.cache = {} 110 | 111 | print '[+] Loading configuration...' 112 | self.configuration = configuration.Configuration(configfile) 113 | 114 | print '[+] Loading Disassembler module...' 115 | self.disassembler = getattr( 116 | disassembler, 117 | self.configuration['Disassembler'] 118 | )(self.configuration['DisassemblerPath']) 119 | 120 | print '[+] Loading Analyzer module...' 
121 | if 'Timeout' not in self.configuration: 122 | timeout = 20 123 | else: 124 | timeout = self.configuration['Timeout'] 125 | self.analyzer = analyzer.Coverage(settings.pintool, timeout) 126 | 127 | self.initialize_campaign() 128 | print '[+] Tracer module is initialized.' 129 | 130 | def initialize_campaign(self): 131 | ''' 132 | Initialize a new tracer campaign. 133 | ''' 134 | print '[+] Initializing campaign...' 135 | self.campaign = campaign.Campaign( 136 | self.configuration['CampaignName'] 137 | ) 138 | 139 | print '[+] Parsing whitelist...' 140 | for target in self.configuration['Whitelist']: 141 | exe = self.campaign.add(target) 142 | print ' [-] Disassembling %s...' % (os.path.basename(exe)) 143 | self.disassemble(exe) 144 | 145 | def disassemble(self, exe): 146 | ''' 147 | Disassembles the binary and imports the basic blocks 148 | that were found into the BlockCache. 149 | ''' 150 | # dump the disassembly 151 | dmp = self.disassembler.disassemble( 152 | exe, 153 | output=self.campaign.campaign_dir 154 | ) 155 | 156 | self.cache[os.path.basename(exe)] = bcache.BlockCache.parse_idmp(dmp) 157 | 158 | def parse_trace_file(self, trace_file): 159 | ''' 160 | Parses the format of the trace file (actually a named pipe). 161 | 162 | The format of the trace file is the following: 163 | [number of images, 1 byte ] 164 | IMAGE SECTION 165 | [ 166 | [image name length, 2 bytes] 167 | [image name, variable length] 168 | ... 169 | ] 170 | BASIC BLOCK SECTION 171 | [ 172 | [image number, 8 bytes] 173 | [basic block offset, 8 bytes] 174 | ... 175 | ] 176 | 177 | Note that the image section must come before the basic block section. 178 | If the image number attribute of a chunk contained 179 | in the basic block section is 0xffffffffffffffff, then a 180 | signal (on Linux) or an exception (on Windows) has been 181 | raised in the monitored application. 
182 | ''' 183 | trace = Trace() 184 | nimg = 0x0 185 | with open(trace_file, 'rb') as fin: 186 | nimg = ord(fin.read(1)) 187 | # read the image section 188 | for _ in xrange(nimg): 189 | imgname_sz, = struct.unpack(' self.max_metrics[name]: 197 | self.max_metrics[name] = metrics[name] 198 | 199 | if name not in self.min_metrics \ 200 | or metrics[name] < self.min_metrics[name]: 201 | self.min_metrics[name] = metrics[name] 202 | 203 | def extend(self, dct): 204 | ''' 205 | Extends the chromosomes in the current generation. 206 | ''' 207 | for uid, chromo in dct.iteritems(): 208 | self.set_chromosome(uid, chromo) 209 | 210 | 211 | class Population(object): 212 | ''' 213 | A Population consists of two generations: normally, one that contains 214 | the elite chromosomes and another one with the fuzzed seed files. Also, 215 | this class provides operations for manipulating the chromosomes of those 216 | two generations. Using the API, you're able to retrieve/delete/add 217 | chromosomes. 218 | ''' 219 | epoch = None 220 | previous = None 221 | current = None 222 | 223 | def __init__(self, cache, epoch=0): 224 | self.cache = cache 225 | self.epoch = epoch 226 | self.previous = None 227 | self.current = Generation(self.epoch) 228 | 229 | # Setup image leaders and basic blocks leaders 230 | self.image_leaders = {} 231 | 232 | for image_name in self.cache: 233 | self.image_leaders[image_name] = {} 234 | bbl_leaders = {} 235 | for startea in self.cache[image_name].yield_bbls(): 236 | bbl_leaders[startea] = None 237 | self.image_leaders[image_name] = bbl_leaders 238 | 239 | def get_chromo_from_current(self, uid): 240 | ''' 241 | Returns a chromosome from the current generation which has uid 242 | equal to `uid'. If there is no such chromosome, returns None. 
243 | ''' 244 | try: 245 | chromo = self.current[uid] 246 | except KeyError: 247 | chromo = None 248 | return chromo 249 | 250 | def get_chromo_from_previous(self, uid): 251 | ''' 252 | Returns a chromosome from the previous generation which has uid 253 | equal to `uid'. If there is no such chromosome, returns None. 254 | ''' 255 | try: 256 | chromo = self.previous[uid] 257 | except KeyError: 258 | chromo = None 259 | return chromo 260 | 261 | def get_all_from_current(self): 262 | ''' 263 | Returns a list of all chromosomes inside the 264 | current generation. 265 | ''' 266 | return list(self.current.get_all()) 267 | 268 | def get_all_from_previous(self): 269 | ''' 270 | Returns a list of all chromosomes inside the 271 | previous generation. 272 | ''' 273 | return list(self.previous.get_all()) 274 | 275 | def get_couple_from_current(self, different=True): 276 | ''' 277 | Yields tuples with two random chromosomes 278 | from the current generation of the population. 279 | 280 | If different is set to True, a tuple will not 281 | contain the same chromosome twice. 282 | ''' 283 | done = None 284 | 285 | while not done: 286 | male = self.current.select() 287 | female = self.current.select() 288 | 289 | if different: 290 | while female == male and female != None: 291 | female = self.current.select() 292 | 293 | if male == None or female == None: 294 | done = True 295 | else: 296 | yield male, female 297 | 298 | 299 | def get_couple_from_previous(self, different=True): 300 | ''' 301 | Same as get_couple_from_current() but the 302 | chromosomes selected come from the previous 303 | generation. 
304 | ''' 305 | 306 | done = None 307 | 308 | while not done: 309 | male = self.previous.select() 310 | female = self.previous.select() 311 | 312 | if different: 313 | while female == male and female != None: 314 | female = self.previous.select() 315 | 316 | if male == None or female == None: 317 | done = True 318 | else: 319 | yield male, female 320 | 321 | def new_epoch(self, newgen=None): 322 | ''' 323 | Changes the epoch and sets the current generation as previous. 324 | If newgen is set, it is assigned as the current generation. 325 | Otherwise, it creates a new one. 326 | Returns the current generation. 327 | ''' 328 | self.epoch += 1 329 | self.previous = self.current 330 | 331 | if newgen == None: 332 | self.current = Generation(self.epoch) 333 | else: 334 | self.current = newgen 335 | 336 | return self.current 337 | 338 | def does_exist(self, uid): 339 | exists = False 340 | if self.get_chromo_from_current(uid) != None \ 341 | or self.get_chromo_from_previous(uid) != None: 342 | exists = True 343 | return exists 344 | 345 | def set_fitness(self, uid, fitness): 346 | ''' 347 | Sets the fitness to a chromosome of the current generation. 348 | ''' 349 | self.current.set_fitness(uid, fitness) 350 | 351 | def set_previous_fitness(self, uid, fitness): 352 | ''' 353 | Set fitness to chromosome of the previous generation. 354 | ''' 355 | self.previous.set_fitness(uid, fitness) 356 | 357 | def delete_chromosome(self, uid): 358 | ''' 359 | Deletes a chromosome from the current generation. 360 | ''' 361 | self.current.delete(uid) 362 | 363 | def add_chromosome(self, chromo): 364 | ''' 365 | Adds a chromosome to the current generation. Notice that, if the 366 | uid already exists in the current generation, it does nothing. 367 | ''' 368 | if chromo.uid not in self.current: 369 | self.current[chromo.uid] = chromo 370 | 371 | def add_trace(self, uid, trace): 372 | ''' 373 | Adds a trace to the target chromosome. 
374 | ''' 375 | chromo = self.current[uid] 376 | chromo.trace = trace 377 | self.current[uid] = chromo 378 | self.current.trace.update(trace) 379 | 380 | def elitism(self): 381 | ''' 382 | Elitism is a selection process of genetic algorithms. 383 | 384 | Common genetic algorithms construct the new population just by 385 | retaining the chromosomes with the best fitness of the current 386 | generation and discarding the rest. 387 | 388 | Elitism (or elitist selection) is a different approach to selection. 389 | It retains the best individuals from the whole population. 390 | That means, if the parents (previous generation) have better fitness 391 | than the children (current generation), the children will be 392 | discarded but the parents will be retained. This indicates that the 393 | current generation was not valuable to the genetic algorithm. 394 | ''' 395 | # for each chromosome in the current generation 396 | for chromo in self.current.get_all(): 397 | 398 | # for each monitored image 399 | for image_name in chromo.trace.set_per_image.iterkeys(): 400 | 401 | # for each basic block explored in the run 402 | for bbl in chromo.trace.set_per_image[image_name]: 403 | 404 | # if there isn't any leader for this bbl 405 | # set the current chromo 406 | if self.image_leaders[image_name][bbl] == None: 407 | self.image_leaders[image_name][bbl] = chromo 408 | else: 409 | # pick the fittest chromosome for this specific bbl 410 | leader = self.image_leaders[image_name][bbl] 411 | 412 | # if the fitness of the current chromosome is better 413 | # than the fitness of the leader, replace it 414 | if leader.get_fitness() < chromo.get_fitness(): 415 | self.image_leaders[image_name][bbl] = chromo 416 | 417 | ### since we do not keep the number of times each 418 | ### bbl was hit, we compare the total number of 419 | ### basic blocks between the leader and the 
Enabling full trace 421 | ### logging did not seem worthwhile for this feature, 422 | ### due to the memory issues we already have. 423 | 424 | elif leader.get_fitness() == chromo.get_fitness(): 425 | if leader.trace.get_total() \ 426 | < chromo.trace.get_total(): 427 | self.image_leaders[image_name][bbl] = chromo 428 | 429 | # find the unique chromosomes that compose the bbl leaders 430 | elite_chromosomes = {} 431 | 432 | # build the elite generation 433 | for bbl_leaders in self.image_leaders.itervalues(): 434 | for chromo in bbl_leaders.itervalues(): 435 | if chromo != None: 436 | elite_chromosomes[chromo.uid] = chromo 437 | 438 | # create new generation 439 | new = self.new_epoch() 440 | new.extend(elite_chromosomes) 441 | 442 | # set up the generation metrics/stats 443 | for chromo in new.get_all(): 444 | new.trace.update(chromo.trace) 445 | new.set_metrics(chromo.uid, chromo.metrics) 446 | --------------------------------------------------------------------------------
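The trace-file format documented in `tracer.py`'s `parse_trace_file` docstring can be sketched as a self-contained parser. This is only an illustration of the documented layout, not Choronzon's implementation: the little-endian field encoding, the ASCII image names, and the helper name `parse_trace` are assumptions.

```python
import io
import struct

CRASH_SENTINEL = 0xffffffffffffffff  # crash marker per the docstring

def parse_trace(data):
    # IMAGE SECTION: a 1-byte image count, then for each image a
    # 2-byte name length followed by the name itself.
    fin = io.BytesIO(data)
    nimg = ord(fin.read(1))
    images = []
    for _ in range(nimg):
        name_sz, = struct.unpack('<H', fin.read(2))
        images.append(fin.read(name_sz).decode('ascii'))

    # BASIC BLOCK SECTION: (image number, basic block offset) pairs
    # of 8 bytes each, read until the stream is exhausted.
    bbls = dict((name, set()) for name in images)
    crashed = False
    while True:
        chunk = fin.read(16)
        if len(chunk) < 16:
            break
        img_idx, offset = struct.unpack('<QQ', chunk)
        if img_idx == CRASH_SENTINEL:
            crashed = True  # signal/exception raised in the target
            continue
        bbls[images[img_idx]].add(offset)
    return images, bbls, crashed
```

Feeding the parser a hand-built trace with one image, one basic block, and a crash chunk yields that image's offset set and a raised crash flag.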