├── development.md
└── README.md


/development.md:
--------------------------------------------------------------------------------
# Development setup

How to run the CRS locally (without Kubernetes).

## Requirements

* Docker
* Ubuntu 16.04
* the following packages: virtualenvwrapper python2.7-dev build-essential sudo libxml2-dev libxslt1-dev git libffi-dev cmake libreadline-dev libtool debootstrap debian-archive-keyring libglib2.0-dev libpixman-1-dev libpq-dev python-dev libc6:i386 libncurses5:i386 libstdc++6:i386 zlib1g:i386 pkg-config zlib1g-dev libtool libtool-bin wget automake autoconf coreutils bison libacl1-dev qemu-user qemu-kvm socat postgresql-client nasm binutils-multiarch llvm clang

## Install the CRS

This is a bit hacky, but it pays the bills:

```bash
git clone git@github.com:angr/angr-dev
cd angr-dev
./setup.sh -e cgc -r https://github.com/shellphish -r https://github.com/mechaphish -r https://github.com/salls -D \
    ana idalink cooldict mulpyplexer monkeyhex superstruct \
    shellphish-afl shellphish-qemu capstone unicorn peewee \
    archinfo vex pyvex cle claripy simuvex angr angr-management angr-doc \
    binaries identifier fidget angrop tracer fuzzer driller \
    compilerex povsim rex farnsworth patcherex colorguard \
    common-utils network_poll_creator patch_performance \
    worker meister ambassador scriba virtual-competition manual-interaction
```

Annotated:
- Use the **setup.sh** from the angr-dev repository to perform the installation
- Create a **virtual environment** named "cgc"
- Use the **shellphish** and **mechaphish** GitHub organizations (and the **salls** user namespace) as sources in addition to the hardcoded defaults
- Specify all Python packages to install explicitly

Note that this will take a *VERY LONG TIME*, since installing shellphish-afl and shellphish-qemu involves building qemu about 40 times.
There is a binary distribution that should take significantly less time, but there are distribution issues at present.

Additionally, note that there's some sort of bug that prevents meister from running under PyPy.
During the CGC we had the worker running under PyPy and everything else under CPython.

## Run the CRS

### Run the database

Mechanical Phish needs PostgreSQL to run.
You can easily spawn it with Docker:

```bash
sudo docker run -p 127.0.0.1:5432:5432 -d postgres:9.5
```

Now, you need to set up the DB itself:

```bash
workon cgc
cd farnsworth
cp .env.example .env
# edit .env if needed
./setupdb.sh
```

### Run meister

```bash
workon cgc
cd meister
cp .env.example .env
# edit .env if needed
meister
```

You *probably* want to edit `.env` to tweak the following options:

- Change `MEISTER_OVERPROVISIONING` to the fraction of your total system resources the CRS should be allowed to use.
  It is set to more than 1 because, when running behind Kubernetes, the Kubernetes scheduler makes scheduling decisions of its own.
- Change `MEISTER_LOG_LEVEL` to a less verbose level, probably `INFO`.
  The default `DEBUG` is horrifyingly verbose.
- Change `MEISTER_NUM_THREADS` to a very small number.
  1 works most of the time.
  More can overwhelm the database unless you are working with a lot of compute power.

### Run Virtual Competition

To run a virtual competition that serves challenge sets locally, DARPA provided a set of scripts to serve as an API test mock.
We expanded it into a slightly more useful test mock.

The competition requires DARPA's DECREE VM.
To run it, you'll need to install [VirtualBox](https://www.virtualbox.org/wiki/Downloads) and [Vagrant](https://www.vagrantup.com/downloads.html).

Then, to set up and run the virtual competition, run:

```bash
cd virtual-competition
vagrant up ti
bin/launch reset
bin/launch start
```

This will automatically compile and enable several sample CGC challenges.
If you would like to field your own challenges, place them in the `shared/cgc-challenges-unfielded` folder.
The behavior of the virtual competition with respect to fielding challenges is as follows:

- Rounds are 5 minutes long. The inter-round period is not emulated.
- At the start of every round where `round % 10 == 0`, cycle the challenge sets if there are any left in the queue.
  This means moving every challenge from `shared/cgc-challenges` to `shared/cgc-challenges-spent` and moving between 3 and 8 challenges from `shared/cgc-challenges-unfielded` to `shared/cgc-challenges`.
  Keep in mind that this does not happen if `shared/cgc-challenges-unfielded` is empty.
- Serve any challenges in `shared/cgc-challenges` to the CRS.

### Run the Ambassador

The CRS still can't see any of these challenges!
The component that interacts with the CGC API is the ambassador.
Run it as follows:

```bash
workon cgc
cd ambassador
cp .env.example .env
# edit .env if needed
ambassador
```

### Run the Submitter

The component `scriba` performs patch and POV submission.

```bash
workon cgc
cd scriba
cp .env.example .env
# edit .env if needed
scriba
```

### It's live!

The CRS should now be scheduling jobs.
You can jump into the database to look at the internal setup at any time:
`psql -hlocalhost -Upostgres farnsworth` should do the trick.
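
For quick, non-interactive checks, the same connection settings can be wrapped in a small shell helper.
This is only a sketch, not something shipped with Mechanical Phish: the function name `farnsworth_psql` is hypothetical, and it assumes the default host, user, and database name used in the command above.

```bash
# Hypothetical convenience wrapper (not part of the repository).
# Connects with the settings from this guide and forwards any
# extra arguments straight to psql.
farnsworth_psql() {
    psql -h localhost -U postgres farnsworth "$@"
}
```

For example, `farnsworth_psql -c '\dt'` should list the tables in the database and exit, while calling `farnsworth_psql` with no arguments opens the usual interactive prompt.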

If you look at the `jobs` table (`select * from jobs`), you can see that there are some jobs to be done now!
However, because we're not running with Kubernetes, none of these will actually get run unless you run them manually:

```bash
workon cgc
cd manual-interaction
./jobs.py list
./jobs.py run
```

This should run the job.
Each job synchronizes its own data and status into the database, so if your job produces interesting stuff, you should see more interesting jobs!

To view the logs of a given job, look in `manual-interaction/workers/.log`.
Be sure to delete the `manual-interaction/workers` folder whenever you reset the database!

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# The Mechanical Phish

The Mechanical Phish is open source!
Mechanical Phish was created by Shellphish as our CRS for the DARPA Cyber Grand Challenge.
It rocked it in the final event, winning 3rd place, and we are very proud of it.

The Cyber Grand Challenge was the first time anything like this was attempted in the security world.
As such, Mechanical Phish is an *extremely* complicated piece of software, with an absurd number of components.
No blueprint for doing this existed before the CGC, so we had to figure things out as we went along.
Unfortunately, rather than being a software development shop, we are a "mysterious hacker collective".
This means that Mechanical Phish has some rough components, missing documentation, and ghosts in the machine.
Our hope is that, going forward, we can polish and extend Mechanical Phish, as a community, to continue to push the limits of automated hacking.

Keep in mind that this was never designed to be turn-key, might not install without extreme effort, and might not work without a lot of tweaking.
Otherwise, have at it!

# Issues

So far, there are several glaring issues that came up during the runup to the CGC Final Event, the CFE itself, DEFCON, or our post-CGC analysis:

- There is very little documentation of the whole thing. This is something that we would love community involvement for (although it's admittedly a chicken-and-egg problem).
- Setting Mechanical Phish up can be an ordeal. A ready-built Docker image would be cool.
- There are probably lots of URLs pointing to our internal infrastructure. These will need to be changed to point to GitHub as they're identified.
- There are some issues we're aware of:
  * Our Kubernetes setup has problems scheduling things quickly, leading to under-utilization of the infrastructure.
  * An accidental assert, partway through our multi-CB exploitation pipeline, completely disables Mechanical Phish's ability to exploit multi-CB challenge sets.
  * There are some race conditions between patch selection and patch submission, leading to too many patch submissions.
  * The scheduler for analyzing network traffic has a bug that causes it to pull down the full traffic rather than the metadata, essentially disabling that component after 15 rounds of the game.
  * There are other, mysterious scheduling issues that cause exploitable crashes to be overlooked (for example, during the CFE, Mechanical Phish identified 40 exploitable crashes, but only scheduled the generation of 15 exploits).

Overall, we're pretty surprised this thing ran at all!
In the development of Mechanical Phish, we had to fix some bugs in some underlying components.
Our fixes live in the following forks, but we still need to upstream them:

- [Unicorn Engine](https://github.com/angr/unicorn)
- [Capstone Engine](https://github.com/angr/capstone)
- [PeeWee](https://github.com/mechaphish/peewee)

# Components

The CRS has a *lot* of moving parts.
They have been distributed throughout several different GitHub namespaces:

- https://github.com/angr - Core angr components and interesting static analyses
- https://github.com/shellphish - Cool hacking tools, generally anything that's useful outside the CGC
- https://github.com/mechaphish - CRS bookkeeping or utility components
- Additional repositories authored by a single person may be under their GitHub username namespace

This is an index of all of the repositories, split by component.

## Meister

The CRS Meister handles task scheduling.

- [Meister](https://github.com/mechaphish/meister)

## Ambassador

The Ambassador talks to the CGC API to retrieve CBs, submit POVs, etc.

- [Ambassador](https://github.com/mechaphish/ambassador)

## Scriba

Scriba decides what exploits and RCBs to submit.

- [Scriba](https://github.com/mechaphish/scriba)

## Worker

The CRS worker handles running the actual tasks that are scheduled by the Meister.
This part includes a lot of repositories working together.

- [Worker](https://github.com/mechaphish/worker) - the actual glue
- [Rex](https://github.com/shellphish/rex) - Shellphish's automated exploitation component
- [Patcherex](https://github.com/shellphish/patcherex) - Shellphish's automatic patching engine.
- [Fuzzer](https://github.com/shellphish/fuzzer) - a Python wrapper around AFL, and the fuzzer component of the Shellphish CRS.
- [Tracer](https://github.com/angr/tracer) - Component that traces CGC binaries symbolically with concrete inputs (convenience wrapper for concolic tracing).
- [Driller](https://github.com/shellphish/driller) - The Shellphish CRS "smart fuzzer".
- [Shellphish-QEMU](https://github.com/angr/shellphish-qemu) - A pip wrapper around our ridiculous number of QEMU ports.
- [Shellphish-AFL](https://github.com/shellphish/shellphish-afl) - A pip wrapper to easily distribute AFL.

## Common

There are several common components that don't fit in a single category above.

- [Farnsworth](https://github.com/mechaphish/farnsworth) - Farnsworth is the knowledge base of the Shellphish CRS. It provides a JSON-based REST API and uses PostgreSQL as the data store.
- [Common-utils](https://github.com/mechaphish/common-utils) - Common utilities that didn't fit elsewhere.

## Other

There are also many other repositories that act as dependencies for those above.
We'll get a full list together at some point.

# Setup

We have documented the process to set up a local development version [here](development.md).

Scripts for setting up the full, Kubernetes-powered distributed system are [here](https://github.com/mechaphish/setup).

# Support

We plan to keep Mechanical Phish alive through our research and the support of the community.
However, we are a small group of PhD students, and don't have much time to give support, as our primary task is to do research, publish papers, and graduate.
Please understand this when creating support requests: if resolving a support question takes too much time, it will likely never be done.
To maximize the chance of a resolution, meet us halfway and try exploring the issue *with* us :-)

--------------------------------------------------------------------------------