├── development.md
└── README.md


/development.md:
--------------------------------------------------------------------------------
  1 | # Development setup
  2 | 
  3 | How to run CRS locally (without Kubernetes).
  4 | 
  5 | ## Requirements
  6 | 
  7 | * docker
  8 | * ubuntu 16.04
  9 | * the following packages: virtualenvwrapper python2.7-dev build-essential sudo libxml2-dev libxslt1-dev git libffi-dev cmake libreadline-dev libtool debootstrap debian-archive-keyring libglib2.0-dev libpixman-1-dev libpq-dev python-dev libc6:i386 libncurses5:i386 libstdc++6:i386 zlib1g:i386 pkg-config zlib1g-dev libtool libtool-bin wget automake autoconf coreutils bison libacl1-dev qemu-user qemu-kvm socat postgresql-client nasm binutils-multiarch llvm clang
 10 | 
 11 | 
 12 | ## Install the CRS
 13 | 
 14 | This is a bit hacky, but it pays the bills:
 15 | 
 16 | ```
 17 | git clone git@github.com:angr/angr-dev
 18 | cd angr-dev
 19 | ./setup.sh -e cgc -r https://github.com/shellphish -r https://github.com/mechaphish -r https://github.com/salls -D \
 20 |                 ana idalink cooldict mulpyplexer monkeyhex superstruct \
 21 |                 shellphish-afl shellphish-qemu capstone unicorn peewee \
 22 |                 archinfo vex pyvex cle claripy simuvex angr angr-management angr-doc \
 23 |                 binaries identifier fidget angrop tracer fuzzer driller \
 24 |                 compilerex povsim rex farnsworth patcherex colorguard \
 25 |                 common-utils network_poll_creator patch_performance \
 26 |                 worker meister ambassador scriba virtual-competition manual-interaction
 27 | ```
 28 | 
 29 | Annotated:
 30 | - Use the **setup.sh** from the angr-dev repository to perform installation
 31 | - Create a **virtual environment** named "cgc"
 32 | - Use the **shellphish**, **mechaphish**, and **salls** github namespaces as repository sources in addition to the hardcoded defaults
 33 | - Specify explicitly all python packages to install
 34 | 
 35 | Note that this will take a *VERY LONG TIME*, since installing shellphish-afl and shellphish-qemu involves building qemu about 40 times.
 36 | There is a binary distribution that should install significantly faster, but there are distribution issues at present.
 37 | 
 38 | Additionally, note that some sort of bug prevents meister from running under PyPy.
 39 | During the CGC, we ran the worker under PyPy and everything else under CPython.
 40 | 
 41 | ## Run the CRS
 42 | 
 43 | ### Run the database
 44 | 
 45 | Mechanical Phish needs postgres to run.
 46 | You can easily spawn it with docker:
 47 | 
 48 | ```bash
 49 | sudo docker run -p 127.0.0.1:5432:5432 -d postgres:9.5
 50 | ```
 51 | 
 52 | Now, you need to set up the DB itself:
 53 | 
 54 | ```bash
 55 | workon cgc
 56 | cd farnsworth
 57 | cp .env.example .env
 58 | # edit .env if needed
 59 | ./setupdb.sh
 60 | ```
 61 | 
 62 | ### Run meister
 63 | 
 64 | ```bash
 65 | workon cgc
 66 | cd meister
 67 | cp .env.example .env
 68 | # edit .env if needed
 69 | meister
 70 | ```
 71 | 
 72 | You *probably* want to change the environment to tweak the following options:
 73 | 
 74 | - Change `MEISTER_OVERPROVISIONING` to the fraction of your total system resources the CRS should be allowed to use.
 75 |   It is set to more than 1 by default because, when running behind Kubernetes, the Kubernetes scheduler makes scheduling decisions of its own.
 76 | - Change `MEISTER_LOG_LEVEL` to a lower level, probably `INFO`.
 77 |   The default `DEBUG` is horrifyingly verbose.
 78 | - Change `MEISTER_NUM_THREADS` to a very small number.
 79 |   1 works most of the time.
 80 |   More can overwhelm the database unless you are working with a lot of compute power.
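
As a rough illustration of the tweaks listed above, the following Python sketch rewrites the text of a `.env` file with local-development overrides. It assumes the file consists of plain `KEY=VALUE` lines, which is not confirmed here; `apply_env_overrides` and `LOCAL_OVERRIDES` are hypothetical names, and the values are placeholders rather than recommended settings.

```python
def apply_env_overrides(env_text, overrides):
    """Return env_text with KEY=VALUE lines updated from overrides.

    Keys that do not appear in env_text are appended at the end.
    Assumes a plain KEY=VALUE file format (an assumption, not verified).
    """
    remaining = dict(overrides)
    out = []
    for line in env_text.splitlines():
        key, sep, _ = line.partition("=")
        key = key.strip()
        is_comment = line.lstrip().startswith("#")
        if sep and not is_comment and key in remaining:
            # Replace the existing assignment with the override.
            out.append(f"{key}={remaining.pop(key)}")
        else:
            # Keep comments, blank lines, and unrelated settings untouched.
            out.append(line)
    # Append overrides that had no existing line.
    out.extend(f"{k}={v}" for k, v in remaining.items())
    return "\n".join(out) + "\n"


# Illustrative values only; choose ones that suit your machine.
LOCAL_OVERRIDES = {
    "MEISTER_OVERPROVISIONING": "0.5",  # let the CRS use half the system
    "MEISTER_LOG_LEVEL": "INFO",
    "MEISTER_NUM_THREADS": "1",
}
```

Editing `.env` by hand works just as well; a helper like this is only useful if you reset your checkout often and want the same overrides reapplied each time.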
 81 | 
 82 | ### Run Virtual Competition
 83 | 
 84 | To run a virtual competition that serves challenge sets locally, DARPA provided a set of scripts to serve as an API test mock.
 85 | We expanded it into a slightly more useful test mock.
 86 | 
 87 | The competition requires DARPA's DECREE VM.
 88 | To run it, you'll need to install [VirtualBox](https://www.virtualbox.org/wiki/Downloads) and [Vagrant](https://www.vagrantup.com/downloads.html).
 89 | 
 90 | Then, to set up and run the virtual competition, run:
 91 | 
 92 | ```bash
 93 | cd virtual-competition
 94 | vagrant up ti
 95 | bin/launch reset
 96 | bin/launch start
 97 | ```
 98 | 
 99 | This will automatically compile and enable several sample CGC challenges.
100 | If you would like to field your own challenges, place them in the `shared/cgc-challenges-unfielded` folder.
101 | The behavior of the virtual competition with respect to fielding challenges is as follows:
102 | 
103 | - Rounds are 5 minutes long. The inter-round period is not emulated.
104 | - At the start of a round where round % 10 == 0, cycle the challenge sets if there are any left in the queue.
105 |   This means moving every challenge from `shared/cgc-challenges` to `shared/cgc-challenges-spent` and moving between 3 and 8 challenges from `shared/cgc-challenges-unfielded` to `shared/cgc-challenges`.
106 |   Keep in mind that this does not happen if `shared/cgc-challenges-unfielded` is empty.
107 | - Serve to the CRS any challenges in `shared/cgc-challenges`
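
The fielding rule above can be sketched in Python as follows. This is a minimal model of the documented behavior, not the virtual competition's actual code: `cycle_challenges` is a hypothetical name, and both the random choice of which challenges to field and the cap at the number of queued challenges are assumptions, since the text only says "between 3 and 8".

```python
import random
import shutil
from pathlib import Path
from typing import List


def cycle_challenges(shared: Path, round_num: int, rng: random.Random) -> List[str]:
    """Apply the documented cycling rule; return the newly fielded names."""
    unfielded = shared / "cgc-challenges-unfielded"
    fielded = shared / "cgc-challenges"
    spent = shared / "cgc-challenges-spent"
    for directory in (unfielded, fielded, spent):
        directory.mkdir(parents=True, exist_ok=True)

    queue = sorted(unfielded.iterdir())
    # Cycle only on every 10th round, and only if something is queued.
    if round_num % 10 != 0 or not queue:
        return []

    # Retire every challenge that is currently fielded.
    for path in list(fielded.iterdir()):
        shutil.move(str(path), str(spent / path.name))

    # Field between 3 and 8 challenges (assumption: chosen at random,
    # capped by how many are actually waiting in the queue).
    count = min(rng.randint(3, 8), len(queue))
    chosen = rng.sample(queue, count)
    for path in chosen:
        shutil.move(str(path), str(fielded / path.name))
    return sorted(path.name for path in chosen)
```

Calling it with a round number that is not a multiple of 10, or with an empty `cgc-challenges-unfielded` folder, leaves the currently fielded challenges in place, which matches the last caveat in the list.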
108 | 
109 | ### Run the Ambassador
110 | 
111 | The CRS still can't see any of these challenges!
112 | The component that interacts with the CGC API is the ambassador.
113 | Run it as follows:
114 | 
115 | ```bash
116 | workon cgc
117 | cd ambassador
118 | cp .env.example .env
119 | # edit .env if needed
120 | ambassador
121 | ```
122 | 
123 | ### Run the Submitter
124 | 
125 | The component `scriba` performs patch and POV submission.
126 | 
127 | ```bash
128 | workon cgc
129 | cd scriba
130 | cp .env.example .env
131 | # edit .env if needed
132 | scriba
133 | ```
134 | 
135 | ### It's live!
136 | 
137 | The CRS should now be scheduling jobs.
138 | You can jump into the database to look at the internal setup at any time:
139 | `psql -hlocalhost -Upostgres farnsworth` should do the trick.
140 | 
141 | If you look at the `jobs` table (`select * from jobs`), you can see that there are some jobs to be done now!
142 | However, because we're not running with kubernetes, none of these will actually get run unless you run them manually:
143 | 
144 | ```bash
145 | workon cgc
146 | cd manual-interaction
147 | ./jobs.py list
148 | ./jobs.py run <JOB ID>
149 | ```
150 | 
151 | This should run the job.
152 | Each job synchronizes its own data and status into the database, so if your job produces interesting stuff, you should see more interesting jobs!
153 | 
154 | To view the logs of a given job, look in `manual-interaction/workers/<JOB ID>.log`.
155 | Be sure to delete the `manual-interaction/workers` folder whenever you reset the database!
156 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
  1 | # The Mechanical Phish
  2 | 
  3 | The Mechanical Phish is open source!
  4 | Mechanical Phish was created by Shellphish as our CRS for the DARPA Cyber Grand Challenge.
  5 | It rocked it in the final event, winning 3rd place, and we are very proud of it.
  6 | 
  7 | The Cyber Grand Challenge was the first time anything like this was attempted in the security world.
  8 | As such, Mechanical Phish is an *extremely* complicated piece of software, with an absurd number of components.
  9 | No blueprint for doing this existed before the CGC, so we had to figure things out as we went along.
 10 | Unfortunately, rather than being a software development shop, we are a "mysterious hacker collective".
 11 | This means that Mechanical Phish has some rough components, missing documentation, and ghosts in the machine.
 12 | Our hope is that, going forward, we can polish and extend Mechanical Phish, as a community, to continue to push the limits of automated hacking.
 13 | 
 14 | Keep in mind that this was never designed to be turn-key, might not install without extreme effort, and might not work without a lot of tweaking.
 15 | Otherwise, have at it!
 16 | 
 17 | # Issues
 18 | 
 19 | So far, there are several glaring issues that came up during the runup to the CGC Final Event, the CFE itself, DEFCON, or our post-CGC analysis:
 20 | 
 21 | - There is very little documentation of the whole thing. This is something that we would love community involvement for (although it's admittedly a chicken-and-egg problem).
 22 | - Setting Mechanical Phish up can be an ordeal. A ready-built Docker image would be cool.
 23 | - There are probably lots of URLs pointing to our internal infrastructure. These will need to be changed to point to github as they're identified.
 24 | - There are some issues we're aware of:
 25 |   * Our Kubernetes setup has problems scheduling things quickly, leading to under-utilization of the infrastructure.
 26 |   * An accidental assert, partway through our multi-CB exploitation pipeline, completely disables Mechanical Phish's ability to exploit multi-CB challenge sets.
 27 |   * There are some race conditions between the patch selection and patch submission, leading to too many patch submissions.
 28 |   * The scheduler for analyzing network traffic has a bug that causes it to pull down the full traffic rather than the metadata, essentially disabling that component after 15 rounds of the game.
 29 |   * There are other, mysterious scheduling issues that cause exploitable crashes to be overlooked (for example, during the CFE, Mechanical Phish identified 40 exploitable crashes, but only scheduled the generation of 15 exploits).
 30 | 
 31 | Overall, we're pretty surprised this thing ran at all!
 32 | In the development of Mechanical Phish, we had to fix some bugs in some underlying components.
 33 | We have our fixes in the following forks, but we need to upstream them:
 34 | 
 35 | - [Unicorn Engine](https://github.com/angr/unicorn)
 36 | - [Capstone Engine](https://github.com/angr/capstone)
 37 | - [PeeWee](https://github.com/mechaphish/peewee)
 38 | 
 39 | # Components
 40 | 
 41 | The CRS has a *lot* of moving parts.
 42 | They have been distributed throughout several different github namespaces:
 43 | 
 44 | - https://github.com/angr - Core angr components and interesting static analyses
 45 | - https://github.com/shellphish - Cool hacking tools, generally anything that's useful outside the CGC
 46 | - https://github.com/mechaphish - CRS bookkeeping or utility components
 47 | - Additional repositories authored by a single person may be under their github username namespace
 48 | 
 49 | This is an index of all of the repositories, split by component.
 50 | 
 51 | ## Meister
 52 | 
 53 | The CRS Meister handles task scheduling.
 54 | 
 55 | - [Meister](https://github.com/mechaphish/meister)
 56 | 
 57 | ## Ambassador
 58 | 
 59 | The Ambassador talks to the CGC API to retrieve CBs, submit POVs, etc.
 60 | 
 61 | - [Ambassador](https://github.com/mechaphish/ambassador)
 62 | 
 63 | ## Scriba
 64 | 
 65 | Scriba decides what exploits and RCBs to submit.
 66 | 
 67 | - [Scriba](https://github.com/mechaphish/scriba)
 68 | 
 69 | ## Worker
 70 | 
 71 | The CRS worker runs the actual tasks that are scheduled by the Meister.
 72 | This part includes a lot of repositories working together.
 73 | 
 74 | - [Worker](https://github.com/mechaphish/worker) - the actual glue
 75 | - [Rex](https://github.com/shellphish/rex) - Shellphish's automated exploitation component
 76 | - [Patcherex](https://github.com/shellphish/patcherex) - Shellphish's automatic patching engine.
 77 | - [Fuzzer](https://github.com/shellphish/fuzzer) - a Python wrapper around AFL, and the fuzzer component of the Shellphish CRS.
 78 | - [Tracer](https://github.com/angr/tracer) - Component that traces CGC binaries symbolically with concrete inputs (convenience wrapper for concolic tracing).
 79 | - [Driller](https://github.com/shellphish/driller) - The Shellphish CRS "smart fuzzer".
 80 | - [Shellphish-QEMU](https://github.com/angr/shellphish-qemu) - A pip wrapper around our ridiculous number of qemu ports.
 81 | - [Shellphish-AFL](https://github.com/shellphish/shellphish-afl) - A pip wrapper to easily distribute AFL.
 82 | 
 83 | ## Common
 84 | 
 85 | There are several common components that don't fit in a single category above.
 86 | 
 87 | - [Farnsworth](https://github.com/mechaphish/farnsworth) - Farnsworth is the knowledge base of the Shellphish CRS. It provides a JSON-based REST API and uses PostgreSQL as the data store.
 88 | - [Common-utils](https://github.com/mechaphish/common-utils) - Common utilities that didn't fit elsewhere.
 89 | 
 90 | ## Other
 91 | 
 92 | There are also many other repositories that act as dependencies for those above.
 93 | We'll get a full list together at some point.
 94 | 
 95 | # Setup
 96 | 
 97 | We have documented the process to set up a local development version [here](development.md).
 98 | 
 99 | Scripts for setting up the full, kubernetes-powered distributed system are [here](https://github.com/mechaphish/setup).
100 | 
101 | # Support
102 | 
103 | We plan to keep Mechanical Phish alive through our research and the support of the community.
104 | However, we are a small group of PhD students, and don't have much time to give support, as our primary task is to do research, publish papers, and graduate.
105 | Please understand this when creating support requests: if resolving a support question takes too much time, it will likely never be done.
106 | To maximize the chance of a resolution, meet us halfway and try exploring the issue *with* us :-)
107 | 


--------------------------------------------------------------------------------