├── LICENSE ├── README.md ├── crash_exploration ├── README.md └── crash_exploration.patch ├── docker ├── Dockerfile ├── build.sh ├── entrypoint.sh ├── example_scripts │ ├── 01_afl.sh │ ├── 02_tracing.sh │ └── 03_rca.sh ├── pull.sh └── run.sh ├── example.zip ├── paper.png ├── root_cause_analysis ├── .gitignore ├── Cargo.toml ├── predicate_monitoring │ ├── Cargo.toml │ └── src │ │ ├── assembler.rs │ │ ├── lib.rs │ │ ├── monitor.rs │ │ ├── predicate.rs │ │ ├── register.rs │ │ └── rflags.rs ├── root_cause_analysis │ ├── Cargo.toml │ └── src │ │ ├── addr2line.rs │ │ ├── config.rs │ │ ├── lib.rs │ │ ├── main.rs │ │ ├── monitor.rs │ │ ├── rankings.rs │ │ ├── traces.rs │ │ └── utils.rs └── trace_analysis │ ├── Cargo.toml │ └── src │ ├── config.rs │ ├── control_flow_graph.rs │ ├── debug.rs │ ├── lib.rs │ ├── main.rs │ ├── predicate_analysis.rs │ ├── predicate_builder.rs │ ├── predicate_synthesizer.rs │ ├── predicates.rs │ ├── trace.rs │ ├── trace_analyzer.rs │ └── trace_integrity.rs └── tracing ├── README.md ├── aurora_tracer.cpp ├── makefile ├── makefile.rules └── scripts ├── addr_ranges.py ├── pprint.py ├── run_tracer.sh └── tracing.py /README.md: -------------------------------------------------------------------------------- 1 | # Aurora: Statistical Crash Analysis for Automated Root Cause Explanation 2 |

3 | 4 | Aurora is a tool for automated root cause analysis. It is based on our [paper](https://www.usenix.org/system/files/sec20-blazytko.pdf) ([slides](https://www.usenix.org/system/files/sec20_slides_blazytko.pdf), [recording](https://2459d6dc103cb5933875-c0245c5c937c5dedcca3f1764ecc9b2f.ssl.cf2.rackcdn.com/sec20/videos/0812/s3_software_security_and_verification/4_sec20fall-paper610-presentation-video.mp4)): 5 | 6 | ``` 7 | @inproceedings{blazytko2020aurora, 8 | author = {Tim Blazytko and Moritz Schl{\"o}gel and Cornelius Aschermann and Ali Abbasi and Joel Frank and Simon W{\"o}rner and Thorsten Holz}, 9 | title = {{AURORA}: Statistical Crash Analysis for Automated Root Cause Explanation}, 10 | year = {2020}, 11 | booktitle = {29th {USENIX} Security Symposium ({USENIX} Security 20)}, 12 | } 13 | ``` 14 | 15 | 16 | # Components 17 | 18 | This repository is structured as follows: 19 | 20 | 1) Crash exploration (AFL): Our patch for AFL's crash exploration mode. 21 | 22 | 2) Tracer (Pin): Our tracer to extract information such as register values for inputs. 23 | 24 | 3) Root Cause Analysis: Our Rust-based tooling to identify the root cause. 25 | 26 | 27 | ## Crash Exploration 28 | 29 | We rely on AFL's crash exploration mode. We patch AFL such that inputs not crashing anymore (so-called non-crashes) are saved. Download AFL 2.52b and apply our patch `patch -p1 < crash_exploration.patch` before running AFL's crash exploration mode as usual. 30 | 31 | 32 | ## Tracer 33 | 34 | Our tracer is implemented as a pintool. Install Pin 3.15 and then compile our tool with `make obj-intel64/aurora_tracer.so`. We provide scripts to trace one input (tracing/scripts/run_tracer.sh) or multiple inputs (tracing/scripts/tracing.py). 35 | 36 | ## Root Cause Analysis 37 | 38 | Our RCA component is written in Rust. It expects an evaluation folder (organized as in our example folder) and a folder containing traces. 
39 | 40 | The tool `rca` performs the predicate analysis, monitoring and ranking; `addr2line` enriches the predicates with debug symbols (if existing). 41 | 42 | ``` 43 | # build project 44 | cargo build --release 45 | 46 | # run root cause analysis 47 | cargo run --release --bin rca -- --eval-dir --trace-dir --monitor --rank-predicates 48 | 49 | # enrich with debug symbols 50 | cargo run --release --bin addr2line -- --eval-dir 51 | ``` 52 | 53 | # Example 54 | 55 | The following commands show how to use Aurora for the type confusion in `mruby`. 56 | 57 | ## Preparation 58 | 59 | Setup directories: 60 | 61 | ``` 62 | # set directories 63 | # Clone this repository and make AURORA_GIT_DIR point to it 64 | AURORA_GIT_DIR="$(pwd)/aurora" 65 | mkdir evaluation 66 | cd evaluation 67 | EVAL_DIR=`pwd` 68 | AFL_DIR=$EVAL_DIR/afl-fuzz 69 | AFL_WORKDIR=$EVAL_DIR/afl-workdir 70 | mkdir -p $EVAL_DIR/inputs/crashes 71 | mkdir -p $EVAL_DIR/inputs/non_crashes 72 | ``` 73 | 74 | To prepare fuzzing, perform the following as root: 75 | 76 | ``` 77 | echo core >/proc/sys/kernel/core_pattern 78 | cd /sys/devices/system/cpu 79 | echo performance | tee cpu*/cpufreq/scaling_governor 80 | 81 | # disable ASLR 82 | echo 0 | tee /proc/sys/kernel/randomize_va_space 83 | ``` 84 | 85 | 86 | Build and install `AFL`: 87 | 88 | ``` 89 | # download afl 90 | wget -c https://lcamtuf.coredump.cx/afl/releases/afl-latest.tgz 91 | tar xvf afl-latest.tgz 92 | 93 | # rename afl directory and cd 94 | mv afl-2.52b afl-fuzz 95 | cd afl-fuzz 96 | 97 | # apply patch 98 | patch -p1 < ${AURORA_GIT_DIR}/crash_exploration/crash_exploration.patch 99 | 100 | # build afl 101 | make -j 102 | cd .. 
103 | ``` 104 | 105 | Build the `mruby` target: 106 | 107 | ``` 108 | # clone mruby 109 | git clone https://github.com/mruby/mruby.git 110 | cd mruby 111 | git checkout 88604e39ac9c25ffdad2e3f03be26516fe866038 112 | 113 | # build afl version 114 | CC=$AFL_DIR/afl-gcc make -e -j 115 | mv ./bin/mruby ../mruby_fuzz 116 | 117 | # clean 118 | make clean 119 | 120 | # build normal version for tracing/rca 121 | CFLAGS="-ggdb -O0" make -e -j 122 | 123 | mv ./bin/mruby ../mruby_trace 124 | ``` 125 | 126 | Place the initial crashing seed: 127 | 128 | ``` 129 | cp $AURORA_GIT_DIR/example.zip $EVAL_DIR 130 | cd $EVAL_DIR/ 131 | unzip example.zip 132 | echo "@@" > arguments.txt 133 | cp -r example/mruby_type_confusion/seed . 134 | ``` 135 | 136 | ## Crash Exploration 137 | 138 | For crash exploration, perform the following operations in the evaluation directory: 139 | 140 | ``` 141 | # fuzzing (timeout of 43200 s = 12 h) 142 | timeout 43200 $AFL_DIR/afl-fuzz -C -d -m none -i $EVAL_DIR/seed -o $AFL_WORKDIR -- $EVAL_DIR/mruby_fuzz @@ 143 | 144 | # copy crashes to eval dir 145 | cp $AFL_WORKDIR/queue/* $EVAL_DIR/inputs/crashes 146 | 147 | # copy non-crashes to eval dir 148 | cp $AFL_WORKDIR/non_crashes/* $EVAL_DIR/inputs/non_crashes 149 | ``` 150 | 151 | 152 | ## Tracing 153 | 154 | To trace all inputs, install Pin. Note that our tool was originally designed to work with Pin 3.7, which is no longer available for download from the official site; we have adapted the tool to Pin 3.15: 155 | ``` 156 | wget -c http://software.intel.com/sites/landingpage/pintool/downloads/pin-3.15-98253-gb56e429b1-gcc-linux.tar.gz 157 | tar -xzf pin*.tar.gz 158 | export PIN_ROOT="$(pwd)/pin-3.15-98253-gb56e429b1-gcc-linux" 159 | mkdir -p "${PIN_ROOT}/source/tools/AuroraTracer" 160 | cp -r ${AURORA_GIT_DIR}/tracing/* ${PIN_ROOT}/source/tools/AuroraTracer 161 | cd ${PIN_ROOT}/source/tools/AuroraTracer 162 | # requires PIN_ROOT to be set correctly 163 | make obj-intel64/aurora_tracer.so 164 | cd - 165 | ``` 166 | With the tracer built, we must 
trace all crashing and non-crashing inputs found by the fuzzer's crash exploration mode. 167 | 168 | ``` 169 | mkdir -p $EVAL_DIR/traces 170 | # requires at least python 3.6 171 | cd $AURORA_GIT_DIR/tracing/scripts 172 | python3 tracing.py $EVAL_DIR/mruby_trace $EVAL_DIR/inputs $EVAL_DIR/traces 173 | # extract stack and heap addr ranges from logfiles 174 | python3 addr_ranges.py --eval_dir $EVAL_DIR $EVAL_DIR/traces 175 | cd - 176 | ``` 177 | 178 | 179 | ## Root Cause Analysis 180 | Once tracing has completed, you can determine predicates as follows (requires Rust Nightly): 181 | ``` 182 | # go to directory 183 | cd $AURORA_GIT_DIR/root_cause_analysis 184 | 185 | # Build components 186 | cargo build --release --bin monitor 187 | cargo build --release --bin rca 188 | 189 | # run root cause analysis 190 | cargo run --release --bin rca -- --eval-dir $EVAL_DIR --trace-dir $EVAL_DIR --monitor --rank-predicates 191 | 192 | # (Optional) enrich with debug symbols 193 | cargo run --release --bin addr2line -- --eval-dir $EVAL_DIR 194 | ``` 195 | Your predicates are in `ranked_predicates_verbose.txt` :) 196 | 197 | 198 | # Output 199 | Aurora provides you with predicates structured as follows (in `ranked_predicates_verbose.txt`): 200 | ``` 201 | 0x0000555555569c5a -- rax min_reg_val_less 0x11 -- 1 -- mov eax, dword ptr [rbp-0x48] (path rank: 0.9690633497239973) //mrb_exc_set at error.c:277 202 | address -- predicate explanation -- score -- disassembly at addr (path rank) // addr2line (if applied) 203 | 204 | ``` 205 | 206 | # Docker 207 | We provide a Dockerfile that sets up the example for you. 
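For post-processing the predicate lines described in the Output section above, a small parser can be handy. The following is a minimal Python sketch, assuming every line follows exactly the layout shown there (fields separated by ` -- `, a `(path rank: ...)` suffix, and an optional `//` part added by `addr2line`). The names `RankedPredicate` and `parse_ranked_predicate` are hypothetical and not part of Aurora.

```python
import re
from typing import NamedTuple, Optional


class RankedPredicate(NamedTuple):
    """One line of ranked_predicates_verbose.txt (hypothetical helper type)."""
    address: int
    predicate: str
    score: float
    disassembly: str
    path_rank: float
    source: Optional[str]  # addr2line information, if it was applied


# Tail of a line: "<disassembly> (path rank: <float>) //<addr2line info>"
_TAIL = re.compile(
    r"^(?P<disasm>.*?)\s*\(path rank: (?P<rank>[^)]+)\)\s*(?://\s*(?P<src>.*))?$"
)


def parse_ranked_predicate(line: str) -> RankedPredicate:
    """Parse one predicate line; raises ValueError if the layout differs."""
    # Assumption: the first three fields never contain " -- " themselves.
    address, predicate, score, tail = line.strip().split(" -- ", 3)
    match = _TAIL.match(tail)
    if match is None:
        raise ValueError(f"unexpected predicate line: {line!r}")
    return RankedPredicate(
        address=int(address, 16),
        predicate=predicate,
        score=float(score),
        disassembly=match["disasm"],
        path_rank=float(match["rank"]),
        source=match["src"] or None,
    )


if __name__ == "__main__":
    example = (
        "0x0000555555569c5a -- rax min_reg_val_less 0x11 -- 1 -- "
        "mov eax, dword ptr [rbp-0x48] (path rank: 0.9690633497239973) "
        "//mrb_exc_set at error.c:277"
    )
    print(parse_ranked_predicate(example))
```

Splitting at most three times on ` -- ` keeps the disassembly and the trailing annotations together, so the regular expression only has to deal with the last field.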
208 | 209 | Preparation: As root, set the following (required by AFL): 210 | ``` 211 | echo core >/proc/sys/kernel/core_pattern 212 | cd /sys/devices/system/cpu 213 | echo performance | tee cpu*/cpufreq/scaling_governor 214 | 215 | # disable ASLR 216 | echo 0 | tee /proc/sys/kernel/randomize_va_space 217 | ``` 218 | 219 | Then, build (or [pull from Dockerhub](https://hub.docker.com/repository/docker/mu00d8/aurora)) and run the docker image: 220 | ``` 221 | # either pull the image from dockerhub 222 | ./pull.sh 223 | # *or*, alternatively, manually build it 224 | ./build.sh 225 | 226 | # start container 227 | ./run.sh 228 | ``` 229 | 230 | Inside the container, you can find the following scripts in `/home/user/aurora/docker/example_scripts`: 231 | ``` 232 | # Run AFL in crash exploration mode (modify timeout before) 233 | ./01_afl.sh 234 | # Trace all inputs found in the previous step 235 | ./02_tracing.sh 236 | # Run root cause analysis on the traced inputs 237 | ./03_rca.sh 238 | ``` 239 | 240 | # Contact 241 | 242 | For more information, contact [mrphrazer](https://github.com/mrphrazer) ([@mr_phrazer](https://twitter.com/mr_phrazer)) or [m_u00d8](https://github.com/mu00d8) ([@m_u00d8](https://twitter.com/m_u00d8)). 243 | 244 | -------------------------------------------------------------------------------- /crash_exploration/README.md: -------------------------------------------------------------------------------- 1 | # Crash exploration 2 | 3 | ## Setup 4 | Download AFL 2.52b and apply the crash_exploration.patch with 5 | `patch -p1 < crash_exploration.patch` 6 | 7 | ## Usage 8 | Run AFL with the desired options. Make sure to use a crashing input as the seed and pass the -C flag to run AFL's crash exploration mode. The patched AFL then saves crashing inputs ('queue' folder) and non-crashing inputs ('non_crashes' folder), both of which are needed for tracing. 
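The collection step that follows a crash exploration run can be sketched in Python as below. This is only an illustration, assuming the layout used in the main README: `queue` and `non_crashes` inside the AFL output directory, and `inputs/crashes` and `inputs/non_crashes` inside the evaluation directory. `collect_inputs` is a hypothetical helper and not part of this repository.

```python
import shutil
from pathlib import Path


def collect_inputs(afl_workdir, eval_dir):
    """Copy AFL results into <eval_dir>/inputs; returns the number of files per folder.

    Assumed mapping: 'queue' holds crashing inputs (crash exploration mode),
    'non_crashes' holds the inputs saved by the patched AFL.
    """
    mapping = {"queue": "crashes", "non_crashes": "non_crashes"}
    copied = {}
    for src_name, dst_name in mapping.items():
        src = Path(afl_workdir) / src_name
        dst = Path(eval_dir) / "inputs" / dst_name
        dst.mkdir(parents=True, exist_ok=True)
        count = 0
        for entry in sorted(src.iterdir()):
            # Like `cp dir/*`: regular, non-hidden files only; this skips
            # AFL's bookkeeping directory `.state` inside the queue.
            if entry.is_file() and not entry.name.startswith("."):
                shutil.copy2(entry, dst / entry.name)
                count += 1
        copied[dst_name] = count
    return copied
```

A call such as `collect_inputs("afl-workdir", "evaluation")` mirrors the two `cp` commands from the example and leaves both input folders ready for tracing.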
9 | 10 | -------------------------------------------------------------------------------- /crash_exploration/crash_exploration.patch: -------------------------------------------------------------------------------- 1 | diff --git a/afl-fuzz.c b/afl-fuzz.c 2 | index 01b4afe..4f1549b 100644 3 | --- a/afl-fuzz.c 4 | +++ b/afl-fuzz.c 5 | @@ -280,6 +280,10 @@ static s8 interesting_8[] = { INTERESTING_8 }; 6 | static s16 interesting_16[] = { INTERESTING_8, INTERESTING_16 }; 7 | static s32 interesting_32[] = { INTERESTING_8, INTERESTING_16, INTERESTING_32 }; 8 | 9 | +/* Crash exploration */ 10 | + 11 | +static u32 unique_non_crash_id = 0; 12 | + 13 | /* Fuzzing stages */ 14 | 15 | enum { 16 | @@ -3280,7 +3284,26 @@ keep_as_crash: 17 | 18 | case FAULT_ERROR: FATAL("Unable to execute target application"); 19 | 20 | - default: return keeping; 21 | + default: 22 | + /* Crash exploration mode */ 23 | + if (!(hnb = has_new_bits(virgin_bits))) 24 | + return keeping; 25 | +#ifndef SIMPLE_FILES 26 | + 27 | + fn = alloc_printf("%s/non_crashes/id:%06u,%s_%u", out_dir, queued_paths, 28 | + describe_op(hnb), unique_non_crash_id); 29 | + 30 | +#else 31 | + 32 | + fn = alloc_printf("%s/non_crashes/id_%06u_%u", out_dir, queued_paths, unique_non_crash_id); 33 | + 34 | +#endif /* ^!SIMPLE_FILES */ 35 | + fd = open(fn, O_WRONLY | O_CREAT | O_EXCL, 0600); 36 | + if (fd < 0) PFATAL("Unable to create '%s'", fn); 37 | + ck_write(fd, mem, len, fn); 38 | + close(fd); 39 | + unique_non_crash_id += 1; 40 | + return keeping; 41 | 42 | } 43 | 44 | @@ -3746,6 +3769,11 @@ static void maybe_delete_out_dir(void) { 45 | if (delete_files(fn, CASE_PREFIX)) goto dir_cleanup_failed; 46 | ck_free(fn); 47 | 48 | + /* Crash exploration */ 49 | + fn = alloc_printf("%s/non_crashes", out_dir); 50 | + if (delete_files(fn, CASE_PREFIX)) goto dir_cleanup_failed; 51 | + ck_free(fn); 52 | + 53 | /* All right, let's do /crashes/id:* and /hangs/id:*. 
*/ 54 | 55 | if (!in_place_resume) { 56 | @@ -7117,6 +7145,11 @@ EXP_ST void setup_dirs_fds(void) { 57 | if (mkdir(tmp, 0700)) PFATAL("Unable to create '%s'", tmp); 58 | ck_free(tmp); 59 | 60 | + /* Crash exploration */ 61 | + tmp = alloc_printf("%s/non_crashes", out_dir); 62 | + if (mkdir(tmp, 0700)) PFATAL("Unable to create '%s'", tmp); 63 | + ck_free(tmp); 64 | + 65 | /* Top-level directory for queue metadata used for session 66 | resume and related tasks. */ 67 | 68 | -------------------------------------------------------------------------------- /docker/Dockerfile: -------------------------------------------------------------------------------- 1 | FROM ubuntu:20.04 2 | 3 | ENV DEBIAN_FRONTEND noninteractive 4 | 5 | RUN apt update -y && \ 6 | apt install -y build-essential git sudo python3 wget curl neovim htop \ 7 | ruby bison cmake 8 | 9 | ARG USER_UID=1000 10 | ARG USER_GID=1000 11 | 12 | #Enable sudo group 13 | RUN echo "%sudo ALL=(ALL) NOPASSWD: ALL" >> /etc/sudoers 14 | ADD entrypoint.sh "/usr/bin/" 15 | WORKDIR /tmp 16 | 17 | 18 | #Create user "user" 19 | RUN groupadd -g ${USER_GID} user 20 | RUN useradd -l --shell /bin/bash -c "" -m -u ${USER_UID} -g user -G sudo user 21 | WORKDIR "/home/user" 22 | USER user 23 | 24 | RUN wget -O ~/.gdbinit-gef.py -q https://gef.blah.cat/sh \ 25 | && echo source ~/.gdbinit-gef.py >> ~/.gdbinit 26 | 27 | 28 | ################################################################# 29 | # Clone Aurora repository and prepare environment 30 | RUN git clone https://github.com/RUB-SysSec/aurora 31 | 32 | RUN mkdir evaluation 33 | 34 | ENV AURORA_GIT_DIR=/home/user/aurora 35 | ENV EVAL_DIR=/home/user/evaluation 36 | ENV AFL_DIR=$EVAL_DIR/afl-fuzz 37 | ENV AFL_WORKDIR=$EVAL_DIR/afl-workdir 38 | 39 | WORKDIR $EVAL_DIR 40 | RUN mkdir -p $EVAL_DIR/inputs/crashes && mkdir -p $EVAL_DIR/inputs/non_crashes 41 | 42 | ################################################################# 43 | # Install AFL 44 | RUN wget -q -c 
https://lcamtuf.coredump.cx/afl/releases/afl-latest.tgz && tar xf afl-latest.tgz && mv afl-2.52b afl-fuzz 45 | 46 | ## apply patch & build 47 | RUN cd afl-fuzz && patch -p1 < ${AURORA_GIT_DIR}/crash_exploration/crash_exploration.patch && make -j 48 | 49 | ################################################################# 50 | # Build mruby target 51 | RUN git clone https://github.com/mruby/mruby.git && cd mruby && git checkout 88604e39ac9c25ffdad2e3f03be26516fe866038 52 | ## build afl version 53 | RUN CC=$AFL_DIR/afl-gcc printenv | grep CC 54 | RUN cd mruby && CC=$AFL_DIR/afl-gcc make -e -j && mv ./bin/mruby ../mruby_fuzz 55 | 56 | # clean 57 | RUN cd mruby && make clean 58 | 59 | # build normal version for tracing/rca 60 | RUN cd mruby && CFLAGS="-ggdb -O0" make -e -j && mv ./bin/mruby ../mruby_trace 61 | 62 | ################################################################# 63 | # Install Rust 64 | RUN curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | bash -s -- -q -y --default-toolchain nightly 65 | ENV PATH="/home/user/.cargo/bin:${PATH}" 66 | 67 | 68 | ################################################################# 69 | # Install Pin 70 | RUN wget -q -c http://software.intel.com/sites/landingpage/pintool/downloads/pin-3.15-98253-gb56e429b1-gcc-linux.tar.gz && tar -xzf pin*.tar.gz 71 | ENV PIN_ROOT="/home/user/evaluation/pin-3.15-98253-gb56e429b1-gcc-linux" 72 | 73 | RUN mkdir -p "${PIN_ROOT}/source/tools/AuroraTracer" && cp -r ${AURORA_GIT_DIR}/tracing/* ${PIN_ROOT}/source/tools/AuroraTracer 74 | 75 | ## requires PIN_ROOT to be set correctly 76 | RUN cd ${PIN_ROOT}/source/tools/AuroraTracer && make obj-intel64/aurora_tracer.so 77 | 78 | 79 | ################################################################# 80 | # Complete setting up evaluation directory 81 | RUN cp $AURORA_GIT_DIR/example.zip $EVAL_DIR && unzip -q example.zip && cp -r example/mruby_type_confusion/seed . 
82 | RUN echo "@@" > arguments.txt 83 | 84 | WORKDIR /home/user 85 | -------------------------------------------------------------------------------- /docker/build.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | docker build --build-arg USER_UID="$(id -u)" --build-arg USER_GID="$(id -g)" "$@" -t aurora:latest . 4 | -------------------------------------------------------------------------------- /docker/entrypoint.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | # Add local user 4 | # Requires LOCAL_USER_ID and LOCAL_GROUP_ID to be passed in at runtime; 5 | # there is no fallback 6 | 7 | set -eu 8 | 9 | if [[ -z "${LOCAL_USER_ID:-}" ]]; then 10 | echo "Please set LOCAL_USER_ID" 11 | exit 1 12 | fi 13 | 14 | if [[ -z "${LOCAL_GROUP_ID:-}" ]]; then 15 | echo "Please set LOCAL_GROUP_ID" 16 | exit 1 17 | fi 18 | 19 | echo "$LOCAL_USER_ID:$LOCAL_GROUP_ID" 20 | export uid=$LOCAL_USER_ID 21 | export gid=$LOCAL_GROUP_ID 22 | export HOME=/home/user 23 | export USER=user 24 | 25 | usermod -u $LOCAL_USER_ID user 26 | groupmod -g $LOCAL_GROUP_ID user 27 | 28 | exec /usr/sbin/gosu user "$@" 29 | -------------------------------------------------------------------------------- /docker/example_scripts/01_afl.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | # run AFL 4 | timeout 43200 $AFL_DIR/afl-fuzz -C -d -m none -i $EVAL_DIR/seed -o $AFL_WORKDIR -- $EVAL_DIR/mruby_fuzz @@ 5 | 6 | # save crashes and non-crashes 7 | cp $AFL_WORKDIR/queue/* $EVAL_DIR/inputs/crashes 8 | cp $AFL_WORKDIR/non_crashes/* $EVAL_DIR/inputs/non_crashes 9 | -------------------------------------------------------------------------------- /docker/example_scripts/02_tracing.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | set -eu 4 | 5 | if [ -z "${EVAL_DIR:-}" ] || [ -z "${AURORA_GIT_DIR:-}" ]; then 6 | echo "ERROR: set 
EVAL_DIR and AURORA_GIT_DIR env vars" 7 | exit 1 8 | fi 9 | 10 | if [ ! -f "$EVAL_DIR/pin-3.15-98253-gb56e429b1-gcc-linux/source/tools/AuroraTracer/obj-intel64/aurora_tracer.so" ]; then 11 | echo "Need to make obj-intel64/aurora_tracer.so first" 12 | exit 1 13 | fi 14 | 15 | mkdir -p $EVAL_DIR/traces 16 | # requires at least python 3.6 17 | pushd $AURORA_GIT_DIR/tracing/scripts > /dev/null 18 | python3 tracing.py $EVAL_DIR/mruby_trace $EVAL_DIR/inputs $EVAL_DIR/traces 19 | # extract stack and heap addr ranges from logfiles 20 | python3 addr_ranges.py --eval_dir $EVAL_DIR $EVAL_DIR/traces 21 | popd > /dev/null 22 | -------------------------------------------------------------------------------- /docker/example_scripts/03_rca.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | set -eu 4 | 5 | # go to directory 6 | cd $AURORA_GIT_DIR/root_cause_analysis 7 | 8 | # Build components 9 | cargo build --release --bin monitor 10 | cargo build --release --bin rca 11 | 12 | # run root cause analysis 13 | cargo run --release --bin rca -- --eval-dir $EVAL_DIR --trace-dir $EVAL_DIR --monitor --rank-predicates 14 | 15 | # (Optional) enrich with debug symbols 16 | cargo run --release --bin addr2line -- --eval-dir $EVAL_DIR 17 | -------------------------------------------------------------------------------- /docker/pull.sh: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bash 2 | 3 | set -eu 4 | 5 | # This script pulls the docker image from Dockerhub and changes the tag 6 | # such that the convenience script run.sh still works as expected 7 | 8 | # Use this *instead* of manually building the docker image 9 | 10 | # pull image 11 | echo "Pulling mu00d8/aurora:latest" 12 | docker pull mu00d8/aurora:latest 13 | 14 | # re-tag image 15 | echo "Changing tag to aurora:latest" 16 | docker tag mu00d8/aurora:latest aurora:latest 17 | 18 | 
-------------------------------------------------------------------------------- /docker/run.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | set -eu 4 | 5 | IMAGE_NAME="aurora:latest" 6 | 7 | function yes_no() { 8 | if [[ "$1" == "yes" || "$1" == "y" ]]; then 9 | return 0 10 | else 11 | return 1 12 | fi 13 | } 14 | 15 | ancestor="$(docker ps --filter="ancestor=${IMAGE_NAME}" --latest --quiet)" 16 | 17 | if [[ ! -z "$ancestor" ]]; then 18 | read -p "Found running instance: $ancestor, connect?" yn 19 | if yes_no "$yn"; then 20 | cmd="docker exec -it --user "$UID:$(id -g)" $ancestor /usr/bin/bash" 21 | echo $cmd 22 | $cmd 23 | exit 0 24 | fi 25 | #Else; Create new container 26 | fi 27 | 28 | 29 | cmd="docker run --rm -it --privileged ${IMAGE_NAME} /usr/bin/bash" 30 | 31 | echo "$cmd" 32 | $cmd 33 | 34 | -------------------------------------------------------------------------------- /example.zip: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/RUB-SysSec/aurora/707b94f1d7ac46c9e4575dfcbbf0dab08bbb3af2/example.zip -------------------------------------------------------------------------------- /paper.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/RUB-SysSec/aurora/707b94f1d7ac46c9e4575dfcbbf0dab08bbb3af2/paper.png -------------------------------------------------------------------------------- /root_cause_analysis/.gitignore: -------------------------------------------------------------------------------- 1 | Cargo.lock 2 | target 3 | -------------------------------------------------------------------------------- /root_cause_analysis/Cargo.toml: -------------------------------------------------------------------------------- 1 | [workspace] 2 | 3 | members = [ 4 | "predicate_monitoring", 5 | "trace_analysis", 6 | "root_cause_analysis", 7 | ] 
-------------------------------------------------------------------------------- /root_cause_analysis/predicate_monitoring/Cargo.toml: -------------------------------------------------------------------------------- 1 | [package] 2 | name = "predicate_monitoring" 3 | version = "0.1.0" 4 | authors = ["Simon Wörner"] 5 | edition = "2018" 6 | 7 | [profile.release] 8 | lto = true 9 | opt-level = 3 10 | 11 | [dependencies] 12 | trace_analysis = { path = "../trace_analysis" } 13 | nix = "^0.15" 14 | ptracer = { git = "https://github.com/SWW13/ptracer.git", rev = "786fce1ddf8b107c65921f9f5e3caecc496ff233" } 15 | zydis = "^3.0" 16 | nom = "^5.0" 17 | bitflags = "^1.2" 18 | serde_json = "^1.0" 19 | log = "^0.4" 20 | env_logger = "^0.7" 21 | 22 | [[bin]] 23 | name = "monitor" 24 | path = "src/monitor.rs" -------------------------------------------------------------------------------- /root_cause_analysis/predicate_monitoring/src/assembler.rs: -------------------------------------------------------------------------------- 1 | use std::str::FromStr; 2 | 3 | use nix::libc::user_regs_struct; 4 | 5 | use crate::register::{Register, RegisterValue}; 6 | 7 | #[derive(Debug, Clone, Copy, PartialEq, Eq)] 8 | pub enum AccessSize { 9 | Size1Byte = 1, 10 | Size2Byte = 2, 11 | Size4Byte = 4, 12 | Size8Byte = 8, 13 | } 14 | 15 | impl FromStr for AccessSize { 16 | type Err = (); 17 | 18 | fn from_str(s: &str) -> Result<Self, Self::Err> { 19 | Ok(match s { 20 | "1" => Self::Size1Byte, 21 | "2" => Self::Size2Byte, 22 | "4" => Self::Size4Byte, 23 | "8" => Self::Size8Byte, 24 | _ => return Err(()), 25 | }) 26 | } 27 | } 28 | 29 | impl Default for AccessSize { 30 | fn default() -> Self { 31 | Self::Size8Byte 32 | } 33 | } 34 | 35 | #[derive(Debug, Clone, PartialEq, Eq)] 36 | pub enum Operand { 37 | Memory(MemoryLocation), 38 | Register(Register), 39 | Immediate(usize), 40 | } 41 | 42 | #[derive(Debug, Clone, PartialEq, Eq)] 43 | pub struct MemoryLocation { 44 | pub offset: Option<isize>, 45 | pub base: Option<Register>, 46 | 
pub index: Option<(Register, ArraySize)>, 47 | // pub access_size: AccessSize, 48 | } 49 | 50 | impl MemoryLocation { 51 | pub fn address(&self, registers: &user_regs_struct) -> usize { 52 | let address = self 53 | .base 54 | .and_then(|reg| Some(reg.value(registers))) 55 | .unwrap_or(0) 56 | + self 57 | .index 58 | .and_then(|(reg, size)| Some(reg.value(registers) * size as usize)) 59 | .unwrap_or(0); 60 | 61 | match self.offset { 62 | Some(offset) => { 63 | if offset >= 0 { 64 | address + offset.abs() as usize 65 | } else { 66 | address - offset.abs() as usize 67 | } 68 | } 69 | None => address, 70 | } 71 | } 72 | } 73 | 74 | #[derive(Debug, Clone, Copy, PartialEq, Eq)] 75 | pub enum ArraySize { 76 | Size1Byte = 1, 77 | Size2Byte = 2, 78 | Size4Byte = 4, 79 | Size8Byte = 8, 80 | } 81 | 82 | impl FromStr for ArraySize { 83 | type Err = (); 84 | 85 | fn from_str(s: &str) -> Result<Self, Self::Err> { 86 | Ok(match s { 87 | "1" => Self::Size1Byte, 88 | "2" => Self::Size2Byte, 89 | "4" => Self::Size4Byte, 90 | "8" => Self::Size8Byte, 91 | _ => return Err(()), 92 | }) 93 | } 94 | } 95 | 96 | impl Default for ArraySize { 97 | fn default() -> Self { 98 | Self::Size1Byte 99 | } 100 | } 101 | 102 | use nom::bytes::complete::tag; 103 | use nom::character::complete::{alphanumeric1, digit1, hex_digit1}; 104 | use nom::{ 105 | call, complete, do_parse, map, map_res, named, opt, peek, preceded, separated_list, switch, 106 | tag, take, take_while, verify, 107 | }; 108 | 109 | named!(pub operands(&str) -> Vec<Operand>, 110 | separated_list!(tag(","), complete!(call!(operand))) 111 | ); 112 | 113 | named!(operand(&str) -> Operand, 114 | do_parse!( 115 | take_while!(is_space) >> 116 | operand: switch!(peek!(take!(1)), 117 | "$" => call!(immediate) | 118 | "%" => map!(call!(register), |register| Operand::Register(register)) | 119 | _ => map!(call!(memory), |memory| Operand::Memory(memory)) 120 | ) >> 121 | (operand) 122 | ) 123 | ); 124 | 125 | named!(immediate(&str) -> Operand, 126 | do_parse!( 127 | 
tag!("$") >> 128 | num: call!(address) >> 129 | (Operand::Immediate(num)) 130 | ) 131 | ); 132 | 133 | named!(register(&str) -> Register, 134 | do_parse!( 135 | tag!("%") >> 136 | reg: map_res!( 137 | alphanumeric1, 138 | |name: &str| Register::from_str(name) 139 | ) >> 140 | (reg) 141 | ) 142 | ); 143 | 144 | named!(memory(&str) -> MemoryLocation, 145 | verify!(memory_empty, 146 | |memory: &MemoryLocation| memory.offset.is_some() || memory.base.is_some() || memory.index.is_some() 147 | ) 148 | ); 149 | 150 | named!(memory_empty(&str) -> MemoryLocation, 151 | do_parse!( 152 | offset: opt!(call!(address)) >> 153 | inner: opt!(call!(memory_inner)) >> 154 | (MemoryLocation { 155 | offset: offset.and_then(|offset| Some(offset as isize)), 156 | base: inner.and_then(|inner| inner.0), 157 | index: inner.and_then(|inner| inner.1), 158 | }) 159 | ) 160 | ); 161 | 162 | named!(memory_inner(&str) -> (Option<Register>, Option<(Register, ArraySize)>), 163 | complete!(do_parse!( 164 | tag!("(") >> 165 | take_while!(is_space) >> 166 | base: opt!(call!(register)) >> 167 | index: opt!(call!(memory_index)) >> 168 | take_while!(is_space) >> 169 | tag!(")") >> 170 | ((base, index)) 171 | )) 172 | ); 173 | 174 | named!(memory_index(&str) -> (Register, ArraySize), 175 | do_parse!( 176 | tag!(",") >> 177 | take_while!(is_space) >> 178 | index: call!(register) >> 179 | scale: opt!(call!(memory_index_scale)) >> 180 | ((index, scale.unwrap_or(ArraySize::Size1Byte))) 181 | ) 182 | ); 183 | 184 | named!(memory_index_scale(&str) -> ArraySize, 185 | do_parse!( 186 | tag!(",") >> 187 | take_while!(is_space) >> 188 | scale: map_res!(digit1, |size| ArraySize::from_str(size)) >> 189 | (scale) 190 | ) 191 | ); 192 | 193 | named!(address(&str) -> usize, 194 | do_parse!( 195 | neg: opt!(tag!("-")) >> 196 | num: switch!(peek!(take!(2)), 197 | "0x" => call!(hex_number) | 198 | _ => call!(number) 199 | ) >> 200 | (match neg { 201 | Some(_) => -(num as isize) as usize, 202 | None => num 203 | }) 204 | ) 205 | ); 
206 | 207 | named!(number(&str) -> usize, 208 | map_res!( 209 | digit1, 210 | |num: &str| num.parse::<usize>() 211 | ) 212 | ); 213 | named!(hex_number(&str) -> usize, 214 | preceded!(tag!("0x"), 215 | map_res!( 216 | hex_digit1, 217 | |num: &str| usize::from_str_radix(num, 16) 218 | ) 219 | ) 220 | ); 221 | 222 | #[inline] 223 | pub fn is_space(chr: char) -> bool { 224 | chr == ' ' || chr == '\t' 225 | } 226 | 227 | #[cfg(test)] 228 | mod tests { 229 | use super::*; 230 | use crate::register::*; 231 | 232 | macro_rules! parse_error { 233 | ( $func:ident, $input:expr ) => { 234 | print!("parsing {:?}", $input); 235 | let result = $func($input); 236 | println!(" -> {:?}", result); 237 | assert!(result.is_err()); 238 | }; 239 | } 240 | macro_rules! parse_eq_inner { 241 | ( $func:ident, $input:expr, $output:expr, $expected:expr ) => { 242 | print!("parsing {:?}", $input); 243 | let result = $func($input); 244 | println!(" -> {:?}", result); 245 | assert_eq!(result, Ok(($output, $expected))); 246 | }; 247 | } 248 | macro_rules! 
parse_eq { 249 | ( $func:ident, $input:expr, $expected:expr ) => { 250 | parse_eq_inner!($func, $input, "", $expected); 251 | }; 252 | } 253 | 254 | #[test] 255 | fn test_operands() { 256 | parse_eq!(operands, "", vec![]); 257 | parse_eq!( 258 | operands, 259 | "%eax, -0x127c(%rbp)", 260 | vec![ 261 | Operand::Register(Register32::Eax.into()), 262 | Operand::Memory(MemoryLocation { 263 | offset: Some(-0x127c), 264 | base: Some(Register64::Rbp.into()), 265 | index: None, 266 | }) 267 | ] 268 | ); 269 | } 270 | 271 | #[test] 272 | fn test_memory_operand() { 273 | parse_error!(operand, "(%ecx,2)"); 274 | parse_error!(operand, "(%ebx,%ecx,-1)"); 275 | parse_error!(operand, "(%ebx,%ecx,3)"); 276 | parse_error!(operand, "(%ebx,%ecx,0x8)"); 277 | 278 | parse_eq!( 279 | operand, 280 | "0x42(%rsi,%ebx,4)", 281 | Operand::Memory(MemoryLocation { 282 | offset: Some(0x42), 283 | base: Some(Register64::Rsi.into()), 284 | index: Some((Register32::Ebx.into(), ArraySize::Size4Byte)), 285 | }) 286 | ); 287 | parse_eq!( 288 | operand, 289 | "(%rax, %rcx, 8)", 290 | Operand::Memory(MemoryLocation { 291 | offset: None, 292 | base: Some(Register64::Rax.into()), 293 | index: Some((Register64::Rcx.into(), ArraySize::Size8Byte)), 294 | }) 295 | ); 296 | parse_eq!( 297 | operand, 298 | "-0x127c(%rbp)", 299 | Operand::Memory(MemoryLocation { 300 | offset: Some(-0x127c), 301 | base: Some(Register64::Rbp.into()), 302 | index: None, 303 | }) 304 | ); 305 | parse_eq!( 306 | operand, 307 | "(%esi,%rax)", 308 | Operand::Memory(MemoryLocation { 309 | offset: None, 310 | base: Some(Register32::Esi.into()), 311 | index: Some((Register64::Rax.into(), ArraySize::Size1Byte)), 312 | }) 313 | ); 314 | 315 | parse_eq!( 316 | operand, 317 | "0x1337", 318 | Operand::Memory(MemoryLocation { 319 | offset: Some(0x1337), 320 | base: None, 321 | index: None, 322 | }) 323 | ); 324 | parse_eq!( 325 | operand, 326 | "1337()", 327 | Operand::Memory(MemoryLocation { 328 | offset: Some(1337), 329 | base: None, 330 | 
index: None, 331 | }) 332 | ); 333 | parse_eq!( 334 | operand, 335 | "-0x42()", 336 | Operand::Memory(MemoryLocation { 337 | offset: Some(-0x42), 338 | base: None, 339 | index: None, 340 | }) 341 | ); 342 | } 343 | 344 | #[test] 345 | fn test_register_operand() { 346 | parse_error!(operand, "%abc"); 347 | parse_error!(operand, "%0xrax"); 348 | 349 | parse_eq!(operand, "%rax", Operand::Register(Register64::Rax.into())); 350 | parse_eq!(operand, "%ah", Operand::Register(Register8High::Ah.into())); 351 | } 352 | 353 | #[test] 354 | fn test_immediate_operand() { 355 | parse_error!(operand, "$+1"); 356 | parse_error!(operand, "$--1"); 357 | 358 | parse_eq!(operand, "$0x42", Operand::Immediate(0x42)); 359 | parse_eq!(operand, "$-1337", Operand::Immediate(-1337isize as usize)); 360 | } 361 | } 362 | -------------------------------------------------------------------------------- /root_cause_analysis/predicate_monitoring/src/lib.rs: -------------------------------------------------------------------------------- 1 | use log::{debug, error, info, trace, warn}; 2 | use nix::sys::wait::WaitStatus; 3 | use nix::unistd::Pid; 4 | use predicate::*; 5 | use ptracer::{ContinueMode, Ptracer}; 6 | use register::*; 7 | use rflags::RFlags; 8 | use std::collections::HashMap; 9 | use std::path::Path; 10 | use std::time::Instant; 11 | use trace_analysis::predicates::SerializedPredicate; 12 | use zydis::*; 13 | 14 | mod predicate; 15 | mod register; 16 | mod rflags; 17 | 18 | fn new_decoder() -> Decoder { 19 | Decoder::new(MachineMode::LONG_64, AddressWidth::_64).expect("failed to create decoder") 20 | } 21 | 22 | fn instruction(decoder: &Decoder, pid: Pid, address: usize) -> Option<DecodedInstruction> { 23 | let mut code = [0u8; 16]; 24 | ptracer::util::read_data(pid, address, &mut code).expect("failed to read memory"); 25 | trace!("code = {:02x?}", &code); 26 | 27 | let instruction = decoder.decode(&code).expect("failed to decode instruction"); 28 | if let Some(instruction) = instruction { 29 | 
Some(instruction) 30 | } else { 31 | warn!("No instructions found at {:#018x}.", address); 32 | None 33 | } 34 | } 35 | 36 | fn disasm(log_level: log::Level, decoder: &Decoder, pid: Pid, address: usize, length: usize) { 37 | debug!("disasm at {:#018x} with length {}:", address, length); 38 | 39 | let formatter = Formatter::new(FormatterStyle::INTEL).expect("failed to create formatter"); 40 | let mut code = vec![0u8; length]; 41 | if let Err(err) = ptracer::util::read_data(pid, address, &mut code) { 42 | warn!("failed to read memory for disasm, skipping: {:?}", err); 43 | return; 44 | } 45 | 46 | let mut buffer = [0u8; 200]; 47 | let mut buffer = OutputBuffer::new(&mut buffer[..]); 48 | 49 | for (instruction, ip) in decoder.instruction_iterator(&code, address as u64) { 50 | formatter 51 | .format_instruction(&instruction, &mut buffer, Some(ip), None) 52 | .expect("failed to format instruction"); 53 | log::log!(log_level, "{:#018x} {}", ip, buffer); 54 | } 55 | } 56 | 57 | #[derive(Debug, Clone)] 58 | pub struct RootCauseCandidate { 59 | pub address: usize, 60 | pub score: f64, 61 | pub predicate: Predicate, 62 | } 63 | 64 | impl RootCauseCandidate { 65 | pub fn satisfied( 66 | &self, 67 | dbg: &mut Ptracer, 68 | old_registers: &nix::libc::user_regs_struct, 69 | ) -> nix::Result<bool> { 70 | let old_rip = old_registers.rip; 71 | let new_rip = dbg.registers.rip; 72 | debug!("old_rip = {:#018x}, new_rip = {:#018x}", old_rip, new_rip); 73 | let rflags = RFlags::from_bits_truncate(dbg.registers.eflags); 74 | trace!("rflags = {:#018x}", rflags); 75 | 76 | // disasm 77 | if log::log_enabled!(log::Level::Trace) { 78 | let decoder = new_decoder(); 79 | disasm( 80 | log::Level::Trace, 81 | &decoder, 82 | dbg.event().pid().expect("pid missing"), 83 | old_rip as usize, 84 | 32, 85 | ); 86 | disasm( 87 | log::Level::Trace, 88 | &decoder, 89 | dbg.event().pid().expect("pid missing"), 90 | new_rip as usize, 91 | 32, 92 | ); 93 | } 94 | 95 | match self.predicate { 96 | 
Predicate::Compare(ref compare) => { 97 | let value = match compare.destination { 98 | ValueDestination::Register(ref reg) => reg.value(&dbg.registers), 99 | ValueDestination::Address(ref mem) => mem.address(&old_registers), 100 | ValueDestination::Memory(ref access_size, ref mem) => { 101 | let address = mem.address(&old_registers); 102 | debug!("address = {:#018x}", address); 103 | 104 | let value = ptracer::read( 105 | dbg.event().pid().expect("pid missing"), 106 | address as nix::sys::ptrace::AddressType, 107 | ) 108 | .expect("failed to read memory value") 109 | as usize; 110 | debug!("raw value = {:#018x}", value); 111 | 112 | match 1usize.checked_shl(*access_size as u32) { 113 | Some(bit) => value & (bit - 1), // keep only the low `access_size` bits 114 | _ => value, 115 | } 116 | } 117 | }; 118 | debug!( 119 | "value = {:#018x}, compare.value = {:#018x}", 120 | value, compare.value 121 | ); 122 | 123 | Ok(match compare.compare { 124 | Compare::Equal => value == compare.value, 125 | Compare::Greater => value > compare.value, 126 | Compare::GreaterOrEqual => value >= compare.value, 127 | Compare::Less => value < compare.value, 128 | Compare::NotEqual => value != compare.value, 129 | }) 130 | } 131 | Predicate::Edge(ref edge) => match edge.transition { 132 | EdgeTransition::Taken => { 133 | Ok(old_rip as usize == edge.source && new_rip as usize == edge.destination) 134 | } 135 | EdgeTransition::NotTaken => { 136 | Ok(old_rip as usize == edge.source && new_rip as usize != edge.destination) 137 | } 138 | }, 139 | Predicate::Visited => Ok(true), 140 | Predicate::FlagSet(flag) => Ok(rflags.contains(flag)), 141 | } 142 | } 143 | } 144 | 145 | fn convert_predicates( 146 | decoder: &Decoder, 147 | dbg: &mut Ptracer, 148 | predicates: Vec<SerializedPredicate>, 149 | ) -> HashMap<usize, RootCauseCandidate> { 150 | predicates 151 | .into_iter() 152 | .map(|pred| { 153 | debug!("pred = {:?}", pred); 154 | 155 | let address = pred.address; 156 | let instr = 157 | instruction(&decoder, dbg.pid, address).expect("failed to parse instruction"); 158 | 159 | if 
log::log_enabled!(log::Level::Debug) { 160 | let formatter = 161 | Formatter::new(FormatterStyle::INTEL).expect("failed to create formatter"); 162 | let mut buffer = [0u8; 200]; 163 | let mut buffer = OutputBuffer::new(&mut buffer[..]); 164 | 165 | formatter 166 | .format_instruction(&instr, &mut buffer, Some(dbg.registers.rip), None) 167 | .expect("failed to format instruction"); 168 | println!("{:#018x} {}", dbg.registers.rip, buffer); 169 | } 170 | trace!("{:#018x?} -> {:?}", address, instr); 171 | 172 | let converted = predicate::convert_predicate(&pred.name, instr).and_then(|predicate| { 173 | Some(RootCauseCandidate { 174 | address, 175 | score: pred.score, 176 | predicate, 177 | }) 178 | }); 179 | 180 | if converted.is_none() { 181 | warn!("could not convert predicate {:016x?}", pred); 182 | } 183 | 184 | converted 185 | }) 186 | .filter_map(|pred| pred) 187 | .map(|pred| (pred.address, pred)) 188 | .collect() 189 | } 190 | 191 | fn insert_breakpoints(dbg: &mut Ptracer, rccs: &HashMap<usize, RootCauseCandidate>) { 192 | for address in rccs.keys() { 193 | dbg.insert_breakpoint(*address) 194 | .expect("failed to insert breakpoint"); 195 | } 196 | debug!("breakpoints = {:#018x?}", dbg.breakpoints()); 197 | } 198 | 199 | fn add_rccs_single_steps( 200 | pid: Pid, 201 | dbg: &mut Ptracer, 202 | rccs: &HashMap<usize, RootCauseCandidate>, 203 | single_steping: &mut HashMap<Pid, nix::libc::user_regs_struct>, 204 | ) { 205 | let rip = dbg.registers.rip; 206 | 207 | if let Some(rcc) = rccs.get(&(dbg.registers.rip as usize)) { 208 | debug!( 209 | "breakpoint at {:#018x} of predicate {:016x?} reached", 210 | rip, rcc.predicate 211 | ); 212 | 213 | single_steping.insert(pid, dbg.registers); 214 | } 215 | } 216 | 217 | fn check_rccs( 218 | dbg: &mut Ptracer, 219 | old_registers: &nix::libc::user_regs_struct, 220 | rccs: &mut HashMap<usize, RootCauseCandidate>, 221 | satisfaction: &mut Vec<(usize, Predicate)>, 222 | ) { 223 | let old_rip = old_registers.rip; 224 | let remove_breakpoint = |dbg: &mut Ptracer, address| { 225 | if let Err(err) = dbg.remove_breakpoint(address) { 226 | 
error!( 227 | "failed to remove breakpoint at {:#018x}, skipping: {:?}", 228 | address, err 229 | ); 230 | } 231 | }; 232 | 233 | if let Some(rcc) = rccs.get(&(old_rip as usize)) { 234 | debug!( 235 | "single step target at {:#018x} of predicate {:016x?} reached", 236 | dbg.registers.rip, rcc.predicate 237 | ); 238 | 239 | if !rcc 240 | .satisfied(dbg, old_registers) 241 | .expect("failed to test predicate") 242 | { 243 | trace!("predicate {:016x?} NOT satisfied", rcc.predicate); 244 | return; 245 | } 246 | } else { 247 | // removing the breakpoint may have failed early when the predicate was satisfied 248 | remove_breakpoint(dbg, old_rip as usize); 249 | return; 250 | } 251 | 252 | // predicate satisfied 253 | if let Some(rcc) = rccs.remove(&(old_rip as usize)) { 254 | info!( 255 | "predicate {:016x?} satisfied, moving predicate to satisfaction and removing breakpoint", 256 | rcc.predicate 257 | ); 258 | satisfaction.push((rcc.address, rcc.predicate)); 259 | remove_breakpoint(dbg, rcc.address); 260 | } 261 | } 262 | 263 | fn collect_satisfied( 264 | decoder: &Decoder, 265 | dbg: &mut Ptracer, 266 | rccs: &mut HashMap<usize, RootCauseCandidate>, 267 | timeout: u64, 268 | ) -> Vec<(usize, Predicate)> { 269 | let mut satisfaction = vec![]; 270 | let mut single_steping = HashMap::new(); 271 | let start_time = Instant::now(); 272 | 273 | loop { 274 | trace!("threads = {:?}", dbg.threads); 275 | trace!("registers = {:#018x?}", dbg.registers); 276 | trace!("single_steping = {:#018x?}", single_steping); 277 | 278 | let rip = dbg.registers.rip; 279 | trace!("rip = {:#018x}", rip); 280 | 281 | if let Some(pid) = dbg.event().pid() { 282 | if log::log_enabled!(log::Level::Trace) { 283 | disasm(log::Level::Trace, decoder, pid, rip as usize, 32); 284 | } 285 | 286 | // we assume that single stepping on a breakpoint raises two ptrace events 287 | // therefore when our thread is in single step mode 288 | // (even when hitting another breakpoint) 289 | // we can just check the rcc without checking the next 
breakpoint 290 | // otherwise we hit a breakpoint and need to request single stepping 291 | if let Some(old_registers) = single_steping.remove(&pid) { 292 | // handle previous single stepping request 293 | check_rccs(dbg, &old_registers, rccs, &mut satisfaction) 294 | } else { 295 | // add single stepping request 296 | add_rccs_single_steps(pid, dbg, rccs, &mut single_steping); 297 | } 298 | } 299 | 300 | if start_time.elapsed().as_secs() >= timeout { 301 | info!("timeout reached, end debugging."); 302 | break; 303 | } 304 | 305 | // A ptrace call may return `ESRCH` when the debuggee is 306 | // dead or not ptrace-stopped. Only a dead debuggee is fatal. 307 | // Retry the request up to 3 times to verify the debuggee is dead. 308 | let mut result = Ok(()); 309 | for _ in 0..3 { 310 | // continue / single step debuggee 311 | result = if single_steping.is_empty() { 312 | dbg.cont(ContinueMode::Default) 313 | } else { 314 | dbg.step(ContinueMode::Default) 315 | }; 316 | 317 | // retry on ESRCH 318 | match result { 319 | Ok(_) => break, 320 | Err(err) => { 321 | debug!("ptrace returned error: {}", err); 322 | 323 | if err.as_errno() != Some(nix::errno::Errno::ESRCH) { 324 | break; 325 | } 326 | } 327 | } 328 | } 329 | let event = dbg.event(); 330 | 331 | // handle unexpected missing debuggee 332 | if let Err(err) = result { 333 | info!("event = {:?}", dbg.event()); 334 | warn!( 335 | "debuggee exited unexpectedly, cannot continue debugging: {:?}", 336 | err 337 | ); 338 | break; 339 | } else { 340 | debug!("event = {:?}", dbg.event()); 341 | } 342 | 343 | // handle exited / signaled debuggee 344 | match event { 345 | WaitStatus::Exited(pid, ret) if *pid == dbg.pid => { 346 | info!( 347 | "debuggee exited gracefully with return code {}, stopping.", 348 | ret 349 | ); 350 | break; 351 | } 352 | WaitStatus::Signaled(pid, signal, _) if *pid == dbg.pid => { 353 | info!( 354 | "debuggee exited ungracefully with signal {}, stopping.", 355 | signal 356 | ); 357 | break; 358 | } 359 | _ => {} 
360 | } 361 | 362 | if dbg.threads.is_empty() { 363 | info!("no more threads, end debugging."); 364 | break; 365 | } 366 | } 367 | 368 | info!("satisfaction = {:#018x?}", satisfaction); 369 | 370 | satisfaction 371 | } 372 | 373 | pub fn rank_predicates( 374 | mut dbg: Ptracer, 375 | predicates: Vec<SerializedPredicate>, 376 | timeout: u64, 377 | ) -> Vec<usize> { 378 | let decoder = new_decoder(); 379 | 380 | let mut rccs = convert_predicates(&decoder, &mut dbg, predicates); 381 | debug!("rccs = {:#018x?}", rccs); 382 | 383 | insert_breakpoints(&mut dbg, &rccs); 384 | 385 | let satisfaction = collect_satisfied(&decoder, &mut dbg, &mut rccs, timeout); 386 | satisfaction.into_iter().map(|(addr, _)| addr).collect() 387 | } 388 | 389 | pub fn spawn_dbg(path: &Path, args: &[String]) -> Ptracer { 390 | Ptracer::spawn(path, args).expect("spawn failed") 391 | } 392 | -------------------------------------------------------------------------------- /root_cause_analysis/predicate_monitoring/src/monitor.rs: -------------------------------------------------------------------------------- 1 | use log::debug; 2 | use ptracer::Ptracer; 3 | use std::env; 4 | use std::fs; 5 | use std::path::Path; 6 | use trace_analysis::predicates::SerializedPredicate; 7 | 8 | fn deserialize_predicates(predicate_file: &String) -> Vec<SerializedPredicate> { 9 | let content = fs::read_to_string(predicate_file).expect("Could not read predicates.json"); 10 | 11 | serde_json::from_str(&content).expect("Could not deserialize predicates.") 12 | } 13 | 14 | fn serialize_ranking(out_file: &String, ranking: &Vec<usize>) { 15 | let content = serde_json::to_string(&ranking).expect("Could not serialize ranking"); 16 | fs::write(out_file, content).expect(&format!("Could not write {}", out_file)); 17 | } 18 | 19 | fn main() { 20 | match env::var("RUST_LOG") { 21 | Err(_) => { 22 | env::set_var("RUST_LOG", "error"); 23 | } 24 | Ok(_) => {} 25 | } 26 | env_logger::init(); 27 | 28 | let args: Vec<_> = env::args().collect(); 29 | debug!("args = {:#?}", args); 30 | 31 | if 
args.len() < 5 { 32 | println!( 33 | "usage: {} <out_file> <predicate_file> <timeout> <cmd> [argument]...", 34 | args[0] 35 | ); 36 | return; 37 | } 38 | 39 | let out_file = args.get(1).expect("No out file specified"); 40 | let predicate_file = args.get(2).expect("No predicate file specified"); 41 | let timeout: u64 = args 42 | .get(3) 43 | .expect("No timeout specified") 44 | .parse() 45 | .expect("Could not parse timeout"); 46 | let cmd = args.get(4).expect("No cmd line specified"); 47 | let cmd_args: Vec<_> = args[5..].iter().cloned().collect(); 48 | 49 | debug!("cmd = {:?}", cmd); 50 | debug!("cmd_args = {:?}", cmd_args); 51 | 52 | let dbg = Ptracer::spawn(Path::new(&cmd), cmd_args.as_ref()).expect("spawn failed"); 53 | 54 | let predicates = deserialize_predicates(&predicate_file); 55 | let ranking = predicate_monitoring::rank_predicates(dbg, predicates, timeout); 56 | 57 | serialize_ranking(out_file, &ranking); 58 | } 59 | -------------------------------------------------------------------------------- /root_cause_analysis/predicate_monitoring/src/predicate.rs: -------------------------------------------------------------------------------- 1 | use nix::libc::user_regs_struct; 2 | use std::str::FromStr; 3 | 4 | use crate::register::{Register, RegisterValue}; 5 | use crate::rflags::RFlags; 6 | 7 | #[derive(Debug, Clone, PartialEq, Eq)] 8 | pub enum Predicate { 9 | Compare(ComparePredicate), 10 | Edge(EdgePredicate), 11 | FlagSet(RFlags), 12 | Visited, 13 | } 14 | 15 | #[derive(Debug, Clone, PartialEq, Eq)] 16 | pub struct ComparePredicate { 17 | pub destination: ValueDestination, 18 | pub compare: Compare, 19 | pub value: usize, 20 | } 21 | 22 | #[derive(Debug, Clone, PartialEq, Eq)] 23 | pub enum ValueDestination { 24 | Address(MemoryLocation), 25 | Memory(AccessSize, MemoryLocation), 26 | Register(Register), 27 | } 28 | 29 | impl ValueDestination { 30 | pub fn register(register: Register) -> Self { 31 | Self::Register(register) 32 | } 33 | } 34 | 35 | pub type AccessSize = u8; 36 | 37 | #[derive(Debug, 
Clone, PartialEq, Eq)] 38 | pub struct MemoryLocation { 39 | segment: Option<Register>, 40 | base: Option<Register>, 41 | index: Option<Register>, 42 | scale: u8, 43 | displacement: Option<i64>, 44 | } 45 | 46 | impl MemoryLocation { 47 | fn from_memory_info(mem: &zydis::ffi::MemoryInfo) -> Self { 48 | Self { 49 | segment: Register::from_zydis_register(mem.segment), 50 | base: Register::from_zydis_register(mem.base), 51 | index: Register::from_zydis_register(mem.index), 52 | scale: mem.scale, 53 | displacement: if mem.disp.has_displacement { 54 | Some(mem.disp.displacement) 55 | } else { 56 | None 57 | }, 58 | } 59 | } 60 | } 61 | 62 | impl MemoryLocation { 63 | pub fn address(&self, registers: &user_regs_struct) -> usize { 64 | let address = self 65 | .base 66 | .and_then(|reg| Some(reg.value(registers))) 67 | .unwrap_or(0) 68 | + self 69 | .index 70 | .and_then(|reg| Some(reg.value(registers) * self.scale as usize)) 71 | .unwrap_or(0); 72 | 73 | match self.displacement { 74 | Some(displacement) => { 75 | if displacement >= 0 { 76 | address + displacement.abs() as usize 77 | } else { 78 | address - displacement.abs() as usize 79 | } 80 | } 81 | None => address, 82 | } 83 | } 84 | } 85 | 86 | #[derive(Debug, Clone, Copy, PartialEq, Eq)] 87 | pub enum Compare { 88 | Less, 89 | Greater, 90 | GreaterOrEqual, 91 | Equal, 92 | NotEqual, 93 | } 94 | 95 | #[derive(Debug, Clone, PartialEq, Eq)] 96 | pub struct EdgePredicate { 97 | pub source: usize, 98 | pub transition: EdgeTransition, 99 | pub destination: usize, 100 | } 101 | 102 | #[derive(Debug, Clone, Copy, PartialEq, Eq)] 103 | pub enum EdgeTransition { 104 | Taken, 105 | NotTaken, 106 | } 107 | 108 | pub fn convert_predicate( 109 | predicate: &str, 110 | instruction: zydis::DecodedInstruction, 111 | ) -> Option<Predicate> { 112 | let parts: Vec<_> = predicate.split(' ').collect(); 113 | let function = match parts.len() { 114 | 1 | 2 => parts[0], 115 | 3 => parts[1], 116 | _ => unimplemented!(), 117 | }; 118 | 119 | if function.contains("edge") { 120 | let 
source = usize::from_str_radix(&parts[0][2..], 16).expect("failed to parse source"); 121 | let destination = 122 | usize::from_str_radix(&parts[2][2..], 16).expect("failed to parse destination"); 123 | let transition = match function { 124 | "has_edge_to" => EdgeTransition::Taken, 125 | "edge_only_taken_to" => EdgeTransition::NotTaken, 126 | "last_edge_to" => return None, 127 | _ => unimplemented!(), 128 | }; 129 | 130 | return Some(Predicate::Edge(EdgePredicate { 131 | source, 132 | transition, 133 | destination, 134 | })); 135 | } else if function.contains("reg_val") { 136 | let value = usize::from_str_radix(&parts[2][2..], 16).expect("failed to parse value"); 137 | let memory_locations = instruction.operands[..instruction.operand_count as usize] 138 | .into_iter() 139 | .filter(|op| match op.ty { 140 | zydis::OperandType::MEMORY => true, 141 | _ => false, 142 | }); 143 | let memory = memory_locations 144 | .last() 145 | .and_then(|op| Some(MemoryLocation::from_memory_info(&op.mem))); 146 | 147 | let destination = match parts[0] { 148 | "memory_address" => ValueDestination::Address(memory.expect("no memory location")), 149 | "memory_value" => ValueDestination::Memory( 150 | instruction.operand_width, 151 | memory.expect("no memory location"), 152 | ), 153 | 154 | "seg_cs" => return None, 155 | "seg_ss" => return None, 156 | "seg_ds" => return None, 157 | "seg_es" => return None, 158 | "seg_fs" => return None, 159 | "seg_gs" => return None, 160 | 161 | "eflags" => return None, 162 | 163 | register => ValueDestination::Register( 164 | Register::from_str(register).expect("failed to parse register"), 165 | ), 166 | }; 167 | 168 | let compare = match function { 169 | "min_reg_val_less" => Compare::Less, 170 | "max_reg_val_less" => Compare::Less, 171 | "last_reg_val_less" => return None, 172 | "max_min_diff_reg_val_less" => return None, 173 | 174 | "min_reg_val_greater_or_equal" => Compare::GreaterOrEqual, 175 | "max_reg_val_greater_or_equal" => 
Compare::GreaterOrEqual, 176 | "last_reg_val_greater_or_equal" => return None, 177 | "max_min_diff_reg_val_greater_or_equal" => return None, 178 | 179 | _ => unimplemented!(), 180 | }; 181 | 182 | return Some(Predicate::Compare(ComparePredicate { 183 | destination, 184 | compare, 185 | value, 186 | })); 187 | } else if function.contains("ins_count") { 188 | // "ins_count_less" 189 | // "ins_count_greater_or_equal" 190 | } else if function.contains("selector_val") { 191 | // "selector_val_less_name" 192 | // "selector_val_less" 193 | // "selector_val_greater_or_equal_name" 194 | // "selector_val_greater_or_equal" 195 | } else if function.contains("num_successors") { 196 | // "num_successors_greater" => 197 | // "num_successors_equal" => 198 | } else if function.contains("flag") { 199 | let flag = match function { 200 | "min_carry_flag_set" => RFlags::CARRY_FLAG, 201 | "min_parity_flag_set" => RFlags::PARITY_FLAG, 202 | "min_adjust_flag_set" => RFlags::AUXILIARY_CARRY_FLAG, 203 | "min_zero_flag_set" => RFlags::ZERO_FLAG, 204 | "min_sign_flag_set" => RFlags::SIGN_FLAG, 205 | "min_trap_flag_set" => RFlags::TRAP_FLAG, 206 | "min_interrupt_flag_set" => RFlags::INTERRUPT_FLAG, 207 | "min_direction_flag_set" => RFlags::DIRECTION_FLAG, 208 | "min_overflow_flag_set" => RFlags::OVERFLOW_FLAG, 209 | 210 | "max_carry_flag_set" => RFlags::CARRY_FLAG, 211 | "max_parity_flag_set" => RFlags::PARITY_FLAG, 212 | "max_adjust_flag_set" => RFlags::AUXILIARY_CARRY_FLAG, 213 | "max_zero_flag_set" => RFlags::ZERO_FLAG, 214 | "max_sign_flag_set" => RFlags::SIGN_FLAG, 215 | "max_trap_flag_set" => RFlags::TRAP_FLAG, 216 | "max_interrupt_flag_set" => RFlags::INTERRUPT_FLAG, 217 | "max_direction_flag_set" => RFlags::DIRECTION_FLAG, 218 | "max_overflow_flag_set" => RFlags::OVERFLOW_FLAG, 219 | 220 | "last_carry_flag_set" => return None, 221 | "last_parity_flag_set" => return None, 222 | "last_adjust_flag_set" => return None, 223 | "last_zero_flag_set" => return None, 224 | "last_sign_flag_set" 
=> return None, 225 | "last_trap_flag_set" => return None, 226 | "last_interrupt_flag_set" => return None, 227 | "last_direction_flag_set" => return None, 228 | "last_overflow_flag_set" => return None, 229 | 230 | _ => unimplemented!(), 231 | }; 232 | 233 | return Some(Predicate::FlagSet(flag)); 234 | } else if function == "is_visited" { 235 | return Some(Predicate::Visited); 236 | } else { 237 | log::error!("unknown predicate function {:?}", function); 238 | unimplemented!() 239 | } 240 | 241 | None 242 | } 243 | -------------------------------------------------------------------------------- /root_cause_analysis/predicate_monitoring/src/register.rs: -------------------------------------------------------------------------------- 1 | use std::fmt; 2 | use std::str::FromStr; 3 | 4 | use nix::libc::user_regs_struct; 5 | use zydis::Register as ZydisRegister; 6 | 7 | pub trait ArchRegister { 8 | fn arch_register(self) -> Register; 9 | } 10 | 11 | pub trait RegisterValue<T> { 12 | fn value(self, registers: &user_regs_struct) -> T; 13 | } 14 | 15 | #[derive(Debug, Clone, Copy, PartialEq, Eq)] 16 | pub enum Register64 { 17 | Rax, 18 | Rbx, 19 | Rcx, 20 | Rdx, 21 | Rbp, 22 | Rsi, 23 | Rdi, 24 | Rsp, 25 | Rip, 26 | R8, 27 | R9, 28 | R10, 29 | R11, 30 | R12, 31 | R13, 32 | R14, 33 | R15, 34 | } 35 | 36 | impl ArchRegister for Register64 { 37 | fn arch_register(self) -> Register { 38 | Register::Register64(self) 39 | } 40 | } 41 | impl RegisterValue<u64> for Register64 { 42 | fn value(self, registers: &user_regs_struct) -> u64 { 43 | match self { 44 | Self::Rax => registers.rax, 45 | Self::Rbx => registers.rbx, 46 | Self::Rcx => registers.rcx, 47 | Self::Rdx => registers.rdx, 48 | Self::Rbp => registers.rbp, 49 | Self::Rsi => registers.rsi, 50 | Self::Rdi => registers.rdi, 51 | Self::Rsp => registers.rsp, 52 | Self::Rip => registers.rip, 53 | Self::R8 => registers.r8, 54 | Self::R9 => registers.r9, 55 | Self::R10 => registers.r10, 56 | Self::R11 => registers.r11, 57 | Self::R12 => 
registers.r12, 58 | Self::R13 => registers.r13, 59 | Self::R14 => registers.r14, 60 | Self::R15 => registers.r15, 61 | } 62 | } 63 | } 64 | 65 | impl From<Register64> for Register { 66 | fn from(register: Register64) -> Self { 67 | Register::Register64(register) 68 | } 69 | } 70 | 71 | impl fmt::Display for Register64 { 72 | fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result { 73 | write!( 74 | f, 75 | "{}", 76 | match self { 77 | Self::Rax => "rax", 78 | Self::Rbx => "rbx", 79 | Self::Rcx => "rcx", 80 | Self::Rdx => "rdx", 81 | Self::Rbp => "rbp", 82 | Self::Rsi => "rsi", 83 | Self::Rdi => "rdi", 84 | Self::Rsp => "rsp", 85 | Self::Rip => "rip", 86 | Self::R8 => "r8", 87 | Self::R9 => "r9", 88 | Self::R10 => "r10", 89 | Self::R11 => "r11", 90 | Self::R12 => "r12", 91 | Self::R13 => "r13", 92 | Self::R14 => "r14", 93 | Self::R15 => "r15", 94 | } 95 | ) 96 | } 97 | } 98 | 99 | impl FromStr for Register64 { 100 | type Err = (); 101 | 102 | fn from_str(s: &str) -> Result<Self, Self::Err> { 103 | Ok(match s { 104 | "rax" => Self::Rax, 105 | "rbx" => Self::Rbx, 106 | "rcx" => Self::Rcx, 107 | "rdx" => Self::Rdx, 108 | "rbp" => Self::Rbp, 109 | "rsi" => Self::Rsi, 110 | "rdi" => Self::Rdi, 111 | "rsp" => Self::Rsp, 112 | "rip" => Self::Rip, 113 | "r8" => Self::R8, 114 | "r9" => Self::R9, 115 | "r10" => Self::R10, 116 | "r11" => Self::R11, 117 | "r12" => Self::R12, 118 | "r13" => Self::R13, 119 | "r14" => Self::R14, 120 | "r15" => Self::R15, 121 | _ => return Err(()), 122 | }) 123 | } 124 | } 125 | 126 | #[derive(Debug, Clone, Copy, PartialEq, Eq)] 127 | pub enum Register32 { 128 | Eax, 129 | Ebx, 130 | Ecx, 131 | Edx, 132 | Ebp, 133 | Esi, 134 | Edi, 135 | Esp, 136 | Eip, 137 | R8d, 138 | R9d, 139 | R10d, 140 | R11d, 141 | R12d, 142 | R13d, 143 | R14d, 144 | R15d, 145 | } 146 | 147 | impl ArchRegister for Register32 { 148 | fn arch_register(self) -> Register { 149 | Register::Register64(match self { 150 | Self::Eax => Register64::Rax, 151 | Self::Ebx => Register64::Rbx, 152 | Self::Ecx => 
Register64::Rcx, 153 | Self::Edx => Register64::Rdx, 154 | Self::Ebp => Register64::Rbp, 155 | Self::Esi => Register64::Rsi, 156 | Self::Edi => Register64::Rdi, 157 | Self::Esp => Register64::Rsp, 158 | Self::Eip => Register64::Rip, 159 | Self::R8d => Register64::R8, 160 | Self::R9d => Register64::R9, 161 | Self::R10d => Register64::R10, 162 | Self::R11d => Register64::R11, 163 | Self::R12d => Register64::R12, 164 | Self::R13d => Register64::R13, 165 | Self::R14d => Register64::R14, 166 | Self::R15d => Register64::R15, 167 | }) 168 | } 169 | } 170 | impl RegisterValue<u32> for Register32 { 171 | fn value(self, registers: &user_regs_struct) -> u32 { 172 | (self.arch_register().value(registers) & 0xFFFF_FFFF) as u32 173 | } 174 | } 175 | 176 | impl From<Register32> for Register { 177 | fn from(register: Register32) -> Self { 178 | Register::Register32(register) 179 | } 180 | } 181 | 182 | impl fmt::Display for Register32 { 183 | fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result { 184 | write!( 185 | f, 186 | "{}", 187 | match self { 188 | Self::Eax => "eax", 189 | Self::Ebx => "ebx", 190 | Self::Ecx => "ecx", 191 | Self::Edx => "edx", 192 | Self::Ebp => "ebp", 193 | Self::Esi => "esi", 194 | Self::Edi => "edi", 195 | Self::Esp => "esp", 196 | Self::Eip => "eip", 197 | Self::R8d => "r8d", 198 | Self::R9d => "r9d", 199 | Self::R10d => "r10d", 200 | Self::R11d => "r11d", 201 | Self::R12d => "r12d", 202 | Self::R13d => "r13d", 203 | Self::R14d => "r14d", 204 | Self::R15d => "r15d", 205 | } 206 | ) 207 | } 208 | } 209 | 210 | impl FromStr for Register32 { 211 | type Err = (); 212 | 213 | fn from_str(s: &str) -> Result<Self, Self::Err> { 214 | Ok(match s { 215 | "eax" => Self::Eax, 216 | "ebx" => Self::Ebx, 217 | "ecx" => Self::Ecx, 218 | "edx" => Self::Edx, 219 | "ebp" => Self::Ebp, 220 | "esi" => Self::Esi, 221 | "edi" => Self::Edi, 222 | "esp" => Self::Esp, 223 | "eip" => Self::Eip, 224 | "r8d" => Self::R8d, 225 | "r9d" => Self::R9d, 226 | "r10d" => Self::R10d, 227 | "r11d" => Self::R11d, 228 | 
"r12d" => Self::R12d, 229 | "r13d" => Self::R13d, 230 | "r14d" => Self::R14d, 231 | "r15d" => Self::R15d, 232 | _ => return Err(()), 233 | }) 234 | } 235 | } 236 | 237 | #[derive(Debug, Clone, Copy, PartialEq, Eq)] 238 | pub enum Register16 { 239 | Ax, 240 | Bx, 241 | Cx, 242 | Dx, 243 | Bp, 244 | Si, 245 | Di, 246 | Sp, 247 | Ip, 248 | Cs, 249 | Ss, 250 | Ds, 251 | Es, 252 | Fs, 253 | Gs, 254 | R8w, 255 | R9w, 256 | R10w, 257 | R11w, 258 | R12w, 259 | R13w, 260 | R14w, 261 | R15w, 262 | } 263 | 264 | impl ArchRegister for Register16 { 265 | fn arch_register(self) -> Register { 266 | match self { 267 | Self::Ax => Register::Register64(Register64::Rax), 268 | Self::Bx => Register::Register64(Register64::Rbx), 269 | Self::Cx => Register::Register64(Register64::Rcx), 270 | Self::Dx => Register::Register64(Register64::Rdx), 271 | Self::Bp => Register::Register64(Register64::Rbp), 272 | Self::Si => Register::Register64(Register64::Rsi), 273 | Self::Di => Register::Register64(Register64::Rdi), 274 | Self::Sp => Register::Register64(Register64::Rsp), 275 | Self::Ip => Register::Register64(Register64::Rip), 276 | Self::Cs => Register::Register16(Register16::Cs), 277 | Self::Ss => Register::Register16(Register16::Ss), 278 | Self::Ds => Register::Register16(Register16::Ds), 279 | Self::Es => Register::Register16(Register16::Es), 280 | Self::Fs => Register::Register16(Register16::Fs), 281 | Self::Gs => Register::Register16(Register16::Gs), 282 | Self::R8w => Register::Register64(Register64::R8), 283 | Self::R9w => Register::Register64(Register64::R9), 284 | Self::R10w => Register::Register64(Register64::R10), 285 | Self::R11w => Register::Register64(Register64::R11), 286 | Self::R12w => Register::Register64(Register64::R12), 287 | Self::R13w => Register::Register64(Register64::R13), 288 | Self::R14w => Register::Register64(Register64::R14), 289 | Self::R15w => Register::Register64(Register64::R15), 290 | } 291 | } 292 | } 293 | impl RegisterValue<u16> for Register16 { 294 | fn 
value(self, registers: &user_regs_struct) -> u16 { 295 | match self { 296 | Self::Cs => registers.cs as u16, 297 | Self::Ss => registers.ss as u16, 298 | Self::Ds => registers.ds as u16, 299 | Self::Es => registers.es as u16, 300 | Self::Fs => registers.fs as u16, 301 | Self::Gs => registers.gs as u16, 302 | _ => (self.arch_register().value(registers) & 0xFFFF) as u16, 303 | } 304 | } 305 | } 306 | 307 | impl From<Register16> for Register { 308 | fn from(register: Register16) -> Self { 309 | Register::Register16(register) 310 | } 311 | } 312 | 313 | impl fmt::Display for Register16 { 314 | fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result { 315 | write!( 316 | f, 317 | "{}", 318 | match self { 319 | Self::Ax => "ax", 320 | Self::Bx => "bx", 321 | Self::Cx => "cx", 322 | Self::Dx => "dx", 323 | Self::Bp => "bp", 324 | Self::Si => "si", 325 | Self::Di => "di", 326 | Self::Sp => "sp", 327 | Self::Ip => "ip", 328 | Self::Cs => "cs", 329 | Self::Ss => "ss", 330 | Self::Ds => "ds", 331 | Self::Es => "es", 332 | Self::Fs => "fs", 333 | Self::Gs => "gs", 334 | Self::R8w => "r8w", 335 | Self::R9w => "r9w", 336 | Self::R10w => "r10w", 337 | Self::R11w => "r11w", 338 | Self::R12w => "r12w", 339 | Self::R13w => "r13w", 340 | Self::R14w => "r14w", 341 | Self::R15w => "r15w", 342 | } 343 | ) 344 | } 345 | } 346 | 347 | impl FromStr for Register16 { 348 | type Err = (); 349 | 350 | fn from_str(s: &str) -> Result<Self, Self::Err> { 351 | Ok(match s { 352 | "ax" => Self::Ax, 353 | "bx" => Self::Bx, 354 | "cx" => Self::Cx, 355 | "dx" => Self::Dx, 356 | "bp" => Self::Bp, 357 | "si" => Self::Si, 358 | "di" => Self::Di, 359 | "sp" => Self::Sp, 360 | "ip" => Self::Ip, 361 | "cs" => Self::Cs, 362 | "ss" => Self::Ss, 363 | "ds" => Self::Ds, 364 | "es" => Self::Es, 365 | "fs" => Self::Fs, 366 | "gs" => Self::Gs, 367 | "r8w" => Self::R8w, 368 | "r9w" => Self::R9w, 369 | "r10w" => Self::R10w, 370 | "r11w" => Self::R11w, 371 | "r12w" => Self::R12w, 372 | "r13w" => Self::R13w, 373 | "r14w" => Self::R14w, 374 | 
"r15w" => Self::R15w, 375 | _ => return Err(()), 376 | }) 377 | } 378 | } 379 | 380 | #[derive(Debug, Clone, Copy, PartialEq, Eq)] 381 | pub enum Register8Low { 382 | Al, 383 | Bl, 384 | Cl, 385 | Dl, 386 | Bpl, 387 | Sil, 388 | Dil, 389 | Spl, 390 | R8b, 391 | R9b, 392 | R10b, 393 | R11b, 394 | R12b, 395 | R13b, 396 | R14b, 397 | R15b, 398 | } 399 | 400 | impl ArchRegister for Register8Low { 401 | fn arch_register(self) -> Register { 402 | Register::Register64(match self { 403 | Self::Al => Register64::Rax, 404 | Self::Bl => Register64::Rbx, 405 | Self::Cl => Register64::Rcx, 406 | Self::Dl => Register64::Rdx, 407 | Self::Bpl => Register64::Rbp, 408 | Self::Sil => Register64::Rsi, 409 | Self::Dil => Register64::Rdi, 410 | Self::Spl => Register64::Rsp, 411 | Self::R8b => Register64::R8, 412 | Self::R9b => Register64::R9, 413 | Self::R10b => Register64::R10, 414 | Self::R11b => Register64::R11, 415 | Self::R12b => Register64::R12, 416 | Self::R13b => Register64::R13, 417 | Self::R14b => Register64::R14, 418 | Self::R15b => Register64::R15, 419 | }) 420 | } 421 | } 422 | impl RegisterValue<u8> for Register8Low { 423 | fn value(self, registers: &user_regs_struct) -> u8 { 424 | (self.arch_register().value(registers) & 0xFF) as u8 425 | } 426 | } 427 | 428 | impl From<Register8Low> for Register { 429 | fn from(register: Register8Low) -> Self { 430 | Register::Register8Low(register) 431 | } 432 | } 433 | 434 | impl fmt::Display for Register8Low { 435 | fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result { 436 | write!( 437 | f, 438 | "{}", 439 | match self { 440 | Self::Al => "al", 441 | Self::Bl => "bl", 442 | Self::Cl => "cl", 443 | Self::Dl => "dl", 444 | Self::Bpl => "bpl", 445 | Self::Sil => "sil", 446 | Self::Dil => "dil", 447 | Self::Spl => "spl", 448 | Self::R8b => "r8b", 449 | Self::R9b => "r9b", 450 | Self::R10b => "r10b", 451 | Self::R11b => "r11b", 452 | Self::R12b => "r12b", 453 | Self::R13b => "r13b", 454 | Self::R14b => "r14b", 455 | Self::R15b => "r15b", 456 | } 457 | 
) 458 | } 459 | } 460 | 461 | impl FromStr for Register8Low { 462 | type Err = (); 463 | 464 | fn from_str(s: &str) -> Result<Self, Self::Err> { 465 | Ok(match s { 466 | "al" => Self::Al, 467 | "bl" => Self::Bl, 468 | "cl" => Self::Cl, 469 | "dl" => Self::Dl, 470 | "bpl" => Self::Bpl, 471 | "sil" => Self::Sil, 472 | "dil" => Self::Dil, 473 | "spl" => Self::Spl, 474 | "r8b" => Self::R8b, 475 | "r9b" => Self::R9b, 476 | "r10b" => Self::R10b, 477 | "r11b" => Self::R11b, 478 | "r12b" => Self::R12b, 479 | "r13b" => Self::R13b, 480 | "r14b" => Self::R14b, 481 | "r15b" => Self::R15b, 482 | _ => return Err(()), 483 | }) 484 | } 485 | } 486 | 487 | #[derive(Debug, Clone, Copy, PartialEq, Eq)] 488 | pub enum Register8High { 489 | Ah, 490 | Bh, 491 | Ch, 492 | Dh, 493 | } 494 | 495 | impl ArchRegister for Register8High { 496 | fn arch_register(self) -> Register { 497 | Register::Register64(match self { 498 | Self::Ah => Register64::Rax, 499 | Self::Bh => Register64::Rbx, 500 | Self::Ch => Register64::Rcx, 501 | Self::Dh => Register64::Rdx, 502 | }) 503 | } 504 | } 505 | impl RegisterValue<u8> for Register8High { 506 | fn value(self, registers: &user_regs_struct) -> u8 { 507 | ((self.arch_register().value(registers) >> 8) & 0xFF) as u8 508 | } 509 | } 510 | 511 | impl From<Register8High> for Register { 512 | fn from(register: Register8High) -> Self { 513 | Register::Register8High(register) 514 | } 515 | } 516 | 517 | impl fmt::Display for Register8High { 518 | fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result { 519 | write!( 520 | f, 521 | "{}", 522 | match self { 523 | Self::Ah => "ah", 524 | Self::Bh => "bh", 525 | Self::Ch => "ch", 526 | Self::Dh => "dh", 527 | } 528 | ) 529 | } 530 | } 531 | impl FromStr for Register8High { 532 | type Err = (); 533 | 534 | fn from_str(s: &str) -> Result<Self, Self::Err> { 535 | Ok(match s { 536 | "ah" => Self::Ah, 537 | "bh" => Self::Bh, 538 | "ch" => Self::Ch, 539 | "dh" => Self::Dh, 540 | _ => return Err(()), 541 | }) 542 | } 543 | } 544 | 545 | #[derive(Debug, Clone, Copy, 
PartialEq, Eq)] 546 | pub enum Register { 547 | Register64(Register64), 548 | Register32(Register32), 549 | Register16(Register16), 550 | Register8Low(Register8Low), 551 | Register8High(Register8High), 552 | } 553 | 554 | impl Register { 555 | pub fn from_zydis_register(reg: ZydisRegister) -> Option { 556 | match reg { 557 | ZydisRegister::AL => Some(Register8Low::Al.into()), 558 | ZydisRegister::CL => Some(Register8Low::Cl.into()), 559 | ZydisRegister::DL => Some(Register8Low::Dl.into()), 560 | ZydisRegister::BL => Some(Register8Low::Bl.into()), 561 | ZydisRegister::AH => Some(Register8High::Ah.into()), 562 | ZydisRegister::CH => Some(Register8High::Ch.into()), 563 | ZydisRegister::DH => Some(Register8High::Dh.into()), 564 | ZydisRegister::BH => Some(Register8High::Bh.into()), 565 | ZydisRegister::SPL => Some(Register8Low::Spl.into()), 566 | ZydisRegister::BPL => Some(Register8Low::Bpl.into()), 567 | ZydisRegister::SIL => Some(Register8Low::Sil.into()), 568 | ZydisRegister::DIL => Some(Register8Low::Dil.into()), 569 | ZydisRegister::R8B => Some(Register8Low::R8b.into()), 570 | ZydisRegister::R9B => Some(Register8Low::R9b.into()), 571 | ZydisRegister::R10B => Some(Register8Low::R10b.into()), 572 | ZydisRegister::R11B => Some(Register8Low::R11b.into()), 573 | ZydisRegister::R12B => Some(Register8Low::R12b.into()), 574 | ZydisRegister::R13B => Some(Register8Low::R13b.into()), 575 | ZydisRegister::R14B => Some(Register8Low::R14b.into()), 576 | ZydisRegister::R15B => Some(Register8Low::R15b.into()), 577 | ZydisRegister::AX => Some(Register16::Ax.into()), 578 | ZydisRegister::CX => Some(Register16::Cx.into()), 579 | ZydisRegister::DX => Some(Register16::Dx.into()), 580 | ZydisRegister::BX => Some(Register16::Bx.into()), 581 | ZydisRegister::SP => Some(Register16::Sp.into()), 582 | ZydisRegister::BP => Some(Register16::Bp.into()), 583 | ZydisRegister::SI => Some(Register16::Si.into()), 584 | ZydisRegister::DI => Some(Register16::Di.into()), 585 | ZydisRegister::R8W => 
Some(Register16::R8w.into()), 586 | ZydisRegister::R9W => Some(Register16::R9w.into()), 587 | ZydisRegister::R10W => Some(Register16::R10w.into()), 588 | ZydisRegister::R11W => Some(Register16::R11w.into()), 589 | ZydisRegister::R12W => Some(Register16::R12w.into()), 590 | ZydisRegister::R13W => Some(Register16::R13w.into()), 591 | ZydisRegister::R14W => Some(Register16::R14w.into()), 592 | ZydisRegister::R15W => Some(Register16::R15w.into()), 593 | ZydisRegister::EAX => Some(Register32::Eax.into()), 594 | ZydisRegister::ECX => Some(Register32::Ecx.into()), 595 | ZydisRegister::EDX => Some(Register32::Edx.into()), 596 | ZydisRegister::EBX => Some(Register32::Ebx.into()), 597 | ZydisRegister::ESP => Some(Register32::Esp.into()), 598 | ZydisRegister::EBP => Some(Register32::Ebp.into()), 599 | ZydisRegister::ESI => Some(Register32::Esi.into()), 600 | ZydisRegister::EDI => Some(Register32::Edi.into()), 601 | ZydisRegister::R8D => Some(Register32::R8d.into()), 602 | ZydisRegister::R9D => Some(Register32::R9d.into()), 603 | ZydisRegister::R10D => Some(Register32::R10d.into()), 604 | ZydisRegister::R11D => Some(Register32::R11d.into()), 605 | ZydisRegister::R12D => Some(Register32::R12d.into()), 606 | ZydisRegister::R13D => Some(Register32::R13d.into()), 607 | ZydisRegister::R14D => Some(Register32::R14d.into()), 608 | ZydisRegister::R15D => Some(Register32::R15d.into()), 609 | ZydisRegister::RAX => Some(Register64::Rax.into()), 610 | ZydisRegister::RCX => Some(Register64::Rcx.into()), 611 | ZydisRegister::RDX => Some(Register64::Rdx.into()), 612 | ZydisRegister::RBX => Some(Register64::Rbx.into()), 613 | ZydisRegister::RSP => Some(Register64::Rsp.into()), 614 | ZydisRegister::RBP => Some(Register64::Rbp.into()), 615 | ZydisRegister::RSI => Some(Register64::Rsi.into()), 616 | ZydisRegister::RDI => Some(Register64::Rdi.into()), 617 | ZydisRegister::R8 => Some(Register64::R8.into()), 618 | ZydisRegister::R9 => Some(Register64::R9.into()), 619 | ZydisRegister::R10 => 
Some(Register64::R10.into()), 620 | ZydisRegister::R11 => Some(Register64::R11.into()), 621 | ZydisRegister::R12 => Some(Register64::R12.into()), 622 | ZydisRegister::R13 => Some(Register64::R13.into()), 623 | ZydisRegister::R14 => Some(Register64::R14.into()), 624 | ZydisRegister::R15 => Some(Register64::R15.into()), 625 | // ZydisRegister::FLAGS => Some(Register64::Flags.into()), 626 | // ZydisRegister::EFLAGS => Some(Register64::Eflags.into()), 627 | // ZydisRegister::RFLAGS => Some(Register64::Rflags.into()), 628 | ZydisRegister::IP => Some(Register16::Ip.into()), 629 | ZydisRegister::EIP => Some(Register32::Eip.into()), 630 | ZydisRegister::RIP => Some(Register64::Rip.into()), 631 | ZydisRegister::ES => Some(Register16::Es.into()), 632 | ZydisRegister::CS => Some(Register16::Cs.into()), 633 | ZydisRegister::SS => Some(Register16::Ss.into()), 634 | ZydisRegister::DS => Some(Register16::Ds.into()), 635 | ZydisRegister::FS => Some(Register16::Fs.into()), 636 | ZydisRegister::GS => Some(Register16::Gs.into()), 637 | ZydisRegister::NONE => None, 638 | _ => None, 639 | } 640 | } 641 | } 642 | 643 | impl RegisterValue for Register { 644 | fn value(self, registers: &user_regs_struct) -> usize { 645 | match self { 646 | Self::Register64(register) => register.value(registers) as usize, 647 | Self::Register32(register) => register.value(registers) as usize, 648 | Self::Register16(register) => register.value(registers) as usize, 649 | Self::Register8Low(register) => register.value(registers) as usize, 650 | Self::Register8High(register) => register.value(registers) as usize, 651 | } 652 | } 653 | } 654 | 655 | impl fmt::Display for Register { 656 | fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result { 657 | match self { 658 | Self::Register64(register) => register.fmt(f), 659 | Self::Register32(register) => register.fmt(f), 660 | Self::Register16(register) => register.fmt(f), 661 | Self::Register8Low(register) => register.fmt(f), 662 | Self::Register8High(register) 
=> register.fmt(f), 663 | } 664 | } 665 | } 666 | 667 | impl FromStr for Register { 668 | type Err = (); 669 | 670 | fn from_str(s: &str) -> Result<Self, Self::Err> { 671 | Register64::from_str(s) 672 | .and_then(|reg| Ok(reg.into())) 673 | .or_else(|_| Register32::from_str(s).and_then(|reg| Ok(reg.into()))) 674 | .or_else(|_| Register16::from_str(s).and_then(|reg| Ok(reg.into()))) 675 | .or_else(|_| Register8Low::from_str(s).and_then(|reg| Ok(reg.into()))) 676 | .or_else(|_| Register8High::from_str(s).and_then(|reg| Ok(reg.into()))) 677 | } 678 | } 679 | -------------------------------------------------------------------------------- /root_cause_analysis/predicate_monitoring/src/rflags.rs: -------------------------------------------------------------------------------- 1 | // https://github.com/rust-osdev/x86_64 2 | // 3 | // The MIT License (MIT) 4 | // 5 | // Copyright (c) 2018 Philipp Oppermann 6 | // Copyright (c) 2015 Gerd Zellweger 7 | // Copyright (c) 2015 The libcpu Developers 8 | 9 | //! Processor state stored in the RFLAGS register. 10 | use bitflags::bitflags; 11 | 12 | bitflags! { 13 | /// The RFLAGS register. 14 | pub struct RFlags: u64 { 15 | /// Processor feature identification flag. 16 | /// 17 | /// If this flag is modifiable, the CPU supports CPUID. 18 | const ID = 1 << 21; 19 | /// Indicates that an external, maskable interrupt is pending. 20 | /// 21 | /// Used when virtual-8086 mode extensions (CR4.VME) or protected-mode virtual 22 | /// interrupts (CR4.PVI) are activated. 23 | const VIRTUAL_INTERRUPT_PENDING = 1 << 20; 24 | /// Virtual image of the INTERRUPT_FLAG bit. 25 | /// 26 | /// Used when virtual-8086 mode extensions (CR4.VME) or protected-mode virtual 27 | /// interrupts (CR4.PVI) are activated. 28 | const VIRTUAL_INTERRUPT = 1 << 19; 29 | /// Enable automatic alignment checking if CR0.AM is set. Only works if CPL is 3. 30 | const ALIGNMENT_CHECK = 1 << 18; 31 | /// Enable the virtual-8086 mode. 
32 | const VIRTUAL_8086_MODE = 1 << 17; 33 | /// Allows restarting an instruction following an instruction breakpoint. 34 | const RESUME_FLAG = 1 << 16; 35 | /// Used by `iret` in hardware task switch mode to determine if current task is nested. 36 | const NESTED_TASK = 1 << 14; 37 | /// The high bit of the I/O Privilege Level field. 38 | /// 39 | /// Specifies the privilege level required for executing I/O address-space instructions. 40 | const IOPL_HIGH = 1 << 13; 41 | /// The low bit of the I/O Privilege Level field. 42 | /// 43 | /// Specifies the privilege level required for executing I/O address-space instructions. 44 | const IOPL_LOW = 1 << 12; 45 | /// Set by hardware to indicate that the sign bit of the result of the last signed integer 46 | /// operation differs from the source operands. 47 | const OVERFLOW_FLAG = 1 << 11; 48 | /// Determines the order in which strings are processed. 49 | const DIRECTION_FLAG = 1 << 10; 50 | /// Enable interrupts. 51 | const INTERRUPT_FLAG = 1 << 9; 52 | /// Enable single-step mode for debugging. 53 | const TRAP_FLAG = 1 << 8; 54 | /// Set by hardware if last arithmetic operation resulted in a negative value. 55 | const SIGN_FLAG = 1 << 7; 56 | /// Set by hardware if last arithmetic operation resulted in a zero value. 57 | const ZERO_FLAG = 1 << 6; 58 | /// Set by hardware if last arithmetic operation generated a carry out of bit 3 of the 59 | /// result. 60 | const AUXILIARY_CARRY_FLAG = 1 << 4; 61 | /// Set by hardware if last result has an even number of 1 bits (only for some operations). 62 | const PARITY_FLAG = 1 << 2; 63 | /// Set by hardware if last arithmetic operation generated a carry out of the 64 | /// most-significant bit of the result. 
65 | const CARRY_FLAG = 1; 66 | } 67 | } 68 | -------------------------------------------------------------------------------- /root_cause_analysis/root_cause_analysis/Cargo.toml: -------------------------------------------------------------------------------- 1 | [package] 2 | name = "root_cause_analysis" 3 | version = "0.1.0" 4 | authors = ["Tim Blazytko", "Moritz Schlögel"] 5 | 6 | edition = "2018" 7 | 8 | [profile.release] 9 | lto = true 10 | 11 | [dependencies] 12 | trace_analysis = { path = "../trace_analysis"} 13 | predicate_monitoring = { path = "../predicate_monitoring"} 14 | structopt="*" 15 | serde = { version = "*", features = ["derive"] } 16 | serde_json="*" 17 | glob = "*" 18 | itertools="*" 19 | rayon="*" 20 | 21 | [[bin]] 22 | name = "rca" 23 | path = "src/main.rs" 24 | 25 | [[bin]] 26 | name = "addr2line" 27 | path = "src/addr2line.rs" -------------------------------------------------------------------------------- /root_cause_analysis/root_cause_analysis/src/addr2line.rs: -------------------------------------------------------------------------------- 1 | use rayon::prelude::*; 2 | use std::collections::{HashMap, HashSet}; 3 | use std::fs::read_to_string; 4 | 5 | use std::process::Command; 6 | use structopt::StructOpt; 7 | 8 | use root_cause_analysis::config::Config; 9 | use root_cause_analysis::monitor::executable; 10 | use root_cause_analysis::utils::{parse_hex, write_file}; 11 | 12 | fn addr2line_args(config: &Config, address: usize) -> Vec<String> { 13 | format!( 14 | "-e {} -a 0x{:x} -f -C -s -i -p", 15 | executable(config), 16 | address - config.load_offset 17 | ) 18 | .split_whitespace() 19 | .map(|s| s.to_string()) 20 | .collect() 21 | } 22 | 23 | fn addr2line(config: &Config, address: usize) -> String { 24 | let args = addr2line_args(config, address); 25 | 26 | let output = Command::new("addr2line") 27 | .args(args) 28 | .output() 29 | .expect("Could not execute addr2line"); 30 | 31 | String::from_utf8_lossy(&output.stdout)[19..] 
32 | .trim() 33 | .to_string() 34 | } 35 | 36 | fn read_trace_file(config: &Config) -> String { 37 | match config.debug_trace { 38 | true => format!("{}/seed_dump.txt", config.eval_dir), 39 | false => format!("{}/ranked_predicates.txt", config.eval_dir), 40 | } 41 | } 42 | 43 | fn out_file_path(config: &Config) -> String { 44 | match config.debug_trace { 45 | true => format!("{}/seed_dump_verbose.txt", config.eval_dir), 46 | false => format!("{}/ranked_predicates_verbose.txt", config.eval_dir), 47 | } 48 | } 49 | 50 | fn lines_as_vec(config: &Config) -> Vec<String> { 51 | read_to_string(&read_trace_file(config)) 52 | .expect("Could not read") 53 | .split("\n") 54 | .filter(|s| !s.is_empty()) 55 | .map(|s| s.to_string()) 56 | .collect() 57 | } 58 | 59 | fn line2addr(line: &String) -> usize { 60 | parse_hex(line.split_whitespace().nth(0).unwrap()).unwrap() 61 | } 62 | 63 | fn unique_addresses(lines: &Vec<String>) -> HashSet<usize> { 64 | lines.par_iter().map(|line| line2addr(line)).collect() 65 | } 66 | 67 | fn map_address_to_src(config: &Config, addresses: &HashSet<usize>) -> HashMap<usize, String> { 68 | addresses 69 | .par_iter() 70 | .map(|address| (*address, addr2line(&config, *address))) 71 | .collect() 72 | } 73 | 74 | fn merge(lines: &Vec<String>, map: &HashMap<usize, String>) -> String { 75 | lines 76 | .par_iter() 77 | .map(|line| format!("{} //{}\n", line, map[&line2addr(&line)])) 78 | .collect() 79 | } 80 | 81 | fn main() { 82 | let config = Config::from_args(); 83 | 84 | let output_vec = lines_as_vec(&config); 85 | let addresses = unique_addresses(&output_vec); 86 | let address_src_map = map_address_to_src(&config, &addresses); 87 | let output: String = merge(&output_vec, &address_src_map); 88 | 89 | write_file(&out_file_path(&config), output); 90 | } 91 | -------------------------------------------------------------------------------- /root_cause_analysis/root_cause_analysis/src/config.rs: -------------------------------------------------------------------------------- 1 | use std::num::ParseIntError; 2 | use 
structopt::clap::AppSettings; 3 | use structopt::StructOpt; 4 | 5 | fn parse_hex(src: &str) -> Result<usize, ParseIntError> { 6 | usize::from_str_radix(&src.replace("0x", ""), 16) 7 | } 8 | 9 | #[derive(Debug, StructOpt)] 10 | #[structopt( 11 | name = "root_cause_analysis", 12 | global_settings = &[AppSettings::DisableVersion] 13 | )] 14 | 15 | pub struct Config { 16 | #[structopt(long = "trace-dir", default_value = "", help = "Path to traces")] 17 | pub trace_dir: String, 18 | #[structopt(long = "eval-dir", help = "Path to evaluation folder")] 19 | pub eval_dir: String, 20 | #[structopt(long = "rank-predicates", help = "Rank predicates")] 21 | pub rank_predicates: bool, 22 | #[structopt(long = "monitor", help = "Monitor predicates")] 23 | pub monitor_predicates: bool, 24 | #[structopt( 25 | long = "monitor-timeout", 26 | default_value = "60", 27 | help = "Timeout for monitoring" 28 | )] 29 | pub monitor_timeout: u64, 30 | #[structopt( 31 | long = "blacklist-crashes", 32 | default_value = "", 33 | help = "Path for crash blacklist" 34 | )] 35 | pub crash_blacklist_path: String, 36 | #[structopt(long = "debug-trace", help = "Debug trace")] 37 | pub debug_trace: bool, 38 | #[structopt( 39 | long = "load-offset", 40 | default_value = "0x0000555555554000", 41 | parse(try_from_str = parse_hex), 42 | help = "Load offset of the target" 43 | )] 44 | pub load_offset: usize, 45 | } 46 | 47 | impl Config { 48 | pub fn analyze_traces(&self) -> bool { 49 | !self.trace_dir.is_empty() 50 | } 51 | 52 | pub fn monitor_predicates(&self) -> bool { 53 | !self.eval_dir.is_empty() 54 | } 55 | 56 | pub fn blacklist_crashes(&self) -> bool { 57 | self.crash_blacklist_path != "" 58 | } 59 | } 60 | -------------------------------------------------------------------------------- /root_cause_analysis/root_cause_analysis/src/lib.rs: -------------------------------------------------------------------------------- 1 | pub mod config; 2 | pub mod monitor; 3 | pub mod rankings; 4 | pub mod traces; 5 | pub mod utils; 6 
| -------------------------------------------------------------------------------- /root_cause_analysis/root_cause_analysis/src/main.rs: -------------------------------------------------------------------------------- 1 | use root_cause_analysis::config::Config; 2 | use root_cause_analysis::monitor::monitor_predicates; 3 | use root_cause_analysis::rankings::rank_predicates; 4 | use root_cause_analysis::traces::analyze_traces; 5 | use std::time::Instant; 6 | use structopt::StructOpt; 7 | 8 | fn main() { 9 | let config = Config::from_args(); 10 | 11 | let total_time = Instant::now(); 12 | 13 | if config.analyze_traces() { 14 | println!("analyzing traces"); 15 | let trace_analysis_time = Instant::now(); 16 | analyze_traces(&config); 17 | println!( 18 | "trace analysis time: {} seconds", 19 | trace_analysis_time.elapsed().as_secs_f64() 20 | ); 21 | } 22 | 23 | if config.monitor_predicates { 24 | println!("monitoring predicates"); 25 | let monitoring_time = Instant::now(); 26 | monitor_predicates(&config); 27 | println!( 28 | "monitoring time: {} seconds", 29 | monitoring_time.elapsed().as_secs_f64() 30 | ); 31 | } 32 | 33 | if config.rank_predicates { 34 | println!("ranking predicates"); 35 | let ranking_time = Instant::now(); 36 | rank_predicates(&config); 37 | println!( 38 | "ranking time: {} seconds", 39 | ranking_time.elapsed().as_secs_f64() 40 | ); 41 | } 42 | 43 | println!("total time: {} seconds", total_time.elapsed().as_secs_f64()); 44 | } 45 | -------------------------------------------------------------------------------- /root_cause_analysis/root_cause_analysis/src/monitor.rs: -------------------------------------------------------------------------------- 1 | use crate::config::Config; 2 | use crate::rankings::serialize_rankings; 3 | use crate::utils::{glob_paths, read_file}; 4 | use rayon::prelude::*; 5 | use std::fs::File; 6 | use std::fs::{read_to_string, remove_file}; 7 | use std::process::{Child, Command, Stdio}; 8 | use std::time::Instant; 9 | use 
trace_analysis::trace_analyzer::{blacklist_path, read_crash_blacklist}; 10 | 11 | pub fn monitor_predicates(config: &Config) { 12 | let cmd_line = cmd_line(&config); 13 | let blacklist_paths = 14 | read_crash_blacklist(config.blacklist_crashes(), &config.crash_blacklist_path); 15 | 16 | let rankings = glob_paths(format!("{}/inputs/crashes/*", config.eval_dir)) 17 | .into_par_iter() 18 | .enumerate() 19 | .filter(|(_, p)| !blacklist_path(&p, &blacklist_paths)) 20 | .map(|(index, i)| monitor(config, index, &replace_input(&cmd_line, &i))) 21 | .filter(|r| !r.is_empty()) 22 | .collect(); 23 | 24 | serialize_rankings(config, &rankings); 25 | } 26 | 27 | pub fn monitor( 28 | config: &Config, 29 | index: usize, 30 | (cmd_line, file_path): &(String, Option<String>), 31 | ) -> Vec<usize> { 32 | let predicate_order_file = format!("out_{}", index); 33 | let predicate_file = &format!("{}/{}", config.eval_dir, predicate_file_name()); 34 | let timeout = format!("{}", config.monitor_timeout); 35 | 36 | let args: Vec<_> = cmd_line.split_whitespace().map(|s| s.to_string()).collect(); 37 | 38 | let mut child = if let Some(p) = file_path { 39 | Command::new("./target/release/monitor") 40 | .arg(&predicate_order_file) 41 | .arg(&predicate_file) 42 | .arg(&timeout) 43 | .args(args) 44 | .stdin(Stdio::from(File::open(p).unwrap())) 45 | .stdout(Stdio::null()) 46 | .stderr(Stdio::null()) 47 | .spawn() 48 | .expect("Could not spawn child") 49 | } else { 50 | Command::new("./target/release/monitor") 51 | .arg(&predicate_order_file) 52 | .arg(&predicate_file) 53 | .arg(&timeout) 54 | .args(args) 55 | .stdout(Stdio::null()) 56 | .stderr(Stdio::null()) 57 | .spawn() 58 | .expect("Could not spawn child") 59 | }; 60 | 61 | wait_and_kill_child(&mut child, config.monitor_timeout); 62 | 63 | deserialize_predicate_order_file(&predicate_order_file) 64 | } 65 | 66 | fn wait_and_kill_child(child: &mut Child, timeout: u64) { 67 | let start_time = Instant::now(); 68 | 69 | while start_time.elapsed().as_secs() < timeout 
+ 10 { 70 | match child.try_wait() { 71 | Ok(Some(_)) => break, 72 | _ => std::thread::sleep(std::time::Duration::from_millis(10)), // poll periodically instead of busy-waiting 73 | } 74 | } 75 | 76 | match child.kill() { 77 | _ => {} 78 | } 79 | } 80 | 81 | fn predicate_file_name() -> String { 82 | "predicates.json".to_string() 83 | } 84 | 85 | fn deserialize_predicate_order_file(file_path: &String) -> Vec<usize> { 86 | let content = read_to_string(file_path); 87 | 88 | if !content.is_ok() { 89 | return vec![]; 90 | } 91 | 92 | let ret: Vec<usize> = serde_json::from_str(&content.unwrap()) 93 | .expect(&format!("Could not deserialize {}", file_path)); 94 | remove_file(file_path).expect(&format!("Could not remove {}", file_path)); 95 | 96 | ret 97 | } 98 | 99 | pub fn cmd_line(config: &Config) -> String { 100 | let executable = executable(config); 101 | let arguments = parse_args(config); 102 | 103 | format!("{} {}", executable, arguments) 104 | } 105 | 106 | fn parse_args(config: &Config) -> String { 107 | let file_name = format!("{}/arguments.txt", config.eval_dir); 108 | read_file(&file_name) 109 | } 110 | 111 | pub fn executable(config: &Config) -> String { 112 | let pattern = format!("{}/*_trace", config.eval_dir); 113 | let mut results = glob_paths(pattern); 114 | assert_eq!(results.len(), 1); 115 | 116 | results.pop().expect("No trace executable found") 117 | } 118 | 119 | pub fn replace_input(cmd_line: &String, replacement: &String) -> (String, Option<String>) { 120 | match cmd_line.contains("@@") { 121 | true => (cmd_line.replace("@@", replacement), None), 122 | false => (cmd_line.to_string(), Some(replacement.to_string())), 123 | } 124 | } 125 | -------------------------------------------------------------------------------- /root_cause_analysis/root_cause_analysis/src/rankings.rs: -------------------------------------------------------------------------------- 1 | use crate::config::Config; 2 | use crate::traces::{deserialize_mnemonics, deserialize_predicates}; 3 | use crate::utils::{read_file, write_file}; 4 | use rayon::prelude::*; 5 | use std::cmp::Ordering; 6 | use 
std::collections::HashMap; 7 | use trace_analysis::predicates::SerializedPredicate; 8 | 9 | pub fn trunc_score(score: f64) -> f64 { 10 | (score * 100.0).trunc() as f64 11 | } 12 | fn predicate_order( 13 | p1: &SerializedPredicate, 14 | p2: &SerializedPredicate, 15 | rankings: &Vec>, 16 | ) -> Ordering { 17 | p2.score.partial_cmp(&p1.score).unwrap().then( 18 | path_rank(p1.address, rankings) 19 | .partial_cmp(&path_rank(p2.address, rankings)) 20 | .unwrap(), 21 | ) 22 | } 23 | 24 | pub fn rank_predicates(config: &Config) { 25 | let rankings = deserialize_rankings(config); 26 | let mnemonics = deserialize_mnemonics(config); 27 | let mut predicates = deserialize_predicates(config); 28 | 29 | predicates.par_sort_by(|p1, p2| predicate_order(p1, p2, &rankings)); 30 | 31 | dump_ranked_predicates(config, &predicates, &mnemonics, &rankings); 32 | } 33 | 34 | fn path_rank(address: usize, rankings: &Vec>) -> f64 { 35 | rankings 36 | .par_iter() 37 | .map(|r| rank_path_level(address, r)) 38 | .sum::() 39 | / rankings.len() as f64 40 | } 41 | 42 | fn rank_path_level(address: usize, rank: &Vec) -> f64 { 43 | match rank.iter().position(|x| address == *x) { 44 | Some(pos) => pos as f64 / rank.len() as f64, 45 | None => 2.0, 46 | } 47 | } 48 | 49 | pub fn serialize_rankings(config: &Config, rankings: &Vec>) { 50 | let content = serde_json::to_string(rankings).expect("Could not serialize rankings"); 51 | write_file(&format!("{}/rankings.json", config.eval_dir), content); 52 | } 53 | 54 | fn deserialize_rankings(config: &Config) -> Vec> { 55 | let content = read_file(&format!("{}/rankings.json", config.eval_dir)); 56 | serde_json::from_str(&content).expect("Could not deserialize rankings") 57 | } 58 | 59 | fn dump_ranked_predicates( 60 | config: &Config, 61 | predicates: &Vec, 62 | mnemonics: &HashMap, 63 | rankings: &Vec>, 64 | ) { 65 | let content: String = predicates 66 | .iter() 67 | .map(|p| { 68 | format!( 69 | "{} -- {} (path rank: {})\n", 70 | p.to_string(), 71 | 
mnemonics[&p.address], 72 | path_rank(p.address, rankings) 73 | ) 74 | }) 75 | .collect(); 76 | write_file( 77 | &format!("{}/ranked_predicates.txt", config.eval_dir), 78 | content, 79 | ); 80 | } 81 | -------------------------------------------------------------------------------- /root_cause_analysis/root_cause_analysis/src/traces.rs: -------------------------------------------------------------------------------- 1 | use crate::config::Config; 2 | use crate::utils::{read_file, write_file}; 3 | use std::collections::HashMap; 4 | use trace_analysis::predicates::SerializedPredicate; 5 | use trace_analysis::trace_analyzer::TraceAnalyzer; 6 | 7 | pub fn analyze_traces(config: &Config) { 8 | let trace_analysis_output_dir = Some(config.eval_dir.to_string()); 9 | let crash_blacklist_path = if config.blacklist_crashes() { 10 | Some(config.crash_blacklist_path.to_string()) 11 | } else { 12 | None 13 | }; 14 | let trace_analysis_config = trace_analysis::config::Config::default( 15 | &config.trace_dir, 16 | &trace_analysis_output_dir, 17 | &crash_blacklist_path, 18 | ); 19 | let trace_analyzer = TraceAnalyzer::new(&trace_analysis_config); 20 | 21 | println!("dumping linear scores"); 22 | trace_analyzer.dump_scores(&trace_analysis_config, false, false); 23 | 24 | let predicates = trace_analyzer.get_predicates_better_than(0.9); 25 | 26 | serialize_mnemonics(config, &predicates, &trace_analyzer); 27 | 28 | serialize_predicates(config, &predicates); 29 | } 30 | 31 | fn serialize_predicates(config: &Config, predicates: &Vec) { 32 | let content = serde_json::to_string(predicates).expect("Could not serialize predicates"); 33 | write_file(&format!("{}/predicates.json", config.eval_dir), content); 34 | } 35 | 36 | pub fn deserialize_predicates(config: &Config) -> Vec { 37 | let file_name = format!("{}/predicates.json", config.eval_dir); 38 | 39 | let content = read_file(&file_name); 40 | serde_json::from_str(&content).expect("Could not deserialize predicates") 41 | } 42 | 43 | fn 
serialize_mnemonics( 44 | config: &Config, 45 | predicates: &Vec, 46 | trace_analyzer: &TraceAnalyzer, 47 | ) { 48 | let map: HashMap<_, _> = predicates 49 | .iter() 50 | .map(|p| (p.address, trace_analyzer.get_any_mnemonic(p.address))) 51 | .collect(); 52 | let content = serde_json::to_string(&map).expect("Could not serialize mnemonics"); 53 | write_file(&format!("{}/mnemonics.json", config.eval_dir), content); 54 | } 55 | 56 | pub fn deserialize_mnemonics(config: &Config) -> HashMap { 57 | let content = read_file(&format!("{}/mnemonics.json", config.eval_dir)); 58 | serde_json::from_str(&content).expect("Could not deserialize mnemonics") 59 | } 60 | -------------------------------------------------------------------------------- /root_cause_analysis/root_cause_analysis/src/utils.rs: -------------------------------------------------------------------------------- 1 | use glob::glob; 2 | use std::fs; 3 | use std::num::ParseIntError; 4 | 5 | pub fn read_file(file_path: &str) -> String { 6 | fs::read_to_string(file_path).expect(&format!("Could not read file {}", file_path)) 7 | } 8 | 9 | pub fn read_file_to_bytes(file_path: &str) -> Vec { 10 | fs::read(file_path).expect(&format!("Could not read file {}", file_path)) 11 | } 12 | 13 | pub fn write_file(file_path: &str, content: String) { 14 | fs::write(file_path, content).expect(&format!("Could not write file {}", file_path)); 15 | } 16 | 17 | pub fn glob_paths(pattern: String) -> Vec { 18 | glob(&pattern) 19 | .unwrap() 20 | .map(|p| p.unwrap().to_str().unwrap().to_string()) 21 | .collect() 22 | } 23 | 24 | pub fn parse_hex(src: &str) -> Result { 25 | usize::from_str_radix(&src.replace("0x", ""), 16) 26 | } 27 | -------------------------------------------------------------------------------- /root_cause_analysis/trace_analysis/Cargo.toml: -------------------------------------------------------------------------------- 1 | [package] 2 | name = "trace_analysis" 3 | version = "0.1.0" 4 | authors = ["Tim Blazytko ", 
"Moritz Schlögel "] 5 | edition = "2018" 6 | 7 | [profile.release] 8 | lto = true 9 | 10 | [dependencies] 11 | glob="*" 12 | rayon="*" 13 | serde = { version = "*", features = ["derive"] } 14 | serde_json="*" 15 | structopt="*" 16 | zip = "*" 17 | rand="*" -------------------------------------------------------------------------------- /root_cause_analysis/trace_analysis/src/config.rs: -------------------------------------------------------------------------------- 1 | use std::num::ParseIntError; 2 | use structopt::clap::AppSettings; 3 | use structopt::StructOpt; 4 | 5 | fn parse_hex(src: &str) -> Result { 6 | usize::from_str_radix(&src.replace("0x", ""), 16) 7 | } 8 | 9 | #[derive(Debug, StructOpt)] 10 | #[structopt( 11 | name = "trace_analysis", 12 | global_settings = &[AppSettings::DisableVersion] 13 | )] 14 | 15 | pub struct Config { 16 | #[structopt(index = 1, help = "Path to traces of crashing inputs")] 17 | pub path_to_crashes: String, 18 | #[structopt(index = 2, help = "Path to traces of non-crashing inputs")] 19 | pub path_to_non_crashes: String, 20 | #[structopt( 21 | short = "c", 22 | long = "check-traces", 23 | help = "Performs trace integrity checks" 24 | )] 25 | pub check_traces: bool, 26 | #[structopt(short = "d", long = "dump-traces", help = "Dumps trace data")] 27 | pub dump_traces: bool, 28 | #[structopt(short = "s", long = "scores", help = "Dumps instruction scores")] 29 | pub dump_scores: bool, 30 | #[structopt(long = "zip", help = "Trace files are provided in zipped form")] 31 | pub zipped: bool, 32 | #[structopt(short = "a", long = "dump-address", default_value="0", parse(try_from_str = parse_hex), help = "Dump at address")] 33 | pub dump_address: usize, 34 | #[structopt( 35 | short = "r", 36 | long = "random", 37 | default_value = "0", 38 | help = "Select n random traces" 39 | )] 40 | pub random_traces: usize, 41 | #[structopt( 42 | short = "f", 43 | long = "filter", 44 | help = "Ignore non-crashes that do not visit the crashing CFG leaves" 
45 | )] 46 | pub filter_non_crashes: bool, 47 | #[structopt(short = "t", long = "trace-info", help = "Dump trace infos")] 48 | pub trace_info: bool, 49 | #[structopt( 50 | long = "output-dir", 51 | default_value = "./", 52 | help = "Path for output directory" 53 | )] 54 | pub output_directory: String, 55 | #[structopt( 56 | long = "blacklist-crashes", 57 | default_value = "", 58 | help = "Path for crash blacklist" 59 | )] 60 | pub crash_blacklist_path: String, 61 | #[structopt( 62 | long = "debug-predicate", 63 | default_value="0", parse(try_from_str = parse_hex), 64 | help = "Dumps the best predicate at address" 65 | )] 66 | pub predicate_address: usize, 67 | } 68 | 69 | impl Config { 70 | pub fn default( 71 | trace_dir: &String, 72 | output_dir: &Option, 73 | crash_blacklist_path: &Option, 74 | ) -> Config { 75 | Config { 76 | path_to_crashes: format!("{}/traces/crashes/", trace_dir), 77 | path_to_non_crashes: format!("{}/traces/non_crashes/", trace_dir), 78 | check_traces: false, 79 | dump_traces: false, 80 | dump_scores: true, 81 | zipped: true, 82 | dump_address: 0, 83 | random_traces: 0, 84 | filter_non_crashes: false, 85 | trace_info: false, 86 | output_directory: if output_dir.is_some() { 87 | output_dir.as_ref().unwrap().to_string() 88 | } else { 89 | "./".to_string() 90 | }, 91 | crash_blacklist_path: if crash_blacklist_path.is_some() { 92 | crash_blacklist_path.as_ref().unwrap().to_string() 93 | } else { 94 | "".to_string() 95 | }, 96 | predicate_address: 0, 97 | } 98 | } 99 | 100 | pub fn random_traces(&self) -> bool { 101 | self.random_traces > 0 102 | } 103 | 104 | pub fn dump_address(&self) -> bool { 105 | self.dump_address > 0 106 | } 107 | 108 | pub fn blacklist_crashes(&self) -> bool { 109 | self.crash_blacklist_path != "" 110 | } 111 | 112 | pub fn debug_predicate(&self) -> bool { 113 | self.predicate_address > 0 114 | } 115 | } 116 | -------------------------------------------------------------------------------- 
/root_cause_analysis/trace_analysis/src/control_flow_graph.rs:
--------------------------------------------------------------------------------
use std::collections::hash_map::Keys;
use std::collections::{HashMap, HashSet};

/// A maximal straight-line run of instruction addresses.
#[derive(Debug)]
pub struct BasicBlock {
    pub body: Vec<usize>,
    successors: HashSet<usize>,
    predecessors: HashSet<usize>,
}

impl BasicBlock {
    pub fn new() -> BasicBlock {
        BasicBlock {
            body: vec![],
            successors: HashSet::new(),
            predecessors: HashSet::new(),
        }
    }

    /// Address of the first instruction; panics if the block is empty.
    pub fn start(&self) -> usize {
        *self.body.first().unwrap()
    }

    /// Address of the last instruction; panics if the block is empty.
    pub fn exit(&self) -> usize {
        *self.body.last().unwrap()
    }

    pub fn iter_addresses(&self) -> impl Iterator<Item = &usize> {
        self.body.iter()
    }
}

#[derive(Debug)]
pub struct ControlFlowGraph {
    /// Maps every instruction address to the exit address of its basic block.
    addr_to_bb_exit: HashMap<usize, usize>,
    /// Maps the exit address of a basic block to the block itself.
    exit_addr_to_bb: HashMap<usize, BasicBlock>,
}

impl ControlFlowGraph {
    pub fn new() -> Self {
        ControlFlowGraph {
            addr_to_bb_exit: HashMap::new(),
            exit_addr_to_bb: HashMap::new(),
        }
    }

    pub fn is_empty(&self) -> bool {
        self.exit_addr_to_bb.is_empty()
    }

    /// All instruction addresses known to the graph.
    pub fn keys(&self) -> Keys<usize, usize> {
        self.addr_to_bb_exit.keys()
    }

    /// Successors of `address` if it ends a basic block, otherwise an empty vector.
    pub fn get_instruction_successors(&self, address: usize) -> Vec<usize> {
        match self.exit_addr_to_bb.get(&address) {
            Some(bb) => bb.successors.iter().cloned().collect(),
            None => vec![],
        }
    }

    pub fn is_bb_end(&self, address: usize) -> bool {
        self.exit_addr_to_bb.contains_key(&address)
    }

    pub fn add_bb(&mut self, bb: BasicBlock) {
        for addr in bb.body.iter() {
            self.addr_to_bb_exit.insert(*addr, bb.exit());
        }
        self.exit_addr_to_bb.insert(bb.exit(), bb);
    }

    pub fn bbs(&self) -> impl Iterator<Item = &BasicBlock> {
        self.exit_addr_to_bb.values()
    }

    /// Returns the basic block containing `addr`; panics if the address is unknown.
    pub fn get_bb(&self, addr: usize) -> &BasicBlock {
        let exit_addr = self
            .addr_to_bb_exit
            .get(&addr)
            .unwrap_or_else(|| panic!("no exit address for {:x}", addr));

        self.exit_addr_to_bb
            .get(exit_addr)
            .unwrap_or_else(|| panic!("BB not found for exit address {:x}", exit_addr))
    }

    /// Renders the block-level graph in Graphviz DOT syntax (exit addresses in decimal).
    pub fn to_dot(&self) -> String {
        let mut ret = String::from("digraph {\n");

        for bb in self.bbs() {
            for succ in bb.successors.iter() {
                ret.push_str(&format!("{} -> {}\n", bb.exit(), self.get_bb(*succ).exit()));
            }
        }
        ret.push_str("}\n");
        ret
    }

    /// Start addresses of all blocks without predecessors.
    pub fn heads(&self) -> Vec<usize> {
        self.bbs()
            .filter(|bb| bb.predecessors.is_empty())
            .map(|bb| bb.start())
            .collect()
    }

    /// Start addresses of all blocks without successors.
    pub fn leaves(&self) -> Vec<usize> {
        self.bbs()
            .filter(|bb| bb.successors.is_empty())
            .map(|bb| bb.start())
            .collect()
    }
}

/// Collects instruction-level edges before they are grouped into basic blocks.
pub struct CFGCollector {
    successors: HashMap<usize, HashSet<usize>>,
    predecessors: HashMap<usize, HashSet<usize>>,
}

impl CFGCollector {
    pub fn new() -> CFGCollector {
        CFGCollector {
            successors: HashMap::new(),
            predecessors: HashMap::new(),
        }
    }

    pub fn add_edge(&mut self, src: usize, dst: usize) {
        // Both endpoints get an entry in both maps, even if the set stays empty.
        self.predecessors.entry(src).or_default();
        self.successors.entry(dst).or_default();

        self.predecessors.entry(dst).or_default().insert(src);
        self.successors.entry(src).or_default().insert(dst);
    }

    /// Addresses that have no incoming edge.
    pub fn heads(&self) -> Vec<usize> {
        self.successors
            .keys()
            .cloned()
            .filter(|k| !self.predecessors.contains_key(k) || self.predecessors[k].is_empty())
            .collect()
    }

    /// Iterative depth-first traversal; returns the addresses in visiting order.
    pub fn dfs(&self, start: usize) -> Vec<usize> {
        let mut ret = vec![];
        let mut todo = vec![start];
        let mut done = HashSet::new();

        while let Some(node) = todo.pop() {
            // `insert` returns false if the node was already visited
            if !done.insert(node) {
                continue;
            }

            ret.push(node);

            if let Some(successors) = self.successors.get(&node) {
                todo.extend(successors.iter().cloned());
            }
        }
        ret
    }

    pub fn construct_graph(&self) -> ControlFlowGraph {
        let mut cfg = ControlFlowGraph::new();
        let mut bb = BasicBlock::new();
        let mut finished = false;

        // the collected edges must have exactly one entry point
        let mut heads = self.heads();
        assert_eq!(heads.len(), 1);

        for node in self.dfs(heads.pop().unwrap()) {
            // current instruction is the leading instruction of a new block
            if bb.body.is_empty() {
                for pred in self.predecessors[&node].iter() {
                    bb.predecessors.insert(*pred);
                }
            }

            // next instruction is a leader: the only successor has several predecessors
            if self.successors[&node].len() == 1
                && self.predecessors[self.successors[&node].iter().last().unwrap()].len() != 1
            {
                for succ in self.successors[&node].iter() {
                    bb.successors.insert(*succ);
                }
                finished = true;
            }

            // zero or more than one outgoing edge -> end of basic block
            if self.successors[&node].len() != 1 {
                for succ in self.successors[&node].iter() {
                    bb.successors.insert(*succ);
                }
                finished = true;
            }

            bb.body.push(node);

            if finished {
                cfg.add_bb(bb);
                bb = BasicBlock::new();
                finished = false;
            }
        }

        cfg
    }
}
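The block-splitting rule used by `construct_graph` in control_flow_graph.rs can be tried in isolation. What follows is a minimal standalone sketch, not the repository's code: the function name `split_basic_blocks`, the `Graph` alias, and the use of `BTreeMap`/`BTreeSet` (chosen only for a deterministic visiting order) are assumptions made for illustration. Unlike `construct_graph`, it also keeps a trailing block that never reached a block end.

```rust
use std::collections::{BTreeMap, BTreeSet, HashSet};

// Hypothetical alias: address -> set of neighbouring addresses.
type Graph = BTreeMap<usize, BTreeSet<usize>>;

/// Groups instruction addresses into basic blocks, walking the graph
/// depth-first from `entry`. `edges` holds (src, dst) address pairs.
fn split_basic_blocks(edges: &[(usize, usize)], entry: usize) -> Vec<Vec<usize>> {
    let mut succ = Graph::new();
    let mut pred = Graph::new();
    for &(src, dst) in edges {
        succ.entry(src).or_default().insert(dst);
        succ.entry(dst).or_default();
        pred.entry(dst).or_default().insert(src);
        pred.entry(src).or_default();
    }

    let empty = BTreeSet::new();
    let mut blocks = Vec::new();
    let mut current = Vec::new();
    let mut todo = vec![entry];
    let mut done = HashSet::new();

    while let Some(node) = todo.pop() {
        // skip addresses that were already placed in a block
        if !done.insert(node) {
            continue;
        }
        current.push(node);

        let out = succ.get(&node).unwrap_or(&empty);
        // A block ends if the instruction has zero or several successors,
        // or if its only successor can also be reached from elsewhere.
        let ends_block = match out.iter().next() {
            Some(next) if out.len() == 1 => pred[next].len() != 1,
            _ => true,
        };
        if ends_block {
            blocks.push(std::mem::take(&mut current));
        }
        todo.extend(out.iter().copied());
    }

    // Assumption of this sketch: keep a trailing block that never hit a block end.
    if !current.is_empty() {
        blocks.push(current);
    }
    blocks
}
```

For a diamond-shaped graph with the edges `0x10 -> 0x14 -> 0x18`, `0x18 -> 0x20`, `0x18 -> 0x30`, `0x20 -> 0x40` and `0x30 -> 0x40`, calling `split_basic_blocks(&edges, 0x10)` produces four blocks: the straight-line prefix ending at the branch, one block per branch target, and the join block.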
-------------------------------------------------------------------------------- /root_cause_analysis/trace_analysis/src/debug.rs: -------------------------------------------------------------------------------- 1 | use crate::config::Config; 2 | use crate::predicate_analysis::PredicateAnalyzer; 3 | use crate::trace::Trace; 4 | use crate::trace_analyzer::TraceAnalyzer; 5 | use std::fs::File; 6 | use std::io::Write; 7 | 8 | pub fn diff_traces(config: &Config, trace_analyzer: &TraceAnalyzer) { 9 | let mut file = File::create(format!("{}/verbose_info.csv", config.output_directory)).unwrap(); 10 | /* addresses that have been seen in traces AND non-traces */ 11 | for addr in trace_analyzer.cfg.keys() { 12 | write_instruction_from_traces_at_address( 13 | &mut file, 14 | *addr, 15 | &trace_analyzer.crashes.as_slice(), 16 | "crash", 17 | ); 18 | write_instruction_from_traces_at_address( 19 | &mut file, 20 | *addr, 21 | &trace_analyzer.non_crashes.as_slice(), 22 | "non_crash", 23 | ); 24 | } 25 | } 26 | 27 | pub fn diff_traces_at_address(config: &Config, trace_analyzer: &TraceAnalyzer) { 28 | let mut file = File::create(format!("{}/verbose_info.csv", config.output_directory)).unwrap(); 29 | write_instruction_from_traces_at_address( 30 | &mut file, 31 | config.dump_address, 32 | &trace_analyzer.crashes.as_slice(), 33 | "crash", 34 | ); 35 | write_instruction_from_traces_at_address( 36 | &mut file, 37 | config.dump_address, 38 | &trace_analyzer.non_crashes.as_slice(), 39 | "non_crash", 40 | ); 41 | } 42 | 43 | pub fn dump_trace_info(config: &Config, trace_analyzer: &TraceAnalyzer) { 44 | let mut file = File::create(format!("{}/trace_info.csv", config.output_directory)).unwrap(); 45 | 46 | write_traces_info(&mut file, &trace_analyzer.crashes.as_slice(), "crash"); 47 | 48 | write_traces_info( 49 | &mut file, 50 | &trace_analyzer.non_crashes.as_slice(), 51 | "non_crash", 52 | ); 53 | } 54 | 55 | pub fn debug_predicate_at_address(address: usize, trace_analyzer: &TraceAnalyzer) { 
56 | let predicate = PredicateAnalyzer::evaluate_best_predicate_at_address(address, trace_analyzer); 57 | 58 | println!( 59 | "0x{:x} -- {} -- {}", 60 | predicate.address, 61 | predicate.to_string(), 62 | predicate.score 63 | ); 64 | } 65 | 66 | fn write_traces_info(file: &mut File, traces: &[Trace], flag: &str) { 67 | for trace in traces.iter() { 68 | write!(file, "{};{}\n", trace.to_string(), flag).unwrap(); 69 | } 70 | } 71 | 72 | fn write_instruction_from_traces_at_address( 73 | file: &mut File, 74 | addr: usize, 75 | traces: &[Trace], 76 | flag: &str, 77 | ) { 78 | for trace in traces.iter().filter(|t| t.instructions.contains_key(&addr)) { 79 | write!(file, "{};{}\n", trace.instructions[&addr].to_string(), flag).unwrap(); 80 | } 81 | } 82 | -------------------------------------------------------------------------------- /root_cause_analysis/trace_analysis/src/lib.rs: -------------------------------------------------------------------------------- 1 | pub mod config; 2 | pub mod control_flow_graph; 3 | pub mod debug; 4 | pub mod predicate_analysis; 5 | pub mod predicate_builder; 6 | pub mod predicate_synthesizer; 7 | pub mod predicates; 8 | pub mod trace; 9 | pub mod trace_analyzer; 10 | pub mod trace_integrity; 11 | -------------------------------------------------------------------------------- /root_cause_analysis/trace_analysis/src/main.rs: -------------------------------------------------------------------------------- 1 | use structopt::StructOpt; 2 | use trace_analysis::config::Config; 3 | use trace_analysis::debug::{ 4 | debug_predicate_at_address, diff_traces, diff_traces_at_address, dump_trace_info, 5 | }; 6 | use trace_analysis::trace_analyzer::TraceAnalyzer; 7 | 8 | fn main() { 9 | let config = Config::from_args(); 10 | 11 | let trace_analyzer = TraceAnalyzer::new(&config); 12 | 13 | if config.dump_traces { 14 | println!("dumping traces"); 15 | diff_traces(&config, &trace_analyzer); 16 | } 17 | 18 | if config.dump_address() { 19 | println!("dumping 
traces at address 0x{:x}", config.dump_address); 20 | diff_traces_at_address(&config, &trace_analyzer); 21 | } 22 | 23 | if config.trace_info { 24 | println!("dumping trace information"); 25 | dump_trace_info(&config, &trace_analyzer); 26 | } 27 | 28 | if config.debug_predicate() { 29 | println!( 30 | "dumping predicate at address 0x{:x}", 31 | config.predicate_address 32 | ); 33 | debug_predicate_at_address(config.predicate_address, &trace_analyzer); 34 | } 35 | 36 | if config.dump_scores { 37 | println!("dumping linear scores"); 38 | trace_analyzer.dump_scores(&config, false, false); 39 | } 40 | } 41 | -------------------------------------------------------------------------------- /root_cause_analysis/trace_analysis/src/predicate_analysis.rs: -------------------------------------------------------------------------------- 1 | use crate::predicate_builder::PredicateBuilder; 2 | use crate::predicates::Predicate; 3 | 4 | use crate::trace_analyzer::TraceAnalyzer; 5 | use rayon::prelude::*; 6 | 7 | pub struct PredicateAnalyzer {} 8 | 9 | impl PredicateAnalyzer { 10 | pub fn evaluate_best_predicate_at_address( 11 | address: usize, 12 | trace_analyzer: &TraceAnalyzer, 13 | ) -> Predicate { 14 | let predicates = PredicateBuilder::gen_predicates(address, trace_analyzer); 15 | 16 | if predicates.is_empty() { 17 | return Predicate::gen_empty(address); 18 | } 19 | 20 | let mut ret: Vec = predicates 21 | .into_par_iter() 22 | .map(|p| PredicateAnalyzer::evaluate_predicate(trace_analyzer, p)) 23 | .collect(); 24 | 25 | ret.sort_by(|p1, p2| p1.score.partial_cmp(&p2.score).unwrap()); 26 | ret.pop().unwrap() 27 | } 28 | 29 | fn evaluate_predicate(trace_analyzer: &TraceAnalyzer, mut predicate: Predicate) -> Predicate { 30 | let true_positives = trace_analyzer 31 | .crashes 32 | .as_slice() 33 | .par_iter() 34 | .map(|t| t.instructions.get(&predicate.address)) 35 | .filter(|i| predicate.execute(i)) 36 | .count() as f64 37 | / trace_analyzer.crashes.len() as f64; 38 | let 
true_negatives = trace_analyzer 39 | .non_crashes 40 | .as_slice() 41 | .par_iter() 42 | .map(|t| t.instructions.get(&predicate.address)) 43 | .filter(|i| !predicate.execute(i)) 44 | .count() as f64 45 | / trace_analyzer.non_crashes.len() as f64; 46 | 47 | predicate.score = (true_positives + true_negatives) / 2.0; 48 | 49 | predicate 50 | } 51 | } 52 | -------------------------------------------------------------------------------- /root_cause_analysis/trace_analysis/src/predicate_builder.rs: -------------------------------------------------------------------------------- 1 | use crate::control_flow_graph::ControlFlowGraph; 2 | use crate::predicate_synthesizer::{gen_reg_val_name, PredicateSynthesizer}; 3 | use crate::predicates::*; 4 | use crate::trace::Instruction; 5 | use crate::trace::{Selector, REGISTERS}; 6 | use crate::trace_analyzer::TraceAnalyzer; 7 | 8 | pub struct PredicateBuilder {} 9 | 10 | impl PredicateBuilder { 11 | fn gen_visited(address: usize) -> Vec { 12 | vec![Predicate::new( 13 | "is_visited", 14 | address, 15 | is_visited, 16 | None, 17 | None, 18 | )] 19 | } 20 | fn gen_all_edge_from_to_predicates( 21 | address: usize, 22 | cfg: &ControlFlowGraph, 23 | pred_name: &str, 24 | func: fn(&Instruction, Option, Option) -> bool, 25 | ) -> Vec { 26 | cfg.get_instruction_successors(address) 27 | .iter() 28 | .map(|to| { 29 | let pred_name = format!("0x{:x} {} 0x{:x}", address, pred_name, to); 30 | Predicate::new(&pred_name, address, func, Some(*to), None) 31 | }) 32 | .collect() 33 | } 34 | 35 | fn gen_all_edge_val_predicates( 36 | address: usize, 37 | pred_name: &str, 38 | value: usize, 39 | func: fn(&Instruction, Option, Option) -> bool, 40 | ) -> Predicate { 41 | let pred_name = format!("{} {}", pred_name, value); 42 | 43 | Predicate::new(&pred_name, address, func, Some(value), None) 44 | } 45 | 46 | pub fn gen_flag_predicates(address: usize, trace_analyzer: &TraceAnalyzer) -> Vec { 47 | if 
!trace_analyzer.any_instruction_at_address_contains_reg(address, 22) { 48 | return vec![]; 49 | } 50 | 51 | vec![ 52 | // min 53 | Predicate::new( 54 | "min_carry_flag_set", 55 | address, 56 | min_carry_flag_set, 57 | None, 58 | None, 59 | ), 60 | Predicate::new( 61 | "min_parity_flag_set", 62 | address, 63 | min_parity_flag_set, 64 | None, 65 | None, 66 | ), 67 | Predicate::new( 68 | "min_adjust_flag_set", 69 | address, 70 | min_adjust_flag_set, 71 | None, 72 | None, 73 | ), 74 | Predicate::new("min_zero_flag_set", address, min_zero_flag_set, None, None), 75 | Predicate::new("min_sign_flag_set", address, min_sign_flag_set, None, None), 76 | Predicate::new("min_trap_flag_set", address, min_trap_flag_set, None, None), 77 | Predicate::new( 78 | "min_interrupt_flag_set", 79 | address, 80 | min_interrupt_flag_set, 81 | None, 82 | None, 83 | ), 84 | Predicate::new( 85 | "min_direction_flag_set", 86 | address, 87 | min_direction_flag_set, 88 | None, 89 | None, 90 | ), 91 | Predicate::new( 92 | "min_overflow_flag_set", 93 | address, 94 | min_overflow_flag_set, 95 | None, 96 | None, 97 | ), 98 | // max 99 | Predicate::new( 100 | "max_carry_flag_set", 101 | address, 102 | max_carry_flag_set, 103 | None, 104 | None, 105 | ), 106 | Predicate::new( 107 | "max_parity_flag_set", 108 | address, 109 | max_parity_flag_set, 110 | None, 111 | None, 112 | ), 113 | Predicate::new( 114 | "max_adjust_flag_set", 115 | address, 116 | max_adjust_flag_set, 117 | None, 118 | None, 119 | ), 120 | Predicate::new("max_zero_flag_set", address, max_zero_flag_set, None, None), 121 | Predicate::new("max_sign_flag_set", address, max_sign_flag_set, None, None), 122 | Predicate::new("max_trap_flag_set", address, max_trap_flag_set, None, None), 123 | Predicate::new( 124 | "max_interrupt_flag_set", 125 | address, 126 | max_interrupt_flag_set, 127 | None, 128 | None, 129 | ), 130 | Predicate::new( 131 | "max_direction_flag_set", 132 | address, 133 | max_direction_flag_set, 134 | None, 135 | None, 136 | ), 
137 | Predicate::new( 138 | "max_overflow_flag_set", 139 | address, 140 | max_overflow_flag_set, 141 | None, 142 | None, 143 | ), 144 | ] 145 | } 146 | 147 | pub fn gen_cfg_predicates(address: usize, cfg: &ControlFlowGraph) -> Vec { 148 | let mut ret = vec![]; 149 | 150 | // check if end of basic block 151 | if !cfg.is_bb_end(address) { 152 | return ret; 153 | } 154 | 155 | // #successors > 0 156 | ret.push(PredicateBuilder::gen_all_edge_val_predicates( 157 | address, 158 | "num_successors_greater", 159 | 0, 160 | num_successors_greater, 161 | )); 162 | // #successors > 1 163 | ret.push(PredicateBuilder::gen_all_edge_val_predicates( 164 | address, 165 | "num_successors_greater", 166 | 1, 167 | num_successors_greater, 168 | )); 169 | // #successors > 2 170 | ret.push(PredicateBuilder::gen_all_edge_val_predicates( 171 | address, 172 | "num_successors_greater", 173 | 2, 174 | num_successors_greater, 175 | )); 176 | 177 | // #successors == 0 178 | ret.push(PredicateBuilder::gen_all_edge_val_predicates( 179 | address, 180 | "num_successors_equal", 181 | 0, 182 | num_successors_equal, 183 | )); 184 | // #successors == 1 185 | ret.push(PredicateBuilder::gen_all_edge_val_predicates( 186 | address, 187 | "num_successors_equal", 188 | 1, 189 | num_successors_equal, 190 | )); 191 | // #successors == 2 192 | ret.push(PredicateBuilder::gen_all_edge_val_predicates( 193 | address, 194 | "num_successors_equal", 195 | 2, 196 | num_successors_equal, 197 | )); 198 | // edge addr -> x cfg edges exists 199 | ret.extend(PredicateBuilder::gen_all_edge_from_to_predicates( 200 | address, 201 | cfg, 202 | "has_edge_to", 203 | has_edge_to, 204 | )); 205 | ret.extend(PredicateBuilder::gen_all_edge_from_to_predicates( 206 | address, 207 | cfg, 208 | "edge_only_taken_to", 209 | edge_only_taken_to, 210 | )); 211 | ret 212 | } 213 | 214 | pub fn gen_all_reg_val_predicates( 215 | address: usize, 216 | trace_analyzer: &TraceAnalyzer, 217 | selector: &Selector, 218 | value: usize, 219 | ) -> Vec { 
220 | (0..REGISTERS.len()) 221 | .into_iter() 222 | .filter(|reg_index| { 223 | trace_analyzer.any_instruction_at_address_contains_reg(address, *reg_index) 224 | }) 225 | /* skip RSP */ 226 | .filter(|reg_index| *reg_index != 7) 227 | /* skip EFLAGS */ 228 | .filter(|reg_index| *reg_index != 22) 229 | /* skip memory address */ 230 | .filter(|reg_index| *reg_index != 23) 231 | .map(|reg_index| { 232 | let pred_name = gen_reg_val_name( 233 | Some(reg_index), 234 | selector_val_less_name(selector), 235 | value as u64, 236 | ); 237 | Predicate::new( 238 | &pred_name, 239 | address, 240 | selector_val_less(&selector), 241 | Some(reg_index), 242 | Some(value), 243 | ) 244 | }) 245 | .collect() 246 | } 247 | 248 | pub fn gen_register_predicates( 249 | address: usize, 250 | trace_analyzer: &TraceAnalyzer, 251 | ) -> Vec { 252 | let mut ret = vec![]; 253 | 254 | ret.extend(PredicateBuilder::gen_all_reg_val_predicates( 255 | address, 256 | trace_analyzer, 257 | &Selector::RegMax, 258 | 0xffffffffffffffff, 259 | )); 260 | ret.extend(PredicateBuilder::gen_all_reg_val_predicates( 261 | address, 262 | trace_analyzer, 263 | &Selector::RegMax, 264 | 0xffffffff, 265 | )); 266 | ret.extend(PredicateBuilder::gen_all_reg_val_predicates( 267 | address, 268 | trace_analyzer, 269 | &Selector::RegMax, 270 | 0xffff, 271 | )); 272 | ret.extend(PredicateBuilder::gen_all_reg_val_predicates( 273 | address, 274 | trace_analyzer, 275 | &Selector::RegMax, 276 | 0xff, 277 | )); 278 | 279 | ret.extend(PredicateBuilder::gen_all_reg_val_predicates( 280 | address, 281 | trace_analyzer, 282 | &Selector::RegMin, 283 | 0xffffffffffffffff, 284 | )); 285 | ret.extend(PredicateBuilder::gen_all_reg_val_predicates( 286 | address, 287 | trace_analyzer, 288 | &Selector::RegMin, 289 | 0xffffffff, 290 | )); 291 | ret.extend(PredicateBuilder::gen_all_reg_val_predicates( 292 | address, 293 | trace_analyzer, 294 | &Selector::RegMin, 295 | 0xffff, 296 | )); 297 | 
ret.extend(PredicateBuilder::gen_all_reg_val_predicates( 298 | address, 299 | trace_analyzer, 300 | &Selector::RegMin, 301 | 0xff, 302 | )); 303 | 304 | ret 305 | } 306 | 307 | pub fn gen_predicates(address: usize, trace_analyzer: &TraceAnalyzer) -> Vec { 308 | let mut ret = vec![]; 309 | 310 | let skip_register_predicates = 311 | PredicateBuilder::skip_register_mnemonic(trace_analyzer.get_any_mnemonic(address)); 312 | 313 | ret.extend(PredicateBuilder::gen_visited(address)); 314 | 315 | if !skip_register_predicates { 316 | ret.extend(PredicateSynthesizer::constant_predicates_at_address( 317 | address, 318 | trace_analyzer, 319 | )); 320 | 321 | ret.extend(PredicateBuilder::gen_register_predicates( 322 | address, 323 | &trace_analyzer, 324 | )); 325 | } 326 | 327 | ret.extend(PredicateBuilder::gen_cfg_predicates( 328 | address, 329 | &trace_analyzer.cfg, 330 | )); 331 | 332 | if !skip_register_predicates { 333 | ret.extend(PredicateBuilder::gen_flag_predicates( 334 | address, 335 | &trace_analyzer, 336 | )); 337 | } 338 | 339 | ret 340 | } 341 | 342 | fn skip_register_mnemonic(mnemonic: String) -> bool { 343 | match mnemonic.as_str() { 344 | // leave instruction 345 | _ if mnemonic.contains("leave") => true, 346 | // contains floating point register 347 | _ if mnemonic.contains("xmm") => true, 348 | // contains rsp but is no memory operation 349 | _ if !mnemonic.contains("[") && mnemonic.contains("rsp") => true, 350 | // moves a constant into register/memory 351 | _ if mnemonic.contains("mov") && mnemonic.contains(", 0x") => true, 352 | _ => false, 353 | } 354 | } 355 | } 356 | -------------------------------------------------------------------------------- /root_cause_analysis/trace_analysis/src/predicate_synthesizer.rs: -------------------------------------------------------------------------------- 1 | use crate::predicates::*; 2 | use crate::trace::{Selector, REGISTERS}; 3 | use crate::trace_analyzer::TraceAnalyzer; 4 | use rayon::prelude::*; 5 | 6 | pub struct 
PredicateSynthesizer {} 7 | 8 | pub fn gen_reg_val_name(reg_index: Option, pred_name: String, value: u64) -> String { 9 | match reg_index.is_some() { 10 | true => format!( 11 | "{} {} 0x{:x}", 12 | REGISTERS[reg_index.unwrap()], 13 | pred_name, 14 | value 15 | ), 16 | false => format!("{} {}", pred_name, value), 17 | } 18 | } 19 | 20 | impl PredicateSynthesizer { 21 | pub fn constant_predicates_at_address( 22 | address: usize, 23 | trace_analyzer: &TraceAnalyzer, 24 | ) -> Vec { 25 | let mut predicates = vec![]; 26 | 27 | predicates.extend( 28 | PredicateSynthesizer::register_constant_predicates_at_address( 29 | address, 30 | trace_analyzer, 31 | &Selector::RegMax, 32 | ), 33 | ); 34 | predicates.extend( 35 | PredicateSynthesizer::register_constant_predicates_at_address( 36 | address, 37 | trace_analyzer, 38 | &Selector::RegMin, 39 | ), 40 | ); 41 | 42 | predicates 43 | } 44 | 45 | fn register_constant_predicates_at_address( 46 | address: usize, 47 | trace_analyzer: &TraceAnalyzer, 48 | selector: &Selector, 49 | ) -> Vec { 50 | (0..REGISTERS.len()) 51 | .into_par_iter() 52 | .filter(|reg_index| { 53 | trace_analyzer.any_instruction_at_address_contains_reg(address, *reg_index) 54 | }) 55 | /* skip RSP */ 56 | .filter(|reg_index| *reg_index != 7) 57 | /* skip EFLAGS */ 58 | .filter(|reg_index| *reg_index != 22) 59 | /* skip memory address */ 60 | .filter(|reg_index| *reg_index != 23) 61 | /* skip all heap addresses */ 62 | .filter(|reg_index| { 63 | !trace_analyzer 64 | .values_at_address(address, selector, Some(*reg_index)) 65 | .into_iter() 66 | .all(|v: u64| { 67 | trace_analyzer.memory_addresses.heap_start <= v as usize 68 | && v as usize <= trace_analyzer.memory_addresses.heap_end 69 | }) 70 | }) 71 | /* skip all stack addresses */ 72 | .filter(|reg_index| { 73 | !trace_analyzer 74 | .values_at_address(address, selector, Some(*reg_index)) 75 | .into_iter() 76 | .all(|v: u64| { 77 | trace_analyzer.memory_addresses.stack_start <= v as usize 78 | && v as usize <= 
trace_analyzer.memory_addresses.stack_end 79 | }) 80 | }) 81 | .flat_map(|reg_index| { 82 | PredicateSynthesizer::synthesize_constant_predicates( 83 | address, 84 | trace_analyzer, 85 | selector, 86 | Some(reg_index), 87 | ) 88 | }) 89 | .collect() 90 | } 91 | 92 | fn synthesize_constant_predicates( 93 | address: usize, 94 | trace_analyzer: &TraceAnalyzer, 95 | selector: &Selector, 96 | reg_index: Option, 97 | ) -> Vec { 98 | let values = trace_analyzer.unique_values_at_address(address, selector, reg_index); 99 | if values.is_empty() { 100 | return vec![]; 101 | } 102 | 103 | let mut f: Vec<_> = values 104 | .par_iter() 105 | .map(|v| { 106 | ( 107 | v, 108 | PredicateSynthesizer::evaluate_value_at_address( 109 | address, 110 | trace_analyzer, 111 | selector, 112 | reg_index, 113 | *v, 114 | ), 115 | ) 116 | }) 117 | .collect(); 118 | 119 | f.sort_by(|(_, f1), (_, f2)| f1.partial_cmp(&f2).unwrap()); 120 | 121 | PredicateSynthesizer::build_constant_predicates( 122 | address, 123 | selector, 124 | reg_index, 125 | PredicateSynthesizer::arithmetic_mean(*f.first().unwrap().0, &values), 126 | PredicateSynthesizer::arithmetic_mean(*f.last().unwrap().0, &values), 127 | ) 128 | } 129 | 130 | fn arithmetic_mean(v1: u64, values: &Vec) -> u64 { 131 | match values.iter().filter(|v| *v < &v1).max() { 132 | Some(v2) => ((v1 as f64 + *v2 as f64) / 2.0).round() as u64, 133 | None => v1, 134 | } 135 | } 136 | 137 | fn build_constant_predicates( 138 | address: usize, 139 | selector: &Selector, 140 | reg_index: Option, 141 | v1: u64, 142 | v2: u64, 143 | ) -> Vec { 144 | let pred_name1 = 145 | gen_reg_val_name(reg_index, selector_val_greater_or_equal_name(selector), v1); 146 | let pred_name2 = gen_reg_val_name(reg_index, selector_val_less_name(selector), v2); 147 | 148 | vec![ 149 | Predicate::new( 150 | &pred_name1, 151 | address, 152 | selector_val_greater_or_equal(selector), 153 | reg_index, 154 | Some(v1 as usize), 155 | ), 156 | Predicate::new( 157 | &pred_name2, 158 | address, 
159 | selector_val_less(selector), 160 | reg_index, 161 | Some(v2 as usize), 162 | ), 163 | ] 164 | } 165 | 166 | fn evaluate_value_at_address( 167 | address: usize, 168 | trace_analyzer: &TraceAnalyzer, 169 | selector: &Selector, 170 | reg_index: Option, 171 | val: u64, 172 | ) -> f64 { 173 | let pred_name = format!( 174 | "{:?} {} {}", 175 | reg_index, 176 | selector_val_less_name(selector), 177 | val 178 | ); 179 | 180 | let predicate = Predicate::new( 181 | &pred_name, 182 | address, 183 | selector_val_less(selector), 184 | reg_index, 185 | Some(val as usize), 186 | ); 187 | 188 | PredicateSynthesizer::evaluate_predicate_with_reachability( 189 | address, 190 | trace_analyzer, 191 | &predicate, 192 | ) 193 | } 194 | 195 | pub fn evaluate_predicate_with_reachability( 196 | address: usize, 197 | trace_analyzer: &TraceAnalyzer, 198 | predicate: &Predicate, 199 | ) -> f64 { 200 | let true_positives = trace_analyzer 201 | .crashes 202 | .as_slice() 203 | .par_iter() 204 | .filter(|t| t.instructions.get(&address).is_some()) 205 | .map(|t| t.instructions.get(&address)) 206 | .filter(|i| predicate.execute(i)) 207 | .count() as f64 208 | / trace_analyzer.crashes.len() as f64; 209 | let true_negatives = (trace_analyzer 210 | .non_crashes 211 | .as_slice() 212 | .par_iter() 213 | .filter(|t| t.instructions.get(&address).is_some()) 214 | .map(|t| t.instructions.get(&address)) 215 | .filter(|i| !predicate.execute(i)) 216 | .count() as f64 217 | + trace_analyzer 218 | .non_crashes 219 | .as_slice() 220 | .par_iter() 221 | .filter(|t| t.instructions.get(&address).is_none()) 222 | .count() as f64) 223 | / trace_analyzer.non_crashes.len() as f64; 224 | 225 | let score = (true_positives + true_negatives) / 2.0; 226 | 227 | score 228 | } 229 | } 230 | -------------------------------------------------------------------------------- /root_cause_analysis/trace_analysis/src/predicates.rs: -------------------------------------------------------------------------------- 1 | use 
crate::trace::{Instruction, Register, Selector}; 2 | use serde::{Deserialize, Serialize}; 3 | 4 | #[derive(Debug, Clone, Serialize, Deserialize)] 5 | pub struct SerializedPredicate { 6 | pub name: String, 7 | pub score: f64, 8 | pub address: usize, 9 | } 10 | 11 | impl SerializedPredicate { 12 | pub fn new(name: String, address: usize, score: f64) -> SerializedPredicate { 13 | SerializedPredicate { 14 | name, 15 | score, 16 | address, 17 | } 18 | } 19 | 20 | pub fn to_string(&self) -> String { 21 | format!("{:#018x} -- {} -- {}", self.address, self.name, self.score) 22 | } 23 | 24 | pub fn serialize(&self) -> String { 25 | serde_json::to_string(&self).expect(&format!( 26 | "Could not serialize predicate {}", 27 | self.to_string() 28 | )) 29 | } 30 | } 31 | 32 | #[derive(Clone)] 33 | pub struct Predicate { 34 | pub name: String, 35 | p1: Option, 36 | p2: Option, 37 | function: fn(&Instruction, Option, Option) -> bool, 38 | pub score: f64, 39 | pub address: usize, 40 | } 41 | 42 | impl Predicate { 43 | pub fn new( 44 | name: &str, 45 | address: usize, 46 | function: fn(&Instruction, Option, Option) -> bool, 47 | p1: Option, 48 | p2: Option, 49 | ) -> Predicate { 50 | Predicate { 51 | name: name.to_string(), 52 | address, 53 | p1, 54 | p2, 55 | function, 56 | score: 0.0, 57 | } 58 | } 59 | 60 | pub fn serialize(&self) -> String { 61 | let serialized = SerializedPredicate::new(self.name.to_string(), self.address, self.score); 62 | serde_json::to_string(&serialized).unwrap() 63 | } 64 | 65 | pub fn to_serialzed(&self) -> SerializedPredicate { 66 | SerializedPredicate::new(self.name.to_string(), self.address, self.score) 67 | } 68 | 69 | pub fn execute(&self, instruction_option: &Option<&Instruction>) -> bool { 70 | match instruction_option { 71 | Some(instruction) => (self.function)(instruction, self.p1, self.p2), 72 | None => false, 73 | } 74 | } 75 | 76 | pub fn gen_empty(address: usize) -> Predicate { 77 | Predicate::new("empty", address, empty, None, None) 78 | } 79 
| 80 | pub fn to_string(&self) -> String { 81 | format!("{}", self.name) 82 | } 83 | } 84 | 85 | pub fn empty(_: &Instruction, _: Option, _: Option) -> bool { 86 | false 87 | } 88 | 89 | pub fn is_visited(_: &Instruction, _: Option, _: Option) -> bool { 90 | true 91 | } 92 | 93 | pub fn selector_val_less_name(selector: &Selector) -> String { 94 | match selector { 95 | Selector::RegMin => format!("min_reg_val_less"), 96 | Selector::RegMax => format!("max_reg_val_less"), 97 | Selector::RegMaxMinDiff => format!("max_min_diff_reg_val_less"), 98 | Selector::InsCount => format!("ins_count_less"), 99 | _ => unreachable!(), 100 | } 101 | } 102 | 103 | pub fn selector_val_less( 104 | selector: &Selector, 105 | ) -> fn(&Instruction, Option, Option) -> bool { 106 | match selector { 107 | Selector::RegMin => min_reg_val_less, 108 | Selector::RegMax => max_reg_val_less, 109 | Selector::RegMaxMinDiff => max_min_diff_reg_val_less, 110 | // Selector::InsCount => ins_count_less, 111 | _ => unreachable!(), 112 | } 113 | } 114 | 115 | pub fn min_reg_val_less( 116 | instruction: &Instruction, 117 | reg_index: Option, 118 | value: Option, 119 | ) -> bool { 120 | match instruction.registers_min.get(reg_index.unwrap()) { 121 | Some(reg) => reg.value() < value.unwrap() as u64, 122 | None => false, 123 | } 124 | } 125 | 126 | pub fn max_reg_val_less( 127 | instruction: &Instruction, 128 | reg_index: Option, 129 | value: Option, 130 | ) -> bool { 131 | match instruction.registers_max.get(reg_index.unwrap()) { 132 | Some(reg) => reg.value() < value.unwrap() as u64, 133 | None => false, 134 | } 135 | } 136 | 137 | pub fn max_min_diff_reg_val_less( 138 | instruction: &Instruction, 139 | reg_index: Option, 140 | value: Option, 141 | ) -> bool { 142 | match ( 143 | instruction.registers_max.get(reg_index.unwrap()), 144 | instruction.registers_min.get(reg_index.unwrap()), 145 | ) { 146 | (Some(reg_max), Some(reg_min)) => reg_max.value() - reg_min.value() < value.unwrap() as u64, 147 | _ => false, 
148 | } 149 | } 150 | 151 | pub fn selector_val_greater_or_equal_name(selector: &Selector) -> String { 152 | match selector { 153 | Selector::RegMin => format!("min_reg_val_greater_or_equal"), 154 | Selector::RegMax => format!("max_reg_val_greater_or_equal"), 155 | Selector::RegMaxMinDiff => format!("max_min_diff_reg_val_greater_or_equal"), 156 | Selector::InsCount => format!("ins_count_greater_or_equal"), 157 | _ => unreachable!(), 158 | } 159 | } 160 | 161 | pub fn selector_val_greater_or_equal( 162 | selector: &Selector, 163 | ) -> fn(&Instruction, Option, Option) -> bool { 164 | match selector { 165 | Selector::RegMin => min_reg_val_greater_or_equal, 166 | Selector::RegMax => max_reg_val_greater_or_equal, 167 | Selector::RegMaxMinDiff => max_min_diff_reg_val_greater_or_equal, 168 | // Selector::InsCount => ins_count_greater_or_equal, 169 | _ => unreachable!(), 170 | } 171 | } 172 | 173 | pub fn min_reg_val_greater_or_equal( 174 | instruction: &Instruction, 175 | reg_index: Option, 176 | value: Option, 177 | ) -> bool { 178 | match instruction.registers_min.get(reg_index.unwrap()) { 179 | Some(reg) => reg.value() >= value.unwrap() as u64, 180 | None => false, 181 | } 182 | } 183 | 184 | pub fn max_reg_val_greater_or_equal( 185 | instruction: &Instruction, 186 | reg_index: Option, 187 | value: Option, 188 | ) -> bool { 189 | match instruction.registers_max.get(reg_index.unwrap()) { 190 | Some(reg) => reg.value() >= value.unwrap() as u64, 191 | None => false, 192 | } 193 | } 194 | 195 | pub fn max_min_diff_reg_val_greater_or_equal( 196 | instruction: &Instruction, 197 | reg_index: Option, 198 | value: Option, 199 | ) -> bool { 200 | match ( 201 | instruction.registers_max.get(reg_index.unwrap()), 202 | instruction.registers_min.get(reg_index.unwrap()), 203 | ) { 204 | (Some(reg_max), Some(reg_min)) => { 205 | reg_max.value() - reg_min.value() >= value.unwrap() as u64 206 | } 207 | _ => false, 208 | } 209 | } 210 | 211 | fn is_flag_bit_set(instruction: 
&Instruction, reg_type: Selector, pos: u64) -> bool { 212 | match reg_type { 213 | Selector::RegMin => is_reg_bit_set(instruction.registers_min.get(22), pos), 214 | Selector::RegMax => is_reg_bit_set(instruction.registers_max.get(22), pos), 215 | // Selector::RegLast => is_reg_bit_set(instruction.registers_last.get(22), pos), 216 | _ => unreachable!(), 217 | } 218 | } 219 | 220 | fn is_reg_bit_set(reg: Option<&Register>, pos: u64) -> bool { 221 | match reg.is_some() { 222 | true => match reg.unwrap().value() & (1 << pos) { 223 | 0 => false, 224 | _ => true, 225 | }, 226 | _ => false, 227 | } 228 | } 229 | 230 | pub fn min_carry_flag_set(instruction: &Instruction, _: Option, _: Option) -> bool { 231 | is_flag_bit_set(instruction, Selector::RegMin, 0) 232 | } 233 | 234 | pub fn min_parity_flag_set(instruction: &Instruction, _: Option, _: Option) -> bool { 235 | is_flag_bit_set(instruction, Selector::RegMin, 2) 236 | } 237 | 238 | pub fn min_adjust_flag_set(instruction: &Instruction, _: Option, _: Option) -> bool { 239 | is_flag_bit_set(instruction, Selector::RegMin, 4) 240 | } 241 | 242 | pub fn min_zero_flag_set(instruction: &Instruction, _: Option, _: Option) -> bool { 243 | is_flag_bit_set(instruction, Selector::RegMin, 6) 244 | } 245 | 246 | pub fn min_sign_flag_set(instruction: &Instruction, _: Option, _: Option) -> bool { 247 | is_flag_bit_set(instruction, Selector::RegMin, 7) 248 | } 249 | 250 | pub fn min_trap_flag_set(instruction: &Instruction, _: Option, _: Option) -> bool { 251 | is_flag_bit_set(instruction, Selector::RegMin, 8) 252 | } 253 | 254 | pub fn min_interrupt_flag_set( 255 | instruction: &Instruction, 256 | _: Option, 257 | _: Option, 258 | ) -> bool { 259 | is_flag_bit_set(instruction, Selector::RegMin, 9) 260 | } 261 | 262 | pub fn min_direction_flag_set( 263 | instruction: &Instruction, 264 | _: Option, 265 | _: Option, 266 | ) -> bool { 267 | is_flag_bit_set(instruction, Selector::RegMin, 10) 268 | } 269 | 270 | pub fn min_overflow_flag_set( 
271 | instruction: &Instruction, 272 | _: Option, 273 | _: Option, 274 | ) -> bool { 275 | is_flag_bit_set(instruction, Selector::RegMin, 11) 276 | } 277 | 278 | pub fn max_carry_flag_set(instruction: &Instruction, _: Option, _: Option) -> bool { 279 | is_flag_bit_set(instruction, Selector::RegMax, 0) 280 | } 281 | 282 | pub fn max_parity_flag_set(instruction: &Instruction, _: Option, _: Option) -> bool { 283 | is_flag_bit_set(instruction, Selector::RegMax, 2) 284 | } 285 | 286 | pub fn max_adjust_flag_set(instruction: &Instruction, _: Option, _: Option) -> bool { 287 | is_flag_bit_set(instruction, Selector::RegMax, 4) 288 | } 289 | 290 | pub fn max_zero_flag_set(instruction: &Instruction, _: Option, _: Option) -> bool { 291 | is_flag_bit_set(instruction, Selector::RegMax, 6) 292 | } 293 | 294 | pub fn max_sign_flag_set(instruction: &Instruction, _: Option, _: Option) -> bool { 295 | is_flag_bit_set(instruction, Selector::RegMax, 7) 296 | } 297 | 298 | pub fn max_trap_flag_set(instruction: &Instruction, _: Option, _: Option) -> bool { 299 | is_flag_bit_set(instruction, Selector::RegMax, 8) 300 | } 301 | 302 | pub fn max_interrupt_flag_set( 303 | instruction: &Instruction, 304 | _: Option, 305 | _: Option, 306 | ) -> bool { 307 | is_flag_bit_set(instruction, Selector::RegMax, 9) 308 | } 309 | 310 | pub fn max_direction_flag_set( 311 | instruction: &Instruction, 312 | _: Option, 313 | _: Option, 314 | ) -> bool { 315 | is_flag_bit_set(instruction, Selector::RegMax, 10) 316 | } 317 | 318 | pub fn max_overflow_flag_set( 319 | instruction: &Instruction, 320 | _: Option, 321 | _: Option, 322 | ) -> bool { 323 | is_flag_bit_set(instruction, Selector::RegMax, 11) 324 | } 325 | 326 | pub fn num_successors_greater( 327 | instruction: &Instruction, 328 | n: Option, 329 | _: Option, 330 | ) -> bool { 331 | instruction.successors.len() > n.unwrap() 332 | } 333 | 334 | pub fn num_successors_equal(instruction: &Instruction, n: Option, _: Option) -> bool { 335 | 
instruction.successors.len() == n.unwrap() 336 | } 337 | 338 | pub fn has_edge_to(instruction: &Instruction, address: Option, _: Option) -> bool { 339 | instruction 340 | .successors 341 | .iter() 342 | .any(|s| s.address == address.unwrap()) 343 | } 344 | 345 | pub fn edge_only_taken_to( 346 | instruction: &Instruction, 347 | address: Option, 348 | _: Option, 349 | ) -> bool { 350 | instruction 351 | .successors 352 | .iter() 353 | .any(|s| s.address == address.unwrap()) 354 | && instruction.successors.len() == 1 355 | } 356 | -------------------------------------------------------------------------------- /root_cause_analysis/trace_analysis/src/trace.rs: -------------------------------------------------------------------------------- 1 | use serde::{Deserialize, Serialize}; 2 | use std::collections::hash_map::Keys; 3 | use std::collections::{HashMap, HashSet}; 4 | use std::fs; 5 | use std::io::Read; 6 | 7 | pub static REGISTERS: [&str; 25] = [ 8 | "rax", 9 | "rbx", 10 | "rcx", 11 | "rdx", 12 | "rsi", 13 | "rdi", 14 | "rbp", 15 | "rsp", 16 | "r8", 17 | "r9", 18 | "r10", 19 | "r11", 20 | "r12", 21 | "r13", 22 | "r14", 23 | "r15", 24 | "seg_cs", 25 | "seg_ss", 26 | "seg_ds", 27 | "seg_es", 28 | "seg_fs", 29 | "seg_gs", 30 | "eflags", 31 | "memory_address", 32 | "memory_value", 33 | ]; 34 | 35 | pub enum Selector { 36 | RegMin, 37 | RegMax, 38 | RegLast, 39 | RegMaxMinDiff, 40 | InsCount, 41 | } 42 | 43 | #[derive(Clone, Serialize, Deserialize)] 44 | pub struct Register { 45 | value: u64, 46 | } 47 | 48 | impl Register { 49 | pub fn new(_: &str, value: u64) -> Register { 50 | Register { value } 51 | } 52 | 53 | pub fn value(&self) -> u64 { 54 | self.value 55 | } 56 | 57 | pub fn to_string(&self) -> String { 58 | format!("{:#018x}", self.value()) 59 | } 60 | 61 | pub fn to_string_extended(&self) -> String { 62 | format!("{:#018x}", self.value()) 63 | } 64 | } 65 | 66 | #[derive(Clone, Serialize, Deserialize)] 67 | pub struct Registers(HashMap); 68 | 69 | impl 
Registers { 70 | pub fn get(&self, index: usize) -> Option<&Register> { 71 | self.0.get(&index) 72 | } 73 | 74 | pub fn insert(&mut self, index: usize, reg: Register) { 75 | self.0.insert(index, reg); 76 | } 77 | 78 | pub fn len(&self) -> usize { 79 | self.0.len() 80 | } 81 | 82 | pub fn keys(&self) -> Keys<usize, Register> { 83 | self.0.keys() 84 | } 85 | 86 | pub fn values(&self) -> impl Iterator<Item = &Register> { 87 | self.0.values() 88 | } 89 | 90 | pub fn to_string(&self) -> String { 91 | self.0 92 | .values() 93 | .map(|r| format!("{};", r.to_string_extended())) 94 | .collect() 95 | } 96 | } 97 | 98 | #[derive(Clone, Serialize, Deserialize)] 99 | pub struct Memory { 100 | pub min_address: u64, 101 | pub max_address: u64, 102 | pub last_address: u64, 103 | pub min_value: u64, 104 | pub max_value: u64, 105 | pub last_value: u64, 106 | } 107 | 108 | impl Memory { 109 | pub fn to_string(&self) -> String { 110 | format!( 111 | "memory: {:#018x};{:#018x};{:#018x};{:#018x};{:#018x};{:#018x}", 112 | self.min_address, 113 | self.max_address, 114 | self.last_address, 115 | self.min_value, 116 | self.max_value, 117 | self.last_value 118 | ) 119 | } 120 | } 121 | 122 | #[derive(Clone, Serialize, Deserialize)] 123 | pub struct Instruction { 124 | pub address: usize, 125 | pub mnemonic: String, 126 | pub registers_min: Registers, 127 | pub registers_max: Registers, 128 | pub successors: Vec<Successor>, 129 | } 130 | 131 | impl Instruction { 132 | pub fn to_string(&self) -> String { 133 | let mut ret = String::new(); 134 | ret.push_str(&format!("{:#018x};", self.address)); 135 | ret.push_str(&format!("{};", self.mnemonic)); 136 | 137 | for index in 0..REGISTERS.len() { 138 | if let Some(register) = self.registers_min.get(index) { 139 | ret.push_str(&format!( 140 | "{}: {};", 141 | REGISTERS[index], 142 | register.to_string_extended() 143 | )); 144 | } 145 | if let Some(register) = self.registers_max.get(index) { 146 | ret.push_str(&format!( 147 | "{}: {};", 148 | REGISTERS[index], 149 | 
register.to_string_extended() 150 | )); 151 | } 152 | } 153 | 154 | for successor in self.successors.iter() { 155 | ret.push_str(&format!("successor: {};", successor.to_string())); 156 | } 157 | 158 | ret 159 | } 160 | } 161 | 162 | #[derive(Clone, Serialize, Deserialize)] 163 | pub struct SerializedInstruction { 164 | pub address: usize, 165 | pub mnemonic: String, 166 | pub registers_min: Registers, 167 | pub registers_max: Registers, 168 | pub registers_last: Registers, 169 | pub last_successor: usize, 170 | pub count: usize, 171 | pub memory: Option, 172 | } 173 | 174 | impl SerializedInstruction { 175 | fn add_mem_to_registers(&self) -> (Registers, Registers, Registers) { 176 | let mut registers_min = self.registers_min.clone(); 177 | let mut registers_max = self.registers_max.clone(); 178 | let mut registers_last = self.registers_last.clone(); 179 | 180 | if let Some(memory) = &self.memory { 181 | registers_min.insert(23, Register::new("memory_address", memory.min_address)); 182 | registers_max.insert(23, Register::new("memory_address", memory.max_address)); 183 | registers_last.insert(23, Register::new("memory_address", memory.last_address)); 184 | 185 | registers_min.insert(24, Register::new("memory_value", memory.min_value)); 186 | registers_max.insert(24, Register::new("memory_value", memory.max_value)); 187 | registers_last.insert(24, Register::new("memory_value", memory.last_value)); 188 | } 189 | 190 | (registers_min, registers_max, registers_last) 191 | } 192 | 193 | pub fn to_instruction(&self) -> Instruction { 194 | let (registers_min, registers_max, _) = self.add_mem_to_registers(); 195 | 196 | Instruction { 197 | address: self.address, 198 | mnemonic: self.mnemonic.to_string(), 199 | registers_min, 200 | registers_max, 201 | successors: vec![], 202 | } 203 | } 204 | } 205 | 206 | #[derive(Clone, Serialize, Deserialize)] 207 | struct SerializedEdge { 208 | from: usize, 209 | to: usize, 210 | count: usize, 211 | } 212 | 213 | #[derive(Clone, 
Serialize, Deserialize)] 214 | struct SerializedTrace { 215 | pub instructions: Vec, 216 | pub edges: Vec, 217 | pub first_address: usize, 218 | pub last_address: usize, 219 | pub image_base: usize, 220 | } 221 | 222 | impl SerializedTrace { 223 | pub fn to_trace(name: String, serialized: SerializedTrace) -> Trace { 224 | let mut instructions: HashMap = serialized 225 | .instructions 226 | .into_iter() 227 | .map(|instr| (instr.address, instr.to_instruction())) 228 | .collect(); 229 | for edge in &serialized.edges { 230 | if let Some(entry) = instructions.get_mut(&edge.from) { 231 | entry.successors.push(Successor { address: edge.to }); 232 | } 233 | } 234 | for v in instructions.values_mut() { 235 | v.successors.sort_by(|a, b| a.address.cmp(&b.address)) 236 | } 237 | 238 | Trace { 239 | name, 240 | instructions, 241 | image_base: serialized.image_base, 242 | first_address: serialized.first_address, 243 | last_address: serialized.last_address, 244 | } 245 | } 246 | } 247 | 248 | #[derive(Clone, Serialize, Deserialize, Copy)] 249 | pub struct Successor { 250 | pub address: usize, 251 | } 252 | 253 | impl Successor { 254 | pub fn to_string(&self) -> String { 255 | format!("{:#018x}", self.address) 256 | } 257 | } 258 | 259 | #[derive(Clone, Serialize, Deserialize)] 260 | pub struct Trace { 261 | pub name: String, 262 | pub image_base: usize, 263 | pub instructions: HashMap, 264 | pub first_address: usize, 265 | pub last_address: usize, 266 | } 267 | 268 | impl Trace { 269 | pub fn from_trace_file(file_path: String) -> Trace { 270 | let content = 271 | fs::read_to_string(&file_path).expect(&format!("File {} not found!", &file_path)); 272 | Trace::from_file(file_path, content) 273 | } 274 | 275 | pub fn from_zip_file(file_path: String) -> Trace { 276 | let zip_file = 277 | fs::File::open(&file_path).expect(&format!("Could not open file {}", &file_path)); 278 | let mut zip_archive = zip::ZipArchive::new(zip_file) 279 | .expect(&format!("Could not open archive {}", 
&file_path)); 280 | 281 | let mut trace_file = zip_archive.by_index(0).unwrap(); 282 | let trace_file_path = trace_file.sanitized_name().to_str().unwrap().to_string(); 283 | 284 | let mut trace_content = String::new(); 285 | trace_file 286 | .read_to_string(&mut trace_content) 287 | .expect(&format!("Could not read unzipped file {}", trace_file_path)); 288 | 289 | Trace::from_file(trace_file_path, trace_content) 290 | } 291 | 292 | fn from_file(file_path: String, content: String) -> Trace { 293 | let serialized_trace: SerializedTrace = serde_json::from_str(&content) 294 | .expect(&format!("Could not deserialize file {}", &file_path)); 295 | SerializedTrace::to_trace(file_path, serialized_trace) 296 | } 297 | 298 | pub fn visited_addresses(&self) -> HashSet { 299 | self.instructions.keys().map(|x| *x).collect() 300 | } 301 | 302 | pub fn to_string(&self) -> String { 303 | format!( 304 | "{};{:#018x};{:#018x};{:#018x}", 305 | self.name, self.image_base, self.first_address, self.last_address 306 | ) 307 | } 308 | } 309 | 310 | pub struct TraceVec(pub Vec); 311 | 312 | impl TraceVec { 313 | pub fn from_vec(v: Vec) -> TraceVec { 314 | TraceVec(v) 315 | } 316 | 317 | pub fn iter_instructions_at_address( 318 | &self, 319 | address: usize, 320 | ) -> impl Iterator { 321 | self.iter() 322 | .filter(move |t| t.instructions.contains_key(&address)) 323 | .map(move |t| t.instructions.get(&address).unwrap()) 324 | } 325 | 326 | pub fn len(&self) -> usize { 327 | self.0.len() 328 | } 329 | 330 | pub fn iter_all_instructions(&self) -> impl Iterator { 331 | self.0.iter().flat_map(|t| t.instructions.values()) 332 | } 333 | 334 | pub fn iter(&self) -> impl Iterator { 335 | self.0.iter() 336 | } 337 | 338 | pub fn as_slice(&self) -> &[Trace] { 339 | self.0.as_slice() 340 | } 341 | } 342 | -------------------------------------------------------------------------------- /root_cause_analysis/trace_analysis/src/trace_analyzer.rs: 
-------------------------------------------------------------------------------- 1 | use crate::config::Config; 2 | use crate::control_flow_graph::{CFGCollector, ControlFlowGraph}; 3 | use crate::predicate_analysis::PredicateAnalyzer; 4 | use crate::predicates::{Predicate, SerializedPredicate}; 5 | use crate::trace::{Instruction, Selector, Trace, TraceVec}; 6 | use crate::trace_integrity::TraceIntegrityChecker; 7 | use glob::glob; 8 | use rand::seq::SliceRandom; 9 | use rand::thread_rng; 10 | use rayon::prelude::*; 11 | use serde::{Deserialize, Serialize}; 12 | use std::collections::{HashMap, HashSet}; 13 | use std::fs; 14 | use std::fs::{read_to_string, File}; 15 | use std::io::Write; 16 | use std::process::exit; 17 | 18 | pub struct TraceAnalyzer { 19 | pub crashes: TraceVec, 20 | pub non_crashes: TraceVec, 21 | pub address_scores: HashMap, 22 | pub cfg: ControlFlowGraph, 23 | pub memory_addresses: MemoryAddresses, 24 | } 25 | 26 | #[derive(Clone, Serialize, Deserialize)] 27 | pub struct MemoryAddresses { 28 | pub heap_start: usize, 29 | pub heap_end: usize, 30 | pub stack_start: usize, 31 | pub stack_end: usize, 32 | } 33 | 34 | impl MemoryAddresses { 35 | pub fn read_from_file(config: &Config) -> MemoryAddresses { 36 | let file_path = format!("{}/addresses.json", config.output_directory); 37 | let content = 38 | fs::read_to_string(&file_path).expect(&format!("File {} not found!", &file_path)); 39 | serde_json::from_str(&content).expect(&format!("Could not deserialize file {}", &file_path)) 40 | } 41 | } 42 | 43 | fn store_trace(trace: &Trace, must_have: &Option>) -> bool { 44 | match must_have { 45 | Some(addresses) => addresses.iter().any(|k| trace.instructions.contains_key(k)), 46 | None => true, 47 | } 48 | } 49 | 50 | pub fn read_crash_blacklist( 51 | blacklist_crashes: bool, 52 | crash_blacklist_path: &String, 53 | ) -> Option> { 54 | if blacklist_crashes { 55 | Some( 56 | read_to_string(crash_blacklist_path) 57 | .expect("Could not read crash blacklist") 
58 | .split("\n") 59 | .map(|s| { 60 | s.split("/") 61 | .last() 62 | .expect(&format!("Could not split string {}", s)) 63 | .to_string() 64 | }) 65 | .filter(|s| !s.is_empty()) 66 | .collect(), 67 | ) 68 | } else { 69 | None 70 | } 71 | } 72 | 73 | pub fn blacklist_path(path: &String, blacklist: &Option>) -> bool { 74 | blacklist 75 | .as_ref() 76 | .unwrap_or(&vec![]) 77 | .iter() 78 | .any(|p| path.contains(p)) 79 | } 80 | 81 | fn parse_traces( 82 | path: &String, 83 | config: &Config, 84 | must_include: Option>, 85 | blacklist_paths: Option>, 86 | ) -> TraceVec { 87 | let pattern = match config.zipped { 88 | false => format!("{}/*trace", path), 89 | true => format!("{}/*.zip", path), 90 | }; 91 | 92 | let mut paths: Vec = glob(&pattern) 93 | .unwrap() 94 | .map(|p| p.unwrap().to_str().unwrap().to_string()) 95 | .filter(|p| !blacklist_path(&p, &blacklist_paths)) 96 | .collect(); 97 | 98 | if config.random_traces() { 99 | paths.shuffle(&mut thread_rng()); 100 | } 101 | 102 | match config.zipped { 103 | false => TraceVec::from_vec( 104 | paths 105 | .into_par_iter() 106 | .map(|s| Trace::from_trace_file(s)) 107 | .take(if config.random_traces() { 108 | config.random_traces 109 | } else { 110 | 0xffff_ffff_ffff_ffff 111 | }) 112 | .filter(|t| store_trace(&t, &must_include)) 113 | .collect(), 114 | ), 115 | true => TraceVec::from_vec( 116 | paths 117 | .into_par_iter() 118 | .map(|s| Trace::from_zip_file(s)) 119 | .take(if config.random_traces() { 120 | config.random_traces 121 | } else { 122 | 0xffff_ffff_ffff_ffff 123 | }) 124 | .filter(|t| store_trace(&t, &must_include)) 125 | .collect(), 126 | ), 127 | } 128 | } 129 | 130 | impl TraceAnalyzer { 131 | pub fn new(config: &Config) -> TraceAnalyzer { 132 | println!("reading crashes"); 133 | let crash_blacklist = 134 | read_crash_blacklist(config.blacklist_crashes(), &config.crash_blacklist_path); 135 | let crashes = parse_traces(&config.path_to_crashes, config, None, crash_blacklist); 136 | let crashing_addresses: 
Option> = match config.filter_non_crashes { 137 | true => Some(crashes.iter().map(|t| t.last_address).collect()), 138 | false => None, 139 | }; 140 | 141 | println!("reading non-crashes"); 142 | let non_crashes = parse_traces( 143 | &config.path_to_non_crashes, 144 | config, 145 | crashing_addresses, 146 | None, 147 | ); 148 | 149 | println!( 150 | "{} crashes and {} non-crashes", 151 | crashes.len(), 152 | non_crashes.len() 153 | ); 154 | 155 | let mut trace_analyzer = TraceAnalyzer { 156 | crashes, 157 | non_crashes, 158 | address_scores: HashMap::new(), 159 | cfg: ControlFlowGraph::new(), 160 | memory_addresses: MemoryAddresses::read_from_file(config), 161 | }; 162 | 163 | if config.check_traces || config.dump_scores || config.debug_predicate() { 164 | let mut cfg_collector = CFGCollector::new(); 165 | println!("filling cfg"); 166 | trace_analyzer.fill_cfg(&mut cfg_collector); 167 | } 168 | 169 | if config.check_traces { 170 | println!("checking traces"); 171 | TraceIntegrityChecker::check_traces(&trace_analyzer); 172 | exit(0); 173 | } 174 | 175 | if config.dump_scores { 176 | println!("calculating scores"); 177 | trace_analyzer.fill_address_scores(); 178 | } 179 | 180 | trace_analyzer 181 | } 182 | 183 | fn fill_cfg(&mut self, cfg_collector: &mut CFGCollector) { 184 | for instruction in self 185 | .crashes 186 | .iter_all_instructions() 187 | .chain(self.non_crashes.iter_all_instructions()) 188 | { 189 | for succ in &instruction.successors { 190 | cfg_collector.add_edge(instruction.address, succ.address); 191 | } 192 | } 193 | 194 | self.cfg = cfg_collector.construct_graph(); 195 | } 196 | 197 | fn fill_address_scores(&mut self) { 198 | let addresses = self.crash_non_crash_intersection(); 199 | self.address_scores = addresses 200 | .into_par_iter() 201 | .map(|address| { 202 | ( 203 | address, 204 | PredicateAnalyzer::evaluate_best_predicate_at_address(address, self), 205 | ) 206 | }) 207 | .collect(); 208 | } 209 | 210 | pub fn address_union(&self) -> HashSet 
{ 211 | let crash_union = TraceAnalyzer::trace_union(&self.crashes); 212 | let non_crash_union = TraceAnalyzer::trace_union(&self.non_crashes); 213 | crash_union.union(&non_crash_union).map(|x| *x).collect() 214 | } 215 | 216 | pub fn crash_address_union(&self) -> HashSet { 217 | TraceAnalyzer::trace_union(&self.crashes) 218 | } 219 | 220 | fn trace_union(traces: &TraceVec) -> HashSet { 221 | let mut res = HashSet::new(); 222 | for trace in traces.iter() { 223 | res = res.union(&trace.visited_addresses()).map(|x| *x).collect(); 224 | } 225 | 226 | res 227 | } 228 | 229 | pub fn iter_all_instructions<'a>( 230 | crashes: &'a TraceVec, 231 | non_crashes: &'a TraceVec, 232 | ) -> impl Iterator { 233 | crashes 234 | .iter_all_instructions() 235 | .chain(non_crashes.iter_all_instructions()) 236 | } 237 | 238 | pub fn iter_all_traces(&self) -> impl Iterator { 239 | self.crashes.iter().chain(self.non_crashes.iter()) 240 | } 241 | 242 | pub fn iter_all_instructions_at_address( 243 | &self, 244 | address: usize, 245 | ) -> impl Iterator { 246 | self.crashes 247 | .iter_instructions_at_address(address) 248 | .chain(self.non_crashes.iter_instructions_at_address(address)) 249 | } 250 | 251 | pub fn crash_non_crash_intersection(&self) -> HashSet { 252 | let crash_union = TraceAnalyzer::trace_union(&self.crashes); 253 | let non_crash_union = TraceAnalyzer::trace_union(&self.non_crashes); 254 | crash_union 255 | .intersection(&non_crash_union) 256 | .map(|x| *x) 257 | .collect() 258 | } 259 | 260 | pub fn values_at_address( 261 | &self, 262 | address: usize, 263 | selector: &Selector, 264 | reg_index: Option, 265 | ) -> Vec { 266 | let ret: Vec<_> = match selector { 267 | Selector::RegMin => self 268 | .iter_all_instructions_at_address(address) 269 | .filter(|i| i.registers_min.get(reg_index.unwrap()).is_some()) 270 | .map(|i| i.registers_min.get(reg_index.unwrap()).unwrap().value()) 271 | .collect(), 272 | Selector::RegMax => self 273 | .iter_all_instructions_at_address(address) 
274 | .filter(|i| i.registers_max.get(reg_index.unwrap()).is_some()) 275 | .map(|i| i.registers_max.get(reg_index.unwrap()).unwrap().value()) 276 | .collect(), 277 | _ => unreachable!(), 278 | }; 279 | 280 | ret 281 | } 282 | 283 | pub fn unique_values_at_address( 284 | &self, 285 | address: usize, 286 | selector: &Selector, 287 | reg_index: Option, 288 | ) -> Vec { 289 | let mut ret: Vec<_> = self 290 | .values_at_address(address, selector, reg_index) 291 | .into_iter() 292 | .collect::>() 293 | .into_iter() 294 | .collect::>(); 295 | 296 | ret.sort(); 297 | 298 | ret 299 | } 300 | 301 | pub fn sort_scores(&self) -> Vec { 302 | let mut ret: Vec = self 303 | .address_scores 304 | .iter() 305 | .map(|(_, p)| (p.clone())) 306 | .collect(); 307 | 308 | ret.par_sort_by(|p1, p2| p1.score.partial_cmp(&p2.score).unwrap()); 309 | 310 | ret 311 | } 312 | 313 | pub fn dump_scores(&self, config: &Config, filter_scores: bool, print_scores: bool) { 314 | let (file_name, scores) = ( 315 | format!("{}/scores_linear.csv", config.output_directory), 316 | self.sort_scores(), 317 | ); 318 | 319 | let mut file = File::create(file_name).unwrap(); 320 | 321 | for predicate in scores.iter() { 322 | if filter_scores && predicate.score <= 0.5 { 323 | continue; 324 | } 325 | 326 | write!( 327 | &mut file, 328 | "{:#x};{} ({}) -- {}\n", 329 | predicate.address, 330 | predicate.score, 331 | predicate.to_string(), 332 | self.get_any_mnemonic(predicate.address), 333 | ) 334 | .unwrap(); 335 | } 336 | 337 | if print_scores { 338 | TraceAnalyzer::print_scores(&scores, filter_scores); 339 | } 340 | 341 | TraceAnalyzer::dump_for_serialization(config, &scores) 342 | } 343 | 344 | fn dump_for_serialization(config: &Config, scores: &Vec) { 345 | let scores: Vec<_> = scores.iter().map(|p| p.to_serialzed()).collect(); 346 | let serialized_string = serde_json::to_string(&scores).unwrap(); 347 | 348 | let file_path = format!("{}/scores_linear_serialized.json", config.output_directory); 349 | 350 | 
fs::write(&file_path, serialized_string) 351 | .expect(&format!("Could not write file {}", file_path)); 352 | } 353 | 354 | pub fn get_predicates_better_than(&self, min_score: f64) -> Vec { 355 | self.address_scores 356 | .values() 357 | .filter(|p| p.score > min_score) 358 | .map(|p| p.to_serialzed()) 359 | .collect() 360 | } 361 | 362 | fn print_scores(scores: &Vec, filter_scores: bool) { 363 | for predicate in scores.iter() { 364 | if filter_scores && predicate.score <= 0.5 { 365 | continue; 366 | } 367 | println!( 368 | "{:#x};{} ({})", 369 | predicate.address, 370 | predicate.score, 371 | predicate.to_string() 372 | ); 373 | } 374 | } 375 | 376 | pub fn any_instruction_at_address_contains_reg( 377 | &self, 378 | address: usize, 379 | reg_index: usize, 380 | ) -> bool { 381 | self.crashes 382 | .0 383 | .par_iter() 384 | .chain(self.non_crashes.0.par_iter()) 385 | .any(|t| match t.instructions.get(&address) { 386 | Some(instruction) => instruction.registers_min.get(reg_index).is_some(), 387 | _ => false, 388 | }) 389 | } 390 | 391 | pub fn get_any_mnemonic(&self, address: usize) -> String { 392 | self.iter_all_instructions_at_address(address) 393 | .nth(0) 394 | .unwrap() 395 | .mnemonic 396 | .to_string() 397 | } 398 | } 399 | -------------------------------------------------------------------------------- /root_cause_analysis/trace_analysis/src/trace_integrity.rs: -------------------------------------------------------------------------------- 1 | use crate::trace::REGISTERS; 2 | use crate::trace_analyzer::TraceAnalyzer; 3 | use std::collections::HashSet; 4 | 5 | pub struct TraceIntegrityChecker {} 6 | 7 | impl TraceIntegrityChecker { 8 | pub fn check_traces(trace_analyzer: &TraceAnalyzer) { 9 | TraceIntegrityChecker::cfg_empty(trace_analyzer); 10 | TraceIntegrityChecker::cfg_heads(trace_analyzer); 11 | TraceIntegrityChecker::cfg_leaves(trace_analyzer); 12 | TraceIntegrityChecker::cfg_head_equals_first_instruction(trace_analyzer); 13 | 
TraceIntegrityChecker::cfg_addresses_unique(trace_analyzer); 14 | TraceIntegrityChecker::instruction_mnemonic_not_empty(trace_analyzer); 15 | TraceIntegrityChecker::compare_reg_min_last_max(trace_analyzer); 16 | TraceIntegrityChecker::untracked_memory_write(trace_analyzer); 17 | } 18 | 19 | fn cfg_empty(trace_analyzer: &TraceAnalyzer) { 20 | // cfg is not empty 21 | if trace_analyzer.cfg.is_empty() { 22 | println!("[E] CFG is empty"); 23 | } 24 | } 25 | 26 | fn cfg_heads(trace_analyzer: &TraceAnalyzer) { 27 | let cfg_heads = trace_analyzer.cfg.heads(); 28 | // there is only one cfg head (joint cfg of crashes and non-crashes) 29 | if cfg_heads.len() != 1 { 30 | println!("[E] CFG has {} heads (should have 1)", cfg_heads.len()); 31 | } 32 | } 33 | 34 | fn cfg_leaves(trace_analyzer: &TraceAnalyzer) { 35 | // there is only one cfg exit 36 | // this assumption might not hold every time (crashes may have different leaves from non-crashes) 37 | if trace_analyzer.cfg.leaves().len() != 1 { 38 | println!( 39 | "[W] CFG has {} leaves (Should have 1 leaf unless Crash-CFG leaf != CFG leaf)", 40 | trace_analyzer.cfg.leaves().len() 41 | ); 42 | } 43 | } 44 | 45 | fn cfg_head_equals_first_instruction(trace_analyzer: &TraceAnalyzer) { 46 | let head = trace_analyzer.cfg.heads().pop().unwrap(); 47 | for trace in trace_analyzer.iter_all_traces() { 48 | if head != trace.first_address { 49 | println!("[E] CFG head (0x{:x}) is not equal to first instruction address (0x{:x}) reported in trace {}. 
Not re-running this check", head, trace.first_address, trace.name); 50 | return; 51 | } 52 | } 53 | } 54 | 55 | fn cfg_addresses_unique(trace_analyzer: &TraceAnalyzer) { 56 | let cfg_addresses: Vec = trace_analyzer 57 | .cfg 58 | .bbs() 59 | .flat_map(|b| b.body.iter()) 60 | .cloned() 61 | .collect(); 62 | 63 | let cfg_addresses_unique = cfg_addresses.iter().cloned().collect::>(); 64 | let address_union = trace_analyzer.address_union(); 65 | 66 | if cfg_addresses.len() != cfg_addresses_unique.len() { 67 | println!( 68 | "[E] #addresses ({}) != #unique_addresses ({}) in CFG", 69 | cfg_addresses.len(), 70 | cfg_addresses_unique.len() 71 | ); 72 | } 73 | 74 | if cfg_addresses.len() != address_union.len() { 75 | println!( 76 | "[E] #addresses ({}) in CFG != #crash_address_union ({})", 77 | cfg_addresses.len(), 78 | address_union.len() 79 | ); 80 | } 81 | } 82 | 83 | fn instruction_mnemonic_not_empty(trace_analyzer: &TraceAnalyzer) { 84 | // instruction mnemonic != "" 85 | for trace in trace_analyzer.iter_all_traces() { 86 | trace.instructions.values().for_each(|i| { 87 | if i.mnemonic == "".to_string() { 88 | println!("[E] Instruction {:x} has empty mnemonic in trace {}. Not re-running this check", 89 | i.address, 90 | trace.name); 91 | return; 92 | } 93 | }); 94 | } 95 | } 96 | 97 | fn untracked_memory_write(trace_analyzer: &TraceAnalyzer) { 98 | for trace in trace_analyzer.iter_all_traces() { 99 | for instruction in trace.instructions.values() { 100 | if instruction.mnemonic.contains("], ") 101 | && instruction.mnemonic.contains("mov") 102 | && !instruction.mnemonic.contains("rep") 103 | { 104 | if !instruction.registers_min.get(23).is_some() { 105 | println!("[E] Memory write found in mnemonic but no memory address field tracked for instruction {:x} with mnemonic {} in trace {}. 
Not re-running this check", 106 | instruction.address, 107 | instruction.mnemonic, 108 | trace.name); 109 | } 110 | if !instruction.registers_min.get(24).is_some() { 111 | println!("[E] Memory write found in mnemonic but no memory value field tracked for instruction {:x} with mnemonic {} in trace {}. Not re-running this check", 112 | instruction.address, 113 | instruction.mnemonic, 114 | trace.name); 115 | } 116 | } 117 | } 118 | } 119 | } 120 | 121 | fn compare_reg_min_last_max(trace_analyzer: &TraceAnalyzer) { 122 | // reg_min <= reg_last <= reg_max 123 | for trace in trace_analyzer.iter_all_traces() { 124 | for instruction in trace.instructions.values() { 125 | (0..REGISTERS.len()) 126 | .into_iter() 127 | .filter(|i| instruction.registers_min.get(*i).is_some()) 128 | .for_each(|i| { 129 | let reg_min = instruction.registers_min.get(i).unwrap(); 130 | let reg_max = instruction.registers_max.get(i).unwrap(); 131 | // let reg_last = instruction.registers_last.get(i).unwrap(); 132 | 133 | if !(reg_min.value() <= reg_max.value()) { 134 | println!("[E] min reg {} is not <= max reg for instruction {:x} in trace {}. Not re-running this check", 135 | REGISTERS[i], 136 | instruction.address, 137 | trace.name); 138 | return; 139 | } 140 | 141 | }); 142 | } 143 | } 144 | } 145 | } 146 | -------------------------------------------------------------------------------- /tracing/README.md: -------------------------------------------------------------------------------- 1 | # Tracing 2 | 3 | We use a pintool to trace all crashing and non-crashing inputs. 4 | 5 | 6 | ## Setup 7 | 1. Install Intel Pin 3.15. (The pintool was originally written for Pin 3.7, which is no longer available for download, so we have updated it to work with Pin 3.15.) 8 | 9 | 2. Set PIN_ROOT to point to your Pin installation, e.g., `export PIN_ROOT=/home/user/builds/pin-3.7-97619-g0d0c92f4f-gcc-linux/` (adjust the path to the version you installed). 10 | 11 | 3. 
Run `make aurora_tracer.test` or `make obj-intel64/aurora_tracer.so` to build the pintool. 12 | 13 | ## Usage 14 | 15 | The `scripts` folder contains an example script, `run_tracer.sh`, that shows how to run the tracer. Tracing produces an output file containing the trace as JSON and a logfile. Note that Pin struggles with long paths for both the output file and the logfile. 16 | 17 | The second script, `pprint.py`, pretty-prints a trace file. 18 | 19 | `tracing.py` requires Python 3.6 or newer and expects PIN_ROOT to be set. It traces multiple inputs and zips the resulting traces to save space (the root cause analysis tooling handles zipped traces automatically). It takes three arguments: the path to the (non-AFL-instrumented) trace binary, an input folder containing `crashes` and `non_crashes`, and an output folder in which to store the traces. It also creates a `tracing.log` logfile. 20 | 21 | The fourth script, `addr_ranges.py`, extracts heap and stack address ranges from logfiles generated by `tracing.py`. 22 | 23 | -------------------------------------------------------------------------------- /tracing/aurora_tracer.cpp: -------------------------------------------------------------------------------- 1 | /*BEGIN_LEGAL 2 | Intel Open Source License 3 | 4 | Copyright (c) 2002-2018 Intel Corporation. All rights reserved. 5 | 6 | Redistribution and use in source and binary forms, with or without 7 | modification, are permitted provided that the following conditions are 8 | met: 9 | 10 | Redistributions of source code must retain the above copyright notice, 11 | this list of conditions and the following disclaimer. Redistributions 12 | in binary form must reproduce the above copyright notice, this list of 13 | conditions and the following disclaimer in the documentation and/or 14 | other materials provided with the distribution. 
Neither the name of 15 | the Intel Corporation nor the names of its contributors may be used to 16 | endorse or promote products derived from this software without 17 | specific prior written permission. 18 | 19 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS 20 | ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT 21 | LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR 22 | A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE INTEL OR 23 | ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, 24 | SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT 25 | LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, 26 | DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY 27 | THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT 28 | (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 29 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
30 | END_LEGAL */ 31 | #include 32 | #include 33 | #include 34 | #include 35 | #include 36 | #include "pin.H" 37 | 38 | #define NUM_REGS 23 39 | 40 | enum EdgeType {Direct, Indirect, Conditional, Syscall, Return, Regular, Unknown}; 41 | static const std::string EDGE_TYPE_STR[7] = { 42 | "Direct", "Indirect", "Conditional", "Syscall", "Return", "Regular", "Unknown" 43 | }; 44 | 45 | struct MemoryField { 46 | UINT64 address; 47 | UINT32 size; 48 | UINT64 value; 49 | 50 | std::string to_string() const { 51 | std::ostringstream ss; 52 | ss << "{\"address\":" << address << ","; 53 | ss << "\"size\":" << 8*size << ","; 54 | ss << "\"value\":" << value << "}"; 55 | return ss.str(); 56 | } 57 | }; 58 | 59 | struct MemoryData { 60 | MemoryField last_addr = {0, 0, 0}; 61 | MemoryField min_addr = {UINT64_MAX, 0, 0}; 62 | MemoryField max_addr = {0, 0, 0}; 63 | MemoryField last_value = {0, 0, 0}; 64 | MemoryField min_value = {0, 0, UINT64_MAX}; 65 | MemoryField max_value = {0, 0, 0}; 66 | 67 | std::string to_string() const { 68 | std::ostringstream ss; 69 | ss << "{\"last_address\":" << last_addr.address << ","; 70 | ss << "\"min_address\":" << min_addr.address << ","; 71 | ss << "\"max_address\":" << max_addr.address << ","; 72 | ss << "\"last_value\":" << last_value.value << ","; 73 | ss << "\"min_value\":" << min_value.value << ","; 74 | ss << "\"max_value\":" << max_value.value << "}"; 75 | return ss.str(); 76 | } 77 | }; 78 | 79 | struct Value { 80 | bool is_set; 81 | UINT64 value; 82 | }; 83 | 84 | struct InstructionData { 85 | UINT64 count; 86 | std::string disas; 87 | Value min_val[23]; 88 | Value max_val[23]; 89 | Value last_val[23]; 90 | MemoryData mem; 91 | ADDRINT next_ins_addr; // note, in JSON this is called last_successor 92 | }; 93 | 94 | static const REG REGISTERS[NUM_REGS] = { 95 | REG_RAX, REG_RBX, REG_RCX, REG_RDX, REG_RSI, REG_RDI, REG_RBP, REG_RSP, 96 | REG_R8, REG_R9, REG_R10, REG_R11, REG_R12, REG_R13, REG_R14, REG_R15, 97 | REG_SEG_CS, REG_SEG_SS, 
REG_SEG_DS, REG_SEG_ES, REG_SEG_FS, REG_SEG_GS, REG_GFLAGS 98 | }; 99 | static const std::string REG_NAMES[NUM_REGS] = { 100 | "rax", "rbx", "rcx", "rdx", "rsi", "rdi", "rbp", "rsp", 101 | "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15", 102 | "seg_cs", "seg_ss", "seg_ds", "seg_es", "seg_fs", "seg_gs", "eflags" 103 | }; 104 | 105 | static FILE * g_trace_file; 106 | static std::map<ADDRINT, InstructionData> g_instruction_map; 107 | static std::map<std::pair<ADDRINT, ADDRINT>, std::pair<EdgeType, UINT64>> g_edge_map; 108 | static EdgeType g_prev_ins_edge_type = Unknown; 109 | static ADDRINT g_prev_ins_addr = 0; 110 | static ADDRINT g_load_offset; 111 | static ADDRINT g_low_address; 112 | static ADDRINT g_first_ins_addr; 113 | static UINT64 g_reg_state[23] = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}; 114 | static PIN_LOCK g_lock; 115 | 116 | 117 | /** 118 | * Add instruction to global instruction map 119 | */ 120 | VOID add_instruction(ADDRINT ins_addr, const std::string& ins_disas) { 121 | g_instruction_map[ins_addr] = { 122 | 0, 123 | std::string(ins_disas), // disas is helpful in case something goes wrong 124 | { 125 | {0, UINT64_MAX}, {0, UINT64_MAX}, {0, UINT64_MAX}, {0, UINT64_MAX}, {0, UINT64_MAX}, 126 | {0, UINT64_MAX}, {0, UINT64_MAX}, {0, UINT64_MAX}, {0, UINT64_MAX}, {0, UINT64_MAX}, 127 | {0, UINT64_MAX}, {0, UINT64_MAX}, {0, UINT64_MAX}, {0, UINT64_MAX}, {0, UINT64_MAX}, 128 | {0, UINT64_MAX}, {0, UINT64_MAX}, {0, UINT64_MAX}, {0, UINT64_MAX}, {0, UINT64_MAX}, 129 | {0, UINT64_MAX}, {0, UINT64_MAX}, {0, UINT64_MAX} 130 | }, 131 | {{0}}, 132 | {{0}}, 133 | {{0, 0, 0}, {UINT64_MAX, 0, 0}, {0, 0, 0}, {0, 0, 0}, {0, 0, UINT64_MAX}, {0, 0, 0}}, 134 | 0, 135 | }; 136 | } 137 | 138 | 139 | /** 140 | * Add new edge to global edge map (if necessary) and increase visited count 141 | */ 142 | VOID ins_save_edge(ADDRINT predecessor, ADDRINT successor, EdgeType type) { 143 | std::pair<ADDRINT, ADDRINT> current_edge(predecessor, successor); 144 | if (g_edge_map.find(current_edge) == g_edge_map.end()) { 145 | 
g_edge_map[current_edge] = {/* type = */ type, /* visited count = */ 0}; 146 | } 147 | else if (g_edge_map[current_edge].first != type) { 148 | LOG("[E] Edge(" + StringFromAddrint(predecessor) + ", " + StringFromAddrint(successor) 149 | + ") type differs\n"); 150 | assert(g_edge_map[current_edge].first == type); 151 | } 152 | g_edge_map[current_edge].second += 1; 153 | // annotate previous instruction with the last successor node 154 | g_instruction_map[predecessor].next_ins_addr = successor; 155 | } 156 | 157 | 158 | /** 159 | * Update globally tracked register state and append written or modified values to address 160 | */ 161 | VOID update_reg_state(ADDRINT ins_addr, const CONTEXT * ctxt, const std::set<REG> * reg_ops) { 162 | PIN_REGISTER temp; 163 | InstructionData * tuple = &g_instruction_map[ins_addr]; 164 | for (UINT32 i = 0; i < NUM_REGS; i++) { 165 | PIN_GetContextRegval(ctxt, REGISTERS[i], reinterpret_cast<UINT8*>(&temp)); 166 | // check if reg value changed OR reg is register operand that is written to 167 | if (*(temp.qword) != g_reg_state[i] || std::find(reg_ops->begin(), reg_ops->end(), REGISTERS[i]) != reg_ops->end()) { 168 | g_reg_state[i] = *(temp.qword); 169 | if (tuple->min_val[i].value >= *(temp.qword)) tuple->min_val[i].value = *(temp.qword); 170 | if (tuple->max_val[i].value <= *(temp.qword)) tuple->max_val[i].value = *(temp.qword); 171 | // store "last-seen" value 172 | tuple->last_val[i].value = *(temp.qword); 173 | tuple->min_val[i].is_set = true; 174 | tuple->max_val[i].is_set = true; 175 | tuple->last_val[i].is_set = true; 176 | } 177 | } 178 | } 179 | 180 | 181 | /** 182 | * Update register state, save edge, and update global information based on current instruction 183 | */ 184 | VOID ins_save_state(ADDRINT ins_addr, const std::string& ins_disas, const CONTEXT * ctxt, const std::set<REG> * reg_ops, EdgeType type) { 185 | PIN_GetLock(&g_lock, ins_addr); 186 | // if first occurrence, add instruction to map 187 | if (g_instruction_map.find(ins_addr) == 
g_instruction_map.end()) add_instruction(ins_addr, ins_disas); 188 | // increase visited count 189 | g_instruction_map[ins_addr].count += 1; 190 | // update which registers were changed during execution of the instruction 191 | update_reg_state(ins_addr, ctxt, reg_ops); 192 | // if a predecessor exists, save the edge 193 | if (g_prev_ins_addr) ins_save_edge(g_prev_ins_addr, ins_addr, g_prev_ins_edge_type); 194 | // data for next instruction to act upon 195 | g_prev_ins_addr = ins_addr; 196 | g_prev_ins_edge_type = type; 197 | PIN_ReleaseLock(&g_lock); 198 | } 199 | 200 | /** 201 | * read value from memory address (respects size) 202 | */ 203 | UINT64 read_from_addr(ADDRINT mem_addr, ADDRINT size, ADDRINT ins_addr) { 204 | switch(size) { 205 | case 1: 206 | { 207 | uint8_t * value = reinterpret_cast<uint8_t *>(mem_addr); 208 | return static_cast<UINT64>(*value); 209 | } 210 | case 2: 211 | { 212 | uint16_t * value = reinterpret_cast<uint16_t *>(mem_addr); 213 | return static_cast<UINT64>(*value); 214 | } 215 | case 4: 216 | { 217 | uint32_t * value = reinterpret_cast<uint32_t *>(mem_addr); 218 | return static_cast<UINT64>(*value); 219 | } 220 | case 8: 221 | { 222 | uint64_t * value = reinterpret_cast<uint64_t *>(mem_addr); 223 | return static_cast<UINT64>(*value); 224 | } 225 | default: 226 | LOG ("[E] Unhandled memory access size " + decstr(size) + " (" + decstr(size*8) 227 | + " bits). 
Value set to 0 for " + StringFromAddrint(ins_addr) + "\n"); 228 | } 229 | return 0; 230 | } 231 | 232 | 233 | VOID ins_save_memory_access(ADDRINT ins_addr, ADDRINT mem_addr, UINT32 size) { 234 | // Disregard everything with more than 8 bytes (we are not interested in floating point stuff) 235 | if (size > 8) { 236 | return; 237 | } 238 | PIN_GetLock(&g_lock, ins_addr); 239 | MemoryData* mem_data = &g_instruction_map[ins_addr].mem; 240 | if (mem_data->last_addr.size && mem_data->last_addr.size != size) { 241 | LOG("[E] Memory operand has different memory access sizes at " + StringFromAddrint(ins_addr) + "\n"); 242 | assert(mem_data->last_addr.size == size && "Memory operand has different memory access sizes"); 243 | } 244 | MemoryField access = {mem_addr, size, 0}; 245 | access.value = read_from_addr(mem_addr, size, ins_addr); 246 | if (mem_data->max_addr.address <= access.address) mem_data->max_addr = access; 247 | if (mem_data->min_addr.address >= access.address) mem_data->min_addr = access; 248 | mem_data->last_addr = access; 249 | if (mem_data->max_value.value <= access.value) mem_data->max_value = access; 250 | if (mem_data->min_value.value >= access.value) mem_data->min_value = access; 251 | mem_data->last_value = access; 252 | PIN_ReleaseLock(&g_lock); 253 | } 254 | 255 | 256 | EdgeType get_edge_type(INS ins) { 257 | if (INS_IsRet(ins)) return Return; 258 | if (INS_IsCall(ins) || INS_IsBranch(ins)) { 259 | if (INS_Category(ins) == XED_CATEGORY_COND_BR) return Conditional; 260 | if (INS_IsIndirectControlFlow(ins)) return Indirect; 261 | if (INS_IsDirectControlFlow(ins)) return Direct; 262 | return Unknown; 263 | } 264 | if (INS_IsSyscall(ins)) return Syscall; 265 | return Regular; 266 | } 267 | 268 | 269 | std::set<REG>* get_written_reg_operands(INS ins) { 270 | std::set<REG>* reg_ops = new std::set<REG>(); 271 | for (const REG& reg : REGISTERS) { 272 | if (INS_FullRegWContain(ins, reg)) reg_ops->insert(reg); 273 | } 274 | return reg_ops; 275 | } 276 | 277 | 278 | // Pin 
calls this function every time a new instruction is encountered 279 | VOID Instruction(INS ins, VOID *v) { 280 | // Skip instructions outside main exec 281 | PIN_LockClient(); 282 | const IMG image = IMG_FindByAddress(INS_Address(ins)); 283 | PIN_UnlockClient(); 284 | if (IMG_Valid(image) && IMG_IsMainExecutable(image)) { 285 | if (INS_IsHalt(ins)) { 286 | LOG("[W] Skipping instruction: " + StringFromAddrint(INS_Address(ins)) + " : " 287 | + INS_Disassemble(ins) + "\n"); 288 | return; 289 | } 290 | std::set<REG>* reg_ops = get_written_reg_operands(ins); 291 | // Check whether the instruction is a branch | call | ret | ... 292 | EdgeType type = get_edge_type(ins); 293 | // For regular edges, put insertion point after execution else (calls/ret/(cond) branches) before 294 | IPOINT ipoint = (type == Regular ? IPOINT_AFTER : IPOINT_BEFORE); 295 | INS_InsertCall(ins, 296 | ipoint, (AFUNPTR)ins_save_state, 297 | IARG_ADDRINT, INS_Address(ins), 298 | IARG_PTR, new std::string(INS_Disassemble(ins)), 299 | IARG_CONST_CONTEXT, 300 | IARG_PTR, reg_ops, 301 | IARG_PTR, type, 302 | IARG_END 303 | ); 304 | 305 | // Check whether we explicitly dereference memory 306 | if (!(INS_HasExplicitMemoryReference(ins) || INS_Stutters(ins)) || type != Regular) { 307 | return; 308 | } 309 | // Ignore non-typical operations such as vscatter/vgather 310 | if (!INS_IsStandardMemop(ins)) { 311 | LOG("[W] Non-standard memory operand encountered: " + StringFromAddrint(INS_Address(ins)) 312 | + " : " + INS_Disassemble(ins) + "\n"); 313 | return; 314 | } 315 | // Iterate over all memory operands of the instruction 316 | UINT32 mem_operands = INS_MemoryOperandCount(ins); 317 | for (UINT32 mem_op = 0; mem_op < mem_operands; mem_op++) { 318 | // Ensure that we can determine the size 319 | if (!INS_hasKnownMemorySize(ins)) { 320 | LOG("[W] Memory operand with unknown size encountered: " + StringFromAddrint(INS_Address(ins)) 321 | + " : " + INS_Disassemble(ins) + "\n"); 322 | continue; 323 | } 324 | // 
Instrument only when we *write* to memory 325 | if (INS_MemoryOperandIsWritten(ins, mem_op)) { 326 | // Instrument only when the instruction is executed (conditional mov) 327 | INS_InsertPredicatedCall( 328 | ins, IPOINT_AFTER, (AFUNPTR)ins_save_memory_access, 329 | IARG_INST_PTR, 330 | IARG_MEMORYOP_EA, mem_op, 331 | IARG_MEMORYWRITE_SIZE, 332 | IARG_END 333 | ); 334 | } 335 | } 336 | } 337 | } 338 | 339 | 340 | VOID parse_maps() { 341 | FILE *fp; 342 | char line[2048]; 343 | fp = fopen("/proc/self/maps", "r"); 344 | if (fp == NULL) { 345 | LOG("[E] Failed to open /proc/self/maps"); 346 | return; 347 | } 348 | while (fgets(line, 2048, fp) != NULL) { 349 | std::string s = std::string(line); 350 | if (strstr(line, "stack") != NULL) { 351 | std::string start = s.substr(0, s.find("-")); 352 | std::string end = s.substr(start.length() + 1, s.find(" ") - (start.length() + 1)); 353 | LOG("[*] Stack: 0x" + start + " - 0x" + end + "\n"); 354 | } 355 | if (strstr(line, "heap") != NULL) { 356 | std::string start = s.substr(0, s.find("-")); 357 | std::string end = s.substr(start.length() + 1, s.find(" ") - (start.length() + 1)); 358 | LOG("[*] Heap: 0x" + start + " - 0x" + end + "\n"); 359 | } 360 | } 361 | fclose(fp); 362 | } 363 | 364 | 365 | /** 366 | * Extract metadata from main executable. 
Includes image base, load offset, 367 | * first executed instruction address, and stack + heap ranges 368 | */ 369 | VOID parse_image(IMG img, VOID *v) { 370 | LOG("[+] Called parse_image on " + IMG_Name(img) + "\n"); 371 | if (IMG_IsMainExecutable(img)) { 372 | g_load_offset = IMG_LoadOffset(img); 373 | g_low_address = IMG_LowAddress(img); 374 | LOG("[*] Image base: " + StringFromAddrint(g_low_address) + "\n"); 375 | LOG("[*] Load offset: " + StringFromAddrint(g_load_offset) + "\n"); 376 | ADDRINT img_entry_addr = IMG_EntryAddress(img); 377 | LOG("[*] Image entry address: " + StringFromAddrint(img_entry_addr) + "\n"); 378 | g_first_ins_addr = g_load_offset + img_entry_addr; 379 | LOG("[*] First instruction address: " + StringFromAddrint(g_first_ins_addr) + "\n"); 380 | } 381 | } 382 | 383 | 384 | /** 385 | * Convert an array of REGISTER : data to JSON string 386 | */ 387 | std::string jsonify_reg_array(const Value* values) { 388 | std::ostringstream ss; 389 | for (int i = 0; i < 23; i++) { 390 | if (values[i].is_set) { 391 | ss << "\"" << i << "\":{\"name\":\"" << REG_NAMES[i] << "\",\"value\":" << values[i].value << "},"; 392 | } 393 | } 394 | std::string str = ss.str(); 395 | // remove last comma 396 | if (str.length() > 0) 397 | str.pop_back(); 398 | return str; 399 | } 400 | 401 | 402 | /** 403 | * Return a JSON representation as string of 'relevant' (sic) data 404 | */ 405 | std::string jsonify() { 406 | LOG("[+] Called jsonify\n"); 407 | std::ostringstream ss; 408 | ss << "{\"image_base\":" << g_low_address; 409 | ss << ",\"first_address\":" << g_first_ins_addr; 410 | ss << ",\"last_address\":" << g_prev_ins_addr; 411 | ss << ",\"instructions\":["; 412 | bool first = true; 413 | for (auto const& ins : g_instruction_map){ 414 | if (!first) ss << ","; 415 | first = false; 416 | if (ins.second.disas == "") { 417 | LOG("[E] Disassembly is empty for " + StringFromAddrint(ins.first) + "\n"); 418 | assert(ins.second.disas != "" && "Disassembly is empty"); 419 | } 
420 | ss << "{\"address\":" << ins.first << ",\"mnemonic\":\"" << ins.second.disas << "\",\"registers_min\":{"; 421 | ss << jsonify_reg_array(ins.second.min_val); 422 | ss << "},\"registers_max\":{"; 423 | ss << jsonify_reg_array(ins.second.max_val); 424 | ss << "},\"registers_last\":{"; 425 | ss << jsonify_reg_array(ins.second.last_val); 426 | ss << "},\"last_successor\":" << ins.second.next_ins_addr << ","; 427 | ss << "\"count\":" << ins.second.count; 428 | if (ins.second.mem.last_addr.size != 0) ss << "," << "\"memory\":" << ins.second.mem.to_string(); 429 | ss << "}"; 430 | } 431 | ss << "],\"edges\":["; 432 | first = true; 433 | for (auto const& edge : g_edge_map) { 434 | if (!first) ss << ","; 435 | first = false; 436 | // Convert edge type to str 437 | std::string edge_type_str = EDGE_TYPE_STR[static_cast<int>(edge.second.first)]; 438 | ss << "{\"from\":" << edge.first.first << ",\"to\":" << edge.first.second; 439 | ss << ",\"count\":" << edge.second.second; 440 | ss << ",\"edge_type\":\"" << edge_type_str << "\"}"; 441 | } 442 | ss << "]}"; 443 | return ss.str(); 444 | } 445 | 446 | 447 | /** 448 | * Write data as JSON to output file upon application exit 449 | */ 450 | VOID Fini(INT32 code, VOID *v) { 451 | LOG("[*] Last instruction: " + StringFromAddrint(g_prev_ins_addr) + "\n"); 452 | std::string data = jsonify(); 453 | fprintf(g_trace_file, "%s", data.c_str()); 454 | fclose(g_trace_file); 455 | parse_maps(); 456 | LOG("[=] Completed trace.\n"); 457 | } 458 | 459 | 460 | // Allow renaming output file via -o switch 461 | KNOB<std::string> KnobOutputFile(KNOB_MODE_WRITEONCE, "pintool", 462 | "o", "itrace.out", "specify output file name"); 463 | 464 | 465 | /* ===================================================================== */ 466 | /* Print Help Messages */ 467 | /* ===================================================================== */ 468 | 469 | INT32 Usage() { 470 | PIN_ERROR("This Pintool traces each instruction, dumping their addresses and additional state.\n" 
471 | + KNOB_BASE::StringKnobSummary() + "\n"); 472 | return -1; 473 | } 474 | 475 | 476 | INT32 Aslr() { 477 | PIN_ERROR("Disable ASLR before running this tool: echo 0 | sudo tee /proc/sys/kernel/randomize_va_space"); 478 | return -1; 479 | } 480 | 481 | 482 | /* ===================================================================== */ 483 | /* Main */ 484 | /* ===================================================================== */ 485 | 486 | int main(int argc, char * argv[]) { 487 | // Check if ASLR is disabled 488 | std::ifstream infile("/proc/sys/kernel/randomize_va_space"); 489 | int aslr; 490 | if (!infile) { 491 | PIN_ERROR("Unable to check whether ASLR is enabled or not. Failed to open /proc/sys/kernel/randomize_va_space"); 492 | return -1; 493 | } 494 | infile >> aslr; 495 | infile.close(); 496 | if (aslr != 0) return Aslr(); 497 | 498 | // Initialize pin 499 | if (PIN_Init(argc, argv)) return Usage(); 500 | 501 | g_trace_file = fopen(KnobOutputFile.Value().c_str(), "w"); 502 | 503 | 504 | // get image base address 505 | IMG_AddInstrumentFunction(parse_image, 0); 506 | 507 | // Register Instruction to be called to instrument instructions 508 | INS_AddInstrumentFunction(Instruction, 0); 509 | // Register Fini to be called when the application exits 510 | PIN_AddFiniFunction(Fini, 0); 511 | 512 | LOG("[*] Pintool: " + std::string(PIN_ToolFullPath()) + "\n"); 513 | LOG("[*] Target: " + std::string(PIN_VmFullPath()) + "\n"); 514 | 515 | // Start the program, never returns 516 | PIN_StartProgram(); 517 | 518 | return 0; 519 | } 520 | 521 | -------------------------------------------------------------------------------- /tracing/makefile: -------------------------------------------------------------------------------- 1 | ############################################################## 2 | # 3 | # DO NOT EDIT THIS FILE! 
4 | # 5 | ############################################################## 6 | 7 | # If the tool is built out of the kit, PIN_ROOT must be specified in the make invocation and point to the kit root. 8 | ifdef PIN_ROOT 9 | CONFIG_ROOT := $(PIN_ROOT)/source/tools/Config 10 | else 11 | CONFIG_ROOT := ../Config 12 | endif 13 | include $(CONFIG_ROOT)/makefile.config 14 | include makefile.rules 15 | include $(TOOLS_ROOT)/Config/makefile.default.rules 16 | 17 | ############################################################## 18 | # 19 | # DO NOT EDIT THIS FILE! 20 | # 21 | ############################################################## 22 | -------------------------------------------------------------------------------- /tracing/makefile.rules: -------------------------------------------------------------------------------- 1 | ############################################################## 2 | # 3 | # This file includes all the test targets as well as all the 4 | # non-default build rules and test recipes. 5 | # 6 | ############################################################## 7 | 8 | 9 | ############################################################## 10 | # 11 | # Test targets 12 | # 13 | ############################################################## 14 | 15 | ###### Place all generic definitions here ###### 16 | 17 | # This defines tests which run tools of the same name. This is simply for convenience to avoid 18 | # defining the test name twice (once in TOOL_ROOTS and again in TEST_ROOTS). 19 | # Tests defined here should not be defined in TOOL_ROOTS and TEST_ROOTS. 20 | TEST_TOOL_ROOTS := aurora_tracer 21 | 22 | # This defines the tests to be run that were not already defined in TEST_TOOL_ROOTS. 23 | TEST_ROOTS := 24 | 25 | # This defines the tools which will be run during the tests, and were not already defined in 26 | # TEST_TOOL_ROOTS. 27 | TOOL_ROOTS := 28 | 29 | # This defines the static analysis tools which will be run during the tests. 
They should not 30 | # be defined in TEST_TOOL_ROOTS. If a test with the same name exists, it should be defined in 31 | # TEST_ROOTS. 32 | # Note: Static analysis tools are in fact executables linked with the Pin Static Analysis Library. 33 | # This library provides a subset of the Pin APIs which allows the tool to perform static analysis 34 | # of an application or dll. Pin itself is not used when this tool runs. 35 | SA_TOOL_ROOTS := 36 | 37 | # This defines all the applications that will be run during the tests. 38 | APP_ROOTS := 39 | 40 | # This defines any additional object files that need to be compiled. 41 | OBJECT_ROOTS := 42 | 43 | # This defines any additional dlls (shared objects), other than the pintools, that need to be compiled. 44 | DLL_ROOTS := 45 | 46 | # This defines any static libraries (archives), that need to be built. 47 | LIB_ROOTS := 48 | 49 | ###### Define the sanity subset ###### 50 | 51 | # This defines the list of tests that should run in sanity. It should include all the tests listed in 52 | # TEST_TOOL_ROOTS and TEST_ROOTS excluding only unstable tests. 53 | SANITY_SUBSET := $(TEST_TOOL_ROOTS) $(TEST_ROOTS) 54 | 55 | 56 | ############################################################## 57 | # 58 | # Test recipes 59 | # 60 | ############################################################## 61 | 62 | # This section contains recipes for tests other than the default. 63 | # See makefile.default.rules for the default test rules. 64 | # All tests in this section should adhere to the naming convention: .test 65 | 66 | 67 | ############################################################## 68 | # 69 | # Build rules 70 | # 71 | ############################################################## 72 | 73 | # This section contains the build rules for all binaries that have special build rules. 74 | # See makefile.default.rules for the default build rules. 
75 | -------------------------------------------------------------------------------- /tracing/scripts/addr_ranges.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | """Extract stack and heap address ranges from logfiles""" 3 | 4 | from argparse import ArgumentParser 5 | from pathlib import Path 6 | from typing import Dict, List, Optional 7 | import json 8 | import sys 9 | 10 | 11 | def dump_to_file(data: Dict[str, Dict[str, str]], path: Path) -> None: 12 | """Dump dict as JSON to file""" 13 | with open(path, "w") as fd: 14 | json.dump(data, fd) 15 | 16 | 17 | def _range_to_dict(d: Dict[str, str]) -> Dict[str, Dict[str, str]]: 18 | """Turn stack/heap range into dict with 'start' and 'end' keys""" 19 | return {k : dict(zip(('start', 'end'), v.split(" - "))) for (k, v) in d.items()} 20 | 21 | 22 | def _overapproximate(ld: List[Dict[str, Dict[str, str]]]) -> Dict[str, Dict[str, str]]: 23 | """Extract lowest start and highest end address""" 24 | acc = lambda f, k, t: f([d[t][k] for d in ld if t in d]) 25 | return {k : {'start' : acc(min, 'start', k), 'end' : acc(max, 'end', k)} for k in ld[0].keys()} 26 | 27 | 28 | def _flatten(d: Dict[str, Dict[str, str]]) -> Dict[str, int]: 29 | """Flatten to one layer by concatenating keys""" 30 | return {f"{k.lower()}_{kk}": int(vv, 16) for (k,v) in d.items() for (kk, vv) in v.items()} 31 | 32 | 33 | def parse_logfile(logfile: Path) -> Dict[str, Dict[str, str]]: 34 | """Parse Stack and Heap start + end address from specified logfile""" 35 | with open(logfile, 'r') as fd: 36 | ranges: Dict[str, str] = dict(l.strip().lstrip("[*] ").split(": ") \ 37 | for l in fd.readlines() if l.strip() and ("Stack" in l or "Heap" in l)) 38 | address_dict = _range_to_dict(ranges) 39 | # Check we didn't mess up the range's order 40 | for k, d in address_dict.items(): 41 | assert int(d['start'], 16) < int(d['end'], 16), f"[{k}] Start address {d['start']} > end address {d['end']}" 42 | return 
address_dict 43 | 44 | 45 | def extract_stack_and_heap_address(trace_dir: Path, eval_dir: Optional[Path], exact: bool = False) -> None: 46 | """Extract stack and heap address ranges from some logfile""" 47 | logfiles = list(trace_dir.glob("./*")) 48 | address_dicts = list(map(parse_logfile, logfiles)) 49 | if exact: 50 | unique_address_dicts = set(map(str, map(parse_logfile, logfiles))) 51 | assert len(unique_address_dicts) == 1, f"Found {len(unique_address_dicts)} unique address ranges over all logfiles (should be 1)" 52 | address_dict = _flatten(_overapproximate(list(address_dicts))) 53 | print(json.dumps(address_dict)) 54 | if eval_dir is not None: 55 | print(f"Dumping to {eval_dir / 'addresses.json'}") 56 | dump_to_file(address_dict, eval_dir / 'addresses.json') 57 | 58 | 59 | if __name__ == '__main__': 60 | parser = ArgumentParser(description='Extract stack and heap address ranges from logfiles') # pylint: disable=invalid-name 61 | parser.add_argument('trace_dir', nargs=1, help="path to traces directory") 62 | parser.add_argument('--eval_dir', nargs=1, action='store', default=[], help="path to evaluation directory") 63 | parser.add_argument('--exact', action="store_true", default=False, help="Guarantee that address range is the same for all logfiles") 64 | 65 | cargs = parser.parse_args() # pylint: disable=invalid-name 66 | 67 | trace_dir = (Path(cargs.trace_dir[0]) / "logs").resolve() # pylint: disable=invalid-name 68 | if not trace_dir.exists(): 69 | print(f"Trace dir {trace_dir} does not exist. 
Aborting..") 70 | sys.exit(1) 71 | eval_dir = None 72 | if cargs.eval_dir: 73 | eval_dir = Path(cargs.eval_dir[0]).resolve() 74 | extract_stack_and_heap_address(trace_dir, eval_dir, cargs.exact) 75 | 76 | -------------------------------------------------------------------------------- /tracing/scripts/pprint.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/python3 2 | 3 | import json 4 | import argparse 5 | 6 | parser = argparse.ArgumentParser(description='Pretty-print a trace.') 7 | parser.add_argument('-s', "--save", help="save output additionally to file", action='store', metavar='FILENAME') 8 | parser.add_argument('-q', "--quiet", help="don't print to stdout", action='store_true') 9 | parser.add_argument('path', help="path to trace file") 10 | args = parser.parse_args() 11 | 12 | with open(args.path, 'r') as f: 13 | content = json.loads(f.read()) 14 | if not args.quiet: 15 | print(json.dumps(content, indent=4, sort_keys=True)) 16 | if args.save: 17 | with open(args.save, 'w') as f: 18 | json.dump(content, f, indent=4, sort_keys=True) 19 | 20 | -------------------------------------------------------------------------------- /tracing/scripts/run_tracer.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | # Simple example script on how to run the tracer. 4 | 5 | set -eu 6 | 7 | PIN_EXE="$PIN_ROOT/pin" 8 | PIN_TOOL="../obj-intel64/aurora_tracer.so" 9 | 10 | # Tracer will create a trace and logfile 11 | WORKDIR="." 
12 | OUTPUT="$WORKDIR/test.trace" 13 | LOGFILE="$WORKDIR/test.log" 14 | 15 | # Note, PIN cannot process long paths, thus we use a TMP_DIR with short paths 16 | TMP_DIR="/tmp/pin" 17 | TMP_OUTPUT="$TMP_DIR/test.trace" 18 | TMP_LOGFILE="$TMP_DIR/test.log" 19 | 20 | TARGET_DIR="$1" 21 | TARGET_BIN="mruby_trace" 22 | TARGET_ARGS="" 23 | SEED="$TARGET_DIR/seed/*" 24 | 25 | TARGET="${TARGET_DIR}/${TARGET_BIN} ${TARGET_ARGS} ${SEED}" 26 | 27 | mkdir -p $TMP_DIR 28 | echo "${PIN_EXE} -t ${PIN_TOOL} -o ${TMP_OUTPUT} -logfile ${TMP_LOGFILE} -- ${TARGET}" 29 | time ${PIN_EXE} -t ${PIN_TOOL} -o ${TMP_OUTPUT} -logfile ${TMP_LOGFILE} -- ${TARGET} || true 30 | 31 | mv $TMP_OUTPUT $WORKDIR 32 | mv $TMP_LOGFILE $WORKDIR 33 | 34 | rm -rf "$TMP_DIR" 35 | 36 | # Pretty-print output 37 | ./pprint.py -q -s pretty_test.trace test.trace 38 | -------------------------------------------------------------------------------- /tracing/scripts/tracing.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/python3 2 | #pylint: disable = missing-module-docstring, no-member, redefined-outer-name 3 | 4 | from functools import partial 5 | from typing import List 6 | import os 7 | import sys 8 | import time 9 | import random 10 | import logging 11 | import zipfile 12 | import subprocess 13 | import multiprocessing 14 | 15 | ############################################################################### 16 | PIN_EXE = os.environ['PIN_ROOT'] + "/pin" 17 | PIN_TOOL = os.environ['PIN_ROOT'] + "/source/tools/AuroraTracer/obj-intel64/aurora_tracer.so" 18 | 19 | ASAN_OPTIONS = "ASAN_OPTIONS=detect_leaks=0" 20 | 21 | TMP_PATH = "/tmp/tm/" 22 | PIN_TIMEOUT = 5 * 60 23 | PARALLEL_PROCESSES = os.cpu_count() 24 | 25 | SUBDIRS = ["crashes/", "non_crashes/"] 26 | 27 | ############################################################################### 28 | 29 | SUCCESS = True 30 | FAILURE = False 31 | logger = logging.getLogger('tracing_manager') # pylint: 
disable=invalid-name 32 | trace_logger = logging.getLogger('tracer') # pylint: disable=invalid-name 33 | rng = random.SystemRandom() # pylint: disable=invalid-name 34 | 35 | ############################################################################### 36 | 37 | # Check if input folder and output folder are passed as parameters 38 | def preliminary_checks(target_exe: str) -> bool: 39 | """Checks to avoid creation of bad traces.""" 40 | # Check if ASLR is still enabled and abort 41 | with open("/proc/sys/kernel/randomize_va_space", 'r') as f: 42 | if not "0" in f.read().strip(): 43 | logger.critical("[!] Disable ASLR: echo 0 | sudo tee /proc/sys/kernel/randomize_va_space") 44 | return FAILURE 45 | # Check if temporary directory still exists and abort 46 | if os.path.exists(TMP_PATH): 47 | logger.critical(f"[!] Temporary directory {TMP_PATH} already exists. Backup its contents and delete it to proceed.") 48 | return FAILURE 49 | # Check if we try to trace a bash file 50 | file_output = str(subprocess.check_output(['file', target_exe])) 51 | if "Bourne-Again shell script" in file_output: 52 | logger.critical("[!] 
Target binary is a bash script") 53 | return FAILURE 54 | return SUCCESS 55 | 56 | 57 | def zip_input(src_file: str, should_delete_original: bool = True) -> None: 58 | """Replaces an input by a zipped version of it.""" 59 | zip_file = src_file + ".zip" 60 | file_name = os.path.basename(src_file) 61 | logger.debug(f"Zipping into {zip_file}") 62 | with zipfile.ZipFile(zip_file, 'w', compression=zipfile.ZIP_BZIP2, allowZip64=True) as f: 63 | f.write(src_file, arcname=file_name) 64 | if should_delete_original: 65 | os.remove(src_file) 66 | logger.debug(f"Deleted after zipping original: {src_file}") 67 | 68 | 69 | def check_size(path: str) -> bool: 70 | """Check file size""" 71 | if os.stat(path).st_size == 0: 72 | logger.warning(f"File has size 0 => {path} is empty") 73 | os.remove(path) 74 | return FAILURE 75 | return SUCCESS 76 | 77 | 78 | def check_trace_log(path: str) -> bool: 79 | """Check whether logfile reports Trace completed, i.e., whether trace is complete and whether multiple traces are contained.""" 80 | log_path = os.path.join(os.path.dirname(os.path.dirname(path)), "logs", os.path.basename(path) + ".log") 81 | logger.debug(f"Checking logfile at {log_path} for completeness") 82 | with open(log_path, 'r') as logfile: 83 | data = logfile.read() 84 | count = data.count("[=] Completed trace") 85 | for line in data.split("\n"): 86 | if "[E]" in line: 87 | trace_logger.error(line.split("[E]")[1]) 88 | elif "[W]" in line: 89 | trace_logger.warning(line.split("[W]")[1]) 90 | if count < 1: 91 | logger.warning(f"Incomplete trace: {path}") 92 | os.remove(path) 93 | logger.info(f"Deleted incomplete {path}") 94 | return FAILURE 95 | if count > 1: 96 | logger.warning(f"Multiple traces ({count}x) in one file {path}") 97 | os.remove(path) 98 | logger.info(f"Deleted multiple-trace containing {path}") 99 | return FAILURE 100 | return SUCCESS 101 | 102 | 103 | def trace_input(src_path: str, target_exe: str, should_zip: bool, trace_target: str) -> None: 104 | """Trace a 
directory within a given path.""" 105 | (subdir, trace_target) = trace_target.split("-", 1) 106 | if trace_target == "README.txt": # Skip README file (checked after stripping the "<subdir>-" prefix) 107 | return 108 | logger.debug(f"subdir was reconstructed as '{subdir}', trace_target as '{trace_target}'") 109 | src_file = os.path.join(src_path, subdir, trace_target) 110 | pin_logfile = os.path.join(TMP_PATH, "logs", trace_target + "_trace.log") 111 | outfile = os.path.join(TMP_PATH, subdir, trace_target + "_trace") 112 | if "@@" in target_exe: 113 | target_exe = target_exe.replace("@@", src_file) 114 | else: 115 | target_exe += f" < {src_file}" 116 | cmd = f"{ASAN_OPTIONS} {PIN_EXE} -t {PIN_TOOL} -o {outfile} -logfile {pin_logfile} -- {target_exe}" 117 | logger.debug(f"CMD: {cmd}") 118 | try: 119 | subprocess.run(cmd, shell=True, check=True, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL, timeout=PIN_TIMEOUT) 120 | except subprocess.TimeoutExpired: 121 | logger.info(f"Timeout for {src_file}") 122 | #os.remove(src_file) 123 | #logger.info(f"Deleted timed-out file {src_file}") 124 | except subprocess.CalledProcessError as err: 125 | logger.debug(f"Process errored out for {cmd} with {err}") 126 | 127 | if check_trace_log(outfile) == SUCCESS: 128 | if os.path.isfile(outfile) and check_size(outfile) == SUCCESS and should_zip: 129 | zip_input(src_file=outfile, should_delete_original=True) 130 | 131 | 132 | def create_dir(path: str) -> None: 133 | """Create new directory""" 134 | cmd = f"mkdir -p {path}" 135 | os.system(cmd) 136 | 137 | 138 | def remove_tmp_dir() -> None: 139 | """Delete temporary directory if TMP_PATH appears to be safe""" 140 | if not(len(TMP_PATH) > 5 and TMP_PATH.startswith("/tmp/")): 141 | logger.critical(f"TMP PATH might be not what you expect; skipping deletion - path is {TMP_PATH}") 142 | return 143 | logger.info(f"Deleting temporary directory {TMP_PATH}") 144 | cmd = f"rm -rf {TMP_PATH}" 145 | os.system(cmd) 146 | 147 | 148 | def cleanup(target_exe: str, save_path: str) -> None: 149 | 
    """Cleanup after tracing: Kill remaining targets; move files to save_path and remove temporary directory"""
    target = os.path.basename(target_exe.split(" ", 1)[0])
    logger.info(f"killall {target}")
    cmd = f"killall -s SIGKILL {target}"
    os.system(cmd)

    start_time = time.time()
    logger.info(f"Moving files from {TMP_PATH} to {save_path}")
    cmd = f"mv {TMP_PATH}/* {save_path}"
    os.system(cmd)

    remove_tmp_dir()

    move_time = time.time()
    logger.info(f"Cleanup time: {move_time - start_time}s")


def trace_all(target_exe: str, src_path: str, save_path: str, subdirs: List[str], should_zip: bool = True) -> bool:
    """Manage parallel tracing of all files"""
    if preliminary_checks(target_exe) == FAILURE:
        return FAILURE
    logger.info(f"Using files at {src_path}")
    logger.info(f"Generating temporary directory at {TMP_PATH}")
    start_time = time.time()
    create_dir(TMP_PATH)
    create_dir(save_path)
    create_dir(os.path.join(TMP_PATH, "logs"))

    files = []
    for subdir in subdirs:
        create_dir(os.path.join(TMP_PATH, subdir))
        src_subdir = os.path.join(src_path, subdir)
        # encode the subdirectory in the work item name; trace_input splits it off again
        files.extend([f"{subdir}-{x}" for x in os.listdir(src_subdir)])
        create_dir(os.path.join(save_path, subdir))

    rng.shuffle(files)  # shuffle files to avoid timestamp being a 'good' predicate
    logger.info(f"Processing {len(files)} files in {len(subdirs)} subdirs at {src_path}")
    before_time = time.time()
    with multiprocessing.Pool(PARALLEL_PROCESSES) as pool:
        func = partial(trace_input, src_path, target_exe, should_zip)
        pool.map(func, files)
    trace_time = time.time() - before_time
    # guard against division by zero if no input files were found
    avg_time = (PARALLEL_PROCESSES * trace_time) / max(len(files), 1)
    num_files = 0
    for subdir in subdirs:
        num_files += len(os.listdir(os.path.join(TMP_PATH, subdir)))
    logger.info(f"Done processing {num_files} files in {trace_time}s (on average {avg_time}s per input)")

    logger.info(f"STATS: traced {num_files}/{len(files)} files in {trace_time}s with {PARALLEL_PROCESSES} cores for {src_path}")

    cleanup(target_exe, save_path)

    # write stats to file
    with open(os.path.join(save_path, "stats.txt"), 'w') as stats_file:
        stats_file.write(f"STATS: traced {num_files}/{len(files)} files in {trace_time}s with {PARALLEL_PROCESSES} cores for {src_path}\n")

    logger.info(f"Total execution time: {time.time() - start_time}s")
    return SUCCESS


if __name__ == "__main__":
    if len(sys.argv) < 4:
        logger.critical("Usage: ./tracing.py <target_exe> <src_path> <save_path>")
        print("Usage: ./tracing.py <target_exe> <src_path> <save_path>")
        sys.exit(1)

    ### Logging handlers
    # Create handlers
    c_handler = logging.StreamHandler()  # pylint: disable=invalid-name
    f_handler = logging.FileHandler('tracing.log')  # pylint: disable=invalid-name
    c_handler.setLevel(logging.INFO)
    f_handler.setLevel(logging.DEBUG)

    logger.setLevel(logging.DEBUG)
    trace_logger.setLevel(logging.DEBUG)

    # Create formatters and add it to handlers
    c_format = logging.Formatter('%(levelname)s: %(message)s')  # pylint: disable=invalid-name
    f_format = logging.Formatter('%(asctime)s %(levelname)s: %(message)s')  # pylint: disable=invalid-name
    c_handler.setFormatter(c_format)
    f_handler.setFormatter(f_format)

    # Add handlers to the logger
    logger.addHandler(c_handler)
    logger.addHandler(f_handler)
    trace_logger.addHandler(c_handler)
    trace_logger.addHandler(f_handler)

    if trace_all(target_exe=sys.argv[1], src_path=sys.argv[2], save_path=sys.argv[3], subdirs=SUBDIRS) == SUCCESS:
        logger.info("Finished tracing run")
--------------------------------------------------------------------------------
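
The command construction in `trace_input` can be looked at in isolation. The following is a minimal sketch, not part of the repository: `build_trace_command` and its parameters are hypothetical names, and quoting paths with `shlex.quote` is an assumption added here (the script above interpolates paths unquoted). It omits the `ASAN_OPTIONS` prefix and only reuses the `-t`, `-o` and `-logfile` options that appear in the original command string.

```python
import shlex


def build_trace_command(target_cmd: str, input_file: str, pin_exe: str,
                        pin_tool: str, outfile: str, logfile: str) -> str:
    """Return a shell command line that runs target_cmd under Pin (hypothetical helper)."""
    quoted_input = shlex.quote(input_file)
    if "@@" in target_cmd:
        # AFL-style placeholder: the target reads the input from a file path
        target_cmd = target_cmd.replace("@@", quoted_input)
    else:
        # no placeholder: feed the input to the target via stdin
        target_cmd += f" < {quoted_input}"
    return (f"{shlex.quote(pin_exe)} -t {shlex.quote(pin_tool)} "
            f"-o {shlex.quote(outfile)} -logfile {shlex.quote(logfile)} -- {target_cmd}")


if __name__ == "__main__":
    # all paths below are made up for illustration
    print(build_trace_command("./target @@", "/in/id 0", "pin", "tool.so", "/tmp/out", "/tmp/log"))
```

Quoting matters because the resulting string is passed to a shell (`shell=True`), so an input file name containing spaces or shell metacharacters would otherwise change the command.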