├── ExplorationTechniques ├── SimgrViz │ ├── screenshot_1.PNG │ ├── README.md │ └── SimgrViz.py ├── HeartBeat │ ├── README.md │ └── heartbeat.py ├── MemLimiter │ ├── README.md │ └── MemLimiter.py ├── ExplosionDetector │ ├── README.md │ └── ExplosionDetector.py ├── StochasticSearch │ ├── README.md │ └── StocasticSearch.py ├── KLEERandomSearch │ ├── README.me │ └── KLEERandomSearch.py ├── LoopExhaustion │ ├── README.md │ └── LoopExhaustion.py └── KLEECoverageOptimizeSearch │ ├── README.md │ └── KLEECoverageOS.py └── README.md /ExplorationTechniques/SimgrViz/screenshot_1.PNG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/degrigis/awesome-angr/HEAD/ExplorationTechniques/SimgrViz/screenshot_1.PNG -------------------------------------------------------------------------------- /ExplorationTechniques/HeartBeat/README.md: -------------------------------------------------------------------------------- 1 | An exploration technique that reports that symbolic execution is still alive and provides some utilities to 2 | gently hook into the DSE while it is running. -------------------------------------------------------------------------------- /ExplorationTechniques/MemLimiter/README.md: -------------------------------------------------------------------------------- 1 | The following ExplorationTechnique can be plugged into an instance of a SimulationManager to stop DSE when memory consumption hits critical levels. -------------------------------------------------------------------------------- /ExplorationTechniques/ExplosionDetector/README.md: -------------------------------------------------------------------------------- 1 | When the ExplosionDetector is plugged into a SimulationManager, it can be used to (1) trigger a timeout, (2) stop the execution when a certain number of generated SimState(s) is reached, and (3) nuke all the unconstrained SimState(s) in a SimulationManager.
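The state-count check in (2) can be sketched without angr. This is only an illustration under stated assumptions: stashes are modeled as a plain dict of lists, and `count_states` and `exploded` are hypothetical helper names, not part of ExplosionDetector or angr.

```python
# Assumption: a "stash" is just a named list of states; real angr stashes
# live inside a SimulationManager, which is not needed for this sketch.
WATCHED = ("active", "deferred", "errored", "cut")


def count_states(stashes, watched=WATCHED):
    """Sum the sizes of the watched stashes that actually exist."""
    return sum(len(stashes[name]) for name in watched if name in stashes)


def exploded(stashes, threshold=100):
    """Return True once the watched stashes hold at least `threshold` states."""
    return count_states(stashes) >= threshold


stashes = {
    "active": list(range(60)),
    "deferred": list(range(45)),
    "found": [0],  # not a watched stash, so it is not counted
}
print(count_states(stashes))  # 105
print(exploded(stashes))      # True
```

In the technique itself the same comparison runs after every step, and the watched stashes are emptied once the threshold is reached.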
2 | -------------------------------------------------------------------------------- /ExplorationTechniques/StochasticSearch/README.md: -------------------------------------------------------------------------------- 1 | Will only keep one path active at a time; any others will be discarded. 2 | Before each pass through, weights are randomly assigned to each basic block. 3 | These weights form a probability distribution for determining which state remains after splits. 4 | When we run out of active paths to step, we start again from the start state. -------------------------------------------------------------------------------- /ExplorationTechniques/KLEERandomSearch/README.me: -------------------------------------------------------------------------------- 1 | Random path selection. https://hci.stanford.edu/cstr/reports/2008-03.pdf 2 | 3 | Maintains a binary tree recording the program path followed for all active processes, 4 | i.e. the leaves of the tree are the current processes and the internal nodes are places 5 | where execution forked. Processes are selected by traversing this tree from the root 6 | and randomly selecting the path to follow at branch points. Therefore when a branch point 7 | is reached the set of processes in each subtree will have equal probability of being selected, 8 | regardless of their size. 9 | 10 | This is implemented as a Non-Uniform-Random-Search where child nodes inherit the parent's weight, divided by the number 11 | of siblings. -------------------------------------------------------------------------------- /ExplorationTechniques/SimgrViz/README.md: -------------------------------------------------------------------------------- 1 | 2 | ## SimgrViz 3 | 4 | This exploration technique dumps the successors of a given state and builds the dynamic control flow graph of the program while 5 | symbolically executing it.
6 | The final result can be exported to a .dot file and visualized with [Gephi](https://gephi.org/) or any other tool that supports the DOT format. 7 | Node information can be enriched with attributes of the state for a post-mortem analysis of what happened during the symbolic execution. 8 | 9 | *HINT*: Plug this ET as the last ET of your SimulationManager. 10 | 11 | To dump the .dot file, use: 12 | 13 | ``` 14 | from networkx.drawing.nx_agraph import write_dot 15 | 16 | [...] 17 | 18 | simgr_viz = SimgrViz(cfg=cfg) 19 | simgr.use_technique(simgr_viz) 20 | simgr.explore() 21 | [...] 22 | 23 | write_dot(simgr_viz._simgrG, "my_simgr.dot") 24 | 25 | ``` 26 | 27 | Here is an example of the graph when visualized with Gephi. 28 | 29 | ![Example of visualization in Gephi](./screenshot_1.PNG) 30 | -------------------------------------------------------------------------------- /ExplorationTechniques/LoopExhaustion/README.md: -------------------------------------------------------------------------------- 1 | Loop Exhaustion. http://security.ece.cmu.edu/aeg/aeg-current.pdf 2 | We propose and use a loop exhaustion search strategy. The loop-exhaustion 3 | strategy gives higher priority to an interpreter exploring the maximum number 4 | of loop iterations, hoping that computations involving more iterations 5 | are more promising to produce bugs like buffer overflows. 6 | Thus, whenever execution hits a symbolic loop, we try to exhaust the loop: execute 7 | it as many times as possible. Exhausting a symbolic loop has two immediate side effects: 8 | 1) on each loop iteration a new interpreter is spawned, effectively causing an explosion 9 | in the state space, and 2) execution might get 'stuck' in a deep loop.
10 | To avoid getting stuck, we impose two additional heuristics during loop exhaustion: 11 | 1) we use preconditioned symbolic execution along with pruning to reduce the number of interpreters or 12 | 2) we give higher priority to only one interpreter that tries to fully exhaust the loop, 13 | while all other interpreters exploring the same loop have the lowest possible priority. -------------------------------------------------------------------------------- /ExplorationTechniques/MemLimiter/MemLimiter.py: -------------------------------------------------------------------------------- 1 | import os 2 | import psutil 3 | from angr.exploration_techniques import ExplorationTechnique 4 | 5 | class MemLimiter(ExplorationTechnique): 6 | def __init__(self, max_mem, drop_errored): 7 | super(MemLimiter, self).__init__() 8 | self.max_mem = max_mem 9 | self.drop_errored = drop_errored 10 | self.process = psutil.Process(os.getpid()) 11 | 12 | def step(self, simgr, stash='active', **kwargs): 13 | if psutil.virtual_memory().percent > 90 or (self.max_mem - 1) < self.memory_usage_psutil: 14 | simgr.move(from_stash='active', to_stash='out_of_memory') 15 | simgr.move(from_stash='deferred', to_stash='out_of_memory') 16 | 17 | simgr.drop(stash='deadended') 18 | simgr.drop(stash='avoid') 19 | simgr.drop(stash='found') 20 | if self.drop_errored: 21 | del simgr.errored[:] 22 | 23 | return simgr.step(stash=stash, **kwargs) 24 | 25 | @property 26 | def memory_usage_psutil(self): 27 | # return the virtual memory size of this process in GiB (vms / 2**30), so max_mem is in GiB too 28 | mem = self.process.memory_info().vms / float(2 ** 30) 29 | return mem 30 | -------------------------------------------------------------------------------- /ExplorationTechniques/ExplosionDetector/ExplosionDetector.py: -------------------------------------------------------------------------------- 1 | from angr.exploration_techniques import ExplorationTechnique 2 | from threading import Event  # assumption: the otherwise undefined Event is threading.Event (it provides is_set()) 3 | class ExplosionDetector(ExplorationTechnique): 4 | def __init__(self, stashes=('active', 'deferred', 'errored', 
'cut'), threshold=100): 5 | super(ExplosionDetector, self).__init__() 6 | self._stashes = stashes 7 | self._threshold = threshold 8 | self.timed_out = Event() 9 | self.timed_out_bool = False 10 | 11 | def step(self, simgr, stash='active', **kwargs): 12 | simgr = simgr.step(stash=stash, **kwargs) 13 | total = 0 14 | if len(simgr.unconstrained) > 0: 15 | l.debug("Nuking unconstrained") 16 | simgr.move(from_stash='unconstrained', to_stash='_Drop', filter_func=lambda _: True) 17 | if self.timed_out.is_set(): 18 | l.critical("Timed out, %d states: %s" % (total, str(simgr))) 19 | self.timed_out_bool = True 20 | for st in self._stashes: 21 | if hasattr(simgr, st): 22 | simgr.move(from_stash=st, to_stash='_Drop', filter_func=lambda _: True) 23 | for st in self._stashes: 24 | if hasattr(simgr, st): 25 | total += len(getattr(simgr, st)) 26 | 27 | if total >= self._threshold: 28 | l.critical("State explosion detected, over %d states: %s" % (total, str(simgr))) 29 | for st in self._stashes: 30 | if hasattr(simgr, st): 31 | simgr.move(from_stash=st, to_stash='_Drop', filter_func=lambda _: True) 32 | 33 | return simgr -------------------------------------------------------------------------------- /ExplorationTechniques/KLEECoverageOptimizeSearch/README.md: -------------------------------------------------------------------------------- 1 | Coverage Optimize Search. https://hci.stanford.edu/cstr/reports/2008-03.pdf 2 | 3 | A strategy which attempts to select states that are likely to cover new code 4 | in the immediate future. Heuristics are used to compute a weight for each process 5 | and a random process is selected according to these weights. 6 | Currently these heuristics use a combination of the minimum distance 7 | to an uncovered instruction, taking into account the call stack of the 8 | process, and whether the process has recently covered new code. 9 | These strategies are composed by selecting from each in a round robin fashion. 
10 | Although this interleaving may increase the time for a particularly effective 11 | strategy to achieve high coverage, it protects the system against cases where 12 | one individual strategy would become stuck. 13 | Furthermore, because the strategies are always selecting processes from the same pool, 14 | using interleaving allows the strategies to interact cooperatively. 15 | Finally, once selected each process is run for a "time slice" defined by 16 | both a maximum number of instructions and a maximum amount of time. 17 | The time to execute an individual instruction can vary widely between 18 | simple instructions, like addition, and instructions which may use the 19 | constraint solver or fork, like branches or memory accesses. 20 | Time-slicing processes helps ensure that a process which is frequently 21 | executing expensive instructions will not dominate execution time. 22 | 23 | This is implemented as a Non-Uniform-Random-Search with interleaved heuristics: 24 | 1. md2u: minimum distance to uncovered instruction 25 | 2. covnew: recently covered new code 26 | TODO: a time/instruction batch limit may be set (default to false, user-set) -------------------------------------------------------------------------------- /ExplorationTechniques/StochasticSearch/StocasticSearch.py: -------------------------------------------------------------------------------- 1 | import logging 2 | import random 3 | from collections import defaultdict 4 | 5 | from angr.exploration_techniques import ExplorationTechnique 6 | 7 | l = logging.getLogger('syml') 8 | 9 | 10 | class StochasticSearch(ExplorationTechnique): 11 | """ 12 | Stochastic Search. 13 | 14 | Will only keep one path active at a time, any others will be discarded. 15 | Before each pass through, weights are randomly assigned to each basic block. 16 | These weights form a probability distribution for determining which state remains after splits. 
17 | When we run out of active paths to step, we start again from the start state. 18 | """ 19 | 20 | def __init__(self, restart_prob=0.0001, **kwargs): 21 | """ 22 | :param start_state: The initial state from which exploration stems. 23 | :param restart_prob: The probability of randomly restarting the search (default 0.0001). 24 | """ 25 | super(StochasticSearch, self).__init__() 26 | self.restart_prob = restart_prob 27 | self._random = random.Random() 28 | self._random.seed(42) 29 | self.affinity = defaultdict(self._random.random) 30 | 31 | def setup(self, simgr): 32 | super(StochasticSearch, self).setup(simgr) 33 | self.start_state = simgr.one_active 34 | 35 | def step(self, simgr, stash='active', **kwargs): 36 | simgr = simgr.step(stash=stash, **kwargs) 37 | 38 | if not simgr.stashes[stash] or self._random.random() < self.restart_prob: 39 | simgr.stashes[stash] = [self.start_state] 40 | self.affinity.clear() 41 | 42 | if len(simgr.stashes[stash]) > 1: 43 | def weighted_pick(states): 44 | """ 45 | param states: Diverging states. 
46 | """ 47 | assert len(states) >= 2 48 | total_weight = sum((self.affinity[s.addr] for s in states)) 49 | selected = self._random.uniform(0, total_weight) 50 | i = 0 51 | for i, state in enumerate(states): 52 | weight = self.affinity[state.addr] 53 | if selected < weight: 54 | break 55 | else: 56 | selected -= weight 57 | picked = states[i] 58 | return picked 59 | 60 | simgr.stashes[stash] = [weighted_pick(simgr.stashes[stash])] 61 | 62 | return simgr -------------------------------------------------------------------------------- /ExplorationTechniques/HeartBeat/heartbeat.py: -------------------------------------------------------------------------------- 1 | # pylint: disable=import-error, no-name-in-module 2 | import angr 3 | import hashlib 4 | import os 5 | import logging 6 | import networkx 7 | import time 8 | import copy 9 | 10 | from typing import List, Set, Dict, Tuple, Optional 11 | from angr.exploration_techniques import ExplorationTechnique 12 | from angr import SimState 13 | from networkx.drawing.nx_agraph import write_dot 14 | 15 | 16 | l = logging.getLogger("HeartBeat") 17 | l.setLevel("INFO") 18 | 19 | global CURR_SIMGR 20 | global CURR_PROJ 21 | global CURR_STATE 22 | 23 | # This is useful if you plugged this: 24 | # https://github.com/degrigis/awesome-angr/tree/main/ExplorationTechniques/SimgrViz 25 | def dump_viz_graph(simgr=None): 26 | l.info("Dumping visualization graph if it exists") 27 | 28 | if simgr is None: 29 | simgr = CURR_SIMGR 30 | 31 | for et in simgr._techniques: 32 | if "SimgrViz" in str(et): 33 | break 34 | write_dot(et._simgrG,"/tmp/my_simgr.dot") 35 | 36 | # This is useful if you are using this: 37 | # https://github.com/fmagin/angr-cli 38 | def spw_cli(): 39 | global CURR_SIMGR 40 | global CURR_PROJ 41 | global CURR_STATE 42 | import angrcli.plugins.ContextView 43 | from angrcli.interaction.explore import ExploreInteractive 44 | e = ExploreInteractive(CURR_PROJ, CURR_STATE) 45 | e.cmdloop() 46 | 47 | class 
HeartBeat(ExplorationTechnique): 48 | 49 | def __init__(self, beat_interval=100): 50 | super(HeartBeat, self).__init__() 51 | self.stop_heart_beat_file = "/tmp/stop_heartbeat.txt" 52 | self.beat_interval = beat_interval 53 | self.beat_cnt = 0 54 | self.steps_cnt = 0 55 | 56 | def setup(self, simgr): 57 | return True 58 | 59 | def successors(self, simgr, state:SimState, **kwargs): 60 | succs = simgr.successors(state, **kwargs) 61 | self.beat_cnt += 1 62 | self.steps_cnt += 1 63 | if self.beat_cnt == self.beat_interval: 64 | l.info("Exploration is alive <3. Step {}".format(self.steps_cnt)) 65 | l.info(" Succs are: {}".format(succs)) 66 | l.info(" Simgr is: {}".format(simgr)) 67 | self.beat_cnt = 0 68 | if os.path.isfile(self.stop_heart_beat_file): 69 | l.info("HeartBeat stopped, need help? 1: 46 | l.debug(f'{"-" * 0x10}\nStatus:\t\t{simgr} --> active: {simgr.stashes[stash]}') 47 | # update binary tree 48 | for s in simgr.stashes[stash]: 49 | s.globals['weight'] = s.globals.get('weight', 1) / len(simgr.stashes[stash]) 50 | pass # weighted choice code is always executed before returning 51 | 52 | # randomly pick new path 53 | simgr.move(from_stash=stash, to_stash='deferred') 54 | if max([s.globals['weight'] for s in simgr.stashes['deferred']]) < 0.1: 55 | for s in simgr.stashes['deferred']: 56 | s.globals['weight'] *= 10 57 | n = random.uniform(0, sum([s.globals['weight'] for s in simgr.stashes['deferred']])) 58 | for s in simgr.stashes['deferred']: 59 | if n < s.globals['weight']: 60 | simgr.stashes['deferred'].remove(s) 61 | simgr.stashes[stash] = [s] 62 | break 63 | n = n - s.globals['weight'] 64 | 65 | return simgr -------------------------------------------------------------------------------- /ExplorationTechniques/LoopExhaustion/LoopExhaustion.py: -------------------------------------------------------------------------------- 1 | 2 | # 
https://raw.githubusercontent.com/ucsb-seclab/syml/main/syml/exploration/exploration_techniques/literature/aeg_loop_exhaustion.py 3 | 4 | import logging 5 | 6 | import angr 7 | from angr.exploration_techniques import ExplorationTechnique 8 | 9 | l = logging.getLogger('LoopExhaustion') 10 | 11 | 12 | class AEGLoopExhaustion(ExplorationTechnique): 13 | """ 14 | Loop Exhaustion. http://security.ece.cmu.edu/aeg/aeg-current.pdf 15 | 16 | We propose and use a loop exhaustion search strategy. The loop-exhaustion 17 | strategy gives higher priority to an interpreter exploring the maximum number 18 | of loop iterations, hoping that computations involving more iterations 19 | are more promising to produce bugs like buffer overflows. 20 | Thus, whenever execution hits a symbolic loop, we try to exhaust the loop: execute 21 | it as many times as possible. Exhausting a symbolic loop has two immediate side effects: 22 | 1) on each loop iteration a new interpreter is spawned, effectively causing an explosion 23 | in the state space, and 2) execution might get 'stuck' in a deep loop. 24 | To avoid getting stuck, we impose two additional heuristics during loop exhaustion: 25 | 1) we use preconditioned symbolic execution along with pruning to reduce the number of interpreters or 26 | 2) we give higher priority to only one interpreter that tries to fully exhaust the loop, 27 | while all other interpreters exploring the same loop have the lowest possible priority. 
28 | """ 29 | 30 | def __init__(self, **kwargs): 31 | super(AEGLoopExhaustion, self).__init__() 32 | self.top_count = 0 33 | 34 | def setup(self, simgr): 35 | super(AEGLoopExhaustion, self).setup(simgr=simgr) 36 | simgr.stashes['active'][0].globals['visits'] = dict() 37 | 38 | # setup LoopSeer 39 | simgr.stashes['active'][0].register_plugin('loop_data', angr.state_plugins.SimStateLoopData()) 40 | simgr.use_technique(angr.exploration_techniques.LoopSeer(bound=10000)) 41 | 42 | @staticmethod 43 | def rank(s, reverse=False): 44 | k = -1 if reverse else 1 45 | return k * sum([s.loop_data.back_edge_trip_counts[loop[0].entry.addr][-1] for loop in s.loop_data.current_loop]) 46 | 47 | def step(self, simgr, stash='active', **kwargs): 48 | simgr = simgr.step(stash=stash, **kwargs) 49 | 50 | if len(simgr.stashes[stash]) == 1: 51 | new_count = self.rank(simgr.stashes[stash][0]) 52 | if new_count > self.top_count or len(simgr.stashes['deferred']) == 0: 53 | #l.debug(f'looping!') 54 | self.top_count = new_count 55 | else: 56 | #l.debug(f'exhausted or new loop!') # \t {simgr.stashes[stash][0].loop_data.back_edge_trip_counts}') 57 | simgr.move(from_stash=stash, to_stash='deferred') 58 | simgr.split(from_stash='deferred', to_stash=stash, state_ranker=self.rank, 59 | limit=len(simgr.deferred) - 1) 60 | self.top_count = self.rank(simgr.stashes[stash][0]) 61 | 62 | elif len(simgr.stashes[stash]) == 0: 63 | #l.debug('exhausted?') 64 | simgr.split(from_stash='deferred', to_stash=stash, state_ranker=self.rank, limit=len(simgr.deferred) - 1) 65 | self.top_count = self.rank(simgr.stashes[stash][0]) 66 | 67 | else: 68 | counts = simgr.stashes[stash][0].loop_data.back_edge_trip_counts 69 | for s in simgr.stashes[stash][1:]: 70 | if s.loop_data.back_edge_trip_counts != counts: 71 | simgr.split(from_stash=stash, to_stash='deferred', state_ranker=lambda s: self.rank(s, reverse=True), 72 | limit=1) 73 | self.top_count = self.rank(simgr.stashes[stash][0]) 74 | 75 | l.debug(f'{"-" * 
0x10}\nStatus:\t\t{simgr} --> active: {simgr.stashes[stash]}') 76 | break 77 | else: 78 | #l.debug('one more step..and let\'s see what happens..') 79 | pass 80 | 81 | return simgr -------------------------------------------------------------------------------- /ExplorationTechniques/KLEECoverageOptimizeSearch/KLEECoverageOS.py: -------------------------------------------------------------------------------- 1 | import logging 2 | import random 3 | from itertools import cycle 4 | 5 | from angr.exploration_techniques import ExplorationTechnique 6 | 7 | l = logging.getLogger('KLEECoverageOS') 8 | 9 | 10 | class KLEECoverageOptimizeSearch(ExplorationTechnique): 11 | """ 12 | Coverage Optimize Search. https://hci.stanford.edu/cstr/reports/2008-03.pdf 13 | 14 | A strategy which attempts to select states that are likely to cover new code 15 | in the immediate future. Heuristics are used to compute a weight for each process 16 | and a random process is selected according to these weights. 17 | Currently these heuristics use a combination of the minimum distance 18 | to an uncovered instruction, taking into account the call stack of the 19 | process, and whether the process has recently covered new code. 20 | These strategies are composed by selecting from each in a round robin fashion. 21 | Although this interleaving may increase the time for a particularly effective 22 | strategy to achieve high coverage, it protects the system against cases where 23 | one individual strategy would become stuck. 24 | Furthermore, because the strategies are always selecting processes from the same pool, 25 | using interleaving allows the strategies to interact cooperatively. 26 | Finally, once selected each process is run for a "time slice" defined by 27 | both a maximum number of instructions and a maximum amount of time. 
28 | The time to execute an individual instruction can vary widely between 29 | simple instructions, like addition, and instructions which may use the 30 | constraint solver or fork, like branches or memory accesses. 31 | Time-slicing processes helps ensure that a process which is frequently 32 | executing expensive instructions will not dominate execution time. 33 | 34 | This is implemented as a Non-Uniform-Random-Search with interleaved heuristics: 35 | 1. md2u: minimum distance to uncovered instruction 36 | 2. covnew: recently covered new code 37 | TODO: a time/instruction batch limit may be set (default to false, user-set) 38 | """ 39 | 40 | def __init__(self, **kwargs): 41 | super(KLEECoverageOptimizeSearch, self).__init__() 42 | self.heuristics = cycle(['md2u', 'covnew']) 43 | self.curr_heuristic = None 44 | self.covered = set() 45 | self.cfg = None 46 | 47 | def setup(self, simgr): 48 | super(KLEECoverageOptimizeSearch, self).setup(simgr) 49 | self.cfg = simgr._project.analyses.CFGFast(base_state=simgr.one_active, fail_fast=True, normalize=True) 50 | 51 | def rank(self, s, reverse=False): 52 | k = -1 if reverse else 1 53 | return k * s.globals[self.curr_heuristic] 54 | 55 | def step(self, simgr, stash='active', **kwargs): 56 | simgr = simgr.step(stash=stash, **kwargs) 57 | 58 | # if there's no branch: update globals and go on 59 | if len(simgr.stashes[stash]) == 1: 60 | self.update_globals(simgr.stashes[stash][0]) 61 | return simgr 62 | 63 | # if there are no successors: SHARED CODE AFTER IF STMT 64 | elif len(simgr.stashes[stash]) == 0: 65 | pass 66 | 67 | # if there is more than one successor: update globals, SHARED CODE AFTER IF STMT 68 | elif len(simgr.stashes[stash]) > 1: 69 | for state in simgr.stashes[stash]: 70 | self.update_globals(state) 71 | 72 | # change heuristic 73 | self.curr_heuristic = next(self.heuristics) 74 | 75 | # weighted choice 76 | simgr.move(from_stash=stash, to_stash='deferred') 77 | n = random.uniform(0, 
sum([s.globals[self.curr_heuristic] for s in simgr.stashes['deferred']])) 78 | for s in simgr.stashes['deferred']: 79 | if n < s.globals[self.curr_heuristic]: 80 | simgr.stashes['deferred'].remove(s) 81 | simgr.stashes[stash] = [s] 82 | l.debug(f'{"-" * 0x10}\nStatus:\t\t{simgr} --> active: {simgr.stashes[stash]} [{self.curr_heuristic} {s.globals[self.curr_heuristic]}]') 83 | break 84 | n = n - s.globals[self.curr_heuristic] 85 | 86 | return simgr 87 | 88 | def update_globals(self, state): 89 | # if new, update covered blocks, set insns since new code to 0 90 | if state.addr not in self.covered: 91 | self.covered.add(state.addr) 92 | state.globals['insns_since_new'] = 0 93 | # if not new: update insns since new code 94 | else: 95 | state.globals['insns_since_new'] = state.globals.get('insns_since_new', 0) + state.block().instructions 96 | 97 | state.globals['covnew'] = 1. / max(1, state.globals['insns_since_new'] - 1000) 98 | state.globals['covnew'] *= state.globals['covnew'] 99 | state.globals['md2u'] = 1. 
/ min(self.get_md2u(state.addr), 10000) or 1 100 | state.globals['md2u'] *= state.globals['md2u'] 101 | 102 | def get_md2u(self, addr, iter=50): 103 | if iter == 0: 104 | return float('inf') 105 | 106 | if addr not in self.covered: 107 | return 0 108 | 109 | node = self.cfg.model.get_any_node(addr, anyaddr=True) 110 | md2u = float('inf') 111 | 112 | for succ in set(node.successors): 113 | md2u = min(md2u, self.get_md2u(succ, iter=iter - 1)) 114 | 115 | return md2u + node.block.instructions if node.block else 10 -------------------------------------------------------------------------------- /ExplorationTechniques/SimgrViz/SimgrViz.py: -------------------------------------------------------------------------------- 1 | # pylint: disable=import-error, no-name-in-module 2 | import angr 3 | import hashlib 4 | import os 5 | import logging 6 | import networkx 7 | import time 8 | import copy 9 | 10 | from typing import List, Set, Dict, Tuple, Optional 11 | from angr.exploration_techniques import ExplorationTechnique 12 | from angr import SimState 13 | from networkx.drawing.nx_agraph import write_dot 14 | 15 | from shutil import which 16 | 17 | l = logging.getLogger("SimgrViz") 18 | l.setLevel("INFO") 19 | 20 | WDIR = './' 21 | RET_ADDR = 0xdeadbeef 22 | 23 | class SimgrViz(ExplorationTechnique): 24 | ''' 25 | When plugging this Exploration technique we collect information 26 | regarding the SimStates generated by the Simgr. 27 | This is a DEBUG ONLY technique that should never be used in production. 28 | ''' 29 | def __init__(self, cfg=None): 30 | super(SimgrViz, self).__init__() 31 | self._simgrG = networkx.DiGraph() 32 | self.cfg = cfg 33 | # Boolean guard to understand if this is the initial state or not. 34 | self._start = True 35 | self._salt = 0 36 | self._path_exploration_id = 0 37 | 38 | # Reference to the taint tracker to extract info 39 | self.taint_tracker = None 40 | 41 | # TODO 42 | # Activate the visualization only when _starts_from is reached. 
43 | self._starts_from = None 44 | # De-activate the visualization when _ends_to is reached. 45 | self._ends_to = None 46 | 47 | self.last_seen_id = None 48 | 49 | def setup(self, simgr): 50 | for state in simgr.stashes['active']: 51 | state.globals["predecessor"] = None 52 | state.globals["path_exploration_id"] = self._path_exploration_id 53 | self._path_exploration_id += 1 54 | return 55 | 56 | def get_state_hash(self, state): 57 | reg_values = [] 58 | for r in state.project.arch.register_list: 59 | reg_values.append(state.registers.load(r.name)) 60 | regs = '-'.join([str(x) for x in reg_values ]) 61 | stack_signature = '-'.join([ 62 | hex(state.callstack.call_site_addr), 63 | hex(state.callstack.current_return_target), 64 | hex(state.callstack.current_stack_pointer), 65 | str(state.callstack.jumpkind), 66 | hex(state.callstack.ret_addr), 67 | ]) 68 | globals_signature = '-'.join([ str(x) for x in state.globals.values()]) 69 | state_id_sig = str(id(state)) # regs + str(id(state)) # + stack_signature # + globals_signature 70 | h = hashlib.sha256() 71 | h.update(state_id_sig.encode("utf-8")) 72 | h.update(regs.encode("utf-8")) 73 | h.update(stack_signature.encode("utf-8")) 74 | h.update(globals_signature.encode("utf-8")) 75 | if state.globals["predecessor"]: 76 | h.update(state.globals["predecessor"].encode("utf-8")) 77 | h_hexdigest = h.hexdigest() 78 | # Store the signature into the state. 
79 | state.globals["state_signature"] = h_hexdigest 80 | return str(h_hexdigest) 81 | 82 | def _update_timeout_info(self, timeout_states: List[SimState]): 83 | for state in timeout_states: 84 | s_sig = state.globals["state_signature"] 85 | self._simgrG.nodes[s_sig]["timeout"] = True 86 | 87 | def _add_state_to_graph(self, parent_state_id:str, sim_state_id:str, state:SimState): 88 | 89 | self._simgrG.add_node(sim_state_id, state_addr = hex(state.addr)) 90 | self._simgrG.add_edge(parent_state_id, sim_state_id) 91 | 92 | if state.addr in self.cfg.functions: 93 | self._simgrG.add_node(sim_state_id, state_addr = hex(state.addr), color = "green" if state.project.is_hooked(state.addr) else "yellow", 94 | func_name="{}".format(self.cfg.get_any_node(state.addr).name), 95 | hooked = True if state.project.is_hooked(state.addr) else False, 96 | call_followed = True, 97 | path_exploration_id=state.globals["path_exploration_id"]) 98 | else: 99 | self._simgrG.add_node(sim_state_id, state_addr = hex(state.addr), path_exploration_id=state.globals["path_exploration_id"]) 100 | 101 | self._simgrG.nodes[sim_state_id]['jumpkind'] = state.history.jumpkind 102 | 103 | # This can be heavy 104 | if state.addr != RET_ADDR: 105 | try: 106 | self._simgrG.nodes[sim_state_id]['bb_ins'] = [x.mnemonic for x in state.block().disassembly.insns] 107 | self._simgrG.nodes[sim_state_id]['bb_size'] = state.block().size 108 | if state.callstack.current_function_address: 109 | self._simgrG.nodes[sim_state_id]['callstack_curr_func_addr'] = str(hex(state.callstack.current_function_address)) 110 | except Exception: 111 | pass 112 | 113 | def _tag_fake_ret(self, state:SimState): 114 | if state.history.jumpkind == "Ijk_FakeRet": 115 | self._simgrG.nodes[state.globals["state_signature"]]['call_followed'] = False 116 | self._simgrG.nodes[state.globals["state_signature"]]['color'] = "red" 117 | 118 | def successors(self, simgr, state:SimState, **kwargs): 119 | succs = simgr.successors(state, **kwargs) 120 | 
self._tag_fake_ret(state) 121 | if self._start: 122 | assert(not state.globals["predecessor"]) 123 | sim_state_id = self.get_state_hash(state) 124 | 125 | if state.addr in self.cfg.functions: 126 | if state.project.is_hooked(state.addr): 127 | self._simgrG.add_node(sim_state_id, state_addr = hex(state.addr), color = "green", 128 | hooked = True, 129 | func_name="{}".format(self.cfg.get_any_node(state.addr).name)) 130 | else: 131 | self._simgrG.add_node(sim_state_id, state_addr = hex(state.addr), color = "yellow", 132 | hooked = False, 133 | func_name="{}".format(self.cfg.get_any_node(state.addr).name)) 134 | else: 135 | self._simgrG.add_node(sim_state_id, state_addr = hex(state.addr)) 136 | 137 | self._start = False 138 | 139 | self._path_exploration_id += 1 140 | 141 | for succ_state in succs.flat_successors: 142 | succ_state.globals["predecessor"] = state.globals["state_signature"] 143 | parent_state_id = succ_state.globals["predecessor"] 144 | succ_state.globals["path_exploration_id"] = self._path_exploration_id 145 | sim_state_id = self.get_state_hash(succ_state) 146 | 147 | self._add_state_to_graph(parent_state_id, sim_state_id, succ_state) 148 | 149 | return succs 150 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Awesome angr [![Awesome](https://cdn.rawgit.com/sindresorhus/awesome/d7305f38d29fed78fa85652e3a63e154dd8e8829/media/badge.svg)](https://github.com/sindresorhus/awesome) 2 | 3 | A collection of resources/tools and analyses for the [angr](https://github.com/angr) binary analysis framework. 4 | This page does not only collect links and external resources; it is also meant to be a harbour for releasing any unofficial extensions, tools, and utilities that can be useful when working with angr. 
5 | 6 | ## ExplorationTechniques 📁 7 | 8 | A collection of exploration techniques written by the community. 9 | 10 | * *SimgrViz*: an exploration technique that collects information regarding the states generated by the SimulationManager and creates a graph that can be later visualized to debug the analyses (.dot file). 11 | * *MemLimiter*: an exploration technique to stop the analysis when memory consumption is too high! 12 | * *ExplosionDetector*: stops the analysis when there are too many states or other critical errors happen. 13 | * *KLEECoverageOptimizeSearch*: KLEE technique to improve coverage. 14 | * *KLEERandomSearch*: an ET for random path selection. 15 | * *LoopExhaustion*: a loop exhaustion search strategy. 16 | * *StochasticSearch*: an ET for stochastic search of active states. 17 | * *HeartBeat*: an exploration technique that reports that symbolic execution is still alive and provides some utilities to gently hook into the DSE while it is running. 18 | 19 | ## Documentation :book: 20 | * [docs.angr.io](https://docs.angr.io/) - Official angr general documentation website. 21 | * [angr.io](http://angr.io/api-doc/angr.html) - Official angr API documentation. 22 | * [Intro to Binary Analysis with Z3 and angr](https://github.com/FSecureLABS/z3_and_angr_binary_analysis_workshop) - FSecureLABS workshop on using Z3 and the angr framework. 23 | 24 | ## Projects :rocket: 25 | 26 | List of academic and non-academic projects based on angr whose code is open source. 27 | 28 | * [Heaphopper](https://github.com/angr/heaphopper) - Apply symbolic execution to automatically verify security properties of most common heap libraries. 29 | * [angr-cli](https://github.com/fmagin/angr-cli) - Command line interface for angr a la peda/GEF/pwndbg. 30 | * [Syml](https://github.com/ucsb-seclab/syml) - Use ML to prioritize exploration of promising vulnerable paths. 31 | * [Angrop](https://github.com/angr/angrop) - Generate ropchains using angr and symbolic execution. 
32 | * [Angr-management](https://github.com/angr/angr-management) - GUI for angr. 33 | * [Mechaphish](https://github.com/mechaphish) - AEG system for CGC. 34 | * [angr-static-analysis-for-vuzzer64](https://github.com/ash09/angr-static-analysis-for-vuzzer64) - angr-based static analysis module for Vuzzer. 35 | * [FirmXRay-angr](https://github.com/ucsb-seclab/monolithic-firmware-collection/tree/master/utils/firmxray) - An angr version of the base address detection analysis implemented in [FirmXRay](https://github.com/OSUSecLab/FirmXRay). 36 | * [IVTSpotter](https://github.com/ucsb-seclab/monolithic-firmware-collection/blob/master/utils/ivt_spotter/spot_ivt.py) - An IVT Spotter for monolithic ARM firmware images. 37 | * [MemSight](https://github.com/season-lab/memsight) - Rethinking Pointer Reasoning in Symbolic Execution. 38 | * [Karonte](https://github.com/ucsb-seclab/karonte) - Detecting Insecure Multi-binary Interactions in Embedded Firmware. 39 | * [BootStomp](https://github.com/ucsb-seclab/BootStomp) - A bootloader vulnerability finder. 40 | * [SaTC](https://github.com/NSSL-SJTU/SaTC/) - A prototype of Shared-keywords aware Taint Checking(SaTC), a static analysis method that tracks user input between front-end and back-end for vulnerability discovery effectively and efficiently. 41 | * [Arbiter](https://github.com/jkrshnmenon/arbiter) - Arbiter is a combination of static and dynamic analyses, built on top of angr, that can be used to detect some vulnerability classes. 42 | ## Blogposts :newspaper: 43 | * [angr-blog](https://angr.io/) - Official angr blog. 44 | * [A reaching definition engine for binary analysis built-in in angr.](https://degrigis.github.io/posts/angr_rd/) - A walk-through of the ReachingDefinition analysis built-in in angr. 45 | * [shellphish-phrack](http://phrack.org/papers/cyber_grand_shellphish.html) - Phrack article on [Mechaphish](https://github.com/mechaphish), the AEG system based on angr that got 3rd place at the CGC. 
46 | * [angr-tutorial](https://blog.notso.pro/2019-03-20-angr-introduction-part0/) - Introduction to angr - baby steps in symbolic execution. 47 | * [bcheck](https://github.com/ChrisTheCoolHut/bcheck) - Binary check tool to identify command injection and format string vulnerabilities in blackbox binaries. 48 | 49 | ## Papers :page_with_curl: 50 | 51 | Here is a collection of papers that used angr or whose projects are based on the angr framework. 52 | 53 | | Year | Paper | 54 | | :------------- | :----------: | 55 | | 2022 | [Heapster: Analyzing the Security of Dynamic Allocators for Monolithic Firmware Images](https://degrigis.github.io/bins/heapster.pdf) 56 | | 2022 | [Arbiter: Bridging the Static and Dynamic Divide in Vulnerability Discovery on Binary Programs](https://www.s3.eurecom.fr/docs/usenixsec22_arbiter.pdf) 57 | | 2022 | [Ferry: State-Aware Symbolic Execution for Exploring State-Dependent Program Paths](https://www.usenix.org/system/files/sec22summer_zhou-shunfan.pdf) 58 | | 2022 | [Fuzzware: Using Precise MMIO Modeling for Effective Firmware Fuzzing](https://sites.cs.ucsb.edu/~vigna/publications/2022_USENIXSecurity_Fuzzware.pdf) 59 | | 2021 | [Jetset: Targeted Firmware Rehosting for Embedded Systems](https://www.usenix.org/system/files/sec21fall-johnson.pdf) 60 | | 2021 | [SoK: All You Ever Wanted to Know About x86/x64 Binary Disassembly But Were Afraid to Ask](https://www.portokalidis.net/files/sok86disas_oakland21.pdf) 61 | | 2021 | [SyML: Guiding Symbolic Execution Toward Vulnerable States Through Pattern Learning](https://conand.me/publications/ruaro-syml-2021.pdf) 62 | | 2021 | [DIANE: Identifying Fuzzing Triggers in Apps to Generate Under-constrained Inputs for IoT Devices](https://conand.me/publications/redini-diane-2021.pdf) 63 | | 2021 | [Sharing More and Checking Less: Leveraging Common Input Keywords to Detect Bugs in Embedded Systems](https://www.usenix.org/system/files/sec21fall-chen-libo.pdf) 64 | | 2021 | [Boosting symbolic execution via
constraint solving time prediction (experience paper)](https://dl.acm.org/doi/10.1145/3460319.3464813) 65 | | 2020 | [DICE: Automatic Emulation of DMA Input Channels for Dynamic Firmware Analysis](https://arxiv.org/pdf/2007.01502.pdf) 66 | | 2020 | [Towards Constant-Time Foundations for the New Spectre Era](https://cseweb.ucsd.edu/~cdisselk/papers/ct-foundations.pdf) 67 | | 2020 | [Symbion: Interleaving Symbolic with Concrete Execution](https://sites.cs.ucsb.edu/~vigna/publications/2020_CNS_Symbion.pdf) | 68 | | 2020 | [KARONTE: Detecting Insecure Multi-binary Interactions in Embedded Firmware](https://www.badnack.it/static/papers/University/karonte.pdf) | 69 | | 2020 | [Device-agnostic Firmware Execution is Possible: A Concolic Execution Approach for Peripheral Emulation](https://dl.acm.org/doi/10.1145/3427228.3427280) | 70 | | 2020 | [KOOBE: Towards Facilitating Exploit Generation of Kernel Out-Of-Bounds Write Vulnerabilities](https://www.usenix.org/system/files/sec20summer_chen-weiteng_prepub.pdf) 71 | | 2020 | [BugMiner: Mining the Hard-to-Reach Software Vulnerabilities through the Target-Oriented Hybrid Fuzzer](https://www.mdpi.com/2079-9292/10/1/62/pdf) 72 | | 2019 | [Enhancing Symbolic Execution by Machine Learning Based Solver Selection](https://www.csie.ntu.edu.tw/~hchsiao/pub/2019_BAR.pdf) 73 | | 2019 | [BinTrimmer: Towards Static Binary Debloating Through Abstract Interpretation](https://sites.cs.ucsb.edu/~chris/research/doc/dimva19_bintrimmer.pdf) 74 | | 2019 | [Sleak: Automating Address Space Layout Derandomization](https://par.nsf.gov/servlets/purl/10155109) 75 | | 2019 | [Time and Order: Towards Automatically Identifying Side-Channel Vulnerabilities in Enclave Binaries](https://www.usenix.org/conference/raid2019/presentation/wang-wubing) 76 | | 2018 | [HeapHopper: Bringing Bounded Model Checking to Heap Implementation Security](https://sites.cs.ucsb.edu/~chris/research/doc/usenix18_heaphopper.pdf) 77 | | 2018 | [Efficient Extraction of Malware 
Signatures Through System Calls and Symbolic Execution: An Experience Report](https://hal.inria.fr/hal-01954483/document) 78 | | 2018 | [Dynamic Path Pruning in Symbolic Execution](https://www.csie.ntu.edu.tw/~hchsiao/pub/2018_IEEE_DSC.pdf) 79 | | 2018 | [On Benchmarking the Capability of Symbolic Execution Tools with Logic Bombs](https://arxiv.org/pdf/1712.01674.pdf) 80 | | 2017 | [Rethinking Pointer Reasoning in Symbolic Execution](https://github.com/season-lab/memsight/raw/master/publications/memsight-ase17.pdf) 81 | | 2017 | [Your Exploit is Mine: Automatic Shellcode Transplant for Remote Exploits](https://www.ieee-security.org/TC/SP2017/papers/579.pdf) 82 | | 2017 | [BOOMERANG: Exploiting the Semantic Gap in Trusted Execution Environments](https://sites.cs.ucsb.edu/~vigna/publications/2017_NDSS_Boomerang.pdf) 83 | | 2017 | [Ramblr: Making Reassembly Great Again](https://sefcom.asu.edu/publications/ramblr-making-reassembly-great-again-ndss2017.pdf) 84 | | 2017 | [BootStomp: On the Security of Bootloaders in Mobile Devices](https://www.usenix.org/system/files/conference/usenixsecurity17/sec17-redini.pdf) | 85 | | 2017 | [Piston: Uncooperative Remote Runtime Patching](https://sefcom.asu.edu/publications/piston-uncooperative-remote-runtime-patching-acsac2017.pdf) 86 | | 2016 | [SoK: (State of) The Art of War: Offensive Techniques in Binary Analysis](https://sites.cs.ucsb.edu/~vigna/publications/2016_SP_angrSoK.pdf) 87 | | 2016 | [Driller: Augmenting Fuzzing Through Selective Symbolic Execution](https://sites.cs.ucsb.edu/~chris/research/doc/ndss16_driller.pdf) 88 | | 2015 | [Firmalice - Automatic Detection of Authentication Bypass Vulnerabilities in Binary Firmware](https://sites.cs.ucsb.edu/~chris/research/doc/ndss15_firmalice.pdf) | 89 | 90 | 91 | 92 | --------------------------------------------------------------------------------