├── ExplorationTechniques
    ├── SimgrViz
    │   ├── screenshot_1.PNG
    │   ├── README.md
    │   └── SimgrViz.py
    ├── HeartBeat
    │   ├── README.md
    │   └── heartbeat.py
    ├── MemLimiter
    │   ├── README.md
    │   └── MemLimiter.py
    ├── ExplosionDetector
    │   ├── README.md
    │   └── ExplosionDetector.py
    ├── StochasticSearch
    │   ├── README.md
    │   └── StocasticSearch.py
    ├── KLEERandomSearch
    │   ├── README.me
    │   └── KLEERandomSearch.py
    ├── LoopExhaustion
    │   ├── README.md
    │   └── LoopExhaustion.py
    └── KLEECoverageOptimizeSearch
    │   ├── README.md
    │   └── KLEECoverageOS.py
└── README.md


/ExplorationTechniques/SimgrViz/screenshot_1.PNG:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/degrigis/awesome-angr/HEAD/ExplorationTechniques/SimgrViz/screenshot_1.PNG


--------------------------------------------------------------------------------
/ExplorationTechniques/HeartBeat/README.md:
--------------------------------------------------------------------------------
1 | An exploration technique that periodically reports that symbolic execution is still alive and provides
2 | utilities to gently break into the DSE while it is running.


--------------------------------------------------------------------------------
/ExplorationTechniques/MemLimiter/README.md:
--------------------------------------------------------------------------------
1 | The following ExplorationTechnique can be plugged into a SimulationManager to stop DSE when memory consumption becomes critical: when system memory usage exceeds 90%, or when the process's virtual memory comes within 1 GiB of `max_mem`.


--------------------------------------------------------------------------------
/ExplorationTechniques/ExplosionDetector/README.md:
--------------------------------------------------------------------------------
1 | When the ExplosionDetector is plugged into a SimulationManager, it can be used to (1) trigger a timeout, (2) stop the execution once a certain number of SimStates has been generated, and (3) nuke all the unconstrained SimStates in the SimulationManager.
2 | 


--------------------------------------------------------------------------------
/ExplorationTechniques/StochasticSearch/README.md:
--------------------------------------------------------------------------------
1 | Keeps only one path active at a time; any others are discarded.
2 | Before each pass through, weights are randomly assigned to each basic block.
3 | These weights form a probability distribution for determining which state remains after splits.
4 | When we run out of active paths to step, we start again from the start state.
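
The weighted choice described above can be sketched in plain Python, without angr. This is a minimal illustration, not the repository's implementation: the addresses are hypothetical, and states are reduced to their basic-block addresses.

```python
import random
from collections import defaultdict


def weighted_pick(addrs, affinity, rng):
    """Pick one address with probability proportional to its affinity weight."""
    total = sum(affinity[a] for a in addrs)
    selected = rng.uniform(0, total)
    for addr in addrs:
        weight = affinity[addr]
        if selected < weight:
            return addr
        selected -= weight
    return addrs[-1]  # fallback for floating-point edge cases


rng = random.Random(42)
# Each basic-block address lazily receives a random weight in [0, 1).
affinity = defaultdict(rng.random)

successors = [0x401000, 0x401020, 0x401040]  # hypothetical addresses
picked = weighted_pick(successors, affinity, rng)
```

Clearing `affinity` before a restart re-draws every weight, which is what makes each pass through the program favor different branches.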


--------------------------------------------------------------------------------
/ExplorationTechniques/KLEERandomSearch/README.me:
--------------------------------------------------------------------------------
 1 | Random path selection. https://hci.stanford.edu/cstr/reports/2008-03.pdf
 2 | 
 3 | Maintains a binary tree recording the program path followed for all active processes,
 4 | i.e. the leaves of the tree are the current processes and the internal nodes are places
 5 | where execution forked. Processes are selected by traversing this tree from the root
 6 | and randomly selecting the path to follow at branch points. Therefore when a branch point
 7 | is reached the set of processes in each subtree will have equal probability of being selected,
 8 | regardless of their size.
 9 | 
10 | This is implemented as a Non-Uniform-Random-Search where child nodes inherit parent weight, divided by the number
11 | of siblings
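
The weight inheritance can be illustrated with a small, angr-free sketch. The tree below is hypothetical, and `fork` and `pick` are illustrative helper names rather than part of the technique's API.

```python
import random


def fork(parent_weight, n_children):
    """Each child inherits the parent's weight divided by the number of siblings."""
    return [parent_weight / n_children] * n_children


def pick(weights, rng):
    """Return the index of one entry, chosen with probability proportional to its weight."""
    n = rng.uniform(0, sum(weights))
    for i, w in enumerate(weights):
        if n < w:
            return i
        n -= w
    return len(weights) - 1  # fallback for floating-point edge cases


# The root forks into A and B; B then forks into B1, B2 and B3.
a, b = fork(1.0, 2)
b1, b2, b3 = fork(b, 3)
leaves = {"A": a, "B1": b1, "B2": b2, "B3": b3}
chosen = list(leaves)[pick(list(leaves.values()), random.Random(0))]
```

Leaf `A` keeps a weight of 0.5 while the three leaves under `B` share the other 0.5, so both subtrees are equally likely to be selected even though one holds three times as many states.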


--------------------------------------------------------------------------------
/ExplorationTechniques/SimgrViz/README.md:
--------------------------------------------------------------------------------
 1 | 
 2 | ## SimgrViz
 3 | 
 4 | This exploration technique dumps the successors of a given state and builds the dynamic control flow graph of the program while
 5 | symbolically executing it. 
 6 | The final result can be exported in a .dot file and visualized with [Gephi](https://gephi.org/) or any other tool that supports the DOT format.
 7 | Node information can be enriched with attributes of the state for a post-mortem analysis of what happened during the symbolic execution.
 8 | 
 9 | *HINT*: Plug this ET as the last ET of your SimulationManager.
10 | 
11 | To dump the .dot file, use:
12 | 
13 | ```
14 | from networkx.drawing.nx_agraph import write_dot
15 | 
16 | [...]
17 | 
18 | simgr_viz = SimgrViz(cfg=cfg)
19 | simgr.use_technique(simgr_viz)
20 | simgr.explore()
21 | [...]
22 | 
23 | write_dot(simgr_viz._simgrG, "my_simgr.dot")
24 | 
25 | ```
26 | 
27 | Here is an example of the graph visualized with Gephi.
28 | 
29 | ![Example of visualization in Gephi](./screenshot_1.PNG)
30 | 


--------------------------------------------------------------------------------
/ExplorationTechniques/LoopExhaustion/README.md:
--------------------------------------------------------------------------------
 1 | Loop Exhaustion. http://security.ece.cmu.edu/aeg/aeg-current.pdf
 2 | We propose and use a loop exhaustion search strategy. The loop-exhaustion
 3 | strategy gives higher priority to an interpreter exploring the maximum number
 4 | of loop iterations, hoping that computations involving more iterations
 5 | are more promising to produce bugs like buffer overflows.
 6 | Thus, whenever execution hits a symbolic loop, we try to exhaust the loop, i.e., execute
 7 | it as many times as possible. Exhausting a symbolic loop has two immediate side effects:
 8 | 1) on each loop iteration a new interpreter is spawned, effectively causing an explosion
 9 | in the state space, and 2) execution might get 'stuck' in a deep loop.
10 | To avoid getting stuck, we impose two additional heuristics during loop exhaustion:
11 | 1) we use preconditioned symbolic execution along with pruning to reduce the number of interpreters or
12 | 2) we give higher priority to only one interpreter that tries to fully exhaust the loop,
13 | while all other interpreters exploring the same loop have the lowest possible priority.
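
The priority rule in heuristic 2 can be sketched without angr as a ranking over loop trip counts. The state names, loop entry addresses and counts below are made up for illustration; in angr the trip counts would come from the `loop_data` state plugin.

```python
def rank(trip_counts, current_loops):
    """Sum the latest back-edge trip count of every loop the state is currently inside."""
    return sum(trip_counts[entry][-1] for entry in current_loops)


# Hypothetical states: (back-edge trip counts per loop entry, loops currently being executed)
states = {
    "s0": ({0x400100: [3]}, [0x400100]),
    "s1": ({0x400100: [7]}, [0x400100]),
    "s2": ({0x400100: [7], 0x400200: [2]}, [0x400100, 0x400200]),
}

ranks = {name: rank(counts, loops) for name, (counts, loops) in states.items()}
best = max(ranks, key=ranks.get)  # the single state that keeps exhausting the loop
```

Only `best` stays active; every other state is deferred and reconsidered once the active one stops making progress.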


--------------------------------------------------------------------------------
/ExplorationTechniques/MemLimiter/MemLimiter.py:
--------------------------------------------------------------------------------
 1 | import os
 2 | import psutil
 3 | from angr.exploration_techniques import ExplorationTechnique
 4 | 
 5 | class MemLimiter(ExplorationTechnique):
 6 |     def __init__(self, max_mem, drop_errored):
 7 |         super(MemLimiter, self).__init__()
 8 |         self.max_mem = max_mem
 9 |         self.drop_errored = drop_errored
10 |         self.process = psutil.Process(os.getpid())
11 | 
12 |     def step(self, simgr, stash='active', **kwargs):
13 |         if psutil.virtual_memory().percent > 90 or (self.max_mem - 1) < self.memory_usage_psutil:
14 |             simgr.move(from_stash='active', to_stash='out_of_memory')
15 |             simgr.move(from_stash='deferred', to_stash='out_of_memory')
16 | 
17 |         simgr.drop(stash='deadended')
18 |         simgr.drop(stash='avoid')
19 |         simgr.drop(stash='found')
20 |         if self.drop_errored:
21 |             del simgr.errored[:]
22 | 
23 |         return simgr.step(stash=stash, **kwargs)
24 | 
25 |     @property
26 |     def memory_usage_psutil(self):
27 |         # return the virtual memory size of this process in GiB
28 |         mem = self.process.memory_info().vms / float(2 ** 30)
29 |         return mem
30 | 


--------------------------------------------------------------------------------
/ExplorationTechniques/ExplosionDetector/ExplosionDetector.py:
--------------------------------------------------------------------------------
 1 | import logging
 2 | from threading import Event  # assumed source of Event; timed_out only needs set()/is_set()
 3 | 
 4 | from angr.exploration_techniques import ExplorationTechnique
 5 | 
 6 | l = logging.getLogger('ExplosionDetector')  # logger name is an assumption
 7 | 
 8 | 
 9 | class ExplosionDetector(ExplorationTechnique):
10 |     def __init__(self, stashes=('active', 'deferred', 'errored', 'cut'), threshold=100):
11 |         super(ExplosionDetector, self).__init__()
12 |         self._stashes = stashes
13 |         self._threshold = threshold
14 |         self.timed_out = Event()
15 |         self.timed_out_bool = False
16 | 
17 |     def step(self, simgr, stash='active', **kwargs):
18 |         simgr = simgr.step(stash=stash, **kwargs)
19 |         if len(simgr.unconstrained) > 0:
20 |             l.debug("Nuking unconstrained")
21 |             simgr.move(from_stash='unconstrained', to_stash='_Drop', filter_func=lambda _: True)
22 | 
23 |         # Count the monitored states before dropping anything, so the log messages report real totals.
24 |         total = 0
25 |         for st in self._stashes:
26 |             if hasattr(simgr, st):
27 |                 total += len(getattr(simgr, st))
28 | 
29 |         if self.timed_out.is_set():
30 |             l.critical("Timed out, %d states: %s" % (total, str(simgr)))
31 |             self.timed_out_bool = True
32 |         elif total >= self._threshold:
33 |             l.critical("State explosion detected, over %d states: %s" % (total, str(simgr)))
34 |         else:
35 |             return simgr
36 | 
37 |         # Timeout or explosion: drop every monitored stash.
38 |         for st in self._stashes:
39 |             if hasattr(simgr, st):
40 |                 simgr.move(from_stash=st, to_stash='_Drop', filter_func=lambda _: True)
41 | 
42 |         return simgr


--------------------------------------------------------------------------------
/ExplorationTechniques/KLEECoverageOptimizeSearch/README.md:
--------------------------------------------------------------------------------
 1 | Coverage Optimize Search. https://hci.stanford.edu/cstr/reports/2008-03.pdf
 2 | 
 3 | A strategy which attempts to select states that are likely to cover new code
 4 | in the immediate future. Heuristics are used to compute a weight for each process
 5 | and a random process is selected according to these weights.
 6 | Currently these heuristics use a combination of the minimum distance
 7 | to an uncovered instruction, taking into account the call stack of the
 8 | process, and whether the process has recently covered new code.
 9 | These strategies are composed by selecting from each in a round robin fashion.
10 | Although this interleaving may increase the time for a particularly effective
11 | strategy to achieve high coverage, it protects the system against cases where
12 | one individual strategy would become stuck.
13 | Furthermore, because the strategies are always selecting processes from the same pool,
14 | using interleaving allows the strategies to interact cooperatively.
15 | Finally, once selected each process is run for a "time slice" defined by
16 | both a maximum number of instructions and a maximum amount of time.
17 | The time to execute an individual instruction can vary widely between
18 | simple instructions, like addition, and instructions which may use the
19 | constraint solver or fork, like branches or memory accesses.
20 | Time-slicing processes helps ensure that a process which is frequently
21 | executing expensive instructions will not dominate execution time.
22 | 
23 | This is implemented as a Non-Uniform-Random-Search with interleaved heuristics:
24 |     1. md2u: minimum distance to uncovered instruction
25 |     2. covnew: recently covered new code
26 | TODO: a time/instruction batch limit may be set (default to false, user-set)
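
The two weights and their round-robin interleaving can be sketched as follows. The constants (1000 and 10000) and the squaring mirror `KLEECoverageOS.py`; guarding the `md2u` denominator against zero is an assumption of this sketch, and the function names are illustrative.

```python
from itertools import cycle


def covnew_weight(insns_since_new):
    """Higher weight for states that covered new code recently."""
    w = 1.0 / max(1, insns_since_new - 1000)
    return w * w


def md2u_weight(distance):
    """Higher weight for states close to an uncovered instruction."""
    w = 1.0 / (min(distance, 10000) or 1)  # assumed guard: a distance of 0 counts as 1
    return w * w


# The two heuristics take turns, one per selection.
heuristics = cycle(["md2u", "covnew"])
order = [next(heuristics) for _ in range(4)]
```

Squaring the inverse sharpens the preference: a state two instructions away from uncovered code gets a quarter of the weight of a state that is one instruction away.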


--------------------------------------------------------------------------------
/ExplorationTechniques/StochasticSearch/StocasticSearch.py:
--------------------------------------------------------------------------------
 1 | import logging
 2 | import random
 3 | from collections import defaultdict
 4 | 
 5 | from angr.exploration_techniques import ExplorationTechnique
 6 | 
 7 | l = logging.getLogger('syml')
 8 | 
 9 | 
10 | class StochasticSearch(ExplorationTechnique):
11 |     """
12 |     Stochastic Search.
13 | 
14 |     Will only keep one path active at a time, any others will be discarded.
15 |     Before each pass through, weights are randomly assigned to each basic block.
16 |     These weights form a probability distribution for determining which state remains after splits.
17 |     When we run out of active paths to step, we start again from the start state.
18 |     """
19 | 
20 |     def __init__(self, restart_prob=0.0001, **kwargs):
21 |         """
22 |         :param restart_prob: The probability of randomly restarting the search (default 0.0001).
23 |         The start state is not a parameter: setup() takes it from the simulation manager's active stash.
24 |         """
25 |         super(StochasticSearch, self).__init__()
26 |         self.restart_prob = restart_prob
27 |         self._random = random.Random()
28 |         self._random.seed(42)
29 |         self.affinity = defaultdict(self._random.random)
30 | 
31 |     def setup(self, simgr):
32 |         super(StochasticSearch, self).setup(simgr)
33 |         self.start_state = simgr.one_active
34 | 
35 |     def step(self, simgr, stash='active', **kwargs):
36 |         simgr = simgr.step(stash=stash, **kwargs)
37 | 
38 |         if not simgr.stashes[stash] or self._random.random() < self.restart_prob:
39 |             simgr.stashes[stash] = [self.start_state]
40 |             self.affinity.clear()
41 | 
42 |         if len(simgr.stashes[stash]) > 1:
43 |             def weighted_pick(states):
44 |                 """
45 |                 :param states: Diverging states.
46 |                 """
47 |                 assert len(states) >= 2
48 |                 total_weight = sum((self.affinity[s.addr] for s in states))
49 |                 selected = self._random.uniform(0, total_weight)
50 |                 i = 0
51 |                 for i, state in enumerate(states):
52 |                     weight = self.affinity[state.addr]
53 |                     if selected < weight:
54 |                         break
55 |                     else:
56 |                         selected -= weight
57 |                 picked = states[i]
58 |                 return picked
59 | 
60 |             simgr.stashes[stash] = [weighted_pick(simgr.stashes[stash])]
61 | 
62 |         return simgr


--------------------------------------------------------------------------------
/ExplorationTechniques/HeartBeat/heartbeat.py:
--------------------------------------------------------------------------------
 1 | # pylint: disable=import-error, no-name-in-module
 2 | import angr
 3 | import hashlib
 4 | import os
 5 | import logging
 6 | import networkx
 7 | import time
 8 | import copy
 9 | 
10 | from typing import List, Set, Dict, Tuple, Optional
11 | from angr.exploration_techniques import ExplorationTechnique
12 | from angr import SimState
13 | from networkx.drawing.nx_agraph import write_dot
14 | 
15 | 
16 | l = logging.getLogger("HeartBeat")
17 | l.setLevel("INFO")
18 | 
19 | CURR_SIMGR = None
20 | CURR_PROJ = None
21 | CURR_STATE = None
22 | 
23 | # This is useful if you plugged this: 
24 | # https://github.com/degrigis/awesome-angr/tree/main/ExplorationTechniques/SimgrViz
25 | def dump_viz_graph(simgr=None):
26 |     l.info("Dumping visualization graph if it exists")
27 |     
28 |     if simgr is None:
29 |         simgr = CURR_SIMGR
30 |     
31 |     for et in simgr._techniques:
32 |         if "SimgrViz" in str(et):
33 |             write_dot(et._simgrG,"/tmp/my_simgr.dot")
34 |             break
35 | 
36 | # This is useful if you are using this:
37 | # https://github.com/fmagin/angr-cli
38 | def spw_cli():
39 |     global CURR_SIMGR
40 |     global CURR_PROJ
41 |     global CURR_STATE
42 |     import angrcli.plugins.ContextView
43 |     from angrcli.interaction.explore import ExploreInteractive
44 |     e = ExploreInteractive(CURR_PROJ, CURR_STATE)
45 |     e.cmdloop()
46 | 
47 | class HeartBeat(ExplorationTechnique):
48 | 
49 |     def __init__(self, beat_interval=100):
50 |         super(HeartBeat, self).__init__()
51 |         self.stop_heart_beat_file = "/tmp/stop_heartbeat.txt"
52 |         self.beat_interval = beat_interval
53 |         self.beat_cnt = 0
54 |         self.steps_cnt = 0
55 | 
56 |     def setup(self, simgr):
57 |         return True
58 | 
59 |     def successors(self, simgr, state:SimState, **kwargs):
60 |         succs = simgr.successors(state, **kwargs)
61 |         self.beat_cnt += 1
62 |         self.steps_cnt += 1
63 |         if self.beat_cnt == self.beat_interval:
64 |             l.info("Exploration is alive <3. Step {}".format(self.steps_cnt)) 
65 |             l.info("    Succs are: {}".format(succs))
66 |             l.info("    Simgr is: {}".format(simgr))
67 |             self.beat_cnt = 0
68 |             if os.path.isfile(self.stop_heart_beat_file):
69 |                 l.info("HeartBeat stopped, need help? </3")
70 |                 
71 |                 global CURR_SIMGR
72 |                 global CURR_PROJ
73 |                 global CURR_STATE
74 | 
75 |                 CURR_SIMGR = simgr
76 |                 CURR_PROJ = state.project
77 |                 CURR_STATE = state
78 |                 
79 |                 import ipdb; ipdb.set_trace()
80 |                 
81 |                 CURR_SIMGR = None
82 |         
83 |         return succs
84 | 


--------------------------------------------------------------------------------
/ExplorationTechniques/KLEERandomSearch/KLEERandomSearch.py:
--------------------------------------------------------------------------------
 1 | import logging
 2 | import random
 3 | 
 4 | from angr.exploration_techniques import ExplorationTechnique
 5 | 
 6 | l = logging.getLogger('KLEERandomSearch')
 7 | 
 8 | 
 9 | class KLEERandomSearch(ExplorationTechnique):
10 |     """
11 |     Random path selection. https://hci.stanford.edu/cstr/reports/2008-03.pdf
12 | 
13 |     Maintains a binary tree recording the program path followed for all active processes,
14 |     i.e. the leaves of the tree are the current processes and the internal nodes are places
15 |     where execution forked. Processes are selected by traversing this tree from the root
16 |     and randomly selecting the path to follow at branch points. Therefore when a branch point
17 |     is reached the set of processes in each subtree will have equal probability of being selected,
18 |     regardless of their size.
19 | 
20 |     This is implemented as a Non-Uniform-Random-Search where child nodes inherit parent weight, divided by the number
21 |     of siblings
22 |     """
23 | 
24 |     def __init__(self, **kwargs):
25 |         super(KLEERandomSearch, self).__init__()
26 |     
27 |     @staticmethod
28 |     def rank(s, reverse=False):
29 |         k = -1 if reverse else 1
30 |         return k * s.globals['weight']
31 |         
32 |     def step(self, simgr, stash='active', **kwargs):
33 |         simgr = simgr.step(stash=stash, **kwargs)
34 |         l.debug('active: %s', simgr.stashes[stash])
35 | 
36 |         # if there's no branch just go on
37 |         if len(simgr.stashes[stash]) == 1:
38 |             return simgr
39 | 
40 |         # if there are no successors randomly pick a new path
41 |         elif len(simgr.stashes[stash]) == 0:
42 |             pass  # weighted choice code is always executed before returning
43 | 
44 |         # if there is more than one successor update the binary tree and randomly pick a new path
45 |         elif len(simgr.stashes[stash]) > 1:
46 |             l.debug(f'{"-" * 0x10}\nStatus:\t\t{simgr} --> active: {simgr.stashes[stash]}')
47 |             # update binary tree
48 |             for s in simgr.stashes[stash]:
49 |                 s.globals['weight'] = s.globals.get('weight', 1) / len(simgr.stashes[stash])
50 |             pass  # weighted choice code is always executed before returning
51 | 
52 |         # randomly pick new path
53 |         simgr.move(from_stash=stash, to_stash='deferred')
54 |         if max([s.globals['weight'] for s in simgr.stashes['deferred']], default=1) < 0.1:
55 |             for s in simgr.stashes['deferred']:
56 |                 s.globals['weight'] *= 10
57 |         n = random.uniform(0, sum([s.globals['weight'] for s in simgr.stashes['deferred']]))
58 |         for s in simgr.stashes['deferred']:
59 |             if n < s.globals['weight']:
60 |                 simgr.stashes['deferred'].remove(s)
61 |                 simgr.stashes[stash] = [s]
62 |                 break
63 |             n = n - s.globals['weight']
64 | 
65 |         return simgr


--------------------------------------------------------------------------------
/ExplorationTechniques/LoopExhaustion/LoopExhaustion.py:
--------------------------------------------------------------------------------
 1 | 
 2 | # https://raw.githubusercontent.com/ucsb-seclab/syml/main/syml/exploration/exploration_techniques/literature/aeg_loop_exhaustion.py
 3 | 
 4 | import logging
 5 | 
 6 | import angr
 7 | from angr.exploration_techniques import ExplorationTechnique
 8 | 
 9 | l = logging.getLogger('LoopExhaustion')
10 | 
11 | 
12 | class AEGLoopExhaustion(ExplorationTechnique):
13 |     """
14 |     Loop Exhaustion. http://security.ece.cmu.edu/aeg/aeg-current.pdf
15 | 
16 |     We propose and use a loop exhaustion search strategy. The loop-exhaustion
17 |     strategy gives higher priority to an interpreter exploring the maximum number
18 |     of loop iterations, hoping that computations involving more iterations
19 |     are more promising to produce bugs like buffer overflows.
20 |     Thus, whenever execution hits a symbolic loop, we try to exhaust the loop, i.e., execute
21 |     it as many times as possible. Exhausting a symbolic loop has two immediate side effects:
22 |     1) on each loop iteration a new interpreter is spawned, effectively causing an explosion
23 |     in the state space, and 2) execution might get 'stuck' in a deep loop.
24 |     To avoid getting stuck, we impose two additional heuristics during loop exhaustion:
25 |     1) we use preconditioned symbolic execution along with pruning to reduce the number of interpreters or
26 |     2) we give higher priority to only one interpreter that tries to fully exhaust the loop,
27 |     while all other interpreters exploring the same loop have the lowest possible priority.
28 |     """
29 | 
30 |     def __init__(self, **kwargs):
31 |         super(AEGLoopExhaustion, self).__init__()
32 |         self.top_count = 0
33 | 
34 |     def setup(self, simgr):
35 |         super(AEGLoopExhaustion, self).setup(simgr=simgr)
36 |         simgr.stashes['active'][0].globals['visits'] = dict()
37 | 
38 |         # setup LoopSeer
39 |         simgr.stashes['active'][0].register_plugin('loop_data', angr.state_plugins.SimStateLoopData())
40 |         simgr.use_technique(angr.exploration_techniques.LoopSeer(bound=10000))
41 | 
42 |     @staticmethod
43 |     def rank(s, reverse=False):
44 |         k = -1 if reverse else 1
45 |         return k * sum([s.loop_data.back_edge_trip_counts[loop[0].entry.addr][-1] for loop in s.loop_data.current_loop])
46 | 
47 |     def step(self, simgr, stash='active', **kwargs):
48 |         simgr = simgr.step(stash=stash, **kwargs)
49 | 
50 |         if len(simgr.stashes[stash]) == 1:
51 |             new_count = self.rank(simgr.stashes[stash][0])
52 |             if new_count > self.top_count or len(simgr.stashes['deferred']) == 0:
53 |                 #l.debug(f'looping!')
54 |                 self.top_count = new_count
55 |             else:
56 |                 #l.debug(f'exhausted or new loop!')  # \t {simgr.stashes[stash][0].loop_data.back_edge_trip_counts}')
57 |                 simgr.move(from_stash=stash, to_stash='deferred')
58 |                 simgr.split(from_stash='deferred', to_stash=stash, state_ranker=self.rank,
59 |                             limit=len(simgr.deferred) - 1)
60 |                 self.top_count = self.rank(simgr.stashes[stash][0])
61 | 
62 |         elif len(simgr.stashes[stash]) == 0:
63 |             #l.debug('exhausted?')
64 |             simgr.split(from_stash='deferred', to_stash=stash, state_ranker=self.rank, limit=len(simgr.deferred) - 1)
65 |             self.top_count = self.rank(simgr.stashes[stash][0])
66 | 
67 |         else:
68 |             counts = simgr.stashes[stash][0].loop_data.back_edge_trip_counts
69 |             for s in simgr.stashes[stash][1:]:
70 |                 if s.loop_data.back_edge_trip_counts != counts:
71 |                     simgr.split(from_stash=stash, to_stash='deferred', state_ranker=lambda s: self.rank(s, reverse=True),
72 |                                 limit=1)
73 |                     self.top_count = self.rank(simgr.stashes[stash][0])
74 | 
75 |                     l.debug(f'{"-" * 0x10}\nStatus:\t\t{simgr} --> active: {simgr.stashes[stash]}')
76 |                     break
77 |             else:
78 |                 #l.debug('one more step..and let\'s see what happens..')
79 |                 pass
80 | 
81 |         return simgr


--------------------------------------------------------------------------------
/ExplorationTechniques/KLEECoverageOptimizeSearch/KLEECoverageOS.py:
--------------------------------------------------------------------------------
  1 | import logging
  2 | import random
  3 | from itertools import cycle
  4 | 
  5 | from angr.exploration_techniques import ExplorationTechnique
  6 | 
  7 | l = logging.getLogger('KLEECoverageOS')
  8 | 
  9 | 
 10 | class KLEECoverageOptimizeSearch(ExplorationTechnique):
 11 |     """
 12 |     Coverage Optimize Search. https://hci.stanford.edu/cstr/reports/2008-03.pdf
 13 | 
 14 |     A strategy which attempts to select states that are likely to cover new code
 15 |     in the immediate future. Heuristics are used to compute a weight for each process
 16 |     and a random process is selected according to these weights.
 17 |     Currently these heuristics use a combination of the minimum distance
 18 |     to an uncovered instruction, taking into account the call stack of the
 19 |     process, and whether the process has recently covered new code.
 20 |     These strategies are composed by selecting from each in a round robin fashion.
 21 |     Although this interleaving may increase the time for a particularly effective
 22 |     strategy to achieve high coverage, it protects the system against cases where
 23 |     one individual strategy would become stuck.
 24 |     Furthermore, because the strategies are always selecting processes from the same pool,
 25 |     using interleaving allows the strategies to interact cooperatively.
 26 |     Finally, once selected each process is run for a "time slice" defined by
 27 |     both a maximum number of instructions and a maximum amount of time.
 28 |     The time to execute an individual instruction can vary widely between
 29 |     simple instructions, like addition, and instructions which may use the
 30 |     constraint solver or fork, like branches or memory accesses.
 31 |     Time-slicing processes helps ensure that a process which is frequently
 32 |     executing expensive instructions will not dominate execution time.
 33 | 
 34 |     This is implemented as a Non-Uniform-Random-Search with interleaved heuristics:
 35 |         1. md2u: minimum distance to uncovered instruction
 36 |         2. covnew: recently covered new code
 37 |     TODO: a time/instruction batch limit may be set (default to false, user-set)
 38 |     """
 39 | 
 40 |     def __init__(self, **kwargs):
 41 |         super(KLEECoverageOptimizeSearch, self).__init__()
 42 |         self.heuristics = cycle(['md2u', 'covnew'])
 43 |         self.curr_heuristic = None
 44 |         self.covered = set()
 45 |         self.cfg = None
 46 | 
 47 |     def setup(self, simgr):
 48 |         super(KLEECoverageOptimizeSearch, self).setup(simgr)
 49 |         self.cfg = simgr._project.analyses.CFGFast(base_state=simgr.one_active, fail_fast=True, normalize=True)
 50 |         
 51 |     def rank(self, s, reverse=False):
 52 |         k = -1 if reverse else 1
 53 |         return k * s.globals[self.curr_heuristic]
 54 | 
 55 |     def step(self, simgr, stash='active', **kwargs):
 56 |         simgr = simgr.step(stash=stash, **kwargs)
 57 | 
 58 |         # if there's no branch: update globals and go on
 59 |         if len(simgr.stashes[stash]) == 1:
 60 |             self.update_globals(simgr.stashes[stash][0])
 61 |             return simgr
 62 | 
 63 |         # if there are no successors: SHARED CODE AFTER IF STMT
 64 |         elif len(simgr.stashes[stash]) == 0:
 65 |             pass
 66 | 
 67 |         # if there is more than one successor: update globals, SHARED CODE AFTER IF STMT
 68 |         elif len(simgr.stashes[stash]) > 1:
 69 |             for state in simgr.stashes[stash]:
 70 |                 self.update_globals(state)
 71 | 
 72 |         # change heuristic
 73 |         self.curr_heuristic = next(self.heuristics)
 74 | 
 75 |         # weighted choice
 76 |         simgr.move(from_stash=stash, to_stash='deferred')
 77 |         n = random.uniform(0, sum([s.globals[self.curr_heuristic] for s in simgr.stashes['deferred']]))
 78 |         for s in simgr.stashes['deferred']:
 79 |             if n < s.globals[self.curr_heuristic]:
 80 |                 simgr.stashes['deferred'].remove(s)
 81 |                 simgr.stashes[stash] = [s]
 82 |                 l.debug(f'{"-" * 0x10}\nStatus:\t\t{simgr} --> active: {simgr.stashes[stash]} [{self.curr_heuristic} {s.globals[self.curr_heuristic]}]')
 83 |                 break
 84 |             n = n - s.globals[self.curr_heuristic]
 85 | 
 86 |         return simgr
 87 | 
 88 |     def update_globals(self, state):
 89 |         # if new, update covered blocks, set insns since new code to 0
 90 |         if state.addr not in self.covered:
 91 |             self.covered.add(state.addr)
 92 |             state.globals['insns_since_new'] = 0
 93 |         # if not new: update insns since new code
 94 |         else:
 95 |             state.globals['insns_since_new'] = state.globals.get('insns_since_new', 0) + state.block().instructions
 96 | 
 97 |         state.globals['covnew'] = 1. / max(1, state.globals['insns_since_new'] - 1000)
 98 |         state.globals['covnew'] *= state.globals['covnew']
 99 |         state.globals['md2u'] = 1. / (min(self.get_md2u(state.addr), 10000) or 1)
100 |         state.globals['md2u'] *= state.globals['md2u']
101 | 
102 |     def get_md2u(self, addr, iter=50):
103 |         if iter == 0:
104 |             return float('inf')
105 | 
106 |         if addr not in self.covered:
107 |             return 0
108 | 
109 |         node = self.cfg.model.get_any_node(addr, anyaddr=True)
110 |         md2u = float('inf')
111 | 
112 |         for succ in set(node.successors):
113 |             md2u = min(md2u, self.get_md2u(succ, iter=iter - 1))
114 | 
115 |         return md2u + node.block.instructions if node.block else 10


--------------------------------------------------------------------------------
/ExplorationTechniques/SimgrViz/SimgrViz.py:
--------------------------------------------------------------------------------
  1 | # pylint: disable=import-error, no-name-in-module
  2 | import angr
  3 | import hashlib
  4 | import os
  5 | import logging
  6 | import networkx
  7 | import time
  8 | import copy
  9 | 
 10 | from typing import List, Set, Dict, Tuple, Optional
 11 | from angr.exploration_techniques import ExplorationTechnique
 12 | from angr import SimState
 13 | from networkx.drawing.nx_agraph import write_dot
 14 | 
 15 | from shutil import which
 16 | 
 17 | l = logging.getLogger("SimgrViz")
 18 | l.setLevel("INFO")
 19 | 
 20 | WDIR = './'
 21 | RET_ADDR = 0xdeadbeef
 22 | 
 23 | class SimgrViz(ExplorationTechnique):
 24 |     '''
 25 |     When plugging this Exploration technique we collect information
 26 |     regarding the SimStates generated by the Simgr.
 27 |     This is a DEBUG ONLY technique that should never be used in production.
 28 |     '''
 29 |     def __init__(self, cfg=None):
 30 |         super(SimgrViz, self).__init__()
 31 |         self._simgrG = networkx.DiGraph()
 32 |         self.cfg = cfg
 33 |         # Boolean guard to understand if this is the initial state or not.
 34 |         self._start = True
 35 |         self._salt = 0
 36 |         self._path_exploration_id = 0
 37 | 
 38 |         # Reference to the taint tracker to extract info
 39 |         self.taint_tracker = None
 40 | 
 41 |         # TODO
 42 |         # Activate the visualization only when _starts_from is reached.
 43 |         self._starts_from = None
 44 |         # De-activate the visualization when _ends_to is reached.
 45 |         self._ends_to = None
 46 | 
 47 |         self.last_seen_id = None
 48 | 
 49 |     def setup(self, simgr):
 50 |         for state in simgr.stashes['active']:
 51 |             state.globals["predecessor"] = None
 52 |             state.globals["path_exploration_id"] = self._path_exploration_id
 53 |         self._path_exploration_id += 1
 54 |         return
 55 | 
 56 |     def get_state_hash(self, state):
 57 |         reg_values = []
 58 |         for r in state.project.arch.register_list:
 59 |             reg_values.append(state.registers.load(r.name))
 60 |         regs = '-'.join([str(x) for x in reg_values ])
 61 |         stack_signature = '-'.join([
 62 |                                    hex(state.callstack.call_site_addr),
 63 |                                    hex(state.callstack.current_return_target),
 64 |                                    hex(state.callstack.current_stack_pointer),
 65 |                                    str(state.callstack.jumpkind),
 66 |                                    hex(state.callstack.ret_addr),
 67 |                                    ])
 68 |         globals_signature = '-'.join([ str(x) for x in state.globals.values()])
 69 |         state_id_sig = str(id(state)) # regs + str(id(state)) # + stack_signature # + globals_signature
 70 |         h = hashlib.sha256()
 71 |         h.update(state_id_sig.encode("utf-8"))
 72 |         h.update(regs.encode("utf-8"))
 73 |         h.update(stack_signature.encode("utf-8"))
 74 |         h.update(globals_signature.encode("utf-8"))
 75 |         if state.globals["predecessor"]:
 76 |             h.update(state.globals["predecessor"].encode("utf-8"))
 77 |         h_hexdigest = h.hexdigest()
 78 |         # Store the signature into the state.
 79 |         state.globals["state_signature"] = h_hexdigest
 80 |         return str(h_hexdigest)
 81 | 
 82 |     def _update_timeout_info(self, timeout_states: List[SimState]):
 83 |         for state in timeout_states:
 84 |             s_sig = state.globals["state_signature"]
 85 |             self._simgrG.nodes[s_sig]["timeout"] = True
 86 | 
 87 |     def _add_state_to_graph(self, parent_state_id:str, sim_state_id:str, state:SimState):
 88 | 
 89 |         self._simgrG.add_node(sim_state_id, state_addr = hex(state.addr))
 90 |         self._simgrG.add_edge(parent_state_id, sim_state_id)
 91 | 
 92 |         if state.addr in self.cfg.functions:
 93 |             self._simgrG.add_node(sim_state_id, state_addr = hex(state.addr), color = "green" if state.project.is_hooked(state.addr) else "yellow",
 94 |                                                 func_name="{}".format(self.cfg.get_any_node(state.addr).name),
 95 |                                                 hooked = True if state.project.is_hooked(state.addr) else False,
 96 |                                                 call_followed = True,
 97 |                                                 path_exploration_id=state.globals["path_exploration_id"])
 98 |         else:
 99 |             self._simgrG.add_node(sim_state_id, state_addr = hex(state.addr), path_exploration_id=state.globals["path_exploration_id"])
100 | 
101 |         self._simgrG.nodes[sim_state_id]['jumpkind'] = state.history.jumpkind
102 | 
103 |         # This can be heavy
104 |         if state.addr != RET_ADDR:
105 |             try:
106 |                 self._simgrG.nodes[sim_state_id]['bb_ins'] = [x.mnemonic for x in state.block().disassembly.insns]
107 |                 self._simgrG.nodes[sim_state_id]['bb_size'] = state.block().size
108 |                 if state.callstack.current_function_address:
109 |                     self._simgrG.nodes[sim_state_id]['callstack_curr_func_addr'] = str(hex(state.callstack.current_function_address))
110 |             except Exception:
111 |                 pass
112 |             
113 |     def _tag_fake_ret(self, state:SimState):
114 |         if state.history.jumpkind == "Ijk_FakeRet":
115 |             self._simgrG.nodes[state.globals["state_signature"]]['call_followed'] = False
116 |             self._simgrG.nodes[state.globals["state_signature"]]['color'] = "red"
117 | 
118 |     def successors(self, simgr, state:SimState, **kwargs):
119 |         succs = simgr.successors(state, **kwargs)
120 |         self._tag_fake_ret(state)
121 |         if self._start:
122 |             assert(not state.globals["predecessor"])
123 |             sim_state_id = self.get_state_hash(state)
124 | 
125 |             if state.addr in self.cfg.functions:
126 |                 if state.project.is_hooked(state.addr):
127 |                     self._simgrG.add_node(sim_state_id, state_addr = hex(state.addr), color = "green",
128 |                                                         hooked = True,
129 |                                                         func_name="{}".format(self.cfg.get_any_node(state.addr).name))
130 |                 else:
131 |                     self._simgrG.add_node(sim_state_id, state_addr = hex(state.addr), color = "yellow",
132 |                                                         hooked = False,
133 |                                                         func_name="{}".format(self.cfg.get_any_node(state.addr).name))
134 |             else:
135 |                 self._simgrG.add_node(sim_state_id, state_addr = hex(state.addr))
136 | 
137 |             self._start = False
138 | 
139 |         self._path_exploration_id += 1
140 | 
141 |         for succ_state in succs.flat_successors:
142 |             succ_state.globals["predecessor"] = state.globals["state_signature"]
143 |             parent_state_id = succ_state.globals["predecessor"]
144 |             succ_state.globals["path_exploration_id"] = self._path_exploration_id
145 |             sim_state_id = self.get_state_hash(succ_state)
146 | 
147 |             self._add_state_to_graph(parent_state_id, sim_state_id, succ_state)
148 | 
149 |         return succs
150 | 
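
The node identifiers used by `SimgrViz` chain each state's hash with the signature of its predecessor, so two states at the same address on different paths get different graph nodes. Below is a minimal standalone sketch of that idea, assuming the state is reduced to a list of printable fields. The helper name `state_signature` and the addresses are hypothetical, and unlike `get_state_hash` the sketch does not mix in `id(state)`, registers, or the callstack.

```python
import hashlib


def state_signature(fields, predecessor=None):
    """Hash the given fields, then fold in the predecessor's signature if there is one."""
    h = hashlib.sha256()
    for field in fields:
        h.update(str(field).encode("utf-8"))
    if predecessor:
        h.update(predecessor.encode("utf-8"))
    return h.hexdigest()


# Build a tiny parent -> children mapping, the same shape as the edges SimgrViz records.
edges = {}
root = state_signature(["0x401000"])
left = state_signature(["0x401010"], predecessor=root)
right = state_signature(["0x401020"], predecessor=root)
edges[root] = [left, right]

# The same address reached through a different parent yields a different node id.
same_addr_other_path = state_signature(["0x401010"], predecessor=right)
print(left != same_addr_other_path)
```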


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
 1 | # Awesome angr [![Awesome](https://cdn.rawgit.com/sindresorhus/awesome/d7305f38d29fed78fa85652e3a63e154dd8e8829/media/badge.svg)](https://github.com/sindresorhus/awesome)
 2 | 
 3 | A collection of resources/tools and analyses for the [angr](https://github.com/angr) binary analysis framework.
 4 | This page does not only collect links and external resources; it is also meant to be a harbour for releasing any non-official extensions, tools, and utilities that can be useful when working with angr.
 5 | 
 6 | ## ExplorationTechniques 📁
 7 | 
 8 | A collection of exploration techniques written by the community
 9 | 
10 | * *SimgrViz*: an exploration technique that collects information regarding the states generated by the SimulationManager and creates a graph (a .dot file) that can be visualized later to debug the analysis.
11 | * *MemLimiter*: an exploration technique to stop the analysis when memory consumption is too high!
12 | * *ExplosionDetector*: stop the analysis when there are too many states or other critical errors happen.
13 | * *KLEECoverageOptimizeSearch*: KLEE technique to improve coverage. 
14 | * *KLEERandomSearch*: an ET for random path selection.
15 | * *LoopExhaustion*: a loop exhaustion search strategy.
16 | * *StochasticSearch*: an ET for stochastic search of active states.
17 | * *HeartBeat*: an exploration technique that checks that symbolic execution is still alive and provides utilities to gently hijack the DSE while it is running.
18 | 
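
As a rough sketch of how any of the techniques above is plugged in: angr's `SimulationManager` exposes `use_technique`, which takes an instantiated `ExplorationTechnique`. The helper names `attach_techniques` and `explore` below are hypothetical, and the constructor arguments of the techniques in this repository are not shown because they differ per technique.

```python
def attach_techniques(simgr, techniques):
    """Register each exploration technique on the manager, in order."""
    for technique in techniques:
        simgr.use_technique(technique)
    return simgr


def explore(binary_path, techniques):
    """Load a binary, attach the techniques, and step until no state is active."""
    import angr  # imported lazily so the helper above works without angr installed

    project = angr.Project(binary_path, auto_load_libs=False)
    state = project.factory.entry_state()
    simgr = project.factory.simulation_manager(state)
    attach_techniques(simgr, techniques)
    simgr.run()
    return simgr
```

For example, `explore("./a.out", [angr.exploration_techniques.DFS()])` would run a depth-first exploration with a technique that ships with angr (the path `./a.out` is a placeholder). Instances of the techniques listed here would be passed in the same way.
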
19 | ## Documentation :book:
20 | * [docs.angr.io](https://docs.angr.io/) - Official angr general documentation website.
21 | * [angr.io](http://angr.io/api-doc/angr.html) - Official angr API documentation.
22 | * [Intro to Binary Analysis with Z3 and angr](https://github.com/FSecureLABS/z3_and_angr_binary_analysis_workshop) - FSecureLABS workshop on using Z3 and the angr framework.
23 | 
24 | ## Projects :rocket:
25 | 
26 | List of academic and non-academic projects based on angr whose code is open source.
27 | 
28 | * [Heaphopper](https://github.com/angr/heaphopper) - Apply symbolic execution to automatically verify security properties of most common heap libraries.
29 | * [angr-cli](https://github.com/fmagin/angr-cli) - Command line interface for angr a la peda/GEF/pwndbg.
30 | * [Syml](https://github.com/ucsb-seclab/syml) - Use ML to prioritize exploration of promising vulnerable paths.
31 | * [Angrop](https://github.com/angr/angrop) - Generate ropchains using angr and symbolic execution.
32 | * [Angr-management](https://github.com/angr/angr-management) - GUI for angr.
33 | * [Mechaphish](https://github.com/mechaphish) - AEG system for CGC.
34 | * [angr-static-analysis-for-vuzzer64](https://github.com/ash09/angr-static-analysis-for-vuzzer64) - angr-based static analysis module for Vuzzer.
35 | * [FirmXRay-angr](https://github.com/ucsb-seclab/monolithic-firmware-collection/tree/master/utils/firmxray) - An angr version of the base address detection analysis implemented in [FirmXRay](https://github.com/OSUSecLab/FirmXRay).
36 | * [IVTSpotter](https://github.com/ucsb-seclab/monolithic-firmware-collection/blob/master/utils/ivt_spotter/spot_ivt.py) - An IVT Spotter for monolithic ARM firmware images.
37 | * [MemSight](https://github.com/season-lab/memsight) - Rethinking Pointer Reasoning in Symbolic Execution.
38 | * [Karonte](https://github.com/ucsb-seclab/karonte) - Detecting Insecure Multi-binary Interactions in Embedded Firmware.
39 | * [BootStomp](https://github.com/ucsb-seclab/BootStomp) - A bootloader vulnerability finder.
40 | * [SaTC](https://github.com/NSSL-SJTU/SaTC/) - A prototype of Shared-keywords aware Taint Checking(SaTC), a static analysis method that tracks user input between front-end and back-end for vulnerability discovery effectively and efficiently.
41 | * [Arbiter](https://github.com/jkrshnmenon/arbiter) - Arbiter is a combination of static and dynamic analyses, built on top of angr, that can be used to detect some vulnerability classes.
42 | ## Blogposts :newspaper:
43 | * [angr-blog](https://angr.io/) - Official angr blog.
44 | * [A reaching definition engine for binary analysis built-in in angr.](https://degrigis.github.io/posts/angr_rd/) - A walk-through of the ReachingDefinition analysis built into angr.
45 | * [shellphish-phrack](http://phrack.org/papers/cyber_grand_shellphish.html) - Phrack article on [Mechaphish](https://github.com/mechaphish), the AEG system based on angr that got 3rd place at the CGC.
46 | * [angr-tutorial](https://blog.notso.pro/2019-03-20-angr-introduction-part0/) - Introduction to angr - baby steps in symbolic execution.
47 | * [bcheck](https://github.com/ChrisTheCoolHut/bcheck) - Binary check tool to identify command injection and format string vulnerabilities in blackbox binaries.
48 | 
49 | ## Papers :page_with_curl:
50 | 
51 | Here is a collection of papers that use angr or whose projects are based on it.
52 | 
53 | | Year       | Paper     | 
54 | | :------------- | :----------: | 
55 | | 2022 | [Heapster: Analyzing the Security of Dynamic Allocators for Monolithic Firmware Images](https://degrigis.github.io/bins/heapster.pdf)
56 | | 2022 | [Arbiter: Bridging the Static and Dynamic Divide in Vulnerability Discovery on Binary Programs](https://www.s3.eurecom.fr/docs/usenixsec22_arbiter.pdf)
57 | | 2022 | [Ferry: State-Aware Symbolic Execution for Exploring State-Dependent Program Paths](https://www.usenix.org/system/files/sec22summer_zhou-shunfan.pdf)
58 | | 2022 | [Fuzzware: Using Precise MMIO Modeling for Effective Firmware Fuzzing](https://sites.cs.ucsb.edu/~vigna/publications/2022_USENIXSecurity_Fuzzware.pdf)
59 | | 2021 | [Jetset: Targeted Firmware Rehosting for Embedded Systems](https://www.usenix.org/system/files/sec21fall-johnson.pdf)
60 | | 2021 | [SoK: All You Ever Wanted to Know About x86/x64 Binary Disassembly But Were Afraid to Ask](https://www.portokalidis.net/files/sok86disas_oakland21.pdf)
61 | | 2021 | [SyML: Guiding Symbolic Execution Toward Vulnerable States Through Pattern Learning](https://conand.me/publications/ruaro-syml-2021.pdf)
62 | | 2021 | [DIANE: Identifying Fuzzing Triggers in Apps to Generate Under-constrained Inputs for IoT Devices](https://conand.me/publications/redini-diane-2021.pdf)
63 | | 2021 | [Sharing More and Checking Less: Leveraging Common Input Keywords to Detect Bugs in Embedded Systems](https://www.usenix.org/system/files/sec21fall-chen-libo.pdf)
64 | | 2021 | [Boosting symbolic execution via constraint solving time prediction (experience paper)](https://dl.acm.org/doi/10.1145/3460319.3464813)
65 | | 2020 | [DICE: Automatic Emulation of DMA Input Channels for Dynamic Firmware Analysis](https://arxiv.org/pdf/2007.01502.pdf)
66 | | 2020 | [Towards Constant-Time Foundations for the New Spectre Era](https://cseweb.ucsd.edu/~cdisselk/papers/ct-foundations.pdf)
67 | | 2020 | [Symbion: Interleaving Symbolic with Concrete Execution](https://sites.cs.ucsb.edu/~vigna/publications/2020_CNS_Symbion.pdf) |
68 | | 2020 | [KARONTE: Detecting Insecure Multi-binary Interactions in Embedded Firmware](https://www.badnack.it/static/papers/University/karonte.pdf) | 
69 | | 2020 | [Device-agnostic Firmware Execution is Possible: A Concolic Execution Approach for Peripheral Emulation](https://dl.acm.org/doi/10.1145/3427228.3427280) |
70 | | 2020 | [KOOBE: Towards Facilitating Exploit Generation of Kernel Out-Of-Bounds Write Vulnerabilities](https://www.usenix.org/system/files/sec20summer_chen-weiteng_prepub.pdf)
71 | | 2020 | [BugMiner: Mining the Hard-to-Reach Software Vulnerabilities through the Target-Oriented Hybrid Fuzzer](https://www.mdpi.com/2079-9292/10/1/62/pdf)
72 | | 2019 | [Enhancing Symbolic Execution by Machine Learning Based Solver Selection](https://www.csie.ntu.edu.tw/~hchsiao/pub/2019_BAR.pdf)
73 | | 2019 | [BinTrimmer: Towards Static Binary Debloating Through Abstract Interpretation](https://sites.cs.ucsb.edu/~chris/research/doc/dimva19_bintrimmer.pdf)
74 | | 2019 | [Sleak: Automating Address Space Layout Derandomization](https://par.nsf.gov/servlets/purl/10155109)
75 | | 2019 | [Time and Order: Towards Automatically Identifying Side-Channel Vulnerabilities in Enclave Binaries](https://www.usenix.org/conference/raid2019/presentation/wang-wubing)
76 | | 2018 | [HeapHopper: Bringing Bounded Model Checking to Heap Implementation Security](https://sites.cs.ucsb.edu/~chris/research/doc/usenix18_heaphopper.pdf)
77 | | 2018 | [Efficient Extraction of Malware Signatures Through System Calls and Symbolic Execution: An Experience Report](https://hal.inria.fr/hal-01954483/document)
78 | | 2018 | [Dynamic Path Pruning in Symbolic Execution](https://www.csie.ntu.edu.tw/~hchsiao/pub/2018_IEEE_DSC.pdf)
79 | | 2018 | [On Benchmarking the Capability of Symbolic Execution Tools with Logic Bombs](https://arxiv.org/pdf/1712.01674.pdf)
80 | | 2017 | [Rethinking Pointer Reasoning in Symbolic Execution](https://github.com/season-lab/memsight/raw/master/publications/memsight-ase17.pdf)
81 | | 2017 | [Your Exploit is Mine: Automatic Shellcode Transplant for Remote Exploits](https://www.ieee-security.org/TC/SP2017/papers/579.pdf)
82 | | 2017 | [BOOMERANG: Exploiting the Semantic Gap in Trusted Execution Environments](https://sites.cs.ucsb.edu/~vigna/publications/2017_NDSS_Boomerang.pdf)
83 | | 2017 | [Ramblr: Making Reassembly Great Again](https://sefcom.asu.edu/publications/ramblr-making-reassembly-great-again-ndss2017.pdf)
84 | | 2017 | [BootStomp: On the Security of Bootloaders in Mobile Devices](https://www.usenix.org/system/files/conference/usenixsecurity17/sec17-redini.pdf) |
85 | | 2017 | [Piston: Uncooperative Remote Runtime Patching](https://sefcom.asu.edu/publications/piston-uncooperative-remote-runtime-patching-acsac2017.pdf)
86 | | 2016 | [SoK: (State of) The Art of War: Offensive Techniques in Binary Analysis](https://sites.cs.ucsb.edu/~vigna/publications/2016_SP_angrSoK.pdf)
87 | | 2016 | [Driller: Augmenting Fuzzing Through Selective Symbolic Execution](https://sites.cs.ucsb.edu/~chris/research/doc/ndss16_driller.pdf)
88 | | 2015 | [Firmalice - Automatic Detection of Authentication Bypass Vulnerabilities in Binary Firmware](https://sites.cs.ucsb.edu/~chris/research/doc/ndss15_firmalice.pdf) |
89 | 
90 | 
91 | 
92 | 


--------------------------------------------------------------------------------