├── .hgignore
├── stegosaurus
├── .hg_archival.txt
├── CONTRIBUTORS.md
├── sample.py
├── LICENSE
├── .gitignore
├── stegosaurus.py
└── README.md

/.hgignore:
--------------------------------------------------------------------------------
syntax: glob

__pycache__
--------------------------------------------------------------------------------
/stegosaurus:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AngelKitty/stegosaurus/HEAD/stegosaurus
--------------------------------------------------------------------------------
/.hg_archival.txt:
--------------------------------------------------------------------------------
repo: 963c30ed858be61e345394d1e093ebf970f3fcc0
node: cd5c2373c031d80a2ef01ce53a8da69430b6b19c
branch: default
latesttag: null
latesttagdistance: 12
changessincelatesttag: 12

--------------------------------------------------------------------------------
/CONTRIBUTORS.md:
--------------------------------------------------------------------------------
### jherron

* Initial drop

### S0lll0s

* Added --explode option to prevent placing the payload in long runs of opcodes that do not take an argument
  as this can lead to exposure of the payload through tools like ```strings```
--------------------------------------------------------------------------------
/sample.py:
--------------------------------------------------------------------------------
"""Example carrier file to embed our payload in.
"""

from math import sqrt


def fib_v1(n):
    if n == 0 or n == 1:
        return n
    return fib_v1(n - 1) + fib_v1(n - 2)


def fib_v2(n):
    if n == 0 or n == 1:
        return n
    return int(((1 + sqrt(5))**n - (1 - sqrt(5))**n) / (2**n * sqrt(5)))


def main():
    result1 = fib_v1(12)
    result2 = fib_v2(12)

    print(result1)
    print(result2)

if __name__ == "__main__":
    main()

--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
ISC License

Copyright (c) Jon Herron, 2017

Permission to use, copy, modify, and/or distribute this software for any
purpose with or without fee is hereby granted, provided that the above
copyright notice and this permission notice appear in all copies.

THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES WITH
REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY
AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT,
INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM
LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE
OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR
PERFORMANCE OF THIS SOFTWARE.
--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
.hypothesis/
.pytest_cache/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
target/

# Jupyter Notebook
.ipynb_checkpoints

# pyenv
.python-version

# celery beat schedule file
celerybeat-schedule

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/

--------------------------------------------------------------------------------
/stegosaurus.py:
--------------------------------------------------------------------------------
import argparse
import logging
import marshal
import opcode
import os
import py_compile
import sys
import math
import string
import types

if sys.version_info < (3, 6):
    sys.exit("Stegosaurus requires Python 3.6 or later")


class MutableBytecode():
    """Editable copy of a code object; nested code objects are wrapped recursively."""

    def __init__(self, code):
        self.originalCode = code
        self.bytes = bytearray(code.co_code)
        self.consts = [MutableBytecode(const) if isinstance(const, types.CodeType) else const for const in code.co_consts]


def _bytesAvailableForPayload(mutableBytecodeStack, explodeAfter, logger=None):
    """Yield (bytearray, index) for every argument byte of an opcode that takes no argument."""
    for mutableBytecode in reversed(mutableBytecodeStack):
        bytes = mutableBytecode.bytes
        consecutivePrintableBytes = 0
        for i in range(0, len(bytes)):
            if chr(bytes[i]) in string.printable:
                consecutivePrintableBytes += 1
            else:
                consecutivePrintableBytes = 0

            # Opcodes sit at even offsets; the byte after a no-argument opcode is unused.
            if i % 2 == 0 and bytes[i] < opcode.HAVE_ARGUMENT:
                if consecutivePrintableBytes >= explodeAfter:
                    if logger:
                        logger.debug("Skipping available byte to terminate string leak")
                    consecutivePrintableBytes = 0
                    continue
                yield (bytes, i + 1)


def _createMutableBytecodeStack(mutableBytecode):
    def _stack(parent, stack):
        stack.append(parent)

        for child in [const for const in parent.consts if isinstance(const, MutableBytecode)]:
            _stack(child, stack)

        return stack

    return _stack(mutableBytecode, [])


def _dumpBytecode(header, code, carrier, logger):
    # A context manager closes the file even on error and avoids referencing
    # an unbound file handle when open() itself fails.
    with open(carrier, "wb") as f:
        f.write(header)
        marshal.dump(code, f)
        logger.info("Wrote carrier file as %s", carrier)


def _embedPayload(mutableBytecodeStack, payload, explodeAfter, logger):
    payloadBytes = bytearray(payload, "utf8")
    payloadIndex = 0
    payloadLen = len(payloadBytes)

    for bytes, byteIndex in _bytesAvailableForPayload(mutableBytecodeStack, explodeAfter):
        if payloadIndex < payloadLen:
            bytes[byteIndex] = payloadBytes[payloadIndex]
            payloadIndex += 1
        else:
            # Zero the unused slots so extraction can stop at the first zero byte.
            bytes[byteIndex] = 0

    print("Payload embedded in carrier")


def _extractPayload(mutableBytecodeStack, explodeAfter, logger):
    payloadBytes = bytearray()

    for bytes, byteIndex in _bytesAvailableForPayload(mutableBytecodeStack, explodeAfter):
        byte = bytes[byteIndex]
        if byte == 0:
            break
        payloadBytes.append(byte)

    payload = str(payloadBytes, "utf8")

    print("Extracted payload: {}".format(payload))


def _getCarrierFile(args, logger):
    carrier = args.carrier
    _, ext = os.path.splitext(carrier)

    if ext == ".py":
        carrier = py_compile.compile(carrier, doraise=True)
        logger.info("Compiled %s as %s for use as carrier", args.carrier, carrier)

    return carrier


def _initLogger(args):
    handler = logging.StreamHandler()
    handler.setFormatter(logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s'))

    logger = logging.getLogger("stegosaurus")
    logger.addHandler(handler)

    if args.verbose:
        if args.verbose == 1:
            logger.setLevel(logging.INFO)
        else:
            logger.setLevel(logging.DEBUG)

    return logger


def _loadBytecode(carrier, logger):
    # The pyc header is 12 bytes in Python 3.6 and 16 bytes from 3.7 on (PEP 552).
    headerSize = 16 if sys.version_info >= (3, 7) else 12

    with open(carrier, "rb") as f:
        header = f.read(headerSize)
        code = marshal.load(f)
        logger.debug("Read header and bytecode from carrier")

    return (header, code)


def _logBytesAvailableForPayload(mutableBytecodeStack, explodeAfter, logger):
    for bytes, i in _bytesAvailableForPayload(mutableBytecodeStack, explodeAfter, logger):
        logger.debug("%s (%d)", opcode.opname[bytes[i - 1]],
                     bytes[i])


def _maxSupportedPayloadSize(mutableBytecodeStack, explodeAfter, logger):
    maxPayloadSize = 0

    for bytes, i in _bytesAvailableForPayload(mutableBytecodeStack, explodeAfter):
        maxPayloadSize += 1

    logger.info("Found %d bytes available for payload", maxPayloadSize)

    return maxPayloadSize


def _parseArgs():
    argParser = argparse.ArgumentParser()
    argParser.add_argument("carrier", help="Carrier py, pyc or pyo file")
    argParser.add_argument("-p", "--payload", help="Embed payload in carrier file")
    argParser.add_argument("-r", "--report", action="store_true", help="Report max available payload size carrier supports")
    argParser.add_argument("-s", "--side-by-side", action="store_true", help="Do not overwrite carrier file, install side by side instead.")
    argParser.add_argument("-v", "--verbose", action="count", help="Increase verbosity once per use")
    argParser.add_argument("-x", "--extract", action="store_true", help="Extract payload from carrier file")
    argParser.add_argument("-e", "--explode", type=int, default=math.inf, help="Explode payload into groups of a limited length if necessary")
    args = argParser.parse_args()

    return args


def _toCodeType(mutableBytecode):
    return types.CodeType(
        mutableBytecode.originalCode.co_argcount,
        mutableBytecode.originalCode.co_kwonlyargcount,
        mutableBytecode.originalCode.co_nlocals,
        mutableBytecode.originalCode.co_stacksize,
        mutableBytecode.originalCode.co_flags,
        bytes(mutableBytecode.bytes),
        tuple([_toCodeType(const) if isinstance(const, MutableBytecode) else const for const in mutableBytecode.consts]),
        mutableBytecode.originalCode.co_names,
        mutableBytecode.originalCode.co_varnames,
        mutableBytecode.originalCode.co_filename,
        mutableBytecode.originalCode.co_name,
        mutableBytecode.originalCode.co_firstlineno,
        mutableBytecode.originalCode.co_lnotab,
        mutableBytecode.originalCode.co_freevars,
        mutableBytecode.originalCode.co_cellvars
    )


def _validateArgs(args, logger):
    def _exit(msg):
        msg = "Fatal error: {}\nUse -h or --help for usage".format(msg)
        sys.exit(msg)

    allowedCarriers = {".py", ".pyc", ".pyo"}

    _, ext = os.path.splitext(args.carrier)

    if ext not in allowedCarriers:
        _exit("Carrier file must be one of the following types: {}, got: {}".format(allowedCarriers, ext))

    if args.payload is None:
        if not args.report and not args.extract:
            _exit("Unless -r or -x are specified, a payload is required")

    if args.extract or args.report:
        if args.payload:
            logger.warning("Payload is ignored when -x or -r is specified")
        if args.side_by_side:
            logger.warning("Side by side is ignored when -x or -r is specified")

    # Compare directly: the previous truthiness test let "-e 0" slip through.
    if args.explode < 1:
        _exit("Values for -e must be positive integers")

    logger.debug("Validated args")


def main():
    args = _parseArgs()
    logger = _initLogger(args)

    _validateArgs(args, logger)

    carrier = _getCarrierFile(args, logger)
    header, code = _loadBytecode(carrier, logger)

    mutableBytecode = MutableBytecode(code)
    mutableBytecodeStack = _createMutableBytecodeStack(mutableBytecode)
    _logBytesAvailableForPayload(mutableBytecodeStack, args.explode, logger)

    if args.extract:
        _extractPayload(mutableBytecodeStack, args.explode, logger)
        return

    maxPayloadSize = _maxSupportedPayloadSize(mutableBytecodeStack, args.explode, logger)

    if args.report:
        print("Carrier can support a payload of {} bytes".format(maxPayloadSize))
        return

    # Measure the encoded length, since the payload is embedded as UTF-8 bytes.
    payloadLen = len(args.payload.encode("utf8"))
    if payloadLen > maxPayloadSize:
        sys.exit("Carrier can only support a payload of {} bytes, payload of {} bytes received".format(maxPayloadSize, payloadLen))

    _embedPayload(mutableBytecodeStack, args.payload, args.explode, logger)
    _logBytesAvailableForPayload(mutableBytecodeStack, args.explode, logger)

    if args.side_by_side:
        logger.debug("Creating new carrier file name for side-by-side install")
        base, ext = os.path.splitext(carrier)
        carrier = "{}-stegosaurus{}".format(base, ext)

    code = _toCodeType(mutableBytecode)

    _dumpBytecode(header, code, carrier, logger)


if __name__ == "__main__":
    main()

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# Stegosaurus
## A steganography tool for embedding payloads within Python bytecode.

Stegosaurus is a [steganography tool](https://en.wikipedia.org/wiki/Steganography)
that allows embedding arbitrary payloads in
Python bytecode (pyc or pyo) files. The embedding process does not alter the
runtime behavior or file size of the carrier file and typically results in a low
encoding density. The payload is dispersed throughout the bytecode so tools like
```strings``` will not show the actual payload. Python's ```dis``` module will
return the same results for bytecode before and after Stegosaurus is used to embed
a payload. At this time, no prior work or detection methods are known for this type
of payload delivery.

Stegosaurus requires Python 3.6 or later.

#### Usage

    $ python3 -m stegosaurus -h
    usage: stegosaurus.py [-h] [-p PAYLOAD] [-r] [-s] [-v] [-x] carrier

    positional arguments:
      carrier               Carrier py, pyc or pyo file

    optional arguments:
      -h, --help            show this help message and exit
      -p PAYLOAD, --payload PAYLOAD
                            Embed payload in carrier file
      -r, --report          Report max available payload size carrier supports
      -s, --side-by-side    Do not overwrite carrier file, install side by side
                            instead.
      -v, --verbose         Increase verbosity once per use
      -x, --extract         Extract payload from carrier file

#### Example

Assume we wish to embed a payload in the bytecode of the following Python script, named example.py:

    """Example carrier file to embed our payload in.
    """

    import math

    def fibV1(n):
        if n == 0 or n == 1:
            return n
        return fibV1(n - 1) + fibV1(n - 2)

    def fibV2(n):
        if n == 0 or n == 1:
            return n
        return int(((1 + math.sqrt(5))**n - (1 - math.sqrt(5))**n) / (2**n * math.sqrt(5)))

    def main():
        result1 = fibV1(12)
        result2 = fibV2(12)

        print(result1)
        print(result2)

    if __name__ == "__main__":
        main()


The first step is to use Stegosaurus to see how many bytes our payload can contain without
changing the size of the carrier file.

    $ python3 -m stegosaurus example.py -r
    Carrier can support a payload of 20 bytes

We can now safely embed a payload of up to 20 bytes.
To help show the before and after, the
```-s``` option can be used to install the carrier file side by side with the untouched
bytecode:

    $ python3 -m stegosaurus example.py -s --payload "root pwd: 5+3g05aW"
    Payload embedded in carrier

Looking on disk, both the carrier file and original bytecode file have the same size:

    $ ls -l __pycache__/example.cpython-36*
    -rw-r--r--  1 jherron  staff  743 Mar 10 00:58 __pycache__/example.cpython-36-stegosaurus.pyc
    -rw-r--r--  1 jherron  staff  743 Mar 10 00:58 __pycache__/example.cpython-36.pyc

_Note: If the ```-s``` option is omitted, the original bytecode is overwritten._

The payload can be extracted by passing the ```-x``` option to Stegosaurus:

    $ python3 -m stegosaurus __pycache__/example.cpython-36-stegosaurus.pyc -x
    Extracted payload: root pwd: 5+3g05aW

The payload does not have to be an ASCII string; shellcode is also supported:

    $ python3 -m stegosaurus example.py -s --payload "\xeb\x2a\x5e\x89\x76"
    Payload embedded in carrier

    $ python3 -m stegosaurus __pycache__/example.cpython-36-stegosaurus.pyc -x
    Extracted payload: \xeb\x2a\x5e\x89\x76

To show that the runtime behavior of the Python code remains the same after Stegosaurus
embeds the payload:

    $ python3 example.py
    144
    144

    $ python3 __pycache__/example.cpython-36.pyc
    144
    144

    $ python3 __pycache__/example.cpython-36-stegosaurus.pyc
    144
    144

Output of ```strings``` after Stegosaurus embeds the payload (notice the payload is
not shown):

    $ python3 -m stegosaurus example.py -s --payload "PAYLOAD_IS_HERE"
    Payload embedded in carrier

    $ strings __pycache__/example.cpython-36-stegosaurus.pyc
    .Example carrier file to embed our payload in.
    fibV1)
    example.pyr
    math
    sqrt)
    fibV2
    print)
    result1
    result2r
    main
    __main__)
    __doc__r

    __name__r


    $ python3 -m stegosaurus __pycache__/example.cpython-36-stegosaurus.pyc -x
    Extracted payload: PAYLOAD_IS_HERE

Sample output of Python's ```dis``` module, which shows no difference before and after
Stegosaurus embeds its payload:

Before:

    20 LOAD_GLOBAL              0 (int)
    22 LOAD_CONST               2 (1)
    24 LOAD_GLOBAL              1 (math)
    26 LOAD_ATTR                2 (sqrt)
    28 LOAD_CONST               3 (5)
    30 CALL_FUNCTION            1
    32 BINARY_ADD
    34 LOAD_FAST                0 (n)
    36 BINARY_POWER
    38 LOAD_CONST               2 (1)
    40 LOAD_GLOBAL              1 (math)
    42 LOAD_ATTR                2 (sqrt)
    44 LOAD_CONST               3 (5)
    46 CALL_FUNCTION            1
    48 BINARY_SUBTRACT
    50 LOAD_FAST                0 (n)
    52 BINARY_POWER
    54 BINARY_SUBTRACT
    56 LOAD_CONST               4 (2)

After:

    20 LOAD_GLOBAL              0 (int)
    22 LOAD_CONST               2 (1)
    24 LOAD_GLOBAL              1 (math)
    26 LOAD_ATTR                2 (sqrt)
    28 LOAD_CONST               3 (5)
    30 CALL_FUNCTION            1
    32 BINARY_ADD
    34 LOAD_FAST                0 (n)
    36 BINARY_POWER
    38 LOAD_CONST               2 (1)
    40 LOAD_GLOBAL              1 (math)
    42 LOAD_ATTR                2 (sqrt)
    44 LOAD_CONST               3 (5)
    46 CALL_FUNCTION            1
    48 BINARY_SUBTRACT
    50 LOAD_FAST                0 (n)
    52 BINARY_POWER
    54 BINARY_SUBTRACT
    56 LOAD_CONST               4 (2)


#### Using Stegosaurus

Payloads, delivery and receipt methods are entirely up to the user. Stegosaurus only
provides the means to embed and extract payloads from a given Python bytecode file.
Because the file size is left intact, relatively few bytes can be used to
deliver the payload.
This may require spreading larger payloads across multiple bytecode
files, which has some advantages such as:

* Delivering a payload in pieces over time
* Portions of the payload can be spread over multiple locations and joined when needed
* A single portion being compromised does not divulge the whole payload
* Thwarting detection of the entire payload by spreading it across multiple seemingly unrelated files

Spreading large payloads across multiple Python bytecode files is not supported
at the moment; see TODOs.

#### How Stegosaurus Works

In order to embed a payload without increasing the file size, dead zones need to be identified
within the bytecode. A dead zone is defined as any byte which, if changed, will not impact the
behavior of the Python script. Python 3.6 introduced easy-to-exploit dead zones. First,
though, a little history to set the stage.

Python's reference interpreter, CPython, has two types of opcodes: those with arguments and
those without. In Python <= 3.5, instructions in the bytecode occupied either 1 or 3 bytes,
depending on whether the opcode took an argument. In Python 3.6 this was changed so that
all instructions occupy two bytes. Those without arguments simply set the second byte to zero
and it is ignored during execution. This means that for each instruction in the bytecode that
does not take an argument, Stegosaurus can safely insert one byte of the payload.

Some examples of opcodes that do not take an argument:

    BINARY_SUBTRACT
    INPLACE_ADD
    RETURN_VALUE
    GET_ITER
    YIELD_VALUE
    IMPORT_STAR
    END_FINALLY
    NOP
    ...
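
A list like the one above can be generated from the interpreter itself. The following is a minimal sketch that assumes only the standard library's ```opcode``` module and the same ```opcode.HAVE_ARGUMENT``` threshold that ```stegosaurus.py``` tests against. The helper name ```opcodes_without_argument``` is made up for this example and is not part of Stegosaurus, and the names it prints vary between Python versions.

```python
import opcode


def opcodes_without_argument():
    """Return the names of all opcodes numbered below opcode.HAVE_ARGUMENT.

    Hypothetical helper for illustration only. CPython numbers the opcodes
    that take no argument below this threshold.
    """
    return [
        name
        for number, name in enumerate(opcode.opname)
        # Unassigned opcode numbers carry placeholder names such as "<42>".
        if number < opcode.HAVE_ARGUMENT and not name.startswith("<")
    ]


if __name__ == "__main__":
    for name in opcodes_without_argument():
        print(name)
```

Every instruction in a carrier that uses one of these opcodes contributes one candidate byte for the payload.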

To see an example of the changes in the bytecode, consider the following Python snippet:

    def test(n):
        return n + 5 + n - 3

Using ```dis``` with Python < 3.6 shows:

     0 LOAD_FAST                0 (n)
     3 LOAD_CONST               1 (5)    <-- opcodes with an arg take 3 bytes
     6 BINARY_ADD                        <-- opcodes without an arg take 1 byte
     7 LOAD_FAST                0 (n)
    10 BINARY_ADD
    11 LOAD_CONST               2 (3)
    14 BINARY_SUBTRACT
    15 RETURN_VALUE

    # :( no easy bytes to embed a payload

However, with Python 3.6:

     0 LOAD_FAST                0 (n)
     2 LOAD_CONST               1 (5)    <-- all opcodes now occupy two bytes
     4 BINARY_ADD                        <-- opcodes without an arg leave 1 byte for the payload
     6 LOAD_FAST                0 (n)
     8 BINARY_ADD
    10 LOAD_CONST               2 (3)
    12 BINARY_SUBTRACT
    14 RETURN_VALUE

    # :) easy bytes to embed a payload

Passing ```-vv``` to Stegosaurus, we can see how the payload is embedded in these dead zones:

    $ python3 -m stegosaurus ../python_tests/loop.py -s -p "ABCDE" -vv
    Read header and bytecode from carrier
    BINARY_ADD (0)
    BINARY_ADD (0)
    BINARY_SUBTRACT (0)
    RETURN_VALUE (0)
    RETURN_VALUE (0)
    Found 5 bytes available for payload
    Payload embedded in carrier
    BINARY_ADD (65) <-- A
    BINARY_ADD (66) <-- B
    BINARY_SUBTRACT (67) <-- C
    RETURN_VALUE (68) <-- D
    RETURN_VALUE (69) <-- E

_Timestamps and debug levels removed from logs for readability_

Currently this is the only dead zone that Stegosaurus exploits. Future improvements include
more dead zone identification as mentioned in the TODOs.
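
The dead-zone technique described in this section can be sketched in a few lines. This is a simplified illustration and not the code in ```stegosaurus.py```: it assumes a raw byte string laid out as two-byte instructions (the Python 3.6 format), ignores nested code objects and the ```--explode``` option, and the names ```dead_zone_offsets```, ```embed``` and ```extract``` are hypothetical. Like the tool, it treats a zero byte as the end of the payload.

```python
import opcode


def dead_zone_offsets(code_bytes):
    """Yield the offset of each argument byte owned by a no-argument opcode."""
    # Opcodes sit at even offsets; the argument byte follows immediately.
    for i in range(0, len(code_bytes) - 1, 2):
        if code_bytes[i] < opcode.HAVE_ARGUMENT:
            yield i + 1


def embed(code_bytes, payload):
    """Return a copy of code_bytes with payload written into the dead zones."""
    carrier = bytearray(code_bytes)
    slots = list(dead_zone_offsets(carrier))
    if len(payload) > len(slots):
        raise ValueError(
            "carrier has {} free bytes, payload needs {}".format(len(slots), len(payload))
        )
    for n, offset in enumerate(slots):
        # Unused slots are zeroed so extraction knows where the payload ends.
        carrier[offset] = payload[n] if n < len(payload) else 0
    return bytes(carrier)


def extract(code_bytes):
    """Read dead-zone bytes back until the first zero byte."""
    payload = bytearray()
    for offset in dead_zone_offsets(code_bytes):
        if code_bytes[offset] == 0:
            break
        payload.append(code_bytes[offset])
    return bytes(payload)


if __name__ == "__main__":
    NOP = opcode.opmap["NOP"]
    LOAD_CONST = opcode.opmap["LOAD_CONST"]
    # Hand-built instruction stream: NOP, LOAD_CONST 1, NOP, NOP
    original = bytes([NOP, 0, LOAD_CONST, 1, NOP, 0, NOP, 0])
    stego = embed(original, b"hi")
    print(len(original) == len(stego))
    print(extract(stego))
```

The argument byte of ```LOAD_CONST``` is never touched, which is why the instruction stream keeps both its length and its behavior.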

#### TODOs

* Add self destruct option ```-d``` which will purge the payload from the carrier file after
  extraction
* Support method to distribute payload across multiple carrier files
* Provide ```-t``` flag to test if a payload may be present within a carrier file
* Find more dead zones within the bytecode to place the payload, such as dead code
* Add a ```-g``` option which will grow the size of the file to support larger payloads
  for users that are not concerned with a change in file size (for instance if Stegosaurus
  is injected into a build pipeline)

#### Contributions

Thanks to S0lll0s for:

* Preventing the payload from being placed in long runs of opcodes that do not take an argument,
  as this can lead to exposure of the payload through tools like ```strings```

#### Contact

For any questions, please contact the author:

Jon Herron

jon _dot_ herron _at_ yahoo.com
--------------------------------------------------------------------------------