├── README.md
└── pyc2bytecode.py


/README.md:
--------------------------------------------------------------------------------
# pyc2bytecode:

A Python bytecode disassembler that helps reverse engineers dissect Python binaries by disassembling and analyzing compiled Python bytecode (.pyc) files across all Python versions (including Python 3.10.*)

## Usage:

To run pyc2bytecode:
```
> Console Disassembled Output: python pyc2bytecode.py -p
> Save Disassembled Output to a file: python pyc2bytecode.py -p -o
```
## Demonstration:

Researchers can use **pyc2bytecode** to reverse engineer malicious Python binaries and statically tear them apart to understand their inner workings.

We run pyc2bytecode.py against **onlyfans.pyc**, which was extracted with [pyinstxtractor.py](https://github.com/countercept/python-exe-unpacker/blob/master/pyinstxtractor.py) from a recent Python ransomware sample masquerading in the wild as an **OnlyFans** executable.

The analysis results produced by **pyc2bytecode** are shown below:

![2](https://user-images.githubusercontent.com/60843949/149174687-0191b9f2-89e0-493e-b140-0f3b2adc5af6.PNG)

![3](https://user-images.githubusercontent.com/60843949/149175102-fe0c9214-c7cd-4f78-87a0-aa25c4571196.PNG)

![7](https://user-images.githubusercontent.com/60843949/149175411-fc4606c4-4f42-49ad-9724-4d60ba81e7fa.PNG)

![8](https://user-images.githubusercontent.com/60843949/149175512-6c577c97-d4d3-4f8f-a409-cb327eb84a23.PNG)

![9](https://user-images.githubusercontent.com/60843949/149175534-f3bb9f11-8ca7-4564-8281-ebc7d32a6e34.PNG)

**Save the disassembled output to a text file**

![output-file](https://user-images.githubusercontent.com/60843949/149175676-34e76764-c7e9-4990-8c4c-ef3cda214450.PNG)

![10](https://user-images.githubusercontent.com/60843949/149175797-8075b3e1-61e5-4645-a693-688539c36b6a.PNG)


## Future Development:

- Develop a Python decompiler for recent Python versions using pyc2bytecode (Need to DIS it up :p)

## Credits & References:

i) https://github.com/google/pytype/blob/main/pytype/pyc/magic.py - Magic Numbers
ii) https://nedbatchelder.com/blog/200804/the_structure_of_pyc_files.html - PYC structure
iii) https://docs.python.org/3/library/dis.html - DIS
iv) https://docs.python.org/3/library/marshal.html - Marshal
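
The four references above cover the pieces a `.pyc` disassembler combines: the magic number, the header layout, `marshal`, and `dis`. The following is a minimal sketch of how they fit together, not the code of pyc2bytecode.py itself. The names `NATIVE_MAGIC`, `read_pyc`, and `disassemble` are hypothetical, and the sketch assumes the 16-byte header used by Python 3.7 and later and a file compiled by the same interpreter version that runs the sketch, because the `marshal` format is version-specific.

```python
import dis
import importlib.util
import io
import marshal
import struct

# Magic number of the running interpreter, as a little-endian 16-bit integer.
NATIVE_MAGIC = struct.unpack("<H", importlib.util.MAGIC_NUMBER[:2])[0]


def read_pyc(path):
    """Parse a Python 3.7+ .pyc header and return (magic, flags, code_object)."""
    with open(path, "rb") as f:
        magic = struct.unpack("<H", f.read(2))[0]  # version-specific magic number
        f.read(2)                                  # b"\r\n" marker, same in every version
        if magic != NATIVE_MAGIC:
            # marshal data from another Python version cannot be loaded reliably.
            raise ValueError("pyc magic %d does not match interpreter magic %d"
                             % (magic, NATIVE_MAGIC))
        flags = struct.unpack("<I", f.read(4))[0]  # bit field: bit 0 set means hash-based pyc
        f.read(8)                                  # mtime + source size, or 8-byte source hash
        code = marshal.load(f)                     # top-level module code object
    return magic, flags, code


def disassemble(code):
    """Return the dis listing of a code object as a string."""
    buf = io.StringIO()
    dis.dis(code, file=buf)
    return buf.getvalue()
```

Calling `read_pyc("example.pyc")` and passing the returned code object to `disassemble` yields the same kind of listing that `dis.dis` prints to the console. The explicit magic check is a design choice of this sketch: it fails early instead of letting `marshal.load` raise on an incompatible file.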

**Thank you! Feedback would be greatly appreciated. Hope you like the tool :) - knight!**


--------------------------------------------------------------------------------
/pyc2bytecode.py:
--------------------------------------------------------------------------------
# pyc2bytecode disassembler
# A Python bytecode disassembler helping reverse engineers dissect Python binaries by disassembling and analysing compiled Python bytecode (*.pyc) files across all Python versions (including Python 3.10.*)
# Author: https://twitter.com/knight0x07


import sys
import time
import struct
import marshal
import dis
import types
import os

def banner():
    print('''
 ________ ___. __ .___
 ______ ___.__. ____ \_____ \\_ |__ ___.__._/ |_ ____ ____ ____ __| _/ ____
 \____ \< | |_/ ___\ / ____/ | __ \< | |\ __\_/ __ \_/ ___\ / _ \ / __ |_/ __ \
 | |_> >\___ |\ \___ / \ | \_\ \\___ | | | \ ___/\ \___( <_> )/ /_/ |\ ___/
 | __/ / ____| \___ >\_______ \|___ // ____| |__| \___ >\___ >\____/ \____ | \___ >
 |__| \/ \/ \/ \/ \/ \/ \/ \/ \/

 Author: https://twitter.com/knight0x07
 Github: https://github.com/knight0x07
    ''')

def error_banner():
    print('''
    [-] Help:
        -> Console disassembled Output: python pyc2bytecode.py -p
        -> Save disassembled Output: python pyc2bytecode.py -p -o
    ''')

MAGIC_TAG = {
    # Maps each magic number to a (major, minor) Python version tuple | Credit: https://github.com/google/pytype/blob/main/pytype/pyc/magic.py | Thanks!

    # Python 1
    20121: (1, 5),
    50428: (1, 6),

    # Python 2
    50823: (2, 0),
    60202: (2, 1),
    60717: (2, 2),
    62011: (2, 3),  # a0
    62021: (2, 3),  # a0
    62041: (2, 4),  # a0
    62051: (2, 4),  # a3
    62061: (2, 4),  # b1
    62071: (2, 5),  # a0
    62081: (2, 5),  # a0
    62091: (2, 5),  # a0
    62092: (2, 5),  # a0
    62101: (2, 5),  # b3
    62111: (2, 5),  # b3
    62121: (2, 5),  # c1
    62131: (2, 5),  # c2
    62151: (2, 6),  # a0
    62161: (2, 6),  # a1
    62171: (2, 7),  # a0
    62181: (2, 7),  # a0
    62191: (2, 7),  # a0
    62201: (2, 7),  # a0
    62211: (2, 7),  # a0

    # Python 3
    3000: (3, 0),
    3010: (3, 0),
    3020: (3, 0),
    3030: (3, 0),
    3040: (3, 0),
    3050: (3, 0),
    3060: (3, 0),
    3061: (3, 0),
    3071: (3, 0),
    3081: (3, 0),
    3091: (3, 0),
    3101: (3, 0),
    3103: (3, 0),
    3111: (3, 0),  # a4
    3131: (3, 0),  # a5

    # Python 3.1
    3141: (3, 1),  # a0
    3151: (3, 1),  # a0

    # Python 3.2
    3160: (3, 2),  # a0
    3170: (3, 2),  # a1
    3180: (3, 2),  # a2

    # Python 3.3
    3190: (3, 3),  # a0
    3200: (3, 3),  # a0
    3210: (3, 3),  # a1
    3220: (3, 3),  # a1
    3230: (3, 3),  # a4

    # Python 3.4
    3250: (3, 4),  # a1
    3260: (3, 4),  # a1
    3270: (3, 4),  # a1
    3280: (3, 4),  # a1
    3290: (3, 4),  # a4
    3300: (3, 4),  # a4
    3310: (3, 4),  # rc2

    # Python 3.5
    3320: (3, 5),  # a0
    3330: (3, 5),  # b1
    3340: (3, 5),  # b2
    3350: (3, 5),  # b2
    3351: (3, 5),  # 3.5.2

    # Python 3.6
    3360: (3, 6),  # a0
    3361: (3, 6),  # a0
    3370: (3, 6),  # a1
    3371: (3, 6),  # a1
    3372: (3, 6),  # a1
    3373: (3, 6),  # b1
    3375: (3, 6),  # b1
    3376: (3, 6),  # b1
    3377: (3, 6),  # b1
    3378: (3, 6),  # b2
    3379: (3, 6),  # rc1

    # Python 3.7
    3390: (3, 7),  # a1
    3391: (3, 7),  # a2
    3392: (3, 7),  # a4
    3393: (3, 7),  # b1
    3394: (3, 7),  # b5

    # Python 3.8
    3400: (3, 8),  # a1
    3401: (3, 8),  # a1
    3411: (3, 8),  # b2
    3412: (3, 8),  # b2
    3413: (3, 8),  # b4

    # Python 3.9
    3420: (3, 9),  # a0
    3421: (3, 9),  # a0
    3422: (3, 9),  # a0
    3423: (3, 9),  # a2
    3424: (3, 9),  # a2
    3425: (3, 9),  # a2

    # Python 3.10
    3430: (3, 10),  # a1
    3431: (3, 10),  # a1
    3432: (3, 10),  # a2
    3433: (3, 10),  # a2
    3434: (3, 10),  # a6
    3435: (3, 10),  # a7
    3436: (3, 10),  # b1
    3437: (3, 10),  # b1
    3438: (3, 10),  # b1
    3439: (3, 10),  # b1
}

def magic_to_version(magic):
    # Assumed format '<H': the 2-byte magic is a little-endian unsigned short, matching the integer keys of MAGIC_TAG
    magic_decimal = struct.unpack('<H', magic)[0]
    # [...]
    if not code_obj.co_consts:
        print(" -> " + str(no_entry))
    else:
        for val in code_obj.co_consts:
            print(" -> " + str(val))
    print("\n[+] Code Object Name: " + str(code_obj.co_name))
    print("[+] Number of Local Variables: " + str(code_obj.co_nlocals))
    # co_names holds the names referenced by the bytecode (globals, attributes, imports), not the locals
    print("[+] Names referenced by the bytecode: \n")
    if not code_obj.co_names:
        print(" -> " + str(no_entry))
    else:
        for var in code_obj.co_names:
            print(" -> " + str(var))
    print("\n[+] Arguments & Local variable names: \n")
    if not code_obj.co_varnames:
        print(" -> " + str(no_entry))
    else:
        for argu in code_obj.co_varnames:
            print(" -> " + str(argu))
    print("\n[*] Note: Other Attributes can be added by editing the code if required")
    #print("Free Variable Names: " + str(code_obj.co_freevars))
    #print("Cell Variable Names: " + str(code_obj.co_cellvars))
    #print("ByteCode: " + code_obj.co_code.hex())
    #print("Stack Size: " + str(code_obj.co_stacksize))
    #print("LNoTab: " + str(code_obj.co_lnotab))
    #print("First Line Number: " + str(code_obj.co_firstlineno))
    #print("Flags: " + str(code_obj.co_flags))
    print("\n-------------------[DISASSEMBLED BYTECODE]----------------------\n\n")
    dis.dis(code_obj)
    print("\n\n----------------------------[END]-------------------------------\n\n")


def disassemble_pyc(pathofpyc):
    try:
        with open(pathofpyc, "rb") as f:
            print("\n---------------------[PARSED PYC HEADER]------------------------")
            magic_number = f.read(2)  # Magic number: depends on the Python version used for compilation
            carriage_return = f.read(2)  # "\r\n" marker: identical in every Python version
            compiled_python_version = magic_to_version(magic_number)  # Convert magic number to Python version
            major_version = compiled_python_version[0]
            minor_version = compiled_python_version[1]
            print("\n[+] Compiled PYC Python Version: " + str(major_version) + "." + str(minor_version))

            if major_version == 2:  # Python 2: same header layout for all 2.x versions

                # Analyze the 4-byte timestamp of the PYC
                # '<L' forces a 4-byte little-endian value; native 'L' is 8 bytes on many 64-bit platforms

                timestamp_val = f.read(4)  # Read 4-byte timestamp value
                print("[+] Timestamp: " + time.ctime(struct.unpack('<L', timestamp_val)[0]))


            elif major_version == 3:  # Python 3

                if minor_version < 3:  # Python versions earlier than 3.3

                    # Analyze the 4-byte timestamp of the PYC

                    timestamp_val = f.read(4)  # Read 4-byte timestamp value
                    print("[+] Timestamp: " + time.ctime(struct.unpack('<L', timestamp_val)[0]))


                elif minor_version < 7:  # Python versions earlier than 3.7

                    # Analyze the 4-byte timestamp of the PYC

                    timestamp_val = f.read(4)  # Read 4-byte timestamp value
                    print("[+] Timestamp: " + time.ctime(struct.unpack('<L', timestamp_val)[0]))

                    # Analyze the 4-byte file size

                    filesize_val = f.read(4)  # 32-bit file size
                    filesize_bytes = struct.unpack('<L', filesize_val)[0]  # assumed format: 4-byte little-endian
                    # [...]

                # For Python version >= 3.7
                else:

                    bit_field_val = f.read(4)  # Bit field: reads the 4-byte bit field
                    # The lowest bit of the little-endian bit field is 0 for a timestamp-based PYC
                    if struct.unpack('<L', bit_field_val)[0] & 0x1 == 0:

                        print("[+] PYC Type: Time-Stamp based PYC")

                        # Analyze the 4-byte timestamp of the PYC

                        timestamp_val = f.read(4)  # Read 4-byte timestamp value
                        print("[+] Timestamp: " + time.ctime(struct.unpack('<L', timestamp_val)[0]))

                        # Analyze the 4-byte file size

                        filesize_val = f.read(4)  # 32-bit file size
                        filesize_bytes = struct.unpack('<L', filesize_val)[0]  # assumed format: 4-byte little-endian