├── .gitignore ├── LICENSE ├── README.md ├── example_rules ├── Eicar.yar └── test_rules.yar ├── img ├── demo.gif ├── ex1.png └── ex2.png ├── requirements.txt └── src ├── arya.py ├── ast_observer.py ├── consts.py └── file_mapper.py /.gitignore: -------------------------------------------------------------------------------- 1 | # Byte-compiled / optimized / DLL files 2 | __pycache__/ 3 | *.py[cod] 4 | *$py.class 5 | 6 | # C extensions 7 | *.so 8 | 9 | # Distribution / packaging 10 | .Python 11 | build/ 12 | develop-eggs/ 13 | dist/ 14 | downloads/ 15 | eggs/ 16 | .eggs/ 17 | lib/ 18 | lib64/ 19 | parts/ 20 | sdist/ 21 | var/ 22 | wheels/ 23 | pip-wheel-metadata/ 24 | share/python-wheels/ 25 | *.egg-info/ 26 | .installed.cfg 27 | *.egg 28 | MANIFEST 29 | 30 | # PyInstaller 31 | # Usually these files are written by a python script from a template 32 | # before PyInstaller builds the exe, so as to inject date/other infos into it. 33 | *.manifest 34 | *.spec 35 | 36 | # Installer logs 37 | pip-log.txt 38 | pip-delete-this-directory.txt 39 | 40 | # Unit test / coverage reports 41 | htmlcov/ 42 | .tox/ 43 | .nox/ 44 | .coverage 45 | .coverage.* 46 | .cache 47 | nosetests.xml 48 | coverage.xml 49 | *.cover 50 | *.py,cover 51 | .hypothesis/ 52 | .pytest_cache/ 53 | 54 | # Translations 55 | *.mo 56 | *.pot 57 | 58 | # Django stuff: 59 | *.log 60 | local_settings.py 61 | db.sqlite3 62 | db.sqlite3-journal 63 | 64 | # Flask stuff: 65 | instance/ 66 | .webassets-cache 67 | 68 | # Scrapy stuff: 69 | .scrapy 70 | 71 | # Sphinx documentation 72 | docs/_build/ 73 | 74 | # PyBuilder 75 | target/ 76 | 77 | # Jupyter Notebook 78 | .ipynb_checkpoints 79 | 80 | # IPython 81 | profile_default/ 82 | ipython_config.py 83 | 84 | # pyenv 85 | .python-version 86 | 87 | # pipenv 88 | # According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control. 
89 | # However, in case of collaboration, if having platform-specific dependencies or dependencies 90 | # having no cross-platform support, pipenv may install dependencies that don't work, or not 91 | # install all needed dependencies. 92 | #Pipfile.lock 93 | 94 | # PEP 582; used by e.g. github.com/David-OConnor/pyflow 95 | __pypackages__/ 96 | 97 | # Celery stuff 98 | celerybeat-schedule 99 | celerybeat.pid 100 | 101 | # SageMath parsed files 102 | *.sage.py 103 | 104 | # Environments 105 | .env 106 | .venv 107 | env/ 108 | venv/ 109 | ENV/ 110 | env.bak/ 111 | venv.bak/ 112 | 113 | # Spyder project settings 114 | .spyderproject 115 | .spyproject 116 | 117 | # Rope project settings 118 | .ropeproject 119 | 120 | # mkdocs documentation 121 | /site 122 | 123 | # mypy 124 | .mypy_cache/ 125 | .dmypy.json 126 | dmypy.json 127 | 128 | # Pyre type checker 129 | .pyre/ 130 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2022 Claroty 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Arya - The Reverse YARA 2 | Arya is a unique tool that produces pseudo-malicious files meant to trigger YARA rules. You can think of it as a reverse YARA because it does exactly the opposite - it creates files that match your rules. 3 | 4 | You can read more about Arya and how it works in our [blog](https://claroty.com/2022/03/16/blog-research-arya-the-new-tailor-made-eicar-using-yara/). 5 | 6 | ![Arya Demo](img/demo.gif "Arya Demo") 7 | 8 | ## Intro 9 | 10 | YARA rules are an essential tool that helps security researchers identify and classify malware samples. They do so by describing patterns and strings within malware code that can help an analyst identify known or new threats. YARA rules are also often integrated within commercial detection tools, or used internally to detect misbehaving binaries on the enterprise network. 11 | 12 | But what if we don’t have the malicious file we are writing rules for? What if we want to create the “malware” file based on YARA as input? That's why we developed Arya. Arya can be used to generate custom-made, pseudo-malware files to trigger antivirus (AV) and endpoint detection and response (EDR) tools, just like the good old EICAR test file. Our tool has a number of use cases, including malware research, YARA rule QA testing, and pressure testing a network with code samples built from YARA rules. 13 | 14 | ![](img/ex1.png "") 15 | 16 | ## More details 17 | Arya is a first-of-its-kind tool; it produces pseudo-malicious files meant to trigger YARA rules. 
The tool reads the given YARA (.yar suffix) files, parses their syntax using Avast's yaramod package (the YARA parsing engine used in this research), and builds a pseudo “malware” file, carefully placing the desired bytes from the YARA rules so that the input rules trigger. 18 | 19 | The goal of the tool is to generate a tailor-made pseudo-malicious file that detection sensors such as AV or EDR will identify as the malware an input YARA rule is meant to detect. To achieve this goal, we not only add the necessary signatures, strings, and bytes from the input YARA rules, but also add some “touches” such as real PE headers, increased output-file entropy, x86 bytecode, and function prologue/epilogue assembly code. All of this helps trigger AV/EDR detection and bypasses some heuristic checks they might have. 20 | 21 | ![](img/ex2.png "") 22 | 23 | YARA files (.yar extension) are text files that contain one or more YARA rules. In this project, we used yaramod to turn YARA rules into a list of rules represented by Python objects, which can then be used to access the internal contents of a rule, such as strings, types, conditions, and more. yaramod parses YARA rules into an Abstract Syntax Tree (AST). 24 | 25 | Traversal of the abstract syntax tree is done using a combination of the Observer and Visitor design patterns. For every node the code visits, it determines which bytes and strings it needs to place, and where, in order to trigger the condition in the subtree it traverses. As a result, the traversal produces a mapping of strings and possible offsets to put them in; it can also reserve some of those offsets in the file, which are passed to the placer mechanism for further processing. 26 | 27 | 28 | ## Example Use Cases 29 | Researchers may use Arya to generate pseudo-malicious files using YARA rules as building blocks. These files can be built so that they are identified as specific malware. 
For example, what if you don’t have a Zeus malware sample but want to check how your AV reacts to it? No problem: load the Zeus YARA rules into Arya and generate your own Zeus-like pseudo-malware that AV/EDR tools will identify as Zeus. 30 | 31 | Arya can also be used as part of incident response training, similar to purple-teaming, where pseudo-malicious files can be sent across the network to pressure test sensors and detectors in the network. 32 | 33 | 34 | ## Currently Supported YARA Functionality 35 | **Supported** 36 | - Strings - ASCII, Wide, Base64, Base64wide, XOR 37 | - Hex Streams (including Jumps and Alternations) 38 | - Regular Expressions - Generates a string that matches the regex. 39 | - At operator 40 | - Int Functions (uint32, int16be, etc.) 41 | - Of operator (e.g. all of ($s*)) 42 | 43 | **Planned for the next updates** 44 | - FileSize Operator 45 | - Range Operator 46 | - String Count Operator 47 | 48 | ## How to use Arya 49 | ### Step 1 50 | ```bash 51 | sudo apt update 52 | sudo apt install yara cmake 53 | git clone https://github.com/claroty/arya 54 | cd arya 55 | pip3 install -r requirements.txt 56 | ``` 57 | 58 | ### Step 2 59 | Run Arya with the following arguments: 60 | * `-i [INPUT_DIRECTORY] OR [INPUT_FILE]` 61 | * `-o [OUTPUT_FILE_NAME]` 62 | * `--header` - Adds the PE header (the first 2048 bytes) of \[MALWARE_FILE] at the start of the file. If no malware file is specified, the header is taken from Conficker. 63 | * `-r` - Run recursively on all of the files under \[INPUT_DIRECTORY] 64 | * `-m [MALWARE_FILE]` - Use the file \[MALWARE_FILE] as the base for the output. 
65 | 66 | ### Example 67 | 68 | ```bash 69 | python3 src/arya.py -i example_rules/ -o MalwareAryaDefault.exe 70 | ``` 71 | -------------------------------------------------------------------------------- /example_rules/Eicar.yar: -------------------------------------------------------------------------------- 1 | rule eicar_substring_test 2 | { 3 | strings: 4 | $eicar_substring = "$EICAR-STANDARD-ANTIVIRUS-TEST-FILE!" 
5 | condition: 6 | $eicar_substring 7 | } 8 | rule eicar 9 | { 10 | strings: 11 | $hex_string = {58 35 4F 21 50 25 40 41 50 5B 34 5C 50 5A 58 35 34 28 50 5E 29 37 12 | 43 43 29 37 7D 24 45 49 43 41 52 2D 53 54 41 4E 44 41 52 44 2D 41 4E 54 49 56 49 52 55 13 | 53 2D 54 45 53 54 2D 46 49 4C 45 21 24 48 2B 48 2A} 14 | condition: 15 | $hex_string 16 | } 17 | -------------------------------------------------------------------------------- /example_rules/test_rules.yar: -------------------------------------------------------------------------------- 1 | /* 2 | * Test the free string placer 3 | */ 4 | rule TestFreeString { 5 | strings: 6 | $s1 = "checkcheckcheck" fullword ascii 7 | 8 | condition: 9 | 1 of ($s*) 10 | } 11 | rule TestFreeString2 { 12 | strings: 13 | $s1 = "2check2check2check2" wide 14 | 15 | condition: 16 | all of them 17 | } 18 | 19 | /* 20 | * Test the int function placer 21 | */ 22 | rule TestIntFunction { 23 | 24 | condition: 25 | uint32(uint32(0xA28)) == 0x4550 26 | } 27 | rule TestIntFunction2 { 28 | 29 | condition: 30 | 0x4550 == uint32(uint32(0xA38)) 31 | } 32 | 33 | /* 34 | * Test the offset placer 35 | */ 36 | rule TestOffset { 37 | strings: 38 | $str = "someteststring" 39 | 40 | condition: 41 | $str at 6000 42 | } 43 | 44 | /* 45 | * Tests for the hex placer 46 | */ 47 | rule TestHexWildcard { 48 | strings: 49 | $hex_string = { E2 34 ?? C8 A? 
FB } 50 | 51 | condition: 52 | $hex_string 53 | } 54 | rule TestHexJump { 55 | strings: 56 | $hex_string1 = { F4 23 [4-6] 62 B4 } 57 | $hex_string2 = { FE 39 45 [6] 89 00 } 58 | /*$hex_string3 = { FE 39 45 [4-] 89 00 }*/ 59 | /*$hex_string4 = { FE 39 45 [-] 89 00 } */ 60 | 61 | condition: 62 | $hex_string1 and $hex_string2 /* and $hex_string3 and $hex_string4*/ 63 | } 64 | rule TestHexAlternation { 65 | strings: 66 | $hex_string1 = { F4 23 (39 45 | 66 66 | be ef) 62 B4 } 67 | 68 | condition: 69 | $hex_string1 70 | } 71 | 72 | /* 73 | * Tests for the base64 placer 74 | */ 75 | rule TestBase64 { 76 | strings: 77 | $b64_string1 = "I am base64 ascii" base64 78 | 79 | condition: 80 | $b64_string1 81 | } 82 | rule TestBase64wide { 83 | strings: 84 | $b64w_string1 = "I am base64 wide" base64wide 85 | 86 | condition: 87 | $b64w_string1 88 | } 89 | 90 | 91 | /* 92 | * Tests for the regexp placer 93 | */ 94 | rule TestRegExp 95 | { 96 | strings: 97 | $re1 = /wow[0-9a-fA-F]{5}\d\w......./s 98 | $re2 = /anOTher\s+(One|regEx|tEst)/i 99 | 100 | condition: 101 | $re1 and $re2 102 | } 103 | 104 | /* 105 | * Tests for the xor string placer 106 | */ 107 | rule TestXorString 108 | { 109 | strings: 110 | $x1 = "xor me xor me!!" 
xor 111 | 112 | condition: 113 | $x1 114 | } -------------------------------------------------------------------------------- /img/demo.gif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/claroty/arya/1a30cf2c336dc32895c3e3f0968a2f4c8801d51b/img/demo.gif -------------------------------------------------------------------------------- /img/ex1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/claroty/arya/1a30cf2c336dc32895c3e3f0968a2f4c8801d51b/img/ex1.png -------------------------------------------------------------------------------- /img/ex2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/claroty/arya/1a30cf2c336dc32895c3e3f0968a2f4c8801d51b/img/ex2.png -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | colorama==0.4.4 2 | yaramod==3.12.2 3 | setuptools==65.5.1 4 | xeger==0.3.5 -------------------------------------------------------------------------------- /src/arya.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | import subprocess 3 | import os 4 | import re 5 | import random 6 | import binascii 7 | import sys 8 | import base64 9 | from pathlib import Path 10 | 11 | import colorama 12 | import yaramod as ym 13 | from xeger import Xeger 14 | 15 | from file_mapper import FileMapper 16 | from ast_observer import YaraAstObserver 17 | 18 | 19 | class RuleReverser: 20 | def __init__(self, input_path, output_path, is_recursive, add_pe_header, malware_file): 21 | self._curr_indent = 0 22 | self._file_mapper = FileMapper(add_pe_header, malware_file) 23 | self._input_path = input_path 24 | self._input_files_paths, self._rules_list = self.get_rules_list(self._input_path, 
is_recursive) 25 | self._output_file_path = output_path 26 | self.all_offsets = [] 27 | self.rules_names = [] 28 | self.print_row("[-] Starting Arya") 29 | 30 | def increase_indent(self): 31 | self._curr_indent += 4 32 | 33 | def decrease_indent(self): 34 | self._curr_indent -= 4 35 | 36 | def print_row(self, string): 37 | print((self._curr_indent * ' ') + string) 38 | 39 | def _hex_string_to_bytes(self, string): 40 | ret_str = str(string.text) 41 | 42 | ret_str = ret_str.replace("{", "").replace("}", "") \ 43 | .replace("??", "90").replace("?", "0").replace(" ", "").strip() 44 | 45 | amounts_to_replace = re.findall("(\[(\d*)[\-\][\d]*])", ret_str) 46 | if amounts_to_replace: 47 | for sub_to_replace, amount in amounts_to_replace: 48 | if not amount: 49 | amount = 1 50 | sub_to_replace = "\\" + sub_to_replace[:-1] + "\\" + sub_to_replace[-1] 51 | ret_str = re.sub(sub_to_replace, "90" * int(amount), ret_str) 52 | 53 | groups_to_replace = re.findall("(\(([0-9A-F]*)\|)", ret_str) 54 | if groups_to_replace: 55 | for sub_to_replace, hex_stream in groups_to_replace: 56 | sub_to_replace = "\\" + sub_to_replace[:-1] + "\|[0-9A-F|]*?\)" 57 | ret_str = re.sub(sub_to_replace, hex_stream, ret_str) 58 | 59 | return binascii.unhexlify(ret_str) 60 | 61 | def _yara_string_to_bytes(self, string): 62 | if string.is_plain: 63 | if string.is_base64: 64 | return base64.b64encode(string.pure_text) 65 | if string.is_base64_wide: 66 | return str(base64.b64encode(string.pure_text)).encode("utf-16") 67 | if string.is_wide: 68 | return str(string.pure_text).encode("utf-16") 69 | if string.is_ascii: 70 | return string.pure_text 71 | if string.is_hex: 72 | return self._hex_string_to_bytes(string) 73 | if string.is_regexp: 74 | if string.is_wide: 75 | return Xeger(limit=10).xeger(string.pure_text).encode("utf-16") 76 | if string.is_ascii: 77 | return Xeger(limit=10).xeger(string.pure_text).encode("utf-8") 78 | 79 | def _of_expr_to_string(self, count, iterable, string_mapping): 80 | if 
isinstance(iterable, ym.ThemExpression): 81 | strings = string_mapping 82 | elif isinstance(iterable, ym.SetExpression): 83 | elements = [element.id.replace("$", "\\$").replace("*", ".*") for element in iterable.elements] 84 | strings = {key: val for key, val in string_mapping.items() if 85 | any([re.findall(element, key) for element in elements])} 86 | else: 87 | return None 88 | 89 | if count.get_text() == "all": 90 | amount = len(strings) 91 | elif count.get_text() == "any": 92 | amount = 1 93 | else: 94 | amount = int(count.get_text()) 95 | 96 | return b"".join([self._yara_string_to_bytes(val) for val in list(strings.values())][:amount]) 97 | 98 | def get_rules_list(self, path, is_recursive): 99 | ymod_parser = ym.Yaramod() 100 | 101 | all_file_paths = [] 102 | if os.path.isdir(path): 103 | if is_recursive: 104 | all_file_paths = [str(p) for p in list(Path(path).rglob("*.[yY][aA][rR]"))] 105 | else: 106 | for root, subdirectories, files in os.walk(path): 107 | for file in files: 108 | all_file_paths.append(os.path.join(root, file)) 109 | elif os.path.isfile(path): 110 | all_file_paths.append(path) 111 | 112 | all_yara_rules = [] 113 | for file_path in all_file_paths: 114 | curr_yar_file = ymod_parser.parse_file(file_path) 115 | all_yara_rules.extend([(rule, file_path) for rule in curr_yar_file.rules]) 116 | 117 | self.print_row(f"[-] Input file/directory {self._input_path}, found {len(all_yara_rules)} yara rules") 118 | return all_file_paths, all_yara_rules 119 | 120 | def init_offset_list(self): 121 | for rule, path in self._rules_list: 122 | ast_observer = YaraAstObserver(self._file_mapper) 123 | self.rules_names.append((rule.name, path)) 124 | string_mapping = {s.identifier: s for s in rule.strings} 125 | ast_observer.observe(rule.condition) 126 | offsets_map = ast_observer.strings_offsets_map 127 | for expr in offsets_map: 128 | if expr["operation"] == "of": 129 | expr["var"] = self._of_expr_to_string(expr["min_offset"], expr["max_offset"], string_mapping) 
130 | expr["operation"] = "String" 131 | expr["min_offset"] = "*" 132 | expr["max_offset"] = "*" 133 | elif expr["operation"] == "IntFunction": 134 | continue 135 | else: 136 | expr["var"] = self._yara_string_to_bytes(string_mapping[expr["var"]]) 137 | self.all_offsets.extend(offsets_map) 138 | 139 | def build_file_from_instructions(self): 140 | free_strings = b"" 141 | 142 | self.print_row("[-] Building output file...") 143 | for instructions_dict in self.all_offsets: 144 | if (instructions_dict["operation"] == "String" 145 | and instructions_dict["min_offset"] == instructions_dict["max_offset"] == "*"): 146 | free_strings += b"." + instructions_dict["var"] + b"." \ 147 | + self._file_mapper.generate_random_x86_code(random.randint(1, 16)) 148 | elif instructions_dict["operation"] == "IntFunction": 149 | self._file_mapper.place(instructions_dict["var"], int(instructions_dict["min_offset"]), pre_reserved=True) 150 | elif instructions_dict["operation"] == "at": 151 | min_offset = instructions_dict["min_offset"] 152 | if "entrypoint" in min_offset: 153 | continue 154 | self._file_mapper.place(instructions_dict["var"], int(min_offset)) 155 | 156 | file_len = self._file_mapper.get_file_len() 157 | if file_len > 1024: 158 | random_amount_of_code = self._file_mapper.generate_random_x86_code(random.randint(1024, file_len)) 159 | else: 160 | random_amount_of_code = [] 161 | self._file_mapper.append(random_amount_of_code) 162 | self._file_mapper.append(free_strings) 163 | self._file_mapper.fill_empty_with_code() 164 | 165 | self.print_row(f"[-] Saving result to {self._output_file_path}") 166 | with open(self._output_file_path, "wb") as out: 167 | out.write(self._file_mapper.get_as_bytestream()) 168 | 169 | return self._file_mapper.get_as_bytestream() 170 | 171 | def test_yara(self, rules_path): 172 | yara_output = subprocess.run(['yara', rules_path, self._output_file_path], stdout=subprocess.PIPE).stdout.decode('utf-8') 173 | return [out.split(" ")[0] for out in 
yara_output.split("\n")] 174 | 175 | def print_triggered_and_summary(self): 176 | triggered_rules = [] 177 | for in_path in self._input_files_paths: 178 | triggered_rules.extend([rule for rule in self.test_yara(in_path) if rule]) 179 | self.print_row("[-] Checking result output against all files") 180 | self.increase_indent() 181 | for rule, file in self.rules_names: 182 | if rule in triggered_rules: 183 | self.print_row(f"File {file} Rule {rule}: {colorama.Fore.LIGHTGREEN_EX}Triggered") 184 | else: 185 | self.print_row(f"File {file} Rule {rule}: {colorama.Fore.RED}Not triggered") 186 | self.decrease_indent() 187 | self.print_row("\n[-] Summary:") 188 | 189 | self.increase_indent() 190 | self.print_row(f"[-] File {self._output_file_path} size in kb: {round(self._file_mapper.get_file_len() / 1024, 2)}") 191 | self.print_row(f"[-] Number of rules triggered: {len(triggered_rules)}/{len(self.rules_names)} rules") 192 | self.decrease_indent() 193 | 194 | self.print_row("\n[-] Done.") 195 | 196 | def main(): 197 | colorama.init(autoreset=True) 198 | parser = argparse.ArgumentParser( 199 | description="Build a file that will trigger as much yara rules as possible from a given directory/file.") 200 | if len(sys.argv) < 3: 201 | parser.print_help() 202 | parser.add_argument("-i", dest="in_path", type=str, required=True, 203 | help="PATH_TO_FILE or PATH_TO_DIRECTORY") 204 | parser.add_argument("-o", dest="out_path", type=str, required=True, 205 | help="OUTPUT_FILE_PATH") 206 | parser.add_argument("-m", dest="malware_file", type=str, required=False, 207 | help="MALWARE_FILE - Malware/Executable file to use as template") 208 | parser.add_argument("--header", dest="add_pe_header", action="store_true", required=False, 209 | help="Adds the pe header(first 2048 bytes) of MALWARE_FILE in the beginning of the file. 
" 210 | "If no malware file is specified, adds from conficker.") 211 | parser.add_argument("-r", dest="is_recursive", action="store_true", required=False, 212 | help="Recursively scan all sub folders of input path for .yar files") 213 | args = parser.parse_args() 214 | 215 | rule_reverser = RuleReverser(args.in_path, args.out_path, args.is_recursive, args.add_pe_header, args.malware_file) 216 | rule_reverser.init_offset_list() 217 | rule_reverser.build_file_from_instructions() 218 | rule_reverser.print_triggered_and_summary() 219 | 220 | 221 | if __name__ == "__main__": 222 | main() 223 | -------------------------------------------------------------------------------- /src/ast_observer.py: -------------------------------------------------------------------------------- 1 | import yaramod as ym 2 | 3 | 4 | class YaraAstObserver(ym.ObservingVisitor): 5 | def __init__(self, file_mapper): 6 | super(YaraAstObserver, self).__init__() 7 | self.strings_offsets_map = [] 8 | self.file_mapper = file_mapper 9 | 10 | def visit_StringExpression(self, expr): 11 | self._add_to_offset_map(expr.id, "String", "*", "*") 12 | 13 | def visit_StringWildcardExpression(self, expr): 14 | pass 15 | 16 | def visit_StringAtExpression(self, expr): 17 | 18 | self._add_to_offset_map(expr.id, "at", expr.at_expr.get_text(), expr.at_expr.get_text()) 19 | 20 | expr.at_expr.accept(self) 21 | 22 | def visit_StringInRangeExpression(self, expr): 23 | low = expr.range_expr.low 24 | high = expr.range_expr.high 25 | 26 | if isinstance(low, ym.StringOffsetExpression): 27 | low = low.index_expr.value 28 | if isinstance(high, ym.StringOffsetExpression): 29 | high = high.index_expr.value 30 | if isinstance(low, ym.IntLiteralExpression): 31 | low = low.value 32 | if isinstance(high, ym.IntLiteralExpression): 33 | high = high.value 34 | 35 | self._add_to_offset_map(expr.id, "in_range", low, high) 36 | expr.range_expr.accept(self) 37 | 38 | def visit_StringCountExpression(self, expr): 39 | pass 40 | 41 | def 
visit_StringOffsetExpression(self, expr): 42 | if expr.index_expr: 43 | expr.index_expr.accept(self) 44 | 45 | def visit_StringLengthExpression(self, expr): 46 | if expr.index_expr: 47 | expr.index_expr.accept(self) 48 | 49 | def visit_NotExpression(self, expr): 50 | expr.operand.accept(self) 51 | 52 | def visit_UnaryMinusExpression(self, expr): 53 | expr.operand.accept(self) 54 | 55 | def visit_BitwiseNotExpression(self, expr): 56 | expr.operand.accept(self) 57 | 58 | def visit_AndExpression(self, expr): 59 | expr.left_operand.accept(self) 60 | expr.right_operand.accept(self) 61 | 62 | def visit_OrExpression(self, expr): 63 | expr.left_operand.accept(self) 64 | expr.right_operand.accept(self) 65 | 66 | def visit_LtExpression(self, expr): 67 | expr.left_operand.accept(self) 68 | expr.right_operand.accept(self) 69 | 70 | def visit_GtExpression(self, expr): 71 | expr.left_operand.accept(self) 72 | expr.right_operand.accept(self) 73 | 74 | def visit_LeExpression(self, expr): 75 | expr.left_operand.accept(self) 76 | expr.right_operand.accept(self) 77 | 78 | def visit_GeExpression(self, expr): 79 | expr.left_operand.accept(self) 80 | expr.right_operand.accept(self) 81 | 82 | def helper_trigger_intfunc(self, func_name, func_arg, value): 83 | signed = False if func_name.startswith("uint") else True 84 | byteorder = "big" if func_name.endswith("be") else "little" 85 | size_in_bytes = int(func_name.rstrip("be").lstrip("u").lstrip("int")) // 8 86 | 87 | if isinstance(func_arg, ym.IntLiteralExpression): 88 | curr_resolve = value.to_bytes(size_in_bytes, byteorder=byteorder, signed=signed) 89 | return [(curr_resolve, "IntFunction", func_arg.value, func_arg.value + size_in_bytes)] 90 | 91 | if isinstance(func_arg, ym.IntFunctionExpression): 92 | first_free_idx = self.file_mapper.reserve_first_free_spot(size_in_bytes) 93 | next_functions_results = self.helper_trigger_intfunc(func_arg.function, func_arg.argument, value) 94 | curr_resolve = first_free_idx.to_bytes(size_in_bytes, 
byteorder=byteorder, signed=signed) 95 | return next_functions_results\ 96 | + [(curr_resolve, "IntFunction", first_free_idx, first_free_idx + size_in_bytes)] 97 | 98 | def helper_rotate_ptrs(self, offsets): 99 | offset_vars = [offset[0] for offset in offsets] 100 | offset_vars = offset_vars[1:] + offset_vars[:1] 101 | offsets_to_ret = [] 102 | for index, offset in enumerate(offsets): 103 | offsets_to_ret.append((offset_vars[index],) + offset[1:4]) 104 | 105 | return offsets_to_ret 106 | 107 | def helper_add_int_functions(self, function, argument, value): 108 | offsets = self.helper_trigger_intfunc(function, argument, value) 109 | offsets = self.helper_rotate_ptrs(offsets) 110 | for curr_offset in offsets: 111 | self._add_to_offset_map(*curr_offset) 112 | 113 | def visit_EqExpression(self, expr): 114 | 115 | if isinstance(expr.left_operand, ym.IntFunctionExpression) and isinstance(expr.right_operand, ym.IntLiteralExpression): 116 | self.helper_add_int_functions(expr.left_operand.function, expr.left_operand.argument, expr.right_operand.value) 117 | elif isinstance(expr.right_operand, ym.IntFunctionExpression) and isinstance(expr.left_operand, ym.IntLiteralExpression): 118 | self.helper_add_int_functions(expr.right_operand.function, expr.right_operand.argument, expr.left_operand.value) 119 | 120 | expr.left_operand.accept(self) 121 | expr.right_operand.accept(self) 122 | 123 | def visit_NeqExpression(self, expr): 124 | expr.left_operand.accept(self) 125 | expr.right_operand.accept(self) 126 | 127 | def visit_ContainsExpression(self, expr): 128 | expr.left_operand.accept(self) 129 | expr.right_operand.accept(self) 130 | 131 | def visit_MatchesExpression(self, expr): 132 | expr.left_operand.accept(self) 133 | expr.right_operand.accept(self) 134 | 135 | def visit_PlusExpression(self, expr): 136 | expr.left_operand.accept(self) 137 | expr.right_operand.accept(self) 138 | 139 | def visit_MinusExpression(self, expr): 140 | expr.left_operand.accept(self) 141 | 
expr.right_operand.accept(self) 142 | 143 | def visit_MultiplyExpression(self, expr): 144 | expr.left_operand.accept(self) 145 | expr.right_operand.accept(self) 146 | 147 | def visit_DivideExpression(self, expr): 148 | expr.left_operand.accept(self) 149 | expr.right_operand.accept(self) 150 | 151 | def visit_ModuloExpression(self, expr): 152 | expr.left_operand.accept(self) 153 | expr.right_operand.accept(self) 154 | 155 | def visit_BitwiseXorExpression(self, expr): 156 | expr.left_operand.accept(self) 157 | expr.right_operand.accept(self) 158 | 159 | def visit_BitwiseAndExpression(self, expr): 160 | expr.left_operand.accept(self) 161 | expr.right_operand.accept(self) 162 | 163 | def visit_BitwiseOrExpression(self, expr): 164 | expr.left_operand.accept(self) 165 | expr.right_operand.accept(self) 166 | 167 | def visit_ShiftLeftExpression(self, expr): 168 | expr.left_operand.accept(self) 169 | expr.right_operand.accept(self) 170 | 171 | def visit_ShiftRightExpression(self, expr): 172 | expr.left_operand.accept(self) 173 | expr.right_operand.accept(self) 174 | 175 | def visit_ForDictExpression(self, expr): 176 | expr.variable.accept(self) 177 | expr.iterable.accept(self) 178 | expr.body.accept(self) 179 | 180 | def visit_ForArrayExpression(self, expr): 181 | expr.variable.accept(self) 182 | expr.iterable.accept(self) 183 | expr.body.accept(self) 184 | 185 | def visit_ForStringExpression(self, expr): 186 | expr.variable.accept(self) 187 | expr.iterable.accept(self) 188 | expr.body.accept(self) 189 | 190 | def visit_OfExpression(self, expr): 191 | self._add_to_offset_map(None, 'of', expr.variable, expr.iterable) 192 | 193 | expr.variable.accept(self) 194 | expr.iterable.accept(self) 195 | 196 | def visit_IteratorExpression(self, expr): 197 | for elem in expr.elements: 198 | elem.accept(self) 199 | 200 | def visit_SetExpression(self, expr): 201 | for elem in expr.elements: 202 | elem.accept(self) 203 | 204 | def visit_RangeExpression(self, expr): 205 | 
expr.low.accept(self) 206 | expr.high.accept(self) 207 | 208 | def visit_IdExpression(self, expr): 209 | pass 210 | 211 | def visit_StructAccessExpression(self, expr): 212 | expr.structure.accept(self) 213 | 214 | def visit_ArrayAccessExpression(self, expr): 215 | expr.array.accept(self) 216 | expr.accessor.accept(self) 217 | 218 | def visit_FunctionCallExpression(self, expr): 219 | expr.function.accept(self) 220 | 221 | for arg in expr.arguments: 222 | arg.accept(self) 223 | 224 | def visit_BoolLiteralExpression(self, expr): 225 | pass 226 | 227 | def visit_StringLiteralExpression(self, expr): 228 | pass 229 | 230 | def visit_IntLiteralExpression(self, expr): 231 | pass 232 | 233 | def visit_DoubleLiteralExpression(self, expr): 234 | pass 235 | 236 | def visit_FilesizeExpression(self, expr): 237 | pass 238 | 239 | def visit_EntrypointExpression(self, expr): 240 | pass 241 | 242 | def visit_AllExpression(self, expr): 243 | pass 244 | 245 | def visit_AnyExpression(self, expr): 246 | pass 247 | 248 | def visit_NoneExpression(self, expr): 249 | pass 250 | 251 | def visit_ThemExpression(self, expr): 252 | pass 253 | 254 | def visit_ParenthesesExpression(self, expr): 255 | expr.enclosed_expr.accept(self) 256 | 257 | def visit_IntFunctionExpression(self, expr): 258 | expr.argument.accept(self) 259 | 260 | def visit_RegexpExpression(self, expr): 261 | pass 262 | 263 | def _add_to_offset_map(self, var, operation, min_offset, max_offset): 264 | self.strings_offsets_map.append({"var": var, 265 | "operation": operation, 266 | "min_offset": min_offset, 267 | "max_offset": max_offset}) -------------------------------------------------------------------------------- /src/consts.py: -------------------------------------------------------------------------------- 1 | CONFICKER_FIRST_4KB = 
b'MZ\x90\x00\x03\x00\x00\x00\x04\x00\x00\x00\xff\xff\x00\x00\xb8\x00\x00\x00\x00\x00\x00\x00@\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xd8\x00\x00\x00\x0e\x1f\xba\x0e\x00\xb4\t\xcd!\xb8\x01L\xcd!This program cannot be run in DOS mode.\r\r\n$\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00PE\x00\x00L\x01\x03\x00\xfbZ\xdf9\x00\x00\x00\x00\x00\x00\x00\x00\xe0\x00\x0e!\x0b\x01\x07\x00\x00\xf0\x00\x00\x00\x10\x00\x00\x00p\x00\x00\x10n\x01\x00\x00\x80\x00\x00\x00p\x01\x00\x00\x00\x00\x10\x00\x10\x00\x00\x00\x02\x00\x00\x04\x00\x00\x00\x05\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x00\x80\x01\x00\x00\x10\x00\x00\x00\x00\x00\x00\x02\x00\x00\x00\x00\x00\x10\x00\x00\x10\x00\x00\x00\x00\x10\x00\x00\x10\x00\x00\x00\x00\x00\x00\x10\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00p\x01\x000\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x000q\x01\x00\x0c\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00UPX0\x00\x00\x00\x00\x00p\x00\x00\x00\x10\x00\x00\x00\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x80\x00\x00\xe0UPX1\x00\x00\x00\x00\x00\xf0\x00\x00\x00\x80\x00\x00\x00\xf0\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00@\x00\x00\xe0UPX2\x00\x00\x00\x00\x00\x10\x00\x00\
x00p\x01\x00\x00\x02\x00\x00\x00\xf4\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00@\x00\x00\xc0\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x003.03\x00UPX!\r\t\x02\t\x8e\xf5Z\x19\x87k\xc2\xad+H\x01\x00\x01\xee\x00\x00\x00P\x01\x00&\x00\x00p\xbd\xfd\xff\xff\x8b\r\x00\x00PX\x89L$8\x89t$<\x8b\x15\x00\x01"\x94\xdfl\r%\xff\xff\xef\xdev\xdf\x18R\xd9\xff\xdd\x1d\x12\xb0\xff\x18\x00@4,\x01\'\x98\x89\xde\xdd\xdb\xdb\x01\xe9\x0c\x06
\x91h\x04\n\x1ch\x10\x18\xc7\x05\x05\xbb\x7f;\xf6\x0e&8\xa1\x0e \xc4_^[\x8b\xe5(\x02a\x7f\xec\xdf\xee\x0b\xd3\x89V\'$\\ \x90\x03\xca\x07\xa0\x8dT\x11\x01\x8bo\xfb\xbbov\x0c#\xd1\x8aU \xc0\xd3\xea\x81\xca\x82B\n\r^\x8e=\xf6!\x04\xd3\xe2\x98h3\xd1\x07<+\xce\x8b\x1d7\xcb\x91\xc7?\xcb\x8b\x1dL\x19\\\xcb\x81\xf1\xe7\xee\xfd\xb7\xdc"\x00\x8b\x1c\x08\x8b\xc63\xd3\xa2$\xc1\x8b\r_\xd7\xdd\xfd\x03\xc8\x8d\x0c\xcd\x08\x1d\x83\xe1\x18`\xe2\xd3\x00` \xd8\x93\xc1\x9a\xc8\xc6P\xc3\x03\xd8\x8e\xcd\xd8\xbb\xc9\xd9\xb9\x1c\xf4j\xbf\xd0\xac"\xe4A\xba\xff\x90\x8dP\x07\x8dX\x06\x8bt\xb2\xf2\xee\x81\xcel\x90A\x06\xe6\xf1\xf1Ow\xfb\xecX\xee\' \xa434\n\x90\xc5\xff\x00\xfd\xffu;\x90\xff%6\xc8,@x\x8b\x11\x8bD$0\x8a\x0cBw#\xbc\xbb\x17\x02\x89N$\xe0\x85\xf6V<5\x00\x0f\x86\r\xc7\xc6\xdaI\x13\x05/\x88\x8b\x8b\xc8\x0c\x00l\x90\x1f\xccP\\L\x90\x8dt\xf1\x99l\x92\x91!\x15\xf2\xd0?[\x16\x82\x10/L\nT\xad\xb1o\x0b#R\nh3\xf3\x8b=*\x87,<\xb8\xcf\x07.\xf9R\xfd\xee6\xb2n\xc9\x81\xe6f\xc8+\xb3\x9b\xb7\xf6?\xfc]\xc2\x0c\x00\x89\x84$\xa8\x1b\x06\xb4$\xac\xdf\xac\x8f\x11nc\r\xe8\x022\x86\x01\x80g\xb67\xb4\xff%\xc8\x839\x01\x0f\x8e\x00\x10\x10\x7f\x15]\x0f\xaf\xdf\xc7\x01\xe1e\x10\xcb\x90\xc1\xe1\x03\xc7\x0e\x1b\xae\xfb\x0b\xf2\xc1\xe6\x10\xf4\x18%L\x0f@\x1b\xbbH6L\xd0\xa1\xfe\xd057\xb6i\x0eDL\xc15\xee\r\xfa\x90\x8d;\xf73\x14\x01I\x1e\xc3\x03\xc7<\xc5\x03\xb9\xac\x85|y\x104vyk\xb7\xbdW\xdd\x05\x18\xcb\xb6\xf4F\xbai\xdbZ@\x89 
DT@\xd9\xf3M`\xdd>\xc2\x11\x8e\x84R\x1c\xb0h\x03\x01\rh\x87vG\x18\xc3\\lP\x81\xf0\xa3\x04|\xa1\xd8\xeewo\xd0\x83\xc4\x08T\x07\xb4\x8d\x8c$l)\xc7\x84\x06a\xc9H\xdd\x94/Qn,gN\xc6c\xd2\x13\xd2t\x14\x0b\xd1\xbd\xd0\xc6\xffd\xae\x9f\xdf\x03\xd9\xb9\x14\xde\xfd#\xa18L\x8d\x14\x80\x8d\x04\xd7\xa9\x10\x83\xc0\x08\x16!K\x13q\xa2\x08\xedRf\xce5L\x84\xc2%\xf4#\xc8\xf6\x8d\x10<\xda\xb8n\x9b4R!\x07\x96\x18\xe8\xa4G\x1e\xd9?5\xce\x8b5L1\x01\xa5,\xbb\x84X\x0f\x04\x0e(\xda\xc1v#x\x98\x92/D\xb3P\x08~\xdc\x19y0^\xeb\x81\xe3\xf3\xc7\x01y\x90\xde\x1b\t#\xda\xf0\xeb\x81\xcb\x00\x19d\x90\xe3\xd9\xd9d\xcc6SB\x140T>-@\xce\xda\xe3\\Y\x08\xec`\x8d\xcd\xaf\x05\x10\xa6+\xe8\r\x84i\x90A\x06\xe0\xc1\xc1\xd1zi%,\x05\xd1\x0e3\xc3\xd8\x01\x81AN\xca\x1a\xe8%\x9dy\xe3 \x0c\xe0&\xf2\xe9<\x14hj\x8c\xd1\xb4{\x94$\xbc\x1dRh\xa0`r\x90\xfc:6l\x93\x0c\xc5"\xb8\xf9$N*H\xc9I\xd3\x1890\xe4\x80\x84o\xf0.\xbd\rf\x11\x02$XPQ\xd9\xe0\x0c\xf2\x83\xe1O\'\xc8\x15\xff\xfd\xc6\xfeC\x80\x89\x02\x12\x08VP\xa1\x1690\x0f\x84\x18\x0f\x08t\xaf(\xf6\xd7$\xfc\x00PP\xbf}\x84!\xc88\x8b\xf09\xe9\x10\x0f\x83X\x1c\xc0\xb2\xe9\x18g\x0f\rsH\xc7L\xab\xf4\xa9\xe9;wH\x8f0\xf8}i~\xb0#/"\xa0\x9b\x00_\x1cRj\x01PJ\xd2\xad\xb1\x82\xa0o\xff\xd0Mc\xb1\xff{[\xb8\xbc(\x1c\x0b\xc8j\x9d\x89\t\xa9\xb0\xd6\xed\x96>|\x8b\x94\xc8C\xf7\xc6\x04 \xaf\xfdG\x18\xe44&(K\xbc\xd3\xfe\x83\xfe\x03u\x05\x9bf$\xde\x94\x19\xd83\xf6kpX\\\x83lz\xe0\xdd\xec\x08\x9b`\xdd\x1c$\x05z\xeb%@\x83\x8c\xc1\xdb@\xa3\x08\x1b\xc4\t\xad\xf7\xda\xf4f\x8b\x89\x0e\xf3\x81\xe1*\x83\x1aB0\xf8\xaf\x85BL\x0f\x85\x1e\x0b\x18\x1b\xd8\x02\xe3\xa4\x11P\xc8\x1a\x9d^\x80\xa4\xd05\xd0\x06ILaUPc\r\xb2\xb1\x06&\x15#\xd0vp\x06\x92\x8d\xbf\x14Vu\x08\xd8o\x8b}z*LS\x100\x0cHF(\x8e\x8cU\xd2x\x85\xf63![\x89W\x0e7\xa9\xf5\x94t\x8e\xa4\x91\xc3\xe8\n,6\\\xce\xe7\x17\xe5\x1b\x95\xc1\x84\xdd.,\xcaD\xdcd\x9c\xa5\xef\x895\xff\x8a{\xa0r\x0c\xdak\xfa(\xbcj\x04uT{\x1e\xb2>+\x8cp\xd4P\x1dQd& 
\x85\xc0\xd2\xbb%6\x89\xcez\xff_\x0c\xe6j\x10\xef\xf4\x1cg0\xd0,PaE\xc2,F\xf8\'*\xeel\xf5\x8e\xbb\tw\x1dR\x0fA$#\xde\x13\xf6\xe4\x82\x84v\r\xcd\xf7d\xdd\x19%c\x94\x13\xf04H]\xee\xd3GM\xd1\x03t\x89\xd8Z\xdb\xf1\x07#m\x06H\xa4@\xf8hh\x07vIn\xce\xf5\xef\xc4\xecR\x19\x1e\xa2\xc0(,0\xafK\xad\x90>\x8c\x91<\xbc\x1b\xea\x073\xa9\x85\xd8o\x94\x84&\xa0TUT\x03\xaf\x08\xb9\x9c\x038t\x84\x06\xd7\xde\xa9$\n\xc3$\xd8Zk\xdd\xb6Z%\xad \x08\xd9\xfe\xca\xa1\xb0t\xd8\x1e8\xff\xd6Q\xa3~\x8a\xd6\xc4\x00\xa1\xefIVX\x1a\x1c\x06\x86\x86\x81\x1b\xeeT\xb1AHM\x87\xd1\x8f]\xa8\x8dq\x02pt\x00H\x1d\xd1\x03Ax,Q\xf8\xa1\xe01\x8b\x17\xed6\x05\x1f\x1e\x81]\xbf>n\x04\xa1\xf8\x838\xf2\xc8!$e4\x08\x1c\r\'Q\xda\xf5\x18\xdd*\x83\x86\x15\xf4\xdf,\xc9\x18\x17\xa1\xde\xf0\xe3\x8e\xc1\x13z\x83\xc2=\x1b\x81\xf2}\x04\x13#d\xa3\xf7!\x19C\xcd\x1b\x84\xd0`9<\xcfS\xa3a\x9c\x1d\x83\xc1\xa1\xdc\x00\xc5\xa6\xe1<%h\x8d\x039v\x7f\xea\xc8\x97\x11\n\x9fL\x1c\x8d\x1c\x85F\x88\x89}\x12c1.J\xac\x01\x83\xc6=>\xd3\xd8H\x14\xf2V0x~\x00\xa0\xd1\x9c\x90I",-\xcd7\xd2,\xee\x80\x88\x14\x08\xe0\x03\x1ct\xd0\xe0=\xd6z\xfe8#\xd6\x128\xbf\xd3\xf6\x8f\x85\xf7\xda\x1b\xd2\x83\xe2 \xe1 \xe4\x16\x8b\x08\x11\xf8\x0e\xdd\x83\xf9\x95\xb8\xb2i\xa4w\xc0\xec\x90\x9d\\\xc83\xfb\x1a< \xf5\x84s\xef\x81\xe7\x84\xe7\x0b\xf7\x0f\xc2\x08\x83\xf8\xd9\xf9A\xef\x81\xcf\x90\x863 \xe7\xb0\xf9\xac\x146l\xf9Wp\x9f\xa4\xeb\x15\x1a\xefEn\xa1\xf7\x97A\xc6`t\xa9T\x11\x82\x94\xb0y85%\xb2\x1d\xbc\x13/}\xc8 \t\x8e\x9a\xd3\xc1\xe2\x10\xc5\x0c\xb6\x98\x1a$\x0e\x9aIf\x0b\x84\xe44L\xeb\n\x8b\xc1^\x04\x0bn\xce\xe2v\x03\x1b{\x0b\xd0\xa1!\xc6s\xab\x12\xec}K>\x9bC( ( O#\x8d\xb1\x91\xa1\xa9\xe4\xb4\x0e\x13\xd0{\xf6\x16 
\xb1^\x9b#\xc8^\x190\t\xbc\x08\x863\xd8/\xc6\xab\x07\x02\xc7\x08\xf1\xc8\xa1@\x086\x1b\x8ba%U4!\xe6P\x18\xc6r\x8b3\xf9U\r\x97\x10\xc0X\x18\xa4eI\xc6.\xb0=\xa5\xa1\xa4\xd0\t\xff\x91wU\x83\xe0=5<\x033\xc03\xd7\x8a\xc6)7\x10r6\xf0J<<\xea\x91#\x10`&\xfc\x84\x10\xdd\xc1\xaf\x18I\xae*\x84\x04\x8d\x81\xa4\x89\x903\xe0\x8e\xde\x10\x9c@\xc6\x80\x15\xd4\x103\xf366\x90&6E|h`\x84\xb0Jb\xbd\xb9\xc8\xba\x07\xc8\xe84\x08\x13\x10\'\xf2\xf1\n\xc6\xfe\xf2\xfe\xd6$\x92\xed\x1d\xe8,\xe5\x08\xf5\x8dF\x03BC\xb8\x11\x03\xc2\x8b\xbc\x82\xd2\x8f"Y\xd71\x08\xe8\x00-~pF\x8f&\x88\x84 \x04\xce\x97M\x00PlID\x8e\x9dp\xce\xa1\x19\xa1\xc0\xb3\\\x8b\x10`%$\xf8\xdd*\x80\x1eW\xdc%\xb1Kq\'\xd6"\\\x14%\xd7\x83\xec\x84 c\x8a\x91\x8c\xce\x02\xb0uc\xbf\x02P"T\x15h\t\x88\x1a\xa1\x8b[\x1f:\xbf\x7f%\xc0\xfaQv\xee\xc6\xa2[\x1a\x80\x86\x00\xadQ\\\x80\xf6\xd8\xc2$\xa7\xb2P\x95$jk\x07\xff\xe1P70\xbb\x1c\x88\x13#\x14@;\xc1\xa3X\x1d\x16\x7f\x07\x0f\x82p\x9b`-\x0br\xf2\xcd\x80D\x89D$HLU\xedL\x9f`U\xa3%P)\x1e@\xdb\x02\x0f\xc5\xc8\x03\x051\x04\xdc\xd6n\x12S\xc8\xec\x1c(\xfe\xb2F\xc8\x82T+\x05V!\xb0\xe0\x0b\x01U\n\xf4h\x82\x11kv\xdc;\x80\r\x85a\xcax4\x0b\x1c\x13<$\rN\xc6\xc1\xe0\xb10r\x07\xa3t\xb1\xa1\x14\xbd<\x1f\xf8\xc0|\xaf3\xe2\x1e_\xc4-\xc8\'\r\xe7#\xc4\x04\x019\xbd\xc7\x0f\x9d\x151\x00\x8d\x82qJ\\euvI\\\x08\x1e\x15\xd8-\t\x8c\xda\xb2L\x08hF\x04\'\xe4\xa3\xbd\xcc\x12\xa9\xe8\xa2\xa3\xb2W\x98sp>t\x8c\x91zh\xdep\xdb\x92\xd4\xb47\xb0\xd1?\xc1-\xf89t\n\x04\xb6\xad\x07$\x9a\xc0Y\x92\xe9\x9f\x8b\xd0\xd0\x10 Y\xcc\xfe\xc8I0\x90[\xd0F\xd3\x19i(\xa1\xfc\x15\xc3\xda\x95\xc0\xc8&\xd0i]\xf8\x08f\x80\x06&\x89\xc6\xd6\x08/\x12\xfb\x8b\xc7#\x04%\xb9\x1f\xb6D\x13(\xa6"Xg\x10KPZX?t\x7f\xc0_\xc3\t\x0eC\x97P\xc44\xdb\x17\xb9\x0b\xb6\x89\x08\x88\x86\x9c$\x8c\r\xc5\xc6\xa6\xaf\x07\xfc+x\x89"|\x1f\xc6\x8e,\x8dx\'\xd8\xae\x94zn\xa3\xd4!\xd4tF\xb8\xcc\x8a \xd9 
\x84\xb7[\x11va\x81T\xe9\x97\xd5\x08WXV\xa9\x1d\xf8\xc8\x94\xf6\t\x96\x11\x06\xc5\xc5t>%B\xbat1\x01/b\x06\xa4\xeb\x0e\xd7\tp\xc7P\x1c\x83:\xcb\x0f\x81\x93\xa2\xbe{\xe0k\x8dN||\xb0\xae\x16S${\xc3\x8d^\x17\xee\xd0\xa4\x15\xc8X\xd4\xd3\x93\x88D\x8f-\xd3\x9cj\x96.\x07\xf8@\x8c\x11\xa2L;;X\xf8\x91\x1b\xbb\xd3\x0b$mj"\x8d^0\x8a\x93K\xae\xf1\x11$@\x94\xb8\x9d|\xf1\xe7\xe7",\xb7\x81\xfa\xb1\xd9\xe3\x04L\xe0\x00\xd4\xd0\xf7\x06\xe8\xde\xcc\xe8\x18$0\xa4\x808\xc9A\x07z3>\xf2@N\xda\x00\x92\xb4A\xd3N\x8cO>\xe9\xab\xf3\x19\x8b\x1d\xd8\x03\xdf\x9e\x1e\n!\xd0\x07;\x15\xac\x98\xb0\xcd+\x92\xcf\xbb\x92\x90\x05\x1f\x85\xd4\xee!\x1b2\x16\x0c\x15\x84\xd5\xe2\xbfE:\x1e<\xb8\x02\xc3\x8bj$h\x0b\xc8Rp\xdb\x80\x19\xd0\x8al\x18.5O\x8d}\x8c%<\x9c\xb4\n\x10Z\x8c\xf8\xbb\x1eb;\xb8 \xf4$\x8b\xcb+\xca\xd1\xe9\t\xf7\x1a4GB"\xa1\xf1\xfc\xff\xbf{\xd9\xea\xdc\x8d\x18\xc3\xd9\xc0\xd9\xfc\xd9\xc9\xd8\xe1\xd9\xf0\xd9\xe8\xde\xc1\xd9\xfd\xdd\xd9c\x86\x08c\xf8\xc0q\x02H\x08\x93\xbb\xe8\x8f\x85w8\xb8\x80o\x85\xda\x0c\x89\xe4+\xa9\xecY\x00Ple\xf1\xe2s1"\xde9\x1f\xdf\xcd\xce\x1a\x0f\xd8U\xb4.yj|\x8b\xc6\xbe`\x8b\x14\xd2\xd0\x1e\x98\xed\xa4_\xb4\x7f\tr\xdeAVh\xc0\xe0\xd6\x05\x02\x8cnd\x83O\xabb\x89\xf2h\xdb\x08K\x8d\xc5\x02S\x10\xff\x90\x15\x18\x84\xda\x8a\xc3;\xc6\x1e=H4\xe8\x17\x95\x1ap\xb5\xb2\x8b\xd7s\xeb\x96\xd8\x9b\x18\xc7\x03:0\x12\xd5\x00\x8fq\x89:\n(Y\x04\x03\x91\'\xacQc\xb7\xe4\x07\x12\'\xacK&|\x11\x02;"1\xaa\x16\x03\xd7\x08\xd4\xa1\x9eH\x1et;\x1b\xe8Le\x15\x15U\x17\x1c1#\x8eA\xaa\x03\xf1\x90\x86\x0cZ\t~\xd6\x03CB\x07#2\x11\n\x88\xd3\x84v4\x06\xf3L@A\x08\x8f\x9a5S\xa2h7H\xd5\x1af\x97\xc4\xc1\rf b\x81Ec\xd12`\xfd\x80\x1a\xfb\x03\xed\x02\xcd\x1f\x9e\x9c\xe8\xd9\xfa\xdd\x18Fx\xd8zP\x8b\x03\xa4P\xbd\x84|DMG\x8c\x19W\x0e\x94\xf2\x89@\xa0\x85c\x1f\x00DVt\x89\xb7\xa0d\xe8E\xc6m\xa4\r(1\x1a\x970$uX\xdf,\'\xd5\xd1p\xbc\xa1' 2 | -------------------------------------------------------------------------------- /src/file_mapper.py: -------------------------------------------------------------------------------- 
import random

from consts import CONFICKER_FIRST_4KB

PE_HEADER_OFFSET = 2048

def _add_func_pre_and_epi(func):
    # Function prologue example
    # 55 89 e5 83 ec 0c
    # 0: 55        push ebp
    # 1: 89 e5     mov ebp,esp
    # 3: 83 ec 0c  sub esp,0xc

    # Function epilogue example
    # 89 ec 5d c3
    # 0: 89 ec  mov esp,ebp
    # 2: 5d     pop ebp
    # 3: c3     ret
    def inner(self, length):
        if length > 10:
            # Stack-frame size is a random multiple of 4 in [4, 120], which fits in one signed byte.
            start = b"\x55\x89\xe5\x83\xec" + (4 * random.randint(1, 30)).to_bytes(1, byteorder="little", signed=True)
            end = b"\x89\xec\x5d\xc3"
            # Overwrite the first and last bytes of the generated code so the total length stays the same.
            return start + func(self, length)[len(start):-1 * len(end)] + end
        else:
            return func(self, length)
    return inner

class FileMapper:
    def __init__(self, add_pe_header, malware_file):
        # Each slot holds a byte value (int), None (unassigned), or the string "reserved".
        self._byte_mapping = []
        self._malware_bytes = self._read_malware(malware_file)
        if add_pe_header:
            self._get_pe_header()

    def is_slice_empty(self, start_index, end_index):
        # Pad the mapping with None so the requested slice always exists.
        if len(self._byte_mapping) < end_index:
            self._byte_mapping += [None] * (end_index - len(self._byte_mapping))

        return all(byte is None for byte in self._byte_mapping[start_index:end_index])

    def is_slice_reserved(self, start_index, end_index):
        return all(place == "reserved" for place in self._byte_mapping[start_index:end_index])

    def place(self, byte_string, start_index, pre_reserved=False):
        # Write the bytes only if the target slice is free (or was reserved by the caller);
        # otherwise leave the mapping unchanged.
        end_index = start_index + len(byte_string)
        if self.is_slice_empty(start_index, end_index) or (pre_reserved and self.is_slice_reserved(start_index, end_index)):
            self._byte_mapping = self._byte_mapping[:start_index] + list(byte_string) + self._byte_mapping[end_index:]

    def append(self, byte_string):
        self._byte_mapping.extend(byte_string)

    def _get_first_free_spot(self, length):
        # is_slice_empty() may pad the mapping while we iterate, so the scan
        # always reaches an all-None slice unless the mapping started empty.
        for curr_index, _ in enumerate(self._byte_mapping):
            if self.is_slice_empty(curr_index, curr_index + length):
                return curr_index
        return len(self._byte_mapping)

    def reserve_first_free_spot(self, length):
        first_free_index = self._get_first_free_spot(length)
        self._byte_mapping = self._byte_mapping[:first_free_index] + (["reserved"] * length) + self._byte_mapping[first_free_index + length:]
        return first_free_index

    def _get_pe_header(self):
        self.append(self._malware_bytes[:PE_HEADER_OFFSET])

    def _read_malware(self, file_name):
        # Fall back to the bundled sample bytes when no file is given.
        if file_name:
            with open(file_name, "rb") as malware_file:
                return malware_file.read()
        else:
            return CONFICKER_FIRST_4KB

    @_add_func_pre_and_epi
    def generate_random_x86_code(self, length):
        if self.get_malware_len() - PE_HEADER_OFFSET < length:
            # Not enough bytes after the header: repeat the tail of the sample until it is long enough.
            start_index = random.randint(PE_HEADER_OFFSET, self.get_malware_len() - 1)
            mult = length // (self.get_malware_len() - start_index) + 1
            return (self._malware_bytes[start_index:self.get_malware_len()] * mult)[0:length]
        else:
            start_index = random.randint(PE_HEADER_OFFSET, self.get_malware_len() - length)
            return self._malware_bytes[start_index:start_index + length]

    def _get_none_mapping(self):
        # Collect (start, end) index pairs for every run of unassigned (None) slots.
        is_prev_none = False
        curr_start = 0
        start_end_mapping = []
        for index, byte in enumerate(self._byte_mapping):
            if byte is None and not is_prev_none:
                curr_start = index
            if byte is None:
                is_prev_none = True
            if byte is not None and is_prev_none:
                start_end_mapping.append((curr_start, index))
                is_prev_none = False

        # A run that reaches the end of the mapping has no closing byte, so record it here;
        # otherwise trailing None slots would stay unfilled.
        if is_prev_none:
            start_end_mapping.append((curr_start, len(self._byte_mapping)))

        return start_end_mapping

    def fill_empty_with_code(self):
        none_mapping = self._get_none_mapping()

        for start_index, end_index in none_mapping:
            self.place(self.generate_random_x86_code(end_index - start_index), start_index)

    def get_as_bytestream(self):
        # Every slot must hold a byte value by now; None or "reserved" entries make bytes() raise.
        return bytes(self._byte_mapping)

    def get_file_len(self):
        return len(self._byte_mapping)

    def get_malware_len(self):
        return len(self._malware_bytes)

--------------------------------------------------------------------------------