├── .gitignore ├── .pre-commit-config.yaml ├── LICENSE ├── README.md ├── data ├── evaded │ ├── .gitkeep │ ├── ember │ │ └── .gitkeep │ └── malconv │ │ └── .gitkeep └── logs │ └── .gitkeep ├── dockerfile.cpu ├── download_deps.py ├── malware_rl ├── __init__.py └── envs │ ├── __init__.py │ ├── controls │ ├── __init__.py │ ├── good_strings │ │ └── .gitkeep │ ├── modifier.py │ ├── section_names.txt │ ├── small_dll_imports.json │ └── trusted │ │ └── .gitkeep │ ├── ember_gym.py │ ├── malconv_gym.py │ ├── sorel_gym.py │ └── utils │ ├── __init__.py │ ├── ember.py │ ├── interface.py │ ├── malconv.h5 │ ├── malconv.py │ ├── samples │ └── .gitkeep │ └── sorel.py ├── ppo.py ├── random_agent.py ├── requirements.txt └── stable_baselines_env_check.py /.gitignore: -------------------------------------------------------------------------------- 1 | * 2 | !/**/ 3 | !*.* 4 | 5 | __pycache__/ 6 | *.py[cod] 7 | .DS_Store 8 | .vscode/ 9 | data/* 10 | *.ipynb_checkpoints/ 11 | *.exe 12 | *.txt 13 | .idea 14 | *.model 15 | -------------------------------------------------------------------------------- /.pre-commit-config.yaml: -------------------------------------------------------------------------------- 1 | repos: 2 | - repo: https://github.com/pre-commit/pre-commit-hooks 3 | rev: v4.0.1 4 | hooks: 5 | - id: trailing-whitespace 6 | - id: end-of-file-fixer 7 | - id: check-docstring-first 8 | - id: check-json 9 | - id: check-yaml 10 | - id: debug-statements 11 | - id: requirements-txt-fixer 12 | - repo: https://github.com/asottile/pyupgrade 13 | rev: v2.19.4 14 | hooks: 15 | - id: pyupgrade 16 | args: [--py36-plus] 17 | - repo: https://github.com/asottile/add-trailing-comma 18 | rev: v2.1.0 19 | hooks: 20 | - id: add-trailing-comma 21 | args: [--py36-plus] 22 | - repo: meta 23 | hooks: 24 | - id: check-hooks-apply 25 | - id: check-useless-excludes 26 | - repo: https://github.com/pre-commit/mirrors-isort 27 | rev: v5.8.0 28 | hooks: 29 | - id: isort 30 | - repo: 
https://github.com/ambv/black 31 | rev: 21.6b0 32 | hooks: 33 | - id: black 34 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2020 Bobby Filar 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # MalwareRL 2 | > Malware Bypass Research using Reinforcement Learning 3 | 4 | ## Background 5 | This is a malware manipulation environment using OpenAI's gym environments. The core idea is based on paper "Learning to Evade Static PE Machine Learning Malware Models via Reinforcement Learning" 6 | ([paper](https://arxiv.org/abs/1801.08917)). I am extending the original repo because: 7 | 1. 
It is no longer maintained 8 | 2. It uses Python 2 and an outdated version of LIEF 9 | 3. I wanted to integrate new Malware gym environments and additional manipulations 10 | 11 | Over the past three years there have been breakthrough open-source projects published in the security ML space. In particular, [Ember](https://github.com/endgameinc/ember) (Endgame Malware BEnchmark for Research) ([paper](https://arxiv.org/abs/1804.04637)) and MalConv ("Malware Detection by Eating a Whole EXE", [paper](https://arxiv.org/abs/1710.09435)) have given security researchers the ability to develop sophisticated, reproducible models that emulate the features/techniques found in NGAVs. 12 | 13 | ## MalwareRL Gym Environment 14 | MalwareRL exposes `gym` environments for the Ember, MalConv and SOREL-20M classifiers to allow researchers to develop Reinforcement Learning agents that bypass malware classifiers. Actions include a variety of non-breaking (i.e. the binaries will still execute) modifications to the PE header, sections, imports and overlay, and are listed below.
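Each action name in the table below maps one-to-one onto a method of the `ModifyBinary` class in `malware_rl/envs/controls/modifier.py`. The snippet below is a runnable sketch of that dispatch pattern only; the stub class and its byte-level `pad_overlay` stand in for the real LIEF-backed modifier and are not the repo's implementation.

```python
import random

# Stub standing in for malware_rl.envs.controls.modifier.ModifyBinary,
# whose real methods rewrite the PE with LIEF. Only one action is
# sketched here, at the raw-byte level, so the example stays runnable.
class ModifyBinary:
    def __init__(self, bytez):
        self.bytez = bytez

    def pad_overlay(self):
        # Append a run of one randomly chosen byte value to the overlay
        self.bytez += bytes([random.randrange(256)]) * 1000
        return self.bytez

# Action names resolve to method names on the modifier (as in ACTION_TABLE)
ACTION_TABLE = {"pad_overlay": "pad_overlay"}

def take_action(bytez, action_name):
    # Look up the method named by the action and apply it to the sample
    modifier = ModifyBinary(bytez)
    return getattr(modifier, ACTION_TABLE[action_name])()
```

Every action returns the full modified byte string, so actions compose: the output of one step is the input to the next.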
15 | 16 | ### Action Space 17 | ``` 18 | ACTION_TABLE = { 19 | 'modify_machine_type': 'modify_machine_type', 20 | 'pad_overlay': 'pad_overlay', 21 | 'append_benign_data_overlay': 'append_benign_data_overlay', 22 | 'append_benign_binary_overlay': 'append_benign_binary_overlay', 23 | 'add_bytes_to_section_cave': 'add_bytes_to_section_cave', 24 | 'add_section_strings': 'add_section_strings', 25 | 'add_section_benign_data': 'add_section_benign_data', 26 | 'add_strings_to_overlay': 'add_strings_to_overlay', 27 | 'add_imports': 'add_imports', 28 | 'rename_section': 'rename_section', 29 | 'remove_debug': 'remove_debug', 30 | 'modify_optional_header': 'modify_optional_header', 31 | 'modify_timestamp': 'modify_timestamp', 32 | 'break_optional_header_checksum': 'break_optional_header_checksum', 33 | 'upx_unpack': 'upx_unpack', 34 | 'upx_pack': 'upx_pack' 35 | } 36 | ``` 37 | 38 | ### Observation Space 39 | The `observation_space` of the `gym` environments is an array representing the feature vector. For Ember this is a `numpy` array of length 2381; for MalConv it is of length 1024**2 (the binary's raw bytes). The MalConv gym presents an opportunity to try RL techniques that generalize learning across large state spaces. 40 | 41 | ### Agents 42 | A baseline agent, `RandomAgent`, is provided to demonstrate how to interact with the `gym` environments and what output to expect. This agent attempts to evade the classifier by randomly selecting an action. This process is repeated up to the length of a game (e.g. 50 mods). If the modified binary scores below the classifier threshold, we register it as an evasion. In many ways the `RandomAgent` acts as a fuzzer, trying a bunch of actions with no regard for minimizing the number of modifications to the resulting binary. 43 | 44 | Additional agents will be developed and made available (both model and code) in the coming weeks.
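The `RandomAgent` episode loop described above can be sketched in a few lines. Everything in this snippet is illustrative: `score_fn` is a toy stand-in for the Ember/MalConv classifier score (not the real models), and the 50-turn budget matches the `MAXTURNS` default used when the environments are registered.

```python
import random

MAXTURNS = 50  # max modifications per episode, matching the repo default

def random_agent_episode(score_fn, actions, threshold, rng):
    """Apply randomly chosen actions until the classifier score drops
    below the threshold (an evasion) or the turn budget runs out.
    Returns (evaded, episode_length)."""
    score = 1.0  # stand-in score for the unmodified malware sample
    for turn in range(1, MAXTURNS + 1):
        action = rng.choice(actions)
        score = score_fn(score, action)  # re-score the modified binary
        if score < threshold:
            return True, turn
    return False, MAXTURNS

# Toy scorer: every modification knocks a fixed amount off the score.
evaded, ep_len = random_agent_episode(
    score_fn=lambda score, action: score - 0.5,
    actions=["pad_overlay", "add_imports"],
    threshold=0.8,
    rng=random.Random(0),
)
```

An evasion rate is then just the fraction of sampled binaries whose episodes end with `evaded == True`, and `avg_ep_len` the mean episode length, which is how the numbers in Table 1 are tallied.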
45 | 46 | **Table 1:** _Evasion Rate against Ember Holdout Dataset_* 47 | | gym | agent | evasion_rate | avg_ep_len | 48 | | --- | ----- | ------------ | ---------- | 49 | | ember | RandomAgent | 89.2% | 8.2 | 50 | | malconv | RandomAgent | 88.5% | 16.33 | 51 | 52 | \ 53 | \* _250 random samples_ 54 | 55 | ## Setup 56 | To get `malware_rl` up and running you will need the following external dependencies: 57 | - [LIEF](https://lief.quarkslab.com/) 58 | - [Ember](https://github.com/Azure/2020-machine-learning-security-evasion-competition/blob/master/defender/defender/models/ember_model.txt.gz), [Malconv](https://github.com/endgameinc/ember/blob/master/malconv/malconv.h5) and [SOREL-20M](https://github.com/sophos-ai/SOREL-20M) models. All of these need to be placed into the `malware_rl/envs/utils/` directory. 59 | > The SOREL-20M model requires the `aws-cli` to download. When accessing the AWS S3 bucket, look in the `sorel-20m-model/checkpoints/lightGBM` folder and fish out any of the models in the `seed` folders. The model file will need to be renamed to `sorel.model` and placed into `malware_rl/envs/utils` alongside the other models. 60 | - UPX, which has been added to support the pack/unpack modifications. Download the binary [here](https://upx.github.io/) and place it in the `malware_rl/envs/controls` directory. 61 | - Benign binaries - a small set of "trusted" binaries (e.g. grabbed from a base Windows installation); you can also download some from the MSFT website ([example](https://download.microsoft.com/download/a/c/1/ac1ac039-088b-4024-833e-28f61e01f102/NETFX1.1_bootstrapper.exe)). Store these binaries in `malware_rl/envs/controls/trusted` 62 | - Run the `strings` command on those binaries and save the output as `.txt` files in `malware_rl/envs/controls/good_strings` 63 | - Download a set of malware from VirusShare or VirusTotal.
I just used a list of hashes from the Ember dataset 64 | 65 | **Note:** The helper script `download_deps.py` can be used as a quickstart to get most of the key dependencies set up. 66 | 67 | I used a [conda](https://docs.conda.io/en/latest/) env set up for Python 3.7: 68 | 69 | `conda create -n malware_rl python=3.7` 70 | 71 | Finally, install the Python 3 dependencies from `requirements.txt`: 72 | 73 | `pip3 install -r requirements.txt` 74 | 75 | ## References 76 | There are a bunch of good papers/blog posts on manipulating binaries to evade ML classifiers. I compiled a few that inspired portions of this project below. Also, I have inevitably left out other pertinent research, so if there is something that should be in here, let me know in a Git Issue or hit me up on Twitter ([@filar](https://twitter.com/filar)). 77 | ### Papers 78 | - Demetrio, Luca, et al. "Efficient Black-box Optimization of Adversarial Windows Malware with Constrained Manipulations." arXiv preprint arXiv:2003.13526 (2020). ([paper](https://arxiv.org/abs/2003.13526)) 79 | - Demetrio, Luca, et al. "Adversarial EXEmples: A Survey and Experimental Evaluation of Practical Attacks on Machine Learning for Windows Malware Detection." arXiv preprint arXiv:2008.07125 (2020). ([paper](https://arxiv.org/abs/2008.07125)) 80 | - Song, Wei, et al. "Automatic Generation of Adversarial Examples for Interpreting Malware Classifiers." arXiv preprint arXiv:2003.03100 (2020). 81 | ([paper](https://arxiv.org/abs/2003.03100)) 82 | - Suciu, Octavian, Scott E. Coull, and Jeffrey Johns. "Exploring adversarial examples in malware detection." 2019 IEEE Security and Privacy Workshops (SPW). IEEE, 2019. ([paper](https://arxiv.org/abs/1810.08280)) 83 | - Fleshman, William, et al. "Static malware detection & subterfuge: Quantifying the robustness of machine learning and current anti-virus." 2018 13th International Conference on Malicious and Unwanted Software (MALWARE). IEEE, 2018.
([paper](https://arxiv.org/abs/1806.04773)) 84 | - Pierazzi, Fabio, et al. "Intriguing properties of adversarial ML attacks in the problem space." 2020 IEEE Symposium on Security and Privacy (SP). IEEE, 2020. ([paper/code](https://s2lab.kcl.ac.uk/projects/intriguing/)) 85 | - Fang, Zhiyang, et al. "Evading anti-malware engines with deep reinforcement learning." IEEE Access 7 (2019): 48867-48879. ([paper](https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=8676031)) 86 | 87 | ### Blog Posts 88 | - [Evading Machine Learning Malware Classifiers: for fun and profit!](https://towardsdatascience.com/evading-machine-learning-malware-classifiers-ce52dabdb713) 89 | - [Cylance, I Kill You!](https://skylightcyber.com/2019/07/18/cylance-i-kill-you/) 90 | - [Machine Learning Security Evasion Competition 2020](https://msrc-blog.microsoft.com/2020/06/01/machine-learning-security-evasion-competition-2020-invites-researchers-to-defend-and-attack/) 91 | - [ML evasion contest – the AV tester’s perspective](https://www.mrg-effitas.com/research/machine-learning-evasion-contest-the-av-testers-perspective/) 92 | 93 | ### Talks 94 | - 42: The answer to life the universe and everything offensive security by Will Pearce, Nick Landers ([slides](https://github.com/moohax/Talks/blob/master/slides/DerbyCon19.pdf)) 95 | - Bot vs. 
Bot: Evading Machine Learning Malware Detection by Hyrum Anderson ([slides](https://www.blackhat.com/docs/us-17/thursday/us-17-Anderson-Bot-Vs-Bot-Evading-Machine-Learning-Malware-Detection.pdf)) 96 | - Trying to Make Meterpreter into an Adversarial Example by Andy Applebaum ([slides](https://www.camlis.org/2019/talks/applebaum)) 97 | -------------------------------------------------------------------------------- /data/evaded/.gitkeep: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/bfilar/malware_rl/300f47ff2240132a449277283807426d34271993/data/evaded/.gitkeep -------------------------------------------------------------------------------- /data/evaded/ember/.gitkeep: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/bfilar/malware_rl/300f47ff2240132a449277283807426d34271993/data/evaded/ember/.gitkeep -------------------------------------------------------------------------------- /data/evaded/malconv/.gitkeep: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/bfilar/malware_rl/300f47ff2240132a449277283807426d34271993/data/evaded/malconv/.gitkeep -------------------------------------------------------------------------------- /data/logs/.gitkeep: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/bfilar/malware_rl/300f47ff2240132a449277283807426d34271993/data/logs/.gitkeep -------------------------------------------------------------------------------- /dockerfile.cpu: -------------------------------------------------------------------------------- 1 | # Dockerfile to stand up malware-rl and all associated dependencies. 2 | # Only uses CPU tensorflow. Would need to be rebuilt in order to take 3 | # advantage of a GPU. 
4 | FROM ubuntu:20.04 5 | 6 | LABEL maintainer="br0kej@protonmail.com" 7 | 8 | WORKDIR /home 9 | 10 | RUN apt-get update 11 | 12 | RUN apt-get install -y git python3 python3-pip python3-virtualenv upx subversion binutils 13 | 14 | RUN git clone https://github.com/bfilar/malware_rl.git 15 | 16 | RUN cd malware_rl && pip3 install -r requirements.txt && python3 download_deps.py --accept --force 17 | 18 | CMD bash 19 | -------------------------------------------------------------------------------- /download_deps.py: -------------------------------------------------------------------------------- 1 | """ 2 | This script does the following: 3 | 4 | 1. Downloads a small repository of known bad stuff (14 bad things) and 5 | saves to temporary directory. The ransomware folder from the 6 | https://github.com/Endermanch/MalwareDatabase/ repo. 7 | 2. Unzips the samples into the correct directory for the environment 8 | (malware_rl/envs/utils/samples). 9 | 3. Renames each sample to its corresponding SHA256 hash. 10 | 4. 
Removes temporary malware directory 11 | """ 12 | 13 | import argparse 14 | import glob 15 | import gzip 16 | import hashlib 17 | import os 18 | import shutil 19 | import subprocess 20 | import sys 21 | import urllib.request 22 | import zipfile 23 | 24 | # Third Party Libraries 25 | import svn.remote 26 | 27 | MODULE_PATH = os.path.split(os.path.abspath(sys.modules[__name__].__file__))[0] 28 | UTIL_PATH = os.path.join(MODULE_PATH, "malware_rl/envs/utils/") 29 | SAMPLE_PATH = os.path.join(MODULE_PATH, "malware_rl/envs/utils/samples/") 30 | ZIP_PASSWORD = "mysubsarethebest" 31 | DEFAULT_MALWARE_REPOS = [ 32 | "https://github.com/Endermanch/MalwareDatabase/trunk/ransomwares", 33 | "https://github.com/Endermanch/MalwareDatabase/trunk/rogues", 34 | "https://github.com/Endermanch/MalwareDatabase/trunk/trojans", 35 | "https://github.com/Endermanch/MalwareDatabase/trunk/jokes", 36 | ] 37 | TEMP_SAMPLE_PATHS = ["ransomwares/", "rogues/", "trojans/", "jokes/"] 38 | BENIGN_REPO = "https://github.com/xournalpp/xournalpp/releases/download/1.0.18/xournalpp-1.0.18-windows.zip" 39 | EMBER_MODEL_PATH = "https://raw.githubusercontent.com/Azure/2020-machine-learning-security-evasion-competition/master/defender/defender/models/ember_model.txt.gz" 40 | 41 | 42 | def retrive_url(source_file_url=None, filename=None): 43 | """ 44 | Retrieves a file from a URL, skipping the download if it is already present 45 | """ 46 | if os.path.exists(filename): 47 | print(f"[-] {filename} already present. Skipping") 48 | else: 49 | urllib.request.urlretrieve(source_file_url, filename) 50 | 51 | 52 | def download_specific_github_file( 53 | source_file_url=None, 54 | filename=None, 55 | storage_directory=None, 56 | ): 57 | """ 58 | Downloads a specific file from a github repo.
59 | If gzipped, decompresses and drops into directory 60 | """ 61 | retrive_url(source_file_url, filename) 62 | shutil.move( 63 | os.path.join(os.getcwd(), filename), 64 | os.path.join(storage_directory, filename), 65 | ) 66 | 67 | if os.path.join(storage_directory, filename).endswith(".gz"): 68 | split_filename = os.path.splitext(filename)[0] 69 | 70 | with gzip.open(os.path.join(UTIL_PATH, filename), "r") as f_in, open( 71 | os.path.join(UTIL_PATH, split_filename), 72 | "wb", 73 | ) as f_out: 74 | shutil.copyfileobj(f_in, f_out) 75 | os.remove(os.path.join(UTIL_PATH, filename)) 76 | print("[+] Success - Ember Model downloaded") 77 | 78 | 79 | def download_specific_git_repo_directory(temp_path=None, source_repo=None): 80 | """ 81 | Downloads a specific directory within a git repo. 82 | """ 83 | if os.path.exists(temp_path) is False: 84 | repo = svn.remote.RemoteClient(source_repo) 85 | 86 | try: 87 | repo.checkout(source_repo) 88 | print( 89 | "[+] Success - Samples Downloaded " "Placed into Temp Directory", 90 | ) 91 | 92 | except svn.exception.SvnException: 93 | print( 94 | """ 95 | Subversion not found. In order to download the sample malware, 96 | Subversion (svn) needs to be installed. This provides a method of 97 | downloading only the target folder rather than the whole repo. 
98 | """ 99 | ) 100 | 101 | 102 | def unzip_file(filename=None, source_zip=None, password=False): 103 | """ 104 | Unzips a .zip file 105 | """ 106 | try: 107 | if password: 108 | with zipfile.ZipFile(filename, "r") as file: 109 | file.extractall( 110 | source_zip, 111 | pwd=bytes(ZIP_PASSWORD, "utf-8"), 112 | ) 113 | else: 114 | with zipfile.ZipFile(filename, "r") as file: 115 | file.extractall(source_zip) 116 | except (zipfile.BadZipFile, RuntimeError):  # corrupt archive or wrong password 117 | pass 118 | 119 | 120 | def unzip_samples(temp_sample_path=None, sample_path=None): 121 | """ 122 | Unzips all .zip's within the target directory 123 | """ 124 | if os.path.exists(temp_sample_path): 125 | target_path_contents = glob.glob( 126 | os.path.join( 127 | os.getcwd(), 128 | temp_sample_path + "*.zip", 129 | ), 130 | ) 131 | for filename in target_path_contents: 132 | unzip_file(filename, sample_path, password=True) 133 | 134 | print("[+] Success - Samples Unzipped") 135 | 136 | 137 | def rename_samples_to_sha256_hash(sample_path=None): 138 | """ 139 | Renames all malware files within a target directory to their 140 | SHA256 hash 141 | """ 142 | for files in glob.glob(os.path.join(sample_path, "*")): 143 | sha256_hash = hashlib.sha256() 144 | with open(files, "rb") as file: 145 | for byte_block in iter(lambda: file.read(4096), b""): 146 | sha256_hash.update(byte_block) 147 | computed_hash = sha256_hash.hexdigest() 148 | os.rename(files, os.path.join(sample_path, computed_hash)) 149 | print("[+] Success - Samples renamed to their SHA256 hash") 150 | 151 | 152 | def clean_up_temp_samples_dir(directory_to_remove=None): 153 | """ 154 | Clean up temporary samples directory 155 | """ 156 | if os.path.exists(directory_to_remove): 157 | shutil.rmtree(directory_to_remove) 158 | print(f"[+] Cleanup Complete - {directory_to_remove} has been removed") 159 | 160 | 161 | def check_if_samples_exist(directory_to_check=None): 162 | """ 163 | Returns True if the samples directory is empty 164 | """ 165 | if len(os.listdir(directory_to_check)) == 0: 166 | 
return True 167 | else: 168 | return False 169 | 170 | 171 | def generate_example_benign_strings_output(benign_repo=None, output_dir=None): 172 | """ 173 | Downloads a sample open source windows application and 174 | generates strings output 175 | """ 176 | output_zip = benign_repo.split("/")[-1] 177 | output_filename = "".join(output_zip.split(".")[:-1]) 178 | 179 | retrive_url(benign_repo, output_zip) 180 | unzip_file(output_zip) 181 | os.remove(output_zip) 182 | 183 | file = open( 184 | "./malware_rl/envs/controls/good_strings/xournal-strings.txt", 185 | "w", 186 | ) 187 | unzipped_filename = glob.glob("xournalpp-*")[0] 188 | subprocess.run(["strings", unzipped_filename], stdout=file) 189 | shutil.move( 190 | os.path.join(MODULE_PATH, unzipped_filename), 191 | os.path.join( 192 | MODULE_PATH, 193 | "malware_rl/envs/controls/trusted/" + unzipped_filename, 194 | ), 195 | ) 196 | 197 | 198 | if __name__ == "__main__": 199 | parser = argparse.ArgumentParser( 200 | description="A small utility that helps with the downloading of the requirements for the malware-rl environment", 201 | ) 202 | parser.add_argument( 203 | "--accept", 204 | help="accept liability for downloading bad things", 205 | required=False, 206 | action="store_true", 207 | ) 208 | parser.add_argument( 209 | "--force", 210 | help="forces the download even if samples directory is" "not empty", 211 | action="store_true", 212 | ) 213 | parser.add_argument( 214 | "--clean", 215 | help="deletes the contents of the samples directory", 216 | action="store_true", 217 | ) 218 | parser.add_argument( 219 | "--strings", 220 | help="download goodware windows executable and generate text file containing strings output", 221 | action="store_true", 222 | ) 223 | args = parser.parse_args() 224 | 225 | if args.clean: 226 | for sample in glob.glob(os.path.join(SAMPLE_PATH, "*")): 227 | os.remove(sample) 228 | 229 | if args.strings: 230 | generate_example_benign_strings_output( 231 | benign_repo=BENIGN_REPO, 232 | 
output_dir=MODULE_PATH, 233 | ) 234 | 235 | if args.accept: 236 | if check_if_samples_exist(directory_to_check=SAMPLE_PATH) or args.force: 237 | for temp_sample_path, malware_repo in zip( 238 | TEMP_SAMPLE_PATHS, 239 | DEFAULT_MALWARE_REPOS, 240 | ): 241 | print( 242 | f"[*] Attempting to Download {temp_sample_path} Samples & Place in Temp Directory", 243 | ) 244 | download_specific_git_repo_directory( 245 | temp_path=temp_sample_path, 246 | source_repo=malware_repo, 247 | ) 248 | print("[*] Attempting to Unzip Samples") 249 | unzip_samples( 250 | temp_sample_path=temp_sample_path, 251 | sample_path=SAMPLE_PATH, 252 | ) 253 | print("[*] Attempting to Rename Files to SHA256 Hash") 254 | rename_samples_to_sha256_hash(sample_path=SAMPLE_PATH) 255 | print("[*] Attempting Clean Up") 256 | clean_up_temp_samples_dir(directory_to_remove=temp_sample_path) 257 | 258 | print("[*] Attempting to Download Ember Model") 259 | download_specific_github_file( 260 | source_file_url=EMBER_MODEL_PATH, 261 | filename="ember_model.txt.gz", 262 | storage_directory=UTIL_PATH, 263 | ) 264 | print("[+] Success - Ember Model Downloaded") 265 | print("[*] Attempting to generate example benign strings output") 266 | generate_example_benign_strings_output( 267 | benign_repo=BENIGN_REPO, 268 | output_dir=MODULE_PATH, 269 | ) 270 | print("[+] Success - Example Strings Output Generated") 271 | 272 | else: 273 | print( 274 | "[-] It looks like there is something in your samples " 275 | "directory (malware_rl/envs/utils/samples) already, aborting " 276 | "download. 
Use the --force flag to continue download", 277 | ) 278 | -------------------------------------------------------------------------------- /malware_rl/__init__.py: -------------------------------------------------------------------------------- 1 | from gym.envs.registration import register 2 | from sklearn.model_selection import train_test_split 3 | 4 | from malware_rl.envs.utils import interface 5 | 6 | # create a holdout set 7 | sha256 = interface.get_available_sha256() 8 | sha256_train, sha256_holdout = train_test_split(sha256, test_size=40) 9 | 10 | MAXTURNS = 50 11 | 12 | register( 13 | id="malconv-train-v0", 14 | entry_point="malware_rl.envs:MalConvEnv", 15 | kwargs={ 16 | "random_sample": True, 17 | "maxturns": MAXTURNS, 18 | "sha256list": sha256_train, 19 | }, 20 | ) 21 | 22 | register( 23 | id="malconv-test-v0", 24 | entry_point="malware_rl.envs:MalConvEnv", 25 | kwargs={ 26 | "random_sample": False, 27 | "maxturns": MAXTURNS, 28 | "sha256list": sha256_holdout, 29 | }, 30 | ) 31 | 32 | register( 33 | id="ember-train-v0", 34 | entry_point="malware_rl.envs:EmberEnv", 35 | kwargs={ 36 | "random_sample": True, 37 | "maxturns": MAXTURNS, 38 | "sha256list": sha256_train, 39 | }, 40 | ) 41 | 42 | register( 43 | id="ember-test-v0", 44 | entry_point="malware_rl.envs:EmberEnv", 45 | kwargs={ 46 | "random_sample": False, 47 | "maxturns": MAXTURNS, 48 | "sha256list": sha256_holdout, 49 | }, 50 | ) 51 | 52 | register( 53 | id="sorel-train-v0", 54 | entry_point="malware_rl.envs:SorelEnv", 55 | kwargs={ 56 | "random_sample": True, 57 | "maxturns": MAXTURNS, 58 | "sha256list": sha256_train, 59 | }, 60 | ) 61 | 62 | register( 63 | id="sorel-test-v0", 64 | entry_point="malware_rl.envs:SorelEnv", 65 | kwargs={ 66 | "random_sample": False, 67 | "maxturns": MAXTURNS, 68 | "sha256list": sha256_holdout, 69 | }, 70 | ) 71 | -------------------------------------------------------------------------------- /malware_rl/envs/__init__.py: 
-------------------------------------------------------------------------------- 1 | from malware_rl.envs import utils 2 | from malware_rl.envs.ember_gym import EmberEnv 3 | from malware_rl.envs.malconv_gym import MalConvEnv 4 | from malware_rl.envs.sorel_gym import SorelEnv 5 | -------------------------------------------------------------------------------- /malware_rl/envs/controls/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/bfilar/malware_rl/300f47ff2240132a449277283807426d34271993/malware_rl/envs/controls/__init__.py -------------------------------------------------------------------------------- /malware_rl/envs/controls/good_strings/.gitkeep: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/bfilar/malware_rl/300f47ff2240132a449277283807426d34271993/malware_rl/envs/controls/good_strings/.gitkeep -------------------------------------------------------------------------------- /malware_rl/envs/controls/modifier.py: -------------------------------------------------------------------------------- 1 | import array 2 | import json 3 | import os 4 | import random 5 | import subprocess 6 | import sys 7 | import tempfile 8 | from os import listdir 9 | from os.path import isfile, join 10 | 11 | import lief 12 | 13 | module_path = os.path.split(os.path.abspath(sys.modules[__name__].__file__))[0] 14 | 15 | COMMON_SECTION_NAMES = ( 16 | open( 17 | os.path.join( 18 | module_path, 19 | "section_names.txt", 20 | ), 21 | ) 22 | .read() 23 | .rstrip() 24 | .split("\n") 25 | ) 26 | COMMON_IMPORTS = json.load( 27 | open(os.path.join(module_path, "small_dll_imports.json")), 28 | ) 29 | 30 | 31 | class ModifyBinary: 32 | def __init__(self, bytez): 33 | self.bytez = bytez 34 | self.trusted_path = module_path + "/trusted/" 35 | self.good_str_path = module_path + "/good_strings/" 36 | 37 | def _randomly_select_trusted_file(self): 38 | 
return random.choice( 39 | [ 40 | join(self.trusted_path, f) 41 | for f in listdir(self.trusted_path) 42 | if (f != ".gitkeep") and (isfile(join(self.trusted_path, f))) 43 | ], 44 | ) 45 | 46 | def _randomly_select_good_strings(self): 47 | good_strings = random.choice( 48 | [ 49 | join(self.good_str_path, f) 50 | for f in listdir(self.good_str_path) 51 | if (f != ".gitkeep") and (isfile(join(self.good_str_path, f))) 52 | ], 53 | ) 54 | 55 | with open(good_strings) as f: 56 | strings = f.read() 57 | 58 | return strings 59 | 60 | def _random_length(self): 61 | return 2 ** random.randint(5, 8) 62 | 63 | def _search_cave( 64 | self, 65 | name, 66 | body, 67 | file_offset, 68 | vaddr, 69 | cave_size=128, 70 | _bytes=b"\x00", 71 | ): 72 | found_caves = [] 73 | null_count = 0 74 | size = len(body) 75 | 76 | for offset in range(size): 77 | byte = body[offset] 78 | check = False 79 | 80 | if byte in _bytes: 81 | null_count += 1 82 | else: 83 | check = True 84 | 85 | if offset == size - 1: 86 | check = True 87 | offset += 1 88 | 89 | if check: 90 | if null_count >= cave_size: 91 | cave_start = file_offset + offset - null_count 92 | cave_end = file_offset + offset 93 | cave_size = null_count 94 | found_caves.append([cave_start, cave_end, cave_size]) 95 | null_count = 0 96 | return found_caves 97 | 98 | def _binary_to_bytez(self, binary, imports=False): 99 | # Write modified binary to disk 100 | builder = lief.PE.Builder(binary) 101 | builder.build_imports(imports) 102 | builder.build() 103 | 104 | self.bytez = array.array("B", builder.get_build()).tobytes() 105 | return self.bytez 106 | 107 | def rename_section(self): 108 | binary = lief.PE.parse(list(self.bytez)) 109 | targeted_section = random.choice(binary.sections) 110 | targeted_section.name = random.choice(COMMON_SECTION_NAMES)[:5] 111 | 112 | self.bytez = self._binary_to_bytez(binary) 113 | return self.bytez 114 | 115 | def add_bytes_to_section_cave(self): 116 | caves = [] 117 | binary = lief.PE.parse(list(self.bytez)) 
118 | base_addr = binary.optional_header.imagebase 119 | for section in binary.sections: 120 | section_offset = section.pointerto_raw_data 121 | vaddr = section.virtual_address + base_addr 122 | body = bytearray(section.content) 123 | 124 | if section.sizeof_raw_data > section.virtual_size: 125 | body.extend( 126 | list(b"\x00" * (section.sizeof_raw_data - section.virtual_size)), 127 | ) 128 | 129 | caves.extend( 130 | self._search_cave( 131 | section.name, 132 | body, 133 | section_offset, 134 | vaddr, 135 | ), 136 | ) 137 | 138 | if caves: 139 | random_selected_cave = random.choice(caves) 140 | upper = random.randrange(256) 141 | add_bytes = bytearray( 142 | random.randint(0, upper) for _ in range(random_selected_cave[-1]) 143 | ) 144 | self.bytez = ( 145 | self.bytez[: random_selected_cave[0]] 146 | + add_bytes 147 | + self.bytez[random_selected_cave[1] :] 148 | ) 149 | 150 | return self.bytez 151 | 152 | def modify_machine_type(self): 153 | binary = lief.PE.parse(list(self.bytez)) 154 | binary.header.machine = random.choice( 155 | [ 156 | lief.PE.MACHINE_TYPES.AMD64, 157 | lief.PE.MACHINE_TYPES.IA64, 158 | lief.PE.MACHINE_TYPES.ARM64, 159 | lief.PE.MACHINE_TYPES.POWERPC, 160 | ], 161 | ) 162 | 163 | self.bytez = self._binary_to_bytez(binary) 164 | 165 | return self.bytez 166 | 167 | def modify_timestamp(self): 168 | binary = lief.PE.parse(list(self.bytez)) 169 | binary.header.time_date_stamps = random.choice( 170 | [ 171 | 0, 172 | 868967292, 173 | 993636360, 174 | 587902357, 175 | 872078556, 176 | ], 177 | ) 178 | 179 | self.bytez = self._binary_to_bytez(binary) 180 | 181 | return self.bytez 182 | 183 | def pad_overlay(self): 184 | byte_pattern = random.choice([i for i in range(256)]) 185 | overlay = bytearray([byte_pattern] * 100000) 186 | self.bytez += overlay 187 | 188 | return self.bytez 189 | 190 | def append_benign_data_overlay(self): 191 | random_benign_file = self._randomly_select_trusted_file() 192 | benign_binary = lief.PE.parse(random_benign_file) 
193 | benign_binary_section_content = benign_binary.get_section( 194 | ".text", 195 | ).content 196 | overlay = bytearray(benign_binary_section_content) 197 | self.bytez += overlay 198 | 199 | return self.bytez 200 | 201 | def append_benign_binary_overlay(self): 202 | random_benign_file = self._randomly_select_trusted_file() 203 | 204 | with open(random_benign_file, "rb") as f: 205 | benign_binary = f.read() 206 | self.bytez += benign_binary 207 | 208 | return self.bytez 209 | 210 | def add_section_benign_data(self): 211 | random_benign_file = self._randomly_select_trusted_file() 212 | benign_binary = lief.PE.parse(random_benign_file) 213 | benign_binary_section_content = benign_binary.get_section( 214 | ".text", 215 | ).content 216 | 217 | binary = lief.PE.parse(list(self.bytez)) 218 | 219 | current_section_names = [section.name for section in binary.sections] 220 | available_section_names = list( 221 | set(COMMON_SECTION_NAMES) - set(current_section_names), 222 | ) 223 | section = lief.PE.Section(random.choice(available_section_names)) 224 | section.content = benign_binary_section_content 225 | binary.add_section(section, lief.PE.SECTION_TYPES.DATA) 226 | 227 | self.bytez = self._binary_to_bytez(binary) 228 | return self.bytez 229 | 230 | def add_section_strings(self): 231 | good_strings = self._randomly_select_good_strings() 232 | binary = lief.PE.parse(list(self.bytez)) 233 | 234 | current_section_names = [section.name for section in binary.sections] 235 | available_section_names = list( 236 | set(COMMON_SECTION_NAMES) - set(current_section_names), 237 | ) 238 | section = lief.PE.Section(random.choice(available_section_names)) 239 | section.content = [ord(c) for c in good_strings] 240 | binary.add_section(section, lief.PE.SECTION_TYPES.DATA) 241 | 242 | self.bytez = self._binary_to_bytez(binary) 243 | return self.bytez 244 | 245 | def add_strings_to_overlay(self): 246 | """ 247 | Open a txt file of strings from low scoring binaries. 
248 | https://skylightcyber.com/2019/07/18/cylance-i-kill-you/ 249 | """ 250 | good_strings = self._randomly_select_good_strings() 251 | self.bytez += bytes(good_strings, encoding="ascii") 252 | 253 | return self.bytez 254 | 255 | def add_imports(self): 256 | binary = lief.PE.parse(list(self.bytez)) 257 | 258 | # draw a library at random 259 | libname = random.choice(list(COMMON_IMPORTS.keys())) 260 | funcname = random.choice(list(COMMON_IMPORTS[libname])) 261 | lowerlibname = libname.lower() 262 | 263 | # find this lib in the imports, if it exists 264 | lib = None 265 | for im in binary.imports: 266 | if im.name.lower() == lowerlibname: 267 | lib = im 268 | break 269 | 270 | if lib is None: 271 | # add a new library 272 | lib = binary.add_library(libname) 273 | 274 | # get current names 275 | names = {e.name for e in lib.entries} 276 | if funcname not in names: 277 | lib.add_entry(funcname) 278 | 279 | self.bytez = self._binary_to_bytez(binary, imports=True) 280 | 281 | return self.bytez 282 | 283 | def remove_debug(self): 284 | binary = lief.PE.parse(list(self.bytez)) 285 | 286 | if binary.has_debug: 287 | for i, e in enumerate(binary.data_directories): 288 | if e.type == lief.PE.DATA_DIRECTORY.DEBUG: 289 | e.rva = 0 290 | e.size = 0 291 | self.bytez = self._binary_to_bytez(binary) 292 | return self.bytez 293 | # no debug found 294 | return self.bytez 295 | 296 | def modify_optional_header(self): 297 | binary = lief.PE.parse(list(self.bytez)) 298 | 299 | oh = { 300 | "major_linker_version": [2, 6, 7, 9, 11, 14], 301 | "minor_linker_version": [0, 16, 20, 22, 25], 302 | "major_operating_system_version": [4, 5, 6, 10], 303 | "minor_operating_system_version": [0, 1, 3], 304 | "major_image_version": [0, 1, 5, 6, 10], 305 | "minor_image_version": [0, 1, 3], 306 | } 307 | 308 | key = random.choice(list(oh.keys())) 309 | 310 | modified_val = random.choice(oh[key]) 311 | binary.optional_header.__setattr__(key, modified_val) 312 | 313 | self.bytez = 
self._binary_to_bytez(binary) 314 | return self.bytez 315 | 316 | def break_optional_header_checksum(self): 317 | binary = lief.PE.parse(list(self.bytez)) 318 | binary.optional_header.checksum = 0 319 | self.bytez = self._binary_to_bytez(binary) 320 | return self.bytez 321 | 322 | def upx_unpack(self): 323 | # dump bytez to a temporary file 324 | tmpfilename = os.path.join( 325 | tempfile._get_default_tempdir(), 326 | next(tempfile._get_candidate_names()), 327 | ) 328 | 329 | with open(tmpfilename, "wb") as outfile: 330 | outfile.write(self.bytez) 331 | 332 | with open(os.devnull, "w") as DEVNULL: 333 | retcode = subprocess.call( 334 | ["upx", tmpfilename, "-d", "-o", tmpfilename + "_unpacked"], 335 | stdout=DEVNULL, 336 | stderr=DEVNULL, 337 | ) 338 | 339 | os.unlink(tmpfilename) 340 | 341 | if retcode == 0: # successfully unpacked 342 | with open(tmpfilename + "_unpacked", "rb") as result: 343 | self.bytez = result.read() 344 | 345 | os.unlink(tmpfilename + "_unpacked") 346 | 347 | return self.bytez 348 | 349 | def upx_pack(self): 350 | # tested with UPX 3.94 351 | # WARNING: upx compression only works on binaries over 100KB 352 | tmpfilename = os.path.join( 353 | tempfile._get_default_tempdir(), 354 | next(tempfile._get_candidate_names()), 355 | ) 356 | 357 | # dump bytez to a temporary file 358 | with open(tmpfilename, "wb") as outfile: 359 | outfile.write(self.bytez) 360 | 361 | options = ["--force", "--overlay=copy"] 362 | compression_level = random.randint(1, 9) 363 | options += [f"-{compression_level}"] 364 | options += [f"--compress-exports={random.randint(0, 1)}"] 365 | options += [f"--compress-icons={random.randint(0, 3)}"] 366 | options += [f"--compress-resources={random.randint(0, 1)}"] 367 | options += [f"--strip-relocs={random.randint(0, 1)}"] 368 | 369 | with open(os.devnull, "w") as DEVNULL: 370 | retcode = subprocess.call( 371 | ["upx"] + options + [tmpfilename, "-o", tmpfilename + "_packed"], 372 | stdout=DEVNULL, 373 | stderr=DEVNULL, 374 | ) 375
| 376 | os.unlink(tmpfilename) 377 | 378 | if retcode == 0: # successfully packed 379 | 380 | with open(tmpfilename + "_packed", "rb") as infile: 381 | self.bytez = infile.read() 382 | 383 | os.unlink(tmpfilename + "_packed") 384 | 385 | return self.bytez 386 | 387 | 388 | def modify_sample(bytez, action): 389 | bytez = ModifyBinary(bytez).__getattribute__(action)() 390 | return bytez 391 | 392 | 393 | ACTION_TABLE = { 394 | "modify_machine_type": "modify_machine_type", 395 | "pad_overlay": "pad_overlay", 396 | "append_benign_data_overlay": "append_benign_data_overlay", 397 | "append_benign_binary_overlay": "append_benign_binary_overlay", 398 | "add_bytes_to_section_cave": "add_bytes_to_section_cave", 399 | "add_section_strings": "add_section_strings", 400 | "add_section_benign_data": "add_section_benign_data", 401 | "add_strings_to_overlay": "add_strings_to_overlay", 402 | "add_imports": "add_imports", 403 | "rename_section": "rename_section", 404 | "remove_debug": "remove_debug", 405 | "modify_optional_header": "modify_optional_header", 406 | "modify_timestamp": "modify_timestamp", 407 | "break_optional_header_checksum": "break_optional_header_checksum", 408 | "upx_unpack": "upx_unpack", 409 | "upx_pack": "upx_pack", 410 | } 411 | 412 | if __name__ == "__main__": 413 | # use for testing/debugging actions 414 | import hashlib 415 | 416 | from IPython import embed 417 | 418 | # filename = '../utils/samples/e090668cfbbe44474cc979f09c1efe82a644a351c5b1a2e16009be273118e053' # upx packed sample 419 | filename = "../utils/samples/7a5d1bb166c07ed101f2ee9cb43b3a8ce0d90d52788a0d9791a040d2cdcc8057" 420 | with open(filename, "rb") as f: 421 | bytez = f.read() 422 | 423 | m = hashlib.sha256() 424 | m.update(bytez) 425 | print(f"original hash: {m.hexdigest()}") 426 | 427 | action = "upx_pack" 428 | bytez = modify_sample(bytez, action) 429 | 430 | m = hashlib.sha256() 431 | m.update(bytez) 432 | print(f"modified hash: {m.hexdigest()}") 433 | 434 | embed() 435 | 
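modifier.py's `add_bytes_to_section_cave` relies on a `_search_cave` helper that is defined outside this excerpt. As a rough standalone sketch of the idea — scanning a section body for runs of null bytes ("code caves") large enough to absorb padding — one might write something like the function below. `find_null_runs` and its `min_size` parameter are hypothetical illustrations, not the repo's actual implementation:

```python
def find_null_runs(body, base_offset, min_size=128):
    """Return (start, end, size) file offsets of null-byte runs >= min_size.

    Hypothetical sketch of cave searching; not the repo's _search_cave.
    """
    caves = []
    run_start = None
    for i, b in enumerate(body):
        if b == 0:
            if run_start is None:
                run_start = i  # a run of nulls begins here
        else:
            if run_start is not None and i - run_start >= min_size:
                caves.append(
                    (base_offset + run_start, base_offset + i, i - run_start),
                )
            run_start = None
    # handle a run that extends to the end of the section body
    if run_start is not None and len(body) - run_start >= min_size:
        caves.append(
            (base_offset + run_start, base_offset + len(body), len(body) - run_start),
        )
    return caves


# toy section body: 10 data bytes, 200 nulls, 5 data bytes
buf = b"\x41" * 10 + b"\x00" * 200 + b"\x42" * 5
print(find_null_runs(buf, base_offset=0))  # [(10, 210, 200)]
```

The (start, end, size) tuple shape mirrors how `add_bytes_to_section_cave` indexes its chosen cave (`random_selected_cave[0]`, `[1]`, and `[-1]`) when splicing random bytes into `self.bytez`.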
-------------------------------------------------------------------------------- /malware_rl/envs/controls/section_names.txt: -------------------------------------------------------------------------------- 1 | .text 2 | .rsrc 3 | .reloc 4 | .data 5 | .rdata 6 | .idata 7 | .tls 8 | .brdata 9 | .bss 10 | .pdata 11 | .xdata 12 | DATA 13 | CODE 14 | BSS 15 | rdata 16 | .rmnet 17 | .CRT 18 | .edata 19 | .extrel 20 | .sdata 21 | .code 22 | .vmp0 23 | .itext 24 | .data2 25 | .data1 26 | .vmp1 27 | .adata 28 | .gfids 29 | .data3 30 | INIT 31 | .extjmp 32 | .didat 33 | .didata 34 | PAGE 35 | .orpc 36 | vryeypb 37 | camztlf 38 | tkjdelw 39 | dgbwqbp 40 | odyqxub 41 | .tsuarch 42 | .tsustub 43 | .textbss 44 | .sxdata 45 | .zrdata 46 | qxejodg 47 | .data-co 48 | .text-co 49 | gumrkvc 50 | rqvmxkb 51 | kakxcjb 52 | .cdata 53 | ExeS 54 | .rrdata 55 | -------------------------------------------------------------------------------- /malware_rl/envs/controls/trusted/.gitkeep: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/bfilar/malware_rl/300f47ff2240132a449277283807426d34271993/malware_rl/envs/controls/trusted/.gitkeep -------------------------------------------------------------------------------- /malware_rl/envs/ember_gym.py: -------------------------------------------------------------------------------- 1 | import hashlib 2 | import os 3 | import random 4 | import sys 5 | from collections import OrderedDict 6 | 7 | import gym 8 | import numpy as np 9 | from gym import spaces 10 | from malware_rl.envs.controls import modifier 11 | from malware_rl.envs.utils import ember, interface 12 | 13 | random.seed(0) 14 | module_path = os.path.split(os.path.abspath(sys.modules[__name__].__file__))[0] 15 | 16 | ACTION_LOOKUP = {i: act for i, act in enumerate(modifier.ACTION_TABLE.keys())} 17 | 18 | ember_model = ember.EmberModel() 19 | malicious_threshold = ember_model.threshold 20 | 21 | 22 | class EmberEnv(gym.Env): 23 
| """Create MalConv gym interface""" 24 | 25 | metadata = {"render.modes": ["human"]} 26 | 27 | def __init__( 28 | self, 29 | sha256list, 30 | random_sample=True, 31 | maxturns=5, 32 | output_path="data/evaded/ember", 33 | ): 34 | super().__init__() 35 | self.available_sha256 = sha256list 36 | self.action_space = spaces.Discrete(len(ACTION_LOOKUP)) 37 | observation_high = np.finfo(np.float32).max 38 | self.observation_space = spaces.Box( 39 | low=-observation_high, 40 | high=observation_high, 41 | shape=(2381,), 42 | dtype=np.float32, 43 | ) 44 | self.maxturns = maxturns 45 | self.feature_extractor = ember_model.extract 46 | self.output_path = output_path 47 | self.random_sample = random_sample 48 | self.history = OrderedDict() 49 | self.sample_iteration_index = 0 50 | 51 | self.output_path = os.path.join( 52 | os.path.dirname( 53 | os.path.dirname( 54 | os.path.dirname( 55 | os.path.abspath(__file__), 56 | ), 57 | ), 58 | ), 59 | output_path, 60 | ) 61 | 62 | def step(self, action_ix): 63 | # Execute one time step within the environment 64 | self.turns += 1 65 | self._take_action(action_ix) 66 | self.observation_space = self.feature_extractor(self.bytez) 67 | self.score = ember_model.predict_sample(self.observation_space) 68 | 69 | if self.score < malicious_threshold: 70 | reward = 10.0 71 | episode_over = True 72 | self.history[self.sha256]["evaded"] = True 73 | self.history[self.sha256]["reward"] = reward 74 | 75 | # save off file to evasion directory 76 | m = hashlib.sha256() 77 | m.update(self.bytez) 78 | sha256 = m.hexdigest() 79 | evade_path = os.path.join(self.output_path, sha256) 80 | 81 | with open(evade_path, "wb") as out: 82 | out.write(self.bytez) 83 | 84 | self.history[self.sha256]["evade_path"] = evade_path 85 | 86 | elif self.turns >= self.maxturns: 87 | # game over - max turns hit 88 | reward = self.original_score - self.score 89 | episode_over = True 90 | self.history[self.sha256]["evaded"] = False 91 | self.history[self.sha256]["reward"] = reward 
92 | else: 93 | reward = self.original_score - self.score 94 | episode_over = False 95 | 96 | if episode_over: 97 | print(f"Episode over: reward = {reward}") 98 | 99 | return self.observation_space, reward, episode_over, self.history[self.sha256] 100 | 101 | def _take_action(self, action_ix): 102 | action = ACTION_LOOKUP[action_ix] 103 | self.history[self.sha256]["actions"].append(action) 104 | self.bytez = modifier.modify_sample(self.bytez, action) 105 | 106 | def reset(self): 107 | # Reset the state of the environment to an initial state 108 | self.turns = 0 109 | while True: 110 | # grab a new sample (TODO) 111 | if self.random_sample: 112 | self.sha256 = random.choice(self.available_sha256) 113 | else: 114 | self.sha256 = self.available_sha256[ 115 | self.sample_iteration_index % len(self.available_sha256) 116 | ] 117 | self.sample_iteration_index += 1 118 | 119 | self.history[self.sha256] = {"actions": [], "evaded": False} 120 | self.bytez = interface.fetch_file( 121 | os.path.join( 122 | module_path, 123 | "utils/samples/", 124 | ) 125 | + self.sha256, 126 | ) 127 | 128 | self.observation_space = self.feature_extractor(self.bytez) 129 | self.original_score = ember_model.predict_sample( 130 | self.observation_space, 131 | ) 132 | if self.original_score < malicious_threshold: 133 | # already labeled benign, skip 134 | continue 135 | 136 | break 137 | print(f"Sample: {self.sha256}") 138 | return self.observation_space 139 | 140 | def render(self, mode="human", close=False): 141 | # Render the environment to the screen 142 | pass 143 | -------------------------------------------------------------------------------- /malware_rl/envs/malconv_gym.py: -------------------------------------------------------------------------------- 1 | import hashlib 2 | import os 3 | import random 4 | import sys 5 | from collections import OrderedDict 6 | 7 | import gym 8 | import numpy as np 9 | from gym import spaces 10 | from malware_rl.envs.controls import modifier 11 | from 
malware_rl.envs.utils import interface, malconv 12 | 13 | random.seed(0) 14 | module_path = os.path.split(os.path.abspath(sys.modules[__name__].__file__))[0] 15 | 16 | ACTION_LOOKUP = {i: act for i, act in enumerate(modifier.ACTION_TABLE.keys())} 17 | 18 | mc = malconv.MalConv() 19 | malicious_threshold = mc.malicious_threshold 20 | 21 | 22 | class MalConvEnv(gym.Env): 23 | """Create MalConv gym interface""" 24 | 25 | metadata = {"render.modes": ["human"]} 26 | 27 | def __init__( 28 | self, 29 | sha256list, 30 | random_sample=True, 31 | maxturns=5, 32 | output_path="data/evaded/malconv", 33 | ): 34 | super().__init__() 35 | self.available_sha256 = sha256list 36 | self.action_space = spaces.Discrete(len(ACTION_LOOKUP)) 37 | self.observation_space = spaces.Box( 38 | low=0, 39 | high=256, 40 | shape=(1048576,), 41 | dtype=np.int16, 42 | ) 43 | self.maxturns = maxturns 44 | self.feature_extractor = mc.extract 45 | self.output_path = output_path 46 | self.random_sample = random_sample 47 | self.history = OrderedDict() 48 | self.sample_iteration_index = 0 49 | 50 | self.output_path = os.path.join( 51 | os.path.dirname( 52 | os.path.dirname( 53 | os.path.dirname( 54 | os.path.abspath(__file__), 55 | ), 56 | ), 57 | ), 58 | output_path, 59 | ) 60 | 61 | def step(self, action_ix): 62 | # Execute one time step within the environment 63 | self.turns += 1 64 | self._take_action(action_ix) 65 | self.observation_space = self.feature_extractor(self.bytez) 66 | self.score = mc.predict_sample(self.observation_space) 67 | 68 | if self.score < malicious_threshold: 69 | reward = 10.0 70 | episode_over = True 71 | self.history[self.sha256]["evaded"] = True 72 | self.history[self.sha256]["reward"] = reward 73 | 74 | # save off file to evasion directory 75 | m = hashlib.sha256() 76 | m.update(self.bytez) 77 | sha256 = m.hexdigest() 78 | evade_path = os.path.join(self.output_path, sha256) 79 | 80 | with open(evade_path, "wb") as out: 81 | out.write(self.bytez) 82 | 83 | 
self.history[self.sha256]["evade_path"] = evade_path 84 | 85 | elif self.turns >= self.maxturns: 86 | # game over - max turns hit 87 | reward = self.original_score - self.score 88 | episode_over = True 89 | self.history[self.sha256]["evaded"] = False 90 | self.history[self.sha256]["reward"] = reward 91 | 92 | else: 93 | reward = float(self.original_score - self.score) 94 | episode_over = False 95 | 96 | if episode_over: 97 | print(f"Episode over: reward = {reward}") 98 | 99 | return self.observation_space, reward, episode_over, self.history[self.sha256] 100 | 101 | def _take_action(self, action_ix): 102 | action = ACTION_LOOKUP[action_ix] 103 | # print("ACTION:", action) 104 | self.history[self.sha256]["actions"].append(action) 105 | self.bytez = modifier.modify_sample(self.bytez, action) 106 | 107 | def reset(self): 108 | # Reset the state of the environment to an initial state 109 | self.turns = 0 110 | while True: 111 | # grab a new sample (TODO) 112 | if self.random_sample: 113 | self.sha256 = random.choice(self.available_sha256) 114 | else: 115 | self.sha256 = self.available_sha256[ 116 | self.sample_iteration_index % len(self.available_sha256) 117 | ] 118 | self.sample_iteration_index += 1 119 | 120 | self.history[self.sha256] = {"actions": [], "evaded": False} 121 | self.bytez = interface.fetch_file( 122 | os.path.join( 123 | module_path, 124 | "utils/samples/", 125 | ) 126 | + self.sha256, 127 | ) 128 | 129 | self.observation_space = self.feature_extractor(self.bytez) 130 | self.original_score = mc.predict_sample(self.observation_space) 131 | if self.original_score < malicious_threshold: 132 | # already labeled benign, skip 133 | continue 134 | 135 | break 136 | print(f"Sample: {self.sha256}") 137 | 138 | return self.observation_space 139 | 140 | def render(self, mode="human", close=False): 141 | # Render the environment to the screen 142 | pass 143 | -------------------------------------------------------------------------------- 
/malware_rl/envs/sorel_gym.py: -------------------------------------------------------------------------------- 1 | import hashlib 2 | import os 3 | import random 4 | import sys 5 | from collections import OrderedDict 6 | 7 | import gym 8 | import numpy as np 9 | from gym import spaces 10 | from malware_rl.envs.controls import modifier 11 | from malware_rl.envs.utils import interface, sorel 12 | 13 | random.seed(0) 14 | module_path = os.path.split(os.path.abspath(sys.modules[__name__].__file__))[0] 15 | 16 | ACTION_LOOKUP = {i: act for i, act in enumerate(modifier.ACTION_TABLE.keys())} 17 | 18 | sorel_model = sorel.SorelModel() 19 | malicious_threshold = sorel_model.threshold 20 | 21 | 22 | class SorelEnv(gym.Env): 23 | """Creates the Sorel Gym Interface""" 24 | 25 | metadata = {"render.modes": ["human"]} 26 | 27 | def __init__( 28 | self, 29 | sha256list, 30 | random_sample=True, 31 | maxturns=5, 32 | output_path="data/evaded/sorel", 33 | ): 34 | super().__init__() 35 | self.available_sha256 = sha256list 36 | self.action_space = spaces.Discrete(len(ACTION_LOOKUP)) 37 | observation_high = np.finfo(np.float32).max 38 | self.observation_space = spaces.Box( 39 | low=-observation_high, 40 | high=observation_high, 41 | shape=(2381,), 42 | dtype=np.float32, 43 | ) 44 | self.maxturns = maxturns 45 | self.feature_extractor = sorel_model.extract 46 | self.output_path = output_path 47 | self.random_sample = random_sample 48 | self.history = OrderedDict() 49 | self.sample_iteration_index = 0 50 | 51 | self.output_path = os.path.join( 52 | os.path.dirname( 53 | os.path.dirname( 54 | os.path.dirname( 55 | os.path.abspath(__file__), 56 | ), 57 | ), 58 | ), 59 | output_path, 60 | ) 61 | 62 | def step(self, action_ix): 63 | # Execute one time step within the environment 64 | self.turns += 1 65 | self._take_action(action_ix) 66 | self.observation_space = self.feature_extractor(self.bytez) 67 | self.score = sorel_model.predict_sample(self.observation_space) 68 | 69 | if self.score < 
malicious_threshold: 70 | reward = 10.0 71 | episode_over = True 72 | self.history[self.sha256]["evaded"] = True 73 | self.history[self.sha256]["reward"] = reward 74 | 75 | # save off file to evasion directory 76 | m = hashlib.sha256() 77 | m.update(self.bytez) 78 | sha256 = m.hexdigest() 79 | evade_path = os.path.join(self.output_path, sha256) 80 | 81 | with open(evade_path, "wb") as out: 82 | out.write(self.bytez) 83 | 84 | self.history[self.sha256]["evade_path"] = evade_path 85 | 86 | elif self.turns >= self.maxturns: 87 | # game over - max turns hit 88 | reward = self.original_score - self.score 89 | episode_over = True 90 | self.history[self.sha256]["evaded"] = False 91 | self.history[self.sha256]["reward"] = reward 92 | else: 93 | reward = self.original_score - self.score 94 | episode_over = False 95 | 96 | if episode_over: 97 | print(f"Episode over: reward = {reward}") 98 | 99 | return self.observation_space, reward, episode_over, self.history[self.sha256] 100 | 101 | def _take_action(self, action_ix): 102 | action = ACTION_LOOKUP[action_ix] 103 | self.history[self.sha256]["actions"].append(action) 104 | self.bytez = modifier.modify_sample(self.bytez, action) 105 | 106 | def reset(self): 107 | # Reset the state of the environment to an initial state 108 | self.turns = 0 109 | while True: 110 | # grab a new sample (TODO) 111 | if self.random_sample: 112 | self.sha256 = random.choice(self.available_sha256) 113 | else: 114 | self.sha256 = self.available_sha256[ 115 | self.sample_iteration_index % len(self.available_sha256) 116 | ] 117 | self.sample_iteration_index += 1 118 | 119 | self.history[self.sha256] = {"actions": [], "evaded": False} 120 | self.bytez = interface.fetch_file( 121 | os.path.join( 122 | module_path, 123 | "utils/samples/", 124 | ) 125 | + self.sha256, 126 | ) 127 | 128 | self.observation_space = self.feature_extractor(self.bytez) 129 | self.original_score = sorel_model.predict_sample( 130 | self.observation_space, 131 | ) 132 | if 
self.original_score < malicious_threshold: 133 | # already labeled benign, skip 134 | continue 135 | 136 | break 137 | print(f"Sample: {self.sha256}") 138 | return self.observation_space 139 | 140 | def render(self, mode="human", close=False): 141 | # Render the environment to the screen 142 | pass 143 | -------------------------------------------------------------------------------- /malware_rl/envs/utils/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/bfilar/malware_rl/300f47ff2240132a449277283807426d34271993/malware_rl/envs/utils/__init__.py -------------------------------------------------------------------------------- /malware_rl/envs/utils/ember.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/python 2 | """ Extracts some basic features from PE files. Many of the features 3 | implemented have been used in previously published works. For more information, 4 | check out the following resources: 5 | * Schultz, et al., 2001: http://128.59.14.66/sites/default/files/binaryeval-ieeesp01.pdf 6 | * Kolter and Maloof, 2006: http://www.jmlr.org/papers/volume7/kolter06a/kolter06a.pdf 7 | * Shafiq et al., 2009: https://www.researchgate.net/profile/Fauzan_Mirza/publication/242084613_A_Framework_for_Efficient_Mining_of_Structural_Information_to_Detect_Zero-Day_Malicious_Portable_Executables/links/0c96052e191668c3d5000000.pdf 8 | * Raman, 2012: http://2012.infosecsouthwest.com/files/speaker_materials/ISSW2012_Selecting_Features_to_Classify_Malware.pdf 9 | * Saxe and Berlin, 2015: https://arxiv.org/pdf/1508.03096.pdf 10 | 11 | It may be useful to do feature selection to reduce this set of features to a meaningful set 12 | for your modeling problem. 
13 | """ 14 | import hashlib 15 | import os 16 | import re 17 | import sys 18 | 19 | import lief 20 | import lightgbm as lgb 21 | import numpy as np 22 | from sklearn.feature_extraction import FeatureHasher 23 | 24 | LIEF_MAJOR, LIEF_MINOR, _ = lief.__version__.split(".") 25 | LIEF_EXPORT_OBJECT = int(LIEF_MAJOR) > 0 or ( 26 | int(LIEF_MAJOR) == 0 and int(LIEF_MINOR) >= 10 27 | ) 28 | 29 | module_path = os.path.split(os.path.abspath(sys.modules[__name__].__file__))[0] 30 | model_path = os.path.join(module_path, "ember_model.txt") 31 | 32 | 33 | class FeatureType: 34 | """Base class from which each feature type may inherit""" 35 | 36 | name = "" 37 | dim = 0 38 | 39 | def __repr__(self): 40 | return f"{self.name}({self.dim})" 41 | 42 | def raw_features(self, bytez, lief_binary): 43 | """Generate a JSON-able representation of the file""" 44 | raise (NotImplementedError) 45 | 46 | def process_raw_features(self, raw_obj): 47 | """Generate a feature vector from the raw features""" 48 | raise (NotImplementedError) 49 | 50 | def feature_vector(self, bytez, lief_binary): 51 | """Directly calculate the feature vector from the sample itself. 
This should only be implemented differently 52 | if there are significant speedups to be gained from combining the two functions.""" 53 | return self.process_raw_features(self.raw_features(bytez, lief_binary)) 54 | 55 | 56 | class ByteHistogram(FeatureType): 57 | """Byte histogram (count + non-normalized) over the entire binary file""" 58 | 59 | name = "histogram" 60 | dim = 256 61 | 62 | def __init__(self): 63 | super(FeatureType, self).__init__() 64 | 65 | def raw_features(self, bytez, lief_binary): 66 | counts = np.bincount( 67 | np.frombuffer( 68 | bytez, 69 | dtype=np.uint8, 70 | ), 71 | minlength=256, 72 | ) 73 | return counts.tolist() 74 | 75 | def process_raw_features(self, raw_obj): 76 | counts = np.array(raw_obj, dtype=np.float32) 77 | sum = counts.sum() 78 | normalized = counts / sum 79 | return normalized 80 | 81 | 82 | class ByteEntropyHistogram(FeatureType): 83 | """2d byte/entropy histogram based loosely on (Saxe and Berlin, 2015). 84 | This roughly approximates the joint probability of byte value and local entropy. 85 | See Section 2.1.1 in https://arxiv.org/pdf/1508.03096.pdf for more info. 86 | """ 87 | 88 | name = "byteentropy" 89 | dim = 256 90 | 91 | def __init__(self, step=1024, window=2048): 92 | super(FeatureType, self).__init__() 93 | self.window = window 94 | self.step = step 95 | 96 | def _entropy_bin_counts(self, block): 97 | # coarse histogram, 16 bytes per bin 98 | c = np.bincount(block >> 4, minlength=16) # 16-bin histogram 99 | p = c.astype(np.float32) / self.window 100 | wh = np.where(c)[0] 101 | H = ( 102 | np.sum( 103 | -p[wh] 104 | * np.log2( 105 | p[wh], 106 | ), 107 | ) 108 | * 2 109 | ) # * x2 b.c. 
we reduced information by half: 256 bins (8 bits) to 16 bins (4 bits) 110 | 111 | Hbin = int(H * 2) # up to 16 bins (max entropy is 8 bits) 112 | if Hbin == 16: # handle entropy = 8.0 bits 113 | Hbin = 15 114 | 115 | return Hbin, c 116 | 117 | def raw_features(self, bytez, lief_binary): 118 | output = np.zeros((16, 16), dtype=int) # np.int alias is removed in modern NumPy 119 | a = np.frombuffer(bytez, dtype=np.uint8) 120 | if a.shape[0] < self.window: 121 | Hbin, c = self._entropy_bin_counts(a) 122 | output[Hbin, :] += c 123 | else: 124 | # strided trick from here: http://www.rigtorp.se/2011/01/01/rolling-statistics-numpy.html 125 | shape = a.shape[:-1] + (a.shape[-1] - self.window + 1, self.window) 126 | strides = a.strides + (a.strides[-1],) 127 | blocks = np.lib.stride_tricks.as_strided( 128 | a, 129 | shape=shape, 130 | strides=strides, 131 | )[:: self.step, :] 132 | 133 | # from the blocks, compute histogram 134 | for block in blocks: 135 | Hbin, c = self._entropy_bin_counts(block) 136 | output[Hbin, :] += c 137 | 138 | return output.flatten().tolist() 139 | 140 | def process_raw_features(self, raw_obj): 141 | counts = np.array(raw_obj, dtype=np.float32) 142 | sum = counts.sum() 143 | normalized = counts / sum 144 | return normalized 145 | 146 | 147 | class SectionInfo(FeatureType): 148 | """Information about section names, sizes and entropy. Uses hashing trick 149 | to summarize all this section info into a feature vector.
150 | """ 151 | 152 | name = "section" 153 | dim = 5 + 50 + 50 + 50 + 50 + 50 154 | 155 | def __init__(self): 156 | super(FeatureType, self).__init__() 157 | 158 | @staticmethod 159 | def _properties(s): 160 | return [str(c).split(".")[-1] for c in s.characteristics_lists] 161 | 162 | def raw_features(self, bytez, lief_binary): 163 | if lief_binary is None: 164 | return {"entry": "", "sections": []} 165 | 166 | # properties of entry point, or if invalid, the first executable section 167 | try: 168 | entry_section = lief_binary.section_from_offset( 169 | lief_binary.entrypoint, 170 | ).name 171 | except lief.not_found: 172 | # bad entry point, let's find the first executable section 173 | entry_section = "" 174 | for s in lief_binary.sections: 175 | if ( 176 | lief.PE.SECTION_CHARACTERISTICS.MEM_EXECUTE 177 | in s.characteristics_lists 178 | ): 179 | entry_section = s.name 180 | break 181 | 182 | raw_obj = {"entry": entry_section} 183 | raw_obj["sections"] = [ 184 | { 185 | "name": s.name, 186 | "size": s.size, 187 | "entropy": s.entropy, 188 | "vsize": s.virtual_size, 189 | "props": self._properties(s), 190 | } 191 | for s in lief_binary.sections 192 | ] 193 | return raw_obj 194 | 195 | def process_raw_features(self, raw_obj): 196 | sections = raw_obj["sections"] 197 | general = [ 198 | len(sections), # total number of sections 199 | # number of sections with nonzero size 200 | sum(1 for s in sections if s["size"] == 0), 201 | # number of sections with an empty name 202 | sum(1 for s in sections if s["name"] == ""), 203 | # number of RX 204 | sum( 205 | 1 206 | for s in sections 207 | if "MEM_READ" in s["props"] and "MEM_EXECUTE" in s["props"] 208 | ), 209 | # number of W 210 | sum(1 for s in sections if "MEM_WRITE" in s["props"]), 211 | ] 212 | # gross characteristics of each section 213 | section_sizes = [(s["name"], s["size"]) for s in sections] 214 | section_sizes_hashed = ( 215 | FeatureHasher(50, input_type="pair") 216 | .transform( 217 | [ 218 | 
section_sizes, 219 | ], 220 | ) 221 | .toarray()[0] 222 | ) 223 | section_entropy = [(s["name"], s["entropy"]) for s in sections] 224 | section_entropy_hashed = ( 225 | FeatureHasher(50, input_type="pair") 226 | .transform( 227 | [ 228 | section_entropy, 229 | ], 230 | ) 231 | .toarray()[0] 232 | ) 233 | section_vsize = [(s["name"], s["vsize"]) for s in sections] 234 | section_vsize_hashed = ( 235 | FeatureHasher(50, input_type="pair") 236 | .transform( 237 | [ 238 | section_vsize, 239 | ], 240 | ) 241 | .toarray()[0] 242 | ) 243 | entry_name_hashed = ( 244 | FeatureHasher(50, input_type="string") 245 | .transform( 246 | [ 247 | raw_obj["entry"], 248 | ], 249 | ) 250 | .toarray()[0] 251 | ) 252 | characteristics = [ 253 | p for s in sections for p in s["props"] if s["name"] == raw_obj["entry"] 254 | ] 255 | characteristics_hashed = ( 256 | FeatureHasher(50, input_type="string") 257 | .transform( 258 | [ 259 | characteristics, 260 | ], 261 | ) 262 | .toarray()[0] 263 | ) 264 | 265 | return np.hstack( 266 | [ 267 | general, 268 | section_sizes_hashed, 269 | section_entropy_hashed, 270 | section_vsize_hashed, 271 | entry_name_hashed, 272 | characteristics_hashed, 273 | ], 274 | ).astype(np.float32) 275 | 276 | 277 | class ImportsInfo(FeatureType): 278 | """Information about imported libraries and functions from the 279 | import address table. Note that the total number of imported 280 | functions is contained in GeneralFileInfo. 
281 | """ 282 | 283 | name = "imports" 284 | dim = 1280 285 | 286 | def __init__(self): 287 | super(FeatureType, self).__init__() 288 | 289 | def raw_features(self, bytez, lief_binary): 290 | imports = {} 291 | if lief_binary is None: 292 | return imports 293 | 294 | for lib in lief_binary.imports: 295 | if lib.name not in imports: 296 | # libraries can be duplicated in listing, extend instead of overwrite 297 | imports[lib.name] = [] 298 | 299 | # Clipping assumes there are diminishing returns on the discriminatory power of imported functions 300 | # beyond the first 10000 characters, and this will help limit the dataset size 301 | for entry in lib.entries: 302 | if entry.is_ordinal: 303 | imports[lib.name].append("ordinal" + str(entry.ordinal)) 304 | else: 305 | imports[lib.name].append(entry.name[:10000]) 306 | 307 | return imports 308 | 309 | def process_raw_features(self, raw_obj): 310 | # unique libraries 311 | libraries = list({l.lower() for l in raw_obj.keys()}) 312 | libraries_hashed = ( 313 | FeatureHasher( 314 | 256, 315 | input_type="string", 316 | ) 317 | .transform([libraries]) 318 | .toarray()[0] 319 | ) 320 | 321 | # A string like "kernel32.dll:CreateFileMappingA" for each imported function 322 | imports = [ 323 | lib.lower() + ":" + e for lib, elist in raw_obj.items() for e in elist 324 | ] 325 | imports_hashed = ( 326 | FeatureHasher( 327 | 1024, 328 | input_type="string", 329 | ) 330 | .transform([imports]) 331 | .toarray()[0] 332 | ) 333 | 334 | # Two separate elements: libraries (alone) and fully-qualified names of imported functions 335 | return np.hstack([libraries_hashed, imports_hashed]).astype(np.float32) 336 | 337 | 338 | class ExportsInfo(FeatureType): 339 | """Information about exported functions. Note that the total number of exported 340 | functions is contained in GeneralFileInfo. 
341 | """ 342 | 343 | name = "exports" 344 | dim = 128 345 | 346 | def __init__(self): 347 | super(FeatureType, self).__init__() 348 | 349 | def raw_features(self, bytez, lief_binary): 350 | if lief_binary is None: 351 | return [] 352 | 353 | # Clipping assumes there are diminishing returns on the discriminatory power of exports beyond 354 | # the first 10000 characters, and this will help limit the dataset size 355 | if LIEF_EXPORT_OBJECT: 356 | # export is an object with .name attribute (0.10.0 and later) 357 | clipped_exports = [ 358 | export.name[:10000] for export in lief_binary.exported_functions 359 | ] 360 | else: 361 | # export is a string (LIEF 0.9.0 and earlier) 362 | clipped_exports = [ 363 | export[:10000] for export in lief_binary.exported_functions 364 | ] 365 | 366 | return clipped_exports 367 | 368 | def process_raw_features(self, raw_obj): 369 | exports_hashed = ( 370 | FeatureHasher( 371 | 128, 372 | input_type="string", 373 | ) 374 | .transform([raw_obj]) 375 | .toarray()[0] 376 | ) 377 | return exports_hashed.astype(np.float32) 378 | 379 | 380 | class GeneralFileInfo(FeatureType): 381 | """General information about the file""" 382 | 383 | name = "general" 384 | dim = 10 385 | 386 | def __init__(self): 387 | super(FeatureType, self).__init__() 388 | 389 | def raw_features(self, bytez, lief_binary): 390 | if lief_binary is None: 391 | return { 392 | "size": len(bytez), 393 | "vsize": 0, 394 | "has_debug": 0, 395 | "exports": 0, 396 | "imports": 0, 397 | "has_relocations": 0, 398 | "has_resources": 0, 399 | "has_signature": 0, 400 | "has_tls": 0, 401 | "symbols": 0, 402 | } 403 | 404 | return { 405 | "size": len(bytez), 406 | "vsize": lief_binary.virtual_size, 407 | "has_debug": int(lief_binary.has_debug), 408 | "exports": len(lief_binary.exported_functions), 409 | "imports": len(lief_binary.imported_functions), 410 | "has_relocations": int(lief_binary.has_relocations), 411 | "has_resources": int(lief_binary.has_resources), 412 | "has_signature": 
int(lief_binary.has_signature), 413 | "has_tls": int(lief_binary.has_tls), 414 | "symbols": len(lief_binary.symbols), 415 | } 416 | 417 | def process_raw_features(self, raw_obj): 418 | return np.asarray( 419 | [ 420 | raw_obj["size"], 421 | raw_obj["vsize"], 422 | raw_obj["has_debug"], 423 | raw_obj["exports"], 424 | raw_obj["imports"], 425 | raw_obj["has_relocations"], 426 | raw_obj["has_resources"], 427 | raw_obj["has_signature"], 428 | raw_obj["has_tls"], 429 | raw_obj["symbols"], 430 | ], 431 | dtype=np.float32, 432 | ) 433 | 434 | 435 | class HeaderFileInfo(FeatureType): 436 | """Machine, architecture, OS, linker and other information extracted from header""" 437 | 438 | name = "header" 439 | dim = 62 440 | 441 | def __init__(self): 442 | super(FeatureType, self).__init__() 443 | 444 | def raw_features(self, bytez, lief_binary): 445 | raw_obj = {} 446 | raw_obj["coff"] = { 447 | "timestamp": 0, 448 | "machine": "", 449 | "characteristics": [], 450 | } 451 | raw_obj["optional"] = { 452 | "subsystem": "", 453 | "dll_characteristics": [], 454 | "magic": "", 455 | "major_image_version": 0, 456 | "minor_image_version": 0, 457 | "major_linker_version": 0, 458 | "minor_linker_version": 0, 459 | "major_operating_system_version": 0, 460 | "minor_operating_system_version": 0, 461 | "major_subsystem_version": 0, 462 | "minor_subsystem_version": 0, 463 | "sizeof_code": 0, 464 | "sizeof_headers": 0, 465 | "sizeof_heap_commit": 0, 466 | } 467 | if lief_binary is None: 468 | return raw_obj 469 | 470 | raw_obj["coff"]["timestamp"] = lief_binary.header.time_date_stamps 471 | raw_obj["coff"]["machine"] = str(lief_binary.header.machine).split( 472 | ".", 473 | )[-1] 474 | raw_obj["coff"]["characteristics"] = [ 475 | str(c).split( 476 | ".", 477 | )[-1] 478 | for c in lief_binary.header.characteristics_list 479 | ] 480 | raw_obj["optional"]["subsystem"] = str( 481 | lief_binary.optional_header.subsystem, 482 | ).split(".")[-1] 483 | raw_obj["optional"]["dll_characteristics"] = [
484 | str(c).split(".")[-1] 485 | for c in lief_binary.optional_header.dll_characteristics_lists 486 | ] 487 | raw_obj["optional"]["magic"] = str(lief_binary.optional_header.magic).split( 488 | ".", 489 | )[-1] 490 | raw_obj["optional"][ 491 | "major_image_version" 492 | ] = lief_binary.optional_header.major_image_version 493 | raw_obj["optional"][ 494 | "minor_image_version" 495 | ] = lief_binary.optional_header.minor_image_version 496 | raw_obj["optional"][ 497 | "major_linker_version" 498 | ] = lief_binary.optional_header.major_linker_version 499 | raw_obj["optional"][ 500 | "minor_linker_version" 501 | ] = lief_binary.optional_header.minor_linker_version 502 | raw_obj["optional"][ 503 | "major_operating_system_version" 504 | ] = lief_binary.optional_header.major_operating_system_version 505 | raw_obj["optional"][ 506 | "minor_operating_system_version" 507 | ] = lief_binary.optional_header.minor_operating_system_version 508 | raw_obj["optional"][ 509 | "major_subsystem_version" 510 | ] = lief_binary.optional_header.major_subsystem_version 511 | raw_obj["optional"][ 512 | "minor_subsystem_version" 513 | ] = lief_binary.optional_header.minor_subsystem_version 514 | raw_obj["optional"]["sizeof_code"] = lief_binary.optional_header.sizeof_code 515 | raw_obj["optional"][ 516 | "sizeof_headers" 517 | ] = lief_binary.optional_header.sizeof_headers 518 | raw_obj["optional"][ 519 | "sizeof_heap_commit" 520 | ] = lief_binary.optional_header.sizeof_heap_commit 521 | return raw_obj 522 | 523 | def process_raw_features(self, raw_obj): 524 | return np.hstack( 525 | [ 526 | raw_obj["coff"]["timestamp"], 527 | FeatureHasher(10, input_type="string") 528 | .transform( 529 | [[raw_obj["coff"]["machine"]]], 530 | ) 531 | .toarray()[0], 532 | FeatureHasher(10, input_type="string") 533 | .transform( 534 | [raw_obj["coff"]["characteristics"]], 535 | ) 536 | .toarray()[0], 537 | FeatureHasher(10, input_type="string") 538 | .transform( 539 | [[raw_obj["optional"]["subsystem"]]], 540 | ) 
541 | .toarray()[0], 542 | FeatureHasher(10, input_type="string") 543 | .transform( 544 | [raw_obj["optional"]["dll_characteristics"]], 545 | ) 546 | .toarray()[0], 547 | FeatureHasher(10, input_type="string") 548 | .transform( 549 | [[raw_obj["optional"]["magic"]]], 550 | ) 551 | .toarray()[0], 552 | raw_obj["optional"]["major_image_version"], 553 | raw_obj["optional"]["minor_image_version"], 554 | raw_obj["optional"]["major_linker_version"], 555 | raw_obj["optional"]["minor_linker_version"], 556 | raw_obj["optional"]["major_operating_system_version"], 557 | raw_obj["optional"]["minor_operating_system_version"], 558 | raw_obj["optional"]["major_subsystem_version"], 559 | raw_obj["optional"]["minor_subsystem_version"], 560 | raw_obj["optional"]["sizeof_code"], 561 | raw_obj["optional"]["sizeof_headers"], 562 | raw_obj["optional"]["sizeof_heap_commit"], 563 | ], 564 | ).astype(np.float32) 565 | 566 | 567 | class StringExtractor(FeatureType): 568 | """Extracts strings from raw byte stream""" 569 | 570 | name = "strings" 571 | dim = 1 + 1 + 1 + 96 + 1 + 1 + 1 + 1 + 1 572 | 573 | def __init__(self): 574 | super(FeatureType, self).__init__() 575 | # all consecutive runs of 0x20 - 0x7f that are 5+ characters 576 | self._allstrings = re.compile(b"[\x20-\x7f]{5,}") 577 | # occurrences of the string 'C:\'. Not actually extracting the path 578 | self._paths = re.compile(b"c:\\\\", re.IGNORECASE) 579 | # occurrences of http:// or https://. Not actually extracting the URLs 580 | self._urls = re.compile(b"https?://", re.IGNORECASE) 581 | # occurrences of the string prefix HKEY_. Not actually extracting registry names 582 | self._registry = re.compile(b"HKEY_") 583 | # crude evidence of an MZ header (dropper?)
somewhere in the byte stream 584 | self._mz = re.compile(b"MZ") 585 | 586 | def raw_features(self, bytez, lief_binary): 587 | allstrings = self._allstrings.findall(bytez) 588 | if allstrings: 589 | # statistics about strings: 590 | string_lengths = [len(s) for s in allstrings] 591 | avlength = sum(string_lengths) / len(string_lengths) 592 | # map printable characters 0x20 - 0x7f to an int array consisting of 0-95, inclusive 593 | as_shifted_string = [b - ord(b"\x20") for b in b"".join(allstrings)] 594 | c = np.bincount(as_shifted_string, minlength=96) # histogram count 595 | # distribution of characters in printable strings 596 | csum = c.sum() 597 | p = c.astype(np.float32) / csum 598 | wh = np.where(c)[0] 599 | H = np.sum(-p[wh] * np.log2(p[wh])) # entropy 600 | else: 601 | avlength = 0 602 | c = np.zeros((96,), dtype=np.float32) 603 | H = 0 604 | csum = 0 605 | 606 | return { 607 | "numstrings": len(allstrings), 608 | "avlength": avlength, 609 | "printabledist": c.tolist(), # store non-normalized histogram 610 | "printables": int(csum), 611 | "entropy": float(H), 612 | "paths": len(self._paths.findall(bytez)), 613 | "urls": len(self._urls.findall(bytez)), 614 | "registry": len(self._registry.findall(bytez)), 615 | "MZ": len(self._mz.findall(bytez)), 616 | } 617 | 618 | def process_raw_features(self, raw_obj): 619 | hist_divisor = ( 620 | float( 621 | raw_obj["printables"], 622 | ) 623 | if raw_obj["printables"] > 0 624 | else 1.0 625 | ) 626 | return np.hstack( 627 | [ 628 | raw_obj["numstrings"], 629 | raw_obj["avlength"], 630 | raw_obj["printables"], 631 | np.asarray(raw_obj["printabledist"]) / hist_divisor, 632 | raw_obj["entropy"], 633 | raw_obj["paths"], 634 | raw_obj["urls"], 635 | raw_obj["registry"], 636 | raw_obj["MZ"], 637 | ], 638 | ).astype(np.float32) 639 | 640 | 641 | class DataDirectories(FeatureType): 642 | """Extracts size and virtual address of the first 15 data directories""" 643 | 644 | name = "datadirectories" 645 | dim = 15 * 2 646 | 647 | 
def __init__(self): 648 | super(FeatureType, self).__init__() 649 | self._name_order = [ 650 | "EXPORT_TABLE", 651 | "IMPORT_TABLE", 652 | "RESOURCE_TABLE", 653 | "EXCEPTION_TABLE", 654 | "CERTIFICATE_TABLE", 655 | "BASE_RELOCATION_TABLE", 656 | "DEBUG", 657 | "ARCHITECTURE", 658 | "GLOBAL_PTR", 659 | "TLS_TABLE", 660 | "LOAD_CONFIG_TABLE", 661 | "BOUND_IMPORT", 662 | "IAT", 663 | "DELAY_IMPORT_DESCRIPTOR", 664 | "CLR_RUNTIME_HEADER", 665 | ] 666 | 667 | def raw_features(self, bytez, lief_binary): 668 | output = [] 669 | if lief_binary is None: 670 | return output 671 | 672 | for data_directory in lief_binary.data_directories: 673 | output.append( 674 | { 675 | "name": str(data_directory.type).replace("DATA_DIRECTORY.", ""), 676 | "size": data_directory.size, 677 | "virtual_address": data_directory.rva, 678 | }, 679 | ) 680 | return output 681 | 682 | def process_raw_features(self, raw_obj): 683 | features = np.zeros(2 * len(self._name_order), dtype=np.float32) 684 | for i in range(len(self._name_order)): 685 | if i < len(raw_obj): 686 | features[2 * i] = raw_obj[i]["size"] 687 | features[2 * i + 1] = raw_obj[i]["virtual_address"] 688 | return features 689 | 690 | 691 | class PEFeatureExtractor: 692 | """Extract useful features from a PE file, and return as a vector of fixed size.""" 693 | 694 | def __init__(self, feature_version=2): 695 | self.features = [ 696 | ByteHistogram(), 697 | ByteEntropyHistogram(), 698 | StringExtractor(), 699 | GeneralFileInfo(), 700 | HeaderFileInfo(), 701 | SectionInfo(), 702 | ImportsInfo(), 703 | ExportsInfo(), 704 | ] 705 | if feature_version == 1: 706 | if not lief.__version__.startswith("0.8.3"): 707 | print( 708 | "WARNING: EMBER feature version 1 was computed using lief version 0.8.3-18d5b75", 709 | ) 710 | print( 711 | f"WARNING: lief version {lief.__version__} found instead.
There may be slight inconsistencies", 712 | ) 713 | print("WARNING: in the feature calculations.") 714 | elif feature_version == 2: 715 | self.features.append(DataDirectories()) 716 | if not lief.__version__.startswith("0.9.0"): 717 | print( 718 | "WARNING: EMBER feature version 2 was computed using lief version 0.9.0-", 719 | ) 720 | print( 721 | f"WARNING: lief version {lief.__version__} found instead. There may be slight inconsistencies", 722 | ) 723 | print("WARNING: in the feature calculations.") 724 | else: 725 | raise Exception( 726 | f"EMBER feature version must be 1 or 2. Not {feature_version}", 727 | ) 728 | self.dim = sum(fe.dim for fe in self.features) 729 | 730 | def raw_features(self, bytez): 731 | lief_errors = ( 732 | lief.bad_format, 733 | lief.bad_file, 734 | lief.pe_error, 735 | lief.parser_error, 736 | lief.read_out_of_bound, 737 | RuntimeError, 738 | ) 739 | try: 740 | lief_binary = lief.PE.parse(list(bytez)) 741 | except lief_errors as e: 742 | print("lief error: ", str(e)) 743 | lief_binary = None 744 | # everything else (KeyboardInterrupt, SystemExit, ValueError): 745 | except Exception: 746 | raise 747 | 748 | features = {"sha256": hashlib.sha256(bytez).hexdigest()} 749 | features.update( 750 | {fe.name: fe.raw_features(bytez, lief_binary) for fe in self.features}, 751 | ) 752 | return features 753 | 754 | def process_raw_features(self, raw_obj): 755 | feature_vectors = [ 756 | fe.process_raw_features( 757 | raw_obj[fe.name], 758 | ) 759 | for fe in self.features 760 | ] 761 | return np.hstack(feature_vectors).astype(np.float32) 762 | 763 | def feature_vector(self, bytez): 764 | return self.process_raw_features(self.raw_features(bytez)) 765 | 766 | 767 | class EmberModel: 768 | def __init__(self): 769 | self.model = lgb.Booster(model_file=model_path) 770 | self.threshold = 0.8336 # Ember 1% FPR 771 | self.feature_version = 2 772 | self.extractor = PEFeatureExtractor(self.feature_version) 773 | 774 | def extract(self, bytez): 775 |
return np.array(self.extractor.feature_vector(bytez), dtype=np.float32) 776 | 777 | def predict_sample(self, features): 778 | return self.model.predict([features])[0] 779 | -------------------------------------------------------------------------------- /malware_rl/envs/utils/interface.py: -------------------------------------------------------------------------------- 1 | import glob 2 | import os.path 3 | import re 4 | import sys 5 | 6 | module_path = os.path.dirname(os.path.abspath(sys.modules[__name__].__file__)) 7 | SAMPLE_PATH = os.path.join(module_path, "samples") 8 | 9 | 10 | def fetch_file(sample_path): 11 | with open(sample_path, "rb") as f: 12 | bytez = f.read() 13 | return bytez 14 | 15 | 16 | def get_available_sha256(): 17 | sha256list = [] 18 | for fp in glob.glob(os.path.join(SAMPLE_PATH, "*")): 19 | fn = os.path.split(fp)[-1] 20 | # require filenames to be sha256 21 | result = re.match(r"^[0-9a-fA-F]{64}$", fn) 22 | if result: 23 | sha256list.append(result.group(0)) 24 | # no files found in SAMPLE_PATH with sha256 names 25 | assert len(sha256list) > 0 26 | return sha256list 27 | -------------------------------------------------------------------------------- /malware_rl/envs/utils/malconv.h5: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/bfilar/malware_rl/300f47ff2240132a449277283807426d34271993/malware_rl/envs/utils/malconv.h5 -------------------------------------------------------------------------------- /malware_rl/envs/utils/malconv.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/python 2 | """ 3 | Defines the MalConv architecture.
4 | Adapted from https://arxiv.org/pdf/1710.09435.pdf 5 | Differences between our implementation and the original paper: 6 | * The paper uses batch_size = 256 and 7 | SGD(lr=0.01, momentum=0.9, decay=UNDISCLOSED, nesterov=True ) 8 | * The paper didn't have a special EOF symbol 9 | * The paper allowed for up to 2MB malware sizes, 10 | we use 1.0MB because of memory on a Titan X 11 | """ 12 | import os 13 | import sys 14 | 15 | import numpy as np 16 | import tensorflow as tf 17 | from keras import metrics 18 | from keras.models import load_model 19 | from keras.optimizers import SGD 20 | 21 | module_path = os.path.split(os.path.abspath(sys.modules[__name__].__file__))[0] 22 | model_path = os.path.join(module_path, "malconv.h5") 23 | 24 | tf.compat.v1.logging.set_verbosity(tf.compat.v1.logging.ERROR) 25 | 26 | 27 | class MalConv: 28 | def __init__(self): 29 | self.batch_size = 100 30 | self.input_dim = 257 # every byte plus a special padding symbol 31 | self.padding_char = 256 32 | self.malicious_threshold = 0.5 33 | 34 | self.model = load_model(model_path) 35 | _, self.maxlen, self.embedding_size = self.model.layers[1].output_shape 36 | 37 | self.model.compile( 38 | loss="binary_crossentropy", 39 | optimizer=SGD(lr=0.01, momentum=0.9, nesterov=True, decay=1e-3), 40 | metrics=[metrics.binary_accuracy], 41 | ) 42 | 43 | def extract(self, bytez): 44 | b = np.ones((self.maxlen,), dtype=np.int16) * self.padding_char 45 | bytez = np.frombuffer(bytez[: self.maxlen], dtype=np.uint8) 46 | b[: len(bytez)] = bytez 47 | return b 48 | 49 | def predict_sample(self, bytez): 50 | return self.model.predict(bytez.reshape(1, -1))[0][0] 51 | -------------------------------------------------------------------------------- /malware_rl/envs/utils/samples/.gitkeep: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/bfilar/malware_rl/300f47ff2240132a449277283807426d34271993/malware_rl/envs/utils/samples/.gitkeep
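The byte-level featurization in `MalConv.extract` above can be exercised without loading the Keras model at all. Below is a minimal sketch of that padding scheme; the helper name `malconv_featurize` and the tiny `maxlen` are illustrative only, not part of the repo:

```python
import numpy as np


def malconv_featurize(bytez, maxlen=2**20, padding_char=256):
    """Pad/truncate raw bytes into a fixed-length int array, as MalConv.extract does.

    Bytes past maxlen are dropped; the tail is filled with the padding symbol
    256, which sits outside the 0-255 byte range so the embedding layer can
    distinguish padding from real file content.
    """
    b = np.ones((maxlen,), dtype=np.int16) * padding_char
    raw = np.frombuffer(bytez[:maxlen], dtype=np.uint8)
    b[: len(raw)] = raw
    return b


# A 4-byte "file" against a toy maxlen of 8: the MZ magic lands up front,
# and the remaining slots hold the padding symbol.
features = malconv_featurize(b"MZ\x90\x00", maxlen=8)
```

Using `int16` rather than `uint8` is what makes room for the 257th symbol; the real model then embeds each of the 257 token values before the convolutional layers.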
-------------------------------------------------------------------------------- /malware_rl/envs/utils/sorel.py: -------------------------------------------------------------------------------- 1 | import os 2 | import sys 3 | 4 | import lightgbm as lgb 5 | import numpy as np 6 | from malware_rl.envs.utils.ember import PEFeatureExtractor 7 | 8 | module_path = os.path.split(os.path.abspath(sys.modules[__name__].__file__))[0] 9 | 10 | model_path = os.path.join(module_path, "sorel.model") 11 | 12 | if not os.path.exists(model_path): 13 | print("The model path provided does not exist") 14 | 15 | 16 | class SorelModel: 17 | def __init__(self): 18 | self.model = lgb.Booster(model_file=model_path) 19 | self.threshold = 0.8336 # Ember 1% FPR 20 | self.feature_version = 2 21 | self.extractor = PEFeatureExtractor(self.feature_version) 22 | 23 | def extract(self, bytez): 24 | return np.array(self.extractor.feature_vector(bytez), dtype=np.float32) 25 | 26 | def predict_sample(self, features): 27 | return self.model.predict([features])[0] 28 | -------------------------------------------------------------------------------- /ppo.py: -------------------------------------------------------------------------------- 1 | import os 2 | import random 3 | import sys 4 | 5 | import gym 6 | import numpy as np 7 | from gym import wrappers 8 | from stable_baselines3 import PPO 9 | 10 | import malware_rl 11 | 12 | random.seed(0) 13 | module_path = os.path.split(os.path.abspath(sys.modules[__name__].__file__))[0] 14 | outdir = os.path.join(module_path, "data/logs/ppo-agent-results") 15 | 16 | # Setting up environment 17 | env = gym.make("sorel-train-v0") 18 | env = wrappers.Monitor(env, directory=outdir, force=True) 19 | env.seed(0) 20 | 21 | # Setting up training parameters and holding variables 22 | episode_count = 250 23 | done = False 24 | reward = 0 25 | evasions = 0 26 | evasion_history = {} 27 | 28 | # Train the agent 29 | agent = PPO("MlpPolicy", env, verbose=1) 30 |
agent.learn(total_timesteps=2500) 31 | 32 | 33 | # Test the agent 34 | for i in range(episode_count): 35 | ob = env.reset() 36 | sha256 = env.env.sha256 37 | while True: 38 | action, _states = agent.predict(ob) 39 | ob, reward, done, ep_history = env.step(action) 40 | if done and reward >= 10.0: 41 | evasions += 1 42 | evasion_history[sha256] = ep_history 43 | break 44 | 45 | elif done: 46 | break 47 | 48 | # Output metrics/evaluation stuff 49 | evasion_rate = (evasions / episode_count) * 100 50 | mean_action_count = np.mean(env.get_episode_lengths()) 51 | print(f"{evasion_rate}% samples evaded model.") 52 | print(f"Average of {mean_action_count} moves to evade model.") 53 | -------------------------------------------------------------------------------- /random_agent.py: -------------------------------------------------------------------------------- 1 | import os 2 | import random 3 | import sys 4 | 5 | import gym 6 | import numpy as np 7 | from gym import wrappers 8 | from IPython import embed 9 | 10 | import malware_rl 11 | 12 | random.seed(0) 13 | module_path = os.path.split(os.path.abspath(sys.modules[__name__].__file__))[0] 14 | 15 | 16 | class RandomAgent: 17 | """The world's simplest agent!""" 18 | 19 | def __init__(self, action_space): 20 | self.action_space = action_space 21 | 22 | def act(self, observation, reward, done): 23 | return self.action_space.sample() 24 | 25 | 26 | # gym setup 27 | outdir = os.path.join(module_path, "data/logs/random-agent-results") 28 | env = gym.make("malconv-train-v0") 29 | env = wrappers.Monitor(env, directory=outdir, force=True) 30 | env.seed(0) 31 | episode_count = 250 32 | done = False 33 | reward = 0 34 | 35 | # metric tracking 36 | evasions = 0 37 | evasion_history = {} 38 | 39 | agent = RandomAgent(env.action_space) 40 | 41 | for i in range(episode_count): 42 | ob = env.reset() 43 | sha256 = env.env.sha256 44 | while True: 45 | action = agent.act(ob, reward, done) 46 | ob, reward, done, ep_history =
env.step(action) 47 | if done and reward >= 10.0: 48 | evasions += 1 49 | evasion_history[sha256] = ep_history 50 | break 51 | 52 | elif done: 53 | break 54 | 55 | evasion_rate = (evasions / episode_count) * 100 56 | mean_action_count = np.mean(env.get_episode_lengths()) 57 | print(f"{evasion_rate}% samples evaded model.") 58 | print(f"Average of {mean_action_count} moves to evade model.") 59 | embed() 60 | -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | absl-py==0.10.0 2 | appdirs==1.4.3 3 | appnope==0.1.0 4 | astunparse==1.6.3 5 | backcall==0.2.0 6 | CacheControl==0.12.6 7 | cachetools==4.1.1 8 | certifi>=2022.12.07 9 | chardet==3.0.4 10 | cloudpickle==1.3.0 11 | colorama==0.4.3 12 | contextlib2==0.6.0 13 | decorator==4.4.2 14 | distlib==0.3.0 15 | distro==1.4.0 16 | gast==0.3.3 17 | google-auth==1.21.0 18 | google-auth-oauthlib==0.4.1 19 | google-pasta==0.2.0 20 | grpcio==1.53.2 21 | gym==0.17.2 22 | h5py==2.10.0 23 | html5lib==1.0.1 24 | idna==2.10 25 | importlib-metadata==1.7.0 26 | ipaddr==2.2.0 27 | ipython==8.10.0 28 | ipython-genutils==0.2.0 29 | jedi==0.17.2 30 | joblib>=1.2.0 31 | Keras==2.4.3 32 | Keras-Preprocessing==1.1.2 33 | lief>=0.12.2 34 | lightgbm==2.3.1 35 | lockfile==0.12.2 36 | Markdown==3.2.2 37 | mccabe==0.6.1 38 | msgpack==0.6.2 39 | nose==1.3.7 40 | numpy==1.22.0 41 | oauthlib==3.1.0 42 | opt-einsum==3.3.0 43 | packaging==20.3 44 | parso==0.7.1 45 | pep517==0.8.2 46 | pexpect==4.8.0 47 | pickleshare==0.7.5 48 | progress==1.5 49 | prompt-toolkit==3.0.6 50 | protobuf>=3.18.3 51 | ptyprocess==0.6.0 52 | pyasn1==0.4.8 53 | pyasn1-modules==0.2.8 54 | pycodestyle==2.6.0 55 | pyflakes==2.2.0 56 | pyglet==1.5.0 57 | Pygments>=2.7.4 58 | pyparsing==2.4.6 59 | python-dateutil==2.8.1 60 | pytoml==0.1.21 61 | PyYAML>=5.4 62 | requests==2.24.0 63 | requests-oauthlib==1.3.0 64 | retrying==1.3.3 65 | rsa>=4.7 66 | 
scikit-learn==1.0.1 67 | scipy==1.11.1 68 | six==1.15.0 69 | stable-baselines3 70 | svn==1.0.1 71 | tensorboard==2.3.0 72 | tensorboard-plugin-wit==1.7.0 73 | tensorflow>=2.3.1 74 | tensorflow-estimator==2.3.0 75 | termcolor==1.1.0 76 | threadpoolctl==2.1.0 77 | traitlets==4.3.3 78 | urllib3>=1.26.5 79 | wcwidth==0.2.5 80 | webencodings==0.5.1 81 | Werkzeug==2.3.8 82 | wrapt==1.12.1 83 | zipp==3.1.0 84 | -------------------------------------------------------------------------------- /stable_baselines_env_check.py: -------------------------------------------------------------------------------- 1 | import gym 2 | from stable_baselines3.common.env_checker import check_env 3 | 4 | import malware_rl # Needs to be included in order to make the environment using gym 5 | 6 | 7 | def test_env(env_name): 8 | 9 | print(f"TESTING {env_name}!") 10 | env = gym.make(env_name) 11 | print("Checking environment . . .") 12 | check_env(env) 13 | env.close() 14 | 15 | 16 | environments = ["sorel-train-v0", "malconv-train-v0", "ember-train-v0"] 17 | 18 | for e in environments: 19 | test_env(e) 20 | --------------------------------------------------------------------------------
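Both ppo.py and random_agent.py above close with the same evaluation arithmetic: an episode counts as an evasion when it finishes with reward >= 10.0, and the evasion rate is reported as a percentage of episodes alongside the mean episode length. A self-contained sketch of that bookkeeping — the helper name `summarize_runs` is illustrative, not from the repo:

```python
import numpy as np


def summarize_runs(final_rewards, episode_lengths, evasion_reward=10.0):
    """Replicate the metrics printed at the end of ppo.py / random_agent.py.

    final_rewards holds the reward each episode terminated with; an episode
    is an evasion when that reward reaches the 10.0 threshold the scripts use.
    """
    evasions = sum(1 for r in final_rewards if r >= evasion_reward)
    evasion_rate = (evasions / len(final_rewards)) * 100
    mean_action_count = float(np.mean(episode_lengths))
    return evasion_rate, mean_action_count


# Four toy episodes: two evaded, taking 3 and 2 modification actions.
rate, moves = summarize_runs([10.0, 0.0, 10.0, 0.0], [3, 5, 2, 4])
print(f"{rate}% samples evaded model.")
print(f"Average of {moves} moves to evade model.")
```

In the scripts themselves, `episode_lengths` comes from `env.get_episode_lengths()` on the `wrappers.Monitor` instance, so the average covers all episodes, not just the evasions.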