├── .gitignore ├── LICENSE ├── README.md ├── RELEASE ├── UPGRADE ├── collect_distribution.py ├── email_to_db.py ├── etc ├── logging_example.ini └── vt_example.ini ├── fetchmail_processor.py ├── fetchmailrc-example ├── lib ├── __init__.py ├── analysis │ ├── __init__.py │ ├── analysis.py │ ├── example.py │ └── mwzoo.py ├── ansistrm.py ├── constants.py ├── hunting.py └── vtmis │ ├── __init__.py │ ├── scoring_example.py │ └── utilities.py ├── migrate └── migrate_0.11.py ├── process_downloads.py ├── review_alerts.py └── vtmis.py /.gitignore: -------------------------------------------------------------------------------- 1 | *.pyc 2 | *.swp 3 | *.swo 4 | etc/vt.ini 5 | vtmis.sqlite3 6 | incoming/ 7 | email/ 8 | hashes/ 9 | raw_msgs/ 10 | *-repo 11 | campaign_translation.db 12 | lib/vtmis/scoring.py 13 | log/ 14 | etc/logging.ini 15 | venv/ 16 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | The MIT License (MIT) 2 | 3 | Copyright (c) 2015, The MITRE Corporation. All rights reserved. 4 | 5 | Approved for Public Release; Distribution Unlimited 14-1511 6 | 7 | Permission is hereby granted, free of charge, to any person obtaining a copy 8 | of this software and associated documentation files (the "Software"), to deal 9 | in the Software without restriction, including without limitation the rights 10 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 11 | copies of the Software, and to permit persons to whom the Software is 12 | furnished to do so, subject to the following conditions: 13 | 14 | The above copyright notice and this permission notice shall be included in all 15 | copies or substantial portions of the Software. 16 | 17 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 18 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 19 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 20 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 21 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 22 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 23 | SOFTWARE. 24 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | VT Hunter Overview 2 | ------------------ 3 | VT Hunter provides automation around the VirusTotal Intelligence service. It attempts to speed up the review process for your hunting alerts so you can quickly decide whether to download or ignore a particular alert. Currently, it runs in a Linux server environment and provides a fancy curses GUI where you make your decisions. All other processes run seamlessly in the background and take care of things like pulling your hunting alerts, organizing the data, downloading malware samples, and submitting samples for analysis. 4 | 5 | ![review_alerts.py screenshot](https://magicked.github.io/images/review_alerts.png) 6 | 7 | As you make a decision on each alert, a new one is displayed and the old one is handled in the background. 8 | 9 | RELEASE and UPGRADE 10 | ------------------- 11 | A new branch has been added to track the first actual release. Please see the UPGRADE and RELEASE files for more information. If you have used vt-hunter in its previous state, you must migrate the DB.
12 | 13 | VT Hunter Configuration 14 | ----------------------- 15 | 16 | 1. Copy etc/vt_example.ini to etc/vt.ini. Open etc/vt.ini and modify as necessary. 17 | 2. Copy etc/logging_example.ini to etc/logging.ini. Open etc/logging.ini and modify as necessary. 18 | 3. Configure fetchmail to use the fetchmail_processor.py script. 19 | a) copy fetchmailrc-example to ~/.fetchmailrc 20 | b) modify ~/.fetchmailrc to include your information 21 | * You might need to run fetchmail -B 1 -v to find the new SSL fingerprint of the email server and put that in ~/.fetchmailrc 22 | 4. Copy campaign_translation_example.db to campaign_translation.db. Modify as necessary. See the campaign translation section for details. 23 | 5. Copy lib/vtmis/scoring_example.py to lib/vtmis/scoring.py. Modify lib/vtmis/scoring.py to include weights for your custom campaigns. 24 | 6. Write your own analysis module! See the [Analysis Modules](#analysis-modules) section. 25 | 7. The database will be created the first time you run anything that uses it. 26 | 27 | ## Dependencies 28 | * sqlalchemy 29 | * requests 30 | * configparser 31 | 32 | Make sure you install the Python 3 versions of these dependencies. You can install them from your OS package manager, but you may prefer to use a virtual environment instead. You can use pyvenv to set one up for Python 3. 33 | 34 | On Ubuntu 14.04, I had to set this up without pip initially, then install setuptools and pip manually: 35 | 36 | ```shell 37 | pyvenv-3.4 --without-pip venv 38 | source ./venv/bin/activate 39 | wget https://pypi.python.org/packages/source/s/setuptools/setuptools-3.4.4.tar.gz 40 | tar -zxvf setuptools-3.4.4.tar.gz 41 | cd setuptools-3.4.4/ 42 | python setup.py install 43 | cd .. 44 | wget https://pypi.python.org/packages/source/p/pip/pip-1.5.6.tar.gz 45 | tar -zxvf pip-1.5.6.tar.gz 46 | cd pip-1.5.6/ 47 | python setup.py install 48 | cd ../ 49 | deactivate 50 | rm -rf setuptools-3.4.4* 51 | rm -rf pip-1.5.6* 52 | ``` 53 | 54 | After this, the normal pip install X worked fine. 55 | 56 | ## Campaign Translation 57 | campaign_translation.db contains mappings used to do string substitution on campaign names. You might use this if you don't want to put your internal campaign names on VirusTotal in any form (such as in a yara rule name). It allows you to provide an "external_name" (the fake name), which is then converted to the "internal_name" when the data is processed. 58 | 59 | As a further example: our internal name for a specific campaign is "Mighty Bear". We want to track this campaign in a yara rule on VT, so we create a fake name called "campaign1". Our rule is then named "rule prod_campaign1_pivy_strings". We also create a campaign_translation.db entry like so: 60 | 61 | ``` 62 | { 63 | "campaign1" : "mightybear", 64 | } 65 | ``` 66 | 67 | Now when we receive alerts and the emails are processed, this substitution will occur. 68 | 69 | One last note: unless you have a specific reason (such as some tagging scheme), it is probably a good idea to remove underscores from your campaign names. Underscores are used internally to separate rule names into tags, which could split your campaign name into two or more separate tags you don't want. 70 | 71 | ## Scoring 72 | scoring.py can be implemented in any way you see fit. The default implementation takes tags for the VT yara hit (based on the yara rule name) and assigns points based on the keywords found. Certain campaigns can be assigned a greater weight, and there is also room for keywords based on specific malware families or any other special keywords you define. The result is computed and returned by the get_string_score(rule) function.
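As a minimal illustration (assuming you have already copied lib/vtmis/scoring_example.py to lib/vtmis/scoring.py in step 5 above), the example weights score a rule name by splitting it on underscores and summing the weight of every keyword they recognize:

```
from lib.vtmis.scoring import get_string_score, get_rule_campaign

# With the example weights: "mightybear" is a named campaign (5 points) and
# "dridex" is listed under specific_malz (5 points); "prod" and "strings"
# match no list, so they add nothing.
print(get_string_score("prod_mightybear_dridex_strings"))   # 10
print(get_rule_campaign("prod_mightybear_dridex_strings"))  # mightybear
```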
73 | 74 | ## The Process 75 | 1. Run fetchmail. The -B option lets you limit the number of emails. This is also intended to be placed in a cron job. 76 | 2. Process the emails with email_to_db.py 77 | 3. Review alerts with review_alerts.py 78 | 4. Download and submit samples to your analysis module with process_downloads.py 79 | 5. Run the collect_distribution.py script to download and store data from the live distribution feed. (needs an unlimited API key) 80 | 81 | NOTE: When running in crontab, you need to cd to the vt-hunter directory first, like so: 82 | ``` 83 | */15 * * * * cd /path/to/vt-hunter && /usr/bin/fetchmail >> /path/to/log/fetchmail.log 84 | ``` 85 | 86 | ## Automation 87 | Currently, automation occurs via crontab. You want to automate the following tasks: 88 | * fetchmail 89 | * email_to_db.py 90 | 91 | You will also want to run the following in a screen session: 92 | * process_downloads.py 93 | * collect_distribution.py (needs an unlimited API key) 94 | 95 | At some point, the functionality of email_to_db.py can be moved to fetchmail_processor.py. I just haven't done this yet. 96 | 97 | ## Analysis Modules 98 | process_downloads.py is capable of submitting downloaded samples to any automated analysis you might have. To do so, create an analysis module in the lib/analysis/ directory. You must do the following: 99 | * Create your_analysis_module.py in lib/analysis/ 100 |   * Implement the methods in analysis.py 101 | * Add your_analysis_module to lib/analysis/__init__.py 102 | * Add a your_analysis_module section to vt.ini 103 | 104 | For example, if your_analysis_module looked like the following: 105 | 106 | ``` 107 | import analysis 108 | 109 | class YourAnalysisModule(analysis.AnalysisModule): 110 | 111 | def analyze_sample(self, filename='', tags=[]): 112 | # Do any analysis steps you want here. This could launch an external 113 | # script or be entirely self contained. 114 | print('Opening file: ' + filename) 115 | 116 | def check_status(self, filename=''): 117 | # This determines when a file has completed analysis. If you don't 118 | # want to deal with this, just return True 119 | print('Analysis completed.') 120 | return True 121 | 122 | def cleanup(self, filename='', tags=[]): 123 | # This is an additional step to clean up after this analysis module. 124 | # You don't necessarily need to do anything here 125 | print('Cleanup all the things!') 126 | ``` 127 | 128 | You would then add the following section to vt.ini: 129 | 130 | ``` 131 | [analysis_module_your_analysis_module] 132 | module = analysis.your_analysis_module 133 | class = YourAnalysisModule 134 | enabled = yes 135 | ``` 136 | 137 | Notice that the "your_analysis_module" parts exactly match the name of your_analysis_module.py. This naming convention is important to follow. 138 | 139 | lib/analysis/mwzoo.py contains a real example of an analysis module that submits each sample to a custom analysis engine. 140 | lib/analysis/downloader.py contains a simple example that downloads the malware file and moves it to a different working directory. 141 | 142 | ## Optional malware selection process 143 | * TODO: Configure "no review", aka direct download from email hits. Based on keywords from the rule name perhaps?
144 | -------------------------------------------------------------------------------- /RELEASE: -------------------------------------------------------------------------------- 1 | Version 0.11 2 | ============ 3 | - Added a new table "Tag" to track tags associated with downloads. 4 | - Added a progress count to review_alerts.py. 5 | - Added the ability to grab "tags" in review_alerts.py and then do a bulk action on anything matching the grabbed tags. 6 | - Separated out a release branch on git. 7 | - Consolidated versions to a single version for the entire project. I don't know why I broke it out in the first place. 8 | -------------------------------------------------------------------------------- /UPGRADE: -------------------------------------------------------------------------------- 1 | Version 0.12 to 0.13 2 | ==================== 3 | - Moved campaign_translation.db into etc/vt.ini 4 | - Created the utility library campaign_translation.py as a helper for translating campaigns 5 | 6 | Version 0.11 to 0.12 7 | ==================== 8 | - Added collect_distribution.py to collect the VT distribution feed. 9 | 10 | Version 0.000001337 to 0.11 11 | =========================== 12 | - Added a new table "Tag" to track tags associated with downloads. 13 | - Upgrade required: 14 | 1) Stop all crontab or other automation 15 | 2) Pull new code 16 | 3) Copy migrate_0.11.py from the migrate/ directory to your root installation directory. 17 | 4) Run migrate_0.11.py. This may take some time depending on how big your DB is. 18 | 5) Restart automation! Remove migrate_0.11.py. 19 | -------------------------------------------------------------------------------- /collect_distribution.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | import argparse 3 | import json 4 | import os 5 | import requests 6 | import time 7 | import logging 8 | import logging.config 9 | import lib.hunting as hunting 10 | 11 | from lib.constants import VT_VERSION, VT_HOME 12 | from configparser import ConfigParser 13 | from datetime import datetime 14 | 15 | log_path = os.path.join(VT_HOME, "etc", "logging.ini") 16 | try: 17 | logging.config.fileConfig(log_path) 18 | log = logging.getLogger("collectDistribution") 19 | except Exception as e: 20 | raise SystemExit("unable to load logging configuration file {0}: {1}".format(log_path, str(e))) 21 | 22 | def collector_init(): 23 | global config 24 | global downloads_dir 25 | 26 | log.debug("Running VT Hunter version {0}".format(VT_VERSION)) 27 | try: 28 | config = ConfigParser() 29 | config.read(os.path.join(VT_HOME, "etc", "vt.ini")) 30 | except ImportError: 31 | raise SystemExit('vt.ini was not found or was not accessible.') 32 | 33 | # TODO: Not used yet, but at some point we will watch for certain source ids or 34 | # other indicators and will download files automatically 35 | downloads_dir = config.get('locations', 'downloads') 36 | 37 | if not os.path.exists(downloads_dir): 38 | os.mkdir(downloads_dir) 39 | 40 | os.environ["http_proxy"] = config.get("proxy", "http") 41 | os.environ["https_proxy"] = config.get("proxy", "https") 42 | 43 | 44 | def download_feed(last_timestamp): 45 | # If we don't have a timestamp, we just retrieve the last 500 files 46 | params = { 'apikey' : config.get('vt', 'api_master'), 'reports' : 'false', 'after' : last_timestamp, 'limit' : config.get('vt', 'limit') } 47 | 48 | logging.debug('Making distribution API call')
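    # The distribution feed is paged by millisecond timestamps: 'after' is the last
    # timestamp seen on the previous pass and 'limit' caps how many entries come back
    # (the 'limit' option in etc/vt.ini). 'reports' is sent as 'false' because this
    # collector only stores sample metadata, and api_master must be used here since
    # the README notes the distribution feed needs an unlimited API key.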
49 | r = requests.get( 'https://www.virustotal.com/vtapi/v2/file/distribution', params=params ) 50 | first_ts = 0 51 | last_ts = 0 52 | last_md5 = '' 53 | if r.status_code == 200: 54 | logging.debug('Status code 200 received. Parsing results...') 55 | count = 0 56 | r_json = r.json() 57 | for entry in r_json: 58 | count += 1 59 | if count == 1: 60 | first_ts = entry['timestamp'] 61 | # Check to see if we already have this entry 62 | fs = None 63 | ls = None 64 | if entry['first_seen'] is not None: 65 | fs = datetime.strptime(entry['first_seen'], '%Y-%m-%d %H:%M:%S') 66 | if entry['last_seen'] is not None: 67 | ls = datetime.strptime(entry['last_seen'], '%Y-%m-%d %H:%M:%S') 68 | tags = '' 69 | if len(entry['tags']) > 0: 70 | tags = ','.join(entry['tags']) 71 | 72 | statement = {'md5': entry['md5'], 'sha1' : entry['sha1'], 'sha256' : entry['sha256'], 'size' : entry['size'], 'type' : entry['type'], 'vhash' : entry['vhash'], 'ssdeep' : entry['ssdeep'], 'link' : entry['link'], 'source_country' : entry['source_country'], 'first_seen' : fs, 'last_seen' : ls, 'source_id' : entry['source_id'], 'orig_filename' : entry['name'], 'timestamp' : entry['timestamp'], 'tags' : tags } 73 | hunting.insert_vt_sample(statement) 74 | last_ts = entry['timestamp'] 75 | last_md5 = entry['md5'] 76 | 77 | hunting.sess.commit() 78 | 79 | logging.info('Processed {0} results from distribution feed.'.format(count)) 80 | else: 81 | logging.warning('Received non-200 status code from distribution feed: {0}'.format(r.status_code)) 82 | 83 | logging.debug('Last MD5 is {0}'.format(last_md5)) 84 | return (first_ts, last_ts) 85 | 86 | 87 | if __name__ == "__main__": 88 | parser = argparse.ArgumentParser() 89 | parser.add_argument("-v", "--version", action="version", version="You are running VT collector {0}".format(VT_VERSION)) 90 | args = parser.parse_args() 91 | 92 | running = True 93 | collector_init() 94 | last_ts = time.time() * 1000 95 | last_ts = int(last_ts) 96 | time.sleep(5) 97 | 98 | while running: 99 | try: 100 | returnobj = download_feed(last_ts) 101 | first_ts = returnobj[0] 102 | last_ts = returnobj[1] 103 | first_dt = datetime.fromtimestamp(first_ts / 1000.0).strftime('%Y-%m-%d %H:%M:%S') 104 | last_dt = datetime.fromtimestamp(last_ts / 1000.0).strftime('%Y-%m-%d %H:%M:%S') 105 | logging.debug('First timestamp is {0} and last timestamp is {1}'.format(first_ts, last_ts)) 106 | logging.debug('First datetime is {0} and last datetime is {1}'.format(first_dt, last_dt)) 107 | logging.debug('Sleeping for 10 seconds') 108 | time.sleep(10) 109 | except KeyboardInterrupt: 110 | log.info('Caught kill signal, shutting down.') 111 | running = False 112 | # TODO: Find a way to clean up running processes 113 | -------------------------------------------------------------------------------- /email_to_db.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | 3 | """ 4 | This script processes incoming emails and enters the appropriate information 5 | into the notifications database.
6 | """ 7 | 8 | __author__ = "hausrath@gmail.com (Nate Hausrath)" 9 | 10 | import os, sys, time 11 | import re 12 | import email 13 | import uuid 14 | import datetime 15 | import lib.hunting as hunting 16 | 17 | from lib.constants import VT_HOME 18 | from lib.vtmis.utilities import * 19 | from lib.vtmis.scoring import * 20 | from configparser import ConfigParser 21 | from io import StringIO 22 | 23 | try: 24 | config = ConfigParser() 25 | config.read(os.path.join(VT_HOME, "etc", "vt.ini")) 26 | except ImportError: 27 | raise SystemExit('vt.ini was not found or was not accessible.') 28 | 29 | # TODO: Add real logging 30 | 31 | scoring = get_scoring_dict() 32 | incoming_emails = config.get("locations", "incoming_emails") 33 | processed_emails = config.get("locations", "processed_emails") 34 | failed_emails = config.get("locations", "failed_emails") 35 | raw_msgs = config.get("locations", "raw_msgs") 36 | 37 | # These are new incoming emails 38 | if not os.path.exists(incoming_emails): 39 | print("There is no incoming email directory!") 40 | exit(1) 41 | # This is where archived emails go that have already by processed 42 | if not os.path.exists(processed_emails): 43 | os.mkdir(processed_emails) 44 | # This is where raw messages go 45 | if not os.path.exists(raw_msgs): 46 | os.mkdir(raw_msgs) 47 | # This is where failed messages go 48 | if not os.path.exists(failed_emails): 49 | os.mkdir(failed_emails) 50 | # Limit for the number of emails to process this time. Mainly used for testing. 51 | # Set to 0 for unlimited. 52 | LIMIT = 0 53 | 54 | # Build our regex strings 55 | re_md5 = re.compile(r'MD5\s+:\s+([A-Fa-f0-9]{32})') 56 | re_sha1 = re.compile(r'SHA1\s+:\s+([A-Fa-f0-9]{40})') 57 | re_sha256 = re.compile(r'SHA256\s+:') 58 | re_type = re.compile(r'Type\s+:\s+([A-Za-z0-9\s]+)') 59 | re_orig_filename = re.compile(r'OriginalFilename\s+:\s+([\w\s\d]+)') 60 | re_link = re.compile(r'Link\s+:') 61 | re_rule = re.compile('\[VTMIS\]\[[a-f0-9]+\]\s(.*)$') 62 | re_first_country = re.compile(r'First country\s*:\s+([A-Za-z]{2})') 63 | re_first_source = re.compile(r'First source\s+:\s+([a-z0-9]{8})\s+\(([a-z0-9A-Z]+)\)') 64 | 65 | incoming_count = len(os.listdir(incoming_emails)) 66 | total_processed = 0 67 | 68 | for f in os.listdir(incoming_emails): 69 | if os.path.isdir(incoming_emails + "/" + f): 70 | continue 71 | if LIMIT > 0 and total_processed >= LIMIT: 72 | continue 73 | if total_processed % 100 == 0: 74 | # TODO: This will not complete if the number of emails is too small. 75 | print("Processed " + str(total_processed) + " / " + str(incoming_count)) 76 | total_processed += 1 77 | 78 | # Read our email 79 | fin = open(incoming_emails + "/" + f, 'r') 80 | fstr = fin.read() 81 | fin.close() 82 | 83 | msg = email.message_from_string(fstr) 84 | 85 | rule = '' 86 | md5 = '' 87 | sha1 = '' 88 | sha256 = '' 89 | filetype = '' 90 | orig_file_name = '' 91 | link = '' 92 | first_source = '' 93 | first_source_type = '' 94 | first_country = '' 95 | 96 | # Get and clean the subject 97 | subject = ''.join(str(msg['subject']).splitlines()) 98 | re_rule = re.compile(r'\[VTMIS\]\[[0-9A-Za-z]+\]\s*(.*)') 99 | re_rule_match = re_rule.search(subject.split(":")[0]) 100 | if re_rule_match: 101 | rule = re_rule_match.group(1) 102 | else: 103 | print("Cannot find the appropriate rule match in the email subject. 
Sending email to {0}".format(os.path.join(failed_emails, f))) 104 | os.rename(os.path.join(incoming_emails, f), os.path.join(failed_emails, f)) 105 | continue 106 | 107 | payload = StringIO(msg.get_payload()) 108 | next_sha256 = False 109 | next_link = False 110 | raw_msg_text = "" 111 | raw_msg_file = str(uuid.uuid4()) 112 | 113 | maintype = msg.get_content_maintype() 114 | if maintype == 'text': 115 | for line in payload.readlines(): 116 | raw_msg_text = raw_msg_text + line 117 | match_md5 = re_md5.search(line) 118 | match_sha1 = re_sha1.search(line) 119 | match_sha256 = re_sha256.search(line) 120 | match_type = re_type.match(line) 121 | match_orig_fname = re_orig_filename.search(line) 122 | match_first_source = re_first_source.search(line) 123 | match_first_country = re_first_country.search(line) 124 | # Some goofy logic here to handle multilines and whatnot. 125 | if next_sha256: 126 | sha256 = line.rstrip() 127 | next_sha256 = False 128 | if match_md5: 129 | md5 = match_md5.group(1) 130 | if match_sha1: 131 | sha1 = match_sha1.group(1) 132 | if match_sha256: 133 | next_sha256 = True 134 | if match_type: 135 | filetype = match_type.group(1) 136 | filetype = filetype.rstrip() 137 | if match_orig_fname: 138 | orig_file_name = match_orig_fname.group(1) 139 | if match_first_source: 140 | first_source = match_first_source.group(1) 141 | first_source_type = match_first_source.group(2) 142 | if match_first_country: 143 | first_country = match_first_country.group(1) 144 | 145 | # Get time for file paths 146 | utctime = time.gmtime() 147 | utctimestr = time.strftime("%Y-%m-%d", utctime) 148 | 149 | # Build our file locations 150 | email_archive = utctimestr + "/" + f 151 | raw_email_html = utctimestr + "/" + f 152 | 153 | # Get the timestamp 154 | created_at = datetime.datetime.now() 155 | 156 | # First check to see if this file and rule hits are already in the database. 
157 | check_exists = hunting.sess.query(hunting.Hit).filter(hunting.Hit.md5 == md5, hunting.Hit.rule == rule).first() 158 | 159 | if check_exists is None: 160 | # Check if a download exists for this md5 already 161 | dl = hunting.sess.query(hunting.Download).filter(hunting.Download.md5 == md5).first() 162 | if dl is None: 163 | dl = hunting.Download(md5=md5, sha1=sha1, score=0, process_state=0) 164 | # Now we write all the data we scraped to the DB 165 | hit = hunting.Hit(md5=md5, sha1=sha1, sha256=sha256, rule=rule, created_at=created_at, first_source=first_source, first_country=first_country, file_type=filetype, first_source_type=first_source_type, orig_file_name=orig_file_name, raw_email_html=raw_email_html, email_archive=email_archive, score=get_string_score(rule), download=dl) 166 | dl.score += hit.score 167 | #print("Inserting hit with md5 {0} and rule {1}".format(md5, rule)) 168 | hunting.sess.add(hit) 169 | hunting.sess.commit() 170 | 171 | # Now we need to create the tags entry 172 | # Obtain and make unique the tags 173 | tag_list = [] 174 | tag_list.extend(hit.rule.split('_')) 175 | tags = set(tag_list) 176 | 177 | for tag_str in tags: 178 | tag = hunting.sess.query(hunting.Tag).filter(hunting.Tag.tag == tag_str).first() 179 | if tag is None: 180 | #print("Inserting tag {0}".format(tag_str)) 181 | tag = hunting.Tag(tag=tag_str) 182 | hunting.sess.add(tag) 183 | hunting.sess.commit() 184 | if tag not in dl.tags: 185 | #print("Linking tag {0} to dl {1}".format(tag_str, dl.md5)) 186 | dl.tags.append(tag) 187 | hunting.sess.commit() 188 | 189 | # Convert the raw message to html and write it out 190 | if not os.path.exists(raw_msgs + "/" + utctimestr): 191 | os.mkdir(raw_msgs + "/" + utctimestr) 192 | if not os.path.exists(processed_emails + "/" + utctimestr): 193 | os.mkdir(processed_emails + "/" + utctimestr) 194 | raw_msg_html = convert_msg_to_html(raw_msg_text) 195 | fout = open(raw_msgs + raw_email_html, "w") 196 | fout.write(raw_msg_html) 197 | fout.close() 198 | os.rename(os.path.join(incoming_emails, f), os.path.join(processed_emails, email_archive)) 199 | 200 | print("Processed " + str(total_processed) + " / " + str(incoming_count)) 201 | -------------------------------------------------------------------------------- /etc/logging_example.ini: -------------------------------------------------------------------------------- 1 | [loggers] 2 | keys=root,processDownloads,collectDistribution 3 | 4 | [handlers] 5 | keys=console,file,distcoll 6 | 7 | [formatters] 8 | keys=base 9 | 10 | [logger_root] 11 | level=DEBUG 12 | handlers=console,file 13 | 14 | [logger_processDownloads] 15 | level=DEBUG 16 | handlers=console,file 17 | qualname=processDownloads 18 | propagate=0 19 | 20 | [logger_collectDistribution] 21 | level=DEBUG 22 | handlers=console,distcoll 23 | qualname=collectDistribution 24 | propagate=0 25 | 26 | [handler_console] 27 | class=lib.ansistrm.ColorizingStreamHandler 28 | level=DEBUG 29 | formatter=base 30 | args=(sys.stdout,) 31 | 32 | [handler_file] 33 | class=logging.FileHandler 34 | level=DEBUG 35 | formatter=base 36 | args=("log/vt.log",) 37 | 38 | [handler_distcoll] 39 | class=logging.FileHandler 40 | level=INFO 41 | formatter=base 42 | args=("log/vt_dist.log",) 43 | 44 | [formatter_base] 45 | format=[%(asctime)s] [%(filename)s:%(lineno)d] [%(threadName)s] [%(levelname)s] - %(message)s 46 | -------------------------------------------------------------------------------- /etc/vt_example.ini: -------------------------------------------------------------------------------- 1 | [vt] 2 | ; MASTER is
the vtmis account key that gives access to live feed notifications 3 | api_master = 4 | ; Your individual API key 5 | api_local = 6 | ; The limit for the number of results to pull from the distribution feed 7 | limit = 1000 8 | 9 | [locations] 10 | ; Locations for various things 11 | sqlite_db = /path/to/dev/hunting/vtmis.sqlite3 12 | incoming_emails = /path/to/dev/hunting/incoming/ 13 | processed_emails = /path/to/dev/hunting/email/ 14 | failed_emails = /path/to/dev/hunting/failed/ 15 | raw_msgs = /path/to/dev/hunting/raw_msgs/ 16 | hashes = /path/to/dev/hashes/ 17 | downloads = /path/to/dev/malware_downloads/ 18 | 19 | [proxy] 20 | ; Proxy 21 | http = http://1.2.3.4:5678/ 22 | https = https://1.2.3.4:5678/ 23 | 24 | ; See analysis modules section of help for more information 25 | [analysis_module_example] 26 | module = analysis.example 27 | class = Example 28 | enabled = yes 29 | -------------------------------------------------------------------------------- /fetchmail_processor.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | import os 3 | import sys 4 | import re 5 | import uuid 6 | import logging 7 | import logging.config 8 | 9 | from lib.constants import VT_VERSION, VT_HOME 10 | from configparser import ConfigParser 11 | 12 | try: 13 | config = ConfigParser() 14 | config.read(os.path.join(VT_HOME, "etc", "vt.ini")) 15 | except ImportError: 16 | raise SystemExit('vt.ini was not found or was not accessible.') 17 | 18 | incoming_emails = config.get('locations', 'incoming_emails') 19 | 20 | if not os.path.exists(incoming_emails): 21 | os.mkdir(incoming_emails) 22 | 23 | fname = str(uuid.uuid4()) 24 | fstr = "" 25 | for line in sys.stdin: 26 | line = re.sub('=3D', '=', line) 27 | line = re.sub('=20', ' ', line) 28 | line = line.rstrip() 29 | if len(line) > 1: 30 | if line[-1] == "=": 31 | line = line[:-1] 32 | fstr = fstr + line 33 | else: 34 | fstr = fstr + line + '\n' 35 | else: 36 | fstr = fstr + line + '\n' 37 | 38 | fout = open(incoming_emails + fname, 'w') 39 | fout.write(fstr) 40 | fout.close() 41 | -------------------------------------------------------------------------------- /fetchmailrc-example: -------------------------------------------------------------------------------- 1 | poll emailserver.com protocol imap user "vtmis.functional@email.com" password "yourpass" is "yourusername" here options ssl sslfingerprint "79:24:DF:5A:39:92:C2:4E:A1:1C:67:E5:59:3D:98:43" mda "python $VTPATH/vt-hunter/fetchmail_processor.py" 2 | -------------------------------------------------------------------------------- /lib/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lolnate/vt-hunter/e1034093bf2d3b1866e34fd7ae98c9bdb67f8b3e/lib/__init__.py -------------------------------------------------------------------------------- /lib/analysis/__init__.py: -------------------------------------------------------------------------------- 1 | __all__ = [ "mwzoo" ] 2 | -------------------------------------------------------------------------------- /lib/analysis/analysis.py: -------------------------------------------------------------------------------- 1 | import lib.analysis 2 | 3 | class AnalysisModule(object): 4 | 5 | def __init__(self, config_section, *args, **kwargs): 6 | assert isinstance(config_section, str) 7 | self.config_section = config_section 8 | 9 | ''' 10 | Called to start the analysis for the module. 
11 | ''' 12 | def analyze_sample(self, filename='',tags=[] ): 13 | raise NotImplementedError("This analysis module was not implemented.") 14 | 15 | def check_status(self, filename='', tags=[] ): 16 | raise NotImplementedError("This analysis module was not implemented.") 17 | 18 | def cleanup(self, filename='', tags=[]): 19 | raise NotImplementedError("This analysis module was not implemented.") 20 | -------------------------------------------------------------------------------- /lib/analysis/example.py: -------------------------------------------------------------------------------- 1 | import analysis 2 | 3 | class Example(analysis.AnalysisModule): 4 | 5 | def analyze_sample(self, filename='', tags=[]): 6 | # Do any analysis steps you want here. This could launch an external 7 | # script or be entirely self contained. 8 | print('Opening file: ' + filename) 9 | 10 | def check_status(self, filename=''): 11 | # This determines when a file has completed analysis. If you don't 12 | # want to deal with this, just return True 13 | print('Analysis completed.') 14 | return True 15 | -------------------------------------------------------------------------------- /lib/analysis/mwzoo.py: -------------------------------------------------------------------------------- 1 | import hashlib 2 | import lib.analysis 3 | import subprocess 4 | import os, time 5 | import logging 6 | import logging.config 7 | from lib.analysis import analysis 8 | from subprocess import Popen 9 | 10 | class MWZoo(analysis.AnalysisModule): 11 | 12 | def _get_index_path(self, _hash): 13 | return os.path.join('/opt/mwzoo/index/md5/', _hash[0:3], _hash) 14 | 15 | def _sample_exists(self, _hash): 16 | """Returns True if a given hash already exists in the zoo, False otherwise.""" 17 | logging.debug("Looking up hash {0}".format(_hash)) 18 | path = self._get_index_path(_hash) 19 | logging.debug('Index path is {0}'.format(path)) 20 | if os.path.islink(path) and not os.path.exists(os.path.realpath(path)): 21 | logging.warning("index is corrupted: {0} is broken link".format(path)) 22 | return False 23 | 24 | return os.path.exists(path) 25 | 26 | def _get_file_hash(self, _filename): 27 | logging.debug('Building hash for file {0}'.format(_filename)) 28 | m = hashlib.md5() 29 | with open(_filename, 'rb') as f: 30 | m.update(f.read()) 31 | return m.hexdigest() 32 | 33 | ''' 34 | This submits the sample to our malware zoo for analysis. We use an external 35 | process for this submission. 36 | ''' 37 | def analyze_sample(self, filename='', tags=[]): 38 | # When we submit the sample to the mwzoo, it will create a copy of that sample 39 | # in its directory structure. 40 | formatted_tags = [] 41 | for tag in tags: 42 | formatted_tags.append('-t') 43 | formatted_tags.append(tag) 44 | 45 | subdir = '' 46 | if len(tags) > 0: 47 | subdir = "_".join(sorted(tags)) 48 | # The data directory for the file 49 | mwzoo_dirname = '/opt/mwzoo/data/vt/' + subdir 50 | 51 | ''' 52 | For reference, here is the add-sample command for our mwzoo: 53 | usage: add-sample [-h] [--enable-download] -t TAGS -s SOURCE 54 | [--comment COMMENT] [-d SUBDIRECTORY] [--disable-analysis] 55 | input_data [input_data ...] 56 | 57 | Add a given file or download by hash from VirusTotal. 58 | 59 | positional arguments: 60 | input_data The files or hashes or add. Accepts file paths and 61 | md5, sha1 and/or sha256 hashes. 62 | 63 | optional arguments: 64 | -h, --help show this help message and exit 65 | --enable-download Enable downloading files from VirusTotal. 
66 | -t TAGS, --tags TAGS Add the given tag to the sample. Multiple -t options 67 | are allowed. 68 | -s SOURCE, --source SOURCE 69 | Record the original source of the file. 70 | --comment COMMENT Record a comment about the sample. 71 | -d SUBDIRECTORY, --subdirectory SUBDIRECTORY 72 | File the sample in the given subdirectory. Defaults to 73 | processing the file where it's at. 74 | --disable-analysis Do not analyze files, just add them. 75 | ''' 76 | fhash = self._get_file_hash(filename) 77 | if self._sample_exists(fhash): 78 | logging.info('Sample already exists: {0}'.format(fhash)) 79 | return False 80 | 81 | logging.info('Launching add-sample for file {0}'.format(filename)) 82 | subprocess.call( ['/usr/bin/python', '/opt/mwzoo/bin/add-sample', '-s', 'vt', '--comment', 'VirusTotal automated download'] + formatted_tags + [ '-d', mwzoo_dirname, filename ] ) 83 | 84 | # Then we need to call the analyze function for the mwzoo. The -d option 85 | # tells it not to launch the sandbox analysis. 86 | logging.info('Launching mwzoo analyze for file {0}'.format(filename)) 87 | Popen( ['/usr/bin/python', '/opt/mwzoo/bin/analyze', '-d', 'cuckoo', mwzoo_dirname + "/" + os.path.basename(filename)] ) 88 | # Dumb hack to make sure the .running file is created in the mwzoo 89 | time.sleep(1) 90 | return True 91 | 92 | ''' 93 | This checks the status of the mwzoo analysis. 94 | True - analysis complete 95 | False - analysis not complete 96 | ''' 97 | def check_status(self, filename='', tags=[]): 98 | subdir = '' 99 | if len(tags) > 0: 100 | subdir = "_".join(sorted(tags)) 101 | # The data directory for the file 102 | mwzoo_dirname = '/opt/mwzoo/data/vt/' + subdir 103 | # If .analysis is NOT found, analysis has not yet started: 104 | fhash = self._get_file_hash(filename) 105 | if self._sample_exists(fhash): 106 | # Check for the .analysis dir 107 | if os.path.isdir(os.path.realpath(self._get_index_path(fhash)) + '.analysis'): 108 | logging.debug('Analysis directory found for sample: {0}'.format(os.path.basename(filename))) 109 | else: 110 | logging.debug('Analysis has not yet started for sample: {0}'.format(os.path.basename(filename))) 111 | return False 112 | 113 | # If the name.running file is present the analysis is still running. 114 | if os.path.isfile(mwzoo_dirname + os.path.basename(filename) + '.running'): 115 | # Still running 116 | logging.debug('Analysis is still running for sample: {0}'.format(os.path.basename(filename))) 117 | return False 118 | else: 119 | logging.debug('Running file not found for sample: {0}'.format(os.path.basename(filename))) 120 | 121 | # Otherwise, analysis is complete! 122 | logging.info('Analysis complete for {0}'.format(filename)) 123 | return True 124 | 125 | ''' 126 | Called at the end of the processing. 127 | We want to remove the file from the vt-hunter downloads directory 128 | since it is now stored in our mwzoo instead. 129 | ''' 130 | def cleanup(self, filename='', tags=[]): 131 | # Remove the malware file 132 | logging.info("Removing {0}".format(filename)) 133 | os.remove(filename) 134 | 135 | -------------------------------------------------------------------------------- /lib/ansistrm.py: -------------------------------------------------------------------------------- 1 | # 2 | # Copyright (C) 2010-2012 Vinay Sajip. All rights reserved. Licensed under the new BSD license. 
3 | # 4 | import ctypes 5 | import logging 6 | import os 7 | 8 | class ColorizingStreamHandler(logging.StreamHandler): 9 | # color names to indices 10 | color_map = { 11 | 'black': 0, 12 | 'red': 1, 13 | 'green': 2, 14 | 'yellow': 3, 15 | 'blue': 4, 16 | 'magenta': 5, 17 | 'cyan': 6, 18 | 'white': 7, 19 | } 20 | 21 | #levels to (background, foreground, bold/intense) 22 | if os.name == 'nt': 23 | level_map = { 24 | logging.DEBUG: (None, 'blue', True), 25 | logging.INFO: (None, 'white', False), 26 | logging.WARNING: (None, 'yellow', True), 27 | logging.ERROR: (None, 'red', True), 28 | logging.CRITICAL: ('red', 'white', True), 29 | } 30 | else: 31 | level_map = { 32 | logging.DEBUG: (None, 'blue', False), 33 | logging.INFO: (None, 'black', False), 34 | logging.WARNING: (None, 'yellow', False), 35 | logging.ERROR: (None, 'red', False), 36 | logging.CRITICAL: ('red', 'white', True), 37 | } 38 | csi = '\x1b[' 39 | reset = '\x1b[0m' 40 | 41 | @property 42 | def is_tty(self): 43 | isatty = getattr(self.stream, 'isatty', None) 44 | return isatty and isatty() 45 | 46 | def emit(self, record): 47 | try: 48 | message = self.format(record) 49 | stream = self.stream 50 | if not self.is_tty: 51 | stream.write(message) 52 | else: 53 | self.output_colorized(message) 54 | stream.write(getattr(self, 'terminator', '\n')) 55 | self.flush() 56 | except (KeyboardInterrupt, SystemExit): 57 | raise 58 | except: 59 | self.handleError(record) 60 | 61 | if os.name != 'nt': 62 | def output_colorized(self, message): 63 | self.stream.write(message) 64 | else: 65 | import re 66 | ansi_esc = re.compile(r'\x1b\[((?:\d+)(?:;(?:\d+))*)m') 67 | 68 | nt_color_map = { 69 | 0: 0x00, # black 70 | 1: 0x04, # red 71 | 2: 0x02, # green 72 | 3: 0x06, # yellow 73 | 4: 0x01, # blue 74 | 5: 0x05, # magenta 75 | 6: 0x03, # cyan 76 | 7: 0x07, # white 77 | } 78 | 79 | def output_colorized(self, message): 80 | parts = self.ansi_esc.split(message) 81 | write = self.stream.write 82 | h = None 83 | fd = getattr(self.stream, 'fileno', None) 84 | if fd is not None: 85 | fd = fd() 86 | if fd in (1, 2): # stdout or stderr 87 | h = ctypes.windll.kernel32.GetStdHandle(-10 - fd) 88 | while parts: 89 | text = parts.pop(0) 90 | if text: 91 | write(text) 92 | if parts: 93 | params = parts.pop(0) 94 | if h is not None: 95 | params = [int(p) for p in params.split(';')] 96 | color = 0 97 | for p in params: 98 | if 40 <= p <= 47: 99 | color |= self.nt_color_map[p - 40] << 4 100 | elif 30 <= p <= 37: 101 | color |= self.nt_color_map[p - 30] 102 | elif p == 1: 103 | color |= 0x08 # foreground intensity on 104 | elif p == 0: # reset to default color 105 | color = 0x07 106 | else: 107 | pass # error condition ignored 108 | ctypes.windll.kernel32.SetConsoleTextAttribute(h, color) 109 | 110 | def colorize(self, message, record): 111 | if record.levelno in self.level_map: 112 | bg, fg, bold = self.level_map[record.levelno] 113 | params = [] 114 | if bg in self.color_map: 115 | params.append(str(self.color_map[bg] + 40)) 116 | if fg in self.color_map: 117 | params.append(str(self.color_map[fg] + 30)) 118 | if bold: 119 | params.append('1') 120 | if params: 121 | message = ''.join((self.csi, ';'.join(params), 122 | 'm', message, self.reset)) 123 | return message 124 | 125 | def format(self, record): 126 | message = logging.StreamHandler.format(self, record) 127 | if self.is_tty: 128 | # Don't colorize any traceback 129 | parts = message.split('\n', 1) 130 | parts[0] = self.colorize(parts[0], record) 131 | message = '\n'.join(parts) 132 | return message 133 | 134 | 
def main(): 135 | root = logging.getLogger() 136 | root.setLevel(logging.DEBUG) 137 | root.addHandler(ColorizingStreamHandler()) 138 | logging.debug('DEBUG') 139 | logging.info('INFO') 140 | logging.warning('WARNING') 141 | logging.error('ERROR') 142 | logging.critical('CRITICAL') 143 | 144 | if __name__ == '__main__': 145 | main() 146 | -------------------------------------------------------------------------------- /lib/constants.py: -------------------------------------------------------------------------------- 1 | import os 2 | 3 | _current_dir = os.path.abspath(os.path.dirname(__file__)) 4 | VT_HOME = os.path.normpath(os.path.join(_current_dir, "..")) 5 | VT_VERSION = "0.11" 6 | -------------------------------------------------------------------------------- /lib/hunting.py: -------------------------------------------------------------------------------- 1 | import sqlalchemy 2 | 3 | from sqlalchemy.ext.declarative import declarative_base 4 | from sqlalchemy import Column, Integer, String, DateTime, Text, Date, Boolean, Table 5 | from sqlalchemy import create_engine 6 | from sqlalchemy import ForeignKey 7 | from sqlalchemy.orm import sessionmaker 8 | from sqlalchemy.orm import relationship, backref 9 | 10 | from configparser import ConfigParser 11 | from lib.constants import VT_VERSION, VT_HOME 12 | 13 | import os 14 | 15 | Base = declarative_base() 16 | 17 | DownloadTag = Table('download_tag', Base.metadata, 18 | Column('tagId', Integer, ForeignKey('tag.id'), primary_key=True), 19 | Column('downloadId', Integer, ForeignKey('download.id'), primary_key=True) 20 | ) 21 | 22 | #VTSampleTag = Table('vt_sample_tag', Base.metadata, 23 | # Column('tagId', Integer, ForeignKey('vt_tag.id'), primary_key=True), 24 | # Column('sampleId', Integer, ForeignKey('vt_sample.id'), primary_key=True) 25 | # ) 26 | 27 | class Hit(Base): 28 | __tablename__ = "hunting" 29 | 30 | id = Column(Integer, primary_key=True) 31 | md5 = Column(String, ForeignKey('download.md5')) 32 | sha1 = Column(String) 33 | sha256 = Column(String) 34 | rule = Column(String) 35 | created_at = Column(DateTime) 36 | first_source = Column(String) 37 | first_country = Column(String) 38 | file_type = Column(String) 39 | first_source_type = Column(String) 40 | orig_file_name = Column(String) 41 | raw_email_html = Column(Text) 42 | email_archive = Column(String) 43 | score = Column(Integer) 44 | 45 | download = relationship("Download", backref=backref('hits', order_by=id)) 46 | 47 | def __repr__(self): 48 | return "" % (self.id, self.md5, self.download) 49 | 50 | 51 | class Download(Base): 52 | __tablename__ = "download" 53 | 54 | id = Column(Integer, primary_key=True) 55 | md5 = Column(String) 56 | sha1 = Column(String) 57 | score = Column(Integer) 58 | # 0 = Not Reviewed 59 | # 1 = Download 60 | # 2 = Downloaded, Awaiting Processing 61 | # 3 = Processing 62 | # 4 = Processed 63 | # 5 = Do Not Download 64 | # 6 = Error Downloading 65 | process_state = Column(Integer) 66 | tags = relationship('Tag', secondary=DownloadTag, backref='download') 67 | 68 | def __repr__(self): 69 | return "" % (self.id, self.md5, self.sha1, self.process_state) 70 | 71 | 72 | class Tag(Base): 73 | __tablename__ = "tag" 74 | 75 | id = Column(Integer, primary_key=True) 76 | tag = Column(String) 77 | downloads = relationship('Download', secondary=DownloadTag, backref='tag') 78 | 79 | def __repr__(self): 80 | return "" % (self.id, self.tag) 81 | 82 | 83 | class VTSample(Base): 84 | __tablename__ = "vt_sample" 85 | 86 | id = Column(Integer, primary_key=True) 87 | 
md5 = Column(String) 88 | sha1 = Column(String) 89 | sha256 = Column(String) 90 | size = Column(Integer) 91 | type = Column(String) 92 | vhash = Column(String) 93 | ssdeep = Column(String) 94 | link = Column(String) 95 | source_country = Column(String) 96 | first_seen = Column(DateTime) 97 | last_seen = Column(DateTime) 98 | source_id = Column(String) 99 | orig_filename = Column(String) 100 | timestamp = Column(String) 101 | tags = Column(String) 102 | 103 | #tags = relationship('VTTag', secondary=VTSampleTag, backref='vt_sample') 104 | 105 | def __repr__(self): 106 | return "" % (self.id, self.md5) 107 | 108 | 109 | #class VTTag(Base): 110 | # __tablename__ = "vt_tag" 111 | # 112 | # id = Column(Integer, primary_key=True) 113 | # tag = Column(String) 114 | # vt_samples = relationship('VTSample', secondary=VTSampleTag, backref='vt_tag') 115 | # 116 | # def __repr__(self): 117 | # return "" % (self.id, self.tag) 118 | # 119 | # 120 | #class VTReport(Base): 121 | # __tablename__ = "vt_report" 122 | # 123 | # id = Column(Integer, primary_key=True) 124 | # sample_id = Column(String, ForeignKey('vt_sample.id')) 125 | # signature = Column(String) 126 | # detected = Column(Boolean) 127 | # vendor_name = Column(String) 128 | # version = Column(String) 129 | # date = Column(Date) 130 | # 131 | # vt_sample = relationship("VTSample", backref=backref('vt_reports', order_by=id)) 132 | 133 | 134 | try: 135 | config = ConfigParser() 136 | config.read(os.path.join(VT_HOME, "etc", "vt.ini")) 137 | except ImportError: 138 | raise SystemExit('vt.ini was not found or was not accessible.') 139 | 140 | global engine 141 | engine = create_engine("sqlite:///{0}".format(config.get("locations", "sqlite_db"))) 142 | Base.metadata.create_all(engine) 143 | sess = sessionmaker(bind=engine)() 144 | 145 | 146 | if __name__ == "__main__": 147 | results = sess.query(Hit).all() 148 | print(results) 149 | results[0].md5 = "1" 150 | sess.commit() 151 | results = sess.query(Hit).all() 152 | print(results) 153 | 154 | def insert_vt_sample(statement): 155 | engine.execute( 156 | VTSample.__table__.insert(), 157 | statement 158 | ) 159 | 160 | -------------------------------------------------------------------------------- /lib/vtmis/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lolnate/vt-hunter/e1034093bf2d3b1866e34fd7ae98c9bdb67f8b3e/lib/vtmis/__init__.py -------------------------------------------------------------------------------- /lib/vtmis/scoring_example.py: -------------------------------------------------------------------------------- 1 | valid_campaigns = [ "mightybear", "dancingdragon", "sillysand", "pretentiouspanda" ] 2 | 3 | def get_scoring_dict(): 4 | return { 5 | "unattrib" : { "score" : 1, "list" : [ "unattrib", "misc" ] }, 6 | "named" : { "score" : 5, "list" : valid_campaigns 7 | }, 8 | "top_campaign" : { "score" : 7, "list" : 9 | [ "dancingdragon", "pretentiouspanda" ] 10 | }, 11 | "specific_malz" : { "score" : 5, "list" : 12 | [ "pipeline", "dridex", "gh0st" ] 13 | }, 14 | "somewhat_special" : { "score" : 3, "list" : 15 | [ "sharinggroup" ] 16 | }, 17 | "super_special" : { "score" : 9, "list" : 18 | [ "incident", "malwarefamily" ] 19 | } 20 | } 21 | 22 | def get_string_score(rule): 23 | score = 0 24 | scoring = get_scoring_dict() 25 | rule_list = rule.split("_") 26 | for item in sorted(set(rule_list)): 27 | for key in scoring.keys(): 28 | if item in scoring[key]['list']: 29 | score += scoring[key]['score'] 30 | return score 
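# Example: get_string_score("prod_dancingdragon_gh0st_strings") returns 17 --
# "dancingdragon" counts as named (5) and top_campaign (7), "gh0st" counts as
# specific_malz (5), and "prod" / "strings" match no list, so they add nothing.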
31 | 32 | def get_rule_campaign(rule): 33 | items = rule.split("_") 34 | for subset in items: 35 | if subset in valid_campaigns: 36 | return subset 37 | return "unknown" 38 | -------------------------------------------------------------------------------- /lib/vtmis/utilities.py: -------------------------------------------------------------------------------- 1 | import re 2 | 3 | valid_rule_statuses = [ 'prod', 'dev', 'test'] 4 | 5 | def convert_msg_to_html(msg): 6 | html_msg = "" 7 | re_breaks = re.compile('\n') 8 | html_msg = re_breaks.sub("<br>
", msg) 9 | return html_msg 10 | 11 | def get_rule_status(rule): 12 | for subset in rule.split("_"): 13 | if subset in valid_rule_statuses: 14 | return subset 15 | return "dev" 16 | -------------------------------------------------------------------------------- /migrate/migrate_0.11.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | # Migrates to version 0.11 3 | import os 4 | import lib.hunting as hunting 5 | import logging 6 | import logging.config 7 | 8 | from lib.constants import VT_VERSION, VT_HOME 9 | 10 | log_path = os.path.join(VT_HOME, "etc", "logging.ini") 11 | try: 12 | logging.config.fileConfig(log_path) 13 | log = logging.getLogger("processDownloads") 14 | except Exception as e: 15 | raise SystemExit("unable to load logging configuration file {0}: {1}".format(log_path, str(e))) 16 | 17 | if float(VT_VERSION) >= 0.11: 18 | downloads = hunting.sess.query(hunting.Download).all() 19 | for download in downloads: 20 | hits = hunting.sess.query(hunting.Hit).filter(hunting.Hit.md5 == download.md5).all() 21 | tag_list = [] 22 | if len(hits) > 0: 23 | for hit in hits: 24 | tag_list.extend(hit.rule.split('_')) 25 | else: 26 | log.error('Download entry existed for md5 {0}, but no Hit entry was found.'.format(download.md5)) 27 | continue 28 | 29 | tags = set(tag_list) 30 | 31 | for t in tags: 32 | tag = hunting.sess.query(hunting.Tag).filter(hunting.Tag.tag == t).first() 33 | if tag is None: 34 | tag = hunting.Tag(tag=t) 35 | hunting.sess.add(tag) 36 | hunting.sess.commit() 37 | download.tags.append(tag) 38 | hunting.sess.commit() 39 | -------------------------------------------------------------------------------- /process_downloads.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | import sys, os, time 3 | import importlib 4 | import lib.analysis 5 | import argparse 6 | import logging 7 | import logging.config 8 | import lib.hunting as hunting 9 | 10 | from lib.constants import VT_VERSION, VT_HOME 11 | from subprocess import call 12 | from configparser import ConfigParser 13 | 14 | downloads_dir = '' 15 | 16 | log_path = os.path.join(VT_HOME, "etc", "logging.ini") 17 | try: 18 | logging.config.fileConfig(log_path) 19 | log = logging.getLogger("processDownloads") 20 | except Exception as e: 21 | raise SystemExit("unable to load logging configuration file {0}: {1}".format(log_path, str(e))) 22 | 23 | def processor_init(): 24 | global config 25 | global downloads_dir 26 | 27 | log.debug("Running VT Processor version {0}".format(VT_VERSION)) 28 | try: 29 | config = ConfigParser() 30 | config.read(os.path.join(VT_HOME, "etc", "vt.ini")) 31 | except ImportError: 32 | raise SystemExit('vt.ini was not found or was not accessible.') 33 | 34 | downloads_dir = config.get('locations', 'downloads') 35 | 36 | if not os.path.exists(downloads_dir): 37 | os.mkdir(downloads_dir) 38 | 39 | os.environ["http_proxy"] = config.get("proxy", "http") 40 | os.environ["https_proxy"] = config.get("proxy", "https") 41 | 42 | 43 | def download_files(): 44 | # Gather md5s of malware to download 45 | downloads = hunting.sess.query(hunting.Download).filter(hunting.Download.process_state == '1').limit(1) 46 | for download in downloads: 47 | # Download it 48 | log.debug('Downloading {0}'.format(download.md5)) 49 | rcode = call(["./vtmis.py", "-d", download.md5]) 50 | if rcode > 0: 51 | log.error('Error: MD5 {0} not downloaded with downloader script.'.format(download.md5)) 52 | 
download.process_state = '6' 53 | hunting.sess.commit() 54 | else: 55 | log.debug("File {0} downloaded successfully.".format(download.md5)) 56 | download.process_state = '2' 57 | hunting.sess.commit() 58 | 59 | 60 | def load_modules(): 61 | analysis_modules = [] 62 | for section in config: 63 | if "analysis_module_" in section: 64 | if not config.getboolean(section, "enabled"): 65 | continue 66 | 67 | module_name = config.get(section, "module") 68 | try: 69 | _module = importlib.import_module(module_name) 70 | except Exception as e: 71 | log.error("Unable to import module {0}: {1}".format(module_name, str(e))) 72 | continue 73 | 74 | class_name = config.get(section, "class") 75 | try: 76 | module_class = getattr(_module, class_name) 77 | except Exception as e: 78 | log.error("Unable to load module class {0}: {1}".format(module_class, str(e))) 79 | continue 80 | 81 | try: 82 | analysis_module = module_class(str(section)) 83 | except Exception as e: 84 | log.error("Unable to load analysis module {0}: {1}".format(section, str(e))) 85 | continue 86 | 87 | analysis_modules.append(analysis_module) 88 | 89 | return analysis_modules 90 | 91 | # Submit the sample for automated analysis 92 | # Import enabled modules. 93 | def run_analysis(analysis_modules=[]): 94 | to_analysis = hunting.sess.query(hunting.Download).filter(hunting.Download.process_state == '2').all() 95 | for download in to_analysis: 96 | log.debug('Submitting {0} for analysis'.format(download.md5)) 97 | rule_list = [] 98 | hits = hunting.sess.query(hunting.Hit).filter(hunting.Hit.md5 == download.md5).all() 99 | if len(hits) > 0: 100 | for hit in hits: 101 | rule_list.append(hit.rule) 102 | else: 103 | log.error('Error with MD5 {0} - No rules available in Hits database.'.format(download.md5)) 104 | download.process_state = '6' 105 | hunting.sess.commit() 106 | continue 107 | 108 | rtags = [] 109 | for rule in rule_list: 110 | rtags.extend(rule.split('_')) 111 | tags = set(rtags) 112 | 113 | # Format: File Location, rule list, 114 | for module in analysis_modules: 115 | module.analyze_sample(downloads_dir + download.md5, tags) 116 | 117 | # Change state to 'processing' 118 | download.process_state = '3' 119 | hunting.sess.commit() 120 | 121 | # Check analysis statuses 122 | def check_analysis(): 123 | check_analysis = hunting.sess.query(hunting.Download).filter(hunting.Download.process_state == '3').all() 124 | combined_status = True 125 | for download in check_analysis: 126 | rule_list = [] 127 | hits = hunting.sess.query(hunting.Hit).filter(hunting.Hit.md5 == download.md5).all() 128 | if len(hits) > 0: 129 | for hit in hits: 130 | rule_list.append(hit.rule) 131 | 132 | rtags = [] 133 | for rule in rule_list: 134 | rtags.extend(rule.split('_')) 135 | tags = set(rtags) 136 | 137 | for module in analysis_modules: 138 | combined_status = combined_status and module.check_status(downloads_dir + download.md5, tags) 139 | 140 | if combined_status: 141 | # Change state to 'completed' 142 | download.process_state = '4' 143 | hunting.sess.commit() 144 | for module in analysis_modules: 145 | module.cleanup(downloads_dir + download.md5) 146 | 147 | 148 | if __name__ == "__main__": 149 | parser = argparse.ArgumentParser() 150 | parser.add_argument("-v", "--version", action="version", version="You are running VT processor {0}".format(VT_VERSION)) 151 | args = parser.parse_args() 152 | 153 | running = True 154 | processor_init() 155 | analysis_modules = load_modules() 156 | 157 | while running: 158 | try: 159 | download_files() 160 | 
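            # Pipeline per pass: download_files() moves queued samples from state 1 to 2
            # (or 6 on a failed download), run_analysis() submits state-2 samples to each
            # enabled analysis module and marks them 3, and check_analysis() promotes them
            # to 4 and runs cleanup once every module reports the analysis as finished.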
run_analysis(analysis_modules) 161 | check_analysis() 162 | # Sleep for a bit 163 | time.sleep(5) 164 | except KeyboardInterrupt: 165 | log.info('Caught kill signal, shutting down.') 166 | running = False 167 | # TODO: Find a way to clean up running processes 168 | -------------------------------------------------------------------------------- /review_alerts.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | import curses 3 | import email 4 | import os 5 | import time 6 | import lib.hunting as hunting 7 | 8 | from lib.vtmis.scoring import * 9 | from lib.constants import VT_VERSION, VT_HOME 10 | 11 | from configparser import ConfigParser 12 | 13 | try: 14 | config = ConfigParser() 15 | config.read(os.path.join(VT_HOME, "etc", "vt.ini")) 16 | except ImportError: 17 | raise SystemExit('vt.ini was not found or was not accessible.') 18 | 19 | raw_msgs = config.get("locations", "raw_msgs") 20 | 21 | stdscr = None 22 | 23 | def display_normal(stdscr, dl): 24 | # Get the rule 'tags' 25 | hits = hunting.sess.query(hunting.Hit).filter(hunting.Hit.download == dl).all() 26 | rtags = [] 27 | ctags = [] 28 | file_type = "" 29 | first_country = "" 30 | for hit in hits: 31 | rtags.extend(hit.rule.split("_")) 32 | file_type = hit.file_type 33 | first_country = hit.first_country 34 | ctags.append(get_rule_campaign(hit.rule)) 35 | campaigns = set(ctags) 36 | rule_tags = set(rtags) 37 | 38 | # Display them 39 | stdscr.addstr(3,1,"Rule hits: {0}".format(",".join(rule_tags))) 40 | stdscr.addstr(4,1,"Score: {0}".format(dl.score)) 41 | stdscr.addstr(5,1,"Campaign Matches: {0}".format(" - ".join(campaigns))) 42 | stdscr.addstr(6,1,"File Type: {0}".format(file_type)) 43 | stdscr.addstr(7,1,"First Country: {0}".format(first_country)) 44 | 45 | def display_raw(stdscr, dl): 46 | # Display more information about the email 47 | # TODO: Allow for more than just the first raw email hit (allow cycling) 48 | first_hit = hunting.sess.query(hunting.Hit).filter(hunting.Hit.download == dl).first() 49 | # Figure out how many lines we have available to display this text 50 | lines_available = stdscr.getmaxyx()[0] - 8 51 | if lines_available < 0: 52 | return 53 | 54 | fin = open(raw_msgs + first_hit.raw_email_html, "r") 55 | text = fin.read().split('
<br>') 56 | fin.close() 57 | line_num = 1 58 | for line in text: 59 | line = line.replace("<br>
", "") 60 | if line_num > lines_available: 61 | continue 62 | # Start printing on line 3 (line_num + 2) 63 | stdscr.addstr(line_num + 2,2,line) 64 | line_num += 1 65 | 66 | 67 | def display_processing_message(stdscr, additional_msg): 68 | stdscr.addstr(3, 1, "Processing... {0}".format(additional_msg)) 69 | 70 | 71 | def display_message(stdscr, msg): 72 | lines_available = stdscr.getmaxyx()[0] - 8 73 | draw_line = int(lines_available / 2) 74 | stdscr.addstr(draw_line, 20, msg) 75 | 76 | 77 | def process_grab(command, current_dl): 78 | # Get the download for this current md5 79 | download = hunting.sess.query(hunting.Download).filter(hunting.Download.md5 == current_dl.md5).first() 80 | # Now, based on the tags for the current download, we must find all the md5s that have the same tags 81 | if download is not None: 82 | dl_tag_ids = [] 83 | for tag in download.tags: 84 | dl_tag_ids.append(tag.id) 85 | matched_downloads = hunting.sess.query(hunting.Download).filter(hunting.Download.process_state == 0, hunting.Download.tags.any(hunting.Tag.id.in_(dl_tag_ids))) 86 | mcount = 0 87 | for dl in matched_downloads: 88 | matched = True 89 | for mtag in dl.tags: 90 | if mtag.id not in dl_tag_ids: 91 | matched = False 92 | if matched: 93 | mcount += 1 94 | if command == 'd': 95 | process_download(dl) 96 | if command == 'n': 97 | process_nodownload(dl) 98 | return mcount 99 | 100 | def process_download(current_dl): 101 | # 1 = Download 102 | current_dl.process_state = 1 103 | hunting.sess.commit() 104 | 105 | 106 | def process_nodownload(current_dl): 107 | # 5 = Do Not Download 108 | current_dl.process_state = 5 109 | hunting.sess.commit() 110 | 111 | 112 | def main(): 113 | global stdscr 114 | stdscr = curses.initscr() 115 | curses.noecho() 116 | curses.cbreak() 117 | stdscr.keypad(1) 118 | 119 | curses.start_color() 120 | scrsize = stdscr.getmaxyx() 121 | 122 | # Init some fancy colors 123 | curses.init_pair(1, curses.COLOR_BLUE, curses.COLOR_BLACK) 124 | additional_msg_str = "" 125 | 126 | # Get our download objects 127 | dl_queue = hunting.sess.query(hunting.Download).filter(hunting.Download.process_state == 0).all() 128 | dl_iter = iter(dl_queue) 129 | if len(dl_queue) < 1: 130 | current_dl = None 131 | current_num = 0 132 | else: 133 | current_dl = next(dl_iter) 134 | current_num = 1 135 | max_num = len(dl_queue) 136 | 137 | # Various flags 138 | running = True 139 | toggle_raw = False 140 | toggle_grab = False 141 | processed_grab = False 142 | 143 | while running: 144 | stdscr.clear() 145 | stdscr.addstr(1,1,"VT HUNTER V{0}".format(VT_VERSION), curses.A_BOLD) 146 | 147 | # We processed a large grab, so we need to re query the DB to make it a bit more user friendly. 
148 |         if processed_grab:
149 |             dl_queue = hunting.sess.query(hunting.Download).filter(hunting.Download.process_state == 0).all()
150 |             dl_iter = iter(dl_queue)
151 |             current_dl = next(dl_iter)
152 |             current_num = 1
153 |             max_num = len(dl_queue)
154 |             processed_grab = False
155 | 
156 |         if additional_msg_str != "":
157 |             display_message(stdscr, additional_msg_str)
158 |             additional_msg_str = ""
159 | 
160 |         if current_dl is None:
161 |             stdscr.addstr(3,1,"No alerts are available for review!", curses.A_BOLD)
162 |             current_num = 0
163 |         else:
164 |             if toggle_raw:
165 |                 display_raw(stdscr, current_dl)
166 |             else:
167 |                 display_normal(stdscr, current_dl)
168 | 
169 |         # Display Help
170 |         stdscr.addstr(scrsize[0] - 4, 1, "COMMANDS", curses.color_pair(1))
171 |         if not toggle_grab:
172 |             stdscr.addstr(scrsize[0] - 3, 1, "q - quit r - raw email d - download", curses.color_pair(1))
173 |         else:
174 |             stdscr.addstr(scrsize[0] - 3, 1, "q - quit d - download", curses.color_pair(1))
175 |         if not toggle_grab:
176 |             stdscr.addstr(scrsize[0] - 2, 1, "s - skip n - do not download g - grab tags", curses.color_pair(1))
177 |         else:
178 |             stdscr.addstr(scrsize[0] - 2, 1, " n - do not download g - cancel grab", curses.color_pair(1))
179 | 
180 |         # Display the number of alerts left
181 |         stdscr.addstr(scrsize[0] - 6, 1, "{0} / {1} Alerts".format(current_num, len(dl_queue)), curses.color_pair(1))
182 | 
183 |         c = stdscr.getch()
184 |         # Toggle commands
185 |         commands = []
186 |         if c == ord('q'):
187 |             commands.extend('q')
188 |         if current_dl is not None:
189 |             if c == ord('s'):
190 |                 if not toggle_grab:
191 |                     commands.extend('s')
192 |             if c == ord('r'):
193 |                 if not toggle_grab:
194 |                     commands.extend('r')
195 |             if c == ord('d'):
196 |                 commands.extend('d')
197 |                 commands.extend('s')
198 |             if c == ord('n'):
199 |                 commands.extend('n')
200 |                 commands.extend('s')
201 |             if c == ord('g'):
202 |                 if toggle_grab:
203 |                     toggle_grab = False
204 |                 else:
205 |                     toggle_grab = True
206 | 
207 |         # Process commands
208 |         if 'q' in commands:
209 |             running = False
210 |             break
211 |         if 'd' in commands:
212 |             if toggle_grab:
213 |                 display_processing_message(stdscr, "Please wait. Downloading multiple files.")
214 |                 ret_count = process_grab('d', current_dl)
215 |                 processed_grab = True
216 |                 additional_msg_str = "Queued {0} samples for download.".format(ret_count)
217 |             else:
218 |                 current_num += 1
219 |                 process_download(current_dl)
220 |         if 'n' in commands:
221 |             if toggle_grab:
222 |                 display_processing_message(stdscr, "Please wait. Marking multiple files as Do Not Download.")
223 |                 ret_count = process_grab('n', current_dl)
224 |                 processed_grab = True
225 |                 additional_msg_str = "Marked {0} samples as Do Not Download.".format(ret_count)
226 |             else:
227 |                 current_num += 1
228 |                 process_nodownload(current_dl)
229 |         if 's' in commands:
230 |             toggle_raw = False
231 |             # TODO: Do we want to allow skipping a grabbed set of tags?
232 |             toggle_grab = False
233 |             try:
234 |                 current_dl = next(dl_iter)
235 |             except StopIteration:
236 |                 dl_queue = hunting.sess.query(hunting.Download).filter(hunting.Download.process_state == 0).all()
237 |                 if len(dl_queue) < 1:
238 |                     current_dl = None
239 |                     current_num = 0
240 |                 else:
241 |                     current_num += 1
242 |                     dl_iter = iter(dl_queue)
243 |                     current_dl = next(dl_iter)
244 |         if 'r' in commands:
245 |             if toggle_raw:
246 |                 toggle_raw = False
247 |             else:
248 |                 toggle_raw = True
249 | 
250 |     # Wrap it up and return the console to normal.
251 |     curses.nocbreak(); stdscr.keypad(0); curses.echo()
252 |     curses.endwin()
253 | 
254 | if __name__ == "__main__":
255 |     main()
256 | 
--------------------------------------------------------------------------------
/vtmis.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python3
2 | from __future__ import print_function
3 | import argparse
4 | import hashlib
5 | import sys
6 | import os
7 | import requests
8 | 
9 | from configparser import ConfigParser
10 | from lib.constants import VT_VERSION, VT_HOME
11 | 
12 | class vtAPI():
13 |     def __init__(self, config):
14 |         self.base = 'https://www.virustotal.com/vtapi/v2/'
15 |         self.config = config
16 | 
17 |     def downloadFile(self, vthash, dl_location):
18 |         try:
19 |             params = {'hash': vthash, 'apikey': self.config.get('vt', 'api_local')}
20 |             r = requests.get(self.base + 'file/download', params=params)
21 |             if r.status_code == 200:
22 |                 downloaded_file = r.content
23 |                 if len(downloaded_file) > 0:
24 |                     fout = open(dl_location + vthash, 'wb')
25 |                     fout.write(downloaded_file)
26 |                     fout.close()
27 |                 return 0
28 |             else:
29 |                 print('Received status code {0} and message {1}'.format(r.status_code, r.content))
30 |                 return 1
31 |         except Exception as e:
32 |             print("Exception: {0}".format(e))
33 |             return 1
34 | 
35 | def parse_arguments():
36 |     '''
37 |     Parse command line arguments
38 |     '''
39 |     opt = argparse.ArgumentParser(description='Search and Download from VirusTotal')
40 |     opt.add_argument('vthash', metavar='Hash', help='An MD5/SHA1/SHA256 Hash')
41 |     opt.add_argument('-d', '--download', action='store_true', help='Download File from VT')
42 |     return opt.parse_args()
43 | 
44 | def main():
45 |     config = ConfigParser()
46 |     # config.read() returns the list of files it parsed; it does not raise if vt.ini is missing.
47 |     config_files = config.read(os.path.join(VT_HOME, "etc", "vt.ini"))
48 |     if not config_files:
49 |         raise SystemExit('vt.ini was not found or was not accessible.')
50 | 
51 |     os.environ["http_proxy"] = config.get('proxy', 'http')
52 |     os.environ["https_proxy"] = config.get('proxy', 'https')
53 | 
54 |     options = parse_arguments()
55 |     vt = vtAPI(config)
56 |     if options.download:
57 |         retcode = vt.downloadFile(options.vthash, config.get('locations', 'downloads'))
58 |         if retcode > 0:
59 |             return retcode
60 |     return 0
61 | 
62 | if __name__ == '__main__':
63 |     retcode = main()
64 |     sys.exit(retcode)
65 | 
--------------------------------------------------------------------------------
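
For reference, once etc/vt.ini is in place, vtmis.py can also be invoked on its own to pull a single sample. A minimal sketch follows (the hash is only a placeholder; substitute any MD5/SHA1/SHA256, and note that downloads depend on the api_local key in the vt section and the downloads path in the locations section of vt.ini):

```shell
# Fetch one sample by hash into the configured downloads directory
./vtmis.py 44d88612fea8a8f36de82e1278abb02f --download
```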