├── README.md
├── addEdge.py
├── attack_pattern.py
├── generate_graph.py
├── initial.groovy
├── initial_properties.py
├── putmethod.py
├── signature_test.bro
├── subgraph_search.py
├── updateHost.bro
└── updateHost.zeek

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
Knowledge Graph + Network Security
====
# Knowledge-Graph-Analyze
## Attempt 1:
Store the preliminary results from Bro and Snort, such as network packets and low-level network events, in the knowledge graph. The knowledge graph then performs its analysis on top of this data.
Questions:
What kind of analysis, exactly, and what results should it produce?
The "multi-step attacks" the advisor mentioned?
Does the storage format of the data in the knowledge graph need to change to meet the requirements of "analysis"?
On low-level network events:
For basic network events, Bro generates many log files, most of them named after a protocol (their content is essentially the traffic related to that protocol). There are also special log files: notice.log, for example, can be customized (by adding notice types), so for now we treat what notice.log records as the so-called basic network events.
conn.log stores the logs of connections in the network; connection establishment is itself a kind of event. Does everything Bro organizes into log output fall under the category of events?
On network packets:
Network packets are network traffic in its rawest form, before any higher-layer analysis. In Packet Logger mode, Snort records exactly these packets.
On the analysis and reasoning capabilities of the knowledge graph:
See Chapter 8 of 《网络空间安全防御与态势感知》 (Cyberspace Security Defense and Situation Awareness): preliminary analysis and reasoning over network events requires an "ontology model", and the OWL model is mentioned there. So should our data also be processed and converted into OWL-model data to facilitate analysis and reasoning?
On knowledge graph storage:
We currently store the knowledge in a MySQL database. This traditional relational storage is a long way from the semantic storage a knowledge graph needs. Consider using D2RQ to convert the relational data into an RDF representation.

## Dataset selection
Consider DARPA's [LLS_DDOS](https://archive.ll.mit.edu/ideval/data/2000/LLS_DDOS_1.0.html), a DDoS attack dataset that divides the attack into five phases [1]:
(1) Pre-probing the network (IPSweep);
IPsweep of the AFB from a remote site
The adversary performs a scripted IPsweep of multiple class C subnets on the Air Force Base. The following networks are swept from address 1 to 254: 172.16.115.0/24, 172.16.114.0/24, 172.16.113.0/24, 172.16.112.0/24. The attacker sends ICMP echo-requests in this sweep and listens for ICMP echo-replies to determine which hosts are "up".
(2) Port scanning to determine host vulnerability information (PortScan);
Probe of live IPs to look for the sadmind daemon running on Solaris hosts
The hosts discovered in the previous phase are probed to determine which hosts are running the "sadmind" remote administration tool. This tells the attacker which hosts might be vulnerable to the exploit that he/she has. Each host is probed, by the script, using the "ping" option of the sadmind exploit program, as provided on the Internet by "Cheez Whiz". The ping option makes an RPC request to the host in question, asks what TCP port number to connect to for the sadmind service, and then connects to the supplied port number to test whether the daemon is listening.
(3) Gaining administrator privileges (FTPBufOverflow);
Breakins via the sadmind vulnerability, both successful and unsuccessful, on those hosts
The attacker then tries to break into the hosts found to be running the sadmind service in the previous phase. The attack script attempts the sadmind Remote-to-Root exploit several times against each host, each time with different parameters. Since this is a remote buffer-overflow attack, the exploit code cannot easily determine the appropriate stack pointer value as in a local buffer-overflow. Thus the adversary must try several different stack pointer values, each of which he/she has validated to work on some test machines. There are three stack pointer values attempted on each potential victim. With each attempt, the exploit tries to execute one command, as root, on the remote system. The attacker needs to execute two commands, however: one to "cat" an entry onto the victim's /etc/passwd file and one to "cat" an entry onto the victim's /etc/shadow file. The new root user's name is 'hacker2' and hacker2's home directory is set to be /tmp. Thus, there are 6 exploit attempts on each potential victim host.
To test whether or not a break-in was successful, the attack script attempts a login, via telnet, as hacker2, after each set of two break-in attempts. When successful, the attack script moves on to the next potential victim.
(4) Installing the mstream DDoS trojan software (UploadSoftware);
Installation of the trojan mstream DDoS software on three hosts at the AFB
Entering this phase, the attack script has built a list of those hosts on which it has successfully installed the 'hacker2' user. These are mill (172.16.115.20), pascal (172.16.112.50), and locke (172.16.112.10). For each host on this list, the script performs a telnet login, makes a directory on the victim called "/tmp/.mstream/" and uses rcp to copy the server-sol binary into the new directory. This is the mstream server software. The attacker also installs a ".rhosts" file for themselves in /tmp, so that they can rsh in to start up the binary programs. On the first victim on the list, the attacker also installs the "master-sol" software, which is the mstream master. After installing the software on each host, the attacker uses rsh to start up first the master, and then the servers. As they come up, each server "registers" with the master that it is alive. The master writes out a database of live servers to a file called "/tmp/.sr".
(5) Launching a DDoS attack against a remote server via the compromised hosts (DDOSAttack);
Launching the DDoS
In the final phase, the attacker manually launches the DDoS. This is performed via a telnet login to the victim on which the master is running, and then, from the victim, a "telnet" to port 6723 of the localhost. Port 6723/TCP is the port on which the master listens for connections to its user interface. After entering a password for the user interface, the attacker is given a prompt at which he/she enters two commands. The command "servers" causes the UI to list the mstream servers which have registered with it and are ready to attack. The command "mstream 131.84.1.31 5" causes a DDoS attack, of 5 second duration, against the given IP address to be launched by all three servers simultaneously. The mstream DDoS consists of many, many connection requests to a variety of ports on the victim. All packets have a spoofed, random source IP address. The attacker then logs out. The tiny duration was chosen so that it would be possible to easily distribute tcpdump and audit logs of these events -- to avoid them being too large. In real life, one might expect a DDoS of longer duration, several hours or more.
In the case of this scenario, however, it should be noted that the DDoS does not exactly succeed. The mstream DDoS software attempts to flood the victim with ACK packets that go to many random TCP ports on the victim host. The Air Force Base firewall, the Sidewinder firewall, is not configured to pass traffic on all these ports, so the only mstream packets that make it through the firewall are those on well-known ports. All other mstream packets result in a TCP reset being sent to the spoofed source address. Thus, in the DMZ dump file, one sees many resets apparently coming from "www.af.mil" going to the many spoofed source addresses. These are actually created by the firewall upon receipt of the TCP packets it is configured not to proxy!

[1] 胡倩 (Hu Qian). 基于多步攻击场景的攻击预测方法 (An attack prediction method based on multi-step attack scenarios) [J]. 计算机科学 (Computer Science), 2019, 46(S1): 365-369.
## Method exploration and literature review
Paper 1 [2]
A network security situation assessment model based on correlation analysis and HMM.
Situation element extraction, situation comprehension, and situation assessment together form a process in which basic static and dynamic information about the networked information system and its security mechanisms is progressively refined, through information fusion, into information a network administrator can understand and base decisions on. ("X is a process of Y")
In the situation comprehension step, this process clusters alert information via correlation analysis, and then uses a hidden Markov model to assess and predict the situation. (Correlation analysis => clustering? Worth understanding HMMs better.)
Situation elements: asset information, network attack alert information, and asset vulnerability information.
In situation comprehension, the extracted information of each type undergoes host-oriented alert clustering and attack-pattern-oriented correlation analysis.
In situation assessment, the paper applies an HMM-based method: the attack's threat level is the observation, and the situation is the hidden state to be assessed.
Idea to try: three steps. First, start from traffic and produce alerts through knowledge-graph analysis; second, following paper 1, run correlation analysis over those alerts; third, perform situation assessment on top of the correlation analysis. Note that all three steps must be combined with the KG, to demonstrate the value of the "KG-based" idea.
[2] 吴建台, 刘光杰, 刘伟伟, 戴跃伟 (Wu Jiantai, Liu Guangjie, Liu Weiwei, Dai Yuewei). 一种基于关联分析和HMM的网络安全态势评估方法 (A network security situation assessment method based on correlation analysis and HMM) [J]. 计算机与现代化 (Computer and Modernization), 2018(06): 30-36.

## Practical exploration
We use HugeGraph, the open-source graph database developed by Baidu. HugeGraph supports Gremlin and the property-graph model, and provides graph-computation APIs as well as HugeGraph Studio for visualizing graph data.
The graph content falls into two parts: first, network security knowledge (including the signature event graphs); second, the dynamic graph produced from real network traffic (the object of analysis).
Approach:
1. Graph construction. First, migrate the network security knowledge over (from the original MySQL database; vertex properties and relations need to be reconsidered); next, work on the Bro scripts and Bro's built-in logs to flesh out the dynamic graph; finally, build the signature event graphs (with the graph-analysis algorithms below in mind).
2. Graph analysis algorithm: discovering attack events in the dynamic graph (matching attack events) is the first hard problem. Two considerations: first, in real scenarios the object of analysis changes dynamically, and dynamic graph matching algorithms are expensive; second, the matching here is not exact matching but pattern matching, and it must consider not only topology but also vertex/edge properties (the graph's content). Completing this step accomplishes the task of behavior-based attack event discovery (without yet chaining the events together); a sketch of such a pattern-matching query appears after the file list below.
3. Attack event correlation analysis: discovering attack events alone is not enough; we need the relations among these events in order to uncover multi-step attacks. HMMs are a candidate here, but more work is needed. A foreseeable and important difficulty: how to tie the HMM method to the KG.
## What each file does
addEdge.py: probably unused; it was for testing the "add edge" functionality.
attack_pattern.py: imports the custom attack signatures (from attack_pattern_event.log) into the "signature graph".
generate_graph.py: generates the network event graph.
initial_properties.py: defines the vertex and edge properties, plus the labels, that the graph needs.
initial.groovy: unused.
putmethod.py: unused.
signature_test.bro: unused.
subgraph_search.py: performs the attack event search; correlation is not done yet (too few attack events defined!).
updateHost.bro: for the old Bro version, unused.
updateHost.zeek: generates the log files needed to build the graph.
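To make approach step 2 concrete, here is a minimal sketch (assuming a local hugegraph-server on port 8080, as in the scripts below) of running one of the pattern-matching Gremlin queries documented later in subgraph_search.py through the REST `/gremlin` endpoint; the request shape mirrors `execute_Gremlin` in that file:

```python
# -*- coding: UTF-8 -*-
# Minimal sketch: POST a Gremlin pattern match to hugegraph-server's /gremlin
# endpoint. The query is one of the examples from subgraph_search.py: vertex
# pairs (a, c) where a pings b and c makes an RPC call to b.
import json
import requests

query = """hugegraph.traversal().V().match(
    __.as('a').out('icmp_echo_ping').as('b'),
    __.as('b').in('rpc_call').as('c')).
  where('a', neq('c')).
  select('a', 'c').by('ip')"""

request_body = {
    "gremlin": query,
    "bindings": {},
    "language": "gremlin-groovy",
    "aliases": {}
}
r = requests.post("http://localhost:8080/gremlin", json=request_body)
print json.loads(r.text)["result"]["data"]
```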
--------------------------------------------------------------------------------
/addEdge.py:
--------------------------------------------------------------------------------
# -*- coding: UTF-8 -*-
import requests, json

def client_post_formurlencodeddata_requests(request_url, requestJSONdata):
    # Purpose: send a request in form-urlencoded format (name and value joined by '=',
    # name/value pairs joined by '&', e.g. parameter1=12345&parameter2=23456) to a remote
    # server and fetch the response. The request header must be:
    # {"Content-Type": "application/x-www-form-urlencoded; charset=UTF-8"}.
    # Input: the URL receiving the request; the payload, in name1=value1&name2=value2 form.
    # Output: the response body.

    requestJSONdata = str(requestJSONdata).replace("+", "%2B")
    requestdata = requestJSONdata.encode("utf-8")
    head = {"Content-Type": "application/x-www-form-urlencoded; charset=UTF-8", 'Connection': 'close'}

    print 'Request payload (client --> server):\n', requestdata

    # send the request to the server
    r = requests.post(request_url, data=requestdata, headers=head)

    # read the server's response
    responsedata = r.text
    print 'Response payload (client <-- server): ', responsedata
    print "get the status: ", r.status_code

    # return the response body
    return responsedata


# text = ""
# with open("test.json", "r") as f:
#     text = f.read()
# print text

# request headers
HEADERS = {'Content-Type': 'application/x-www-form-urlencoded;charset=utf-8', 'Key': '332213fa4a9d4288b5668ddd9'}

url = "http://localhost:8080/graphs/hugegraph/graph/edges"
# data_f = open('test.json', 'r')
# data = json.load(data_f)
data = {
    "label": "icmp_echo_ping",
    "outV": "1:202.77.162.213",  # vertex id = <vertex-label id>:<primary key>; 1 is the entity label
    "inV": "1:172.16.115.1",
    "outVLabel": "entity",
    "inVLabel": "entity",
    "properties": {
        "edge_type": "basic_event",
        "ts": "952440696.029745",
        "time": "2000-03-07-22:51:36",
        "frequency": 1,
        "src_ip": "202.77.162.213",
        "src_p": "8",
        "dst_ip": "172.16.115.1",
        "dst_p": "0"
    }
}
print type(data)
data = json.dumps(data)

# the server expects JSON here, so the Content-Type header must say so
r = requests.post(url, data, headers={"Content-Type": "application/json"})
print r.text

requestbody = """'{
    "label": "entity",
    "properties": {
        "ip": "202.77.162.213",
        "ips": "2000-03-07-22:51:36|202.77.162.213",
        "ts": "952440722.006110",
        "vertex_type": "asset",
        "status": "unknown"
    }
}'"""
t1 = "202.77.162.213"
t2 = "952440722.006110"
t3 = "2000-03-07-22:51:36|202.77.162.213"
# note the argument order: ips takes t3 and ts takes t2 (they were swapped before)
requestbody = """ '{
    "label": "entity",
    "properties": {
        "ip": "%s",
        "ips": "%s",
        "ts": "%s",
        "vertex_type": "asset",
        "status": "unknown"
    }
}' """ % (t1, t3, t2)

cmd = """curl -X POST -H "Content-Type:application/json" http://localhost:8080/graphs/hugegraph/graph/vertices -d"""
cmd = cmd + requestbody
print cmd
# client_post_formurlencodeddata_requests(url, data)
# content = requests.post(url=url, headers=HEADERS, data=data).text
# content = json.loads(content)
# print content

--------------------------------------------------------------------------------
/attack_pattern.py:
--------------------------------------------------------------------------------
# -*- coding: UTF-8 -*-
# author: wangyixuan
# Generates the network attack signature event graph. Attack patterns should be
# generated automatically and be able to grow automatically.
# Consider using different vertex labels for different attack patterns; how should
# the vertex/edge properties of an attack pattern be set?
import os
import subprocess

# Creation order
# 1. property types
# 2. attack-pattern vertex types
# 3. attack-pattern edge types
# 4. attack-pattern data (customized)

property_keys = ["event_label", "pattern_node_id", "pattern_edge_id"]
edges = []

def extractText(text):
    res = text.split(' ')
    return res

def addNode(v, name):
    requestbody = """'{
        "label": "%s",
        "properties": {
            "pattern_node_id": %s
        }
    }'""" % (name, v)
    # pattern_node_id is actually an integer
    cmd = """curl -X POST -H "Content-Type:application/json" http://localhost:8080/graphs/hugegraph/graph/vertices -d"""
    cmd = cmd + requestbody
    print cmd
    sub = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE)
    str1 = sub.stdout.read()
    sub.communicate()
    str1 = str1.decode()
    print str1
    return

def addEdge(v1, v2, event_label, edge_num, event_seq):
    # we have to know both vertex ids, which is unfortunate
    edge_name = "attack_event_" + event_seq
    node_name = "attack_pattern_" + event_seq
    n = int(event_seq) + 4  # label id 1 is entity; attack patterns start at 2. Somewhat fragile:
                            # the numeric prefix before the colon in the vertex id needs handling.
    requestbody = """'{
        "label": "%s",
        "outV": "%d:%s",
        "inV": "%d:%s",
        "outVLabel": "%s",
        "inVLabel": "%s",
        "properties": {
            "pattern_edge_id": %s,
            "event_label": "%s"
        }
    }'""" % (edge_name, n, v1, n, v2, node_name, node_name, edge_num, event_label)
    cmd = """curl -X POST -H "Content-Type:application/json" http://localhost:8080/graphs/hugegraph/graph/edges -d"""
    cmd = cmd + requestbody
    print cmd
    sub = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE)
    str1 = sub.stdout.read()
    sub.communicate()
    str1 = str1.decode()
    print str1
    return

def execute_command(cmd):
    sub = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE)
    str1 = sub.stdout.read()
    sub.communicate()
    str1 = str1.decode()
    # print str1
    return str1

# Attack-signature subgraph discovery: use Gremlin's subgraph step to do a first
# round of filtering.
# Filtering method (based on edge labels):
# the edge labels can be taken from the edge properties of the attack signature subgraph.
# Every vertex in the subgraph Gremlin returns may be a participant or be affected.
# Then look at each vertex's out-degree in the subgraph: higher out-degree, more suspicious.

if __name__ == '__main__':
    # for key in property_keys:
    #     requestbody = """'{
    #         "name": "%s",
    #         "data_type": "TEXT",
    #         "cardinality": "SINGLE"
    #     }'""" % key
    #     cmd = """curl -X POST -H "Content-Type:application/json" http://localhost:8080/graphs/hugegraph/schema/propertykeys -d"""
    #     cmd += requestbody
    #     print cmd

    # attack-pattern vertex label
    # requestbody = """'{
    #     "name": "attack_pattern_0",
    #     "id_strategy": "DEFAULT",
    #     "properties": ["pattern_node_id"],
    #     "primary_keys": ["pattern_node_id"],
    #     "nullable_keys": [],
    #     "enable_label_index": true
    # }'"""
    # cmd = """curl -X POST -H "Content-Type:application/json" http://localhost:8080/graphs/hugegraph/schema/vertexlabels -d"""
    # cmd += requestbody
    # print cmd

    # attack-pattern edge labels
    # No way around it: the source and target vertex labels must be spelled out,
    # so the number of labels grows explosively.
    max_sequence = 3
    i = 0
    while i <= max_sequence:
        requestbody = """'{
            "name": "attack_pattern_%d",
            "id_strategy": "DEFAULT",
            "properties": ["pattern_node_id"],
            "primary_keys": ["pattern_node_id"],
            "nullable_keys": [],
            "enable_label_index": true
        }'""" % (i)
        cmd = """curl -X POST -H "Content-Type:application/json" http://localhost:8080/graphs/hugegraph/schema/vertexlabels -d"""
        cmd += requestbody
        print execute_command(cmd)
        requestbody = """'{
            "name": "attack_event_%d",
            "source_label": "attack_pattern_%d",
            "target_label": "attack_pattern_%d",
"attack_pattern_%d", 132 | "frequency": "SINGLE", 133 | "properties": [ 134 | "pattern_edge_id", 135 | "event_label" 136 | ], 137 | "sort_keys": [], 138 | "nullable_keys": [], 139 | "enable_label_index": true 140 | }'"""%(i, i, i) 141 | cmd = """curl -X POST -H "Content-Type:application/json" http://localhost:8080/graphs/hugegraph/schema/edgelabels -d""" 142 | cmd += requestbody 143 | print execute_command(cmd) 144 | i += 1 145 | cmd = "awk '/^[^#]/ {print $1, $2, $3, $4}' attack_pattern_event.log" 146 | r = os.popen(cmd) 147 | text = r.readline() 148 | while text != "": 149 | text = text.strip('\n') 150 | res = extractText(text) 151 | if len(res) != 0 and len(res) != 1: 152 | # 边似乎不用边变量 153 | edges.append(res) 154 | text = r.readline() 155 | print edges 156 | for item in edges: 157 | # 0: 节点标签, 1: 边序号, 2: 边事件描述, 3: 连接关系 158 | res = item[3].split('>') 159 | print res 160 | # 先建点,再建边 161 | v1 = res[0] 162 | v2 = res[1] 163 | addNode(v1, item[0]) 164 | addNode(v2, item[0]) 165 | event_label = item[2] 166 | edge_num = item[1] 167 | addEdge(v1, v2, event_label, edge_num, item[0][-1]) 168 | 169 | # gremline_for_pattern_0 = """subGraph = g.E().hasLabel('icmp_echo_ping').subgraph('subGraph').cap('subGraph').next() 170 | # sg = subGraph.traversal() 171 | # sg.E()""" 172 | # 按边标签过滤,抽取子图 173 | 174 | -------------------------------------------------------------------------------- /generate_graph.py: -------------------------------------------------------------------------------- 1 | # -*- coding: UTF-8 -*- 2 | # author: wangyixuan 3 | # 与hugegraph-server交互,根据LLS_DDOS_1.0内容创建网络事件图谱 4 | # 要求: 5 | # 1. 属性图结构 6 | # 2. 采用Gremlin图计算语言 7 | # 3. 自动化处理 8 | # import subprocess 9 | # cmd = "cat host-summary.log | bro-cut" 10 | # sub = subprocess.Popen(cmd, shell = True, stdout = subprocess.PIPE) 11 | # str1 = sub.stdout.read() 12 | # sub.communicate() 13 | # print str1 14 | # for i in iter(sub.stdout.readline): 15 | # print i 16 | 17 | # 动态构建图谱需要考虑的问题 18 | # 按照顶点id来取顶点的方法不好 19 | # 不同类型的顶点或者边有着不同的属性,request_data需要重新考量 20 | 21 | # 去除写gremlin文件的内容,仅保留gremlin_script0 22 | 23 | import os 24 | import subprocess 25 | def extractText(text): 26 | res = text.split(' ') 27 | return res 28 | n = 0 29 | vertexs = [] 30 | events = [] 31 | propertyKeys_txt = ["ip", "ts", "vertex_type", "edge_type", "time", "ips", "status", "src_ip", "src_p", "dst_ip", "dst_p", "description"] 32 | propertyKeys_int = ["frequency"] 33 | vertexTypes = ["entity"] 34 | edgeTypes = ["icmp_echo_ping", "icmp_echo_reply", "icmp_unreachable", "rpc_reply", "rpc_call", "portmap", 35 | "new_connection_contents", "connection_SYN_packet", "tcp_packet", "connection_established", 36 | "connection_first_ack", "connection_eof", "connection_finished", "connection_pending", "login_output_line", 37 | "login_input_line", "login_confused", "login_confused_text", "login_success", "rsh_request", 38 | "rsh_reply", "connection_attempt", "login_terminal", "connection_half_finished", "login_display", 39 | "http_event", "http_stats", "http_end_entity", "http_message_done", "heep_content_type", 40 | "http_all_headers", "http_reply", "http_header", "http_begin_entity", "http_entity_data"] 41 | 42 | edge_type_value = [] 43 | 44 | if __name__ == '__main__': 45 | n = 1 46 | for i in edgeTypes: 47 | dict_item = {} 48 | dict_item[i] = n 49 | n += 1 50 | edge_type_value.append(dict_item) 51 | print edge_type_value 52 | # cmd = "cat host-summary.log | bro-cut" 53 | with open("gremlin_scripts_0", "w") as f: 54 | for item in propertyKeys_txt: 55 | line = 
"""graph.schema().propertyKey("{key}").asText().ifNotExist().create()\n""".format(key = item) 56 | f.write(line) 57 | for item in propertyKeys_int: 58 | line = """graph.schema().propertyKey("{key}").asInt().ifNotExist().create()\n""".format(key = item) 59 | f.write(line) 60 | for item in vertexTypes:# 不同类型顶点应拥有不同的属性,待完善 61 | line = """entity = graph.schema().vertexLabel("{label}").properties("ip", "ips", "ts", "vertex_type", "status").primaryKeys("ip").ifNotExist().create()\n""".format(label = item) 62 | f.write(line) 63 | for item in edgeTypes:# 不同类型边应拥有不同的属性,待完善 64 | line = """{edge_variable} = graph.schema().edgeLabel("{edge_variable}").sourceLabel("entity").targetLabel("entity").properties("edge_type", "ts", "time", "frequency", "src_ip", "src_p", "dst_ip", "dst_p", "description").ifNotExist().create()\n""".format(edge_variable = item) 65 | f.write(line) 66 | f.close() 67 | cmd = "awk '/^[^#]/ {print $1, $2, $3}' host-summary.log" 68 | r = os.popen(cmd) 69 | text = r.readline() 70 | print text 71 | while text != "": 72 | # print type(text) 73 | text = text.strip('\n') 74 | res = extractText(text) 75 | if len(res) != 0 and len(res) != 1: 76 | res.append("Vertex" + str(n)) 77 | print res 78 | vertexs.append(res) 79 | text = r.readline() 80 | n += 1 81 | for item in vertexs: 82 | # 0: ts, 1: ip, 2: ips, 3: variable name 83 | requestbody = """'{ 84 | "label": "entity", 85 | "properties": { 86 | "ip": "%s", 87 | "ips": "%s", 88 | "ts": "%s", 89 | "vertex_type": "asset", 90 | "status": "unknown" 91 | } 92 | }'"""%(item[1], item[2], item[0]) 93 | cmd = """curl -X POST -H "Content-Type:application/json" http://localhost:8080/graphs/hugegraph/graph/vertices -d""" 94 | cmd = cmd + requestbody 95 | print cmd 96 | sub = subprocess.Popen(cmd, shell = True, stdout = subprocess.PIPE) 97 | str1 = sub.stdout.read() 98 | sub.communicate() 99 | str1 = str1.decode() 100 | print str1 101 | print len(vertexs) 102 | print n 103 | cmd = "awk '/^[^#]/ {print $1, $2, $3, $4, $5, $6, $7, $8}' network_events.log" 104 | r = os.popen(cmd) 105 | text = r.readline() 106 | while text != "": 107 | text = text.strip('\n') 108 | res = extractText(text) 109 | if len(res) != 0 and len(res) != 1: 110 | # 边似乎不用边变量 111 | events.append(res) 112 | text = r.readline() 113 | # v_src,v_dst为顶点变量名,用于连线 114 | v_src = "" 115 | v_dst = "" 116 | for item in events: 117 | # 先确定是哪两个点相连 118 | # 0: ts, 1: real_time, 2: event_type, 3: src_ip, 4: src_p, 5: dst_ip, 6: dst_p, 7: description 119 | src_ip = item[3] 120 | dst_ip = item[5] 121 | # print src_ip 122 | # print dst_ip 123 | for v in vertexs: 124 | if v[1] == src_ip: 125 | # print "!!!" 126 | v_src = v[3] 127 | continue 128 | if v[1] == dst_ip: 129 | # print "!!!!" 
        if v_src != "" and v_dst != "":
            # Map the HOST_INFO::* event name to its edge label. This replaces a long
            # if/elif chain; the "connection-established" hyphen bug and the
            # "heep_content_type" typo from the original are fixed here.
            event_to_label = {
                "HOST_INFO::ICMP_ECHO_REQUEST": "icmp_echo_ping",
                "HOST_INFO::ICMP_ECHO_REPLY": "icmp_echo_reply",
                "HOST_INFO::ICMP_UNREACHABLE": "icmp_unreachable",
                "HOST_INFO::RPC_REPLY": "rpc_reply",
                "HOST_INFO::RPC_CALL": "rpc_call",
                "HOST_INFO::PORTMAP": "portmap",
                "HOST_INFO::NEW_CONNECTION_CONTENTS": "new_connection_contents",
                "HOST_INFO::CONNECTION_SYN_PACKET": "connection_SYN_packet",
                "HOST_INFO::TCP_PACKET": "tcp_packet",
                "HOST_INFO::CONNECTION_ESTABLISHED": "connection_established",
                "HOST_INFO::CONNECTION_FIRST_ACK": "connection_first_ack",
                "HOST_INFO::CONNECTION_EOF": "connection_eof",
                "HOST_INFO::CONNECTION_FINISHED": "connection_finished",
                "HOST_INFO::CONNECTION_PENDING": "connection_pending",
                "HOST_INFO::LOGIN_OUTPUT_LINE": "login_output_line",
                "HOST_INFO::LOGIN_INPUT_LINE": "login_input_line",
                "HOST_INFO::LOGIN_CONFUSED": "login_confused",
                "HOST_INFO::LOGIN_CONFUSED_TEXT": "login_confused_text",
                "HOST_INFO::LOGIN_SUCCESS": "login_success",
                "HOST_INFO::RSH_REQUEST": "rsh_request",
                "HOST_INFO::RSH_REPLY": "rsh_reply",
                "HOST_INFO::CONNECTION_ATTEMPT": "connection_attempt",
                "HOST_INFO::LOGIN_TERMINAL": "login_terminal",
                "HOST_INFO::CONNECTION_HALF_FINISHED": "connection_half_finished",
"connection_half_finished" 194 | elif item[2] == "HOST_INFO::LOGIN_DISPLAY": 195 | edge_label = "login_display" 196 | elif item[2] == "HOST_INFO::HTTP_EVENT": 197 | edge_label = "http_event" 198 | elif item[2] == "HOST_INFO::HTTP_STATS": 199 | edge_label = "http_stats" 200 | elif item[2] == "HOST_INFO::HTTP_END_ENTITY": 201 | edge_label = "http_end_entity" 202 | elif item[2] == "HOST_INFO::HTTP_MESSAGE_DONE": 203 | edge_label = "http_message_done" 204 | elif item[2] == "HOST_INFO::HTTP_CONTENT_TYPE": 205 | edge_label = "http_content_type" 206 | elif item[2] == "HOST_INFO::HTTP_ALL_HEADERS": 207 | edge_label = "http_all_headers" 208 | elif item[2] == "HOST_INFO::HTTP_REPLY": 209 | edge_label = "http_reply" 210 | elif item[2] == "HOST_INFO::HTTP_HEADER": 211 | edge_label = "http_header" 212 | elif item[2] == "HOST_INFO::HTTP_BEGIN_ENTITY": 213 | edge_label = "http_begin_entity" 214 | elif item[2] == "HOST_INFO::HTTP_ENTITY_DATA": 215 | edge_label = "http_entity_data" 216 | else: 217 | print item[2] 218 | t_src_ip = item[3] 219 | t_dst_ip = item[5] 220 | t_ts = item[0] 221 | t_time = item[1] 222 | t_src_p = item[4] 223 | t_dst_p = item[6] 224 | t_description = item[7] 225 | # 先确认是否已经有相同事件记录,若有,frequency递增,若没有,加入新边 226 | # 1. 按预计的边id查询边,GET方法,自己拼凑边id ok, 参照putmethod.py中,使用request方法 227 | # 2. 查看结果,有边,则frequency+1,更新边属性,PUT方法 228 | # 3. 无边,插入新边,已经写好 229 | requestbody = """'{ 230 | "label": "%s", 231 | "outV": "1:%s", 232 | "inV": "1:%s", 233 | "outVLabel": "entity", 234 | "inVLabel": "entity", 235 | "properties": { 236 | "edge_type": "basic_event", 237 | "ts": "%s", 238 | "time": "%s", 239 | "frequency": 1, 240 | "src_ip": "%s", 241 | "src_p": "%s", 242 | "dst_ip": "%s", 243 | "dst_p": "%s", 244 | "description": "%s" 245 | } 246 | }'"""%(edge_label, t_src_ip, t_dst_ip, t_ts, t_time, t_src_ip, t_src_p, t_dst_ip, t_dst_p, t_description) 247 | cmd = """curl -X POST -H "Content-Type:application/json" http://localhost:8080/graphs/hugegraph/graph/edges -d""" 248 | cmd = cmd + requestbody 249 | print cmd 250 | sub = subprocess.Popen(cmd, shell = True, stdout = subprocess.PIPE) 251 | str1 = sub.stdout.read() 252 | sub.communicate() 253 | str1 = str1.decode() 254 | print str1 255 | else: 256 | print "没有找到合适的两个点" 257 | # f.close() 258 | # print vertexs 259 | r.close() 260 | -------------------------------------------------------------------------------- /initial.groovy: -------------------------------------------------------------------------------- 1 | // example 2 | graph.schema().propertyKey("ip").asText().ifNotExist().create() 3 | graph.schema().propertyKey("frequency").asInt().ifNotExist().create() 4 | graph.schema().propertyKey("vertex_type").asText().ifNotExist().create() 5 | graph.schema().propertyKey("edge_type").asText().ifNotExist().create() 6 | graph.schema().propertyKey("time").asText().ifNotExist().create() 7 | 8 | host = graph.schema().vertexLabel("host").properties("ip", "time", "vertex_type").primaryKeys("ip").ifNotExist().create() 9 | server = graph.schema().vertexLabel("server").properties("ip", "time", "vertex_type").primaryKeys("ip").ifNotExist().create() 10 | 11 | ping = graph.schema().edgeLabel("ping").sourceLabel("host").targetLabel("server").properties("time", "edge_type").ifNotExist().create() 12 | reply = graph.schema().edgeLabel("reply").sourceLabel("server").targetLabel("host").properties("time", "edge_type").ifNotExist().create() 13 | 14 | thinkpad = graph.addVertex(T.label, "host", "ip", "192.168.1.157", "time", "2019-11-07-11:25:16", "vertex_type", "PC") 15 | Galaxy_server 
--------------------------------------------------------------------------------
/initial.groovy:
--------------------------------------------------------------------------------
// example
graph.schema().propertyKey("ip").asText().ifNotExist().create()
graph.schema().propertyKey("frequency").asInt().ifNotExist().create()
graph.schema().propertyKey("vertex_type").asText().ifNotExist().create()
graph.schema().propertyKey("edge_type").asText().ifNotExist().create()
graph.schema().propertyKey("time").asText().ifNotExist().create()

host = graph.schema().vertexLabel("host").properties("ip", "time", "vertex_type").primaryKeys("ip").ifNotExist().create()
server = graph.schema().vertexLabel("server").properties("ip", "time", "vertex_type").primaryKeys("ip").ifNotExist().create()

ping = graph.schema().edgeLabel("ping").sourceLabel("host").targetLabel("server").properties("time", "edge_type").ifNotExist().create()
reply = graph.schema().edgeLabel("reply").sourceLabel("server").targetLabel("host").properties("time", "edge_type").ifNotExist().create()

thinkpad = graph.addVertex(T.label, "host", "ip", "192.168.1.157", "time", "2019-11-07-11:25:16", "vertex_type", "PC")
Galaxy_server = graph.addVertex(T.label, "server", "ip", "202.55.12.142", "time", "2019-11-07-11:28:35", "vertex_type", "WorkStation")
thinkpad.addEdge("ping", Galaxy_server, "edge_type", "ping_echo_request", "time", "2019-11-07-11:32:21")

--------------------------------------------------------------------------------
/initial_properties.py:
--------------------------------------------------------------------------------
# -*- coding: UTF-8 -*-
import os
import subprocess

def extractText(text):
    res = text.split(' ')
    return res

n = 0
vertexs = []
events = []
propertyKeys_txt = ["ip", "ts", "vertex_type", "edge_type", "time", "ips", "status", "src_ip", "src_p", "dst_ip", "dst_p", "description", "event_label"]
propertyKeys_int = ["frequency", "pattern_node_id", "pattern_edge_id"]
vertexTypes = ["entity"]
edgeTypes = ["icmp_echo_ping", "icmp_echo_reply", "icmp_unreachable", "rpc_reply", "rpc_call", "portmap",
             "new_connection_contents", "connection_SYN_packet", "tcp_packet", "connection_established",
             "connection_first_ack", "connection_eof", "connection_finished", "connection_pending", "login_output_line",
             "login_input_line", "login_confused", "login_confused_text", "login_success", "rsh_request",
             "rsh_reply", "connection_attempt", "login_terminal", "connection_half_finished", "login_display",
             "http_event", "http_stats", "http_end_entity", "http_message_done", "http_content_type",
             "http_all_headers", "http_reply", "http_header", "http_begin_entity", "http_entity_data"]
# ("heep_content_type" corrected to "http_content_type" here as well)

edge_type_value = []

if __name__ == '__main__':
    # cmd = "cat host-summary.log | bro-cut"
    with open("gremlin_scripts_0", "w") as f:
        for item in propertyKeys_txt:
            line = """graph.schema().propertyKey("{key}").asText().ifNotExist().create()\n""".format(key=item)
            f.write(line)
        for item in propertyKeys_int:
            line = """graph.schema().propertyKey("{key}").asInt().ifNotExist().create()\n""".format(key=item)
            f.write(line)
        for item in vertexTypes:  # different vertex types should have different properties; to be improved
            line = """entity = graph.schema().vertexLabel("{label}").properties("ip", "ips", "ts", "vertex_type", "status").primaryKeys("ip").ifNotExist().create()\n""".format(label=item)
            f.write(line)
        for item in edgeTypes:  # different edge types should have different properties; to be improved
            line = """{edge_variable} = graph.schema().edgeLabel("{edge_variable}").sourceLabel("entity").targetLabel("entity").properties("edge_type", "ts", "time", "frequency", "src_ip", "src_p", "dst_ip", "dst_p", "description").ifNotExist().create()\n""".format(edge_variable=item)
            f.write(line)
    f.close()
    # cmd = """curl -X POST -H "Content-Type:application/json" http://localhost:8080/graphs/hugegraph/schema/indexlabels -d"""
    # requestbody = """'{
    #     "name": "entityByip",
    #     "base_type": "VERTEX_LABEL",
    #     "base_value": "entity",
    #     "index_type": "SECONDARY",
    #     "fields": ["vertex_type"]
    # }'"""
    # cmd = cmd + requestbody
    n = 1
    for i in edgeTypes:
        dict_item = {}
        dict_item[i] = n
        n += 1
        edge_type_value.append(dict_item)
    print edge_type_value

    # cmd = """GET http://127.0.0.1:8080/graphs/hugegraph/graph/edges?vertex_id="1:202.77.162.213"&direction=OUT&label=icmp_echo_ping&properties={}"""

    # cmd = """curl http://localhost:8080/graphs/hugegraph/graph/edges/'S1:202.77.162.213>1>>S1:172.16.113.95'"""

    # cmd = """curl -X PUT http://localhost:8080/graphs/hugegraph/graph/edges/'S1:202.77.162.213>1>>S1:172.16.112.94'?action=append -d """
    # requestbody = """'{"properties":{"frequency": 2}}'"""
    # cmd = cmd + requestbody
    # print cmd
    # sub = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE)
    # str1 = sub.stdout.read()
    # sub.communicate()
    # str1 = str1.decode()
    # print str1


    # requestbody = """'{
    #     "name": "entityByip",
    #     "base_type": "VERTEX_LABEL",
    #     "base_value": "entity",
    #     "index_type": "SECONDARY",
    #     "fields": ["ip"]
    # }'"""
    # cmd = """curl -X POST -H "Content-Type:application/json" http://localhost:8080/graphs/hugegraph/schema/indexlabels -d"""


    requestbody = """'{"properties":{"frequency": 2}}'"""
    # note the '>' characters in the edge id are percent-encoded as %3E
    cmd = """curl -X PUT -H "Content-Type:application/json" http://localhost:8080/graphs/hugegraph/graph/edges/"S1:202.77.162.213%3E1%3E%3ES1:172.16.113.95"?action=append -d"""
    cmd = cmd + requestbody
    print cmd

--------------------------------------------------------------------------------
/putmethod.py:
--------------------------------------------------------------------------------
#!/usr/bin/env python
# -*- coding: utf-8 -*-


# ok
# import json
# import requests
# query edge data by edge id
# url = "http://localhost:8080/graphs/hugegraph/graph/edges/S1:202.77.162.213>1>>S1:172.16.113.95"
# response = requests.get(url)
# print response.status_code
# print type(response.text)
# text = response.text
# str1 = text.encode('gbk')
# print type(str1)
# jdata = json.loads(str1)
# print jdata["properties"]["frequency"]
# print type(jdata)
# ok

# import urllib2
# import json

# def putMesParent():
#     try:
#         # store the parameters as key/value pairs
#         value = {"properties": {"frequency": 2}}
#         # JSON-encode them
#         jdata = json.dumps(value, indent=4)
#         print jdata
#         # talk to the server with a PUT request
#         url = """http://localhost:8080/graphs/hugegraph/graph/edges/'S1:202.77.162.213>1>>S1:172.16.113.95'?action=append"""
#         request = urllib2.Request(url, jdata)
#         # this line matters: PUT must use this header
#         request.add_header("Content-Type", "application/json; charset=utf-8")
#         # force the request method to PUT
#         request.get_method = lambda: "PUT"
#         # get the result
#         result = urllib2.urlopen(request)
#         # return it
#         return result
#     except Exception, e:
#         print Exception, ":", e

# # call the method and print the result
# print putMesParent()

import urllib2
import json

def http_put():
    url = """http://localhost:8080/graphs/hugegraph/graph/edges/\"S1:202.77.162.213>1>>S1:172.16.113.95\"?action=append"""
    values = {"properties": {"frequency": 2}}

    jdata = json.dumps(values)  # JSON-encode the payload
    request = urllib2.Request(url, jdata)
    request.add_header('Content-Type', 'application/json')
    request.get_method = lambda: 'PUT'  # set the HTTP method
    request = urllib2.urlopen(request)
    return request.read()

resp = http_put()
print resp
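The cryptic edge ids above follow one pattern throughout these scripts. A small helper makes it explicit (a sketch based on the ids seen in putmethod.py, initial_properties.py, and subgraph_search.py; the helper name is ours):

```python
# Sketch: HugeGraph edge ids in these scripts take the form
#   S<outVertexId> > <edgeLabelId> >> S<inVertexId>      (no spaces)
# e.g. S1:202.77.162.213>1>>S1:172.16.113.95. In a URL, '>' must be
# percent-encoded as %3E, exactly as initial_properties.py does.
import urllib

def build_edge_id(out_v, edgelabel_id, in_v):
    return "S%s>%s>>S%s" % (out_v, edgelabel_id, in_v)

edge_id = build_edge_id("1:202.77.162.213", 1, "1:172.16.113.95")
print urllib.quote(edge_id, safe=':')  # S1:202.77.162.213%3E1%3E%3ES1:172.16.113.95
```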
--------------------------------------------------------------------------------
/signature_test.bro:
--------------------------------------------------------------------------------
@load /usr/local/bro/share/bro/base/bif/plugins/Bro_RPC.events.bif.bro
@load /home/lw/myKGA/updateHost.bro
# How should duplicate file inclusion be handled? (@load should load each script at most once.)

global n = 0;
global m = 0;
global p_num = 0;
global k = 0;

# basic packets
# A raw packet header, consisting of L2 header and everything in pkt_hdr.
event raw_packet(p: raw_pkt_hdr){
    # print "raw_packet!";
    # print p;
    # if(p?$l2){
    #     print p$l2;
    # } else {
    #     print "no l2";
    # }
    # if(p?$ip){
    #     print p$ip;
    # } else {
    #     print "no ip field";
    # }
    # if(p?$ip6){
    #     print p$ip6;
    # } else {
    #     print "no ip6 field";
    # }
    # if(p?$tcp){
    #     print p$tcp;
    # } else {
    #     print "no tcp field";
    # }
    # if(p?$udp){
    #     print p$udp;
    # } else {
    #     print "no udp field";
    # }
    # if(p?$icmp){
    #     print p$icmp;
    # } else {
    #     print "no icmp field";
    # }
    p_num += 1;
}

event packet_contents(c: connection, contents: string){
    print "packet_contents!";
    # print c$id$resp_p;
    if(c$id$resp_p == 111/udp){
        print "portmapper protocol";
    } else {
        print c$id$resp_p;
    }
    # print contents;
    # p_num -= 1;
}

# phase-1-dump
event icmp_echo_request(c: connection, icmp: icmp_conn, id: count, seq: count, payload: string){
    # print "icmp_echo_request!";
    # print icmp;
    n += 1;
}

event icmp_echo_reply(c: connection, icmp: icmp_conn, id: count, seq: count, payload: string){
    print "icmp_echo_reply!";
    # print icmp;
    m += 1;
}

event icmp_time_exceeded(c: connection, icmp: icmp_conn, code: count, context: icmp_context){
    print "icmp_time_exceeded!";
    k += 1;
}

event icmp_error_message(c: connection, icmp: icmp_conn, code: count, context: icmp_context){
    print "icmp_error_message!";
}

event icmp_neighbor_advertisement(c: connection, icmp: icmp_conn, router: bool, solicited: bool,
    override: bool, tgt: addr, options: icmp6_nd_options){
    print "icmp_neighbor_advertisement!";
}

event icmp_neighbor_solicitation(c: connection, icmp: icmp_conn, tgt: addr, options: icmp6_nd_options){
    print "icmp_neighbor_solicitation!";
}

event icmp_packet_too_big(c: connection, icmp: icmp_conn, code: count, context: icmp_context){
    print "icmp_packet_too_big!";
}

event icmp_parameter_problem(c: connection, icmp: icmp_conn, code: count, context: icmp_context){
    print "icmp_parameter_problem!";
}

event icmp_redirect(c: connection, icmp: icmp_conn, tgt: addr, dest: addr, options: icmp6_nd_options){
    print "icmp_redirect!";
}

event icmp_router_advertisement(c: connection, icmp: icmp_conn, cur_hop_limit: count, managed: bool,
    other: bool, home_agent: bool, pref: count, proxy: bool, res: count, router_lifetime: interval,
    reachable_time: interval, retrans_timer: interval, options: icmp6_nd_options){
    print "icmp_router_advertisement!";
}

event icmp_router_solicitation(c: connection, icmp: icmp_conn, options: icmp6_nd_options){
    print "icmp_router_solicitation!";
}

event icmp_sent(c: connection, icmp: icmp_conn){
    print "icmp_sent!";
}

event icmp_sent_payload(c: connection, icmp: icmp_conn, payload: string){
    print "icmp_sent_payload!";
}

event icmp_unreachable(c: connection, icmp: icmp_conn, code: count, context: icmp_context){
    print "icmp_unreachable!";
}
# phase-2-dump
# pm related
event mount_proc_mnt(c: connection, info: MOUNT3::info_t, req: MOUNT3::dirmntargs_t, rep: MOUNT3::mnt_reply_t){
    print "mount_proc_mnt!";
}

event mount_proc_not_implemented(c: connection, info: MOUNT3::info_t, proc: MOUNT3::proc_t){
    print "mount_proc_not_implemented!";
}

event mount_proc_null(c: connection, info: MOUNT3::info_t){
    print "mount_proc_null!";
}

event mount_proc_umnt(c: connection, info: MOUNT3::info_t, req: MOUNT3::dirmntargs_t){
    print "mount_proc_umnt!";
}

event mount_proc_umnt_all(c: connection, info: MOUNT3::info_t, req: MOUNT3::dirmntargs_t){
    print "mount_proc_umnt_all!";
}

event mount_reply_status(n: connection, info: MOUNT3::info_t){
    print "mount_reply_status!";
}

event nfs_proc_create(c: connection, info: NFS3::info_t, req: NFS3::diropargs_t, rep: NFS3::newobj_reply_t){
    print "nfs_proc_create!";
}

event nfs_proc_getaddr(c: connection, info: NFS3::info_t, fh: string, attrs: NFS3::fattr_t){
    print "nfs_proc_getaddr!";
}

event nfs_proc_link(c: connection, info: NFS3::info_t, req: NFS3::linkargs_t, rep: NFS3::link_reply_t){
    print "nfs_proc_link!";
}

event nfs_proc_lookup(c: connection, info: NFS3::info_t, req: NFS3::diropargs_t, rep: NFS3::lookup_reply_t){
    print "nfs_proc_lookup!";
}

event nfs_proc_mkdir(c: connection, info: NFS3::info_t, req: NFS3::diropargs_t, rep: NFS3::newobj_reply_t){
    print "nfs_proc_mkdir!";
}

event nfs_proc_not_implemented(c: connection, info: NFS3::info_t, proc: NFS3::proc_t){
    print "nfs_proc_not_implemented!";
}

event nfs_proc_null(c: connection, info: NFS3::info_t){
    print "nfs_proc_null!";
}

event nfs_proc_read(c: connection, info: NFS3::info_t, req: NFS3::readargs_t, rep: NFS3::read_reply_t){
    print "nfs_proc_read!";
}

event nfs_proc_readdir(c: connection, info: NFS3::info_t, req: NFS3::readdirargs_t, rep: NFS3::readdir_reply_t){
    print "nfs_proc_readdir!";
}

event nfs_proc_readlink(c: connection, info: NFS3::info_t, fh: string, rep: NFS3::readlink_reply_t){
    print "nfs_proc_readlink!";
}

event nfs_proc_remove(c: connection, info: NFS3::info_t, req: NFS3::diropargs_t, rep: NFS3::delobj_reply_t){
    print "nfs_proc_remove!";
}

event nfs_proc_rename(c: connection, info: NFS3::info_t, req: NFS3::renameopargs_t, rep: NFS3::renameobj_reply_t){
    print "nfs_proc_rename!";
}

event nfs_proc_rmdir(c: connection, info: NFS3::info_t, req: NFS3::diropargs_t, rep: NFS3::delobj_reply_t){
    print "nfs_proc_rmdir!";
}

event nfs_proc_sattr(c: connection, info: NFS3::info_t, req: NFS3::sattrargs_t, rep: NFS3::sattr_reply_t){
    print "nfs_proc_sattr!";
}

event nfs_proc_symlink(c: connection, info: NFS3::info_t, req: NFS3::symlinkargs_t, rep: NFS3::newobj_reply_t){
    print "nfs_proc_symlink!";
}

event nfs_proc_write(c: connection, info: NFS3::info_t, req: NFS3::writeargs_t, rep: NFS3::write_reply_t){
    print "nfs_proc_write!";
}

event nfs_reply_status(n: connection, info: NFS3::info_t){
    print "nfs_reply_status!";
}

#-- the above are the NFS call events --

event pm_attempt_getport(r: connection, status: rpc_status, pr: pm_port_request){
    print "pm_attempt_getport!";
}
"pm_attempt_getport!"; 222 | } 223 | 224 | event pm_attempt_dump(r: connection, status: rpc_status){ 225 | print "pm_attempt_dump!"; 226 | } 227 | 228 | event pm_attempt_callit(r: connection, status: rpc_status, call: pm_callit_request){ 229 | print "pm_attempt_callit!"; 230 | } 231 | 232 | event pm_attempt_null(r: connection, status: rpc_status){ 233 | print "pm_attempt_null!"; 234 | } 235 | 236 | event pm_attempt_set(r: connection, status: rpc_status, m: pm_mapping){ 237 | print "pm_attempt_set!"; 238 | } 239 | 240 | event pm_attempt_unset(r: connection, status: rpc_status, m: pm_mapping){ 241 | print "pm_attempt_unset!"; 242 | } 243 | 244 | event pm_bad_port(r: connection, bad_p: count){ 245 | print "pm_bad_port!"; 246 | } 247 | 248 | event pm_request_callit(r: connection, call: pm_callit_request, p: port){ 249 | print "pm_request_callit!"; 250 | } 251 | 252 | event pm_request_dump(r: connection, m: pm_mappings){ 253 | print "pm_request_dump!"; 254 | } 255 | 256 | event pm_request_getport(r: connection, pr: pm_port_request, p: port){ 257 | print "pm_request_getport!"; 258 | } 259 | 260 | event pm_request_null(r: connection){ 261 | print "pm_request_null!"; 262 | } 263 | 264 | event pm_request_set(r: connection, m: pm_mapping, success: bool){ 265 | print "pm_request_set!"; 266 | } 267 | 268 | event pm_request_unset(r: connection, m: pm_mapping, success: bool){ 269 | print "pm_request_unset!"; 270 | } 271 | 272 | event rpc_call(C: connection, xid: count, prog: count, ver: count, proc: count, call_len: count){ 273 | print "rpc_call!"; 274 | } 275 | 276 | event rpc_dialogue(c: connection, prog: count, ver: count, proc: count, status: rpc_status, start_time: time, call_len: count, reply_len: count){ 277 | print "rpc_dialogue!"; 278 | } 279 | 280 | event rpc_reply(c: connection, xid: count, status: rpc_status, reply_len: count){ 281 | print "rpc_reply!"; 282 | } 283 | # 上面是关于pm和rpc的,可惜一个都没有触发 284 | # 考虑包内容中有resp_p=111/udp,其中111是portmapper的端口号得知此包与portmapper相关 285 | # 如何通过bro得知rpc调用了sadmind守护进程? 286 | 287 | event udp_contents(u: connection, is_orig: bool, contents: string){ 288 | print "udp_contents!"; 289 | print u; 290 | } 291 | # 测试一下udp事件 292 | 293 | # phase-3-dump 294 | 295 | # phase-4-dump 296 | 297 | # phase-5-dump 298 | 299 | 300 | event bro_init(){ 301 | print "Let's start!"; 302 | } 303 | 304 | event bro_done(){ 305 | print "Over."; 306 | print n; 307 | print m; 308 | print k; 309 | print p_num; 310 | } -------------------------------------------------------------------------------- /subgraph_search.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # -*- coding: utf-8 -*- 3 | import os 4 | import subprocess 5 | import json 6 | import requests 7 | import urllib 8 | from decimal import Decimal 9 | from decimal import getcontext 10 | import numpy as np 11 | # 测试Restful API执行gremlin 12 | # 垃圾,点和边要单独查? 
# replace the g in HugeGraph Studio scripts with hugegraph.traversal()
# gremlin_script = """hugegraph.traversal().V().hasLabel("attack_pattern_0")"""
# gremlin_script = """hugegraph.traversal().E().hasLabel("attack_event_0")"""
# gremlin_script = """hugegraph.traversal().V().group().by(label)"""
# gremlin_script = """hugegraph.traversal().V().hasLabel('attack_pattern_0').out().has('pattern_node_id',within(1))"""
# gremlin_for_pattern_0 = """'subGraph = g.E().hasLabel('icmp_echo_ping').subgraph('subGraph').cap('subGraph').next()
# sg = subGraph.traversal()
# sg.E()'"""
# url = "http://127.0.0.1:8080/gremlin?gremlin="
# url = url + gremlin_for_pattern_0
# response = requests.get(url)
# print response.status_code
# print response.text

# RESTful API examples
# URL-encoding rules for special characters in URLs: https://www.w3cschool.cn/htmltags/html-urlencode.html

# http://localhost:8080/graphs/hugegraph/traversers/kout?source=%222:0%22&max_depth=1
# the set of vertices reachable from vertex "2:0" in one step; keep increasing max_depth --
# when the result comes back empty, max_depth = MAX_DISTANCE - 1
# no need for shortest paths; the built-in shortest path cannot be used anyway

# http://localhost:8080/graphs/hugegraph/traversers/rings?source=%222:0%22&max_depth=2&direction=OUT
# finds cycles within at most n steps of the source vertex: HugeGraph's Rings API

# http://localhost:8080/graphs/hugegraph/traversers/rays?source=%222:0%22&max_depth=2&direction=OUT
# finds paths radiating out to boundary vertices, given a source vertex, a direction,
# optional edge types, and a maximum depth: HugeGraph's Rays API

# Gremlin examples
# g.V().as('a').out('icmp_echo_ping').as('b').select('a','b')
# info on the vertex pairs connected by an icmp_echo_ping edge

# g.V().as('a').out('icmp_echo_ping').as('b').select('a','b').by('ip')
# the ip values of the vertex pairs connected by an icmp_echo_ping edge

# g.V().where(out('icmp_echo_ping')).values('ip')
# ip of vertices that have an outgoing icmp_echo_ping edge

# g.V().where(out('attack_event_0').count().is(gte(6))).values('pattern_node_id')
# vertices with at least 6 outgoing attack_event_0 edges (their pattern_node_id values)

# g.V().where(__.not(out('icmp_echo_ping'))).where(__.in('icmp_echo_reply')).values('ip')
# vertices with no outgoing icmp_echo_ping edge but with an incoming icmp_echo_reply edge (their ip)

# g.V().where(out('rpc_call').where(out('rpc_reply'))).values('ip')
# vertices with an outgoing rpc_call edge followed by a second-hop outgoing rpc_reply edge (their ip)

# g.V('2:0').outE('attack_event_0').values('event_label')
# event types of the attack-event edges leaving the suspicious vertex (duplicates possible);
# more elaborate: record the counts too?
# g.E().hasLabel('attack_event_0').values('event_label')
# all edge event types under attack pattern 0

# g.V('1:202.77.162.213').out().out().path()
# paths of two consecutive out-edges starting from vertex 202.77.162.213

# g.V().and(outE('icmp_echo_ping'), values('ip').is('202.77.162.213')).values('ts')
# ts of the vertices that have an outgoing icmp_echo_ping edge and whose ip is 202.77.162.213

# g.V().as('a').out('icmp_echo_ping').as('b').select('a','b')
# all (a, b) pairs with an icmp_echo_ping edge a->b

# g.V().group().by(bothE().count())
# groups all vertices in the graph by degree; useful for taking the top-k vertices (highest degree first)

# g.V().match(__.as('a').in('icmp_echo_ping').has('ip', '202.77.162.213').as('b'))
# Gremlin pattern matching: each satisfying assignment yields a map; everything else is filtered out
# pattern 1: 'a' is the current vertex and has an incoming icmp_echo_ping edge
# pattern 2: 'b' is the vertex "202.77.162.213"
# effect: all vertex pairs at distance 1 starting from b

# g.V().match(
#     __.as('a').out('icmp_echo_ping').as('b'),
#     __.as('b').in('rpc_call').as('c')).
#   where('a', neq('c')).
#   select('a','c').by('ip')
# icmp_echo_ping from some vertex a to b, where b has an incoming rpc_call edge from c;
# the where step keeps a and c distinct
# g.V('1:202.77.162.213').match(
#     __.as('a').out('icmp_echo_ping').as('b'),
#     __.as('b').in('icmp_echo_reply').as('c')).
#   where('a', eq('c')).
#   select('a','b')
# also usable in this scenario: fix the start vertex id; a equal to c indicates a cycle

# g.V('2:0').bothE().otherV().simplePath().path()
# g.V('2:0').both().both().cyclicPath().path()

# subGraph = g.E().or(__.hasLabel('icmp_echo_ping'),
#                     __.hasLabel('icmp_echo_reply'))
#             .subgraph('subGraph').cap('subGraph').next()
# sg = subGraph.traversal()
# sg.E()
# a subgraph built from edges of several types: it contains both icmp_echo_ping
# and icmp_echo_reply edges

# g.V("2:0").repeat(out()).times(2).path()
# from a vertex, follow out() twice

# Gremlin functionality still needed:
# in graph sg, select all edges satisfying a given edge-property condition

# This is awkward to run through the RESTful API; use hugegraph-tools' gremlin-execute command instead.
# gremlin-execute: synchronous execution
#   --script: running an inline script does not seem to work
#   --file: runs the script in a file; the statements must not depend on each other too much,
#           or execution becomes very slow
# gremlin-schedule: asynchronous execution

# Gremlin execution flow:
# 1. compose the Gremlin statements for the task at hand
# 2. write them into a script file
# 3. assemble the command line around the script file
# 4. run the file with hugegraph-tools
# 5. parse the returned result

# Subgraph matching steps:
# 1. Extract signature information from the attack pattern graph (patterns are numbered 0,1,2,3,...).
#    Use a Gremlin script file or the RESTful API to fetch the set of edges labeled attack_event_n.
#    Signature information includes: the event types on node 0's out-edges, the maximum distance
#    from node 0 to the other vertices, and cycle information (how should it be represented?).
# 2. Extract a subgraph by edge event type, obtaining an edge set with irrelevant edges removed
#    (data filter 1). Gremlin provides the subgraph step.
#    Every event type is indispensable; the types searched here are those directly attached to
#    node 0 (the relatively important ones). For 0->1->2, the 1->2 events can wait until the final
#    pattern-matching step (if the former are absent, the latter have nothing to attach to).
# 3. Analyze the edge set and pre-filter by TIME WINDOW, dropping edges that are too old
#    (data filter 2). Local analysis of the JSON-formatted edge data.
# 4. Extract the vertex set from the edge set, giving the filtered subgraph (how to store and
#    compute on it?). Local analysis: pull the vertex data out of the JSON edge data.
#    Difficulty: where does the filtered subgraph live? Keeping it locally makes graph computation
#    awkward; tidy it up and write it back into the graph under some new schema?
#    Actually the filtered subgraph cannot be stored again -- it has to exist as a graph variable
#    inside the Gremlin script, so the three steps above must be done in a single pass.
# 5. In that subgraph, compute each vertex's degree and sort the vertices from high to low
#    (the higher the degree, the more suspicious). See Gremlin's degree-related computation interfaces.
# 6. Map the suspicious-vertex sequence onto node 0 of the attack pattern, then match within a
#    bounded range (1 hop, 2 hops, ... from each suspicious vertex).
hugegraph_bin_path = "/home/lw/hugegraph-tools-1.3.0/bin/"
project_path = "/home/lw/myKGA/"
gremline_file_name = "gremlin_scripts"
tool_command = "gremlin-execute"

nodes_involved = []
global attack_counter  # note: a global statement at module scope is a no-op


def execute_command(cmd):
    sub = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE)
    str1 = sub.stdout.read()
    sub.communicate()
    str1 = str1.decode()
    # print str1
    return str1

def extract_max_distance(source_node_id):
    print "Finding the maximum distance reachable from the attack node..."
    max_distance = 1
    while(True):
        url = "http://localhost:8080/graphs/hugegraph/traversers/kout"
        pms = {
            "max_depth": max_distance
        }
        pms["source"] = "\"%s\"" % source_node_id
        url_encoded = urllib.urlencode(pms)
        # cmd = "curl " + url
        # print cmd
        # print url_encoded
        r = requests.get(url, params=pms)
        # print r.url
        # print r.status_code
        # print type(r.content)
        tmp_dict = json.loads(r.content)
        end_while = False
        for key in tmp_dict:
            if key == "vertices" and tmp_dict[key] == []:
                max_distance -= 1
                end_while = True
        if end_while:
            break
        max_distance += 1
    return max_distance
def extract_key_events(pattern_num):
    print "Extracting the key events..."
    tmp_events = set()
    pattern = "attack_event_" + str(pattern_num)
    with open("gremlin_scripts", "w") as f:
        line = """g.V().outE('{attack_event_num}').values('event_label')\n""".format(attack_event_num=pattern)
        f.write(line)
    f.close()
    cmd = hugegraph_bin_path + "hugegraph " + tool_command + " --file " + project_path + gremline_file_name
    result = execute_command(cmd)
    # print result
    lines = result.split('\n')
    # print lines
    for item in lines:
        if item != "" and item != "Run gremlin script":  # "Run gremlin script" is hugegraph's own output
            tmp_events.add(item)
    return tmp_events

def extract_attack_pattern_edgelabel_id(PATTERN_NUM):
    url = "http://localhost:8080/graphs/hugegraph/schema/edgelabels/"
    edgelabel = "attack_event_" + str(PATTERN_NUM)
    url = url + edgelabel
    r = requests.get(url)
    tmp_dict = json.loads(r.content)
    return tmp_dict["id"]

def extract_edgelabelid_by_edgelabel(edgelabel):
    url = "http://localhost:8080/graphs/hugegraph/schema/edgelabels/"
    url = url + edgelabel
    r = requests.get(url)
    tmp_dict = json.loads(r.content)
    return tmp_dict["id"]

def extract_edge_by_edgeid(edge_id):
    url = "http://localhost:8080/graphs/hugegraph/graph/edges/"
    url = url + edge_id
    r = requests.get(url)
    tmp_dict = json.loads(r.content)
    return tmp_dict

def extract_attack_event_by_edgeid(edge_id):
    url = "http://localhost:8080/graphs/hugegraph/graph/edges/"
    url = url + edge_id
    r = requests.get(url)
    tmp_dict = json.loads(r.content)
    return tmp_dict["properties"]["event_label"]

def extract_event_chain_paths(source_node_id, MAX_DISTANCE, PATTERN_NUM):
    print "Extracting attack event chains (acyclic)..."
    tmp_paths = []
    # attack event chain structure:
    # first use the Rays API to find all possible paths to the boundary vertices, then turn each
    # path into an event chain. Since this is the signature subgraph, the cost stays small.
    # event(n1),event(n2),...,event(n3)
    url = "http://localhost:8080/graphs/hugegraph/traversers/rays"
    pms = {
        "max_depth": MAX_DISTANCE,
        "direction": "OUT"
    }
    pms["source"] = "\"%s\"" % source_node_id
    # url_encoded = urllib.urlencode(pms)
    r = requests.get(url, params=pms)
    tmp_dict = json.loads(r.content)
    print tmp_dict
    print tmp_dict["rays"]
    # Work out the edge ids along each path: an edge id is determined by the two vertex ids plus
    # the edge label id, so look the label id up first and then assemble the edge id.
    edgelabel_id = extract_attack_pattern_edgelabel_id(PATTERN_NUM)
    for item in tmp_dict["rays"]:
        i = 0
        # print item["objects"][2]
        attack_event_sequence_arr = []
        attack_event_sequence = ""
        while i + 1 < len(item["objects"]):
            tmp_src = item["objects"][i]
            tmp_dst = item["objects"][i+1]
            # tmp_src, tmp_dst form one path segment
            edge_id = "S" + tmp_src + ">" + str(edgelabel_id) + ">>S" + tmp_dst
            print edge_id
            attack_event = extract_attack_event_by_edgeid(edge_id)
            attack_event_sequence_arr.append(attack_event)
            i += 1
        for e in attack_event_sequence_arr:
            attack_event_sequence = attack_event_sequence + e + ">"
        if attack_event_sequence not in tmp_paths:
            tmp_paths.append(attack_event_sequence)
    return tmp_paths
def extract_event_chain_cyclicpaths(source_node_id, PATTERN_NUM):
    print "Extracting attack event chains (cycles)..."
    # first determine the maximum depth for the cycle search
    url = "http://localhost:8080/gremlin"
    pms = {}
    tmp_paths = []
    edgelabel = "attack_event_" + str(PATTERN_NUM)
    pms["gremlin"] = "hugegraph.traversal().E().hasLabel(\"%s\").count()" % edgelabel
    # url_encoded = urllib.urlencode(pms)
    r = requests.get(url, params=pms)
    tmp_dict = json.loads(r.content)
    depth = tmp_dict["result"]["data"][0]  # conservatively, use the number of edges
    pms.clear()
    # use the Rings API to find the cycles
    url = "http://localhost:8080/graphs/hugegraph/traversers/rings"
    pms["source"] = "\"%s\"" % source_node_id
    pms["max_depth"] = depth
    pms["direction"] = "OUT"
    # url_encoded = urllib.urlencode(pms)
    r = requests.get(url, params=pms)
    tmp_dict = json.loads(r.content)
    edgelabel_id = extract_attack_pattern_edgelabel_id(PATTERN_NUM)
    for item in tmp_dict["rings"]:  # mirrors the acyclic case above
        i = 0
        # print item["objects"][2]
        attack_event_sequence_arr = []
        attack_event_sequence = ""
        while i + 1 < len(item["objects"]):
            tmp_src = item["objects"][i]
            tmp_dst = item["objects"][i+1]
            # tmp_src, tmp_dst form one path segment
            edge_id = "S" + tmp_src + ">" + str(edgelabel_id) + ">>S" + tmp_dst
            print edge_id
            attack_event = extract_attack_event_by_edgeid(edge_id)
            attack_event_sequence_arr.append(attack_event)
            i += 1
        for e in attack_event_sequence_arr:
            attack_event_sequence = attack_event_sequence + e + ">"
        if attack_event_sequence not in tmp_paths:
            tmp_paths.append(attack_event_sequence)
    return tmp_paths

def extract_suspicious_nodes_from_datagraph(KEY_EVENTS, K):
    print "Searching for the suspicious-vertex sequence..."
    request_body = {
        "bindings": {},
        "language": "gremlin-groovy",
        "aliases": {}
    }
    orSequence = ""
    for i in KEY_EVENTS:
        orSequence = orSequence + "__.hasLabel(\"%s\")" % i + ","
    orSequence = orSequence[:-1]  # strip the trailing comma
    # print orSequence
    line1 = "subGraph = hugegraph.traversal().E().or(" + orSequence + ").subgraph(\"subGraph\").cap(\"subGraph\").next()\n"
    line2 = "sg = subGraph.traversal()\n"
    line3 = "sg.V().group().by(bothE().count())\n"
    request_body["gremlin"] = line1 + line2 + line3
    # print request_body
    url = "http://localhost:8080/gremlin"
    # pms_encoded = urllib.urlencode(request_body)
    r = requests.post(url, json=request_body)
    # print r.content
    tmp_dict = json.loads(r.content)
    for i in tmp_dict["result"]["data"]:
        key_list = i.keys()
        # print key_list
        degree = []
        candidates = []
        for i in key_list:
            degree.append(int(i))
        degree.sort(reverse=True)
        # print degree
        j = 0
        # odd response format; it has to be parsed by hand
        if K > len(tmp_dict["result"]["data"][0]):  # K must not exceed the data size
            K = len(tmp_dict["result"]["data"][0])
        while j < K:
            for item in tmp_dict["result"]["data"][0][str(degree[j])]:  # the vertices sharing this degree
                candidates.append(item["id"])
            j += 1
    return candidates

def execute_Gremlin(script):
    request_body = {
        "bindings": {},
        "language": "gremlin-groovy",
        "aliases": {}
    }
    request_body["gremlin"] = script
    url = "http://localhost:8080/gremlin"
    r = requests.post(url, json=request_body)
    tmp_dict = json.loads(r.content)
    return tmp_dict

def search_attack_event(SYMBOL_LIST, EVENT_SEQUENCE, V, IsCylic):  # matches one attack pattern against one vertex at a time
    # Could the timestamp property on the edges be fetched here as well? That is, once a1, a2, ...
    # are obtained, pull the timestamps off the edges along the way.
    # Q: given the vertex id sequence and the edge label, how to get the edge properties on the path?
358 | def execute_Gremlin(script):
359 |     request_body = {
360 |         "bindings": {},
361 |         "language": "gremlin-groovy",
362 |         "aliases": {}
363 |     }
364 |     request_body["gremlin"] = script
365 |     url = "http://localhost:8080/gremlin"
366 |     r = requests.post(url, json=request_body)
367 |     tmp_dict = json.loads(r.content)
368 |     return tmp_dict
369 | 
370 | def search_attack_event(SYMBOL_LIST, EVENT_SEQUENCE, V, IsCyclic):# match one attack pattern against one suspicious vertex at a time
371 |     # Could the edge timestamps be fetched here as well? I.e., once a1, a2, ... are bound, pull the timestamps off the edges in the same pass.
372 |     # Q: given the vertex-id sequence and the edge label, how do we get the edge properties along the path?
373 |     print "Matching attack patterns from the suspicious vertex..."
374 |     print V
375 |     # search_result = True
376 |     v = V
377 |     start_subsentence = "hugegraph.traversal().V('" + v + "').match(\n"
378 |     match_subsentence = ""
379 |     end_subsentence = ""
380 |     where_subsentence = "" # needed for cyclic matching
381 |     single_subsentences = []
382 |     i = 0 # i walks the symbol list
383 |     for item in EVENT_SEQUENCE:
384 |         if item != "" and i < len(SYMBOL_LIST)-1:
385 |             # cyclic chains need no special casing here: every hop is an OUT edge,
386 |             # and the cycle constraint is added below via the where() clause
387 |             single_subsentence = "__.as('" + SYMBOL_LIST[i] + "').out('" + item + "').as('" + SYMBOL_LIST[i+1] + "'),"
388 |             single_subsentences.append(single_subsentence)
389 |             i += 1
390 |         else:
391 |             break
392 | 
393 |     for s in single_subsentences:
394 |         match_subsentence += s
395 |     match_subsentence = match_subsentence[:-1]
396 |     end_subsentence = ").select("
397 |     for symbol in SYMBOL_LIST:
398 |         end_subsentence = end_subsentence + "'" + symbol + "',"
399 |     end_subsentence = end_subsentence[:-1]
400 |     end_subsentence += ").by('ip')"
401 |     if not IsCyclic:
402 |         query_sentence = start_subsentence + match_subsentence + end_subsentence
403 |     else:
404 |         where_subsentence = ").where('" + SYMBOL_LIST[0] + "', eq('" + SYMBOL_LIST[len(SYMBOL_LIST)-1] + "')"
405 |         query_sentence = start_subsentence + match_subsentence + where_subsentence + end_subsentence
406 |     tmp_dict = execute_Gremlin(query_sentence)
407 |     print query_sentence
408 |     if len(tmp_dict["result"]["data"]):
409 |         # TODO: fetch the edge timestamps here too and return them alongside tmp_dict["result"]["data"]; downstream handling would need only small changes
410 |         print "Success!"
411 |     else:
412 |         print "Failure!"
413 |     return tmp_dict["result"]["data"]
414 | 
415 | 
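# For reference, the Gremlin that search_attack_event assembles for a toy
# two-event chain (the vertex id and event labels are invented; aliases a1..a3
# are the generated symbol list):
#
#     hugegraph.traversal().V('1:10.0.0.1').match(
#     __.as('a1').out('icmp_echo_request').as('a2'),__.as('a2').out('rpc_call').as('a3')
#     ).select('a1','a2','a3').by('ip')
#
# In the cyclic case a where() clause is spliced in before select(), e.g.
#     ).where('a1', eq('a3')).select('a1','a2','a3').by('ip')
# which ties the last alias back to the starting vertex.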
416 | def extract_attack_event_by_event_chain(EVENT_CHAIN_PATHS, EVENT_CHAIN_CYCLICPATHS, SUSPICIOUS_NODES, PATTERN_NUM):
417 |     print "Starting attack event matching..."
418 |     Malicious_nodes = [] # every attacking node that fits this attack pattern
419 |     for V in SUSPICIOUS_NODES: # starting from one suspicious node we may match several times: (one suspicious node) -> (several victim nodes); this could be written out to a file, or kept in a global variable
420 |         result_dict = {}
421 |         start_time = Decimal()
422 |         end_time = Decimal()
423 |         IsMalicious = True
424 |         victim_nodes = set()
425 |         ports_involved = set()
426 |         print "Matching acyclic attack sequences..."
427 |         for event in EVENT_CHAIN_PATHS:
428 |             event_sequence = event.split('>')
429 |             # event_sequence holds len(event_sequence)-1 real events; the trailing empty string from split is skipped
430 |             symbol_list = [] # the symbol table needs one more symbol than there are events (the vertex aliases)
431 |             i = 1
432 |             while i <= len(event_sequence):
433 |                 symbol_list.append("a" + str(i))
434 |                 i += 1
435 |             res = search_attack_event(symbol_list, event_sequence, V, False)
436 |             if not len(res):
437 |                 IsMalicious = False
438 |             else:
439 |                 print res
440 |                 # query the edges by id: the endpoint ids are known, so look up the edge-label id via extract_edgelabelid_by_edgelabel and assemble the edge id ourselves
441 |                 index = 1
442 |                 scan_index = 0
443 |                 while scan_index < len(res):
444 |                     index = 1
445 |                     while index < len(symbol_list):
446 |                         edgelabelid = extract_edgelabelid_by_edgelabel(event_sequence[index - 1])
447 |                         edgeid = "S1:" + res[scan_index][symbol_list[index - 1]] + ">" + str(edgelabelid) + ">>S1:" + res[scan_index][symbol_list[index]]
448 |                         edge_info = extract_edge_by_edgeid(edgeid)["properties"]
449 |                         ports_involved.add(edge_info["src_p"])
450 |                         ports_involved.add(edge_info["dst_p"])
451 |                         tmp_ts = Decimal(edge_info["ts"])
452 |                         if start_time == 0:
453 |                             start_time = tmp_ts
454 |                         if end_time == 0:
455 |                             end_time = tmp_ts
456 |                         # widen the time window
457 |                         if start_time.compare(tmp_ts) > 0:
458 |                             start_time = tmp_ts
459 |                         if end_time.compare(tmp_ts) < 0:
460 |                             end_time = tmp_ts
461 |                         index += 1
462 |                     scan_index += 1
463 |                 scan_index = 0
464 |                 while scan_index < len(res):
465 |                     for sym in symbol_list:# PROBLEM: the first ping sweep should involve many more nodes than this
466 |                         # print res[0][sym] # the nodes involved
467 |                         victim_nodes.add(res[scan_index][sym])
468 |                     scan_index += 1
469 |                 print symbol_list
470 |         print "Matching cyclic attack sequences..."
471 |         for event in EVENT_CHAIN_CYCLICPATHS:
472 |             event_sequence = event.split('>')
473 |             symbol_list = []
474 |             i = 1
475 |             while i <= len(event_sequence):
476 |                 symbol_list.append("a" + str(i))
477 |                 i += 1
478 |             res = search_attack_event(symbol_list, event_sequence, V, True)
479 |             if not len(res):
480 |                 IsMalicious = False
481 |             else:
482 |                 print res
483 |                 index = 1
484 |                 scan_index = 0
485 |                 while scan_index < len(res):
486 |                     index = 1
487 |                     while index < len(symbol_list):
488 |                         edgelabelid = extract_edgelabelid_by_edgelabel(event_sequence[index - 1])
489 |                         edgeid = "S1:" + res[scan_index][symbol_list[index - 1]] + ">" + str(edgelabelid) + ">>S1:" + res[scan_index][symbol_list[index]]
490 |                         edge_info = extract_edge_by_edgeid(edgeid)["properties"]
491 |                         ports_involved.add(edge_info["src_p"])
492 |                         ports_involved.add(edge_info["dst_p"])
493 |                         tmp_ts = Decimal(edge_info["ts"])
494 |                         # initialize the timestamps
495 |                         if start_time == 0:
496 |                             start_time = tmp_ts
497 |                         if end_time == 0:
498 |                             end_time = tmp_ts
499 |                         # widen the time window
500 |                         if start_time.compare(tmp_ts) > 0:
501 |                             start_time = tmp_ts
502 |                         if end_time.compare(tmp_ts) < 0:
503 |                             end_time = tmp_ts
504 |                         index += 1
505 |                     scan_index += 1
506 |                 scan_index = 0
507 |                 while scan_index < len(res):
508 |                     for sym in symbol_list:# PROBLEM: the first ping sweep should involve many more nodes than this
509 |                         # print res[0][sym] # the nodes involved
510 |                         victim_nodes.add(res[scan_index][sym])
511 |                     scan_index += 1
512 |                 print symbol_list
513 |         if IsMalicious:
514 |             Malicious_nodes.append(V)
515 |             # only in this case are the contents of victim_nodes valid
516 |             # an attack pattern should map to a proper attack-pattern label; a bare sequence number is not enough
517 |             # record it as Pattern-<number> for now and fill in the number-to-label mapping later
518 |             print "Confirmed malicious node:"
519 |             print V
520 |             print "Affected nodes:"
521 |             print victim_nodes
522 |             print "Ports involved:"
523 |             print ports_involved
524 |             print "Start time:"
525 |             print start_time
526 |             print "End time:"
527 |             print end_time
528 |             print "+++++++++++++++++++++++++++++++++++++++++++++++++"
529 |             global attack_counter
530 |             result_dict["num"] = attack_counter # number it with the global counter
531 |             attack_counter += 1
532 |             result_dict["pattern"] = "attack-pattern-" + str(PATTERN_NUM) # ideally this becomes a proper label
533 |             result_dict[V] = victim_nodes
534 |             result_dict["ports"] = ports_involved
535 |             result_dict["start_time"] = start_time
536 |             result_dict["end_time"] = end_time
537 |             nodes_involved.append(result_dict)
538 |             print nodes_involved
539 |             print "+++++++++++++++++++++++++++++++++++++++++++++++++"
540 |     return Malicious_nodes
541 | 
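# A sketch of what one nodes_involved entry ends up looking like; every value
# below is invented for illustration:
#
#     {
#         "num": 0,                                         # global attack counter
#         "pattern": "attack-pattern-1",                    # to be mapped to a label later
#         "1:10.0.0.1": set(["1:10.0.0.2", "1:10.0.0.3"]),  # attacker vertex id -> affected nodes
#         "ports": set(["111/udp", "23/tcp"]),
#         "start_time": Decimal("952301562.103"),
#         "end_time": Decimal("952301584.771"),
#     }
#
# The attacker's own id doubles as a dictionary key, which is why the
# correlation loop at the bottom of the file recovers key1/key2 by eliminating
# the fixed keys.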
542 | # Compute the Jaccard set similarity; p and q are sets
543 | def jaccard(p, q):
544 |     return float(len(set.intersection(p, q)))/float(len(set.union(p, q)))
545 | 
546 | if __name__ == '__main__':
547 |     # The work falls into four parts:
548 |     # Part 1: knowledge-graph construction (mostly done; not in this file)
549 |     # Part 2: attack discovery via property-graph mining (mostly done)
550 |     # Part 3: attack correlation via property-graph similarity (Wu Dong's super-alert method is a usable reference, though ours differs in places)
551 |     # Part 4: situation understanding based on malicious-node influence (consider PageRank adapted to the scenario, with two factors: the influence of single-step attacks and the influence along attack chains;
552 |     # not done yet, but it should not be hard since off-the-shelf code exists; a bare situation score alone may not be very convincing)
553 |     # Get the ideas straight first; the paper must not be rushed
554 |     # What we do is feature matching, not subgraph-isomorphism matching, which would be full of holes
555 |     # Put differently: fuzzy matching rather than exact matching (exactness is not needed)
556 |     attack_counter = 0
557 |     PATTERN_NUM = 0
558 |     MAX_DISTANCE = 1 # ok
559 |     TIME_WINDOW = 600 # chosen by hand
560 |     KEY_EVENTS = set() # the event_label values of the attack_event_n edges incident to vertex 0, ok
561 |     EVENT_CHAIN_PATHS = []
562 |     EVENT_CHAIN_CYCLICPATHS = []
563 |     SUSPICIOUS_NODES = []
564 |     PATTERNS = 5
565 |     K = 2 # chosen by hand
566 |     # Extract the following from the pattern graph:
567 |     # matching rules anchored at the suspicious vertex
568 |     # the value of K (top-K suspicious vertices), ok
569 |     # the maximum distance (limits the matching range), ok
570 |     # cycle handling (pattern matching / consecutive out() steps); cycles must still be turned into matching rules
571 |     while PATTERN_NUM < PATTERNS:
572 |         KEY_EVENTS.clear() # a set # phase 2's rpc_call/rpc_reply ring matching had problems: the ring clearly exists yet was not matched; the set must be cleared each round
573 |         print "Extracting features of attack pattern graph " + str(PATTERN_NUM) + ":"
574 |         # source_node_id = "4:0" # the attacker vertex in the pattern graph; this id is fragile and not under our control
575 |         source_node_id = str(PATTERN_NUM+4) + ":0" # in practice the ids are spaced 2 apart
576 |         MAX_DISTANCE = extract_max_distance(source_node_id)
577 |         print "MAX_DISTANCE = " + str(MAX_DISTANCE)
578 |         print "Setting the time-window size..."
579 |         print "TIME_WINDOW = " + str(TIME_WINDOW)
580 |         print "Setting K..."
581 |         print "K = " + str(K)
582 |         # Format of the matching rules:
583 |         # start from the suspicious vertex and follow attack event chains (idea: enumerate every event-chain combination in the pattern graph; duplicates are fine, and one chain step may carry more than one event, as long as pattern matching can express that)
584 |         # extract the key events (used to select edges when building the subgraph)
585 |         KEY_EVENTS = extract_key_events(PATTERN_NUM)
586 |         print KEY_EVENTS
587 |         # extract the attack event chains
588 |         EVENT_CHAIN_PATHS = extract_event_chain_paths(source_node_id, MAX_DISTANCE, PATTERN_NUM)
589 |         print "Attack event chains (acyclic):"
590 |         print "EVENT_CHAIN_PATHS = "
591 |         print EVENT_CHAIN_PATHS
592 |         EVENT_CHAIN_CYCLICPATHS = extract_event_chain_cyclicpaths(source_node_id, PATTERN_NUM)
593 |         print "Attack event chains (cyclic):"
594 |         print "EVENT_CHAIN_CYCLICPATHS = "
595 |         print EVENT_CHAIN_CYCLICPATHS
596 |         SUSPICIOUS_NODES = extract_suspicious_nodes_from_datagraph(KEY_EVENTS, K)
597 |         print "Suspicious vertex ids:"
598 |         print "SUSPICIOUS_NODES = "
599 |         print SUSPICIOUS_NODES
600 |         MALICIOUS_NODES = extract_attack_event_by_event_chain(EVENT_CHAIN_PATHS, EVENT_CHAIN_CYCLICPATHS, SUSPICIOUS_NODES, PATTERN_NUM)
601 |         print "Malicious vertex ids:"
602 |         print "MALICIOUS_NODES = "
603 |         print MALICIOUS_NODES
604 |         # The discovered attacks need to be persisted,
605 |         # in preparation for attack-chain discovery in the next step
606 |         # A successful match yields a single-step attack of the form (malicious node, affected nodes, event set, start time, end time, event label)
607 |         # The event labels are not attached yet; they may be needed. Consider "multi-factor correlation" and an "ontology reasoner" as approaches.
608 |         PATTERN_NUM += 1
609 |     print nodes_involved
610 |     # nodes_involved now holds every single-step attack
611 |     # Analysis tasks:
612 |     # 1: fix the start_time and end_time attributes of each single-step attack
613 |     # 2: map attack patterns to labels, enabling type-based correlation analysis
614 |     # 3: represent a single-step attack as the 6-tuple (attacker node, affected nodes, start time, end time, ports involved, attack label) (1,1,1,1,1,0)
615 |     # 4: compute correlation from the four angles of ip, port, time and type, and set a correlation threshold
616 |     # 5: type needs a mapping from attack number to label
617 |     # 6: TBD; see the paper
618 |     # On attack labels: they are fairly explicit expert knowledge, so use them only as an aid (e.g. which broad kill-chain stage an attack belongs to).
619 |     # Downplaying the attack labels keeps the claim of "discovering new attack sequences" defensible.
620 |     a = set()
621 |     b = set()
622 |     a.add("111")
623 |     b.add("111")
624 |     b.add("222")
625 |     a.add("333")
626 |     a.add("444")
627 |     print "Testing Jaccard similarity..." # to be applied to ips and ports
628 |     print jaccard(a, b)
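    # Worked through by hand for the two test sets above:
    #   a = {"111", "333", "444"}, b = {"111", "222"}
    #   intersection = {"111"}, size 1; union = {"111", "222", "333", "444"}, size 4
    #   jaccard(a, b) = 1/4 = 0.25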
考虑"多因素关联"和"本体推理机"方法 608 | PATTERN_NUM += 1 609 | print nodes_involved 610 | # nodes_involved包含了所有单步攻击的信息 611 | # 分析任务 612 | # 1: 确定单步攻击的start_time和end_time属性 613 | # 2: 将攻击模式映射至标签,希望可以做类型关联度分析 614 | # 3: 单步攻击表示为(攻击节点,受影响节点,开始时间,结束时间,涉及端口,攻击标签)的六元组 (1,1,1,1,1,0) 615 | # 4: 从ip,端口,时间,类型等四个角度计算关联度,同时设置关联度阈值 616 | # 5: 类型需要一个从攻击序号到标签的映射关系 617 | # 6: 待定,参考论文 618 | # 关于攻击标签的考虑,这是一种比较明显的专家知识.最多作辅助使用,比如属于杀伤链的某个大阶段. 619 | # 弱化攻击标签的影响,好让我的"发现新攻击序列"的说法站得住脚 620 | a = set() 621 | b = set() 622 | a.add("111") 623 | b.add("111") 624 | b.add("222") 625 | a.add("333") 626 | a.add("444") 627 | print "测试杰卡徳相似度..." # 应用到ip,端口上 628 | print jaccard(a, b) 629 | # 要算出任意两个单步之间的关联度,设置一个n*n的关联矩阵 630 | for e in nodes_involved: 631 | print e 632 | # 挨个处理 633 | for key in e: 634 | if key == "pattern": 635 | print "convert attack patterns to labels" 636 | elif key == "num": # 是不是按照顺序来的? 637 | print e[key] 638 | else: 639 | print key 640 | print type(e[key]) # e[key]是受影响节点集合 641 | # 再想办法取出时间戳,现在回头取是不是有点难弄?能不能一开始就弄上? 可以 642 | print len(nodes_involved) 643 | # 先按时间顺序罗列一下? 644 | incidence_matrix = np.zeros((len(nodes_involved), len(nodes_involved))) 645 | print "初始化关联性度量矩阵..." 646 | print incidence_matrix 647 | # Corr( Ha, Hb) = ∑ w i × C i 648 | # C1: IP的集合相似度(受影响节点) 是否需要对子集之类的情况作一些修正? 649 | # C2: Port的集合相似度(参与通信的端口) 650 | # C3: 类型 APT的杀伤链模型 651 | for a in nodes_involved: 652 | for b in nodes_involved: 653 | if a["num"] == b["num"]: 654 | continue 655 | key1 = "" 656 | key2 = "" 657 | for key in a: # 恶意节点ip不太确定, key1和key2都代表恶意节点ip 658 | if key != "pattern" and key != "start_time" and key != "num" and key != "end_time" and key != "ports": 659 | key1 = key 660 | for key in b: 661 | if key != "pattern" and key != "start_time" and key != "num" and key != "end_time" and key != "ports": 662 | key2 = key 663 | c1 = jaccard(a[key1], b[key2]) # a[key1]和b[key2]分别代表受影响节点 664 | # 可以分情况考虑,如果key1和key2相同,说明这两个攻击关联性强(为同一个恶意节点发动的攻击) 665 | # 反之,如果key1和key2不同,那么就从受影响节点的集合来考虑. Jaccard常数为0,完全无关,非0,有关. 666 | # 还可以考虑一些其他的情况,比如key1或者key2在对方的受影响节点中,key2在key1的受影响节点中 667 | # 说明,key2可能是被key1入侵了,成为了新的肉机(当然,还需要时间上满足先后关系).这种情况,可以增加关联性值. 668 | c2 = jaccard(a["ports"], b["ports"]) 669 | getcontext().prec = 4 670 | c = Decimal(0.5 * c1 + 0.5 * c2) # 暂时这么写 671 | incidence_matrix[a["num"], b["num"]] = incidence_matrix[b["num"], a["num"]] = c # 关联矩阵应该是个对称矩阵 672 | print "计算过后的关联性度量矩阵..." 673 | print incidence_matrix 674 | # 计算结束之后,需要设置规定"两者之间是否有相关关系"的阈值 675 | # 此后可以仿照吴东的方法,构建单步攻击时间关系图并从中挖掘攻击链 676 | # ... -------------------------------------------------------------------------------- /updateHost.bro: -------------------------------------------------------------------------------- 1 | # author: wangyixuan 2 | # It aims at getting hosts' 3 | # USERNAME, 0 pay attention to NTLM PLZ 4 | # HOSTNAME, 1 5 | # MAC ADDRESS, 1 6 | # OPERATING SYSTEM, 1 7 | # IP ADDRESS 1 8 | 9 | # How to sort ips? 10 | # Type (indicates devicetype, etc desktop, laptop, tablet) 11 | # Applications (Why do we need it? Should we guess during which period such applications are running?) 12 | # Protocols (so many protocols, how to handle them? It exists between two hosts.) 
13 | 14 | # Maintain the information of hosts all the time and output it to log file regularly 15 | 16 | # completed two functions named update_hostlist and update_single_host 17 | 18 | # event new_connection: collect various protocols which are indicated by connections 19 | # event protocol_confirmation: this event is emitted when bro confirms that this protocol is actually running here 20 | # problem to solve: whether the protocols comes is a new protocol? Using !in is not appropriate. 21 | 22 | # adjust the format of protocols etc: http:33,dns:14 23 | # data to log cannot be a table 24 | 25 | # how to invoke a event in a specific interval 26 | # refer to test1.bro and define an event by ourself 27 | # this user-defined event can complish the task of logging hostlist every n seconds 28 | # outside dataprocesser can read the log every n seconds as well 29 | 30 | # convert ts to the form of "YYYY:MM:DD-HH:MM:SS", which is easier to understand 31 | # in "ips": mark the timestamp of each ip 32 | # in "protocols": mark the number that indicate how many time this protocol has beem confirmed 33 | # the value of n is dynamic 34 | # there are some problems in updating ips 35 | # 1. segment fault 36 | # 2. redundant ip in "ips" field 37 | # 3, three records missing ips(uninitialized) 38 | # 4. the way to check a ip already exist? etc: 192.168.1.5, 192.168.1.50 substring is not reliable 39 | # @load /home/lw/myKGA/signature_test.bro 40 | module HOST_INFO; 41 | @load /usr/local/zeek/share/zeek/base/frameworks/dpd/__load__.bro 42 | export{ 43 | # Create an ID for our new stream. By convention, this is 44 | # called "HOST_INFO_LOG". 45 | redef enum Log::ID += { HOST_INFO_LOG, 46 | SUMMARY_HOST_LOG, 47 | NET_EVENTS_LOG };# NET_EVENTS_LOG记录重要网络事件(或者网络包),作为KG分析的输入,BRO脚本分析多步攻击的数据集 48 | 49 | # 定义三元组中谓语的类型,输出的格式是HOST_INFO::ICMP_ECHO_REQUEST 50 | type relation: enum { 51 | Empty, ICMP_ECHO_REQUEST, ICMP_ECHO_REPLY, ICMP_UNREACHABLE 52 | }; 53 | # unfortunately, its json format is incorrect 54 | # We need to handle the json format output line by line 55 | # redef LogAscii::use_json = T; 56 | # Define the record type that will contain the data to log. 57 | type host_info: record{ 58 | ts: time &log; 59 | ip: addr &log;#indicates the newest ip 60 | ips: string &default="" &log; # historical ips, ordered by their 61 | username: string &default="" &log; 62 | hostname: string &default="" &log; 63 | mac: string &default="" &log; 64 | os: string &default="" &log; 65 | description: string &default="" &log; 66 | protocols: string &default="" &log; # list all of its protocols 67 | }; 68 | 69 | # 再定义一个结构体,用于存储三元组事件(A, relation, B),实际就是(主,谓,宾)三元组 70 | # 三元组事件的存储方案: 1.三元组表 2.水平表 3.属性表 4.垂直划分 5.六重索引 6.DB2RDF 71 | # 还是存储到RDF中,后续可以进行SPARQL查询? 72 | # 数据量巨大,考虑三元组的聚合(去除一些没用的三元组)=>类比南理工的文章中的经验聚合(去除一些不太重要的告警信息) 73 | # 三元组的内容不局限于最底层流量,应当有一些告警层面的三元组(但是这种三元组从哪儿来?有现成的事件还是推理出来) 74 | 75 | # 更新: 用边来表示事件,为了兼容各种情况,可以包含诸多属性 76 | # 属性可以慢慢添加,逐渐完善 77 | # 目前考虑的事件: 1. 
icmp ping事件 78 | type event_info: record{ 79 | ts: time &log; 80 | real_time: string &log; 81 | event_type: relation &log; 82 | src_ip: addr &log; 83 | src_p: port &log; 84 | dst_ip: addr &log; 85 | dst_p: port &log; 86 | }; 87 | } 88 | 89 | # Use it to store host-info 90 | global hostlist: vector of host_info = {}; 91 | 92 | global num_packets = 0; 93 | 94 | # Precondition: 0 <= index <= |hostlist| 95 | # Postcondition: cooresponding item has been updated 96 | function update_single_host(hinfo: HOST_INFO::host_info, protocol: string, index: int){ 97 | # remember to initialize "ips" and "protocols" 98 | # print fmt("update index %d", index); 99 | # print hinfo; 100 | print fmt("index is : %d", index); 101 | local tmp_ip: string = fmt("%s", hinfo$ip); 102 | local up_index: count = 0; 103 | print fmt("the ip is %s", tmp_ip); 104 | if(hostlist[index]$ips == ""){ 105 | # print fmt("initialize ips of index %d", index); 106 | local t: time = network_time(); 107 | hostlist[index]$ips = fmt("%s", strftime("%Y-%m-%d-%H:%M:%S|", t) + tmp_ip); 108 | } 109 | if(hostlist[index]$protocols == ""){ 110 | # print fmt("initialize protocols of index %d", index); 111 | hostlist[index]$protocols = protocol + ":1"; 112 | } 113 | # check that whether ip is the newest ip 114 | if(hinfo$ip != hostlist[index]$ip){ 115 | # print fmt("update ips because host's ip has been changed"); 116 | # Maybe this host uses a new ip now, so I need to concatenate "ips" 117 | # Since these messages comes in order, I take it for granted that it is unnecessary to compare timestamp. 118 | hostlist[index]$ip = hinfo$ip; # update the newest ip 119 | # maybe we need a new way to determine whether the ip is new: edit the if condition 120 | if(tmp_ip !in hostlist[index]$ips){ 121 | # a new ip comes, append it to the end of ips 122 | local t1: time = network_time(); 123 | hostlist[index]$ips += fmt(",%s", strftime("%Y-%m-%d-%H:%M:%S|", t1) + tmp_ip); 124 | print "append ips"; 125 | } else { 126 | print "update ips"; 127 | # in this case, the previous ts should be updated 128 | local comma: pattern = /,/; 129 | local tmp_tlb: table[count] of string = split(hostlist[index]$ips, comma); 130 | local ori_len: count = |tmp_tlb|; 131 | # tmp_tlb_ip holds the ips in tmp_tlb and has the same index as tmp_tlb 132 | # To ensure the coming ip is a new ip or not clearly. 
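# Why split-and-compare instead of the substring test warned about above:
#   "192.168.1.5" in "...|192.168.1.50" is true even if .5 was never recorded.
# Splitting on "," and then on "|" yields whole ip tokens, so the comparison
# below (tmp_ip == tmp_tlb_ip[key]) is exact rather than substring-based.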
133 | local tmp_tlb_ip: table[count] of string; 134 | for(key in tmp_tlb){ 135 | local bin_tlb: table[count] of string = split(tmp_tlb[key], /\|/); 136 | tmp_tlb_ip[key] = bin_tlb[2]; 137 | } 138 | print fmt("previous len: %d", ori_len); 139 | print "what is in ips now ?"; 140 | print hostlist[index]$ips; 141 | print "what is in tmp_tlb now?"; 142 | print tmp_tlb; 143 | for(key in tmp_tlb_ip){# use tmp_tlb_ip to determine the key to store 144 | print key; # To avoid missing ips, we should initialize "ips" when we append a new item 145 | print tmp_tlb_ip[key]; 146 | print "start checking"; 147 | if(tmp_ip == tmp_tlb_ip[key]){ 148 | # this item should be updated 149 | print "bingo"; 150 | # here is strange segment fault when I try to directly overwrite tmp_tlb[key] here 151 | # so I record the value of key instead 152 | up_index = key; 153 | # tmp_tlb[key] = fmt("%s", strftime("%Y-%m-%d-%H:%M:%S|", t2) + tmp_ip); 154 | # print fmt("the last item: %s", tmp_tlb[key]); 155 | # if(key == ori_len){ # the last item 156 | # tmp_tlb[key] = fmt("%s", strftime("%Y-%m-%d-%H:%M:%S|", t2) + tmp_ip); 157 | # print fmt("the last item: %s", tmp_tlb[key]); 158 | # } 159 | # else{ # previous items 160 | # tmp_tlb[key] = fmt("%s", strftime("%Y-%m-%d-%H:%M:%S|", t2) + tmp_ip); 161 | # print fmt("previous item: %s", tmp_tlb[key]); 162 | # } 163 | } 164 | print "end checking"; 165 | } 166 | print "before join!"; 167 | if(up_index != 0){ 168 | # up_index is applied to update tmp_tlb 169 | # from now on, tmp_tlb_ip is useless 170 | local t2: time = network_time(); 171 | tmp_tlb[up_index] = fmt("%s", strftime("%Y-%m-%d-%H:%M:%S|", t2) + tmp_ip); 172 | } 173 | for(key in tmp_tlb){ 174 | print fmt("[%d]=>%s", key, tmp_tlb[key]); 175 | } 176 | # hostlist[index]$ips = cat_string_array(tmp_tlb); # overwrite 177 | hostlist[index]$ips = join_string_array(",", tmp_tlb); 178 | print fmt("after join:%s", hostlist[index]$ips); 179 | # recheck the number of commas in ips 180 | if(ori_len != |split(hostlist[index]$ips, comma)|){ 181 | print "Unexpected error: the number of commas is wrong"; 182 | print fmt("ori_len: %d, new len: %d", ori_len, |split(hostlist[index]$ips, comma)|); 183 | } 184 | } 185 | } else { 186 | print "do not update ips"; 187 | } 188 | # check that whether protocol is a protocol related to this host 189 | # if not: concatenate "protocols" separated by commas 190 | # this check condition is not so good 191 | # we'd better split protocols into individual items and compare them 192 | up_index = 0; # reinitialize up_index 193 | if(protocol != "" && protocol !in hostlist[index]$protocols){ 194 | # print fmt("update protocols because a new protocol of this host found"); 195 | hostlist[index]$protocols += fmt(",%s:1", protocol); 196 | } else { 197 | # record the count 198 | local pro_tlb: table[count] of string = split(hostlist[index]$protocols, /,/); 199 | local pro_tlb_tmp: table[count] of string; 200 | print "start updating protocols"; 201 | print pro_tlb; 202 | for(key in pro_tlb){ 203 | local bin_p_tlb: table[count] of string = split(pro_tlb[key], /:/); 204 | pro_tlb_tmp[key] = bin_p_tlb[1]; 205 | } 206 | for(key in pro_tlb_tmp){ 207 | if(protocol == pro_tlb_tmp[key]){ 208 | # increase by one later 209 | up_index = key; 210 | } 211 | if(up_index != 0){ 212 | local bin_p_tlb1: table[count] of string = split(pro_tlb[up_index], /:/); 213 | local num_s: string = bin_p_tlb1[2]; 214 | local num_v: count = to_count(num_s); 215 | num_v += 1; 216 | pro_tlb[up_index] = fmt("%s:%d", bin_p_tlb1[1], num_v); 217 | } 218 | 
for(key in pro_tlb){ 219 | print fmt("[%d]=>%s", key, pro_tlb[key]); 220 | } 221 | hostlist[index]$protocols = join_string_array(",", pro_tlb); 222 | } 223 | } 224 | # update timestamp 225 | hostlist[index]$ts = hinfo$ts; 226 | # update hostname iff a different hostname comes 227 | if(hinfo$hostname != "" && hinfo$hostname != hostlist[index]$hostname){ 228 | # in the case of empty string, initialize it 229 | # print fmt("initialize the hostname field of this host"); 230 | hostlist[index]$hostname = hinfo$hostname; 231 | } 232 | # update os 233 | if(hinfo$os != "" && hinfo$os != hostlist[index]$os){ 234 | # print fmt("update os field of this host"); 235 | hostlist[index]$os = hinfo$os; 236 | } 237 | # update mac 238 | # Although we confirm that mac should be set as the unique id, 239 | # we reconsider it in the second branch in update_hostlist. 240 | if(hinfo$mac != "" && hostlist[index]$mac == ""){ 241 | # initialize mac field 242 | # print fmt("initialize mac field of this host"); 243 | hostlist[index]$mac = hinfo$mac; 244 | } 245 | # update username 246 | if(hinfo$username != "" && hostlist[index]$username == ""){ 247 | # print fmt("update username field of this host"); 248 | hostlist[index]$username = hinfo$username; 249 | } 250 | } 251 | 252 | # Precondition: hinfo comes from fragmentary records 253 | # Postcondition: update contents of hostlist with hinfo 254 | function update_hostlist(hinfo: HOST_INFO::host_info, protocol: string){ 255 | # print "prepare to update"; 256 | local has_updated: bool = F; 257 | if(hinfo$mac != "" || hinfo$hostname != ""){ 258 | # I believe that mac addresses and hostnames can uniquely identify a host. 259 | for(i in hostlist){ 260 | if(((hostlist[i]$mac == hinfo$mac) && (hinfo$mac != "")) || ((hostlist[i]$hostname == hinfo$hostname) && (hinfo$hostname != ""))){ 261 | # update 262 | update_single_host(hinfo, protocol, i); 263 | has_updated = T; 264 | break; 265 | } 266 | } 267 | if(!has_updated) { 268 | # To avoid missing ips, we should initialize "ips" when we append a new item 269 | hostlist[|hostlist|] = hinfo; 270 | local wall_time: time = network_time(); 271 | local tmp_ip: string = fmt("%s", hinfo$ip); 272 | # 这边应该也要-1 273 | hostlist[|hostlist|-1]$ips += fmt("%s", strftime("%Y-%m-%d-%H:%M:%S|", wall_time) + tmp_ip); 274 | has_updated = T; 275 | } 276 | } 277 | # 为了icmp发现的主机能进入记录,暂时允许把ip作为主机唯一性考量 278 | if(hinfo ?$ ip){ 279 | for(i in hostlist){ 280 | if(hostlist[i]$ip == hinfo$ip){# 如果有,更新一下,其实没有什么好更新的 281 | print hostlist[i]$ip; 282 | print hinfo$ip; 283 | update_single_host(hinfo, protocol, i); 284 | has_updated = T; 285 | break; 286 | } 287 | } 288 | # 如果没有,插入 289 | if(!has_updated){ 290 | print "a new ip comes"; 291 | print hinfo$ip; 292 | print |hostlist|; 293 | hostlist[|hostlist|] = hinfo; 294 | print |hostlist|; 295 | local wall_time1: time = network_time(); 296 | local tmp_ip1: string = fmt("%s", hinfo$ip); 297 | # print hostlist[|hostlist|-1]; 298 | # |hostlist|改变了,再对齐对应记录作修改,后面-1 299 | hostlist[|hostlist|-1]$ips += fmt("%s", strftime("%Y-%m-%d-%H:%M:%S|", wall_time1) + tmp_ip1); 300 | has_updated = T; 301 | # 针对仅有ip的主机更新情况,不会再去下一个if分支 302 | } 303 | } 304 | if(!has_updated){ 305 | # In this case, I can't confirm that this host 306 | for(i in hostlist){ 307 | if(hinfo$ip == hostlist[i]$ip){ 308 | update_single_host(hinfo, protocol, i); 309 | has_updated = T; 310 | break; 311 | } 312 | } 313 | if(!has_updated){ 314 | # At this point, all correct info should have been updated 315 | print "incomplete information, skip it ", hinfo; 316 | } 317 | 
} 318 | } 319 | 320 | function check_ssh_hostname(id: conn_id, uid: string, host: addr){ 321 | when(local hostname = lookup_addr(host)){ 322 | local rec: HOST_INFO::host_info = [$ts = network_time(), $ip = host, $hostname = hostname, $description = "shh_auth"]; 323 | update_hostlist(rec, "ssh"); 324 | Log::write(HOST_INFO::HOST_INFO_LOG, rec); 325 | } 326 | } 327 | 328 | event OS_version_found(c: connection, host: addr, OS: OS_version){ 329 | # print "an operating system has been fingerprinted"; 330 | # print fmt("the host running this OS is %s", host); 331 | # print OS; 332 | if(OS$genre != "UNKNOWN"){ 333 | local os_detail = fmt("%s %s", OS$genre, OS$detail); 334 | local rec: HOST_INFO::host_info = [$ts = network_time(), $ip = host, $os = os_detail, $description = "OS_version_found"]; 335 | update_hostlist(rec, "os_fingerprint"); 336 | Log::write(HOST_INFO::HOST_INFO_LOG, rec); 337 | } 338 | # e.g [genre=UNKNOWN, detail=, dist=36, match_type=direct_inference] 339 | # How to utilize this message? 340 | } 341 | 342 | # There is no point in removing dulipcated messages for a specific ip. 343 | # Becuase ip addresses should not be the unique identification of a specific host. 344 | # We should identity a specific host by ip and mac pairs which have the lastest network time. 345 | event arp_reply(mac_src: string, mac_dst: string, SPA: addr, SHA: string, TPA: addr, THA: string){ 346 | # print "arp reply"; 347 | # print fmt("source mac: %s, destination mac: %s, SPA: %s, SHA: %s, TPA: %s, THA: %s", mac_src, mac_dst, SPA, SHA, TPA, THA); 348 | # record ip and its mac address 349 | # we don't these form of mac addresses: 350 | # 00:00:00:00:00:00 and ff:ff:ff:ff:ff:ff 351 | if(SHA != "ff:ff:ff:ff:ff:ff" && SHA != "00:00:00:00:00:00" && SPA != 0.0.0.0){ 352 | local rec1: HOST_INFO::host_info = [$ts = network_time(), $ip = SPA, $mac = SHA, $description = "arp_reply" ]; 353 | update_hostlist(rec1, "arp"); 354 | Log::write(HOST_INFO::HOST_INFO_LOG, rec1); 355 | } 356 | if(THA != "ff:ff:ff:ff:ff:ff" && THA != "00:00:00:00:00:00" && TPA != 0.0.0.0){ 357 | local rec2: HOST_INFO::host_info = [$ts = network_time(), $ip = TPA, $mac = THA, $description = "arp_reply" ]; 358 | update_hostlist(rec2, "arp"); 359 | Log::write(HOST_INFO::HOST_INFO_LOG, rec2); 360 | } 361 | } 362 | 363 | event arp_request(mac_src: string, mac_dst: string, SPA: addr, SHA: string, TPA: addr, THA: string){ 364 | # print "arp request"; 365 | # print fmt("source mac: %s, destination mac: %s, SPA: %s, SHA: %s, TPA: %s, THA: %s", mac_src, mac_dst, SPA, SHA, TPA, THA); 366 | if(SHA != "ff:ff:ff:ff:ff:ff" && SHA != "00:00:00:00:00:00" && SPA != 0.0.0.0){ 367 | local rec1: HOST_INFO::host_info = [$ts = network_time(), $ip = SPA, $mac = SHA, $description = "arp_request" ]; 368 | update_hostlist(rec1, "arp"); 369 | Log::write(HOST_INFO::HOST_INFO_LOG, rec1); 370 | } 371 | if(THA != "ff:ff:ff:ff:ff:ff" && THA != "00:00:00:00:00:00" && TPA != 0.0.0.0){ 372 | local rec2: HOST_INFO::host_info = [$ts = network_time(), $ip = TPA, $mac = THA, $description = "arp_request" ]; 373 | update_hostlist(rec2, "arp"); 374 | Log::write(HOST_INFO::HOST_INFO_LOG, rec2); 375 | } 376 | } 377 | 378 | event bad_arp(SPA: addr, SHA: string, TPA: addr, THA: string, explanation: string){ 379 | print fmt("this arp packet is bad because: %s", explanation); 380 | } 381 | 382 | event dhcp_message(c: connection, is_orig: bool, msg: DHCP::Msg, options: DHCP::Options){ 383 | # print "A dhcp message is coming!"; 384 | # print msg; 385 | # print options; 386 | if(options ?$ 
host_name && options ?$ addr_request && options ?$ client_id){ # It occurred once: missing client_id, check it in advance 387 | print "haha"; 388 | # print options; 389 | local rec1: HOST_INFO::host_info = [$ts = network_time(), $ip = options$addr_request, $mac = options$client_id$hwaddr, $hostname = options$host_name, $description = "dhcp_message1" ]; 390 | update_hostlist(rec1, "dhcp"); 391 | Log::write(HOST_INFO::HOST_INFO_LOG, rec1); 392 | } else{ 393 | if(msg$yiaddr != 0.0.0.0){ 394 | local rec2: HOST_INFO::host_info = [$ts = network_time(), $ip = msg$yiaddr, $mac = msg$chaddr, $description = "dhcp_message2" ]; 395 | update_hostlist(rec2, "dhcp"); 396 | Log::write(HOST_INFO::HOST_INFO_LOG, rec2); 397 | } 398 | } 399 | } 400 | 401 | 402 | event ssh_auth_successful(c: connection, auth_method_none: bool){ 403 | for ( host in set(c$id$orig_h, c$id$resp_h) ) 404 | { 405 | check_ssh_hostname(c$id, c$uid, host); 406 | } 407 | } 408 | 409 | event dns_query_reply(c: connection, msg: dns_msg, query: string, qtype: count, qclass: count){ 410 | # print "here comes a dns query reply"; 411 | # print c; 412 | # print msg; 413 | # print query; 414 | # print qtype; 415 | } 416 | 417 | event dns_A_reply(c: connection, msg: dns_msg, ans: dns_answer, a: addr){ 418 | # print "********************************TYPE A REPLY*********************"; 419 | # print c; 420 | # print msg;#[id=0, opcode=0, rcode=0, QR=T, AA=T, TC=F, RD=F, RA=F, Z=0, num_queries=0, num_answers=1, num_auth=0, num_addl=0] 421 | # print ans;#[answer_type=1, query=brwa86bad339915.local, qtype=1, qclass=32769, TTL=4.0 mins] 422 | # print a;#192.168.1.108 423 | local rec: HOST_INFO::host_info = [$ts = network_time(), $ip = a, $hostname = ans$query, $description = "dns_A_reply" ]; 424 | update_hostlist(rec, "dns"); 425 | Log::write(HOST_INFO::HOST_INFO_LOG, rec); 426 | } 427 | 428 | event dns_AAAA_reply(c: connection, msg: dns_msg, ans: dns_answer, a: addr){ 429 | local rec: HOST_INFO::host_info = [$ts = network_time(), $ip = a, $hostname = ans$query, $description = "dns_AAAA_reply" ]; 430 | update_hostlist(rec, "dns"); 431 | Log::write(HOST_INFO::HOST_INFO_LOG, rec); 432 | } 433 | 434 | # I want to get hostnames by event related to DNS. 435 | event dns_message(c: connection, is_orig: bool, msg: dns_msg, len: count){ 436 | # print "dns_message"; 437 | # print "1"; 438 | # print c$dns_state$pending_queries; 439 | if(c ?$dns_state){ 440 | for(index1 in c$dns_state$pending_queries){ 441 | # print "2"; 442 | # print c$dns_state$pending_queries[index1]; 443 | for(index2 in c$dns_state$pending_queries[index1]$vals){ 444 | local rec: DNS::Info = c$dns_state$pending_queries[index1]$vals[index2]; 445 | # print rec; 446 | if(rec ?$ answers){ 447 | print "It has answers!!!!"; 448 | print rec; 449 | } 450 | if(rec ?$ qtype_name){ 451 | switch(rec$qtype_name){ 452 | case "A": 453 | # print "type A"; 454 | # print fmt("host %s's query field: %s", rec$id$orig_h, rec$query); 455 | break; 456 | case "AAAA": 457 | # print "type AAAA"; 458 | break; 459 | case "CNAME": 460 | # print "type CNAME"; 461 | break; 462 | case "PTR": 463 | # print "type PTR"; 464 | break; 465 | case "MX": 466 | # print "type MX"; 467 | break; 468 | case "NS": 469 | # print "type NS"; 470 | break; 471 | default: 472 | # print fmt("unexpected type: %s", rec$qtype_name); 473 | break; 474 | } 475 | } 476 | # Unfortunately, it is not the hostname. 
:( 477 | } 478 | } 479 | } 480 | } 481 | 482 | event dns_mapping_valid(dm: dns_mapping){ 483 | print "dns_mapping_valid"; 484 | print dm; 485 | } 486 | 487 | event dns_mapping_altered(dm: dns_mapping, old_addrs: addr_set, new_addrs: addr_set){ 488 | print "dns_mapping_altered"; 489 | print dm; 490 | } 491 | 492 | event dns_mapping_lost_name(dm: dns_mapping){ 493 | print "dns_mapping_lost_name"; 494 | print dm; 495 | } 496 | 497 | event dns_mapping_new_name(dm: dns_mapping){ 498 | print "dns_mapping_new_name"; 499 | print dm; 500 | } 501 | 502 | event dns_mapping_unverified(dm: dns_mapping){ 503 | print "dns_mapping_unverified"; 504 | print dm; 505 | } 506 | 507 | 508 | 509 | event ntlm_authenticate(c: connection, request: NTLM::Authenticate){ 510 | print c; 511 | print request; 512 | if(request ?$ user_name){ 513 | print fmt("username: %s", request$user_name); 514 | } 515 | } 516 | 517 | # collect more protocol information here 518 | event protocol_confirmation(c: connection, atype: Analyzer::Tag, aid: count){ 519 | local src_ip: addr; 520 | local dst_ip: addr; 521 | local protocol: string; 522 | if(c$id ?$ orig_h && c$id ?$ resp_h){ 523 | src_ip = c$id$orig_h; 524 | dst_ip = c$id$resp_h; 525 | } 526 | switch(atype){ 527 | case Analyzer::ANALYZER_AYIYA: 528 | protocol = "ayiya"; 529 | break; 530 | case Analyzer::ANALYZER_BACKDOOR: 531 | protocol = "backdoor"; 532 | break; 533 | case Analyzer::ANALYZER_BITTORRENT: 534 | protocol = "bittorrent"; 535 | break; 536 | case Analyzer::ANALYZER_BITTORRENTTRACKER: 537 | protocol = "bittorrenttracker"; 538 | break; 539 | case Analyzer::ANALYZER_CONNSIZE: 540 | protocol = "connsize";#?? 541 | break; 542 | case Analyzer::ANALYZER_DCE_RPC: 543 | protocol = "dce_rpc"; 544 | break; 545 | case Analyzer::ANALYZER_DHCP: 546 | protocol = "dhcp"; 547 | break; 548 | case Analyzer::ANALYZER_DNP3_TCP: 549 | protocol = "dnp3_tcp"; 550 | break; 551 | case Analyzer::ANALYZER_DNP3_UDP: 552 | protocol = "dnp3_udp"; 553 | break; 554 | case Analyzer::ANALYZER_CONTENTS_DNS: 555 | protocol = "contents_dns"; 556 | break; 557 | case Analyzer::ANALYZER_DNS: 558 | protocol = "dns"; 559 | break; 560 | case Analyzer::ANALYZER_FTP_DATA: 561 | protocol = "ftp_data"; 562 | break; 563 | case Analyzer::ANALYZER_IRC_DATA: 564 | protocol = "irc_data"; 565 | break; 566 | case Analyzer::ANALYZER_FINGER: 567 | protocol = "finger"; 568 | break; 569 | case Analyzer::ANALYZER_FTP: 570 | protocol = "ftp"; 571 | break; 572 | case Analyzer::ANALYZER_FTP_ADAT: 573 | protocol = "ftp_adat"; 574 | break; 575 | case Analyzer::ANALYZER_GNUTELLA: 576 | protocol = "gnutella"; 577 | break; 578 | case Analyzer::ANALYZER_GSSAPI: 579 | protocol = "gssapi"; 580 | break; 581 | case Analyzer::ANALYZER_GTPV1: 582 | protocol = "gtpv1"; 583 | break; 584 | case Analyzer::ANALYZER_HTTP: 585 | protocol = "http"; 586 | break; 587 | case Analyzer::ANALYZER_ICMP: 588 | protocol = "icmp"; 589 | break; 590 | case Analyzer::ANALYZER_IDENT: 591 | protocol = "ident"; 592 | break; 593 | case Analyzer::ANALYZER_IMAP: 594 | protocol = "imap"; 595 | break; 596 | case Analyzer::ANALYZER_INTERCONN: 597 | protocol = "interconn"; 598 | break; 599 | case Analyzer::ANALYZER_IRC: 600 | protocol = "irc"; 601 | break; 602 | case Analyzer::ANALYZER_KRB: 603 | protocol = "krb"; 604 | break; 605 | case Analyzer::ANALYZER_KRB_TCP: 606 | protocol = "krb_tcp";# the previous one is its substring, how to handle this situation? 
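# (Same substring pitfall as with ips: "krb" is a substring of "krb_tcp", so the
# quick `protocol !in hostlist[index]$protocols` test in update_single_host can
# treat krb as already present; the split on "," and ":" there keeps the counts
# exact, though a new protocol whose name is a substring of an existing one is
# still never appended by that branch.)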
607 | break; 608 | case Analyzer::ANALYZER_CONTENTS_RLOGIN: 609 | protocol = "contents_rlogin"; 610 | break; 611 | case Analyzer::ANALYZER_CONTENTS_RSH: 612 | protocol = "contents_rsh"; 613 | break; 614 | case Analyzer::ANALYZER_LOGIN: 615 | protocol = "login"; 616 | break; 617 | case Analyzer::ANALYZER_NVT: 618 | protocol = "nvt"; 619 | break; 620 | case Analyzer::ANALYZER_RLOGIN: 621 | protocol = "rlogin"; 622 | break; 623 | case Analyzer::ANALYZER_RSH: 624 | protocol = "rsh"; 625 | break; 626 | case Analyzer::ANALYZER_TELNET: 627 | protocol = "telnet"; 628 | break; 629 | case Analyzer::ANALYZER_MODBUS: 630 | protocol = "modbus"; 631 | break; 632 | case Analyzer::ANALYZER_MYSQL: 633 | protocol = "mysql"; 634 | break; 635 | case Analyzer::ANALYZER_CONTENTS_NCP: 636 | protocol = "contents_ncp"; 637 | break; 638 | case Analyzer::ANALYZER_NCP: 639 | protocol = "ncp"; 640 | break; 641 | case Analyzer::ANALYZER_CONTENTS_NETBIOSSSN: 642 | protocol = "contents_netbiosssn"; 643 | break; 644 | case Analyzer::ANALYZER_NETBIOSSSN: 645 | protocol = "netbiosssn"; 646 | break; 647 | case Analyzer::ANALYZER_NTLM: 648 | protocol = "ntlm"; 649 | break; 650 | case Analyzer::ANALYZER_NTP: 651 | protocol = "ntp"; 652 | break; 653 | case Analyzer::ANALYZER_PIA_TCP: 654 | protocol = "pia_tcp"; 655 | break; 656 | case Analyzer::ANALYZER_PIA_UDP: 657 | protocol = "pia_udp"; 658 | break; 659 | case Analyzer::ANALYZER_POP3: 660 | protocol = "pop3"; 661 | break; 662 | case Analyzer::ANALYZER_RADIUS: 663 | protocol = "radius"; 664 | break; 665 | case Analyzer::ANALYZER_RDP: 666 | protocol = "rdp"; 667 | break; 668 | case Analyzer::ANALYZER_RFB: 669 | protocol = "rfb"; 670 | break; 671 | case Analyzer::ANALYZER_CONTENTS_NFS: 672 | protocol = "contents_nfs"; 673 | break; 674 | case Analyzer::ANALYZER_CONTENTS_RPC: 675 | protocol = "contents_rpc"; 676 | break; 677 | case Analyzer::ANALYZER_MOUNT: 678 | protocol = "mount"; 679 | break; 680 | case Analyzer::ANALYZER_NFS: 681 | protocol = "nfs"; 682 | break; 683 | case Analyzer::ANALYZER_PORTMAPPER: 684 | protocol = "portmapper"; 685 | break; 686 | case Analyzer::ANALYZER_SIP: 687 | protocol = "sip"; 688 | break; 689 | case Analyzer::ANALYZER_CONTENTS_SMB: 690 | protocol = "contents_smb"; 691 | break; 692 | case Analyzer::ANALYZER_SMB: 693 | protocol = "smb"; 694 | break; 695 | case Analyzer::ANALYZER_SMTP: 696 | protocol = "smtp"; 697 | break; 698 | case Analyzer::ANALYZER_SNMP: 699 | protocol = "snmp"; 700 | break; 701 | case Analyzer::ANALYZER_SOCKS: 702 | protocol = "socks"; 703 | break; 704 | case Analyzer::ANALYZER_SSH: 705 | protocol = "ssh"; 706 | break; 707 | case Analyzer::ANALYZER_DTLS: 708 | protocol = "dtls"; 709 | break; 710 | case Analyzer::ANALYZER_SSL: 711 | protocol = "ssl"; 712 | break; 713 | case Analyzer::ANALYZER_STEPPINGSTONE: 714 | protocol = "steppingstone"; 715 | break; 716 | case Analyzer::ANALYZER_SYSLOG: 717 | protocol = "syslog"; 718 | break; 719 | case Analyzer::ANALYZER_CONTENTLINE: 720 | protocol = "contentline"; 721 | break; 722 | case Analyzer::ANALYZER_CONTENTS: 723 | protocol = "contents"; 724 | break; 725 | case Analyzer::ANALYZER_TCP: 726 | protocol = "tcp"; 727 | break; 728 | case Analyzer::ANALYZER_TCPSTATS: 729 | protocol = "tcpstats"; 730 | break; 731 | case Analyzer::ANALYZER_TEREDO: 732 | protocol = "teredo"; 733 | break; 734 | case Analyzer::ANALYZER_UDP: 735 | protocol = "udp"; 736 | break; 737 | case Analyzer::ANALYZER_XMPP: 738 | protocol = "xmpp"; 739 | break; 740 | case Analyzer::ANALYZER_ZIP: 741 | protocol = "zip"; 742 
| break;
743 | default:
744 |     print "Unexpected error: unknown protocol type!";
745 |     protocol = "error";
746 |     break;
747 | }
748 | if(protocol == "error")
749 |     return;
750 | # both endpoints share the same protocol
751 | local rec1: HOST_INFO::host_info = [$ts = network_time(), $ip = src_ip, $description = protocol ];
752 | local rec2: HOST_INFO::host_info = [$ts = network_time(), $ip = dst_ip, $description = protocol ];
753 | update_hostlist(rec1, protocol);
754 | update_hostlist(rec2, protocol);
755 | Log::write(HOST_INFO::HOST_INFO_LOG, rec1);
756 | Log::write(HOST_INFO::HOST_INFO_LOG, rec2);
757 | # print "a new protocol is logged";
758 | }
759 | 
760 | # try to get software info
761 | # Unfortunately, these events have never been triggered
762 | event software_unparsed_version_found(c: connection, host: addr, str: string){
763 |     # fill the app record
764 | }
765 | 
766 | event software_version_found(c: connection, host: addr, s: software, descr: string){
767 |     # fill the app record
768 | }
769 | 
770 | event bro_init() &priority=10{
771 |     # create our log streams at the very beginning
772 |     Log::create_stream(HOST_INFO::HOST_INFO_LOG, [$columns=host_info, $path="host-info"]);
773 |     # the other log stream outputs a summary of host-info
774 |     Log::create_stream(HOST_INFO::SUMMARY_HOST_LOG, [$columns=host_info, $path="host-summary"]);
775 |     # likewise, create the log stream for the content the knowledge graph will ingest
776 |     Log::create_stream(HOST_INFO::NET_EVENTS_LOG, [$columns=event_info, $path="network_events"]);# NET_EVENTS_LOG stores the knowledge in "triple" form
777 |     # some useless fields are filtered out
778 |     local filter: Log::Filter = [$name="without_description", $path="simple_hosts",
779 |         $include=set("ip","hostname","username","mac","os","ips","protocols")];
780 |     Log::add_filter(HOST_INFO::SUMMARY_HOST_LOG, filter);
781 | }
782 | 
783 | # basic packet events
784 | # A raw packet header, consisting of the L2 header and everything in pkt_hdr.
785 | # 比起packet_contents,raw_packet提供的信息更少,而且bro提出这两个事件的开销很大 786 | event raw_packet(p: raw_pkt_hdr){ 787 | print "raw_packet!"; 788 | print p; 789 | # print p; 790 | # if(p?$l2){ 791 | # print p$l2; 792 | # } else { 793 | # print "no l2"; 794 | # } 795 | # if(p?$ip){ 796 | # print p$ip; 797 | # } else { 798 | # print "no ip field"; 799 | # } 800 | # if(p?$ip6){ 801 | # print p$ip6; 802 | # } else { 803 | # print "no ip6 field"; 804 | # } 805 | # if(p?$tcp){ 806 | # print p$tcp; 807 | # } else { 808 | # print "no tcp field"; 809 | # } 810 | # if(p?$udp){ 811 | # print p$udp; 812 | # } else { 813 | # print "no udp field"; 814 | # } 815 | # if(p?$icmp){ 816 | # print p$icmp; 817 | # } else { 818 | # print "no icmp field"; 819 | # } 820 | } 821 | 822 | event packet_contents(c: connection, contents: string){ 823 | print "packet_contents!"; 824 | # print c$id$resp_p; 825 | if(c$id$resp_p == 111/udp){ 826 | # 这种情况视作一个rpc事件,对应phase2中的135个分组(总共148个) 827 | print "portmapper protocol"; 828 | print c; 829 | print contents; 830 | num_packets += 1; 831 | } else { 832 | print c$id$resp_p; 833 | } 834 | # print contents; 835 | # p_num -= 1; 836 | } 837 | 838 | 839 | # 数据集分析的事件,同样要关心里面涉及的主机信息 840 | # 网络流量图谱的基于bro日志构建,转为Gremlin脚本(属性设定,加点加边)输出 841 | # 网络流量图谱的分析计算依赖于Gremlin提供的强大的图计算能力 842 | # phase-1-dump 843 | event icmp_echo_request(c: connection, icmp: icmp_conn, id: count, seq: count, payload: string){ 844 | print "icmp_echo_request!"; 845 | # print c; 846 | # 记录资产,主机即资产 847 | if(c$id ?$ orig_h && c$id ?$ resp_h){ 848 | local rec1: HOST_INFO::host_info = [$ts = network_time(), $ip = c$id$orig_h, $description = "icmp_echo_request"]; 849 | local rec2: HOST_INFO::host_info = [$ts = network_time(), $ip = c$id$resp_h, $description = "icmp_echo_request"]; 850 | update_hostlist(rec1, "icmp"); 851 | Log::write(HOST_INFO::HOST_INFO_LOG, rec1); 852 | update_hostlist(rec2, "icmp"); 853 | Log::write(HOST_INFO::HOST_INFO_LOG, rec2); 854 | } 855 | # 记录事件,事件以边的形式呈现,必须连接两个点 856 | local t: time = network_time(); 857 | local rec3: HOST_INFO::event_info = [$ts = network_time(), $real_time = fmt("%s", strftime("%Y-%m-%d-%H:%M:%S", t)), 858 | $event_type = ICMP_ECHO_REQUEST, $src_ip = c$id$orig_h, $src_p = c$id$orig_p, 859 | $dst_ip = c$id$resp_h, $dst_p = c$id$resp_p]; 860 | Log::write(HOST_INFO::NET_EVENTS_LOG, rec3); 861 | # print icmp; 862 | } 863 | 864 | event icmp_echo_reply(c: connection, icmp: icmp_conn, id: count, seq: count, payload: string){ 865 | print "icmp_echo_reply!"; 866 | # 记录资产,主机即资产 867 | if(c$id ?$ orig_h && c$id ?$ resp_h){ 868 | local rec1: HOST_INFO::host_info = [$ts = network_time(), $ip = c$id$orig_h, $description = "icmp_echo_reply"]; 869 | local rec2: HOST_INFO::host_info = [$ts = network_time(), $ip = c$id$resp_h, $description = "icmp_echo_reply"]; 870 | update_hostlist(rec1, "icmp"); 871 | Log::write(HOST_INFO::HOST_INFO_LOG, rec1); 872 | update_hostlist(rec2, "icmp"); 873 | Log::write(HOST_INFO::HOST_INFO_LOG, rec2); 874 | } 875 | # 记录事件,事件以边的形式呈现,必须连接两个点 876 | local t: time = network_time(); 877 | local rec3: HOST_INFO::event_info = [$ts = network_time(), $real_time = fmt("%s", strftime("%Y-%m-%d-%H:%M:%S", t)), 878 | $event_type = ICMP_ECHO_REPLY, $src_ip = c$id$orig_h, $src_p = c$id$orig_p, 879 | $dst_ip = c$id$resp_h, $dst_p = c$id$resp_p]; 880 | Log::write(HOST_INFO::NET_EVENTS_LOG, rec3); 881 | # print icmp; 882 | } 883 | 884 | event icmp_time_exceeded(c: connection, icmp: icmp_conn, code: count, context: icmp_context){ 885 | print "icmp_time_exceeded!"; 886 | } 887 | 888 | event 
icmp_error_message(c: connection, icmp: icmp_conn, code: count, context: icmp_context){ 889 | print "icmp_error_message!"; 890 | } 891 | 892 | event icmp_neighbor_advertisement(c: connection, icmp: icmp_conn, router: bool, solicited: bool, 893 | override: bool, tgt: addr, options: icmp6_nd_options){ 894 | print "icmp_neighbor_advertisement!"; 895 | } 896 | 897 | event icmp_neighbor_solicitation(c: connection, icmp: icmp_conn, tgt: addr, options: icmp6_nd_options){ 898 | print "icmp_neighbor_solicitation!"; 899 | } 900 | 901 | event icmp_packet_too_big(c: connection, icmp: icmp_conn, code: count, context: icmp_context){ 902 | print "icmp_packet_too_big!"; 903 | } 904 | 905 | event icmp_parameter_problem(c: connection, icmp: icmp_conn, code: count, context: icmp_context){ 906 | print "icmp_parameter_problem!"; 907 | } 908 | 909 | event icmp_redirect(c: connection, icmp: icmp_conn, tgt: addr, dest: addr, options: icmp6_nd_options){ 910 | print "icmp_redirect!"; 911 | } 912 | 913 | event icmp_router_advertisement(c: connection, icmp: icmp_conn, cur_hop_limit: count, managed: bool, 914 | other: bool, home_agent: bool, pref: count, proxy: bool, res: count, router_lifetime: interval, 915 | reachable_time: interval, retrans_timer: interval, options: icmp6_nd_options){ 916 | print "icmp_router_advertisement!"; 917 | } 918 | 919 | event icmp_router_solicitation(c: connection, icmp: icmp_conn, options: icmp6_nd_options){ 920 | print "icmp_router_solicitation!"; 921 | } 922 | 923 | event icmp_sent(c: connection, icmp: icmp_conn){ 924 | print "icmp_sent!"; 925 | } 926 | 927 | event icmp_sent_payload(c: connection, icmp: icmp_conn, payload: string){ 928 | print "icmp_sent_payload!"; 929 | } 930 | 931 | event icmp_unreachable(c: connection, icmp: icmp_conn, code: count, context: icmp_context){ 932 | print "icmp_unreachable!"; 933 | # 记录资产,主机即资产 934 | if(c$id ?$ orig_h && c$id ?$ resp_h){ 935 | local rec1: HOST_INFO::host_info = [$ts = network_time(), $ip = c$id$orig_h, $description = "icmp_unreachable"]; 936 | local rec2: HOST_INFO::host_info = [$ts = network_time(), $ip = c$id$resp_h, $description = "icmp_unreachable"]; 937 | update_hostlist(rec1, "icmp"); 938 | Log::write(HOST_INFO::HOST_INFO_LOG, rec1); 939 | update_hostlist(rec2, "icmp"); 940 | Log::write(HOST_INFO::HOST_INFO_LOG, rec2); 941 | } 942 | # 记录事件,事件以边的形式呈现,必须连接两个点 943 | local t: time = network_time(); 944 | local rec3: HOST_INFO::event_info = [$ts = network_time(), $real_time = fmt("%s", strftime("%Y-%m-%d-%H:%M:%S", t)), 945 | $event_type = ICMP_UNREACHABLE, $src_ip = c$id$orig_h, $src_p = c$id$orig_p, 946 | $dst_ip = c$id$resp_h, $dst_p = c$id$resp_p]; 947 | Log::write(HOST_INFO::NET_EVENTS_LOG, rec3); 948 | # print icmp; 949 | } 950 | 951 | # phase-2-dump 952 | # pm related 953 | # 阶段2需要自己定义事件了,自带的事件没有触发 954 | # 从new_packet,packet_contents出发 955 | event mount_proc_mnt(c: connection, info: MOUNT3::info_t, req: MOUNT3::dirmntargs_t, rep: MOUNT3::mnt_reply_t){ 956 | print "mount_proc_mnt!"; 957 | } 958 | 959 | event mount_proc_not_implemented(c: connection, info: MOUNT3::info_t, proc: MOUNT3::proc_t){ 960 | print "mount_proc_not_implemented!"; 961 | } 962 | 963 | event mount_proc_null(c: connection, info: MOUNT3::info_t){ 964 | print "mount_proc_null!"; 965 | } 966 | 967 | event mount_proc_umnt(c: connection, info: MOUNT3::info_t, req: MOUNT3::dirmntargs_t){ 968 | print "mount_proc_umnt!"; 969 | } 970 | 971 | event mount_proc_umnt_all(c: connection, info: MOUNT3::info_t, req: MOUNT3::dirmntargs_t){ 972 | print "mount_proc_umnt_all!"; 973 
| } 974 | 975 | event mount_reply_status(n: connection, info: MOUNT3::info_t){ 976 | print "mount_reply_status!"; 977 | } 978 | 979 | event nfs_proc_create(c: connection, info: NFS3::info_t, req: NFS3::diropargs_t, rep: NFS3::newobj_reply_t){ 980 | print "nfs_proc_create!"; 981 | } 982 | 983 | event nfs_proc_getaddr(c: connection, info: NFS3::info_t, fh: string, attrs: NFS3::fattr_t){ 984 | print "nfs_proc_getaddr!"; 985 | } 986 | 987 | event nfs_proc_link(c: connection, info: NFS3::info_t, req: NFS3::linkargs_t, rep: NFS3::link_reply_t){ 988 | print "nfs_proc_link!"; 989 | } 990 | 991 | event nfs_proc_lookup(c: connection, info: NFS3::info_t, req: NFS3::diropargs_t, rep: NFS3::lookup_reply_t){ 992 | print "nfs_proc_lookup!"; 993 | } 994 | 995 | event nfs_proc_mkdir(c: connection, info: NFS3::info_t, req: NFS3::diropargs_t, rep: NFS3::newobj_reply_t){ 996 | print "nfs_proc_mkdir!"; 997 | } 998 | 999 | event nfs_proc_not_implemented(c: connection, info: NFS3::info_t, proc: NFS3::proc_t){ 1000 | print "nfs_proc_not_implemented!"; 1001 | } 1002 | 1003 | event nfs_proc_null(c: connection, info: NFS3::info_t){ 1004 | print "nfs_proc_null!"; 1005 | } 1006 | 1007 | event nfs_proc_read(c: connection, info: NFS3::info_t, req: NFS3::readargs_t, rep: NFS3::read_reply_t){ 1008 | print "nfs_proc_read!"; 1009 | } 1010 | 1011 | event nfs_proc_readdir(c: connection, info: NFS3::info_t, req: NFS3::readdirargs_t, rep: NFS3::readdir_reply_t){ 1012 | print "nfs_proc_readdir!"; 1013 | } 1014 | 1015 | event nfs_proc_readlink(c: connection, info: NFS3::info_t, fh: string, rep: NFS3::readlink_reply_t){ 1016 | print "nfs_proc_readlink!"; 1017 | } 1018 | 1019 | event nfs_proc_remove(c: connection, info: NFS3::info_t, req: NFS3::diropargs_t, rep: NFS3::delobj_reply_t){ 1020 | print "nfs_proc_remove!"; 1021 | } 1022 | 1023 | event nfs_proc_rename(c: connection, info: NFS3::info_t, req: NFS3::renameopargs_t, rep: NFS3::renameobj_reply_t){ 1024 | print "nfs_proc_rename!"; 1025 | } 1026 | 1027 | event nfs_proc_rmdir(c: connection, info: NFS3::info_t, req: NFS3::diropargs_t, rep: NFS3::delobj_reply_t){ 1028 | print "nfs_proc_rmdir!"; 1029 | } 1030 | 1031 | event nfs_proc_sattr(c: connection, info: NFS3::info_t, req: NFS3::sattrargs_t, rep: NFS3::sattr_reply_t){ 1032 | print "nfs_proc_sattr!"; 1033 | } 1034 | 1035 | event nfs_proc_symlink(c: connection, info: NFS3::info_t, req: NFS3::symlinkargs_t, rep: NFS3::newobj_reply_t){ 1036 | print "nfs_proc_symlink!"; 1037 | } 1038 | 1039 | event nfs_proc_write(c: connection, info: NFS3::info_t, req: NFS3::writeargs_t, rep: NFS3::write_reply_t){ 1040 | print "nfs_proc_write!"; 1041 | } 1042 | 1043 | event nfs_reply_status(n: connection, info: NFS3::info_t){ 1044 | print "nfs_reply_status!"; 1045 | } 1046 | 1047 | #--上面是关于nfs的调用事件-- 1048 | 1049 | event pm_attempt_getport(r: connection, status: rpc_status, pr: pm_port_request){ 1050 | print "pm_attempt_getport!"; 1051 | } 1052 | 1053 | event pm_attempt_dump(r: connection, status: rpc_status){ 1054 | print "pm_attempt_dump!"; 1055 | } 1056 | 1057 | event pm_attempt_callit(r: connection, status: rpc_status, call: pm_callit_request){ 1058 | print "pm_attempt_callit!"; 1059 | } 1060 | 1061 | event pm_attempt_null(r: connection, status: rpc_status){ 1062 | print "pm_attempt_null!"; 1063 | } 1064 | 1065 | event pm_attempt_set(r: connection, status: rpc_status, m: pm_mapping){ 1066 | print "pm_attempt_set!"; 1067 | } 1068 | 1069 | event pm_attempt_unset(r: connection, status: rpc_status, m: pm_mapping){ 1070 | print "pm_attempt_unset!"; 
1071 | } 1072 | 1073 | event pm_bad_port(r: connection, bad_p: count){ 1074 | print "pm_bad_port!"; 1075 | } 1076 | 1077 | event pm_request_callit(r: connection, call: pm_callit_request, p: port){ 1078 | print "pm_request_callit!"; 1079 | } 1080 | 1081 | event pm_request_dump(r: connection, m: pm_mappings){ 1082 | print "pm_request_dump!"; 1083 | } 1084 | 1085 | event pm_request_getport(r: connection, pr: pm_port_request, p: port){ 1086 | print "pm_request_getport!"; 1087 | } 1088 | 1089 | event pm_request_null(r: connection){ 1090 | print "pm_request_null!"; 1091 | } 1092 | 1093 | event pm_request_set(r: connection, m: pm_mapping, success: bool){ 1094 | print "pm_request_set!"; 1095 | } 1096 | 1097 | event pm_request_unset(r: connection, m: pm_mapping, success: bool){ 1098 | print "pm_request_unset!"; 1099 | } 1100 | 1101 | event rpc_call(c: connection, xid: count, prog: count, ver: count, proc: count, call_len: count){ 1102 | print "rpc_call!"; 1103 | } 1104 | 1105 | event rpc_dialogue(c: connection, prog: count, ver: count, proc: count, status: rpc_status, start_time: time, call_len: count, reply_len: count){ 1106 | print "rpc_dialogue!"; 1107 | } 1108 | 1109 | event rpc_reply(c: connection, xid: count, status: rpc_status, reply_len: count){ 1110 | print "rpc_reply!"; 1111 | } 1112 | # 上面是关于pm和rpc的,可惜一个都没有触发 1113 | # 考虑包内容中有resp_p=111/udp,其中111是portmapper的端口号得知此包与portmapper相关 1114 | # 如何通过bro得知rpc调用了sadmind守护进程? 1115 | 1116 | 1117 | event bro_done(){ 1118 | print "finish"; 1119 | for(i in hostlist){ 1120 | local rec: HOST_INFO::host_info = hostlist[i]; 1121 | Log::write(HOST_INFO::SUMMARY_HOST_LOG, rec); 1122 | } 1123 | # local rec1: HOST_INFO::kg_info = [$ts=network_time(), $A=" ", $predicate=ICMP_ECHO_REQUEST, $B=" "];# 三元组日志测试数据 1124 | # Log::write(HOST_INFO::NET_EVENTS_LOG, rec1); 1125 | print num_packets; 1126 | } -------------------------------------------------------------------------------- /updateHost.zeek: -------------------------------------------------------------------------------- 1 | # author: wangyixuan 2 | # It aims at getting hosts' 3 | # USERNAME, 0 pay attention to NTLM PLZ 4 | # HOSTNAME, 1 5 | # MAC ADDRESS, 1 6 | # OPERATING SYSTEM, 1 7 | # IP ADDRESS 1 8 | 9 | # How to sort ips? 10 | # Type (indicates devicetype, etc desktop, laptop, tablet) 11 | # Applications (Why do we need it? Should we guess during which period such applications are running?) 12 | # Protocols (so many protocols, how to handle them? It exists between two hosts.) 13 | 14 | # Maintain the information of hosts all the time and output it to log file regularly 15 | 16 | # completed two functions named update_hostlist and update_single_host 17 | 18 | # event new_connection: collect various protocols which are indicated by connections 19 | # event protocol_confirmation: this event is emitted when bro confirms that this protocol is actually running here 20 | # problem to solve: whether the protocols comes is a new protocol? Using !in is not appropriate. 
21 | 22 | # adjust the format of protocols etc: http:33,dns:14 23 | # data to log cannot be a table 24 | 25 | # how to invoke a event in a specific interval 26 | # refer to test1.bro and define an event by ourself 27 | # this user-defined event can complish the task of logging hostlist every n seconds 28 | # outside dataprocesser can read the log every n seconds as well 29 | 30 | # convert ts to the form of "YYYY:MM:DD-HH:MM:SS", which is easier to understand 31 | # in "ips": mark the timestamp of each ip 32 | # in "protocols": mark the number that indicate how many time this protocol has beem confirmed 33 | # the value of n is dynamic 34 | # there are some problems in updating ips 35 | # 1. segment fault 36 | # 2. redundant ip in "ips" field 37 | # 3, three records missing ips(uninitialized) 38 | # 4. the way to check a ip already exist? etc: 192.168.1.5, 192.168.1.50 substring is not reliable 39 | # @load /home/lw/myKGA/signature_test.bro 40 | @load /usr/local/zeek/share/zeek/policy/frameworks/dpd/detect-protocols.zeek 41 | @load /usr/local/zeek/share/zeek/policy/frameworks/dpd/packet-segment-logging.zeek 42 | 43 | module HOST_INFO; 44 | 45 | const pm_ports = { 111/udp, 111/tcp }; 46 | const telnet_ports = { 23/tcp }; 47 | const rsh_ports = { 514/tcp }; 48 | redef likely_server_ports += {pm_ports, telnet_ports, rsh_ports}; 49 | 50 | redef ProtocolDetector::valids += {[Analyzer::ANALYZER_PORTMAPPER, 0.0.0.0, 111/udp] = ProtocolDetector::BOTH}; 51 | 52 | # declarations of my own events 53 | global portmapper_call: function(c: connection); 54 | 55 | global event_counts: int = 0; 56 | 57 | const analyzer_tags: set[Analyzer::Tag] = { 58 | Analyzer::ANALYZER_AYIYA, 59 | Analyzer::ANALYZER_BITTORRENT, 60 | Analyzer::ANALYZER_BITTORRENTTRACKER, 61 | Analyzer::ANALYZER_CONNSIZE, 62 | Analyzer::ANALYZER_DCE_RPC, 63 | Analyzer::ANALYZER_DHCP, 64 | Analyzer::ANALYZER_DNP3_TCP, 65 | Analyzer::ANALYZER_DNP3_UDP, 66 | Analyzer::ANALYZER_CONTENTS_DNS, 67 | Analyzer::ANALYZER_DNS, 68 | Analyzer::ANALYZER_FTP_DATA, 69 | Analyzer::ANALYZER_IRC_DATA, 70 | Analyzer::ANALYZER_FINGER, 71 | Analyzer::ANALYZER_FTP, 72 | Analyzer::ANALYZER_FTP_ADAT, 73 | Analyzer::ANALYZER_GNUTELLA, 74 | Analyzer::ANALYZER_GSSAPI, 75 | Analyzer::ANALYZER_GTPV1, 76 | Analyzer::ANALYZER_HTTP, 77 | Analyzer::ANALYZER_ICMP, 78 | Analyzer::ANALYZER_IDENT, 79 | Analyzer::ANALYZER_IMAP, 80 | Analyzer::ANALYZER_IRC, 81 | Analyzer::ANALYZER_KRB, 82 | Analyzer::ANALYZER_KRB_TCP, 83 | Analyzer::ANALYZER_CONTENTS_RLOGIN, 84 | Analyzer::ANALYZER_CONTENTS_RSH, 85 | Analyzer::ANALYZER_LOGIN, 86 | Analyzer::ANALYZER_NVT, 87 | Analyzer::ANALYZER_RLOGIN, 88 | Analyzer::ANALYZER_RSH, 89 | Analyzer::ANALYZER_TELNET, 90 | Analyzer::ANALYZER_MODBUS, 91 | Analyzer::ANALYZER_MQTT, 92 | Analyzer::ANALYZER_MYSQL, 93 | Analyzer::ANALYZER_CONTENTS_NCP, 94 | Analyzer::ANALYZER_NCP, 95 | Analyzer::ANALYZER_CONTENTS_NETBIOSSSN, 96 | Analyzer::ANALYZER_NETBIOSSSN, 97 | Analyzer::ANALYZER_NTLM, 98 | Analyzer::ANALYZER_NTP, 99 | Analyzer::ANALYZER_PIA_TCP, 100 | Analyzer::ANALYZER_PIA_UDP, 101 | Analyzer::ANALYZER_POP3, 102 | Analyzer::ANALYZER_RADIUS, 103 | Analyzer::ANALYZER_RDP, 104 | Analyzer::ANALYZER_RFB, 105 | Analyzer::ANALYZER_CONTENTS_NFS, 106 | Analyzer::ANALYZER_CONTENTS_RPC, 107 | Analyzer::ANALYZER_MOUNT, 108 | Analyzer::ANALYZER_NFS, 109 | Analyzer::ANALYZER_PORTMAPPER, 110 | Analyzer::ANALYZER_SIP, 111 | Analyzer::ANALYZER_CONTENTS_SMB, 112 | Analyzer::ANALYZER_SMB, 113 | Analyzer::ANALYZER_SMTP, 114 | Analyzer::ANALYZER_SNMP, 115 | 
Analyzer::ANALYZER_SOCKS, 116 | Analyzer::ANALYZER_SSH, 117 | Analyzer::ANALYZER_DTLS, 118 | Analyzer::ANALYZER_SSL, 119 | Analyzer::ANALYZER_STEPPINGSTONE, 120 | Analyzer::ANALYZER_SYSLOG, 121 | Analyzer::ANALYZER_CONTENTLINE, 122 | Analyzer::ANALYZER_CONTENTS, 123 | Analyzer::ANALYZER_TCP, 124 | Analyzer::ANALYZER_TCPSTATS, 125 | Analyzer::ANALYZER_TEREDO, 126 | Analyzer::ANALYZER_UDP, 127 | Analyzer::ANALYZER_VXLAN, 128 | Analyzer::ANALYZER_XMPP, 129 | Analyzer::ANALYZER_ZIP 130 | # (ANALYZER_TELNET and ANALYZER_RSH are already listed above) 131 | 132 | }; 133 | 134 | export{ 135 | # Create an ID for our new stream. By convention, this is 136 | # called "HOST_INFO_LOG". 137 | redef enum Log::ID += { HOST_INFO_LOG, 138 | SUMMARY_HOST_LOG, 139 | NET_EVENTS_LOG, 140 | ATTACK_PATTERN_EVENT_LOG };# NET_EVENTS_LOG records the important network events (or packets) that serve as input to the KG analysis; the BRO script analyzes the multi-step-attack dataset 141 | 142 | # Attack patterns care more about topology than about bulk data attributes; we borrow zeek's logging facility to emit easy-to-use pattern-vertex and pattern-edge files 143 | # Better to record only edges and update vertices and edges together 144 | type pattern_event: record{ 145 | name: string &log;# label of the vertices at the two ends of the edge, following the attack_pattern_n naming scheme 146 | id: int &log;# one pattern contains several edges; each edge is further distinguished by an id 147 | event_type: string &log;# corresponds to the type of the underlying basic event 148 | edge_content: string &log;# a string of the form "1>2"; after splitting it, the edge is added first and then the vertices; 1 and 2 are contained in the vertices' name 149 | }; 150 | 151 | # Define the predicate types for triples; the output format is HOST_INFO::ICMP_ECHO_REQUEST 152 | # To add an event, three places must change: the relation type, the rec3 record written to the log, and the edge labels in generate_graph.py (two spots there) 153 | type relation: enum { 154 | Empty, ICMP_ECHO_REQUEST, ICMP_ECHO_REPLY, ICMP_UNREACHABLE, RPC_REPLY, RPC_CALL, PORTMAP, NEW_CONNECTION_CONTENTS, 155 | CONNECTION_SYN_PACKET, TCP_PACKET, CONNECTION_ESTABLISHED, CONNECTION_FIRST_ACK, CONNECTION_EOF, CONNECTION_FINISHED, 156 | CONNECTION_PENDING, LOGIN_OUTPUT_LINE, LOGIN_INPUT_LINE, LOGIN_CONFUSED, LOGIN_CONFUSED_TEXT, LOGIN_SUCCESS, RSH_REQUEST, 157 | RSH_REPLY, CONNECTION_ATTEMPT, LOGIN_TERMINAL, CONNECTION_HALF_FINISHED, LOGIN_DISPLAY, HTTP_EVENT, HTTP_STATS, HTTP_END_ENTITY, 158 | HTTP_MESSAGE_DONE, HTTP_CONTENT_TYPE, HTTP_ALL_HEADERS, HTTP_REPLY, HTTP_HEADER, HTTP_BEGIN_ENTITY, HTTP_ENTITY_DATA 159 | }; 160 | # unfortunately, its json format is incorrect 161 | # We need to handle the json output line by line 162 | # redef LogAscii::use_json = T; 163 | # Define the record type that will contain the data to log. 164 | type host_info: record{ 165 | ts: time &log; 166 | ip: addr &log;# indicates the newest ip 167 | ips: string &default="" &log; # historical ips, ordered by their timestamps 168 | username: string &default="" &log; 169 | hostname: string &default="" &log; 170 | mac: string &default="" &log; 171 | os: string &default="" &log; 172 | description: string &default="" &log; 173 | protocols: string &default="" &log; # list all of its protocols 174 | }; 175 | 176 | # Define another record to store triple events (A, relation, B), which are really (subject, predicate, object) triples 177 | # Storage options for triple events: 1. triple table 2. horizontal table 3. property table 4. vertical partitioning 5. sextuple indexing 6. DB2RDF 178 | # Or store them in RDF so that SPARQL queries become possible later? 179 | # The data volume is huge, so consider aggregating triples (dropping useless ones) => analogous to the experience-based aggregation in the NJUST paper (dropping less important alert information) 180 | # Triples need not be limited to the lowest traffic level; there should also be alert-level triples (but where would those come from? existing events, or inference?) 181 | 182 | # Update: represent events as edges; to cover all cases an edge can carry many attributes 183 | # Attributes can be added gradually and refined over time 184 | # Events considered so far: 1. icmp ping events
185 | type event_info: record{ 186 | ts: time &log; 187 | real_time: string &log; 188 | event_type: relation &log; 189 | src_ip: addr &log; 190 | src_p: port &log; 191 | dst_ip: addr &log; 192 | dst_p: port &log; 193 | description: string &default="" &log; 194 | }; 195 | } 196 | 197 | # Use it to store host-info 198 | global hostlist: vector of host_info = {}; 199 | global events_not_recorded: table[string] of count = {}; 200 | global num_packets = 0; 201 | 202 | function record_event(s: string){ 203 | if(s in events_not_recorded){ 204 | events_not_recorded[s] += 1; 205 | } else { 206 | events_not_recorded[s] = 1; 207 | } 208 | } 209 | 210 | # Precondition: 0 <= index < |hostlist| 211 | # Postcondition: the corresponding item has been updated 212 | function update_single_host(hinfo: HOST_INFO::host_info, protocol: string, index: int){ 213 | # remember to initialize "ips" and "protocols" 214 | # print fmt("update index %d", index); 215 | # print hinfo; 216 | # print fmt("index is : %d", index); 217 | local tmp_ip: string = fmt("%s", hinfo$ip); 218 | local up_index: count = 0; 219 | # print fmt("the ip is %s", tmp_ip); 220 | if(hostlist[index]$ips == ""){ 221 | # print fmt("initialize ips of index %d", index); 222 | local t: time = network_time(); 223 | hostlist[index]$ips = fmt("%s", strftime("%Y-%m-%d-%H:%M:%S|", t) + tmp_ip); 224 | } 225 | if(hostlist[index]$protocols == ""){ 226 | # print fmt("initialize protocols of index %d", index); 227 | hostlist[index]$protocols = protocol + ":1"; 228 | } 229 | # check whether ip is the newest ip 230 | if(hinfo$ip != hostlist[index]$ip){ 231 | # print fmt("update ips because host's ip has been changed"); 232 | # Maybe this host uses a new ip now, so I need to concatenate "ips" 233 | # Since these messages come in order, I take it for granted that it is unnecessary to compare timestamps. 234 | hostlist[index]$ip = hinfo$ip; # update the newest ip 235 | # maybe we need a new way to determine whether the ip is new: edit the if condition 236 | if(tmp_ip !in hostlist[index]$ips){ 237 | # a new ip comes, append it to the end of ips 238 | local t1: time = network_time(); 239 | hostlist[index]$ips += fmt(",%s", strftime("%Y-%m-%d-%H:%M:%S|", t1) + tmp_ip); 240 | print "append ips"; 241 | } else { 242 | print "update ips"; 243 | # in this case, the previous ts should be updated 244 | local comma: pattern = /,/; 245 | local tmp_tlb: string_vec = split_string(hostlist[index]$ips, comma); 246 | local ori_len: count = |tmp_tlb|; 247 | # tmp_tlb_ip holds the ips in tmp_tlb and has the same index as tmp_tlb, 248 | # so we can tell unambiguously whether the incoming ip is new.
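# (Each entry of tmp_tlb has the form "YYYY-MM-DD-HH:MM:SS|ip"; the loop below
# projects the address part into tmp_tlb_ip so that the comparison is made
# against whole addresses rather than substrings.)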
249 | local tmp_tlb_ip: string_vec; 250 | for(key in tmp_tlb){ 251 | local bin_tlb: string_vec = split_string(tmp_tlb[key], /\|/); 252 | tmp_tlb_ip[key] = bin_tlb[2]; 253 | } 254 | # print fmt("previous len: %d", ori_len); 255 | # print "what is in ips now ?"; 256 | # print hostlist[index]$ips; 257 | # print "what is in tmp_tlb now?"; 258 | # print tmp_tlb; 259 | for(key in tmp_tlb_ip){# use tmp_tlb_ip to determine the key to store 260 | print key; 261 | print tmp_tlb_ip[key]; 262 | print "start checking"; 263 | if(tmp_ip == tmp_tlb_ip[key]){ 264 | # this item should be updated 265 | print "bingo"; 266 | # there is a strange segmentation fault when I try to directly overwrite tmp_tlb[key] here, 267 | # so I record the value of key instead 268 | up_index = key; 269 | # tmp_tlb[key] = fmt("%s", strftime("%Y-%m-%d-%H:%M:%S|", t2) + tmp_ip); 270 | # print fmt("the last item: %s", tmp_tlb[key]); 271 | # if(key == ori_len){ # the last item 272 | # tmp_tlb[key] = fmt("%s", strftime("%Y-%m-%d-%H:%M:%S|", t2) + tmp_ip); 273 | # print fmt("the last item: %s", tmp_tlb[key]); 274 | # } 275 | # else{ # previous items 276 | # tmp_tlb[key] = fmt("%s", strftime("%Y-%m-%d-%H:%M:%S|", t2) + tmp_ip); 277 | # print fmt("previous item: %s", tmp_tlb[key]); 278 | # } 279 | } 280 | print "end checking"; 281 | } 282 | print "before join!"; 283 | if(up_index != 0){ 284 | # up_index is applied to update tmp_tlb 285 | # from now on, tmp_tlb_ip is no longer needed 286 | local t2: time = network_time(); 287 | tmp_tlb[up_index] = fmt("%s", strftime("%Y-%m-%d-%H:%M:%S|", t2) + tmp_ip); 288 | } 289 | # for(key in tmp_tlb){ 290 | # print fmt("[%d]=>%s", key, tmp_tlb[key]); 291 | # } 292 | # hostlist[index]$ips = cat_string_array(tmp_tlb); # overwrite 293 | hostlist[index]$ips = join_string_vec(tmp_tlb, ","); 294 | # print fmt("after join:%s", hostlist[index]$ips); 295 | # recheck the number of commas in ips 296 | if(ori_len != |split_string(hostlist[index]$ips, comma)|){ 297 | print "Unexpected error: the number of commas is wrong"; 298 | print fmt("ori_len: %d, new len: %d", ori_len, |split_string(hostlist[index]$ips, comma)|); 299 | } 300 | } 301 | } 302 | # else { 303 | # print "do not update ips"; 304 | # } 305 | # check whether protocol is already associated with this host 306 | # if not: concatenate "protocols" separated by commas 307 | # this check condition is not so good; 308 | # we'd better split protocols into individual items and compare them 309 | up_index = 0; # reinitialize up_index 310 | if(protocol != "" && protocol !in hostlist[index]$protocols){ 311 | # print fmt("update protocols because a new protocol of this host found"); 312 | hostlist[index]$protocols += fmt(",%s:1", protocol); 313 | } else { 314 | # otherwise bump the count of the matching entry 315 | # print hostlist[index]$protocols; 316 | local pro_tlb: string_vec = split_string(hostlist[index]$protocols, /,/); 317 | local pro_tlb_tmp: string_vec; 318 | # print "start updating protocols"; 319 | # print pro_tlb; 320 | for(key in pro_tlb){ 321 | local bin_p_tlb: string_vec = split_string(pro_tlb[key], /:/); 322 | # print bin_p_tlb; 323 | pro_tlb_tmp[key] = bin_p_tlb[1]; 324 | } 325 | for(key in pro_tlb_tmp){ 326 | if(protocol == pro_tlb_tmp[key]){ 327 | up_index = key; # remember the matching entry 328 | break; 329 | } 330 | } 331 | # bump the count of the matched "name:count" entry exactly once, after the scan 332 | if(up_index != 0){ 333 | local bin_p_tlb1: string_vec = split_string(pro_tlb[up_index], /:/); 334 | local num_s: string = bin_p_tlb1[2]; 335 | local num_v: count = to_count(num_s);
336 | num_v += 1; 337 | pro_tlb[up_index] = fmt("%s:%d", bin_p_tlb1[1], num_v); 338 | } 339 | # write the updated protocol list back once the scan is done 340 | hostlist[index]$protocols = join_string_vec(pro_tlb, ","); 341 | } 342 | 343 | # update timestamp 344 | hostlist[index]$ts = hinfo$ts; 345 | # update hostname iff a different hostname comes 346 | if(hinfo$hostname != "" && hinfo$hostname != hostlist[index]$hostname){ 347 | # in the case of an empty string, initialize it 348 | # print fmt("initialize the hostname field of this host"); 349 | hostlist[index]$hostname = hinfo$hostname; 350 | } 351 | # update os 352 | if(hinfo$os != "" && hinfo$os != hostlist[index]$os){ 353 | # print fmt("update os field of this host"); 354 | hostlist[index]$os = hinfo$os; 355 | } 356 | # update mac 357 | # Although we confirm that mac should be set as the unique id, 358 | # we reconsider it in the second branch in update_hostlist. 359 | if(hinfo$mac != "" && hostlist[index]$mac == ""){ 360 | # initialize mac field 361 | # print fmt("initialize mac field of this host"); 362 | hostlist[index]$mac = hinfo$mac; 363 | } 364 | # update username 365 | if(hinfo$username != "" && hostlist[index]$username == ""){ 366 | # print fmt("update username field of this host"); 367 | hostlist[index]$username = hinfo$username; 368 | } 369 | } 370 | 371 | # Precondition: hinfo comes from fragmentary records 372 | # Postcondition: update the contents of hostlist with hinfo 373 | function update_hostlist(hinfo: HOST_INFO::host_info, protocol: string){ 374 | # print "prepare to update"; 375 | local has_updated: bool = F; 376 | if(hinfo$mac != "" || hinfo$hostname != ""){ 377 | # I believe that mac addresses and hostnames can uniquely identify a host. 378 | for(i in hostlist){ 379 | if(((hostlist[i]$mac == hinfo$mac) && (hinfo$mac != "")) || ((hostlist[i]$hostname == hinfo$hostname) && (hinfo$hostname != ""))){ 380 | # update 381 | update_single_host(hinfo, protocol, i); 382 | has_updated = T; 383 | break; 384 | } 385 | } 386 | if(!has_updated) { 387 | # To avoid missing ips, we should initialize "ips" when we append a new item 388 | hostlist[|hostlist|] = hinfo; 389 | local wall_time: time = network_time(); 390 | local tmp_ip: string = fmt("%s", hinfo$ip); 391 | # the -1 is needed here as well, since |hostlist| has just grown by one 392 | hostlist[|hostlist|-1]$ips += fmt("%s", strftime("%Y-%m-%d-%H:%M:%S|", wall_time) + tmp_ip); 393 | has_updated = T; 394 | } 395 | } 396 | # So that hosts discovered via icmp can be recorded, temporarily allow the ip to serve as a host's uniqueness criterion 397 | if(hinfo ?$ ip){ 398 | for(i in hostlist){ 399 | if(hostlist[i]$ip == hinfo$ip){# if it is already there, update it (though there is little to update) 400 | # print hostlist[i]$ip; 401 | # print hinfo$ip; 402 | update_single_host(hinfo, protocol, i); 403 | has_updated = T; 404 | break; 405 | } 406 | } 407 | # if it is not there, insert it 408 | if(!has_updated){ 409 | # print "a new ip comes"; 410 | # print hinfo$ip; 411 | # print |hostlist|; 412 | hostlist[|hostlist|] = hinfo; 413 | # print |hostlist|; 414 | local wall_time1: time = network_time(); 415 | local tmp_ip1: string = fmt("%s", hinfo$ip); 416 | # print hostlist[|hostlist|-1]; 417 | # |hostlist| has changed; modify the record just appended, hence the -1 below 418 | hostlist[|hostlist|-1]$ips += fmt("%s", strftime("%Y-%m-%d-%H:%M:%S|", wall_time1) + tmp_ip1); 419 | has_updated = T; 420 | # for host updates that carry only an ip, the next if branch will not be entered 421 | } 422 | } 423 | if(!has_updated){ 424 | # In this case, I can't confirm which existing host this record belongs to 425 | for(i in hostlist){ 426 | if(hinfo$ip == hostlist[i]$ip){ 427 | update_single_host(hinfo, protocol, i); 428 | has_updated = T; 429 | break; 430 | } 431 | } 432 | if(!has_updated){ 433 | # At this point, all correct info should have been updated
434 | print "incomplete information, skip it ", hinfo; 435 | } 436 | } 437 | } 438 | 439 | function update_network_event(c: connection, host_description: string, protocol: string, event_description: string, event_type_para: relation){ 440 | event_counts += 1; 441 | print fmt("%d event(s) occurred.", event_counts); 442 | # record the assets; a host is an asset 443 | if(c$id ?$ orig_h && c$id ?$ resp_h){ 444 | local rec1: HOST_INFO::host_info = [$ts = network_time(), $ip = c$id$orig_h, $description = host_description]; 445 | local rec2: HOST_INFO::host_info = [$ts = network_time(), $ip = c$id$resp_h, $description = host_description]; 446 | update_hostlist(rec1, protocol); 447 | Log::write(HOST_INFO::HOST_INFO_LOG, rec1); 448 | update_hostlist(rec2, protocol); 449 | Log::write(HOST_INFO::HOST_INFO_LOG, rec2); 450 | } 451 | # record the event; an event is rendered as an edge and must connect two vertices 452 | local t: time = network_time(); 453 | local rec3: HOST_INFO::event_info = [$ts = network_time(), $real_time = fmt("%s", strftime("%Y-%m-%d-%H:%M:%S", t)), 454 | $event_type = event_type_para, $src_ip = c$id$orig_h, $src_p = c$id$orig_p, 455 | $dst_ip = c$id$resp_h, $dst_p = c$id$resp_p, $description = event_description];# which program/port mapping took place cannot be seen from c; pm-related events have to supply that 456 | # for icmp_echo_reply the addresses should be the reverse of those in icmp_echo_request 457 | # the same goes for rsh_reply 458 | if(host_description == "icmp_echo_reply" || host_description == "rsh_reply"){ 459 | rec3$src_ip = c$id$resp_h; 460 | rec3$src_p = c$id$resp_p; 461 | rec3$dst_ip = c$id$orig_h; 462 | rec3$dst_p = c$id$orig_p; 463 | } 464 | Log::write(HOST_INFO::NET_EVENTS_LOG, rec3); 465 | } 466 | 467 | function check_ssh_hostname(id: conn_id, uid: string, host: addr){ 468 | when(local hostname = lookup_addr(host)){ 469 | local rec: HOST_INFO::host_info = [$ts = network_time(), $ip = host, $hostname = hostname, $description = "ssh_auth"]; 470 | update_hostlist(rec, "ssh"); 471 | Log::write(HOST_INFO::HOST_INFO_LOG, rec); 472 | } 473 | } 474 | 475 | # event OS_version_found(c: connection, host: addr, OS: OS_version){ 476 | # # print "an operating system has been fingerprinted"; 477 | # # print fmt("the host running this OS is %s", host); 478 | # # print OS; 479 | # if(OS$genre != "UNKNOWN"){ 480 | # local os_detail = fmt("%s %s", OS$genre, OS$detail); 481 | # local rec: HOST_INFO::host_info = [$ts = network_time(), $ip = host, $os = os_detail, $description = "OS_version_found"]; 482 | # update_hostlist(rec, "os_fingerprint"); 483 | # Log::write(HOST_INFO::HOST_INFO_LOG, rec); 484 | # } 485 | # # e.g [genre=UNKNOWN, detail=, dist=36, match_type=direct_inference] 486 | # # How to utilize this message? 487 | # } 488 | 489 | # There is no point in removing duplicated messages for a specific ip, 490 | # because ip addresses should not be the unique identification of a specific host. 491 | # We should identify a specific host by the ip and mac pair that has the latest network time.
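# A small helper sketch for the exact-match concern raised above and in
# update_single_host: a plain substring test ("192.168.1.5" in ips) also matches
# "192.168.1.50". Splitting the stored string and comparing whole tokens avoids
# that. Hypothetical and not wired in yet; entries are assumed to look like
# "YYYY-MM-DD-HH:MM:SS|ip".
function ip_recorded(ips: string, ip_s: string): bool{
    local entries: string_vec = split_string(ips, /,/);
    for(i in entries){
        local fields: string_vec = split_string(entries[i], /\|/);
        # compare against the last field, i.e. the address part of the entry
        if(|fields| > 0 && fields[|fields|-1] == ip_s){
            return T;
        }
    }
    return F;
}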
492 | event arp_reply(mac_src: string, mac_dst: string, SPA: addr, SHA: string, TPA: addr, THA: string){ 493 | # print "arp reply"; 494 | # print fmt("source mac: %s, destination mac: %s, SPA: %s, SHA: %s, TPA: %s, THA: %s", mac_src, mac_dst, SPA, SHA, TPA, THA); 495 | # record ip and its mac address 496 | # we don't want these forms of mac address: 497 | # 00:00:00:00:00:00 and ff:ff:ff:ff:ff:ff 498 | if(SHA != "ff:ff:ff:ff:ff:ff" && SHA != "00:00:00:00:00:00" && SPA != 0.0.0.0){ 499 | local rec1: HOST_INFO::host_info = [$ts = network_time(), $ip = SPA, $mac = SHA, $description = "arp_reply" ]; 500 | update_hostlist(rec1, "arp"); 501 | Log::write(HOST_INFO::HOST_INFO_LOG, rec1); 502 | } 503 | if(THA != "ff:ff:ff:ff:ff:ff" && THA != "00:00:00:00:00:00" && TPA != 0.0.0.0){ 504 | local rec2: HOST_INFO::host_info = [$ts = network_time(), $ip = TPA, $mac = THA, $description = "arp_reply" ]; 505 | update_hostlist(rec2, "arp"); 506 | Log::write(HOST_INFO::HOST_INFO_LOG, rec2); 507 | } 508 | } 509 | 510 | event arp_request(mac_src: string, mac_dst: string, SPA: addr, SHA: string, TPA: addr, THA: string){ 511 | record_event("arp_request"); 512 | # print "arp request"; 513 | # print fmt("source mac: %s, destination mac: %s, SPA: %s, SHA: %s, TPA: %s, THA: %s", mac_src, mac_dst, SPA, SHA, TPA, THA); 514 | if(SHA != "ff:ff:ff:ff:ff:ff" && SHA != "00:00:00:00:00:00" && SPA != 0.0.0.0){ 515 | local rec1: HOST_INFO::host_info = [$ts = network_time(), $ip = SPA, $mac = SHA, $description = "arp_request" ]; 516 | update_hostlist(rec1, "arp"); 517 | Log::write(HOST_INFO::HOST_INFO_LOG, rec1); 518 | } 519 | if(THA != "ff:ff:ff:ff:ff:ff" && THA != "00:00:00:00:00:00" && TPA != 0.0.0.0){ 520 | local rec2: HOST_INFO::host_info = [$ts = network_time(), $ip = TPA, $mac = THA, $description = "arp_request" ]; 521 | update_hostlist(rec2, "arp"); 522 | Log::write(HOST_INFO::HOST_INFO_LOG, rec2); 523 | } 524 | } 525 | 526 | event bad_arp(SPA: addr, SHA: string, TPA: addr, THA: string, explanation: string){ 527 | record_event("bad_arp"); 528 | print fmt("this arp packet is bad because: %s", explanation); 529 | } 530 | 531 | event dhcp_message(c: connection, is_orig: bool, msg: DHCP::Msg, options: DHCP::Options){ 532 | # print "A dhcp message is coming!"; 533 | # print msg; 534 | # print options; 535 | record_event("dhcp_message"); 536 | if(options ?$ host_name && options ?$ addr_request && options ?$ client_id){ # client_id was missing once, so check it in advance 537 | # print "haha"; 538 | # print options; 539 | local rec1: HOST_INFO::host_info = [$ts = network_time(), $ip = options$addr_request, $mac = options$client_id$hwaddr, $hostname = options$host_name, $description = "dhcp_message1" ]; 540 | update_hostlist(rec1, "dhcp"); 541 | Log::write(HOST_INFO::HOST_INFO_LOG, rec1); 542 | } else{ 543 | if(msg$yiaddr != 0.0.0.0){ 544 | local rec2: HOST_INFO::host_info = [$ts = network_time(), $ip = msg$yiaddr, $mac = msg$chaddr, $description = "dhcp_message2" ]; 545 | update_hostlist(rec2, "dhcp"); 546 | Log::write(HOST_INFO::HOST_INFO_LOG, rec2); 547 | } 548 | } 549 | } 550 | 551 | 552 | event ssh_auth_successful(c: connection, auth_method_none: bool){ 553 | record_event("ssh_auth_successful"); 554 | for ( host in set(c$id$orig_h, c$id$resp_h) ) 555 | { 556 | check_ssh_hostname(c$id, c$uid, host); 557 | } 558 | } 559 | 560 | event dns_query_reply(c: connection, msg: dns_msg, query: string, qtype: count, qclass: count){ 561 | record_event("dns_query_reply"); 562 | # print "here comes a dns query reply"; 563 | # print c; 564 | # print msg; 565 | # print query; 566 | # print qtype; 567 | }
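# PTR replies are another DNS source of hostnames besides the A/AAAA handlers
# below. A sketch in the same style; it relies on ptr_name_to_addr (a zeek
# built-in that turns e.g. "10.112.16.172.in-addr.arpa" back into an addr) and
# guards against non-reverse names, but it has not been validated against this
# dataset:
event dns_PTR_reply(c: connection, msg: dns_msg, ans: dns_answer, name: string){
    record_event("dns_PTR_reply");
    if(/in-addr\.arpa$/ in ans$query){
        local rec: HOST_INFO::host_info = [$ts = network_time(), $ip = ptr_name_to_addr(ans$query), $hostname = name, $description = "dns_PTR_reply" ];
        update_hostlist(rec, "dns");
        Log::write(HOST_INFO::HOST_INFO_LOG, rec);
    }
}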
568 | 569 | event dns_A_reply(c: connection, msg: dns_msg, ans: dns_answer, a: addr){ 570 | record_event("dns_A_reply"); 571 | # print "********************************TYPE A REPLY*********************"; 572 | # print c; 573 | # print msg;#[id=0, opcode=0, rcode=0, QR=T, AA=T, TC=F, RD=F, RA=F, Z=0, num_queries=0, num_answers=1, num_auth=0, num_addl=0] 574 | # print ans;#[answer_type=1, query=brwa86bad339915.local, qtype=1, qclass=32769, TTL=4.0 mins] 575 | # print a;#192.168.1.108 576 | local rec: HOST_INFO::host_info = [$ts = network_time(), $ip = a, $hostname = ans$query, $description = "dns_A_reply" ]; 577 | update_hostlist(rec, "dns"); 578 | Log::write(HOST_INFO::HOST_INFO_LOG, rec); 579 | } 580 | 581 | event dns_AAAA_reply(c: connection, msg: dns_msg, ans: dns_answer, a: addr){ 582 | record_event("dns_AAAA_reply"); 583 | local rec: HOST_INFO::host_info = [$ts = network_time(), $ip = a, $hostname = ans$query, $description = "dns_AAAA_reply" ]; 584 | update_hostlist(rec, "dns"); 585 | Log::write(HOST_INFO::HOST_INFO_LOG, rec); 586 | } 587 | 588 | # I want to get hostnames via events related to DNS. 589 | event dns_message(c: connection, is_orig: bool, msg: dns_msg, len: count){ 590 | record_event("dns_message"); 591 | # print "dns_message"; 592 | # print "1"; 593 | # print c$dns_state$pending_queries; 594 | if(c ?$dns_state){ 595 | for(index1 in c$dns_state$pending_queries){ 596 | # print "2"; 597 | # print c$dns_state$pending_queries[index1]; 598 | for(index2 in c$dns_state$pending_queries[index1]$vals){ 599 | local rec: DNS::Info = c$dns_state$pending_queries[index1]$vals[index2]; 600 | # print rec; 601 | if(rec ?$ answers){ 602 | print "It has answers!!!!"; 603 | print rec; 604 | } 605 | if(rec ?$ qtype_name){ 606 | switch(rec$qtype_name){ 607 | case "A": 608 | # print "type A"; 609 | # print fmt("host %s's query field: %s", rec$id$orig_h, rec$query); 610 | break; 611 | case "AAAA": 612 | # print "type AAAA"; 613 | break; 614 | case "CNAME": 615 | # print "type CNAME"; 616 | break; 617 | case "PTR": 618 | # print "type PTR"; 619 | break; 620 | case "MX": 621 | # print "type MX"; 622 | break; 623 | case "NS": 624 | # print "type NS"; 625 | break; 626 | default: 627 | # print fmt("unexpected type: %s", rec$qtype_name); 628 | break; 629 | } 630 | } 631 | # Unfortunately, it is not the hostname.
:( 632 | } 633 | } 634 | } 635 | } 636 | 637 | event dns_mapping_valid(dm: dns_mapping){ 638 | record_event("dns_mapping_valid"); 639 | # print "dns_mapping_valid"; 640 | # print dm; 641 | } 642 | 643 | event dns_mapping_altered(dm: dns_mapping, old_addrs: addr_set, new_addrs: addr_set){ 644 | record_event("dns_mapping_altered"); 645 | # print "dns_mapping_altered"; 646 | # print dm; 647 | } 648 | 649 | event dns_mapping_lost_name(dm: dns_mapping){ 650 | record_event("dns_mapping_lost_name"); 651 | # print "dns_mapping_lost_name"; 652 | # print dm; 653 | } 654 | 655 | event dns_mapping_new_name(dm: dns_mapping){ 656 | record_event("dns_mapping_new_name"); 657 | # print "dns_mapping_new_name"; 658 | # print dm; 659 | } 660 | 661 | event dns_mapping_unverified(dm: dns_mapping){ 662 | record_event("dns_mapping_unverified"); 663 | # print "dns_mapping_unverified"; 664 | # print dm; 665 | } 666 | 667 | 668 | 669 | event ntlm_authenticate(c: connection, request: NTLM::Authenticate){ 670 | record_event("ntlm_authenticate"); 671 | # print c; 672 | # print request; 673 | # if(request ?$ user_name){ 674 | # print fmt("username: %s", request$user_name); 675 | # } 676 | } 677 | 678 | # collect more protocol information here 679 | event protocol_confirmation(c: connection, atype: Analyzer::Tag, aid: count){ 680 | record_event("protocol_confirmation"); 681 | local src_ip: addr; 682 | local dst_ip: addr; 683 | local protocol: string; 684 | if(c$id ?$ orig_h && c$id ?$ resp_h){ 685 | src_ip = c$id$orig_h; 686 | dst_ip = c$id$resp_h; 687 | } 688 | switch(atype){ 689 | case Analyzer::ANALYZER_AYIYA: 690 | protocol = "ayiya"; 691 | break; 692 | # case Analyzer::ANALYZER_BACKDOOR: 693 | # protocol = "backdoor"; 694 | # break; 695 | case Analyzer::ANALYZER_BITTORRENT: 696 | protocol = "bittorrent"; 697 | break; 698 | case Analyzer::ANALYZER_BITTORRENTTRACKER: 699 | protocol = "bittorrenttracker"; 700 | break; 701 | case Analyzer::ANALYZER_CONNSIZE: 702 | protocol = "connsize";#?? 
703 | break; 704 | case Analyzer::ANALYZER_DCE_RPC: 705 | protocol = "dce_rpc"; 706 | break; 707 | case Analyzer::ANALYZER_DHCP: 708 | protocol = "dhcp"; 709 | break; 710 | case Analyzer::ANALYZER_DNP3_TCP: 711 | protocol = "dnp3_tcp"; 712 | break; 713 | case Analyzer::ANALYZER_DNP3_UDP: 714 | protocol = "dnp3_udp"; 715 | break; 716 | case Analyzer::ANALYZER_CONTENTS_DNS: 717 | protocol = "contents_dns"; 718 | break; 719 | case Analyzer::ANALYZER_DNS: 720 | protocol = "dns"; 721 | break; 722 | case Analyzer::ANALYZER_FTP_DATA: 723 | protocol = "ftp_data"; 724 | break; 725 | case Analyzer::ANALYZER_IRC_DATA: 726 | protocol = "irc_data"; 727 | break; 728 | case Analyzer::ANALYZER_FINGER: 729 | protocol = "finger"; 730 | break; 731 | case Analyzer::ANALYZER_FTP: 732 | protocol = "ftp"; 733 | break; 734 | case Analyzer::ANALYZER_FTP_ADAT: 735 | protocol = "ftp_adat"; 736 | break; 737 | case Analyzer::ANALYZER_GNUTELLA: 738 | protocol = "gnutella"; 739 | break; 740 | case Analyzer::ANALYZER_GSSAPI: 741 | protocol = "gssapi"; 742 | break; 743 | case Analyzer::ANALYZER_GTPV1: 744 | protocol = "gtpv1"; 745 | break; 746 | case Analyzer::ANALYZER_HTTP: 747 | protocol = "http"; 748 | break; 749 | case Analyzer::ANALYZER_ICMP: 750 | protocol = "icmp"; 751 | break; 752 | case Analyzer::ANALYZER_IDENT: 753 | protocol = "ident"; 754 | break; 755 | case Analyzer::ANALYZER_IMAP: 756 | protocol = "imap"; 757 | break; 758 | # case Analyzer::ANALYZER_INTERCONN: 759 | # protocol = "interconn"; 760 | # break; 761 | case Analyzer::ANALYZER_IRC: 762 | protocol = "irc"; 763 | break; 764 | case Analyzer::ANALYZER_KRB: 765 | protocol = "krb"; 766 | break; 767 | case Analyzer::ANALYZER_KRB_TCP: 768 | protocol = "krb_tcp";# the previous one is its substring, how to handle this situation? 
769 | break; 770 | case Analyzer::ANALYZER_CONTENTS_RLOGIN: 771 | protocol = "contents_rlogin"; 772 | break; 773 | case Analyzer::ANALYZER_CONTENTS_RSH: 774 | protocol = "contents_rsh"; 775 | break; 776 | case Analyzer::ANALYZER_LOGIN: 777 | protocol = "login"; 778 | break; 779 | case Analyzer::ANALYZER_NVT: 780 | protocol = "nvt"; 781 | break; 782 | case Analyzer::ANALYZER_RLOGIN: 783 | protocol = "rlogin"; 784 | break; 785 | case Analyzer::ANALYZER_RSH: 786 | protocol = "rsh"; 787 | break; 788 | case Analyzer::ANALYZER_TELNET: 789 | protocol = "telnet"; 790 | break; 791 | case Analyzer::ANALYZER_MODBUS: 792 | protocol = "modbus"; 793 | break; 794 | case Analyzer::ANALYZER_MYSQL: 795 | protocol = "mysql"; 796 | break; 797 | case Analyzer::ANALYZER_CONTENTS_NCP: 798 | protocol = "contents_ncp"; 799 | break; 800 | case Analyzer::ANALYZER_NCP: 801 | protocol = "ncp"; 802 | break; 803 | case Analyzer::ANALYZER_CONTENTS_NETBIOSSSN: 804 | protocol = "contents_netbiosssn"; 805 | break; 806 | case Analyzer::ANALYZER_NETBIOSSSN: 807 | protocol = "netbiosssn"; 808 | break; 809 | case Analyzer::ANALYZER_NTLM: 810 | protocol = "ntlm"; 811 | break; 812 | case Analyzer::ANALYZER_NTP: 813 | protocol = "ntp"; 814 | break; 815 | case Analyzer::ANALYZER_PIA_TCP: 816 | protocol = "pia_tcp"; 817 | break; 818 | case Analyzer::ANALYZER_PIA_UDP: 819 | protocol = "pia_udp"; 820 | break; 821 | case Analyzer::ANALYZER_POP3: 822 | protocol = "pop3"; 823 | break; 824 | case Analyzer::ANALYZER_RADIUS: 825 | protocol = "radius"; 826 | break; 827 | case Analyzer::ANALYZER_RDP: 828 | protocol = "rdp"; 829 | break; 830 | case Analyzer::ANALYZER_RFB: 831 | protocol = "rfb"; 832 | break; 833 | case Analyzer::ANALYZER_CONTENTS_NFS: 834 | protocol = "contents_nfs"; 835 | break; 836 | case Analyzer::ANALYZER_CONTENTS_RPC: 837 | protocol = "contents_rpc"; 838 | break; 839 | case Analyzer::ANALYZER_MOUNT: 840 | protocol = "mount"; 841 | break; 842 | case Analyzer::ANALYZER_NFS: 843 | protocol = "nfs"; 844 | break; 845 | case Analyzer::ANALYZER_PORTMAPPER: 846 | protocol = "portmapper"; 847 | break; 848 | case Analyzer::ANALYZER_SIP: 849 | protocol = "sip"; 850 | break; 851 | case Analyzer::ANALYZER_CONTENTS_SMB: 852 | protocol = "contents_smb"; 853 | break; 854 | case Analyzer::ANALYZER_SMB: 855 | protocol = "smb"; 856 | break; 857 | case Analyzer::ANALYZER_SMTP: 858 | protocol = "smtp"; 859 | break; 860 | case Analyzer::ANALYZER_SNMP: 861 | protocol = "snmp"; 862 | break; 863 | case Analyzer::ANALYZER_SOCKS: 864 | protocol = "socks"; 865 | break; 866 | case Analyzer::ANALYZER_SSH: 867 | protocol = "ssh"; 868 | break; 869 | case Analyzer::ANALYZER_DTLS: 870 | protocol = "dtls"; 871 | break; 872 | case Analyzer::ANALYZER_SSL: 873 | protocol = "ssl"; 874 | break; 875 | case Analyzer::ANALYZER_STEPPINGSTONE: 876 | protocol = "steppingstone"; 877 | break; 878 | case Analyzer::ANALYZER_SYSLOG: 879 | protocol = "syslog"; 880 | break; 881 | case Analyzer::ANALYZER_CONTENTLINE: 882 | protocol = "contentline"; 883 | break; 884 | case Analyzer::ANALYZER_CONTENTS: 885 | protocol = "contents"; 886 | break; 887 | case Analyzer::ANALYZER_TCP: 888 | protocol = "tcp"; 889 | break; 890 | case Analyzer::ANALYZER_TCPSTATS: 891 | protocol = "tcpstats"; 892 | break; 893 | case Analyzer::ANALYZER_TEREDO: 894 | protocol = "teredo"; 895 | break; 896 | case Analyzer::ANALYZER_UDP: 897 | protocol = "udp"; 898 | break; 899 | case Analyzer::ANALYZER_XMPP: 900 | protocol = "xmpp"; 901 | break; 902 | case Analyzer::ANALYZER_ZIP: 903 | protocol = "zip"; 904 
| break; 905 | default: 906 | print "Unexpected error: unknown protocol type!"; 907 | protocol = "error"; 908 | break; 909 | } 910 | if(protocol == "error") 911 | return; 912 | # both endpoints share the same protocol 913 | local rec1: HOST_INFO::host_info = [$ts = network_time(), $ip = src_ip, $description = protocol ]; 914 | local rec2: HOST_INFO::host_info = [$ts = network_time(), $ip = dst_ip, $description = protocol ]; 915 | update_hostlist(rec1, protocol); 916 | update_hostlist(rec2, protocol); 917 | Log::write(HOST_INFO::HOST_INFO_LOG, rec1); 918 | Log::write(HOST_INFO::HOST_INFO_LOG, rec2); 919 | # print "a new protocol is logged"; 920 | } 921 | 922 | # try to get software info 923 | # Unfortunately, these events haven't been triggered 924 | event software_unparsed_version_found(c: connection, host: addr, str: string){ 925 | record_event("software_unparsed_version_found"); 926 | # fill app record 927 | } 928 | 929 | # event software_version_found(c: connection, host: addr, s: software, descr: string){ 930 | # # fill app record 931 | # } 932 | 933 | # basic packets 934 | # A raw packet header, consisting of the L2 header and everything in pkt_hdr. 935 | # raw_packet provides less information than packet_contents, and bro warns that both events are very expensive 936 | 937 | # event raw_packet(p: raw_pkt_hdr){ 938 | # record_event("raw_packet"); 939 | # } 940 | 941 | # if portmap were not problematic, packet_contents could be dropped as well; it adds a lot of unnecessary overhead 942 | event packet_contents(c: connection, contents: string){ 943 | # print "packet_contents!"; 944 | # print c$id$resp_p; 945 | record_event("packet_contents"); 946 | if(c$id$resp_p == 111/udp){ 947 | # treat this case as an rpc event; it corresponds to 135 of the 148 packets in phase 2 948 | # it is really a port scan of 111/udp on the target host; bro provides no event for this, so we define our own 949 | # print "portmapper protocol"; 950 | # print c; 951 | # print contents; 952 | portmapper_call(c); 953 | # num_packets += 1; 954 | } 955 | # else { 956 | # print c; 957 | # num_packets += 1; 958 | # } 959 | # # print contents; 960 | # p_num -= 1; 961 | } 962 | 963 | 964 | # events for analyzing the dataset; we also care about the host information involved in them 965 | # the network-traffic graph is built from bro logs and emitted as Gremlin scripts (setting properties, adding vertices and edges) 966 | # analysis and computation on the traffic graph rely on the powerful graph computing that Gremlin provides 967 | 968 | # XX abandoned the Gremlin scripts; instead, call hugegraph's http API with json data to perform the graph updates 969 | # phase-1-dump: 785 packets 970 | event icmp_echo_request(c: connection, icmp: icmp_conn, id: count, seq: count, payload: string){ 971 | # Generated for ICMP echo request messages. 972 | update_network_event(c, "icmp_echo_request", "icmp", "a-ping-request-message", ICMP_ECHO_REQUEST); 973 | } 974 | 975 | event icmp_echo_reply(c: connection, icmp: icmp_conn, id: count, seq: count, payload: string){ 976 | # Generated for ICMP echo reply messages. 977 | update_network_event(c, "icmp_echo_reply", "icmp", "a-ping-reply-message", ICMP_ECHO_REPLY); 978 | } 979 | 980 | event icmp_time_exceeded(c: connection, icmp: icmp_conn, code: count, context: icmp_context){ 981 | record_event("icmp_time_exceeded"); 982 | # Generated for ICMP time exceeded messages. 983 | } 984 | 985 | event icmp_error_message(c: connection, icmp: icmp_conn, code: count, context: icmp_context){ 986 | record_event("icmp_error_message"); 987 | # Generated for all ICMPv6 error messages that are not handled separately with dedicated events. 988 | # Zeek’s ICMP analyzer handles a number of ICMP error messages directly with dedicated events. 989 | # This event acts as a fallback for those it doesn’t. 990 | }
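# The long switch in protocol_confirmation above exists only to map an
# Analyzer::Tag to a lowercase protocol name. If the zeek version in use provides
# Analyzer::name (recent releases do), the whole mapping collapses to one call;
# a sketch, not wired in:
function analyzer_to_protocol(atype: Analyzer::Tag): string{
    # e.g. Analyzer::ANALYZER_HTTP -> "HTTP" -> "http"
    return to_lower(Analyzer::name(atype));
}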
991 | 992 | event icmp_neighbor_advertisement(c: connection, icmp: icmp_conn, router: bool, solicited: bool, 993 | override: bool, tgt: addr, options: icmp6_nd_options){ 994 | record_event("icmp_neighbor_advertisement"); 995 | # Generated for ICMP neighbor advertisement messages. 996 | } 997 | 998 | event icmp_neighbor_solicitation(c: connection, icmp: icmp_conn, tgt: addr, options: icmp6_nd_options){ 999 | record_event("icmp_neighbor_solicitation"); 1000 | # Generated for ICMP neighbor solicitation messages. 1001 | } 1002 | 1003 | event icmp_packet_too_big(c: connection, icmp: icmp_conn, code: count, context: icmp_context){ 1004 | record_event("icmp_packet_too_big"); 1005 | # Generated for ICMPv6 packet too big messages. 1006 | } 1007 | 1008 | event icmp_parameter_problem(c: connection, icmp: icmp_conn, code: count, context: icmp_context){ 1009 | record_event("icmp_parameter_problem"); 1010 | # Generated for ICMPv6 parameter problem messages. 1011 | } 1012 | 1013 | event icmp_redirect(c: connection, icmp: icmp_conn, tgt: addr, dest: addr, options: icmp6_nd_options){ 1014 | record_event("icmp_redirect"); 1015 | # Generated for ICMP redirect messages. 1016 | } 1017 | 1018 | event icmp_router_advertisement(c: connection, icmp: icmp_conn, cur_hop_limit: count, managed: bool, 1019 | other: bool, home_agent: bool, pref: count, proxy: bool, res: count, router_lifetime: interval, 1020 | reachable_time: interval, retrans_timer: interval, options: icmp6_nd_options){ 1021 | record_event("icmp_router_advertisement"); 1022 | # Generated for ICMP router advertisement messages. 1023 | } 1024 | 1025 | event icmp_router_solicitation(c: connection, icmp: icmp_conn, options: icmp6_nd_options){ 1026 | record_event("icmp_router_solicitation"); 1027 | # Generated for ICMP router solicitation messages. 1028 | } 1029 | 1030 | event icmp_sent(c: connection, icmp: icmp_conn){ 1031 | record_event("icmp_sent"); 1032 | # Generated for all ICMP messages that are not handled separately with dedicated ICMP events. 1033 | # Zeek’s ICMP analyzer handles a number of ICMP messages directly with dedicated events. 1034 | # This event acts as a fallback for those it doesn’t. 1035 | } 1036 | 1037 | event icmp_sent_payload(c: connection, icmp: icmp_conn, payload: string){ 1038 | record_event("icmp_sent_payload"); 1039 | # The same as icmp_sent except containing the ICMP payload. 1040 | } 1041 | 1042 | event icmp_unreachable(c: connection, icmp: icmp_conn, code: count, context: icmp_context){ 1043 | # Generated for ICMP destination unreachable messages. 1044 | update_network_event(c, "icmp_unreachable", "icmp", "a-ping-failed", ICMP_UNREACHABLE); 1045 | } 1046 | 1047 | # phase-2-dump: 148 packets 1048 | # pm related 1049 | 1050 | # ProtocolDetector::found_protocol 1051 | 1052 | # phase 2 requires defining our own "events"; the built-in ones were not triggered 1053 | function portmapper_call(c: connection){ 1054 | update_network_event(c, "portmapper_call", "portmap", "SADMIND(100232)", PORTMAP); 1055 | } 1056 | 1057 | # starting from new_packet and packet_contents 1058 | event mount_proc_mnt(c: connection, info: MOUNT3::info_t, req: MOUNT3::dirmntargs_t, rep: MOUNT3::mnt_reply_t){ 1059 | record_event("mount_proc_mnt"); 1060 | # Generated for MOUNT3 request/reply dialogues of type mnt. 1061 | # The event is generated once we have either seen both the request and its corresponding reply, 1062 | # or an unanswered request has timed out. MOUNT is a service running on top of RPC.
1063 | } 1064 | 1065 | event mount_proc_not_implemented(c: connection, info: MOUNT3::info_t, proc: MOUNT3::proc_t){ 1066 | record_event("mount_proc_not_implemented"); 1067 | # Generated for MOUNT3 request/reply dialogues of a type that Zeek’s MOUNTv3 analyzer does not implement. 1068 | } 1069 | 1070 | event mount_proc_null(c: connection, info: MOUNT3::info_t){ 1071 | record_event("mount_proc_null"); 1072 | # Generated for MOUNT3 request/reply dialogues of type null. 1073 | # The event is generated once we have either seen both the request and its corresponding reply, 1074 | # or an unanswered request has timed out. MOUNT is a service running on top of RPC. 1075 | } 1076 | 1077 | event mount_proc_umnt(c: connection, info: MOUNT3::info_t, req: MOUNT3::dirmntargs_t){ 1078 | record_event("mount_proc_umnt"); 1079 | # Generated for MOUNT3 request/reply dialogues of type umnt. 1080 | # The event is generated once we have either seen both the request and its corresponding reply, 1081 | # or an unanswered request has timed out. MOUNT is a service running on top of RPC. 1082 | } 1083 | 1084 | event mount_proc_umnt_all(c: connection, info: MOUNT3::info_t, req: MOUNT3::dirmntargs_t){ 1085 | record_event("mount_proc_umnt_all"); 1086 | # Generated for MOUNT3 request/reply dialogues of type umnt_all. 1087 | # The event is generated once we have either seen both the request and its corresponding reply, 1088 | # or an unanswered request has timed out. MOUNT is a service running on top of RPC. 1089 | } 1090 | 1091 | event mount_reply_status(n: connection, info: MOUNT3::info_t){ 1092 | record_event("mount_reply_status"); 1093 | # Generated for each MOUNT3 reply message received, reporting just the status included. 1094 | } 1095 | 1096 | event nfs_proc_create(c: connection, info: NFS3::info_t, req: NFS3::diropargs_t, rep: NFS3::newobj_reply_t){ 1097 | record_event("nfs_proc_create"); 1098 | # Generated for NFSv3 request/reply dialogues of type create. 1099 | # The event is generated once we have either seen both the request and its corresponding reply, 1100 | # or an unanswered request has timed out. 1101 | } 1102 | 1103 | event nfs_proc_getattr(c: connection, info: NFS3::info_t, fh: string, attrs: NFS3::fattr_t){ 1104 | record_event("nfs_proc_getattr"); 1105 | # Generated for NFSv3 request/reply dialogues of type getattr. 1106 | # The event is generated once we have either seen both the request 1107 | # and its corresponding reply, or an unanswered request has timed out. 1108 | } 1109 | 1110 | event nfs_proc_link(c: connection, info: NFS3::info_t, req: NFS3::linkargs_t, rep: NFS3::link_reply_t){ 1111 | record_event("nfs_proc_link"); 1112 | # Generated for NFSv3 request/reply dialogues of type link. 1113 | # The event is generated once we have either seen both the request and its corresponding reply, 1114 | # or an unanswered request has timed out. 1115 | } 1116 | 1117 | event nfs_proc_lookup(c: connection, info: NFS3::info_t, req: NFS3::diropargs_t, rep: NFS3::lookup_reply_t){ 1118 | record_event("nfs_proc_lookup"); 1119 | # Generated for NFSv3 request/reply dialogues of type lookup. 1120 | # The event is generated once we have either seen both the request and its corresponding reply, 1121 | # or an unanswered request has timed out. 1122 | } 1123 | 1124 | event nfs_proc_mkdir(c: connection, info: NFS3::info_t, req: NFS3::diropargs_t, rep: NFS3::newobj_reply_t){ 1125 | record_event("nfs_proc_mkdir"); 1126 | # Generated for NFSv3 request/reply dialogues of type mkdir. 
1127 | # The event is generated once we have either seen both the request and its corresponding reply, 1128 | # or an unanswered request has timed out. 1129 | } 1130 | 1131 | event nfs_proc_not_implemented(c: connection, info: NFS3::info_t, proc: NFS3::proc_t){ 1132 | record_event("nfs_proc_not_implemented"); 1133 | # Generated for NFSv3 request/reply dialogues of a type that Zeek’s NFSv3 analyzer does not implement. 1134 | } 1135 | 1136 | event nfs_proc_null(c: connection, info: NFS3::info_t){ 1137 | record_event("nfs_proc_null"); 1138 | # Generated for NFSv3 request/reply dialogues of type null. 1139 | # The event is generated once we have either seen both the request and its corresponding reply, 1140 | # or an unanswered request has timed out. 1141 | } 1142 | 1143 | event nfs_proc_read(c: connection, info: NFS3::info_t, req: NFS3::readargs_t, rep: NFS3::read_reply_t){ 1144 | record_event("nfs_proc_read"); 1145 | # Generated for NFSv3 request/reply dialogues of type read. 1146 | # The event is generated once we have either seen both the request and its corresponding reply, 1147 | # or an unanswered request has timed out. 1148 | } 1149 | 1150 | event nfs_proc_readdir(c: connection, info: NFS3::info_t, req: NFS3::readdirargs_t, rep: NFS3::readdir_reply_t){ 1151 | record_event("nfs_proc_readdir"); 1152 | # Generated for NFSv3 request/reply dialogues of type readdir. 1153 | # The event is generated once we have either seen both the request and its corresponding reply, 1154 | # or an unanswered request has timed out. 1155 | } 1156 | 1157 | event nfs_proc_readlink(c: connection, info: NFS3::info_t, fh: string, rep: NFS3::readlink_reply_t){ 1158 | record_event("nfs_proc_readlink"); 1159 | # Generated for NFSv3 request/reply dialogues of type readlink. 1160 | # The event is generated once we have either seen both the request and its corresponding reply, 1161 | # or an unanswered request has timed out. 1162 | } 1163 | 1164 | event nfs_proc_remove(c: connection, info: NFS3::info_t, req: NFS3::diropargs_t, rep: NFS3::delobj_reply_t){ 1165 | record_event("nfs_proc_remove"); 1166 | # Generated for NFSv3 request/reply dialogues of type remove. 1167 | # The event is generated once we have either seen both the request and its corresponding reply, 1168 | # or an unanswered request has timed out. 1169 | } 1170 | 1171 | event nfs_proc_rename(c: connection, info: NFS3::info_t, req: NFS3::renameopargs_t, rep: NFS3::renameobj_reply_t){ 1172 | record_event("nfs_proc_rename"); 1173 | # Generated for NFSv3 request/reply dialogues of type rename. 1174 | # The event is generated once we have either seen both the request and its corresponding reply, 1175 | # or an unanswered request has timed out. 1176 | } 1177 | 1178 | event nfs_proc_rmdir(c: connection, info: NFS3::info_t, req: NFS3::diropargs_t, rep: NFS3::delobj_reply_t){ 1179 | record_event("nfs_proc_rmdir"); 1180 | # Generated for NFSv3 request/reply dialogues of type rmdir. 1181 | # The event is generated once we have either seen both the request and its corresponding reply, 1182 | # or an unanswered request has timed out. 1183 | } 1184 | 1185 | event nfs_proc_sattr(c: connection, info: NFS3::info_t, req: NFS3::sattrargs_t, rep: NFS3::sattr_reply_t){ 1186 | record_event("nfs_proc_sattr"); 1187 | # Generated for NFSv3 request/reply dialogues of type sattr. 1188 | # The event is generated once we have either seen both the request and its corresponding reply, 1189 | # or an unanswered request has timed out. 
1190 | } 1191 | 1192 | event nfs_proc_symlink(c: connection, info: NFS3::info_t, req: NFS3::symlinkargs_t, rep: NFS3::newobj_reply_t){ 1193 | record_event("nfs_proc_symlink"); 1194 | # Generated for NFSv3 request/reply dialogues of type symlink. 1195 | # The event is generated once we have either seen both the request and its corresponding reply, 1196 | # or an unanswered request has timed out. 1197 | } 1198 | 1199 | event nfs_proc_write(c: connection, info: NFS3::info_t, req: NFS3::writeargs_t, rep: NFS3::write_reply_t){ 1200 | record_event("nfs_proc_write"); 1201 | # Generated for NFSv3 request/reply dialogues of type write. 1202 | # The event is generated once we have either seen both the request and its corresponding reply, 1203 | # or an unanswered request has timed out. 1204 | } 1205 | 1206 | event nfs_reply_status(n: connection, info: NFS3::info_t){ 1207 | record_event("nfs_reply_status"); 1208 | # Generated for each NFSv3 reply message received, reporting just the status included. 1209 | } 1210 | 1211 | #--上面是关于nfs的调用事件-- 1212 | 1213 | event pm_attempt_getport(r: connection, status: rpc_status, pr: pm_port_request){ 1214 | record_event("pm_attempt_getport"); 1215 | # Generated for failed Portmapper requests of type getport. 1216 | } 1217 | 1218 | event pm_attempt_dump(r: connection, status: rpc_status){ 1219 | record_event("pm_attempt_dump"); 1220 | # Generated for failed Portmapper requests of type dump. 1221 | } 1222 | 1223 | event pm_attempt_callit(r: connection, status: rpc_status, call: pm_callit_request){ 1224 | record_event("pm_attempt_callit"); 1225 | # Generated for failed Portmapper requests of type callit. 1226 | } 1227 | 1228 | event pm_attempt_null(r: connection, status: rpc_status){ 1229 | record_event("pm_attempt_null"); 1230 | # Generated for failed Portmapper requests of type null. 1231 | } 1232 | 1233 | event pm_attempt_set(r: connection, status: rpc_status, m: pm_mapping){ 1234 | record_event("pm_attempt_set"); 1235 | # Generated for failed Portmapper requests of type set. 1236 | } 1237 | 1238 | event pm_attempt_unset(r: connection, status: rpc_status, m: pm_mapping){ 1239 | record_event("pm_attempt_unset"); 1240 | # Generated for failed Portmapper requests of type unset. 1241 | } 1242 | 1243 | event pm_bad_port(r: connection, bad_p: count){ 1244 | record_event("pm_bad_port"); 1245 | # Generated for Portmapper requests or replies that include an invalid port number. 1246 | # Since ports are represented by unsigned 4-byte integers, 1247 | # they can stray outside the allowed range of 0–65535 by being >= 65536. 1248 | # If so, this event is generated. 1249 | } 1250 | 1251 | event pm_request_callit(r: connection, call: pm_callit_request, p: port){ 1252 | record_event("pm_request_callit"); 1253 | # Generated for Portmapper request/reply dialogues of type callit. 1254 | } 1255 | 1256 | event pm_request_dump(r: connection, m: pm_mappings){ 1257 | record_event("pm_request_dump"); 1258 | # Generated for Portmapper request/reply dialogues of type dump. 1259 | } 1260 | 1261 | event pm_request_getport(r: connection, pr: pm_port_request, p: port){ 1262 | record_event("pm_request_getport"); 1263 | # Generated for Portmapper request/reply dialogues of type getport. 1264 | } 1265 | 1266 | event pm_request_null(r: connection){ 1267 | record_event("pm_request_null"); 1268 | # Generated for Portmapper requests of type null. 
1269 | } 1270 | 1271 | event pm_request_set(r: connection, m: pm_mapping, success: bool){ 1272 | record_event("pm_request_set"); 1273 | # Generated for Portmapper request/reply dialogues of type set. 1274 | } 1275 | 1276 | event pm_request_unset(r: connection, m: pm_mapping, success: bool){ 1277 | record_event("pm_request_unset"); 1278 | # Generated for Portmapper request/reply dialogues of type unset. 1279 | } 1280 | 1281 | event rpc_call(c: connection, xid: count, prog: count, ver: count, proc: count, call_len: count){ 1282 | record_event("rpc_call"); 1283 | # Generated for RPC call messages. 1284 | } 1285 | 1286 | event rpc_dialogue(c: connection, prog: count, ver: count, proc: count, status: rpc_status, start_time: time, call_len: count, reply_len: count){ 1287 | record_event("rpc_dialogue"); 1288 | # Generated for RPC request/reply pairs. 1289 | # The RPC analyzer associates request and reply by their transaction identifiers 1290 | # and raises this event once both have been seen. 1291 | # If there’s not a reply, this event will still be generated eventually on timeout. 1292 | # In that case, status will be set to RPC_TIMEOUT. 1293 | } 1294 | 1295 | # this implementation is buggy: rpc_reply can be triggered, but the parameter c belongs to the rpc_call 1296 | # for now, assume that a triggered rpc_reply represents one rpc call/reply pair 1297 | # zeek will fix this bug in version 3.1.0 1298 | # update_network_event is not called here 1299 | event rpc_reply(c: connection, xid: count, status: rpc_status, reply_len: count){ 1300 | # print c; 1301 | # print status; 1302 | # record the assets; a host is an asset 1303 | # Generated for RPC reply messages. 1304 | if(c$id ?$ orig_h && c$id ?$ resp_h){ 1305 | local rec1: HOST_INFO::host_info = [$ts = network_time(), $ip = c$id$orig_h, $description = "rpc_reply"]; 1306 | local rec2: HOST_INFO::host_info = [$ts = network_time(), $ip = c$id$resp_h, $description = "rpc_reply"]; 1307 | update_hostlist(rec1, "rpc"); 1308 | Log::write(HOST_INFO::HOST_INFO_LOG, rec1); 1309 | update_hostlist(rec2, "rpc"); 1310 | Log::write(HOST_INFO::HOST_INFO_LOG, rec2); 1311 | } 1312 | # record the events; an event is rendered as an edge and must connect two vertices 1313 | # handle the RPC_CALL event first 1314 | local t: time = network_time(); 1315 | local rec3: HOST_INFO::event_info = [$ts = network_time(), $real_time = fmt("%s", strftime("%Y-%m-%d-%H:%M:%S", t)), 1316 | $event_type = RPC_CALL, $src_ip = c$id$orig_h, $src_p = c$id$orig_p, 1317 | $dst_ip = c$id$resp_h, $dst_p = c$id$resp_p, $description = fmt("%s", status)]; 1318 | Log::write(HOST_INFO::NET_EVENTS_LOG, rec3); 1319 | # then handle the RPC_REPLY event: source/destination ips and ports are swapped, but the replying node's source port is not 111 (it is mapped to some unknown port; for display purposes we pick a large number) 1320 | t = network_time(); 1321 | local random_port: int = rand(65535);# a fake source port, set for display purposes only 1322 | while(random_port < 40000){ 1323 | random_port = rand(65535); 1324 | } 1325 | local tmp_str: string = fmt("%d", random_port) + "/udp"; 1326 | local fake_port: port = to_port(tmp_str); 1327 | 1328 | rec3 = [$ts = network_time(), $real_time = fmt("%s", strftime("%Y-%m-%d-%H:%M:%S", t)), 1329 | $event_type = RPC_REPLY, $src_ip = c$id$resp_h, $src_p = fake_port, 1330 | $dst_ip = c$id$orig_h, $dst_p = c$id$orig_p, $description = fmt("%s", status)]; 1331 | Log::write(HOST_INFO::NET_EVENTS_LOG, rec3); 1332 | # num_packets += 1; 1333 | } 1334 | # The handlers above cover pm and rpc; unfortunately none of them were triggered 1335 | # Since the packet content shows resp_p=111/udp, and 111 is the portmapper port, this packet must be portmapper-related 1336 | # How can we learn through bro that the rpc call targeted the sadmind daemon?
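# The question above (did the rpc call target sadmind?) could in principle be
# answered from rpc_call's "prog" argument: sadmind's RPC program number is
# 100232, as the description string in portmapper_call above already records.
# A sketch, assuming rpc_call actually fires on this traffic (it did not in our runs):
event rpc_call(c: connection, xid: count, prog: count, ver: count, proc: count, call_len: count){
    if(prog == 100232){
        print fmt("sadmind rpc call: %s -> %s (proc %d)", c$id$orig_h, c$id$resp_h, proc);
    }
}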
1337 | 1338 | # phase-3-dump: 530 packets 1339 | # the main protocols involved in phase 3 are SADMIND, Portmap, TCP and TELNET 1340 | # add the TCP- and TELNET-related events first, hoping the TELNET ones will trigger 1341 | # very frequent, low-value events such as TCP retransmissions could simply be filtered out (see the gating sketch after connection_half_finished below) 1342 | event new_connection_contents(c: connection){ 1343 | # Generated when reassembly starts for a TCP connection. 1344 | # This event is raised at the moment when Zeek’s TCP analyzer enables stream reassembly for a connection. 1345 | update_network_event(c, "new_connection_contents", "tcp", "reassembly-starts-for-a-TCP-connection", CONNECTION_SYN_PACKET);# which program/port mapping took place cannot be seen from c; pm-related events have to supply that 1346 | } 1347 | 1348 | event connection_attempt(c: connection){ 1349 | # record_event("connection_attempt"); 1350 | # Generated for an unsuccessful connection attempt. 1351 | # This event is raised when an originator unsuccessfully attempted to establish a connection. 1352 | # “Unsuccessful” is defined as at least tcp_attempt_delay seconds having elapsed 1353 | # since the originator first sent a connection establishment packet to the destination without seeing a reply. 1354 | update_network_event(c, "connection_attempt", "tcp", "an-unsuccessful-connection-attempt", CONNECTION_ATTEMPT); 1355 | } 1356 | 1357 | event connection_established(c: connection){ 1358 | # Generated when seeing a SYN-ACK packet from the responder in a TCP handshake. 1359 | # An associated SYN packet was not seen from the originator side if its state is not set to TCP_ESTABLISHED. 1360 | # The final ACK of the handshake in response to SYN-ACK may or may not occur later, 1361 | # one way to tell is to check the history field of connection to see if the originator sent an ACK, 1362 | # indicated by ‘A’ in the history string. 1363 | update_network_event(c, "connection_established", "tcp", "see-a-synack-packet-from-a-tcp-handshake", CONNECTION_ESTABLISHED); 1364 | } 1365 | 1366 | event partial_connection(c: connection){ 1367 | record_event("partial_connection"); 1368 | # Generated for a new active TCP connection if Zeek did not see the initial handshake. 1369 | # This event is raised when Zeek has observed traffic from each endpoint, 1370 | # but the activity did not begin with the usual connection establishment. 1371 | } 1372 | 1373 | event connection_partial_close(c: connection){ 1374 | record_event("connection_partial_close"); 1375 | # Generated when a previously inactive endpoint attempts to close a TCP connection 1376 | # via a normal FIN handshake or an abort RST sequence. 1377 | # When the endpoint sent one of these packets, 1378 | # Zeek waits tcp_partial_close_delay prior to generating the event, 1379 | # to give the other endpoint a chance to close the connection normally. 1380 | } 1381 | 1382 | event connection_finished(c: connection){ 1383 | # Generated for a TCP connection that finished normally. 1384 | # The event is raised when a regular FIN handshake from both endpoints was observed. 1385 | update_network_event(c, "connection_finished", "tcp", "a-tcp-connection-finished-normally", CONNECTION_FINISHED); 1386 | } 1387 | 1388 | event connection_half_finished(c: connection){ 1389 | # record_event("connection_half_finished"); 1390 | # Generated when one endpoint of a TCP connection attempted to gracefully close the connection, 1391 | # but the other endpoint is in the TCP_INACTIVE state. 1392 | # This can happen due to split routing, in which Zeek only sees one side of a connection. 1393 | update_network_event(c, "connection_half_finished", "tcp", "This-can-happen-due-to-split-routing", CONNECTION_HALF_FINISHED); 1394 | } 1395 |
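# A sketch of the filtering idea mentioned at the top of this phase: gate very
# frequent, low-value events behind a per-event budget instead of logging every
# occurrence. The helper and the limit are hypothetical and not wired in:
global noisy_event_counts: table[string] of count &default = 0;
function within_budget(ev: string, limit: count): bool{
    noisy_event_counts[ev] += 1;
    return noisy_event_counts[ev] <= limit;
}
# e.g. in tcp_packet: if(within_budget("tcp_packet", 1000)) update_network_event(...);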
1396 | event connection_rejected(c: connection){ 1397 | record_event("connection_rejected"); 1398 | # Generated for a rejected TCP connection. 1399 | # This event is raised when an originator attempted to setup a TCP connection 1400 | # but the responder replied with a RST packet denying it. 1401 | } 1402 | 1403 | event connection_reset(c: connection){ 1404 | record_event("connection_reset"); 1405 | # Generated when an endpoint aborted a TCP connection. 1406 | # The event is raised when one endpoint of an established TCP connection aborted by sending a RST packet. 1407 | } 1408 | 1409 | event connection_pending(c: connection){ 1410 | # Generated for each still-open TCP connection when Zeek terminates. 1411 | update_network_event(c, "connection_pending", "tcp", "a-still-open-tcp-connection", CONNECTION_PENDING); 1412 | } 1413 | 1414 | event connection_SYN_packet(c: connection, pkt: SYN_packet){ 1415 | # Generated for a SYN packet. 1416 | # Zeek raises this event for every SYN packet seen by its TCP analyzer. 1417 | update_network_event(c, "connection_SYN_packet", "tcp", "a-SYN-packet-appears", CONNECTION_SYN_PACKET); 1418 | } 1419 | 1420 | event connection_first_ACK(c: connection){ 1421 | # Generated for the first ACK packet seen for a TCP connection from its originator. 1422 | update_network_event(c, "connection_first_ack", "tcp", "the-first-ack-packet-seen-in-this-tcp-connection", CONNECTION_FIRST_ACK); 1423 | } 1424 | 1425 | event connection_EOF(c: connection, is_orig: bool){ 1426 | # Generated at the end of reassembled TCP connections. 1427 | # The TCP reassembler raised the event once for each endpoint of a connection 1428 | # when it finished reassembling the corresponding side of the communication. 1429 | update_network_event(c, "connection_eof", "tcp", "the-end-of-reassembled-tcp-connections", CONNECTION_EOF); 1430 | } 1431 | 1432 | event tcp_packet(c: connection, is_orig: bool, flags: string, seq: count, ack: count, len: count, payload: string){ 1433 | # Generated for every TCP packet. 1434 | # This is a very low-level and expensive event that should be avoided when at all possible. 1435 | # It’s usually infeasible to handle when processing even medium volumes of traffic in real-time. 1436 | # It’s slightly better than new_packet because it affects only TCP, but not much. 1437 | # That said, if you work from a trace and want to do some packet-level analysis, it may come in handy. 1438 | 1439 | # this one is very expensive; if it turns out to be of little value, remove it 1440 | update_network_event(c, "tcp_packet", "tcp", "a-tcp-packet-appears", TCP_PACKET); 1441 | } 1442 | 1443 | event tcp_option(c: connection, is_orig: bool, opt: count, optlen: count){ 1444 | record_event("tcp_option"); 1445 | # Generated for each option found in a TCP header. 1446 | # Like many of the tcp_* events, this is a very low-level event and potentially expensive as it may be raised very often. 1447 | } 1448 | 1449 | event tcp_contents(c: connection, is_orig: bool, seq: count, contents: string){ 1450 | record_event("tcp_contents"); 1451 | # Generated for each chunk of reassembled TCP payload. 1452 | # When content delivery is enabled for a TCP connection 1453 | # (via tcp_content_delivery_ports_orig, tcp_content_delivery_ports_resp, 1454 | # tcp_content_deliver_all_orig, tcp_content_deliver_all_resp), 1455 | # this event is raised for each chunk of in-order payload reconstructed from the packet stream. 1456 | # Note that this event is potentially expensive if many connections carry significant 1457 | # amounts of data as then all that data needs to be passed on to the scripting layer. 1458 | }
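# tcp_contents above only fires once content delivery is enabled for a
# connection. The redef-able tables named in its comment switch that on; for
# example, to deliver responder-side telnet payloads (left commented out so the
# script's behavior is unchanged):
# redef tcp_content_delivery_ports_resp += { [23/tcp] = T };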
1459 | 1460 | event tcp_rexmit(c: connection, is_orig: bool, seq: count, len: count, data_in_flight: count, window: count){ 1461 | record_event("tcp_rexmit"); 1462 | # Generated for each detected TCP segment retransmission. 1463 | } 1464 | 1465 | event tcp_multiple_checksum_errors(c: connection, is_orig: bool, threshold: count){ 1466 | record_event("tcp_multiple_checksum_errors"); 1467 | # Generated if a TCP flow crosses a checksum-error threshold, per ‘C’/’c’ history reporting. 1468 | } 1469 | 1470 | event tcp_multiple_zero_windows(c: connection, is_orig: bool, threshold: count){ 1471 | record_event("tcp_multiple_zero_windows"); 1472 | # Generated if a TCP flow crosses a zero-window threshold, per ‘W’/’w’ history reporting. 1473 | } 1474 | 1475 | event tcp_multiple_retransmissions(c: connection, is_orig: bool, threshold: count){ 1476 | record_event("tcp_multiple_retransmissions"); 1477 | # Generated if a TCP flow crosses a retransmission threshold, per ‘T’/’t’ history reporting. 1478 | } 1479 | 1480 | event tcp_multiple_gap(c: connection, is_orig: bool, threshold: count){ 1481 | record_event("tcp_multiple_gap"); 1482 | # Generated if a TCP flow crosses a gap threshold, per ‘G’/’g’ history reporting. 1483 | } 1484 | 1485 | event contents_file_write_failure(c: connection, is_orig: bool, msg: string){ 1486 | record_event("contents_file_write_failure"); 1487 | # Generated when failing to write contents of a TCP stream to a file. 1488 | } 1489 | # The above covers all events in Zeek::TCP 1490 | # Zeek_Login.events.bif.zeek should contain information about RSH calls and TELNET 1491 | 1492 | event activating_encryption(c: connection){ 1493 | record_event("activating_encryption"); 1494 | # Generated for Telnet sessions when encryption is activated. 1495 | # The Telnet protocol includes options for negotiating encryption. 1496 | # When such a series of options is successfully negotiated, 1497 | # the event engine generates this event. 1498 | } 1499 | 1500 | event authentication_accepted(name: string, c: connection){ 1501 | record_event("authentication_accepted"); 1502 | # Generated when a Telnet authentication has been successful. 1503 | # The Telnet protocol includes options for negotiating authentication. 1504 | # When such an option is sent from client to server and the server replies that it accepts the authentication, 1505 | # then the event engine generates this event. 1506 | 1507 | # Todo 1508 | # Zeek’s current default configuration does not activate the protocol analyzer that generates this event; 1509 | # the corresponding script has not yet been ported. To still enable this event, 1510 | # one needs to add a call to Analyzer::register_for_ports or a DPD payload signature. 1511 | } 1512 | 1513 | event authentication_skipped(c: connection){ 1514 | record_event("authentication_skipped"); 1515 | # Generated for Telnet/Rlogin sessions when a pattern match indicates that 1516 | # no authentication is performed. 1517 | } 1518 | 1519 | event bad_option(c: connection){ 1520 | record_event("bad_option"); 1521 | # Generated for an ill-formed or unrecognized Telnet option. 1522 | } 1523 | 1524 | event bad_option_termination(c: connection){ 1525 | record_event("bad_option_termination"); 1526 | # Generated for a Telnet option that’s incorrectly terminated.

event authentication_skipped(c: connection){
    record_event("authentication_skipped");
    # Generated for Telnet/Rlogin sessions when a pattern match indicates that
    # no authentication is performed.
}

event bad_option(c: connection){
    record_event("bad_option");
    # Generated for an ill-formed or unrecognized Telnet option.
}

event bad_option_termination(c: connection){
    record_event("bad_option_termination");
    # Generated for a Telnet option that's incorrectly terminated.
}

event inconsistent_option(c: connection){
    record_event("inconsistent_option");
    # Generated for an inconsistent Telnet option.
    # Telnet options are specified by the client and server stating
    # which options they are willing to support vs. which they are not,
    # and then instructing one another which in fact they should
    # or should not use for the current connection.
    # If the event engine sees a peer violate either what
    # the other peer has instructed it to do,
    # or what it itself offered in terms of options in the past,
    # then the engine generates this event.
}

event login_confused(c: connection, msg: string, line: string){
    # Generated when tracking of Telnet/Rlogin authentication failed.
    # As Zeek's login analyzer uses a number of heuristics to
    # extract authentication information, it may become confused.
    # If it can no longer correctly track the authentication dialog, it raises this event.
    update_network_event(c, "login_confused", "telnet", "tracking-of-Telnet/Rlogin-authentication-failed", LOGIN_CONFUSED);
}

event login_confused_text(c: connection, line: string){
    # Generated after getting confused while tracking
    # a Telnet/Rlogin authentication dialog.
    # The login analyzer generates this event for every line
    # of user input after it has reported login_confused for a connection.
    update_network_event(c, "login_confused_text", "telnet", "getting-confused-while-tracking-a-Telnet/Rlogin-authentication-dialog", LOGIN_CONFUSED_TEXT);
}

event login_display(c: connection, display: string){
    # record_event("login_display");
    # Generated for clients transmitting an X11 DISPLAY in a Telnet session.
    # This information is extracted out of environment variables sent as Telnet options.
    update_network_event(c, "login_display", "telnet", "clients-transmitting-an-X11-DISPLAY-in-a-Telnet-session", LOGIN_DISPLAY);
}

event login_failure(c: connection, user: string, client_user: string, password: string, line: string){
    record_event("login_failure");
    # Generated for Telnet/Rlogin login failures.
    # The login analyzer inspects Telnet/Rlogin sessions to heuristically extract
    # username and password information as well as the text returned by the login server.
    # This event is raised if a login attempt appears to have been unsuccessful.
}

event login_input_line(c: connection, line: string){
    # Generated for lines of input on Telnet/Rlogin sessions.
    # The line will have control characters (such as in-band Telnet options) removed.
    update_network_event(c, "login_input_line", "telnet", "lines-of-input-on-Telnet/Rlogin-sessions", LOGIN_INPUT_LINE);
}

event login_output_line(c: connection, line: string){
    # Generated for lines of output on Telnet/Rlogin sessions.
    # The line will have control characters (such as in-band Telnet options) removed.
    update_network_event(c, "login_output_line", "telnet", "lines-of-output-on-Telnet/Rlogin-sessions", LOGIN_OUTPUT_LINE);
}

event login_prompt(c: connection, prompt: string){
    record_event("login_prompt");
    # Generated for clients transmitting a terminal prompt in a Telnet session.
    # This information is extracted out of environment variables sent as Telnet options.
}

event login_success(c: connection, user: string, client_user: string, password: string, line: string){
    # Generated for successful Telnet/Rlogin logins.
    # The login analyzer inspects Telnet/Rlogin sessions to heuristically
    # extract username and password information as well as the text
    # returned by the login server.
    # This event is raised if a login attempt appears to have been successful.
    update_network_event(c, "login_success", "telnet", "successful-Telnet/Rlogin-logins", LOGIN_SUCCESS);
}

event login_terminal(c: connection, terminal: string){
    # record_event("login_terminal");
    # Generated for clients transmitting a terminal type in a Telnet session.
    # This information is extracted out of environment variables sent as Telnet options.
    update_network_event(c, "login_terminal", "telnet", "clients-transmitting-a-terminal-type-in-a-Telnet-session", LOGIN_TERMINAL);
}

# phase-4-dump: 526 packets.
# In this phase the attacking host installs the DDoS tool on the target hosts,
# triggering rsh_request and rsh_reply events in large numbers.

event rsh_reply(c: connection, client_user: string, server_user: string, line: string){
    # Generated for server-side replies to commands on an RSH connection.
    # See RFC 1258 for more information about the Rlogin/Rsh protocol.
    local des: string = fmt("rsh-connection-client_user:%s,server_user:%s", client_user, server_user);
    # print c;
    update_network_event(c, "rsh_reply", "rsh", des, RSH_REPLY);
}

event rsh_request(c: connection, client_user: string, server_user: string, line: string, new_session: bool){
    # Generated for client-side commands on an RSH connection.
    # See RFC 1258 for more information about the Rlogin/Rsh protocol.
    local des: string = fmt("rsh-connection-client_user:%s,server_user:%s", client_user, server_user);
    update_network_event(c, "rsh_request", "rsh", des, RSH_REQUEST);
}
# The above are the TELNET-related events provided by Zeek; quite a few of them
# have to be activated manually (by registering ports for their analyzers).

# phase-5-dump: the phase with the most packets (34553); the DDoS tool starts operating.
# HTTP-related events
event http_request(c: connection, method: string, original_URI: string, unescaped_URI: string, version: string){
    record_event("http_request");
    # Generated for HTTP requests.
    # Zeek supports persistent and pipelined HTTP sessions
    # and raises corresponding events as it parses client/server dialogues.
    # This event is generated as soon as a request's initial line has been parsed,
    # and before any http_header events are raised.
}

event http_reply(c: connection, version: string, code: count, reason: string){
    # record_event("http_reply");
    # Generated for HTTP replies.
    # Zeek supports persistent and pipelined HTTP sessions
    # and raises corresponding events as it parses client/server dialogues.
    # This event is generated as soon as a reply's initial line has been parsed,
    # and before any http_header events are raised.
    update_network_event(c, "http_reply", "http", "a-reply's-initial-line-has-been-parsed", HTTP_REPLY);
}

event http_header(c: connection, is_orig: bool, name: string, value: string){
    # record_event("http_header");
    # Generated for HTTP headers.
    # Zeek supports persistent and pipelined HTTP sessions
    # and raises corresponding events as it parses client/server dialogues.
    update_network_event(c, "http_header", "http", "Generated-for-HTTP-headers", HTTP_HEADER);
}

event http_all_headers(c: connection, is_orig: bool, hlist: mime_header_list){
    # record_event("http_all_headers");
    # Generated for HTTP headers, passing on all headers of
    # an HTTP message at once.
    # Zeek supports persistent and pipelined HTTP sessions
    # and raises corresponding events as it parses client/server dialogues.
    update_network_event(c, "http_all_headers", "http", "passing-on-all-headers-of-an-HTTP-message-at-once", HTTP_ALL_HEADERS);
}

event http_begin_entity(c: connection, is_orig: bool){
    # record_event("http_begin_entity");
    # Generated when starting to parse an HTTP body entity.
    # This event is generated at least once for each non-empty
    # (client or server) HTTP body; and potentially more than once
    # if the body contains further nested MIME entities.
    # Zeek raises this event just before it starts parsing each entity's content.
    update_network_event(c, "http_begin_entity", "http", "generated-for-each-non-empty-(client-or-server)-HTTP-body", HTTP_BEGIN_ENTITY);
}

event http_end_entity(c: connection, is_orig: bool){
    # record_event("http_end_entity");
    # Generated when finishing parsing an HTTP body entity.
    # This event is generated at least once for each non-empty
    # (client or server) HTTP body; and potentially more than once
    # if the body contains further nested MIME entities.
    # Zeek raises this event at the point when it has finished
    # parsing an entity's content.
    update_network_event(c, "http_end_entity", "http", "finish-parsing-an-HTTP-body-entity", HTTP_END_ENTITY);
}

event http_entity_data(c: connection, is_orig: bool, length: count, data: string){
    # record_event("http_entity_data");
    # Generated when parsing an HTTP body entity, passing on the data.
    # This event can potentially be raised many times for each entity,
    # each time passing a chunk of the data of not further defined size.

    # A common idiom for using this event is to first reassemble the data
    # at the scripting layer by concatenating it to a successively growing string,
    # and only perform further content analysis once the corresponding http_end_entity event
    # has been raised. Note, however, that doing so can be quite expensive for large HTTP transfers.
    # At the very least, one should impose an upper size limit on how much data is being buffered;
    # a sketch of the idiom follows this handler.
    update_network_event(c, "http_entity_data", "http", "pass-on-the-data", HTTP_ENTITY_DATA);
}
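# A minimal sketch of the buffering idiom described above (illustration only;
# the entity_buf and max_entity_buf names and the 1 MB cap are assumptions,
# and these handlers run in addition to the ones defined in this script):
global entity_buf: table[string] of string &default="";
const max_entity_buf = 1048576; # assumed upper size limit (1 MB)

event http_entity_data(c: connection, is_orig: bool, length: count, data: string){
    # Buffer in-order entity data per connection, up to the size cap.
    if(|entity_buf[c$uid]| < max_entity_buf)
        entity_buf[c$uid] += data;
}

event http_end_entity(c: connection, is_orig: bool){
    # The entity is complete; content analysis on entity_buf[c$uid] would go here.
    if(c$uid in entity_buf)
        delete entity_buf[c$uid];
}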

event http_content_type(c: connection, is_orig: bool, ty: string, subty: string){
    # record_event("http_content_type");
    # Generated for reporting an HTTP body's content type.
    # This event is generated at the end of parsing an HTTP header,
    # passing on the MIME type as specified by the Content-Type header.
    # If that header is missing, this event is still raised with a default value of text/plain.
    update_network_event(c, "http_content_type", "http", "passing-on-the-MIME-type", HTTP_CONTENT_TYPE);
}

event http_message_done(c: connection, is_orig: bool, stat: http_message_stat){
    # record_event("http_message_done");
    # Generated once at the end of parsing an HTTP message.
    # Zeek supports persistent and pipelined HTTP sessions and raises
    # corresponding events as it parses client/server dialogues.
    # A "message" is one top-level HTTP entity, such as a complete request or reply.
    # Each message can have further nested sub-entities inside.
    # This event is raised once all sub-entities belonging to a top-level message
    # have been processed (and their corresponding http_entity_* events generated).
    update_network_event(c, "http_message_done", "http", "all-sub-entities-of-a-top-level-message-have-been-processed", HTTP_MESSAGE_DONE);
}

event http_event(c: connection, event_type: string, detail: string){
    # record_event("http_event");
    # Generated for errors found when decoding HTTP requests or replies.
    update_network_event(c, "http_event", "http", detail, HTTP_EVENT);# note what detail contains
}

event http_stats(c: connection, stats: http_stats_rec){
    # record_event("http_stats");
    # Generated at the end of an HTTP session to report statistics about it.
    # This event is raised after all of an HTTP session's requests and replies have been fully processed.
    update_network_event(c, "http_stats", "http", "an-http-session-finished-and-stats-generated", HTTP_STATS);
}

event http_connection_upgrade(c: connection, protocol: string){
    record_event("http_connection_upgrade");
    # Generated when an HTTP session is upgraded to a different protocol (e.g. websocket).
    # This event is raised when a server replies with an HTTP 101 reply.
    # No more HTTP events will be raised after this event.
}

# Observed HTTP event counts for this trace:
# [http_entity_data] = 226,
# [http_begin_entity] = 55,
# [http_header] = 217,
# [http_reply] = 55,
# [http_all_headers] = 55,
# [http_content_type] = 55,
# [http_message_done] = 55,
# [http_end_entity] = 55,
# [http_stats] = 45,
# [http_event] = 24


function start_analyzers(){
    # enable RPC-based protocol analyzers
    Analyzer::register_for_ports(Analyzer::ANALYZER_PORTMAPPER, pm_ports);
    # enable the telnet protocol analyzer
    Analyzer::register_for_ports(Analyzer::ANALYZER_TELNET, telnet_ports);
    # enable the rsh protocol analyzer
    Analyzer::register_for_ports(Analyzer::ANALYZER_RSH, rsh_ports);
    # Analyzer::enable_analyzer(Analyzer::ANALYZER_TELNET);
    Analyzer::enable_analyzer(Analyzer::ANALYZER_PORTMAPPER);
    for(e in analyzer_tags){
        Analyzer::enable_analyzer(e);
    }
}


function attack_pattern_event_logger(){
    # For data independence, consider using the Input Framework here
    # (see the sketch after this function).
    # On reflection, the ping-sweep pattern does not need the replies at all;
    # but too few pings will not do either, slightly more is appropriate.
    # Each entry encodes "event_type|edge"; an edge like "0>1" points from
    # pattern node 0 to pattern node 1.
    local attack_rel = string_vec("icmp_echo_ping|0>1", "icmp_echo_ping|0>2", "icmp_echo_ping|0>3", "icmp_echo_ping|0>4", "icmp_echo_ping|0>5", "icmp_echo_ping|0>6", "icmp_echo_ping|0>7", "icmp_echo_ping|0>8", "icmp_echo_ping|0>9", "icmp_echo_ping|0>10", "icmp_echo_ping|0>11", "icmp_echo_ping|0>12", "icmp_echo_reply|11>0");
    local tmp_n: int = 0;

    print attack_rel;
    while(tmp_n < |attack_rel|){
        # print type_name(item);
        local tmp_tlb: string_vec = split_string(attack_rel[tmp_n], /\|/);
        local rec: HOST_INFO::pattern_event = [$name="attack_pattern_0", $id=tmp_n, $event_type=tmp_tlb[0], $edge_content=tmp_tlb[1]];
        Log::write(HOST_INFO::ATTACK_PATTERN_EVENT_LOG, rec);
        tmp_n += 1;
    }
}

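# A sketch of the Input Framework idea mentioned above (illustration only; the
# file name attack_patterns.tsv, the PatternLine record, and the pattern_line
# event are assumptions): the pattern edges could be read from a file instead
# of being hard-coded in the logger functions.
type PatternLine: record {
    name: string;
    id: int;
    event_type: string;
    edge_content: string;
};

event pattern_line(desc: Input::EventDescription, tpe: Input::Event, r: PatternLine){
    # Forward each line of the pattern file into the attack-pattern log
    # (assumes HOST_INFO::ATTACK_PATTERN_EVENT_LOG has already been created;
    # this script only creates it in zeek_done via attack_pattern_logger()).
    local rec: HOST_INFO::pattern_event = [$name=r$name, $id=r$id, $event_type=r$event_type, $edge_content=r$edge_content];
    Log::write(HOST_INFO::ATTACK_PATTERN_EVENT_LOG, rec);
}

event zeek_init(){
    # Stream the (assumed) tab-separated pattern file through the event above.
    Input::add_event([$source="attack_patterns.tsv", $name="attack_patterns", $fields=PatternLine, $ev=pattern_line]);
}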
function attack_pattern_event_logger1(){
    # For data independence, consider using the Input Framework here.
    local attack_rel = string_vec("portmap|0>1", "portmap|0>2", "rpc_call|0>1", "rpc_reply|1>0");# portmap|0>1 would get overwritten
    local tmp_n: int = 0;

    print attack_rel;
    while(tmp_n < |attack_rel|){
        # print type_name(item);
        local tmp_tlb: string_vec = split_string(attack_rel[tmp_n], /\|/);
        local rec: HOST_INFO::pattern_event = [$name="attack_pattern_1", $id=tmp_n, $event_type=tmp_tlb[0], $edge_content=tmp_tlb[1]];
        Log::write(HOST_INFO::ATTACK_PATTERN_EVENT_LOG, rec);
        tmp_n += 1;
    }
}

function attack_pattern_event_logger2(){
    # For data independence, consider using the Input Framework here.
    local attack_rel = string_vec("login_output_line|0>3", "login_confused|0>2", "login_success|0>1");# login_success means root privileges were obtained; see CVE-1999-0977
    local tmp_n: int = 0;

    print attack_rel;
    while(tmp_n < |attack_rel|){
        # print type_name(item);
        local tmp_tlb: string_vec = split_string(attack_rel[tmp_n], /\|/);
        local rec: HOST_INFO::pattern_event = [$name="attack_pattern_2", $id=tmp_n, $event_type=tmp_tlb[0], $edge_content=tmp_tlb[1]];
        Log::write(HOST_INFO::ATTACK_PATTERN_EVENT_LOG, rec);
        tmp_n += 1;
    }
}

function attack_pattern_event_logger3(){
    # For data independence, consider using the Input Framework here.
    local attack_rel = string_vec("rsh_request|0>1", "rsh_reply|1>0");
    local tmp_n: int = 0;

    print attack_rel;
    while(tmp_n < |attack_rel|){
        # print type_name(item);
        local tmp_tlb: string_vec = split_string(attack_rel[tmp_n], /\|/);
        local rec: HOST_INFO::pattern_event = [$name="attack_pattern_3", $id=tmp_n, $event_type=tmp_tlb[0], $edge_content=tmp_tlb[1]];
        Log::write(HOST_INFO::ATTACK_PATTERN_EVENT_LOG, rec);
        tmp_n += 1;
    }
}

function attack_pattern_logger(){
    Log::create_stream(HOST_INFO::ATTACK_PATTERN_EVENT_LOG, [$columns=pattern_event, $path="attack_pattern_event"]);
    attack_pattern_event_logger();
    attack_pattern_event_logger1();# provisional arrangement
    attack_pattern_event_logger2();
    attack_pattern_event_logger3();
}

event zeek_init() &priority=10{
    start_analyzers();
    # Analyzer::register_for_ports(Analyzer::ANALYZER_CONTENTS_RPC, pm_ports);
    # Analyzer::enable_analyzer(Analyzer::ANALYZER_CONTENTS_RPC);
    # create our log stream at the very beginning
    Log::create_stream(HOST_INFO::HOST_INFO_LOG, [$columns=host_info, $path="host-info"]);
    # the other log stream outputs a summary of host-info
    Log::create_stream(HOST_INFO::SUMMARY_HOST_LOG, [$columns=host_info, $path="host-summary"]);
    # likewise, create the log stream for the content to be stored in the knowledge graph
    Log::create_stream(HOST_INFO::NET_EVENTS_LOG, [$columns=event_info, $path="network_events"]);# kg_info stores knowledge as (subject, predicate, object) triples
    # some useless fields are filtered out
    local filter: Log::Filter = [$name="without_description", $path="simple_hosts",
        $include=set("ip", "hostname", "username", "mac", "os", "ips", "protocols")];
    Log::add_filter(HOST_INFO::SUMMARY_HOST_LOG, filter);
}

event zeek_done(){
    print "finish";
    for(i in hostlist){
        local rec: HOST_INFO::host_info = hostlist[i];
        Log::write(HOST_INFO::SUMMARY_HOST_LOG, rec);
    }
    # local rec1: HOST_INFO::kg_info = [$ts=network_time(), $A=" ", $predicate=ICMP_ECHO_REQUEST, $B=" "];# triple-log test data
    # Log::write(HOST_INFO::NET_EVENTS_LOG, rec1);
    # print Analyzer::registered_ports(Analyzer::ANALYZER_CONTENTS_RPC);
    # print Analyzer::all_registered_ports();
    # print Analyzer::disabled_analyzers;
    # print likely_server_ports;
    # print num_packets;
    print events_not_recorded;
    attack_pattern_logger();
}

--------------------------------------------------------------------------------