├── LICENSE
├── README.md
├── code
│   ├── README.md
│   ├── config
│   │   ├── Config.py
│   │   ├── EviConfig.py
│   │   └── __init__.py
│   ├── evaluation.py
│   ├── gen_data.py
│   ├── models
│   │   ├── BiLSTM.py
│   │   ├── CNN3.py
│   │   ├── ContextAware.py
│   │   ├── LSTM.py
│   │   ├── LSTM_SP.py
│   │   └── __init__.py
│   ├── prepro_data
│   │   └── README.md
│   ├── requirements.txt
│   ├── test.py
│   ├── test_sp.py
│   ├── train.py
│   └── train_sp.py
└── data
    └── README.md

/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 | 
3 | Copyright (c) 2017 THUNLP
4 | 
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 | 
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 | 
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 | 
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # DocRED
2 | Dataset and baseline code for [DocRED: A Large-Scale Document-Level Relation Extraction Dataset](https://arxiv.org/abs/1906.06127v3)
3 | 
4 | Multiple entities in a document generally exhibit complex inter-sentence relations that cannot be well handled by existing relation extraction (RE) methods, which typically focus on extracting intra-sentence relations for single entity pairs. To accelerate research on document-level RE, we introduce DocRED, a new dataset constructed from Wikipedia and Wikidata with three features:
5 | 
6 | + DocRED annotates both named entities and relations, and is the largest human-annotated dataset for document-level RE from plain text.
7 | + DocRED requires reading multiple sentences in a document to extract entities and infer their relations by synthesizing all information of the document.
8 | + Along with the human-annotated data, we also offer large-scale distantly supervised data, which enables DocRED to be adopted for both supervised and weakly supervised scenarios.
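Each document in the dataset is a JSON object. As a rough sketch of the format (the field names below are the ones consumed by `gen_data.py` and `evaluation.py` in this repository; the concrete values are invented for illustration):

```json
{
  "title": "Example title",
  "sents": [["Alice", "lives", "in", "Paris", "."],
            ["She", "was", "born", "there", "."]],
  "vertexSet": [
    [{"name": "Alice", "sent_id": 0, "pos": [0, 1], "type": "PER"},
     {"name": "She", "sent_id": 1, "pos": [0, 1], "type": "PER"}],
    [{"name": "Paris", "sent_id": 0, "pos": [3, 4], "type": "LOC"}]
  ],
  "labels": [{"h": 0, "t": 1, "r": "P19", "evidence": [0, 1]}]
}
```

`vertexSet[k]` lists all mentions of entity *k* (coreferent mentions share one entry), `pos` is a token span within the mention's sentence, and each label links head entity `h` to tail entity `t` under relation `r`, with `evidence` giving the indices of the supporting sentences.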
9 | 
10 | ## Codalab
11 | If you are interested in our dataset, you are welcome to join the Codalab competition at [DocRED](https://competitions.codalab.org/competitions/20717).
12 | 
13 | 
14 | ## Cite
15 | If you use the dataset or the code, please cite this paper:
16 | ```
17 | @inproceedings{yao2019DocRED,
18 |   title={{DocRED}: A Large-Scale Document-Level Relation Extraction Dataset},
19 |   author={Yao, Yuan and Ye, Deming and Li, Peng and Han, Xu and Lin, Yankai and Liu, Zhenghao and Liu, Zhiyuan and Huang, Lixin and Zhou, Jie and Sun, Maosong},
20 |   booktitle={Proceedings of ACL 2019},
21 |   year={2019}
22 | }
23 | ```
24 | 
--------------------------------------------------------------------------------
/code/README.md:
--------------------------------------------------------------------------------
1 | # Baseline code
2 | 
3 | ## Requirements and Installation
4 | python3
5 | 
6 | pytorch>=1.0
7 | 
8 | ```
9 | pip3 install -r requirements.txt
10 | ```
11 | 
12 | ## Preprocessing data
13 | Download the metadata for the baseline methods from [TsinghuaCloud](https://cloud.tsinghua.edu.cn/d/99e1c0805eb64736af95/) or [GoogleDrive](https://drive.google.com/drive/folders/1Ri3LIILKKBi3aBJjUVCOBpGX5PpONHRK) and put it into the prepro_data folder.
14 | 
15 | 
16 | ```
17 | python3 gen_data.py --in_path ../data --out_path prepro_data
18 | ```
19 | 
20 | ## Relation extraction
21 | 
22 | Training:
23 | ```
24 | CUDA_VISIBLE_DEVICES=0 python3 train.py --model_name BiLSTM --save_name checkpoint_BiLSTM --train_prefix dev_train --test_prefix dev_dev
25 | ```
26 | 
27 | Testing (--test_prefix dev_dev for the dev set, dev_test for the test set):
28 | ```
29 | CUDA_VISIBLE_DEVICES=0 python3 test.py --model_name BiLSTM --save_name checkpoint_BiLSTM --train_prefix dev_train --test_prefix dev_dev --input_theta 0.3601
30 | ```
31 | 
32 | ## Evidence extraction
33 | 
34 | Training:
35 | ```
36 | CUDA_VISIBLE_DEVICES=0 python3 train_sp.py --model_name LSTM_SP --save_name checkpoint_BiLSTMSP --train_prefix dev_train --test_prefix dev_dev
37 | ```
38 | 
39 | Testing:
40 | ```
41 | CUDA_VISIBLE_DEVICES=0 python3 test_sp.py --model_name LSTM_SP --save_name checkpoint_BiLSTMSP --train_prefix dev_train --test_prefix dev_dev --input_theta 0.4619
42 | ```
43 | 
44 | ## Evaluation
45 | 
46 | The dev result can be evaluated with:
47 | ```
48 | python3 evaluation.py result.json ../data/dev.json
49 | ```
50 | 
51 | Test results should be submitted to Codalab.
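For reference, `evaluation.py` reads the submission as a JSON list of predicted triples. A minimal sketch of one entry (these are exactly the fields the script accesses; `evidence` may be omitted, and the values here are illustrative):

```json
[
  {"title": "Example title", "h_idx": 0, "t_idx": 1, "r": "P19", "evidence": [0, 1]}
]
```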
52 | 53 | 54 | 55 | -------------------------------------------------------------------------------- /code/config/Config.py: -------------------------------------------------------------------------------- 1 | # coding: utf-8 2 | import torch 3 | import torch.nn as nn 4 | from torch.autograd import Variable 5 | import torch.optim as optim 6 | import numpy as np 7 | import os 8 | import time 9 | import datetime 10 | import json 11 | import sys 12 | import sklearn.metrics 13 | from tqdm import tqdm 14 | import matplotlib 15 | matplotlib.use('Agg') 16 | import matplotlib.pyplot as plt 17 | import random 18 | from collections import defaultdict 19 | import torch.nn.functional as F 20 | 21 | 22 | IGNORE_INDEX = -100 23 | is_transformer = False 24 | 25 | class Accuracy(object): 26 | def __init__(self): 27 | self.correct = 0 28 | self.total = 0 29 | def add(self, is_correct): 30 | self.total += 1 31 | if is_correct: 32 | self.correct += 1 33 | def get(self): 34 | if self.total == 0: 35 | return 0.0 36 | else: 37 | return float(self.correct) / self.total 38 | def clear(self): 39 | self.correct = 0 40 | self.total = 0 41 | 42 | class Config(object): 43 | def __init__(self, args): 44 | self.acc_NA = Accuracy() 45 | self.acc_not_NA = Accuracy() 46 | self.acc_total = Accuracy() 47 | self.data_path = './prepro_data' 48 | self.use_bag = False 49 | self.use_gpu = True 50 | self.is_training = True 51 | self.max_length = 512 52 | self.pos_num = 2 * self.max_length 53 | self.entity_num = self.max_length 54 | self.relation_num = 97 55 | 56 | self.coref_size = 20 57 | self.entity_type_size = 20 58 | self.max_epoch = 20 59 | self.opt_method = 'Adam' 60 | self.optimizer = None 61 | 62 | self.checkpoint_dir = './checkpoint' 63 | self.fig_result_dir = './fig_result' 64 | self.test_epoch = 5 65 | self.pretrain_model = None 66 | 67 | 68 | self.word_size = 100 69 | self.epoch_range = None 70 | self.cnn_drop_prob = 0.5 # for cnn 71 | self.keep_prob = 0.8 # for lstm 72 | 73 | self.period = 50 74 | 75 | self.batch_size = 40 76 | self.h_t_limit = 1800 77 | 78 | self.test_batch_size = self.batch_size 79 | self.test_relation_limit = 1800 80 | self.char_limit = 16 81 | self.sent_limit = 25 82 | self.dis2idx = np.zeros((512), dtype='int64') 83 | self.dis2idx[1] = 1 84 | self.dis2idx[2:] = 2 85 | self.dis2idx[4:] = 3 86 | self.dis2idx[8:] = 4 87 | self.dis2idx[16:] = 5 88 | self.dis2idx[32:] = 6 89 | self.dis2idx[64:] = 7 90 | self.dis2idx[128:] = 8 91 | self.dis2idx[256:] = 9 92 | self.dis_size = 20 93 | 94 | self.train_prefix = args.train_prefix 95 | self.test_prefix = args.test_prefix 96 | 97 | 98 | if not os.path.exists("log"): 99 | os.mkdir("log") 100 | 101 | def set_data_path(self, data_path): 102 | self.data_path = data_path 103 | def set_max_length(self, max_length): 104 | self.max_length = max_length 105 | self.pos_num = 2 * self.max_length 106 | def set_num_classes(self, num_classes): 107 | self.num_classes = num_classes 108 | def set_window_size(self, window_size): 109 | self.window_size = window_size 110 | def set_word_size(self, word_size): 111 | self.word_size = word_size 112 | def set_max_epoch(self, max_epoch): 113 | self.max_epoch = max_epoch 114 | def set_batch_size(self, batch_size): 115 | self.batch_size = batch_size 116 | def set_opt_method(self, opt_method): 117 | self.opt_method = opt_method 118 | def set_drop_prob(self, drop_prob): 119 | self.drop_prob = drop_prob 120 | def set_checkpoint_dir(self, checkpoint_dir): 121 | self.checkpoint_dir = checkpoint_dir 122 | def set_test_epoch(self, test_epoch): 
123 | self.test_epoch = test_epoch 124 | def set_pretrain_model(self, pretrain_model): 125 | self.pretrain_model = pretrain_model 126 | def set_is_training(self, is_training): 127 | self.is_training = is_training 128 | def set_use_bag(self, use_bag): 129 | self.use_bag = use_bag 130 | def set_use_gpu(self, use_gpu): 131 | self.use_gpu = use_gpu 132 | def set_epoch_range(self, epoch_range): 133 | self.epoch_range = epoch_range 134 | 135 | def load_train_data(self): 136 | print("Reading training data...") 137 | prefix = self.train_prefix 138 | 139 | print ('train', prefix) 140 | self.data_train_word = np.load(os.path.join(self.data_path, prefix+'_word.npy')) 141 | self.data_train_pos = np.load(os.path.join(self.data_path, prefix+'_pos.npy')) 142 | self.data_train_ner = np.load(os.path.join(self.data_path, prefix+'_ner.npy')) 143 | self.data_train_char = np.load(os.path.join(self.data_path, prefix+'_char.npy')) 144 | self.train_file = json.load(open(os.path.join(self.data_path, prefix+'.json'))) 145 | 146 | print("Finish reading") 147 | 148 | self.train_len = ins_num = self.data_train_word.shape[0] 149 | assert(self.train_len==len(self.train_file)) 150 | 151 | self.train_order = list(range(ins_num)) 152 | self.train_batches = ins_num // self.batch_size 153 | if ins_num % self.batch_size != 0: 154 | self.train_batches += 1 155 | 156 | def load_test_data(self): 157 | print("Reading testing data...") 158 | self.data_word_vec = np.load(os.path.join(self.data_path, 'vec.npy')) 159 | self.data_char_vec = np.load(os.path.join(self.data_path, 'char_vec.npy')) 160 | self.rel2id = json.load(open(os.path.join(self.data_path, 'rel2id.json'))) 161 | self.id2rel = {v: k for k,v in self.rel2id.items()} 162 | 163 | prefix = self.test_prefix 164 | print (prefix) 165 | self.is_test = ('dev_test' == prefix) 166 | self.data_test_word = np.load(os.path.join(self.data_path, prefix+'_word.npy')) 167 | self.data_test_pos = np.load(os.path.join(self.data_path, prefix+'_pos.npy')) 168 | self.data_test_ner = np.load(os.path.join(self.data_path, prefix+'_ner.npy')) 169 | self.data_test_char = np.load(os.path.join(self.data_path, prefix+'_char.npy')) 170 | self.test_file = json.load(open(os.path.join(self.data_path, prefix+'.json'))) 171 | 172 | 173 | self.test_len = self.data_test_word.shape[0] 174 | assert(self.test_len==len(self.test_file)) 175 | 176 | 177 | print("Finish reading") 178 | 179 | self.test_batches = self.data_test_word.shape[0] // self.test_batch_size 180 | if self.data_test_word.shape[0] % self.test_batch_size != 0: 181 | self.test_batches += 1 182 | 183 | self.test_order = list(range(self.test_len)) 184 | self.test_order.sort(key=lambda x: np.sum(self.data_test_word[x] > 0), reverse=True) 185 | 186 | 187 | def get_train_batch(self): 188 | random.shuffle(self.train_order) 189 | 190 | context_idxs = torch.LongTensor(self.batch_size, self.max_length).cuda() 191 | context_pos = torch.LongTensor(self.batch_size, self.max_length).cuda() 192 | h_mapping = torch.Tensor(self.batch_size, self.h_t_limit, self.max_length).cuda() 193 | t_mapping = torch.Tensor(self.batch_size, self.h_t_limit, self.max_length).cuda() 194 | relation_multi_label = torch.Tensor(self.batch_size, self.h_t_limit, self.relation_num).cuda() 195 | relation_mask = torch.Tensor(self.batch_size, self.h_t_limit).cuda() 196 | 197 | pos_idx = torch.LongTensor(self.batch_size, self.max_length).cuda() 198 | 199 | context_ner = torch.LongTensor(self.batch_size, self.max_length).cuda() 200 | context_char_idxs = torch.LongTensor(self.batch_size, 
self.max_length, self.char_limit).cuda() 201 | 202 | relation_label = torch.LongTensor(self.batch_size, self.h_t_limit).cuda() 203 | 204 | 205 | ht_pair_pos = torch.LongTensor(self.batch_size, self.h_t_limit).cuda() 206 | 207 | for b in range(self.train_batches): 208 | start_id = b * self.batch_size 209 | cur_bsz = min(self.batch_size, self.train_len - start_id) 210 | cur_batch = list(self.train_order[start_id: start_id + cur_bsz]) 211 | cur_batch.sort(key=lambda x: np.sum(self.data_train_word[x]>0) , reverse = True) 212 | 213 | for mapping in [h_mapping, t_mapping]: 214 | mapping.zero_() 215 | 216 | for mapping in [relation_multi_label, relation_mask, pos_idx]: 217 | mapping.zero_() 218 | 219 | ht_pair_pos.zero_() 220 | 221 | 222 | relation_label.fill_(IGNORE_INDEX) 223 | 224 | max_h_t_cnt = 1 225 | 226 | 227 | for i, index in enumerate(cur_batch): 228 | context_idxs[i].copy_(torch.from_numpy(self.data_train_word[index, :])) 229 | context_pos[i].copy_(torch.from_numpy(self.data_train_pos[index, :])) 230 | context_char_idxs[i].copy_(torch.from_numpy(self.data_train_char[index, :])) 231 | context_ner[i].copy_(torch.from_numpy(self.data_train_ner[index, :])) 232 | 233 | for j in range(self.max_length): 234 | if self.data_train_word[index, j]==0: 235 | break 236 | pos_idx[i, j] = j+1 237 | 238 | ins = self.train_file[index] 239 | labels = ins['labels'] 240 | idx2label = defaultdict(list) 241 | 242 | for label in labels: 243 | idx2label[(label['h'], label['t'])].append(label['r']) 244 | 245 | 246 | 247 | train_tripe = list(idx2label.keys()) 248 | for j, (h_idx, t_idx) in enumerate(train_tripe): 249 | hlist = ins['vertexSet'][h_idx] 250 | tlist = ins['vertexSet'][t_idx] 251 | 252 | for h in hlist: 253 | h_mapping[i, j, h['pos'][0]:h['pos'][1]] = 1.0 / len(hlist) / (h['pos'][1] - h['pos'][0]) 254 | 255 | for t in tlist: 256 | t_mapping[i, j, t['pos'][0]:t['pos'][1]] = 1.0 / len(tlist) / (t['pos'][1] - t['pos'][0]) 257 | 258 | label = idx2label[(h_idx, t_idx)] 259 | 260 | delta_dis = hlist[0]['pos'][0] - tlist[0]['pos'][0] 261 | if delta_dis < 0: 262 | ht_pair_pos[i, j] = -int(self.dis2idx[-delta_dis]) 263 | else: 264 | ht_pair_pos[i, j] = int(self.dis2idx[delta_dis]) 265 | 266 | 267 | for r in label: 268 | relation_multi_label[i, j, r] = 1 269 | 270 | relation_mask[i, j] = 1 271 | rt = np.random.randint(len(label)) 272 | relation_label[i, j] = label[rt] 273 | 274 | 275 | 276 | lower_bound = len(ins['na_triple']) 277 | # random.shuffle(ins['na_triple']) 278 | # lower_bound = max(20, len(train_tripe)*3) 279 | 280 | 281 | for j, (h_idx, t_idx) in enumerate(ins['na_triple'][:lower_bound], len(train_tripe)): 282 | hlist = ins['vertexSet'][h_idx] 283 | tlist = ins['vertexSet'][t_idx] 284 | 285 | for h in hlist: 286 | h_mapping[i, j, h['pos'][0]:h['pos'][1]] = 1.0 / len(hlist) / (h['pos'][1] - h['pos'][0]) 287 | 288 | for t in tlist: 289 | t_mapping[i, j, t['pos'][0]:t['pos'][1]] = 1.0 / len(tlist) / (t['pos'][1] - t['pos'][0]) 290 | 291 | relation_multi_label[i, j, 0] = 1 292 | relation_label[i, j] = 0 293 | relation_mask[i, j] = 1 294 | delta_dis = hlist[0]['pos'][0] - tlist[0]['pos'][0] 295 | if delta_dis < 0: 296 | ht_pair_pos[i, j] = -int(self.dis2idx[-delta_dis]) 297 | else: 298 | ht_pair_pos[i, j] = int(self.dis2idx[delta_dis]) 299 | 300 | max_h_t_cnt = max(max_h_t_cnt, len(train_tripe) + lower_bound) 301 | 302 | 303 | input_lengths = (context_idxs[:cur_bsz] > 0).long().sum(dim=1) 304 | max_c_len = int(input_lengths.max()) 305 | 306 | yield {'context_idxs': context_idxs[:cur_bsz, 
:max_c_len].contiguous(), 307 | 'context_pos': context_pos[:cur_bsz, :max_c_len].contiguous(), 308 | 'h_mapping': h_mapping[:cur_bsz, :max_h_t_cnt, :max_c_len], 309 | 't_mapping': t_mapping[:cur_bsz, :max_h_t_cnt, :max_c_len], 310 | 'relation_label': relation_label[:cur_bsz, :max_h_t_cnt].contiguous(), 311 | 'input_lengths' : input_lengths, 312 | 'pos_idx': pos_idx[:cur_bsz, :max_c_len].contiguous(), 313 | 'relation_multi_label': relation_multi_label[:cur_bsz, :max_h_t_cnt], 314 | 'relation_mask': relation_mask[:cur_bsz, :max_h_t_cnt], 315 | 'context_ner': context_ner[:cur_bsz, :max_c_len].contiguous(), 316 | 'context_char_idxs': context_char_idxs[:cur_bsz, :max_c_len].contiguous(), 317 | 'ht_pair_pos': ht_pair_pos[:cur_bsz, :max_h_t_cnt], 318 | } 319 | 320 | def get_test_batch(self): 321 | context_idxs = torch.LongTensor(self.test_batch_size, self.max_length).cuda() 322 | context_pos = torch.LongTensor(self.test_batch_size, self.max_length).cuda() 323 | h_mapping = torch.Tensor(self.test_batch_size, self.test_relation_limit, self.max_length).cuda() 324 | t_mapping = torch.Tensor(self.test_batch_size, self.test_relation_limit, self.max_length).cuda() 325 | context_ner = torch.LongTensor(self.test_batch_size, self.max_length).cuda() 326 | context_char_idxs = torch.LongTensor(self.test_batch_size, self.max_length, self.char_limit).cuda() 327 | relation_mask = torch.Tensor(self.test_batch_size, self.h_t_limit).cuda() 328 | ht_pair_pos = torch.LongTensor(self.test_batch_size, self.h_t_limit).cuda() 329 | 330 | for b in range(self.test_batches): 331 | start_id = b * self.test_batch_size 332 | cur_bsz = min(self.test_batch_size, self.test_len - start_id) 333 | cur_batch = list(self.test_order[start_id : start_id + cur_bsz]) 334 | 335 | for mapping in [h_mapping, t_mapping, relation_mask]: 336 | mapping.zero_() 337 | 338 | 339 | ht_pair_pos.zero_() 340 | 341 | max_h_t_cnt = 1 342 | 343 | cur_batch.sort(key=lambda x: np.sum(self.data_test_word[x]>0) , reverse = True) 344 | 345 | labels = [] 346 | 347 | L_vertex = [] 348 | titles = [] 349 | indexes = [] 350 | for i, index in enumerate(cur_batch): 351 | context_idxs[i].copy_(torch.from_numpy(self.data_test_word[index, :])) 352 | context_pos[i].copy_(torch.from_numpy(self.data_test_pos[index, :])) 353 | context_char_idxs[i].copy_(torch.from_numpy(self.data_test_char[index, :])) 354 | context_ner[i].copy_(torch.from_numpy(self.data_test_ner[index, :])) 355 | 356 | 357 | 358 | idx2label = defaultdict(list) 359 | ins = self.test_file[index] 360 | 361 | for label in ins['labels']: 362 | idx2label[(label['h'], label['t'])].append(label['r']) 363 | 364 | 365 | 366 | L = len(ins['vertexSet']) 367 | titles.append(ins['title']) 368 | 369 | j = 0 370 | for h_idx in range(L): 371 | for t_idx in range(L): 372 | if h_idx != t_idx: 373 | hlist = ins['vertexSet'][h_idx] 374 | tlist = ins['vertexSet'][t_idx] 375 | 376 | for h in hlist: 377 | h_mapping[i, j, h['pos'][0]:h['pos'][1]] = 1.0 / len(hlist) / (h['pos'][1] - h['pos'][0]) 378 | for t in tlist: 379 | t_mapping[i, j, t['pos'][0]:t['pos'][1]] = 1.0 / len(tlist) / (t['pos'][1] - t['pos'][0]) 380 | 381 | relation_mask[i, j] = 1 382 | 383 | delta_dis = hlist[0]['pos'][0] - tlist[0]['pos'][0] 384 | if delta_dis < 0: 385 | ht_pair_pos[i, j] = -int(self.dis2idx[-delta_dis]) 386 | else: 387 | ht_pair_pos[i, j] = int(self.dis2idx[delta_dis]) 388 | j += 1 389 | 390 | 391 | max_h_t_cnt = max(max_h_t_cnt, j) 392 | label_set = {} 393 | for label in ins['labels']: 394 | label_set[(label['h'], label['t'], label['r'])] = 
label['in'+self.train_prefix] 395 | 396 | labels.append(label_set) 397 | 398 | 399 | L_vertex.append(L) 400 | indexes.append(index) 401 | 402 | 403 | 404 | input_lengths = (context_idxs[:cur_bsz] > 0).long().sum(dim=1) 405 | max_c_len = int(input_lengths.max()) 406 | 407 | 408 | yield {'context_idxs': context_idxs[:cur_bsz, :max_c_len].contiguous(), 409 | 'context_pos': context_pos[:cur_bsz, :max_c_len].contiguous(), 410 | 'h_mapping': h_mapping[:cur_bsz, :max_h_t_cnt, :max_c_len], 411 | 't_mapping': t_mapping[:cur_bsz, :max_h_t_cnt, :max_c_len], 412 | 'labels': labels, 413 | 'L_vertex': L_vertex, 414 | 'input_lengths': input_lengths, 415 | 'context_ner': context_ner[:cur_bsz, :max_c_len].contiguous(), 416 | 'context_char_idxs': context_char_idxs[:cur_bsz, :max_c_len].contiguous(), 417 | 'relation_mask': relation_mask[:cur_bsz, :max_h_t_cnt], 418 | 'titles': titles, 419 | 'ht_pair_pos': ht_pair_pos[:cur_bsz, :max_h_t_cnt], 420 | 'indexes': indexes 421 | } 422 | 423 | def train(self, model_pattern, model_name): 424 | 425 | ori_model = model_pattern(config = self) 426 | if self.pretrain_model != None: 427 | ori_model.load_state_dict(torch.load(self.pretrain_model)) 428 | ori_model.cuda() 429 | model = nn.DataParallel(ori_model) 430 | 431 | optimizer = optim.Adam(filter(lambda p: p.requires_grad, model.parameters())) 432 | # nll_average = nn.CrossEntropyLoss(size_average=True, ignore_index=IGNORE_INDEX) 433 | BCE = nn.BCEWithLogitsLoss(reduction='none') 434 | 435 | if not os.path.exists(self.checkpoint_dir): 436 | os.mkdir(self.checkpoint_dir) 437 | 438 | best_auc = 0.0 439 | best_f1 = 0.0 440 | best_epoch = 0 441 | 442 | model.train() 443 | 444 | global_step = 0 445 | total_loss = 0 446 | start_time = time.time() 447 | 448 | def logging(s, print_=True, log_=True): 449 | if print_: 450 | print(s) 451 | if log_: 452 | with open(os.path.join(os.path.join("log", model_name)), 'a+') as f_log: 453 | f_log.write(s + '\n') 454 | 455 | plt.xlabel('Recall') 456 | plt.ylabel('Precision') 457 | plt.ylim(0.3, 1.0) 458 | plt.xlim(0.0, 0.4) 459 | plt.title('Precision-Recall') 460 | plt.grid(True) 461 | 462 | for epoch in range(self.max_epoch): 463 | 464 | self.acc_NA.clear() 465 | self.acc_not_NA.clear() 466 | self.acc_total.clear() 467 | 468 | for data in self.get_train_batch(): 469 | 470 | context_idxs = data['context_idxs'] 471 | context_pos = data['context_pos'] 472 | h_mapping = data['h_mapping'] 473 | t_mapping = data['t_mapping'] 474 | relation_label = data['relation_label'] 475 | input_lengths = data['input_lengths'] 476 | relation_multi_label = data['relation_multi_label'] 477 | relation_mask = data['relation_mask'] 478 | context_ner = data['context_ner'] 479 | context_char_idxs = data['context_char_idxs'] 480 | ht_pair_pos = data['ht_pair_pos'] 481 | 482 | 483 | dis_h_2_t = ht_pair_pos+10 484 | dis_t_2_h = -ht_pair_pos+10 485 | 486 | 487 | predict_re = model(context_idxs, context_pos, context_ner, context_char_idxs, input_lengths, h_mapping, t_mapping, relation_mask, dis_h_2_t, dis_t_2_h) 488 | loss = torch.sum(BCE(predict_re, relation_multi_label)*relation_mask.unsqueeze(2)) / (self.relation_num * torch.sum(relation_mask)) 489 | 490 | 491 | output = torch.argmax(predict_re, dim=-1) 492 | output = output.data.cpu().numpy() 493 | 494 | optimizer.zero_grad() 495 | loss.backward() 496 | optimizer.step() 497 | 498 | relation_label = relation_label.data.cpu().numpy() 499 | 500 | for i in range(output.shape[0]): 501 | for j in range(output.shape[1]): 502 | label = relation_label[i][j] 503 | if label<0: 
504 | break 505 | 506 | if label == 0: 507 | self.acc_NA.add(output[i][j] == label) 508 | else: 509 | self.acc_not_NA.add(output[i][j] == label) 510 | 511 | self.acc_total.add(output[i][j] == label) 512 | 513 | global_step += 1 514 | total_loss += loss.item() 515 | 516 | if global_step % self.period == 0 : 517 | cur_loss = total_loss / self.period 518 | elapsed = time.time() - start_time 519 | logging('| epoch {:2d} | step {:4d} | ms/b {:5.2f} | train loss {:5.3f} | NA acc: {:4.2f} | not NA acc: {:4.2f} | tot acc: {:4.2f} '.format(epoch, global_step, elapsed * 1000 / self.period, cur_loss, self.acc_NA.get(), self.acc_not_NA.get(), self.acc_total.get())) 520 | total_loss = 0 521 | start_time = time.time() 522 | 523 | 524 | 525 | if (epoch+1) % self.test_epoch == 0: 526 | logging('-' * 89) 527 | eval_start_time = time.time() 528 | model.eval() 529 | f1, auc, pr_x, pr_y = self.test(model, model_name) 530 | model.train() 531 | logging('| epoch {:3d} | time: {:5.2f}s'.format(epoch, time.time() - eval_start_time)) 532 | logging('-' * 89) 533 | 534 | 535 | if f1 > best_f1: 536 | best_f1 = f1 537 | best_auc = auc 538 | best_epoch = epoch 539 | path = os.path.join(self.checkpoint_dir, model_name) 540 | torch.save(ori_model.state_dict(), path) 541 | 542 | plt.plot(pr_x, pr_y, lw=2, label=str(epoch)) 543 | plt.legend(loc="upper right") 544 | plt.savefig(os.path.join("fig_result", model_name)) 545 | 546 | print("Finish training") 547 | print("Best epoch = %d | auc = %f" % (best_epoch, best_auc)) 548 | print("Storing best result...") 549 | print("Finish storing") 550 | 551 | def test(self, model, model_name, output=False, input_theta=-1): 552 | data_idx = 0 553 | eval_start_time = time.time() 554 | # test_result_ignore = [] 555 | total_recall_ignore = 0 556 | 557 | test_result = [] 558 | total_recall = 0 559 | top1_acc = have_label = 0 560 | 561 | def logging(s, print_=True, log_=True): 562 | if print_: 563 | print(s) 564 | if log_: 565 | with open(os.path.join(os.path.join("log", model_name)), 'a+') as f_log: 566 | f_log.write(s + '\n') 567 | 568 | 569 | 570 | for data in self.get_test_batch(): 571 | with torch.no_grad(): 572 | context_idxs = data['context_idxs'] 573 | context_pos = data['context_pos'] 574 | h_mapping = data['h_mapping'] 575 | t_mapping = data['t_mapping'] 576 | labels = data['labels'] 577 | L_vertex = data['L_vertex'] 578 | input_lengths = data['input_lengths'] 579 | context_ner = data['context_ner'] 580 | context_char_idxs = data['context_char_idxs'] 581 | relation_mask = data['relation_mask'] 582 | ht_pair_pos = data['ht_pair_pos'] 583 | 584 | titles = data['titles'] 585 | indexes = data['indexes'] 586 | 587 | dis_h_2_t = ht_pair_pos+10 588 | dis_t_2_h = -ht_pair_pos+10 589 | 590 | predict_re = model(context_idxs, context_pos, context_ner, context_char_idxs, input_lengths, 591 | h_mapping, t_mapping, relation_mask, dis_h_2_t, dis_t_2_h) 592 | 593 | predict_re = torch.sigmoid(predict_re) 594 | 595 | predict_re = predict_re.data.cpu().numpy() 596 | 597 | for i in range(len(labels)): 598 | label = labels[i] 599 | index = indexes[i] 600 | 601 | 602 | total_recall += len(label) 603 | for l in label.values(): 604 | if not l: 605 | total_recall_ignore += 1 606 | 607 | L = L_vertex[i] 608 | j = 0 609 | 610 | for h_idx in range(L): 611 | for t_idx in range(L): 612 | if h_idx != t_idx: 613 | r = np.argmax(predict_re[i, j]) 614 | if (h_idx, t_idx, r) in label: 615 | top1_acc += 1 616 | 617 | flag = False 618 | 619 | for r in range(1, self.relation_num): 620 | intrain = False 621 | 622 | if 
(h_idx, t_idx, r) in label: 623 | flag = True 624 | if label[(h_idx, t_idx, r)]==True: 625 | intrain = True 626 | 627 | 628 | # if not intrain: 629 | # test_result_ignore.append( ((h_idx, t_idx, r) in label, float(predict_re[i,j,r]), titles[i], self.id2rel[r], index, h_idx, t_idx, r) ) 630 | 631 | test_result.append( ((h_idx, t_idx, r) in label, float(predict_re[i,j,r]), intrain, titles[i], self.id2rel[r], index, h_idx, t_idx, r) ) 632 | 633 | if flag: 634 | have_label += 1 635 | 636 | j += 1 637 | 638 | 639 | data_idx += 1 640 | 641 | if data_idx % self.period == 0: 642 | print('| step {:3d} | time: {:5.2f}'.format(data_idx // self.period, (time.time() - eval_start_time))) 643 | eval_start_time = time.time() 644 | 645 | # test_result_ignore.sort(key=lambda x: x[1], reverse=True) 646 | test_result.sort(key = lambda x: x[1], reverse=True) 647 | 648 | print ('total_recall', total_recall) 649 | # plt.xlabel('Recall') 650 | # plt.ylabel('Precision') 651 | # plt.ylim(0.2, 1.0) 652 | # plt.xlim(0.0, 0.6) 653 | # plt.title('Precision-Recall') 654 | # plt.grid(True) 655 | 656 | pr_x = [] 657 | pr_y = [] 658 | correct = 0 659 | w = 0 660 | 661 | if total_recall == 0: 662 | total_recall = 1 # for test 663 | 664 | for i, item in enumerate(test_result): 665 | correct += item[0] 666 | pr_y.append(float(correct) / (i + 1)) 667 | pr_x.append(float(correct) / total_recall) 668 | if item[1] > input_theta: 669 | w = i 670 | 671 | 672 | pr_x = np.asarray(pr_x, dtype='float32') 673 | pr_y = np.asarray(pr_y, dtype='float32') 674 | f1_arr = (2 * pr_x * pr_y / (pr_x + pr_y + 1e-20)) 675 | f1 = f1_arr.max() 676 | f1_pos = f1_arr.argmax() 677 | theta = test_result[f1_pos][1] 678 | 679 | if input_theta==-1: 680 | w = f1_pos 681 | input_theta = theta 682 | 683 | auc = sklearn.metrics.auc(x = pr_x, y = pr_y) 684 | if not self.is_test: 685 | logging('ALL : Theta {:3.4f} | F1 {:3.4f} | AUC {:3.4f}'.format(theta, f1, auc)) 686 | else: 687 | logging('ma_f1 {:3.4f} | input_theta {:3.4f} test_result F1 {:3.4f} | AUC {:3.4f}'.format(f1, input_theta, f1_arr[w], auc)) 688 | 689 | if output: 690 | # output = [x[-4:] for x in test_result[:w+1]] 691 | output = [{'index': x[-4], 'h_idx': x[-3], 't_idx': x[-2], 'r_idx': x[-1], 'r': x[-5], 'title': x[-6]} for x in test_result[:w+1]] 692 | json.dump(output, open(self.test_prefix + "_index.json", "w")) 693 | 694 | # plt.plot(pr_x, pr_y, lw=2, label=model_name) 695 | # plt.legend(loc="upper right") 696 | if not os.path.exists(self.fig_result_dir): 697 | os.mkdir(self.fig_result_dir) 698 | # plt.savefig(os.path.join(self.fig_result_dir, model_name)) 699 | 700 | pr_x = [] 701 | pr_y = [] 702 | correct = correct_in_train = 0 703 | w = 0 704 | for i, item in enumerate(test_result): 705 | correct += item[0] 706 | if item[0] & item[2]: 707 | correct_in_train += 1 708 | if correct_in_train==correct: 709 | p = 0 710 | else: 711 | p = float(correct - correct_in_train) / (i + 1 - correct_in_train) 712 | pr_y.append(p) 713 | pr_x.append(float(correct) / total_recall) 714 | if item[1] > input_theta: 715 | w = i 716 | 717 | pr_x = np.asarray(pr_x, dtype='float32') 718 | pr_y = np.asarray(pr_y, dtype='float32') 719 | f1_arr = (2 * pr_x * pr_y / (pr_x + pr_y + 1e-20)) 720 | f1 = f1_arr.max() 721 | 722 | auc = sklearn.metrics.auc(x = pr_x, y = pr_y) 723 | 724 | logging('Ignore ma_f1 {:3.4f} | input_theta {:3.4f} test_result F1 {:3.4f} | AUC {:3.4f}'.format(f1, input_theta, f1_arr[w], auc)) 725 | 726 | return f1, auc, pr_x, pr_y 727 | 728 | 729 | 730 | def testall(self, model_pattern, model_name, 
input_theta):#, ignore_input_theta): 731 | model = model_pattern(config = self) 732 | 733 | model.load_state_dict(torch.load(os.path.join(self.checkpoint_dir, model_name))) 734 | model.cuda() 735 | model.eval() 736 | f1, auc, pr_x, pr_y = self.test(model, model_name, True, input_theta) 737 | -------------------------------------------------------------------------------- /code/config/EviConfig.py: -------------------------------------------------------------------------------- 1 | # coding: utf-8 2 | import torch 3 | import torch.nn as nn 4 | from torch.autograd import Variable 5 | import torch.optim as optim 6 | import numpy as np 7 | import os 8 | import time 9 | import datetime 10 | import json 11 | import sys 12 | import sklearn.metrics 13 | from tqdm import tqdm 14 | import matplotlib 15 | matplotlib.use('Agg') 16 | import matplotlib.pyplot as plt 17 | import random 18 | from collections import defaultdict 19 | import torch.nn.functional as F 20 | 21 | 22 | IGNORE_INDEX = -100 23 | TRAIN_LIMIT = 3600 24 | test_evidence = False 25 | 26 | class Accuracy(object): 27 | def __init__(self): 28 | self.correct = 0 29 | self.total = 0 30 | def add(self, is_correct): 31 | self.total += 1 32 | if is_correct: 33 | self.correct += 1 34 | def get(self): 35 | if self.total == 0: 36 | return 0.0 37 | else: 38 | return float(self.correct) / self.total 39 | def clear(self): 40 | self.correct = 0 41 | self.total = 0 42 | 43 | class EviConfig(object): 44 | def __init__(self, args): 45 | self.acc_NA = Accuracy() 46 | self.acc_not_NA = Accuracy() 47 | self.acc_total = Accuracy() 48 | self.data_path = './prepro_data' 49 | self.use_bag = False 50 | self.use_gpu = True 51 | self.is_training = True 52 | self.max_length = 512 53 | self.pos_num = 2 * self.max_length 54 | self.entity_num = self.max_length 55 | self.relation_num = 97 56 | self.coref_size = 20 57 | self.entity_type_size = 20 58 | self.max_epoch = 20 59 | self.opt_method = 'Adam' 60 | self.optimizer = None 61 | self.drop_prob = 0.5 # for cnn 62 | self.keep_prob = 0.8 # for lstm 63 | self.checkpoint_dir = './checkpoint' 64 | self.test_result_dir = './test_result' 65 | self.test_epoch = 5 66 | self.pretrain_model = None 67 | 68 | 69 | self.word_size = 100 70 | self.epoch_range = None 71 | self.dropout = 0.5 72 | self.period = 50 73 | 74 | self.ins_batch_size = 40 75 | self.test_ins_batch_size = self.ins_batch_size 76 | self.batch_size = 4000 77 | 78 | 79 | self.char_limit = 16 80 | self.sent_limit = 25 81 | self.dis2idx = np.zeros((512), dtype='int64') 82 | self.dis2idx[1] = 1 83 | self.dis2idx[2:] = 2 84 | self.dis2idx[4:] = 3 85 | self.dis2idx[8:] = 4 86 | self.dis2idx[16:] = 5 87 | self.dis2idx[32:] = 6 88 | self.dis2idx[64:] = 7 89 | self.dis2idx[128:] = 8 90 | self.dis2idx[256:] = 9 91 | self.dis_size = 20 92 | 93 | self.train_prefix = args.train_prefix 94 | self.test_prefix = args.test_prefix 95 | self.output_file = args.output_file 96 | 97 | 98 | def set_data_path(self, data_path): 99 | self.data_path = data_path 100 | def set_max_length(self, max_length): 101 | self.max_length = max_length 102 | self.pos_num = 2 * self.max_length 103 | def set_num_classes(self, num_classes): 104 | self.num_classes = num_classes 105 | def set_window_size(self, window_size): 106 | self.window_size = window_size 107 | def set_pos_size(self, pos_size): 108 | self.pos_size = pos_size 109 | def set_word_size(self, word_size): 110 | self.word_size = word_size 111 | def set_max_epoch(self, max_epoch): 112 | self.max_epoch = max_epoch 113 | def set_batch_size(self, 
batch_size): 114 | self.batch_size = batch_size 115 | def set_opt_method(self, opt_method): 116 | self.opt_method = opt_method 117 | def set_drop_prob(self, drop_prob): 118 | self.drop_prob = drop_prob 119 | def set_checkpoint_dir(self, checkpoint_dir): 120 | self.checkpoint_dir = checkpoint_dir 121 | def set_test_epoch(self, test_epoch): 122 | self.test_epoch = test_epoch 123 | def set_pretrain_model(self, pretrain_model): 124 | self.pretrain_model = pretrain_model 125 | def set_is_training(self, is_training): 126 | self.is_training = is_training 127 | def set_use_bag(self, use_bag): 128 | self.use_bag = use_bag 129 | def set_use_gpu(self, use_gpu): 130 | self.use_gpu = use_gpu 131 | def set_epoch_range(self, epoch_range): 132 | self.epoch_range = epoch_range 133 | 134 | def load_train_data(self): 135 | print("Reading training data...") 136 | 137 | prefix = 'dev_train' 138 | self.data_train_word = np.load(os.path.join(self.data_path, prefix+'_word.npy')) 139 | self.data_train_pos = np.load(os.path.join(self.data_path, prefix+'_pos.npy')) 140 | self.data_train_ner = np.load(os.path.join(self.data_path, prefix+'_ner.npy')) 141 | self.data_train_char = np.load(os.path.join(self.data_path, prefix+'_char.npy')) 142 | self.train_file = json.load(open(os.path.join(self.data_path, prefix+'.json'))) 143 | 144 | print("Finish reading") 145 | 146 | self.train_len = ins_num = self.data_train_word.shape[0] 147 | assert(self.train_len==len(self.train_file)) 148 | 149 | self.train_order = list(range(ins_num)) 150 | self.train_batches = ins_num // self.ins_batch_size 151 | if ins_num % self.ins_batch_size != 0: 152 | self.train_batches += 1 153 | 154 | def load_test_data(self): 155 | print("Reading testing data...") 156 | 157 | self.data_char_vec = np.load(os.path.join(self.data_path, 'char_vec.npy')) 158 | self.data_word_vec = np.load(os.path.join(self.data_path, 'vec.npy')) 159 | self.rel2id = json.load(open(os.path.join(self.data_path, 'rel2id.json'))) 160 | self.id2rel = {v: k for k,v in self.rel2id.items()} 161 | 162 | prefix = self.test_prefix 163 | print (prefix) 164 | self.data_test_word = np.load(os.path.join(self.data_path, prefix+'_word.npy')) 165 | self.data_test_pos = np.load(os.path.join(self.data_path, prefix+'_pos.npy')) 166 | self.data_test_ner = np.load(os.path.join(self.data_path, prefix+'_ner.npy')) 167 | self.data_test_char = np.load(os.path.join(self.data_path, prefix+'_char.npy')) 168 | self.test_file = json.load(open(os.path.join(self.data_path, prefix+'.json'))) 169 | 170 | self.test_len = self.data_test_word.shape[0] 171 | assert(self.test_len==len(self.test_file)) 172 | 173 | 174 | self.test_index = json.load(open(prefix+"_index.json")) 175 | 176 | 177 | self.total_evidence_recall = 0 178 | for ins in self.test_file: 179 | for label in ins['labels']: 180 | evidence = [int(e) for e in label['evidence']] 181 | self.total_evidence_recall += len(evidence) 182 | 183 | print ("total_evidence_recall:", self.total_evidence_recall) 184 | print ("Finish reading") 185 | 186 | self.test_batches = self.data_test_word.shape[0] // self.test_ins_batch_size 187 | if self.data_test_word.shape[0] % self.test_ins_batch_size != 0: 188 | self.test_batches += 1 189 | 190 | 191 | cur_batch = list(range(self.test_len)) 192 | cur_batch.sort(key=lambda x: len(self.test_file[x]['vertexSet'])) 193 | i = 0 194 | j = self.test_len-1 195 | # small vertexSet + big vertexSet as a pair 196 | self.test_order = [] 197 | while i <= j: 198 | self.test_order.append(cur_batch[i]) 199 | i += 1 200 | if i>j: 201 | break 
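            # Added illustration (not in the original file): with test_len = 5
            # and cur_batch sorted by vertexSet size in ascending order, this
            # loop emits the order [0, 4, 1, 3, 2], pairing a small-vertexSet
            # document with a large one, presumably to balance the number of
            # entity pairs across test batches.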
202 | 203 | self.test_order.append(cur_batch[j]) 204 | j -= 1 205 | 206 | assert(len(self.test_order)==self.test_len) 207 | 208 | 209 | def get_N2_train_batch(self): 210 | random.shuffle(self.train_order) 211 | 212 | context_idxs = torch.LongTensor(self.batch_size, self.max_length).cuda() 213 | context_pos = torch.LongTensor(self.batch_size, self.max_length).cuda() 214 | 215 | context_ner = torch.LongTensor(self.batch_size, self.max_length).cuda() 216 | context_char_idxs = torch.LongTensor(self.batch_size, self.max_length, self.char_limit).cuda() 217 | 218 | relation_label = torch.LongTensor(self.batch_size).cuda() 219 | evidence_label = torch.Tensor(self.batch_size, self.sent_limit).cuda() 220 | sent_mask = torch.Tensor(self.batch_size, self.sent_limit).cuda() 221 | 222 | sent_h_mapping = torch.Tensor(self.batch_size, self.sent_limit, self.max_length).cuda() 223 | sent_t_mapping = torch.Tensor(self.batch_size, self.sent_limit, self.max_length).cuda() 224 | 225 | 226 | for b in range(self.train_batches): 227 | start_id = b * self.ins_batch_size 228 | cur_bsz = min(self.ins_batch_size, self.train_len - start_id) 229 | cur_batch = list(self.train_order[start_id: start_id + cur_bsz]) 230 | cur_batch.sort(key=lambda x: np.sum(self.data_train_word[x]>0) , reverse = True) 231 | 232 | for mapping in [sent_h_mapping, sent_t_mapping, sent_mask, evidence_label, context_pos, relation_label]: 233 | mapping.zero_() 234 | 235 | max_sents = 0 236 | i = 0 237 | for w, index in enumerate(cur_batch): 238 | ins = self.train_file[index] 239 | Ls = ins['Ls'] 240 | max_sents = max(max_sents, len(Ls) - 1) 241 | random.shuffle(ins['labels']) 242 | for label in ins['labels']: 243 | context_idxs[i].copy_(torch.from_numpy(self.data_train_word[index, :])) 244 | context_char_idxs[i].copy_(torch.from_numpy(self.data_train_char[index, :])) 245 | context_ner[i].copy_(torch.from_numpy(self.data_train_ner[index, :])) 246 | relation_label[i] = label['r'] 247 | 248 | h_idx = label['h'] 249 | t_idx = label['t'] 250 | 251 | hlist = ins['vertexSet'][h_idx] 252 | tlist = ins['vertexSet'][t_idx] 253 | 254 | for h in hlist: 255 | context_pos[i, h['pos'][0]:h['pos'][1]] = 1 256 | 257 | for t in tlist: 258 | context_pos[i, t['pos'][0]:t['pos'][1]] = 2 259 | 260 | for e in label['evidence']: 261 | evidence_label[i, int(e)] = 1 262 | 263 | for j in range(len(Ls) - 1): 264 | sent_h_mapping[i, j, Ls[j]] = 1 265 | sent_t_mapping[i, j, Ls[j + 1] - 1] = 1 266 | sent_mask[i, j] = 1 267 | 268 | 269 | i += 1 270 | if i == self.batch_size: 271 | break 272 | if i == self.batch_size: 273 | break 274 | 275 | 276 | 277 | cur_bsz = i 278 | input_lengths = (context_idxs[:cur_bsz] > 0).long().sum(dim=1) 279 | max_c_len = int(input_lengths.max()) 280 | 281 | yield {'context_idxs': context_idxs[:cur_bsz, :max_c_len].contiguous(), 282 | 'context_pos': context_pos[:cur_bsz, :max_c_len].contiguous(), 283 | 'relation_label': relation_label[:cur_bsz].contiguous(), 284 | 'input_lengths' : input_lengths, 285 | 'context_ner': context_ner[:cur_bsz, :max_c_len].contiguous(), 286 | 'context_char_idxs': context_char_idxs[:cur_bsz, :max_c_len].contiguous(), 287 | 'sent_h_mapping': sent_h_mapping[:cur_bsz, :max_sents, :max_c_len], 288 | 'sent_t_mapping': sent_t_mapping[:cur_bsz, :max_sents, :max_c_len], 289 | 'sent_mask': sent_mask[:cur_bsz, :max_sents], 290 | 'evidence_label': evidence_label[:cur_bsz, :max_sents] 291 | } 292 | 293 | 294 | def get_real_test_batch(self): 295 | 296 | 297 | self.test_len = len(self.test_index) 298 | 299 | self.test_order = 
list(range(self.test_len)) 300 | self.test_batches = self.test_len // self.batch_size 301 | if self.test_len % self.batch_size != 0: 302 | self.test_batches += 1 303 | 304 | context_idxs = torch.LongTensor(self.batch_size, self.max_length).cuda() 305 | context_pos = torch.LongTensor(self.batch_size, self.max_length).cuda() 306 | 307 | context_ner = torch.LongTensor(self.batch_size, self.max_length).cuda() 308 | context_char_idxs = torch.LongTensor(self.batch_size, self.max_length, self.char_limit).cuda() 309 | 310 | relation_label = torch.LongTensor(self.batch_size).cuda() 311 | 312 | sent_mask = torch.Tensor(self.batch_size, self.sent_limit).cuda() 313 | 314 | sent_h_mapping = torch.Tensor(self.batch_size, self.sent_limit, self.max_length).cuda() 315 | sent_t_mapping = torch.Tensor(self.batch_size, self.sent_limit, self.max_length).cuda() 316 | 317 | 318 | for b in range(self.test_batches): 319 | start_id = b * self.batch_size 320 | cur_bsz = min(self.batch_size, self.test_len - start_id) 321 | cur_batch = list(self.test_order[start_id : start_id + cur_bsz]) 322 | 323 | cur_batch.sort(key=lambda x: np.sum(self.data_test_word[self.test_index[x]['index']]>0) , reverse = True) 324 | 325 | for mapping in [sent_h_mapping, sent_t_mapping, sent_mask, context_pos, relation_label]: 326 | mapping.zero_() 327 | 328 | max_sents = 0 329 | evidences = [] 330 | sents_num = [] 331 | infos = [] 332 | 333 | 334 | for i, t_index in enumerate(cur_batch): 335 | pos_ins = self.test_index[t_index] 336 | index = pos_ins['index'] 337 | h_idx = pos_ins['h_idx'] 338 | t_idx = pos_ins['t_idx'] 339 | r = pos_ins['r_idx'] 340 | 341 | ins = self.test_file[index] 342 | Ls = ins['Ls'] 343 | max_sents = max(max_sents, len(Ls) - 1) 344 | infos.append((ins['title'], h_idx, t_idx, self.id2rel[r])) 345 | 346 | 347 | context_idxs[i].copy_(torch.from_numpy(self.data_test_word[index, :])) 348 | context_char_idxs[i].copy_(torch.from_numpy(self.data_test_char[index, :])) 349 | context_ner[i].copy_(torch.from_numpy(self.data_test_ner[index, :])) 350 | relation_label[i] = r 351 | 352 | hlist = ins['vertexSet'][h_idx] 353 | tlist = ins['vertexSet'][t_idx] 354 | 355 | for h in hlist: 356 | context_pos[i, h['pos'][0]:h['pos'][1]] = 1 357 | 358 | for t in tlist: 359 | context_pos[i, t['pos'][0]:t['pos'][1]] = 2 360 | 361 | 362 | evidence = [] 363 | for label in ins['labels']: 364 | if (label['h'], label['t'], label['r']) == (h_idx, t_idx, r): 365 | evidence = [int(e) for e in label['evidence']] 366 | 367 | evidences.append(evidence) 368 | 369 | for j in range(len(Ls) - 1): 370 | sent_h_mapping[i, j, Ls[j]] = 1 371 | sent_t_mapping[i, j, Ls[j + 1] - 1] = 1 372 | sent_mask[i, j] = 1 373 | 374 | sents_num.append(len(Ls)-1) 375 | 376 | input_lengths = (context_idxs[:cur_bsz] > 0).long().sum(dim=1) 377 | max_c_len = int(input_lengths.max()) 378 | 379 | yield {'context_idxs': context_idxs[:cur_bsz, :max_c_len].contiguous(), 380 | 'context_pos': context_pos[:cur_bsz, :max_c_len].contiguous(), 381 | 'relation_label': relation_label[:cur_bsz].contiguous(), 382 | 'input_lengths' : input_lengths, 383 | 'context_ner': context_ner[:cur_bsz, :max_c_len].contiguous(), 384 | 'context_char_idxs': context_char_idxs[:cur_bsz, :max_c_len].contiguous(), 385 | 'sent_h_mapping': sent_h_mapping[:cur_bsz, :max_sents, :max_c_len], 386 | 'sent_t_mapping': sent_t_mapping[:cur_bsz, :max_sents, :max_c_len], 387 | 'sent_mask': sent_mask[:cur_bsz, :max_sents], 388 | 'evidences': evidences, 389 | 'sents_num': sents_num, 390 | 'infos': infos 391 | } 392 | 393 | 394 | 
def train(self, model_pattern, model_name): 395 | ori_model = model_pattern(config = self) 396 | if self.pretrain_model != None: 397 | ori_model.load_state_dict(torch.load(self.pretrain_model)) 398 | ori_model.cuda() 399 | model = nn.DataParallel(ori_model) 400 | 401 | optimizer = optim.Adam(filter(lambda p: p.requires_grad, model.parameters())) 402 | 403 | BCE = nn.BCEWithLogitsLoss(reduction='none') 404 | 405 | if not os.path.exists(self.checkpoint_dir): 406 | os.mkdir(self.checkpoint_dir) 407 | 408 | best_auc = 0.0 409 | best_f1 = 0.0 410 | best_epoch = 0 411 | 412 | model.train() 413 | 414 | global_step = 0 415 | total_loss = 0 416 | start_time = time.time() 417 | 418 | def logging(s, print_=True, log_=True): 419 | if print_: 420 | print(s) 421 | if log_: 422 | with open(os.path.join(os.path.join("log", model_name)), 'a+') as f_log: 423 | f_log.write(s + '\n') 424 | 425 | for epoch in range(self.max_epoch): 426 | 427 | self.acc_NA.clear() 428 | self.acc_not_NA.clear() 429 | self.acc_total.clear() 430 | 431 | for data in self.get_N2_train_batch(): 432 | 433 | context_idxs = data['context_idxs'] 434 | context_pos = data['context_pos'] 435 | relation_label = data['relation_label'] 436 | input_lengths = data['input_lengths'] 437 | context_ner = data['context_ner'] 438 | context_char_idxs = data['context_char_idxs'] 439 | sent_h_mapping = data['sent_h_mapping'] 440 | sent_t_mapping = data['sent_t_mapping'] 441 | sent_mask = data['sent_mask'] 442 | evidence_label = data['evidence_label'] 443 | 444 | predict_sent = model(context_idxs, context_pos, context_ner, context_char_idxs, input_lengths, sent_h_mapping, sent_t_mapping, relation_label) 445 | loss = torch.sum(BCE(predict_sent, evidence_label) * sent_mask) / torch.sum(sent_mask) 446 | 447 | 448 | optimizer.zero_grad() 449 | loss.backward() 450 | optimizer.step() 451 | 452 | global_step += 1 453 | total_loss += loss.item() 454 | 455 | if global_step % self.period == 0 : 456 | cur_loss = total_loss / self.period 457 | elapsed = time.time() - start_time 458 | logging('| epoch {:2d} | step {:4d} | ms/b {:5.2f} | train loss {:5.3f} '.format(epoch, global_step, elapsed * 1000 / self.period, cur_loss)) 459 | total_loss = 0 460 | start_time = time.time() 461 | 462 | 463 | 464 | if (epoch + 1) % self.test_epoch == 0: 465 | logging('-' * 89) 466 | eval_start_time = time.time() 467 | model.eval() 468 | f1 = self.test(model, model_name) 469 | model.train() 470 | logging('| epoch {:3d} | time: {:5.2f}s | F1 {:.4f}'.format(epoch, time.time() - eval_start_time, f1)) 471 | logging('-' * 89) 472 | 473 | 474 | if f1 > best_f1: 475 | best_f1 = f1 476 | best_epoch = epoch 477 | path = os.path.join(self.checkpoint_dir, model_name) 478 | torch.save(ori_model.state_dict(), path) 479 | 480 | print("Finish training") 481 | print("Best epoch = %d | auc = %f" % (best_epoch, best_auc)) 482 | print("Storing best result...") 483 | print("Finish storing") 484 | 485 | def test(self, model, model_name, output=False, input_theta=-1): 486 | test_evidence_result = [] 487 | 488 | def logging(s, print_=True, log_=True): 489 | if print_: 490 | print(s) 491 | if log_: 492 | with open(os.path.join(os.path.join("log", model_name)), 'a+') as f_log: 493 | f_log.write(s + '\n') 494 | 495 | for data in self.get_real_test_batch(): 496 | with torch.no_grad(): 497 | context_idxs = data['context_idxs'] 498 | context_pos = data['context_pos'] 499 | relation_label = data['relation_label'] 500 | input_lengths = data['input_lengths'] 501 | context_ner = data['context_ner'] 502 | 
context_char_idxs = data['context_char_idxs'] 503 | sent_h_mapping = data['sent_h_mapping'] 504 | sent_t_mapping = data['sent_t_mapping'] 505 | evidences = data['evidences'] 506 | sents_num = data['sents_num'] 507 | infos = data['infos'] 508 | 509 | 510 | predict_sent = model(context_idxs, context_pos, context_ner, context_char_idxs, input_lengths, sent_h_mapping, sent_t_mapping, relation_label) 511 | 512 | predict_sent = torch.sigmoid(predict_sent) 513 | 514 | 515 | predict_sent = predict_sent.data.cpu().numpy() 516 | 517 | for i in range(len(evidences)): 518 | evi = evidences[i] 519 | for j in range(sents_num[i]): 520 | test_evidence_result.append( (j in evi, float(predict_sent[i, j]), infos[i], j) ) 521 | 522 | 523 | 524 | test_evidence_result.sort(key = lambda x: x[1], reverse=True) 525 | 526 | total_evidence_recall = self.total_evidence_recall 527 | if total_evidence_recall==0: # for test 528 | total_evidence_recall = 1 529 | 530 | pr_x = [] 531 | pr_y = [] 532 | correct = 0 533 | w = 0 534 | 535 | for i, item in enumerate(test_evidence_result): 536 | correct += item[0] 537 | pr_y.append(float(correct) / (i + 1)) 538 | pr_x.append(float(correct) / total_evidence_recall) 539 | if item[1] > input_theta: 540 | w = i 541 | 542 | 543 | pr_x = np.asarray(pr_x, dtype='float32') 544 | pr_y = np.asarray(pr_y, dtype='float32') 545 | f1_arr = (2 * pr_x * pr_y / (pr_x + pr_y + 1e-20)) 546 | f1_pos = f1_arr.argmax() 547 | evidence_f1 = f1_arr.max() 548 | auc = sklearn.metrics.auc(x = pr_x, y = pr_y) 549 | 550 | if input_theta==-1: 551 | w = f1_pos 552 | input_theta = test_evidence_result[w][1] 553 | 554 | logging('ma_f1{:3.4f} | input_theta {:3.4f} test_evidence_result F1 {:3.4f} | AUC {:3.4f}'.format(evidence_f1, input_theta, f1_arr[w], auc)) 555 | 556 | if output: 557 | info2evi = {} 558 | 559 | for x in self.test_index: 560 | info2evi[(x['title'], x['h_idx'], x['t_idx'], x['r'])] = [] 561 | 562 | 563 | for i in range(w+1): 564 | info = test_evidence_result[i][-2] 565 | sent_id = test_evidence_result[i][-1] 566 | info2evi[info].append(sent_id) 567 | 568 | 569 | output = [] 570 | for u, v in info2evi.items(): 571 | title = u[0] 572 | h_idx = u[1] 573 | t_idx = u[2] 574 | r = u[3] 575 | evidence = v 576 | output.append({'title':title, 'h_idx': h_idx, 't_idx': t_idx, 'r': r, 'evidence': evidence}) 577 | 578 | json.dump(output, open(self.output_file, "w")) 579 | 580 | return evidence_f1 581 | 582 | 583 | 584 | def testall(self, model_pattern, model_name, input_theta=-1): 585 | model = model_pattern(config = self) 586 | 587 | model.load_state_dict(torch.load(os.path.join(self.checkpoint_dir, model_name))) 588 | model.cuda() 589 | model.eval() 590 | self.test(model, model_name, True, input_theta) 591 | 592 | -------------------------------------------------------------------------------- /code/config/__init__.py: -------------------------------------------------------------------------------- 1 | from .Config import Config 2 | from .EviConfig import EviConfig 3 | -------------------------------------------------------------------------------- /code/evaluation.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | import sys 3 | import os 4 | import os.path 5 | import json 6 | 7 | def gen_train_facts(data_file_name, truth_dir): 8 | fact_file_name = data_file_name[data_file_name.find("train_"):] 9 | fact_file_name = os.path.join(truth_dir, fact_file_name.replace(".json", ".fact")) 10 | 11 | if os.path.exists(fact_file_name): 12 | fact_in_train 
= set([]) 13 | triples = json.load(open(fact_file_name)) 14 | for x in triples: 15 | fact_in_train.add(tuple(x)) 16 | return fact_in_train 17 | 18 | fact_in_train = set([]) 19 | ori_data = json.load(open(data_file_name)) 20 | for data in ori_data: 21 | vertexSet = data['vertexSet'] 22 | for label in data['labels']: 23 | rel = label['r'] 24 | for n1 in vertexSet[label['h']]: 25 | for n2 in vertexSet[label['t']]: 26 | fact_in_train.add((n1['name'], n2['name'], rel)) 27 | 28 | json.dump(list(fact_in_train), open(fact_file_name, "w")) 29 | 30 | return fact_in_train 31 | 32 | input_dir = sys.argv[1] 33 | output_dir = sys.argv[2] 34 | 35 | submit_dir = os.path.join(input_dir, 'res') 36 | truth_dir = os.path.join(input_dir, 'ref') 37 | 38 | if not os.path.isdir(submit_dir): 39 | print ("%s doesn't exist" % submit_dir) 40 | 41 | if os.path.isdir(submit_dir) and os.path.isdir(truth_dir): 42 | if not os.path.exists(output_dir): 43 | os.makedirs(output_dir) 44 | 45 | fact_in_train_annotated = gen_train_facts("../data/train_annotated.json", truth_dir) 46 | fact_in_train_distant = gen_train_facts("../data/train_distant.json", truth_dir) 47 | 48 | output_filename = os.path.join(output_dir, 'scores.txt') 49 | output_file = open(output_filename, 'w') 50 | 51 | truth_file = os.path.join(truth_dir, "dev_test.json") 52 | truth = json.load(open(truth_file)) 53 | 54 | std = {} 55 | tot_evidences = 0 56 | titleset = set([]) 57 | 58 | title2vectexSet = {} 59 | 60 | for x in truth: 61 | title = x['title'] 62 | titleset.add(title) 63 | 64 | vertexSet = x['vertexSet'] 65 | title2vectexSet[title] = vertexSet 66 | 67 | for label in x['labels']: 68 | r = label['r'] 69 | 70 | h_idx = label['h'] 71 | t_idx = label['t'] 72 | std[(title, r, h_idx, t_idx)] = set(label['evidence']) 73 | tot_evidences += len(label['evidence']) 74 | 75 | tot_relations = len(std) 76 | 77 | submission_answer_file = os.path.join(submit_dir, "result.json") 78 | tmp = json.load(open(submission_answer_file)) 79 | tmp.sort(key=lambda x: (x['title'], x['h_idx'], x['t_idx'], x['r'])) 80 | submission_answer = [tmp[0]] 81 | for i in range(1, len(tmp)): 82 | x = tmp[i] 83 | y = tmp[i-1] 84 | if (x['title'], x['h_idx'], x['t_idx'], x['r']) != (y['title'], y['h_idx'], y['t_idx'], y['r']): 85 | submission_answer.append(tmp[i]) 86 | 87 | correct_re = 0 88 | correct_evidence = 0 89 | pred_evi = 0 90 | 91 | correct_in_train_annotated = 0 92 | correct_in_train_distant = 0 93 | titleset2 = set([]) 94 | for x in submission_answer: 95 | title = x['title'] 96 | h_idx = x['h_idx'] 97 | t_idx = x['t_idx'] 98 | r = x['r'] 99 | titleset2.add(title) 100 | if title not in title2vectexSet: 101 | continue 102 | vertexSet = title2vectexSet[title] 103 | 104 | if 'evidence' in x: 105 | evi = set(x['evidence']) 106 | else: 107 | evi = set([]) 108 | pred_evi += len(evi) 109 | 110 | if (title, r, h_idx, t_idx) in std: 111 | correct_re += 1 112 | stdevi = std[(title, r, h_idx, t_idx)] 113 | correct_evidence += len(stdevi & evi) 114 | in_train_annotated = in_train_distant = False 115 | for n1 in vertexSet[h_idx]: 116 | for n2 in vertexSet[t_idx]: 117 | if (n1['name'], n2['name'], r) in fact_in_train_annotated: 118 | in_train_annotated = True 119 | if (n1['name'], n2['name'], r) in fact_in_train_distant: 120 | in_train_distant = True 121 | 122 | if in_train_annotated: 123 | correct_in_train_annotated += 1 124 | if in_train_distant: 125 | correct_in_train_distant += 1 126 | 127 | re_p = 1.0 * correct_re / len(submission_answer) 128 | re_r = 1.0 * correct_re / tot_relations 129 | if 
re_p+re_r == 0: 130 | re_f1 = 0 131 | else: 132 | re_f1 = 2.0 * re_p * re_r / (re_p + re_r) 133 | 134 | evi_p = 1.0 * correct_evidence / pred_evi if pred_evi>0 else 0 135 | evi_r = 1.0 * correct_evidence / tot_evidences 136 | if evi_p+evi_r == 0: 137 | evi_f1 = 0 138 | else: 139 | evi_f1 = 2.0 * evi_p * evi_r / (evi_p + evi_r) 140 | 141 | re_p_ignore_train_annotated = 1.0 * (correct_re-correct_in_train_annotated) / (len(submission_answer)-correct_in_train_annotated) 142 | re_p_ignore_train = 1.0 * (correct_re-correct_in_train_distant) / (len(submission_answer)-correct_in_train_distant) 143 | 144 | if re_p_ignore_train_annotated+re_r == 0: 145 | re_f1_ignore_train_annotated = 0 146 | else: 147 | re_f1_ignore_train_annotated = 2.0 * re_p_ignore_train_annotated * re_r / (re_p_ignore_train_annotated + re_r) 148 | 149 | if re_p_ignore_train+re_r == 0: 150 | re_f1_ignore_train = 0 151 | else: 152 | re_f1_ignore_train = 2.0 * re_p_ignore_train * re_r / (re_p_ignore_train + re_r) 153 | 154 | 155 | 156 | print ('RE_F1:', re_f1) 157 | print ('Evi_F1:', evi_f1) 158 | print ('RE_ignore_annotated_F1:', re_f1_ignore_train_annotated) 159 | print ('RE_ignore_distant_F1:', re_f1_ignore_train) 160 | 161 | output_file.write("RE_F1: %f\n" % re_f1) 162 | output_file.write("Evi_F1: %f\n" % evi_f1) 163 | 164 | output_file.write("RE_ignore_annotated_F1: %f\n" % re_f1_ignore_train_annotated) 165 | output_file.write("RE_ignore_distant_F1: %f\n" % re_f1_ignore_train) 166 | 167 | 168 | output_file.close() 169 | 170 | -------------------------------------------------------------------------------- /code/gen_data.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import os 3 | import json 4 | from nltk.tokenize import WordPunctTokenizer 5 | import argparse 6 | parser = argparse.ArgumentParser() 7 | parser.add_argument('--in_path', type = str, default = "../data") 8 | parser.add_argument('--out_path', type = str, default = "prepro_data") 9 | 10 | args = parser.parse_args() 11 | in_path = args.in_path 12 | out_path = args.out_path 13 | case_sensitive = False 14 | 15 | char_limit = 16 16 | train_distant_file_name = os.path.join(in_path, 'train_distant.json') 17 | train_annotated_file_name = os.path.join(in_path, 'train_annotated.json') 18 | dev_file_name = os.path.join(in_path, 'dev.json') 19 | test_file_name = os.path.join(in_path, 'test.json') 20 | 21 | rel2id = json.load(open(os.path.join(out_path, 'rel2id.json'), "r")) 22 | id2rel = {v:u for u,v in rel2id.items()} 23 | json.dump(id2rel, open(os.path.join(out_path, 'id2rel.json'), "w")) 24 | fact_in_train = set([]) 25 | fact_in_dev_train = set([]) 26 | 27 | def init(data_file_name, rel2id, max_length = 512, is_training = True, suffix=''): 28 | 29 | ori_data = json.load(open(data_file_name)) 30 | 31 | 32 | Ma = 0 33 | Ma_e = 0 34 | data = [] 35 | intrain = notintrain = notindevtrain = indevtrain = 0 36 | for i in range(len(ori_data)): 37 | Ls = [0] 38 | L = 0 39 | for x in ori_data[i]['sents']: 40 | L += len(x) 41 | Ls.append(L) 42 | 43 | vertexSet = ori_data[i]['vertexSet'] 44 | # point position added with sent start position 45 | for j in range(len(vertexSet)): 46 | for k in range(len(vertexSet[j])): 47 | vertexSet[j][k]['sent_id'] = int(vertexSet[j][k]['sent_id']) 48 | 49 | sent_id = vertexSet[j][k]['sent_id'] 50 | dl = Ls[sent_id] 51 | pos1 = vertexSet[j][k]['pos'][0] 52 | pos2 = vertexSet[j][k]['pos'][1] 53 | vertexSet[j][k]['pos'] = (pos1+dl, pos2+dl) 54 | 55 | ori_data[i]['vertexSet'] = vertexSet 56 | 
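        # Added illustration (not in the original file): Ls holds cumulative
        # sentence lengths, e.g. sents = [["Alice", "lives", "in", "Paris", "."],
        # ["She", "was", "born", "there", "."]] gives Ls = [0, 5, 10], so a
        # mention in sentence 1 with sentence-local pos = (0, 1) ("She") is
        # shifted by dl = Ls[1] = 5 to the document-level span (5, 6).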
57 | item = {} 58 | item['vertexSet'] = vertexSet 59 | labels = ori_data[i].get('labels', []) 60 | 61 | train_triple = set([]) 62 | new_labels = [] 63 | for label in labels: 64 | rel = label['r'] 65 | assert(rel in rel2id) 66 | label['r'] = rel2id[label['r']] 67 | 68 | train_triple.add((label['h'], label['t'])) 69 | 70 | 71 | if suffix=='_train': 72 | for n1 in vertexSet[label['h']]: 73 | for n2 in vertexSet[label['t']]: 74 | fact_in_dev_train.add((n1['name'], n2['name'], rel)) 75 | 76 | 77 | if is_training: 78 | for n1 in vertexSet[label['h']]: 79 | for n2 in vertexSet[label['t']]: 80 | fact_in_train.add((n1['name'], n2['name'], rel)) 81 | 82 | else: 83 | # fix a bug here 84 | label['intrain'] = False 85 | label['indev_train'] = False 86 | 87 | for n1 in vertexSet[label['h']]: 88 | for n2 in vertexSet[label['t']]: 89 | if (n1['name'], n2['name'], rel) in fact_in_train: 90 | label['intrain'] = True 91 | 92 | if suffix == '_dev' or suffix == '_test': 93 | if (n1['name'], n2['name'], rel) in fact_in_dev_train: 94 | label['indev_train'] = True 95 | 96 | 97 | new_labels.append(label) 98 | 99 | item['labels'] = new_labels 100 | item['title'] = ori_data[i]['title'] 101 | 102 | na_triple = [] 103 | for j in range(len(vertexSet)): 104 | for k in range(len(vertexSet)): 105 | if (j != k): 106 | if (j, k) not in train_triple: 107 | na_triple.append((j, k)) 108 | 109 | item['na_triple'] = na_triple 110 | item['Ls'] = Ls 111 | item['sents'] = ori_data[i]['sents'] 112 | data.append(item) 113 | 114 | Ma = max(Ma, len(vertexSet)) 115 | Ma_e = max(Ma_e, len(item['labels'])) 116 | 117 | 118 | print ('data_len:', len(ori_data)) 119 | # print ('Ma_V', Ma) 120 | # print ('Ma_e', Ma_e) 121 | # print (suffix) 122 | # print ('fact_in_train', len(fact_in_train)) 123 | # print (intrain, notintrain) 124 | # print ('fact_in_devtrain', len(fact_in_dev_train)) 125 | # print (indevtrain, notindevtrain) 126 | 127 | 128 | # saving 129 | print("Saving files") 130 | if is_training: 131 | name_prefix = "train" 132 | else: 133 | name_prefix = "dev" 134 | 135 | json.dump(data , open(os.path.join(out_path, name_prefix + suffix + '.json'), "w")) 136 | 137 | char2id = json.load(open(os.path.join(out_path, "char2id.json"))) 138 | # id2char= {v:k for k,v in char2id.items()} 139 | # json.dump(id2char, open("data/id2char.json", "w")) 140 | 141 | word2id = json.load(open(os.path.join(out_path, "word2id.json"))) 142 | ner2id = json.load(open(os.path.join(out_path, "ner2id.json"))) 143 | 144 | sen_tot = len(ori_data) 145 | sen_word = np.zeros((sen_tot, max_length), dtype = np.int64) 146 | sen_pos = np.zeros((sen_tot, max_length), dtype = np.int64) 147 | sen_ner = np.zeros((sen_tot, max_length), dtype = np.int64) 148 | sen_char = np.zeros((sen_tot, max_length, char_limit), dtype = np.int64) 149 | 150 | for i in range(len(ori_data)): 151 | item = ori_data[i] 152 | words = [] 153 | for sent in item['sents']: 154 | words += sent 155 | 156 | for j, word in enumerate(words): 157 | word = word.lower() 158 | 159 | if j < max_length: 160 | if word in word2id: 161 | sen_word[i][j] = word2id[word] 162 | else: 163 | sen_word[i][j] = word2id['UNK'] 164 | 165 | for c_idx, k in enumerate(list(word)): 166 | if c_idx>=char_limit: 167 | break 168 | sen_char[i,j,c_idx] = char2id.get(k, char2id['UNK']) 169 | 170 | for j in range(j + 1, max_length): 171 | sen_word[i][j] = word2id['BLANK'] 172 | 173 | vertexSet = item['vertexSet'] 174 | 175 | for idx, vertex in enumerate(vertexSet, 1): 176 | for v in vertex: 177 | sen_pos[i][v['pos'][0]:v['pos'][1]] = idx 
178 | sen_ner[i][v['pos'][0]:v['pos'][1]] = ner2id[v['type']] 179 | 180 | print("Finishing processing") 181 | np.save(os.path.join(out_path, name_prefix + suffix + '_word.npy'), sen_word) 182 | np.save(os.path.join(out_path, name_prefix + suffix + '_pos.npy'), sen_pos) 183 | np.save(os.path.join(out_path, name_prefix + suffix + '_ner.npy'), sen_ner) 184 | np.save(os.path.join(out_path, name_prefix + suffix + '_char.npy'), sen_char) 185 | print("Finish saving") 186 | 187 | 188 | 189 | init(train_distant_file_name, rel2id, max_length = 512, is_training = True, suffix='') 190 | init(train_annotated_file_name, rel2id, max_length = 512, is_training = False, suffix='_train') 191 | init(dev_file_name, rel2id, max_length = 512, is_training = False, suffix='_dev') 192 | init(test_file_name, rel2id, max_length = 512, is_training = False, suffix='_test') 193 | 194 | 195 | -------------------------------------------------------------------------------- /code/models/BiLSTM.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.autograd as autograd 3 | import torch.nn as nn 4 | import torch.nn.functional as F 5 | import torch.optim as optim 6 | from torch.autograd import Variable 7 | from torch import nn 8 | import numpy as np 9 | import math 10 | from torch.nn import init 11 | from torch.nn.utils import rnn 12 | 13 | 14 | class BiLSTM(nn.Module): 15 | def __init__(self, config): 16 | super(BiLSTM, self).__init__() 17 | self.config = config 18 | 19 | word_vec_size = config.data_word_vec.shape[0] 20 | self.word_emb = nn.Embedding(word_vec_size, config.data_word_vec.shape[1]) 21 | self.word_emb.weight.data.copy_(torch.from_numpy(config.data_word_vec)) 22 | 23 | self.word_emb.weight.requires_grad = False 24 | self.use_entity_type = True 25 | self.use_coreference = True 26 | self.use_distance = True 27 | 28 | # performance is similar with char_embed 29 | # self.char_emb = nn.Embedding(config.data_char_vec.shape[0], config.data_char_vec.shape[1]) 30 | # self.char_emb.weight.data.copy_(torch.from_numpy(config.data_char_vec)) 31 | 32 | # char_dim = config.data_char_vec.shape[1] 33 | # char_hidden = 100 34 | # self.char_cnn = nn.Conv1d(char_dim, char_hidden, 5) 35 | 36 | hidden_size = 128 37 | input_size = config.data_word_vec.shape[1] 38 | if self.use_entity_type: 39 | input_size += config.entity_type_size 40 | self.ner_emb = nn.Embedding(7, config.entity_type_size, padding_idx=0) 41 | 42 | if self.use_coreference: 43 | input_size += config.coref_size 44 | # self.coref_embed = nn.Embedding(config.max_length, config.coref_size, padding_idx=0) 45 | self.entity_embed = nn.Embedding(config.max_length, config.coref_size, padding_idx=0) 46 | 47 | # input_size += char_hidden 48 | 49 | self.rnn = EncoderLSTM(input_size, hidden_size, 1, True, True, 1 - config.keep_prob, False) 50 | self.linear_re = nn.Linear(hidden_size*2, hidden_size) 51 | 52 | if self.use_distance: 53 | self.dis_embed = nn.Embedding(20, config.dis_size, padding_idx=10) 54 | self.bili = torch.nn.Bilinear(hidden_size+config.dis_size, hidden_size+config.dis_size, config.relation_num) 55 | else: 56 | self.bili = torch.nn.Bilinear(hidden_size, hidden_size, config.relation_num) 57 | 58 | def forward(self, context_idxs, pos, context_ner, context_char_idxs, context_lens, h_mapping, t_mapping, 59 | relation_mask, dis_h_2_t, dis_t_2_h): 60 | # para_size, char_size, bsz = context_idxs.size(1), context_char_idxs.size(2), context_idxs.size(0) 61 | # context_ch = 
self.char_emb(context_char_idxs.contiguous().view(-1, char_size)).view(bsz * para_size, char_size, -1) 62 | # context_ch = self.char_cnn(context_ch.permute(0, 2, 1).contiguous()).max(dim=-1)[0].view(bsz, para_size, -1) 63 | 64 | sent = self.word_emb(context_idxs) 65 | if self.use_coreference: 66 | sent = torch.cat([sent, self.entity_embed(pos)], dim=-1) 67 | 68 | if self.use_entity_type: 69 | sent = torch.cat([sent, self.ner_emb(context_ner)], dim=-1) 70 | 71 | # sent = torch.cat([sent, context_ch], dim=-1) 72 | context_output = self.rnn(sent, context_lens) 73 | 74 | context_output = torch.relu(self.linear_re(context_output)) 75 | 76 | 77 | start_re_output = torch.matmul(h_mapping, context_output) 78 | end_re_output = torch.matmul(t_mapping, context_output) 79 | 80 | 81 | if self.use_distance: 82 | s_rep = torch.cat([start_re_output, self.dis_embed(dis_h_2_t)], dim=-1) 83 | t_rep = torch.cat([end_re_output, self.dis_embed(dis_t_2_h)], dim=-1) 84 | predict_re = self.bili(s_rep, t_rep) 85 | else: 86 | predict_re = self.bili(start_re_output, end_re_output) 87 | 88 | return predict_re 89 | 90 | 91 | class LockedDropout(nn.Module): 92 | def __init__(self, dropout): 93 | super().__init__() 94 | self.dropout = dropout 95 | 96 | def forward(self, x): 97 | dropout = self.dropout 98 | if not self.training: 99 | return x 100 | m = x.data.new(x.size(0), 1, x.size(2)).bernoulli_(1 - dropout) 101 | mask = Variable(m.div_(1 - dropout), requires_grad=False) 102 | mask = mask.expand_as(x) 103 | return mask * x 104 | 105 | class EncoderRNN(nn.Module): 106 | def __init__(self, input_size, num_units, nlayers, concat, bidir, dropout, return_last): 107 | super().__init__() 108 | self.rnns = [] 109 | for i in range(nlayers): 110 | if i == 0: 111 | input_size_ = input_size 112 | output_size_ = num_units 113 | else: 114 | input_size_ = num_units if not bidir else num_units * 2 115 | output_size_ = num_units 116 | self.rnns.append(nn.GRU(input_size_, output_size_, 1, bidirectional=bidir, batch_first=True)) 117 | self.rnns = nn.ModuleList(self.rnns) 118 | self.init_hidden = nn.ParameterList([nn.Parameter(torch.Tensor(2 if bidir else 1, 1, num_units).zero_()) for _ in range(nlayers)]) 119 | self.dropout = LockedDropout(dropout) 120 | self.concat = concat 121 | self.nlayers = nlayers 122 | self.return_last = return_last 123 | 124 | # self.reset_parameters() 125 | 126 | def reset_parameters(self): 127 | for rnn in self.rnns: 128 | for name, p in rnn.named_parameters(): 129 | if 'weight' in name: 130 | p.data.normal_(std=0.1) 131 | else: 132 | p.data.zero_() 133 | 134 | def get_init(self, bsz, i): 135 | return self.init_hidden[i].expand(-1, bsz, -1).contiguous() 136 | 137 | def forward(self, input, input_lengths=None): 138 | bsz, slen = input.size(0), input.size(1) 139 | output = input 140 | outputs = [] 141 | if input_lengths is not None: 142 | lens = input_lengths.data.cpu().numpy() 143 | for i in range(self.nlayers): 144 | hidden = self.get_init(bsz, i) 145 | output = self.dropout(output) 146 | if input_lengths is not None: 147 | output = rnn.pack_padded_sequence(output, lens, batch_first=True) 148 | 149 | output, hidden = self.rnns[i](output, hidden) 150 | 151 | 152 | if input_lengths is not None: 153 | output, _ = rnn.pad_packed_sequence(output, batch_first=True) 154 | if output.size(1) < slen: # used for parallel 155 | padding = Variable(output.data.new(1, 1, 1).zero_()) 156 | output = torch.cat([output, padding.expand(output.size(0), slen-output.size(1), output.size(2))], dim=1) 157 | if self.return_last: 158 | 
outputs.append(hidden.permute(1, 0, 2).contiguous().view(bsz, -1)) 159 | else: 160 | outputs.append(output) 161 | if self.concat: 162 | return torch.cat(outputs, dim=2) 163 | return outputs[-1] 164 | 165 | 166 | 167 | 168 | class EncoderLSTM(nn.Module): 169 | def __init__(self, input_size, num_units, nlayers, concat, bidir, dropout, return_last): 170 | super().__init__() 171 | self.rnns = [] 172 | for i in range(nlayers): 173 | if i == 0: 174 | input_size_ = input_size 175 | output_size_ = num_units 176 | else: 177 | input_size_ = num_units if not bidir else num_units * 2 178 | output_size_ = num_units 179 | self.rnns.append(nn.LSTM(input_size_, output_size_, 1, bidirectional=bidir, batch_first=True)) 180 | self.rnns = nn.ModuleList(self.rnns) 181 | 182 | self.init_hidden = nn.ParameterList([nn.Parameter(torch.Tensor(2 if bidir else 1, 1, num_units).zero_()) for _ in range(nlayers)]) 183 | self.init_c = nn.ParameterList([nn.Parameter(torch.Tensor(2 if bidir else 1, 1, num_units).zero_()) for _ in range(nlayers)]) 184 | 185 | self.dropout = LockedDropout(dropout) 186 | self.concat = concat 187 | self.nlayers = nlayers 188 | self.return_last = return_last 189 | 190 | # self.reset_parameters() 191 | 192 | def reset_parameters(self): 193 | for rnn in self.rnns: 194 | for name, p in rnn.named_parameters(): 195 | if 'weight' in name: 196 | p.data.normal_(std=0.1) 197 | else: 198 | p.data.zero_() 199 | 200 | def get_init(self, bsz, i): 201 | return self.init_hidden[i].expand(-1, bsz, -1).contiguous(), self.init_c[i].expand(-1, bsz, -1).contiguous() 202 | 203 | def forward(self, input, input_lengths=None): 204 | bsz, slen = input.size(0), input.size(1) 205 | output = input 206 | outputs = [] 207 | if input_lengths is not None: 208 | lens = input_lengths.data.cpu().numpy() 209 | 210 | for i in range(self.nlayers): 211 | hidden, c = self.get_init(bsz, i) 212 | 213 | output = self.dropout(output) 214 | if input_lengths is not None: 215 | output = rnn.pack_padded_sequence(output, lens, batch_first=True) 216 | 217 | output, hidden = self.rnns[i](output, (hidden, c)) 218 | 219 | 220 | if input_lengths is not None: 221 | output, _ = rnn.pad_packed_sequence(output, batch_first=True) 222 | if output.size(1) < slen: # used for parallel 223 | padding = Variable(output.data.new(1, 1, 1).zero_()) 224 | output = torch.cat([output, padding.expand(output.size(0), slen-output.size(1), output.size(2))], dim=1) 225 | if self.return_last: 226 | outputs.append(hidden.permute(1, 0, 2).contiguous().view(bsz, -1)) 227 | else: 228 | outputs.append(output) 229 | if self.concat: 230 | return torch.cat(outputs, dim=2) 231 | return outputs[-1] 232 | 233 | class BiAttention(nn.Module): 234 | def __init__(self, input_size, dropout): 235 | super().__init__() 236 | self.dropout = LockedDropout(dropout) 237 | self.input_linear = nn.Linear(input_size, 1, bias=False) 238 | self.memory_linear = nn.Linear(input_size, 1, bias=False) 239 | 240 | self.dot_scale = nn.Parameter(torch.Tensor(input_size).uniform_(1.0 / (input_size ** 0.5))) 241 | 242 | def forward(self, input, memory, mask): 243 | bsz, input_len, memory_len = input.size(0), input.size(1), memory.size(1) 244 | 245 | input = self.dropout(input) 246 | memory = self.dropout(memory) 247 | 248 | input_dot = self.input_linear(input) 249 | memory_dot = self.memory_linear(memory).view(bsz, 1, memory_len) 250 | cross_dot = torch.bmm(input * self.dot_scale, memory.permute(0, 2, 1).contiguous()) 251 | att = input_dot + memory_dot + cross_dot 252 | att = att - 1e30 * (1 - mask[:,None]) 
253 | 254 | weight_one = F.softmax(att, dim=-1) 255 | output_one = torch.bmm(weight_one, memory) 256 | weight_two = F.softmax(att.max(dim=-1)[0], dim=-1).view(bsz, 1, input_len) 257 | output_two = torch.bmm(weight_two, input) 258 | 259 | return torch.cat([input, output_one, input*output_one, output_two*output_one], dim=-1) 260 | -------------------------------------------------------------------------------- /code/models/CNN3.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.autograd as autograd 3 | import torch.nn as nn 4 | import torch.nn.functional as F 5 | import torch.optim as optim 6 | from torch.autograd import Variable 7 | 8 | class CNN3(nn.Module): 9 | def __init__(self, config): 10 | super(CNN3, self).__init__() 11 | self.config = config 12 | self.word_emb = nn.Embedding(config.data_word_vec.shape[0], config.data_word_vec.shape[1]) 13 | self.word_emb.weight.data.copy_(torch.from_numpy(config.data_word_vec)) 14 | self.word_emb.weight.requires_grad = False 15 | 16 | 17 | # self.char_emb = nn.Embedding(config.data_char_vec.shape[0], config.data_char_vec.shape[1]) 18 | # self.char_emb.weight.data.copy_(torch.from_numpy(config.data_char_vec)) 19 | # char_dim = config.data_char_vec.shape[1] 20 | # char_hidden = 100 21 | # self.char_cnn = nn.Conv1d(char_dim, char_hidden, 5) 22 | 23 | self.coref_embed = nn.Embedding(config.max_length, config.coref_size, padding_idx=0) 24 | self.ner_emb = nn.Embedding(7, config.entity_type_size, padding_idx=0) 25 | 26 | input_size = config.data_word_vec.shape[1] + config.coref_size + config.entity_type_size #+ char_hidden 27 | 28 | self.out_channels = 200 29 | self.in_channels = input_size 30 | 31 | self.kernel_size = 3 32 | self.stride = 1 33 | self.padding = int((self.kernel_size - 1) / 2) 34 | 35 | self.cnn_1 = nn.Conv1d(self.in_channels, self.out_channels, self.kernel_size, self.stride, self.padding) 36 | self.cnn_2 = nn.Conv1d(self.out_channels, self.out_channels, self.kernel_size, self.stride, self.padding) 37 | self.cnn_3 = nn.Conv1d(self.out_channels, self.out_channels, self.kernel_size, self.stride, self.padding) 38 | self.max_pooling = nn.MaxPool1d(self.kernel_size, stride=self.stride, padding=self.padding) 39 | self.relu = nn.ReLU() 40 | 41 | self.dropout = nn.Dropout(config.cnn_drop_prob) 42 | 43 | self.bili = torch.nn.Bilinear(self.out_channels+config.dis_size, self.out_channels+config.dis_size, config.relation_num) 44 | self.dis_embed = nn.Embedding(20, config.dis_size, padding_idx=10) 45 | 46 | 47 | def forward(self, context_idxs, pos, context_ner, context_char_idxs, context_lens, h_mapping, t_mapping, relation_mask, dis_h_2_t, dis_t_2_h): 48 | # para_size, char_size, bsz = context_idxs.size(1), context_char_idxs.size(2), context_idxs.size(0) 49 | # context_ch = self.char_emb(context_char_idxs.contiguous().view(-1, char_size)).view(bsz * para_size, char_size, -1) 50 | # context_ch = self.char_cnn(context_ch.permute(0, 2, 1).contiguous()).max(dim=-1)[0].view(bsz, para_size, -1) 51 | 52 | sent = torch.cat([self.word_emb(context_idxs), self.coref_embed(pos), self.ner_emb(context_ner)], dim=-1) 53 | 54 | sent = sent.permute(0, 2, 1) 55 | 56 | # batch * embedding_size * max_len 57 | x = self.cnn_1(sent) 58 | x = self.max_pooling(x) 59 | x = self.relu(x) 60 | x = self.dropout(x) 61 | 62 | x = self.cnn_2(x) 63 | x = self.max_pooling(x) 64 | x = self.relu(x) 65 | x = self.dropout(x) 66 | 67 | x = self.cnn_3(x) 68 | x = self.max_pooling(x) 69 | x = self.relu(x) 70 | x = 
self.dropout(x) 71 | 72 | context_output = x.permute(0, 2, 1) 73 | start_re_output = torch.matmul(h_mapping, context_output) 74 | end_re_output = torch.matmul(t_mapping, context_output) 75 | 76 | s_rep = torch.cat([start_re_output, self.dis_embed(dis_h_2_t)], dim=-1) 77 | t_rep = torch.cat([end_re_output, self.dis_embed(dis_t_2_h)], dim=-1) 78 | 79 | predict_re = self.bili(s_rep, t_rep) 80 | 81 | return predict_re 82 | -------------------------------------------------------------------------------- /code/models/ContextAware.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.autograd as autograd 3 | import torch.nn as nn 4 | import torch.nn.functional as F 5 | import torch.optim as optim 6 | from torch.autograd import Variable 7 | from torch import nn 8 | import numpy as np 9 | import math 10 | from torch.nn import init 11 | from torch.nn.utils import rnn 12 | 13 | 14 | class ContextAware(nn.Module): 15 | def __init__(self, config): 16 | super(ContextAware, self).__init__() 17 | self.config = config 18 | self.word_emb = nn.Embedding(config.data_word_vec.shape[0], config.data_word_vec.shape[1]) 19 | self.word_emb.weight.data.copy_(torch.from_numpy(config.data_word_vec)) 20 | self.word_emb.weight.requires_grad = False 21 | 22 | self.ner_emb = nn.Embedding(7, config.entity_type_size, padding_idx=0) 23 | 24 | self.coref_embed = nn.Embedding(config.max_length, config.coref_size, padding_idx=0) 25 | 26 | # self.char_emb = nn.Embedding(config.data_char_vec.shape[0], config.data_char_vec.shape[1]) 27 | # self.char_emb.weight.data.copy_(torch.from_numpy(config.data_char_vec)) 28 | # char_dim = config.data_char_vec.shape[1] 29 | # char_hidden = 100 30 | # self.char_cnn = nn.Conv1d(char_dim, char_hidden, 5) 31 | 32 | hidden_size = 128 33 | input_size = config.data_word_vec.shape[1] + config.coref_size + config.entity_type_size #+ char_hidden 34 | 35 | 36 | self.rnn = EncoderLSTM(input_size, hidden_size, 1, True, True, 1 - config.keep_prob, False) 37 | 38 | 39 | self.linear_re = nn.Linear(hidden_size * 2, hidden_size) 40 | self.bili = torch.nn.Bilinear(hidden_size, hidden_size, hidden_size) 41 | 42 | self.self_att = SelfAttention(hidden_size, 1.0) 43 | 44 | self.bili = torch.nn.Bilinear(hidden_size+config.dis_size, hidden_size+config.dis_size, hidden_size) 45 | self.dis_embed = nn.Embedding(20, config.dis_size, padding_idx=10) 46 | 47 | self.linear_output = nn.Linear(hidden_size * 2, config.relation_num) 48 | 49 | 50 | def forward(self, context_idxs, pos, context_ner, context_char_idxs, context_lens, h_mapping, t_mapping, relation_mask, dis_h_2_t, dis_t_2_h): 51 | # para_size, char_size, bsz = context_idxs.size(1), context_char_idxs.size(2), context_idxs.size(0) 52 | # context_ch = self.char_emb(context_char_idxs.contiguous().view(-1, char_size)).view(bsz * para_size, char_size, -1) 53 | # context_ch = self.char_cnn(context_ch.permute(0, 2, 1).contiguous()).max(dim=-1)[0].view(bsz, para_size, -1) 54 | 55 | sent = torch.cat([self.word_emb(context_idxs), self.coref_embed(pos), self.ner_emb(context_ner)], dim=-1) 56 | context_output = self.rnn(sent, context_lens) 57 | 58 | 59 | context_output = torch.relu(self.linear_re(context_output)) 60 | 61 | start_re_output = torch.matmul(h_mapping, context_output) 62 | end_re_output = torch.matmul(t_mapping, context_output) 63 | 64 | s_rep = torch.cat([start_re_output, self.dis_embed(dis_h_2_t)], dim=-1) 65 | t_rep = torch.cat([end_re_output, self.dis_embed(dis_t_2_h)], dim=-1) 66 | 67 | 68 | re_rep 
= self.bili(s_rep, t_rep) 69 | re_rep = self.self_att(re_rep, re_rep, relation_mask) 70 | 71 | 72 | return self.linear_output(re_rep) 73 | 74 | 75 | class LockedDropout(nn.Module): 76 | def __init__(self, dropout): 77 | super().__init__() 78 | self.dropout = dropout 79 | 80 | def forward(self, x): 81 | dropout = self.dropout 82 | if not self.training: 83 | return x 84 | m = x.data.new(x.size(0), 1, x.size(2)).bernoulli_(1 - dropout) 85 | mask = Variable(m.div_(1 - dropout), requires_grad=False) 86 | mask = mask.expand_as(x) 87 | return mask * x 88 | 89 | class EncoderRNN(nn.Module): 90 | def __init__(self, input_size, num_units, nlayers, concat, bidir, dropout, return_last): 91 | super().__init__() 92 | self.rnns = [] 93 | for i in range(nlayers): 94 | if i == 0: 95 | input_size_ = input_size 96 | output_size_ = num_units 97 | else: 98 | input_size_ = num_units if not bidir else num_units * 2 99 | output_size_ = num_units 100 | self.rnns.append(nn.GRU(input_size_, output_size_, 1, bidirectional=bidir, batch_first=True)) 101 | self.rnns = nn.ModuleList(self.rnns) 102 | self.init_hidden = nn.ParameterList([nn.Parameter(torch.Tensor(2 if bidir else 1, 1, num_units).zero_()) for _ in range(nlayers)]) 103 | self.dropout = LockedDropout(dropout) 104 | self.concat = concat 105 | self.nlayers = nlayers 106 | self.return_last = return_last 107 | 108 | # self.reset_parameters() 109 | 110 | def reset_parameters(self): 111 | for rnn in self.rnns: 112 | for name, p in rnn.named_parameters(): 113 | if 'weight' in name: 114 | p.data.normal_(std=0.1) 115 | else: 116 | p.data.zero_() 117 | 118 | def get_init(self, bsz, i): 119 | return self.init_hidden[i].expand(-1, bsz, -1).contiguous() 120 | 121 | def forward(self, input, input_lengths=None): 122 | bsz, slen = input.size(0), input.size(1) 123 | output = input 124 | outputs = [] 125 | if input_lengths is not None: 126 | lens = input_lengths.data.cpu().numpy() 127 | for i in range(self.nlayers): 128 | hidden = self.get_init(bsz, i) 129 | output = self.dropout(output) 130 | if input_lengths is not None: 131 | output = rnn.pack_padded_sequence(output, lens, batch_first=True) 132 | 133 | output, hidden = self.rnns[i](output, hidden) 134 | 135 | 136 | if input_lengths is not None: 137 | output, _ = rnn.pad_packed_sequence(output, batch_first=True) 138 | if output.size(1) < slen: # used for parallel 139 | padding = Variable(output.data.new(1, 1, 1).zero_()) 140 | output = torch.cat([output, padding.expand(output.size(0), slen-output.size(1), output.size(2))], dim=1) 141 | if self.return_last: 142 | outputs.append(hidden.permute(1, 0, 2).contiguous().view(bsz, -1)) 143 | else: 144 | outputs.append(output) 145 | if self.concat: 146 | return torch.cat(outputs, dim=2) 147 | return outputs[-1] 148 | 149 | 150 | 151 | class EncoderLSTM(nn.Module): 152 | def __init__(self, input_size, num_units, nlayers, concat, bidir, dropout, return_last): 153 | super().__init__() 154 | self.rnns = [] 155 | for i in range(nlayers): 156 | if i == 0: 157 | input_size_ = input_size 158 | output_size_ = num_units 159 | else: 160 | input_size_ = num_units if not bidir else num_units * 2 161 | output_size_ = num_units 162 | self.rnns.append(nn.LSTM(input_size_, output_size_, 1, bidirectional=bidir, batch_first=True)) 163 | self.rnns = nn.ModuleList(self.rnns) 164 | 165 | self.init_hidden = nn.ParameterList([nn.Parameter(torch.Tensor(2 if bidir else 1, 1, num_units).zero_()) for _ in range(nlayers)]) 166 | self.init_c = nn.ParameterList([nn.Parameter(torch.Tensor(2 if bidir else 1, 1, 
num_units).zero_()) for _ in range(nlayers)]) 167 | 168 | self.dropout = LockedDropout(dropout) 169 | self.concat = concat 170 | self.nlayers = nlayers 171 | self.return_last = return_last 172 | 173 | # self.reset_parameters() 174 | 175 | def reset_parameters(self): 176 | for rnn in self.rnns: 177 | for name, p in rnn.named_parameters(): 178 | if 'weight' in name: 179 | p.data.normal_(std=0.1) 180 | else: 181 | p.data.zero_() 182 | 183 | def get_init(self, bsz, i): 184 | return self.init_hidden[i].expand(-1, bsz, -1).contiguous(), self.init_c[i].expand(-1, bsz, -1).contiguous() 185 | 186 | def forward(self, input, input_lengths=None): 187 | bsz, slen = input.size(0), input.size(1) 188 | output = input 189 | outputs = [] 190 | if input_lengths is not None: 191 | lens = input_lengths.data.cpu().numpy() 192 | 193 | for i in range(self.nlayers): 194 | hidden, c = self.get_init(bsz, i) 195 | 196 | output = self.dropout(output) 197 | if input_lengths is not None: 198 | output = rnn.pack_padded_sequence(output, lens, batch_first=True) 199 | 200 | output, hidden = self.rnns[i](output, (hidden, c)) 201 | 202 | 203 | if input_lengths is not None: 204 | output, _ = rnn.pad_packed_sequence(output, batch_first=True) 205 | if output.size(1) < slen: # used for parallel 206 | padding = Variable(output.data.new(1, 1, 1).zero_()) 207 | output = torch.cat([output, padding.expand(output.size(0), slen-output.size(1), output.size(2))], dim=1) 208 | if self.return_last: 209 | outputs.append(hidden.permute(1, 0, 2).contiguous().view(bsz, -1)) 210 | else: 211 | outputs.append(output) 212 | if self.concat: 213 | return torch.cat(outputs, dim=2) 214 | return outputs[-1] 215 | 216 | class SelfAttention(nn.Module): 217 | def __init__(self, input_size, dropout): 218 | super().__init__() 219 | # self.dropout = LockedDropout(dropout) 220 | self.input_linear = nn.Linear(input_size, 1, bias=False) 221 | self.dot_scale = nn.Parameter(torch.Tensor(input_size).uniform_(1.0 / (input_size ** 0.5))) 222 | 223 | def forward(self, input, memory, mask): 224 | 225 | # input = self.dropout(input) 226 | # memory = self.dropout(memory) 227 | 228 | input_dot = self.input_linear(input) 229 | cross_dot = torch.bmm(input * self.dot_scale, memory.permute(0, 2, 1).contiguous()) 230 | att = input_dot + cross_dot 231 | att = att - 1e30 * (1 - mask[:,None]) 232 | 233 | weight_one = F.softmax(att, dim=-1) 234 | output_one = torch.bmm(weight_one, memory) 235 | 236 | return torch.cat([input, output_one], dim=-1) 237 | 238 | 239 | class BiAttention(nn.Module): 240 | def __init__(self, input_size, dropout): 241 | super().__init__() 242 | self.dropout = LockedDropout(dropout) 243 | self.input_linear = nn.Linear(input_size, 1, bias=False) 244 | self.memory_linear = nn.Linear(input_size, 1, bias=False) 245 | 246 | self.dot_scale = nn.Parameter(torch.Tensor(input_size).uniform_(1.0 / (input_size ** 0.5))) 247 | 248 | def forward(self, input, memory, mask): 249 | bsz, input_len, memory_len = input.size(0), input.size(1), memory.size(1) 250 | 251 | input = self.dropout(input) 252 | memory = self.dropout(memory) 253 | 254 | input_dot = self.input_linear(input) 255 | memory_dot = self.memory_linear(memory).view(bsz, 1, memory_len) 256 | cross_dot = torch.bmm(input * self.dot_scale, memory.permute(0, 2, 1).contiguous()) 257 | att = input_dot + memory_dot + cross_dot 258 | att = att - 1e30 * (1 - mask[:,None]) 259 | 260 | weight_one = F.softmax(att, dim=-1) 261 | output_one = torch.bmm(weight_one, memory) 262 | weight_two = F.softmax(att.max(dim=-1)[0], 
dim=-1).view(bsz, 1, input_len) 263 | output_two = torch.bmm(weight_two, input) 264 | 265 | return torch.cat([input, output_one, input*output_one, output_two*output_one], dim=-1) 266 | -------------------------------------------------------------------------------- /code/models/LSTM.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.autograd as autograd 3 | import torch.nn as nn 4 | import torch.nn.functional as F 5 | import torch.optim as optim 6 | from torch.autograd import Variable 7 | from torch import nn 8 | import numpy as np 9 | import math 10 | from torch.nn import init 11 | from torch.nn.utils import rnn 12 | 13 | 14 | class LSTM(nn.Module): 15 | def __init__(self, config): 16 | super(LSTM, self).__init__() 17 | self.config = config 18 | 19 | word_vec_size = config.data_word_vec.shape[0] 20 | self.word_emb = nn.Embedding(word_vec_size, config.data_word_vec.shape[1]) 21 | self.word_emb.weight.data.copy_(torch.from_numpy(config.data_word_vec)) 22 | self.word_emb.weight.requires_grad = False 23 | 24 | # self.char_emb = nn.Embedding(config.data_char_vec.shape[0], config.data_char_vec.shape[1]) 25 | # self.char_emb.weight.data.copy_(torch.from_numpy(config.data_char_vec)) 26 | # char_dim = config.data_char_vec.shape[1] 27 | # char_hidden = 100 28 | # self.char_cnn = nn.Conv1d(char_dim, char_hidden, 5) 29 | self.coref_embed = nn.Embedding(config.max_length, config.coref_size, padding_idx=0) 30 | self.ner_emb = nn.Embedding(7, config.entity_type_size, padding_idx=0) 31 | 32 | input_size = config.data_word_vec.shape[1] + config.coref_size + config.entity_type_size #+ char_hidden 33 | hidden_size = 128 34 | 35 | # self.rnn = EncoderLSTM(input_size, hidden_size, 1, True, True, 1 - config.keep_prob, False) 36 | # self.linear_re = nn.Linear(hidden_size*2, hidden_size) # *4 for 2layer 37 | 38 | self.rnn = EncoderLSTM(input_size, hidden_size, 1, True, False, 1 - config.keep_prob, False) 39 | self.linear_re = nn.Linear(hidden_size, hidden_size) # *4 for 2layer 40 | 41 | self.bili = torch.nn.Bilinear(hidden_size+config.dis_size, hidden_size+config.dis_size, config.relation_num) 42 | 43 | 44 | 45 | self.dis_embed = nn.Embedding(20, config.dis_size, padding_idx=10) 46 | 47 | 48 | 49 | def forward(self, context_idxs, pos, context_ner, context_char_idxs, context_lens, h_mapping, t_mapping, 50 | relation_mask, dis_h_2_t, dis_t_2_h): 51 | # para_size, char_size, bsz = context_idxs.size(1), context_char_idxs.size(2), context_idxs.size(0) 52 | # context_ch = self.char_emb(context_char_idxs.contiguous().view(-1, char_size)).view(bsz * para_size, char_size, -1) 53 | # context_ch = self.char_cnn(context_ch.permute(0, 2, 1).contiguous()).max(dim=-1)[0].view(bsz, para_size, -1) 54 | 55 | sent = torch.cat([self.word_emb(context_idxs) , self.coref_embed(pos), self.ner_emb(context_ner)], dim=-1) 56 | # sent = torch.cat([self.word_emb(context_idxs), context_ch], dim=-1) 57 | 58 | # context_mask = (context_idxs > 0).float() 59 | context_output = self.rnn(sent, context_lens) 60 | 61 | context_output = torch.relu(self.linear_re(context_output)) 62 | 63 | 64 | start_re_output = torch.matmul(h_mapping, context_output) 65 | end_re_output = torch.matmul(t_mapping, context_output) 66 | # predict_re = self.bili(start_re_output, end_re_output) 67 | 68 | s_rep = torch.cat([start_re_output, self.dis_embed(dis_h_2_t)], dim=-1) 69 | t_rep = torch.cat([end_re_output, self.dis_embed(dis_t_2_h)], dim=-1) 70 | predict_re = self.bili(s_rep, t_rep) 71 | 72 | 73 | 74 
| return predict_re 75 | 76 | 77 | class LockedDropout(nn.Module): 78 | def __init__(self, dropout): 79 | super().__init__() 80 | self.dropout = dropout 81 | 82 | def forward(self, x): 83 | dropout = self.dropout 84 | if not self.training: 85 | return x 86 | m = x.data.new(x.size(0), 1, x.size(2)).bernoulli_(1 - dropout) 87 | mask = Variable(m.div_(1 - dropout), requires_grad=False) 88 | mask = mask.expand_as(x) 89 | return mask * x 90 | 91 | class EncoderRNN(nn.Module): 92 | def __init__(self, input_size, num_units, nlayers, concat, bidir, dropout, return_last): 93 | super().__init__() 94 | self.rnns = [] 95 | for i in range(nlayers): 96 | if i == 0: 97 | input_size_ = input_size 98 | output_size_ = num_units 99 | else: 100 | input_size_ = num_units if not bidir else num_units * 2 101 | output_size_ = num_units 102 | self.rnns.append(nn.GRU(input_size_, output_size_, 1, bidirectional=bidir, batch_first=True)) 103 | self.rnns = nn.ModuleList(self.rnns) 104 | self.init_hidden = nn.ParameterList([nn.Parameter(torch.Tensor(2 if bidir else 1, 1, num_units).zero_()) for _ in range(nlayers)]) 105 | self.dropout = LockedDropout(dropout) 106 | self.concat = concat 107 | self.nlayers = nlayers 108 | self.return_last = return_last 109 | 110 | # self.reset_parameters() 111 | 112 | def reset_parameters(self): 113 | for rnn in self.rnns: 114 | for name, p in rnn.named_parameters(): 115 | if 'weight' in name: 116 | p.data.normal_(std=0.1) 117 | else: 118 | p.data.zero_() 119 | 120 | def get_init(self, bsz, i): 121 | return self.init_hidden[i].expand(-1, bsz, -1).contiguous() 122 | 123 | def forward(self, input, input_lengths=None): 124 | bsz, slen = input.size(0), input.size(1) 125 | output = input 126 | outputs = [] 127 | if input_lengths is not None: 128 | lens = input_lengths.data.cpu().numpy() 129 | for i in range(self.nlayers): 130 | hidden = self.get_init(bsz, i) 131 | output = self.dropout(output) 132 | if input_lengths is not None: 133 | output = rnn.pack_padded_sequence(output, lens, batch_first=True) 134 | 135 | output, hidden = self.rnns[i](output, hidden) 136 | 137 | 138 | if input_lengths is not None: 139 | output, _ = rnn.pad_packed_sequence(output, batch_first=True) 140 | if output.size(1) < slen: # used for parallel 141 | padding = Variable(output.data.new(1, 1, 1).zero_()) 142 | output = torch.cat([output, padding.expand(output.size(0), slen-output.size(1), output.size(2))], dim=1) 143 | if self.return_last: 144 | outputs.append(hidden.permute(1, 0, 2).contiguous().view(bsz, -1)) 145 | else: 146 | outputs.append(output) 147 | if self.concat: 148 | return torch.cat(outputs, dim=2) 149 | return outputs[-1] 150 | 151 | 152 | 153 | 154 | class EncoderLSTM(nn.Module): 155 | def __init__(self, input_size, num_units, nlayers, concat, bidir, dropout, return_last): 156 | super().__init__() 157 | self.rnns = [] 158 | for i in range(nlayers): 159 | if i == 0: 160 | input_size_ = input_size 161 | output_size_ = num_units 162 | else: 163 | input_size_ = num_units if not bidir else num_units * 2 164 | output_size_ = num_units 165 | self.rnns.append(nn.LSTM(input_size_, output_size_, 1, bidirectional=bidir, batch_first=True)) 166 | self.rnns = nn.ModuleList(self.rnns) 167 | 168 | self.init_hidden = nn.ParameterList([nn.Parameter(torch.Tensor(2 if bidir else 1, 1, num_units).zero_()) for _ in range(nlayers)]) 169 | self.init_c = nn.ParameterList([nn.Parameter(torch.Tensor(2 if bidir else 1, 1, num_units).zero_()) for _ in range(nlayers)]) 170 | 171 | self.dropout = LockedDropout(dropout) 172 | 
self.concat = concat 173 | self.nlayers = nlayers 174 | self.return_last = return_last 175 | 176 | # self.reset_parameters() 177 | 178 | def reset_parameters(self): 179 | for rnn in self.rnns: 180 | for name, p in rnn.named_parameters(): 181 | if 'weight' in name: 182 | p.data.normal_(std=0.1) 183 | else: 184 | p.data.zero_() 185 | 186 | def get_init(self, bsz, i): 187 | return self.init_hidden[i].expand(-1, bsz, -1).contiguous(), self.init_c[i].expand(-1, bsz, -1).contiguous() 188 | 189 | def forward(self, input, input_lengths=None): 190 | bsz, slen = input.size(0), input.size(1) 191 | output = input 192 | outputs = [] 193 | if input_lengths is not None: 194 | lens = input_lengths.data.cpu().numpy() 195 | 196 | for i in range(self.nlayers): 197 | hidden, c = self.get_init(bsz, i) 198 | 199 | output = self.dropout(output) 200 | if input_lengths is not None: 201 | output = rnn.pack_padded_sequence(output, lens, batch_first=True) 202 | 203 | output, hidden = self.rnns[i](output, (hidden, c)) 204 | 205 | 206 | if input_lengths is not None: 207 | output, _ = rnn.pad_packed_sequence(output, batch_first=True) 208 | if output.size(1) < slen: # used for parallel 209 | padding = Variable(output.data.new(1, 1, 1).zero_()) 210 | output = torch.cat([output, padding.expand(output.size(0), slen-output.size(1), output.size(2))], dim=1) 211 | if self.return_last: 212 | outputs.append(hidden.permute(1, 0, 2).contiguous().view(bsz, -1)) 213 | else: 214 | outputs.append(output) 215 | if self.concat: 216 | return torch.cat(outputs, dim=2) 217 | return outputs[-1] 218 | 219 | class BiAttention(nn.Module): 220 | def __init__(self, input_size, dropout): 221 | super().__init__() 222 | self.dropout = LockedDropout(dropout) 223 | self.input_linear = nn.Linear(input_size, 1, bias=False) 224 | self.memory_linear = nn.Linear(input_size, 1, bias=False) 225 | 226 | self.dot_scale = nn.Parameter(torch.Tensor(input_size).uniform_(1.0 / (input_size ** 0.5))) 227 | 228 | def forward(self, input, memory, mask): 229 | bsz, input_len, memory_len = input.size(0), input.size(1), memory.size(1) 230 | 231 | input = self.dropout(input) 232 | memory = self.dropout(memory) 233 | 234 | input_dot = self.input_linear(input) 235 | memory_dot = self.memory_linear(memory).view(bsz, 1, memory_len) 236 | cross_dot = torch.bmm(input * self.dot_scale, memory.permute(0, 2, 1).contiguous()) 237 | att = input_dot + memory_dot + cross_dot 238 | att = att - 1e30 * (1 - mask[:,None]) 239 | 240 | weight_one = F.softmax(att, dim=-1) 241 | output_one = torch.bmm(weight_one, memory) 242 | weight_two = F.softmax(att.max(dim=-1)[0], dim=-1).view(bsz, 1, input_len) 243 | output_two = torch.bmm(weight_two, input) 244 | 245 | return torch.cat([input, output_one, input*output_one, output_two*output_one], dim=-1) 246 | -------------------------------------------------------------------------------- /code/models/LSTM_SP.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.autograd as autograd 3 | import torch.nn as nn 4 | import torch.nn.functional as F 5 | import torch.optim as optim 6 | from torch.autograd import Variable 7 | from torch import nn 8 | import numpy as np 9 | import math 10 | from torch.nn import init 11 | from torch.nn.utils import rnn 12 | 13 | 14 | class LSTM_SP(nn.Module): 15 | def __init__(self, config): 16 | super(LSTM_SP, self).__init__() 17 | self.config = config 18 | 19 | word_vec_size = config.data_word_vec.shape[0] 20 | self.word_emb = nn.Embedding(word_vec_size, 
config.data_word_vec.shape[1]) 21 | self.word_emb.weight.data.copy_(torch.from_numpy(config.data_word_vec)) 22 | self.word_emb.weight.requires_grad = False 23 | 24 | # char_size = config.data_char_vec.shape[0] 25 | # self.char_emb = nn.Embedding(char_size, config.data_char_vec.shape[1]) 26 | # self.char_emb.weight.data.copy_(torch.from_numpy(config.data_char_vec)) 27 | # char_dim = config.data_char_vec.shape[1] 28 | # char_hidden = 100 29 | # self.char_cnn = nn.Conv1d(char_dim, char_hidden, 5) 30 | 31 | self.coref_embed = nn.Embedding(config.max_length, config.coref_size, padding_idx=0) 32 | self.ner_emb = nn.Embedding(7, config.entity_type_size, padding_idx=0) 33 | 34 | 35 | hidden_size = 128 36 | input_size = config.data_word_vec.shape[1] + config.coref_size + config.entity_type_size# + char_hidden 37 | 38 | self.rnn = EncoderLSTM(input_size, hidden_size, 1, True, True, 1 - config.keep_prob, False) 39 | 40 | self.relation_embed = nn.Embedding(config.relation_num, hidden_size, padding_idx=0) 41 | 42 | 43 | self.linear_t = nn.Linear(hidden_size*2, hidden_size) # *4 for 2layer 44 | # self.bili = torch.nn.Bilinear(hidden_size, hidden_size, hidden_size) 45 | 46 | self.linear_re = nn.Linear(hidden_size*3, 1) 47 | 48 | 49 | def forward(self, context_idxs, pos, context_ner, context_char_idxs, context_lens, sent_h_mapping, sent_t_mapping, relation_label): 50 | # para_size, char_size, bsz = context_idxs.size(1), context_char_idxs.size(2), context_idxs.size(0) 51 | # context_ch = self.char_emb(context_char_idxs.contiguous().view(-1, char_size)).view(bsz * para_size, char_size, -1) 52 | # context_ch = self.char_cnn(context_ch.permute(0, 2, 1).contiguous()).max(dim=-1)[0].view(bsz, para_size, -1) 53 | 54 | sent = torch.cat([self.word_emb(context_idxs) , self.coref_embed(pos), self.ner_emb(context_ner)], dim=-1) 55 | 56 | el = sent_h_mapping.size(1) 57 | re_embed = (self.relation_embed(relation_label).unsqueeze(1)).expand(-1, el, -1) 58 | 59 | context_output = self.rnn(sent, context_lens) 60 | context_output = torch.relu(self.linear_t(context_output)) 61 | start_re_output = torch.matmul(sent_h_mapping, context_output) 62 | end_re_output = torch.matmul(sent_t_mapping, context_output) 63 | 64 | sent_output = torch.cat([start_re_output, end_re_output, re_embed], dim=-1) 65 | predict_sent = self.linear_re(sent_output).squeeze(2) 66 | 67 | # predict_sent = torch.sum(self.bili(start_re_output, end_re_output)*re_embed, dim=-1) 68 | 69 | return predict_sent 70 | 71 | 72 | class LockedDropout(nn.Module): 73 | def __init__(self, dropout): 74 | super().__init__() 75 | self.dropout = dropout 76 | 77 | def forward(self, x): 78 | dropout = self.dropout 79 | if not self.training: 80 | return x 81 | m = x.data.new(x.size(0), 1, x.size(2)).bernoulli_(1 - dropout) 82 | mask = Variable(m.div_(1 - dropout), requires_grad=False) 83 | mask = mask.expand_as(x) 84 | return mask * x 85 | 86 | class EncoderRNN(nn.Module): 87 | def __init__(self, input_size, num_units, nlayers, concat, bidir, dropout, return_last): 88 | super().__init__() 89 | self.rnns = [] 90 | for i in range(nlayers): 91 | if i == 0: 92 | input_size_ = input_size 93 | output_size_ = num_units 94 | else: 95 | input_size_ = num_units if not bidir else num_units * 2 96 | output_size_ = num_units 97 | self.rnns.append(nn.GRU(input_size_, output_size_, 1, bidirectional=bidir, batch_first=True)) 98 | self.rnns = nn.ModuleList(self.rnns) 99 | self.init_hidden = nn.ParameterList([nn.Parameter(torch.Tensor(2 if bidir else 1, 1, num_units).zero_()) for _ in 
range(nlayers)]) 100 | self.dropout = LockedDropout(dropout) 101 | self.concat = concat 102 | self.nlayers = nlayers 103 | self.return_last = return_last 104 | 105 | # self.reset_parameters() 106 | 107 | def reset_parameters(self): 108 | for rnn in self.rnns: 109 | for name, p in rnn.named_parameters(): 110 | if 'weight' in name: 111 | p.data.normal_(std=0.1) 112 | else: 113 | p.data.zero_() 114 | 115 | def get_init(self, bsz, i): 116 | return self.init_hidden[i].expand(-1, bsz, -1).contiguous() 117 | 118 | def forward(self, input, input_lengths=None): 119 | bsz, slen = input.size(0), input.size(1) 120 | output = input 121 | outputs = [] 122 | if input_lengths is not None: 123 | lens = input_lengths.data.cpu().numpy() 124 | for i in range(self.nlayers): 125 | hidden = self.get_init(bsz, i) 126 | output = self.dropout(output) 127 | if input_lengths is not None: 128 | output = rnn.pack_padded_sequence(output, lens, batch_first=True) 129 | 130 | output, hidden = self.rnns[i](output, hidden) 131 | 132 | 133 | if input_lengths is not None: 134 | output, _ = rnn.pad_packed_sequence(output, batch_first=True) 135 | if output.size(1) < slen: # used for parallel 136 | padding = Variable(output.data.new(1, 1, 1).zero_()) 137 | output = torch.cat([output, padding.expand(output.size(0), slen-output.size(1), output.size(2))], dim=1) 138 | if self.return_last: 139 | outputs.append(hidden.permute(1, 0, 2).contiguous().view(bsz, -1)) 140 | else: 141 | outputs.append(output) 142 | if self.concat: 143 | return torch.cat(outputs, dim=2) 144 | return outputs[-1] 145 | 146 | 147 | 148 | 149 | class EncoderLSTM(nn.Module): 150 | def __init__(self, input_size, num_units, nlayers, concat, bidir, dropout, return_last): 151 | super().__init__() 152 | self.rnns = [] 153 | for i in range(nlayers): 154 | if i == 0: 155 | input_size_ = input_size 156 | output_size_ = num_units 157 | else: 158 | input_size_ = num_units if not bidir else num_units * 2 159 | output_size_ = num_units 160 | self.rnns.append(nn.LSTM(input_size_, output_size_, 1, bidirectional=bidir, batch_first=True)) 161 | self.rnns = nn.ModuleList(self.rnns) 162 | 163 | self.init_hidden = nn.ParameterList([nn.Parameter(torch.Tensor(2 if bidir else 1, 1, num_units).zero_()) for _ in range(nlayers)]) 164 | self.init_c = nn.ParameterList([nn.Parameter(torch.Tensor(2 if bidir else 1, 1, num_units).zero_()) for _ in range(nlayers)]) 165 | 166 | self.dropout = LockedDropout(dropout) 167 | self.concat = concat 168 | self.nlayers = nlayers 169 | self.return_last = return_last 170 | 171 | # self.reset_parameters() 172 | 173 | def reset_parameters(self): 174 | for rnn in self.rnns: 175 | for name, p in rnn.named_parameters(): 176 | if 'weight' in name: 177 | p.data.normal_(std=0.1) 178 | else: 179 | p.data.zero_() 180 | 181 | def get_init(self, bsz, i): 182 | return self.init_hidden[i].expand(-1, bsz, -1).contiguous(), self.init_c[i].expand(-1, bsz, -1).contiguous() 183 | 184 | def forward(self, input, input_lengths=None): 185 | bsz, slen = input.size(0), input.size(1) 186 | output = input 187 | outputs = [] 188 | if input_lengths is not None: 189 | lens = input_lengths.data.cpu().numpy() 190 | 191 | for i in range(self.nlayers): 192 | hidden, c = self.get_init(bsz, i) 193 | 194 | output = self.dropout(output) 195 | if input_lengths is not None: 196 | output = rnn.pack_padded_sequence(output, lens, batch_first=True) 197 | 198 | output, hidden = self.rnns[i](output, (hidden, c)) 199 | 200 | 201 | if input_lengths is not None: 202 | output, _ = 
rnn.pad_packed_sequence(output, batch_first=True) 203 | if output.size(1) < slen: # used for parallel 204 | padding = Variable(output.data.new(1, 1, 1).zero_()) 205 | output = torch.cat([output, padding.expand(output.size(0), slen-output.size(1), output.size(2))], dim=1) 206 | if self.return_last: 207 | outputs.append(hidden.permute(1, 0, 2).contiguous().view(bsz, -1)) 208 | else: 209 | outputs.append(output) 210 | if self.concat: 211 | return torch.cat(outputs, dim=2) 212 | return outputs[-1] 213 | 214 | class BiAttention(nn.Module): 215 | def __init__(self, input_size, dropout): 216 | super().__init__() 217 | self.dropout = LockedDropout(dropout) 218 | self.input_linear = nn.Linear(input_size, 1, bias=False) 219 | self.memory_linear = nn.Linear(input_size, 1, bias=False) 220 | 221 | self.dot_scale = nn.Parameter(torch.Tensor(input_size).uniform_(1.0 / (input_size ** 0.5))) 222 | 223 | def forward(self, input, memory, mask): 224 | bsz, input_len, memory_len = input.size(0), input.size(1), memory.size(1) 225 | 226 | input = self.dropout(input) 227 | memory = self.dropout(memory) 228 | 229 | input_dot = self.input_linear(input) 230 | memory_dot = self.memory_linear(memory).view(bsz, 1, memory_len) 231 | cross_dot = torch.bmm(input * self.dot_scale, memory.permute(0, 2, 1).contiguous()) 232 | att = input_dot + memory_dot + cross_dot 233 | att = att - 1e30 * (1 - mask[:,None]) 234 | 235 | weight_one = F.softmax(att, dim=-1) 236 | output_one = torch.bmm(weight_one, memory) 237 | weight_two = F.softmax(att.max(dim=-1)[0], dim=-1).view(bsz, 1, input_len) 238 | output_two = torch.bmm(weight_two, input) 239 | 240 | return torch.cat([input, output_one, input*output_one, output_two*output_one], dim=-1) 241 | -------------------------------------------------------------------------------- /code/models/__init__.py: -------------------------------------------------------------------------------- 1 | from .CNN3 import CNN3 2 | from .LSTM import LSTM 3 | from .BiLSTM import BiLSTM 4 | from .ContextAware import ContextAware 5 | from .LSTM_SP import LSTM_SP 6 | -------------------------------------------------------------------------------- /code/prepro_data/README.md: -------------------------------------------------------------------------------- 1 | # Metadata 2 | 3 | Metadata for baseline model can be downloaded from [TsinghuaCloud](https://cloud.tsinghua.edu.cn/d/99e1c0805eb64736af95/) or [GoogleDrive](https://drive.google.com/drive/folders/1Ri3LIILKKBi3aBJjUVCOBpGX5PpONHRK). 
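After `gen_data.py` has been run (see `code/README.md`), its preprocessed output is written into this folder as well. Judging from the save calls in `gen_data.py`, and assuming the default `--out_path prepro_data`, the folder then roughly contains:

```
prepro_data/
├── rel2id.json / word2id.json / ner2id.json / char2id.json   (downloaded metadata)
├── id2rel.json                                               (written by gen_data.py)
├── train.json      + train_{word,pos,ner,char}.npy           (distantly supervised training set)
├── dev_train.json  + dev_train_{word,pos,ner,char}.npy       (annotated training set)
├── dev_dev.json    + dev_dev_{word,pos,ner,char}.npy         (dev set)
└── dev_test.json   + dev_test_{word,pos,ner,char}.npy        (test set)
```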
4 | 
--------------------------------------------------------------------------------
/code/requirements.txt:
--------------------------------------------------------------------------------
1 | matplotlib>=3.0.2
2 | nltk>=3.4
3 | tqdm>=4.29.1
4 | torch>=1.0.0
5 | numpy>=1.16.0
6 | scikit_learn>=0.21.2
7 | 
--------------------------------------------------------------------------------
/code/test.py:
--------------------------------------------------------------------------------
1 | import config
2 | import models
3 | import numpy as np
4 | import os
5 | import time
6 | import datetime
7 | import json
8 | from sklearn.metrics import average_precision_score
9 | import sys
10 | import os
11 | import argparse
12 | # import IPython
13 | 
14 | # sys.excepthook = IPython.core.ultratb.FormattedTB(mode='Verbose', color_scheme='Linux', call_pdb=1)
15 | 
16 | 
17 | parser = argparse.ArgumentParser()
18 | parser.add_argument('--model_name', type = str, default = 'LSTM', help = 'name of the model')
19 | parser.add_argument('--save_name', type = str)
20 | 
21 | parser.add_argument('--train_prefix', type = str, default = 'train')
22 | parser.add_argument('--test_prefix', type = str, default = 'dev_dev')
23 | parser.add_argument('--input_theta', type = float, default = -1)
24 | # parser.add_argument('--ignore_input_theta', type = float, default = -1)
25 | 
26 | 
27 | args = parser.parse_args()
28 | model = {
29 |     'CNN3': models.CNN3,
30 |     'LSTM': models.LSTM,
31 |     'BiLSTM': models.BiLSTM,
32 |     'ContextAware': models.ContextAware,
33 |     # 'LSTM_SP': models.LSTM_SP
34 | }
35 | 
36 | con = config.Config(args)
37 | #con.load_train_data()
38 | con.load_test_data()
39 | # con.set_train_model()
40 | con.testall(model[args.model_name], args.save_name, args.input_theta)#, args.ignore_input_theta)
41 | 
--------------------------------------------------------------------------------
/code/test_sp.py:
--------------------------------------------------------------------------------
1 | import config
2 | import models
3 | import numpy as np
4 | import os
5 | import time
6 | import datetime
7 | import json
8 | from sklearn.metrics import average_precision_score
9 | import sys
10 | import os
11 | import argparse
12 | # import IPython
13 | 
14 | # sys.excepthook = IPython.core.ultratb.FormattedTB(mode='Verbose', color_scheme='Linux', call_pdb=1)
15 | 
16 | 
17 | parser = argparse.ArgumentParser()
18 | parser.add_argument('--model_name', type = str, default = 'LSTM_SP', help = 'name of the model')
19 | parser.add_argument('--save_name', type = str)
20 | 
21 | parser.add_argument('--train_prefix', type = str, default = 'train')
22 | parser.add_argument('--test_prefix', type = str, default = 'dev_dev')
23 | parser.add_argument('--input_theta', type = float, default = -1)
24 | parser.add_argument('--output_file', type = str, default = "result.json")
25 | 
26 | 
27 | args = parser.parse_args()
28 | model = {
29 |     'LSTM_SP': models.LSTM_SP
30 | }
31 | 
32 | con = config.EviConfig(args)
33 | #con.load_train_data()
34 | con.load_test_data()
35 | # con.set_train_model()
36 | con.testall(model[args.model_name], args.save_name, args.input_theta)
37 | 
--------------------------------------------------------------------------------
/code/train.py:
--------------------------------------------------------------------------------
1 | import config
2 | import models
3 | import numpy as np
4 | import os
5 | import time
6 | import datetime
7 | import json
8 | from sklearn.metrics import average_precision_score
9 | import sys
10 | import os
11 | import argparse
12 | # import IPython
13 | 
14 | # sys.excepthook = IPython.core.ultratb.FormattedTB(mode='Verbose', color_scheme='Linux', call_pdb=1)
15 | 
16 | 
17 | parser = argparse.ArgumentParser()
18 | parser.add_argument('--model_name', type = str, default = 'BiLSTM', help = 'name of the model')
19 | parser.add_argument('--save_name', type = str)
20 | 
21 | parser.add_argument('--train_prefix', type = str, default = 'dev_train')
22 | parser.add_argument('--test_prefix', type = str, default = 'dev_dev')
23 | 
24 | 
25 | args = parser.parse_args()
26 | model = {
27 |     'CNN3': models.CNN3,
28 |     'LSTM': models.LSTM,
29 |     'BiLSTM': models.BiLSTM,
30 |     'ContextAware': models.ContextAware,
31 | }
32 | 
33 | con = config.Config(args)
34 | con.set_max_epoch(200)
35 | con.load_train_data()
36 | con.load_test_data()
37 | # con.set_train_model()
38 | con.train(model[args.model_name], args.save_name)
39 | 
--------------------------------------------------------------------------------
/code/train_sp.py:
--------------------------------------------------------------------------------
1 | import config
2 | import models
3 | import numpy as np
4 | import os
5 | import time
6 | import datetime
7 | import json
8 | from sklearn.metrics import average_precision_score
9 | import sys
10 | import os
11 | import argparse
12 | # import IPython
13 | 
14 | # sys.excepthook = IPython.core.ultratb.FormattedTB(mode='Verbose', color_scheme='Linux', call_pdb=1)
15 | 
16 | 
17 | parser = argparse.ArgumentParser()
18 | parser.add_argument('--model_name', type = str, default = 'LSTM_SP', help = 'name of the model')
19 | parser.add_argument('--save_name', type = str)
20 | 
21 | parser.add_argument('--train_prefix', type = str, default = 'dev_train')
22 | parser.add_argument('--test_prefix', type = str, default = 'dev_dev')
23 | parser.add_argument('--output_file', type = str, default = "result.json")
24 | 
25 | 
26 | args = parser.parse_args()
27 | model = {
28 |     # 'CNN3': models.CNN3,
29 |     # 'LSTM': models.LSTM,
30 |     # 'BiLSTM': models.BiLSTM,
31 |     # 'ContextAware': models.ContextAware,
32 |     'LSTM_SP': models.LSTM_SP
33 | }
34 | 
35 | con = config.EviConfig(args)
36 | con.set_max_epoch(200)
37 | con.load_train_data()
38 | con.load_test_data()
39 | # con.set_train_model()
40 | con.train(model[args.model_name], args.save_name)
41 | 
--------------------------------------------------------------------------------
/data/README.md:
--------------------------------------------------------------------------------
1 | # Data
2 | 
3 | Data can be downloaded from [Google Drive](https://drive.google.com/drive/folders/1c5-0YwnoJx8NS6CV2f-NoTHR__BdkNqw?usp=sharing).
4 | 
5 | The relation information file has also been uploaded.
6 | 
7 | 
8 | ```
9 | Data Format:
10 | {
11 |   'title',
12 |   'sents': [
13 |     [word in sent 0],
14 |     [word in sent 1]
15 |   ],
16 |   'vertexSet': [
17 |     [
18 |       { 'name': mention_name,
19 |         'sent_id': id of the sentence containing the mention,
20 |         'pos': position of the mention within its sentence,
21 |         'type': NER_type },
22 |       { another mention }
23 |     ],
24 |     [ another entity ]
25 |   ],
26 |   'labels': [
27 |     {
28 |       'h': idx of head entity in vertexSet,
29 |       't': idx of tail entity in vertexSet,
30 |       'r': relation,
31 |       'evidence': ids of the evidence sentences
32 |     }
33 |   ]
34 | }
35 | ```
36 | 
37 | Please submit the test set result to Codalab.
38 | 
--------------------------------------------------------------------------------
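For reference, `code/evaluation.py` parses the submitted `result.json` as a flat JSON list of predicted triples, each keyed by `title`, `h_idx`, `t_idx`, and `r`, plus an optional `evidence` list that is only needed for the evidence-extraction score. A minimal sketch of writing such a file follows; every value in it is a placeholder, not real model output:

```
import json

# Each entry identifies a document by its title and an entity pair by their
# indices in that document's vertexSet; 'r' is the predicted relation and
# 'evidence' optionally lists the ids of the supporting sentences.
predictions = [
    {
        "title": "Some Document Title",  # placeholder
        "h_idx": 0,                      # head entity index in vertexSet
        "t_idx": 1,                      # tail entity index in vertexSet
        "r": "P17",                      # placeholder relation id
        "evidence": [0, 2],              # optional supporting sentence ids
    },
]

with open("result.json", "w") as f:
    json.dump(predictions, f)
```

The evaluation script sorts the predictions and keeps only the first copy of each `(title, h_idx, t_idx, r)` tuple, so duplicate predictions are dropped rather than penalized.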