├── README.md ├── __pycache__ ├── data_loader.cpython-36.pyc └── model.cpython-36.pyc ├── conf └── default_setting.conf ├── data_loader.py ├── dataset └── README.md ├── dynamic_memory ├── __init__.py ├── __pycache__ │ ├── __init__.cpython-36.pyc │ ├── encoder.cpython-36.pyc │ └── episode.cpython-36.pyc ├── encoder.py └── episode.py ├── images ├── README └── README.png ├── main.py ├── model.py └── tile.py /README.md: -------------------------------------------------------------------------------- 1 | # Conversational 2 | 3 | ## Instructions on preparing conversation data for model training 4 | 5 | The package already includes processed conversation data to train and run the model. It includes four sub-categories from the Amazon dataset. If you are not applying the model to new conversation data, you can simply experiment with the code using the four processed conversation datasets. 6 | 7 | If you want to build your own conversation data — e.g., for other sub-categories of the Amazon dataset, or even datasets for other conversation scenarios — we have prepared the following package to achieve this goal, together with a figure to show the flow of the process. 8 | 9 | The package for conversation data preparation can be downloaded at this URL: 10 | https://www.dropbox.com/s/3gf4zz02okphjhb/data.zip?dl=0 11 | 12 | The following repository provides instructions on how to use the Sentires phrase-level sentiment analysis toolkit: 13 | https://github.com/evison/Sentires 14 | 15 | The flow of the data preparation process is shown in the figure below. 16 | 17 | In general, it includes the following steps: 18 | ``` 19 | 1. Run raw2tool_format.py to get xxx.raw and xxx.product files 20 | 2. Put .raw data into English-Jar/data/raw and use English-Jar (DOIT) to get xxx.pos.profile and xxx.neg.profile 21 | 3. Run gen_feature_ui_dicts.py to get xxx.pos.profileu_i_dict and xxx.pos.profilefeature_dict 22 | 4. 
Run map_name_2_id.py to get cleared_u_i_dict, cleared_feature_dict and many other dicts 23 | 5. Run split_train_test.py to get train_dict and test_dict 24 | 6. Run get_category.py to get item_category_dict 25 | 7. Run get_description.py to get item_description_dict 26 | ``` 27 | 28 | ![](images/README.png) 29 | 30 | ## Bibliographic information: 31 | 32 | ``` 33 | @inproceedings{zhang2018towards, 34 | title={Towards conversational search and recommendation: System ask, user respond}, 35 | author={Zhang, Yongfeng and Chen, Xu and Ai, Qingyao and Yang, Liu and Croft, W Bruce}, 36 | booktitle={Proceedings of the 27th ACM International Conference on Information and Knowledge Management}, 37 | pages={177--186}, 38 | year={2018}, 39 | organization={ACM} 40 | } 41 | ``` 42 | -------------------------------------------------------------------------------- /__pycache__/data_loader.cpython-36.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/evison/Conversational/77a8f20ba367e4f71a966ceebfb2588b0ffd084e/__pycache__/data_loader.cpython-36.pyc -------------------------------------------------------------------------------- /__pycache__/model.cpython-36.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/evison/Conversational/77a8f20ba367e4f71a966ceebfb2588b0ffd084e/__pycache__/model.cpython-36.pyc -------------------------------------------------------------------------------- /conf/default_setting.conf: -------------------------------------------------------------------------------- 1 | [path] 2 | category = example/ 3 | base_path = ./data/ 4 | 5 | [parameters] 6 | batch_size = 5 7 | use_pretrained = 0 8 | embed_dim = 40 9 | encoder_type = uni 10 | cell_type = gru 11 | num_layers = 1 12 | num_units = 32 13 | memory_search_hob = 1 14 | memory_question_hob = 2 15 | dropout = 0.0 16 | evaluate = search 17 | reg_scale = 0.001 18 | 
search_with_conversation_number = 2 19 | prediction_with_conversation_number = 0 20 | epoch_number = 3 21 | 22 | [train] 23 | learning_rate = 0.0001 24 | optimizer = Adam 25 | train_steps = 100 26 | test_steps = 10 27 | model_dir = 'logs/' 28 | save_checkpoints_steps =100 29 | check_hook_n_iter = 1000 30 | min_eval_frequency = 1000 31 | print_verbose = 0 32 | debug = 0 33 | PAD_ID = 0 34 | -------------------------------------------------------------------------------- /data_loader.py: -------------------------------------------------------------------------------- 1 | """ 2 | data_loader 3 | """ 4 | 5 | import numpy as np 6 | import pickle 7 | import random 8 | import pandas as pd 9 | 10 | 11 | class DataLoader: 12 | 13 | def __init__(self, args): 14 | self.args = args 15 | self.users_all = dict() 16 | self.user_number = 0 17 | self.w2v_dim = self.args.embed_dim 18 | self.input_mask_mode = "sentence" 19 | self.use_pretrained = self.args.use_pretrained 20 | self.train_batch_id = 0 21 | self.test_batch_id = 0 22 | 23 | self.base_path = self.args.base_path 24 | self.category = self.args.category 25 | 26 | path = self.base_path + self.category 27 | train_data_path = path + 'train_dict' 28 | self.train_data = pickle.load(open(train_data_path, 'rb')) 29 | test_data_path = path + 'test_dict' 30 | self.test_data = pickle.load(open(test_data_path, 'rb')) 31 | item_id_path = path + 'item_id_dict' 32 | self.items = list(pickle.load(open(item_id_path, 'rb')).values()) 33 | print('item number: '+str(len(self.items))) 34 | 35 | feature_id_path = path + 'feature_id_dict' 36 | self.id_feature_dict = {v:k for k,v in pickle.load(open(feature_id_path, 'rb')).items()} 37 | 38 | opinion_id_path = path + 'opinion_id_dict' 39 | self.id_opinion_dict = {v: k for k, v in pickle.load(open(opinion_id_path, 'rb')).items()} 40 | word_id_path = path + 'word_id_dict' 41 | self.word_id_dict = pickle.load(open(word_id_path, 'rb')) 42 | 43 | 44 | item_description_dict_path = path + 
'item_description_dict' 45 | self.item_description_dict = pickle.load(open(item_description_dict_path, 'rb')) 46 | item_category_dict_path = path + 'item_category_dict' 47 | self.item_category_dict = pickle.load(open(item_category_dict_path, 'rb')) 48 | 49 | # build for model testing 50 | self.item_candidates = [] 51 | for k, v in self.test_data.items(): 52 | item = int(k.split('@')[1]) 53 | if item not in self.item_candidates: 54 | self.item_candidates.append(item) 55 | if len(self.item_candidates) > 100: 56 | self.item_candidates = random.sample(self.item_candidates, 100) 57 | self.grund_truth = dict() 58 | 59 | for k, v in self.test_data.items(): 60 | user = int(k.split('@')[0]) 61 | item = int(k.split('@')[1]) 62 | if item in self.item_candidates: 63 | if user not in self.grund_truth.keys(): 64 | self.grund_truth[user] = [item] 65 | else: 66 | self.grund_truth[user].append(item) 67 | 68 | self.question_cadidates = list(self.id_feature_dict.values())[:100] 69 | 70 | def make_train_and_test_set(self): 71 | train_raw = self.get_train_raw_data() 72 | test_raw = self.get_test_raw_data() 73 | self.train_sample_num = len(train_raw) 74 | self.test_sample_num = len(test_raw) 75 | self.max_description_word_length, self.max_description_sentence_length, \ 76 | self.max_answer_word_length, self.max_answer_sentence_length = self.get_max_seq_length(train_raw, test_raw) 77 | self.max_description_word_length = 128 78 | 79 | print("max_description_word_length:", self.max_description_word_length) 80 | print("max_description_sentence_length:", self.max_description_sentence_length) 81 | print("max_answer_word_length:", self.max_answer_word_length) 82 | print("max_answer_sentence_length:", self.max_answer_sentence_length) 83 | print("train sample number:", self.train_sample_num) 84 | print("test sample number:", self.test_sample_num) 85 | print('test item number:%d' % (len(self.item_candidates))) 86 | 87 | self.train_users, self.train_answers, self.train_pos_descriptions, 
self.train_neg_descriptions, self.train_pos_questions, \ 88 | self.train_neg_questions, self.train_answer_masks, self.train_pos_descriptions_masks, self.train_neg_descriptions_masks = self.process_train_input(train_raw) 89 | 90 | self.test_users, self.test_answers, self.test_pos_descriptions, self.test_pos_questions, \ 91 | self.test_pos_descriptions_masks = self.process_test_input(test_raw) 92 | 93 | def get_max_seq_length(self, *datasets): 94 | 95 | max_description_word_length, max_description_sentence_length,\ 96 | max_answer_word_length, max_answer_sentence_length = 0, 0, 0, 0 97 | 98 | def count_punctuation(facts): 99 | return len(list(filter(lambda x: x == ".", facts))) 100 | 101 | for dataset in datasets: 102 | for d in dataset: 103 | max_description_word_length = max(max_description_word_length, len(d['pos_des'].split('-'))) 104 | max_description_sentence_length = max(max_description_sentence_length, count_punctuation(d['pos_des'])) 105 | max_answer_word_length = max(max_answer_word_length, len(' '.join(d['answer']).split())) 106 | max_answer_sentence_length = max(max_answer_sentence_length, count_punctuation(d['answer'])) 107 | 108 | return max_description_word_length, max_description_sentence_length,\ 109 | max_answer_word_length, max_answer_sentence_length 110 | 111 | def get_all_description(self): 112 | all_d = [] 113 | all_d_mask = [] 114 | self.items = self.items[:10] 115 | for item in self.items: 116 | if int(item) in self.item_category_dict.keys() and int(item) in self.item_description_dict.keys(): 117 | item_description_and_category = self.item_category_dict[int(item)].split('||') 118 | pos_product_description = '-'.join(item_description_and_category[0].split('-')[:10]) 119 | pos_review_decription = '-'.join(self.item_description_dict[int(item)].split('-')[:100]) 120 | d = pos_product_description + '-' + pos_review_decription 121 | pos_des = d.lower().split('-') 122 | pos_des = [self.word_id_dict[w] for w in pos_des if w in 
self.word_id_dict.keys()] 123 | pos_des_pad = self.pad_input(pos_des, self.max_description_word_length, [0]) 124 | all_d.append(pos_des_pad) 125 | pos_mask = [index for index, w in enumerate(pos_des) if w == self.word_id_dict['.']] 126 | pos_mask = self.pad_input(pos_mask, self.max_description_sentence_length, [0]) 127 | all_d_mask.append(pos_mask) 128 | return all_d, all_d_mask 129 | 130 | def get_train_raw_data(self): 131 | data = self.train_data.items() 132 | tasks = [] 133 | 134 | 135 | for user_item, feature_opinion in list(data)[:100]: 136 | 137 | # Item description, category and the feature-opinion pairs. 138 | # 0. category 139 | # 1. caetgory + (feature1, opinion1) 140 | # 2. caetgory + (feature1, opinion1) + (feature2, opinion2) 141 | # ... 142 | task = {"user_item":"", "pos_des": "", "neg_des": "", "answer": "", "pos_ques": "", "neg_ques": ""} 143 | pos_item = user_item.split('@')[1] 144 | fo_pairs = [i.split('|')[:2] for i in feature_opinion.split(':')] 145 | # pos 146 | if int(pos_item) in self.item_category_dict.keys() and int(pos_item) in self.item_description_dict.keys(): 147 | item_description_and_category = self.item_category_dict[int(pos_item)].split('||') 148 | category = ' '.join(set(item_description_and_category[1].split('-'))) 149 | an = category 150 | task["user_item"] = user_item 151 | user = user_item.split('@')[0] 152 | if user not in self.users_all.keys(): 153 | self.users_all[user] = self.user_number 154 | self.user_number += 1 155 | 156 | task["answer"] = an 157 | pos_product_description = '-'.join(item_description_and_category[0].split('-')[:10]) 158 | pos_review_decription = '-'.join(self.item_description_dict[int(pos_item)].split('-')[:100]) 159 | task["pos_des"] = pos_product_description + '-' + pos_review_decription 160 | task["pos_ques"] = self.id_feature_dict[int(fo_pairs[0][0])] 161 | 162 | neg_item = random.choice(self.items) 163 | while (neg_item == pos_item or int(neg_item) not in self.item_category_dict.keys() or 
int(neg_item) not in self.item_description_dict.keys()): 164 | neg_item = random.choice(self.items) 165 | item_description_and_category = self.item_category_dict[int(neg_item)].split('||') 166 | 167 | neg_product_description = '-'.join(item_description_and_category[0].split('-')[:10]) 168 | neg_review_decription = '-'.join(self.item_description_dict[int(neg_item)].split('-')[:100]) 169 | task["neg_des"] = neg_product_description + '-' + neg_review_decription 170 | 171 | neg_ques = random.choice(list(self.id_feature_dict.values())) 172 | if neg_ques == self.id_feature_dict[int(fo_pairs[0][0])]:  # resample if the negative question collides with the positive one 173 | neg_ques = random.choice(list(self.id_feature_dict.values())) 174 | task["neg_ques"] = neg_ques 175 | tasks.append(task.copy()) 176 | 177 | 178 | for index in range(len(fo_pairs)): 179 | if index + 1 < len(fo_pairs): 180 | f = self.id_feature_dict[int(fo_pairs[index][0])] 181 | o = self.id_opinion_dict[int(fo_pairs[index][1])] 182 | task["user_item"] = user_item 183 | task["pos_des"] = pos_product_description + '-' + pos_review_decription 184 | task["neg_des"] = neg_product_description + '-' + neg_review_decription 185 | an += ' . 
' + f + ' ' + o 186 | task["answer"] = an 187 | task["pos_ques"] = self.id_feature_dict[int(fo_pairs[index+1][0])] 188 | neg_ques = random.choice(list(self.id_feature_dict.values())) 189 | if neg_ques == self.id_feature_dict[int(fo_pairs[index+1][0])]:  # resample if the negative question collides with the positive one 190 | neg_ques = random.choice(list(self.id_feature_dict.values())) 191 | task["neg_ques"] = neg_ques 192 | tasks.append(task.copy()) 193 | return tasks 194 | 195 | def get_test_raw_data(self): 196 | data = self.test_data.items() 197 | tasks = [] 198 | output_search_result_index = [] 199 | output_question_result_index = [] 200 | 201 | ''' 202 | item description (changing), item description mask, answer including n round conversation, question at n+1 round 203 | --> search task at round n 204 | item description, item description mask, answer including n round conversation, question at n+1 round (changing) 205 | --> question task at round n 206 | ''' 207 | if self.args.evaluate == 'search': 208 | for user_item, feature_opinion in list(data)[:100]: 209 | task = {"user_item": "", "pos_des": "", "neg_des": "", "answer": "", "pos_ques": "", "neg_ques": ""} 210 | user = user_item.split('@')[0] 211 | #item = user_item.split('@')[1] 212 | fo_pairs = [i.split('|')[:2] for i in feature_opinion.split(':')] 213 | conversation = '' 214 | for i in range(self.args.search_with_conversation_number): 215 | conversation += ' . 
' + self.id_feature_dict[int(fo_pairs[i][0])] + ' ' + \ 216 | self.id_opinion_dict[int(fo_pairs[i][1])] 217 | 218 | if int(user) in self.grund_truth.keys(): 219 | user = user_item.split('@')[0] 220 | if user not in self.users_all.keys(): 221 | self.users_all[user] = self.user_number 222 | self.user_number += 1 223 | 224 | for pos_item in self.item_candidates: 225 | if int(pos_item) in self.item_category_dict.keys() and int(pos_item) in self.item_description_dict.keys(): 226 | item_description_and_category = self.item_category_dict[int(pos_item)].split('||') 227 | category = ' '.join(set(item_description_and_category[1].split('-'))) 228 | an = category 229 | task["user_item"] = user_item 230 | task["answer"] = an + conversation 231 | pos_product_description = '-'.join(item_description_and_category[0].split('-')[:10]) 232 | pos_review_decription = '-'.join(self.item_description_dict[int(pos_item)].split('-')[:100]) 233 | task["pos_des"] = pos_product_description + '-' + pos_review_decription 234 | task["pos_ques"] = '' 235 | if pos_item in self.grund_truth[int(user)]: 236 | tmp = [user, pos_item, 1] 237 | else: 238 | tmp = [user, pos_item, 0] 239 | output_search_result_index.append(tmp) 240 | tasks.append(task.copy()) 241 | t = pd.DataFrame(output_search_result_index) 242 | t.to_csv(self.base_path + self.category+'output_'+self.args.evaluate+'_result_index', index=False, header=None) 243 | return tasks 244 | else: 245 | # predict the first aspect when there is no conversation. 
246 | for user_item, feature_opinion in list(data)[:100]: 247 | task = {"user_item": "", "pos_des": "", "neg_des": "", "answer": "", "pos_ques": "", "neg_ques": ""} 248 | user = user_item.split('@')[0] 249 | if user not in self.users_all.keys(): 250 | self.users_all[user] = self.user_number 251 | self.user_number += 1 252 | pos_item = user_item.split('@')[1] 253 | fo_pairs = [i.split('|')[:2] for i in feature_opinion.split(':')] 254 | conversation = '' 255 | if len(fo_pairs) > self.args.prediction_with_conversation_number: 256 | for i in range(self.args.prediction_with_conversation_number): 257 | conversation += ' . ' + self.id_feature_dict[int(fo_pairs[i][0])] + ' ' + \ 258 | self.id_opinion_dict[int(fo_pairs[i][1])] 259 | 260 | if int(pos_item) in self.item_category_dict.keys() and int( 261 | pos_item) in self.item_description_dict.keys(): 262 | item_description_and_category = self.item_category_dict[int(pos_item)].split('||') 263 | category = ' '.join(set(item_description_and_category[1].split('-'))) 264 | an = category 265 | pos_product_description = ' '.join(item_description_and_category[0].split('-')[:10]) 266 | pos_review_decription = self.item_description_dict[int(pos_item)] 267 | 268 | for pos_ques in self.question_cadidates: 269 | task["user_item"] = user_item 270 | task["answer"] = an + conversation 271 | task["pos_des"] = pos_product_description + ' ' + pos_review_decription 272 | task["pos_ques"] = pos_ques 273 | if pos_ques == self.id_feature_dict[int(fo_pairs[self.args.prediction_with_conversation_number][0])]: 274 | tmp = [user, pos_item, pos_ques, 1] 275 | else: 276 | tmp = [user, pos_item, pos_ques, 0] 277 | output_question_result_index.append(tmp) 278 | tasks.append(task.copy()) 279 | 280 | t = pd.DataFrame(output_question_result_index) 281 | t.to_csv(self.base_path + self.category+'output_'+self.args.evaluate+'_result_index', index=False, header=None) 282 | return tasks 283 | 284 | def get_norm(self, x): 285 | x = np.array(x) 286 | return 
np.sum(x * x) 287 | 288 | def process_train_input(self, data_raw): 289 | users = [] 290 | answers = [] 291 | pos_descriptions = [] 292 | neg_descriptions = [] 293 | pos_questions = [] 294 | neg_questions = [] 295 | 296 | pos_descriptions_masks = [] 297 | neg_descriptions_masks = [] 298 | answer_masks = [] 299 | 300 | for x in data_raw: 301 | user = x["user_item"].split('@')[0] 302 | users.append(self.users_all[user]) 303 | 304 | pos_des = x["pos_des"].lower().split('-') 305 | pos_des = [self.word_id_dict[w] for w in pos_des if w in self.word_id_dict.keys()] 306 | 307 | neg_des = x["neg_des"].lower().split('-') 308 | neg_des = [self.word_id_dict[w] for w in neg_des if w in self.word_id_dict.keys()] 309 | 310 | an = x["answer"].lower().split(' ') 311 | an = [self.word_id_dict[w] for w in an if w in self.word_id_dict.keys()] 312 | 313 | pos_des_pad = self.pad_input(pos_des, self.max_description_word_length, [0]) 314 | pos_descriptions.append(pos_des_pad) 315 | 316 | neg_des_pad = self.pad_input(neg_des, self.max_description_word_length, [0]) 317 | neg_descriptions.append(neg_des_pad) 318 | 319 | an_pad = self.pad_input(an, self.max_answer_word_length, [0]) 320 | answers.append(an_pad) 321 | 322 | pos_questions.append(self.word_id_dict[x["pos_ques"]]) 323 | neg_questions.append(self.word_id_dict[x["neg_ques"]]) 324 | 325 | 326 | if self.input_mask_mode == 'word': 327 | pos_mask_tmp = [index for index, w in enumerate(pos_des)] 328 | pos_mask = self.pad_input(pos_mask_tmp, self.max_description_word_length, [0]) 329 | pos_descriptions_masks.append(pos_mask) 330 | neg_mask_tmp = [index for index, w in enumerate(neg_des)] 331 | neg_mask = self.pad_input(neg_mask_tmp, self.max_description_word_length, [0]) 332 | neg_descriptions_masks.append(neg_mask) 333 | answer_mask_tmp = [index for index, w in enumerate(an)] 334 | answer_mask = self.pad_input(answer_mask_tmp, self.max_answer_word_length, [0]) 335 | answer_masks.append(answer_mask) 336 | 337 | elif self.input_mask_mode == 
'sentence': 338 | pos_mask = [index for index, w in enumerate(pos_des) if w == self.word_id_dict['.']] 339 | pos_mask = self.pad_input(pos_mask, self.max_description_sentence_length, [0]) 340 | pos_descriptions_masks.append(pos_mask) 341 | neg_mask = [index for index, w in enumerate(neg_des) if w == self.word_id_dict['.']] 342 | neg_mask = self.pad_input(neg_mask, self.max_description_sentence_length, [0]) 343 | neg_descriptions_masks.append(neg_mask) 344 | answer_mask_tmp = [index for index, w in enumerate(an)] 345 | answer_mask = self.pad_input(answer_mask_tmp, self.max_answer_word_length, [0]) 346 | answer_masks.append(answer_mask) 347 | #answer_mask = [index for index, w in enumerate(an) if w == self.word_id_dict['.']] 348 | #answer_mask = self.pad_input(answer_mask, self.max_answer_sentence_length, [0]) 349 | #answer_masks.append(answer_mask) 350 | 351 | else: 352 | raise ValueError("input_mask_mode is only available (word, sentence)") 353 | 354 | return (np.array(users, dtype=np.int32).tolist(), 355 | np.array(answers, dtype=np.int32).tolist(), 356 | np.array(pos_descriptions, dtype=np.int32).tolist(), 357 | np.array(neg_descriptions, dtype=np.int32).tolist(), 358 | np.array(pos_questions, dtype=np.int32).tolist(), 359 | np.array(neg_questions, dtype=np.int32).tolist(), 360 | np.array(answer_masks, dtype=np.int32).tolist(), 361 | np.array(pos_descriptions_masks, dtype=np.int32).tolist(), 362 | np.array(neg_descriptions_masks, dtype=np.int32).tolist()) 363 | 364 | def process_test_input(self, data_raw): 365 | users = [] 366 | answers = [] 367 | descriptions = [] 368 | questions = [] 369 | descriptions_masks = [] 370 | for x in data_raw: 371 | user = x["user_item"].split('@')[0] 372 | users.append(self.users_all[user]) 373 | pos_des = x["pos_des"].lower().split('-') 374 | pos_des = [self.word_id_dict[w] for w in pos_des if w in self.word_id_dict.keys()] 375 | an = x["answer"].lower().split(' ') 376 | an = [self.word_id_dict[w] for w in an if w in 
self.word_id_dict.keys()] 377 | des_pad = self.pad_input(pos_des, self.max_description_word_length, [0]) 378 | descriptions.append(des_pad) 379 | an_pad = self.pad_input(an, self.max_answer_word_length, [0]) 380 | answers.append(an_pad) 381 | if self.args.evaluate == 'search': 382 | questions.append(0) 383 | else: 384 | questions.append(self.word_id_dict[x["pos_ques"]]) 385 | 386 | if self.input_mask_mode == 'word': 387 | descriptions_masks.append(np.array([index for index, w in enumerate(pos_des)], dtype=np.int32)) 388 | elif self.input_mask_mode == 'sentence': 389 | pos_mask = [index for index, w in enumerate(pos_des) if w == self.word_id_dict['.']] 390 | pos_mask = self.pad_input(pos_mask, self.max_description_sentence_length, [0]) 391 | descriptions_masks.append(pos_mask) 392 | else: 393 | raise ValueError("input_mask_mode is only available (word, sentence)") 394 | 395 | return (np.array(users, dtype=np.int32).tolist(), 396 | np.array(answers, dtype=np.int32).tolist(), 397 | np.array(descriptions, dtype=np.int32).tolist(), 398 | np.array(questions, dtype=np.int32).tolist(), 399 | np.array(descriptions_masks, dtype=np.int32).tolist()) 400 | 401 | def pad_input(self, input_, size, pad_item): 402 | if size > len(input_): 403 | return input_ + pad_item * (size - len(input_)) 404 | else: 405 | return input_[:size] 406 | 407 | def get_train_batch_data(self, batch_size): 408 | l = len(self.train_answers) 409 | if self.train_batch_id + batch_size > l: 410 | batch_train_users = self.train_users[self.train_batch_id:] + self.train_users[:self.train_batch_id + batch_size - l] 411 | batch_train_answers = self.train_answers[self.train_batch_id:] + self.train_answers[:self.train_batch_id + batch_size - l] 412 | batch_train_pos_descriptions = self.train_pos_descriptions[self.train_batch_id:] + self.train_pos_descriptions[:self.train_batch_id + batch_size - l] 413 | batch_train_neg_descriptions = self.train_neg_descriptions[self.train_batch_id:] + 
self.train_neg_descriptions[:self.train_batch_id + batch_size - l] 414 | 415 | batch_train_pos_questions = self.train_pos_questions[self.train_batch_id:] + self.train_pos_questions[:self.train_batch_id + batch_size - l] 416 | batch_train_neg_questions = self.train_neg_questions[self.train_batch_id:] + self.train_neg_questions[:self.train_batch_id + batch_size - l] 417 | 418 | batch_train_answer_masks = self.train_answer_masks[self.train_batch_id:] + self.train_answer_masks[:self.train_batch_id + batch_size - l] 419 | batch_train_pos_descriptions_masks = self.train_pos_descriptions_masks[self.train_batch_id:] + self.train_pos_descriptions_masks[:self.train_batch_id + batch_size - l] 420 | batch_train_neg_descriptions_masks = self.train_neg_descriptions_masks[self.train_batch_id:] + self.train_neg_descriptions_masks[:self.train_batch_id + batch_size - l] 421 | 422 | self.train_batch_id = self.train_batch_id + batch_size - l 423 | 424 | else: 425 | batch_train_users = self.train_users[self.train_batch_id:self.train_batch_id + batch_size] 426 | batch_train_answers = self.train_answers[self.train_batch_id:self.train_batch_id + batch_size] 427 | batch_train_pos_descriptions = self.train_pos_descriptions[self.train_batch_id:self.train_batch_id + batch_size] 428 | batch_train_neg_descriptions = self.train_neg_descriptions[self.train_batch_id:self.train_batch_id + batch_size] 429 | batch_train_pos_questions = self.train_pos_questions[self.train_batch_id:self.train_batch_id + batch_size] 430 | batch_train_neg_questions = self.train_neg_questions[self.train_batch_id:self.train_batch_id + batch_size] 431 | batch_train_answer_masks = self.train_answer_masks[self.train_batch_id:self.train_batch_id + batch_size] 432 | batch_train_pos_descriptions_masks = self.train_pos_descriptions_masks[self.train_batch_id:self.train_batch_id + batch_size] 433 | batch_train_neg_descriptions_masks = self.train_neg_descriptions_masks[self.train_batch_id:self.train_batch_id + batch_size] 434 | 435 
| self.train_batch_id = self.train_batch_id + batch_size 436 | 437 | return [batch_train_answers, batch_train_pos_descriptions, batch_train_neg_descriptions, \ 438 | batch_train_pos_questions, batch_train_neg_questions, batch_train_answer_masks, \ 439 | batch_train_pos_descriptions_masks, batch_train_neg_descriptions_masks, batch_train_users] 440 | 441 | def get_test_batch_data(self, batch_size): 442 | l = len(self.test_answers) 443 | if self.test_batch_id + batch_size > l: 444 | batch_test_users = self.test_users[self.test_batch_id:] + self.test_users[:self.test_batch_id + batch_size - l] 445 | batch_test_answers = self.test_answers[self.test_batch_id:] + self.test_answers[:self.test_batch_id + batch_size - l] 446 | batch_test_pos_descriptions = self.test_pos_descriptions[self.test_batch_id:] + self.test_pos_descriptions[:self.test_batch_id + batch_size - l] 447 | batch_test_pos_questions = self.test_pos_questions[self.test_batch_id:] + self.test_pos_questions[:self.test_batch_id + batch_size - l] 448 | batch_test_pos_descriptions_masks = self.test_pos_descriptions_masks[self.test_batch_id:] + self.test_pos_descriptions_masks[:self.test_batch_id + batch_size - l] 449 | 450 | self.test_batch_id = self.test_batch_id + batch_size - l 451 | 452 | else: 453 | batch_test_users = self.test_users[self.test_batch_id:self.test_batch_id + batch_size] 454 | batch_test_answers = self.test_answers[self.test_batch_id:self.test_batch_id + batch_size] 455 | batch_test_pos_descriptions = self.test_pos_descriptions[self.test_batch_id:self.test_batch_id + batch_size] 456 | batch_test_pos_questions = self.test_pos_questions[self.test_batch_id:self.test_batch_id + batch_size] 457 | batch_test_pos_descriptions_masks = self.test_pos_descriptions_masks[self.test_batch_id:self.test_batch_id + batch_size] 458 | 459 | self.test_batch_id = self.test_batch_id + batch_size 460 | 461 | return [batch_test_answers, batch_test_pos_descriptions, batch_test_pos_questions, 
batch_test_pos_descriptions_masks, batch_test_users] -------------------------------------------------------------------------------- /dataset/README.md: -------------------------------------------------------------------------------- 1 | Dataset can be downloaded at https://www.dropbox.com/s/kugsfgkwksosesh/Datasets%28Conversational-Search-and-Recommendation%29.zip?dl=0 2 | -------------------------------------------------------------------------------- /dynamic_memory/__init__.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | from .encoder import Encoder 3 | from .episode import Episode 4 | 5 | 6 | 7 | class Graph: 8 | 9 | def __init__(self, params): 10 | self.params = params 11 | 12 | def build_loss(self, 13 | embedding_user, 14 | embedding_all_description, 15 | embedding_pos_description, 16 | embedding_neg_description, 17 | embedding_answer, 18 | all_description_mask, 19 | pos_description_mask, 20 | neg_description_mask, 21 | embedding_pos_question, 22 | embedding_neg_question): 23 | self.all = self.params['item_number'] 24 | 25 | facts, answer = self._build_input_module(embedding_pos_description, pos_description_mask, embedding_answer, reuse=False) 26 | pos_search_memory, _ = self._build_episodic_memory(facts, answer, reuse=False) 27 | pos_search = self._build_search_decoder(embedding_user, pos_search_memory, answer, reuse=False) 28 | facts, answer = self._build_input_module(embedding_neg_description, neg_description_mask, embedding_answer, reuse=True) 29 | neg_search_memory, _ = self._build_episodic_memory(facts, answer, reuse=True) 30 | neg_search = self._build_search_decoder(embedding_user, neg_search_memory, answer, reuse=True) 31 | 32 | scat_embedding_all_description = tf.tile(embedding_all_description, [self.params['batch_size'], 1, 1]) 33 | scat_all_description_mask = tf.tile(all_description_mask, [self.params['batch_size'], 1]) 34 | scat_embedding_answer = 
tf.reshape(tf.tile(embedding_answer, [1, self.all, 1]),[-1, self.params['max_answer_word_length'], self.params['embed_dim']]) 35 | 36 | scat_facts, scat_answer = self._build_input_module(scat_embedding_all_description, scat_all_description_mask, scat_embedding_answer, reuse=True) 37 | _, question_memory = self._build_episodic_memory(scat_facts, scat_answer, reuse=True) 38 | pos_question = self._build_question_decoder(embedding_user, question_memory, answer, embedding_pos_question, reuse=False) 39 | neg_question = self._build_question_decoder(embedding_user, question_memory, answer, embedding_neg_question, reuse=True) 40 | 41 | 42 | with tf.variable_scope('search_loss'): 43 | w = tf.get_variable("w", [self.params['num_units']/2, 1], regularizer=tf.contrib.layers.l2_regularizer(self.params['reg_scale'])) 44 | b = tf.get_variable("b", [1, 1], regularizer=tf.contrib.layers.l2_regularizer(self.params['reg_scale'])) 45 | pos_s = tf.log_sigmoid(tf.matmul(pos_search, w) + b) 46 | neg_s = tf.log_sigmoid(-tf.matmul(neg_search, w) - b) 47 | search_loss = - pos_s - neg_s 48 | 49 | 50 | with tf.variable_scope('question_loss'): 51 | w = tf.get_variable("w", [self.params['num_units'], 1], regularizer=tf.contrib.layers.l2_regularizer(self.params['reg_scale'])) 52 | b = tf.get_variable("b", [1, 1], regularizer=tf.contrib.layers.l2_regularizer(self.params['reg_scale'])) 53 | pos_q = tf.log_sigmoid(tf.matmul(pos_question, w) + b) 54 | neg_q = tf.log_sigmoid(-tf.matmul(neg_question, w) - b) 55 | question_loss = - pos_q - neg_q 56 | 57 | reg_term = tf.reduce_sum(tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES)) 58 | total_loss = tf.reduce_mean(tf.add(search_loss, question_loss))+reg_term 59 | return total_loss 60 | 61 | 62 | def _build_input_module(self, embedding_input, input_mask, embedding_question, reuse=False): 63 | encoder = Encoder( 64 | encoder_type=self.params['encoder_type'], 65 | num_layers=self.params['num_layers'], 66 | cell_type=self.params['cell_type'], 67 | 
num_units=self.params['num_units'], 68 | dropout=self.params['dropout']) 69 | 70 | # slice zeros padding 71 | input_length = tf.reduce_max(input_mask, axis=1) 72 | question_length = tf.reduce_sum(tf.to_int32(tf.not_equal(tf.reduce_max(embedding_question, axis=2), 73 | self.params['PAD_ID'])), axis=1) 74 | 75 | with tf.variable_scope("input-module") as scope: 76 | input_encoder_outputs, _ = encoder.build(embedding_input, input_length, scope="encoder", reuse=reuse) 77 | 78 | with tf.variable_scope("facts") as scope: 79 | batch_size = tf.shape(input_mask)[0] 80 | max_mask_length = tf.shape(input_mask)[1] 81 | 82 | def get_encoded_fact(i): 83 | nonlocal input_mask 84 | 85 | mask_lengths = tf.reduce_sum(tf.to_int32(tf.not_equal(input_mask[i], self.params['PAD_ID'])), axis=0) 86 | input_mask = tf.boolean_mask(input_mask[i], tf.sequence_mask(mask_lengths, max_mask_length)) 87 | 88 | encoded_facts = tf.gather_nd(input_encoder_outputs[i], tf.reshape(input_mask, [-1, 1])) 89 | padding = tf.zeros(tf.stack([max_mask_length - mask_lengths, self.params['num_units']])) 90 | return tf.concat([encoded_facts, padding], 0) 91 | 92 | facts_stacked = tf.map_fn(get_encoded_fact, tf.range(start=0, limit=batch_size), dtype=tf.float32) 93 | # max_input_mask_length x [batch_size, num_units] 94 | facts = tf.unstack(tf.transpose(facts_stacked, [1, 0, 2]), num=self.params['max_description_sentence_length']) 95 | 96 | with tf.variable_scope("input-module") as scope: 97 | scope.reuse_variables() 98 | _, question = encoder.build(embedding_question, question_length, scope="encoder", reuse=reuse) 99 | return facts, question[0] 100 | 101 | def _build_episodic_memory(self, facts, question, reuse=False): 102 | with tf.variable_scope('episodic-memory-module', reuse=reuse) as scope: 103 | memory = tf.identity(question) 104 | episode = Episode(self.params['num_units'], reg_scale=self.params['reg_scale'], reuse=reuse) 105 | rnn = tf.contrib.rnn.GRUCell(self.params['num_units']) 106 | updated_memory = 
episode.update(facts, tf.transpose(memory, name="m"), tf.transpose(question, name="q")) 107 | search_memory, _ = rnn(updated_memory, memory, scope="memory_rnn") 108 | scope.reuse_variables() 109 | updated_memory = episode.update(facts, tf.transpose(memory, name="m"), tf.transpose(question, name="q")) 110 | question_memory, _ = rnn(updated_memory, memory, scope="memory_rnn") 111 | return search_memory, question_memory 112 | 113 | def _build_search_decoder(self, embedding_user, last_memory, current_answer, reuse=False): 114 | with tf.variable_scope('search-module', reuse=reuse): 115 | w1 = tf.get_variable("w1", [self.params['embed_dim']+ 2 * self.params['num_units'], self.params['num_units']], regularizer=tf.contrib.layers.l2_regularizer(self.params['reg_scale'])) 116 | b1 = tf.get_variable("b1", [1, self.params['num_units']], regularizer=tf.contrib.layers.l2_regularizer(self.params['reg_scale'])) 117 | w2 = tf.get_variable("w2", [self.params['num_units'],self.params['num_units']/2], regularizer=tf.contrib.layers.l2_regularizer(self.params['reg_scale'])) 118 | b2 = tf.get_variable("b2", [1, self.params['num_units']/2], regularizer=tf.contrib.layers.l2_regularizer(self.params['reg_scale'])) 119 | z = tf.concat([current_answer, last_memory, embedding_user], 1) 120 | o1 = tf.nn.elu(tf.matmul(z, w1) + b1) 121 | o2 = tf.nn.elu(tf.matmul(o1, w2) + b2) 122 | # return $x_i$, which will be used to compute the probability of item i 123 | return o2 124 | 125 | def _build_question_decoder(self, embedding_user, last_memory, current_answer, next_question, reuse=False): 126 | last_memory_mean = tf.reduce_mean(tf.reshape(last_memory, [-1, self.all, self.params['num_units']]), 1) 127 | 128 | with tf.variable_scope('question-module', reuse=reuse): 129 | w1 = tf.get_variable("w1", [2 * self.params['num_units']+ 2*self.params['embed_dim'], 2*self.params['num_units']], regularizer=tf.contrib.layers.l2_regularizer(self.params['reg_scale'])) 130 | b1 = tf.get_variable("b1", [1, 
2*self.params['num_units']], regularizer=tf.contrib.layers.l2_regularizer(self.params['reg_scale'])) 131 | w2 = tf.get_variable("w2", [2*self.params['num_units'], self.params['num_units']], regularizer=tf.contrib.layers.l2_regularizer(self.params['reg_scale'])) 132 | b2 = tf.get_variable("b2", [1, self.params['num_units']], regularizer=tf.contrib.layers.l2_regularizer(self.params['reg_scale'])) 133 | 134 | z = tf.concat([current_answer, last_memory_mean, next_question, embedding_user], 1) 135 | o1 = tf.nn.elu(tf.matmul(z, w1) + b1) 136 | o2 = tf.nn.elu(tf.matmul(o1, w2) + b2) 137 | # return $x_{k+1}$, which will be used to compute the probability of asking the next question (k+1) 138 | return o2 139 | 140 | def build_search_prediction(self, embedding_user, embedding_description, embedding_answer, description_mask): 141 | facts, answer = self._build_input_module(embedding_description, description_mask, embedding_answer, reuse=True) 142 | search_memory, question_memory = self._build_episodic_memory(facts, answer, reuse=True) 143 | search = self._build_search_decoder(embedding_user, search_memory, answer, reuse=True) 144 | with tf.variable_scope('search_loss', reuse=True): 145 | w = tf.get_variable("w", [self.params['num_units']/2, 1], regularizer=tf.contrib.layers.l2_regularizer(self.params['reg_scale'])) 146 | b = tf.get_variable("b", [1, 1], regularizer=tf.contrib.layers.l2_regularizer(self.params['reg_scale'])) 147 | s = tf.log_sigmoid(tf.matmul(search, w) + b) 148 | return s 149 | 150 | def build_question_prediction(self, embedding_user, embedding_description, description_mask, all_embedding_description, embedding_answer, all_description_mask, embedding_question): 151 | _, answer = self._build_input_module(embedding_description, description_mask, embedding_answer, reuse=True) 152 | scat_embedding_all_description = tf.tile(all_embedding_description, [self.params['batch_size'], 1, 1]) 153 | scat_all_description_mask = tf.tile(all_description_mask, 
[self.params['batch_size'], 1]) 154 | scat_embedding_answer = tf.reshape(tf.tile(embedding_answer, [1, self.all, 1]),[-1, self.params['max_answer_word_length'], 155 | self.params['embed_dim']]) 156 | scat_facts, scat_answer = self._build_input_module(scat_embedding_all_description, scat_all_description_mask, scat_embedding_answer, reuse=True) 157 | _, question_memory = self._build_episodic_memory(scat_facts, scat_answer, reuse=True) 158 | question = self._build_question_decoder(embedding_user, question_memory, answer, embedding_question, reuse=True) 159 | 160 | 161 | with tf.variable_scope('question_loss', reuse=True): 162 | w = tf.get_variable("w", [self.params['num_units'], 1], regularizer=tf.contrib.layers.l2_regularizer(self.params['reg_scale'])) 163 | b = tf.get_variable("b", [1, 1], regularizer=tf.contrib.layers.l2_regularizer(self.params['reg_scale'])) 164 | q = tf.log_sigmoid(tf.matmul(question, w) + b) 165 | return q 166 | -------------------------------------------------------------------------------- /dynamic_memory/__pycache__/__init__.cpython-36.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/evison/Conversational/77a8f20ba367e4f71a966ceebfb2588b0ffd084e/dynamic_memory/__pycache__/__init__.cpython-36.pyc -------------------------------------------------------------------------------- /dynamic_memory/__pycache__/encoder.cpython-36.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/evison/Conversational/77a8f20ba367e4f71a966ceebfb2588b0ffd084e/dynamic_memory/__pycache__/encoder.cpython-36.pyc -------------------------------------------------------------------------------- /dynamic_memory/__pycache__/episode.cpython-36.pyc: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/evison/Conversational/77a8f20ba367e4f71a966ceebfb2588b0ffd084e/dynamic_memory/__pycache__/episode.cpython-36.pyc -------------------------------------------------------------------------------- /dynamic_memory/encoder.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | 3 | class Encoder: 4 | """Encoder is a multi-layer recurrent neural network. 5 | 6 | The 'Encoder' encodes a sequential input vector. 7 | """ 8 | 9 | UNI_ENCODER_TYPE = "uni" 10 | BI_ENCODER_TYPE = "bi" 11 | 12 | RNN_GRU_CELL = "gru" 13 | RNN_LSTM_CELL = "lstm" 14 | RNN_LAYER_NORM_LSTM_CELL = "layer_norm_lstm" 15 | RNN_NAS_CELL = "nas" 16 | 17 | def __init__(self, encoder_type="uni", num_layers=1, 18 | cell_type="gru", num_units=512, dropout=0.8, 19 | dtype=tf.float32): 20 | """Constructs an 'Encoder' instance. 21 | 22 | * Args: 23 | encoder_type: RNN encoder type ("uni" or "bi") 24 | num_layers: number of simple cells stacked into one multi-layer RNN cell 25 | input_vector: RNN input vectors (passed to build()) 26 | sequence_length: sequence length of each batch element (passed to build()) 27 | cell_type: RNN cell type ("lstm", "gru", "layer_norm_lstm", "nas") 28 | num_units: number of units in each cell 29 | dropout: probability of dropping the inputs of the given cell.
30 | dtype: the dtype of the input 31 | 32 | * Returns: 33 | Encoder instance 34 | """ 35 | 36 | self.encoder_type = encoder_type 37 | self.num_layers = num_layers 38 | self.cell_type = cell_type 39 | self.num_units = num_units 40 | self.dropout = dropout 41 | self.dtype = dtype 42 | 43 | def build(self, input_vector, sequence_length, scope=None, reuse=False): 44 | with tf.variable_scope('build_uniencoder', reuse=reuse) as scope: 45 | if self.encoder_type == self.UNI_ENCODER_TYPE: 46 | ''' 47 | initializer = tf.random_uniform_initializer(-1, 1) 48 | cell = tf.nn.rnn_cell.LSTMCell(self.num_units, initializer=initializer) 49 | outputs, _ = tf.nn.dynamic_rnn(cell, input_vector, sequence_length=sequence_length, dtype=tf.float32, time_major=False) 50 | return outputs, _ 51 | ''' 52 | self.cells = self._create_rnn_cells() 53 | return self.unidirectional_rnn(input_vector, sequence_length, scope=scope) 54 | elif self.encoder_type == self.BI_ENCODER_TYPE: 55 | self.cells_fw = self._create_rnn_cells(is_list=True) 56 | self.cells_bw = self._create_rnn_cells(is_list=True) 57 | return self.bidirectional_rnn(input_vector, sequence_length, scope=scope) 58 | 59 | else: 60 | raise ValueError(f"Unknown encoder_type: {self.encoder_type}") 61 | 62 | 63 | def unidirectional_rnn(self, input_vector, sequence_length, scope=None): 64 | return tf.nn.dynamic_rnn( 65 | self.cells, 66 | input_vector, 67 | sequence_length=sequence_length, 68 | dtype=self.dtype, 69 | time_major=False, 70 | swap_memory=True, 71 | scope=scope) 72 | 73 | def bidirectional_rnn(self, input_vector, sequence_length, scope=None): 74 | outputs, output_state_fw, output_state_bw = tf.contrib.rnn.stack_bidirectional_dynamic_rnn( 75 | self.cells_fw, 76 | self.cells_bw, 77 | input_vector, 78 | sequence_length=sequence_length, 79 | dtype=self.dtype, 80 | scope=scope) 81 | 82 | encoder_final_state = tf.concat((output_state_fw[-1], output_state_bw[-1]), axis=1) 83 | return outputs, encoder_final_state 84 | 85 | def 
_create_rnn_cells(self, is_list=False): 86 | """Constructs a stacked RNN with num_layers layers 87 | 88 | * Args: 89 | is_list: if True, return a list of cells (for stacked bidirectional RNNs); if False, return a single MultiRNNCell (for unidirectional RNNs) 90 | 91 | * Returns: 92 | stacked_rnn 93 | """ 94 | 95 | stacked_rnn = [] 96 | for _ in range(self.num_layers): 97 | single_cell = self._rnn_single_cell() 98 | stacked_rnn.append(single_cell) 99 | 100 | if is_list: 101 | return stacked_rnn 102 | else: 103 | return tf.nn.rnn_cell.MultiRNNCell( 104 | cells=stacked_rnn, 105 | state_is_tuple=True) 106 | 107 | def _rnn_single_cell(self): 108 | """Constructs a single RNN cell""" 109 | 110 | if self.cell_type == self.RNN_GRU_CELL: 111 | single_cell = tf.contrib.rnn.GRUCell( 112 | self.num_units, 113 | reuse=tf.get_variable_scope().reuse) 114 | elif self.cell_type == self.RNN_LSTM_CELL: 115 | single_cell = tf.contrib.rnn.BasicLSTMCell( 116 | self.num_units, 117 | forget_bias=1.0, 118 | reuse=tf.get_variable_scope().reuse) 119 | elif self.cell_type == self.RNN_LAYER_NORM_LSTM_CELL: 120 | single_cell = tf.contrib.rnn.LayerNormBasicLSTMCell( 121 | self.num_units, 122 | forget_bias=1.0, 123 | layer_norm=True, 124 | reuse=tf.get_variable_scope().reuse) 125 | elif self.cell_type == self.RNN_NAS_CELL: 126 | single_cell = tf.contrib.rnn.NASCell( 127 | self.num_units) 128 | else: 129 | raise ValueError(f"Unknown rnn cell type: {self.cell_type}") 130 | 131 | if self.dropout > 0.0:  # self.dropout is the probability of dropping an input 132 | single_cell = tf.contrib.rnn.DropoutWrapper( 133 | cell=single_cell, input_keep_prob=(1.0 - self.dropout)) 134 | 135 | return single_cell 136 | -------------------------------------------------------------------------------- /dynamic_memory/episode.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | 3 | 4 | 5 | class Episode: 6 | """Episode class updates the memory in the Episodic Memory Module""" 7 | 8 | def __init__(self, num_units, reg_scale=0.001, reuse=False): 9 | self.gate = AttentionGate(hidden_size=num_units, reg_scale=reg_scale, reuse=reuse) 10 | self.rnn = tf.contrib.rnn.GRUCell(num_units) 11 | 12 | def update(self, c, m_t, q_t): 13 | """Update memory with an attention mechanism 14 | 15 | * Args: 16 | c : encoded facts, stacked per sentence 17 | shape: fact_count x [batch_size, num_units] 18 | m_t : transpose of the previous memory 19 | shape: [num_units, batch_size] 20 | q_t : transpose of the encoded question's last state 21 | shape: [num_units, batch_size] 22 | 23 | * Returns: 24 | h : updated memory (attention-weighted average of the facts) 25 | """ 26 | h = tf.zeros_like(c[0]) 27 | init_w = self.gate.score(tf.transpose(c[0], name="init_shape"), m_t, q_t) 28 | w = tf.zeros_like(init_w) 29 | with tf.variable_scope('memory-update') as scope: 30 | for fact in c: 31 | g = self.gate.score(tf.transpose(fact, name="c"), m_t, q_t) 32 | #h = g * self.rnn(fact, h, scope="episode_rnn")[0] + (1 - g) * h 33 | w = tf.add(g, w) 34 | h = tf.add(g * fact, h) 35 | scope.reuse_variables() 36 | return tf.div(h, w) 37 | 38 | 39 | class AttentionGate: 40 | """AttentionGate is a simple two-layer feed-forward neural network with a score function.""" 41 | 42 | def __init__(self, hidden_size=4, reg_scale=0.001, reuse=False): 43 | with tf.variable_scope('attention_weight', reuse=reuse) as scope: 44 | self.w1 = tf.get_variable( 45 | "w1", [hidden_size, 3 * hidden_size], 46 | 
regularizer=tf.contrib.layers.l2_regularizer(reg_scale)) 47 | self.b1 = tf.get_variable("b1", [hidden_size, 1]) 48 | self.w2 = tf.get_variable( 49 | "w2", [1, hidden_size], 50 | regularizer=tf.contrib.layers.l2_regularizer(reg_scale)) 51 | self.b2 = tf.get_variable("b2", [1, 1]) 52 | 53 | def score(self, c_t, m_t, q_t): 54 | """Captures a variety of similarities between the input (c), memory (m) and question (q) 55 | 56 | * Args: 57 | c_t : transpose of one fact (encoded sentence's last state) 58 | shape: [num_units, batch_size] 59 | m_t : transpose of previous memory 60 | shape: [num_units, batch_size] 61 | q_t : transpose of encoded question 62 | shape: [num_units, batch_size] 63 | 64 | * Returns: 65 | gate score 66 | shape: [batch_size, 1] 67 | """ 68 | 69 | with tf.variable_scope('attention_gate'): 70 | #z = tf.concat([c_t, m_t, q_t, c_t*q_t, c_t*m_t, (c_t-q_t)**2, (c_t-m_t)**2], 0) 71 | z = tf.concat([c_t, m_t, q_t], 0) 72 | o1 = tf.nn.tanh(tf.matmul(self.w1, z) + self.b1) 73 | o2 = tf.nn.sigmoid(tf.matmul(self.w2, o1) + self.b2) 74 | return tf.transpose(o2) 75 | -------------------------------------------------------------------------------- /images/README: -------------------------------------------------------------------------------- 1 | 2 | -------------------------------------------------------------------------------- /images/README.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/evison/Conversational/77a8f20ba367e4f71a966ceebfb2588b0ffd084e/images/README.png -------------------------------------------------------------------------------- /main.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | import argparse 3 | import configparser 4 | import tensorflow as tf 5 | from data_loader import DataLoader 6 | from model import Model 7 | import pandas as pd 8 | import numpy as np 9 | 10 | class solver(): 11 | 12 | def 
__init__(self, model, args): 13 | self.model = model 14 | self.args = args 15 | 16 | self.data_loader = DataLoader(self.args) 17 | self.data_loader.make_train_and_test_set() 18 | self.train_sample_num = self.data_loader.train_sample_num 19 | self.test_sample_num = self.data_loader.test_sample_num 20 | self.all_d, self.all_d_mask = self.data_loader.get_all_description() 21 | 22 | self.params = {} 23 | for k, v in vars(args).items(): 24 | self.params[k] = v 25 | self.params['max_description_word_length'] = self.data_loader.max_description_word_length 26 | self.params['max_description_sentence_length'] = self.data_loader.max_description_sentence_length 27 | self.params['max_answer_word_length'] = self.data_loader.max_answer_word_length 28 | self.params['max_answer_sentence_length'] = self.data_loader.max_answer_sentence_length 29 | self.params['item_number'] = len(self.all_d) 30 | self.params['user_number'] = self.data_loader.user_number 31 | 32 | self.model.build_graph_init(self.params) 33 | self.model.build_graph() 34 | self.s_prediction = self.model.search_predictions 35 | self.q_prediction = self.model.question_predictions 36 | self.train_op = self.model.train_op 37 | 38 | def HIT(self, ground_truth, pred): 39 | result = [] 40 | print(len(ground_truth)) 41 | print(ground_truth) 42 | print(pred) 43 | for k,v in ground_truth.items(): 44 | ground = v 45 | fit = [i[0] for i in pred[k]][:1] 46 | tmp = 0 47 | for j in range(len(fit)): 48 | if fit[j] in ground: 49 | tmp += 1 50 | if tmp > 0: 51 | result.append(1) 52 | else: 53 | result.append(0) 54 | return np.array(result).mean() 55 | 56 | def MAP(self, ground_truth, pred): 57 | result = [] 58 | for k,v in ground_truth.items(): 59 | ground = v 60 | fit = [i[0] for i in pred[k]][:100] 61 | tmp = 0 62 | hit = 0 63 | for j in range(len(fit)): 64 | if fit[j] in ground: 65 | hit += 1 66 | tmp += hit / (j+1) 67 | result.append(tmp) 68 | return np.array(result).mean() 69 | 70 | def MRR(self, ground_truth, pred): 71 | result = 
[] 72 | for k, v in ground_truth.items(): 73 | ground = v 74 | fit = [i[0] for i in pred[k]][:100] 75 | tmp = 0 76 | for j in range(len(fit)): 77 | if fit[j] in ground: 78 | tmp = 1 / (j + 1) 79 | break 80 | result.append(tmp) 81 | return np.array(result).mean() 82 | 83 | def NDCG(self, ground_truth, pred): 84 | result = [] 85 | for k, v in ground_truth.items(): 86 | ground = v 87 | fit = [i[0] for i in pred[k]][:10] 88 | temp = 0 89 | Z_u = 0 90 | for j in range(len(fit)): 91 | Z_u = Z_u + 1 / np.log2(j + 2) 92 | if fit[j] in ground: 93 | temp = temp + 1 / np.log2(j + 2) 94 | if Z_u == 0: 95 | temp = 0 96 | else: 97 | temp = temp / Z_u 98 | result.append(temp) 99 | return np.array(result).mean() 100 | 101 | def evaluate(self): 102 | index_path = self.args.base_path + self.args.category+'output_' + self.args.evaluate + '_result_index' 103 | index = pd.read_csv(index_path, header=None) 104 | predictions_path = self.args.base_path + self.args.category+'output_' + self.args.evaluate + '_result' 105 | predictions = pd.read_csv(predictions_path, header=None) 106 | 107 | ground_truth = {} 108 | pred = {} 109 | 110 | l = len(predictions.values) 111 | for i in range(l): 112 | ind = index.values[i] 113 | pre = predictions.values[i][0] 114 | user = ind[0] 115 | item = ind[1] 116 | pur_or_not = ind[2] 117 | 118 | if pur_or_not == 1: 119 | if user not in ground_truth.keys(): 120 | ground_truth[user] = [item] 121 | else: 122 | ground_truth[user].append(item) 123 | 124 | if user not in pred.keys(): 125 | pred[user] = {item: pre} 126 | else: 127 | pred[user][item] = pre 128 | 129 | for k,v in pred.items(): 130 | pred[k] = sorted(v.items(), key=lambda item: item[1])[::-1] 131 | 132 | hit = self.HIT(ground_truth, pred) 133 | map = self.MAP(ground_truth, pred) 134 | mrr = self.MRR(ground_truth, pred) 135 | ndcg = self.NDCG(ground_truth, pred) 136 | return map, mrr, ndcg, hit 137 | 138 | def evaluate_q(self): 139 | index_path = self.args.base_path + self.args.category+'output_' + 
self.args.evaluate + '_result_index' 140 | index = pd.read_csv(index_path, header=None) 141 | predictions_path = self.args.base_path + self.args.category+'output_' + self.args.evaluate + '_result' 142 | predictions = pd.read_csv(predictions_path, header=None) 143 | 144 | ground_truth = {} 145 | pred = {} 146 | 147 | l = len(predictions.values) 148 | for i in range(l): 149 | ind = index.values[i] 150 | pre = predictions.values[i][0] 151 | user = ind[0] 152 | item = ind[1] 153 | ui = str(user)+"@"+str(item) 154 | ques = ind[2] 155 | pur_or_not = ind[3] 156 | 157 | if pur_or_not == 1: 158 | if ui not in ground_truth.keys(): 159 | ground_truth[ui] = [ques] 160 | else: 161 | ground_truth[ui].append(ques) 162 | 163 | if ui not in pred.keys(): 164 | pred[ui] = {ques: pre} 165 | else: 166 | pred[ui][ques] = pre 167 | 168 | for k,v in pred.items(): 169 | pred[k] = sorted(v.items(), key=lambda ques: ques[1])[::-1] 170 | 171 | hit = self.HIT(ground_truth, pred) 172 | map = self.MAP(ground_truth, pred) 173 | mrr = self.MRR(ground_truth, pred) 174 | ndcg = self.NDCG(ground_truth, pred) 175 | return map, mrr, ndcg, hit 176 | 177 | def run(self): 178 | with tf.Session() as self.sess: 179 | init = tf.initialize_all_variables() 180 | self.sess.run(init) 181 | best_value = 0.0 182 | best_result = [] 183 | for epoch in range(self.args.epoch_number): 184 | for step in range(int(self.train_sample_num/self.args.batch_size)): 185 | print('epoch: %s, step %s' % (epoch, step)) 186 | train_input_fn = self.data_loader.get_train_batch_data(self.args.batch_size) 187 | #print(self.all_d[0]) 188 | #input() 189 | self.sess.run(self.train_op, feed_dict={ 190 | self.model.answer_placeholder: train_input_fn[0], 191 | self.model.all_description_placeholder: self.all_d, 192 | self.model.pos_description_placeholder: train_input_fn[1], 193 | self.model.neg_description_placeholder: train_input_fn[2], 194 | self.model.pos_question_placeholder: train_input_fn[3], 195 | self.model.neg_question_placeholder: 
train_input_fn[4], 196 | self.model.answer_mask_placeholder: train_input_fn[5], 197 | self.model.all_descriptions_mask_placeholder: self.all_d_mask, 198 | self.model.pos_descriptions_mask_placeholder: train_input_fn[6], 199 | self.model.neg_descriptions_mask_placeholder: train_input_fn[7], 200 | self.model.user_placeholder: train_input_fn[8], 201 | }) 202 | 203 | if step % 10 == 0: 204 | if self.args.evaluate == 'search': 205 | result = [] 206 | for _ in range(self.test_sample_num): 207 | test_input_fn = self.data_loader.get_test_batch_data(1) 208 | s = self.sess.run(self.s_prediction, feed_dict={ 209 | self.model.answer_placeholder: test_input_fn[0], 210 | self.model.pos_description_placeholder: test_input_fn[1], 211 | self.model.pos_descriptions_mask_placeholder: test_input_fn[3], 212 | self.model.user_placeholder: test_input_fn[4] 213 | }) 214 | result += [i.tolist() for i in list(s)] 215 | t = pd.DataFrame(result) 216 | t.to_csv(self.args.base_path + self.args.category + 'output_'+self.args.evaluate+'_result', index=False, 217 | header=None) 218 | map, mrr, ndcg, hit = self.evaluate() 219 | if map > best_value: 220 | best_result = [map, mrr, ndcg, hit] 221 | best_value = map 222 | print('map = %s, mrr = %s, ndcg = %s, hit = %s' % (map, mrr, ndcg, hit)) 223 | print('current best:%s' % (str(best_result))) 224 | 225 | else: 226 | print(str(self.test_sample_num)) 227 | result = [] 228 | for _ in range(int(self.test_sample_num/self.args.batch_size)): 229 | test_input_fn = self.data_loader.get_test_batch_data(self.args.batch_size) 230 | q = self.sess.run(self.q_prediction, feed_dict={ 231 | self.model.answer_placeholder: test_input_fn[0], 232 | self.model.all_description_placeholder: self.all_d, 233 | self.model.all_descriptions_mask_placeholder: self.all_d_mask, 234 | self.model.pos_description_placeholder: test_input_fn[1], 235 | self.model.pos_question_placeholder: test_input_fn[2], 236 | self.model.pos_descriptions_mask_placeholder: test_input_fn[3], 237 | 
self.model.user_placeholder: test_input_fn[4] 238 | }) 239 | result += [i.tolist() for i in list(q)] 240 | print('evel') 241 | t = pd.DataFrame(result) 242 | t.to_csv(self.args.base_path + self.args.category + 'output_'+self.args.evaluate+'_result', index=False, 243 | header=None) 244 | map, mrr, ndcg, hit = self.evaluate_q() 245 | if map > best_value: 246 | best_result = [map, mrr, ndcg, hit] 247 | best_value = map 248 | print('map = %s, mrr = %s, ndcg = %s, hit = %s' % (map, mrr, ndcg, hit)) 249 | print('current best:%s' % (str(best_result))) 250 | 251 | if __name__ == '__main__': 252 | cf = configparser.ConfigParser() 253 | cf.read("./conf/default_setting.conf") 254 | parser = argparse.ArgumentParser() 255 | # 21 parameters 256 | parser.add_argument('--batch_size', type=int, default=cf.get("parameters", "batch_size"), required=False, 257 | help='batch_size') 258 | parser.add_argument('--use_pretrained', type=int, default=cf.get("parameters", "use_pretrained"), required=False, 259 | help='use_pretrained') 260 | parser.add_argument('--embed_dim', type=int, default=cf.get("parameters", "embed_dim"), required=False, 261 | help='embed_dim') 262 | parser.add_argument('--encoder_type', type=str, default=cf.get("parameters", "encoder_type"), required=False, 263 | help='encoder_type') 264 | parser.add_argument('--cell_type', type=str, default=cf.get("parameters", "cell_type"), required=False, 265 | help='cell_type') 266 | parser.add_argument('--evaluate', type=str, default=cf.get("parameters", "evaluate"), required=False, 267 | help='evaluate') 268 | parser.add_argument('--num_layers', type=int, default=cf.get("parameters", "num_layers"), required=False, 269 | help='num_layers') 270 | parser.add_argument('--num_units', type=int, default=cf.get("parameters", "num_units"), required=False, 271 | help='num_units') 272 | parser.add_argument('--memory_question_hob', type=int, default=cf.get("parameters", "memory_question_hob"), required=False, 273 | help='memory_question_hob') 
274 | parser.add_argument('--memory_search_hob', type=int, default=cf.get("parameters", "memory_search_hob"), required=False, 275 | help='memory_search_hob') 276 | parser.add_argument('--dropout', type=float, default=cf.get("parameters", "dropout"), required=False, 277 | help='dropout') 278 | parser.add_argument('--reg_scale', type=float, default=cf.get("parameters", "reg_scale"), required=False, 279 | help='reg_scale') 280 | parser.add_argument('--learning_rate', type=float, default=cf.get("train", "learning_rate"), required=False, 281 | help='learning_rate') 282 | parser.add_argument('--optimizer', type=str, default=cf.get("train", "optimizer"), required=False, 283 | help='optimizer') 284 | parser.add_argument('--train_steps', type=int, default=cf.get("train", "train_steps"), required=False, 285 | help='train_steps') 286 | parser.add_argument('--test_steps', type=int, default=cf.get("train", "test_steps"), required=False, 287 | help='test_steps') 288 | parser.add_argument('--PAD_ID', type=int, default=cf.get("train", "PAD_ID"), required=False, 289 | help='PAD_ID') 290 | parser.add_argument('--model_dir', type=str, default=cf.get("train", "model_dir"), required=False, 291 | help='model_dir') 292 | parser.add_argument('--save_checkpoints_steps', type=int, default=cf.get("train", "save_checkpoints_steps"), required=False, 293 | help='save_checkpoints_steps') 294 | parser.add_argument('--check_hook_n_iter', type=int, default=cf.get("train", "check_hook_n_iter"), required=False, 295 | help='check_hook_n_iter') 296 | parser.add_argument('--min_eval_frequency', type=int, default=cf.get("train", "min_eval_frequency"), 297 | required=False, help='min_eval_frequency') 298 | parser.add_argument('--print_verbose', type=int, default=cf.get("train", "print_verbose"), required=False, 299 | help='print_verbose') 300 | parser.add_argument('--debug', type=int, default=cf.get("train", "debug"), required=False, 301 | help='debug') 302 | parser.add_argument('--category', type=str, 
default=cf.get("path", "category"), required=False, 303 | help='category') 304 | parser.add_argument('--base_path', type=str, default=cf.get("path", "base_path"), required=False, 305 | help='base_path') 306 | parser.add_argument('--search_with_conversation_number', type=int, default=cf.get("parameters", "search_with_conversation_number"), required=False, 307 | help='search_with_conversation_number') 308 | parser.add_argument('--prediction_with_conversation_number', type=int, default=cf.get("parameters", "prediction_with_conversation_number"), required=False, 309 | help='prediction_with_conversation_number') 310 | parser.add_argument('--epoch_number', type=int, default=cf.get("parameters", "epoch_number"), required=False, 311 | help='epoch_number') 312 | args = parser.parse_args() 313 | 314 | m = Model() 315 | s = solver(m, args) 316 | s.run() 317 | -------------------------------------------------------------------------------- /model.py: -------------------------------------------------------------------------------- 1 | from __future__ import print_function 2 | import tensorflow as tf 3 | import dynamic_memory 4 | import pickle 5 | 6 | 7 | class Model: 8 | def __init__(self): 9 | pass 10 | 11 | def build_graph_init(self, params): 12 | self.params = params 13 | self.base_path = params['base_path'] 14 | self.category = params['category'] 15 | path = self.base_path + self.category 16 | word_id_path = path + 'word_id_dict' 17 | self.word_id_dict = pickle.load(open(word_id_path, 'rb')) 18 | word_number = len(self.word_id_dict.items()) 19 | 20 | self.user_placeholder = tf.placeholder(tf.int32, [None]) 21 | self.all_description_placeholder = tf.placeholder(tf.int32, [params['item_number'], params['max_description_word_length']]) 22 | self.pos_description_placeholder = tf.placeholder(tf.int32, [None, params['max_description_word_length']]) 23 | self.neg_description_placeholder = tf.placeholder(tf.int32, [None, params['max_description_word_length']]) 24 | 
self.answer_placeholder = tf.placeholder(tf.int32, [None, params['max_answer_word_length']]) 25 | self.pos_question_placeholder = tf.placeholder(tf.int32, [None]) 26 | self.neg_question_placeholder = tf.placeholder(tf.int32, [None]) 27 | 28 | self.all_descriptions_mask_placeholder = tf.placeholder(tf.int32,[params['item_number'], params['max_description_sentence_length']]) 29 | self.pos_descriptions_mask_placeholder = tf.placeholder(tf.int32,[None, params['max_description_sentence_length']]) 30 | self.neg_descriptions_mask_placeholder = tf.placeholder(tf.int32,[None, params['max_description_sentence_length']]) 31 | self.answer_mask_placeholder = tf.placeholder(tf.int32, [None, params['max_answer_word_length']]) 32 | 33 | self.initializer = tf.random_uniform_initializer(minval=-1, maxval=1) 34 | self.word_embedding_matrix = tf.get_variable('word_embedding', [word_number, params['embed_dim']], initializer=self.initializer) 35 | self.user_embedding_matrix = tf.get_variable('user_embedding', [self.params['user_number'], params['embed_dim']], initializer=self.initializer) 36 | 37 | self.embedding_user = tf.nn.embedding_lookup(self.user_embedding_matrix, self.user_placeholder) 38 | self.embedding_all_description = tf.nn.embedding_lookup(self.word_embedding_matrix, self.all_description_placeholder) 39 | self.embedding_pos_description = tf.nn.embedding_lookup(self.word_embedding_matrix, self.pos_description_placeholder) 40 | self.embedding_neg_description = tf.nn.embedding_lookup(self.word_embedding_matrix, self.neg_description_placeholder) 41 | self.embedding_answer = tf.nn.embedding_lookup(self.word_embedding_matrix, self.answer_placeholder) 42 | self.embedding_pos_question = tf.nn.embedding_lookup(self.word_embedding_matrix, self.pos_question_placeholder) 43 | self.embedding_neg_question = tf.nn.embedding_lookup(self.word_embedding_matrix, self.neg_question_placeholder) 44 | 45 | 46 | self.dtype = tf.float32 47 | self.loss, self.train_op, self.predictions = None, None, 
None 48 | self.graph = dynamic_memory.Graph(self.params) 49 | 50 | def build_graph(self): 51 | self.loss = self.graph.build_loss( 52 | embedding_user=self.embedding_user, 53 | embedding_all_description=self.embedding_all_description, 54 | embedding_pos_description = self.embedding_pos_description, 55 | embedding_neg_description = self.embedding_neg_description, 56 | embedding_answer = self.embedding_answer, 57 | all_description_mask=self.all_descriptions_mask_placeholder, 58 | pos_description_mask = self.pos_descriptions_mask_placeholder, 59 | neg_description_mask = self.neg_descriptions_mask_placeholder, 60 | embedding_pos_question = self.embedding_pos_question, 61 | embedding_neg_question = self.embedding_neg_question 62 | ) 63 | # _build_prediction should after build_loss 64 | self._build_prediction() 65 | self._build_optimizer() 66 | 67 | def _build_prediction(self): 68 | self.search_predictions = self.graph.build_search_prediction( 69 | embedding_user=self.embedding_user, 70 | embedding_description = self.embedding_pos_description, 71 | embedding_answer = self.embedding_answer, 72 | description_mask = self.pos_descriptions_mask_placeholder) 73 | 74 | self.question_predictions = self.graph.build_question_prediction( 75 | embedding_user=self.embedding_user, 76 | all_embedding_description=self.embedding_all_description, 77 | embedding_description=self.embedding_pos_description, 78 | embedding_answer=self.embedding_answer, 79 | all_description_mask=self.all_descriptions_mask_placeholder, 80 | description_mask=self.pos_descriptions_mask_placeholder, 81 | embedding_question=self.embedding_pos_question) 82 | 83 | 84 | def _build_optimizer(self): 85 | self.train_op = tf.contrib.layers.optimize_loss( 86 | self.loss, tf.train.get_global_step(), 87 | optimizer=self.params['optimizer'], 88 | learning_rate=self.params['learning_rate'], 89 | summaries=['loss', 'gradients', 'learning_rate'], 90 | name="train_op") 91 | 92 | 
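Both losses that model.py wires up above come from build_loss in the dynamic-memory graph: search_loss and question_loss are pairwise objectives of the form -log σ(pos) - log σ(-neg), which pushes the positive score above the negative one. A minimal NumPy sketch of just that objective, outside the TF graph (the function name and the sample scores are illustrative, not part of this repository):

```python
import numpy as np

def pairwise_log_sigmoid_loss(pos_score, neg_score):
    # Same form as search_loss / question_loss in the graph:
    # loss = -log sigmoid(pos) - log sigmoid(-neg)
    def log_sigmoid(x):
        # numerically stable log(sigmoid(x)) = -log(1 + exp(-x))
        return -np.logaddexp(0.0, -x)
    return -log_sigmoid(pos_score) - log_sigmoid(-neg_score)

# A well-separated pair incurs a small loss; an inverted pair a large one.
print(pairwise_log_sigmoid_loss(3.0, -3.0))   # ~0.097
print(pairwise_log_sigmoid_loss(-3.0, 3.0))   # ~6.097
```

In the actual graph the scores are affine projections x·w + b of the decoder outputs, and the L2 penalties collected under tf.GraphKeys.REGULARIZATION_LOSSES are added on top.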
-------------------------------------------------------------------------------- /tile.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | temp = tf.tile([[[1,3],[2,1],[3,5]],[[4,1],[2,1],[5,4]]],[3,1,1]) 3 | temp2 = tf.reshape(tf.tile([[[1,3],[2,1],[3,5]],[[4,1],[2,1],[5,4]]],[1,3,1]),[-1, 3, 2]) 4 | 5 | with tf.Session() as sess: 6 | print(sess.run(temp)) 7 | print(sess.run(temp2)) 8 | --------------------------------------------------------------------------------
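The tile.py scratch script above checks the two tiling patterns that build_question_prediction relies on: tf.tile along axis 0 repeats the whole batch block-wise ([A, B] becomes [A, B, A, B, A, B], as done to all_embedding_description), while tf.tile along axis 1 followed by a reshape repeats each example ([A, B] becomes [A, A, A, B, B, B], as done to embedding_answer). A sketch of the same distinction in NumPy, using the same toy tensor:

```python
import numpy as np

# Two "items", each a 3x2 block (same toy data as tile.py).
x = np.array([[[1, 3], [2, 1], [3, 5]],
              [[4, 1], [2, 1], [5, 4]]])

# Pattern 1: tile along axis 0 -> block-wise repetition [A, B, A, B, A, B]
temp = np.tile(x, (3, 1, 1))

# Pattern 2: tile along axis 1, then reshape -> element-wise
# repetition [A, A, A, B, B, B]
temp2 = np.tile(x, (1, 3, 1)).reshape(-1, 3, 2)

print(temp.shape, temp2.shape)  # both (6, 3, 2)
print((temp[1] == x[1]).all(), (temp2[1] == x[0]).all())  # True True
```

Pairing pattern 2 (each answer repeated once per item) with pattern 1 (the full item list repeated once per answer) is what lines up every answer in the batch against every item description.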