├── README.md
├── TransE.py
└── data
    ├── FB15k
        ├── entity2id.txt
        ├── relation2id.txt
        ├── test.txt
        ├── train.txt
        └── valid.txt
    └── WN18
        ├── entity2id.txt
        ├── relation2id.txt
        ├── test.txt
        ├── train.txt
        └── valid.txt


/README.md:
--------------------------------------------------------------------------------
# TransE
An implementation of TransE with TensorFlow.

TransE was proposed by Antoine Bordes, Nicolas Usunier, Alberto Garcia-Duran, Jason Weston and Oksana Yakhnenko in 2013. The paper is titled [Translating Embeddings for Modeling Multi-relational Data](https://papers.nips.cc/paper/5071-translating-embeddings-for-modeling-multi-relational-data.pdf).

# Dataset

The WN18 and FB15k datasets were originally published with the TransE paper and can be downloaded [here](https://everest.hds.utc.fr/doku.php?id=en:transe).

The copies used in this repository come from [KB2E](https://github.com/thunlp/KB2E).

# Train and test

## Run

To run the model with the default parameter settings:

--- `python3 TransE.py`

Running this code with Python 2 produces errors, because the `print` setting '\t' used in the script is not supported by Python 2.

## Change parameter settings

To see the optional arguments that change the parameter settings, run:

--- `python3 TransE.py --help`

## Test

By default, the model is tested on 300 triples after every 10 training iterations.



--------------------------------------------------------------------------------
/TransE.py:
--------------------------------------------------------------------------------
import tensorflow as tf
import time
import argparse
import random
import numpy as np
import os.path
import math
import timeit
from multiprocessing import JoinableQueue, Queue, Process
from collections import defaultdict

class TransE:
    # Read-only accessors for the model's private state.
    @property
    def variables(self):
        return self.__variables

    @property
    def num_triple_train(self):
        return self.__num_triple_train

    @property
    def num_triple_test(self):
        return self.__num_triple_test

    @property
    def testing_data(self):
        return self.__triple_test

    @property
    def num_entity(self):
        return self.__num_entity

    @property
    def embedding_entity(self):
        return self.__embedding_entity


    @property
    def embedding_relation(self):
        return self.__embedding_relation

    @property
    def hr_t(self):
        return self.__hr_t

    @property
    def tr_h(self):
        return self.__tr_h


    def training_data_batch(self, batch_size = 512):
        n_triple = len(self.__triple_train)
        # Shuffle the training triples once, then walk through them in batches.
        rand_idx = np.random.permutation(n_triple)
        start = 0
        while start < n_triple:
            start_t = timeit.default_timer()
            end = min(start+batch_size, n_triple)
            size = end - start
            train_triple_positive = np.asarray([ self.__triple_train[x] for x in rand_idx[start:end]])
            train_triple_negative = []
            for t in train_triple_positive:
                # Draw a random replacement entity for the corrupted (negative) triple.
                replace_entity_id = np.random.randint(self.__num_entity)
                random_num = np.random.random()

                # 'unif' uses a fixed 0.5 head-replacement probability;
                # 'bern' looks the probability up per relation, indexed by t[1].
                if self.__negative_sampling == 'unif':
                    replace_head_probability = 0.5
                elif self.__negative_sampling == 'bern':
                    replace_head_probability = self.__relation_property[t[1]]
                else:
                    raise NotImplementedError("Does not support %s negative_sampling" % self.__negative_sampling)

                if random_num