├── .gitignore
├── README.md
├── config
│   ├── bAbi_task1.yml
│   └── check_tiny.yml
├── data
│   └── en-10
│       ├── qa1_single-supporting-fact_test.txt
│       └── qa1_single-supporting-fact_train.txt
├── data_loader.py
├── dynamic_memory
│   ├── __init__.py
│   ├── encoder.py
│   └── episode.py
├── dynamic_memory_plus
│   ├── __init__.py
│   ├── attn_gru.py
│   ├── encoder.py
│   ├── episode.py
│   └── input.py
├── hook.py
├── images
│   └── ask_me_anything_figure_3.png
├── main.py
├── model.py
├── notebooks
│   └── data_loader.py.ipynb
├── requirements.txt
└── scripts
    ├── fetch_babi_data.sh
    └── fetch_glove_data.sh

--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
1 | data/
2 | .ipynb_checkpoints/
3 | notebooks/.ipynb_checkpoints/
4 | 

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Dynamic Memory Network [![hb-research](https://img.shields.io/badge/hb--research-experiment-green.svg?style=flat&colorA=448C57&colorB=555555)](https://github.com/hb-research)
2 | 
3 | TensorFlow implementation of [Ask Me Anything:
4 | Dynamic Memory Networks for Natural Language Processing](https://arxiv.org/pdf/1506.07285.pdf).
5 | 
6 | ![images](images/ask_me_anything_figure_3.png)
7 | 
8 | 
9 | ## Requirements
10 | 
11 | - Python 3.6
12 | - TensorFlow 1.8
13 | - [hb-config](https://github.com/hb-research/hb-config) (Singleton Config)
14 | - nltk (tokenizer and BLEU score)
15 | - tqdm (progress bar)
16 | 
17 | 
18 | ## Project Structure
19 | 
20 | Project initialized with [hb-base](https://github.com/hb-research/hb-base)
21 | 
22 |     .
23 |     ├── config                  # Config files (.yml, .json) used with hb-config
24 |     ├── data                    # dataset path
25 |     ├── notebooks               # Prototyping with numpy or tf.InteractiveSession
26 |     ├── dynamic_memory          # DMN architecture graphs (from input to output)
27 |     │   ├── __init__.py             # Graph logic
28 |     │   ├── encoder.py              # Encoder
29 |     │   └── episode.py              # Episode and AttentionGate
30 |     ├── data_loader.py          # raw_data -> processed_data -> generate_batch (using Dataset)
31 |     ├── hook.py                 # training or test hook features (e.g. print_variables)
32 |     ├── main.py                 # define experiment_fn
33 |     └── model.py                # define EstimatorSpec
34 | 
35 | Reference : [hb-config](https://github.com/hb-research/hb-config), [Dataset](https://www.tensorflow.org/api_docs/python/tf/data/Dataset#from_generator), [experiment_fn](https://www.tensorflow.org/api_docs/python/tf/contrib/learn/Experiment), [EstimatorSpec](https://www.tensorflow.org/api_docs/python/tf/estimator/EstimatorSpec)
36 | 
37 | 
38 | ## Todo
39 | 
40 | - Implement DMN+ ([Dynamic Memory Networks for Visual and Textual Question Answering](https://arxiv.org/pdf/1603.01417.pdf) (2016) by C Xiong)
41 | 
42 | 
43 | 
44 | ## Config
45 | 
46 | example: bAbi_task1.yml
47 | 
48 | ```yml
49 | data:
50 |   base_path: 'data/'
51 |   task_path: 'en-10k/'
52 |   task_id: 1
53 |   PAD_ID: 0
54 | 
55 | model:
56 |   batch_size: 16
57 |   use_pretrained: true         # (true or false)
58 |   embed_dim: 50                # if use_pretrained is true, only 50, 100, 200 and 300 are available
59 |   encoder_type: uni            # uni, bi
60 |   cell_type: gru               # lstm, gru, layer_norm_lstm, nas
61 |   num_layers: 1
62 |   num_units: 32
63 |   memory_hob: 3
64 |   dropout: 0.0
65 |   reg_scale: 0.001
66 | 
67 | train:
68 |   learning_rate: 0.0001
69 |   optimizer: 'Adam'            # Adagrad, Adam, Ftrl, Momentum, RMSProp, SGD
70 | 
71 |   train_steps: 100000
72 |   model_dir: 'logs/bAbi_task1'
73 | 
74 |   save_checkpoints_steps: 1000
75 |   check_hook_n_iter: 1000
76 |   min_eval_frequency: 1000
77 | 
78 |   print_verbose: False
79 |   debug: False
80 | ```
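A minimal sketch of how these values are read at runtime with hb-config (mirroring what `main.py` does with `--config`; the exact file resolution is hb-config's, so treat the path comment as an assumption):

```python
from hbconfig import Config

Config("bAbi_task1")             # loads config/bAbi_task1.yml, as main.py does with --config

print(Config.model.batch_size)   # 16
print(Config.train.model_dir)    # 'logs/bAbi_task1'
Config.data.vocab_size = 40      # values can also be set at runtime (see experiment_fn)
```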
81 | 
82 | 
83 | ## Usage
84 | 
85 | Install requirements.
86 | 
87 | ```pip install -r requirements.txt```
88 | 
89 | Then, prepare the dataset and the pre-trained GloVe vectors.
90 | 
91 | ```
92 | sh scripts/fetch_babi_data.sh
93 | sh scripts/fetch_glove_data.sh
94 | ```
95 | 
96 | Finally, start training and evaluating the model.
97 | ```
98 | python main.py --config bAbi_task1 --mode train_and_evaluate
99 | ```
100 | 
101 | ### Experiment modes
102 | 
103 | :white_check_mark: : Working
104 | :white_medium_small_square: : Not tested yet.
105 | 
106 | 
107 | - :white_check_mark: `evaluate` : Evaluate on the evaluation data.
108 | - :white_medium_small_square: `extend_train_hooks` : Extends the hooks for training.
109 | - :white_medium_small_square: `reset_export_strategies` : Resets the export strategies with the new_export_strategies.
110 | - :white_medium_small_square: `run_std_server` : Starts a TensorFlow server and joins the serving thread.
111 | - :white_medium_small_square: `test` : Tests training, evaluating and exporting the estimator for a single step.
112 | - :white_check_mark: `train` : Fit the estimator using the training data.
113 | - :white_check_mark: `train_and_evaluate` : Interleaves training and evaluation.
114 | 
115 | ---
116 | 
117 | 
118 | ### Tensorboard
119 | 
120 | ```tensorboard --logdir logs```
121 | 
122 | 
123 | ## Reference
124 | 
125 | - [Implementing Dynamic memory networks](https://yerevann.github.io/2016/02/05/implementing-dynamic-memory-networks/)
126 | - [arXiv - Ask Me Anything:
127 | Dynamic Memory Networks for Natural Language Processing](https://arxiv.org/abs/1506.07285) (2015. 6) by A Kumar
128 | - [arXiv - Dynamic Memory Networks for Visual and Textual Question Answering](https://arxiv.org/abs/1603.01417) (2016.
3) by C Xiong 129 | 130 | ## Author 131 | 132 | Dongjun Lee (humanbrain.djlee@gmail.com) 133 | -------------------------------------------------------------------------------- /config/bAbi_task1.yml: -------------------------------------------------------------------------------- 1 | data: 2 | base_path: 'data/' 3 | task_path: 'en-10/' 4 | task_id: 1 5 | PAD_ID: 0 6 | 7 | model: 8 | batch_size: 16 9 | use_pretrained: true # (true or false) 10 | embed_dim: 50 # if use_pretrained: only available 50, 100, 200, 300 11 | encoder_type: UNI # uni, bi 12 | cell_type: GRU # lstm, gru, layer_norm_lstm, nas 13 | num_layers: 1 14 | num_units: 32 15 | memory_hob: 3 16 | dropout: 0.0 17 | reg_scale: 0.001 18 | 19 | train: 20 | learning_rate: 0.0001 21 | optimizer: 'Adam' # Adagrad, Adam, Ftrl, Momentum, RMSProp, SGD 22 | 23 | train_steps: 100000 24 | model_dir: 'logs/bAbi_task1' 25 | 26 | save_checkpoints_steps: 1000 27 | check_hook_n_iter: 1000 28 | min_eval_frequency: 1000 29 | 30 | print_verbose: False 31 | debug: False 32 | -------------------------------------------------------------------------------- /config/check_tiny.yml: -------------------------------------------------------------------------------- 1 | data: 2 | base_path: 'data/' 3 | task_path: 'en-10/' 4 | task_id: 1 5 | PAD_ID: 0 6 | 7 | model: 8 | batch_size: 2 9 | use_pretrained: false # (true or false) 10 | embed_dim: 8 # if use_pretrained: only available 50, 100, 200, 300 11 | encoder_type: uni # uni, bi 12 | cell_type: gru # lstm, gru, layer_norm_lstm, nas 13 | num_layers: 2 14 | num_units: 16 15 | memory_hob: 2 16 | dropout: 0.5 17 | reg_scale: 0.001 18 | 19 | train: 20 | learning_rate: 0.001 21 | optimizer: 'Adam' # Adagrad, Adam, Ftrl, Momentum, RMSProp, SGD 22 | 23 | train_steps: 10000 24 | model_dir: 'logs/check_tiny' 25 | 26 | save_checkpoints_steps: 1000 27 | check_hook_n_iter: 100 28 | min_eval_frequency: 100 29 | 30 | print_verbose: False 31 | debug: False 32 | -------------------------------------------------------------------------------- /data/en-10/qa1_single-supporting-fact_test.txt: -------------------------------------------------------------------------------- 1 | 1 Mary moved to the bathroom. 2 | 2 John went to the hallway. 3 | 3 Where is Mary? bathroom 1 4 | 4 Daniel went back to the hallway. 5 | 5 Sandra moved to the garden. 6 | 6 Where is Daniel? hallway 4 7 | 7 John moved to the office. 8 | 8 Sandra journeyed to the bathroom. 9 | 9 Where is Daniel? hallway 4 -------------------------------------------------------------------------------- /data/en-10/qa1_single-supporting-fact_train.txt: -------------------------------------------------------------------------------- 1 | 1 Mary moved to the bathroom. 2 | 2 John went to the hallway. 3 | 3 Where is Mary? bathroom 1 4 | 4 Daniel went back to the hallway. 5 | 5 Sandra moved to the garden. 6 | 6 Where is Daniel? hallway 4 7 | 7 John moved to the office. 8 | 8 Sandra journeyed to the bathroom. 9 | 9 Where is Daniel? 
hallway 4 -------------------------------------------------------------------------------- /data_loader.py: -------------------------------------------------------------------------------- 1 | """ 2 | bAbi data_loader 3 | Original code : https://github.com/YerevaNN/Dynamic-memory-networks-in-Theano/blob/master/utils.py 4 | """ 5 | 6 | import os 7 | 8 | from hbconfig import Config 9 | import numpy as np 10 | import tensorflow as tf 11 | from tqdm import tqdm 12 | 13 | 14 | 15 | class DataLoader: 16 | 17 | def __init__(self, task_path, task_id, task_test_id, w2v_dim=100, input_mask_mode="sentence", use_pretrained=True): 18 | self.base_path = "data/" 19 | self.task_path = task_path 20 | 21 | self.task_id = str(task_id) 22 | self.task_test_id = str(task_test_id) 23 | self.w2v_dim = w2v_dim 24 | self.input_mask_mode = input_mask_mode 25 | self.use_pretrained = use_pretrained 26 | 27 | def make_train_and_test_set(self): 28 | train_raw, test_raw = self.get_babi_raw(self.task_id, self.task_test_id) 29 | self.max_facts_seq_len, self.max_question_seq_len, self.max_input_mask_len = self.get_max_seq_length(train_raw, test_raw) 30 | 31 | if self.use_pretrained: 32 | self.word2vec = self.load_glove(self.w2v_dim) 33 | else: 34 | self.word2vec = {} 35 | self.vocab = {} 36 | self.ivocab = {} 37 | 38 | self.create_vector("unknown") 39 | 40 | train_input, train_question, train_answer, train_input_mask = self.process_input(train_raw) 41 | test_input, test_question, test_answer, test_input_mask = self.process_input(test_raw) 42 | 43 | return { 44 | "train": (train_input, train_input_mask, train_question, train_answer), 45 | "test": (test_input, test_input_mask, test_question, test_answer) 46 | } 47 | 48 | def get_max_seq_length(self, *datasets): 49 | max_facts_length, max_question_length, max_input_mask_length = 0, 0, 0 50 | 51 | def count_punctuation(facts): 52 | return len(list(filter(lambda x: x == ".", facts))) 53 | 54 | for dataset in datasets: 55 | for d in dataset: 56 | max_facts_length = max(max_facts_length, len(d['C'].split())) 57 | max_input_mask_length = max(max_input_mask_length, count_punctuation(d['C'])) 58 | max_question_length = max(max_question_length, len(d['Q'].split())) 59 | return max_facts_length, max_question_length, max_input_mask_length 60 | 61 | def init_babi(self, fname): 62 | print("==> Loading test from %s" % fname) 63 | tasks = [] 64 | task = None 65 | for i, line in enumerate(open(fname)): 66 | id = int(line[0:line.find(' ')]) 67 | if id == 1: 68 | task = {"C": "", "Q": "", "A": ""} 69 | 70 | line = line.strip() 71 | line = line.replace('.', ' . 
') 72 | line = line[line.find(' ')+1:] 73 | if line.find('?') == -1: 74 | task["C"] += line 75 | else: 76 | idx = line.find('?') 77 | tmp = line[idx+1:].split('\t') 78 | task["Q"] = line[:idx] 79 | task["A"] = tmp[1].strip() 80 | tasks.append(task.copy()) 81 | 82 | return tasks 83 | 84 | 85 | def get_babi_raw(self, id, test_id): 86 | babi_map = { 87 | "1": "qa1_single-supporting-fact", 88 | "2": "qa2_two-supporting-facts", 89 | "3": "qa3_three-supporting-facts", 90 | "4": "qa4_two-arg-relations", 91 | "5": "qa5_three-arg-relations", 92 | "6": "qa6_yes-no-questions", 93 | "7": "qa7_counting", 94 | "8": "qa8_lists-sets", 95 | "9": "qa9_simple-negation", 96 | "10": "qa10_indefinite-knowledge", 97 | "11": "qa11_basic-coreference", 98 | "12": "qa12_conjunction", 99 | "13": "qa13_compound-coreference", 100 | "14": "qa14_time-reasoning", 101 | "15": "qa15_basic-deduction", 102 | "16": "qa16_basic-induction", 103 | "17": "qa17_positional-reasoning", 104 | "18": "qa18_size-reasoning", 105 | "19": "qa19_path-finding", 106 | "20": "qa20_agents-motivations", 107 | "MCTest": "MCTest", 108 | "19changed": "19changed", 109 | "joint": "all_shuffled", 110 | "sh1": "../shuffled/qa1_single-supporting-fact", 111 | "sh2": "../shuffled/qa2_two-supporting-facts", 112 | "sh3": "../shuffled/qa3_three-supporting-facts", 113 | "sh4": "../shuffled/qa4_two-arg-relations", 114 | "sh5": "../shuffled/qa5_three-arg-relations", 115 | "sh6": "../shuffled/qa6_yes-no-questions", 116 | "sh7": "../shuffled/qa7_counting", 117 | "sh8": "../shuffled/qa8_lists-sets", 118 | "sh9": "../shuffled/qa9_simple-negation", 119 | "sh10": "../shuffled/qa10_indefinite-knowledge", 120 | "sh11": "../shuffled/qa11_basic-coreference", 121 | "sh12": "../shuffled/qa12_conjunction", 122 | "sh13": "../shuffled/qa13_compound-coreference", 123 | "sh14": "../shuffled/qa14_time-reasoning", 124 | "sh15": "../shuffled/qa15_basic-deduction", 125 | "sh16": "../shuffled/qa16_basic-induction", 126 | "sh17": "../shuffled/qa17_positional-reasoning", 127 | "sh18": "../shuffled/qa18_size-reasoning", 128 | "sh19": "../shuffled/qa19_path-finding", 129 | "sh20": "../shuffled/qa20_agents-motivations", 130 | } 131 | if (test_id == ""): 132 | test_id = id 133 | babi_name = babi_map[id] 134 | babi_test_name = babi_map[test_id] 135 | babi_train_raw = self.init_babi(os.path.join(self.base_path, self.task_path, '%s_train.txt' % babi_name)) 136 | babi_test_raw = self.init_babi(os.path.join(self.base_path, self.task_path, '%s_test.txt' % babi_test_name)) 137 | return babi_train_raw, babi_test_raw 138 | 139 | def load_glove(self, dim): 140 | word2vec = {} 141 | 142 | print("==> loading glove") 143 | with open(os.path.join(self.base_path, "glove/glove.6B." + str(dim) + "d.txt"), 'rb') as f: 144 | for line in tqdm(f): 145 | l = line.decode('utf-8').split() 146 | word2vec[l[0]] = l[1:] 147 | 148 | print("==> glove is loaded") 149 | 150 | return word2vec 151 | 152 | def create_vector(self, word, silent=False): 153 | # if the word is missing from Glove, create some fake vector and store in glove! 
154 | vector = np.random.uniform(0.0, 1.0, (self.w2v_dim,)) 155 | self.word2vec[word] = vector 156 | if (not silent): 157 | print("data_loader.py::create_vector => %s is missing" % word) 158 | return vector 159 | 160 | def process_word(self, word, to_return="word2vec", silent=False): 161 | if not word in self.word2vec: 162 | self.create_vector(word, silent=silent) 163 | if not word in self.vocab: 164 | next_index = len(self.vocab) 165 | self.vocab[word] = next_index 166 | self.ivocab[next_index] = word 167 | 168 | if to_return == "word2vec": 169 | return self.word2vec[word] 170 | elif to_return == "index": 171 | return self.vocab[word] 172 | else: 173 | raise ValueError("return type is 'word2vec' or 'index'") 174 | 175 | def get_norm(self, x): 176 | x = np.array(x) 177 | return np.sum(x * x) 178 | 179 | def process_input(self, data_raw): 180 | questions = [] 181 | inputs = [] 182 | answers = [] 183 | input_masks = [] 184 | 185 | for x in data_raw: 186 | inp = x["C"].lower().split(' ') 187 | inp = [w for w in inp if len(w) > 0] 188 | 189 | q = x["Q"].lower().split(' ') 190 | q = [w for w in q if len(w) > 0] 191 | 192 | inp_vector = [self.process_word(word=w, to_return="word2vec") for w in inp] 193 | inp_vector = self.pad_input(inp_vector, self.max_facts_seq_len, [np.zeros(self.w2v_dim)]) 194 | 195 | q_vector = [self.process_word(word=w, to_return="word2vec") for w in q] 196 | q_vector = self.pad_input(q_vector, self.max_question_seq_len, [np.zeros(self.w2v_dim)]) 197 | 198 | inputs.append(np.vstack(inp_vector).astype(float)) 199 | questions.append(np.vstack(q_vector).astype(float)) 200 | answers.append(self.process_word(word = x["A"], to_return = "index")) 201 | 202 | if self.input_mask_mode == 'word': 203 | input_masks.append(np.array([index for index, w in enumerate(inp)], dtype=np.int32)) 204 | elif self.input_mask_mode == 'sentence': 205 | input_mask = [index for index, w in enumerate(inp) if w == '.'] 206 | input_mask = self.pad_input(input_mask, self.max_input_mask_len, [0]) 207 | input_masks.append(input_mask) 208 | else: 209 | raise ValueError("input_mask_mode is only available (word, sentence)") 210 | 211 | return (np.array(inputs, dtype=np.float32), 212 | np.array(questions, dtype=np.float32), 213 | np.array(answers, dtype=np.int32).reshape(-1, 1), 214 | np.array(input_masks, dtype=np.int32)) 215 | 216 | def pad_input(self, input_, size, pad_item): 217 | return input_ + pad_item * (size - len(input_)) 218 | 219 | def make_batch(self, data, buffer_size=10000, batch_size=64, scope="train"): 220 | 221 | class IteratorInitializerHook(tf.train.SessionRunHook): 222 | """Hook to initialise data iterator after Session is created.""" 223 | 224 | def __init__(self): 225 | super(IteratorInitializerHook, self).__init__() 226 | self.iterator_initializer_func = None 227 | 228 | def after_create_session(self, session, coord): 229 | """Initialise the iterator after the session has been created.""" 230 | self.iterator_initializer_func(session) 231 | 232 | 233 | iterator_initializer_hook = IteratorInitializerHook() 234 | 235 | def get_inputs(): 236 | with tf.name_scope(scope): 237 | 238 | inputs, input_masks, questions, answers = data 239 | 240 | # Define placeholders 241 | input_placeholder = tf.placeholder( 242 | tf.float32, [None, Config.data.max_facts_seq_len, Config.model.embed_dim]) 243 | input_mask_placeholder = tf.placeholder( 244 | tf.int32, [None, Config.data.max_input_mask_length]) 245 | question_placeholder = tf.placeholder( 246 | tf.float32, [None, Config.data.max_question_seq_len, 
Config.model.embed_dim])
247 |                 answer_placeholder = tf.placeholder(
248 |                     tf.int32, [None, 1])
249 | 
250 |                 # Build dataset iterator
251 |                 dataset = tf.data.Dataset.from_tensor_slices(
252 |                     (input_placeholder, input_mask_placeholder,
253 |                      question_placeholder, answer_placeholder))
254 | 
255 |                 if scope == "train":
256 |                     dataset = dataset.repeat(None)  # Infinite iterations
257 |                     dataset = dataset.shuffle(buffer_size=buffer_size)  # shuffle only the training data
258 |                 else:
259 |                     dataset = dataset.repeat(1)  # One epoch; keep evaluation order deterministic
260 | 
261 |                 dataset = dataset.batch(batch_size)
262 | 
263 |                 iterator = dataset.make_initializable_iterator()
264 |                 next_input, next_input_mask, next_question, next_answer = iterator.get_next()
265 | 
266 |                 # Set runhook to initialize iterator
267 |                 iterator_initializer_hook.iterator_initializer_func = \
268 |                     lambda sess: sess.run(
269 |                         iterator.initializer,
270 |                         feed_dict={input_placeholder: inputs,
271 |                                    input_mask_placeholder: input_masks,
272 |                                    question_placeholder: questions,
273 |                                    answer_placeholder: answers})
274 | 
275 |                 # Return batched (features, labels)
276 |                 features = {"input_data": next_input,
277 |                             "input_data_mask": next_input_mask,
278 |                             "question_data": next_question}
279 |                 return (features, next_answer)
280 | 
281 |         # Return function and hook
282 |         return get_inputs, iterator_initializer_hook
283 | 
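For reference, a minimal usage sketch of `DataLoader` (mirroring `notebooks/data_loader.py.ipynb`; the shapes come from the bundled tiny `en-10` sample):

```python
from data_loader import DataLoader

data_loader = DataLoader(task_path="en-10/", task_id="1", task_test_id="1",
                         w2v_dim=50, use_pretrained=False)
data = data_loader.make_train_and_test_set()

train_input, train_input_mask, train_question, train_answer = data["train"]
print(train_input.shape)    # (3, 37, 50): 3 examples, 37 words, 50-dim word vectors
print(train_input_mask[0])  # [ 5 11  0  0  0  0]: indices of the sentence-ending '.' tokens
```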
--------------------------------------------------------------------------------
/dynamic_memory/__init__.py:
--------------------------------------------------------------------------------
1 | 
2 | from hbconfig import Config
3 | import tensorflow as tf
4 | 
5 | from .encoder import Encoder
6 | from .episode import Episode
7 | 
8 | 
9 | 
10 | class Graph:
11 | 
12 |     def __init__(self, mode, dtype=tf.float32):
13 |         self.mode = mode
14 |         self.dtype = dtype
15 | 
16 |     def build(self,
17 |               embedding_input=None,
18 |               input_mask=None,
19 |               embedding_question=None):
20 | 
21 |         facts, question = self._build_input_module(embedding_input, input_mask, embedding_question)
22 |         last_memory = self._build_episodic_memory(facts, question)
23 |         return self._build_answer_decoder(last_memory)
24 | 
25 |     def _build_input_module(self, embedding_input, input_mask, embedding_question):
26 |         encoder = Encoder(
27 |             encoder_type=Config.model.encoder_type,
28 |             num_layers=Config.model.num_layers,
29 |             cell_type=Config.model.cell_type,
30 |             num_units=Config.model.num_units,
31 |             dropout=Config.model.dropout)
32 | 
33 |         # slice zeros padding
34 |         input_length = tf.reduce_max(input_mask, axis=1)
35 |         question_length = tf.reduce_sum(tf.to_int32(
36 |             tf.not_equal(tf.reduce_max(embedding_question, axis=2), Config.data.PAD_ID)), axis=1)
37 | 
38 |         with tf.variable_scope("input-module") as scope:
39 |             input_encoder_outputs, _ = encoder.build(
40 |                 embedding_input, input_length, scope="encoder")
41 | 
42 |         with tf.variable_scope("facts") as scope:
43 |             batch_size = tf.shape(input_mask)[0]
44 |             max_mask_length = tf.shape(input_mask)[1]
45 | 
46 |             def get_encoded_fact(i):
47 |                 # gather the encoder states at the sentence-end ('.') positions
48 |                 mask_lengths = tf.reduce_sum(tf.to_int32(tf.not_equal(input_mask[i], Config.data.PAD_ID)), axis=0)
49 |                 fact_indices = tf.boolean_mask(input_mask[i], tf.sequence_mask(mask_lengths, max_mask_length))
50 | 
51 |                 encoded_facts = tf.gather_nd(input_encoder_outputs[i], tf.reshape(fact_indices, [-1, 1]))
52 |                 padding = tf.zeros(tf.stack([max_mask_length - mask_lengths, Config.model.num_units]))
53 |                 return tf.concat([encoded_facts, padding], 0)
54 | 
55 |             facts_stacked = tf.map_fn(get_encoded_fact, tf.range(start=0, limit=batch_size), dtype=self.dtype)
56 | 
57 |             # max_input_mask_length x [batch_size, num_units]
58 |             facts = tf.unstack(tf.transpose(facts_stacked, [1, 0, 2]), num=Config.data.max_input_mask_length)
59 | 
60 |         with tf.variable_scope("input-module") as scope:
61 |             scope.reuse_variables()
62 |             _, question = encoder.build(
63 |                 embedding_question, question_length, scope="encoder")
64 | 
65 |         return facts, question[0]
66 | 
67 | 
68 |     def _build_episodic_memory(self, facts, question):
69 | 
70 |         with tf.variable_scope('episodic-memory-module') as scope:
71 |             memory = tf.identity(question)
72 | 
73 |             episode = Episode(Config.model.num_units, reg_scale=Config.model.reg_scale)
74 |             rnn = tf.contrib.rnn.GRUCell(Config.model.num_units)
75 | 
76 |             for _ in range(Config.model.memory_hob):
77 |                 updated_memory = episode.update(facts,
78 |                                                 tf.transpose(memory, name="m"),
79 |                                                 tf.transpose(question, name="q"))
80 |                 memory, _ = rnn(updated_memory, memory, scope="memory_rnn")
81 |                 scope.reuse_variables()
82 |         return memory
83 | 
84 |     def _build_answer_decoder(self, last_memory):
85 | 
86 |         with tf.variable_scope('answer-module'):
87 |             w_a = tf.get_variable(
88 |                 "w_a", [Config.model.num_units, Config.data.vocab_size],
89 |                 regularizer=tf.contrib.layers.l2_regularizer(Config.model.reg_scale))
90 |             logits = tf.matmul(last_memory, w_a)
91 |             return logits
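A toy NumPy check of the fact-extraction idea in `get_encoded_fact` above (hypothetical values; the mask row `[5, 11, 0, 0, 0, 0]` is from the `en-10` sample):

```python
import numpy as np

num_units = 8
outputs = np.random.rand(37, num_units)   # encoder states for one batch element
mask = np.array([5, 11, 0, 0, 0, 0])      # '.' positions, zero-padded

n = np.count_nonzero(mask)                # 2 real sentence ends
facts = outputs[mask[:n]]                 # pick the states at the '.' positions
facts = np.vstack([facts, np.zeros((len(mask) - n, num_units))])  # pad back
print(facts.shape)                        # (6, 8), like one row of facts_stacked
```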
--------------------------------------------------------------------------------
/dynamic_memory/encoder.py:
--------------------------------------------------------------------------------
1 | 
2 | import tensorflow as tf
3 | 
4 | 
5 | 
6 | class Encoder:
7 |     """Encoder class is a multi-layer recurrent neural network
8 | 
9 |     The 'Encoder' encodes the sequential input vectors.
10 |     """
11 | 
12 |     UNI_ENCODER_TYPE = "uni"
13 |     BI_ENCODER_TYPE = "bi"
14 | 
15 |     RNN_GRU_CELL = "gru"
16 |     RNN_LSTM_CELL = "lstm"
17 |     RNN_LAYER_NORM_LSTM_CELL = "layer_norm_lstm"
18 |     RNN_NAS_CELL = "nas"
19 | 
20 |     def __init__(self, encoder_type="uni", num_layers=4,
21 |                  cell_type="gru", num_units=512, dropout=0.8,
22 |                  dtype=tf.float32):
23 |         """Constructs an 'Encoder' instance.
24 | 
25 |         * Args:
26 |             encoder_type: rnn encoder type (uni, bi)
27 |             num_layers: number of RNN cells composed sequentially into a multi-layer cell
28 |             cell_type: RNN cell type (lstm, gru, layer_norm_lstm, nas)
29 |             num_units: the number of units in each cell
30 |             dropout: probability of dropping cell inputs (input_keep_prob = 1 - dropout)
31 |             dtype: the dtype of the input
32 | 
33 |         * Returns:
34 |             Encoder instance
35 |         """
36 | 
37 |         self.encoder_type = encoder_type.lower()   # accept 'uni' or 'UNI' from config
38 |         self.num_layers = num_layers
39 |         self.cell_type = cell_type.lower()         # accept 'gru' or 'GRU' from config
40 |         self.num_units = num_units
41 |         self.dropout = dropout
42 |         self.dtype = dtype
43 | 
44 |     def build(self, input_vector, sequence_length, scope=None):
45 |         if self.encoder_type == self.UNI_ENCODER_TYPE:
46 |             self.cells = self._create_rnn_cells()
47 | 
48 |             return self.unidirectional_rnn(input_vector, sequence_length, scope=scope)
49 |         elif self.encoder_type == self.BI_ENCODER_TYPE:
50 |             self.cells_fw = self._create_rnn_cells(is_list=True)
51 |             self.cells_bw = self._create_rnn_cells(is_list=True)
52 | 
53 |             return self.bidirectional_rnn(input_vector, sequence_length, scope=scope)
54 |         else:
55 |             raise ValueError(f"Unknown encoder_type {self.encoder_type}")
56 | 
57 |     def unidirectional_rnn(self, input_vector, sequence_length, scope=None):
58 |         return tf.nn.dynamic_rnn(
59 |             self.cells,
60 |             input_vector,
61 |             sequence_length=sequence_length,
62 |             dtype=self.dtype,
63 |             time_major=False,
64 |             swap_memory=True,
65 |             scope=scope)
66 | 
67 |     def bidirectional_rnn(self, input_vector, sequence_length, scope=None):
68 |         outputs, output_state_fw, output_state_bw = tf.contrib.rnn.stack_bidirectional_dynamic_rnn(
69 |             self.cells_fw,
70 |             self.cells_bw,
71 |             input_vector,
72 |             sequence_length=sequence_length,
73 |             dtype=self.dtype,
74 |             scope=scope)
75 | 
76 |         encoder_final_state = tf.concat((output_state_fw[-1], output_state_bw[-1]), axis=1)
77 |         return outputs, encoder_final_state
78 | 
79 |     def _create_rnn_cells(self, is_list=False):
80 |         """Constructs stacked_rnn with num_layers
81 | 
82 |         * Args:
83 |             is_list: True returns the cells as a list (for stack bidirectional),
84 |                      False wraps them in a single MultiRNNCell
85 | 
86 |         * Returns:
87 |             stacked_rnn
88 |         """
89 | 
90 |         stacked_rnn = []
91 |         for _ in range(self.num_layers):
92 |             single_cell = self._rnn_single_cell()
93 |             stacked_rnn.append(single_cell)
94 | 
95 |         if is_list:
96 |             return stacked_rnn
97 |         else:
98 |             return tf.nn.rnn_cell.MultiRNNCell(
99 |                 cells=stacked_rnn,
100 |                 state_is_tuple=True)
101 | 
102 |     def _rnn_single_cell(self):
103 |         """Constructs a single rnn cell"""
104 | 
105 |         if self.cell_type == self.RNN_GRU_CELL:
106 |             single_cell = tf.contrib.rnn.GRUCell(
107 |                 self.num_units,
108 |                 reuse=tf.get_variable_scope().reuse)
109 |         elif self.cell_type == self.RNN_LSTM_CELL:
110 |             single_cell = tf.contrib.rnn.BasicLSTMCell(
111 |                 self.num_units,
112 |                 forget_bias=1.0,
113 |                 reuse=tf.get_variable_scope().reuse)
114 |         elif self.cell_type == self.RNN_LAYER_NORM_LSTM_CELL:
115 |             single_cell = tf.contrib.rnn.LayerNormBasicLSTMCell(
116 |                 self.num_units,
117 |                 forget_bias=1.0,
118 |                 layer_norm=True,
119 |                 reuse=tf.get_variable_scope().reuse)
120 |         elif self.cell_type == self.RNN_NAS_CELL:
121 |             single_cell = tf.contrib.rnn.NASCell(
122 |                 self.num_units)
123 |         else:
124 |             raise ValueError(f"Unknown rnn cell type. {self.cell_type}")
125 | 
126 |         if self.dropout > 0.0:
127 |             single_cell = tf.contrib.rnn.DropoutWrapper(
128 |                 cell=single_cell, input_keep_prob=(1.0 - self.dropout))
129 | 
130 |         return single_cell
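A minimal usage sketch of the `Encoder` above (hypothetical shapes, single-layer GRU):

```python
import tensorflow as tf
from dynamic_memory.encoder import Encoder

encoder = Encoder(encoder_type="uni", num_layers=1,
                  cell_type="gru", num_units=32, dropout=0.0)

inputs = tf.placeholder(tf.float32, [None, 37, 50])   # [batch, time, embed_dim]
lengths = tf.placeholder(tf.int32, [None])
outputs, state = encoder.build(inputs, lengths, scope="encoder")
# outputs: [batch, 37, 32]; state: final MultiRNNCell state (a tuple with one GRU state)
```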
--------------------------------------------------------------------------------
/dynamic_memory/episode.py:
--------------------------------------------------------------------------------
1 | 
2 | import tensorflow as tf
3 | 
4 | 
5 | 
6 | class Episode:
7 |     """Episode class is used to update the memory in the Episodic Memory Module"""
8 | 
9 |     def __init__(self, num_units, reg_scale=0.001):
10 |         self.gate = AttentionGate(hidden_size=num_units, reg_scale=reg_scale)
11 |         self.rnn = tf.contrib.rnn.GRUCell(num_units)
12 | 
13 |     def update(self, c, m_t, q_t):
14 |         """Update memory with attention mechanism
15 | 
16 |         * Args:
17 |             c : encoded raw text, stacked by sentence
18 |                 shape: fact_count x [batch_size, num_units]
19 |             m_t : previous memory
20 |                 shape: [num_units, batch_size]
21 |             q_t : encoded question last state
22 |                 shape: [num_units, batch_size]
23 | 
24 |         * Returns:
25 |             h : updated memory
26 |         """
27 |         h = tf.zeros_like(c[0])
28 | 
29 |         with tf.variable_scope('memory-update') as scope:
30 |             for fact in c:
31 |                 g = self.gate.score(tf.transpose(fact, name="c"), m_t, q_t)
32 |                 h = g * self.rnn(fact, h, scope="episode_rnn")[0] + (1 - g) * h
33 |                 scope.reuse_variables()
34 |         return h
35 | 
36 | 
37 | class AttentionGate:
38 |     """AttentionGate class is a simple two-layer feed-forward neural network with a score function."""
39 | 
40 |     def __init__(self, hidden_size=4, reg_scale=0.001):
41 |         self.w1 = tf.get_variable(
42 |             "w1", [hidden_size, 7*hidden_size],
43 |             regularizer=tf.contrib.layers.l2_regularizer(reg_scale))
44 |         self.b1 = tf.get_variable("b1", [hidden_size, 1])
45 |         self.w2 = tf.get_variable(
46 |             "w2", [1, hidden_size],
47 |             regularizer=tf.contrib.layers.l2_regularizer(reg_scale))
48 |         self.b2 = tf.get_variable("b2", [1, 1])
49 | 
50 |     def score(self, c_t, m_t, q_t):
51 |         """Captures a variety of similarities between the input (c), memory (m) and question (q)
52 | 
53 |         * Args:
54 |             c_t : transpose of one fact (encoded sentence's last state)
55 |                 shape: [num_units, batch_size]
56 |             m_t : transpose of previous memory
57 |                 shape: [num_units, batch_size]
58 |             q_t : transpose of encoded question
59 |                 shape: [num_units, batch_size]
60 | 
61 |         * Returns:
62 |             gate score
63 |                 shape: [batch_size, 1]
64 |         """
65 | 
66 |         with tf.variable_scope('attention_gate'):
67 |             z = tf.concat([c_t, m_t, q_t, c_t*q_t, c_t*m_t, (c_t-q_t)**2, (c_t-m_t)**2], 0)
68 | 
69 |             o1 = tf.nn.tanh(tf.matmul(self.w1, z) + self.b1)
70 |             o2 = tf.nn.sigmoid(tf.matmul(self.w2, o1) + self.b2)
71 |             return tf.transpose(o2)
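A toy shape walkthrough of `AttentionGate.score` above (assumed `batch_size=2`, `num_units=4`):

```python
import tensorflow as tf
from dynamic_memory.episode import AttentionGate

gate = AttentionGate(hidden_size=4)
c_t = tf.zeros([4, 2])   # [num_units, batch_size]
m_t = tf.ones([4, 2])
q_t = tf.ones([4, 2])
g = gate.score(c_t, m_t, q_t)
# z: [28, 2] (7 feature blocks of 4) -> o1: [4, 2] -> o2: [1, 2] -> g: [2, 1]
```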
--------------------------------------------------------------------------------
/dynamic_memory_plus/__init__.py:
--------------------------------------------------------------------------------
1 | 
2 | from hbconfig import Config
3 | import tensorflow as tf
4 | 
5 | 
6 | from .encoder import Encoder
7 | from .input import TextualInput
8 | 
9 | 
10 | class Graph:
11 | 
12 |     def __init__(self, mode, dtype=tf.float32):
13 |         self.mode = mode
14 |         self.dtype = dtype
15 | 
16 |     def build(self,
17 |               input=None,
18 |               input_mask=None,
19 |               question=None):
20 | 
21 |         facts = self._build_textual_input_module(input)
22 |         encoded_question = self._build_question_module(question)
23 |         last_memory = self._build_episodic_memory_module(facts, encoded_question)
24 |         return self._build_answer_module(last_memory)
25 | 
26 |     def _build_textual_input_module(self, input):
27 |         textual_input = TextualInput(embed_dim=Config.model.embed_dim,
28 |                                      vocab_size=Config.data.vocab_size,
29 |                                      dtype=self.dtype)
30 |         facts = textual_input.build(input)
31 |         return facts
32 | 
33 |     def _build_question_module(self, question):
34 |         # TODO (DMN+): pass the real sequence lengths once the question pipeline is wired up
35 |         encoder = Encoder(encoder_type="uni",
36 |                           num_layers=Config.model.num_layers,
37 |                           cell_type="gru",
38 |                           num_units=Config.model.num_units)
39 |         _, question = encoder.build(question, sequence_length=None)
40 |         return question[0]
41 | 
42 |     def _build_episodic_memory_module(self, facts, question):
43 |         # TODO (DMN+): attention mechanism (gate attention + AttnGRU) and memory update
44 |         pass
45 | 
46 |     def _build_answer_module(self, last_memory):
47 |         # TODO (DMN+): decode the answer from the last memory state
48 |         pass

--------------------------------------------------------------------------------
/dynamic_memory_plus/attn_gru.py:
--------------------------------------------------------------------------------
1 | 
2 | from tensorflow.python.ops import array_ops
3 | from tensorflow.python.ops import init_ops
4 | from tensorflow.python.ops import math_ops
5 | from tensorflow.python.ops import variable_scope as vs
6 | from tensorflow.python.ops.rnn_cell_impl import RNNCell, _Linear
7 | 
8 | 
9 | class AttnGRUCell(RNNCell):
10 |     """Attention based GRU (cf. https://arxiv.org/abs/1603.01417).
11 | 
12 |     * Args:
13 |         num_units: int, The number of units in the AttnGRU cell.
14 |         activation: Nonlinearity to use.  Default: `tanh`.
15 |         reuse: (optional) Python boolean describing whether to reuse variables
16 |             in an existing scope.  If not `True`, and the existing scope already has
17 |             the given variables, an error is raised.
18 |         kernel_initializer: (optional) The initializer to use for the weight and
19 |             projection matrices.
20 |         bias_initializer: (optional) The initializer to use for the bias.
21 |     """
22 | 
23 |     def __init__(self,
24 |                  num_units,
25 |                  activation=None,
26 |                  reuse=None,
27 |                  kernel_initializer=None,
28 |                  bias_initializer=None):
29 | 
30 |         super(AttnGRUCell, self).__init__(_reuse=reuse)
31 |         self._num_units = num_units
32 |         self._activation = activation or math_ops.tanh
33 |         self._kernel_initializer = kernel_initializer
34 |         self._bias_initializer = bias_initializer
35 |         self._gate_linear = None
36 |         self._candidate_linear = None
37 | 
38 |     @property
39 |     def state_size(self):
40 |         return self._num_units
41 | 
42 |     @property
43 |     def output_size(self):
44 |         return self._num_units
45 | 
46 |     def call(self, inputs, state):
47 |         """Attention Based GRU with num_units cells.
48 | 
49 |         TODO (DMN+): replace the update gate `u` below with the attention
50 |         gate g_i from the episode module, i.e. new_h = g * c + (1 - g) * state.
51 |         """
52 |         if self._gate_linear is None:
53 |             bias_ones = self._bias_initializer
54 |             if self._bias_initializer is None:
55 |                 bias_ones = init_ops.constant_initializer(1.0, dtype=inputs.dtype)
56 | 
57 |             with vs.variable_scope("gates"):  # Reset gate and update gate.
58 |                 self._gate_linear = _Linear(
59 |                     [inputs, state],
60 |                     2 * self._num_units,
61 |                     True,
62 |                     bias_initializer=bias_ones,
63 |                     kernel_initializer=self._kernel_initializer)
64 | 
65 |         value = math_ops.sigmoid(self._gate_linear([inputs, state]))
66 |         r, u = array_ops.split(value=value, num_or_size_splits=2, axis=1)
67 | 
68 |         r_state = r * state
69 |         if self._candidate_linear is None:
70 |             with vs.variable_scope("candidate"):
71 |                 self._candidate_linear = _Linear(
72 |                     [inputs, r_state],
73 |                     self._num_units,
74 |                     True,
75 |                     bias_initializer=self._bias_initializer,
76 |                     kernel_initializer=self._kernel_initializer)
77 |         c = self._activation(self._candidate_linear([inputs, r_state]))
78 |         new_h = u * state + (1 - u) * c
79 |         return new_h, new_h
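For finishing `AttnGRUCell`, this is the DMN+ update it should implement: the attention gate g_i takes the place of the GRU update gate (a NumPy sketch with hypothetical values, not repo code):

```python
import numpy as np

g_i = 0.8                              # attention gate for fact i, from the episode module
h_prev = np.zeros(4)                   # previous hidden state h_{i-1}
h_tilde = np.tanh(np.random.rand(4))   # GRU candidate state
h_i = g_i * h_tilde + (1 - g_i) * h_prev   # h_i = g_i * h~_i + (1 - g_i) * h_{i-1}
```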
--------------------------------------------------------------------------------
/dynamic_memory_plus/encoder.py:
--------------------------------------------------------------------------------
1 | 
2 | import tensorflow as tf
3 | 
4 | 
5 | 
6 | class Encoder:
7 |     """Encoder class is a multi-layer recurrent neural network
8 | 
9 |     The 'Encoder' encodes the sequential input vectors.
10 |     """
11 | 
12 |     UNI_ENCODER_TYPE = "uni"
13 |     BI_ENCODER_TYPE = "bi"
14 | 
15 |     RNN_GRU_CELL = "gru"
16 |     RNN_LSTM_CELL = "lstm"
17 |     RNN_LAYER_NORM_LSTM_CELL = "layer_norm_lstm"
18 |     RNN_NAS_CELL = "nas"
19 | 
20 |     def __init__(self, encoder_type="uni", num_layers=4,
21 |                  cell_type="gru", num_units=512, dropout=0.8,
22 |                  dtype=tf.float32):
23 |         """Constructs an 'Encoder' instance.
24 | 
25 |         * Args:
26 |             encoder_type: rnn encoder type (uni, bi)
27 |             num_layers: number of RNN cells composed sequentially into a multi-layer cell
28 |             cell_type: RNN cell type (lstm, gru, layer_norm_lstm, nas)
29 |             num_units: the number of units in each cell
30 |             dropout: probability of dropping cell inputs (input_keep_prob = 1 - dropout)
31 |             dtype: the dtype of the input
32 | 
33 |         * Returns:
34 |             Encoder instance
35 |         """
36 | 
37 |         self.encoder_type = encoder_type.lower()   # accept 'uni' or 'UNI' from config
38 |         self.num_layers = num_layers
39 |         self.cell_type = cell_type.lower()         # accept 'gru' or 'GRU' from config
40 |         self.num_units = num_units
41 |         self.dropout = dropout
42 |         self.dtype = dtype
43 | 
44 |     def build(self, input_vector, sequence_length, scope=None):
45 |         if self.encoder_type == self.UNI_ENCODER_TYPE:
46 |             self.cells = self._create_rnn_cells()
47 | 
48 |             return self.unidirectional_rnn(input_vector, sequence_length, scope=scope)
49 |         elif self.encoder_type == self.BI_ENCODER_TYPE:
50 |             self.cells_fw = self._create_rnn_cells(is_list=True)
51 |             self.cells_bw = self._create_rnn_cells(is_list=True)
52 | 
53 |             return self.bidirectional_rnn(input_vector, sequence_length, scope=scope)
54 |         else:
55 |             raise ValueError(f"Unknown encoder_type {self.encoder_type}")
56 | 
57 |     def unidirectional_rnn(self, input_vector, sequence_length, scope=None):
58 |         return tf.nn.dynamic_rnn(
59 |             self.cells,
60 |             input_vector,
61 |             sequence_length=sequence_length,
62 |             dtype=self.dtype,
63 |             time_major=False,
64 |             swap_memory=True,
65 |             scope=scope)
66 | 
67 |     def bidirectional_rnn(self, input_vector, sequence_length, scope=None):
68 |         outputs, output_state_fw, output_state_bw = tf.contrib.rnn.stack_bidirectional_dynamic_rnn(
69 |             self.cells_fw,
70 |             self.cells_bw,
71 |             input_vector,
72 |             sequence_length=sequence_length,
73 |             dtype=self.dtype,
74 |             scope=scope)
75 | 
76 |         encoder_final_state = tf.concat((output_state_fw[-1], output_state_bw[-1]), axis=1)
77 |         return outputs, encoder_final_state
78 | 
79 |     def _create_rnn_cells(self, is_list=False):
80 |         """Constructs stacked_rnn with num_layers
81 | 
82 |         * Args:
83 |             is_list: True returns the cells as a list (for stack bidirectional),
84 |                      False wraps them in a single MultiRNNCell
85 | 
86 |         * Returns:
87 |             stacked_rnn
88 |         """
89 | 
90 |         stacked_rnn = []
91 |         for _ in range(self.num_layers):
92 |             single_cell = self._rnn_single_cell()
93 |             stacked_rnn.append(single_cell)
94 | 
95 |         if is_list:
96 |             return stacked_rnn
97 |         else:
98 |             return tf.nn.rnn_cell.MultiRNNCell(
99 |                 cells=stacked_rnn,
100 |                 state_is_tuple=True)
101 | 
102 |     def _rnn_single_cell(self):
103 |         """Constructs a single rnn cell"""
104 | 
105 |         if self.cell_type == self.RNN_GRU_CELL:
106 |             single_cell = tf.contrib.rnn.GRUCell(
107 |                 self.num_units,
108 |                 reuse=tf.get_variable_scope().reuse)
109 |         elif self.cell_type == self.RNN_LSTM_CELL:
110 |             single_cell = tf.contrib.rnn.BasicLSTMCell(
111 |                 self.num_units,
112 |                 forget_bias=1.0,
113 |                 reuse=tf.get_variable_scope().reuse)
114 |         elif self.cell_type == self.RNN_LAYER_NORM_LSTM_CELL:
115 |             single_cell = tf.contrib.rnn.LayerNormBasicLSTMCell(
116 |                 self.num_units,
117 |                 forget_bias=1.0,
118 |                 layer_norm=True,
119 |                 reuse=tf.get_variable_scope().reuse)
120 |         elif self.cell_type == self.RNN_NAS_CELL:
121 |             single_cell = tf.contrib.rnn.NASCell(
122 |                 self.num_units)
123 |         else:
124 |             raise ValueError(f"Unknown rnn cell type. {self.cell_type}")
125 | 
126 |         if self.dropout > 0.0:
127 |             single_cell = tf.contrib.rnn.DropoutWrapper(
128 |                 cell=single_cell, input_keep_prob=(1.0 - self.dropout))
129 | 
130 |         return single_cell

--------------------------------------------------------------------------------
/dynamic_memory_plus/episode.py:
--------------------------------------------------------------------------------
1 | 
2 | import tensorflow as tf
3 | 
4 | 
5 | 
6 | class Episode:
7 | 
8 |     def __init__(self, num_units):
9 |         self.gate = AttentionGate(hidden_size=num_units)
10 |         self.attn_gru = self._build_attention_based_gru(num_units)
11 |         self.rnn = self._build_attention_based_gru(num_units)
12 | 
13 |     def _build_attention_based_gru(self, num_units):
14 |         # TODO (DMN+): return an AttnGRUCell(num_units) (see attn_gru.py)
15 |         pass
16 | 
17 |     def update(self, f, m_t, q_t):
18 |         h = tf.zeros_like(f[0])
19 | 
20 |         with tf.variable_scope('memory-update') as scope:
21 |             for fact in f:
22 |                 g = self.gate.score(tf.transpose(fact, name="f"), m_t, q_t)
23 |                 h = g * self.rnn(fact, h, scope="episode_rnn")[0] + (1 - g) * h
24 |                 scope.reuse_variables()
25 |         return h
26 | 
27 | 
28 | class AttentionGate:
29 | 
30 |     def __init__(self, hidden_size=4, reg_scale=0.001):
31 |         self.w1 = tf.get_variable(
32 |             "w1", [hidden_size, 4*hidden_size],   # z below concatenates 4 feature blocks
33 |             regularizer=tf.contrib.layers.l2_regularizer(reg_scale))
34 |         self.b1 = tf.get_variable("b1", [hidden_size, 1])
35 |         self.w2 = tf.get_variable(
36 |             "w2", [1, hidden_size],
37 |             regularizer=tf.contrib.layers.l2_regularizer(reg_scale))
38 |         self.b2 = tf.get_variable("b2", [1, 1])
39 | 
40 |     def score(self, f_t, m_t, q_t):
41 | 
42 |         with tf.variable_scope('attention_gate'):
43 |             # z_i = [f * q ; f * m ; |f - q| ; |f - m|]  (DMN+)
44 |             z = tf.concat([f_t * q_t, f_t * m_t, tf.abs(f_t - q_t), tf.abs(f_t - m_t)], axis=0)
45 | 
46 |             o1 = tf.nn.tanh(tf.matmul(self.w1, z) + self.b1)
47 |             o2 = tf.matmul(self.w2, o1) + self.b2
48 |             # TODO (DMN+): the softmax should normalize the gates over all facts of an episode
49 |             o3 = tf.nn.softmax(o2)
50 |             return tf.transpose(o3)
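`input.py` below builds the DMN+ positional encoding l_jk = (1 - j/M) - (k/K)(1 - 2j/M) over M words and K embedding dimensions; a small worked check of the formula (toy sizes, not repo code):

```python
import numpy as np

M, K = 3, 4   # 3 words per sentence, 4 embedding dims
pe = np.array([[(1 - j / M) - (k / K) * (1 - 2 * j / M)
                for k in range(1, K + 1)]
               for j in range(1, M + 1)])
print(pe.shape)   # (3, 4): row j weights word j's embedding before summing over words
```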
--------------------------------------------------------------------------------
/dynamic_memory_plus/input.py:
--------------------------------------------------------------------------------
1 | 
2 | import numpy as np
3 | import tensorflow as tf
4 | 
5 | from .encoder import Encoder
6 | 
7 | 
8 | 
9 | class TextualInput:
10 | 
11 |     def __init__(self, embed_dim, vocab_size, dtype=tf.float32):
12 |         self.embed_dim = embed_dim
13 |         self.vocab_size = vocab_size
14 |         self.dtype = dtype
15 | 
16 |     def build(self, input):
17 |         fs = self.build_sentence_reader(input)
18 |         facts = self.build_input_fusion_layer(fs)
19 |         return facts
20 | 
21 |     def build_sentence_reader(self, input):
22 | 
23 |         with tf.variable_scope("sentence_reader"):
24 |             # f_i = sum over j of (l_j * w_j^i), with l the positional encoding
25 |             num_of_words = input.get_shape().as_list()[-1]   # words per sentence (assumes a static shape)
26 |             pe = self._positional_encoding(num_of_words, self.embed_dim)
27 |             w = tf.nn.embedding_lookup(self._word_embedding(), input)
28 |             return tf.reduce_sum(pe * w, axis=-2)   # element-wise weight, then sum over words
29 | 
30 |     def _word_embedding(self, dtype=tf.float32):
31 |         return tf.get_variable("word_embedding",
32 |                                [self.vocab_size, self.embed_dim], dtype)
33 | 
34 |     def _positional_encoding(self, num_of_words, dim, dtype=tf.float32):
35 |         # l_jk = (1 - j/M) - (k/K) * (1 - 2j/M), with M = num_of_words, K = dim
36 |         M, K = num_of_words, dim
37 |         pe = np.array(
38 |             [[(1 - j / M) - (k / K) * (1 - 2 * j / M) for k in range(1, K + 1)]
39 |              for j in range(1, M + 1)])
40 |         return tf.convert_to_tensor(pe, dtype=dtype, name="positional_encoding")
41 | 
42 |     def build_input_fusion_layer(self, f):
43 |         # TODO (DMN+): fuse the sentence vectors with a bidirectional GRU over the facts
44 |         encoder = Encoder(encoder_type="bi",
45 |                           cell_type="gru",
46 |                           num_units=self.embed_dim)
47 |         facts, _ = encoder.build(f, sequence_length=None)
48 |         return facts

--------------------------------------------------------------------------------
/hook.py:
--------------------------------------------------------------------------------
1 | 
2 | from hbconfig import Config
3 | import numpy as np
4 | import tensorflow as tf
5 | 
6 | 
7 | 
8 | def print_input(variables, vocab=None, every_n_iter=100):
9 | 
10 |     return tf.train.LoggingTensorHook(
11 |         variables,
12 |         every_n_iter=every_n_iter,
13 |         formatter=format_variable(variables, vocab=vocab))
14 | 
15 | 
16 | def format_variable(keys, vocab=None):
17 |     rev_vocab = get_rev_vocab(vocab)
18 | 
19 |     def to_str(sequence):
20 |         tokens = [
21 |             rev_vocab.get(x, '') for x in sequence if x != Config.data.PAD_ID]
22 |         return ' '.join(tokens)
23 | 
24 |     def format(values):
25 |         result = []
26 |         for key in keys:
27 |             if vocab is None:
28 |                 result.append(f"{key} = {values[key]}")
29 |             else:
30 |                 result.append(f"{key} = {to_str(values[key])}")
31 | 
32 |         try:
33 |             return '\n - '.join(result)
34 |         except:
35 |             pass
36 | 
37 |     return format
38 | 
39 | 
40 | def get_rev_vocab(vocab):
41 |     if vocab is None:
42 |         return None
43 |     return {idx: key for key, idx in vocab.items()}
44 | 
45 | 
46 | def print_target(variables, every_n_iter=100):
47 | 
48 |     return tf.train.LoggingTensorHook(
49 |         variables,
50 |         every_n_iter=every_n_iter,
51 |         formatter=print_pos_or_neg(variables))
52 | 
53 | 
54 | def print_pos_or_neg(keys):
55 | 
56 |     def format(values):
57 |         result = []
58 |         for key in keys:
59 |             if type(values[key]) == np.ndarray:
60 |                 value = max(values[key])
61 |             else:
62 |                 value = values[key]
63 |             result.append(f"{key} = {value}")
64 | 
65 |         try:
66 |             return ', '.join(result)
67 |         except:
68 |             pass
69 | 
70 |     return format

--------------------------------------------------------------------------------
/images/ask_me_anything_figure_3.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DongjunLee/dmn-tensorflow/09796bda5f068d8e6d53cfe71da4a234e67c6a7d/images/ask_me_anything_figure_3.png

--------------------------------------------------------------------------------
/main.py:
--------------------------------------------------------------------------------
1 | # -*- coding: utf-8 -*-
2 | 
3 | import argparse
4 | import logging
5 | 
6 | from hbconfig import Config
7 | import tensorflow as tf
8 | from tensorflow.python import debug as tf_debug
9 | 
10 | from data_loader import DataLoader
11 | import hook
12 | from model import
Model 13 | 14 | 15 | def experiment_fn(run_config, params): 16 | 17 | model = Model() 18 | estimator = tf.estimator.Estimator( 19 | model_fn=model.model_fn, 20 | model_dir=Config.train.model_dir, 21 | params=params, 22 | config=run_config) 23 | 24 | data_loader = DataLoader( 25 | task_path=Config.data.task_path, 26 | task_id=Config.data.task_id, 27 | task_test_id=Config.data.task_id, 28 | w2v_dim=Config.model.embed_dim, 29 | use_pretrained=Config.model.use_pretrained) 30 | 31 | data = data_loader.make_train_and_test_set() 32 | 33 | vocab = data_loader.vocab 34 | 35 | # setting data property 36 | Config.data.vocab_size = len(vocab) 37 | Config.data.max_facts_seq_len = data_loader.max_facts_seq_len 38 | Config.data.max_question_seq_len = data_loader.max_question_seq_len 39 | Config.data.max_input_mask_length = data_loader.max_input_mask_len 40 | print("max_facts_seq_len:", data_loader.max_facts_seq_len) 41 | print("max_question_seq_len:", data_loader.max_question_seq_len) 42 | print("max_input_mask_length:", data_loader.max_input_mask_len) 43 | 44 | train_input_fn, train_input_hook = data_loader.make_batch( 45 | data["train"], batch_size=Config.model.batch_size, scope="train") 46 | test_input_fn, test_input_hook = data_loader.make_batch( 47 | data["test"], batch_size=Config.model.batch_size, scope="test") 48 | 49 | train_hooks = [train_input_hook] 50 | if Config.train.print_verbose: 51 | pass 52 | if Config.train.debug: 53 | train_hooks.append(tf_debug.LocalCLIDebugHook()) 54 | 55 | eval_hooks = [test_input_hook] 56 | if Config.train.debug: 57 | eval_hooks.append(tf_debug.LocalCLIDebugHook()) 58 | 59 | experiment = tf.contrib.learn.Experiment( 60 | estimator=estimator, 61 | train_input_fn=train_input_fn, 62 | eval_input_fn=test_input_fn, 63 | train_steps=Config.train.train_steps, 64 | min_eval_frequency=Config.train.min_eval_frequency, 65 | train_monitors=train_hooks, 66 | eval_hooks=eval_hooks 67 | ) 68 | return experiment 69 | 70 | 71 | def main(mode): 72 | params = tf.contrib.training.HParams(**Config.model.to_dict()) 73 | 74 | run_config = tf.contrib.learn.RunConfig( 75 | model_dir=Config.train.model_dir, 76 | save_checkpoints_steps=Config.train.save_checkpoints_steps) 77 | 78 | tf.contrib.learn.learn_runner.run( 79 | experiment_fn=experiment_fn, 80 | run_config=run_config, 81 | schedule=mode, 82 | hparams=params 83 | ) 84 | 85 | 86 | if __name__ == '__main__': 87 | 88 | parser = argparse.ArgumentParser( 89 | formatter_class=argparse.ArgumentDefaultsHelpFormatter) 90 | parser.add_argument('--config', type=str, default='config', 91 | help='config file name') 92 | parser.add_argument('--mode', type=str, default='train', 93 | help='Mode (train/test/train_and_evaluate)') 94 | args = parser.parse_args() 95 | 96 | tf.logging.set_verbosity(logging.INFO) 97 | 98 | Config(args.config) 99 | print("Config: ", Config) 100 | if Config.description: 101 | print("Config Description") 102 | for key, value in Config.description.items(): 103 | print(f" - {key}: {value}") 104 | 105 | main(args.mode) 106 | -------------------------------------------------------------------------------- /model.py: -------------------------------------------------------------------------------- 1 | from __future__ import print_function 2 | 3 | 4 | from hbconfig import Config 5 | import tensorflow as tf 6 | 7 | import dynamic_memory 8 | 9 | 10 | 11 | class Model: 12 | 13 | def __init__(self): 14 | pass 15 | 16 | def model_fn(self, mode, features, labels, params): 17 | self.dtype = tf.float32 18 | 19 | self.mode = mode 20 | 
self.params = params 21 | 22 | self.loss, self.train_op, self.eval_metric_ops, self.predictions = None, None, None, None 23 | self._init_placeholder(features, labels) 24 | self.build_graph() 25 | 26 | # train mode: required loss and train_op 27 | # eval mode: required loss 28 | # predict mode: required predictions 29 | 30 | return tf.estimator.EstimatorSpec( 31 | mode=mode, 32 | predictions=self.predictions, 33 | loss=self.loss, 34 | train_op=self.train_op, 35 | eval_metric_ops=self._build_metric() 36 | ) 37 | 38 | def _init_placeholder(self, features, labels): 39 | self.input_data = features 40 | if type(features) == dict: 41 | self.embedding_input = features["input_data"] 42 | self.input_mask = features["input_data_mask"] 43 | self.embedding_question = features["question_data"] 44 | 45 | self.targets = labels 46 | 47 | def build_graph(self): 48 | graph = dynamic_memory.Graph(self.mode) 49 | output = graph.build(embedding_input=self.embedding_input, 50 | input_mask = self.input_mask, 51 | embedding_question=self.embedding_question) 52 | self.predictions = tf.argmax(output, axis=1) 53 | 54 | self._build_loss(output) 55 | self._build_optimizer() 56 | 57 | def _build_loss(self, output): 58 | with tf.variable_scope('loss'): 59 | cross_entropy = tf.losses.sparse_softmax_cross_entropy( 60 | self.targets, 61 | output, 62 | scope="cross-entropy") 63 | reg_term = tf.reduce_sum(tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES)) 64 | 65 | self.loss = tf.add(cross_entropy, reg_term) 66 | 67 | def _build_optimizer(self): 68 | self.train_op = tf.contrib.layers.optimize_loss( 69 | self.loss, tf.train.get_global_step(), 70 | optimizer=Config.train.get('optimizer', 'Adam'), 71 | learning_rate=Config.train.learning_rate, 72 | summaries=['loss', 'gradients', 'learning_rate'], 73 | name="train_op") 74 | 75 | def _build_metric(self): 76 | return { 77 | "accuracy": tf.metrics.accuracy(self.targets, self.predictions) 78 | } 79 | -------------------------------------------------------------------------------- /notebooks/data_loader.py.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": 44, 6 | "metadata": {}, 7 | "outputs": [], 8 | "source": [ 9 | "\"\"\"\n", 10 | "bAbi data_loader\n", 11 | "Original code : https://github.com/YerevaNN/Dynamic-memory-networks-in-Theano/blob/master/utils.py\n", 12 | "\"\"\"\n", 13 | "\n", 14 | "import os as os\n", 15 | "import numpy as np\n", 16 | "from tqdm import tqdm\n", 17 | "\n", 18 | "\n", 19 | "\n", 20 | "class DataLoader:\n", 21 | "\n", 22 | " def __init__(self, task_id, task_test_id, w2v_dim=100, input_mask_mode=\"sentence\", use_pretrained=True):\n", 23 | " self.base_path = os.path.join(\"data/\")\n", 24 | "\n", 25 | " self.task_id = str(task_id)\n", 26 | " self.task_test_id = str(task_test_id)\n", 27 | " self.w2v_dim = w2v_dim\n", 28 | " self.input_mask_mode = input_mask_mode\n", 29 | " self.use_pretrained = use_pretrained\n", 30 | "\n", 31 | " def make_train_and_test_set(self):\n", 32 | " train_raw, test_raw = self.get_babi_raw(self.task_id, self.task_test_id)\n", 33 | " self.max_facts_seq_len, self.max_question_seq_len, self.max_input_mask_len = self.get_max_seq_length(train_raw, test_raw)\n", 34 | " \n", 35 | " if self.use_pretrained:\n", 36 | " self.word2vec = self.load_glove(self.w2v_dim)\n", 37 | " else:\n", 38 | " self.word2vec = {}\n", 39 | " self.vocab = {}\n", 40 | " self.ivocab = {}\n", 41 | " \n", 42 | " self.create_vector(\"unknown\")\n", 43 | 
"\n", 44 | " train_input, train_question, train_answer, train_input_mask = self.process_input(train_raw)\n", 45 | " test_input, test_question, test_answer, test_input_mask = self.process_input(test_raw)\n", 46 | "\n", 47 | " return {\n", 48 | " \"train\": (train_input, train_input_mask, train_question, train_answer),\n", 49 | " \"test\": (test_input, test_input_mask, test_question, test_answer)\n", 50 | " }\n", 51 | " \n", 52 | " def get_max_seq_length(self, *datasets):\n", 53 | " max_facts_length, max_question_length, max_input_mask_length = 0, 0, 0\n", 54 | " \n", 55 | " def count_punctuation(facts):\n", 56 | " return len(list(filter(lambda x: x == \".\", facts)))\n", 57 | " \n", 58 | " for dataset in datasets:\n", 59 | " for d in dataset:\n", 60 | " max_facts_length = max(max_facts_length, len(d['C'].split()))\n", 61 | " max_input_mask_length = max(max_input_mask_length, count_punctuation(d['C']))\n", 62 | " max_question_length = max(max_question_length, len(d['Q'].split()))\n", 63 | " return max_facts_length, max_question_length, max_input_mask_length\n", 64 | "\n", 65 | " def init_babi(self, fname):\n", 66 | " print(\"==> Loading test from %s\" % fname)\n", 67 | " tasks = []\n", 68 | " task = None\n", 69 | " for i, line in enumerate(open(fname)):\n", 70 | " id = int(line[0:line.find(' ')])\n", 71 | " if id == 1:\n", 72 | " task = {\"C\": \"\", \"Q\": \"\", \"A\": \"\"}\n", 73 | "\n", 74 | " line = line.strip()\n", 75 | " line = line.replace('.', ' . ')\n", 76 | " line = line[line.find(' ')+1:]\n", 77 | " if line.find('?') == -1:\n", 78 | " task[\"C\"] += line\n", 79 | " else:\n", 80 | " idx = line.find('?')\n", 81 | " tmp = line[idx+1:].split('\\t')\n", 82 | " task[\"Q\"] = line[:idx]\n", 83 | " task[\"A\"] = tmp[1].strip()\n", 84 | " tasks.append(task.copy())\n", 85 | "\n", 86 | " return tasks\n", 87 | "\n", 88 | "\n", 89 | " def get_babi_raw(self, id, test_id):\n", 90 | " babi_map = {\n", 91 | " \"1\": \"qa1_single-supporting-fact\",\n", 92 | " \"2\": \"qa2_two-supporting-facts\",\n", 93 | " \"3\": \"qa3_three-supporting-facts\",\n", 94 | " \"4\": \"qa4_two-arg-relations\",\n", 95 | " \"5\": \"qa5_three-arg-relations\",\n", 96 | " \"6\": \"qa6_yes-no-questions\",\n", 97 | " \"7\": \"qa7_counting\",\n", 98 | " \"8\": \"qa8_lists-sets\",\n", 99 | " \"9\": \"qa9_simple-negation\",\n", 100 | " \"10\": \"qa10_indefinite-knowledge\",\n", 101 | " \"11\": \"qa11_basic-coreference\",\n", 102 | " \"12\": \"qa12_conjunction\",\n", 103 | " \"13\": \"qa13_compound-coreference\",\n", 104 | " \"14\": \"qa14_time-reasoning\",\n", 105 | " \"15\": \"qa15_basic-deduction\",\n", 106 | " \"16\": \"qa16_basic-induction\",\n", 107 | " \"17\": \"qa17_positional-reasoning\",\n", 108 | " \"18\": \"qa18_size-reasoning\",\n", 109 | " \"19\": \"qa19_path-finding\",\n", 110 | " \"20\": \"qa20_agents-motivations\",\n", 111 | " \"MCTest\": \"MCTest\",\n", 112 | " \"19changed\": \"19changed\",\n", 113 | " \"joint\": \"all_shuffled\",\n", 114 | " \"sh1\": \"../shuffled/qa1_single-supporting-fact\",\n", 115 | " \"sh2\": \"../shuffled/qa2_two-supporting-facts\",\n", 116 | " \"sh3\": \"../shuffled/qa3_three-supporting-facts\",\n", 117 | " \"sh4\": \"../shuffled/qa4_two-arg-relations\",\n", 118 | " \"sh5\": \"../shuffled/qa5_three-arg-relations\",\n", 119 | " \"sh6\": \"../shuffled/qa6_yes-no-questions\",\n", 120 | " \"sh7\": \"../shuffled/qa7_counting\",\n", 121 | " \"sh8\": \"../shuffled/qa8_lists-sets\",\n", 122 | " \"sh9\": \"../shuffled/qa9_simple-negation\",\n", 123 | " \"sh10\": 
\"../shuffled/qa10_indefinite-knowledge\",\n", 124 | " \"sh11\": \"../shuffled/qa11_basic-coreference\",\n", 125 | " \"sh12\": \"../shuffled/qa12_conjunction\",\n", 126 | " \"sh13\": \"../shuffled/qa13_compound-coreference\",\n", 127 | " \"sh14\": \"../shuffled/qa14_time-reasoning\",\n", 128 | " \"sh15\": \"../shuffled/qa15_basic-deduction\",\n", 129 | " \"sh16\": \"../shuffled/qa16_basic-induction\",\n", 130 | " \"sh17\": \"../shuffled/qa17_positional-reasoning\",\n", 131 | " \"sh18\": \"../shuffled/qa18_size-reasoning\",\n", 132 | " \"sh19\": \"../shuffled/qa19_path-finding\",\n", 133 | " \"sh20\": \"../shuffled/qa20_agents-motivations\",\n", 134 | " }\n", 135 | " if (test_id == \"\"):\n", 136 | " test_id = id\n", 137 | " babi_name = babi_map[id]\n", 138 | " babi_test_name = babi_map[test_id]\n", 139 | " babi_train_raw = self.init_babi(os.path.join(self.base_path, 'en-10/%s_train.txt' % babi_name))\n", 140 | " babi_test_raw = self.init_babi(os.path.join(self.base_path, 'en-10/%s_test.txt' % babi_test_name))\n", 141 | " return babi_train_raw, babi_test_raw\n", 142 | "\n", 143 | " def load_glove(self, dim):\n", 144 | " word2vec = {}\n", 145 | "\n", 146 | " print(\"==> loading glove\")\n", 147 | " with open(os.path.join(self.base_path, \"glove/glove.6B.\" + str(dim) + \"d.txt\")) as f:\n", 148 | " for line in tqdm(f):\n", 149 | " l = line.split()\n", 150 | " word2vec[l[0]] = l[1:]\n", 151 | "\n", 152 | " print(\"==> glove is loaded\")\n", 153 | "\n", 154 | " return word2vec\n", 155 | "\n", 156 | " def create_vector(self, word, silent=False):\n", 157 | " # if the word is missing from Glove, create some fake vector and store in glove!\n", 158 | " vector = np.random.uniform(0.0, 1.0, (self.w2v_dim,))\n", 159 | " self.word2vec[word] = vector\n", 160 | " if (not silent):\n", 161 | " print(\"data_loader.py::create_vector => %s is missing\" % word)\n", 162 | " return vector\n", 163 | "\n", 164 | " def process_word(self, word, to_return=\"word2vec\", silent=False):\n", 165 | " if not word in self.word2vec:\n", 166 | " self.create_vector(word, silent=silent)\n", 167 | " if not word in self.vocab:\n", 168 | " next_index = len(self.vocab)\n", 169 | " self.vocab[word] = next_index\n", 170 | " self.ivocab[next_index] = word\n", 171 | "\n", 172 | " if to_return == \"word2vec\":\n", 173 | " return self.word2vec[word]\n", 174 | " elif to_return == \"index\":\n", 175 | " return self.vocab[word]\n", 176 | " else:\n", 177 | " raise ValueError(\"return type is 'word2vec' or 'index'\")\n", 178 | "\n", 179 | " def get_norm(self, x):\n", 180 | " x = np.array(x)\n", 181 | " return np.sum(x * x)\n", 182 | "\n", 183 | " def process_input(self, data_raw):\n", 184 | " questions = []\n", 185 | " inputs = []\n", 186 | " answers = []\n", 187 | " input_masks = []\n", 188 | " \n", 189 | " for x in data_raw:\n", 190 | " inp = x[\"C\"].lower().split(' ')\n", 191 | " inp = [w for w in inp if len(w) > 0]\n", 192 | " \n", 193 | " q = x[\"Q\"].lower().split(' ')\n", 194 | " q = [w for w in q if len(w) > 0]\n", 195 | "\n", 196 | " inp_vector = [self.process_word(word=w, to_return=\"word2vec\") for w in inp]\n", 197 | " inp_vector = self.pad_input(inp_vector, self.max_facts_seq_len, [np.zeros(self.w2v_dim)])\n", 198 | " \n", 199 | " q_vector = [self.process_word(word=w, to_return=\"word2vec\") for w in q]\n", 200 | " q_vector = self.pad_input(q_vector, self.max_question_seq_len, [np.zeros(self.w2v_dim)])\n", 201 | " \n", 202 | " inputs.append(np.vstack(inp_vector).astype(float)) \n", 203 | " 
questions.append(np.vstack(q_vector).astype(float))\n", 204 | " answers.append(self.process_word(word=x[\"A\"], to_return=\"index\"))\n", 205 | "\n", 206 | " if self.input_mask_mode == 'word':\n", 207 | " input_masks.append(np.array([index for index, w in enumerate(inp)], dtype=np.int32))\n", 208 | " elif self.input_mask_mode == 'sentence':\n", 209 | " input_mask = [index for index, w in enumerate(inp) if w == '.']\n", 210 | " input_mask = self.pad_input(input_mask, self.max_input_mask_len, [0])\n", 211 | " input_masks.append(input_mask)\n", 212 | " else:\n", 213 | " raise ValueError(\"input_mask_mode must be 'word' or 'sentence'\")\n", 214 | " \n", 215 | " return (np.array(inputs, dtype=np.float32), \n", 216 | " np.array(questions, dtype=np.float32),\n", 217 | " np.array(answers, dtype=np.int32).reshape(-1, 1), \n", 218 | " np.array(input_masks, dtype=np.int32))\n", 219 | " \n", 220 | " def pad_input(self, input_, size, pad_item):\n", 221 | " return input_ + pad_item * (size - len(input_))" 222 | ] 223 | }, 224 | { 225 | "cell_type": "code", 226 | "execution_count": 45, 227 | "metadata": {}, 228 | "outputs": [ 229 | { 230 | "name": "stdout", 231 | "output_type": "stream", 232 | "text": [ 233 | "==> Loading data from data/en-10/qa1_single-supporting-fact_train.txt\n", 234 | "==> Loading data from data/en-10/qa1_single-supporting-fact_test.txt\n", 235 | "data_loader.py::create_vector => unknown is missing\n", 236 | "data_loader.py::create_vector => mary is missing\n", 237 | "data_loader.py::create_vector => moved is missing\n", 238 | "data_loader.py::create_vector => to is missing\n", 239 | "data_loader.py::create_vector => the is missing\n", 240 | "data_loader.py::create_vector => bathroom is missing\n", 241 | "data_loader.py::create_vector => . is missing\n", 242 | "data_loader.py::create_vector => john is missing\n", 243 | "data_loader.py::create_vector => went is missing\n", 244 | "data_loader.py::create_vector => hallway is missing\n", 245 | "data_loader.py::create_vector => where is missing\n", 246 | "data_loader.py::create_vector => is is missing\n", 247 | "data_loader.py::create_vector => daniel is missing\n", 248 | "data_loader.py::create_vector => back is missing\n", 249 | "data_loader.py::create_vector => sandra is missing\n", 250 | "data_loader.py::create_vector => garden is missing\n", 251 | "data_loader.py::create_vector => office is missing\n", 252 | "data_loader.py::create_vector => journeyed is missing\n" 253 | ] 254 | } 255 | ], 256 | "source": [ 257 | "data_loader = DataLoader(task_id=\"1\", task_test_id=\"1\", w2v_dim=50, use_pretrained=False)\n", 258 | "data = data_loader.make_train_and_test_set()" 259 | ] 260 | }, 261 | { 262 | "cell_type": "code", 263 | "execution_count": 46, 264 | "metadata": {}, 265 | "outputs": [ 266 | { 267 | "name": "stdout", 268 | "output_type": "stream", 269 | "text": [ 270 | "==> Loading data from data/en-10/qa1_single-supporting-fact_train.txt\n", 271 | "==> Loading data from data/en-10/qa1_single-supporting-fact_test.txt\n" 272 | ] 273 | } 274 | ], 275 | "source": [ 276 | "train_raw, test_raw = data_loader.get_babi_raw(\"1\", \"1\")" 277 | ] 278 | }, 279 | { 280 | "cell_type": "code", 281 | "execution_count": 47, 282 | "metadata": {}, 283 | "outputs": [ 284 | { 285 | "data": { 286 | "text/plain": [ 287 | "[{'A': 'bathroom',\n", 288 | " 'C': 'Mary moved to the bathroom . John went to the hallway . ',\n", 289 | " 'Q': 'Where is Mary'},\n", 290 | " {'A': 'hallway',\n", 291 | " 'C': 'Mary moved to the bathroom . 
John went to the hallway . Daniel went back to the hallway . Sandra moved to the garden . ',\n", 292 | " 'Q': 'Where is Daniel'},\n", 293 | " {'A': 'hallway',\n", 294 | " 'C': 'Mary moved to the bathroom . John went to the hallway . Daniel went back to the hallway . Sandra moved to the garden . John moved to the office . Sandra journeyed to the bathroom . ',\n", 295 | " 'Q': 'Where is Daniel'}]" 296 | ] 297 | }, 298 | "execution_count": 47, 299 | "metadata": {}, 300 | "output_type": "execute_result" 301 | } 302 | ], 303 | "source": [ 304 | "train_raw" 305 | ] 306 | }, 307 | { 308 | "cell_type": "code", 309 | "execution_count": 48, 310 | "metadata": {}, 311 | "outputs": [ 312 | { 313 | "name": "stdout", 314 | "output_type": "stream", 315 | "text": [ 316 | "(3, 37, 50) (3, 6) (3, 3, 50) (3, 1)\n" 317 | ] 318 | } 319 | ], 320 | "source": [ 321 | "train_input, train_input_mask, train_question, train_answer = data[\"train\"]\n", 322 | "print(train_input.shape, train_input_mask.shape, train_question.shape, train_answer.shape)" 323 | ] 324 | }, 325 | { 326 | "cell_type": "code", 327 | "execution_count": 49, 328 | "metadata": {}, 329 | "outputs": [ 330 | { 331 | "data": { 332 | "text/plain": [ 333 | "(3, 37, 50)" 334 | ] 335 | }, 336 | "execution_count": 49, 337 | "metadata": {}, 338 | "output_type": "execute_result" 339 | } 340 | ], 341 | "source": [ 342 | "train_input.shape" 343 | ] 344 | }, 345 | { 346 | "cell_type": "code", 347 | "execution_count": 50, 348 | "metadata": {}, 349 | "outputs": [ 350 | { 351 | "data": { 352 | "text/plain": [ 353 | "(37, 50)" 354 | ] 355 | }, 356 | "execution_count": 50, 357 | "metadata": {}, 358 | "output_type": "execute_result" 359 | } 360 | ], 361 | "source": [ 362 | "train_input[0].shape" 363 | ] 364 | }, 365 | { 366 | "cell_type": "code", 367 | "execution_count": 51, 368 | "metadata": {}, 369 | "outputs": [ 370 | { 371 | "data": { 372 | "text/plain": [ 373 | "array([ 5, 11, 0, 0, 0, 0], dtype=int32)" 374 | ] 375 | }, 376 | "execution_count": 51, 377 | "metadata": {}, 378 | "output_type": "execute_result" 379 | } 380 | ], 381 | "source": [ 382 | "train_input_mask[0]" 383 | ] 384 | }, 385 | { 386 | "cell_type": "code", 387 | "execution_count": 52, 388 | "metadata": {}, 389 | "outputs": [ 390 | { 391 | "data": { 392 | "text/plain": [ 393 | "(3, 50)" 394 | ] 395 | }, 396 | "execution_count": 52, 397 | "metadata": {}, 398 | "output_type": "execute_result" 399 | } 400 | ], 401 | "source": [ 402 | "train_question[0].shape" 403 | ] 404 | }, 405 | { 406 | "cell_type": "code", 407 | "execution_count": 53, 408 | "metadata": {}, 409 | "outputs": [ 410 | { 411 | "data": { 412 | "text/plain": [ 413 | "array([4], dtype=int32)" 414 | ] 415 | }, 416 | "execution_count": 53, 417 | "metadata": {}, 418 | "output_type": "execute_result" 419 | } 420 | ], 421 | "source": [ 422 | "train_answer[0]" 423 | ] 424 | }, 425 | { 426 | "cell_type": "code", 427 | "execution_count": 54, 428 | "metadata": {}, 429 | "outputs": [ 430 | { 431 | "data": { 432 | "text/plain": [ 433 | "37" 434 | ] 435 | }, 436 | "execution_count": 54, 437 | "metadata": {}, 438 | "output_type": "execute_result" 439 | } 440 | ], 441 | "source": [ 442 | "data_loader.max_facts_seq_len" 443 | ] 444 | }, 445 | { 446 | "cell_type": "code", 447 | "execution_count": 49, 448 | "metadata": {}, 449 | "outputs": [ 450 | { 451 | "data": { 452 | "text/plain": [ 453 | "3" 454 | ] 455 | }, 456 | "execution_count": 49, 457 | "metadata": {}, 458 | "output_type": "execute_result" 459 | } 460 | ], 461 | "source": [ 462 | 
"data_loader.max_question_seq_len" 463 | ] 464 | }, 465 | { 466 | "cell_type": "code", 467 | "execution_count": null, 468 | "metadata": {}, 469 | "outputs": [], 470 | "source": [] 471 | }, 472 | { 473 | "cell_type": "code", 474 | "execution_count": null, 475 | "metadata": {}, 476 | "outputs": [], 477 | "source": [] 478 | } 479 | ], 480 | "metadata": { 481 | "kernelspec": { 482 | "display_name": "Python 3.6 (NLP)", 483 | "language": "python", 484 | "name": "nlp" 485 | }, 486 | "language_info": { 487 | "codemirror_mode": { 488 | "name": "ipython", 489 | "version": 3 490 | }, 491 | "file_extension": ".py", 492 | "mimetype": "text/x-python", 493 | "name": "python", 494 | "nbconvert_exporter": "python", 495 | "pygments_lexer": "ipython3", 496 | "version": "3.6.1" 497 | } 498 | }, 499 | "nbformat": 4, 500 | "nbformat_minor": 2 501 | } 502 | -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | hb-config 2 | tqdm -------------------------------------------------------------------------------- /scripts/fetch_babi_data.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | url=http://www.thespermwhale.com/jaseweston/babi/tasks_1-20_v1-2.tar.gz 4 | fname=`basename $url` 5 | 6 | curl -SLO $url 7 | tar zxvf $fname 8 | mkdir -p data 9 | mv tasks_1-20_v1-2/* data/ 10 | 11 | rm -r tasks_1-20_v1-2 12 | rm tasks_1-20_v1-2.tar.gz 13 | -------------------------------------------------------------------------------- /scripts/fetch_glove_data.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | url=http://nlp.stanford.edu/data/glove.6B.zip 4 | fname=`basename $url` 5 | 6 | curl -SLO $url 7 | mkdir -p data 8 | unzip $fname -d data/glove/ 9 | --------------------------------------------------------------------------------