├── .gitignore ├── LICENSE.txt ├── README.md ├── requirements.txt └── src ├── __init__.py ├── npi ├── __init__.py ├── add │ ├── __init__.py │ ├── config.py │ ├── create_training_data.py │ ├── lib.py │ ├── model.py │ ├── test_model.py │ └── training_model.py ├── core.py └── terminal_core.py ├── run_create_addition_data.sh ├── run_test_addition_model.sh └── run_train_addition_model.sh /.gitignore: -------------------------------------------------------------------------------- 1 | .idea/ 2 | .python-version 3 | *.pyc 4 | *.log 5 | *.png 6 | *.model 7 | *.pkl 8 | .DS_Store 9 | -------------------------------------------------------------------------------- /LICENSE.txt: -------------------------------------------------------------------------------- 1 | Copyright (c) 2016 Ken Morishita 2 | 3 | Permission is hereby granted, free of charge, to any person obtaining 4 | a copy of this software and associated documentation files (the 5 | "Software"), to deal in the Software without restriction, including 6 | without limitation the rights to use, copy, modify, merge, publish, 7 | distribute, sublicense, and/or sell copies of the Software, and to 8 | permit persons to whom the Software is furnished to do so, subject to 9 | the following conditions: 10 | 11 | The above copyright notice and this permission notice shall be 12 | included in all copies or substantial portions of the Software. 13 | 14 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, 15 | EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 16 | MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND 17 | NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE 18 | LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION 19 | OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION 20 | WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 21 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | About 2 | ===== 3 | 4 | Implementation of [Neural Programmer-Interpreters](http://arxiv.org/abs/1511.06279) with Keras. 5 | 6 | How to Demo 7 | =========== 8 | 9 | [Demo Movie](https://youtu.be/s7PuBqwI2YA) 10 | 11 | requirements 12 | ----------- 13 | 14 | * Python3 15 | 16 | setup 17 | ----- 18 | 19 | ``` 20 | pip install -r requirements.txt 21 | ``` 22 | 23 | create training dataset 24 | ----------------------- 25 | ### create training dataset 26 | ``` 27 | sh src/run_create_addition_data.sh 28 | ``` 29 | 30 | ### create training dataset, showing each step on the terminal 31 | ``` 32 | DEBUG=1 sh src/run_create_addition_data.sh 33 | ``` 34 | 35 | training model 36 | ------------------ 37 | ### Create a new model (removes the old model if it exists, then creates a new one) 38 | ``` 39 | NEW_MODEL=1 sh src/run_train_addition_model.sh 40 | ``` 41 | 42 | ### Train an existing model (if a model already exists, it is reused) 43 | ``` 44 | sh src/run_train_addition_model.sh 45 | ``` 46 | 47 | test model 48 | ---------- 49 | ### check the model accuracy 50 | ``` 51 | sh src/run_test_addition_model.sh 52 | ``` 53 | 54 | ### check the model accuracy, showing each step on the terminal 55 | ``` 56 | DEBUG=1 sh src/run_test_addition_model.sh 57 | ``` 58 | 59 | Implementation FAQ 60 | ================== 61 | These are implementation questions I have received in the past. 62 | 63 | about pydot 64 | ----------- 65 | Q: I am using Python3.
I am getting an error "module 'pydot' has no attribute 'find_graphviz'". 66 | 67 | A: Try `pydot-ng` instead. 68 | 69 | `train_f_enc` method 70 | -------------------- 71 | Q: What is the purpose of 'env_model' in the 'train_f_enc' method, which gets called by the 'fit' method? My guess is that it is there to train the weights of the 'f_enc' layer. 72 | 73 | A: Yes, that's right. 74 | 75 | ---- 76 | 77 | Q: Why is the target output of 'env_model' [[first digit of sum], [carry of sum]]? 78 | Also, why does the target output not include 'output'? 79 | As I understand it, the weights of the 'f_enc' layer should be trained only in 'self.model'. 80 | 81 | A: Yes, in the original paper 'f_enc' is trained together with the other layers, and it would be better not to train it separately. 82 | 83 | The reason it is different in my implementation is simply that the model was hard to train. 84 | In particular, it seemed too hard to train the layers before the LSTMs (such as the f_enc layer): 85 | the f_enc weights often became NaN. (I don't know why... a Keras problem? Or something else?) 86 | So I tried training f_enc separately, and that seemed to work well (though not best). 87 | 88 | NOP program 89 | ----------- 90 | Q: What is the purpose of the NOP program? 91 | 92 | A: I do not remember it well, but NOP (No Operation) has program_id = 0. 93 | My thinking was that, early in training, the predicted value is often 0, so making 0 a harmless NOP that performs no unnecessary actions should let the model learn more efficiently. 94 | Although it is not certain whether this is actually effective... 95 | 96 | `weights = [[1.]]` 97 | ------------------ 98 | 99 | Q: What is the purpose of this `weights = [[1.]]` initialization? 100 | 101 | A: You mean `weights = [[1.]]` in `AdditionNPIModel#convert_output()`, don't you? 102 | 103 | The `weights` are the learning (sample) weights for [f_end, f_prog, f_args]. 104 | The first `weights = [[1.]]` means "f_end's weight = 1". 105 | The f_prog and f_args weights are set to 1 only if the teacher returns valid values. -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | h5py==2.6.0 2 | Keras==1.0.2 3 | numpy==1.11.0 4 | pydot-ng==1.0.0 5 | pyparsing==2.1.1 6 | PyYAML==3.11 7 | scipy==0.17.0 8 | six==1.10.0 9 | Theano==0.8.2 10 | -------------------------------------------------------------------------------- /src/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/mokemokechicken/keras_npi/05a4227effe9d1752fb953b9ffbd9ad29ba79e04/src/__init__.py -------------------------------------------------------------------------------- /src/npi/__init__.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # coding: utf-8 3 | 4 | __author__ = 'k_morishita' 5 | -------------------------------------------------------------------------------- /src/npi/add/__init__.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # coding: utf-8 3 | 4 | __author__ = 'k_morishita' 5 | -------------------------------------------------------------------------------- /src/npi/add/config.py: -------------------------------------------------------------------------------- 1 | # coding: utf-8 2 | 3 | FIELD_ROW = 4 # Input1, Input2, Carry, Output 4 | FIELD_WIDTH = 9 # number of columns 5 | FIELD_DEPTH = 11 # number of characters(0~9 digits) and white space, per cell.
one-hot-encoding 6 | PROGRAM_VEC_SIZE = 10 7 | PROGRAM_KEY_VEC_SIZE = 5 8 | MAX_PROGRAM_NUM = 10 9 | -------------------------------------------------------------------------------- /src/npi/add/create_training_data.py: -------------------------------------------------------------------------------- 1 | # coding: utf-8 2 | import os 3 | import curses 4 | import pickle 5 | from copy import copy 6 | 7 | from npi.add.config import FIELD_ROW, FIELD_WIDTH, FIELD_DEPTH 8 | from npi.add.lib import AdditionEnv, AdditionProgramSet, AdditionTeacher, create_char_map, create_questions, run_npi 9 | from npi.core import ResultLogger 10 | from npi.terminal_core import TerminalNPIRunner, Terminal 11 | 12 | 13 | def main(stdscr, filename: str, num: int, result_logger: ResultLogger): 14 | terminal = Terminal(stdscr, create_char_map()) 15 | terminal.init_window(FIELD_WIDTH, FIELD_ROW) 16 | program_set = AdditionProgramSet() 17 | addition_env = AdditionEnv(FIELD_ROW, FIELD_WIDTH, FIELD_DEPTH) 18 | 19 | questions = create_questions(num) 20 | teacher = AdditionTeacher(program_set) 21 | npi_runner = TerminalNPIRunner(terminal, teacher) 22 | npi_runner.verbose = DEBUG_MODE 23 | steps_list = [] 24 | for data in questions: 25 | addition_env.reset() 26 | q = copy(data) 27 | run_npi(addition_env, npi_runner, program_set.ADD, data) 28 | steps_list.append({"q": q, "steps": npi_runner.step_list}) 29 | result_logger.write(data) 30 | terminal.add_log(data) 31 | 32 | if filename: 33 | with open(filename, 'wb') as f: 34 | pickle.dump(steps_list, f, protocol=pickle.HIGHEST_PROTOCOL) 35 | 36 | if __name__ == '__main__': 37 | import sys 38 | DEBUG_MODE = os.environ.get('DEBUG') 39 | if DEBUG_MODE: 40 | output_filename = None 41 | num_data = 3 42 | log_filename = 'result.log' 43 | else: 44 | output_filename = sys.argv[1] if len(sys.argv) > 1 else None 45 | num_data = int(sys.argv[2]) if len(sys.argv) > 2 else 1000 46 | log_filename = sys.argv[3] if len(sys.argv) > 3 else 'result.log' 47 | curses.wrapper(main, output_filename, num_data, ResultLogger(log_filename)) 48 | print("create %d training data" % num_data) 49 | -------------------------------------------------------------------------------- /src/npi/add/lib.py: -------------------------------------------------------------------------------- 1 | # coding: utf-8 2 | from random import random 3 | 4 | import numpy as np 5 | 6 | from npi.core import Program, IntegerArguments, StepOutput, NPIStep, PG_CONTINUE, PG_RETURN 7 | from npi.terminal_core import Screen, Terminal 8 | 9 | __author__ = 'k_morishita' 10 | 11 | 12 | class AdditionEnv: 13 | """ 14 | Environment of Addition 15 | """ 16 | def __init__(self, height, width, num_chars): 17 | self.screen = Screen(height, width) 18 | self.num_chars = num_chars 19 | self.pointers = [0] * height 20 | self.reset() 21 | 22 | def reset(self): 23 | self.screen.fill(0) 24 | self.pointers = [self.screen.width-1] * self.screen.height # rightmost 25 | 26 | def get_observation(self) -> np.ndarray: 27 | value = [] 28 | for row in range(len(self.pointers)): 29 | value.append(self.to_one_hot(self.screen[row, self.pointers[row]])) 30 | return np.array(value) # shape of FIELD_ROW * FIELD_DEPTH 31 | 32 | def to_one_hot(self, ch): 33 | ret = np.zeros((self.num_chars,), dtype=np.int8) 34 | if 0 <= ch < self.num_chars: 35 | ret[ch] = 1 36 | else: 37 | raise IndexError("ch must be 0 <= ch < %s, but %s" % (self.num_chars, ch)) 38 | return ret 39 | 40 | def setup_problem(self, num1, num2): 41 | for i, s in enumerate(reversed("%s" % num1)): 42 | 
self.screen[0, -(i+1)] = int(s) + 1 43 | for i, s in enumerate(reversed("%s" % num2)): 44 | self.screen[1, -(i+1)] = int(s) + 1 45 | 46 | def move_pointer(self, row, left_or_right): 47 | if 0 <= row < len(self.pointers): 48 | self.pointers[row] += 1 if left_or_right == 1 else -1 # LEFT is 0, RIGHT is 1 49 | self.pointers[row] %= self.screen.width 50 | 51 | def write(self, row, ch): 52 | if 0 <= row < self.screen.height and 0 <= ch < self.num_chars: 53 | self.screen[row, self.pointers[row]] = ch 54 | 55 | def get_output(self): 56 | s = "" 57 | for ch in self.screen[3]: 58 | if ch > 0: 59 | s += "%s" % (ch-1) 60 | return int(s or "0") 61 | 62 | 63 | class MovePtrProgram(Program): 64 | output_to_env = True 65 | PTR_IN1 = 0 66 | PTR_IN2 = 1 67 | PTR_CARRY = 2 68 | PTR_OUT = 3 69 | 70 | TO_LEFT = 0 71 | TO_RIGHT = 1 72 | 73 | def do(self, env: AdditionEnv, args: IntegerArguments): 74 | ptr_kind = args.decode_at(0) 75 | left_or_right = args.decode_at(1) 76 | env.move_pointer(ptr_kind, left_or_right) 77 | 78 | 79 | class WriteProgram(Program): 80 | output_to_env = True 81 | WRITE_TO_CARRY = 0 82 | WRITE_TO_OUTPUT = 1 83 | 84 | def do(self, env: AdditionEnv, args: IntegerArguments): 85 | row = 2 if args.decode_at(0) == self.WRITE_TO_CARRY else 3 86 | digit = args.decode_at(1) 87 | env.write(row, digit+1) 88 | 89 | 90 | class AdditionProgramSet: 91 | NOP = Program('NOP') 92 | MOVE_PTR = MovePtrProgram('MOVE_PTR', 4, 2) # PTR_KIND(4), LEFT_OR_RIGHT(2) 93 | WRITE = WriteProgram('WRITE', 2, 10) # CARRY_OR_OUT(2), DIGITS(10) 94 | ADD = Program('ADD') 95 | ADD1 = Program('ADD1') 96 | CARRY = Program('CARRY') 97 | LSHIFT = Program('LSHIFT') 98 | RSHIFT = Program('RSHIFT') 99 | 100 | def __init__(self): 101 | self.map = {} 102 | self.program_id = 0 103 | self.register(self.NOP) 104 | self.register(self.MOVE_PTR) 105 | self.register(self.WRITE) 106 | self.register(self.ADD) 107 | self.register(self.ADD1) 108 | self.register(self.CARRY) 109 | self.register(self.LSHIFT) 110 | self.register(self.RSHIFT) 111 | 112 | def register(self, pg: Program): 113 | pg.program_id = self.program_id 114 | self.map[pg.program_id] = pg 115 | self.program_id += 1 116 | 117 | def get(self, i: int): 118 | return self.map.get(i) 119 | 120 | 121 | class AdditionTeacher(NPIStep): 122 | def __init__(self, program_set: AdditionProgramSet): 123 | self.pg_set = program_set 124 | self.step_queue = None 125 | self.step_queue_stack = [] 126 | self.sub_program = {} 127 | self.register_subprogram(program_set.MOVE_PTR, self.pg_primitive) 128 | self.register_subprogram(program_set.WRITE , self.pg_primitive) 129 | self.register_subprogram(program_set.ADD , self.pg_add) 130 | self.register_subprogram(program_set.ADD1 , self.pg_add1) 131 | self.register_subprogram(program_set.CARRY , self.pg_carry) 132 | self.register_subprogram(program_set.LSHIFT , self.pg_lshift) 133 | self.register_subprogram(program_set.RSHIFT , self.pg_rshift) 134 | 135 | def reset(self): 136 | super(AdditionTeacher, self).reset() 137 | self.step_queue_stack = [] 138 | self.step_queue = None 139 | 140 | def register_subprogram(self, pg, method): 141 | self.sub_program[pg.program_id] = method 142 | 143 | @staticmethod 144 | def decode_params(env_observation: np.ndarray, arguments: IntegerArguments): 145 | return env_observation.argmax(axis=1), arguments.decode_all() 146 | 147 | def enter_function(self): 148 | self.step_queue_stack.append(self.step_queue or []) 149 | self.step_queue = None 150 | 151 | def exit_function(self): 152 | self.step_queue = 
self.step_queue_stack.pop() 153 | 154 | def step(self, env_observation: np.ndarray, pg: Program, arguments: IntegerArguments) -> StepOutput: 155 | if not self.step_queue: 156 | self.step_queue = self.sub_program[pg.program_id](env_observation, arguments) 157 | if self.step_queue: 158 | ret = self.convert_for_step_return(self.step_queue[0]) 159 | self.step_queue = self.step_queue[1:] 160 | else: 161 | ret = StepOutput(PG_RETURN, None, None) 162 | return ret 163 | 164 | @staticmethod 165 | def convert_for_step_return(step_values: tuple) -> StepOutput: 166 | if len(step_values) == 2: 167 | return StepOutput(PG_CONTINUE, step_values[0], IntegerArguments(step_values[1])) 168 | else: 169 | return StepOutput(step_values[0], step_values[1], IntegerArguments(step_values[2])) 170 | 171 | @staticmethod 172 | def pg_primitive(env_observation: np.ndarray, arguments: IntegerArguments): 173 | return None 174 | 175 | def pg_add(self, env_observation: np.ndarray, arguments: IntegerArguments): 176 | ret = [] 177 | (in1, in2, carry, output), (a1, a2, a3) = self.decode_params(env_observation, arguments) 178 | if in1 == 0 and in2 == 0 and carry == 0: 179 | return None 180 | ret.append((self.pg_set.ADD1, None)) 181 | ret.append((self.pg_set.LSHIFT, None)) 182 | return ret 183 | 184 | def pg_add1(self, env_observation: np.ndarray, arguments: IntegerArguments): 185 | ret = [] 186 | p = self.pg_set 187 | (in1, in2, carry, output), (a1, a2, a3) = self.decode_params(env_observation, arguments) 188 | result = self.sum_ch_list([in1, in2, carry]) 189 | ret.append((p.WRITE, (p.WRITE.WRITE_TO_OUTPUT, result % 10))) 190 | if result > 9: 191 | ret.append((p.CARRY, None)) 192 | ret[-1] = (PG_RETURN, ret[-1][0], ret[-1][1]) 193 | return ret 194 | 195 | @staticmethod 196 | def sum_ch_list(ch_list): 197 | ret = 0 198 | for ch in ch_list: 199 | if ch > 0: 200 | ret += ch - 1 201 | return ret 202 | 203 | def pg_carry(self, env_observation: np.ndarray, arguments: IntegerArguments): 204 | ret = [] 205 | p = self.pg_set 206 | ret.append((p.MOVE_PTR, (p.MOVE_PTR.PTR_CARRY, p.MOVE_PTR.TO_LEFT))) 207 | ret.append((p.WRITE, (p.WRITE.WRITE_TO_CARRY, 1))) 208 | ret.append((PG_RETURN, p.MOVE_PTR, (p.MOVE_PTR.PTR_CARRY, p.MOVE_PTR.TO_RIGHT))) 209 | return ret 210 | 211 | def pg_lshift(self, env_observation: np.ndarray, arguments: IntegerArguments): 212 | ret = [] 213 | p = self.pg_set 214 | ret.append((p.MOVE_PTR, (p.MOVE_PTR.PTR_IN1, p.MOVE_PTR.TO_LEFT))) 215 | ret.append((p.MOVE_PTR, (p.MOVE_PTR.PTR_IN2, p.MOVE_PTR.TO_LEFT))) 216 | ret.append((p.MOVE_PTR, (p.MOVE_PTR.PTR_CARRY, p.MOVE_PTR.TO_LEFT))) 217 | ret.append((PG_RETURN, p.MOVE_PTR, (p.MOVE_PTR.PTR_OUT, p.MOVE_PTR.TO_LEFT))) 218 | return ret 219 | 220 | def pg_rshift(self, env_observation: np.ndarray, arguments: IntegerArguments): 221 | ret = [] 222 | p = self.pg_set 223 | ret.append((p.MOVE_PTR, (p.MOVE_PTR.PTR_IN1, p.MOVE_PTR.TO_RIGHT))) 224 | ret.append((p.MOVE_PTR, (p.MOVE_PTR.PTR_IN2, p.MOVE_PTR.TO_RIGHT))) 225 | ret.append((p.MOVE_PTR, (p.MOVE_PTR.PTR_CARRY, p.MOVE_PTR.TO_RIGHT))) 226 | ret.append((PG_RETURN, p.MOVE_PTR, (p.MOVE_PTR.PTR_OUT, p.MOVE_PTR.TO_RIGHT))) 227 | return ret 228 | 229 | 230 | def create_char_map(): 231 | char_map = dict((i+1, "%s" % i) for i in range(10)) 232 | char_map[0] = ' ' 233 | return char_map 234 | 235 | 236 | def create_questions(num=100, max_number=10000): 237 | questions = [] 238 | for in1 in range(10): 239 | for in2 in range(10): 240 | questions.append(dict(in1=in1, in2=in2)) 241 | 242 | for _ in range(100): 243 | 
questions.append(dict(in1=int(random() * 100), in2=int(random() * 100))) 244 | 245 | for _ in range(100): 246 | questions.append(dict(in1=int(random() * 1000), in2=int(random() * 1000))) 247 | 248 | questions += [ 249 | dict(in1=104, in2=902), 250 | ] 251 | 252 | questions += create_random_questions(num=num, max_number=max_number) 253 | return questions 254 | 255 | 256 | def create_random_questions(num=100, max_number=10000): 257 | questions = [] 258 | for _ in range(num): 259 | questions.append(dict(in1=int(random() * max_number), in2=int(random() * max_number))) 260 | return questions 261 | 262 | 263 | def run_npi(addition_env, npi_runner, program, data): 264 | data['expect'] = data['in1'] + data['in2'] 265 | 266 | addition_env.setup_problem(data['in1'], data['in2']) 267 | 268 | npi_runner.reset() 269 | npi_runner.display_env(addition_env, force=True) 270 | npi_runner.npi_program_interface(addition_env, program, IntegerArguments()) 271 | 272 | data['result'] = addition_env.get_output() 273 | data['correct'] = data['result'] == data['expect'] 274 | -------------------------------------------------------------------------------- /src/npi/add/model.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # coding: utf-8 3 | import os 4 | from collections import Counter 5 | from copy import copy 6 | 7 | import math 8 | import numpy as np 9 | from keras.engine.topology import Merge, InputLayer 10 | from keras.engine.training import Model 11 | from keras.layers.core import Dense, Activation, RepeatVector, MaxoutDense 12 | from keras.layers.embeddings import Embedding 13 | from keras.layers.recurrent import LSTM 14 | from keras.models import Sequential, model_from_yaml 15 | from keras.optimizers import Adam 16 | from keras.regularizers import l1, l2 17 | from keras.utils.visualize_util import plot 18 | 19 | from npi.add.config import FIELD_ROW, FIELD_DEPTH, PROGRAM_VEC_SIZE, PROGRAM_KEY_VEC_SIZE, FIELD_WIDTH 20 | from npi.add.lib import AdditionProgramSet, AdditionEnv, run_npi, create_questions, AdditionTeacher, \ 21 | create_random_questions 22 | from npi.core import NPIStep, Program, IntegerArguments, StepOutput, RuntimeSystem, PG_RETURN, StepInOut, StepInput, \ 23 | to_one_hot_array 24 | from npi.terminal_core import TerminalNPIRunner 25 | 26 | __author__ = 'k_morishita' 27 | 28 | 29 | class AdditionNPIModel(NPIStep): 30 | model = None 31 | f_enc = None 32 | 33 | def __init__(self, system: RuntimeSystem, model_path: str=None, program_set: AdditionProgramSet=None): 34 | self.system = system 35 | self.model_path = model_path 36 | self.program_set = program_set 37 | self.batch_size = 1 38 | self.build() 39 | self.weight_loaded = False 40 | self.load_weights() 41 | 42 | def build(self): 43 | enc_size = self.size_of_env_observation() 44 | argument_size = IntegerArguments.size_of_arguments 45 | input_enc = InputLayer(batch_input_shape=(self.batch_size, enc_size), name='input_enc') 46 | input_arg = InputLayer(batch_input_shape=(self.batch_size, argument_size), name='input_arg') 47 | input_prg = Embedding(input_dim=PROGRAM_VEC_SIZE, output_dim=PROGRAM_KEY_VEC_SIZE, input_length=1, 48 | batch_input_shape=(self.batch_size, 1)) 49 | 50 | f_enc = Sequential(name='f_enc') 51 | f_enc.add(Merge([input_enc, input_arg], mode='concat')) 52 | f_enc.add(MaxoutDense(128, nb_feature=4)) 53 | self.f_enc = f_enc 54 | 55 | program_embedding = Sequential(name='program_embedding') 56 | program_embedding.add(input_prg) 57 | 58 | f_enc_convert = 
Sequential(name='f_enc_convert') 59 | f_enc_convert.add(f_enc) 60 | f_enc_convert.add(RepeatVector(1)) 61 | 62 | f_lstm = Sequential(name='f_lstm') 63 | f_lstm.add(Merge([f_enc_convert, program_embedding], mode='concat')) 64 | f_lstm.add(LSTM(256, return_sequences=False, stateful=True, W_regularizer=l2(0.0000001))) 65 | f_lstm.add(Activation('relu', name='relu_lstm_1')) 66 | f_lstm.add(RepeatVector(1)) 67 | f_lstm.add(LSTM(256, return_sequences=False, stateful=True, W_regularizer=l2(0.0000001))) 68 | f_lstm.add(Activation('relu', name='relu_lstm_2')) 69 | # plot(f_lstm, to_file='f_lstm.png', show_shapes=True) 70 | 71 | f_end = Sequential(name='f_end') 72 | f_end.add(f_lstm) 73 | f_end.add(Dense(1, W_regularizer=l2(0.001))) 74 | f_end.add(Activation('sigmoid', name='sigmoid_end')) 75 | 76 | f_prog = Sequential(name='f_prog') 77 | f_prog.add(f_lstm) 78 | f_prog.add(Dense(PROGRAM_KEY_VEC_SIZE, activation="relu")) 79 | f_prog.add(Dense(PROGRAM_VEC_SIZE, W_regularizer=l2(0.0001))) 80 | f_prog.add(Activation('softmax', name='softmax_prog')) 81 | # plot(f_prog, to_file='f_prog.png', show_shapes=True) 82 | 83 | f_args = [] 84 | for ai in range(1, IntegerArguments.max_arg_num+1): 85 | f_arg = Sequential(name='f_arg%s' % ai) 86 | f_arg.add(f_lstm) 87 | f_arg.add(Dense(IntegerArguments.depth, W_regularizer=l2(0.0001))) 88 | f_arg.add(Activation('softmax', name='softmax_arg%s' % ai)) 89 | f_args.append(f_arg) 90 | # plot(f_arg, to_file='f_arg.png', show_shapes=True) 91 | 92 | self.model = Model([input_enc.input, input_arg.input, input_prg.input], 93 | [f_end.output, f_prog.output] + [fa.output for fa in f_args], 94 | name="npi") 95 | self.compile_model() 96 | plot(self.model, to_file='model.png', show_shapes=True) 97 | 98 | def reset(self): 99 | super(AdditionNPIModel, self).reset() 100 | for l in self.model.layers: 101 | if type(l) is LSTM: 102 | l.reset_states() 103 | 104 | def compile_model(self, lr=0.0001, arg_weight=1.): 105 | arg_num = IntegerArguments.max_arg_num 106 | optimizer = Adam(lr=lr) 107 | loss = ['binary_crossentropy', 'categorical_crossentropy'] + ['categorical_crossentropy'] * arg_num 108 | self.model.compile(optimizer=optimizer, loss=loss, loss_weights=[0.25, 0.25] + [arg_weight] * arg_num) 109 | 110 | def fit(self, steps_list, epoch=3000): 111 | """ 112 | 113 | :param int epoch: 114 | :param typing.List[typing.Dict[q=dict, steps=typing.List[StepInOut]]] steps_list: 115 | :return: 116 | """ 117 | 118 | def filter_question(condition_func): 119 | sub_steps_list = [] 120 | for steps_dict in steps_list: 121 | question = steps_dict['q'] 122 | if condition_func(question['in1'], question['in2']): 123 | sub_steps_list.append(steps_dict) 124 | return sub_steps_list 125 | 126 | # self.print_weights() 127 | if not self.weight_loaded: 128 | self.train_f_enc(filter_question(lambda a, b: 10 <= a < 100 and 10 <= b < 100), epoch=100) 129 | self.f_enc.trainable = False 130 | 131 | self.update_learning_rate(0.0001) 132 | 133 | # q_type = "training questions of a+b < 10" 134 | # print(q_type) 135 | # pr = 0.8 136 | # all_ok = self.fit_to_subset(filter_question(lambda a, b: a+b < 10), pass_rate=pr) 137 | # print("%s is pass_rate >= %s: %s" % (q_type, pr, all_ok)) 138 | # 139 | # q_type = "training questions of a<10 and b< 10 and 10 <= a+b" 140 | # print(q_type) 141 | # pr = 0.8 142 | # all_ok = self.fit_to_subset(filter_question(lambda a, b: a<10 and b<10 and a + b >= 10), pass_rate=pr) 143 | # print("%s is pass_rate >= %s: %s" % (q_type, pr, all_ok)) 144 | # 145 | # q_type = "training questions of 
a<10 and b<10" 146 | # print(q_type) 147 | # pr = 0.8 148 | # all_ok = self.fit_to_subset(filter_question(lambda a, b: a < 10 and b < 10), pass_rate=pr) 149 | # print("%s is pass_rate >= %s: %s" % (q_type, pr, all_ok)) 150 | 151 | q_type = "training questions of a<100 and b<100" 152 | print(q_type) 153 | pr = 0.8 154 | all_ok = self.fit_to_subset(filter_question(lambda a, b: a < 100 and b < 100), pass_rate=pr) 155 | print("%s is pass_rate >= %s: %s" % (q_type, pr, all_ok)) 156 | 157 | while True: 158 | if self.test_and_learn([10, 100, 1000]): 159 | break 160 | 161 | q_type = "training questions of ALL" 162 | print(q_type) 163 | 164 | q_num = 100 165 | skip_correct = False 166 | pr = 1.0 167 | questions = filter_question(lambda a, b: True) 168 | np.random.shuffle(questions) 169 | questions = questions[:q_num] 170 | all_ok = self.fit_to_subset(questions, pass_rate=pr, skip_correct=skip_correct) 171 | print("%s is pass_rate >= %s: %s" % (q_type, pr, all_ok)) 172 | 173 | def fit_to_subset(self, steps_list, pass_rate=1.0, skip_correct=False): 174 | for i in range(10): 175 | all_ok = self.do_learn(steps_list, 100, pass_rate=pass_rate, skip_correct=skip_correct) 176 | if all_ok: 177 | return True 178 | return False 179 | 180 | def test_and_learn(self, num_questions): 181 | for num in num_questions: 182 | print("test all type of %d questions" % num) 183 | cc, wc, wrong_questions = self.test_to_subset(create_random_questions(num)) 184 | acc_rate = cc/(cc+wc) 185 | print("Accuracy %s(OK=%d, NG=%d)" % (acc_rate, cc, wc)) 186 | if wc > 0: 187 | self.fit_to_subset(wrong_questions, pass_rate=1.0, skip_correct=False) 188 | return False 189 | return True 190 | 191 | def test_to_subset(self, questions): 192 | addition_env = AdditionEnv(FIELD_ROW, FIELD_WIDTH, FIELD_DEPTH) 193 | teacher = AdditionTeacher(self.program_set) 194 | npi_runner = TerminalNPIRunner(None, self) 195 | teacher_runner = TerminalNPIRunner(None, teacher) 196 | correct_count = wrong_count = 0 197 | wrong_steps_list = [] 198 | for idx, question in enumerate(questions): 199 | question = copy(question) 200 | if self.question_test(addition_env, npi_runner, question): 201 | correct_count += 1 202 | else: 203 | self.question_test(addition_env, teacher_runner, question) 204 | wrong_steps_list.append({"q": question, "steps": teacher_runner.step_list}) 205 | wrong_count += 1 206 | return correct_count, wrong_count, wrong_steps_list 207 | 208 | @staticmethod 209 | def dict_to_str(d): 210 | return str(tuple([(k, d[k]) for k in sorted(d)])) 211 | 212 | def do_learn(self, steps_list, epoch, pass_rate=1.0, skip_correct=False): 213 | addition_env = AdditionEnv(FIELD_ROW, FIELD_WIDTH, FIELD_DEPTH) 214 | npi_runner = TerminalNPIRunner(None, self) 215 | last_weights = None 216 | correct_count = Counter() 217 | no_change_count = 0 218 | last_loss = 1000 219 | for ep in range(1, epoch+1): 220 | correct_new = wrong_new = 0 221 | losses = [] 222 | ok_rate = [] 223 | np.random.shuffle(steps_list) 224 | for idx, steps_dict in enumerate(steps_list): 225 | question = copy(steps_dict['q']) 226 | question_key = self.dict_to_str(question) 227 | if self.question_test(addition_env, npi_runner, question): 228 | if correct_count[question_key] == 0: 229 | correct_new += 1 230 | correct_count[question_key] += 1 231 | print("GOOD!: ep=%2d idx=%3d :%s CorrectCount=%s" % (ep, idx, self.dict_to_str(question), correct_count[question_key])) 232 | ok_rate.append(1) 233 | cc = correct_count[question_key] 234 | if skip_correct or int(math.sqrt(cc)) ** 2 != cc: 235 | continue 236 | 
else: 237 | ok_rate.append(0) 238 | if correct_count[question_key] > 0: 239 | print("Degraded: ep=%2d idx=%3d :%s CorrectCount=%s -> 0" % (ep, idx, self.dict_to_str(question), correct_count[question_key])) 240 | correct_count[question_key] = 0 241 | wrong_new += 1 242 | 243 | steps = steps_dict['steps'] 244 | xs = [] 245 | ys = [] 246 | ws = [] 247 | for step in steps: 248 | xs.append(self.convert_input(step.input)) 249 | y, w = self.convert_output(step.output) 250 | ys.append(y) 251 | ws.append(w) 252 | 253 | self.reset() 254 | 255 | for i, (x, y, w) in enumerate(zip(xs, ys, ws)): 256 | loss = self.model.train_on_batch(x, y, sample_weight=w) 257 | if not np.isfinite(loss): 258 | print("Loss is not finite!, Last Input=%s" % ([i, (x, y, w)])) 259 | self.print_weights(last_weights, detail=True) 260 | raise RuntimeError("Loss is not finite!") 261 | losses.append(loss) 262 | last_weights = self.model.get_weights() 263 | if losses: 264 | cur_loss = np.average(losses) 265 | print("ep=%2d: ok_rate=%.2f%% (+%s -%s): ave loss %s (%s samples)" % 266 | (ep, np.average(ok_rate)*100, correct_new, wrong_new, cur_loss, len(steps_list))) 267 | # self.print_weights() 268 | if correct_new + wrong_new == 0: 269 | no_change_count += 1 270 | else: 271 | no_change_count = 0 272 | 273 | if math.fabs(1 - cur_loss/last_loss) < 0.001 and no_change_count > 5: 274 | print("math.fabs(1 - cur_loss/last_loss) < 0.001 and no_change_count > 5:") 275 | return False 276 | last_loss = cur_loss 277 | print("=" * 80) 278 | self.save() 279 | if np.average(ok_rate) >= pass_rate: 280 | return True 281 | return False 282 | 283 | def update_learning_rate(self, learning_rate, arg_weight=1.): 284 | print("Re-Compile Model lr=%s aw=%s" % (learning_rate, arg_weight)) 285 | self.compile_model(learning_rate, arg_weight=arg_weight) 286 | 287 | def train_f_enc(self, steps_list, epoch=50): 288 | print("training f_enc") 289 | f_add0 = Sequential(name='f_add0') 290 | f_add0.add(self.f_enc) 291 | f_add0.add(Dense(FIELD_DEPTH)) 292 | f_add0.add(Activation('softmax', name='softmax_add0')) 293 | 294 | f_add1 = Sequential(name='f_add1') 295 | f_add1.add(self.f_enc) 296 | f_add1.add(Dense(FIELD_DEPTH)) 297 | f_add1.add(Activation('softmax', name='softmax_add1')) 298 | 299 | env_model = Model(self.f_enc.inputs, [f_add0.output, f_add1.output], name="env_model") 300 | env_model.compile(optimizer='adam', loss=['categorical_crossentropy']*2) 301 | 302 | for ep in range(epoch): 303 | losses = [] 304 | for idx, steps_dict in enumerate(steps_list): 305 | prev = None 306 | for step in steps_dict['steps']: 307 | x = self.convert_input(step.input)[:2] 308 | env_values = step.input.env.reshape((4, -1)) 309 | in1 = np.clip(env_values[0].argmax() - 1, 0, 9) 310 | in2 = np.clip(env_values[1].argmax() - 1, 0, 9) 311 | carry = np.clip(env_values[2].argmax() - 1, 0, 9) 312 | y_num = in1 + in2 + carry 313 | now = (in1, in2, carry) 314 | if prev == now: 315 | continue 316 | prev = now 317 | y0 = to_one_hot_array((y_num % 10)+1, FIELD_DEPTH) 318 | y1 = to_one_hot_array((y_num // 10)+1, FIELD_DEPTH) 319 | y = [yy.reshape((self.batch_size, -1)) for yy in [y0, y1]] 320 | loss = env_model.train_on_batch(x, y) 321 | losses.append(loss) 322 | print("ep %3d: loss=%s" % (ep, np.average(losses))) 323 | if np.average(losses) < 1e-06: 324 | break 325 | 326 | def question_test(self, addition_env, npi_runner, question): 327 | addition_env.reset() 328 | self.reset() 329 | try: 330 | run_npi(addition_env, npi_runner, self.program_set.ADD, question) 331 | if question['correct']: 332 | 
return True 333 | except StopIteration: 334 | pass 335 | return False 336 | 337 | def convert_input(self, p_in: StepInput): 338 | x_pg = np.array((p_in.program.program_id,)) 339 | x = [xx.reshape((self.batch_size, -1)) for xx in (p_in.env, p_in.arguments.values, x_pg)] 340 | return x 341 | 342 | def convert_output(self, p_out: StepOutput): 343 | y = [np.array((p_out.r,))] 344 | weights = [[1.]] 345 | if p_out.program: 346 | arg_values = p_out.arguments.values 347 | arg_num = len(p_out.program.args or []) 348 | y += [p_out.program.to_one_hot(PROGRAM_VEC_SIZE)] 349 | weights += [[1.]] 350 | else: 351 | arg_values = IntegerArguments().values 352 | arg_num = 0 353 | y += [np.zeros((PROGRAM_VEC_SIZE, ))] 354 | weights += [[1e-10]] 355 | 356 | for v in arg_values: # split by each args 357 | y += [v] 358 | weights += [[1.]] * arg_num + [[1e-10]] * (len(arg_values) - arg_num) 359 | weights = [np.array(w) for w in weights] 360 | return [yy.reshape((self.batch_size, -1)) for yy in y], weights 361 | 362 | def step(self, env_observation: np.ndarray, pg: Program, arguments: IntegerArguments) -> StepOutput: 363 | x = self.convert_input(StepInput(env_observation, pg, arguments)) 364 | results = self.model.predict(x, batch_size=1) # if batch_size==1, returns single row 365 | 366 | r, pg_one_hot, arg_values = results[0], results[1], results[2:] 367 | program = self.program_set.get(pg_one_hot.argmax()) 368 | ret = StepOutput(r, program, IntegerArguments(values=np.stack(arg_values))) 369 | return ret 370 | 371 | def save(self): 372 | self.model.save_weights(self.model_path, overwrite=True) 373 | 374 | def load_weights(self): 375 | if os.path.exists(self.model_path): 376 | self.model.load_weights(self.model_path) 377 | self.weight_loaded = True 378 | 379 | def print_weights(self, weights=None, detail=False): 380 | weights = weights or self.model.get_weights() 381 | for w in weights: 382 | print("w%s: sum(w)=%s, ave(w)=%s" % (w.shape, np.sum(w), np.average(w))) 383 | if detail: 384 | for w in weights: 385 | print("%s: %s" % (w.shape, w)) 386 | 387 | @staticmethod 388 | def size_of_env_observation(): 389 | return FIELD_ROW * FIELD_DEPTH 390 | -------------------------------------------------------------------------------- /src/npi/add/test_model.py: -------------------------------------------------------------------------------- 1 | # coding: utf-8 2 | import curses 3 | import os 4 | import pickle 5 | 6 | from npi.add.config import FIELD_ROW, FIELD_WIDTH, FIELD_DEPTH 7 | from npi.add.lib import AdditionEnv, AdditionProgramSet, AdditionTeacher, create_char_map, create_questions, run_npi 8 | from npi.add.model import AdditionNPIModel 9 | from npi.core import ResultLogger, RuntimeSystem 10 | from npi.terminal_core import TerminalNPIRunner, Terminal 11 | 12 | 13 | def main(stdscr, model_path: str, num: int, result_logger: ResultLogger): 14 | terminal = Terminal(stdscr, create_char_map()) 15 | terminal.init_window(FIELD_WIDTH, FIELD_ROW) 16 | program_set = AdditionProgramSet() 17 | addition_env = AdditionEnv(FIELD_ROW, FIELD_WIDTH, FIELD_DEPTH) 18 | 19 | questions = create_questions(num) 20 | if DEBUG_MODE: 21 | questions = questions[-num:] 22 | system = RuntimeSystem(terminal=terminal) 23 | npi_model = AdditionNPIModel(system, model_path, program_set) 24 | npi_runner = TerminalNPIRunner(terminal, npi_model, recording=False) 25 | npi_runner.verbose = DEBUG_MODE 26 | correct_count = wrong_count = 0 27 | for data in questions: 28 | addition_env.reset() 29 | run_npi(addition_env, npi_runner, program_set.ADD, data) 30 | 
result_logger.write(data) 31 | terminal.add_log(data) 32 | if data['correct']: 33 | correct_count += 1 34 | else: 35 | wrong_count += 1 36 | return correct_count, wrong_count 37 | 38 | 39 | if __name__ == '__main__': 40 | import sys 41 | DEBUG_MODE = os.environ.get('DEBUG') 42 | model_path_ = sys.argv[1] 43 | num_data = int(sys.argv[2]) if len(sys.argv) > 2 else 1000 44 | log_filename = sys.argv[3] if len(sys.argv) > 3 else 'result.log' 45 | cc, wc = curses.wrapper(main, model_path_, num_data, ResultLogger(log_filename)) 46 | print("Accuracy %s(OK=%d, NG=%d)" % (cc/(cc+wc), cc, wc)) 47 | -------------------------------------------------------------------------------- /src/npi/add/training_model.py: -------------------------------------------------------------------------------- 1 | # coding: utf-8 2 | import os 3 | import pickle 4 | 5 | from npi.add.config import FIELD_ROW, FIELD_WIDTH, FIELD_DEPTH 6 | from npi.add.lib import AdditionEnv, AdditionProgramSet, AdditionTeacher, create_char_map, create_questions, run_npi 7 | from npi.add.model import AdditionNPIModel 8 | from npi.core import ResultLogger, RuntimeSystem 9 | from npi.terminal_core import TerminalNPIRunner, Terminal 10 | 11 | 12 | def main(filename: str, model_path: str): 13 | system = RuntimeSystem() 14 | program_set = AdditionProgramSet() 15 | 16 | with open(filename, 'rb') as f: 17 | steps_list = pickle.load(f) 18 | 19 | npi_model = AdditionNPIModel(system, model_path, program_set) 20 | npi_model.fit(steps_list) 21 | 22 | 23 | if __name__ == '__main__': 24 | import sys 25 | DEBUG_MODE = os.environ.get('DEBUG') 26 | train_filename = sys.argv[1] 27 | model_output = sys.argv[2] 28 | main(train_filename, model_output) 29 | 30 | -------------------------------------------------------------------------------- /src/npi/core.py: -------------------------------------------------------------------------------- 1 | # coding: utf-8 2 | 3 | import json 4 | from copy import copy 5 | 6 | import numpy as np 7 | 8 | MAX_ARG_NUM = 3 9 | ARG_DEPTH = 10 # 0~9 digit. one-hot. 
10 | 11 | PG_CONTINUE = 0 12 | PG_RETURN = 1 13 | 14 | 15 | class IntegerArguments: 16 | depth = ARG_DEPTH 17 | max_arg_num = MAX_ARG_NUM 18 | size_of_arguments = depth * max_arg_num 19 | 20 | def __init__(self, args: list=None, values: np.ndarray=None): 21 | if values is not None: 22 | self.values = values.reshape((self.max_arg_num, self.depth)) 23 | else: 24 | self.values = np.zeros((self.max_arg_num, self.depth), dtype=np.float32) 25 | 26 | if args: 27 | for i, v in enumerate(args): 28 | self.update_to(i, v) 29 | 30 | def copy(self): 31 | obj = IntegerArguments() 32 | obj.values = np.copy(self.values) 33 | return obj 34 | 35 | def decode_all(self): 36 | return [self.decode_at(i) for i in range(len(self.values))] 37 | 38 | def decode_at(self, index: int) -> int: 39 | return self.values[index].argmax() 40 | 41 | def update_to(self, index: int, integer: int): 42 | self.values[index] = 0 43 | self.values[index, int(np.clip(integer, 0, self.depth-1))] = 1 44 | 45 | def __str__(self): 46 | return "<IntegerArguments: %s>" % self.decode_all() 47 | 48 | 49 | class Program: 50 | output_to_env = False 51 | 52 | def __init__(self, name, *args): 53 | self.name = name 54 | self.args = args 55 | self.program_id = None 56 | 57 | def description_with_args(self, args: IntegerArguments) -> str: 58 | int_args = args.decode_all() 59 | return "%s(%s)" % (self.name, ", ".join([str(x) for x in int_args])) 60 | 61 | def to_one_hot(self, size, dtype=np.float): 62 | ret = np.zeros((size,), dtype=dtype) 63 | ret[self.program_id] = 1 64 | return ret 65 | 66 | def do(self, env, args: IntegerArguments): 67 | raise NotImplementedError() 68 | 69 | def __str__(self): 70 | return "<Program: %s>" % self.name 71 | 72 | 73 | class StepInput: 74 | def __init__(self, env: np.ndarray, program: Program, arguments: IntegerArguments): 75 | self.env = env 76 | self.program = program 77 | self.arguments = arguments 78 | 79 | 80 | class StepOutput: 81 | def __init__(self, r: float, program: Program=None, arguments: IntegerArguments=None): 82 | self.r = r 83 | self.program = program 84 | self.arguments = arguments 85 | 86 | def __str__(self): 87 | return "<StepOutput: r=%s, program=%s, arguments=%s>" % (self.r, self.program, self.arguments) 88 | 89 | 90 | class StepInOut: 91 | def __init__(self, input: StepInput, output: StepOutput): 92 | self.input = input 93 | self.output = output 94 | 95 | 96 | class ResultLogger: 97 | def __init__(self, filename): 98 | self.filename = filename 99 | 100 | def write(self, obj): 101 | with open(self.filename, "a") as f: 102 | json.dump(obj, f) 103 | f.write("\n") 104 | 105 | 106 | class NPIStep: 107 | def reset(self): 108 | pass 109 | 110 | def enter_function(self): 111 | pass 112 | 113 | def exit_function(self): 114 | pass 115 | 116 | def step(self, env_observation: np.ndarray, pg: Program, arguments: IntegerArguments) -> StepOutput: 117 | raise NotImplementedError() 118 | 119 | 120 | class RuntimeSystem: 121 | def __init__(self, terminal=None): 122 | self.terminal = terminal 123 | 124 | def logging(self, message): 125 | if self.terminal: 126 | self.terminal.add_log(message) 127 | else: 128 | print(message) 129 | 130 | 131 | def to_one_hot_array(idx, size, dtype=np.int8): 132 | ret = np.zeros((size, ), dtype=dtype) 133 | ret[idx] = 1 134 | return ret 135 | -------------------------------------------------------------------------------- /src/npi/terminal_core.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # coding: utf-8 3 | import curses 4 | import numpy as np 5 | 6 | from npi.core import Program,
IntegerArguments, NPIStep, StepOutput, StepInput, StepInOut 7 | 8 | __author__ = 'k_morishita' 9 | 10 | 11 | class Screen: 12 | data = None 13 | 14 | def __init__(self, height, width): 15 | self.height = height 16 | self.width = width 17 | self.init_screen() 18 | 19 | def init_screen(self): 20 | self.data = np.zeros([self.height, self.width], dtype=np.int8) 21 | 22 | def fill(self, ch): 23 | self.data.fill(ch) 24 | 25 | def as_float32(self): 26 | return self.data.astype(np.float32) 27 | 28 | def __setitem__(self, key, value): 29 | self.data[key] = value 30 | 31 | def __getitem__(self, item): 32 | return self.data[item] 33 | 34 | 35 | class Terminal: 36 | W_TOP = 1 37 | W_LEFT = 1 38 | LOG_WINDOW_HEIGHT = 10 39 | LOG_WINDOW_WIDTH = 80 40 | INFO_WINDOW_HEIGHT = 10 41 | INFO_WINDOW_WIDTH = 40 42 | 43 | main_window = None 44 | info_window = None 45 | log_window = None 46 | 47 | def __init__(self, stdscr, char_map=None): 48 | print(type(stdscr)) 49 | self.stdscr = stdscr 50 | self.char_map = char_map or dict((ch, chr(ch)) for ch in range(128)) 51 | self.log_list = [] 52 | 53 | def init_window(self, width, height): 54 | curses.curs_set(0) 55 | border_win = curses.newwin(height + 2, width + 2, self.W_TOP, self.W_LEFT) # h, w, y, x 56 | border_win.box() 57 | self.stdscr.refresh() 58 | border_win.refresh() 59 | self.main_window = curses.newwin(height, width, self.W_TOP + 1, self.W_LEFT + 1) 60 | self.main_window.refresh() 61 | self.main_window.timeout(1) 62 | self.info_window = curses.newwin(self.INFO_WINDOW_HEIGHT, self.INFO_WINDOW_WIDTH, 63 | self.W_TOP + 1, self.W_LEFT + width + 2) 64 | self.log_window = curses.newwin(self.LOG_WINDOW_HEIGHT, self.LOG_WINDOW_WIDTH, 65 | self.W_TOP + max(height, self.INFO_WINDOW_HEIGHT) + 5, self.W_LEFT) 66 | self.log_window.refresh() 67 | 68 | def wait_for_key(self): 69 | self.stdscr.getch() 70 | 71 | def update_main_screen(self, screen): 72 | for y in range(screen.height): 73 | line = "".join([self.char_map[ch] for ch in screen[y]]) 74 | self.ignore_error_add_str(self.main_window, y, 0, line) 75 | 76 | def update_main_window_attr(self, screen, y, x, attr): 77 | ch = screen[y, x] 78 | self.ignore_error_add_str(self.main_window, y, x, self.char_map[ch], attr) 79 | 80 | def refresh_main_window(self): 81 | self.main_window.refresh() 82 | 83 | def update_info_screen(self, info_list): 84 | self.info_window.clear() 85 | for i, info_str in enumerate(info_list): 86 | self.info_window.addstr(i, 2, info_str) 87 | self.info_window.refresh() 88 | 89 | def add_log(self, line): 90 | self.log_list.insert(0, str(line)[:self.LOG_WINDOW_WIDTH]) 91 | self.log_list = self.log_list[:self.LOG_WINDOW_HEIGHT-1] 92 | self.log_window.clear() 93 | for i, line in enumerate(self.log_list): 94 | line = str(line) + " " * (self.LOG_WINDOW_WIDTH - len(str(line))) 95 | self.log_window.addstr(i, 0, line) 96 | self.log_window.refresh() 97 | 98 | @staticmethod 99 | def ignore_error_add_str(win, y, x, s, attr=curses.A_NORMAL): 100 | """Writing to the bottom-right cell raises a curses exception, but apparently the proper etiquette is to just silently ignore it.""" 101 | try: 102 | win.addstr(y, x, s, attr) 103 | except curses.error: 104 | pass 105 | 106 | 107 | def show_env_to_terminal(terminal, env): 108 | terminal.update_main_screen(env.screen) 109 | for i, p in enumerate(env.pointers): 110 | terminal.update_main_window_attr(env.screen, i, p, curses.A_REVERSE) 111 | terminal.refresh_main_window() 112 | 113 | 114 | class TerminalNPIRunner: 115 | def __init__(self, terminal: Terminal, model: NPIStep=None, recording=True, max_depth=10, max_step=1000): 116 | self.terminal = terminal 117 |
self.model = model 118 | self.steps = 0 119 | self.step_list = [] 120 | self.alpha = 0.5 121 | self.verbose = True 122 | self.recording = recording 123 | self.max_depth = max_depth 124 | self.max_step = max_step 125 | 126 | def reset(self): 127 | self.steps = 0 128 | self.step_list = [] 129 | self.model.reset() 130 | 131 | def display_env(self, env, force=False): 132 | if (self.verbose or force) and self.terminal: 133 | show_env_to_terminal(self.terminal, env) 134 | 135 | def display_information(self, program: Program, arguments: IntegerArguments, result: StepOutput, depth: int): 136 | if self.verbose and self.terminal: 137 | information = [ 138 | "Step %2d Depth: %2d" % (self.steps, depth), 139 | program.description_with_args(arguments), 140 | 'r=%.2f' % result.r, 141 | ] 142 | if result.program: 143 | information.append("-> %s" % result.program.description_with_args(result.arguments)) 144 | self.terminal.update_info_screen(information) 145 | self.wait() 146 | 147 | def npi_program_interface(self, env, program: Program, arguments: IntegerArguments, depth=0): 148 | if self.max_depth < depth or self.max_step < self.steps: 149 | raise StopIteration() 150 | 151 | self.model.enter_function() 152 | 153 | result = StepOutput(0, None, None) 154 | while result.r < self.alpha: 155 | self.steps += 1 156 | if self.max_step < self.steps: 157 | raise StopIteration() 158 | 159 | env_observation = env.get_observation() 160 | result = self.model.step(env_observation, program, arguments.copy()) 161 | if self.recording: 162 | self.step_list.append(StepInOut(StepInput(env_observation, program, arguments.copy()), result)) 163 | self.display_information(program, arguments, result, depth) 164 | 165 | if program.output_to_env: 166 | program.do(env, arguments.copy()) 167 | self.display_env(env) 168 | else: 169 | if result.program: # modify original algorithm 170 | self.npi_program_interface(env, result.program, result.arguments, depth=depth+1) 171 | 172 | self.model.exit_function() 173 | 174 | def wait(self): 175 | self.terminal.wait_for_key() 176 | -------------------------------------------------------------------------------- /src/run_create_addition_data.sh: -------------------------------------------------------------------------------- 1 | #!/bin/sh 2 | 3 | THIS_DIR=$(cd $(dirname $0); pwd) 4 | DATA_DIR=${THIS_DIR}/../data 5 | OUTPUT_FILE=${1:-${DATA_DIR}/train.pkl} 6 | LOG=train_result.log 7 | export PYTHONPATH=${THIS_DIR} 8 | cd $THIS_DIR 9 | 10 | mkdir -p "$DATA_DIR" 11 | 12 | rm -f "$LOG" 13 | echo python npi/add/create_training_data.py "$OUTPUT_FILE" 1000 "$LOG" 14 | python npi/add/create_training_data.py "$OUTPUT_FILE" 1000 "$LOG" 15 | 16 | -------------------------------------------------------------------------------- /src/run_test_addition_model.sh: -------------------------------------------------------------------------------- 1 | #!/bin/sh 2 | 3 | THIS_DIR=$(cd $(dirname $0); pwd) 4 | DATA_DIR=${THIS_DIR}/../data 5 | MODEL_OUTPUT=${1:-${DATA_DIR}/addition.model} 6 | NUM_TEST=${2:-100} 7 | 8 | export PYTHONPATH=${THIS_DIR} 9 | cd "$THIS_DIR" 10 | 11 | mkdir -p "$DATA_DIR" 12 | 13 | echo python npi/add/test_model.py "$MODEL_OUTPUT" "$NUM_TEST" 14 | python npi/add/test_model.py "$MODEL_OUTPUT" "$NUM_TEST" 15 | -------------------------------------------------------------------------------- /src/run_train_addition_model.sh: -------------------------------------------------------------------------------- 1 | #!/bin/sh 2 | 3 | THIS_DIR=$(cd $(dirname $0); pwd) 4 | DATA_DIR=${THIS_DIR}/../data 5 | 
TRAIN_DATA=${1:-${DATA_DIR}/train.pkl} 6 | MODEL_OUTPUT=${2:-${DATA_DIR}/addition.model} 7 | 8 | export PYTHONPATH=${THIS_DIR} 9 | cd "$THIS_DIR" 10 | 11 | mkdir -p "$DATA_DIR" 12 | 13 | [ "$NEW_MODEL" != "" ] && rm -f "$MODEL_OUTPUT" 14 | 15 | echo python npi/add/training_model.py "$TRAIN_DATA" "$MODEL_OUTPUT" 16 | time python npi/add/training_model.py "$TRAIN_DATA" "$MODEL_OUTPUT" 17 | 18 | --------------------------------------------------------------------------------
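A quick way to sanity-check a generated training-data file (a minimal sketch under assumptions, not a file from the repository: the helper name `inspect_train_data.py` is hypothetical, `data/train.pkl` is the default path written by `run_create_addition_data.sh`, and `src` must be on `PYTHONPATH` so the pickled `npi` classes can be loaded). As `create_training_data.py` shows, the pickle holds a list of records with `q` (the question dict with `in1`/`in2`) and `steps` (the teacher's recorded `StepInOut` trace):

```
# inspect_train_data.py -- hypothetical helper, not part of the repository above.
# Usage: PYTHONPATH=src python inspect_train_data.py data/train.pkl
import pickle
import sys


def main(path):
    # Load the list of {"q": question_dict, "steps": [StepInOut, ...]} records
    # written by src/npi/add/create_training_data.py.
    with open(path, 'rb') as f:
        steps_list = pickle.load(f)
    print("records: %d" % len(steps_list))
    for record in steps_list[:3]:
        q = record["q"]
        print("in1=%(in1)s in2=%(in2)s" % q)
        print("  teacher trace: %d steps" % len(record["steps"]))


if __name__ == '__main__':
    main(sys.argv[1] if len(sys.argv) > 1 else 'data/train.pkl')
```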