├── README.md
└── keras2caffe.py

/README.md:
--------------------------------------------------------------------------------
# Keras2Caffe
### Keras to Caffe
This script converts Keras models to Caffe models: it translates common Keras
layers into Caffe NetSpecs and writes them out as prototxt and caffemodel
files, so you can pipe the result directly into your favorite Caffe workflow.

Be aware that Keras is considerably more flexible, so the conversion will miss
lots of goodies such as Lambda functions, some activation functions, and some
of the pooling, text preprocessing, recurrent, and noise layers. For those you
can try your luck with [custom python layers](https://stackoverflow.com/questions/33778225/building-custom-caffe-layer-in-python) or [custom c++ layers](https://github.com/BVLC/caffe/wiki/Development).

This is built with a very simple structure that parses the model layer by
layer, and the same goes for the Caffe weights.

### Why
For anyone who needs to use Caffe for some reason.
This is a script I wrote for Keras2DML while at IBM Watson's Spark Technology
Center (Spark.tc), to run common pretrained Keras models on large clusters,
partly by piggybacking off Caffe2DML. I definitely recommend checking out
[SystemML](https://github.com/apache/systemml) if you want to train models on
clusters, or are interested in distributed machine learning in general.

## Getting Started

These instructions will get you a copy of the project up and running on your local machine for development and testing purposes.

### Prerequisites

What you need to install and how to install it:

```
sudo apt-get install caffe
pip install tensorflow keras
```

Eventually I will add caffe.proto-based file generation so you don't get
bloated by the Caffe dependency.

### Installing

Import the script to use the functions (see the example below):

```
import keras2caffe
```

## Things That Work and What Needs To

Currently the layers that work are:
* Dense
* Dropout
* Add, Multiply, Maximum, and Concatenate
* Conv2D, MaxPooling2D, Conv2DTranspose (does not map anything that doesn't exist in Caffe, such as regularizers and initializers)
* All activations in Caffe except Threshold, Bias, and Scale
* All losses in Caffe, including the combined activation-and-loss layers

Things that I still need to add:
* LSTM (map as much as possible from the Keras parameters)
* Crop, AveragePooling, and Embedding

If I missed anything, let me know so I can add it, or open a pull request.

REMINDER: You can't get one-to-one models, so don't expect everything to work perfectly!
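
### Example

A minimal end-to-end sketch (the file paths are placeholders; the function names match this repo's `keras2caffe.py`):

```
import keras2caffe

# Load a trained Keras model saved with model.save(...)
model = keras2caffe.load_keras_model('model.h5')

# Emit net.prototxt and net.caffemodel
net, caffe_model = keras2caffe.generate_caffe_model(model, 'net.prototxt', 'net.caffemodel')

# Optionally derive a solver.prototxt from the Keras optimizer settings
keras2caffe.generate_caffe_solver(model, 'net.prototxt', 'solver.prototxt')
```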

## Authors

* **Anooj Patel** - *Initial work* - [FuturizeHandgun](https://github.com/FuturizeHandgun)


## License

This project is licensed under the Apache License 2.0

## Acknowledgments

* Mike
* Caffe for making it a game to find valid documentation for pycaffe

--------------------------------------------------------------------------------
/keras2caffe.py:
--------------------------------------------------------------------------------
# -------------------------------------------------------------
#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
#
# -------------------------------------------------------------

# Script to generate caffe proto and .caffemodel files from Keras models

import caffe
from caffe import layers as L
from caffe import params as P

import keras
from keras.models import load_model
from keras.models import model_from_json

import numpy as np


def load_keras_model(filepath):
    model = load_model(filepath)
    return model


def load_keras_skeleton_model(filepath):
    with open(filepath, 'r') as json_file:
        model_json = json_file.read()
    loaded_model = model_from_json(model_json)
    return loaded_model


def load_weights_to_model(model, filepath):
    model.load_weights(filepath)
    return model
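
# Typical loading patterns (illustrative; file names are placeholders):
#   model = load_keras_model('model.h5')              # architecture + weights
# or, with architecture and weights stored separately:
#   model = load_keras_skeleton_model('model.json')
#   model = load_weights_to_model(model, 'weights.h5')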

# Convert a Keras model into a Caffe prototxt plus a .caffemodel weights file
def generate_caffe_model(kModel, filepath, weights_filepath, input_shape=None, phases=None):
    n = caffe.NetSpec()
    layers = kModel.layers
    net_params = dict()
    input_name = kModel.inputs[0].name
    label_name = input_name + "_label"

    for layer in layers:
        blobs = layer.get_weights()
        generate_layer(blobs, layer, n, net_params)

    # Determine the loss that needs to be added
    generate_loss(kModel, n, label_name)
    print("Converting model to proto and converting weights")
    write_caffe_model(n, filepath)
    caffe_model = caffe.Net(filepath, caffe.TEST)
    # Copy the converted Keras weights into the instantiated Caffe net
    for layer in caffe_model.params.keys():
        for i in range(0, len(caffe_model.params[layer])):
            print(layer + ": ")
            print(net_params[layer][i].shape)
            print(caffe_model.params[layer][i].data.shape)
            caffe_model.params[layer][i].data[...] = net_params[layer][i]

    caffe_model.save(weights_filepath)

    # Change the Input back into a Data layer for Caffe2DML
    n[label_name], n[input_name] = L.Data(ntop=2)

    write_caffe_model(n, filepath)

    return n, caffe_model


def generate_layer(blobs, layer, n, net_params):
    """
    Parameters: blobs: weights from Keras, layer: Keras layer, n: Caffe NetSpec,
    net_params: dictionary to store Caffe weights
    """
    if type(layer) == keras.layers.InputLayer:
        # Grab the batch size from index 0, move channels to index 1, and
        # append the remaining spatial dimensions: NHWC -> NCHW
        # TODO determine when to transform for layer types/input shape
        num = len(layer.batch_input_shape) - 1  # Range from 1st index to second last
        # TODO check for image_data_format to be channels_first or channels_last
        batch_list = [layer.batch_input_shape[0], layer.batch_input_shape[-1]]
        for i in range(1, num):
            batch_list.append(layer.batch_input_shape[i])
        for i in range(len(batch_list)):  # Set None dimensions to 1 for Caffe
            if batch_list[i] is None:
                batch_list[i] = 1
        name = layer.name
        # TODO figure out having 2 tops, with n.label
        n[name] = L.Input(shape=[dict(dim=batch_list)])

    elif type(layer) == keras.layers.Dense:
        # Pull the name from Keras
        name = layer.name
        # Pull the names of the layers feeding into the current layer
        in_names = get_inbound_layers(layer)
        # Pipe names into Caffe using unique Keras layer names
        n[name] = L.InnerProduct(n[in_names[0].name], num_output=layer.units)  # TODO: Assert only 1
        config = layer.get_config()
        # Keras stores the Dense kernel as (input_dim, units); Caffe's
        # InnerProduct expects (units, input_dim), hence the transpose
        if config['use_bias']:
            net_params[name] = (np.array(blobs[0]).transpose(1, 0), np.array(blobs[1]))
        else:
            net_params[name] = (np.array(blobs[0]).transpose(1, 0),)
        if layer.activation is not None and layer.activation.__name__ != 'linear':
            name_act = name + "_activation_" + layer.activation.__name__  # get function string
            n[name_act] = get_activation(layer, n[name])
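
    # Worked example (illustrative): Dense(units=10) on a 784-dim input has a
    # Keras kernel of shape (784, 10); after transpose(1, 0) it becomes the
    # (10, 784) blob that Caffe's InnerProduct layer expects.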
    elif type(layer) == keras.layers.Flatten:
        """
        Caffe2DML implicitly stores all tensors as 1D arrays with shapes, so
        after every pass-through all outputs are already flattened. We can
        therefore skip Flatten layers entirely and just pass tops and bottoms
        across them.
        """

    elif type(layer) == keras.layers.Dropout:  # TODO Random seed will be lost
        name = layer.name
        in_names = get_inbound_layers(layer)
        n[name] = L.Dropout(n[in_names[0].name], dropout_ratio=layer.rate, in_place=True)

    # elif type(layer) == keras.layers.LSTM:

    elif type(layer) == keras.layers.Add:
        name = layer.name
        in_names = get_inbound_layers(layer)
        # Turn the list of names into network layers
        network_layers = []
        for ref in in_names:
            network_layers.append(n[ref.name])
        # Unpack the bottom layers
        n[name] = L.Eltwise(*network_layers, operation=1)  # 1 is SUM

    elif type(layer) == keras.layers.Multiply:
        name = layer.name
        in_names = get_inbound_layers(layer)
        # Turn the list of names into network layers
        network_layers = []
        for ref in in_names:
            network_layers.append(n[ref.name])
        # Unpack the bottom layers
        n[name] = L.Eltwise(*network_layers, operation=0)  # 0 is PROD

    elif type(layer) == keras.layers.Concatenate:
        name = layer.name
        in_names = get_inbound_layers(layer)
        # Turn the list of names into network layers
        network_layers = []
        for ref in in_names:
            network_layers.append(n[ref.name])
        axis = get_compensated_axis(layer)
        n[name] = L.Concat(*network_layers, axis=axis)

    elif type(layer) == keras.layers.Maximum:
        name = layer.name
        in_names = get_inbound_layers(layer)
        # Turn the list of names into network layers
        network_layers = []
        for ref in in_names:
            network_layers.append(n[ref.name])
        # Unpack the bottom layers
        n[name] = L.Eltwise(*network_layers, operation=2)  # 2 is MAX

    elif type(layer) == keras.layers.Conv2DTranspose:
        name = layer.name
        in_names = get_inbound_layers(layer)
        # Stride
        if layer.strides is None:
            stride = (1, 1)
        else:
            stride = layer.strides
        # Padding
        if layer.padding == 'same':  # Calculate the padding for 'same'
            padding = [layer.kernel_size[0] // 2, layer.kernel_size[1] // 2]
        else:
            padding = [0, 0]  # 'valid' padding means no padding
        # Get the bias parameter
        config = layer.get_config()
        use_bias = config['use_bias']
        # NetSpec has no direct kwarg mapping for the Deconvolution type, so
        # all settings go inside convolution_param
        n[name] = L.Deconvolution(n[in_names[0].name],
                                  convolution_param=dict(num_output=layer.filters,
                                                         kernel_h=layer.kernel_size[0],
                                                         kernel_w=layer.kernel_size[1],
                                                         stride_h=stride[0], stride_w=stride[1],
                                                         pad_h=padding[0], pad_w=padding[1],
                                                         bias_term=use_bias))
        blobs[0] = np.array(blobs[0]).transpose(3, 2, 0, 1)
        net_params[name] = blobs
        if layer.activation is not None and layer.activation.__name__ != 'linear':
            name_act = name + "_activation_" + layer.activation.__name__  # get function string
            n[name_act] = get_activation(layer, n[name])
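
    # Kernel layout note (illustrative): Keras stores a Conv2DTranspose kernel
    # as (kernel_h, kernel_w, out_channels, in_channels); transpose(3, 2, 0, 1)
    # yields the (in_channels, out_channels, kernel_h, kernel_w) blob that
    # Caffe's Deconvolution layer expects. E.g. a 3x3 deconv from 64 to 32
    # channels goes from shape (3, 3, 32, 64) to (64, 32, 3, 3).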
    elif type(layer) == keras.layers.BatchNormalization:
        name = layer.name
        in_names = get_inbound_layers(layer)
        n[name] = L.BatchNorm(n[in_names[0].name], moving_average_fraction=layer.momentum,
                              eps=layer.epsilon)
        variance = np.array(blobs[-1])
        mean = np.array(blobs[-2])

        config = layer.get_config()
        # Put mean/variance into BatchNorm and gamma/beta into a following Scale layer
        param = dict()
        if config['scale']:
            gamma = np.array(blobs[0])
        else:
            gamma = np.ones(mean.shape, dtype=np.float32)

        if config['center']:
            beta = np.array(blobs[1])
            param['bias_term'] = True
        else:
            beta = np.zeros(mean.shape, dtype=np.float32)
            param['bias_term'] = False

        # Caffe's third BatchNorm blob is the scale factor applied to mean/variance
        net_params[name] = (mean, variance, np.array(1.0))

        name_scale = name + '_scale'
        # Scale after BatchNorm
        n[name_scale] = L.Scale(n[name], in_place=True, scale_param=param)
        net_params[name_scale] = (gamma, beta)

    # TODO Needs to be implemented
    elif type(layer) == keras.layers.Conv1D:
        name = layer.name
        in_names = get_inbound_layers(layer)
        n[name] = L.Convolution(n[in_names[0].name])

    elif type(layer) == keras.layers.Conv2D:
        name = layer.name
        in_names = get_inbound_layers(layer)
        # Stride
        if layer.strides is None:
            stride = (1, 1)
        else:
            stride = layer.strides
        # Padding
        if layer.padding == 'same':  # Calculate the padding for 'same'
            padding = [layer.kernel_size[0] // 2, layer.kernel_size[1] // 2]
        else:
            padding = [0, 0]  # 'valid' padding means no padding
        # TODO The rest of the arguments, including regularizers and dilation
        config = layer.get_config()
        # Get the bias parameter
        use_bias = config['use_bias']
        param = dict(bias_term=use_bias)
        n[name] = L.Convolution(n[in_names[0].name], kernel_h=layer.kernel_size[0],
                                kernel_w=layer.kernel_size[1], stride_h=stride[0],
                                stride_w=stride[1], num_output=layer.filters, pad_h=padding[0],
                                pad_w=padding[1], convolution_param=param)
        # Keras kernel is (h, w, in, out); Caffe wants (out, in, h, w)
        blobs[0] = np.array(blobs[0]).transpose(3, 2, 0, 1)
        net_params[name] = blobs
        if layer.activation is not None and layer.activation.__name__ != 'linear':
            name_act = name + "_activation_" + layer.activation.__name__  # get function string
            n[name_act] = get_activation(layer, n[name])

    elif type(layer) == keras.layers.MaxPooling2D or type(layer) == keras.layers.AveragePooling2D:
        name = layer.name
        in_names = get_inbound_layers(layer)
        if type(layer) == keras.layers.MaxPooling2D:
            pool = P.Pooling.MAX
        else:  # NOTE AveragePooling needs to be implemented
            pool = P.Pooling.AVE
        # Stride
        if layer.strides is None:
            stride = (1, 1)
        else:
            stride = layer.strides
        # Padding
        if layer.padding == 'same':  # Calculate the padding for 'same'
            padding = [layer.pool_size[0] // 2, layer.pool_size[1] // 2]
        else:
            padding = [0, 0]  # 'valid' padding means no padding
        n[name] = L.Pooling(n[in_names[0].name], kernel_h=layer.pool_size[0],
                            kernel_w=layer.pool_size[1], stride_h=stride[0],
                            stride_w=stride[1], pad_h=padding[0], pad_w=padding[1],
                            pool=pool)

    # Activation (wrapper for activations) and Advanced Activation Layers
    elif type(layer) == keras.layers.Activation:
        name = layer.name
        in_names = get_inbound_layers(layer)
        n[name] = get_activation(layer, n[in_names[0].name])  # TODO: Assert only 1
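
    # Note on the advanced-activation branches below: mapping LeakyReLU to
    # PReLU is an approximation, since PReLU learns its slopes. Caffe's ReLU
    # layer accepts a negative_slope parameter, which would be the direct
    # LeakyReLU equivalent, e.g. L.ReLU(bottom, negative_slope=float(layer.alpha)).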
    # Caffe lacks initializer, regularizer, and constraint params
    elif type(layer) == keras.layers.LeakyReLU:
        # TODO: figure out how to pass the leaky alpha (see note above)
        name = layer.name
        in_names = get_inbound_layers(layer)
        n[name] = L.PReLU(n[in_names[0].name])

    elif type(layer) == keras.layers.PReLU:
        name = layer.name
        in_names = get_inbound_layers(layer)
        n[name] = L.PReLU(n[in_names[0].name])

    elif type(layer) == keras.layers.ELU:
        name = layer.name
        in_names = get_inbound_layers(layer)
        n[name] = L.ELU(n[in_names[0].name], alpha=layer.alpha)

    elif type(layer) == keras.layers.GlobalAveragePooling2D:
        name = layer.name
        in_names = get_inbound_layers(layer)
        # Global pooling covers the full spatial extent regardless of input size
        n[name] = L.Pooling(n[in_names[0].name], pool=P.Pooling.AVE, global_pooling=True)

    elif type(layer) == keras.layers.ZeroPadding2D:
        name = layer.name
        in_names = get_inbound_layers(layer)
        config = layer.get_config()
        padding = config['padding']
        # Emulate zero padding with a 1x1 identity convolution that only pads
        n[name] = L.Convolution(n[in_names[0].name], num_output=3, kernel_size=1, stride=1,
                                pad_h=padding[0][0], pad_w=padding[1][0],
                                convolution_param=dict(bias_term=False))
        # Identity kernel: output channel i copies input channel i
        net_params[name] = [np.eye(3, dtype=np.float32).reshape(3, 3, 1, 1)]

    else:
        raise Exception("Cannot convert model. " + layer.name + " is not supported.")


def get_inbound_layers(layer):
    in_names = []
    for node in layer.inbound_nodes:  # get inbound nodes to current layer
        node_list = node.inbound_layers  # get layers pointing to this node
        in_names = in_names + node_list
    # For Caffe2DML: reroute around any Flatten layers, since they are skipped
    if any('flat' in s.name for s in in_names):
        return get_inbound_layers([s for s in in_names if 'flat' in s.name][0])
    return in_names


# Only works with non-TensorFlow activation functions!
def get_activation(layer, bottom):
    activation = keras.activations.serialize(layer.activation)
    if activation == 'relu':
        return L.ReLU(bottom, in_place=True)
    elif activation == 'softmax':
        return L.Softmax(bottom)  # Cannot extract axis from the model, so default to -1
    elif activation == 'softsign':
        # Needs to be implemented in Caffe2DML
        raise Exception("softsign is not implemented")
    elif activation == 'elu':
        return L.ELU(bottom)
    elif activation == 'selu':
        # Needs to be implemented in Caffe2DML
        raise Exception("SELU activation is not implemented")
    elif activation == 'sigmoid':
        return L.Sigmoid(bottom)
    elif activation == 'tanh':
        return L.TanH(bottom)
    # To add more activation functions, add more elif statements keyed on the
    # activation function's serialized name.
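
# Example extension (sketch, not wired in): Keras 'softplus' could be mapped
# by adding one more branch to get_activation above, since Caffe's BNLL layer
# computes log(1 + exp(x)):
#
#     elif activation == 'softplus':
#         return L.BNLL(bottom)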

def generate_loss(kModel, n, label_name):
    # Determine the loss that needs to be added
    for output in kModel.output_layers:
        if hasattr(kModel, 'loss'):
            if kModel.loss == 'categorical_crossentropy' and output.activation.__name__ == 'softmax':
                name = output.name + "_activation_" + output.activation.__name__
                n[name] = L.SoftmaxWithLoss(n[output.name], n[label_name])
            elif kModel.loss == 'binary_crossentropy' and output.activation.__name__ == 'sigmoid':
                name = output.name + "_activation_" + output.activation.__name__
                n[name] = L.SigmoidCrossEntropyLoss(n[output.name], n[label_name])
            else:  # Map the rest of the loss functions onto the end of the Keras output layer
                if kModel.loss == 'hinge':
                    name = kModel.name + 'hinge'
                    n[name] = L.HingeLoss(n[output.name], n[label_name])
                elif kModel.loss == 'categorical_crossentropy':
                    name = kModel.name + 'categorical_crossentropy'
                    # TODO Post a warning to use softmax before this loss
                    n[name] = L.MultinomialLogisticLoss(n[output.name], n[label_name])
                elif kModel.loss == 'mean_squared_error':
                    name = kModel.name + 'mean_squared_error'
                    n[name] = L.EuclideanLoss(n[output.name], n[label_name])
                # TODO implement InfogainLoss
                else:
                    raise Exception(kModel.loss + " is not supported")


# Params: Keras model, Caffe prototxt filepath, filepath to save the solver
def generate_caffe_solver(kModel, cModelPath, filepath):
    # Currently train and test use the same proto
    solver_param = CaffeSolver(kModel, testnet_prototxt_path=cModelPath, debug=True)
    solver_param.write(filepath)


# Params: NetSpec, filepath and filename
def write_caffe_model(cModel, filepath):
    with open(filepath, 'w') as f:
        f.write(str(cModel.to_proto()))


def get_compensated_axis(layer):
    """
    Get the compensated axis, since Caffe uses (n, c, h, w) while Keras uses
    (n, h, w, c) tensor dimensions.
    Params: current Keras layer
    """
    compensated_axis = layer.axis
    # Cover all cases for anything accessing the 0th index or the last index
    if layer.axis > 0 and layer.axis < layer.input[0].shape.ndims - 1:
        compensated_axis = layer.axis + 1
    elif layer.axis < -1 and layer.axis > -(layer.input[0].shape.ndims):
        compensated_axis = layer.axis + 1
    elif layer.axis == -1 or layer.axis == layer.input[0].shape.ndims - 1:
        compensated_axis = 1
    return compensated_axis


def format_optimizer_name(optimizer):
    if optimizer == "Adadelta":
        return "AdaDelta"
    elif optimizer == "Adagrad":
        return "AdaGrad"
    elif optimizer == "Adam":
        return "Adam"
    elif optimizer == "RMSprop":
        return "RMSProp"
    elif optimizer == "SGD":
        return "SGD"
    else:
        raise Exception(optimizer + " is not supported in Caffe2DML")
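
# For reference, CaffeSolver.write() below emits a solver.prototxt of roughly
# this form (keys sorted alphabetically; values are the class defaults, with
# base_lr taken from the Keras optimizer, e.g. 0.01 for default SGD):
#
#   average_loss: 25
#   base_lr: 0.01
#   display: 25
#   lr_policy: "fixed"
#   momentum: 0.9
#   net: "testnet.prototxt"
#   snapshot_prefix: "snapshot"
#   type: "SGD"
#   weight_decay: 0.0005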

class CaffeSolver:
    """
    CaffeSolver is a class for creating a solver.prototxt file. It sets default
    values and can export a solver parameter file.
    Note that all parameters are stored as strings; values that the prototxt
    format quotes (e.g. snapshot_prefix) are stored as quoted strings.
    """

    def __init__(self, keras_model, testnet_prototxt_path="testnet.prototxt",
                 debug=False):

        self.sp = {}

        optimizer_name = format_optimizer_name(type(keras_model.optimizer).__name__)
        # TODO Grab momentum values from the other optimizers
        # critical:
        self.sp['base_lr'] = '{}'.format(keras.backend.eval(keras_model.optimizer.lr))
        self.sp['momentum'] = '0.9'
        self.sp['type'] = '"{}"'.format(optimizer_name)

        # speed:
        self.sp['test_iter'] = '100'
        self.sp['test_interval'] = '250'

        # looks:
        self.sp['display'] = '25'
        self.sp['snapshot'] = '2500'
        self.sp['snapshot_prefix'] = '"snapshot"'  # string within a string!

        # learning rate policy
        self.sp['lr_policy'] = '"fixed"'

        # important, but rare:
        self.sp['gamma'] = '0.1'
        self.sp['weight_decay'] = '0.0005'
        # self.sp['train_net'] = '"' + trainnet_prototxt_path + '"'
        # self.sp['test_net'] = '"' + testnet_prototxt_path + '"'

        self.sp['net'] = '"' + testnet_prototxt_path + '"'

        # pretty much never change these:
        self.sp['max_iter'] = '100000'
        self.sp['test_initialization'] = 'false'
        self.sp['average_loss'] = '25'  # this has to do with the display
        self.sp['iter_size'] = '1'  # this is for accumulating gradients

        if debug:
            self.sp['max_iter'] = '12'
            self.sp['test_iter'] = '1'
            self.sp['test_interval'] = '4'
            self.sp['display'] = '1'

    def add_from_file(self, filepath):
        """
        Reads a caffe solver prototxt file and updates the CaffeSolver
        instance parameters.
        """
        with open(filepath, 'r') as f:
            for line in f:
                if line[0] == '#':
                    continue
                split_line = line.split(':', 1)
                self.sp[split_line[0].strip()] = split_line[1].strip()

    def write(self, filepath):
        """
        Export solver parameters to the file at "filepath", sorted alphabetically.
        """
        with open(filepath, 'w') as f:
            for key, value in sorted(self.sp.items()):
                if not isinstance(value, str):
                    raise Exception('All solver parameters must be strings')
                f.write('%s: %s\n' % (key, value))
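

# Minimal command-line entry point (illustrative sketch; not part of the
# original script, and the argument order is an assumption):
#   python keras2caffe.py model.h5 net.prototxt net.caffemodel
if __name__ == '__main__':
    import sys
    keras_path, proto_path, weights_path = sys.argv[1:4]
    kmodel = load_keras_model(keras_path)
    generate_caffe_model(kmodel, proto_path, weights_path)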