├── README.md
└── keras2caffe.py

/README.md:
--------------------------------------------------------------------------------
# Keras2Caffe
### Keras to Caffe
This script converts Keras models to Caffe models: it translates common Keras
layers into Caffe NetSpecs and writes them out as prototxt and caffemodel
files, so you can pipe the result directly into your favorite Caffe workflow.

Be aware that Keras is considerably more flexible, so the conversion will miss
lots of goodies such as Lambda functions, some activation functions, and some
of the pooling, text preprocessing, recurrent, and noise layers. For those you
can try your luck with [custom python layers](https://stackoverflow.com/questions/33778225/building-custom-caffe-layer-in-python) or [custom c++ layers](https://github.com/BVLC/caffe/wiki/Development).

This is built with a very simple structure that parses the model layer by
layer, and the same goes for the Caffe weights.

### Why
For anyone who needs to use Caffe for some reason.
This is a script I wrote for Keras2DML while at IBM Watson's Spark Technology
Center (Spark.tc), to run common pretrained Keras models on large clusters,
partly by piggybacking off Caffe2DML. I definitely recommend checking out
[SystemML](https://github.com/apache/systemml) if you want to train models on
clusters, or are interested in distributed machine learning in general.

## Getting Started

These instructions will get you a copy of the project up and running on your local machine for development and testing purposes.

### Prerequisites

What you need to install and how to install it:

```
sudo apt-get install caffe
pip install tensorflow keras
```

Eventually I will add caffe.proto-based file generation so you don't get
bloated by the Caffe dependency.

### Installing

Import the script to use the functions (see the example below):

```
import keras2caffe
```

## Things That Work and What Needs To

Currently the layers that work are:
* Dense
* Dropout
* Add, Multiply, Maximum, and Concatenate
* Conv2D, MaxPooling2D, Conv2DTranspose (does not map anything that doesn't exist in Caffe, such as regularizers and initializers)
* All activations in Caffe except Threshold, Bias, and Scale
* All losses in Caffe, including the combined activation-and-loss layers

Things that I still need to add:
* LSTM (map as much as possible from the Keras parameters)
* Crop, AveragePooling, and Embedding

If I missed anything, let me know so I can add it, or open a pull request.

REMINDER: You can't get one-to-one models, so don't expect everything to work perfectly!
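
### Example

A minimal end-to-end sketch (the file paths are placeholders; the function names match this repo's `keras2caffe.py`):

```
import keras2caffe

# Load a trained Keras model saved with model.save(...)
model = keras2caffe.load_keras_model('model.h5')

# Emit net.prototxt and net.caffemodel
net, caffe_model = keras2caffe.generate_caffe_model(model, 'net.prototxt', 'net.caffemodel')

# Optionally derive a solver.prototxt from the Keras optimizer settings
keras2caffe.generate_caffe_solver(model, 'net.prototxt', 'solver.prototxt')
```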

## Authors

* **Anooj Patel** - *Initial work* - [FuturizeHandgun](https://github.com/FuturizeHandgun)


## License

This project is licensed under the Apache License 2.0

## Acknowledgments

* Mike
* Caffe for making it a game to find valid documentation for pycaffe

--------------------------------------------------------------------------------
/keras2caffe.py:
--------------------------------------------------------------------------------
# -------------------------------------------------------------
#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
#
# -------------------------------------------------------------

# Script to generate caffe proto and .caffemodel files from Keras models

import caffe
from caffe import layers as L
from caffe import params as P

import keras
from keras.models import load_model
from keras.models import model_from_json

import numpy as np


def load_keras_model(filepath):
    model = load_model(filepath)
    return model


def load_keras_skeleton_model(filepath):
    with open(filepath, 'r') as json_file:
        model_json = json_file.read()
    loaded_model = model_from_json(model_json)
    return loaded_model


def load_weights_to_model(model, filepath):
    model.load_weights(filepath)
    return model
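
# Typical loading patterns (illustrative; file names are placeholders):
#   model = load_keras_model('model.h5')              # architecture + weights
# or, with architecture and weights stored separately:
#   model = load_keras_skeleton_model('model.json')
#   model = load_weights_to_model(model, 'weights.h5')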

# Convert a Keras model into a Caffe prototxt plus a .caffemodel weights file
def generate_caffe_model(kModel, filepath, weights_filepath, input_shape=None, phases=None):
    n = caffe.NetSpec()
    layers = kModel.layers
    net_params = dict()
    input_name = kModel.inputs[0].name
    label_name = input_name + "_label"

    for layer in layers:
        blobs = layer.get_weights()
        generate_layer(blobs, layer, n, net_params)

    # Determine the loss that needs to be added
    generate_loss(kModel, n, label_name)
    print("Converting model to proto and converting weights")
    write_caffe_model(n, filepath)
    caffe_model = caffe.Net(filepath, caffe.TEST)
    # Copy the converted Keras weights into the instantiated Caffe net
    for layer in caffe_model.params.keys():
        for i in range(0, len(caffe_model.params[layer])):
            print(layer + ": ")
            print(net_params[layer][i].shape)
            print(caffe_model.params[layer][i].data.shape)
            caffe_model.params[layer][i].data[...] = net_params[layer][i]

    caffe_model.save(weights_filepath)

    # Change the Input back into a Data layer for Caffe2DML
    n[label_name], n[input_name] = L.Data(ntop=2)

    write_caffe_model(n, filepath)

    return n, caffe_model


def generate_layer(blobs, layer, n, net_params):
    """
    Parameters: blobs: weights from Keras, layer: Keras layer, n: Caffe NetSpec,
    net_params: dictionary to store Caffe weights
    """
    if type(layer) == keras.layers.InputLayer:
        # Grab the batch size from index 0, move channels to index 1, and
        # append the remaining spatial dimensions: NHWC -> NCHW
        # TODO determine when to transform for layer types/input shape
        num = len(layer.batch_input_shape) - 1  # Range from 1st index to second last
        # TODO check for image_data_format to be channels_first or channels_last
        batch_list = [layer.batch_input_shape[0], layer.batch_input_shape[-1]]
        for i in range(1, num):
            batch_list.append(layer.batch_input_shape[i])
        for i in range(len(batch_list)):  # Set None dimensions to 1 for Caffe
            if batch_list[i] is None:
                batch_list[i] = 1
        name = layer.name
        # TODO figure out having 2 tops, with n.label
        n[name] = L.Input(shape=[dict(dim=batch_list)])

    elif type(layer) == keras.layers.Dense:
        # Pull the name from Keras
        name = layer.name
        # Pull the names of the layers feeding into the current layer
        in_names = get_inbound_layers(layer)
        # Pipe names into Caffe using unique Keras layer names
        n[name] = L.InnerProduct(n[in_names[0].name], num_output=layer.units)  # TODO: Assert only 1
        config = layer.get_config()
        # Keras stores the Dense kernel as (input_dim, units); Caffe's
        # InnerProduct expects (units, input_dim), hence the transpose
        if config['use_bias']:
            net_params[name] = (np.array(blobs[0]).transpose(1, 0), np.array(blobs[1]))
        else:
            net_params[name] = (np.array(blobs[0]).transpose(1, 0),)
        if layer.activation is not None and layer.activation.__name__ != 'linear':
            name_act = name + "_activation_" + layer.activation.__name__  # get function string
            n[name_act] = get_activation(layer, n[name])
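
    # Worked example (illustrative): Dense(units=10) on a 784-dim input has a
    # Keras kernel of shape (784, 10); after transpose(1, 0) it becomes the
    # (10, 784) blob that Caffe's InnerProduct layer expects.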
    elif type(layer) == keras.layers.Flatten:
        """
        Caffe2DML implicitly stores all tensors as 1D arrays with shapes, so
        after every pass-through all outputs are already flattened. We can
        therefore skip Flatten layers entirely and just pass tops and bottoms
        across them.
        """

    elif type(layer) == keras.layers.Dropout:  # TODO Random seed will be lost
        name = layer.name
        in_names = get_inbound_layers(layer)
        n[name] = L.Dropout(n[in_names[0].name], dropout_ratio=layer.rate, in_place=True)

    # elif type(layer) == keras.layers.LSTM:

    elif type(layer) == keras.layers.Add:
        name = layer.name
        in_names = get_inbound_layers(layer)
        # Turn the list of names into network layers
        network_layers = []
        for ref in in_names:
            network_layers.append(n[ref.name])
        # Unpack the bottom layers
        n[name] = L.Eltwise(*network_layers, operation=1)  # 1 is SUM

    elif type(layer) == keras.layers.Multiply:
        name = layer.name
        in_names = get_inbound_layers(layer)
        # Turn the list of names into network layers
        network_layers = []
        for ref in in_names:
            network_layers.append(n[ref.name])
        # Unpack the bottom layers
        n[name] = L.Eltwise(*network_layers, operation=0)  # 0 is PROD

    elif type(layer) == keras.layers.Concatenate:
        name = layer.name
        in_names = get_inbound_layers(layer)
        # Turn the list of names into network layers
        network_layers = []
        for ref in in_names:
            network_layers.append(n[ref.name])
        axis = get_compensated_axis(layer)
        n[name] = L.Concat(*network_layers, axis=axis)

    elif type(layer) == keras.layers.Maximum:
        name = layer.name
        in_names = get_inbound_layers(layer)
        # Turn the list of names into network layers
        network_layers = []
        for ref in in_names:
            network_layers.append(n[ref.name])
        # Unpack the bottom layers
        n[name] = L.Eltwise(*network_layers, operation=2)  # 2 is MAX

    elif type(layer) == keras.layers.Conv2DTranspose:
        name = layer.name
        in_names = get_inbound_layers(layer)
        # Stride
        if layer.strides is None:
            stride = (1, 1)
        else:
            stride = layer.strides
        # Padding
        if layer.padding == 'same':  # Calculate the padding for 'same'
            padding = [layer.kernel_size[0] // 2, layer.kernel_size[1] // 2]
        else:
            padding = [0, 0]  # 'valid' padding means no padding
        # Get the bias parameter
        config = layer.get_config()
        use_bias = config['use_bias']
        # NetSpec has no direct kwarg mapping for the Deconvolution type, so
        # all settings go inside convolution_param
        n[name] = L.Deconvolution(n[in_names[0].name],
                                  convolution_param=dict(num_output=layer.filters,
                                                         kernel_h=layer.kernel_size[0],
                                                         kernel_w=layer.kernel_size[1],
                                                         stride_h=stride[0], stride_w=stride[1],
                                                         pad_h=padding[0], pad_w=padding[1],
                                                         bias_term=use_bias))
        blobs[0] = np.array(blobs[0]).transpose(3, 2, 0, 1)
        net_params[name] = blobs
        if layer.activation is not None and layer.activation.__name__ != 'linear':
            name_act = name + "_activation_" + layer.activation.__name__  # get function string
            n[name_act] = get_activation(layer, n[name])
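
    # Kernel layout note (illustrative): Keras stores a Conv2DTranspose kernel
    # as (kernel_h, kernel_w, out_channels, in_channels); transpose(3, 2, 0, 1)
    # yields the (in_channels, out_channels, kernel_h, kernel_w) blob that
    # Caffe's Deconvolution layer expects. E.g. a 3x3 deconv from 64 to 32
    # channels goes from shape (3, 3, 32, 64) to (64, 32, 3, 3).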
    elif type(layer) == keras.layers.BatchNormalization:
        name = layer.name
        in_names = get_inbound_layers(layer)
        n[name] = L.BatchNorm(n[in_names[0].name], moving_average_fraction=layer.momentum,
                              eps=layer.epsilon)
        variance = np.array(blobs[-1])
        mean = np.array(blobs[-2])

        config = layer.get_config()
        # Put mean/variance into BatchNorm and gamma/beta into a following Scale layer
        param = dict()
        if config['scale']:
            gamma = np.array(blobs[0])
        else:
            gamma = np.ones(mean.shape, dtype=np.float32)

        if config['center']:
            beta = np.array(blobs[1])
            param['bias_term'] = True
        else:
            beta = np.zeros(mean.shape, dtype=np.float32)
            param['bias_term'] = False

        # Caffe's third BatchNorm blob is the scale factor applied to mean/variance
        net_params[name] = (mean, variance, np.array(1.0))

        name_scale = name + '_scale'
        # Scale after BatchNorm
        n[name_scale] = L.Scale(n[name], in_place=True, scale_param=param)
        net_params[name_scale] = (gamma, beta)

    # TODO Needs to be implemented
    elif type(layer) == keras.layers.Conv1D:
        name = layer.name
        in_names = get_inbound_layers(layer)
        n[name] = L.Convolution(n[in_names[0].name])

    elif type(layer) == keras.layers.Conv2D:
        name = layer.name
        in_names = get_inbound_layers(layer)
        # Stride
        if layer.strides is None:
            stride = (1, 1)
        else:
            stride = layer.strides
        # Padding
        if layer.padding == 'same':  # Calculate the padding for 'same'
            padding = [layer.kernel_size[0] // 2, layer.kernel_size[1] // 2]
        else:
            padding = [0, 0]  # 'valid' padding means no padding
        # TODO The rest of the arguments, including regularizers and dilation
        config = layer.get_config()
        # Get the bias parameter
        use_bias = config['use_bias']
        param = dict(bias_term=use_bias)
        n[name] = L.Convolution(n[in_names[0].name], kernel_h=layer.kernel_size[0],
                                kernel_w=layer.kernel_size[1], stride_h=stride[0],
                                stride_w=stride[1], num_output=layer.filters, pad_h=padding[0],
                                pad_w=padding[1], convolution_param=param)
        # Keras kernel is (h, w, in, out); Caffe wants (out, in, h, w)
        blobs[0] = np.array(blobs[0]).transpose(3, 2, 0, 1)
        net_params[name] = blobs
        if layer.activation is not None and layer.activation.__name__ != 'linear':
            name_act = name + "_activation_" + layer.activation.__name__  # get function string
            n[name_act] = get_activation(layer, n[name])

    elif type(layer) == keras.layers.MaxPooling2D or type(layer) == keras.layers.AveragePooling2D:
        name = layer.name
        in_names = get_inbound_layers(layer)
        if type(layer) == keras.layers.MaxPooling2D:
            pool = P.Pooling.MAX
        else:  # NOTE AveragePooling needs to be implemented
            pool = P.Pooling.AVE
        # Stride
        if layer.strides is None:
            stride = (1, 1)
        else:
            stride = layer.strides
        # Padding
        if layer.padding == 'same':  # Calculate the padding for 'same'
            padding = [layer.pool_size[0] // 2, layer.pool_size[1] // 2]
        else:
            padding = [0, 0]  # 'valid' padding means no padding
        n[name] = L.Pooling(n[in_names[0].name], kernel_h=layer.pool_size[0],
                            kernel_w=layer.pool_size[1], stride_h=stride[0],
                            stride_w=stride[1], pad_h=padding[0], pad_w=padding[1],
                            pool=pool)

    # Activation (wrapper for activations) and Advanced Activation Layers
    elif type(layer) == keras.layers.Activation:
        name = layer.name
        in_names = get_inbound_layers(layer)
        n[name] = get_activation(layer, n[in_names[0].name])  # TODO: Assert only 1
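
    # Note on the advanced-activation branches below: mapping LeakyReLU to
    # PReLU is an approximation, since PReLU learns its slopes. Caffe's ReLU
    # layer accepts a negative_slope parameter, which would be the direct
    # LeakyReLU equivalent, e.g. L.ReLU(bottom, negative_slope=float(layer.alpha)).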
    # Caffe lacks initializer, regularizer, and constraint params
    elif type(layer) == keras.layers.LeakyReLU:
        # TODO: figure out how to pass the leaky alpha (see note above)
        name = layer.name
        in_names = get_inbound_layers(layer)
        n[name] = L.PReLU(n[in_names[0].name])

    elif type(layer) == keras.layers.PReLU:
        name = layer.name
        in_names = get_inbound_layers(layer)
        n[name] = L.PReLU(n[in_names[0].name])

    elif type(layer) == keras.layers.ELU:
        name = layer.name
        in_names = get_inbound_layers(layer)
        n[name] = L.ELU(n[in_names[0].name], alpha=layer.alpha)

    elif type(layer) == keras.layers.GlobalAveragePooling2D:
        name = layer.name
        in_names = get_inbound_layers(layer)
        # Global pooling covers the full spatial extent regardless of input size
        n[name] = L.Pooling(n[in_names[0].name], pool=P.Pooling.AVE, global_pooling=True)

    elif type(layer) == keras.layers.ZeroPadding2D:
        name = layer.name
        in_names = get_inbound_layers(layer)
        config = layer.get_config()
        padding = config['padding']
        # Emulate zero padding with a 1x1 identity convolution that only pads
        n[name] = L.Convolution(n[in_names[0].name], num_output=3, kernel_size=1, stride=1,
                                pad_h=padding[0][0], pad_w=padding[1][0],
                                convolution_param=dict(bias_term=False))
        # Identity kernel: output channel i copies input channel i
        net_params[name] = [np.eye(3, dtype=np.float32).reshape(3, 3, 1, 1)]

    else:
        raise Exception("Cannot convert model. " + layer.name + " is not supported.")


def get_inbound_layers(layer):
    in_names = []
    for node in layer.inbound_nodes:  # get inbound nodes to current layer
        node_list = node.inbound_layers  # get layers pointing to this node
        in_names = in_names + node_list
    # For Caffe2DML: reroute around any Flatten layers, since they are skipped
    if any('flat' in s.name for s in in_names):
        return get_inbound_layers([s for s in in_names if 'flat' in s.name][0])
    return in_names


# Only works with non-TensorFlow activation functions!
def get_activation(layer, bottom):
    activation = keras.activations.serialize(layer.activation)
    if activation == 'relu':
        return L.ReLU(bottom, in_place=True)
    elif activation == 'softmax':
        return L.Softmax(bottom)  # Cannot extract axis from the model, so default to -1
    elif activation == 'softsign':
        # Needs to be implemented in Caffe2DML
        raise Exception("softsign is not implemented")
    elif activation == 'elu':
        return L.ELU(bottom)
    elif activation == 'selu':
        # Needs to be implemented in Caffe2DML
        raise Exception("SELU activation is not implemented")
    elif activation == 'sigmoid':
        return L.Sigmoid(bottom)
    elif activation == 'tanh':
        return L.TanH(bottom)
    # To add more activation functions, add more elif statements keyed on the
    # activation function's serialized name.
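
# Example extension (sketch, not wired in): Keras 'softplus' could be mapped
# by adding one more branch to get_activation above, since Caffe's BNLL layer
# computes log(1 + exp(x)):
#
#     elif activation == 'softplus':
#         return L.BNLL(bottom)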

def generate_loss(kModel, n, label_name):
    # Determine the loss that needs to be added
    for output in kModel.output_layers:
        if hasattr(kModel, 'loss'):
            if kModel.loss == 'categorical_crossentropy' and output.activation.__name__ == 'softmax':
                name = output.name + "_activation_" + output.activation.__name__
                n[name] = L.SoftmaxWithLoss(n[output.name], n[label_name])
            elif kModel.loss == 'binary_crossentropy' and output.activation.__name__ == 'sigmoid':
                name = output.name + "_activation_" + output.activation.__name__
                n[name] = L.SigmoidCrossEntropyLoss(n[output.name], n[label_name])
            else:  # Map the rest of the loss functions onto the end of the Keras output layer
                if kModel.loss == 'hinge':
                    name = kModel.name + 'hinge'
                    n[name] = L.HingeLoss(n[output.name], n[label_name])
                elif kModel.loss == 'categorical_crossentropy':
                    name = kModel.name + 'categorical_crossentropy'
                    # TODO Post a warning to use softmax before this loss
                    n[name] = L.MultinomialLogisticLoss(n[output.name], n[label_name])
                elif kModel.loss == 'mean_squared_error':
                    name = kModel.name + 'mean_squared_error'
                    n[name] = L.EuclideanLoss(n[output.name], n[label_name])
                # TODO implement InfogainLoss
                else:
                    raise Exception(kModel.loss + " is not supported")


# Params: Keras model, Caffe prototxt filepath, filepath to save the solver
def generate_caffe_solver(kModel, cModelPath, filepath):
    # Currently train and test use the same proto
    solver_param = CaffeSolver(kModel, testnet_prototxt_path=cModelPath, debug=True)
    solver_param.write(filepath)


# Params: NetSpec, filepath and filename
def write_caffe_model(cModel, filepath):
    with open(filepath, 'w') as f:
        f.write(str(cModel.to_proto()))


def get_compensated_axis(layer):
    """
    Get the compensated axis, since Caffe uses (n, c, h, w) while Keras uses
    (n, h, w, c) tensor dimensions.
    Params: current Keras layer
    """
    compensated_axis = layer.axis
    # Cover all cases for anything accessing the 0th index or the last index
    if layer.axis > 0 and layer.axis < layer.input[0].shape.ndims - 1:
        compensated_axis = layer.axis + 1
    elif layer.axis < -1 and layer.axis > -(layer.input[0].shape.ndims):
        compensated_axis = layer.axis + 1
    elif layer.axis == -1 or layer.axis == layer.input[0].shape.ndims - 1:
        compensated_axis = 1
    return compensated_axis


def format_optimizer_name(optimizer):
    if optimizer == "Adadelta":
        return "AdaDelta"
    elif optimizer == "Adagrad":
        return "AdaGrad"
    elif optimizer == "Adam":
        return "Adam"
    elif optimizer == "RMSprop":
        return "RMSProp"
    elif optimizer == "SGD":
        return "SGD"
    else:
        raise Exception(optimizer + " is not supported in Caffe2DML")
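
# For reference, CaffeSolver.write() below emits a solver.prototxt of roughly
# this form (keys sorted alphabetically; values are the class defaults, with
# base_lr taken from the Keras optimizer, e.g. 0.01 for default SGD):
#
#   average_loss: 25
#   base_lr: 0.01
#   display: 25
#   lr_policy: "fixed"
#   momentum: 0.9
#   net: "testnet.prototxt"
#   snapshot_prefix: "snapshot"
#   type: "SGD"
#   weight_decay: 0.0005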

class CaffeSolver:
    """
    CaffeSolver is a class for creating a solver.prototxt file. It sets default
    values and can export a solver parameter file.
    Note that all parameters are stored as strings; values that the prototxt
    format quotes (e.g. snapshot_prefix) are stored as quoted strings.
    """

    def __init__(self, keras_model, testnet_prototxt_path="testnet.prototxt",
                 debug=False):

        self.sp = {}

        optimizer_name = format_optimizer_name(type(keras_model.optimizer).__name__)
        # TODO Grab momentum values from the other optimizers
        # critical:
        self.sp['base_lr'] = '{}'.format(keras.backend.eval(keras_model.optimizer.lr))
        self.sp['momentum'] = '0.9'
        self.sp['type'] = '"{}"'.format(optimizer_name)

        # speed:
        self.sp['test_iter'] = '100'
        self.sp['test_interval'] = '250'

        # looks:
        self.sp['display'] = '25'
        self.sp['snapshot'] = '2500'
        self.sp['snapshot_prefix'] = '"snapshot"'  # string within a string!

        # learning rate policy
        self.sp['lr_policy'] = '"fixed"'

        # important, but rare:
        self.sp['gamma'] = '0.1'
        self.sp['weight_decay'] = '0.0005'
        # self.sp['train_net'] = '"' + trainnet_prototxt_path + '"'
        # self.sp['test_net'] = '"' + testnet_prototxt_path + '"'

        self.sp['net'] = '"' + testnet_prototxt_path + '"'

        # pretty much never change these:
        self.sp['max_iter'] = '100000'
        self.sp['test_initialization'] = 'false'
        self.sp['average_loss'] = '25'  # this has to do with the display
        self.sp['iter_size'] = '1'  # this is for accumulating gradients

        if debug:
            self.sp['max_iter'] = '12'
            self.sp['test_iter'] = '1'
            self.sp['test_interval'] = '4'
            self.sp['display'] = '1'

    def add_from_file(self, filepath):
        """
        Reads a caffe solver prototxt file and updates the CaffeSolver
        instance parameters.
        """
        with open(filepath, 'r') as f:
            for line in f:
                if line[0] == '#':
                    continue
                split_line = line.split(':', 1)
                self.sp[split_line[0].strip()] = split_line[1].strip()

    def write(self, filepath):
        """
        Export solver parameters to the file at "filepath", sorted alphabetically.
        """
        with open(filepath, 'w') as f:
            for key, value in sorted(self.sp.items()):
                if not isinstance(value, str):
                    raise Exception('All solver parameters must be strings')
                f.write('%s: %s\n' % (key, value))
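

# Minimal command-line entry point (illustrative sketch; not part of the
# original script, and the argument order is an assumption):
#   python keras2caffe.py model.h5 net.prototxt net.caffemodel
if __name__ == '__main__':
    import sys
    keras_path, proto_path, weights_path = sys.argv[1:4]
    kmodel = load_keras_model(keras_path)
    generate_caffe_model(kmodel, proto_path, weights_path)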