├── LICENSE ├── README.md ├── code ├── Compute_Sketchy_score.ipynb ├── README └── Retrieval_Example.ipynb ├── list └── README ├── models └── triplet_googlenet │ ├── README │ ├── Triplet_googlenet_imagedeploy.prototxt │ └── Triplet_googlenet_sketchdeploy.prototxt └── training ├── README.md ├── Triplet_googlenet_train_test.prototxt └── sketch_triplet_solver.prototxt /LICENSE: -------------------------------------------------------------------------------- 1 | The MIT License (MIT) 2 | 3 | Copyright (c) 2016 janesjanes 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 
22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Sketchy 2 | http://sketchy.eye.gatech.edu/ 3 | 4 | Caffemodel for our final Triplet GoogleNet network: https://goo.gl/cqXm7F 5 | -------------------------------------------------------------------------------- /code/Compute_Sketchy_score.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "This script computes the performance of the network on our Sketchy benchmark. \n" 8 | ] 9 | }, 10 | { 11 | "cell_type": "code", 12 | "execution_count": 1, 13 | "metadata": { 14 | "collapsed": true 15 | }, 16 | "outputs": [], 17 | "source": [ 18 | "import numpy as np\n", 19 | "from pylab import *\n", 20 | "%matplotlib inline\n", 21 | "import os\n", 22 | "import sys" 23 | ] 24 | }, 25 | { 26 | "cell_type": "markdown", 27 | "metadata": {}, 28 | "source": [ 29 | "## caffe" 30 | ] 31 | }, 32 | { 33 | "cell_type": "markdown", 34 | "metadata": {}, 35 | "source": [ 36 | "First, we need to import Caffe. You'll need to have Caffe installed, along with its Python interface. " 37 | ] 38 | }, 39 | { 40 | "cell_type": "code", 41 | "execution_count": 2, 42 | "metadata": { 43 | "collapsed": true 44 | }, 45 | "outputs": [], 46 | "source": [ 47 | "#TODO: specify your caffe root folder here\n", 48 | "caffe_root = \"X:/caffe_siggraph/caffe-windows-master\"\n", 49 | "sys.path.insert(0, caffe_root+'/python')\n", 50 | "import caffe" 51 | ] 52 | }, 53 | { 54 | "cell_type": "markdown", 55 | "metadata": {}, 56 | "source": [ 57 | "Now we can load up the network. You can change the path to your own network here. Make sure to use the matching deploy prototxt files and change the target layer to your layer name." 
58 | ] 59 | }, 60 | { 61 | "cell_type": "code", 62 | "execution_count": 35, 63 | "metadata": { 64 | "collapsed": false 65 | }, 66 | "outputs": [ 67 | { 68 | "name": "stdout", 69 | "output_type": "stream", 70 | "text": [ 71 | "\n" 72 | ] 73 | } 74 | ], 75 | "source": [ 76 | "#TODO: change to your own network and deploy files\n", 77 | "PRETRAINED_FILE = '../models/triplet_googlenet/triplet_googlenet_finegrain_final.caffemodel' \n", 78 | "sketch_model = '../models/triplet_googlenet/Triplet_googlenet_sketchdeploy.prototxt'\n", 79 | "image_model = '../models/triplet_googlenet/Triplet_googlenet_imagedeploy.prototxt'" 80 | ] 81 | }, 82 | { 83 | "cell_type": "code", 84 | "execution_count": 36, 85 | "metadata": { 86 | "collapsed": false 87 | }, 88 | "outputs": [ 89 | { 90 | "data": { 91 | "text/plain": [ 92 | "['data',\n", 93 | " 'conv1/7x7_s2_s',\n", 94 | " 'pool1/3x3_s2_s',\n", 95 | " 'pool1/norm1_s',\n", 96 | " 'conv2/3x3_reduce_s',\n", 97 | " 'conv2/3x3_s',\n", 98 | " 'conv2/norm2_s',\n", 99 | " 'pool2/3x3_s2_s',\n", 100 | " 'pool2/3x3_s2_s_pool2/3x3_s2_s_0_split_0',\n", 101 | " 'pool2/3x3_s2_s_pool2/3x3_s2_s_0_split_1',\n", 102 | " 'pool2/3x3_s2_s_pool2/3x3_s2_s_0_split_2',\n", 103 | " 'pool2/3x3_s2_s_pool2/3x3_s2_s_0_split_3',\n", 104 | " 'inception_3a/1x1_s',\n", 105 | " 'inception_3a/3x3_reduce_s',\n", 106 | " 'inception_3a/3x3_s',\n", 107 | " 'inception_3a/5x5_reduce_s',\n", 108 | " 'inception_3a/5x5_s',\n", 109 | " 'inception_3a/pool_s',\n", 110 | " 'inception_3a/pool_proj_s',\n", 111 | " 'inception_3a/output_s',\n", 112 | " 'inception_3a/output_s_inception_3a/output_s_0_split_0',\n", 113 | " 'inception_3a/output_s_inception_3a/output_s_0_split_1',\n", 114 | " 'inception_3a/output_s_inception_3a/output_s_0_split_2',\n", 115 | " 'inception_3a/output_s_inception_3a/output_s_0_split_3',\n", 116 | " 'inception_3b/1x1_s',\n", 117 | " 'inception_3b/3x3_reduce_s',\n", 118 | " 'inception_3b/3x3_s',\n", 119 | " 'inception_3b/5x5_reduce_s',\n", 120 | " 'inception_3b/5x5_s',\n", 121 | " 
'inception_3b/pool_s',\n", 122 | " 'inception_3b/pool_proj_s',\n", 123 | " 'inception_3b/output_s',\n", 124 | " 'pool3/3x3_s2_s',\n", 125 | " 'pool3/3x3_s2_s_pool3/3x3_s2_s_0_split_0',\n", 126 | " 'pool3/3x3_s2_s_pool3/3x3_s2_s_0_split_1',\n", 127 | " 'pool3/3x3_s2_s_pool3/3x3_s2_s_0_split_2',\n", 128 | " 'pool3/3x3_s2_s_pool3/3x3_s2_s_0_split_3',\n", 129 | " 'inception_4a/1x1_s',\n", 130 | " 'inception_4a/3x3_reduce_s',\n", 131 | " 'inception_4a/3x3_s',\n", 132 | " 'inception_4a/5x5_reduce_s',\n", 133 | " 'inception_4a/5x5_s',\n", 134 | " 'inception_4a/pool_s',\n", 135 | " 'inception_4a/pool_proj_s',\n", 136 | " 'inception_4a/output_s',\n", 137 | " 'inception_4a/output_s_inception_4a/output_s_0_split_0',\n", 138 | " 'inception_4a/output_s_inception_4a/output_s_0_split_1',\n", 139 | " 'inception_4a/output_s_inception_4a/output_s_0_split_2',\n", 140 | " 'inception_4a/output_s_inception_4a/output_s_0_split_3',\n", 141 | " 'inception_4a/output_s_inception_4a/output_s_0_split_4',\n", 142 | " 'loss1/ave_pool_s',\n", 143 | " 'loss1/conv_s',\n", 144 | " 'loss1/fc_s',\n", 145 | " 'inception_4b/1x1_s',\n", 146 | " 'inception_4b/3x3_reduce_s',\n", 147 | " 'inception_4b/3x3_s',\n", 148 | " 'inception_4b/5x5_reduce_s',\n", 149 | " 'inception_4b/5x5_s',\n", 150 | " 'inception_4b/pool_s',\n", 151 | " 'inception_4b/pool_proj_s',\n", 152 | " 'inception_4b/output_s',\n", 153 | " 'inception_4b/output_s_inception_4b/output_s_0_split_0',\n", 154 | " 'inception_4b/output_s_inception_4b/output_s_0_split_1',\n", 155 | " 'inception_4b/output_s_inception_4b/output_s_0_split_2',\n", 156 | " 'inception_4b/output_s_inception_4b/output_s_0_split_3',\n", 157 | " 'inception_4c/1x1_s',\n", 158 | " 'inception_4c/3x3_reduce_s',\n", 159 | " 'inception_4c/3x3_s',\n", 160 | " 'inception_4c/5x5_reduce_s',\n", 161 | " 'inception_4c/5x5_s',\n", 162 | " 'inception_4c/pool_s',\n", 163 | " 'inception_4c/pool_proj_s',\n", 164 | " 'inception_4c/output_s',\n", 165 | " 
'inception_4c/output_s_inception_4c/output_s_0_split_0',\n", 166 | " 'inception_4c/output_s_inception_4c/output_s_0_split_1',\n", 167 | " 'inception_4c/output_s_inception_4c/output_s_0_split_2',\n", 168 | " 'inception_4c/output_s_inception_4c/output_s_0_split_3',\n", 169 | " 'inception_4d/1x1_s',\n", 170 | " 'inception_4d/3x3_reduce_s',\n", 171 | " 'inception_4d/3x3_s',\n", 172 | " 'inception_4d/5x5_reduce_s',\n", 173 | " 'inception_4d/5x5_s',\n", 174 | " 'inception_4d/pool_s',\n", 175 | " 'inception_4d/pool_proj_s',\n", 176 | " 'inception_4d/output_s',\n", 177 | " 'inception_4d/output_s_inception_4d/output_s_0_split_0',\n", 178 | " 'inception_4d/output_s_inception_4d/output_s_0_split_1',\n", 179 | " 'inception_4d/output_s_inception_4d/output_s_0_split_2',\n", 180 | " 'inception_4d/output_s_inception_4d/output_s_0_split_3',\n", 181 | " 'inception_4d/output_s_inception_4d/output_s_0_split_4',\n", 182 | " 'loss2/ave_pool_s',\n", 183 | " 'loss2/conv_s',\n", 184 | " 'loss2/fc_s',\n", 185 | " 'inception_4e/1x1_s',\n", 186 | " 'inception_4e/3x3_reduce_s',\n", 187 | " 'inception_4e/3x3_s',\n", 188 | " 'inception_4e/5x5_reduce_s',\n", 189 | " 'inception_4e/5x5_s',\n", 190 | " 'inception_4e/pool_s',\n", 191 | " 'inception_4e/pool_proj_s',\n", 192 | " 'inception_4e/output_s',\n", 193 | " 'pool4/3x3_s2_s',\n", 194 | " 'pool4/3x3_s2_s_pool4/3x3_s2_s_0_split_0',\n", 195 | " 'pool4/3x3_s2_s_pool4/3x3_s2_s_0_split_1',\n", 196 | " 'pool4/3x3_s2_s_pool4/3x3_s2_s_0_split_2',\n", 197 | " 'pool4/3x3_s2_s_pool4/3x3_s2_s_0_split_3',\n", 198 | " 'inception_5a/1x1_s',\n", 199 | " 'inception_5a/3x3_reduce_s',\n", 200 | " 'inception_5a/3x3_s',\n", 201 | " 'inception_5a/5x5_reduce_s',\n", 202 | " 'inception_5a/5x5_s',\n", 203 | " 'inception_5a/pool_s',\n", 204 | " 'inception_5a/pool_proj_s',\n", 205 | " 'inception_5a/output_s',\n", 206 | " 'inception_5a/output_s_inception_5a/output_s_0_split_0',\n", 207 | " 'inception_5a/output_s_inception_5a/output_s_0_split_1',\n", 208 | " 
'inception_5a/output_s_inception_5a/output_s_0_split_2',\n", 209 | " 'inception_5a/output_s_inception_5a/output_s_0_split_3',\n", 210 | " 'inception_5b/1x1_s',\n", 211 | " 'inception_5b/3x3_reduce_s',\n", 212 | " 'inception_5b/3x3_s',\n", 213 | " 'inception_5b/5x5_reduce_s',\n", 214 | " 'inception_5b/5x5_s',\n", 215 | " 'inception_5b/pool_s',\n", 216 | " 'inception_5b/pool_proj_s',\n", 217 | " 'inception_5b/output_s',\n", 218 | " 'pool5/7x7_s1_s']" 219 | ] 220 | }, 221 | "execution_count": 36, 222 | "metadata": {}, 223 | "output_type": "execute_result" 224 | } 225 | ], 226 | "source": [ 227 | "caffe.set_mode_gpu()\n", 228 | "#caffe.set_mode_cpu()\n", 229 | "sketch_net = caffe.Net(sketch_model, PRETRAINED_FILE, caffe.TEST)\n", 230 | "img_net = caffe.Net(image_model, PRETRAINED_FILE, caffe.TEST)\n", 231 | "sketch_net.blobs.keys()" 232 | ] 233 | }, 234 | { 235 | "cell_type": "code", 236 | "execution_count": 38, 237 | "metadata": { 238 | "collapsed": false 239 | }, 240 | "outputs": [ 241 | { 242 | "name": "stdout", 243 | "output_type": "stream", 244 | "text": [ 245 | "\n" 246 | ] 247 | } 248 | ], 249 | "source": [ 250 | "#TODO: set output layer name. 
You can use sketch_net.blobs.keys() to list all layers\n", 251 | "output_layer_sketch = 'pool5/7x7_s1_s'\n", 252 | "output_layer_image = 'pool5/7x7_s1_p'" 253 | ] 254 | }, 255 | { 256 | "cell_type": "code", 257 | "execution_count": 12, 258 | "metadata": { 259 | "collapsed": false 260 | }, 261 | "outputs": [], 262 | "source": [ 263 | "#set the transformer\n", 264 | "transformer = caffe.io.Transformer({'data': np.shape(sketch_net.blobs['data'].data)})\n", 265 | "transformer.set_mean('data', np.array([104, 117, 123]))\n", 266 | "transformer.set_transpose('data',(2,0,1))\n", 267 | "transformer.set_channel_swap('data', (2,1,0))\n", 268 | "transformer.set_raw_scale('data', 255.0)" 269 | ] 270 | }, 271 | { 272 | "cell_type": "markdown", 273 | "metadata": {}, 274 | "source": [ 275 | "## Sketchy test set" 276 | ] 277 | }, 278 | { 279 | "cell_type": "code", 280 | "execution_count": 24, 281 | "metadata": { 282 | "collapsed": true 283 | }, 284 | "outputs": [], 285 | "source": [ 286 | "#photo and sketch paths\n", 287 | "photo_paths = 'C:/Users/Patsorn/Documents/notebook_backup/SBIR/photos/'\n", 288 | "sketch_paths = 'C:/Users/Patsorn/Documents/notebook_backup/SBIR/sketches/'" 289 | ] 290 | }, 291 | { 292 | "cell_type": "code", 293 | "execution_count": 15, 294 | "metadata": { 295 | "collapsed": true 296 | }, 297 | "outputs": [], 298 | "source": [ 299 | "#load up test images\n", 300 | "with open('../list/test_img_list.txt','r') as my_file:\n", 301 | " test_img_list = [c.rstrip() for c in my_file.readlines()]" 302 | ] 303 | }, 304 | { 305 | "cell_type": "code", 306 | "execution_count": 39, 307 | "metadata": { 308 | "collapsed": false 309 | }, 310 | "outputs": [ 311 | { 312 | "name": "stdout", 313 | "output_type": "stream", 314 | "text": [ 315 | "1250/1250 Extracting sailboat/n04128499_654.jpg... 
done\n" 316 | ] 317 | } 318 | ], 319 | "source": [ 320 | "#extract features for all test images\n", 321 | "feats = []\n", 322 | "N = np.shape(test_img_list)[0]\n", 323 | "for i,path in enumerate(test_img_list):\n", 324 | " imgname = path.split('/')[-1]\n", 325 | " imgname = imgname.split('.jpg')[0]\n", 326 | " imgcat = path.split('/')[0]\n", 327 | " print '\\r',str(i+1)+'/'+str(N)+ ' '+'Extracting ' +path+'...',\n", 328 | " full_path = photo_paths + path\n", 329 | " img = (transformer.preprocess('data', caffe.io.load_image(full_path.rstrip())))\n", 330 | " img_in = np.reshape([img],np.shape(sketch_net.blobs['data'].data))\n", 331 | " out_img = img_net.forward(data=img_in)\n", 332 | " out_img = np.copy(out_img[output_layer_image]) \n", 333 | " feats.append(out_img)\n", 334 | " print 'done',\n", 335 | "np.shape(feats)\n", 336 | "feats = np.resize(feats,[np.shape(feats)[0],np.shape(feats)[2]]) #quick fix for size" 337 | ] 338 | }, 339 | { 340 | "cell_type": "code", 341 | "execution_count": 44, 342 | "metadata": { 343 | "collapsed": false 344 | }, 345 | "outputs": [], 346 | "source": [ 347 | "#build nn pool\n", 348 | "from sklearn.neighbors import NearestNeighbors\n", 349 | "nbrs = NearestNeighbors(n_neighbors=np.size(feats,0), algorithm='brute',metric='cosine').fit(feats)" 350 | ] 351 | }, 352 | { 353 | "cell_type": "code", 354 | "execution_count": 45, 355 | "metadata": { 356 | "collapsed": false 357 | }, 358 | "outputs": [ 359 | { 360 | "name": "stdout", 361 | "output_type": "stream", 362 | "text": [ 363 | "ranking: sailboat n04128499_654-5-5.png found at 0 \n", 364 | "Recall @K=1 = 0.371039290241\n" 365 | ] 366 | } 367 | ], 368 | "source": [ 369 | "#compute score\n", 370 | "\n", 371 | "num_query = 0\n", 372 | "count_recall = [0]*1250\n", 373 | "sum_rank = 0\n", 374 | "sum_class_rank = [0]*125\n", 375 | "count_recall_class = np.zeros((125,1250),float)\n", 376 | "i_coco =-1\n", 377 | "for i,img in enumerate(test_img_list):\n", 378 | " imgname = 
img.split('/')[-1]\n", 379 | " imgname = imgname.split('.jpg')[0]\n", 380 | " imgcat = img.split('/')[0]\n", 381 | " \n", 382 | " sketch_list = os.listdir(sketch_paths+imgcat)\n", 383 | " sketch_img_list = [skg for skg in sketch_list if skg.startswith(imgname+'-') and skg.endswith('-5.png')]#change the '-5.png' suffix in skg.endswith to the variation you want\n", 384 | " for sketch in sketch_img_list:\n", 385 | " sketch_path = sketch_paths + imgcat+'/' + sketch\n", 386 | " sketch_in = (transformer.preprocess('data', plt.imread(sketch_path)))\n", 387 | " sketch_in = np.reshape([sketch_in],np.shape(sketch_net.blobs['data'].data))\n", 388 | " query = sketch_net.forward(data=sketch_in)\n", 389 | " query=np.copy(query[output_layer_sketch])\n", 390 | " distances, indices = nbrs.kneighbors(np.reshape(query,[1,np.shape(query)[1]])) #kneighbors expects a 2D (n_queries, n_features) array\n", 391 | " num_query = num_query+1\n", 392 | " print '\\r','...'+sketch+'...',\n", 393 | "\n", 394 | " for j,indice in enumerate(indices[0]):\n", 395 | " if indice==i:\n", 396 | " #this j is the right one.\n", 397 | " count_recall[j] = count_recall[j]+1\n", 398 | " print '\\r','ranking: '+imgcat+ ' '+sketch + ' found at ' +str(j),\n", 399 | " break\n", 400 | " \n", 401 | "cum_count = [0]*1250\n", 402 | "sumc = 0\n", 403 | "for i,c in enumerate(count_recall):\n", 404 | " sumc = sumc + c\n", 405 | " cum_count[i] = sumc\n", 406 | "print '\\nRecall @K=1 = ', 1.00*cum_count[0]/cum_count[-1]" 407 | ] 408 | } 409 | ], 410 | "metadata": { 411 | "kernelspec": { 412 | "display_name": "Python 2", 413 | "language": "python", 414 | "name": "python2" 415 | }, 416 | "language_info": { 417 | "codemirror_mode": { 418 | "name": "ipython", 419 | "version": 2 420 | }, 421 | "file_extension": ".py", 422 | "mimetype": "text/x-python", 423 | "name": "python", 424 | "nbconvert_exporter": "python", 425 | "pygments_lexer": "ipython2", 426 | "version": "2.7.11" 427 | } 428 | }, 429 | "nbformat": 4, 430 | "nbformat_minor": 0 431 | } 432 | 
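The scoring cell in `Compute_Sketchy_score.ipynb` records, for every sketch query, the rank at which its paired photo appears in the nearest-neighbour list, then accumulates those counts into Recall@K. The snippet below is a minimal standalone sketch of that bookkeeping only, assuming the ranks have already been collected. The function name `recall_at_k` and the toy ranks are hypothetical and not part of the notebook.

```python
def recall_at_k(ranks, num_gallery):
    """Return a list r where r[k-1] is Recall@k.

    ranks: 0-based rank at which each query's paired photo was retrieved.
    num_gallery: number of photos in the retrieval pool.
    """
    # counts[j] = number of queries whose paired photo was found at rank j
    counts = [0] * num_gallery
    for r in ranks:
        counts[r] += 1

    # Running sum of counts = number of queries matched within the top k
    total = float(len(ranks))
    recall, running = [], 0
    for c in counts:
        running += c
        recall.append(running / total)
    return recall


# Toy data (hypothetical): five sketch queries over a pool of four photos.
recall = recall_at_k([0, 2, 0, 1, 3], num_gallery=4)
print(recall[0])  # Recall@1
```

Because every query is ranked against the full pool, the last entry is always 1.0, which is why the notebook can normalise by `cum_count[-1]`.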
-------------------------------------------------------------------------------- /code/README: -------------------------------------------------------------------------------- 1 | These are the scripts I used in the paper. You'll need IPython Notebook to run them. 2 | 3 | If you have any questions about the code, contact me at patsorn.sangkloy@gmail.com 4 | 5 | -Patsorn 6 | -------------------------------------------------------------------------------- /list/README: -------------------------------------------------------------------------------- 1 | 2 | -------------------------------------------------------------------------------- /models/triplet_googlenet/README: -------------------------------------------------------------------------------- 1 | 2 | -------------------------------------------------------------------------------- /models/triplet_googlenet/Triplet_googlenet_imagedeploy.prototxt: -------------------------------------------------------------------------------- 1 | name: "sketch_siamese_train_test" 2 | input: "data" 3 | force_backward: true 4 | input_shape { 5 | dim: 1 6 | dim: 3 7 | dim: 224 8 | dim: 224 9 | } 10 | ####################################################################################### 11 | 12 | layer { 13 | name: "conv1/7x7_s2_p" 14 | type: "Convolution" 15 | bottom: "data" 16 | top: "conv1/7x7_s2_p" 17 | param { 18 | lr_mult: 1 19 | decay_mult: 1 20 | } 21 | param { 22 | lr_mult: 2 23 | decay_mult: 0 24 | } 25 | convolution_param { 26 | num_output: 64 27 | pad: 3 28 | kernel_size: 7 29 | stride: 2 30 | weight_filler { 31 | type: "xavier" 32 | std: 0.1 33 | } 34 | bias_filler { 35 | type: "constant" 36 | value: 0.2 37 | } 38 | } 39 | } 40 | layer { 41 | name: "conv1/relu_7x7_p" 42 | type: "ReLU" 43 | bottom: "conv1/7x7_s2_p" 44 | top: "conv1/7x7_s2_p" 45 | } 46 | layer { 47 | name: "pool1/3x3_s2_p" 48 | type: "Pooling" 49 | bottom: "conv1/7x7_s2_p" 50 | top: "pool1/3x3_s2_p" 51 | pooling_param { 52 | pool: MAX 53 | kernel_size: 3 54 | 
stride: 2 55 | } 56 | } 57 | layer { 58 | name: "pool1/norm1_p" 59 | type: "LRN" 60 | bottom: "pool1/3x3_s2_p" 61 | top: "pool1/norm1_p" 62 | lrn_param { 63 | local_size: 5 64 | alpha: 0.0001 65 | beta: 0.75 66 | } 67 | } 68 | layer { 69 | name: "conv2/3x3_reduce_p" 70 | type: "Convolution" 71 | bottom: "pool1/norm1_p" 72 | top: "conv2/3x3_reduce_p" 73 | param { 74 | lr_mult: 1 75 | decay_mult: 1 76 | } 77 | param { 78 | lr_mult: 2 79 | decay_mult: 0 80 | } 81 | convolution_param { 82 | num_output: 64 83 | kernel_size: 1 84 | weight_filler { 85 | type: "xavier" 86 | std: 0.1 87 | } 88 | bias_filler { 89 | type: "constant" 90 | value: 0.2 91 | } 92 | } 93 | } 94 | layer { 95 | name: "conv2/relu_3x3_reduce_p" 96 | type: "ReLU" 97 | bottom: "conv2/3x3_reduce_p" 98 | top: "conv2/3x3_reduce_p" 99 | } 100 | layer { 101 | name: "conv2/3x3_p" 102 | type: "Convolution" 103 | bottom: "conv2/3x3_reduce_p" 104 | top: "conv2/3x3_p" 105 | param { 106 | lr_mult: 1 107 | decay_mult: 1 108 | } 109 | param { 110 | lr_mult: 2 111 | decay_mult: 0 112 | } 113 | convolution_param { 114 | num_output: 192 115 | pad: 1 116 | kernel_size: 3 117 | weight_filler { 118 | type: "xavier" 119 | std: 0.03 120 | } 121 | bias_filler { 122 | type: "constant" 123 | value: 0.2 124 | } 125 | } 126 | } 127 | layer { 128 | name: "conv2/relu_3x3_p" 129 | type: "ReLU" 130 | bottom: "conv2/3x3_p" 131 | top: "conv2/3x3_p" 132 | } 133 | layer { 134 | name: "conv2/norm2_p" 135 | type: "LRN" 136 | bottom: "conv2/3x3_p" 137 | top: "conv2/norm2_p" 138 | lrn_param { 139 | local_size: 5 140 | alpha: 0.0001 141 | beta: 0.75 142 | } 143 | } 144 | layer { 145 | name: "pool2/3x3_s2_p" 146 | type: "Pooling" 147 | bottom: "conv2/norm2_p" 148 | top: "pool2/3x3_s2_p" 149 | pooling_param { 150 | pool: MAX 151 | kernel_size: 3 152 | stride: 2 153 | } 154 | } 155 | layer { 156 | name: "inception_3a/1x1_p" 157 | type: "Convolution" 158 | bottom: "pool2/3x3_s2_p" 159 | top: "inception_3a/1x1_p" 160 | param { 161 | lr_mult: 1 162 
| decay_mult: 1 163 | } 164 | param { 165 | lr_mult: 2 166 | decay_mult: 0 167 | } 168 | convolution_param { 169 | num_output: 64 170 | kernel_size: 1 171 | weight_filler { 172 | type: "xavier" 173 | std: 0.03 174 | } 175 | bias_filler { 176 | type: "constant" 177 | value: 0.2 178 | } 179 | } 180 | } 181 | layer { 182 | name: "inception_3a/relu_1x1_p" 183 | type: "ReLU" 184 | bottom: "inception_3a/1x1_p" 185 | top: "inception_3a/1x1_p" 186 | } 187 | layer { 188 | name: "inception_3a/3x3_reduce_p" 189 | type: "Convolution" 190 | bottom: "pool2/3x3_s2_p" 191 | top: "inception_3a/3x3_reduce_p" 192 | param { 193 | lr_mult: 1 194 | decay_mult: 1 195 | } 196 | param { 197 | lr_mult: 2 198 | decay_mult: 0 199 | } 200 | convolution_param { 201 | num_output: 96 202 | kernel_size: 1 203 | weight_filler { 204 | type: "xavier" 205 | std: 0.09 206 | } 207 | bias_filler { 208 | type: "constant" 209 | value: 0.2 210 | } 211 | } 212 | } 213 | layer { 214 | name: "inception_3a/relu_3x3_reduce_p" 215 | type: "ReLU" 216 | bottom: "inception_3a/3x3_reduce_p" 217 | top: "inception_3a/3x3_reduce_p" 218 | } 219 | layer { 220 | name: "inception_3a/3x3_p" 221 | type: "Convolution" 222 | bottom: "inception_3a/3x3_reduce_p" 223 | top: "inception_3a/3x3_p" 224 | param { 225 | lr_mult: 1 226 | decay_mult: 1 227 | } 228 | param { 229 | lr_mult: 2 230 | decay_mult: 0 231 | } 232 | convolution_param { 233 | num_output: 128 234 | pad: 1 235 | kernel_size: 3 236 | weight_filler { 237 | type: "xavier" 238 | std: 0.03 239 | } 240 | bias_filler { 241 | type: "constant" 242 | value: 0.2 243 | } 244 | } 245 | } 246 | layer { 247 | name: "inception_3a/relu_3x3_p" 248 | type: "ReLU" 249 | bottom: "inception_3a/3x3_p" 250 | top: "inception_3a/3x3_p" 251 | } 252 | layer { 253 | name: "inception_3a/5x5_reduce_p" 254 | type: "Convolution" 255 | bottom: "pool2/3x3_s2_p" 256 | top: "inception_3a/5x5_reduce_p" 257 | param { 258 | lr_mult: 1 259 | decay_mult: 1 260 | } 261 | param { 262 | lr_mult: 2 263 | 
decay_mult: 0 264 | } 265 | convolution_param { 266 | num_output: 16 267 | kernel_size: 1 268 | weight_filler { 269 | type: "xavier" 270 | std: 0.2 271 | } 272 | bias_filler { 273 | type: "constant" 274 | value: 0.2 275 | } 276 | } 277 | } 278 | layer { 279 | name: "inception_3a/relu_5x5_reduce_p" 280 | type: "ReLU" 281 | bottom: "inception_3a/5x5_reduce_p" 282 | top: "inception_3a/5x5_reduce_p" 283 | } 284 | layer { 285 | name: "inception_3a/5x5_p" 286 | type: "Convolution" 287 | bottom: "inception_3a/5x5_reduce_p" 288 | top: "inception_3a/5x5_p" 289 | param { 290 | lr_mult: 1 291 | decay_mult: 1 292 | } 293 | param { 294 | lr_mult: 2 295 | decay_mult: 0 296 | } 297 | convolution_param { 298 | num_output: 32 299 | pad: 2 300 | kernel_size: 5 301 | weight_filler { 302 | type: "xavier" 303 | std: 0.03 304 | } 305 | bias_filler { 306 | type: "constant" 307 | value: 0.2 308 | } 309 | } 310 | } 311 | layer { 312 | name: "inception_3a/relu_5x5_p" 313 | type: "ReLU" 314 | bottom: "inception_3a/5x5_p" 315 | top: "inception_3a/5x5_p" 316 | } 317 | layer { 318 | name: "inception_3a/pool_p" 319 | type: "Pooling" 320 | bottom: "pool2/3x3_s2_p" 321 | top: "inception_3a/pool_p" 322 | pooling_param { 323 | pool: MAX 324 | kernel_size: 3 325 | stride: 1 326 | pad: 1 327 | } 328 | } 329 | layer { 330 | name: "inception_3a/pool_proj_p" 331 | type: "Convolution" 332 | bottom: "inception_3a/pool_p" 333 | top: "inception_3a/pool_proj_p" 334 | param { 335 | lr_mult: 1 336 | decay_mult: 1 337 | } 338 | param { 339 | lr_mult: 2 340 | decay_mult: 0 341 | } 342 | convolution_param { 343 | num_output: 32 344 | kernel_size: 1 345 | weight_filler { 346 | type: "xavier" 347 | std: 0.1 348 | } 349 | bias_filler { 350 | type: "constant" 351 | value: 0.2 352 | } 353 | } 354 | } 355 | layer { 356 | name: "inception_3a/relu_pool_proj_p" 357 | type: "ReLU" 358 | bottom: "inception_3a/pool_proj_p" 359 | top: "inception_3a/pool_proj_p" 360 | } 361 | layer { 362 | name: "inception_3a/output_p" 363 | 
type: "Concat" 364 | bottom: "inception_3a/1x1_p" 365 | bottom: "inception_3a/3x3_p" 366 | bottom: "inception_3a/5x5_p" 367 | bottom: "inception_3a/pool_proj_p" 368 | top: "inception_3a/output_p" 369 | } 370 | layer { 371 | name: "inception_3b/1x1_p" 372 | type: "Convolution" 373 | bottom: "inception_3a/output_p" 374 | top: "inception_3b/1x1_p" 375 | param { 376 | lr_mult: 1 377 | decay_mult: 1 378 | } 379 | param { 380 | lr_mult: 2 381 | decay_mult: 0 382 | } 383 | convolution_param { 384 | num_output: 128 385 | kernel_size: 1 386 | weight_filler { 387 | type: "xavier" 388 | std: 0.03 389 | } 390 | bias_filler { 391 | type: "constant" 392 | value: 0.2 393 | } 394 | } 395 | } 396 | layer { 397 | name: "inception_3b/relu_1x1_p" 398 | type: "ReLU" 399 | bottom: "inception_3b/1x1_p" 400 | top: "inception_3b/1x1_p" 401 | } 402 | layer { 403 | name: "inception_3b/3x3_reduce_p" 404 | type: "Convolution" 405 | bottom: "inception_3a/output_p" 406 | top: "inception_3b/3x3_reduce_p" 407 | param { 408 | lr_mult: 1 409 | decay_mult: 1 410 | } 411 | param { 412 | lr_mult: 2 413 | decay_mult: 0 414 | } 415 | convolution_param { 416 | num_output: 128 417 | kernel_size: 1 418 | weight_filler { 419 | type: "xavier" 420 | std: 0.09 421 | } 422 | bias_filler { 423 | type: "constant" 424 | value: 0.2 425 | } 426 | } 427 | } 428 | layer { 429 | name: "inception_3b/relu_3x3_reduce_p" 430 | type: "ReLU" 431 | bottom: "inception_3b/3x3_reduce_p" 432 | top: "inception_3b/3x3_reduce_p" 433 | } 434 | layer { 435 | name: "inception_3b/3x3_p" 436 | type: "Convolution" 437 | bottom: "inception_3b/3x3_reduce_p" 438 | top: "inception_3b/3x3_p" 439 | param { 440 | lr_mult: 1 441 | decay_mult: 1 442 | } 443 | param { 444 | lr_mult: 2 445 | decay_mult: 0 446 | } 447 | convolution_param { 448 | num_output: 192 449 | pad: 1 450 | kernel_size: 3 451 | weight_filler { 452 | type: "xavier" 453 | std: 0.03 454 | } 455 | bias_filler { 456 | type: "constant" 457 | value: 0.2 458 | } 459 | } 460 | } 461 | 
layer { 462 | name: "inception_3b/relu_3x3_p" 463 | type: "ReLU" 464 | bottom: "inception_3b/3x3_p" 465 | top: "inception_3b/3x3_p" 466 | } 467 | layer { 468 | name: "inception_3b/5x5_reduce_p" 469 | type: "Convolution" 470 | bottom: "inception_3a/output_p" 471 | top: "inception_3b/5x5_reduce_p" 472 | param { 473 | lr_mult: 1 474 | decay_mult: 1 475 | } 476 | param { 477 | lr_mult: 2 478 | decay_mult: 0 479 | } 480 | convolution_param { 481 | num_output: 32 482 | kernel_size: 1 483 | weight_filler { 484 | type: "xavier" 485 | std: 0.2 486 | } 487 | bias_filler { 488 | type: "constant" 489 | value: 0.2 490 | } 491 | } 492 | } 493 | layer { 494 | name: "inception_3b/relu_5x5_reduce_p" 495 | type: "ReLU" 496 | bottom: "inception_3b/5x5_reduce_p" 497 | top: "inception_3b/5x5_reduce_p" 498 | } 499 | layer { 500 | name: "inception_3b/5x5_p" 501 | type: "Convolution" 502 | bottom: "inception_3b/5x5_reduce_p" 503 | top: "inception_3b/5x5_p" 504 | param { 505 | lr_mult: 1 506 | decay_mult: 1 507 | } 508 | param { 509 | lr_mult: 2 510 | decay_mult: 0 511 | } 512 | convolution_param { 513 | num_output: 96 514 | pad: 2 515 | kernel_size: 5 516 | weight_filler { 517 | type: "xavier" 518 | std: 0.03 519 | } 520 | bias_filler { 521 | type: "constant" 522 | value: 0.2 523 | } 524 | } 525 | } 526 | layer { 527 | name: "inception_3b/relu_5x5_p" 528 | type: "ReLU" 529 | bottom: "inception_3b/5x5_p" 530 | top: "inception_3b/5x5_p" 531 | } 532 | layer { 533 | name: "inception_3b/pool_p" 534 | type: "Pooling" 535 | bottom: "inception_3a/output_p" 536 | top: "inception_3b/pool_p" 537 | pooling_param { 538 | pool: MAX 539 | kernel_size: 3 540 | stride: 1 541 | pad: 1 542 | } 543 | } 544 | layer { 545 | name: "inception_3b/pool_proj_p" 546 | type: "Convolution" 547 | bottom: "inception_3b/pool_p" 548 | top: "inception_3b/pool_proj_p" 549 | param { 550 | lr_mult: 1 551 | decay_mult: 1 552 | } 553 | param { 554 | lr_mult: 2 555 | decay_mult: 0 556 | } 557 | convolution_param { 558 | 
num_output: 64 559 | kernel_size: 1 560 | weight_filler { 561 | type: "xavier" 562 | std: 0.1 563 | } 564 | bias_filler { 565 | type: "constant" 566 | value: 0.2 567 | } 568 | } 569 | } 570 | layer { 571 | name: "inception_3b/relu_pool_proj_p" 572 | type: "ReLU" 573 | bottom: "inception_3b/pool_proj_p" 574 | top: "inception_3b/pool_proj_p" 575 | } 576 | layer { 577 | name: "inception_3b/output_p" 578 | type: "Concat" 579 | bottom: "inception_3b/1x1_p" 580 | bottom: "inception_3b/3x3_p" 581 | bottom: "inception_3b/5x5_p" 582 | bottom: "inception_3b/pool_proj_p" 583 | top: "inception_3b/output_p" 584 | } 585 | layer { 586 | name: "pool3/3x3_s2_p" 587 | type: "Pooling" 588 | bottom: "inception_3b/output_p" 589 | top: "pool3/3x3_s2_p" 590 | pooling_param { 591 | pool: MAX 592 | kernel_size: 3 593 | stride: 2 594 | } 595 | } 596 | layer { 597 | name: "inception_4a/1x1_p" 598 | type: "Convolution" 599 | bottom: "pool3/3x3_s2_p" 600 | top: "inception_4a/1x1_p" 601 | param { 602 | lr_mult: 1 603 | decay_mult: 1 604 | } 605 | param { 606 | lr_mult: 2 607 | decay_mult: 0 608 | } 609 | convolution_param { 610 | num_output: 192 611 | kernel_size: 1 612 | weight_filler { 613 | type: "xavier" 614 | std: 0.03 615 | } 616 | bias_filler { 617 | type: "constant" 618 | value: 0.2 619 | } 620 | } 621 | } 622 | layer { 623 | name: "inception_4a/relu_1x1_p" 624 | type: "ReLU" 625 | bottom: "inception_4a/1x1_p" 626 | top: "inception_4a/1x1_p" 627 | } 628 | layer { 629 | name: "inception_4a/3x3_reduce_p" 630 | type: "Convolution" 631 | bottom: "pool3/3x3_s2_p" 632 | top: "inception_4a/3x3_reduce_p" 633 | param { 634 | lr_mult: 1 635 | decay_mult: 1 636 | } 637 | param { 638 | lr_mult: 2 639 | decay_mult: 0 640 | } 641 | convolution_param { 642 | num_output: 96 643 | kernel_size: 1 644 | weight_filler { 645 | type: "xavier" 646 | std: 0.09 647 | } 648 | bias_filler { 649 | type: "constant" 650 | value: 0.2 651 | } 652 | } 653 | } 654 | layer { 655 | name: "inception_4a/relu_3x3_reduce_p" 
656 | type: "ReLU" 657 | bottom: "inception_4a/3x3_reduce_p" 658 | top: "inception_4a/3x3_reduce_p" 659 | } 660 | layer { 661 | name: "inception_4a/3x3_p" 662 | type: "Convolution" 663 | bottom: "inception_4a/3x3_reduce_p" 664 | top: "inception_4a/3x3_p" 665 | param { 666 | lr_mult: 1 667 | decay_mult: 1 668 | } 669 | param { 670 | lr_mult: 2 671 | decay_mult: 0 672 | } 673 | convolution_param { 674 | num_output: 208 675 | pad: 1 676 | kernel_size: 3 677 | weight_filler { 678 | type: "xavier" 679 | std: 0.03 680 | } 681 | bias_filler { 682 | type: "constant" 683 | value: 0.2 684 | } 685 | } 686 | } 687 | layer { 688 | name: "inception_4a/relu_3x3_p" 689 | type: "ReLU" 690 | bottom: "inception_4a/3x3_p" 691 | top: "inception_4a/3x3_p" 692 | } 693 | layer { 694 | name: "inception_4a/5x5_reduce_p" 695 | type: "Convolution" 696 | bottom: "pool3/3x3_s2_p" 697 | top: "inception_4a/5x5_reduce_p" 698 | param { 699 | lr_mult: 1 700 | decay_mult: 1 701 | } 702 | param { 703 | lr_mult: 2 704 | decay_mult: 0 705 | } 706 | convolution_param { 707 | num_output: 16 708 | kernel_size: 1 709 | weight_filler { 710 | type: "xavier" 711 | std: 0.2 712 | } 713 | bias_filler { 714 | type: "constant" 715 | value: 0.2 716 | } 717 | } 718 | } 719 | layer { 720 | name: "inception_4a/relu_5x5_reduce_p" 721 | type: "ReLU" 722 | bottom: "inception_4a/5x5_reduce_p" 723 | top: "inception_4a/5x5_reduce_p" 724 | } 725 | layer { 726 | name: "inception_4a/5x5_p" 727 | type: "Convolution" 728 | bottom: "inception_4a/5x5_reduce_p" 729 | top: "inception_4a/5x5_p" 730 | param { 731 | lr_mult: 1 732 | decay_mult: 1 733 | } 734 | param { 735 | lr_mult: 2 736 | decay_mult: 0 737 | } 738 | convolution_param { 739 | num_output: 48 740 | pad: 2 741 | kernel_size: 5 742 | weight_filler { 743 | type: "xavier" 744 | std: 0.03 745 | } 746 | bias_filler { 747 | type: "constant" 748 | value: 0.2 749 | } 750 | } 751 | } 752 | layer { 753 | name: "inception_4a/relu_5x5_p" 754 | type: "ReLU" 755 | bottom: 
"inception_4a/5x5_p" 756 | top: "inception_4a/5x5_p" 757 | } 758 | layer { 759 | name: "inception_4a/pool_p" 760 | type: "Pooling" 761 | bottom: "pool3/3x3_s2_p" 762 | top: "inception_4a/pool_p" 763 | pooling_param { 764 | pool: MAX 765 | kernel_size: 3 766 | stride: 1 767 | pad: 1 768 | } 769 | } 770 | layer { 771 | name: "inception_4a/pool_proj_p" 772 | type: "Convolution" 773 | bottom: "inception_4a/pool_p" 774 | top: "inception_4a/pool_proj_p" 775 | param { 776 | lr_mult: 1 777 | decay_mult: 1 778 | } 779 | param { 780 | lr_mult: 2 781 | decay_mult: 0 782 | } 783 | convolution_param { 784 | num_output: 64 785 | kernel_size: 1 786 | weight_filler { 787 | type: "xavier" 788 | std: 0.1 789 | } 790 | bias_filler { 791 | type: "constant" 792 | value: 0.2 793 | } 794 | } 795 | } 796 | layer { 797 | name: "inception_4a/relu_pool_proj_p" 798 | type: "ReLU" 799 | bottom: "inception_4a/pool_proj_p" 800 | top: "inception_4a/pool_proj_p" 801 | } 802 | layer { 803 | name: "inception_4a/output_p" 804 | type: "Concat" 805 | bottom: "inception_4a/1x1_p" 806 | bottom: "inception_4a/3x3_p" 807 | bottom: "inception_4a/5x5_p" 808 | bottom: "inception_4a/pool_proj_p" 809 | top: "inception_4a/output_p" 810 | } 811 | layer { 812 | name: "loss1/ave_pool_p" 813 | type: "Pooling" 814 | bottom: "inception_4a/output_p" 815 | top: "loss1/ave_pool_p" 816 | pooling_param { 817 | pool: AVE 818 | kernel_size: 5 819 | stride: 3 820 | } 821 | } 822 | layer { 823 | name: "loss1/conv_p" 824 | type: "Convolution" 825 | bottom: "loss1/ave_pool_p" 826 | top: "loss1/conv_p" 827 | param { 828 | lr_mult: 1 829 | decay_mult: 1 830 | } 831 | param { 832 | lr_mult: 2 833 | decay_mult: 0 834 | } 835 | convolution_param { 836 | num_output: 128 837 | kernel_size: 1 838 | weight_filler { 839 | type: "xavier" 840 | std: 0.08 841 | } 842 | bias_filler { 843 | type: "constant" 844 | value: 0.2 845 | } 846 | } 847 | } 848 | layer { 849 | name: "loss1/relu_conv_p" 850 | type: "ReLU" 851 | bottom: "loss1/conv_p" 852 
| top: "loss1/conv_p" 853 | } 854 | layer { 855 | name: "loss1/fc_p" 856 | type: "InnerProduct" 857 | bottom: "loss1/conv_p" 858 | top: "loss1/fc_p" 859 | param { 860 | lr_mult: 1 861 | decay_mult: 1 862 | } 863 | param { 864 | lr_mult: 2 865 | decay_mult: 0 866 | } 867 | inner_product_param { 868 | num_output: 1024 869 | weight_filler { 870 | type: "xavier" 871 | std: 0.02 872 | } 873 | bias_filler { 874 | type: "constant" 875 | value: 0.2 876 | } 877 | } 878 | } 879 | layer { 880 | name: "loss1/relu_fc_p" 881 | type: "ReLU" 882 | bottom: "loss1/fc_p" 883 | top: "loss1/fc_p" 884 | } 885 | layer { 886 | name: "loss1/drop_fc_p" 887 | type: "Dropout" 888 | bottom: "loss1/fc_p" 889 | top: "loss1/fc_p" 890 | dropout_param { 891 | dropout_ratio: 0.7 892 | } 893 | } 894 | layer { 895 | name: "inception_4b/1x1_p" 896 | type: "Convolution" 897 | bottom: "inception_4a/output_p" 898 | top: "inception_4b/1x1_p" 899 | param { 900 | lr_mult: 1 901 | decay_mult: 1 902 | } 903 | param { 904 | lr_mult: 2 905 | decay_mult: 0 906 | } 907 | convolution_param { 908 | num_output: 160 909 | kernel_size: 1 910 | weight_filler { 911 | type: "xavier" 912 | std: 0.03 913 | } 914 | bias_filler { 915 | type: "constant" 916 | value: 0.2 917 | } 918 | } 919 | } 920 | layer { 921 | name: "inception_4b/relu_1x1_p" 922 | type: "ReLU" 923 | bottom: "inception_4b/1x1_p" 924 | top: "inception_4b/1x1_p" 925 | } 926 | layer { 927 | name: "inception_4b/3x3_reduce_p" 928 | type: "Convolution" 929 | bottom: "inception_4a/output_p" 930 | top: "inception_4b/3x3_reduce_p" 931 | param { 932 | lr_mult: 1 933 | decay_mult: 1 934 | } 935 | param { 936 | lr_mult: 2 937 | decay_mult: 0 938 | } 939 | convolution_param { 940 | num_output: 112 941 | kernel_size: 1 942 | weight_filler { 943 | type: "xavier" 944 | std: 0.09 945 | } 946 | bias_filler { 947 | type: "constant" 948 | value: 0.2 949 | } 950 | } 951 | } 952 | layer { 953 | name: "inception_4b/relu_3x3_reduce_p" 954 | type: "ReLU" 955 | bottom: 
"inception_4b/3x3_reduce_p" 956 | top: "inception_4b/3x3_reduce_p" 957 | } 958 | layer { 959 | name: "inception_4b/3x3_p" 960 | type: "Convolution" 961 | bottom: "inception_4b/3x3_reduce_p" 962 | top: "inception_4b/3x3_p" 963 | param { 964 | lr_mult: 1 965 | decay_mult: 1 966 | } 967 | param { 968 | lr_mult: 2 969 | decay_mult: 0 970 | 971 | } 972 | convolution_param { 973 | num_output: 224 974 | pad: 1 975 | kernel_size: 3 976 | weight_filler { 977 | type: "xavier" 978 | std: 0.03 979 | } 980 | bias_filler { 981 | type: "constant" 982 | value: 0.2 983 | } 984 | } 985 | } 986 | layer { 987 | name: "inception_4b/relu_3x3_p" 988 | type: "ReLU" 989 | bottom: "inception_4b/3x3_p" 990 | top: "inception_4b/3x3_p" 991 | } 992 | layer { 993 | name: "inception_4b/5x5_reduce_p" 994 | type: "Convolution" 995 | bottom: "inception_4a/output_p" 996 | top: "inception_4b/5x5_reduce_p" 997 | param { 998 | lr_mult: 1 999 | decay_mult: 1 1000 | } 1001 | param { 1002 | lr_mult: 2 1003 | decay_mult: 0 1004 | } 1005 | convolution_param { 1006 | num_output: 24 1007 | kernel_size: 1 1008 | weight_filler { 1009 | type: "xavier" 1010 | std: 0.2 1011 | } 1012 | bias_filler { 1013 | type: "constant" 1014 | value: 0.2 1015 | } 1016 | } 1017 | } 1018 | layer { 1019 | name: "inception_4b/relu_5x5_reduce_p" 1020 | type: "ReLU" 1021 | bottom: "inception_4b/5x5_reduce_p" 1022 | top: "inception_4b/5x5_reduce_p" 1023 | } 1024 | layer { 1025 | name: "inception_4b/5x5_p" 1026 | type: "Convolution" 1027 | bottom: "inception_4b/5x5_reduce_p" 1028 | top: "inception_4b/5x5_p" 1029 | param { 1030 | lr_mult: 1 1031 | decay_mult: 1 1032 | } 1033 | param { 1034 | lr_mult: 2 1035 | decay_mult: 0 1036 | } 1037 | convolution_param { 1038 | num_output: 64 1039 | pad: 2 1040 | kernel_size: 5 1041 | weight_filler { 1042 | type: "xavier" 1043 | std: 0.03 1044 | } 1045 | bias_filler { 1046 | type: "constant" 1047 | value: 0.2 1048 | } 1049 | } 1050 | } 1051 | layer { 1052 | name: "inception_4b/relu_5x5_p" 1053 | type: 
"ReLU" 1054 | bottom: "inception_4b/5x5_p" 1055 | top: "inception_4b/5x5_p" 1056 | } 1057 | layer { 1058 | name: "inception_4b/pool_p" 1059 | type: "Pooling" 1060 | bottom: "inception_4a/output_p" 1061 | top: "inception_4b/pool_p" 1062 | pooling_param { 1063 | pool: MAX 1064 | kernel_size: 3 1065 | stride: 1 1066 | pad: 1 1067 | } 1068 | } 1069 | layer { 1070 | name: "inception_4b/pool_proj_p" 1071 | type: "Convolution" 1072 | bottom: "inception_4b/pool_p" 1073 | top: "inception_4b/pool_proj_p" 1074 | param { 1075 | lr_mult: 1 1076 | decay_mult: 1 1077 | } 1078 | param { 1079 | lr_mult: 2 1080 | decay_mult: 0 1081 | } 1082 | convolution_param { 1083 | num_output: 64 1084 | kernel_size: 1 1085 | weight_filler { 1086 | type: "xavier" 1087 | std: 0.1 1088 | } 1089 | bias_filler { 1090 | type: "constant" 1091 | value: 0.2 1092 | } 1093 | } 1094 | } 1095 | layer { 1096 | name: "inception_4b/relu_pool_proj_p" 1097 | type: "ReLU" 1098 | bottom: "inception_4b/pool_proj_p" 1099 | top: "inception_4b/pool_proj_p" 1100 | } 1101 | layer { 1102 | name: "inception_4b/output_p" 1103 | type: "Concat" 1104 | bottom: "inception_4b/1x1_p" 1105 | bottom: "inception_4b/3x3_p" 1106 | bottom: "inception_4b/5x5_p" 1107 | bottom: "inception_4b/pool_proj_p" 1108 | top: "inception_4b/output_p" 1109 | } 1110 | layer { 1111 | name: "inception_4c/1x1_p" 1112 | type: "Convolution" 1113 | bottom: "inception_4b/output_p" 1114 | top: "inception_4c/1x1_p" 1115 | param { 1116 | lr_mult: 1 1117 | decay_mult: 1 1118 | } 1119 | param { 1120 | lr_mult: 2 1121 | decay_mult: 0 1122 | } 1123 | convolution_param { 1124 | num_output: 128 1125 | kernel_size: 1 1126 | weight_filler { 1127 | type: "xavier" 1128 | std: 0.03 1129 | } 1130 | bias_filler { 1131 | type: "constant" 1132 | value: 0.2 1133 | } 1134 | } 1135 | } 1136 | layer { 1137 | name: "inception_4c/relu_1x1_p" 1138 | type: "ReLU" 1139 | bottom: "inception_4c/1x1_p" 1140 | top: "inception_4c/1x1_p" 1141 | } 1142 | layer { 1143 | name: 
"inception_4c/3x3_reduce_p" 1144 | type: "Convolution" 1145 | bottom: "inception_4b/output_p" 1146 | top: "inception_4c/3x3_reduce_p" 1147 | param { 1148 | lr_mult: 1 1149 | decay_mult: 1 1150 | } 1151 | param { 1152 | lr_mult: 2 1153 | decay_mult: 0 1154 | } 1155 | convolution_param { 1156 | num_output: 128 1157 | kernel_size: 1 1158 | weight_filler { 1159 | type: "xavier" 1160 | std: 0.09 1161 | } 1162 | bias_filler { 1163 | type: "constant" 1164 | value: 0.2 1165 | } 1166 | } 1167 | } 1168 | layer { 1169 | name: "inception_4c/relu_3x3_reduce_p" 1170 | type: "ReLU" 1171 | bottom: "inception_4c/3x3_reduce_p" 1172 | top: "inception_4c/3x3_reduce_p" 1173 | } 1174 | layer { 1175 | name: "inception_4c/3x3_p" 1176 | type: "Convolution" 1177 | bottom: "inception_4c/3x3_reduce_p" 1178 | top: "inception_4c/3x3_p" 1179 | param { 1180 | lr_mult: 1 1181 | decay_mult: 1 1182 | } 1183 | param { 1184 | lr_mult: 2 1185 | decay_mult: 0 1186 | } 1187 | convolution_param { 1188 | num_output: 256 1189 | pad: 1 1190 | kernel_size: 3 1191 | weight_filler { 1192 | type: "xavier" 1193 | std: 0.03 1194 | } 1195 | bias_filler { 1196 | type: "constant" 1197 | value: 0.2 1198 | } 1199 | } 1200 | } 1201 | layer { 1202 | name: "inception_4c/relu_3x3_p" 1203 | type: "ReLU" 1204 | bottom: "inception_4c/3x3_p" 1205 | top: "inception_4c/3x3_p" 1206 | } 1207 | layer { 1208 | name: "inception_4c/5x5_reduce_p" 1209 | type: "Convolution" 1210 | bottom: "inception_4b/output_p" 1211 | top: "inception_4c/5x5_reduce_p" 1212 | param { 1213 | lr_mult: 1 1214 | decay_mult: 1 1215 | } 1216 | param { 1217 | lr_mult: 2 1218 | decay_mult: 0 1219 | } 1220 | convolution_param { 1221 | num_output: 24 1222 | kernel_size: 1 1223 | weight_filler { 1224 | type: "xavier" 1225 | std: 0.2 1226 | } 1227 | bias_filler { 1228 | type: "constant" 1229 | value: 0.2 1230 | } 1231 | } 1232 | } 1233 | layer { 1234 | name: "inception_4c/relu_5x5_reduce_p" 1235 | type: "ReLU" 1236 | bottom: "inception_4c/5x5_reduce_p" 1237 | top: 
"inception_4c/5x5_reduce_p" 1238 | } 1239 | layer { 1240 | name: "inception_4c/5x5_p" 1241 | type: "Convolution" 1242 | bottom: "inception_4c/5x5_reduce_p" 1243 | top: "inception_4c/5x5_p" 1244 | param { 1245 | lr_mult: 1 1246 | decay_mult: 1 1247 | } 1248 | param { 1249 | lr_mult: 2 1250 | decay_mult: 0 1251 | } 1252 | convolution_param { 1253 | num_output: 64 1254 | pad: 2 1255 | kernel_size: 5 1256 | weight_filler { 1257 | type: "xavier" 1258 | std: 0.03 1259 | } 1260 | bias_filler { 1261 | type: "constant" 1262 | value: 0.2 1263 | } 1264 | } 1265 | } 1266 | layer { 1267 | name: "inception_4c/relu_5x5_p" 1268 | type: "ReLU" 1269 | bottom: "inception_4c/5x5_p" 1270 | top: "inception_4c/5x5_p" 1271 | } 1272 | layer { 1273 | name: "inception_4c/pool_p" 1274 | type: "Pooling" 1275 | bottom: "inception_4b/output_p" 1276 | top: "inception_4c/pool_p" 1277 | pooling_param { 1278 | pool: MAX 1279 | kernel_size: 3 1280 | stride: 1 1281 | pad: 1 1282 | } 1283 | } 1284 | layer { 1285 | name: "inception_4c/pool_proj_p" 1286 | type: "Convolution" 1287 | bottom: "inception_4c/pool_p" 1288 | top: "inception_4c/pool_proj_p" 1289 | param { 1290 | lr_mult: 1 1291 | decay_mult: 1 1292 | } 1293 | param { 1294 | lr_mult: 2 1295 | decay_mult: 0 1296 | } 1297 | convolution_param { 1298 | num_output: 64 1299 | kernel_size: 1 1300 | weight_filler { 1301 | type: "xavier" 1302 | std: 0.1 1303 | } 1304 | bias_filler { 1305 | type: "constant" 1306 | value: 0.2 1307 | } 1308 | } 1309 | } 1310 | layer { 1311 | name: "inception_4c/relu_pool_proj_p" 1312 | type: "ReLU" 1313 | bottom: "inception_4c/pool_proj_p" 1314 | top: "inception_4c/pool_proj_p" 1315 | } 1316 | layer { 1317 | name: "inception_4c/output_p" 1318 | type: "Concat" 1319 | bottom: "inception_4c/1x1_p" 1320 | bottom: "inception_4c/3x3_p" 1321 | bottom: "inception_4c/5x5_p" 1322 | bottom: "inception_4c/pool_proj_p" 1323 | top: "inception_4c/output_p" 1324 | } 1325 | layer { 1326 | name: "inception_4d/1x1_p" 1327 | type: "Convolution" 
1328 | bottom: "inception_4c/output_p" 1329 | top: "inception_4d/1x1_p" 1330 | param { 1331 | lr_mult: 1 1332 | decay_mult: 1 1333 | } 1334 | param { 1335 | lr_mult: 2 1336 | decay_mult: 0 1337 | } 1338 | convolution_param { 1339 | num_output: 112 1340 | kernel_size: 1 1341 | weight_filler { 1342 | type: "xavier" 1343 | std: 0.03 1344 | } 1345 | bias_filler { 1346 | type: "constant" 1347 | value: 0.2 1348 | } 1349 | } 1350 | } 1351 | layer { 1352 | name: "inception_4d/relu_1x1_p" 1353 | type: "ReLU" 1354 | bottom: "inception_4d/1x1_p" 1355 | top: "inception_4d/1x1_p" 1356 | } 1357 | layer { 1358 | name: "inception_4d/3x3_reduce_p" 1359 | type: "Convolution" 1360 | bottom: "inception_4c/output_p" 1361 | top: "inception_4d/3x3_reduce_p" 1362 | param { 1363 | lr_mult: 1 1364 | decay_mult: 1 1365 | } 1366 | param { 1367 | lr_mult: 2 1368 | decay_mult: 0 1369 | } 1370 | convolution_param { 1371 | num_output: 144 1372 | kernel_size: 1 1373 | weight_filler { 1374 | type: "xavier" 1375 | std: 0.09 1376 | } 1377 | bias_filler { 1378 | type: "constant" 1379 | value: 0.2 1380 | } 1381 | } 1382 | } 1383 | layer { 1384 | name: "inception_4d/relu_3x3_reduce_p" 1385 | type: "ReLU" 1386 | bottom: "inception_4d/3x3_reduce_p" 1387 | top: "inception_4d/3x3_reduce_p" 1388 | } 1389 | layer { 1390 | name: "inception_4d/3x3_p" 1391 | type: "Convolution" 1392 | bottom: "inception_4d/3x3_reduce_p" 1393 | top: "inception_4d/3x3_p" 1394 | param { 1395 | lr_mult: 1 1396 | decay_mult: 1 1397 | } 1398 | param { 1399 | lr_mult: 2 1400 | decay_mult: 0 1401 | } 1402 | convolution_param { 1403 | num_output: 288 1404 | pad: 1 1405 | kernel_size: 3 1406 | weight_filler { 1407 | type: "xavier" 1408 | std: 0.03 1409 | } 1410 | bias_filler { 1411 | type: "constant" 1412 | value: 0.2 1413 | } 1414 | } 1415 | } 1416 | layer { 1417 | name: "inception_4d/relu_3x3_p" 1418 | type: "ReLU" 1419 | bottom: "inception_4d/3x3_p" 1420 | top: "inception_4d/3x3_p" 1421 | } 1422 | layer { 1423 | name: 
"inception_4d/5x5_reduce_p" 1424 | type: "Convolution" 1425 | bottom: "inception_4c/output_p" 1426 | top: "inception_4d/5x5_reduce_p" 1427 | param { 1428 | lr_mult: 1 1429 | decay_mult: 1 1430 | } 1431 | param { 1432 | lr_mult: 2 1433 | decay_mult: 0 1434 | } 1435 | convolution_param { 1436 | num_output: 32 1437 | kernel_size: 1 1438 | weight_filler { 1439 | type: "xavier" 1440 | std: 0.2 1441 | } 1442 | bias_filler { 1443 | type: "constant" 1444 | value: 0.2 1445 | } 1446 | } 1447 | } 1448 | layer { 1449 | name: "inception_4d/relu_5x5_reduce_p" 1450 | type: "ReLU" 1451 | bottom: "inception_4d/5x5_reduce_p" 1452 | top: "inception_4d/5x5_reduce_p" 1453 | } 1454 | layer { 1455 | name: "inception_4d/5x5_p" 1456 | type: "Convolution" 1457 | bottom: "inception_4d/5x5_reduce_p" 1458 | top: "inception_4d/5x5_p" 1459 | param { 1460 | lr_mult: 1 1461 | decay_mult: 1 1462 | } 1463 | param { 1464 | lr_mult: 2 1465 | decay_mult: 0 1466 | } 1467 | convolution_param { 1468 | num_output: 64 1469 | pad: 2 1470 | kernel_size: 5 1471 | weight_filler { 1472 | type: "xavier" 1473 | std: 0.03 1474 | } 1475 | bias_filler { 1476 | type: "constant" 1477 | value: 0.2 1478 | } 1479 | } 1480 | } 1481 | layer { 1482 | name: "inception_4d/relu_5x5_p" 1483 | type: "ReLU" 1484 | bottom: "inception_4d/5x5_p" 1485 | top: "inception_4d/5x5_p" 1486 | } 1487 | layer { 1488 | name: "inception_4d/pool_p" 1489 | type: "Pooling" 1490 | bottom: "inception_4c/output_p" 1491 | top: "inception_4d/pool_p" 1492 | pooling_param { 1493 | pool: MAX 1494 | kernel_size: 3 1495 | stride: 1 1496 | pad: 1 1497 | } 1498 | } 1499 | layer { 1500 | name: "inception_4d/pool_proj_p" 1501 | type: "Convolution" 1502 | bottom: "inception_4d/pool_p" 1503 | top: "inception_4d/pool_proj_p" 1504 | param { 1505 | lr_mult: 1 1506 | decay_mult: 1 1507 | } 1508 | param { 1509 | lr_mult: 2 1510 | decay_mult: 0 1511 | } 1512 | convolution_param { 1513 | num_output: 64 1514 | kernel_size: 1 1515 | weight_filler { 1516 | type: "xavier" 
1517 | std: 0.1 1518 | } 1519 | bias_filler { 1520 | type: "constant" 1521 | value: 0.2 1522 | } 1523 | } 1524 | } 1525 | layer { 1526 | name: "inception_4d/relu_pool_proj_p" 1527 | type: "ReLU" 1528 | bottom: "inception_4d/pool_proj_p" 1529 | top: "inception_4d/pool_proj_p" 1530 | } 1531 | layer { 1532 | name: "inception_4d/output_p" 1533 | type: "Concat" 1534 | bottom: "inception_4d/1x1_p" 1535 | bottom: "inception_4d/3x3_p" 1536 | bottom: "inception_4d/5x5_p" 1537 | bottom: "inception_4d/pool_proj_p" 1538 | top: "inception_4d/output_p" 1539 | } 1540 | layer { 1541 | name: "loss2/ave_pool_p" 1542 | type: "Pooling" 1543 | bottom: "inception_4d/output_p" 1544 | top: "loss2/ave_pool_p" 1545 | pooling_param { 1546 | pool: AVE 1547 | kernel_size: 5 1548 | stride: 3 1549 | } 1550 | } 1551 | layer { 1552 | name: "loss2/conv_p" 1553 | type: "Convolution" 1554 | bottom: "loss2/ave_pool_p" 1555 | top: "loss2/conv_p" 1556 | param { 1557 | lr_mult: 10 1558 | decay_mult: 1 1559 | } 1560 | param { 1561 | lr_mult: 20 1562 | decay_mult: 0 1563 | } 1564 | convolution_param { 1565 | num_output: 128 1566 | kernel_size: 1 1567 | weight_filler { 1568 | type: "xavier" 1569 | std: 0.08 1570 | } 1571 | bias_filler { 1572 | type: "constant" 1573 | value: 0.2 1574 | } 1575 | } 1576 | } 1577 | layer { 1578 | name: "loss2/relu_conv_p" 1579 | type: "ReLU" 1580 | bottom: "loss2/conv_p" 1581 | top: "loss2/conv_p" 1582 | } 1583 | layer { 1584 | name: "loss2/fc_p" 1585 | type: "InnerProduct" 1586 | bottom: "loss2/conv_p" 1587 | top: "loss2/fc_p" 1588 | param { 1589 | lr_mult: 10 1590 | decay_mult: 1 1591 | } 1592 | param { 1593 | lr_mult: 20 1594 | decay_mult: 0 1595 | } 1596 | inner_product_param { 1597 | num_output: 1024 1598 | weight_filler { 1599 | type: "xavier" 1600 | std: 0.02 1601 | } 1602 | bias_filler { 1603 | type: "constant" 1604 | value: 0.2 1605 | } 1606 | } 1607 | } 1608 | layer { 1609 | name: "loss2/relu_fc_p" 1610 | type: "ReLU" 1611 | bottom: "loss2/fc_p" 1612 | top: 
"loss2/fc_p" 1613 | } 1614 | layer { 1615 | name: "loss2/drop_fc_p" 1616 | type: "Dropout" 1617 | bottom: "loss2/fc_p" 1618 | top: "loss2/fc_p" 1619 | dropout_param { 1620 | dropout_ratio: 0.7 1621 | } 1622 | } 1623 | layer { 1624 | name: "inception_4e/1x1_p" 1625 | type: "Convolution" 1626 | bottom: "inception_4d/output_p" 1627 | top: "inception_4e/1x1_p" 1628 | param { 1629 | lr_mult: 1 1630 | decay_mult: 1 1631 | } 1632 | param { 1633 | lr_mult: 2 1634 | decay_mult: 0 1635 | } 1636 | convolution_param { 1637 | num_output: 256 1638 | kernel_size: 1 1639 | weight_filler { 1640 | type: "xavier" 1641 | std: 0.03 1642 | } 1643 | bias_filler { 1644 | type: "constant" 1645 | value: 0.2 1646 | } 1647 | } 1648 | } 1649 | layer { 1650 | name: "inception_4e/relu_1x1_p" 1651 | type: "ReLU" 1652 | bottom: "inception_4e/1x1_p" 1653 | top: "inception_4e/1x1_p" 1654 | } 1655 | layer { 1656 | name: "inception_4e/3x3_reduce_p" 1657 | type: "Convolution" 1658 | bottom: "inception_4d/output_p" 1659 | top: "inception_4e/3x3_reduce_p" 1660 | param { 1661 | lr_mult: 1 1662 | decay_mult: 1 1663 | } 1664 | param { 1665 | lr_mult: 2 1666 | decay_mult: 0 1667 | } 1668 | convolution_param { 1669 | num_output: 160 1670 | kernel_size: 1 1671 | weight_filler { 1672 | type: "xavier" 1673 | std: 0.09 1674 | } 1675 | bias_filler { 1676 | type: "constant" 1677 | value: 0.2 1678 | } 1679 | } 1680 | } 1681 | layer { 1682 | name: "inception_4e/relu_3x3_reduce_p" 1683 | type: "ReLU" 1684 | bottom: "inception_4e/3x3_reduce_p" 1685 | top: "inception_4e/3x3_reduce_p" 1686 | } 1687 | layer { 1688 | name: "inception_4e/3x3_p" 1689 | type: "Convolution" 1690 | bottom: "inception_4e/3x3_reduce_p" 1691 | top: "inception_4e/3x3_p" 1692 | param { 1693 | lr_mult: 1 1694 | decay_mult: 1 1695 | } 1696 | param { 1697 | lr_mult: 2 1698 | decay_mult: 0 1699 | } 1700 | convolution_param { 1701 | num_output: 320 1702 | pad: 1 1703 | kernel_size: 3 1704 | weight_filler { 1705 | type: "xavier" 1706 | std: 0.03 1707 | } 
1708 | bias_filler { 1709 | type: "constant" 1710 | value: 0.2 1711 | } 1712 | } 1713 | } 1714 | layer { 1715 | name: "inception_4e/relu_3x3_p" 1716 | type: "ReLU" 1717 | bottom: "inception_4e/3x3_p" 1718 | top: "inception_4e/3x3_p" 1719 | } 1720 | layer { 1721 | name: "inception_4e/5x5_reduce_p" 1722 | type: "Convolution" 1723 | bottom: "inception_4d/output_p" 1724 | top: "inception_4e/5x5_reduce_p" 1725 | param { 1726 | lr_mult: 1 1727 | decay_mult: 1 1728 | } 1729 | param { 1730 | lr_mult: 2 1731 | decay_mult: 0 1732 | } 1733 | convolution_param { 1734 | num_output: 32 1735 | kernel_size: 1 1736 | weight_filler { 1737 | type: "xavier" 1738 | std: 0.2 1739 | } 1740 | bias_filler { 1741 | type: "constant" 1742 | value: 0.2 1743 | } 1744 | } 1745 | } 1746 | layer { 1747 | name: "inception_4e/relu_5x5_reduce_p" 1748 | type: "ReLU" 1749 | bottom: "inception_4e/5x5_reduce_p" 1750 | top: "inception_4e/5x5_reduce_p" 1751 | } 1752 | layer { 1753 | name: "inception_4e/5x5_p" 1754 | type: "Convolution" 1755 | bottom: "inception_4e/5x5_reduce_p" 1756 | top: "inception_4e/5x5_p" 1757 | param { 1758 | lr_mult: 1 1759 | decay_mult: 1 1760 | } 1761 | param { 1762 | lr_mult: 2 1763 | decay_mult: 0 1764 | } 1765 | convolution_param { 1766 | num_output: 128 1767 | pad: 2 1768 | kernel_size: 5 1769 | weight_filler { 1770 | type: "xavier" 1771 | std: 0.03 1772 | } 1773 | bias_filler { 1774 | type: "constant" 1775 | value: 0.2 1776 | } 1777 | } 1778 | } 1779 | layer { 1780 | name: "inception_4e/relu_5x5_p" 1781 | type: "ReLU" 1782 | bottom: "inception_4e/5x5_p" 1783 | top: "inception_4e/5x5_p" 1784 | } 1785 | layer { 1786 | name: "inception_4e/pool_p" 1787 | type: "Pooling" 1788 | bottom: "inception_4d/output_p" 1789 | top: "inception_4e/pool_p" 1790 | pooling_param { 1791 | pool: MAX 1792 | kernel_size: 3 1793 | stride: 1 1794 | pad: 1 1795 | } 1796 | } 1797 | layer { 1798 | name: "inception_4e/pool_proj_p" 1799 | type: "Convolution" 1800 | bottom: "inception_4e/pool_p" 1801 | top: 
"inception_4e/pool_proj_p" 1802 | param { 1803 | lr_mult: 1 1804 | decay_mult: 1 1805 | } 1806 | param { 1807 | lr_mult: 2 1808 | decay_mult: 0 1809 | } 1810 | convolution_param { 1811 | num_output: 128 1812 | kernel_size: 1 1813 | weight_filler { 1814 | type: "xavier" 1815 | std: 0.1 1816 | } 1817 | bias_filler { 1818 | type: "constant" 1819 | value: 0.2 1820 | } 1821 | } 1822 | } 1823 | layer { 1824 | name: "inception_4e/relu_pool_proj_p" 1825 | type: "ReLU" 1826 | bottom: "inception_4e/pool_proj_p" 1827 | top: "inception_4e/pool_proj_p" 1828 | } 1829 | layer { 1830 | name: "inception_4e/output_p" 1831 | type: "Concat" 1832 | bottom: "inception_4e/1x1_p" 1833 | bottom: "inception_4e/3x3_p" 1834 | bottom: "inception_4e/5x5_p" 1835 | bottom: "inception_4e/pool_proj_p" 1836 | top: "inception_4e/output_p" 1837 | } 1838 | layer { 1839 | name: "pool4/3x3_s2_p" 1840 | type: "Pooling" 1841 | bottom: "inception_4e/output_p" 1842 | top: "pool4/3x3_s2_p" 1843 | pooling_param { 1844 | pool: MAX 1845 | kernel_size: 3 1846 | stride: 2 1847 | } 1848 | } 1849 | layer { 1850 | name: "inception_5a/1x1_p" 1851 | type: "Convolution" 1852 | bottom: "pool4/3x3_s2_p" 1853 | top: "inception_5a/1x1_p" 1854 | param { 1855 | lr_mult: 1 1856 | decay_mult: 1 1857 | } 1858 | param { 1859 | lr_mult: 2 1860 | decay_mult: 0 1861 | } 1862 | convolution_param { 1863 | num_output: 256 1864 | kernel_size: 1 1865 | weight_filler { 1866 | type: "xavier" 1867 | std: 0.03 1868 | } 1869 | bias_filler { 1870 | type: "constant" 1871 | value: 0.2 1872 | } 1873 | } 1874 | } 1875 | layer { 1876 | name: "inception_5a/relu_1x1_p" 1877 | type: "ReLU" 1878 | bottom: "inception_5a/1x1_p" 1879 | top: "inception_5a/1x1_p" 1880 | } 1881 | layer { 1882 | name: "inception_5a/3x3_reduce_p" 1883 | type: "Convolution" 1884 | bottom: "pool4/3x3_s2_p" 1885 | top: "inception_5a/3x3_reduce_p" 1886 | param { 1887 | lr_mult: 1 1888 | decay_mult: 1 1889 | } 1890 | param { 1891 | lr_mult: 2 1892 | decay_mult: 0 1893 | } 1894 | 
convolution_param { 1895 | num_output: 160 1896 | kernel_size: 1 1897 | weight_filler { 1898 | type: "xavier" 1899 | std: 0.09 1900 | } 1901 | bias_filler { 1902 | type: "constant" 1903 | value: 0.2 1904 | } 1905 | } 1906 | } 1907 | layer { 1908 | name: "inception_5a/relu_3x3_reduce_p" 1909 | type: "ReLU" 1910 | bottom: "inception_5a/3x3_reduce_p" 1911 | top: "inception_5a/3x3_reduce_p" 1912 | } 1913 | layer { 1914 | name: "inception_5a/3x3_p" 1915 | type: "Convolution" 1916 | bottom: "inception_5a/3x3_reduce_p" 1917 | top: "inception_5a/3x3_p" 1918 | param { 1919 | lr_mult: 1 1920 | decay_mult: 1 1921 | } 1922 | param { 1923 | lr_mult: 2 1924 | decay_mult: 0 1925 | } 1926 | convolution_param { 1927 | num_output: 320 1928 | pad: 1 1929 | kernel_size: 3 1930 | weight_filler { 1931 | type: "xavier" 1932 | std: 0.03 1933 | } 1934 | bias_filler { 1935 | type: "constant" 1936 | value: 0.2 1937 | } 1938 | } 1939 | } 1940 | layer { 1941 | name: "inception_5a/relu_3x3_p" 1942 | type: "ReLU" 1943 | bottom: "inception_5a/3x3_p" 1944 | top: "inception_5a/3x3_p" 1945 | } 1946 | layer { 1947 | name: "inception_5a/5x5_reduce_p" 1948 | type: "Convolution" 1949 | bottom: "pool4/3x3_s2_p" 1950 | top: "inception_5a/5x5_reduce_p" 1951 | param { 1952 | lr_mult: 1 1953 | decay_mult: 1 1954 | } 1955 | param { 1956 | lr_mult: 2 1957 | decay_mult: 0 1958 | } 1959 | convolution_param { 1960 | num_output: 32 1961 | kernel_size: 1 1962 | weight_filler { 1963 | type: "xavier" 1964 | std: 0.2 1965 | } 1966 | bias_filler { 1967 | type: "constant" 1968 | value: 0.2 1969 | } 1970 | } 1971 | } 1972 | layer { 1973 | name: "inception_5a/relu_5x5_reduce_p" 1974 | type: "ReLU" 1975 | bottom: "inception_5a/5x5_reduce_p" 1976 | top: "inception_5a/5x5_reduce_p" 1977 | } 1978 | layer { 1979 | name: "inception_5a/5x5_p" 1980 | type: "Convolution" 1981 | bottom: "inception_5a/5x5_reduce_p" 1982 | top: "inception_5a/5x5_p" 1983 | param { 1984 | lr_mult: 1 1985 | decay_mult: 1 1986 | } 1987 | param { 1988 | 
lr_mult: 2 1989 | decay_mult: 0 1990 | } 1991 | convolution_param { 1992 | num_output: 128 1993 | pad: 2 1994 | kernel_size: 5 1995 | weight_filler { 1996 | type: "xavier" 1997 | std: 0.03 1998 | } 1999 | bias_filler { 2000 | type: "constant" 2001 | value: 0.2 2002 | } 2003 | } 2004 | } 2005 | layer { 2006 | name: "inception_5a/relu_5x5_p" 2007 | type: "ReLU" 2008 | bottom: "inception_5a/5x5_p" 2009 | top: "inception_5a/5x5_p" 2010 | } 2011 | layer { 2012 | name: "inception_5a/pool_p" 2013 | type: "Pooling" 2014 | bottom: "pool4/3x3_s2_p" 2015 | top: "inception_5a/pool_p" 2016 | pooling_param { 2017 | pool: MAX 2018 | kernel_size: 3 2019 | stride: 1 2020 | pad: 1 2021 | } 2022 | } 2023 | layer { 2024 | name: "inception_5a/pool_proj_p" 2025 | type: "Convolution" 2026 | bottom: "inception_5a/pool_p" 2027 | top: "inception_5a/pool_proj_p" 2028 | param { 2029 | lr_mult: 1 2030 | decay_mult: 1 2031 | } 2032 | param { 2033 | lr_mult: 2 2034 | decay_mult: 0 2035 | } 2036 | convolution_param { 2037 | num_output: 128 2038 | kernel_size: 1 2039 | weight_filler { 2040 | type: "xavier" 2041 | std: 0.1 2042 | } 2043 | bias_filler { 2044 | type: "constant" 2045 | value: 0.2 2046 | } 2047 | } 2048 | } 2049 | layer { 2050 | name: "inception_5a/relu_pool_proj_p" 2051 | type: "ReLU" 2052 | bottom: "inception_5a/pool_proj_p" 2053 | top: "inception_5a/pool_proj_p" 2054 | } 2055 | layer { 2056 | name: "inception_5a/output_p" 2057 | type: "Concat" 2058 | bottom: "inception_5a/1x1_p" 2059 | bottom: "inception_5a/3x3_p" 2060 | bottom: "inception_5a/5x5_p" 2061 | bottom: "inception_5a/pool_proj_p" 2062 | top: "inception_5a/output_p" 2063 | } 2064 | layer { 2065 | name: "inception_5b/1x1_p" 2066 | type: "Convolution" 2067 | bottom: "inception_5a/output_p" 2068 | top: "inception_5b/1x1_p" 2069 | param { 2070 | lr_mult: 1 2071 | decay_mult: 1 2072 | } 2073 | param { 2074 | lr_mult: 2 2075 | decay_mult: 0 2076 | } 2077 | convolution_param { 2078 | num_output: 384 2079 | kernel_size: 1 2080 | 
weight_filler { 2081 | type: "xavier" 2082 | std: 0.03 2083 | } 2084 | bias_filler { 2085 | type: "constant" 2086 | value: 0.2 2087 | } 2088 | } 2089 | } 2090 | layer { 2091 | name: "inception_5b/relu_1x1_p" 2092 | type: "ReLU" 2093 | bottom: "inception_5b/1x1_p" 2094 | top: "inception_5b/1x1_p" 2095 | } 2096 | layer { 2097 | name: "inception_5b/3x3_reduce_p" 2098 | type: "Convolution" 2099 | bottom: "inception_5a/output_p" 2100 | top: "inception_5b/3x3_reduce_p" 2101 | param { 2102 | lr_mult: 1 2103 | decay_mult: 1 2104 | } 2105 | param { 2106 | lr_mult: 2 2107 | decay_mult: 0 2108 | } 2109 | convolution_param { 2110 | num_output: 192 2111 | kernel_size: 1 2112 | weight_filler { 2113 | type: "xavier" 2114 | std: 0.09 2115 | } 2116 | bias_filler { 2117 | type: "constant" 2118 | value: 0.2 2119 | } 2120 | } 2121 | } 2122 | layer { 2123 | name: "inception_5b/relu_3x3_reduce_p" 2124 | type: "ReLU" 2125 | bottom: "inception_5b/3x3_reduce_p" 2126 | top: "inception_5b/3x3_reduce_p" 2127 | } 2128 | layer { 2129 | name: "inception_5b/3x3_p" 2130 | type: "Convolution" 2131 | bottom: "inception_5b/3x3_reduce_p" 2132 | top: "inception_5b/3x3_p" 2133 | param { 2134 | lr_mult: 1 2135 | decay_mult: 1 2136 | } 2137 | param { 2138 | lr_mult: 2 2139 | decay_mult: 0 2140 | } 2141 | convolution_param { 2142 | num_output: 384 2143 | pad: 1 2144 | kernel_size: 3 2145 | weight_filler { 2146 | type: "xavier" 2147 | std: 0.03 2148 | } 2149 | bias_filler { 2150 | type: "constant" 2151 | value: 0.2 2152 | } 2153 | } 2154 | } 2155 | layer { 2156 | name: "inception_5b/relu_3x3_p" 2157 | type: "ReLU" 2158 | bottom: "inception_5b/3x3_p" 2159 | top: "inception_5b/3x3_p" 2160 | } 2161 | layer { 2162 | name: "inception_5b/5x5_reduce_p" 2163 | type: "Convolution" 2164 | bottom: "inception_5a/output_p" 2165 | top: "inception_5b/5x5_reduce_p" 2166 | param { 2167 | lr_mult: 1 2168 | decay_mult: 1 2169 | } 2170 | param { 2171 | lr_mult: 2 2172 | decay_mult: 0 2173 | } 2174 | convolution_param { 2175 | 
num_output: 48 2176 | kernel_size: 1 2177 | weight_filler { 2178 | type: "xavier" 2179 | std: 0.2 2180 | } 2181 | bias_filler { 2182 | type: "constant" 2183 | value: 0.2 2184 | } 2185 | } 2186 | } 2187 | layer { 2188 | name: "inception_5b/relu_5x5_reduce_p" 2189 | type: "ReLU" 2190 | bottom: "inception_5b/5x5_reduce_p" 2191 | top: "inception_5b/5x5_reduce_p" 2192 | } 2193 | layer { 2194 | name: "inception_5b/5x5_p" 2195 | type: "Convolution" 2196 | bottom: "inception_5b/5x5_reduce_p" 2197 | top: "inception_5b/5x5_p" 2198 | param { 2199 | lr_mult: 1 2200 | decay_mult: 1 2201 | } 2202 | param { 2203 | lr_mult: 2 2204 | decay_mult: 0 2205 | } 2206 | convolution_param { 2207 | num_output: 128 2208 | pad: 2 2209 | kernel_size: 5 2210 | weight_filler { 2211 | type: "xavier" 2212 | std: 0.03 2213 | } 2214 | bias_filler { 2215 | type: "constant" 2216 | value: 0.2 2217 | } 2218 | } 2219 | } 2220 | layer { 2221 | name: "inception_5b/relu_5x5_p" 2222 | type: "ReLU" 2223 | bottom: "inception_5b/5x5_p" 2224 | top: "inception_5b/5x5_p" 2225 | } 2226 | layer { 2227 | name: "inception_5b/pool_p" 2228 | type: "Pooling" 2229 | bottom: "inception_5a/output_p" 2230 | top: "inception_5b/pool_p" 2231 | pooling_param { 2232 | pool: MAX 2233 | kernel_size: 3 2234 | stride: 1 2235 | pad: 1 2236 | } 2237 | } 2238 | layer { 2239 | name: "inception_5b/pool_proj_p" 2240 | type: "Convolution" 2241 | bottom: "inception_5b/pool_p" 2242 | top: "inception_5b/pool_proj_p" 2243 | param { 2244 | lr_mult: 1 2245 | decay_mult: 1 2246 | } 2247 | param { 2248 | lr_mult: 2 2249 | decay_mult: 0 2250 | } 2251 | convolution_param { 2252 | num_output: 128 2253 | kernel_size: 1 2254 | weight_filler { 2255 | type: "xavier" 2256 | std: 0.1 2257 | } 2258 | bias_filler { 2259 | type: "constant" 2260 | value: 0.2 2261 | } 2262 | } 2263 | } 2264 | layer { 2265 | name: "inception_5b/relu_pool_proj_p" 2266 | type: "ReLU" 2267 | bottom: "inception_5b/pool_proj_p" 2268 | top: "inception_5b/pool_proj_p" 2269 | } 2270 | 
layer { 2271 | name: "inception_5b/output_p" 2272 | type: "Concat" 2273 | bottom: "inception_5b/1x1_p" 2274 | bottom: "inception_5b/3x3_p" 2275 | bottom: "inception_5b/5x5_p" 2276 | bottom: "inception_5b/pool_proj_p" 2277 | top: "inception_5b/output_p" 2278 | } 2279 | layer { 2280 | name: "pool5/7x7_s1_p" 2281 | type: "Pooling" 2282 | bottom: "inception_5b/output_p" 2283 | top: "pool5/7x7_s1_p" 2284 | pooling_param { 2285 | pool: AVE 2286 | kernel_size: 7 2287 | stride: 1 2288 | } 2289 | } 2290 | 2291 | -------------------------------------------------------------------------------- /models/triplet_googlenet/Triplet_googlenet_sketchdeploy.prototxt: -------------------------------------------------------------------------------- 1 | name: "sketch_siamese_train_test" 2 | input: "data" 3 | force_backward: true 4 | input_shape { 5 | dim: 1 6 | dim: 3 7 | dim: 224 8 | dim: 224 9 | } 10 | ####################################################################################### 11 | 12 | layer { 13 | name: "conv1/7x7_s2_s" 14 | type: "Convolution" 15 | bottom: "data" 16 | top: "conv1/7x7_s2_s" 17 | param { 18 | lr_mult: 1 19 | decay_mult: 1 20 | } 21 | param { 22 | lr_mult: 2 23 | decay_mult: 0 24 | } 25 | convolution_param { 26 | num_output: 64 27 | pad: 3 28 | kernel_size: 7 29 | stride: 2 30 | weight_filler { 31 | type: "xavier" 32 | std: 0.1 33 | } 34 | bias_filler { 35 | type: "constant" 36 | value: 0.2 37 | } 38 | } 39 | } 40 | layer { 41 | name: "conv1/relu_7x7_s" 42 | type: "ReLU" 43 | bottom: "conv1/7x7_s2_s" 44 | top: "conv1/7x7_s2_s" 45 | } 46 | layer { 47 | name: "pool1/3x3_s2_s" 48 | type: "Pooling" 49 | bottom: "conv1/7x7_s2_s" 50 | top: "pool1/3x3_s2_s" 51 | pooling_param { 52 | pool: MAX 53 | kernel_size: 3 54 | stride: 2 55 | } 56 | } 57 | layer { 58 | name: "pool1/norm1_s" 59 | type: "LRN" 60 | bottom: "pool1/3x3_s2_s" 61 | top: "pool1/norm1_s" 62 | lrn_param { 63 | local_size: 5 64 | alpha: 0.0001 65 | beta: 0.75 66 | } 67 | } 68 | layer { 69 | name: 
"conv2/3x3_reduce_s" 70 | type: "Convolution" 71 | bottom: "pool1/norm1_s" 72 | top: "conv2/3x3_reduce_s" 73 | param { 74 | lr_mult: 1 75 | decay_mult: 1 76 | } 77 | param { 78 | lr_mult: 2 79 | decay_mult: 0 80 | } 81 | convolution_param { 82 | num_output: 64 83 | kernel_size: 1 84 | weight_filler { 85 | type: "xavier" 86 | std: 0.1 87 | } 88 | bias_filler { 89 | type: "constant" 90 | value: 0.2 91 | } 92 | } 93 | } 94 | layer { 95 | name: "conv2/relu_3x3_reduce_s" 96 | type: "ReLU" 97 | bottom: "conv2/3x3_reduce_s" 98 | top: "conv2/3x3_reduce_s" 99 | } 100 | layer { 101 | name: "conv2/3x3_s" 102 | type: "Convolution" 103 | bottom: "conv2/3x3_reduce_s" 104 | top: "conv2/3x3_s" 105 | param { 106 | lr_mult: 1 107 | decay_mult: 1 108 | } 109 | param { 110 | lr_mult: 2 111 | decay_mult: 0 112 | } 113 | convolution_param { 114 | num_output: 192 115 | pad: 1 116 | kernel_size: 3 117 | weight_filler { 118 | type: "xavier" 119 | std: 0.03 120 | } 121 | bias_filler { 122 | type: "constant" 123 | value: 0.2 124 | } 125 | } 126 | } 127 | layer { 128 | name: "conv2/relu_3x3_s" 129 | type: "ReLU" 130 | bottom: "conv2/3x3_s" 131 | top: "conv2/3x3_s" 132 | } 133 | layer { 134 | name: "conv2/norm2_s" 135 | type: "LRN" 136 | bottom: "conv2/3x3_s" 137 | top: "conv2/norm2_s" 138 | lrn_param { 139 | local_size: 5 140 | alpha: 0.0001 141 | beta: 0.75 142 | } 143 | } 144 | layer { 145 | name: "pool2/3x3_s2_s" 146 | type: "Pooling" 147 | bottom: "conv2/norm2_s" 148 | top: "pool2/3x3_s2_s" 149 | pooling_param { 150 | pool: MAX 151 | kernel_size: 3 152 | stride: 2 153 | } 154 | } 155 | layer { 156 | name: "inception_3a/1x1_s" 157 | type: "Convolution" 158 | bottom: "pool2/3x3_s2_s" 159 | top: "inception_3a/1x1_s" 160 | param { 161 | lr_mult: 1 162 | decay_mult: 1 163 | } 164 | param { 165 | lr_mult: 2 166 | decay_mult: 0 167 | } 168 | convolution_param { 169 | num_output: 64 170 | kernel_size: 1 171 | weight_filler { 172 | type: "xavier" 173 | std: 0.03 174 | } 175 | bias_filler { 176 | 
type: "constant" 177 | value: 0.2 178 | } 179 | } 180 | } 181 | layer { 182 | name: "inception_3a/relu_1x1_s" 183 | type: "ReLU" 184 | bottom: "inception_3a/1x1_s" 185 | top: "inception_3a/1x1_s" 186 | } 187 | layer { 188 | name: "inception_3a/3x3_reduce_s" 189 | type: "Convolution" 190 | bottom: "pool2/3x3_s2_s" 191 | top: "inception_3a/3x3_reduce_s" 192 | param { 193 | lr_mult: 1 194 | decay_mult: 1 195 | } 196 | param { 197 | lr_mult: 2 198 | decay_mult: 0 199 | } 200 | convolution_param { 201 | num_output: 96 202 | kernel_size: 1 203 | weight_filler { 204 | type: "xavier" 205 | std: 0.09 206 | } 207 | bias_filler { 208 | type: "constant" 209 | value: 0.2 210 | } 211 | } 212 | } 213 | layer { 214 | name: "inception_3a/relu_3x3_reduce_s" 215 | type: "ReLU" 216 | bottom: "inception_3a/3x3_reduce_s" 217 | top: "inception_3a/3x3_reduce_s" 218 | } 219 | layer { 220 | name: "inception_3a/3x3_s" 221 | type: "Convolution" 222 | bottom: "inception_3a/3x3_reduce_s" 223 | top: "inception_3a/3x3_s" 224 | param { 225 | lr_mult: 1 226 | decay_mult: 1 227 | } 228 | param { 229 | lr_mult: 2 230 | decay_mult: 0 231 | } 232 | convolution_param { 233 | num_output: 128 234 | pad: 1 235 | kernel_size: 3 236 | weight_filler { 237 | type: "xavier" 238 | std: 0.03 239 | } 240 | bias_filler { 241 | type: "constant" 242 | value: 0.2 243 | } 244 | } 245 | } 246 | layer { 247 | name: "inception_3a/relu_3x3_s" 248 | type: "ReLU" 249 | bottom: "inception_3a/3x3_s" 250 | top: "inception_3a/3x3_s" 251 | } 252 | layer { 253 | name: "inception_3a/5x5_reduce_s" 254 | type: "Convolution" 255 | bottom: "pool2/3x3_s2_s" 256 | top: "inception_3a/5x5_reduce_s" 257 | param { 258 | lr_mult: 1 259 | decay_mult: 1 260 | } 261 | param { 262 | lr_mult: 2 263 | decay_mult: 0 264 | } 265 | convolution_param { 266 | num_output: 16 267 | kernel_size: 1 268 | weight_filler { 269 | type: "xavier" 270 | std: 0.2 271 | } 272 | bias_filler { 273 | type: "constant" 274 | value: 0.2 275 | } 276 | } 277 | } 278 | layer 
{ 279 | name: "inception_3a/relu_5x5_reduce_s" 280 | type: "ReLU" 281 | bottom: "inception_3a/5x5_reduce_s" 282 | top: "inception_3a/5x5_reduce_s" 283 | } 284 | layer { 285 | name: "inception_3a/5x5_s" 286 | type: "Convolution" 287 | bottom: "inception_3a/5x5_reduce_s" 288 | top: "inception_3a/5x5_s" 289 | param { 290 | lr_mult: 1 291 | decay_mult: 1 292 | } 293 | param { 294 | lr_mult: 2 295 | decay_mult: 0 296 | } 297 | convolution_param { 298 | num_output: 32 299 | pad: 2 300 | kernel_size: 5 301 | weight_filler { 302 | type: "xavier" 303 | std: 0.03 304 | } 305 | bias_filler { 306 | type: "constant" 307 | value: 0.2 308 | } 309 | } 310 | } 311 | layer { 312 | name: "inception_3a/relu_5x5_s" 313 | type: "ReLU" 314 | bottom: "inception_3a/5x5_s" 315 | top: "inception_3a/5x5_s" 316 | } 317 | layer { 318 | name: "inception_3a/pool_s" 319 | type: "Pooling" 320 | bottom: "pool2/3x3_s2_s" 321 | top: "inception_3a/pool_s" 322 | pooling_param { 323 | pool: MAX 324 | kernel_size: 3 325 | stride: 1 326 | pad: 1 327 | } 328 | } 329 | layer { 330 | name: "inception_3a/pool_proj_s" 331 | type: "Convolution" 332 | bottom: "inception_3a/pool_s" 333 | top: "inception_3a/pool_proj_s" 334 | param { 335 | lr_mult: 1 336 | decay_mult: 1 337 | } 338 | param { 339 | lr_mult: 2 340 | decay_mult: 0 341 | } 342 | convolution_param { 343 | num_output: 32 344 | kernel_size: 1 345 | weight_filler { 346 | type: "xavier" 347 | std: 0.1 348 | } 349 | bias_filler { 350 | type: "constant" 351 | value: 0.2 352 | } 353 | } 354 | } 355 | layer { 356 | name: "inception_3a/relu_pool_proj_s" 357 | type: "ReLU" 358 | bottom: "inception_3a/pool_proj_s" 359 | top: "inception_3a/pool_proj_s" 360 | } 361 | layer { 362 | name: "inception_3a/output_s" 363 | type: "Concat" 364 | bottom: "inception_3a/1x1_s" 365 | bottom: "inception_3a/3x3_s" 366 | bottom: "inception_3a/5x5_s" 367 | bottom: "inception_3a/pool_proj_s" 368 | top: "inception_3a/output_s" 369 | } 370 | layer { 371 | name: "inception_3b/1x1_s" 372 
| type: "Convolution" 373 | bottom: "inception_3a/output_s" 374 | top: "inception_3b/1x1_s" 375 | param { 376 | lr_mult: 1 377 | decay_mult: 1 378 | } 379 | param { 380 | lr_mult: 2 381 | decay_mult: 0 382 | } 383 | convolution_param { 384 | num_output: 128 385 | kernel_size: 1 386 | weight_filler { 387 | type: "xavier" 388 | std: 0.03 389 | } 390 | bias_filler { 391 | type: "constant" 392 | value: 0.2 393 | } 394 | } 395 | } 396 | layer { 397 | name: "inception_3b/relu_1x1_s" 398 | type: "ReLU" 399 | bottom: "inception_3b/1x1_s" 400 | top: "inception_3b/1x1_s" 401 | } 402 | layer { 403 | name: "inception_3b/3x3_reduce_s" 404 | type: "Convolution" 405 | bottom: "inception_3a/output_s" 406 | top: "inception_3b/3x3_reduce_s" 407 | param { 408 | lr_mult: 1 409 | decay_mult: 1 410 | } 411 | param { 412 | lr_mult: 2 413 | decay_mult: 0 414 | } 415 | convolution_param { 416 | num_output: 128 417 | kernel_size: 1 418 | weight_filler { 419 | type: "xavier" 420 | std: 0.09 421 | } 422 | bias_filler { 423 | type: "constant" 424 | value: 0.2 425 | } 426 | } 427 | } 428 | layer { 429 | name: "inception_3b/relu_3x3_reduce_s" 430 | type: "ReLU" 431 | bottom: "inception_3b/3x3_reduce_s" 432 | top: "inception_3b/3x3_reduce_s" 433 | } 434 | layer { 435 | name: "inception_3b/3x3_s" 436 | type: "Convolution" 437 | bottom: "inception_3b/3x3_reduce_s" 438 | top: "inception_3b/3x3_s" 439 | param { 440 | lr_mult: 1 441 | decay_mult: 1 442 | } 443 | param { 444 | lr_mult: 2 445 | decay_mult: 0 446 | } 447 | convolution_param { 448 | num_output: 192 449 | pad: 1 450 | kernel_size: 3 451 | weight_filler { 452 | type: "xavier" 453 | std: 0.03 454 | } 455 | bias_filler { 456 | type: "constant" 457 | value: 0.2 458 | } 459 | } 460 | } 461 | layer { 462 | name: "inception_3b/relu_3x3_s" 463 | type: "ReLU" 464 | bottom: "inception_3b/3x3_s" 465 | top: "inception_3b/3x3_s" 466 | } 467 | layer { 468 | name: "inception_3b/5x5_reduce_s" 469 | type: "Convolution" 470 | bottom: "inception_3a/output_s" 
471 | top: "inception_3b/5x5_reduce_s" 472 | param { 473 | lr_mult: 1 474 | decay_mult: 1 475 | } 476 | param { 477 | lr_mult: 2 478 | decay_mult: 0 479 | } 480 | convolution_param { 481 | num_output: 32 482 | kernel_size: 1 483 | weight_filler { 484 | type: "xavier" 485 | std: 0.2 486 | } 487 | bias_filler { 488 | type: "constant" 489 | value: 0.2 490 | } 491 | } 492 | } 493 | layer { 494 | name: "inception_3b/relu_5x5_reduce_s" 495 | type: "ReLU" 496 | bottom: "inception_3b/5x5_reduce_s" 497 | top: "inception_3b/5x5_reduce_s" 498 | } 499 | layer { 500 | name: "inception_3b/5x5_s" 501 | type: "Convolution" 502 | bottom: "inception_3b/5x5_reduce_s" 503 | top: "inception_3b/5x5_s" 504 | param { 505 | lr_mult: 1 506 | decay_mult: 1 507 | } 508 | param { 509 | lr_mult: 2 510 | decay_mult: 0 511 | } 512 | convolution_param { 513 | num_output: 96 514 | pad: 2 515 | kernel_size: 5 516 | weight_filler { 517 | type: "xavier" 518 | std: 0.03 519 | } 520 | bias_filler { 521 | type: "constant" 522 | value: 0.2 523 | } 524 | } 525 | } 526 | layer { 527 | name: "inception_3b/relu_5x5_s" 528 | type: "ReLU" 529 | bottom: "inception_3b/5x5_s" 530 | top: "inception_3b/5x5_s" 531 | } 532 | layer { 533 | name: "inception_3b/pool_s" 534 | type: "Pooling" 535 | bottom: "inception_3a/output_s" 536 | top: "inception_3b/pool_s" 537 | pooling_param { 538 | pool: MAX 539 | kernel_size: 3 540 | stride: 1 541 | pad: 1 542 | } 543 | } 544 | layer { 545 | name: "inception_3b/pool_proj_s" 546 | type: "Convolution" 547 | bottom: "inception_3b/pool_s" 548 | top: "inception_3b/pool_proj_s" 549 | param { 550 | lr_mult: 1 551 | decay_mult: 1 552 | } 553 | param { 554 | lr_mult: 2 555 | decay_mult: 0 556 | } 557 | convolution_param { 558 | num_output: 64 559 | kernel_size: 1 560 | weight_filler { 561 | type: "xavier" 562 | std: 0.1 563 | } 564 | bias_filler { 565 | type: "constant" 566 | value: 0.2 567 | } 568 | } 569 | } 570 | layer { 571 | name: "inception_3b/relu_pool_proj_s" 572 | type: "ReLU" 573 
| bottom: "inception_3b/pool_proj_s" 574 | top: "inception_3b/pool_proj_s" 575 | } 576 | layer { 577 | name: "inception_3b/output_s" 578 | type: "Concat" 579 | bottom: "inception_3b/1x1_s" 580 | bottom: "inception_3b/3x3_s" 581 | bottom: "inception_3b/5x5_s" 582 | bottom: "inception_3b/pool_proj_s" 583 | top: "inception_3b/output_s" 584 | } 585 | layer { 586 | name: "pool3/3x3_s2_s" 587 | type: "Pooling" 588 | bottom: "inception_3b/output_s" 589 | top: "pool3/3x3_s2_s" 590 | pooling_param { 591 | pool: MAX 592 | kernel_size: 3 593 | stride: 2 594 | } 595 | } 596 | layer { 597 | name: "inception_4a/1x1_s" 598 | type: "Convolution" 599 | bottom: "pool3/3x3_s2_s" 600 | top: "inception_4a/1x1_s" 601 | param { 602 | lr_mult: 1 603 | decay_mult: 1 604 | } 605 | param { 606 | lr_mult: 2 607 | decay_mult: 0 608 | } 609 | convolution_param { 610 | num_output: 192 611 | kernel_size: 1 612 | weight_filler { 613 | type: "xavier" 614 | std: 0.03 615 | } 616 | bias_filler { 617 | type: "constant" 618 | value: 0.2 619 | } 620 | } 621 | } 622 | layer { 623 | name: "inception_4a/relu_1x1_s" 624 | type: "ReLU" 625 | bottom: "inception_4a/1x1_s" 626 | top: "inception_4a/1x1_s" 627 | } 628 | layer { 629 | name: "inception_4a/3x3_reduce_s" 630 | type: "Convolution" 631 | bottom: "pool3/3x3_s2_s" 632 | top: "inception_4a/3x3_reduce_s" 633 | param { 634 | lr_mult: 1 635 | decay_mult: 1 636 | } 637 | param { 638 | lr_mult: 2 639 | decay_mult: 0 640 | } 641 | convolution_param { 642 | num_output: 96 643 | kernel_size: 1 644 | weight_filler { 645 | type: "xavier" 646 | std: 0.09 647 | } 648 | bias_filler { 649 | type: "constant" 650 | value: 0.2 651 | } 652 | } 653 | } 654 | layer { 655 | name: "inception_4a/relu_3x3_reduce_s" 656 | type: "ReLU" 657 | bottom: "inception_4a/3x3_reduce_s" 658 | top: "inception_4a/3x3_reduce_s" 659 | } 660 | layer { 661 | name: "inception_4a/3x3_s" 662 | type: "Convolution" 663 | bottom: "inception_4a/3x3_reduce_s" 664 | top: "inception_4a/3x3_s" 665 | param { 
666 | lr_mult: 1 667 | decay_mult: 1 668 | } 669 | param { 670 | lr_mult: 2 671 | decay_mult: 0 672 | } 673 | convolution_param { 674 | num_output: 208 675 | pad: 1 676 | kernel_size: 3 677 | weight_filler { 678 | type: "xavier" 679 | std: 0.03 680 | } 681 | bias_filler { 682 | type: "constant" 683 | value: 0.2 684 | } 685 | } 686 | } 687 | layer { 688 | name: "inception_4a/relu_3x3_s" 689 | type: "ReLU" 690 | bottom: "inception_4a/3x3_s" 691 | top: "inception_4a/3x3_s" 692 | } 693 | layer { 694 | name: "inception_4a/5x5_reduce_s" 695 | type: "Convolution" 696 | bottom: "pool3/3x3_s2_s" 697 | top: "inception_4a/5x5_reduce_s" 698 | param { 699 | lr_mult: 1 700 | decay_mult: 1 701 | } 702 | param { 703 | lr_mult: 2 704 | decay_mult: 0 705 | } 706 | convolution_param { 707 | num_output: 16 708 | kernel_size: 1 709 | weight_filler { 710 | type: "xavier" 711 | std: 0.2 712 | } 713 | bias_filler { 714 | type: "constant" 715 | value: 0.2 716 | } 717 | } 718 | } 719 | layer { 720 | name: "inception_4a/relu_5x5_reduce_s" 721 | type: "ReLU" 722 | bottom: "inception_4a/5x5_reduce_s" 723 | top: "inception_4a/5x5_reduce_s" 724 | } 725 | layer { 726 | name: "inception_4a/5x5_s" 727 | type: "Convolution" 728 | bottom: "inception_4a/5x5_reduce_s" 729 | top: "inception_4a/5x5_s" 730 | param { 731 | lr_mult: 1 732 | decay_mult: 1 733 | } 734 | param { 735 | lr_mult: 2 736 | decay_mult: 0 737 | } 738 | convolution_param { 739 | num_output: 48 740 | pad: 2 741 | kernel_size: 5 742 | weight_filler { 743 | type: "xavier" 744 | std: 0.03 745 | } 746 | bias_filler { 747 | type: "constant" 748 | value: 0.2 749 | } 750 | } 751 | } 752 | layer { 753 | name: "inception_4a/relu_5x5_s" 754 | type: "ReLU" 755 | bottom: "inception_4a/5x5_s" 756 | top: "inception_4a/5x5_s" 757 | } 758 | layer { 759 | name: "inception_4a/pool_s" 760 | type: "Pooling" 761 | bottom: "pool3/3x3_s2_s" 762 | top: "inception_4a/pool_s" 763 | pooling_param { 764 | pool: MAX 765 | kernel_size: 3 766 | stride: 1 767 | pad: 
1 768 | } 769 | } 770 | layer { 771 | name: "inception_4a/pool_proj_s" 772 | type: "Convolution" 773 | bottom: "inception_4a/pool_s" 774 | top: "inception_4a/pool_proj_s" 775 | param { 776 | lr_mult: 1 777 | decay_mult: 1 778 | } 779 | param { 780 | lr_mult: 2 781 | decay_mult: 0 782 | } 783 | convolution_param { 784 | num_output: 64 785 | kernel_size: 1 786 | weight_filler { 787 | type: "xavier" 788 | std: 0.1 789 | } 790 | bias_filler { 791 | type: "constant" 792 | value: 0.2 793 | } 794 | } 795 | } 796 | layer { 797 | name: "inception_4a/relu_pool_proj_s" 798 | type: "ReLU" 799 | bottom: "inception_4a/pool_proj_s" 800 | top: "inception_4a/pool_proj_s" 801 | } 802 | layer { 803 | name: "inception_4a/output_s" 804 | type: "Concat" 805 | bottom: "inception_4a/1x1_s" 806 | bottom: "inception_4a/3x3_s" 807 | bottom: "inception_4a/5x5_s" 808 | bottom: "inception_4a/pool_proj_s" 809 | top: "inception_4a/output_s" 810 | } 811 | layer { 812 | name: "loss1/ave_pool_s" 813 | type: "Pooling" 814 | bottom: "inception_4a/output_s" 815 | top: "loss1/ave_pool_s" 816 | pooling_param { 817 | pool: AVE 818 | kernel_size: 5 819 | stride: 3 820 | } 821 | } 822 | layer { 823 | name: "loss1/conv_s" 824 | type: "Convolution" 825 | bottom: "loss1/ave_pool_s" 826 | top: "loss1/conv_s" 827 | param { 828 | lr_mult: 1 829 | decay_mult: 1 830 | } 831 | param { 832 | lr_mult: 2 833 | decay_mult: 0 834 | } 835 | convolution_param { 836 | num_output: 128 837 | kernel_size: 1 838 | weight_filler { 839 | type: "xavier" 840 | std: 0.08 841 | } 842 | bias_filler { 843 | type: "constant" 844 | value: 0.2 845 | } 846 | } 847 | } 848 | layer { 849 | name: "loss1/relu_conv_s" 850 | type: "ReLU" 851 | bottom: "loss1/conv_s" 852 | top: "loss1/conv_s" 853 | } 854 | layer { 855 | name: "loss1/fc_s" 856 | type: "InnerProduct" 857 | bottom: "loss1/conv_s" 858 | top: "loss1/fc_s" 859 | param { 860 | lr_mult: 1 861 | decay_mult: 1 862 | } 863 | param { 864 | lr_mult: 2 865 | decay_mult: 0 866 | } 867 | 
inner_product_param { 868 | num_output: 1024 869 | weight_filler { 870 | type: "xavier" 871 | std: 0.02 872 | } 873 | bias_filler { 874 | type: "constant" 875 | value: 0.2 876 | } 877 | } 878 | } 879 | layer { 880 | name: "loss1/relu_fc_s" 881 | type: "ReLU" 882 | bottom: "loss1/fc_s" 883 | top: "loss1/fc_s" 884 | } 885 | layer { 886 | name: "loss1/drop_fc_s" 887 | type: "Dropout" 888 | bottom: "loss1/fc_s" 889 | top: "loss1/fc_s" 890 | dropout_param { 891 | dropout_ratio: 0.7 892 | } 893 | } 894 | layer { 895 | name: "inception_4b/1x1_s" 896 | type: "Convolution" 897 | bottom: "inception_4a/output_s" 898 | top: "inception_4b/1x1_s" 899 | param { 900 | lr_mult: 1 901 | decay_mult: 1 902 | } 903 | param { 904 | lr_mult: 2 905 | decay_mult: 0 906 | } 907 | convolution_param { 908 | num_output: 160 909 | kernel_size: 1 910 | weight_filler { 911 | type: "xavier" 912 | std: 0.03 913 | } 914 | bias_filler { 915 | type: "constant" 916 | value: 0.2 917 | } 918 | } 919 | } 920 | layer { 921 | name: "inception_4b/relu_1x1_s" 922 | type: "ReLU" 923 | bottom: "inception_4b/1x1_s" 924 | top: "inception_4b/1x1_s" 925 | } 926 | layer { 927 | name: "inception_4b/3x3_reduce_s" 928 | type: "Convolution" 929 | bottom: "inception_4a/output_s" 930 | top: "inception_4b/3x3_reduce_s" 931 | param { 932 | lr_mult: 1 933 | decay_mult: 1 934 | } 935 | param { 936 | lr_mult: 2 937 | decay_mult: 0 938 | } 939 | convolution_param { 940 | num_output: 112 941 | kernel_size: 1 942 | weight_filler { 943 | type: "xavier" 944 | std: 0.09 945 | } 946 | bias_filler { 947 | type: "constant" 948 | value: 0.2 949 | } 950 | } 951 | } 952 | layer { 953 | name: "inception_4b/relu_3x3_reduce_s" 954 | type: "ReLU" 955 | bottom: "inception_4b/3x3_reduce_s" 956 | top: "inception_4b/3x3_reduce_s" 957 | } 958 | layer { 959 | name: "inception_4b/3x3_s" 960 | type: "Convolution" 961 | bottom: "inception_4b/3x3_reduce_s" 962 | top: "inception_4b/3x3_s" 963 | param { 964 | lr_mult: 1 965 | decay_mult: 1 966 | } 967 | 
param { 968 | lr_mult: 2 969 | decay_mult: 0 970 | 971 | } 972 | convolution_param { 973 | num_output: 224 974 | pad: 1 975 | kernel_size: 3 976 | weight_filler { 977 | type: "xavier" 978 | std: 0.03 979 | } 980 | bias_filler { 981 | type: "constant" 982 | value: 0.2 983 | } 984 | } 985 | } 986 | layer { 987 | name: "inception_4b/relu_3x3_s" 988 | type: "ReLU" 989 | bottom: "inception_4b/3x3_s" 990 | top: "inception_4b/3x3_s" 991 | } 992 | layer { 993 | name: "inception_4b/5x5_reduce_s" 994 | type: "Convolution" 995 | bottom: "inception_4a/output_s" 996 | top: "inception_4b/5x5_reduce_s" 997 | param { 998 | lr_mult: 1 999 | decay_mult: 1 1000 | } 1001 | param { 1002 | lr_mult: 2 1003 | decay_mult: 0 1004 | } 1005 | convolution_param { 1006 | num_output: 24 1007 | kernel_size: 1 1008 | weight_filler { 1009 | type: "xavier" 1010 | std: 0.2 1011 | } 1012 | bias_filler { 1013 | type: "constant" 1014 | value: 0.2 1015 | } 1016 | } 1017 | } 1018 | layer { 1019 | name: "inception_4b/relu_5x5_reduce_s" 1020 | type: "ReLU" 1021 | bottom: "inception_4b/5x5_reduce_s" 1022 | top: "inception_4b/5x5_reduce_s" 1023 | } 1024 | layer { 1025 | name: "inception_4b/5x5_s" 1026 | type: "Convolution" 1027 | bottom: "inception_4b/5x5_reduce_s" 1028 | top: "inception_4b/5x5_s" 1029 | param { 1030 | lr_mult: 1 1031 | decay_mult: 1 1032 | } 1033 | param { 1034 | lr_mult: 2 1035 | decay_mult: 0 1036 | } 1037 | convolution_param { 1038 | num_output: 64 1039 | pad: 2 1040 | kernel_size: 5 1041 | weight_filler { 1042 | type: "xavier" 1043 | std: 0.03 1044 | } 1045 | bias_filler { 1046 | type: "constant" 1047 | value: 0.2 1048 | } 1049 | } 1050 | } 1051 | layer { 1052 | name: "inception_4b/relu_5x5_s" 1053 | type: "ReLU" 1054 | bottom: "inception_4b/5x5_s" 1055 | top: "inception_4b/5x5_s" 1056 | } 1057 | layer { 1058 | name: "inception_4b/pool_s" 1059 | type: "Pooling" 1060 | bottom: "inception_4a/output_s" 1061 | top: "inception_4b/pool_s" 1062 | pooling_param { 1063 | pool: MAX 1064 | 
kernel_size: 3 1065 | stride: 1 1066 | pad: 1 1067 | } 1068 | } 1069 | layer { 1070 | name: "inception_4b/pool_proj_s" 1071 | type: "Convolution" 1072 | bottom: "inception_4b/pool_s" 1073 | top: "inception_4b/pool_proj_s" 1074 | param { 1075 | lr_mult: 1 1076 | decay_mult: 1 1077 | } 1078 | param { 1079 | lr_mult: 2 1080 | decay_mult: 0 1081 | } 1082 | convolution_param { 1083 | num_output: 64 1084 | kernel_size: 1 1085 | weight_filler { 1086 | type: "xavier" 1087 | std: 0.1 1088 | } 1089 | bias_filler { 1090 | type: "constant" 1091 | value: 0.2 1092 | } 1093 | } 1094 | } 1095 | layer { 1096 | name: "inception_4b/relu_pool_proj_s" 1097 | type: "ReLU" 1098 | bottom: "inception_4b/pool_proj_s" 1099 | top: "inception_4b/pool_proj_s" 1100 | } 1101 | layer { 1102 | name: "inception_4b/output_s" 1103 | type: "Concat" 1104 | bottom: "inception_4b/1x1_s" 1105 | bottom: "inception_4b/3x3_s" 1106 | bottom: "inception_4b/5x5_s" 1107 | bottom: "inception_4b/pool_proj_s" 1108 | top: "inception_4b/output_s" 1109 | } 1110 | layer { 1111 | name: "inception_4c/1x1_s" 1112 | type: "Convolution" 1113 | bottom: "inception_4b/output_s" 1114 | top: "inception_4c/1x1_s" 1115 | param { 1116 | lr_mult: 1 1117 | decay_mult: 1 1118 | } 1119 | param { 1120 | lr_mult: 2 1121 | decay_mult: 0 1122 | } 1123 | convolution_param { 1124 | num_output: 128 1125 | kernel_size: 1 1126 | weight_filler { 1127 | type: "xavier" 1128 | std: 0.03 1129 | } 1130 | bias_filler { 1131 | type: "constant" 1132 | value: 0.2 1133 | } 1134 | } 1135 | } 1136 | layer { 1137 | name: "inception_4c/relu_1x1_s" 1138 | type: "ReLU" 1139 | bottom: "inception_4c/1x1_s" 1140 | top: "inception_4c/1x1_s" 1141 | } 1142 | layer { 1143 | name: "inception_4c/3x3_reduce_s" 1144 | type: "Convolution" 1145 | bottom: "inception_4b/output_s" 1146 | top: "inception_4c/3x3_reduce_s" 1147 | param { 1148 | lr_mult: 1 1149 | decay_mult: 1 1150 | } 1151 | param { 1152 | lr_mult: 2 1153 | decay_mult: 0 1154 | } 1155 | convolution_param { 1156 | 
num_output: 128 1157 | kernel_size: 1 1158 | weight_filler { 1159 | type: "xavier" 1160 | std: 0.09 1161 | } 1162 | bias_filler { 1163 | type: "constant" 1164 | value: 0.2 1165 | } 1166 | } 1167 | } 1168 | layer { 1169 | name: "inception_4c/relu_3x3_reduce_s" 1170 | type: "ReLU" 1171 | bottom: "inception_4c/3x3_reduce_s" 1172 | top: "inception_4c/3x3_reduce_s" 1173 | } 1174 | layer { 1175 | name: "inception_4c/3x3_s" 1176 | type: "Convolution" 1177 | bottom: "inception_4c/3x3_reduce_s" 1178 | top: "inception_4c/3x3_s" 1179 | param { 1180 | lr_mult: 1 1181 | decay_mult: 1 1182 | } 1183 | param { 1184 | lr_mult: 2 1185 | decay_mult: 0 1186 | } 1187 | convolution_param { 1188 | num_output: 256 1189 | pad: 1 1190 | kernel_size: 3 1191 | weight_filler { 1192 | type: "xavier" 1193 | std: 0.03 1194 | } 1195 | bias_filler { 1196 | type: "constant" 1197 | value: 0.2 1198 | } 1199 | } 1200 | } 1201 | layer { 1202 | name: "inception_4c/relu_3x3_s" 1203 | type: "ReLU" 1204 | bottom: "inception_4c/3x3_s" 1205 | top: "inception_4c/3x3_s" 1206 | } 1207 | layer { 1208 | name: "inception_4c/5x5_reduce_s" 1209 | type: "Convolution" 1210 | bottom: "inception_4b/output_s" 1211 | top: "inception_4c/5x5_reduce_s" 1212 | param { 1213 | lr_mult: 1 1214 | decay_mult: 1 1215 | } 1216 | param { 1217 | lr_mult: 2 1218 | decay_mult: 0 1219 | } 1220 | convolution_param { 1221 | num_output: 24 1222 | kernel_size: 1 1223 | weight_filler { 1224 | type: "xavier" 1225 | std: 0.2 1226 | } 1227 | bias_filler { 1228 | type: "constant" 1229 | value: 0.2 1230 | } 1231 | } 1232 | } 1233 | layer { 1234 | name: "inception_4c/relu_5x5_reduce_s" 1235 | type: "ReLU" 1236 | bottom: "inception_4c/5x5_reduce_s" 1237 | top: "inception_4c/5x5_reduce_s" 1238 | } 1239 | layer { 1240 | name: "inception_4c/5x5_s" 1241 | type: "Convolution" 1242 | bottom: "inception_4c/5x5_reduce_s" 1243 | top: "inception_4c/5x5_s" 1244 | param { 1245 | lr_mult: 1 1246 | decay_mult: 1 1247 | } 1248 | param { 1249 | lr_mult: 2 1250 | 
decay_mult: 0 1251 | } 1252 | convolution_param { 1253 | num_output: 64 1254 | pad: 2 1255 | kernel_size: 5 1256 | weight_filler { 1257 | type: "xavier" 1258 | std: 0.03 1259 | } 1260 | bias_filler { 1261 | type: "constant" 1262 | value: 0.2 1263 | } 1264 | } 1265 | } 1266 | layer { 1267 | name: "inception_4c/relu_5x5_s" 1268 | type: "ReLU" 1269 | bottom: "inception_4c/5x5_s" 1270 | top: "inception_4c/5x5_s" 1271 | } 1272 | layer { 1273 | name: "inception_4c/pool_s" 1274 | type: "Pooling" 1275 | bottom: "inception_4b/output_s" 1276 | top: "inception_4c/pool_s" 1277 | pooling_param { 1278 | pool: MAX 1279 | kernel_size: 3 1280 | stride: 1 1281 | pad: 1 1282 | } 1283 | } 1284 | layer { 1285 | name: "inception_4c/pool_proj_s" 1286 | type: "Convolution" 1287 | bottom: "inception_4c/pool_s" 1288 | top: "inception_4c/pool_proj_s" 1289 | param { 1290 | lr_mult: 1 1291 | decay_mult: 1 1292 | } 1293 | param { 1294 | lr_mult: 2 1295 | decay_mult: 0 1296 | } 1297 | convolution_param { 1298 | num_output: 64 1299 | kernel_size: 1 1300 | weight_filler { 1301 | type: "xavier" 1302 | std: 0.1 1303 | } 1304 | bias_filler { 1305 | type: "constant" 1306 | value: 0.2 1307 | } 1308 | } 1309 | } 1310 | layer { 1311 | name: "inception_4c/relu_pool_proj_s" 1312 | type: "ReLU" 1313 | bottom: "inception_4c/pool_proj_s" 1314 | top: "inception_4c/pool_proj_s" 1315 | } 1316 | layer { 1317 | name: "inception_4c/output_s" 1318 | type: "Concat" 1319 | bottom: "inception_4c/1x1_s" 1320 | bottom: "inception_4c/3x3_s" 1321 | bottom: "inception_4c/5x5_s" 1322 | bottom: "inception_4c/pool_proj_s" 1323 | top: "inception_4c/output_s" 1324 | } 1325 | layer { 1326 | name: "inception_4d/1x1_s" 1327 | type: "Convolution" 1328 | bottom: "inception_4c/output_s" 1329 | top: "inception_4d/1x1_s" 1330 | param { 1331 | lr_mult: 1 1332 | decay_mult: 1 1333 | } 1334 | param { 1335 | lr_mult: 2 1336 | decay_mult: 0 1337 | } 1338 | convolution_param { 1339 | num_output: 112 1340 | kernel_size: 1 1341 | weight_filler 
{ 1342 | type: "xavier" 1343 | std: 0.03 1344 | } 1345 | bias_filler { 1346 | type: "constant" 1347 | value: 0.2 1348 | } 1349 | } 1350 | } 1351 | layer { 1352 | name: "inception_4d/relu_1x1_s" 1353 | type: "ReLU" 1354 | bottom: "inception_4d/1x1_s" 1355 | top: "inception_4d/1x1_s" 1356 | } 1357 | layer { 1358 | name: "inception_4d/3x3_reduce_s" 1359 | type: "Convolution" 1360 | bottom: "inception_4c/output_s" 1361 | top: "inception_4d/3x3_reduce_s" 1362 | param { 1363 | lr_mult: 1 1364 | decay_mult: 1 1365 | } 1366 | param { 1367 | lr_mult: 2 1368 | decay_mult: 0 1369 | } 1370 | convolution_param { 1371 | num_output: 144 1372 | kernel_size: 1 1373 | weight_filler { 1374 | type: "xavier" 1375 | std: 0.09 1376 | } 1377 | bias_filler { 1378 | type: "constant" 1379 | value: 0.2 1380 | } 1381 | } 1382 | } 1383 | layer { 1384 | name: "inception_4d/relu_3x3_reduce_s" 1385 | type: "ReLU" 1386 | bottom: "inception_4d/3x3_reduce_s" 1387 | top: "inception_4d/3x3_reduce_s" 1388 | } 1389 | layer { 1390 | name: "inception_4d/3x3_s" 1391 | type: "Convolution" 1392 | bottom: "inception_4d/3x3_reduce_s" 1393 | top: "inception_4d/3x3_s" 1394 | param { 1395 | lr_mult: 1 1396 | decay_mult: 1 1397 | } 1398 | param { 1399 | lr_mult: 2 1400 | decay_mult: 0 1401 | } 1402 | convolution_param { 1403 | num_output: 288 1404 | pad: 1 1405 | kernel_size: 3 1406 | weight_filler { 1407 | type: "xavier" 1408 | std: 0.03 1409 | } 1410 | bias_filler { 1411 | type: "constant" 1412 | value: 0.2 1413 | } 1414 | } 1415 | } 1416 | layer { 1417 | name: "inception_4d/relu_3x3_s" 1418 | type: "ReLU" 1419 | bottom: "inception_4d/3x3_s" 1420 | top: "inception_4d/3x3_s" 1421 | } 1422 | layer { 1423 | name: "inception_4d/5x5_reduce_s" 1424 | type: "Convolution" 1425 | bottom: "inception_4c/output_s" 1426 | top: "inception_4d/5x5_reduce_s" 1427 | param { 1428 | lr_mult: 1 1429 | decay_mult: 1 1430 | } 1431 | param { 1432 | lr_mult: 2 1433 | decay_mult: 0 1434 | } 1435 | convolution_param { 1436 | num_output: 32 
1437 | kernel_size: 1 1438 | weight_filler { 1439 | type: "xavier" 1440 | std: 0.2 1441 | } 1442 | bias_filler { 1443 | type: "constant" 1444 | value: 0.2 1445 | } 1446 | } 1447 | } 1448 | layer { 1449 | name: "inception_4d/relu_5x5_reduce_s" 1450 | type: "ReLU" 1451 | bottom: "inception_4d/5x5_reduce_s" 1452 | top: "inception_4d/5x5_reduce_s" 1453 | } 1454 | layer { 1455 | name: "inception_4d/5x5_s" 1456 | type: "Convolution" 1457 | bottom: "inception_4d/5x5_reduce_s" 1458 | top: "inception_4d/5x5_s" 1459 | param { 1460 | lr_mult: 1 1461 | decay_mult: 1 1462 | } 1463 | param { 1464 | lr_mult: 2 1465 | decay_mult: 0 1466 | } 1467 | convolution_param { 1468 | num_output: 64 1469 | pad: 2 1470 | kernel_size: 5 1471 | weight_filler { 1472 | type: "xavier" 1473 | std: 0.03 1474 | } 1475 | bias_filler { 1476 | type: "constant" 1477 | value: 0.2 1478 | } 1479 | } 1480 | } 1481 | layer { 1482 | name: "inception_4d/relu_5x5_s" 1483 | type: "ReLU" 1484 | bottom: "inception_4d/5x5_s" 1485 | top: "inception_4d/5x5_s" 1486 | } 1487 | layer { 1488 | name: "inception_4d/pool_s" 1489 | type: "Pooling" 1490 | bottom: "inception_4c/output_s" 1491 | top: "inception_4d/pool_s" 1492 | pooling_param { 1493 | pool: MAX 1494 | kernel_size: 3 1495 | stride: 1 1496 | pad: 1 1497 | } 1498 | } 1499 | layer { 1500 | name: "inception_4d/pool_proj_s" 1501 | type: "Convolution" 1502 | bottom: "inception_4d/pool_s" 1503 | top: "inception_4d/pool_proj_s" 1504 | param { 1505 | lr_mult: 1 1506 | decay_mult: 1 1507 | } 1508 | param { 1509 | lr_mult: 2 1510 | decay_mult: 0 1511 | } 1512 | convolution_param { 1513 | num_output: 64 1514 | kernel_size: 1 1515 | weight_filler { 1516 | type: "xavier" 1517 | std: 0.1 1518 | } 1519 | bias_filler { 1520 | type: "constant" 1521 | value: 0.2 1522 | } 1523 | } 1524 | } 1525 | layer { 1526 | name: "inception_4d/relu_pool_proj_s" 1527 | type: "ReLU" 1528 | bottom: "inception_4d/pool_proj_s" 1529 | top: "inception_4d/pool_proj_s" 1530 | } 1531 | layer { 1532 | 
name: "inception_4d/output_s" 1533 | type: "Concat" 1534 | bottom: "inception_4d/1x1_s" 1535 | bottom: "inception_4d/3x3_s" 1536 | bottom: "inception_4d/5x5_s" 1537 | bottom: "inception_4d/pool_proj_s" 1538 | top: "inception_4d/output_s" 1539 | } 1540 | layer { 1541 | name: "loss2/ave_pool_s" 1542 | type: "Pooling" 1543 | bottom: "inception_4d/output_s" 1544 | top: "loss2/ave_pool_s" 1545 | pooling_param { 1546 | pool: AVE 1547 | kernel_size: 5 1548 | stride: 3 1549 | } 1550 | } 1551 | layer { 1552 | name: "loss2/conv_s" 1553 | type: "Convolution" 1554 | bottom: "loss2/ave_pool_s" 1555 | top: "loss2/conv_s" 1556 | param { 1557 | lr_mult: 10 1558 | decay_mult: 1 1559 | } 1560 | param { 1561 | lr_mult: 20 1562 | decay_mult: 0 1563 | } 1564 | convolution_param { 1565 | num_output: 128 1566 | kernel_size: 1 1567 | weight_filler { 1568 | type: "xavier" 1569 | std: 0.08 1570 | } 1571 | bias_filler { 1572 | type: "constant" 1573 | value: 0.2 1574 | } 1575 | } 1576 | } 1577 | layer { 1578 | name: "loss2/relu_conv_s" 1579 | type: "ReLU" 1580 | bottom: "loss2/conv_s" 1581 | top: "loss2/conv_s" 1582 | } 1583 | layer { 1584 | name: "loss2/fc_s" 1585 | type: "InnerProduct" 1586 | bottom: "loss2/conv_s" 1587 | top: "loss2/fc_s" 1588 | param { 1589 | lr_mult: 10 1590 | decay_mult: 1 1591 | } 1592 | param { 1593 | lr_mult: 20 1594 | decay_mult: 0 1595 | } 1596 | inner_product_param { 1597 | num_output: 1024 1598 | weight_filler { 1599 | type: "xavier" 1600 | std: 0.02 1601 | } 1602 | bias_filler { 1603 | type: "constant" 1604 | value: 0.2 1605 | } 1606 | } 1607 | } 1608 | layer { 1609 | name: "loss2/relu_fc_s" 1610 | type: "ReLU" 1611 | bottom: "loss2/fc_s" 1612 | top: "loss2/fc_s" 1613 | } 1614 | layer { 1615 | name: "loss2/drop_fc_s" 1616 | type: "Dropout" 1617 | bottom: "loss2/fc_s" 1618 | top: "loss2/fc_s" 1619 | dropout_param { 1620 | dropout_ratio: 0.7 1621 | } 1622 | } 1623 | layer { 1624 | name: "inception_4e/1x1_s" 1625 | type: "Convolution" 1626 | bottom: 
"inception_4d/output_s" 1627 | top: "inception_4e/1x1_s" 1628 | param { 1629 | lr_mult: 1 1630 | decay_mult: 1 1631 | } 1632 | param { 1633 | lr_mult: 2 1634 | decay_mult: 0 1635 | } 1636 | convolution_param { 1637 | num_output: 256 1638 | kernel_size: 1 1639 | weight_filler { 1640 | type: "xavier" 1641 | std: 0.03 1642 | } 1643 | bias_filler { 1644 | type: "constant" 1645 | value: 0.2 1646 | } 1647 | } 1648 | } 1649 | layer { 1650 | name: "inception_4e/relu_1x1_s" 1651 | type: "ReLU" 1652 | bottom: "inception_4e/1x1_s" 1653 | top: "inception_4e/1x1_s" 1654 | } 1655 | layer { 1656 | name: "inception_4e/3x3_reduce_s" 1657 | type: "Convolution" 1658 | bottom: "inception_4d/output_s" 1659 | top: "inception_4e/3x3_reduce_s" 1660 | param { 1661 | lr_mult: 1 1662 | decay_mult: 1 1663 | } 1664 | param { 1665 | lr_mult: 2 1666 | decay_mult: 0 1667 | } 1668 | convolution_param { 1669 | num_output: 160 1670 | kernel_size: 1 1671 | weight_filler { 1672 | type: "xavier" 1673 | std: 0.09 1674 | } 1675 | bias_filler { 1676 | type: "constant" 1677 | value: 0.2 1678 | } 1679 | } 1680 | } 1681 | layer { 1682 | name: "inception_4e/relu_3x3_reduce_s" 1683 | type: "ReLU" 1684 | bottom: "inception_4e/3x3_reduce_s" 1685 | top: "inception_4e/3x3_reduce_s" 1686 | } 1687 | layer { 1688 | name: "inception_4e/3x3_s" 1689 | type: "Convolution" 1690 | bottom: "inception_4e/3x3_reduce_s" 1691 | top: "inception_4e/3x3_s" 1692 | param { 1693 | lr_mult: 1 1694 | decay_mult: 1 1695 | } 1696 | param { 1697 | lr_mult: 2 1698 | decay_mult: 0 1699 | } 1700 | convolution_param { 1701 | num_output: 320 1702 | pad: 1 1703 | kernel_size: 3 1704 | weight_filler { 1705 | type: "xavier" 1706 | std: 0.03 1707 | } 1708 | bias_filler { 1709 | type: "constant" 1710 | value: 0.2 1711 | } 1712 | } 1713 | } 1714 | layer { 1715 | name: "inception_4e/relu_3x3_s" 1716 | type: "ReLU" 1717 | bottom: "inception_4e/3x3_s" 1718 | top: "inception_4e/3x3_s" 1719 | } 1720 | layer { 1721 | name: "inception_4e/5x5_reduce_s" 1722 
| type: "Convolution" 1723 | bottom: "inception_4d/output_s" 1724 | top: "inception_4e/5x5_reduce_s" 1725 | param { 1726 | lr_mult: 1 1727 | decay_mult: 1 1728 | } 1729 | param { 1730 | lr_mult: 2 1731 | decay_mult: 0 1732 | } 1733 | convolution_param { 1734 | num_output: 32 1735 | kernel_size: 1 1736 | weight_filler { 1737 | type: "xavier" 1738 | std: 0.2 1739 | } 1740 | bias_filler { 1741 | type: "constant" 1742 | value: 0.2 1743 | } 1744 | } 1745 | } 1746 | layer { 1747 | name: "inception_4e/relu_5x5_reduce_s" 1748 | type: "ReLU" 1749 | bottom: "inception_4e/5x5_reduce_s" 1750 | top: "inception_4e/5x5_reduce_s" 1751 | } 1752 | layer { 1753 | name: "inception_4e/5x5_s" 1754 | type: "Convolution" 1755 | bottom: "inception_4e/5x5_reduce_s" 1756 | top: "inception_4e/5x5_s" 1757 | param { 1758 | lr_mult: 1 1759 | decay_mult: 1 1760 | } 1761 | param { 1762 | lr_mult: 2 1763 | decay_mult: 0 1764 | } 1765 | convolution_param { 1766 | num_output: 128 1767 | pad: 2 1768 | kernel_size: 5 1769 | weight_filler { 1770 | type: "xavier" 1771 | std: 0.03 1772 | } 1773 | bias_filler { 1774 | type: "constant" 1775 | value: 0.2 1776 | } 1777 | } 1778 | } 1779 | layer { 1780 | name: "inception_4e/relu_5x5_s" 1781 | type: "ReLU" 1782 | bottom: "inception_4e/5x5_s" 1783 | top: "inception_4e/5x5_s" 1784 | } 1785 | layer { 1786 | name: "inception_4e/pool_s" 1787 | type: "Pooling" 1788 | bottom: "inception_4d/output_s" 1789 | top: "inception_4e/pool_s" 1790 | pooling_param { 1791 | pool: MAX 1792 | kernel_size: 3 1793 | stride: 1 1794 | pad: 1 1795 | } 1796 | } 1797 | layer { 1798 | name: "inception_4e/pool_proj_s" 1799 | type: "Convolution" 1800 | bottom: "inception_4e/pool_s" 1801 | top: "inception_4e/pool_proj_s" 1802 | param { 1803 | lr_mult: 1 1804 | decay_mult: 1 1805 | } 1806 | param { 1807 | lr_mult: 2 1808 | decay_mult: 0 1809 | } 1810 | convolution_param { 1811 | num_output: 128 1812 | kernel_size: 1 1813 | weight_filler { 1814 | type: "xavier" 1815 | std: 0.1 1816 | } 1817 | 
bias_filler { 1818 | type: "constant" 1819 | value: 0.2 1820 | } 1821 | } 1822 | } 1823 | layer { 1824 | name: "inception_4e/relu_pool_proj_s" 1825 | type: "ReLU" 1826 | bottom: "inception_4e/pool_proj_s" 1827 | top: "inception_4e/pool_proj_s" 1828 | } 1829 | layer { 1830 | name: "inception_4e/output_s" 1831 | type: "Concat" 1832 | bottom: "inception_4e/1x1_s" 1833 | bottom: "inception_4e/3x3_s" 1834 | bottom: "inception_4e/5x5_s" 1835 | bottom: "inception_4e/pool_proj_s" 1836 | top: "inception_4e/output_s" 1837 | } 1838 | layer { 1839 | name: "pool4/3x3_s2_s" 1840 | type: "Pooling" 1841 | bottom: "inception_4e/output_s" 1842 | top: "pool4/3x3_s2_s" 1843 | pooling_param { 1844 | pool: MAX 1845 | kernel_size: 3 1846 | stride: 2 1847 | } 1848 | } 1849 | layer { 1850 | name: "inception_5a/1x1_s" 1851 | type: "Convolution" 1852 | bottom: "pool4/3x3_s2_s" 1853 | top: "inception_5a/1x1_s" 1854 | param { 1855 | lr_mult: 1 1856 | decay_mult: 1 1857 | } 1858 | param { 1859 | lr_mult: 2 1860 | decay_mult: 0 1861 | } 1862 | convolution_param { 1863 | num_output: 256 1864 | kernel_size: 1 1865 | weight_filler { 1866 | type: "xavier" 1867 | std: 0.03 1868 | } 1869 | bias_filler { 1870 | type: "constant" 1871 | value: 0.2 1872 | } 1873 | } 1874 | } 1875 | layer { 1876 | name: "inception_5a/relu_1x1_s" 1877 | type: "ReLU" 1878 | bottom: "inception_5a/1x1_s" 1879 | top: "inception_5a/1x1_s" 1880 | } 1881 | layer { 1882 | name: "inception_5a/3x3_reduce_s" 1883 | type: "Convolution" 1884 | bottom: "pool4/3x3_s2_s" 1885 | top: "inception_5a/3x3_reduce_s" 1886 | param { 1887 | lr_mult: 1 1888 | decay_mult: 1 1889 | } 1890 | param { 1891 | lr_mult: 2 1892 | decay_mult: 0 1893 | } 1894 | convolution_param { 1895 | num_output: 160 1896 | kernel_size: 1 1897 | weight_filler { 1898 | type: "xavier" 1899 | std: 0.09 1900 | } 1901 | bias_filler { 1902 | type: "constant" 1903 | value: 0.2 1904 | } 1905 | } 1906 | } 1907 | layer { 1908 | name: "inception_5a/relu_3x3_reduce_s" 1909 | type: 
"ReLU" 1910 | bottom: "inception_5a/3x3_reduce_s" 1911 | top: "inception_5a/3x3_reduce_s" 1912 | } 1913 | layer { 1914 | name: "inception_5a/3x3_s" 1915 | type: "Convolution" 1916 | bottom: "inception_5a/3x3_reduce_s" 1917 | top: "inception_5a/3x3_s" 1918 | param { 1919 | lr_mult: 1 1920 | decay_mult: 1 1921 | } 1922 | param { 1923 | lr_mult: 2 1924 | decay_mult: 0 1925 | } 1926 | convolution_param { 1927 | num_output: 320 1928 | pad: 1 1929 | kernel_size: 3 1930 | weight_filler { 1931 | type: "xavier" 1932 | std: 0.03 1933 | } 1934 | bias_filler { 1935 | type: "constant" 1936 | value: 0.2 1937 | } 1938 | } 1939 | } 1940 | layer { 1941 | name: "inception_5a/relu_3x3_s" 1942 | type: "ReLU" 1943 | bottom: "inception_5a/3x3_s" 1944 | top: "inception_5a/3x3_s" 1945 | } 1946 | layer { 1947 | name: "inception_5a/5x5_reduce_s" 1948 | type: "Convolution" 1949 | bottom: "pool4/3x3_s2_s" 1950 | top: "inception_5a/5x5_reduce_s" 1951 | param { 1952 | lr_mult: 1 1953 | decay_mult: 1 1954 | } 1955 | param { 1956 | lr_mult: 2 1957 | decay_mult: 0 1958 | } 1959 | convolution_param { 1960 | num_output: 32 1961 | kernel_size: 1 1962 | weight_filler { 1963 | type: "xavier" 1964 | std: 0.2 1965 | } 1966 | bias_filler { 1967 | type: "constant" 1968 | value: 0.2 1969 | } 1970 | } 1971 | } 1972 | layer { 1973 | name: "inception_5a/relu_5x5_reduce_s" 1974 | type: "ReLU" 1975 | bottom: "inception_5a/5x5_reduce_s" 1976 | top: "inception_5a/5x5_reduce_s" 1977 | } 1978 | layer { 1979 | name: "inception_5a/5x5_s" 1980 | type: "Convolution" 1981 | bottom: "inception_5a/5x5_reduce_s" 1982 | top: "inception_5a/5x5_s" 1983 | param { 1984 | lr_mult: 1 1985 | decay_mult: 1 1986 | } 1987 | param { 1988 | lr_mult: 2 1989 | decay_mult: 0 1990 | } 1991 | convolution_param { 1992 | num_output: 128 1993 | pad: 2 1994 | kernel_size: 5 1995 | weight_filler { 1996 | type: "xavier" 1997 | std: 0.03 1998 | } 1999 | bias_filler { 2000 | type: "constant" 2001 | value: 0.2 2002 | } 2003 | } 2004 | } 2005 | layer 
{ 2006 | name: "inception_5a/relu_5x5_s" 2007 | type: "ReLU" 2008 | bottom: "inception_5a/5x5_s" 2009 | top: "inception_5a/5x5_s" 2010 | } 2011 | layer { 2012 | name: "inception_5a/pool_s" 2013 | type: "Pooling" 2014 | bottom: "pool4/3x3_s2_s" 2015 | top: "inception_5a/pool_s" 2016 | pooling_param { 2017 | pool: MAX 2018 | kernel_size: 3 2019 | stride: 1 2020 | pad: 1 2021 | } 2022 | } 2023 | layer { 2024 | name: "inception_5a/pool_proj_s" 2025 | type: "Convolution" 2026 | bottom: "inception_5a/pool_s" 2027 | top: "inception_5a/pool_proj_s" 2028 | param { 2029 | lr_mult: 1 2030 | decay_mult: 1 2031 | } 2032 | param { 2033 | lr_mult: 2 2034 | decay_mult: 0 2035 | } 2036 | convolution_param { 2037 | num_output: 128 2038 | kernel_size: 1 2039 | weight_filler { 2040 | type: "xavier" 2041 | std: 0.1 2042 | } 2043 | bias_filler { 2044 | type: "constant" 2045 | value: 0.2 2046 | } 2047 | } 2048 | } 2049 | layer { 2050 | name: "inception_5a/relu_pool_proj_s" 2051 | type: "ReLU" 2052 | bottom: "inception_5a/pool_proj_s" 2053 | top: "inception_5a/pool_proj_s" 2054 | } 2055 | layer { 2056 | name: "inception_5a/output_s" 2057 | type: "Concat" 2058 | bottom: "inception_5a/1x1_s" 2059 | bottom: "inception_5a/3x3_s" 2060 | bottom: "inception_5a/5x5_s" 2061 | bottom: "inception_5a/pool_proj_s" 2062 | top: "inception_5a/output_s" 2063 | } 2064 | layer { 2065 | name: "inception_5b/1x1_s" 2066 | type: "Convolution" 2067 | bottom: "inception_5a/output_s" 2068 | top: "inception_5b/1x1_s" 2069 | param { 2070 | lr_mult: 1 2071 | decay_mult: 1 2072 | } 2073 | param { 2074 | lr_mult: 2 2075 | decay_mult: 0 2076 | } 2077 | convolution_param { 2078 | num_output: 384 2079 | kernel_size: 1 2080 | weight_filler { 2081 | type: "xavier" 2082 | std: 0.03 2083 | } 2084 | bias_filler { 2085 | type: "constant" 2086 | value: 0.2 2087 | } 2088 | } 2089 | } 2090 | layer { 2091 | name: "inception_5b/relu_1x1_s" 2092 | type: "ReLU" 2093 | bottom: "inception_5b/1x1_s" 2094 | top: "inception_5b/1x1_s" 2095 
| } 2096 | layer { 2097 | name: "inception_5b/3x3_reduce_s" 2098 | type: "Convolution" 2099 | bottom: "inception_5a/output_s" 2100 | top: "inception_5b/3x3_reduce_s" 2101 | param { 2102 | lr_mult: 1 2103 | decay_mult: 1 2104 | } 2105 | param { 2106 | lr_mult: 2 2107 | decay_mult: 0 2108 | } 2109 | convolution_param { 2110 | num_output: 192 2111 | kernel_size: 1 2112 | weight_filler { 2113 | type: "xavier" 2114 | std: 0.09 2115 | } 2116 | bias_filler { 2117 | type: "constant" 2118 | value: 0.2 2119 | } 2120 | } 2121 | } 2122 | layer { 2123 | name: "inception_5b/relu_3x3_reduce_s" 2124 | type: "ReLU" 2125 | bottom: "inception_5b/3x3_reduce_s" 2126 | top: "inception_5b/3x3_reduce_s" 2127 | } 2128 | layer { 2129 | name: "inception_5b/3x3_s" 2130 | type: "Convolution" 2131 | bottom: "inception_5b/3x3_reduce_s" 2132 | top: "inception_5b/3x3_s" 2133 | param { 2134 | lr_mult: 1 2135 | decay_mult: 1 2136 | } 2137 | param { 2138 | lr_mult: 2 2139 | decay_mult: 0 2140 | } 2141 | convolution_param { 2142 | num_output: 384 2143 | pad: 1 2144 | kernel_size: 3 2145 | weight_filler { 2146 | type: "xavier" 2147 | std: 0.03 2148 | } 2149 | bias_filler { 2150 | type: "constant" 2151 | value: 0.2 2152 | } 2153 | } 2154 | } 2155 | layer { 2156 | name: "inception_5b/relu_3x3_s" 2157 | type: "ReLU" 2158 | bottom: "inception_5b/3x3_s" 2159 | top: "inception_5b/3x3_s" 2160 | } 2161 | layer { 2162 | name: "inception_5b/5x5_reduce_s" 2163 | type: "Convolution" 2164 | bottom: "inception_5a/output_s" 2165 | top: "inception_5b/5x5_reduce_s" 2166 | param { 2167 | lr_mult: 1 2168 | decay_mult: 1 2169 | } 2170 | param { 2171 | lr_mult: 2 2172 | decay_mult: 0 2173 | } 2174 | convolution_param { 2175 | num_output: 48 2176 | kernel_size: 1 2177 | weight_filler { 2178 | type: "xavier" 2179 | std: 0.2 2180 | } 2181 | bias_filler { 2182 | type: "constant" 2183 | value: 0.2 2184 | } 2185 | } 2186 | } 2187 | layer { 2188 | name: "inception_5b/relu_5x5_reduce_s" 2189 | type: "ReLU" 2190 | bottom: 
"inception_5b/5x5_reduce_s" 2191 | top: "inception_5b/5x5_reduce_s" 2192 | } 2193 | layer { 2194 | name: "inception_5b/5x5_s" 2195 | type: "Convolution" 2196 | bottom: "inception_5b/5x5_reduce_s" 2197 | top: "inception_5b/5x5_s" 2198 | param { 2199 | lr_mult: 1 2200 | decay_mult: 1 2201 | } 2202 | param { 2203 | lr_mult: 2 2204 | decay_mult: 0 2205 | } 2206 | convolution_param { 2207 | num_output: 128 2208 | pad: 2 2209 | kernel_size: 5 2210 | weight_filler { 2211 | type: "xavier" 2212 | std: 0.03 2213 | } 2214 | bias_filler { 2215 | type: "constant" 2216 | value: 0.2 2217 | } 2218 | } 2219 | } 2220 | layer { 2221 | name: "inception_5b/relu_5x5_s" 2222 | type: "ReLU" 2223 | bottom: "inception_5b/5x5_s" 2224 | top: "inception_5b/5x5_s" 2225 | } 2226 | layer { 2227 | name: "inception_5b/pool_s" 2228 | type: "Pooling" 2229 | bottom: "inception_5a/output_s" 2230 | top: "inception_5b/pool_s" 2231 | pooling_param { 2232 | pool: MAX 2233 | kernel_size: 3 2234 | stride: 1 2235 | pad: 1 2236 | } 2237 | } 2238 | layer { 2239 | name: "inception_5b/pool_proj_s" 2240 | type: "Convolution" 2241 | bottom: "inception_5b/pool_s" 2242 | top: "inception_5b/pool_proj_s" 2243 | param { 2244 | lr_mult: 1 2245 | decay_mult: 1 2246 | } 2247 | param { 2248 | lr_mult: 2 2249 | decay_mult: 0 2250 | } 2251 | convolution_param { 2252 | num_output: 128 2253 | kernel_size: 1 2254 | weight_filler { 2255 | type: "xavier" 2256 | std: 0.1 2257 | } 2258 | bias_filler { 2259 | type: "constant" 2260 | value: 0.2 2261 | } 2262 | } 2263 | } 2264 | layer { 2265 | name: "inception_5b/relu_pool_proj_s" 2266 | type: "ReLU" 2267 | bottom: "inception_5b/pool_proj_s" 2268 | top: "inception_5b/pool_proj_s" 2269 | } 2270 | layer { 2271 | name: "inception_5b/output_s" 2272 | type: "Concat" 2273 | bottom: "inception_5b/1x1_s" 2274 | bottom: "inception_5b/3x3_s" 2275 | bottom: "inception_5b/5x5_s" 2276 | bottom: "inception_5b/pool_proj_s" 2277 | top: "inception_5b/output_s" 2278 | } 2279 | layer { 2280 | name: 
"pool5/7x7_s1_s" 2281 | type: "Pooling" 2282 | bottom: "inception_5b/output_s" 2283 | top: "pool5/7x7_s1_s" 2284 | pooling_param { 2285 | pool: AVE 2286 | kernel_size: 7 2287 | stride: 1 2288 | } 2289 | } 2290 | 2291 | -------------------------------------------------------------------------------- /training/README.md: -------------------------------------------------------------------------------- 1 | solver and prototxt for training triplet network 2 | 3 | The prototxt file might be some capability issue with the lastest caffe since I was using an old version from 2015 4 | 5 | For triplet loss I was using the version here https://github.com/happynear/caffe-windows/blob/master/src/caffe/layers/triplet_loss_layer.cpp 6 | 7 | -------------------------------------------------------------------------------- /training/sketch_triplet_solver.prototxt: -------------------------------------------------------------------------------- 1 | # The train/test net protocol buffer definition 2 | net: "models/sketch_triplet/sketch_triplet_train_test.prototxt" 3 | # test_iter specifies how many forward passes the test should carry out. 4 | # In the case of MNIST, we have test batch size 100 and 100 test iterations, 5 | # covering the full 10,000 testing images. 6 | test_iter: 1000 7 | # Carry out testing every 500 training iterations. 8 | test_interval: 500 9 | # The base learning rate, momentum and the weight decay of the network. 
10 | base_lr: 0.00001 11 | momentum: 0.9 12 | #solver_type: ADAGRAD 13 | weight_decay: 0.0005 14 | # The learning rate policy 15 | lr_policy: "step" 16 | gamma: 0.1 17 | stepsize: 200000 18 | # Display every 20 iterations 19 | display: 20 20 | # The maximum number of iterations 21 | max_iter: 1000000 22 | # Snapshot intermediate results every 1000 iterations 23 | snapshot: 1000 24 | snapshot_prefix: "models/sketch_triplet/sketch_triplet_hardmining" 25 | # solver mode: CPU or GPU 26 | solver_mode: GPU --------------------------------------------------------------------------------
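The solver above uses Caffe's "step" learning rate policy, which multiplies the base learning rate by gamma once every stepsize iterations. The sketch below is not part of the repository; it only illustrates the schedule implied by the values in sketch_triplet_solver.prototxt. The function name `step_lr` is hypothetical, and its defaults are copied from the solver file.

```python
def step_lr(iteration, base_lr=1e-5, gamma=0.1, stepsize=200000):
    """Learning rate under Caffe's "step" policy.

    Implements base_lr * gamma ** floor(iteration / stepsize).
    The defaults mirror base_lr, gamma and stepsize in the solver above;
    the function itself is an illustrative helper, not Caffe API.
    """
    return base_lr * gamma ** (iteration // stepsize)


if __name__ == "__main__":
    # Print the rate just before and just after each drop, up to max_iter.
    for it in (0, 199999, 200000, 400000, 999999):
        print(it, step_lr(it))
```

With max_iter set to 1000000 and stepsize set to 200000, the rate is reduced four times over a full run, so the last 200000 iterations train at base_lr scaled by gamma to the fourth power.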