├── LICENSE ├── README.md ├── data └── models │ ├── deploy.prototxt │ └── res10_300x300_ssd_iter_140000.caffemodel ├── noof ├── caffe_face_det.h └── main.cpp ├── of ├── config.make └── src │ ├── caffe_face_det.h │ ├── cv_cvt.h │ ├── main.cpp │ ├── ofApp.cpp │ └── ofApp.h └── screenshots ├── .DS_Store ├── cflags.jpg ├── demo1.jpg ├── ldflags.jpg ├── xcode1.png ├── xcode2.png ├── xcode3.png ├── xcode4.png ├── xcode5.png └── xcode6.png /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2019 Lingdong Huang 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Fast Many Face Detection with C++/OpenFrameworks on macOS using Neural Networks 2 | 3 | ![](screenshots/demo1.jpg) 4 | 5 | ## Introduction 6 | 7 | [OpenCV](http://opencv.org) and its [OpenFrameworks](http://openframeworks.cc) addon [ofxCv](https://github.com/kylemcdonald/ofxCv) already provide face tracking with haar cascades. However, detection speed drops considerably as the number of faces in the frame grows, and haar cascades are generally not good at tilted faces. The newer versions of OpenCV (3.3+) introduced a `dnn` module that facilitates importing models from many different neural network frameworks, as explored in this [pyimagesearch blog](https://www.pyimagesearch.com/2017/08/21/deep-learning-with-opencv/). This allows us to use a [good caffe face detection model](https://github.com/opencv/opencv/tree/master/samples/dnn/face_detector) that runs lightning fast even when there are a ton of faces in the same frame. However, the version of OpenCV currently bundled with OpenFrameworks is way out of date. One approach is to try compiling OpenFrameworks with a newer version of OpenCV, as discussed in this [OF forum thread](https://forum.openframeworks.cc/t/how-do-i-use-an-alternative-version-of-opencv-with-of/23280/14). Another approach is to link directly against an OpenCV installation on the system, which is what we'll be doing in this document. 8 | 9 | This process can be somewhat complex depending on your familiarity with C++/linking/OF/OpenCV/Xcode. This repo aims to make it much easier by providing detailed steps and ready-to-use code. Hopefully I'll be able to find an even more convenient solution in the future. 
It is only tested on macOS, but might work in a similar way on Windows/Linux. 10 | 11 | ## Installing Dependencies 12 | 13 | First install OpenCV (3.3+). I tested with 3.4.1 and 4.0.1, but other versions should work too as long as they're >= 3.3. 14 | 15 | I recommend using [brew](http://brew.sh): 16 | 17 | ```bash 18 | brew install opencv 19 | ``` 20 | I tested with OpenFrameworks 0.9.8 and 0.10.1. Download OpenFrameworks from [the official website](https://openframeworks.cc/download/). No addon is necessary, since we'll be linking against the system OpenCV. 21 | 22 | You may also need to install either the macOS [command line tools](http://osxdaily.com/2014/02/12/install-command-line-tools-mac-os-x/) or [Xcode](https://developer.apple.com/xcode/). 23 | 24 | And of course, you need to download or clone this repo. 25 | 26 | ## Fixing OpenCV 27 | 28 | There is a conflict between an OpenCV identifier and a macOS macro: macOS's Objective-C headers define `NO` as a macro, which collides with the `NO` enumerator in OpenCV's stitching module. This causes an error during compilation when using OpenFrameworks. This section introduces a workaround. You can also skip this section for now, and come back to make the fix once you've confirmed that the error actually occurs when compiling on your system. 29 | 30 | The error will look something like this: 31 | 32 | ``` 33 | /usr/local/include/opencv2/stitching/detail/blenders.hpp:67:12: error: expected 34 | identifier 35 | enum { NO, FEATHER, MULTI_BAND }; 36 | ^ 37 | ``` 38 | 39 | To fix the problem, go to `/usr/local/Cellar/opencv/3.4.1_5/include/opencv2/opencv.hpp` (or wherever your OpenCV installation is) and comment out line 89, `#include "opencv2/stitching.hpp"`, since we're not using that functionality anyway. 40 | 41 | 42 | 43 | ## C++ & OpenCV only (no OF) 44 | 45 | I think it is a good idea to first get OpenCV and the DNN module working with plain C++, and incorporate OpenFrameworks later. 46 | 47 | First, `cd` to the `noof` directory of this repo, then try: 48 | 49 | ```bash 50 | c++ main.cpp -std=c++11 -lopencv_highgui -lopencv_imgproc -lopencv_dnn -lopencv_core -lopencv_videoio -I/usr/local/Cellar/opencv/3.4.1_5/include 51 | ``` 52 | 53 | You **must** substitute `-I/usr/local/Cellar/opencv/3.4.1_5/include` with the correct path and version of OpenCV installed on your system. (The above command is the minimal version; if you want to use other OpenCV functionality, you can find the full version for copy-pasting on the first line of `noof/main.cpp`.) 54 | 55 | If you're using OpenCV 4.x, the `-I` path is a bit different; note the extra `/opencv4` at the end: 56 | 57 | ```bash 58 | c++ main.cpp -std=c++11 -lopencv_highgui -lopencv_imgproc -lopencv_dnn -lopencv_core -lopencv_videoio -I/usr/local/Cellar/opencv/4.1.0_1/include/opencv4 59 | ``` 60 | 61 | Hopefully this compiles without errors. Then try: 62 | 63 | ```bash 64 | ./a.out 65 | ``` 66 | 67 | to launch the app. A window should pop up showing the realtime video feed from your webcam, with a box around each face in the frame. 68 | 69 | The main code is located in `noof/main.cpp`, and the neural network wrapper is located in `noof/caffe_face_det.h`. You'll see a lot of magic numbers in `caffe_face_det.h`; they're taken from this [pyimagesearch blog](https://www.pyimagesearch.com/2018/02/26/face-detection-with-opencv-and-deep-learning/). Messing with these numbers has, in my testing, proven to be a bad idea. 70 | 71 | You might notice some false positives. This is because the default threshold is low; you can add a second argument to `detector.detect(frame)` in `main.cpp` to specify the threshold (0.0-1.0), e.g. 
72 | 73 | ```c++ 74 | vector<a_det> detections = detector.detect(frame, 0.5); 75 | ``` 76 | 77 | If you have a lot of small faces in the frame, like a mass surveillance situation, you might want a lower threshold to detect more faces. If there are only a few large faces in the frame, then a higher threshold works better. 78 | 79 | 80 | ## OpenFrameworks + makefile 81 | 82 | Now that we know OpenCV is working, we can add OpenFrameworks support. I think it is a good idea to first get the `make` and `make RunRelease` commands working, before we delve into the Xcode blackbox in the next section. 83 | 84 | - First, use the OpenFrameworks projectGenerator to make a new project. You will **NOT** need any addons. **DO NOT** add ofxOpenCv or ofxCv. Just create a plain, empty project. 85 | 86 | - Say you named the project `MyProject`. Now replace the contents of the `MyProject/src` folder with the contents of the `of/src` folder from this repo. Then also copy the `data/models` folder from this repo to `MyProject/bin/data`. 87 | 88 | 89 | - Now copy-paste the contents of `of/config.make` from this repo to `MyProject/config.make`. However, you'll need to replace the OpenCV path and version in the `PROJECT_LDFLAGS += -I...` line. Basically you'll be adding these lines to your `config.make` file: 90 | 91 | ``` 92 | PROJECT_LDFLAGS += -lopencv_videostab -lopencv_photo -lopencv_objdetect -lopencv_video -lopencv_ml -lopencv_calib3d -lopencv_features2d -lopencv_highgui -lopencv_flann -lopencv_imgproc -lopencv_dnn -lopencv_imgcodecs -lopencv_core 93 | PROJECT_LDFLAGS += -I/usr/local/Cellar/opencv/3.4.1_5/include 94 | ``` 95 | 96 | - If you are using OpenCV 4.0.1, the include path is a little different: 97 | 98 | ``` 99 | PROJECT_LDFLAGS += -lopencv_videostab -lopencv_photo -lopencv_objdetect -lopencv_video -lopencv_ml -lopencv_calib3d -lopencv_features2d -lopencv_highgui -lopencv_flann -lopencv_imgproc -lopencv_dnn -lopencv_imgcodecs -lopencv_core 100 | PROJECT_LDFLAGS += -I/usr/local/Cellar/opencv/4.0.1/include/opencv4 101 | ``` 102 | 103 | ![](screenshots/ldflags.jpg) 104 | 105 | - You might also need to add `PROJECT_CFLAGS += ` with the exact same flags as `PROJECT_LDFLAGS` to the `config.make` file. 106 | 107 | ``` 108 | PROJECT_CFLAGS += -lopencv_videostab -lopencv_photo -lopencv_objdetect -lopencv_video -lopencv_ml -lopencv_calib3d -lopencv_features2d -lopencv_highgui -lopencv_flann -lopencv_imgproc -lopencv_dnn -lopencv_imgcodecs -lopencv_core 109 | PROJECT_CFLAGS += -I/usr/local/Cellar/opencv/4.0.1/include/opencv4 110 | ``` 111 | ![](screenshots/cflags.jpg) 112 | 113 | 114 | 115 | - Now, `cd` into the `MyProject` folder, and try: 116 | 117 | ```bash 118 | make 119 | ``` 120 | 121 | - If you see an error complaining about `enum` and `NO` etc., see the **Fixing OpenCV** section above for a workaround. Then run `make` again. 122 | 123 | - If there are no errors, run 124 | 125 | ```bash 126 | make RunRelease 127 | ``` 128 | 129 | A window should pop up showing the realtime video feed from your webcam, with a box around each face in the frame. Again, you might want to adjust the detection threshold, using the same method described at the end of the previous section. 130 | 131 | 132 | ## OpenFrameworks + Xcode 133 | 134 | Great! The most difficult parts are now done. You can run your OpenFrameworks face detection projects from the commandline. 
However, if you for some reason prefer working in the Xcode GUI, you can do so by making the following changes: 135 | 136 | - Click the project name in the left sidebar, then go to `Build Settings`, and find the `Search Paths` section. The section is hard to find, but it's there. 137 | 138 | ![](screenshots/xcode1.png) 139 | 140 | - Add two more entries to `Header Search Paths`: `/usr/local/include` and `/usr/local/Cellar/opencv/3.4.1_5/include`. Again, change the OpenCV path to the correct version and location of your installation. 141 | (Add them by clicking the text on the right; a box will pop up, then click the + sign.) 142 | 143 | ![](screenshots/xcode2.png) 144 | 145 | - If you are using Xcode 10 with OpenCV 4.0.1 and openFrameworks 0.10.1, you should change your `Header Search Paths` to this: 146 | 147 | ![](screenshots/xcode4.png) 148 | 149 | - Find `Linking > Other Link Flags`, and add a bunch of new flags. Don't worry: you don't need to add them one by one; just enter all of them at once separated by spaces, and Xcode will format them automatically. For example, you can copy-paste the line below: 150 | 151 | ``` 152 | -I/usr/local/Cellar/opencv/3.4.1_5/include -L/usr/local/lib -lopencv_videostab -lopencv_photo -lopencv_objdetect -lopencv_video -lopencv_ml -lopencv_calib3d -lopencv_features2d -lopencv_highgui -lopencv_flann -lopencv_imgproc -lopencv_dnn -lopencv_imgcodecs -lopencv_core 153 | ``` 154 | 155 | ![](screenshots/xcode3.png) 156 | 157 | - The include path is different when using OpenCV 4.0.1: 158 | 159 | ``` 160 | -I/usr/local/Cellar/opencv/4.0.1/include/opencv4 -L/usr/local/lib -lopencv_videostab -lopencv_photo -lopencv_objdetect -lopencv_video -lopencv_ml -lopencv_calib3d -lopencv_features2d -lopencv_highgui -lopencv_flann -lopencv_imgproc -lopencv_dnn -lopencv_imgcodecs -lopencv_core 161 | ``` 162 | 163 | ![](screenshots/xcode5.png) 164 | 165 | - You might also need to add the exact same flags to `Apple Clang - Custom Compiler Flags > Other C Flags`. 166 | 167 | 168 | ![](screenshots/xcode6.png) 169 | 170 | - Finally, click the play button; everything should work now! 171 | 172 | 173 | ## Face Embedding and Tracking 174 | 175 | So far, we can obtain all the bounding boxes of the faces in the frame. However, we can also do some embedding (for face recognition) and some tracking. I already have working code for that, and will hopefully publish documentation for it soon. 
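In the meantime, here is a minimal sketch of what naive frame-to-frame tracking on top of this detector could look like: each detection in the current frame is greedily matched to the previous frame's box with the highest overlap (intersection-over-union), and anything unmatched gets a fresh id. Note that `a_det` is the box struct from `caffe_face_det.h`, while `tracked_det`, `track()` and the 0.3 IoU cutoff are illustrative choices for this sketch, not code that ships in this repo:

```c++
#include <algorithm>
#include <vector>

// a_det is the detection box struct from caffe_face_det.h:
// {left, top, right, bottom, certainty}
struct tracked_det { a_det det; int id; };

// intersection-over-union of two boxes; 0 when they don't overlap
inline float iou(const a_det& a, const a_det& b){
    float ix = std::max(0.f, std::min(a.right, b.right) - std::max(a.left, b.left));
    float iy = std::max(0.f, std::min(a.bottom, b.bottom) - std::max(a.top, b.top));
    float inter = ix * iy;
    float uni = (a.right - a.left) * (a.bottom - a.top)
              + (b.right - b.left) * (b.bottom - b.top) - inter;
    return uni > 0 ? inter / uni : 0;
}

// carry ids over from the previous frame; unmatched detections get fresh ids
std::vector<tracked_det> track(const std::vector<tracked_det>& prev,
                               const std::vector<a_det>& cur, int& next_id){
    std::vector<tracked_det> out;
    for (const a_det& d : cur){
        int best = -1;
        float best_iou = 0.3f; // minimum overlap to count as the same face
        for (int j = 0; j < (int)prev.size(); j++){
            float v = iou(d, prev[j].det);
            if (v > best_iou){ best_iou = v; best = j; }
        }
        out.push_back({d, best >= 0 ? prev[best].id : next_id++});
    }
    return out;
}
```

Calling `tracks = track(tracks, detector.detect(mat), next_id);` once per frame (e.g. in `draw()`) then gives each face an id that stays stable while it remains in view. Being greedy, this can occasionally hand one previous box to two current detections; removing matched boxes from the pool would fix that, but for a handful of faces the simple version is usually good enough.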
176 | 177 | 178 | 179 | 180 | -------------------------------------------------------------------------------- /data/models/deploy.prototxt: -------------------------------------------------------------------------------- 1 | input: "data" 2 | input_shape { 3 | dim: 1 4 | dim: 3 5 | dim: 300 6 | dim: 300 7 | } 8 | 9 | layer { 10 | name: "data_bn" 11 | type: "BatchNorm" 12 | bottom: "data" 13 | top: "data_bn" 14 | param { 15 | lr_mult: 0.0 16 | } 17 | param { 18 | lr_mult: 0.0 19 | } 20 | param { 21 | lr_mult: 0.0 22 | } 23 | } 24 | layer { 25 | name: "data_scale" 26 | type: "Scale" 27 | bottom: "data_bn" 28 | top: "data_bn" 29 | param { 30 | lr_mult: 1.0 31 | decay_mult: 1.0 32 | } 33 | param { 34 | lr_mult: 2.0 35 | decay_mult: 1.0 36 | } 37 | scale_param { 38 | bias_term: true 39 | } 40 | } 41 | layer { 42 | name: "conv1_h" 43 | type: "Convolution" 44 | bottom: "data_bn" 45 | top: "conv1_h" 46 | param { 47 | lr_mult: 1.0 48 | decay_mult: 1.0 49 | } 50 | param { 51 | lr_mult: 2.0 52 | decay_mult: 1.0 53 | } 54 | convolution_param { 55 | num_output: 32 56 | pad: 3 57 | kernel_size: 7 58 | stride: 2 59 | weight_filler { 60 | type: "msra" 61 | variance_norm: FAN_OUT 62 | } 63 | bias_filler { 64 | type: "constant" 65 | value: 0.0 66 | } 67 | } 68 | } 69 | layer { 70 | name: "conv1_bn_h" 71 | type: "BatchNorm" 72 | bottom: "conv1_h" 73 | top: "conv1_h" 74 | param { 75 | lr_mult: 0.0 76 | } 77 | param { 78 | lr_mult: 0.0 79 | } 80 | param { 81 | lr_mult: 0.0 82 | } 83 | } 84 | layer { 85 | name: "conv1_scale_h" 86 | type: "Scale" 87 | bottom: "conv1_h" 88 | top: "conv1_h" 89 | param { 90 | lr_mult: 1.0 91 | decay_mult: 1.0 92 | } 93 | param { 94 | lr_mult: 2.0 95 | decay_mult: 1.0 96 | } 97 | scale_param { 98 | bias_term: true 99 | } 100 | } 101 | layer { 102 | name: "conv1_relu" 103 | type: "ReLU" 104 | bottom: "conv1_h" 105 | top: "conv1_h" 106 | } 107 | layer { 108 | name: "conv1_pool" 109 | type: "Pooling" 110 | bottom: "conv1_h" 111 | top: "conv1_pool" 112 | pooling_param { 113 | kernel_size: 3 114 | stride: 2 115 | } 116 | } 117 | layer { 118 | name: "layer_64_1_conv1_h" 119 | type: "Convolution" 120 | bottom: "conv1_pool" 121 | top: "layer_64_1_conv1_h" 122 | param { 123 | lr_mult: 1.0 124 | decay_mult: 1.0 125 | } 126 | convolution_param { 127 | num_output: 32 128 | bias_term: false 129 | pad: 1 130 | kernel_size: 3 131 | stride: 1 132 | weight_filler { 133 | type: "msra" 134 | } 135 | bias_filler { 136 | type: "constant" 137 | value: 0.0 138 | } 139 | } 140 | } 141 | layer { 142 | name: "layer_64_1_bn2_h" 143 | type: "BatchNorm" 144 | bottom: "layer_64_1_conv1_h" 145 | top: "layer_64_1_conv1_h" 146 | param { 147 | lr_mult: 0.0 148 | } 149 | param { 150 | lr_mult: 0.0 151 | } 152 | param { 153 | lr_mult: 0.0 154 | } 155 | } 156 | layer { 157 | name: "layer_64_1_scale2_h" 158 | type: "Scale" 159 | bottom: "layer_64_1_conv1_h" 160 | top: "layer_64_1_conv1_h" 161 | param { 162 | lr_mult: 1.0 163 | decay_mult: 1.0 164 | } 165 | param { 166 | lr_mult: 2.0 167 | decay_mult: 1.0 168 | } 169 | scale_param { 170 | bias_term: true 171 | } 172 | } 173 | layer { 174 | name: "layer_64_1_relu2" 175 | type: "ReLU" 176 | bottom: "layer_64_1_conv1_h" 177 | top: "layer_64_1_conv1_h" 178 | } 179 | layer { 180 | name: "layer_64_1_conv2_h" 181 | type: "Convolution" 182 | bottom: "layer_64_1_conv1_h" 183 | top: "layer_64_1_conv2_h" 184 | param { 185 | lr_mult: 1.0 186 | decay_mult: 1.0 187 | } 188 | convolution_param { 189 | num_output: 32 190 | bias_term: false 191 | pad: 1 192 | kernel_size: 3 193 | 
stride: 1 194 | weight_filler { 195 | type: "msra" 196 | } 197 | bias_filler { 198 | type: "constant" 199 | value: 0.0 200 | } 201 | } 202 | } 203 | layer { 204 | name: "layer_64_1_sum" 205 | type: "Eltwise" 206 | bottom: "layer_64_1_conv2_h" 207 | bottom: "conv1_pool" 208 | top: "layer_64_1_sum" 209 | } 210 | layer { 211 | name: "layer_128_1_bn1_h" 212 | type: "BatchNorm" 213 | bottom: "layer_64_1_sum" 214 | top: "layer_128_1_bn1_h" 215 | param { 216 | lr_mult: 0.0 217 | } 218 | param { 219 | lr_mult: 0.0 220 | } 221 | param { 222 | lr_mult: 0.0 223 | } 224 | } 225 | layer { 226 | name: "layer_128_1_scale1_h" 227 | type: "Scale" 228 | bottom: "layer_128_1_bn1_h" 229 | top: "layer_128_1_bn1_h" 230 | param { 231 | lr_mult: 1.0 232 | decay_mult: 1.0 233 | } 234 | param { 235 | lr_mult: 2.0 236 | decay_mult: 1.0 237 | } 238 | scale_param { 239 | bias_term: true 240 | } 241 | } 242 | layer { 243 | name: "layer_128_1_relu1" 244 | type: "ReLU" 245 | bottom: "layer_128_1_bn1_h" 246 | top: "layer_128_1_bn1_h" 247 | } 248 | layer { 249 | name: "layer_128_1_conv1_h" 250 | type: "Convolution" 251 | bottom: "layer_128_1_bn1_h" 252 | top: "layer_128_1_conv1_h" 253 | param { 254 | lr_mult: 1.0 255 | decay_mult: 1.0 256 | } 257 | convolution_param { 258 | num_output: 128 259 | bias_term: false 260 | pad: 1 261 | kernel_size: 3 262 | stride: 2 263 | weight_filler { 264 | type: "msra" 265 | } 266 | bias_filler { 267 | type: "constant" 268 | value: 0.0 269 | } 270 | } 271 | } 272 | layer { 273 | name: "layer_128_1_bn2" 274 | type: "BatchNorm" 275 | bottom: "layer_128_1_conv1_h" 276 | top: "layer_128_1_conv1_h" 277 | param { 278 | lr_mult: 0.0 279 | } 280 | param { 281 | lr_mult: 0.0 282 | } 283 | param { 284 | lr_mult: 0.0 285 | } 286 | } 287 | layer { 288 | name: "layer_128_1_scale2" 289 | type: "Scale" 290 | bottom: "layer_128_1_conv1_h" 291 | top: "layer_128_1_conv1_h" 292 | param { 293 | lr_mult: 1.0 294 | decay_mult: 1.0 295 | } 296 | param { 297 | lr_mult: 2.0 298 | decay_mult: 1.0 299 | } 300 | scale_param { 301 | bias_term: true 302 | } 303 | } 304 | layer { 305 | name: "layer_128_1_relu2" 306 | type: "ReLU" 307 | bottom: "layer_128_1_conv1_h" 308 | top: "layer_128_1_conv1_h" 309 | } 310 | layer { 311 | name: "layer_128_1_conv2" 312 | type: "Convolution" 313 | bottom: "layer_128_1_conv1_h" 314 | top: "layer_128_1_conv2" 315 | param { 316 | lr_mult: 1.0 317 | decay_mult: 1.0 318 | } 319 | convolution_param { 320 | num_output: 128 321 | bias_term: false 322 | pad: 1 323 | kernel_size: 3 324 | stride: 1 325 | weight_filler { 326 | type: "msra" 327 | } 328 | bias_filler { 329 | type: "constant" 330 | value: 0.0 331 | } 332 | } 333 | } 334 | layer { 335 | name: "layer_128_1_conv_expand_h" 336 | type: "Convolution" 337 | bottom: "layer_128_1_bn1_h" 338 | top: "layer_128_1_conv_expand_h" 339 | param { 340 | lr_mult: 1.0 341 | decay_mult: 1.0 342 | } 343 | convolution_param { 344 | num_output: 128 345 | bias_term: false 346 | pad: 0 347 | kernel_size: 1 348 | stride: 2 349 | weight_filler { 350 | type: "msra" 351 | } 352 | bias_filler { 353 | type: "constant" 354 | value: 0.0 355 | } 356 | } 357 | } 358 | layer { 359 | name: "layer_128_1_sum" 360 | type: "Eltwise" 361 | bottom: "layer_128_1_conv2" 362 | bottom: "layer_128_1_conv_expand_h" 363 | top: "layer_128_1_sum" 364 | } 365 | layer { 366 | name: "layer_256_1_bn1" 367 | type: "BatchNorm" 368 | bottom: "layer_128_1_sum" 369 | top: "layer_256_1_bn1" 370 | param { 371 | lr_mult: 0.0 372 | } 373 | param { 374 | lr_mult: 0.0 375 | } 376 | param { 377 | 
lr_mult: 0.0 378 | } 379 | } 380 | layer { 381 | name: "layer_256_1_scale1" 382 | type: "Scale" 383 | bottom: "layer_256_1_bn1" 384 | top: "layer_256_1_bn1" 385 | param { 386 | lr_mult: 1.0 387 | decay_mult: 1.0 388 | } 389 | param { 390 | lr_mult: 2.0 391 | decay_mult: 1.0 392 | } 393 | scale_param { 394 | bias_term: true 395 | } 396 | } 397 | layer { 398 | name: "layer_256_1_relu1" 399 | type: "ReLU" 400 | bottom: "layer_256_1_bn1" 401 | top: "layer_256_1_bn1" 402 | } 403 | layer { 404 | name: "layer_256_1_conv1" 405 | type: "Convolution" 406 | bottom: "layer_256_1_bn1" 407 | top: "layer_256_1_conv1" 408 | param { 409 | lr_mult: 1.0 410 | decay_mult: 1.0 411 | } 412 | convolution_param { 413 | num_output: 256 414 | bias_term: false 415 | pad: 1 416 | kernel_size: 3 417 | stride: 2 418 | weight_filler { 419 | type: "msra" 420 | } 421 | bias_filler { 422 | type: "constant" 423 | value: 0.0 424 | } 425 | } 426 | } 427 | layer { 428 | name: "layer_256_1_bn2" 429 | type: "BatchNorm" 430 | bottom: "layer_256_1_conv1" 431 | top: "layer_256_1_conv1" 432 | param { 433 | lr_mult: 0.0 434 | } 435 | param { 436 | lr_mult: 0.0 437 | } 438 | param { 439 | lr_mult: 0.0 440 | } 441 | } 442 | layer { 443 | name: "layer_256_1_scale2" 444 | type: "Scale" 445 | bottom: "layer_256_1_conv1" 446 | top: "layer_256_1_conv1" 447 | param { 448 | lr_mult: 1.0 449 | decay_mult: 1.0 450 | } 451 | param { 452 | lr_mult: 2.0 453 | decay_mult: 1.0 454 | } 455 | scale_param { 456 | bias_term: true 457 | } 458 | } 459 | layer { 460 | name: "layer_256_1_relu2" 461 | type: "ReLU" 462 | bottom: "layer_256_1_conv1" 463 | top: "layer_256_1_conv1" 464 | } 465 | layer { 466 | name: "layer_256_1_conv2" 467 | type: "Convolution" 468 | bottom: "layer_256_1_conv1" 469 | top: "layer_256_1_conv2" 470 | param { 471 | lr_mult: 1.0 472 | decay_mult: 1.0 473 | } 474 | convolution_param { 475 | num_output: 256 476 | bias_term: false 477 | pad: 1 478 | kernel_size: 3 479 | stride: 1 480 | weight_filler { 481 | type: "msra" 482 | } 483 | bias_filler { 484 | type: "constant" 485 | value: 0.0 486 | } 487 | } 488 | } 489 | layer { 490 | name: "layer_256_1_conv_expand" 491 | type: "Convolution" 492 | bottom: "layer_256_1_bn1" 493 | top: "layer_256_1_conv_expand" 494 | param { 495 | lr_mult: 1.0 496 | decay_mult: 1.0 497 | } 498 | convolution_param { 499 | num_output: 256 500 | bias_term: false 501 | pad: 0 502 | kernel_size: 1 503 | stride: 2 504 | weight_filler { 505 | type: "msra" 506 | } 507 | bias_filler { 508 | type: "constant" 509 | value: 0.0 510 | } 511 | } 512 | } 513 | layer { 514 | name: "layer_256_1_sum" 515 | type: "Eltwise" 516 | bottom: "layer_256_1_conv2" 517 | bottom: "layer_256_1_conv_expand" 518 | top: "layer_256_1_sum" 519 | } 520 | layer { 521 | name: "layer_512_1_bn1" 522 | type: "BatchNorm" 523 | bottom: "layer_256_1_sum" 524 | top: "layer_512_1_bn1" 525 | param { 526 | lr_mult: 0.0 527 | } 528 | param { 529 | lr_mult: 0.0 530 | } 531 | param { 532 | lr_mult: 0.0 533 | } 534 | } 535 | layer { 536 | name: "layer_512_1_scale1" 537 | type: "Scale" 538 | bottom: "layer_512_1_bn1" 539 | top: "layer_512_1_bn1" 540 | param { 541 | lr_mult: 1.0 542 | decay_mult: 1.0 543 | } 544 | param { 545 | lr_mult: 2.0 546 | decay_mult: 1.0 547 | } 548 | scale_param { 549 | bias_term: true 550 | } 551 | } 552 | layer { 553 | name: "layer_512_1_relu1" 554 | type: "ReLU" 555 | bottom: "layer_512_1_bn1" 556 | top: "layer_512_1_bn1" 557 | } 558 | layer { 559 | name: "layer_512_1_conv1_h" 560 | type: "Convolution" 561 | bottom: "layer_512_1_bn1" 
562 | top: "layer_512_1_conv1_h" 563 | param { 564 | lr_mult: 1.0 565 | decay_mult: 1.0 566 | } 567 | convolution_param { 568 | num_output: 128 569 | bias_term: false 570 | pad: 1 571 | kernel_size: 3 572 | stride: 1 # 2 573 | weight_filler { 574 | type: "msra" 575 | } 576 | bias_filler { 577 | type: "constant" 578 | value: 0.0 579 | } 580 | } 581 | } 582 | layer { 583 | name: "layer_512_1_bn2_h" 584 | type: "BatchNorm" 585 | bottom: "layer_512_1_conv1_h" 586 | top: "layer_512_1_conv1_h" 587 | param { 588 | lr_mult: 0.0 589 | } 590 | param { 591 | lr_mult: 0.0 592 | } 593 | param { 594 | lr_mult: 0.0 595 | } 596 | } 597 | layer { 598 | name: "layer_512_1_scale2_h" 599 | type: "Scale" 600 | bottom: "layer_512_1_conv1_h" 601 | top: "layer_512_1_conv1_h" 602 | param { 603 | lr_mult: 1.0 604 | decay_mult: 1.0 605 | } 606 | param { 607 | lr_mult: 2.0 608 | decay_mult: 1.0 609 | } 610 | scale_param { 611 | bias_term: true 612 | } 613 | } 614 | layer { 615 | name: "layer_512_1_relu2" 616 | type: "ReLU" 617 | bottom: "layer_512_1_conv1_h" 618 | top: "layer_512_1_conv1_h" 619 | } 620 | layer { 621 | name: "layer_512_1_conv2_h" 622 | type: "Convolution" 623 | bottom: "layer_512_1_conv1_h" 624 | top: "layer_512_1_conv2_h" 625 | param { 626 | lr_mult: 1.0 627 | decay_mult: 1.0 628 | } 629 | convolution_param { 630 | num_output: 256 631 | bias_term: false 632 | pad: 2 # 1 633 | kernel_size: 3 634 | stride: 1 635 | dilation: 2 636 | weight_filler { 637 | type: "msra" 638 | } 639 | bias_filler { 640 | type: "constant" 641 | value: 0.0 642 | } 643 | } 644 | } 645 | layer { 646 | name: "layer_512_1_conv_expand_h" 647 | type: "Convolution" 648 | bottom: "layer_512_1_bn1" 649 | top: "layer_512_1_conv_expand_h" 650 | param { 651 | lr_mult: 1.0 652 | decay_mult: 1.0 653 | } 654 | convolution_param { 655 | num_output: 256 656 | bias_term: false 657 | pad: 0 658 | kernel_size: 1 659 | stride: 1 # 2 660 | weight_filler { 661 | type: "msra" 662 | } 663 | bias_filler { 664 | type: "constant" 665 | value: 0.0 666 | } 667 | } 668 | } 669 | layer { 670 | name: "layer_512_1_sum" 671 | type: "Eltwise" 672 | bottom: "layer_512_1_conv2_h" 673 | bottom: "layer_512_1_conv_expand_h" 674 | top: "layer_512_1_sum" 675 | } 676 | layer { 677 | name: "last_bn_h" 678 | type: "BatchNorm" 679 | bottom: "layer_512_1_sum" 680 | top: "layer_512_1_sum" 681 | param { 682 | lr_mult: 0.0 683 | } 684 | param { 685 | lr_mult: 0.0 686 | } 687 | param { 688 | lr_mult: 0.0 689 | } 690 | } 691 | layer { 692 | name: "last_scale_h" 693 | type: "Scale" 694 | bottom: "layer_512_1_sum" 695 | top: "layer_512_1_sum" 696 | param { 697 | lr_mult: 1.0 698 | decay_mult: 1.0 699 | } 700 | param { 701 | lr_mult: 2.0 702 | decay_mult: 1.0 703 | } 704 | scale_param { 705 | bias_term: true 706 | } 707 | } 708 | layer { 709 | name: "last_relu" 710 | type: "ReLU" 711 | bottom: "layer_512_1_sum" 712 | top: "fc7" 713 | } 714 | 715 | layer { 716 | name: "conv6_1_h" 717 | type: "Convolution" 718 | bottom: "fc7" 719 | top: "conv6_1_h" 720 | param { 721 | lr_mult: 1 722 | decay_mult: 1 723 | } 724 | param { 725 | lr_mult: 2 726 | decay_mult: 0 727 | } 728 | convolution_param { 729 | num_output: 128 730 | pad: 0 731 | kernel_size: 1 732 | stride: 1 733 | weight_filler { 734 | type: "xavier" 735 | } 736 | bias_filler { 737 | type: "constant" 738 | value: 0 739 | } 740 | } 741 | } 742 | layer { 743 | name: "conv6_1_relu" 744 | type: "ReLU" 745 | bottom: "conv6_1_h" 746 | top: "conv6_1_h" 747 | } 748 | layer { 749 | name: "conv6_2_h" 750 | type: "Convolution" 751 | bottom: 
"conv6_1_h" 752 | top: "conv6_2_h" 753 | param { 754 | lr_mult: 1 755 | decay_mult: 1 756 | } 757 | param { 758 | lr_mult: 2 759 | decay_mult: 0 760 | } 761 | convolution_param { 762 | num_output: 256 763 | pad: 1 764 | kernel_size: 3 765 | stride: 2 766 | weight_filler { 767 | type: "xavier" 768 | } 769 | bias_filler { 770 | type: "constant" 771 | value: 0 772 | } 773 | } 774 | } 775 | layer { 776 | name: "conv6_2_relu" 777 | type: "ReLU" 778 | bottom: "conv6_2_h" 779 | top: "conv6_2_h" 780 | } 781 | layer { 782 | name: "conv7_1_h" 783 | type: "Convolution" 784 | bottom: "conv6_2_h" 785 | top: "conv7_1_h" 786 | param { 787 | lr_mult: 1 788 | decay_mult: 1 789 | } 790 | param { 791 | lr_mult: 2 792 | decay_mult: 0 793 | } 794 | convolution_param { 795 | num_output: 64 796 | pad: 0 797 | kernel_size: 1 798 | stride: 1 799 | weight_filler { 800 | type: "xavier" 801 | } 802 | bias_filler { 803 | type: "constant" 804 | value: 0 805 | } 806 | } 807 | } 808 | layer { 809 | name: "conv7_1_relu" 810 | type: "ReLU" 811 | bottom: "conv7_1_h" 812 | top: "conv7_1_h" 813 | } 814 | layer { 815 | name: "conv7_2_h" 816 | type: "Convolution" 817 | bottom: "conv7_1_h" 818 | top: "conv7_2_h" 819 | param { 820 | lr_mult: 1 821 | decay_mult: 1 822 | } 823 | param { 824 | lr_mult: 2 825 | decay_mult: 0 826 | } 827 | convolution_param { 828 | num_output: 128 829 | pad: 1 830 | kernel_size: 3 831 | stride: 2 832 | weight_filler { 833 | type: "xavier" 834 | } 835 | bias_filler { 836 | type: "constant" 837 | value: 0 838 | } 839 | } 840 | } 841 | layer { 842 | name: "conv7_2_relu" 843 | type: "ReLU" 844 | bottom: "conv7_2_h" 845 | top: "conv7_2_h" 846 | } 847 | layer { 848 | name: "conv8_1_h" 849 | type: "Convolution" 850 | bottom: "conv7_2_h" 851 | top: "conv8_1_h" 852 | param { 853 | lr_mult: 1 854 | decay_mult: 1 855 | } 856 | param { 857 | lr_mult: 2 858 | decay_mult: 0 859 | } 860 | convolution_param { 861 | num_output: 64 862 | pad: 0 863 | kernel_size: 1 864 | stride: 1 865 | weight_filler { 866 | type: "xavier" 867 | } 868 | bias_filler { 869 | type: "constant" 870 | value: 0 871 | } 872 | } 873 | } 874 | layer { 875 | name: "conv8_1_relu" 876 | type: "ReLU" 877 | bottom: "conv8_1_h" 878 | top: "conv8_1_h" 879 | } 880 | layer { 881 | name: "conv8_2_h" 882 | type: "Convolution" 883 | bottom: "conv8_1_h" 884 | top: "conv8_2_h" 885 | param { 886 | lr_mult: 1 887 | decay_mult: 1 888 | } 889 | param { 890 | lr_mult: 2 891 | decay_mult: 0 892 | } 893 | convolution_param { 894 | num_output: 128 895 | pad: 1 896 | kernel_size: 3 897 | stride: 1 898 | weight_filler { 899 | type: "xavier" 900 | } 901 | bias_filler { 902 | type: "constant" 903 | value: 0 904 | } 905 | } 906 | } 907 | layer { 908 | name: "conv8_2_relu" 909 | type: "ReLU" 910 | bottom: "conv8_2_h" 911 | top: "conv8_2_h" 912 | } 913 | layer { 914 | name: "conv9_1_h" 915 | type: "Convolution" 916 | bottom: "conv8_2_h" 917 | top: "conv9_1_h" 918 | param { 919 | lr_mult: 1 920 | decay_mult: 1 921 | } 922 | param { 923 | lr_mult: 2 924 | decay_mult: 0 925 | } 926 | convolution_param { 927 | num_output: 64 928 | pad: 0 929 | kernel_size: 1 930 | stride: 1 931 | weight_filler { 932 | type: "xavier" 933 | } 934 | bias_filler { 935 | type: "constant" 936 | value: 0 937 | } 938 | } 939 | } 940 | layer { 941 | name: "conv9_1_relu" 942 | type: "ReLU" 943 | bottom: "conv9_1_h" 944 | top: "conv9_1_h" 945 | } 946 | layer { 947 | name: "conv9_2_h" 948 | type: "Convolution" 949 | bottom: "conv9_1_h" 950 | top: "conv9_2_h" 951 | param { 952 | lr_mult: 1 953 | decay_mult: 
1 954 | } 955 | param { 956 | lr_mult: 2 957 | decay_mult: 0 958 | } 959 | convolution_param { 960 | num_output: 128 961 | pad: 1 962 | kernel_size: 3 963 | stride: 1 964 | weight_filler { 965 | type: "xavier" 966 | } 967 | bias_filler { 968 | type: "constant" 969 | value: 0 970 | } 971 | } 972 | } 973 | layer { 974 | name: "conv9_2_relu" 975 | type: "ReLU" 976 | bottom: "conv9_2_h" 977 | top: "conv9_2_h" 978 | } 979 | layer { 980 | name: "conv4_3_norm" 981 | type: "Normalize" 982 | bottom: "layer_256_1_bn1" 983 | top: "conv4_3_norm" 984 | norm_param { 985 | across_spatial: false 986 | scale_filler { 987 | type: "constant" 988 | value: 20 989 | } 990 | channel_shared: false 991 | } 992 | } 993 | layer { 994 | name: "conv4_3_norm_mbox_loc" 995 | type: "Convolution" 996 | bottom: "conv4_3_norm" 997 | top: "conv4_3_norm_mbox_loc" 998 | param { 999 | lr_mult: 1 1000 | decay_mult: 1 1001 | } 1002 | param { 1003 | lr_mult: 2 1004 | decay_mult: 0 1005 | } 1006 | convolution_param { 1007 | num_output: 16 1008 | pad: 1 1009 | kernel_size: 3 1010 | stride: 1 1011 | weight_filler { 1012 | type: "xavier" 1013 | } 1014 | bias_filler { 1015 | type: "constant" 1016 | value: 0 1017 | } 1018 | } 1019 | } 1020 | layer { 1021 | name: "conv4_3_norm_mbox_loc_perm" 1022 | type: "Permute" 1023 | bottom: "conv4_3_norm_mbox_loc" 1024 | top: "conv4_3_norm_mbox_loc_perm" 1025 | permute_param { 1026 | order: 0 1027 | order: 2 1028 | order: 3 1029 | order: 1 1030 | } 1031 | } 1032 | layer { 1033 | name: "conv4_3_norm_mbox_loc_flat" 1034 | type: "Flatten" 1035 | bottom: "conv4_3_norm_mbox_loc_perm" 1036 | top: "conv4_3_norm_mbox_loc_flat" 1037 | flatten_param { 1038 | axis: 1 1039 | } 1040 | } 1041 | layer { 1042 | name: "conv4_3_norm_mbox_conf" 1043 | type: "Convolution" 1044 | bottom: "conv4_3_norm" 1045 | top: "conv4_3_norm_mbox_conf" 1046 | param { 1047 | lr_mult: 1 1048 | decay_mult: 1 1049 | } 1050 | param { 1051 | lr_mult: 2 1052 | decay_mult: 0 1053 | } 1054 | convolution_param { 1055 | num_output: 8 # 84 1056 | pad: 1 1057 | kernel_size: 3 1058 | stride: 1 1059 | weight_filler { 1060 | type: "xavier" 1061 | } 1062 | bias_filler { 1063 | type: "constant" 1064 | value: 0 1065 | } 1066 | } 1067 | } 1068 | layer { 1069 | name: "conv4_3_norm_mbox_conf_perm" 1070 | type: "Permute" 1071 | bottom: "conv4_3_norm_mbox_conf" 1072 | top: "conv4_3_norm_mbox_conf_perm" 1073 | permute_param { 1074 | order: 0 1075 | order: 2 1076 | order: 3 1077 | order: 1 1078 | } 1079 | } 1080 | layer { 1081 | name: "conv4_3_norm_mbox_conf_flat" 1082 | type: "Flatten" 1083 | bottom: "conv4_3_norm_mbox_conf_perm" 1084 | top: "conv4_3_norm_mbox_conf_flat" 1085 | flatten_param { 1086 | axis: 1 1087 | } 1088 | } 1089 | layer { 1090 | name: "conv4_3_norm_mbox_priorbox" 1091 | type: "PriorBox" 1092 | bottom: "conv4_3_norm" 1093 | bottom: "data" 1094 | top: "conv4_3_norm_mbox_priorbox" 1095 | prior_box_param { 1096 | min_size: 30.0 1097 | max_size: 60.0 1098 | aspect_ratio: 2 1099 | flip: true 1100 | clip: false 1101 | variance: 0.1 1102 | variance: 0.1 1103 | variance: 0.2 1104 | variance: 0.2 1105 | step: 8 1106 | offset: 0.5 1107 | } 1108 | } 1109 | layer { 1110 | name: "fc7_mbox_loc" 1111 | type: "Convolution" 1112 | bottom: "fc7" 1113 | top: "fc7_mbox_loc" 1114 | param { 1115 | lr_mult: 1 1116 | decay_mult: 1 1117 | } 1118 | param { 1119 | lr_mult: 2 1120 | decay_mult: 0 1121 | } 1122 | convolution_param { 1123 | num_output: 24 1124 | pad: 1 1125 | kernel_size: 3 1126 | stride: 1 1127 | weight_filler { 1128 | type: "xavier" 1129 | } 1130 | 
bias_filler { 1131 | type: "constant" 1132 | value: 0 1133 | } 1134 | } 1135 | } 1136 | layer { 1137 | name: "fc7_mbox_loc_perm" 1138 | type: "Permute" 1139 | bottom: "fc7_mbox_loc" 1140 | top: "fc7_mbox_loc_perm" 1141 | permute_param { 1142 | order: 0 1143 | order: 2 1144 | order: 3 1145 | order: 1 1146 | } 1147 | } 1148 | layer { 1149 | name: "fc7_mbox_loc_flat" 1150 | type: "Flatten" 1151 | bottom: "fc7_mbox_loc_perm" 1152 | top: "fc7_mbox_loc_flat" 1153 | flatten_param { 1154 | axis: 1 1155 | } 1156 | } 1157 | layer { 1158 | name: "fc7_mbox_conf" 1159 | type: "Convolution" 1160 | bottom: "fc7" 1161 | top: "fc7_mbox_conf" 1162 | param { 1163 | lr_mult: 1 1164 | decay_mult: 1 1165 | } 1166 | param { 1167 | lr_mult: 2 1168 | decay_mult: 0 1169 | } 1170 | convolution_param { 1171 | num_output: 12 # 126 1172 | pad: 1 1173 | kernel_size: 3 1174 | stride: 1 1175 | weight_filler { 1176 | type: "xavier" 1177 | } 1178 | bias_filler { 1179 | type: "constant" 1180 | value: 0 1181 | } 1182 | } 1183 | } 1184 | layer { 1185 | name: "fc7_mbox_conf_perm" 1186 | type: "Permute" 1187 | bottom: "fc7_mbox_conf" 1188 | top: "fc7_mbox_conf_perm" 1189 | permute_param { 1190 | order: 0 1191 | order: 2 1192 | order: 3 1193 | order: 1 1194 | } 1195 | } 1196 | layer { 1197 | name: "fc7_mbox_conf_flat" 1198 | type: "Flatten" 1199 | bottom: "fc7_mbox_conf_perm" 1200 | top: "fc7_mbox_conf_flat" 1201 | flatten_param { 1202 | axis: 1 1203 | } 1204 | } 1205 | layer { 1206 | name: "fc7_mbox_priorbox" 1207 | type: "PriorBox" 1208 | bottom: "fc7" 1209 | bottom: "data" 1210 | top: "fc7_mbox_priorbox" 1211 | prior_box_param { 1212 | min_size: 60.0 1213 | max_size: 111.0 1214 | aspect_ratio: 2 1215 | aspect_ratio: 3 1216 | flip: true 1217 | clip: false 1218 | variance: 0.1 1219 | variance: 0.1 1220 | variance: 0.2 1221 | variance: 0.2 1222 | step: 16 1223 | offset: 0.5 1224 | } 1225 | } 1226 | layer { 1227 | name: "conv6_2_mbox_loc" 1228 | type: "Convolution" 1229 | bottom: "conv6_2_h" 1230 | top: "conv6_2_mbox_loc" 1231 | param { 1232 | lr_mult: 1 1233 | decay_mult: 1 1234 | } 1235 | param { 1236 | lr_mult: 2 1237 | decay_mult: 0 1238 | } 1239 | convolution_param { 1240 | num_output: 24 1241 | pad: 1 1242 | kernel_size: 3 1243 | stride: 1 1244 | weight_filler { 1245 | type: "xavier" 1246 | } 1247 | bias_filler { 1248 | type: "constant" 1249 | value: 0 1250 | } 1251 | } 1252 | } 1253 | layer { 1254 | name: "conv6_2_mbox_loc_perm" 1255 | type: "Permute" 1256 | bottom: "conv6_2_mbox_loc" 1257 | top: "conv6_2_mbox_loc_perm" 1258 | permute_param { 1259 | order: 0 1260 | order: 2 1261 | order: 3 1262 | order: 1 1263 | } 1264 | } 1265 | layer { 1266 | name: "conv6_2_mbox_loc_flat" 1267 | type: "Flatten" 1268 | bottom: "conv6_2_mbox_loc_perm" 1269 | top: "conv6_2_mbox_loc_flat" 1270 | flatten_param { 1271 | axis: 1 1272 | } 1273 | } 1274 | layer { 1275 | name: "conv6_2_mbox_conf" 1276 | type: "Convolution" 1277 | bottom: "conv6_2_h" 1278 | top: "conv6_2_mbox_conf" 1279 | param { 1280 | lr_mult: 1 1281 | decay_mult: 1 1282 | } 1283 | param { 1284 | lr_mult: 2 1285 | decay_mult: 0 1286 | } 1287 | convolution_param { 1288 | num_output: 12 # 126 1289 | pad: 1 1290 | kernel_size: 3 1291 | stride: 1 1292 | weight_filler { 1293 | type: "xavier" 1294 | } 1295 | bias_filler { 1296 | type: "constant" 1297 | value: 0 1298 | } 1299 | } 1300 | } 1301 | layer { 1302 | name: "conv6_2_mbox_conf_perm" 1303 | type: "Permute" 1304 | bottom: "conv6_2_mbox_conf" 1305 | top: "conv6_2_mbox_conf_perm" 1306 | permute_param { 1307 | order: 0 1308 | order: 2 
1309 | order: 3 1310 | order: 1 1311 | } 1312 | } 1313 | layer { 1314 | name: "conv6_2_mbox_conf_flat" 1315 | type: "Flatten" 1316 | bottom: "conv6_2_mbox_conf_perm" 1317 | top: "conv6_2_mbox_conf_flat" 1318 | flatten_param { 1319 | axis: 1 1320 | } 1321 | } 1322 | layer { 1323 | name: "conv6_2_mbox_priorbox" 1324 | type: "PriorBox" 1325 | bottom: "conv6_2_h" 1326 | bottom: "data" 1327 | top: "conv6_2_mbox_priorbox" 1328 | prior_box_param { 1329 | min_size: 111.0 1330 | max_size: 162.0 1331 | aspect_ratio: 2 1332 | aspect_ratio: 3 1333 | flip: true 1334 | clip: false 1335 | variance: 0.1 1336 | variance: 0.1 1337 | variance: 0.2 1338 | variance: 0.2 1339 | step: 32 1340 | offset: 0.5 1341 | } 1342 | } 1343 | layer { 1344 | name: "conv7_2_mbox_loc" 1345 | type: "Convolution" 1346 | bottom: "conv7_2_h" 1347 | top: "conv7_2_mbox_loc" 1348 | param { 1349 | lr_mult: 1 1350 | decay_mult: 1 1351 | } 1352 | param { 1353 | lr_mult: 2 1354 | decay_mult: 0 1355 | } 1356 | convolution_param { 1357 | num_output: 24 1358 | pad: 1 1359 | kernel_size: 3 1360 | stride: 1 1361 | weight_filler { 1362 | type: "xavier" 1363 | } 1364 | bias_filler { 1365 | type: "constant" 1366 | value: 0 1367 | } 1368 | } 1369 | } 1370 | layer { 1371 | name: "conv7_2_mbox_loc_perm" 1372 | type: "Permute" 1373 | bottom: "conv7_2_mbox_loc" 1374 | top: "conv7_2_mbox_loc_perm" 1375 | permute_param { 1376 | order: 0 1377 | order: 2 1378 | order: 3 1379 | order: 1 1380 | } 1381 | } 1382 | layer { 1383 | name: "conv7_2_mbox_loc_flat" 1384 | type: "Flatten" 1385 | bottom: "conv7_2_mbox_loc_perm" 1386 | top: "conv7_2_mbox_loc_flat" 1387 | flatten_param { 1388 | axis: 1 1389 | } 1390 | } 1391 | layer { 1392 | name: "conv7_2_mbox_conf" 1393 | type: "Convolution" 1394 | bottom: "conv7_2_h" 1395 | top: "conv7_2_mbox_conf" 1396 | param { 1397 | lr_mult: 1 1398 | decay_mult: 1 1399 | } 1400 | param { 1401 | lr_mult: 2 1402 | decay_mult: 0 1403 | } 1404 | convolution_param { 1405 | num_output: 12 # 126 1406 | pad: 1 1407 | kernel_size: 3 1408 | stride: 1 1409 | weight_filler { 1410 | type: "xavier" 1411 | } 1412 | bias_filler { 1413 | type: "constant" 1414 | value: 0 1415 | } 1416 | } 1417 | } 1418 | layer { 1419 | name: "conv7_2_mbox_conf_perm" 1420 | type: "Permute" 1421 | bottom: "conv7_2_mbox_conf" 1422 | top: "conv7_2_mbox_conf_perm" 1423 | permute_param { 1424 | order: 0 1425 | order: 2 1426 | order: 3 1427 | order: 1 1428 | } 1429 | } 1430 | layer { 1431 | name: "conv7_2_mbox_conf_flat" 1432 | type: "Flatten" 1433 | bottom: "conv7_2_mbox_conf_perm" 1434 | top: "conv7_2_mbox_conf_flat" 1435 | flatten_param { 1436 | axis: 1 1437 | } 1438 | } 1439 | layer { 1440 | name: "conv7_2_mbox_priorbox" 1441 | type: "PriorBox" 1442 | bottom: "conv7_2_h" 1443 | bottom: "data" 1444 | top: "conv7_2_mbox_priorbox" 1445 | prior_box_param { 1446 | min_size: 162.0 1447 | max_size: 213.0 1448 | aspect_ratio: 2 1449 | aspect_ratio: 3 1450 | flip: true 1451 | clip: false 1452 | variance: 0.1 1453 | variance: 0.1 1454 | variance: 0.2 1455 | variance: 0.2 1456 | step: 64 1457 | offset: 0.5 1458 | } 1459 | } 1460 | layer { 1461 | name: "conv8_2_mbox_loc" 1462 | type: "Convolution" 1463 | bottom: "conv8_2_h" 1464 | top: "conv8_2_mbox_loc" 1465 | param { 1466 | lr_mult: 1 1467 | decay_mult: 1 1468 | } 1469 | param { 1470 | lr_mult: 2 1471 | decay_mult: 0 1472 | } 1473 | convolution_param { 1474 | num_output: 16 1475 | pad: 1 1476 | kernel_size: 3 1477 | stride: 1 1478 | weight_filler { 1479 | type: "xavier" 1480 | } 1481 | bias_filler { 1482 | type: "constant" 
1483 | value: 0 1484 | } 1485 | } 1486 | } 1487 | layer { 1488 | name: "conv8_2_mbox_loc_perm" 1489 | type: "Permute" 1490 | bottom: "conv8_2_mbox_loc" 1491 | top: "conv8_2_mbox_loc_perm" 1492 | permute_param { 1493 | order: 0 1494 | order: 2 1495 | order: 3 1496 | order: 1 1497 | } 1498 | } 1499 | layer { 1500 | name: "conv8_2_mbox_loc_flat" 1501 | type: "Flatten" 1502 | bottom: "conv8_2_mbox_loc_perm" 1503 | top: "conv8_2_mbox_loc_flat" 1504 | flatten_param { 1505 | axis: 1 1506 | } 1507 | } 1508 | layer { 1509 | name: "conv8_2_mbox_conf" 1510 | type: "Convolution" 1511 | bottom: "conv8_2_h" 1512 | top: "conv8_2_mbox_conf" 1513 | param { 1514 | lr_mult: 1 1515 | decay_mult: 1 1516 | } 1517 | param { 1518 | lr_mult: 2 1519 | decay_mult: 0 1520 | } 1521 | convolution_param { 1522 | num_output: 8 # 84 1523 | pad: 1 1524 | kernel_size: 3 1525 | stride: 1 1526 | weight_filler { 1527 | type: "xavier" 1528 | } 1529 | bias_filler { 1530 | type: "constant" 1531 | value: 0 1532 | } 1533 | } 1534 | } 1535 | layer { 1536 | name: "conv8_2_mbox_conf_perm" 1537 | type: "Permute" 1538 | bottom: "conv8_2_mbox_conf" 1539 | top: "conv8_2_mbox_conf_perm" 1540 | permute_param { 1541 | order: 0 1542 | order: 2 1543 | order: 3 1544 | order: 1 1545 | } 1546 | } 1547 | layer { 1548 | name: "conv8_2_mbox_conf_flat" 1549 | type: "Flatten" 1550 | bottom: "conv8_2_mbox_conf_perm" 1551 | top: "conv8_2_mbox_conf_flat" 1552 | flatten_param { 1553 | axis: 1 1554 | } 1555 | } 1556 | layer { 1557 | name: "conv8_2_mbox_priorbox" 1558 | type: "PriorBox" 1559 | bottom: "conv8_2_h" 1560 | bottom: "data" 1561 | top: "conv8_2_mbox_priorbox" 1562 | prior_box_param { 1563 | min_size: 213.0 1564 | max_size: 264.0 1565 | aspect_ratio: 2 1566 | flip: true 1567 | clip: false 1568 | variance: 0.1 1569 | variance: 0.1 1570 | variance: 0.2 1571 | variance: 0.2 1572 | step: 100 1573 | offset: 0.5 1574 | } 1575 | } 1576 | layer { 1577 | name: "conv9_2_mbox_loc" 1578 | type: "Convolution" 1579 | bottom: "conv9_2_h" 1580 | top: "conv9_2_mbox_loc" 1581 | param { 1582 | lr_mult: 1 1583 | decay_mult: 1 1584 | } 1585 | param { 1586 | lr_mult: 2 1587 | decay_mult: 0 1588 | } 1589 | convolution_param { 1590 | num_output: 16 1591 | pad: 1 1592 | kernel_size: 3 1593 | stride: 1 1594 | weight_filler { 1595 | type: "xavier" 1596 | } 1597 | bias_filler { 1598 | type: "constant" 1599 | value: 0 1600 | } 1601 | } 1602 | } 1603 | layer { 1604 | name: "conv9_2_mbox_loc_perm" 1605 | type: "Permute" 1606 | bottom: "conv9_2_mbox_loc" 1607 | top: "conv9_2_mbox_loc_perm" 1608 | permute_param { 1609 | order: 0 1610 | order: 2 1611 | order: 3 1612 | order: 1 1613 | } 1614 | } 1615 | layer { 1616 | name: "conv9_2_mbox_loc_flat" 1617 | type: "Flatten" 1618 | bottom: "conv9_2_mbox_loc_perm" 1619 | top: "conv9_2_mbox_loc_flat" 1620 | flatten_param { 1621 | axis: 1 1622 | } 1623 | } 1624 | layer { 1625 | name: "conv9_2_mbox_conf" 1626 | type: "Convolution" 1627 | bottom: "conv9_2_h" 1628 | top: "conv9_2_mbox_conf" 1629 | param { 1630 | lr_mult: 1 1631 | decay_mult: 1 1632 | } 1633 | param { 1634 | lr_mult: 2 1635 | decay_mult: 0 1636 | } 1637 | convolution_param { 1638 | num_output: 8 # 84 1639 | pad: 1 1640 | kernel_size: 3 1641 | stride: 1 1642 | weight_filler { 1643 | type: "xavier" 1644 | } 1645 | bias_filler { 1646 | type: "constant" 1647 | value: 0 1648 | } 1649 | } 1650 | } 1651 | layer { 1652 | name: "conv9_2_mbox_conf_perm" 1653 | type: "Permute" 1654 | bottom: "conv9_2_mbox_conf" 1655 | top: "conv9_2_mbox_conf_perm" 1656 | permute_param { 1657 | order: 0 
1658 | order: 2 1659 | order: 3 1660 | order: 1 1661 | } 1662 | } 1663 | layer { 1664 | name: "conv9_2_mbox_conf_flat" 1665 | type: "Flatten" 1666 | bottom: "conv9_2_mbox_conf_perm" 1667 | top: "conv9_2_mbox_conf_flat" 1668 | flatten_param { 1669 | axis: 1 1670 | } 1671 | } 1672 | layer { 1673 | name: "conv9_2_mbox_priorbox" 1674 | type: "PriorBox" 1675 | bottom: "conv9_2_h" 1676 | bottom: "data" 1677 | top: "conv9_2_mbox_priorbox" 1678 | prior_box_param { 1679 | min_size: 264.0 1680 | max_size: 315.0 1681 | aspect_ratio: 2 1682 | flip: true 1683 | clip: false 1684 | variance: 0.1 1685 | variance: 0.1 1686 | variance: 0.2 1687 | variance: 0.2 1688 | step: 300 1689 | offset: 0.5 1690 | } 1691 | } 1692 | layer { 1693 | name: "mbox_loc" 1694 | type: "Concat" 1695 | bottom: "conv4_3_norm_mbox_loc_flat" 1696 | bottom: "fc7_mbox_loc_flat" 1697 | bottom: "conv6_2_mbox_loc_flat" 1698 | bottom: "conv7_2_mbox_loc_flat" 1699 | bottom: "conv8_2_mbox_loc_flat" 1700 | bottom: "conv9_2_mbox_loc_flat" 1701 | top: "mbox_loc" 1702 | concat_param { 1703 | axis: 1 1704 | } 1705 | } 1706 | layer { 1707 | name: "mbox_conf" 1708 | type: "Concat" 1709 | bottom: "conv4_3_norm_mbox_conf_flat" 1710 | bottom: "fc7_mbox_conf_flat" 1711 | bottom: "conv6_2_mbox_conf_flat" 1712 | bottom: "conv7_2_mbox_conf_flat" 1713 | bottom: "conv8_2_mbox_conf_flat" 1714 | bottom: "conv9_2_mbox_conf_flat" 1715 | top: "mbox_conf" 1716 | concat_param { 1717 | axis: 1 1718 | } 1719 | } 1720 | layer { 1721 | name: "mbox_priorbox" 1722 | type: "Concat" 1723 | bottom: "conv4_3_norm_mbox_priorbox" 1724 | bottom: "fc7_mbox_priorbox" 1725 | bottom: "conv6_2_mbox_priorbox" 1726 | bottom: "conv7_2_mbox_priorbox" 1727 | bottom: "conv8_2_mbox_priorbox" 1728 | bottom: "conv9_2_mbox_priorbox" 1729 | top: "mbox_priorbox" 1730 | concat_param { 1731 | axis: 2 1732 | } 1733 | } 1734 | 1735 | layer { 1736 | name: "mbox_conf_reshape" 1737 | type: "Reshape" 1738 | bottom: "mbox_conf" 1739 | top: "mbox_conf_reshape" 1740 | reshape_param { 1741 | shape { 1742 | dim: 0 1743 | dim: -1 1744 | dim: 2 1745 | } 1746 | } 1747 | } 1748 | layer { 1749 | name: "mbox_conf_softmax" 1750 | type: "Softmax" 1751 | bottom: "mbox_conf_reshape" 1752 | top: "mbox_conf_softmax" 1753 | softmax_param { 1754 | axis: 2 1755 | } 1756 | } 1757 | layer { 1758 | name: "mbox_conf_flatten" 1759 | type: "Flatten" 1760 | bottom: "mbox_conf_softmax" 1761 | top: "mbox_conf_flatten" 1762 | flatten_param { 1763 | axis: 1 1764 | } 1765 | } 1766 | 1767 | layer { 1768 | name: "detection_out" 1769 | type: "DetectionOutput" 1770 | bottom: "mbox_loc" 1771 | bottom: "mbox_conf_flatten" 1772 | bottom: "mbox_priorbox" 1773 | top: "detection_out" 1774 | include { 1775 | phase: TEST 1776 | } 1777 | detection_output_param { 1778 | num_classes: 2 1779 | share_location: true 1780 | background_label_id: 0 1781 | nms_param { 1782 | nms_threshold: 0.45 1783 | top_k: 400 1784 | } 1785 | code_type: CENTER_SIZE 1786 | keep_top_k: 200 1787 | confidence_threshold: 0.01 1788 | } 1789 | } 1790 | -------------------------------------------------------------------------------- /data/models/res10_300x300_ssd_iter_140000.caffemodel: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LingDong-/fast-many-face-detection-with-cpp-or-openframeworks-on-mac-using-neural-networks/2b729cfe3219aca28b99cb5e1b059235347969a0/data/models/res10_300x300_ssd_iter_140000.caffemodel -------------------------------------------------------------------------------- 
/noof/caffe_face_det.h: -------------------------------------------------------------------------------- 1 | // caffe face detector 2 | #pragma once 3 | 4 | class a_det{public: 5 | float left; 6 | float top; 7 | float right; 8 | float bottom; 9 | float certainty; 10 | }; 11 | 12 | class caffe_face_det{public: 13 | cv::dnn::Net net; 14 | cv::Mat blob; 15 | cv::Mat net_out; 16 | std::string model_location = "../data/models"; 17 | void setup(){ 18 | net = cv::dnn::readNetFromCaffe(model_location+"/deploy.prototxt", 19 | model_location+"/res10_300x300_ssd_iter_140000.caffemodel"); 20 | } 21 | void forward(cv::Mat mat){ 22 | cv::resize(mat,mat,cv::Size(300,300)); 23 | blob = cv::dnn::blobFromImage(mat,1.0,cv::Size(300,300),cv::Scalar(104.0, 177.0, 123.0),false,false); // scale 1.0 and BGR mean (104,177,123) are the magic numbers from the pyimagesearch post 24 | net.setInput(blob); 25 | net_out = net.forward(); 26 | } 27 | 28 | std::vector<a_det> detect(cv::Mat mat, float thresh=0.15){ 29 | forward(mat); 30 | std::vector<a_det> detections; 31 | 32 | for(int i = 0; i < net_out.size[2]; i++) { 33 | cv::Vec<float,7> a = net_out.at<cv::Vec<float,7>>(0,0,i); // each row: [batchId, classId, confidence, left, top, right, bottom], coords normalized to 0-1 34 | float cert = a[2]; 35 | if (cert >= thresh){ 36 | a_det d; 37 | d.certainty = cert; 38 | d.left = a[3] * mat.cols; 39 | d.top = a[4] * mat.rows; 40 | d.right = a[5] * mat.cols; 41 | d.bottom = a[6] * mat.rows; 42 | detections.push_back(d); 43 | } 44 | } 45 | //cout << detections.size() << "/" << net_out.size[2] << endl; 46 | return detections; 47 | } 48 | 49 | }; 50 | -------------------------------------------------------------------------------- /noof/main.cpp: -------------------------------------------------------------------------------- 1 | // c++ main.cpp -lopencv_videostab -lopencv_photo -lopencv_objdetect -lopencv_video -lopencv_ml -lopencv_calib3d -lopencv_features2d -lopencv_highgui -lopencv_flann -lopencv_imgproc -lopencv_dnn -lopencv_imgcodecs -lopencv_core -lopencv_videoio -I/usr/local/Cellar/opencv/3.4.1_5/include -std=c++11 2 | // c++ main.cpp -lopencv_highgui -lopencv_imgproc -lopencv_dnn -lopencv_core -lopencv_videoio -I/usr/local/Cellar/opencv/3.4.1_5/include -std=c++11 3 | 4 | #include <opencv2/opencv.hpp> 5 | 6 | #include <iostream> 7 | #include <string> 8 | #include <vector> 9 | #include "caffe_face_det.h" 10 | 11 | 12 | using namespace std; 13 | 14 | int main() { 15 | cv::VideoCapture cap(0); // open the default webcam 16 | cout << "hello there." << endl; 17 | 18 | if(!cap.isOpened()){ 19 | return -1; 20 | } 21 | cv::namedWindow("frame",1); 22 | caffe_face_det detector; 23 | detector.setup(); 24 | 25 | for(;;){ 26 | cv::Mat frame; 27 | cap >> frame; 28 | vector<a_det> detections = detector.detect(frame); 29 | for (int i = 0; i < detections.size(); i++){ 30 | a_det det = detections[i]; 31 | cv::rectangle(frame, cv::Point(det.left,det.top), cv::Point(det.right,det.bottom), cv::Scalar(0,255,255),3); 32 | } 33 | cv::imshow("frame", frame); 34 | if(cv::waitKey(30) >= 0) break; // quit on any keypress 35 | } 36 | return 0; 37 | } -------------------------------------------------------------------------------- /of/config.make: -------------------------------------------------------------------------------- 1 | ################################################################################ 2 | # CONFIGURE PROJECT MAKEFILE (optional) 3 | # This file is where we make project specific configurations. 4 | ################################################################################ 5 | 6 | ################################################################################ 7 | # PROJECT LINKER FLAGS 8 | # These flags will be sent to the linker when compiling the executable. 
9 | # 10 | # (default) PROJECT_LDFLAGS = -Wl,-rpath=./libs 11 | # 12 | # Note: Leave a leading space when adding list items with the += operator 13 | ################################################################################ 14 | 15 | # Currently, shared libraries that are needed are copied to the 16 | # $(PROJECT_ROOT)/bin/libs directory. The following LDFLAGS tell the linker to 17 | # add a runtime path to search for those shared libraries, since they aren't 18 | # incorporated directly into the final executable application binary. 19 | # TODO: should this be a default setting? 20 | # PROJECT_LDFLAGS=-Wl,-rpath=./libs 21 | 22 | 23 | PROJECT_LDFLAGS += -lopencv_videostab -lopencv_photo -lopencv_objdetect -lopencv_video -lopencv_ml -lopencv_calib3d -lopencv_features2d -lopencv_highgui -lopencv_flann -lopencv_imgproc -lopencv_dnn -lopencv_imgcodecs -lopencv_core 24 | PROJECT_LDFLAGS += -I/usr/local/Cellar/opencv/3.4.1_5/include 25 | 26 | ################################################################################ 27 | # PROJECT CFLAGS 28 | # This is a list of fully qualified CFLAGS required when compiling for this 29 | # project. These CFLAGS will be used IN ADDITION TO the PLATFORM_CFLAGS 30 | # defined in your platform specific core configuration files. These flags are 31 | # presented to the compiler BEFORE the PROJECT_OPTIMIZATION_CFLAGS below. 32 | # 33 | # (default) PROJECT_CFLAGS = (blank) 34 | # 35 | # Note: Before adding PROJECT_CFLAGS, note that the PLATFORM_CFLAGS defined in 36 | # your platform specific configuration file will be applied by default and 37 | # further flags here may not be needed. 38 | # 39 | # Note: Leave a leading space when adding list items with the += operator 40 | ################################################################################ 41 | # PROJECT_CFLAGS = 42 | -------------------------------------------------------------------------------- /of/src/caffe_face_det.h: -------------------------------------------------------------------------------- 1 | // caffe face detector 2 | #pragma once 3 | 4 | class a_det{public: 5 | float left; 6 | float top; 7 | float right; 8 | float bottom; 9 | float certainty; 10 | }; 11 | 12 | class caffe_face_det{public: 13 | cv::dnn::Net net; 14 | cv::Mat blob; 15 | cv::Mat net_out; 16 | std::string model_location = "../../../data/models"; 17 | void setup(){ 18 | net = cv::dnn::readNetFromCaffe(model_location+"/deploy.prototxt", 19 | model_location+"/res10_300x300_ssd_iter_140000.caffemodel"); 20 | } 21 | void forward(cv::Mat mat){ 22 | cv::resize(mat,mat,cv::Size(300,300)); 23 | blob = cv::dnn::blobFromImage(mat,1.0,cv::Size(300,300),cv::Scalar(104.0, 177.0, 123.0),false,false); // scale 1.0 and BGR mean (104,177,123) are the magic numbers from the pyimagesearch post 24 | net.setInput(blob); 25 | net_out = net.forward(); 26 | } 27 | 28 | std::vector<a_det> detect(cv::Mat mat, float thresh=0.1){ 29 | forward(mat); 30 | std::vector<a_det> detections; 31 | for(int i = 0; i < net_out.size[2]; i++) { 32 | cv::Vec<float,7> a = net_out.at<cv::Vec<float,7>>(0,0,i); // each row: [batchId, classId, confidence, left, top, right, bottom], coords normalized to 0-1 33 | float cert = a[2]; 34 | if (cert >= thresh){ 35 | a_det d; 36 | d.certainty = cert; 37 | d.left = a[3] * mat.cols; 38 | d.top = a[4] * mat.rows; 39 | d.right = a[5] * mat.cols; 40 | d.bottom = a[6] * mat.rows; 41 | detections.push_back(d); 42 | } 43 | } 44 | //cout << detections.size() << "/" << net_out.size[2] << endl; 45 | return detections; 46 | } 47 | 48 | }; 49 | -------------------------------------------------------------------------------- /of/src/cv_cvt.h: -------------------------------------------------------------------------------- 1 | // adapted from ofxCv 2 | 
#pragma once 3 | namespace cv_cvt{ 4 | inline int pix_depth(ofPixels pixels) { 5 | switch(pixels.getBytesPerChannel()) { 6 | case 4: return CV_32F; 7 | case 2: return CV_16U; 8 | case 1: default: return CV_8U; 9 | } 10 | } 11 | inline int pix_cv_type(ofPixels pix){ 12 | return CV_MAKETYPE(pix_depth(pix),pix.getNumChannels()); 13 | } 14 | inline cv::Mat pix2mat(ofPixels pix){ 15 | return cv::Mat(pix.getHeight(), pix.getWidth(), pix_cv_type(pix), pix.getData(), 0); 16 | } 17 | inline ofPixels mat2pix(cv::Mat mat){ 18 | ofPixels pix; 19 | pix.setFromExternalPixels(mat.ptr(),mat.cols,mat.rows,mat.channels()); 20 | return pix; 21 | } 22 | } 23 | -------------------------------------------------------------------------------- /of/src/main.cpp: -------------------------------------------------------------------------------- 1 | #include "ofMain.h" 2 | #include "ofApp.h" 3 | 4 | //======================================================================== 5 | int main( ){ 6 | ofSetupOpenGL(1024,768,OF_WINDOW); // <-------- setup the GL context 7 | 8 | // this kicks off the running of my app 9 | // can be OF_WINDOW or OF_FULLSCREEN 10 | // pass in width and height too: 11 | ofRunApp(new ofApp()); 12 | 13 | } 14 | -------------------------------------------------------------------------------- /of/src/ofApp.cpp: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LingDong-/fast-many-face-detection-with-cpp-or-openframeworks-on-mac-using-neural-networks/2b729cfe3219aca28b99cb5e1b059235347969a0/of/src/ofApp.cpp -------------------------------------------------------------------------------- /of/src/ofApp.h: -------------------------------------------------------------------------------- 1 | #pragma once 2 | 3 | #include "ofMain.h" 4 | #include <opencv2/opencv.hpp> 5 | #include <iostream> 6 | #include <vector> 7 | #include "cv_cvt.h" 8 | #include "caffe_face_det.h" 9 | 10 | using namespace std; 11 | 12 | class ofApp : public ofBaseApp{public: 13 | 14 | ofVideoGrabber grabber; 15 | caffe_face_det detector; 16 | 17 | void setup(){ 18 | 19 | cout << "OpenCV version : " << CV_VERSION << endl; 20 | grabber.setup(640,480); 21 | detector.setup(); 22 | } 23 | void update(){ 24 | grabber.update(); 25 | 26 | } 27 | void draw(){ 28 | ofSetColor(255); 29 | grabber.draw(0,0); 30 | ofPixels pix = grabber.getPixels(); 31 | cv::Mat mat = cv_cvt::pix2mat(pix); 32 | cv::cvtColor(mat,mat,CV_RGB2BGR); // OF pixels are RGB; the caffe model expects BGR 33 | // cv::imwrite( "../../../img.jpg", mat); 34 | 35 | vector<a_det> detections = detector.detect(mat); 36 | for (int i = 0; i < detections.size(); i++){ 37 | ofPushStyle(); 38 | ofSetColor(255,0,0); 39 | ofNoFill(); 40 | ofSetLineWidth(5); 41 | ofDrawRectangle(detections[i].left, detections[i].top, detections[i].right-detections[i].left, detections[i].bottom-detections[i].top); 42 | ofPopStyle(); 43 | 44 | } 45 | } 46 | 47 | }; 48 | -------------------------------------------------------------------------------- /screenshots/.DS_Store: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LingDong-/fast-many-face-detection-with-cpp-or-openframeworks-on-mac-using-neural-networks/2b729cfe3219aca28b99cb5e1b059235347969a0/screenshots/.DS_Store -------------------------------------------------------------------------------- /screenshots/cflags.jpg: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/LingDong-/fast-many-face-detection-with-cpp-or-openframeworks-on-mac-using-neural-networks/2b729cfe3219aca28b99cb5e1b059235347969a0/screenshots/cflags.jpg -------------------------------------------------------------------------------- /screenshots/demo1.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LingDong-/fast-many-face-detection-with-cpp-or-openframeworks-on-mac-using-neural-networks/2b729cfe3219aca28b99cb5e1b059235347969a0/screenshots/demo1.jpg -------------------------------------------------------------------------------- /screenshots/ldflags.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LingDong-/fast-many-face-detection-with-cpp-or-openframeworks-on-mac-using-neural-networks/2b729cfe3219aca28b99cb5e1b059235347969a0/screenshots/ldflags.jpg -------------------------------------------------------------------------------- /screenshots/xcode1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LingDong-/fast-many-face-detection-with-cpp-or-openframeworks-on-mac-using-neural-networks/2b729cfe3219aca28b99cb5e1b059235347969a0/screenshots/xcode1.png -------------------------------------------------------------------------------- /screenshots/xcode2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LingDong-/fast-many-face-detection-with-cpp-or-openframeworks-on-mac-using-neural-networks/2b729cfe3219aca28b99cb5e1b059235347969a0/screenshots/xcode2.png -------------------------------------------------------------------------------- /screenshots/xcode3.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LingDong-/fast-many-face-detection-with-cpp-or-openframeworks-on-mac-using-neural-networks/2b729cfe3219aca28b99cb5e1b059235347969a0/screenshots/xcode3.png -------------------------------------------------------------------------------- /screenshots/xcode4.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LingDong-/fast-many-face-detection-with-cpp-or-openframeworks-on-mac-using-neural-networks/2b729cfe3219aca28b99cb5e1b059235347969a0/screenshots/xcode4.png -------------------------------------------------------------------------------- /screenshots/xcode5.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LingDong-/fast-many-face-detection-with-cpp-or-openframeworks-on-mac-using-neural-networks/2b729cfe3219aca28b99cb5e1b059235347969a0/screenshots/xcode5.png -------------------------------------------------------------------------------- /screenshots/xcode6.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LingDong-/fast-many-face-detection-with-cpp-or-openframeworks-on-mac-using-neural-networks/2b729cfe3219aca28b99cb5e1b059235347969a0/screenshots/xcode6.png --------------------------------------------------------------------------------