├── Face_1.jpg
├── Face_2.jpg
├── Face_3.jpg
├── Face_Mask_Video.mp4
├── LICENSE
├── MaskUltra.cbp
├── README.md
├── RFB-320.bin
├── RFB-320.param
├── UltraFace.cpp
├── UltraFace.hpp
├── mask_detector_opt2.nb
├── mask_ultra.cpp
├── slim_320.bin
└── slim_320.param
/Face_1.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Qengineering/Face-Mask-Detection-Jetson-Nano/9625ddf54561f345985858261d020b1ee13a40bf/Face_1.jpg
--------------------------------------------------------------------------------
/Face_2.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Qengineering/Face-Mask-Detection-Jetson-Nano/9625ddf54561f345985858261d020b1ee13a40bf/Face_2.jpg
--------------------------------------------------------------------------------
/Face_3.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Qengineering/Face-Mask-Detection-Jetson-Nano/9625ddf54561f345985858261d020b1ee13a40bf/Face_3.jpg
--------------------------------------------------------------------------------
/Face_Mask_Video.mp4:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Qengineering/Face-Mask-Detection-Jetson-Nano/9625ddf54561f345985858261d020b1ee13a40bf/Face_Mask_Video.mp4
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | BSD 3-Clause License
2 |
3 | Copyright (c) 2020, Q-engineering
4 | All rights reserved.
5 |
6 | Redistribution and use in source and binary forms, with or without
7 | modification, are permitted provided that the following conditions are met:
8 |
9 | 1. Redistributions of source code must retain the above copyright notice, this
10 | list of conditions and the following disclaimer.
11 |
12 | 2. Redistributions in binary form must reproduce the above copyright notice,
13 | this list of conditions and the following disclaimer in the documentation
14 | and/or other materials provided with the distribution.
15 |
16 | 3. Neither the name of the copyright holder nor the names of its
17 | contributors may be used to endorse or promote products derived from
18 | this software without specific prior written permission.
19 |
20 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
21 | AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
22 | IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
23 | DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
24 | FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
25 | DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
26 | SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
27 | CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
28 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
29 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
30 |
--------------------------------------------------------------------------------
/MaskUltra.cbp:
--------------------------------------------------------------------------------
(Code::Blocks project file; its XML content was not captured in this text dump.)
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Face-Mask-Detection-Jetson-Nano
2 | 
3 |
4 | ## A fast face mask recognition running at 24-45 FPS on a Jetson Nano.
5 | [License: BSD 3-Clause](https://opensource.org/licenses/BSD-3-Clause)
6 | This is a fast C++ implementation of two deep learning models found in the public domain.
7 | The first is the face detector by Linzaer, running on the ncnn framework.
8 | https://github.com/Linzaer/Ultra-Light-Fast-Generic-Face-Detector-1MB.
9 | The second is the Paddle Lite mask detector, which classifies the detected faces.
10 | https://github.com/PaddlePaddle/Paddle-Lite/tree/develop/lite/demo/cxx/mask_detection.
11 | The frame rate depends on the number of detected faces and can be calculated as follows:
12 | FPS = 1.0/(0.022 + 0.008 x #Faces) when overclocked to 2014 MHz.
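For example, with two faces in view this gives 1.0/(0.022 + 0.008 x 2) ≈ 26 FPS, while with no faces in view the face detector alone runs at roughly 45 FPS.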
13 | Paper: https://arxiv.org/abs/1905.00641.pdf
14 | Size: 320x320
15 | Specially made for a Jetson Nano; see [Q-engineering deep learning examples](https://qengineering.eu/deep-learning-examples-on-raspberry-32-64-os.html)
16 | ### New version 2.0.
17 | A new and superior version, using only __TensorFlow Lite__, is available for the Jetson Nano; see [GitHub](https://github.com/Qengineering/TensorFlow_Lite_Face_Mask_Jetson-Nano)
18 | ## Dependencies.
19 | ### April 4 2021: Adapted for ncnn version 20210322 or later
20 | To run the application, you need to have:
21 | - The Paddle Lite framework installed. [Install Paddle](https://qengineering.eu/install-paddle-on-jetson-nano.html)
22 | - The Tencent ncnn framework installed. [Install ncnn](https://qengineering.eu/install-ncnn-on-jetson-nano.html)
23 | - Code::Blocks installed. (```$ sudo apt-get install codeblocks```)
24 | ## Running the app.
25 | To extract and run the network in Code::Blocks:
26 | $ mkdir *MyDir*
27 | $ cd *MyDir*
28 | $ wget https://github.com/Qengineering/Face-Mask-Detection-Jetson-Nano/archive/refs/heads/main.zip
29 | $ unzip -j main.zip
30 | Remove main.zip and README.md as they are no longer needed.
31 | $ rm main.zip
32 | $ rm README.md
33 | Your *MyDir* folder must now look like this:
34 | Face_1.jpg
35 | Face_2.jpg
36 | Face_3.jpg
37 | Face_Mask_Video.mp4
38 | mask_detector_opt2.nb
39 | MaskUltra.cbp
40 | mask_ultra.cpp
41 | UltraFace.cpp
42 | UltraFace.hpp
43 | RFB-320.bin
44 | RFB-320.param
45 | slim_320.bin
46 | slim_320.param
47 | ### Notes.
48 | The directories in the Code::Blocks project file will probably need to be adapted to the naming convention you are using.
49 | The camera input used is a simple OpenCV webcam. GStreamer is not used in this example for simplicity.
50 | The RFB-320 model recognizes slightly more faces than slim_320 at the expense of a little bit of speed. It is up to you.
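Switching between the two detectors is a one-line change in mask_ultra.cpp, where the slim_320 call is already present as a comment:
```cpp
// UltraFace ultraface("slim_320.bin","slim_320.param", 320, 240, 2, 0.7); // slim_320: slightly faster
UltraFace ultraface("RFB-320.bin","RFB-320.param", 320, 240, 2, 0.7);      // RFB-320: finds slightly more faces
```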
51 | Note that the compilation of the Paddle Lite framework in your application may take a while.
52 | See the Raspberry Pi video at https://youtu.be/LDPXgJv3wAk
53 |
--------------------------------------------------------------------------------
/RFB-320.bin:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Qengineering/Face-Mask-Detection-Jetson-Nano/9625ddf54561f345985858261d020b1ee13a40bf/RFB-320.bin
--------------------------------------------------------------------------------
/RFB-320.param:
--------------------------------------------------------------------------------
1 | 7767517
2 | 116 126
3 | Input input 0 1 input
4 | Convolution 245 1 1 input 245 0=16 1=3 11=3 2=1 12=1 3=2 13=2 4=1 14=1 15=1 16=1 5=1 6=432
5 | ReLU 247 1 1 245 247
6 | ConvolutionDepthWise 248 1 1 247 248 0=16 1=3 11=3 2=1 12=1 3=1 13=1 4=1 14=1 15=1 16=1 5=1 6=144 7=16
7 | ReLU 250 1 1 248 250
8 | Convolution 251 1 1 250 251 0=32 1=1 11=1 2=1 12=1 3=1 13=1 4=0 14=0 15=0 16=0 5=1 6=512
9 | ReLU 253 1 1 251 253
10 | ConvolutionDepthWise 254 1 1 253 254 0=32 1=3 11=3 2=1 12=1 3=2 13=2 4=1 14=1 15=1 16=1 5=1 6=288 7=32
11 | ReLU 256 1 1 254 256
12 | Convolution 257 1 1 256 257 0=32 1=1 11=1 2=1 12=1 3=1 13=1 4=0 14=0 15=0 16=0 5=1 6=1024
13 | ReLU 259 1 1 257 259
14 | ConvolutionDepthWise 260 1 1 259 260 0=32 1=3 11=3 2=1 12=1 3=1 13=1 4=1 14=1 15=1 16=1 5=1 6=288 7=32
15 | ReLU 262 1 1 260 262
16 | Convolution 263 1 1 262 263 0=32 1=1 11=1 2=1 12=1 3=1 13=1 4=0 14=0 15=0 16=0 5=1 6=1024
17 | ReLU 265 1 1 263 265
18 | ConvolutionDepthWise 266 1 1 265 266 0=32 1=3 11=3 2=1 12=1 3=2 13=2 4=1 14=1 15=1 16=1 5=1 6=288 7=32
19 | ReLU 268 1 1 266 268
20 | Convolution 269 1 1 268 269 0=64 1=1 11=1 2=1 12=1 3=1 13=1 4=0 14=0 15=0 16=0 5=1 6=2048
21 | ReLU 271 1 1 269 271
22 | ConvolutionDepthWise 272 1 1 271 272 0=64 1=3 11=3 2=1 12=1 3=1 13=1 4=1 14=1 15=1 16=1 5=1 6=576 7=64
23 | ReLU 274 1 1 272 274
24 | Convolution 275 1 1 274 275 0=64 1=1 11=1 2=1 12=1 3=1 13=1 4=0 14=0 15=0 16=0 5=1 6=4096
25 | ReLU 277 1 1 275 277
26 | ConvolutionDepthWise 278 1 1 277 278 0=64 1=3 11=3 2=1 12=1 3=1 13=1 4=1 14=1 15=1 16=1 5=1 6=576 7=64
27 | ReLU 280 1 1 278 280
28 | Convolution 281 1 1 280 281 0=64 1=1 11=1 2=1 12=1 3=1 13=1 4=0 14=0 15=0 16=0 5=1 6=4096
29 | ReLU 283 1 1 281 283
30 | Split splitncnn_0 1 4 283 283_splitncnn_0 283_splitncnn_1 283_splitncnn_2 283_splitncnn_3
31 | Convolution 284 1 1 283_splitncnn_3 284 0=8 1=1 11=1 2=1 12=1 3=1 13=1 4=0 14=0 15=0 16=0 5=1 6=512
32 | Convolution 286 1 1 284 286 0=16 1=3 11=3 2=1 12=1 3=1 13=1 4=1 14=1 15=1 16=1 5=1 6=1152
33 | ReLU 288 1 1 286 288
34 | Convolution 289 1 1 288 289 0=16 1=3 11=3 2=2 12=2 3=1 13=1 4=2 14=2 15=2 16=2 5=1 6=2304
35 | Convolution 291 1 1 283_splitncnn_2 291 0=8 1=1 11=1 2=1 12=1 3=1 13=1 4=0 14=0 15=0 16=0 5=1 6=512
36 | Convolution 293 1 1 291 293 0=16 1=3 11=3 2=1 12=1 3=1 13=1 4=1 14=1 15=1 16=1 5=1 6=1152
37 | ReLU 295 1 1 293 295
38 | Convolution 296 1 1 295 296 0=16 1=3 11=3 2=3 12=3 3=1 13=1 4=3 14=3 15=3 16=3 5=1 6=2304
39 | Convolution 298 1 1 283_splitncnn_1 298 0=8 1=1 11=1 2=1 12=1 3=1 13=1 4=0 14=0 15=0 16=0 5=1 6=512
40 | Convolution 300 1 1 298 300 0=12 1=3 11=3 2=1 12=1 3=1 13=1 4=1 14=1 15=1 16=1 5=1 6=864
41 | ReLU 302 1 1 300 302
42 | Convolution 303 1 1 302 303 0=16 1=3 11=3 2=1 12=1 3=1 13=1 4=1 14=1 15=1 16=1 5=1 6=1728
43 | ReLU 305 1 1 303 305
44 | Convolution 306 1 1 305 306 0=16 1=3 11=3 2=5 12=5 3=1 13=1 4=5 14=5 15=5 16=5 5=1 6=2304
45 | Concat 308 3 1 289 296 306 308 0=0
46 | Convolution 309 1 1 308 309 0=64 1=1 11=1 2=1 12=1 3=1 13=1 4=0 14=0 15=0 16=0 5=1 6=3072
47 | Convolution 311 1 1 283_splitncnn_0 311 0=64 1=1 11=1 2=1 12=1 3=1 13=1 4=0 14=0 15=0 16=0 5=1 6=4096
48 | BinaryOp 313 2 1 309 311 313 0=0
49 | ReLU 314 1 1 313 314
50 | Split splitncnn_1 1 3 314 314_splitncnn_0 314_splitncnn_1 314_splitncnn_2
51 | ConvolutionDepthWise 315 1 1 314_splitncnn_2 315 0=64 1=3 11=3 2=1 12=1 3=1 13=1 4=1 14=1 15=1 16=1 5=1 6=576 7=64
52 | ReLU 316 1 1 315 316
53 | Convolution 317 1 1 316 317 0=6 1=1 11=1 2=1 12=1 3=1 13=1 4=0 14=0 15=0 16=0 5=1 6=384
54 | Permute 318 1 1 317 318 0=3
55 | Reshape 328 1 1 318 328 0=2 1=-1
56 | ConvolutionDepthWise 329 1 1 314_splitncnn_1 329 0=64 1=3 11=3 2=1 12=1 3=1 13=1 4=1 14=1 15=1 16=1 5=1 6=576 7=64
57 | ReLU 330 1 1 329 330
58 | Convolution 331 1 1 330 331 0=12 1=1 11=1 2=1 12=1 3=1 13=1 4=0 14=0 15=0 16=0 5=1 6=768
59 | Permute 332 1 1 331 332 0=3
60 | Reshape 342 1 1 332 342 0=4 1=-1
61 | ConvolutionDepthWise 343 1 1 314_splitncnn_0 343 0=64 1=3 11=3 2=1 12=1 3=2 13=2 4=1 14=1 15=1 16=1 5=1 6=576 7=64
62 | ReLU 345 1 1 343 345
63 | Convolution 346 1 1 345 346 0=128 1=1 11=1 2=1 12=1 3=1 13=1 4=0 14=0 15=0 16=0 5=1 6=8192
64 | ReLU 348 1 1 346 348
65 | ConvolutionDepthWise 349 1 1 348 349 0=128 1=3 11=3 2=1 12=1 3=1 13=1 4=1 14=1 15=1 16=1 5=1 6=1152 7=128
66 | ReLU 351 1 1 349 351
67 | Convolution 352 1 1 351 352 0=128 1=1 11=1 2=1 12=1 3=1 13=1 4=0 14=0 15=0 16=0 5=1 6=16384
68 | ReLU 354 1 1 352 354
69 | ConvolutionDepthWise 355 1 1 354 355 0=128 1=3 11=3 2=1 12=1 3=1 13=1 4=1 14=1 15=1 16=1 5=1 6=1152 7=128
70 | ReLU 357 1 1 355 357
71 | Convolution 358 1 1 357 358 0=128 1=1 11=1 2=1 12=1 3=1 13=1 4=0 14=0 15=0 16=0 5=1 6=16384
72 | ReLU 360 1 1 358 360
73 | Split splitncnn_2 1 3 360 360_splitncnn_0 360_splitncnn_1 360_splitncnn_2
74 | ConvolutionDepthWise 361 1 1 360_splitncnn_2 361 0=128 1=3 11=3 2=1 12=1 3=1 13=1 4=1 14=1 15=1 16=1 5=1 6=1152 7=128
75 | ReLU 362 1 1 361 362
76 | Convolution 363 1 1 362 363 0=4 1=1 11=1 2=1 12=1 3=1 13=1 4=0 14=0 15=0 16=0 5=1 6=512
77 | Permute 364 1 1 363 364 0=3
78 | Reshape 374 1 1 364 374 0=2 1=-1
79 | ConvolutionDepthWise 375 1 1 360_splitncnn_1 375 0=128 1=3 11=3 2=1 12=1 3=1 13=1 4=1 14=1 15=1 16=1 5=1 6=1152 7=128
80 | ReLU 376 1 1 375 376
81 | Convolution 377 1 1 376 377 0=8 1=1 11=1 2=1 12=1 3=1 13=1 4=0 14=0 15=0 16=0 5=1 6=1024
82 | Permute 378 1 1 377 378 0=3
83 | Reshape 388 1 1 378 388 0=4 1=-1
84 | ConvolutionDepthWise 389 1 1 360_splitncnn_0 389 0=128 1=3 11=3 2=1 12=1 3=2 13=2 4=1 14=1 15=1 16=1 5=1 6=1152 7=128
85 | ReLU 391 1 1 389 391
86 | Convolution 392 1 1 391 392 0=256 1=1 11=1 2=1 12=1 3=1 13=1 4=0 14=0 15=0 16=0 5=1 6=32768
87 | ReLU 394 1 1 392 394
88 | ConvolutionDepthWise 395 1 1 394 395 0=256 1=3 11=3 2=1 12=1 3=1 13=1 4=1 14=1 15=1 16=1 5=1 6=2304 7=256
89 | ReLU 397 1 1 395 397
90 | Convolution 398 1 1 397 398 0=256 1=1 11=1 2=1 12=1 3=1 13=1 4=0 14=0 15=0 16=0 5=1 6=65536
91 | ReLU 400 1 1 398 400
92 | Split splitncnn_3 1 3 400 400_splitncnn_0 400_splitncnn_1 400_splitncnn_2
93 | ConvolutionDepthWise 401 1 1 400_splitncnn_2 401 0=256 1=3 11=3 2=1 12=1 3=1 13=1 4=1 14=1 15=1 16=1 5=1 6=2304 7=256
94 | ReLU 402 1 1 401 402
95 | Convolution 403 1 1 402 403 0=4 1=1 11=1 2=1 12=1 3=1 13=1 4=0 14=0 15=0 16=0 5=1 6=1024
96 | Permute 404 1 1 403 404 0=3
97 | Reshape 414 1 1 404 414 0=2 1=-1
98 | ConvolutionDepthWise 415 1 1 400_splitncnn_1 415 0=256 1=3 11=3 2=1 12=1 3=1 13=1 4=1 14=1 15=1 16=1 5=1 6=2304 7=256
99 | ReLU 416 1 1 415 416
100 | Convolution 417 1 1 416 417 0=8 1=1 11=1 2=1 12=1 3=1 13=1 4=0 14=0 15=0 16=0 5=1 6=2048
101 | Permute 418 1 1 417 418 0=3
102 | Reshape 428 1 1 418 428 0=4 1=-1
103 | Convolution 429 1 1 400_splitncnn_0 429 0=64 1=1 11=1 2=1 12=1 3=1 13=1 4=0 14=0 15=0 16=0 5=1 6=16384
104 | ReLU 430 1 1 429 430
105 | ConvolutionDepthWise 431 1 1 430 431 0=64 1=3 11=3 2=1 12=1 3=2 13=2 4=1 14=1 15=1 16=1 5=1 6=576 7=64
106 | ReLU 432 1 1 431 432
107 | Convolution 433 1 1 432 433 0=256 1=1 11=1 2=1 12=1 3=1 13=1 4=0 14=0 15=0 16=0 5=1 6=16384
108 | ReLU 434 1 1 433 434
109 | Split splitncnn_4 1 2 434 434_splitncnn_0 434_splitncnn_1
110 | Convolution 435 1 1 434_splitncnn_1 435 0=6 1=3 11=3 2=1 12=1 3=1 13=1 4=1 14=1 15=1 16=1 5=1 6=13824
111 | Permute 436 1 1 435 436 0=3
112 | Reshape 446 1 1 436 446 0=2 1=-1
113 | Convolution 447 1 1 434_splitncnn_0 447 0=12 1=3 11=3 2=1 12=1 3=1 13=1 4=1 14=1 15=1 16=1 5=1 6=27648
114 | Permute 448 1 1 447 448 0=3
115 | Reshape 458 1 1 448 458 0=4 1=-1
116 | Concat 459 4 1 328 374 414 446 459 0=0
117 | Concat boxes 4 1 342 388 428 458 boxes 0=0
118 | Softmax scores 1 1 459 scores 0=1 1=1
119 |
--------------------------------------------------------------------------------
/UltraFace.cpp:
--------------------------------------------------------------------------------
1 | //
2 | // UltraFace.cpp
3 | // UltraFaceTest
4 | //
5 | // Created by vealocia on 2019/10/17.
6 | // Copyright © 2019 vealocia. All rights reserved.
7 | //
8 |
9 | #define clip(x, y) (x < 0 ? 0 : (x > y ? y : x))
10 |
11 | #include "UltraFace.hpp"
12 | #include "mat.h"
13 |
14 | UltraFace::UltraFace(const std::string &bin_path, const std::string &param_path,
15 | int input_width, int input_length, int num_thread_,
16 | float score_threshold_, float iou_threshold_, int topk_) {
17 | num_thread = num_thread_;
18 | topk = topk_;
19 | score_threshold = score_threshold_;
20 | iou_threshold = iou_threshold_;
21 | in_w = input_width;
22 | in_h = input_length;
23 | w_h_list = {in_w, in_h};
24 |
25 | for (auto size : w_h_list) {
26 |         std::vector<float> fm_item;
27 | for (float stride : strides) {
28 | fm_item.push_back(ceil(size / stride));
29 | }
30 | featuremap_size.push_back(fm_item);
31 | }
32 |
33 | for (auto size : w_h_list) {
34 | shrinkage_size.push_back(strides);
35 | }
36 |
37 | /* generate prior anchors */
38 | for (int index = 0; index < num_featuremap; index++) {
39 | float scale_w = in_w / shrinkage_size[0][index];
40 | float scale_h = in_h / shrinkage_size[1][index];
41 | for (int j = 0; j < featuremap_size[1][index]; j++) {
42 | for (int i = 0; i < featuremap_size[0][index]; i++) {
43 | float x_center = (i + 0.5) / scale_w;
44 | float y_center = (j + 0.5) / scale_h;
45 |
46 | for (float k : min_boxes[index]) {
47 | float w = k / in_w;
48 | float h = k / in_h;
49 | priors.push_back({clip(x_center, 1), clip(y_center, 1), clip(w, 1), clip(h, 1)});
50 | }
51 | }
52 | }
53 | }
54 | num_anchors = priors.size();
55 | /* generate prior anchors finished */
56 |
57 | ultraface.load_param(param_path.data());
58 | ultraface.load_model(bin_path.data());
59 | }
60 |
61 | UltraFace::~UltraFace() { ultraface.clear(); }
62 |
63 | int UltraFace::detect(ncnn::Mat &img, std::vector<FaceInfo> &face_list) {
64 | if (img.empty()) {
65 |         std::cout << "image is empty, please check!" << std::endl;
66 | return -1;
67 | }
68 |
69 | image_h = img.h;
70 | image_w = img.w;
71 |
72 | ncnn::Mat in;
73 | ncnn::resize_bilinear(img, in, in_w, in_h);
74 | ncnn::Mat ncnn_img = in;
75 | ncnn_img.substract_mean_normalize(mean_vals, norm_vals);
76 |
77 |     std::vector<FaceInfo> bbox_collection;
78 |     std::vector<FaceInfo> valid_input;
79 |
80 | ncnn::Extractor ex = ultraface.create_extractor();
81 | ex.set_num_threads(num_thread);
82 | ex.input("input", ncnn_img);
83 |
84 | ncnn::Mat scores;
85 | ncnn::Mat boxes;
86 | ex.extract("scores", scores);
87 | ex.extract("boxes", boxes);
88 | generateBBox(bbox_collection, scores, boxes, score_threshold, num_anchors);
89 | nms(bbox_collection, face_list);
90 | return 0;
91 | }
92 |
93 | void UltraFace::generateBBox(std::vector<FaceInfo> &bbox_collection, ncnn::Mat scores, ncnn::Mat boxes, float score_threshold, int num_anchors) {
94 | for (int i = 0; i < num_anchors; i++) {
95 | if (scores.channel(0)[i * 2 + 1] > score_threshold) {
96 | FaceInfo rects;
97 | float x_center = boxes.channel(0)[i * 4] * center_variance * priors[i][2] + priors[i][0];
98 | float y_center = boxes.channel(0)[i * 4 + 1] * center_variance * priors[i][3] + priors[i][1];
99 | float w = exp(boxes.channel(0)[i * 4 + 2] * size_variance) * priors[i][2];
100 | float h = exp(boxes.channel(0)[i * 4 + 3] * size_variance) * priors[i][3];
101 |
102 | rects.x1 = clip(x_center - w / 2.0, 1) * image_w;
103 | rects.y1 = clip(y_center - h / 2.0, 1) * image_h;
104 | rects.x2 = clip(x_center + w / 2.0, 1) * image_w;
105 | rects.y2 = clip(y_center + h / 2.0, 1) * image_h;
106 | rects.score = clip(scores.channel(0)[i * 2 + 1], 1);
107 | bbox_collection.push_back(rects);
108 | }
109 | }
110 | }
111 |
112 | void UltraFace::nms(std::vector<FaceInfo> &input, std::vector<FaceInfo> &output, int type) {
113 | std::sort(input.begin(), input.end(), [](const FaceInfo &a, const FaceInfo &b) { return a.score > b.score; });
114 |
115 | int box_num = input.size();
116 |
117 |     std::vector<int> merged(box_num, 0);
118 |
119 | for (int i = 0; i < box_num; i++) {
120 | if (merged[i])
121 | continue;
122 |         std::vector<FaceInfo> buf;
123 |
124 | buf.push_back(input[i]);
125 | merged[i] = 1;
126 |
127 | float h0 = input[i].y2 - input[i].y1 + 1;
128 | float w0 = input[i].x2 - input[i].x1 + 1;
129 |
130 | float area0 = h0 * w0;
131 |
132 | for (int j = i + 1; j < box_num; j++) {
133 | if (merged[j])
134 | continue;
135 |
136 | float inner_x0 = input[i].x1 > input[j].x1 ? input[i].x1 : input[j].x1;
137 | float inner_y0 = input[i].y1 > input[j].y1 ? input[i].y1 : input[j].y1;
138 |
139 | float inner_x1 = input[i].x2 < input[j].x2 ? input[i].x2 : input[j].x2;
140 | float inner_y1 = input[i].y2 < input[j].y2 ? input[i].y2 : input[j].y2;
141 |
142 | float inner_h = inner_y1 - inner_y0 + 1;
143 | float inner_w = inner_x1 - inner_x0 + 1;
144 |
145 | if (inner_h <= 0 || inner_w <= 0)
146 | continue;
147 |
148 | float inner_area = inner_h * inner_w;
149 |
150 | float h1 = input[j].y2 - input[j].y1 + 1;
151 | float w1 = input[j].x2 - input[j].x1 + 1;
152 |
153 | float area1 = h1 * w1;
154 |
155 | float score;
156 |
157 | score = inner_area / (area0 + area1 - inner_area);
158 |
159 | if (score > iou_threshold) {
160 | merged[j] = 1;
161 | buf.push_back(input[j]);
162 | }
163 | }
164 | switch (type) {
165 | case hard_nms: {
166 | output.push_back(buf[0]);
167 | break;
168 | }
169 | case blending_nms: {
170 | float total = 0;
171 | for (long unsigned int i = 0; i < buf.size(); i++) {
172 | total += exp(buf[i].score);
173 | }
174 | FaceInfo rects;
175 | memset(&rects, 0, sizeof(rects));
176 | for (long unsigned int i = 0; i < buf.size(); i++) {
177 | float rate = exp(buf[i].score) / total;
178 | rects.x1 += buf[i].x1 * rate;
179 | rects.y1 += buf[i].y1 * rate;
180 | rects.x2 += buf[i].x2 * rate;
181 | rects.y2 += buf[i].y2 * rate;
182 | rects.score += buf[i].score * rate;
183 | }
184 | output.push_back(rects);
185 | break;
186 | }
187 | default: {
188 | printf("wrong type of nms.");
189 | exit(-1);
190 | }
191 | }
192 | }
193 | }
194 |
--------------------------------------------------------------------------------
/UltraFace.hpp:
--------------------------------------------------------------------------------
1 | //
2 | // UltraFace.hpp
3 | // UltraFaceTest
4 | //
5 | // Created by vealocia on 2019/10/17.
6 | // Copyright © 2019 vealocia. All rights reserved.
7 | //
8 |
9 | #ifndef UltraFace_hpp
10 | #define UltraFace_hpp
11 |
12 | #pragma once
13 |
14 | #include "gpu.h"
15 | #include "net.h"
16 | #include <algorithm>
17 | #include <iostream>
18 | #include <string>
19 | #include <vector>
20 |
21 | #define num_featuremap 4
22 | #define hard_nms 1
23 | #define blending_nms 2 /* blending nms was proposed in the BlazeFace paper; it aims to minimize temporal jitter */
24 |
25 | typedef struct FaceInfo {
26 | float x1;
27 | float y1;
28 | float x2;
29 | float y2;
30 | float score;
31 |
32 | float *landmarks;
33 | } FaceInfo;
34 |
35 | class UltraFace {
36 | public:
37 |     UltraFace(const std::string &bin_path, const std::string &param_path,
38 | int input_width, int input_length, int num_thread_ = 4, float score_threshold_ = 0.7, float iou_threshold_ = 0.3, int topk_ = -1);
39 |
40 | ~UltraFace();
41 |
42 |     int detect(ncnn::Mat &img, std::vector<FaceInfo> &face_list);
43 |
44 | private:
45 |     void generateBBox(std::vector<FaceInfo> &bbox_collection, ncnn::Mat scores, ncnn::Mat boxes, float score_threshold, int num_anchors);
46 |
47 |     void nms(std::vector<FaceInfo> &input, std::vector<FaceInfo> &output, int type = blending_nms);
48 |
49 | private:
50 | ncnn::Net ultraface;
51 |
52 | int num_thread;
53 | int image_w;
54 | int image_h;
55 |
56 | int in_w;
57 | int in_h;
58 | int num_anchors;
59 |
60 | int topk;
61 | float score_threshold;
62 | float iou_threshold;
63 |
64 |
65 | const float mean_vals[3] = {127, 127, 127};
66 | const float norm_vals[3] = {1.0 / 128, 1.0 / 128, 1.0 / 128};
67 |
68 | const float center_variance = 0.1;
69 | const float size_variance = 0.2;
70 |     const std::vector<std::vector<float>> min_boxes = {
71 | {10.0f, 16.0f, 24.0f},
72 | {32.0f, 48.0f},
73 | {64.0f, 96.0f},
74 | {128.0f, 192.0f, 256.0f}};
75 |     const std::vector<float> strides = {8.0, 16.0, 32.0, 64.0};
76 |     std::vector<std::vector<float>> featuremap_size;
77 |     std::vector<std::vector<float>> shrinkage_size;
78 |     std::vector<int> w_h_list;
79 |
80 |     std::vector<std::vector<float>> priors = {};
81 | };
82 |
83 | #endif /* UltraFace_hpp */
84 |
--------------------------------------------------------------------------------
/mask_detector_opt2.nb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Qengineering/Face-Mask-Detection-Jetson-Nano/9625ddf54561f345985858261d020b1ee13a40bf/mask_detector_opt2.nb
--------------------------------------------------------------------------------
/mask_ultra.cpp:
--------------------------------------------------------------------------------
1 | // Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
2 | //
3 | // Licensed under the Apache License, Version 2.0 (the "License");
4 | // you may not use this file except in compliance with the License.
5 | // You may obtain a copy of the License at
6 | //
7 | // http://www.apache.org/licenses/LICENSE-2.0
8 | //
9 | // Unless required by applicable law or agreed to in writing, software
10 | // distributed under the License is distributed on an "AS IS" BASIS,
11 | // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12 | // See the License for the specific language governing permissions and
13 | // limitations under the License.
14 |
15 | #include <iostream>
16 | #include <chrono>
17 | #include <vector>
18 | #include "opencv2/core.hpp"
19 | #include "opencv2/imgcodecs.hpp"
20 | #include "opencv2/imgproc.hpp"
21 | #include <opencv2/highgui.hpp>
22 | #include "paddle_api.h" // NOLINT
23 | #include "paddle_use_kernels.h" // NOLINT
24 | #include "paddle_use_ops.h" // NOLINT
25 | #include "UltraFace.hpp"
26 |
27 | using namespace std;
28 | using namespace paddle::lite_api; // NOLINT
29 |
30 | int main(int argc,char ** argv)
31 | {
32 | float f;
33 | float FPS[16];
34 | int i, Fcnt=0;
35 | cv::Mat frame;
36 | int classify_w = 128;
37 | int classify_h = 128;
38 | float scale_factor = 1.f / 256;
39 | int FaceImgSz = classify_w * classify_h;
40 |
41 | // Mask detection (second phase, when the faces are located)
42 | MobileConfig Mconfig;
43 |     std::shared_ptr<PaddlePredictor> Mpredictor;
44 | //some timing
45 | chrono::steady_clock::time_point Tbegin, Tend;
46 |
47 | for(i=0;i<16;i++) FPS[i]=0.0;
48 |
49 | //load SSD face detection model and get predictor
50 | // UltraFace ultraface("slim_320.bin","slim_320.param", 320, 240, 2, 0.7); // config model input
51 | UltraFace ultraface("RFB-320.bin","RFB-320.param", 320, 240, 2, 0.7); // config model input
52 |
53 | //load mask detection model
54 | Mconfig.set_model_from_file("mask_detector_opt2.nb");
55 |     Mpredictor = CreatePaddlePredictor<MobileConfig>(Mconfig);
56 | std::cout << "Load classification model succeed." << std::endl;
57 |
58 | // Get Input Tensor
59 |     std::unique_ptr<Tensor> input_tensor1(std::move(Mpredictor->GetInput(0)));
60 | input_tensor1->Resize({1, 3, classify_h, classify_w});
61 |
62 | // Get Output Tensor
63 |     std::unique_ptr<const Tensor> output_tensor1(std::move(Mpredictor->GetOutput(0)));
64 |
65 | cv::VideoCapture cap("Face_Mask_Video.mp4");
66 | if (!cap.isOpened()) {
67 | cerr << "ERROR: Unable to open the camera" << endl;
68 | return 0;
69 | }
70 | cout << "Start grabbing, press ESC on Live window to terminate" << endl;
71 |
72 | while(1){
73 | // frame=cv::imread("Face_2.jpg"); //if you want to run just one picture need to refresh frame before class detection
74 | cap >> frame;
75 | if (frame.empty()) {
76 | cerr << "ERROR: Unable to grab from the camera" << endl;
77 | break;
78 | }
79 |
80 | Tbegin = chrono::steady_clock::now();
81 |
82 | ncnn::Mat inmat = ncnn::Mat::from_pixels(frame.data, ncnn::Mat::PIXEL_BGR2RGB, frame.cols, frame.rows);
83 |
84 | //get the faces
85 |         std::vector<FaceInfo> face_info;
86 | ultraface.detect(inmat, face_info);
87 |
88 |         auto* input_data = input_tensor1->mutable_data<float>();
89 |
90 | for(long unsigned int i = 0; i < face_info.size(); i++) {
91 | auto face = face_info[i];
92 | //enlarge 10%
93 | float w = (face.x2 - face.x1)/20.0;
94 | float h = (face.y2 - face.y1)/20.0;
95 | cv::Point pt1(std::max(face.x1-w,float(0.0)),std::max(face.y1-h,float(0.0)));
96 | cv::Point pt2(std::min(face.x2+w,float(frame.cols)),std::min(face.y2+h,float(frame.rows)));
97 |             //RecClip is completely inside the frame
98 | cv::Rect RecClip(pt1, pt2);
99 | cv::Mat resized_img;
100 | cv::Mat imgf;
101 |
102 | if(RecClip.width>0 && RecClip.height>0){
103 | //roi has size RecClip
104 | cv::Mat roi = frame(RecClip);
105 |
106 | //resized_img has size 128x128 (uchar)
107 | cv::resize(roi, resized_img, cv::Size(classify_w, classify_h), 0.f, 0.f, cv::INTER_CUBIC);
108 |
109 | //imgf has size 128x128 (float in range 0.0 - +1.0)
110 | resized_img.convertTo(imgf, CV_32FC3, scale_factor);
111 |
112 | //input tensor has size 128x128 (float in range -0.5 - +0.5)
113 | // fill tensor with mean and scale and trans layout: nhwc -> nchw, neon speed up
114 | //offset_nchw(n, c, h, w) = n * CHW + c * HW + h * W + w
115 | //offset_nhwc(n, c, h, w) = n * HWC + h * WC + w * C + c
116 |                 const float* dimg = reinterpret_cast<const float*>(imgf.data);
117 |
118 | float* dout_c0 = input_data;
119 | float* dout_c1 = input_data + FaceImgSz;
120 | float* dout_c2 = input_data + FaceImgSz * 2;
121 |
122 |                 for(int i=0;i<FaceImgSz;i++){
123 |                     //subtract the 0.5 mean while converting interleaved HWC pixels to planar CHW
124 |                     *(dout_c0++) = (*(dimg++) - 0.5f);
125 |                     *(dout_c1++) = (*(dimg++) - 0.5f);
126 |                     *(dout_c2++) = (*(dimg++) - 0.5f);
127 |                 }
128 |                 //classify the face with the Paddle Lite model
129 |                 Mpredictor->Run();
130 |
131 |                 auto* outptr = output_tensor1->data<float>();
132 | float prob = outptr[1];
133 |
134 | // Draw Detection and Classification Results
135 | bool flag_mask = prob > 0.5f;
136 | cv::Scalar roi_color;
137 |
138 | if(flag_mask) roi_color = cv::Scalar(0, 255, 0);
139 | else roi_color = cv::Scalar(0, 0, 255);
140 | // Draw roi object
141 | cv::rectangle(frame, RecClip, roi_color, 2);
142 | }
143 | }
144 |
145 | Tend = chrono::steady_clock::now();
146 | //calculate frame rate
147 |         f = chrono::duration_cast<chrono::milliseconds>(Tend - Tbegin).count();
148 | if(f>0.0) FPS[((Fcnt++)&0x0F)]=1000.0/f;
149 | for(f=0.0, i=0;i<16;i++){ f+=FPS[i]; }
150 | cv::putText(frame, cv::format("FPS %0.2f", f/16),cv::Point(10,20),cv::FONT_HERSHEY_SIMPLEX,0.6, cv::Scalar(0, 0, 255));
151 |
152 | //cv::imwrite("FaceResult.jpg",frame); //in case you run only a jpg picture
153 |
154 | //show output
155 | cv::imshow("RPi 64 OS - 1,95 GHz - 2 Mb RAM", frame);
156 |
157 | char esc = cv::waitKey(5);
158 | if(esc == 27) break;
159 | }
160 |
161 | cout << "Closing the camera" << endl;
162 | cv::destroyAllWindows();
163 | cout << "Bye!" << endl;
164 |
165 | return 0;
166 | }
167 |
--------------------------------------------------------------------------------
/slim_320.bin:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Qengineering/Face-Mask-Detection-Jetson-Nano/9625ddf54561f345985858261d020b1ee13a40bf/slim_320.bin
--------------------------------------------------------------------------------
/slim_320.param:
--------------------------------------------------------------------------------
1 | 7767517
2 | 100 107
3 | Input input 0 1 input
4 | Convolution 185 1 1 input 185 0=16 1=3 11=3 2=1 12=1 3=2 13=2 4=1 14=1 5=1 6=432
5 | ReLU 187 1 1 185 187
6 | ConvolutionDepthWise 188 1 1 187 188 0=16 1=3 11=3 2=1 12=1 3=1 13=1 4=1 14=1 5=1 6=144 7=16
7 | ReLU 190 1 1 188 190
8 | Convolution 191 1 1 190 191 0=32 1=1 11=1 2=1 12=1 3=1 13=1 4=0 14=0 5=1 6=512
9 | ReLU 193 1 1 191 193
10 | ConvolutionDepthWise 194 1 1 193 194 0=32 1=3 11=3 2=1 12=1 3=2 13=2 4=1 14=1 5=1 6=288 7=32
11 | ReLU 196 1 1 194 196
12 | Convolution 197 1 1 196 197 0=32 1=1 11=1 2=1 12=1 3=1 13=1 4=0 14=0 5=1 6=1024
13 | ReLU 199 1 1 197 199
14 | ConvolutionDepthWise 200 1 1 199 200 0=32 1=3 11=3 2=1 12=1 3=1 13=1 4=1 14=1 5=1 6=288 7=32
15 | ReLU 202 1 1 200 202
16 | Convolution 203 1 1 202 203 0=32 1=1 11=1 2=1 12=1 3=1 13=1 4=0 14=0 5=1 6=1024
17 | ReLU 205 1 1 203 205
18 | ConvolutionDepthWise 206 1 1 205 206 0=32 1=3 11=3 2=1 12=1 3=2 13=2 4=1 14=1 5=1 6=288 7=32
19 | ReLU 208 1 1 206 208
20 | Convolution 209 1 1 208 209 0=64 1=1 11=1 2=1 12=1 3=1 13=1 4=0 14=0 5=1 6=2048
21 | ReLU 211 1 1 209 211
22 | ConvolutionDepthWise 212 1 1 211 212 0=64 1=3 11=3 2=1 12=1 3=1 13=1 4=1 14=1 5=1 6=576 7=64
23 | ReLU 214 1 1 212 214
24 | Convolution 215 1 1 214 215 0=64 1=1 11=1 2=1 12=1 3=1 13=1 4=0 14=0 5=1 6=4096
25 | ReLU 217 1 1 215 217
26 | ConvolutionDepthWise 218 1 1 217 218 0=64 1=3 11=3 2=1 12=1 3=1 13=1 4=1 14=1 5=1 6=576 7=64
27 | ReLU 220 1 1 218 220
28 | Convolution 221 1 1 220 221 0=64 1=1 11=1 2=1 12=1 3=1 13=1 4=0 14=0 5=1 6=4096
29 | ReLU 223 1 1 221 223
30 | ConvolutionDepthWise 224 1 1 223 224 0=64 1=3 11=3 2=1 12=1 3=1 13=1 4=1 14=1 5=1 6=576 7=64
31 | ReLU 226 1 1 224 226
32 | Convolution 227 1 1 226 227 0=64 1=1 11=1 2=1 12=1 3=1 13=1 4=0 14=0 5=1 6=4096
33 | ReLU 229 1 1 227 229
34 | Split splitncnn_0 1 3 229 229_splitncnn_0 229_splitncnn_1 229_splitncnn_2
35 | ConvolutionDepthWise 230 1 1 229_splitncnn_2 230 0=64 1=3 11=3 2=1 12=1 3=1 13=1 4=1 14=1 5=1 6=576 7=64
36 | ReLU 231 1 1 230 231
37 | Convolution 232 1 1 231 232 0=6 1=1 11=1 2=1 12=1 3=1 13=1 4=0 14=0 5=1 6=384
38 | Permute 233 1 1 232 233 0=3
39 | Reshape 243 1 1 233 243 0=2 1=-1
40 | ConvolutionDepthWise 244 1 1 229_splitncnn_1 244 0=64 1=3 11=3 2=1 12=1 3=1 13=1 4=1 14=1 5=1 6=576 7=64
41 | ReLU 245 1 1 244 245
42 | Convolution 246 1 1 245 246 0=12 1=1 11=1 2=1 12=1 3=1 13=1 4=0 14=0 5=1 6=768
43 | Permute 247 1 1 246 247 0=3
44 | Reshape 257 1 1 247 257 0=4 1=-1
45 | ConvolutionDepthWise 258 1 1 229_splitncnn_0 258 0=64 1=3 11=3 2=1 12=1 3=2 13=2 4=1 14=1 5=1 6=576 7=64
46 | ReLU 260 1 1 258 260
47 | Convolution 261 1 1 260 261 0=128 1=1 11=1 2=1 12=1 3=1 13=1 4=0 14=0 5=1 6=8192
48 | ReLU 263 1 1 261 263
49 | ConvolutionDepthWise 264 1 1 263 264 0=128 1=3 11=3 2=1 12=1 3=1 13=1 4=1 14=1 5=1 6=1152 7=128
50 | ReLU 266 1 1 264 266
51 | Convolution 267 1 1 266 267 0=128 1=1 11=1 2=1 12=1 3=1 13=1 4=0 14=0 5=1 6=16384
52 | ReLU 269 1 1 267 269
53 | ConvolutionDepthWise 270 1 1 269 270 0=128 1=3 11=3 2=1 12=1 3=1 13=1 4=1 14=1 5=1 6=1152 7=128
54 | ReLU 272 1 1 270 272
55 | Convolution 273 1 1 272 273 0=128 1=1 11=1 2=1 12=1 3=1 13=1 4=0 14=0 5=1 6=16384
56 | ReLU 275 1 1 273 275
57 | Split splitncnn_1 1 3 275 275_splitncnn_0 275_splitncnn_1 275_splitncnn_2
58 | ConvolutionDepthWise 276 1 1 275_splitncnn_2 276 0=128 1=3 11=3 2=1 12=1 3=1 13=1 4=1 14=1 5=1 6=1152 7=128
59 | ReLU 277 1 1 276 277
60 | Convolution 278 1 1 277 278 0=4 1=1 11=1 2=1 12=1 3=1 13=1 4=0 14=0 5=1 6=512
61 | Permute 279 1 1 278 279 0=3
62 | Reshape 289 1 1 279 289 0=2 1=-1
63 | ConvolutionDepthWise 290 1 1 275_splitncnn_1 290 0=128 1=3 11=3 2=1 12=1 3=1 13=1 4=1 14=1 5=1 6=1152 7=128
64 | ReLU 291 1 1 290 291
65 | Convolution 292 1 1 291 292 0=8 1=1 11=1 2=1 12=1 3=1 13=1 4=0 14=0 5=1 6=1024
66 | Permute 293 1 1 292 293 0=3
67 | Reshape 303 1 1 293 303 0=4 1=-1
68 | ConvolutionDepthWise 304 1 1 275_splitncnn_0 304 0=128 1=3 11=3 2=1 12=1 3=2 13=2 4=1 14=1 5=1 6=1152 7=128
69 | ReLU 306 1 1 304 306
70 | Convolution 307 1 1 306 307 0=256 1=1 11=1 2=1 12=1 3=1 13=1 4=0 14=0 5=1 6=32768
71 | ReLU 309 1 1 307 309
72 | ConvolutionDepthWise 310 1 1 309 310 0=256 1=3 11=3 2=1 12=1 3=1 13=1 4=1 14=1 5=1 6=2304 7=256
73 | ReLU 312 1 1 310 312
74 | Convolution 313 1 1 312 313 0=256 1=1 11=1 2=1 12=1 3=1 13=1 4=0 14=0 5=1 6=65536
75 | ReLU 315 1 1 313 315
76 | Split splitncnn_2 1 3 315 315_splitncnn_0 315_splitncnn_1 315_splitncnn_2
77 | ConvolutionDepthWise 316 1 1 315_splitncnn_2 316 0=256 1=3 11=3 2=1 12=1 3=1 13=1 4=1 14=1 5=1 6=2304 7=256
78 | ReLU 317 1 1 316 317
79 | Convolution 318 1 1 317 318 0=4 1=1 11=1 2=1 12=1 3=1 13=1 4=0 14=0 5=1 6=1024
80 | Permute 319 1 1 318 319 0=3
81 | Reshape 329 1 1 319 329 0=2 1=-1
82 | ConvolutionDepthWise 330 1 1 315_splitncnn_1 330 0=256 1=3 11=3 2=1 12=1 3=1 13=1 4=1 14=1 5=1 6=2304 7=256
83 | ReLU 331 1 1 330 331
84 | Convolution 332 1 1 331 332 0=8 1=1 11=1 2=1 12=1 3=1 13=1 4=0 14=0 5=1 6=2048
85 | Permute 333 1 1 332 333 0=3
86 | Reshape 343 1 1 333 343 0=4 1=-1
87 | Convolution 344 1 1 315_splitncnn_0 344 0=64 1=1 11=1 2=1 12=1 3=1 13=1 4=0 14=0 5=1 6=16384
88 | ReLU 345 1 1 344 345
89 | ConvolutionDepthWise 346 1 1 345 346 0=64 1=3 11=3 2=1 12=1 3=2 13=2 4=1 14=1 5=1 6=576 7=64
90 | ReLU 347 1 1 346 347
91 | Convolution 348 1 1 347 348 0=256 1=1 11=1 2=1 12=1 3=1 13=1 4=0 14=0 5=1 6=16384
92 | ReLU 349 1 1 348 349
93 | Split splitncnn_3 1 2 349 349_splitncnn_0 349_splitncnn_1
94 | Convolution 350 1 1 349_splitncnn_1 350 0=6 1=3 11=3 2=1 12=1 3=1 13=1 4=1 14=1 5=1 6=13824
95 | Permute 351 1 1 350 351 0=3
96 | Reshape 361 1 1 351 361 0=2 1=-1
97 | Convolution 362 1 1 349_splitncnn_0 362 0=12 1=3 11=3 2=1 12=1 3=1 13=1 4=1 14=1 5=1 6=27648
98 | Permute 363 1 1 362 363 0=3
99 | Reshape 373 1 1 363 373 0=4 1=-1
100 | Concat 374 4 1 243 289 329 361 374 0=0
101 | Concat boxes 4 1 257 303 343 373 boxes 0=0
102 | Softmax scores 1 1 374 scores 0=1 1=1
103 |
--------------------------------------------------------------------------------