├── LICENSE ├── README.md ├── YoloV5.cbp ├── busstop.jpg ├── parking.jpg ├── yolov5.cpp ├── yolov5s.bin └── yolov5s.param /LICENSE: -------------------------------------------------------------------------------- 1 | BSD 3-Clause License 2 | 3 | Copyright (c) 2021, Q-engineering 4 | All rights reserved. 5 | 6 | Redistribution and use in source and binary forms, with or without 7 | modification, are permitted provided that the following conditions are met: 8 | 9 | 1. Redistributions of source code must retain the above copyright notice, this 10 | list of conditions and the following disclaimer. 11 | 12 | 2. Redistributions in binary form must reproduce the above copyright notice, 13 | this list of conditions and the following disclaimer in the documentation 14 | and/or other materials provided with the distribution. 15 | 16 | 3. Neither the name of the copyright holder nor the names of its 17 | contributors may be used to endorse or promote products derived from 18 | this software without specific prior written permission. 19 | 20 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" 21 | AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE 22 | IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 23 | DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE 24 | FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 25 | DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR 26 | SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER 27 | CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 28 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 29 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 30 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # YoloV5 Raspberry Pi 4 2 | ![output image]( https://qengineering.eu/images/test_parkV5.jpg ) 3 | ## YoloV5 with the ncnn framework.
4 | [![License](https://img.shields.io/badge/License-BSD%203--Clause-blue.svg)](https://opensource.org/licenses/BSD-3-Clause)

5 | Paper: https://towardsdatascience.com/yolo-v5-is-here-b668ce2a4908

6 | Specially made for a bare Raspberry Pi 4, see [Q-engineering deep learning examples](https://qengineering.eu/deep-learning-examples-on-raspberry-32-64-os.html)
7 | 
8 | ------------
9 | 
10 | ## Benchmark.
11 | Numbers are in **FPS** and reflect only the inference timing. Grabbing frames, post-processing and drawing are not taken into account. A rough timing sketch is given below the table footnotes.
12 | 
13 | | Model | size | mAP | Jetson Nano | RPi 4 1950 MHz | RPi 5 2900 MHz | Rock 5 | RK3588¹ NPU | RK3566/68² NPU | Nano TensorRT | Orin TensorRT |
14 | | ------------- | :-----: | :-----: | :-------------: | :-------------: | :-----: | :-----: | :-------------: | :-------------: | :-----: | :-----: |
15 | | [NanoDet](https://github.com/Qengineering/NanoDet-ncnn-Raspberry-Pi-4) | 320x320 | 20.6 | 26.2 | 13.0 | 43.2 | 36.0 |||||
16 | | [NanoDet Plus](https://github.com/Qengineering/NanoDetPlus-ncnn-Raspberry-Pi-4) | 416x416 | 30.4 | 18.5 | 5.0 | 30.0 | 24.9 |||||
17 | | [PP-PicoDet](https://github.com/Qengineering/PP-PicoDet-ncnn-Raspberry-Pi-4) | 320x320 | 27.0 | 24.0 | 7.5 | 53.7 | 46.7 |||||
18 | | [YoloFastestV2](https://github.com/Qengineering/YoloFastestV2-ncnn-Raspberry-Pi-4) | 352x352 | 24.1 | 38.4 | 18.8 | 78.5 | 65.4 |||||
19 | | [YoloV2](https://github.com/Qengineering/YoloV2-ncnn-Raspberry-Pi-4) ²⁰ | 416x416 | 19.2 | 10.1 | 3.0 | 24.0 | 20.0 |||||
20 | | [YoloV3](https://github.com/Qengineering/YoloV3-ncnn-Raspberry-Pi-4) ²⁰ | 352x352 tiny | 16.6 | 17.7 | 4.4 | 18.1 | 15.0 |||||
21 | | [YoloV4](https://github.com/Qengineering/YoloV4-ncnn-Raspberry-Pi-4) | 416x416 tiny | 21.7 | 16.1 | 3.4 | 17.5 | 22.4 |||||
22 | | [YoloV4](https://github.com/Qengineering/YoloV4-ncnn-Raspberry-Pi-4) | 608x608 full | 45.3 | 1.3 | 0.2 | 1.82 | 1.5 |||||
23 | | [YoloV5](https://github.com/Qengineering/YoloV5-ncnn-Raspberry-Pi-4) | 640x640 nano | 22.5 | 5.0 | 1.6 | 13.6 | 12.5 | 58.8 | 14.8 | 19.0 | 100 |
24 | | [YoloV5](https://github.com/Qengineering/YoloV5-ncnn-Raspberry-Pi-4) | 640x640 small | 22.5 | 5.0 | 1.6 | 6.3 | 12.5 | 37.7 | 11.7 | 9.25 | 100 |
25 | | [YoloV6](https://github.com/Qengineering/YoloV6-ncnn-Raspberry-Pi-4) | 640x640 nano | 35.0 | 10.5 | 2.7 | 15.8 | 20.8 | 63.0 | 18.0 |||
26 | | [YoloV7](https://github.com/Qengineering/YoloV5-ncnn-Raspberry-Pi-4) | 640x640 tiny | 38.7 | 8.5 | 2.1 | 14.4 | 17.9 | 53.4 | 16.1 | 15.0 ||
27 | | [YoloV8](https://github.com/Qengineering/YoloV8-ncnn-Raspberry-Pi-4) | 640x640 nano | 37.3 | 14.5 | 3.1 | 20.0 | 16.3 | 53.1 | 18.2 |||
28 | | [YoloV8](https://github.com/Qengineering/YoloV8-ncnn-Raspberry-Pi-4) | 640x640 small | 44.9 | 4.5 | 1.47 | 11.0 | 9.2 | 28.5 | 8.9 |||
29 | | [YoloV9](https://github.com/Qengineering/YoloV9-ncnn-Raspberry-Pi-4) | 640x640 comp | 53.0 | 1.2 | 0.28 | 1.5 | 1.2 |||||
30 | | [YoloX](https://github.com/Qengineering/YoloX-ncnn-Raspberry-Pi-4) | 416x416 nano | 25.8 | 22.6 | 7.0 | 38.6 | 28.5 |||||
31 | | [YoloX](https://github.com/Qengineering/YoloX-ncnn-Raspberry-Pi-4) | 416x416 tiny | 32.8 | 11.35 | 2.8 | 17.2 | 18.1 |||||
32 | | [YoloX](https://github.com/Qengineering/YoloX-ncnn-Raspberry-Pi-4) | 640x640 small | 40.5 | 3.65 | 0.9 | 4.5 | 7.5 | 30.0 | 10.0 |||
33 | 
34 | ¹ The Rock 5 and Orange Pi 5 have the RK3588 on board.
35 | ² The Rock 3, Radxa Zero 3 and Orange Pi 3B have the RK3566 on board.
36 | ²⁰ Recognize 20 objects (VOC) instead of 80 (COCO).
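The exact benchmark script is not included in this repo. As a rough sketch only, a helper like the one below, added to yolov5.cpp, times repeated calls to `detect_yolov5()` with `std::chrono`. Note that `detect_yolov5()` also contains the letterbox pre-processing and the NMS, so the result slightly understates the pure inference figures above.

```cpp
// Hypothetical helper, not the script used for the table above.
// Add it to yolov5.cpp (it uses the Object struct and detect_yolov5() defined there)
// and call it from main() once the model has been loaded.
#include <chrono>

static double benchmark_fps(const cv::Mat& image, int runs)
{
    std::vector<Object> objects;
    auto t0 = std::chrono::steady_clock::now();
    for (int i = 0; i < runs; i++) {
        detect_yolov5(image, objects);   // letterbox + inference + NMS
    }
    auto t1 = std::chrono::steady_clock::now();
    double seconds = std::chrono::duration<double>(t1 - t0).count();
    return runs / seconds;               // frames per second
}
```
For example, `printf("%.1f FPS\n", benchmark_fps(m, 50));` placed after the model is loaded in `main()` gives a quick sanity check against the table.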
37 | 
38 | ------------
39 | 
40 | ## Dependencies.
41 | To run the application, you need:
42 | - A Raspberry Pi 4 with a 32 or 64-bit operating system. It can be the Raspberry 64-bit OS, or Ubuntu 18.04 / 20.04. [Install 64-bit OS](https://qengineering.eu/install-raspberry-64-os.html)
43 | - The Tencent ncnn framework installed. [Install ncnn](https://qengineering.eu/install-ncnn-on-raspberry-pi-4.html)
44 | - OpenCV 64-bit installed. [Install OpenCV 4.5](https://qengineering.eu/install-opencv-4.5-on-raspberry-64-os.html)
45 | - Code::Blocks installed. (```$ sudo apt-get install codeblocks```)
46 | 
47 | ------------
48 | 
49 | ## Installing the app.
50 | To extract and run the network in Code::Blocks:
51 | $ mkdir *MyDir*
52 | $ cd *MyDir*
53 | $ wget https://github.com/Qengineering/YoloV5-ncnn-Raspberry-Pi-4/archive/refs/heads/main.zip
54 | $ unzip -j main.zip
55 | Remove main.zip, LICENSE and README.md as they are no longer needed.
56 | $ rm main.zip
57 | $ rm LICENSE
58 | $ rm README.md

59 | Your *MyDir* folder must now look like this:
60 | parking.jpg
61 | busstop.jpg
62 | YoloV5.cbp
63 | yolov5.cpp
64 | yolov5s.bin
65 | yolov5s.param
66 | 
67 | ------------
68 | 
69 | ## Running the app.
70 | To run the application, load the project file YoloV5.cbp in Code::Blocks. For more information, or
71 | if you want to connect a camera to the app, follow the instructions at [Hands-On](https://qengineering.eu/deep-learning-examples-on-raspberry-32-64-os.html#HandsOn).
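As a rough sketch of what such a camera version could look like, assuming a USB or CSI camera that OpenCV can open as device 0 (the Hands-On page is the place for the real instructions), `main()` in yolov5.cpp could be replaced by something like:

```cpp
// Hypothetical camera loop for yolov5.cpp, replacing the single-image main().
// Assumes a camera reachable as cv::VideoCapture device 0.
int main(int argc, char** argv)
{
    yolov5.register_custom_layer("YoloV5Focus", YoloV5Focus_layer_creator);
    yolov5.load_param("yolov5s.param");
    yolov5.load_model("yolov5s.bin");
    yolov5.opt.num_threads = 4;

    cv::VideoCapture cap(0);                      // open the camera
    if (!cap.isOpened()) {
        fprintf(stderr, "Cannot open camera 0\n");
        return -1;
    }

    cv::Mat frame;
    std::vector<Object> objects;
    while (cap.read(frame)) {
        detect_yolov5(frame, objects);            // same detector as the still-image demo
        draw_objects(frame, objects);
        cv::imshow("YoloV5 - camera", frame);
        if (cv::waitKey(1) == 27) break;          // Esc quits
    }
    return 0;
}
```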

72 | Many thanks to [nihui](https://github.com/nihui/) again!

73 | ![output image]( https://qengineering.eu/images/test_busV5.jpg ) 74 | 75 | ------------ 76 | 77 | [![paypal](https://qengineering.eu/images/TipJarSmall4.png)](https://www.paypal.com/cgi-bin/webscr?cmd=_s-xclick&hosted_button_id=CPZTM5BB3FCYL) 78 | 79 | 80 | -------------------------------------------------------------------------------- /YoloV5.cbp: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 67 | 68 | -------------------------------------------------------------------------------- /busstop.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Qengineering/YoloV5-ncnn-Raspberry-Pi-4/134933aa72247a41bf598ef38b1476d11d003f28/busstop.jpg -------------------------------------------------------------------------------- /parking.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Qengineering/YoloV5-ncnn-Raspberry-Pi-4/134933aa72247a41bf598ef38b1476d11d003f28/parking.jpg -------------------------------------------------------------------------------- /yolov5.cpp: -------------------------------------------------------------------------------- 1 | // Tencent is pleased to support the open source community by making ncnn available. 2 | // 3 | // Copyright (C) 2020 THL A29 Limited, a Tencent company. All rights reserved. 4 | // 5 | // Licensed under the BSD 3-Clause License (the "License"); you may not use this file except 6 | // in compliance with the License. You may obtain a copy of the License at 7 | // 8 | // https://opensource.org/licenses/BSD-3-Clause 9 | // 10 | // Unless required by applicable law or agreed to in writing, software distributed 11 | // under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR 12 | // CONDITIONS OF ANY KIND, either express or implied. See the License for the 13 | // specific language governing permissions and limitations under the License. 
14 | 
15 | // modified 12-31-2021 Q-engineering
16 | 
17 | #include "layer.h"
18 | #include "net.h"
19 | 
20 | #include <float.h>
21 | #include <math.h>
22 | #include <stdio.h>
23 | #include <vector>
24 | #include <opencv2/opencv.hpp>
25 | 
26 | ncnn::Net yolov5;
27 | 
28 | const int target_size = 640;
29 | const float prob_threshold = 0.25f;
30 | const float nms_threshold = 0.45f;
31 | const float norm_vals[3] = {1 / 255.f, 1 / 255.f, 1 / 255.f};
32 | 
33 | const char* class_names[] = {
34 | "person", "bicycle", "car", "motorcycle", "airplane", "bus", "train", "truck", "boat", "traffic light",
35 | "fire hydrant", "stop sign", "parking meter", "bench", "bird", "cat", "dog", "horse", "sheep", "cow",
36 | "elephant", "bear", "zebra", "giraffe", "backpack", "umbrella", "handbag", "tie", "suitcase", "frisbee",
37 | "skis", "snowboard", "sports ball", "kite", "baseball bat", "baseball glove", "skateboard", "surfboard",
38 | "tennis racket", "bottle", "wine glass", "cup", "fork", "knife", "spoon", "bowl", "banana", "apple",
39 | "sandwich", "orange", "broccoli", "carrot", "hot dog", "pizza", "donut", "cake", "chair", "couch",
40 | "potted plant", "bed", "dining table", "toilet", "tv", "laptop", "mouse", "remote", "keyboard", "cell phone",
41 | "microwave", "oven", "toaster", "sink", "refrigerator", "book", "clock", "vase", "scissors", "teddy bear",
42 | "hair drier", "toothbrush"
43 | };
44 | 
45 | // The custom 'Focus' layer: packs each 2x2 spatial block into four channels (space-to-depth)
46 | class YoloV5Focus : public ncnn::Layer
47 | {
48 | public:
49 | YoloV5Focus()
50 | {
51 | one_blob_only = true;
52 | }
53 | 
54 | virtual int forward(const ncnn::Mat& bottom_blob, ncnn::Mat& top_blob, const ncnn::Option& opt) const
55 | {
56 | int w = bottom_blob.w;
57 | int h = bottom_blob.h;
58 | int channels = bottom_blob.c;
59 | 
60 | int outw = w / 2;
61 | int outh = h / 2;
62 | int outc = channels * 4;
63 | 
64 | top_blob.create(outw, outh, outc, 4u, 1, opt.blob_allocator);
65 | if (top_blob.empty())
66 | return -100;
67 | 
68 | #pragma omp parallel for num_threads(opt.num_threads)
69 | for (int p = 0; p < outc; p++)
70 | {
71 | const float* ptr = bottom_blob.channel(p % channels).row((p / channels) % 2) + ((p / channels) / 2);
72 | float* outptr = top_blob.channel(p);
73 | 
74 | for (int i = 0; i < outh; i++)
75 | {
76 | for (int j = 0; j < outw; j++)
77 | {
78 | *outptr = *ptr;
79 | 
80 | outptr += 1;
81 | ptr += 2;
82 | }
83 | 
84 | ptr += w;
85 | }
86 | }
87 | 
88 | return 0;
89 | }
90 | };
91 | 
92 | DEFINE_LAYER_CREATOR(YoloV5Focus)
93 | 
94 | struct Object
95 | {
96 | cv::Rect_<float> rect;
97 | int label;
98 | float prob;
99 | };
100 | 
101 | static inline float intersection_area(const Object& a, const Object& b)
102 | {
103 | cv::Rect_<float> inter = a.rect & b.rect;
104 | return inter.area();
105 | }
106 | 
107 | static void qsort_descent_inplace(std::vector<Object>& faceobjects, int left, int right)
108 | {
109 | int i = left;
110 | int j = right;
111 | float p = faceobjects[(left + right) / 2].prob;
112 | 
113 | while (i <= j)
114 | {
115 | while (faceobjects[i].prob > p)
116 | i++;
117 | 
118 | while (faceobjects[j].prob < p)
119 | j--;
120 | 
121 | if (i <= j)
122 | {
123 | // swap
124 | std::swap(faceobjects[i], faceobjects[j]);
125 | 
126 | i++;
127 | j--;
128 | }
129 | }
130 | 
131 | #pragma omp parallel sections
132 | {
133 | #pragma omp section
134 | {
135 | if (left < j) qsort_descent_inplace(faceobjects, left, j);
136 | }
137 | #pragma omp section
138 | {
139 | if (i < right) qsort_descent_inplace(faceobjects, i, right);
140 | }
141 | }
142 | }
143 | 
144 | static void qsort_descent_inplace(std::vector<Object>& faceobjects)
145 | {
146 | if (faceobjects.empty())
147 | return;
148 | 
149 | qsort_descent_inplace(faceobjects, 0, faceobjects.size() - 1);
150 | }
151 | 
152 | static void nms_sorted_bboxes(const std::vector<Object>& faceobjects, std::vector<int>& picked, float nms_threshold)
153 | {
154 | picked.clear();
155 | 
156 | const int n = faceobjects.size();
157 | 
158 | std::vector<float> areas(n);
159 | for (int i = 0; i < n; i++)
160 | {
161 | areas[i] = faceobjects[i].rect.area();
162 | }
163 | 
164 | for (int i = 0; i < n; i++)
165 | {
166 | const Object& a = faceobjects[i];
167 | 
168 | int keep = 1;
169 | for (int j = 0; j < (int)picked.size(); j++)
170 | {
171 | const Object& b = faceobjects[picked[j]];
172 | 
173 | // intersection over union
174 | float inter_area = intersection_area(a, b);
175 | float union_area = areas[i] + areas[picked[j]] - inter_area;
176 | // float IoU = inter_area / union_area
177 | if (inter_area / union_area > nms_threshold)
178 | keep = 0;
179 | }
180 | 
181 | if (keep)
182 | picked.push_back(i);
183 | }
184 | }
185 | 
186 | static inline float sigmoid(float x)
187 | {
188 | return static_cast<float>(1.f / (1.f + exp(-x)));
189 | }
190 | 
191 | static void generate_proposals(const ncnn::Mat& anchors, int stride, const ncnn::Mat& in_pad, const ncnn::Mat& feat_blob, float prob_threshold, std::vector<Object>& objects)
192 | {
193 | const int num_grid = feat_blob.h;
194 | 
195 | int num_grid_x;
196 | int num_grid_y;
197 | if (in_pad.w > in_pad.h)
198 | {
199 | num_grid_x = in_pad.w / stride;
200 | num_grid_y = num_grid / num_grid_x;
201 | }
202 | else
203 | {
204 | num_grid_y = in_pad.h / stride;
205 | num_grid_x = num_grid / num_grid_y;
206 | }
207 | 
208 | const int num_class = feat_blob.w - 5;
209 | 
210 | const int num_anchors = anchors.w / 2;
211 | 
212 | for (int q = 0; q < num_anchors; q++)
213 | {
214 | const float anchor_w = anchors[q * 2];
215 | const float anchor_h = anchors[q * 2 + 1];
216 | 
217 | const ncnn::Mat feat = feat_blob.channel(q);
218 | 
219 | for (int i = 0; i < num_grid_y; i++)
220 | {
221 | for (int j = 0; j < num_grid_x; j++)
222 | {
223 | const float* featptr = feat.row(i * num_grid_x + j);
224 | 
225 | // find class index with max class score
226 | int class_index = 0;
227 | float class_score = -FLT_MAX;
228 | for (int k = 0; k < num_class; k++)
229 | {
230 | float score = featptr[5 + k];
231 | if (score > class_score)
232 | {
233 | class_index = k;
234 | class_score = score;
235 | }
236 | }
237 | 
238 | float box_score = featptr[4];
239 | 
240 | float confidence = sigmoid(box_score) * sigmoid(class_score);
241 | 
242 | if (confidence >= prob_threshold)
243 | {
244 | // yolov5/models/yolo.py Detect forward
245 | // y = x[i].sigmoid()
246 | // y[..., 0:2] = (y[..., 0:2] * 2. - 0.5 + self.grid[i].to(x[i].device)) * self.stride[i] # xy
247 | // y[..., 2:4] = (y[..., 2:4] * 2) ** 2 * self.anchor_grid[i] # wh
248 | 
249 | float dx = sigmoid(featptr[0]);
250 | float dy = sigmoid(featptr[1]);
251 | float dw = sigmoid(featptr[2]);
252 | float dh = sigmoid(featptr[3]);
253 | 
254 | float pb_cx = (dx * 2.f - 0.5f + j) * stride;
255 | float pb_cy = (dy * 2.f - 0.5f + i) * stride;
256 | 
257 | float pb_w = pow(dw * 2.f, 2) * anchor_w;
258 | float pb_h = pow(dh * 2.f, 2) * anchor_h;
259 | 
260 | float x0 = pb_cx - pb_w * 0.5f;
261 | float y0 = pb_cy - pb_h * 0.5f;
262 | float x1 = pb_cx + pb_w * 0.5f;
263 | float y1 = pb_cy + pb_h * 0.5f;
264 | 
265 | Object obj;
266 | obj.rect.x = x0;
267 | obj.rect.y = y0;
268 | obj.rect.width = x1 - x0;
269 | obj.rect.height = y1 - y0;
270 | obj.label = class_index;
271 | obj.prob = confidence;
272 | 
273 | objects.push_back(obj);
274 | }
275 | }
276 | }
277 | }
278 | }
279 | 
280 | static int detect_yolov5(const cv::Mat& bgr, std::vector<Object>& objects)
281 | {
282 | int img_w = bgr.cols;
283 | int img_h = bgr.rows;
284 | 
285 | // letterbox pad to multiple of 32
286 | int w = img_w;
287 | int h = img_h;
288 | float scale = 1.f;
289 | if (w > h)
290 | {
291 | scale = (float)target_size / w;
292 | w = target_size;
293 | h = h * scale;
294 | }
295 | else
296 | {
297 | scale = (float)target_size / h;
298 | h = target_size;
299 | w = w * scale;
300 | }
301 | 
302 | ncnn::Mat in = ncnn::Mat::from_pixels_resize(bgr.data, ncnn::Mat::PIXEL_BGR2RGB, img_w, img_h, w, h);
303 | 
304 | // pad to target_size rectangle
305 | // yolov5/utils/datasets.py letterbox
306 | int wpad = (w + 31) / 32 * 32 - w;
307 | int hpad = (h + 31) / 32 * 32 - h;
308 | ncnn::Mat in_pad;
309 | ncnn::copy_make_border(in, in_pad, hpad / 2, hpad - hpad / 2, wpad / 2, wpad - wpad / 2, ncnn::BORDER_CONSTANT, 114.f);
310 | 
311 | const float norm_vals[3] = {1 / 255.f, 1 / 255.f, 1 / 255.f};
312 | in_pad.substract_mean_normalize(0, norm_vals);
313 | 
314 | ncnn::Extractor ex = yolov5.create_extractor();
315 | 
316 | ex.input("images", in_pad);
317 | 
318 | std::vector<Object> proposals;
319 | 
320 | // anchor setting from yolov5/models/yolov5s.yaml
321 | 
322 | // stride 8
323 | {
324 | ncnn::Mat out;
325 | ex.extract("output", out);
326 | 
327 | ncnn::Mat anchors(6);
328 | anchors[0] = 10.f;
329 | anchors[1] = 13.f;
330 | anchors[2] = 16.f;
331 | anchors[3] = 30.f;
332 | anchors[4] = 33.f;
333 | anchors[5] = 23.f;
334 | 
335 | std::vector<Object> objects8;
336 | generate_proposals(anchors, 8, in_pad, out, prob_threshold, objects8);
337 | 
338 | proposals.insert(proposals.end(), objects8.begin(), objects8.end());
339 | }
340 | 
341 | // stride 16
342 | {
343 | ncnn::Mat out;
344 | ex.extract("781", out);
345 | 
346 | ncnn::Mat anchors(6);
347 | anchors[0] = 30.f;
348 | anchors[1] = 61.f;
349 | anchors[2] = 62.f;
350 | anchors[3] = 45.f;
351 | anchors[4] = 59.f;
352 | anchors[5] = 119.f;
353 | 
354 | std::vector<Object> objects16;
355 | generate_proposals(anchors, 16, in_pad, out, prob_threshold, objects16);
356 | 
357 | proposals.insert(proposals.end(), objects16.begin(), objects16.end());
358 | }
359 | 
360 | // stride 32
361 | {
362 | ncnn::Mat out;
363 | ex.extract("801", out);
364 | 
365 | ncnn::Mat anchors(6);
366 | anchors[0] = 116.f;
367 | anchors[1] = 90.f;
368 | anchors[2] = 156.f;
369 | anchors[3] = 198.f;
370 | anchors[4] = 373.f;
371 | anchors[5] = 326.f;
372 | 
373 | std::vector<Object> objects32;
374 | generate_proposals(anchors, 32, in_pad, out, prob_threshold, objects32);
375 | 
376 | proposals.insert(proposals.end(), objects32.begin(), objects32.end());
377 | }
378 | 
379 | // sort all proposals by score from highest to lowest
380 | qsort_descent_inplace(proposals);
381 | 
382 | // apply nms with nms_threshold
383 | std::vector<int> picked;
384 | nms_sorted_bboxes(proposals, picked, nms_threshold);
385 | 
386 | int count = picked.size();
387 | 
388 | objects.resize(count);
389 | for (int i = 0; i < count; i++)
390 | {
391 | objects[i] = proposals[picked[i]];
392 | 
393 | // adjust offset to original unpadded
394 | float x0 = (objects[i].rect.x - (wpad / 2)) / scale;
395 | float y0 = (objects[i].rect.y - (hpad / 2)) / scale;
396 | float x1 = (objects[i].rect.x + objects[i].rect.width - (wpad / 2)) / scale;
397 | float y1 = (objects[i].rect.y + objects[i].rect.height - (hpad / 2)) / scale;
398 | 
399 | // clip
400 | x0 = std::max(std::min(x0, (float)(img_w - 1)), 0.f);
401 | y0 = std::max(std::min(y0, (float)(img_h - 1)), 0.f);
402 | x1 = std::max(std::min(x1, (float)(img_w - 1)), 0.f);
403 | y1 = std::max(std::min(y1, (float)(img_h - 1)), 0.f);
404 | 
405 | objects[i].rect.x = x0;
406 | objects[i].rect.y = y0;
407 | objects[i].rect.width = x1 - x0;
408 | objects[i].rect.height = y1 - y0;
409 | }
410 | 
411 | return 0;
412 | }
413 | 
414 | static void draw_objects(cv::Mat& bgr, const std::vector<Object>& objects)
415 | {
416 | for (size_t i = 0; i < objects.size(); i++)
417 | {
418 | const Object& obj = objects[i];
419 | 
420 | // fprintf(stderr, "%d = %.5f at %.2f %.2f %.2f x %.2f\n", obj.label, obj.prob,
421 | // obj.rect.x, obj.rect.y, obj.rect.width, obj.rect.height);
422 | 
423 | cv::rectangle(bgr, obj.rect, cv::Scalar(255, 0, 0));
424 | 
425 | char text[256];
426 | sprintf(text, "%s %.1f%%", class_names[obj.label], obj.prob * 100);
427 | 
428 | int baseLine = 0;
429 | cv::Size label_size = cv::getTextSize(text, cv::FONT_HERSHEY_SIMPLEX, 0.5, 1, &baseLine);
430 | 
431 | int x = obj.rect.x;
432 | int y = obj.rect.y - label_size.height - baseLine;
433 | if (y < 0)
434 | y = 0;
435 | if (x + label_size.width > bgr.cols)
436 | x = bgr.cols - label_size.width;
437 | 
438 | cv::rectangle(bgr, cv::Rect(cv::Point(x, y), cv::Size(label_size.width, label_size.height + baseLine)),
439 | cv::Scalar(255, 255, 255), -1);
440 | 
441 | cv::putText(bgr, text, cv::Point(x, y + label_size.height),
442 | cv::FONT_HERSHEY_SIMPLEX, 0.5, cv::Scalar(0, 0, 0));
443 | }
444 | }
445 | 
446 | int main(int argc, char** argv)
447 | {
448 | if (argc != 2)
449 | {
450 | fprintf(stderr, "Usage: %s [imagepath]\n", argv[0]);
451 | return -1;
452 | }
453 | 
454 | const char* imagepath = argv[1];
455 | 
456 | cv::Mat m = cv::imread(imagepath, 1);
457 | if (m.empty())
458 | {
459 | fprintf(stderr, "cv::imread %s failed\n", imagepath);
460 | return -1;
461 | }
462 | 
463 | yolov5.register_custom_layer("YoloV5Focus", YoloV5Focus_layer_creator);
464 | 
465 | // original pretrained model from https://github.com/ultralytics/yolov5
466 | // the ncnn model https://github.com/nihui/ncnn-assets/tree/master/models
467 | yolov5.load_param("yolov5s.param");
468 | yolov5.load_model("yolov5s.bin");
469 | yolov5.opt.num_threads = 4;
470 | 
471 | std::vector<Object> objects;
472 | detect_yolov5(m, objects);
473 | draw_objects(m, objects);
474 | 
475 | cv::imshow("RPi4 - 1.95 GHz - 2 GB ram", m);
476 | // cv::imwrite("test.jpg",m);
477 | cv::waitKey(0);
478 | 
479 | return 0;
480 | }
481 | -------------------------------------------------------------------------------- /yolov5s.bin: --------------------------------------------------------------------------------
https://raw.githubusercontent.com/Qengineering/YoloV5-ncnn-Raspberry-Pi-4/134933aa72247a41bf598ef38b1476d11d003f28/yolov5s.bin -------------------------------------------------------------------------------- /yolov5s.param: -------------------------------------------------------------------------------- 1 | 7767517 2 | 192 216 3 | Input images 0 1 images 4 | YoloV5Focus focus 1 1 images 207 5 | Convolution Conv_41 1 1 207 208 0=32 1=3 4=1 5=1 6=3456 6 | HardSwish Div_49 1 1 208 216 0=1.666667e-01 7 | Convolution Conv_50 1 1 216 217 0=64 1=3 3=2 4=1 5=1 6=18432 8 | HardSwish Div_58 1 1 217 225 0=1.666667e-01 9 | Split splitncnn_0 1 2 225 225_splitncnn_0 225_splitncnn_1 10 | Convolution Conv_59 1 1 225_splitncnn_1 226 0=32 1=1 5=1 6=2048 11 | HardSwish Div_67 1 1 226 234 0=1.666667e-01 12 | Split splitncnn_1 1 2 234 234_splitncnn_0 234_splitncnn_1 13 | Convolution Conv_68 1 1 234_splitncnn_1 235 0=32 1=1 5=1 6=1024 14 | HardSwish Div_76 1 1 235 243 0=1.666667e-01 15 | Convolution Conv_77 1 1 243 244 0=32 1=3 4=1 5=1 6=9216 16 | HardSwish Div_85 1 1 244 252 0=1.666667e-01 17 | BinaryOp Add_86 2 1 234_splitncnn_0 252 253 18 | Convolution Conv_87 1 1 253 254 0=32 1=1 6=1024 19 | Convolution Conv_88 1 1 225_splitncnn_0 255 0=32 1=1 6=2048 20 | Concat Concat_89 2 1 254 255 256 21 | BatchNorm BatchNormalization_90 1 1 256 257 0=64 22 | ReLU LeakyRelu_91 1 1 257 258 0=1.000000e-01 23 | Convolution Conv_92 1 1 258 259 0=64 1=1 5=1 6=4096 24 | HardSwish Div_100 1 1 259 267 0=1.666667e-01 25 | Convolution Conv_101 1 1 267 268 0=128 1=3 3=2 4=1 5=1 6=73728 26 | HardSwish Div_109 1 1 268 276 0=1.666667e-01 27 | Split splitncnn_2 1 2 276 276_splitncnn_0 276_splitncnn_1 28 | Convolution Conv_110 1 1 276_splitncnn_1 277 0=64 1=1 5=1 6=8192 29 | HardSwish Div_118 1 1 277 285 0=1.666667e-01 30 | Split splitncnn_3 1 2 285 285_splitncnn_0 285_splitncnn_1 31 | Convolution Conv_119 1 1 285_splitncnn_1 286 0=64 1=1 5=1 6=4096 32 | HardSwish Div_127 1 1 286 294 0=1.666667e-01 33 | Convolution Conv_128 1 1 294 295 0=64 1=3 4=1 5=1 6=36864 34 | HardSwish Div_136 1 1 295 303 0=1.666667e-01 35 | BinaryOp Add_137 2 1 285_splitncnn_0 303 304 36 | Split splitncnn_4 1 2 304 304_splitncnn_0 304_splitncnn_1 37 | Convolution Conv_138 1 1 304_splitncnn_1 305 0=64 1=1 5=1 6=4096 38 | HardSwish Div_146 1 1 305 313 0=1.666667e-01 39 | Convolution Conv_147 1 1 313 314 0=64 1=3 4=1 5=1 6=36864 40 | HardSwish Div_155 1 1 314 322 0=1.666667e-01 41 | BinaryOp Add_156 2 1 304_splitncnn_0 322 323 42 | Split splitncnn_5 1 2 323 323_splitncnn_0 323_splitncnn_1 43 | Convolution Conv_157 1 1 323_splitncnn_1 324 0=64 1=1 5=1 6=4096 44 | HardSwish Div_165 1 1 324 332 0=1.666667e-01 45 | Convolution Conv_166 1 1 332 333 0=64 1=3 4=1 5=1 6=36864 46 | HardSwish Div_174 1 1 333 341 0=1.666667e-01 47 | BinaryOp Add_175 2 1 323_splitncnn_0 341 342 48 | Convolution Conv_176 1 1 342 343 0=64 1=1 6=4096 49 | Convolution Conv_177 1 1 276_splitncnn_0 344 0=64 1=1 6=8192 50 | Concat Concat_178 2 1 343 344 345 51 | BatchNorm BatchNormalization_179 1 1 345 346 0=128 52 | ReLU LeakyRelu_180 1 1 346 347 0=1.000000e-01 53 | Convolution Conv_181 1 1 347 348 0=128 1=1 5=1 6=16384 54 | HardSwish Div_189 1 1 348 356 0=1.666667e-01 55 | Split splitncnn_6 1 2 356 356_splitncnn_0 356_splitncnn_1 56 | Convolution Conv_190 1 1 356_splitncnn_1 357 0=256 1=3 3=2 4=1 5=1 6=294912 57 | HardSwish Div_198 1 1 357 365 0=1.666667e-01 58 | Split splitncnn_7 1 2 365 365_splitncnn_0 365_splitncnn_1 59 | Convolution Conv_199 1 1 365_splitncnn_1 366 0=128 1=1 5=1 6=32768 60 | 
HardSwish Div_207 1 1 366 374 0=1.666667e-01 61 | Split splitncnn_8 1 2 374 374_splitncnn_0 374_splitncnn_1 62 | Convolution Conv_208 1 1 374_splitncnn_1 375 0=128 1=1 5=1 6=16384 63 | HardSwish Div_216 1 1 375 383 0=1.666667e-01 64 | Convolution Conv_217 1 1 383 384 0=128 1=3 4=1 5=1 6=147456 65 | HardSwish Div_225 1 1 384 392 0=1.666667e-01 66 | BinaryOp Add_226 2 1 374_splitncnn_0 392 393 67 | Split splitncnn_9 1 2 393 393_splitncnn_0 393_splitncnn_1 68 | Convolution Conv_227 1 1 393_splitncnn_1 394 0=128 1=1 5=1 6=16384 69 | HardSwish Div_235 1 1 394 402 0=1.666667e-01 70 | Convolution Conv_236 1 1 402 403 0=128 1=3 4=1 5=1 6=147456 71 | HardSwish Div_244 1 1 403 411 0=1.666667e-01 72 | BinaryOp Add_245 2 1 393_splitncnn_0 411 412 73 | Split splitncnn_10 1 2 412 412_splitncnn_0 412_splitncnn_1 74 | Convolution Conv_246 1 1 412_splitncnn_1 413 0=128 1=1 5=1 6=16384 75 | HardSwish Div_254 1 1 413 421 0=1.666667e-01 76 | Convolution Conv_255 1 1 421 422 0=128 1=3 4=1 5=1 6=147456 77 | HardSwish Div_263 1 1 422 430 0=1.666667e-01 78 | BinaryOp Add_264 2 1 412_splitncnn_0 430 431 79 | Convolution Conv_265 1 1 431 432 0=128 1=1 6=16384 80 | Convolution Conv_266 1 1 365_splitncnn_0 433 0=128 1=1 6=32768 81 | Concat Concat_267 2 1 432 433 434 82 | BatchNorm BatchNormalization_268 1 1 434 435 0=256 83 | ReLU LeakyRelu_269 1 1 435 436 0=1.000000e-01 84 | Convolution Conv_270 1 1 436 437 0=256 1=1 5=1 6=65536 85 | HardSwish Div_278 1 1 437 445 0=1.666667e-01 86 | Split splitncnn_11 1 2 445 445_splitncnn_0 445_splitncnn_1 87 | Convolution Conv_279 1 1 445_splitncnn_1 446 0=512 1=3 3=2 4=1 5=1 6=1179648 88 | HardSwish Div_287 1 1 446 454 0=1.666667e-01 89 | Convolution Conv_288 1 1 454 455 0=256 1=1 5=1 6=131072 90 | HardSwish Div_296 1 1 455 463 0=1.666667e-01 91 | Split splitncnn_12 1 4 463 463_splitncnn_0 463_splitncnn_1 463_splitncnn_2 463_splitncnn_3 92 | Pooling MaxPool_297 1 1 463_splitncnn_3 464 1=5 3=2 5=1 93 | Pooling MaxPool_298 1 1 463_splitncnn_2 465 1=9 3=4 5=1 94 | Pooling MaxPool_299 1 1 463_splitncnn_1 466 1=13 3=6 5=1 95 | Concat Concat_300 4 1 463_splitncnn_0 464 465 466 467 96 | Convolution Conv_301 1 1 467 468 0=512 1=1 5=1 6=524288 97 | HardSwish Div_309 1 1 468 476 0=1.666667e-01 98 | Split splitncnn_13 1 2 476 476_splitncnn_0 476_splitncnn_1 99 | Convolution Conv_310 1 1 476_splitncnn_1 477 0=256 1=1 5=1 6=131072 100 | HardSwish Div_318 1 1 477 485 0=1.666667e-01 101 | Convolution Conv_319 1 1 485 486 0=256 1=1 5=1 6=65536 102 | HardSwish Div_327 1 1 486 494 0=1.666667e-01 103 | Convolution Conv_328 1 1 494 495 0=256 1=3 4=1 5=1 6=589824 104 | HardSwish Div_336 1 1 495 503 0=1.666667e-01 105 | Convolution Conv_337 1 1 503 504 0=256 1=1 6=65536 106 | Convolution Conv_338 1 1 476_splitncnn_0 505 0=256 1=1 6=131072 107 | Concat Concat_339 2 1 504 505 506 108 | BatchNorm BatchNormalization_340 1 1 506 507 0=512 109 | ReLU LeakyRelu_341 1 1 507 508 0=1.000000e-01 110 | Convolution Conv_342 1 1 508 509 0=512 1=1 5=1 6=262144 111 | HardSwish Div_350 1 1 509 517 0=1.666667e-01 112 | Convolution Conv_351 1 1 517 518 0=256 1=1 5=1 6=131072 113 | HardSwish Div_359 1 1 518 526 0=1.666667e-01 114 | Split splitncnn_14 1 2 526 526_splitncnn_0 526_splitncnn_1 115 | Interp Resize_361 1 1 526_splitncnn_1 536 0=1 1=2.000000e+00 2=2.000000e+00 116 | Concat Concat_362 2 1 536 445_splitncnn_0 537 117 | Split splitncnn_15 1 2 537 537_splitncnn_0 537_splitncnn_1 118 | Convolution Conv_363 1 1 537_splitncnn_1 538 0=128 1=1 5=1 6=65536 119 | HardSwish Div_371 1 1 538 546 0=1.666667e-01 120 | 
Convolution Conv_372 1 1 546 547 0=128 1=1 5=1 6=16384 121 | HardSwish Div_380 1 1 547 555 0=1.666667e-01 122 | Convolution Conv_381 1 1 555 556 0=128 1=3 4=1 5=1 6=147456 123 | HardSwish Div_389 1 1 556 564 0=1.666667e-01 124 | Convolution Conv_390 1 1 564 565 0=128 1=1 6=16384 125 | Convolution Conv_391 1 1 537_splitncnn_0 566 0=128 1=1 6=65536 126 | Concat Concat_392 2 1 565 566 567 127 | BatchNorm BatchNormalization_393 1 1 567 568 0=256 128 | ReLU LeakyRelu_394 1 1 568 569 0=1.000000e-01 129 | Convolution Conv_395 1 1 569 570 0=256 1=1 5=1 6=65536 130 | HardSwish Div_403 1 1 570 578 0=1.666667e-01 131 | Convolution Conv_404 1 1 578 579 0=128 1=1 5=1 6=32768 132 | HardSwish Div_412 1 1 579 587 0=1.666667e-01 133 | Split splitncnn_16 1 2 587 587_splitncnn_0 587_splitncnn_1 134 | Interp Resize_414 1 1 587_splitncnn_1 597 0=1 1=2.000000e+00 2=2.000000e+00 135 | Concat Concat_415 2 1 597 356_splitncnn_0 598 136 | Split splitncnn_17 1 2 598 598_splitncnn_0 598_splitncnn_1 137 | Convolution Conv_416 1 1 598_splitncnn_1 599 0=64 1=1 5=1 6=16384 138 | HardSwish Div_424 1 1 599 607 0=1.666667e-01 139 | Convolution Conv_425 1 1 607 608 0=64 1=1 5=1 6=4096 140 | HardSwish Div_433 1 1 608 616 0=1.666667e-01 141 | Convolution Conv_434 1 1 616 617 0=64 1=3 4=1 5=1 6=36864 142 | HardSwish Div_442 1 1 617 625 0=1.666667e-01 143 | Convolution Conv_443 1 1 625 626 0=64 1=1 6=4096 144 | Convolution Conv_444 1 1 598_splitncnn_0 627 0=64 1=1 6=16384 145 | Concat Concat_445 2 1 626 627 628 146 | BatchNorm BatchNormalization_446 1 1 628 629 0=128 147 | ReLU LeakyRelu_447 1 1 629 630 0=1.000000e-01 148 | Convolution Conv_448 1 1 630 631 0=128 1=1 5=1 6=16384 149 | HardSwish Div_456 1 1 631 639 0=1.666667e-01 150 | Split splitncnn_18 1 2 639 639_splitncnn_0 639_splitncnn_1 151 | Convolution Conv_457 1 1 639_splitncnn_1 640 0=128 1=3 3=2 4=1 5=1 6=147456 152 | HardSwish Div_465 1 1 640 648 0=1.666667e-01 153 | Concat Concat_466 2 1 648 587_splitncnn_0 649 154 | Split splitncnn_19 1 2 649 649_splitncnn_0 649_splitncnn_1 155 | Convolution Conv_467 1 1 649_splitncnn_1 650 0=128 1=1 5=1 6=32768 156 | HardSwish Div_475 1 1 650 658 0=1.666667e-01 157 | Convolution Conv_476 1 1 658 659 0=128 1=1 5=1 6=16384 158 | HardSwish Div_484 1 1 659 667 0=1.666667e-01 159 | Convolution Conv_485 1 1 667 668 0=128 1=3 4=1 5=1 6=147456 160 | HardSwish Div_493 1 1 668 676 0=1.666667e-01 161 | Convolution Conv_494 1 1 676 677 0=128 1=1 6=16384 162 | Convolution Conv_495 1 1 649_splitncnn_0 678 0=128 1=1 6=32768 163 | Concat Concat_496 2 1 677 678 679 164 | BatchNorm BatchNormalization_497 1 1 679 680 0=256 165 | ReLU LeakyRelu_498 1 1 680 681 0=1.000000e-01 166 | Convolution Conv_499 1 1 681 682 0=256 1=1 5=1 6=65536 167 | HardSwish Div_507 1 1 682 690 0=1.666667e-01 168 | Split splitncnn_20 1 2 690 690_splitncnn_0 690_splitncnn_1 169 | Convolution Conv_508 1 1 690_splitncnn_1 691 0=256 1=3 3=2 4=1 5=1 6=589824 170 | HardSwish Div_516 1 1 691 699 0=1.666667e-01 171 | Concat Concat_517 2 1 699 526_splitncnn_0 700 172 | Split splitncnn_21 1 2 700 700_splitncnn_0 700_splitncnn_1 173 | Convolution Conv_518 1 1 700_splitncnn_1 701 0=256 1=1 5=1 6=131072 174 | HardSwish Div_526 1 1 701 709 0=1.666667e-01 175 | Convolution Conv_527 1 1 709 710 0=256 1=1 5=1 6=65536 176 | HardSwish Div_535 1 1 710 718 0=1.666667e-01 177 | Convolution Conv_536 1 1 718 719 0=256 1=3 4=1 5=1 6=589824 178 | HardSwish Div_544 1 1 719 727 0=1.666667e-01 179 | Convolution Conv_545 1 1 727 728 0=256 1=1 6=65536 180 | Convolution Conv_546 1 1 700_splitncnn_0 729 
0=256 1=1 6=131072 181 | Concat Concat_547 2 1 728 729 730 182 | BatchNorm BatchNormalization_548 1 1 730 731 0=512 183 | ReLU LeakyRelu_549 1 1 731 732 0=1.000000e-01 184 | Convolution Conv_550 1 1 732 733 0=512 1=1 5=1 6=262144 185 | HardSwish Div_558 1 1 733 741 0=1.666667e-01 186 | Convolution Conv_559 1 1 639_splitncnn_0 742 0=255 1=1 5=1 6=32640 187 | Reshape Reshape_573 1 1 742 760 0=-1 1=85 2=3 188 | Permute Transpose_574 1 1 760 output 0=1 189 | Convolution Conv_575 1 1 690_splitncnn_0 762 0=255 1=1 5=1 6=65280 190 | Reshape Reshape_589 1 1 762 780 0=-1 1=85 2=3 191 | Permute Transpose_590 1 1 780 781 0=1 192 | Convolution Conv_591 1 1 741 782 0=255 1=1 5=1 6=130560 193 | Reshape Reshape_605 1 1 782 800 0=-1 1=85 2=3 194 | Permute Transpose_606 1 1 800 801 0=1 195 | --------------------------------------------------------------------------------
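For orientation, the three Permute blobs declared at the very end of this graph ("output", "781" and "801") are the detection heads that yolov5.cpp reads back, one per stride. The fragment below simply mirrors the extractor calls already present in detect_yolov5():

```cpp
// Mirrors detect_yolov5() in yolov5.cpp; the blob names come from yolov5s.param.
ncnn::Extractor ex = yolov5.create_extractor();
ex.input("images", in_pad);        // "Input images" in the param file

ncnn::Mat out8, out16, out32;
ex.extract("output", out8);        // Transpose_574 -> stride 8 head
ex.extract("781", out16);          // Transpose_590 -> stride 16 head
ex.extract("801", out32);          // Transpose_606 -> stride 32 head
```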