├── LICENSE
├── README.md
├── YoloV5.cbp
├── busstop.jpg
├── parking.jpg
├── yolov5.cpp
├── yolov5s.bin
└── yolov5s.param
/LICENSE:
--------------------------------------------------------------------------------
1 | BSD 3-Clause License
2 |
3 | Copyright (c) 2021, Q-engineering
4 | All rights reserved.
5 |
6 | Redistribution and use in source and binary forms, with or without
7 | modification, are permitted provided that the following conditions are met:
8 |
9 | 1. Redistributions of source code must retain the above copyright notice, this
10 | list of conditions and the following disclaimer.
11 |
12 | 2. Redistributions in binary form must reproduce the above copyright notice,
13 | this list of conditions and the following disclaimer in the documentation
14 | and/or other materials provided with the distribution.
15 |
16 | 3. Neither the name of the copyright holder nor the names of its
17 | contributors may be used to endorse or promote products derived from
18 | this software without specific prior written permission.
19 |
20 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
21 | AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
22 | IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
23 | DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
24 | FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
25 | DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
26 | SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
27 | CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
28 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
29 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
30 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # YoloV5 Raspberry Pi 4
2 | 
3 | ## YoloV5 with the ncnn framework.
4 | [License: BSD 3-Clause](https://opensource.org/licenses/BSD-3-Clause)
5 | Article: https://towardsdatascience.com/yolo-v5-is-here-b668ce2a4908
6 | Specially made for a bare Raspberry Pi 4, see [Q-engineering deep learning examples](https://qengineering.eu/deep-learning-examples-on-raspberry-32-64-os.html)
7 |
8 | ------------
9 |
10 | ## Benchmark.
11 | Numbers are in **FPS** and reflect only the inference time. Grabbing frames, post-processing and drawing are not taken into account.
12 |
13 | | Model | size | mAP | Jetson Nano | RPi 4 @ 1950 MHz | RPi 5 @ 2900 MHz | Rock 5 | RK3588¹ NPU | RK3566/68² NPU | Nano TensorRT | Orin TensorRT |
14 | | ------------- | :-----: | :-----: | :-------------: | :-------------: | :-----: | :-----: | :-------------: | :-------------: | :-----: | :-----: |
15 | | [NanoDet](https://github.com/Qengineering/NanoDet-ncnn-Raspberry-Pi-4) | 320x320 | 20.6 | 26.2 | 13.0 | 43.2 |36.0 |||||
16 | | [NanoDet Plus](https://github.com/Qengineering/NanoDetPlus-ncnn-Raspberry-Pi-4) | 416x416 | 30.4 | 18.5 | 5.0 | 30.0 | 24.9 |||||
17 | | [PP-PicoDet](https://github.com/Qengineering/PP-PicoDet-ncnn-Raspberry-Pi-4) | 320x320 | 27.0 | 24.0 | 7.5 | 53.7 | 46.7 |||||
18 | | [YoloFastestV2](https://github.com/Qengineering/YoloFastestV2-ncnn-Raspberry-Pi-4) | 352x352 | 24.1 | 38.4 | 18.8 | 78.5 | 65.4 | ||||
19 | | [YoloV2](https://github.com/Qengineering/YoloV2-ncnn-Raspberry-Pi-4)²⁰ | 416x416 | 19.2 | 10.1 | 3.0 | 24.0 | 20.0 | ||||
20 | | [YoloV3](https://github.com/Qengineering/YoloV3-ncnn-Raspberry-Pi-4)²⁰ | 352x352 tiny | 16.6 | 17.7 | 4.4 | 18.1 | 15.0 | ||||
21 | | [YoloV4](https://github.com/Qengineering/YoloV4-ncnn-Raspberry-Pi-4) | 416x416 tiny | 21.7 | 16.1 | 3.4 | 17.5 | 22.4 | ||||
22 | | [YoloV4](https://github.com/Qengineering/YoloV4-ncnn-Raspberry-Pi-4) | 608x608 full | 45.3 | 1.3 | 0.2 | 1.82 | 1.5 | ||||
23 | | [YoloV5](https://github.com/Qengineering/YoloV5-ncnn-Raspberry-Pi-4) | 640x640 nano | 22.5 | 5.0 | 1.6 | 13.6 | 12.5 | 58.8 | 14.8 | 19.0 | 100 |
24 | | [YoloV5](https://github.com/Qengineering/YoloV5-ncnn-Raspberry-Pi-4) | 640x640 small | 22.5 | 5.0 | 1.6 | 6.3 | 12.5 | 37.7 | 11.7 | 9.25 | 100 |
25 | | [YoloV6](https://github.com/Qengineering/YoloV6-ncnn-Raspberry-Pi-4) | 640x640 nano | 35.0 | 10.5 | 2.7 | 15.8 | 20.8 | 63.0 | 18.0 |||
26 | | [YoloV7](https://github.com/Qengineering/YoloV5-ncnn-Raspberry-Pi-4) | 640x640 tiny | 38.7 | 8.5 | 2.1 | 14.4 | 17.9 | 53.4 | 16.1 | 15.0 ||
27 | | [YoloV8](https://github.com/Qengineering/YoloV8-ncnn-Raspberry-Pi-4) | 640x640 nano | 37.3 | 14.5 | 3.1 | 20.0 | 16.3 | 53.1 | 18.2 |||
28 | | [YoloV8](https://github.com/Qengineering/YoloV8-ncnn-Raspberry-Pi-4) | 640x640 small | 44.9 | 4.5 | 1.47 | 11.0 | 9.2 | 28.5 | 8.9 |||
29 | | [YoloV9](https://github.com/Qengineering/YoloV9-ncnn-Raspberry-Pi-4) | 640x640 comp | 53.0 | 1.2 | 0.28 | 1.5 | 1.2 | ||||
30 | | [YoloX](https://github.com/Qengineering/YoloX-ncnn-Raspberry-Pi-4) | 416x416 nano | 25.8 | 22.6 | 7.0 | 38.6 | 28.5 | ||||
31 | | [YoloX](https://github.com/Qengineering/YoloX-ncnn-Raspberry-Pi-4) | 416x416 tiny | 32.8 | 11.35 | 2.8 | 17.2 | 18.1 | ||||
32 | | [YoloX](https://github.com/Qengineering/YoloX-ncnn-Raspberry-Pi-4) | 640x640 small | 40.5 | 3.65 | 0.9 | 4.5 | 7.5 | 30.0 | 10.0 |||
33 |
34 | ¹ The Rock 5 and Orange Pi 5 have the RK3588 on board.
35 | ² The Rock 3, Radxa Zero 3 and Orange Pi 3B have the RK3566 on board.
36 | ²⁰ Recognizes 20 objects (VOC) instead of 80 (COCO).
37 |
38 | ------------
39 |
40 | ## Dependencies.
41 | To run the application, you need the following (a quick check is sketched below the list):
42 | - A Raspberry Pi 4 with a 32 or 64-bit operating system. It can be the Raspberry 64-bit OS, or Ubuntu 18.04 / 20.04. [Install 64-bit OS](https://qengineering.eu/install-raspberry-64-os.html)
43 | - The Tencent ncnn framework installed. [Install ncnn](https://qengineering.eu/install-ncnn-on-raspberry-pi-4.html)
44 | - OpenCV 64-bit installed. [Install OpenCV 4.5](https://qengineering.eu/install-opencv-4.5-on-raspberry-64-os.html)
45 | - Code::Blocks installed. (```$ sudo apt-get install codeblocks```)
46 |
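A quick way to verify the dependencies are in place. The paths below are assumptions based on the default install locations used in the guides above; yours may differ:
```
# prints the installed OpenCV version if the pkg-config file was generated during the build
pkg-config --modversion opencv4

# the ncnn headers should be here after a default 'make install'
ls /usr/local/include/ncnn/net.h

# libncnn.a location varies: /usr/local/lib or /usr/local/lib/ncnn, depending on the install
ls /usr/local/lib/libncnn.a /usr/local/lib/ncnn/libncnn.a
```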
47 | ------------
48 |
49 | ## Installing the app.
50 | To extract and run the network in Code::Blocks:
51 | $ mkdir *MyDir*
52 | $ cd *MyDir*
53 | $ wget https://github.com/Qengineering/YoloV5-ncnn-Raspberry-Pi-4/archive/refs/heads/main.zip
54 | $ unzip -j main.zip
55 | Remove main.zip, LICENSE and README.md as they are no longer needed.
56 | $ rm main.zip
57 | $ rm LICENSE
58 | $ rm README.md
59 | Your *MyDir* folder must now look like this:
60 | parking.jpg
61 | busstop.jpg
62 | YoloV5.cbp
63 | yolov5.cpp
64 | yolov5s.bin
65 | yolov5s.param
66 |
67 | ------------
68 |
69 | ## Running the app.
70 | To run the application, load the project file YoloV5.cbp in Code::Blocks. For more info, or
71 | if you want to connect a camera to the app, follow the instructions at [Hands-On](https://qengineering.eu/deep-learning-examples-on-raspberry-32-64-os.html#HandsOn).
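If you would rather build from the command line than use the Code::Blocks project, a sketch along these lines should work. The include and library paths are assumptions; adjust them to wherever ncnn and OpenCV ended up on your system:
```
# sketch only: adjust the -I/-L paths to your ncnn install
g++ -O3 -fopenmp yolov5.cpp -o YoloV5 \
    -I/usr/local/include/ncnn \
    -L/usr/local/lib -L/usr/local/lib/ncnn -lncnn \
    $(pkg-config --cflags --libs opencv4)

# yolov5s.param and yolov5s.bin must be in the working directory
./YoloV5 parking.jpg
```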
72 | Many thanks to [nihui](https://github.com/nihui/) again!
73 | 
74 |
75 | ------------
76 |
77 | [Donate via PayPal](https://www.paypal.com/cgi-bin/webscr?cmd=_s-xclick&hosted_button_id=CPZTM5BB3FCYL)
78 |
79 |
80 |
--------------------------------------------------------------------------------
/YoloV5.cbp:
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
/busstop.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Qengineering/YoloV5-ncnn-Raspberry-Pi-4/134933aa72247a41bf598ef38b1476d11d003f28/busstop.jpg
--------------------------------------------------------------------------------
/parking.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Qengineering/YoloV5-ncnn-Raspberry-Pi-4/134933aa72247a41bf598ef38b1476d11d003f28/parking.jpg
--------------------------------------------------------------------------------
/yolov5.cpp:
--------------------------------------------------------------------------------
1 | // Tencent is pleased to support the open source community by making ncnn available.
2 | //
3 | // Copyright (C) 2020 THL A29 Limited, a Tencent company. All rights reserved.
4 | //
5 | // Licensed under the BSD 3-Clause License (the "License"); you may not use this file except
6 | // in compliance with the License. You may obtain a copy of the License at
7 | //
8 | // https://opensource.org/licenses/BSD-3-Clause
9 | //
10 | // Unless required by applicable law or agreed to in writing, software distributed
11 | // under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR
12 | // CONDITIONS OF ANY KIND, either express or implied. See the License for the
13 | // specific language governing permissions and limitations under the License.
14 |
15 | // modified 12-31-2021 Q-engineering
16 |
17 | #include "layer.h"
18 | #include "net.h"
19 |
20 | #include <opencv2/core/core.hpp>
21 | #include <opencv2/highgui/highgui.hpp>
22 | #include <opencv2/imgproc/imgproc.hpp>
23 | #include <float.h>
24 | #include <stdio.h>
25 | #include <vector>

26 | ncnn::Net yolov5;
27 |
28 | const int target_size = 640;
29 | const float prob_threshold = 0.25f;
30 | const float nms_threshold = 0.45f;
31 | const float norm_vals[3] = {1 / 255.f, 1 / 255.f, 1 / 255.f};
32 |
33 | const char* class_names[] = {
34 | "person", "bicycle", "car", "motorcycle", "airplane", "bus", "train", "truck", "boat", "traffic light",
35 | "fire hydrant", "stop sign", "parking meter", "bench", "bird", "cat", "dog", "horse", "sheep", "cow",
36 | "elephant", "bear", "zebra", "giraffe", "backpack", "umbrella", "handbag", "tie", "suitcase", "frisbee",
37 | "skis", "snowboard", "sports ball", "kite", "baseball bat", "baseball glove", "skateboard", "surfboard",
38 | "tennis racket", "bottle", "wine glass", "cup", "fork", "knife", "spoon", "bowl", "banana", "apple",
39 | "sandwich", "orange", "broccoli", "carrot", "hot dog", "pizza", "donut", "cake", "chair", "couch",
40 | "potted plant", "bed", "dining table", "toilet", "tv", "laptop", "mouse", "remote", "keyboard", "cell phone",
41 | "microwave", "oven", "toaster", "sink", "refrigerator", "book", "clock", "vase", "scissors", "teddy bear",
42 | "hair drier", "toothbrush"
43 | };
44 |
45 |
46 | class YoloV5Focus : public ncnn::Layer
47 | {
48 | public:
49 | YoloV5Focus()
50 | {
51 | one_blob_only = true;
52 | }
53 |
54 | virtual int forward(const ncnn::Mat& bottom_blob, ncnn::Mat& top_blob, const ncnn::Option& opt) const
55 | {
56 | int w = bottom_blob.w;
57 | int h = bottom_blob.h;
58 | int channels = bottom_blob.c;
59 |
60 | int outw = w / 2;
61 | int outh = h / 2;
62 | int outc = channels * 4;
63 |
64 | top_blob.create(outw, outh, outc, 4u, 1, opt.blob_allocator);
65 | if (top_blob.empty())
66 | return -100;
67 |
68 | #pragma omp parallel for num_threads(opt.num_threads)
69 | for (int p = 0; p < outc; p++)
70 | {
71 | const float* ptr = bottom_blob.channel(p % channels).row((p / channels) % 2) + ((p / channels) / 2);
72 | float* outptr = top_blob.channel(p);
73 |
74 | for (int i = 0; i < outh; i++)
75 | {
76 | for (int j = 0; j < outw; j++)
77 | {
78 | *outptr = *ptr;
79 |
80 | outptr += 1;
81 | ptr += 2;
82 | }
83 |
84 | ptr += w;
85 | }
86 | }
87 |
88 | return 0;
89 | }
90 | };
91 |
92 | DEFINE_LAYER_CREATOR(YoloV5Focus)
93 |
94 | struct Object
95 | {
96 | cv::Rect_<float> rect;
97 | int label;
98 | float prob;
99 | };
100 |
101 | static inline float intersection_area(const Object& a, const Object& b)
102 | {
103 | cv::Rect_<float> inter = a.rect & b.rect;
104 | return inter.area();
105 | }
106 |
107 | static void qsort_descent_inplace(std::vector<Object>& faceobjects, int left, int right)
108 | {
109 | int i = left;
110 | int j = right;
111 | float p = faceobjects[(left + right) / 2].prob;
112 |
113 | while (i <= j)
114 | {
115 | while (faceobjects[i].prob > p)
116 | i++;
117 |
118 | while (faceobjects[j].prob < p)
119 | j--;
120 |
121 | if (i <= j)
122 | {
123 | // swap
124 | std::swap(faceobjects[i], faceobjects[j]);
125 |
126 | i++;
127 | j--;
128 | }
129 | }
130 |
131 | #pragma omp parallel sections
132 | {
133 | #pragma omp section
134 | {
135 | if (left < j) qsort_descent_inplace(faceobjects, left, j);
136 | }
137 | #pragma omp section
138 | {
139 | if (i < right) qsort_descent_inplace(faceobjects, i, right);
140 | }
141 | }
142 | }
143 |
144 | static void qsort_descent_inplace(std::vector<Object>& faceobjects)
145 | {
146 | if (faceobjects.empty())
147 | return;
148 |
149 | qsort_descent_inplace(faceobjects, 0, faceobjects.size() - 1);
150 | }
151 |
152 | static void nms_sorted_bboxes(const std::vector<Object>& faceobjects, std::vector<int>& picked, float nms_threshold)
153 | {
154 | picked.clear();
155 |
156 | const int n = faceobjects.size();
157 |
158 | std::vector<float> areas(n);
159 | for (int i = 0; i < n; i++)
160 | {
161 | areas[i] = faceobjects[i].rect.area();
162 | }
163 |
164 | for (int i = 0; i < n; i++)
165 | {
166 | const Object& a = faceobjects[i];
167 |
168 | int keep = 1;
169 | for (int j = 0; j < (int)picked.size(); j++)
170 | {
171 | const Object& b = faceobjects[picked[j]];
172 |
173 | // intersection over union
174 | float inter_area = intersection_area(a, b);
175 | float union_area = areas[i] + areas[picked[j]] - inter_area;
176 | // float IoU = inter_area / union_area
177 | if (inter_area / union_area > nms_threshold)
178 | keep = 0;
179 | }
180 |
181 | if (keep)
182 | picked.push_back(i);
183 | }
184 | }
185 |
186 | static inline float sigmoid(float x)
187 | {
188 | return static_cast<float>(1.f / (1.f + exp(-x)));
189 | }
190 |
191 | static void generate_proposals(const ncnn::Mat& anchors, int stride, const ncnn::Mat& in_pad, const ncnn::Mat& feat_blob, float prob_threshold, std::vector<Object>& objects)
192 | {
193 | const int num_grid = feat_blob.h;
194 |
195 | int num_grid_x;
196 | int num_grid_y;
197 | if (in_pad.w > in_pad.h)
198 | {
199 | num_grid_x = in_pad.w / stride;
200 | num_grid_y = num_grid / num_grid_x;
201 | }
202 | else
203 | {
204 | num_grid_y = in_pad.h / stride;
205 | num_grid_x = num_grid / num_grid_y;
206 | }
207 |
208 | const int num_class = feat_blob.w - 5;
209 |
210 | const int num_anchors = anchors.w / 2;
211 |
212 | for (int q = 0; q < num_anchors; q++)
213 | {
214 | const float anchor_w = anchors[q * 2];
215 | const float anchor_h = anchors[q * 2 + 1];
216 |
217 | const ncnn::Mat feat = feat_blob.channel(q);
218 |
219 | for (int i = 0; i < num_grid_y; i++)
220 | {
221 | for (int j = 0; j < num_grid_x; j++)
222 | {
223 | const float* featptr = feat.row(i * num_grid_x + j);
224 |
225 | // find class index with max class score
226 | int class_index = 0;
227 | float class_score = -FLT_MAX;
228 | for (int k = 0; k < num_class; k++)
229 | {
230 | float score = featptr[5 + k];
231 | if (score > class_score)
232 | {
233 | class_index = k;
234 | class_score = score;
235 | }
236 | }
237 |
238 | float box_score = featptr[4];
239 |
240 | float confidence = sigmoid(box_score) * sigmoid(class_score);
241 |
242 | if (confidence >= prob_threshold)
243 | {
244 | // yolov5/models/yolo.py Detect forward
245 | // y = x[i].sigmoid()
246 | // y[..., 0:2] = (y[..., 0:2] * 2. - 0.5 + self.grid[i].to(x[i].device)) * self.stride[i] # xy
247 | // y[..., 2:4] = (y[..., 2:4] * 2) ** 2 * self.anchor_grid[i] # wh
248 |
249 | float dx = sigmoid(featptr[0]);
250 | float dy = sigmoid(featptr[1]);
251 | float dw = sigmoid(featptr[2]);
252 | float dh = sigmoid(featptr[3]);
253 |
254 | float pb_cx = (dx * 2.f - 0.5f + j) * stride;
255 | float pb_cy = (dy * 2.f - 0.5f + i) * stride;
256 |
257 | float pb_w = pow(dw * 2.f, 2) * anchor_w;
258 | float pb_h = pow(dh * 2.f, 2) * anchor_h;
259 |
260 | float x0 = pb_cx - pb_w * 0.5f;
261 | float y0 = pb_cy - pb_h * 0.5f;
262 | float x1 = pb_cx + pb_w * 0.5f;
263 | float y1 = pb_cy + pb_h * 0.5f;
264 |
265 | Object obj;
266 | obj.rect.x = x0;
267 | obj.rect.y = y0;
268 | obj.rect.width = x1 - x0;
269 | obj.rect.height = y1 - y0;
270 | obj.label = class_index;
271 | obj.prob = confidence;
272 |
273 | objects.push_back(obj);
274 | }
275 | }
276 | }
277 | }
278 | }
279 |
280 | static int detect_yolov5(const cv::Mat& bgr, std::vector<Object>& objects)
281 | {
282 | int img_w = bgr.cols;
283 | int img_h = bgr.rows;
284 |
285 | // letterbox pad to multiple of 32
286 | int w = img_w;
287 | int h = img_h;
288 | float scale = 1.f;
289 | if (w > h)
290 | {
291 | scale = (float)target_size / w;
292 | w = target_size;
293 | h = h * scale;
294 | }
295 | else
296 | {
297 | scale = (float)target_size / h;
298 | h = target_size;
299 | w = w * scale;
300 | }
301 |
302 | ncnn::Mat in = ncnn::Mat::from_pixels_resize(bgr.data, ncnn::Mat::PIXEL_BGR2RGB, img_w, img_h, w, h);
303 |
304 | // pad to target_size rectangle
305 | // yolov5/utils/datasets.py letterbox
306 | int wpad = (w + 31) / 32 * 32 - w;
307 | int hpad = (h + 31) / 32 * 32 - h;
308 | ncnn::Mat in_pad;
309 | ncnn::copy_make_border(in, in_pad, hpad / 2, hpad - hpad / 2, wpad / 2, wpad - wpad / 2, ncnn::BORDER_CONSTANT, 114.f);
310 |
311 | const float norm_vals[3] = {1 / 255.f, 1 / 255.f, 1 / 255.f};
312 | in_pad.substract_mean_normalize(0, norm_vals);
313 |
314 | ncnn::Extractor ex = yolov5.create_extractor();
315 |
316 | ex.input("images", in_pad);
317 |
318 | std::vector<Object> proposals;
319 |
320 | // anchor setting from yolov5/models/yolov5s.yaml
321 |
322 | // stride 8
323 | {
324 | ncnn::Mat out;
325 | ex.extract("output", out);
326 |
327 | ncnn::Mat anchors(6);
328 | anchors[0] = 10.f;
329 | anchors[1] = 13.f;
330 | anchors[2] = 16.f;
331 | anchors[3] = 30.f;
332 | anchors[4] = 33.f;
333 | anchors[5] = 23.f;
334 |
335 | std::vector<Object> objects8;
336 | generate_proposals(anchors, 8, in_pad, out, prob_threshold, objects8);
337 |
338 | proposals.insert(proposals.end(), objects8.begin(), objects8.end());
339 | }
340 |
341 | // stride 16
342 | {
343 | ncnn::Mat out;
344 | ex.extract("781", out);
345 |
346 | ncnn::Mat anchors(6);
347 | anchors[0] = 30.f;
348 | anchors[1] = 61.f;
349 | anchors[2] = 62.f;
350 | anchors[3] = 45.f;
351 | anchors[4] = 59.f;
352 | anchors[5] = 119.f;
353 |
354 | std::vector<Object> objects16;
355 | generate_proposals(anchors, 16, in_pad, out, prob_threshold, objects16);
356 |
357 | proposals.insert(proposals.end(), objects16.begin(), objects16.end());
358 | }
359 |
360 | // stride 32
361 | {
362 | ncnn::Mat out;
363 | ex.extract("801", out);
364 |
365 | ncnn::Mat anchors(6);
366 | anchors[0] = 116.f;
367 | anchors[1] = 90.f;
368 | anchors[2] = 156.f;
369 | anchors[3] = 198.f;
370 | anchors[4] = 373.f;
371 | anchors[5] = 326.f;
372 |
373 | std::vector<Object> objects32;
374 | generate_proposals(anchors, 32, in_pad, out, prob_threshold, objects32);
375 |
376 | proposals.insert(proposals.end(), objects32.begin(), objects32.end());
377 | }
378 |
379 | // sort all proposals by score from highest to lowest
380 | qsort_descent_inplace(proposals);
381 |
382 | // apply nms with nms_threshold
383 | std::vector<int> picked;
384 | nms_sorted_bboxes(proposals, picked, nms_threshold);
385 |
386 | int count = picked.size();
387 |
388 | objects.resize(count);
389 | for (int i = 0; i < count; i++)
390 | {
391 | objects[i] = proposals[picked[i]];
392 |
393 | // adjust offset to original unpadded
394 | float x0 = (objects[i].rect.x - (wpad / 2)) / scale;
395 | float y0 = (objects[i].rect.y - (hpad / 2)) / scale;
396 | float x1 = (objects[i].rect.x + objects[i].rect.width - (wpad / 2)) / scale;
397 | float y1 = (objects[i].rect.y + objects[i].rect.height - (hpad / 2)) / scale;
398 |
399 | // clip
400 | x0 = std::max(std::min(x0, (float)(img_w - 1)), 0.f);
401 | y0 = std::max(std::min(y0, (float)(img_h - 1)), 0.f);
402 | x1 = std::max(std::min(x1, (float)(img_w - 1)), 0.f);
403 | y1 = std::max(std::min(y1, (float)(img_h - 1)), 0.f);
404 |
405 | objects[i].rect.x = x0;
406 | objects[i].rect.y = y0;
407 | objects[i].rect.width = x1 - x0;
408 | objects[i].rect.height = y1 - y0;
409 | }
410 |
411 | return 0;
412 | }
413 |
414 | static void draw_objects(cv::Mat& bgr, const std::vector<Object>& objects)
415 | {
416 | for (size_t i = 0; i < objects.size(); i++)
417 | {
418 | const Object& obj = objects[i];
419 |
420 | // fprintf(stderr, "%d = %.5f at %.2f %.2f %.2f x %.2f\n", obj.label, obj.prob,
421 | // obj.rect.x, obj.rect.y, obj.rect.width, obj.rect.height);
422 |
423 | cv::rectangle(bgr, obj.rect, cv::Scalar(255, 0, 0));
424 |
425 | char text[256];
426 | sprintf(text, "%s %.1f%%", class_names[obj.label], obj.prob * 100);
427 |
428 | int baseLine = 0;
429 | cv::Size label_size = cv::getTextSize(text, cv::FONT_HERSHEY_SIMPLEX, 0.5, 1, &baseLine);
430 |
431 | int x = obj.rect.x;
432 | int y = obj.rect.y - label_size.height - baseLine;
433 | if (y < 0)
434 | y = 0;
435 | if (x + label_size.width > bgr.cols)
436 | x = bgr.cols - label_size.width;
437 |
438 | cv::rectangle(bgr, cv::Rect(cv::Point(x, y), cv::Size(label_size.width, label_size.height + baseLine)),
439 | cv::Scalar(255, 255, 255), -1);
440 |
441 | cv::putText(bgr, text, cv::Point(x, y + label_size.height),
442 | cv::FONT_HERSHEY_SIMPLEX, 0.5, cv::Scalar(0, 0, 0));
443 | }
444 | }
445 |
446 | int main(int argc, char** argv)
447 | {
448 | if (argc != 2)
449 | {
450 | fprintf(stderr, "Usage: %s [imagepath]\n", argv[0]);
451 | return -1;
452 | }
453 |
454 | const char* imagepath = argv[1];
455 |
456 | cv::Mat m = cv::imread(imagepath, 1);
457 | if (m.empty())
458 | {
459 | fprintf(stderr, "cv::imread %s failed\n", imagepath);
460 | return -1;
461 | }
462 |
463 | yolov5.register_custom_layer("YoloV5Focus", YoloV5Focus_layer_creator);
464 |
465 | // original pretrained model from https://github.com/ultralytics/yolov5
466 | // the ncnn model https://github.com/nihui/ncnn-assets/tree/master/models
467 | yolov5.load_param("yolov5s.param");
468 | yolov5.load_model("yolov5s.bin");
469 | yolov5.opt.num_threads=4;
470 |
471 | std::vector<Object> objects;
472 | detect_yolov5(m, objects);
473 | draw_objects(m, objects);
474 |
475 | cv::imshow("RPi4 - 1.95 GHz - 2 GB ram",m);
476 | // cv::imwrite("test.jpg",m);
477 | cv::waitKey(0);
478 |
479 | return 0;
480 | }
481 |
--------------------------------------------------------------------------------
/yolov5s.bin:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Qengineering/YoloV5-ncnn-Raspberry-Pi-4/134933aa72247a41bf598ef38b1476d11d003f28/yolov5s.bin
--------------------------------------------------------------------------------
/yolov5s.param:
--------------------------------------------------------------------------------
1 | 7767517
2 | 192 216
3 | Input images 0 1 images
4 | YoloV5Focus focus 1 1 images 207
5 | Convolution Conv_41 1 1 207 208 0=32 1=3 4=1 5=1 6=3456
6 | HardSwish Div_49 1 1 208 216 0=1.666667e-01
7 | Convolution Conv_50 1 1 216 217 0=64 1=3 3=2 4=1 5=1 6=18432
8 | HardSwish Div_58 1 1 217 225 0=1.666667e-01
9 | Split splitncnn_0 1 2 225 225_splitncnn_0 225_splitncnn_1
10 | Convolution Conv_59 1 1 225_splitncnn_1 226 0=32 1=1 5=1 6=2048
11 | HardSwish Div_67 1 1 226 234 0=1.666667e-01
12 | Split splitncnn_1 1 2 234 234_splitncnn_0 234_splitncnn_1
13 | Convolution Conv_68 1 1 234_splitncnn_1 235 0=32 1=1 5=1 6=1024
14 | HardSwish Div_76 1 1 235 243 0=1.666667e-01
15 | Convolution Conv_77 1 1 243 244 0=32 1=3 4=1 5=1 6=9216
16 | HardSwish Div_85 1 1 244 252 0=1.666667e-01
17 | BinaryOp Add_86 2 1 234_splitncnn_0 252 253
18 | Convolution Conv_87 1 1 253 254 0=32 1=1 6=1024
19 | Convolution Conv_88 1 1 225_splitncnn_0 255 0=32 1=1 6=2048
20 | Concat Concat_89 2 1 254 255 256
21 | BatchNorm BatchNormalization_90 1 1 256 257 0=64
22 | ReLU LeakyRelu_91 1 1 257 258 0=1.000000e-01
23 | Convolution Conv_92 1 1 258 259 0=64 1=1 5=1 6=4096
24 | HardSwish Div_100 1 1 259 267 0=1.666667e-01
25 | Convolution Conv_101 1 1 267 268 0=128 1=3 3=2 4=1 5=1 6=73728
26 | HardSwish Div_109 1 1 268 276 0=1.666667e-01
27 | Split splitncnn_2 1 2 276 276_splitncnn_0 276_splitncnn_1
28 | Convolution Conv_110 1 1 276_splitncnn_1 277 0=64 1=1 5=1 6=8192
29 | HardSwish Div_118 1 1 277 285 0=1.666667e-01
30 | Split splitncnn_3 1 2 285 285_splitncnn_0 285_splitncnn_1
31 | Convolution Conv_119 1 1 285_splitncnn_1 286 0=64 1=1 5=1 6=4096
32 | HardSwish Div_127 1 1 286 294 0=1.666667e-01
33 | Convolution Conv_128 1 1 294 295 0=64 1=3 4=1 5=1 6=36864
34 | HardSwish Div_136 1 1 295 303 0=1.666667e-01
35 | BinaryOp Add_137 2 1 285_splitncnn_0 303 304
36 | Split splitncnn_4 1 2 304 304_splitncnn_0 304_splitncnn_1
37 | Convolution Conv_138 1 1 304_splitncnn_1 305 0=64 1=1 5=1 6=4096
38 | HardSwish Div_146 1 1 305 313 0=1.666667e-01
39 | Convolution Conv_147 1 1 313 314 0=64 1=3 4=1 5=1 6=36864
40 | HardSwish Div_155 1 1 314 322 0=1.666667e-01
41 | BinaryOp Add_156 2 1 304_splitncnn_0 322 323
42 | Split splitncnn_5 1 2 323 323_splitncnn_0 323_splitncnn_1
43 | Convolution Conv_157 1 1 323_splitncnn_1 324 0=64 1=1 5=1 6=4096
44 | HardSwish Div_165 1 1 324 332 0=1.666667e-01
45 | Convolution Conv_166 1 1 332 333 0=64 1=3 4=1 5=1 6=36864
46 | HardSwish Div_174 1 1 333 341 0=1.666667e-01
47 | BinaryOp Add_175 2 1 323_splitncnn_0 341 342
48 | Convolution Conv_176 1 1 342 343 0=64 1=1 6=4096
49 | Convolution Conv_177 1 1 276_splitncnn_0 344 0=64 1=1 6=8192
50 | Concat Concat_178 2 1 343 344 345
51 | BatchNorm BatchNormalization_179 1 1 345 346 0=128
52 | ReLU LeakyRelu_180 1 1 346 347 0=1.000000e-01
53 | Convolution Conv_181 1 1 347 348 0=128 1=1 5=1 6=16384
54 | HardSwish Div_189 1 1 348 356 0=1.666667e-01
55 | Split splitncnn_6 1 2 356 356_splitncnn_0 356_splitncnn_1
56 | Convolution Conv_190 1 1 356_splitncnn_1 357 0=256 1=3 3=2 4=1 5=1 6=294912
57 | HardSwish Div_198 1 1 357 365 0=1.666667e-01
58 | Split splitncnn_7 1 2 365 365_splitncnn_0 365_splitncnn_1
59 | Convolution Conv_199 1 1 365_splitncnn_1 366 0=128 1=1 5=1 6=32768
60 | HardSwish Div_207 1 1 366 374 0=1.666667e-01
61 | Split splitncnn_8 1 2 374 374_splitncnn_0 374_splitncnn_1
62 | Convolution Conv_208 1 1 374_splitncnn_1 375 0=128 1=1 5=1 6=16384
63 | HardSwish Div_216 1 1 375 383 0=1.666667e-01
64 | Convolution Conv_217 1 1 383 384 0=128 1=3 4=1 5=1 6=147456
65 | HardSwish Div_225 1 1 384 392 0=1.666667e-01
66 | BinaryOp Add_226 2 1 374_splitncnn_0 392 393
67 | Split splitncnn_9 1 2 393 393_splitncnn_0 393_splitncnn_1
68 | Convolution Conv_227 1 1 393_splitncnn_1 394 0=128 1=1 5=1 6=16384
69 | HardSwish Div_235 1 1 394 402 0=1.666667e-01
70 | Convolution Conv_236 1 1 402 403 0=128 1=3 4=1 5=1 6=147456
71 | HardSwish Div_244 1 1 403 411 0=1.666667e-01
72 | BinaryOp Add_245 2 1 393_splitncnn_0 411 412
73 | Split splitncnn_10 1 2 412 412_splitncnn_0 412_splitncnn_1
74 | Convolution Conv_246 1 1 412_splitncnn_1 413 0=128 1=1 5=1 6=16384
75 | HardSwish Div_254 1 1 413 421 0=1.666667e-01
76 | Convolution Conv_255 1 1 421 422 0=128 1=3 4=1 5=1 6=147456
77 | HardSwish Div_263 1 1 422 430 0=1.666667e-01
78 | BinaryOp Add_264 2 1 412_splitncnn_0 430 431
79 | Convolution Conv_265 1 1 431 432 0=128 1=1 6=16384
80 | Convolution Conv_266 1 1 365_splitncnn_0 433 0=128 1=1 6=32768
81 | Concat Concat_267 2 1 432 433 434
82 | BatchNorm BatchNormalization_268 1 1 434 435 0=256
83 | ReLU LeakyRelu_269 1 1 435 436 0=1.000000e-01
84 | Convolution Conv_270 1 1 436 437 0=256 1=1 5=1 6=65536
85 | HardSwish Div_278 1 1 437 445 0=1.666667e-01
86 | Split splitncnn_11 1 2 445 445_splitncnn_0 445_splitncnn_1
87 | Convolution Conv_279 1 1 445_splitncnn_1 446 0=512 1=3 3=2 4=1 5=1 6=1179648
88 | HardSwish Div_287 1 1 446 454 0=1.666667e-01
89 | Convolution Conv_288 1 1 454 455 0=256 1=1 5=1 6=131072
90 | HardSwish Div_296 1 1 455 463 0=1.666667e-01
91 | Split splitncnn_12 1 4 463 463_splitncnn_0 463_splitncnn_1 463_splitncnn_2 463_splitncnn_3
92 | Pooling MaxPool_297 1 1 463_splitncnn_3 464 1=5 3=2 5=1
93 | Pooling MaxPool_298 1 1 463_splitncnn_2 465 1=9 3=4 5=1
94 | Pooling MaxPool_299 1 1 463_splitncnn_1 466 1=13 3=6 5=1
95 | Concat Concat_300 4 1 463_splitncnn_0 464 465 466 467
96 | Convolution Conv_301 1 1 467 468 0=512 1=1 5=1 6=524288
97 | HardSwish Div_309 1 1 468 476 0=1.666667e-01
98 | Split splitncnn_13 1 2 476 476_splitncnn_0 476_splitncnn_1
99 | Convolution Conv_310 1 1 476_splitncnn_1 477 0=256 1=1 5=1 6=131072
100 | HardSwish Div_318 1 1 477 485 0=1.666667e-01
101 | Convolution Conv_319 1 1 485 486 0=256 1=1 5=1 6=65536
102 | HardSwish Div_327 1 1 486 494 0=1.666667e-01
103 | Convolution Conv_328 1 1 494 495 0=256 1=3 4=1 5=1 6=589824
104 | HardSwish Div_336 1 1 495 503 0=1.666667e-01
105 | Convolution Conv_337 1 1 503 504 0=256 1=1 6=65536
106 | Convolution Conv_338 1 1 476_splitncnn_0 505 0=256 1=1 6=131072
107 | Concat Concat_339 2 1 504 505 506
108 | BatchNorm BatchNormalization_340 1 1 506 507 0=512
109 | ReLU LeakyRelu_341 1 1 507 508 0=1.000000e-01
110 | Convolution Conv_342 1 1 508 509 0=512 1=1 5=1 6=262144
111 | HardSwish Div_350 1 1 509 517 0=1.666667e-01
112 | Convolution Conv_351 1 1 517 518 0=256 1=1 5=1 6=131072
113 | HardSwish Div_359 1 1 518 526 0=1.666667e-01
114 | Split splitncnn_14 1 2 526 526_splitncnn_0 526_splitncnn_1
115 | Interp Resize_361 1 1 526_splitncnn_1 536 0=1 1=2.000000e+00 2=2.000000e+00
116 | Concat Concat_362 2 1 536 445_splitncnn_0 537
117 | Split splitncnn_15 1 2 537 537_splitncnn_0 537_splitncnn_1
118 | Convolution Conv_363 1 1 537_splitncnn_1 538 0=128 1=1 5=1 6=65536
119 | HardSwish Div_371 1 1 538 546 0=1.666667e-01
120 | Convolution Conv_372 1 1 546 547 0=128 1=1 5=1 6=16384
121 | HardSwish Div_380 1 1 547 555 0=1.666667e-01
122 | Convolution Conv_381 1 1 555 556 0=128 1=3 4=1 5=1 6=147456
123 | HardSwish Div_389 1 1 556 564 0=1.666667e-01
124 | Convolution Conv_390 1 1 564 565 0=128 1=1 6=16384
125 | Convolution Conv_391 1 1 537_splitncnn_0 566 0=128 1=1 6=65536
126 | Concat Concat_392 2 1 565 566 567
127 | BatchNorm BatchNormalization_393 1 1 567 568 0=256
128 | ReLU LeakyRelu_394 1 1 568 569 0=1.000000e-01
129 | Convolution Conv_395 1 1 569 570 0=256 1=1 5=1 6=65536
130 | HardSwish Div_403 1 1 570 578 0=1.666667e-01
131 | Convolution Conv_404 1 1 578 579 0=128 1=1 5=1 6=32768
132 | HardSwish Div_412 1 1 579 587 0=1.666667e-01
133 | Split splitncnn_16 1 2 587 587_splitncnn_0 587_splitncnn_1
134 | Interp Resize_414 1 1 587_splitncnn_1 597 0=1 1=2.000000e+00 2=2.000000e+00
135 | Concat Concat_415 2 1 597 356_splitncnn_0 598
136 | Split splitncnn_17 1 2 598 598_splitncnn_0 598_splitncnn_1
137 | Convolution Conv_416 1 1 598_splitncnn_1 599 0=64 1=1 5=1 6=16384
138 | HardSwish Div_424 1 1 599 607 0=1.666667e-01
139 | Convolution Conv_425 1 1 607 608 0=64 1=1 5=1 6=4096
140 | HardSwish Div_433 1 1 608 616 0=1.666667e-01
141 | Convolution Conv_434 1 1 616 617 0=64 1=3 4=1 5=1 6=36864
142 | HardSwish Div_442 1 1 617 625 0=1.666667e-01
143 | Convolution Conv_443 1 1 625 626 0=64 1=1 6=4096
144 | Convolution Conv_444 1 1 598_splitncnn_0 627 0=64 1=1 6=16384
145 | Concat Concat_445 2 1 626 627 628
146 | BatchNorm BatchNormalization_446 1 1 628 629 0=128
147 | ReLU LeakyRelu_447 1 1 629 630 0=1.000000e-01
148 | Convolution Conv_448 1 1 630 631 0=128 1=1 5=1 6=16384
149 | HardSwish Div_456 1 1 631 639 0=1.666667e-01
150 | Split splitncnn_18 1 2 639 639_splitncnn_0 639_splitncnn_1
151 | Convolution Conv_457 1 1 639_splitncnn_1 640 0=128 1=3 3=2 4=1 5=1 6=147456
152 | HardSwish Div_465 1 1 640 648 0=1.666667e-01
153 | Concat Concat_466 2 1 648 587_splitncnn_0 649
154 | Split splitncnn_19 1 2 649 649_splitncnn_0 649_splitncnn_1
155 | Convolution Conv_467 1 1 649_splitncnn_1 650 0=128 1=1 5=1 6=32768
156 | HardSwish Div_475 1 1 650 658 0=1.666667e-01
157 | Convolution Conv_476 1 1 658 659 0=128 1=1 5=1 6=16384
158 | HardSwish Div_484 1 1 659 667 0=1.666667e-01
159 | Convolution Conv_485 1 1 667 668 0=128 1=3 4=1 5=1 6=147456
160 | HardSwish Div_493 1 1 668 676 0=1.666667e-01
161 | Convolution Conv_494 1 1 676 677 0=128 1=1 6=16384
162 | Convolution Conv_495 1 1 649_splitncnn_0 678 0=128 1=1 6=32768
163 | Concat Concat_496 2 1 677 678 679
164 | BatchNorm BatchNormalization_497 1 1 679 680 0=256
165 | ReLU LeakyRelu_498 1 1 680 681 0=1.000000e-01
166 | Convolution Conv_499 1 1 681 682 0=256 1=1 5=1 6=65536
167 | HardSwish Div_507 1 1 682 690 0=1.666667e-01
168 | Split splitncnn_20 1 2 690 690_splitncnn_0 690_splitncnn_1
169 | Convolution Conv_508 1 1 690_splitncnn_1 691 0=256 1=3 3=2 4=1 5=1 6=589824
170 | HardSwish Div_516 1 1 691 699 0=1.666667e-01
171 | Concat Concat_517 2 1 699 526_splitncnn_0 700
172 | Split splitncnn_21 1 2 700 700_splitncnn_0 700_splitncnn_1
173 | Convolution Conv_518 1 1 700_splitncnn_1 701 0=256 1=1 5=1 6=131072
174 | HardSwish Div_526 1 1 701 709 0=1.666667e-01
175 | Convolution Conv_527 1 1 709 710 0=256 1=1 5=1 6=65536
176 | HardSwish Div_535 1 1 710 718 0=1.666667e-01
177 | Convolution Conv_536 1 1 718 719 0=256 1=3 4=1 5=1 6=589824
178 | HardSwish Div_544 1 1 719 727 0=1.666667e-01
179 | Convolution Conv_545 1 1 727 728 0=256 1=1 6=65536
180 | Convolution Conv_546 1 1 700_splitncnn_0 729 0=256 1=1 6=131072
181 | Concat Concat_547 2 1 728 729 730
182 | BatchNorm BatchNormalization_548 1 1 730 731 0=512
183 | ReLU LeakyRelu_549 1 1 731 732 0=1.000000e-01
184 | Convolution Conv_550 1 1 732 733 0=512 1=1 5=1 6=262144
185 | HardSwish Div_558 1 1 733 741 0=1.666667e-01
186 | Convolution Conv_559 1 1 639_splitncnn_0 742 0=255 1=1 5=1 6=32640
187 | Reshape Reshape_573 1 1 742 760 0=-1 1=85 2=3
188 | Permute Transpose_574 1 1 760 output 0=1
189 | Convolution Conv_575 1 1 690_splitncnn_0 762 0=255 1=1 5=1 6=65280
190 | Reshape Reshape_589 1 1 762 780 0=-1 1=85 2=3
191 | Permute Transpose_590 1 1 780 781 0=1
192 | Convolution Conv_591 1 1 741 782 0=255 1=1 5=1 6=130560
193 | Reshape Reshape_605 1 1 782 800 0=-1 1=85 2=3
194 | Permute Transpose_606 1 1 800 801 0=1
195 |
--------------------------------------------------------------------------------