├── CLA.md
├── LICENSE.md
├── README.md
├── deepstream_yolo
│   ├── README.md
│   ├── config_infer_primary_yoloV4.txt
│   ├── config_infer_primary_yoloV7.txt
│   ├── deepstream_app_config_yolo.txt
│   ├── labels.txt
│   └── nvdsinfer_custom_impl_Yolo
│       ├── Makefile
│       ├── nvdsparsebbox_Yolo.cpp
│       └── nvdsparsebbox_Yolo_cuda.cu
├── tensorrt_yolov4
│   ├── Makefile
│   ├── Makefile.config
│   ├── README.md
│   ├── data
│   │   ├── demo.jpg
│   │   └── demo_out.jpg
│   └── source
│       ├── Makefile
│       ├── SampleYolo.cpp
│       ├── SampleYolo.hpp
│       ├── generate_coco_image_list.py
│       ├── main.cpp
│       └── onnx_add_nms_plugin.py
├── tensorrt_yolov7
│   ├── CMakeLists.txt
│   ├── README.md
│   ├── imgs
│   │   ├── horses.jpg
│   │   └── zidane.jpg
│   ├── samples
│   │   ├── detect.cpp
│   │   ├── validate_coco.cpp
│   │   └── video_detect.cpp
│   ├── src
│   │   ├── Yolov7.cpp
│   │   ├── Yolov7.h
│   │   ├── argsParser.cpp
│   │   ├── argsParser.h
│   │   └── tools.h
│   └── test_coco_map.py
└── yolov7_qat
    ├── README.md
    ├── doc
    │   ├── Guidance_of_QAT_performance_optimization.md
    │   └── imgs
    │       ├── QATConv.png
    │       ├── QATFlow.png
    │       ├── int8_q_recommended_procedure.png
    │       ├── monkey-patch-qat-conv-fp16-issue_ptq.png
    │       ├── monkey-patch-qat-conv-fp16-issue_ptqonnx.png
    │       ├── monkey-patch-qat-conv-fp16-issue_qat.png
    │       ├── monkey-patch-qat-conv-fp16-issue_qatonnx.png
    │       ├── monkey-patch-qat-conv-fp16-issue_qatonnx_edit.png
    │       └── monkey-patch-qat-maxpooling-qat.png
    ├── quantization
    │   ├── quantize.py
    │   └── rules.py
    └── scripts
        ├── detect-trt.py
        ├── draw-engine.py
        ├── eval-trt.py
        ├── eval-trt.sh
        ├── qat-yolov5.py
        ├── qat.py
        ├── quantize_utils.py
        └── trt-int8.py
/CLA.md:
--------------------------------------------------------------------------------
1 | ## Individual Contributor License Agreement (CLA)
2 |
3 | **Thank you for submitting your contributions to this project.**
4 |
5 | By signing this CLA, you agree that the following terms apply to all of your past, present and future contributions
6 | to the project.
7 |
8 | ### License.
9 |
10 | You hereby represent that all present, past and future contributions are governed by the
11 | [MIT License](https://opensource.org/licenses/MIT)
12 | copyright statement.
13 |
14 | This entails that to the extent possible under law, you transfer all copyright and related or neighboring rights
15 | of the code or documents you contribute to the project itself or its maintainers.
16 | Furthermore, you also represent that you have the authority to perform the above waiver
17 | with respect to the entirety of your contributions.
18 |
19 | ### Moral Rights.
20 |
21 | To the fullest extent permitted under applicable law, you hereby waive, and agree not to
22 | assert, all of your “moral rights” in or relating to your contributions for the benefit of the project.
23 |
24 | ### Third Party Content.
25 |
26 | If your Contribution includes or is based on any source code, object code, bug fixes, configuration changes, tools,
27 | specifications, documentation, data, materials, feedback, information or other works of authorship that were not
28 | authored by you (“Third Party Content”) or if you are aware of any third party intellectual property or proprietary
29 | rights associated with your Contribution (“Third Party Rights”),
30 | then you agree to include with the submission of your Contribution full details respecting such Third Party
31 | Content and Third Party Rights, including, without limitation, identification of which aspects of your
32 | Contribution contain Third Party Content or are associated with Third Party Rights, the owner/author of the
33 | Third Party Content and Third Party Rights, where you obtained the Third Party Content, and any applicable
34 | third party license terms or restrictions respecting the Third Party Content and Third Party Rights. For greater
35 | certainty, the foregoing obligations respecting the identification of Third Party Content and Third Party Rights
36 | do not apply to any portion of a Project that is incorporated into your Contribution to that same Project.
37 |
38 | ### Representations.
39 |
40 | You represent that, other than the Third Party Content and Third Party Rights identified by
41 | you in accordance with this Agreement, you are the sole author of your Contributions and are legally entitled
42 | to grant the foregoing licenses and waivers in respect of your Contributions. If your Contributions were
43 | created in the course of your employment with your past or present employer(s), you represent that such
44 | employer(s) has authorized you to make your Contributions on behalf of such employer(s) or such employer
45 | (s) has waived all of their right, title or interest in or to your Contributions.
46 |
47 | ### Disclaimer.
48 |
49 | To the fullest extent permitted under applicable law, your Contributions are provided on an "as is"
50 | basis, without any warranties or conditions, express or implied, including, without limitation, any implied
51 | warranties or conditions of non-infringement, merchantability or fitness for a particular purpose. You are not
52 | required to provide support for your Contributions, except to the extent you desire to provide support.
53 |
54 | ### No Obligation.
55 |
56 | You acknowledge that the maintainers of this project are under no obligation to use or incorporate your contributions
57 | into the project. The decision to use or incorporate your contributions into the project will be made at the
58 | sole discretion of the maintainers or their authorized delegates.
59 |
--------------------------------------------------------------------------------
/LICENSE.md:
--------------------------------------------------------------------------------
1 |
2 | Apache License
3 | Version 2.0, January 2004
4 | http://www.apache.org/licenses/
5 |
6 | TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
7 |
8 | 1. Definitions.
9 |
10 | "License" shall mean the terms and conditions for use, reproduction,
11 | and distribution as defined by Sections 1 through 9 of this document.
12 |
13 | "Licensor" shall mean the copyright owner or entity authorized by
14 | the copyright owner that is granting the License.
15 |
16 | "Legal Entity" shall mean the union of the acting entity and all
17 | other entities that control, are controlled by, or are under common
18 | control with that entity. For the purposes of this definition,
19 | "control" means (i) the power, direct or indirect, to cause the
20 | direction or management of such entity, whether by contract or
21 | otherwise, or (ii) ownership of fifty percent (50%) or more of the
22 | outstanding shares, or (iii) beneficial ownership of such entity.
23 |
24 | "You" (or "Your") shall mean an individual or Legal Entity
25 | exercising permissions granted by this License.
26 |
27 | "Source" form shall mean the preferred form for making modifications,
28 | including but not limited to software source code, documentation
29 | source, and configuration files.
30 |
31 | "Object" form shall mean any form resulting from mechanical
32 | transformation or translation of a Source form, including but
33 | not limited to compiled object code, generated documentation,
34 | and conversions to other media types.
35 |
36 | "Work" shall mean the work of authorship, whether in Source or
37 | Object form, made available under the License, as indicated by a
38 | copyright notice that is included in or attached to the work
39 | (an example is provided in the Appendix below).
40 |
41 | "Derivative Works" shall mean any work, whether in Source or Object
42 | form, that is based on (or derived from) the Work and for which the
43 | editorial revisions, annotations, elaborations, or other modifications
44 | represent, as a whole, an original work of authorship. For the purposes
45 | of this License, Derivative Works shall not include works that remain
46 | separable from, or merely link (or bind by name) to the interfaces of,
47 | the Work and Derivative Works thereof.
48 |
49 | "Contribution" shall mean any work of authorship, including
50 | the original version of the Work and any modifications or additions
51 | to that Work or Derivative Works thereof, that is intentionally
52 | submitted to Licensor for inclusion in the Work by the copyright owner
53 | or by an individual or Legal Entity authorized to submit on behalf of
54 | the copyright owner. For the purposes of this definition, "submitted"
55 | means any form of electronic, verbal, or written communication sent
56 | to the Licensor or its representatives, including but not limited to
57 | communication on electronic mailing lists, source code control systems,
58 | and issue tracking systems that are managed by, or on behalf of, the
59 | Licensor for the purpose of discussing and improving the Work, but
60 | excluding communication that is conspicuously marked or otherwise
61 | designated in writing by the copyright owner as "Not a Contribution."
62 |
63 | "Contributor" shall mean Licensor and any individual or Legal Entity
64 | on behalf of whom a Contribution has been received by Licensor and
65 | subsequently incorporated within the Work.
66 |
67 | 2. Grant of Copyright License. Subject to the terms and conditions of
68 | this License, each Contributor hereby grants to You a perpetual,
69 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable
70 | copyright license to reproduce, prepare Derivative Works of,
71 | publicly display, publicly perform, sublicense, and distribute the
72 | Work and such Derivative Works in Source or Object form.
73 |
74 | 3. Grant of Patent License. Subject to the terms and conditions of
75 | this License, each Contributor hereby grants to You a perpetual,
76 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable
77 | (except as stated in this section) patent license to make, have made,
78 | use, offer to sell, sell, import, and otherwise transfer the Work,
79 | where such license applies only to those patent claims licensable
80 | by such Contributor that are necessarily infringed by their
81 | Contribution(s) alone or by combination of their Contribution(s)
82 | with the Work to which such Contribution(s) was submitted. If You
83 | institute patent litigation against any entity (including a
84 | cross-claim or counterclaim in a lawsuit) alleging that the Work
85 | or a Contribution incorporated within the Work constitutes direct
86 | or contributory patent infringement, then any patent licenses
87 | granted to You under this License for that Work shall terminate
88 | as of the date such litigation is filed.
89 |
90 | 4. Redistribution. You may reproduce and distribute copies of the
91 | Work or Derivative Works thereof in any medium, with or without
92 | modifications, and in Source or Object form, provided that You
93 | meet the following conditions:
94 |
95 | (a) You must give any other recipients of the Work or
96 | Derivative Works a copy of this License; and
97 |
98 | (b) You must cause any modified files to carry prominent notices
99 | stating that You changed the files; and
100 |
101 | (c) You must retain, in the Source form of any Derivative Works
102 | that You distribute, all copyright, patent, trademark, and
103 | attribution notices from the Source form of the Work,
104 | excluding those notices that do not pertain to any part of
105 | the Derivative Works; and
106 |
107 | (d) If the Work includes a "NOTICE" text file as part of its
108 | distribution, then any Derivative Works that You distribute must
109 | include a readable copy of the attribution notices contained
110 | within such NOTICE file, excluding those notices that do not
111 | pertain to any part of the Derivative Works, in at least one
112 | of the following places: within a NOTICE text file distributed
113 | as part of the Derivative Works; within the Source form or
114 | documentation, if provided along with the Derivative Works; or,
115 | within a display generated by the Derivative Works, if and
116 | wherever such third-party notices normally appear. The contents
117 | of the NOTICE file are for informational purposes only and
118 | do not modify the License. You may add Your own attribution
119 | notices within Derivative Works that You distribute, alongside
120 | or as an addendum to the NOTICE text from the Work, provided
121 | that such additional attribution notices cannot be construed
122 | as modifying the License.
123 |
124 | You may add Your own copyright statement to Your modifications and
125 | may provide additional or different license terms and conditions
126 | for use, reproduction, or distribution of Your modifications, or
127 | for any such Derivative Works as a whole, provided Your use,
128 | reproduction, and distribution of the Work otherwise complies with
129 | the conditions stated in this License.
130 |
131 | 5. Submission of Contributions. Unless You explicitly state otherwise,
132 | any Contribution intentionally submitted for inclusion in the Work
133 | by You to the Licensor shall be under the terms and conditions of
134 | this License, without any additional terms or conditions.
135 | Notwithstanding the above, nothing herein shall supersede or modify
136 | the terms of any separate license agreement you may have executed
137 | with Licensor regarding such Contributions.
138 |
139 | 6. Trademarks. This License does not grant permission to use the trade
140 | names, trademarks, service marks, or product names of the Licensor,
141 | except as required for reasonable and customary use in describing the
142 | origin of the Work and reproducing the content of the NOTICE file.
143 |
144 | 7. Disclaimer of Warranty. Unless required by applicable law or
145 | agreed to in writing, Licensor provides the Work (and each
146 | Contributor provides its Contributions) on an "AS IS" BASIS,
147 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
148 | implied, including, without limitation, any warranties or conditions
149 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
150 | PARTICULAR PURPOSE. You are solely responsible for determining the
151 | appropriateness of using or redistributing the Work and assume any
152 | risks associated with Your exercise of permissions under this License.
153 |
154 | 8. Limitation of Liability. In no event and under no legal theory,
155 | whether in tort (including negligence), contract, or otherwise,
156 | unless required by applicable law (such as deliberate and grossly
157 | negligent acts) or agreed to in writing, shall any Contributor be
158 | liable to You for damages, including any direct, indirect, special,
159 | incidental, or consequential damages of any character arising as a
160 | result of this License or out of the use or inability to use the
161 | Work (including but not limited to damages for loss of goodwill,
162 | work stoppage, computer failure or malfunction, or any and all
163 | other commercial damages or losses), even if such Contributor
164 | has been advised of the possibility of such damages.
165 |
166 | 9. Accepting Warranty or Additional Liability. While redistributing
167 | the Work or Derivative Works thereof, You may choose to offer,
168 | and charge a fee for, acceptance of support, warranty, indemnity,
169 | or other liability obligations and/or rights consistent with this
170 | License. However, in accepting such obligations, You may act only
171 | on Your own behalf and on Your sole responsibility, not on behalf
172 | of any other Contributor, and only if You agree to indemnify,
173 | defend, and hold each Contributor harmless for any liability
174 | incurred by, or claims asserted against, such Contributor by reason
175 | of your accepting any such warranty or additional liability.
176 |
177 | END OF TERMS AND CONDITIONS
178 |
179 | APPENDIX: How to apply the Apache License to your work.
180 |
181 | To apply the Apache License to your work, attach the following
182 | boilerplate notice, with the fields enclosed by brackets "[]"
183 | replaced with your own identifying information. (Don't include
184 | the brackets!) The text should be enclosed in the appropriate
185 | comment syntax for the file format. We also recommend that a
186 | file or class name and description of purpose be included on the
187 | same "printed page" as the copyright notice for easier
188 | identification within third-party archives.
189 |
190 | Copyright [yyyy] [name of copyright owner]
191 |
192 | Licensed under the Apache License, Version 2.0 (the "License");
193 | you may not use this file except in compliance with the License.
194 | You may obtain a copy of the License at
195 |
196 | http://www.apache.org/licenses/LICENSE-2.0
197 |
198 | Unless required by applicable law or agreed to in writing, software
199 | distributed under the License is distributed on an "AS IS" BASIS,
200 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
201 | See the License for the specific language governing permissions and
202 | limitations under the License.
203 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Yolo DeepStream
2 |
3 | ## Description
4 |
5 | This repo has four parts:
6 | ### 1) yolov7_qat
7 | In [yolov7_qat](yolov7_qat), we use [TensorRT's pytorch-quantization tool](https://github.com/NVIDIA/TensorRT/tree/main/tools/pytorch-quantization) to fine-tune the pre-trained YOLOv7 weights with quantization-aware training (QAT).
8 | The resulting model matches the performance of TensorRT PTQ on Jetson AGX Orin, and its accuracy (mAP) drops only slightly.
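For orientation, here is a minimal sketch of the calibrate-then-finetune QAT flow with pytorch-quantization; the repo's actual logic lives in `quantization/quantize.py` and `scripts/qat.py`, and `load_yolov7`/`calib_loader` below are hypothetical placeholders:

```python
# Hedged sketch of the pytorch-quantization QAT flow; load_yolov7 and
# calib_loader are placeholders, not functions from this repo.
import torch
from pytorch_quantization import nn as quant_nn
from pytorch_quantization import quant_modules

quant_modules.initialize()          # monkey-patch torch layers with quantized versions
model = load_yolov7("yolov7.pt")    # placeholder: build the model from pre-trained weights

# Calibration pass: collect activation statistics with quantization disabled
for module in model.modules():
    if isinstance(module, quant_nn.TensorQuantizer):
        module.disable_quant()
        module.enable_calib()
with torch.no_grad():
    for imgs, _ in calib_loader:    # placeholder: a few batches of COCO images
        model(imgs)
for module in model.modules():
    if isinstance(module, quant_nn.TensorQuantizer):
        module.load_calib_amax()
        module.enable_quant()
        module.disable_calib()

# ... fine-tune the fake-quantized model for a few epochs (training loop omitted) ...

# Export with fake-quant ops mapped to ONNX Q/DQ nodes for TensorRT
quant_nn.TensorQuantizer.use_fb_fake_quant = True
torch.onnx.export(model, torch.randn(1, 3, 640, 640), "yolov7_qat.onnx", opset_version=13)
```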
9 |
10 | ### 2) tensorrt_yolov7
11 | In [tensorrt_yolov7](tensorrt_yolov7), we provide a standalone C++ YOLOv7 sample application. You can use trtexec to convert FP32 ONNX models, or QAT-INT8 models exported from [yolov7_qat](yolov7_qat), into TensorRT engines, and then pass an engine to the app as its input. The app can run detection on images/videos, or test mAP on the COCO dataset; a conversion sketch follows below.
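As an illustration of that conversion step, here is a hedged sketch that drives trtexec from Python; the file names and the `images` input-tensor name are assumptions based on the standard yolov7 exporter, not fixed by this repo:

```python
# Hedged sketch: build TensorRT engines with trtexec for the yolov7 app.
# File names and the ONNX input name "images" are assumptions.
import subprocess

def build_engine(onnx_path, engine_path, extra_flags, batch=1):
    cmd = [
        "trtexec",
        f"--onnx={onnx_path}",
        f"--saveEngine={engine_path}",
        f"--shapes=images:{batch}x3x640x640",  # pin the dynamic batch dimension
        *extra_flags,
    ]
    subprocess.run(cmd, check=True)

# FP32 ONNX -> FP16 engine
build_engine("yolov7.onnx", "yolov7_fp16.engine", ["--fp16"], batch=16)
# QAT ONNX (with Q/DQ nodes) -> INT8 engine
build_engine("yolov7_qat.onnx", "yolov7_qat.engine", ["--int8", "--fp16"], batch=16)
```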
12 |
13 | ### 3) deepstream_yolo
14 | The sample in [deepstream_yolo](deepstream_yolo) shows how to integrate YOLO models, with customized output-layer parsing for detected objects, into DeepStreamSDK.
15 |
16 | ### 4) tensorrt_yolov4
17 | The sample in [tensorrt_yolov4](tensorrt_yolov4) is a standalone TensorRT sample for YOLOv4.
18 |
19 | ## Performance
20 | For YoloV7 sample:
21 |
22 | The table below shows the end-to-end performance of processing 1080p videos with this sample application.
23 | - Testing Device :
24 |
25 | 1. Jetson AGX Orin 64GB (PowerMode: MAXN, GPU freq: 1.3GHz, CPU: 12-core @ 2.2GHz)
26 |
27 | 2. Tesla T4
28 |
29 | |Device |Precision |Number of streams | Batch Size | trtexec FPS | deepstream-app FPS with cuda-post-process | deepstream-app FPS with cpu-post-process |
30 | |----------- |----------- |----------------- | -----------|----------- |-----------|-----------|
31 | | Orin-X| FP16 | 1 | 1 | 126 | 124 | 120 |
32 | | Orin-X| FP16 | 16 | 16 | 162 | 145 | 135 |
33 | | Orin-X| Int8(PTQ/QAT)| 1 | 1 | 180 | 175 | 128 |
34 | | Orin-X| Int8(PTQ/QAT)| 16 | 16 | 264 | 264 | 135 |
35 | | T4 | FP16 | 1 | 1 | 132 | 125 | 123 |
36 | | T4 | FP16 | 16 | 16 | 169 | 169 | 123 |
37 | | T4 | Int8(PTQ/QAT)| 1 | 1 | 208 | 170 | 127 |
38 | | T4 | Int8(PTQ/QAT)| 16 | 16 | 305 | 300 | 132 |
39 |
40 |
41 | - Note: trtexec cudaGraph was not enabled, because DeepStream does not support cudaGraph.
42 |
43 | ## Code structure
44 | ```bash
45 | ├── deepstream_yolo
46 | │   ├── config_infer_primary_yoloV4.txt   # nvinfer config file for the yolov4 model
47 | │   ├── config_infer_primary_yoloV7.txt   # nvinfer config file for the yolov7 model
48 | │   ├── deepstream_app_config_yolo.txt    # DeepStream reference app configuration file for using YOLO models as the primary detector
49 | │   ├── labels.txt                        # labels for COCO detection
50 | │   ├── nvdsinfer_custom_impl_Yolo
51 | │   │   ├── Makefile
52 | │   │   └── nvdsparsebbox_Yolo.cpp        # output layer parsing function for detected objects for the YOLO models
53 | │   └── README.md
54 | ├── README.md
55 | ├── tensorrt_yolov4
56 | │   ├── data
57 | │   │   ├── demo.jpg                      # the demo image
58 | │   │   └── demo_out.jpg                  # detection output for the demo image
59 | │   ├── Makefile
60 | │   ├── Makefile.config
61 | │   ├── README.md
62 | │   └── source
63 | │       ├── generate_coco_image_list.py   # python script to get the list of image names from an MS COCO annotation or information file
64 | │       ├── main.cpp                      # program entry point, where parameters are configured
65 | │       ├── Makefile
66 | │       ├── onnx_add_nms_plugin.py        # python script to add the BatchedNMSPlugin node to the ONNX model
67 | │       ├── SampleYolo.cpp                # yolov4 inference class function definitions
68 | │       └── SampleYolo.hpp                # yolov4 inference class definition
69 | ├── tensorrt_yolov7
70 | │   ├── CMakeLists.txt
71 | │   ├── imgs                              # the demo images
72 | │   │   ├── horses.jpg
73 | │   │   └── zidane.jpg
74 | │   ├── README.md
75 | │   ├── samples
76 | │   │   ├── detect.cpp                    # detection app for images
77 | │   │   ├── validate_coco.cpp             # app to validate on the COCO dataset
78 | │   │   └── video_detect.cpp              # detection app for videos
79 | │   ├── src
80 | │   │   ├── argsParser.cpp                # argsParser helper class for command-line parsing
81 | │   │   ├── argsParser.h                  # argsParser helper class for command-line parsing
82 | │   │   ├── tools.h                       # helper functions for the Yolov7 class
83 | │   │   ├── Yolov7.cpp                    # class Yolov7
84 | │   │   └── Yolov7.h                      # class Yolov7
85 | │   └── test_coco_map.py                  # tool for testing COCO mAP with a JSON file
86 | └── yolov7_qat
87 |     ├── doc
88 |     │   └── Guidance_of_QAT_performance_optimization.md   # guidance on Q&DQ node insertion and placement for the pytorch-quantization tool
89 |     ├── quantization
90 |     │   ├── quantize.py                   # helper class for quantizing the yolov7 model
91 |     │   └── rules.py                      # rules and restrictions for Q&DQ node insertion
92 |     ├── README.md
93 |     └── scripts
94 |         ├── detect-trt.py                 # detect an image with a TensorRT engine
95 |         ├── draw-engine.py                # draw a TensorRT engine as a graph
96 |         ├── eval-trt.py                   # script for evaluating TensorRT mAP
97 |         ├── eval-trt.sh                   # command-line script for evaluating TensorRT mAP
98 |         ├── qat.py                        # main entry point for QAT and PTQ
99 |         └── trt-int8.py                   # TensorRT built-in calibration
100 | ```
101 |
--------------------------------------------------------------------------------
/deepstream_yolo/README.md:
--------------------------------------------------------------------------------
1 | # Deploy YOLO Models With DeepStream #
2 |
3 | **This sample shows how to integrate YOLO models with customized output layer parsing for detected objects with DeepStreamSDK.**
4 |
5 | ## 1. Sample contents: ##
6 | - `deepstream_app_config_yolo.txt`: DeepStream reference app configuration file for using YOLO models as the primary detector.
7 | - `config_infer_primary_yoloV4.txt`: Configuration file for the GStreamer nvinfer plugin for the YoloV4 detector model.
8 | - `config_infer_primary_yoloV7.txt`: Configuration file for the GStreamer nvinfer plugin for the YoloV7 detector model.
9 | - `nvdsinfer_custom_impl_Yolo/nvdsparsebbox_Yolo.cpp`: Output layer parsing function for detected objects for the Yolo models.
10 |
11 | ## 2. Pre-requisites: ##
12 |
13 | ### 2.1 Please make sure DeepStream 6.1.1+ is properly installed ###
14 |
15 | ### 2.2 Generate Model ###
16 | #### YoloV4
17 |
18 | - Convert the YOLOv4 PyTorch model into **ONNX** using a YOLOv4 PyTorch repository that provides an ONNX export.
24 | - Or you can download the reference ONNX model directly from here ([link](https://drive.google.com/file/d/1tp1xzeey4YBSd8nGd-dkn8Ymii9ordEj/view?usp=sharing)).
25 |
26 | #### YOLOv7
27 | Following the guide at https://github.com/WongKinYiu/yolov7#export, export an ONNX model with dynamic batch size and a single output:
28 | ```bash
29 | $ python export.py --weights ./yolov7.pt --grid --simplify --topk-all 100 --iou-thres 0.65 --conf-thres 0.35 --img-size 640 640 --dynamic-batch
30 | ```
31 | Or use the QAT model exported from [yolov7_qat](../yolov7_qat).
32 | ## 3. Download and Run ##
33 |
34 | ```sh
35 | $ cd ~/
36 | $ git clone https://github.com/NVIDIA-AI-IOT/yolo_deepstream.git
37 | $ cd ~/yolo_deepstream/deepstream_yolo/nvdsinfer_custom_impl_Yolo
38 | $ make
39 | $ cd ..
40 | ```
41 | Make sure the model exists under ~/yolo_deepstream/deepstream_yolo/. In the "deepstream_app_config_yolo.txt" configuration file, set the "config-file" parameter to the nvinfer configuration file for the model you want to run:
42 | |Model|Nvinfer Configuration File|
43 | |-----------|----------|
44 | |YoloV4|config_infer_primary_yoloV4.txt|
45 | |YoloV7|config_infer_primary_yoloV7.txt|
46 |
47 | ```sh
48 | $ deepstream-app -c deepstream_app_config_yolo.txt
49 | ```
50 | ## 4. CUDA Post Processing
51 |
52 | This sample provides two implementations of the YOLOv7 post-processing (decoding the YOLO output; NMS is not included): a CPU version and a GPU version.
53 | - The CPU implementation can be found in [nvdsparsebbox_Yolo.cpp](nvdsinfer_custom_impl_Yolo/nvdsparsebbox_Yolo.cpp)
54 | - The CUDA implementation can be found in [nvdsparsebbox_Yolo_cuda.cu](nvdsinfer_custom_impl_Yolo/nvdsparsebbox_Yolo_cuda.cu)
55 |
56 | CUDA post-processing is used by default. To enable CPU post-processing instead, change the following
57 | in [config_infer_primary_yoloV7.txt](config_infer_primary_yoloV7.txt):
58 |
59 | - `parse-bbox-func-name=NvDsInferParseCustomYoloV7_cuda` -> `parse-bbox-func-name=NvDsInferParseCustomYoloV7`
60 | - `disable-output-host-copy=1` -> `disable-output-host-copy=0`
61 |
62 | A performance comparison of CPU and CUDA post-processing can be found under [Performance](https://github.com/NVIDIA-AI-IOT/yolo_deepstream#performance).
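Both parsers decode the same `[25200, 85]` output rows of `(cx, cy, w, h, objectness, 80 class scores)`. For reference only, here is a minimal numpy sketch of that decode logic, following the CUDA parser's objectness * best-class-score confidence (this is not the DeepStream plugin API, just the math it performs):

```python
# Hedged numpy sketch of the YOLOv7 decode done by the parsers (NMS not included).
import numpy as np

def decode_yolov7(pred, conf_thres=0.25):
    """pred: [25200, 85] array of (cx, cy, w, h, objectness, 80 class scores)."""
    obj = pred[:, 4]
    cls_scores = pred[:, 5:]
    cls_id = cls_scores.argmax(axis=1)
    conf = obj * cls_scores.max(axis=1)      # confidence = objectness * best class score
    keep = conf > conf_thres
    cx, cy, w, h = (pred[keep, i] for i in range(4))
    boxes = np.stack([cx - w / 2, cy - h / 2, w, h], axis=1)  # (left, top, width, height)
    return boxes, conf[keep], cls_id[keep]
```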
63 |
64 |
--------------------------------------------------------------------------------
/deepstream_yolo/config_infer_primary_yoloV4.txt:
--------------------------------------------------------------------------------
1 | ################################################################################
2 | # SPDX-FileCopyrightText: Copyright (c) 2022 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
3 | # SPDX-License-Identifier: MIT
4 | #
5 | # Permission is hereby granted, free of charge, to any person obtaining a
6 | # copy of this software and associated documentation files (the "Software"),
7 | # to deal in the Software without restriction, including without limitation
8 | # the rights to use, copy, modify, merge, publish, distribute, sublicense,
9 | # and/or sell copies of the Software, and to permit persons to whom the
10 | # Software is furnished to do so, subject to the following conditions:
11 | #
12 | # The above copyright notice and this permission notice shall be included in
13 | # all copies or substantial portions of the Software.
14 | #
15 | # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
18 | # THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | # LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
20 | # FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
21 | # DEALINGS IN THE SOFTWARE.
22 | ################################################################################
23 |
24 | # Following properties are mandatory when engine files are not specified:
25 | # int8-calib-file(Only in INT8), model-file-format
26 | # Caffemodel mandatory properties: model-file, proto-file, output-blob-names
27 | # UFF: uff-file, input-dims, uff-input-blob-name, output-blob-names
28 | # ONNX: onnx-file
29 | #
30 | # Mandatory properties for detectors:
31 | # num-detected-classes
32 | #
33 | # Optional properties for detectors:
34 | # cluster-mode(Default=Group Rectangles), interval(Primary mode only, Default=0)
35 | # custom-lib-path
36 | # parse-bbox-func-name
37 | #
38 | # Mandatory properties for classifiers:
39 | # classifier-threshold, is-classifier
40 | #
41 | # Optional properties for classifiers:
42 | # classifier-async-mode(Secondary mode only, Default=false)
43 | #
44 | # Optional properties in secondary mode:
45 | # operate-on-gie-id(Default=0), operate-on-class-ids(Defaults to all classes),
46 | # input-object-min-width, input-object-min-height, input-object-max-width,
47 | # input-object-max-height
48 | #
49 | # Following properties are always recommended:
50 | # batch-size(Default=1)
51 | #
52 | # Other optional properties:
53 | # net-scale-factor(Default=1), network-mode(Default=0 i.e FP32),
54 | # model-color-format(Default=0 i.e. RGB) model-engine-file, labelfile-path,
55 | # mean-file, gie-unique-id(Default=0), offsets, process-mode (Default=1 i.e. primary),
56 | # custom-lib-path, network-mode(Default=0 i.e FP32)
57 | #
58 | # The values in the config file are overridden by values set through GObject
59 | # properties.
60 |
61 | [property]
62 | gpu-id=0
63 | net-scale-factor=0.0039215697906911373
64 | #0=RGB, 1=BGR
65 | model-color-format=0
66 | onnx-file=yolov4_-1_3_416_416_nms_dynamic.onnx
67 | model-engine-file=yolov4_-1_3_416_416_nms_dynamic.onnx_b16_gpu0_fp16.engine
68 | labelfile-path=labels.txt
69 | batch-size=16
70 | ## 0=FP32, 1=INT8, 2=FP16 mode
71 | network-mode=2
72 | num-detected-classes=80
73 | gie-unique-id=1
74 | network-type=0
75 | is-classifier=0
76 | ## 0=Group Rectangles, 1=DBSCAN, 2=NMS, 3= DBSCAN+NMS Hybrid, 4 = None(No clustering)
77 | cluster-mode=2
78 | maintain-aspect-ratio=1
79 | parse-bbox-func-name=NvDsInferParseCustomYoloV4
80 | custom-lib-path=nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so
81 | #scaling-filter=0
82 | #scaling-compute-hw=0
83 |
84 | [class-attrs-all]
85 | nms-iou-threshold=0.6
86 | pre-cluster-threshold=0.4
87 |
--------------------------------------------------------------------------------
/deepstream_yolo/config_infer_primary_yoloV7.txt:
--------------------------------------------------------------------------------
1 | ################################################################################
2 | # SPDX-FileCopyrightText: Copyright (c) 2022 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
3 | # SPDX-License-Identifier: MIT
4 | #
5 | # Permission is hereby granted, free of charge, to any person obtaining a
6 | # copy of this software and associated documentation files (the "Software"),
7 | # to deal in the Software without restriction, including without limitation
8 | # the rights to use, copy, modify, merge, publish, distribute, sublicense,
9 | # and/or sell copies of the Software, and to permit persons to whom the
10 | # Software is furnished to do so, subject to the following conditions:
11 | #
12 | # The above copyright notice and this permission notice shall be included in
13 | # all copies or substantial portions of the Software.
14 | #
15 | # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
18 | # THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | # LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
20 | # FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
21 | # DEALINGS IN THE SOFTWARE.
22 | ################################################################################
23 |
24 | # Following properties are mandatory when engine files are not specified:
25 | # int8-calib-file(Only in INT8), model-file-format
26 | # Caffemodel mandatory properties: model-file, proto-file, output-blob-names
27 | # UFF: uff-file, input-dims, uff-input-blob-name, output-blob-names
28 | # ONNX: onnx-file
29 | #
30 | # Mandatory properties for detectors:
31 | # num-detected-classes
32 | #
33 | # Optional properties for detectors:
34 | # cluster-mode(Default=Group Rectangles), interval(Primary mode only, Default=0)
35 | # custom-lib-path
36 | # parse-bbox-func-name
37 | #
38 | # Mandatory properties for classifiers:
39 | # classifier-threshold, is-classifier
40 | #
41 | # Optional properties for classifiers:
42 | # classifier-async-mode(Secondary mode only, Default=false)
43 | #
44 | # Optional properties in secondary mode:
45 | # operate-on-gie-id(Default=0), operate-on-class-ids(Defaults to all classes),
46 | # input-object-min-width, input-object-min-height, input-object-max-width,
47 | # input-object-max-height
48 | #
49 | # Following properties are always recommended:
50 | # batch-size(Default=1)
51 | #
52 | # Other optional properties:
53 | # net-scale-factor(Default=1), network-mode(Default=0 i.e FP32),
54 | # model-color-format(Default=0 i.e. RGB) model-engine-file, labelfile-path,
55 | # mean-file, gie-unique-id(Default=0), offsets, process-mode (Default=1 i.e. primary),
56 | # custom-lib-path, network-mode(Default=0 i.e FP32)
57 | #
58 | # The values in the config file are overridden by values set through GObject
59 | # properties.
60 |
61 | [property]
62 | gpu-id=0
63 | net-scale-factor=0.0039215697906911373
64 | #0=RGB, 1=BGR
65 | model-color-format=0
66 | onnx-file=yolov7.onnx
67 | labelfile-path=labels.txt
68 | ## 0=FP32, 1=INT8, 2=FP16 mode
69 | network-mode=2
70 | num-detected-classes=80
71 | gie-unique-id=1
72 | network-type=0
73 | is-classifier=0
74 | ## 0=Group Rectangles, 1=DBSCAN, 2=NMS, 3= DBSCAN+NMS Hybrid, 4 = None(No clustering)
75 | cluster-mode=2
76 | maintain-aspect-ratio=1
77 | symmetric-padding=1
78 | ## Bilinear Interpolation
79 | scaling-filter=1
80 | #parse-bbox-func-name=NvDsInferParseCustomYoloV7
81 | parse-bbox-func-name=NvDsInferParseCustomYoloV7_cuda
82 | #disable-output-host-copy=0
83 | disable-output-host-copy=1
84 | custom-lib-path=nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so
85 | #scaling-compute-hw=0
86 | ## start from DS6.2
87 | crop-objects-to-roi-boundary=1
88 |
89 |
90 | [class-attrs-all]
91 | #nms-iou-threshold=0.3
92 | #threshold=0.7
93 | nms-iou-threshold=0.65
94 | pre-cluster-threshold=0.25
95 | topk=300
96 |
97 |
--------------------------------------------------------------------------------
/deepstream_yolo/deepstream_app_config_yolo.txt:
--------------------------------------------------------------------------------
1 | ################################################################################
2 | # SPDX-FileCopyrightText: Copyright (c) 2022 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
3 | # SPDX-License-Identifier: MIT
4 | #
5 | # Permission is hereby granted, free of charge, to any person obtaining a
6 | # copy of this software and associated documentation files (the "Software"),
7 | # to deal in the Software without restriction, including without limitation
8 | # the rights to use, copy, modify, merge, publish, distribute, sublicense,
9 | # and/or sell copies of the Software, and to permit persons to whom the
10 | # Software is furnished to do so, subject to the following conditions:
11 | #
12 | # The above copyright notice and this permission notice shall be included in
13 | # all copies or substantial portions of the Software.
14 | #
15 | # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
18 | # THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | # LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
20 | # FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
21 | # DEALINGS IN THE SOFTWARE.
22 | ################################################################################
23 |
24 | [application]
25 | enable-perf-measurement=1
26 | perf-measurement-interval-sec=5
27 | #gie-kitti-output-dir=streamscl
28 |
29 | [tiled-display]
30 | enable=0
31 | rows=4
32 | columns=4
33 | width=1280
34 | height=720
35 | gpu-id=0
36 | #(0): nvbuf-mem-default - Default memory allocated, specific to particular platform
37 | #(1): nvbuf-mem-cuda-pinned - Allocate Pinned/Host cuda memory, applicable for Tesla
38 | #(2): nvbuf-mem-cuda-device - Allocate Device cuda memory, applicable for Tesla
39 | #(3): nvbuf-mem-cuda-unified - Allocate Unified cuda memory, applicable for Tesla
40 | #(4): nvbuf-mem-surface-array - Allocate Surface Array memory, applicable for Jetson
41 | nvbuf-memory-type=0
42 |
43 | [source0]
44 | enable=1
45 | #Type - 1=CameraV4L2 2=URI 3=MultiURI
46 | type=3
47 | uri=file:/opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4
48 | num-sources=16
49 | gpu-id=0
50 | # (0): memtype_device - Memory type Device
51 | # (1): memtype_pinned - Memory type Host Pinned
52 | # (2): memtype_unified - Memory type Unified
53 | cudadec-memtype=0
54 |
55 | [sink0]
56 | enable=1
57 | #Type - 1=FakeSink 2=EglSink 3=File
58 | type=3
59 | sync=0
60 | source-id=0
61 | gpu-id=0
62 | nvbuf-memory-type=0
63 | #1=mp4 2=mkv
64 | container=1
65 | #1=h264 2=h265
66 | codec=1
67 | output-file=yolov4.mp4
68 |
69 | [osd]
70 | enable=1
71 | gpu-id=0
72 | border-width=1
73 | text-size=12
74 | text-color=1;1;1;1;
75 | text-bg-color=0.3;0.3;0.3;1
76 | font=Serif
77 | show-clock=0
78 | clock-x-offset=800
79 | clock-y-offset=820
80 | clock-text-size=12
81 | clock-color=1;0;0;0
82 | nvbuf-memory-type=0
83 |
84 | [streammux]
85 | gpu-id=0
86 | ##Boolean property to inform muxer that sources are live
87 | live-source=0
88 | batch-size=16
89 | ##time out in usec, to wait after the first buffer is available
90 | ##to push the batch even if the complete batch is not formed
91 | batched-push-timeout=40000
92 | ## Set muxer output width and height
93 | width=1280
94 | height=720
95 | ##Enable to maintain aspect ratio wrt source, and allow black borders, works
96 | ##along with width, height properties
97 | enable-padding=0
98 | nvbuf-memory-type=0
99 |
100 | # config-file property is mandatory for any gie section.
101 | # Other properties are optional and if set will override the properties set in
102 | # the infer config file.
103 | [primary-gie]
104 | enable=1
105 | gpu-id=0
106 | labelfile-path=labels.txt
107 | batch-size=16
108 | #Required by the app for OSD, not a plugin property
109 | bbox-border-color0=1;0;0;1
110 | bbox-border-color1=0;1;1;1
111 | bbox-border-color2=0;0;1;1
112 | bbox-border-color3=0;1;0;1
113 | interval=0
114 | gie-unique-id=1
115 | nvbuf-memory-type=0
116 | config-file=config_infer_primary_yoloV4.txt
117 | #config-file=config_infer_primary_yoloV7.txt
118 |
119 | [tracker]
120 | enable=0
121 | # For the NvDCF and DeepSORT trackers, tracker-width and tracker-height must each be a multiple of 32
122 | tracker-width=640
123 | tracker-height=384
124 | ll-lib-file=/opt/nvidia/deepstream/deepstream/lib/libnvds_nvmultiobjecttracker.so
125 | # ll-config-file required to set different tracker types
126 | # ll-config-file=/opt/nvidia/deepstream/deepstream/samples/configs/deepstream-app/config_tracker_IOU.yml
127 | ll-config-file=/opt/nvidia/deepstream/deepstream/samples/configs/deepstream-app/config_tracker_NvDCF_perf.yml
128 | # ll-config-file=/opt/nvidia/deepstream/deepstream/samples/configs/deepstream-app/config_tracker_NvDCF_accuracy.yml
129 | # ll-config-file=/opt/nvidia/deepstream/deepstream/samples/configs/deepstream-app/config_tracker_DeepSORT.yml
130 | gpu-id=0
131 | enable-batch-process=1
132 | enable-past-frame=1
133 | display-tracking-id=1
134 |
135 | [tests]
136 | file-loop=0
137 |
--------------------------------------------------------------------------------
/deepstream_yolo/labels.txt:
--------------------------------------------------------------------------------
1 | person
2 | bicycle
3 | car
4 | motorbike
5 | aeroplane
6 | bus
7 | train
8 | truck
9 | boat
10 | traffic light
11 | fire hydrant
12 | stop sign
13 | parking meter
14 | bench
15 | bird
16 | cat
17 | dog
18 | horse
19 | sheep
20 | cow
21 | elephant
22 | bear
23 | zebra
24 | giraffe
25 | backpack
26 | umbrella
27 | handbag
28 | tie
29 | suitcase
30 | frisbee
31 | skis
32 | snowboard
33 | sports ball
34 | kite
35 | baseball bat
36 | baseball glove
37 | skateboard
38 | surfboard
39 | tennis racket
40 | bottle
41 | wine glass
42 | cup
43 | fork
44 | knife
45 | spoon
46 | bowl
47 | banana
48 | apple
49 | sandwich
50 | orange
51 | broccoli
52 | carrot
53 | hot dog
54 | pizza
55 | donut
56 | cake
57 | chair
58 | sofa
59 | pottedplant
60 | bed
61 | diningtable
62 | toilet
63 | tvmonitor
64 | laptop
65 | mouse
66 | remote
67 | keyboard
68 | cell phone
69 | microwave
70 | oven
71 | toaster
72 | sink
73 | refrigerator
74 | book
75 | clock
76 | vase
77 | scissors
78 | teddy bear
79 | hair drier
80 | toothbrush
81 |
--------------------------------------------------------------------------------
/deepstream_yolo/nvdsinfer_custom_impl_Yolo/Makefile:
--------------------------------------------------------------------------------
1 | ################################################################################
2 | # SPDX-FileCopyrightText: Copyright (c) 2022 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
3 | # SPDX-License-Identifier: MIT
4 | #
5 | # Permission is hereby granted, free of charge, to any person obtaining a
6 | # copy of this software and associated documentation files (the "Software"),
7 | # to deal in the Software without restriction, including without limitation
8 | # the rights to use, copy, modify, merge, publish, distribute, sublicense,
9 | # and/or sell copies of the Software, and to permit persons to whom the
10 | # Software is furnished to do so, subject to the following conditions:
11 | #
12 | # The above copyright notice and this permission notice shall be included in
13 | # all copies or substantial portions of the Software.
14 | #
15 | # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
18 | # THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | # LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
20 | # FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
21 | # DEALINGS IN THE SOFTWARE.
22 | ################################################################################
23 |
24 | CC:= g++
25 | NVCC:=/usr/local/cuda/bin/nvcc
26 |
27 | CFLAGS:= -Wall -std=c++11 -shared -fPIC -Wno-error=deprecated-declarations
28 | CFLAGS+= -I/opt/nvidia/deepstream/deepstream/sources/includes/ -I/usr/local/cuda/include
29 |
30 | CUFLAGS:= -std=c++14 -shared
31 | CUFLAGS+= -I/opt/nvidia/deepstream/deepstream/sources/includes/ -I/usr/local/cuda/include
32 | LIBS:= -lnvinfer_plugin -lnvinfer -lnvparsers -L/usr/local/cuda/lib64 -lcudart -lcublas -lstdc++fs
33 | LFLAGS:= -shared -Wl,--start-group $(LIBS) -Wl,--end-group
34 |
35 | INCS:= $(wildcard *.h)
36 | SRCFILES:= nvdsparsebbox_Yolo.cpp\
37 | nvdsparsebbox_Yolo_cuda.cu
38 |
39 | TARGET_LIB:= libnvdsinfer_custom_impl_Yolo.so
40 |
41 | TARGET_OBJS:= $(SRCFILES:.cpp=.o)
42 | TARGET_OBJS:= $(TARGET_OBJS:.cu=.o)
43 |
44 | all: $(TARGET_LIB)
45 |
46 | %.o: %.cpp $(INCS) Makefile
47 | $(CC) -c -o $@ $(CFLAGS) $<
48 |
49 | %.o: %.cu $(INCS) Makefile
50 | $(NVCC) -c -o $@ --compiler-options '-fPIC' $(CUFLAGS) $<
51 |
52 | $(TARGET_LIB) : $(TARGET_OBJS)
53 | $(CC) -o $@ $(TARGET_OBJS) $(LFLAGS)
54 |
55 | clean:
56 | rm -rf $(TARGET_LIB) *.o
57 |
--------------------------------------------------------------------------------
/deepstream_yolo/nvdsinfer_custom_impl_Yolo/nvdsparsebbox_Yolo.cpp:
--------------------------------------------------------------------------------
1 | /*
2 | * SPDX-FileCopyrightText: Copyright (c) 2022 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
3 | * SPDX-License-Identifier: MIT
4 | *
5 | * Permission is hereby granted, free of charge, to any person obtaining a
6 | * copy of this software and associated documentation files (the "Software"),
7 | * to deal in the Software without restriction, including without limitation
8 | * the rights to use, copy, modify, merge, publish, distribute, sublicense,
9 | * and/or sell copies of the Software, and to permit persons to whom the
10 | * Software is furnished to do so, subject to the following conditions:
11 | *
12 | * The above copyright notice and this permission notice shall be included in
13 | * all copies or substantial portions of the Software.
14 | *
15 | * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
18 | * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
20 | * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
21 | * DEALINGS IN THE SOFTWARE.
22 | */
23 |
24 |
25 | #include <algorithm>
26 | #include <cassert>
27 | #include <cmath>
28 | #include <cstring>
29 | #include <fstream>
30 | #include <iostream>
31 | #include <unordered_map>
32 | #include "nvdsinfer_custom_impl.h"
33 |
34 | static const int NUM_CLASSES_YOLO = 80;
35 |
36 | float clamp(const float val, const float minVal, const float maxVal)
37 | {
38 | assert(minVal <= maxVal);
39 | return std::min(maxVal, std::max(minVal, val));
40 | }
41 |
42 | extern "C" bool NvDsInferParseCustomYoloV4(
43 | std::vector const& outputLayersInfo,
44 | NvDsInferNetworkInfo const& networkInfo,
45 | NvDsInferParseDetectionParams const& detectionParams,
46 | std::vector& objectList);
47 |
48 | extern "C" bool NvDsInferParseCustomYoloV7(
49 | std::vector const& outputLayersInfo,
50 | NvDsInferNetworkInfo const& networkInfo,
51 | NvDsInferParseDetectionParams const& detectionParams,
52 | std::vector& objectList);
53 |
54 | /* YOLOv4 implementations */
55 | static NvDsInferParseObjectInfo convertBBoxYoloV4(const float& bx1, const float& by1, const float& bx2,
56 | const float& by2, const uint& netW, const uint& netH)
57 | {
58 | NvDsInferParseObjectInfo b;
59 | // Restore coordinates to network input resolution
60 |
61 | float x1 = bx1 * netW;
62 | float y1 = by1 * netH;
63 | float x2 = bx2 * netW;
64 | float y2 = by2 * netH;
65 |
66 | x1 = clamp(x1, 0, netW);
67 | y1 = clamp(y1, 0, netH);
68 | x2 = clamp(x2, 0, netW);
69 | y2 = clamp(y2, 0, netH);
70 |
71 | b.left = x1;
72 | b.width = clamp(x2 - x1, 0, netW);
73 | b.top = y1;
74 | b.height = clamp(y2 - y1, 0, netH);
75 |
76 | return b;
77 | }
78 |
79 | static void addBBoxProposalYoloV4(const float bx, const float by, const float bw, const float bh,
80 | const uint& netW, const uint& netH, const int maxIndex,
81 |     const float maxProb, std::vector<NvDsInferParseObjectInfo>& binfo)
82 | {
83 | NvDsInferParseObjectInfo bbi = convertBBoxYoloV4(bx, by, bw, bh, netW, netH);
84 | if (bbi.width < 1 || bbi.height < 1) return;
85 |
86 | bbi.detectionConfidence = maxProb;
87 | bbi.classId = maxIndex;
88 | binfo.push_back(bbi);
89 | }
90 |
91 | static std::vector<NvDsInferParseObjectInfo>
92 | decodeYoloV4Tensor(
93 | const float* boxes, const float* scores,
94 | const uint num_bboxes, NvDsInferParseDetectionParams const& detectionParams,
95 | const uint& netW, const uint& netH)
96 | {
97 |     std::vector<NvDsInferParseObjectInfo> binfo;
98 |
99 | uint bbox_location = 0;
100 | uint score_location = 0;
101 | for (uint b = 0; b < num_bboxes; ++b)
102 | {
103 | float bx1 = boxes[bbox_location];
104 | float by1 = boxes[bbox_location + 1];
105 | float bx2 = boxes[bbox_location + 2];
106 | float by2 = boxes[bbox_location + 3];
107 |
108 | float maxProb = 0.0f;
109 | int maxIndex = -1;
110 |
111 | for (uint c = 0; c < detectionParams.numClassesConfigured; ++c)
112 | {
113 | float prob = scores[score_location + c];
114 | if (prob > maxProb)
115 | {
116 | maxProb = prob;
117 | maxIndex = c;
118 | }
119 | }
120 |
121 | if (maxProb > detectionParams.perClassPreclusterThreshold[maxIndex])
122 | {
123 | addBBoxProposalYoloV4(bx1, by1, bx2, by2, netW, netH, maxIndex, maxProb, binfo);
124 | }
125 |
126 | bbox_location += 4;
127 | score_location += detectionParams.numClassesConfigured;
128 | }
129 |
130 | return binfo;
131 | }
132 |
133 | extern "C" bool NvDsInferParseCustomYoloV4(
134 | std::vector const& outputLayersInfo,
135 | NvDsInferNetworkInfo const& networkInfo,
136 | NvDsInferParseDetectionParams const& detectionParams,
137 | std::vector& objectList)
138 | {
139 | if (NUM_CLASSES_YOLO != detectionParams.numClassesConfigured)
140 | {
141 | std::cerr << "WARNING: Num classes mismatch. Configured:"
142 | << detectionParams.numClassesConfigured
143 | << ", detected by network: " << NUM_CLASSES_YOLO << std::endl;
144 | }
145 |
146 |     std::vector<NvDsInferParseObjectInfo> objects;
147 |
148 | const NvDsInferLayerInfo &boxes = outputLayersInfo[0]; // num_boxes x 4
149 | const NvDsInferLayerInfo &scores = outputLayersInfo[1]; // num_boxes x num_classes
150 |
151 | // 3 dimensional: [num_boxes, 1, 4]
152 | assert(boxes.inferDims.numDims == 3);
153 | // 2 dimensional: [num_boxes, num_classes]
154 | assert(scores.inferDims.numDims == 2);
155 |
156 | // The second dimension should be num_classes
157 | assert(detectionParams.numClassesConfigured == scores.inferDims.d[1]);
158 |
159 | uint num_bboxes = boxes.inferDims.d[0];
160 |
161 | // std::cout << "Network Info: " << networkInfo.height << " " << networkInfo.width << std::endl;
162 |
163 |     std::vector<NvDsInferParseObjectInfo> outObjs =
164 | decodeYoloV4Tensor(
165 | (const float*)(boxes.buffer), (const float*)(scores.buffer), num_bboxes, detectionParams,
166 | networkInfo.width, networkInfo.height);
167 |
168 | objects.insert(objects.end(), outObjs.begin(), outObjs.end());
169 |
170 | objectList = objects;
171 |
172 | return true;
173 | }
174 | /* YOLOv4 implementations end*/
175 |
176 | /*Yolov7 bbox parser*/
177 | static NvDsInferParseObjectInfo convertBBoxYoloV7(const float& bx, const float& by, const float& bw,
178 | const float& bh, const int& stride, const uint& netW,
179 | const uint& netH)
180 | {
181 | NvDsInferParseObjectInfo b;
182 | // Restore coordinates to network input resolution
183 | float xCenter = bx * stride;
184 | float yCenter = by * stride;
185 | float x0 = xCenter - bw / 2;
186 | float y0 = yCenter - bh / 2;
187 | float x1 = x0 + bw;
188 | float y1 = y0 + bh;
189 |
190 | x0 = clamp(x0, 0, netW);
191 | y0 = clamp(y0, 0, netH);
192 | x1 = clamp(x1, 0, netW);
193 | y1 = clamp(y1, 0, netH);
194 |
195 | b.left = x0;
196 | b.width = clamp(x1 - x0, 0, netW);
197 | b.top = y0;
198 | b.height = clamp(y1 - y0, 0, netH);
199 |
200 | return b;
201 | }
202 |
203 | static void addBBoxProposalYoloV7(const float bx, const float by, const float bw, const float bh,
204 | const uint stride, const uint& netW, const uint& netH, const int maxIndex,
206 |     const float maxProb, std::vector<NvDsInferParseObjectInfo>& binfo)
206 | {
207 | NvDsInferParseObjectInfo bbi = convertBBoxYoloV7(bx, by, bw, bh, stride, netW, netH);
208 | if (bbi.width < 1 || bbi.height < 1) return;
209 |
210 | bbi.detectionConfidence = maxProb;
211 | bbi.classId = maxIndex;
212 | binfo.push_back(bbi);
213 | }
214 |
215 | static bool NvDsInferParseYoloV7(
216 |     std::vector<NvDsInferLayerInfo> const& outputLayersInfo,
217 |     NvDsInferNetworkInfo const& networkInfo,
218 |     NvDsInferParseDetectionParams const& detectionParams,
219 |     std::vector<NvDsInferParseObjectInfo>& objectList)
220 | {
221 |
222 |
223 | if (outputLayersInfo.empty()) {
224 | std::cerr << "Could not find output layer in bbox parsing" << std::endl;;
225 | return false;
226 | }
227 | const NvDsInferLayerInfo &layer = outputLayersInfo[0];
228 |
229 | if (NUM_CLASSES_YOLO != detectionParams.numClassesConfigured)
230 | {
231 | std::cerr << "WARNING: Num classes mismatch. Configured:"
232 | << detectionParams.numClassesConfigured
233 | << ", detected by network: " << NUM_CLASSES_YOLO << std::endl;
234 | }
235 |
236 |     std::vector<NvDsInferParseObjectInfo> objects;
237 |
238 | float* data = (float*)layer.buffer;
239 | const int dimensions = layer.inferDims.d[1];
240 | int rows = layer.inferDims.numElements / layer.inferDims.d[1];
241 |
242 | for (int i = 0; i < rows; ++i) {
243 | //85 = x, y, w, h, maxProb, score0......score79
244 | float bx = data[ 0];
245 | float by = data[ 1];
246 | float bw = data[ 2];
247 | float bh = data[ 3];
248 | float maxProb = data[ 4];
249 |         int maxIndex = 0;
250 | float * classes_scores = data + 5;
251 |
252 | float maxScore = 0;
253 | int index = 0;
254 | for (int j = 0 ;j < NUM_CLASSES_YOLO; j++){
255 | if(*classes_scores > maxScore){
256 | index = j;
257 | maxScore = *classes_scores;
258 | }
259 | classes_scores++;
260 | }
261 |
262 | maxIndex = index;
263 | data += dimensions;
264 |
265 | addBBoxProposalYoloV7(bx, by, bw, bh, 1, networkInfo.width, networkInfo.height, maxIndex, maxProb, objects);
266 | }
267 | objectList = objects;
268 | return true;
269 | }
270 |
271 | extern "C" bool NvDsInferParseCustomYoloV7(
272 | std::vector const& outputLayersInfo,
273 | NvDsInferNetworkInfo const& networkInfo,
274 | NvDsInferParseDetectionParams const& detectionParams,
275 | std::vector& objectList)
276 | {
277 | return NvDsInferParseYoloV7 (
278 | outputLayersInfo, networkInfo, detectionParams, objectList);
279 | }
280 |
281 | /* Check that the custom function has been defined correctly */
282 | CHECK_CUSTOM_PARSE_FUNC_PROTOTYPE(NvDsInferParseCustomYoloV4);
283 | CHECK_CUSTOM_PARSE_FUNC_PROTOTYPE(NvDsInferParseCustomYoloV7);
284 |
--------------------------------------------------------------------------------
/deepstream_yolo/nvdsinfer_custom_impl_Yolo/nvdsparsebbox_Yolo_cuda.cu:
--------------------------------------------------------------------------------
1 | /*
2 | * SPDX-FileCopyrightText: Copyright (c) 2022 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
3 | * SPDX-License-Identifier: MIT
4 | *
5 | * Permission is hereby granted, free of charge, to any person obtaining a
6 | * copy of this software and associated documentation files (the "Software"),
7 | * to deal in the Software without restriction, including without limitation
8 | * the rights to use, copy, modify, merge, publish, distribute, sublicense,
9 | * and/or sell copies of the Software, and to permit persons to whom the
10 | * Software is furnished to do so, subject to the following conditions:
11 | *
12 | * The above copyright notice and this permission notice shall be included in
13 | * all copies or substantial portions of the Software.
14 | *
15 | * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
18 | * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
20 | * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
21 | * DEALINGS IN THE SOFTWARE.
22 | */
23 |
24 | #include <algorithm>
25 | #include <cassert>
26 | #include <cmath>
27 | #include <cstring>
28 | #include <fstream>
29 | #include <iostream>
30 | #include <unordered_map>
31 | #include "nvdsinfer_custom_impl.h"
32 | #include "nvtx3/nvToolsExt.h"
33 | #include <thrust/device_vector.h>
34 | #include <thrust/copy.h>
35 |
36 | static const int NUM_CLASSES_YOLO = 80;
37 | #define OBJECTLISTSIZE 25200
38 | #define BLOCKSIZE 1024
39 | thrust::device_vector<NvDsInferParseObjectInfo> objects_v(OBJECTLISTSIZE);
40 |
41 | extern "C" bool NvDsInferParseCustomYoloV7_cuda(
42 | std::vector const& outputLayersInfo,
43 | NvDsInferNetworkInfo const& networkInfo,
44 | NvDsInferParseDetectionParams const& detectionParams,
45 | std::vector& objectList);
46 |
47 |
48 | __global__ void decodeYoloV7Tensor_cuda(NvDsInferParseObjectInfo *binfo/*output*/, float* data, int dimensions, int rows,
49 | int netW, int netH, float Threshold){
50 | int idx = blockIdx.x * blockDim.x + threadIdx.x;
51 | if(idx < rows) {
52 | data = data + idx * dimensions;
53 | float maxProb = data[ 4];
54 | //maxProb < Threshold, directly return
55 | if(maxProb < Threshold){
56 | binfo[idx].detectionConfidence = 0.0;
57 | return;
58 | }
59 | float bx = data[ 0];
60 | float by = data[ 1];
61 | float bw = data[ 2];
62 | float bh = data[ 3];
63 | int maxIndex = 0;
64 | float * classes_scores = (float *)(data + 5);
65 | float maxScore = 0;
66 | int index = 0;
67 |
68 | #pragma unroll
69 | for (int j = 0 ;j < NUM_CLASSES_YOLO; j++){
70 | if(*classes_scores > maxScore){
71 | index = j;
72 | maxScore = *classes_scores;
73 | }
74 | classes_scores++;
75 | }
76 | if(maxProb * maxScore < Threshold){
77 | binfo[idx].detectionConfidence = 0.0;
78 | return;
79 | }
80 | maxIndex = index;
81 | float stride = 1.0;
82 | float xCenter = bx * stride;
83 | float yCenter = by * stride;
84 | float x0 = xCenter - bw / 2.0;
85 | float y0 = yCenter - bh / 2.0;
86 | float x1 = x0 + bw;
87 | float y1 = y0 + bh;
88 | x0 = fminf(float(netW), fmaxf(float(0.0), x0));
89 | y0 = fminf(float(netH), fmaxf(float(0.0), y0));
90 | x1 = fminf(float(netW), fmaxf(float(0.0), x1));
91 | y1 = fminf(float(netH), fmaxf(float(0.0), y1));
92 | binfo[idx].left = x0;
93 | binfo[idx].top = y0;
94 | binfo[idx].width = fminf(float(netW), fmaxf(float(0.0), x1-x0));
95 | binfo[idx].height = fminf(float(netH), fmaxf(float(0.0), y1-y0));
96 | binfo[idx].detectionConfidence = maxProb * maxScore;
97 | binfo[idx].classId = maxIndex;
98 | }
99 | return;
100 | }
101 | static bool NvDsInferParseYoloV7_cuda(
102 |     std::vector<NvDsInferLayerInfo> const& outputLayersInfo,
103 |     NvDsInferNetworkInfo const& networkInfo,
104 |     NvDsInferParseDetectionParams const& detectionParams,
105 |     std::vector<NvDsInferParseObjectInfo>& objectList)
106 | {
107 |
108 | if (outputLayersInfo.empty()) {
109 |         std::cerr << "Could not find output layer in bbox parsing" << std::endl;
110 | return false;
111 | }
112 | const NvDsInferLayerInfo &layer = outputLayersInfo[0];
113 |
114 | if (NUM_CLASSES_YOLO != detectionParams.numClassesConfigured)
115 | {
116 | std::cerr << "WARNING: Num classes mismatch. Configured:"
117 | << detectionParams.numClassesConfigured
118 | << ", detected by network: " << NUM_CLASSES_YOLO << std::endl;
119 | }
120 |
121 | float* data = (float*)layer.buffer;
122 | const int dimensions = layer.inferDims.d[1];
123 | int rows = layer.inferDims.numElements / layer.inferDims.d[1];
124 |
125 |     int GRIDSIZE = ((OBJECTLISTSIZE - 1) / BLOCKSIZE) + 1;
126 |     // find the minimum per-class pre-cluster threshold
127 |     float min_PreclusterThreshold = *(std::min_element(detectionParams.perClassPreclusterThreshold.begin(),
128 |         detectionParams.perClassPreclusterThreshold.end()));
129 |     decodeYoloV7Tensor_cuda<<<GRIDSIZE, BLOCKSIZE>>>
130 |         (thrust::raw_pointer_cast(objects_v.data()), data, dimensions, rows, networkInfo.width,
131 |         networkInfo.height, min_PreclusterThreshold);
132 |     objectList.resize(OBJECTLISTSIZE);
133 |     thrust::copy(objects_v.begin(), objects_v.end(), objectList.begin()); // device-to-host copy, same effect as cudaMemcpy
134 |
135 | return true;
136 | }
137 |
138 | extern "C" bool NvDsInferParseCustomYoloV7_cuda(
139 |     std::vector<NvDsInferLayerInfo> const& outputLayersInfo,
140 |     NvDsInferNetworkInfo const& networkInfo,
141 |     NvDsInferParseDetectionParams const& detectionParams,
142 |     std::vector<NvDsInferParseObjectInfo>& objectList)
143 | {
144 | nvtxRangePush("NvDsInferParseYoloV7");
145 | bool ret = NvDsInferParseYoloV7_cuda (
146 | outputLayersInfo, networkInfo, detectionParams, objectList);
147 |
148 | nvtxRangePop();
149 | return ret;
150 | }
151 |
152 | /* Check that the custom function has been defined correctly */
153 | CHECK_CUSTOM_PARSE_FUNC_PROTOTYPE(NvDsInferParseCustomYoloV7_cuda);
154 |
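For reference, a minimal sketch of how this parser is typically wired into a DeepStream `nvinfer` configuration. The function name comes from this file; the engine and library paths below are placeholders for illustration (see `config_infer_primary_yoloV7.txt` in `deepstream_yolo` for the shipped configuration):

```txt
[property]
# Placeholder paths -- point these at your own engine and compiled parser lib
model-engine-file=yolov7.engine
custom-lib-path=nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so
parse-bbox-func-name=NvDsInferParseCustomYoloV7_cuda
num-detected-classes=80
```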
--------------------------------------------------------------------------------
/tensorrt_yolov4/Makefile:
--------------------------------------------------------------------------------
1 | ################################################################################
2 | # SPDX-FileCopyrightText: Copyright (c) 2022 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
3 | # SPDX-License-Identifier: MIT
4 | #
5 | # Permission is hereby granted, free of charge, to any person obtaining a
6 | # copy of this software and associated documentation files (the "Software"),
7 | # to deal in the Software without restriction, including without limitation
8 | # the rights to use, copy, modify, merge, publish, distribute, sublicense,
9 | # and/or sell copies of the Software, and to permit persons to whom the
10 | # Software is furnished to do so, subject to the following conditions:
11 | #
12 | # The above copyright notice and this permission notice shall be included in
13 | # all copies or substantial portions of the Software.
14 | #
15 | # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
18 | # THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | # LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
20 | # FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
21 | # DEALINGS IN THE SOFTWARE.
22 | ################################################################################
23 |
24 |
25 | SHELL=/bin/bash -o pipefail
26 | TARGET?=$(shell uname -m)
27 | LIBDIR?=lib
28 | VERBOSE?=0
29 | ifeq ($(VERBOSE), 1)
30 | AT=
31 | else
32 | AT=@
33 | endif
34 | CUDA_TRIPLE=x86_64-linux
35 | CUBLAS_TRIPLE=x86_64-linux-gnu
36 | DLSW_TRIPLE=x86_64-linux-gnu
37 | ifeq ($(TARGET), aarch64)
38 | CUDA_TRIPLE=aarch64-linux
39 | CUBLAS_TRIPLE=aarch64-linux-gnu
40 | DLSW_TRIPLE=aarch64-linux-gnu
41 | endif
42 | ifeq ($(TARGET), qnx)
43 | CUDA_TRIPLE=aarch64-qnx
44 | CUBLAS_TRIPLE=aarch64-qnx-gnu
45 | DLSW_TRIPLE=aarch64-unknown-nto-qnx
46 | endif
47 | ifeq ($(TARGET), ppc64le)
48 | CUDA_TRIPLE=ppc64le-linux
49 | CUBLAS_TRIPLE=ppc64le-linux
50 | DLSW_TRIPLE=ppc64le-linux
51 | endif
52 | ifeq ($(TARGET), android64)
53 | DLSW_TRIPLE=aarch64-linux-androideabi
54 | CUDA_TRIPLE=$(DLSW_TRIPLE)
55 | CUBLAS_TRIPLE=$(DLSW_TRIPLE)
56 | endif
57 | export TARGET
58 | export VERBOSE
59 | export LIBDIR
60 | export CUDA_TRIPLE
61 | export CUBLAS_TRIPLE
62 | export DLSW_TRIPLE
63 |
64 | ifeq ($(SAFE_PDK), 1)
65 | # Only dlaSafetyRuntime is currently able to execute with safety pdk.
66 | samples = dlaSafetyRuntime
67 | else
68 | samples = sampleAlgorithmSelector sampleCharRNN sampleDynamicReshape sampleFasterRCNN sampleGoogleNet sampleINT8 sampleINT8API sampleMLP sampleMNIST sampleMNISTAPI sampleNMT sampleMovieLens sampleOnnxMNIST sampleUffPluginV2Ext sampleReformatFreeIO sampleSSD sampleUffFasterRCNN sampleUffMaskRCNN sampleUffMNIST sampleUffSSD trtexec samplePlugin
69 |
70 |
71 | # sampleMovieLensMPS should only be compiled for Linux targets.
72 | # sample uses Linux specific shared memory and IPC libraries.
73 | ifeq ($(TARGET),x86_64)
74 | samples += sampleMovieLensMPS
75 | endif
76 |
77 | # sampleNvmedia/dlaSafetyRuntime/dlaSafetyBuilder should only be compiled with DLA enabled.
78 | ifeq ($(ENABLE_DLA),1)
79 | samples += sampleNvmedia
80 | samples += dlaSafetyRuntime
81 | samples += dlaSafetyBuilder
82 | endif
83 | endif
84 |
85 | .PHONY: all clean help
86 | all:
87 | $(AT)$(foreach sample,$(samples), $(MAKE) -C $(sample) &&) :
88 |
89 | clean:
90 | $(AT)$(foreach sample,$(samples), $(MAKE) clean -C $(sample) &&) :
91 |
92 | help:
93 | $(AT)echo "Sample building help menu."
94 | $(AT)echo "Samples:"
95 | $(AT)$(foreach sample,$(samples), echo -e "\t$(sample)" &&) :
96 | $(AT)echo -e "\nCommands:"
97 | $(AT)echo -e "\tall - build all samples."
98 | $(AT)echo -e "\tclean - clean all samples."
99 | $(AT)echo -e "\nVariables:"
100 | $(AT)echo -e "\tTARGET - Specify the target to build for."
101 | $(AT)echo -e "\tVERBOSE - Specify verbose output."
102 | 	$(AT)echo -e "\tCUDA_INSTALL_DIR - Directory where CUDA is installed."
103 |
--------------------------------------------------------------------------------
/tensorrt_yolov4/README.md:
--------------------------------------------------------------------------------
1 | # YOLOv4 Multi-Task Standalone Program
2 |
3 | ## 1. Contents
4 |
5 | - **`common`** Some common code dependencies and utilities
6 | - **`source`** Source code of the standalone program
7 |   - `main.cpp`: Program main entrance, where parameters are configured
8 |   - `SampleYolo.hpp`: YOLOv4 inference class definition file
9 |   - `SampleYolo.cpp`: YOLOv4 inference class functions definition file
10 |   - `onnx_add_nms_plugin.py`: Python script to add the BatchedNMSPlugin node into the ONNX model
11 |   - `generate_coco_image_list.py`: Python script to get the list of image names from an MS COCO annotation or information file
12 |
13 | - **`data`** This directory saves:
14 | - `yolov4.onnx`: the ONNX model (User generated)
15 | - `yolov4.engine`: the TensorRT engine model (would be generated by this program)
16 | - `demo.jpg`: The demo image (Already exists)
17 | - `demo_out.jpg`: Image detection output of the demo image (Already exists, but would be renewed by the program)
18 | - `names.txt`: MS COCO dataset label names (have to be downloaded or generated via COCO API)
19 | - `categories.txt`: MS COCO dataset categories where IDs and names are separated by `"\t"` (have to be generated via COCO API)
20 | - `val2017.txt`: MS COCO validation set image list (have to be generated from corresponding COCO annotation file)
21 | - `valdev2017.txt`: MS COCO test set image list (have to be generated from corresponding COCO annotation file)
22 | - `coco_result.json`: MS COCO dataset output (would be generated by this program)
23 |
24 |
25 | ## 2. Prerequisites before building & running the YOLOv4 standalone program ##
26 |
27 | ### 2.1 Download TensorRT (7.1 or higher; skip this step if TensorRT 7.1 is already installed) ###
28 |
29 | - Download TensorRT from the NVIDIA developer page: <https://developer.nvidia.com/tensorrt>
30 | - Install or unpack the deb file or tar file.
31 |
32 | ### 2.2 Download and build TensorRT OSS ###
33 |
34 | - Refer to the README files in the TensorRT OSS repository: <https://github.com/NVIDIA/TensorRT>
35 |   - Follow the Jetson-specific build instructions there if you are working on a Jetson platform
36 |   - Follow the x86 build instructions there if you are working on an x86 platform
37 |
38 | - Follow the guidance in the README to clone the repository and build `libnvinfer_plugin.so.7.x.x`
39 |
40 | - Rename `<TensorRT install dir>/lib/libnvinfer_plugin.so.7.x.x` to `<TensorRT install dir>/lib/libnvinfer_plugin.so.7.x.x.back`
41 |
42 | - Copy `<TensorRT OSS dir>/build/out/libnvinfer_plugin.so.7.x.x` into `<TensorRT install dir>/lib`, for example as shown below.
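
A minimal sketch of this plugin swap, assuming `<TensorRT install dir>` and `<TensorRT OSS dir>` stand in for your actual paths and `7.x.x` for your actual version:

```sh
# Back up the stock plugin library, then drop in the freshly built OSS one
mv <TensorRT install dir>/lib/libnvinfer_plugin.so.7.x.x \
   <TensorRT install dir>/lib/libnvinfer_plugin.so.7.x.x.back
cp <TensorRT OSS dir>/build/out/libnvinfer_plugin.so.7.x.x \
   <TensorRT install dir>/lib/
```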
43 |
44 | ### 2.3 Generate YOLOv4 ONNX model with BatchedNMSPlugin node included ###
45 |
46 | #### Step 1: Generate the YOLOv4 ONNX model (`CSPDarknet-53 CNN + YOLO header CNN + YOLO layers`) ####
47 |
48 | - One of the well-known YOLOv4 PyTorch repositories can guide you through generating an ONNX model of YOLOv4.
49 | You can convert the pretrained DarkNet model into ONNX directly; alternatively, you can 1) convert the DarkNet model into PyTorch, 2) train the PyTorch model on your own dataset, and 3) then convert it into ONNX.
50 |
51 | - Other well-known YOLOv4 PyTorch repositories can serve as references.
56 |
57 |
58 | #### Step 2: Add the BatchedNMSPlugin node into the YOLOv4 ONNX model (`CSPDarknet-53 CNN + YOLO header CNN + YOLO layers + BatchedNMSPlugin`)
59 |
60 | **How do I add the `BatchedNMSPlugin` node into the ONNX model?**
61 |
62 | - Open `source/onnx_add_nms_plugin.py`
63 |
64 | - Update attribute values to suit your model
65 |
66 | Example:
67 | ```py
68 | attrs["shareLocation"] = 1
69 | attrs["backgroundLabelId"] = -1
70 | attrs["numClasses"] = 80
71 | attrs["topK"] = topK # from program arguments
72 | attrs["keepTopK"] = keepTopK # from program arguments
73 | attrs["scoreThreshold"] = 0.3
74 | attrs["iouThreshold"] = 0.6
75 | attrs["isNormalized"] = 1
76 | attrs["clipBoxes"] = 1
77 | ```
78 |
79 | - Copy `onnx_add_nms_plugin.py` into `<TensorRT OSS dir>/tools/onnx-graphsurgeon`
80 |
81 | - Go to `<TensorRT OSS dir>/tools/onnx-graphsurgeon` and execute `onnx_add_nms_plugin.py`
82 |
83 | ```sh
84 | cd <TensorRT OSS dir>/tools/onnx-graphsurgeon
85 | python onnx_add_nms_plugin.py -f <ONNX model file> -t <topK> -k <keepTopK>
86 | ```
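
For orientation, here is a hedged sketch of what `onnx_add_nms_plugin.py` does with ONNX GraphSurgeon. The input tensor wiring is an assumption (the real script derives the box and score tensors from your model), so treat this as illustration rather than a drop-in replacement:

```py
import numpy as np
import onnx
import onnx_graphsurgeon as gs

graph = gs.import_onnx(onnx.load("yolov4.onnx"))

attrs = {
    "shareLocation": 1, "backgroundLabelId": -1, "numClasses": 80,
    "topK": 2000, "keepTopK": 1000, "scoreThreshold": 0.3,
    "iouThreshold": 0.6, "isNormalized": 1, "clipBoxes": 1,
}

# BatchedNMS_TRT emits four tensors; TensorRT resolves their shapes
nms_outputs = [
    gs.Variable("num_detections", dtype=np.int32),
    gs.Variable("nmsed_boxes", dtype=np.float32),
    gs.Variable("nmsed_scores", dtype=np.float32),
    gs.Variable("nmsed_classes", dtype=np.float32),
]

# Assumes the model's existing outputs are the box and score tensors,
# in that order -- adjust to match your exported model
graph.nodes.append(gs.Node(op="BatchedNMS_TRT", attrs=attrs,
                           inputs=graph.outputs, outputs=nms_outputs))
graph.outputs = nms_outputs
graph.cleanup().toposort()
onnx.save(gs.export_onnx(graph), "yolov4_nms.onnx")
```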
87 |
88 | ## 3. How can I build and run the YOLOv4 standalone program? ##
89 |
90 | ### 3.1 Add common source code includes ###
91 |
92 | - This YOLOv4 standalone sample depends on the same common includes as the other C++ samples of TensorRT.
93 | - Option 1: Add a symbolic link to `<TensorRT install dir>/TensorRT-7.1.x.x/samples/common` in `tensorrt_yolov4`
94 | ```
95 | cd <path to>/yolov4_sample/tensorrt_yolov4
96 | ln -s <TensorRT install dir>/TensorRT-7.1.x.x/samples/common common
97 | ```
98 | - Option 2: Simply copy the common includes into `tensorrt_yolov4`
99 | ```
100 | cd <path to>/yolov4_sample/tensorrt_yolov4
101 | cp -r <TensorRT install dir>/TensorRT-7.1.x.x/samples/common ./
102 | ```
103 |
104 | ### 3.2 OpenCV dependencies ###
105 |
106 | - Note: this program depends on OpenCV. Please check that OpenCV headers exist under `/usr/include/opencv` and that OpenCV libraries such as `-lopencv_core` and `-lopencv_imgproc` are installed.
107 |
108 | - Follow the README and documents of the OpenCV repository (<https://github.com/opencv/opencv>) to install OpenCV if the corresponding includes and libraries do not exist.
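
On Ubuntu, for example, the distribution's prebuilt development packages are usually sufficient for this sample:

```sh
# Installs OpenCV headers and libraries (core, imgproc, imgcodecs, ...)
sudo apt-get install libopencv-dev
```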
109 |
110 | ### 3.3 Compile and build ###
111 |
112 |
113 | ```sh
114 | cd <path to>/yolov4_sample/tensorrt_yolov4/source
115 | make clean
116 | make -j
117 | ```
118 |
119 |
120 | ### 3.4 Basic program parameters ###
121 |
122 | - Step 1: Use a text editor to open `main.cpp` in `tensorrt_yolov4/source`
123 |
124 | - Step 2: Go to where the function `initializeSampleParams()` is defined
125 |
126 | - Step 3: You will find some basic configurations in `initializeSampleParams()`, like the following:
127 |
128 | ```cpp
129 | // This argument is for int8 calibration
130 | // Int8 calibration is not available yet
131 | // You have to prepare samples for int8 calibration yourself
132 | params.nbCalBatches = 80;
133 |
134 | // The engine file to generate or to load
135 | // The engine file does not exist:
136 | // This program will try to load onnx file and convert onnx into engine
137 | // The engine file exists:
138 | // This program will load the engine file directly
139 | params.engingFileName = "../data/yolov4.engine";
140 |
141 | // The onnx file to load
142 | params.onnxFileName = "../data/yolov4.onnx";
143 |
144 | // Input tensor name of ONNX file & engine file
145 | params.inputTensorNames.push_back("input");
146 |
147 | // Legacy batch size; it is zero when the explicitBatch flag is set on the TensorRT engine
148 | // May be deprecated in the future
149 | params.batchSize = 0;
150 |
151 | // Number of classes (usually 80, but can be other values)
152 | params.outputClsSize = 80;
153 |
154 | // topK parameter of BatchedNMSPlugin
155 | params.topK = 2000;
156 |
157 | // keepTopK parameter of BatchedNMSPlugin
158 | params.keepTopK = 1000;
159 |
160 | // Batch size; you can change it to other values if needed
161 | params.explicitBatchSize = 1;
162 |
163 | params.inputImageName = "../data/demo.jpg";
164 | params.cocoClassNamesFileName = "../data/coco.names";
165 | params.cocoClassIDFileName = "../data/categories.txt";
166 |
167 | // Configure the number of DLA cores; -1 if there is no DLA core
168 | params.dlaCore = -1;
169 | ```
170 |
171 | - Step 4: Copy and rename the ONNX file (with the `BatchedNMSPlugin` node included) to the location defined in `initializeSampleParams()`
172 |
173 |
174 | ### 3.5 Run this program to convert ONNX file into Engine file ###
175 |
176 | - This program will automatically convert the ONNX model into an engine if the engine file does not exist.
177 | - Command:
178 |   - To generate an engine in fp32 mode:
179 | ```
180 | ../bin/yolov4
181 | ```
182 |   - To generate an engine in fp16 mode:
183 | ```
184 | ../bin/yolov4 --fp16
185 | ```
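
As an alternative to letting the sample build the engine, `trtexec` (shipped with TensorRT) can usually build one from the same ONNX file. The paths below are assumptions matching the defaults in `initializeSampleParams()`, and the OSS `libnvinfer_plugin` with BatchedNMSPlugin must be the one loaded at run time:

```sh
trtexec --onnx=../data/yolov4.onnx --saveEngine=../data/yolov4.engine --fp16
```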
186 |
187 | ### 3.6 Specific program parameters for `demo` mode, `speed` mode and `coco` mode ###
188 |
189 | #### 3.6.1 To run this program in `demo` mode
190 |
191 | - Command:
192 |
193 | ```
194 | ../bin/yolov4 --demo
195 | ```
196 |
197 | - This program will feed the demo image into the YOLOv4 engine and write the detection output as an image.
198 | - Please make sure `params.demo = 1` if you want to run this program in demo mode.
199 |
200 | ```cpp
201 | // Configurations to run a demo image
202 | params.demo = 1;
203 | params.outputImageName = "../data/demo_out.jpg";
204 | ```
205 |
206 | #### 3.6.2 To run this program in `speed` mode
207 |
208 | - Command:
209 |
210 | ```
211 | ../bin/yolov4 --speed
212 | ```
213 |
214 | - This program will repeatedly feed the demo image into the engine and accumulate the time consumed by each iteration
215 | - Please make sure `params.speedTest = 1` if you want to run this program in speed mode
216 |
217 | ```cpp
218 | // Configurations to run speed test
219 | params.speedTest = 1;
220 | params.speedTestItrs = 1000;
221 | ```
222 |
223 | #### 3.6.3 To run this program in `coco` mode
224 |
225 | - Command:
226 |
227 | ```
228 | ../bin/yolov4 --coco
229 | ```
230 |
231 | - The corresponding configuration in `initializeSampleParams()` looks like this:
232 |
233 | ```cpp
234 | // Configurations of Test on COCO dataset
235 | params.cocoTest = 1;
236 | params.cocoClassNamesFileName = "../data/coco.names";
237 | params.cocoClassIDFileName = "../data/categories.txt";
238 | params.cocoImageListFileName = "../data/val2017.txt";
239 | params.cocoTestResultFileName = "../data/coco_result.json";
240 | params.cocoImageDir = "../data/val2017";
241 | ```
242 |
243 | **Note: The COCO dataset is just an example; you can use your own validation or test set to validate a YOLOv4 model trained on your own training set.**
244 |
245 | - Step 1: Download MS COCO images and annotations from <https://cocodataset.org/#download>
246 |
247 |     - Images for validation: <http://images.cocodataset.org/zips/val2017.zip>
248 |     - Annotations for training and validation: <http://images.cocodataset.org/annotations/annotations_trainval2017.zip>
249 |     - Images for test: <http://images.cocodataset.org/zips/test2017.zip>
250 |     - Image info for test: <http://images.cocodataset.org/annotations/image_info_test2017.zip>
251 |
252 | - Step 2: Clone the COCO API repository from <https://github.com/cocodataset/cocoapi> and use the COCO API to generate `categories.txt`
253 |
254 |     - The format of `categories.txt` must follow this rule: IDs and names are separated by `"\t"`.
255 |
256 | ```
257 | 1 person
258 | 2 bicycle
259 | 3 car
260 | 4 motorcycle
261 | 5 airplane
262 | ```
263 |
264 | - A COCO API example that can help you distill the categories from the COCO dataset (see `cocoapi/PythonAPI/pycocoDemo.ipynb` in the COCO API repository for more details):
265 |
266 | ```py
267 | # display COCO categories and supercategories
268 | cats = coco.loadCats(coco.getCatIds())
269 | nms=[cat['name'] for cat in cats]
270 | print('COCO categories: \n{}\n'.format(' '.join(nms)))
271 | ```
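
Building on that, a minimal sketch that writes `categories.txt` in the tab-separated format shown above (the annotation path is an assumption; point it at your own file):

```py
# Dump one "<id>\t<name>" line per COCO category
from pycocotools.coco import COCO

coco = COCO('annotations/instances_val2017.json')  # assumed annotation path
cats = coco.loadCats(coco.getCatIds())
with open('categories.txt', 'w') as f:
    for cat in cats:
        f.write('{}\t{}\n'.format(cat['id'], cat['name']))
```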
272 |
273 |
274 | - Step 3: Generate the image list file using the python script `generate_coco_image_list.py`
275 |
276 | ```
277 | python generate_coco_image_list.py <COCO annotation or image-info file> <output list file>
278 | ```
279 |
280 | - For example, to generate validation image list, the command would be:
281 |
282 | ```
283 | python generate_coco_image_list.py instances_val2017.json val2017.txt
284 | ```
285 | - For example, to generate test-dev image list, the command would be:
286 | ```
287 | python generate_coco_image_list.py image_info_test-dev2017.json testdev2017.txt
288 | ```
289 |
290 | - This program will read image names from the list file, whose path should be the same as `params.cocoImageListFileName`, and then feed the images located in `params.cocoImageDir` to the YOLOv4 engine
291 | - Please make sure `params.cocoTest = 1` and that the images exist in `params.cocoImageDir`
292 |
293 |
294 |
--------------------------------------------------------------------------------
/tensorrt_yolov4/data/demo.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/NVIDIA-AI-IOT/yolo_deepstream/e9b75770ea58713d1cb3902d67c36e11acb888d7/tensorrt_yolov4/data/demo.jpg
--------------------------------------------------------------------------------
/tensorrt_yolov4/data/demo_out.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/NVIDIA-AI-IOT/yolo_deepstream/e9b75770ea58713d1cb3902d67c36e11acb888d7/tensorrt_yolov4/data/demo_out.jpg
--------------------------------------------------------------------------------
/tensorrt_yolov4/source/Makefile:
--------------------------------------------------------------------------------
1 | ################################################################################
2 | # SPDX-FileCopyrightText: Copyright (c) 2022 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
3 | # SPDX-License-Identifier: MIT
4 | #
5 | # Permission is hereby granted, free of charge, to any person obtaining a
6 | # copy of this software and associated documentation files (the "Software"),
7 | # to deal in the Software without restriction, including without limitation
8 | # the rights to use, copy, modify, merge, publish, distribute, sublicense,
9 | # and/or sell copies of the Software, and to permit persons to whom the
10 | # Software is furnished to do so, subject to the following conditions:
11 | #
12 | # The above copyright notice and this permission notice shall be included in
13 | # all copies or substantial portions of the Software.
14 | #
15 | # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
18 | # THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | # LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
20 | # FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
21 | # DEALINGS IN THE SOFTWARE.
22 | ################################################################################
23 |
24 | OUTNAME_RELEASE = yolov4
25 | OUTNAME_DEBUG = yolov4_debug
26 | EXTRA_DIRECTORIES = ../common
27 | .NOTPARALLEL:
28 | MAKEFILE ?= ../Makefile.config
29 | include $(MAKEFILE)
30 |
--------------------------------------------------------------------------------
/tensorrt_yolov4/source/SampleYolo.hpp:
--------------------------------------------------------------------------------
1 | /*
2 | * SPDX-FileCopyrightText: Copyright (c) 2022 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
3 | * SPDX-License-Identifier: MIT
4 | *
5 | * Permission is hereby granted, free of charge, to any person obtaining a
6 | * copy of this software and associated documentation files (the "Software"),
7 | * to deal in the Software without restriction, including without limitation
8 | * the rights to use, copy, modify, merge, publish, distribute, sublicense,
9 | * and/or sell copies of the Software, and to permit persons to whom the
10 | * Software is furnished to do so, subject to the following conditions:
11 | *
12 | * The above copyright notice and this permission notice shall be included in
13 | * all copies or substantial portions of the Software.
14 | *
15 | * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
18 | * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
20 | * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
21 | * DEALINGS IN THE SOFTWARE.
22 | */
23 |
24 | //!
25 | //! SampleYolo.hpp
26 | //! This file contains the declaration of the YOLOv4 sample. It creates the network using
27 | //! the YOLOv4 ONNX model.
28 |
29 | #pragma once
30 |
31 | #include "BatchStream.h"
32 | #include "EntropyCalibrator.h"
33 | #include "argsParser.h"
34 | #include "buffers.h"
35 | #include "common.h"
36 | #include "logger.h"
37 |
38 | #include "NvOnnxParser.h"
39 | #include "NvInfer.h"
40 | #include <cuda_runtime_api.h>
41 |
42 | #include <cstdlib>
43 | #include <fstream>
44 | #include <iostream>
45 | #include <sstream>
46 | #include <vector>