├── 8vs10.xlsx ├── LICENSE ├── readme.md ├── src └── yolov8 │ ├── .gitignore │ ├── CMakeLists.txt │ ├── camera_demo.cpp │ ├── imagefile_demo.cpp │ ├── include │ ├── Float16.h │ ├── camera.h │ ├── common.h │ ├── image_process.h │ ├── postprocess.h │ ├── rknn_pool.h │ ├── threadpool.h │ ├── videofile.h │ └── yolov8.h │ ├── model │ ├── .gitignore │ ├── coco_80_labels_list.txt │ ├── dotav1.txt │ ├── plane.png │ ├── ships.mp4 │ ├── yolov10n-2.0.0.rknn │ ├── yolov10n.onnx │ ├── yolov8m.rknn │ ├── yolov8n-obb.onnx │ ├── yolov8n-obb.rknn │ ├── yolov8n-pose.onnx │ ├── yolov8n-pose.rknn │ ├── yolov8n.rknn │ └── yolov8s.rknn │ ├── utils │ ├── CMakeLists.txt │ ├── camera.cpp │ ├── image_process.cpp │ ├── postprocess.cpp │ ├── rknn_pool.cpp │ ├── videofile.cpp │ └── yolov8.cpp │ └── videofile_demo.cpp └── yolov8n-ros2.md /8vs10.xlsx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/kaylorchen/rk3588-yolo-demo/1ebfe9ff4db93ae3b2c30dec6cd11a628a531210/8vs10.xlsx -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2024 kaylor.chen 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 
14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /readme.md: -------------------------------------------------------------------------------- 1 | 2 | # A Few Words Before You Start 3 | If you are reading this repository, chances are you want to try this multi-threaded inference demo. 4 | 1. The code here is not optimal; there are some frame-ordering issues that must be solved by other means, and I have not solved them here. The new repository linked under the next heading fixes this problem. 5 | 2. The ideas behind the code in this repository are explained in detail on my Bilibili channel; search for "kaylordut" on Bilibili if you want to understand the program. 6 | 3. For project cooperation, email kaylor.chen@qq.com. Please state your purpose, a brief description of your requirements, and your budget. I reply to most emails; please do not ask for my WeChat right away. A feasible project or a good technical exchange is the best way to start. 7 | 8 | # New Project 9 | An inference framework compatible with TensorRT, OnnxRuntime, NNRT and RKNN 10 | If you want more YOLOv8/YOLOv11 demos and Depth Anything demos, visit my other [repository](https://github.com/kaylorchen/ai_framework_demo) 11 | 12 | # Yolov8/v10 Demo for RK3588 13 | The project is a multi-threaded inference demo of Yolov8 running on the RK3588 platform, which has been adapted for reading video files and camera feeds. The demo uses the Yolov8n model for file inference, with a maximum inference frame rate of up to 100 frames per second.
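The throughput gain comes from keeping several frames in flight at once: frames are submitted to a pool of workers, each owning its own inference context, and results are collected as they complete. The idea can be sketched with standard C++ only (a simplified illustration; `fake_infer` and the futures-based dispatch stand in for this project's RknnPool and its per-thread RKNN contexts, which are not shown here):

```cpp
#include <future>
#include <vector>

// Stand-in for one model instance running inference on one frame.
// In the real demo each worker owns its own RKNN context.
int fake_infer(int frame_id) { return frame_id * 2; }  // pretend "detection count"

std::vector<int> run_pipeline(int num_frames) {
  std::vector<std::future<int>> pending;
  pending.reserve(num_frames);
  for (int i = 0; i < num_frames; ++i) {
    // std::async stands in for submitting a task to the thread pool
    pending.emplace_back(std::async(std::launch::async, fake_infer, i));
  }
  std::vector<int> results;
  // Collect in submission order; the real demo collects from a result queue,
  // which is why out-of-order frames can occur (see note 1 above).
  for (auto &f : pending) results.push_back(f.get());
  return results;
}
```

Collecting futures in submission order keeps the output sequential even though the work runs concurrently; the demo instead drains a queue for lower latency, at the cost of possible reordering.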
14 | 15 | > If you want to test yolov8n with ros2 on your own kit, click the [link](./yolov8n-ros2.md) 16 | 17 | # Model 18 | ## Download Model File 19 | You can find the model files in 'src/yolov8/model', and the large files here: 20 | Link: https://pan.baidu.com/s/1zfSVzR1G7mb-EQvs6A6ZYw?pwd=gmcs Password: gmcs 21 | Google Drive: https://drive.google.com/drive/folders/1FYluJpdaL-680pipgIQ1zsqqRvNbruEp?usp=sharing 22 | 23 | ## Model pt --> onnx 24 | ### For Yolov8 25 | go to my blog --> [blog.kaylordut.com](https://blog.kaylordut.com/2024/02/09/rk3588's-yolov8-model-conversion-from-pt-to-rknn/#more) 26 | ### For Yolov10 27 | go to my other repository --> [yolov10](https://github.com/kaylorchen/yolov10) 28 | download the pt model and export it: 29 | ```bash 30 | # End-to-End ONNX 31 | yolo export model=yolov10n/s/m/b/l/x.pt format=onnx opset=13 simplify 32 | ``` 33 | 34 | ## Model onnx --> rknn 35 | go to my blog --> [blog.kaylordut.com](https://blog.kaylordut.com/2024/02/09/rk3588's-yolov8-model-conversion-from-pt-to-rknn/#more) 36 | > TIPS: (Yolov10) 37 | > - rknn-toolkit2 (release 1.6.0) does not support some attention-related operators, so it runs the attention steps on the CPU, which increases inference time. 38 | > - rknn-toolkit2 (beta 2.0.0b12) has the attention operators for the 3588, so I built a docker image; you can pull it from __**kaylor/rknn_onnx2rknn:beta**__ 39 | 40 | ## Inference Time 41 | Please refer to the spreadsheet '[8vs10.xlsx](./8vs10.xlsx)' for details.
42 | 43 | |V8l-2.0.0| V8l-1.6.0| V10l-2.0.0| V10l-1.6.0| V8n-2.0.0 |V8n-1.6.0 |V10n-2.0.0| V10n-1.6.0| 44 | |:-------:|:-------:|:-------:|:-------:|:-------:|:-------:|:-------:|:-------:| 45 | |133.08| 133.83| 122.99| 204.47| 17.90| 18.33| 21.30| 49.99| 46 | 47 | 48 | 49 | 50 | 51 | # Demo Video and Guideline 52 | https://space.bilibili.com/327258623?spm_id_from=333.999.0.0 53 | QQ group1: 957577822 (full) 54 | QQ group2: 546943464 55 | 56 | # Prepare 57 | 58 | ## Build the Cross-Compilation Environment 59 | Set up a cross-compilation environment by following this [link](https://github.com/kaylorchen/rk3588_dev_rootfs). 60 | 61 | ## Install Runtime Libraries on Your RK3588 Target Board 62 | ```bash 63 | cat << 'EOF' | sudo tee /etc/apt/sources.list.d/kaylordut.list 64 | deb [signed-by=/etc/apt/keyrings/kaylor-keyring.gpg] http://apt.kaylordut.cn/kaylordut/ kaylordut main 65 | EOF 66 | sudo mkdir /etc/apt/keyrings -pv 67 | sudo wget -O /etc/apt/keyrings/kaylor-keyring.gpg http://apt.kaylordut.cn/kaylor-keyring.gpg 68 | sudo apt update 69 | sudo apt install kaylordut-dev libbytetrack 70 | ``` 71 | > If your OS is not Ubuntu 22.04, you can find the [kaylordut-dev](https://github.com/kaylorchen/kaylordut) and [libbytetrack](https://github.com/kaylorchen/ByteTrack) sources on my GitHub and build them yourself. 72 | 73 | 74 | ## Build the Project for Your RK3588 75 | 76 | - Compile 77 | 78 | ```bash 79 | git clone https://github.com/kaylorchen/rk3588-yolo-demo.git 80 | cd rk3588-yolo-demo/src/yolov8 81 | mkdir build 82 | cd build 83 | cmake -DCMAKE_TOOLCHAIN_FILE=/path/to/toolchain-aarch64.cmake -DCMAKE_EXPORT_COMPILE_COMMANDS=ON ..
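# The toolchain file tells CMake to cross-compile for the board.
# A minimal toolchain-aarch64.cmake looks roughly like this (an
# illustrative sketch; the linked rootfs repository provides the real one):
#   set(CMAKE_SYSTEM_NAME Linux)
#   set(CMAKE_SYSTEM_PROCESSOR aarch64)
#   set(CMAKE_C_COMPILER aarch64-linux-gnu-gcc)
#   set(CMAKE_CXX_COMPILER aarch64-linux-gnu-g++)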
84 | make 85 | ``` 86 | > /path/to/toolchain-aarch64.cmake is the absolute path of your toolchain .cmake file 87 | 88 | - Run 89 | 90 | ``` bash 91 | 92 | Usage: ./videofile_demo [--model_path|-m model_path] [--input_filename|-i input_filename] [--threads|-t thread_count] [--framerate|-f framerate] [--label_path|-l label_path] 93 | 94 | Usage: ./camera_demo [--model_path|-m model_path] [--camera_index|-i index] [--width|-w width] [--height|-h height] [--threads|-t thread_count] [--fps|-f framerate] [--label_path|-l label_path] 95 | 96 | Usage: ./imagefile_demo [--model_path|-m model_path] [--input_filename|-i input_filename] [--label_path|-l label_path] 97 | 98 | ``` 99 | 100 | > You can run the above commands on your RK3588 101 | 102 | 103 | 104 | -------------------------------------------------------------------------------- /src/yolov8/.gitignore: -------------------------------------------------------------------------------- 1 | cmake-build-*/ 2 | build/ 3 | compile_commands.json 4 | .idea/ 5 | .vscode/ 6 | -------------------------------------------------------------------------------- /src/yolov8/CMakeLists.txt: -------------------------------------------------------------------------------- 1 | cmake_minimum_required(VERSION 3.22) 2 | project(yolov8) 3 | 4 | set(CMAKE_CXX_STANDARD 17) 5 | set(CMAKE_BUILD_TYPE Release) # None, Debug, Release, RelWithDebInfo, MinSizeRel 6 | #SET(CMAKE_CXX_FLAGS_DEBUG "$ENV{CXXFLAGS} -Wno-unused-parameter -O0 -g -Wall") 7 | #SET(CMAKE_CXX_FLAGS_RELEASE "$ENV{CXXFLAGS} -Wno-unused-parameter -O0 -g -Wall") 8 | SET(CMAKE_CXX_FLAGS_DEBUG "$ENV{CXXFLAGS} -Wno-unused-parameter -O3 -g -Wall") 9 | SET(CMAKE_CXX_FLAGS_RELEASE "$ENV{CXXFLAGS} -Wno-unused-parameter -O3 -g -Wall") 10 | 11 | include_directories(include) 12 | 13 | add_subdirectory(utils) 14 | 15 | find_package(OpenCV REQUIRED) 16 | include_directories(${OpenCV_INCLUDE_DIRS}) 17 | find_package(kaylordut REQUIRED) 18 | find_package(bytetrack REQUIRED) 19 | 20 | 21 | add_executable(videofile_demo
videofile_demo.cpp) 22 | target_link_libraries(videofile_demo ${kaylordut_LIBS} ${OpenCV_LIBS} yolov8-kaylordut ${bytetrack_LIBS}) 23 | 24 | add_executable(camera_demo camera_demo.cpp) 25 | target_link_libraries(camera_demo ${kaylordut_LIBS} ${OpenCV_LIBS} yolov8-kaylordut ${bytetrack_LIBS}) 26 | 27 | add_executable(imagefile_demo imagefile_demo.cpp) 28 | target_link_libraries(imagefile_demo ${kaylordut_LIBS} ${OpenCV_LIBS} yolov8-kaylordut ${bytetrack_LIBS}) 29 | -------------------------------------------------------------------------------- /src/yolov8/camera_demo.cpp: -------------------------------------------------------------------------------- 1 | // 2 | // Created by kaylor on 3/9/24. 3 | // 4 | #include <cctype>  // to check whether a character is a digit 5 | #include <cerrno>  // to check strtol and strtod errors 6 | 7 | #include "camera.h" 8 | #include "getopt.h" 9 | #include "kaylordut/log/logger.h" 10 | #include "kaylordut/time/time_duration.h" 11 | #include "kaylordut/time/timeout.h" 12 | #include "rknn_pool.h" 13 | 14 | struct ProgramOptions { 15 | std::string model_path; 16 | std::string label_path; 17 | int thread_count; 18 | int camera_index; 19 | int width; 20 | int height; 21 | double fps; 22 | bool is_track = false; 23 | }; 24 | 25 | // Check whether a string represents a valid number 26 | bool isNumber(const std::string &str) { 27 | char *end; 28 | errno = 0; 29 | std::strtod(str.c_str(), &end); 30 | return errno == 0 && *end == '\0' && end != str.c_str(); 31 | } 32 | 33 | // Parse the command-line arguments and fill a ProgramOptions struct 34 | bool parseCommandLine(int argc, char *argv[], ProgramOptions &options) { 35 | options.is_track = false; 36 | static struct option longOpts[] = { 37 | {"model_path", required_argument, nullptr, 'm'}, 38 | {"label_path", required_argument, nullptr, 'l'}, 39 | {"threads", required_argument, nullptr, 't'}, 40 | {"camera_index", required_argument, nullptr, 'i'}, 41 | {"width", required_argument, nullptr, 'w'}, 42 | {"height", required_argument, nullptr, 'h'}, 43 | {"fps", required_argument, nullptr, 'f'}, 44 | {"help",
no_argument, nullptr, '?'}, 45 | {"track", no_argument, nullptr, 'T'}, 46 | {nullptr, 0, nullptr, 0}}; 47 | 48 | int c, optionIndex = 0; 49 | while ((c = getopt_long(argc, argv, "m:l:t:i:w:h:f:?T", longOpts, 50 | &optionIndex)) != -1) { 51 | switch (c) { 52 | case 'm': 53 | options.model_path = optarg; 54 | break; 55 | case 'l': 56 | options.label_path = optarg; 57 | break; 58 | case 't': 59 | if (isNumber(optarg)) { 60 | options.thread_count = std::atoi(optarg); 61 | if (options.thread_count <= 0) { 62 | KAYLORDUT_LOG_ERROR("Invalid number of threads: {}", optarg); 63 | return false; 64 | } 65 | } else { 66 | KAYLORDUT_LOG_ERROR("Thread count must be a number: {}", optarg); 67 | return false; 68 | } 69 | break; 70 | case 'i': 71 | if (isNumber(optarg)) { 72 | options.camera_index = std::atoi(optarg); 73 | if (options.camera_index < 0) { 74 | KAYLORDUT_LOG_ERROR("Invalid index of camera: {}", optarg); 75 | return false; 76 | } 77 | } else { 78 | KAYLORDUT_LOG_ERROR("Camera index must be a number: {}", optarg); 79 | return false; 80 | } 81 | break; 82 | case 'w': 83 | if (isNumber(optarg)) { 84 | options.width = std::atoi(optarg); 85 | if (options.width < 0) { 86 | KAYLORDUT_LOG_ERROR("Invalid width: {}", optarg); 87 | return false; 88 | } 89 | } else { 90 | KAYLORDUT_LOG_ERROR("Width must be a number: {}", optarg); 91 | return false; 92 | } 93 | break; 94 | case 'h': 95 | if (isNumber(optarg)) { 96 | options.height = std::atoi(optarg); 97 | if (options.height < 0) { 98 | KAYLORDUT_LOG_ERROR("Invalid height: {}", optarg); 99 | return false; 100 | } 101 | } else { 102 | KAYLORDUT_LOG_ERROR("Height must be a number: {}", optarg); 103 | return false; 104 | } 105 | break; 106 | case 'f': 107 | if (isNumber(optarg)) { 108 | options.fps = std::atof(optarg); 109 | if (options.fps <= 0) { 110 | KAYLORDUT_LOG_ERROR("Invalid frame rate: {}", optarg); 111 | return false; 112 | } 113 | } else { 114 | KAYLORDUT_LOG_ERROR("Frame rate must be a number: {}", optarg); 115 | return 
false; 116 | } 117 | break; 118 | case 'T': 119 | options.is_track = true; 120 | break; 121 | case '?': 122 | std::cout << "Usage: " << argv[0] 123 | << " [--model_path|-m model_path] [--camera_index|-i index] " 124 | "[--width|-w width] [--height|-h height] " 125 | "[--threads|-t thread_count] [--fps|-f framerate] " 126 | "[--label_path|-l label_path]\n"; 127 | exit(EXIT_SUCCESS); 128 | default: 129 | std::cout << "Usage: " << argv[0] 130 | << " [--model_path|-m model_path] [--camera_index|-i index] " 131 | "[--width|-w width] [--height|-h height] " 132 | "[--threads|-t thread_count] [--fps|-f framerate] " 133 | "[--label_path|-l label_path]\n"; 134 | abort(); 135 | } 136 | } 137 | 138 | return true; 139 | } 140 | 141 | std::string getCurrentTimeStr() { 142 | auto now = std::chrono::system_clock::now(); 143 | auto in_time_t = std::chrono::system_clock::to_time_t(now); 144 | 145 | std::stringstream ss; 146 | ss << std::put_time(std::localtime(&in_time_t), "%Y%m%d_%H%M%S"); 147 | return ss.str(); 148 | } 149 | 150 | int main(int argc, char *argv[]) { 151 | ProgramOptions options = {"", "", 0, 0, 0, 0, 0.0}; 152 | if (!parseCommandLine(argc, argv, options)) { 153 | KAYLORDUT_LOG_ERROR("Parse command failed."); 154 | return 1; 155 | } 156 | if (options.fps == 0.0 || options.thread_count == 0 || options.height == 0 || 157 | options.width == 0 || options.label_path.empty() || 158 | options.model_path.empty()) { 159 | KAYLORDUT_LOG_ERROR("Missing required options. 
Use --help for help."); 160 | return 1; 161 | } 162 | auto rknn_pool = std::make_unique<RknnPool>( 163 | options.model_path, options.thread_count, options.label_path); 164 | auto camera = std::make_unique<Camera>( 165 | options.camera_index, cv::Size(options.width, options.height), 166 | options.fps); 167 | ImageProcess image_process(options.width, options.height, 640, 168 | options.is_track, options.fps); 169 | // cv::VideoWriter video_writer( 170 | // getCurrentTimeStr() + ".mkv", cv::VideoWriter::fourcc('X', '2', '6', 171 | // '4'), options.fps, cv::Size(options.width, options.height), true); 172 | // if (!video_writer.isOpened()) { 173 | // KAYLORDUT_LOG_ERROR("Open the output video file error"); 174 | // return -1; 175 | // } 176 | std::unique_ptr<cv::Mat> image; 177 | std::shared_ptr<cv::Mat> image_res; 178 | cv::namedWindow("Video", cv::WINDOW_AUTOSIZE); 179 | static int image_count = 0; 180 | static int image_res_count = 0; 181 | TimeDuration time_duration; 182 | Timeout timeout(std::chrono::seconds(30)); 183 | TimeDuration total_time; 184 | while ((!timeout.isTimeout()) || (image_count != image_res_count)) { 185 | auto func = [&] { 186 | if (!timeout.isTimeout()) { 187 | image = camera->GetNextFrame(); 188 | } 189 | if (image != nullptr) { 190 | rknn_pool->AddInferenceTask(std::move(image), image_process); 191 | image_count++; 192 | } 193 | image_res = rknn_pool->GetImageResultFromQueue(); 194 | if (image_res != nullptr) { 195 | image_res_count++; 196 | auto duration = std::chrono::duration_cast<std::chrono::milliseconds>( 197 | time_duration.DurationSinceLastTime()); 198 | KAYLORDUT_LOG_INFO( 199 | "image count = {}, image res count = {}, delta = {}, duration = " 200 | "{}ms", 201 | image_count, image_res_count, image_count - image_res_count, 202 | duration.count()); 203 | cv::imshow("Video", *image_res); 204 | // video_writer.write(*image_res); 205 | cv::waitKey(1); 206 | } 207 | }; 208 | func(); 209 | } 210 | auto duration_total_time = 211 | std::chrono::duration_cast<std::chrono::milliseconds>( 212 | total_time.DurationSinceLastTime()); 213
| KAYLORDUT_LOG_INFO( 214 | "Process {} frames, total time is {}ms, average frame rate is {}", 215 | image_res_count, duration_total_time.count(), 216 | image_res_count * 1000.0 / duration_total_time.count()); 217 | // video_writer.release(); 218 | rknn_pool.reset(); 219 | cv::destroyAllWindows(); 220 | return 0; 221 | } -------------------------------------------------------------------------------- /src/yolov8/imagefile_demo.cpp: -------------------------------------------------------------------------------- 1 | #include <cctype>  // to check whether a character is a digit 2 | #include <cerrno>  // to check strtol and strtod errors 3 | 4 | #include "getopt.h" 5 | #include "image_process.h" 6 | #include "kaylordut/log/logger.h" 7 | #include "kaylordut/time/run_once.h" 8 | #include "kaylordut/time/time_duration.h" 9 | #include "rknn_pool.h" 10 | 11 | struct ProgramOptions { 12 | std::string model_path; 13 | std::string label_path; 14 | std::string input_filename; 15 | int thread_count; 16 | double framerate; 17 | }; 18 | 19 | // Check whether a string represents a valid number 20 | bool isNumber(const std::string &str) { 21 | char *end; 22 | errno = 0; 23 | std::strtod(str.c_str(), &end); 24 | return errno == 0 && *end == '\0' && end != str.c_str(); 25 | } 26 | 27 | // Parse the command-line arguments and fill a ProgramOptions struct 28 | bool parseCommandLine(int argc, char *argv[], ProgramOptions &options) { 29 | static struct option longOpts[] = { 30 | {"model_path", required_argument, nullptr, 'm'}, 31 | {"label_path", required_argument, nullptr, 'l'}, 32 | {"input_filename", required_argument, nullptr, 'i'}, 33 | {"help", no_argument, nullptr, 'h'}, 34 | {nullptr, 0, nullptr, 0}}; 35 | 36 | int c, optionIndex = 0; 37 | while ((c = getopt_long(argc, argv, "m:l:t:f:i:h", longOpts, &optionIndex)) != 38 | -1) { 39 | switch (c) { 40 | case 'm': 41 | options.model_path = optarg; 42 | break; 43 | case 'l': 44 | options.label_path = optarg; 45 | break; 46 | case 'i': 47 | options.input_filename = optarg; 48 | break; 49 | case 'h': 50 | std::cout << "Usage: " << argv[0] 51 | << " 
[--model_path|-m model_path] [--input_filename|-i " 52 | "input_filename] " 53 | "[--label_path|-l label_path]\n"; 54 | exit(EXIT_SUCCESS); 55 | case '?': 56 | // Error messages are printed by getopt_long automatically 57 | return false; 58 | default: 59 | std::cout << "Usage: " << argv[0] 60 | << " [--model_path|-m model_path] [--input_filename|-i " 61 | "input_filename] " 62 | "[--label_path|-l label_path]\n"; 63 | abort(); 64 | } 65 | } 66 | 67 | return true; 68 | } 69 | 70 | int main(int argc, char *argv[]) { 71 | KAYLORDUT_LOG_INFO("Yolov8 demo for rk3588"); 72 | ProgramOptions options = {"", "", "", 1, 1.0}; 73 | if (!parseCommandLine(argc, argv, options)) { 74 | KAYLORDUT_LOG_ERROR("Parse command failed."); 75 | return 1; 76 | } 77 | if (options.framerate == 0.0 || options.thread_count == 0 || 78 | options.label_path.empty() || options.input_filename.empty() || 79 | options.model_path.empty()) { 80 | KAYLORDUT_LOG_ERROR("Missing required options. Use --help for help."); 81 | return 1; 82 | } 83 | auto rknn_pool = std::make_unique<RknnPool>( 84 | options.model_path, options.thread_count, options.label_path); 85 | std::unique_ptr<cv::Mat> image = std::make_unique<cv::Mat>(); 86 | *image = cv::imread(options.input_filename); 87 | if (image->empty()) { 88 | KAYLORDUT_LOG_ERROR("read image error"); 89 | return -1; 90 | } 91 | ImageProcess image_process(image->cols, image->rows, 640); 92 | 93 | std::shared_ptr<cv::Mat> image_res; 94 | uint8_t running_flag = 0; 95 | cv::namedWindow("Image demo", cv::WINDOW_AUTOSIZE); 96 | static int image_count = 0; 97 | static int image_res_count = 0; 98 | rknn_pool->AddInferenceTask(std::move(image), image_process); 99 | while (image_res == nullptr) { 100 | image_res = rknn_pool->GetImageResultFromQueue(); 101 | } 102 | cv::imshow("Image demo", *image_res); 103 | cv::waitKey(0); 104 | rknn_pool.reset(); 105 | cv::destroyAllWindows(); 106 | cv::imwrite("result_" + options.input_filename, *image_res); 107 | return 0; 108 | }
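The `ImageProcess` constructed above letterboxes the input into the model's square 640x640 input. The scale/padding arithmetic can be sketched as follows (a standalone illustration; the struct mirrors the `letterbox_t` fields from common.h, and centered padding is an assumption of this sketch):

```cpp
#include <algorithm>

// Mirrors letterbox_t from common.h: a scale factor plus x/y padding.
struct LetterBox {
  int x_pad;
  int y_pad;
  float scale;
};

// Compute how a src image of (width x height) fits into a square
// target_size x target_size model input while keeping the aspect ratio.
LetterBox ComputeLetterBox(int width, int height, int target_size) {
  LetterBox lb{};
  lb.scale = static_cast<float>(target_size) / std::max(width, height);
  int new_w = static_cast<int>(width * lb.scale);
  int new_h = static_cast<int>(height * lb.scale);
  lb.x_pad = (target_size - new_w) / 2;  // left/right padding
  lb.y_pad = (target_size - new_h) / 2;  // top/bottom padding
  return lb;
}
```

Post-processing inverts the same transform: subtract the padding from a box coordinate, then divide by the scale to map it back to the source image.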
-------------------------------------------------------------------------------- /src/yolov8/include/Float16.h: -------------------------------------------------------------------------------- 1 | #ifndef _RKNPU2_RKNN_MATMUL_API_DEMO_H_ 2 | #define _RKNPU2_RKNN_MATMUL_API_DEMO_H_ 3 | 4 | namespace rknpu2 { 5 | 6 | using ushort = unsigned short; 7 | 8 | typedef union suf32 { 9 | int i; 10 | unsigned u; 11 | float f; 12 | } suf32; 13 | 14 | class float16 { 15 | public: 16 | float16() {} 17 | explicit float16(float x) { w = bits(x); } 18 | 19 | operator float() const { 20 | suf32 out; 21 | 22 | unsigned t = ((w & 0x7fff) << 13) + 0x38000000; 23 | unsigned sign = (w & 0x8000) << 16; 24 | unsigned e = w & 0x7c00; 25 | 26 | out.u = t + (1 << 23); 27 | out.u = (e >= 0x7c00 ? t + 0x38000000 28 | : e == 0 ? (static_cast<void>(out.f -= 6.103515625e-05f), out.u) 29 | : t) | 30 | sign; 31 | return out.f; 32 | } 33 | 34 | static float16 fromBits(ushort b) { 35 | float16 result; 36 | result.w = b; 37 | return result; 38 | } 39 | static float16 zero() { 40 | float16 result; 41 | result.w = (ushort)0; 42 | return result; 43 | } 44 | ushort bits() const { return w; } 45 | 46 | static ushort bits(float x) { 47 | suf32 in; 48 | in.f = x; 49 | unsigned sign = in.u & 0x80000000; 50 | in.u ^= sign; 51 | ushort w; 52 | 53 | if (in.u >= 0x47800000) 54 | w = (ushort)(in.u > 0x7f800000 ?
0x7e00 : 0x7c00); 55 | else { 56 | if (in.u < 0x38800000) { 57 | in.f += 0.5f; 58 | w = (ushort)(in.u - 0x3f000000); 59 | } else { 60 | unsigned t = in.u + 0xc8000fff; 61 | w = (ushort)((t + ((in.u >> 13) & 1)) >> 13); 62 | } 63 | } 64 | 65 | w = (ushort)(w | (sign >> 16)); 66 | 67 | return w; 68 | } 69 | 70 | float16& operator=(float x) { 71 | w = bits(x); 72 | return *this; 73 | } 74 | 75 | float16& operator+=(float x) { 76 | w = bits(float() + x); 77 | return *this; 78 | } 79 | 80 | float16& operator/(float x) { 81 | w = bits(float() / x); 82 | return *this; 83 | } 84 | 85 | inline bool is_nan() const { 86 | return ((w & 0x7c00u) == 0x7c00u) && ((w & 0x03ffu) != 0x0000u); 87 | } 88 | 89 | inline bool greater(const float16& x) const { 90 | bool sign = w & 0x8000; 91 | bool sign_x = x.w & 0x8000; 92 | if (sign) { 93 | if (sign_x) 94 | return w < x.w; 95 | else 96 | return false; 97 | } else { 98 | if (sign_x) /* Signed zeros are equal, have to check for it */ 99 | return (w != 0 || x.w != 0x8000); 100 | else 101 | return w > x.w; 102 | } 103 | return false; 104 | } 105 | 106 | inline bool less(const float16& x) const { 107 | bool sign = w & 0x8000; 108 | bool sign_x = x.w & 0x8000; 109 | if (sign) { 110 | if (sign_x) 111 | return w > x.w; 112 | else 113 | /* Signed zeros are equal, have to check for it */ 114 | return (w != 0x8000 || x.w != 0); 115 | } else { 116 | if (sign_x) 117 | return false; 118 | else 119 | return w < x.w; 120 | } 121 | return false; 122 | } 123 | 124 | inline bool operator>(const float16& x) const { 125 | if (is_nan() || x.is_nan()) { 126 | return false; 127 | } 128 | return greater(x); 129 | } 130 | 131 | inline bool operator<(const float16& x) const { 132 | if (is_nan() || x.is_nan()) { 133 | return false; 134 | } 135 | return less(x); 136 | } 137 | 138 | inline bool operator>=(const float16& x) const { 139 | if (is_nan() || x.is_nan()) { 140 | return false; 141 | } 142 | return !less(x); 143 | } 144 | 145 | inline bool operator<=(const 
float16& x) const { 146 | if (is_nan() || x.is_nan()) { 147 | return false; 148 | } 149 | return !greater(x); 150 | } 151 | 152 | inline bool operator==(const float16& x) const { 153 | /* 154 | * The equality cases are as follows: 155 | * - If either value is NaN, never equal. 156 | * - If the values are equal, equal. 157 | * - If the values are both signed zeros, equal. 158 | */ 159 | if (is_nan() || x.is_nan()) { 160 | return false; 161 | } 162 | return (w == x.w || ((w | x.w) & 0x7fff) == 0); 163 | } 164 | 165 | inline bool operator!=(const float16& x) const { return !((*this) == x); } 166 | 167 | protected: 168 | ushort w = 0; 169 | }; 170 | 171 | } // namespace rknpu2 172 | 173 | #endif /* _RKNPU2_RKNN_MATMUL_API_DEMO_H_ */ -------------------------------------------------------------------------------- /src/yolov8/include/camera.h: -------------------------------------------------------------------------------- 1 | // 2 | // Created by kaylor on 3/9/24. 3 | // 4 | 5 | #pragma once 6 | #include "opencv2/opencv.hpp" 7 | 8 | class Camera { 9 | public: 10 | Camera(uint16_t index, cv::Size size, double framerate); 11 | ~Camera(); 12 | std::unique_ptr GetNextFrame(); 13 | 14 | private: 15 | cv::Size size_; 16 | cv::VideoCapture capture_; 17 | }; 18 | -------------------------------------------------------------------------------- /src/yolov8/include/common.h: -------------------------------------------------------------------------------- 1 | // 2 | // Created by kaylor on 3/5/24. 
3 | // 4 | 5 | #pragma once 6 | #include "rknn_api.h" 7 | #define OBJ_NAME_MAX_SIZE 64 8 | #define OBJ_NUMB_MAX_SIZE 128 9 | #define OBJ_CLASS_NUM 80 10 | #define NMS_THRESH 0.8 11 | #define BOX_THRESH 0.5 12 | #define PROTO_CHANNEL (32) 13 | #define PROTO_HEIGHT (160) 14 | #define PROTO_WEIGHT (160) 15 | enum ModelType { 16 | UNKNOWN = 0, 17 | SEGMENT = 1, 18 | DETECTION = 2, 19 | OBB = 3, 20 | POSE = 4, 21 | V10_DETECTION = 5, 22 | }; 23 | /** 24 | * @brief LetterBox 25 | * 26 | */ 27 | typedef struct { 28 | int x_pad; 29 | int y_pad; 30 | float scale; 31 | } letterbox_t; 32 | /** 33 | * @brief Image rectangle 34 | * 35 | */ 36 | 37 | typedef struct { 38 | float kpt[34]; 39 | float visibility[17]; 40 | } object_pose_result; 41 | 42 | typedef struct { 43 | int x; 44 | int y; 45 | int w; 46 | int h; 47 | float theta; 48 | } image_xywht_t; 49 | 50 | typedef struct { 51 | image_xywht_t box; 52 | float prop; 53 | int cls_id; 54 | } object_obb_result; 55 | 56 | typedef struct { 57 | int left; 58 | int top; 59 | int right; 60 | int bottom; 61 | } image_rect_t; 62 | 63 | typedef struct { 64 | image_rect_t box; 65 | float prop; 66 | int cls_id; 67 | } object_detect_result; 68 | 69 | typedef struct { 70 | uint8_t *seg_mask; 71 | } object_segment_result; 72 | 73 | typedef struct { 74 | int id; 75 | int count; 76 | ModelType model_type; 77 | object_detect_result results[OBJ_NUMB_MAX_SIZE]; 78 | object_segment_result results_seg[OBJ_NUMB_MAX_SIZE]; 79 | object_obb_result results_obb[OBJ_NUMB_MAX_SIZE]; 80 | object_pose_result results_pose[OBJ_NUMB_MAX_SIZE]; 81 | } object_detect_result_list; 82 | 83 | typedef struct { 84 | rknn_context rknn_ctx; 85 | rknn_input_output_num io_num; 86 | rknn_tensor_attr *input_attrs; 87 | rknn_tensor_attr *output_attrs; 88 | int model_channel; 89 | int model_width; 90 | int model_height; 91 | bool is_quant; 92 | } rknn_app_context_t; 93 | -------------------------------------------------------------------------------- 
/src/yolov8/include/image_process.h: -------------------------------------------------------------------------------- 1 | // 2 | // Created by kaylor on 3/4/24. 3 | // 4 | 5 | #pragma once 6 | #include "BYTETracker.h" 7 | #include "mutex" 8 | #include "opencv2/opencv.hpp" 9 | #include "postprocess.h" 10 | 11 | class ImageProcess { 12 | public: 13 | ImageProcess(int width, int height, int target_size, bool is_track = false, 14 | int framerate = 30); 15 | std::unique_ptr<cv::Mat> Convert(const cv::Mat &src); 16 | const letterbox_t &get_letter_box(); 17 | void ImagePostProcess(cv::Mat &image, object_detect_result_list &od_results); 18 | 19 | private: 20 | double scale_; 21 | int padding_x_; 22 | int padding_y_; 23 | cv::Size new_size_; 24 | int target_size_; 25 | letterbox_t letterbox_; 26 | bool is_track_; 27 | std::unique_ptr<BYTETracker> tracker_; 28 | std::mutex tracker_mutex_; 29 | void ProcessDetectionImage(cv::Mat &image, 30 | object_detect_result_list &od_results) const; 31 | void ProcessTrackImage(cv::Mat &image, object_detect_result_list &od_results); 32 | void ProcessPoseImage(cv::Mat &image, 33 | object_detect_result_list &od_results) const; 34 | void ProcessOBBImage(cv::Mat &image, 35 | const object_detect_result_list &od_results) const; 36 | }; 37 | -------------------------------------------------------------------------------- /src/yolov8/include/postprocess.h: -------------------------------------------------------------------------------- 1 | // 2 | // Created by kaylor on 3/5/24.
3 | // 4 | 5 | #pragma once 6 | #include <stdint.h> 7 | 8 | #include <string> 9 | #include <vector> 10 | 11 | #include "common.h" 12 | #include "rknn_api.h" 13 | 14 | int init_post_process(std::string &label_path); 15 | void deinit_post_process(); 16 | const char *coco_cls_to_name(int cls_id); 17 | int post_process(rknn_app_context_t *app_ctx, rknn_output *outputs, 18 | letterbox_t *letter_box, float conf_threshold, 19 | float nms_threshold, object_detect_result_list *od_results); 20 | int post_process_v10_detection(rknn_app_context_t *app_ctx, 21 | rknn_output *outputs, 22 | letterbox_t *letter_box, 23 | float conf_threshold, 24 | object_detect_result_list *od_results); 25 | int post_process_obb(rknn_app_context_t *app_ctx, rknn_output *outputs, 26 | letterbox_t *letter_box, float conf_threshold, 27 | float nms_threshold, 28 | object_detect_result_list *od_results); 29 | int post_process_seg(rknn_app_context_t *app_ctx, rknn_output *outputs, 30 | letterbox_t *letter_box, float conf_threshold, 31 | float nms_threshold, 32 | object_detect_result_list *od_results); 33 | int post_process_pose(rknn_app_context_t *app_ctx, rknn_output *outputs, 34 | letterbox_t *letter_box, float conf_threshold, 35 | float nms_threshold, 36 | object_detect_result_list *od_results); 37 | int clamp(float val, int min, int max); 38 | -------------------------------------------------------------------------------- /src/yolov8/include/rknn_pool.h: -------------------------------------------------------------------------------- 1 | // 2 | // Created by kaylor on 3/6/24.
3 | // 4 | 5 | #pragma once 6 | #include "image_process.h" 7 | #include "opencv2/opencv.hpp" 8 | #include "queue" 9 | #include "threadpool.h" 10 | #include "yolov8.h" 11 | class RknnPool { 12 | public: 13 | RknnPool(const std::string model_path, const int thread_num, 14 | const std::string label_path); 15 | ~RknnPool(); 16 | void Init(); 17 | void DeInit(); 18 | void AddInferenceTask(std::shared_ptr src, 19 | ImageProcess &image_process); 20 | int get_model_id(); 21 | std::shared_ptr GetImageResultFromQueue(); 22 | int GetTasksSize(); 23 | 24 | private: 25 | int thread_num_{1}; 26 | std::string model_path_{"null"}; 27 | std::string label_path_{"null"}; 28 | uint32_t id{0}; 29 | std::unique_ptr pool_; 30 | std::queue> image_results_; 31 | std::vector> models_; 32 | std::mutex id_mutex_; 33 | std::mutex image_results_mutex_; 34 | }; 35 | -------------------------------------------------------------------------------- /src/yolov8/include/threadpool.h: -------------------------------------------------------------------------------- 1 | // 2 | // Created by kaylor on 3/5/24. 3 | // 4 | #ifndef THREAD_POOL_H 5 | #define THREAD_POOL_H 6 | 7 | #include 8 | #include 9 | #include 10 | #include 11 | #include 12 | #include 13 | #include 14 | #include 15 | #include 16 | 17 | class ThreadPool { 18 | public: 19 | ThreadPool(size_t); 20 | template 21 | auto enqueue(F&& f, Args&&... 
args) 22 | -> std::future::type>; 23 | ~ThreadPool(); 24 | bool IsTasksEmpty(); 25 | int TasksSize(); 26 | 27 | private: 28 | // need to keep track of threads so we can join them 29 | std::vector workers; 30 | // the task queue 31 | std::queue > tasks; 32 | 33 | // synchronization 34 | std::mutex queue_mutex; 35 | std::condition_variable condition; 36 | bool stop; 37 | }; 38 | 39 | // the constructor just launches some amount of workers 40 | inline ThreadPool::ThreadPool(size_t threads) : stop(false) { 41 | for (size_t i = 0; i < threads; ++i) 42 | workers.emplace_back([this](int i) { 43 | auto thread_name = "thread" + std::to_string(i); 44 | pthread_setname_np(pthread_self(), thread_name.c_str()); 45 | for (;;) { 46 | std::function task; 47 | 48 | { 49 | std::unique_lock lock(this->queue_mutex); 50 | this->condition.wait( 51 | lock, [this] { return this->stop || !this->tasks.empty(); }); 52 | if (this->stop && this->tasks.empty()) return; 53 | task = std::move(this->tasks.front()); 54 | this->tasks.pop(); 55 | } 56 | 57 | task(); 58 | } 59 | }, i); 60 | } 61 | 62 | // add new work item to the pool 63 | template 64 | auto ThreadPool::enqueue(F&& f, Args&&... 
args) 65 | -> std::future<typename std::result_of<F(Args...)>::type> { 66 | using return_type = typename std::result_of<F(Args...)>::type; 67 | 68 | auto task = std::make_shared<std::packaged_task<return_type()> >( 69 | std::bind(std::forward<F>(f), std::forward<Args>(args)...)); 70 | 71 | std::future<return_type> res = task->get_future(); 72 | { 73 | std::unique_lock<std::mutex> lock(queue_mutex); 74 | 75 | // don't allow enqueueing after stopping the pool 76 | if (stop) throw std::runtime_error("enqueue on stopped ThreadPool"); 77 | 78 | tasks.emplace([task]() { (*task)(); }); 79 | } 80 | condition.notify_one(); 81 | return res; 82 | } 83 | 84 | // the destructor joins all threads 85 | inline ThreadPool::~ThreadPool() { 86 | { 87 | std::unique_lock<std::mutex> lock(queue_mutex); 88 | stop = true; 89 | } 90 | condition.notify_all(); 91 | for (std::thread& worker : workers) worker.join(); 92 | } 93 | 94 | inline bool ThreadPool::IsTasksEmpty() { 95 | std::lock_guard<std::mutex> lock_guard(queue_mutex); 96 | return tasks.empty(); 97 | } 98 | 99 | inline int ThreadPool::TasksSize() { 100 | std::lock_guard<std::mutex> lock_guard(queue_mutex); 101 | return tasks.size(); 102 | } 103 | 104 | #endif -------------------------------------------------------------------------------- /src/yolov8/include/videofile.h: -------------------------------------------------------------------------------- 1 | // 2 | // Created by kaylor on 3/4/24.
3 | // 4 | 5 | #pragma once 6 | #include "opencv2/opencv.hpp" 7 | #include "string" 8 | class VideoFile { 9 | public: 10 | VideoFile(const std::string&& filename); 11 | ~VideoFile(); 12 | void Display(const float framerate = 25.0, const int target_size = 640); 13 | std::unique_ptr<cv::Mat> GetNextFrame(); 14 | cv::Mat test(); 15 | int get_frame_width(); 16 | int get_frame_height(); 17 | 18 | private: 19 | std::string filename_; 20 | cv::VideoCapture* capture_{nullptr}; 21 | }; 22 | -------------------------------------------------------------------------------- /src/yolov8/include/yolov8.h: -------------------------------------------------------------------------------- 1 | // 2 | // Created by kaylor on 3/4/24. 3 | // 4 | 5 | #pragma once 6 | #include "common.h" 7 | #include "memory" 8 | #include "mutex" 9 | #include "rknn_api.h" 10 | #include "string" 11 | 12 | class Yolov8 { 13 | public: 14 | Yolov8(std::string &&model_path); 15 | ~Yolov8(); 16 | int Inference(void *image_buf, object_detect_result_list *od_results, 17 | letterbox_t letter_box); 18 | rknn_context *get_rknn_context(); 19 | int Init(rknn_context *ctx_in, bool copy_weight); 20 | int DeInit(); 21 | int get_model_width(); 22 | int get_model_height(); 23 | 24 | private: 25 | rknn_app_context_t app_ctx_; 26 | rknn_context ctx_{0}; 27 | std::string model_path_; 28 | std::unique_ptr<rknn_input[]> inputs_; 29 | std::unique_ptr<rknn_output[]> outputs_; 30 | std::mutex outputs_lock_; 31 | ModelType model_type_; 32 | }; 33 | -------------------------------------------------------------------------------- /src/yolov8/model/.gitignore: -------------------------------------------------------------------------------- 1 | *.jpg 2 | *.txt 3 | *.py 4 | -------------------------------------------------------------------------------- /src/yolov8/model/coco_80_labels_list.txt: -------------------------------------------------------------------------------- 1 | person 2 | bicycle 3 | car 4 | motorcycle 5 | airplane 6 | bus 7 | train 8 | truck 9 | boat 10 |
traffic light 11 | fire hydrant 12 | stop sign 13 | parking meter 14 | bench 15 | bird 16 | cat 17 | dog 18 | horse 19 | sheep 20 | cow 21 | elephant 22 | bear 23 | zebra 24 | giraffe 25 | backpack 26 | umbrella 27 | handbag 28 | tie 29 | suitcase 30 | frisbee 31 | skis 32 | snowboard 33 | sports ball 34 | kite 35 | baseball bat 36 | baseball glove 37 | skateboard 38 | surfboard 39 | tennis racket 40 | bottle 41 | wine glass 42 | cup 43 | fork 44 | knife 45 | spoon 46 | bowl 47 | banana 48 | apple 49 | sandwich 50 | orange 51 | broccoli 52 | carrot 53 | hot dog 54 | pizza 55 | donut 56 | cake 57 | chair 58 | couch 59 | potted plant 60 | bed 61 | dining table 62 | toilet 63 | tv 64 | laptop 65 | mouse 66 | remote 67 | keyboard 68 | cell phone 69 | microwave 70 | oven 71 | toaster 72 | sink 73 | refrigerator 74 | book 75 | clock 76 | vase 77 | scissors 78 | teddy bear 79 | hair drier 80 | toothbrush 81 | -------------------------------------------------------------------------------- /src/yolov8/model/dotav1.txt: -------------------------------------------------------------------------------- 1 | plane 2 | ship 3 | storage tank 4 | baseball diamond 5 | tennis court 6 | basketball court 7 | ground track field 8 | harbor 9 | bridge 10 | large vehicle 11 | small vehicle 12 | helicopter 13 | roundabout 14 | soccer ball field 15 | swimming pool 16 | container crane 17 | -------------------------------------------------------------------------------- /src/yolov8/model/plane.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/kaylorchen/rk3588-yolo-demo/1ebfe9ff4db93ae3b2c30dec6cd11a628a531210/src/yolov8/model/plane.png -------------------------------------------------------------------------------- /src/yolov8/model/ships.mp4: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/kaylorchen/rk3588-yolo-demo/1ebfe9ff4db93ae3b2c30dec6cd11a628a531210/src/yolov8/model/ships.mp4 -------------------------------------------------------------------------------- /src/yolov8/model/yolov10n-2.0.0.rknn: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/kaylorchen/rk3588-yolo-demo/1ebfe9ff4db93ae3b2c30dec6cd11a628a531210/src/yolov8/model/yolov10n-2.0.0.rknn -------------------------------------------------------------------------------- /src/yolov8/model/yolov10n.onnx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/kaylorchen/rk3588-yolo-demo/1ebfe9ff4db93ae3b2c30dec6cd11a628a531210/src/yolov8/model/yolov10n.onnx -------------------------------------------------------------------------------- /src/yolov8/model/yolov8m.rknn: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/kaylorchen/rk3588-yolo-demo/1ebfe9ff4db93ae3b2c30dec6cd11a628a531210/src/yolov8/model/yolov8m.rknn -------------------------------------------------------------------------------- /src/yolov8/model/yolov8n-obb.onnx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/kaylorchen/rk3588-yolo-demo/1ebfe9ff4db93ae3b2c30dec6cd11a628a531210/src/yolov8/model/yolov8n-obb.onnx -------------------------------------------------------------------------------- /src/yolov8/model/yolov8n-obb.rknn: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/kaylorchen/rk3588-yolo-demo/1ebfe9ff4db93ae3b2c30dec6cd11a628a531210/src/yolov8/model/yolov8n-obb.rknn -------------------------------------------------------------------------------- /src/yolov8/model/yolov8n-pose.onnx: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/kaylorchen/rk3588-yolo-demo/1ebfe9ff4db93ae3b2c30dec6cd11a628a531210/src/yolov8/model/yolov8n-pose.onnx -------------------------------------------------------------------------------- /src/yolov8/model/yolov8n-pose.rknn: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/kaylorchen/rk3588-yolo-demo/1ebfe9ff4db93ae3b2c30dec6cd11a628a531210/src/yolov8/model/yolov8n-pose.rknn -------------------------------------------------------------------------------- /src/yolov8/model/yolov8n.rknn: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/kaylorchen/rk3588-yolo-demo/1ebfe9ff4db93ae3b2c30dec6cd11a628a531210/src/yolov8/model/yolov8n.rknn -------------------------------------------------------------------------------- /src/yolov8/model/yolov8s.rknn: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/kaylorchen/rk3588-yolo-demo/1ebfe9ff4db93ae3b2c30dec6cd11a628a531210/src/yolov8/model/yolov8s.rknn -------------------------------------------------------------------------------- /src/yolov8/utils/CMakeLists.txt: -------------------------------------------------------------------------------- 1 | find_package(OpenCV REQUIRED) 2 | include_directories(${OpenCV_INCLUDE_DIRS}) 3 | find_package(kaylordut REQUIRED) 4 | file(GLOB SRC "*.cpp") 5 | add_library(yolov8-kaylordut ${SRC}) 6 | target_link_libraries(yolov8-kaylordut ${kaylordut_LIBS} ${OpenCV_LIBS} rknnrt) -------------------------------------------------------------------------------- /src/yolov8/utils/camera.cpp: -------------------------------------------------------------------------------- 1 | // 2 | // Created by kaylor on 3/9/24. 
3 | // 4 | 5 | #include "camera.h" 6 | 7 | #include "kaylordut/log/logger.h" 8 | #include "thread" 9 | 10 | Camera::Camera(uint16_t index, cv::Size size, double framerate) 11 | : capture_(index, cv::CAP_V4L2), size_(size) { 12 | KAYLORDUT_LOG_INFO("Instantiate a Camera object"); 13 | // Use the V4L2 backend here, because the default backend does not allow setting the capture mode and framerate 14 | if (!capture_.isOpened()) { 15 | KAYLORDUT_LOG_ERROR("Error opening video stream or file"); 16 | exit(EXIT_FAILURE); 17 | } 18 | capture_.set(cv::CAP_PROP_FOURCC, 19 | cv::VideoWriter::fourcc('M', 'J', 'P', 'G')); 20 | // Check whether the format was set successfully 21 | int fourcc = capture_.get(cv::CAP_PROP_FOURCC); 22 | if (fourcc != cv::VideoWriter::fourcc('M', 'J', 'P', 'G')) { 23 | KAYLORDUT_LOG_WARN("Set video format failed"); 24 | } 25 | capture_.set(cv::CAP_PROP_FRAME_WIDTH, size_.width); 26 | capture_.set(cv::CAP_PROP_FRAME_HEIGHT, size_.height); 27 | if (!capture_.set(cv::CAP_PROP_FPS, framerate)) { 28 | KAYLORDUT_LOG_WARN("set framerate failed!!"); 29 | } 30 | std::this_thread::sleep_for(std::chrono::seconds(1)); 31 | KAYLORDUT_LOG_INFO("camera width: {}, height: {}, fps: {}", 32 | capture_.get(cv::CAP_PROP_FRAME_WIDTH), 33 | capture_.get(cv::CAP_PROP_FRAME_HEIGHT), 34 | capture_.get(cv::CAP_PROP_FPS)); 35 | } 36 | 37 | Camera::~Camera() { 38 | if (capture_.isOpened()) { 39 | KAYLORDUT_LOG_INFO("Release camera"); 40 | capture_.release(); 41 | } 42 | } 43 | 44 | std::unique_ptr<cv::Mat> Camera::GetNextFrame() { 45 | auto frame = std::make_unique<cv::Mat>(); 46 | capture_ >> *frame; 47 | if (frame->empty()) { 48 | KAYLORDUT_LOG_ERROR("Get frame error"); 49 | return nullptr; 50 | } 51 | return std::move(frame); 52 | } -------------------------------------------------------------------------------- /src/yolov8/utils/image_process.cpp: -------------------------------------------------------------------------------- 1 | // 2 | // Created by kaylor on 3/4/24.
3 | // 4 | 5 | #include "image_process.h" 6 | 7 | #include "BYTETracker.h" 8 | #include "kaylordut/log/logger.h" 9 | #define N_CLASS_COLORS (20) 10 | unsigned char class_colors[][3] = { 11 | {255, 56, 56}, // 'FF3838' 12 | {255, 157, 151}, // 'FF9D97' 13 | {255, 112, 31}, // 'FF701F' 14 | {255, 178, 29}, // 'FFB21D' 15 | {207, 210, 49}, // 'CFD231' 16 | {72, 249, 10}, // '48F90A' 17 | {146, 204, 23}, // '92CC17' 18 | {61, 219, 134}, // '3DDB86' 19 | {26, 147, 52}, // '1A9334' 20 | {0, 212, 187}, // '00D4BB' 21 | {44, 153, 168}, // '2C99A8' 22 | {0, 194, 255}, // '00C2FF' 23 | {52, 69, 147}, // '344593' 24 | {100, 115, 255}, // '6473FF' 25 | {0, 24, 236}, // '0018EC' 26 | {132, 56, 255}, // '8438FF' 27 | {82, 0, 133}, // '520085' 28 | {203, 56, 255}, // 'CB38FF' 29 | {255, 149, 200}, // 'FF95C8' 30 | {255, 55, 199} // 'FF37C7' 31 | }; 32 | 33 | ImageProcess::ImageProcess(int width, int height, int target_size, 34 | bool is_track, int framerate) { 35 | scale_ = static_cast<float>(target_size) / std::max(height, width); 36 | padding_x_ = target_size - static_cast<int>(width * scale_); 37 | padding_y_ = target_size - static_cast<int>(height * scale_); 38 | new_size_ = cv::Size(static_cast<int>(width * scale_), 39 | static_cast<int>(height * scale_)); 40 | target_size_ = target_size; 41 | letterbox_.scale = scale_; 42 | letterbox_.x_pad = padding_x_ / 2; 43 | letterbox_.y_pad = padding_y_ / 2; 44 | is_track_ = is_track; 45 | if (is_track) { 46 | tracker_ = std::make_unique<BYTETracker>(framerate, 30); 47 | } 48 | } 49 | 50 | std::unique_ptr<cv::Mat> ImageProcess::Convert(const cv::Mat &src) { 51 | if (src.empty()) { 52 | return nullptr; 53 | } 54 | cv::Mat resize_img; 55 | cv::resize(src, resize_img, new_size_); 56 | auto square_img = std::make_unique<cv::Mat>( 57 | target_size_, target_size_, src.type(), cv::Scalar(114, 114, 114)); 58 | cv::Point position(padding_x_ / 2, padding_y_ / 2); 59 | resize_img.copyTo((*square_img)( 60 | cv::Rect(position.x, position.y, resize_img.cols, resize_img.rows))); 61 | return
std::move(square_img); 62 | } 63 | 64 | const letterbox_t &ImageProcess::get_letter_box() { return letterbox_; } 65 | 66 | void ImageProcess::ImagePostProcess(cv::Mat &image, 67 | object_detect_result_list &od_results) { 68 | KAYLORDUT_LOG_INFO("ImagePostProcess is called"); 69 | if (od_results.count >= 1) { 70 | int width = image.cols; 71 | int height = image.rows; 72 | auto *ori_img = image.ptr(); 73 | int cls_id = od_results.results[0].cls_id; 74 | uint8_t *seg_mask = od_results.results_seg[0].seg_mask; 75 | float alpha = 0.5f; // opacity 76 | if (seg_mask != nullptr) { 77 | for (int j = 0; j < height; j++) { 78 | for (int k = 0; k < width; k++) { 79 | int pixel_offset = 3 * (j * width + k); 80 | if (seg_mask[j * width + k] != 0) { 81 | ori_img[pixel_offset + 0] = (unsigned char)clamp( 82 | class_colors[seg_mask[j * width + k] % N_CLASS_COLORS][0] * 83 | (1 - alpha) + 84 | ori_img[pixel_offset + 0] * alpha, 85 | 0, 255); // r 86 | ori_img[pixel_offset + 1] = (unsigned char)clamp( 87 | class_colors[seg_mask[j * width + k] % N_CLASS_COLORS][1] * 88 | (1 - alpha) + 89 | ori_img[pixel_offset + 1] * alpha, 90 | 0, 255); // g 91 | ori_img[pixel_offset + 2] = (unsigned char)clamp( 92 | class_colors[seg_mask[j * width + k] % N_CLASS_COLORS][2] * 93 | (1 - alpha) + 94 | ori_img[pixel_offset + 2] * alpha, 95 | 0, 255); // b 96 | } 97 | } 98 | } 99 | free(seg_mask); 100 | } 101 | } 102 | KAYLORDUT_LOG_INFO("model type is {}", od_results.model_type); 103 | if (od_results.model_type == ModelType::DETECTION || od_results.model_type == ModelType::V10_DETECTION) { 104 | if (is_track_) { 105 | ProcessTrackImage(image, od_results); 106 | } else { 107 | ProcessDetectionImage(image, od_results); 108 | } 109 | } else if (od_results.model_type == ModelType::OBB) { 110 | ProcessOBBImage(image, od_results); 111 | } else if (od_results.model_type == ModelType::POSE) { 112 | ProcessPoseImage(image, od_results); 113 | } 114 | } 115 | 116 | void DrawRotatedRect(cv::Mat &image, float x,
float y, float w, float h, 117 | float theta, const cv::Scalar &color, int thickness) { 118 | // Define the rotated rectangle's center, size, and rotation angle 119 | cv::Point2f center(x, y); 120 | cv::Size2f size(w, h); 121 | 122 | // Create the rotated rectangle object 123 | cv::RotatedRect rotatedRect(center, size, theta); 124 | 125 | // Get the four vertices of the rectangle 126 | cv::Point2f vertices[4]; 127 | rotatedRect.points(vertices); 128 | 129 | // Draw the four edges of the rectangle 130 | for (int i = 0; i < 4; i++) { 131 | cv::line(image, vertices[i], vertices[(i + 1) % 4], color, thickness); 132 | } 133 | } 134 | 135 | void ImageProcess::ProcessOBBImage( 136 | cv::Mat &image, const object_detect_result_list &od_results) const { 137 | KAYLORDUT_LOG_INFO( 138 | "ImageProcess::ProcessOBBImage is called, result count is {}", 139 | od_results.count); 140 | for (int i = 0; i < od_results.count; ++i) { 141 | auto obb_result = od_results.results_obb[i]; 142 | KAYLORDUT_LOG_INFO("{} @ xywhθ = ({} {} {} {} {}) {}", 143 | coco_cls_to_name(obb_result.cls_id), obb_result.box.x, 144 | obb_result.box.y, obb_result.box.w, obb_result.box.h, 145 | obb_result.box.theta * 180.0 / CV_PI, obb_result.prop); 146 | DrawRotatedRect(image, obb_result.box.x, obb_result.box.y, obb_result.box.w, 147 | obb_result.box.h, obb_result.box.theta * 180.0 / CV_PI, 148 | cv::Scalar(0, 255, 0), 2); 149 | } 150 | } 151 | 152 | void ImageProcess::ProcessTrackImage(cv::Mat &image, 153 | object_detect_result_list &od_results) { 154 | std::vector<Object> objects; 155 | for (int i = 0; i < od_results.count; ++i) { 156 | object_detect_result *detect_result = &(od_results.results[i]); 157 | KAYLORDUT_LOG_INFO("{} @ ({} {} {} {}) {}", 158 | coco_cls_to_name(detect_result->cls_id), 159 | detect_result->box.left, detect_result->box.top, 160 | detect_result->box.right, detect_result->box.bottom, 161 | detect_result->prop); 162 | Object object; 163 | object.rect = cv::Rect(detect_result->box.left, detect_result->box.top, 164 | detect_result->box.right - detect_result->box.left, 165 | detect_result->box.bottom - detect_result->box.top);
166 | object.label = detect_result->cls_id; 167 | object.prob = detect_result->prop; 168 | objects.push_back(object); 169 | } 170 | tracker_mutex_.lock(); 171 | std::vector<STrack> output_stracks = tracker_->update(objects); 172 | for (size_t i = 0; i < output_stracks.size(); ++i) { 173 | std::vector<float> tlwh = output_stracks[i].tlwh; 174 | bool vertical = tlwh[2] / tlwh[3] > 1.6; 175 | if (tlwh[2] * tlwh[3] > 20 && !vertical) { 176 | Scalar s = tracker_->get_color(output_stracks[i].track_id); 177 | putText(image, format("%d", output_stracks[i].track_id), 178 | Point(tlwh[0], tlwh[1] - 5), 0, 0.6, Scalar(0, 0, 255), 2, 179 | LINE_AA); 180 | rectangle(image, Rect(tlwh[0], tlwh[1], tlwh[2], tlwh[3]), s, 2); 181 | } 182 | } 183 | tracker_mutex_.unlock(); 184 | } 185 | 186 | void ImageProcess::ProcessDetectionImage( 187 | cv::Mat &image, object_detect_result_list &od_results) const { 188 | for (int i = 0; i < od_results.count; ++i) { 189 | object_detect_result *detect_result = &(od_results.results[i]); 190 | // if (strcmp(coco_cls_to_name(detect_result->cls_id), "person") == 0){ 191 | // continue;} 192 | KAYLORDUT_LOG_INFO("{} @ ({} {} {} {}) {}", 193 | coco_cls_to_name(detect_result->cls_id), 194 | detect_result->box.left, detect_result->box.top, 195 | detect_result->box.right, detect_result->box.bottom, 196 | detect_result->prop); 197 | cv::rectangle( 198 | image, cv::Point(detect_result->box.left, detect_result->box.top), 199 | cv::Point(detect_result->box.right, detect_result->box.bottom), 200 | cv::Scalar(0, 0, 255), 2); 201 | char text[256]; 202 | snprintf(text, sizeof(text), "%s %.1f%%", coco_cls_to_name(detect_result->cls_id), 203 | detect_result->prop * 100); 204 | cv::putText(image, text, 205 | cv::Point(detect_result->box.left, detect_result->box.top + 20), 206 | cv::FONT_HERSHEY_COMPLEX, 1, cv::Scalar(255, 0, 0), 2, 207 | cv::LINE_8); 208 | } 209 | } 210 | 211 | void drawSkeleton(cv::Mat &img, const std::vector<cv::Point> &points, 212 | const std::vector<int> &pairs, const cv::Scalar &color, 213 |
int thickness) { 214 | for (size_t i = 0; i < pairs.size(); i += 2) { 215 | int index1 = pairs[i]; 216 | int index2 = pairs[i + 1]; 217 | if (points[index1].x != -1 && points[index1].y != -1 && 218 | points[index2].x != -1 && points[index2].y != -1) { 219 | cv::line(img, points[index1], points[index2], color, thickness); 220 | } 221 | } 222 | } 223 | 224 | void ImageProcess::ProcessPoseImage( 225 | cv::Mat &image, object_detect_result_list &od_results) const { 226 | for (int i = 0; i < od_results.count; ++i) { 227 | object_detect_result *detect_result = &(od_results.results[i]); 228 | 229 | KAYLORDUT_LOG_INFO("({} {} {} {}) {}", detect_result->box.left, 230 | detect_result->box.top, detect_result->box.right, 231 | detect_result->box.bottom, detect_result->prop); 232 | cv::rectangle( 233 | image, cv::Point(detect_result->box.left, detect_result->box.top), 234 | cv::Point(detect_result->box.right, detect_result->box.bottom), 235 | cv::Scalar(0, 0, 255), 2); 236 | std::vector<cv::Point> points(17); 237 | for (int j = 0; j < 17; ++j) { 238 | if (od_results.results_pose[i].visibility[j] <= 0.6) { 239 | points.at(j) = cv::Point(-1, -1); 240 | continue; 241 | } 242 | points.at(j) = (cv::Point(od_results.results_pose[i].kpt[j * 2 + 0], 243 | od_results.results_pose[i].kpt[j * 2 + 1])); 244 | cv::Point p(od_results.results_pose[i].kpt[j * 2 + 0], 245 | od_results.results_pose[i].kpt[j * 2 + 1]); 246 | cv::circle(image, p, 10, cv::Scalar(0, 0, 255), cv::FILLED, cv::LINE_AA); 247 | } 248 | std::vector<int> pairs = { 249 | 0, 1, // Nose to left eye 250 | 1, 3, // Left eye to left ear 251 | 0, 2, // Nose to right eye 252 | 2, 4, // Right eye to right ear 253 | 0, 5, // Nose to left shoulder 254 | 5, 7, // Left shoulder to left elbow 255 | 7, 9, // Left elbow to left wrist 256 | 0, 6, // Nose to right shoulder 257 | 6, 8, // Right shoulder to right elbow 258 | 8, 10, // Right elbow to right wrist 259 | 5, 6, // Left shoulder to right shoulder 260 | 11, 12, // Left hip to right hip 261 | 11, 5,
// Left hip to left shoulder 262 | 12, 6, // Right hip to right shoulder 263 | 11, 13, // Left hip to left knee 264 | 12, 14, // Right hip to right knee 265 | 13, 15, // Left knee to left ankle 266 | 14, 16 // Right knee to right ankle 267 | }; 268 | drawSkeleton(image, points, pairs, cv::Scalar(255, 0, 0), 2); 269 | } 270 | } 271 | -------------------------------------------------------------------------------- /src/yolov8/utils/postprocess.cpp: -------------------------------------------------------------------------------- 1 | // 2 | // Created by kaylor on 3/5/24. 3 | // 4 | 5 | #include "postprocess.h" 6 | 7 | #include <math.h> 8 | 9 | #include "Float16.h" 10 | #include "filesystem" 11 | #include "kaylordut/log/logger.h" 12 | #include "opencv2/imgproc.hpp" 13 | #include "opencv2/opencv.hpp" 14 | #include "rknn_matmul_api.h" 15 | static char *labels[OBJ_CLASS_NUM]; 16 | static int num_labels = 0; 17 | int clamp(float val, int min, int max) { 18 | return val > min ? (val < max ? val : max) : min; 19 | } 20 | static char *readLine(FILE *fp, char *buffer, int *len) { 21 | int ch; 22 | int i = 0; 23 | size_t buff_len = 0; 24 | 25 | buffer = (char *)malloc(buff_len + 1); 26 | if (!buffer) return NULL; // Out of memory 27 | 28 | while ((ch = fgetc(fp)) != '\n' && ch != EOF) { 29 | buff_len++; 30 | void *tmp = realloc(buffer, buff_len + 1); 31 | if (tmp == NULL) { 32 | free(buffer); 33 | return NULL; // Out of memory 34 | } 35 | buffer = (char *)tmp; 36 | 37 | buffer[i] = (char)ch; 38 | i++; 39 | } 40 | buffer[i] = '\0'; 41 | 42 | *len = buff_len; 43 | 44 | // Detect end 45 | if (ch == EOF && (i == 0 || ferror(fp))) { 46 | free(buffer); 47 | return NULL; 48 | } 49 | return buffer; 50 | } 51 | 52 | static int readLines(const char *fileName, char *lines[], int max_line) { 53 | FILE *file = fopen(fileName, "r"); 54 | char *s; 55 | int i = 0; 56 | int n = 0; 57 | 58 | if (file == NULL) { 59 | KAYLORDUT_LOG_ERROR("Open {} fail!", fileName); 60 | return -1; 61 | } 62 | 63 | while
((s = readLine(file, s, &n)) != NULL) { 64 | lines[i++] = s; 65 | if (i >= max_line) break; 66 | } 67 | fclose(file); 68 | KAYLORDUT_LOG_INFO("There are {} lines", i); 69 | return i; 70 | } 71 | 72 | static int loadLabelName(const char *locationFilename, char *label[]) { 73 | KAYLORDUT_LOG_INFO("load label {}", locationFilename); 74 | num_labels = readLines(locationFilename, label, OBJ_CLASS_NUM); 75 | return 0; 76 | } 77 | 78 | int init_post_process(std::string &label_path) { 79 | int ret = 0; 80 | ret = loadLabelName(label_path.c_str(), labels); 81 | if (ret < 0) { 82 | KAYLORDUT_LOG_ERROR("Load {} failed!", label_path); 83 | return -1; 84 | } 85 | return 0; 86 | } 87 | 88 | void deinit_post_process() { 89 | for (int i = 0; i < num_labels; i++) { 90 | if (labels[i] != nullptr) { 91 | free(labels[i]); 92 | labels[i] = nullptr; 93 | } 94 | } 95 | } 96 | 97 | const char *coco_cls_to_name(int cls_id) { 98 | if (cls_id >= num_labels) { 99 | return "null"; 100 | } 101 | if (labels[cls_id]) { 102 | return labels[cls_id]; 103 | } 104 | return "null"; 105 | } 106 | static float CalculateOverlap(float xmin0, float ymin0, float xmax0, 107 | float ymax0, float xmin1, float ymin1, 108 | float xmax1, float ymax1) { 109 | float w = fmax(0.f, fmin(xmax0, xmax1) - fmax(xmin0, xmin1) + 1.0); 110 | float h = fmax(0.f, fmin(ymax0, ymax1) - fmax(ymin0, ymin1) + 1.0); 111 | float i = w * h; 112 | float u = (xmax0 - xmin0 + 1.0) * (ymax0 - ymin0 + 1.0) + 113 | (xmax1 - xmin1 + 1.0) * (ymax1 - ymin1 + 1.0) - i; 114 | return u <= 0.f ?
0.f : (i / u); 115 | } 116 | 117 | static int nms(int validCount, std::vector<float> &outputLocations, 118 | std::vector<int> &order, float threshold) { 119 | for (int i = 0; i < validCount; ++i) { 120 | if (order[i] == -1) { 121 | continue; 122 | } 123 | int n = order[i]; 124 | for (int j = i + 1; j < validCount; ++j) { 125 | int m = order[j]; 126 | if (m == -1) { 127 | continue; 128 | } 129 | float xmin0 = outputLocations[n * 4 + 0]; 130 | float ymin0 = outputLocations[n * 4 + 1]; 131 | float xmax0 = outputLocations[n * 4 + 0] + outputLocations[n * 4 + 2]; 132 | float ymax0 = outputLocations[n * 4 + 1] + outputLocations[n * 4 + 3]; 133 | 134 | float xmin1 = outputLocations[m * 4 + 0]; 135 | float ymin1 = outputLocations[m * 4 + 1]; 136 | float xmax1 = outputLocations[m * 4 + 0] + outputLocations[m * 4 + 2]; 137 | float ymax1 = outputLocations[m * 4 + 1] + outputLocations[m * 4 + 3]; 138 | 139 | float iou = CalculateOverlap(xmin0, ymin0, xmax0, ymax0, xmin1, ymin1, 140 | xmax1, ymax1); 141 | 142 | if (iou > threshold) { 143 | order[j] = -1; 144 | } 145 | } 146 | } 147 | return 0; 148 | } 149 | 150 | static int nms(int validCount, std::vector<float> &outputLocations, 151 | std::vector<int> classIds, std::vector<int> &order, int filterId, 152 | float threshold) { 153 | for (int i = 0; i < validCount; ++i) { 154 | if (order[i] == -1 || classIds[order[i]] != filterId) { 155 | continue; 156 | } 157 | int n = order[i]; 158 | for (int j = i + 1; j < validCount; ++j) { 159 | int m = order[j]; 160 | if (m == -1 || classIds[order[j]] != filterId) { 161 | continue; 162 | } 163 | float xmin0 = outputLocations[n * 4 + 0]; 164 | float ymin0 = outputLocations[n * 4 + 1]; 165 | float xmax0 = outputLocations[n * 4 + 0] + outputLocations[n * 4 + 2]; 166 | float ymax0 = outputLocations[n * 4 + 1] + outputLocations[n * 4 + 3]; 167 | 168 | float xmin1 = outputLocations[m * 4 + 0]; 169 | float ymin1 = outputLocations[m * 4 + 1]; 170 | float xmax1 = outputLocations[m * 4 + 0] + outputLocations[m * 4 + 2]; 171 |
float ymax1 = outputLocations[m * 4 + 1] + outputLocations[m * 4 + 3]; 172 | 173 | float iou = CalculateOverlap(xmin0, ymin0, xmax0, ymax0, xmin1, ymin1, 174 | xmax1, ymax1); 175 | 176 | if (iou > threshold) { 177 | order[j] = -1; 178 | } 179 | } 180 | } 181 | return 0; 182 | } 183 | // Compute the IoU of two rotated rectangles 184 | double rotatedRectIoU(const cv::RotatedRect &rect1, 185 | const cv::RotatedRect &rect2) { 186 | // KAYLORDUT_LOG_DEBUG("rotatedRectIoU is called"); 187 | // Compute the intersection of the two rotated rectangles 188 | std::vector<cv::Point2f> intersectingRegion; 189 | cv::rotatedRectangleIntersection(rect1, rect2, intersectingRegion); 190 | if (intersectingRegion.empty()) { 191 | return 0; 192 | } 193 | // for (int i = 0; i < intersectingRegion.size(); ++i) { 194 | // KAYLORDUT_LOG_DEBUG("data[{}] = ({} {})", i, intersectingRegion.at(i).x, 195 | // intersectingRegion.at(i).y); 196 | // } 197 | 198 | // Compute the intersection area via cv::contourArea 199 | double intersectionArea = cv::contourArea(intersectingRegion); 200 | 201 | // Compute the areas of the two rectangles 202 | double area1 = rect1.size.width * rect1.size.height; 203 | double area2 = rect2.size.width * rect2.size.height; 204 | 205 | // Union area = sum of the two areas - intersection area 206 | double unionArea = area1 + area2 - intersectionArea; 207 | // Compute the IoU 208 | return intersectionArea / unionArea; 209 | } 210 | 211 | static int nms(int validCount, std::vector<float> &bboxes, 212 | std::vector<float> &angles, std::vector<int> classIds, 213 | std::vector<int> &order, int filterId, float threshold) { 214 | // KAYLORDUT_LOG_DEBUG("nms is called"); 215 | // KAYLORDUT_LOG_DEBUG("validCount = {}, filterId = {}, threshold = {}", 216 | // validCount, filterId, threshold); 217 | for (int i = 0; i < validCount; ++i) { 218 | if (order[i] == -1 || classIds[order[i]] != filterId) { 219 | continue; 220 | } 221 | int n = order[i]; 222 | for (int j = i + 1; j < validCount; ++j) { 223 | int m = order[j]; 224 | if (m == -1 || classIds[order[j]] != filterId) { 225 | continue; 226 | } 227 | cv::RotatedRect rect1(cv::Point2f(bboxes[n * 4], bboxes[n * 4 + 1]), 228 |
cv::Size2f(bboxes[n * 4 + 2], bboxes[n * 4 + 3]), 229 | angles[n]); 230 | cv::RotatedRect rect2(cv::Point2f(bboxes[m * 4], bboxes[m * 4 + 1]), 231 | cv::Size2f(bboxes[m * 4 + 2], bboxes[m * 4 + 3]), 232 | angles[m]); 233 | auto iou = rotatedRectIoU(rect1, rect2); 234 | if (iou > threshold) { 235 | order[j] = -1; 236 | } 237 | } 238 | } 239 | // KAYLORDUT_LOG_DEBUG("nms finished"); 240 | return 0; 241 | } 242 | 243 | static void crop_mask(uint8_t *seg_mask, uint8_t *all_mask_in_one, float *boxes, 244 | int boxes_num, int *cls_id, int height, int width) { 245 | for (int b = 0; b < boxes_num; b++) { 246 | float x1 = boxes[b * 4 + 0]; 247 | float y1 = boxes[b * 4 + 1]; 248 | float x2 = boxes[b * 4 + 2]; 249 | float y2 = boxes[b * 4 + 3]; 250 | 251 | for (int i = 0; i < height; i++) { 252 | for (int j = 0; j < width; j++) { 253 | // Check that this point lies inside the box 254 | if (j >= x1 && j < x2 && i >= y1 && i < y2) { 255 | if (all_mask_in_one[i * width + j] == 0) { 256 | // seg_mask values are only 0 or 1; since cls_id can be 0, add 1 257 | // so the result is not all zeros 258 | all_mask_in_one[i * width + j] = 259 | seg_mask[b * width * height + i * width + j] * (cls_id[b] + 1); 260 | } 261 | } 262 | } 263 | } 264 | } 265 | } 266 | 267 | static void matmul_by_npu_i8(std::vector<float> &A_input, float *B_input, 268 | uint8_t *C_input, int ROWS_A, int COLS_A, 269 | int COLS_B, rknn_app_context_t *app_ctx) { 270 | int B_layout = 0; 271 | int AC_layout = 0; 272 | int32_t M = 1; 273 | int32_t K = COLS_A; 274 | int32_t N = COLS_B; 275 | 276 | rknn_matmul_ctx ctx; 277 | rknn_matmul_info info; 278 | memset(&info, 0, sizeof(rknn_matmul_info)); 279 | info.M = M; 280 | info.K = K; 281 | info.N = N; 282 | info.type = RKNN_INT8_MM_INT8_TO_INT32; 283 | info.B_layout = B_layout; 284 | info.AC_layout = AC_layout; 285 | 286 | rknn_matmul_io_attr io_attr; 287 | memset(&io_attr, 0, sizeof(rknn_matmul_io_attr)); 288 | 289 | int8_t int8Vector_A[ROWS_A * COLS_A]; 290 | for (int i = 0; i < ROWS_A * COLS_A; ++i) { 291 | int8Vector_A[i] = (int8_t)A_input[i]; 292 | }
293 | 294 | int8_t int8Vector_B[COLS_A * COLS_B]; 295 | for (int i = 0; i < COLS_A * COLS_B; ++i) { 296 | int8Vector_B[i] = (int8_t)B_input[i]; 297 | } 298 | 299 | int ret = rknn_matmul_create(&ctx, &info, &io_attr); 300 | // Create A 301 | rknn_tensor_mem *A = rknn_create_mem(ctx, io_attr.A.size); 302 | // Create B 303 | rknn_tensor_mem *B = rknn_create_mem(ctx, io_attr.B.size); 304 | // Create C 305 | rknn_tensor_mem *C = rknn_create_mem(ctx, io_attr.C.size); 306 | 307 | memcpy(B->virt_addr, int8Vector_B, B->size); 308 | // Set A 309 | ret = rknn_matmul_set_io_mem(ctx, A, &io_attr.A); 310 | // Set B 311 | ret = rknn_matmul_set_io_mem(ctx, B, &io_attr.B); 312 | // Set C 313 | ret = rknn_matmul_set_io_mem(ctx, C, &io_attr.C); 314 | 315 | for (int i = 0; i < ROWS_A; ++i) { 316 | memcpy(A->virt_addr, int8Vector_A + i * A->size, A->size); 317 | 318 | // Run 319 | ret = rknn_matmul_run(ctx); 320 | 321 | for (int j = 0; j < COLS_B; ++j) { 322 | if (((int32_t *)C->virt_addr)[j] > 0) { 323 | C_input[i * COLS_B + j] = 1; 324 | } else { 325 | C_input[i * COLS_B + j] = 0; 326 | } 327 | } 328 | } 329 | 330 | // destroy 331 | rknn_destroy_mem(ctx, A); 332 | rknn_destroy_mem(ctx, B); 333 | rknn_destroy_mem(ctx, C); 334 | rknn_matmul_destroy(ctx); 335 | } 336 | 337 | static void matmul_by_npu_fp16(std::vector<float> &A_input, float *B_input, 338 | uint8_t *C_input, int ROWS_A, int COLS_A, 339 | int COLS_B, rknn_app_context_t *app_ctx) { 340 | int B_layout = 0; 341 | int AC_layout = 0; 342 | int32_t M = ROWS_A; 343 | int32_t K = COLS_A; 344 | int32_t N = COLS_B; 345 | 346 | rknn_matmul_ctx ctx; 347 | rknn_matmul_info info; 348 | memset(&info, 0, sizeof(rknn_matmul_info)); 349 | info.M = M; 350 | info.K = K; 351 | info.N = N; 352 | info.type = RKNN_FLOAT16_MM_FLOAT16_TO_FLOAT32; 353 | info.B_layout = B_layout; 354 | info.AC_layout = AC_layout; 355 | 356 | rknn_matmul_io_attr io_attr; 357 | memset(&io_attr, 0, sizeof(rknn_matmul_io_attr)); 358 | 359 | rknpu2::float16 int8Vector_A[ROWS_A *
COLS_A]; 360 | for (int i = 0; i < ROWS_A * COLS_A; ++i) { 361 | int8Vector_A[i] = (rknpu2::float16)A_input[i]; 362 | } 363 | 364 | rknpu2::float16 int8Vector_B[COLS_A * COLS_B]; 365 | for (int i = 0; i < COLS_A * COLS_B; ++i) { 366 | int8Vector_B[i] = (rknpu2::float16)B_input[i]; 367 | } 368 | 369 | int ret = rknn_matmul_create(&ctx, &info, &io_attr); 370 | // Create A 371 | rknn_tensor_mem *A = rknn_create_mem(ctx, io_attr.A.size); 372 | // Create B 373 | rknn_tensor_mem *B = rknn_create_mem(ctx, io_attr.B.size); 374 | // Create C 375 | rknn_tensor_mem *C = rknn_create_mem(ctx, io_attr.C.size); 376 | 377 | memcpy(A->virt_addr, int8Vector_A, A->size); 378 | memcpy(B->virt_addr, int8Vector_B, B->size); 379 | 380 | // Set A 381 | ret = rknn_matmul_set_io_mem(ctx, A, &io_attr.A); 382 | // Set B 383 | ret = rknn_matmul_set_io_mem(ctx, B, &io_attr.B); 384 | // Set C 385 | ret = rknn_matmul_set_io_mem(ctx, C, &io_attr.C); 386 | 387 | // Run 388 | ret = rknn_matmul_run(ctx); 389 | for (int i = 0; i < ROWS_A * COLS_B; ++i) { 390 | if (((float *)C->virt_addr)[i] > 0) { 391 | C_input[i] = 1; 392 | } else { 393 | C_input[i] = 0; 394 | } 395 | } 396 | 397 | // destroy 398 | rknn_destroy_mem(ctx, A); 399 | rknn_destroy_mem(ctx, B); 400 | rknn_destroy_mem(ctx, C); 401 | rknn_matmul_destroy(ctx); 402 | } 403 | static void resize_by_opencv(uint8_t *input_image, int input_width, 404 | int input_height, uint8_t *output_image, 405 | int target_width, int target_height) { 406 | cv::Mat src_image(input_height, input_width, CV_8U, input_image); 407 | cv::Mat dst_image; 408 | cv::resize(src_image, dst_image, cv::Size(target_width, target_height), 0, 0, 409 | cv::INTER_LINEAR); 410 | memcpy(output_image, dst_image.data, target_width * target_height); 411 | } 412 | 413 | static void seg_reverse(uint8_t *seg_mask, uint8_t *cropped_seg, 414 | uint8_t *seg_mask_real, int model_in_height, 415 | int model_in_width, int proto_height, int proto_width, 416 | int cropped_height, int cropped_width, 
417 | int ori_in_height, int ori_in_width, int y_pad, 418 | int x_pad) { 419 | int cropped_index = 0; 420 | for (int i = 0; i < proto_height; i++) { 421 | for (int j = 0; j < proto_width; j++) { 422 | if (i >= y_pad && i < proto_height - y_pad && j >= x_pad && 423 | j < proto_width - x_pad) { 424 | int seg_index = i * proto_width + j; 425 | cropped_seg[cropped_index] = seg_mask[seg_index]; 426 | cropped_index++; 427 | } 428 | } 429 | } 430 | 431 | // Note: Here are different methods provided for implementing single-channel 432 | // image scaling 433 | resize_by_opencv(cropped_seg, cropped_width, cropped_height, seg_mask_real, 434 | ori_in_width, ori_in_height); 435 | // resize_by_rga_rk356x(cropped_seg, cropped_width, cropped_height, 436 | // seg_mask_real, ori_in_width, ori_in_height); 437 | // resize_by_rga_rk3588(cropped_seg, cropped_width, cropped_height, 438 | // seg_mask_real, ori_in_width, ori_in_height); 439 | } 440 | 441 | static int quick_sort_indice_inverse(std::vector<float> &input, int left, 442 | int right, std::vector<int> &indices) { 443 | float key; 444 | int key_index; 445 | int low = left; 446 | int high = right; 447 | if (left < right) { 448 | key_index = indices[left]; 449 | key = input[left]; 450 | while (low < high) { 451 | while (low < high && input[high] <= key) { 452 | high--; 453 | } 454 | input[low] = input[high]; 455 | indices[low] = indices[high]; 456 | while (low < high && input[low] >= key) { 457 | low++; 458 | } 459 | input[high] = input[low]; 460 | indices[high] = indices[low]; 461 | } 462 | input[low] = key; 463 | indices[low] = key_index; 464 | quick_sort_indice_inverse(input, left, low - 1, indices); 465 | quick_sort_indice_inverse(input, low + 1, right, indices); 466 | } 467 | return low; 468 | } 469 | 470 | inline static int32_t __clip(float val, float min, float max) { 471 | float f = val <= min ? min : (val >= max ?
max : val); 472 | return f; 473 | } 474 | 475 | static int8_t qnt_f32_to_affine(float f32, int32_t zp, float scale) { 476 | float dst_val = (f32 / scale) + zp; 477 | int8_t res = (int8_t)__clip(dst_val, -128, 127); 478 | return res; 479 | } 480 | 481 | static float deqnt_affine_to_f32(int8_t qnt, int32_t zp, float scale) { 482 | return ((float)qnt - (float)zp) * scale; 483 | } 484 | 485 | static int box_reverse(int position, int boundary, int pad, float scale) { 486 | return (int)((clamp(position, 0, boundary) - pad) / scale); 487 | } 488 | 489 | void compute_dfl(float *tensor, int dfl_len, float *box) { 490 | for (int b = 0; b < 4; b++) { 491 | float exp_t[dfl_len]; 492 | float exp_sum = 0; 493 | float acc_sum = 0; 494 | for (int i = 0; i < dfl_len; i++) { 495 | exp_t[i] = exp(tensor[i + b * dfl_len]); 496 | exp_sum += exp_t[i]; 497 | } 498 | 499 | for (int i = 0; i < dfl_len; i++) { 500 | acc_sum += exp_t[i] / exp_sum * i; 501 | } 502 | box[b] = acc_sum; 503 | } 504 | } 505 | 506 | static int process_i8(rknn_output *all_input, int input_id, int grid_h, 507 | int grid_w, int height, int width, int stride, 508 | int dfl_len, std::vector<float> &boxes, 509 | std::vector<float> &segments, float *proto, 510 | std::vector<float> &objProbs, std::vector<int> &classId, 511 | float threshold, rknn_app_context_t *app_ctx) { 512 | int validCount = 0; 513 | // H*W of this tensor, i.e. the memory length occupied by each channel 514 | int grid_len = grid_h * grid_w; 515 | 516 | // Skip if input_id is not 0, 4, 8, or 12 517 | if (input_id % 4 != 0) { 518 | return validCount; 519 | } 520 | // The last output holds the mask prototype and is handled differently 521 | if (input_id == 12) { 522 | /** 523 | * Fetch the output of branch 12 and restore the quantized data to float. 524 | * Note that the scale is not applied here: the program needs INT-range data, not floats in 0~1. 525 | */ 526 | int8_t *input_proto = (int8_t *)all_input[input_id].buf; 527 | int32_t zp_proto = app_ctx->output_attrs[input_id].zp; 528 | for (int i = 0; i < PROTO_CHANNEL * PROTO_HEIGHT * PROTO_WEIGHT; i++) { 529 | proto[i] = input_proto[i] - zp_proto; 530 | } 531 | return validCount; 532 | } 533 | 534
| int8_t *box_tensor = (int8_t *)all_input[input_id].buf; 535 | int32_t box_zp = app_ctx->output_attrs[input_id].zp; 536 | float box_scale = app_ctx->output_attrs[input_id].scale; 537 | 538 | int8_t *score_tensor = (int8_t *)all_input[input_id + 1].buf; 539 | int32_t score_zp = app_ctx->output_attrs[input_id + 1].zp; 540 | float score_scale = app_ctx->output_attrs[input_id + 1].scale; 541 | 542 | int8_t *score_sum_tensor = nullptr; 543 | int32_t score_sum_zp = 0; 544 | float score_sum_scale = 1.0; 545 | score_sum_tensor = (int8_t *)all_input[input_id + 2].buf; 546 | score_sum_zp = app_ctx->output_attrs[input_id + 2].zp; 547 | score_sum_scale = app_ctx->output_attrs[input_id + 2].scale; 548 | 549 | int8_t *seg_tensor = (int8_t *)all_input[input_id + 3].buf; 550 | int32_t seg_zp = app_ctx->output_attrs[input_id + 3].zp; 551 | float seg_scale = app_ctx->output_attrs[input_id + 3].scale; 552 | // quantize the score threshold into the int8 domain 553 | int8_t score_thres_i8 = qnt_f32_to_affine(threshold, score_zp, score_scale); 554 | int8_t score_sum_thres_i8 = 555 | qnt_f32_to_affine(threshold, score_sum_zp, score_sum_scale); 556 | 557 | for (int i = 0; i < grid_h; i++) { 558 | for (int j = 0; j < grid_w; j++) { 559 | int offset = i * grid_w + j; 560 | int max_class_id = -1; 561 | 562 | int offset_seg = i * grid_w + j; 563 | int8_t *in_ptr_seg = seg_tensor + offset_seg; 564 | 565 | // for quick filtering through "score sum" 566 | if (score_sum_tensor != nullptr) { 567 | // if the score sum is below the threshold, drop this anchor right away 568 | if (score_sum_tensor[offset] < score_sum_thres_i8) { 569 | continue; 570 | } 571 | } 572 | 573 | int8_t max_score = -score_zp; 574 | for (int c = 0; c < num_labels; c++) { // keep the class with the highest probability 575 | if ((score_tensor[offset] > score_thres_i8) && 576 | (score_tensor[offset] > max_score)) { 577 | max_score = score_tensor[offset]; 578 | max_class_id = c; 579 | } 580 | offset += grid_len; // advance i * grid_w + j to the next class 581 | // at this pixel; the data is NCHW, so add one channel length 582 | } 583 | 584 | // compute box: a target exists only if the max probability exceeds the threshold 585 | if
(max_score > score_thres_i8) { 586 | for (int k = 0; k < PROTO_CHANNEL; k++) { 587 | // convert to int8; the data is NCHW, so a stride of grid_len reaches each channel 588 | int8_t seg_element_i8 = in_ptr_seg[(k)*grid_len] - seg_zp; 589 | segments.push_back(seg_element_i8); 590 | } 591 | 592 | offset = i * grid_w + j; 593 | float box[4]; 594 | float before_dfl[dfl_len * 4]; 595 | for (int k = 0; k < dfl_len * 4; k++) { // de-quantize back to float 596 | before_dfl[k] = 597 | deqnt_affine_to_f32(box_tensor[offset], box_zp, box_scale); 598 | offset += grid_len; 599 | } 600 | // decode the box coordinates via DFL 601 | compute_dfl(before_dfl, dfl_len, box); 602 | 603 | float x1, y1, x2, y2, w, h; 604 | x1 = (-box[0] + j + 0.5) * stride; 605 | y1 = (-box[1] + i + 0.5) * stride; 606 | x2 = (box[2] + j + 0.5) * stride; 607 | y2 = (box[3] + i + 0.5) * stride; 608 | w = x2 - x1; 609 | h = y2 - y1; 610 | boxes.push_back(x1); 611 | boxes.push_back(y1); 612 | boxes.push_back(w); 613 | boxes.push_back(h); 614 | // save the class probability and class id 615 | objProbs.push_back( 616 | deqnt_affine_to_f32(max_score, score_zp, score_scale)); 617 | classId.push_back(max_class_id); 618 | validCount++; 619 | } 620 | } 621 | } 622 | return validCount; 623 | } 624 | 625 | static int process_fp32(rknn_output *all_input, int input_id, int grid_h, 626 | int grid_w, int height, int width, int stride, 627 | int dfl_len, std::vector<float> &boxes, 628 | std::vector<float> &segments, float *proto, 629 | std::vector<float> &objProbs, std::vector<int> &classId, 630 | float threshold) { 631 | int validCount = 0; 632 | int grid_len = grid_h * grid_w; 633 | 634 | // Skip if input_id is not 0, 4, 8, or 12 635 | if (input_id % 4 != 0) { 636 | return validCount; 637 | } 638 | 639 | if (input_id == 12) { 640 | float *input_proto = (float *)all_input[input_id].buf; 641 | for (int i = 0; i < PROTO_CHANNEL * PROTO_HEIGHT * PROTO_WEIGHT; i++) { 642 | proto[i] = input_proto[i]; 643 | } 644 | return validCount; 645 | } 646 | 647 | float *box_tensor = (float *)all_input[input_id].buf; 648 | float *score_tensor = (float
*)all_input[input_id + 1].buf; 649 | float *score_sum_tensor = (float *)all_input[input_id + 2].buf; 650 | float *seg_tensor = (float *)all_input[input_id + 3].buf; 651 | 652 | for (int i = 0; i < grid_h; i++) { 653 | for (int j = 0; j < grid_w; j++) { 654 | int offset = i * grid_w + j; 655 | int max_class_id = -1; 656 | 657 | int offset_seg = i * grid_w + j; 658 | float *in_ptr_seg = seg_tensor + offset_seg; 659 | 660 | // for quick filtering through "score sum" 661 | if (score_sum_tensor != nullptr) { 662 | if (score_sum_tensor[offset] < threshold) { 663 | continue; 664 | } 665 | } 666 | 667 | float max_score = 0; 668 | for (int c = 0; c < num_labels; c++) { 669 | if ((score_tensor[offset] > threshold) && 670 | (score_tensor[offset] > max_score)) { 671 | max_score = score_tensor[offset]; 672 | max_class_id = c; 673 | } 674 | offset += grid_len; 675 | } 676 | 677 | // compute box 678 | if (max_score > threshold) { 679 | for (int k = 0; k < PROTO_CHANNEL; k++) { 680 | float seg_element_f32 = in_ptr_seg[(k)*grid_len]; 681 | segments.push_back(seg_element_f32); 682 | } 683 | 684 | offset = i * grid_w + j; 685 | float box[4]; 686 | float before_dfl[dfl_len * 4]; 687 | for (int k = 0; k < dfl_len * 4; k++) { 688 | before_dfl[k] = box_tensor[offset]; 689 | offset += grid_len; 690 | } 691 | compute_dfl(before_dfl, dfl_len, box); 692 | 693 | float x1, y1, x2, y2, w, h; 694 | x1 = (-box[0] + j + 0.5) * stride; 695 | y1 = (-box[1] + i + 0.5) * stride; 696 | x2 = (box[2] + j + 0.5) * stride; 697 | y2 = (box[3] + i + 0.5) * stride; 698 | w = x2 - x1; 699 | h = y2 - y1; 700 | boxes.push_back(x1); 701 | boxes.push_back(y1); 702 | boxes.push_back(w); 703 | boxes.push_back(h); 704 | 705 | objProbs.push_back(max_score); 706 | classId.push_back(max_class_id); 707 | validCount++; 708 | } 709 | } 710 | } 711 | return validCount; 712 | } 713 | 714 | int post_process_seg(rknn_app_context_t *app_ctx, rknn_output *outputs, 715 | letterbox_t *letter_box, float conf_threshold, 716 | float 
nms_threshold, 717 | object_detect_result_list *od_results) { 718 | std::vector<float> filterBoxes; // boxes of the detected targets 719 | std::vector<float> objProbs; // scores of the detected targets 720 | std::vector<int> classId; // class index id of each target 721 | 722 | std::vector<float> filterSegments; 723 | float proto[PROTO_CHANNEL * PROTO_HEIGHT * PROTO_WEIGHT]; 724 | std::vector<float> filterSegments_by_nms; 725 | 726 | int model_in_w = app_ctx->model_width; // model input width 727 | int model_in_h = app_ctx->model_height; // model input height 728 | 729 | int validCount = 0; // number of valid detections 730 | int stride = 0; // sliding-window stride 731 | int grid_h = 0; 732 | int grid_w = 0; 733 | 734 | int dfl_len = app_ctx->output_attrs[0].dims[1] / 4; 735 | /*** 736 | * There are three output resolutions: 80x80, 40x40 and 737 | * 20x20; this division determines how many output tensors each branch has 738 | */ 739 | int output_per_branch = app_ctx->io_num.n_output / 3; // default 3 branch 740 | 741 | // process the outputs of rknn 742 | for (int i = 0; i < 13; i++) { 743 | grid_h = app_ctx->output_attrs[i].dims[2]; // output height of this branch 744 | grid_w = app_ctx->output_attrs[i].dims[3]; // output width of this branch 745 | stride = model_in_h / grid_h; // model side length over output size gives the stride 746 | // use the int8 path if the model is quantized 747 | if (app_ctx->is_quant) { 748 | validCount += 749 | process_i8(outputs, i, grid_h, grid_w, model_in_h, model_in_w, stride, 750 | dfl_len, filterBoxes, filterSegments, proto, objProbs, 751 | classId, conf_threshold, app_ctx); 752 | } else { 753 | validCount += 754 | process_fp32(outputs, i, grid_h, grid_w, model_in_h, model_in_w, 755 | stride, dfl_len, filterBoxes, filterSegments, proto, 756 | objProbs, classId, conf_threshold); 757 | } 758 | } 759 | 760 | // nms 761 | if (validCount <= 0) { 762 | return 0; 763 | } 764 | std::vector<int> indexArray; 765 | for (int i = 0; i < validCount; ++i) { 766 | indexArray.push_back(i); 767 | } 768 | // quicksort the class probabilities into descending order; the indices follow 769 | quick_sort_indice_inverse(objProbs, 0, validCount - 1, indexArray); 770 | 771 | std::set<int> class_set(std::begin(classId), std::end(classId)); 772 | 773 | for (auto c : class_set) { // mark and drop boxes whose overlap exceeds the threshold 774 |
nms(validCount, filterBoxes, classId, indexArray, c, nms_threshold); 775 | } 776 | 777 | int last_count = 0; 778 | od_results->count = 0; 779 | 780 | for (int i = 0; i < validCount; ++i) { 781 | // invalid overlapping boxes were marked with index -1 in the previous step 782 | if (indexArray[i] == -1 || last_count >= OBJ_NUMB_MAX_SIZE) { 783 | continue; 784 | } 785 | int n = indexArray[i]; 786 | // fetch the box, class id and probability score 787 | float x1 = filterBoxes[n * 4 + 0]; 788 | float y1 = filterBoxes[n * 4 + 1]; 789 | float x2 = x1 + filterBoxes[n * 4 + 2]; 790 | float y2 = y1 + filterBoxes[n * 4 + 3]; 791 | int id = classId[n]; 792 | float obj_conf = objProbs[i]; 793 | 794 | for (int k = 0; k < PROTO_CHANNEL; k++) { 795 | // fetch the corresponding segmentation vector 796 | filterSegments_by_nms.push_back(filterSegments[n * PROTO_CHANNEL + k]); 797 | } 798 | 799 | od_results->results[last_count].box.left = x1; 800 | od_results->results[last_count].box.top = y1; 801 | od_results->results[last_count].box.right = x2; 802 | od_results->results[last_count].box.bottom = y2; 803 | 804 | od_results->results[last_count].prop = obj_conf; 805 | od_results->results[last_count].cls_id = id; 806 | last_count++; 807 | } 808 | od_results->count = last_count; 809 | 810 | int boxes_num = od_results->count; 811 | 812 | // compute the mask (binary matrix) through Matmul 813 | // compute the mask matrices; the result is boxes_num masks 814 | int ROWS_A = boxes_num; 815 | int COLS_A = PROTO_CHANNEL; 816 | int COLS_B = PROTO_HEIGHT * PROTO_WEIGHT; 817 | uint8_t matmul_out[boxes_num * PROTO_HEIGHT * PROTO_WEIGHT]; 818 | if (app_ctx->is_quant) { 819 | matmul_by_npu_i8(filterSegments_by_nms, proto, matmul_out, ROWS_A, COLS_A, 820 | COLS_B, app_ctx); 821 | } else { 822 | matmul_by_npu_fp16(filterSegments_by_nms, proto, matmul_out, ROWS_A, COLS_A, 823 | COLS_B, app_ctx); 824 | } 825 | 826 | float filterBoxes_by_nms[boxes_num * 4]; // coordinates of each box 827 | int cls_id[boxes_num]; 828 | for (int i = 0; i < boxes_num; i++) { 829 | // for crop_mask 830 | // the mask prototype resolution is 160x160 831 | // 640 / 160 = 4.0 832 | filterBoxes_by_nms[i
* 4 + 0] = 833 | od_results->results[i].box.left / 4.0; // x1; 834 | filterBoxes_by_nms[i * 4 + 1] = 835 | od_results->results[i].box.top / 4.0; // y1; 836 | filterBoxes_by_nms[i * 4 + 2] = 837 | od_results->results[i].box.right / 4.0; // x2; 838 | filterBoxes_by_nms[i * 4 + 3] = 839 | od_results->results[i].box.bottom / 4.0; // y2; 840 | cls_id[i] = od_results->results[i].cls_id; 841 | 842 | // get real box 843 | // map the 640x640 coordinates back to the original input image 844 | od_results->results[i].box.left = 845 | box_reverse(od_results->results[i].box.left, model_in_w, 846 | letter_box->x_pad, letter_box->scale); 847 | od_results->results[i].box.top = 848 | box_reverse(od_results->results[i].box.top, model_in_h, 849 | letter_box->y_pad, letter_box->scale); 850 | od_results->results[i].box.right = 851 | box_reverse(od_results->results[i].box.right, model_in_w, 852 | letter_box->x_pad, letter_box->scale); 853 | od_results->results[i].box.bottom = 854 | box_reverse(od_results->results[i].box.bottom, model_in_h, 855 | letter_box->y_pad, letter_box->scale); 856 | } 857 | 858 | // crop seg outside box 859 | uint8_t all_mask_in_one[PROTO_HEIGHT * PROTO_WEIGHT] = {0}; 860 | // write all the mask data into a single image 861 | crop_mask(matmul_out, all_mask_in_one, filterBoxes_by_nms, boxes_num, cls_id, 862 | PROTO_HEIGHT, PROTO_WEIGHT); 863 | 864 | // get real mask 865 | int cropped_height = PROTO_HEIGHT - letter_box->y_pad / 4 * 2; 866 | int cropped_width = PROTO_WEIGHT - letter_box->x_pad / 4 * 2; 867 | int y_pad = letter_box->y_pad / 4; // 640 / 160 = 4 868 | int x_pad = letter_box->x_pad / 4; 869 | int ori_in_height = (model_in_h - letter_box->y_pad * 2) / letter_box->scale; 870 | int ori_in_width = (model_in_w - letter_box->x_pad * 2) / letter_box->scale; 871 | uint8_t *cropped_seg_mask = 872 | (uint8_t *)malloc(cropped_height * cropped_width * sizeof(uint8_t)); 873 | uint8_t *real_seg_mask = 874 | (uint8_t *)malloc(ori_in_height * ori_in_width * sizeof(uint8_t)); 875 | //
restore to the original image resolution; the result is saved in real_seg_mask and recorded via od_results->results_seg[0].seg_mask 876 | seg_reverse(all_mask_in_one, cropped_seg_mask, real_seg_mask, model_in_h, 877 | model_in_w, PROTO_HEIGHT, PROTO_WEIGHT, cropped_height, 878 | cropped_width, ori_in_height, ori_in_width, y_pad, x_pad); 879 | od_results->results_seg[0].seg_mask = real_seg_mask; 880 | free(cropped_seg_mask); 881 | 882 | return 0; 883 | } 884 | 885 | static int process_i8_obb(int8_t *box_tensor, int32_t box_zp, float box_scale, 886 | int8_t *score_tensor, int32_t score_zp, 887 | float score_scale, int8_t *angle_tensor, 888 | int32_t angle_zp, float angle_scale, int grid_h, 889 | int grid_w, int stride, int dfl_len, 890 | std::vector<float> &boxes, std::vector<float> &angles, 891 | std::vector<float> &objProbs, 892 | std::vector<int> &classId, float threshold) { 893 | // KAYLORDUT_LOG_DEBUG("process_i8_obb is called"); 894 | int validCount = 0; 895 | int grid_len = grid_h * grid_w; 896 | int8_t score_thres_i8 = qnt_f32_to_affine(threshold, score_zp, score_scale); 897 | 898 | for (int i = 0; i < grid_h; i++) { 899 | for (int j = 0; j < grid_w; j++) { 900 | int offset = i * grid_w + j; 901 | int max_class_id = -1; 902 | 903 | int8_t max_score = -score_zp; 904 | 905 | for (int c = 0; c < num_labels; c++) { 906 | if ((score_tensor[offset] > score_thres_i8) && 907 | (score_tensor[offset] > max_score)) { 908 | max_score = score_tensor[offset]; 909 | max_class_id = c; 910 | // KAYLORDUT_LOG_INFO("threshold is {}, {} score is {}", 911 | // threshold, 912 | // coco_cls_to_name(c), 913 | // deqnt_affine_to_f32(score_tensor[offset], 914 | // score_zp, score_scale)); 915 | } 916 | offset += grid_len; 917 | } 918 | 919 | // compute box 920 | if (max_score > score_thres_i8) { 921 | offset = i * grid_w + j; 922 | float box[4]; 923 | float before_dfl[dfl_len * 4]; 924 | for (int k = 0; k < dfl_len * 4; k++) { 925 | before_dfl[k] = 926 | deqnt_affine_to_f32(box_tensor[offset], box_zp, box_scale); 927 | offset += grid_len; 928 | } 929 |
compute_dfl(before_dfl, dfl_len, box); 930 | 931 | // float x1, y1, x2, y2, w, h; 932 | // x1 = (-box[0] + j + 0.5) * stride; 933 | // y1 = (-box[1] + i + 0.5) * stride; 934 | // x2 = (box[2] + j + 0.5) * stride; 935 | // y2 = (box[3] + i + 0.5) * stride; 936 | // w = x2 - x1; 937 | // h = y2 - y1; 938 | offset = i * grid_w + j; 939 | float delta_x1, delta_y1, delta_x2, delta_y2; 940 | float theta = 941 | deqnt_affine_to_f32(angle_tensor[offset], angle_zp, angle_scale); 942 | // KAYLORDUT_LOG_DEBUG("theta = {}", theta); 943 | delta_x1 = box[0]; 944 | delta_y1 = box[1]; 945 | delta_x2 = box[2]; 946 | delta_y2 = box[3]; 947 | float xf = (delta_x2 - delta_x1) / 2.0f; 948 | float yf = (delta_y2 - delta_y1) / 2.0f; 949 | float x = xf * cos(theta) - yf * sin(theta); 950 | float y = xf * sin(theta) + yf * cos(theta); 951 | x = (x + j + 0.5) * stride; 952 | y = (y + i + 0.5) * stride; 953 | float w = (delta_x1 + delta_x2) * stride; 954 | float h = (delta_y1 + delta_y2) * stride; 955 | boxes.push_back(x); 956 | boxes.push_back(y); 957 | boxes.push_back(w); 958 | boxes.push_back(h); 959 | 960 | angles.push_back(theta); 961 | objProbs.push_back( 962 | deqnt_affine_to_f32(max_score, score_zp, score_scale)); 963 | classId.push_back(max_class_id); 964 | validCount++; 965 | } 966 | } 967 | } 968 | // KAYLORDUT_LOG_DEBUG("process_i8_obb finished"); 969 | return validCount; 970 | } 971 | 972 | static int process_i8_pose(int8_t *box_tensor, int32_t box_zp, float box_scale, 973 | int8_t *score_tensor, int32_t score_zp, 974 | float score_scale, int8_t *kpt_tensor, 975 | int32_t kpt_zp, float kpt_scale, 976 | int8_t *visibility_tensor, int32_t visibility_zp, 977 | float visibility_scale, int grid_h, int grid_w, 978 | int stride, int dfl_len, std::vector<float> &boxes, 979 | std::vector<float> &objProbs, 980 | std::vector<float> &kpt, 981 | std::vector<float> &visibilities, float threshold) { 982 | int validCount = 0; 983 | int grid_len = grid_h * grid_w; 984 | int8_t score_thres_i8 = qnt_f32_to_affine(threshold,
score_zp, score_scale); 985 | 986 | for (int i = 0; i < grid_h; i++) { 987 | for (int j = 0; j < grid_w; j++) { 988 | int offset = i * grid_w + j; 989 | int max_class_id = -1; 990 | 991 | int8_t max_score = -score_zp; 992 | for (int c = 0; c < num_labels; c++) { 993 | if ((score_tensor[offset] > score_thres_i8) && 994 | (score_tensor[offset] > max_score)) { 995 | max_score = score_tensor[offset]; 996 | max_class_id = c; 997 | } 998 | offset += grid_len; 999 | } 1000 | 1001 | // compute box 1002 | if (max_score > score_thres_i8) { 1003 | offset = i * grid_w + j; 1004 | float box[4]; 1005 | float before_dfl[dfl_len * 4]; 1006 | for (int k = 0; k < dfl_len * 4; k++) { 1007 | before_dfl[k] = 1008 | deqnt_affine_to_f32(box_tensor[offset], box_zp, box_scale); 1009 | offset += grid_len; 1010 | } 1011 | compute_dfl(before_dfl, dfl_len, box); 1012 | 1013 | float x1, y1, x2, y2, w, h; 1014 | x1 = (-box[0] + j + 0.5) * stride; 1015 | y1 = (-box[1] + i + 0.5) * stride; 1016 | x2 = (box[2] + j + 0.5) * stride; 1017 | y2 = (box[3] + i + 0.5) * stride; 1018 | w = x2 - x1; 1019 | h = y2 - y1; 1020 | boxes.push_back(x1); 1021 | boxes.push_back(y1); 1022 | boxes.push_back(w); 1023 | boxes.push_back(h); 1024 | 1025 | objProbs.push_back( 1026 | deqnt_affine_to_f32(max_score, score_zp, score_scale)); 1027 | offset = i * grid_w + j; 1028 | for (int k = 0; k < 17; ++k) { 1029 | auto kpt_x = *(kpt_tensor + offset + 2 * k * grid_len); 1030 | auto kpt_y = *(kpt_tensor + offset + (2 * k + 1) * grid_len); 1031 | auto kpt_visibility = *(visibility_tensor + offset + k * grid_len); 1032 | auto res_kpt_x = deqnt_affine_to_f32(kpt_x, kpt_zp, kpt_scale); 1033 | auto res_kpt_y = deqnt_affine_to_f32(kpt_y, kpt_zp, kpt_scale); 1034 | auto res_kpt_visibility = deqnt_affine_to_f32( 1035 | kpt_visibility, visibility_zp, visibility_scale); 1036 | kpt.push_back(res_kpt_x); 1037 | kpt.push_back(res_kpt_y); 1038 | visibilities.push_back(res_kpt_visibility); 1039 | } 1040 | validCount++; 1041 | } 1042 | } 
1043 | } 1044 | return validCount; 1045 | } 1046 | static int process_i8(int8_t *box_tensor, int32_t box_zp, float box_scale, 1047 | int8_t *score_tensor, int32_t score_zp, float score_scale, 1048 | int8_t *score_sum_tensor, int32_t score_sum_zp, 1049 | float score_sum_scale, int grid_h, int grid_w, int stride, 1050 | int dfl_len, std::vector<float> &boxes, 1051 | std::vector<float> &objProbs, std::vector<int> &classId, 1052 | float threshold) { 1053 | int validCount = 0; 1054 | int grid_len = grid_h * grid_w; 1055 | int8_t score_thres_i8 = qnt_f32_to_affine(threshold, score_zp, score_scale); 1056 | int8_t score_sum_thres_i8 = 1057 | qnt_f32_to_affine(threshold, score_sum_zp, score_sum_scale); 1058 | 1059 | for (int i = 0; i < grid_h; i++) { 1060 | for (int j = 0; j < grid_w; j++) { 1061 | int offset = i * grid_w + j; 1062 | int max_class_id = -1; 1063 | 1064 | // use the score sum for quick filtering 1065 | if (score_sum_tensor != nullptr) { 1066 | if (score_sum_tensor[offset] < score_sum_thres_i8) { 1067 | continue; 1068 | } 1069 | } 1070 | 1071 | int8_t max_score = -score_zp; 1072 | for (int c = 0; c < num_labels; c++) { 1073 | if ((score_tensor[offset] > score_thres_i8) && 1074 | (score_tensor[offset] > max_score)) { 1075 | max_score = score_tensor[offset]; 1076 | max_class_id = c; 1077 | } 1078 | offset += grid_len; 1079 | } 1080 | 1081 | // compute box 1082 | if (max_score > score_thres_i8) { 1083 | offset = i * grid_w + j; 1084 | float box[4]; 1085 | float before_dfl[dfl_len * 4]; 1086 | for (int k = 0; k < dfl_len * 4; k++) { 1087 | before_dfl[k] = 1088 | deqnt_affine_to_f32(box_tensor[offset], box_zp, box_scale); 1089 | offset += grid_len; 1090 | } 1091 | compute_dfl(before_dfl, dfl_len, box); 1092 | 1093 | float x1, y1, x2, y2, w, h; 1094 | x1 = (-box[0] + j + 0.5) * stride; 1095 | y1 = (-box[1] + i + 0.5) * stride; 1096 | x2 = (box[2] + j + 0.5) * stride; 1097 | y2 = (box[3] + i + 0.5) * stride; 1098 | w = x2 - x1; 1099 | h = y2 - y1; 1100 | boxes.push_back(x1); 1101 |
boxes.push_back(y1); 1102 | boxes.push_back(w); 1103 | boxes.push_back(h); 1104 | 1105 | objProbs.push_back( 1106 | deqnt_affine_to_f32(max_score, score_zp, score_scale)); 1107 | classId.push_back(max_class_id); 1108 | validCount++; 1109 | } 1110 | } 1111 | } 1112 | return validCount; 1113 | } 1114 | 1115 | static int process_i8_v10(int8_t *box_tensor, int32_t box_zp, float box_scale, 1116 | int8_t *score_tensor, int32_t score_zp, 1117 | float score_scale, int grid_h, int grid_w, int stride, 1118 | int dfl_len, std::vector<float> &boxes, 1119 | std::vector<float> &objProbs, 1120 | std::vector<int> &classId, float threshold) { 1121 | int validCount = 0; 1122 | int grid_len = grid_h * grid_w; 1123 | int8_t score_thres_i8 = qnt_f32_to_affine(threshold, score_zp, score_scale); 1124 | 1125 | for (int i = 0; i < grid_h; i++) { 1126 | for (int j = 0; j < grid_w; j++) { 1127 | int offset = i * grid_w + j; 1128 | int max_class_id = -1; 1129 | 1130 | int8_t max_score = -score_zp; 1131 | for (int c = 0; c < num_labels; c++) { 1132 | if ((score_tensor[offset] > score_thres_i8) && 1133 | (score_tensor[offset] > max_score)) { 1134 | max_score = score_tensor[offset]; 1135 | max_class_id = c; 1136 | } 1137 | offset += grid_len; 1138 | } 1139 | 1140 | // compute box 1141 | if (max_score > score_thres_i8) { 1142 | offset = i * grid_w + j; 1143 | float box[4]; 1144 | float before_dfl[dfl_len * 4]; 1145 | for (int k = 0; k < dfl_len * 4; k++) { 1146 | before_dfl[k] = 1147 | deqnt_affine_to_f32(box_tensor[offset], box_zp, box_scale); 1148 | offset += grid_len; 1149 | } 1150 | compute_dfl(before_dfl, dfl_len, box); 1151 | 1152 | float x1, y1, x2, y2, w, h; 1153 | x1 = (-box[0] + j + 0.5) * stride; 1154 | y1 = (-box[1] + i + 0.5) * stride; 1155 | x2 = (box[2] + j + 0.5) * stride; 1156 | y2 = (box[3] + i + 0.5) * stride; 1157 | w = x2 - x1; 1158 | h = y2 - y1; 1159 | boxes.push_back(x1); 1160 | boxes.push_back(y1); 1161 | boxes.push_back(w); 1162 | boxes.push_back(h); 1163 | 1164 | objProbs.push_back( 1165
| deqnt_affine_to_f32(max_score, score_zp, score_scale)); 1166 | classId.push_back(max_class_id); 1167 | validCount++; 1168 | } 1169 | } 1170 | } 1171 | return validCount; 1172 | } 1173 | static int process_fp32(float *box_tensor, float *score_tensor, 1174 | float *score_sum_tensor, int grid_h, int grid_w, 1175 | int stride, int dfl_len, std::vector<float> &boxes, 1176 | std::vector<float> &objProbs, std::vector<int> &classId, 1177 | float threshold) { 1178 | int validCount = 0; 1179 | int grid_len = grid_h * grid_w; 1180 | for (int i = 0; i < grid_h; i++) { 1181 | for (int j = 0; j < grid_w; j++) { 1182 | int offset = i * grid_w + j; 1183 | int max_class_id = -1; 1184 | 1185 | // use the score sum for quick filtering 1186 | if (score_sum_tensor != nullptr) { 1187 | if (score_sum_tensor[offset] < threshold) { 1188 | continue; 1189 | } 1190 | } 1191 | 1192 | float max_score = 0; 1193 | for (int c = 0; c < num_labels; c++) { 1194 | if ((score_tensor[offset] > threshold) && 1195 | (score_tensor[offset] > max_score)) { 1196 | max_score = score_tensor[offset]; 1197 | max_class_id = c; 1198 | } 1199 | offset += grid_len; 1200 | } 1201 | 1202 | // compute box 1203 | if (max_score > threshold) { 1204 | offset = i * grid_w + j; 1205 | float box[4]; 1206 | float before_dfl[dfl_len * 4]; 1207 | for (int k = 0; k < dfl_len * 4; k++) { 1208 | before_dfl[k] = box_tensor[offset]; 1209 | offset += grid_len; 1210 | } 1211 | compute_dfl(before_dfl, dfl_len, box); 1212 | 1213 | float x1, y1, x2, y2, w, h; 1214 | x1 = (-box[0] + j + 0.5) * stride; 1215 | y1 = (-box[1] + i + 0.5) * stride; 1216 | x2 = (box[2] + j + 0.5) * stride; 1217 | y2 = (box[3] + i + 0.5) * stride; 1218 | w = x2 - x1; 1219 | h = y2 - y1; 1220 | boxes.push_back(x1); 1221 | boxes.push_back(y1); 1222 | boxes.push_back(w); 1223 | boxes.push_back(h); 1224 | 1225 | objProbs.push_back(max_score); 1226 | classId.push_back(max_class_id); 1227 | validCount++; 1228 | } 1229 | } 1230 | } 1231 | return validCount; 1232 | } 1233 | 1234 | int
post_process_pose(rknn_app_context_t *app_ctx, rknn_output *outputs, 1235 | letterbox_t *letter_box, float conf_threshold, 1236 | float nms_threshold, 1237 | object_detect_result_list *od_results) { 1238 | std::vector<float> filterBoxes; 1239 | std::vector<float> objProbs; 1240 | std::vector<float> kpt; 1241 | std::vector<float> visibilities; 1242 | int validCount = 0; 1243 | int stride = 0; 1244 | int grid_h = 0; 1245 | int grid_w = 0; 1246 | int model_in_w = app_ctx->model_width; 1247 | int model_in_h = app_ctx->model_height; 1248 | 1249 | // default 3 branch 1250 | int dfl_len = app_ctx->output_attrs[0].dims[1] / 4; 1251 | int output_per_branch = app_ctx->io_num.n_output / 3; 1252 | for (int i = 0; i < 3; i++) { 1253 | int box_idx = i * output_per_branch; 1254 | int score_idx = i * output_per_branch + 1; 1255 | int kpt_idx = i * output_per_branch + 2; 1256 | int visibilities_idx = i * output_per_branch + 3; 1257 | 1258 | grid_h = app_ctx->output_attrs[box_idx].dims[2]; 1259 | grid_w = app_ctx->output_attrs[box_idx].dims[3]; 1260 | num_labels = app_ctx->output_attrs[score_idx].dims[1]; 1261 | stride = model_in_h / grid_h; 1262 | 1263 | if (app_ctx->is_quant) { 1264 | validCount += process_i8_pose( 1265 | (int8_t *)outputs[box_idx].buf, app_ctx->output_attrs[box_idx].zp, 1266 | app_ctx->output_attrs[box_idx].scale, 1267 | (int8_t *)outputs[score_idx].buf, app_ctx->output_attrs[score_idx].zp, 1268 | app_ctx->output_attrs[score_idx].scale, 1269 | (int8_t *)outputs[kpt_idx].buf, app_ctx->output_attrs[kpt_idx].zp, 1270 | app_ctx->output_attrs[kpt_idx].scale, 1271 | (int8_t *)outputs[visibilities_idx].buf, 1272 | app_ctx->output_attrs[visibilities_idx].zp, 1273 | app_ctx->output_attrs[visibilities_idx].scale, grid_h, grid_w, stride, 1274 | dfl_len, filterBoxes, objProbs, kpt, visibilities, conf_threshold); 1275 | } 1276 | } 1277 | 1278 | // no object detect 1279 | if (validCount <= 0) { 1280 | return 0; 1281 | } 1282 | std::vector<int> indexArray; 1283 | for (int i = 0; i < validCount; ++i) { 1284 |
indexArray.push_back(i); 1285 | } 1286 | quick_sort_indice_inverse(objProbs, 0, validCount - 1, indexArray); 1287 | // Pose has only one class (person), so the NMS can be simplified 1288 | nms(validCount, filterBoxes, indexArray, nms_threshold); 1289 | 1290 | int last_count = 0; 1291 | od_results->count = 0; 1292 | 1293 | /* box valid detect target */ 1294 | for (int i = 0; i < validCount; ++i) { 1295 | if (indexArray[i] == -1 || last_count >= OBJ_NUMB_MAX_SIZE) { 1296 | continue; 1297 | } 1298 | int n = indexArray[i]; 1299 | 1300 | float x1 = filterBoxes[n * 4 + 0] - letter_box->x_pad; 1301 | float y1 = filterBoxes[n * 4 + 1] - letter_box->y_pad; 1302 | float x2 = x1 + filterBoxes[n * 4 + 2]; 1303 | float y2 = y1 + filterBoxes[n * 4 + 3]; 1304 | float obj_conf = objProbs[i]; 1305 | 1306 | od_results->results[last_count].box.left = 1307 | (int)(clamp(x1, 0, model_in_w) / letter_box->scale); 1308 | od_results->results[last_count].box.top = 1309 | (int)(clamp(y1, 0, model_in_h) / letter_box->scale); 1310 | od_results->results[last_count].box.right = 1311 | (int)(clamp(x2, 0, model_in_w) / letter_box->scale); 1312 | od_results->results[last_count].box.bottom = 1313 | (int)(clamp(y2, 0, model_in_h) / letter_box->scale); 1314 | od_results->results[last_count].prop = obj_conf; 1315 | for (int j = 0; j < 34; j = j + 2) { 1316 | auto kpt_x = kpt.at(34 * n + j) - letter_box->x_pad; 1317 | auto kpt_y = kpt.at(34 * n + j + 1) - letter_box->y_pad; 1318 | kpt_x /= letter_box->scale; 1319 | kpt_y /= letter_box->scale; 1320 | od_results->results_pose[last_count].kpt[j] = kpt_x; 1321 | od_results->results_pose[last_count].kpt[j + 1] = kpt_y; 1322 | od_results->results_pose[last_count].visibility[j / 2] = 1323 | visibilities.at(17 * n + j / 2); 1324 | } 1325 | last_count++; 1326 | } 1327 | od_results->count = last_count; 1328 | KAYLORDUT_LOG_INFO("valid count: {}, results count: {}", validCount, 1329 | od_results->count); 1330 | return 0; 1331 | } 1332 | 1333 | int post_process_v10_detection(rknn_app_context_t
                               *app_ctx,
                               rknn_output *outputs, letterbox_t *letter_box,
                               float conf_threshold,
                               object_detect_result_list *od_results) {
  std::vector<float> filterBoxes;
  std::vector<float> objProbs;
  std::vector<int> classId;
  int validCount = 0;
  int stride = 0;
  int grid_h = 0;
  int grid_w = 0;
  int model_in_w = app_ctx->model_width;
  int model_in_h = app_ctx->model_height;

  // default 3 branch
  int dfl_len = app_ctx->output_attrs[0].dims[1] / 4;
  int output_per_branch = app_ctx->io_num.n_output / 3;
  for (int i = 0; i < 3; i++) {
    int box_idx = i * output_per_branch;
    int score_idx = i * output_per_branch + 1;

    grid_h = app_ctx->output_attrs[box_idx].dims[2];
    grid_w = app_ctx->output_attrs[box_idx].dims[3];
    stride = model_in_h / grid_h;

    if (app_ctx->is_quant) {
      validCount += process_i8_v10(
          (int8_t *)outputs[box_idx].buf, app_ctx->output_attrs[box_idx].zp,
          app_ctx->output_attrs[box_idx].scale,
          (int8_t *)outputs[score_idx].buf, app_ctx->output_attrs[score_idx].zp,
          app_ctx->output_attrs[score_idx].scale, grid_h, grid_w, stride,
          dfl_len, filterBoxes, objProbs, classId, conf_threshold);
    }
  }
  // no object detected
  if (validCount <= 0) {
    return 0;
  }
  KAYLORDUT_LOG_INFO("valid count is {}", validCount);
  int last_count = 0;
  od_results->count = 0;

  /* box valid detect target */
  for (int i = 0; i < validCount; ++i) {
    int n = i;

    float x1 = filterBoxes[n * 4 + 0] - letter_box->x_pad;
    float y1 = filterBoxes[n * 4 + 1] - letter_box->y_pad;
    float x2 = x1 + filterBoxes[n * 4 + 2];
    float y2 = y1 + filterBoxes[n * 4 + 3];
    int id = classId[n];
    float obj_conf = objProbs[i];

    od_results->results[last_count].box.left =
        (int)(clamp(x1, 0, model_in_w) / letter_box->scale);
    od_results->results[last_count].box.top =
        (int)(clamp(y1, 0, model_in_h) / letter_box->scale);
    od_results->results[last_count].box.right =
        (int)(clamp(x2, 0, model_in_w) / letter_box->scale);
    od_results->results[last_count].box.bottom =
        (int)(clamp(y2, 0, model_in_h) / letter_box->scale);
    od_results->results[last_count].prop = obj_conf;
    od_results->results[last_count].cls_id = id;
    last_count++;
  }
  od_results->count = last_count;
  return 0;
}

int post_process(rknn_app_context_t *app_ctx, rknn_output *outputs,
                 letterbox_t *letter_box, float conf_threshold,
                 float nms_threshold, object_detect_result_list *od_results) {
  std::vector<float> filterBoxes;
  std::vector<float> objProbs;
  std::vector<int> classId;
  int validCount = 0;
  int stride = 0;
  int grid_h = 0;
  int grid_w = 0;
  int model_in_w = app_ctx->model_width;
  int model_in_h = app_ctx->model_height;

  // default 3 branch
  int dfl_len = app_ctx->output_attrs[0].dims[1] / 4;
  int output_per_branch = app_ctx->io_num.n_output / 3;
  for (int i = 0; i < 3; i++) {
    void *score_sum = nullptr;
    int32_t score_sum_zp = 0;
    float score_sum_scale = 1.0;
    if (output_per_branch == 3) {
      score_sum = outputs[i * output_per_branch + 2].buf;
      score_sum_zp = app_ctx->output_attrs[i * output_per_branch + 2].zp;
      score_sum_scale = app_ctx->output_attrs[i * output_per_branch + 2].scale;
    }
    int box_idx = i * output_per_branch;
    int score_idx = i * output_per_branch + 1;

    grid_h = app_ctx->output_attrs[box_idx].dims[2];
    grid_w = app_ctx->output_attrs[box_idx].dims[3];
    stride = model_in_h / grid_h;

    if (app_ctx->is_quant) {
      validCount += process_i8(
          (int8_t *)outputs[box_idx].buf, app_ctx->output_attrs[box_idx].zp,
          app_ctx->output_attrs[box_idx].scale,
          (int8_t *)outputs[score_idx].buf, app_ctx->output_attrs[score_idx].zp,
          app_ctx->output_attrs[score_idx].scale, (int8_t *)score_sum,
          score_sum_zp, score_sum_scale, grid_h, grid_w, stride, dfl_len,
          filterBoxes, objProbs, classId, conf_threshold);
    } else {
      validCount += process_fp32(
          (float *)outputs[box_idx].buf, (float *)outputs[score_idx].buf,
          (float *)score_sum, grid_h, grid_w, stride, dfl_len, filterBoxes,
          objProbs, classId, conf_threshold);
    }
  }

  // no object detected
  if (validCount <= 0) {
    return 0;
  }
  std::vector<int> indexArray;
  // Yolov8 needs NMS here; yolov10 does not
  if (od_results->model_type == ModelType::DETECTION) {
    for (int i = 0; i < validCount; ++i) {
      indexArray.push_back(i);
    }
    quick_sort_indice_inverse(objProbs, 0, validCount - 1, indexArray);
    std::set<int> class_set(std::begin(classId), std::end(classId));
    for (auto c : class_set) {
      nms(validCount, filterBoxes, classId, indexArray, c, nms_threshold);
    }
  }

  int last_count = 0;
  od_results->count = 0;

  /* box valid detect target */
  for (int i = 0; i < validCount; ++i) {
    int n;
    // for yolov8, pick boxes according to the NMS result
    if (od_results->model_type == ModelType::DETECTION) {
      if (indexArray[i] == -1 || last_count >= OBJ_NUMB_MAX_SIZE) {
        continue;
      }
      n = indexArray[i];
    } else {
      n = i;
    }

    float x1 = filterBoxes[n * 4 + 0] - letter_box->x_pad;
    float y1 = filterBoxes[n * 4 + 1] - letter_box->y_pad;
    float x2 = x1 + filterBoxes[n * 4 + 2];
    float y2 = y1 + filterBoxes[n * 4 + 3];
    int id = classId[n];
    float obj_conf = objProbs[i];

    od_results->results[last_count].box.left =
        (int)(clamp(x1, 0, model_in_w) / letter_box->scale);
    od_results->results[last_count].box.top =
        (int)(clamp(y1, 0, model_in_h) / letter_box->scale);
    od_results->results[last_count].box.right =
        (int)(clamp(x2, 0, model_in_w) / letter_box->scale);
    od_results->results[last_count].box.bottom =
        (int)(clamp(y2, 0, model_in_h) / letter_box->scale);
    od_results->results[last_count].prop = obj_conf;
    od_results->results[last_count].cls_id = id;
    last_count++;
  }
  od_results->count = last_count;
  return 0;
}

int post_process_obb(rknn_app_context_t *app_ctx, rknn_output *outputs,
                     letterbox_t *letter_box, float conf_threshold,
                     float nms_threshold,
                     object_detect_result_list *od_results) {
  std::vector<float> filterBoxes;  // box
  std::vector<float> objProbs;     // confidence
  std::vector<int> classId;        // class id
  std::vector<float> angles;
  int validCount = 0;
  int stride = 0;
  int grid_h = 0;
  int grid_w = 0;
  int model_in_w = app_ctx->model_width;
  int model_in_h = app_ctx->model_height;

  // default 3 branch
  int dfl_len = app_ctx->output_attrs[0].dims[1] / 4;
  int output_per_branch = app_ctx->io_num.n_output / 3;
  for (int i = 0; i < 3; i++) {
    int box_idx = i * output_per_branch;
    int score_idx = i * output_per_branch + 1;
    int angle_idx = i * output_per_branch + 2;

    grid_h = app_ctx->output_attrs[box_idx].dims[2];
    grid_w = app_ctx->output_attrs[box_idx].dims[3];
    num_labels = app_ctx->output_attrs[score_idx].dims[1];
    stride = model_in_h / grid_h;

    if (app_ctx->is_quant) {
      validCount += process_i8_obb(
          (int8_t *)outputs[box_idx].buf, app_ctx->output_attrs[box_idx].zp,
          app_ctx->output_attrs[box_idx].scale,
          (int8_t *)outputs[score_idx].buf, app_ctx->output_attrs[score_idx].zp,
          app_ctx->output_attrs[score_idx].scale,
          (int8_t *)outputs[angle_idx].buf,
          app_ctx->output_attrs[angle_idx].zp,
          app_ctx->output_attrs[angle_idx].scale, grid_h, grid_w, stride,
          dfl_len, filterBoxes, angles, objProbs, classId, conf_threshold);
    } else {
      // validCount += process_fp32(
      //     (float *)outputs[box_idx].buf, (float *)outputs[score_idx].buf,
      //     (float *)score_sum, grid_h, grid_w, stride, dfl_len, filterBoxes,
      //     objProbs, classId, conf_threshold);
    }
  }

  // no object detected
  if (validCount <= 0) {
    return 0;
  }
  KAYLORDUT_LOG_INFO("valid count is {}", validCount);
  std::vector<int> indexArray;
  for (int i = 0; i < validCount; ++i) {
    indexArray.push_back(i);
  }
  quick_sort_indice_inverse(objProbs, 0, validCount - 1, indexArray);

  std::set<int> class_set(std::begin(classId), std::end(classId));

  for (auto c : class_set) {
    KAYLORDUT_LOG_DEBUG("processing nms, class id is {}", coco_cls_to_name(c));
    nms(validCount, filterBoxes, angles, classId, indexArray, c, nms_threshold);
  }

  int last_count = 0;
  od_results->count = 0;

  /* box valid detect target */
  for (int i = 0; i < validCount; ++i) {
    if (indexArray[i] == -1 || last_count >= OBJ_NUMB_MAX_SIZE) {
      continue;
    }
    int n = indexArray[i];

    float x = filterBoxes[n * 4 + 0] - letter_box->x_pad;
    float y = filterBoxes[n * 4 + 1] - letter_box->y_pad;
    float w = filterBoxes[n * 4 + 2];
    float h = filterBoxes[n * 4 + 3];
    float theta = angles[n];
    int id = classId[n];
    float obj_conf = objProbs[i];

    // The clamp below is wrong here: (x, y, w, h) is a centre/size box, not
    // four corner coordinates, so it cannot be clamped like this.
    // od_results->results_obb[last_count].box.x =
    //     (int)(clamp(x, 0, model_in_w) / letter_box->scale);
    // od_results->results_obb[last_count].box.y =
    //     (int)(clamp(y, 0, model_in_h) / letter_box->scale);
    // od_results->results_obb[last_count].box.w =
    //     (int)(clamp(w, 0, model_in_w) / letter_box->scale);
    // od_results->results_obb[last_count].box.h =
    //     (int)(clamp(h, 0, model_in_h) / letter_box->scale);
    od_results->results_obb[last_count].box.x = (int)(x / letter_box->scale);
    od_results->results_obb[last_count].box.y = (int)(y / letter_box->scale);
    od_results->results_obb[last_count].box.w = (int)(w / letter_box->scale);
    od_results->results_obb[last_count].box.h = (int)(h / letter_box->scale);
    od_results->results_obb[last_count].box.theta = theta;
    // od_results->results_obb[last_count].box.theta = 0;
    od_results->results_obb[last_count].prop = obj_conf;
    od_results->results_obb[last_count].cls_id = id;
    KAYLORDUT_LOG_INFO(
        "label is {}, and confidence is {}, xywhθ = ({} {} {} {} {})",
        coco_cls_to_name(id), obj_conf,
        od_results->results_obb[last_count].box.x,
        od_results->results_obb[last_count].box.y,
        od_results->results_obb[last_count].box.w,
        od_results->results_obb[last_count].box.h,
        od_results->results_obb[last_count].box.theta);
    last_count++;
  }
  od_results->count = last_count;
  KAYLORDUT_LOG_INFO("result count: {}", last_count);
  return 0;
}
--------------------------------------------------------------------------------
/src/yolov8/utils/rknn_pool.cpp:
--------------------------------------------------------------------------------
//
// Created by kaylor on 3/6/24.
//

#include "rknn_pool.h"

#include "kaylordut/log/logger.h"
#include "postprocess.h"

RknnPool::RknnPool(const std::string model_path, const int thread_num,
                   const std::string lable_path) {
  this->thread_num_ = thread_num;
  this->model_path_ = model_path;
  this->label_path_ = lable_path;
  this->Init();
}

RknnPool::~RknnPool() { this->DeInit(); }

void RknnPool::Init() {
  init_post_process(this->label_path_);
  try {
    // configure the thread pool
    this->pool_ = std::make_unique<ThreadPool>(this->thread_num_);
    // each worker thread needs its own model instance
    for (int i = 0; i < this->thread_num_; ++i) {
      models_.push_back(std::make_shared<Yolov8>(
          std::forward<std::string>(this->model_path_)));
    }
  } catch (const std::bad_alloc &e) {
    KAYLORDUT_LOG_ERROR("Out of memory: {}", e.what());
    exit(EXIT_FAILURE);
  }
  for (int i = 0; i < this->thread_num_; ++i) {
    auto ret = models_[i]->Init(models_[0]->get_rknn_context(), i != 0);
    if (ret != 0) {
      KAYLORDUT_LOG_ERROR("Init rknn model failed!");
      exit(EXIT_FAILURE);
    }
  }
}

void RknnPool::DeInit() { deinit_post_process(); }

void RknnPool::AddInferenceTask(std::shared_ptr<cv::Mat> src,
                                ImageProcess &image_process) {
  pool_->enqueue(
      [&](std::shared_ptr<cv::Mat> original_img) {
        auto convert_img = image_process.Convert(*original_img);
        auto mode_id = get_model_id();
        cv::Mat rgb_img = cv::Mat::zeros(
            this->models_[mode_id]->get_model_width(),
            this->models_[mode_id]->get_model_height(), convert_img->type());
        cv::cvtColor(*convert_img, rgb_img, cv::COLOR_BGR2RGB);
        object_detect_result_list od_results;
        this->models_[mode_id]->Inference(rgb_img.ptr(), &od_results,
                                          image_process.get_letter_box());
        image_process.ImagePostProcess(*original_img, od_results);
        std::lock_guard<std::mutex> lock_guard(this->image_results_mutex_);
        this->image_results_.push(std::move(original_img));
      },
      std::move(src));
}

int RknnPool::get_model_id() {
  std::lock_guard<std::mutex> lock(id_mutex_);
  int mode_id = id;
  id++;
  if (id == thread_num_) {
    id = 0;
  }
  // KAYLORDUT_LOG_INFO("id = {}, num = {}, mode id = {}", id, thread_num_,
  //                    mode_id);
  return mode_id;
}

std::shared_ptr<cv::Mat> RknnPool::GetImageResultFromQueue() {
  std::lock_guard<std::mutex> lock_guard(this->image_results_mutex_);
  if (this->image_results_.empty()) {
    return nullptr;
  } else {
    auto res = this->image_results_.front();
    this->image_results_.pop();
    return std::move(res);
  }
}

int RknnPool::GetTasksSize() { return pool_->TasksSize(); }
--------------------------------------------------------------------------------
/src/yolov8/utils/videofile.cpp:
--------------------------------------------------------------------------------
//
// Created by kaylor on 3/4/24.
//

#include "videofile.h"

#include "image_process.h"
#include "kaylordut/log/logger.h"

VideoFile::VideoFile(const std::string &&filename) : filename_(filename) {
  capture_ = new cv::VideoCapture(filename_);
  if (!capture_->isOpened()) {
    KAYLORDUT_LOG_ERROR("Error opening video file");
    exit(EXIT_FAILURE);
  }
}

VideoFile::~VideoFile() {
  if (capture_ != nullptr) {
    KAYLORDUT_LOG_INFO("Release capture")
    capture_->release();
    delete capture_;
  }
}

void VideoFile::Display(const float framerate, const int target_size) {
  const int delay = 1000 / framerate;
  cv::namedWindow("Video", cv::WINDOW_AUTOSIZE);
  cv::Mat frame;
  ImageProcess image_process(capture_->get(cv::CAP_PROP_FRAME_WIDTH),
                             capture_->get(cv::CAP_PROP_FRAME_HEIGHT),
                             target_size, false, 30);
  while (true) {
    *capture_ >> frame;
    if (frame.empty()) {
      break;
    }
    cv::imshow("Video", *(image_process.Convert(frame)));
    // cv::imshow("Video", frame);
    if (cv::waitKey(delay) >= 0) {
      break;
    }
  }
  cv::destroyAllWindows();
}

std::unique_ptr<cv::Mat> VideoFile::GetNextFrame() {
  auto frame = std::make_unique<cv::Mat>();
  *capture_ >> *frame;
  if (frame->empty()) {
    return nullptr;
  }
  return std::move(frame);
}

cv::Mat VideoFile::test() {
  cv::Mat frame;
  *capture_ >> frame;
  cv::waitKey(125);
  return frame;
}

int VideoFile::get_frame_width() {
  return capture_->get(cv::CAP_PROP_FRAME_WIDTH);
}

int VideoFile::get_frame_height() {
  return capture_->get(cv::CAP_PROP_FRAME_HEIGHT);
}
--------------------------------------------------------------------------------
/src/yolov8/utils/yolov8.cpp:
--------------------------------------------------------------------------------
//
// Created by kaylor on 3/4/24.
//

#include "yolov8.h"

#include "kaylordut/log/logger.h"
#include "kaylordut/time/time_duration.h"
#include "postprocess.h"

const int RK3588 = 3;

// Set the core of the model that needs to be bound
int get_core_num() {
  static int core_num = 0;
  static std::mutex mtx;
  std::lock_guard<std::mutex> lock(mtx);
  int temp = core_num % RK3588;
  core_num++;
  return temp;
}

int read_data_from_file(const char *path, char **out_data) {
  FILE *fp = fopen(path, "rb");
  if (fp == NULL) {
    KAYLORDUT_LOG_INFO("fopen {} failed!", path);
    return -1;
  }
  fseek(fp, 0, SEEK_END);
  int file_size = ftell(fp);
  char *data = (char *)malloc(file_size + 1);
  data[file_size] = 0;
  fseek(fp, 0, SEEK_SET);
  if (file_size != fread(data, 1, file_size, fp)) {
    KAYLORDUT_LOG_INFO("fread {} failed!", path);
    free(data);
    fclose(fp);
    return -1;
  }
  if (fp) {
    fclose(fp);
  }
  *out_data = data;
  return file_size;
}
static void dump_tensor_attr(rknn_tensor_attr *attr) {
  KAYLORDUT_LOG_INFO(
      "index={}, name={}, n_dims={}, dims=[{}, {}, {}, {}], n_elems={}, "
      "size={}, fmt={}, type={}, qnt_type={}, "
      "zp={}, scale={}",
      attr->index, attr->name, attr->n_dims, attr->dims[0], attr->dims[1],
      attr->dims[2], attr->dims[3], attr->n_elems, attr->size,
      get_format_string(attr->fmt), get_type_string(attr->type),
      get_qnt_type_string(attr->qnt_type), attr->zp, attr->scale);
}

Yolov8::Yolov8(std::string &&model_path) : model_path_(model_path) {}

int Yolov8::Init(rknn_context *ctx_in, bool copy_weight) {
  int model_len = 0;
  char *model;
  int ret = 0;
  model_len = read_data_from_file(model_path_.c_str(), &model);
  if (model == nullptr) {
    KAYLORDUT_LOG_ERROR("Load model failed");
    return -1;
  }
  if (copy_weight) {
    KAYLORDUT_LOG_INFO("rknn_dup_context() is called");
    // reuse the weights of an already-initialised context
    ret = rknn_dup_context(ctx_in, &ctx_);
    if (ret != RKNN_SUCC) {
      KAYLORDUT_LOG_ERROR("rknn_dup_context failed! error code = {}", ret);
      return -1;
    }
  } else {
    KAYLORDUT_LOG_INFO("rknn_init() is called");
    ret = rknn_init(&ctx_, model, model_len, 0, NULL);
    free(model);
    if (ret != RKNN_SUCC) {
      KAYLORDUT_LOG_ERROR("rknn_init failed! error code = {}", ret);
      return -1;
    }
  }
  rknn_core_mask core_mask;
  switch (get_core_num()) {
    case 0:
      core_mask = RKNN_NPU_CORE_0;
      break;
    case 1:
      core_mask = RKNN_NPU_CORE_1;
      break;
    case 2:
      core_mask = RKNN_NPU_CORE_2;
      break;
  }
  ret = rknn_set_core_mask(ctx_, core_mask);
  if (ret < 0) {
    KAYLORDUT_LOG_ERROR("rknn_set_core_mask failed! error code = {}", ret);
    return -1;
  }

  rknn_sdk_version version;
  ret = rknn_query(ctx_, RKNN_QUERY_SDK_VERSION, &version,
                   sizeof(rknn_sdk_version));
  if (ret < 0) {
    return -1;
  }
  KAYLORDUT_LOG_INFO("sdk version: {} driver version: {}", version.api_version,
                     version.drv_version);

  // Get Model Input Output Number
  rknn_input_output_num io_num;
  ret = rknn_query(ctx_, RKNN_QUERY_IN_OUT_NUM, &io_num, sizeof(io_num));
  if (ret != RKNN_SUCC) {
    KAYLORDUT_LOG_ERROR("rknn_query failed! error code = {}", ret);
    return -1;
  }
  KAYLORDUT_LOG_INFO("model input num: {}, and output num: {}", io_num.n_input,
                     io_num.n_output);
  // Get Model Input Info
  KAYLORDUT_LOG_INFO("input tensors:");
  rknn_tensor_attr
      input_attrs[io_num.n_input];  // variable-length array; not recommended
  memset(input_attrs, 0, sizeof(input_attrs));
  for (int i = 0; i < io_num.n_input; i++) {
    input_attrs[i].index = i;
    ret = rknn_query(ctx_, RKNN_QUERY_INPUT_ATTR, &(input_attrs[i]),
                     sizeof(rknn_tensor_attr));
    if (ret != RKNN_SUCC) {
      KAYLORDUT_LOG_ERROR("input rknn_query failed! error code = {}", ret);
      return -1;
    }
    dump_tensor_attr(&(input_attrs[i]));
  }

  // Get Model Output Info
  KAYLORDUT_LOG_INFO("output tensors:");
  rknn_tensor_attr output_attrs[io_num.n_output];
  memset(output_attrs, 0, sizeof(output_attrs));
  // the official RK segment model has 13 outputs
  if (io_num.n_output == 13) {
    KAYLORDUT_LOG_INFO("this is a segment model")
    model_type_ = ModelType::SEGMENT;
  } else {
    model_type_ = ModelType::DETECTION;
  }
  for (int i = 0; i < io_num.n_output; i++) {
    output_attrs[i].index = i;
    ret = rknn_query(ctx_, RKNN_QUERY_OUTPUT_ATTR, &(output_attrs[i]),
                     sizeof(rknn_tensor_attr));
    if (ret != RKNN_SUCC) {
      KAYLORDUT_LOG_ERROR("output rknn_query failed! error code = {}", ret);
      return -1;
    }
    dump_tensor_attr(&(output_attrs[i]));
    if (i == 2) {
      char *found = strstr(output_attrs[i].name, "angle");
      if (found != NULL) {
        KAYLORDUT_LOG_INFO("this is an OBB model");
        model_type_ = ModelType::OBB;
      }
      found = strstr(output_attrs[i].name, "kpt");
      if (found != NULL) {
        KAYLORDUT_LOG_INFO("this is a POSE model");
        model_type_ = ModelType::POSE;
      }
      found = strstr(output_attrs[i].name, "yolov10");
      if (found != NULL) {
        KAYLORDUT_LOG_INFO("this is a Yolov10 detection model");
        model_type_ = ModelType::V10_DETECTION;
      }
    }
  }
  // Set to context
  app_ctx_.rknn_ctx = ctx_;
  if (output_attrs[0].qnt_type == RKNN_TENSOR_QNT_AFFINE_ASYMMETRIC &&
      output_attrs[0].type == RKNN_TENSOR_INT8) {
    app_ctx_.is_quant = true;
  } else {
    app_ctx_.is_quant = false;
  }
  app_ctx_.io_num = io_num;
  app_ctx_.input_attrs =
      (rknn_tensor_attr *)malloc(io_num.n_input * sizeof(rknn_tensor_attr));
  memcpy(app_ctx_.input_attrs, input_attrs,
         io_num.n_input * sizeof(rknn_tensor_attr));
  app_ctx_.output_attrs =
      (rknn_tensor_attr *)malloc(io_num.n_output * sizeof(rknn_tensor_attr));
  memcpy(app_ctx_.output_attrs, output_attrs,
         io_num.n_output * sizeof(rknn_tensor_attr));

  if (input_attrs[0].fmt == RKNN_TENSOR_NCHW) {
    KAYLORDUT_LOG_INFO("model is NCHW input fmt");
    app_ctx_.model_channel = input_attrs[0].dims[1];
    app_ctx_.model_height = input_attrs[0].dims[2];
    app_ctx_.model_width = input_attrs[0].dims[3];
  } else {
    KAYLORDUT_LOG_INFO("model is NHWC input fmt");
    app_ctx_.model_height = input_attrs[0].dims[1];
    app_ctx_.model_width = input_attrs[0].dims[2];
    app_ctx_.model_channel = input_attrs[0].dims[3];
  }
  KAYLORDUT_LOG_INFO("model input height={}, width={}, channel={}",
                     app_ctx_.model_height, app_ctx_.model_width,
                     app_ctx_.model_channel);
  // initialise the input/output descriptors
  inputs_ = std::make_unique<rknn_input[]>(app_ctx_.io_num.n_input);
  outputs_ = std::make_unique<rknn_output[]>(app_ctx_.io_num.n_output);
  inputs_[0].index = 0;
  inputs_[0].type = RKNN_TENSOR_UINT8;
  inputs_[0].fmt = RKNN_TENSOR_NHWC;
  inputs_[0].size =
      app_ctx_.model_width * app_ctx_.model_height * app_ctx_.model_channel;
  inputs_[0].buf = nullptr;
  return 0;
}

Yolov8::~Yolov8() { DeInit(); }

int Yolov8::DeInit() {
  if (app_ctx_.rknn_ctx != 0) {
    KAYLORDUT_LOG_INFO("rknn_destroy")
    rknn_destroy(app_ctx_.rknn_ctx);
    app_ctx_.rknn_ctx = 0;
  }
  if (app_ctx_.input_attrs != nullptr) {
    KAYLORDUT_LOG_INFO("free input_attrs");
    free(app_ctx_.input_attrs);
  }
  if (app_ctx_.output_attrs != nullptr) {
    KAYLORDUT_LOG_INFO("free output_attrs");
    free(app_ctx_.output_attrs);
  }
  return 0;
}

rknn_context *Yolov8::get_rknn_context() { return &(this->ctx_); }

int Yolov8::Inference(void *image_buf, object_detect_result_list *od_results,
                      letterbox_t letter_box) {
  TimeDuration total_duration;
  inputs_[0].buf = image_buf;
  int ret = rknn_inputs_set(app_ctx_.rknn_ctx, app_ctx_.io_num.n_input,
                            inputs_.get());
  if (ret < 0) {
    KAYLORDUT_LOG_ERROR("rknn_inputs_set failed! error code = {}", ret);
    return -1;
  }
  TimeDuration time_duration;
  ret = rknn_run(app_ctx_.rknn_ctx, nullptr);
  auto duration = std::chrono::duration_cast<std::chrono::milliseconds>(
      time_duration.DurationSinceLastTime());
  if (ret != RKNN_SUCC) {
    KAYLORDUT_LOG_ERROR("rknn_run failed, error code = {}", ret);
    return -1;
  }
  for (int i = 0; i < app_ctx_.io_num.n_output; ++i) {
    outputs_[i].index = i;
    outputs_[i].want_float = (!app_ctx_.is_quant);
  }
  outputs_lock_.lock();
  ret = rknn_outputs_get(app_ctx_.rknn_ctx, app_ctx_.io_num.n_output,
                         outputs_.get(), nullptr);
  if (ret != RKNN_SUCC) {
    KAYLORDUT_LOG_ERROR("rknn_outputs_get failed, error code = {}", ret);
    return -1;
  }
  const float nms_threshold = NMS_THRESH;       // default NMS threshold
  const float box_conf_threshold = BOX_THRESH;  // default confidence threshold
  // Post Process
  // zero the result list before filling it
  memset(od_results, 0, sizeof(object_detect_result_list));
  od_results->model_type = model_type_;
  KAYLORDUT_TIME_COST_INFO(
      "rknn_outputs_post_process",
      if (model_type_ == ModelType::SEGMENT) {
        post_process_seg(&app_ctx_, outputs_.get(), &letter_box,
                         box_conf_threshold, nms_threshold, od_results);
      } else if (model_type_ == ModelType::DETECTION ||
                 model_type_ == ModelType::V10_DETECTION) {
        post_process(&app_ctx_, outputs_.get(), &letter_box, box_conf_threshold,
                     nms_threshold, od_results);
      } else if (model_type_ == ModelType::OBB) {
        post_process_obb(&app_ctx_, outputs_.get(), &letter_box,
                         box_conf_threshold, nms_threshold, od_results);
      } else if (model_type_ == ModelType::POSE) {
        post_process_pose(&app_ctx_, outputs_.get(), &letter_box,
                          box_conf_threshold, nms_threshold, od_results);
      }
      /*else if (model_type_ == ModelType::V10_DETECTION) {
        post_process_v10_detection(&app_ctx_, outputs_.get(), &letter_box,
                                   box_conf_threshold, od_results);
      }*/
  );
  od_results->model_type = model_type_;

  // Remember to release rknn outputs_
  rknn_outputs_release(app_ctx_.rknn_ctx, app_ctx_.io_num.n_output,
                       outputs_.get());
  outputs_lock_.unlock();
  auto total_delta = std::chrono::duration_cast<std::chrono::milliseconds>(
      total_duration.DurationSinceLastTime());
  KAYLORDUT_LOG_DEBUG("Inference time is {}ms and total time is {}ms",
                      duration.count(), total_delta.count());
  return 0;
}

int Yolov8::get_model_width() { return app_ctx_.model_width; }

int Yolov8::get_model_height() { return app_ctx_.model_height; }
--------------------------------------------------------------------------------
/src/yolov8/videofile_demo.cpp:
--------------------------------------------------------------------------------
#include <cctype>  // for checking whether a character is a digit
#include <cerrno>  // for checking strtol and strtod errors

#include "getopt.h"
#include "image_process.h"
#include "kaylordut/log/logger.h"
#include "kaylordut/time/run_once.h"
#include "kaylordut/time/time_duration.h"
#include "rknn_pool.h"
#include "videofile.h"

struct ProgramOptions {
  std::string model_path;
  std::string label_path;
  std::string input_filename;
  int thread_count;
  double framerate;
  bool is_track = false;
};

// check whether a string represents a valid number
bool isNumber(const std::string &str) {
  char *end;
  errno = 0;
  std::strtod(str.c_str(), &end);
  return errno == 0 && *end == '\0' && end != str.c_str();
}

// parse the command-line arguments into a ProgramOptions struct
bool parseCommandLine(int argc, char *argv[], ProgramOptions &options) {
  options.is_track = false;
  static struct option longOpts[] = {
      {"model_path", required_argument, nullptr, 'm'},
      {"label_path", required_argument, nullptr, 'l'},
      {"threads", required_argument, nullptr, 't'},
      {"framerate", required_argument, nullptr, 'f'},
      {"input_filename", required_argument, nullptr, 'i'},
      {"help", no_argument, nullptr, 'h'},
      {"track", no_argument, nullptr, 'T'},
      {nullptr, 0, nullptr, 0}};

  int c, optionIndex = 0;
  while ((c = getopt_long(argc, argv, "m:l:t:f:i:hT", longOpts,
                          &optionIndex)) != -1) {
    switch (c) {
      case 'm':
        options.model_path = optarg;
        break;
      case 'l':
        options.label_path = optarg;
        break;
      case 't':
        if (isNumber(optarg)) {
          options.thread_count = std::atoi(optarg);
          if (options.thread_count <= 0) {
            KAYLORDUT_LOG_ERROR("Invalid number of threads: {}", optarg);
            return false;
          }
        } else {
          KAYLORDUT_LOG_ERROR("Thread count must be a number: {}", optarg);
          return false;
        }
        break;
      case 'f':
        if (isNumber(optarg)) {
          options.framerate = std::atof(optarg);
          if (options.framerate <= 0) {
            KAYLORDUT_LOG_ERROR("Invalid frame rate: {}", optarg);
            return false;
          }
        } else {
          KAYLORDUT_LOG_ERROR("Frame rate must be a number: {}", optarg);
          return false;
        }
        break;
      case 'i':
        options.input_filename = optarg;
        break;
      case 'h':
        std::cout << "Usage: " << argv[0]
                  << " [--model_path|-m model_path] [--input_filename|-i "
                     "input_filename] "
                     "[--threads|-t thread_count] [--framerate|-f framerate] "
                     "[--label_path|-l label_path]\n";
        exit(EXIT_SUCCESS);
      case 'T':
        options.is_track = true;
        break;
      case '?':
        // the error message is printed by getopt_long itself
        return false;
      default:
        std::cout << "Usage: " << argv[0]
                  << " [--model_path|-m model_path] [--input_filename|-i "
                     "input_filename] "
                     "[--threads|-t thread_count] [--framerate|-f framerate] "
                     "[--label_path|-l label_path]\n";
        abort();
    }
  }

  return true;
}

int main(int argc, char *argv[]) {
  KAYLORDUT_LOG_INFO("Yolov8 demo for rk3588");
  ProgramOptions options = {"", "", "", 0, 0.0};
  if (!parseCommandLine(argc, argv, options)) {
    KAYLORDUT_LOG_ERROR("Parse command failed.");
    return 1;
  }
  if (options.framerate == 0.0 || options.thread_count == 0 ||
      options.label_path.empty() || options.input_filename.empty() ||
      options.model_path.empty()) {
    KAYLORDUT_LOG_ERROR("Missing required options. Use --help for help.");
    return 1;
  }
  auto rknn_pool = std::make_unique<RknnPool>(
      options.model_path, options.thread_count, options.label_path);
  VideoFile video_file(options.input_filename.c_str());
  int delay = 1000 / options.framerate;
  ImageProcess image_process(video_file.get_frame_width(),
                             video_file.get_frame_height(), 640,
                             options.is_track, options.framerate);
  std::unique_ptr<cv::Mat> image;
  std::shared_ptr<cv::Mat> image_res;
  uint8_t running_flag = 0;
  cv::namedWindow("Video", cv::WINDOW_AUTOSIZE);
  static int image_count = 0;
  static int image_res_count = 0;
  TimeDuration time_duration;
  do {
    auto func = [&] {
      running_flag = 0;
      image = video_file.GetNextFrame();
      if (image != nullptr) {
        rknn_pool->AddInferenceTask(std::move(image), image_process);
        running_flag |= 0x01;
        image_count++;
      }
      image_res = rknn_pool->GetImageResultFromQueue();
      if (image_res != nullptr) {
        cv::imshow("Video", *image_res);
        image_res_count++;
        KAYLORDUT_LOG_INFO("image count = {}, image res count = {}, delta = {}",
                           image_count, image_res_count,
                           image_count - image_res_count);
        cv::waitKey(1);
        running_flag |= 0x10;
      }
    };
    run_once_with_delay(func, std::chrono::milliseconds(delay));
  }
  // keep looping while a frame was read or a result was fetched, the pool
  // still has pending tasks, or not every input frame has produced a result
  while (running_flag || rknn_pool->GetTasksSize() ||
         image_count > image_res_count);
  auto time = std::chrono::duration_cast<std::chrono::milliseconds>(
      time_duration.DurationSinceLastTime());
  double fps = image_res_count * 1000.0 / time.count();
  KAYLORDUT_LOG_INFO("Total time is {}ms, and average frame rate is {}fps",
                     time.count(), fps);
  rknn_pool.reset();
  KAYLORDUT_LOG_INFO("exit loop");
  cv::destroyAllWindows();
  return 0;
}
--------------------------------------------------------------------------------
/yolov8n-ros2.md:
--------------------------------------------------------------------------------
# The simplest way to test yolo with ROS2

# Install package
```bash
cat << 'EOF' | sudo tee /etc/apt/sources.list.d/kaylordut.list
deb [signed-by=/etc/apt/keyrings/kaylor-keyring.gpg] http://apt.kaylordut.cn/kaylordut/ kaylordut main
EOF
sudo mkdir /etc/apt/keyrings -pv
sudo wget -O /etc/apt/keyrings/kaylor-keyring.gpg http://apt.kaylordut.cn/kaylor-keyring.gpg
sudo apt update
sudo apt install ai-framework ros-humble-yolo ros-humble-kaylordut-usb-cam
```

# Run

```bash
ros2 launch kaylordut_usb_cam test.launch.py
ros2 launch yolo yolo.launch.py
```
--------------------------------------------------------------------------------