├── 8vs10.xlsx ├── LICENSE ├── readme.md ├── src └── yolov8 │ ├── .gitignore │ ├── CMakeLists.txt │ ├── camera_demo.cpp │ ├── imagefile_demo.cpp │ ├── include │ ├── Float16.h │ ├── camera.h │ ├── common.h │ ├── image_process.h │ ├── postprocess.h │ ├── rknn_pool.h │ ├── threadpool.h │ ├── videofile.h │ └── yolov8.h │ ├── model │ ├── .gitignore │ ├── coco_80_labels_list.txt │ ├── dotav1.txt │ ├── plane.png │ ├── ships.mp4 │ ├── yolov10n-2.0.0.rknn │ ├── yolov10n.onnx │ ├── yolov8m.rknn │ ├── yolov8n-obb.onnx │ ├── yolov8n-obb.rknn │ ├── yolov8n-pose.onnx │ ├── yolov8n-pose.rknn │ ├── yolov8n.rknn │ └── yolov8s.rknn │ ├── utils │ ├── CMakeLists.txt │ ├── camera.cpp │ ├── image_process.cpp │ ├── postprocess.cpp │ ├── rknn_pool.cpp │ ├── videofile.cpp │ └── yolov8.cpp │ └── videofile_demo.cpp └── yolov8n-ros2.md /8vs10.xlsx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/kaylorchen/rk3588-yolo-demo/1ebfe9ff4db93ae3b2c30dec6cd11a628a531210/8vs10.xlsx -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2024 kaylor.chen 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 
14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /readme.md: -------------------------------------------------------------------------------- 1 | 2 | # A Few Words Before You Start 3 | If you are reading this repository, chances are you want to try this multi-threaded inference demo. 4 | 1. The code here is not optimal; there are some frame-ordering issues that must be solved by other means, and I have not solved them here. The new repository linked under the next heading fixes this problem. 5 | 2. The ideas behind the code in this repository are explained in detail on my Bilibili channel; search for "kaylordut" on Bilibili if you want to understand the program. 6 | 3. For project cooperation, email kaylor.chen@qq.com. Please state your purpose, a brief description of your requirements, and your budget. I reply to most emails; please do not ask for my WeChat right away. A feasible project or a good technical exchange is the best way to start. 7 | 8 | # New Project 9 | An inference framework compatible with TensorRT, OnnxRuntime, NNRT and RKNN 10 | If you want more YOLOv8/YOLOv11 demos and Depth Anything demos, visit my other [repository](https://github.com/kaylorchen/ai_framework_demo) 11 | 12 | # Yolov8/v10 Demo for RK3588 13 | The project is a multi-threaded inference demo of Yolov8 running on the RK3588 platform, which has been adapted for reading video files and camera feeds. The demo uses the Yolov8n model for file inference, with a maximum inference frame rate of up to 100 frames per second.
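The throughput gain comes from keeping several frames in flight at once: frames are submitted to a pool of workers, each owning its own inference context, and results are collected as they complete. The idea can be sketched with standard C++ only (a simplified illustration; `fake_infer` and the futures-based dispatch stand in for this project's RknnPool and its per-thread RKNN contexts, which are not shown here):

```cpp
#include <future>
#include <vector>

// Stand-in for one model instance running inference on one frame.
// In the real demo each worker owns its own RKNN context.
int fake_infer(int frame_id) { return frame_id * 2; }  // pretend "detection count"

std::vector<int> run_pipeline(int num_frames) {
  std::vector<std::future<int>> pending;
  pending.reserve(num_frames);
  for (int i = 0; i < num_frames; ++i) {
    // std::async stands in for submitting a task to the thread pool
    pending.emplace_back(std::async(std::launch::async, fake_infer, i));
  }
  std::vector<int> results;
  // Collect in submission order; the real demo collects from a result queue,
  // which is why out-of-order frames can occur (see note 1 above).
  for (auto &f : pending) results.push_back(f.get());
  return results;
}
```

Collecting futures in submission order keeps the output sequential even though the work runs concurrently; the demo instead drains a queue for lower latency, at the cost of possible reordering.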
14 | 15 | > If you want to test yolov8n with ros2 on your own kit, click the [link](./yolov8n-ros2.md) 16 | 17 | # Model 18 | ## Download Model File 19 | You can find the model files in 'src/yolov8/model', and the large files here: 20 | Link: https://pan.baidu.com/s/1zfSVzR1G7mb-EQvs6A6ZYw?pwd=gmcs Password: gmcs 21 | Google Drive: https://drive.google.com/drive/folders/1FYluJpdaL-680pipgIQ1zsqqRvNbruEp?usp=sharing 22 | 23 | ## Model pt --> onnx 24 | ### For Yolov8 25 | go to my blog --> [blog.kaylordut.com](https://blog.kaylordut.com/2024/02/09/rk3588's-yolov8-model-conversion-from-pt-to-rknn/#more) 26 | ### For Yolov10 27 | go to my other repository --> [yolov10](https://github.com/kaylorchen/yolov10) 28 | download the pt model and export it: 29 | ```bash 30 | # End-to-End ONNX 31 | yolo export model=yolov10n/s/m/b/l/x.pt format=onnx opset=13 simplify 32 | ``` 33 | 34 | ## Model onnx --> rknn 35 | go to my blog --> [blog.kaylordut.com](https://blog.kaylordut.com/2024/02/09/rk3588's-yolov8-model-conversion-from-pt-to-rknn/#more) 36 | > TIPS: (Yolov10) 37 | > - rknn-toolkit2 (release 1.6.0) does not support some attention-related operators, so it runs the attention steps on the CPU, which increases inference time. 38 | > - rknn-toolkit2 (beta 2.0.0b12) has the attention operators for the 3588, so I built a docker image; you can pull it from __**kaylor/rknn_onnx2rknn:beta**__ 39 | 40 | ## Inference Time 41 | Please refer to the spreadsheet '[8vs10.xlsx](./8vs10.xlsx)' for details.
42 | 43 | |V8l-2.0.0| V8l-1.6.0| V10l-2.0.0| V10l-1.6.0| V8n-2.0.0 |V8n-1.6.0 |V10n-2.0.0| V10n-1.6.0| 44 | |:-------:|:-------:|:-------:|:-------:|:-------:|:-------:|:-------:|:-------:| 45 | |133.08| 133.83| 122.99| 204.47| 17.90| 18.33| 21.30| 49.99| 46 | 47 | 48 | 49 | 50 | 51 | # Demo Video and Guideline 52 | https://space.bilibili.com/327258623?spm_id_from=333.999.0.0 53 | QQ group1: 957577822 (full) 54 | QQ group2: 546943464 55 | 56 | # Prepare 57 | 58 | ## Build the Cross-Compilation Environment 59 | Set up a cross-compilation environment by following this [link](https://github.com/kaylorchen/rk3588_dev_rootfs). 60 | 61 | ## Install Runtime Libraries on Your RK3588 Target Board 62 | ```bash 63 | cat << 'EOF' | sudo tee /etc/apt/sources.list.d/kaylordut.list 64 | deb [signed-by=/etc/apt/keyrings/kaylor-keyring.gpg] http://apt.kaylordut.cn/kaylordut/ kaylordut main 65 | EOF 66 | sudo mkdir /etc/apt/keyrings -pv 67 | sudo wget -O /etc/apt/keyrings/kaylor-keyring.gpg http://apt.kaylordut.cn/kaylor-keyring.gpg 68 | sudo apt update 69 | sudo apt install kaylordut-dev libbytetrack 70 | ``` 71 | > If your OS is not Ubuntu 22.04, you can find the [kaylordut-dev](https://github.com/kaylorchen/kaylordut) and [libbytetrack](https://github.com/kaylorchen/ByteTrack) sources on my GitHub and build them yourself. 72 | 73 | 74 | ## Build the Project for Your RK3588 75 | 76 | - Compile 77 | 78 | ```bash 79 | git clone https://github.com/kaylorchen/rk3588-yolo-demo.git 80 | cd rk3588-yolo-demo/src/yolov8 81 | mkdir build 82 | cd build 83 | cmake -DCMAKE_TOOLCHAIN_FILE=/path/to/toolchain-aarch64.cmake -DCMAKE_EXPORT_COMPILE_COMMANDS=ON ..
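# The toolchain file tells CMake to cross-compile for the board.
# A minimal toolchain-aarch64.cmake looks roughly like this (an
# illustrative sketch; the linked rootfs repository provides the real one):
#   set(CMAKE_SYSTEM_NAME Linux)
#   set(CMAKE_SYSTEM_PROCESSOR aarch64)
#   set(CMAKE_C_COMPILER aarch64-linux-gnu-gcc)
#   set(CMAKE_CXX_COMPILER aarch64-linux-gnu-g++)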
84 | make 85 | ``` 86 | > /path/to/toolchain-aarch64.cmake is the absolute path of your toolchain .cmake file 87 | 88 | - Run 89 | 90 | ``` bash 91 | 92 | Usage: ./videofile_demo [--model_path|-m model_path] [--input_filename|-i input_filename] [--threads|-t thread_count] [--framerate|-f framerate] [--label_path|-l label_path] 93 | 94 | Usage: ./camera_demo [--model_path|-m model_path] [--camera_index|-i index] [--width|-w width] [--height|-h height] [--threads|-t thread_count] [--fps|-f framerate] [--label_path|-l label_path] 95 | 96 | Usage: ./imagefile_demo [--model_path|-m model_path] [--input_filename|-i input_filename] [--label_path|-l label_path] 97 | 98 | ``` 99 | 100 | > You can run the above commands on your RK3588 101 | 102 | 103 | 104 | -------------------------------------------------------------------------------- /src/yolov8/.gitignore: -------------------------------------------------------------------------------- 1 | cmake-build-*/ 2 | build/ 3 | compile_commands.json 4 | .idea/ 5 | .vscode/ 6 | -------------------------------------------------------------------------------- /src/yolov8/CMakeLists.txt: -------------------------------------------------------------------------------- 1 | cmake_minimum_required(VERSION 3.22) 2 | project(yolov8) 3 | 4 | set(CMAKE_CXX_STANDARD 17) 5 | set(CMAKE_BUILD_TYPE Release) # None, Debug, Release, RelWithDebInfo, MinSizeRel 6 | #SET(CMAKE_CXX_FLAGS_DEBUG "$ENV{CXXFLAGS} -Wno-unused-parameter -O0 -g -Wall") 7 | #SET(CMAKE_CXX_FLAGS_RELEASE "$ENV{CXXFLAGS} -Wno-unused-parameter -O0 -g -Wall") 8 | SET(CMAKE_CXX_FLAGS_DEBUG "$ENV{CXXFLAGS} -Wno-unused-parameter -O3 -g -Wall") 9 | SET(CMAKE_CXX_FLAGS_RELEASE "$ENV{CXXFLAGS} -Wno-unused-parameter -O3 -g -Wall") 10 | 11 | include_directories(include) 12 | 13 | add_subdirectory(utils) 14 | 15 | find_package(OpenCV REQUIRED) 16 | include_directories(${OpenCV_INCLUDE_DIRS}) 17 | find_package(kaylordut REQUIRED) 18 | find_package(bytetrack REQUIRED) 19 | 20 | 21 | add_executable(videofile_demo
videofile_demo.cpp) 22 | target_link_libraries(videofile_demo ${kaylordut_LIBS} ${OpenCV_LIBS} yolov8-kaylordut ${bytetrack_LIBS}) 23 | 24 | add_executable(camera_demo camera_demo.cpp) 25 | target_link_libraries(camera_demo ${kaylordut_LIBS} ${OpenCV_LIBS} yolov8-kaylordut ${bytetrack_LIBS}) 26 | 27 | add_executable(imagefile_demo imagefile_demo.cpp) 28 | target_link_libraries(imagefile_demo ${kaylordut_LIBS} ${OpenCV_LIBS} yolov8-kaylordut ${bytetrack_LIBS}) 29 | -------------------------------------------------------------------------------- /src/yolov8/camera_demo.cpp: -------------------------------------------------------------------------------- 1 | // 2 | // Created by kaylor on 3/9/24. 3 | // 4 | #include <cctype>  // to check whether a character is a digit 5 | #include <cerrno>  // to check strtol and strtod errors 6 | 7 | #include "camera.h" 8 | #include "getopt.h" 9 | #include "kaylordut/log/logger.h" 10 | #include "kaylordut/time/time_duration.h" 11 | #include "kaylordut/time/timeout.h" 12 | #include "rknn_pool.h" 13 | 14 | struct ProgramOptions { 15 | std::string model_path; 16 | std::string label_path; 17 | int thread_count; 18 | int camera_index; 19 | int width; 20 | int height; 21 | double fps; 22 | bool is_track = false; 23 | }; 24 | 25 | // Check whether a string represents a valid number 26 | bool isNumber(const std::string &str) { 27 | char *end; 28 | errno = 0; 29 | std::strtod(str.c_str(), &end); 30 | return errno == 0 && *end == '\0' && end != str.c_str(); 31 | } 32 | 33 | // Parse the command-line arguments and fill a ProgramOptions struct 34 | bool parseCommandLine(int argc, char *argv[], ProgramOptions &options) { 35 | options.is_track = false; 36 | static struct option longOpts[] = { 37 | {"model_path", required_argument, nullptr, 'm'}, 38 | {"label_path", required_argument, nullptr, 'l'}, 39 | {"threads", required_argument, nullptr, 't'}, 40 | {"camera_index", required_argument, nullptr, 'i'}, 41 | {"width", required_argument, nullptr, 'w'}, 42 | {"height", required_argument, nullptr, 'h'}, 43 | {"fps", required_argument, nullptr, 'f'}, 44 | {"help",
no_argument, nullptr, '?'}, 45 | {"track", no_argument, nullptr, 'T'}, 46 | {nullptr, 0, nullptr, 0}}; 47 | 48 | int c, optionIndex = 0; 49 | while ((c = getopt_long(argc, argv, "m:l:t:i:w:h:f:?T", longOpts, 50 | &optionIndex)) != -1) { 51 | switch (c) { 52 | case 'm': 53 | options.model_path = optarg; 54 | break; 55 | case 'l': 56 | options.label_path = optarg; 57 | break; 58 | case 't': 59 | if (isNumber(optarg)) { 60 | options.thread_count = std::atoi(optarg); 61 | if (options.thread_count <= 0) { 62 | KAYLORDUT_LOG_ERROR("Invalid number of threads: {}", optarg); 63 | return false; 64 | } 65 | } else { 66 | KAYLORDUT_LOG_ERROR("Thread count must be a number: {}", optarg); 67 | return false; 68 | } 69 | break; 70 | case 'i': 71 | if (isNumber(optarg)) { 72 | options.camera_index = std::atoi(optarg); 73 | if (options.camera_index < 0) { 74 | KAYLORDUT_LOG_ERROR("Invalid index of camera: {}", optarg); 75 | return false; 76 | } 77 | } else { 78 | KAYLORDUT_LOG_ERROR("Camera index must be a number: {}", optarg); 79 | return false; 80 | } 81 | break; 82 | case 'w': 83 | if (isNumber(optarg)) { 84 | options.width = std::atoi(optarg); 85 | if (options.width < 0) { 86 | KAYLORDUT_LOG_ERROR("Invalid width: {}", optarg); 87 | return false; 88 | } 89 | } else { 90 | KAYLORDUT_LOG_ERROR("Width must be a number: {}", optarg); 91 | return false; 92 | } 93 | break; 94 | case 'h': 95 | if (isNumber(optarg)) { 96 | options.height = std::atoi(optarg); 97 | if (options.height < 0) { 98 | KAYLORDUT_LOG_ERROR("Invalid height: {}", optarg); 99 | return false; 100 | } 101 | } else { 102 | KAYLORDUT_LOG_ERROR("Height must be a number: {}", optarg); 103 | return false; 104 | } 105 | break; 106 | case 'f': 107 | if (isNumber(optarg)) { 108 | options.fps = std::atof(optarg); 109 | if (options.fps <= 0) { 110 | KAYLORDUT_LOG_ERROR("Invalid frame rate: {}", optarg); 111 | return false; 112 | } 113 | } else { 114 | KAYLORDUT_LOG_ERROR("Frame rate must be a number: {}", optarg); 115 | return 
false; 116 | } 117 | break; 118 | case 'T': 119 | options.is_track = true; 120 | break; 121 | case '?': 122 | std::cout << "Usage: " << argv[0] 123 | << " [--model_path|-m model_path] [--camera_index|-i index] " 124 | "[--width|-w width] [--height|-h height] " 125 | "[--threads|-t thread_count] [--fps|-f framerate] " 126 | "[--label_path|-l label_path]\n"; 127 | exit(EXIT_SUCCESS); 128 | default: 129 | std::cout << "Usage: " << argv[0] 130 | << " [--model_path|-m model_path] [--camera_index|-i index] " 131 | "[--width|-w width] [--height|-h height] " 132 | "[--threads|-t thread_count] [--fps|-f framerate] " 133 | "[--label_path|-l label_path]\n"; 134 | abort(); 135 | } 136 | } 137 | 138 | return true; 139 | } 140 | 141 | std::string getCurrentTimeStr() { 142 | auto now = std::chrono::system_clock::now(); 143 | auto in_time_t = std::chrono::system_clock::to_time_t(now); 144 | 145 | std::stringstream ss; 146 | ss << std::put_time(std::localtime(&in_time_t), "%Y%m%d_%H%M%S"); 147 | return ss.str(); 148 | } 149 | 150 | int main(int argc, char *argv[]) { 151 | ProgramOptions options = {"", "", 0, 0, 0, 0, 0.0}; 152 | if (!parseCommandLine(argc, argv, options)) { 153 | KAYLORDUT_LOG_ERROR("Parse command failed."); 154 | return 1; 155 | } 156 | if (options.fps == 0.0 || options.thread_count == 0 || options.height == 0 || 157 | options.width == 0 || options.label_path.empty() || 158 | options.model_path.empty()) { 159 | KAYLORDUT_LOG_ERROR("Missing required options. 
Use --help for help."); 160 | return 1; 161 | } 162 | auto rknn_pool = std::make_unique<RknnPool>( 163 | options.model_path, options.thread_count, options.label_path); 164 | auto camera = std::make_unique<Camera>( 165 | options.camera_index, cv::Size(options.width, options.height), 166 | options.fps); 167 | ImageProcess image_process(options.width, options.height, 640, 168 | options.is_track, options.fps); 169 | // cv::VideoWriter video_writer( 170 | // getCurrentTimeStr() + ".mkv", cv::VideoWriter::fourcc('X', '2', '6', 171 | // '4'), options.fps, cv::Size(options.width, options.height), true); 172 | // if (!video_writer.isOpened()) { 173 | // KAYLORDUT_LOG_ERROR("Open the output video file error"); 174 | // return -1; 175 | // } 176 | std::unique_ptr<cv::Mat> image; 177 | std::shared_ptr<cv::Mat> image_res; 178 | cv::namedWindow("Video", cv::WINDOW_AUTOSIZE); 179 | static int image_count = 0; 180 | static int image_res_count = 0; 181 | TimeDuration time_duration; 182 | Timeout timeout(std::chrono::seconds(30)); 183 | TimeDuration total_time; 184 | while ((!timeout.isTimeout()) || (image_count != image_res_count)) { 185 | auto func = [&] { 186 | if (!timeout.isTimeout()) { 187 | image = camera->GetNextFrame(); 188 | } 189 | if (image != nullptr) { 190 | rknn_pool->AddInferenceTask(std::move(image), image_process); 191 | image_count++; 192 | } 193 | image_res = rknn_pool->GetImageResultFromQueue(); 194 | if (image_res != nullptr) { 195 | image_res_count++; 196 | auto duration = std::chrono::duration_cast<std::chrono::milliseconds>( 197 | time_duration.DurationSinceLastTime()); 198 | KAYLORDUT_LOG_INFO( 199 | "image count = {}, image res count = {}, delta = {}, duration = " 200 | "{}ms", 201 | image_count, image_res_count, image_count - image_res_count, 202 | duration.count()); 203 | cv::imshow("Video", *image_res); 204 | // video_writer.write(*image_res); 205 | cv::waitKey(1); 206 | } 207 | }; 208 | func(); 209 | } 210 | auto duration_total_time = 211 | std::chrono::duration_cast<std::chrono::milliseconds>( 212 | total_time.DurationSinceLastTime()); 213
| KAYLORDUT_LOG_INFO( 214 | "Process {} frames, total time is {}ms, average frame rate is {}", 215 | image_res_count, duration_total_time.count(), 216 | image_res_count * 1000.0 / duration_total_time.count()); 217 | // video_writer.release(); 218 | rknn_pool.reset(); 219 | cv::destroyAllWindows(); 220 | return 0; 221 | } -------------------------------------------------------------------------------- /src/yolov8/imagefile_demo.cpp: -------------------------------------------------------------------------------- 1 | #include <cctype>  // to check whether a character is a digit 2 | #include <cerrno>  // to check strtol and strtod errors 3 | 4 | #include "getopt.h" 5 | #include "image_process.h" 6 | #include "kaylordut/log/logger.h" 7 | #include "kaylordut/time/run_once.h" 8 | #include "kaylordut/time/time_duration.h" 9 | #include "rknn_pool.h" 10 | 11 | struct ProgramOptions { 12 | std::string model_path; 13 | std::string label_path; 14 | std::string input_filename; 15 | int thread_count; 16 | double framerate; 17 | }; 18 | 19 | // Check whether a string represents a valid number 20 | bool isNumber(const std::string &str) { 21 | char *end; 22 | errno = 0; 23 | std::strtod(str.c_str(), &end); 24 | return errno == 0 && *end == '\0' && end != str.c_str(); 25 | } 26 | 27 | // Parse the command-line arguments and fill a ProgramOptions struct 28 | bool parseCommandLine(int argc, char *argv[], ProgramOptions &options) { 29 | static struct option longOpts[] = { 30 | {"model_path", required_argument, nullptr, 'm'}, 31 | {"label_path", required_argument, nullptr, 'l'}, 32 | {"input_filename", required_argument, nullptr, 'i'}, 33 | {"help", no_argument, nullptr, 'h'}, 34 | {nullptr, 0, nullptr, 0}}; 35 | 36 | int c, optionIndex = 0; 37 | while ((c = getopt_long(argc, argv, "m:l:t:f:i:h", longOpts, &optionIndex)) != 38 | -1) { 39 | switch (c) { 40 | case 'm': 41 | options.model_path = optarg; 42 | break; 43 | case 'l': 44 | options.label_path = optarg; 45 | break; 46 | case 'i': 47 | options.input_filename = optarg; 48 | break; 49 | case 'h': 50 | std::cout << "Usage: " << argv[0] 51 | << " 
[--model_path|-m model_path] [--input_filename|-i " 52 | "input_filename] " 53 | "[--label_path|-l label_path]\n"; 54 | exit(EXIT_SUCCESS); 55 | case '?': 56 | // Error messages are printed by getopt_long automatically 57 | return false; 58 | default: 59 | std::cout << "Usage: " << argv[0] 60 | << " [--model_path|-m model_path] [--input_filename|-i " 61 | "input_filename] " 62 | "[--label_path|-l label_path]\n"; 63 | abort(); 64 | } 65 | } 66 | 67 | return true; 68 | } 69 | 70 | int main(int argc, char *argv[]) { 71 | KAYLORDUT_LOG_INFO("Yolov8 demo for rk3588"); 72 | ProgramOptions options = {"", "", "", 1, 1.0}; 73 | if (!parseCommandLine(argc, argv, options)) { 74 | KAYLORDUT_LOG_ERROR("Parse command failed."); 75 | return 1; 76 | } 77 | if (options.framerate == 0.0 || options.thread_count == 0 || 78 | options.label_path.empty() || options.input_filename.empty() || 79 | options.model_path.empty()) { 80 | KAYLORDUT_LOG_ERROR("Missing required options. Use --help for help."); 81 | return 1; 82 | } 83 | auto rknn_pool = std::make_unique<RknnPool>( 84 | options.model_path, options.thread_count, options.label_path); 85 | std::unique_ptr<cv::Mat> image = std::make_unique<cv::Mat>(); 86 | *image = cv::imread(options.input_filename); 87 | if (image->empty()) { 88 | KAYLORDUT_LOG_ERROR("read image error"); 89 | return -1; 90 | } 91 | ImageProcess image_process(image->cols, image->rows, 640); 92 | 93 | std::shared_ptr<cv::Mat> image_res; 94 | uint8_t running_flag = 0; 95 | cv::namedWindow("Image demo", cv::WINDOW_AUTOSIZE); 96 | static int image_count = 0; 97 | static int image_res_count = 0; 98 | rknn_pool->AddInferenceTask(std::move(image), image_process); 99 | while (image_res == nullptr) { 100 | image_res = rknn_pool->GetImageResultFromQueue(); 101 | } 102 | cv::imshow("Image demo", *image_res); 103 | cv::waitKey(0); 104 | rknn_pool.reset(); 105 | cv::destroyAllWindows(); 106 | cv::imwrite("result_" + options.input_filename, *image_res); 107 | return 0; 108 | }
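The `ImageProcess` constructed above letterboxes the input into the model's square 640x640 input. The scale/padding arithmetic can be sketched as follows (a standalone illustration; the struct mirrors the `letterbox_t` fields from common.h, and centered padding is an assumption of this sketch):

```cpp
#include <algorithm>

// Mirrors letterbox_t from common.h: a scale factor plus x/y padding.
struct LetterBox {
  int x_pad;
  int y_pad;
  float scale;
};

// Compute how a src image of (width x height) fits into a square
// target_size x target_size model input while keeping the aspect ratio.
LetterBox ComputeLetterBox(int width, int height, int target_size) {
  LetterBox lb{};
  lb.scale = static_cast<float>(target_size) / std::max(width, height);
  int new_w = static_cast<int>(width * lb.scale);
  int new_h = static_cast<int>(height * lb.scale);
  lb.x_pad = (target_size - new_w) / 2;  // left/right padding
  lb.y_pad = (target_size - new_h) / 2;  // top/bottom padding
  return lb;
}
```

Post-processing inverts the same transform: subtract the padding from a box coordinate, then divide by the scale to map it back to the source image.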
-------------------------------------------------------------------------------- /src/yolov8/include/Float16.h: -------------------------------------------------------------------------------- 1 | #ifndef _RKNPU2_RKNN_MATMUL_API_DEMO_H_ 2 | #define _RKNPU2_RKNN_MATMUL_API_DEMO_H_ 3 | 4 | namespace rknpu2 { 5 | 6 | using ushort = unsigned short; 7 | 8 | typedef union suf32 { 9 | int i; 10 | unsigned u; 11 | float f; 12 | } suf32; 13 | 14 | class float16 { 15 | public: 16 | float16() {} 17 | explicit float16(float x) { w = bits(x); } 18 | 19 | operator float() const { 20 | suf32 out; 21 | 22 | unsigned t = ((w & 0x7fff) << 13) + 0x38000000; 23 | unsigned sign = (w & 0x8000) << 16; 24 | unsigned e = w & 0x7c00; 25 | 26 | out.u = t + (1 << 23); 27 | out.u = (e >= 0x7c00 ? t + 0x38000000 28 | : e == 0 ? (static_cast<void>(out.f -= 6.103515625e-05f), out.u) 29 | : t) | 30 | sign; 31 | return out.f; 32 | } 33 | 34 | static float16 fromBits(ushort b) { 35 | float16 result; 36 | result.w = b; 37 | return result; 38 | } 39 | static float16 zero() { 40 | float16 result; 41 | result.w = (ushort)0; 42 | return result; 43 | } 44 | ushort bits() const { return w; } 45 | 46 | static ushort bits(float x) { 47 | suf32 in; 48 | in.f = x; 49 | unsigned sign = in.u & 0x80000000; 50 | in.u ^= sign; 51 | ushort w; 52 | 53 | if (in.u >= 0x47800000) 54 | w = (ushort)(in.u > 0x7f800000 ?
0x7e00 : 0x7c00); 55 | else { 56 | if (in.u < 0x38800000) { 57 | in.f += 0.5f; 58 | w = (ushort)(in.u - 0x3f000000); 59 | } else { 60 | unsigned t = in.u + 0xc8000fff; 61 | w = (ushort)((t + ((in.u >> 13) & 1)) >> 13); 62 | } 63 | } 64 | 65 | w = (ushort)(w | (sign >> 16)); 66 | 67 | return w; 68 | } 69 | 70 | float16& operator=(float x) { 71 | w = bits(x); 72 | return *this; 73 | } 74 | 75 | float16& operator+=(float x) { 76 | w = bits(float() + x); 77 | return *this; 78 | } 79 | 80 | float16& operator/(float x) { 81 | w = bits(float() / x); 82 | return *this; 83 | } 84 | 85 | inline bool is_nan() const { 86 | return ((w & 0x7c00u) == 0x7c00u) && ((w & 0x03ffu) != 0x0000u); 87 | } 88 | 89 | inline bool greater(const float16& x) const { 90 | bool sign = w & 0x8000; 91 | bool sign_x = x.w & 0x8000; 92 | if (sign) { 93 | if (sign_x) 94 | return w < x.w; 95 | else 96 | return false; 97 | } else { 98 | if (sign_x) /* Signed zeros are equal, have to check for it */ 99 | return (w != 0 || x.w != 0x8000); 100 | else 101 | return w > x.w; 102 | } 103 | return false; 104 | } 105 | 106 | inline bool less(const float16& x) const { 107 | bool sign = w & 0x8000; 108 | bool sign_x = x.w & 0x8000; 109 | if (sign) { 110 | if (sign_x) 111 | return w > x.w; 112 | else 113 | /* Signed zeros are equal, have to check for it */ 114 | return (w != 0x8000 || x.w != 0); 115 | } else { 116 | if (sign_x) 117 | return false; 118 | else 119 | return w < x.w; 120 | } 121 | return false; 122 | } 123 | 124 | inline bool operator>(const float16& x) const { 125 | if (is_nan() || x.is_nan()) { 126 | return false; 127 | } 128 | return greater(x); 129 | } 130 | 131 | inline bool operator<(const float16& x) const { 132 | if (is_nan() || x.is_nan()) { 133 | return false; 134 | } 135 | return less(x); 136 | } 137 | 138 | inline bool operator>=(const float16& x) const { 139 | if (is_nan() || x.is_nan()) { 140 | return false; 141 | } 142 | return !less(x); 143 | } 144 | 145 | inline bool operator<=(const 
float16& x) const { 146 | if (is_nan() || x.is_nan()) { 147 | return false; 148 | } 149 | return !greater(x); 150 | } 151 | 152 | inline bool operator==(const float16& x) const { 153 | /* 154 | * The equality cases are as follows: 155 | * - If either value is NaN, never equal. 156 | * - If the values are equal, equal. 157 | * - If the values are both signed zeros, equal. 158 | */ 159 | if (is_nan() || x.is_nan()) { 160 | return false; 161 | } 162 | return (w == x.w || ((w | x.w) & 0x7fff) == 0); 163 | } 164 | 165 | inline bool operator!=(const float16& x) const { return !((*this) == x); } 166 | 167 | protected: 168 | ushort w = 0; 169 | }; 170 | 171 | } // namespace rknpu2 172 | 173 | #endif /* _RKNPU2_RKNN_MATMUL_API_DEMO_H_ */ -------------------------------------------------------------------------------- /src/yolov8/include/camera.h: -------------------------------------------------------------------------------- 1 | // 2 | // Created by kaylor on 3/9/24. 3 | // 4 | 5 | #pragma once 6 | #include "opencv2/opencv.hpp" 7 | 8 | class Camera { 9 | public: 10 | Camera(uint16_t index, cv::Size size, double framerate); 11 | ~Camera(); 12 | std::unique_ptr GetNextFrame(); 13 | 14 | private: 15 | cv::Size size_; 16 | cv::VideoCapture capture_; 17 | }; 18 | -------------------------------------------------------------------------------- /src/yolov8/include/common.h: -------------------------------------------------------------------------------- 1 | // 2 | // Created by kaylor on 3/5/24. 
3 | // 4 | 5 | #pragma once 6 | #include "rknn_api.h" 7 | #define OBJ_NAME_MAX_SIZE 64 8 | #define OBJ_NUMB_MAX_SIZE 128 9 | #define OBJ_CLASS_NUM 80 10 | #define NMS_THRESH 0.8 11 | #define BOX_THRESH 0.5 12 | #define PROTO_CHANNEL (32) 13 | #define PROTO_HEIGHT (160) 14 | #define PROTO_WEIGHT (160) 15 | enum ModelType { 16 | UNKNOWN = 0, 17 | SEGMENT = 1, 18 | DETECTION = 2, 19 | OBB = 3, 20 | POSE = 4, 21 | V10_DETECTION = 5, 22 | }; 23 | /** 24 | * @brief LetterBox 25 | * 26 | */ 27 | typedef struct { 28 | int x_pad; 29 | int y_pad; 30 | float scale; 31 | } letterbox_t; 32 | /** 33 | * @brief Image rectangle 34 | * 35 | */ 36 | 37 | typedef struct { 38 | float kpt[34]; 39 | float visibility[17]; 40 | } object_pose_result; 41 | 42 | typedef struct { 43 | int x; 44 | int y; 45 | int w; 46 | int h; 47 | float theta; 48 | } image_xywht_t; 49 | 50 | typedef struct { 51 | image_xywht_t box; 52 | float prop; 53 | int cls_id; 54 | } object_obb_result; 55 | 56 | typedef struct { 57 | int left; 58 | int top; 59 | int right; 60 | int bottom; 61 | } image_rect_t; 62 | 63 | typedef struct { 64 | image_rect_t box; 65 | float prop; 66 | int cls_id; 67 | } object_detect_result; 68 | 69 | typedef struct { 70 | uint8_t *seg_mask; 71 | } object_segment_result; 72 | 73 | typedef struct { 74 | int id; 75 | int count; 76 | ModelType model_type; 77 | object_detect_result results[OBJ_NUMB_MAX_SIZE]; 78 | object_segment_result results_seg[OBJ_NUMB_MAX_SIZE]; 79 | object_obb_result results_obb[OBJ_NUMB_MAX_SIZE]; 80 | object_pose_result results_pose[OBJ_NUMB_MAX_SIZE]; 81 | } object_detect_result_list; 82 | 83 | typedef struct { 84 | rknn_context rknn_ctx; 85 | rknn_input_output_num io_num; 86 | rknn_tensor_attr *input_attrs; 87 | rknn_tensor_attr *output_attrs; 88 | int model_channel; 89 | int model_width; 90 | int model_height; 91 | bool is_quant; 92 | } rknn_app_context_t; 93 | -------------------------------------------------------------------------------- 
/src/yolov8/include/image_process.h: -------------------------------------------------------------------------------- 1 | // 2 | // Created by kaylor on 3/4/24. 3 | // 4 | 5 | #pragma once 6 | #include "BYTETracker.h" 7 | #include "mutex" 8 | #include "opencv2/opencv.hpp" 9 | #include "postprocess.h" 10 | 11 | class ImageProcess { 12 | public: 13 | ImageProcess(int width, int height, int target_size, bool is_track = false, 14 | int framerate = 30); 15 | std::unique_ptr<cv::Mat> Convert(const cv::Mat &src); 16 | const letterbox_t &get_letter_box(); 17 | void ImagePostProcess(cv::Mat &image, object_detect_result_list &od_results); 18 | 19 | private: 20 | double scale_; 21 | int padding_x_; 22 | int padding_y_; 23 | cv::Size new_size_; 24 | int target_size_; 25 | letterbox_t letterbox_; 26 | bool is_track_; 27 | std::unique_ptr<BYTETracker> tracker_; 28 | std::mutex tracker_mutex_; 29 | void ProcessDetectionImage(cv::Mat &image, 30 | object_detect_result_list &od_results) const; 31 | void ProcessTrackImage(cv::Mat &image, object_detect_result_list &od_results); 32 | void ProcessPoseImage(cv::Mat &image, 33 | object_detect_result_list &od_results) const; 34 | void ProcessOBBImage(cv::Mat &image, 35 | const object_detect_result_list &od_results) const; 36 | }; 37 | -------------------------------------------------------------------------------- /src/yolov8/include/postprocess.h: -------------------------------------------------------------------------------- 1 | // 2 | // Created by kaylor on 3/5/24.
3 | // 4 | 5 | #pragma once 6 | #include <stdint.h> 7 | 8 | #include <string> 9 | #include <vector> 10 | 11 | #include "common.h" 12 | #include "rknn_api.h" 13 | 14 | int init_post_process(std::string &label_path); 15 | void deinit_post_process(); 16 | const char *coco_cls_to_name(int cls_id); 17 | int post_process(rknn_app_context_t *app_ctx, rknn_output *outputs, 18 | letterbox_t *letter_box, float conf_threshold, 19 | float nms_threshold, object_detect_result_list *od_results); 20 | int post_process_v10_detection(rknn_app_context_t *app_ctx, 21 | rknn_output *outputs, 22 | letterbox_t *letter_box, 23 | float conf_threshold, 24 | object_detect_result_list *od_results); 25 | int post_process_obb(rknn_app_context_t *app_ctx, rknn_output *outputs, 26 | letterbox_t *letter_box, float conf_threshold, 27 | float nms_threshold, 28 | object_detect_result_list *od_results); 29 | int post_process_seg(rknn_app_context_t *app_ctx, rknn_output *outputs, 30 | letterbox_t *letter_box, float conf_threshold, 31 | float nms_threshold, 32 | object_detect_result_list *od_results); 33 | int post_process_pose(rknn_app_context_t *app_ctx, rknn_output *outputs, 34 | letterbox_t *letter_box, float conf_threshold, 35 | float nms_threshold, 36 | object_detect_result_list *od_results); 37 | int clamp(float val, int min, int max); 38 | -------------------------------------------------------------------------------- /src/yolov8/include/rknn_pool.h: -------------------------------------------------------------------------------- 1 | // 2 | // Created by kaylor on 3/6/24.
3 | // 4 | 5 | #pragma once 6 | #include "image_process.h" 7 | #include "opencv2/opencv.hpp" 8 | #include "queue" 9 | #include "threadpool.h" 10 | #include "yolov8.h" 11 | class RknnPool { 12 | public: 13 | RknnPool(const std::string model_path, const int thread_num, 14 | const std::string label_path); 15 | ~RknnPool(); 16 | void Init(); 17 | void DeInit(); 18 | void AddInferenceTask(std::shared_ptr src, 19 | ImageProcess &image_process); 20 | int get_model_id(); 21 | std::shared_ptr GetImageResultFromQueue(); 22 | int GetTasksSize(); 23 | 24 | private: 25 | int thread_num_{1}; 26 | std::string model_path_{"null"}; 27 | std::string label_path_{"null"}; 28 | uint32_t id{0}; 29 | std::unique_ptr pool_; 30 | std::queue> image_results_; 31 | std::vector> models_; 32 | std::mutex id_mutex_; 33 | std::mutex image_results_mutex_; 34 | }; 35 | -------------------------------------------------------------------------------- /src/yolov8/include/threadpool.h: -------------------------------------------------------------------------------- 1 | // 2 | // Created by kaylor on 3/5/24. 3 | // 4 | #ifndef THREAD_POOL_H 5 | #define THREAD_POOL_H 6 | 7 | #include 8 | #include 9 | #include 10 | #include 11 | #include 12 | #include 13 | #include 14 | #include 15 | #include 16 | 17 | class ThreadPool { 18 | public: 19 | ThreadPool(size_t); 20 | template 21 | auto enqueue(F&& f, Args&&... 
args) 22 | -> std::future::type>; 23 | ~ThreadPool(); 24 | bool IsTasksEmpty(); 25 | int TasksSize(); 26 | 27 | private: 28 | // need to keep track of threads so we can join them 29 | std::vector workers; 30 | // the task queue 31 | std::queue > tasks; 32 | 33 | // synchronization 34 | std::mutex queue_mutex; 35 | std::condition_variable condition; 36 | bool stop; 37 | }; 38 | 39 | // the constructor just launches some amount of workers 40 | inline ThreadPool::ThreadPool(size_t threads) : stop(false) { 41 | for (size_t i = 0; i < threads; ++i) 42 | workers.emplace_back([this](int i) { 43 | auto thread_name = "thread" + std::to_string(i); 44 | pthread_setname_np(pthread_self(), thread_name.c_str()); 45 | for (;;) { 46 | std::function task; 47 | 48 | { 49 | std::unique_lock lock(this->queue_mutex); 50 | this->condition.wait( 51 | lock, [this] { return this->stop || !this->tasks.empty(); }); 52 | if (this->stop && this->tasks.empty()) return; 53 | task = std::move(this->tasks.front()); 54 | this->tasks.pop(); 55 | } 56 | 57 | task(); 58 | } 59 | }, i); 60 | } 61 | 62 | // add new work item to the pool 63 | template 64 | auto ThreadPool::enqueue(F&& f, Args&&... 
args) 65 | -> std::future<typename std::result_of<F(Args...)>::type> { 66 | using return_type = typename std::result_of<F(Args...)>::type; 67 | 68 | auto task = std::make_shared<std::packaged_task<return_type()> >( 69 | std::bind(std::forward<F>(f), std::forward<Args>(args)...)); 70 | 71 | std::future<return_type> res = task->get_future(); 72 | { 73 | std::unique_lock<std::mutex> lock(queue_mutex); 74 | 75 | // don't allow enqueueing after stopping the pool 76 | if (stop) throw std::runtime_error("enqueue on stopped ThreadPool"); 77 | 78 | tasks.emplace([task]() { (*task)(); }); 79 | } 80 | condition.notify_one(); 81 | return res; 82 | } 83 | 84 | // the destructor joins all threads 85 | inline ThreadPool::~ThreadPool() { 86 | { 87 | std::unique_lock<std::mutex> lock(queue_mutex); 88 | stop = true; 89 | } 90 | condition.notify_all(); 91 | for (std::thread& worker : workers) worker.join(); 92 | } 93 | 94 | inline bool ThreadPool::IsTasksEmpty() { 95 | std::lock_guard<std::mutex> lock_guard(queue_mutex); 96 | return tasks.empty(); 97 | } 98 | 99 | inline int ThreadPool::TasksSize() { 100 | std::lock_guard<std::mutex> lock_guard(queue_mutex); 101 | return tasks.size(); 102 | } 103 | 104 | #endif -------------------------------------------------------------------------------- /src/yolov8/include/videofile.h: -------------------------------------------------------------------------------- 1 | // 2 | // Created by kaylor on 3/4/24.
3 | // 4 | 5 | #pragma once 6 | #include "opencv2/opencv.hpp" 7 | #include "string" 8 | class VideoFile { 9 | public: 10 | VideoFile(const std::string&& filename); 11 | ~VideoFile(); 12 | void Display(const float framerate = 25.0, const int target_size = 640); 13 | std::unique_ptr<cv::Mat> GetNextFrame(); 14 | cv::Mat test(); 15 | int get_frame_width(); 16 | int get_frame_height(); 17 | 18 | private: 19 | std::string filename_; 20 | cv::VideoCapture* capture_{nullptr}; 21 | }; 22 | -------------------------------------------------------------------------------- /src/yolov8/include/yolov8.h: -------------------------------------------------------------------------------- 1 | // 2 | // Created by kaylor on 3/4/24. 3 | // 4 | 5 | #pragma once 6 | #include "common.h" 7 | #include "memory" 8 | #include "mutex" 9 | #include "rknn_api.h" 10 | #include "string" 11 | 12 | class Yolov8 { 13 | public: 14 | Yolov8(std::string &&model_path); 15 | ~Yolov8(); 16 | int Inference(void *image_buf, object_detect_result_list *od_results, 17 | letterbox_t letter_box); 18 | rknn_context *get_rknn_context(); 19 | int Init(rknn_context *ctx_in, bool copy_weight); 20 | int DeInit(); 21 | int get_model_width(); 22 | int get_model_height(); 23 | 24 | private: 25 | rknn_app_context_t app_ctx_; 26 | rknn_context ctx_{0}; 27 | std::string model_path_; 28 | std::unique_ptr<rknn_input[]> inputs_; 29 | std::unique_ptr<rknn_output[]> outputs_; 30 | std::mutex outputs_lock_; 31 | ModelType model_type_; 32 | }; 33 | -------------------------------------------------------------------------------- /src/yolov8/model/.gitignore: -------------------------------------------------------------------------------- 1 | *.jpg 2 | *.txt 3 | *.py 4 | -------------------------------------------------------------------------------- /src/yolov8/model/coco_80_labels_list.txt: -------------------------------------------------------------------------------- 1 | person 2 | bicycle 3 | car 4 | motorcycle 5 | airplane 6 | bus 7 | train 8 | truck 9 | boat 10 |
traffic light 11 | fire hydrant 12 | stop sign 13 | parking meter 14 | bench 15 | bird 16 | cat 17 | dog 18 | horse 19 | sheep 20 | cow 21 | elephant 22 | bear 23 | zebra 24 | giraffe 25 | backpack 26 | umbrella 27 | handbag 28 | tie 29 | suitcase 30 | frisbee 31 | skis 32 | snowboard 33 | sports ball 34 | kite 35 | baseball bat 36 | baseball glove 37 | skateboard 38 | surfboard 39 | tennis racket 40 | bottle 41 | wine glass 42 | cup 43 | fork 44 | knife 45 | spoon 46 | bowl 47 | banana 48 | apple 49 | sandwich 50 | orange 51 | broccoli 52 | carrot 53 | hot dog 54 | pizza 55 | donut 56 | cake 57 | chair 58 | couch 59 | potted plant 60 | bed 61 | dining table 62 | toilet 63 | tv 64 | laptop 65 | mouse 66 | remote 67 | keyboard 68 | cell phone 69 | microwave 70 | oven 71 | toaster 72 | sink 73 | refrigerator 74 | book 75 | clock 76 | vase 77 | scissors 78 | teddy bear 79 | hair drier 80 | toothbrush 81 | -------------------------------------------------------------------------------- /src/yolov8/model/dotav1.txt: -------------------------------------------------------------------------------- 1 | plane 2 | ship 3 | storage tank 4 | baseball diamond 5 | tennis court 6 | basketball court 7 | ground track field 8 | harbor 9 | bridge 10 | large vehicle 11 | small vehicle 12 | helicopter 13 | roundabout 14 | soccer ball field 15 | swimming pool 16 | container crane 17 | -------------------------------------------------------------------------------- /src/yolov8/model/plane.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/kaylorchen/rk3588-yolo-demo/1ebfe9ff4db93ae3b2c30dec6cd11a628a531210/src/yolov8/model/plane.png -------------------------------------------------------------------------------- /src/yolov8/model/ships.mp4: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/kaylorchen/rk3588-yolo-demo/1ebfe9ff4db93ae3b2c30dec6cd11a628a531210/src/yolov8/model/ships.mp4 -------------------------------------------------------------------------------- /src/yolov8/model/yolov10n-2.0.0.rknn: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/kaylorchen/rk3588-yolo-demo/1ebfe9ff4db93ae3b2c30dec6cd11a628a531210/src/yolov8/model/yolov10n-2.0.0.rknn -------------------------------------------------------------------------------- /src/yolov8/model/yolov10n.onnx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/kaylorchen/rk3588-yolo-demo/1ebfe9ff4db93ae3b2c30dec6cd11a628a531210/src/yolov8/model/yolov10n.onnx -------------------------------------------------------------------------------- /src/yolov8/model/yolov8m.rknn: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/kaylorchen/rk3588-yolo-demo/1ebfe9ff4db93ae3b2c30dec6cd11a628a531210/src/yolov8/model/yolov8m.rknn -------------------------------------------------------------------------------- /src/yolov8/model/yolov8n-obb.onnx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/kaylorchen/rk3588-yolo-demo/1ebfe9ff4db93ae3b2c30dec6cd11a628a531210/src/yolov8/model/yolov8n-obb.onnx -------------------------------------------------------------------------------- /src/yolov8/model/yolov8n-obb.rknn: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/kaylorchen/rk3588-yolo-demo/1ebfe9ff4db93ae3b2c30dec6cd11a628a531210/src/yolov8/model/yolov8n-obb.rknn -------------------------------------------------------------------------------- /src/yolov8/model/yolov8n-pose.onnx: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/kaylorchen/rk3588-yolo-demo/1ebfe9ff4db93ae3b2c30dec6cd11a628a531210/src/yolov8/model/yolov8n-pose.onnx -------------------------------------------------------------------------------- /src/yolov8/model/yolov8n-pose.rknn: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/kaylorchen/rk3588-yolo-demo/1ebfe9ff4db93ae3b2c30dec6cd11a628a531210/src/yolov8/model/yolov8n-pose.rknn -------------------------------------------------------------------------------- /src/yolov8/model/yolov8n.rknn: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/kaylorchen/rk3588-yolo-demo/1ebfe9ff4db93ae3b2c30dec6cd11a628a531210/src/yolov8/model/yolov8n.rknn -------------------------------------------------------------------------------- /src/yolov8/model/yolov8s.rknn: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/kaylorchen/rk3588-yolo-demo/1ebfe9ff4db93ae3b2c30dec6cd11a628a531210/src/yolov8/model/yolov8s.rknn -------------------------------------------------------------------------------- /src/yolov8/utils/CMakeLists.txt: -------------------------------------------------------------------------------- 1 | find_package(OpenCV REQUIRED) 2 | include_directories(${OpenCV_INCLUDE_DIRS}) 3 | find_package(kaylordut REQUIRED) 4 | file(GLOB SRC "*.cpp") 5 | add_library(yolov8-kaylordut ${SRC}) 6 | target_link_libraries(yolov8-kaylordut ${kaylordut_LIBS} ${OpenCV_LIBS} rknnrt) -------------------------------------------------------------------------------- /src/yolov8/utils/camera.cpp: -------------------------------------------------------------------------------- 1 | // 2 | // Created by kaylor on 3/9/24. 
3 | // 4 | 5 | #include "camera.h" 6 | 7 | #include "kaylordut/log/logger.h" 8 | #include "thread" 9 | 10 | Camera::Camera(uint16_t index, cv::Size size, double framerate) 11 | : capture_(index, cv::CAP_V4L2), size_(size) { 12 | KAYLORDUT_LOG_INFO("Instantiate a Camera object"); 13 | // Use the V4L2 backend here, because the default backend does not allow setting the capture mode and framerate 14 | if (!capture_.isOpened()) { 15 | KAYLORDUT_LOG_ERROR("Error opening video stream or file"); 16 | exit(EXIT_FAILURE); 17 | } 18 | capture_.set(cv::CAP_PROP_FOURCC, 19 | cv::VideoWriter::fourcc('M', 'J', 'P', 'G')); 20 | // Check whether the format was set successfully 21 | int fourcc = capture_.get(cv::CAP_PROP_FOURCC); 22 | if (fourcc != cv::VideoWriter::fourcc('M', 'J', 'P', 'G')) { 23 | KAYLORDUT_LOG_WARN("Set video format failed"); 24 | } 25 | capture_.set(cv::CAP_PROP_FRAME_WIDTH, size_.width); 26 | capture_.set(cv::CAP_PROP_FRAME_HEIGHT, size_.height); 27 | if (!capture_.set(cv::CAP_PROP_FPS, framerate)) { 28 | KAYLORDUT_LOG_WARN("set framerate failed!!"); 29 | } 30 | std::this_thread::sleep_for(std::chrono::seconds(1)); 31 | KAYLORDUT_LOG_INFO("camera width: {}, height: {}, fps: {}", 32 | capture_.get(cv::CAP_PROP_FRAME_WIDTH), 33 | capture_.get(cv::CAP_PROP_FRAME_HEIGHT), 34 | capture_.get(cv::CAP_PROP_FPS)); 35 | } 36 | 37 | Camera::~Camera() { 38 | if (capture_.isOpened()) { 39 | KAYLORDUT_LOG_INFO("Release camera"); 40 | capture_.release(); 41 | } 42 | } 43 | 44 | std::unique_ptr<cv::Mat> Camera::GetNextFrame() { 45 | auto frame = std::make_unique<cv::Mat>(); 46 | capture_ >> *frame; 47 | if (frame->empty()) { 48 | KAYLORDUT_LOG_ERROR("Get frame error"); 49 | return nullptr; 50 | } 51 | return std::move(frame); 52 | } -------------------------------------------------------------------------------- /src/yolov8/utils/image_process.cpp: -------------------------------------------------------------------------------- 1 | // 2 | // Created by kaylor on 3/4/24.
3 | // 4 | 5 | #include "image_process.h" 6 | 7 | #include "BYTETracker.h" 8 | #include "kaylordut/log/logger.h" 9 | #define N_CLASS_COLORS (20) 10 | unsigned char class_colors[][3] = { 11 | {255, 56, 56}, // 'FF3838' 12 | {255, 157, 151}, // 'FF9D97' 13 | {255, 112, 31}, // 'FF701F' 14 | {255, 178, 29}, // 'FFB21D' 15 | {207, 210, 49}, // 'CFD231' 16 | {72, 249, 10}, // '48F90A' 17 | {146, 204, 23}, // '92CC17' 18 | {61, 219, 134}, // '3DDB86' 19 | {26, 147, 52}, // '1A9334' 20 | {0, 212, 187}, // '00D4BB' 21 | {44, 153, 168}, // '2C99A8' 22 | {0, 194, 255}, // '00C2FF' 23 | {52, 69, 147}, // '344593' 24 | {100, 115, 255}, // '6473FF' 25 | {0, 24, 236}, // '0018EC' 26 | {132, 56, 255}, // '8438FF' 27 | {82, 0, 133}, // '520085' 28 | {203, 56, 255}, // 'CB38FF' 29 | {255, 149, 200}, // 'FF95C8' 30 | {255, 55, 199} // 'FF37C7' 31 | }; 32 | 33 | ImageProcess::ImageProcess(int width, int height, int target_size, 34 | bool is_track, int framerate) { 35 | scale_ = static_cast<float>(target_size) / std::max(height, width); 36 | padding_x_ = target_size - static_cast<int>(width * scale_); 37 | padding_y_ = target_size - static_cast<int>(height * scale_); 38 | new_size_ = cv::Size(static_cast<int>(width * scale_), 39 | static_cast<int>(height * scale_)); 40 | target_size_ = target_size; 41 | letterbox_.scale = scale_; 42 | letterbox_.x_pad = padding_x_ / 2; 43 | letterbox_.y_pad = padding_y_ / 2; 44 | is_track_ = is_track; 45 | if (is_track) { 46 | tracker_ = std::make_unique<BYTETracker>(framerate, 30); 47 | } 48 | } 49 | 50 | std::unique_ptr<cv::Mat> ImageProcess::Convert(const cv::Mat &src) { 51 | if (src.empty()) { 52 | return nullptr; 53 | } 54 | cv::Mat resize_img; 55 | cv::resize(src, resize_img, new_size_); 56 | auto square_img = std::make_unique<cv::Mat>( 57 | target_size_, target_size_, src.type(), cv::Scalar(114, 114, 114)); 58 | cv::Point position(padding_x_ / 2, padding_y_ / 2); 59 | resize_img.copyTo((*square_img)( 60 | cv::Rect(position.x, position.y, resize_img.cols, resize_img.rows))); 61 | return
std::move(square_img); 62 | } 63 | 64 | const letterbox_t &ImageProcess::get_letter_box() { return letterbox_; } 65 | 66 | void ImageProcess::ImagePostProcess(cv::Mat &image, 67 | object_detect_result_list &od_results) { 68 | KAYLORDUT_LOG_INFO("ImagePostProcess is called"); 69 | if (od_results.count >= 1) { 70 | int width = image.cols; 71 | int height = image.rows; 72 | auto *ori_img = image.ptr(); 73 | int cls_id = od_results.results[0].cls_id; 74 | uint8_t *seg_mask = od_results.results_seg[0].seg_mask; 75 | float alpha = 0.5f; // opacity 76 | if (seg_mask != nullptr) { 77 | for (int j = 0; j < height; j++) { 78 | for (int k = 0; k < width; k++) { 79 | int pixel_offset = 3 * (j * width + k); 80 | if (seg_mask[j * width + k] != 0) { 81 | ori_img[pixel_offset + 0] = (unsigned char)clamp( 82 | class_colors[seg_mask[j * width + k] % N_CLASS_COLORS][0] * 83 | (1 - alpha) + 84 | ori_img[pixel_offset + 0] * alpha, 85 | 0, 255); // r 86 | ori_img[pixel_offset + 1] = (unsigned char)clamp( 87 | class_colors[seg_mask[j * width + k] % N_CLASS_COLORS][1] * 88 | (1 - alpha) + 89 | ori_img[pixel_offset + 1] * alpha, 90 | 0, 255); // g 91 | ori_img[pixel_offset + 2] = (unsigned char)clamp( 92 | class_colors[seg_mask[j * width + k] % N_CLASS_COLORS][2] * 93 | (1 - alpha) + 94 | ori_img[pixel_offset + 2] * alpha, 95 | 0, 255); // b 96 | } 97 | } 98 | } 99 | free(seg_mask); 100 | } 101 | } 102 | KAYLORDUT_LOG_INFO("model type is {}", od_results.model_type); 103 | if (od_results.model_type == ModelType::DETECTION || od_results.model_type == ModelType::V10_DETECTION) { 104 | if (is_track_) { 105 | ProcessTrackImage(image, od_results); 106 | } else { 107 | ProcessDetectionImage(image, od_results); 108 | } 109 | } else if (od_results.model_type == ModelType::OBB) { 110 | ProcessOBBImage(image, od_results); 111 | } else if (od_results.model_type == ModelType::POSE) { 112 | ProcessPoseImage(image, od_results); 113 | } 114 | } 115 | 116 | void DrawRotatedRect(cv::Mat &image, float x,
float y, float w, float h, 117 | float theta, const cv::Scalar &color, int thickness) { 118 | // Define the rotated rectangle's center, size, and rotation angle 119 | cv::Point2f center(x, y); 120 | cv::Size2f size(w, h); 121 | 122 | // Create the rotated rectangle object 123 | cv::RotatedRect rotatedRect(center, size, theta); 124 | 125 | // Get the four vertices of the rectangle 126 | cv::Point2f vertices[4]; 127 | rotatedRect.points(vertices); 128 | 129 | // Draw the four edges of the rectangle 130 | for (int i = 0; i < 4; i++) { 131 | cv::line(image, vertices[i], vertices[(i + 1) % 4], color, thickness); 132 | } 133 | } 134 | 135 | void ImageProcess::ProcessOBBImage( 136 | cv::Mat &image, const object_detect_result_list &od_results) const { 137 | KAYLORDUT_LOG_INFO( 138 | "ImageProcess::ProcessOBBImage is called, result count is {}", 139 | od_results.count); 140 | for (int i = 0; i < od_results.count; ++i) { 141 | auto obb_result = od_results.results_obb[i]; 142 | KAYLORDUT_LOG_INFO("{} @ xywhθ = ({} {} {} {} {}) {}", 143 | coco_cls_to_name(obb_result.cls_id), obb_result.box.x, 144 | obb_result.box.y, obb_result.box.w, obb_result.box.h, 145 | obb_result.box.theta * 180.0 / CV_PI, obb_result.prop); 146 | DrawRotatedRect(image, obb_result.box.x, obb_result.box.y, obb_result.box.w, 147 | obb_result.box.h, obb_result.box.theta * 180.0 / CV_PI, 148 | cv::Scalar(0, 255, 0), 2); 149 | } 150 | } 151 | 152 | void ImageProcess::ProcessTrackImage(cv::Mat &image, 153 | object_detect_result_list &od_results) { 154 | std::vector<Object> objects; 155 | for (int i = 0; i < od_results.count; ++i) { 156 | object_detect_result *detect_result = &(od_results.results[i]); 157 | KAYLORDUT_LOG_INFO("{} @ ({} {} {} {}) {}", 158 | coco_cls_to_name(detect_result->cls_id), 159 | detect_result->box.left, detect_result->box.top, 160 | detect_result->box.right, detect_result->box.bottom, 161 | detect_result->prop); 162 | Object object; 163 | object.rect = cv::Rect(detect_result->box.left, detect_result->box.top, 164 | detect_result->box.right - detect_result->box.left, 165 | detect_result->box.bottom - detect_result->box.top);
166 | object.label = detect_result->cls_id; 167 | object.prob = detect_result->prop; 168 | objects.push_back(object); 169 | } 170 | tracker_mutex_.lock(); 171 | std::vector<STrack> output_stracks = tracker_->update(objects); 172 | for (size_t i = 0; i < output_stracks.size(); ++i) { 173 | std::vector<float> tlwh = output_stracks[i].tlwh; 174 | bool vertical = tlwh[2] / tlwh[3] > 1.6; 175 | if (tlwh[2] * tlwh[3] > 20 && !vertical) { 176 | Scalar s = tracker_->get_color(output_stracks[i].track_id); 177 | putText(image, format("%d", output_stracks[i].track_id), 178 | Point(tlwh[0], tlwh[1] - 5), 0, 0.6, Scalar(0, 0, 255), 2, 179 | LINE_AA); 180 | rectangle(image, Rect(tlwh[0], tlwh[1], tlwh[2], tlwh[3]), s, 2); 181 | } 182 | } 183 | tracker_mutex_.unlock(); 184 | } 185 | 186 | void ImageProcess::ProcessDetectionImage( 187 | cv::Mat &image, object_detect_result_list &od_results) const { 188 | for (int i = 0; i < od_results.count; ++i) { 189 | object_detect_result *detect_result = &(od_results.results[i]); 190 | // if (strcmp(coco_cls_to_name(detect_result->cls_id), "person") == 0){ 191 | // continue;} 192 | KAYLORDUT_LOG_INFO("{} @ ({} {} {} {}) {}", 193 | coco_cls_to_name(detect_result->cls_id), 194 | detect_result->box.left, detect_result->box.top, 195 | detect_result->box.right, detect_result->box.bottom, 196 | detect_result->prop); 197 | cv::rectangle( 198 | image, cv::Point(detect_result->box.left, detect_result->box.top), 199 | cv::Point(detect_result->box.right, detect_result->box.bottom), 200 | cv::Scalar(0, 0, 255), 2); 201 | char text[256]; 202 | snprintf(text, sizeof(text), "%s %.1f%%", coco_cls_to_name(detect_result->cls_id), 203 | detect_result->prop * 100); 204 | cv::putText(image, text, 205 | cv::Point(detect_result->box.left, detect_result->box.top + 20), 206 | cv::FONT_HERSHEY_COMPLEX, 1, cv::Scalar(255, 0, 0), 2, 207 | cv::LINE_8); 208 | } 209 | } 210 | 211 | void drawSkeleton(cv::Mat &img, const std::vector<cv::Point> &points, 212 | const std::vector<int> &pairs, const cv::Scalar &color, 213 |
int thickness) { 214 | for (size_t i = 0; i < pairs.size(); i += 2) { 215 | int index1 = pairs[i]; 216 | int index2 = pairs[i + 1]; 217 | if (points[index1].x != -1 && points[index1].y != -1 && 218 | points[index2].x != -1 && points[index2].y != -1) { 219 | cv::line(img, points[index1], points[index2], color, thickness); 220 | } 221 | } 222 | } 223 | 224 | void ImageProcess::ProcessPoseImage( 225 | cv::Mat &image, object_detect_result_list &od_results) const { 226 | for (int i = 0; i < od_results.count; ++i) { 227 | object_detect_result *detect_result = &(od_results.results[i]); 228 | 229 | KAYLORDUT_LOG_INFO("({} {} {} {}) {}", detect_result->box.left, 230 | detect_result->box.top, detect_result->box.right, 231 | detect_result->box.bottom, detect_result->prop); 232 | cv::rectangle( 233 | image, cv::Point(detect_result->box.left, detect_result->box.top), 234 | cv::Point(detect_result->box.right, detect_result->box.bottom), 235 | cv::Scalar(0, 0, 255), 2); 236 | std::vector<cv::Point> points(17); 237 | for (int j = 0; j < 17; ++j) { 238 | if (od_results.results_pose[i].visibility[j] <= 0.6) { 239 | points.at(j) = cv::Point(-1, -1); 240 | continue; 241 | } 242 | points.at(j) = (cv::Point(od_results.results_pose[i].kpt[j * 2 + 0], 243 | od_results.results_pose[i].kpt[j * 2 + 1])); 244 | cv::Point p(od_results.results_pose[i].kpt[j * 2 + 0], 245 | od_results.results_pose[i].kpt[j * 2 + 1]); 246 | cv::circle(image, p, 10, cv::Scalar(0, 0, 255), cv::FILLED, cv::LINE_AA); 247 | } 248 | std::vector<int> pairs = { 249 | 0, 1, // Nose to left eye 250 | 1, 3, // Left eye to left ear 251 | 0, 2, // Nose to right eye 252 | 2, 4, // Right eye to right ear 253 | 0, 5, // Nose to left shoulder 254 | 5, 7, // Left shoulder to left elbow 255 | 7, 9, // Left elbow to left wrist 256 | 0, 6, // Nose to right shoulder 257 | 6, 8, // Right shoulder to right elbow 258 | 8, 10, // Right elbow to right wrist 259 | 5, 6, // Left shoulder to right shoulder 260 | 11, 12, // Left hip to right hip 261 | 11, 5,
// Left hip to left shoulder 262 | 12, 6, // Right hip to right shoulder 263 | 11, 13, // Left hip to left knee 264 | 12, 14, // Right hip to right knee 265 | 13, 15, // Left knee to left ankle 266 | 14, 16 // Right knee to right ankle 267 | }; 268 | drawSkeleton(image, points, pairs, cv::Scalar(255, 0, 0), 2); 269 | } 270 | } 271 | -------------------------------------------------------------------------------- /src/yolov8/utils/postprocess.cpp: -------------------------------------------------------------------------------- 1 | // 2 | // Created by kaylor on 3/5/24. 3 | // 4 | 5 | #include "postprocess.h" 6 | 7 | #include <math.h> 8 | 9 | #include "Float16.h" 10 | #include "filesystem" 11 | #include "kaylordut/log/logger.h" 12 | #include "opencv2/imgproc.hpp" 13 | #include "opencv2/opencv.hpp" 14 | #include "rknn_matmul_api.h" 15 | static char *labels[OBJ_CLASS_NUM]; 16 | static int num_labels = 0; 17 | int clamp(float val, int min, int max) { 18 | return val > min ? (val < max ? val : max) : min; 19 | } 20 | static char *readLine(FILE *fp, char *buffer, int *len) { 21 | int ch; 22 | int i = 0; 23 | size_t buff_len = 0; 24 | 25 | buffer = (char *)malloc(buff_len + 1); 26 | if (!buffer) return NULL; // Out of memory 27 | 28 | while ((ch = fgetc(fp)) != '\n' && ch != EOF) { 29 | buff_len++; 30 | void *tmp = realloc(buffer, buff_len + 1); 31 | if (tmp == NULL) { 32 | free(buffer); 33 | return NULL; // Out of memory 34 | } 35 | buffer = (char *)tmp; 36 | 37 | buffer[i] = (char)ch; 38 | i++; 39 | } 40 | buffer[i] = '\0'; 41 | 42 | *len = buff_len; 43 | 44 | // Detect end 45 | if (ch == EOF && (i == 0 || ferror(fp))) { 46 | free(buffer); 47 | return NULL; 48 | } 49 | return buffer; 50 | } 51 | 52 | static int readLines(const char *fileName, char *lines[], int max_line) { 53 | FILE *file = fopen(fileName, "r"); 54 | char *s; 55 | int i = 0; 56 | int n = 0; 57 | 58 | if (file == NULL) { 59 | KAYLORDUT_LOG_ERROR("Open {} fail!", fileName); 60 | return -1; 61 | } 62 | 63 | while
((s = readLine(file, s, &n)) != NULL) { 64 | lines[i++] = s; 65 | if (i >= max_line) break; 66 | } 67 | fclose(file); 68 | KAYLORDUT_LOG_INFO("There are {} lines", i); 69 | return i; 70 | } 71 | 72 | static int loadLabelName(const char *locationFilename, char *label[]) { 73 | KAYLORDUT_LOG_INFO("load label {}", locationFilename); 74 | num_labels = readLines(locationFilename, label, OBJ_CLASS_NUM); 75 | return 0; 76 | } 77 | 78 | int init_post_process(std::string &label_path) { 79 | int ret = 0; 80 | ret = loadLabelName(label_path.c_str(), labels); 81 | if (ret < 0) { 82 | KAYLORDUT_LOG_ERROR("Load {} failed!", label_path); 83 | return -1; 84 | } 85 | return 0; 86 | } 87 | 88 | void deinit_post_process() { 89 | for (int i = 0; i < num_labels; i++) { 90 | if (labels[i] != nullptr) { 91 | free(labels[i]); 92 | labels[i] = nullptr; 93 | } 94 | } 95 | } 96 | 97 | const char *coco_cls_to_name(int cls_id) { 98 | if (cls_id >= num_labels) { 99 | return "null"; 100 | } 101 | if (labels[cls_id]) { 102 | return labels[cls_id]; 103 | } 104 | return "null"; 105 | } 106 | static float CalculateOverlap(float xmin0, float ymin0, float xmax0, 107 | float ymax0, float xmin1, float ymin1, 108 | float xmax1, float ymax1) { 109 | float w = fmax(0.f, fmin(xmax0, xmax1) - fmax(xmin0, xmin1) + 1.0); 110 | float h = fmax(0.f, fmin(ymax0, ymax1) - fmax(ymin0, ymin1) + 1.0); 111 | float i = w * h; 112 | float u = (xmax0 - xmin0 + 1.0) * (ymax0 - ymin0 + 1.0) + 113 | (xmax1 - xmin1 + 1.0) * (ymax1 - ymin1 + 1.0) - i; 114 | return u <= 0.f ?
0.f : (i / u); 115 | } 116 | 117 | static int nms(int validCount, std::vector<float> &outputLocations, 118 | std::vector<int> &order, float threshold) { 119 | for (int i = 0; i < validCount; ++i) { 120 | if (order[i] == -1) { 121 | continue; 122 | } 123 | int n = order[i]; 124 | for (int j = i + 1; j < validCount; ++j) { 125 | int m = order[j]; 126 | if (m == -1) { 127 | continue; 128 | } 129 | float xmin0 = outputLocations[n * 4 + 0]; 130 | float ymin0 = outputLocations[n * 4 + 1]; 131 | float xmax0 = outputLocations[n * 4 + 0] + outputLocations[n * 4 + 2]; 132 | float ymax0 = outputLocations[n * 4 + 1] + outputLocations[n * 4 + 3]; 133 | 134 | float xmin1 = outputLocations[m * 4 + 0]; 135 | float ymin1 = outputLocations[m * 4 + 1]; 136 | float xmax1 = outputLocations[m * 4 + 0] + outputLocations[m * 4 + 2]; 137 | float ymax1 = outputLocations[m * 4 + 1] + outputLocations[m * 4 + 3]; 138 | 139 | float iou = CalculateOverlap(xmin0, ymin0, xmax0, ymax0, xmin1, ymin1, 140 | xmax1, ymax1); 141 | 142 | if (iou > threshold) { 143 | order[j] = -1; 144 | } 145 | } 146 | } 147 | return 0; 148 | } 149 | 150 | static int nms(int validCount, std::vector<float> &outputLocations, 151 | std::vector<int> classIds, std::vector<int> &order, int filterId, 152 | float threshold) { 153 | for (int i = 0; i < validCount; ++i) { 154 | if (order[i] == -1 || classIds[order[i]] != filterId) { 155 | continue; 156 | } 157 | int n = order[i]; 158 | for (int j = i + 1; j < validCount; ++j) { 159 | int m = order[j]; 160 | if (m == -1 || classIds[order[j]] != filterId) { 161 | continue; 162 | } 163 | float xmin0 = outputLocations[n * 4 + 0]; 164 | float ymin0 = outputLocations[n * 4 + 1]; 165 | float xmax0 = outputLocations[n * 4 + 0] + outputLocations[n * 4 + 2]; 166 | float ymax0 = outputLocations[n * 4 + 1] + outputLocations[n * 4 + 3]; 167 | 168 | float xmin1 = outputLocations[m * 4 + 0]; 169 | float ymin1 = outputLocations[m * 4 + 1]; 170 | float xmax1 = outputLocations[m * 4 + 0] + outputLocations[m * 4 + 2]; 171 |
float ymax1 = outputLocations[m * 4 + 1] + outputLocations[m * 4 + 3]; 172 | 173 | float iou = CalculateOverlap(xmin0, ymin0, xmax0, ymax0, xmin1, ymin1, 174 | xmax1, ymax1); 175 | 176 | if (iou > threshold) { 177 | order[j] = -1; 178 | } 179 | } 180 | } 181 | return 0; 182 | } 183 | // Compute the IoU of two rotated rectangles 184 | double rotatedRectIoU(const cv::RotatedRect &rect1, 185 | const cv::RotatedRect &rect2) { 186 | // KAYLORDUT_LOG_DEBUG("rotatedRectIoU is called"); 187 | // Compute the intersection of the two rotated rectangles 188 | std::vector<cv::Point2f> intersectingRegion; 189 | cv::rotatedRectangleIntersection(rect1, rect2, intersectingRegion); 190 | if (intersectingRegion.empty()) { 191 | return 0; 192 | } 193 | // for (int i = 0; i < intersectingRegion.size(); ++i) { 194 | // KAYLORDUT_LOG_DEBUG("data[{}] = ({} {})", i, intersectingRegion.at(i).x, 195 | // intersectingRegion.at(i).y); 196 | // } 197 | 198 | // Compute the intersection area via cv::contourArea 199 | double intersectionArea = cv::contourArea(intersectingRegion); 200 | 201 | // Compute the areas of the two rectangles 202 | double area1 = rect1.size.width * rect1.size.height; 203 | double area2 = rect2.size.width * rect2.size.height; 204 | 205 | // Union area = sum of the two areas - intersection area 206 | double unionArea = area1 + area2 - intersectionArea; 207 | // Compute the IoU 208 | return intersectionArea / unionArea; 209 | } 210 | 211 | static int nms(int validCount, std::vector<float> &bboxes, 212 | std::vector<float> &angles, std::vector<int> classIds, 213 | std::vector<int> &order, int filterId, float threshold) { 214 | // KAYLORDUT_LOG_DEBUG("nms is called"); 215 | // KAYLORDUT_LOG_DEBUG("validCount = {}, filterId = {}, threshold = {}", 216 | // validCount, filterId, threshold); 217 | for (int i = 0; i < validCount; ++i) { 218 | if (order[i] == -1 || classIds[order[i]] != filterId) { 219 | continue; 220 | } 221 | int n = order[i]; 222 | for (int j = i + 1; j < validCount; ++j) { 223 | int m = order[j]; 224 | if (m == -1 || classIds[order[j]] != filterId) { 225 | continue; 226 | } 227 | cv::RotatedRect rect1(cv::Point2f(bboxes[n * 4], bboxes[n * 4 + 1]), 228 |
cv::Size2f(bboxes[n * 4 + 2], bboxes[n * 4 + 3]), 229 | angles[n]); 230 | cv::RotatedRect rect2(cv::Point2f(bboxes[m * 4], bboxes[m * 4 + 1]), 231 | cv::Size2f(bboxes[m * 4 + 2], bboxes[m * 4 + 3]), 232 | angles[m]); 233 | auto iou = rotatedRectIoU(rect1, rect2); 234 | if (iou > threshold) { 235 | order[j] = -1; 236 | } 237 | } 238 | } 239 | // KAYLORDUT_LOG_DEBUG("nms finished"); 240 | return 0; 241 | } 242 | 243 | static void crop_mask(uint8_t *seg_mask, uint8_t *all_mask_in_one, float *boxes, 244 | int boxes_num, int *cls_id, int height, int width) { 245 | for (int b = 0; b < boxes_num; b++) { 246 | float x1 = boxes[b * 4 + 0]; 247 | float y1 = boxes[b * 4 + 1]; 248 | float x2 = boxes[b * 4 + 2]; 249 | float y2 = boxes[b * 4 + 3]; 250 | 251 | for (int i = 0; i < height; i++) { 252 | for (int j = 0; j < width; j++) { 253 | // Check that this point lies inside the box 254 | if (j >= x1 && j < x2 && i >= y1 && i < y2) { 255 | if (all_mask_in_one[i * width + j] == 0) { 256 | // seg_mask values are only 0 or 1; since cls_id can be 0, add 1 257 | // so the result is not all zeros 258 | all_mask_in_one[i * width + j] = 259 | seg_mask[b * width * height + i * width + j] * (cls_id[b] + 1); 260 | } 261 | } 262 | } 263 | } 264 | } 265 | } 266 | 267 | static void matmul_by_npu_i8(std::vector<float> &A_input, float *B_input, 268 | uint8_t *C_input, int ROWS_A, int COLS_A, 269 | int COLS_B, rknn_app_context_t *app_ctx) { 270 | int B_layout = 0; 271 | int AC_layout = 0; 272 | int32_t M = 1; 273 | int32_t K = COLS_A; 274 | int32_t N = COLS_B; 275 | 276 | rknn_matmul_ctx ctx; 277 | rknn_matmul_info info; 278 | memset(&info, 0, sizeof(rknn_matmul_info)); 279 | info.M = M; 280 | info.K = K; 281 | info.N = N; 282 | info.type = RKNN_INT8_MM_INT8_TO_INT32; 283 | info.B_layout = B_layout; 284 | info.AC_layout = AC_layout; 285 | 286 | rknn_matmul_io_attr io_attr; 287 | memset(&io_attr, 0, sizeof(rknn_matmul_io_attr)); 288 | 289 | int8_t int8Vector_A[ROWS_A * COLS_A]; 290 | for (int i = 0; i < ROWS_A * COLS_A; ++i) { 291 | int8Vector_A[i] = (int8_t)A_input[i]; 292 | }
293 | 294 | int8_t int8Vector_B[COLS_A * COLS_B]; 295 | for (int i = 0; i < COLS_A * COLS_B; ++i) { 296 | int8Vector_B[i] = (int8_t)B_input[i]; 297 | } 298 | 299 | int ret = rknn_matmul_create(&ctx, &info, &io_attr); 300 | // Create A 301 | rknn_tensor_mem *A = rknn_create_mem(ctx, io_attr.A.size); 302 | // Create B 303 | rknn_tensor_mem *B = rknn_create_mem(ctx, io_attr.B.size); 304 | // Create C 305 | rknn_tensor_mem *C = rknn_create_mem(ctx, io_attr.C.size); 306 | 307 | memcpy(B->virt_addr, int8Vector_B, B->size); 308 | // Set A 309 | ret = rknn_matmul_set_io_mem(ctx, A, &io_attr.A); 310 | // Set B 311 | ret = rknn_matmul_set_io_mem(ctx, B, &io_attr.B); 312 | // Set C 313 | ret = rknn_matmul_set_io_mem(ctx, C, &io_attr.C); 314 | 315 | for (int i = 0; i < ROWS_A; ++i) { 316 | memcpy(A->virt_addr, int8Vector_A + i * A->size, A->size); 317 | 318 | // Run 319 | ret = rknn_matmul_run(ctx); 320 | 321 | for (int j = 0; j < COLS_B; ++j) { 322 | if (((int32_t *)C->virt_addr)[j] > 0) { 323 | C_input[i * COLS_B + j] = 1; 324 | } else { 325 | C_input[i * COLS_B + j] = 0; 326 | } 327 | } 328 | } 329 | 330 | // destroy 331 | rknn_destroy_mem(ctx, A); 332 | rknn_destroy_mem(ctx, B); 333 | rknn_destroy_mem(ctx, C); 334 | rknn_matmul_destroy(ctx); 335 | } 336 | 337 | static void matmul_by_npu_fp16(std::vector<float> &A_input, float *B_input, 338 | uint8_t *C_input, int ROWS_A, int COLS_A, 339 | int COLS_B, rknn_app_context_t *app_ctx) { 340 | int B_layout = 0; 341 | int AC_layout = 0; 342 | int32_t M = ROWS_A; 343 | int32_t K = COLS_A; 344 | int32_t N = COLS_B; 345 | 346 | rknn_matmul_ctx ctx; 347 | rknn_matmul_info info; 348 | memset(&info, 0, sizeof(rknn_matmul_info)); 349 | info.M = M; 350 | info.K = K; 351 | info.N = N; 352 | info.type = RKNN_FLOAT16_MM_FLOAT16_TO_FLOAT32; 353 | info.B_layout = B_layout; 354 | info.AC_layout = AC_layout; 355 | 356 | rknn_matmul_io_attr io_attr; 357 | memset(&io_attr, 0, sizeof(rknn_matmul_io_attr)); 358 | 359 | rknpu2::float16 int8Vector_A[ROWS_A *
COLS_A]; 360 | for (int i = 0; i < ROWS_A * COLS_A; ++i) { 361 | int8Vector_A[i] = (rknpu2::float16)A_input[i]; 362 | } 363 | 364 | rknpu2::float16 int8Vector_B[COLS_A * COLS_B]; 365 | for (int i = 0; i < COLS_A * COLS_B; ++i) { 366 | int8Vector_B[i] = (rknpu2::float16)B_input[i]; 367 | } 368 | 369 | int ret = rknn_matmul_create(&ctx, &info, &io_attr); 370 | // Create A 371 | rknn_tensor_mem *A = rknn_create_mem(ctx, io_attr.A.size); 372 | // Create B 373 | rknn_tensor_mem *B = rknn_create_mem(ctx, io_attr.B.size); 374 | // Create C 375 | rknn_tensor_mem *C = rknn_create_mem(ctx, io_attr.C.size); 376 | 377 | memcpy(A->virt_addr, int8Vector_A, A->size); 378 | memcpy(B->virt_addr, int8Vector_B, B->size); 379 | 380 | // Set A 381 | ret = rknn_matmul_set_io_mem(ctx, A, &io_attr.A); 382 | // Set B 383 | ret = rknn_matmul_set_io_mem(ctx, B, &io_attr.B); 384 | // Set C 385 | ret = rknn_matmul_set_io_mem(ctx, C, &io_attr.C); 386 | 387 | // Run 388 | ret = rknn_matmul_run(ctx); 389 | for (int i = 0; i < ROWS_A * COLS_B; ++i) { 390 | if (((float *)C->virt_addr)[i] > 0) { 391 | C_input[i] = 1; 392 | } else { 393 | C_input[i] = 0; 394 | } 395 | } 396 | 397 | // destroy 398 | rknn_destroy_mem(ctx, A); 399 | rknn_destroy_mem(ctx, B); 400 | rknn_destroy_mem(ctx, C); 401 | rknn_matmul_destroy(ctx); 402 | } 403 | static void resize_by_opencv(uint8_t *input_image, int input_width, 404 | int input_height, uint8_t *output_image, 405 | int target_width, int target_height) { 406 | cv::Mat src_image(input_height, input_width, CV_8U, input_image); 407 | cv::Mat dst_image; 408 | cv::resize(src_image, dst_image, cv::Size(target_width, target_height), 0, 0, 409 | cv::INTER_LINEAR); 410 | memcpy(output_image, dst_image.data, target_width * target_height); 411 | } 412 | 413 | static void seg_reverse(uint8_t *seg_mask, uint8_t *cropped_seg, 414 | uint8_t *seg_mask_real, int model_in_height, 415 | int model_in_width, int proto_height, int proto_width, 416 | int cropped_height, int cropped_width, 
417 | int ori_in_height, int ori_in_width, int y_pad, 418 | int x_pad) { 419 | int cropped_index = 0; 420 | for (int i = 0; i < proto_height; i++) { 421 | for (int j = 0; j < proto_width; j++) { 422 | if (i >= y_pad && i < proto_height - y_pad && j >= x_pad && 423 | j < proto_width - x_pad) { 424 | int seg_index = i * proto_width + j; 425 | cropped_seg[cropped_index] = seg_mask[seg_index]; 426 | cropped_index++; 427 | } 428 | } 429 | } 430 | 431 | // Note: Here are different methods provided for implementing single-channel 432 | // image scaling 433 | resize_by_opencv(cropped_seg, cropped_width, cropped_height, seg_mask_real, 434 | ori_in_width, ori_in_height); 435 | // resize_by_rga_rk356x(cropped_seg, cropped_width, cropped_height, 436 | // seg_mask_real, ori_in_width, ori_in_height); 437 | // resize_by_rga_rk3588(cropped_seg, cropped_width, cropped_height, 438 | // seg_mask_real, ori_in_width, ori_in_height); 439 | } 440 | 441 | static int quick_sort_indice_inverse(std::vector<float> &input, int left, 442 | int right, std::vector<int> &indices) { 443 | float key; 444 | int key_index; 445 | int low = left; 446 | int high = right; 447 | if (left < right) { 448 | key_index = indices[left]; 449 | key = input[left]; 450 | while (low < high) { 451 | while (low < high && input[high] <= key) { 452 | high--; 453 | } 454 | input[low] = input[high]; 455 | indices[low] = indices[high]; 456 | while (low < high && input[low] >= key) { 457 | low++; 458 | } 459 | input[high] = input[low]; 460 | indices[high] = indices[low]; 461 | } 462 | input[low] = key; 463 | indices[low] = key_index; 464 | quick_sort_indice_inverse(input, left, low - 1, indices); 465 | quick_sort_indice_inverse(input, low + 1, right, indices); 466 | } 467 | return low; 468 | } 469 | 470 | inline static int32_t __clip(float val, float min, float max) { 471 | float f = val <= min ? min : (val >= max ?
max : val); 472 | return f; 473 | } 474 | 475 | static int8_t qnt_f32_to_affine(float f32, int32_t zp, float scale) { 476 | float dst_val = (f32 / scale) + zp; 477 | int8_t res = (int8_t)__clip(dst_val, -128, 127); 478 | return res; 479 | } 480 | 481 | static float deqnt_affine_to_f32(int8_t qnt, int32_t zp, float scale) { 482 | return ((float)qnt - (float)zp) * scale; 483 | } 484 | 485 | static int box_reverse(int position, int boundary, int pad, float scale) { 486 | return (int)((clamp(position, 0, boundary) - pad) / scale); 487 | } 488 | 489 | void compute_dfl(float *tensor, int dfl_len, float *box) { 490 | for (int b = 0; b < 4; b++) { 491 | float exp_t[dfl_len]; 492 | float exp_sum = 0; 493 | float acc_sum = 0; 494 | for (int i = 0; i < dfl_len; i++) { 495 | exp_t[i] = exp(tensor[i + b * dfl_len]); 496 | exp_sum += exp_t[i]; 497 | } 498 | 499 | for (int i = 0; i < dfl_len; i++) { 500 | acc_sum += exp_t[i] / exp_sum * i; 501 | } 502 | box[b] = acc_sum; 503 | } 504 | } 505 | 506 | static int process_i8(rknn_output *all_input, int input_id, int grid_h, 507 | int grid_w, int height, int width, int stride, 508 | int dfl_len, std::vector<float> &boxes, 509 | std::vector<float> &segments, float *proto, 510 | std::vector<float> &objProbs, std::vector<int> &classId, 511 | float threshold, rknn_app_context_t *app_ctx) { 512 | int validCount = 0; 513 | // H*W of this tensor, i.e. the memory length occupied by each channel 514 | int grid_len = grid_h * grid_w; 515 | 516 | // Skip if input_id is not 0, 4, 8, or 12 517 | if (input_id % 4 != 0) { 518 | return validCount; 519 | } 520 | // The last output holds the mask prototype and is handled differently 521 | if (input_id == 12) { 522 | /** 523 | * Fetch the output of branch 12 and restore the quantized data to float. 524 | * Note that the scale is not applied here: the program needs INT-range data, not floats in 0~1. 525 | */ 526 | int8_t *input_proto = (int8_t *)all_input[input_id].buf; 527 | int32_t zp_proto = app_ctx->output_attrs[input_id].zp; 528 | for (int i = 0; i < PROTO_CHANNEL * PROTO_HEIGHT * PROTO_WEIGHT; i++) { 529 | proto[i] = input_proto[i] - zp_proto; 530 | } 531 | return validCount; 532 | } 533 | 534
| int8_t *box_tensor = (int8_t *)all_input[input_id].buf; 535 | int32_t box_zp = app_ctx->output_attrs[input_id].zp; 536 | float box_scale = app_ctx->output_attrs[input_id].scale; 537 | 538 | int8_t *score_tensor = (int8_t *)all_input[input_id + 1].buf; 539 | int32_t score_zp = app_ctx->output_attrs[input_id + 1].zp; 540 | float score_scale = app_ctx->output_attrs[input_id + 1].scale; 541 | 542 | int8_t *score_sum_tensor = nullptr; 543 | int32_t score_sum_zp = 0; 544 | float score_sum_scale = 1.0; 545 | score_sum_tensor = (int8_t *)all_input[input_id + 2].buf; 546 | score_sum_zp = app_ctx->output_attrs[input_id + 2].zp; 547 | score_sum_scale = app_ctx->output_attrs[input_id + 2].scale; 548 | 549 | int8_t *seg_tensor = (int8_t *)all_input[input_id + 3].buf; 550 | int32_t seg_zp = app_ctx->output_attrs[input_id + 3].zp; 551 | float seg_scale = app_ctx->output_attrs[input_id + 3].scale; 552 | // quantize the score threshold into the int8 domain 553 | int8_t score_thres_i8 = qnt_f32_to_affine(threshold, score_zp, score_scale); 554 | int8_t score_sum_thres_i8 = 555 | qnt_f32_to_affine(threshold, score_sum_zp, score_sum_scale); 556 | 557 | for (int i = 0; i < grid_h; i++) { 558 | for (int j = 0; j < grid_w; j++) { 559 | int offset = i * grid_w + j; 560 | int max_class_id = -1; 561 | 562 | int offset_seg = i * grid_w + j; 563 | int8_t *in_ptr_seg = seg_tensor + offset_seg; 564 | 565 | // for quick filtering through "score sum" 566 | if (score_sum_tensor != nullptr) { 567 | // if the score sum is below the threshold, drop this anchor right away 568 | if (score_sum_tensor[offset] < score_sum_thres_i8) { 569 | continue; 570 | } 571 | } 572 | 573 | int8_t max_score = -score_zp; 574 | for (int c = 0; c < num_labels; c++) { // keep the class with the highest probability 575 | if ((score_tensor[offset] > score_thres_i8) && 576 | (score_tensor[offset] > max_score)) { 577 | max_score = score_tensor[offset]; 578 | max_class_id = c; 579 | } 580 | offset += grid_len; // advance i * grid_w + j to the next class 581 | // at this pixel; the data is NCHW, so add one channel length 582 | } 583 | 584 | // compute box: a target exists only if the max probability exceeds the threshold 585 | if
(max_score > score_thres_i8) { 586 | for (int k = 0; k < PROTO_CHANNEL; k++) { 587 | // convert to int8; the data is NCHW, so a stride of grid_len reaches each channel 588 | int8_t seg_element_i8 = in_ptr_seg[(k)*grid_len] - seg_zp; 589 | segments.push_back(seg_element_i8); 590 | } 591 | 592 | offset = i * grid_w + j; 593 | float box[4]; 594 | float before_dfl[dfl_len * 4]; 595 | for (int k = 0; k < dfl_len * 4; k++) { // de-quantize back to float 596 | before_dfl[k] = 597 | deqnt_affine_to_f32(box_tensor[offset], box_zp, box_scale); 598 | offset += grid_len; 599 | } 600 | // decode the box coordinates via DFL 601 | compute_dfl(before_dfl, dfl_len, box); 602 | 603 | float x1, y1, x2, y2, w, h; 604 | x1 = (-box[0] + j + 0.5) * stride; 605 | y1 = (-box[1] + i + 0.5) * stride; 606 | x2 = (box[2] + j + 0.5) * stride; 607 | y2 = (box[3] + i + 0.5) * stride; 608 | w = x2 - x1; 609 | h = y2 - y1; 610 | boxes.push_back(x1); 611 | boxes.push_back(y1); 612 | boxes.push_back(w); 613 | boxes.push_back(h); 614 | // save the class probability and class id 615 | objProbs.push_back( 616 | deqnt_affine_to_f32(max_score, score_zp, score_scale)); 617 | classId.push_back(max_class_id); 618 | validCount++; 619 | } 620 | } 621 | } 622 | return validCount; 623 | } 624 | 625 | static int process_fp32(rknn_output *all_input, int input_id, int grid_h, 626 | int grid_w, int height, int width, int stride, 627 | int dfl_len, std::vector<float> &boxes, 628 | std::vector<float> &segments, float *proto, 629 | std::vector<float> &objProbs, std::vector<int> &classId, 630 | float threshold) { 631 | int validCount = 0; 632 | int grid_len = grid_h * grid_w; 633 | 634 | // Skip if input_id is not 0, 4, 8, or 12 635 | if (input_id % 4 != 0) { 636 | return validCount; 637 | } 638 | 639 | if (input_id == 12) { 640 | float *input_proto = (float *)all_input[input_id].buf; 641 | for (int i = 0; i < PROTO_CHANNEL * PROTO_HEIGHT * PROTO_WEIGHT; i++) { 642 | proto[i] = input_proto[i]; 643 | } 644 | return validCount; 645 | } 646 | 647 | float *box_tensor = (float *)all_input[input_id].buf; 648 | float *score_tensor = (float
*)all_input[input_id + 1].buf; 649 | float *score_sum_tensor = (float *)all_input[input_id + 2].buf; 650 | float *seg_tensor = (float *)all_input[input_id + 3].buf; 651 | 652 | for (int i = 0; i < grid_h; i++) { 653 | for (int j = 0; j < grid_w; j++) { 654 | int offset = i * grid_w + j; 655 | int max_class_id = -1; 656 | 657 | int offset_seg = i * grid_w + j; 658 | float *in_ptr_seg = seg_tensor + offset_seg; 659 | 660 | // for quick filtering through "score sum" 661 | if (score_sum_tensor != nullptr) { 662 | if (score_sum_tensor[offset] < threshold) { 663 | continue; 664 | } 665 | } 666 | 667 | float max_score = 0; 668 | for (int c = 0; c < num_labels; c++) { 669 | if ((score_tensor[offset] > threshold) && 670 | (score_tensor[offset] > max_score)) { 671 | max_score = score_tensor[offset]; 672 | max_class_id = c; 673 | } 674 | offset += grid_len; 675 | } 676 | 677 | // compute box 678 | if (max_score > threshold) { 679 | for (int k = 0; k < PROTO_CHANNEL; k++) { 680 | float seg_element_f32 = in_ptr_seg[(k)*grid_len]; 681 | segments.push_back(seg_element_f32); 682 | } 683 | 684 | offset = i * grid_w + j; 685 | float box[4]; 686 | float before_dfl[dfl_len * 4]; 687 | for (int k = 0; k < dfl_len * 4; k++) { 688 | before_dfl[k] = box_tensor[offset]; 689 | offset += grid_len; 690 | } 691 | compute_dfl(before_dfl, dfl_len, box); 692 | 693 | float x1, y1, x2, y2, w, h; 694 | x1 = (-box[0] + j + 0.5) * stride; 695 | y1 = (-box[1] + i + 0.5) * stride; 696 | x2 = (box[2] + j + 0.5) * stride; 697 | y2 = (box[3] + i + 0.5) * stride; 698 | w = x2 - x1; 699 | h = y2 - y1; 700 | boxes.push_back(x1); 701 | boxes.push_back(y1); 702 | boxes.push_back(w); 703 | boxes.push_back(h); 704 | 705 | objProbs.push_back(max_score); 706 | classId.push_back(max_class_id); 707 | validCount++; 708 | } 709 | } 710 | } 711 | return validCount; 712 | } 713 | 714 | int post_process_seg(rknn_app_context_t *app_ctx, rknn_output *outputs, 715 | letterbox_t *letter_box, float conf_threshold, 716 | float 
nms_threshold, 717 | object_detect_result_list *od_results) { 718 | std::vector<float> filterBoxes; // boxes of the detected targets 719 | std::vector<float> objProbs; // scores of the detected targets 720 | std::vector<int> classId; // class index id of each target 721 | 722 | std::vector<float> filterSegments; 723 | float proto[PROTO_CHANNEL * PROTO_HEIGHT * PROTO_WEIGHT]; 724 | std::vector<float> filterSegments_by_nms; 725 | 726 | int model_in_w = app_ctx->model_width; // model input width 727 | int model_in_h = app_ctx->model_height; // model input height 728 | 729 | int validCount = 0; // number of valid detections 730 | int stride = 0; // sliding-window stride 731 | int grid_h = 0; 732 | int grid_w = 0; 733 | 734 | int dfl_len = app_ctx->output_attrs[0].dims[1] / 4; 735 | /*** 736 | * There are three output resolutions: 80x80, 40x40 and 737 | * 20x20; this division determines how many output tensors each branch has 738 | */ 739 | int output_per_branch = app_ctx->io_num.n_output / 3; // default 3 branch 740 | 741 | // process the outputs of rknn 742 | for (int i = 0; i < 13; i++) { 743 | grid_h = app_ctx->output_attrs[i].dims[2]; // output height of this branch 744 | grid_w = app_ctx->output_attrs[i].dims[3]; // output width of this branch 745 | stride = model_in_h / grid_h; // model side length over output size gives the stride 746 | // use the int8 path if the model is quantized 747 | if (app_ctx->is_quant) { 748 | validCount += 749 | process_i8(outputs, i, grid_h, grid_w, model_in_h, model_in_w, stride, 750 | dfl_len, filterBoxes, filterSegments, proto, objProbs, 751 | classId, conf_threshold, app_ctx); 752 | } else { 753 | validCount += 754 | process_fp32(outputs, i, grid_h, grid_w, model_in_h, model_in_w, 755 | stride, dfl_len, filterBoxes, filterSegments, proto, 756 | objProbs, classId, conf_threshold); 757 | } 758 | } 759 | 760 | // nms 761 | if (validCount <= 0) { 762 | return 0; 763 | } 764 | std::vector<int> indexArray; 765 | for (int i = 0; i < validCount; ++i) { 766 | indexArray.push_back(i); 767 | } 768 | // quicksort the class probabilities into descending order; the indices follow 769 | quick_sort_indice_inverse(objProbs, 0, validCount - 1, indexArray); 770 | 771 | std::set<int> class_set(std::begin(classId), std::end(classId)); 772 | 773 | for (auto c : class_set) { // mark and drop boxes whose overlap exceeds the threshold 774 |
nms(validCount, filterBoxes, classId, indexArray, c, nms_threshold); 775 | } 776 | 777 | int last_count = 0; 778 | od_results->count = 0; 779 | 780 | for (int i = 0; i < validCount; ++i) { 781 | // invalid overlapping boxes were marked with index -1 in the previous step 782 | if (indexArray[i] == -1 || last_count >= OBJ_NUMB_MAX_SIZE) { 783 | continue; 784 | } 785 | int n = indexArray[i]; 786 | // fetch the box, class id and probability score 787 | float x1 = filterBoxes[n * 4 + 0]; 788 | float y1 = filterBoxes[n * 4 + 1]; 789 | float x2 = x1 + filterBoxes[n * 4 + 2]; 790 | float y2 = y1 + filterBoxes[n * 4 + 3]; 791 | int id = classId[n]; 792 | float obj_conf = objProbs[i]; 793 | 794 | for (int k = 0; k < PROTO_CHANNEL; k++) { 795 | // fetch the corresponding segmentation vector 796 | filterSegments_by_nms.push_back(filterSegments[n * PROTO_CHANNEL + k]); 797 | } 798 | 799 | od_results->results[last_count].box.left = x1; 800 | od_results->results[last_count].box.top = y1; 801 | od_results->results[last_count].box.right = x2; 802 | od_results->results[last_count].box.bottom = y2; 803 | 804 | od_results->results[last_count].prop = obj_conf; 805 | od_results->results[last_count].cls_id = id; 806 | last_count++; 807 | } 808 | od_results->count = last_count; 809 | 810 | int boxes_num = od_results->count; 811 | 812 | // compute the mask (binary matrix) through Matmul 813 | // compute the mask matrices; the result is boxes_num masks 814 | int ROWS_A = boxes_num; 815 | int COLS_A = PROTO_CHANNEL; 816 | int COLS_B = PROTO_HEIGHT * PROTO_WEIGHT; 817 | uint8_t matmul_out[boxes_num * PROTO_HEIGHT * PROTO_WEIGHT]; 818 | if (app_ctx->is_quant) { 819 | matmul_by_npu_i8(filterSegments_by_nms, proto, matmul_out, ROWS_A, COLS_A, 820 | COLS_B, app_ctx); 821 | } else { 822 | matmul_by_npu_fp16(filterSegments_by_nms, proto, matmul_out, ROWS_A, COLS_A, 823 | COLS_B, app_ctx); 824 | } 825 | 826 | float filterBoxes_by_nms[boxes_num * 4]; // coordinates of each box 827 | int cls_id[boxes_num]; 828 | for (int i = 0; i < boxes_num; i++) { 829 | // for crop_mask 830 | // the mask prototype resolution is 160x160 831 | // 640 / 160 = 4.0 832 | filterBoxes_by_nms[i
* 4 + 0] = 833 | od_results->results[i].box.left / 4.0; // x1; 834 | filterBoxes_by_nms[i * 4 + 1] = 835 | od_results->results[i].box.top / 4.0; // y1; 836 | filterBoxes_by_nms[i * 4 + 2] = 837 | od_results->results[i].box.right / 4.0; // x2; 838 | filterBoxes_by_nms[i * 4 + 3] = 839 | od_results->results[i].box.bottom / 4.0; // y2; 840 | cls_id[i] = od_results->results[i].cls_id; 841 | 842 | // get real box 843 | // map the 640x640 coordinates back to the original input image 844 | od_results->results[i].box.left = 845 | box_reverse(od_results->results[i].box.left, model_in_w, 846 | letter_box->x_pad, letter_box->scale); 847 | od_results->results[i].box.top = 848 | box_reverse(od_results->results[i].box.top, model_in_h, 849 | letter_box->y_pad, letter_box->scale); 850 | od_results->results[i].box.right = 851 | box_reverse(od_results->results[i].box.right, model_in_w, 852 | letter_box->x_pad, letter_box->scale); 853 | od_results->results[i].box.bottom = 854 | box_reverse(od_results->results[i].box.bottom, model_in_h, 855 | letter_box->y_pad, letter_box->scale); 856 | } 857 | 858 | // crop seg outside box 859 | uint8_t all_mask_in_one[PROTO_HEIGHT * PROTO_WEIGHT] = {0}; 860 | // write all the mask data into a single image 861 | crop_mask(matmul_out, all_mask_in_one, filterBoxes_by_nms, boxes_num, cls_id, 862 | PROTO_HEIGHT, PROTO_WEIGHT); 863 | 864 | // get real mask 865 | int cropped_height = PROTO_HEIGHT - letter_box->y_pad / 4 * 2; 866 | int cropped_width = PROTO_WEIGHT - letter_box->x_pad / 4 * 2; 867 | int y_pad = letter_box->y_pad / 4; // 640 / 160 = 4 868 | int x_pad = letter_box->x_pad / 4; 869 | int ori_in_height = (model_in_h - letter_box->y_pad * 2) / letter_box->scale; 870 | int ori_in_width = (model_in_w - letter_box->x_pad * 2) / letter_box->scale; 871 | uint8_t *cropped_seg_mask = 872 | (uint8_t *)malloc(cropped_height * cropped_width * sizeof(uint8_t)); 873 | uint8_t *real_seg_mask = 874 | (uint8_t *)malloc(ori_in_height * ori_in_width * sizeof(uint8_t)); 875 | //
restore to the original image resolution; the result is saved in real_seg_mask and recorded via od_results->results_seg[0].seg_mask 876 | seg_reverse(all_mask_in_one, cropped_seg_mask, real_seg_mask, model_in_h, 877 | model_in_w, PROTO_HEIGHT, PROTO_WEIGHT, cropped_height, 878 | cropped_width, ori_in_height, ori_in_width, y_pad, x_pad); 879 | od_results->results_seg[0].seg_mask = real_seg_mask; 880 | free(cropped_seg_mask); 881 | 882 | return 0; 883 | } 884 | 885 | static int process_i8_obb(int8_t *box_tensor, int32_t box_zp, float box_scale, 886 | int8_t *score_tensor, int32_t score_zp, 887 | float score_scale, int8_t *angle_tensor, 888 | int32_t angle_zp, float angle_scale, int grid_h, 889 | int grid_w, int stride, int dfl_len, 890 | std::vector<float> &boxes, std::vector<float> &angles, 891 | std::vector<float> &objProbs, 892 | std::vector<int> &classId, float threshold) { 893 | // KAYLORDUT_LOG_DEBUG("process_i8_obb is called"); 894 | int validCount = 0; 895 | int grid_len = grid_h * grid_w; 896 | int8_t score_thres_i8 = qnt_f32_to_affine(threshold, score_zp, score_scale); 897 | 898 | for (int i = 0; i < grid_h; i++) { 899 | for (int j = 0; j < grid_w; j++) { 900 | int offset = i * grid_w + j; 901 | int max_class_id = -1; 902 | 903 | int8_t max_score = -score_zp; 904 | 905 | for (int c = 0; c < num_labels; c++) { 906 | if ((score_tensor[offset] > score_thres_i8) && 907 | (score_tensor[offset] > max_score)) { 908 | max_score = score_tensor[offset]; 909 | max_class_id = c; 910 | // KAYLORDUT_LOG_INFO("threshold is {}, {} score is {}", 911 | // threshold, 912 | // coco_cls_to_name(c), 913 | // deqnt_affine_to_f32(score_tensor[offset], 914 | // score_zp, score_scale)); 915 | } 916 | offset += grid_len; 917 | } 918 | 919 | // compute box 920 | if (max_score > score_thres_i8) { 921 | offset = i * grid_w + j; 922 | float box[4]; 923 | float before_dfl[dfl_len * 4]; 924 | for (int k = 0; k < dfl_len * 4; k++) { 925 | before_dfl[k] = 926 | deqnt_affine_to_f32(box_tensor[offset], box_zp, box_scale); 927 | offset += grid_len; 928 | } 929 |
compute_dfl(before_dfl, dfl_len, box); 930 | 931 | // float x1, y1, x2, y2, w, h; 932 | // x1 = (-box[0] + j + 0.5) * stride; 933 | // y1 = (-box[1] + i + 0.5) * stride; 934 | // x2 = (box[2] + j + 0.5) * stride; 935 | // y2 = (box[3] + i + 0.5) * stride; 936 | // w = x2 - x1; 937 | // h = y2 - y1; 938 | offset = i * grid_w + j; 939 | float delta_x1, delta_y1, delta_x2, delta_y2; 940 | float theta = 941 | deqnt_affine_to_f32(angle_tensor[offset], angle_zp, angle_scale); 942 | // KAYLORDUT_LOG_DEBUG("theta = {}", theta); 943 | delta_x1 = box[0]; 944 | delta_y1 = box[1]; 945 | delta_x2 = box[2]; 946 | delta_y2 = box[3]; 947 | float xf = (delta_x2 - delta_x1) / 2.0f; 948 | float yf = (delta_y2 - delta_y1) / 2.0f; 949 | float x = xf * cos(theta) - yf * sin(theta); 950 | float y = xf * sin(theta) + yf * cos(theta); 951 | x = (x + j + 0.5) * stride; 952 | y = (y + i + 0.5) * stride; 953 | float w = (delta_x1 + delta_x2) * stride; 954 | float h = (delta_y1 + delta_y2) * stride; 955 | boxes.push_back(x); 956 | boxes.push_back(y); 957 | boxes.push_back(w); 958 | boxes.push_back(h); 959 | 960 | angles.push_back(theta); 961 | objProbs.push_back( 962 | deqnt_affine_to_f32(max_score, score_zp, score_scale)); 963 | classId.push_back(max_class_id); 964 | validCount++; 965 | } 966 | } 967 | } 968 | // KAYLORDUT_LOG_DEBUG("process_i8_obb finished"); 969 | return validCount; 970 | } 971 | 972 | static int process_i8_pose(int8_t *box_tensor, int32_t box_zp, float box_scale, 973 | int8_t *score_tensor, int32_t score_zp, 974 | float score_scale, int8_t *kpt_tensor, 975 | int32_t kpt_zp, float kpt_scale, 976 | int8_t *visibility_tensor, int32_t visibility_zp, 977 | float visibility_scale, int grid_h, int grid_w, 978 | int stride, int dfl_len, std::vector<float> &boxes, 979 | std::vector<float> &objProbs, 980 | std::vector<float> &kpt, 981 | std::vector<float> &visibilities, float threshold) { 982 | int validCount = 0; 983 | int grid_len = grid_h * grid_w; 984 | int8_t score_thres_i8 = qnt_f32_to_affine(threshold,
score_zp, score_scale); 985 | 986 | for (int i = 0; i < grid_h; i++) { 987 | for (int j = 0; j < grid_w; j++) { 988 | int offset = i * grid_w + j; 989 | int max_class_id = -1; 990 | 991 | int8_t max_score = -score_zp; 992 | for (int c = 0; c < num_labels; c++) { 993 | if ((score_tensor[offset] > score_thres_i8) && 994 | (score_tensor[offset] > max_score)) { 995 | max_score = score_tensor[offset]; 996 | max_class_id = c; 997 | } 998 | offset += grid_len; 999 | } 1000 | 1001 | // compute box 1002 | if (max_score > score_thres_i8) { 1003 | offset = i * grid_w + j; 1004 | float box[4]; 1005 | float before_dfl[dfl_len * 4]; 1006 | for (int k = 0; k < dfl_len * 4; k++) { 1007 | before_dfl[k] = 1008 | deqnt_affine_to_f32(box_tensor[offset], box_zp, box_scale); 1009 | offset += grid_len; 1010 | } 1011 | compute_dfl(before_dfl, dfl_len, box); 1012 | 1013 | float x1, y1, x2, y2, w, h; 1014 | x1 = (-box[0] + j + 0.5) * stride; 1015 | y1 = (-box[1] + i + 0.5) * stride; 1016 | x2 = (box[2] + j + 0.5) * stride; 1017 | y2 = (box[3] + i + 0.5) * stride; 1018 | w = x2 - x1; 1019 | h = y2 - y1; 1020 | boxes.push_back(x1); 1021 | boxes.push_back(y1); 1022 | boxes.push_back(w); 1023 | boxes.push_back(h); 1024 | 1025 | objProbs.push_back( 1026 | deqnt_affine_to_f32(max_score, score_zp, score_scale)); 1027 | offset = i * grid_w + j; 1028 | for (int k = 0; k < 17; ++k) { 1029 | auto kpt_x = *(kpt_tensor + offset + 2 * k * grid_len); 1030 | auto kpt_y = *(kpt_tensor + offset + (2 * k + 1) * grid_len); 1031 | auto kpt_visibility = *(visibility_tensor + offset + k * grid_len); 1032 | auto res_kpt_x = deqnt_affine_to_f32(kpt_x, kpt_zp, kpt_scale); 1033 | auto res_kpt_y = deqnt_affine_to_f32(kpt_y, kpt_zp, kpt_scale); 1034 | auto res_kpt_visibility = deqnt_affine_to_f32( 1035 | kpt_visibility, visibility_zp, visibility_scale); 1036 | kpt.push_back(res_kpt_x); 1037 | kpt.push_back(res_kpt_y); 1038 | visibilities.push_back(res_kpt_visibility); 1039 | } 1040 | validCount++; 1041 | } 1042 | } 
1043 | } 1044 | return validCount; 1045 | } 1046 | static int process_i8(int8_t *box_tensor, int32_t box_zp, float box_scale, 1047 | int8_t *score_tensor, int32_t score_zp, float score_scale, 1048 | int8_t *score_sum_tensor, int32_t score_sum_zp, 1049 | float score_sum_scale, int grid_h, int grid_w, int stride, 1050 | int dfl_len, std::vector<float> &boxes, 1051 | std::vector<float> &objProbs, std::vector<int> &classId, 1052 | float threshold) { 1053 | int validCount = 0; 1054 | int grid_len = grid_h * grid_w; 1055 | int8_t score_thres_i8 = qnt_f32_to_affine(threshold, score_zp, score_scale); 1056 | int8_t score_sum_thres_i8 = 1057 | qnt_f32_to_affine(threshold, score_sum_zp, score_sum_scale); 1058 | 1059 | for (int i = 0; i < grid_h; i++) { 1060 | for (int j = 0; j < grid_w; j++) { 1061 | int offset = i * grid_w + j; 1062 | int max_class_id = -1; 1063 | 1064 | // use the score sum for quick filtering 1065 | if (score_sum_tensor != nullptr) { 1066 | if (score_sum_tensor[offset] < score_sum_thres_i8) { 1067 | continue; 1068 | } 1069 | } 1070 | 1071 | int8_t max_score = -score_zp; 1072 | for (int c = 0; c < num_labels; c++) { 1073 | if ((score_tensor[offset] > score_thres_i8) && 1074 | (score_tensor[offset] > max_score)) { 1075 | max_score = score_tensor[offset]; 1076 | max_class_id = c; 1077 | } 1078 | offset += grid_len; 1079 | } 1080 | 1081 | // compute box 1082 | if (max_score > score_thres_i8) { 1083 | offset = i * grid_w + j; 1084 | float box[4]; 1085 | float before_dfl[dfl_len * 4]; 1086 | for (int k = 0; k < dfl_len * 4; k++) { 1087 | before_dfl[k] = 1088 | deqnt_affine_to_f32(box_tensor[offset], box_zp, box_scale); 1089 | offset += grid_len; 1090 | } 1091 | compute_dfl(before_dfl, dfl_len, box); 1092 | 1093 | float x1, y1, x2, y2, w, h; 1094 | x1 = (-box[0] + j + 0.5) * stride; 1095 | y1 = (-box[1] + i + 0.5) * stride; 1096 | x2 = (box[2] + j + 0.5) * stride; 1097 | y2 = (box[3] + i + 0.5) * stride; 1098 | w = x2 - x1; 1099 | h = y2 - y1; 1100 | boxes.push_back(x1); 1101 |
boxes.push_back(y1); 1102 | boxes.push_back(w); 1103 | boxes.push_back(h); 1104 | 1105 | objProbs.push_back( 1106 | deqnt_affine_to_f32(max_score, score_zp, score_scale)); 1107 | classId.push_back(max_class_id); 1108 | validCount++; 1109 | } 1110 | } 1111 | } 1112 | return validCount; 1113 | } 1114 | 1115 | static int process_i8_v10(int8_t *box_tensor, int32_t box_zp, float box_scale, 1116 | int8_t *score_tensor, int32_t score_zp, 1117 | float score_scale, int grid_h, int grid_w, int stride, 1118 | int dfl_len, std::vector<float> &boxes, 1119 | std::vector<float> &objProbs, 1120 | std::vector<int> &classId, float threshold) { 1121 | int validCount = 0; 1122 | int grid_len = grid_h * grid_w; 1123 | int8_t score_thres_i8 = qnt_f32_to_affine(threshold, score_zp, score_scale); 1124 | 1125 | for (int i = 0; i < grid_h; i++) { 1126 | for (int j = 0; j < grid_w; j++) { 1127 | int offset = i * grid_w + j; 1128 | int max_class_id = -1; 1129 | 1130 | int8_t max_score = -score_zp; 1131 | for (int c = 0; c < num_labels; c++) { 1132 | if ((score_tensor[offset] > score_thres_i8) && 1133 | (score_tensor[offset] > max_score)) { 1134 | max_score = score_tensor[offset]; 1135 | max_class_id = c; 1136 | } 1137 | offset += grid_len; 1138 | } 1139 | 1140 | // compute box 1141 | if (max_score > score_thres_i8) { 1142 | offset = i * grid_w + j; 1143 | float box[4]; 1144 | float before_dfl[dfl_len * 4]; 1145 | for (int k = 0; k < dfl_len * 4; k++) { 1146 | before_dfl[k] = 1147 | deqnt_affine_to_f32(box_tensor[offset], box_zp, box_scale); 1148 | offset += grid_len; 1149 | } 1150 | compute_dfl(before_dfl, dfl_len, box); 1151 | 1152 | float x1, y1, x2, y2, w, h; 1153 | x1 = (-box[0] + j + 0.5) * stride; 1154 | y1 = (-box[1] + i + 0.5) * stride; 1155 | x2 = (box[2] + j + 0.5) * stride; 1156 | y2 = (box[3] + i + 0.5) * stride; 1157 | w = x2 - x1; 1158 | h = y2 - y1; 1159 | boxes.push_back(x1); 1160 | boxes.push_back(y1); 1161 | boxes.push_back(w); 1162 | boxes.push_back(h); 1163 | 1164 | objProbs.push_back( 1165
| deqnt_affine_to_f32(max_score, score_zp, score_scale)); 1166 | classId.push_back(max_class_id); 1167 | validCount++; 1168 | } 1169 | } 1170 | } 1171 | return validCount; 1172 | } 1173 | static int process_fp32(float *box_tensor, float *score_tensor, 1174 | float *score_sum_tensor, int grid_h, int grid_w, 1175 | int stride, int dfl_len, std::vector<float> &boxes, 1176 | std::vector<float> &objProbs, std::vector<int> &classId, 1177 | float threshold) { 1178 | int validCount = 0; 1179 | int grid_len = grid_h * grid_w; 1180 | for (int i = 0; i < grid_h; i++) { 1181 | for (int j = 0; j < grid_w; j++) { 1182 | int offset = i * grid_w + j; 1183 | int max_class_id = -1; 1184 | 1185 | // use the score sum for quick filtering 1186 | if (score_sum_tensor != nullptr) { 1187 | if (score_sum_tensor[offset] < threshold) { 1188 | continue; 1189 | } 1190 | } 1191 | 1192 | float max_score = 0; 1193 | for (int c = 0; c < num_labels; c++) { 1194 | if ((score_tensor[offset] > threshold) && 1195 | (score_tensor[offset] > max_score)) { 1196 | max_score = score_tensor[offset]; 1197 | max_class_id = c; 1198 | } 1199 | offset += grid_len; 1200 | } 1201 | 1202 | // compute box 1203 | if (max_score > threshold) { 1204 | offset = i * grid_w + j; 1205 | float box[4]; 1206 | float before_dfl[dfl_len * 4]; 1207 | for (int k = 0; k < dfl_len * 4; k++) { 1208 | before_dfl[k] = box_tensor[offset]; 1209 | offset += grid_len; 1210 | } 1211 | compute_dfl(before_dfl, dfl_len, box); 1212 | 1213 | float x1, y1, x2, y2, w, h; 1214 | x1 = (-box[0] + j + 0.5) * stride; 1215 | y1 = (-box[1] + i + 0.5) * stride; 1216 | x2 = (box[2] + j + 0.5) * stride; 1217 | y2 = (box[3] + i + 0.5) * stride; 1218 | w = x2 - x1; 1219 | h = y2 - y1; 1220 | boxes.push_back(x1); 1221 | boxes.push_back(y1); 1222 | boxes.push_back(w); 1223 | boxes.push_back(h); 1224 | 1225 | objProbs.push_back(max_score); 1226 | classId.push_back(max_class_id); 1227 | validCount++; 1228 | } 1229 | } 1230 | } 1231 | return validCount; 1232 | } 1233 | 1234 | int
post_process_pose(rknn_app_context_t *app_ctx, rknn_output *outputs, 1235 | letterbox_t *letter_box, float conf_threshold, 1236 | float nms_threshold, 1237 | object_detect_result_list *od_results) { 1238 | std::vector<float> filterBoxes; 1239 | std::vector<float> objProbs; 1240 | std::vector<float> kpt; 1241 | std::vector<float> visibilities; 1242 | int validCount = 0; 1243 | int stride = 0; 1244 | int grid_h = 0; 1245 | int grid_w = 0; 1246 | int model_in_w = app_ctx->model_width; 1247 | int model_in_h = app_ctx->model_height; 1248 | 1249 | // default 3 branch 1250 | int dfl_len = app_ctx->output_attrs[0].dims[1] / 4; 1251 | int output_per_branch = app_ctx->io_num.n_output / 3; 1252 | for (int i = 0; i < 3; i++) { 1253 | int box_idx = i * output_per_branch; 1254 | int score_idx = i * output_per_branch + 1; 1255 | int kpt_idx = i * output_per_branch + 2; 1256 | int visibilities_idx = i * output_per_branch + 3; 1257 | 1258 | grid_h = app_ctx->output_attrs[box_idx].dims[2]; 1259 | grid_w = app_ctx->output_attrs[box_idx].dims[3]; 1260 | num_labels = app_ctx->output_attrs[score_idx].dims[1]; 1261 | stride = model_in_h / grid_h; 1262 | 1263 | if (app_ctx->is_quant) { 1264 | validCount += process_i8_pose( 1265 | (int8_t *)outputs[box_idx].buf, app_ctx->output_attrs[box_idx].zp, 1266 | app_ctx->output_attrs[box_idx].scale, 1267 | (int8_t *)outputs[score_idx].buf, app_ctx->output_attrs[score_idx].zp, 1268 | app_ctx->output_attrs[score_idx].scale, 1269 | (int8_t *)outputs[kpt_idx].buf, app_ctx->output_attrs[kpt_idx].zp, 1270 | app_ctx->output_attrs[kpt_idx].scale, 1271 | (int8_t *)outputs[visibilities_idx].buf, 1272 | app_ctx->output_attrs[visibilities_idx].zp, 1273 | app_ctx->output_attrs[visibilities_idx].scale, grid_h, grid_w, stride, 1274 | dfl_len, filterBoxes, objProbs, kpt, visibilities, conf_threshold); 1275 | } 1276 | } 1277 | 1278 | // no object detect 1279 | if (validCount <= 0) { 1280 | return 0; 1281 | } 1282 | std::vector<int> indexArray; 1283 | for (int i = 0; i < validCount; ++i) { 1284 |
indexArray.push_back(i); 1285 | } 1286 | quick_sort_indice_inverse(objProbs, 0, validCount - 1, indexArray); 1287 | // Pose has only one class (person), so the NMS can be simplified 1288 | nms(validCount, filterBoxes, indexArray, nms_threshold); 1289 | 1290 | int last_count = 0; 1291 | od_results->count = 0; 1292 | 1293 | /* box valid detect target */ 1294 | for (int i = 0; i < validCount; ++i) { 1295 | if (indexArray[i] == -1 || last_count >= OBJ_NUMB_MAX_SIZE) { 1296 | continue; 1297 | } 1298 | int n = indexArray[i]; 1299 | 1300 | float x1 = filterBoxes[n * 4 + 0] - letter_box->x_pad; 1301 | float y1 = filterBoxes[n * 4 + 1] - letter_box->y_pad; 1302 | float x2 = x1 + filterBoxes[n * 4 + 2]; 1303 | float y2 = y1 + filterBoxes[n * 4 + 3]; 1304 | float obj_conf = objProbs[i]; 1305 | 1306 | od_results->results[last_count].box.left = 1307 | (int)(clamp(x1, 0, model_in_w) / letter_box->scale); 1308 | od_results->results[last_count].box.top = 1309 | (int)(clamp(y1, 0, model_in_h) / letter_box->scale); 1310 | od_results->results[last_count].box.right = 1311 | (int)(clamp(x2, 0, model_in_w) / letter_box->scale); 1312 | od_results->results[last_count].box.bottom = 1313 | (int)(clamp(y2, 0, model_in_h) / letter_box->scale); 1314 | od_results->results[last_count].prop = obj_conf; 1315 | for (int j = 0; j < 34; j = j + 2) { 1316 | auto kpt_x = kpt.at(34 * n + j) - letter_box->x_pad; 1317 | auto kpt_y = kpt.at(34 * n + j + 1) - letter_box->y_pad; 1318 | kpt_x /= letter_box->scale; 1319 | kpt_y /= letter_box->scale; 1320 | od_results->results_pose[last_count].kpt[j] = kpt_x; 1321 | od_results->results_pose[last_count].kpt[j + 1] = kpt_y; 1322 | od_results->results_pose[last_count].visibility[j / 2] = 1323 | visibilities.at(17 * n + j / 2); 1324 | } 1325 | last_count++; 1326 | } 1327 | od_results->count = last_count; 1328 | KAYLORDUT_LOG_INFO("valid count: {}, results count: {}", validCount, 1329 | od_results->count); 1330 | return 0; 1331 | } 1332 | 1333 | int post_process_v10_detection(rknn_app_context_t
                               *app_ctx,
                               rknn_output *outputs, letterbox_t *letter_box,
                               float conf_threshold,
                               object_detect_result_list *od_results) {
  std::vector<float> filterBoxes;
  std::vector<float> objProbs;
  std::vector<int> classId;
  int validCount = 0;
  int stride = 0;
  int grid_h = 0;
  int grid_w = 0;
  int model_in_w = app_ctx->model_width;
  int model_in_h = app_ctx->model_height;

  // default 3 branch
  int dfl_len = app_ctx->output_attrs[0].dims[1] / 4;
  int output_per_branch = app_ctx->io_num.n_output / 3;
  for (int i = 0; i < 3; i++) {
    int box_idx = i * output_per_branch;
    int score_idx = i * output_per_branch + 1;

    grid_h = app_ctx->output_attrs[box_idx].dims[2];
    grid_w = app_ctx->output_attrs[box_idx].dims[3];
    stride = model_in_h / grid_h;

    if (app_ctx->is_quant) {
      validCount += process_i8_v10(
          (int8_t *)outputs[box_idx].buf, app_ctx->output_attrs[box_idx].zp,
          app_ctx->output_attrs[box_idx].scale,
          (int8_t *)outputs[score_idx].buf, app_ctx->output_attrs[score_idx].zp,
          app_ctx->output_attrs[score_idx].scale, grid_h, grid_w, stride,
          dfl_len, filterBoxes, objProbs, classId, conf_threshold);
    }
  }
  // no object detected
  if (validCount <= 0) {
    return 0;
  }
  KAYLORDUT_LOG_INFO("valid count is {}", validCount);
  int last_count = 0;
  od_results->count = 0;

  /* box valid detect target */
  for (int i = 0; i < validCount; ++i) {
    int n = i;

    float x1 = filterBoxes[n * 4 + 0] - letter_box->x_pad;
    float y1 = filterBoxes[n * 4 + 1] - letter_box->y_pad;
    float x2 = x1 + filterBoxes[n * 4 + 2];
    float y2 = y1 + filterBoxes[n * 4 + 3];
    int id = classId[n];
    float obj_conf = objProbs[i];

    od_results->results[last_count].box.left =
        (int)(clamp(x1, 0, model_in_w) / letter_box->scale);
    od_results->results[last_count].box.top =
        (int)(clamp(y1, 0, model_in_h) / letter_box->scale);
    od_results->results[last_count].box.right =
        (int)(clamp(x2, 0, model_in_w) / letter_box->scale);
    od_results->results[last_count].box.bottom =
        (int)(clamp(y2, 0, model_in_h) / letter_box->scale);
    od_results->results[last_count].prop = obj_conf;
    od_results->results[last_count].cls_id = id;
    last_count++;
  }
  od_results->count = last_count;
  return 0;
}

int post_process(rknn_app_context_t *app_ctx, rknn_output *outputs,
                 letterbox_t *letter_box, float conf_threshold,
                 float nms_threshold, object_detect_result_list *od_results) {
  std::vector<float> filterBoxes;
  std::vector<float> objProbs;
  std::vector<int> classId;
  int validCount = 0;
  int stride = 0;
  int grid_h = 0;
  int grid_w = 0;
  int model_in_w = app_ctx->model_width;
  int model_in_h = app_ctx->model_height;

  // default 3 branch
  int dfl_len = app_ctx->output_attrs[0].dims[1] / 4;
  int output_per_branch = app_ctx->io_num.n_output / 3;
  for (int i = 0; i < 3; i++) {
    void *score_sum = nullptr;
    int32_t score_sum_zp = 0;
    float score_sum_scale = 1.0;
    if (output_per_branch == 3) {
      score_sum = outputs[i * output_per_branch + 2].buf;
      score_sum_zp = app_ctx->output_attrs[i * output_per_branch + 2].zp;
      score_sum_scale = app_ctx->output_attrs[i * output_per_branch + 2].scale;
    }
    int box_idx = i * output_per_branch;
    int score_idx = i * output_per_branch + 1;

    grid_h = app_ctx->output_attrs[box_idx].dims[2];
    grid_w = app_ctx->output_attrs[box_idx].dims[3];
    stride = model_in_h / grid_h;

    if (app_ctx->is_quant) {
      validCount += process_i8(
          (int8_t *)outputs[box_idx].buf, app_ctx->output_attrs[box_idx].zp,
          app_ctx->output_attrs[box_idx].scale,
          (int8_t *)outputs[score_idx].buf, app_ctx->output_attrs[score_idx].zp,
          app_ctx->output_attrs[score_idx].scale, (int8_t *)score_sum,
          score_sum_zp, score_sum_scale, grid_h, grid_w, stride, dfl_len,
          filterBoxes, objProbs, classId, conf_threshold);
    } else {
      validCount += process_fp32(
          (float *)outputs[box_idx].buf, (float *)outputs[score_idx].buf,
          (float *)score_sum, grid_h, grid_w, stride, dfl_len, filterBoxes,
          objProbs, classId, conf_threshold);
    }
  }

  // no object detected
  if (validCount <= 0) {
    return 0;
  }
  std::vector<int> indexArray;
  // Yolov8 needs NMS here; yolov10 does not
  if (od_results->model_type == ModelType::DETECTION) {
    for (int i = 0; i < validCount; ++i) {
      indexArray.push_back(i);
    }
    quick_sort_indice_inverse(objProbs, 0, validCount - 1, indexArray);
    std::set<int> class_set(std::begin(classId), std::end(classId));
    for (auto c : class_set) {
      nms(validCount, filterBoxes, classId, indexArray, c, nms_threshold);
    }
  }

  int last_count = 0;
  od_results->count = 0;

  /* box valid detect target */
  for (int i = 0; i < validCount; ++i) {
    int n;
    // for yolov8, pick boxes according to the NMS result
    if (od_results->model_type == ModelType::DETECTION) {
      if (indexArray[i] == -1 || last_count >= OBJ_NUMB_MAX_SIZE) {
        continue;
      }
      n = indexArray[i];
    } else {
      n = i;
    }

    float x1 = filterBoxes[n * 4 + 0] - letter_box->x_pad;
    float y1 = filterBoxes[n * 4 + 1] - letter_box->y_pad;
    float x2 = x1 + filterBoxes[n * 4 + 2];
    float y2 = y1 + filterBoxes[n * 4 + 3];
    int id = classId[n];
    float obj_conf = objProbs[i];

    od_results->results[last_count].box.left =
        (int)(clamp(x1, 0, model_in_w) / letter_box->scale);
    od_results->results[last_count].box.top =
        (int)(clamp(y1, 0, model_in_h) / letter_box->scale);
    od_results->results[last_count].box.right =
        (int)(clamp(x2, 0, model_in_w) / letter_box->scale);
    od_results->results[last_count].box.bottom =
        (int)(clamp(y2, 0, model_in_h) / letter_box->scale);
    od_results->results[last_count].prop = obj_conf;
    od_results->results[last_count].cls_id = id;
    last_count++;
  }
  od_results->count = last_count;
  return 0;
}

int post_process_obb(rknn_app_context_t *app_ctx, rknn_output *outputs,
                     letterbox_t *letter_box, float conf_threshold,
                     float nms_threshold,
                     object_detect_result_list *od_results) {
  std::vector<float> filterBoxes;  // box
  std::vector<float> objProbs;     // confidence
  std::vector<int> classId;        // class id
  std::vector<float> angles;
  int validCount = 0;
  int stride = 0;
  int grid_h = 0;
  int grid_w = 0;
  int model_in_w = app_ctx->model_width;
  int model_in_h = app_ctx->model_height;

  // default 3 branch
  int dfl_len = app_ctx->output_attrs[0].dims[1] / 4;
  int output_per_branch = app_ctx->io_num.n_output / 3;
  for (int i = 0; i < 3; i++) {
    int box_idx = i * output_per_branch;
    int score_idx = i * output_per_branch + 1;
    int angle_idx = i * output_per_branch + 2;

    grid_h = app_ctx->output_attrs[box_idx].dims[2];
    grid_w = app_ctx->output_attrs[box_idx].dims[3];
    num_labels = app_ctx->output_attrs[score_idx].dims[1];
    stride = model_in_h / grid_h;

    if (app_ctx->is_quant) {
      validCount += process_i8_obb(
          (int8_t *)outputs[box_idx].buf, app_ctx->output_attrs[box_idx].zp,
          app_ctx->output_attrs[box_idx].scale,
          (int8_t *)outputs[score_idx].buf, app_ctx->output_attrs[score_idx].zp,
          app_ctx->output_attrs[score_idx].scale,
          (int8_t *)outputs[angle_idx].buf,
          app_ctx->output_attrs[angle_idx].zp,
          app_ctx->output_attrs[angle_idx].scale, grid_h, grid_w, stride,
          dfl_len, filterBoxes, angles, objProbs, classId, conf_threshold);
    } else {
      // validCount += process_fp32(
      //     (float *)outputs[box_idx].buf, (float *)outputs[score_idx].buf,
      //     (float *)score_sum, grid_h, grid_w, stride, dfl_len, filterBoxes,
      //     objProbs, classId, conf_threshold);
    }
  }

  // no object detected
  if (validCount <= 0) {
    return 0;
  }
  KAYLORDUT_LOG_INFO("valid count is {}", validCount);
  std::vector<int> indexArray;
  for (int i = 0; i < validCount; ++i) {
    indexArray.push_back(i);
  }
  quick_sort_indice_inverse(objProbs, 0, validCount - 1, indexArray);

  std::set<int> class_set(std::begin(classId), std::end(classId));

  for (auto c : class_set) {
    KAYLORDUT_LOG_DEBUG("processing nms, class id is {}", coco_cls_to_name(c));
    nms(validCount, filterBoxes, angles, classId, indexArray, c, nms_threshold);
  }

  int last_count = 0;
  od_results->count = 0;

  /* box valid detect target */
  for (int i = 0; i < validCount; ++i) {
    if (indexArray[i] == -1 || last_count >= OBJ_NUMB_MAX_SIZE) {
      continue;
    }
    int n = indexArray[i];

    float x = filterBoxes[n * 4 + 0] - letter_box->x_pad;
    float y = filterBoxes[n * 4 + 1] - letter_box->y_pad;
    float w = filterBoxes[n * 4 + 2];
    float h = filterBoxes[n * 4 + 3];
    float theta = angles[n];
    int id = classId[n];
    float obj_conf = objProbs[i];

    // The clamp below is wrong here: (x, y, w, h) is a centre/size box, not
    // four corner coordinates, so it cannot be clamped like this.
    // od_results->results_obb[last_count].box.x =
    //     (int)(clamp(x, 0, model_in_w) / letter_box->scale);
    // od_results->results_obb[last_count].box.y =
    //     (int)(clamp(y, 0, model_in_h) / letter_box->scale);
    // od_results->results_obb[last_count].box.w =
    //     (int)(clamp(w, 0, model_in_w) / letter_box->scale);
    // od_results->results_obb[last_count].box.h =
    //     (int)(clamp(h, 0, model_in_h) / letter_box->scale);
    od_results->results_obb[last_count].box.x = (int)(x / letter_box->scale);
    od_results->results_obb[last_count].box.y = (int)(y / letter_box->scale);
    od_results->results_obb[last_count].box.w = (int)(w / letter_box->scale);
    od_results->results_obb[last_count].box.h = (int)(h / letter_box->scale);
    od_results->results_obb[last_count].box.theta = theta;
    // od_results->results_obb[last_count].box.theta = 0;
    od_results->results_obb[last_count].prop = obj_conf;
    od_results->results_obb[last_count].cls_id = id;
    KAYLORDUT_LOG_INFO(
        "label is {}, and confidence is {}, xywhθ = ({} {} {} {} {})",
        coco_cls_to_name(id), obj_conf,
        od_results->results_obb[last_count].box.x,
        od_results->results_obb[last_count].box.y,
        od_results->results_obb[last_count].box.w,
        od_results->results_obb[last_count].box.h,
        od_results->results_obb[last_count].box.theta);
    last_count++;
  }
  od_results->count = last_count;
  KAYLORDUT_LOG_INFO("result count: {}", last_count);
  return 0;
}
--------------------------------------------------------------------------------
/src/yolov8/utils/rknn_pool.cpp:
--------------------------------------------------------------------------------
//
// Created by kaylor on 3/6/24.
//

#include "rknn_pool.h"

#include "kaylordut/log/logger.h"
#include "postprocess.h"

RknnPool::RknnPool(const std::string model_path, const int thread_num,
                   const std::string lable_path) {
  this->thread_num_ = thread_num;
  this->model_path_ = model_path;
  this->label_path_ = lable_path;
  this->Init();
}

RknnPool::~RknnPool() { this->DeInit(); }

void RknnPool::Init() {
  init_post_process(this->label_path_);
  try {
    // configure the thread pool
    this->pool_ = std::make_unique<ThreadPool>(this->thread_num_);
    // each worker thread needs its own model instance
    for (int i = 0; i < this->thread_num_; ++i) {
      models_.push_back(std::make_shared<Yolov8>(
          std::forward<std::string>(this->model_path_)));
    }
  } catch (const std::bad_alloc &e) {
    KAYLORDUT_LOG_ERROR("Out of memory: {}", e.what());
    exit(EXIT_FAILURE);
  }
  for (int i = 0; i < this->thread_num_; ++i) {
    auto ret = models_[i]->Init(models_[0]->get_rknn_context(), i != 0);
    if (ret != 0) {
      KAYLORDUT_LOG_ERROR("Init rknn model failed!");
      exit(EXIT_FAILURE);
    }
  }
}

void RknnPool::DeInit() { deinit_post_process(); }

void RknnPool::AddInferenceTask(std::shared_ptr<cv::Mat> src,
                                ImageProcess &image_process) {
  pool_->enqueue(
      [&](std::shared_ptr<cv::Mat> original_img) {
        auto convert_img = image_process.Convert(*original_img);
        auto mode_id = get_model_id();
        cv::Mat rgb_img = cv::Mat::zeros(
            this->models_[mode_id]->get_model_width(),
            this->models_[mode_id]->get_model_height(), convert_img->type());
        cv::cvtColor(*convert_img, rgb_img, cv::COLOR_BGR2RGB);
        object_detect_result_list od_results;
        this->models_[mode_id]->Inference(rgb_img.ptr(), &od_results,
                                          image_process.get_letter_box());
        image_process.ImagePostProcess(*original_img, od_results);
        std::lock_guard<std::mutex> lock_guard(this->image_results_mutex_);
        this->image_results_.push(std::move(original_img));
      },
      std::move(src));
}

int RknnPool::get_model_id() {
  std::lock_guard<std::mutex> lock(id_mutex_);
  int mode_id = id;
  id++;
  if (id == thread_num_) {
    id = 0;
  }
  // KAYLORDUT_LOG_INFO("id = {}, num = {}, mode id = {}", id, thread_num_,
  //                    mode_id);
  return mode_id;
}

std::shared_ptr<cv::Mat> RknnPool::GetImageResultFromQueue() {
  std::lock_guard<std::mutex> lock_guard(this->image_results_mutex_);
  if (this->image_results_.empty()) {
    return nullptr;
  } else {
    auto res = this->image_results_.front();
    this->image_results_.pop();
    return std::move(res);
  }
}

int RknnPool::GetTasksSize() { return pool_->TasksSize(); }
--------------------------------------------------------------------------------
/src/yolov8/utils/videofile.cpp:
--------------------------------------------------------------------------------
//
// Created by kaylor on 3/4/24.
//

#include "videofile.h"

#include "image_process.h"
#include "kaylordut/log/logger.h"

VideoFile::VideoFile(const std::string &&filename) : filename_(filename) {
  capture_ = new cv::VideoCapture(filename_);
  if (!capture_->isOpened()) {
    KAYLORDUT_LOG_ERROR("Error opening video file");
    exit(EXIT_FAILURE);
  }
}

VideoFile::~VideoFile() {
  if (capture_ != nullptr) {
    KAYLORDUT_LOG_INFO("Release capture")
    capture_->release();
    delete capture_;
  }
}

void VideoFile::Display(const float framerate, const int target_size) {
  const int delay = 1000 / framerate;
  cv::namedWindow("Video", cv::WINDOW_AUTOSIZE);
  cv::Mat frame;
  ImageProcess image_process(capture_->get(cv::CAP_PROP_FRAME_WIDTH),
                             capture_->get(cv::CAP_PROP_FRAME_HEIGHT),
                             target_size, false, 30);
  while (true) {
    *capture_ >> frame;
    if (frame.empty()) {
      break;
    }
    cv::imshow("Video", *(image_process.Convert(frame)));
    // cv::imshow("Video", frame);
    if (cv::waitKey(delay) >= 0) {
      break;
    }
  }
  cv::destroyAllWindows();
}

std::unique_ptr<cv::Mat> VideoFile::GetNextFrame() {
  auto frame = std::make_unique<cv::Mat>();
  *capture_ >> *frame;
  if (frame->empty()) {
    return nullptr;
  }
  return std::move(frame);
}

cv::Mat VideoFile::test() {
  cv::Mat frame;
  *capture_ >> frame;
  cv::waitKey(125);
  return frame;
}

int VideoFile::get_frame_width() {
  return capture_->get(cv::CAP_PROP_FRAME_WIDTH);
}

int VideoFile::get_frame_height() {
  return capture_->get(cv::CAP_PROP_FRAME_HEIGHT);
}
--------------------------------------------------------------------------------
/src/yolov8/utils/yolov8.cpp:
--------------------------------------------------------------------------------
//
// Created by kaylor on 3/4/24.
//

#include "yolov8.h"

#include "kaylordut/log/logger.h"
#include "kaylordut/time/time_duration.h"
#include "postprocess.h"

const int RK3588 = 3;

// Set the core of the model that needs to be bound
int get_core_num() {
  static int core_num = 0;
  static std::mutex mtx;
  std::lock_guard<std::mutex> lock(mtx);
  int temp = core_num % RK3588;
  core_num++;
  return temp;
}

int read_data_from_file(const char *path, char **out_data) {
  FILE *fp = fopen(path, "rb");
  if (fp == NULL) {
    KAYLORDUT_LOG_INFO("fopen {} failed!", path);
    return -1;
  }
  fseek(fp, 0, SEEK_END);
  int file_size = ftell(fp);
  char *data = (char *)malloc(file_size + 1);
  data[file_size] = 0;
  fseek(fp, 0, SEEK_SET);
  if (file_size != fread(data, 1, file_size, fp)) {
    KAYLORDUT_LOG_INFO("fread {} failed!", path);
    free(data);
    fclose(fp);
    return -1;
  }
  if (fp) {
    fclose(fp);
  }
  *out_data = data;
  return file_size;
}
static void dump_tensor_attr(rknn_tensor_attr *attr) {
  KAYLORDUT_LOG_INFO(
      "index={}, name={}, n_dims={}, dims=[{}, {}, {}, {}], n_elems={}, "
      "size={}, fmt={}, type={}, qnt_type={}, "
      "zp={}, scale={}",
      attr->index, attr->name, attr->n_dims, attr->dims[0], attr->dims[1],
      attr->dims[2], attr->dims[3], attr->n_elems, attr->size,
      get_format_string(attr->fmt), get_type_string(attr->type),
      get_qnt_type_string(attr->qnt_type), attr->zp, attr->scale);
}

Yolov8::Yolov8(std::string &&model_path) : model_path_(model_path) {}

int Yolov8::Init(rknn_context *ctx_in, bool copy_weight) {
  int model_len = 0;
  char *model;
  int ret = 0;
  model_len = read_data_from_file(model_path_.c_str(), &model);
  if (model == nullptr) {
    KAYLORDUT_LOG_ERROR("Load model failed");
    return -1;
  }
  if (copy_weight) {
    KAYLORDUT_LOG_INFO("rknn_dup_context() is called");
    // reuse the weights of an already-initialised context
    ret = rknn_dup_context(ctx_in, &ctx_);
    if (ret != RKNN_SUCC) {
      KAYLORDUT_LOG_ERROR("rknn_dup_context failed! error code = {}", ret);
      return -1;
    }
  } else {
    KAYLORDUT_LOG_INFO("rknn_init() is called");
    ret = rknn_init(&ctx_, model, model_len, 0, NULL);
    free(model);
    if (ret != RKNN_SUCC) {
      KAYLORDUT_LOG_ERROR("rknn_init failed! error code = {}", ret);
      return -1;
    }
  }
  rknn_core_mask core_mask;
  switch (get_core_num()) {
    case 0:
      core_mask = RKNN_NPU_CORE_0;
      break;
    case 1:
      core_mask = RKNN_NPU_CORE_1;
      break;
    case 2:
      core_mask = RKNN_NPU_CORE_2;
      break;
  }
  ret = rknn_set_core_mask(ctx_, core_mask);
  if (ret < 0) {
    KAYLORDUT_LOG_ERROR("rknn_set_core_mask failed! error code = {}", ret);
    return -1;
  }

  rknn_sdk_version version;
  ret = rknn_query(ctx_, RKNN_QUERY_SDK_VERSION, &version,
                   sizeof(rknn_sdk_version));
  if (ret < 0) {
    return -1;
  }
  KAYLORDUT_LOG_INFO("sdk version: {} driver version: {}", version.api_version,
                     version.drv_version);

  // Get Model Input Output Number
  rknn_input_output_num io_num;
  ret = rknn_query(ctx_, RKNN_QUERY_IN_OUT_NUM, &io_num, sizeof(io_num));
  if (ret != RKNN_SUCC) {
    KAYLORDUT_LOG_ERROR("rknn_query failed! error code = {}", ret);
    return -1;
  }
  KAYLORDUT_LOG_INFO("model input num: {}, and output num: {}", io_num.n_input,
                     io_num.n_output);
  // Get Model Input Info
  KAYLORDUT_LOG_INFO("input tensors:");
  rknn_tensor_attr
      input_attrs[io_num.n_input];  // variable-length array; not recommended
  memset(input_attrs, 0, sizeof(input_attrs));
  for (int i = 0; i < io_num.n_input; i++) {
    input_attrs[i].index = i;
    ret = rknn_query(ctx_, RKNN_QUERY_INPUT_ATTR, &(input_attrs[i]),
                     sizeof(rknn_tensor_attr));
    if (ret != RKNN_SUCC) {
      KAYLORDUT_LOG_ERROR("input rknn_query failed! error code = {}", ret);
      return -1;
    }
    dump_tensor_attr(&(input_attrs[i]));
  }

  // Get Model Output Info
  KAYLORDUT_LOG_INFO("output tensors:");
  rknn_tensor_attr output_attrs[io_num.n_output];
  memset(output_attrs, 0, sizeof(output_attrs));
  // the official RK segment model has 13 outputs
  if (io_num.n_output == 13) {
    KAYLORDUT_LOG_INFO("this is a segment model")
    model_type_ = ModelType::SEGMENT;
  } else {
    model_type_ = ModelType::DETECTION;
  }
  for (int i = 0; i < io_num.n_output; i++) {
    output_attrs[i].index = i;
    ret = rknn_query(ctx_, RKNN_QUERY_OUTPUT_ATTR, &(output_attrs[i]),
                     sizeof(rknn_tensor_attr));
    if (ret != RKNN_SUCC) {
      KAYLORDUT_LOG_ERROR("output rknn_query failed! error code = {}", ret);
      return -1;
    }
    dump_tensor_attr(&(output_attrs[i]));
    if (i == 2) {
      char *found = strstr(output_attrs[i].name, "angle");
      if (found != NULL) {
        KAYLORDUT_LOG_INFO("this is an OBB model");
        model_type_ = ModelType::OBB;
      }
      found = strstr(output_attrs[i].name, "kpt");
      if (found != NULL) {
        KAYLORDUT_LOG_INFO("this is a POSE model");
        model_type_ = ModelType::POSE;
      }
      found = strstr(output_attrs[i].name, "yolov10");
      if (found != NULL) {
        KAYLORDUT_LOG_INFO("this is a Yolov10 detection model");
        model_type_ = ModelType::V10_DETECTION;
      }
    }
  }
  // Set to context
  app_ctx_.rknn_ctx = ctx_;
  if (output_attrs[0].qnt_type == RKNN_TENSOR_QNT_AFFINE_ASYMMETRIC &&
      output_attrs[0].type == RKNN_TENSOR_INT8) {
    app_ctx_.is_quant = true;
  } else {
    app_ctx_.is_quant = false;
  }
  app_ctx_.io_num = io_num;
  app_ctx_.input_attrs =
      (rknn_tensor_attr *)malloc(io_num.n_input * sizeof(rknn_tensor_attr));
  memcpy(app_ctx_.input_attrs, input_attrs,
         io_num.n_input * sizeof(rknn_tensor_attr));
  app_ctx_.output_attrs =
      (rknn_tensor_attr *)malloc(io_num.n_output * sizeof(rknn_tensor_attr));
  memcpy(app_ctx_.output_attrs, output_attrs,
         io_num.n_output * sizeof(rknn_tensor_attr));

  if (input_attrs[0].fmt == RKNN_TENSOR_NCHW) {
    KAYLORDUT_LOG_INFO("model is NCHW input fmt");
    app_ctx_.model_channel = input_attrs[0].dims[1];
    app_ctx_.model_height = input_attrs[0].dims[2];
    app_ctx_.model_width = input_attrs[0].dims[3];
  } else {
    KAYLORDUT_LOG_INFO("model is NHWC input fmt");
    app_ctx_.model_height = input_attrs[0].dims[1];
    app_ctx_.model_width = input_attrs[0].dims[2];
    app_ctx_.model_channel = input_attrs[0].dims[3];
  }
  KAYLORDUT_LOG_INFO("model input height={}, width={}, channel={}",
                     app_ctx_.model_height, app_ctx_.model_width,
                     app_ctx_.model_channel);
  // initialise the input/output descriptors
  inputs_ = std::make_unique<rknn_input[]>(app_ctx_.io_num.n_input);
  outputs_ = std::make_unique<rknn_output[]>(app_ctx_.io_num.n_output);
  inputs_[0].index = 0;
  inputs_[0].type = RKNN_TENSOR_UINT8;
  inputs_[0].fmt = RKNN_TENSOR_NHWC;
  inputs_[0].size =
      app_ctx_.model_width * app_ctx_.model_height * app_ctx_.model_channel;
  inputs_[0].buf = nullptr;
  return 0;
}

Yolov8::~Yolov8() { DeInit(); }

int Yolov8::DeInit() {
  if (app_ctx_.rknn_ctx != 0) {
    KAYLORDUT_LOG_INFO("rknn_destroy")
    rknn_destroy(app_ctx_.rknn_ctx);
    app_ctx_.rknn_ctx = 0;
  }
  if (app_ctx_.input_attrs != nullptr) {
    KAYLORDUT_LOG_INFO("free input_attrs");
    free(app_ctx_.input_attrs);
  }
  if (app_ctx_.output_attrs != nullptr) {
    KAYLORDUT_LOG_INFO("free output_attrs");
    free(app_ctx_.output_attrs);
  }
  return 0;
}

rknn_context *Yolov8::get_rknn_context() { return &(this->ctx_); }

int Yolov8::Inference(void *image_buf, object_detect_result_list *od_results,
                      letterbox_t letter_box) {
  TimeDuration total_duration;
  inputs_[0].buf = image_buf;
  int ret = rknn_inputs_set(app_ctx_.rknn_ctx, app_ctx_.io_num.n_input,
                            inputs_.get());
  if (ret < 0) {
    KAYLORDUT_LOG_ERROR("rknn_inputs_set failed! error code = {}", ret);
    return -1;
  }
  TimeDuration time_duration;
  ret = rknn_run(app_ctx_.rknn_ctx, nullptr);
  auto duration = std::chrono::duration_cast<std::chrono::milliseconds>(
      time_duration.DurationSinceLastTime());
  if (ret != RKNN_SUCC) {
    KAYLORDUT_LOG_ERROR("rknn_run failed, error code = {}", ret);
    return -1;
  }
  for (int i = 0; i < app_ctx_.io_num.n_output; ++i) {
    outputs_[i].index = i;
    outputs_[i].want_float = (!app_ctx_.is_quant);
  }
  outputs_lock_.lock();
  ret = rknn_outputs_get(app_ctx_.rknn_ctx, app_ctx_.io_num.n_output,
                         outputs_.get(), nullptr);
  if (ret != RKNN_SUCC) {
    KAYLORDUT_LOG_ERROR("rknn_outputs_get failed, error code = {}", ret);
    return -1;
  }
  const float nms_threshold = NMS_THRESH;       // default NMS threshold
  const float box_conf_threshold = BOX_THRESH;  // default confidence threshold
  // Post Process
  // zero the result list before filling it
  memset(od_results, 0, sizeof(object_detect_result_list));
  od_results->model_type = model_type_;
  KAYLORDUT_TIME_COST_INFO(
      "rknn_outputs_post_process",
      if (model_type_ == ModelType::SEGMENT) {
        post_process_seg(&app_ctx_, outputs_.get(), &letter_box,
                         box_conf_threshold, nms_threshold, od_results);
      } else if (model_type_ == ModelType::DETECTION ||
                 model_type_ == ModelType::V10_DETECTION) {
        post_process(&app_ctx_, outputs_.get(), &letter_box, box_conf_threshold,
                     nms_threshold, od_results);
      } else if (model_type_ == ModelType::OBB) {
        post_process_obb(&app_ctx_, outputs_.get(), &letter_box,
                         box_conf_threshold, nms_threshold, od_results);
      } else if (model_type_ == ModelType::POSE) {
        post_process_pose(&app_ctx_, outputs_.get(), &letter_box,
                          box_conf_threshold, nms_threshold, od_results);
      }
      /*else if (model_type_ == ModelType::V10_DETECTION) {
        post_process_v10_detection(&app_ctx_, outputs_.get(), &letter_box,
                                   box_conf_threshold, od_results);
      }*/
  );
  od_results->model_type = model_type_;

  // Remember to release rknn outputs_
  rknn_outputs_release(app_ctx_.rknn_ctx, app_ctx_.io_num.n_output,
                       outputs_.get());
  outputs_lock_.unlock();
  auto total_delta = std::chrono::duration_cast<std::chrono::milliseconds>(
      total_duration.DurationSinceLastTime());
  KAYLORDUT_LOG_DEBUG("Inference time is {}ms and total time is {}ms",
                      duration.count(), total_delta.count());
  return 0;
}

int Yolov8::get_model_width() { return app_ctx_.model_width; }

int Yolov8::get_model_height() { return app_ctx_.model_height; }
--------------------------------------------------------------------------------
/src/yolov8/videofile_demo.cpp:
--------------------------------------------------------------------------------
#include <cctype>  // for checking whether a character is a digit
#include <cerrno>  // for checking strtol and strtod errors

#include "getopt.h"
#include "image_process.h"
#include "kaylordut/log/logger.h"
#include "kaylordut/time/run_once.h"
#include "kaylordut/time/time_duration.h"
#include "rknn_pool.h"
#include "videofile.h"

struct ProgramOptions {
  std::string model_path;
  std::string label_path;
  std::string input_filename;
  int thread_count;
  double framerate;
  bool is_track = false;
};

// check whether a string represents a valid number
bool isNumber(const std::string &str) {
  char *end;
  errno = 0;
  std::strtod(str.c_str(), &end);
  return errno == 0 && *end == '\0' && end != str.c_str();
}

// parse the command-line arguments into a ProgramOptions struct
bool parseCommandLine(int argc, char *argv[], ProgramOptions &options) {
  options.is_track = false;
  static struct option longOpts[] = {
      {"model_path", required_argument, nullptr, 'm'},
      {"label_path", required_argument, nullptr, 'l'},
      {"threads", required_argument, nullptr, 't'},
      {"framerate", required_argument, nullptr, 'f'},
      {"input_filename", required_argument, nullptr, 'i'},
      {"help", no_argument, nullptr, 'h'},
      {"track", no_argument, nullptr, 'T'},
      {nullptr, 0, nullptr, 0}};

  int c, optionIndex = 0;
  while ((c = getopt_long(argc, argv, "m:l:t:f:i:hT", longOpts,
                          &optionIndex)) != -1) {
    switch (c) {
      case 'm':
        options.model_path = optarg;
        break;
      case 'l':
        options.label_path = optarg;
        break;
      case 't':
        if (isNumber(optarg)) {
          options.thread_count = std::atoi(optarg);
          if (options.thread_count <= 0) {
            KAYLORDUT_LOG_ERROR("Invalid number of threads: {}", optarg);
            return false;
          }
        } else {
          KAYLORDUT_LOG_ERROR("Thread count must be a number: {}", optarg);
          return false;
        }
        break;
      case 'f':
        if (isNumber(optarg)) {
          options.framerate = std::atof(optarg);
          if (options.framerate <= 0) {
            KAYLORDUT_LOG_ERROR("Invalid frame rate: {}", optarg);
            return false;
          }
        } else {
          KAYLORDUT_LOG_ERROR("Frame rate must be a number: {}", optarg);
          return false;
        }
        break;
      case 'i':
        options.input_filename = optarg;
        break;
      case 'h':
        std::cout << "Usage: " << argv[0]
                  << " [--model_path|-m model_path] [--input_filename|-i "
                     "input_filename] "
                     "[--threads|-t thread_count] [--framerate|-f framerate] "
                     "[--label_path|-l label_path]\n";
        exit(EXIT_SUCCESS);
      case 'T':
        options.is_track = true;
        break;
      case '?':
        // the error message is printed by getopt_long itself
        return false;
      default:
        std::cout << "Usage: " << argv[0]
                  << " [--model_path|-m model_path] [--input_filename|-i "
                     "input_filename] "
                     "[--threads|-t thread_count] [--framerate|-f framerate] "
                     "[--label_path|-l label_path]\n";
        abort();
    }
  }

  return true;
}

int main(int argc, char *argv[]) {
  KAYLORDUT_LOG_INFO("Yolov8 demo for rk3588");
  ProgramOptions options = {"", "", "", 0, 0.0};
  if (!parseCommandLine(argc, argv, options)) {
    KAYLORDUT_LOG_ERROR("Parse command failed.");
    return 1;
  }
  if (options.framerate == 0.0 || options.thread_count == 0 ||
      options.label_path.empty() || options.input_filename.empty() ||
      options.model_path.empty()) {
    KAYLORDUT_LOG_ERROR("Missing required options. Use --help for help.");
    return 1;
  }
  auto rknn_pool = std::make_unique<RknnPool>(
      options.model_path, options.thread_count, options.label_path);
  VideoFile video_file(options.input_filename.c_str());
  int delay = 1000 / options.framerate;
  ImageProcess image_process(video_file.get_frame_width(),
                             video_file.get_frame_height(), 640,
                             options.is_track, options.framerate);
  std::unique_ptr<cv::Mat> image;
  std::shared_ptr<cv::Mat> image_res;
  uint8_t running_flag = 0;
  cv::namedWindow("Video", cv::WINDOW_AUTOSIZE);
  static int image_count = 0;
  static int image_res_count = 0;
  TimeDuration time_duration;
  do {
    auto func = [&] {
      running_flag = 0;
      image = video_file.GetNextFrame();
      if (image != nullptr) {
        rknn_pool->AddInferenceTask(std::move(image), image_process);
        running_flag |= 0x01;
        image_count++;
      }
      image_res = rknn_pool->GetImageResultFromQueue();
      if (image_res != nullptr) {
        cv::imshow("Video", *image_res);
        image_res_count++;
        KAYLORDUT_LOG_INFO("image count = {}, image res count = {}, delta = {}",
                           image_count, image_res_count,
                           image_count - image_res_count);
        cv::waitKey(1);
        running_flag |= 0x10;
      }
    };
    run_once_with_delay(func, std::chrono::milliseconds(delay));
  }
  // keep looping while a frame was read or a result was fetched, the pool
  // still has pending tasks, or not every input frame has produced a result
  while (running_flag || rknn_pool->GetTasksSize() ||
         image_count > image_res_count);
  auto time = std::chrono::duration_cast<std::chrono::milliseconds>(
      time_duration.DurationSinceLastTime());
  double fps = image_res_count * 1000.0 / time.count();
  KAYLORDUT_LOG_INFO("Total time is {}ms, and average frame rate is {}fps",
                     time.count(), fps);
  rknn_pool.reset();
  KAYLORDUT_LOG_INFO("exit loop");
  cv::destroyAllWindows();
  return 0;
}
--------------------------------------------------------------------------------
/yolov8n-ros2.md:
--------------------------------------------------------------------------------
# The simplest way to test yolo with ROS2

# Install package
```bash
cat << 'EOF' | sudo tee /etc/apt/sources.list.d/kaylordut.list
deb [signed-by=/etc/apt/keyrings/kaylor-keyring.gpg] http://apt.kaylordut.cn/kaylordut/ kaylordut main
EOF
sudo mkdir /etc/apt/keyrings -pv
sudo wget -O /etc/apt/keyrings/kaylor-keyring.gpg http://apt.kaylordut.cn/kaylor-keyring.gpg
sudo apt update
sudo apt install ai-framework ros-humble-yolo ros-humble-kaylordut-usb-cam
```

# Run

```bash
ros2 launch kaylordut_usb_cam test.launch.py
ros2 launch yolo yolo.launch.py
```
--------------------------------------------------------------------------------