├── .gitignore
├── README.md
├── README_EN.md
├── bdd100k.names
├── export_onnx.py
├── images
│   ├── 0ace96c3-48481887.jpg
│   ├── 3c0e7240-96e390d2.jpg
│   ├── 7dd9ef45-f197db95.jpg
│   ├── 8e1c1ab0-a8b92173.jpg
│   ├── 9aa94005-ff1d4c9a.jpg
│   └── adb4871d-4d063244.jpg
├── main.cpp
├── main.py
└── yolop.onnx
/.gitignore:
--------------------------------------------------------------------------------
1 | .idea
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # YOLOP-opencv-dnn
2 | An OpenCV deployment of the panoptic driving perception network YOLOP, which handles three visual perception tasks at once: traffic object detection, drivable area segmentation, and lane line detection. As usual, both C++ and Python implementations are provided.
3 |
4 | Download the onnx file from Baidu Netdisk: https://pan.baidu.com/s/1A_9cldUHeY9GUle_HO4Crg
5 | Extraction code: mf1x
6 |
7 | The C++ entry point is main.cpp and the Python entry point is main.py. After downloading the onnx file into the directory containing these files, you can run the programs. The images folder
8 | contains several test images from the bdd100k autonomous driving dataset.
9 |
10 | This project is an OpenCV inference deployment built on top of the recently released https://github.com/hustvl/YOLOP
11 | from the vision team at Huazhong University of Science and Technology. It depends only on the OpenCV library,
12 | completely removing any dependency on deep learning frameworks. If the program fails to run, the most likely cause is that your installed OpenCV version is too old; upgrading OpenCV should fix it.
13 |
14 | This project also includes export_onnx.py, the script that generates the onnx file. Note that export_onnx.py cannot be run from inside this project's directory.
15 | If you want to see how the .onnx file is generated, copy export_onnx.py into the root directory of https://github.com/hustvl/YOLOP
16 | and modify the code in lib/models/common.py;
17 | running export_onnx.py then produces the onnx file. For the exact changes to make in lib/models/common.py, see my CSDN blog post
18 | https://blog.csdn.net/nihate/article/details/112731327
19 |
--------------------------------------------------------------------------------
/README_EN.md:
--------------------------------------------------------------------------------
1 | # YOLOP-opencv-dnn
2 | This repository contains an OpenCV version of YOLOP, a panoptic driving perception network that simultaneously handles traffic object detection, drivable area segmentation, and lane line detection.
3 |
4 | An onnx file created from the provided YOLOP weights is included in the repository.
5 |
6 | The repository contains a C++ version (main.cpp), a Python version (main.py), an onnx file created from the provided YOLOP weights, and an images folder with several test images from the bdd100k autonomous driving dataset.
7 |
8 | This program is an opencv inference deployment program based on the recently released [project YOLOP](https://github.com/hustvl/YOLOP) by the vision team of Huazhong University of Science and Technology.
9 | It can be run using only the opencv library, completely removing the dependency on any deep learning framework.
10 |
11 | This program has been tested with opencv 4.5.3. It does not work with opencv 4.2.0 and below.
12 |
13 | ## Export your own onnx file
14 | This repository includes a file export_onnx.py, which is the program that generates the onnx file. If you want to know how to generate the .onnx file, copy the export_onnx.py file to the home directory of [YOLOP](https://github.com/hustvl/YOLOP).
15 | You will also need to modify the code in YOLOP/lib/models/common.py as follows:
16 | ~~~python
17 | class Contract(nn.Module):
18 | # Contract width-height into channels, i.e. 
x(1,64,80,80) to x(1,256,40,40) 19 | def __init__(self, gain = 2): 20 | super().__init__() 21 | self.gain = gain 22 | 23 | def forward(self, x): 24 | N, C, H, W = x.size() # assert (H / s == 0) and (W / s == 0), 'Indivisible gain' 25 | s = self.gain 26 | x = x.view(N, C, H // s, s, W // s, s) # x(1,64,40,2,40,2) 27 | x = x.permute(0, 3, 5, 1, 2, 4).contiguous() # x(1,2,2,64,40,40) 28 | return x.view(N, C * s * s, H // s, W // s) # x(1,256,40,40) 29 | 30 | 31 | class Focus(nn.Module): 32 | # Focus wh information into c-space 33 | # slice concat conv 34 | def __init__(self, c1, c2, k=1, s=1, p=None, g=1, act=True): # ch_in, ch_out, kernel, stride, padding, groups 35 | super(Focus, self).__init__() 36 | self.conv = Conv(c1 * 4, c2, k, s, p, g, act) 37 | self.contract = Contract(gain=2) 38 | 39 | def forward(self, x): # x(b,c,w,h) -> y(b,4c,w/2,h/2) 40 | # return self.conv(torch.cat([x[..., ::2, ::2], x[..., 1::2, ::2], x[..., ::2, 1::2], x[..., 1::2, 1::2]], 1)) 41 | return self.conv(self.contract(x)) 42 | ~~~ 43 | We are adding a Contract class and we have modified the content of the Focus class. 44 | We also need to modify the content of the method forward from the Detect class as follow : 45 | ~~~python 46 | def forward(self, x): 47 | if not torch.onnx.is_in_onnx_export(): 48 | z = [] # inference output 49 | for i in range(self.nl): 50 | x[i] = self.m[i](x[i]) # conv 51 | # print(str(i)+str(x[i].shape)) 52 | bs, _, ny, nx = x[i].shape # x(bs,255,w,w) to x(bs,3,w,w,85) 53 | x[i] = x[i].view(bs, self.na, self.no, ny, nx).permute(0, 1, 3, 4, 2).contiguous() 54 | # print(str(i)+str(x[i].shape)) 55 | 56 | if not self.training: # inference 57 | if self.grid[i].shape[2:4] != x[i].shape[2:4]: 58 | self.grid[i] = self._make_grid(nx, ny).to(x[i].device) 59 | y = x[i].sigmoid() 60 | # print("**") 61 | # print(y.shape) #[1, 3, w, h, 85] 62 | # print(self.grid[i].shape) #[1, 3, w, h, 2] 63 | y[..., 0:2] = (y[..., 0:2] * 2. - 0.5 + self.grid[i].to(x[i].device)) * self.stride[i] # xy 64 | y[..., 2:4] = (y[..., 2:4] * 2) ** 2 * self.anchor_grid[i] # wh 65 | """print("**") 66 | print(y.shape) # [1, 3, w, h, 85] 67 | print(y.view(bs, -1, self.no).shape) # [1, 3*w*h, 85]""" 68 | z.append(y.view(bs, -1, self.no)) 69 | return x if self.training else (torch.cat(z, 1), x) 70 | 71 | else: 72 | for i in range(self.nl): 73 | x[i] = self.m[i](x[i]) # conv 74 | # print(str(i)+str(x[i].shape)) 75 | bs, _, ny, nx = x[i].shape # x(bs,255,w,w) to x(bs,3,w,w,85) 76 | x[i] = x[i].view(bs, self.na, self.no, ny, nx).permute(0, 1, 3, 4, 2).contiguous() 77 | x[i] = torch.sigmoid(x[i]) 78 | x[i] = x[i].view(-1, self.no) 79 | return torch.cat(x, dim=0) 80 | ~~~ 81 | After these steps, you can run export_onnx.py to generate the onnx file. 
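Once the file has been exported, a quick way to confirm that OpenCV can read it is to load it with cv2.dnn and run a dummy forward pass. The following is only a minimal sanity-check sketch (it is not part of export_onnx.py); it assumes the exported file is named yolop.onnx and uses the 1x3x640x640 input shape defined above:
~~~python
import cv2
import numpy as np

# Load the exported model with OpenCV's dnn module.
net = cv2.dnn.readNet('yolop.onnx')

# Dummy input matching the export shape: batch 1, 3 channels, 640x640.
blob = np.zeros((1, 3, 640, 640), dtype=np.float32)
net.setInput(blob)

# Forward through all output layers (the export names them det_out, drive_area_seg, lane_line_seg).
outs = net.forward(net.getUnconnectedOutLayersNames())
for name, out in zip(net.getUnconnectedOutLayersNames(), outs):
    print(name, out.shape)
~~~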
82 | These steps have been extracted from the following Chinese csdn blog post : https://blog.csdn.net/nihate/article/details/112731327 83 | 84 | -------------------------------------------------------------------------------- /bdd100k.names: -------------------------------------------------------------------------------- 1 | car -------------------------------------------------------------------------------- /export_onnx.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | from lib.models.common import Conv, SPP, Bottleneck, BottleneckCSP, Focus, Concat, Detect, SharpenConv 4 | from torch.nn import Upsample 5 | import cv2 6 | 7 | # The lane line and the driving area segment branches without share information with each other and without link 8 | YOLOP = [ 9 | [24, 33, 42], # Det_out_idx, Da_Segout_idx, LL_Segout_idx 10 | [-1, Focus, [3, 32, 3]], # 0 11 | [-1, Conv, [32, 64, 3, 2]], # 1 12 | [-1, BottleneckCSP, [64, 64, 1]], # 2 13 | [-1, Conv, [64, 128, 3, 2]], # 3 14 | [-1, BottleneckCSP, [128, 128, 3]], # 4 15 | [-1, Conv, [128, 256, 3, 2]], # 5 16 | [-1, BottleneckCSP, [256, 256, 3]], # 6 17 | [-1, Conv, [256, 512, 3, 2]], # 7 18 | [-1, SPP, [512, 512, [5, 9, 13]]], # 8 19 | [-1, BottleneckCSP, [512, 512, 1, False]], # 9 20 | [-1, Conv, [512, 256, 1, 1]], # 10 21 | [-1, Upsample, [None, 2, 'nearest']], # 11 22 | [[-1, 6], Concat, [1]], # 12 23 | [-1, BottleneckCSP, [512, 256, 1, False]], # 13 24 | [-1, Conv, [256, 128, 1, 1]], # 14 25 | [-1, Upsample, [None, 2, 'nearest']], # 15 26 | [[-1, 4], Concat, [1]], # 16 #Encoder 27 | 28 | [-1, BottleneckCSP, [256, 128, 1, False]], # 17 29 | [-1, Conv, [128, 128, 3, 2]], # 18 30 | [[-1, 14], Concat, [1]], # 19 31 | [-1, BottleneckCSP, [256, 256, 1, False]], # 20 32 | [-1, Conv, [256, 256, 3, 2]], # 21 33 | [[-1, 10], Concat, [1]], # 22 34 | [-1, BottleneckCSP, [512, 512, 1, False]], # 23 35 | [[17, 20, 23], Detect, 36 | [1, [[3, 9, 5, 11, 4, 20], [7, 18, 6, 39, 12, 31], [19, 50, 38, 81, 68, 157]], [128, 256, 512]]], 37 | # Detection head 24 38 | 39 | [16, Conv, [256, 128, 3, 1]], # 25 40 | [-1, Upsample, [None, 2, 'nearest']], # 26 41 | [-1, BottleneckCSP, [128, 64, 1, False]], # 27 42 | [-1, Conv, [64, 32, 3, 1]], # 28 43 | [-1, Upsample, [None, 2, 'nearest']], # 29 44 | [-1, Conv, [32, 16, 3, 1]], # 30 45 | [-1, BottleneckCSP, [16, 8, 1, False]], # 31 46 | [-1, Upsample, [None, 2, 'nearest']], # 32 47 | [-1, Conv, [8, 2, 3, 1]], # 33 Driving area segmentation head 48 | 49 | [16, Conv, [256, 128, 3, 1]], # 34 50 | [-1, Upsample, [None, 2, 'nearest']], # 35 51 | [-1, BottleneckCSP, [128, 64, 1, False]], # 36 52 | [-1, Conv, [64, 32, 3, 1]], # 37 53 | [-1, Upsample, [None, 2, 'nearest']], # 38 54 | [-1, Conv, [32, 16, 3, 1]], # 39 55 | [-1, BottleneckCSP, [16, 8, 1, False]], # 40 56 | [-1, Upsample, [None, 2, 'nearest']], # 41 57 | [-1, Conv, [8, 2, 3, 1]] # 42 Lane line segmentation head 58 | ] 59 | 60 | class MCnet(nn.Module): 61 | def __init__(self, block_cfg): 62 | super(MCnet, self).__init__() 63 | layers, save = [], [] 64 | self.nc = 1 65 | self.detector_index = -1 66 | self.det_out_idx = block_cfg[0][0] 67 | self.seg_out_idx = block_cfg[0][1:] 68 | self.num_anchors = 3 69 | self.num_outchannel = 5 + self.nc 70 | # Build model 71 | for i, (from_, block, args) in enumerate(block_cfg[1:]): 72 | block = eval(block) if isinstance(block, str) else block # eval strings 73 | if block is Detect: 74 | self.detector_index = i 75 | block_ = block(*args) 76 | block_.index, 
block_.from_ = i, from_ 77 | layers.append(block_) 78 | save.extend(x % i for x in ([from_] if isinstance(from_, int) else from_) if x != -1) # append to savelist 79 | assert self.detector_index == block_cfg[0][0] 80 | 81 | self.model, self.save = nn.Sequential(*layers), sorted(save) 82 | self.names = [str(i) for i in range(self.nc)] 83 | 84 | # set stride、anchor for detector 85 | # Detector = self.model[self.detector_index] # detector 86 | # if isinstance(Detector, Detect): 87 | # s = 128 # 2x min stride 88 | # # for x in self.forward(torch.zeros(1, 3, s, s)): 89 | # # print (x.shape) 90 | # with torch.no_grad(): 91 | # model_out = self.forward(torch.zeros(1, 3, s, s)) 92 | # detects, _, _ = model_out 93 | # Detector.stride = torch.tensor([s / x.shape[-2] for x in detects]) # forward 94 | # # print("stride"+str(Detector.stride )) 95 | # Detector.anchors /= Detector.stride.view(-1, 1, 1) # Set the anchors for the corresponding scale 96 | # check_anchor_order(Detector) 97 | # self.stride = Detector.stride 98 | def forward(self, x): 99 | cache = [] 100 | out = [] 101 | det_out = None 102 | for i, block in enumerate(self.model): 103 | if block.from_ != -1: 104 | x = cache[block.from_] if isinstance(block.from_, int) else [x if j == -1 else cache[j] for j in block.from_] # calculate concat detect 105 | x = block(x) 106 | if i in self.seg_out_idx: # save driving area segment result 107 | # m = nn.Sigmoid() 108 | # out.append(m(x)) 109 | out.append(torch.sigmoid(x)) 110 | if i == self.detector_index: 111 | det_out = x 112 | cache.append(x if block.index in self.save else None) 113 | out[0] = out[0].view(2, 640, 640) 114 | out[1] = out[1].view(2, 640, 640) 115 | return det_out, out[0], out[1] 116 | 117 | if __name__ == "__main__": 118 | device = 'cuda' if torch.cuda.is_available() else 'cpu' 119 | model = MCnet(YOLOP) 120 | checkpoint = torch.load('weights/End-to-end.pth', map_location=device) 121 | model.load_state_dict(checkpoint['state_dict']) 122 | model.eval() 123 | output_onnx = 'yolop.onnx' 124 | inputs = torch.randn(1, 3, 640, 640) 125 | # with torch.no_grad(): 126 | # output = model(inputs) 127 | # print(output) 128 | 129 | torch.onnx.export(model, inputs, output_onnx, verbose=False, opset_version=12, input_names=['images'], output_names=['det_out', 'drive_area_seg', 'lane_line_seg']) 130 | print('convert', output_onnx, 'to onnx finish!!!') 131 | 132 | try: 133 | dnnnet = cv2.dnn.readNet(output_onnx) 134 | print('read sucess') 135 | except cv2.error as err: 136 | print('Your Opencv version : {} may be incompatible, please consider upgrading'.format(cv2.__version__)) 137 | print('Read failed : ', err) -------------------------------------------------------------------------------- /images/0ace96c3-48481887.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/hpc203/YOLOP-opencv-dnn/724b2aef804845529a85f4589142232c4d9ea3a4/images/0ace96c3-48481887.jpg -------------------------------------------------------------------------------- /images/3c0e7240-96e390d2.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/hpc203/YOLOP-opencv-dnn/724b2aef804845529a85f4589142232c4d9ea3a4/images/3c0e7240-96e390d2.jpg -------------------------------------------------------------------------------- /images/7dd9ef45-f197db95.jpg: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/hpc203/YOLOP-opencv-dnn/724b2aef804845529a85f4589142232c4d9ea3a4/images/7dd9ef45-f197db95.jpg
--------------------------------------------------------------------------------
/images/8e1c1ab0-a8b92173.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/hpc203/YOLOP-opencv-dnn/724b2aef804845529a85f4589142232c4d9ea3a4/images/8e1c1ab0-a8b92173.jpg
--------------------------------------------------------------------------------
/images/9aa94005-ff1d4c9a.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/hpc203/YOLOP-opencv-dnn/724b2aef804845529a85f4589142232c4d9ea3a4/images/9aa94005-ff1d4c9a.jpg
--------------------------------------------------------------------------------
/images/adb4871d-4d063244.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/hpc203/YOLOP-opencv-dnn/724b2aef804845529a85f4589142232c4d9ea3a4/images/adb4871d-4d063244.jpg
--------------------------------------------------------------------------------
/main.cpp:
--------------------------------------------------------------------------------
1 | #include <fstream>
2 | #include <sstream>
3 | #include <iostream>
4 | #include <opencv2/imgproc.hpp>
5 | #include <opencv2/highgui.hpp>
6 | #include <opencv2/dnn.hpp>
7 |
8 | using namespace cv;
9 | using namespace dnn;
10 | using namespace std;
11 |
12 | class YOLO
13 | {
14 | public:
15 | YOLO(string modelpath, float confThreshold, float nmsThreshold, float objThreshold);
16 | Mat detect(Mat& frame);
17 | private:
18 | const float mean[3] = { 0.485, 0.456, 0.406 };
19 | const float std[3] = { 0.229, 0.224, 0.225 };
20 | const float anchors[3][6] = { {3,9,5,11,4,20}, {7,18,6,39,12,31},{19,50,38,81,68,157} };
21 | const float stride[3] = { 8.0, 16.0, 32.0 };
22 | const string classesFile = "bdd100k.names";
23 | const int inpWidth = 640;
24 | const int inpHeight = 640;
25 | float confThreshold;
26 | float nmsThreshold;
27 | float objThreshold;
28 | const bool keep_ratio = true;
29 | vector<string> classes;
30 | Net net;
31 | Mat resize_image(Mat srcimg, int* newh, int* neww, int* top, int* left);
32 | void normalize(Mat& srcimg);
33 | void drawPred(int classId, float conf, int left, int top, int right, int bottom, Mat& frame);
34 | };
35 |
36 | YOLO::YOLO(string modelpath, float confThreshold, float nmsThreshold, float objThreshold)
37 | {
38 | this->confThreshold = confThreshold;
39 | this->nmsThreshold = nmsThreshold;
40 | this->objThreshold = objThreshold;
41 |
42 | ifstream ifs(this->classesFile.c_str());
43 | string line;
44 | while (getline(ifs, line)) this->classes.push_back(line);
45 | this->net = readNet(modelpath);
46 | }
47 |
48 | Mat YOLO::resize_image(Mat srcimg, int* newh, int* neww, int* top, int* left)
49 | {
50 | int srch = srcimg.rows, srcw = srcimg.cols;
51 | *newh = this->inpHeight;
52 | *neww = this->inpWidth;
53 | Mat dstimg;
54 | if (this->keep_ratio && srch != srcw)
55 | {
56 | float hw_scale = (float)srch / srcw;
57 | if (hw_scale > 1)
58 | {
59 | *newh = this->inpHeight;
60 | *neww = int(this->inpWidth / hw_scale);
61 | resize(srcimg, dstimg, Size(*neww, *newh), INTER_AREA);
62 | *left = int((this->inpWidth - *neww) * 0.5);
63 | copyMakeBorder(dstimg, dstimg, 0, 0, *left, this->inpWidth - *neww - *left, BORDER_CONSTANT, 0);
64 | }
65 | else
66 | {
67 | *newh = (int)this->inpHeight * hw_scale;
68 | *neww = this->inpWidth;
69 | resize(srcimg, dstimg, Size(*neww, *newh), INTER_AREA);
70 | *top = (int)(this->inpHeight - 
*newh) * 0.5; 71 | copyMakeBorder(dstimg, dstimg, *top, this->inpHeight - *newh - *top, 0, 0, BORDER_CONSTANT, 0); 72 | } 73 | } 74 | else 75 | { 76 | resize(srcimg, dstimg, Size(*neww, *newh), INTER_AREA); 77 | } 78 | return dstimg; 79 | } 80 | 81 | void YOLO::normalize(Mat& img) 82 | { 83 | img.convertTo(img, CV_32F); 84 | int i = 0, j = 0; 85 | const float scale = 1.0 / 255.0; 86 | for (i = 0; i < img.rows; i++) 87 | { 88 | float* pdata = (float*)(img.data + i * img.step); 89 | for (j = 0; j < img.cols; j++) 90 | { 91 | pdata[0] = (pdata[0] * scale - this->mean[0]) / this->std[0]; 92 | pdata[1] = (pdata[1] * scale - this->mean[1]) / this->std[1]; 93 | pdata[2] = (pdata[2] * scale - this->mean[2]) / this->std[2]; 94 | pdata += 3; 95 | } 96 | } 97 | } 98 | 99 | void YOLO::drawPred(int classId, float conf, int left, int top, int right, int bottom, Mat& frame) // Draw the predicted bounding box 100 | { 101 | //Draw a rectangle displaying the bounding box 102 | rectangle(frame, Point(left, top), Point(right, bottom), Scalar(0, 0, 255), 2); 103 | 104 | //Get the label for the class name and its confidence 105 | string label = format("%.2f", conf); 106 | label = this->classes[classId] + ":" + label; 107 | 108 | //Display the label at the top of the bounding box 109 | int baseLine; 110 | Size labelSize = getTextSize(label, FONT_HERSHEY_SIMPLEX, 0.5, 1, &baseLine); 111 | top = max(top, labelSize.height); 112 | //rectangle(frame, Point(left, top - int(1.5 * labelSize.height)), Point(left + int(1.5 * labelSize.width), top + baseLine), Scalar(0, 255, 0), FILLED); 113 | putText(frame, label, Point(left, top), FONT_HERSHEY_SIMPLEX, 1, Scalar(0, 255, 0), 1); 114 | } 115 | 116 | Mat YOLO::detect(Mat& srcimg) 117 | { 118 | int newh = 0, neww = 0, padh = 0, padw = 0; 119 | Mat dstimg = this->resize_image(srcimg, &newh, &neww, &padh, &padw); 120 | this->normalize(dstimg); 121 | Mat blob = blobFromImage(dstimg); 122 | this->net.setInput(blob); 123 | vector outs; 124 | this->net.forward(outs, this->net.getUnconnectedOutLayersNames()); 125 | 126 | Mat outimg = srcimg.clone(); 127 | float ratioh = (float)newh / srcimg.rows; 128 | float ratiow = (float)neww / srcimg.cols; 129 | int i = 0, j = 0, area = this->inpHeight*this->inpWidth; 130 | float* pdata_drive = (float*)outs[1].data; ///drive area segment 131 | float* pdata_lane_line = (float*)outs[2].data; ///lane line segment 132 | for (i = 0; i < outimg.rows; i++) 133 | { 134 | for (j = 0; j < outimg.cols; j++) 135 | { 136 | const int x = int(j*ratiow) + padw; 137 | const int y = int(i*ratioh) + padh; 138 | if (pdata_drive[y * this->inpWidth + x] < pdata_drive[area + y * this->inpWidth + x]) 139 | { 140 | outimg.at(i, j)[0] = 0; 141 | outimg.at(i, j)[1] = 255; 142 | outimg.at(i, j)[2] = 0; 143 | } 144 | if (pdata_lane_line[y * this->inpWidth + x] < pdata_lane_line[area + y * this->inpWidth + x]) 145 | { 146 | outimg.at(i, j)[0] = 255; 147 | outimg.at(i, j)[1] = 0; 148 | outimg.at(i, j)[2] = 0; 149 | } 150 | } 151 | } 152 | /////generate proposals 153 | vector classIds; 154 | vector confidences; 155 | vector boxes; 156 | ratioh = (float)srcimg.rows / newh; 157 | ratiow = (float)srcimg.cols / neww; 158 | int n = 0, q = 0, nout = this->classes.size() + 5, row_ind = 0; 159 | float* pdata = (float*)outs[0].data; 160 | for (n = 0; n < 3; n++) ///�߶� 161 | { 162 | int num_grid_x = (int)(this->inpWidth / this->stride[n]); 163 | int num_grid_y = (int)(this->inpHeight / this->stride[n]); 164 | for (q = 0; q < 3; q++) ///anchor�� 165 | { 166 | const float anchor_w = 
this->anchors[n][q * 2]; 167 | const float anchor_h = this->anchors[n][q * 2 + 1]; 168 | for (i = 0; i < num_grid_y; i++) 169 | { 170 | for (j = 0; j < num_grid_x; j++) 171 | { 172 | const float box_score = pdata[4]; 173 | if (box_score > this->objThreshold) 174 | { 175 | Mat scores = outs[0].row(row_ind).colRange(5, outs[0].cols); 176 | Point classIdPoint; 177 | double max_class_socre; 178 | // Get the value and location of the maximum score 179 | minMaxLoc(scores, 0, &max_class_socre, 0, &classIdPoint); 180 | if (max_class_socre > this->confThreshold) 181 | { 182 | float cx = (pdata[0] * 2.f - 0.5f + j) * this->stride[n]; ///cx 183 | float cy = (pdata[1] * 2.f - 0.5f + i) * this->stride[n]; ///cy 184 | float w = powf(pdata[2] * 2.f, 2.f) * anchor_w; ///w 185 | float h = powf(pdata[3] * 2.f, 2.f) * anchor_h; ///h 186 | 187 | int left = (cx - 0.5*w - padw)*ratiow; 188 | int top = (cy - 0.5*h - padh)*ratioh; 189 | 190 | classIds.push_back(classIdPoint.x); 191 | confidences.push_back(max_class_socre * box_score); 192 | boxes.push_back(Rect(left, top, (int)(w*ratiow), (int)(h*ratioh))); 193 | } 194 | } 195 | row_ind++; 196 | pdata += nout; 197 | } 198 | } 199 | } 200 | } 201 | 202 | // Perform non maximum suppression to eliminate redundant overlapping boxes with 203 | // lower confidences 204 | vector indices; 205 | NMSBoxes(boxes, confidences, this->confThreshold, this->nmsThreshold, indices); 206 | for (size_t i = 0; i < indices.size(); ++i) 207 | { 208 | int idx = indices[i]; 209 | Rect box = boxes[idx]; 210 | this->drawPred(classIds[idx], confidences[idx], box.x, box.y, 211 | box.x + box.width, box.y + box.height, outimg); 212 | } 213 | return outimg; 214 | } 215 | 216 | int main() 217 | { 218 | YOLO yolo_model("yolop.onnx", 0.25, 0.45, 0.5); 219 | string imgpath = "images/0ace96c3-48481887.jpg"; 220 | Mat srcimg = imread(imgpath); 221 | Mat outimg = yolo_model.detect(srcimg); 222 | 223 | static const string kWinName = "Deep learning object detection in OpenCV"; 224 | namedWindow(kWinName, WINDOW_NORMAL); 225 | imshow(kWinName, outimg); 226 | waitKey(0); 227 | destroyAllWindows(); 228 | } -------------------------------------------------------------------------------- /main.py: -------------------------------------------------------------------------------- 1 | import cv2 2 | import argparse 3 | import numpy as np 4 | 5 | class yolop(): 6 | def __init__(self, confThreshold=0.25, nmsThreshold=0.5, objThreshold=0.45): 7 | with open('bdd100k.names', 'rt') as f: 8 | self.classes = f.read().rstrip('\n').split('\n') ###这个是在bdd100k数据集上训练的模型做opencv部署的,如果你在自己的数据集上训练出的模型做opencv部署,那么需要修改self.classes 9 | num_classes = len(self.classes) 10 | anchors = [[3,9,5,11,4,20], [7,18,6,39,12,31], [19,50,38,81,68,157]] 11 | self.nl = len(anchors) 12 | self.na = len(anchors[0]) // 2 13 | self.no = num_classes + 5 14 | self.stride = np.array([8., 16., 32.]) 15 | self.anchor_grid = np.asarray(anchors, dtype=np.float32).reshape(self.nl, -1, 2) 16 | self.inpWidth = 640 17 | self.inpHeight = 640 18 | self.generate_grid() 19 | self.net = cv2.dnn.readNet('yolop.onnx') 20 | self.confThreshold = confThreshold 21 | self.nmsThreshold = nmsThreshold 22 | self.objThreshold = objThreshold 23 | self.mean = np.array([0.485, 0.456, 0.406], dtype=np.float32).reshape(1, 1, 3) 24 | self.std = np.array([0.229, 0.224, 0.225], dtype=np.float32).reshape(1, 1, 3) 25 | self.keep_ratio = True 26 | def generate_grid(self): 27 | self.grid = [np.zeros(1)] * self.nl 28 | self.length = [] 29 | self.areas = [] 30 | for i in range(self.nl): 31 | 
h, w = int(self.inpHeight/self.stride[i]), int(self.inpWidth/self.stride[i]) 32 | self.length.append(int(self.na * h * w)) 33 | self.areas.append(h*w) 34 | if self.grid[i].shape[2:4] != (h,w): 35 | self.grid[i] = self._make_grid(w, h) 36 | def _make_grid(self, nx=20, ny=20): 37 | xv, yv = np.meshgrid(np.arange(ny), np.arange(nx)) 38 | return np.stack((xv, yv), 2).reshape((-1, 2)).astype(np.float32) 39 | 40 | def postprocess(self, frame, outs, newh, neww, padh, padw): 41 | frameHeight = frame.shape[0] 42 | frameWidth = frame.shape[1] 43 | ratioh, ratiow = frameHeight / newh, frameWidth / neww 44 | # Scan through all the bounding boxes output from the network and keep only the 45 | # ones with high confidence scores. Assign the box's class label as the class with the highest score. 46 | classIds = [] 47 | confidences = [] 48 | boxes = [] 49 | for detection in outs: 50 | scores = detection[5:] 51 | classId = np.argmax(scores) 52 | confidence = scores[classId] 53 | if confidence > self.confThreshold and detection[4] > self.objThreshold: 54 | center_x = int((detection[0]-padw) * ratiow) 55 | center_y = int((detection[1]-padh) * ratioh) 56 | width = int(detection[2] * ratiow) 57 | height = int(detection[3] * ratioh) 58 | left = int(center_x - width / 2) 59 | top = int(center_y - height / 2) 60 | classIds.append(classId) 61 | confidences.append(float(confidence) * detection[4]) 62 | boxes.append([left, top, width, height]) 63 | 64 | # Perform non maximum suppression to eliminate redundant overlapping boxes with 65 | # lower confidences. 66 | indices = cv2.dnn.NMSBoxes(boxes, confidences, self.confThreshold, self.nmsThreshold) 67 | for i in indices: 68 | i = i[0] 69 | box = boxes[i] 70 | left = box[0] 71 | top = box[1] 72 | width = box[2] 73 | height = box[3] 74 | frame = self.drawPred(frame, classIds[i], confidences[i], left, top, left + width, top + height) 75 | return frame 76 | def drawPred(self, frame, classId, conf, left, top, right, bottom): 77 | # Draw a bounding box. 
78 | cv2.rectangle(frame, (left, top), (right, bottom), (0, 0, 255), thickness=2) 79 | 80 | label = '%.2f' % conf 81 | label = '%s:%s' % (self.classes[classId], label) 82 | 83 | # Display the label at the top of the bounding box 84 | labelSize, baseLine = cv2.getTextSize(label, cv2.FONT_HERSHEY_SIMPLEX, 0.5, 1) 85 | top = max(top, labelSize[1]) 86 | # cv.rectangle(frame, (left, top - round(1.5 * labelSize[1])), (left + round(1.5 * labelSize[0]), top + baseLine), (255,255,255), cv.FILLED) 87 | cv2.putText(frame, label, (left, top - 10), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), thickness=1) 88 | return frame 89 | def resize_image(self, srcimg): 90 | padh, padw, newh, neww = 0, 0, self.inpHeight, self.inpWidth 91 | if self.keep_ratio and srcimg.shape[0] != srcimg.shape[1]: 92 | hw_scale = srcimg.shape[0] / srcimg.shape[1] 93 | if hw_scale > 1: 94 | newh, neww = self.inpHeight, int(self.inpWidth / hw_scale) 95 | img = cv2.resize(srcimg, (neww, newh), interpolation=cv2.INTER_AREA) 96 | padw = int((self.inpWidth - neww) * 0.5) 97 | img = cv2.copyMakeBorder(img, 0, 0, padw, self.inpWidth - neww - padw, cv2.BORDER_CONSTANT, 98 | value=0) # add border 99 | else: 100 | newh, neww = int(self.inpHeight * hw_scale), self.inpWidth 101 | img = cv2.resize(srcimg, (neww, newh), interpolation=cv2.INTER_AREA) 102 | padh = int((self.inpHeight - newh) * 0.5) 103 | img = cv2.copyMakeBorder(img, padh, self.inpHeight - newh - padh, 0, 0, cv2.BORDER_CONSTANT, value=0) 104 | else: 105 | img = cv2.resize(srcimg, (self.inpWidth, self.inpHeight), interpolation=cv2.INTER_AREA) 106 | return img, newh, neww, padh, padw 107 | 108 | def _normalize(self, img): ### c++: https://blog.csdn.net/wuqingshan2010/article/details/107727909 109 | img = img.astype(np.float32) / 255.0 110 | img = (img - self.mean) / self.std 111 | return img 112 | def detect(self, srcimg): 113 | img, newh, neww, padh, padw = self.resize_image(srcimg) 114 | img = self._normalize(img) 115 | blob = cv2.dnn.blobFromImage(img) 116 | # Sets the input to the network 117 | self.net.setInput(blob) 118 | 119 | # Runs the forward pass to get output of the output layers 120 | outs = self.net.forward(self.net.getUnconnectedOutLayersNames()) 121 | # inference output 122 | outimg = srcimg.copy() 123 | drive_area_mask = outs[1][:, padh:(self.inpHeight - padh), padw:(self.inpWidth - padw)] 124 | seg_id = np.argmax(drive_area_mask, axis=0).astype(np.uint8) 125 | seg_id = cv2.resize(seg_id, (srcimg.shape[1], srcimg.shape[0]), interpolation=cv2.INTER_NEAREST) 126 | outimg[seg_id == 1] = [0, 255, 0] 127 | 128 | lane_line_mask = outs[2][:, padh:(self.inpHeight - padh), padw:(self.inpWidth - padw)] 129 | seg_id = np.argmax(lane_line_mask, axis=0).astype(np.uint8) 130 | seg_id = cv2.resize(seg_id, (srcimg.shape[1], srcimg.shape[0]), interpolation=cv2.INTER_NEAREST) 131 | outimg[seg_id == 1] = [255, 0, 0] 132 | 133 | det_out = outs[0] 134 | row_ind = 0 135 | for i in range(self.nl): 136 | det_out[row_ind:row_ind+self.length[i], 0:2] = (det_out[row_ind:row_ind+self.length[i], 0:2] * 2. 
- 0.5 + np.tile(self.grid[i],(self.na, 1))) * int(self.stride[i]) 137 | det_out[row_ind:row_ind+self.length[i], 2:4] = (det_out[row_ind:row_ind+self.length[i], 2:4] * 2) ** 2 * np.repeat(self.anchor_grid[i], self.areas[i], axis=0) 138 | row_ind += self.length[i] 139 | outimg = self.postprocess(outimg, det_out, newh, neww, padh, padw) 140 | return outimg 141 | 142 | if __name__ == "__main__": 143 | parser = argparse.ArgumentParser() 144 | parser.add_argument("--imgpath", type=str, default='images/0ace96c3-48481887.jpg', help="image path") 145 | parser.add_argument('--confThreshold', default=0.25, type=float, help='class confidence') 146 | parser.add_argument('--nmsThreshold', default=0.45, type=float, help='nms iou thresh') 147 | parser.add_argument('--objThreshold', default=0.5, type=float, help='object confidence') 148 | args = parser.parse_args() 149 | 150 | yolonet = yolop(confThreshold=args.confThreshold, nmsThreshold=args.nmsThreshold, objThreshold=args.objThreshold) 151 | srcimg = cv2.imread(args.imgpath) 152 | outimg = yolonet.detect(srcimg) 153 | 154 | winName = 'Deep learning object detection in OpenCV' 155 | cv2.namedWindow(winName, 0) 156 | cv2.imshow(winName, outimg) 157 | cv2.waitKey(0) 158 | cv2.destroyAllWindows() -------------------------------------------------------------------------------- /yolop.onnx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/hpc203/YOLOP-opencv-dnn/724b2aef804845529a85f4589142232c4d9ea3a4/yolop.onnx --------------------------------------------------------------------------------
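For reference, the Python demo can also be used as a module rather than through its command-line entry point: the yolop class in main.py wraps model loading, and its detect method returns the annotated image. The sketch below is a minimal, illustrative usage example, not part of the repository's scripts; it assumes yolop.onnx, bdd100k.names and the images folder sit in the working directory, as described in the README, and the output filename result.jpg is arbitrary:
~~~python
import cv2
from main import yolop  # detector class defined in main.py

# Thresholds mirror the defaults exposed by main.py's command-line arguments.
net = yolop(confThreshold=0.25, nmsThreshold=0.45, objThreshold=0.5)

srcimg = cv2.imread('images/0ace96c3-48481887.jpg')
outimg = net.detect(srcimg)  # red boxes, green drivable area, blue lane lines (BGR)

cv2.imwrite('result.jpg', outimg)
~~~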