├── README.md ├── data ├── ccpd.yaml └── hyp.scratch.yaml ├── demo ├── det_result │ ├── 003748802682-91_84-220&469_341&511-328&514_224&510_224&471_328&475-10_2_5_22_31_31_27-103-12.jpg │ ├── 004504310344827586-86_270-291&474_390&516-390&513_293&516_291&483_387&474-0_0_3_25_25_25_30_30-228-16.jpg │ ├── 004636015325670498-90_268-329&567_450&602-450&602_329&601_330&569_449&567-0_0_3_26_26_33_27_33-176-28.jpg │ ├── 00471264367816-90_87-310&523_445&572-448&574_310&579_307&532_445&527-0_0_5_10_30_24_29-135-22.jpg │ ├── 004827586206896552-93_265-348&574_460&613-460&613_349&610_348&574_459&582-0_0_3_25_28_28_32_32-153-23.jpg │ ├── 006664272030651341-86_266-274&515_395&565-395&553_279&565_274&526_395&515-0_0_5_24_29_29_30_32-72-21.jpg │ ├── 0089224137931-92_85-368&433_515&496-516&498_369&484_373&433_520&447-0_0_22_30_1_29_33-134-36.jpg │ ├── 0200311302682-90_89-282&337_528&424-536&423_287&422_292&333_541&334-0_0_21_16_32_32_31-40-44.jpg │ └── 04396551724137931-90_253-218&507_558&625-534&589_218&625_248&512_558&507-0_0_3_25_33_31_32_33-40-6.jpg ├── images │ ├── 003748802682-91_84-220&469_341&511-328&514_224&510_224&471_328&475-10_2_5_22_31_31_27-103-12.jpg │ ├── 004504310344827586-86_270-291&474_390&516-390&513_293&516_291&483_387&474-0_0_3_25_25_25_30_30-228-16.jpg │ ├── 004636015325670498-90_268-329&567_450&602-450&602_329&601_330&569_449&567-0_0_3_26_26_33_27_33-176-28.jpg │ ├── 00471264367816-90_87-310&523_445&572-448&574_310&579_307&532_445&527-0_0_5_10_30_24_29-135-22.jpg │ ├── 004827586206896552-93_265-348&574_460&613-460&613_349&610_348&574_459&582-0_0_3_25_28_28_32_32-153-23.jpg │ ├── 006664272030651341-86_266-274&515_395&565-395&553_279&565_274&526_395&515-0_0_5_24_29_29_30_32-72-21.jpg │ ├── 0089224137931-92_85-368&433_515&496-516&498_369&484_373&433_520&447-0_0_22_30_1_29_33-134-36.jpg │ ├── 0200311302682-90_89-282&337_528&424-536&423_287&422_292&333_541&334-0_0_21_16_32_32_31-40-44.jpg │ └── 04396551724137931-90_253-218&507_558&625-534&589_218&625_248&512_558&507-0_0_3_25_33_31_32_33-40-6.jpg └── rec_result │ ├── 003748802682-91_84-220&469_341&511-328&514_224&510_224&471_328&475-10_2_5_22_31_31_27-103-12.jpg │ ├── 004504310344827586-86_270-291&474_390&516-390&513_293&516_291&483_387&474-0_0_3_25_25_25_30_30-228-16.jpg │ ├── 004636015325670498-90_268-329&567_450&602-450&602_329&601_330&569_449&567-0_0_3_26_26_33_27_33-176-28.jpg │ ├── 00471264367816-90_87-310&523_445&572-448&574_310&579_307&532_445&527-0_0_5_10_30_24_29-135-22.jpg │ ├── 004827586206896552-93_265-348&574_460&613-460&613_349&610_348&574_459&582-0_0_3_25_28_28_32_32-153-23.jpg │ ├── 006664272030651341-86_266-274&515_395&565-395&553_279&565_274&526_395&515-0_0_5_24_29_29_30_32-72-21.jpg │ ├── 0089224137931-92_85-368&433_515&496-516&498_369&484_373&433_520&447-0_0_22_30_1_29_33-134-36.jpg │ ├── 0200311302682-90_89-282&337_528&424-536&423_287&422_292&333_541&334-0_0_21_16_32_32_31-40-44.jpg │ └── 04396551724137931-90_253-218&507_558&625-534&589_218&625_248&512_558&507-0_0_3_25_33_31_32_33-40-6.jpg ├── figures ├── lpr-test.png ├── lpr-val.png └── yolov5-test.png ├── main.py ├── models ├── LPRNet.py ├── common.py ├── experimental.py ├── yolo.py └── yolov5s.yaml ├── requirements.txt ├── tools ├── ccpd2lpr.py ├── ccpd2yolov5.py ├── detect_yolov5.py ├── split_dataset.py ├── test_lprnet.py ├── test_yolov5.py ├── train_lprnet.py └── train_yolov5.py ├── utils ├── activations.py ├── datasets.py ├── general.py ├── google_utils.py ├── load_lpr_data.py ├── metrics.py ├── plots.py ├── torch_utils.py └── utils.py └── weights ├── 
lprnet_best.pth └── yolov5_best.pt

/README.md:
--------------------------------------------------------------------------------
# License Plate Recognition (CCPD Dataset)

This project uses YOLOv5 and LPRNet to detect and recognize Chinese license plates from the CCPD dataset. I had been studying OCR-related topics for a while and wanted to turn that into a license plate recognition project, and I had already worked on plate detection before. The plan is a lightweight pipeline: YOLOv5 for plate detection and LPRNet for plate recognition.

Currently only Chinese blue plates and green plates (new-energy vehicles) are supported. With additional data the models can be fine-tuned further to cover more scenes and more plate types and to improve recognition accuracy.

The project mainly draws on the following four repositories:

1. GitHub: [https://github.com/ultralytics/yolov5](https://github.com/ultralytics/yolov5)
2. GitHub: [https://github.com/sirius-ai/LPRNet_Pytorch](https://github.com/sirius-ai/LPRNet_Pytorch)
3. [https://gitee.com/reason1251326862/plate_classification](https://gitee.com/reason1251326862/plate_classification)
4. [https://github.com/kiloGrand/License-Plate-Recognition](https://github.com/kiloGrand/License-Plate-Recognition)

If you are not yet familiar with YOLOv5, you can start with my YOLOv5 source-code walkthrough on CSDN:
[YOLOv5-5.x source-code walkthrough: overall project file guide](https://blog.csdn.net/qq_38253797/article/details/119043919)

An annotated copy of the YOLOv5 source code is also open-sourced on GitHub:
[HuKai97/yolov5-5.x-annotations](https://github.com/HuKai97/yolov5-5.x-annotations)

Stars are welcome!


## 1. CSDN walkthroughs of the key source code
Data preparation, training, and testing are all explained in detail in these blog posts:
1. [YOLOv5-5.x source-code walkthrough: overall project file guide](https://blog.csdn.net/qq_38253797/article/details/119043919)
2. [Project 3, plate detection + recognition, part 1: converting the CCPD dataset to YOLOv5 and LPRNet formats](https://blog.csdn.net/qq_38253797/article/details/125042833)
3. [Project 3, plate detection + recognition, part 2: plate detection with YOLOv5](https://blog.csdn.net/qq_38253797/article/details/125027825)
4. [Project 3, plate detection + recognition, part 3: LPRNet architecture and core source code](https://blog.csdn.net/qq_38253797/article/details/125054464)
5. [Project 3, plate detection + recognition, part 4: plate recognition with LPRNet](https://blog.csdn.net/qq_38253797/article/details/125019442)


## 2. Dataset download
Download the official CCPD data here: [detectRecog/CCPD](https://github.com/detectRecog/CCPD)

## 3. Detection model performance

model   | img_size | epochs | mAP_0.5 | mAP_0.5:0.95 | size
------- | -------- | ------ | ------- | ------------ | ----
yolov5s | 640x640  | 60     | 0.995   | 0.825        | 14M

## 4. Recognition model performance

model  | dataset | epochs | acc    | size
------ | ------- | ------ | ------ | ----
LPRNet | val     | 100    | 94.33% | 1.7M
LPRNet | test    | 100    | 94.30% | 1.7M

Overall pipeline speed (detection + recognition): 47.6 FPS (970 GPU).


## 5. Recognition results
See demo/rec_result for more examples.

![](demo/rec_result/003748802682-91_84-220&469_341&511-328&514_224&510_224&471_328&475-10_2_5_22_31_31_27-103-12.jpg)

## 6. Limitations and room for improvement
1. Dataset: because of limited compute, only the blue plates from the CCPD2019 base subset and all green plates from CCPD2020 were used. Harder cases such as long distance, blur, snow/rain/fog, and very dark or bright lighting do exist in CCPD2019; with more resources the dataset could be extended and the project revisited.
2. Dataset: double-layer (two-row) plates cannot be recognized.
3. Model: a super-resolution step could be added, so that once the plate region is detected its resolution is increased before recognition.
4. Model: an image-rectification step could be added, so that once the plate region is detected the plate crop is rectified first and only then recognized.
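As background for the conversion scripts mentioned above (tools/ccpd2yolov5.py and tools/ccpd2lpr.py), every CCPD image carries its annotation in the filename itself: the bounding box, the four plate corners, and the plate characters as indices into fixed character tables. The following minimal sketch (not part of the repository) decodes one of the bundled demo filenames; the helper name decode_ccpd_name is illustrative only, while the provinces/alphabets/ads tables are the ones used in tools/ccpd2lpr.py.

# Illustrative sketch: decode bounding box and plate text from a CCPD filename.
provinces = ["皖", "沪", "津", "渝", "冀", "晋", "蒙", "辽", "吉", "黑", "苏", "浙", "京", "闽", "赣", "鲁", "豫", "鄂",
             "湘", "粤", "桂", "琼", "川", "贵", "云", "藏", "陕", "甘", "青", "宁", "新", "警", "学", "O"]
alphabets = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'J', 'K', 'L', 'M', 'N', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W',
             'X', 'Y', 'Z', 'O']
ads = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'J', 'K', 'L', 'M', 'N', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X',
       'Y', 'Z', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'O']

def decode_ccpd_name(filename: str):
    """Decode the plate bounding box and plate string from a CCPD image filename."""
    stem = filename.rsplit('.', 1)[0]
    # Fields: area - tilt - bbox - corner points - plate indices - brightness - blurriness
    _, _, box, points, plate, brightness, blurriness = stem.split('-')
    (x1, y1), (x2, y2) = [tuple(map(int, p.split('&'))) for p in box.split('_')]
    idx = [int(i) for i in plate.split('_')]
    # First index: province table, second: letter table, remaining: letters/digits table.
    text = provinces[idx[0]] + alphabets[idx[1]] + ''.join(ads[i] for i in idx[2:])
    return (x1, y1, x2, y2), text

# Example, using one of the demo image filenames:
bbox, text = decode_ccpd_name(
    "0089224137931-92_85-368&433_515&496-516&498_369&484_373&433_520&447-0_0_22_30_1_29_33-134-36.jpg")
print(bbox, text)

Running this on the example filename yields the pixel bounding box of the plate and its text, which is exactly the information the two conversion scripts extract when building the detection and recognition training sets.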
--------------------------------------------------------------------------------
/data/ccpd.yaml:
--------------------------------------------------------------------------------
path: ../datasets/ccpd/det
train: images/train
val: images/val
test: images/test

nc: 1
names: ['licence']

--------------------------------------------------------------------------------
/data/hyp.scratch.yaml:
--------------------------------------------------------------------------------
# Hyperparameters for COCO training from scratch
# python train.py --batch 40 --cfg yolov5m.yaml --weights '' --data coco.yaml --img 640 --epochs 300
# See tutorials for hyperparameter evolution https://github.com/ultralytics/yolov5#tutorials

# 1. Training parameters
lr0: 0.01  # initial learning rate (SGD=1E-2, Adam=1E-3)
lrf: 0.1  # final learning rate: lr0 decays to lr0 * lrf (one-cycle or linear schedule)
momentum: 0.937  # SGD momentum / Adam beta1
weight_decay: 0.0005  # optimizer weight decay 5e-4
warmup_epochs: 3.0  # warm up during the first 3 epochs
warmup_momentum: 0.8  # initial warmup momentum
warmup_bias_lr: 0.1  # initial warmup bias learning rate
# 2. Loss parameters
box: 0.05  # box IoU loss gain
cls: 0.5  # classification loss gain
cls_pw: 1.0  # cls BCELoss positive-sample weight
obj: 1.0  # obj loss gain (scale with pixels)
obj_pw: 1.0  # obj BCELoss positive-sample weight
fl_gamma: 0.0  # focal loss gamma (efficientDet default gamma=1.5)
# 3. Other parameters
iou_t: 0.20  # IoU training threshold
anchor_t: 4.0  # anchor aspect-ratio threshold (length:width = 4:1), used to compute bpr and aat in k-means
#anchors: 3  # anchors per output layer (0 to ignore); enable when running k-means
# 4. Data-augmentation parameters
hsv_h: 0.015  # HSV hue augmentation gain
hsv_s: 0.7  # HSV saturation augmentation gain
hsv_v: 0.4  # HSV value (brightness) augmentation gain
degrees: 0.0  # random_perspective rotation (+/- deg)
translate: 0.1  # random_perspective translation (+/- fraction)
scale: 0.5  # random_perspective image scale (+/- gain)
shear: 0.0  # random_perspective shear (+/- deg)
perspective: 0.0  # random_perspective perspective (+/- fraction), range 0-0.001
flipud: 0.0  # vertical flip augmentation (probability)
fliplr: 0.5  # horizontal flip augmentation (probability)
mosaic: 1.0  # mosaic augmentation (probability)
mixup: 0.0  # mixup augmentation (probability)
# copy_paste: 1.0  # segment copy-paste (probability)

--------------------------------------------------------------------------------
/demo/det_result/003748802682-91_84-220&469_341&511-328&514_224&510_224&471_328&475-10_2_5_22_31_31_27-103-12.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/HuKai97/YOLOv5-LPRNet-Licence-Recognition/b42fd0837b4bd5becc484533c7f2324123c703a3/demo/det_result/003748802682-91_84-220&469_341&511-328&514_224&510_224&471_328&475-10_2_5_22_31_31_27-103-12.jpg
--------------------------------------------------------------------------------
/demo/det_result/004504310344827586-86_270-291&474_390&516-390&513_293&516_291&483_387&474-0_0_3_25_25_25_30_30-228-16.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/HuKai97/YOLOv5-LPRNet-Licence-Recognition/b42fd0837b4bd5becc484533c7f2324123c703a3/demo/det_result/004504310344827586-86_270-291&474_390&516-390&513_293&516_291&483_387&474-0_0_3_25_25_25_30_30-228-16.jpg
--------------------------------------------------------------------------------
/demo/det_result/004636015325670498-90_268-329&567_450&602-450&602_329&601_330&569_449&567-0_0_3_26_26_33_27_33-176-28.jpg:
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/HuKai97/YOLOv5-LPRNet-Licence-Recognition/b42fd0837b4bd5becc484533c7f2324123c703a3/demo/det_result/004636015325670498-90_268-329&567_450&602-450&602_329&601_330&569_449&567-0_0_3_26_26_33_27_33-176-28.jpg -------------------------------------------------------------------------------- /demo/det_result/00471264367816-90_87-310&523_445&572-448&574_310&579_307&532_445&527-0_0_5_10_30_24_29-135-22.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/HuKai97/YOLOv5-LPRNet-Licence-Recognition/b42fd0837b4bd5becc484533c7f2324123c703a3/demo/det_result/00471264367816-90_87-310&523_445&572-448&574_310&579_307&532_445&527-0_0_5_10_30_24_29-135-22.jpg -------------------------------------------------------------------------------- /demo/det_result/004827586206896552-93_265-348&574_460&613-460&613_349&610_348&574_459&582-0_0_3_25_28_28_32_32-153-23.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/HuKai97/YOLOv5-LPRNet-Licence-Recognition/b42fd0837b4bd5becc484533c7f2324123c703a3/demo/det_result/004827586206896552-93_265-348&574_460&613-460&613_349&610_348&574_459&582-0_0_3_25_28_28_32_32-153-23.jpg -------------------------------------------------------------------------------- /demo/det_result/006664272030651341-86_266-274&515_395&565-395&553_279&565_274&526_395&515-0_0_5_24_29_29_30_32-72-21.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/HuKai97/YOLOv5-LPRNet-Licence-Recognition/b42fd0837b4bd5becc484533c7f2324123c703a3/demo/det_result/006664272030651341-86_266-274&515_395&565-395&553_279&565_274&526_395&515-0_0_5_24_29_29_30_32-72-21.jpg -------------------------------------------------------------------------------- /demo/det_result/0089224137931-92_85-368&433_515&496-516&498_369&484_373&433_520&447-0_0_22_30_1_29_33-134-36.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/HuKai97/YOLOv5-LPRNet-Licence-Recognition/b42fd0837b4bd5becc484533c7f2324123c703a3/demo/det_result/0089224137931-92_85-368&433_515&496-516&498_369&484_373&433_520&447-0_0_22_30_1_29_33-134-36.jpg -------------------------------------------------------------------------------- /demo/det_result/0200311302682-90_89-282&337_528&424-536&423_287&422_292&333_541&334-0_0_21_16_32_32_31-40-44.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/HuKai97/YOLOv5-LPRNet-Licence-Recognition/b42fd0837b4bd5becc484533c7f2324123c703a3/demo/det_result/0200311302682-90_89-282&337_528&424-536&423_287&422_292&333_541&334-0_0_21_16_32_32_31-40-44.jpg -------------------------------------------------------------------------------- /demo/det_result/04396551724137931-90_253-218&507_558&625-534&589_218&625_248&512_558&507-0_0_3_25_33_31_32_33-40-6.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/HuKai97/YOLOv5-LPRNet-Licence-Recognition/b42fd0837b4bd5becc484533c7f2324123c703a3/demo/det_result/04396551724137931-90_253-218&507_558&625-534&589_218&625_248&512_558&507-0_0_3_25_33_31_32_33-40-6.jpg -------------------------------------------------------------------------------- 
/demo/images/003748802682-91_84-220&469_341&511-328&514_224&510_224&471_328&475-10_2_5_22_31_31_27-103-12.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/HuKai97/YOLOv5-LPRNet-Licence-Recognition/b42fd0837b4bd5becc484533c7f2324123c703a3/demo/images/003748802682-91_84-220&469_341&511-328&514_224&510_224&471_328&475-10_2_5_22_31_31_27-103-12.jpg -------------------------------------------------------------------------------- /demo/images/004504310344827586-86_270-291&474_390&516-390&513_293&516_291&483_387&474-0_0_3_25_25_25_30_30-228-16.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/HuKai97/YOLOv5-LPRNet-Licence-Recognition/b42fd0837b4bd5becc484533c7f2324123c703a3/demo/images/004504310344827586-86_270-291&474_390&516-390&513_293&516_291&483_387&474-0_0_3_25_25_25_30_30-228-16.jpg -------------------------------------------------------------------------------- /demo/images/004636015325670498-90_268-329&567_450&602-450&602_329&601_330&569_449&567-0_0_3_26_26_33_27_33-176-28.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/HuKai97/YOLOv5-LPRNet-Licence-Recognition/b42fd0837b4bd5becc484533c7f2324123c703a3/demo/images/004636015325670498-90_268-329&567_450&602-450&602_329&601_330&569_449&567-0_0_3_26_26_33_27_33-176-28.jpg -------------------------------------------------------------------------------- /demo/images/00471264367816-90_87-310&523_445&572-448&574_310&579_307&532_445&527-0_0_5_10_30_24_29-135-22.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/HuKai97/YOLOv5-LPRNet-Licence-Recognition/b42fd0837b4bd5becc484533c7f2324123c703a3/demo/images/00471264367816-90_87-310&523_445&572-448&574_310&579_307&532_445&527-0_0_5_10_30_24_29-135-22.jpg -------------------------------------------------------------------------------- /demo/images/004827586206896552-93_265-348&574_460&613-460&613_349&610_348&574_459&582-0_0_3_25_28_28_32_32-153-23.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/HuKai97/YOLOv5-LPRNet-Licence-Recognition/b42fd0837b4bd5becc484533c7f2324123c703a3/demo/images/004827586206896552-93_265-348&574_460&613-460&613_349&610_348&574_459&582-0_0_3_25_28_28_32_32-153-23.jpg -------------------------------------------------------------------------------- /demo/images/006664272030651341-86_266-274&515_395&565-395&553_279&565_274&526_395&515-0_0_5_24_29_29_30_32-72-21.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/HuKai97/YOLOv5-LPRNet-Licence-Recognition/b42fd0837b4bd5becc484533c7f2324123c703a3/demo/images/006664272030651341-86_266-274&515_395&565-395&553_279&565_274&526_395&515-0_0_5_24_29_29_30_32-72-21.jpg -------------------------------------------------------------------------------- /demo/images/0089224137931-92_85-368&433_515&496-516&498_369&484_373&433_520&447-0_0_22_30_1_29_33-134-36.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/HuKai97/YOLOv5-LPRNet-Licence-Recognition/b42fd0837b4bd5becc484533c7f2324123c703a3/demo/images/0089224137931-92_85-368&433_515&496-516&498_369&484_373&433_520&447-0_0_22_30_1_29_33-134-36.jpg 
-------------------------------------------------------------------------------- /demo/images/0200311302682-90_89-282&337_528&424-536&423_287&422_292&333_541&334-0_0_21_16_32_32_31-40-44.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/HuKai97/YOLOv5-LPRNet-Licence-Recognition/b42fd0837b4bd5becc484533c7f2324123c703a3/demo/images/0200311302682-90_89-282&337_528&424-536&423_287&422_292&333_541&334-0_0_21_16_32_32_31-40-44.jpg -------------------------------------------------------------------------------- /demo/images/04396551724137931-90_253-218&507_558&625-534&589_218&625_248&512_558&507-0_0_3_25_33_31_32_33-40-6.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/HuKai97/YOLOv5-LPRNet-Licence-Recognition/b42fd0837b4bd5becc484533c7f2324123c703a3/demo/images/04396551724137931-90_253-218&507_558&625-534&589_218&625_248&512_558&507-0_0_3_25_33_31_32_33-40-6.jpg -------------------------------------------------------------------------------- /demo/rec_result/003748802682-91_84-220&469_341&511-328&514_224&510_224&471_328&475-10_2_5_22_31_31_27-103-12.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/HuKai97/YOLOv5-LPRNet-Licence-Recognition/b42fd0837b4bd5becc484533c7f2324123c703a3/demo/rec_result/003748802682-91_84-220&469_341&511-328&514_224&510_224&471_328&475-10_2_5_22_31_31_27-103-12.jpg -------------------------------------------------------------------------------- /demo/rec_result/004504310344827586-86_270-291&474_390&516-390&513_293&516_291&483_387&474-0_0_3_25_25_25_30_30-228-16.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/HuKai97/YOLOv5-LPRNet-Licence-Recognition/b42fd0837b4bd5becc484533c7f2324123c703a3/demo/rec_result/004504310344827586-86_270-291&474_390&516-390&513_293&516_291&483_387&474-0_0_3_25_25_25_30_30-228-16.jpg -------------------------------------------------------------------------------- /demo/rec_result/004636015325670498-90_268-329&567_450&602-450&602_329&601_330&569_449&567-0_0_3_26_26_33_27_33-176-28.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/HuKai97/YOLOv5-LPRNet-Licence-Recognition/b42fd0837b4bd5becc484533c7f2324123c703a3/demo/rec_result/004636015325670498-90_268-329&567_450&602-450&602_329&601_330&569_449&567-0_0_3_26_26_33_27_33-176-28.jpg -------------------------------------------------------------------------------- /demo/rec_result/00471264367816-90_87-310&523_445&572-448&574_310&579_307&532_445&527-0_0_5_10_30_24_29-135-22.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/HuKai97/YOLOv5-LPRNet-Licence-Recognition/b42fd0837b4bd5becc484533c7f2324123c703a3/demo/rec_result/00471264367816-90_87-310&523_445&572-448&574_310&579_307&532_445&527-0_0_5_10_30_24_29-135-22.jpg -------------------------------------------------------------------------------- /demo/rec_result/004827586206896552-93_265-348&574_460&613-460&613_349&610_348&574_459&582-0_0_3_25_28_28_32_32-153-23.jpg: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/HuKai97/YOLOv5-LPRNet-Licence-Recognition/b42fd0837b4bd5becc484533c7f2324123c703a3/demo/rec_result/004827586206896552-93_265-348&574_460&613-460&613_349&610_348&574_459&582-0_0_3_25_28_28_32_32-153-23.jpg -------------------------------------------------------------------------------- /demo/rec_result/006664272030651341-86_266-274&515_395&565-395&553_279&565_274&526_395&515-0_0_5_24_29_29_30_32-72-21.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/HuKai97/YOLOv5-LPRNet-Licence-Recognition/b42fd0837b4bd5becc484533c7f2324123c703a3/demo/rec_result/006664272030651341-86_266-274&515_395&565-395&553_279&565_274&526_395&515-0_0_5_24_29_29_30_32-72-21.jpg -------------------------------------------------------------------------------- /demo/rec_result/0089224137931-92_85-368&433_515&496-516&498_369&484_373&433_520&447-0_0_22_30_1_29_33-134-36.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/HuKai97/YOLOv5-LPRNet-Licence-Recognition/b42fd0837b4bd5becc484533c7f2324123c703a3/demo/rec_result/0089224137931-92_85-368&433_515&496-516&498_369&484_373&433_520&447-0_0_22_30_1_29_33-134-36.jpg -------------------------------------------------------------------------------- /demo/rec_result/0200311302682-90_89-282&337_528&424-536&423_287&422_292&333_541&334-0_0_21_16_32_32_31-40-44.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/HuKai97/YOLOv5-LPRNet-Licence-Recognition/b42fd0837b4bd5becc484533c7f2324123c703a3/demo/rec_result/0200311302682-90_89-282&337_528&424-536&423_287&422_292&333_541&334-0_0_21_16_32_32_31-40-44.jpg -------------------------------------------------------------------------------- /demo/rec_result/04396551724137931-90_253-218&507_558&625-534&589_218&625_248&512_558&507-0_0_3_25_33_31_32_33-40-6.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/HuKai97/YOLOv5-LPRNet-Licence-Recognition/b42fd0837b4bd5becc484533c7f2324123c703a3/demo/rec_result/04396551724137931-90_253-218&507_558&625-534&589_218&625_248&512_558&507-0_0_3_25_33_31_32_33-40-6.jpg -------------------------------------------------------------------------------- /figures/lpr-test.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/HuKai97/YOLOv5-LPRNet-Licence-Recognition/b42fd0837b4bd5becc484533c7f2324123c703a3/figures/lpr-test.png -------------------------------------------------------------------------------- /figures/lpr-val.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/HuKai97/YOLOv5-LPRNet-Licence-Recognition/b42fd0837b4bd5becc484533c7f2324123c703a3/figures/lpr-val.png -------------------------------------------------------------------------------- /figures/yolov5-test.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/HuKai97/YOLOv5-LPRNet-Licence-Recognition/b42fd0837b4bd5becc484533c7f2324123c703a3/figures/yolov5-test.png -------------------------------------------------------------------------------- /main.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | 3 | import torch.backends.cudnn as cudnn 4 | 5 | from models.experimental import * 6 | 
from utils.datasets import * 7 | from utils.utils import * 8 | from models.LPRNet import * 9 | 10 | def detect(save_img=False): 11 | classify, out, source, det_weights, rec_weights, view_img, save_txt, imgsz = \ 12 | opt.classify, opt.output, opt.source, opt.det_weights, opt.rec_weights, opt.view_img, opt.save_txt, opt.img_size 13 | webcam = source == '0' or source.startswith('rtsp') or source.startswith('http') or source.endswith('.txt') 14 | 15 | # Initialize 16 | device = torch_utils.select_device(opt.device) 17 | if os.path.exists(out): 18 | shutil.rmtree(out) # delete rec_result folder 19 | os.makedirs(out) # make new rec_result folder 20 | half = device.type != 'cpu' # half precision only supported on CUDA 21 | 22 | # Load yolov5 model 23 | model = attempt_load(det_weights, map_location=device) # load FP32 model 24 | print("load det pretrained model successful!") 25 | imgsz = check_img_size(imgsz, s=model.stride.max()) # check img_size 26 | if half: 27 | model.half() # to FP16 28 | 29 | # Second-stage classifier 也就是rec 字符识别 30 | if classify: 31 | modelc = LPRNet(lpr_max_len=8, phase=False, class_num=len(CHARS), dropout_rate=0).to(device) 32 | modelc.load_state_dict(torch.load(rec_weights, map_location=torch.device('cpu'))) 33 | print("load rec pretrained model successful!") 34 | modelc.to(device).eval() 35 | 36 | # Set Dataloader 37 | vid_path, vid_writer = None, None 38 | if webcam: 39 | view_img = True 40 | cudnn.benchmark = True # set True to speed up constant image size demo 41 | dataset = LoadStreams(source, img_size=imgsz) 42 | else: 43 | save_img = True 44 | dataset = LoadImages(source, img_size=imgsz) 45 | 46 | # Get names and colors 47 | names = model.module.names if hasattr(model, 'module') else model.names 48 | colors = [[random.randint(0, 255) for _ in range(3)] for _ in range(len(names))] 49 | 50 | # Run demo 51 | t0 = time.time() 52 | img = torch.zeros((1, 3, imgsz, imgsz), device=device) # init img 53 | _ = model(img.half() if half else img) if device.type != 'cpu' else None # run once 54 | for path, img, im0s, vid_cap in dataset: 55 | img = torch.from_numpy(img).to(device) 56 | img = img.half() if half else img.float() # uint8 to fp16/32 57 | img /= 255.0 # 0 - 255 to 0.0 - 1.0 58 | if img.ndimension() == 3: 59 | img = img.unsqueeze(0) 60 | 61 | # Inference 62 | t1 = torch_utils.time_synchronized() 63 | pred = model(img, augment=opt.augment)[0] 64 | # Apply NMS 65 | pred = non_max_suppression(pred, opt.conf_thres, opt.iou_thres, classes=opt.classes, agnostic=opt.agnostic_nms) 66 | 67 | 68 | # Apply Classifier 69 | if classify: 70 | pred, plat_num = apply_classifier(pred, modelc, img, im0s) 71 | 72 | t2 = torch_utils.time_synchronized() 73 | 74 | # Process detections 75 | for i, det in enumerate(pred): # detections per image 76 | if webcam: # batch_size >= 1 77 | p, s, im0 = path[i], '%g: ' % i, im0s[i].copy() 78 | else: 79 | p, s, im0 = path, '', im0s 80 | 81 | save_path = str(Path(out) / Path(p).name) 82 | txt_path = str(Path(out) / Path(p).stem) + ('_%g' % dataset.frame if dataset.mode == 'video' else '') 83 | s += '%gx%g ' % img.shape[2:] # print string 84 | gn = torch.tensor(im0.shape)[[1, 0, 1, 0]] # normalization gain whwh 85 | if det is not None and len(det): 86 | # Rescale boxes from img_size to im0 size 87 | det[:, :4] = scale_coords(img.shape[2:], det[:, :4], im0.shape).round() 88 | 89 | # Print results 90 | for c in det[:, 5].unique(): 91 | n = (det[:, 5] == c).sum() # detections per class 92 | s += '%g %ss, ' % (n, names[int(c)]) # add to string 93 | 94 | 
# Write results 95 | for de, lic_plat in zip(det, plat_num): 96 | # xyxy,conf,cls,lic_plat=de[:4],de[4],de[5],de[6:] 97 | *xyxy, conf, cls=de 98 | 99 | if save_txt: # Write to file 100 | xywh = (xyxy2xywh(torch.tensor(xyxy).view(1, 4)) / gn).view(-1).tolist() # normalized xywh 101 | with open(txt_path + '.txt', 'a') as f: 102 | f.write(('%g ' * 5 + '\n') % (cls, xywh)) # label format 103 | 104 | if save_img or view_img: # Add bbox to image 105 | # label = '%s %.2f' % (names[int(cls)], conf) 106 | lb = "" 107 | for a,i in enumerate(lic_plat): 108 | # if a ==0: 109 | # continue 110 | lb += CHARS[int(i)] 111 | label = '%s %.2f' % (lb, conf) 112 | im0 = plot_one_box(xyxy, im0, label=label, color=colors[int(cls)], line_thickness=3) 113 | 114 | # Print time (demo + NMS) 115 | print('%sDone. (%.3fs)' % (s, t2 - t1)) 116 | 117 | # Stream results 118 | if view_img: 119 | cv2.imshow(p, im0) 120 | if cv2.waitKey(1) == ord('q'): # q to quit 121 | raise StopIteration 122 | 123 | # Save results (image with detections) 124 | if save_img: 125 | if dataset.mode == 'images': 126 | cv2.imwrite(save_path, im0) 127 | else: 128 | if vid_path != save_path: # new video 129 | vid_path = save_path 130 | if isinstance(vid_writer, cv2.VideoWriter): 131 | vid_writer.release() # release previous video writer 132 | 133 | fourcc = 'mp4v' # rec_result video codec 134 | fps = vid_cap.get(cv2.CAP_PROP_FPS) 135 | w = int(vid_cap.get(cv2.CAP_PROP_FRAME_WIDTH)) 136 | h = int(vid_cap.get(cv2.CAP_PROP_FRAME_HEIGHT)) 137 | vid_writer = cv2.VideoWriter(save_path, cv2.VideoWriter_fourcc(*fourcc), fps, (w, h)) 138 | vid_writer.write(im0) 139 | 140 | if save_txt or save_img: 141 | print('Results saved to %s' % os.getcwd() + os.sep + out) 142 | if platform == 'darwin': # MacOS 143 | os.system('open ' + save_path) 144 | 145 | print('Done. (%.3fs)' % (time.time() - t0)) 146 | 147 | 148 | if __name__ == '__main__': 149 | parser = argparse.ArgumentParser() 150 | parser.add_argument('--classify', nargs='+', type=str, default=True, help='True rec') 151 | parser.add_argument('--det-weights', nargs='+', type=str, default='./weights/yolov5_best.pt', help='model.pt path(s)') 152 | parser.add_argument('--rec-weights', nargs='+', type=str, default='./weights/lprnet_best.pth', help='model.pt path(s)') 153 | parser.add_argument('--source', type=str, default='./demo/images/', help='source') # file/folder, 0 for webcam 154 | parser.add_argument('--output', type=str, default='demo/rec_result', help='rec_result folder') # rec_result folder 155 | parser.add_argument('--img-size', type=int, default=640, help='demo size (pixels)') 156 | parser.add_argument('--conf-thres', type=float, default=0.4, help='object confidence threshold') 157 | parser.add_argument('--iou-thres', type=float, default=0.5, help='IOU threshold for NMS') 158 | parser.add_argument('--device', default='', help='cuda device, i.e. 
0 or 0,1,2,3 or cpu') 159 | parser.add_argument('--view-img', action='store_true', help='display results') 160 | parser.add_argument('--save-txt', action='store_true', help='save results to *.txt') 161 | parser.add_argument('--classes', nargs='+', type=int, help='filter by class') 162 | parser.add_argument('--agnostic-nms', action='store_true', help='class-agnostic NMS') 163 | parser.add_argument('--augment', action='store_true', help='augmented demo') 164 | parser.add_argument('--update', action='store_true', help='update all models') 165 | opt = parser.parse_args() 166 | print(opt) 167 | 168 | with torch.no_grad(): 169 | if opt.update: # update all models (to fix SourceChangeWarning) 170 | for opt.weights in ['yolov5s.pt', 'yolov5m.pt', 'yolov5l.pt', 'yolov5x.pt', 'yolov3-spp.pt']: 171 | detect() 172 | create_pretrained(opt.weights, opt.weights) 173 | else: 174 | detect() 175 | -------------------------------------------------------------------------------- /models/LPRNet.py: -------------------------------------------------------------------------------- 1 | import torch.nn as nn 2 | import torch 3 | CHARS = ['京', '沪', '津', '渝', '冀', '晋', '蒙', '辽', '吉', '黑', 4 | '苏', '浙', '皖', '闽', '赣', '鲁', '豫', '鄂', '湘', '粤', 5 | '桂', '琼', '川', '贵', '云', '藏', '陕', '甘', '青', '宁', 6 | '新', 7 | '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 8 | 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'J', 'K', 9 | 'L', 'M', 'N', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 10 | 'W', 'X', 'Y', 'Z', 'I', 'O', '-' 11 | ] 12 | class small_basic_block(nn.Module): 13 | def __init__(self, ch_in, ch_out): 14 | super(small_basic_block, self).__init__() 15 | self.block = nn.Sequential( 16 | nn.Conv2d(ch_in, ch_out // 4, kernel_size=1), 17 | nn.ReLU(), 18 | nn.Conv2d(ch_out // 4, ch_out // 4, kernel_size=(3, 1), padding=(1, 0)), 19 | nn.ReLU(), 20 | nn.Conv2d(ch_out // 4, ch_out // 4, kernel_size=(1, 3), padding=(0, 1)), 21 | nn.ReLU(), 22 | nn.Conv2d(ch_out // 4, ch_out, kernel_size=1), 23 | ) 24 | def forward(self, x): 25 | return self.block(x) 26 | 27 | class LPRNet(nn.Module): 28 | def __init__(self, lpr_max_len, phase, class_num, dropout_rate): 29 | super(LPRNet, self).__init__() 30 | self.phase = phase 31 | self.lpr_max_len = lpr_max_len 32 | self.class_num = class_num 33 | self.backbone = nn.Sequential( 34 | nn.Conv2d(in_channels=3, out_channels=64, kernel_size=3, stride=1), # 0 [bs,3,24,94] -> [bs,64,22,92] 35 | nn.BatchNorm2d(num_features=64), # 1 -> [bs,64,22,92] 36 | nn.ReLU(), # 2 -> [bs,64,22,92] 37 | nn.MaxPool3d(kernel_size=(1, 3, 3), stride=(1, 1, 1)), # 3 -> [bs,64,20,90] 38 | small_basic_block(ch_in=64, ch_out=128), # 4 -> [bs,128,20,90] 39 | nn.BatchNorm2d(num_features=128), # 5 -> [bs,128,20,90] 40 | nn.ReLU(), # 6 -> [bs,128,20,90] 41 | nn.MaxPool3d(kernel_size=(1, 3, 3), stride=(2, 1, 2)), # 7 -> [bs,64,18,44] 42 | small_basic_block(ch_in=64, ch_out=256), # 8 -> [bs,256,18,44] 43 | nn.BatchNorm2d(num_features=256), # 9 -> [bs,256,18,44] 44 | nn.ReLU(), # 10 -> [bs,256,18,44] 45 | small_basic_block(ch_in=256, ch_out=256), # 11 -> [bs,256,18,44] 46 | nn.BatchNorm2d(num_features=256), # 12 -> [bs,256,18,44] 47 | nn.ReLU(), # 13 -> [bs,256,18,44] 48 | nn.MaxPool3d(kernel_size=(1, 3, 3), stride=(4, 1, 2)), # 14 -> [bs,64,16,21] 49 | nn.Dropout(dropout_rate), # 0.5 dropout rate # 15 -> [bs,64,16,21] 50 | nn.Conv2d(in_channels=64, out_channels=256, kernel_size=(1, 4), stride=1), # 16 -> [bs,256,16,18] 51 | nn.BatchNorm2d(num_features=256), # 17 -> [bs,256,16,18] 52 | nn.ReLU(), # 18 -> [bs,256,16,18] 53 | 
nn.Dropout(dropout_rate), # 0.5 dropout rate 19 -> [bs,256,16,18] 54 | nn.Conv2d(in_channels=256, out_channels=class_num, kernel_size=(13, 1), stride=1), # class_num=68 20 -> [bs,68,4,18] 55 | nn.BatchNorm2d(num_features=class_num), # 21 -> [bs,68,4,18] 56 | nn.ReLU(), # 22 -> [bs,68,4,18] 57 | ) 58 | self.container = nn.Sequential( 59 | nn.Conv2d(in_channels=448+self.class_num, out_channels=self.class_num, kernel_size=(1, 1), stride=(1, 1)), 60 | # nn.BatchNorm2d(num_features=self.class_num), 61 | # nn.ReLU(), 62 | # nn.Conv2d(in_channels=self.class_num, out_channels=self.lpr_max_len+1, kernel_size=3, stride=2), 63 | # nn.ReLU(), 64 | ) 65 | # self.connected = nn.Sequential( 66 | # nn.Linear(class_num * 88, 128), 67 | # nn.ReLU(), 68 | # ) 69 | # 70 | 71 | def forward(self, x): 72 | keep_features = list() 73 | for i, layer in enumerate(self.backbone.children()): 74 | x = layer(x) 75 | if i in [2, 6, 13, 22]: 76 | keep_features.append(x) 77 | 78 | global_context = list() 79 | # keep_features: [bs,64,22,92] [bs,128,20,90] [bs,256,18,44] [bs,68,4,18] 80 | for i, f in enumerate(keep_features): 81 | if i in [0, 1]: 82 | # [bs,64,22,92] -> [bs,64,4,18] 83 | # [bs,128,20,90] -> [bs,128,4,18] 84 | f = nn.AvgPool2d(kernel_size=5, stride=5)(f) 85 | if i in [2]: 86 | # [bs,256,18,44] -> [bs,256,4,18] 87 | f = nn.AvgPool2d(kernel_size=(4, 10), stride=(4, 2))(f) 88 | 89 | # 没看懂这是在干嘛?有上面的avg提取上下文信息不久可以了? 90 | f_pow = torch.pow(f, 2) # [bs,64,4,18] 所有元素求平方 91 | f_mean = torch.mean(f_pow) # 1 所有元素求平均 92 | f = torch.div(f, f_mean) # [bs,64,4,18] 所有元素除以这个均值 93 | global_context.append(f) 94 | 95 | x = torch.cat(global_context, 1) # [bs,516,4,18] 96 | x = self.container(x) # -> [bs, 68, 4, 18] head头 97 | logits = torch.mean(x, dim=2) # -> [bs, 68, 18] # 68 字符类别数 18字符序列长度 98 | 99 | return logits 100 | 101 | 102 | 103 | 104 | # https://blog.csdn.net/weixin_39027619/article/details/106143755 105 | # def forward(self, x): 106 | # x = self.backbone(x) 107 | # pattern = x.flatten(1, -1) 108 | # pattern = self.connected(pattern) 109 | # width = x.size()[-1] 110 | # pattern = torch.reshape(pattern, [-1, 128, 1, 1]) 111 | # pattern = pattern.repeat(1, 1, 1, width) 112 | # x = torch.cat([x, pattern], dim=1) 113 | # x = self.container(x) 114 | # logits = x.squeeze(2) 115 | # return logits 116 | 117 | 118 | # def build_lprnet(lpr_max_len=8, phase=False, class_num=66, dropout_rate=0.5): 119 | # 120 | # Net = LPRNet(lpr_max_len, phase, class_num, dropout_rate) 121 | # 122 | # if phase == "train": 123 | # return Net.train() 124 | # else: 125 | # return Net.eval() 126 | -------------------------------------------------------------------------------- /models/common.py: -------------------------------------------------------------------------------- 1 | # This file contains modules common to various models 2 | 3 | from utils.utils import * 4 | 5 | 6 | def autopad(k, p=None): # kernel, padding 7 | # Pad to 'same' 8 | if p is None: 9 | p = k // 2 if isinstance(k, int) else [x // 2 for x in k] # auto-pad 10 | return p 11 | 12 | 13 | def DWConv(c1, c2, k=1, s=1, act=True): 14 | # Depthwise convolution 15 | return Conv(c1, c2, k, s, g=math.gcd(c1, c2), act=act) 16 | 17 | 18 | class Conv(nn.Module): 19 | # Standard convolution 20 | def __init__(self, c1, c2, k=1, s=1, p=None, g=1, act=True): # ch_in, ch_out, kernel, stride, padding, groups 21 | super(Conv, self).__init__() 22 | self.conv = nn.Conv2d(c1, c2, k, s, autopad(k, p), groups=g, bias=False) 23 | self.bn = nn.BatchNorm2d(c2) 24 | self.act = nn.LeakyReLU(0.1, inplace=True) 
if act else nn.Identity() 25 | 26 | def forward(self, x): 27 | return self.act(self.bn(self.conv(x))) 28 | 29 | def fuseforward(self, x): 30 | return self.act(self.conv(x)) 31 | 32 | 33 | class Bottleneck(nn.Module): 34 | # Standard bottleneck 35 | def __init__(self, c1, c2, shortcut=True, g=1, e=0.5): # ch_in, ch_out, shortcut, groups, expansion 36 | super(Bottleneck, self).__init__() 37 | c_ = int(c2 * e) # hidden channels 38 | self.cv1 = Conv(c1, c_, 1, 1) 39 | self.cv2 = Conv(c_, c2, 3, 1, g=g) 40 | self.add = shortcut and c1 == c2 41 | 42 | def forward(self, x): 43 | return x + self.cv2(self.cv1(x)) if self.add else self.cv2(self.cv1(x)) 44 | 45 | 46 | class BottleneckCSP(nn.Module): 47 | # CSP Bottleneck https://github.com/WongKinYiu/CrossStagePartialNetworks 48 | def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5): # ch_in, ch_out, number, shortcut, groups, expansion 49 | super(BottleneckCSP, self).__init__() 50 | c_ = int(c2 * e) # hidden channels 51 | self.cv1 = Conv(c1, c_, 1, 1) 52 | self.cv2 = nn.Conv2d(c1, c_, 1, 1, bias=False) 53 | self.cv3 = nn.Conv2d(c_, c_, 1, 1, bias=False) 54 | self.cv4 = Conv(2 * c_, c2, 1, 1) 55 | self.bn = nn.BatchNorm2d(2 * c_) # applied to cat(cv2, cv3) 56 | self.act = nn.LeakyReLU(0.1, inplace=True) 57 | self.m = nn.Sequential(*[Bottleneck(c_, c_, shortcut, g, e=1.0) for _ in range(n)]) 58 | 59 | def forward(self, x): 60 | y1 = self.cv3(self.m(self.cv1(x))) 61 | y2 = self.cv2(x) 62 | return self.cv4(self.act(self.bn(torch.cat((y1, y2), dim=1)))) 63 | 64 | 65 | class SPP(nn.Module): 66 | # Spatial pyramid pooling layer used in YOLOv3-SPP 67 | def __init__(self, c1, c2, k=(5, 9, 13)): 68 | super(SPP, self).__init__() 69 | c_ = c1 // 2 # hidden channels 70 | self.cv1 = Conv(c1, c_, 1, 1) 71 | self.cv2 = Conv(c_ * (len(k) + 1), c2, 1, 1) 72 | self.m = nn.ModuleList([nn.MaxPool2d(kernel_size=x, stride=1, padding=x // 2) for x in k]) 73 | 74 | def forward(self, x): 75 | x = self.cv1(x) 76 | return self.cv2(torch.cat([x] + [m(x) for m in self.m], 1)) 77 | 78 | 79 | class Flatten(nn.Module): 80 | # Use after nn.AdaptiveAvgPool2d(1) to remove last 2 dimensions 81 | def forward(self, x): 82 | return x.view(x.size(0), -1) 83 | 84 | 85 | class Focus(nn.Module): 86 | # Focus wh information into c-space 87 | def __init__(self, c1, c2, k=1, s=1, p=None, g=1, act=True): # ch_in, ch_out, kernel, stride, padding, groups 88 | super(Focus, self).__init__() 89 | self.conv = Conv(c1 * 4, c2, k, s, p, g, act) 90 | 91 | def forward(self, x): # x(b,c,w,h) -> y(b,4c,w/2,h/2) 92 | return self.conv(torch.cat([x[..., ::2, ::2], x[..., 1::2, ::2], x[..., ::2, 1::2], x[..., 1::2, 1::2]], 1)) 93 | 94 | 95 | class Concat(nn.Module): 96 | # Concatenate a list of tensors along dimension 97 | def __init__(self, dimension=1): 98 | super(Concat, self).__init__() 99 | self.d = dimension 100 | 101 | def forward(self, x): 102 | return torch.cat(x, self.d) 103 | -------------------------------------------------------------------------------- /models/experimental.py: -------------------------------------------------------------------------------- 1 | # This file contains experimental modules 2 | 3 | from models.common import * 4 | from utils import google_utils 5 | 6 | 7 | class CrossConv(nn.Module): 8 | # Cross Convolution Downsample 9 | def __init__(self, c1, c2, k=3, s=1, g=1, e=1.0, shortcut=False): 10 | # ch_in, ch_out, kernel, stride, groups, expansion, shortcut 11 | super(CrossConv, self).__init__() 12 | c_ = int(c2 * e) # hidden channels 13 | self.cv1 = Conv(c1, c_, (1, k), 
(1, s)) 14 | self.cv2 = Conv(c_, c2, (k, 1), (s, 1), g=g) 15 | self.add = shortcut and c1 == c2 16 | 17 | def forward(self, x): 18 | return x + self.cv2(self.cv1(x)) if self.add else self.cv2(self.cv1(x)) 19 | 20 | 21 | class C3(nn.Module): 22 | # Cross Convolution CSP 23 | def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5): # ch_in, ch_out, number, shortcut, groups, expansion 24 | super(C3, self).__init__() 25 | c_ = int(c2 * e) # hidden channels 26 | self.cv1 = Conv(c1, c_, 1, 1) 27 | self.cv2 = nn.Conv2d(c1, c_, 1, 1, bias=False) 28 | self.cv3 = nn.Conv2d(c_, c_, 1, 1, bias=False) 29 | self.cv4 = Conv(2 * c_, c2, 1, 1) 30 | self.bn = nn.BatchNorm2d(2 * c_) # applied to cat(cv2, cv3) 31 | self.act = nn.LeakyReLU(0.1, inplace=True) 32 | self.m = nn.Sequential(*[CrossConv(c_, c_, 3, 1, g, 1.0, shortcut) for _ in range(n)]) 33 | 34 | def forward(self, x): 35 | y1 = self.cv3(self.m(self.cv1(x))) 36 | y2 = self.cv2(x) 37 | return self.cv4(self.act(self.bn(torch.cat((y1, y2), dim=1)))) 38 | 39 | 40 | class Sum(nn.Module): 41 | # Weighted sum of 2 or more layers https://arxiv.org/abs/1911.09070 42 | def __init__(self, n, weight=False): # n: number of inputs 43 | super(Sum, self).__init__() 44 | self.weight = weight # apply weights boolean 45 | self.iter = range(n - 1) # iter object 46 | if weight: 47 | self.w = nn.Parameter(-torch.arange(1., n) / 2, requires_grad=True) # layer weights 48 | 49 | def forward(self, x): 50 | y = x[0] # no weight 51 | if self.weight: 52 | w = torch.sigmoid(self.w) * 2 53 | for i in self.iter: 54 | y = y + x[i + 1] * w[i] 55 | else: 56 | for i in self.iter: 57 | y = y + x[i + 1] 58 | return y 59 | 60 | 61 | class GhostConv(nn.Module): 62 | # Ghost Convolution https://github.com/huawei-noah/ghostnet 63 | def __init__(self, c1, c2, k=1, s=1, g=1, act=True): # ch_in, ch_out, kernel, stride, groups 64 | super(GhostConv, self).__init__() 65 | c_ = c2 // 2 # hidden channels 66 | self.cv1 = Conv(c1, c_, k, s, g, act) 67 | self.cv2 = Conv(c_, c_, 5, 1, c_, act) 68 | 69 | def forward(self, x): 70 | y = self.cv1(x) 71 | return torch.cat([y, self.cv2(y)], 1) 72 | 73 | 74 | class GhostBottleneck(nn.Module): 75 | # Ghost Bottleneck https://github.com/huawei-noah/ghostnet 76 | def __init__(self, c1, c2, k, s): 77 | super(GhostBottleneck, self).__init__() 78 | c_ = c2 // 2 79 | self.conv = nn.Sequential(GhostConv(c1, c_, 1, 1), # pw 80 | DWConv(c_, c_, k, s, act=False) if s == 2 else nn.Identity(), # dw 81 | GhostConv(c_, c2, 1, 1, act=False)) # pw-linear 82 | self.shortcut = nn.Sequential(DWConv(c1, c1, k, s, act=False), 83 | Conv(c1, c2, 1, 1, act=False)) if s == 2 else nn.Identity() 84 | 85 | def forward(self, x): 86 | return self.conv(x) + self.shortcut(x) 87 | 88 | 89 | class MixConv2d(nn.Module): 90 | # Mixed Depthwise Conv https://arxiv.org/abs/1907.09595 91 | def __init__(self, c1, c2, k=(1, 3), s=1, equal_ch=True): 92 | super(MixConv2d, self).__init__() 93 | groups = len(k) 94 | if equal_ch: # equal c_ per group 95 | i = torch.linspace(0, groups - 1E-6, c2).floor() # c2 indices 96 | c_ = [(i == g).sum() for g in range(groups)] # intermediate channels 97 | else: # equal weight.numel() per group 98 | b = [c2] + [0] * groups 99 | a = np.eye(groups + 1, groups, k=-1) 100 | a -= np.roll(a, 1, axis=1) 101 | a *= np.array(k) ** 2 102 | a[0] = 1 103 | c_ = np.linalg.lstsq(a, b, rcond=None)[0].round() # solve for equal weight indices, ax = b 104 | 105 | self.m = nn.ModuleList([nn.Conv2d(c1, int(c_[g]), k[g], s, k[g] // 2, bias=False) for g in range(groups)]) 106 | self.bn = 
nn.BatchNorm2d(c2) 107 | self.act = nn.LeakyReLU(0.1, inplace=True) 108 | 109 | def forward(self, x): 110 | return x + self.act(self.bn(torch.cat([m(x) for m in self.m], 1))) 111 | 112 | 113 | class Ensemble(nn.ModuleList): 114 | # Ensemble of models 115 | def __init__(self): 116 | super(Ensemble, self).__init__() 117 | 118 | def forward(self, x, augment=False): 119 | y = [] 120 | for module in self: 121 | y.append(module(x, augment)[0]) 122 | # y = torch.stack(y).max(0)[0] # max ensemble 123 | # y = torch.cat(y, 1) # nms ensemble 124 | y = torch.stack(y).mean(0) # mean ensemble 125 | return y, None # demo, train rec_result 126 | 127 | 128 | def attempt_load(weights, map_location=None): 129 | # Loads an ensemble of models weights=[a,b,c] or a single model weights=[a] or weights=a 130 | model = Ensemble() 131 | for w in weights if isinstance(weights, list) else [weights]: 132 | google_utils.attempt_download(w) 133 | model.append(torch.load(w, map_location=map_location)['model'].float().fuse().eval()) # load FP32 model 134 | 135 | if len(model) == 1: 136 | return model[-1] # return model 137 | else: 138 | print('Ensemble created with %s\n' % weights) 139 | for k in ['names', 'stride']: 140 | setattr(model, k, getattr(model[-1], k)) 141 | return model # return ensemble 142 | -------------------------------------------------------------------------------- /models/yolo.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | 3 | from models.experimental import * 4 | 5 | 6 | class Detect(nn.Module): 7 | def __init__(self, nc=80, anchors=()): # detection layer 8 | super(Detect, self).__init__() 9 | self.stride = None # strides computed during build 10 | self.nc = nc # number of classes 11 | self.no = nc + 5 # number of outputs per anchor 12 | self.nl = len(anchors) # number of detection layers 13 | self.na = len(anchors[0]) // 2 # number of anchors 14 | self.grid = [torch.zeros(1)] * self.nl # init grid 15 | a = torch.tensor(anchors).float().view(self.nl, -1, 2) 16 | self.register_buffer('anchors', a) # shape(nl,na,2) 17 | self.register_buffer('anchor_grid', a.clone().view(self.nl, 1, -1, 1, 1, 2)) # shape(nl,1,na,1,1,2) 18 | self.export = False # onnx export 19 | 20 | def forward(self, x): 21 | # x = x.copy() # for profiling 22 | z = [] # demo rec_result 23 | self.training |= self.export 24 | for i in range(self.nl): 25 | bs, _, ny, nx = x[i].shape # x(bs,255,20,20) to x(bs,3,20,20,85) 26 | x[i] = x[i].view(bs, self.na, self.no, ny, nx).permute(0, 1, 3, 4, 2).contiguous() 27 | 28 | if not self.training: # demo 29 | if self.grid[i].shape[2:4] != x[i].shape[2:4]: 30 | self.grid[i] = self._make_grid(nx, ny).to(x[i].device) 31 | 32 | y = x[i].sigmoid() 33 | y[..., 0:2] = (y[..., 0:2] * 2. 
- 0.5 + self.grid[i].to(x[i])) * self.stride[i] # xy 34 | y[..., 2:4] = (y[..., 2:4] * 2) ** 2 * self.anchor_grid[i] # wh 35 | z.append(y.view(bs, -1, self.no)) 36 | 37 | return x if self.training else (torch.cat(z, 1), x) 38 | 39 | @staticmethod 40 | def _make_grid(nx=20, ny=20): 41 | yv, xv = torch.meshgrid([torch.arange(ny), torch.arange(nx)]) 42 | return torch.stack((xv, yv), 2).view((1, 1, ny, nx, 2)).float() 43 | 44 | 45 | class Model(nn.Module): 46 | def __init__(self, model_cfg='yolov5s.yaml', ch=3, nc=None): # model, input channels, number of classes 47 | super(Model, self).__init__() 48 | if type(model_cfg) is dict: 49 | self.md = model_cfg # model dict 50 | else: # is *.yaml 51 | import yaml # for torch hub 52 | with open(model_cfg) as f: 53 | self.md = yaml.load(f, Loader=yaml.FullLoader) # model dict 54 | 55 | # Define model 56 | if nc and nc != self.md['nc']: 57 | print('Overriding %s nc=%g with nc=%g' % (model_cfg, self.md['nc'], nc)) 58 | self.md['nc'] = nc # override yaml value 59 | self.model, self.save = parse_model(self.md, ch=[ch]) # model, savelist, ch_out 60 | # print([x.shape for x in self.forward(torch.zeros(1, ch, 64, 64))]) 61 | 62 | # Build strides, anchors 63 | m = self.model[-1] # Detect() 64 | if isinstance(m, Detect): 65 | s = 128 # 2x min stride 66 | m.stride = torch.tensor([s / x.shape[-2] for x in self.forward(torch.zeros(1, ch, s, s))]) # forward 67 | m.anchors /= m.stride.view(-1, 1, 1) 68 | check_anchor_order(m) 69 | self.stride = m.stride 70 | self._initialize_biases() # only run once 71 | # print('Strides: %s' % m.stride.tolist()) 72 | 73 | # Init weights, biases 74 | torch_utils.initialize_weights(self) 75 | self._initialize_biases() # only run once 76 | torch_utils.model_info(self) 77 | print('') 78 | 79 | def forward(self, x, augment=False, profile=False): 80 | if augment: 81 | img_size = x.shape[-2:] # height, width 82 | s = [0.83, 0.67] # scales 83 | y = [] 84 | for i, xi in enumerate((x, 85 | torch_utils.scale_img(x.flip(3), s[0]), # flip-lr and scale 86 | torch_utils.scale_img(x, s[1]), # scale 87 | )): 88 | # cv2.imwrite('img%g.jpg' % i, 255 * xi[0].numpy().transpose((1, 2, 0))[:, :, ::-1]) 89 | y.append(self.forward_once(xi)[0]) 90 | 91 | y[1][..., :4] /= s[0] # scale 92 | y[1][..., 0] = img_size[1] - y[1][..., 0] # flip lr 93 | y[2][..., :4] /= s[1] # scale 94 | return torch.cat(y, 1), None # augmented demo, train 95 | else: 96 | return self.forward_once(x, profile) # single-scale demo, train 97 | 98 | def forward_once(self, x, profile=False): 99 | y, dt = [], [] # outputs 100 | for m in self.model: 101 | if m.f != -1: # if not from previous layer 102 | x = y[m.f] if isinstance(m.f, int) else [x if j == -1 else y[j] for j in m.f] # from earlier layers 103 | 104 | if profile: 105 | try: 106 | import thop 107 | o = thop.profile(m, inputs=(x,), verbose=False)[0] / 1E9 * 2 # FLOPS 108 | except: 109 | o = 0 110 | t = torch_utils.time_synchronized() 111 | for _ in range(10): 112 | _ = m(x) 113 | dt.append((torch_utils.time_synchronized() - t) * 100) 114 | print('%10.1f%10.0f%10.1fms %-40s' % (o, m.np, dt[-1], m.type)) 115 | 116 | x = m(x) # run 117 | y.append(x if m.i in self.save else None) # save rec_result 118 | 119 | if profile: 120 | print('%.1fms total' % sum(dt)) 121 | return x 122 | 123 | def _initialize_biases(self, cf=None): # initialize biases into Detect(), cf is class frequency 124 | # cf = torch.bincount(torch.tensor(np.concatenate(dataset.labels, 0)[:, 0]).long(), minlength=nc) + 1. 
125 | m = self.model[-1] # Detect() module 126 | for f, s in zip(m.f, m.stride): #  from 127 | mi = self.model[f % m.i] 128 | b = mi.bias.view(m.na, -1) # conv.bias(255) to (3,85) 129 | b[:, 4] += math.log(8 / (640 / s) ** 2) # obj (8 objects per 640 image) 130 | b[:, 5:] += math.log(0.6 / (m.nc - 0.99)) if cf is None else torch.log(cf / cf.sum()) # cls 131 | mi.bias = torch.nn.Parameter(b.view(-1), requires_grad=True) 132 | 133 | def _print_biases(self): 134 | m = self.model[-1] # Detect() module 135 | for f in sorted([x % m.i for x in m.f]): #  from 136 | b = self.model[f].bias.detach().view(m.na, -1).T # conv.bias(255) to (3,85) 137 | print(('%g Conv2d.bias:' + '%10.3g' * 6) % (f, *b[:5].mean(1).tolist(), b[5:].mean())) 138 | 139 | # def _print_weights(self): 140 | # for m in self.model.modules(): 141 | # if type(m) is Bottleneck: 142 | # print('%10.3g' % (m.w.detach().sigmoid() * 2)) # shortcut weights 143 | 144 | def fuse(self): # fuse model Conv2d() + BatchNorm2d() layers 145 | print('Fusing layers... ', end='') 146 | for m in self.model.modules(): 147 | if type(m) is Conv: 148 | m.conv = torch_utils.fuse_conv_and_bn(m.conv, m.bn) # update conv 149 | m.bn = None # remove batchnorm 150 | m.forward = m.fuseforward # update forward 151 | torch_utils.model_info(self) 152 | return self 153 | 154 | def parse_model(md, ch): # model_dict, input_channels(3) 155 | print('\n%3s%18s%3s%10s %-40s%-30s' % ('', 'from', 'n', 'params', 'module', 'arguments')) 156 | anchors, nc, gd, gw = md['anchors'], md['nc'], md['depth_multiple'], md['width_multiple'] 157 | na = (len(anchors[0]) // 2) # number of anchors 158 | no = na * (nc + 5) # number of outputs = anchors * (classes + 5) 159 | 160 | layers, save, c2 = [], [], ch[-1] # layers, savelist, ch out 161 | for i, (f, n, m, args) in enumerate(md['backbone'] + md['head']): # from, number, module, args 162 | m = eval(m) if isinstance(m, str) else m # eval strings 163 | for j, a in enumerate(args): 164 | try: 165 | args[j] = eval(a) if isinstance(a, str) else a # eval strings 166 | except: 167 | pass 168 | 169 | n = max(round(n * gd), 1) if n > 1 else n # depth gain 170 | if m in [nn.Conv2d, Conv, Bottleneck, SPP, DWConv, MixConv2d, Focus, CrossConv, BottleneckCSP, C3]: 171 | c1, c2 = ch[f], args[0] 172 | 173 | # Normal 174 | # if i > 0 and args[0] != no: # channel expansion factor 175 | # ex = 1.75 # exponential (default 2.0) 176 | # e = math.log(c2 / ch[1]) / math.log(2) 177 | # c2 = int(ch[1] * ex ** e) 178 | # if m != Focus: 179 | c2 = make_divisible(c2 * gw, 8) if c2 != no else c2 180 | 181 | # Experimental 182 | # if i > 0 and args[0] != no: # channel expansion factor 183 | # ex = 1 + gw # exponential (default 2.0) 184 | # ch1 = 32 # ch[1] 185 | # e = math.log(c2 / ch1) / math.log(2) # level 1-n 186 | # c2 = int(ch1 * ex ** e) 187 | # if m != Focus: 188 | # c2 = make_divisible(c2, 8) if c2 != no else c2 189 | 190 | args = [c1, c2, *args[1:]] 191 | if m in [BottleneckCSP, C3]: 192 | args.insert(2, n) 193 | n = 1 194 | elif m is nn.BatchNorm2d: 195 | args = [ch[f]] 196 | elif m is Concat: 197 | c2 = sum([ch[-1 if x == -1 else x + 1] for x in f]) 198 | elif m is Detect: 199 | f = f or list(reversed([(-1 if j == i else j - 1) for j, x in enumerate(ch) if x == no])) 200 | else: 201 | c2 = ch[f] 202 | 203 | m_ = nn.Sequential(*[m(*args) for _ in range(n)]) if n > 1 else m(*args) # module 204 | t = str(m)[8:-2].replace('__main__.', '') # module type 205 | np = sum([x.numel() for x in m_.parameters()]) # number params 206 | m_.i, m_.f, m_.type, m_.np = i, f, 
t, np # attach index, 'from' index, type, number params 207 | print('%3s%18s%3s%10.0f %-40s%-30s' % (i, f, n, np, t, args)) # print 208 | save.extend(x % i for x in ([f] if isinstance(f, int) else f) if x != -1) # append to savelist 209 | layers.append(m_) 210 | ch.append(c2) 211 | return nn.Sequential(*layers), sorted(save) 212 | 213 | 214 | if __name__ == '__main__': 215 | parser = argparse.ArgumentParser() 216 | parser.add_argument('--cfg', type=str, default='yolov5s.yaml', help='model.yaml') 217 | parser.add_argument('--device', default='', help='cuda device, i.e. 0 or 0,1,2,3 or cpu') 218 | opt = parser.parse_args() 219 | opt.cfg = check_file(opt.cfg) # check file 220 | device = torch_utils.select_device(opt.device) 221 | 222 | # Create model 223 | model = Model(opt.cfg).to(device) 224 | model.train() 225 | 226 | # Profile 227 | # img = torch.rand(8 if torch.cuda.is_available() else 1, 3, 640, 640).to(device) 228 | # y = model(img, profile=True) 229 | 230 | # ONNX export 231 | # model.model[-1].export = True 232 | # torch.onnx.export(model, img, opt.cfg.replace('.yaml', '.onnx'), verbose=True, opset_version=11) 233 | 234 | # Tensorboard 235 | # from torch.utils.tensorboard import SummaryWriter 236 | # tb_writer = SummaryWriter() 237 | # print("Run 'tensorboard --logdir=models/runs' to view tensorboard at http://localhost:6006/") 238 | # tb_writer.add_graph(model.model, img) # add model to tensorboard 239 | # tb_writer.add_image('test', img[0], dataformats='CWH') # add model to tensorboard 240 | -------------------------------------------------------------------------------- /models/yolov5s.yaml: -------------------------------------------------------------------------------- 1 | # parameters 2 | nc: 80 # number of classes 3 | depth_multiple: 0.33 # model depth multiple 4 | width_multiple: 0.50 # layer channel multiple 5 | # , 6 | # anchors 7 | anchors: 8 | - [188,58, 250,72, 372,148] # P5/32 9 | - [130,39, 107,62, 157,46] # P4/16 10 | - [54,18, 72,29, 105,30] # P3/8 11 | 12 | # YOLOv5 backbone 13 | backbone: 14 | # [from, number, module, args] 15 | [[-1, 1, Focus, [64, 3]], # 0-P1/2 16 | [-1, 1, Conv, [128, 3, 2]], # 1-P2/4 17 | [-1, 3, BottleneckCSP, [128]], 18 | [-1, 1, Conv, [256, 3, 2]], # 3-P3/8 19 | [-1, 9, BottleneckCSP, [256]], 20 | [-1, 1, Conv, [512, 3, 2]], # 5-P4/16 21 | [-1, 9, BottleneckCSP, [512]], 22 | [-1, 1, Conv, [1024, 3, 2]], # 7-P5/32 23 | [-1, 1, SPP, [1024, [5, 9, 13]]], 24 | ] 25 | 26 | # YOLOv5 head 27 | head: 28 | [[-1, 3, BottleneckCSP, [1024, False]], # 9 29 | 30 | [-1, 1, Conv, [512, 1, 1]], 31 | [-1, 1, nn.Upsample, [None, 2, 'nearest']], 32 | [[-1, 6], 1, Concat, [1]], # cat backbone P4 33 | [-1, 3, BottleneckCSP, [512, False]], # 13 34 | 35 | [-1, 1, Conv, [256, 1, 1]], 36 | [-1, 1, nn.Upsample, [None, 2, 'nearest']], 37 | [[-1, 4], 1, Concat, [1]], # cat backbone P3 38 | [-1, 3, BottleneckCSP, [256, False]], 39 | [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1]], # 18 (P3/8-small) 40 | 41 | [-2, 1, Conv, [256, 3, 2]], 42 | [[-1, 14], 1, Concat, [1]], # cat head P4 43 | [-1, 3, BottleneckCSP, [512, False]], 44 | [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1]], # 22 (P4/16-medium) 45 | 46 | [-2, 1, Conv, [512, 3, 2]], 47 | [[-1, 10], 1, Concat, [1]], # cat head P5 48 | [-1, 3, BottleneckCSP, [1024, False]], 49 | [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1]], # 26 (P5/32-large) 50 | 51 | [[], 1, Detect, [nc, anchors]], # Detect(P5, P4, P3) 52 | ] 53 | -------------------------------------------------------------------------------- /requirements.txt: 
-------------------------------------------------------------------------------- 1 | # pip install -r requirements.txt 2 | 3 | # Base ---------------------------------------- 4 | matplotlib>=3.2.2 5 | numpy>=1.18.5 6 | opencv-python>=4.1.2 7 | Pillow>=7.1.2 8 | PyYAML>=5.3.1 9 | requests>=2.23.0 10 | scipy>=1.4.1 11 | torch>=1.7.0 12 | torchvision>=0.8.1 13 | tqdm>=4.41.0 14 | 15 | # Logging ------------------------------------- 16 | tensorboard>=2.4.1 17 | # wandb 18 | 19 | # Plotting ------------------------------------ 20 | pandas>=1.1.4 21 | seaborn>=0.11.0 22 | 23 | # Export -------------------------------------- 24 | # coremltools>=4.1 # CoreML export 25 | # onnx>=1.9.0 # ONNX export 26 | # onnx-simplifier>=0.3.6 # ONNX simplifier 27 | # scikit-learn==0.19.2 # CoreML quantization 28 | # tensorflow>=2.4.1 # TFLite export 29 | # tensorflowjs>=3.9.0 # TF.js export 30 | 31 | # Extras -------------------------------------- 32 | # albumentations>=1.0.3 33 | # Cython # for pycocotools https://github.com/cocodataset/cocoapi/issues/172 34 | # pycocotools>=2.0 # COCO mAP 35 | # roboflow 36 | thop # FLOPs computation 37 | -------------------------------------------------------------------------------- /tools/ccpd2lpr.py: -------------------------------------------------------------------------------- 1 | """ 2 | @Author: HuKai 3 | @Date: 2022/5/29 21:24 4 | @github: https://github.com/HuKai97 5 | """ 6 | import cv2 7 | import os 8 | import numpy as np 9 | 10 | # 参考 https://blog.csdn.net/qq_36516958/article/details/114274778 11 | # https://github.com/ultralytics/yolov5/wiki/Train-Custom-Data#2-create-labels 12 | from PIL import Image 13 | # CCPD车牌有重复,应该是不同角度或者模糊程度 14 | path = r'K:\MyProject\datasets\ccpd\new\ccpd_2019\images\test' # 改成自己的车牌路径 15 | 16 | 17 | provinces = ["皖", "沪", "津", "渝", "冀", "晋", "蒙", "辽", "吉", "黑", "苏", "浙", "京", "闽", "赣", "鲁", "豫", "鄂", "湘", "粤", "桂", "琼", "川", "贵", "云", "藏", "陕", "甘", "青", "宁", "新", "警", "学", "O"] 18 | alphabets = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'J', 'K', 'L', 'M', 'N', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 19 | 'X', 'Y', 'Z', 'O'] 20 | ads = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'J', 'K', 'L', 'M', 'N', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 21 | 'Y', 'Z', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'O'] 22 | num = 0 23 | for filename in os.listdir(path): 24 | num += 1 25 | result = "" 26 | _, _, box, points, plate, brightness, blurriness = filename.split('-') 27 | list_plate = plate.split('_') # 读取车牌 28 | result += provinces[int(list_plate[0])] 29 | result += alphabets[int(list_plate[1])] 30 | result += ads[int(list_plate[2])] + ads[int(list_plate[3])] + ads[int(list_plate[4])] + ads[int(list_plate[5])] + ads[int(list_plate[6])] 31 | # 新能源车牌的要求,如果不是新能源车牌可以删掉这个if 32 | # if result[2] != 'D' and result[2] != 'F' \ 33 | # and result[-1] != 'D' and result[-1] != 'F': 34 | # print(filename) 35 | # print("Error label, Please check!") 36 | # assert 0, "Error label ^~^!!!" 
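# Worked example (hypothetical plate field, for illustration only) of the index-to-character
# mapping used above: for plate = "10_2_5_22_31_31_27",
#   provinces[10] = '苏', alphabets[2] = 'C',
#   ads[5] = 'F', ads[22] = 'Y', ads[31] = '7', ads[31] = '7', ads[27] = '3'
# so result would be '苏CFY773'. New-energy (green) plates encode 8 characters, so their
# filenames carry one extra index; handling them would presumably also need
# result += ads[int(list_plate[7])] when len(list_plate) == 8.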
37 | print(result) 38 | img_path = os.path.join(path, filename) 39 | img = cv2.imread(img_path) 40 | assert os.path.exists(img_path), "image file {} dose not exist.".format(img_path) 41 | 42 | box = box.split('_') # 车牌边界 43 | box = [list(map(int, i.split('&'))) for i in box] 44 | 45 | xmin = box[0][0] 46 | xmax = box[1][0] 47 | ymin = box[0][1] 48 | ymax = box[1][1] 49 | 50 | img = Image.fromarray(img) 51 | img = img.crop((xmin, ymin, xmax, ymax)) # 裁剪出车牌位置 52 | img = img.resize((94, 24), Image.LANCZOS) 53 | img = np.asarray(img) # 转成array,变成24*94*3 54 | 55 | cv2.imencode('.jpg', img)[1].tofile(r"K:\MyProject\datasets\ccpd\new\ccpd_2019\rec_images\test\{}.jpg".format(result)) 56 | # 图片中文名会报错 57 | # cv2.imwrite(r"K:\MyProject\datasets\ccpd\new\ccpd_2020\rec_images\train\{}.jpg".format(result), img) # 改成自己存放的路径 58 | print("共生成{}张".format(num)) 59 | -------------------------------------------------------------------------------- /tools/ccpd2yolov5.py: -------------------------------------------------------------------------------- 1 | """ 2 | @Author: HuKai 3 | @Date: 2022/5/29 10:47 4 | @github: https://github.com/HuKai97 5 | """ 6 | import shutil 7 | import cv2 8 | import os 9 | 10 | def txt_translate(path, txt_path): 11 | for filename in os.listdir(path): 12 | print(filename) 13 | 14 | list1 = filename.split("-", 3) # 第一次分割,以减号'-'做分割 15 | subname = list1[2] 16 | list2 = filename.split(".", 1) 17 | subname1 = list2[1] 18 | if subname1 == 'txt': 19 | continue 20 | lt, rb = subname.split("_", 1) # 第二次分割,以下划线'_'做分割 21 | lx, ly = lt.split("&", 1) 22 | rx, ry = rb.split("&", 1) 23 | width = int(rx) - int(lx) 24 | height = int(ry) - int(ly) # bounding box的宽和高 25 | cx = float(lx) + width / 2 26 | cy = float(ly) + height / 2 # bounding box中心点 27 | 28 | img = cv2.imread(path + filename) 29 | if img is None: # 自动删除失效图片(下载过程有的图片会存在无法读取的情况) 30 | os.remove(os.path.join(path, filename)) 31 | continue 32 | width = width / img.shape[1] 33 | height = height / img.shape[0] 34 | cx = cx / img.shape[1] 35 | cy = cy / img.shape[0] 36 | 37 | txtname = filename.split(".", 1) 38 | txtfile = txt_path + txtname[0] + ".txt" 39 | # 绿牌是第0类,蓝牌是第1类 40 | with open(txtfile, "w") as f: 41 | f.write(str(0) + " " + str(cx) + " " + str(cy) + " " + str(width) + " " + str(height)) 42 | 43 | 44 | if __name__ == '__main__': 45 | # det图片存储地址 46 | trainDir = r"K:\MyProject\datasets\ccpd\new\ccpd_2019\images\train\\" 47 | validDir = r"K:\MyProject\datasets\ccpd\new\ccpd_2019\images\val\\" 48 | testDir = r"K:\MyProject\datasets\ccpd\new\ccpd_2019\images\test\\" 49 | # det txt存储地址 50 | train_txt_path = r"K:\MyProject\datasets\ccpd\new\ccpd_2019\labels\train\\" 51 | val_txt_path = r"K:\MyProject\datasets\ccpd\new\ccpd_2019\labels\val\\" 52 | test_txt_path = r"K:\MyProject\datasets\ccpd\new\ccpd_2019\labels\test\\" 53 | txt_translate(trainDir, train_txt_path) 54 | txt_translate(validDir, val_txt_path) 55 | txt_translate(testDir, test_txt_path) -------------------------------------------------------------------------------- /tools/detect_yolov5.py: -------------------------------------------------------------------------------- 1 | """Run inference with a YOLOv5 model on images, videos, directories, streams 2 | 3 | Usage: 4 | $ python path/to/detect.py --source path/to/img.jpg --weights yolov5s.pt --img 640 5 | """ 6 | 7 | import argparse # python的命令行解析的标准模块 可以让我们直接在命令行中就可以向程序中传入参数并让程序运行 8 | import sys # sys系统模块 包含了与Python解释器和它的环境有关的函数 9 | import time # 时间模块 更底层 10 | from pathlib import Path # Path将str转换为Path对象 使字符串路径易于操作的模块 11 | 12 | 
import cv2 # opencv模块 13 | import torch # pytorch模块 14 | import torch.backends.cudnn as cudnn # cuda模块 15 | 16 | FILE = Path(__file__).absolute() # FILE = WindowsPath 'F:\yolo_v5\yolov5-U\detect.py' 17 | # 将'F:/yolo_v5/yolov5-U'加入系统的环境变量 该脚本结束后失效 18 | sys.path.append(FILE.parents[0].as_posix()) # add yolov5-U/ to path 19 | 20 | # ----------------- 导入自定义的其他包 ------------------- 21 | from models.experimental import attempt_load 22 | from utils.datasets import LoadStreams, LoadImages 23 | from utils.general import check_img_size, check_requirements, check_imshow, colorstr, non_max_suppression, \ 24 | apply_classifier, scale_coords, xyxy2xywh, strip_optimizer, set_logging, increment_path, save_one_box 25 | from utils.plots import colors, plot_one_box 26 | from utils.torch_utils import select_device, load_classifier, time_synchronized, model_info, prune 27 | 28 | 29 | @torch.no_grad() 30 | def run(weights='weights/yolov5s.pt', # 权重文件地址 默认 weights/yolov5.pt 31 | source='data/images', # 测试数据文件(图片或视频)的保存路径 默认data/images 32 | imgsz=640, # 输入图片的大小 默认640(pixels) 33 | conf_thres=0.25, # object置信度阈值 默认0.25 用在nms中 34 | iou_thres=0.45, # 做nms的iou阈值 默认0.45 用在nms中 35 | max_det=1000, # 每张图片最多的目标数量 用在nms中 36 | device='', # 设置代码执行的设备 cuda device, i.e. 0 or 0,1,2,3 or cpu 37 | view_img=False, # 是否展示预测之后的图片或视频 默认False 38 | save_txt=False, # 是否将预测的框坐标以txt文件格式保存 默认True 会在runs/detect/expn/labels下生成每张图片预测的txt文件 39 | save_conf=False, # 是否保存预测每个目标的置信度到预测tx文件中 默认True 40 | save_crop=False, # 是否需要将预测到的目标从原图中扣出来 剪切好 并保存 会在runs/detect/expn下生成crops文件,将剪切的图片保存在里面 默认False 41 | nosave=False, # 是否不要保存预测后的图片 默认False 就是默认要保存预测后的图片 42 | classes=None, # 在nms中是否是只保留某些特定的类 默认是None 就是所有类只要满足条件都可以保留 43 | agnostic_nms=False, # 进行nms是否也除去不同类别之间的框 默认False 44 | augment=False, # 预测是否也要采用数据增强 TTA 默认False 45 | update=False, # 是否将optimizer从ckpt中删除 更新模型 默认False 46 | project='runs/detect', # 当前测试结果放在哪个主文件夹下 默认runs/detect 47 | name='exp', # 当前测试结果放在run/detect下的文件名 默认是exp => run/detect/exp 48 | exist_ok=False, # 是否存在当前文件 默认False 一般是 no exist-ok 连用 所以一般都要重新创建文件夹 49 | line_thickness=3, # bounding box thickness (pixels) 画框的框框的线宽 默认是 3 50 | hide_labels=False, # 画出的框框是否需要隐藏label信息 默认False 51 | hide_conf=False, # 画出的框框是否需要隐藏conf信息 默认False 52 | half=False, # 是否使用半精度 Float16 推理 可以缩短推理时间 但是默认是False 53 | prune_model=False, # 是否使用模型剪枝 进行推理加速 54 | fuse=False, # 是否使用conv + bn融合技术 进行推理加速 55 | ): 56 | 57 | # ===================================== 1、初始化一些配置 ===================================== 58 | # 是否保存预测后的图片 默认nosave=False 所以只要传入的文件地址不是以.txt结尾 就都是要保存预测后的图片的 59 | save_img = not nosave and not source.endswith('.txt') # save inference images True 60 | 61 | # 是否是使用webcam 网页数据 一般是Fasle 因为我们一般是使用图片流LoadImages(可以处理图片/视频流文件) 62 | webcam = source.isnumeric() or source.endswith('.txt') or source.lower().startswith( 63 | ('rtsp://', 'rtmp://', 'http://', 'https://')) 64 | 65 | # 检查当前Path(project) / name是否存在 如果存在就新建新的save_dir 默认exist_ok=False 需要重建 66 | # 将原先传入的名字扩展成新的save_dir 如runs/detect/exp存在 就扩展成 runs/detect/exp1 67 | save_dir = increment_path(Path(project) / name, exist_ok=exist_ok) # increment run 68 | # 如果需要save txt就新建save_dir / 'labels' 否则就新建save_dir 69 | # 默认save_txt=False 所以这里一般都是新建一个 save_dir(runs/detect/expn) 70 | (save_dir / 'labels' if save_txt else save_dir).mkdir(parents=True, exist_ok=True) # make dir 71 | 72 | # Initialize 初始化日志信息 73 | set_logging() 74 | 75 | # 获取当前主机可用的设备 76 | device = select_device(device) 77 | 78 | # 如果设配是GPU 就使用half(float16) 包括模型半精度和输入图片半精度 79 | half &= device.type != 'cpu' # half precision only supported on CUDA 80 | 81 | 82 | # 
===================================== 2、载入模型和模型参数并调整模型 ===================================== 83 | # 2.1、加载Float32模型 84 | model = attempt_load(weights, map_location=device) 85 | 86 | # 是否使用模型剪枝技术 加速推理 87 | if prune_model: 88 | model_info(model) # 打印模型信息 89 | prune(model, 0.3) # 对模型进行剪枝 加速推理 90 | model_info(model) # 再打印模型信息 观察剪枝后模型变化 91 | 92 | # 是否使用模型的conv+bn融合技术 加速推理 93 | if fuse: 94 | model = model.fuse() # 将模型的conv+bn融合 可以加速推理 95 | 96 | # 2.2、载入一些模型参数 97 | # stride: 模型最大的下采样率 [8, 16, 32] 所有stride一般为32 98 | stride = int(model.stride.max()) # model stride 99 | 100 | # 确保输入图片的尺寸imgsz能整除stride=32 如果不能则调整为能被整除并返回 101 | imgsz = check_img_size(imgsz, s=stride) # check image size 保证img size必须是32的倍数 102 | 103 | # 得到数据集的所有类的类名 104 | names = ['licence'] # get class names 105 | 106 | # 2.3、调整模型 107 | # 是否将模型从float32 -> float16 加速推理 108 | if half: 109 | model.half() # to float16 110 | 111 | # 是否加载二次分类模型 112 | # 这里考虑到目标检测完是否需要第二次分类,自己可以考虑自己的任务自己加上 但是这里默认是False的 我们默认不使用 113 | classify = False 114 | if classify: 115 | modelc = load_classifier(name='resnet50', n=2) # initialize 116 | modelc.load_state_dict(torch.load('resnet50.pt', map_location=device)['model']).to(device).eval() 117 | 118 | 119 | # ===================================== 3、加载推理数据 ===================================== 120 | # Set Dataloader 121 | # 通过不同的输入源来设置不同的数据加载方式 122 | vid_path, vid_writer = None, None 123 | if webcam: 124 | # 一般不会使用webcam模式从网页中获取数据 125 | view_img = check_imshow() 126 | cudnn.benchmark = True # set True to speed up constant image size inference 127 | dataset = LoadStreams(source, img_size=imgsz) 128 | else: 129 | # 一般是直接从source文件目录下直接读取图片或者视频数据 130 | dataset = LoadImages(source, img_size=imgsz) 131 | 132 | 133 | # ===================================== 4、推理前测试 ===================================== 134 | # 这里先设置一个全零的Tensor进行一次前向推理 判断程序是否正常 135 | if device.type != 'cpu': 136 | model(torch.zeros(1, 3, imgsz, imgsz).to(device).type_as(next(model.parameters()))) # run once 137 | 138 | # ===================================== 5、正式推理 ===================================== 139 | t0 = time.time() 140 | # path: 图片/视频的路径 141 | # img: 进行resize + pad之后的图片 142 | # img0s: 原尺寸的图片 143 | # vid_cap: 当读取图片时为None, 读取视频时为视频源 144 | for path, img, im0s, vid_cap in dataset: 145 | # gt_label = getGtLabelByImagePath([path], im0s.shape) 146 | # 5.1、处理每一张图片的格式 147 | img = torch.from_numpy(img).to(device) # numpy array to tensor and device 148 | img = img.half() if half else img.float() # 半精度训练 uint8 to fp16/32 149 | img /= 255.0 # 归一化 0 - 255 to 0.0 - 1.0 150 | # 如果图片是3维(RGB) 就在前面添加一个维度1当中batch_size=1 151 | # 因为输入网络的图片需要是4为的 [batch_size, channel, w, h] 152 | if img.ndimension() == 3: 153 | img = img.unsqueeze(0) 154 | 155 | # 5.2、对每张图片/视频进行前向推理 156 | t1 = time_synchronized() 157 | # pred shape=[1, num_boxes, xywh+obj_conf+classes] = [1, 18900, 25] 158 | pred = model(img, augment=augment)[0] 159 | 160 | # 5.3、nms除去多余的框 161 | # Apply NMS 进行NMS 162 | # conf_thres: 置信度阈值 163 | # iou_thres: iou阈值 164 | # classes: 是否只保留特定的类别 默认为None 165 | # agnostic_nms: 进行nms是否也去除不同类别之间的框 默认False 166 | # max_det: 每张图片的最大目标个数 默认1000 167 | # pred: [num_obj, 6] = [5, 6] 这里的预测信息pred还是相对于 img_size(640) 的 168 | pred = non_max_suppression(pred, conf_thres, iou_thres, classes, agnostic_nms) 169 | t2 = time_synchronized() 170 | 171 | # 5.4、考虑进行二次分类 172 | # Apply Classifier 如果需要二次分类 就进行二次分类 一般是不需要的 173 | if classify: 174 | pred = apply_classifier(pred, modelc, img, im0s) 175 | 176 | # 5.5、后续保存或者打印预测信息 177 | # 对每张图片进行处理 将pred(相对img_size 640)映射回原图img0 size 178 | for i, det in 
enumerate(pred): # detections per image 179 | if webcam: 180 | # 如果输入源是webcam(网页)则batch_size>=1 取出dataset中的一张图片 181 | p, s, im0, frame = path[i], f'{i}: ', im0s[i].copy(), dataset.count 182 | else: 183 | # 但是大部分我们一般都是从LoadImages流读取本都文件中的照片或者视频 所以batch_size=1 184 | # p: 当前图片/视频的绝对路径 如 F:\yolo_v5\yolov5-U\data\images\bus.jpg 185 | # s: 输出信息 初始为 '' 186 | # im0: 原始图片 letterbox + pad 之前的图片 187 | # frame: 初始为0 可能是当前图片属于视频中的第几帧? 188 | p, s, im0, frame = path, '', im0s.copy(), getattr(dataset, 'frame', 0) 189 | 190 | # 当前图片路径 如 F:\yolo_v5\yolov5-U\data\images\bus.jpg 191 | p = Path(p) # to Path 192 | # 图片/视频的保存路径save_path 如 runs\\detect\\exp8\\bus.jpg 193 | save_path = str(save_dir / p.name) # img.jpg 194 | # txt文件(保存预测框坐标)保存路径 如 runs\\detect\\exp8\\labels\\bus 195 | txt_path = str(save_dir / 'labels' / p.stem) + ('' if dataset.mode == 'image' else f'_{frame}') # img.txt 196 | 197 | # print string 输出信息 图片shape (w, h) 198 | s += '%gx%g ' % img.shape[2:] 199 | 200 | # normalization gain gn = [w, h, w, h] 用于后面的归一化 201 | gn = torch.tensor(im0.shape)[[1, 0, 1, 0]] 202 | 203 | # imc: for save_crop 在save_crop中使用 204 | imc = im0.copy() if save_crop else im0 205 | 206 | # 在原图im0上画gt_label 207 | # for gt in gt_label: 208 | # cls = int(gt[0]) 209 | # label = gt[1:] 210 | # text = None if hide_labels else (names[cls]) 211 | # # 画出gt框 212 | # plot_one_box(label, im0, label=text, color=(0, 0, 255), line_thickness=4) 213 | 214 | # 在原图im0上画pred框 215 | if len(det): 216 | # Rescale boxes from img_size to im0 size 217 | # 将预测信息(相对img_size 640)映射回原图 img0 size 218 | det[:, :4] = scale_coords(img.shape[2:], det[:, :4], im0.shape).round() 219 | 220 | # Print results 221 | # 输出信息s + 检测到的各个类别的目标个数 222 | for c in det[:, -1].unique(): 223 | n = (det[:, -1] == c).sum() # detections per class 224 | s += f"{n} {names[int(c)]}{'s' * (n > 1)}, " # add to string 225 | 226 | # Write results 227 | # 保存预测信息: txt、img0上画框、crop_img 228 | for *xyxy, conf, cls in reversed(det): 229 | if int(cls) != 0: 230 | continue 231 | # 将每个图片的预测信息分别存入save_dir/labels下的xxx.txt中 每行: class_id+score+xywh 232 | if save_txt: # Write to file(txt) 233 | # 将xyxy(左上角 + 右下角)格式转换为xywh(中心的 + 宽高)格式 并除以gn(whwh)做归一化 转为list再保存 234 | xywh = (xyxy2xywh(torch.tensor(xyxy).view(1, 4)) / gn).view(-1).tolist() # normalized xywh 235 | line = (cls, *xywh, conf) if save_conf else (cls, *xywh) # label format 236 | with open(txt_path + '.txt', 'a') as f: 237 | f.write(('%g ' * len(line)).rstrip() % line + '\n') 238 | 239 | # 在原图上画框 + 将预测到的目标剪切出来 保存成图片 保存在save_dir/crops下 240 | if save_img or save_crop or view_img: 241 | c = int(cls) # integer class 242 | label = None if hide_labels else (names[c] if hide_conf else f'{names[c]} {conf:.2f}') 243 | # 画出预测框 244 | plot_one_box(xyxy, im0, label=label, color=colors(c, True), line_thickness=3) 245 | if save_crop: 246 | # 如果需要就将预测到的目标剪切出来 保存成图片 保存在save_dir/crops下 247 | save_one_box(xyxy, imc, file=save_dir / 'crops' / names[c] / f'{p.stem}.jpg', BGR=True) 248 | 249 | # 打印前向传播 + NMS 花费的时间 250 | print(f'{s}Done. 
({t2 - t1:.3f}s)') 251 | 252 | # Stream results 253 | # 是否需要显示我们预测后的结果 img0(此时已将pred结果可视化到了img0中) 254 | if view_img: 255 | cv2.imshow(str(p), im0) 256 | cv2.waitKey(1) # 1 millisecond 257 | 258 | # Save results (image with detections) 259 | # 是否需要保存图片或视频(检测后的图片/视频 里面已经被我们画好了框的) img0 260 | if save_img: 261 | if dataset.mode == 'images': 262 | cv2.imwrite(save_path, im0) 263 | else: # 'video' or 'stream' 264 | if vid_path != save_path: # new video 265 | vid_path = save_path 266 | if isinstance(vid_writer, cv2.VideoWriter): 267 | vid_writer.release() # release previous video writer 268 | if vid_cap: # video 269 | fps = vid_cap.get(cv2.CAP_PROP_FPS) 270 | w = int(vid_cap.get(cv2.CAP_PROP_FRAME_WIDTH)) 271 | h = int(vid_cap.get(cv2.CAP_PROP_FRAME_HEIGHT)) 272 | else: # stream 273 | fps, w, h = 30, im0.shape[1], im0.shape[0] 274 | save_path += '.mp4' 275 | vid_writer = cv2.VideoWriter(save_path, cv2.VideoWriter_fourcc(*'mp4v'), fps, (w, h)) 276 | vid_writer.write(im0) 277 | 278 | 279 | 280 | # ===================================== 6、推理结束, 保存结果, 打印信息 ===================================== 281 | # 保存预测的label信息 xywh等 save_txt 282 | if save_txt or save_img: 283 | s = f"\n{len(list(save_dir.glob('labels/*.txt')))} labels saved to {save_dir / 'labels'}" if save_txt else '' 284 | print(f"Results saved to {save_dir}{s}") 285 | 286 | if update: 287 | # strip_optimizer函数将optimizer从ckpt中删除 更新模型 288 | strip_optimizer(weights) # update model (to fix SourceChangeWarning) 289 | 290 | # 打印预测的总时间 291 | print(f'Done. ({time.time() - t0:.3f}s)') 292 | 293 | 294 | def parse_opt(): 295 | """ 296 | opt参数解析 297 | weights: 模型的权重地址 默认 weights/yolov5.pt 298 | source: 测试数据文件(图片或视频)的保存路径 默认data/images 299 | imgsz: 网络输入图片的大小 默认640 300 | conf-thres: object置信度阈值 默认0.25 301 | iou-thres: 做nms的iou阈值 默认0.45 302 | max-det: 每张图片最大的目标个数 默认1000 303 | device: 设置代码执行的设备 cuda device, i.e. 
0 or 0,1,2,3 or cpu 304 | view-img: 是否展示预测之后的图片或视频 默认False 305 | save-txt: 是否将预测的框坐标以txt文件格式保存 默认True 会在runs/detect/expn/labels下生成每张图片预测的txt文件 306 | save-conf: 是否保存预测每个目标的置信度到预测tx文件中 默认True 307 | save-crop: 是否需要将预测到的目标从原图中扣出来 剪切好 并保存 会在runs/detect/expn下生成crops文件,将剪切的图片保存在里面 默认False 308 | nosave: 是否不要保存预测后的图片 默认False 就是默认要保存预测后的图片 309 | classes: 在nms中是否是只保留某些特定的类 默认是None 就是所有类只要满足条件都可以保留 310 | agnostic-nms: 进行nms是否也除去不同类别之间的框 默认False 311 | augment: 预测是否也要采用数据增强 TTA 312 | update: 是否将optimizer从ckpt中删除 更新模型 默认False 313 | project: 当前测试结果放在哪个主文件夹下 默认runs/detect 314 | name: 当前测试结果放在run/detect下的文件名 默认是exp 315 | exist-ok: 是否存在当前文件 默认False 一般是 no exist-ok 连用 所以一般都要重新创建文件夹 316 | line-thickness: 画框的框框的线宽 默认是 3 317 | hide-labels: 画出的框框是否需要隐藏label信息 默认False 318 | hide-conf: 画出的框框是否需要隐藏conf信息 默认False 319 | half: 是否使用半精度 Float16 推理 可以缩短推理时间 但是默认是False 320 | """ 321 | parser = argparse.ArgumentParser() 322 | parser.add_argument('--weights', nargs='+', type=str, default=r'K:\MyProject\YOLOv5-LPRNet-Licence-Recognition\weights\yolov5_best.pt', help='model.pt path(s)') 323 | parser.add_argument('--source', type=str, default=r'K:\MyProject\YOLOv5-LPRNet-Licence-Recognition\demo\images\\', help='file/dir/URL/glob, 0 for webcam') 324 | parser.add_argument('--imgsz', '--img', '--img-size', type=int, default=640, help='inference size (pixels)') 325 | parser.add_argument('--conf-thres', type=float, default=0.25, help='confidence threshold') 326 | parser.add_argument('--iou-thres', type=float, default=0.45, help='NMS IoU threshold') 327 | parser.add_argument('--max-det', type=int, default=1000, help='maximum detections per image') 328 | parser.add_argument('--device', default='', help='cuda device, i.e. 0 or 0,1,2,3 or cpu') 329 | parser.add_argument('--view-img', action='store_true', help='show results') 330 | parser.add_argument('--save-txt', action='store_true', help='save results to *.txt') 331 | parser.add_argument('--save-conf', action='store_true', help='save confidences in --save-txt labels') 332 | parser.add_argument('--save-crop', action='store_true', help='save cropped prediction boxes') 333 | parser.add_argument('--nosave', action='store_true', help='do not save images/videos') 334 | parser.add_argument('--classes', nargs='+', type=int, help='filter by class: --class 0, or --class 0 2 3') 335 | parser.add_argument('--agnostic-nms', action='store_true', help='class-agnostic NMS') 336 | parser.add_argument('--augment', action='store_true', help='augmented inference') 337 | parser.add_argument('--update', action='store_true', help='update all models') 338 | parser.add_argument('--project', default=r'K:\MyProject\YOLOv5-LPRNet-Licence-Recognition/runs/detect', help='save results to project/name') 339 | parser.add_argument('--name', default='exp', help='save results to project/name') 340 | parser.add_argument('--exist-ok', action='store_true', help='existing project/name ok, do not increment') 341 | parser.add_argument('--line-thickness', default=1, type=int, help='bounding box thickness (pixels)') 342 | parser.add_argument('--hide-labels', default=False, action='store_true', help='hide labels') 343 | parser.add_argument('--hide-conf', default=False, action='store_true', help='hide confidences') 344 | parser.add_argument('--half', action='store_true', help='use FP16 half-precision inference') 345 | parser.add_argument('--prune-model', default=False, action='store_true', help='model prune') 346 | parser.add_argument('--fuse', default=False, action='store_true', help='fuse conv and bn') 347 | opt = 
parser.parse_args() 348 | return opt 349 | 350 | 351 | def main(opt): 352 | # 调用colorstr函数彩色打印选择的opt参数 353 | print(colorstr('detect: ') + ', '.join(f'{k}={v}' for k, v in vars(opt).items())) 354 | # 检查已经安装的包是否满足requirements对应txt文件的要求 355 | check_requirements(exclude=('tensorboard', 'thop')) 356 | # 执行run 开始推理 357 | run(**vars(opt)) 358 | 359 | 360 | if __name__ == "__main__": 361 | opt = parse_opt() 362 | main(opt) 363 | -------------------------------------------------------------------------------- /tools/split_dataset.py: -------------------------------------------------------------------------------- 1 | """ 2 | @Author: HuKai 3 | @Date: 2022/5/29 10:44 4 | @github: https://github.com/HuKai97 5 | """ 6 | import os 7 | import random 8 | 9 | import shutil 10 | from shutil import copy2 11 | trainfiles = os.listdir(r"K:\MyProject\datasets\ccpd\new\ccpd_2019\base") #(图片文件夹) 12 | num_train = len(trainfiles) 13 | print("num_train: " + str(num_train) ) 14 | index_list = list(range(num_train)) 15 | print(index_list) 16 | random.shuffle(index_list) # 打乱顺序 17 | num = 0 18 | trainDir = r"K:\MyProject\datasets\ccpd\new\ccpd_2019\train" #(将图片文件夹中的6份放在这个文件夹下) 19 | validDir = r"K:\MyProject\datasets\ccpd\new\ccpd_2019\val" #(将图片文件夹中的2份放在这个文件夹下) 20 | detectDir = r"K:\MyProject\datasets\ccpd\new\ccpd_2019\test" #(将图片文件夹中的2份放在这个文件夹下) 21 | for i in index_list: 22 | fileName = os.path.join(r"K:\MyProject\datasets\ccpd\new\ccpd_2019\base", trainfiles[i]) #(图片文件夹)+图片名=图片地址 23 | if num < num_train*0.7: # 7:1:2 24 | print(str(fileName)) 25 | copy2(fileName, trainDir) 26 | elif num < num_train*0.8: 27 | print(str(fileName)) 28 | copy2(fileName, validDir) 29 | else: 30 | print(str(fileName)) 31 | copy2(fileName, detectDir) 32 | num += 1 -------------------------------------------------------------------------------- /tools/test_lprnet.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # /usr/bin/env/python3 3 | 4 | ''' 5 | test pretrained model. 6 | Author: aiboy.wei@outlook.com . 
7 | ''' 8 | from torch.utils.data import DataLoader 9 | 10 | 11 | from PIL import Image, ImageDraw, ImageFont 12 | 13 | # import torch.backends.cudnn as cudnn 14 | from torch.autograd import Variable 15 | import torch.nn.functional as F 16 | from torch import optim 17 | import torch.nn as nn 18 | import numpy as np 19 | import argparse 20 | import torch 21 | import time 22 | import cv2 23 | import os 24 | 25 | from models.LPRNet import CHARS, LPRNet 26 | from utils.load_lpr_data import LPRDataLoader 27 | 28 | 29 | def get_parser(): 30 | parser = argparse.ArgumentParser(description='parameters to train net') 31 | parser.add_argument('--img_size', default=[94, 24], help='the image size') 32 | parser.add_argument('--test_img_dirs', default=r"K:\MyProject\datasets\ccpd\rec\test", help='the test images path') 33 | parser.add_argument('--dropout_rate', default=0, help='dropout rate.') 34 | parser.add_argument('--lpr_max_len', default=8, help='license plate number max length.') 35 | parser.add_argument('--test_batch_size', default=100, help='testing batch size.') 36 | parser.add_argument('--phase_train', default=False, type=bool, help='train or test phase flag.') 37 | parser.add_argument('--num_workers', default=0, type=int, help='Number of workers used in dataloading') 38 | parser.add_argument('--cuda', default=True, type=bool, help='Use cuda to train model') 39 | parser.add_argument('--show', default=False, type=bool, help='show test image and its predict result or not.') 40 | parser.add_argument('--pretrained_model', default=r'K:\MyProject\YOLOv5-LPRNet-Licence-Recognition\weights\lprnet_best.pth', help='pretrained base model') 41 | 42 | args = parser.parse_args() 43 | 44 | return args 45 | 46 | def collate_fn(batch): 47 | imgs = [] 48 | labels = [] 49 | lengths = [] 50 | for _, sample in enumerate(batch): 51 | img, label, length = sample 52 | imgs.append(torch.from_numpy(img)) 53 | labels.extend(label) 54 | lengths.append(length) 55 | labels = np.asarray(labels).flatten().astype(np.float32) 56 | 57 | return (torch.stack(imgs, 0), torch.from_numpy(labels), lengths) 58 | 59 | def test(): 60 | args = get_parser() 61 | 62 | lprnet = LPRNet(lpr_max_len=args.lpr_max_len, phase=args.phase_train, class_num=len(CHARS), dropout_rate=args.dropout_rate) 63 | device = torch.device("cuda:0" if args.cuda else "cpu") 64 | lprnet.to(device) 65 | print("Successful to build network!") 66 | 67 | # load pretrained model 68 | if args.pretrained_model: 69 | lprnet.load_state_dict(torch.load(args.pretrained_model)) 70 | print("load pretrained model successful!") 71 | else: 72 | print("[Error] Can't found pretrained mode, please check!") 73 | return False 74 | 75 | test_img_dirs = os.path.expanduser(args.test_img_dirs) 76 | test_dataset = LPRDataLoader(test_img_dirs.split(','), args.img_size, args.lpr_max_len) 77 | try: 78 | Greedy_Decode_Eval(lprnet, test_dataset, args) 79 | finally: 80 | cv2.destroyAllWindows() 81 | 82 | def Greedy_Decode_Eval(Net, datasets, args): 83 | # TestNet = Net.eval() 84 | epoch_size = len(datasets) // args.test_batch_size 85 | batch_iterator = iter(DataLoader(datasets, args.test_batch_size, shuffle=True, num_workers=args.num_workers, collate_fn=collate_fn)) 86 | 87 | Tp = 0 88 | Tn_1 = 0 89 | Tn_2 = 0 90 | t1 = time.time() 91 | for i in range(epoch_size): 92 | # load train data 93 | images, labels, lengths = next(batch_iterator) 94 | start = 0 95 | targets = [] 96 | for length in lengths: 97 | label = labels[start:start+length] 98 | targets.append(label) 99 | start += length 100 | targets 
= np.array([el.numpy() for el in targets]) 101 | imgs = images.numpy().copy() 102 | 103 | if args.cuda: 104 | images = Variable(images.cuda()) 105 | else: 106 | images = Variable(images) 107 | 108 | # forward 109 | # images: [bs, 3, 24, 94] 110 | # prebs: [bs, 68, 18] 111 | prebs = Net(images) 112 | # greedy decode 113 | prebs = prebs.cpu().detach().numpy() 114 | preb_labels = list() 115 | for i in range(prebs.shape[0]): 116 | preb = prebs[i, :, :] # 对每张图片 [68, 18] 117 | preb_label = list() 118 | for j in range(preb.shape[1]): # 18 返回序列中每个位置最大的概率对应的字符idx 其中'-'是67 119 | preb_label.append(np.argmax(preb[:, j], axis=0)) 120 | no_repeat_blank_label = list() 121 | pre_c = preb_label[0] 122 | if pre_c != len(CHARS) - 1: # 记录重复字符 123 | no_repeat_blank_label.append(pre_c) 124 | for c in preb_label: # 去除重复字符和空白字符'-' 125 | if (pre_c == c) or (c == len(CHARS) - 1): 126 | if c == len(CHARS) - 1: 127 | pre_c = c 128 | continue 129 | no_repeat_blank_label.append(c) 130 | pre_c = c 131 | preb_labels.append(no_repeat_blank_label) # 得到最终的无重复字符和无空白字符的序列 132 | for i, label in enumerate(preb_labels): # 统计准确率 133 | # show image and its predict label 134 | if args.show: 135 | show(imgs[i], label, targets[i]) 136 | if len(label) != len(targets[i]): 137 | Tn_1 += 1 # 错误+1 138 | continue 139 | if (np.asarray(targets[i]) == np.asarray(label)).all(): 140 | Tp += 1 # 完全正确+1 141 | else: 142 | Tn_2 += 1 143 | Acc = Tp * 1.0 / (Tp + Tn_1 + Tn_2) 144 | print("[Info] Test Accuracy: {} [{}:{}:{}:{}]".format(Acc, Tp, Tn_1, Tn_2, (Tp+Tn_1+Tn_2))) 145 | t2 = time.time() 146 | print("[Info] Test Speed: {}s 1/{}]".format((t2 - t1) / len(datasets), len(datasets))) 147 | 148 | def show(img, label, target): 149 | img = np.transpose(img, (1, 2, 0)) 150 | img *= 128. 151 | img += 127.5 152 | img = img.astype(np.uint8) 153 | 154 | lb = "" 155 | for i in label: 156 | lb += CHARS[i] 157 | tg = "" 158 | for j in target.tolist(): 159 | tg += CHARS[int(j)] 160 | 161 | flag = "F" 162 | if lb == tg: 163 | flag = "T" 164 | # img = cv2.putText(img, lb, (0,16), cv2.FONT_HERSHEY_COMPLEX_SMALL, 0.6, (0, 0, 255), 1) 165 | img = cv2ImgAddText(img, lb, (0, 0)) 166 | cv2.imshow("test", img) 167 | print("target: ", tg, " ### {} ### ".format(flag), "predict: ", lb) 168 | cv2.waitKey() 169 | cv2.destroyAllWindows() 170 | 171 | def cv2ImgAddText(img, text, pos, textColor=(255, 0, 0), textSize=12): 172 | if (isinstance(img, np.ndarray)): # detect opencv format or not 173 | img = Image.fromarray(cv2.cvtColor(img, cv2.COLOR_BGR2RGB)) 174 | draw = ImageDraw.Draw(img) 175 | fontText = ImageFont.truetype("data/NotoSansCJK-Regular.ttc", textSize, encoding="utf-8") 176 | draw.text(pos, text, textColor, font=fontText) 177 | 178 | return cv2.cvtColor(np.asarray(img), cv2.COLOR_RGB2BGR) 179 | 180 | 181 | if __name__ == "__main__": 182 | test() 183 | -------------------------------------------------------------------------------- /tools/test_yolov5.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | import json 3 | 4 | from models.experimental import * 5 | from utils.datasets import * 6 | 7 | 8 | def test(data, 9 | weights=None, 10 | batch_size=16, 11 | imgsz=640, 12 | conf_thres=0.001, 13 | iou_thres=0.6, # for NMS 14 | save_json=False, 15 | single_cls=False, 16 | augment=False, 17 | verbose=False, 18 | model=None, 19 | dataloader=None, 20 | save_dir='', 21 | merge=False): 22 | # Initialize/load model and set device 23 | training = model is not None 24 | if training: # called by train_yolov5.py 25 | device = 
next(model.parameters()).device # get model device 26 | 27 | else: # called directly 28 | device = torch_utils.select_device(opt.device, batch_size=batch_size) 29 | merge = opt.merge # use Merge NMS 30 | 31 | # Remove previous 32 | for f in glob.glob(str(Path(save_dir) / 'test_batch*.jpg')): 33 | os.remove(f) 34 | 35 | # Load model 36 | model = attempt_load(weights, map_location=device) # load FP32 model 37 | imgsz = check_img_size(imgsz, s=model.stride.max()) # check img_size 38 | 39 | # Multi-GPU disabled, incompatible with .half() https://github.com/ultralytics/yolov5/issues/99 40 | # if device.type != 'cpu' and torch.cuda.device_count() > 1: 41 | # model = nn.DataParallel(model) 42 | 43 | # Half 44 | half = device.type != 'cpu' and torch.cuda.device_count() == 1 # half precision only supported on single-GPU 45 | if half: 46 | model.half() # to FP16 47 | print("多GPU") 48 | # Configure 49 | model.eval() 50 | with open(data) as f: 51 | data = yaml.load(f, Loader=yaml.FullLoader) # model dict 52 | nc = 1 if single_cls else int(data['nc']) # number of classes 53 | iouv = torch.linspace(0.5, 0.95, 10).to(device) # iou vector for mAP@0.5:0.95 54 | iouv = iouv[0].view(1) # comment for mAP@0.5:0.95 55 | niou = iouv.numel() 56 | 57 | # Dataloader 58 | if not training: 59 | img = torch.zeros((1, 3, imgsz, imgsz), device=device) # init img 60 | _ = model(img.half() if half else img) if device.type != 'cpu' else None # run once 61 | path = data['test'] if opt.task == 'test' else data['val'] # path to val/test images 62 | dataloader = create_dataloader(path, imgsz, batch_size, model.stride.max(), opt, 63 | hyp=None, augment=False, cache=False, pad=0.5, rect=True)[0] 64 | 65 | seen = 0 66 | names = model.names if hasattr(model, 'names') else model.module.names 67 | coco91class = coco80_to_coco91_class() 68 | s = ('%20s' + '%12s' * 6) % ('Class', 'Images', 'Targets', 'P', 'R', 'mAP@.5', 'mAP@.5:.95') 69 | p, r, f1, mp, mr, map50, map, t0, t1 = 0., 0., 0., 0., 0., 0., 0., 0., 0. 
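# Rough sketch of the bookkeeping that follows (not authoritative): iouv was built above as
# torch.linspace(0.5, 0.95, 10), i.e. the ten IoU thresholds for mAP@0.5:0.95, and then reduced
# by iouv[0].view(1) to just 0.50, so this script effectively reports mAP@0.5 only.
# For every image a tuple (correct, conf, pred_cls, target_cls) is appended to stats, and
# ap_per_class() later turns those tuples into the P, R and AP numbers printed at the end.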
70 | loss = torch.zeros(3, device=device) 71 | jdict, stats, ap, ap_class = [], [], [], [] 72 | for batch_i, (img, targets, paths, shapes) in enumerate(tqdm(dataloader, desc=s)): 73 | img = img.to(device) 74 | img = img.half() if half else img.float() # uint8 to fp16/32 75 | img /= 255.0 # 0 - 255 to 0.0 - 1.0 76 | targets = targets.to(device) 77 | nb, _, height, width = img.shape # batch size, channels, height, width 78 | whwh = torch.Tensor([width, height, width, height]).to(device) 79 | 80 | # Disable gradients 81 | with torch.no_grad(): 82 | # Run model 83 | t = torch_utils.time_synchronized() 84 | inf_out, train_out = model(img, augment=augment) # demo and training outputs 85 | t0 += torch_utils.time_synchronized() - t 86 | 87 | # Compute loss 88 | if training: # if model has loss hyperparameters 89 | loss += compute_loss([x.float() for x in train_out], targets, model)[1][:3] # GIoU, obj, cls 90 | 91 | # Run NMS 92 | t = torch_utils.time_synchronized() 93 | output = non_max_suppression(inf_out, conf_thres=conf_thres, iou_thres=iou_thres, merge=merge) 94 | t1 += torch_utils.time_synchronized() - t 95 | 96 | # Statistics per image 97 | for si, pred in enumerate(output): 98 | labels = targets[targets[:, 0] == si, 1:] 99 | nl = len(labels) 100 | tcls = labels[:, 0].tolist() if nl else [] # target class 101 | seen += 1 102 | 103 | if pred is None: 104 | if nl: 105 | stats.append((torch.zeros(0, niou, dtype=torch.bool), torch.Tensor(), torch.Tensor(), tcls)) 106 | continue 107 | 108 | # Append to text file 109 | # with open('test.txt', 'a') as file: 110 | # [file.write('%11.5g' * 7 % tuple(x) + '\n') for x in pred] 111 | 112 | # Clip boxes to image bounds 113 | clip_coords(pred, (height, width)) 114 | 115 | # Append to pycocotools JSON dictionary 116 | if save_json: 117 | # [{"image_id": 42, "category_id": 18, "bbox": [258.15, 41.29, 348.26, 243.78], "score": 0.236}, ... 
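# Hedged reading of the JSON export below: each pred row is [x1, y1, x2, y2, conf, cls];
# the boxes are rescaled to the original image shape, converted to xywh, and shifted from
# centre coordinates to the top-left corner, which matches the COCO bbox convention.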
118 | image_id = int(Path(paths[si]).stem.split('_')[-1]) 119 | box = pred[:, :4].clone() # xyxy 120 | scale_coords(img[si].shape[1:], box, shapes[si][0], shapes[si][1]) # to original shape 121 | box = xyxy2xywh(box) # xywh 122 | box[:, :2] -= box[:, 2:] / 2 # xy center to top-left corner 123 | for p, b in zip(pred.tolist(), box.tolist()): 124 | jdict.append({'image_id': image_id, 125 | 'category_id': coco91class[int(p[5])], 126 | 'bbox': [round(x, 3) for x in b], 127 | 'score': round(p[4], 5)}) 128 | 129 | # Assign all predictions as incorrect 130 | correct = torch.zeros(pred.shape[0], niou, dtype=torch.bool, device=device) 131 | if nl: 132 | detected = [] # target indices 133 | tcls_tensor = labels[:, 0] 134 | 135 | # target boxes 136 | tbox = xywh2xyxy(labels[:, 1:5]) * whwh 137 | 138 | # Per target class 139 | for cls in torch.unique(tcls_tensor): 140 | ti = (cls == tcls_tensor).nonzero().view(-1) # prediction indices 141 | pi = (cls == pred[:, 5]).nonzero().view(-1) # target indices 142 | 143 | # Search for detections 144 | if pi.shape[0]: 145 | # Prediction to target ious 146 | ious, i = box_iou(pred[pi, :4], tbox[ti]).max(1) # best ious, indices 147 | 148 | # Append detections 149 | for j in (ious > iouv[0]).nonzero(): 150 | d = ti[i[j]] # detected target 151 | if d not in detected: 152 | detected.append(d) 153 | correct[pi[j]] = ious[j]>iouv # iou_thres is 1xn 154 | if len(detected) == nl: # all targets already located in image 155 | break 156 | 157 | # Append statistics (correct, conf, pcls, tcls) 158 | stats.append((correct.cpu(), pred[:, 4].cpu(), pred[:, 5].cpu(), tcls)) 159 | 160 | # Plot images 161 | if batch_i < 1: 162 | f = Path(save_dir) / ('test_batch%g_gt.jpg' % batch_i) # filename 163 | plot_images(img, targets, paths, str(f), names) # ground truth 164 | f = Path(save_dir) / ('test_batch%g_pred.jpg' % batch_i) 165 | plot_images(img, output_to_target(output, width, height), paths, str(f), names) # predictions 166 | 167 | # Compute statistics 168 | stats = [np.concatenate(x, 0) for x in zip(*stats)] # to numpy 169 | if len(stats): 170 | p, r, ap, f1, ap_class = ap_per_class(*stats) 171 | p, r, ap50, ap = p[:, 0], r[:, 0], ap[:, 0], ap.mean(1) # [P, R, AP@0.5, AP@0.5:0.95] 172 | mp, mr, map50, map = p.mean(), r.mean(), ap50.mean(), ap.mean() 173 | nt = np.bincount(stats[3].astype(np.int64), minlength=nc) # number of targets per class 174 | else: 175 | nt = torch.zeros(1) 176 | 177 | # Print results 178 | pf = '%20s' + '%12.3g' * 6 # print format 179 | print(pf % ('all', seen, nt.sum(), mp, mr, map50, map)) 180 | 181 | # Print results per class 182 | if verbose and nc > 1 and len(stats): 183 | for i, c in enumerate(ap_class): 184 | print(pf % (names[c], seen, nt[c], p[i], r[i], ap50[i], ap[i])) 185 | 186 | # Print speeds 187 | t = tuple(x / seen * 1E3 for x in (t0, t1, t0 + t1)) + (imgsz, imgsz, batch_size) # tuple 188 | if not training: 189 | print('Speed: %.1f/%.1f/%.1f ms demo/NMS/total per %gx%g image at batch-size %g' % t) 190 | 191 | # Save JSON 192 | if save_json and map50 and len(jdict): 193 | imgIds = [int(Path(x).stem.split('_')[-1]) for x in dataloader.dataset.img_files] 194 | f = 'detections_val2017_%s_results.json' % \ 195 | (weights.split(os.sep)[-1].replace('.pt', '') if isinstance(weights, str) else '') # filename 196 | print('\nCOCO mAP with pycocotools... saving %s...' 
% f) 197 | with open(f, 'w') as file: 198 | json.dump(jdict, file) 199 | 200 | try: 201 | from pycocotools.coco import COCO 202 | from pycocotools.cocoeval import COCOeval 203 | 204 | # https://github.com/cocodataset/cocoapi/blob/master/PythonAPI/pycocoEvalDemo.ipynb 205 | cocoGt = COCO(glob.glob('../coco/annotations/instances_val*.json')[0]) # initialize COCO ground truth api 206 | cocoDt = cocoGt.loadRes(f) # initialize COCO pred api 207 | 208 | cocoEval = COCOeval(cocoGt, cocoDt, 'bbox') 209 | cocoEval.params.imgIds = imgIds # image IDs to evaluate 210 | cocoEval.evaluate() 211 | cocoEval.accumulate() 212 | cocoEval.summarize() 213 | map, map50 = cocoEval.stats[:2] # update results (mAP@0.5:0.95, mAP@0.5) 214 | except: 215 | print('WARNING: pycocotools must be installed with numpy==1.17 to run correctly. ' 216 | 'See https://github.com/cocodataset/cocoapi/issues/356') 217 | 218 | # Return results 219 | model.float() # for training 220 | maps = np.zeros(nc) + map 221 | for i, c in enumerate(ap_class): 222 | maps[c] = ap[i] 223 | return (mp, mr, map50, map, *(loss.cpu() / len(dataloader)).tolist()), maps, t 224 | 225 | 226 | if __name__ == '__main__': 227 | parser = argparse.ArgumentParser(prog='test_yolov5.py') 228 | parser.add_argument('--weights', nargs='+', type=str, default='', help='model.pt path(s)') 229 | parser.add_argument('--data', type=str, default='', help='*.data path') 230 | parser.add_argument('--batch-size', type=int, default=32, help='size of each image batch') 231 | parser.add_argument('--img-size', type=int, default=640, help='demo size (pixels)') 232 | parser.add_argument('--conf-thres', type=float, default=0.001, help='object confidence threshold') 233 | parser.add_argument('--iou-thres', type=float, default=0.65, help='IOU threshold for NMS') 234 | parser.add_argument('--save-json', action='store_true', help='save a cocoapi-compatible JSON results file') 235 | parser.add_argument('--task', default='test', help="'val', 'test', 'study'") 236 | parser.add_argument('--device', default='', help='cuda device, i.e. 0 or 0,1,2,3 or cpu') 237 | parser.add_argument('--single-cls', action='store_true', help='treat as single-class dataset') 238 | parser.add_argument('--augment', action='store_true', help='augmented demo') 239 | parser.add_argument('--merge', action='store_true', help='use Merge NMS') 240 | parser.add_argument('--verbose', action='store_true', help='report mAP by class') 241 | opt = parser.parse_args() 242 | opt.save_json = opt.save_json or opt.data.endswith('coco.yaml') 243 | opt.data = check_file(opt.data) # check file 244 | print(opt) 245 | 246 | # task = 'val', 'test', 'study' 247 | if opt.task in ['val', 'test']: # (default) run normally 248 | test(opt.data, 249 | opt.weights, 250 | opt.batch_size, 251 | opt.img_size, 252 | opt.conf_thres, 253 | opt.iou_thres, 254 | opt.save_json, 255 | opt.single_cls, 256 | opt.augment, 257 | opt.verbose) 258 | 259 | elif opt.task == 'study': # run over a range of settings and save/plot 260 | for weights in ['yolov5s.pt', 'yolov5m.pt', 'yolov5l.pt', 'yolov5x.pt', 'yolov3-spp.pt']: 261 | f = 'study_%s_%s.txt' % (Path(opt.data).stem, Path(weights).stem) # filename to save to 262 | x = list(range(352, 832, 64)) # x axis 263 | y = [] # y axis 264 | for i in x: # img-size 265 | print('\nRunning %s point %s...' 
% (f, i)) 266 | r, _, t = test(opt.data, weights, opt.batch_size, i, opt.conf_thres, opt.iou_thres, opt.save_json) 267 | y.append(r + t) # results and times 268 | np.savetxt(f, y, fmt='%10.4g') # save 269 | os.system('zip -r study.zip study_*.txt') 270 | # plot_study_txt(f, x) # plot 271 | -------------------------------------------------------------------------------- /tools/train_lprnet.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # /usr/bin/env/python3 3 | 4 | ''' 5 | Pytorch implementation for LPRNet. 6 | Author: aiboy.wei@outlook.com . 7 | ''' 8 | from torch.utils.data import DataLoader 9 | 10 | # import torch.backends.cudnn as cudnn 11 | from torch.autograd import Variable 12 | import torch.nn.functional as F 13 | from torch import optim 14 | import torch.nn as nn 15 | import numpy as np 16 | import argparse 17 | import torch 18 | import time 19 | import os 20 | 21 | from models.LPRNet import LPRNet, CHARS 22 | from utils.load_lpr_data import LPRDataLoader 23 | 24 | 25 | def sparse_tuple_for_ctc(T_length, lengths): 26 | input_lengths = [] 27 | target_lengths = [] 28 | 29 | for ch in lengths: 30 | input_lengths.append(T_length) 31 | target_lengths.append(ch) 32 | 33 | return tuple(input_lengths), tuple(target_lengths) 34 | 35 | def adjust_learning_rate(optimizer, cur_epoch, base_lr, lr_schedule): 36 | """ 37 | Sets the learning rate 38 | """ 39 | lr = 0 40 | for i, e in enumerate(lr_schedule): 41 | if cur_epoch < e: 42 | lr = base_lr * (0.1 ** i) 43 | break 44 | if lr == 0: 45 | lr = base_lr 46 | for param_group in optimizer.param_groups: 47 | param_group['lr'] = lr 48 | 49 | return lr 50 | 51 | def get_parser(): 52 | parser = argparse.ArgumentParser(description='parameters to train net') 53 | parser.add_argument('--max_epoch', default=100, help='epoch to train the network') 54 | parser.add_argument('--img_size', default=[94, 24], help='the image size') 55 | parser.add_argument('--train_img_dirs', default=r"K:\MyProject\datasets\ccpd\rec\train", help='the train images path') 56 | parser.add_argument('--test_img_dirs', default=r"K:\MyProject\datasets\ccpd\rec\val", help='the test images path') 57 | parser.add_argument('--dropout_rate', default=0.5, help='dropout rate.') 58 | parser.add_argument('--learning_rate', default=0.01, help='base value of learning rate.') 59 | parser.add_argument('--lpr_max_len', default=8, help='license plate number max length.') 60 | parser.add_argument('--train_batch_size', default=128, help='training batch size.') 61 | parser.add_argument('--test_batch_size', default=128, help='testing batch size.') 62 | parser.add_argument('--phase_train', default=True, type=bool, help='train or test phase flag.') 63 | parser.add_argument('--num_workers', default=0, type=int, help='Number of workers used in dataloading') 64 | parser.add_argument('--cuda', default=True, type=bool, help='Use cuda to train model') 65 | parser.add_argument('--resume_epoch', default=0, type=int, help='resume iter for retraining') 66 | parser.add_argument('--save_interval', default=500, type=int, help='interval for save model state dict') 67 | parser.add_argument('--test_interval', default=500, type=int, help='interval for evaluate') 68 | parser.add_argument('--momentum', default=0.9, type=float, help='momentum') 69 | parser.add_argument('--weight_decay', default=2e-5, type=float, help='Weight decay for SGD') 70 | parser.add_argument('--lr_schedule', default=[20, 40, 60, 80, 100], help='schedule for learning rate.') 71 | 
parser.add_argument('--save_folder', default=r'../runs', 72 | help='Location to save checkpoint models') 73 | parser.add_argument('--pretrained_model', default='', help='no pretrain') 74 | 75 | args = parser.parse_args() 76 | 77 | return args 78 | 79 | def collate_fn(batch): 80 | imgs = [] 81 | labels = [] 82 | lengths = [] 83 | for _, sample in enumerate(batch): 84 | img, label, length = sample 85 | imgs.append(torch.from_numpy(img)) 86 | labels.extend(label) 87 | lengths.append(length) 88 | labels = np.asarray(labels).flatten().astype(np.int) 89 | return (torch.stack(imgs, 0), torch.from_numpy(labels), lengths) 90 | 91 | def train(): 92 | args = get_parser() 93 | 94 | T_length = 18 # args.lpr_max_len 95 | epoch = 0 + args.resume_epoch 96 | loss_val = 0 97 | 98 | if not os.path.exists(args.save_folder): 99 | os.mkdir(args.save_folder) 100 | 101 | lprnet = LPRNet(lpr_max_len=args.lpr_max_len, phase=args.phase_train, class_num=len(CHARS), dropout_rate=args.dropout_rate) 102 | device = torch.device("cuda:0" if args.cuda else "cpu") 103 | lprnet.to(device) 104 | print("Successful to build network!") 105 | 106 | # load pretrained model 107 | if args.pretrained_model: 108 | lprnet.load_state_dict(torch.load(args.pretrained_model)) 109 | print("load pretrained model successful!") 110 | else: 111 | def xavier(param): 112 | nn.init.xavier_uniform(param) 113 | 114 | def weights_init(m): 115 | for key in m.state_dict(): 116 | if key.split('.')[-1] == 'weight': 117 | if 'conv' in key: 118 | nn.init.kaiming_normal_(m.state_dict()[key], mode='fan_out') 119 | if 'bn' in key: 120 | m.state_dict()[key][...] = xavier(1) 121 | elif key.split('.')[-1] == 'bias': 122 | m.state_dict()[key][...] = 0.01 123 | 124 | lprnet.backbone.apply(weights_init) 125 | lprnet.container.apply(weights_init) 126 | print("initial net weights successful!") 127 | 128 | # define optimizer 129 | # optimizer = optim.SGD(lprnet.parameters(), lr=args.learning_rate, 130 | # momentum=args.momentum, weight_decay=args.weight_decay) 131 | optimizer = optim.RMSprop(lprnet.parameters(), lr=args.learning_rate, alpha = 0.9, eps=1e-08, 132 | momentum=args.momentum, weight_decay=args.weight_decay) 133 | train_img_dirs = os.path.expanduser(args.train_img_dirs) 134 | test_img_dirs = os.path.expanduser(args.test_img_dirs) 135 | train_dataset = LPRDataLoader(train_img_dirs.split(','), args.img_size, args.lpr_max_len) 136 | test_dataset = LPRDataLoader(test_img_dirs.split(','), args.img_size, args.lpr_max_len) 137 | 138 | epoch_size = len(train_dataset) // args.train_batch_size 139 | max_iter = args.max_epoch * epoch_size 140 | 141 | ctc_loss = nn.CTCLoss(blank=len(CHARS)-1, reduction='mean') # reduction: 'none' | 'mean' | 'sum' 142 | 143 | if args.resume_epoch > 0: 144 | start_iter = args.resume_epoch * epoch_size 145 | else: 146 | start_iter = 0 147 | 148 | for iteration in range(start_iter, max_iter): 149 | if iteration % epoch_size == 0: 150 | # create batch iterator 151 | batch_iterator = iter(DataLoader(train_dataset, args.train_batch_size, shuffle=True, num_workers=args.num_workers, collate_fn=collate_fn)) 152 | loss_val = 0 153 | epoch += 1 154 | 155 | if iteration !=0 and iteration % args.save_interval == 0: 156 | torch.save(lprnet.state_dict(), args.save_folder + 'LPRNet_' + '_iteration_' + repr(iteration) + '.pth') 157 | 158 | if (iteration + 1) % args.test_interval == 0: 159 | Greedy_Decode_Eval(lprnet, test_dataset, args) 160 | # lprnet.train() # should be switch to train mode 161 | 162 | start_time = time.time() 163 | # load train data 
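# Illustration of the CTC length bookkeeping a few lines below (assumed values): with
# T_length = 18 and lengths = (7, 8), sparse_tuple_for_ctc returns input_lengths = (18, 18)
# and target_lengths = (7, 8), i.e. every sample feeds the full 18-step logit sequence to
# CTCLoss while each target keeps its true plate length.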
164 | images, labels, lengths = next(batch_iterator) 165 | # labels = np.array([el.numpy() for el in labels]).T 166 | # print(labels) 167 | # get ctc parameters 168 | input_lengths, target_lengths = sparse_tuple_for_ctc(T_length, lengths) 169 | # update lr 170 | lr = adjust_learning_rate(optimizer, epoch, args.learning_rate, args.lr_schedule) 171 | 172 | if args.cuda: 173 | images = Variable(images, requires_grad=False).cuda() 174 | labels = Variable(labels, requires_grad=False).cuda() 175 | else: 176 | images = Variable(images, requires_grad=False) 177 | labels = Variable(labels, requires_grad=False) 178 | 179 | # forward 180 | logits = lprnet(images) 181 | log_probs = logits.permute(2, 0, 1) # for ctc loss: T x N x C 182 | # print(labels.shape) 183 | log_probs = log_probs.log_softmax(2).requires_grad_() 184 | # log_probs = log_probs.detach().requires_grad_() 185 | # print(log_probs.shape) 186 | # backprop 187 | optimizer.zero_grad() 188 | 189 | # log_probs: 预测结果 [18, bs, 68] 其中18为序列长度 68为字典数 190 | # labels: [93] 191 | # input_lengths: tuple example: 000=18 001=18... 每个序列长度 192 | # target_lengths: tuple example: 000=7 001=8 ... 每个gt长度 193 | loss = ctc_loss(log_probs, labels, input_lengths=input_lengths, target_lengths=target_lengths) 194 | if loss.item() == np.inf: 195 | continue 196 | loss.backward() 197 | optimizer.step() 198 | loss_val += loss.item() 199 | end_time = time.time() 200 | if iteration % 20 == 0: 201 | print('Epoch:' + repr(epoch) + ' || epochiter: ' + repr(iteration % epoch_size) + '/' + repr(epoch_size) 202 | + '|| Totel iter ' + repr(iteration) + ' || Loss: %.4f||' % (loss.item()) + 203 | 'Batch time: %.4f sec. ||' % (end_time - start_time) + 'LR: %.8f' % (lr)) 204 | # final test 205 | print("Final test Accuracy:") 206 | Greedy_Decode_Eval(lprnet, test_dataset, args) 207 | 208 | # save final parameters 209 | torch.save(lprnet.state_dict(), args.save_folder + 'lprnet-pretrain.pth') 210 | 211 | def Greedy_Decode_Eval(Net, datasets, args): 212 | # TestNet = Net.eval() 213 | epoch_size = len(datasets) // args.test_batch_size 214 | batch_iterator = iter(DataLoader(datasets, args.test_batch_size, shuffle=True, num_workers=args.num_workers, collate_fn=collate_fn)) 215 | 216 | Tp = 0 217 | Tn_1 = 0 218 | Tn_2 = 0 219 | t1 = time.time() 220 | for i in range(epoch_size): 221 | # load train data 222 | images, labels, lengths = next(batch_iterator) 223 | start = 0 224 | targets = [] 225 | for length in lengths: 226 | label = labels[start:start+length] 227 | targets.append(label) 228 | start += length 229 | targets = np.array([el.numpy() for el in targets]) 230 | 231 | if args.cuda: 232 | images = Variable(images.cuda()) 233 | else: 234 | images = Variable(images) 235 | 236 | # forward 237 | prebs = Net(images) 238 | # greedy decode 239 | prebs = prebs.cpu().detach().numpy() 240 | preb_labels = list() 241 | for i in range(prebs.shape[0]): 242 | preb = prebs[i, :, :] 243 | preb_label = list() 244 | for j in range(preb.shape[1]): 245 | preb_label.append(np.argmax(preb[:, j], axis=0)) 246 | no_repeat_blank_label = list() 247 | pre_c = preb_label[0] 248 | if pre_c != len(CHARS) - 1: 249 | no_repeat_blank_label.append(pre_c) 250 | for c in preb_label: # dropout repeate label and blank label 251 | if (pre_c == c) or (c == len(CHARS) - 1): 252 | if c == len(CHARS) - 1: 253 | pre_c = c 254 | continue 255 | no_repeat_blank_label.append(c) 256 | pre_c = c 257 | preb_labels.append(no_repeat_blank_label) 258 | for i, label in enumerate(preb_labels): 259 | if len(label) != len(targets[i]): 260 | 
Tn_1 += 1 261 | continue 262 | if (np.asarray(targets[i]) == np.asarray(label)).all(): 263 | Tp += 1 264 | else: 265 | Tn_2 += 1 266 | 267 | Acc = Tp * 1.0 / (Tp + Tn_1 + Tn_2) 268 | print("[Info] Test Accuracy: {} [{}:{}:{}:{}]".format(Acc, Tp, Tn_1, Tn_2, (Tp+Tn_1+Tn_2))) 269 | t2 = time.time() 270 | print("[Info] Test Speed: {}s 1/{}]".format((t2 - t1) / len(datasets), len(datasets))) 271 | 272 | 273 | if __name__ == "__main__": 274 | train() 275 | -------------------------------------------------------------------------------- /tools/train_yolov5.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | 3 | import torch.distributed as dist 4 | import torch.nn.functional as F 5 | import torch.optim as optim 6 | import torch.optim.lr_scheduler as lr_scheduler 7 | import torch.utils.data 8 | from torch.utils.tensorboard import SummaryWriter 9 | 10 | import test_yolov5 # import test_yolov5.py to get mAP after each epoch 11 | from models.yolo import Model 12 | from utils import google_utils 13 | from utils.datasets import * 14 | from utils.utils import * 15 | 16 | mixed_precision = True 17 | try: # Mixed precision training https://github.com/NVIDIA/apex 18 | from apex import amp 19 | except: 20 | print('Apex recommended for faster mixed precision training: https://github.com/NVIDIA/apex') 21 | mixed_precision = False # not installed 22 | 23 | # Hyperparameters 24 | hyp = {'optimizer': 'SGD', # ['adam', 'SGD', None] if none, default is SGD 25 | 'lr0': 0.01, # initial learning rate (SGD=1E-2, Adam=1E-3) 26 | 'momentum': 0.937, # SGD momentum/Adam beta1 27 | 'weight_decay': 5e-4, # optimizer weight decay 28 | 'giou': 0.05, # giou loss gain 29 | 'cls': 0.58, # cls loss gain 30 | 'cls_pw': 1.0, # cls BCELoss positive_weight 31 | 'obj': 1.0, # obj loss gain (*=img_size/320 if img_size != 320) 32 | 'obj_pw': 1.0, # obj BCELoss positive_weight 33 | 'iou_t': 0.20, # iou training threshold 34 | 'anchor_t': 4.0, # anchor-multiple threshold 35 | 'fl_gamma': 0.0, # focal loss gamma (efficientDet default is gamma=1.5) 36 | 'hsv_h': 0.014, # HSV-Saturation augmentation (fraction) 37 | 'hsv_v': 0.36, # image image HSV-Hue augmentation (fraction) 38 | 'hsv_s': 0.68, # imageHSV-Value augmentation (fraction) 39 | 'degrees': 0.0, # image rotation (+/- deg) 40 | 'translate': 0.0, # image translation (+/- fraction) 41 | 'scale': 0.5, # image scale (+/- gain) 42 | 'shear': 0.0} # image shear (+/- deg) 43 | 44 | 45 | def train(hyp): 46 | print(f'Hyperparameters {hyp}') 47 | log_dir = tb_writer.log_dir # run directory 48 | wdir = str(Path(log_dir) / 'weights') + os.sep # weights directory 49 | print("adkjg;ajg;lfkjdsg",wdir) 50 | os.makedirs(wdir, exist_ok=True) 51 | last = wdir + 'yolov5_best.pt' 52 | best = wdir + 'best.pt' 53 | results_file = log_dir + os.sep + 'results.txt' 54 | 55 | epochs = opt.epochs # 300 56 | batch_size = opt.batch_size # 64 57 | weights = opt.weights # initial training weights 58 | 59 | # Configure 60 | init_seeds(1) 61 | with open(opt.data) as f: 62 | data_dict = yaml.load(f, Loader=yaml.FullLoader) # model dict 63 | train_path = data_dict['train'] 64 | test_path = data_dict['val'] 65 | nc = 1 if opt.single_cls else int(data_dict['nc']) # number of classes 66 | 67 | # Remove previous results 68 | for f in glob.glob('*_batch*.jpg') + glob.glob(results_file): 69 | os.remove(f) 70 | 71 | # Create model 72 | model = Model(opt.cfg, nc=data_dict['nc']).to(device) 73 | 74 | # Image sizes 75 | gs = int(max(model.stride)) # grid size (max stride) 
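# check_img_size below verifies that each requested size is a multiple of the grid size gs
# (the maximum model stride, typically 32); a value such as 630 would presumably be bumped
# up to the nearest valid multiple (640) with a warning rather than rejected.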
76 | imgsz, imgsz_test = [check_img_size(x, gs) for x in opt.img_size] # verify imgsz are gs-multiples 77 | 78 | # Optimizer 79 | nbs = 64 # nominal batch size 80 | accumulate = max(round(nbs / batch_size), 1) # accumulate loss before optimizing 81 | hyp['weight_decay'] *= batch_size * accumulate / nbs # scale weight_decay 82 | pg0, pg1, pg2 = [], [], [] # optimizer parameter groups 83 | for k, v in model.named_parameters(): 84 | if v.requires_grad: 85 | if '.bias' in k: 86 | pg2.append(v) # biases 87 | elif '.weight' in k and '.bn' not in k: 88 | pg1.append(v) # apply weight decay 89 | else: 90 | pg0.append(v) # all else 91 | 92 | if hyp['optimizer'] == 'adam': # https://pytorch.org/docs/stable/_modules/torch/optim/lr_scheduler.html#OneCycleLR 93 | optimizer = optim.Adam(pg0, lr=hyp['lr0'], betas=(hyp['momentum'], 0.999)) # adjust beta1 to momentum 94 | else: 95 | optimizer = optim.SGD(pg0, lr=hyp['lr0'], momentum=hyp['momentum'], nesterov=True) 96 | 97 | optimizer.add_param_group({'params': pg1, 'weight_decay': hyp['weight_decay']}) # add pg1 with weight_decay 98 | optimizer.add_param_group({'params': pg2}) # add pg2 (biases) 99 | print('Optimizer groups: %g .bias, %g conv.weight, %g other' % (len(pg2), len(pg1), len(pg0))) 100 | del pg0, pg1, pg2 101 | 102 | # Scheduler https://arxiv.org/pdf/1812.01187.pdf 103 | lf = lambda x: (((1 + math.cos(x * math.pi / epochs)) / 2) ** 1.0) * 0.9 + 0.1 # cosine 104 | scheduler = lr_scheduler.LambdaLR(optimizer, lr_lambda=lf) 105 | plot_lr_scheduler(optimizer, scheduler, epochs, save_dir=log_dir) 106 | 107 | # Load Model 108 | google_utils.attempt_download(weights) 109 | start_epoch, best_fitness = 0, 0.0 110 | if weights.endswith('.pt'): # pytorch format 111 | ckpt = torch.load(weights, map_location=device) # load checkpoint 112 | 113 | # load model 114 | try: 115 | ckpt['model'] = {k: v for k, v in ckpt['model'].float().state_dict().items() 116 | if model.state_dict()[k].shape == v.shape} # to FP32, filter 117 | model.load_state_dict(ckpt['model'], strict=False) 118 | for k, v in model.named_parameters(): 119 | print(v) 120 | except KeyError as e: 121 | s = "%s is not compatible with %s. This may be due to model differences or %s may be out of date. " \ 122 | "Please delete or update %s and try again, or use --weights '' to train from scratch." \ 123 | % (opt.weights, opt.cfg, opt.weights, opt.weights) 124 | raise KeyError(s) from e 125 | 126 | # load optimizer 127 | if ckpt['optimizer'] is not None: 128 | optimizer.load_state_dict(ckpt['optimizer']) 129 | best_fitness = ckpt['best_fitness'] 130 | 131 | # load results 132 | if ckpt.get('training_results') is not None: 133 | with open(results_file, 'w') as file: 134 | file.write(ckpt['training_results']) # write results.txt 135 | 136 | # epochs 137 | start_epoch = ckpt['epoch'] + 1 138 | if epochs < start_epoch: 139 | print('%s has been trained for %g epochs. Fine-tuning for %g additional epochs.' 
% 140 | (opt.weights, ckpt['epoch'], epochs)) 141 | epochs += ckpt['epoch'] # finetune additional epochs 142 | 143 | del ckpt 144 | 145 | # Mixed precision training https://github.com/NVIDIA/apex 146 | if mixed_precision: 147 | model, optimizer = amp.initialize(model, optimizer, opt_level='O1', verbosity=0) 148 | 149 | # Distributed training 150 | if device.type != 'cpu' and torch.cuda.device_count() > 1 and torch.distributed.is_available(): 151 | dist.init_process_group(backend='nccl', # distributed backend 152 | init_method='tcp://127.0.0.1:9999', # init method 153 | world_size=1, # number of nodes 154 | rank=0) # node rank 155 | # model = torch.nn.parallel.DistributedDataParallel(model) 156 | # pip install torch==1.4.0+cu100 torchvision==0.5.0+cu100 -f https://download.pytorch.org/whl/torch_stable.html 157 | 158 | # Trainloader 159 | dataloader, dataset = create_dataloader(train_path, imgsz, batch_size, gs, opt, 160 | hyp=hyp, augment=True, cache=opt.cache_images, rect=opt.rect) 161 | mlc = np.concatenate(dataset.labels, 0)[:, 0].max() # max label class 162 | assert mlc < nc, 'Label class %g exceeds nc=%g in %s. Correct your labels or your model.' % (mlc, nc, opt.cfg) 163 | 164 | # Testloader 165 | testloader = create_dataloader(test_path, imgsz_test, batch_size, gs, opt, 166 | hyp=hyp, augment=False, cache=opt.cache_images, rect=True)[0] 167 | 168 | # Model parameters 169 | hyp['cls'] *= nc / 80. # scale coco-tuned hyp['cls'] to current dataset 170 | model.nc = nc # attach number of classes to model 171 | model.hyp = hyp # attach hyperparameters to model 172 | model.gr = 1.0 # giou loss ratio (obj_loss = 1.0 or giou) 173 | model.class_weights = labels_to_class_weights(dataset.labels, nc).to(device) # attach class weights 174 | model.names = data_dict['names'] 175 | 176 | # Save run settings 177 | with open(Path(log_dir) / 'hyp.yaml', 'w') as f: 178 | yaml.dump(hyp, f, sort_keys=False) 179 | with open(Path(log_dir) / 'opt.yaml', 'w') as f: 180 | yaml.dump(vars(opt), f, sort_keys=False) 181 | 182 | # Class frequency 183 | labels = np.concatenate(dataset.labels, 0) 184 | c = torch.tensor(labels[:, 0]) # classes 185 | # cf = torch.bincount(c.long(), minlength=nc) + 1. 186 | # model._initialize_biases(cf.to(device)) 187 | plot_labels(labels, save_dir=log_dir) 188 | if tb_writer: 189 | tb_writer.add_histogram('classes', c, 0) 190 | 191 | # Check anchors 192 | if not opt.noautoanchor: 193 | check_anchors(dataset, model=model, thr=hyp['anchor_t'], imgsz=imgsz) 194 | 195 | # Exponential moving average 196 | ema = torch_utils.ModelEMA(model) 197 | 198 | # Start training 199 | t0 = time.time() 200 | nb = len(dataloader) # number of batches 201 | nw = max(3 * nb, 1e3) # number of warmup iterations, max(3 epochs, 1k iterations) 202 | maps = np.zeros(nc) # mAP per class 203 | results = (0, 0, 0, 0, 0, 0, 0) # 'P', 'R', 'mAP', 'F1', 'val GIoU', 'val Objectness', 'val Classification' 204 | scheduler.last_epoch = start_epoch - 1 # do not move 205 | print('Image sizes %g train, %g test' % (imgsz, imgsz_test)) 206 | print('Using %g dataloader workers' % dataloader.num_workers) 207 | print('Starting training for %g epochs...' 
% epochs) 208 | # torch.autograd.set_detect_anomaly(True) 209 | for epoch in range(start_epoch, epochs): # epoch ------------------------------------------------------------------ 210 | model.train() 211 | 212 | # Update image weights (optional) 213 | if dataset.image_weights: 214 | w = model.class_weights.cpu().numpy() * (1 - maps) ** 2 # class weights 215 | image_weights = labels_to_image_weights(dataset.labels, nc=nc, class_weights=w) 216 | dataset.indices = random.choices(range(dataset.n), weights=image_weights, k=dataset.n) # rand weighted idx 217 | 218 | # Update mosaic border 219 | # b = int(random.uniform(0.25 * imgsz, 0.75 * imgsz + gs) // gs * gs) 220 | # dataset.mosaic_border = [b - imgsz, -b] # height, width borders 221 | 222 | mloss = torch.zeros(4, device=device) # mean losses 223 | print(('\n' + '%10s' * 8) % ('Epoch', 'gpu_mem', 'GIoU', 'obj', 'cls', 'total', 'targets', 'img_size')) 224 | pbar = tqdm(enumerate(dataloader), total=nb) # progress bar 225 | for i, (imgs, targets, paths, _) in pbar: # batch ------------------------------------------------------------- 226 | ni = i + nb * epoch # number integrated batches (since train start) 227 | imgs = imgs.to(device).float() / 255.0 # uint8 to float32, 0 - 255 to 0.0 - 1.0 228 | 229 | # Warmup 230 | if ni <= nw: 231 | xi = [0, nw] # x interp 232 | # model.gr = np.interp(ni, xi, [0.0, 1.0]) # giou loss ratio (obj_loss = 1.0 or giou) 233 | accumulate = max(1, np.interp(ni, xi, [1, nbs / batch_size]).round()) 234 | for j, x in enumerate(optimizer.param_groups): 235 | # bias lr falls from 0.1 to lr0, all other lrs rise from 0.0 to lr0 236 | x['lr'] = np.interp(ni, xi, [0.1 if j == 2 else 0.0, x['initial_lr'] * lf(epoch)]) 237 | if 'momentum' in x: 238 | x['momentum'] = np.interp(ni, xi, [0.9, hyp['momentum']]) 239 | 240 | # Multi-scale 241 | if opt.multi_scale: 242 | sz = random.randrange(imgsz * 0.5, imgsz * 1.5 + gs) // gs * gs # size 243 | sf = sz / max(imgs.shape[2:]) # scale factor 244 | if sf != 1: 245 | ns = [math.ceil(x * sf / gs) * gs for x in imgs.shape[2:]] # new shape (stretched to gs-multiple) 246 | imgs = F.interpolate(imgs, size=ns, mode='bilinear', align_corners=False) 247 | 248 | # Forward 249 | pred = model(imgs) 250 | 251 | # Loss 252 | loss, loss_items = compute_loss(pred, targets.to(device), model) 253 | if not torch.isfinite(loss): 254 | print('WARNING: non-finite loss, ending training ', loss_items) 255 | return results 256 | 257 | # Backward 258 | if mixed_precision: 259 | with amp.scale_loss(loss, optimizer) as scaled_loss: 260 | scaled_loss.backward() 261 | else: 262 | loss.backward() 263 | 264 | # Optimize 265 | if ni % accumulate == 0: 266 | optimizer.step() 267 | optimizer.zero_grad() 268 | ema.update(model) 269 | 270 | # Print 271 | mloss = (mloss * i + loss_items) / (i + 1) # update mean losses 272 | mem = '%.3gG' % (torch.cuda.memory_cached() / 1E9 if torch.cuda.is_available() else 0) # (GB) 273 | s = ('%10s' * 2 + '%10.4g' * 6) % ( 274 | '%g/%g' % (epoch, epochs - 1), mem, *mloss, targets.shape[0], imgs.shape[-1]) 275 | pbar.set_description(s) 276 | 277 | # Plot 278 | if ni < 3: 279 | f = str(Path(log_dir) / ('train_batch%g.jpg' % ni)) # filename 280 | result = plot_images(images=imgs, targets=targets, paths=paths, fname=f) 281 | if tb_writer and result is not None: 282 | tb_writer.add_image(f, result, dataformats='HWC', global_step=epoch) 283 | # tb_writer.add_graph(model, images) # add model to tensorboard 284 | 285 | # end batch 
------------------------------------------------------------------------------------------------ 286 | 287 | # Scheduler 288 | scheduler.step() 289 | 290 | # mAP 291 | ema.update_attr(model) 292 | final_epoch = epoch + 1 == epochs 293 | if not opt.notest or final_epoch: # Calculate mAP 294 | results, maps, times = test_yolov5.test(opt.data, 295 | batch_size=batch_size, 296 | imgsz=imgsz_test, 297 | save_json=final_epoch and opt.data.endswith(os.sep + 'coco.yaml'), 298 | model=ema.ema, 299 | single_cls=opt.single_cls, 300 | dataloader=testloader, 301 | save_dir=log_dir) 302 | 303 | # Write 304 | with open(results_file, 'a') as f: 305 | f.write(s + '%10.4g' * 7 % results + '\n') # P, R, mAP, F1, test_losses=(GIoU, obj, cls) 306 | if len(opt.name) and opt.bucket: 307 | os.system('gsutil cp results.txt gs://%s/results/results%s.txt' % (opt.bucket, opt.name)) 308 | 309 | # Tensorboard 310 | if tb_writer: 311 | tags = ['train/giou_loss', 'train/obj_loss', 'train/cls_loss', 312 | 'metrics/precision', 'metrics/recall', 'metrics/mAP_0.5', 'metrics/F1', 313 | 'val/giou_loss', 'val/obj_loss', 'val/cls_loss'] 314 | for x, tag in zip(list(mloss[:-1]) + list(results), tags): 315 | tb_writer.add_scalar(tag, x, epoch) 316 | 317 | # Update best mAP 318 | fi = fitness(np.array(results).reshape(1, -1)) # fitness_i = weighted combination of [P, R, mAP, F1] 319 | if fi > best_fitness: 320 | best_fitness = fi 321 | 322 | # Save model 323 | save = (not opt.nosave) or (final_epoch and not opt.evolve) 324 | if save: 325 | with open(results_file, 'r') as f: # create checkpoint 326 | ckpt = {'epoch': epoch, 327 | 'best_fitness': best_fitness, 328 | 'training_results': f.read(), 329 | 'model': ema.ema, 330 | 'optimizer': None if final_epoch else optimizer.state_dict()} 331 | 332 | # Save last, best and delete 333 | torch.save(ckpt, last) 334 | if (best_fitness == fi) and not final_epoch: 335 | torch.save(ckpt, best) 336 | del ckpt 337 | 338 | # end epoch ---------------------------------------------------------------------------------------------------- 339 | # end training 340 | 341 | # Strip optimizers 342 | n = ('_' if len(opt.name) and not opt.name.isnumeric() else '') + opt.name 343 | fresults, flast, fbest = 'results%s.txt' % n, wdir + 'last%s.pt' % n, wdir + 'best%s.pt' % n 344 | for f1, f2 in zip([wdir + 'yolov5_best.pt', wdir + 'best.pt', 'results.txt'], [flast, fbest, fresults]): 345 | if os.path.exists(f1): 346 | os.rename(f1, f2) # rename 347 | ispt = f2.endswith('.pt') # is *.pt 348 | strip_optimizer(f2) if ispt else None # strip optimizer 349 | os.system('gsutil cp %s gs://%s/weights' % (f2, opt.bucket)) if opt.bucket and ispt else None # upload 350 | 351 | # Finish 352 | 353 | plot_results(save_dir=log_dir) # save as results.png 354 | print('%g epochs completed in %.3f hours.\n' % (epoch - start_epoch + 1, (time.time() - t0) / 3600)) 355 | dist.destroy_process_group() if device.type != 'cpu' and torch.cuda.device_count() > 1 else None 356 | torch.cuda.empty_cache() 357 | return results 358 | 359 | 360 | if __name__ == '__main__': 361 | check_git_status() 362 | parser = argparse.ArgumentParser() 363 | parser.add_argument('--cfg', type=str, default='', help='model.yaml path') 364 | parser.add_argument('--data', type=str, default='', help='data.yaml path') 365 | parser.add_argument('--hyp', type=str, default='', help='hyp.yaml path (optional)') 366 | parser.add_argument('--weights', type=str, default='', help='initial weights path') 367 | parser.add_argument('--epochs', type=int, default=100) 368 | 
parser.add_argument('--batch-size', type=int, default=12) 369 | parser.add_argument('--img-size', nargs='+', type=int, default=[640, 640], help='train,test sizes') 370 | parser.add_argument('--device', default='', help='cuda device, i.e. 0 or 0,1,2,3 or cpu') 371 | 372 | parser.add_argument('--rect', action='store_true', help='rectangular training') 373 | parser.add_argument('--resume', nargs='?', const='get_last', default=False, 374 | help='resume from given path/to/yolov5_best.pt, or most recent run if blank.') 375 | parser.add_argument('--nosave', action='store_true', help='only save final checkpoint') 376 | parser.add_argument('--notest', action='store_true', help='only test final epoch') 377 | parser.add_argument('--noautoanchor', action='store_true', help='disable autoanchor check') 378 | parser.add_argument('--evolve', action='store_true', help='evolve hyperparameters') 379 | parser.add_argument('--bucket', type=str, default='', help='gsutil bucket') 380 | parser.add_argument('--cache-images', action='store_true', help='cache images for faster training') 381 | parser.add_argument('--name', default='', help='renames results.txt to results_name.txt if supplied') 382 | parser.add_argument('--multi-scale', default=True, help='vary img-size +/- 50%%') 383 | parser.add_argument('--single-cls', default=True, help='train as single-class dataset') 384 | opt = parser.parse_args() 385 | 386 | last = get_latest_run() if opt.resume == 'get_last' else opt.resume # resume from most recent run 387 | if last and not opt.weights: 388 | print(f'Resuming training from {last}') 389 | opt.weights = last if opt.resume and not opt.weights else opt.weights 390 | opt.cfg = check_file(opt.cfg) # check file 391 | opt.data = check_file(opt.data) # check file 392 | opt.hyp = check_file(opt.hyp) if opt.hyp else '' # check file 393 | print(opt) 394 | opt.img_size.extend([opt.img_size[-1]] * (2 - len(opt.img_size))) # extend to 2 sizes (train, test) 395 | device = torch_utils.select_device(opt.device, apex=mixed_precision, batch_size=opt.batch_size) 396 | if device.type == 'cpu': 397 | mixed_precision = False 398 | 399 | # Train 400 | if not opt.evolve: 401 | print('Start Tensorboard with "tensorboard --logdir=runs", view at http://localhost:6006/') 402 | tb_writer = SummaryWriter(comment=opt.name) 403 | if opt.hyp: # update hyps 404 | with open(opt.hyp) as f: 405 | hyp.update(yaml.load(f, Loader=yaml.FullLoader)) 406 | 407 | train(hyp) 408 | 409 | # Evolve hyperparameters (optional) 410 | else: 411 | tb_writer = None 412 | opt.notest, opt.nosave = True, True # only test/save final epoch 413 | if opt.bucket: 414 | os.system('gsutil cp gs://%s/evolve.txt .' 
% opt.bucket) # download evolve.txt if exists 415 | 416 | for _ in range(10): # generations to evolve 417 | if os.path.exists('evolve.txt'): # if evolve.txt exists: select best hyps and mutate 418 | # Select parent(s) 419 | parent = 'single' # parent selection method: 'single' or 'weighted' 420 | x = np.loadtxt('evolve.txt', ndmin=2) 421 | n = min(5, len(x)) # number of previous results to consider 422 | x = x[np.argsort(-fitness(x))][:n] # top n mutations 423 | w = fitness(x) - fitness(x).min() # weights 424 | if parent == 'single' or len(x) == 1: 425 | # x = x[random.randint(0, n - 1)] # random selection 426 | x = x[random.choices(range(n), weights=w)[0]] # weighted selection 427 | elif parent == 'weighted': 428 | x = (x * w.reshape(n, 1)).sum(0) / w.sum() # weighted combination 429 | 430 | # Mutate 431 | mp, s = 0.9, 0.2 # mutation probability, sigma 432 | npr = np.random 433 | npr.seed(int(time.time())) 434 | g = np.array([1, 1, 1, 1, 1, 1, 1, 0, .1, 1, 0, 1, 1, 1, 1, 1, 1, 1]) # gains 435 | ng = len(g) 436 | v = np.ones(ng) 437 | while all(v == 1): # mutate until a change occurs (prevent duplicates) 438 | v = (g * (npr.random(ng) < mp) * npr.randn(ng) * npr.random() * s + 1).clip(0.3, 3.0) 439 | for i, k in enumerate(hyp.keys()): # plt.hist(v.ravel(), 300) 440 | hyp[k] = x[i + 7] * v[i] # mutate 441 | 442 | # Clip to limits 443 | keys = ['lr0', 'iou_t', 'momentum', 'weight_decay', 'hsv_s', 'hsv_v', 'translate', 'scale', 'fl_gamma'] 444 | limits = [(1e-5, 1e-2), (0.00, 0.70), (0.60, 0.98), (0, 0.001), (0, .9), (0, .9), (0, .9), (0, .9), (0, 3)] 445 | for k, v in zip(keys, limits): 446 | hyp[k] = np.clip(hyp[k], v[0], v[1]) 447 | 448 | # Train mutation 449 | results = train(hyp.copy()) 450 | 451 | # Write mutation results 452 | print_mutation(hyp, results, opt.bucket) 453 | 454 | # Plot results 455 | # plot_evolution_results(hyp) 456 | -------------------------------------------------------------------------------- /utils/activations.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import torch.nn.functional as F 4 | import torch.nn as nn 5 | 6 | 7 | # Swish ------------------------------------------------------------------------ 8 | class SwishImplementation(torch.autograd.Function): 9 | @staticmethod 10 | def forward(ctx, x): 11 | ctx.save_for_backward(x) 12 | return x * torch.sigmoid(x) 13 | 14 | @staticmethod 15 | def backward(ctx, grad_output): 16 | x = ctx.saved_tensors[0] 17 | sx = torch.sigmoid(x) 18 | return grad_output * (sx * (1 + x * (1 - sx))) 19 | 20 | 21 | class MemoryEfficientSwish(nn.Module): 22 | @staticmethod 23 | def forward(x): 24 | return SwishImplementation.apply(x) 25 | 26 | 27 | class HardSwish(nn.Module): # https://arxiv.org/pdf/1905.02244.pdf 28 | @staticmethod 29 | def forward(x): 30 | return x * F.hardtanh(x + 3, 0., 6., True) / 6. 
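# --- Editor's aside (illustrative sketch, not part of the original activations.py) ---------------
# Every Swish variant in this file implements f(x) = x * sigmoid(x); MemoryEfficientSwish only saves
# memory by recomputing sigmoid(x) in backward() instead of storing the intermediate activation.
# A small self-contained check of that equivalence (run by hand if curious; the helper name below is
# the editor's, not part of this project):
def _swish_equivalence_demo():
    import torch
    x = torch.randn(8, dtype=torch.float64, requires_grad=True)
    # forward values agree
    assert torch.allclose(MemoryEfficientSwish()(x), x * torch.sigmoid(x))
    # the hand-written backward matches autograd's gradient of x * sigmoid(x)
    g1, = torch.autograd.grad(MemoryEfficientSwish()(x).sum(), x)
    g2, = torch.autograd.grad((x * torch.sigmoid(x)).sum(), x)
    assert torch.allclose(g1, g2)
# --------------------------------------------------------------------------------------------------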
31 | 32 | 33 | class Swish(nn.Module): 34 | @staticmethod 35 | def forward(x): 36 | return x * torch.sigmoid(x) 37 | 38 | 39 | # Mish ------------------------------------------------------------------------ 40 | class MishImplementation(torch.autograd.Function): 41 | @staticmethod 42 | def forward(ctx, x): 43 | ctx.save_for_backward(x) 44 | return x.mul(torch.tanh(F.softplus(x))) # x * tanh(ln(1 + exp(x))) 45 | 46 | @staticmethod 47 | def backward(ctx, grad_output): 48 | x = ctx.saved_tensors[0] 49 | sx = torch.sigmoid(x) 50 | fx = F.softplus(x).tanh() 51 | return grad_output * (fx + x * sx * (1 - fx * fx)) 52 | 53 | 54 | class MemoryEfficientMish(nn.Module): 55 | @staticmethod 56 | def forward(x): 57 | return MishImplementation.apply(x) 58 | 59 | 60 | class Mish(nn.Module): # https://github.com/digantamisra98/Mish 61 | @staticmethod 62 | def forward(x): 63 | return x * F.softplus(x).tanh() 64 | -------------------------------------------------------------------------------- /utils/google_utils.py: -------------------------------------------------------------------------------- 1 | # This file contains google utils: https://cloud.google.com/storage/docs/reference/libraries 2 | # pip install --upgrade google-cloud-storage 3 | # from google.cloud import storage 4 | 5 | import os 6 | import time 7 | import subprocess 8 | from pathlib import Path 9 | 10 | def gsutil_getsize(url=''): 11 | s=subprocess.check_output('gsutil du %s' % url,shell=True).decode('utf-8') 12 | def attempt_download(weights): 13 | # Attempt to download pretrained weights if not found locally 14 | weights = weights.strip() 15 | msg = weights + ' missing, try downloading from https://drive.google.com/drive/folders/1Drs_Aiu7xx6S-ix95f9kNsA6ueKRpN2J' 16 | 17 | r = 1 18 | if len(weights) > 0 and not os.path.isfile(weights): 19 | d = {'yolov3-spp.pt': '1mM67oNw4fZoIOL1c8M3hHmj66d8e-ni_', # yolov3-spp.yaml 20 | 'yolov5s.pt': '1R5T6rIyy3lLwgFXNms8whc-387H0tMQO', # yolov5s.yaml 21 | 'yolov5m.pt': '1vobuEExpWQVpXExsJ2w-Mbf3HJjWkQJr', # yolov5m.yaml 22 | 'yolov5l.pt': '1hrlqD1Wdei7UT4OgT785BEk1JwnSvNEV', # yolov5l.yaml 23 | 'yolov5x.pt': '1mM8aZJlWTxOg7BZJvNUMrTnA2AbeCVzS', # yolov5x.yaml 24 | } 25 | 26 | file = Path(weights).name 27 | if file in d: 28 | r = gdrive_download(id=d[file], name=weights) 29 | 30 | if not (r == 0 and os.path.exists(weights) and os.path.getsize(weights) > 1E6): # weights exist and > 1MB 31 | os.remove(weights) if os.path.exists(weights) else None # remove partial downloads 32 | s = "curl -L -o %s 'https://storage.googleapis.com/ultralytics/yolov5/ckpt/%s'" % (weights, file) 33 | r = os.system(s) # execute, capture return values 34 | 35 | # Error check 36 | if not (r == 0 and os.path.exists(weights) and os.path.getsize(weights) > 1E6): # weights exist and > 1MB 37 | os.remove(weights) if os.path.exists(weights) else None # remove partial downloads 38 | raise Exception(msg) 39 | 40 | 41 | def gdrive_download(id='1HaXkef9z6y5l4vUnCYgdmEAj61c6bfWO', name='coco.zip'): 42 | # https://gist.github.com/tanaikech/f0f2d122e05bf5f971611258c22c110f 43 | # Downloads a file from Google Drive, accepting presented query 44 | # from utils.google_utils import *; gdrive_download() 45 | t = time.time() 46 | 47 | print('Downloading https://drive.google.com/uc?export=download&id=%s as %s... 
' % (id, name), end='') 48 | os.remove(name) if os.path.exists(name) else None # remove existing 49 | os.remove('cookie') if os.path.exists('cookie') else None 50 | 51 | # Attempt file download 52 | os.system("curl -c ./cookie -s -L \"https://drive.google.com/uc?export=download&id=%s\" > /dev/null" % id) 53 | if os.path.exists('cookie'): # large file 54 | s = "curl -Lb ./cookie \"https://drive.google.com/uc?export=download&confirm=`awk '/download/ {print $NF}' ./cookie`&id=%s\" -o %s" % ( 55 | id, name) 56 | else: # small file 57 | s = "curl -s -L -o %s 'https://drive.google.com/uc?export=download&id=%s'" % (name, id) 58 | r = os.system(s) # execute, capture return values 59 | os.remove('cookie') if os.path.exists('cookie') else None 60 | 61 | # Error check 62 | if r != 0: 63 | os.remove(name) if os.path.exists(name) else None # remove partial 64 | print('Download error ') # raise Exception('Download error') 65 | return r 66 | 67 | # Unzip if archive 68 | if name.endswith('.zip'): 69 | print('unzipping... ', end='') 70 | os.system('unzip -q %s' % name) # unzip 71 | os.remove(name) # remove zip to free space 72 | 73 | print('Done (%.1fs)' % (time.time() - t)) 74 | return r 75 | 76 | # def upload_blob(bucket_name, source_file_name, destination_blob_name): 77 | # # Uploads a file to a bucket 78 | # # https://cloud.google.com/storage/docs/uploading-objects#storage-upload-object-python 79 | # 80 | # storage_client = storage.Client() 81 | # bucket = storage_client.get_bucket(bucket_name) 82 | # blob = bucket.blob(destination_blob_name) 83 | # 84 | # blob.upload_from_filename(source_file_name) 85 | # 86 | # print('File {} uploaded to {}.'.format( 87 | # source_file_name, 88 | # destination_blob_name)) 89 | # 90 | # 91 | # def download_blob(bucket_name, source_blob_name, destination_file_name): 92 | # # Uploads a blob from a bucket 93 | # storage_client = storage.Client() 94 | # bucket = storage_client.get_bucket(bucket_name) 95 | # blob = bucket.blob(source_blob_name) 96 | # 97 | # blob.download_to_filename(destination_file_name) 98 | # 99 | # print('Blob {} downloaded to {}.'.format( 100 | # source_blob_name, 101 | # destination_file_name)) 102 | -------------------------------------------------------------------------------- /utils/load_lpr_data.py: -------------------------------------------------------------------------------- 1 | from imutils import paths 2 | import numpy as np 3 | import random 4 | import cv2 5 | import os 6 | 7 | from torch.utils.data import Dataset 8 | 9 | CHARS = ['京', '沪', '津', '渝', '冀', '晋', '蒙', '辽', '吉', '黑', 10 | '苏', '浙', '皖', '闽', '赣', '鲁', '豫', '鄂', '湘', '粤', 11 | '桂', '琼', '川', '贵', '云', '藏', '陕', '甘', '青', '宁', 12 | '新', 13 | '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 14 | 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'J', 'K', 15 | 'L', 'M', 'N', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 16 | 'W', 'X', 'Y', 'Z', 'I', 'O', '-' 17 | ] 18 | 19 | CHARS_DICT = {char:i for i, char in enumerate(CHARS)} 20 | 21 | class LPRDataLoader(Dataset): 22 | def __init__(self, img_dir, imgSize, lpr_max_len, PreprocFun=None): 23 | self.img_dir = img_dir 24 | self.img_paths = [] 25 | for i in range(len(img_dir)): 26 | self.img_paths += [el for el in paths.list_images(img_dir[i])] 27 | random.shuffle(self.img_paths) 28 | self.img_size = imgSize 29 | self.lpr_max_len = lpr_max_len 30 | if PreprocFun is not None: 31 | self.PreprocFun = PreprocFun 32 | else: 33 | self.PreprocFun = self.transform 34 | 35 | def __len__(self): 36 | return len(self.img_paths) 37 | 38 | def __getitem__(self, 
index): 39 | filename = self.img_paths[index] 40 | # Image = cv2.imread(filename) 41 | Image = cv2.imdecode(np.fromfile(filename, dtype=np.uint8), -1) 42 | Image = cv2.cvtColor(Image, cv2.COLOR_RGB2BGR) 43 | height, width, _ = Image.shape 44 | if height != self.img_size[1] or width != self.img_size[0]: 45 | Image = cv2.resize(Image, self.img_size) 46 | Image = self.PreprocFun(Image) 47 | 48 | basename = os.path.basename(filename) 49 | imgname, suffix = os.path.splitext(basename) 50 | imgname = imgname.split("-")[0].split("_")[0] 51 | label = list() 52 | for c in imgname: 53 | # one_hot_base = np.zeros(len(CHARS)) 54 | # one_hot_base[CHARS_DICT[c]] = 1 55 | label.append(CHARS_DICT[c]) 56 | 57 | if len(label) == 8: 58 | if self.check(label) == False: 59 | print(imgname) 60 | assert 0, "Error label ^~^!!!" 61 | 62 | return Image, label, len(label) 63 | 64 | def transform(self, img): 65 | img = img.astype('float32') 66 | img -= 127.5 67 | img *= 0.0078125 68 | img = np.transpose(img, (2, 0, 1)) 69 | 70 | return img 71 | 72 | def check(self, label): 73 | if label[2] != CHARS_DICT['D'] and label[2] != CHARS_DICT['F'] \ 74 | and label[-1] != CHARS_DICT['D'] and label[-1] != CHARS_DICT['F']: 75 | print("Error label, Please check!") 76 | return False 77 | else: 78 | return True 79 | -------------------------------------------------------------------------------- /utils/metrics.py: -------------------------------------------------------------------------------- 1 | # 该文件通过训练的预测结果与gt结合计算出P、R、F1-score、AP、不同IOU阈值下的mAP等。同时还能对上述指标进行可视化 2 | # 可以绘制混淆矩阵以及P-R曲线 3 | 4 | import math # 数学函数模块 5 | import warnings # 发出警告 6 | from pathlib import Path # Path将str转换为Path对象 使字符串路径易于操作的模块 7 | import matplotlib.pyplot as plt # matplotlib画图模块 8 | import numpy as np # numpy数组操作模块 9 | import torch # pytorch框架 10 | 11 | 12 | 13 | def fitness(x): 14 | """通过指标加权的形式返回适应度(最终mAP) 在train.py中使用 15 | Model fitness as a weighted combination of metrics 16 | 判断模型好坏的指标不是mAP@0.5也不是mAP@0.5:0.95 而是[P, R, mAP@0.5, mAP@0.5:0.95]4者的加权 17 | 一般w=[0,0,0.1,0.9] 即最终的mAP=0.1mAP@0.5 + 0.9mAP@0.5:0.95 18 | """ 19 | w = [0.0, 0.0, 0.1, 0.9] # weights for [P, R, mAP@0.5, mAP@0.5:0.95] 20 | # (torch.tensor).sum(1) 每一行求和tensor为二维时返回一个以每一行求和为结果(常数)的行向量 21 | return (x[:, :4] * w).sum(1) 22 | 23 | 24 | def ap_per_class(tp, conf, pred_cls, target_cls, plot=False, save_dir='.', names=()): 25 | """用于val.py中计算每个类的mAP 26 | 计算每一个类的AP指标(average precision)还可以 绘制P-R曲线 27 | mAP基本概念: https://www.bilibili.com/video/BV1ez4y1X7g2 28 | Source: https://github.com/rafaelpadilla/Object-Detection-Metrics 29 | :params tp(correct): [pred_sum, 10]=[1905, 10] bool 整个数据集所有图片中所有预测框在每一个iou条件下(0.5~0.95)10个是否是TP 30 | :params conf: [img_sum]=[1905] 整个数据集所有图片的所有预测框的conf 31 | :params pred_cls: [img_sum]=[1905] 整个数据集所有图片的所有预测框的类别 32 | 这里的tp、conf、pred_cls是一一对应的 33 | :params target_cls: [gt_sum]=[929] 整个数据集所有图片的所有gt框的class 34 | :params plot: bool 35 | :params save_dir: runs\train\exp30 36 | :params names: dict{key(class_index):value(class_name)} 获取数据集所有类别的index和对应类名 37 | :return p[:, i]: [nc] 最大平均f1时每个类别的precision 38 | :return r[:, i]: [nc] 最大平均f1时每个类别的recall 39 | :return ap: [71, 10] 数据集每个类别在10个iou阈值下的mAP 40 | :return f1[:, i]: [nc] 最大平均f1时每个类别的f1 41 | :return unique_classes.astype('int32'): [nc] 返回数据集中所有的类别index 42 | """ 43 | # 计算mAP 需要将tp按照conf降序排列 44 | # Sort by objectness 按conf从大到小排序 返回数据对应的索引 45 | i = np.argsort(-conf) 46 | # 得到重新排序后对应的 tp, conf, pre_cls 47 | tp, conf, pred_cls = tp[i], conf[i], pred_cls[i] 48 | 49 | # Find unique classes 对类别去重, 因为计算ap是对每类进行 50 | unique_classes = 
np.unique(target_cls) 51 | nc = unique_classes.shape[0] # 数据集类别数 number of classes 52 | 53 | # Create Precision-Recall curve and compute AP for each class 54 | # px: [0, 1] 中间间隔1000个点 x坐标(用于绘制P-Conf、R-Conf、F1-Conf) 55 | # py: y坐标[] 用于绘制IOU=0.5时的PR曲线 56 | px, py = np.linspace(0, 1, 1000), [] # for plotting 57 | 58 | # 初始化 对每一个类别在每一个IOU阈值下 计算AP P R ap=[nc, 10] p=[nc, 1000] r=[nc, 1000] 59 | ap, p, r = np.zeros((nc, tp.shape[1])), np.zeros((nc, 1000)), np.zeros((nc, 1000)) 60 | for ci, c in enumerate(unique_classes): # ci: index 0 c: class 0 unique_classes: 所有gt中不重复的class 61 | # i: 记录着所有预测框是否是c类别框 是c类对应位置为True, 否则为False 62 | i = pred_cls == c 63 | # n_l: gt框中的c类别框数量 = tp+fn 254 64 | n_l = (target_cls == c).sum() # number of labels 65 | # n_p: 预测框中c类别的框数量 695 66 | n_p = i.sum() # number of predictions 67 | 68 | # 如果没有预测到 或者 ground truth没有标注 则略过类别c 69 | if n_p == 0 or n_l == 0: 70 | continue 71 | else: 72 | # Accumulate FPs(False Positive) and TPs(Ture Positive) FP + TP = all_detections 73 | # tp[i] 可以根据i中的的True/False觉定是否删除这个数 所有tp中属于类c的预测框 74 | # 如: tp=[0,1,0,1] i=[True,False,False,True] b=tp[i] => b=[0,1] 75 | # a.cumsum(0) 会按照对象进行累加操作 76 | # 一维按行累加如: a=[0,1,0,1] b = a.cumsum(0) => b=[0,1,1,2] 而二维则按列累加 77 | # fpc: 类别为c 顺序按置信度排列 截至到每一个预测框的各个iou阈值下FP个数 最后一行表示c类在该iou阈值下所有FP数 78 | # tpc: 类别为c 顺序按置信度排列 截至到每一个预测框的各个iou阈值下TP个数 最后一行表示c类在该iou阈值下所有TP数 79 | fpc = (1 - tp[i]).cumsum(0) # fp[i] = 1 - tp[i] 80 | tpc = tp[i].cumsum(0) 81 | 82 | # Recall=TP/(TP+FN) 加一个1e-16的目的是防止分母为0 83 | # n_l=TP+FN=num_gt: c类的gt个数=预测是c类而且预测正确+预测不是c类但是预测错误 84 | # recall: 类别为c 顺序按置信度排列 截至每一个预测框的各个iou阈值下的召回率 85 | recall = tpc / (n_l + 1e-16) # recall curve 用于计算mAP 86 | # 返回所有类别, 横坐标为conf(值为px=[0, 1, 1000] 0~1 1000个点)对应的recall值 r=[nc, 1000] 每一行从小到大 87 | r[ci] = np.interp(-px, -conf[i], recall[:, 0], left=0) # 用于绘制R-Confidence(R_curve.png) 88 | 89 | # Precision=TP/(TP+FP) 90 | # precision: 类别为c 顺序按置信度排列 截至每一个预测框的各个iou阈值下的精确率 91 | precision = tpc / (tpc + fpc) # precision curve 用于计算mAP 92 | # 返回所有类别, 横坐标为conf(值为px=[0, 1, 1000] 0~1 1000个点)对应的precision值 p=[nc, 1000] 93 | # 总体上是从小到大 但是细节上有点起伏 如: 0.91503 0.91558 0.90968 0.91026 0.90446 0.90506 94 | p[ci] = np.interp(-px, -conf[i], precision[:, 0], left=1) # 用于绘制P-Confidence(P_curve.png) 95 | 96 | # AP from recall-precision curve 97 | # 对c类别, 分别计算每一个iou阈值(0.5~0.95 10个)下的mAP 98 | for j in range(tp.shape[1]): # tp [pred_sum, 10] 99 | # 这里执行10次计算ci这个类别在所有mAP阈值下的平均mAP ap[nc, 10] 100 | ap[ci, j], mpre, mrec = compute_ap(recall[:, j], precision[:, j]) 101 | if plot and j == 0: 102 | py.append(np.interp(px, mrec, mpre)) # py: 用于绘制每一个类别IOU=0.5时的PR曲线 103 | 104 | # 计算F1分数 P和R的调和平均值 综合评价指标 105 | # 我们希望的是P和R两个越大越好, 但是P和R常常是两个冲突的变量, 经常是P越大R越小, 或者R越大P越小 所以我们引入F1综合指标 106 | # 不同任务的重点不一样, 有些任务希望P越大越好, 有些任务希望R越大越好, 有些任务希望两者都大, 这时候就看F1这个综合指标了 107 | # 返回所有类别, 横坐标为conf(值为px=[0, 1, 1000] 0~1 1000个点)对应的f1值 f1=[nc, 1000] 108 | f1 = 2 * p * r / (p + r + 1e-16) # 用于绘制P-Confidence(F1_curve.png) 109 | 110 | if plot: 111 | plot_pr_curve(px, py, ap, Path(save_dir) / 'PR_curve.png', names) # 画pr曲线 112 | plot_mc_curve(px, f1, Path(save_dir) / 'F1_curve.png', names, ylabel='F1') # 画F1_conf曲线 113 | plot_mc_curve(px, p, Path(save_dir) / 'P_curve.png', names, ylabel='Precision') # 画P_conf曲线 114 | plot_mc_curve(px, r, Path(save_dir) / 'R_curve.png', names, ylabel='Recall') # 画R_conf曲线 115 | 116 | # f1=[nc, 1000] f1.mean(0)=[1000]求出所有类别在x轴每个conf点上的平均f1 117 | # .argmax(): 求出每个点平均f1中最大的f1对应conf点的index 118 | i = f1.mean(0).argmax() # max F1 index 119 | 120 | # p=[nc, 1000] 每个类别在x轴每个conf值对应的precision 121 | # p[:, i]: [nc] 
最大平均f1时每个类别的precision 122 | # r[:, i]: [nc] 最大平均f1时每个类别的recall 123 | # f1[:, i]: [nc] 最大平均f1时每个类别的f1 124 | # ap: [71, 10] 数据集每个类别在10个iou阈值下的mAP 125 | # unique_classes.astype('int32'): [nc] 返回数据集中所有的类别index 126 | return p[:, i], r[:, i], ap, f1[:, i], unique_classes.astype('int32') 127 | 128 | 129 | def compute_ap(recall, precision): 130 | """用于ap_per_class函数中 131 | 计算某个类别在某个iou阈值下的mAP 132 | Compute the average precision, given the recall and precision curves 133 | :params recall: (list) [1635] 在某个iou阈值下某个类别所有的预测框的recall 从小到大 134 | (每个预测框的recall都是截至到这个预测框为止的总recall) 135 | :params precision: (list) [1635] 在某个iou阈值下某个类别所有的预测框的precision 136 | 总体上是从大到小 但是细节上有点起伏 如: 0.91503 0.91558 0.90968 0.91026 0.90446 0.90506 137 | (每个预测框的precision都是截至到这个预测框为止的总precision) 138 | :return ap: Average precision 返回某类别在某个iou下的mAP(均值) [1] 139 | :return mpre: precision curve [1637] 返回 开头 + 输入precision(排序后) + 末尾 140 | :return mrec: recall curve [1637] 返回 开头 + 输入recall + 末尾 141 | """ 142 | # 在开头和末尾添加保护值 防止全零的情况出现 value Append sentinel values to beginning and end 143 | mrec = np.concatenate(([0.], recall, [recall[-1] + 0.01])) # [1637] 144 | mpre = np.concatenate(([1.], precision, [0.])) # [1637] 145 | 146 | # Compute the precision envelope np.flip翻转顺序 147 | # np.flip(mpre): 把一维数组每个元素的顺序进行翻转 第一个翻转成为最后一个 148 | # np.maximum.accumulate(np.flip(mpre)): 计算数组(或数组的特定轴)的累积最大值 令mpre是单调的 从小到大 149 | # np.flip(np.maximum.accumulate(np.flip(mpre))): 从大到小 150 | # 到这大概看明白了这步的目的: 要保证mpre是从大到小单调的(左右可以相同) 151 | # 我觉得这样可能是为了更好计算mAP 因为如果一直起起伏伏太难算了(x间隔很小就是一个矩形) 而且这样做误差也不会很大 两个之间的数都是间隔很小的 152 | mpre = np.flip(np.maximum.accumulate(np.flip(mpre))) 153 | 154 | # Integrate area under curve 155 | method = 'interp' # methods: 'continuous', 'interp' 156 | if method == 'interp': # 用一些典型的间断点来计算AP 157 | x = np.linspace(0, 1, 101) # 101-point interp (COCO) [0, 0.01, ..., 1] 158 | # np.trapz(list,list) 计算两个list对应点与点之间四边形的面积 以定积分形式估算AP 第一个参数是y 第二个参数是x 159 | ap = np.trapz(np.interp(x, mrec, mpre), x) # integrate 160 | else: # 'continuous' # 采用连续的方法计算AP 161 | # 通过错位的方式 判断哪个点当前位置到下一个位置值发生改变 并通过!=判断 返回一个布尔数组 162 | i = np.where(mrec[1:] != mrec[:-1])[0] # points where x axis (recall) changes 163 | # 值改变了就求出当前矩阵的面积 值没变就说明当前矩阵和下一个矩阵的高相等所有可以合并计算 164 | ap = np.sum((mrec[i + 1] - mrec[i]) * mpre[i + 1]) # area under curve 165 | 166 | return ap, mpre, mrec 167 | 168 | 169 | class ConfusionMatrix: 170 | """用在val.py中计算混淆矩阵 171 | Updated version of https://github.com/kaanakan/object_detection_confusion_matrix 172 | 混淆矩阵: 定义 更新 return 绘制 print打印 173 | """ 174 | def __init__(self, nc, conf=0.25, iou_thres=0.45): # 个人觉得这里iou_thres应该改成0.5(和后面计算mAP对应) 175 | """ 176 | params nc: 数据集类别个数 177 | params conf: 预测框置信度阈值 178 | Params iou_thres: iou阈值 179 | """ 180 | # 初始化混淆矩阵 pred x gt 其中横坐标/纵坐标第81类为背景类 181 | # 如果某个gt[j]没用任何pred正样本匹配到 那么[nc, gt[j]_class] += 1 182 | # 如果某个pred[i]负样本且没有哪个gt与之对应 那么[pred[i]_class nc] += 1 183 | self.matrix = np.zeros((nc + 1, nc + 1)) 184 | self.nc = nc # number of classes 185 | self.conf = conf 186 | self.iou_thres = iou_thres 187 | 188 | def process_batch(self, detections, labels): 189 | """ 190 | :params detections: [N, 6] = [pred_obj_num, x1y1x2y2+object_conf+cls] = [300, 6] 191 | 一个batch中一张图的预测信息 其中x1y1x2y2是映射到原图img的 192 | :params labels: [M, 5] = [gt_num, class+x1y1x2y2] = [17, 5] 其中x1y1x2y2是映射到原图img的 193 | :return: None, updates confusion matrix accordingly 194 | """ 195 | # 筛除置信度过低的预测框(和nms差不多) [10, 6] 196 | detections = detections[detections[:, 4] > self.conf] 197 | 198 | gt_classes = labels[:, 0].int() # 所有gt框类别(int) [17] 类别可能会重复 199 | 
detection_classes = detections[:, 5].int() # 所有pred框类别(int) [10] 类别可能会重复 Positive + Negative 200 | 201 | # 求出所有gt框和所有pred框的iou [17, x1y1x2y2] + [10, x1y1x2y2] => [17, 10] [i, j] 第i个gt框和第j个pred的iou 202 | iou = box_iou(labels[:, 1:], detections[:, :4]) 203 | 204 | # iou > self.iou_thres: [17, 10] bool 符合条件True 不符合False 205 | # x[0]: [10] gt_index x[1]: [10] pred_index x合起来看就是第x[0]个gt框和第x[1]个pred的iou符合条件 206 | # 17 x 10个iou 经过iou阈值筛选后只有10个满足iou阈值条件 207 | x = torch.where(iou > self.iou_thres) 208 | 209 | # 后面会专门对这里一连串的matches变化给个实例再解释 210 | if x[0].shape[0]: # 存在大于阈值的iou时 211 | # torch.stack(x, 1): [10, gt_index+pred_index] 212 | # iou[x[0], x[1]][:, None]): [10, 1] x[0]和x[1]的iou 213 | # 1、matches: [10, gt_index+pred_index+iou] = [10, 3] 214 | matches = torch.cat((torch.stack(x, 1), iou[x[0], x[1]][:, None]), 1).cpu().numpy() 215 | if x[0].shape[0] > 1: 216 | # 2、matches按第三列iou从大到小重排序 217 | matches = matches[matches[:, 2].argsort()[::-1]] 218 | # 3、取第二列中各个框首次出现(不同预测的框)的行(即每一种预测的框中iou最大的那个) 219 | matches = matches[np.unique(matches[:, 1], return_index=True)[1]] 220 | # 4、matches再按第三列iou从大到小重排序 221 | matches = matches[matches[:, 2].argsort()[::-1]] 222 | # 5、取第一列中各个框首次出现(不同gt的框)的行(即每一种gt框中iou最大的那个) 223 | matches = matches[np.unique(matches[:, 0], return_index=True)[1]] # [9, gt_index+pred_index+iou] 224 | # 经过这样的处理 最终得到每一种预测框与所有gt框中iou最大的那个(在大于阈值的前提下) 225 | # 预测框唯一 gt框也唯一 这样得到的matches对应的Pred都是正样本Positive 9个 226 | else: 227 | matches = np.zeros((0, 3)) 228 | 229 | n = matches.shape[0] > 0 # 满足条件的iou是否大于0个 bool 230 | # a.transpose(): 转换维度 对二维数组就是转置 这里的matches: [9, gt_index+pred_index+iou] -> [gt_index+pred_index+iou, 17] 231 | # m0: [1, 9] 满足条件(正样本)的gt框index(不重复) m1: [1, 9] 满足条件(正样本)的pred框index(不重复) 232 | m0, m1, _ = matches.transpose().astype(np.int16) 233 | for i, gc in enumerate(gt_classes): 234 | j = m0 == i 235 | if n and sum(j) == 1: 236 | # 如果sum(j)=1 说明gt[i]这个真实框被某个预测框检测到了 但是detection_classes[m1[j]]并不一定等于gc 所以此时可能是TP或者是FP 237 | # m1[j]: gt框index=i时, 满足条件的pred框index detection_classes[m1[j]]: pred_class_index 238 | # gc: gt_class_index matrix[pred_class_index,gt_class_index] += 1 239 | self.matrix[detection_classes[m1[j]], gc] += 1 # TP + FP 某个gt检测到了(Positive IOU ≥ threshold) 但是有可能分类分错了 也有可能分类分对了 240 | else: 241 | # 如果sum(j)=0 说明gt[i]这个真实框没用被任何预测框检测到 也就是说这个真实框被检测成了背景框 242 | # 所以对应的混淆矩阵 [背景类, gc] += 1 其中横坐标第81类是背景background IOU < threshold 243 | self.matrix[self.nc, gc] += 1 # background FP +1 某个gt没检测到 被检测为background了 244 | 245 | if n: 246 | for i, dc in enumerate(detection_classes): 247 | if not any(m1 == i): 248 | # detection_classes - matrix[1] = negative 且没用对应的gt和negative相对应 所以background FN+1 249 | self.matrix[dc, self.nc] += 1 # background FN 250 | 251 | def matrix(self): 252 | # 返回这个混淆矩阵 253 | return self.matrix 254 | 255 | def plot(self, normalize=True, save_dir='', names=()): 256 | """ 257 | :params normalize: 是否将混淆矩阵归一化 默认True 258 | :params save_dir: runs/train/expn 混淆矩阵保存地址 259 | :params names: 数据集的所有类别名 260 | :return None 261 | """ 262 | try: 263 | import seaborn as sn # seaborn 为matplotlib可视化更好看的一个模块 264 | 265 | array = self.matrix / ((self.matrix.sum(0).reshape(1, -1) + 1E-6) if normalize else 1) # 混淆矩阵归一化 0~1 266 | array[array < 0.005] = np.nan # 混淆矩阵中小于0.005的值被认为NaN 267 | 268 | fig = plt.figure(figsize=(12, 9), tight_layout=True) # 初始化画布 269 | sn.set(font_scale=1.0 if self.nc < 50 else 0.8) # 设置label的字体大小 270 | labels = (0 < len(names) < 99) and len(names) == self.nc # 绘制混淆矩阵时 是否使用names作为labels 271 | 272 | # 绘制热力图 即混淆矩阵可视化 273 | with warnings.catch_warnings(): 274 | 
warnings.simplefilter('ignore') # suppress empty matrix RuntimeWarning: All-NaN slice encountered 275 | # sean.heatmap: 热力图 data: 数据矩阵 annot: 为True时为每个单元格写入数据值 False用颜色深浅表示 276 | # annot_kws: 格子外框宽度 fmt: 添加注释时要使用的字符串格式代码 cmap: 指色彩颜色的选择 277 | # square: 是否是正方形 xticklabels、yticklabels: xy标签 278 | sn.heatmap(array, annot=self.nc < 30, annot_kws={"size": 8}, cmap='Blues', fmt='.2f', square=True, 279 | xticklabels=names + ['background FP'] if labels else "auto", 280 | yticklabels=names + ['background FN'] if labels else "auto").set_facecolor((1, 1, 1)) 281 | # 设置figure的横坐标 纵坐标及保存该图片 282 | fig.axes[0].set_xlabel('True') 283 | fig.axes[0].set_ylabel('Predicted') 284 | fig.savefig(Path(save_dir) / 'confusion_matrix.png', dpi=250) 285 | except Exception as e: 286 | print(f'WARNING: ConfusionMatrix plot failure: {e}') 287 | 288 | def print(self): 289 | # print按行输出打印混淆矩阵matrix 290 | for i in range(self.nc + 1): 291 | print(' '.join(map(str, self.matrix[i]))) 292 | 293 | def bbox_iou(box1, box2, x1y1x2y2=True, GIoU=False, DIoU=False, CIoU=False, eps=1e-7): 294 | """在ComputeLoss的__call__函数中调用计算回归损失 295 | :params box1: 预测框 296 | :params box2: 预测框 297 | :return box1和box2的IoU/GIoU/DIoU/CIoU 298 | """ 299 | box2 = box2.T 300 | 301 | # Get the coordinates of bounding boxes 302 | if x1y1x2y2: # x1, y1, x2, y2 = box1 303 | b1_x1, b1_y1, b1_x2, b1_y2 = box1[0], box1[1], box1[2], box1[3] 304 | b2_x1, b2_y1, b2_x2, b2_y2 = box2[0], box2[1], box2[2], box2[3] 305 | else: # transform from xywh to xyxy 306 | b1_x1, b1_x2 = box1[0] - box1[2] / 2, box1[0] + box1[2] / 2 307 | b1_y1, b1_y2 = box1[1] - box1[3] / 2, box1[1] + box1[3] / 2 308 | b2_x1, b2_x2 = box2[0] - box2[2] / 2, box2[0] + box2[2] / 2 309 | b2_y1, b2_y2 = box2[1] - box2[3] / 2, box2[1] + box2[3] / 2 310 | 311 | # Intersection area tensor.clamp(0): 将矩阵中小于0的元数变成0 312 | inter = (torch.min(b1_x2, b2_x2) - torch.max(b1_x1, b2_x1)).clamp(0) * \ 313 | (torch.min(b1_y2, b2_y2) - torch.max(b1_y1, b2_y1)).clamp(0) 314 | 315 | # Union Area 316 | w1, h1 = b1_x2 - b1_x1, b1_y2 - b1_y1 + eps 317 | w2, h2 = b2_x2 - b2_x1, b2_y2 - b2_y1 + eps 318 | union = w1 * h1 + w2 * h2 - inter + eps 319 | 320 | iou = inter / union 321 | if GIoU or DIoU or CIoU: 322 | cw = torch.max(b1_x2, b2_x2) - torch.min(b1_x1, b2_x1) # 两个框的最小闭包区域的width 323 | ch = torch.max(b1_y2, b2_y2) - torch.min(b1_y1, b2_y1) # 两个框的最小闭包区域的height 324 | if CIoU or DIoU: # Distance or Complete IoU https://arxiv.org/abs/1911.08287v1 325 | c2 = cw ** 2 + ch ** 2 + eps # convex diagonal squared 326 | rho2 = ((b2_x1 + b2_x2 - b1_x1 - b1_x2) ** 2 + 327 | (b2_y1 + b2_y2 - b1_y1 - b1_y2) ** 2) / 4 # center distance squared 328 | if DIoU: 329 | return iou - rho2 / c2 # DIoU 330 | elif CIoU: # https://github.com/Zzh-tju/DIoU-SSD-pytorch/blob/master/utils/box/box_utils.py#L47 331 | v = (4 / math.pi ** 2) * torch.pow(torch.atan(w2 / h2) - torch.atan(w1 / h1), 2) 332 | with torch.no_grad(): 333 | alpha = v / (v - iou + (1 + eps)) 334 | return iou - (rho2 / c2 + v * alpha) # CIoU 335 | else: # GIoU https://arxiv.org/pdf/1902.09630.pdf 336 | c_area = cw * ch + eps # convex area 337 | return iou - (c_area - union) / c_area # GIoU 338 | else: 339 | return iou # IoU 340 | 341 | def box_iou(box1, box2): 342 | """用于计算混淆矩阵 343 | https://github.com/pytorch/vision/blob/master/torchvision/ops/boxes.py 344 | :params box1: (Tensor[N, 4]) [N, x1y1x2y2] 345 | :params box2: (Tensor[M, 4]) [M, x1y1x2y2] 346 | :return box1和box2的iou [N, M] 347 | """ 348 | def box_area(box): 349 | # 求出box的面积 350 | return (box[2] - box[0]) * (box[3] - box[1]) 
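# --- Editor's aside (illustrative worked example, not part of the original metrics.py) -----------
# box_iou broadcasts an [N, 4] and an [M, 4] set of x1y1x2y2 boxes into an [N, M] IoU matrix.
# For a single pair, box1 = [0, 0, 10, 10] and box2 = [5, 5, 15, 15]:
#   intersection = (min(10, 15) - max(0, 5)) * (min(10, 15) - max(0, 5)) = 5 * 5 = 25
#   union        = 10*10 + 10*10 - 25 = 175
#   IoU          = 25 / 175 ≈ 0.143
# so box_iou(torch.tensor([[0., 0., 10., 10.]]), torch.tensor([[5., 5., 15., 15.]])) gives tensor([[0.1429]])
# --------------------------------------------------------------------------------------------------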
351 | 352 | area1 = box_area(box1.T) # box1面积 353 | area2 = box_area(box2.T) # box2面积 354 | 355 | # inter(N,M) = (rb(N,M,2) - lt(N,M,2)).clamp(0).prod(2) 356 | # 等价于(torch.min(b1_x2, b2_x2) - torch.max(b1_x1, b2_x1)).clamp(0) * (torch.min(b1_y2, b2_y2) - torch.max(b1_y1, b2_y1)).clamp(0) 357 | inter = (torch.min(box1[:, None, 2:], box2[:, 2:]) - torch.max(box1[:, None, :2], box2[:, :2])).clamp(0).prod(2) 358 | return inter / (area1[:, None] + area2 + 1e-16 - inter) # iou = inter / (area1 + area2 - inter) 359 | 360 | def bbox_ioa(box1, box2, eps=1E-7): 361 | """ Returns the intersection over box2 area given box1, box2. Boxes are x1y1x2y2 362 | box1: np.array of shape(4) 363 | box2: np.array of shape(nx4) 364 | returns: np.array of shape(n) 365 | """ 366 | 367 | box2 = box2.transpose() 368 | 369 | # Get the coordinates of bounding boxes 370 | b1_x1, b1_y1, b1_x2, b1_y2 = box1[0], box1[1], box1[2], box1[3] 371 | b2_x1, b2_y1, b2_x2, b2_y2 = box2[0], box2[1], box2[2], box2[3] 372 | 373 | # Intersection area 374 | inter_area = (np.minimum(b1_x2, b2_x2) - np.maximum(b1_x1, b2_x1)).clip(0) * \ 375 | (np.minimum(b1_y2, b2_y2) - np.maximum(b1_y1, b2_y1)).clip(0) 376 | 377 | # box2 area 378 | box2_area = (b2_x2 - b2_x1) * (b2_y2 - b2_y1) + eps 379 | 380 | # Intersection over box2 area 381 | return inter_area / box2_area 382 | 383 | def wh_iou(wh1, wh2): 384 | """在ComputeLoss类的build_targets函数中被调用(老版正样本筛选条件) 385 | :params wh1: anchors 当前feature map的3个anchor [N, 2] 386 | :params wh2: t[:, 4:6] gt框的wh(没筛选 所有的gt) [M, 2] 387 | :return 返回wh1和wh2的iou(矩阵) 388 | """ 389 | # Returns the nxm IoU matrix. wh1 is nx2, wh2 is mx2 390 | wh1 = wh1[:, None] # [N,2] -> [N,1,2] 391 | wh2 = wh2[None] # [M, 2] -> [1,M,2] 392 | # 这里会利用广播机制使wh1: [N,1,2]->[N,M,2] wh2: [1,M,2]->[N,M,2] 393 | # 相当于 inter = torch.min(w1, w2) * torch.min(h1, h2) 394 | # 计算inter 默认两个bounding box的左上角是重叠在一起的 这样才可以计算 可以自己画个图就明白了 395 | inter = torch.min(wh1, wh2).prod(2) # [N,M] 396 | # iou = inter / (area1 + area2 - inter) 1e-16防止分母为0 prod(2): 宽高相乘(矩阵运算) 397 | return inter / (wh1.prod(2) + wh2.prod(2) + 1e-16 - inter) 398 | 399 | # Plots ---------------------------------------------------------------------------------------------------------------- 400 | def plot_pr_curve(px, py, ap, save_dir='pr_curve.png', names=()): 401 | """用于ap_per_class函数 402 | Precision-recall curve 绘制PR曲线 403 | :params px: [1000] 横坐标 recall 值为0~1直接取1000个数 404 | :params py: list{nc} nc个[1000] 所有类别在IOU=0.5,横坐标为px(recall)时的precision 405 | :params ap: [nc, 10] 所有类别在每个IOU阈值下的平均mAP 406 | :params save_dir: runs\test\exp54\PR_curve.png PR曲线存储位置 407 | :params names: {dict:80} 数据集所有类别的字典 key:value 408 | """ 409 | fig, ax = plt.subplots(1, 1, figsize=(9, 6), tight_layout=True) # 设置画布 410 | py = np.stack(py, axis=1) # [1000, nc] 411 | 412 | # 画出所有类别在10个IOU阈值下的PR曲线 413 | if 0 < len(names) < 21: # display per-class legend if < 21 classes 414 | for i, y in enumerate(py.T): # 如果<21 classes就一个个类画 因为要显示图例就必须一个个画 415 | ax.plot(px, y, linewidth=1, label=f'{names[i]} {ap[i, 0]:.3f}') # plot(recall, precision) 416 | else: # 如果>=21 classes 显示图例就会很乱 所以就不显示图例了 可以直接输入数组 x[1000] y[1000, 71] 417 | ax.plot(px, py, linewidth=1, color='grey') # plot(recall, precision) 418 | 419 | # 画出所有类别在IOU=0.5阈值下的平均PR曲线 420 | ax.plot(px, py.mean(1), linewidth=3, color='blue', label='all classes %.3f mAP@0.5' % ap[:, 0].mean()) 421 | ax.set_xlabel('Recall') # 设置x轴标签 422 | ax.set_ylabel('Precision') # 设置y轴标签 423 | ax.set_xlim(0, 1) # x=[0, 1] 424 | ax.set_ylim(0, 1) # y=[0, 1] 425 | plt.legend(bbox_to_anchor=(1.04, 1), 
loc="upper left") # 显示图例 426 | fig.savefig(Path(save_dir), dpi=250) # 保存PR_curve.png图片 427 | 428 | def plot_mc_curve(px, py, save_dir='mc_curve.png', names=(), xlabel='Confidence', ylabel='Metric'): 429 | """用于ap_per_class函数 430 | Metric-Confidence curve 可用于绘制 F1-Confidence/P-Confidence/R-Confidence曲线 431 | :params px: [0, 1, 1000] 横坐标 0-1 1000个点 conf [1000] 432 | :params py: 对每个类, 针对横坐标为conf=[0, 1, 1000] 对应的f1/p/r值 纵坐标 [71, 1000] 433 | :params save_dir: 图片保存地址 434 | :parmas names: 数据集names 435 | :params xlabel: x轴标签 436 | :params ylabel: y轴标签 437 | """ 438 | fig, ax = plt.subplots(1, 1, figsize=(9, 6), tight_layout=True) # 设置画布 439 | 440 | # 画出所有类别的F1-Confidence/P-Confidence/R-Confidence曲线 441 | if 0 < len(names) < 21: # display per-class legend if < 21 classes 442 | for i, y in enumerate(py): # 如果<21 classes就一个个类画 因为要显示图例就必须一个个画 443 | ax.plot(px, y, linewidth=1, label=f'{names[i]}') # plot(confidence, metric) 444 | else: # 如果>=21 classes 显示图例就会很乱 所以就不显示图例了 可以直接输入数组 x[1000] y[1000, 71] 445 | ax.plot(px, py.T, linewidth=1, color='grey') # plot(confidence, metric) 446 | 447 | # 画出所有类别在每个x点(conf)对应的均值F1-Confidence/P-Confidence/R-Confidence曲线 448 | y = py.mean(0) # [1000] 求出所以类别在每个x点(conf)的平均值 449 | ax.plot(px, y, linewidth=3, color='blue', label=f'all classes {y.max():.2f} at {px[y.argmax()]:.3f}') 450 | ax.set_xlabel(xlabel) # 设置x轴标签 451 | ax.set_ylabel(ylabel) # 设置y轴标签 452 | ax.set_xlim(0, 1) # x=[0, 1] 453 | ax.set_ylim(0, 1) # y=[0, 1] 454 | plt.legend(bbox_to_anchor=(1.04, 1), loc="upper left") # 显示图例 455 | fig.savefig(Path(save_dir), dpi=250) # 保存png图片 456 | -------------------------------------------------------------------------------- /utils/plots.py: -------------------------------------------------------------------------------- 1 | # Plotting utils 这个脚本都是一些画图工具 2 | 3 | import glob # 仅支持部分通配符的文件搜索模块 4 | import math # 数学公式模块 5 | import os # 与操作系统进行交互的模块 6 | from copy import copy # 提供通用的浅层和深层copy操作 7 | from pathlib import Path # Path将str转换为Path对象 使字符串路径易于操作的模块 8 | 9 | import cv2 # opencv库 10 | import matplotlib # matplotlib模块 11 | import matplotlib.pyplot as plt # matplotlib画图模块 12 | import numpy as np # numpy矩阵处理函数库 13 | import pandas as pd # pandas矩阵操作模块 14 | import seaborn as sn # 基于matplotlib的图形可视化python包 能够做出各种有吸引力的统计图表 15 | import torch # pytorch框架 16 | import yaml # yaml配置文件读写模块 17 | from PIL import Image, ImageDraw, ImageFont # 图片操作模块 18 | from torchvision import transforms # 包含很多种对图像数据进行变换的函数 19 | 20 | from utils.general import increment_path, xywh2xyxy, xyxy2xywh 21 | from utils.metrics import fitness 22 | 23 | # 设置一些基本的配置 Settings 24 | matplotlib.rc('font', **{'size': 11}) # 自定义matplotlib图上字体font大小size=11 25 | # 在PyCharm 页面中控制绘图显示与否 26 | # 如果这句话放在import matplotlib.pyplot as plt之前就算加上plt.show()也不会再屏幕上绘图 放在之后其实没什么用 27 | matplotlib.use('Agg') # for writing to files only 28 | 29 | 30 | class Colors: 31 | # Ultralytics color palette https://ultralytics.com/ 32 | def __init__(self): 33 | # hex = matplotlib.colors.TABLEAU_COLORS.values() 34 | # 红 浅红 橘色 黄色 - 浅绿 'FF37C7' 'FF3838', 'FF9D97'这三个是红色 暂时去掉 35 | hex = ('FF701F', 'FFB21D', 'CFD231', '48F90A', '92CC17', '3DDB86', '1A9334', '00D4BB', 36 | '2C99A8', '00C2FF', '344593', '6473FF', '0018EC', '8438FF', '520085', 'CB38FF', 'FF95C8') 37 | # 将hex列表中所有hex格式(十六进制)的颜色转换rgb格式的颜色 38 | self.palette = [self.hex2rgb('#' + c) for c in hex] 39 | # 颜色个数 40 | self.n = len(self.palette) 41 | 42 | def __call__(self, i, bgr=False): 43 | # 根据输入的index 选择对应的rgb颜色 44 | c = self.palette[int(i) % self.n] 45 | # 返回选择的颜色 默认是rgb 46 | return (c[2], c[1], 
c[0]) if bgr else c 47 | 48 | @staticmethod 49 | def hex2rgb(h): # rgb order (PIL) 50 | # hex -> rgb 51 | return tuple(int(h[1 + i:1 + i + 2], 16) for i in (0, 2, 4)) 52 | colors = Colors() # 初始化Colors对象 下面调用colors的时候会调用__call__函数 53 | 54 | def plot_one_box(x, im, color=(128, 128, 128), label=None, line_thickness=3): 55 | """一般会用在detect.py中在nms之后变量每一个预测框,再将每个预测框画在原图上 56 | 使用opencv在原图im上画一个bounding box 57 | :params x: 预测得到的bounding box [x1 y1 x2 y2] 58 | :params im: 原图 要将bounding box画在这个图上 array 59 | :params color: bounding box线的颜色 60 | :params labels: 标签上的框框信息 类别 + score 61 | :params line_thickness: bounding box的线宽 62 | """ 63 | # check im内存是否连续 64 | assert im.data.contiguous, 'Image not contiguous. Apply np.ascontiguousarray(im) to plot_on_box() input image.' 65 | # tl = 框框的线宽 要么等于line_thickness要么根据原图im长宽信息自适应生成一个 66 | tl = line_thickness or round(0.002 * (im.shape[0] + im.shape[1]) / 2) + 1 # line/font thickness 67 | # c1 = (x1, y1) = 矩形框的左上角 c2 = (x2, y2) = 矩形框的右下角 68 | c1, c2 = (int(float(x[0])), int(float(x[1]))), (int(float(x[2])), int(float(x[3]))) 69 | # c1, c2 = (int(x[0]), int(x[1])), (int(x[2]), int(x[3])) 70 | # cv2.rectangle: 在im上画出框框 c1: start_point(x1, y1) c2: end_point(x2, y2) 71 | # 注意: 这里的c1+c2可以是左上角+右下角 也可以是左下角+右上角都可以 72 | cv2.rectangle(im, c1, c2, color, thickness=tl, lineType=cv2.LINE_AA) 73 | # 如果label不为空还要在框框上面显示标签label + score 74 | if label: 75 | tf = max(tl - 1, 1) # label字体的线宽 font thickness 76 | # cv2.getTextSize: 根据输入的label信息计算文本字符串的宽度和高度 77 | # 0: 文字字体类型 fontScale: 字体缩放系数 thickness: 字体笔画线宽 78 | # 返回retval 字体的宽高 (width, height), baseLine 相对于最底端文本的 y 坐标 79 | t_size = cv2.getTextSize(label, 0, fontScale=tl / 3, thickness=tf)[0] 80 | c2 = c1[0] + t_size[0], c1[1] - t_size[1] - 3 81 | # 同上面一样是个画框的步骤 但是线宽thickness=-1表示整个矩形都填充color颜色 82 | cv2.rectangle(im, c1, c2, color, -1, cv2.LINE_AA) # filled 83 | # cv2.putText: 在图片上写文本 这里是在上面这个矩形框里写label + score文本 84 | # (c1[0], c1[1] - 2)文本左下角坐标 0: 文字样式 fontScale: 字体缩放系数 85 | # [225, 255, 255]: 文字颜色 thickness: tf字体笔画线宽 lineType: 线样式 86 | cv2.putText(im, label, (c1[0], c1[1] - 2), 0, tl / 3, [225, 255, 255], thickness=2, lineType=cv2.LINE_AA) 87 | # 1、没用到 88 | def plot_one_box_PIL(box, im, color=(128, 128, 128), label=None, line_thickness=None): 89 | """ 90 | 使用PIL在原图im上画一个bounding box 91 | :params box: 预测得到的bounding box [x1 y1 x2 y2] 92 | :params im: 原图 要将bounding box画在这个图上 array 93 | :params color: bounding box线的颜色 94 | :params label: 标签上的bounding box框框信息 类别 + score 95 | :params line_thickness: bounding box的线宽 96 | """ 97 | # 将原图array格式->Image格式 98 | im = Image.fromarray(im) 99 | # (初始化)创建一个可以在给定图像(im)上绘图的对象, 在之后调用draw.函数的时候不需要传入im参数,它是直接针对im上进行绘画的 100 | draw = ImageDraw.Draw(im) 101 | # 设置绘制bounding box的线宽 102 | line_thickness = line_thickness or max(int(min(im.size) / 200), 2) 103 | # 在im图像上绘制bounding box 104 | # xy: box [x1 y1 x2 y2] 左上角 + 右下角 width: 线宽 outline: 矩形外框颜色color fill: 将整个矩形填充颜色color 105 | # outline和fill一般根据需求二选一 106 | draw.rectangle(box, width=line_thickness, outline=color) # plot 107 | # 如果label不为空还要在框框上面显示标签label + score 108 | if label: 109 | # 加载一个TrueType或者OpenType字体文件("Arial.ttf"), 并且创建一个字体对象font, font写出的字体大小size=12 110 | font = ImageFont.truetype("Arial.ttf", size=max(round(max(im.size) / 40), 12)) 111 | # 返回给定文本label的宽度txt_width和高度txt_height 112 | txt_width, txt_height = font.getsize(label) 113 | # 在im图像上绘制矩形框 整个框框填充颜色color(用来存放label信息) [x1 y1 x2 y2] 左上角 + 右下角 114 | draw.rectangle([box[0], box[1] - txt_height + 4, box[0] + txt_width, box[1]], fill=color) 115 | # 在上面这个矩形中写入text信息(label) x1y1 左上角 116 | 
draw.text((box[0], box[1] - txt_height + 1), label, fill=(255, 255, 255), font=font) 117 | 118 | # 再返回array类型的im(绘好bounding box和label的) 119 | return np.asarray(im) 120 | 121 | # 2、没用到 122 | def plot_wh_methods(): 123 | """没用到 124 | 比较ya=e^x、yb=(2 * sigmoid(x))^2 以及 yc=(2 * sigmoid(x))^1.6 三个图形 125 | wh损失计算的方式ya、yb、yc三种 ya: yolo method yb/yc: power method 126 | 实验发现使用原来的yolo method损失计算有时候会突然迅速走向无限None值, 而power method方式计算wh损失下降会比较平稳 127 | 最后实验证明yb是最好的wh损失计算方式, yolov5-5.0的wh损失计算代码用的就是yb计算方式 128 | Compares the two methods for width-height anchor multiplication 129 | https://github.com/ultralytics/yolov3/issues/168 130 | """ 131 | x = np.arange(-4.0, 4.0, .1) # (-4.0, 4.0) 每隔0.1取一个值 132 | ya = np.exp(x) # ya = e^x yolo method 133 | yb = torch.sigmoid(torch.from_numpy(x)).numpy() * 2 # yb = 2 * sigmoid(x) 134 | 135 | fig = plt.figure(figsize=(6, 3), tight_layout=True) # 创建自定义图像 初始化画布 136 | plt.plot(x, ya, '.-', label='YOLOv3') # 绘制折线图 可以任意加几条线 137 | plt.plot(x, yb ** 2, '.-', label='YOLOv5 ^2') 138 | plt.plot(x, yb ** 1.6, '.-', label='YOLOv5 ^1.6') 139 | plt.xlim(left=-4, right=4) # 设置x轴、y轴范围 140 | plt.ylim(bottom=0, top=6) 141 | plt.xlabel('input') # 设置x轴、y轴标签 142 | plt.ylabel('rec_result') 143 | plt.grid() # 生成网格 144 | plt.legend() # 加上图例 如果是折线图,需要在plt.plot中加入label参数(图例名) 145 | fig.savefig('comparison.png', dpi=200) # plt绘完图, fig.savefig()保存图片 146 | 147 | def output_to_target(output): 148 | """用在test.py中进行绘制前3个batch的预测框predictions 因为只有predictions需要修改格式 target是不需要修改格式的 149 | 将经过nms后的output [num_obj,x1y1x2y2+conf+cls] -> [num_obj, batch_id+class+x+y+w+h+conf] 转变格式 150 | 以便在plot_images中进行绘图 + 显示label 151 | Convert model rec_result to target format [batch_id, class_id, x, y, w, h, conf] 152 | :params rec_result: list{tensor(8)}分别对应着当前batch的8(batch_size)张图片做完nms后的结果 153 | list中每个tensor[n, 6] n表示当前图片检测到的目标个数 6=x1y1x2y2+conf+cls 154 | :return np.array(targets): [num_targets, batch_id+class+xywh+conf] 其中num_targets为当前batch中所有检测到目标框的个数 155 | """ 156 | targets = [] 157 | for i, o in enumerate(output): # 对每张图片分别做处理 158 | for *box, conf, cls in o.cpu().numpy(): # 对每张图片的每个检测到的目标框进行convert格式 159 | targets.append([i, cls, *list(*xyxy2xywh(np.array(box)[None])), conf]) 160 | return np.array(targets) 161 | def plot_images(images, targets, paths=None, fname='images.jpg', names=None, max_size=640, max_subplots=16): 162 | """用在test.py中进行绘制前3个batch的ground truth和预测框predictions(两个图) 一起保存 163 | 将整个batch的labels都画在这个batch的images上 164 | Plot image grid with labels 165 | :params images: 当前batch的所有图片 Tensor [batch_size, 3, h, w] 且图片都是归一化后的 166 | :params targets: 直接来自target: Tensor[num_target, img_index+class+xywh] [num_target, 6] 167 | 来自output_to_target: Tensor[num_pred, batch_id+class+xywh+conf] [num_pred, 7] 168 | :params paths: tuple 当前batch中所有图片的地址 169 | 如: '..\\datasets\\coco128\\images\\train2017\\000000000315.jpg' 170 | :params fname: 最终保存的文件路径 + 名字 runs\train\exp8\train_batch2.jpg 171 | :params names: 传入的类名 从class index可以相应的key值 但是默认是None 只显示class index不显示类名 172 | :params max_size: 图片的最大尺寸640 如果images有图片的大小(w/h)大于640则需要resize 如果都是小于640则不需要resize 173 | :params max_subplots: 最大子图个数 16 174 | :params mosaic: 一张大图 最多可以显示max_subplots张图片 将总多的图片(包括各自的label框框)一起贴在一起显示 175 | mosaic每张图片的左上方还会显示当前图片的名字 最好以fname为名保存起来 176 | """ 177 | if isinstance(images, torch.Tensor): 178 | images = images.cpu().float().numpy() # tensor -> numpy array 179 | if isinstance(targets, torch.Tensor): 180 | targets = targets.cpu().numpy() 181 | 182 | # 反归一化 将归一化后的图片还原 un-normalise 183 | if np.max(images[0]) <= 1: 184 | images *= 255 185 | 186 | # 
设置一些基础变量 187 | tl = 3 # 设置线宽 line thickness 3 188 | tf = max(tl - 1, 1) # 设置字体笔画线宽 font thickness 2 189 | bs, _, h, w = images.shape # batch size 4, channel 3, height 512, width 512 190 | bs = min(bs, max_subplots) # 子图总数 正方形 limit plot images 4 191 | ns = np.ceil(bs ** 0.5) # ns=每行/每列最大子图个数 子图总数=ns*ns ceil向上取整 2 192 | 193 | # Check if we should resize 194 | # 如果images有图片的大小(w/h)大于640则需要resize 如果都是小于640则不需要resize 195 | scale_factor = max_size / max(h, w) # 1.25 196 | if scale_factor < 1: 197 | # 如果w/h有任何一条边超过640, 就要将较长边缩放到640, 另外一条边相应也缩放 198 | h = math.ceil(scale_factor * h) # 512 199 | w = math.ceil(scale_factor * w) # 512 200 | 201 | # np.full 返回一个指定形状、类型和数值的数组 202 | # shape: (int(ns * h), int(ns * w), 3) (1024, 1024, 3) 填充的值: 255 dtype 填充类型: np.uint8 203 | mosaic = np.full((int(ns * h), int(ns * w), 3), 255, dtype=np.uint8) # init 204 | # 对batch内每张图片 205 | for i, img in enumerate(images): # img (3, 512, 512) 206 | # 如果图片要超过max_subplots我们就不管了 207 | if i == max_subplots: # if last batch has fewer images than we expect 208 | break 209 | 210 | # (block_x, block_y) 相当于是左上角的左边 211 | block_x = int(w * (i // ns)) # // 取整 0 0 512 512 ns=2 212 | block_y = int(h * (i % ns)) # % 取余 0 512 0 512 213 | 214 | img = img.transpose(1, 2, 0) # (512, 512, 3) h w c 215 | if scale_factor < 1: # 如果scale_factor < 1说明h/w超过max_size 需要resize回来 216 | img = cv2.resize(img, (w, h)) 217 | 218 | # 将这个batch的图片一张张的贴到mosaic相应的位置上 hwc 这里最好自己画个图理解下 219 | # 第一张图mosaic[0:512, 0:512, :] 第二张图mosaic[512:1024, 0:512, :] 220 | # 第三张图mosaic[0:512, 512:1024, :] 第四张图mosaic[512:1024, 512:1024, :] 221 | mosaic[block_y:block_y + h, block_x:block_x + w, :] = img 222 | if len(targets) > 0: 223 | # 求出属于这张img的target 224 | image_targets = targets[targets[:, 0] == i] 225 | # 将这张图片的所有target的xywh -> xyxy 226 | boxes = xywh2xyxy(image_targets[:, 2:6]).T 227 | # 得到这张图片所有target的类别classes 228 | classes = image_targets[:, 1].astype('int') 229 | # 如果image_targets.shape[1] == 6则说明没有置信度信息(此时target实际上是真实框) 230 | # 如果长度为7则第7个信息就是置信度信息(此时target为预测框信息) 231 | labels = image_targets.shape[1] == 6 # labels if no conf column 232 | # 得到当前这张图的所有target的置信度信息(pred) 如果没有就为空(真实label) 233 | # check for confidence presence (label vs pred) 234 | conf = None if labels else image_targets[:, 6] 235 | 236 | if boxes.shape[1]: # boxes.shape[1]不为空说明这张图有target目标 237 | if boxes.max() <= 1.01: # if normalized with tolerance 0.01 238 | # 因为图片是反归一化的 所以这里boxes也反归一化 239 | boxes[[0, 2]] *= w # scale to pixels 240 | boxes[[1, 3]] *= h 241 | elif scale_factor < 1: 242 | # 如果scale_factor < 1 说明resize过, 那么boxes也要相应变化 243 | # absolute coords need scale if image scales 244 | boxes *= scale_factor 245 | # 上面得到的boxes信息是相对img这张图片的标签信息 因为我们最终是要将img贴到mosaic上 所以还要变换label->mosaic 246 | boxes[[0, 2]] += block_x 247 | boxes[[1, 3]] += block_y 248 | 249 | # 将当前的图片img的所有标签boxes画到mosaic上 250 | for j, box in enumerate(boxes.T): 251 | # 遍历每个box 252 | cls = int(classes[j]) # 得到这个box的class index 253 | color = colors(cls) # 得到这个box框线的颜色 254 | cls = names[cls] if names else cls # 如果传入类名就显示类名 如果没传入类名就显示class index 255 | 256 | # 如果labels不为空说明是在显示真实target 不需要conf置信度 直接显示label即可 257 | # 如果conf[j] > 0.25 首先说明是在显示pred 且这个box的conf必须大于0.25 相当于又是一轮nms筛选 显示label + conf 258 | if labels or conf[j] > 0.25: # 0.25 conf thresh 259 | label = '%s' % cls if labels else '%s %.1f' % (cls, conf[j]) # 框框上面的显示信息 260 | plot_one_box(box, mosaic, label=label, color=color, line_thickness=tl) # 一个个的画框 261 | 262 | # 在mosaic每张图片相对位置的左上角写上每张图片的文件名 如000000000315.jpg 263 | if paths: 264 | # paths[i]: 
'..\\datasets\\coco128\\images\\train2017\\000000000315.jpg' Path: str -> Wins地址 265 | # .name: str'000000000315.jpg' [:40]取前40个字符 最终还是等于str'000000000315.jpg' 266 | label = Path(paths[i]).name[:40] # trim to 40 char 267 | # 返回文本 label 的宽高 (width, height) 268 | t_size = cv2.getTextSize(label, 0, fontScale=tl / 3, thickness=tf)[0] 269 | # 在mosaic上写文本信息 270 | # 要绘制的图像 + 要写上前的文本信息 + 文本左下角坐标 + 要使用的字体 + 字体缩放系数 + 字体的颜色 + 字体的线宽 + 矩形边框的线型 271 | cv2.putText(mosaic, label, (block_x + 5, block_y + t_size[1] + 5), 0, 272 | tl / 3, [220, 220, 220], thickness=tf, lineType=cv2.LINE_AA) 273 | 274 | # mosaic内每张图片与图片之间弄一个边界框隔开 好看点 其实做法特简单 就是将每个img在mosaic中画个框即可 275 | cv2.rectangle(mosaic, (block_x, block_y), (block_x + w, block_y + h), (255, 255, 255), thickness=3) 276 | 277 | # 最后一步 check是否需要将mosaic图片保存起来 278 | if fname: # 文件名不为空的话 fname = runs\train\exp8\train_batch2.jpg 279 | # 限制mosaic图片尺寸 280 | r = min(1280. / max(h, w) / ns, 1.0) # ratio to limit image size 281 | mosaic = cv2.resize(mosaic, (int(ns * w * r), int(ns * h * r)), interpolation=cv2.INTER_AREA) 282 | # cv2.imwrite(fname, cv2.cvtColor(mosaic, cv2.COLOR_BGR2RGB)) # cv2 save 最好BGR -> RGB再保存 283 | Image.fromarray(mosaic).save(fname) # PIL save 必须要numpy array -> tensor格式 才能保存 284 | return mosaic 285 | 286 | def plot_lr_scheduler(optimizer, scheduler, epochs=300, save_dir=''): 287 | """用在train.py中学习率设置后可视化一下 288 | Plot LR simulating training for full epochs 289 | :params optimizer: 优化器 290 | :params scheduler: 策略调整器 291 | :params epochs: x 292 | :params save_dir: lr图片 保存地址 293 | """ 294 | optimizer, scheduler = copy(optimizer), copy(scheduler) # do not modify originals 295 | y = [] # 存放每个epoch的学习率 296 | 297 | # 从optimizer中取学习率 一个epoch取一个 共取epochs个 每取一次需要使用scheduler.step更新下一个epoch的学习率 298 | for _ in range(epochs): 299 | scheduler.step() # 更新下一个epoch的学习率 300 | # ptimizer.param_groups[0]['lr']: 取下一个epoch的学习率lr 301 | y.append(optimizer.param_groups[0]['lr']) 302 | plt.plot(y, '.-', label='LR') # 没有传入x 默认会传入 0..epochs-1 303 | plt.xlabel('epoch') 304 | plt.ylabel('LR') 305 | plt.grid() 306 | plt.xlim(0, epochs) 307 | plt.ylim(0) 308 | plt.savefig(Path(save_dir) / 'LR.png', dpi=200) # 保存 309 | plt.close() 310 | 311 | def hist2d(x, y, n=100): 312 | """用在plot_evolution 313 | 使用numpy画出2d直方图 314 | 2d histogram used in labels.png and evolve.png 315 | """ 316 | # xedges: 返回在start=x.min()和stop=x.max()之间返回均匀间隔的n个数据 317 | xedges, yedges = np.linspace(x.min(), x.max(), n), np.linspace(y.min(), y.max(), n) 318 | # np.histogram2d: 2d直方图 x: x轴坐标 y: y轴坐标 (xedges, yedges): bins x, y轴的长条形数目 319 | # 返回hist: 直方图对象 xedges: x轴对象 yedges: y轴对象 320 | hist, xedges, yedges = np.histogram2d(x, y, (xedges, yedges)) 321 | # np.clip: 截取函数 令目标内所有数据都属于一个范围 [0, hist.shape[0] - 1] 小于0的等于0 大于同理 322 | # np.digitize 用于分区 323 | xidx = np.clip(np.digitize(x, xedges) - 1, 0, hist.shape[0] - 1) # x轴坐标 324 | yidx = np.clip(np.digitize(y, yedges) - 1, 0, hist.shape[1] - 1) # y轴坐标 325 | return np.log(hist[xidx, yidx]) 326 | 327 | def plot_test_txt(test_dir='test.txt'): 328 | """可以自己写个脚本执行test.txt文件 329 | 利用test.txt xyxy画出其直方图和双直方图 330 | Plot test.txt histograms 331 | :params test_dir: test.py中生成的一些 save_dir/labels中的txt文件 332 | """ 333 | # x [:, xyxy] 334 | x = np.loadtxt(test_dir, dtype=np.float32) 335 | box = xyxy2xywh(x[:, 2:6]) # xyxy to xywh 这里我改了下 原来是0:4 但我发现txt中存放的是 cls+conf+xyxy 336 | cx, cy = box[:, 0], box[:, 1] # x y 337 | 338 | # 将figure分成1行1列 figure size=(6, 6) tight_layout=true 会自动调整子图参数, 使之填充整个图像区域 339 | # 返回figure(绘图对象)和axes(坐标对象) 340 | fig, ax = plt.subplots(1, 1, figsize=(6, 6), 
tight_layout=True) 341 | # hist2d: 双直方图 cx: x坐标 cy: y坐标 bins: 横竖分为几条 cmax、cmin: 所有的bins的值少于cmin和大于cmax的不显示 342 | ax.hist2d(cx, cy, bins=600, cmax=10, cmin=0) 343 | ax.set_aspect('equal') # 设置两个轴的长度始终相同 figure为正方形 344 | plt.savefig('hist2d.png', dpi=300) 345 | 346 | fig, ax = plt.subplots(1, 2, figsize=(12, 6), tight_layout=True) 347 | # hist 正常直方图 cx: 绘图数据 bins: 直方图的长条形数目 normed: 是否将得到的直方图向量归一化 348 | # facecolor: 长条形的颜色 edgecolor:长条形边框的颜色 alpha:透明度 349 | ax[0].hist(cx, bins=600) 350 | ax[1].hist(cy, bins=600) 351 | plt.savefig('hist1d.png', dpi=200) 352 | 353 | # 3、没用到 354 | def plot_targets_txt(): 355 | """没用到 和plot_labels作用重复 356 | 利用targets.txt xywh画出其直方图 357 | Plot targets.txt histograms 358 | """ 359 | # x [:, xywh] 360 | x = np.loadtxt('targets.txt', dtype=np.float32).T 361 | s = ['x targets', 'y targets', 'width targets', 'height targets'] 362 | fig, ax = plt.subplots(2, 2, figsize=(8, 8), tight_layout=True) 363 | ax = ax.ravel() # 将多维数组降位一维 364 | for i in range(4): 365 | ax[i].hist(x[i], bins=100, label='%.3g +/- %.3g' % (x[i].mean(), x[i].std())) 366 | ax[i].legend() # 显示上行label图例 367 | ax[i].set_title(s[i]) 368 | plt.savefig('targets.jpg', dpi=200) 369 | 370 | def plot_labels(labels, names=(), save_dir=Path(''), loggers=None): 371 | """通常用在train.py中 加载数据datasets和labels后 对labels进行可视化 分析labels信息 372 | plot dataset labels 生成labels_correlogram.jpg和labels.jpg 画出数据集的labels相关直方图信息 373 | :params labels: 数据集的全部真实框标签 (num_targets, class+xywh) (929, 5) 374 | :params names: 数据集的所有类别名 375 | :params save_dir: runs\train\exp21 376 | :params loggers: 日志对象 377 | """ 378 | print('Plotting labels... ') 379 | # c: classes (929) b: boxes xywh (4, 929) .transpose() 将(4, 929) -> (929, 4) 380 | c, b = labels[:, 0], labels[:, 1:].transpose() 381 | nc = int(c.max() + 1) # 类别总数 number of classes 80 382 | # pd.DataFrame: 创建DataFrame, 类似于一种excel, 表头是['x', 'y', 'width', 'height'] 表格数据: b中数据按行依次存储 383 | x = pd.DataFrame(b.transpose(), columns=['x', 'y', 'width', 'height']) 384 | 385 | # 1、画出labels的 xywh 各自联合分布直方图 labels_correlogram.jpg 386 | # seaborn correlogram seaborn.pairplot 多变量联合分布图: 查看两个或两个以上变量之间两两相互关系的可视化形式 387 | # data: 联合分布数据x diag_kind:表示联合分布图中对角线图的类型 kind:表示联合分布图中非对角线图的类型 388 | # corner: True 表示只显示左下侧 因为左下和右上是重复的 plot_kws,diag_kws: 可以接受字典的参数,对图形进行微调 389 | sn.pairplot(x, corner=True, diag_kind='auto', kind='hist', diag_kws=dict(bins=50), plot_kws=dict(pmax=0.9)) 390 | plt.savefig(save_dir / 'labels_correlogram.jpg', dpi=200) # 保存labels_correlogram.jpg 391 | plt.close() 392 | 393 | # 2、画出classes的各个类的分布直方图ax[0], 画出所有的真实框ax[1], 画出xy直方图ax[2], 画出wh直方图ax[3] labels.jpg 394 | matplotlib.use('svg') # faster 395 | # 将整个figure分成2*2四个区域 396 | ax = plt.subplots(2, 2, figsize=(8, 8), tight_layout=True)[1].ravel() 397 | # 第一个区域ax[1]画出classes的分布直方图 398 | y = ax[0].hist(c, bins=np.linspace(0, nc, nc + 1) - 0.5, rwidth=0.8) 399 | # [y[2].patches[i].set_color([x / 255 for x in colors(i)]) for i in range(nc)] # update colors bug #3195 400 | ax[0].set_ylabel('instances') # 设置y轴label 401 | if 0 < len(names) < 30: # 小于30个类别就把所有的类别名作为横坐标 402 | ax[0].set_xticks(range(len(names))) # 设置刻度 403 | ax[0].set_xticklabels(names, rotation=90, fontsize=10) # 旋转90度 设置每个刻度标签 404 | else: 405 | ax[0].set_xlabel('classes') # 如果类别数大于30个, 可能就放不下去了, 所以只显示x轴label 406 | # 第三个区域ax[2]画出xy直方图 第四个区域ax[3]画出wh直方图 407 | sn.histplot(x, x='x', y='y', ax=ax[2], bins=50, pmax=0.9) 408 | sn.histplot(x, x='width', y='height', ax=ax[3], bins=50, pmax=0.9) 409 | 410 | # 第二个区域ax[1]画出所有的真实框 411 | labels[:, 1:3] = 0.5 # center xy 412 | labels[:, 1:] = 
xywh2xyxy(labels[:, 1:]) * 2000 # xyxy 413 | img = Image.fromarray(np.ones((2000, 2000, 3), dtype=np.uint8) * 255) # 初始化一个窗口 414 | for cls, *box in labels[:1000]: # 把所有的框画在img窗口中 415 | ImageDraw.Draw(img).rectangle(box, width=1, outline=colors(cls)) # plot 416 | ax[1].imshow(img) 417 | ax[1].axis('off') # 不要xy轴 418 | 419 | # 去掉上下左右坐标系(去掉上下左右边框) 420 | for a in [0, 1, 2, 3]: 421 | for s in ['top', 'right', 'left', 'bottom']: 422 | ax[a].spines[s].set_visible(False) 423 | 424 | plt.savefig(save_dir / 'labels.jpg', dpi=200) 425 | matplotlib.use('Agg') 426 | plt.close() 427 | 428 | # 打印日志 loggers 429 | for k, v in loggers.items() or {}: 430 | if k == 'wandb' and v: 431 | v.log({"Labels": [v.Image(str(x), caption=x.name) for x in save_dir.glob('*labels*.jpg')]}, commit=False) 432 | 433 | def plot_evolution(yaml_file='data/hyp.finetune.yaml', save_dir=Path('')): 434 | """用在train.py的超参进化算法后,输出参超进化的结果 435 | 超参进化在每一轮都会产生一系列的进化后的超参(存在yaml_file) 以及每一轮都会算出当前轮次的7个指标(evolve.txt) 436 | 这个函数要做的就是把每个超参在所有轮次变化的值和maps以散点图的形式显示出来,并标出最大的map对应的超参值 一个超参一个散点图 437 | :params yaml_file: 'runs/train/evolve/hyp_evolved.yaml' 438 | """ 439 | with open(yaml_file) as f: 440 | hyp = yaml.safe_load(f) # 导入超参文件 441 | # evolve.txt中每一行为一次进化的结果 442 | # 每行前七个数字(P, R, mAP, F1, test_losses(GIOU, obj, cls)) 之后为hyp 443 | x = np.loadtxt('evolve.txt', ndmin=2) 444 | f = fitness(x) # 得到所有进化轮次后得到的加权形式的map 445 | # weights = (f - f.min()) ** 2 # for weighted results 446 | plt.figure(figsize=(10, 12), tight_layout=True) 447 | matplotlib.rc('font', **{'size': 8}) # 设置matplotlib参数 font_size: 8 448 | for i, (k, v) in enumerate(hyp.items()): 449 | y = x[:, i + 7] # y=当前超参在每一轮进化后的值 450 | # mu = (y * weights).sum() / weights.sum() # best weighted result 451 | mu = y[f.argmax()] # 得到加权map最大的epoch时的超参(认为这个超参为所有轮次的最佳超参) 452 | plt.subplot(6, 5, i + 1) # 假设有30个参数 6行5列 一个部分画一个图 453 | # 画出每个超参变化的散点图 x: x坐标为当前超参每一轮进化后的值y y: y坐标为所有进化轮次后得到的加权形式的map 454 | # c: 色彩或颜色 cmap: Colormap实例 alpha: edgecolors: 边框颜色 455 | plt.scatter(y, f, c=hist2d(y, f, 20), cmap='viridis', alpha=.8, edgecolors='none') 456 | # 在当前小图上再画出最佳map时对应的超参 大大的 '+' 做记号 457 | plt.plot(mu, f.max(), 'k+', markersize=15) 458 | plt.title('%s = %.3g' % (k, mu), fontdict={'size': 9}) # limit to 40 characters 459 | if i % 5 != 0: # 一行只能画5个小图 460 | plt.yticks([]) 461 | print('%15s: %.3g' % (k, mu)) # 输出最佳超参 462 | plt.savefig(save_dir / 'evolve.png', dpi=200) # 保存evolve.png 463 | print('\nPlot saved as evolve.png') 464 | 465 | def plot_results(start=0, stop=0, bucket='', id=(), save_dir=''): 466 | """'用在训练结束, 对训练结果进行可视化 467 | 画出训练完的 results.txt Plot training 'results*.txt' 最终生成results.png 468 | :params start: 读取数据的开始epoch 因为result.txt的数据是一个epoch一行的 469 | :params stop: 读取数据的结束epoch 470 | :params bucket: 是否需要从googleapis中下载results*.txt文件 471 | :params id: 需要从googleapis中下载的results + id.txt 默认为空 472 | :params save_dir: 'runs\train\exp22' 473 | """ 474 | # 建造一个figure 分割成2行5列, 由10个小subplots组成 475 | fig, ax = plt.subplots(2, 5, figsize=(12, 6), tight_layout=True) 476 | ax = ax.ravel() # 将多维数组降为一维 477 | s = ['Box', 'Objectness', 'Classification', 'Precision', 'Recall', 478 | 'val Box', 'val Objectness', 'val Classification', 'mAP@0.5', 'mAP@0.5:0.95'] # titles 479 | 480 | if bucket: 481 | # files = ['https://storage.googleapis.com/%s/results%g.txt' % (bucket, x) for x in id] 482 | files = ['results%g.txt' % x for x in id] 483 | c = ('gsutil cp ' + '%s ' * len(files) + '.') % tuple('gs://%s/results%g.txt' % (bucket, x) for x in id) # cmd指令 484 | os.system(c) # 使用cmd指令从googleapis中下载results*.txt 485 | else: 486 | 
# 不从网盘上下载就直接从文件目录中模糊查找 如files=[WindowsPath('runs/train/exp22/results.txt')] 487 | files = list(Path(save_dir).glob('results*.txt')) # 搜索save_dir目录下类似'results*.txt'文件名的文件 488 | assert len(files), 'No results.txt files found in %s, nothing to plot.' % os.path.abspath(save_dir) 489 | 490 | # 读取files文件数据进行可视化 491 | for fi, f in enumerate(files): 492 | try: 493 | # files 原始一行: epoch/epochs - 1, memory, Box, Objectness, Classification, sum_loss, targets.shape[0], img_shape, Precision, Recall, map@0.5, map@0.5:0.95, Val Box, Val Objectness, Val Classification 494 | # 只使用[2, 3, 4, 8, 9, 12, 13, 14, 10, 11]列 (10, 1) 分布对应 => 495 | # [Box, Objectness, Classification, Precision, Recall, Val Box, Val Objectness, Val Classification, map@0.5, map@0.5:0.95] 496 | results = np.loadtxt(f, usecols=[2, 3, 4, 8, 9, 12, 13, 14, 10, 11], ndmin=2).T # (10, 1) 497 | n = results.shape[1] # number of rows 1 498 | # 根据start(epoch)和stop(epoch)读取相应的轮次的数据 499 | x = range(start, min(stop, n) if stop else n) 500 | for i in range(10): # 分别可视化这10个指标 501 | y = results[i, x] 502 | if i in [0, 1, 2, 5, 6, 7]: 503 | y[y == 0] = np.nan # loss值不能为0 要显示为np.nan 504 | # y /= y[0] # normalize 505 | # label = labels[fi] if len(labels) else f.stem 506 | ax[i].plot(x, y, marker='.', linewidth=2, markersize=8) # 画子图 507 | # ax[i].plot(x, y, marker='.', label=label, linewidth=2, markersize=8) 508 | ax[i].set_title(s[i]) # 设置子图标题 509 | # if i in [5, 6, 7]: # share train and val loss y axes 510 | # ax[i].get_shared_y_axes().join(ax[i], ax[i - 5]) 511 | except Exception as e: 512 | print('Warning: Plotting error for %s; %s' % (f, e)) 513 | 514 | # ax[1].legend() 515 | fig.savefig(Path(save_dir) / 'results1.png', dpi=200) # 保存results.png 516 | def plot_results_overlay(start=0, stop=0): 517 | """可以用在train.py或者自写一个文件 518 | 画出训练完的 results.txt Plot training 'results*.txt' 而且将原先的10个折线图缩减为5个折线图, train和val相对比 519 | Plot training 'results*.txt', overlaying train and val losses 520 | """ 521 | s = ['train', 'train', 'train', 'Precision', 'mAP@0.5', 'val', 'val', 'val', 'Recall', 'mAP@0.5:0.95'] # legends 522 | t = ['Box', 'Objectness', 'Classification', 'P-R', 'mAP-F1'] # titles 523 | 524 | # 遍历每个模糊查询匹配到的results*.txt 525 | for f in sorted(glob.glob('results*.txt') + glob.glob('../../Downloads/results*.txt')): 526 | # files 原始一行: epoch/epochs - 1, memory, Box, Objectness, Classification, sum_loss, targets.shape[0], img_shape, Precision, Recall, map@0.5, map@0.5:0.95, Val Box, Val Objectness, Val Classification 527 | # 只使用[2, 3, 4, 8, 9, 12, 13, 14, 10, 11]列 (10, 1) 分布对应 => 528 | # [Box, Objectness, Classification, Precision, Recall, Val Box, Val Objectness, Val Classification, map@0.5, map@0.5:0.95] 529 | results = np.loadtxt(f, usecols=[2, 3, 4, 8, 9, 12, 13, 14, 10, 11], ndmin=2).T # (10, 1) 530 | n = results.shape[1] # number of rows 1 531 | # 根据start(epoch)和stop(epoch)读取相应的轮次的数据 532 | x = range(start, min(stop, n) if stop else n) 533 | # 建造一个figure 分割成1行5列, 由5个小subplots组成 [Box, Objectness, Classification, P-R, mAP-F1] 534 | fig, ax = plt.subplots(1, 5, figsize=(14, 3.5), tight_layout=True) 535 | ax = ax.ravel() # 将多维数组降为一维 536 | 537 | # 分别可视化这5个指标 [Box, Objectness, Classification, P-R, mAP-F1] 538 | for i in range(5): 539 | for j in [i, i + 5]: # 每个指标都要读取train(i) + val(i+5)两个值 540 | y = results[j, x] 541 | ax[i].plot(x, y, marker='.', label=s[j]) 542 | # y_smooth = butter_lowpass_filtfilt(y) # y抖动太大就取一个平滑版本 543 | # ax[i].plot(x, np.gradient(y_smooth), marker='.', label=s[j]) 544 | 545 | ax[i].set_title(t[i]) # 设置子图标题 546 | ax[i].legend() # 
设置子图图例legend 547 | ax[i].set_ylabel(f) if i == 0 else None # add filename 548 | fig.savefig(f.replace('.txt', '.png'), dpi=200) # 保存result.png 549 | def butter_lowpass_filtfilt(data, cutoff=1500, fs=50000, order=5): 550 | """ 551 | 当data值抖动太大, 就取data的平滑曲线 552 | """ 553 | from scipy.signal import butter, filtfilt 554 | 555 | # https://stackoverflow.com/questions/28536191/how-to-filter-smooth-with-scipy-numpy 556 | def butter_lowpass(cutoff, fs, order): 557 | nyq = 0.5 * fs 558 | normal_cutoff = cutoff / nyq 559 | return butter(order, normal_cutoff, btype='low', analog=False) 560 | 561 | b, a = butter_lowpass(cutoff, fs, order=order) 562 | return filtfilt(b, a, data) # forward-backward filter 563 | 564 | def feature_visualization(x, module_type, stage, n=64): 565 | """用在yolo.py的Model类中的forward_once函数中 自行选择任意层进行可视化该层feature map 566 | 可视化feature map(模型任意层都可以用) 567 | :params x: Features map [bs, channels, height, width] 568 | :params module_type: Module type 569 | :params stage: Module stage within model 570 | :params n: Maximum number of feature maps to plot 571 | """ 572 | batch, channels, height, width = x.shape # batch, channels, height, width 573 | if height > 1 and width > 1: 574 | project, name = 'runs/features', 'exp' 575 | save_dir = increment_path(Path(project) / name) # increment run 576 | save_dir.mkdir(parents=True, exist_ok=True) # make save dir 577 | 578 | plt.figure(tight_layout=True) 579 | # torch.chunk: 与torch.cat()原理相反 将tensor x按dim(行或列)分割成channels个tensor块, 返回的是一个元组 580 | # 将第2个维度(channels)将x分成channels份 每张图有三个block batch张图 blocks=len(blocks)=3*batch 581 | blocks = torch.chunk(x, channels, dim=1) 582 | n = min(n, len(blocks)) # 总共可视化的feature map数量 583 | for i in range(n): 584 | feature = transforms.ToPILImage()(blocks[i].squeeze()) # tensor -> PIL Image 585 | ax = plt.subplot(int(math.sqrt(n)), int(math.sqrt(n)), i + 1) # 根号n行根号n列 当前属于第i+1张子图 586 | ax.axis('off') 587 | plt.imshow(feature) # cmap='gray' 可视化当前feature map 588 | 589 | f = f"stage_{stage}_{module_type.split('.')[-1]}_features.png" 590 | print(f'Saving {save_dir / f}...') 591 | plt.savefig(save_dir / f, dpi=300) 592 | 593 | def plot_study_txt(path='', x=None): 594 | """没用到 595 | Plot study.txt generated by val.py 596 | """ 597 | plot2 = False # plot additional results 598 | if plot2: 599 | ax = plt.subplots(2, 4, figsize=(10, 6), tight_layout=True)[1].ravel() 600 | 601 | fig2, ax2 = plt.subplots(1, 1, figsize=(8, 4), tight_layout=True) 602 | # for f in [Path(path) / f'study_coco_{x}.txt' for x in ['yolov5s6', 'yolov5m6', 'yolov5l6', 'yolov5x6']]: 603 | for f in sorted(Path(path).glob('study*.txt')): 604 | y = np.loadtxt(f, dtype=np.float32, usecols=[0, 1, 2, 3, 7, 8, 9], ndmin=2).T 605 | x = np.arange(y.shape[1]) if x is None else np.array(x) 606 | if plot2: 607 | s = ['P', 'R', 'mAP@.5', 'mAP@.5:.95', 't_preprocess (ms/img)', 't_inference (ms/img)', 't_NMS (ms/img)'] 608 | for i in range(7): 609 | ax[i].plot(x, y[i], '.-', linewidth=2, markersize=8) 610 | ax[i].set_title(s[i]) 611 | 612 | j = y[3].argmax() + 1 613 | ax2.plot(y[5, 1:j], y[3, 1:j] * 1E2, '.-', linewidth=2, markersize=8, 614 | label=f.stem.replace('study_coco_', '').replace('yolo', 'YOLO')) 615 | 616 | ax2.plot(1E3 / np.array([209, 140, 97, 58, 35, 18]), [34.6, 40.5, 43.0, 47.5, 49.7, 51.5], 617 | 'k.-', linewidth=2, markersize=8, alpha=.25, label='EfficientDet') 618 | 619 | ax2.grid(alpha=0.2) 620 | ax2.set_yticks(np.arange(20, 60, 5)) 621 | ax2.set_xlim(0, 57) 622 | ax2.set_ylim(30, 55) 623 | ax2.set_xlabel('GPU Speed (ms/img)') 624 | 
ax2.set_ylabel('COCO AP val') 625 | ax2.legend(loc='lower right') 626 | plt.savefig(str(Path(path).name) + '.png', dpi=300) 627 | 628 | def profile_idetection(start=0, stop=0, labels=(), save_dir=''): 629 | """没用到 630 | Plot iDetection '*.txt' per-image logs 631 | """ 632 | ax = plt.subplots(2, 4, figsize=(12, 6), tight_layout=True)[1].ravel() 633 | s = ['Images', 'Free Storage (GB)', 'RAM Usage (GB)', 'Battery', 'dt_raw (ms)', 'dt_smooth (ms)', 'real-world FPS'] 634 | files = list(Path(save_dir).glob('frames*.txt')) 635 | for fi, f in enumerate(files): 636 | try: 637 | results = np.loadtxt(f, ndmin=2).T[:, 90:-30] # clip first and last rows 638 | n = results.shape[1] # number of rows 639 | x = np.arange(start, min(stop, n) if stop else n) 640 | results = results[:, x] 641 | t = (results[0] - results[0].min()) # set t0=0s 642 | results[0] = x 643 | for i, a in enumerate(ax): 644 | if i < len(results): 645 | label = labels[fi] if len(labels) else f.stem.replace('frames_', '') 646 | a.plot(t, results[i], marker='.', label=label, linewidth=1, markersize=5) 647 | a.set_title(s[i]) 648 | a.set_xlabel('time (s)') 649 | # if fi == len(files) - 1: 650 | # a.set_ylim(bottom=0) 651 | for side in ['top', 'right']: 652 | a.spines[side].set_visible(False) 653 | else: 654 | a.remove() 655 | except Exception as e: 656 | print('Warning: Plotting error for %s; %s' % (f, e)) 657 | 658 | ax[1].legend() 659 | plt.savefig(Path(save_dir) / 'idetection_profile.png', dpi=200) 660 | -------------------------------------------------------------------------------- /utils/torch_utils.py: -------------------------------------------------------------------------------- 1 | import math 2 | import os 3 | import time 4 | from copy import deepcopy 5 | 6 | import torch 7 | import torch.backends.cudnn as cudnn 8 | import torch.nn as nn 9 | import torch.nn.functional as F 10 | import torchvision.models as models 11 | 12 | 13 | def init_seeds(seed=0): 14 | torch.manual_seed(seed) 15 | 16 | # Speed-reproducibility tradeoff https://pytorch.org/docs/stable/notes/randomness.html 17 | if seed == 0: # slower, more reproducible 18 | cudnn.deterministic = True 19 | cudnn.benchmark = False 20 | else: # faster, less reproducible 21 | cudnn.deterministic = False 22 | cudnn.benchmark = True 23 | 24 | 25 | def select_device(device='', apex=False, batch_size=None): 26 | # device = 'cpu' or '0' or '0,1,2,3' 27 | cpu_request = device.lower() == 'cpu' 28 | if device and not cpu_request: # if device requested other than 'cpu' 29 | os.environ['CUDA_VISIBLE_DEVICES'] = device # set environment variable 30 | assert torch.cuda.is_available(), 'CUDA unavailable, invalid device %s requested' % device # check availablity 31 | 32 | cuda = False if cpu_request else torch.cuda.is_available() 33 | if cuda: 34 | c = 1024 ** 2 # bytes to MB 35 | ng = torch.cuda.device_count() 36 | if ng > 1 and batch_size: # check that batch_size is compatible with device_count 37 | assert batch_size % ng == 0, 'batch-size %g not multiple of GPU count %g' % (batch_size, ng) 38 | x = [torch.cuda.get_device_properties(i) for i in range(ng)] 39 | s = 'Using CUDA ' + ('Apex ' if apex else '') # apex for mixed precision https://github.com/NVIDIA/apex 40 | for i in range(0, ng): 41 | if i == 1: 42 | s = ' ' * len(s) 43 | print("%sdevice%g _CudaDeviceProperties(name='%s', total_memory=%dMB)" % 44 | (s, i, x[i].name, x[i].total_memory / c)) 45 | else: 46 | print('Using CPU') 47 | 48 | print('') # skip a line 49 | return torch.device('cuda:0' if cuda else 'cpu') 50 | 51 | 52 | def 
time_synchronized(): 53 | torch.cuda.synchronize() if torch.cuda.is_available() else None 54 | return time.time() 55 | 56 | 57 | def is_parallel(model): 58 | # is model is parallel with DP or DDP 59 | return type(model) in (nn.parallel.DataParallel, nn.parallel.DistributedDataParallel) 60 | 61 | 62 | def initialize_weights(model): 63 | for m in model.modules(): 64 | t = type(m) 65 | if t is nn.Conv2d: 66 | pass # nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu') 67 | elif t is nn.BatchNorm2d: 68 | m.eps = 1e-4 69 | m.momentum = 0.03 70 | elif t in [nn.LeakyReLU, nn.ReLU, nn.ReLU6]: 71 | m.inplace = True 72 | 73 | 74 | def find_modules(model, mclass=nn.Conv2d): 75 | # finds layer indices matching module class 'mclass' 76 | return [i for i, m in enumerate(model.module_list) if isinstance(m, mclass)] 77 | 78 | 79 | def sparsity(model): 80 | # Return global model sparsity 81 | a, b = 0., 0. 82 | for p in model.parameters(): 83 | a += p.numel() 84 | b += (p == 0).sum() 85 | return b / a 86 | 87 | 88 | def prune(model, amount=0.3): 89 | # Prune model to requested global sparsity 90 | import torch.nn.utils.prune as prune 91 | print('Pruning model... ', end='') 92 | for name, m in model.named_modules(): 93 | if isinstance(m, nn.Conv2d): 94 | prune.l1_unstructured(m, name='weight', amount=amount) # prune 95 | prune.remove(m, 'weight') # make permanent 96 | print(' %.3g global sparsity' % sparsity(model)) 97 | 98 | 99 | def fuse_conv_and_bn(conv, bn): 100 | # https://tehnokv.com/posts/fusing-batchnorm-and-conv/ 101 | with torch.no_grad(): 102 | # init 103 | fusedconv = nn.Conv2d(conv.in_channels, 104 | conv.out_channels, 105 | kernel_size=conv.kernel_size, 106 | stride=conv.stride, 107 | padding=conv.padding, 108 | bias=True).to(conv.weight.device) 109 | 110 | # prepare filters 111 | w_conv = conv.weight.clone().view(conv.out_channels, -1) 112 | w_bn = torch.diag(bn.weight.div(torch.sqrt(bn.eps + bn.running_var))) 113 | fusedconv.weight.copy_(torch.mm(w_bn, w_conv).view(fusedconv.weight.size())) 114 | 115 | # prepare spatial bias 116 | b_conv = torch.zeros(conv.weight.size(0), device=conv.weight.device) if conv.bias is None else conv.bias 117 | b_bn = bn.bias - bn.weight.mul(bn.running_mean).div(torch.sqrt(bn.running_var + bn.eps)) 118 | fusedconv.bias.copy_(torch.mm(w_bn, b_conv.reshape(-1, 1)).reshape(-1) + b_bn) 119 | 120 | return fusedconv 121 | 122 | 123 | def model_info(model, verbose=False): 124 | # Plots a line-by-line description of a PyTorch model 125 | n_p = sum(x.numel() for x in model.parameters()) # number parameters 126 | n_g = sum(x.numel() for x in model.parameters() if x.requires_grad) # number gradients 127 | if verbose: 128 | print('%5s %40s %9s %12s %20s %10s %10s' % ('layer', 'name', 'gradient', 'parameters', 'shape', 'mu', 'sigma')) 129 | for i, (name, p) in enumerate(model.named_parameters()): 130 | name = name.replace('module_list.', '') 131 | print('%5g %40s %9s %12g %20s %10.3g %10.3g' % 132 | (i, name, p.requires_grad, p.numel(), list(p.shape), p.mean(), p.std())) 133 | 134 | try: # FLOPS 135 | from thop import profile 136 | flops = profile(deepcopy(model), inputs=(torch.zeros(1, 3, 64, 64),), verbose=False)[0] / 1E9 * 2 137 | fs = ', %.1f GFLOPS' % (flops * 100) # 640x640 FLOPS 138 | except: 139 | fs = '' 140 | 141 | print('Model Summary: %g layers, %g parameters, %g gradients%s' % (len(list(model.parameters())), n_p, n_g, fs)) 142 | 143 | 144 | def load_classifier(name='resnet101', n=2): 145 | # Loads a pretrained model reshaped to n-class 
outputs 146 | model = models.__dict__[name](pretrained=True) 147 | 148 | # Display model properties 149 | input_size = [3, 224, 224] 150 | input_space = 'RGB' 151 | input_range = [0, 1] 152 | mean = [0.485, 0.456, 0.406] 153 | std = [0.229, 0.224, 0.225] 154 | for x in ['input_size', 'input_space', 'input_range', 'mean', 'std']: # iterate over attribute names so eval(x) prints 'name = value' 155 | print(x + ' =', eval(x)) 156 | 157 | # Reshape output layer to n classes 158 | filters = model.fc.weight.shape[1] 159 | model.fc.bias = nn.Parameter(torch.zeros(n), requires_grad=True) 160 | model.fc.weight = nn.Parameter(torch.zeros(n, filters), requires_grad=True) 161 | model.fc.out_features = n 162 | return model 163 | 164 | 165 | def scale_img(img, ratio=1.0, same_shape=False): # img(16,3,256,416), r=ratio 166 | # scales img(bs,3,y,x) by ratio 167 | h, w = img.shape[2:] 168 | s = (int(h * ratio), int(w * ratio)) # new size 169 | img = F.interpolate(img, size=s, mode='bilinear', align_corners=False) # resize 170 | if not same_shape: # pad/crop img 171 | gs = 32 # (pixels) grid size 172 | h, w = [math.ceil(x * ratio / gs) * gs for x in (h, w)] 173 | return F.pad(img, [0, w - s[1], 0, h - s[0]], value=0.447) # value = imagenet mean 174 | 175 | 176 | class ModelEMA: 177 | """ Model Exponential Moving Average from https://github.com/rwightman/pytorch-image-models 178 | Keep a moving average of everything in the model state_dict (parameters and buffers). 179 | This is intended to allow functionality like 180 | https://www.tensorflow.org/api_docs/python/tf/train/ExponentialMovingAverage 181 | A smoothed version of the weights is necessary for some training schemes to perform well. 182 | E.g. Google's hyper-params for training MNASNet, MobileNet-V3, EfficientNet, etc that use 183 | RMSprop with a short 2.4-3 epoch decay period and slow LR decay rate of .96-.99 require EMA 184 | smoothing of weights to match results. Pay attention to the decay constant you are using 185 | relative to your update count per epoch. 186 | To keep EMA from using GPU resources, set device='cpu'. This will save a bit of memory but 187 | disable validation of the EMA weights. Validation will have to be done manually in a separate 188 | process, or after the training stops converging. 189 | This class is sensitive to where it is initialized in the sequence of model init, 190 | GPU assignment and distributed training wrappers. 191 | I've tested with the sequence in my own train_yolov5.py for torch.DataParallel, apex.DDP, and single-GPU. 192 | """ 193 | 194 | def __init__(self, model, decay=0.9999, device=''): 195 | # Create EMA 196 | self.ema = deepcopy(model.module if is_parallel(model) else model) # FP32 EMA 197 | self.ema.eval() 198 | self.updates = 0 # number of EMA updates 199 | self.decay = lambda x: decay * (1 - math.exp(-x / 2000)) # decay exponential ramp (to help early epochs) 200 | self.device = device # perform ema on different device from model if set 201 | if device: 202 | self.ema.to(device) 203 | for p in self.ema.parameters(): 204 | p.requires_grad_(False) 205 | 206 | def update(self, model): 207 | # Update EMA parameters 208 | with torch.no_grad(): 209 | self.updates += 1 210 | d = self.decay(self.updates) 211 | 212 | msd = model.module.state_dict() if is_parallel(model) else model.state_dict() # model state_dict 213 | for k, v in self.ema.state_dict().items(): 214 | if v.dtype.is_floating_point: 215 | v *= d 216 | v += (1. 
- d) * msd[k].detach() 217 | 218 | def update_attr(self, model): 219 | # Update EMA attributes 220 | for k, v in model.__dict__.items(): 221 | if not k.startswith('_') and k not in ["process_group", "reducer"]: 222 | setattr(self.ema, k, v) 223 | -------------------------------------------------------------------------------- /weights/lprnet_best.pth: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/HuKai97/YOLOv5-LPRNet-Licence-Recognition/b42fd0837b4bd5becc484533c7f2324123c703a3/weights/lprnet_best.pth -------------------------------------------------------------------------------- /weights/yolov5_best.pt: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/HuKai97/YOLOv5-LPRNet-Licence-Recognition/b42fd0837b4bd5becc484533c7f2324123c703a3/weights/yolov5_best.pt --------------------------------------------------------------------------------
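
For readers wiring the pieces together, here is a minimal, self-contained sketch of how the ModelEMA class from utils/torch_utils.py above is typically driven from a training loop. The toy model, optimizer and random data are illustrative placeholders only; in this repository the EMA is meant to wrap the YOLOv5 detector inside tools/train_yolov5.py, as the class docstring notes.

```python
# Minimal usage sketch for ModelEMA (utils/torch_utils.py). The toy model and
# random data below are illustrative assumptions, not part of this repository.
import torch
import torch.nn as nn

from utils.torch_utils import ModelEMA

model = nn.Sequential(nn.Linear(10, 16), nn.ReLU(), nn.Linear(16, 1))  # stand-in for the YOLOv5 detector
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
criterion = nn.MSELoss()
ema = ModelEMA(model)  # keeps an FP32 exponential-moving-average copy of the weights

for step in range(100):
    x, y = torch.randn(8, 10), torch.randn(8, 1)  # one toy batch
    loss = criterion(model(x), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    ema.update(model)  # blend the current weights into the EMA copy after every optimizer step

ema.update_attr(model)  # copy plain (non-tensor) attributes onto the EMA model
torch.save(ema.ema.state_dict(), 'ema_weights.pt')  # validate/export ema.ema rather than the raw model
```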