├── .gitignore
├── LICENSE
├── README.md
├── README_CN.md
├── docs
│   ├── _config.yml
│   └── index.md
├── v1.0
│   └── main.py
├── v2.0
│   └── gesture.py
└── v3.0
    ├── 01_image_processing_and_data_augmentation.ipynb
    ├── 02_munge_data.py
    ├── 03_Modeling_and_Inference.ipynb
    ├── LICENSE
    ├── README.md
    ├── modeling_data
    │   └── aug_data
    │       └── annotations.csv
    ├── ord.txt
    ├── windows_v1.8.1
    │   ├── data
    │   │   └── predefined_classes.txt
    │   └── labelImg.exe
    └── yolov5
        ├── .dockerignore
        ├── .gitattributes
        ├── .github
        │   ├── ISSUE_TEMPLATE
        │   │   ├── --bug-report.md
        │   │   ├── --feature-request.md
        │   │   └── -question.md
        │   └── workflows
        │       ├── ci-testing.yml
        │       ├── greetings.yml
        │       ├── rebase.yml
        │       └── stale.yml
        ├── .gitignore
        ├── Dockerfile
        ├── LICENSE
        ├── README.md
        ├── config.yaml
        ├── detect.py
        ├── hubconf.py
        ├── models
        │   ├── __init__.py
        │   ├── common.py
        │   ├── experimental.py
        │   ├── export.py
        │   ├── hub
        │   │   ├── yolov3-spp.yaml
        │   │   ├── yolov5-fpn.yaml
        │   │   └── yolov5-panet.yaml
        │   ├── yolo.py
        │   ├── yolov5l.yaml
        │   ├── yolov5m.yaml
        │   ├── yolov5s.yaml
        │   └── yolov5x.yaml
        ├── requirements.txt
        ├── sotabench.py
        ├── test.py
        ├── train.py
        ├── tutorial.ipynb
        ├── utils
        │   ├── __init__.py
        │   ├── activations.py
        │   ├── datasets.py
        │   ├── evolve.sh
        │   ├── general.py
        │   ├── google_app_engine
        │   │   ├── Dockerfile
        │   │   ├── additional_requirements.txt
        │   │   └── app.yaml
        │   ├── google_utils.py
        │   └── torch_utils.py
        ├── weights
        │   └── download_weights.sh
        └── yolo_data
            └── labels
                ├── train.cache
                └── validation.cache

--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
# visual studio code
.vscode/
.idea/

# python
venv/
virtualenv/
__pycache__

# misc
.DS_Store

# results
*.npy
*.npz
*.png
*.PNG
*.jpg
*.JPG
*.jpeg

# notebook checkpoints
.ipynb_checkpoints

# path
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
## Hi 👋

Since you're already here, why not star this project? And please forgive my poor English.

You're welcome to star this repo!

Mid-air brush [Demo]

README [EN|CN]

## Description

Mid-air gesture recognition and drawing. By default, gesture 1 is the brush, gesture 2 changes the color, and gesture 5 clears the drawing board.
The display is based on OpenCV.


## Change Log

### v3.0

This version of the project is based on GA_Data_Science_Capstone.

It uses YOLOv5 to recognize the gestures and the index finger used for drawing. Please create your own gesture dataset and label it; data preprocessing is handled in files 01 and 02.
The project can also run with a Raspberry Pi: the Pi captures images and streams them to a computer for inference, with some latency.
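In v3.0 the drawing logic boils down to mapping each detected class (`1`, `2`, `5`, `forefinger`; see `windows_v1.8.1/data/predefined_classes.txt`) to an action on the canvas. The sketch below illustrates the idea only — the names and structure are illustrative, not the repo's actual implementation:

```python
import cv2
import numpy as np

COLORS = [(255, 0, 0), (0, 255, 0), (0, 0, 255)]  # palette to cycle through

canvas = np.full((360, 640, 3), 255, np.uint8)  # white drawing board
color_idx, last_point = 0, None

def apply_gesture(label, fingertip):
    """Apply one detection: label is '1', '2' or '5'; fingertip is the (x, y) of the index finger."""
    global color_idx, last_point
    if label == '1' and last_point is not None:
        cv2.line(canvas, last_point, fingertip, COLORS[color_idx], 3)  # gesture 1: draw a stroke
    elif label == '2':
        color_idx = (color_idx + 1) % len(COLORS)                      # gesture 2: next color
    elif label == '5':
        canvas[:] = 255                                                # gesture 5: clear the board
    last_point = fingertip
```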

#### How to run

```sh
cd v3.0
pip install -r requirements.txt
jupyter notebook

# open and run 01_image_processing_and_data_augmentation.ipynb

# run labelImg to label the data with the classes 1, 2, 5, forefinger

python 02_munge_data.py

# train the model
python train.py --img 512 --batch 16 --epochs 100 --data config.yaml --cfg models/yolov5s.yaml --name yolo_example
tensorboard --logdir runs/

# run inference with the PC webcam
python detect.py --weights weights/best.pt --img 512 --conf 0.3 --source 0

# run inference with a Raspberry Pi camera stream
# on the raspi
sudo raspivid -o - -rot 180 -t 0 -w 640 -h 360 -fps 30|cvlc -vvv stream:///dev/stdin --sout '#standard{access=http,mux=ts,dst=:8080}' :demux=h264
# on the pc
python detect.py --weights runs/exp12_yolo_example/weights/best.pt --img 512 --conf 0.15 --source http://192.168.43.46:8080/
```

### v2.0

Gesture recognition based on OpenCV and convex hull detection:
skin color detection + convex hull + counting contour defects (to estimate the number of fingers).

#### How to run

```sh
cd v2.0
python gesture.py
```


### v1.0

Skin color detection + convex hull based on OpenCV.


#### How to run
```sh
cd v1.0
python main.py
```


--------------------------------------------------------------------------------
/README_CN.md:
--------------------------------------------------------------------------------
## Hi 👋

来都来了,不点个小星星吗?

Welcome to star this repo!

凌空画笔 [Demo]

README [EN|CN]

## Description

凌空手势识别和绘制,默认手势1是画笔,手势2是更换颜色,手势5是清空画板。
显示基于OpenCV。


## Change Log

### v3.0

该版本项目基于GA_Data_Science_Capstone。

用Yolo_v5识别手势和食指进行绘制,请自行制作手势数据集并进行标注,数据预处理在01和02文件中。
该项目可移植到树莓派上运行,利用树莓派收集图像,推流到电脑进行推理,有延迟。

#### How to run

```sh
cd v3.0
pip install -r requirements.txt
jupyter notebook

# open and run 01_image_processing_and_data_augmentation.ipynb

# run labelImg to label the data with the classes 1, 2, 5, forefinger

python 02_munge_data.py

# train the model
python train.py --img 512 --batch 16 --epochs 100 --data config.yaml --cfg models/yolov5s.yaml --name yolo_example
tensorboard --logdir runs/

# run inference with the PC webcam
python detect.py --weights weights/best.pt --img 512 --conf 0.3 --source 0

# run inference with a Raspberry Pi camera stream
# on the raspi
sudo raspivid -o - -rot 180 -t 0 -w 640 -h 360 -fps 30|cvlc -vvv stream:///dev/stdin --sout '#standard{access=http,mux=ts,dst=:8080}' :demux=h264
# on the pc
python detect.py --weights runs/exp12_yolo_example/weights/best.pt --img 512 --conf 0.15 --source http://192.168.43.46:8080/
```

### v2.0

基于OpenCV和凸包检测的手势识别:
肤色检测+凸包+数轮廓线个数(统计手指数量)。

#### How to run

```sh
cd v2.0
python gesture.py
```


### v1.0

基于OpenCV的肤色检测+凸包。


#### How to run
```sh
cd v1.0
python main.py
```


--------------------------------------------------------------------------------
/docs/_config.yml:
--------------------------------------------------------------------------------
theme: jekyll-theme-cayman
--------------------------------------------------------------------------------
/docs/index.md:
--------------------------------------------------------------------------------
## mid-air-draw

mid-air-draw [Demo]

Welcome to star this repo!

### v1.0
Skin color detection + convex hull

```sh
cd v1.0
python main.py
```

### v2.0
Skin color detection + convex hull + counting contour defects (to count fingers)

#### How to run

```sh
cd v2.0
python gesture.py
```


### v3.0

```sh
cd v3.0
pip install -r requirements.txt
jupyter notebook

# open and run 01_image_processing_and_data_augmentation.ipynb

# run labelImg to label the data with the classes 1, 2, 5, forefinger

python 02_munge_data.py

# train the model
python train.py --img 512 --batch 16 --epochs 100 --data config.yaml --cfg models/yolov5s.yaml --name yolo_example
tensorboard --logdir runs/

# run inference with the PC webcam
python detect.py --weights weights/best.pt --img 512 --conf 0.3 --source 0

# run inference with a Raspberry Pi camera stream
# on the raspi
sudo raspivid -o - -rot 180 -t 0 -w 640 -h 360 -fps 30|cvlc -vvv stream:///dev/stdin --sout '#standard{access=http,mux=ts,dst=:8080}' :demux=h264
# on the pc
python detect.py --weights runs/exp12_yolo_example/weights/best.pt --img 512 --conf 0.15 --source http://192.168.43.46:8080/
```
--------------------------------------------------------------------------------
/v1.0/main.py:
--------------------------------------------------------------------------------
import cv2
import numpy as np


def main():
    cap = cv2.VideoCapture(0)
    init = 0
    last_point = 0
    font = cv2.FONT_HERSHEY_SIMPLEX  # font for on-screen text
    size = 0.5  # font size
    width, height = 300, 300  # size of the capture window
    x0, y0 = 100, 100  # top-left corner of the ROI
    while cap.isOpened():
        ret, img = cap.read()
        img = cv2.flip(img, 2)  # flip horizontally (any positive flip code flips around the y-axis)
        roi = binaryMask(img, x0, y0, width, height)
        res = skinMask(roi)
        contours = getContours(res)
        if init == 0:
            img2 = roi.copy()
            img2[:, :, :] = 255
            init = 1

        print(len(contours))
        if len(contours) > 0:
            first = [x[0] for x in contours[0]]
            first = np.array(first[:])
            print(first)
            y_min = roi.shape[0]  # image height; we look for the topmost (minimum-y) hull point
            idx = 0
            for i, (x, y) in enumerate(first):
                if y < y_min:
                    y_min = y
                    idx = i
            print(first[idx])
            point = (first[idx][0], first[idx][1])  # the topmost hull point is taken as the fingertip
            cv2.circle(img2, point, 1, (255, 0, 0))
            if last_point != 0:
                cv2.line(img2, point, last_point, (255, 0, 0), 1)
            last_point = point

        # print(img2)
        cv2.drawContours(roi, contours, -1, (0, 255, 0), 2)
        cv2.imshow('capture', img)
        cv2.imshow('roi', roi)
        cv2.imshow('draw', img2)
        k = cv2.waitKey(10)
        if k == 27:
            break

def getContours(img):
    kernel = np.ones((5, 5), np.uint8)
    closed = cv2.morphologyEx(img, cv2.MORPH_OPEN, kernel)
    closed = cv2.morphologyEx(closed, cv2.MORPH_CLOSE, kernel)
    contours, h = cv2.findContours(closed, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
    validContours = []
    for cont in contours:
        if cv2.contourArea(cont) > 9000:
            # x, y, w, h = cv2.boundingRect(cont)
            # if h / w > 0.75:
            # filtering out the face this way failed
            validContours.append(cv2.convexHull(cont))
            # print(cv2.convexHull(cont))
            # rect = cv2.minAreaRect(cont)
            # box = cv2.cv.BoxPoint(rect)
            # validContours.append(np.int0(box))
    return validContours


def binaryMask(frame, x0, y0, width, height):
    cv2.rectangle(frame, (x0, y0), (x0 + width, y0 + height), (0, 255, 0))  # draw the hand-capture rectangle
    roi = frame[y0:y0 + height, x0:x0 + width]  # crop the hand ROI
    return roi


def HSVBin(img):
    hsv = cv2.cvtColor(img, cv2.COLOR_RGB2HSV)

    lower_skin = np.array([100, 50, 0])
    upper_skin = np.array([125, 255, 255])

    mask = cv2.inRange(hsv, lower_skin, upper_skin)
    # res = cv2.bitwise_and(img, img, mask=mask)
    return mask


def skinMask1(roi):
    rgb = cv2.cvtColor(roi, cv2.COLOR_BGR2RGB)  # convert to RGB space
    (R, G, B) = cv2.split(rgb)  # split the image into its R, G, B channels
    skin = np.zeros(R.shape, dtype=np.uint8)  # mask
    (x, y) = R.shape  # image dimensions
    for i in range(0, x):
        for j in range(0, y):
            # if the pixel satisfies the skin-color rules, set the mask to white (255)
            if (abs(R[i][j] - G[i][j]) > 15) and (R[i][j] > G[i][j]) and (R[i][j] > B[i][j]):
                if (R[i][j] > 95) and (G[i][j] > 40) and (B[i][j] > 20) \
                        and (max(R[i][j], G[i][j], B[i][j]) - min(R[i][j], G[i][j], B[i][j]) > 15):
                    skin[i][j] = 255
                elif (R[i][j] > 220) and (G[i][j] > 210) and (B[i][j] > 170):
                    skin[i][j] = 255
    # res = cv2.bitwise_and(roi, roi, mask=skin)  # bitwise AND with the image
    return skin


def skinMask2(roi):
    low = np.array([0, 48, 50])  # lower threshold
    high = np.array([20, 255, 255])  # upper threshold
    hsv = cv2.cvtColor(roi, cv2.COLOR_BGR2HSV)  # convert to HSV space
    mask = cv2.inRange(hsv, low, high)  # mask: pixels inside the range are set to 255
    # res = cv2.bitwise_and(roi, roi, mask=mask)  # bitwise AND with the image
    return mask


def skinMask3(roi):
    skinCrCbHist = np.zeros((256, 256), dtype=np.uint8)
    cv2.ellipse(skinCrCbHist, (113, 155), (23, 25), 43, 0, 360, (255, 255, 255), -1)  # draw the filled skin-tone ellipse
    YCrCb = cv2.cvtColor(roi, cv2.COLOR_BGR2YCR_CB)  # convert to YCrCb space
    (y, Cr, Cb) = cv2.split(YCrCb)  # split into Y, Cr, Cb channels
    skin = np.zeros(Cr.shape, dtype=np.uint8)  # mask
    (x, y) = Cr.shape
    for i in range(0, x):
        for j in range(0, y):
            if skinCrCbHist[Cr[i][j], Cb[i][j]] > 0:  # if the (Cr, Cb) value falls inside the ellipse
                skin[i][j] = 255
    # res = cv2.bitwise_and(roi, roi, mask=skin)
    return skin


def skinMask4(roi):
    YCrCb = cv2.cvtColor(roi, cv2.COLOR_BGR2YCR_CB)  # convert to YCrCb space
    (y, cr, cb) = cv2.split(YCrCb)  # split into Y, Cr, Cb channels
    cr1 = cv2.GaussianBlur(cr, (5, 5), 0)
    _, skin = cv2.threshold(cr1, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)  # Otsu thresholding
    # res = cv2.bitwise_and(roi, roi, mask=skin)
    return skin


def skinMask5(roi):
    YCrCb = cv2.cvtColor(roi, cv2.COLOR_BGR2YCR_CB)  # convert to YCrCb space
    (y, cr, cb) = cv2.split(YCrCb)  # split into Y, Cr, Cb channels
    skin = np.zeros(cr.shape, dtype=np.uint8)
    (x, y) = cr.shape
    for i in range(0, x):
        for j in range(0, y):
            # test every pixel against the Cr/Cb skin range
            if (cr[i][j] > 130) and (cr[i][j] < 175) and (cb[i][j] > 77) and (cb[i][j] < 127):
                skin[i][j] = 255
    # res = cv2.bitwise_and(roi, roi, mask=skin)
    return skin


def skinMask(roi):
    return skinMask4(roi)
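
# Editor's note (not part of the original script): the per-pixel Python loops in
# skinMask1/3/5 are far too slow for live video. A vectorized NumPy equivalent of
# skinMask5's Cr/Cb range test -- a sketch, not code used by this project --
# produces the same mask without Python-level loops:
def skinMask5_vectorized(roi):
    YCrCb = cv2.cvtColor(roi, cv2.COLOR_BGR2YCR_CB)  # convert to YCrCb space
    _, cr, cb = cv2.split(YCrCb)
    in_range = (cr > 130) & (cr < 175) & (cb > 77) & (cb < 127)  # same bounds as skinMask5
    return np.where(in_range, 255, 0).astype(np.uint8)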

if __name__ == '__main__':
    main()
--------------------------------------------------------------------------------
/v2.0/gesture.py:
--------------------------------------------------------------------------------
import cv2
import numpy as np
import copy
import math

# from appscript import app

# Environment:
# hardware: Raspberry Pi 4B
# OS: Raspbian GNU/Linux 10 (buster)
# python: 3.7.3
# opencv: 4.2.0

# parameters
cap_region_x_begin = 0.6  # ROI start point / total width
cap_region_y_end = 0.6  # ROI end point / total height
threshold = 60  # BINARY threshold
blurValue = 41  # GaussianBlur parameter
bgSubThreshold = 50
learningRate = 0

# variables
isBgCaptured = 0  # bool, whether the background has been captured
triggerSwitch = False  # if true, the keyboard simulator works


def skinMask1(roi):
    rgb = cv2.cvtColor(roi, cv2.COLOR_BGR2RGB)  # convert to RGB space
    (R, G, B) = cv2.split(rgb)  # split the image into its R, G, B channels
    skin = np.zeros(R.shape, dtype=np.uint8)  # mask
    (x, y) = R.shape  # image dimensions
    for i in range(0, x):
        for j in range(0, y):
            # if the pixel satisfies the skin-color rules, set the mask to white (255)
            if (abs(R[i][j] - G[i][j]) > 15) and (R[i][j] > G[i][j]) and (R[i][j] > B[i][j]):
                if (R[i][j] > 95) and (G[i][j] > 40) and (B[i][j] > 20) \
                        and (max(R[i][j], G[i][j], B[i][j]) - min(R[i][j], G[i][j], B[i][j]) > 15):
                    skin[i][j] = 255
                elif (R[i][j] > 220) and (G[i][j] > 210) and (B[i][j] > 170):
                    skin[i][j] = 255
    # res = cv2.bitwise_and(roi, roi, mask=skin)  # bitwise AND with the image
    return skin


def skinMask2(roi):
    low = np.array([0, 48, 50])  # lower threshold
    high = np.array([20, 255, 255])  # upper threshold
    hsv = cv2.cvtColor(roi, cv2.COLOR_BGR2HSV)  # convert to HSV space
    mask = cv2.inRange(hsv, low, high)  # mask: pixels inside the range are set to 255
    # res = cv2.bitwise_and(roi, roi, mask=mask)  # bitwise AND with the image
    return mask


def skinMask3(roi):
    skinCrCbHist = np.zeros((256, 256), dtype=np.uint8)
    cv2.ellipse(skinCrCbHist, (113, 155), (23, 25), 43, 0, 360, (255, 255, 255), -1)  # draw the filled skin-tone ellipse
    YCrCb = cv2.cvtColor(roi, cv2.COLOR_BGR2YCR_CB)  # convert to YCrCb space
    (y, Cr, Cb) = cv2.split(YCrCb)  # split into Y, Cr, Cb channels
    skin = np.zeros(Cr.shape, dtype=np.uint8)  # mask
    (x, y) = Cr.shape
    for i in range(0, x):
        for j in range(0, y):
            if skinCrCbHist[Cr[i][j], Cb[i][j]] > 0:  # if the (Cr, Cb) value falls inside the ellipse
                skin[i][j] = 255
    # res = cv2.bitwise_and(roi, roi, mask=skin)
    return skin


def skinMask4(roi):
    YCrCb = cv2.cvtColor(roi, cv2.COLOR_BGR2YCR_CB)  # convert to YCrCb space
    (y, cr, cb) = cv2.split(YCrCb)  # split into Y, Cr, Cb channels
    cr1 = cv2.GaussianBlur(cr, (5, 5), 0)
    _, skin = cv2.threshold(cr1, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)  # Otsu thresholding
    # res = cv2.bitwise_and(roi, roi, mask=skin)
    return skin


def skinMask5(roi):
    YCrCb = cv2.cvtColor(roi, cv2.COLOR_BGR2YCR_CB)  # convert to YCrCb space
    (y, cr, cb) = cv2.split(YCrCb)  # split into Y, Cr, Cb channels
    skin = np.zeros(cr.shape, dtype=np.uint8)
    (x, y) = cr.shape
    for i in range(0, x):
        for j in range(0, y):
            # test every pixel against the Cr/Cb skin range
            if (cr[i][j] > 130) and (cr[i][j] < 175) and (cb[i][j] > 77) and (cb[i][j] < 127):
                skin[i][j] = 255
    # res = cv2.bitwise_and(roi, roi, mask=skin)
    return skin


def skinMask(roi):
    return skinMask4(roi)


def dis(p1, p2):
    (x1, y1) = p1
    (x2, y2) = p2
    return np.sqrt((x1 - x2) ** 2 + (y1 - y2) ** 2)

def printThreshold(thr):
    print("! Changed threshold to " + str(thr))


def removeBG(frame):
    """
    fgmask = bgModel.apply(frame, learningRate=learningRate)
    # kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3))
    # res = cv2.morphologyEx(fgmask, cv2.MORPH_OPEN, kernel)

    kernel = np.ones((3, 3), np.uint8)
    fgmask = cv2.erode(fgmask, kernel, iterations=1)
    """
    res = cv2.bitwise_and(frame, frame, mask=skinMask(frame))
    return res


def calculateFingers(res, drawing):  # -> finished bool, cnt: number of convexity defects counted as finger gaps
    # convexity defects
    hull = cv2.convexHull(res, returnPoints=False)
    if len(hull) > 3:
        defects = cv2.convexityDefects(res, hull)
        if defects is not None:  # avoid crashing (root cause of the occasional None not tracked down)

            cnt = 0
            for i in range(defects.shape[0]):  # calculate the angle at each defect
                s, e, f, d = defects[i][0]
                start = tuple(res[s][0])
                end = tuple(res[e][0])
                far = tuple(res[f][0])
                a = math.sqrt((end[0] - start[0]) ** 2 + (end[1] - start[1]) ** 2)
                b = math.sqrt((far[0] - start[0]) ** 2 + (far[1] - start[1]) ** 2)
                c = math.sqrt((end[0] - far[0]) ** 2 + (end[1] - far[1]) ** 2)
                angle = math.acos((b ** 2 + c ** 2 - a ** 2) / (2 * b * c))  # law of cosines
                if angle <= math.pi / 2:  # an angle of 90 degrees or less is treated as a gap between fingers
                    cnt += 1
                    cv2.line(drawing, far, start, [211, 200, 200], 2)
                    cv2.line(drawing, far, end, [211, 200, 200], 2)
                    cv2.circle(drawing, far, 8, [211, 84, 0], -1)
            return True, cnt
    return False, 0
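
# Editor's note (not part of the original script): calculateFingers() uses the
# law of cosines to measure the angle at each convexity defect's far point:
# angle = acos((b**2 + c**2 - a**2) / (2*b*c)), where a is the start-end distance
# and b, c are the far-start and far-end distances. A hand-picked sanity check
# (illustrative values only, never called by the program):
def _defect_angle_demo():
    far, start, end = (0, 0), (3, 0), (0, 4)  # right angle at the far point
    a = math.hypot(end[0] - start[0], end[1] - start[1])  # 5.0
    b = math.hypot(far[0] - start[0], far[1] - start[1])  # 3.0
    c = math.hypot(far[0] - end[0], far[1] - end[1])      # 4.0
    angle = math.acos((b ** 2 + c ** 2 - a ** 2) / (2 * b * c))
    return math.degrees(angle)  # 90.0 -> counted as a finger gap (angle <= pi/2)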

# Camera
camera = cv2.VideoCapture(0)
# rt = camera.get(10)
# print(rt)
camera.set(10, 150)
cv2.namedWindow('trackbar')
cv2.createTrackbar('trh1', 'trackbar', threshold, 100, printThreshold)

last_point = 0
init = 0

while camera.isOpened():
    ret, frame = camera.read()
    threshold = cv2.getTrackbarPos('trh1', 'trackbar')
    frame = cv2.bilateralFilter(frame, 5, 50, 100)  # smoothing filter
    frame = cv2.flip(frame, 1)  # flip the frame horizontally
    cv2.rectangle(frame, (int(cap_region_x_begin * frame.shape[1]), 0),
                  (frame.shape[1], int(cap_region_y_end * frame.shape[0])), (255, 0, 0), 2)
    cv2.imshow('original', frame)
    print(frame.shape)

    # Main operation
    if isBgCaptured == 1:  # this part won't run until the background is captured
        img = removeBG(frame)
        img = img[0:int(cap_region_y_end * frame.shape[0]),
              int(cap_region_x_begin * frame.shape[1]):frame.shape[1]]  # clip the ROI
        cv2.imshow('mask', img)
        if init == 0:
            img2 = img.copy()
            img2[:, :] = 255
            init = 1
        # convert the image into a binary image
        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        blur = cv2.GaussianBlur(gray, (blurValue, blurValue), 0)
        # cv2.imshow('blur', blur)
        ret, thresh = cv2.threshold(blur, threshold, 255, cv2.THRESH_BINARY)
        # cv2.imshow('ori', thresh)

        # find the contours
        thresh1 = copy.deepcopy(thresh)
        contours, hierarchy = cv2.findContours(thresh1, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
        length = len(contours)
        maxArea = -1
        drawing = np.zeros(img.shape, np.uint8)
        if length > 0:
            for i in range(length):  # find the biggest contour (according to area)
                temp = contours[i]
                area = cv2.contourArea(temp)
                if area > maxArea:
                    maxArea = area
                    ci = i

            res = contours[ci]

            # print(last_point)
            # print(res)
            hull = cv2.convexHull(res)
            drawing = np.zeros(img.shape, np.uint8)
            # cv2.drawContours(drawing, [], 0, (0, 255, 0), 2)
            cv2.drawContours(drawing, [res], 0, (0, 255, 0), 2)
            cv2.drawContours(drawing, [hull], 0, (0, 0, 255), 3)

            isFinishCal, cnt = calculateFingers(res, drawing)
            if cnt > 2:
                img2[:, :] = 255
            # print(cnt)
            if triggerSwitch is True:
                # if isFinishCal is True and cnt <= 2:
                if isFinishCal is True:
                    print(cnt)
                    # app('System Events').keystroke(' ')  # simulate pressing the space bar
                    if cnt <= 2:
                        first = [x[0] for x in contours[ci]]
                        first = np.array(first[:])
                        # print(first)
                        y_min = frame.shape[0]  # image height; we look for the topmost (minimum-y) contour point
                        idx = 0
                        for i, (x, y) in enumerate(first):
                            if y < y_min:
                                y_min = y
                                idx = i
                        # print(first[idx])
                        point = (first[idx][0], first[idx][1])
                        cv2.circle(img2, point, 3, (255, 0, 0))
                        if last_point != 0:
                            # print('????')
                            if dis(last_point, point) < 30:
                                cv2.line(img2, point, last_point, (255, 0, 0), 3)
                        last_point = point
                    '''
                    if cnt > 1:
                        first = [x[0] for x in contours[ci]]
                    else:
                        first = [x[0] for x in contours[0]]
                    first = [x[0] for x in contours[ci]]
                    first = np.array(first[:])
                    # print(first)
                    y_min = frame.shape[1]
                    idx = 0
                    for i, (x, y) in enumerate(first):
                        if y < y_min:
                            y_min = y
                            idx = i
                    # print(first[idx])
                    point = (first[idx][0], first[idx][1])
                    cv2.circle(img2, point, 3, (255, 255, 255))
                    if last_point != 0:
                        # print('????')
                        cv2.line(img2, point, last_point, (255, 255, 255), 3)
                    last_point = point
                    '''

        cv2.imshow('output', drawing)
        cv2.imshow('draw', img2)

    # Keyboard OP
    k = cv2.waitKey(10)
    if k == 27:  # press ESC to exit
        camera.release()
        cv2.destroyAllWindows()
        break
    elif k == ord('b'):  # press 'b' to capture the background
        bgModel = cv2.createBackgroundSubtractorMOG2(0, bgSubThreshold)
        isBgCaptured = 1
        print('!!!Background Captured!!!')
    elif k == ord('r'):  # press 'r' to reset the background
        bgModel = None
        triggerSwitch = False
        isBgCaptured = 0
        print('!!!Reset BackGround!!!')
    elif k == ord('n'):
        triggerSwitch = True
        print('!!!Trigger On!!!')
    elif k == ord('c'):
        img2[:, :] = 255
        print('!!!img2 Clear!!!')
--------------------------------------------------------------------------------
/v3.0/02_munge_data.py:
--------------------------------------------------------------------------------
"""
The purpose of this Python script is to create unbiased training and validation sets.
Run it from the terminal; it calls process_data(), which joins the annotations.csv
file with new .txt label files holding the bounding-box class and coordinates for
each image.
"""
# Credit to Abhishek Thakur, as this is a modified version of his notebook.
# Video where he goes over his code: https://www.youtube.com/watch?v=NU9Xr_NYslo&t=1392s

# Import libraries
import os
import ast
import pandas as pd
import numpy as np
from sklearn import model_selection
from tqdm import tqdm
import shutil
# DATA_PATH is where your augmented images and the annotations.csv file are.
# OUTPUT_PATH is where the train and validation images and labels will go.
DATA_PATH = './modeling_data/aug_data/'
OUTPUT_PATH = './yolov5/yolo_data/'


# Function that processes each row in the annotations file
def process_data(data, data_type='train'):
    for _, row in tqdm(data.iterrows(), total=len(data)):
        image_name = row['image_id'][:-4]  # strip the 4-character file extension ('.jpg')
        bounding_boxes = row['bboxes']
        yolo_data = []
        for bbox in bounding_boxes:
            category = bbox[0]
            x_center = bbox[1]
            y_center = bbox[2]
            w = bbox[3]
            h = bbox[4]
            yolo_data.append([category, x_center, y_center, w, h])  # YOLO-formatted labels
        yolo_data = np.array(yolo_data)

        np.savetxt(
            # Output the .txt file to the appropriate train/validation folder
            os.path.join(OUTPUT_PATH, f"labels/{data_type}/{image_name}.txt"),
            yolo_data,
            fmt=["%d", "%f", "%f", "%f", "%f"]
        )
        shutil.copyfile(
            # Copy the augmented image to the appropriate train/validation folder
            os.path.join(DATA_PATH, f"images/{image_name}.jpg"),
            os.path.join(OUTPUT_PATH, f"images/{data_type}/{image_name}.jpg"),
        )


if __name__ == '__main__':
    df = pd.read_csv(os.path.join(DATA_PATH, 'annotations.csv'))
    df.bbox = df.bbox.apply(ast.literal_eval)  # convert the bounding-box strings to lists
    df = df.groupby('image_id')['bbox'].apply(list).reset_index(name='bboxes')

    # split the data 90/10
    df_train, df_valid = model_selection.train_test_split(
        df,
        test_size=0.1,
        random_state=42,
        shuffle=True
    )

    df_train = df_train.reset_index(drop=True)
    df_valid = df_valid.reset_index(drop=True)

    # Run the function so the data is ready for modeling in 03_Modeling_and_Inference.ipynb
    process_data(df_train, data_type='train')
    process_data(df_valid, data_type='validation')
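
# Editor's note (not part of the original script): the code above implies the
# expected annotations.csv layout -- one row per bounding box, with 'bbox'
# holding a stringified [class, x_center, y_center, w, h] list in normalized
# YOLO coordinates. An illustrative sample (values made up; with the labels in
# windows_v1.8.1/data/predefined_classes.txt, class 3 would be 'forefinger'):
#
#   image_id,bbox
#   img_0001.jpg,"[0, 0.512, 0.430, 0.210, 0.305]"
#   img_0001.jpg,"[3, 0.660, 0.380, 0.050, 0.080]"
#   img_0002.jpg,"[2, 0.481, 0.522, 0.240, 0.310]"
#
# which process_data() would turn into labels/train/img_0001.txt:
#
#   0 0.512000 0.430000 0.210000 0.305000
#   3 0.660000 0.380000 0.050000 0.080000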
--------------------------------------------------------------------------------
/v3.0/LICENSE:
--------------------------------------------------------------------------------
MIT License

Copyright (c) 2020 David Lee

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

--------------------------------------------------------------------------------
/v3.0/README.md:
--------------------------------------------------------------------------------
### GA_Data_Science_Capstone_Project
# **Interactive ABC's with American Sign Language**
### A Step in Increasing Accessibility for the Deaf Community with Computer Vision, Utilizing YOLOv5
![ASL_Demo](assets/alphabet.gif)


# **Executive Summary**
Utilizing YOLOv5, a custom computer vision model was trained on the American Sign Language alphabet. The project was promoted on social platforms to diversify the dataset. A total of 720 images were collected in the span of two weeks using Dropbox request forms. Manual labels were created for the original images, which were then resized and organized for preprocessing. Several carefully selected augmentations were applied to the images to compensate for the small dataset. A total of 18,000 images were then used for modeling. Transfer learning was incorporated with yolov5m weights, and training ran for 300 epochs at an image size of 1024 in 163 hours. A mean average precision score of 0.8527 was achieved. Inference tests were performed successfully, identifying the model's strengths and weaknesses for future development.

All operations were performed on my local Linux machine with a CUDA/cuDNN setup using PyTorch.


# **Table of Contents**

- [Executive Summary](#executivesummary)
- [Table of Contents](#contents)
- [Data Collection Method](#data)
- [Preprocessing](#preprocessing)
- [Modeling](#modeling)
- [Inference](#inference)
- [Conclusions](#conclusions)
- [Next Steps](#nextsteps)
- [Citations](#cite)
- [Special Thanks](#thanks)




- [Back to Contents](#contents)
# **Problem Statement:**
Have you ever considered how easy it is to perform simple communication tasks such as ordering food at a drive-thru, discussing financial information with a banker, telling a physician your symptoms at a hospital, or even negotiating your wages with your employer? What if there were a rule that you couldn't speak and could only use your hands in each of these situations? The deaf community cannot do what most of the population takes for granted, and its members are often placed in degrading situations because of the challenges they face every day. Access to qualified interpretation services isn't feasible in most cases, leaving many in the deaf community with underemployment, social isolation, and public health challenges. To give these members of our community a greater voice, I have attempted to answer this question:


**Can computer vision bridge the gap for the deaf and hard of hearing by learning American Sign Language?**

To find out, a YOLOv5 model was trained on the ASL alphabet. If successful, it may mark a step in the right direction for both greater accessibility and educational resources.


- [Back to Contents](#contents)
# **Data Collection Method:**
The decision was made to create an original dataset for a few reasons. The first was to mirror the intended deployment environment: a mobile device or webcam, which typically has a resolution of 720p or 1080p. Several existing datasets have a low resolution, and many do not include the letters "j" and "z", as signing them requires movement.

A letter request form was created with an introduction to my project, along with instructions on how to voluntarily submit sign language images through Dropbox file request forms. It was distributed on social platforms to bring awareness to the project and to collect data.


#### Dropbox request form used: (deadline Sep. 27th, 2020)
https://docs.google.com/document/d/1ChZPPr1dsHtgNqQ55a0FMngJj8PJbGgArm8xsiNYlRQ/edit?usp=sharing
[link](https://docs.google.com/document/d/1ChZPPr1dsHtgNqQ55a0FMngJj8PJbGgArm8xsiNYlRQ/edit?usp=sharing)

A total of 720 images were collected.

Here is the distribution of images (letter - count):

A - 29
B - 25
C - 25
D - 28
E - 25
F - 30
G - 30
H - 29
I - 30
J - 38
K - 27
L - 28
M - 28
N - 27
O - 28
P - 25
Q - 26
R - 25
S - 30
T - 25
U - 25
V - 28
W - 27
X - 26
Y - 26
Z - 30


- [Back to Contents](#contents)
# **Preprocessing**
### Labeling the images
Manual bounding-box labels were created on the original images using the labelImg software.

Each picture and its bounding-box coordinates were then passed through an albumentations pipeline that resized the images to 1024 x 1024 pixel squares and applied a set of transformations, each with its own probability.

These transformations included specified degrees of rotation, shifts in image location, blurs, horizontal flips, random erasing, and a variety of other color transformations.

![](assets/augmentations_slide.png)


25 augmented images were created for each original image, resulting in a set of 18,000 images used for modeling.
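A minimal sketch of such a pipeline, assuming a recent albumentations version with YOLO-format bbox support — the specific transforms and probabilities here are illustrative, not the exact ones used for this project:

```python
import albumentations as A
import numpy as np

transform = A.Compose(
    [
        A.Resize(1024, 1024),  # square resize expected by the model
        A.ShiftScaleRotate(shift_limit=0.0625, scale_limit=0.1, rotate_limit=15, p=0.5),
        A.HorizontalFlip(p=0.5),
        A.Blur(blur_limit=3, p=0.2),
        A.HueSaturationValue(p=0.3),
    ],
    # keeps the YOLO-format bounding boxes in sync with the transformed pixels
    bbox_params=A.BboxParams(format='yolo', label_fields=['class_labels']),
)

image = np.zeros((720, 1280, 3), dtype=np.uint8)  # placeholder frame
augmented = transform(image=image,
                      bboxes=[(0.5, 0.5, 0.2, 0.3)],  # one YOLO-format box
                      class_labels=['A'])
```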

- [Back to Contents](#contents)
# **Modeling: YOLOv5**
To achieve acceptable inference speed and model size, YOLOv5 was chosen for modeling.

It was released on June 10th of this year and is still in active development. Although YOLOv5 by Ultralytics was not created by the original YOLO authors, it is said to be faster and more lightweight, with accuracy on par with YOLOv4, which is widely considered the fastest and most accurate real-time object detection model.

![](assets/Yolov5_explanation.png)

YOLO was designed as a convolutional neural network for real-time object detection. The task is more complex than basic classification, since object detection must both identify objects and locate them within the image. This single-stage object detector has three main components:

The backbone extracts important features from an image; the neck mainly uses feature pyramids, which help generalize object scaling for better performance on unseen data; and the model head does the actual detection, applying anchor boxes to the features and generating output vectors.
These vectors include the class probabilities, the objectness scores, and the bounding boxes.


The model used was yolov5m, with transfer learning on pretrained weights.

#### **Model Training**
Epochs: 300
Batch Size: 8
Image Size: 1024 x 1024
Weights: yolov5m.pt
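In this repo's YOLOv5 terms, that setup corresponds to a training command along the following lines — a sketch only; the data yaml and run name are illustrative, not taken from the project:

```sh
python train.py --img 1024 --batch 8 --epochs 300 \
    --data asl_config.yaml --cfg models/yolov5m.yaml \
    --weights yolov5m.pt --name asl_yolov5m
```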
![](assets/results.png)

mAP@.5: 98.17%

**mAP@.5:.95: 85.27%**

Training batch example:
![](assets/train_batch2.jpg)

Test batch predictions example:
![](assets/test_batchm_pred.jpg)


- [Back to Contents](#contents)
# **Inference**
### **Images**
I had reserved a test set of my son's attempts at each letter that was not included in any of the training and validation sets. In fact, no pictures of children's hands were used to train the model. Ideally, several more images would help showcase how well the model performs, but this is a start.
![](assets/test_slides.png)

Out of 26 letters, 18 were correctly predicted.

Letters that did not receive a prediction: G, H, J, and Z.

Letters that were incorrectly predicted:
"D" predicted as "F"
"E" predicted as "T"
"P" predicted as "Q"
"R" predicted as "U"


## **Video Findings:**

==============================================================
**Left-handed:**
This test shows that the image augmentation pipeline performed well, as it was set to flip the images horizontally with a 50% probability.
![](assets/left_handed.gif)

==============================================================
**Child's hand:**
The model was also tested on my son's hand, and it still performs well here.
![](assets/son_name.gif)

==============================================================
**Multiple letters on screen:**
Simultaneous letters were also detected. Although sign language is not actually used the way the video on the right shows, it demonstrates that multiple people can be on screen and the model will distinguish more than one instance of the language.
![](assets/hi_screen_record.gif)

==============================================================
## **Video Limitations:**
==============================================================
**Distance**
I discovered limitations in the model. The biggest one is distance: since many of the original pictures of my hands were taken with my phone at close range, inference is negatively impacted at greater distances.

![](assets/distance_limitation.gif)

==============================================================
**New environments**
The video clips of volunteers below were not included in any of the model training. Although the model picks up a lot of the letters, the prediction confidence levels are lower, and more misclassifications are present.
![](assets/volunteers.gif)


I've verified this with a video of my own.
![](assets/bg_limitation.gif)

**Even though the original image set contained only 720 pictures, the implications of the results bring us to an exciting conclusion.**

==============================================================
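For reference, the video tests above come from running yolov5's detect.py on a webcam or video file; a typical invocation might look like the following sketch (the weights path and confidence threshold are illustrative):

```sh
python detect.py --weights runs/exp0_asl_yolov5m/weights/best.pt --img 1024 --conf 0.4 --source 0
```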


- [Back to Contents](#contents)
# **Conclusions**
Computer vision can and should be used to mark a step toward greater accessibility and educational resources for our deaf and hard-of-hearing communities!

- Even though the original image set contained only 720 pictures, the implications of the results displayed here are promising.
- Gathering more image data from a variety of sources would help the model's inference at different distances and in new environments.
- Even letters that involve movement can be recognized through computer vision.



- [Back to Contents](#contents)
# **Next Steps**
I believe this project is aligned with the vision of the National Association of the Deaf in bringing better accessibility and education to this underrepresented community. If I am able to bring awareness to the project and partner with an organization like the NAD, I will be able to gather better data from people who sign natively and push the project further.

The technology is still very new, and the model I trained for this presentation was primarily used to find out whether the approach would work. I'm happy with my initial results, and I've already trained a smaller model that I'll be testing for mobile deployment in the future.

I believe computer vision can help give our deaf and hard-of-hearing neighbors a voice, with the right support and project awareness.

- [Back to Contents](#contents)

# **Citations**
Python Version: 3.8
Packages: pandas, numpy, matplotlib, sklearn, opencv, os, ast, albumentations, tqdm, torch, IPython, PIL, shutil

### Resources:

YOLOv5 GitHub:
https://github.com/ultralytics/yolov5

YOLOv5 requirements:
https://github.com/ultralytics/yolov5/blob/master/requirements.txt

cuDNN install guide:
https://docs.nvidia.com/deeplearning/cudnn/install-guide/index.html

Installing OpenCV:
https://www.codegrepper.com/code-examples/python/how+to+install+opencv+in+python+3.8

Roboflow augmentation process:
https://docs.roboflow.com/image-transformations/image-augmentation

Heavily utilized research paper on image augmentations:
https://journalofbigdata.springeropen.com/articles/10.1186/s40537-019-0197-0#Sec3

Pillow library:
https://pillow.readthedocs.io/en/latest/handbook/index.html

Labeling software (labelImg):
https://github.com/tzutalin/labelImg

Albumentations library:
https://github.com/albumentations-team/albumentations

# **Special Thanks**
Joseph Nelson, CEO of Roboflow.ai, for delivering a computer vision lesson to our class and answering my questions directly.

And to my volunteers:
Nathan & Roxanne Seither
Juhee Sung-Schenck
Josh Mizraji
Lydia Kajeckas
Aidan Curley
Chris Johnson
Eric Lee

And to the General Assembly DSI-720 instructors:
Adi Bronshtein
Patrick Wales-Dinan
Kelly Slatery
Noah Christiansen
Jacob Ellena
Bradford Smith

This project would not have been possible without the time all of you invested in me. Thank you!
--------------------------------------------------------------------------------
/v3.0/ord.txt:
--------------------------------------------------------------------------------
python train.py --img 512 --batch 16 --epochs 100 --data config.yaml --cfg models/yolov5s.yaml --name yolo_example
tensorboard --logdir runs/
python detect.py --weights weights/best.pt --img 512 --conf 0.3 --source 0
python detect.py --weights runs/exp12_yolo_example/weights/best.pt --img 512 --conf 0.15 --source 0
python detect.py --weights runs/exp12_yolo_example/weights/best.pt --img 512 --conf 0.15 --source rtsp://192.168.0.106:8554/
python detect.py --weights runs/exp12_yolo_example/weights/best.pt --img 512 --conf 0.15 --source http://192.168.0.106:8080/
python detect.py --weights runs/exp12_yolo_example/weights/best.pt --img 512 --conf 0.15 --source http://192.168.43.46:8080/

sudo raspivid -o - -rot 180 -t 0 -w 640 -h 480 -fps 30|cvlc -vvv stream:///dev/stdin --sout '#standard{access=http,mux=ts,dst=:8080}' :demux=h264
sudo raspivid -o - -rot 180 -t 0 -w 640 -h 360 -fps 30|cvlc -vvv stream:///dev/stdin --sout '#standard{access=http,mux=ts,dst=:8080}' :demux=h264

sudo raspivid -o - -rot 180 -t 0 -w 640 -h 360 -fps 25|cvlc -vvv stream:///dev/stdin --sout '#standard{access=http,mux=ts,dst=:8090}' :demux=h264
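
# Editor's note (not part of the original notes): the raspivid|cvlc pipeline above
# publishes the Pi camera as an HTTP stream, which detect.py consumes via
# --source http://<pi-ip>:8080/. A quick connectivity check from the PC side with
# OpenCV (a sketch; the address is illustrative and this requires an OpenCV build
# with FFmpeg support):
#
#   import cv2
#   cap = cv2.VideoCapture('http://192.168.43.46:8080/')
#   while cap.isOpened():
#       ok, frame = cap.read()
#       if not ok:
#           break
#       cv2.imshow('pi stream', frame)
#       if cv2.waitKey(1) == 27:  # ESC quits
#           break
#   cap.release()
#   cv2.destroyAllWindows()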
--------------------------------------------------------------------------------
/v3.0/windows_v1.8.1/data/predefined_classes.txt:
--------------------------------------------------------------------------------
1
2
5
forefinger
--------------------------------------------------------------------------------
/v3.0/windows_v1.8.1/labelImg.exe:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/yyyanbj/mid-air-draw/9ce05fe981e9037d8c0151be66c0254f8f2523d5/v3.0/windows_v1.8.1/labelImg.exe
--------------------------------------------------------------------------------
/v3.0/yolov5/.dockerignore:
--------------------------------------------------------------------------------
# Repo-specific DockerIgnore -------------------------------------------------------------------------------------------
#.git
.cache
.idea
runs
output
coco
storage.googleapis.com

data/samples/*
**/results*.txt
*.jpg

# Neural Network weights -----------------------------------------------------------------------------------------------
**/*.weights
**/*.pt
**/*.pth
**/*.onnx
**/*.mlmodel
**/*.torchscript


# Below Copied From .gitignore -----------------------------------------------------------------------------------------
# Below Copied From .gitignore -----------------------------------------------------------------------------------------


# GitHub Python GitIgnore ----------------------------------------------------------------------------------------------
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
env/
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg

# PyInstaller
#  Usually these files are written by a python script from a template
#  before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
.hypothesis/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
target/

# Jupyter Notebook
.ipynb_checkpoints

# pyenv
.python-version

# celery beat schedule file
celerybeat-schedule

# SageMath parsed files
*.sage.py

# dotenv
.env

# virtualenv
.venv*
venv*/
ENV*/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/


# https://github.com/github/gitignore/blob/master/Global/macOS.gitignore -----------------------------------------------

# General
.DS_Store
.AppleDouble
.LSOverride

# Icon must end with two \r
Icon
Icon?

# Thumbnails
._*

# Files that might appear in the root of a volume
.DocumentRevisions-V100
.fseventsd
.Spotlight-V100
.TemporaryItems
.Trashes
.VolumeIcon.icns
.com.apple.timemachine.donotpresent

# Directories potentially created on remote AFP share
.AppleDB
.AppleDesktop
Network Trash Folder
Temporary Items
.apdisk


# https://github.com/github/gitignore/blob/master/Global/JetBrains.gitignore
# Covers JetBrains IDEs: IntelliJ, RubyMine, PhpStorm, AppCode, PyCharm, CLion, Android Studio and WebStorm
# Reference: https://intellij-support.jetbrains.com/hc/en-us/articles/206544839

# User-specific stuff:
.idea/*
.idea/**/workspace.xml
.idea/**/tasks.xml
.idea/dictionaries
.html  # Bokeh Plots
.pg  # TensorFlow Frozen Graphs
.avi  # videos

# Sensitive or high-churn files:
.idea/**/dataSources/
.idea/**/dataSources.ids
.idea/**/dataSources.local.xml
.idea/**/sqlDataSources.xml
.idea/**/dynamic.xml
.idea/**/uiDesigner.xml

# Gradle:
.idea/**/gradle.xml
.idea/**/libraries

# CMake
cmake-build-debug/
cmake-build-release/

# Mongo Explorer plugin:
.idea/**/mongoSettings.xml

## File-based project format:
*.iws

## Plugin-specific files:

# IntelliJ
out/

# mpeltonen/sbt-idea plugin
.idea_modules/

# JIRA plugin
atlassian-ide-plugin.xml

# Cursive Clojure plugin
.idea/replstate.xml

# Crashlytics plugin (for Android Studio and IntelliJ)
com_crashlytics_export_strings.xml
crashlytics.properties
crashlytics-build.properties
fabric.properties
--------------------------------------------------------------------------------
/v3.0/yolov5/.gitattributes:
--------------------------------------------------------------------------------
# this drops notebooks from GitHub language stats
*.ipynb linguist-vendored
--------------------------------------------------------------------------------
/v3.0/yolov5/.github/ISSUE_TEMPLATE/--bug-report.md:
--------------------------------------------------------------------------------
---
name: "\U0001F41BBug report"
about: Create a report to help us improve
title: ''
labels: bug
assignees: ''

---

Before submitting a bug report, please be aware that your issue **must be reproducible** with all of the following, otherwise it is non-actionable, and we can not help you:
 - **Current repo**: run `git fetch && git status -uno` to check and `git pull` to update repo
 - **Common dataset**: coco.yaml or coco128.yaml
 - **Common environment**: Colab, Google Cloud, or Docker image. See https://github.com/ultralytics/yolov5#environments

If this is a custom dataset/training question you **must include** your `train*.jpg`, `test*.jpg` and `results.png` figures, or we can not help you. You can generate these with `utils.plot_results()`.


## 🐛 Bug
A clear and concise description of what the bug is.


## To Reproduce (REQUIRED)

Input:
```
import torch

a = torch.tensor([5])
c = a / 0
```

Output:
```
Traceback (most recent call last):
  File "/Users/glennjocher/opt/anaconda3/envs/env1/lib/python3.7/site-packages/IPython/core/interactiveshell.py", line 3331, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "", line 5, in
    c = a / 0
RuntimeError: ZeroDivisionError
```


## Expected behavior
A clear and concise description of what you expected to happen.


## Environment
If applicable, add screenshots to help explain your problem.

 - OS: [e.g. Ubuntu]
 - GPU [e.g. 2080 Ti]


## Additional context
Add any other context about the problem here.
--------------------------------------------------------------------------------
/v3.0/yolov5/.github/ISSUE_TEMPLATE/--feature-request.md:
--------------------------------------------------------------------------------
---
name: "\U0001F680Feature request"
about: Suggest an idea for this project
title: ''
labels: enhancement
assignees: ''

---

## 🚀 Feature


## Motivation



## Pitch



## Alternatives



## Additional context


--------------------------------------------------------------------------------
/v3.0/yolov5/.github/ISSUE_TEMPLATE/-question.md:
--------------------------------------------------------------------------------
---
name: "❓Question"
about: Ask a general question
title: ''
labels: question
assignees: ''

---

## ❔Question


## Additional context
--------------------------------------------------------------------------------
/v3.0/yolov5/.github/workflows/ci-testing.yml:
--------------------------------------------------------------------------------
name: CI CPU testing

on:  # https://help.github.com/en/actions/reference/events-that-trigger-workflows
  push:
  pull_request:
  schedule:
    - cron: "0 0 * * *"

jobs:
  cpu-tests:

    runs-on: ${{ matrix.os }}
    strategy:
      fail-fast: false
      matrix:
        os: [ubuntu-latest, macos-latest, windows-latest]
        python-version: [3.8]
        model: ['yolov5s']  # models to test

    # Timeout: https://stackoverflow.com/a/59076067/4521646
    timeout-minutes: 50
    steps:
      - uses: actions/checkout@v2
      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v2
        with:
          python-version: ${{ matrix.python-version }}

      # Note: This uses an internal pip API and may not always work
      # https://github.com/actions/cache/blob/master/examples.md#multiple-oss-in-a-workflow
      - name: Get pip cache
        id: pip-cache
        run: |
          python -c "from pip._internal.locations import USER_CACHE_DIR; print('::set-output name=dir::' + USER_CACHE_DIR)"

      - name: Cache pip
        uses: actions/cache@v1
        with:
          path: ${{ steps.pip-cache.outputs.dir }}
          key: ${{ runner.os }}-${{ matrix.python-version }}-pip-${{ hashFiles('requirements.txt') }}
          restore-keys: |
            ${{ runner.os }}-${{ matrix.python-version }}-pip-

      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -qr requirements.txt -f https://download.pytorch.org/whl/cpu/torch_stable.html
          pip install -q onnx
          python --version
          pip --version
          pip list
        shell: bash

      - name: Download data
        run: |
          # curl -L -o tmp.zip https://github.com/ultralytics/yolov5/releases/download/v1.0/coco128.zip
          # unzip -q tmp.zip -d ../
          # rm tmp.zip

      - name: Tests workflow
        run: |
          # export PYTHONPATH="$PWD"  # to run '$ python *.py' files in subdirectories
          di=cpu  # inference device

          # train
          python train.py --img 256 --batch 8 --weights weights/${{ matrix.model }}.pt --cfg models/${{ matrix.model }}.yaml --epochs 1 --device $di
          # detect
          python detect.py --weights weights/${{ matrix.model }}.pt --device $di
          python detect.py --weights runs/exp0/weights/last.pt --device $di
          # test
          python test.py --img 256 --batch 8 --weights weights/${{ matrix.model }}.pt --device $di
          python test.py --img 256 --batch 8 --weights runs/exp0/weights/last.pt --device $di

          python models/yolo.py --cfg models/${{ matrix.model }}.yaml  # inspect
          python models/export.py --img 256 --batch 1 --weights weights/${{ matrix.model }}.pt  # export
        shell: bash
--------------------------------------------------------------------------------
/v3.0/yolov5/.github/workflows/greetings.yml:
--------------------------------------------------------------------------------
name: Greetings

on: [pull_request_target, issues]

jobs:
  greeting:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/first-interaction@v1
        with:
          repo-token: ${{ secrets.GITHUB_TOKEN }}
          pr-message: |
            Hello @${{ github.actor }}, thank you for submitting a PR! To allow your work to be integrated as seamlessly as possible, we advise you to:
            - Verify your PR is **up-to-date with origin/master.** If your PR is behind origin/master update by running the following, replacing 'feature' with the name of your local branch:
            ```bash
            git remote add upstream https://github.com/ultralytics/yolov5.git
            git fetch upstream
            git checkout feature  # <----- replace 'feature' with local branch name
            git rebase upstream/master
            git push -u origin -f
            ```
            - Verify all Continuous Integration (CI) **checks are passing**.
            - Reduce changes to the absolute **minimum** required for your bug fix or feature addition. _"It is not daily increase but daily decrease, hack away the unessential. The closer to the source, the less wastage there is."_ -Bruce Lee

          issue-message: |
            Hello @${{ github.actor }}, thank you for your interest in our work! Please visit our [Custom Training Tutorial](https://github.com/ultralytics/yolov5/wiki/Train-Custom-Data) to get started, and see our [Jupyter Notebook](https://github.com/ultralytics/yolov5/blob/master/tutorial.ipynb) Open In Colab, [Docker Image](https://hub.docker.com/r/ultralytics/yolov5), and [Google Cloud Quickstart Guide](https://github.com/ultralytics/yolov5/wiki/GCP-Quickstart) for example environments.

            If this is a bug report, please provide screenshots and **minimum viable code to reproduce your issue**, otherwise we can not help you.

            If this is a custom model or data training question, please note Ultralytics does **not** provide free personal support. As a leader in vision ML and AI, we do offer professional consulting, from simple expert advice up to delivery of fully customized, end-to-end production solutions for our clients, such as:
            - **Cloud-based AI** systems operating on **hundreds of HD video streams in realtime.**
            - **Edge AI** integrated into custom iOS and Android apps for realtime **30 FPS video inference.**
            - **Custom data training**, hyperparameter evolution, and model exportation to any destination.

            For more information please visit https://www.ultralytics.com.
--------------------------------------------------------------------------------
/v3.0/yolov5/.github/workflows/rebase.yml:
--------------------------------------------------------------------------------
name: Automatic Rebase
# https://github.com/marketplace/actions/automatic-rebase

on:
  issue_comment:
    types: [created]

jobs:
  rebase:
    name: Rebase
    if: github.event.issue.pull_request != '' && contains(github.event.comment.body, '/rebase')
    runs-on: ubuntu-latest
    steps:
      - name: Checkout the latest code
        uses: actions/checkout@v2
        with:
          fetch-depth: 0
      - name: Automatic Rebase
        uses: cirrus-actions/rebase@1.3.1
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
--------------------------------------------------------------------------------
/v3.0/yolov5/.github/workflows/stale.yml:
--------------------------------------------------------------------------------
name: Close stale issues
on:
  schedule:
    - cron: "0 0 * * *"

jobs:
  stale:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/stale@v1
        with:
          repo-token: ${{ secrets.GITHUB_TOKEN }}
          stale-issue-message: 'This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.'
          stale-pr-message: 'This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.'
          days-before-stale: 30
          days-before-close: 5
          exempt-issue-labels: 'documentation,tutorial'
          operations-per-run: 100  # The maximum number of operations per run, used to control rate limiting.
19 | -------------------------------------------------------------------------------- /v3.0/yolov5/.gitignore: -------------------------------------------------------------------------------- 1 | # Repo-specific GitIgnore ---------------------------------------------------------------------------------------------- 2 | *.jpg 3 | *.jpeg 4 | *.png 5 | *.bmp 6 | *.tif 7 | *.tiff 8 | *.heic 9 | *.JPG 10 | *.JPEG 11 | *.PNG 12 | *.BMP 13 | *.TIF 14 | *.TIFF 15 | *.HEIC 16 | *.mp4 17 | *.mov 18 | *.MOV 19 | *.avi 20 | *.data 21 | *.json 22 | 23 | *.cfg 24 | !cfg/yolov3*.cfg 25 | 26 | storage.googleapis.com 27 | runs/* 28 | data/* 29 | !data/samples/zidane.jpg 30 | !data/samples/bus.jpg 31 | !data/coco.names 32 | !data/coco_paper.names 33 | !data/coco.data 34 | !data/coco_*.data 35 | !data/coco_*.txt 36 | !data/trainvalno5k.shapes 37 | !data/*.sh 38 | 39 | pycocotools/* 40 | results*.txt 41 | gcp_test*.sh 42 | 43 | # MATLAB GitIgnore ----------------------------------------------------------------------------------------------------- 44 | *.m~ 45 | *.mat 46 | !targets*.mat 47 | 48 | # Neural Network weights ----------------------------------------------------------------------------------------------- 49 | *.weights 50 | *.pt 51 | *.onnx 52 | *.mlmodel 53 | *.torchscript 54 | darknet53.conv.74 55 | yolov3-tiny.conv.15 56 | 57 | # GitHub Python GitIgnore ---------------------------------------------------------------------------------------------- 58 | # Byte-compiled / optimized / DLL files 59 | __pycache__/ 60 | *.py[cod] 61 | *$py.class 62 | 63 | # C extensions 64 | *.so 65 | 66 | # Distribution / packaging 67 | .Python 68 | env/ 69 | build/ 70 | develop-eggs/ 71 | dist/ 72 | downloads/ 73 | eggs/ 74 | .eggs/ 75 | lib/ 76 | lib64/ 77 | parts/ 78 | sdist/ 79 | var/ 80 | wheels/ 81 | *.egg-info/ 82 | .installed.cfg 83 | *.egg 84 | 85 | # PyInstaller 86 | # Usually these files are written by a python script from a template 87 | # before PyInstaller builds the exe, so as to inject date/other infos into it. 88 | *.manifest 89 | *.spec 90 | 91 | # Installer logs 92 | pip-log.txt 93 | pip-delete-this-directory.txt 94 | 95 | # Unit test / coverage reports 96 | htmlcov/ 97 | .tox/ 98 | .coverage 99 | .coverage.* 100 | .cache 101 | nosetests.xml 102 | coverage.xml 103 | *.cover 104 | .hypothesis/ 105 | 106 | # Translations 107 | *.mo 108 | *.pot 109 | 110 | # Django stuff: 111 | *.log 112 | local_settings.py 113 | 114 | # Flask stuff: 115 | instance/ 116 | .webassets-cache 117 | 118 | # Scrapy stuff: 119 | .scrapy 120 | 121 | # Sphinx documentation 122 | docs/_build/ 123 | 124 | # PyBuilder 125 | target/ 126 | 127 | # Jupyter Notebook 128 | .ipynb_checkpoints 129 | 130 | # pyenv 131 | .python-version 132 | 133 | # celery beat schedule file 134 | celerybeat-schedule 135 | 136 | # SageMath parsed files 137 | *.sage.py 138 | 139 | # dotenv 140 | .env 141 | 142 | # virtualenv 143 | .venv* 144 | venv*/ 145 | ENV*/ 146 | 147 | # Spyder project settings 148 | .spyderproject 149 | .spyproject 150 | 151 | # Rope project settings 152 | .ropeproject 153 | 154 | # mkdocs documentation 155 | /site 156 | 157 | # mypy 158 | .mypy_cache/ 159 | 160 | 161 | # https://github.com/github/gitignore/blob/master/Global/macOS.gitignore ----------------------------------------------- 162 | 163 | # General 164 | .DS_Store 165 | .AppleDouble 166 | .LSOverride 167 | 168 | # Icon must end with two \r 169 | Icon 170 | Icon? 
171 | 172 | # Thumbnails 173 | ._* 174 | 175 | # Files that might appear in the root of a volume 176 | .DocumentRevisions-V100 177 | .fseventsd 178 | .Spotlight-V100 179 | .TemporaryItems 180 | .Trashes 181 | .VolumeIcon.icns 182 | .com.apple.timemachine.donotpresent 183 | 184 | # Directories potentially created on remote AFP share 185 | .AppleDB 186 | .AppleDesktop 187 | Network Trash Folder 188 | Temporary Items 189 | .apdisk 190 | 191 | 192 | # https://github.com/github/gitignore/blob/master/Global/JetBrains.gitignore 193 | # Covers JetBrains IDEs: IntelliJ, RubyMine, PhpStorm, AppCode, PyCharm, CLion, Android Studio and WebStorm 194 | # Reference: https://intellij-support.jetbrains.com/hc/en-us/articles/206544839 195 | 196 | # User-specific stuff: 197 | .idea/* 198 | .idea/**/workspace.xml 199 | .idea/**/tasks.xml 200 | .idea/dictionaries 201 | .html # Bokeh Plots 202 | .pg # TensorFlow Frozen Graphs 203 | .avi # videos 204 | 205 | # Sensitive or high-churn files: 206 | .idea/**/dataSources/ 207 | .idea/**/dataSources.ids 208 | .idea/**/dataSources.local.xml 209 | .idea/**/sqlDataSources.xml 210 | .idea/**/dynamic.xml 211 | .idea/**/uiDesigner.xml 212 | 213 | # Gradle: 214 | .idea/**/gradle.xml 215 | .idea/**/libraries 216 | 217 | # CMake 218 | cmake-build-debug/ 219 | cmake-build-release/ 220 | 221 | # Mongo Explorer plugin: 222 | .idea/**/mongoSettings.xml 223 | 224 | ## File-based project format: 225 | *.iws 226 | 227 | ## Plugin-specific files: 228 | 229 | # IntelliJ 230 | out/ 231 | 232 | # mpeltonen/sbt-idea plugin 233 | .idea_modules/ 234 | 235 | # JIRA plugin 236 | atlassian-ide-plugin.xml 237 | 238 | # Cursive Clojure plugin 239 | .idea/replstate.xml 240 | 241 | # Crashlytics plugin (for Android Studio and IntelliJ) 242 | com_crashlytics_export_strings.xml 243 | crashlytics.properties 244 | crashlytics-build.properties 245 | fabric.properties 246 | -------------------------------------------------------------------------------- /v3.0/yolov5/Dockerfile: -------------------------------------------------------------------------------- 1 | # Start FROM Nvidia PyTorch image https://ngc.nvidia.com/catalog/containers/nvidia:pytorch 2 | FROM nvcr.io/nvidia/pytorch:20.10-py3 3 | 4 | # Install dependencies 5 | RUN pip install --upgrade pip 6 | # COPY requirements.txt . 7 | # RUN pip install -r requirements.txt 8 | RUN pip install gsutil 9 | 10 | # Create working directory 11 | RUN mkdir -p /usr/src/app 12 | WORKDIR /usr/src/app 13 | 14 | # Copy contents 15 | COPY . /usr/src/app 16 | 17 | # Copy weights 18 | #RUN python3 -c "from models import *; \ 19 | #attempt_download('weights/yolov5s.pt'); \ 20 | #attempt_download('weights/yolov5m.pt'); \ 21 | #attempt_download('weights/yolov5l.pt')" 22 | 23 | 24 | # --------------------------------------------------- Extras Below --------------------------------------------------- 25 | 26 | # Build and Push 27 | # t=ultralytics/yolov5:latest && sudo docker build -t $t . && sudo docker push $t 28 | # for v in {300..303}; do t=ultralytics/coco:v$v && sudo docker build -t $t . 
&& sudo docker push $t; done 29 | 30 | # Pull and Run 31 | # t=ultralytics/yolov5:latest && sudo docker pull $t && sudo docker run -it --ipc=host $t 32 | 33 | # Pull and Run with local directory access 34 | # t=ultralytics/yolov5:latest && sudo docker pull $t && sudo docker run -it --ipc=host --gpus all -v "$(pwd)"/coco:/usr/src/coco $t 35 | 36 | # Kill all 37 | # sudo docker kill $(sudo docker ps -q) 38 | 39 | # Kill all image-based 40 | # sudo docker kill $(sudo docker ps -a -q --filter ancestor=ultralytics/yolov5:latest) 41 | 42 | # Bash into running container 43 | # sudo docker container exec -it ba65811811ab bash 44 | 45 | # Bash into stopped container 46 | # sudo docker commit 092b16b25c5b usr/resume && sudo docker run -it --gpus all --ipc=host -v "$(pwd)"/coco:/usr/src/coco --entrypoint=sh usr/resume 47 | 48 | # Send weights to GCP 49 | # python -c "from utils.general import *; strip_optimizer('runs/exp0_*/weights/best.pt', 'tmp.pt')" && gsutil cp tmp.pt gs://*.pt 50 | 51 | # Clean up 52 | # docker system prune -a --volumes 53 | -------------------------------------------------------------------------------- /v3.0/yolov5/README.md: -------------------------------------------------------------------------------- 1 | 2 | 3 |   4 | 5 | ![CI CPU testing](https://github.com/ultralytics/yolov5/workflows/CI%20CPU%20testing/badge.svg) 6 | 7 | This repository represents Ultralytics open-source research into future object detection methods, and incorporates our lessons learned and best practices evolved over training thousands of models on custom client datasets with our previous YOLO repository https://github.com/ultralytics/yolov3. **All code and models are under active development, and are subject to modification or deletion without notice.** Use at your own risk. 8 | 9 | ** GPU Speed measures end-to-end time per image averaged over 5000 COCO val2017 images using a V100 GPU with batch size 32, and includes image preprocessing, PyTorch FP16 inference, postprocessing and NMS. EfficientDet data from [google/automl](https://github.com/google/automl) at batch size 8. 10 | 11 | - **August 13, 2020**: [v3.0 release](https://github.com/ultralytics/yolov5/releases/tag/v3.0): nn.Hardswish() activations, data autodownload, native AMP. 12 | - **July 23, 2020**: [v2.0 release](https://github.com/ultralytics/yolov5/releases/tag/v2.0): improved model definition, training and mAP. 13 | - **June 22, 2020**: [PANet](https://arxiv.org/abs/1803.01534) updates: new heads, reduced parameters, improved speed and mAP [364fcfd](https://github.com/ultralytics/yolov5/commit/364fcfd7dba53f46edd4f04c037a039c0a287972). 14 | - **June 19, 2020**: [FP16](https://pytorch.org/docs/stable/nn.html#torch.nn.Module.half) as new default for smaller checkpoints and faster inference [d4c6674](https://github.com/ultralytics/yolov5/commit/d4c6674c98e19df4c40e33a777610a18d1961145). 15 | - **June 9, 2020**: [CSP](https://github.com/WongKinYiu/CrossStagePartialNetworks) updates: improved speed, size, and accuracy (credit to @WongKinYiu for CSP). 16 | - **May 27, 2020**: Public release. YOLOv5 models are SOTA among all known YOLO implementations. 17 | - **April 1, 2020**: Start development of future compound-scaled [YOLOv3](https://github.com/ultralytics/yolov3)/[YOLOv4](https://github.com/AlexeyAB/darknet)-based PyTorch models. 
18 | 19 | 20 | ## Pretrained Checkpoints 21 | 22 | | Model | AP<sup>val</sup> | AP<sup>test</sup> | AP<sub>50</sub> | Speed<sub>GPU</sub> | FPS<sub>GPU</sub> || params | FLOPS | 23 | |---------- |------ |------ |------ | -------- | ------| ------ |------ | :------: | 24 | | [YOLOv5s](https://github.com/ultralytics/yolov5/releases/tag/v3.0) | 37.0 | 37.0 | 56.2 | **2.4ms** | **416** || 7.5M | 13.2B 25 | | [YOLOv5m](https://github.com/ultralytics/yolov5/releases/tag/v3.0) | 44.3 | 44.3 | 63.2 | 3.4ms | 294 || 21.8M | 39.4B 26 | | [YOLOv5l](https://github.com/ultralytics/yolov5/releases/tag/v3.0) | 47.7 | 47.7 | 66.5 | 4.4ms | 227 || 47.8M | 88.1B 27 | | [YOLOv5x](https://github.com/ultralytics/yolov5/releases/tag/v3.0) | **49.2** | **49.2** | **67.7** | 6.9ms | 145 || 89.0M | 166.4B 28 | | | | | | | || | 29 | | [YOLOv5x](https://github.com/ultralytics/yolov5/releases/tag/v3.0) + TTA | **50.8** | **50.8** | **68.9** | 25.5ms | 39 || 89.0M | 354.3B 30 | | | | | | | || | 31 | | [YOLOv3-SPP](https://github.com/ultralytics/yolov5/releases/tag/v3.0) | 45.6 | 45.5 | 65.2 | 4.5ms | 222 || 63.0M | 118.0B 32 | 33 | ** AP<sup>test</sup> denotes COCO [test-dev2017](http://cocodataset.org/#upload) server results, all other AP results in the table denote val2017 accuracy. 34 | ** All AP numbers are for single-model single-scale without ensemble or test-time augmentation. **Reproduce** by `python test.py --data coco.yaml --img 640 --conf 0.001` 35 | ** Speed<sub>GPU</sub> measures end-to-end time per image averaged over 5000 COCO val2017 images using a GCP [n1-standard-16](https://cloud.google.com/compute/docs/machine-types#n1_standard_machine_types) instance with one V100 GPU, and includes image preprocessing, PyTorch FP16 image inference at --batch-size 32 --img-size 640, postprocessing and NMS. Average NMS time included in this chart is 1-2ms/img. **Reproduce** by `python test.py --data coco.yaml --img 640 --conf 0.1` 36 | ** All checkpoints are trained to 300 epochs with default settings and hyperparameters (no autoaugmentation). 37 | ** Test Time Augmentation ([TTA](https://github.com/ultralytics/yolov5/issues/303)) runs at 3 image sizes. **Reproduce** by `python test.py --data coco.yaml --img 832 --augment` 38 | 39 | ## Requirements 40 | 41 | Python 3.8 or later with all [requirements.txt](https://github.com/ultralytics/yolov5/blob/master/requirements.txt) dependencies installed, including `torch>=1.6`.
To install run: 42 | ```bash 43 | $ pip install -r requirements.txt 44 | ``` 45 | 46 | 47 | ## Tutorials 48 | 49 | * [Train Custom Data](https://github.com/ultralytics/yolov5/wiki/Train-Custom-Data) 50 | * [Multi-GPU Training](https://github.com/ultralytics/yolov5/issues/475) 51 | * [PyTorch Hub](https://github.com/ultralytics/yolov5/issues/36) 52 | * [ONNX and TorchScript Export](https://github.com/ultralytics/yolov5/issues/251) 53 | * [Test-Time Augmentation (TTA)](https://github.com/ultralytics/yolov5/issues/303) 54 | * [Model Ensembling](https://github.com/ultralytics/yolov5/issues/318) 55 | * [Model Pruning/Sparsity](https://github.com/ultralytics/yolov5/issues/304) 56 | * [Hyperparameter Evolution](https://github.com/ultralytics/yolov5/issues/607) 57 | * [TensorRT Deployment](https://github.com/wang-xinyu/tensorrtx) 58 | 59 | 60 | ## Environments 61 | 62 | YOLOv5 may be run in any of the following up-to-date verified environments (with all dependencies including [CUDA](https://developer.nvidia.com/cuda)/[CUDNN](https://developer.nvidia.com/cudnn), [Python](https://www.python.org/) and [PyTorch](https://pytorch.org/) preinstalled): 63 | 64 | - **Google Colab Notebook** with free GPU 65 | - **Kaggle Notebook** with free GPU: [https://www.kaggle.com/ultralytics/yolov5](https://www.kaggle.com/ultralytics/yolov5) 66 | - **Google Cloud** Deep Learning VM. See [GCP Quickstart Guide](https://github.com/ultralytics/yolov5/wiki/GCP-Quickstart) 67 | - **Docker Image** https://hub.docker.com/r/ultralytics/yolov5. See [Docker Quickstart Guide](https://github.com/ultralytics/yolov5/wiki/Docker-Quickstart) ![Docker Pulls](https://img.shields.io/docker/pulls/ultralytics/yolov5?logo=docker) 68 | 69 | 70 | ## Inference 71 | 72 | detect.py runs inference on a variety of sources, downloading models automatically from the [latest YOLOv5 release](https://github.com/ultralytics/yolov5/releases) and saving results to `inference/output`. 73 | ```bash 74 | $ python detect.py --source 0 # webcam 75 | file.jpg # image 76 | file.mp4 # video 77 | path/ # directory 78 | path/*.jpg # glob 79 | rtsp://170.93.143.139/rtplive/470011e600ef003a004ee33696235daa # rtsp stream 80 | rtmp://192.168.1.105/live/test # rtmp stream 81 | http://112.50.243.8/PLTV/88888888/224/3221225900/1.m3u8 # http stream 82 | ``` 83 | 84 | To run inference on example images in `inference/images`: 85 | ```bash 86 | $ python detect.py --source inference/images --weights yolov5s.pt --conf 0.25 87 | 88 | Namespace(agnostic_nms=False, augment=False, classes=None, conf_thres=0.25, device='', img_size=640, iou_thres=0.45, output='inference/output', save_conf=False, save_txt=False, source='inference/images', update=False, view_img=False, weights='yolov5s.pt') 89 | Using CUDA device0 _CudaDeviceProperties(name='Tesla V100-SXM2-16GB', total_memory=16160MB) 90 | 91 | Downloading https://github.com/ultralytics/yolov5/releases/download/v3.0/yolov5s.pt to yolov5s.pt... 100%|██████████████| 14.5M/14.5M [00:00<00:00, 21.3MB/s] 92 | 93 | Fusing layers... 94 | Model Summary: 140 layers, 7.45958e+06 parameters, 0 gradients 95 | image 1/2 yolov5/inference/images/bus.jpg: 640x480 4 persons, 1 buss, 1 skateboards, Done. (0.013s) 96 | image 2/2 yolov5/inference/images/zidane.jpg: 384x640 2 persons, 2 ties, Done. (0.013s) 97 | Results saved to yolov5/inference/output 98 | Done.
(0.124s) 99 | ``` 100 | 101 | 102 | ### PyTorch Hub 103 | 104 | To run **batched inference** with YOLOv5 and [PyTorch Hub](https://github.com/ultralytics/yolov5/issues/36): 105 | ```python 106 | import torch 107 | from PIL import Image 108 | 109 | # Model 110 | model = torch.hub.load('ultralytics/yolov5', 'yolov5s', pretrained=True).fuse().eval() # yolov5s.pt 111 | model = model.autoshape() # for autoshaping of PIL/cv2/np inputs and NMS 112 | 113 | # Images 114 | img1 = Image.open('zidane.jpg') 115 | img2 = Image.open('bus.jpg') 116 | imgs = [img1, img2] # batched list of images 117 | 118 | # Inference 119 | prediction = model(imgs, size=640) # includes NMS 120 | ``` 121 | 122 | 123 | ## Training 124 | 125 | Download [COCO](https://github.com/ultralytics/yolov5/blob/master/data/scripts/get_coco.sh) and run command below. Training times for YOLOv5s/m/l/x are 2/4/6/8 days on a single V100 (multi-GPU times faster). Use the largest `--batch-size` your GPU allows (batch sizes shown for 16 GB devices). 126 | ```bash 127 | $ python train.py --data coco.yaml --cfg yolov5s.yaml --weights '' --batch-size 64 128 | yolov5m 40 129 | yolov5l 24 130 | yolov5x 16 131 | ``` 132 | 133 | 134 | 135 | ## Citation 136 | 137 | [![DOI](https://zenodo.org/badge/264818686.svg)](https://zenodo.org/badge/latestdoi/264818686) 138 | 139 | 140 | ## About Us 141 | 142 | Ultralytics is a U.S.-based particle physics and AI startup with over 6 years of expertise supporting government, academic and business clients. We offer a wide range of vision AI services, spanning from simple expert advice up to delivery of fully customized, end-to-end production solutions, including: 143 | - **Cloud-based AI** systems operating on **hundreds of HD video streams in realtime.** 144 | - **Edge AI** integrated into custom iOS and Android apps for realtime **30 FPS video inference.** 145 | - **Custom data training**, hyperparameter evolution, and model exportation to any destination. 146 | 147 | For business inquiries and professional support requests please visit us at https://www.ultralytics.com. 148 | 149 | 150 | ## Contact 151 | 152 | **Issues should be raised directly in the repository.** For business inquiries or professional support requests please visit https://www.ultralytics.com or email Glenn Jocher at glenn.jocher@ultralytics.com. 
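A note on the **batched inference** example above: `prediction` is assumed to hold one detections tensor per input image, with rows of `[x1, y1, x2, y2, confidence, class]` (the format produced by the `autoShape` wrapper in `models/common.py`, and the same row layout unpacked in `detect.py`). A minimal sketch of reading it:

```python
# Minimal sketch (assumes the autoShape output format from models/common.py):
# one tensor per input image, or None if no detections survived NMS.
for i, det in enumerate(prediction):
    if det is None:
        continue
    for *xyxy, conf, cls in det:  # each row: [x1, y1, x2, y2, confidence, class]
        print('image %d: class %d, conf %.2f, box %s' % (i, int(cls), conf, [int(v) for v in xyxy]))
```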
153 | -------------------------------------------------------------------------------- /v3.0/yolov5/config.yaml: -------------------------------------------------------------------------------- 1 | train: yolo_data/images/train 2 | val: yolo_data/images/validation 3 | nc: 4 4 | names: ['1', 5 | '2', 6 | '5', 7 | 'forefinger'] 8 | -------------------------------------------------------------------------------- /v3.0/yolov5/detect.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | import os 3 | import shutil 4 | import time 5 | from pathlib import Path 6 | 7 | import cv2 8 | import torch 9 | import torch.backends.cudnn as cudnn 10 | from numpy import random 11 | import numpy as np 12 | import copy 13 | from PIL import Image 14 | 15 | from models.experimental import attempt_load 16 | from utils.datasets import LoadStreams, LoadImages 17 | from utils.general import ( 18 | check_img_size, non_max_suppression, apply_classifier, scale_coords, 19 | xyxy2xywh, plot_one_box, strip_optimizer, set_logging) 20 | from utils.torch_utils import select_device, load_classifier, time_synchronized 21 | 22 | 23 | def get_dis(p1, p2): 24 | (x1, y1) = p1 25 | (x2, y2) = p2 26 | return np.sqrt((x1 - x2) ** 2 + (y1 - y2) ** 2) 27 | 28 | 29 | def detect(save_img=False): 30 | out, source, weights, view_img, save_txt, imgsz = \ 31 | opt.save_dir, opt.source, opt.weights, opt.view_img, opt.save_txt, opt.img_size 32 | webcam = source.isnumeric() or source.startswith(('rtsp://', 'rtmp://', 'http://')) or source.endswith('.txt') 33 | 34 | # Initialize 35 | set_logging() 36 | device = select_device(opt.device) 37 | if os.path.exists(out): # output dir 38 | shutil.rmtree(out) # delete dir 39 | os.makedirs(out) # make new dir 40 | half = device.type != 'cpu' # half precision only supported on CUDA 41 | 42 | # Load model 43 | model = attempt_load(weights, map_location=device) # load FP32 model 44 | imgsz = check_img_size(imgsz, s=model.stride.max()) # check img_size 45 | if half: 46 | model.half() # to FP16 47 | 48 | # Second-stage classifier 49 | classify = False 50 | if classify: 51 | modelc = load_classifier(name='resnet101', n=2) # initialize 52 | modelc.load_state_dict(torch.load('weights/resnet101.pt', map_location=device)['model']) # load weights 53 | modelc.to(device).eval() 54 | 55 | # Set Dataloader 56 | vid_path, vid_writer = None, None 57 | if webcam: 58 | view_img = True 59 | cudnn.benchmark = True # set True to speed up constant image size inference 60 | dataset = LoadStreams(source, img_size=imgsz) 61 | else: 62 | save_img = True 63 | dataset = LoadImages(source, img_size=imgsz) 64 | 65 | # Get names and colors 66 | names = model.module.names if hasattr(model, 'module') else model.names 67 | colors = [[random.randint(0, 255) for _ in range(3)] for _ in range(len(names))] 68 | 69 | # Run inference 70 | t0 = time.time() 71 | 72 | init_pic = np.zeros((1080, 1920, 3), dtype=np.uint8) 73 | init_pic[:, :, :] = 255 74 | # init_pic = Image.open('data/ppt.jpg') 75 | # init_pic = np.array(init_pic, dtype=np.uint8) 76 | sketchpad = copy.copy(init_pic) 77 | last_point = 0 78 | last_point_time = 0 79 | dis_max = 300 80 | clr = (255, 0, 0) 81 | clr_list = [(255, 0, 0), (0, 255, 0), (0, 0, 255), (255, 255, 0), (255, 0, 255), (0, 255, 255), 82 | (0, 0, 0)] 83 | line_width = 10 84 | 85 | img = torch.zeros((1, 3, imgsz, imgsz), device=device) # init img 86 | _ = model(img.half() if half else img) if device.type != 'cpu' else None # run once 87 | for path, img, im0s, 
vid_cap in dataset: 88 | img = torch.from_numpy(img).to(device) 89 | img = img.half() if half else img.float() # uint8 to fp16/32 90 | img /= 255.0 # 0 - 255 to 0.0 - 1.0 91 | if img.ndimension() == 3: 92 | img = img.unsqueeze(0) 93 | 94 | # Inference 95 | t1 = time_synchronized() 96 | pred = model(img, augment=opt.augment)[0] 97 | 98 | # Apply NMS 99 | pred = non_max_suppression(pred, opt.conf_thres, opt.iou_thres, classes=opt.classes, agnostic=opt.agnostic_nms) 100 | t2 = time_synchronized() 101 | 102 | # Apply Classifier 103 | if classify: 104 | pred = apply_classifier(pred, modelc, img, im0s) 105 | 106 | # Process detections 107 | for i, det in enumerate(pred): # detections per image 108 | if webcam: # batch_size >= 1 109 | p, s, im0 = path[i], '%g: ' % i, im0s[i].copy() 110 | else: 111 | p, s, im0 = path, '', im0s 112 | 113 | save_path = str(Path(out) / Path(p).name) 114 | txt_path = str(Path(out) / Path(p).stem) + ('_%g' % dataset.frame if dataset.mode == 'video' else '') 115 | s += '%gx%g ' % img.shape[2:] # print string 116 | gn = torch.tensor(im0.shape)[[1, 0, 1, 0]] # normalization gain whwh 117 | if det is not None and len(det): 118 | # Rescale boxes from img_size to im0 size 119 | det[:, :4] = scale_coords(img.shape[2:], det[:, :4], im0.shape).round() 120 | 121 | # Print results 122 | for c in det[:, -1].unique(): 123 | n = (det[:, -1] == c).sum() # detections per class 124 | s += '%g %ss, ' % (n, names[int(c)]) # add to string 125 | 126 | # Write results (class indices follow config.yaml: 0='1', 1='2', 2='5', 3='forefinger') 127 | flag_1 = False # gesture '1' seen (brush on) 128 | flag_2 = False # gesture '2' seen (change color) 129 | flag_3 = False # gesture '5' seen (clear the board) 130 | flag_4 = False # forefinger seen (pen tip) 131 | flag_xyxy = 0 # forefinger bounding box, filled below 132 | for *xyxy, conf, cls in reversed(det): 133 | if cls == 0: 134 | flag_1 = True 135 | elif cls == 1: 136 | flag_2 = True 137 | elif cls == 2: 138 | flag_3 = True 139 | elif cls == 3: 140 | flag_4 = True 141 | flag_xyxy = torch.tensor(xyxy).view(1, 4).cpu().numpy() 142 | flag_xyxy = flag_xyxy[0] 143 | # print(flag_xyxy) 144 | 145 | if save_txt: # Write to file 146 | xywh = (xyxy2xywh(torch.tensor(xyxy).view(1, 4)) / gn).view(-1).tolist() # normalized xywh 147 | line = (cls, conf, *xywh) if opt.save_conf else (cls, *xywh) # label format 148 | with open(txt_path + '.txt', 'a') as f: 149 | f.write(('%g ' * len(line) + '\n') % line) 150 | 151 | if save_img or view_img: # Add bbox to image 152 | label = '%s %.2f' % (names[int(cls)], conf) 153 | plot_one_box(xyxy, im0, label=label, color=colors[int(cls)], line_thickness=3) 154 | 155 | # Gesture actions: draw with the forefinger, clear the board, or change color 156 | if flag_1 and flag_4: # gesture '1' + forefinger visible: draw 157 | point = ((flag_xyxy[0] + flag_xyxy[2]) / 2 * 3, (flag_xyxy[1] + flag_xyxy[3]) / 2 * 3) # forefinger box center, scaled 3x onto the 1920x1080 sketchpad 158 | x, y = point 159 | point = (int(x), int(y)) 160 | cv2.circle(sketchpad, point, line_width, clr) 161 | if last_point != 0: 162 | if time.time() - last_point_time < 0.5 and get_dis(last_point, point) < dis_max: # same stroke: previous point is recent and close enough 163 | # print(point, last_point) 164 | cv2.line(sketchpad, point, last_point, clr, line_width) 165 | last_point = point 166 | last_point_time = time.time() 167 | if flag_3: # gesture '5': wipe the sketchpad 168 | sketchpad = init_pic.copy() 169 | 170 | if flag_2: # gesture '2': pick a random new color 171 | clr = clr_list[random.randint(len(clr_list))] 172 | 173 | # Print time (inference + NMS) 174 | print('%sDone.
(%.3fs)' % (s, t2 - t1)) 175 | 176 | # Stream results 177 | if view_img: 178 | cv2.namedWindow(p, 0) 179 | cv2.resizeWindow(p, 640, 480) 180 | cv2.imshow(p, cv2.flip(im0, 1)) 181 | cv2.namedWindow('sketchpad', cv2.WINDOW_NORMAL) 182 | cv2.setWindowProperty('sketchpad', cv2.WND_PROP_FULLSCREEN, cv2.WINDOW_FULLSCREEN) 183 | cv2.imshow('sketchpad', cv2.flip(sketchpad, 1)) 184 | 185 | if cv2.waitKey(1) == ord('q'): # q to quit 186 | raise StopIteration 187 | 188 | # Save results (image with detections) 189 | if save_img: 190 | if dataset.mode == 'images': 191 | cv2.imwrite(save_path, im0) 192 | else: 193 | if vid_path != save_path: # new video 194 | vid_path = save_path 195 | if isinstance(vid_writer, cv2.VideoWriter): 196 | vid_writer.release() # release previous video writer 197 | 198 | fourcc = 'mp4v' # output video codec 199 | fps = vid_cap.get(cv2.CAP_PROP_FPS) 200 | w = int(vid_cap.get(cv2.CAP_PROP_FRAME_WIDTH)) 201 | h = int(vid_cap.get(cv2.CAP_PROP_FRAME_HEIGHT)) 202 | vid_writer = cv2.VideoWriter(save_path, cv2.VideoWriter_fourcc(*fourcc), fps, (w, h)) 203 | vid_writer.write(im0) 204 | 205 | if save_txt or save_img: 206 | print('Results saved to %s' % Path(out)) 207 | 208 | print('Done. (%.3fs)' % (time.time() - t0)) 209 | 210 | 211 | if __name__ == '__main__': 212 | parser = argparse.ArgumentParser() 213 | parser.add_argument('--weights', nargs='+', type=str, default='yolov5s.pt', help='model.pt path(s)') 214 | parser.add_argument('--source', type=str, default='inference/images', help='source') # file/folder, 0 for webcam 215 | parser.add_argument('--img-size', type=int, default=640, help='inference size (pixels)') 216 | parser.add_argument('--conf-thres', type=float, default=0.25, help='object confidence threshold') 217 | parser.add_argument('--iou-thres', type=float, default=0.45, help='IOU threshold for NMS') 218 | parser.add_argument('--device', default='', help='cuda device, i.e. 
0 or 0,1,2,3 or cpu') 219 | parser.add_argument('--view-img', action='store_true', help='display results') 220 | parser.add_argument('--save-txt', action='store_true', help='save results to *.txt') 221 | parser.add_argument('--save-conf', action='store_true', help='save confidences in --save-txt labels') 222 | parser.add_argument('--save-dir', type=str, default='inference/output', help='directory to save results') 223 | parser.add_argument('--classes', nargs='+', type=int, help='filter by class: --class 0, or --class 0 2 3') 224 | parser.add_argument('--agnostic-nms', action='store_true', help='class-agnostic NMS') 225 | parser.add_argument('--augment', action='store_true', help='augmented inference') 226 | parser.add_argument('--update', action='store_true', help='update all models') 227 | opt = parser.parse_args() 228 | print(opt) 229 | 230 | with torch.no_grad(): 231 | if opt.update: # update all models (to fix SourceChangeWarning) 232 | for opt.weights in ['yolov5s.pt', 'yolov5m.pt', 'yolov5l.pt', 'yolov5x.pt']: 233 | detect() 234 | strip_optimizer(opt.weights) 235 | else: 236 | detect() 237 | -------------------------------------------------------------------------------- /v3.0/yolov5/hubconf.py: -------------------------------------------------------------------------------- 1 | """File for accessing YOLOv5 via PyTorch Hub https://pytorch.org/hub/ 2 | 3 | Usage: 4 | import torch 5 | model = torch.hub.load('ultralytics/yolov5', 'yolov5s', pretrained=True, channels=3, classes=80) 6 | """ 7 | 8 | dependencies = ['torch', 'yaml'] 9 | import os 10 | 11 | import torch 12 | 13 | from models.yolo import Model 14 | from utils.general import set_logging 15 | from utils.google_utils import attempt_download 16 | 17 | set_logging() 18 | 19 | 20 | def create(name, pretrained, channels, classes): 21 | """Creates a specified YOLOv5 model 22 | 23 | Arguments: 24 | name (str): name of model, i.e. 'yolov5s' 25 | pretrained (bool): load pretrained weights into the model 26 | channels (int): number of input channels 27 | classes (int): number of model classes 28 | 29 | Returns: 30 | pytorch model 31 | """ 32 | config = os.path.join(os.path.dirname(__file__), 'models', f'{name}.yaml') # model.yaml path 33 | try: 34 | model = Model(config, channels, classes) 35 | if pretrained: 36 | fname = f'{name}.pt' # checkpoint filename 37 | attempt_download(fname) # download if not found locally 38 | ckpt = torch.load(fname, map_location=torch.device('cpu')) # load 39 | state_dict = ckpt['model'].float().state_dict() # to FP32 40 | state_dict = {k: v for k, v in state_dict.items() if model.state_dict()[k].shape == v.shape} # filter 41 | model.load_state_dict(state_dict, strict=False) # load 42 | if len(ckpt['model'].names) == classes: 43 | model.names = ckpt['model'].names # set class names attribute 44 | # model = model.autoshape() # for autoshaping of PIL/cv2/np inputs and NMS 45 | return model 46 | 47 | except Exception as e: 48 | help_url = 'https://github.com/ultralytics/yolov5/issues/36' 49 | s = 'Cache may be out of date, try force_reload=True. See %s for help.'
% help_url 50 | raise Exception(s) from e 51 | 52 | 53 | def yolov5s(pretrained=False, channels=3, classes=80): 54 | """YOLOv5-small model from https://github.com/ultralytics/yolov5 55 | 56 | Arguments: 57 | pretrained (bool): load pretrained weights into the model, default=False 58 | channels (int): number of input channels, default=3 59 | classes (int): number of model classes, default=80 60 | 61 | Returns: 62 | pytorch model 63 | """ 64 | return create('yolov5s', pretrained, channels, classes) 65 | 66 | 67 | def yolov5m(pretrained=False, channels=3, classes=80): 68 | """YOLOv5-medium model from https://github.com/ultralytics/yolov5 69 | 70 | Arguments: 71 | pretrained (bool): load pretrained weights into the model, default=False 72 | channels (int): number of input channels, default=3 73 | classes (int): number of model classes, default=80 74 | 75 | Returns: 76 | pytorch model 77 | """ 78 | return create('yolov5m', pretrained, channels, classes) 79 | 80 | 81 | def yolov5l(pretrained=False, channels=3, classes=80): 82 | """YOLOv5-large model from https://github.com/ultralytics/yolov5 83 | 84 | Arguments: 85 | pretrained (bool): load pretrained weights into the model, default=False 86 | channels (int): number of input channels, default=3 87 | classes (int): number of model classes, default=80 88 | 89 | Returns: 90 | pytorch model 91 | """ 92 | return create('yolov5l', pretrained, channels, classes) 93 | 94 | 95 | def yolov5x(pretrained=False, channels=3, classes=80): 96 | """YOLOv5-xlarge model from https://github.com/ultralytics/yolov5 97 | 98 | Arguments: 99 | pretrained (bool): load pretrained weights into the model, default=False 100 | channels (int): number of input channels, default=3 101 | classes (int): number of model classes, default=80 102 | 103 | Returns: 104 | pytorch model 105 | """ 106 | return create('yolov5x', pretrained, channels, classes) 107 | 108 | 109 | if __name__ == '__main__': 110 | model = create(name='yolov5s', pretrained=True, channels=3, classes=80) # example 111 | model = model.fuse().eval().autoshape() # for autoshaping of PIL/cv2/np inputs and NMS 112 | 113 | # Verify inference 114 | from PIL import Image 115 | 116 | img = Image.open('inference/images/zidane.jpg') 117 | y = model(img) 118 | print(y[0].shape) 119 | -------------------------------------------------------------------------------- /v3.0/yolov5/models/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/yyyanbj/mid-air-draw/9ce05fe981e9037d8c0151be66c0254f8f2523d5/v3.0/yolov5/models/__init__.py -------------------------------------------------------------------------------- /v3.0/yolov5/models/common.py: -------------------------------------------------------------------------------- 1 | # This file contains modules common to various models 2 | 3 | import math 4 | import numpy as np 5 | import torch 6 | import torch.nn as nn 7 | 8 | from utils.datasets import letterbox 9 | from utils.general import non_max_suppression, make_divisible, scale_coords 10 | 11 | 12 | def autopad(k, p=None): # kernel, padding 13 | # Pad to 'same' 14 | if p is None: 15 | p = k // 2 if isinstance(k, int) else [x // 2 for x in k] # auto-pad 16 | return p 17 | 18 | 19 | def DWConv(c1, c2, k=1, s=1, act=True): 20 | # Depthwise convolution 21 | return Conv(c1, c2, k, s, g=math.gcd(c1, c2), act=act) 22 | 23 | 24 | class Conv(nn.Module): 25 | # Standard convolution 26 | def __init__(self, c1, c2, k=1, s=1, p=None, g=1, act=True): # ch_in, ch_out, 
kernel, stride, padding, groups 27 | super(Conv, self).__init__() 28 | self.conv = nn.Conv2d(c1, c2, k, s, autopad(k, p), groups=g, bias=False) 29 | self.bn = nn.BatchNorm2d(c2) 30 | self.act = nn.Hardswish() if act else nn.Identity() 31 | 32 | def forward(self, x): 33 | return self.act(self.bn(self.conv(x))) 34 | 35 | def fuseforward(self, x): 36 | return self.act(self.conv(x)) 37 | 38 | 39 | class Bottleneck(nn.Module): 40 | # Standard bottleneck 41 | def __init__(self, c1, c2, shortcut=True, g=1, e=0.5): # ch_in, ch_out, shortcut, groups, expansion 42 | super(Bottleneck, self).__init__() 43 | c_ = int(c2 * e) # hidden channels 44 | self.cv1 = Conv(c1, c_, 1, 1) 45 | self.cv2 = Conv(c_, c2, 3, 1, g=g) 46 | self.add = shortcut and c1 == c2 47 | 48 | def forward(self, x): 49 | return x + self.cv2(self.cv1(x)) if self.add else self.cv2(self.cv1(x)) 50 | 51 | 52 | class BottleneckCSP(nn.Module): 53 | # CSP Bottleneck https://github.com/WongKinYiu/CrossStagePartialNetworks 54 | def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5): # ch_in, ch_out, number, shortcut, groups, expansion 55 | super(BottleneckCSP, self).__init__() 56 | c_ = int(c2 * e) # hidden channels 57 | self.cv1 = Conv(c1, c_, 1, 1) 58 | self.cv2 = nn.Conv2d(c1, c_, 1, 1, bias=False) 59 | self.cv3 = nn.Conv2d(c_, c_, 1, 1, bias=False) 60 | self.cv4 = Conv(2 * c_, c2, 1, 1) 61 | self.bn = nn.BatchNorm2d(2 * c_) # applied to cat(cv2, cv3) 62 | self.act = nn.LeakyReLU(0.1, inplace=True) 63 | self.m = nn.Sequential(*[Bottleneck(c_, c_, shortcut, g, e=1.0) for _ in range(n)]) 64 | 65 | def forward(self, x): 66 | y1 = self.cv3(self.m(self.cv1(x))) 67 | y2 = self.cv2(x) 68 | return self.cv4(self.act(self.bn(torch.cat((y1, y2), dim=1)))) 69 | 70 | 71 | class SPP(nn.Module): 72 | # Spatial pyramid pooling layer used in YOLOv3-SPP 73 | def __init__(self, c1, c2, k=(5, 9, 13)): 74 | super(SPP, self).__init__() 75 | c_ = c1 // 2 # hidden channels 76 | self.cv1 = Conv(c1, c_, 1, 1) 77 | self.cv2 = Conv(c_ * (len(k) + 1), c2, 1, 1) 78 | self.m = nn.ModuleList([nn.MaxPool2d(kernel_size=x, stride=1, padding=x // 2) for x in k]) 79 | 80 | def forward(self, x): 81 | x = self.cv1(x) 82 | return self.cv2(torch.cat([x] + [m(x) for m in self.m], 1)) 83 | 84 | 85 | class Focus(nn.Module): 86 | # Focus wh information into c-space 87 | def __init__(self, c1, c2, k=1, s=1, p=None, g=1, act=True): # ch_in, ch_out, kernel, stride, padding, groups 88 | super(Focus, self).__init__() 89 | self.conv = Conv(c1 * 4, c2, k, s, p, g, act) 90 | 91 | def forward(self, x): # x(b,c,w,h) -> y(b,4c,w/2,h/2) 92 | return self.conv(torch.cat([x[..., ::2, ::2], x[..., 1::2, ::2], x[..., ::2, 1::2], x[..., 1::2, 1::2]], 1)) 93 | 94 | 95 | class Concat(nn.Module): 96 | # Concatenate a list of tensors along dimension 97 | def __init__(self, dimension=1): 98 | super(Concat, self).__init__() 99 | self.d = dimension 100 | 101 | def forward(self, x): 102 | return torch.cat(x, self.d) 103 | 104 | 105 | class NMS(nn.Module): 106 | # Non-Maximum Suppression (NMS) module 107 | conf = 0.25 # confidence threshold 108 | iou = 0.45 # IoU threshold 109 | classes = None # (optional list) filter by class 110 | 111 | def __init__(self): 112 | super(NMS, self).__init__() 113 | 114 | def forward(self, x): 115 | return non_max_suppression(x[0], conf_thres=self.conf, iou_thres=self.iou, classes=self.classes) 116 | 117 | 118 | class autoShape(nn.Module): 119 | # input-robust model wrapper for passing cv2/np/PIL/torch inputs. 
Includes preprocessing, inference and NMS 120 | img_size = 640 # inference size (pixels) 121 | conf = 0.25 # NMS confidence threshold 122 | iou = 0.45 # NMS IoU threshold 123 | classes = None # (optional list) filter by class 124 | 125 | def __init__(self, model): 126 | super(autoShape, self).__init__() 127 | self.model = model 128 | 129 | def forward(self, x, size=640, augment=False, profile=False): 130 | # supports inference from various sources. For height=720, width=1280, RGB images example inputs are: 131 | # opencv: x = cv2.imread('image.jpg')[:,:,::-1] # HWC BGR to RGB x(720,1280,3) 132 | # PIL: x = Image.open('image.jpg') # HWC x(720,1280,3) 133 | # numpy: x = np.zeros((720,1280,3)) # HWC 134 | # torch: x = torch.zeros(16,3,720,1280) # BCHW 135 | # multiple: x = [Image.open('image1.jpg'), Image.open('image2.jpg'), ...] # list of images 136 | 137 | p = next(self.model.parameters()) # for device and type 138 | if isinstance(x, torch.Tensor): # torch 139 | return self.model(x.to(p.device).type_as(p), augment, profile) # inference 140 | 141 | # Pre-process 142 | if not isinstance(x, list): 143 | x = [x] 144 | shape0, shape1 = [], [] # image and inference shapes 145 | batch = range(len(x)) # batch size 146 | for i in batch: 147 | x[i] = np.array(x[i])[:, :, :3] # up to 3 channels if png 148 | s = x[i].shape[:2] # HWC 149 | shape0.append(s) # image shape 150 | g = (size / max(s)) # gain 151 | shape1.append([y * g for y in s]) 152 | shape1 = [make_divisible(x, int(self.stride.max())) for x in np.stack(shape1, 0).max(0)] # inference shape 153 | x = [letterbox(x[i], new_shape=shape1, auto=False)[0] for i in batch] # pad 154 | x = np.stack(x, 0) if batch[-1] else x[0][None] # stack 155 | x = np.ascontiguousarray(x.transpose((0, 3, 1, 2))) # BHWC to BCHW 156 | x = torch.from_numpy(x).to(p.device).type_as(p) / 255. # uint8 to fp16/32 157 | 158 | # Inference 159 | x = self.model(x, augment, profile) # forward 160 | x = non_max_suppression(x[0], conf_thres=self.conf, iou_thres=self.iou, classes=self.classes) # NMS 161 | 162 | # Post-process 163 | for i in batch: 164 | if x[i] is not None: 165 | x[i][:, :4] = scale_coords(shape1, x[i][:, :4], shape0[i]) 166 | return x 167 | 168 | 169 | class Flatten(nn.Module): 170 | # Use after nn.AdaptiveAvgPool2d(1) to remove last 2 dimensions 171 | @staticmethod 172 | def forward(x): 173 | return x.view(x.size(0), -1) 174 | 175 | 176 | class Classify(nn.Module): 177 | # Classification head, i.e. 
x(b,c1,20,20) to x(b,c2) 178 | def __init__(self, c1, c2, k=1, s=1, p=None, g=1): # ch_in, ch_out, kernel, stride, padding, groups 179 | super(Classify, self).__init__() 180 | self.aap = nn.AdaptiveAvgPool2d(1) # to x(b,c1,1,1) 181 | self.conv = nn.Conv2d(c1, c2, k, s, autopad(k, p), groups=g, bias=False) # to x(b,c2,1,1) 182 | self.flat = Flatten() 183 | 184 | def forward(self, x): 185 | z = torch.cat([self.aap(y) for y in (x if isinstance(x, list) else [x])], 1) # cat if list 186 | return self.flat(self.conv(z)) # flatten to x(b,c2) 187 | -------------------------------------------------------------------------------- /v3.0/yolov5/models/experimental.py: -------------------------------------------------------------------------------- 1 | # This file contains experimental modules 2 | 3 | import numpy as np 4 | import torch 5 | import torch.nn as nn 6 | 7 | from models.common import Conv, DWConv 8 | from utils.google_utils import attempt_download 9 | 10 | 11 | class CrossConv(nn.Module): 12 | # Cross Convolution Downsample 13 | def __init__(self, c1, c2, k=3, s=1, g=1, e=1.0, shortcut=False): 14 | # ch_in, ch_out, kernel, stride, groups, expansion, shortcut 15 | super(CrossConv, self).__init__() 16 | c_ = int(c2 * e) # hidden channels 17 | self.cv1 = Conv(c1, c_, (1, k), (1, s)) 18 | self.cv2 = Conv(c_, c2, (k, 1), (s, 1), g=g) 19 | self.add = shortcut and c1 == c2 20 | 21 | def forward(self, x): 22 | return x + self.cv2(self.cv1(x)) if self.add else self.cv2(self.cv1(x)) 23 | 24 | 25 | class C3(nn.Module): 26 | # Cross Convolution CSP 27 | def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5): # ch_in, ch_out, number, shortcut, groups, expansion 28 | super(C3, self).__init__() 29 | c_ = int(c2 * e) # hidden channels 30 | self.cv1 = Conv(c1, c_, 1, 1) 31 | self.cv2 = nn.Conv2d(c1, c_, 1, 1, bias=False) 32 | self.cv3 = nn.Conv2d(c_, c_, 1, 1, bias=False) 33 | self.cv4 = Conv(2 * c_, c2, 1, 1) 34 | self.bn = nn.BatchNorm2d(2 * c_) # applied to cat(cv2, cv3) 35 | self.act = nn.LeakyReLU(0.1, inplace=True) 36 | self.m = nn.Sequential(*[CrossConv(c_, c_, 3, 1, g, 1.0, shortcut) for _ in range(n)]) 37 | 38 | def forward(self, x): 39 | y1 = self.cv3(self.m(self.cv1(x))) 40 | y2 = self.cv2(x) 41 | return self.cv4(self.act(self.bn(torch.cat((y1, y2), dim=1)))) 42 | 43 | 44 | class Sum(nn.Module): 45 | # Weighted sum of 2 or more layers https://arxiv.org/abs/1911.09070 46 | def __init__(self, n, weight=False): # n: number of inputs 47 | super(Sum, self).__init__() 48 | self.weight = weight # apply weights boolean 49 | self.iter = range(n - 1) # iter object 50 | if weight: 51 | self.w = nn.Parameter(-torch.arange(1., n) / 2, requires_grad=True) # layer weights 52 | 53 | def forward(self, x): 54 | y = x[0] # no weight 55 | if self.weight: 56 | w = torch.sigmoid(self.w) * 2 57 | for i in self.iter: 58 | y = y + x[i + 1] * w[i] 59 | else: 60 | for i in self.iter: 61 | y = y + x[i + 1] 62 | return y 63 | 64 | 65 | class GhostConv(nn.Module): 66 | # Ghost Convolution https://github.com/huawei-noah/ghostnet 67 | def __init__(self, c1, c2, k=1, s=1, g=1, act=True): # ch_in, ch_out, kernel, stride, groups 68 | super(GhostConv, self).__init__() 69 | c_ = c2 // 2 # hidden channels 70 | self.cv1 = Conv(c1, c_, k, s, None, g, act) 71 | self.cv2 = Conv(c_, c_, 5, 1, None, c_, act) 72 | 73 | def forward(self, x): 74 | y = self.cv1(x) 75 | return torch.cat([y, self.cv2(y)], 1) 76 | 77 | 78 | class GhostBottleneck(nn.Module): 79 | # Ghost Bottleneck https://github.com/huawei-noah/ghostnet 80 | def 
__init__(self, c1, c2, k, s): 81 | super(GhostBottleneck, self).__init__() 82 | c_ = c2 // 2 83 | self.conv = nn.Sequential(GhostConv(c1, c_, 1, 1), # pw 84 | DWConv(c_, c_, k, s, act=False) if s == 2 else nn.Identity(), # dw 85 | GhostConv(c_, c2, 1, 1, act=False)) # pw-linear 86 | self.shortcut = nn.Sequential(DWConv(c1, c1, k, s, act=False), 87 | Conv(c1, c2, 1, 1, act=False)) if s == 2 else nn.Identity() 88 | 89 | def forward(self, x): 90 | return self.conv(x) + self.shortcut(x) 91 | 92 | 93 | class MixConv2d(nn.Module): 94 | # Mixed Depthwise Conv https://arxiv.org/abs/1907.09595 95 | def __init__(self, c1, c2, k=(1, 3), s=1, equal_ch=True): 96 | super(MixConv2d, self).__init__() 97 | groups = len(k) 98 | if equal_ch: # equal c_ per group 99 | i = torch.linspace(0, groups - 1E-6, c2).floor() # c2 indices 100 | c_ = [(i == g).sum() for g in range(groups)] # intermediate channels 101 | else: # equal weight.numel() per group 102 | b = [c2] + [0] * groups 103 | a = np.eye(groups + 1, groups, k=-1) 104 | a -= np.roll(a, 1, axis=1) 105 | a *= np.array(k) ** 2 106 | a[0] = 1 107 | c_ = np.linalg.lstsq(a, b, rcond=None)[0].round() # solve for equal weight indices, ax = b 108 | 109 | self.m = nn.ModuleList([nn.Conv2d(c1, int(c_[g]), k[g], s, k[g] // 2, bias=False) for g in range(groups)]) 110 | self.bn = nn.BatchNorm2d(c2) 111 | self.act = nn.LeakyReLU(0.1, inplace=True) 112 | 113 | def forward(self, x): 114 | return x + self.act(self.bn(torch.cat([m(x) for m in self.m], 1))) 115 | 116 | 117 | class Ensemble(nn.ModuleList): 118 | # Ensemble of models 119 | def __init__(self): 120 | super(Ensemble, self).__init__() 121 | 122 | def forward(self, x, augment=False): 123 | y = [] 124 | for module in self: 125 | y.append(module(x, augment)[0]) 126 | # y = torch.stack(y).max(0)[0] # max ensemble 127 | # y = torch.cat(y, 1) # nms ensemble 128 | y = torch.stack(y).mean(0) # mean ensemble 129 | return y, None # inference, train output 130 | 131 | 132 | def attempt_load(weights, map_location=None): 133 | # Loads an ensemble of models weights=[a,b,c] or a single model weights=[a] or weights=a 134 | model = Ensemble() 135 | for w in weights if isinstance(weights, list) else [weights]: 136 | attempt_download(w) 137 | model.append(torch.load(w, map_location=map_location)['model'].float().fuse().eval()) # load FP32 model 138 | 139 | # Compatibility updates 140 | for m in model.modules(): 141 | if type(m) in [nn.Hardswish, nn.LeakyReLU, nn.ReLU, nn.ReLU6]: 142 | m.inplace = True # pytorch 1.7.0 compatibility 143 | elif type(m) is Conv: 144 | m._non_persistent_buffers_set = set() # pytorch 1.6.0 compatibility 145 | 146 | if len(model) == 1: 147 | return model[-1] # return model 148 | else: 149 | print('Ensemble created with %s\n' % weights) 150 | for k in ['names', 'stride']: 151 | setattr(model, k, getattr(model[-1], k)) 152 | return model # return ensemble 153 | -------------------------------------------------------------------------------- /v3.0/yolov5/models/export.py: -------------------------------------------------------------------------------- 1 | """Exports a YOLOv5 *.pt model to ONNX and TorchScript formats 2 | 3 | Usage: 4 | $ export PYTHONPATH="$PWD" && python models/export.py --weights ./weights/yolov5s.pt --img 640 --batch 1 5 | """ 6 | 7 | import argparse 8 | import sys 9 | import time 10 | 11 | sys.path.append('./') # to run '$ python *.py' files in subdirectories 12 | 13 | import torch 14 | import torch.nn as nn 15 | 16 | import models 17 | from models.experimental import attempt_load 18 | 
from utils.activations import Hardswish 19 | from utils.general import set_logging, check_img_size 20 | 21 | if __name__ == '__main__': 22 | parser = argparse.ArgumentParser() 23 | parser.add_argument('--weights', type=str, default='./yolov5s.pt', help='weights path') # from yolov5/models/ 24 | parser.add_argument('--img-size', nargs='+', type=int, default=[640, 640], help='image size') # height, width 25 | parser.add_argument('--batch-size', type=int, default=1, help='batch size') 26 | opt = parser.parse_args() 27 | opt.img_size *= 2 if len(opt.img_size) == 1 else 1 # expand 28 | print(opt) 29 | set_logging() 30 | t = time.time() 31 | 32 | # Load PyTorch model 33 | model = attempt_load(opt.weights, map_location=torch.device('cpu')) # load FP32 model 34 | labels = model.names 35 | 36 | # Checks 37 | gs = int(max(model.stride)) # grid size (max stride) 38 | opt.img_size = [check_img_size(x, gs) for x in opt.img_size] # verify img_size are gs-multiples 39 | 40 | # Input 41 | img = torch.zeros(opt.batch_size, 3, *opt.img_size) # image size(1,3,320,192) iDetection 42 | 43 | # Update model 44 | for k, m in model.named_modules(): 45 | m._non_persistent_buffers_set = set() # pytorch 1.6.0 compatibility 46 | if isinstance(m, models.common.Conv) and isinstance(m.act, nn.Hardswish): 47 | m.act = Hardswish() # assign activation 48 | # if isinstance(m, models.yolo.Detect): 49 | # m.forward = m.forward_export # assign forward (optional) 50 | model.model[-1].export = True # set Detect() layer export=True 51 | y = model(img) # dry run 52 | 53 | # TorchScript export 54 | try: 55 | print('\nStarting TorchScript export with torch %s...' % torch.__version__) 56 | f = opt.weights.replace('.pt', '.torchscript.pt') # filename 57 | ts = torch.jit.trace(model, img) 58 | ts.save(f) 59 | print('TorchScript export success, saved as %s' % f) 60 | except Exception as e: 61 | print('TorchScript export failure: %s' % e) 62 | 63 | # ONNX export 64 | try: 65 | import onnx 66 | 67 | print('\nStarting ONNX export with onnx %s...' % onnx.__version__) 68 | f = opt.weights.replace('.pt', '.onnx') # filename 69 | torch.onnx.export(model, img, f, verbose=False, opset_version=12, input_names=['images'], 70 | output_names=['classes', 'boxes'] if y is None else ['output']) 71 | 72 | # Checks 73 | onnx_model = onnx.load(f) # load onnx model 74 | onnx.checker.check_model(onnx_model) # check onnx model 75 | # print(onnx.helper.printable_graph(onnx_model.graph)) # print a human readable model 76 | print('ONNX export success, saved as %s' % f) 77 | except Exception as e: 78 | print('ONNX export failure: %s' % e) 79 | 80 | # CoreML export 81 | try: 82 | import coremltools as ct 83 | 84 | print('\nStarting CoreML export with coremltools %s...' % ct.__version__) 85 | # convert model from torchscript and apply pixel scaling as per detect.py 86 | model = ct.convert(ts, inputs=[ct.ImageType(name='image', shape=img.shape, scale=1 / 255.0, bias=[0, 0, 0])]) 87 | f = opt.weights.replace('.pt', '.mlmodel') # filename 88 | model.save(f) 89 | print('CoreML export success, saved as %s' % f) 90 | except Exception as e: 91 | print('CoreML export failure: %s' % e) 92 | 93 | # Finish 94 | print('\nExport complete (%.2fs). Visualize with https://github.com/lutzroeder/netron.' 
% (time.time() - t)) 95 | -------------------------------------------------------------------------------- /v3.0/yolov5/models/hub/yolov3-spp.yaml: -------------------------------------------------------------------------------- 1 | # parameters 2 | nc: 80 # number of classes 3 | depth_multiple: 1.0 # model depth multiple 4 | width_multiple: 1.0 # layer channel multiple 5 | 6 | # anchors 7 | anchors: 8 | - [10,13, 16,30, 33,23] # P3/8 9 | - [30,61, 62,45, 59,119] # P4/16 10 | - [116,90, 156,198, 373,326] # P5/32 11 | 12 | # darknet53 backbone 13 | backbone: 14 | # [from, number, module, args] 15 | [[-1, 1, Conv, [32, 3, 1]], # 0 16 | [-1, 1, Conv, [64, 3, 2]], # 1-P1/2 17 | [-1, 1, Bottleneck, [64]], 18 | [-1, 1, Conv, [128, 3, 2]], # 3-P2/4 19 | [-1, 2, Bottleneck, [128]], 20 | [-1, 1, Conv, [256, 3, 2]], # 5-P3/8 21 | [-1, 8, Bottleneck, [256]], 22 | [-1, 1, Conv, [512, 3, 2]], # 7-P4/16 23 | [-1, 8, Bottleneck, [512]], 24 | [-1, 1, Conv, [1024, 3, 2]], # 9-P5/32 25 | [-1, 4, Bottleneck, [1024]], # 10 26 | ] 27 | 28 | # YOLOv3-SPP head 29 | head: 30 | [[-1, 1, Bottleneck, [1024, False]], 31 | [-1, 1, SPP, [512, [5, 9, 13]]], 32 | [-1, 1, Conv, [1024, 3, 1]], 33 | [-1, 1, Conv, [512, 1, 1]], 34 | [-1, 1, Conv, [1024, 3, 1]], # 15 (P5/32-large) 35 | 36 | [-2, 1, Conv, [256, 1, 1]], 37 | [-1, 1, nn.Upsample, [None, 2, 'nearest']], 38 | [[-1, 8], 1, Concat, [1]], # cat backbone P4 39 | [-1, 1, Bottleneck, [512, False]], 40 | [-1, 1, Bottleneck, [512, False]], 41 | [-1, 1, Conv, [256, 1, 1]], 42 | [-1, 1, Conv, [512, 3, 1]], # 22 (P4/16-medium) 43 | 44 | [-2, 1, Conv, [128, 1, 1]], 45 | [-1, 1, nn.Upsample, [None, 2, 'nearest']], 46 | [[-1, 6], 1, Concat, [1]], # cat backbone P3 47 | [-1, 1, Bottleneck, [256, False]], 48 | [-1, 2, Bottleneck, [256, False]], # 27 (P3/8-small) 49 | 50 | [[27, 22, 15], 1, Detect, [nc, anchors]], # Detect(P3, P4, P5) 51 | ] 52 | -------------------------------------------------------------------------------- /v3.0/yolov5/models/hub/yolov5-fpn.yaml: -------------------------------------------------------------------------------- 1 | # parameters 2 | nc: 80 # number of classes 3 | depth_multiple: 1.0 # model depth multiple 4 | width_multiple: 1.0 # layer channel multiple 5 | 6 | # anchors 7 | anchors: 8 | - [10,13, 16,30, 33,23] # P3/8 9 | - [30,61, 62,45, 59,119] # P4/16 10 | - [116,90, 156,198, 373,326] # P5/32 11 | 12 | # YOLOv5 backbone 13 | backbone: 14 | # [from, number, module, args] 15 | [[-1, 1, Focus, [64, 3]], # 0-P1/2 16 | [-1, 1, Conv, [128, 3, 2]], # 1-P2/4 17 | [-1, 3, Bottleneck, [128]], 18 | [-1, 1, Conv, [256, 3, 2]], # 3-P3/8 19 | [-1, 9, BottleneckCSP, [256]], 20 | [-1, 1, Conv, [512, 3, 2]], # 5-P4/16 21 | [-1, 9, BottleneckCSP, [512]], 22 | [-1, 1, Conv, [1024, 3, 2]], # 7-P5/32 23 | [-1, 1, SPP, [1024, [5, 9, 13]]], 24 | [-1, 6, BottleneckCSP, [1024]], # 9 25 | ] 26 | 27 | # YOLOv5 FPN head 28 | head: 29 | [[-1, 3, BottleneckCSP, [1024, False]], # 10 (P5/32-large) 30 | 31 | [-1, 1, nn.Upsample, [None, 2, 'nearest']], 32 | [[-1, 6], 1, Concat, [1]], # cat backbone P4 33 | [-1, 1, Conv, [512, 1, 1]], 34 | [-1, 3, BottleneckCSP, [512, False]], # 14 (P4/16-medium) 35 | 36 | [-1, 1, nn.Upsample, [None, 2, 'nearest']], 37 | [[-1, 4], 1, Concat, [1]], # cat backbone P3 38 | [-1, 1, Conv, [256, 1, 1]], 39 | [-1, 3, BottleneckCSP, [256, False]], # 18 (P3/8-small) 40 | 41 | [[18, 14, 10], 1, Detect, [nc, anchors]], # Detect(P3, P4, P5) 42 | ] 43 | -------------------------------------------------------------------------------- 
/v3.0/yolov5/models/hub/yolov5-panet.yaml: -------------------------------------------------------------------------------- 1 | # parameters 2 | nc: 80 # number of classes 3 | depth_multiple: 1.0 # model depth multiple 4 | width_multiple: 1.0 # layer channel multiple 5 | 6 | # anchors 7 | anchors: 8 | - [116,90, 156,198, 373,326] # P5/32 9 | - [30,61, 62,45, 59,119] # P4/16 10 | - [10,13, 16,30, 33,23] # P3/8 11 | 12 | # YOLOv5 backbone 13 | backbone: 14 | # [from, number, module, args] 15 | [[-1, 1, Focus, [64, 3]], # 0-P1/2 16 | [-1, 1, Conv, [128, 3, 2]], # 1-P2/4 17 | [-1, 3, BottleneckCSP, [128]], 18 | [-1, 1, Conv, [256, 3, 2]], # 3-P3/8 19 | [-1, 9, BottleneckCSP, [256]], 20 | [-1, 1, Conv, [512, 3, 2]], # 5-P4/16 21 | [-1, 9, BottleneckCSP, [512]], 22 | [-1, 1, Conv, [1024, 3, 2]], # 7-P5/32 23 | [-1, 1, SPP, [1024, [5, 9, 13]]], 24 | [-1, 3, BottleneckCSP, [1024, False]], # 9 25 | ] 26 | 27 | # YOLOv5 PANet head 28 | head: 29 | [[-1, 1, Conv, [512, 1, 1]], 30 | [-1, 1, nn.Upsample, [None, 2, 'nearest']], 31 | [[-1, 6], 1, Concat, [1]], # cat backbone P4 32 | [-1, 3, BottleneckCSP, [512, False]], # 13 33 | 34 | [-1, 1, Conv, [256, 1, 1]], 35 | [-1, 1, nn.Upsample, [None, 2, 'nearest']], 36 | [[-1, 4], 1, Concat, [1]], # cat backbone P3 37 | [-1, 3, BottleneckCSP, [256, False]], # 17 (P3/8-small) 38 | 39 | [-1, 1, Conv, [256, 3, 2]], 40 | [[-1, 14], 1, Concat, [1]], # cat head P4 41 | [-1, 3, BottleneckCSP, [512, False]], # 20 (P4/16-medium) 42 | 43 | [-1, 1, Conv, [512, 3, 2]], 44 | [[-1, 10], 1, Concat, [1]], # cat head P5 45 | [-1, 3, BottleneckCSP, [1024, False]], # 23 (P5/32-large) 46 | 47 | [[17, 20, 23], 1, Detect, [nc, anchors]], # Detect(P5, P4, P3) 48 | ] 49 | -------------------------------------------------------------------------------- /v3.0/yolov5/models/yolo.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | import logging 3 | import sys 4 | from copy import deepcopy 5 | from pathlib import Path 6 | 7 | import math 8 | 9 | sys.path.append('./') # to run '$ python *.py' files in subdirectories 10 | logger = logging.getLogger(__name__) 11 | 12 | import torch 13 | import torch.nn as nn 14 | 15 | from models.common import Conv, Bottleneck, SPP, DWConv, Focus, BottleneckCSP, Concat, NMS, autoShape 16 | from models.experimental import MixConv2d, CrossConv, C3 17 | from utils.general import check_anchor_order, make_divisible, check_file, set_logging 18 | from utils.torch_utils import time_synchronized, fuse_conv_and_bn, model_info, scale_img, initialize_weights, \ 19 | select_device, copy_attr 20 | 21 | 22 | class Detect(nn.Module): 23 | stride = None # strides computed during build 24 | export = False # onnx export 25 | 26 | def __init__(self, nc=80, anchors=(), ch=()): # detection layer 27 | super(Detect, self).__init__() 28 | self.nc = nc # number of classes 29 | self.no = nc + 5 # number of outputs per anchor 30 | self.nl = len(anchors) # number of detection layers 31 | self.na = len(anchors[0]) // 2 # number of anchors 32 | self.grid = [torch.zeros(1)] * self.nl # init grid 33 | a = torch.tensor(anchors).float().view(self.nl, -1, 2) 34 | self.register_buffer('anchors', a) # shape(nl,na,2) 35 | self.register_buffer('anchor_grid', a.clone().view(self.nl, 1, -1, 1, 1, 2)) # shape(nl,1,na,1,1,2) 36 | self.m = nn.ModuleList(nn.Conv2d(x, self.no * self.na, 1) for x in ch) # output conv 37 | 38 | def forward(self, x): 39 | # x = x.copy() # for profiling 40 | z = [] # inference output 41 | self.training |= self.export 
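        # With export=True the `not self.training` branch below is skipped, so
        # ONNX/CoreML exports return the raw per-layer maps and leave the grid
        # offsets and sigmoid box decoding to the target runtime.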
42 | for i in range(self.nl): 43 | x[i] = self.m[i](x[i]) # conv 44 | bs, _, ny, nx = x[i].shape # x(bs,255,20,20) to x(bs,3,20,20,85) 45 | x[i] = x[i].view(bs, self.na, self.no, ny, nx).permute(0, 1, 3, 4, 2).contiguous() 46 | 47 | if not self.training: # inference 48 | if self.grid[i].shape[2:4] != x[i].shape[2:4]: 49 | self.grid[i] = self._make_grid(nx, ny).to(x[i].device) 50 | 51 | y = x[i].sigmoid() 52 | y[..., 0:2] = (y[..., 0:2] * 2. - 0.5 + self.grid[i].to(x[i].device)) * self.stride[i] # xy 53 | y[..., 2:4] = (y[..., 2:4] * 2) ** 2 * self.anchor_grid[i] # wh 54 | z.append(y.view(bs, -1, self.no)) 55 | 56 | return x if self.training else (torch.cat(z, 1), x) 57 | 58 | @staticmethod 59 | def _make_grid(nx=20, ny=20): 60 | yv, xv = torch.meshgrid([torch.arange(ny), torch.arange(nx)]) 61 | return torch.stack((xv, yv), 2).view((1, 1, ny, nx, 2)).float() 62 | 63 | 64 | class Model(nn.Module): 65 | def __init__(self, cfg='yolov5s.yaml', ch=3, nc=None): # model, input channels, number of classes 66 | super(Model, self).__init__() 67 | if isinstance(cfg, dict): 68 | self.yaml = cfg # model dict 69 | else: # is *.yaml 70 | import yaml # for torch hub 71 | self.yaml_file = Path(cfg).name 72 | with open(cfg) as f: 73 | self.yaml = yaml.load(f, Loader=yaml.FullLoader) # model dict 74 | 75 | # Define model 76 | if nc and nc != self.yaml['nc']: 77 | print('Overriding model.yaml nc=%g with nc=%g' % (self.yaml['nc'], nc)) 78 | self.yaml['nc'] = nc # override yaml value 79 | self.model, self.save = parse_model(deepcopy(self.yaml), ch=[ch]) # model, savelist, ch_out 80 | # print([x.shape for x in self.forward(torch.zeros(1, ch, 64, 64))]) 81 | 82 | # Build strides, anchors 83 | m = self.model[-1] # Detect() 84 | if isinstance(m, Detect): 85 | s = 128 # 2x min stride 86 | m.stride = torch.tensor([s / x.shape[-2] for x in self.forward(torch.zeros(1, ch, s, s))]) # forward 87 | m.anchors /= m.stride.view(-1, 1, 1) 88 | check_anchor_order(m) 89 | self.stride = m.stride 90 | self._initialize_biases() # only run once 91 | # print('Strides: %s' % m.stride.tolist()) 92 | 93 | # Init weights, biases 94 | initialize_weights(self) 95 | self.info() 96 | print('') 97 | 98 | def forward(self, x, augment=False, profile=False): 99 | if augment: 100 | img_size = x.shape[-2:] # height, width 101 | s = [1, 0.83, 0.67] # scales 102 | f = [None, 3, None] # flips (2-ud, 3-lr) 103 | y = [] # outputs 104 | for si, fi in zip(s, f): 105 | xi = scale_img(x.flip(fi) if fi else x, si) 106 | yi = self.forward_once(xi)[0] # forward 107 | # cv2.imwrite('img%g.jpg' % s, 255 * xi[0].numpy().transpose((1, 2, 0))[:, :, ::-1]) # save 108 | yi[..., :4] /= si # de-scale 109 | if fi == 2: 110 | yi[..., 1] = img_size[0] - yi[..., 1] # de-flip ud 111 | elif fi == 3: 112 | yi[..., 0] = img_size[1] - yi[..., 0] # de-flip lr 113 | y.append(yi) 114 | return torch.cat(y, 1), None # augmented inference, train 115 | else: 116 | return self.forward_once(x, profile) # single-scale inference, train 117 | 118 | def forward_once(self, x, profile=False): 119 | y, dt = [], [] # outputs 120 | for m in self.model: 121 | if m.f != -1: # if not from previous layer 122 | x = y[m.f] if isinstance(m.f, int) else [x if j == -1 else y[j] for j in m.f] # from earlier layers 123 | 124 | if profile: 125 | try: 126 | import thop 127 | o = thop.profile(m, inputs=(x,), verbose=False)[0] / 1E9 * 2 # FLOPS 128 | except: 129 | o = 0 130 | t = time_synchronized() 131 | for _ in range(10): 132 | _ = m(x) 133 | dt.append((time_synchronized() - t) * 100) 134 | 
print('%10.1f%10.0f%10.1fms %-40s' % (o, m.np, dt[-1], m.type)) 135 | 136 | x = m(x) # run 137 | y.append(x if m.i in self.save else None) # save output 138 | 139 | if profile: 140 | print('%.1fms total' % sum(dt)) 141 | return x 142 | 143 | def _initialize_biases(self, cf=None): # initialize biases into Detect(), cf is class frequency 144 | # https://arxiv.org/abs/1708.02002 section 3.3 145 | # cf = torch.bincount(torch.tensor(np.concatenate(dataset.labels, 0)[:, 0]).long(), minlength=nc) + 1. 146 | m = self.model[-1] # Detect() module 147 | for mi, s in zip(m.m, m.stride): # from 148 | b = mi.bias.view(m.na, -1) # conv.bias(255) to (3,85) 149 | b[:, 4] += math.log(8 / (640 / s) ** 2) # obj (8 objects per 640 image) 150 | b[:, 5:] += math.log(0.6 / (m.nc - 0.99)) if cf is None else torch.log(cf / cf.sum()) # cls 151 | mi.bias = torch.nn.Parameter(b.view(-1), requires_grad=True) 152 | 153 | def _print_biases(self): 154 | m = self.model[-1] # Detect() module 155 | for mi in m.m: # from 156 | b = mi.bias.detach().view(m.na, -1).T # conv.bias(255) to (3,85) 157 | print(('%6g Conv2d.bias:' + '%10.3g' * 6) % (mi.weight.shape[1], *b[:5].mean(1).tolist(), b[5:].mean())) 158 | 159 | # def _print_weights(self): 160 | # for m in self.model.modules(): 161 | # if type(m) is Bottleneck: 162 | # print('%10.3g' % (m.w.detach().sigmoid() * 2)) # shortcut weights 163 | 164 | def fuse(self): # fuse model Conv2d() + BatchNorm2d() layers 165 | print('Fusing layers... ') 166 | for m in self.model.modules(): 167 | if type(m) is Conv and hasattr(m, 'bn'): 168 | m.conv = fuse_conv_and_bn(m.conv, m.bn) # update conv 169 | delattr(m, 'bn') # remove batchnorm 170 | m.forward = m.fuseforward # update forward 171 | self.info() 172 | return self 173 | 174 | def nms(self, mode=True): # add or remove NMS module 175 | present = type(self.model[-1]) is NMS # last layer is NMS 176 | if mode and not present: 177 | print('Adding NMS... ') 178 | m = NMS() # module 179 | m.f = -1 # from 180 | m.i = self.model[-1].i + 1 # index 181 | self.model.add_module(name='%s' % m.i, module=m) # add 182 | self.eval() 183 | elif not mode and present: 184 | print('Removing NMS... ') 185 | self.model = self.model[:-1] # remove 186 | return self 187 | 188 | def autoshape(self): # add autoShape module 189 | print('Adding autoShape... 
') 190 | m = autoShape(self) # wrap model 191 | copy_attr(m, self, include=('yaml', 'nc', 'hyp', 'names', 'stride'), exclude=()) # copy attributes 192 | return m 193 | 194 | def info(self, verbose=False): # print model information 195 | model_info(self, verbose) 196 | 197 | 198 | def parse_model(d, ch): # model_dict, input_channels(3) 199 | logger.info('\n%3s%18s%3s%10s %-40s%-30s' % ('', 'from', 'n', 'params', 'module', 'arguments')) 200 | anchors, nc, gd, gw = d['anchors'], d['nc'], d['depth_multiple'], d['width_multiple'] 201 | na = (len(anchors[0]) // 2) if isinstance(anchors, list) else anchors # number of anchors 202 | no = na * (nc + 5) # number of outputs = anchors * (classes + 5) 203 | 204 | layers, save, c2 = [], [], ch[-1] # layers, savelist, ch out 205 | for i, (f, n, m, args) in enumerate(d['backbone'] + d['head']): # from, number, module, args 206 | m = eval(m) if isinstance(m, str) else m # eval strings 207 | for j, a in enumerate(args): 208 | try: 209 | args[j] = eval(a) if isinstance(a, str) else a # eval strings 210 | except: 211 | pass 212 | 213 | n = max(round(n * gd), 1) if n > 1 else n # depth gain 214 | if m in [Conv, Bottleneck, SPP, DWConv, MixConv2d, Focus, CrossConv, BottleneckCSP, C3]: 215 | c1, c2 = ch[f], args[0] 216 | 217 | # Normal 218 | # if i > 0 and args[0] != no: # channel expansion factor 219 | # ex = 1.75 # exponential (default 2.0) 220 | # e = math.log(c2 / ch[1]) / math.log(2) 221 | # c2 = int(ch[1] * ex ** e) 222 | # if m != Focus: 223 | 224 | c2 = make_divisible(c2 * gw, 8) if c2 != no else c2 225 | 226 | # Experimental 227 | # if i > 0 and args[0] != no: # channel expansion factor 228 | # ex = 1 + gw # exponential (default 2.0) 229 | # ch1 = 32 # ch[1] 230 | # e = math.log(c2 / ch1) / math.log(2) # level 1-n 231 | # c2 = int(ch1 * ex ** e) 232 | # if m != Focus: 233 | # c2 = make_divisible(c2, 8) if c2 != no else c2 234 | 235 | args = [c1, c2, *args[1:]] 236 | if m in [BottleneckCSP, C3]: 237 | args.insert(2, n) 238 | n = 1 239 | elif m is nn.BatchNorm2d: 240 | args = [ch[f]] 241 | elif m is Concat: 242 | c2 = sum([ch[-1 if x == -1 else x + 1] for x in f]) 243 | elif m is Detect: 244 | args.append([ch[x + 1] for x in f]) 245 | if isinstance(args[1], int): # number of anchors 246 | args[1] = [list(range(args[1] * 2))] * len(f) 247 | else: 248 | c2 = ch[f] 249 | 250 | m_ = nn.Sequential(*[m(*args) for _ in range(n)]) if n > 1 else m(*args) # module 251 | t = str(m)[8:-2].replace('__main__.', '') # module type 252 | np = sum([x.numel() for x in m_.parameters()]) # number params 253 | m_.i, m_.f, m_.type, m_.np = i, f, t, np # attach index, 'from' index, type, number params 254 | logger.info('%3s%18s%3s%10.0f %-40s%-30s' % (i, f, n, np, t, args)) # print 255 | save.extend(x % i for x in ([f] if isinstance(f, int) else f) if x != -1) # append to savelist 256 | layers.append(m_) 257 | ch.append(c2) 258 | return nn.Sequential(*layers), sorted(save) 259 | 260 | 261 | if __name__ == '__main__': 262 | parser = argparse.ArgumentParser() 263 | parser.add_argument('--cfg', type=str, default='yolov5s.yaml', help='model.yaml') 264 | parser.add_argument('--device', default='', help='cuda device, i.e. 
0 or 0,1,2,3 or cpu') 265 | opt = parser.parse_args() 266 | opt.cfg = check_file(opt.cfg) # check file 267 | set_logging() 268 | device = select_device(opt.device) 269 | 270 | # Create model 271 | model = Model(opt.cfg).to(device) 272 | model.train() 273 | 274 | # Profile 275 | # img = torch.rand(8 if torch.cuda.is_available() else 1, 3, 640, 640).to(device) 276 | # y = model(img, profile=True) 277 | 278 | # Tensorboard 279 | # from torch.utils.tensorboard import SummaryWriter 280 | # tb_writer = SummaryWriter() 281 | # print("Run 'tensorboard --logdir=models/runs' to view tensorboard at http://localhost:6006/") 282 | # tb_writer.add_graph(model.model, img) # add model to tensorboard 283 | # tb_writer.add_image('test', img[0], dataformats='CWH') # add model to tensorboard 284 | -------------------------------------------------------------------------------- /v3.0/yolov5/models/yolov5l.yaml: -------------------------------------------------------------------------------- 1 | # parameters 2 | nc: 80 # number of classes 3 | depth_multiple: 1.0 # model depth multiple 4 | width_multiple: 1.0 # layer channel multiple 5 | 6 | # anchors 7 | anchors: 8 | - [10,13, 16,30, 33,23] # P3/8 9 | - [30,61, 62,45, 59,119] # P4/16 10 | - [116,90, 156,198, 373,326] # P5/32 11 | 12 | # YOLOv5 backbone 13 | backbone: 14 | # [from, number, module, args] 15 | [[-1, 1, Focus, [64, 3]], # 0-P1/2 16 | [-1, 1, Conv, [128, 3, 2]], # 1-P2/4 17 | [-1, 3, BottleneckCSP, [128]], 18 | [-1, 1, Conv, [256, 3, 2]], # 3-P3/8 19 | [-1, 9, BottleneckCSP, [256]], 20 | [-1, 1, Conv, [512, 3, 2]], # 5-P4/16 21 | [-1, 9, BottleneckCSP, [512]], 22 | [-1, 1, Conv, [1024, 3, 2]], # 7-P5/32 23 | [-1, 1, SPP, [1024, [5, 9, 13]]], 24 | [-1, 3, BottleneckCSP, [1024, False]], # 9 25 | ] 26 | 27 | # YOLOv5 head 28 | head: 29 | [[-1, 1, Conv, [512, 1, 1]], 30 | [-1, 1, nn.Upsample, [None, 2, 'nearest']], 31 | [[-1, 6], 1, Concat, [1]], # cat backbone P4 32 | [-1, 3, BottleneckCSP, [512, False]], # 13 33 | 34 | [-1, 1, Conv, [256, 1, 1]], 35 | [-1, 1, nn.Upsample, [None, 2, 'nearest']], 36 | [[-1, 4], 1, Concat, [1]], # cat backbone P3 37 | [-1, 3, BottleneckCSP, [256, False]], # 17 (P3/8-small) 38 | 39 | [-1, 1, Conv, [256, 3, 2]], 40 | [[-1, 14], 1, Concat, [1]], # cat head P4 41 | [-1, 3, BottleneckCSP, [512, False]], # 20 (P4/16-medium) 42 | 43 | [-1, 1, Conv, [512, 3, 2]], 44 | [[-1, 10], 1, Concat, [1]], # cat head P5 45 | [-1, 3, BottleneckCSP, [1024, False]], # 23 (P5/32-large) 46 | 47 | [[17, 20, 23], 1, Detect, [nc, anchors]], # Detect(P3, P4, P5) 48 | ] 49 | -------------------------------------------------------------------------------- /v3.0/yolov5/models/yolov5m.yaml: -------------------------------------------------------------------------------- 1 | # parameters 2 | nc: 80 # number of classes 3 | depth_multiple: 0.67 # model depth multiple 4 | width_multiple: 0.75 # layer channel multiple 5 | 6 | # anchors 7 | anchors: 8 | - [10,13, 16,30, 33,23] # P3/8 9 | - [30,61, 62,45, 59,119] # P4/16 10 | - [116,90, 156,198, 373,326] # P5/32 11 | 12 | # YOLOv5 backbone 13 | backbone: 14 | # [from, number, module, args] 15 | [[-1, 1, Focus, [64, 3]], # 0-P1/2 16 | [-1, 1, Conv, [128, 3, 2]], # 1-P2/4 17 | [-1, 3, BottleneckCSP, [128]], 18 | [-1, 1, Conv, [256, 3, 2]], # 3-P3/8 19 | [-1, 9, BottleneckCSP, [256]], 20 | [-1, 1, Conv, [512, 3, 2]], # 5-P4/16 21 | [-1, 9, BottleneckCSP, [512]], 22 | [-1, 1, Conv, [1024, 3, 2]], # 7-P5/32 23 | [-1, 1, SPP, [1024, [5, 9, 13]]], 24 | [-1, 3, BottleneckCSP, [1024, False]], # 9 25 | ] 26 | 27 | # 
YOLOv5 head 28 | head: 29 | [[-1, 1, Conv, [512, 1, 1]], 30 | [-1, 1, nn.Upsample, [None, 2, 'nearest']], 31 | [[-1, 6], 1, Concat, [1]], # cat backbone P4 32 | [-1, 3, BottleneckCSP, [512, False]], # 13 33 | 34 | [-1, 1, Conv, [256, 1, 1]], 35 | [-1, 1, nn.Upsample, [None, 2, 'nearest']], 36 | [[-1, 4], 1, Concat, [1]], # cat backbone P3 37 | [-1, 3, BottleneckCSP, [256, False]], # 17 (P3/8-small) 38 | 39 | [-1, 1, Conv, [256, 3, 2]], 40 | [[-1, 14], 1, Concat, [1]], # cat head P4 41 | [-1, 3, BottleneckCSP, [512, False]], # 20 (P4/16-medium) 42 | 43 | [-1, 1, Conv, [512, 3, 2]], 44 | [[-1, 10], 1, Concat, [1]], # cat head P5 45 | [-1, 3, BottleneckCSP, [1024, False]], # 23 (P5/32-large) 46 | 47 | [[17, 20, 23], 1, Detect, [nc, anchors]], # Detect(P3, P4, P5) 48 | ] 49 | -------------------------------------------------------------------------------- /v3.0/yolov5/models/yolov5s.yaml: -------------------------------------------------------------------------------- 1 | # parameters 2 | nc: 80 # number of classes 3 | depth_multiple: 0.33 # model depth multiple 4 | width_multiple: 0.50 # layer channel multiple 5 | 6 | # anchors 7 | anchors: 8 | - [10,13, 16,30, 33,23] # P3/8 9 | - [30,61, 62,45, 59,119] # P4/16 10 | - [116,90, 156,198, 373,326] # P5/32 11 | 12 | # YOLOv5 backbone 13 | backbone: 14 | # [from, number, module, args] 15 | [[-1, 1, Focus, [64, 3]], # 0-P1/2 16 | [-1, 1, Conv, [128, 3, 2]], # 1-P2/4 17 | [-1, 3, BottleneckCSP, [128]], 18 | [-1, 1, Conv, [256, 3, 2]], # 3-P3/8 19 | [-1, 9, BottleneckCSP, [256]], 20 | [-1, 1, Conv, [512, 3, 2]], # 5-P4/16 21 | [-1, 9, BottleneckCSP, [512]], 22 | [-1, 1, Conv, [1024, 3, 2]], # 7-P5/32 23 | [-1, 1, SPP, [1024, [5, 9, 13]]], 24 | [-1, 3, BottleneckCSP, [1024, False]], # 9 25 | ] 26 | 27 | # YOLOv5 head 28 | head: 29 | [[-1, 1, Conv, [512, 1, 1]], 30 | [-1, 1, nn.Upsample, [None, 2, 'nearest']], 31 | [[-1, 6], 1, Concat, [1]], # cat backbone P4 32 | [-1, 3, BottleneckCSP, [512, False]], # 13 33 | 34 | [-1, 1, Conv, [256, 1, 1]], 35 | [-1, 1, nn.Upsample, [None, 2, 'nearest']], 36 | [[-1, 4], 1, Concat, [1]], # cat backbone P3 37 | [-1, 3, BottleneckCSP, [256, False]], # 17 (P3/8-small) 38 | 39 | [-1, 1, Conv, [256, 3, 2]], 40 | [[-1, 14], 1, Concat, [1]], # cat head P4 41 | [-1, 3, BottleneckCSP, [512, False]], # 20 (P4/16-medium) 42 | 43 | [-1, 1, Conv, [512, 3, 2]], 44 | [[-1, 10], 1, Concat, [1]], # cat head P5 45 | [-1, 3, BottleneckCSP, [1024, False]], # 23 (P5/32-large) 46 | 47 | [[17, 20, 23], 1, Detect, [nc, anchors]], # Detect(P3, P4, P5) 48 | ] 49 | -------------------------------------------------------------------------------- /v3.0/yolov5/models/yolov5x.yaml: -------------------------------------------------------------------------------- 1 | # parameters 2 | nc: 80 # number of classes 3 | depth_multiple: 1.33 # model depth multiple 4 | width_multiple: 1.25 # layer channel multiple 5 | 6 | # anchors 7 | anchors: 8 | - [10,13, 16,30, 33,23] # P3/8 9 | - [30,61, 62,45, 59,119] # P4/16 10 | - [116,90, 156,198, 373,326] # P5/32 11 | 12 | # YOLOv5 backbone 13 | backbone: 14 | # [from, number, module, args] 15 | [[-1, 1, Focus, [64, 3]], # 0-P1/2 16 | [-1, 1, Conv, [128, 3, 2]], # 1-P2/4 17 | [-1, 3, BottleneckCSP, [128]], 18 | [-1, 1, Conv, [256, 3, 2]], # 3-P3/8 19 | [-1, 9, BottleneckCSP, [256]], 20 | [-1, 1, Conv, [512, 3, 2]], # 5-P4/16 21 | [-1, 9, BottleneckCSP, [512]], 22 | [-1, 1, Conv, [1024, 3, 2]], # 7-P5/32 23 | [-1, 1, SPP, [1024, [5, 9, 13]]], 24 | [-1, 3, BottleneckCSP, [1024, False]], # 9 25 | ] 26 | 27 | # YOLOv5 
head 28 | head: 29 | [[-1, 1, Conv, [512, 1, 1]], 30 | [-1, 1, nn.Upsample, [None, 2, 'nearest']], 31 | [[-1, 6], 1, Concat, [1]], # cat backbone P4 32 | [-1, 3, BottleneckCSP, [512, False]], # 13 33 | 34 | [-1, 1, Conv, [256, 1, 1]], 35 | [-1, 1, nn.Upsample, [None, 2, 'nearest']], 36 | [[-1, 4], 1, Concat, [1]], # cat backbone P3 37 | [-1, 3, BottleneckCSP, [256, False]], # 17 (P3/8-small) 38 | 39 | [-1, 1, Conv, [256, 3, 2]], 40 | [[-1, 14], 1, Concat, [1]], # cat head P4 41 | [-1, 3, BottleneckCSP, [512, False]], # 20 (P4/16-medium) 42 | 43 | [-1, 1, Conv, [512, 3, 2]], 44 | [[-1, 10], 1, Concat, [1]], # cat head P5 45 | [-1, 3, BottleneckCSP, [1024, False]], # 23 (P5/32-large) 46 | 47 | [[17, 20, 23], 1, Detect, [nc, anchors]], # Detect(P3, P4, P5) 48 | ] 49 | -------------------------------------------------------------------------------- /v3.0/yolov5/requirements.txt: -------------------------------------------------------------------------------- 1 | # pip install -r requirements.txt 2 | 3 | # base ---------------------------------------- 4 | Cython 5 | matplotlib>=3.2.2 6 | numpy>=1.18.5 7 | opencv-python>=4.1.2 8 | pillow 9 | PyYAML>=5.3 10 | scipy>=1.4.1 11 | tensorboard>=2.2 12 | torch>=1.6.0 13 | torchvision>=0.7.0 14 | tqdm>=4.41.0 15 | 16 | # logging ------------------------------------- 17 | # wandb 18 | 19 | # coco ---------------------------------------- 20 | # pycocotools>=2.0 21 | 22 | # export -------------------------------------- 23 | # packaging # for coremltools 24 | # coremltools==4.0 25 | # onnx>=1.7.0 26 | # scikit-learn==0.19.2 # for coreml quantization 27 | 28 | # extras -------------------------------------- 29 | # thop # FLOPS computation 30 | # seaborn # plotting 31 | -------------------------------------------------------------------------------- /v3.0/yolov5/sotabench.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | import glob 3 | import os 4 | import shutil 5 | from pathlib import Path 6 | 7 | import numpy as np 8 | import torch 9 | import yaml 10 | from sotabencheval.object_detection import COCOEvaluator 11 | from sotabencheval.utils import is_server 12 | from tqdm import tqdm 13 | 14 | from models.experimental import attempt_load 15 | from utils.datasets import create_dataloader 16 | from utils.general import ( 17 | coco80_to_coco91_class, check_dataset, check_file, check_img_size, compute_loss, non_max_suppression, scale_coords, 18 | xyxy2xywh, clip_coords, set_logging) 19 | from utils.torch_utils import select_device, time_synchronized 20 | 21 | DATA_ROOT = './.data/vision/coco' if is_server() else '../coco' # sotabench data dir 22 | 23 | 24 | def test(data, 25 | weights=None, 26 | batch_size=16, 27 | imgsz=640, 28 | conf_thres=0.001, 29 | iou_thres=0.6, # for NMS 30 | save_json=False, 31 | single_cls=False, 32 | augment=False, 33 | verbose=False, 34 | model=None, 35 | dataloader=None, 36 | save_dir='', 37 | merge=False, 38 | save_txt=False): 39 | # Initialize/load model and set device 40 | training = model is not None 41 | if training: # called by train.py 42 | device = next(model.parameters()).device # get model device 43 | 44 | else: # called directly 45 | set_logging() 46 | device = select_device(opt.device, batch_size=batch_size) 47 | merge, save_txt = opt.merge, opt.save_txt # use Merge NMS, save *.txt labels 48 | if save_txt: 49 | out = Path('inference/output') 50 | if os.path.exists(out): 51 | shutil.rmtree(out) # delete output folder 52 | os.makedirs(out) # make new output folder 
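# NOTE: a minimal standalone sketch (not part of the original file) of the
# xyxy -> normalized-xywh conversion used by the save_txt block further down;
# it mirrors utils.general.xyxy2xywh, assuming a hypothetical 640x480 image:
import torch
def xyxy2xywh_sketch(x):
    y = x.clone()
    y[:, 0] = (x[:, 0] + x[:, 2]) / 2                  # x centre
    y[:, 1] = (x[:, 1] + x[:, 3]) / 2                  # y centre
    y[:, 2] = x[:, 2] - x[:, 0]                        # width
    y[:, 3] = x[:, 3] - x[:, 1]                        # height
    return y
box = torch.tensor([[100., 80., 300., 240.]])          # one xyxy box
gn = torch.tensor([640., 480., 640., 480.])            # normalization gain whwh
print((xyxy2xywh_sketch(box) / gn).view(-1).tolist())  # ~[0.3125, 0.3333, 0.3125, 0.3333]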
53 | 54 | # Remove previous 55 | for f in glob.glob(str(Path(save_dir) / 'test_batch*.jpg')): 56 | os.remove(f) 57 | 58 | # Load model 59 | model = attempt_load(weights, map_location=device) # load FP32 model 60 | imgsz = check_img_size(imgsz, s=model.stride.max()) # check img_size 61 | 62 | # Multi-GPU disabled, incompatible with .half() https://github.com/ultralytics/yolov5/issues/99 63 | # if device.type != 'cpu' and torch.cuda.device_count() > 1: 64 | # model = nn.DataParallel(model) 65 | 66 | # Half 67 | half = device.type != 'cpu' # half precision only supported on CUDA 68 | if half: 69 | model.half() 70 | 71 | # Configure 72 | model.eval() 73 | with open(data) as f: 74 | data = yaml.load(f, Loader=yaml.FullLoader) # model dict 75 | check_dataset(data) # check 76 | nc = 1 if single_cls else int(data['nc']) # number of classes 77 | iouv = torch.linspace(0.5, 0.95, 10).to(device) # iou vector for mAP@0.5:0.95 78 | niou = iouv.numel() 79 | 80 | # Dataloader 81 | if not training: 82 | img = torch.zeros((1, 3, imgsz, imgsz), device=device) # init img 83 | _ = model(img.half() if half else img) if device.type != 'cpu' else None # run once 84 | path = data['test'] if opt.task == 'test' else data['val'] # path to val/test images 85 | dataloader = create_dataloader(path, imgsz, batch_size, model.stride.max(), opt, 86 | hyp=None, augment=False, cache=True, pad=0.5, rect=True)[0] 87 | 88 | seen = 0 89 | names = model.names if hasattr(model, 'names') else model.module.names 90 | coco91class = coco80_to_coco91_class() 91 | s = ('%20s' + '%12s' * 6) % ('Class', 'Images', 'Targets', 'P', 'R', 'mAP@.5', 'mAP@.5:.95') 92 | p, r, f1, mp, mr, map50, map, t0, t1 = 0., 0., 0., 0., 0., 0., 0., 0., 0. 93 | loss = torch.zeros(3, device=device) 94 | jdict, stats, ap, ap_class = [], [], [], [] 95 | evaluator = COCOEvaluator(root=DATA_ROOT, model_name=opt.weights.replace('.pt', '')) 96 | for batch_i, (img, targets, paths, shapes) in enumerate(tqdm(dataloader, desc=s)): 97 | img = img.to(device, non_blocking=True) 98 | img = img.half() if half else img.float() # uint8 to fp16/32 99 | img /= 255.0 # 0 - 255 to 0.0 - 1.0 100 | targets = targets.to(device) 101 | nb, _, height, width = img.shape # batch size, channels, height, width 102 | whwh = torch.Tensor([width, height, width, height]).to(device) 103 | 104 | # Disable gradients 105 | with torch.no_grad(): 106 | # Run model 107 | t = time_synchronized() 108 | inf_out, train_out = model(img, augment=augment) # inference and training outputs 109 | t0 += time_synchronized() - t 110 | 111 | # Compute loss 112 | if training: # if model has loss hyperparameters 113 | loss += compute_loss([x.float() for x in train_out], targets, model)[1][:3] # box, obj, cls 114 | 115 | # Run NMS 116 | t = time_synchronized() 117 | output = non_max_suppression(inf_out, conf_thres=conf_thres, iou_thres=iou_thres, merge=merge) 118 | t1 += time_synchronized() - t 119 | 120 | # Statistics per image 121 | for si, pred in enumerate(output): 122 | labels = targets[targets[:, 0] == si, 1:] 123 | nl = len(labels) 124 | tcls = labels[:, 0].tolist() if nl else [] # target class 125 | seen += 1 126 | 127 | if pred is None: 128 | if nl: 129 | stats.append((torch.zeros(0, niou, dtype=torch.bool), torch.Tensor(), torch.Tensor(), tcls)) 130 | continue 131 | 132 | # Append to text file 133 | if save_txt: 134 | gn = torch.tensor(shapes[si][0])[[1, 0, 1, 0]] # normalization gain whwh 135 | x = pred.clone() 136 | x[:, :4] = scale_coords(img[si].shape[1:], x[:, :4], shapes[si][0], shapes[si][1]) # to 
original 137 | for *xyxy, conf, cls in x: 138 | xywh = (xyxy2xywh(torch.tensor(xyxy).view(1, 4)) / gn).view(-1).tolist() # normalized xywh 139 | with open(str(out / Path(paths[si]).stem) + '.txt', 'a') as f: 140 | f.write(('%g ' * 5 + '\n') % (cls, *xywh)) # label format 141 | 142 | # Clip boxes to image bounds 143 | clip_coords(pred, (height, width)) 144 | 145 | # Append to pycocotools JSON dictionary 146 | if save_json: 147 | # [{"image_id": 42, "category_id": 18, "bbox": [258.15, 41.29, 348.26, 243.78], "score": 0.236}, ... 148 | image_id = Path(paths[si]).stem 149 | box = pred[:, :4].clone() # xyxy 150 | scale_coords(img[si].shape[1:], box, shapes[si][0], shapes[si][1]) # to original shape 151 | box = xyxy2xywh(box) # xywh 152 | box[:, :2] -= box[:, 2:] / 2 # xy center to top-left corner 153 | for p, b in zip(pred.tolist(), box.tolist()): 154 | result = {'image_id': int(image_id) if image_id.isnumeric() else image_id, 155 | 'category_id': coco91class[int(p[5])], 156 | 'bbox': [round(x, 3) for x in b], 157 | 'score': round(p[4], 5)} 158 | jdict.append(result) 159 | 160 | #evaluator.add([result]) 161 | #if evaluator.cache_exists: 162 | # break 163 | 164 | # # Assign all predictions as incorrect 165 | # correct = torch.zeros(pred.shape[0], niou, dtype=torch.bool, device=device) 166 | # if nl: 167 | # detected = [] # target indices 168 | # tcls_tensor = labels[:, 0] 169 | # 170 | # # target boxes 171 | # tbox = xywh2xyxy(labels[:, 1:5]) * whwh 172 | # 173 | # # Per target class 174 | # for cls in torch.unique(tcls_tensor): 175 | # ti = (cls == tcls_tensor).nonzero(as_tuple=False).view(-1) # prediction indices 176 | # pi = (cls == pred[:, 5]).nonzero(as_tuple=False).view(-1) # target indices 177 | # 178 | # # Search for detections 179 | # if pi.shape[0]: 180 | # # Prediction to target ious 181 | # ious, i = box_iou(pred[pi, :4], tbox[ti]).max(1) # best ious, indices 182 | # 183 | # # Append detections 184 | # detected_set = set() 185 | # for j in (ious > iouv[0]).nonzero(as_tuple=False): 186 | # d = ti[i[j]] # detected target 187 | # if d.item() not in detected_set: 188 | # detected_set.add(d.item()) 189 | # detected.append(d) 190 | # correct[pi[j]] = ious[j] > iouv # iou_thres is 1xn 191 | # if len(detected) == nl: # all targets already located in image 192 | # break 193 | # 194 | # # Append statistics (correct, conf, pcls, tcls) 195 | # stats.append((correct.cpu(), pred[:, 4].cpu(), pred[:, 5].cpu(), tcls)) 196 | 197 | # # Plot images 198 | # if batch_i < 1: 199 | # f = Path(save_dir) / ('test_batch%g_gt.jpg' % batch_i) # filename 200 | # plot_images(img, targets, paths, str(f), names) # ground truth 201 | # f = Path(save_dir) / ('test_batch%g_pred.jpg' % batch_i) 202 | # plot_images(img, output_to_target(output, width, height), paths, str(f), names) # predictions 203 | 204 | evaluator.add(jdict) 205 | evaluator.save() 206 | 207 | # # Compute statistics 208 | # stats = [np.concatenate(x, 0) for x in zip(*stats)] # to numpy 209 | # if len(stats) and stats[0].any(): 210 | # p, r, ap, f1, ap_class = ap_per_class(*stats) 211 | # p, r, ap50, ap = p[:, 0], r[:, 0], ap[:, 0], ap.mean(1) # [P, R, AP@0.5, AP@0.5:0.95] 212 | # mp, mr, map50, map = p.mean(), r.mean(), ap50.mean(), ap.mean() 213 | # nt = np.bincount(stats[3].astype(np.int64), minlength=nc) # number of targets per class 214 | # else: 215 | # nt = torch.zeros(1) 216 | # 217 | # # Print results 218 | # pf = '%20s' + '%12.3g' * 6 # print format 219 | # print(pf % ('all', seen, nt.sum(), mp, mr, map50, map)) 220 | # 221 | # # Print 
results per class 222 | # if verbose and nc > 1 and len(stats): 223 | # for i, c in enumerate(ap_class): 224 | # print(pf % (names[c], seen, nt[c], p[i], r[i], ap50[i], ap[i])) 225 | # 226 | # # Print speeds 227 | # t = tuple(x / seen * 1E3 for x in (t0, t1, t0 + t1)) + (imgsz, imgsz, batch_size) # tuple 228 | # if not training: 229 | # print('Speed: %.1f/%.1f/%.1f ms inference/NMS/total per %gx%g image at batch-size %g' % t) 230 | # 231 | # # Save JSON 232 | # if save_json and len(jdict): 233 | # f = 'detections_val2017_%s_results.json' % \ 234 | # (weights.split(os.sep)[-1].replace('.pt', '') if isinstance(weights, str) else '') # filename 235 | # print('\nCOCO mAP with pycocotools... saving %s...' % f) 236 | # with open(f, 'w') as file: 237 | # json.dump(jdict, file) 238 | # 239 | # try: # https://github.com/cocodataset/cocoapi/blob/master/PythonAPI/pycocoEvalDemo.ipynb 240 | # from pycocotools.coco import COCO 241 | # from pycocotools.cocoeval import COCOeval 242 | # 243 | # imgIds = [int(Path(x).stem) for x in dataloader.dataset.img_files] 244 | # cocoGt = COCO(glob.glob('../coco/annotations/instances_val*.json')[0]) # initialize COCO ground truth api 245 | # cocoDt = cocoGt.loadRes(f) # initialize COCO pred api 246 | # cocoEval = COCOeval(cocoGt, cocoDt, 'bbox') 247 | # cocoEval.params.imgIds = imgIds # image IDs to evaluate 248 | # cocoEval.evaluate() 249 | # cocoEval.accumulate() 250 | # cocoEval.summarize() 251 | # map, map50 = cocoEval.stats[:2] # update results (mAP@0.5:0.95, mAP@0.5) 252 | # except Exception as e: 253 | # print('ERROR: pycocotools unable to run: %s' % e) 254 | # 255 | # # Return results 256 | # model.float() # for training 257 | # maps = np.zeros(nc) + map 258 | # for i, c in enumerate(ap_class): 259 | # maps[c] = ap[i] 260 | # return (mp, mr, map50, map, *(loss.cpu() / len(dataloader)).tolist()), maps, t 261 | 262 | 263 | if __name__ == '__main__': 264 | parser = argparse.ArgumentParser(prog='test.py') 265 | parser.add_argument('--weights', nargs='+', type=str, default='yolov5s.pt', help='model.pt path(s)') 266 | parser.add_argument('--data', type=str, default='data/coco.yaml', help='*.data path') 267 | parser.add_argument('--batch-size', type=int, default=32, help='size of each image batch') 268 | parser.add_argument('--img-size', type=int, default=640, help='inference size (pixels)') 269 | parser.add_argument('--conf-thres', type=float, default=0.001, help='object confidence threshold') 270 | parser.add_argument('--iou-thres', type=float, default=0.65, help='IOU threshold for NMS') 271 | parser.add_argument('--save-json', action='store_true', help='save a cocoapi-compatible JSON results file') 272 | parser.add_argument('--task', default='val', help="'val', 'test', 'study'") 273 | parser.add_argument('--device', default='', help='cuda device, i.e. 
0 or 0,1,2,3 or cpu') 274 | parser.add_argument('--single-cls', action='store_true', help='treat as single-class dataset') 275 | parser.add_argument('--augment', action='store_true', help='augmented inference') 276 | parser.add_argument('--merge', action='store_true', help='use Merge NMS') 277 | parser.add_argument('--verbose', action='store_true', help='report mAP by class') 278 | parser.add_argument('--save-txt', action='store_true', help='save results to *.txt') 279 | opt = parser.parse_args() 280 | opt.save_json |= opt.data.endswith('coco.yaml') 281 | opt.data = check_file(opt.data) # check file 282 | print(opt) 283 | 284 | if opt.task in ['val', 'test']: # run normally 285 | test(opt.data, 286 | opt.weights, 287 | opt.batch_size, 288 | opt.img_size, 289 | opt.conf_thres, 290 | opt.iou_thres, 291 | opt.save_json, 292 | opt.single_cls, 293 | opt.augment, 294 | opt.verbose) 295 | 296 | elif opt.task == 'study': # run over a range of settings and save/plot 297 | for weights in ['yolov5s.pt', 'yolov5m.pt', 'yolov5l.pt', 'yolov5x.pt']: 298 | f = 'study_%s_%s.txt' % (Path(opt.data).stem, Path(weights).stem) # filename to save to 299 | x = list(range(320, 800, 64)) # x axis 300 | y = [] # y axis 301 | for i in x: # img-size 302 | print('\nRunning %s point %s...' % (f, i)) 303 | r, _, t = test(opt.data, weights, opt.batch_size, i, opt.conf_thres, opt.iou_thres, opt.save_json) 304 | y.append(r + t) # results and times 305 | np.savetxt(f, y, fmt='%10.4g') # save 306 | os.system('zip -r study.zip study_*.txt') 307 | # utils.general.plot_study_txt(f, x) # plot -------------------------------------------------------------------------------- /v3.0/yolov5/test.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | import glob 3 | import json 4 | import os 5 | import shutil 6 | from pathlib import Path 7 | 8 | import numpy as np 9 | import torch 10 | import yaml 11 | from tqdm import tqdm 12 | 13 | from models.experimental import attempt_load 14 | from utils.datasets import create_dataloader 15 | from utils.general import ( 16 | coco80_to_coco91_class, check_dataset, check_file, check_img_size, compute_loss, non_max_suppression, scale_coords, 17 | xyxy2xywh, clip_coords, plot_images, xywh2xyxy, box_iou, output_to_target, ap_per_class, set_logging) 18 | from utils.torch_utils import select_device, time_synchronized 19 | 20 | 21 | def test(data, 22 | weights=None, 23 | batch_size=16, 24 | imgsz=640, 25 | conf_thres=0.001, 26 | iou_thres=0.6, # for NMS 27 | save_json=False, 28 | single_cls=False, 29 | augment=False, 30 | verbose=False, 31 | model=None, 32 | dataloader=None, 33 | save_dir=Path(''), # for saving images 34 | save_txt=False, # for auto-labelling 35 | save_conf=False, 36 | plots=True, 37 | log_imgs=0): # number of logged images 38 | 39 | # Initialize/load model and set device 40 | training = model is not None 41 | if training: # called by train.py 42 | device = next(model.parameters()).device # get model device 43 | 44 | else: # called directly 45 | set_logging() 46 | device = select_device(opt.device, batch_size=batch_size) 47 | save_txt = opt.save_txt # save *.txt labels 48 | 49 | # Remove previous 50 | if os.path.exists(save_dir): 51 | shutil.rmtree(save_dir) # delete dir 52 | os.makedirs(save_dir) # make new dir 53 | 54 | if save_txt: 55 | out = save_dir / 'autolabels' 56 | if os.path.exists(out): 57 | shutil.rmtree(out) # delete dir 58 | os.makedirs(out) # make new dir 59 | 60 | # Load model 61 | model = attempt_load(weights, 
map_location=device) # load FP32 model 62 | imgsz = check_img_size(imgsz, s=model.stride.max()) # check img_size 63 | 64 | # Multi-GPU disabled, incompatible with .half() https://github.com/ultralytics/yolov5/issues/99 65 | # if device.type != 'cpu' and torch.cuda.device_count() > 1: 66 | # model = nn.DataParallel(model) 67 | 68 | # Half 69 | half = device.type != 'cpu' # half precision only supported on CUDA 70 | if half: 71 | model.half() 72 | 73 | # Configure 74 | model.eval() 75 | with open(data) as f: 76 | data = yaml.load(f, Loader=yaml.FullLoader) # model dict 77 | check_dataset(data) # check 78 | nc = 1 if single_cls else int(data['nc']) # number of classes 79 | iouv = torch.linspace(0.5, 0.95, 10).to(device) # iou vector for mAP@0.5:0.95 80 | niou = iouv.numel() 81 | 82 | # Logging 83 | log_imgs = min(log_imgs, 100) # ceil 84 | try: 85 | import wandb # Weights & Biases 86 | except ImportError: 87 | log_imgs = 0 88 | 89 | # Dataloader 90 | if not training: 91 | img = torch.zeros((1, 3, imgsz, imgsz), device=device) # init img 92 | _ = model(img.half() if half else img) if device.type != 'cpu' else None # run once 93 | path = data['test'] if opt.task == 'test' else data['val'] # path to val/test images 94 | dataloader = create_dataloader(path, imgsz, batch_size, model.stride.max(), opt, 95 | hyp=None, augment=False, cache=False, pad=0.5, rect=True)[0] 96 | 97 | seen = 0 98 | names = model.names if hasattr(model, 'names') else model.module.names 99 | coco91class = coco80_to_coco91_class() 100 | s = ('%20s' + '%12s' * 6) % ('Class', 'Images', 'Targets', 'P', 'R', 'mAP@.5', 'mAP@.5:.95') 101 | p, r, f1, mp, mr, map50, map, t0, t1 = 0., 0., 0., 0., 0., 0., 0., 0., 0. 102 | loss = torch.zeros(3, device=device) 103 | jdict, stats, ap, ap_class, wandb_images = [], [], [], [], [] 104 | for batch_i, (img, targets, paths, shapes) in enumerate(tqdm(dataloader, desc=s)): 105 | img = img.to(device, non_blocking=True) 106 | img = img.half() if half else img.float() # uint8 to fp16/32 107 | img /= 255.0 # 0 - 255 to 0.0 - 1.0 108 | targets = targets.to(device) 109 | nb, _, height, width = img.shape # batch size, channels, height, width 110 | whwh = torch.Tensor([width, height, width, height]).to(device) 111 | 112 | # Disable gradients 113 | with torch.no_grad(): 114 | # Run model 115 | t = time_synchronized() 116 | inf_out, train_out = model(img, augment=augment) # inference and training outputs 117 | t0 += time_synchronized() - t 118 | 119 | # Compute loss 120 | if training: # if model has loss hyperparameters 121 | loss += compute_loss([x.float() for x in train_out], targets, model)[1][:3] # box, obj, cls 122 | 123 | # Run NMS 124 | t = time_synchronized() 125 | output = non_max_suppression(inf_out, conf_thres=conf_thres, iou_thres=iou_thres) 126 | t1 += time_synchronized() - t 127 | 128 | # Statistics per image 129 | for si, pred in enumerate(output): 130 | labels = targets[targets[:, 0] == si, 1:] 131 | nl = len(labels) 132 | tcls = labels[:, 0].tolist() if nl else [] # target class 133 | seen += 1 134 | 135 | if pred is None: 136 | if nl: 137 | stats.append((torch.zeros(0, niou, dtype=torch.bool), torch.Tensor(), torch.Tensor(), tcls)) 138 | continue 139 | 140 | # Append to text file 141 | if save_txt: 142 | gn = torch.tensor(shapes[si][0])[[1, 0, 1, 0]] # normalization gain whwh 143 | x = pred.clone() 144 | x[:, :4] = scale_coords(img[si].shape[1:], x[:, :4], shapes[si][0], shapes[si][1]) # to original 145 | for *xyxy, conf, cls in x: 146 | xywh = (xyxy2xywh(torch.tensor(xyxy).view(1, 4)) / 
gn).view(-1).tolist() # normalized xywh 147 | line = (cls, conf, *xywh) if save_conf else (cls, *xywh) # label format 148 | with open(str(out / Path(paths[si]).stem) + '.txt', 'a') as f: 149 | f.write(('%g ' * len(line) + '\n') % line) 150 | 151 | # W&B logging 152 | if len(wandb_images) < log_imgs: 153 | bbox_data = [{"position": {"minX": xyxy[0], "minY": xyxy[1], "maxX": xyxy[2], "maxY": xyxy[3]}, 154 | "class_id": int(cls), 155 | "scores": {"class_score": conf}, 156 | "domain": "pixel"} for *xyxy, conf, cls in pred.clone().tolist()] 157 | wandb_images.append(wandb.Image(img[si], boxes={"predictions": {"box_data": bbox_data}})) 158 | 159 | # Clip boxes to image bounds 160 | clip_coords(pred, (height, width)) 161 | 162 | # Append to pycocotools JSON dictionary 163 | if save_json: 164 | # [{"image_id": 42, "category_id": 18, "bbox": [258.15, 41.29, 348.26, 243.78], "score": 0.236}, ... 165 | image_id = Path(paths[si]).stem 166 | box = pred[:, :4].clone() # xyxy 167 | scale_coords(img[si].shape[1:], box, shapes[si][0], shapes[si][1]) # to original shape 168 | box = xyxy2xywh(box) # xywh 169 | box[:, :2] -= box[:, 2:] / 2 # xy center to top-left corner 170 | for p, b in zip(pred.tolist(), box.tolist()): 171 | jdict.append({'image_id': int(image_id) if image_id.isnumeric() else image_id, 172 | 'category_id': coco91class[int(p[5])], 173 | 'bbox': [round(x, 3) for x in b], 174 | 'score': round(p[4], 5)}) 175 | 176 | # Assign all predictions as incorrect 177 | correct = torch.zeros(pred.shape[0], niou, dtype=torch.bool, device=device) 178 | if nl: 179 | detected = [] # target indices 180 | tcls_tensor = labels[:, 0] 181 | 182 | # target boxes 183 | tbox = xywh2xyxy(labels[:, 1:5]) * whwh 184 | 185 | # Per target class 186 | for cls in torch.unique(tcls_tensor): 187 | ti = (cls == tcls_tensor).nonzero(as_tuple=False).view(-1) # prediction indices 188 | pi = (cls == pred[:, 5]).nonzero(as_tuple=False).view(-1) # target indices 189 | 190 | # Search for detections 191 | if pi.shape[0]: 192 | # Prediction to target ious 193 | ious, i = box_iou(pred[pi, :4], tbox[ti]).max(1) # best ious, indices 194 | 195 | # Append detections 196 | detected_set = set() 197 | for j in (ious > iouv[0]).nonzero(as_tuple=False): 198 | d = ti[i[j]] # detected target 199 | if d.item() not in detected_set: 200 | detected_set.add(d.item()) 201 | detected.append(d) 202 | correct[pi[j]] = ious[j] > iouv # iou_thres is 1xn 203 | if len(detected) == nl: # all targets already located in image 204 | break 205 | 206 | # Append statistics (correct, conf, pcls, tcls) 207 | stats.append((correct.cpu(), pred[:, 4].cpu(), pred[:, 5].cpu(), tcls)) 208 | 209 | # Plot images 210 | if plots and batch_i < 1: 211 | f = save_dir / f'test_batch{batch_i}_gt.jpg' # filename 212 | plot_images(img, targets, paths, str(f), names) # ground truth 213 | f = save_dir / f'test_batch{batch_i}_pred.jpg' 214 | plot_images(img, output_to_target(output, width, height), paths, str(f), names) # predictions 215 | 216 | # W&B logging 217 | if wandb_images: 218 | wandb.log({"outputs": wandb_images}) 219 | 220 | # Compute statistics 221 | stats = [np.concatenate(x, 0) for x in zip(*stats)] # to numpy 222 | if len(stats) and stats[0].any(): 223 | p, r, ap, f1, ap_class = ap_per_class(*stats, plot=plots, fname=save_dir / 'precision-recall_curve.png') 224 | p, r, ap50, ap = p[:, 0], r[:, 0], ap[:, 0], ap.mean(1) # [P, R, AP@0.5, AP@0.5:0.95] 225 | mp, mr, map50, map = p.mean(), r.mean(), ap50.mean(), ap.mean() 226 | nt = np.bincount(stats[3].astype(np.int64), 
minlength=nc) # number of targets per class 227 | else: 228 | nt = torch.zeros(1) 229 | 230 | # Print results 231 | pf = '%20s' + '%12.3g' * 6 # print format 232 | print(pf % ('all', seen, nt.sum(), mp, mr, map50, map)) 233 | 234 | # Print results per class 235 | if verbose and nc > 1 and len(stats): 236 | for i, c in enumerate(ap_class): 237 | print(pf % (names[c], seen, nt[c], p[i], r[i], ap50[i], ap[i])) 238 | 239 | # Print speeds 240 | t = tuple(x / seen * 1E3 for x in (t0, t1, t0 + t1)) + (imgsz, imgsz, batch_size) # tuple 241 | if not training: 242 | print('Speed: %.1f/%.1f/%.1f ms inference/NMS/total per %gx%g image at batch-size %g' % t) 243 | 244 | # Save JSON 245 | if save_json and len(jdict): 246 | w = Path(weights[0] if isinstance(weights, list) else weights).stem if weights is not None else '' # weights 247 | file = save_dir / f"detections_val2017_{w}_results.json" # predicted annotations file 248 | print('\nCOCO mAP with pycocotools... saving %s...' % file) 249 | with open(file, 'w') as f: 250 | json.dump(jdict, f) 251 | 252 | try: # https://github.com/cocodataset/cocoapi/blob/master/PythonAPI/pycocoEvalDemo.ipynb 253 | from pycocotools.coco import COCO 254 | from pycocotools.cocoeval import COCOeval 255 | 256 | imgIds = [int(Path(x).stem) for x in dataloader.dataset.img_files] 257 | cocoGt = COCO(glob.glob('../coco/annotations/instances_val*.json')[0]) # initialize COCO ground truth api 258 | cocoDt = cocoGt.loadRes(str(file)) # initialize COCO pred api 259 | cocoEval = COCOeval(cocoGt, cocoDt, 'bbox') 260 | cocoEval.params.imgIds = imgIds # image IDs to evaluate 261 | cocoEval.evaluate() 262 | cocoEval.accumulate() 263 | cocoEval.summarize() 264 | map, map50 = cocoEval.stats[:2] # update results (mAP@0.5:0.95, mAP@0.5) 265 | except Exception as e: 266 | print('ERROR: pycocotools unable to run: %s' % e) 267 | 268 | # Return results 269 | model.float() # for training 270 | maps = np.zeros(nc) + map 271 | for i, c in enumerate(ap_class): 272 | maps[c] = ap[i] 273 | return (mp, mr, map50, map, *(loss.cpu() / len(dataloader)).tolist()), maps, t 274 | 275 | 276 | if __name__ == '__main__': 277 | parser = argparse.ArgumentParser(prog='test.py') 278 | parser.add_argument('--weights', nargs='+', type=str, default='yolov5s.pt', help='model.pt path(s)') 279 | parser.add_argument('--data', type=str, default='data/coco128.yaml', help='*.data path') 280 | parser.add_argument('--batch-size', type=int, default=32, help='size of each image batch') 281 | parser.add_argument('--img-size', type=int, default=640, help='inference size (pixels)') 282 | parser.add_argument('--conf-thres', type=float, default=0.001, help='object confidence threshold') 283 | parser.add_argument('--iou-thres', type=float, default=0.65, help='IOU threshold for NMS') 284 | parser.add_argument('--save-json', action='store_true', help='save a cocoapi-compatible JSON results file') 285 | parser.add_argument('--task', default='val', help="'val', 'test', 'study'") 286 | parser.add_argument('--device', default='', help='cuda device, i.e. 
0 or 0,1,2,3 or cpu') 287 | parser.add_argument('--single-cls', action='store_true', help='treat as single-class dataset') 288 | parser.add_argument('--augment', action='store_true', help='augmented inference') 289 | parser.add_argument('--verbose', action='store_true', help='report mAP by class') 290 | parser.add_argument('--save-txt', action='store_true', help='save results to *.txt') 291 | parser.add_argument('--save-conf', action='store_true', help='save confidences in --save-txt labels') 292 | parser.add_argument('--save-dir', type=str, default='runs/test', help='directory to save results') 293 | opt = parser.parse_args() 294 | opt.save_json |= opt.data.endswith('coco.yaml') 295 | opt.data = check_file(opt.data) # check file 296 | print(opt) 297 | 298 | if opt.task in ['val', 'test']: # run normally 299 | test(opt.data, 300 | opt.weights, 301 | opt.batch_size, 302 | opt.img_size, 303 | opt.conf_thres, 304 | opt.iou_thres, 305 | opt.save_json, 306 | opt.single_cls, 307 | opt.augment, 308 | opt.verbose, 309 | save_dir=Path(opt.save_dir), 310 | save_txt=opt.save_txt, 311 | save_conf=opt.save_conf, 312 | ) 313 | 314 | print('Results saved to %s' % opt.save_dir) 315 | 316 | elif opt.task == 'study': # run over a range of settings and save/plot 317 | for weights in ['yolov5s.pt', 'yolov5m.pt', 'yolov5l.pt', 'yolov5x.pt']: 318 | f = 'study_%s_%s.txt' % (Path(opt.data).stem, Path(weights).stem) # filename to save to 319 | x = list(range(320, 800, 64)) # x axis 320 | y = [] # y axis 321 | for i in x: # img-size 322 | print('\nRunning %s point %s...' % (f, i)) 323 | r, _, t = test(opt.data, weights, opt.batch_size, i, opt.conf_thres, opt.iou_thres, opt.save_json) 324 | y.append(r + t) # results and times 325 | np.savetxt(f, y, fmt='%10.4g') # save 326 | os.system('zip -r study.zip study_*.txt') 327 | # utils.general.plot_study_txt(f, x) # plot 328 | -------------------------------------------------------------------------------- /v3.0/yolov5/train.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | import logging 3 | import os 4 | import random 5 | import shutil 6 | import time 7 | from pathlib import Path 8 | from warnings import warn 9 | 10 | import math 11 | import numpy as np 12 | import torch.distributed as dist 13 | import torch.nn.functional as F 14 | import torch.optim as optim 15 | import torch.optim.lr_scheduler as lr_scheduler 16 | import torch.utils.data 17 | import yaml 18 | from torch.cuda import amp 19 | from torch.nn.parallel import DistributedDataParallel as DDP 20 | from torch.utils.tensorboard import SummaryWriter 21 | from tqdm import tqdm 22 | 23 | import test # import test.py to get mAP after each epoch 24 | from models.yolo import Model 25 | from utils.datasets import create_dataloader 26 | from utils.general import ( 27 | torch_distributed_zero_first, labels_to_class_weights, plot_labels, check_anchors, labels_to_image_weights, 28 | compute_loss, plot_images, fitness, strip_optimizer, plot_results, get_latest_run, check_dataset, check_file, 29 | check_git_status, check_img_size, increment_dir, print_mutation, plot_evolution, set_logging, init_seeds) 30 | from utils.google_utils import attempt_download 31 | from utils.torch_utils import ModelEMA, select_device, intersect_dicts 32 | 33 | logger = logging.getLogger(__name__) 34 | 35 | 36 | def train(hyp, opt, device, tb_writer=None, wandb=None): 37 | logger.info(f'Hyperparameters {hyp}') 38 | log_dir = Path(tb_writer.log_dir) if tb_writer else Path(opt.logdir) / 
'evolve' # logging directory 39 | wdir = log_dir / 'weights' # weights directory 40 | os.makedirs(wdir, exist_ok=True) 41 | last = wdir / 'last.pt' 42 | best = wdir / 'best.pt' 43 | results_file = str(log_dir / 'results.txt') 44 | epochs, batch_size, total_batch_size, weights, rank = \ 45 | opt.epochs, opt.batch_size, opt.total_batch_size, opt.weights, opt.global_rank 46 | 47 | # Save run settings 48 | with open(log_dir / 'hyp.yaml', 'w') as f: 49 | yaml.dump(hyp, f, sort_keys=False) 50 | with open(log_dir / 'opt.yaml', 'w') as f: 51 | yaml.dump(vars(opt), f, sort_keys=False) 52 | 53 | # Configure 54 | cuda = device.type != 'cpu' 55 | init_seeds(2 + rank) 56 | with open(opt.data) as f: 57 | data_dict = yaml.load(f, Loader=yaml.FullLoader) # data dict 58 | with torch_distributed_zero_first(rank): 59 | check_dataset(data_dict) # check 60 | train_path = data_dict['train'] 61 | test_path = data_dict['val'] 62 | nc, names = (1, ['item']) if opt.single_cls else (int(data_dict['nc']), data_dict['names']) # number classes, names 63 | assert len(names) == nc, '%g names found for nc=%g dataset in %s' % (len(names), nc, opt.data) # check 64 | 65 | # Model 66 | pretrained = weights.endswith('.pt') 67 | if pretrained: 68 | with torch_distributed_zero_first(rank): 69 | attempt_download(weights) # download if not found locally 70 | ckpt = torch.load(weights, map_location=device) # load checkpoint 71 | if hyp.get('anchors'): 72 | ckpt['model'].yaml['anchors'] = round(hyp['anchors']) # force autoanchor 73 | model = Model(opt.cfg or ckpt['model'].yaml, ch=3, nc=nc).to(device) # create 74 | exclude = ['anchor'] if opt.cfg or hyp.get('anchors') else [] # exclude keys 75 | state_dict = ckpt['model'].float().state_dict() # to FP32 76 | state_dict = intersect_dicts(state_dict, model.state_dict(), exclude=exclude) # intersect 77 | model.load_state_dict(state_dict, strict=False) # load 78 | logger.info('Transferred %g/%g items from %s' % (len(state_dict), len(model.state_dict()), weights)) # report 79 | else: 80 | model = Model(opt.cfg, ch=3, nc=nc).to(device) # create 81 | 82 | # Freeze 83 | freeze = ['', ] # parameter names to freeze (full or partial) 84 | if any(freeze): 85 | for k, v in model.named_parameters(): 86 | if any(x in k for x in freeze): 87 | print('freezing %s' % k) 88 | v.requires_grad = False 89 | 90 | # Optimizer 91 | nbs = 64 # nominal batch size 92 | accumulate = max(round(nbs / total_batch_size), 1) # accumulate loss before optimizing 93 | hyp['weight_decay'] *= total_batch_size * accumulate / nbs # scale weight_decay 94 | 95 | pg0, pg1, pg2 = [], [], [] # optimizer parameter groups 96 | for k, v in model.named_parameters(): 97 | v.requires_grad = True 98 | if '.bias' in k: 99 | pg2.append(v) # biases 100 | elif '.weight' in k and '.bn' not in k: 101 | pg1.append(v) # apply weight decay 102 | else: 103 | pg0.append(v) # all else 104 | 105 | if opt.adam: 106 | optimizer = optim.Adam(pg0, lr=hyp['lr0'], betas=(hyp['momentum'], 0.999)) # adjust beta1 to momentum 107 | else: 108 | optimizer = optim.SGD(pg0, lr=hyp['lr0'], momentum=hyp['momentum'], nesterov=True) 109 | 110 | optimizer.add_param_group({'params': pg1, 'weight_decay': hyp['weight_decay']}) # add pg1 with weight_decay 111 | optimizer.add_param_group({'params': pg2}) # add pg2 (biases) 112 | logger.info('Optimizer groups: %g .bias, %g conv.weight, %g other' % (len(pg2), len(pg1), len(pg0))) 113 | del pg0, pg1, pg2 114 | 115 | # Scheduler https://arxiv.org/pdf/1812.01187.pdf 116 | # 
https://pytorch.org/docs/stable/_modules/torch/optim/lr_scheduler.html#OneCycleLR 117 | lf = lambda x: ((1 + math.cos(x * math.pi / epochs)) / 2) * (1 - hyp['lrf']) + hyp['lrf'] # cosine 118 | scheduler = lr_scheduler.LambdaLR(optimizer, lr_lambda=lf) 119 | # plot_lr_scheduler(optimizer, scheduler, epochs) 120 | 121 | # Logging 122 | if wandb and wandb.run is None: 123 | id = ckpt.get('wandb_id') if 'ckpt' in locals() else None 124 | wandb_run = wandb.init(config=opt, resume="allow", project=os.path.basename(log_dir), id=id) 125 | 126 | # Resume 127 | start_epoch, best_fitness = 0, 0.0 128 | if pretrained: 129 | # Optimizer 130 | if ckpt['optimizer'] is not None: 131 | optimizer.load_state_dict(ckpt['optimizer']) 132 | best_fitness = ckpt['best_fitness'] 133 | 134 | # Results 135 | if ckpt.get('training_results') is not None: 136 | with open(results_file, 'w') as file: 137 | file.write(ckpt['training_results']) # write results.txt 138 | 139 | # Epochs 140 | start_epoch = ckpt['epoch'] + 1 141 | if opt.resume: 142 | assert start_epoch > 0, '%s training to %g epochs is finished, nothing to resume.' % (weights, epochs) 143 | shutil.copytree(wdir, wdir.parent / f'weights_backup_epoch{start_epoch - 1}') # save previous weights 144 | if epochs < start_epoch: 145 | logger.info('%s has been trained for %g epochs. Fine-tuning for %g additional epochs.' % 146 | (weights, ckpt['epoch'], epochs)) 147 | epochs += ckpt['epoch'] # finetune additional epochs 148 | 149 | del ckpt, state_dict 150 | 151 | # Image sizes 152 | gs = int(max(model.stride)) # grid size (max stride) 153 | imgsz, imgsz_test = [check_img_size(x, gs) for x in opt.img_size] # verify imgsz are gs-multiples 154 | 155 | # DP mode 156 | if cuda and rank == -1 and torch.cuda.device_count() > 1: 157 | model = torch.nn.DataParallel(model) 158 | 159 | # SyncBatchNorm 160 | if opt.sync_bn and cuda and rank != -1: 161 | model = torch.nn.SyncBatchNorm.convert_sync_batchnorm(model).to(device) 162 | logger.info('Using SyncBatchNorm()') 163 | 164 | # Exponential moving average 165 | ema = ModelEMA(model) if rank in [-1, 0] else None 166 | 167 | # DDP mode 168 | if cuda and rank != -1: 169 | model = DDP(model, device_ids=[opt.local_rank], output_device=opt.local_rank) 170 | 171 | # Trainloader 172 | dataloader, dataset = create_dataloader(train_path, imgsz, batch_size, gs, opt, 173 | hyp=hyp, augment=True, cache=opt.cache_images, rect=opt.rect, 174 | rank=rank, world_size=opt.world_size, workers=opt.workers) 175 | mlc = np.concatenate(dataset.labels, 0)[:, 0].max() # max label class 176 | nb = len(dataloader) # number of batches 177 | assert mlc < nc, 'Label class %g exceeds nc=%g in %s. Possible class labels are 0-%g' % (mlc, nc, opt.data, nc - 1) 178 | 179 | # Process 0 180 | if rank in [-1, 0]: 181 | ema.updates = start_epoch * nb // accumulate # set EMA updates 182 | testloader = create_dataloader(test_path, imgsz_test, total_batch_size, gs, opt, 183 | hyp=hyp, augment=False, cache=opt.cache_images and not opt.notest, rect=True, 184 | rank=-1, world_size=opt.world_size, workers=opt.workers)[0] # testloader 185 | 186 | if not opt.resume: 187 | labels = np.concatenate(dataset.labels, 0) 188 | c = torch.tensor(labels[:, 0]) # classes 189 | # cf = torch.bincount(c.long(), minlength=nc) + 1. 
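# NOTE: a standalone sketch (not part of the original file, labels are
# hypothetical) of the commented-out class-frequency option in this block:
# bincount yields per-class counts cf, and _initialize_biases() in
# models/yolo.py seeds the class logits with log(cf / cf.sum()), so rare
# classes start with a lower prior confidence:
import torch
labels = torch.tensor([0, 0, 1, 2, 2, 2, 3])            # hypothetical class ids
cf = torch.bincount(labels, minlength=4).float() + 1.   # +1 avoids log(0)
print(torch.log(cf / cf.sum()))                         # per-class bias offsets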
# frequency 190 | # model._initialize_biases(cf.to(device)) 191 | plot_labels(labels, save_dir=log_dir) 192 | if tb_writer: 193 | # tb_writer.add_hparams(hyp, {}) # causes duplicate https://github.com/ultralytics/yolov5/pull/384 194 | tb_writer.add_histogram('classes', c, 0) 195 | 196 | # Anchors 197 | if not opt.noautoanchor: 198 | check_anchors(dataset, model=model, thr=hyp['anchor_t'], imgsz=imgsz) 199 | 200 | # Model parameters 201 | hyp['cls'] *= nc / 80. # scale coco-tuned hyp['cls'] to current dataset 202 | model.nc = nc # attach number of classes to model 203 | model.hyp = hyp # attach hyperparameters to model 204 | model.gr = 1.0 # iou loss ratio (obj_loss = 1.0 or iou) 205 | model.class_weights = labels_to_class_weights(dataset.labels, nc).to(device) # attach class weights 206 | model.names = names 207 | 208 | # Start training 209 | t0 = time.time() 210 | nw = max(round(hyp['warmup_epochs'] * nb), 1e3) # number of warmup iterations, max(3 epochs, 1k iterations) 211 | # nw = min(nw, (epochs - start_epoch) / 2 * nb) # limit warmup to < 1/2 of training 212 | maps = np.zeros(nc) # mAP per class 213 | results = (0, 0, 0, 0, 0, 0, 0) # P, R, mAP@.5, mAP@.5-.95, val_loss(box, obj, cls) 214 | scheduler.last_epoch = start_epoch - 1 # do not move 215 | scaler = amp.GradScaler(enabled=cuda) 216 | logger.info('Image sizes %g train, %g test\n' 217 | 'Using %g dataloader workers\nLogging results to %s\n' 218 | 'Starting training for %g epochs...' % (imgsz, imgsz_test, dataloader.num_workers, log_dir, epochs)) 219 | for epoch in range(start_epoch, epochs): # epoch ------------------------------------------------------------------ 220 | model.train() 221 | 222 | # Update image weights (optional) 223 | if opt.image_weights: 224 | # Generate indices 225 | if rank in [-1, 0]: 226 | cw = model.class_weights.cpu().numpy() * (1 - maps) ** 2 # class weights 227 | iw = labels_to_image_weights(dataset.labels, nc=nc, class_weights=cw) # image weights 228 | dataset.indices = random.choices(range(dataset.n), weights=iw, k=dataset.n) # rand weighted idx 229 | # Broadcast if DDP 230 | if rank != -1: 231 | indices = (torch.tensor(dataset.indices) if rank == 0 else torch.zeros(dataset.n)).int() 232 | dist.broadcast(indices, 0) 233 | if rank != 0: 234 | dataset.indices = indices.cpu().numpy() 235 | 236 | # Update mosaic border 237 | # b = int(random.uniform(0.25 * imgsz, 0.75 * imgsz + gs) // gs * gs) 238 | # dataset.mosaic_border = [b - imgsz, -b] # height, width borders 239 | 240 | mloss = torch.zeros(4, device=device) # mean losses 241 | if rank != -1: 242 | dataloader.sampler.set_epoch(epoch) 243 | pbar = enumerate(dataloader) 244 | logger.info(('\n' + '%10s' * 8) % ('Epoch', 'gpu_mem', 'box', 'obj', 'cls', 'total', 'targets', 'img_size')) 245 | if rank in [-1, 0]: 246 | pbar = tqdm(pbar, total=nb) # progress bar 247 | optimizer.zero_grad() 248 | for i, (imgs, targets, paths, _) in pbar: # batch ------------------------------------------------------------- 249 | ni = i + nb * epoch # number integrated batches (since train start) 250 | imgs = imgs.to(device, non_blocking=True).float() / 255.0 # uint8 to float32, 0-255 to 0.0-1.0 251 | 252 | # Warmup 253 | if ni <= nw: 254 | xi = [0, nw] # x interp 255 | # model.gr = np.interp(ni, xi, [0.0, 1.0]) # iou loss ratio (obj_loss = 1.0 or iou) 256 | accumulate = max(1, np.interp(ni, xi, [1, nbs / total_batch_size]).round()) 257 | for j, x in enumerate(optimizer.param_groups): 258 | # bias lr falls from 0.1 to lr0, all other lrs rise from 0.0 to lr0 259 | x['lr'] = 
np.interp(ni, xi, [hyp['warmup_bias_lr'] if j == 2 else 0.0, x['initial_lr'] * lf(epoch)]) 260 | if 'momentum' in x: 261 | x['momentum'] = np.interp(ni, xi, [hyp['warmup_momentum'], hyp['momentum']]) 262 | 263 | # Multi-scale 264 | if opt.multi_scale: 265 | sz = random.randrange(imgsz * 0.5, imgsz * 1.5 + gs) // gs * gs # size 266 | sf = sz / max(imgs.shape[2:]) # scale factor 267 | if sf != 1: 268 | ns = [math.ceil(x * sf / gs) * gs for x in imgs.shape[2:]] # new shape (stretched to gs-multiple) 269 | imgs = F.interpolate(imgs, size=ns, mode='bilinear', align_corners=False) 270 | 271 | # Forward 272 | with amp.autocast(enabled=cuda): 273 | pred = model(imgs) # forward 274 | loss, loss_items = compute_loss(pred, targets.to(device), model) # loss scaled by batch_size 275 | if rank != -1: 276 | loss *= opt.world_size # gradient averaged between devices in DDP mode 277 | 278 | # Backward 279 | scaler.scale(loss).backward() 280 | 281 | # Optimize 282 | if ni % accumulate == 0: 283 | scaler.step(optimizer) # optimizer.step 284 | scaler.update() 285 | optimizer.zero_grad() 286 | if ema: 287 | ema.update(model) 288 | 289 | # Print 290 | if rank in [-1, 0]: 291 | mloss = (mloss * i + loss_items) / (i + 1) # update mean losses 292 | mem = '%.3gG' % (torch.cuda.memory_reserved() / 1E9 if torch.cuda.is_available() else 0) # (GB) 293 | s = ('%10s' * 2 + '%10.4g' * 6) % ( 294 | '%g/%g' % (epoch, epochs - 1), mem, *mloss, targets.shape[0], imgs.shape[-1]) 295 | pbar.set_description(s) 296 | 297 | # Plot 298 | if ni < 3: 299 | f = str(log_dir / f'train_batch{ni}.jpg') # filename 300 | result = plot_images(images=imgs, targets=targets, paths=paths, fname=f) 301 | # if tb_writer and result is not None: 302 | # tb_writer.add_image(f, result, dataformats='HWC', global_step=epoch) 303 | # tb_writer.add_graph(model, imgs) # add model to tensorboard 304 | 305 | # end batch ------------------------------------------------------------------------------------------------ 306 | 307 | # Scheduler 308 | lr = [x['lr'] for x in optimizer.param_groups] # for tensorboard 309 | scheduler.step() 310 | 311 | # DDP process 0 or single-GPU 312 | if rank in [-1, 0]: 313 | # mAP 314 | if ema: 315 | ema.update_attr(model, include=['yaml', 'nc', 'hyp', 'gr', 'names', 'stride']) 316 | final_epoch = epoch + 1 == epochs 317 | if not opt.notest or final_epoch: # Calculate mAP 318 | results, maps, times = test.test(opt.data, 319 | batch_size=total_batch_size, 320 | imgsz=imgsz_test, 321 | model=ema.ema, 322 | single_cls=opt.single_cls, 323 | dataloader=testloader, 324 | save_dir=log_dir, 325 | plots=epoch == 0 or final_epoch, # plot first and last 326 | log_imgs=opt.log_imgs) 327 | 328 | # Write 329 | with open(results_file, 'a') as f: 330 | f.write(s + '%10.4g' * 7 % results + '\n') # P, R, mAP@.5, mAP@.5-.95, val_loss(box, obj, cls) 331 | if len(opt.name) and opt.bucket: 332 | os.system('gsutil cp %s gs://%s/results/results%s.txt' % (results_file, opt.bucket, opt.name)) 333 | 334 | # Log 335 | tags = ['train/giou_loss', 'train/obj_loss', 'train/cls_loss', # train loss 336 | 'metrics/precision', 'metrics/recall', 'metrics/mAP_0.5', 'metrics/mAP_0.5:0.95', 337 | 'val/giou_loss', 'val/obj_loss', 'val/cls_loss', # val loss 338 | 'x/lr0', 'x/lr1', 'x/lr2'] # params 339 | for x, tag in zip(list(mloss[:-1]) + list(results) + lr, tags): 340 | if tb_writer: 341 | tb_writer.add_scalar(tag, x, epoch) # tensorboard 342 | if wandb: 343 | wandb.log({tag: x}) # W&B 344 | 345 | # Update best mAP 346 | fi = fitness(np.array(results).reshape(1, -1)) 
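# NOTE: a standalone sketch (not part of the original file) of fitness();
# in upstream yolov5 of this era, utils/general.py reduces the four metrics
# to one scalar with fixed weights, effectively 90% mAP@.5:.95 plus 10% mAP@.5:
import numpy as np
def fitness_sketch(x):
    w = [0.0, 0.0, 0.1, 0.9]     # weights for [P, R, mAP@.5, mAP@.5:.95]
    return (x[:, :4] * w).sum(1)
print(fitness_sketch(np.array([[0.7, 0.6, 0.65, 0.45]])))  # -> [0.47]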
# weighted combination of [P, R, mAP@.5, mAP@.5-.95] 347 | if fi > best_fitness: 348 | best_fitness = fi 349 | 350 | # Save model 351 | save = (not opt.nosave) or (final_epoch and not opt.evolve) 352 | if save: 353 | with open(results_file, 'r') as f: # create checkpoint 354 | ckpt = {'epoch': epoch, 355 | 'best_fitness': best_fitness, 356 | 'training_results': f.read(), 357 | 'model': ema.ema, 358 | 'optimizer': None if final_epoch else optimizer.state_dict(), 359 | 'wandb_id': wandb_run.id if wandb else None} 360 | 361 | # Save last, best and delete 362 | torch.save(ckpt, last) 363 | if best_fitness == fi: 364 | torch.save(ckpt, best) 365 | del ckpt 366 | # end epoch ---------------------------------------------------------------------------------------------------- 367 | # end training 368 | 369 | if rank in [-1, 0]: 370 | # Strip optimizers 371 | n = opt.name if opt.name.isnumeric() else '' 372 | fresults, flast, fbest = log_dir / f'results{n}.txt', wdir / f'last{n}.pt', wdir / f'best{n}.pt' 373 | for f1, f2 in zip([wdir / 'last.pt', wdir / 'best.pt', results_file], [flast, fbest, fresults]): 374 | if os.path.exists(f1): 375 | os.rename(f1, f2) # rename 376 | if str(f2).endswith('.pt'): # is *.pt 377 | strip_optimizer(f2) # strip optimizer 378 | os.system('gsutil cp %s gs://%s/weights' % (f2, opt.bucket)) if opt.bucket else None # upload 379 | # Finish 380 | if not opt.evolve: 381 | plot_results(save_dir=log_dir) # save as results.png 382 | logger.info('%g epochs completed in %.3f hours.\n' % (epoch - start_epoch + 1, (time.time() - t0) / 3600)) 383 | 384 | dist.destroy_process_group() if rank not in [-1, 0] else None 385 | torch.cuda.empty_cache() 386 | return results 387 | 388 | 389 | if __name__ == '__main__': 390 | parser = argparse.ArgumentParser() 391 | parser.add_argument('--weights', type=str, default='yolov5s.pt', help='initial weights path') 392 | parser.add_argument('--cfg', type=str, default='', help='model.yaml path') 393 | parser.add_argument('--data', type=str, default='data/coco128.yaml', help='data.yaml path') 394 | parser.add_argument('--hyp', type=str, default='data/hyp.scratch.yaml', help='hyperparameters path') 395 | parser.add_argument('--epochs', type=int, default=300) 396 | parser.add_argument('--batch-size', type=int, default=16, help='total batch size for all GPUs') 397 | parser.add_argument('--img-size', nargs='+', type=int, default=[640, 640], help='[train, test] image sizes') 398 | parser.add_argument('--rect', action='store_true', help='rectangular training') 399 | parser.add_argument('--resume', nargs='?', const=True, default=False, help='resume most recent training') 400 | parser.add_argument('--nosave', action='store_true', help='only save final checkpoint') 401 | parser.add_argument('--notest', action='store_true', help='only test final epoch') 402 | parser.add_argument('--noautoanchor', action='store_true', help='disable autoanchor check') 403 | parser.add_argument('--evolve', action='store_true', help='evolve hyperparameters') 404 | parser.add_argument('--bucket', type=str, default='', help='gsutil bucket') 405 | parser.add_argument('--cache-images', action='store_true', help='cache images for faster training') 406 | parser.add_argument('--image-weights', action='store_true', help='use weighted image selection for training') 407 | parser.add_argument('--name', default='', help='renames experiment folder exp{N} to exp{N}_{name} if supplied') 408 | parser.add_argument('--device', default='', help='cuda device, i.e. 
0 or 0,1,2,3 or cpu') 409 | parser.add_argument('--multi-scale', action='store_true', help='vary img-size +/- 50%%') 410 | parser.add_argument('--single-cls', action='store_true', help='train as single-class dataset') 411 | parser.add_argument('--adam', action='store_true', help='use torch.optim.Adam() optimizer') 412 | parser.add_argument('--sync-bn', action='store_true', help='use SyncBatchNorm, only available in DDP mode') 413 | parser.add_argument('--local_rank', type=int, default=-1, help='DDP parameter, do not modify') 414 | parser.add_argument('--logdir', type=str, default='runs/', help='logging directory') 415 | parser.add_argument('--log-imgs', type=int, default=10, help='number of images for W&B logging, max 100') 416 | parser.add_argument('--workers', type=int, default=8, help='maximum number of dataloader workers') 417 | 418 | opt = parser.parse_args() 419 | 420 | # Set DDP variables 421 | opt.total_batch_size = opt.batch_size 422 | opt.world_size = int(os.environ['WORLD_SIZE']) if 'WORLD_SIZE' in os.environ else 1 423 | opt.global_rank = int(os.environ['RANK']) if 'RANK' in os.environ else -1 424 | set_logging(opt.global_rank) 425 | if opt.global_rank in [-1, 0]: 426 | check_git_status() 427 | 428 | # Resume 429 | if opt.resume: # resume an interrupted run 430 | ckpt = opt.resume if isinstance(opt.resume, str) else get_latest_run() # specified or most recent path 431 | log_dir = Path(ckpt).parent.parent # runs/exp0 432 | assert os.path.isfile(ckpt), 'ERROR: --resume checkpoint does not exist' 433 | with open(log_dir / 'opt.yaml') as f: 434 | opt = argparse.Namespace(**yaml.load(f, Loader=yaml.FullLoader)) # replace 435 | opt.cfg, opt.weights, opt.resume = '', ckpt, True 436 | logger.info('Resuming training from %s' % ckpt) 437 | 438 | else: 439 | # opt.hyp = opt.hyp or ('hyp.finetune.yaml' if opt.weights else 'hyp.scratch.yaml') 440 | opt.data, opt.cfg, opt.hyp = check_file(opt.data), check_file(opt.cfg), check_file(opt.hyp) # check files 441 | assert len(opt.cfg) or len(opt.weights), 'either --cfg or --weights must be specified' 442 | opt.img_size.extend([opt.img_size[-1]] * (2 - len(opt.img_size))) # extend to 2 sizes (train, test) 443 | log_dir = increment_dir(Path(opt.logdir) / 'exp', opt.name) # runs/exp1 444 | 445 | # DDP mode 446 | device = select_device(opt.device, batch_size=opt.batch_size) 447 | if opt.local_rank != -1: 448 | assert torch.cuda.device_count() > opt.local_rank 449 | torch.cuda.set_device(opt.local_rank) 450 | device = torch.device('cuda', opt.local_rank) 451 | dist.init_process_group(backend='nccl', init_method='env://') # distributed backend 452 | assert opt.batch_size % opt.world_size == 0, '--batch-size must be multiple of CUDA device count' 453 | opt.batch_size = opt.total_batch_size // opt.world_size 454 | 455 | # Hyperparameters 456 | with open(opt.hyp) as f: 457 | hyp = yaml.load(f, Loader=yaml.FullLoader) # load hyps 458 | if 'box' not in hyp: 459 | warn('Compatibility: %s missing "box" which was renamed from "giou" in %s' % 460 | (opt.hyp, 'https://github.com/ultralytics/yolov5/pull/1120')) 461 | hyp['box'] = hyp.pop('giou') 462 | 463 | # Train 464 | logger.info(opt) 465 | if not opt.evolve: 466 | tb_writer, wandb = None, None # init loggers 467 | if opt.global_rank in [-1, 0]: 468 | # Tensorboard 469 | logger.info(f'Start Tensorboard with "tensorboard --logdir {opt.logdir}", view at http://localhost:6006/') 470 | tb_writer = SummaryWriter(log_dir=log_dir) # runs/exp0 471 | 472 | # W&B 473 | try: 474 | import wandb 475 | 476 | assert 
os.environ.get('WANDB_DISABLED') != 'true' 477 | logger.info("Weights & Biases logging enabled, to disable set os.environ['WANDB_DISABLED'] = 'true'") 478 | except (ImportError, AssertionError): 479 | opt.log_imgs = 0 480 | logger.info("Install Weights & Biases for experiment logging via 'pip install wandb' (recommended)") 481 | 482 | train(hyp, opt, device, tb_writer, wandb) 483 | 484 | # Evolve hyperparameters (optional) 485 | else: 486 | # Hyperparameter evolution metadata (mutation scale 0-1, lower_limit, upper_limit) 487 | meta = {'lr0': (1, 1e-5, 1e-1), # initial learning rate (SGD=1E-2, Adam=1E-3) 488 | 'lrf': (1, 0.01, 1.0), # final OneCycleLR learning rate (lr0 * lrf) 489 | 'momentum': (0.3, 0.6, 0.98), # SGD momentum/Adam beta1 490 | 'weight_decay': (1, 0.0, 0.001), # optimizer weight decay 491 | 'warmup_epochs': (1, 0.0, 5.0), # warmup epochs (fractions ok) 492 | 'warmup_momentum': (1, 0.0, 0.95), # warmup initial momentum 493 | 'warmup_bias_lr': (1, 0.0, 0.2), # warmup initial bias lr 494 | 'box': (1, 0.02, 0.2), # box loss gain 495 | 'cls': (1, 0.2, 4.0), # cls loss gain 496 | 'cls_pw': (1, 0.5, 2.0), # cls BCELoss positive_weight 497 | 'obj': (1, 0.2, 4.0), # obj loss gain (scale with pixels) 498 | 'obj_pw': (1, 0.5, 2.0), # obj BCELoss positive_weight 499 | 'iou_t': (0, 0.1, 0.7), # IoU training threshold 500 | 'anchor_t': (1, 2.0, 8.0), # anchor-multiple threshold 501 | 'anchors': (2, 2.0, 10.0), # anchors per output grid (0 to ignore) 502 | 'fl_gamma': (0, 0.0, 2.0), # focal loss gamma (EfficientDet default gamma=1.5) 503 | 'hsv_h': (1, 0.0, 0.1), # image HSV-Hue augmentation (fraction) 504 | 'hsv_s': (1, 0.0, 0.9), # image HSV-Saturation augmentation (fraction) 505 | 'hsv_v': (1, 0.0, 0.9), # image HSV-Value augmentation (fraction) 506 | 'degrees': (1, 0.0, 45.0), # image rotation (+/- deg) 507 | 'translate': (1, 0.0, 0.9), # image translation (+/- fraction) 508 | 'scale': (1, 0.0, 0.9), # image scale (+/- gain) 509 | 'shear': (1, 0.0, 10.0), # image shear (+/- deg) 510 | 'perspective': (0, 0.0, 0.001), # image perspective (+/- fraction), range 0-0.001 511 | 'flipud': (1, 0.0, 1.0), # image flip up-down (probability) 512 | 'fliplr': (0, 0.0, 1.0), # image flip left-right (probability) 513 | 'mosaic': (1, 0.0, 1.0), # image mosaic (probability) 514 | 'mixup': (1, 0.0, 1.0)} # image mixup (probability) 515 | 516 | assert opt.local_rank == -1, 'DDP mode not implemented for --evolve' 517 | opt.notest, opt.nosave = True, True # only test/save final epoch 518 | # ei = [isinstance(x, (int, float)) for x in hyp.values()] # evolvable indices 519 | yaml_file = Path(opt.logdir) / 'evolve' / 'hyp_evolved.yaml' # save best result here 520 | if opt.bucket: 521 | os.system('gsutil cp gs://%s/evolve.txt .'
% opt.bucket) # download evolve.txt if exists 522 | 523 | for _ in range(300): # generations to evolve 524 | if os.path.exists('evolve.txt'): # if evolve.txt exists: select best hyps and mutate 525 | # Select parent(s) 526 | parent = 'single' # parent selection method: 'single' or 'weighted' 527 | x = np.loadtxt('evolve.txt', ndmin=2) 528 | n = min(5, len(x)) # number of previous results to consider 529 | x = x[np.argsort(-fitness(x))][:n] # top n mutations 530 | w = fitness(x) - fitness(x).min() # weights 531 | if parent == 'single' or len(x) == 1: 532 | # x = x[random.randint(0, n - 1)] # random selection 533 | x = x[random.choices(range(n), weights=w)[0]] # weighted selection 534 | elif parent == 'weighted': 535 | x = (x * w.reshape(n, 1)).sum(0) / w.sum() # weighted combination 536 | 537 | # Mutate 538 | mp, s = 0.8, 0.2 # mutation probability, sigma 539 | npr = np.random 540 | npr.seed(int(time.time())) 541 | g = np.array([x[0] for x in meta.values()]) # gains 0-1 542 | ng = len(meta) 543 | v = np.ones(ng) 544 | while all(v == 1): # mutate until a change occurs (prevent duplicates) 545 | v = (g * (npr.random(ng) < mp) * npr.randn(ng) * npr.random() * s + 1).clip(0.3, 3.0) 546 | for i, k in enumerate(hyp.keys()): # plt.hist(v.ravel(), 300) 547 | hyp[k] = float(x[i + 7] * v[i]) # mutate 548 | 549 | # Constrain to limits 550 | for k, v in meta.items(): 551 | hyp[k] = max(hyp[k], v[1]) # lower limit 552 | hyp[k] = min(hyp[k], v[2]) # upper limit 553 | hyp[k] = round(hyp[k], 5) # significant digits 554 | 555 | # Train mutation 556 | results = train(hyp.copy(), opt, device) 557 | 558 | # Write mutation results 559 | print_mutation(hyp.copy(), results, yaml_file, opt.bucket) 560 | 561 | # Plot results 562 | plot_evolution(yaml_file) 563 | print(f'Hyperparameter evolution complete. Best results saved as: {yaml_file}\n' 564 | f'Command to train a new model with these hyperparameters: $ python train.py --hyp {yaml_file}') 565 | -------------------------------------------------------------------------------- /v3.0/yolov5/utils/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/yyyanbj/mid-air-draw/9ce05fe981e9037d8c0151be66c0254f8f2523d5/v3.0/yolov5/utils/__init__.py -------------------------------------------------------------------------------- /v3.0/yolov5/utils/activations.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import torch.nn.functional as F 4 | 5 | 6 | # Swish https://arxiv.org/pdf/1905.02244.pdf --------------------------------------------------------------------------- 7 | class Swish(nn.Module): # 8 | @staticmethod 9 | def forward(x): 10 | return x * torch.sigmoid(x) 11 | 12 | 13 | class Hardswish(nn.Module): # export-friendly version of nn.Hardswish() 14 | @staticmethod 15 | def forward(x): 16 | # return x * F.hardsigmoid(x) # for torchscript and CoreML 17 | return x * F.hardtanh(x + 3, 0., 6.) / 6. 
# for torchscript, CoreML and ONNX 18 | 19 | 20 | class MemoryEfficientSwish(nn.Module): 21 | class F(torch.autograd.Function): 22 | @staticmethod 23 | def forward(ctx, x): 24 | ctx.save_for_backward(x) 25 | return x * torch.sigmoid(x) 26 | 27 | @staticmethod 28 | def backward(ctx, grad_output): 29 | x = ctx.saved_tensors[0] 30 | sx = torch.sigmoid(x) 31 | return grad_output * (sx * (1 + x * (1 - sx))) 32 | 33 | def forward(self, x): 34 | return self.F.apply(x) 35 | 36 | 37 | # Mish https://github.com/digantamisra98/Mish -------------------------------------------------------------------------- 38 | class Mish(nn.Module): 39 | @staticmethod 40 | def forward(x): 41 | return x * F.softplus(x).tanh() 42 | 43 | 44 | class MemoryEfficientMish(nn.Module): 45 | class F(torch.autograd.Function): 46 | @staticmethod 47 | def forward(ctx, x): 48 | ctx.save_for_backward(x) 49 | return x.mul(torch.tanh(F.softplus(x))) # x * tanh(ln(1 + exp(x))) 50 | 51 | @staticmethod 52 | def backward(ctx, grad_output): 53 | x = ctx.saved_tensors[0] 54 | sx = torch.sigmoid(x) 55 | fx = F.softplus(x).tanh() 56 | return grad_output * (fx + x * sx * (1 - fx * fx)) 57 | 58 | def forward(self, x): 59 | return self.F.apply(x) 60 | 61 | 62 | # FReLU https://arxiv.org/abs/2007.11824 ------------------------------------------------------------------------------- 63 | class FReLU(nn.Module): 64 | def __init__(self, c1, k=3): # ch_in, kernel 65 | super().__init__() 66 | self.conv = nn.Conv2d(c1, c1, k, 1, 1, groups=c1) 67 | self.bn = nn.BatchNorm2d(c1) 68 | 69 | def forward(self, x): 70 | return torch.max(x, self.bn(self.conv(x))) 71 | -------------------------------------------------------------------------- /v3.0/yolov5/utils/evolve.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | # Hyperparameter evolution commands (avoids CUDA memory leakage issues) 3 | # Replaces the Python 'for' loop over generations in train.py with a bash 'for' loop 4 | 5 | # Start on 4-GPU machine 6 | #for i in 0 1 2 3; do 7 | # t=ultralytics/yolov5:evolve && sudo docker pull $t && sudo docker run -d --ipc=host --gpus all -v "$(pwd)"/VOC:/usr/src/VOC $t bash utils/evolve.sh $i 8 | # sleep 60 # avoid simultaneous evolve.txt read/write 9 | #done 10 | 11 | # Hyperparameter evolution commands 12 | while true; do 13 | # python train.py --batch 64 --weights yolov5m.pt --data voc.yaml --img 512 --epochs 50 --evolve --bucket ult/evolve/voc --device $1 14 | python train.py --batch 40 --weights yolov5m.pt --data coco.yaml --img 640 --epochs 30 --evolve --bucket ult/evolve/coco --device $1 15 | done 16 | -------------------------------------------------------------------------------- /v3.0/yolov5/utils/google_app_engine/Dockerfile: -------------------------------------------------------------------------------- 1 | FROM gcr.io/google-appengine/python 2 | 3 | # Create a virtualenv for dependencies. This isolates these packages from 4 | # system-level packages. 5 | # Use -p python3 or -p python3.7 to select the Python version. Default is version 2. 6 | RUN virtualenv /env -p python3 7 | 8 | # Setting these environment variables is the same as running 9 | # source /env/bin/activate. 10 | ENV VIRTUAL_ENV /env 11 | ENV PATH /env/bin:$PATH 12 | 13 | RUN apt-get update && apt-get install -y python-opencv 14 | 15 | # Copy the application's requirements.txt and run pip to install all 16 | # dependencies into the virtualenv.
17 | ADD requirements.txt /app/requirements.txt 18 | RUN pip install -r /app/requirements.txt 19 | 20 | # Add the application source code. 21 | ADD . /app 22 | 23 | # Run a WSGI server to serve the application. gunicorn must be declared as 24 | # a dependency in requirements.txt. 25 | CMD gunicorn -b :$PORT main:app 26 | -------------------------------------------------------------------------------- /v3.0/yolov5/utils/google_app_engine/additional_requirements.txt: -------------------------------------------------------------------------------- 1 | # add these requirements in your app on top of the existing ones 2 | pip==18.1 3 | Flask==1.0.2 4 | gunicorn==19.9.0 5 | -------------------------------------------------------------------------------- /v3.0/yolov5/utils/google_app_engine/app.yaml: -------------------------------------------------------------------------------- 1 | runtime: custom 2 | env: flex 3 | 4 | service: yolov5app 5 | 6 | liveness_check: 7 | initial_delay_sec: 600 8 | 9 | manual_scaling: 10 | instances: 1 11 | resources: 12 | cpu: 1 13 | memory_gb: 4 14 | disk_size_gb: 20 -------------------------------------------------------------------------------- /v3.0/yolov5/utils/google_utils.py: -------------------------------------------------------------------------------- 1 | # This file contains google utils: https://cloud.google.com/storage/docs/reference/libraries 2 | # pip install --upgrade google-cloud-storage 3 | # from google.cloud import storage 4 | 5 | import os 6 | import platform 7 | import subprocess 8 | import time 9 | from pathlib import Path 10 | 11 | import torch 12 | 13 | 14 | def gsutil_getsize(url=''): 15 | # gs://bucket/file size https://cloud.google.com/storage/docs/gsutil/commands/du 16 | s = subprocess.check_output('gsutil du %s' % url, shell=True).decode('utf-8') 17 | return eval(s.split(' ')[0]) if len(s) else 0 # bytes 18 | 19 | 20 | def attempt_download(weights): 21 | # Attempt to download pretrained weights if not found locally 22 | weights = weights.strip().replace("'", '') 23 | file = Path(weights).name 24 | 25 | msg = weights + ' missing, try downloading from https://github.com/ultralytics/yolov5/releases/' 26 | models = ['yolov5s.pt', 'yolov5m.pt', 'yolov5l.pt', 'yolov5x.pt'] # available models 27 | 28 | if file in models and not os.path.isfile(weights): 29 | # Google Drive 30 | # d = {'yolov5s.pt': '1R5T6rIyy3lLwgFXNms8whc-387H0tMQO', 31 | # 'yolov5m.pt': '1vobuEExpWQVpXExsJ2w-Mbf3HJjWkQJr', 32 | # 'yolov5l.pt': '1hrlqD1Wdei7UT4OgT785BEk1JwnSvNEV', 33 | # 'yolov5x.pt': '1mM8aZJlWTxOg7BZJvNUMrTnA2AbeCVzS'} 34 | # r = gdrive_download(id=d[file], name=weights) if file in d else 1 35 | # if r == 0 and os.path.exists(weights) and os.path.getsize(weights) > 1E6: # check 36 | # return 37 | 38 | try: # GitHub 39 | url = 'https://github.com/ultralytics/yolov5/releases/download/v3.0/' + file 40 | print('Downloading %s to %s...' % (url, weights)) 41 | torch.hub.download_url_to_file(url, weights) 42 | assert os.path.exists(weights) and os.path.getsize(weights) > 1E6 # check 43 | except Exception as e: # GCP 44 | print('Download error: %s' % e) 45 | url = 'https://storage.googleapis.com/ultralytics/yolov5/ckpt/' + file 46 | print('Downloading %s to %s...' 
% (url, weights)) 47 | r = os.system('curl -L %s -o %s' % (url, weights)) # torch.hub.download_url_to_file(url, weights) 48 | finally: 49 | if not (os.path.exists(weights) and os.path.getsize(weights) > 1E6): # check 50 | os.remove(weights) if os.path.exists(weights) else None # remove partial downloads 51 | print('ERROR: Download failure: %s' % msg) 52 | print('') 53 | return 54 | 55 | 56 | def gdrive_download(id='1n_oKgR81BJtqk75b00eAjdv03qVCQn2f', name='coco128.zip'): 57 | # Downloads a file from Google Drive. from utils.google_utils import *; gdrive_download() 58 | t = time.time() 59 | 60 | print('Downloading https://drive.google.com/uc?export=download&id=%s as %s... ' % (id, name), end='') 61 | os.remove(name) if os.path.exists(name) else None # remove existing 62 | os.remove('cookie') if os.path.exists('cookie') else None 63 | 64 | # Attempt file download 65 | out = "NUL" if platform.system() == "Windows" else "/dev/null" 66 | os.system('curl -c ./cookie -s -L "drive.google.com/uc?export=download&id=%s" > %s ' % (id, out)) 67 | if os.path.exists('cookie'): # large file 68 | s = 'curl -Lb ./cookie "drive.google.com/uc?export=download&confirm=%s&id=%s" -o %s' % (get_token(), id, name) 69 | else: # small file 70 | s = 'curl -s -L -o %s "drive.google.com/uc?export=download&id=%s"' % (name, id) 71 | r = os.system(s) # execute, capture return 72 | os.remove('cookie') if os.path.exists('cookie') else None 73 | 74 | # Error check 75 | if r != 0: 76 | os.remove(name) if os.path.exists(name) else None # remove partial 77 | print('Download error ') # raise Exception('Download error') 78 | return r 79 | 80 | # Unzip if archive 81 | if name.endswith('.zip'): 82 | print('unzipping... ', end='') 83 | os.system('unzip -q %s' % name) # unzip 84 | os.remove(name) # remove zip to free space 85 | 86 | print('Done (%.1fs)' % (time.time() - t)) 87 | return r 88 | 89 | 90 | def get_token(cookie="./cookie"): 91 | with open(cookie) as f: 92 | for line in f: 93 | if "download" in line: 94 | return line.split()[-1] 95 | return "" 96 | 97 | # def upload_blob(bucket_name, source_file_name, destination_blob_name): 98 | # # Uploads a file to a bucket 99 | # # https://cloud.google.com/storage/docs/uploading-objects#storage-upload-object-python 100 | # 101 | # storage_client = storage.Client() 102 | # bucket = storage_client.get_bucket(bucket_name) 103 | # blob = bucket.blob(destination_blob_name) 104 | # 105 | # blob.upload_from_filename(source_file_name) 106 | # 107 | # print('File {} uploaded to {}.'.format( 108 | # source_file_name, 109 | # destination_blob_name)) 110 | # 111 | # 112 | # def download_blob(bucket_name, source_blob_name, destination_file_name): 113 | # # Downloads a blob from a bucket 114 | # storage_client = storage.Client() 115 | # bucket = storage_client.get_bucket(bucket_name) 116 | # blob = bucket.blob(source_blob_name) 117 | # 118 | # blob.download_to_filename(destination_file_name) 119 | # 120 | # print('Blob {} downloaded to {}.'.format( 121 | # source_blob_name, 122 | # destination_file_name)) 123 | -------------------------------------------------------------------------- /v3.0/yolov5/utils/torch_utils.py: -------------------------------------------------------------------------------- 1 | import logging 2 | import os 3 | import time 4 | from copy import deepcopy 5 | 6 | import math 7 | import torch 8 | import torch.backends.cudnn as cudnn 9 | import torch.nn as nn 10 | import torch.nn.functional as F 11 | import torchvision 12 | 13 | logger = logging.getLogger(__name__) 14 | 15 | 16
| def init_torch_seeds(seed=0): 17 | torch.manual_seed(seed) 18 | 19 | # Speed-reproducibility tradeoff https://pytorch.org/docs/stable/notes/randomness.html 20 | if seed == 0: # slower, more reproducible 21 | cudnn.deterministic = True 22 | cudnn.benchmark = False 23 | else: # faster, less reproducible 24 | cudnn.deterministic = False 25 | cudnn.benchmark = True 26 | 27 | 28 | def select_device(device='', batch_size=None): 29 | # device = 'cpu' or '0' or '0,1,2,3' 30 | cpu_request = device.lower() == 'cpu' 31 | if device and not cpu_request: # if device requested other than 'cpu' 32 | os.environ['CUDA_VISIBLE_DEVICES'] = device # set environment variable 33 | assert torch.cuda.is_available(), 'CUDA unavailable, invalid device %s requested' % device # check availability 34 | 35 | cuda = False if cpu_request else torch.cuda.is_available() 36 | if cuda: 37 | c = 1024 ** 2 # bytes to MB 38 | ng = torch.cuda.device_count() 39 | if ng > 1 and batch_size: # check that batch_size is compatible with device_count 40 | assert batch_size % ng == 0, 'batch-size %g not multiple of GPU count %g' % (batch_size, ng) 41 | x = [torch.cuda.get_device_properties(i) for i in range(ng)] 42 | s = 'Using CUDA ' 43 | for i in range(0, ng): 44 | if i == 1: 45 | s = ' ' * len(s) 46 | logger.info("%sdevice%g _CudaDeviceProperties(name='%s', total_memory=%dMB)" % 47 | (s, i, x[i].name, x[i].total_memory / c)) 48 | else: 49 | logger.info('Using CPU') 50 | 51 | logger.info('') # skip a line 52 | return torch.device('cuda:0' if cuda else 'cpu') 53 | 54 | 55 | def time_synchronized(): 56 | torch.cuda.synchronize() if torch.cuda.is_available() else None 57 | return time.time() 58 | 59 | 60 | def is_parallel(model): 61 | return type(model) in (nn.parallel.DataParallel, nn.parallel.DistributedDataParallel) 62 | 63 | 64 | def intersect_dicts(da, db, exclude=()): 65 | # Dictionary intersection of matching keys and shapes, omitting 'exclude' keys, using da values 66 | return {k: v for k, v in da.items() if k in db and not any(x in k for x in exclude) and v.shape == db[k].shape} 67 | 68 | 69 | def initialize_weights(model): 70 | for m in model.modules(): 71 | t = type(m) 72 | if t is nn.Conv2d: 73 | pass # nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu') 74 | elif t is nn.BatchNorm2d: 75 | m.eps = 1e-3 76 | m.momentum = 0.03 77 | elif t in [nn.Hardswish, nn.LeakyReLU, nn.ReLU, nn.ReLU6]: 78 | m.inplace = True 79 | 80 | 81 | def find_modules(model, mclass=nn.Conv2d): 82 | # Finds layer indices matching module class 'mclass' 83 | return [i for i, m in enumerate(model.module_list) if isinstance(m, mclass)] 84 | 85 | 86 | def sparsity(model): 87 | # Return global model sparsity 88 | a, b = 0., 0. 89 | for p in model.parameters(): 90 | a += p.numel() 91 | b += (p == 0).sum() 92 | return b / a 93 | 94 | 95 | def prune(model, amount=0.3): 96 | # Prune model to requested global sparsity 97 | import torch.nn.utils.prune as prune 98 | print('Pruning model...
', end='') 99 | for name, m in model.named_modules(): 100 | if isinstance(m, nn.Conv2d): 101 | prune.l1_unstructured(m, name='weight', amount=amount) # prune 102 | prune.remove(m, 'weight') # make permanent 103 | print(' %.3g global sparsity' % sparsity(model)) 104 | 105 | 106 | def fuse_conv_and_bn(conv, bn): 107 | # Fuse convolution and batchnorm layers https://tehnokv.com/posts/fusing-batchnorm-and-conv/ 108 | 109 | # init 110 | fusedconv = nn.Conv2d(conv.in_channels, 111 | conv.out_channels, 112 | kernel_size=conv.kernel_size, 113 | stride=conv.stride, 114 | padding=conv.padding, 115 | groups=conv.groups, 116 | bias=True).requires_grad_(False).to(conv.weight.device) 117 | 118 | # prepare filters 119 | w_conv = conv.weight.clone().view(conv.out_channels, -1) 120 | w_bn = torch.diag(bn.weight.div(torch.sqrt(bn.eps + bn.running_var))) 121 | fusedconv.weight.copy_(torch.mm(w_bn, w_conv).view(fusedconv.weight.size())) 122 | 123 | # prepare spatial bias 124 | b_conv = torch.zeros(conv.weight.size(0), device=conv.weight.device) if conv.bias is None else conv.bias 125 | b_bn = bn.bias - bn.weight.mul(bn.running_mean).div(torch.sqrt(bn.running_var + bn.eps)) 126 | fusedconv.bias.copy_(torch.mm(w_bn, b_conv.reshape(-1, 1)).reshape(-1) + b_bn) 127 | 128 | return fusedconv 129 | 130 | 131 | def model_info(model, verbose=False): 132 | # Prints a line-by-line description of a PyTorch model 133 | n_p = sum(x.numel() for x in model.parameters()) # number parameters 134 | n_g = sum(x.numel() for x in model.parameters() if x.requires_grad) # number gradients 135 | if verbose: 136 | print('%5s %40s %9s %12s %20s %10s %10s' % ('layer', 'name', 'gradient', 'parameters', 'shape', 'mu', 'sigma')) 137 | for i, (name, p) in enumerate(model.named_parameters()): 138 | name = name.replace('module_list.', '') 139 | print('%5g %40s %9s %12g %20s %10.3g %10.3g' % 140 | (i, name, p.requires_grad, p.numel(), list(p.shape), p.mean(), p.std())) 141 | 142 | try: # FLOPS 143 | from thop import profile 144 | flops = profile(deepcopy(model), inputs=(torch.zeros(1, 3, 64, 64),), verbose=False)[0] / 1E9 * 2 145 | fs = ', %.1f GFLOPS' % (flops * 100) # 640x640 GFLOPS (64x64 profile scaled x100) 146 | except: 147 | fs = '' 148 | 149 | logger.info( 150 | 'Model Summary: %g layers, %g parameters, %g gradients%s' % (len(list(model.parameters())), n_p, n_g, fs)) 151 | 152 | 153 | def load_classifier(name='resnet101', n=2): 154 | # Loads a pretrained model reshaped to n-class output 155 | model = torchvision.models.__dict__[name](pretrained=True) 156 | 157 | # ResNet model properties 158 | # input_size = [3, 224, 224] 159 | # input_space = 'RGB' 160 | # input_range = [0, 1] 161 | # mean = [0.485, 0.456, 0.406] 162 | # std = [0.229, 0.224, 0.225] 163 | 164 | # Reshape output to n classes 165 | filters = model.fc.weight.shape[1] 166 | model.fc.bias = nn.Parameter(torch.zeros(n), requires_grad=True) 167 | model.fc.weight = nn.Parameter(torch.zeros(n, filters), requires_grad=True) 168 | model.fc.out_features = n 169 | return model 170 | 171 | 172 | def scale_img(img, ratio=1.0, same_shape=False): # img(16,3,256,416), r=ratio 173 | # scales img(bs,3,y,x) by ratio 174 | if ratio == 1.0: 175 | return img 176 | else: 177 | h, w = img.shape[2:] 178 | s = (int(h * ratio), int(w * ratio)) # new size 179 | img = F.interpolate(img, size=s, mode='bilinear', align_corners=False) # resize 180 | if not same_shape: # pad/crop img 181 | gs = 32 # (pixels) grid size 182 | h, w = [math.ceil(x * ratio / gs) * gs for x in (h, w)] 183 | return F.pad(img, [0, w - s[1], 0, h - s[0]],
value=0.447) # value = imagenet mean 184 | 185 | 186 | def copy_attr(a, b, include=(), exclude=()): 187 | # Copy attributes from b to a, options to only include [...] and to exclude [...] 188 | for k, v in b.__dict__.items(): 189 | if (len(include) and k not in include) or k.startswith('_') or k in exclude: 190 | continue 191 | else: 192 | setattr(a, k, v) 193 | 194 | 195 | class ModelEMA: 196 | """ Model Exponential Moving Average from https://github.com/rwightman/pytorch-image-models 197 | Keep a moving average of everything in the model state_dict (parameters and buffers). 198 | This is intended to allow functionality like 199 | https://www.tensorflow.org/api_docs/python/tf/train/ExponentialMovingAverage 200 | A smoothed version of the weights is necessary for some training schemes to perform well. 201 | This class is sensitive to where it is initialized in the sequence of model init, 202 | GPU assignment and distributed training wrappers. 203 | """ 204 | 205 | def __init__(self, model, decay=0.9999, updates=0): 206 | # Create EMA 207 | self.ema = deepcopy(model.module if is_parallel(model) else model).eval() # FP32 EMA 208 | # if next(model.parameters()).device.type != 'cpu': 209 | # self.ema.half() # FP16 EMA 210 | self.updates = updates # number of EMA updates 211 | self.decay = lambda x: decay * (1 - math.exp(-x / 2000)) # decay exponential ramp (to help early epochs) 212 | for p in self.ema.parameters(): 213 | p.requires_grad_(False) 214 | 215 | def update(self, model): 216 | # Update EMA parameters 217 | with torch.no_grad(): 218 | self.updates += 1 219 | d = self.decay(self.updates) 220 | 221 | msd = model.module.state_dict() if is_parallel(model) else model.state_dict() # model state_dict 222 | for k, v in self.ema.state_dict().items(): 223 | if v.dtype.is_floating_point: 224 | v *= d 225 | v += (1. - d) * msd[k].detach() 226 | 227 | def update_attr(self, model, include=(), exclude=('process_group', 'reducer')): 228 | # Update EMA attributes 229 | copy_attr(self.ema, model, include, exclude) 230 | -------------------------------------------------------------------------------- /v3.0/yolov5/weights/download_weights.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | # Download common models 3 | 4 | python -c " 5 | from utils.google_utils import *; 6 | attempt_download('weights/yolov5s.pt'); 7 | attempt_download('weights/yolov5m.pt'); 8 | attempt_download('weights/yolov5l.pt'); 9 | attempt_download('weights/yolov5x.pt') 10 | " 11 | -------------------------------------------------------------------------------- /v3.0/yolov5/yolo_data/labels/train.cache: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/yyyanbj/mid-air-draw/9ce05fe981e9037d8c0151be66c0254f8f2523d5/v3.0/yolov5/yolo_data/labels/train.cache -------------------------------------------------------------------------------- /v3.0/yolov5/yolo_data/labels/validation.cache: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/yyyanbj/mid-air-draw/9ce05fe981e9037d8c0151be66c0254f8f2523d5/v3.0/yolov5/yolo_data/labels/validation.cache --------------------------------------------------------------------------------
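The `ModelEMA` class in `v3.0/yolov5/utils/torch_utils.py` above is what `train.py` actually evaluates and checkpoints (`model=ema.ema` at mAP time, `'model': ema.ema` in the saved ckpt): it keeps a shadow copy of the model whose floating-point state is an exponential moving average of the raw training weights, with a decay that ramps from near 0 toward 0.9999 so the average tracks the fast-moving weights closely in early epochs. Below is a minimal, self-contained sketch of that update rule; the `nn.Linear` toy model and the three-step loop are illustrative stand-ins, not part of this repo.

```python
import copy
import math

import torch
import torch.nn as nn

model = nn.Linear(4, 2)            # toy stand-in for the YOLOv5 model
ema = copy.deepcopy(model).eval()  # FP32 shadow copy, as in ModelEMA.__init__
for p in ema.parameters():
    p.requires_grad_(False)        # the EMA copy is never trained directly

decay, updates = 0.9999, 0
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
for step in range(3):              # pretend training steps
    loss = model(torch.randn(8, 4)).sum()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    updates += 1
    d = decay * (1 - math.exp(-updates / 2000))  # ramped decay: ~0 early, -> 0.9999 late
    with torch.no_grad():
        msd = model.state_dict()
        for k, v in ema.state_dict().items():
            if v.dtype.is_floating_point:
                v *= d                            # keep a fraction d of the old average
                v += (1.0 - d) * msd[k].detach()  # blend in (1 - d) of the new weights
```

Because `state_dict()` returns references to the module's live tensors, the in-place `v *= d` / `v += (1 - d) * msd[k]` pair mutates the shadow module directly; this is the same mechanism `ModelEMA.update()` relies on.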