├── .gitignore
├── LICENSE
├── README.md
├── README_CN.md
├── docs
│   ├── _config.yml
│   └── index.md
├── v1.0
│   └── main.py
├── v2.0
│   └── gesture.py
└── v3.0
    ├── 01_image_processing_and_data_augmentation.ipynb
    ├── 02_munge_data.py
    ├── 03_Modeling_and_Inference.ipynb
    ├── LICENSE
    ├── README.md
    ├── modeling_data
    │   └── aug_data
    │       └── annotations.csv
    ├── ord.txt
    ├── windows_v1.8.1
    │   ├── data
    │   │   └── predefined_classes.txt
    │   └── labelImg.exe
    └── yolov5
        ├── .dockerignore
        ├── .gitattributes
        ├── .github
        │   ├── ISSUE_TEMPLATE
        │   │   ├── --bug-report.md
        │   │   ├── --feature-request.md
        │   │   └── -question.md
        │   └── workflows
        │       ├── ci-testing.yml
        │       ├── greetings.yml
        │       ├── rebase.yml
        │       └── stale.yml
        ├── .gitignore
        ├── Dockerfile
        ├── LICENSE
        ├── README.md
        ├── config.yaml
        ├── detect.py
        ├── hubconf.py
        ├── models
        │   ├── __init__.py
        │   ├── common.py
        │   ├── experimental.py
        │   ├── export.py
        │   ├── hub
        │   │   ├── yolov3-spp.yaml
        │   │   ├── yolov5-fpn.yaml
        │   │   └── yolov5-panet.yaml
        │   ├── yolo.py
        │   ├── yolov5l.yaml
        │   ├── yolov5m.yaml
        │   ├── yolov5s.yaml
        │   └── yolov5x.yaml
        ├── requirements.txt
        ├── sotabench.py
        ├── test.py
        ├── train.py
        ├── tutorial.ipynb
        ├── utils
        │   ├── __init__.py
        │   ├── activations.py
        │   ├── datasets.py
        │   ├── evolve.sh
        │   ├── general.py
        │   ├── google_app_engine
        │   │   ├── Dockerfile
        │   │   ├── additional_requirements.txt
        │   │   └── app.yaml
        │   ├── google_utils.py
        │   └── torch_utils.py
        ├── weights
        │   └── download_weights.sh
        └── yolo_data
            └── labels
                ├── train.cache
                └── validation.cache
/.gitignore:
--------------------------------------------------------------------------------
1 | # visual studio code
2 | .vscode/
3 | .idea/
4 |
5 | # python
6 | venv/
7 | virtualenv/
8 | __pycache__
9 |
10 | # misc
11 | .DS_Store
12 |
13 | # results
14 | *.npy
15 | *.npz
16 | *.png
17 | *.PNG
18 | *.jpg
19 | *.JPG
20 | *.jpeg
21 |
22 | # notebook checkpoints
23 | .ipynb_checkpoints
24 |
25 | # path
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | ## Hi 👋
2 |
3 | Since you're here, why not star this project? And please forgive my poor English.
4 |
5 | Welcome to star this repo!
6 |
7 | Mid-air brush [Demo]
8 |
9 | README [EN|CN]
10 |
11 | ## Description
12 |
13 | Mid-air gesture recognition and drawing: by default, gesture 1 is the brush, gesture 2 changes the color, and gesture 5 clears the drawing board.
14 | The display is based on OpenCV.
15 |
16 |
17 | ## Change Log
18 |
19 | ### v3.0
20 |
21 | This version of the project is based on GA_Data_Science_Capstone.
22 |
23 | Yolo_v5 is used to recognize the gestures and the index finger for drawing. Please build your own gesture dataset and label it; data preprocessing is covered in files 01 and 02.
24 | The project can also run with a Raspberry Pi: the Pi collects images and streams them to a computer for inference, with some latency.
25 |
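A rough sketch of how the detections can drive the drawing board described above (illustrative only; the class names `1`, `2`, `5` and `forefinger` are the labels used in this project, while the detection format and the surrounding loop are assumptions, not part of the repo):

```python
import cv2
import numpy as np

COLORS = [(255, 0, 0), (0, 255, 0), (0, 0, 255)]  # brush colors to cycle through

def update_canvas(canvas, detections, state):
    """Apply one frame of YOLOv5 detections to the drawing board.

    Assumes each detection is (class_name, x, y) with the point in canvas pixels;
    producing these tuples from detect.py output is left out of this sketch.
    """
    classes = {cls for cls, _, _ in detections}
    if '5' in classes:                    # gesture 5: clear the drawing board
        canvas[:] = 255
        state['last'] = None
    elif '2' in classes:                  # gesture 2: change to the next color
        state['color'] = (state['color'] + 1) % len(COLORS)
    elif '1' in classes:                  # gesture 1: brush -> draw at the index fingertip
        for cls, x, y in detections:
            if cls == 'forefinger':
                if state.get('last') is not None:
                    cv2.line(canvas, state['last'], (x, y), COLORS[state['color']], 3)
                state['last'] = (x, y)
    return canvas

# Example usage with an empty white board:
canvas = np.full((360, 640, 3), 255, dtype=np.uint8)
state = {'color': 0, 'last': None}
canvas = update_canvas(canvas, [('1', 0, 0), ('forefinger', 320, 180)], state)
```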
26 | #### How to run
27 |
28 | ```sh
29 | cd v3.0
30 | pip install -r requirements.txt
31 | jupyter notebook
32 |
33 | # open and run 01_image_processing_and_data_augmentation.ipynb
34 |
35 | # run labelImg to label data 1, 2, 5, forefinger
36 |
37 | python 02_munge_data.py
38 |
39 | # train model
40 | python train.py --img 512 --batch 16 --epochs 100 --data config.yaml --cfg models/yolov5s.yaml --name yolo_example
41 | tensorboard --logdir runs/
42 |
43 | # run using the PC camera
44 | python detect.py --weights weights/best.pt --img 512 --conf 0.3 --source 0
45 |
46 | # run using a Raspberry Pi stream
47 | # run on raspi
48 | sudo raspivid -o - -rot 180 -t 0 -w 640 -h 360 -fps 30|cvlc -vvv stream:///dev/stdin --sout '#standard{access=http,mux=ts,dst=:8080}' :demux=h264
49 | # run on pc
50 | python detect.py --weights runs/exp12_yolo_example/weights/best.pt --img 512 --conf 0.15 --source http://192.168.43.46:8080/
51 | ```
52 |
53 | ### v2.0
54 |
55 | Gesture recognition based on OpenCV and convex hull detection.
56 | Skin color detection + convex hull + contour counting (to count the number of fingers).
57 |
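A condensed sketch of that pipeline, distilled from `v2.0/gesture.py` (the skin mask uses the Otsu-on-Cr variant that gesture.py selects by default, and the 90° defect-angle rule is the same heuristic used there; thresholds are illustrative):

```python
import math
import cv2
import numpy as np

def count_fingers(frame):
    """Skin mask -> largest contour -> convexity defects -> rough finger-gap count."""
    ycrcb = cv2.cvtColor(frame, cv2.COLOR_BGR2YCR_CB)
    _, cr, _ = cv2.split(ycrcb)
    cr = cv2.GaussianBlur(cr, (5, 5), 0)
    _, mask = cv2.threshold(cr, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    contours, _ = cv2.findContours(mask, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return 0
    cnt = max(contours, key=cv2.contourArea)          # largest contour = the hand
    hull = cv2.convexHull(cnt, returnPoints=False)
    if len(hull) <= 3:
        return 0
    defects = cv2.convexityDefects(cnt, hull)
    if defects is None:
        return 0
    fingers = 0
    for s, e, f, _ in defects[:, 0]:
        start, end, far = cnt[s][0], cnt[e][0], cnt[f][0]
        a = np.linalg.norm(end - start)
        b = np.linalg.norm(far - start)
        c = np.linalg.norm(end - far)
        if b * c == 0:
            continue
        cos_angle = max(-1.0, min(1.0, (b ** 2 + c ** 2 - a ** 2) / (2 * b * c)))
        if math.acos(cos_angle) <= math.pi / 2:        # sharp valley between two fingers
            fingers += 1
    return fingers
```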
58 | #### How to run
59 |
60 | ```sh
61 | cd v2.0
62 | python gesture.py
63 | ```
64 |
65 |
66 |
67 | ### v1.0
68 |
69 | Skin color detection + convex hull based on OpenCV.
70 |
71 |
72 | #### How to run
73 | ```sh
74 | cd v1.0
75 | python main.py
76 | ```
77 |
78 |
79 |
80 |
--------------------------------------------------------------------------------
/README_CN.md:
--------------------------------------------------------------------------------
1 | ## Hi 👋
2 |
3 | Since you're here, why not give it a star?
4 |
5 | Welcome to star this repo
6 |
7 | Mid-air brush [Demo]
8 |
9 | README [EN|CN]
10 |
11 | ## Description
12 |
13 | Mid-air gesture recognition and drawing: by default, gesture 1 is the brush, gesture 2 changes the color, and gesture 5 clears the drawing board.
14 | The display is based on OpenCV.
15 |
16 |
17 | ## Change Log
18 |
19 | ### v3.0
20 |
21 | This version of the project is based on GA_Data_Science_Capstone.
22 |
23 | Yolo_v5 is used to recognize the gestures and the index finger for drawing. Please build your own gesture dataset and label it; data preprocessing is covered in files 01 and 02.
24 | The project can be ported to a Raspberry Pi: the Pi collects images and streams them to a computer for inference, with some latency.
25 |
26 | #### How to run
27 |
28 | ```sh
29 | cd v3.0
30 | pip install -r requirements.txt
31 | jupyter notebook
32 |
33 | # open and run 01_image_processing_and_data_augmentation.ipynb
34 |
35 | # run labelImg to label data 1, 2, 5, forefinger
36 |
37 | python 02_munge_data.py
38 |
39 | # train model
40 | python train.py --img 512 --batch 16 --epochs 100 --data config.yaml --cfg models/yolov5s.yaml --name yolo_example
41 | tensorboard --logdir runs/
42 |
43 | # run using the PC camera
44 | python detect.py --weights weights/best.pt --img 512 --conf 0.3 --source 0
45 |
46 | # run using a Raspberry Pi stream
47 | # run on raspi
48 | sudo raspivid -o - -rot 180 -t 0 -w 640 -h 360 -fps 30|cvlc -vvv stream:///dev/stdin --sout '#standard{access=http,mux=ts,dst=:8080}' :demux=h264
49 | # run on pc
50 | python detect.py --weights runs/exp12_yolo_example/weights/best.pt --img 512 --conf 0.15 --source http://192.168.43.46:8080/
51 | ```
52 |
53 | ### v2.0
54 |
55 | Gesture recognition based on OpenCV and convex hull detection.
56 | Skin color detection + convex hull + contour counting (to count the number of fingers).
57 |
58 | #### How to run
59 |
60 | ```sh
61 | cd v2.0
62 | python gesture.py
63 | ```
64 |
65 |
66 |
67 | ### v1.0
68 |
69 | Skin color detection + convex hull based on OpenCV.
70 |
71 |
72 | #### How to run
73 | ```sh
74 | cd v1.0
75 | python main.py
76 | ```
77 |
78 |
79 |
80 |
--------------------------------------------------------------------------------
/docs/_config.yml:
--------------------------------------------------------------------------------
1 | theme: jekyll-theme-cayman
--------------------------------------------------------------------------------
/docs/index.md:
--------------------------------------------------------------------------------
1 | ## mid-air-draw
2 |
3 | mid-air-draw[Demo]
4 |
5 | Welcome to star this repo
6 |
7 | ### v1.0
8 | Skin color detection + convex hull
9 |
10 | ```sh
11 | cd v1.0
12 | python main.py
13 | ```
14 |
15 | ### v2.0
16 | Skin color detection + convex hull + contour counting (to count the number of fingers)
17 |
18 | #### How to run
19 |
20 | ```sh
21 | cd v2.0
22 | python gesture.py
23 | ```
24 |
25 |
26 | ### v3.0
27 |
28 | ```sh
29 | cd v3.0
30 | pip install -r requirements.txt
31 | jupyter notebook
32 |
33 | # open and run 01_image_processing_and_data_augmentation.ipynb
34 |
35 | # run labelImg to label data 1, 2, 5, forefinger
36 |
37 | python 02_munge_data.py
38 |
39 | # train model
40 | python train.py --img 512 --batch 16 --epochs 100 --data config.yaml --cfg models/yolov5s.yaml --name yolo_example
41 | tensorboard --logdir runs/
42 |
43 | # run using the PC camera
44 | python detect.py --weights weights/best.pt --img 512 --conf 0.3 --source 0
45 |
46 | # run using a Raspberry Pi stream
47 | # run on raspi
48 | sudo raspivid -o - -rot 180 -t 0 -w 640 -h 360 -fps 30|cvlc -vvv stream:///dev/stdin --sout '#standard{access=http,mux=ts,dst=:8080}' :demux=h264
49 | # run on pc
50 | python detect.py --weights runs/exp12_yolo_example/weights/best.pt --img 512 --conf 0.15 --source http://192.168.43.46:8080/
51 | ```
--------------------------------------------------------------------------------
/v1.0/main.py:
--------------------------------------------------------------------------------
1 | import cv2
2 | import numpy as np
3 |
4 |
5 | def main():
6 | cap = cv2.VideoCapture(0)
7 | init = 0
8 | last_point = 0
9 | font = cv2.FONT_HERSHEY_SIMPLEX # font for on-screen text
10 | size = 0.5 # font scale
11 | width, height = 300, 300 # size of the capture window
12 | x0, y0 = 100, 100 # top-left corner of the ROI
13 | while cap.isOpened():
14 | ret, img = cap.read()
15 | img = cv2.flip(img, 2)
16 | roi = binaryMask(img, x0, y0, width, height)
17 | res = skinMask(roi)
18 | contours = getContours(res)
19 | if init == 0:
20 | img2 = roi.copy()
21 | img2[:, :, :] = 255
22 | init = 1
23 |
24 | print(len(contours))
25 | if len(contours) > 0:
26 | first = [x[0] for x in contours[0]]
27 | first = np.array(first[:])
28 | print(first)
29 | y_min = roi.shape[1]
30 | idx = 0
31 | for i, (x, y) in enumerate(first):
32 | if y < y_min:
33 | y_min = y
34 | idx = i
35 | print(first[idx])
36 | point = (first[idx][0], first[idx][1])
37 | cv2.circle(img2, point, 1, (255, 0, 0))
38 | if last_point != 0:
39 | cv2.line(img2, point, last_point, (255, 0, 0), 1)
40 | last_point = point
41 |
42 | # print(img2)
43 | cv2.drawContours(roi, contours, -1, (0, 255, 0), 2)
44 | cv2.imshow('capture', img)
45 | cv2.imshow('roi', roi)
46 | cv2.imshow('draw', img2)
47 | k = cv2.waitKey(10)
48 | if k == 27:
49 | break
50 |
51 |
52 | def getContours(img):
53 | kernel = np.ones((5, 5), np.uint8)
54 | closed = cv2.morphologyEx(img, cv2.MORPH_OPEN, kernel)
55 | closed = cv2.morphologyEx(closed, cv2.MORPH_CLOSE, kernel)
56 | contours, h = cv2.findContours(closed, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
57 | validContours = []
58 | for cont in contours:
59 | if cv2.contourArea(cont) > 9000:
60 | # x,y,w,h = cv2.boundingRect(cont)
61 | # if h/w >0.75:
62 | # filter face failed
63 | validContours.append(cv2.convexHull(cont))
64 | # print(cv2.convexHull(cont))
65 | # rect = cv2.minAreaRect(cont)
66 | # box = cv2.cv.BoxPoint(rect)
67 | # vaildContours.append(np.int0(box))
68 | return validContours
69 |
70 |
71 | def binaryMask(frame, x0, y0, width, height):
72 | cv2.rectangle(frame, (x0, y0), (x0 + width, y0 + height), (0, 255, 0)) # draw the gesture capture box
73 | roi = frame[y0:y0 + height, x0:x0 + width] # crop the gesture ROI
74 | return roi
75 |
76 |
77 | def HSVBin(img):
78 | hsv = cv2.cvtColor(img, cv2.COLOR_RGB2HSV)
79 |
80 | lower_skin = np.array([100, 50, 0])
81 | upper_skin = np.array([125, 255, 255])
82 |
83 | mask = cv2.inRange(hsv, lower_skin, upper_skin)
84 | # res = cv2.bitwise_and(img,img,mask=mask)
85 | return mask
86 |
87 |
88 | def skinMask1(roi):
89 | rgb = cv2.cvtColor(roi, cv2.COLOR_BGR2RGB) # convert to RGB color space
90 | (R, G, B) = cv2.split(rgb) # split the image into its R, G, B channel matrices
91 | skin = np.zeros(R.shape, dtype=np.uint8) # mask
92 | (x, y) = R.shape # image dimensions
93 | for i in range(0, x):
94 | for j in range(0, y):
95 | # if the pixel satisfies the skin-color rules below, set the mask to white (255)
96 | if (abs(R[i][j] - G[i][j]) > 15) and (R[i][j] > G[i][j]) and (R[i][j] > B[i][j]):
97 | if (R[i][j] > 95) and (G[i][j] > 40) and (B[i][j] > 20) \
98 | and (max(R[i][j], G[i][j], B[i][j]) - min(R[i][j], G[i][j], B[i][j]) > 15):
99 | skin[i][j] = 255
100 | elif (R[i][j] > 220) and (G[i][j] > 210) and (B[i][j] > 170):
101 | skin[i][j] = 255
102 | # res = cv2.bitwise_and(roi, roi, mask=skin) # bitwise AND with the image
103 | return skin
104 |
105 |
106 | def skinMask2(roi):
107 | low = np.array([0, 48, 50]) # lower threshold
108 | high = np.array([20, 255, 255]) # upper threshold
109 | hsv = cv2.cvtColor(roi, cv2.COLOR_BGR2HSV) # convert to HSV color space
110 | mask = cv2.inRange(hsv, low, high) # mask: pixels inside the range are set to 255
111 | # res = cv2.bitwise_and(roi, roi, mask=mask) # bitwise AND with the image
112 | return mask
113 |
114 |
115 | def skinMask3(roi):
116 | skinCrCbHist = np.zeros((256, 256), dtype=np.uint8)
117 | cv2.ellipse(skinCrCbHist, (113, 155), (23, 25), 43, 0, 360, (255, 255, 255), -1) # draw a filled ellipse as the skin-color region
118 | YCrCb = cv2.cvtColor(roi, cv2.COLOR_BGR2YCR_CB) # convert to YCrCb color space
119 | (y, Cr, Cb) = cv2.split(YCrCb) # split into Y, Cr, Cb channels
120 | skin = np.zeros(Cr.shape, dtype=np.uint8) # mask
121 | (x, y) = Cr.shape
122 | for i in range(0, x):
123 | for j in range(0, y):
124 | if skinCrCbHist[Cr[i][j], Cb[i][j]] > 0: # if (Cr, Cb) falls inside the ellipse region
125 | skin[i][j] = 255
126 | # res = cv2.bitwise_and(roi, roi, mask=skin)
127 | return skin
128 |
129 |
130 | def skinMask4(roi):
131 | YCrCb = cv2.cvtColor(roi, cv2.COLOR_BGR2YCR_CB) # convert to YCrCb color space
132 | (y, cr, cb) = cv2.split(YCrCb) # split into Y, Cr, Cb channels
133 | cr1 = cv2.GaussianBlur(cr, (5, 5), 0)
134 | _, skin = cv2.threshold(cr1, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU) # Otsu thresholding
135 | # res = cv2.bitwise_and(roi, roi, mask=skin)
136 | return skin
137 |
138 |
139 | def skinMask5(roi):
140 | YCrCb = cv2.cvtColor(roi, cv2.COLOR_BGR2YCR_CB) # convert to YCrCb color space
141 | (y, cr, cb) = cv2.split(YCrCb) # split into Y, Cr, Cb channels
142 | skin = np.zeros(cr.shape, dtype=np.uint8)
143 | (x, y) = cr.shape
144 | for i in range(0, x):
145 | for j in range(0, y):
146 | # check every pixel against the Cr/Cb skin range
147 | if (cr[i][j] > 130) and (cr[i][j] < 175) and (cb[i][j] > 77) and (cb[i][j] < 127):
148 | skin[i][j] = 255
149 | # res = cv2.bitwise_and(roi, roi, mask=skin)
150 | return skin
151 |
152 |
153 | def skinMask(roi):
154 | return skinMask4(roi)
155 |
156 |
157 | if __name__ == '__main__':
158 | main()
159 |
--------------------------------------------------------------------------------
/v2.0/gesture.py:
--------------------------------------------------------------------------------
1 | import cv2
2 | import numpy as np
3 | import copy
4 | import math
5 |
6 | # from appscript import app
7 |
8 | # Environment:
9 | # hardware:Raspberry Pi 4B
10 | # OS : Raspbian GNU/Linux 10 (buster)
11 | # python: 3.7.3
12 | # opencv: 4.2.0
13 |
14 | # parameters
15 | cap_region_x_begin = 0.6 # ROI start point / total width
16 | cap_region_y_end = 0.6 # ROI end point / total height
17 | threshold = 60 # BINARY threshold
18 | blurValue = 41 # GaussianBlur parameter
19 | bgSubThreshold = 50
20 | learningRate = 0
21 |
22 | # variables
23 | isBgCaptured = 0 # bool, whether the background captured
24 | triggerSwitch = False # if true, keyboard simulator works
25 |
26 |
27 | def skinMask1(roi):
28 | rgb = cv2.cvtColor(roi, cv2.COLOR_BGR2RGB) # convert to RGB color space
29 | (R, G, B) = cv2.split(rgb) # split the image into its R, G, B channel matrices
30 | skin = np.zeros(R.shape, dtype=np.uint8) # mask
31 | (x, y) = R.shape # image dimensions
32 | for i in range(0, x):
33 | for j in range(0, y):
34 | # if the pixel satisfies the skin-color rules below, set the mask to white (255)
35 | if (abs(R[i][j] - G[i][j]) > 15) and (R[i][j] > G[i][j]) and (R[i][j] > B[i][j]):
36 | if (R[i][j] > 95) and (G[i][j] > 40) and (B[i][j] > 20) \
37 | and (max(R[i][j], G[i][j], B[i][j]) - min(R[i][j], G[i][j], B[i][j]) > 15):
38 | skin[i][j] = 255
39 | elif (R[i][j] > 220) and (G[i][j] > 210) and (B[i][j] > 170):
40 | skin[i][j] = 255
41 | # res = cv2.bitwise_and(roi, roi, mask=skin) # bitwise AND with the image
42 | return skin
43 |
44 |
45 | def skinMask2(roi):
46 | low = np.array([0, 48, 50]) # lower threshold
47 | high = np.array([20, 255, 255]) # upper threshold
48 | hsv = cv2.cvtColor(roi, cv2.COLOR_BGR2HSV) # convert to HSV color space
49 | mask = cv2.inRange(hsv, low, high) # mask: pixels inside the range are set to 255
50 | # res = cv2.bitwise_and(roi, roi, mask=mask) # bitwise AND with the image
51 | return mask
52 |
53 |
54 | def skinMask3(roi):
55 | skinCrCbHist = np.zeros((256, 256), dtype=np.uint8)
56 | cv2.ellipse(skinCrCbHist, (113, 155), (23, 25), 43, 0, 360, (255, 255, 255), -1) # draw a filled ellipse as the skin-color region
57 | YCrCb = cv2.cvtColor(roi, cv2.COLOR_BGR2YCR_CB) # convert to YCrCb color space
58 | (y, Cr, Cb) = cv2.split(YCrCb) # split into Y, Cr, Cb channels
59 | skin = np.zeros(Cr.shape, dtype=np.uint8) # mask
60 | (x, y) = Cr.shape
61 | for i in range(0, x):
62 | for j in range(0, y):
63 | if skinCrCbHist[Cr[i][j], Cb[i][j]] > 0: # if (Cr, Cb) falls inside the ellipse region
64 | skin[i][j] = 255
65 | # res = cv2.bitwise_and(roi, roi, mask=skin)
66 | return skin
67 |
68 |
69 | def skinMask4(roi):
70 | YCrCb = cv2.cvtColor(roi, cv2.COLOR_BGR2YCR_CB) # convert to YCrCb color space
71 | (y, cr, cb) = cv2.split(YCrCb) # split into Y, Cr, Cb channels
72 | cr1 = cv2.GaussianBlur(cr, (5, 5), 0)
73 | _, skin = cv2.threshold(cr1, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU) # Otsu thresholding
74 | # res = cv2.bitwise_and(roi, roi, mask=skin)
75 | return skin
76 |
77 |
78 | def skinMask5(roi):
79 | YCrCb = cv2.cvtColor(roi, cv2.COLOR_BGR2YCR_CB) # convert to YCrCb color space
80 | (y, cr, cb) = cv2.split(YCrCb) # split into Y, Cr, Cb channels
81 | skin = np.zeros(cr.shape, dtype=np.uint8)
82 | (x, y) = cr.shape
83 | for i in range(0, x):
84 | for j in range(0, y):
85 | # check every pixel against the Cr/Cb skin range
86 | if (cr[i][j] > 130) and (cr[i][j] < 175) and (cb[i][j] > 77) and (cb[i][j] < 127):
87 | skin[i][j] = 255
88 | # res = cv2.bitwise_and(roi, roi, mask=skin)
89 | return skin
90 |
91 |
92 | def skinMask(roi):
93 | return skinMask4(roi)
94 |
95 |
96 | def dis(p1, p2):
97 | (x1, y1) = p1
98 | (x2, y2) = p2
99 | return np.sqrt((x1 - x2) ** 2 + (y1 - y2) ** 2)
100 |
101 |
102 | def printThreshold(thr):
103 | print("! Changed threshold to " + str(thr))
104 |
105 |
106 | def removeBG(frame):
107 | """
108 | fgmask = bgModel.apply(frame, learningRate=learningRate)
109 | # kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3))
110 | # res = cv2.morphologyEx(fgmask, cv2.MORPH_OPEN, kernel)
111 |
112 | kernel = np.ones((3, 3), np.uint8)
113 | fgmask = cv2.erode(fgmask, kernel, iterations=1)
114 | """
115 | res = cv2.bitwise_and(frame, frame, mask=skinMask(frame))
116 | return res
117 |
118 |
119 | def calculateFingers(res, drawing): # -> finished bool, cnt: finger count
120 | # convexity defect
121 | hull = cv2.convexHull(res, returnPoints=False)
122 | if len(hull) > 3:
123 | defects = cv2.convexityDefects(res, hull)
124 | if type(defects) != type(None): # avoid crashing. (BUG not found)
125 |
126 | cnt = 0
127 | for i in range(defects.shape[0]): # calculate the angle
128 | s, e, f, d = defects[i][0]
129 | start = tuple(res[s][0])
130 | end = tuple(res[e][0])
131 | far = tuple(res[f][0])
132 | a = math.sqrt((end[0] - start[0]) ** 2 + (end[1] - start[1]) ** 2)
133 | b = math.sqrt((far[0] - start[0]) ** 2 + (far[1] - start[1]) ** 2)
134 | c = math.sqrt((end[0] - far[0]) ** 2 + (end[1] - far[1]) ** 2)
135 | angle = math.acos((b ** 2 + c ** 2 - a ** 2) / (2 * b * c)) # cosine theorem
136 | if angle <= math.pi / 2: # angle less than 90 degree, treat as fingers
137 | cnt += 1
138 | cv2.line(drawing, far, start, [211, 200, 200], 2)
139 | cv2.line(drawing, far, end, [211, 200, 200], 2)
140 | cv2.circle(drawing, far, 8, [211, 84, 0], -1)
141 | return True, cnt
142 | return False, 0
143 |
144 |
145 | # Camera
146 | camera = cv2.VideoCapture(0)
147 | # rt = camera.get(10)
148 | # print(rt)
149 | camera.set(10, 150)
150 | cv2.namedWindow('trackbar')
151 | cv2.createTrackbar('trh1', 'trackbar', threshold, 100, printThreshold)
152 |
153 | last_point = 0
154 | init = 0
155 |
156 | while camera.isOpened():
157 | ret, frame = camera.read()
158 | threshold = cv2.getTrackbarPos('trh1', 'trackbar')
159 | frame = cv2.bilateralFilter(frame, 5, 50, 100) # smoothing filter
160 | frame = cv2.flip(frame, 1) # flip the frame horizontally
161 | cv2.rectangle(frame, (int(cap_region_x_begin * frame.shape[1]), 0),
162 | (frame.shape[1], int(cap_region_y_end * frame.shape[0])), (255, 0, 0), 2)
163 | cv2.imshow('original', frame)
164 | print(frame.shape)
165 |
166 | # Main operation
167 | if isBgCaptured == 1: # this part wont run until background captured
168 | img = removeBG(frame)
169 | img = img[0:int(cap_region_y_end * frame.shape[0]),
170 | int(cap_region_x_begin * frame.shape[1]):frame.shape[1]] # clip the ROI
171 | cv2.imshow('mask', img)
172 | if init == 0:
173 | img2 = img.copy()
174 | img2[:, :] = 255
175 | init = 1
176 | # convert the image into binary image
177 | gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
178 | blur = cv2.GaussianBlur(gray, (blurValue, blurValue), 0)
179 | # cv2.imshow('blur', blur)
180 | ret, thresh = cv2.threshold(blur, threshold, 255, cv2.THRESH_BINARY)
181 | # cv2.imshow('ori', thresh)
182 |
183 | # get the contours
184 | thresh1 = copy.deepcopy(thresh)
185 | contours, hierarchy = cv2.findContours(thresh1, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
186 | length = len(contours)
187 | maxArea = -1
188 | drawing = np.zeros(img.shape, np.uint8)
189 | if length > 0:
190 | for i in range(length): # find the biggest contour (according to area)
191 | temp = contours[i]
192 | area = cv2.contourArea(temp)
193 | if area > maxArea:
194 | maxArea = area
195 | ci = i
196 |
197 | res = contours[ci]
198 |
199 | # print(last_point)
200 | # print(res)
201 | hull = cv2.convexHull(res)
202 | drawing = np.zeros(img.shape, np.uint8)
203 | # cv2.drawContours(drawing, [], 0, (0, 255, 0), 2)
204 | cv2.drawContours(drawing, [res], 0, (0, 255, 0), 2)
205 | cv2.drawContours(drawing, [hull], 0, (0, 0, 255), 3)
206 |
207 | isFinishCal, cnt = calculateFingers(res, drawing)
208 | if cnt > 2:
209 | img2[:, :] = 255
210 | # print(cnt)
211 | if triggerSwitch is True:
212 | # if isFinishCal is True and cnt <= 2:
213 | if isFinishCal is True:
214 | print(cnt)
215 | # app('System Events').keystroke(' ') # simulate pressing blank space
216 | if cnt <= 2:
217 | first = [x[0] for x in contours[ci]]
218 | first = np.array(first[:])
219 | # print(first)
220 | y_min = frame.shape[1]
221 | idx = 0
222 | for i, (x, y) in enumerate(first):
223 | if y < y_min:
224 | y_min = y
225 | idx = i
226 | # print(first[idx])
227 | point = (first[idx][0], first[idx][1])
228 | cv2.circle(img2, point, 3, (255, 0, 0))
229 | if last_point != 0:
230 | # print('????')
231 | if dis(last_point, point) < 30:
232 | cv2.line(img2, point, last_point, (255, 0, 0), 3)
233 | last_point = point
234 | '''
235 | if cnt > 1:
236 | first = [x[0] for x in contours[ci]]
237 | else:
238 | first = [x[0] for x in contours[0]]
239 | first = [x[0] for x in contours[ci]]
240 | first = np.array(first[:])
241 | # print(first)
242 | y_min = frame.shape[1]
243 | idx = 0
244 | for i, (x, y) in enumerate(first):
245 | if y < y_min:
246 | y_min = y
247 | idx = i
248 | # print(first[idx])
249 | point = (first[idx][0], first[idx][1])
250 | cv2.circle(img2, point, 3, (255, 255, 255))
251 | if last_point != 0:
252 | # print('????')
253 | cv2.line(img2, point, last_point, (255, 255, 255), 3)
254 | last_point = point
255 | '''
256 |
257 | cv2.imshow('output', drawing)
258 | cv2.imshow('draw', img2)
259 |
260 | # Keyboard OP
261 | k = cv2.waitKey(10)
262 | if k == 27: # press ESC to exit
263 | camera.release()
264 | cv2.destroyAllWindows()
265 | break
266 | elif k == ord('b'): # press 'b' to capture the background
267 | bgModel = cv2.createBackgroundSubtractorMOG2(0, bgSubThreshold)
268 | isBgCaptured = 1
269 | print('!!!Background Captured!!!')
270 | elif k == ord('r'): # press 'r' to reset the background
271 | bgModel = None
272 | triggerSwitch = False
273 | isBgCaptured = 0
274 | print('!!!Reset BackGround!!!')
275 | elif k == ord('n'):
276 | triggerSwitch = True
277 | print('!!!Trigger On!!!')
278 | elif k == ord('c'):
279 | img2[:, :] = 255
280 | print('!!!img2 Clear!!!')
281 |
--------------------------------------------------------------------------------
/v3.0/02_munge_data.py:
--------------------------------------------------------------------------------
1 | """
2 | The purpose of this python script is to create an unbiased training and validation set.
3 | The split data will be run in the terminal calling a function (process_data) that will join the
4 | annotations.csv file with new .txt files for bounding box class and coordinates for each image.
5 | """
6 | # Credit to Abhishek Thakur, as this is a modified version of his notebook.
7 | # Source to video, where he goes over his code: https://www.youtube.com/watch?v=NU9Xr_NYslo&t=1392s
8 |
9 | # Import libraries
10 | import os
11 | import ast
12 | import pandas as pd
13 | import numpy as np
14 | from sklearn import model_selection
15 | from tqdm import tqdm
16 | import shutil
17 |
18 | # The DATA_PATH will be where your augmented images and annotations.csv files are.
19 | # The OUTPUT_PATH is where the train and validation images and labels will go to.
20 | DATA_PATH = './modeling_data/aug_data/'
21 | OUTPUT_PATH = './yolov5/yolo_data/'
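# Note: np.savetxt and shutil.copyfile below do not create directories, so the
# images/train, images/validation, labels/train and labels/validation folders
# must already exist under OUTPUT_PATH before this script is run.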
22 |
23 |
24 | # Function for taking each row in the annotations file
25 | def process_data(data, data_type='train'):
26 | for _, row in tqdm(data.iterrows(), total=len(data)):
27 | image_name = row['image_id'][:-4] # strip the 4-character file extension (.jpg)
28 | bounding_boxes = row['bboxes']
29 | yolo_data = []
30 | for bbox in bounding_boxes:
31 | category = bbox[0]
32 | x_center = bbox[1]
33 | y_center = bbox[2]
34 | w = bbox[3]
35 | h = bbox[4]
36 | yolo_data.append([category, x_center, y_center, w, h]) # yolo-formatted labels
37 | yolo_data = np.array(yolo_data)
38 |
39 | np.savetxt(
40 | # Outputting .txt file to appropriate train/validation folders
41 | os.path.join(OUTPUT_PATH, f"labels/{data_type}/{image_name}.txt"),
42 | yolo_data,
43 | fmt=["%d", "%f", "%f", "%f", "%f"]
44 | )
45 | shutil.copyfile(
46 | # Copying the augmented images to the appropriate train/validation folders
47 | os.path.join(DATA_PATH, f"images/{image_name}.jpg"),
48 | os.path.join(OUTPUT_PATH, f"images/{data_type}/{image_name}.jpg"),
49 | )
50 |
51 |
52 | if __name__ == '__main__':
53 | df = pd.read_csv(os.path.join(DATA_PATH, 'annotations.csv'))
54 | df.bbox = df.bbox.apply(ast.literal_eval) # Convert string to list for bounding boxes
55 | df = df.groupby('image_id')['bbox'].apply(list).reset_index(name='bboxes')
56 |
57 | # splitting data to a 90/10 split
58 | df_train, df_valid = model_selection.train_test_split(
59 | df,
60 | test_size=0.1,
61 | random_state=42,
62 | shuffle=True
63 | )
64 |
65 | df_train = df_train.reset_index(drop=True)
66 | df_valid = df_valid.reset_index(drop=True)
67 |
68 | # Run function to have our data ready for modeling in 03_Modeling_and_Inference.ipynb
69 | process_data(df_train, data_type='train')
70 | process_data(df_valid, data_type='validation')
71 |
--------------------------------------------------------------------------------
/v3.0/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 |
3 | Copyright (c) 2020 David Lee
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 |
--------------------------------------------------------------------------------
/v3.0/README.md:
--------------------------------------------------------------------------------
1 | ### GA_Data_Science_Capstone_Project
2 | # **Interactive ABC's with American Sign Language**
3 | ### A Step in Increasing Accessibility for the Deaf Community with Computer Vision Utilizing Yolov5.
4 | 
5 |
6 |
7 | # **Executive Summary**
8 | Utilizing Yolov5, a custom computer vision model was trained on the American Sign Language alphabet. The project was promoted on social platforms to diversify the dataset. A total of 721 images were collected over two weeks using Dropbox request forms. Manual labels were created for the original images, which were then resized and organized for preprocessing. Several carefully selected augmentations were applied to the images to compensate for the small dataset size. A total of 18,000 images were then used for modeling. Transfer learning was incorporated with Yolov5m weights, and training completed over 300 epochs at an image size of 1024 in 163 hours. A mean average precision score of 0.8527 was achieved. Inference tests were successfully performed, identifying the model's strengths and weaknesses for future development.
9 |
10 | All operations were performed on my local Linux machine with a CUDA/cudNN setup using Pytorch.
11 |
12 |
13 | # **Table of Contents**
14 |
15 | - [Executive Summary](#executivesummary)
16 | - [Table of Contents](#contents)
17 | - [Data Collection Method](#data)
18 | - [Preprocessing](#preprocessing)
19 | - [Modeling](#modeling)
20 | - [Inference](#inference)
21 | - [Conclusions](#conclusions)
22 | - [Next Steps](#nextsteps)
23 | - [Citations](#cite)
24 | - [Special Thanks](#thanks)
25 |
26 |
27 |
28 |
29 | - [Back to Contents](#contents)
30 | # **Problem Statement:**
31 | Have you ever considered how easy it is to perform simple communication tasks such as ordering food at a drive thru, discussing financial information with a banker, telling a physician your symptoms at a hospital, or even negotiating your wages with your employer? What if there were a rule that you couldn't speak and were only able to use your hands in each of these circumstances? The deaf community cannot do what most of the population takes for granted and is often placed in degrading situations due to the challenges its members face every day. Access to qualified interpretation services isn't feasible in most cases, leaving many in the deaf community with underemployment, social isolation, and public health challenges. To give these members of our community a greater voice, I have attempted to answer this question:
32 |
33 |
34 | **Can computer vision bridge the gap for the deaf and hard of hearing by learning American Sign Language?**
35 |
36 | In order to do this, a Yolov5 model was trained on the ASL alphabet. If successful, it may mark a step in the right direction for both greater accessibility and educational resources.
37 |
38 |
39 | - [Back to Contents](#contents)
40 | # **Data Collection Method:**
41 | The decision was made to create an original dataset for a few reasons. The first was to mirror the intended environment on a mobile device or webcam, which often have resolutions of 720p or 1080p. Several existing datasets have a low resolution, and many do not include the letters “j” and “z” as they require movement.
42 |
43 | A letter request form was created with an introduction to my project along with instructions on how to submit voluntary sign language images via Dropbox file request forms. This was distributed on social platforms to bring awareness and to collect data.
44 |
45 |
46 | #### Dropbox request form used: (Deadline Sep. 27th, 2020)
47 | https://docs.google.com/document/d/1ChZPPr1dsHtgNqQ55a0FMngJj8PJbGgArm8xsiNYlRQ/edit?usp=sharing
48 | [link](https://docs.google.com/document/d/1ChZPPr1dsHtgNqQ55a0FMngJj8PJbGgArm8xsiNYlRQ/edit?usp=sharing)
49 |
50 | A total of 720 images were collected:
51 |
52 | Here is the distribution of images (Letter / Count):
53 |
54 | A - 29
55 | B - 25
56 | C - 25
57 | D - 28
58 | E - 25
59 | F - 30
60 | G - 30
61 | H - 29
62 | I - 30
63 | J - 38
64 | K - 27
65 | L - 28
66 | M - 28
67 | N - 27
68 | O - 28
69 | P - 25
70 | Q - 26
71 | R - 25
72 | S - 30
73 | T - 25
74 | U - 25
75 | V - 28
76 | W - 27
77 | X - 26
78 | Y - 26
79 | Z - 30
80 |
81 |
82 | - [Back to Contents](#contents)
83 | # **Preprocessing**
84 | ### Labeling the images
85 | Manual bounding box labels were created on the original images using the labelImg software.
86 |
87 | Each of the pictures and bounding box coordinates were then passed through an albumentations pipeline that resized the images to 1024 x 1024 pixel squares and applied different transformations with set probabilities.
88 |
89 | These transformations included specified degrees of rotations, shifts in the image locations, blurs, horizontal flips, random erase, and a variety of other color transformations.
90 |
91 | 
92 |
93 |
94 | 25 augmented images were created for each image resulting in an image set of 18,000 used for modeling.
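A sketch of what such a pipeline can look like with the albumentations API (the specific transforms, limits, and probabilities below are illustrative stand-ins, not the exact values used for this project; labelImg produces Pascal VOC style boxes, which is why that bbox format is assumed here):

```python
import albumentations as A

# Illustrative pipeline: resize to 1024 x 1024 squares and apply rotations, shifts,
# blurs, horizontal flips, random erasing, and color transformations with set
# probabilities, while keeping the bounding boxes aligned with the images.
transform = A.Compose(
    [
        A.Resize(height=1024, width=1024),
        A.ShiftScaleRotate(shift_limit=0.05, scale_limit=0.1, rotate_limit=15, p=0.5),
        A.HorizontalFlip(p=0.5),
        A.Blur(blur_limit=3, p=0.2),
        A.CoarseDropout(max_holes=4, p=0.2),              # random erase
        A.RandomBrightnessContrast(p=0.3),
        A.HueSaturationValue(p=0.3),
    ],
    bbox_params=A.BboxParams(format="pascal_voc", label_fields=["class_labels"]),
)

# Given `image` (a numpy array) and its `bboxes` / `class_labels` from labelImg:
# augmented = transform(image=image, bboxes=bboxes, class_labels=class_labels)
# augmented["image"], augmented["bboxes"], augmented["class_labels"]
```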
95 |
96 |
97 | - [Back to Contents](#contents)
98 | # **Modeling: Yolov5**
99 | To address acceptable inference speeds and size, Yolov5 was chosen for modeling.
100 |
101 | This was released on June 10th, 2020, and is still in active development. Although Yolov5 by Ultralytics was not created by the original Yolo authors, Yolo v5 is said to be faster and more lightweight, with accuracy on par with Yolo v4, which is widely considered the fastest and most accurate real-time object detection model.
102 |
103 | 
104 |
105 | Yolo was designed as a convolutional neural network for real-time object detection. It's more complex than basic classification, as object detection needs to both identify the objects and locate where they are in the image. This single-stage object detector has 3 main components:
106 |
107 | The backbone extracts the important features of an image; the neck mainly uses feature pyramids, which help generalize object scaling for better performance on unseen data. The model head does the actual detection: anchor boxes are applied to the features to generate output vectors.
108 | These vectors include the class probabilities, the objectness scores, and the bounding boxes.
109 |
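To make that output layout concrete, here is a back-of-the-envelope count (assuming the standard YOLOv5 head with 3 anchors per detection layer, and 26 classes for the ASL alphabet model):

```python
# Each predicted box carries 4 coordinates, 1 objectness score, and one probability per class.
num_classes = 26                      # A-Z in the ASL alphabet model
values_per_box = 4 + 1 + num_classes  # x, y, w, h, objectness + class probabilities = 31
anchors_per_cell = 3                  # YOLOv5 default anchors per detection layer
print(values_per_box * anchors_per_cell)  # 93 output channels per grid cell, per layer
```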
110 |
111 | The model used was yolov5m with transfer learning on pretrained weights.
112 |
113 | #### **Model Training**
114 | - Epochs: 300
115 | - Batch Size: 8
116 | - Image Size: 1024 x 1024
117 | - Weights: yolov5m.pt
118 |
119 | 
120 |
121 | mAP@.5: 98.17%
122 |
123 | **mAP@.5:.95: 85.27%**
124 |
125 | Training batch example:
126 | 
127 |
128 | Test batch predictions example:
129 | 
130 |
131 |
132 | - [Back to Contents](#contents)
133 | # **Inference**
134 | ### **Images**
135 | I had reserved a test set of my son's attempts at each letter that was not included in any of the training and validation sets. In fact, no pictures of children's hands were used to train the model. Ideally, several more images would help showcase how well the model performs, but this is a start.
136 | 
137 |
138 | Out of 26 letters, 18 were correctly predicted.
139 |
140 | Letters that did not receive a prediction: G, H, J, and Z.
141 |
142 | Letters that were incorrectly predicted were:
143 | “D” predicted as “F”
144 | “E” predicted as “T”
145 | “P“ predicted as “Q”
146 | “R” predicted as “U”
147 |
148 |
149 | ## **Video Findings:**
150 |
151 | ==============================================================
152 | **Left-handed:**
153 | This test shows that our image augmentation pipeline performed well as it was set to flip the images horizontally at a 50% probability.
154 | 
155 |
156 | ==============================================================
157 | **Child's hand:**
158 | The test on my son's hand was performed, and the model still performs well here.
159 | 
160 |
161 | ==============================================================
162 | **Multiple letters on screen:**
163 | Simultaneous letters were also detected. Although sign language is not actually used the way the video on the right shows, it demonstrates that multiple people can be on screen and the model will be able to distinguish more than one instance of the language.
164 | 
165 |
166 | ==============================================================
167 | ## **Video Limitations:**
168 | ==============================================================
169 | **Distance**
170 | There were limitations I discovered in my model. The biggest one is distance. Since many of the original pictures were taken with my phone close to my own hands, the hand-to-camera distance was very short, which negatively impacts inference at longer distances.
171 |
172 | 
173 |
174 | ==============================================================
175 | **New environments**
176 | These video clips of volunteers below were not included in any of the model training. Although the model picks up a lot of the letters, the prediction confidence levels are lower, and there are more misclassifications present.
177 | 
178 |
179 |
180 | I've verified this with a video of my own.
181 | 
182 |
183 | **Even though the original image set contained only 720 pictures, the implications of the results displayed bring us to an exciting conclusion.**
184 |
185 | ==============================================================
186 |
187 |
188 | - [Back to Contents](#contents)
189 | # **Conclusions**
190 | Computer vision can and should be used to mark a step toward greater accessibility and educational resources for our deaf and hard of hearing communities!
191 |
192 | - Even though the original image set contained only 720 pictures, the implications of the results displayed here are promising.
193 | - Gathering more image data from a variety of sources would help our model perform inference better at different distances and in different environments.
194 | - Even letters that require movement can be recognized through computer vision.
195 |
196 |
197 |
198 | - [Back to Contents](#contents)
199 | # **Next Steps**
200 | I believe this project is aligned with the vision of the National Association of the Deaf in bringing better accessibility and education to this underrepresented community. If I am able to bring awareness to the project and partner with an organization like the NAD, I will be able to gather better data from people who use this language natively to push the project further.
201 |
202 | The technology is still very new, and the model I have trained for this presentation was primarily used to find out if it would work. I’m happy with my initial results and I’ve already trained a smaller model that I’ll be testing for mobile deployment in the future.
203 |
204 | I believe computer vision can help give our deaf and hard of hearing neighbors a voice with the right support and project awareness.
205 |
206 | - [Back to Contents](#contents)
207 |
208 | # **Citations**
209 | Python Version: 3.8
210 | Packages: pandas, numpy, matplotlib, sklearn, opencv, os, ast, albumentations, tqdm, torch, IPython, PIL, shutil
211 |
212 | ### Resources:
213 |
214 | Yolov5 github
215 | https://github.com/ultralytics/yolov5
216 |
217 | Yolov5 requirements
218 | https://github.com/ultralytics/yolov5/blob/master/requirements.txt
219 |
220 | Cudnn install guide:
221 | https://docs.nvidia.com/deeplearning/cudnn/install-guide/index.html
222 |
223 | Install Opencv:
224 | https://www.codegrepper.com/code-examples/python/how+to+install+opencv+in+python+3.8
225 |
226 | Roboflow augmentation process:
227 | https://docs.roboflow.com/image-transformations/image-augmentation
228 |
229 | Heavily utilized research paper on image augmentations:
230 | https://journalofbigdata.springeropen.com/articles/10.1186/s40537-019-0197-0#Sec3
231 |
232 | Pillow library:
233 | https://pillow.readthedocs.io/en/latest/handbook/index.html
234 |
235 | Labeling Software labelImg:
236 | https://github.com/tzutalin/labelImg
237 |
238 | Albumentations library
239 | https://github.com/albumentations-team/albumentations
240 |
241 | # **Special Thanks**
242 | Joseph Nelson, CEO of Roboflow.ai, for delivering a computer vision lesson to our class, and answering my questions directly.
243 |
244 | And to my volunteers:
245 | Nathan & Roxanne Seither
246 | Juhee Sung-Schenck
247 | Josh Mizraji
248 | Lydia Kajeckas
249 | Aidan Curley
250 | Chris Johnson
251 | Eric Lee
252 |
253 | And to the General Assembly DSI-720 instructors:
254 | Adi Bronshtein
255 | Patrick Wales-Dinan
256 | Kelly Slatery
257 | Noah Christiansen
258 | Jacob Ellena
259 | Bradford Smith
260 |
261 | This project would not have been possible without the time all of you invested in me. Thank you!
--------------------------------------------------------------------------------
/v3.0/ord.txt:
--------------------------------------------------------------------------------
1 | python train.py --img 512 --batch 16 --epochs 100 --data config.yaml --cfg models/yolov5s.yaml --name yolo_example
2 | tensorboard --logdir runs/
3 | python detect.py --weights weights/best.pt --img 512 --conf 0.3 --source 0
4 | python detect.py --weights runs/exp12_yolo_example/weights/best.pt --img 512 --conf 0.15 --source 0
5 | python detect.py --weights runs/exp12_yolo_example/weights/best.pt --img 512 --conf 0.15 --source rtsp://192.168.0.106:8554/
6 | python detect.py --weights runs/exp12_yolo_example/weights/best.pt --img 512 --conf 0.15 --source http://192.168.0.106:8080/
7 | python detect.py --weights runs/exp12_yolo_example/weights/best.pt --img 512 --conf 0.15 --source http://192.168.43.46:8080/
8 |
9 | sudo raspivid -o - -rot 180 -t 0 -w 640 -h 480 -fps 30|cvlc -vvv stream:///dev/stdin --sout '#standard{access=http,mux=ts,dst=:8080}' :demux=h264
10 | sudo raspivid -o - -rot 180 -t 0 -w 640 -h 360 -fps 30|cvlc -vvv stream:///dev/stdin --sout '#standard{access=http,mux=ts,dst=:8080}' :demux=h264
11 |
12 | sudo raspivid -o - -rot 180 -t 0 -w 640 -h 360 -fps 25|cvlc -vvv stream:///dev/stdin --sout '#standard{access=http,mux=ts,dst=:8090}' :demux=h264
--------------------------------------------------------------------------------
/v3.0/windows_v1.8.1/data/predefined_classes.txt:
--------------------------------------------------------------------------------
1 | 1
2 | 2
3 | 5
4 | forefinger
--------------------------------------------------------------------------------
/v3.0/windows_v1.8.1/labelImg.exe:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/yyyanbj/mid-air-draw/9ce05fe981e9037d8c0151be66c0254f8f2523d5/v3.0/windows_v1.8.1/labelImg.exe
--------------------------------------------------------------------------------
/v3.0/yolov5/.dockerignore:
--------------------------------------------------------------------------------
1 | # Repo-specific DockerIgnore -------------------------------------------------------------------------------------------
2 | #.git
3 | .cache
4 | .idea
5 | runs
6 | output
7 | coco
8 | storage.googleapis.com
9 |
10 | data/samples/*
11 | **/results*.txt
12 | *.jpg
13 |
14 | # Neural Network weights -----------------------------------------------------------------------------------------------
15 | **/*.weights
16 | **/*.pt
17 | **/*.pth
18 | **/*.onnx
19 | **/*.mlmodel
20 | **/*.torchscript
21 |
22 |
23 | # Below Copied From .gitignore -----------------------------------------------------------------------------------------
24 | # Below Copied From .gitignore -----------------------------------------------------------------------------------------
25 |
26 |
27 | # GitHub Python GitIgnore ----------------------------------------------------------------------------------------------
28 | # Byte-compiled / optimized / DLL files
29 | __pycache__/
30 | *.py[cod]
31 | *$py.class
32 |
33 | # C extensions
34 | *.so
35 |
36 | # Distribution / packaging
37 | .Python
38 | env/
39 | build/
40 | develop-eggs/
41 | dist/
42 | downloads/
43 | eggs/
44 | .eggs/
45 | lib/
46 | lib64/
47 | parts/
48 | sdist/
49 | var/
50 | wheels/
51 | *.egg-info/
52 | .installed.cfg
53 | *.egg
54 |
55 | # PyInstaller
56 | # Usually these files are written by a python script from a template
57 | # before PyInstaller builds the exe, so as to inject date/other infos into it.
58 | *.manifest
59 | *.spec
60 |
61 | # Installer logs
62 | pip-log.txt
63 | pip-delete-this-directory.txt
64 |
65 | # Unit test / coverage reports
66 | htmlcov/
67 | .tox/
68 | .coverage
69 | .coverage.*
70 | .cache
71 | nosetests.xml
72 | coverage.xml
73 | *.cover
74 | .hypothesis/
75 |
76 | # Translations
77 | *.mo
78 | *.pot
79 |
80 | # Django stuff:
81 | *.log
82 | local_settings.py
83 |
84 | # Flask stuff:
85 | instance/
86 | .webassets-cache
87 |
88 | # Scrapy stuff:
89 | .scrapy
90 |
91 | # Sphinx documentation
92 | docs/_build/
93 |
94 | # PyBuilder
95 | target/
96 |
97 | # Jupyter Notebook
98 | .ipynb_checkpoints
99 |
100 | # pyenv
101 | .python-version
102 |
103 | # celery beat schedule file
104 | celerybeat-schedule
105 |
106 | # SageMath parsed files
107 | *.sage.py
108 |
109 | # dotenv
110 | .env
111 |
112 | # virtualenv
113 | .venv*
114 | venv*/
115 | ENV*/
116 |
117 | # Spyder project settings
118 | .spyderproject
119 | .spyproject
120 |
121 | # Rope project settings
122 | .ropeproject
123 |
124 | # mkdocs documentation
125 | /site
126 |
127 | # mypy
128 | .mypy_cache/
129 |
130 |
131 | # https://github.com/github/gitignore/blob/master/Global/macOS.gitignore -----------------------------------------------
132 |
133 | # General
134 | .DS_Store
135 | .AppleDouble
136 | .LSOverride
137 |
138 | # Icon must end with two \r
139 | Icon
140 | Icon?
141 |
142 | # Thumbnails
143 | ._*
144 |
145 | # Files that might appear in the root of a volume
146 | .DocumentRevisions-V100
147 | .fseventsd
148 | .Spotlight-V100
149 | .TemporaryItems
150 | .Trashes
151 | .VolumeIcon.icns
152 | .com.apple.timemachine.donotpresent
153 |
154 | # Directories potentially created on remote AFP share
155 | .AppleDB
156 | .AppleDesktop
157 | Network Trash Folder
158 | Temporary Items
159 | .apdisk
160 |
161 |
162 | # https://github.com/github/gitignore/blob/master/Global/JetBrains.gitignore
163 | # Covers JetBrains IDEs: IntelliJ, RubyMine, PhpStorm, AppCode, PyCharm, CLion, Android Studio and WebStorm
164 | # Reference: https://intellij-support.jetbrains.com/hc/en-us/articles/206544839
165 |
166 | # User-specific stuff:
167 | .idea/*
168 | .idea/**/workspace.xml
169 | .idea/**/tasks.xml
170 | .idea/dictionaries
171 | .html # Bokeh Plots
172 | .pg # TensorFlow Frozen Graphs
173 | .avi # videos
174 |
175 | # Sensitive or high-churn files:
176 | .idea/**/dataSources/
177 | .idea/**/dataSources.ids
178 | .idea/**/dataSources.local.xml
179 | .idea/**/sqlDataSources.xml
180 | .idea/**/dynamic.xml
181 | .idea/**/uiDesigner.xml
182 |
183 | # Gradle:
184 | .idea/**/gradle.xml
185 | .idea/**/libraries
186 |
187 | # CMake
188 | cmake-build-debug/
189 | cmake-build-release/
190 |
191 | # Mongo Explorer plugin:
192 | .idea/**/mongoSettings.xml
193 |
194 | ## File-based project format:
195 | *.iws
196 |
197 | ## Plugin-specific files:
198 |
199 | # IntelliJ
200 | out/
201 |
202 | # mpeltonen/sbt-idea plugin
203 | .idea_modules/
204 |
205 | # JIRA plugin
206 | atlassian-ide-plugin.xml
207 |
208 | # Cursive Clojure plugin
209 | .idea/replstate.xml
210 |
211 | # Crashlytics plugin (for Android Studio and IntelliJ)
212 | com_crashlytics_export_strings.xml
213 | crashlytics.properties
214 | crashlytics-build.properties
215 | fabric.properties
216 |
--------------------------------------------------------------------------------
/v3.0/yolov5/.gitattributes:
--------------------------------------------------------------------------------
1 | # this drop notebooks from GitHub language stats
2 | *.ipynb linguist-vendored
3 |
--------------------------------------------------------------------------------
/v3.0/yolov5/.github/ISSUE_TEMPLATE/--bug-report.md:
--------------------------------------------------------------------------------
1 | ---
2 | name: "\U0001F41BBug report"
3 | about: Create a report to help us improve
4 | title: ''
5 | labels: bug
6 | assignees: ''
7 |
8 | ---
9 |
10 | Before submitting a bug report, please be aware that your issue **must be reproducible** with all of the following, otherwise it is non-actionable, and we can not help you:
11 | - **Current repo**: run `git fetch && git status -uno` to check and `git pull` to update repo
12 | - **Common dataset**: coco.yaml or coco128.yaml
13 | - **Common environment**: Colab, Google Cloud, or Docker image. See https://github.com/ultralytics/yolov5#environments
14 |
15 | If this is a custom dataset/training question you **must include** your `train*.jpg`, `test*.jpg` and `results.png` figures, or we can not help you. You can generate these with `utils.plot_results()`.
16 |
17 |
18 | ## 🐛 Bug
19 | A clear and concise description of what the bug is.
20 |
21 |
22 | ## To Reproduce (REQUIRED)
23 |
24 | Input:
25 | ```
26 | import torch
27 |
28 | a = torch.tensor([5])
29 | c = a / 0
30 | ```
31 |
32 | Output:
33 | ```
34 | Traceback (most recent call last):
35 | File "/Users/glennjocher/opt/anaconda3/envs/env1/lib/python3.7/site-packages/IPython/core/interactiveshell.py", line 3331, in run_code
36 | exec(code_obj, self.user_global_ns, self.user_ns)
37 | File "", line 5, in
38 | c = a / 0
39 | RuntimeError: ZeroDivisionError
40 | ```
41 |
42 |
43 | ## Expected behavior
44 | A clear and concise description of what you expected to happen.
45 |
46 |
47 | ## Environment
48 | If applicable, add screenshots to help explain your problem.
49 |
50 | - OS: [e.g. Ubuntu]
51 | - GPU [e.g. 2080 Ti]
52 |
53 |
54 | ## Additional context
55 | Add any other context about the problem here.
56 |
--------------------------------------------------------------------------------
/v3.0/yolov5/.github/ISSUE_TEMPLATE/--feature-request.md:
--------------------------------------------------------------------------------
1 | ---
2 | name: "\U0001F680Feature request"
3 | about: Suggest an idea for this project
4 | title: ''
5 | labels: enhancement
6 | assignees: ''
7 |
8 | ---
9 |
10 | ## 🚀 Feature
11 |
12 |
13 | ## Motivation
14 |
15 |
16 |
17 | ## Pitch
18 |
19 |
20 |
21 | ## Alternatives
22 |
23 |
24 |
25 | ## Additional context
26 |
27 |
28 |
--------------------------------------------------------------------------------
/v3.0/yolov5/.github/ISSUE_TEMPLATE/-question.md:
--------------------------------------------------------------------------------
1 | ---
2 | name: "❓Question"
3 | about: Ask a general question
4 | title: ''
5 | labels: question
6 | assignees: ''
7 |
8 | ---
9 |
10 | ## ❔Question
11 |
12 |
13 | ## Additional context
14 |
--------------------------------------------------------------------------------
/v3.0/yolov5/.github/workflows/ci-testing.yml:
--------------------------------------------------------------------------------
1 | name: CI CPU testing
2 |
3 | on: # https://help.github.com/en/actions/reference/events-that-trigger-workflows
4 | push:
5 | pull_request:
6 | schedule:
7 | - cron: "0 0 * * *"
8 |
9 | jobs:
10 | cpu-tests:
11 |
12 | runs-on: ${{ matrix.os }}
13 | strategy:
14 | fail-fast: false
15 | matrix:
16 | os: [ubuntu-latest, macos-latest, windows-latest]
17 | python-version: [3.8]
18 | model: ['yolov5s'] # models to test
19 |
20 | # Timeout: https://stackoverflow.com/a/59076067/4521646
21 | timeout-minutes: 50
22 | steps:
23 | - uses: actions/checkout@v2
24 | - name: Set up Python ${{ matrix.python-version }}
25 | uses: actions/setup-python@v2
26 | with:
27 | python-version: ${{ matrix.python-version }}
28 |
29 | # Note: This uses an internal pip API and may not always work
30 | # https://github.com/actions/cache/blob/master/examples.md#multiple-oss-in-a-workflow
31 | - name: Get pip cache
32 | id: pip-cache
33 | run: |
34 | python -c "from pip._internal.locations import USER_CACHE_DIR; print('::set-output name=dir::' + USER_CACHE_DIR)"
35 |
36 | - name: Cache pip
37 | uses: actions/cache@v1
38 | with:
39 | path: ${{ steps.pip-cache.outputs.dir }}
40 | key: ${{ runner.os }}-${{ matrix.python-version }}-pip-${{ hashFiles('requirements.txt') }}
41 | restore-keys: |
42 | ${{ runner.os }}-${{ matrix.python-version }}-pip-
43 |
44 | - name: Install dependencies
45 | run: |
46 | python -m pip install --upgrade pip
47 | pip install -qr requirements.txt -f https://download.pytorch.org/whl/cpu/torch_stable.html
48 | pip install -q onnx
49 | python --version
50 | pip --version
51 | pip list
52 | shell: bash
53 |
54 | - name: Download data
55 | run: |
56 | # curl -L -o tmp.zip https://github.com/ultralytics/yolov5/releases/download/v1.0/coco128.zip
57 | # unzip -q tmp.zip -d ../
58 | # rm tmp.zip
59 |
60 | - name: Tests workflow
61 | run: |
62 | # export PYTHONPATH="$PWD" # to run '$ python *.py' files in subdirectories
63 | di=cpu # inference devices # define device
64 |
65 | # train
66 | python train.py --img 256 --batch 8 --weights weights/${{ matrix.model }}.pt --cfg models/${{ matrix.model }}.yaml --epochs 1 --device $di
67 | # detect
68 | python detect.py --weights weights/${{ matrix.model }}.pt --device $di
69 | python detect.py --weights runs/exp0/weights/last.pt --device $di
70 | # test
71 | python test.py --img 256 --batch 8 --weights weights/${{ matrix.model }}.pt --device $di
72 | python test.py --img 256 --batch 8 --weights runs/exp0/weights/last.pt --device $di
73 |
74 | python models/yolo.py --cfg models/${{ matrix.model }}.yaml # inspect
75 | python models/export.py --img 256 --batch 1 --weights weights/${{ matrix.model }}.pt # export
76 | shell: bash
77 |
--------------------------------------------------------------------------------
/v3.0/yolov5/.github/workflows/greetings.yml:
--------------------------------------------------------------------------------
1 | name: Greetings
2 |
3 | on: [pull_request_target, issues]
4 |
5 | jobs:
6 | greeting:
7 | runs-on: ubuntu-latest
8 | steps:
9 | - uses: actions/first-interaction@v1
10 | with:
11 | repo-token: ${{ secrets.GITHUB_TOKEN }}
12 | pr-message: |
13 | Hello @${{ github.actor }}, thank you for submitting a PR! To allow your work to be integrated as seamlessly as possible, we advise you to:
14 | - Verify your PR is **up-to-date with origin/master.** If your PR is behind origin/master update by running the following, replacing 'feature' with the name of your local branch:
15 | ```bash
16 | git remote add upstream https://github.com/ultralytics/yolov5.git
17 | git fetch upstream
18 | git checkout feature # <----- replace 'feature' with local branch name
19 | git rebase upstream/master
20 | git push -u origin -f
21 | ```
22 | - Verify all Continuous Integration (CI) **checks are passing**.
23 | - Reduce changes to the absolute **minimum** required for your bug fix or feature addition. _"It is not daily increase but daily decrease, hack away the unessential. The closer to the source, the less wastage there is."_ -Bruce Lee
24 |
25 | issue-message: |
26 | Hello @${{ github.actor }}, thank you for your interest in our work! Please visit our [Custom Training Tutorial](https://github.com/ultralytics/yolov5/wiki/Train-Custom-Data) to get started, and see our [Jupyter Notebook](https://github.com/ultralytics/yolov5/blob/master/tutorial.ipynb), [Docker Image](https://hub.docker.com/r/ultralytics/yolov5), and [Google Cloud Quickstart Guide](https://github.com/ultralytics/yolov5/wiki/GCP-Quickstart) for example environments.
27 |
28 | If this is a bug report, please provide screenshots and **minimum viable code to reproduce your issue**, otherwise we can not help you.
29 |
30 | If this is a custom model or data training question, please note Ultralytics does **not** provide free personal support. As a leader in vision ML and AI, we do offer professional consulting, from simple expert advice up to delivery of fully customized, end-to-end production solutions for our clients, such as:
31 | - **Cloud-based AI** systems operating on **hundreds of HD video streams in realtime.**
32 | - **Edge AI** integrated into custom iOS and Android apps for realtime **30 FPS video inference.**
33 | - **Custom data training**, hyperparameter evolution, and model exportation to any destination.
34 |
35 | For more information please visit https://www.ultralytics.com.
36 |
--------------------------------------------------------------------------------
/v3.0/yolov5/.github/workflows/rebase.yml:
--------------------------------------------------------------------------------
1 | name: Automatic Rebase
2 | # https://github.com/marketplace/actions/automatic-rebase
3 |
4 | on:
5 | issue_comment:
6 | types: [created]
7 |
8 | jobs:
9 | rebase:
10 | name: Rebase
11 | if: github.event.issue.pull_request != '' && contains(github.event.comment.body, '/rebase')
12 | runs-on: ubuntu-latest
13 | steps:
14 | - name: Checkout the latest code
15 | uses: actions/checkout@v2
16 | with:
17 | fetch-depth: 0
18 | - name: Automatic Rebase
19 | uses: cirrus-actions/rebase@1.3.1
20 | env:
21 | GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
22 |
--------------------------------------------------------------------------------
/v3.0/yolov5/.github/workflows/stale.yml:
--------------------------------------------------------------------------------
1 | name: Close stale issues
2 | on:
3 | schedule:
4 | - cron: "0 0 * * *"
5 |
6 | jobs:
7 | stale:
8 | runs-on: ubuntu-latest
9 | steps:
10 | - uses: actions/stale@v1
11 | with:
12 | repo-token: ${{ secrets.GITHUB_TOKEN }}
13 | stale-issue-message: 'This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.'
14 | stale-pr-message: 'This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.'
15 | days-before-stale: 30
16 | days-before-close: 5
17 | exempt-issue-labels: 'documentation,tutorial'
18 | operations-per-run: 100 # The maximum number of operations per run, used to control rate limiting.
19 |
--------------------------------------------------------------------------------
/v3.0/yolov5/.gitignore:
--------------------------------------------------------------------------------
1 | # Repo-specific GitIgnore ----------------------------------------------------------------------------------------------
2 | *.jpg
3 | *.jpeg
4 | *.png
5 | *.bmp
6 | *.tif
7 | *.tiff
8 | *.heic
9 | *.JPG
10 | *.JPEG
11 | *.PNG
12 | *.BMP
13 | *.TIF
14 | *.TIFF
15 | *.HEIC
16 | *.mp4
17 | *.mov
18 | *.MOV
19 | *.avi
20 | *.data
21 | *.json
22 |
23 | *.cfg
24 | !cfg/yolov3*.cfg
25 |
26 | storage.googleapis.com
27 | runs/*
28 | data/*
29 | !data/samples/zidane.jpg
30 | !data/samples/bus.jpg
31 | !data/coco.names
32 | !data/coco_paper.names
33 | !data/coco.data
34 | !data/coco_*.data
35 | !data/coco_*.txt
36 | !data/trainvalno5k.shapes
37 | !data/*.sh
38 |
39 | pycocotools/*
40 | results*.txt
41 | gcp_test*.sh
42 |
43 | # MATLAB GitIgnore -----------------------------------------------------------------------------------------------------
44 | *.m~
45 | *.mat
46 | !targets*.mat
47 |
48 | # Neural Network weights -----------------------------------------------------------------------------------------------
49 | *.weights
50 | *.pt
51 | *.onnx
52 | *.mlmodel
53 | *.torchscript
54 | darknet53.conv.74
55 | yolov3-tiny.conv.15
56 |
57 | # GitHub Python GitIgnore ----------------------------------------------------------------------------------------------
58 | # Byte-compiled / optimized / DLL files
59 | __pycache__/
60 | *.py[cod]
61 | *$py.class
62 |
63 | # C extensions
64 | *.so
65 |
66 | # Distribution / packaging
67 | .Python
68 | env/
69 | build/
70 | develop-eggs/
71 | dist/
72 | downloads/
73 | eggs/
74 | .eggs/
75 | lib/
76 | lib64/
77 | parts/
78 | sdist/
79 | var/
80 | wheels/
81 | *.egg-info/
82 | .installed.cfg
83 | *.egg
84 |
85 | # PyInstaller
86 | # Usually these files are written by a python script from a template
87 | # before PyInstaller builds the exe, so as to inject date/other infos into it.
88 | *.manifest
89 | *.spec
90 |
91 | # Installer logs
92 | pip-log.txt
93 | pip-delete-this-directory.txt
94 |
95 | # Unit test / coverage reports
96 | htmlcov/
97 | .tox/
98 | .coverage
99 | .coverage.*
100 | .cache
101 | nosetests.xml
102 | coverage.xml
103 | *.cover
104 | .hypothesis/
105 |
106 | # Translations
107 | *.mo
108 | *.pot
109 |
110 | # Django stuff:
111 | *.log
112 | local_settings.py
113 |
114 | # Flask stuff:
115 | instance/
116 | .webassets-cache
117 |
118 | # Scrapy stuff:
119 | .scrapy
120 |
121 | # Sphinx documentation
122 | docs/_build/
123 |
124 | # PyBuilder
125 | target/
126 |
127 | # Jupyter Notebook
128 | .ipynb_checkpoints
129 |
130 | # pyenv
131 | .python-version
132 |
133 | # celery beat schedule file
134 | celerybeat-schedule
135 |
136 | # SageMath parsed files
137 | *.sage.py
138 |
139 | # dotenv
140 | .env
141 |
142 | # virtualenv
143 | .venv*
144 | venv*/
145 | ENV*/
146 |
147 | # Spyder project settings
148 | .spyderproject
149 | .spyproject
150 |
151 | # Rope project settings
152 | .ropeproject
153 |
154 | # mkdocs documentation
155 | /site
156 |
157 | # mypy
158 | .mypy_cache/
159 |
160 |
161 | # https://github.com/github/gitignore/blob/master/Global/macOS.gitignore -----------------------------------------------
162 |
163 | # General
164 | .DS_Store
165 | .AppleDouble
166 | .LSOverride
167 |
168 | # Icon must end with two \r
169 | Icon
170 | Icon?
171 |
172 | # Thumbnails
173 | ._*
174 |
175 | # Files that might appear in the root of a volume
176 | .DocumentRevisions-V100
177 | .fseventsd
178 | .Spotlight-V100
179 | .TemporaryItems
180 | .Trashes
181 | .VolumeIcon.icns
182 | .com.apple.timemachine.donotpresent
183 |
184 | # Directories potentially created on remote AFP share
185 | .AppleDB
186 | .AppleDesktop
187 | Network Trash Folder
188 | Temporary Items
189 | .apdisk
190 |
191 |
192 | # https://github.com/github/gitignore/blob/master/Global/JetBrains.gitignore
193 | # Covers JetBrains IDEs: IntelliJ, RubyMine, PhpStorm, AppCode, PyCharm, CLion, Android Studio and WebStorm
194 | # Reference: https://intellij-support.jetbrains.com/hc/en-us/articles/206544839
195 |
196 | # User-specific stuff:
197 | .idea/*
198 | .idea/**/workspace.xml
199 | .idea/**/tasks.xml
200 | .idea/dictionaries
201 | .html # Bokeh Plots
202 | .pg # TensorFlow Frozen Graphs
203 | .avi # videos
204 |
205 | # Sensitive or high-churn files:
206 | .idea/**/dataSources/
207 | .idea/**/dataSources.ids
208 | .idea/**/dataSources.local.xml
209 | .idea/**/sqlDataSources.xml
210 | .idea/**/dynamic.xml
211 | .idea/**/uiDesigner.xml
212 |
213 | # Gradle:
214 | .idea/**/gradle.xml
215 | .idea/**/libraries
216 |
217 | # CMake
218 | cmake-build-debug/
219 | cmake-build-release/
220 |
221 | # Mongo Explorer plugin:
222 | .idea/**/mongoSettings.xml
223 |
224 | ## File-based project format:
225 | *.iws
226 |
227 | ## Plugin-specific files:
228 |
229 | # IntelliJ
230 | out/
231 |
232 | # mpeltonen/sbt-idea plugin
233 | .idea_modules/
234 |
235 | # JIRA plugin
236 | atlassian-ide-plugin.xml
237 |
238 | # Cursive Clojure plugin
239 | .idea/replstate.xml
240 |
241 | # Crashlytics plugin (for Android Studio and IntelliJ)
242 | com_crashlytics_export_strings.xml
243 | crashlytics.properties
244 | crashlytics-build.properties
245 | fabric.properties
246 |
--------------------------------------------------------------------------------
/v3.0/yolov5/Dockerfile:
--------------------------------------------------------------------------------
1 | # Start FROM Nvidia PyTorch image https://ngc.nvidia.com/catalog/containers/nvidia:pytorch
2 | FROM nvcr.io/nvidia/pytorch:20.10-py3
3 |
4 | # Install dependencies
5 | RUN pip install --upgrade pip
6 | # COPY requirements.txt .
7 | # RUN pip install -r requirements.txt
8 | RUN pip install gsutil
9 |
10 | # Create working directory
11 | RUN mkdir -p /usr/src/app
12 | WORKDIR /usr/src/app
13 |
14 | # Copy contents
15 | COPY . /usr/src/app
16 |
17 | # Copy weights
18 | #RUN python3 -c "from models import *; \
19 | #attempt_download('weights/yolov5s.pt'); \
20 | #attempt_download('weights/yolov5m.pt'); \
21 | #attempt_download('weights/yolov5l.pt')"
22 |
23 |
24 | # --------------------------------------------------- Extras Below ---------------------------------------------------
25 |
26 | # Build and Push
27 | # t=ultralytics/yolov5:latest && sudo docker build -t $t . && sudo docker push $t
28 | # for v in {300..303}; do t=ultralytics/coco:v$v && sudo docker build -t $t . && sudo docker push $t; done
29 |
30 | # Pull and Run
31 | # t=ultralytics/yolov5:latest && sudo docker pull $t && sudo docker run -it --ipc=host $t
32 |
33 | # Pull and Run with local directory access
34 | # t=ultralytics/yolov5:latest && sudo docker pull $t && sudo docker run -it --ipc=host --gpus all -v "$(pwd)"/coco:/usr/src/coco $t
35 |
36 | # Kill all
37 | # sudo docker kill $(sudo docker ps -q)
38 |
39 | # Kill all image-based
40 | # sudo docker kill $(sudo docker ps -a -q --filter ancestor=ultralytics/yolov5:latest)
41 |
42 | # Bash into running container
43 | # sudo docker container exec -it ba65811811ab bash
44 |
45 | # Bash into stopped container
46 | # sudo docker commit 092b16b25c5b usr/resume && sudo docker run -it --gpus all --ipc=host -v "$(pwd)"/coco:/usr/src/coco --entrypoint=sh usr/resume
47 |
48 | # Send weights to GCP
49 | # python -c "from utils.general import *; strip_optimizer('runs/exp0_*/weights/best.pt', 'tmp.pt')" && gsutil cp tmp.pt gs://*.pt
50 |
51 | # Clean up
52 | # docker system prune -a --volumes
53 |
--------------------------------------------------------------------------------
/v3.0/yolov5/README.md:
--------------------------------------------------------------------------------
1 |
2 |
3 |  
4 |
5 | 
6 |
7 | This repository represents Ultralytics open-source research into future object detection methods, and incorporates our lessons learned and best practices evolved over training thousands of models on custom client datasets with our previous YOLO repository https://github.com/ultralytics/yolov3. **All code and models are under active development, and are subject to modification or deletion without notice.** Use at your own risk.
8 |
9 | ** GPU Speed measures end-to-end time per image averaged over 5000 COCO val2017 images using a V100 GPU with batch size 32, and includes image preprocessing, PyTorch FP16 inference, postprocessing and NMS. EfficientDet data from [google/automl](https://github.com/google/automl) at batch size 8.
10 |
11 | - **August 13, 2020**: [v3.0 release](https://github.com/ultralytics/yolov5/releases/tag/v3.0): nn.Hardswish() activations, data autodownload, native AMP.
12 | - **July 23, 2020**: [v2.0 release](https://github.com/ultralytics/yolov5/releases/tag/v2.0): improved model definition, training and mAP.
13 | - **June 22, 2020**: [PANet](https://arxiv.org/abs/1803.01534) updates: new heads, reduced parameters, improved speed and mAP [364fcfd](https://github.com/ultralytics/yolov5/commit/364fcfd7dba53f46edd4f04c037a039c0a287972).
14 | - **June 19, 2020**: [FP16](https://pytorch.org/docs/stable/nn.html#torch.nn.Module.half) as new default for smaller checkpoints and faster inference [d4c6674](https://github.com/ultralytics/yolov5/commit/d4c6674c98e19df4c40e33a777610a18d1961145).
15 | - **June 9, 2020**: [CSP](https://github.com/WongKinYiu/CrossStagePartialNetworks) updates: improved speed, size, and accuracy (credit to @WongKinYiu for CSP).
16 | - **May 27, 2020**: Public release. YOLOv5 models are SOTA among all known YOLO implementations.
17 | - **April 1, 2020**: Start development of future compound-scaled [YOLOv3](https://github.com/ultralytics/yolov3)/[YOLOv4](https://github.com/AlexeyAB/darknet)-based PyTorch models.
18 |
19 |
20 | ## Pretrained Checkpoints
21 |
22 | | Model | AP<sup>val</sup> | AP<sup>test</sup> | AP<sub>50</sub> | Speed<sub>GPU</sub> | FPS<sub>GPU</sub> || params | FLOPS |
23 | |---------- |------ |------ |------ | -------- | ------| ------ |------ | :------: |
24 | | [YOLOv5s](https://github.com/ultralytics/yolov5/releases/tag/v3.0) | 37.0 | 37.0 | 56.2 | **2.4ms** | **416** || 7.5M | 13.2B
25 | | [YOLOv5m](https://github.com/ultralytics/yolov5/releases/tag/v3.0) | 44.3 | 44.3 | 63.2 | 3.4ms | 294 || 21.8M | 39.4B
26 | | [YOLOv5l](https://github.com/ultralytics/yolov5/releases/tag/v3.0) | 47.7 | 47.7 | 66.5 | 4.4ms | 227 || 47.8M | 88.1B
27 | | [YOLOv5x](https://github.com/ultralytics/yolov5/releases/tag/v3.0) | **49.2** | **49.2** | **67.7** | 6.9ms | 145 || 89.0M | 166.4B
28 | | | | | | | || |
29 | | [YOLOv5x](https://github.com/ultralytics/yolov5/releases/tag/v3.0) + TTA|**50.8**| **50.8** | **68.9** | 25.5ms | 39 || 89.0M | 354.3B
30 | | | | | | | || |
31 | | [YOLOv3-SPP](https://github.com/ultralytics/yolov5/releases/tag/v3.0) | 45.6 | 45.5 | 65.2 | 4.5ms | 222 || 63.0M | 118.0B
32 |
33 | ** APtest denotes COCO [test-dev2017](http://cocodataset.org/#upload) server results, all other AP results in the table denote val2017 accuracy.
34 | ** All AP numbers are for single-model single-scale without ensemble or test-time augmentation. **Reproduce** by `python test.py --data coco.yaml --img 640 --conf 0.001`
35 | ** SpeedGPU measures end-to-end time per image averaged over 5000 COCO val2017 images using a GCP [n1-standard-16](https://cloud.google.com/compute/docs/machine-types#n1_standard_machine_types) instance with one V100 GPU, and includes image preprocessing, PyTorch FP16 image inference at --batch-size 32 --img-size 640, postprocessing and NMS. Average NMS time included in this chart is 1-2ms/img. **Reproduce** by `python test.py --data coco.yaml --img 640 --conf 0.1`
36 | ** All checkpoints are trained to 300 epochs with default settings and hyperparameters (no autoaugmentation).
37 | ** Test Time Augmentation ([TTA](https://github.com/ultralytics/yolov5/issues/303)) runs at 3 image sizes. **Reproduce** by `python test.py --data coco.yaml --img 832 --augment`
38 |
39 | ## Requirements
40 |
41 | Python 3.8 or later with all [requirements.txt](https://github.com/ultralytics/yolov5/blob/master/requirements.txt) dependencies installed, including `torch>=1.6`. To install run:
42 | ```bash
43 | $ pip install -r requirements.txt
44 | ```
45 |
46 |
47 | ## Tutorials
48 |
49 | * [Train Custom Data](https://github.com/ultralytics/yolov5/wiki/Train-Custom-Data)
50 | * [Multi-GPU Training](https://github.com/ultralytics/yolov5/issues/475)
51 | * [PyTorch Hub](https://github.com/ultralytics/yolov5/issues/36)
52 | * [ONNX and TorchScript Export](https://github.com/ultralytics/yolov5/issues/251)
53 | * [Test-Time Augmentation (TTA)](https://github.com/ultralytics/yolov5/issues/303)
54 | * [Model Ensembling](https://github.com/ultralytics/yolov5/issues/318)
55 | * [Model Pruning/Sparsity](https://github.com/ultralytics/yolov5/issues/304)
56 | * [Hyperparameter Evolution](https://github.com/ultralytics/yolov5/issues/607)
57 | * [TensorRT Deployment](https://github.com/wang-xinyu/tensorrtx)
58 |
59 |
60 | ## Environments
61 |
62 | YOLOv5 may be run in any of the following up-to-date verified environments (with all dependencies including [CUDA](https://developer.nvidia.com/cuda)/[CUDNN](https://developer.nvidia.com/cudnn), [Python](https://www.python.org/) and [PyTorch](https://pytorch.org/) preinstalled):
63 |
64 | - **Google Colab Notebook** with free GPU:
65 | - **Kaggle Notebook** with free GPU: [https://www.kaggle.com/ultralytics/yolov5](https://www.kaggle.com/ultralytics/yolov5)
66 | - **Google Cloud** Deep Learning VM. See [GCP Quickstart Guide](https://github.com/ultralytics/yolov5/wiki/GCP-Quickstart)
67 | - **Docker Image** https://hub.docker.com/r/ultralytics/yolov5. See [Docker Quickstart Guide](https://github.com/ultralytics/yolov5/wiki/Docker-Quickstart) 
68 |
69 |
70 | ## Inference
71 |
72 | detect.py runs inference on a variety of sources, downloading models automatically from the [latest YOLOv5 release](https://github.com/ultralytics/yolov5/releases) and saving results to `inference/output`.
73 | ```bash
74 | $ python detect.py --source 0 # webcam
75 | file.jpg # image
76 | file.mp4 # video
77 | path/ # directory
78 | path/*.jpg # glob
79 | rtsp://170.93.143.139/rtplive/470011e600ef003a004ee33696235daa # rtsp stream
80 | rtmp://192.168.1.105/live/test # rtmp stream
81 | http://112.50.243.8/PLTV/88888888/224/3221225900/1.m3u8 # http stream
82 | ```
83 |
84 | To run inference on example images in `inference/images`:
85 | ```bash
86 | $ python detect.py --source inference/images --weights yolov5s.pt --conf 0.25
87 |
88 | Namespace(agnostic_nms=False, augment=False, classes=None, conf_thres=0.25, device='', img_size=640, iou_thres=0.45, output='inference/output', save_conf=False, save_txt=False, source='inference/images', update=False, view_img=False, weights='yolov5s.pt')
89 | Using CUDA device0 _CudaDeviceProperties(name='Tesla V100-SXM2-16GB', total_memory=16160MB)
90 |
91 | Downloading https://github.com/ultralytics/yolov5/releases/download/v3.0/yolov5s.pt to yolov5s.pt... 100%|██████████████| 14.5M/14.5M [00:00<00:00, 21.3MB/s]
92 |
93 | Fusing layers...
94 | Model Summary: 140 layers, 7.45958e+06 parameters, 0 gradients
95 | image 1/2 yolov5/inference/images/bus.jpg: 640x480 4 persons, 1 buss, 1 skateboards, Done. (0.013s)
96 | image 2/2 yolov5/inference/images/zidane.jpg: 384x640 2 persons, 2 ties, Done. (0.013s)
97 | Results saved to yolov5/inference/output
98 | Done. (0.124s)
99 | ```
100 |
101 |
102 | ### PyTorch Hub
103 |
104 | To run **batched inference** with YOLOv5 and [PyTorch Hub](https://github.com/ultralytics/yolov5/issues/36):
105 | ```python
106 | import torch
107 | from PIL import Image
108 |
109 | # Model
110 | model = torch.hub.load('ultralytics/yolov5', 'yolov5s', pretrained=True).fuse().eval() # yolov5s.pt
111 | model = model.autoshape() # for autoshaping of PIL/cv2/np inputs and NMS
112 |
113 | # Images
114 | img1 = Image.open('zidane.jpg')
115 | img2 = Image.open('bus.jpg')
116 | imgs = [img1, img2] # batched list of images
117 |
118 | # Inference
119 | prediction = model(imgs, size=640) # includes NMS
120 | ```
121 | ```
122 |
123 | ## Training
124 |
125 | Download [COCO](https://github.com/ultralytics/yolov5/blob/master/data/scripts/get_coco.sh) and run command below. Training times for YOLOv5s/m/l/x are 2/4/6/8 days on a single V100 (multi-GPU times faster). Use the largest `--batch-size` your GPU allows (batch sizes shown for 16 GB devices).
126 | ```bash
127 | $ python train.py --data coco.yaml --cfg yolov5s.yaml --weights '' --batch-size 64
128 | yolov5m 40
129 | yolov5l 24
130 | yolov5x 16
131 | ```
132 |
133 |
134 |
135 | ## Citation
136 |
137 | [](https://zenodo.org/badge/latestdoi/264818686)
138 |
139 |
140 | ## About Us
141 |
142 | Ultralytics is a U.S.-based particle physics and AI startup with over 6 years of expertise supporting government, academic and business clients. We offer a wide range of vision AI services, spanning from simple expert advice up to delivery of fully customized, end-to-end production solutions, including:
143 | - **Cloud-based AI** systems operating on **hundreds of HD video streams in realtime.**
144 | - **Edge AI** integrated into custom iOS and Android apps for realtime **30 FPS video inference.**
145 | - **Custom data training**, hyperparameter evolution, and model exportation to any destination.
146 |
147 | For business inquiries and professional support requests please visit us at https://www.ultralytics.com.
148 |
149 |
150 | ## Contact
151 |
152 | **Issues should be raised directly in the repository.** For business inquiries or professional support requests please visit https://www.ultralytics.com or email Glenn Jocher at glenn.jocher@ultralytics.com.
153 |
--------------------------------------------------------------------------------
/v3.0/yolov5/config.yaml:
--------------------------------------------------------------------------------
1 | train: yolo_data/images/train
2 | val: yolo_data/images/validation
3 | nc: 4
4 | names: ['1',
5 | '2',
6 | '5',
7 | 'forefinger']
8 |
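
The order of `names` here matters: `detect.py` (below) keys its gesture logic off the numeric class indices, with `flag_1`…`flag_4` set when classes 0…3 are detected. A minimal sketch of that index-to-label mapping, not part of the repository:

```python
# Sketch only (not repo code): the class-index -> label mapping implied by
# config.yaml. detect.py sets flag_1..flag_4 when classes 0..3 are detected.
GESTURE_NAMES = ['1', '2', '5', 'forefinger']  # nc: 4

def gesture_of(cls_id: int) -> str:
    """Return the label for a YOLO class index (0..3)."""
    return GESTURE_NAMES[int(cls_id)]

print(gesture_of(0), gesture_of(3))  # 1 forefinger
```

In `detect.py`, classes 0 and 3 detected together trigger drawing, class 1 switches to a random colour, and class 2 resets the sketchpad.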
--------------------------------------------------------------------------------
/v3.0/yolov5/detect.py:
--------------------------------------------------------------------------------
1 | import argparse
2 | import os
3 | import shutil
4 | import time
5 | from pathlib import Path
6 |
7 | import cv2
8 | import torch
9 | import torch.backends.cudnn as cudnn
10 | from numpy import random
11 | import numpy as np
12 | import copy
13 | from PIL import Image
14 |
15 | from models.experimental import attempt_load
16 | from utils.datasets import LoadStreams, LoadImages
17 | from utils.general import (
18 | check_img_size, non_max_suppression, apply_classifier, scale_coords,
19 | xyxy2xywh, plot_one_box, strip_optimizer, set_logging)
20 | from utils.torch_utils import select_device, load_classifier, time_synchronized
21 |
22 |
23 | def get_dis(p1, p2):
24 | (x1, y1) = p1
25 | (x2, y2) = p2
26 | return np.sqrt((x1 - x2) ** 2 + (y1 - y2) ** 2)
27 |
28 |
29 | def detect(save_img=False):
30 | out, source, weights, view_img, save_txt, imgsz = \
31 | opt.save_dir, opt.source, opt.weights, opt.view_img, opt.save_txt, opt.img_size
32 | webcam = source.isnumeric() or source.startswith(('rtsp://', 'rtmp://', 'http://')) or source.endswith('.txt')
33 |
34 | # Initialize
35 | set_logging()
36 | device = select_device(opt.device)
37 | if os.path.exists(out): # output dir
38 | shutil.rmtree(out) # delete dir
39 | os.makedirs(out) # make new dir
40 | half = device.type != 'cpu' # half precision only supported on CUDA
41 |
42 | # Load model
43 | model = attempt_load(weights, map_location=device) # load FP32 model
44 | imgsz = check_img_size(imgsz, s=model.stride.max()) # check img_size
45 | if half:
46 | model.half() # to FP16
47 |
48 | # Second-stage classifier
49 | classify = False
50 | if classify:
51 | modelc = load_classifier(name='resnet101', n=2) # initialize
52 | modelc.load_state_dict(torch.load('weights/resnet101.pt', map_location=device)['model']) # load weights
53 | modelc.to(device).eval()
54 |
55 | # Set Dataloader
56 | vid_path, vid_writer = None, None
57 | if webcam:
58 | view_img = True
59 | cudnn.benchmark = True # set True to speed up constant image size inference
60 | dataset = LoadStreams(source, img_size=imgsz)
61 | else:
62 | save_img = True
63 | dataset = LoadImages(source, img_size=imgsz)
64 |
65 | # Get names and colors
66 | names = model.module.names if hasattr(model, 'module') else model.names
67 | colors = [[random.randint(0, 255) for _ in range(3)] for _ in range(len(names))]
68 |
69 | # Run inference
70 | t0 = time.time()
71 |
72 | init_pic = np.zeros((1080, 1920, 3), dtype=np.uint8)
73 | init_pic[:, :, :] = 255
74 | # init_pic = Image.open('data/ppt.jpg')
75 | # init_pic = np.array(init_pic, dtype=np.uint8)
76 | sketchpad = copy.copy(init_pic)
77 | last_point = 0
78 | last_point_time = 0
79 | dis_max = 300
80 | clr = (255, 0, 0)
81 | clr_list = [(255, 0, 0), (0, 255, 0), (0, 0, 255), (255, 255, 0), (255, 0, 255), (0, 255, 255),
82 | (0, 0, 0)]
83 | line_width = 10
84 |
85 | img = torch.zeros((1, 3, imgsz, imgsz), device=device) # init img
86 | _ = model(img.half() if half else img) if device.type != 'cpu' else None # run once
87 | for path, img, im0s, vid_cap in dataset:
88 | img = torch.from_numpy(img).to(device)
89 | img = img.half() if half else img.float() # uint8 to fp16/32
90 | img /= 255.0 # 0 - 255 to 0.0 - 1.0
91 | if img.ndimension() == 3:
92 | img = img.unsqueeze(0)
93 |
94 | # Inference
95 | t1 = time_synchronized()
96 | pred = model(img, augment=opt.augment)[0]
97 |
98 | # Apply NMS
99 | pred = non_max_suppression(pred, opt.conf_thres, opt.iou_thres, classes=opt.classes, agnostic=opt.agnostic_nms)
100 | t2 = time_synchronized()
101 |
102 | # Apply Classifier
103 | if classify:
104 | pred = apply_classifier(pred, modelc, img, im0s)
105 |
106 | # Process detections
107 | for i, det in enumerate(pred): # detections per image
108 | if webcam: # batch_size >= 1
109 | p, s, im0 = path[i], '%g: ' % i, im0s[i].copy()
110 | else:
111 | p, s, im0 = path, '', im0s
112 |
113 | save_path = str(Path(out) / Path(p).name)
114 | txt_path = str(Path(out) / Path(p).stem) + ('_%g' % dataset.frame if dataset.mode == 'video' else '')
115 | s += '%gx%g ' % img.shape[2:] # print string
116 | gn = torch.tensor(im0.shape)[[1, 0, 1, 0]] # normalization gain whwh
117 | if det is not None and len(det):
118 | # Rescale boxes from img_size to im0 size
119 | det[:, :4] = scale_coords(img.shape[2:], det[:, :4], im0.shape).round()
120 |
121 | # Print results
122 | for c in det[:, -1].unique():
123 | n = (det[:, -1] == c).sum() # detections per class
124 | s += '%g %ss, ' % (n, names[int(c)]) # add to string
125 |
126 | # Write results
127 | flag_1 = False
128 | flag_2 = False
129 | flag_3 = False
130 | flag_4 = False
131 | flag_xyxy = 0
132 | for *xyxy, conf, cls in reversed(det):
133 | if cls == 0:
134 | flag_1 = True
135 | elif cls == 1:
136 | flag_2 = True
137 | elif cls == 2:
138 | flag_3 = True
139 | elif cls == 3:
140 | flag_4 = True
141 | flag_xyxy = torch.tensor(xyxy).view(1, 4).cpu().numpy()
142 | flag_xyxy = flag_xyxy[0]
143 | # print(flag_xyxy)
144 |
145 | if save_txt: # Write to file
146 | xywh = (xyxy2xywh(torch.tensor(xyxy).view(1, 4)) / gn).view(-1).tolist() # normalized xywh
147 | line = (cls, conf, *xywh) if opt.save_conf else (cls, *xywh) # label format
148 | with open(txt_path + '.txt', 'a') as f:
149 | f.write(('%g ' * len(line) + '\n') % line)
150 |
151 | if save_img or view_img: # Add bbox to image
152 | label = '%s %.2f' % (names[int(cls)], conf)
153 | plot_one_box(xyxy, im0, label=label, color=colors[int(cls)], line_thickness=3)
154 |
155 | # order
156 | if flag_1 and flag_4:
157 | point = ((flag_xyxy[0] + flag_xyxy[2]) / 2 * 3, (flag_xyxy[1] + flag_xyxy[3]) / 2 * 3)
158 | x, y = point
159 | point = (int(x), int(y))
160 | cv2.circle(sketchpad, point, line_width, clr)
161 | if last_point != 0:
162 | if time.time()-last_point_time < 0.5 and get_dis(last_point, point) < dis_max:
163 | # print(point, last_point)
164 | cv2.line(sketchpad, point, last_point, clr, line_width)
165 | last_point = point
166 | last_point_time = time.time()
167 | if flag_3:
168 | sketchpad = init_pic.copy()
169 |
170 | if flag_2:
171 | clr = clr_list[random.randint(len(clr_list))]
172 |
173 | # Print time (inference + NMS)
174 | print('%sDone. (%.3fs)' % (s, t2 - t1))
175 |
176 | # Stream results
177 | if view_img:
178 | cv2.namedWindow(p, 0)
179 | cv2.resizeWindow(p, 640, 480)
180 | cv2.imshow(p, cv2.flip(im0, 1))
181 | cv2.namedWindow('sketchpad', cv2.WINDOW_NORMAL)
182 | cv2.setWindowProperty('sketchpad', cv2.WND_PROP_FULLSCREEN, cv2.WINDOW_FULLSCREEN)
183 | cv2.imshow('sketchpad', cv2.flip(sketchpad, 1))
184 |
185 | if cv2.waitKey(1) == ord('q'): # q to quit
186 | raise StopIteration
187 |
188 | # Save results (image with detections)
189 | if save_img:
190 | if dataset.mode == 'images':
191 | cv2.imwrite(save_path, im0)
192 | else:
193 | if vid_path != save_path: # new video
194 | vid_path = save_path
195 | if isinstance(vid_writer, cv2.VideoWriter):
196 | vid_writer.release() # release previous video writer
197 |
198 | fourcc = 'mp4v' # output video codec
199 | fps = vid_cap.get(cv2.CAP_PROP_FPS)
200 | w = int(vid_cap.get(cv2.CAP_PROP_FRAME_WIDTH))
201 | h = int(vid_cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
202 | vid_writer = cv2.VideoWriter(save_path, cv2.VideoWriter_fourcc(*fourcc), fps, (w, h))
203 | vid_writer.write(im0)
204 |
205 | if save_txt or save_img:
206 | print('Results saved to %s' % Path(out))
207 |
208 | print('Done. (%.3fs)' % (time.time() - t0))
209 |
210 |
211 | if __name__ == '__main__':
212 | parser = argparse.ArgumentParser()
213 | parser.add_argument('--weights', nargs='+', type=str, default='yolov5s.pt', help='model.pt path(s)')
214 | parser.add_argument('--source', type=str, default='inference/images', help='source') # file/folder, 0 for webcam
215 | parser.add_argument('--img-size', type=int, default=640, help='inference size (pixels)')
216 | parser.add_argument('--conf-thres', type=float, default=0.25, help='object confidence threshold')
217 | parser.add_argument('--iou-thres', type=float, default=0.45, help='IOU threshold for NMS')
218 | parser.add_argument('--device', default='', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')
219 | parser.add_argument('--view-img', action='store_true', help='display results')
220 | parser.add_argument('--save-txt', action='store_true', help='save results to *.txt')
221 | parser.add_argument('--save-conf', action='store_true', help='save confidences in --save-txt labels')
222 | parser.add_argument('--save-dir', type=str, default='inference/output', help='directory to save results')
223 | parser.add_argument('--classes', nargs='+', type=int, help='filter by class: --class 0, or --class 0 2 3')
224 | parser.add_argument('--agnostic-nms', action='store_true', help='class-agnostic NMS')
225 | parser.add_argument('--augment', action='store_true', help='augmented inference')
226 | parser.add_argument('--update', action='store_true', help='update all models')
227 | opt = parser.parse_args()
228 | print(opt)
229 |
230 | with torch.no_grad():
231 | if opt.update: # update all models (to fix SourceChangeWarning)
232 | for opt.weights in ['yolov5s.pt', 'yolov5m.pt', 'yolov5l.pt', 'yolov5x.pt']:
233 | detect()
234 | strip_optimizer(opt.weights)
235 | else:
236 | detect()
237 |
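
The drawing logic above scales the detected fingertip box centre by 3× onto a 1920×1080 white canvas, and only joins consecutive points when they are close in both time and space, which suppresses stray strokes when detection briefly drops out. A minimal, self-contained sketch of that gating rule (helper names assumed, not repo code):

```python
# Sketch of the stroke-gating rule used in detect.py: connect two fingertip
# points only if the new one arrives within 0.5 s and within dis_max pixels
# of the previous one (dis_max = 300 in detect.py).
import numpy as np

DIS_MAX = 300
MAX_GAP_S = 0.5

def get_dis(p1, p2):
    (x1, y1), (x2, y2) = p1, p2
    return np.sqrt((x1 - x2) ** 2 + (y1 - y2) ** 2)

def should_connect(last_point, last_time, point, now):
    if last_point is None:
        return False
    return (now - last_time) < MAX_GAP_S and get_dis(last_point, point) < DIS_MAX

print(should_connect((0, 0), 0.0, (100, 0), now=0.1))  # True: close and recent
print(should_connect((0, 0), 0.0, (500, 0), now=0.1))  # False: too far apart
```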
--------------------------------------------------------------------------------
/v3.0/yolov5/hubconf.py:
--------------------------------------------------------------------------------
1 | """File for accessing YOLOv5 via PyTorch Hub https://pytorch.org/hub/
2 |
3 | Usage:
4 | import torch
5 | model = torch.hub.load('ultralytics/yolov5', 'yolov5s', pretrained=True, channels=3, classes=80)
6 | """
7 |
8 | dependencies = ['torch', 'yaml']
9 | import os
10 |
11 | import torch
12 |
13 | from models.yolo import Model
14 | from utils.general import set_logging
15 | from utils.google_utils import attempt_download
16 |
17 | set_logging()
18 |
19 |
20 | def create(name, pretrained, channels, classes):
21 | """Creates a specified YOLOv5 model
22 |
23 | Arguments:
24 | name (str): name of model, i.e. 'yolov5s'
25 | pretrained (bool): load pretrained weights into the model
26 | channels (int): number of input channels
27 | classes (int): number of model classes
28 |
29 | Returns:
30 | pytorch model
31 | """
32 | config = os.path.join(os.path.dirname(__file__), 'models', f'{name}.yaml') # model.yaml path
33 | try:
34 | model = Model(config, channels, classes)
35 | if pretrained:
36 | fname = f'{name}.pt' # checkpoint filename
37 | attempt_download(fname) # download if not found locally
38 | ckpt = torch.load(fname, map_location=torch.device('cpu')) # load
39 | state_dict = ckpt['model'].float().state_dict() # to FP32
40 | state_dict = {k: v for k, v in state_dict.items() if model.state_dict()[k].shape == v.shape} # filter
41 | model.load_state_dict(state_dict, strict=False) # load
42 | if len(ckpt['model'].names) == classes:
43 | model.names = ckpt['model'].names # set class names attribute
44 | # model = model.autoshape() # for autoshaping of PIL/cv2/np inputs and NMS
45 | return model
46 |
47 | except Exception as e:
48 | help_url = 'https://github.com/ultralytics/yolov5/issues/36'
49 | s = 'Cache may be out of date, try force_reload=True. See %s for help.' % help_url
50 | raise Exception(s) from e
51 |
52 |
53 | def yolov5s(pretrained=False, channels=3, classes=80):
54 | """YOLOv5-small model from https://github.com/ultralytics/yolov5
55 |
56 | Arguments:
57 | pretrained (bool): load pretrained weights into the model, default=False
58 | channels (int): number of input channels, default=3
59 | classes (int): number of model classes, default=80
60 |
61 | Returns:
62 | pytorch model
63 | """
64 | return create('yolov5s', pretrained, channels, classes)
65 |
66 |
67 | def yolov5m(pretrained=False, channels=3, classes=80):
68 | """YOLOv5-medium model from https://github.com/ultralytics/yolov5
69 |
70 | Arguments:
71 | pretrained (bool): load pretrained weights into the model, default=False
72 | channels (int): number of input channels, default=3
73 | classes (int): number of model classes, default=80
74 |
75 | Returns:
76 | pytorch model
77 | """
78 | return create('yolov5m', pretrained, channels, classes)
79 |
80 |
81 | def yolov5l(pretrained=False, channels=3, classes=80):
82 | """YOLOv5-large model from https://github.com/ultralytics/yolov5
83 |
84 | Arguments:
85 | pretrained (bool): load pretrained weights into the model, default=False
86 | channels (int): number of input channels, default=3
87 | classes (int): number of model classes, default=80
88 |
89 | Returns:
90 | pytorch model
91 | """
92 | return create('yolov5l', pretrained, channels, classes)
93 |
94 |
95 | def yolov5x(pretrained=False, channels=3, classes=80):
96 | """YOLOv5-xlarge model from https://github.com/ultralytics/yolov5
97 |
98 | Arguments:
99 | pretrained (bool): load pretrained weights into the model, default=False
100 | channels (int): number of input channels, default=3
101 | classes (int): number of model classes, default=80
102 |
103 | Returns:
104 | pytorch model
105 | """
106 | return create('yolov5x', pretrained, channels, classes)
107 |
108 |
109 | if __name__ == '__main__':
110 | model = create(name='yolov5s', pretrained=True, channels=3, classes=80) # example
111 | model = model.fuse().eval().autoshape() # for autoshaping of PIL/cv2/np inputs and NMS
112 |
113 | # Verify inference
114 | from PIL import Image
115 |
116 | img = Image.open('inference/images/zidane.jpg')
117 | y = model(img)
118 | print(y[0].shape)
119 |
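
As a hedged usage sketch (not in the repository): the entrypoints above can also build an untrained model sized for this project's four gesture classes, which is handy for inspecting the architecture before training. Run it from the `yolov5/` directory so `models` and `utils` resolve:

```python
# Sketch: build yolov5s with a 4-class head via the hubconf entrypoint above.
# pretrained=False skips the COCO weight download; custom-trained weights
# (e.g. from train.py) are still needed for actual gesture detection.
from hubconf import yolov5s

model = yolov5s(pretrained=False, channels=3, classes=4)
print(model.yaml['nc'])  # 4
```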
--------------------------------------------------------------------------------
/v3.0/yolov5/models/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/yyyanbj/mid-air-draw/9ce05fe981e9037d8c0151be66c0254f8f2523d5/v3.0/yolov5/models/__init__.py
--------------------------------------------------------------------------------
/v3.0/yolov5/models/common.py:
--------------------------------------------------------------------------------
1 | # This file contains modules common to various models
2 |
3 | import math
4 | import numpy as np
5 | import torch
6 | import torch.nn as nn
7 |
8 | from utils.datasets import letterbox
9 | from utils.general import non_max_suppression, make_divisible, scale_coords
10 |
11 |
12 | def autopad(k, p=None): # kernel, padding
13 | # Pad to 'same'
14 | if p is None:
15 | p = k // 2 if isinstance(k, int) else [x // 2 for x in k] # auto-pad
16 | return p
17 |
18 |
19 | def DWConv(c1, c2, k=1, s=1, act=True):
20 | # Depthwise convolution
21 | return Conv(c1, c2, k, s, g=math.gcd(c1, c2), act=act)
22 |
23 |
24 | class Conv(nn.Module):
25 | # Standard convolution
26 | def __init__(self, c1, c2, k=1, s=1, p=None, g=1, act=True): # ch_in, ch_out, kernel, stride, padding, groups
27 | super(Conv, self).__init__()
28 | self.conv = nn.Conv2d(c1, c2, k, s, autopad(k, p), groups=g, bias=False)
29 | self.bn = nn.BatchNorm2d(c2)
30 | self.act = nn.Hardswish() if act else nn.Identity()
31 |
32 | def forward(self, x):
33 | return self.act(self.bn(self.conv(x)))
34 |
35 | def fuseforward(self, x):
36 | return self.act(self.conv(x))
37 |
38 |
39 | class Bottleneck(nn.Module):
40 | # Standard bottleneck
41 | def __init__(self, c1, c2, shortcut=True, g=1, e=0.5): # ch_in, ch_out, shortcut, groups, expansion
42 | super(Bottleneck, self).__init__()
43 | c_ = int(c2 * e) # hidden channels
44 | self.cv1 = Conv(c1, c_, 1, 1)
45 | self.cv2 = Conv(c_, c2, 3, 1, g=g)
46 | self.add = shortcut and c1 == c2
47 |
48 | def forward(self, x):
49 | return x + self.cv2(self.cv1(x)) if self.add else self.cv2(self.cv1(x))
50 |
51 |
52 | class BottleneckCSP(nn.Module):
53 | # CSP Bottleneck https://github.com/WongKinYiu/CrossStagePartialNetworks
54 | def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5): # ch_in, ch_out, number, shortcut, groups, expansion
55 | super(BottleneckCSP, self).__init__()
56 | c_ = int(c2 * e) # hidden channels
57 | self.cv1 = Conv(c1, c_, 1, 1)
58 | self.cv2 = nn.Conv2d(c1, c_, 1, 1, bias=False)
59 | self.cv3 = nn.Conv2d(c_, c_, 1, 1, bias=False)
60 | self.cv4 = Conv(2 * c_, c2, 1, 1)
61 | self.bn = nn.BatchNorm2d(2 * c_) # applied to cat(cv2, cv3)
62 | self.act = nn.LeakyReLU(0.1, inplace=True)
63 | self.m = nn.Sequential(*[Bottleneck(c_, c_, shortcut, g, e=1.0) for _ in range(n)])
64 |
65 | def forward(self, x):
66 | y1 = self.cv3(self.m(self.cv1(x)))
67 | y2 = self.cv2(x)
68 | return self.cv4(self.act(self.bn(torch.cat((y1, y2), dim=1))))
69 |
70 |
71 | class SPP(nn.Module):
72 | # Spatial pyramid pooling layer used in YOLOv3-SPP
73 | def __init__(self, c1, c2, k=(5, 9, 13)):
74 | super(SPP, self).__init__()
75 | c_ = c1 // 2 # hidden channels
76 | self.cv1 = Conv(c1, c_, 1, 1)
77 | self.cv2 = Conv(c_ * (len(k) + 1), c2, 1, 1)
78 | self.m = nn.ModuleList([nn.MaxPool2d(kernel_size=x, stride=1, padding=x // 2) for x in k])
79 |
80 | def forward(self, x):
81 | x = self.cv1(x)
82 | return self.cv2(torch.cat([x] + [m(x) for m in self.m], 1))
83 |
84 |
85 | class Focus(nn.Module):
86 | # Focus wh information into c-space
87 | def __init__(self, c1, c2, k=1, s=1, p=None, g=1, act=True): # ch_in, ch_out, kernel, stride, padding, groups
88 | super(Focus, self).__init__()
89 | self.conv = Conv(c1 * 4, c2, k, s, p, g, act)
90 |
91 | def forward(self, x): # x(b,c,w,h) -> y(b,4c,w/2,h/2)
92 | return self.conv(torch.cat([x[..., ::2, ::2], x[..., 1::2, ::2], x[..., ::2, 1::2], x[..., 1::2, 1::2]], 1))
93 |
94 |
95 | class Concat(nn.Module):
96 | # Concatenate a list of tensors along dimension
97 | def __init__(self, dimension=1):
98 | super(Concat, self).__init__()
99 | self.d = dimension
100 |
101 | def forward(self, x):
102 | return torch.cat(x, self.d)
103 |
104 |
105 | class NMS(nn.Module):
106 | # Non-Maximum Suppression (NMS) module
107 | conf = 0.25 # confidence threshold
108 | iou = 0.45 # IoU threshold
109 | classes = None # (optional list) filter by class
110 |
111 | def __init__(self):
112 | super(NMS, self).__init__()
113 |
114 | def forward(self, x):
115 | return non_max_suppression(x[0], conf_thres=self.conf, iou_thres=self.iou, classes=self.classes)
116 |
117 |
118 | class autoShape(nn.Module):
119 | # input-robust model wrapper for passing cv2/np/PIL/torch inputs. Includes preprocessing, inference and NMS
120 | img_size = 640 # inference size (pixels)
121 | conf = 0.25 # NMS confidence threshold
122 | iou = 0.45 # NMS IoU threshold
123 | classes = None # (optional list) filter by class
124 |
125 | def __init__(self, model):
126 | super(autoShape, self).__init__()
127 | self.model = model
128 |
129 | def forward(self, x, size=640, augment=False, profile=False):
130 | # supports inference from various sources. For height=720, width=1280, RGB images example inputs are:
131 | # opencv: x = cv2.imread('image.jpg')[:,:,::-1] # HWC BGR to RGB x(720,1280,3)
132 | # PIL: x = Image.open('image.jpg') # HWC x(720,1280,3)
133 | # numpy: x = np.zeros((720,1280,3)) # HWC
134 | # torch: x = torch.zeros(16,3,720,1280) # BCHW
135 | # multiple: x = [Image.open('image1.jpg'), Image.open('image2.jpg'), ...] # list of images
136 |
137 | p = next(self.model.parameters()) # for device and type
138 | if isinstance(x, torch.Tensor): # torch
139 | return self.model(x.to(p.device).type_as(p), augment, profile) # inference
140 |
141 | # Pre-process
142 | if not isinstance(x, list):
143 | x = [x]
144 | shape0, shape1 = [], [] # image and inference shapes
145 | batch = range(len(x)) # batch size
146 | for i in batch:
147 | x[i] = np.array(x[i])[:, :, :3] # up to 3 channels if png
148 | s = x[i].shape[:2] # HWC
149 | shape0.append(s) # image shape
150 | g = (size / max(s)) # gain
151 | shape1.append([y * g for y in s])
152 | shape1 = [make_divisible(x, int(self.stride.max())) for x in np.stack(shape1, 0).max(0)] # inference shape
153 | x = [letterbox(x[i], new_shape=shape1, auto=False)[0] for i in batch] # pad
154 | x = np.stack(x, 0) if batch[-1] else x[0][None] # stack
155 | x = np.ascontiguousarray(x.transpose((0, 3, 1, 2))) # BHWC to BCHW
156 | x = torch.from_numpy(x).to(p.device).type_as(p) / 255. # uint8 to fp16/32
157 |
158 | # Inference
159 | x = self.model(x, augment, profile) # forward
160 | x = non_max_suppression(x[0], conf_thres=self.conf, iou_thres=self.iou, classes=self.classes) # NMS
161 |
162 | # Post-process
163 | for i in batch:
164 | if x[i] is not None:
165 | x[i][:, :4] = scale_coords(shape1, x[i][:, :4], shape0[i])
166 | return x
167 |
168 |
169 | class Flatten(nn.Module):
170 | # Use after nn.AdaptiveAvgPool2d(1) to remove last 2 dimensions
171 | @staticmethod
172 | def forward(x):
173 | return x.view(x.size(0), -1)
174 |
175 |
176 | class Classify(nn.Module):
177 | # Classification head, i.e. x(b,c1,20,20) to x(b,c2)
178 | def __init__(self, c1, c2, k=1, s=1, p=None, g=1): # ch_in, ch_out, kernel, stride, padding, groups
179 | super(Classify, self).__init__()
180 | self.aap = nn.AdaptiveAvgPool2d(1) # to x(b,c1,1,1)
181 | self.conv = nn.Conv2d(c1, c2, k, s, autopad(k, p), groups=g, bias=False) # to x(b,c2,1,1)
182 | self.flat = Flatten()
183 |
184 | def forward(self, x):
185 | z = torch.cat([self.aap(y) for y in (x if isinstance(x, list) else [x])], 1) # cat if list
186 | return self.flat(self.conv(z)) # flatten to x(b,c2)
187 |
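
One quick way to sanity-check the `Focus` module above is to confirm its space-to-depth behaviour: spatial resolution halves while the channel count is set by the following convolution. A small sketch, assuming it is run from the `yolov5/` directory with PyTorch installed:

```python
# Sketch: shape check for Focus. The four ::2 slices turn (b, 3, 640, 640)
# into (b, 12, 320, 320) before the Conv maps channels to c2 = 32.
import torch
from models.common import Focus

m = Focus(3, 32, k=3)
x = torch.zeros(1, 3, 640, 640)
print(m(x).shape)  # torch.Size([1, 32, 320, 320])
```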
--------------------------------------------------------------------------------
/v3.0/yolov5/models/experimental.py:
--------------------------------------------------------------------------------
1 | # This file contains experimental modules
2 |
3 | import numpy as np
4 | import torch
5 | import torch.nn as nn
6 |
7 | from models.common import Conv, DWConv
8 | from utils.google_utils import attempt_download
9 |
10 |
11 | class CrossConv(nn.Module):
12 | # Cross Convolution Downsample
13 | def __init__(self, c1, c2, k=3, s=1, g=1, e=1.0, shortcut=False):
14 | # ch_in, ch_out, kernel, stride, groups, expansion, shortcut
15 | super(CrossConv, self).__init__()
16 | c_ = int(c2 * e) # hidden channels
17 | self.cv1 = Conv(c1, c_, (1, k), (1, s))
18 | self.cv2 = Conv(c_, c2, (k, 1), (s, 1), g=g)
19 | self.add = shortcut and c1 == c2
20 |
21 | def forward(self, x):
22 | return x + self.cv2(self.cv1(x)) if self.add else self.cv2(self.cv1(x))
23 |
24 |
25 | class C3(nn.Module):
26 | # Cross Convolution CSP
27 | def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5): # ch_in, ch_out, number, shortcut, groups, expansion
28 | super(C3, self).__init__()
29 | c_ = int(c2 * e) # hidden channels
30 | self.cv1 = Conv(c1, c_, 1, 1)
31 | self.cv2 = nn.Conv2d(c1, c_, 1, 1, bias=False)
32 | self.cv3 = nn.Conv2d(c_, c_, 1, 1, bias=False)
33 | self.cv4 = Conv(2 * c_, c2, 1, 1)
34 | self.bn = nn.BatchNorm2d(2 * c_) # applied to cat(cv2, cv3)
35 | self.act = nn.LeakyReLU(0.1, inplace=True)
36 | self.m = nn.Sequential(*[CrossConv(c_, c_, 3, 1, g, 1.0, shortcut) for _ in range(n)])
37 |
38 | def forward(self, x):
39 | y1 = self.cv3(self.m(self.cv1(x)))
40 | y2 = self.cv2(x)
41 | return self.cv4(self.act(self.bn(torch.cat((y1, y2), dim=1))))
42 |
43 |
44 | class Sum(nn.Module):
45 | # Weighted sum of 2 or more layers https://arxiv.org/abs/1911.09070
46 | def __init__(self, n, weight=False): # n: number of inputs
47 | super(Sum, self).__init__()
48 | self.weight = weight # apply weights boolean
49 | self.iter = range(n - 1) # iter object
50 | if weight:
51 | self.w = nn.Parameter(-torch.arange(1., n) / 2, requires_grad=True) # layer weights
52 |
53 | def forward(self, x):
54 | y = x[0] # no weight
55 | if self.weight:
56 | w = torch.sigmoid(self.w) * 2
57 | for i in self.iter:
58 | y = y + x[i + 1] * w[i]
59 | else:
60 | for i in self.iter:
61 | y = y + x[i + 1]
62 | return y
63 |
64 |
65 | class GhostConv(nn.Module):
66 | # Ghost Convolution https://github.com/huawei-noah/ghostnet
67 | def __init__(self, c1, c2, k=1, s=1, g=1, act=True): # ch_in, ch_out, kernel, stride, groups
68 | super(GhostConv, self).__init__()
69 | c_ = c2 // 2 # hidden channels
70 | self.cv1 = Conv(c1, c_, k, s, None, g, act)
71 | self.cv2 = Conv(c_, c_, 5, 1, None, c_, act)
72 |
73 | def forward(self, x):
74 | y = self.cv1(x)
75 | return torch.cat([y, self.cv2(y)], 1)
76 |
77 |
78 | class GhostBottleneck(nn.Module):
79 | # Ghost Bottleneck https://github.com/huawei-noah/ghostnet
80 | def __init__(self, c1, c2, k, s):
81 | super(GhostBottleneck, self).__init__()
82 | c_ = c2 // 2
83 | self.conv = nn.Sequential(GhostConv(c1, c_, 1, 1), # pw
84 | DWConv(c_, c_, k, s, act=False) if s == 2 else nn.Identity(), # dw
85 | GhostConv(c_, c2, 1, 1, act=False)) # pw-linear
86 | self.shortcut = nn.Sequential(DWConv(c1, c1, k, s, act=False),
87 | Conv(c1, c2, 1, 1, act=False)) if s == 2 else nn.Identity()
88 |
89 | def forward(self, x):
90 | return self.conv(x) + self.shortcut(x)
91 |
92 |
93 | class MixConv2d(nn.Module):
94 | # Mixed Depthwise Conv https://arxiv.org/abs/1907.09595
95 | def __init__(self, c1, c2, k=(1, 3), s=1, equal_ch=True):
96 | super(MixConv2d, self).__init__()
97 | groups = len(k)
98 | if equal_ch: # equal c_ per group
99 | i = torch.linspace(0, groups - 1E-6, c2).floor() # c2 indices
100 | c_ = [(i == g).sum() for g in range(groups)] # intermediate channels
101 | else: # equal weight.numel() per group
102 | b = [c2] + [0] * groups
103 | a = np.eye(groups + 1, groups, k=-1)
104 | a -= np.roll(a, 1, axis=1)
105 | a *= np.array(k) ** 2
106 | a[0] = 1
107 | c_ = np.linalg.lstsq(a, b, rcond=None)[0].round() # solve for equal weight indices, ax = b
108 |
109 | self.m = nn.ModuleList([nn.Conv2d(c1, int(c_[g]), k[g], s, k[g] // 2, bias=False) for g in range(groups)])
110 | self.bn = nn.BatchNorm2d(c2)
111 | self.act = nn.LeakyReLU(0.1, inplace=True)
112 |
113 | def forward(self, x):
114 | return x + self.act(self.bn(torch.cat([m(x) for m in self.m], 1)))
115 |
116 |
117 | class Ensemble(nn.ModuleList):
118 | # Ensemble of models
119 | def __init__(self):
120 | super(Ensemble, self).__init__()
121 |
122 | def forward(self, x, augment=False):
123 | y = []
124 | for module in self:
125 | y.append(module(x, augment)[0])
126 | # y = torch.stack(y).max(0)[0] # max ensemble
127 | # y = torch.cat(y, 1) # nms ensemble
128 | y = torch.stack(y).mean(0) # mean ensemble
129 | return y, None # inference, train output
130 |
131 |
132 | def attempt_load(weights, map_location=None):
133 | # Loads an ensemble of models weights=[a,b,c] or a single model weights=[a] or weights=a
134 | model = Ensemble()
135 | for w in weights if isinstance(weights, list) else [weights]:
136 | attempt_download(w)
137 | model.append(torch.load(w, map_location=map_location)['model'].float().fuse().eval()) # load FP32 model
138 |
139 | # Compatibility updates
140 | for m in model.modules():
141 | if type(m) in [nn.Hardswish, nn.LeakyReLU, nn.ReLU, nn.ReLU6]:
142 | m.inplace = True # pytorch 1.7.0 compatibility
143 | elif type(m) is Conv:
144 | m._non_persistent_buffers_set = set() # pytorch 1.6.0 compatibility
145 |
146 | if len(model) == 1:
147 | return model[-1] # return model
148 | else:
149 | print('Ensemble created with %s\n' % weights)
150 | for k in ['names', 'stride']:
151 | setattr(model, k, getattr(model[-1], k))
152 | return model # return ensemble
153 |
--------------------------------------------------------------------------------
/v3.0/yolov5/models/export.py:
--------------------------------------------------------------------------------
1 | """Exports a YOLOv5 *.pt model to ONNX and TorchScript formats
2 |
3 | Usage:
4 | $ export PYTHONPATH="$PWD" && python models/export.py --weights ./weights/yolov5s.pt --img 640 --batch 1
5 | """
6 |
7 | import argparse
8 | import sys
9 | import time
10 |
11 | sys.path.append('./') # to run '$ python *.py' files in subdirectories
12 |
13 | import torch
14 | import torch.nn as nn
15 |
16 | import models
17 | from models.experimental import attempt_load
18 | from utils.activations import Hardswish
19 | from utils.general import set_logging, check_img_size
20 |
21 | if __name__ == '__main__':
22 | parser = argparse.ArgumentParser()
23 | parser.add_argument('--weights', type=str, default='./yolov5s.pt', help='weights path') # from yolov5/models/
24 | parser.add_argument('--img-size', nargs='+', type=int, default=[640, 640], help='image size') # height, width
25 | parser.add_argument('--batch-size', type=int, default=1, help='batch size')
26 | opt = parser.parse_args()
27 | opt.img_size *= 2 if len(opt.img_size) == 1 else 1 # expand
28 | print(opt)
29 | set_logging()
30 | t = time.time()
31 |
32 | # Load PyTorch model
33 | model = attempt_load(opt.weights, map_location=torch.device('cpu')) # load FP32 model
34 | labels = model.names
35 |
36 | # Checks
37 | gs = int(max(model.stride)) # grid size (max stride)
38 | opt.img_size = [check_img_size(x, gs) for x in opt.img_size] # verify img_size are gs-multiples
39 |
40 | # Input
41 | img = torch.zeros(opt.batch_size, 3, *opt.img_size) # image size(1,3,320,192) iDetection
42 |
43 | # Update model
44 | for k, m in model.named_modules():
45 | m._non_persistent_buffers_set = set() # pytorch 1.6.0 compatibility
46 | if isinstance(m, models.common.Conv) and isinstance(m.act, nn.Hardswish):
47 | m.act = Hardswish() # assign activation
48 | # if isinstance(m, models.yolo.Detect):
49 | # m.forward = m.forward_export # assign forward (optional)
50 | model.model[-1].export = True # set Detect() layer export=True
51 | y = model(img) # dry run
52 |
53 | # TorchScript export
54 | try:
55 | print('\nStarting TorchScript export with torch %s...' % torch.__version__)
56 | f = opt.weights.replace('.pt', '.torchscript.pt') # filename
57 | ts = torch.jit.trace(model, img)
58 | ts.save(f)
59 | print('TorchScript export success, saved as %s' % f)
60 | except Exception as e:
61 | print('TorchScript export failure: %s' % e)
62 |
63 | # ONNX export
64 | try:
65 | import onnx
66 |
67 | print('\nStarting ONNX export with onnx %s...' % onnx.__version__)
68 | f = opt.weights.replace('.pt', '.onnx') # filename
69 | torch.onnx.export(model, img, f, verbose=False, opset_version=12, input_names=['images'],
70 | output_names=['classes', 'boxes'] if y is None else ['output'])
71 |
72 | # Checks
73 | onnx_model = onnx.load(f) # load onnx model
74 | onnx.checker.check_model(onnx_model) # check onnx model
75 | # print(onnx.helper.printable_graph(onnx_model.graph)) # print a human readable model
76 | print('ONNX export success, saved as %s' % f)
77 | except Exception as e:
78 | print('ONNX export failure: %s' % e)
79 |
80 | # CoreML export
81 | try:
82 | import coremltools as ct
83 |
84 | print('\nStarting CoreML export with coremltools %s...' % ct.__version__)
85 | # convert model from torchscript and apply pixel scaling as per detect.py
86 | model = ct.convert(ts, inputs=[ct.ImageType(name='image', shape=img.shape, scale=1 / 255.0, bias=[0, 0, 0])])
87 | f = opt.weights.replace('.pt', '.mlmodel') # filename
88 | model.save(f)
89 | print('CoreML export success, saved as %s' % f)
90 | except Exception as e:
91 | print('CoreML export failure: %s' % e)
92 |
93 | # Finish
94 | print('\nExport complete (%.2fs). Visualize with https://github.com/lutzroeder/netron.' % (time.time() - t))
95 |
--------------------------------------------------------------------------------
/v3.0/yolov5/models/hub/yolov3-spp.yaml:
--------------------------------------------------------------------------------
1 | # parameters
2 | nc: 80 # number of classes
3 | depth_multiple: 1.0 # model depth multiple
4 | width_multiple: 1.0 # layer channel multiple
5 |
6 | # anchors
7 | anchors:
8 | - [10,13, 16,30, 33,23] # P3/8
9 | - [30,61, 62,45, 59,119] # P4/16
10 | - [116,90, 156,198, 373,326] # P5/32
11 |
12 | # darknet53 backbone
13 | backbone:
14 | # [from, number, module, args]
15 | [[-1, 1, Conv, [32, 3, 1]], # 0
16 | [-1, 1, Conv, [64, 3, 2]], # 1-P1/2
17 | [-1, 1, Bottleneck, [64]],
18 | [-1, 1, Conv, [128, 3, 2]], # 3-P2/4
19 | [-1, 2, Bottleneck, [128]],
20 | [-1, 1, Conv, [256, 3, 2]], # 5-P3/8
21 | [-1, 8, Bottleneck, [256]],
22 | [-1, 1, Conv, [512, 3, 2]], # 7-P4/16
23 | [-1, 8, Bottleneck, [512]],
24 | [-1, 1, Conv, [1024, 3, 2]], # 9-P5/32
25 | [-1, 4, Bottleneck, [1024]], # 10
26 | ]
27 |
28 | # YOLOv3-SPP head
29 | head:
30 | [[-1, 1, Bottleneck, [1024, False]],
31 | [-1, 1, SPP, [512, [5, 9, 13]]],
32 | [-1, 1, Conv, [1024, 3, 1]],
33 | [-1, 1, Conv, [512, 1, 1]],
34 | [-1, 1, Conv, [1024, 3, 1]], # 15 (P5/32-large)
35 |
36 | [-2, 1, Conv, [256, 1, 1]],
37 | [-1, 1, nn.Upsample, [None, 2, 'nearest']],
38 | [[-1, 8], 1, Concat, [1]], # cat backbone P4
39 | [-1, 1, Bottleneck, [512, False]],
40 | [-1, 1, Bottleneck, [512, False]],
41 | [-1, 1, Conv, [256, 1, 1]],
42 | [-1, 1, Conv, [512, 3, 1]], # 22 (P4/16-medium)
43 |
44 | [-2, 1, Conv, [128, 1, 1]],
45 | [-1, 1, nn.Upsample, [None, 2, 'nearest']],
46 | [[-1, 6], 1, Concat, [1]], # cat backbone P3
47 | [-1, 1, Bottleneck, [256, False]],
48 | [-1, 2, Bottleneck, [256, False]], # 27 (P3/8-small)
49 |
50 | [[27, 22, 15], 1, Detect, [nc, anchors]], # Detect(P3, P4, P5)
51 | ]
52 |
--------------------------------------------------------------------------------
/v3.0/yolov5/models/hub/yolov5-fpn.yaml:
--------------------------------------------------------------------------------
1 | # parameters
2 | nc: 80 # number of classes
3 | depth_multiple: 1.0 # model depth multiple
4 | width_multiple: 1.0 # layer channel multiple
5 |
6 | # anchors
7 | anchors:
8 | - [10,13, 16,30, 33,23] # P3/8
9 | - [30,61, 62,45, 59,119] # P4/16
10 | - [116,90, 156,198, 373,326] # P5/32
11 |
12 | # YOLOv5 backbone
13 | backbone:
14 | # [from, number, module, args]
15 | [[-1, 1, Focus, [64, 3]], # 0-P1/2
16 | [-1, 1, Conv, [128, 3, 2]], # 1-P2/4
17 | [-1, 3, Bottleneck, [128]],
18 | [-1, 1, Conv, [256, 3, 2]], # 3-P3/8
19 | [-1, 9, BottleneckCSP, [256]],
20 | [-1, 1, Conv, [512, 3, 2]], # 5-P4/16
21 | [-1, 9, BottleneckCSP, [512]],
22 | [-1, 1, Conv, [1024, 3, 2]], # 7-P5/32
23 | [-1, 1, SPP, [1024, [5, 9, 13]]],
24 | [-1, 6, BottleneckCSP, [1024]], # 9
25 | ]
26 |
27 | # YOLOv5 FPN head
28 | head:
29 | [[-1, 3, BottleneckCSP, [1024, False]], # 10 (P5/32-large)
30 |
31 | [-1, 1, nn.Upsample, [None, 2, 'nearest']],
32 | [[-1, 6], 1, Concat, [1]], # cat backbone P4
33 | [-1, 1, Conv, [512, 1, 1]],
34 | [-1, 3, BottleneckCSP, [512, False]], # 14 (P4/16-medium)
35 |
36 | [-1, 1, nn.Upsample, [None, 2, 'nearest']],
37 | [[-1, 4], 1, Concat, [1]], # cat backbone P3
38 | [-1, 1, Conv, [256, 1, 1]],
39 | [-1, 3, BottleneckCSP, [256, False]], # 18 (P3/8-small)
40 |
41 | [[18, 14, 10], 1, Detect, [nc, anchors]], # Detect(P3, P4, P5)
42 | ]
43 |
--------------------------------------------------------------------------------
/v3.0/yolov5/models/hub/yolov5-panet.yaml:
--------------------------------------------------------------------------------
1 | # parameters
2 | nc: 80 # number of classes
3 | depth_multiple: 1.0 # model depth multiple
4 | width_multiple: 1.0 # layer channel multiple
5 |
6 | # anchors
7 | anchors:
8 | - [116,90, 156,198, 373,326] # P5/32
9 | - [30,61, 62,45, 59,119] # P4/16
10 | - [10,13, 16,30, 33,23] # P3/8
11 |
12 | # YOLOv5 backbone
13 | backbone:
14 | # [from, number, module, args]
15 | [[-1, 1, Focus, [64, 3]], # 0-P1/2
16 | [-1, 1, Conv, [128, 3, 2]], # 1-P2/4
17 | [-1, 3, BottleneckCSP, [128]],
18 | [-1, 1, Conv, [256, 3, 2]], # 3-P3/8
19 | [-1, 9, BottleneckCSP, [256]],
20 | [-1, 1, Conv, [512, 3, 2]], # 5-P4/16
21 | [-1, 9, BottleneckCSP, [512]],
22 | [-1, 1, Conv, [1024, 3, 2]], # 7-P5/32
23 | [-1, 1, SPP, [1024, [5, 9, 13]]],
24 | [-1, 3, BottleneckCSP, [1024, False]], # 9
25 | ]
26 |
27 | # YOLOv5 PANet head
28 | head:
29 | [[-1, 1, Conv, [512, 1, 1]],
30 | [-1, 1, nn.Upsample, [None, 2, 'nearest']],
31 | [[-1, 6], 1, Concat, [1]], # cat backbone P4
32 | [-1, 3, BottleneckCSP, [512, False]], # 13
33 |
34 | [-1, 1, Conv, [256, 1, 1]],
35 | [-1, 1, nn.Upsample, [None, 2, 'nearest']],
36 | [[-1, 4], 1, Concat, [1]], # cat backbone P3
37 | [-1, 3, BottleneckCSP, [256, False]], # 17 (P3/8-small)
38 |
39 | [-1, 1, Conv, [256, 3, 2]],
40 | [[-1, 14], 1, Concat, [1]], # cat head P4
41 | [-1, 3, BottleneckCSP, [512, False]], # 20 (P4/16-medium)
42 |
43 | [-1, 1, Conv, [512, 3, 2]],
44 | [[-1, 10], 1, Concat, [1]], # cat head P5
45 | [-1, 3, BottleneckCSP, [1024, False]], # 23 (P5/32-large)
46 |
47 | [[17, 20, 23], 1, Detect, [nc, anchors]], # Detect(P5, P4, P3)
48 | ]
49 |
--------------------------------------------------------------------------------
/v3.0/yolov5/models/yolo.py:
--------------------------------------------------------------------------------
1 | import argparse
2 | import logging
3 | import sys
4 | from copy import deepcopy
5 | from pathlib import Path
6 |
7 | import math
8 |
9 | sys.path.append('./') # to run '$ python *.py' files in subdirectories
10 | logger = logging.getLogger(__name__)
11 |
12 | import torch
13 | import torch.nn as nn
14 |
15 | from models.common import Conv, Bottleneck, SPP, DWConv, Focus, BottleneckCSP, Concat, NMS, autoShape
16 | from models.experimental import MixConv2d, CrossConv, C3
17 | from utils.general import check_anchor_order, make_divisible, check_file, set_logging
18 | from utils.torch_utils import time_synchronized, fuse_conv_and_bn, model_info, scale_img, initialize_weights, \
19 | select_device, copy_attr
20 |
21 |
22 | class Detect(nn.Module):
23 | stride = None # strides computed during build
24 | export = False # onnx export
25 |
26 | def __init__(self, nc=80, anchors=(), ch=()): # detection layer
27 | super(Detect, self).__init__()
28 | self.nc = nc # number of classes
29 | self.no = nc + 5 # number of outputs per anchor
30 | self.nl = len(anchors) # number of detection layers
31 | self.na = len(anchors[0]) // 2 # number of anchors
32 | self.grid = [torch.zeros(1)] * self.nl # init grid
33 | a = torch.tensor(anchors).float().view(self.nl, -1, 2)
34 | self.register_buffer('anchors', a) # shape(nl,na,2)
35 | self.register_buffer('anchor_grid', a.clone().view(self.nl, 1, -1, 1, 1, 2)) # shape(nl,1,na,1,1,2)
36 | self.m = nn.ModuleList(nn.Conv2d(x, self.no * self.na, 1) for x in ch) # output conv
37 |
38 | def forward(self, x):
39 | # x = x.copy() # for profiling
40 | z = [] # inference output
41 | self.training |= self.export
42 | for i in range(self.nl):
43 | x[i] = self.m[i](x[i]) # conv
44 | bs, _, ny, nx = x[i].shape # x(bs,255,20,20) to x(bs,3,20,20,85)
45 | x[i] = x[i].view(bs, self.na, self.no, ny, nx).permute(0, 1, 3, 4, 2).contiguous()
46 |
47 | if not self.training: # inference
48 | if self.grid[i].shape[2:4] != x[i].shape[2:4]:
49 | self.grid[i] = self._make_grid(nx, ny).to(x[i].device)
50 |
51 | y = x[i].sigmoid()
52 | y[..., 0:2] = (y[..., 0:2] * 2. - 0.5 + self.grid[i].to(x[i].device)) * self.stride[i] # xy
53 | y[..., 2:4] = (y[..., 2:4] * 2) ** 2 * self.anchor_grid[i] # wh
54 | z.append(y.view(bs, -1, self.no))
55 |
56 | return x if self.training else (torch.cat(z, 1), x)
57 |
58 | @staticmethod
59 | def _make_grid(nx=20, ny=20):
60 | yv, xv = torch.meshgrid([torch.arange(ny), torch.arange(nx)])
61 | return torch.stack((xv, yv), 2).view((1, 1, ny, nx, 2)).float()
62 |
63 |
64 | class Model(nn.Module):
65 | def __init__(self, cfg='yolov5s.yaml', ch=3, nc=None): # model, input channels, number of classes
66 | super(Model, self).__init__()
67 | if isinstance(cfg, dict):
68 | self.yaml = cfg # model dict
69 | else: # is *.yaml
70 | import yaml # for torch hub
71 | self.yaml_file = Path(cfg).name
72 | with open(cfg) as f:
73 | self.yaml = yaml.load(f, Loader=yaml.FullLoader) # model dict
74 |
75 | # Define model
76 | if nc and nc != self.yaml['nc']:
77 | print('Overriding model.yaml nc=%g with nc=%g' % (self.yaml['nc'], nc))
78 | self.yaml['nc'] = nc # override yaml value
79 | self.model, self.save = parse_model(deepcopy(self.yaml), ch=[ch]) # model, savelist, ch_out
80 | # print([x.shape for x in self.forward(torch.zeros(1, ch, 64, 64))])
81 |
82 | # Build strides, anchors
83 | m = self.model[-1] # Detect()
84 | if isinstance(m, Detect):
85 | s = 128 # 2x min stride
86 | m.stride = torch.tensor([s / x.shape[-2] for x in self.forward(torch.zeros(1, ch, s, s))]) # forward
87 | m.anchors /= m.stride.view(-1, 1, 1)
88 | check_anchor_order(m)
89 | self.stride = m.stride
90 | self._initialize_biases() # only run once
91 | # print('Strides: %s' % m.stride.tolist())
92 |
93 | # Init weights, biases
94 | initialize_weights(self)
95 | self.info()
96 | print('')
97 |
98 | def forward(self, x, augment=False, profile=False):
99 | if augment:
100 | img_size = x.shape[-2:] # height, width
101 | s = [1, 0.83, 0.67] # scales
102 | f = [None, 3, None] # flips (2-ud, 3-lr)
103 | y = [] # outputs
104 | for si, fi in zip(s, f):
105 | xi = scale_img(x.flip(fi) if fi else x, si)
106 | yi = self.forward_once(xi)[0] # forward
107 | # cv2.imwrite('img%g.jpg' % s, 255 * xi[0].numpy().transpose((1, 2, 0))[:, :, ::-1]) # save
108 | yi[..., :4] /= si # de-scale
109 | if fi == 2:
110 | yi[..., 1] = img_size[0] - yi[..., 1] # de-flip ud
111 | elif fi == 3:
112 | yi[..., 0] = img_size[1] - yi[..., 0] # de-flip lr
113 | y.append(yi)
114 | return torch.cat(y, 1), None # augmented inference, train
115 | else:
116 | return self.forward_once(x, profile) # single-scale inference, train
117 |
118 | def forward_once(self, x, profile=False):
119 | y, dt = [], [] # outputs
120 | for m in self.model:
121 | if m.f != -1: # if not from previous layer
122 | x = y[m.f] if isinstance(m.f, int) else [x if j == -1 else y[j] for j in m.f] # from earlier layers
123 |
124 | if profile:
125 | try:
126 | import thop
127 | o = thop.profile(m, inputs=(x,), verbose=False)[0] / 1E9 * 2 # FLOPS
128 | except:
129 | o = 0
130 | t = time_synchronized()
131 | for _ in range(10):
132 | _ = m(x)
133 | dt.append((time_synchronized() - t) * 100)
134 | print('%10.1f%10.0f%10.1fms %-40s' % (o, m.np, dt[-1], m.type))
135 |
136 | x = m(x) # run
137 | y.append(x if m.i in self.save else None) # save output
138 |
139 | if profile:
140 | print('%.1fms total' % sum(dt))
141 | return x
142 |
143 | def _initialize_biases(self, cf=None): # initialize biases into Detect(), cf is class frequency
144 | # https://arxiv.org/abs/1708.02002 section 3.3
145 | # cf = torch.bincount(torch.tensor(np.concatenate(dataset.labels, 0)[:, 0]).long(), minlength=nc) + 1.
146 | m = self.model[-1] # Detect() module
147 | for mi, s in zip(m.m, m.stride): # from
148 | b = mi.bias.view(m.na, -1) # conv.bias(255) to (3,85)
149 | b[:, 4] += math.log(8 / (640 / s) ** 2) # obj (8 objects per 640 image)
150 | b[:, 5:] += math.log(0.6 / (m.nc - 0.99)) if cf is None else torch.log(cf / cf.sum()) # cls
151 | mi.bias = torch.nn.Parameter(b.view(-1), requires_grad=True)
152 |
153 | def _print_biases(self):
154 | m = self.model[-1] # Detect() module
155 | for mi in m.m: # from
156 | b = mi.bias.detach().view(m.na, -1).T # conv.bias(255) to (3,85)
157 | print(('%6g Conv2d.bias:' + '%10.3g' * 6) % (mi.weight.shape[1], *b[:5].mean(1).tolist(), b[5:].mean()))
158 |
159 | # def _print_weights(self):
160 | # for m in self.model.modules():
161 | # if type(m) is Bottleneck:
162 | # print('%10.3g' % (m.w.detach().sigmoid() * 2)) # shortcut weights
163 |
164 | def fuse(self): # fuse model Conv2d() + BatchNorm2d() layers
165 | print('Fusing layers... ')
166 | for m in self.model.modules():
167 | if type(m) is Conv and hasattr(m, 'bn'):
168 | m.conv = fuse_conv_and_bn(m.conv, m.bn) # update conv
169 | delattr(m, 'bn') # remove batchnorm
170 | m.forward = m.fuseforward # update forward
171 | self.info()
172 | return self
173 |
174 | def nms(self, mode=True): # add or remove NMS module
175 | present = type(self.model[-1]) is NMS # last layer is NMS
176 | if mode and not present:
177 | print('Adding NMS... ')
178 | m = NMS() # module
179 | m.f = -1 # from
180 | m.i = self.model[-1].i + 1 # index
181 | self.model.add_module(name='%s' % m.i, module=m) # add
182 | self.eval()
183 | elif not mode and present:
184 | print('Removing NMS... ')
185 | self.model = self.model[:-1] # remove
186 | return self
187 |
188 | def autoshape(self): # add autoShape module
189 | print('Adding autoShape... ')
190 | m = autoShape(self) # wrap model
191 | copy_attr(m, self, include=('yaml', 'nc', 'hyp', 'names', 'stride'), exclude=()) # copy attributes
192 | return m
193 |
194 | def info(self, verbose=False): # print model information
195 | model_info(self, verbose)
196 |
197 |
198 | def parse_model(d, ch): # model_dict, input_channels(3)
199 | logger.info('\n%3s%18s%3s%10s %-40s%-30s' % ('', 'from', 'n', 'params', 'module', 'arguments'))
200 | anchors, nc, gd, gw = d['anchors'], d['nc'], d['depth_multiple'], d['width_multiple']
201 | na = (len(anchors[0]) // 2) if isinstance(anchors, list) else anchors # number of anchors
202 | no = na * (nc + 5) # number of outputs = anchors * (classes + 5)
203 |
204 | layers, save, c2 = [], [], ch[-1] # layers, savelist, ch out
205 | for i, (f, n, m, args) in enumerate(d['backbone'] + d['head']): # from, number, module, args
206 | m = eval(m) if isinstance(m, str) else m # eval strings
207 | for j, a in enumerate(args):
208 | try:
209 | args[j] = eval(a) if isinstance(a, str) else a # eval strings
210 | except:
211 | pass
212 |
213 | n = max(round(n * gd), 1) if n > 1 else n # depth gain
214 | if m in [Conv, Bottleneck, SPP, DWConv, MixConv2d, Focus, CrossConv, BottleneckCSP, C3]:
215 | c1, c2 = ch[f], args[0]
216 |
217 | # Normal
218 | # if i > 0 and args[0] != no: # channel expansion factor
219 | # ex = 1.75 # exponential (default 2.0)
220 | # e = math.log(c2 / ch[1]) / math.log(2)
221 | # c2 = int(ch[1] * ex ** e)
222 | # if m != Focus:
223 |
224 | c2 = make_divisible(c2 * gw, 8) if c2 != no else c2
225 |
226 | # Experimental
227 | # if i > 0 and args[0] != no: # channel expansion factor
228 | # ex = 1 + gw # exponential (default 2.0)
229 | # ch1 = 32 # ch[1]
230 | # e = math.log(c2 / ch1) / math.log(2) # level 1-n
231 | # c2 = int(ch1 * ex ** e)
232 | # if m != Focus:
233 | # c2 = make_divisible(c2, 8) if c2 != no else c2
234 |
235 | args = [c1, c2, *args[1:]]
236 | if m in [BottleneckCSP, C3]:
237 | args.insert(2, n)
238 | n = 1
239 | elif m is nn.BatchNorm2d:
240 | args = [ch[f]]
241 | elif m is Concat:
242 | c2 = sum([ch[-1 if x == -1 else x + 1] for x in f])
243 | elif m is Detect:
244 | args.append([ch[x + 1] for x in f])
245 | if isinstance(args[1], int): # number of anchors
246 | args[1] = [list(range(args[1] * 2))] * len(f)
247 | else:
248 | c2 = ch[f]
249 |
250 | m_ = nn.Sequential(*[m(*args) for _ in range(n)]) if n > 1 else m(*args) # module
251 | t = str(m)[8:-2].replace('__main__.', '') # module type
252 | np = sum([x.numel() for x in m_.parameters()]) # number params
253 | m_.i, m_.f, m_.type, m_.np = i, f, t, np # attach index, 'from' index, type, number params
254 | logger.info('%3s%18s%3s%10.0f %-40s%-30s' % (i, f, n, np, t, args)) # print
255 | save.extend(x % i for x in ([f] if isinstance(f, int) else f) if x != -1) # append to savelist
256 | layers.append(m_)
257 | ch.append(c2)
258 | return nn.Sequential(*layers), sorted(save)
259 |
260 |
261 | if __name__ == '__main__':
262 | parser = argparse.ArgumentParser()
263 | parser.add_argument('--cfg', type=str, default='yolov5s.yaml', help='model.yaml')
264 | parser.add_argument('--device', default='', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')
265 | opt = parser.parse_args()
266 | opt.cfg = check_file(opt.cfg) # check file
267 | set_logging()
268 | device = select_device(opt.device)
269 |
270 | # Create model
271 | model = Model(opt.cfg).to(device)
272 | model.train()
273 |
274 | # Profile
275 | # img = torch.rand(8 if torch.cuda.is_available() else 1, 3, 640, 640).to(device)
276 | # y = model(img, profile=True)
277 |
278 | # Tensorboard
279 | # from torch.utils.tensorboard import SummaryWriter
280 | # tb_writer = SummaryWriter()
281 | # print("Run 'tensorboard --logdir=models/runs' to view tensorboard at http://localhost:6006/")
282 | # tb_writer.add_graph(model.model, img) # add model to tensorboard
283 | # tb_writer.add_image('test', img[0], dataformats='CWH') # add model to tensorboard
284 |
--------------------------------------------------------------------------------
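As a quick sanity check of the Model class above, the following sketch mirrors the __main__ block: it builds a model from one of the yaml configs, overrides nc (for example 4, for the gestures 1, 2, 5 and forefinger), and runs a dummy image through inference. It assumes the yolov5 repo root is the working directory, so models/ and utils/ are importable and models/yolov5s.yaml is present.

```python
# Minimal usage sketch for models/yolo.py (run from the yolov5 repo root).
import torch
from models.yolo import Model

model = Model('models/yolov5s.yaml', ch=3, nc=4)  # e.g. nc=4: gestures 1, 2, 5, forefinger
model.eval()                                      # Detect() switches to inference decoding

img = torch.zeros(1, 3, 640, 640)                 # dummy 640x640 RGB batch of one
with torch.no_grad():
    pred, _ = model(img)                          # pred: (1, num_predictions, nc + 5)

print(model.stride)                               # tensor([ 8., 16., 32.])
print(pred.shape)
```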
/v3.0/yolov5/models/yolov5l.yaml:
--------------------------------------------------------------------------------
1 | # parameters
2 | nc: 80 # number of classes
3 | depth_multiple: 1.0 # model depth multiple
4 | width_multiple: 1.0 # layer channel multiple
5 |
6 | # anchors
7 | anchors:
8 | - [10,13, 16,30, 33,23] # P3/8
9 | - [30,61, 62,45, 59,119] # P4/16
10 | - [116,90, 156,198, 373,326] # P5/32
11 |
12 | # YOLOv5 backbone
13 | backbone:
14 | # [from, number, module, args]
15 | [[-1, 1, Focus, [64, 3]], # 0-P1/2
16 | [-1, 1, Conv, [128, 3, 2]], # 1-P2/4
17 | [-1, 3, BottleneckCSP, [128]],
18 | [-1, 1, Conv, [256, 3, 2]], # 3-P3/8
19 | [-1, 9, BottleneckCSP, [256]],
20 | [-1, 1, Conv, [512, 3, 2]], # 5-P4/16
21 | [-1, 9, BottleneckCSP, [512]],
22 | [-1, 1, Conv, [1024, 3, 2]], # 7-P5/32
23 | [-1, 1, SPP, [1024, [5, 9, 13]]],
24 | [-1, 3, BottleneckCSP, [1024, False]], # 9
25 | ]
26 |
27 | # YOLOv5 head
28 | head:
29 | [[-1, 1, Conv, [512, 1, 1]],
30 | [-1, 1, nn.Upsample, [None, 2, 'nearest']],
31 | [[-1, 6], 1, Concat, [1]], # cat backbone P4
32 | [-1, 3, BottleneckCSP, [512, False]], # 13
33 |
34 | [-1, 1, Conv, [256, 1, 1]],
35 | [-1, 1, nn.Upsample, [None, 2, 'nearest']],
36 | [[-1, 4], 1, Concat, [1]], # cat backbone P3
37 | [-1, 3, BottleneckCSP, [256, False]], # 17 (P3/8-small)
38 |
39 | [-1, 1, Conv, [256, 3, 2]],
40 | [[-1, 14], 1, Concat, [1]], # cat head P4
41 | [-1, 3, BottleneckCSP, [512, False]], # 20 (P4/16-medium)
42 |
43 | [-1, 1, Conv, [512, 3, 2]],
44 | [[-1, 10], 1, Concat, [1]], # cat head P5
45 | [-1, 3, BottleneckCSP, [1024, False]], # 23 (P5/32-large)
46 |
47 | [[17, 20, 23], 1, Detect, [nc, anchors]], # Detect(P3, P4, P5)
48 | ]
49 |
--------------------------------------------------------------------------------
/v3.0/yolov5/models/yolov5m.yaml:
--------------------------------------------------------------------------------
1 | # parameters
2 | nc: 80 # number of classes
3 | depth_multiple: 0.67 # model depth multiple
4 | width_multiple: 0.75 # layer channel multiple
5 |
6 | # anchors
7 | anchors:
8 | - [10,13, 16,30, 33,23] # P3/8
9 | - [30,61, 62,45, 59,119] # P4/16
10 | - [116,90, 156,198, 373,326] # P5/32
11 |
12 | # YOLOv5 backbone
13 | backbone:
14 | # [from, number, module, args]
15 | [[-1, 1, Focus, [64, 3]], # 0-P1/2
16 | [-1, 1, Conv, [128, 3, 2]], # 1-P2/4
17 | [-1, 3, BottleneckCSP, [128]],
18 | [-1, 1, Conv, [256, 3, 2]], # 3-P3/8
19 | [-1, 9, BottleneckCSP, [256]],
20 | [-1, 1, Conv, [512, 3, 2]], # 5-P4/16
21 | [-1, 9, BottleneckCSP, [512]],
22 | [-1, 1, Conv, [1024, 3, 2]], # 7-P5/32
23 | [-1, 1, SPP, [1024, [5, 9, 13]]],
24 | [-1, 3, BottleneckCSP, [1024, False]], # 9
25 | ]
26 |
27 | # YOLOv5 head
28 | head:
29 | [[-1, 1, Conv, [512, 1, 1]],
30 | [-1, 1, nn.Upsample, [None, 2, 'nearest']],
31 | [[-1, 6], 1, Concat, [1]], # cat backbone P4
32 | [-1, 3, BottleneckCSP, [512, False]], # 13
33 |
34 | [-1, 1, Conv, [256, 1, 1]],
35 | [-1, 1, nn.Upsample, [None, 2, 'nearest']],
36 | [[-1, 4], 1, Concat, [1]], # cat backbone P3
37 | [-1, 3, BottleneckCSP, [256, False]], # 17 (P3/8-small)
38 |
39 | [-1, 1, Conv, [256, 3, 2]],
40 | [[-1, 14], 1, Concat, [1]], # cat head P4
41 | [-1, 3, BottleneckCSP, [512, False]], # 20 (P4/16-medium)
42 |
43 | [-1, 1, Conv, [512, 3, 2]],
44 | [[-1, 10], 1, Concat, [1]], # cat head P5
45 | [-1, 3, BottleneckCSP, [1024, False]], # 23 (P5/32-large)
46 |
47 | [[17, 20, 23], 1, Detect, [nc, anchors]], # Detect(P3, P4, P5)
48 | ]
49 |
--------------------------------------------------------------------------------
/v3.0/yolov5/models/yolov5s.yaml:
--------------------------------------------------------------------------------
1 | # parameters
2 | nc: 80 # number of classes
3 | depth_multiple: 0.33 # model depth multiple
4 | width_multiple: 0.50 # layer channel multiple
5 |
6 | # anchors
7 | anchors:
8 | - [10,13, 16,30, 33,23] # P3/8
9 | - [30,61, 62,45, 59,119] # P4/16
10 | - [116,90, 156,198, 373,326] # P5/32
11 |
12 | # YOLOv5 backbone
13 | backbone:
14 | # [from, number, module, args]
15 | [[-1, 1, Focus, [64, 3]], # 0-P1/2
16 | [-1, 1, Conv, [128, 3, 2]], # 1-P2/4
17 | [-1, 3, BottleneckCSP, [128]],
18 | [-1, 1, Conv, [256, 3, 2]], # 3-P3/8
19 | [-1, 9, BottleneckCSP, [256]],
20 | [-1, 1, Conv, [512, 3, 2]], # 5-P4/16
21 | [-1, 9, BottleneckCSP, [512]],
22 | [-1, 1, Conv, [1024, 3, 2]], # 7-P5/32
23 | [-1, 1, SPP, [1024, [5, 9, 13]]],
24 | [-1, 3, BottleneckCSP, [1024, False]], # 9
25 | ]
26 |
27 | # YOLOv5 head
28 | head:
29 | [[-1, 1, Conv, [512, 1, 1]],
30 | [-1, 1, nn.Upsample, [None, 2, 'nearest']],
31 | [[-1, 6], 1, Concat, [1]], # cat backbone P4
32 | [-1, 3, BottleneckCSP, [512, False]], # 13
33 |
34 | [-1, 1, Conv, [256, 1, 1]],
35 | [-1, 1, nn.Upsample, [None, 2, 'nearest']],
36 | [[-1, 4], 1, Concat, [1]], # cat backbone P3
37 | [-1, 3, BottleneckCSP, [256, False]], # 17 (P3/8-small)
38 |
39 | [-1, 1, Conv, [256, 3, 2]],
40 | [[-1, 14], 1, Concat, [1]], # cat head P4
41 | [-1, 3, BottleneckCSP, [512, False]], # 20 (P4/16-medium)
42 |
43 | [-1, 1, Conv, [512, 3, 2]],
44 | [[-1, 10], 1, Concat, [1]], # cat head P5
45 | [-1, 3, BottleneckCSP, [1024, False]], # 23 (P5/32-large)
46 |
47 | [[17, 20, 23], 1, Detect, [nc, anchors]], # Detect(P3, P4, P5)
48 | ]
49 |
--------------------------------------------------------------------------------
/v3.0/yolov5/models/yolov5x.yaml:
--------------------------------------------------------------------------------
1 | # parameters
2 | nc: 80 # number of classes
3 | depth_multiple: 1.33 # model depth multiple
4 | width_multiple: 1.25 # layer channel multiple
5 |
6 | # anchors
7 | anchors:
8 | - [10,13, 16,30, 33,23] # P3/8
9 | - [30,61, 62,45, 59,119] # P4/16
10 | - [116,90, 156,198, 373,326] # P5/32
11 |
12 | # YOLOv5 backbone
13 | backbone:
14 | # [from, number, module, args]
15 | [[-1, 1, Focus, [64, 3]], # 0-P1/2
16 | [-1, 1, Conv, [128, 3, 2]], # 1-P2/4
17 | [-1, 3, BottleneckCSP, [128]],
18 | [-1, 1, Conv, [256, 3, 2]], # 3-P3/8
19 | [-1, 9, BottleneckCSP, [256]],
20 | [-1, 1, Conv, [512, 3, 2]], # 5-P4/16
21 | [-1, 9, BottleneckCSP, [512]],
22 | [-1, 1, Conv, [1024, 3, 2]], # 7-P5/32
23 | [-1, 1, SPP, [1024, [5, 9, 13]]],
24 | [-1, 3, BottleneckCSP, [1024, False]], # 9
25 | ]
26 |
27 | # YOLOv5 head
28 | head:
29 | [[-1, 1, Conv, [512, 1, 1]],
30 | [-1, 1, nn.Upsample, [None, 2, 'nearest']],
31 | [[-1, 6], 1, Concat, [1]], # cat backbone P4
32 | [-1, 3, BottleneckCSP, [512, False]], # 13
33 |
34 | [-1, 1, Conv, [256, 1, 1]],
35 | [-1, 1, nn.Upsample, [None, 2, 'nearest']],
36 | [[-1, 4], 1, Concat, [1]], # cat backbone P3
37 | [-1, 3, BottleneckCSP, [256, False]], # 17 (P3/8-small)
38 |
39 | [-1, 1, Conv, [256, 3, 2]],
40 | [[-1, 14], 1, Concat, [1]], # cat head P4
41 | [-1, 3, BottleneckCSP, [512, False]], # 20 (P4/16-medium)
42 |
43 | [-1, 1, Conv, [512, 3, 2]],
44 | [[-1, 10], 1, Concat, [1]], # cat head P5
45 | [-1, 3, BottleneckCSP, [1024, False]], # 23 (P5/32-large)
46 |
47 | [[17, 20, 23], 1, Detect, [nc, anchors]], # Detect(P3, P4, P5)
48 | ]
49 |
--------------------------------------------------------------------------------
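The four variant yamls above (s/m/l/x) share an identical layer list; only depth_multiple and width_multiple differ. The sketch below restates the scaling rule parse_model applies, using the same round(n * gd) and make_divisible(c2 * gw, 8) expressions visible in models/yolo.py; make_divisible is re-implemented here only to keep the illustration self-contained.

```python
# How one backbone stage, [-1, 9, BottleneckCSP, [512]], scales across the variants.
import math

def make_divisible(x, divisor=8):                  # same rule as utils.general.make_divisible
    return math.ceil(x / divisor) * divisor

def scale(n_repeats, c_out, gd, gw):
    n = max(round(n_repeats * gd), 1) if n_repeats > 1 else n_repeats  # depth gain
    c = make_divisible(c_out * gw, 8)                                  # width gain
    return n, c

for name, gd, gw in [('s', 0.33, 0.50), ('m', 0.67, 0.75), ('l', 1.00, 1.00), ('x', 1.33, 1.25)]:
    print(name, scale(9, 512, gd, gw))  # s:(3, 256)  m:(6, 384)  l:(9, 512)  x:(12, 640)
```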
/v3.0/yolov5/requirements.txt:
--------------------------------------------------------------------------------
1 | # pip install -r requirements.txt
2 |
3 | # base ----------------------------------------
4 | Cython
5 | matplotlib>=3.2.2
6 | numpy>=1.18.5
7 | opencv-python>=4.1.2
8 | pillow
9 | PyYAML>=5.3
10 | scipy>=1.4.1
11 | tensorboard>=2.2
12 | torch>=1.6.0
13 | torchvision>=0.7.0
14 | tqdm>=4.41.0
15 |
16 | # logging -------------------------------------
17 | # wandb
18 |
19 | # coco ----------------------------------------
20 | # pycocotools>=2.0
21 |
22 | # export --------------------------------------
23 | # packaging # for coremltools
24 | # coremltools==4.0
25 | # onnx>=1.7.0
26 | # scikit-learn==0.19.2 # for coreml quantization
27 |
28 | # extras --------------------------------------
29 | # thop # FLOPS computation
30 | # seaborn # plotting
31 |
--------------------------------------------------------------------------------
/v3.0/yolov5/sotabench.py:
--------------------------------------------------------------------------------
1 | import argparse
2 | import glob
3 | import os
4 | import shutil
5 | from pathlib import Path
6 |
7 | import numpy as np
8 | import torch
9 | import yaml
10 | from sotabencheval.object_detection import COCOEvaluator
11 | from sotabencheval.utils import is_server
12 | from tqdm import tqdm
13 |
14 | from models.experimental import attempt_load
15 | from utils.datasets import create_dataloader
16 | from utils.general import (
17 | coco80_to_coco91_class, check_dataset, check_file, check_img_size, compute_loss, non_max_suppression, scale_coords,
18 | xyxy2xywh, clip_coords, set_logging)
19 | from utils.torch_utils import select_device, time_synchronized
20 |
21 | DATA_ROOT = './.data/vision/coco' if is_server() else '../coco' # sotabench data dir
22 |
23 |
24 | def test(data,
25 | weights=None,
26 | batch_size=16,
27 | imgsz=640,
28 | conf_thres=0.001,
29 | iou_thres=0.6, # for NMS
30 | save_json=False,
31 | single_cls=False,
32 | augment=False,
33 | verbose=False,
34 | model=None,
35 | dataloader=None,
36 | save_dir='',
37 | merge=False,
38 | save_txt=False):
39 | # Initialize/load model and set device
40 | training = model is not None
41 | if training: # called by train.py
42 | device = next(model.parameters()).device # get model device
43 |
44 | else: # called directly
45 | set_logging()
46 | device = select_device(opt.device, batch_size=batch_size)
47 | merge, save_txt = opt.merge, opt.save_txt # use Merge NMS, save *.txt labels
48 | if save_txt:
49 | out = Path('inference/output')
50 | if os.path.exists(out):
51 | shutil.rmtree(out) # delete output folder
52 | os.makedirs(out) # make new output folder
53 |
54 | # Remove previous
55 | for f in glob.glob(str(Path(save_dir) / 'test_batch*.jpg')):
56 | os.remove(f)
57 |
58 | # Load model
59 | model = attempt_load(weights, map_location=device) # load FP32 model
60 | imgsz = check_img_size(imgsz, s=model.stride.max()) # check img_size
61 |
62 | # Multi-GPU disabled, incompatible with .half() https://github.com/ultralytics/yolov5/issues/99
63 | # if device.type != 'cpu' and torch.cuda.device_count() > 1:
64 | # model = nn.DataParallel(model)
65 |
66 | # Half
67 | half = device.type != 'cpu' # half precision only supported on CUDA
68 | if half:
69 | model.half()
70 |
71 | # Configure
72 | model.eval()
73 | with open(data) as f:
74 | data = yaml.load(f, Loader=yaml.FullLoader) # model dict
75 | check_dataset(data) # check
76 | nc = 1 if single_cls else int(data['nc']) # number of classes
77 | iouv = torch.linspace(0.5, 0.95, 10).to(device) # iou vector for mAP@0.5:0.95
78 | niou = iouv.numel()
79 |
80 | # Dataloader
81 | if not training:
82 | img = torch.zeros((1, 3, imgsz, imgsz), device=device) # init img
83 | _ = model(img.half() if half else img) if device.type != 'cpu' else None # run once
84 | path = data['test'] if opt.task == 'test' else data['val'] # path to val/test images
85 | dataloader = create_dataloader(path, imgsz, batch_size, model.stride.max(), opt,
86 | hyp=None, augment=False, cache=True, pad=0.5, rect=True)[0]
87 |
88 | seen = 0
89 | names = model.names if hasattr(model, 'names') else model.module.names
90 | coco91class = coco80_to_coco91_class()
91 | s = ('%20s' + '%12s' * 6) % ('Class', 'Images', 'Targets', 'P', 'R', 'mAP@.5', 'mAP@.5:.95')
92 | p, r, f1, mp, mr, map50, map, t0, t1 = 0., 0., 0., 0., 0., 0., 0., 0., 0.
93 | loss = torch.zeros(3, device=device)
94 | jdict, stats, ap, ap_class = [], [], [], []
95 | evaluator = COCOEvaluator(root=DATA_ROOT, model_name=opt.weights.replace('.pt', ''))
96 | for batch_i, (img, targets, paths, shapes) in enumerate(tqdm(dataloader, desc=s)):
97 | img = img.to(device, non_blocking=True)
98 | img = img.half() if half else img.float() # uint8 to fp16/32
99 | img /= 255.0 # 0 - 255 to 0.0 - 1.0
100 | targets = targets.to(device)
101 | nb, _, height, width = img.shape # batch size, channels, height, width
102 | whwh = torch.Tensor([width, height, width, height]).to(device)
103 |
104 | # Disable gradients
105 | with torch.no_grad():
106 | # Run model
107 | t = time_synchronized()
108 | inf_out, train_out = model(img, augment=augment) # inference and training outputs
109 | t0 += time_synchronized() - t
110 |
111 | # Compute loss
112 | if training: # if model has loss hyperparameters
113 | loss += compute_loss([x.float() for x in train_out], targets, model)[1][:3] # box, obj, cls
114 |
115 | # Run NMS
116 | t = time_synchronized()
117 | output = non_max_suppression(inf_out, conf_thres=conf_thres, iou_thres=iou_thres, merge=merge)
118 | t1 += time_synchronized() - t
119 |
120 | # Statistics per image
121 | for si, pred in enumerate(output):
122 | labels = targets[targets[:, 0] == si, 1:]
123 | nl = len(labels)
124 | tcls = labels[:, 0].tolist() if nl else [] # target class
125 | seen += 1
126 |
127 | if pred is None:
128 | if nl:
129 | stats.append((torch.zeros(0, niou, dtype=torch.bool), torch.Tensor(), torch.Tensor(), tcls))
130 | continue
131 |
132 | # Append to text file
133 | if save_txt:
134 | gn = torch.tensor(shapes[si][0])[[1, 0, 1, 0]] # normalization gain whwh
135 | x = pred.clone()
136 | x[:, :4] = scale_coords(img[si].shape[1:], x[:, :4], shapes[si][0], shapes[si][1]) # to original
137 | for *xyxy, conf, cls in x:
138 | xywh = (xyxy2xywh(torch.tensor(xyxy).view(1, 4)) / gn).view(-1).tolist() # normalized xywh
139 | with open(str(out / Path(paths[si]).stem) + '.txt', 'a') as f:
140 | f.write(('%g ' * 5 + '\n') % (cls, *xywh)) # label format
141 |
142 | # Clip boxes to image bounds
143 | clip_coords(pred, (height, width))
144 |
145 | # Append to pycocotools JSON dictionary
146 | if save_json:
147 | # [{"image_id": 42, "category_id": 18, "bbox": [258.15, 41.29, 348.26, 243.78], "score": 0.236}, ...
148 | image_id = Path(paths[si]).stem
149 | box = pred[:, :4].clone() # xyxy
150 | scale_coords(img[si].shape[1:], box, shapes[si][0], shapes[si][1]) # to original shape
151 | box = xyxy2xywh(box) # xywh
152 | box[:, :2] -= box[:, 2:] / 2 # xy center to top-left corner
153 | for p, b in zip(pred.tolist(), box.tolist()):
154 | result = {'image_id': int(image_id) if image_id.isnumeric() else image_id,
155 | 'category_id': coco91class[int(p[5])],
156 | 'bbox': [round(x, 3) for x in b],
157 | 'score': round(p[4], 5)}
158 | jdict.append(result)
159 |
160 | #evaluator.add([result])
161 | #if evaluator.cache_exists:
162 | # break
163 |
164 | # # Assign all predictions as incorrect
165 | # correct = torch.zeros(pred.shape[0], niou, dtype=torch.bool, device=device)
166 | # if nl:
167 | # detected = [] # target indices
168 | # tcls_tensor = labels[:, 0]
169 | #
170 | # # target boxes
171 | # tbox = xywh2xyxy(labels[:, 1:5]) * whwh
172 | #
173 | # # Per target class
174 | # for cls in torch.unique(tcls_tensor):
175 | # ti = (cls == tcls_tensor).nonzero(as_tuple=False).view(-1) # prediction indices
176 | # pi = (cls == pred[:, 5]).nonzero(as_tuple=False).view(-1) # target indices
177 | #
178 | # # Search for detections
179 | # if pi.shape[0]:
180 | # # Prediction to target ious
181 | # ious, i = box_iou(pred[pi, :4], tbox[ti]).max(1) # best ious, indices
182 | #
183 | # # Append detections
184 | # detected_set = set()
185 | # for j in (ious > iouv[0]).nonzero(as_tuple=False):
186 | # d = ti[i[j]] # detected target
187 | # if d.item() not in detected_set:
188 | # detected_set.add(d.item())
189 | # detected.append(d)
190 | # correct[pi[j]] = ious[j] > iouv # iou_thres is 1xn
191 | # if len(detected) == nl: # all targets already located in image
192 | # break
193 | #
194 | # # Append statistics (correct, conf, pcls, tcls)
195 | # stats.append((correct.cpu(), pred[:, 4].cpu(), pred[:, 5].cpu(), tcls))
196 |
197 | # # Plot images
198 | # if batch_i < 1:
199 | # f = Path(save_dir) / ('test_batch%g_gt.jpg' % batch_i) # filename
200 | # plot_images(img, targets, paths, str(f), names) # ground truth
201 | # f = Path(save_dir) / ('test_batch%g_pred.jpg' % batch_i)
202 | # plot_images(img, output_to_target(output, width, height), paths, str(f), names) # predictions
203 |
204 | evaluator.add(jdict)
205 | evaluator.save()
206 |
207 | # # Compute statistics
208 | # stats = [np.concatenate(x, 0) for x in zip(*stats)] # to numpy
209 | # if len(stats) and stats[0].any():
210 | # p, r, ap, f1, ap_class = ap_per_class(*stats)
211 | # p, r, ap50, ap = p[:, 0], r[:, 0], ap[:, 0], ap.mean(1) # [P, R, AP@0.5, AP@0.5:0.95]
212 | # mp, mr, map50, map = p.mean(), r.mean(), ap50.mean(), ap.mean()
213 | # nt = np.bincount(stats[3].astype(np.int64), minlength=nc) # number of targets per class
214 | # else:
215 | # nt = torch.zeros(1)
216 | #
217 | # # Print results
218 | # pf = '%20s' + '%12.3g' * 6 # print format
219 | # print(pf % ('all', seen, nt.sum(), mp, mr, map50, map))
220 | #
221 | # # Print results per class
222 | # if verbose and nc > 1 and len(stats):
223 | # for i, c in enumerate(ap_class):
224 | # print(pf % (names[c], seen, nt[c], p[i], r[i], ap50[i], ap[i]))
225 | #
226 | # # Print speeds
227 | # t = tuple(x / seen * 1E3 for x in (t0, t1, t0 + t1)) + (imgsz, imgsz, batch_size) # tuple
228 | # if not training:
229 | # print('Speed: %.1f/%.1f/%.1f ms inference/NMS/total per %gx%g image at batch-size %g' % t)
230 | #
231 | # # Save JSON
232 | # if save_json and len(jdict):
233 | # f = 'detections_val2017_%s_results.json' % \
234 | # (weights.split(os.sep)[-1].replace('.pt', '') if isinstance(weights, str) else '') # filename
235 | # print('\nCOCO mAP with pycocotools... saving %s...' % f)
236 | # with open(f, 'w') as file:
237 | # json.dump(jdict, file)
238 | #
239 | # try: # https://github.com/cocodataset/cocoapi/blob/master/PythonAPI/pycocoEvalDemo.ipynb
240 | # from pycocotools.coco import COCO
241 | # from pycocotools.cocoeval import COCOeval
242 | #
243 | # imgIds = [int(Path(x).stem) for x in dataloader.dataset.img_files]
244 | # cocoGt = COCO(glob.glob('../coco/annotations/instances_val*.json')[0]) # initialize COCO ground truth api
245 | # cocoDt = cocoGt.loadRes(f) # initialize COCO pred api
246 | # cocoEval = COCOeval(cocoGt, cocoDt, 'bbox')
247 | # cocoEval.params.imgIds = imgIds # image IDs to evaluate
248 | # cocoEval.evaluate()
249 | # cocoEval.accumulate()
250 | # cocoEval.summarize()
251 | # map, map50 = cocoEval.stats[:2] # update results (mAP@0.5:0.95, mAP@0.5)
252 | # except Exception as e:
253 | # print('ERROR: pycocotools unable to run: %s' % e)
254 | #
255 | # # Return results
256 | # model.float() # for training
257 | # maps = np.zeros(nc) + map
258 | # for i, c in enumerate(ap_class):
259 | # maps[c] = ap[i]
260 | # return (mp, mr, map50, map, *(loss.cpu() / len(dataloader)).tolist()), maps, t
261 |
262 |
263 | if __name__ == '__main__':
264 | parser = argparse.ArgumentParser(prog='test.py')
265 | parser.add_argument('--weights', nargs='+', type=str, default='yolov5s.pt', help='model.pt path(s)')
266 | parser.add_argument('--data', type=str, default='data/coco.yaml', help='*.data path')
267 | parser.add_argument('--batch-size', type=int, default=32, help='size of each image batch')
268 | parser.add_argument('--img-size', type=int, default=640, help='inference size (pixels)')
269 | parser.add_argument('--conf-thres', type=float, default=0.001, help='object confidence threshold')
270 | parser.add_argument('--iou-thres', type=float, default=0.65, help='IOU threshold for NMS')
271 | parser.add_argument('--save-json', action='store_true', help='save a cocoapi-compatible JSON results file')
272 | parser.add_argument('--task', default='val', help="'val', 'test', 'study'")
273 | parser.add_argument('--device', default='', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')
274 | parser.add_argument('--single-cls', action='store_true', help='treat as single-class dataset')
275 | parser.add_argument('--augment', action='store_true', help='augmented inference')
276 | parser.add_argument('--merge', action='store_true', help='use Merge NMS')
277 | parser.add_argument('--verbose', action='store_true', help='report mAP by class')
278 | parser.add_argument('--save-txt', action='store_true', help='save results to *.txt')
279 | opt = parser.parse_args()
280 | opt.save_json |= opt.data.endswith('coco.yaml')
281 | opt.data = check_file(opt.data) # check file
282 | print(opt)
283 |
284 | if opt.task in ['val', 'test']: # run normally
285 | test(opt.data,
286 | opt.weights,
287 | opt.batch_size,
288 | opt.img_size,
289 | opt.conf_thres,
290 | opt.iou_thres,
291 | opt.save_json,
292 | opt.single_cls,
293 | opt.augment,
294 | opt.verbose)
295 |
296 | elif opt.task == 'study': # run over a range of settings and save/plot
297 | for weights in ['yolov5s.pt', 'yolov5m.pt', 'yolov5l.pt', 'yolov5x.pt']:
298 | f = 'study_%s_%s.txt' % (Path(opt.data).stem, Path(weights).stem) # filename to save to
299 | x = list(range(320, 800, 64)) # x axis
300 | y = [] # y axis
301 | for i in x: # img-size
302 | print('\nRunning %s point %s...' % (f, i))
303 | r, _, t = test(opt.data, weights, opt.batch_size, i, opt.conf_thres, opt.iou_thres, opt.save_json)
304 | y.append(r + t) # results and times
305 | np.savetxt(f, y, fmt='%10.4g') # save
306 | os.system('zip -r study.zip study_*.txt')
307 | # utils.general.plot_study_txt(f, x) # plot
--------------------------------------------------------------------------------
/v3.0/yolov5/test.py:
--------------------------------------------------------------------------------
1 | import argparse
2 | import glob
3 | import json
4 | import os
5 | import shutil
6 | from pathlib import Path
7 |
8 | import numpy as np
9 | import torch
10 | import yaml
11 | from tqdm import tqdm
12 |
13 | from models.experimental import attempt_load
14 | from utils.datasets import create_dataloader
15 | from utils.general import (
16 | coco80_to_coco91_class, check_dataset, check_file, check_img_size, compute_loss, non_max_suppression, scale_coords,
17 | xyxy2xywh, clip_coords, plot_images, xywh2xyxy, box_iou, output_to_target, ap_per_class, set_logging)
18 | from utils.torch_utils import select_device, time_synchronized
19 |
20 |
21 | def test(data,
22 | weights=None,
23 | batch_size=16,
24 | imgsz=640,
25 | conf_thres=0.001,
26 | iou_thres=0.6, # for NMS
27 | save_json=False,
28 | single_cls=False,
29 | augment=False,
30 | verbose=False,
31 | model=None,
32 | dataloader=None,
33 | save_dir=Path(''), # for saving images
34 | save_txt=False, # for auto-labelling
35 | save_conf=False,
36 | plots=True,
37 | log_imgs=0): # number of logged images
38 |
39 | # Initialize/load model and set device
40 | training = model is not None
41 | if training: # called by train.py
42 | device = next(model.parameters()).device # get model device
43 |
44 | else: # called directly
45 | set_logging()
46 | device = select_device(opt.device, batch_size=batch_size)
47 | save_txt = opt.save_txt # save *.txt labels
48 |
49 | # Remove previous
50 | if os.path.exists(save_dir):
51 | shutil.rmtree(save_dir) # delete dir
52 | os.makedirs(save_dir) # make new dir
53 |
54 | if save_txt:
55 | out = save_dir / 'autolabels'
56 | if os.path.exists(out):
57 | shutil.rmtree(out) # delete dir
58 | os.makedirs(out) # make new dir
59 |
60 | # Load model
61 | model = attempt_load(weights, map_location=device) # load FP32 model
62 | imgsz = check_img_size(imgsz, s=model.stride.max()) # check img_size
63 |
64 | # Multi-GPU disabled, incompatible with .half() https://github.com/ultralytics/yolov5/issues/99
65 | # if device.type != 'cpu' and torch.cuda.device_count() > 1:
66 | # model = nn.DataParallel(model)
67 |
68 | # Half
69 | half = device.type != 'cpu' # half precision only supported on CUDA
70 | if half:
71 | model.half()
72 |
73 | # Configure
74 | model.eval()
75 | with open(data) as f:
76 | data = yaml.load(f, Loader=yaml.FullLoader) # model dict
77 | check_dataset(data) # check
78 | nc = 1 if single_cls else int(data['nc']) # number of classes
79 | iouv = torch.linspace(0.5, 0.95, 10).to(device) # iou vector for mAP@0.5:0.95
80 | niou = iouv.numel()
81 |
82 | # Logging
83 | log_imgs = min(log_imgs, 100) # ceil
84 | try:
85 | import wandb # Weights & Biases
86 | except ImportError:
87 | log_imgs = 0
88 |
89 | # Dataloader
90 | if not training:
91 | img = torch.zeros((1, 3, imgsz, imgsz), device=device) # init img
92 | _ = model(img.half() if half else img) if device.type != 'cpu' else None # run once
93 | path = data['test'] if opt.task == 'test' else data['val'] # path to val/test images
94 | dataloader = create_dataloader(path, imgsz, batch_size, model.stride.max(), opt,
95 | hyp=None, augment=False, cache=False, pad=0.5, rect=True)[0]
96 |
97 | seen = 0
98 | names = model.names if hasattr(model, 'names') else model.module.names
99 | coco91class = coco80_to_coco91_class()
100 | s = ('%20s' + '%12s' * 6) % ('Class', 'Images', 'Targets', 'P', 'R', 'mAP@.5', 'mAP@.5:.95')
101 | p, r, f1, mp, mr, map50, map, t0, t1 = 0., 0., 0., 0., 0., 0., 0., 0., 0.
102 | loss = torch.zeros(3, device=device)
103 | jdict, stats, ap, ap_class, wandb_images = [], [], [], [], []
104 | for batch_i, (img, targets, paths, shapes) in enumerate(tqdm(dataloader, desc=s)):
105 | img = img.to(device, non_blocking=True)
106 | img = img.half() if half else img.float() # uint8 to fp16/32
107 | img /= 255.0 # 0 - 255 to 0.0 - 1.0
108 | targets = targets.to(device)
109 | nb, _, height, width = img.shape # batch size, channels, height, width
110 | whwh = torch.Tensor([width, height, width, height]).to(device)
111 |
112 | # Disable gradients
113 | with torch.no_grad():
114 | # Run model
115 | t = time_synchronized()
116 | inf_out, train_out = model(img, augment=augment) # inference and training outputs
117 | t0 += time_synchronized() - t
118 |
119 | # Compute loss
120 | if training: # if model has loss hyperparameters
121 | loss += compute_loss([x.float() for x in train_out], targets, model)[1][:3] # box, obj, cls
122 |
123 | # Run NMS
124 | t = time_synchronized()
125 | output = non_max_suppression(inf_out, conf_thres=conf_thres, iou_thres=iou_thres)
126 | t1 += time_synchronized() - t
127 |
128 | # Statistics per image
129 | for si, pred in enumerate(output):
130 | labels = targets[targets[:, 0] == si, 1:]
131 | nl = len(labels)
132 | tcls = labels[:, 0].tolist() if nl else [] # target class
133 | seen += 1
134 |
135 | if pred is None:
136 | if nl:
137 | stats.append((torch.zeros(0, niou, dtype=torch.bool), torch.Tensor(), torch.Tensor(), tcls))
138 | continue
139 |
140 | # Append to text file
141 | if save_txt:
142 | gn = torch.tensor(shapes[si][0])[[1, 0, 1, 0]] # normalization gain whwh
143 | x = pred.clone()
144 | x[:, :4] = scale_coords(img[si].shape[1:], x[:, :4], shapes[si][0], shapes[si][1]) # to original
145 | for *xyxy, conf, cls in x:
146 | xywh = (xyxy2xywh(torch.tensor(xyxy).view(1, 4)) / gn).view(-1).tolist() # normalized xywh
147 | line = (cls, conf, *xywh) if save_conf else (cls, *xywh) # label format
148 | with open(str(out / Path(paths[si]).stem) + '.txt', 'a') as f:
149 | f.write(('%g ' * len(line) + '\n') % line)
150 |
151 | # W&B logging
152 | if len(wandb_images) < log_imgs:
153 | bbox_data = [{"position": {"minX": xyxy[0], "minY": xyxy[1], "maxX": xyxy[2], "maxY": xyxy[3]},
154 | "class_id": int(cls),
155 | "scores": {"class_score": conf},
156 | "domain": "pixel"} for *xyxy, conf, cls in pred.clone().tolist()]
157 | wandb_images.append(wandb.Image(img[si], boxes={"predictions": {"box_data": bbox_data}}))
158 |
159 | # Clip boxes to image bounds
160 | clip_coords(pred, (height, width))
161 |
162 | # Append to pycocotools JSON dictionary
163 | if save_json:
164 | # [{"image_id": 42, "category_id": 18, "bbox": [258.15, 41.29, 348.26, 243.78], "score": 0.236}, ...
165 | image_id = Path(paths[si]).stem
166 | box = pred[:, :4].clone() # xyxy
167 | scale_coords(img[si].shape[1:], box, shapes[si][0], shapes[si][1]) # to original shape
168 | box = xyxy2xywh(box) # xywh
169 | box[:, :2] -= box[:, 2:] / 2 # xy center to top-left corner
170 | for p, b in zip(pred.tolist(), box.tolist()):
171 | jdict.append({'image_id': int(image_id) if image_id.isnumeric() else image_id,
172 | 'category_id': coco91class[int(p[5])],
173 | 'bbox': [round(x, 3) for x in b],
174 | 'score': round(p[4], 5)})
175 |
176 | # Assign all predictions as incorrect
177 | correct = torch.zeros(pred.shape[0], niou, dtype=torch.bool, device=device)
178 | if nl:
179 | detected = [] # target indices
180 | tcls_tensor = labels[:, 0]
181 |
182 | # target boxes
183 | tbox = xywh2xyxy(labels[:, 1:5]) * whwh
184 |
185 | # Per target class
186 | for cls in torch.unique(tcls_tensor):
187 | ti = (cls == tcls_tensor).nonzero(as_tuple=False).view(-1) # prediction indices
188 | pi = (cls == pred[:, 5]).nonzero(as_tuple=False).view(-1) # target indices
189 |
190 | # Search for detections
191 | if pi.shape[0]:
192 | # Prediction to target ious
193 | ious, i = box_iou(pred[pi, :4], tbox[ti]).max(1) # best ious, indices
194 |
195 | # Append detections
196 | detected_set = set()
197 | for j in (ious > iouv[0]).nonzero(as_tuple=False):
198 | d = ti[i[j]] # detected target
199 | if d.item() not in detected_set:
200 | detected_set.add(d.item())
201 | detected.append(d)
202 | correct[pi[j]] = ious[j] > iouv # iou_thres is 1xn
203 | if len(detected) == nl: # all targets already located in image
204 | break
205 |
206 | # Append statistics (correct, conf, pcls, tcls)
207 | stats.append((correct.cpu(), pred[:, 4].cpu(), pred[:, 5].cpu(), tcls))
208 |
209 | # Plot images
210 | if plots and batch_i < 1:
211 | f = save_dir / f'test_batch{batch_i}_gt.jpg' # filename
212 | plot_images(img, targets, paths, str(f), names) # ground truth
213 | f = save_dir / f'test_batch{batch_i}_pred.jpg'
214 | plot_images(img, output_to_target(output, width, height), paths, str(f), names) # predictions
215 |
216 | # W&B logging
217 | if wandb_images:
218 | wandb.log({"outputs": wandb_images})
219 |
220 | # Compute statistics
221 | stats = [np.concatenate(x, 0) for x in zip(*stats)] # to numpy
222 | if len(stats) and stats[0].any():
223 | p, r, ap, f1, ap_class = ap_per_class(*stats, plot=plots, fname=save_dir / 'precision-recall_curve.png')
224 | p, r, ap50, ap = p[:, 0], r[:, 0], ap[:, 0], ap.mean(1) # [P, R, AP@0.5, AP@0.5:0.95]
225 | mp, mr, map50, map = p.mean(), r.mean(), ap50.mean(), ap.mean()
226 | nt = np.bincount(stats[3].astype(np.int64), minlength=nc) # number of targets per class
227 | else:
228 | nt = torch.zeros(1)
229 |
230 | # Print results
231 | pf = '%20s' + '%12.3g' * 6 # print format
232 | print(pf % ('all', seen, nt.sum(), mp, mr, map50, map))
233 |
234 | # Print results per class
235 | if verbose and nc > 1 and len(stats):
236 | for i, c in enumerate(ap_class):
237 | print(pf % (names[c], seen, nt[c], p[i], r[i], ap50[i], ap[i]))
238 |
239 | # Print speeds
240 | t = tuple(x / seen * 1E3 for x in (t0, t1, t0 + t1)) + (imgsz, imgsz, batch_size) # tuple
241 | if not training:
242 | print('Speed: %.1f/%.1f/%.1f ms inference/NMS/total per %gx%g image at batch-size %g' % t)
243 |
244 | # Save JSON
245 | if save_json and len(jdict):
246 | w = Path(weights[0] if isinstance(weights, list) else weights).stem if weights is not None else '' # weights
247 | file = save_dir / f"detections_val2017_{w}_results.json" # predicted annotations file
248 | print('\nCOCO mAP with pycocotools... saving %s...' % file)
249 | with open(file, 'w') as f:
250 | json.dump(jdict, f)
251 |
252 | try: # https://github.com/cocodataset/cocoapi/blob/master/PythonAPI/pycocoEvalDemo.ipynb
253 | from pycocotools.coco import COCO
254 | from pycocotools.cocoeval import COCOeval
255 |
256 | imgIds = [int(Path(x).stem) for x in dataloader.dataset.img_files]
257 | cocoGt = COCO(glob.glob('../coco/annotations/instances_val*.json')[0]) # initialize COCO ground truth api
258 | cocoDt = cocoGt.loadRes(str(file)) # initialize COCO pred api
259 | cocoEval = COCOeval(cocoGt, cocoDt, 'bbox')
260 | cocoEval.params.imgIds = imgIds # image IDs to evaluate
261 | cocoEval.evaluate()
262 | cocoEval.accumulate()
263 | cocoEval.summarize()
264 | map, map50 = cocoEval.stats[:2] # update results (mAP@0.5:0.95, mAP@0.5)
265 | except Exception as e:
266 | print('ERROR: pycocotools unable to run: %s' % e)
267 |
268 | # Return results
269 | model.float() # for training
270 | maps = np.zeros(nc) + map
271 | for i, c in enumerate(ap_class):
272 | maps[c] = ap[i]
273 | return (mp, mr, map50, map, *(loss.cpu() / len(dataloader)).tolist()), maps, t
274 |
275 |
276 | if __name__ == '__main__':
277 | parser = argparse.ArgumentParser(prog='test.py')
278 | parser.add_argument('--weights', nargs='+', type=str, default='yolov5s.pt', help='model.pt path(s)')
279 | parser.add_argument('--data', type=str, default='data/coco128.yaml', help='*.data path')
280 | parser.add_argument('--batch-size', type=int, default=32, help='size of each image batch')
281 | parser.add_argument('--img-size', type=int, default=640, help='inference size (pixels)')
282 | parser.add_argument('--conf-thres', type=float, default=0.001, help='object confidence threshold')
283 | parser.add_argument('--iou-thres', type=float, default=0.65, help='IOU threshold for NMS')
284 | parser.add_argument('--save-json', action='store_true', help='save a cocoapi-compatible JSON results file')
285 | parser.add_argument('--task', default='val', help="'val', 'test', 'study'")
286 | parser.add_argument('--device', default='', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')
287 | parser.add_argument('--single-cls', action='store_true', help='treat as single-class dataset')
288 | parser.add_argument('--augment', action='store_true', help='augmented inference')
289 | parser.add_argument('--verbose', action='store_true', help='report mAP by class')
290 | parser.add_argument('--save-txt', action='store_true', help='save results to *.txt')
291 | parser.add_argument('--save-conf', action='store_true', help='save confidences in --save-txt labels')
292 | parser.add_argument('--save-dir', type=str, default='runs/test', help='directory to save results')
293 | opt = parser.parse_args()
294 | opt.save_json |= opt.data.endswith('coco.yaml')
295 | opt.data = check_file(opt.data) # check file
296 | print(opt)
297 |
298 | if opt.task in ['val', 'test']: # run normally
299 | test(opt.data,
300 | opt.weights,
301 | opt.batch_size,
302 | opt.img_size,
303 | opt.conf_thres,
304 | opt.iou_thres,
305 | opt.save_json,
306 | opt.single_cls,
307 | opt.augment,
308 | opt.verbose,
309 | save_dir=Path(opt.save_dir),
310 | save_txt=opt.save_txt,
311 | save_conf=opt.save_conf,
312 | )
313 |
314 | print('Results saved to %s' % opt.save_dir)
315 |
316 | elif opt.task == 'study': # run over a range of settings and save/plot
317 | for weights in ['yolov5s.pt', 'yolov5m.pt', 'yolov5l.pt', 'yolov5x.pt']:
318 | f = 'study_%s_%s.txt' % (Path(opt.data).stem, Path(weights).stem) # filename to save to
319 | x = list(range(320, 800, 64)) # x axis
320 | y = [] # y axis
321 | for i in x: # img-size
322 | print('\nRunning %s point %s...' % (f, i))
323 | r, _, t = test(opt.data, weights, opt.batch_size, i, opt.conf_thres, opt.iou_thres, opt.save_json)
324 | y.append(r + t) # results and times
325 | np.savetxt(f, y, fmt='%10.4g') # save
326 | os.system('zip -r study.zip study_*.txt')
327 | # utils.general.plot_study_txt(f, x) # plot
328 |
--------------------------------------------------------------------------------
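The core of the statistics loop in test.py above matches each prediction to at most one ground-truth box of the same class and marks it correct at every IoU threshold it clears (iouv spans 0.50 to 0.95). The following is a simplified, stand-alone restatement of that matching step, not the exact code path; box_iou is the pairwise-IoU helper imported from utils.general, and the early exit once all targets are found is omitted.

```python
# Simplified sketch of the prediction-to-target matching used for mAP in test.py.
import torch
from utils.general import box_iou

def match_predictions(pred_boxes, pred_cls, tgt_boxes, tgt_cls, iouv):
    # pred_boxes: (P, 4) xyxy, tgt_boxes: (T, 4) xyxy, iouv: (10,) IoU thresholds
    correct = torch.zeros(pred_boxes.shape[0], iouv.numel(), dtype=torch.bool)
    claimed = set()                                          # each target may be detected once
    for c in tgt_cls.unique():
        ti = (tgt_cls == c).nonzero(as_tuple=False).view(-1)   # targets of this class
        pi = (pred_cls == c).nonzero(as_tuple=False).view(-1)  # predictions of this class
        if not pi.numel():
            continue
        ious, best = box_iou(pred_boxes[pi], tgt_boxes[ti]).max(1)  # best target per prediction
        for j in (ious > iouv[0]).nonzero(as_tuple=False).view(-1):
            t = ti[best[j]].item()
            if t not in claimed:
                claimed.add(t)
                correct[pi[j]] = ious[j] > iouv              # correct at each threshold it clears
    return correct
```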
/v3.0/yolov5/train.py:
--------------------------------------------------------------------------------
1 | import argparse
2 | import logging
3 | import os
4 | import random
5 | import shutil
6 | import time
7 | from pathlib import Path
8 | from warnings import warn
9 |
10 | import math
11 | import numpy as np
12 | import torch.distributed as dist
13 | import torch.nn.functional as F
14 | import torch.optim as optim
15 | import torch.optim.lr_scheduler as lr_scheduler
16 | import torch.utils.data
17 | import yaml
18 | from torch.cuda import amp
19 | from torch.nn.parallel import DistributedDataParallel as DDP
20 | from torch.utils.tensorboard import SummaryWriter
21 | from tqdm import tqdm
22 |
23 | import test # import test.py to get mAP after each epoch
24 | from models.yolo import Model
25 | from utils.datasets import create_dataloader
26 | from utils.general import (
27 | torch_distributed_zero_first, labels_to_class_weights, plot_labels, check_anchors, labels_to_image_weights,
28 | compute_loss, plot_images, fitness, strip_optimizer, plot_results, get_latest_run, check_dataset, check_file,
29 | check_git_status, check_img_size, increment_dir, print_mutation, plot_evolution, set_logging, init_seeds)
30 | from utils.google_utils import attempt_download
31 | from utils.torch_utils import ModelEMA, select_device, intersect_dicts
32 |
33 | logger = logging.getLogger(__name__)
34 |
35 |
36 | def train(hyp, opt, device, tb_writer=None, wandb=None):
37 | logger.info(f'Hyperparameters {hyp}')
38 | log_dir = Path(tb_writer.log_dir) if tb_writer else Path(opt.logdir) / 'evolve' # logging directory
39 | wdir = log_dir / 'weights' # weights directory
40 | os.makedirs(wdir, exist_ok=True)
41 | last = wdir / 'last.pt'
42 | best = wdir / 'best.pt'
43 | results_file = str(log_dir / 'results.txt')
44 | epochs, batch_size, total_batch_size, weights, rank = \
45 | opt.epochs, opt.batch_size, opt.total_batch_size, opt.weights, opt.global_rank
46 |
47 | # Save run settings
48 | with open(log_dir / 'hyp.yaml', 'w') as f:
49 | yaml.dump(hyp, f, sort_keys=False)
50 | with open(log_dir / 'opt.yaml', 'w') as f:
51 | yaml.dump(vars(opt), f, sort_keys=False)
52 |
53 | # Configure
54 | cuda = device.type != 'cpu'
55 | init_seeds(2 + rank)
56 | with open(opt.data) as f:
57 | data_dict = yaml.load(f, Loader=yaml.FullLoader) # data dict
58 | with torch_distributed_zero_first(rank):
59 | check_dataset(data_dict) # check
60 | train_path = data_dict['train']
61 | test_path = data_dict['val']
62 | nc, names = (1, ['item']) if opt.single_cls else (int(data_dict['nc']), data_dict['names']) # number classes, names
63 | assert len(names) == nc, '%g names found for nc=%g dataset in %s' % (len(names), nc, opt.data) # check
64 |
65 | # Model
66 | pretrained = weights.endswith('.pt')
67 | if pretrained:
68 | with torch_distributed_zero_first(rank):
69 | attempt_download(weights) # download if not found locally
70 | ckpt = torch.load(weights, map_location=device) # load checkpoint
71 | if hyp.get('anchors'):
72 | ckpt['model'].yaml['anchors'] = round(hyp['anchors']) # force autoanchor
73 | model = Model(opt.cfg or ckpt['model'].yaml, ch=3, nc=nc).to(device) # create
74 | exclude = ['anchor'] if opt.cfg or hyp.get('anchors') else [] # exclude keys
75 | state_dict = ckpt['model'].float().state_dict() # to FP32
76 | state_dict = intersect_dicts(state_dict, model.state_dict(), exclude=exclude) # intersect
77 | model.load_state_dict(state_dict, strict=False) # load
78 | logger.info('Transferred %g/%g items from %s' % (len(state_dict), len(model.state_dict()), weights)) # report
79 | else:
80 | model = Model(opt.cfg, ch=3, nc=nc).to(device) # create
81 |
82 | # Freeze
83 | freeze = ['', ] # parameter names to freeze (full or partial)
84 | if any(freeze):
85 | for k, v in model.named_parameters():
86 | if any(x in k for x in freeze):
87 | print('freezing %s' % k)
88 | v.requires_grad = False
89 |
90 | # Optimizer
91 | nbs = 64 # nominal batch size
92 | accumulate = max(round(nbs / total_batch_size), 1) # accumulate loss before optimizing
93 | hyp['weight_decay'] *= total_batch_size * accumulate / nbs # scale weight_decay
94 |
95 | pg0, pg1, pg2 = [], [], [] # optimizer parameter groups
96 | for k, v in model.named_parameters():
97 | v.requires_grad = True
98 | if '.bias' in k:
99 | pg2.append(v) # biases
100 | elif '.weight' in k and '.bn' not in k:
101 | pg1.append(v) # apply weight decay
102 | else:
103 | pg0.append(v) # all else
104 |
105 | if opt.adam:
106 | optimizer = optim.Adam(pg0, lr=hyp['lr0'], betas=(hyp['momentum'], 0.999)) # adjust beta1 to momentum
107 | else:
108 | optimizer = optim.SGD(pg0, lr=hyp['lr0'], momentum=hyp['momentum'], nesterov=True)
109 |
110 | optimizer.add_param_group({'params': pg1, 'weight_decay': hyp['weight_decay']}) # add pg1 with weight_decay
111 | optimizer.add_param_group({'params': pg2}) # add pg2 (biases)
112 | logger.info('Optimizer groups: %g .bias, %g conv.weight, %g other' % (len(pg2), len(pg1), len(pg0)))
113 | del pg0, pg1, pg2
114 |
115 | # Scheduler https://arxiv.org/pdf/1812.01187.pdf
116 | # https://pytorch.org/docs/stable/_modules/torch/optim/lr_scheduler.html#OneCycleLR
117 | lf = lambda x: ((1 + math.cos(x * math.pi / epochs)) / 2) * (1 - hyp['lrf']) + hyp['lrf'] # cosine
118 | scheduler = lr_scheduler.LambdaLR(optimizer, lr_lambda=lf)
119 | # plot_lr_scheduler(optimizer, scheduler, epochs)
120 |
121 | # Logging
122 | if wandb and wandb.run is None:
123 | id = ckpt.get('wandb_id') if 'ckpt' in locals() else None
124 | wandb_run = wandb.init(config=opt, resume="allow", project=os.path.basename(log_dir), id=id)
125 |
126 | # Resume
127 | start_epoch, best_fitness = 0, 0.0
128 | if pretrained:
129 | # Optimizer
130 | if ckpt['optimizer'] is not None:
131 | optimizer.load_state_dict(ckpt['optimizer'])
132 | best_fitness = ckpt['best_fitness']
133 |
134 | # Results
135 | if ckpt.get('training_results') is not None:
136 | with open(results_file, 'w') as file:
137 | file.write(ckpt['training_results']) # write results.txt
138 |
139 | # Epochs
140 | start_epoch = ckpt['epoch'] + 1
141 | if opt.resume:
142 | assert start_epoch > 0, '%s training to %g epochs is finished, nothing to resume.' % (weights, epochs)
143 | shutil.copytree(wdir, wdir.parent / f'weights_backup_epoch{start_epoch - 1}') # save previous weights
144 | if epochs < start_epoch:
145 | logger.info('%s has been trained for %g epochs. Fine-tuning for %g additional epochs.' %
146 | (weights, ckpt['epoch'], epochs))
147 | epochs += ckpt['epoch'] # finetune additional epochs
148 |
149 | del ckpt, state_dict
150 |
151 | # Image sizes
152 | gs = int(max(model.stride)) # grid size (max stride)
153 | imgsz, imgsz_test = [check_img_size(x, gs) for x in opt.img_size] # verify imgsz are gs-multiples
154 |
155 | # DP mode
156 | if cuda and rank == -1 and torch.cuda.device_count() > 1:
157 | model = torch.nn.DataParallel(model)
158 |
159 | # SyncBatchNorm
160 | if opt.sync_bn and cuda and rank != -1:
161 | model = torch.nn.SyncBatchNorm.convert_sync_batchnorm(model).to(device)
162 | logger.info('Using SyncBatchNorm()')
163 |
164 | # Exponential moving average
165 | ema = ModelEMA(model) if rank in [-1, 0] else None
166 |
167 | # DDP mode
168 | if cuda and rank != -1:
169 | model = DDP(model, device_ids=[opt.local_rank], output_device=opt.local_rank)
170 |
171 | # Trainloader
172 | dataloader, dataset = create_dataloader(train_path, imgsz, batch_size, gs, opt,
173 | hyp=hyp, augment=True, cache=opt.cache_images, rect=opt.rect,
174 | rank=rank, world_size=opt.world_size, workers=opt.workers)
175 | mlc = np.concatenate(dataset.labels, 0)[:, 0].max() # max label class
176 | nb = len(dataloader) # number of batches
177 | assert mlc < nc, 'Label class %g exceeds nc=%g in %s. Possible class labels are 0-%g' % (mlc, nc, opt.data, nc - 1)
178 |
179 | # Process 0
180 | if rank in [-1, 0]:
181 | ema.updates = start_epoch * nb // accumulate # set EMA updates
182 | testloader = create_dataloader(test_path, imgsz_test, total_batch_size, gs, opt,
183 | hyp=hyp, augment=False, cache=opt.cache_images and not opt.notest, rect=True,
184 | rank=-1, world_size=opt.world_size, workers=opt.workers)[0] # testloader
185 |
186 | if not opt.resume:
187 | labels = np.concatenate(dataset.labels, 0)
188 | c = torch.tensor(labels[:, 0]) # classes
189 | # cf = torch.bincount(c.long(), minlength=nc) + 1. # frequency
190 | # model._initialize_biases(cf.to(device))
191 | plot_labels(labels, save_dir=log_dir)
192 | if tb_writer:
193 | # tb_writer.add_hparams(hyp, {}) # causes duplicate https://github.com/ultralytics/yolov5/pull/384
194 | tb_writer.add_histogram('classes', c, 0)
195 |
196 | # Anchors
197 | if not opt.noautoanchor:
198 | check_anchors(dataset, model=model, thr=hyp['anchor_t'], imgsz=imgsz)
199 |
200 | # Model parameters
201 | hyp['cls'] *= nc / 80. # scale coco-tuned hyp['cls'] to current dataset
202 | model.nc = nc # attach number of classes to model
203 | model.hyp = hyp # attach hyperparameters to model
204 | model.gr = 1.0 # iou loss ratio (obj_loss = 1.0 or iou)
205 | model.class_weights = labels_to_class_weights(dataset.labels, nc).to(device) # attach class weights
206 | model.names = names
207 |
208 | # Start training
209 | t0 = time.time()
210 | nw = max(round(hyp['warmup_epochs'] * nb), 1e3) # number of warmup iterations, max(3 epochs, 1k iterations)
211 | # nw = min(nw, (epochs - start_epoch) / 2 * nb) # limit warmup to < 1/2 of training
212 | maps = np.zeros(nc) # mAP per class
213 | results = (0, 0, 0, 0, 0, 0, 0) # P, R, mAP@.5, mAP@.5-.95, val_loss(box, obj, cls)
214 | scheduler.last_epoch = start_epoch - 1 # do not move
215 | scaler = amp.GradScaler(enabled=cuda)
216 | logger.info('Image sizes %g train, %g test\n'
217 | 'Using %g dataloader workers\nLogging results to %s\n'
218 | 'Starting training for %g epochs...' % (imgsz, imgsz_test, dataloader.num_workers, log_dir, epochs))
219 | for epoch in range(start_epoch, epochs): # epoch ------------------------------------------------------------------
220 | model.train()
221 |
222 | # Update image weights (optional)
223 | if opt.image_weights:
224 | # Generate indices
225 | if rank in [-1, 0]:
226 | cw = model.class_weights.cpu().numpy() * (1 - maps) ** 2 # class weights
227 | iw = labels_to_image_weights(dataset.labels, nc=nc, class_weights=cw) # image weights
228 | dataset.indices = random.choices(range(dataset.n), weights=iw, k=dataset.n) # rand weighted idx
229 | # Broadcast if DDP
230 | if rank != -1:
231 | indices = (torch.tensor(dataset.indices) if rank == 0 else torch.zeros(dataset.n)).int()
232 | dist.broadcast(indices, 0)
233 | if rank != 0:
234 | dataset.indices = indices.cpu().numpy()
235 |
236 | # Update mosaic border
237 | # b = int(random.uniform(0.25 * imgsz, 0.75 * imgsz + gs) // gs * gs)
238 | # dataset.mosaic_border = [b - imgsz, -b] # height, width borders
239 |
240 | mloss = torch.zeros(4, device=device) # mean losses
241 | if rank != -1:
242 | dataloader.sampler.set_epoch(epoch)
243 | pbar = enumerate(dataloader)
244 | logger.info(('\n' + '%10s' * 8) % ('Epoch', 'gpu_mem', 'box', 'obj', 'cls', 'total', 'targets', 'img_size'))
245 | if rank in [-1, 0]:
246 | pbar = tqdm(pbar, total=nb) # progress bar
247 | optimizer.zero_grad()
248 | for i, (imgs, targets, paths, _) in pbar: # batch -------------------------------------------------------------
249 | ni = i + nb * epoch # number integrated batches (since train start)
250 | imgs = imgs.to(device, non_blocking=True).float() / 255.0 # uint8 to float32, 0-255 to 0.0-1.0
251 |
252 | # Warmup
253 | if ni <= nw:
254 | xi = [0, nw] # x interp
255 | # model.gr = np.interp(ni, xi, [0.0, 1.0]) # iou loss ratio (obj_loss = 1.0 or iou)
256 | accumulate = max(1, np.interp(ni, xi, [1, nbs / total_batch_size]).round())
257 | for j, x in enumerate(optimizer.param_groups):
258 | # bias lr falls from 0.1 to lr0, all other lrs rise from 0.0 to lr0
259 | x['lr'] = np.interp(ni, xi, [hyp['warmup_bias_lr'] if j == 2 else 0.0, x['initial_lr'] * lf(epoch)])
260 | if 'momentum' in x:
261 | x['momentum'] = np.interp(ni, xi, [hyp['warmup_momentum'], hyp['momentum']])
262 |
263 | # Multi-scale
264 | if opt.multi_scale:
265 | sz = random.randrange(imgsz * 0.5, imgsz * 1.5 + gs) // gs * gs # size
266 | sf = sz / max(imgs.shape[2:]) # scale factor
267 | if sf != 1:
268 | ns = [math.ceil(x * sf / gs) * gs for x in imgs.shape[2:]] # new shape (stretched to gs-multiple)
269 | imgs = F.interpolate(imgs, size=ns, mode='bilinear', align_corners=False)
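# sizes are snapped to multiples of the grid stride gs; e.g. with imgsz=512 and
# gs=32 the sampled size lies in [256, 768] in steps of 32, and the whole batch
# is bilinearly resized to match.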
270 |
271 | # Forward
272 | with amp.autocast(enabled=cuda):
273 | pred = model(imgs) # forward
274 | loss, loss_items = compute_loss(pred, targets.to(device), model) # loss scaled by batch_size
275 | if rank != -1:
276 | loss *= opt.world_size # gradient averaged between devices in DDP mode
277 |
278 | # Backward
279 | scaler.scale(loss).backward()
280 |
281 | # Optimize
282 | if ni % accumulate == 0:
283 | scaler.step(optimizer) # optimizer.step
284 | scaler.update()
285 | optimizer.zero_grad()
286 | if ema:
287 | ema.update(model)
288 |
289 | # Print
290 | if rank in [-1, 0]:
291 | mloss = (mloss * i + loss_items) / (i + 1) # update mean losses
292 | mem = '%.3gG' % (torch.cuda.memory_reserved() / 1E9 if torch.cuda.is_available() else 0) # (GB)
293 | s = ('%10s' * 2 + '%10.4g' * 6) % (
294 | '%g/%g' % (epoch, epochs - 1), mem, *mloss, targets.shape[0], imgs.shape[-1])
295 | pbar.set_description(s)
296 |
297 | # Plot
298 | if ni < 3:
299 | f = str(log_dir / f'train_batch{ni}.jpg') # filename
300 | result = plot_images(images=imgs, targets=targets, paths=paths, fname=f)
301 | # if tb_writer and result is not None:
302 | # tb_writer.add_image(f, result, dataformats='HWC', global_step=epoch)
303 | # tb_writer.add_graph(model, imgs) # add model to tensorboard
304 |
305 | # end batch ------------------------------------------------------------------------------------------------
306 |
307 | # Scheduler
308 | lr = [x['lr'] for x in optimizer.param_groups] # for tensorboard
309 | scheduler.step()
310 |
311 | # DDP process 0 or single-GPU
312 | if rank in [-1, 0]:
313 | # mAP
314 | if ema:
315 | ema.update_attr(model, include=['yaml', 'nc', 'hyp', 'gr', 'names', 'stride'])
316 | final_epoch = epoch + 1 == epochs
317 | if not opt.notest or final_epoch: # Calculate mAP
318 | results, maps, times = test.test(opt.data,
319 | batch_size=total_batch_size,
320 | imgsz=imgsz_test,
321 | model=ema.ema,
322 | single_cls=opt.single_cls,
323 | dataloader=testloader,
324 | save_dir=log_dir,
325 | plots=epoch == 0 or final_epoch, # plot first and last
326 | log_imgs=opt.log_imgs)
327 |
328 | # Write
329 | with open(results_file, 'a') as f:
330 | f.write(s + '%10.4g' * 7 % results + '\n') # P, R, mAP@.5, mAP@.5-.95, val_loss(box, obj, cls)
331 | if len(opt.name) and opt.bucket:
332 | os.system('gsutil cp %s gs://%s/results/results%s.txt' % (results_file, opt.bucket, opt.name))
333 |
334 | # Log
335 | tags = ['train/giou_loss', 'train/obj_loss', 'train/cls_loss', # train loss
336 | 'metrics/precision', 'metrics/recall', 'metrics/mAP_0.5', 'metrics/mAP_0.5:0.95',
337 | 'val/giou_loss', 'val/obj_loss', 'val/cls_loss', # val loss
338 | 'x/lr0', 'x/lr1', 'x/lr2'] # params
339 | for x, tag in zip(list(mloss[:-1]) + list(results) + lr, tags):
340 | if tb_writer:
341 | tb_writer.add_scalar(tag, x, epoch) # tensorboard
342 | if wandb:
343 | wandb.log({tag: x}) # W&B
344 |
345 | # Update best mAP
346 | fi = fitness(np.array(results).reshape(1, -1)) # weighted combination of [P, R, mAP@.5, mAP@.5-.95]
347 | if fi > best_fitness:
348 | best_fitness = fi
349 |
350 | # Save model
351 | save = (not opt.nosave) or (final_epoch and not opt.evolve)
352 | if save:
353 | with open(results_file, 'r') as f: # create checkpoint
354 | ckpt = {'epoch': epoch,
355 | 'best_fitness': best_fitness,
356 | 'training_results': f.read(),
357 | 'model': ema.ema,
358 | 'optimizer': None if final_epoch else optimizer.state_dict(),
359 | 'wandb_id': wandb_run.id if wandb else None}
360 |
361 | # Save last, best and delete
362 | torch.save(ckpt, last)
363 | if best_fitness == fi:
364 | torch.save(ckpt, best)
365 | del ckpt
366 | # end epoch ----------------------------------------------------------------------------------------------------
367 | # end training
368 |
369 | if rank in [-1, 0]:
370 | # Strip optimizers
371 | n = opt.name if opt.name.isnumeric() else ''
372 | fresults, flast, fbest = log_dir / f'results{n}.txt', wdir / f'last{n}.pt', wdir / f'best{n}.pt'
373 | for f1, f2 in zip([wdir / 'last.pt', wdir / 'best.pt', results_file], [flast, fbest, fresults]):
374 | if os.path.exists(f1):
375 | os.rename(f1, f2) # rename
376 | if str(f2).endswith('.pt'): # is *.pt
377 | strip_optimizer(f2) # strip optimizer
378 | os.system('gsutil cp %s gs://%s/weights' % (f2, opt.bucket)) if opt.bucket else None # upload
379 | # Finish
380 | if not opt.evolve:
381 | plot_results(save_dir=log_dir) # save as results.png
382 | logger.info('%g epochs completed in %.3f hours.\n' % (epoch - start_epoch + 1, (time.time() - t0) / 3600))
383 |
384 | dist.destroy_process_group() if rank not in [-1, 0] else None
385 | torch.cuda.empty_cache()
386 | return results
387 |
388 |
389 | if __name__ == '__main__':
390 | parser = argparse.ArgumentParser()
391 | parser.add_argument('--weights', type=str, default='yolov5s.pt', help='initial weights path')
392 | parser.add_argument('--cfg', type=str, default='', help='model.yaml path')
393 | parser.add_argument('--data', type=str, default='data/coco128.yaml', help='data.yaml path')
394 | parser.add_argument('--hyp', type=str, default='data/hyp.scratch.yaml', help='hyperparameters path')
395 | parser.add_argument('--epochs', type=int, default=300)
396 | parser.add_argument('--batch-size', type=int, default=16, help='total batch size for all GPUs')
397 | parser.add_argument('--img-size', nargs='+', type=int, default=[640, 640], help='[train, test] image sizes')
398 | parser.add_argument('--rect', action='store_true', help='rectangular training')
399 | parser.add_argument('--resume', nargs='?', const=True, default=False, help='resume most recent training')
400 | parser.add_argument('--nosave', action='store_true', help='only save final checkpoint')
401 | parser.add_argument('--notest', action='store_true', help='only test final epoch')
402 | parser.add_argument('--noautoanchor', action='store_true', help='disable autoanchor check')
403 | parser.add_argument('--evolve', action='store_true', help='evolve hyperparameters')
404 | parser.add_argument('--bucket', type=str, default='', help='gsutil bucket')
405 | parser.add_argument('--cache-images', action='store_true', help='cache images for faster training')
406 | parser.add_argument('--image-weights', action='store_true', help='use weighted image selection for training')
407 | parser.add_argument('--name', default='', help='renames experiment folder exp{N} to exp{N}_{name} if supplied')
408 | parser.add_argument('--device', default='', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')
409 | parser.add_argument('--multi-scale', action='store_true', help='vary img-size +/- 50%%')
410 | parser.add_argument('--single-cls', action='store_true', help='train as single-class dataset')
411 | parser.add_argument('--adam', action='store_true', help='use torch.optim.Adam() optimizer')
412 | parser.add_argument('--sync-bn', action='store_true', help='use SyncBatchNorm, only available in DDP mode')
413 | parser.add_argument('--local_rank', type=int, default=-1, help='DDP parameter, do not modify')
414 | parser.add_argument('--logdir', type=str, default='runs/', help='logging directory')
415 | parser.add_argument('--log-imgs', type=int, default=10, help='number of images for W&B logging, max 100')
416 | parser.add_argument('--workers', type=int, default=8, help='maximum number of dataloader workers')
417 |
418 | opt = parser.parse_args()
419 |
420 | # Set DDP variables
421 | opt.total_batch_size = opt.batch_size
422 | opt.world_size = int(os.environ['WORLD_SIZE']) if 'WORLD_SIZE' in os.environ else 1
423 | opt.global_rank = int(os.environ['RANK']) if 'RANK' in os.environ else -1
424 | set_logging(opt.global_rank)
425 | if opt.global_rank in [-1, 0]:
426 | check_git_status()
427 |
428 | # Resume
429 | if opt.resume: # resume an interrupted run
430 | ckpt = opt.resume if isinstance(opt.resume, str) else get_latest_run() # specified or most recent path
431 | log_dir = Path(ckpt).parent.parent # runs/exp0
432 | assert os.path.isfile(ckpt), 'ERROR: --resume checkpoint does not exist'
433 | with open(log_dir / 'opt.yaml') as f:
434 | opt = argparse.Namespace(**yaml.load(f, Loader=yaml.FullLoader)) # replace
435 | opt.cfg, opt.weights, opt.resume = '', ckpt, True
436 | logger.info('Resuming training from %s' % ckpt)
437 |
438 | else:
439 | # opt.hyp = opt.hyp or ('hyp.finetune.yaml' if opt.weights else 'hyp.scratch.yaml')
440 | opt.data, opt.cfg, opt.hyp = check_file(opt.data), check_file(opt.cfg), check_file(opt.hyp) # check files
441 | assert len(opt.cfg) or len(opt.weights), 'either --cfg or --weights must be specified'
442 | opt.img_size.extend([opt.img_size[-1]] * (2 - len(opt.img_size))) # extend to 2 sizes (train, test)
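# e.g. '--img-size 640' becomes [640, 640] (same train and test size), while
# '--img-size 640 512' is kept as-is.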
443 | log_dir = increment_dir(Path(opt.logdir) / 'exp', opt.name) # runs/exp1
444 |
445 | # DDP mode
446 | device = select_device(opt.device, batch_size=opt.batch_size)
447 | if opt.local_rank != -1:
448 | assert torch.cuda.device_count() > opt.local_rank
449 | torch.cuda.set_device(opt.local_rank)
450 | device = torch.device('cuda', opt.local_rank)
451 | dist.init_process_group(backend='nccl', init_method='env://') # distributed backend
452 | assert opt.batch_size % opt.world_size == 0, '--batch-size must be multiple of CUDA device count'
453 | opt.batch_size = opt.total_batch_size // opt.world_size
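# e.g. a 2-process DDP launch (WORLD_SIZE=2) with --batch-size 16 gives each
# process a per-GPU batch of 8 while opt.total_batch_size stays 16.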
454 |
455 | # Hyperparameters
456 | with open(opt.hyp) as f:
457 | hyp = yaml.load(f, Loader=yaml.FullLoader) # load hyps
458 | if 'box' not in hyp:
459 | warn('Compatibility: %s missing "box" which was renamed from "giou" in %s' %
460 | (opt.hyp, 'https://github.com/ultralytics/yolov5/pull/1120'))
461 | hyp['box'] = hyp.pop('giou')
462 |
463 | # Train
464 | logger.info(opt)
465 | if not opt.evolve:
466 | tb_writer, wandb = None, None # init loggers
467 | if opt.global_rank in [-1, 0]:
468 | # Tensorboard
469 | logger.info(f'Start Tensorboard with "tensorboard --logdir {opt.logdir}", view at http://localhost:6006/')
470 | tb_writer = SummaryWriter(log_dir=log_dir) # runs/exp0
471 |
472 | # W&B
473 | try:
474 | import wandb
475 |
476 | assert os.environ.get('WANDB_DISABLED') != 'true'
477 | logger.info("Weights & Biases logging enabled, to disable set os.environ['WANDB_DISABLED'] = 'true'")
478 | except (ImportError, AssertionError):
479 | opt.log_imgs = 0
480 | logger.info("Install Weights & Biases for experiment logging via 'pip install wandb' (recommended)")
481 |
482 | train(hyp, opt, device, tb_writer, wandb)
483 |
484 | # Evolve hyperparameters (optional)
485 | else:
486 | # Hyperparameter evolution metadata (mutation scale 0-1, lower_limit, upper_limit)
487 | meta = {'lr0': (1, 1e-5, 1e-1), # initial learning rate (SGD=1E-2, Adam=1E-3)
488 | 'lrf': (1, 0.01, 1.0), # final OneCycleLR learning rate (lr0 * lrf)
489 | 'momentum': (0.3, 0.6, 0.98), # SGD momentum/Adam beta1
490 | 'weight_decay': (1, 0.0, 0.001), # optimizer weight decay
491 | 'warmup_epochs': (1, 0.0, 5.0), # warmup epochs (fractions ok)
492 | 'warmup_momentum': (1, 0.0, 0.95), # warmup initial momentum
493 | 'warmup_bias_lr': (1, 0.0, 0.2), # warmup initial bias lr
494 | 'box': (1, 0.02, 0.2), # box loss gain
495 | 'cls': (1, 0.2, 4.0), # cls loss gain
496 | 'cls_pw': (1, 0.5, 2.0), # cls BCELoss positive_weight
497 | 'obj': (1, 0.2, 4.0), # obj loss gain (scale with pixels)
498 | 'obj_pw': (1, 0.5, 2.0), # obj BCELoss positive_weight
499 | 'iou_t': (0, 0.1, 0.7), # IoU training threshold
500 | 'anchor_t': (1, 2.0, 8.0), # anchor-multiple threshold
501 | 'anchors': (2, 2.0, 10.0), # anchors per output grid (0 to ignore)
502 | 'fl_gamma': (0, 0.0, 2.0), # focal loss gamma (efficientDet default gamma=1.5)
503 | 'hsv_h': (1, 0.0, 0.1), # image HSV-Hue augmentation (fraction)
504 | 'hsv_s': (1, 0.0, 0.9), # image HSV-Saturation augmentation (fraction)
505 | 'hsv_v': (1, 0.0, 0.9), # image HSV-Value augmentation (fraction)
506 | 'degrees': (1, 0.0, 45.0), # image rotation (+/- deg)
507 | 'translate': (1, 0.0, 0.9), # image translation (+/- fraction)
508 | 'scale': (1, 0.0, 0.9), # image scale (+/- gain)
509 | 'shear': (1, 0.0, 10.0), # image shear (+/- deg)
510 | 'perspective': (0, 0.0, 0.001), # image perspective (+/- fraction), range 0-0.001
511 | 'flipud': (1, 0.0, 1.0), # image flip up-down (probability)
512 | 'fliplr': (0, 0.0, 1.0), # image flip left-right (probability)
513 |                     'mosaic': (1, 0.0, 1.0),  # image mosaic (probability)
514 | 'mixup': (1, 0.0, 1.0)} # image mixup (probability)
515 |
516 | assert opt.local_rank == -1, 'DDP mode not implemented for --evolve'
517 | opt.notest, opt.nosave = True, True # only test/save final epoch
518 | # ei = [isinstance(x, (int, float)) for x in hyp.values()] # evolvable indices
519 | yaml_file = Path(opt.logdir) / 'evolve' / 'hyp_evolved.yaml' # save best result here
520 | if opt.bucket:
521 | os.system('gsutil cp gs://%s/evolve.txt .' % opt.bucket) # download evolve.txt if exists
522 |
523 | for _ in range(300): # generations to evolve
524 | if os.path.exists('evolve.txt'): # if evolve.txt exists: select best hyps and mutate
525 | # Select parent(s)
526 | parent = 'single' # parent selection method: 'single' or 'weighted'
527 | x = np.loadtxt('evolve.txt', ndmin=2)
528 | n = min(5, len(x)) # number of previous results to consider
529 | x = x[np.argsort(-fitness(x))][:n] # top n mutations
530 | w = fitness(x) - fitness(x).min() # weights
531 | if parent == 'single' or len(x) == 1:
532 | # x = x[random.randint(0, n - 1)] # random selection
533 | x = x[random.choices(range(n), weights=w)[0]] # weighted selection
534 | elif parent == 'weighted':
535 | x = (x * w.reshape(n, 1)).sum(0) / w.sum() # weighted combination
536 |
537 | # Mutate
538 | mp, s = 0.8, 0.2 # mutation probability, sigma
539 | npr = np.random
540 | npr.seed(int(time.time()))
541 | g = np.array([x[0] for x in meta.values()]) # gains 0-1
542 | ng = len(meta)
543 | v = np.ones(ng)
544 | while all(v == 1): # mutate until a change occurs (prevent duplicates)
545 | v = (g * (npr.random(ng) < mp) * npr.randn(ng) * npr.random() * s + 1).clip(0.3, 3.0)
546 | for i, k in enumerate(hyp.keys()): # plt.hist(v.ravel(), 300)
547 | hyp[k] = float(x[i + 7] * v[i]) # mutate
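# each evolve.txt row stores the 7 result metrics followed by the hyperparameter
# values (see print_mutation), so x[i + 7] reads the i-th hyperparameter of the
# selected parent.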
548 |
549 | # Constrain to limits
550 | for k, v in meta.items():
551 | hyp[k] = max(hyp[k], v[1]) # lower limit
552 | hyp[k] = min(hyp[k], v[2]) # upper limit
553 | hyp[k] = round(hyp[k], 5) # significant digits
554 |
555 | # Train mutation
556 | results = train(hyp.copy(), opt, device)
557 |
558 | # Write mutation results
559 | print_mutation(hyp.copy(), results, yaml_file, opt.bucket)
560 |
561 | # Plot results
562 | plot_evolution(yaml_file)
563 | print(f'Hyperparameter evolution complete. Best results saved as: {yaml_file}\n'
564 | f'Command to train a new model with these hyperparameters: $ python train.py --hyp {yaml_file}')
565 |
--------------------------------------------------------------------------------
/v3.0/yolov5/utils/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/yyyanbj/mid-air-draw/9ce05fe981e9037d8c0151be66c0254f8f2523d5/v3.0/yolov5/utils/__init__.py
--------------------------------------------------------------------------------
/v3.0/yolov5/utils/activations.py:
--------------------------------------------------------------------------------
1 | import torch
2 | import torch.nn as nn
3 | import torch.nn.functional as F
4 |
5 |
6 | # Swish https://arxiv.org/pdf/1905.02244.pdf ---------------------------------------------------------------------------
7 | class Swish(nn.Module):
8 | @staticmethod
9 | def forward(x):
10 | return x * torch.sigmoid(x)
11 |
12 |
13 | class Hardswish(nn.Module): # export-friendly version of nn.Hardswish()
14 | @staticmethod
15 | def forward(x):
16 | # return x * F.hardsigmoid(x) # for torchscript and CoreML
17 | return x * F.hardtanh(x + 3, 0., 6.) / 6. # for torchscript, CoreML and ONNX
18 |
19 |
20 | class MemoryEfficientSwish(nn.Module):
21 | class F(torch.autograd.Function):
22 | @staticmethod
23 | def forward(ctx, x):
24 | ctx.save_for_backward(x)
25 | return x * torch.sigmoid(x)
26 |
27 | @staticmethod
28 | def backward(ctx, grad_output):
29 | x = ctx.saved_tensors[0]
30 | sx = torch.sigmoid(x)
31 | return grad_output * (sx * (1 + x * (1 - sx)))
32 |
33 | def forward(self, x):
34 | return self.F.apply(x)
35 |
36 |
37 | # Mish https://github.com/digantamisra98/Mish --------------------------------------------------------------------------
38 | class Mish(nn.Module):
39 | @staticmethod
40 | def forward(x):
41 |         return x * F.softplus(x).tanh()  # x * tanh(ln(1 + exp(x)))
42 |
43 |
44 | class MemoryEfficientMish(nn.Module):
45 | class F(torch.autograd.Function):
46 | @staticmethod
47 | def forward(ctx, x):
48 | ctx.save_for_backward(x)
49 | return x.mul(torch.tanh(F.softplus(x))) # x * tanh(ln(1 + exp(x)))
50 |
51 | @staticmethod
52 | def backward(ctx, grad_output):
53 | x = ctx.saved_tensors[0]
54 | sx = torch.sigmoid(x)
55 | fx = F.softplus(x).tanh()
56 | return grad_output * (fx + x * sx * (1 - fx * fx))
57 |
58 | def forward(self, x):
59 | return self.F.apply(x)
60 |
61 |
62 | # FReLU https://arxiv.org/abs/2007.11824 -------------------------------------------------------------------------------
63 | class FReLU(nn.Module):
64 | def __init__(self, c1, k=3): # ch_in, kernel
65 | super().__init__()
66 | self.conv = nn.Conv2d(c1, c1, k, 1, 1, groups=c1)
67 | self.bn = nn.BatchNorm2d(c1)
68 |
69 | def forward(self, x):
70 | return torch.max(x, self.bn(self.conv(x)))
71 |
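# Usage sketch (shapes are illustrative):
#   import torch
#   y = Hardswish()(torch.randn(1, 8, 32, 32))   # stateless activations accept any tensor
#   y = FReLU(c1=8)(torch.randn(1, 8, 32, 32))   # FReLU is channel-aware, so pass ch_in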
--------------------------------------------------------------------------------
/v3.0/yolov5/utils/evolve.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 | # Hyperparameter evolution commands (avoids CUDA memory leakage issues)
3 | # Replaces train.py python generations 'for' loop with a bash 'for' loop
4 |
5 | # Start on 4-GPU machine
6 | #for i in 0 1 2 3; do
7 | # t=ultralytics/yolov5:evolve && sudo docker pull $t && sudo docker run -d --ipc=host --gpus all -v "$(pwd)"/VOC:/usr/src/VOC $t bash utils/evolve.sh $i
8 | # sleep 60 # avoid simultaneous evolve.txt read/write
9 | #done
10 |
11 | # Hyperparameter evolution commands
12 | while true; do
13 | # python train.py --batch 64 --weights yolov5m.pt --data voc.yaml --img 512 --epochs 50 --evolve --bucket ult/evolve/voc --device $1
14 | python train.py --batch 40 --weights yolov5m.pt --data coco.yaml --img 640 --epochs 30 --evolve --bucket ult/evolve/coco --device $1
15 | done
16 |
--------------------------------------------------------------------------------
/v3.0/yolov5/utils/google_app_engine/Dockerfile:
--------------------------------------------------------------------------------
1 | FROM gcr.io/google-appengine/python
2 |
3 | # Create a virtualenv for dependencies. This isolates these packages from
4 | # system-level packages.
5 | # Use -p python3 or -p python3.7 to select python version. Default is version 2.
6 | RUN virtualenv /env -p python3
7 |
8 | # Setting these environment variables is the same as running
9 | # source /env/bin/activate.
10 | ENV VIRTUAL_ENV /env
11 | ENV PATH /env/bin:$PATH
12 |
13 | RUN apt-get update && apt-get install -y python-opencv
14 |
15 | # Copy the application's requirements.txt and run pip to install all
16 | # dependencies into the virtualenv.
17 | ADD requirements.txt /app/requirements.txt
18 | RUN pip install -r /app/requirements.txt
19 |
20 | # Add the application source code.
21 | ADD . /app
22 |
23 | # Run a WSGI server to serve the application. gunicorn must be declared as
24 | # a dependency in requirements.txt.
25 | CMD gunicorn -b :$PORT main:app
26 |
--------------------------------------------------------------------------------
/v3.0/yolov5/utils/google_app_engine/additional_requirements.txt:
--------------------------------------------------------------------------------
1 | # add these requirements in your app on top of the existing ones
2 | pip==18.1
3 | Flask==1.0.2
4 | gunicorn==19.9.0
5 |
--------------------------------------------------------------------------------
/v3.0/yolov5/utils/google_app_engine/app.yaml:
--------------------------------------------------------------------------------
1 | runtime: custom
2 | env: flex
3 |
4 | service: yolov5app
5 |
6 | liveness_check:
7 | initial_delay_sec: 600
8 |
9 | manual_scaling:
10 | instances: 1
11 | resources:
12 | cpu: 1
13 | memory_gb: 4
14 | disk_size_gb: 20
--------------------------------------------------------------------------------
/v3.0/yolov5/utils/google_utils.py:
--------------------------------------------------------------------------------
1 | # This file contains google utils: https://cloud.google.com/storage/docs/reference/libraries
2 | # pip install --upgrade google-cloud-storage
3 | # from google.cloud import storage
4 |
5 | import os
6 | import platform
7 | import subprocess
8 | import time
9 | from pathlib import Path
10 |
11 | import torch
12 |
13 |
14 | def gsutil_getsize(url=''):
15 | # gs://bucket/file size https://cloud.google.com/storage/docs/gsutil/commands/du
16 | s = subprocess.check_output('gsutil du %s' % url, shell=True).decode('utf-8')
17 | return eval(s.split(' ')[0]) if len(s) else 0 # bytes
18 |
19 |
20 | def attempt_download(weights):
21 | # Attempt to download pretrained weights if not found locally
22 | weights = weights.strip().replace("'", '')
23 | file = Path(weights).name
24 |
25 | msg = weights + ' missing, try downloading from https://github.com/ultralytics/yolov5/releases/'
26 | models = ['yolov5s.pt', 'yolov5m.pt', 'yolov5l.pt', 'yolov5x.pt'] # available models
27 |
28 | if file in models and not os.path.isfile(weights):
29 | # Google Drive
30 | # d = {'yolov5s.pt': '1R5T6rIyy3lLwgFXNms8whc-387H0tMQO',
31 | # 'yolov5m.pt': '1vobuEExpWQVpXExsJ2w-Mbf3HJjWkQJr',
32 | # 'yolov5l.pt': '1hrlqD1Wdei7UT4OgT785BEk1JwnSvNEV',
33 | # 'yolov5x.pt': '1mM8aZJlWTxOg7BZJvNUMrTnA2AbeCVzS'}
34 | # r = gdrive_download(id=d[file], name=weights) if file in d else 1
35 | # if r == 0 and os.path.exists(weights) and os.path.getsize(weights) > 1E6: # check
36 | # return
37 |
38 | try: # GitHub
39 | url = 'https://github.com/ultralytics/yolov5/releases/download/v3.0/' + file
40 | print('Downloading %s to %s...' % (url, weights))
41 | torch.hub.download_url_to_file(url, weights)
42 | assert os.path.exists(weights) and os.path.getsize(weights) > 1E6 # check
43 | except Exception as e: # GCP
44 | print('Download error: %s' % e)
45 | url = 'https://storage.googleapis.com/ultralytics/yolov5/ckpt/' + file
46 | print('Downloading %s to %s...' % (url, weights))
47 | r = os.system('curl -L %s -o %s' % (url, weights)) # torch.hub.download_url_to_file(url, weights)
48 | finally:
49 | if not (os.path.exists(weights) and os.path.getsize(weights) > 1E6): # check
50 | os.remove(weights) if os.path.exists(weights) else None # remove partial downloads
51 | print('ERROR: Download failure: %s' % msg)
52 | print('')
53 | return
54 |
55 |
56 | def gdrive_download(id='1n_oKgR81BJtqk75b00eAjdv03qVCQn2f', name='coco128.zip'):
57 | # Downloads a file from Google Drive. from utils.google_utils import *; gdrive_download()
58 | t = time.time()
59 |
60 | print('Downloading https://drive.google.com/uc?export=download&id=%s as %s... ' % (id, name), end='')
61 | os.remove(name) if os.path.exists(name) else None # remove existing
62 | os.remove('cookie') if os.path.exists('cookie') else None
63 |
64 | # Attempt file download
65 | out = "NUL" if platform.system() == "Windows" else "/dev/null"
66 | os.system('curl -c ./cookie -s -L "drive.google.com/uc?export=download&id=%s" > %s ' % (id, out))
67 | if os.path.exists('cookie'): # large file
68 | s = 'curl -Lb ./cookie "drive.google.com/uc?export=download&confirm=%s&id=%s" -o %s' % (get_token(), id, name)
69 | else: # small file
70 | s = 'curl -s -L -o %s "drive.google.com/uc?export=download&id=%s"' % (name, id)
71 | r = os.system(s) # execute, capture return
72 | os.remove('cookie') if os.path.exists('cookie') else None
73 |
74 | # Error check
75 | if r != 0:
76 | os.remove(name) if os.path.exists(name) else None # remove partial
77 | print('Download error ') # raise Exception('Download error')
78 | return r
79 |
80 | # Unzip if archive
81 | if name.endswith('.zip'):
82 | print('unzipping... ', end='')
83 | os.system('unzip -q %s' % name) # unzip
84 | os.remove(name) # remove zip to free space
85 |
86 | print('Done (%.1fs)' % (time.time() - t))
87 | return r
88 |
89 |
90 | def get_token(cookie="./cookie"):
91 | with open(cookie) as f:
92 | for line in f:
93 | if "download" in line:
94 | return line.split()[-1]
95 | return ""
96 |
97 | # def upload_blob(bucket_name, source_file_name, destination_blob_name):
98 | # # Uploads a file to a bucket
99 | # # https://cloud.google.com/storage/docs/uploading-objects#storage-upload-object-python
100 | #
101 | # storage_client = storage.Client()
102 | # bucket = storage_client.get_bucket(bucket_name)
103 | # blob = bucket.blob(destination_blob_name)
104 | #
105 | # blob.upload_from_filename(source_file_name)
106 | #
107 | # print('File {} uploaded to {}.'.format(
108 | # source_file_name,
109 | # destination_blob_name))
110 | #
111 | #
112 | # def download_blob(bucket_name, source_blob_name, destination_file_name):
113 | # # Downloads a blob from a bucket
114 | # storage_client = storage.Client()
115 | # bucket = storage_client.get_bucket(bucket_name)
116 | # blob = bucket.blob(source_blob_name)
117 | #
118 | # blob.download_to_filename(destination_file_name)
119 | #
120 | # print('Blob {} downloaded to {}.'.format(
121 | # source_blob_name,
122 | # destination_file_name))
123 |
--------------------------------------------------------------------------------
/v3.0/yolov5/utils/torch_utils.py:
--------------------------------------------------------------------------------
1 | import logging
2 | import os
3 | import time
4 | from copy import deepcopy
5 |
6 | import math
7 | import torch
8 | import torch.backends.cudnn as cudnn
9 | import torch.nn as nn
10 | import torch.nn.functional as F
11 | import torchvision
12 |
13 | logger = logging.getLogger(__name__)
14 |
15 |
16 | def init_torch_seeds(seed=0):
17 | torch.manual_seed(seed)
18 |
19 | # Speed-reproducibility tradeoff https://pytorch.org/docs/stable/notes/randomness.html
20 | if seed == 0: # slower, more reproducible
21 | cudnn.deterministic = True
22 | cudnn.benchmark = False
23 | else: # faster, less reproducible
24 | cudnn.deterministic = False
25 | cudnn.benchmark = True
26 |
27 |
28 | def select_device(device='', batch_size=None):
29 | # device = 'cpu' or '0' or '0,1,2,3'
30 | cpu_request = device.lower() == 'cpu'
31 | if device and not cpu_request: # if device requested other than 'cpu'
32 | os.environ['CUDA_VISIBLE_DEVICES'] = device # set environment variable
33 |         assert torch.cuda.is_available(), 'CUDA unavailable, invalid device %s requested' % device  # check availability
34 |
35 | cuda = False if cpu_request else torch.cuda.is_available()
36 | if cuda:
37 | c = 1024 ** 2 # bytes to MB
38 | ng = torch.cuda.device_count()
39 | if ng > 1 and batch_size: # check that batch_size is compatible with device_count
40 | assert batch_size % ng == 0, 'batch-size %g not multiple of GPU count %g' % (batch_size, ng)
41 | x = [torch.cuda.get_device_properties(i) for i in range(ng)]
42 | s = 'Using CUDA '
43 | for i in range(0, ng):
44 | if i == 1:
45 | s = ' ' * len(s)
46 | logger.info("%sdevice%g _CudaDeviceProperties(name='%s', total_memory=%dMB)" %
47 | (s, i, x[i].name, x[i].total_memory / c))
48 | else:
49 | logger.info('Using CPU')
50 |
51 | logger.info('') # skip a line
52 | return torch.device('cuda:0' if cuda else 'cpu')
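# e.g. select_device('') returns cuda:0 when a GPU is available (else cpu),
# select_device('cpu') forces CPU, and select_device('0,1', batch_size=16)
# additionally checks that the batch size is a multiple of the GPU count.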
53 |
54 |
55 | def time_synchronized():
56 | torch.cuda.synchronize() if torch.cuda.is_available() else None
57 | return time.time()
58 |
59 |
60 | def is_parallel(model):
61 | return type(model) in (nn.parallel.DataParallel, nn.parallel.DistributedDataParallel)
62 |
63 |
64 | def intersect_dicts(da, db, exclude=()):
65 | # Dictionary intersection of matching keys and shapes, omitting 'exclude' keys, using da values
66 | return {k: v for k, v in da.items() if k in db and not any(x in k for x in exclude) and v.shape == db[k].shape}
67 |
68 |
69 | def initialize_weights(model):
70 | for m in model.modules():
71 | t = type(m)
72 | if t is nn.Conv2d:
73 | pass # nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
74 | elif t is nn.BatchNorm2d:
75 | m.eps = 1e-3
76 | m.momentum = 0.03
77 | elif t in [nn.Hardswish, nn.LeakyReLU, nn.ReLU, nn.ReLU6]:
78 | m.inplace = True
79 |
80 |
81 | def find_modules(model, mclass=nn.Conv2d):
82 | # Finds layer indices matching module class 'mclass'
83 | return [i for i, m in enumerate(model.module_list) if isinstance(m, mclass)]
84 |
85 |
86 | def sparsity(model):
87 | # Return global model sparsity
88 | a, b = 0., 0.
89 | for p in model.parameters():
90 | a += p.numel()
91 | b += (p == 0).sum()
92 | return b / a
93 |
94 |
95 | def prune(model, amount=0.3):
96 | # Prune model to requested global sparsity
97 | import torch.nn.utils.prune as prune
98 | print('Pruning model... ', end='')
99 | for name, m in model.named_modules():
100 | if isinstance(m, nn.Conv2d):
101 | prune.l1_unstructured(m, name='weight', amount=amount) # prune
102 | prune.remove(m, 'weight') # make permanent
103 | print(' %.3g global sparsity' % sparsity(model))
104 |
105 |
106 | def fuse_conv_and_bn(conv, bn):
107 | # Fuse convolution and batchnorm layers https://tehnokv.com/posts/fusing-batchnorm-and-conv/
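# i.e. W_fused = diag(gamma / sqrt(running_var + eps)) @ W_conv and
# b_fused = (b_conv - running_mean) * gamma / sqrt(running_var + eps) + beta,
# which is what the weight and bias copies below compute.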
108 |
109 | # init
110 | fusedconv = nn.Conv2d(conv.in_channels,
111 | conv.out_channels,
112 | kernel_size=conv.kernel_size,
113 | stride=conv.stride,
114 | padding=conv.padding,
115 | groups=conv.groups,
116 | bias=True).requires_grad_(False).to(conv.weight.device)
117 |
118 | # prepare filters
119 | w_conv = conv.weight.clone().view(conv.out_channels, -1)
120 | w_bn = torch.diag(bn.weight.div(torch.sqrt(bn.eps + bn.running_var)))
121 | fusedconv.weight.copy_(torch.mm(w_bn, w_conv).view(fusedconv.weight.size()))
122 |
123 | # prepare spatial bias
124 | b_conv = torch.zeros(conv.weight.size(0), device=conv.weight.device) if conv.bias is None else conv.bias
125 | b_bn = bn.bias - bn.weight.mul(bn.running_mean).div(torch.sqrt(bn.running_var + bn.eps))
126 | fusedconv.bias.copy_(torch.mm(w_bn, b_conv.reshape(-1, 1)).reshape(-1) + b_bn)
127 |
128 | return fusedconv
129 |
130 |
131 | def model_info(model, verbose=False):
132 |     # Prints a line-by-line description of a PyTorch model
133 | n_p = sum(x.numel() for x in model.parameters()) # number parameters
134 | n_g = sum(x.numel() for x in model.parameters() if x.requires_grad) # number gradients
135 | if verbose:
136 | print('%5s %40s %9s %12s %20s %10s %10s' % ('layer', 'name', 'gradient', 'parameters', 'shape', 'mu', 'sigma'))
137 | for i, (name, p) in enumerate(model.named_parameters()):
138 | name = name.replace('module_list.', '')
139 | print('%5g %40s %9s %12g %20s %10.3g %10.3g' %
140 | (i, name, p.requires_grad, p.numel(), list(p.shape), p.mean(), p.std()))
141 |
142 | try: # FLOPS
143 | from thop import profile
144 | flops = profile(deepcopy(model), inputs=(torch.zeros(1, 3, 64, 64),), verbose=False)[0] / 1E9 * 2
145 | fs = ', %.1f GFLOPS' % (flops * 100) # 640x640 FLOPS
146 | except:
147 | fs = ''
148 |
149 | logger.info(
150 | 'Model Summary: %g layers, %g parameters, %g gradients%s' % (len(list(model.parameters())), n_p, n_g, fs))
151 |
152 |
153 | def load_classifier(name='resnet101', n=2):
154 | # Loads a pretrained model reshaped to n-class output
155 | model = torchvision.models.__dict__[name](pretrained=True)
156 |
157 | # ResNet model properties
158 | # input_size = [3, 224, 224]
159 | # input_space = 'RGB'
160 | # input_range = [0, 1]
161 | # mean = [0.485, 0.456, 0.406]
162 | # std = [0.229, 0.224, 0.225]
163 |
164 | # Reshape output to n classes
165 | filters = model.fc.weight.shape[1]
166 | model.fc.bias = nn.Parameter(torch.zeros(n), requires_grad=True)
167 | model.fc.weight = nn.Parameter(torch.zeros(n, filters), requires_grad=True)
168 | model.fc.out_features = n
169 | return model
170 |
171 |
172 | def scale_img(img, ratio=1.0, same_shape=False): # img(16,3,256,416), r=ratio
173 | # scales img(bs,3,y,x) by ratio
174 | if ratio == 1.0:
175 | return img
176 | else:
177 | h, w = img.shape[2:]
178 | s = (int(h * ratio), int(w * ratio)) # new size
179 | img = F.interpolate(img, size=s, mode='bilinear', align_corners=False) # resize
180 | if not same_shape: # pad/crop img
181 | gs = 32 # (pixels) grid size
182 | h, w = [math.ceil(x * ratio / gs) * gs for x in (h, w)]
183 | return F.pad(img, [0, w - s[1], 0, h - s[0]], value=0.447) # value = imagenet mean
184 |
185 |
186 | def copy_attr(a, b, include=(), exclude=()):
187 | # Copy attributes from b to a, options to only include [...] and to exclude [...]
188 | for k, v in b.__dict__.items():
189 | if (len(include) and k not in include) or k.startswith('_') or k in exclude:
190 | continue
191 | else:
192 | setattr(a, k, v)
193 |
194 |
195 | class ModelEMA:
196 | """ Model Exponential Moving Average from https://github.com/rwightman/pytorch-image-models
197 | Keep a moving average of everything in the model state_dict (parameters and buffers).
198 | This is intended to allow functionality like
199 | https://www.tensorflow.org/api_docs/python/tf/train/ExponentialMovingAverage
200 | A smoothed version of the weights is necessary for some training schemes to perform well.
201 |     This class is sensitive to where it is initialized in the sequence of model init,
202 | GPU assignment and distributed training wrappers.
203 | """
204 |
205 | def __init__(self, model, decay=0.9999, updates=0):
206 | # Create EMA
207 | self.ema = deepcopy(model.module if is_parallel(model) else model).eval() # FP32 EMA
208 | # if next(model.parameters()).device.type != 'cpu':
209 | # self.ema.half() # FP16 EMA
210 | self.updates = updates # number of EMA updates
211 | self.decay = lambda x: decay * (1 - math.exp(-x / 2000)) # decay exponential ramp (to help early epochs)
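# e.g. with decay=0.9999 the effective decay is ~0.632 after 2000 updates
# (0.9999 * (1 - e^-1)) and approaches 0.9999 as updates grow, so the EMA tracks
# the raw weights closely early in training and smooths more later.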
212 | for p in self.ema.parameters():
213 | p.requires_grad_(False)
214 |
215 | def update(self, model):
216 | # Update EMA parameters
217 | with torch.no_grad():
218 | self.updates += 1
219 | d = self.decay(self.updates)
220 |
221 | msd = model.module.state_dict() if is_parallel(model) else model.state_dict() # model state_dict
222 | for k, v in self.ema.state_dict().items():
223 | if v.dtype.is_floating_point:
224 | v *= d
225 | v += (1. - d) * msd[k].detach()
226 |
227 | def update_attr(self, model, include=(), exclude=('process_group', 'reducer')):
228 | # Update EMA attributes
229 | copy_attr(self.ema, model, include, exclude)
230 |
--------------------------------------------------------------------------------
/v3.0/yolov5/weights/download_weights.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 | # Download common models
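# Run from the yolov5/ directory so 'utils' is importable and the weights/ paths
# resolve, e.g.: bash weights/download_weights.sh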
3 |
4 | python -c "
5 | from utils.google_utils import *;
6 | attempt_download('weights/yolov5s.pt');
7 | attempt_download('weights/yolov5m.pt');
8 | attempt_download('weights/yolov5l.pt');
9 | attempt_download('weights/yolov5x.pt')
10 | "
11 |
--------------------------------------------------------------------------------
/v3.0/yolov5/yolo_data/labels/train.cache:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/yyyanbj/mid-air-draw/9ce05fe981e9037d8c0151be66c0254f8f2523d5/v3.0/yolov5/yolo_data/labels/train.cache
--------------------------------------------------------------------------------
/v3.0/yolov5/yolo_data/labels/validation.cache:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/yyyanbj/mid-air-draw/9ce05fe981e9037d8c0151be66c0254f8f2523d5/v3.0/yolov5/yolo_data/labels/validation.cache
--------------------------------------------------------------------------------