├── README.md
├── app.py
├── cam
│   ├── 1.png
│   ├── 2.png
│   ├── __pycache__
│   │   ├── base_camera.cpython-37.pyc
│   │   └── base_camera.cpython-38.pyc
│   ├── base_camera.py
│   ├── camera.py
│   ├── coco.names
│   ├── result.png
│   ├── test.jpg
│   ├── test_re.jpg
│   └── train.jpg
├── center
│   ├── get_train_val.py
│   └── xml_yolo.py
├── config
│   ├── score.yaml
│   ├── yolov3-spp.yaml
│   ├── yolov5l.yaml
│   ├── yolov5m.yaml
│   ├── yolov5s.yaml
│   └── yolov5x.yaml
├── detect.py
├── inference
│   ├── inputs
│   │   └── 2007_000033.jpg
│   └── outputs
│       └── 2007_000033.jpg
├── models
│   ├── __pycache__
│   │   ├── common.cpython-37.pyc
│   │   ├── de.cpython-37.pyc
│   │   ├── experimental.cpython-37.pyc
│   │   └── yolo.cpython-37.pyc
│   ├── common.py
│   ├── de.py
│   ├── experimental.py
│   ├── onnx_export.py
│   └── yolo.py
├── requirements.txt
├── static
│   ├── client.js
│   ├── style.css
│   ├── style1.css
│   └── worker.js
├── templates
│   └── index1.html
├── test.py
├── train.py
└── utils
    ├── __init__.py
    ├── __pycache__
    │   ├── __init__.cpython-37.pyc
    │   ├── datasets.cpython-37.pyc
    │   ├── google_utils.cpython-37.pyc
    │   ├── torch_utils.cpython-37.pyc
    │   └── utils.cpython-37.pyc
    ├── activations.py
    ├── datasets.py
    ├── google_utils.py
    ├── torch_utils.py
    └── utils.py
/README.md:
--------------------------------------------------------------------------------
1 | # Training YOLOv5 on your own dataset (detailed walkthrough) and deploying it with Flask
2 |
3 | #### Dependencies
4 | - torch
5 | - torchvision
6 | - numpy
7 | - opencv-python
8 | - lxml
9 | - tqdm
10 | - flask
11 | - pillow
12 | - tensorboard
13 | - matplotlib
14 | - pycocotools
15 |
16 | #### On Windows, use pycocotools-windows instead of pycocotools
17 |
18 | #### Install all dependencies
19 | ```
20 | pip install -r requirements.txt
21 | ```
22 | ### 1. Prepare the dataset
23 |
24 | The PASCAL VOC dataset is used as the example here, [extraction code: 07wp](https://pan.baidu.com/s/1u8k9wlLUklyLxQnaSrG4xQ).
25 | Place the downloaded dataset under the datasets directory.
26 | The dataset is structured as follows:
27 | ```
28 | ---VOC2012
29 | --------Annotations
30 | ---------------xml0
31 | ---------------xml1
32 | --------JPEGImages
33 | ---------------img0
34 | ---------------img1
35 | --------pascal_voc_classes.txt
36 | ```
37 | Annotations holds all the xml files, JPEGImages holds all the images, and pascal_voc_classes.txt is the class file.
38 |
39 | #### Create the label files
40 | YOLO label files have the following format:
41 | ```
42 | 102 0.682813 0.415278 0.237500 0.502778
43 | 102 0.914844 0.396528 0.168750 0.451389
44 |
45 | The first value is the label, i.e. the class index of the object
46 | The next four values give the object's position, (xcenter, ycenter, w, h): the relative coordinates of the box center and its relative width and height
47 | The example above contains two objects
48 | ```
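To make the format concrete, here is a minimal sketch (the image size and box values are made up for illustration) of how one absolute PASCAL VOC box becomes a YOLO label line; it mirrors the arithmetic in center/xml_yolo.py:
```python
# hypothetical example: a 500x375 image containing one box of class 11
img_w, img_h = 500, 375
xmin, ymin, xmax, ymax = 100, 120, 300, 240
class_index = 11

# absolute corner coordinates -> relative center / width / height (as in center/xml_yolo.py)
xcenter = round((xmin + (xmax - xmin) / 2) / img_w, 6)
ycenter = round((ymin + (ymax - ymin) / 2) / img_h, 6)
w = round((xmax - xmin) / img_w, 6)
h = round((ymax - ymin) / img_h, 6)

print(class_index, xcenter, ycenter, w, h)  # -> 11 0.4 0.48 0.4 0.32
```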
49 | If you already have label files in this format, skip straight to the next step.
50 | If not, use [labelimg, extraction code dbi2](https://pan.baidu.com/s/1oEFodW83koHLcGasRoBZhA) to annotate your images: it produces xml label files, which are then converted to YOLO-format label files. labelimg is simple to use, so it is not covered here.
51 |
52 | Convert the xml label files to YOLO format:
53 |
54 | ```
55 | python center/xml_yolo.py
56 | ```
57 |
58 | pascal_voc_classes.txt is the file listing your classes. For the VOC dataset the class list looks like this:
59 | ```python
60 | ["aeroplane","bicycle", "bird","boat","bottle","bus","car","cat","chair","cow","diningtable","dog","horse","motorbike","person","pottedplant","sheep","sofa","train", "tvmonitor"]
61 | ```
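Note that center/xml_yolo.py reads this file as plain text and splits its first line on commas, so keep all class names on a single line:
```python
# how center/xml_yolo.py consumes the class file
with open('../datasets/VOC2012/pascal_voc_classes.txt', 'r') as f:
    label_file = f.readlines()
class_list = label_file[0].split(',')
```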
62 | #### Directory structure after running the code above
63 | ```
64 | ---VOC2012
65 | --------Annotations
66 | --------JPEGImages
67 | --------pascal_voc_classes.json
68 | ---yolodata
69 | --------images
70 | --------labels
71 | ```
72 |
73 | ### 2. Split into training and validation sets
74 | The split is simple: shuffle the data, then divide it 9:1 into a training set and a validation set. Run:
75 |
76 | ```
77 | python center/get_train_val.py
78 | ```
79 | ##### Running the code above produces the following structure
80 | ```
81 | ---VOC2012
82 | --------Annotations
83 | --------JPEGImages
84 | --------pascal_voc_classes.json
85 | ---yolodata
86 | --------images
87 | --------labels
88 | ---traindata
89 | --------images
90 | ----------------train
91 | ----------------val
92 | --------labels
93 | ----------------train
94 | ----------------val
95 | ```
96 | ##### traindata holds the final files used for training
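For reference, the core of the split is just a seeded shuffle followed by a 9:1 cut. Below is a simplified sketch of what center/get_train_val.py does (the copy-to-traindata step is omitted):
```python
import os
import numpy as np

label_root = '../datasets/yolo_data/labels'   # where step 1 wrote the label files
names = sorted(os.listdir(label_root))        # one .txt label file per image

val_split = 0.1                               # 9 : 1 train/val split
np.random.seed(10101)                         # same seed as center/get_train_val.py, so the split is reproducible
np.random.shuffle(names)

num_val = int(len(names) * val_split)
num_train = len(names) - num_val
trains, vals = names[:num_train], names[num_train:]
print(len(trains), 'train files,', len(vals), 'val files')
```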
97 |
98 | ### 3. Train the model
99 |
100 | Training yolov5 is straightforward. The code has been simplified for this project and is organized as follows:
101 |
102 | ```
103 | dataset             # datasets
104 | ------traindata     # training data
105 | inference           # inference inputs/outputs
106 | ------inputs        # input data
107 | ------outputs       # output data
108 | config              # configuration files
109 | ------score.yaml    # training configuration
110 | ------yolov5l.yaml  # model configuration
111 | models              # model code
112 | runs                # training logs
113 | utils               # utility code
114 | weights             # saved models: last.pt, best.pt
115 | train.py            # training script
116 | detect.py           # detection / test script
117 | ```
118 |
119 | score.yaml is explained below:
120 | ```
121 | # train and val datasets (image directory)
122 | train: ./datasets/traindata/images/train/
123 | val: ./datasets/traindata/images/val/
124 | # number of classes
125 | nc: 2
126 | # class names
127 | names: ['apple', 'banana']
128 | ```
129 |
130 | - train: path to the training images
131 | - val: path to the validation images
132 | - nc: number of classes
133 | - names: the names of the classes
134 |
135 |
136 | ##### yolov5l.yaml is explained below:
137 |
138 | ```
139 | nc: 2 # number of classes
140 | depth_multiple: 1.0 # model depth multiple
141 | width_multiple: 1.0 # layer channel multiple
142 | anchors:
143 | - [10,13, 16,30, 33,23] # P3/8
144 | - [30,61, 62,45, 59,119] # P4/16
145 | - [116,90, 156,198, 373,326] # P5/32
146 | backbone:
147 | # [from, number, module, args]
148 | [[-1, 1, Focus, [64, 3]], # 1-P1/2
149 | [-1, 1, Conv, [128, 3, 2]], # 2-P2/4
150 | [-1, 3, Bottleneck, [128]],
151 | [-1, 1, Conv, [256, 3, 2]], # 4-P3/8
152 | [-1, 9, BottleneckCSP, [256]],
153 | [-1, 1, Conv, [512, 3, 2]], # 6-P4/16
154 | [-1, 9, BottleneckCSP, [512]],
155 | [-1, 1, Conv, [1024, 3, 2]], # 8-P5/32
156 | [-1, 1, SPP, [1024, [5, 9, 13]]],
157 | [-1, 6, BottleneckCSP, [1024]], # 10
158 | ]
159 | head:
160 | [[-1, 3, BottleneckCSP, [1024, False]], # 11
161 | [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1, 0]], # 12 (P5/32-large)
162 | [-2, 1, nn.Upsample, [None, 2, 'nearest']],
163 | [[-1, 6], 1, Concat, [1]], # cat backbone P4
164 | [-1, 1, Conv, [512, 1, 1]],
165 | [-1, 3, BottleneckCSP, [512, False]],
166 | [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1, 0]], # 17 (P4/16-medium)
167 | [-2, 1, nn.Upsample, [None, 2, 'nearest']],
168 | [[-1, 4], 1, Concat, [1]], # cat backbone P3
169 | [-1, 1, Conv, [256, 1, 1]],
170 | [-1, 3, BottleneckCSP, [256, False]],
171 | [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1, 0]], # 22 (P3/8-small)
172 | [[], 1, Detect, [nc, anchors]], # Detect(P3, P4, P5)
173 | ]
174 | ```
175 | - nc: number of target classes
176 | - depth_multiple and width_multiple: control the depth and width of the model; the different values correspond to the s, m, l and x variants.
177 | - anchors: base boxes produced by k-means clustering of the labelled boxes; the target boxes are predicted relative to these anchors.
178 | - yolov5 generates anchors automatically: it runs k-means with the Euclidean distance and then applies a genetic algorithm to mutate the result into the final anchors. In my experiments, k-means with the Euclidean distance performed slightly worse than k-means with a 1 - IoU distance (message me if you want that clustering code; a rough sketch of the idea is given after this list), although in practice the difference is small.
179 | - backbone: the network structure of the image feature extraction part.
180 | - head: the network structure of the final prediction part.
181 |
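As mentioned above, here is a rough sketch of anchor clustering with a 1 - IoU distance. This is not the code used by this repository, just an illustration of the idea; `boxes` is assumed to be an (n, 2) array of label widths and heights at the training resolution:
```python
import numpy as np

def wh_iou(boxes, anchors):
    # IoU between label (w, h) pairs and anchor (w, h) pairs, with boxes assumed to share a corner
    inter = np.minimum(boxes[:, None, :], anchors[None, :, :]).prod(2)
    return inter / (boxes.prod(1)[:, None] + anchors.prod(1)[None, :] - inter)

def kmeans_anchors(boxes, k=9, iters=300):
    # k-means where distance(box, anchor) = 1 - IoU(box, anchor)
    boxes = np.asarray(boxes, dtype=np.float64)
    anchors = boxes[np.random.choice(len(boxes), k, replace=False)].copy()
    for _ in range(iters):
        assign = np.argmax(wh_iou(boxes, anchors), axis=1)   # highest IoU = smallest 1 - IoU
        for i in range(k):
            if np.any(assign == i):
                anchors[i] = np.median(boxes[assign == i], axis=0)
    return anchors[np.argsort(anchors.prod(1))]              # sorted by area, small to large

# anchors = kmeans_anchors(boxes, k=9).round()
```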
182 |
183 | ##### Configuring train.py is very simple:
184 | 
185 |
186 | Only the following parameters need to be changed:
187 | ```
188 | epoch: number of training epochs
189 | batch_size: number of images per iteration
190 | cfg: path to the model configuration file
191 | data: path to the training configuration file
192 | weights: checkpoint to load for resuming training
193 | ```
194 | Run in a terminal (yolov5l by default)
195 | ```
196 | python train.py
197 | ```
198 | to start training.
199 |
200 | ##### Training progress
201 |
202 | 
203 |
204 | ##### Training results
205 |
206 | 
207 |
208 | ### 4. Test the model
209 |
210 | 
211 |
212 | ##### Three parameters need to be changed
213 | ```
214 | source: path to the images/videos to detect
215 | out: path where results are saved
216 | weights: path to the trained weight file
217 | ```
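These three values are set as plain variables at the bottom of detect.py, so edit them there before running the script:
```python
# last lines of detect.py
source = './inference/inputs'      # images/videos to run detection on
out = './inference/outputs'        # where the annotated results are written
weights = './weights/yolov5l.pt'   # trained weights, e.g. weights/best.pt from step 3
```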
218 | ##### You can also test with weights pre-trained on the COCO dataset; place them in the weights folder
219 |
220 | [extraction code: hhbb](https://pan.baidu.com/s/18AD8HpLhcRGSKOwGwPJMMg)
221 |
222 | Run in a terminal
223 | ```
224 | python detect.py
225 | ```
226 | to start detection.
227 |
228 | ##### Test results
229 |
230 | 
231 |
232 | 
233 |
234 | ### 5. Deploy with Flask
235 |
236 | Deploying with Flask is very simple. If anything is unclear, see my earlier blog posts:
237 |
238 | [Deploying a Python/Flask project on Alibaba Cloud ECS, simple and clear, without nginx or uwsgi](https://blog.csdn.net/qq_44523137/article/details/112676287?spm=1001.2014.3001.5501)
239 |
240 | [A yolov3-deepsort-flask based web platform for object detection and multi-object tracking](https://blog.csdn.net/qq_44523137/article/details/116323516?spm=1001.2014.3001.5501)
241 |
242 |
243 |
244 | Run in a terminal
245 | ```
246 | python app.py
247 | ```
248 | then open the page in your browser and upload an image to run detection.
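Besides the web page, the /images endpoint defined in app.py can be called directly. A minimal client sketch, assuming the requests package is installed and the server is running on the default 127.0.0.1:5000:
```python
import requests

# the form field must be named "images", matching request.files["images"] in app.py
with open('inference/inputs/2007_000033.jpg', 'rb') as f:
    r = requests.post('http://127.0.0.1:5000/images', files={'images': f})

with open('result.jpg', 'wb') as f:
    f.write(r.content)   # the response body is the annotated jpg
```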
249 |
250 |
251 |
252 |
253 |
--------------------------------------------------------------------------------
/app.py:
--------------------------------------------------------------------------------
1 | import cv2
2 | import time
3 | from flask import Flask, request, Response,render_template
4 | import json
5 | from cam.base_camera import BaseCamera
6 |
7 | from models.de import detect,get_model
8 | import os
9 | os.environ["KMP_DUPLICATE_LIB_OK"]="TRUE"
10 | app = Flask(__name__)
11 | class_names = [c.strip() for c in open(r'cam/coco.names').readlines()]
12 | file_name = ['jpg','jpeg','png']  # allowed image extensions
13 |
14 | yolov5_model = get_model()
15 |
16 | @app.route('/images', methods= ['POST'])
17 | def get_image():
18 | image = request.files["images"]
19 | image_name = image.filename
20 | image.save(os.path.join(os.getcwd(), image_name))
21 |     if image_name.split(".")[-1].lower() in file_name:
22 |         img = cv2.imread(image_name)
23 |         img = detect(yolov5_model, img)          # run yolov5 and draw the boxes
24 |         _, img_encoded = cv2.imencode('.jpg', img)
25 |         response = img_encoded.tobytes()
26 |         os.remove(image_name)
27 |         return Response(response=response, status=200, mimetype='image/jpg')
28 |     # unsupported file type: clean up and fall back to the upload page
29 |     os.remove(image_name)
30 |     return render_template('index1.html')
31 | @app.route('/')
32 | def upload_file():
33 | return render_template('index1.html')
34 | if __name__ == '__main__':
35 | # Run locally
36 | app.run(debug=True, host='127.0.0.1', port=5000)
37 | #Run on the server
38 | # app.run(debug=True, host = '0.0.0.0', port=5000)
39 |
--------------------------------------------------------------------------------
/cam/1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Transformer-man/yolov5-flask/36573a0b6e91d5a91f3394af278f5a5e768efae7/cam/1.png
--------------------------------------------------------------------------------
/cam/2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Transformer-man/yolov5-flask/36573a0b6e91d5a91f3394af278f5a5e768efae7/cam/2.png
--------------------------------------------------------------------------------
/cam/__pycache__/base_camera.cpython-37.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Transformer-man/yolov5-flask/36573a0b6e91d5a91f3394af278f5a5e768efae7/cam/__pycache__/base_camera.cpython-37.pyc
--------------------------------------------------------------------------------
/cam/__pycache__/base_camera.cpython-38.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Transformer-man/yolov5-flask/36573a0b6e91d5a91f3394af278f5a5e768efae7/cam/__pycache__/base_camera.cpython-38.pyc
--------------------------------------------------------------------------------
/cam/base_camera.py:
--------------------------------------------------------------------------------
1 | import time
2 | import threading
3 | try:
4 | from greenlet import getcurrent as get_ident
5 | except ImportError:
6 | try:
7 | from thread import get_ident
8 | except ImportError:
9 | from _thread import get_ident
10 |
11 |
12 | class CameraEvent(object):
13 | """An Event-like class that signals all active clients when a new frame is
14 | available.
15 | """
16 | def __init__(self):
17 | self.events = {}
18 |
19 | def wait(self):
20 | """Invoked from each client's thread to wait for the next frame."""
21 | ident = get_ident()
22 | if ident not in self.events:
23 | # this is a new client
24 | # add an entry for it in the self.events dict
25 | # each entry has two elements, a threading.Event() and a timestamp
26 | self.events[ident] = [threading.Event(), time.time()]
27 | return self.events[ident][0].wait()
28 |
29 | def set(self):
30 | """Invoked by the camera thread when a new frame is available."""
31 | now = time.time()
32 | remove = None
33 | for ident, event in self.events.items():
34 | if not event[0].isSet():
35 | # if this client's event is not set, then set it
36 | # also update the last set timestamp to now
37 | event[0].set()
38 | event[1] = now
39 | else:
40 | # if the client's event is already set, it means the client
41 | # did not process a previous frame
42 | # if the event stays set for more than 5 seconds, then assume
43 | # the client is gone and remove it
44 | if now - event[1] > 5:
45 | remove = ident
46 | if remove:
47 | del self.events[remove]
48 |
49 | def clear(self):
50 | """Invoked from each client's thread after a frame was processed."""
51 | self.events[get_ident()][0].clear()
52 |
53 |
54 | class BaseCamera(object):
55 | thread = None # background thread that reads frames from camera
56 | frame = None # current frame is stored here by background thread
57 | last_access = 0 # time of last client access to the camera
58 | event = CameraEvent()
59 |
60 | def __init__(self):
61 | """Start the background camera thread if it isn't running yet."""
62 | if BaseCamera.thread is None:
63 | BaseCamera.last_access = time.time()
64 |
65 | # start background frame thread
66 | BaseCamera.thread = threading.Thread(target=self._thread)
67 | BaseCamera.thread.start()
68 |
69 | # wait until frames are available
70 | while self.get_frame() is None:
71 | time.sleep(0)
72 |
73 | def get_frame(self):
74 | """Return the current camera frame."""
75 | BaseCamera.last_access = time.time()
76 |
77 | # wait for a signal from the camera thread
78 | BaseCamera.event.wait()
79 | BaseCamera.event.clear()
80 |
81 | return BaseCamera.frame
82 |
83 | @staticmethod
84 |     def frames():
85 |         """Generator that returns frames from the camera."""
86 | raise RuntimeError('Must be implemented by subclasses.')
87 |
88 | @classmethod
89 | def _thread(cls):
90 | """Camera background thread."""
91 | print('Starting camera thread.')
92 | frames_iterator = cls.frames()
93 | for frame in frames_iterator:
94 | BaseCamera.frame = frame
95 | BaseCamera.event.set() # send signal to clients
96 | time.sleep(0)
97 |
98 | # if there hasn't been any clients asking for frames in
99 |             # the last 60 seconds then stop the thread
100 | if time.time() - BaseCamera.last_access > 60:
101 | frames_iterator.close()
102 | print('Stopping camera thread due to inactivity.')
103 | break
104 | BaseCamera.thread = None
--------------------------------------------------------------------------------
/cam/camera.py:
--------------------------------------------------------------------------------
1 |
2 | from cam.base_camera import BaseCamera
3 | import cv2
4 | import tensorflow as tf
5 | from yolov3_tf2.models import YoloV3
6 | from yolov3_tf2.dataset import transform_images
7 | from yolov3_tf2.utils import draw_outputs
8 |
9 | # customize your API through the following parameters
10 | classes_path = 'coco.names'
11 | weights_path = './weights/yolov3.tf'
12 | tiny = False # set to True if using a Yolov3 Tiny model
13 | size = 416 # size images are resized to for model
14 | output_path = './detections/' # path to output folder where images with detections are saved
15 | num_classes = 80 # number of classes in model
16 |
17 | # load in weights and classes
18 | physical_devices = tf.config.experimental.list_physical_devices('GPU')
19 | if len(physical_devices) > 0:
20 | tf.config.experimental.set_memory_growth(physical_devices[0], True)
21 |
22 |
23 | yolo = YoloV3(classes=num_classes)
24 |
25 | yolo.load_weights(weights_path).expect_partial()
26 | print('weights loaded')
27 |
28 | class_names = [c.strip() for c in open(classes_path).readlines()]
29 | print('classes loaded')
30 |
31 |
32 | class Camera(BaseCamera):
33 |
34 | @staticmethod
35 | def frames():
36 | cam = cv2.VideoCapture(r'./finish.mp4')
37 | if not cam.isOpened():
38 | raise RuntimeError('Could not start camera.')
39 |
40 | while True:
41 | # read current frame
42 | _, img = cam.read()
43 | try:
44 | if CameraParams.gray:
45 | img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
46 | if CameraParams.gaussian:
47 | img_raw = tf.convert_to_tensor(img)
48 | img_raw = tf.expand_dims(img_raw, 0)
49 | # img detect
50 | img_raw = transform_images(img_raw, size)
51 | boxes, scores, classes, nums = yolo(img_raw)
52 | img = cv2.cvtColor(img, cv2.COLOR_RGB2BGR)
53 | img = draw_outputs(img, (boxes, scores, classes, nums), class_names)
54 | if CameraParams.sobel:
55 | if(len(img.shape) == 3):
56 | img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
57 | img = cv2.Sobel(img,cv2.CV_64F,1,0,ksize=5) # x
58 | img = cv2.Sobel(img,cv2.CV_64F,0,1,ksize=5) # y
59 | if CameraParams.canny:
60 | img = cv2.Canny(img, 100, 200, 3, L2gradient=True)
61 | except Exception as e:
62 | print(e)
63 | # encode as a jpeg image and return it
64 | yield cv2.imencode('.jpg', img)[1].tobytes()
65 |
66 | class CameraParams():
67 |
68 | gray = False
69 | gaussian = False
70 | sobel = False
71 | canny = False
72 | def __init__(self, gray, gaussian, sobel, canny, yolo):
73 | self.gray = gray
74 | self.gaussian = gaussian
75 | self.sobel = sobel
76 | self.canny = canny
77 |         self.yolo = yolo
78 |
--------------------------------------------------------------------------------
/cam/coco.names:
--------------------------------------------------------------------------------
1 | person
2 | bicycle
3 | car
4 | motorbike
5 | aeroplane
6 | bus
7 | train
8 | truck
9 | boat
10 | traffic light
11 | fire hydrant
12 | stop sign
13 | parking meter
14 | bench
15 | bird
16 | cat
17 | dog
18 | horse
19 | sheep
20 | cow
21 | elephant
22 | bear
23 | zebra
24 | giraffe
25 | backpack
26 | umbrella
27 | handbag
28 | tie
29 | suitcase
30 | frisbee
31 | skis
32 | snowboard
33 | sports ball
34 | kite
35 | baseball bat
36 | baseball glove
37 | skateboard
38 | surfboard
39 | tennis racket
40 | bottle
41 | wine glass
42 | cup
43 | fork
44 | knife
45 | spoon
46 | bowl
47 | banana
48 | apple
49 | sandwich
50 | orange
51 | broccoli
52 | carrot
53 | hot dog
54 | pizza
55 | donut
56 | cake
57 | chair
58 | sofa
59 | pottedplant
60 | bed
61 | diningtable
62 | toilet
63 | tvmonitor
64 | laptop
65 | mouse
66 | remote
67 | keyboard
68 | cell phone
69 | microwave
70 | oven
71 | toaster
72 | sink
73 | refrigerator
74 | book
75 | clock
76 | vase
77 | scissors
78 | teddy bear
79 | hair drier
80 | toothbrush
81 |
--------------------------------------------------------------------------------
/cam/result.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Transformer-man/yolov5-flask/36573a0b6e91d5a91f3394af278f5a5e768efae7/cam/result.png
--------------------------------------------------------------------------------
/cam/test.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Transformer-man/yolov5-flask/36573a0b6e91d5a91f3394af278f5a5e768efae7/cam/test.jpg
--------------------------------------------------------------------------------
/cam/test_re.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Transformer-man/yolov5-flask/36573a0b6e91d5a91f3394af278f5a5e768efae7/cam/test_re.jpg
--------------------------------------------------------------------------------
/cam/train.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Transformer-man/yolov5-flask/36573a0b6e91d5a91f3394af278f5a5e768efae7/cam/train.jpg
--------------------------------------------------------------------------------
/center/get_train_val.py:
--------------------------------------------------------------------------------
1 | import os,shutil
2 | import numpy as np
3 | import cv2
4 | from tqdm import tqdm
5 | # paths of all the images and labels saved in the previous step
6 | image_root = r'../datasets/yolo_data/images'
7 | label_root = r'../datasets/yolo_data/labels'
8 | names = []
9 | for root, dirs, files in os.walk(label_root):
10 | for file in files:
11 | names.append(file)
12 | val_split = 0.1
13 | np.random.seed(10101)
14 | np.random.shuffle(names)
15 | num_val = int(len(names)*val_split)
16 | num_train = len(names) - num_val
17 | trains = names[:num_train]
18 | vals = names[num_train:]
19 | # output path
20 | save_path_img = r'../datasets/traindata'
21 | if not os.path.exists(save_path_img):
22 | os.mkdir(save_path_img)
23 | def get_train_val_data(img_root,txt_root,save_path_img,files,typ):
24 | def get_path(root_path,path1):
25 | path = os.path.join(root_path,path1)
26 | if not os.path.exists(path):
27 | os.mkdir(path)
28 | return path
29 | for val in tqdm(files):
30 | txt_path = os.path.join(txt_root,val)
31 | img_path = os.path.join(img_root,val.split('.')[0]+'.jpg')
32 | img_path1 = get_path(save_path_img,'images')
33 | txt_path1 = get_path(save_path_img,'labels')
34 | rt_img = get_path(img_path1,typ)
35 | rt_txt = get_path(txt_path1,typ)
36 | txt_path1 = os.path.join(rt_txt,val)
37 | img_path1 = os.path.join(rt_img,val.split('.')[0]+'.jpg')
38 | shutil.copyfile(img_path, img_path1)
39 | shutil.copyfile(txt_path,txt_path1)
40 | get_train_val_data(image_root,label_root,save_path_img,vals,'val')
41 | get_train_val_data(image_root,label_root,save_path_img,trains,'train')
42 |
43 |
44 |
45 |
46 |
47 |
--------------------------------------------------------------------------------
/center/xml_yolo.py:
--------------------------------------------------------------------------------
1 | import os
2 | from tqdm import tqdm
3 | from lxml import etree
4 | import json
5 | import shutil
6 | # original xml and image paths
7 | xml_root_path = r'../datasets/VOC2012/Annotations'
8 | img_root_path = r'../datasets/VOC2012/JPEGImages'
9 | # where the images and yolo-format labels are saved; the folders are created if missing
10 | def get_path(path):
11 | if not os.path.exists(path):
12 | os.mkdir(path)
13 | return path
14 | get_path(r'../datasets/yolo_data')
15 | save_label_path = get_path(r'../datasets/yolo_data/labels')
16 | save_images_path = get_path(r'../datasets/yolo_data/images')
17 | def parse_xml_to_dict(xml):
18 |     if len(xml) == 0:  # reached a leaf node, return its tag and text directly
19 | return {xml.tag: xml.text}
20 | result = {}
21 | for child in xml:
22 |         child_result = parse_xml_to_dict(child)  # recurse into the child tags
23 | if child.tag != 'object':
24 | result[child.tag] = child_result[child.tag]
25 | else:
26 |             if child.tag not in result:  # there may be multiple object tags, so collect them in a list
27 | result[child.tag] = []
28 | result[child.tag].append(child_result[child.tag])
29 | return {xml.tag: result}
30 | def translate_info(file_names, img_root_path, class_list):
31 | for root,dirs,files in os.walk(file_names):
32 | for file in tqdm(files):
33 |             # full path of the xml file
34 | xml_path = os.path.join(root, file)
35 | # read xml
36 | with open(xml_path) as fid:
37 | xml_str = fid.read()
38 | xml = etree.fromstring(xml_str)
39 | data = parse_xml_to_dict(xml)["annotation"]
40 | img_height = int(data["size"]["height"])
41 | img_width = int(data["size"]["width"])
42 | img_path = data["filename"]
43 |
44 | # write object info into txt
45 | assert "object" in data.keys(), "file: '{}' lack of object key.".format(xml_path)
46 | if len(data["object"]) == 0:
47 |                 # skip this sample if the xml file contains no objects
48 | print("Warning: in '{}' xml, there are no objects.".format(xml_path))
49 | continue
50 | with open(os.path.join(save_label_path, file.split(".")[0] + ".txt"), "w") as f:
51 | for index, obj in enumerate(data["object"]):
52 |                     # get the box info of each object
53 | xmin = float(obj["bndbox"]["xmin"])
54 | xmax = float(obj["bndbox"]["xmax"])
55 | ymin = float(obj["bndbox"]["ymin"])
56 | ymax = float(obj["bndbox"]["ymax"])
57 | class_name = obj["name"]
58 | class_index = class_list.index(class_name)
59 |                     # further check: some annotations have w or h equal to 0, which would make the regression loss NaN
60 | if xmax <= xmin or ymax <= ymin:
61 | print("Warning: in '{}' xml, there are some bbox w/h <=0".format(xml_path))
62 | continue
63 |                     # convert the box info to yolo format
64 | xcenter = xmin + (xmax - xmin) / 2
65 | ycenter = ymin + (ymax - ymin) / 2
66 | w = xmax - xmin
67 | h = ymax - ymin
68 |                     # absolute to relative coordinates, keep 6 decimal places
69 | xcenter = round(xcenter / img_width, 6)
70 | ycenter = round(ycenter / img_height, 6)
71 | w = round(w / img_width, 6)
72 | h = round(h / img_height, 6)
73 | info = [str(i) for i in [class_index, xcenter, ycenter, w, h]]
74 | if index == 0:
75 | f.write(" ".join(info))
76 | else:
77 | f.write("\n" + " ".join(info))
78 | # copy image into save_images_path
79 | path_copy_to = os.path.join(save_images_path,file.split(".")[0] + ".jpg")
80 | shutil.copyfile(os.path.join(img_root_path, img_path), path_copy_to)
81 |
82 | label_json_path = r'../datasets/VOC2012/pascal_voc_classes.txt'
83 | with open(label_json_path, 'r') as f:
84 | label_file = f.readlines()
85 | class_list = label_file[0].split(',')
86 | translate_info(xml_root_path, img_root_path, class_list)
--------------------------------------------------------------------------------
/config/score.yaml:
--------------------------------------------------------------------------------
1 | # train and val datasets (image directory or *.txt file with image paths)
2 | train: ./datasets/traindata/images/train/
3 | val: ./datasets/traindata/images/val/
4 | # number of classes
5 | nc: 20
6 | # class names
7 | names: ["aeroplane","bicycle","bird","boat","bottle","bus","car","cat","chair","cow","diningtable","dog","horse","motorbike","person","pottedplant","sheep","sofa","train","tvmonitor"]
--------------------------------------------------------------------------------
/config/yolov3-spp.yaml:
--------------------------------------------------------------------------------
1 | # parameters
2 | nc: 80 # number of classes
3 | depth_multiple: 1.0 # expand model depth
4 | width_multiple: 1.0 # expand layer channels
5 |
6 | # anchors
7 | anchors:
8 | - [10,13, 16,30, 33,23] # P3/8
9 | - [30,61, 62,45, 59,119] # P4/16
10 | - [116,90, 156,198, 373,326] # P5/32
11 |
12 | # darknet53 backbone
13 | backbone:
14 | # [from, number, module, args]
15 | [[-1, 1, Conv, [32, 3, 1]], # 0
16 | [-1, 1, Conv, [64, 3, 2]], # 1-P1/2
17 | [-1, 1, Bottleneck, [64]],
18 | [-1, 1, Conv, [128, 3, 2]], # 3-P2/4
19 | [-1, 2, Bottleneck, [128]],
20 | [-1, 1, Conv, [256, 3, 2]], # 5-P3/8
21 | [-1, 8, Bottleneck, [256]],
22 | [-1, 1, Conv, [512, 3, 2]], # 7-P4/16
23 | [-1, 8, Bottleneck, [512]],
24 | [-1, 1, Conv, [1024, 3, 2]], # 9-P5/32
25 | [-1, 4, Bottleneck, [1024]], # 10
26 | ]
27 |
28 | # yolov3-spp head
29 | # na = len(anchors[0]) // 2  (number of anchors per detection layer, see parse_model in models/yolo.py)
30 | head:
31 | [[-1, 1, Bottleneck, [1024, False]], # 11
32 | [-1, 1, SPP, [512, [5, 9, 13]]],
33 | [-1, 1, Conv, [1024, 3, 1]],
34 | [-1, 1, Conv, [512, 1, 1]],
35 | [-1, 1, Conv, [1024, 3, 1]],
36 | [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1]], # 16 (P5/32-large)
37 |
38 | [-3, 1, Conv, [256, 1, 1]],
39 | [-1, 1, nn.Upsample, [None, 2, 'nearest']],
40 | [[-1, 8], 1, Concat, [1]], # cat backbone P4
41 | [-1, 1, Bottleneck, [512, False]],
42 | [-1, 1, Bottleneck, [512, False]],
43 | [-1, 1, Conv, [256, 1, 1]],
44 | [-1, 1, Conv, [512, 3, 1]],
45 | [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1]], # 24 (P4/16-medium)
46 |
47 | [-3, 1, Conv, [128, 1, 1]],
48 | [-1, 1, nn.Upsample, [None, 2, 'nearest']],
49 | [[-1, 6], 1, Concat, [1]], # cat backbone P3
50 | [-1, 1, Bottleneck, [256, False]],
51 | [-1, 2, Bottleneck, [256, False]],
52 | [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1]], # 30 (P3/8-small)
53 |
54 | [[], 1, Detect, [nc, anchors]], # Detect(P3, P4, P5)
55 | ]
56 |
--------------------------------------------------------------------------------
/config/yolov5l.yaml:
--------------------------------------------------------------------------------
1 | # parameters
2 | nc: 20 # number of classes
3 | depth_multiple: 1.0 # model depth multiple
4 | width_multiple: 1.0 # layer channel multiple
5 |
6 | # anchors
7 | anchors:
8 | - [10,13, 16,30, 33,23] # P3/8
9 | - [30,61, 62,45, 59,119] # P4/16
10 | - [116,90, 156,198, 373,326] # P5/32
11 |
12 | # yolov5 backbone
13 | backbone:
14 | # [from, number, module, args]
15 | [[-1, 1, Focus, [64, 3]], # 1-P1/2
16 | [-1, 1, Conv, [128, 3, 2]], # 2-P2/4
17 | [-1, 3, Bottleneck, [128]],
18 | [-1, 1, Conv, [256, 3, 2]], # 4-P3/8
19 | [-1, 9, BottleneckCSP, [256]],
20 | [-1, 1, Conv, [512, 3, 2]], # 6-P4/16
21 | [-1, 9, BottleneckCSP, [512]],
22 | [-1, 1, Conv, [1024, 3, 2]], # 8-P5/32
23 | [-1, 1, SPP, [1024, [5, 9, 13]]],
24 | [-1, 6, BottleneckCSP, [1024]], # 10
25 | ]
26 |
27 | # yolov5 head
28 | head:
29 | [[-1, 3, BottleneckCSP, [1024, False]], # 11
30 | [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1, 0]], # 12 (P5/32-large)
31 |
32 | [-2, 1, nn.Upsample, [None, 2, 'nearest']],
33 | [[-1, 6], 1, Concat, [1]], # cat backbone P4
34 | [-1, 1, Conv, [512, 1, 1]],
35 | [-1, 3, BottleneckCSP, [512, False]],
36 | [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1, 0]], # 17 (P4/16-medium)
37 |
38 | [-2, 1, nn.Upsample, [None, 2, 'nearest']],
39 | [[-1, 4], 1, Concat, [1]], # cat backbone P3
40 | [-1, 1, Conv, [256, 1, 1]],
41 | [-1, 3, BottleneckCSP, [256, False]],
42 | [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1, 0]], # 22 (P3/8-small)
43 |
44 | [[], 1, Detect, [nc, anchors]], # Detect(P3, P4, P5)
45 | ]
46 |
--------------------------------------------------------------------------------
/config/yolov5m.yaml:
--------------------------------------------------------------------------------
1 | # parameters
2 | nc: 20 # number of classes
3 | depth_multiple: 0.67 # model depth multiple
4 | width_multiple: 0.75 # layer channel multiple
5 |
6 | # anchors
7 | anchors:
8 | - [10,13, 16,30, 33,23] # P3/8
9 | - [30,61, 62,45, 59,119] # P4/16
10 | - [116,90, 156,198, 373,326] # P5/32
11 |
12 | # yolov5 backbone
13 | backbone:
14 | # [from, number, module, args]
15 | [[-1, 1, Focus, [64, 3]], # 1-P1/2
16 | [-1, 1, Conv, [128, 3, 2]], # 2-P2/4
17 | [-1, 3, Bottleneck, [128]],
18 | [-1, 1, Conv, [256, 3, 2]], # 4-P3/8
19 | [-1, 9, BottleneckCSP, [256]],
20 | [-1, 1, Conv, [512, 3, 2]], # 6-P4/16
21 | [-1, 9, BottleneckCSP, [512]],
22 | [-1, 1, Conv, [1024, 3, 2]], # 8-P5/32
23 | [-1, 1, SPP, [1024, [5, 9, 13]]],
24 | [-1, 6, BottleneckCSP, [1024]], # 10
25 | ]
26 |
27 | # yolov5 head
28 | head:
29 | [[-1, 3, BottleneckCSP, [1024, False]], # 11
30 | [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1, 0]], # 12 (P5/32-large)
31 |
32 | [-2, 1, nn.Upsample, [None, 2, 'nearest']],
33 | [[-1, 6], 1, Concat, [1]], # cat backbone P4
34 | [-1, 1, Conv, [512, 1, 1]],
35 | [-1, 3, BottleneckCSP, [512, False]],
36 | [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1, 0]], # 17 (P4/16-medium)
37 |
38 | [-2, 1, nn.Upsample, [None, 2, 'nearest']],
39 | [[-1, 4], 1, Concat, [1]], # cat backbone P3
40 | [-1, 1, Conv, [256, 1, 1]],
41 | [-1, 3, BottleneckCSP, [256, False]],
42 | [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1, 0]], # 22 (P3/8-small)
43 |
44 | [[], 1, Detect, [nc, anchors]], # Detect(P3, P4, P5)
45 | ]
46 |
--------------------------------------------------------------------------------
/config/yolov5s.yaml:
--------------------------------------------------------------------------------
1 | # parameters
2 | nc: 20 # number of classes
3 | depth_multiple: 0.33 # model depth multiple
4 | width_multiple: 0.50 # layer channel multiple
5 |
6 | # anchors
7 | anchors:
8 | - [10,13, 16,30, 33,23] # P3/8
9 | - [30,61, 62,45, 59,119] # P4/16
10 | - [116,90, 156,198, 373,326] # P5/32
11 |
12 | # yolov5 backbone
13 | backbone:
14 | # [from, number, module, args]
15 | [[-1, 1, Focus, [64, 3]], # 1-P1/2
16 | [-1, 1, Conv, [128, 3, 2]], # 2-P2/4
17 | [-1, 3, Bottleneck, [128]],
18 | [-1, 1, Conv, [256, 3, 2]], # 4-P3/8
19 | [-1, 9, BottleneckCSP, [256]],
20 | [-1, 1, Conv, [512, 3, 2]], # 6-P4/16
21 | [-1, 9, BottleneckCSP, [512]],
22 | [-1, 1, Conv, [1024, 3, 2]], # 8-P5/32
23 | [-1, 1, SPP, [1024, [5, 9, 13]]],
24 | [-1, 6, BottleneckCSP, [1024]], # 10
25 | ]
26 |
27 | # yolov5 head
28 | head:
29 | [[-1, 3, BottleneckCSP, [1024, False]], # 11
30 | [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1, 0]], # 12 (P5/32-large)
31 |
32 | [-2, 1, nn.Upsample, [None, 2, 'nearest']],
33 | [[-1, 6], 1, Concat, [1]], # cat backbone P4
34 | [-1, 1, Conv, [512, 1, 1]],
35 | [-1, 3, BottleneckCSP, [512, False]],
36 | [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1, 0]], # 17 (P4/16-medium)
37 |
38 | [-2, 1, nn.Upsample, [None, 2, 'nearest']],
39 | [[-1, 4], 1, Concat, [1]], # cat backbone P3
40 | [-1, 1, Conv, [256, 1, 1]],
41 | [-1, 3, BottleneckCSP, [256, False]],
42 | [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1, 0]], # 22 (P3/8-small)
43 |
44 | [[], 1, Detect, [nc, anchors]], # Detect(P3, P4, P5)
45 | ]
46 |
--------------------------------------------------------------------------------
/config/yolov5x.yaml:
--------------------------------------------------------------------------------
1 | # parameters
2 | nc: 80 # number of classes
3 | depth_multiple: 1.33 # model depth multiple
4 | width_multiple: 1.25 # layer channel multiple
5 |
6 | # anchors
7 | anchors:
8 | - [10,13, 16,30, 33,23] # P3/8
9 | - [30,61, 62,45, 59,119] # P4/16
10 | - [116,90, 156,198, 373,326] # P5/32
11 |
12 | # yolov5 backbone
13 | backbone:
14 | # [from, number, module, args]
15 | [[-1, 1, Focus, [64, 3]], # 1-P1/2
16 | [-1, 1, Conv, [128, 3, 2]], # 2-P2/4
17 | [-1, 3, Bottleneck, [128]],
18 | [-1, 1, Conv, [256, 3, 2]], # 4-P3/8
19 | [-1, 9, BottleneckCSP, [256]],
20 | [-1, 1, Conv, [512, 3, 2]], # 6-P4/16
21 | [-1, 9, BottleneckCSP, [512]],
22 | [-1, 1, Conv, [1024, 3, 2]], # 8-P5/32
23 | [-1, 1, SPP, [1024, [5, 9, 13]]],
24 | [-1, 6, BottleneckCSP, [1024]], # 10
25 | ]
26 |
27 | # yolov5 head
28 | head:
29 | [[-1, 3, BottleneckCSP, [1024, False]], # 11
30 | [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1, 0]], # 12 (P5/32-large)
31 |
32 | [-2, 1, nn.Upsample, [None, 2, 'nearest']],
33 | [[-1, 6], 1, Concat, [1]], # cat backbone P4
34 | [-1, 1, Conv, [512, 1, 1]],
35 | [-1, 3, BottleneckCSP, [512, False]],
36 | [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1, 0]], # 17 (P4/16-medium)
37 |
38 | [-2, 1, nn.Upsample, [None, 2, 'nearest']],
39 | [[-1, 4], 1, Concat, [1]], # cat backbone P3
40 | [-1, 1, Conv, [256, 1, 1]],
41 | [-1, 3, BottleneckCSP, [256, False]],
42 | [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1, 0]], # 22 (P3/8-small)
43 |
44 | [[], 1, Detect, [nc, anchors]], # Detect(P3, P4, P5)
45 | ]
46 |
--------------------------------------------------------------------------------
/detect.py:
--------------------------------------------------------------------------------
1 | from utils.datasets import *
2 | from utils.utils import *
3 |
4 | def detect(source, out, weights):
5 | source, out, weights, imgsz = source, out, weights, 640
6 | # Initialize
7 | device = torch_utils.select_device('cpu')
8 | if os.path.exists(out):
9 | shutil.rmtree(out) # delete output folder
10 | os.makedirs(out) # make new output folder
11 | # Load model
12 | google_utils.attempt_download(weights)
13 | model = torch.load(weights, map_location=device)['model']
14 | model.to(device).eval()
15 | vid_path, vid_writer = None, None
16 | dataset = LoadImages(source, img_size=imgsz)
17 | # Get names and colors
18 | names = model.names if hasattr(model, 'names') else model.modules.names
19 | colors = [[random.randint(0, 255) for _ in range(3)] for _ in range(len(names))]
20 | # Run inference
21 | t0 = time.time()
22 | for path, img, im0s, vid_cap in dataset:
23 | t1 = time.time()
24 | img = torch.from_numpy(img).to(device)
25 | img = img.float() # uint8 to fp16/32
26 | img /= 255.0 # 0 - 255 to 0.0 - 1.0
27 | if img.ndimension() == 3:
28 | img = img.unsqueeze(0)
29 | # Inference
30 | pred = model(img, augment=False)[0]
31 | pred = non_max_suppression(pred, 0.4, 0.5,
32 | fast=True, classes=None, agnostic=False)
33 | # Process detections
34 | for i, det in enumerate(pred): # detections per image
35 | p, s, im0 = path, '', im0s
36 | save_path = str(Path(out) / Path(p).name)
37 | s += '%gx%g ' % img.shape[2:] # print string
38 | if det is not None and len(det):
39 | # Rescale boxes from img_size to im0 size
40 | det[:, :4] = scale_coords(img.shape[2:], det[:, :4], im0.shape).round()
41 | # Print results
42 | for c in det[:, -1].unique():
43 | n = (det[:, -1] == c).sum() # detections per class
44 | s += '%g %ss, ' % (n, names[int(c)]) # add to string
45 | for *xyxy, conf, cls in det:
46 | # Add bbox to image
47 | label = '%s%.2f' % (names[int(cls)], conf)
48 | im0 = plot_one_box(xyxy, im0, label=label, color=colors[int(cls)], line_thickness=1)
49 | # xmin,ymin, xmax,ymax = int(xyxy[0]), int(xyxy[1]),int(xyxy[2]), int(xyxy[3])
50 | # xcenter = xmin + (xmax - xmin) / 2
51 | # ycenter = ymin + (ymax - ymin) / 2
52 | # w = xmax - xmin
53 | # h = ymax - ymin
54 | # Save results (image with detections)
55 | print('%sDone. (%.3fs)' % (s, time.time() - t1))
56 | if dataset.mode == 'images':
57 | cv2.imwrite(save_path, im0)
58 | else:
59 | if vid_path != save_path: # new video
60 | vid_path = save_path
61 | if isinstance(vid_writer, cv2.VideoWriter):
62 | vid_writer.release() # release previous video writer
63 | fps = vid_cap.get(cv2.CAP_PROP_FPS)
64 | w = int(vid_cap.get(cv2.CAP_PROP_FRAME_WIDTH))
65 | h = int(vid_cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
66 |                     vid_writer = cv2.VideoWriter(save_path, cv2.VideoWriter_fourcc(*'mp4v'), fps, (w, h))  # fourcc hard-coded: this simplified script has no opt/argparse
67 | vid_writer.write(im0)
68 |
69 | print('Done. (%.3fs)' % (time.time() - t0))
70 |
71 |
72 | source = './inference/inputs'
73 | out = './inference/outputs'
74 | weights = './weights/yolov5l.pt'
75 |
76 | with torch.no_grad():
77 | detect(source, out, weights)
78 |
--------------------------------------------------------------------------------
/inference/inputs/2007_000033.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Transformer-man/yolov5-flask/36573a0b6e91d5a91f3394af278f5a5e768efae7/inference/inputs/2007_000033.jpg
--------------------------------------------------------------------------------
/inference/outputs/2007_000033.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Transformer-man/yolov5-flask/36573a0b6e91d5a91f3394af278f5a5e768efae7/inference/outputs/2007_000033.jpg
--------------------------------------------------------------------------------
/models/__pycache__/common.cpython-37.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Transformer-man/yolov5-flask/36573a0b6e91d5a91f3394af278f5a5e768efae7/models/__pycache__/common.cpython-37.pyc
--------------------------------------------------------------------------------
/models/__pycache__/de.cpython-37.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Transformer-man/yolov5-flask/36573a0b6e91d5a91f3394af278f5a5e768efae7/models/__pycache__/de.cpython-37.pyc
--------------------------------------------------------------------------------
/models/__pycache__/experimental.cpython-37.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Transformer-man/yolov5-flask/36573a0b6e91d5a91f3394af278f5a5e768efae7/models/__pycache__/experimental.cpython-37.pyc
--------------------------------------------------------------------------------
/models/__pycache__/yolo.cpython-37.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Transformer-man/yolov5-flask/36573a0b6e91d5a91f3394af278f5a5e768efae7/models/__pycache__/yolo.cpython-37.pyc
--------------------------------------------------------------------------------
/models/common.py:
--------------------------------------------------------------------------------
1 | # This file contains modules common to various models
2 |
3 |
4 | from utils.utils import *
5 |
6 |
7 | def DWConv(c1, c2, k=1, s=1, act=True):
8 | # Depthwise convolution
9 | return Conv(c1, c2, k, s, g=math.gcd(c1, c2), act=act)
10 |
11 |
12 | class Conv(nn.Module):
13 | # Standard convolution
14 | def __init__(self, c1, c2, k=1, s=1, g=1, act=True): # ch_in, ch_out, kernel, stride, groups
15 | super(Conv, self).__init__()
16 | self.conv = nn.Conv2d(c1, c2, k, s, k // 2, groups=g, bias=False)
17 | self.bn = nn.BatchNorm2d(c2)
18 | self.act = nn.LeakyReLU(0.1, inplace=True) if act else nn.Identity()
19 |
20 | def forward(self, x):
21 | return self.act(self.bn(self.conv(x)))
22 |
23 | def fuseforward(self, x):
24 | return self.act(self.conv(x))
25 |
26 |
27 | class Bottleneck(nn.Module):
28 | # Standard bottleneck
29 | def __init__(self, c1, c2, shortcut=True, g=1, e=0.5): # ch_in, ch_out, shortcut, groups, expansion
30 | super(Bottleneck, self).__init__()
31 | c_ = int(c2 * e) # hidden channels
32 | self.cv1 = Conv(c1, c_, 1, 1)
33 | self.cv2 = Conv(c_, c2, 3, 1, g=g)
34 | self.add = shortcut and c1 == c2
35 |
36 | def forward(self, x):
37 | return x + self.cv2(self.cv1(x)) if self.add else self.cv2(self.cv1(x))
38 |
39 |
40 | class BottleneckCSP(nn.Module):
41 | # CSP Bottleneck https://github.com/WongKinYiu/CrossStagePartialNetworks
42 | def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5): # ch_in, ch_out, number, shortcut, groups, expansion
43 | super(BottleneckCSP, self).__init__()
44 | c_ = int(c2 * e) # hidden channels
45 | self.cv1 = Conv(c1, c_, 1, 1)
46 | self.cv2 = nn.Conv2d(c1, c_, 1, 1, bias=False)
47 | self.cv3 = nn.Conv2d(c_, c_, 1, 1, bias=False)
48 | self.cv4 = Conv(c2, c2, 1, 1)
49 | self.bn = nn.BatchNorm2d(2 * c_) # applied to cat(cv2, cv3)
50 | self.act = nn.LeakyReLU(0.1, inplace=True)
51 | self.m = nn.Sequential(*[Bottleneck(c_, c_, shortcut, g, e=1.0) for _ in range(n)])
52 |
53 | def forward(self, x):
54 | y1 = self.cv3(self.m(self.cv1(x)))
55 | y2 = self.cv2(x)
56 | return self.cv4(self.act(self.bn(torch.cat((y1, y2), dim=1))))
57 |
58 |
59 | class SPP(nn.Module):
60 | # Spatial pyramid pooling layer used in YOLOv3-SPP
61 | def __init__(self, c1, c2, k=(5, 9, 13)):
62 | super(SPP, self).__init__()
63 | c_ = c1 // 2 # hidden channels
64 | self.cv1 = Conv(c1, c_, 1, 1)
65 | self.cv2 = Conv(c_ * (len(k) + 1), c2, 1, 1)
66 | self.m = nn.ModuleList([nn.MaxPool2d(kernel_size=x, stride=1, padding=x // 2) for x in k])
67 |
68 | def forward(self, x):
69 | x = self.cv1(x)
70 | return self.cv2(torch.cat([x] + [m(x) for m in self.m], 1))
71 |
72 |
73 | class Flatten(nn.Module):
74 | # Use after nn.AdaptiveAvgPool2d(1) to remove last 2 dimensions
75 | def forward(self, x):
76 | return x.view(x.size(0), -1)
77 |
78 |
79 | class Focus(nn.Module):
80 | # Focus wh information into c-space
81 | def __init__(self, c1, c2, k=1):
82 | super(Focus, self).__init__()
83 | self.conv = Conv(c1 * 4, c2, k, 1)
84 |
85 | def forward(self, x): # x(b,c,w,h) -> y(b,4c,w/2,h/2)
86 | return self.conv(torch.cat([x[..., ::2, ::2], x[..., 1::2, ::2], x[..., ::2, 1::2], x[..., 1::2, 1::2]], 1))
87 |
88 |
89 | class Concat(nn.Module):
90 | # Concatenate a list of tensors along dimension
91 | def __init__(self, dimension=1):
92 | super(Concat, self).__init__()
93 | self.d = dimension
94 |
95 | def forward(self, x):
96 | return torch.cat(x, self.d)
97 |
--------------------------------------------------------------------------------
/models/de.py:
--------------------------------------------------------------------------------
1 | from utils.datasets import *
2 | from utils.utils import *
3 |
4 |
5 | def get_model():
6 | weights = r'./weights/yolov5s.pt'
7 | device = torch.device("cuda" if (torch.cuda.is_available()) else "cpu")
8 | google_utils.attempt_download(weights)
9 | model = torch.load(weights, map_location=device)['model']
10 | model.to(device).eval()
11 | return model
12 |
13 |
14 | def letterbox(img, new_shape=(416, 416), color=(114, 114, 114), auto=True, scaleFill=False, scaleup=True):
15 | # Resize image to a 32-pixel-multiple rectangle https://github.com/ultralytics/yolov3/issues/232
16 | shape = img.shape[:2] # current shape [height, width]
17 | if isinstance(new_shape, int):
18 | new_shape = (new_shape, new_shape)
19 |
20 | # Scale ratio (new / old)
21 | r = min(new_shape[0] / shape[0], new_shape[1] / shape[1])
22 | if not scaleup: # only scale down, do not scale up (for better test mAP)
23 | r = min(r, 1.0)
24 | # Compute padding
25 | ratio = r, r # width, height ratios
26 | new_unpad = int(round(shape[1] * r)), int(round(shape[0] * r))
27 | dw, dh = new_shape[1] - new_unpad[0], new_shape[0] - new_unpad[1] # wh padding
28 | if auto: # minimum rectangle
29 | dw, dh = np.mod(dw, 64), np.mod(dh, 64) # wh padding
30 | elif scaleFill: # stretch
31 | dw, dh = 0.0, 0.0
32 | new_unpad = new_shape
33 | ratio = new_shape[0] / shape[1], new_shape[1] / shape[0] # width, height ratios
34 |
35 | dw /= 2 # divide padding into 2 sides
36 | dh /= 2
37 | if shape[::-1] != new_unpad: # resize
38 | img = cv2.resize(img, new_unpad, interpolation=cv2.INTER_LINEAR)
39 | top, bottom = int(round(dh - 0.1)), int(round(dh + 0.1))
40 | left, right = int(round(dw - 0.1)), int(round(dw + 0.1))
41 | img = cv2.copyMakeBorder(img, top, bottom, left, right, cv2.BORDER_CONSTANT, value=color) # add border
42 | return img, ratio, (dw, dh)
43 |
44 | def detect(model, im0s):
45 | t0 = time.time()
46 | device = torch.device("cuda" if (torch.cuda.is_available()) else "cpu")
47 | names = model.names if hasattr(model, 'names') else model.modules.names
48 | colors = [[random.randint(0, 255) for _ in range(3)] for _ in range(len(names))]
49 | img = letterbox(im0s, new_shape=640)[0]
50 | img = img[:, :, ::-1].transpose(2, 0, 1) # BGR to RGB, to 3x416x416
51 | img = np.ascontiguousarray(img)
52 | img = torch.from_numpy(img).to(device)
53 | img = img.float()
54 | img /= 255.0 # 0 - 255 to 0.0 - 1.0
55 | if img.ndimension() == 3:
56 | img = img.unsqueeze(0)
57 | pred = model(img, augment=False)[0]
58 | pred = non_max_suppression(pred, 0.4, 0.5,
59 | fast=True, classes=None, agnostic=False)
60 | for i, det in enumerate(pred): # detections per image
61 | im0 = im0s
62 | if det is not None and len(det):
63 | det[:, :4] = scale_coords(img.shape[2:], det[:, :4], im0.shape).round()
64 | for *xyxy, conf, cls in det:
65 | label = '%s%.2f' % (names[int(cls)], conf)
66 | im0 = plot_one_box(xyxy, im0, label=label, color=colors[int(cls)], line_thickness=1)
67 | print('Done. (%.3fs)' % (time.time() - t0))
68 | return im0
69 |
70 |
--------------------------------------------------------------------------------
/models/experimental.py:
--------------------------------------------------------------------------------
1 | from models.common import *
2 |
3 |
4 | class Sum(nn.Module):
5 | # Weighted sum of 2 or more layers https://arxiv.org/abs/1911.09070
6 | def __init__(self, n, weight=False): # n: number of inputs
7 | super(Sum, self).__init__()
8 | self.weight = weight # apply weights boolean
9 | self.iter = range(n - 1) # iter object
10 | if weight:
11 | self.w = nn.Parameter(-torch.arange(1., n) / 2, requires_grad=True) # layer weights
12 |
13 | def forward(self, x):
14 | y = x[0] # no weight
15 | if self.weight:
16 | w = torch.sigmoid(self.w) * 2
17 | for i in self.iter:
18 | y = y + x[i + 1] * w[i]
19 | else:
20 | for i in self.iter:
21 | y = y + x[i + 1]
22 | return y
23 |
24 |
25 | class GhostConv(nn.Module):
26 | # Ghost Convolution https://github.com/huawei-noah/ghostnet
27 | def __init__(self, c1, c2, k=1, s=1, g=1, act=True): # ch_in, ch_out, kernel, stride, groups
28 | super(GhostConv, self).__init__()
29 | c_ = c2 // 2 # hidden channels
30 | self.cv1 = Conv(c1, c_, k, s, g, act)
31 | self.cv2 = Conv(c_, c_, 5, 1, c_, act)
32 |
33 | def forward(self, x):
34 | y = self.cv1(x)
35 | return torch.cat([y, self.cv2(y)], 1)
36 |
37 |
38 | class GhostBottleneck(nn.Module):
39 | # Ghost Bottleneck https://github.com/huawei-noah/ghostnet
40 | def __init__(self, c1, c2, k, s):
41 | super(GhostBottleneck, self).__init__()
42 | c_ = c2 // 2
43 | self.conv = nn.Sequential(GhostConv(c1, c_, 1, 1), # pw
44 | DWConv(c_, c_, k, s, act=False) if s == 2 else nn.Identity(), # dw
45 | GhostConv(c_, c2, 1, 1, act=False)) # pw-linear
46 | self.shortcut = nn.Sequential(DWConv(c1, c1, k, s, act=False),
47 | Conv(c1, c2, 1, 1, act=False)) if s == 2 else nn.Identity()
48 |
49 | def forward(self, x):
50 | return self.conv(x) + self.shortcut(x)
51 |
52 |
53 | class ConvPlus(nn.Module):
54 | # Plus-shaped convolution
55 | def __init__(self, c1, c2, k=3, s=1, g=1, bias=True): # ch_in, ch_out, kernel, stride, groups
56 | super(ConvPlus, self).__init__()
57 | self.cv1 = nn.Conv2d(c1, c2, (k, 1), s, (k // 2, 0), groups=g, bias=bias)
58 | self.cv2 = nn.Conv2d(c1, c2, (1, k), s, (0, k // 2), groups=g, bias=bias)
59 |
60 | def forward(self, x):
61 | return self.cv1(x) + self.cv2(x)
62 |
63 |
64 | class MixConv2d(nn.Module):
65 | # Mixed Depthwise Conv https://arxiv.org/abs/1907.09595
66 | def __init__(self, c1, c2, k=(1, 3), s=1, equal_ch=True):
67 | super(MixConv2d, self).__init__()
68 | groups = len(k)
69 | if equal_ch: # equal c_ per group
70 | i = torch.linspace(0, groups - 1E-6, c2).floor() # c2 indices
71 | c_ = [(i == g).sum() for g in range(groups)] # intermediate channels
72 | else: # equal weight.numel() per group
73 | b = [c2] + [0] * groups
74 | a = np.eye(groups + 1, groups, k=-1)
75 | a -= np.roll(a, 1, axis=1)
76 | a *= np.array(k) ** 2
77 | a[0] = 1
78 | c_ = np.linalg.lstsq(a, b, rcond=None)[0].round() # solve for equal weight indices, ax = b
79 |
80 | self.m = nn.ModuleList([nn.Conv2d(c1, int(c_[g]), k[g], s, k[g] // 2, bias=False) for g in range(groups)])
81 | self.bn = nn.BatchNorm2d(c2)
82 | self.act = nn.LeakyReLU(0.1, inplace=True)
83 |
84 | def forward(self, x):
85 | return x + self.act(self.bn(torch.cat([m(x) for m in self.m], 1)))
86 |
--------------------------------------------------------------------------------
/models/onnx_export.py:
--------------------------------------------------------------------------------
1 | """Exports a pytorch *.pt model to *.onnx format
2 |
3 | Usage:
4 | import torch
5 | $ export PYTHONPATH="$PWD" && python models/onnx_export.py --weights ./weights/yolov5s.pt --img 640 --batch 1
6 | """
7 |
8 | import argparse
9 |
10 | import onnx
11 |
12 | from models.common import *
13 |
14 | if __name__ == '__main__':
15 | parser = argparse.ArgumentParser()
16 | parser.add_argument('--weights', type=str, default='./yolov5s.pt', help='weights path')
17 | parser.add_argument('--img-size', nargs='+', type=int, default=[640, 640], help='image size')
18 | parser.add_argument('--batch-size', type=int, default=1, help='batch size')
19 | opt = parser.parse_args()
20 | print(opt)
21 |
22 | # Parameters
23 | f = opt.weights.replace('.pt', '.onnx') # onnx filename
24 | img = torch.zeros((opt.batch_size, 3, *opt.img_size)) # image size, (1, 3, 320, 192) iDetection
25 |
26 | # Load pytorch model
27 | google_utils.attempt_download(opt.weights)
28 | model = torch.load(opt.weights)['model']
29 | model.eval()
30 | model.fuse()
31 |
32 | # Export to onnx
33 | model.model[-1].export = True # set Detect() layer export=True
34 | _ = model(img) # dry run
35 | torch.onnx.export(model, img, f, verbose=False, opset_version=11, input_names=['images'],
36 | output_names=['output']) # output_names=['classes', 'boxes']
37 |
38 | # Check onnx model
39 | model = onnx.load(f) # load onnx model
40 | onnx.checker.check_model(model) # check onnx model
41 | print(onnx.helper.printable_graph(model.graph)) # print a human readable representation of the graph
42 | print('Export complete. ONNX model saved to %s\nView with https://github.com/lutzroeder/netron' % f)
43 |
--------------------------------------------------------------------------------
/models/yolo.py:
--------------------------------------------------------------------------------
1 | import argparse
2 |
3 | import yaml
4 |
5 | from models.experimental import *
6 |
7 |
8 | class Detect(nn.Module):
9 | def __init__(self, nc=80, anchors=()): # detection layer
10 | super(Detect, self).__init__()
11 | self.stride = None # strides computed during build
12 | self.nc = nc # number of classes
13 | self.no = nc + 5 # number of outputs per anchor
14 | self.nl = len(anchors) # number of detection layers
15 | self.na = len(anchors[0]) // 2 # number of anchors
16 | self.grid = [torch.zeros(1)] * self.nl # init grid
17 | a = torch.tensor(anchors).float().view(self.nl, -1, 2)
18 | self.register_buffer('anchors', a) # shape(nl,na,2)
19 | self.register_buffer('anchor_grid', a.clone().view(self.nl, 1, -1, 1, 1, 2)) # shape(nl,1,na,1,1,2)
20 | self.export = False # onnx export
21 |
22 | def forward(self, x):
23 | # x = x.copy() # for profiling
24 | z = [] # inference output
25 | self.training |= self.export
26 | for i in range(self.nl):
27 | bs, _, ny, nx = x[i].shape # x(bs,255,20,20) to x(bs,3,20,20,85)
28 | x[i] = x[i].view(bs, self.na, self.no, ny, nx).permute(0, 1, 3, 4, 2).contiguous()
29 |
30 | if not self.training: # inference
31 | if self.grid[i].shape[2:4] != x[i].shape[2:4]:
32 | self.grid[i] = self._make_grid(nx, ny).to(x[i].device)
33 |
34 | y = x[i].sigmoid()
35 | y[..., 0:2] = (y[..., 0:2] * 2. - 0.5 + self.grid[i].to(x[i].device)) * self.stride[i] # xy
36 | y[..., 2:4] = (y[..., 2:4] * 2) ** 2 * self.anchor_grid[i] # wh
37 | z.append(y.view(bs, -1, self.no))
38 |
39 | return x if self.training else (torch.cat(z, 1), x)
40 |
41 | @staticmethod
42 | def _make_grid(nx=20, ny=20):
43 | yv, xv = torch.meshgrid([torch.arange(ny), torch.arange(nx)])
44 | return torch.stack((xv, yv), 2).view((1, 1, ny, nx, 2)).float()
45 |
46 |
47 | class Model(nn.Module):
48 | def __init__(self, model_cfg='yolov5s.yaml', ch=3, nc=None): # model, input channels, number of classes
49 | super(Model, self).__init__()
50 | if type(model_cfg) is dict:
51 | self.md = model_cfg # model dict
52 | else: # is *.yaml
53 | with open(model_cfg) as f:
54 | self.md = yaml.load(f, Loader=yaml.FullLoader) # model dict
55 |
56 | # Define model
57 | if nc:
58 | self.md['nc'] = nc # override yaml value
59 | self.model, self.save = parse_model(self.md, ch=[ch]) # model, savelist, ch_out
60 | # print([x.shape for x in self.forward(torch.zeros(1, ch, 64, 64))])
61 |
62 | # Build strides, anchors
63 | m = self.model[-1] # Detect()
64 | m.stride = torch.tensor([64 / x.shape[-2] for x in self.forward(torch.zeros(1, ch, 64, 64))]) # forward
65 | m.anchors /= m.stride.view(-1, 1, 1)
66 | self.stride = m.stride
67 |
68 | # Init weights, biases
69 | torch_utils.initialize_weights(self)
70 | self._initialize_biases() # only run once
71 | torch_utils.model_info(self)
72 | print('')
73 |
74 | def forward(self, x, augment=False, profile=False):
75 | if augment:
76 | img_size = x.shape[-2:] # height, width
77 | s = [0.83, 0.67] # scales
78 | y = []
79 | for i, xi in enumerate((x,
80 | torch_utils.scale_img(x.flip(3), s[0]), # flip-lr and scale
81 | torch_utils.scale_img(x, s[1]), # scale
82 | )):
83 | # cv2.imwrite('img%g.jpg' % i, 255 * xi[0].numpy().transpose((1, 2, 0))[:, :, ::-1])
84 | y.append(self.forward_once(xi)[0])
85 |
86 | y[1][..., :4] /= s[0] # scale
87 | y[1][..., 0] = img_size[1] - y[1][..., 0] # flip lr
88 | y[2][..., :4] /= s[1] # scale
89 | return torch.cat(y, 1), None # augmented inference, train
90 | else:
91 | return self.forward_once(x, profile) # single-scale inference, train
92 |
93 | def forward_once(self, x, profile=False):
94 | y, dt = [], [] # outputs
95 | for m in self.model:
96 | if m.f != -1: # if not from previous layer
97 | x = y[m.f] if isinstance(m.f, int) else [x if j == -1 else y[j] for j in m.f] # from earlier layers
98 |
99 | if profile:
100 | import thop
101 | o = thop.profile(m, inputs=(x,), verbose=False)[0] / 1E9 * 2 # FLOPS
102 | t = torch_utils.time_synchronized()
103 | for _ in range(10):
104 | _ = m(x)
105 | dt.append((torch_utils.time_synchronized() - t) * 100)
106 | print('%10.1f%10.0f%10.1fms %-40s' % (o, m.np, dt[-1], m.type))
107 |
108 | x = m(x) # run
109 | y.append(x if m.i in self.save else None) # save output
110 |
111 | if profile:
112 | print('%.1fms total' % sum(dt))
113 | return x
114 |
115 | def _initialize_biases(self, cf=None): # initialize biases into Detect(), cf is class frequency
116 | # cf = torch.bincount(torch.tensor(np.concatenate(dataset.labels, 0)[:, 0]).long(), minlength=nc) + 1.
117 | m = self.model[-1] # Detect() module
118 | for f, s in zip(m.f, m.stride): # from
119 | mi = self.model[f % m.i]
120 | b = mi.bias.view(m.na, -1) # conv.bias(255) to (3,85)
121 | # b[:, 4] += math.log(8 / (640 / s) ** 2) # obj (8 objects per 640 image)
122 | # b[:, 5:] += math.log(0.6 / (m.nc - 0.99)) if cf is None else torch.log(cf / cf.sum()) # cls
123 | b.data[:, 4] += math.log(8 / (640 / s) ** 2) # obj (8 objects per 640 image)
124 | b.data[:, 5:] += math.log(0.6 / (m.nc - 0.99)) if cf is None else torch.log(cf / cf.sum()) # cls
125 |
126 | mi.bias = torch.nn.Parameter(b.view(-1), requires_grad=True)
127 | # def _initialize_biases(self, cf=None): # initialize biases into Detect(), cf is class frequency
128 | # # cf = torch.bincount(torch.tensor(np.concatenate(dataset.labels, 0)[:, 0]).long(), minlength=nc) + 1.
129 | # m = self.model[-1] # Detect() module
130 | # for mi, s in zip(m.m, m.stride): # from
131 | # b = mi.bias.view(m.na, -1) # conv.bias(255) to (3,85)
132 | # with torch.no_grad():
133 | # b[:, 4] += math.log(8 / (640 / s) ** 2) # obj (8 objects per 640 image)
134 | # b[:, 5:] += math.log(0.6 / (m.nc - 0.99)) if cf is None else torch.log(cf / cf.sum()) # cls
135 | # mi.bias = torch.nn.Parameter(b.view(-1), requires_grad=True)
136 |
137 | def _print_biases(self):
138 | m = self.model[-1] # Detect() module
139 | for f in sorted([x % m.i for x in m.f]): # from
140 | b = self.model[f].bias.detach().view(m.na, -1).T # conv.bias(255) to (3,85)
141 | print(('%g Conv2d.bias:' + '%10.3g' * 6) % (f, *b[:5].mean(1).tolist(), b[5:].mean()))
142 |
143 | # def _print_weights(self):
144 | # for m in self.model.modules():
145 | # if type(m) is Bottleneck:
146 | # print('%10.3g' % (m.w.detach().sigmoid() * 2)) # shortcut weights
147 |
148 | def fuse(self): # fuse model Conv2d() + BatchNorm2d() layers
149 | print('Fusing layers...')
150 | for m in self.model.modules():
151 | if type(m) is Conv:
152 | m.conv = torch_utils.fuse_conv_and_bn(m.conv, m.bn) # update conv
153 | m.bn = None # remove batchnorm
154 | m.forward = m.fuseforward # update forward
155 | torch_utils.model_info(self)
156 |
157 |
158 | def parse_model(md, ch): # model_dict, input_channels(3)
159 | print('\n%3s%15s%3s%10s %-40s%-30s' % ('', 'from', 'n', 'params', 'module', 'arguments'))
160 | anchors, nc, gd, gw = md['anchors'], md['nc'], md['depth_multiple'], md['width_multiple']
161 | na = (len(anchors[0]) // 2) # number of anchors
162 | no = na * (nc + 5) # number of outputs = anchors * (classes + 5)
163 |
164 | layers, save, c2 = [], [], ch[-1] # layers, savelist, ch out
165 | for i, (f, n, m, args) in enumerate(md['backbone'] + md['head']): # from, number, module, args
166 | m = eval(m) if isinstance(m, str) else m # eval strings
167 | for j, a in enumerate(args):
168 | try:
169 | args[j] = eval(a) if isinstance(a, str) else a # eval strings
170 | except:
171 | pass
172 |
173 | n = max(round(n * gd), 1) if n > 1 else n # depth gain
174 | if m in [nn.Conv2d, Conv, Bottleneck, SPP, DWConv, MixConv2d, Focus, ConvPlus, BottleneckCSP]:
175 | c1, c2 = ch[f], args[0]
176 |
177 | # Normal
178 | # if i > 0 and args[0] != no: # channel expansion factor
179 | # ex = 1.75 # exponential (default 2.0)
180 | # e = math.log(c2 / ch[1]) / math.log(2)
181 | # c2 = int(ch[1] * ex ** e)
182 | # if m != Focus:
183 | c2 = make_divisible(c2 * gw, 8) if c2 != no else c2
184 |
185 | # Experimental
186 | # if i > 0 and args[0] != no: # channel expansion factor
187 | # ex = 1 + gw # exponential (default 2.0)
188 | # ch1 = 32 # ch[1]
189 | # e = math.log(c2 / ch1) / math.log(2) # level 1-n
190 | # c2 = int(ch1 * ex ** e)
191 | # if m != Focus:
192 | # c2 = make_divisible(c2, 8) if c2 != no else c2
193 |
194 | args = [c1, c2, *args[1:]]
195 | if m is BottleneckCSP:
196 | args.insert(2, n)
197 | n = 1
198 | elif m is nn.BatchNorm2d:
199 | args = [ch[f]]
200 | elif m is Concat:
201 | c2 = sum([ch[-1 if x == -1 else x + 1] for x in f])
202 | elif m is Detect:
203 | f = f or list(reversed([(-1 if j == i else j - 1) for j, x in enumerate(ch) if x == no]))
204 | else:
205 | c2 = ch[f]
206 |
207 | m_ = nn.Sequential(*[m(*args) for _ in range(n)]) if n > 1 else m(*args) # module
208 | t = str(m)[8:-2].replace('__main__.', '') # module type
209 | np = sum([x.numel() for x in m_.parameters()]) # number params
210 | m_.i, m_.f, m_.type, m_.np = i, f, t, np # attach index, 'from' index, type, number params
211 | print('%3s%15s%3s%10.0f %-40s%-30s' % (i, f, n, np, t, args)) # print
212 | save.extend(x % i for x in ([f] if isinstance(f, int) else f) if x != -1) # append to savelist
213 | layers.append(m_)
214 | ch.append(c2)
215 | return nn.Sequential(*layers), sorted(save)
216 |
217 |
218 | if __name__ == '__main__':
219 | parser = argparse.ArgumentParser()
220 | parser.add_argument('--cfg', type=str, default='yolov5s.yaml', help='model.yaml')
221 | parser.add_argument('--device', default='', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')
222 | opt = parser.parse_args()
223 | opt.cfg = glob.glob('./**/' + opt.cfg, recursive=True)[0] # find file
224 |
225 | device = torch_utils.select_device(opt.device)
226 |
227 | # Create model
228 | model = Model(opt.cfg).to(device)
229 | model.train()
230 |
231 | # Profile
232 | # img = torch.rand(8 if torch.cuda.is_available() else 1, 3, 640, 640).to(device)
233 | # y = model(img, profile=True)
234 | # print([y[0].shape] + [x.shape for x in y[1]])
235 |
236 | # ONNX export
237 | # model.model[-1].export = True
238 | # torch.onnx.export(model, img, f.replace('.yaml', '.onnx'), verbose=True, opset_version=11)
239 |
240 | # Tensorboard
241 | # from torch.utils.tensorboard import SummaryWriter
242 | # tb_writer = SummaryWriter()
243 | # print("Run 'tensorboard --logdir=models/runs' to view tensorboard at http://localhost:6006/")
244 | # tb_writer.add_graph(model.model, img) # add model to tensorboard
245 | # tb_writer.add_image('test', img[0], dataformats='CWH') # add model to tensorboard
246 |
--------------------------------------------------------------------------------
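A quick worked example of the stride/anchor bookkeeping done in `Model.__init__` above: a dummy 64×64 input is pushed through the network once, the three Detect feature maps come back at 8×8, 4×4 and 2×2, so the strides are 8/16/32, and the anchors (written in pixels in the yaml) are divided by their stride so that Detect works in grid units. A minimal sketch, independent of the repo's classes (the anchor values are the usual yolov5 defaults, shown only for illustration):

```python
import torch

# Anchors as they appear in a yolov5 yaml (pixel units), one row per detection layer.
anchors = torch.tensor([[[10., 13.], [16., 30.], [33., 23.]],        # P3/8
                        [[30., 61.], [62., 45.], [59., 119.]],       # P4/16
                        [[116., 90.], [156., 198.], [373., 326.]]])  # P5/32

# Grid heights returned for the forward(torch.zeros(1, ch, 64, 64)) probe above.
grid_sizes = torch.tensor([8., 4., 2.])
stride = 64 / grid_sizes                       # -> tensor([ 8., 16., 32.])

anchors_in_grid_units = anchors / stride.view(-1, 1, 1)
print(stride)                     # tensor([ 8., 16., 32.])
print(anchors_in_grid_units[0])   # first-layer anchors expressed in 8-pixel grid cells
```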
/requirements.txt:
--------------------------------------------------------------------------------
1 | torch==1.8.0
2 | torchvision==0.9.0
3 | numpy
4 | opencv-python
5 | lxml
6 | tqdm
7 | flask
8 | pillow
9 | tensorboard
10 | matplotlib
11 | pycocotools  # on Windows, install pycocotools-windows instead
--------------------------------------------------------------------------------
/static/client.js:
--------------------------------------------------------------------------------
1 | var el = x => document.getElementById(x);
2 |
3 | function showPicker() {
4 | el("file-input").click();
5 | }
6 |
7 | function showPicked(input) {
8 | el("upload-label").innerHTML = input.files[0].name;
9 |
10 | var reader = new FileReader();
11 | reader.onload = function (e) {
12 | if (e.target.result.split("/")[0].split(":")[1] == "image"){
13 | el("image-picked").src = e.target.result;
14 | el("image-picked").className = "";
15 | el("image-picked1").className = "no-display";
16 | }
17 | else{
18 | el("image-picked1").src = e.target.result;
19 | el("image-picked1").className = "";
20 | el("image-picked").className = "no-display";
21 | }
22 | };
23 | reader.readAsDataURL(input.files[0]);
24 | }
--------------------------------------------------------------------------------
/static/style.css:
--------------------------------------------------------------------------------
1 | .modal {
2 | display: none;
3 | position: fixed;
4 | z-index: 1000;
5 | top: 0;
6 | left: 0;
7 | height: 100%;
8 | width: 100%;
9 | background: rgba( 255, 255, 255, .8 )
10 | url('/static/ajax-loader.gif')
11 | 50% 50%
12 | no-repeat;
13 | }
14 |
15 | /* When the body has the loading class, we turn
16 | the scrollbar off with overflow:hidden */
17 | body.loading .modal {
18 | overflow: hidden;
19 | }
20 |
21 | /* Anytime the body has the loading class, our
22 | modal element will be visible */
23 | body.loading .modal {
24 | display: block;
25 | }
--------------------------------------------------------------------------------
/static/style1.css:
--------------------------------------------------------------------------------
1 | body {
2 | background-color: #fff;
3 | }
4 |
5 | .no-display {
6 | display: none;
7 | }
8 |
9 | .center {
10 | margin: auto;
11 | padding: 10px 50px;
12 | text-align: center;
13 | font-size: 14px;
14 | }
15 |
16 | .title {
17 | font-size: 30px;
18 | margin-top: 1em;
19 | margin-bottom: 1em;
20 | color: #262626;
21 | }
22 |
23 | .content {
24 | margin-top: 10em;
25 | }
26 |
27 | .analyze {
28 | margin-top: 5em;
29 | }
30 |
31 | .upload-label {
32 | padding: 10px;
33 | font-size: 12px;
34 | }
35 |
36 | .result-label {
37 | margin-top: 0.5em;
38 | padding: 10px;
39 | font-size: 13px;
40 | }
41 |
42 | button.choose-file-button {
43 | width: 200px;
44 | height: 40px;
45 | border-radius: 2px;
46 | background-color: #ffffff;
47 | border: solid 1px #ff8100;
48 | font-size: 13px;
49 | color: #ff8100;
50 | }
51 |
52 | button.analyze-button {
53 | width: 200px;
54 | height: 40px;
55 | border: solid 1px #ff8100;
56 | border-radius: 2px;
57 | background-color: #ff8100;
58 | font-size: 13px;
59 | color: #ffffff;
60 | }
61 |
62 | button:focus {
63 | outline: 0;
64 | }
65 |
--------------------------------------------------------------------------------
/static/worker.js:
--------------------------------------------------------------------------------
1 | $('#detections').hide()
2 | var $loading = $('#loading').hide();
3 |
4 | $('#updateCamera').click(function (event) {
5 | event.preventDefault();
6 | const data = {
7 | "gray": $('#gray').is(":checked"),
8 | "gaussian": $('#gaussian').is(":checked"),
9 | "sobel": $('#sobel').is(":checked"),
10 | "canny": $('#canny').is(":checked"),
11 | }
12 | console.log(data)
13 | $.ajax({
14 | type: 'POST',
15 | url: '/cameraParams',
16 | data: data,
17 | success: function (success) {
18 | console.log(success)
19 | }, error: function (error) {
20 | console.log(error)
21 | }
22 | })
23 | });
24 |
25 | var loadFile = function (event) {
26 | var output = document.getElementById('input');
27 | output.src = URL.createObjectURL(event.target.files[0]);
28 | };
29 |
30 | $(document)
31 | .ajaxStart(function () {
32 | $loading.show();
33 | })
34 | .ajaxStop(function () {
35 | $loading.hide();
36 | });
37 |
--------------------------------------------------------------------------------
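worker.js above posts the checked filter flags to `/cameraParams`; the Flask side lives in app.py, which is not part of this section, so below is only a minimal, hypothetical sketch of how such a POST could be consumed. The URL and the gray/gaussian/sobel/canny field names come from worker.js; the `camera_params` dict, the route function name, and how the camera loop would read the flags are assumptions, not the repo's actual app.py:

```python
# Hypothetical sketch, not the repo's app.py.
from flask import Flask, request, jsonify

app = Flask(__name__)
camera_params = {'gray': False, 'gaussian': False, 'sobel': False, 'canny': False}

@app.route('/cameraParams', methods=['POST'])
def update_camera_params():
    # jQuery posts the checkbox states as form fields holding the strings "true"/"false"
    for key in camera_params:
        camera_params[key] = request.form.get(key, 'false') == 'true'
    return jsonify(camera_params)
```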
/templates/index1.html:
--------------------------------------------------------------------------------
(HTML markup for this template did not survive the dump; what remains is the page title "yolo deepsort", the page heading "Target Detection and Multi-Target Tracking Platform", and a "Chosen Image" preview placeholder for the uploaded picture.)
--------------------------------------------------------------------------------
/test.py:
--------------------------------------------------------------------------------
1 | import argparse
2 | import json
3 |
4 | import yaml
5 | from torch.utils.data import DataLoader
6 |
7 | from utils.datasets import *
8 | from utils.utils import *
9 |
10 |
11 | def test(data,
12 | weights=None,
13 | batch_size=16,
14 | imgsz=640,
15 | conf_thres=0.001,
16 | iou_thres=0.6, # for nms
17 | save_json=False,
18 | single_cls=False,
19 | augment=False,
20 | model=None,
21 | dataloader=None,
22 |          fast=False,  # 0 fast, 1 accurate
23 |          verbose=False):
24 | # Initialize/load model and set device
25 | if model is None:
26 | device = torch_utils.select_device(opt.device, batch_size=batch_size)
27 |
28 | # Remove previous
29 | for f in glob.glob('test_batch*.jpg'):
30 | os.remove(f)
31 |
32 | # Load model
33 | google_utils.attempt_download(weights)
34 | model = torch.load(weights, map_location=device)['model']
35 | torch_utils.model_info(model)
36 | # model.fuse()
37 | model.to(device)
38 |
39 | if device.type != 'cpu' and torch.cuda.device_count() > 1:
40 | model = nn.DataParallel(model)
41 |
42 | training = False
43 | else: # called by train.py
44 | device = next(model.parameters()).device # get model device
45 | training = True
46 |
47 | # Configure run
48 | with open(data) as f:
49 | data = yaml.load(f, Loader=yaml.FullLoader) # model dict
50 | nc = 1 if single_cls else int(data['nc']) # number of classes
51 | iouv = torch.linspace(0.5, 0.95, 10).to(device) # iou vector for mAP@0.5:0.95
52 | # iouv = iouv[0].view(1) # comment for mAP@0.5:0.95
53 | niou = iouv.numel()
54 |
55 | # Dataloader
56 | if dataloader is None:
57 | fast |= conf_thres > 0.001 # enable fast mode
58 | path = data['test'] if opt.task == 'test' else data['val'] # path to val/test images
59 | dataset = LoadImagesAndLabels(path,
60 | imgsz,
61 | batch_size,
62 | rect=True, # rectangular inference
63 | single_cls=opt.single_cls, # single class mode
64 | pad=0.0 if fast else 0.5) # padding
65 | batch_size = min(batch_size, len(dataset))
66 | nw = min([os.cpu_count(), batch_size if batch_size > 1 else 0, 8]) # number of workers
67 | dataloader = DataLoader(dataset,
68 | batch_size=batch_size,
69 | num_workers=nw,
70 | pin_memory=True,
71 | collate_fn=dataset.collate_fn)
72 |
73 | seen = 0
74 | model.eval()
75 | _ = model(torch.zeros((1, 3, imgsz, imgsz), device=device)) if device.type != 'cpu' else None # run once
76 | names = model.names if hasattr(model, 'names') else model.module.names
77 | coco91class = coco80_to_coco91_class()
78 | s = ('%20s' + '%12s' * 6) % ('Class', 'Images', 'Targets', 'P', 'R', 'mAP@.5', 'mAP@.5:.95')
79 | p, r, f1, mp, mr, map50, map, t0, t1 = 0., 0., 0., 0., 0., 0., 0., 0., 0.
80 | loss = torch.zeros(3, device=device)
81 | jdict, stats, ap, ap_class = [], [], [], []
82 | for batch_i, (imgs, targets, paths, shapes) in enumerate(tqdm(dataloader, desc=s)):
83 | imgs = imgs.to(device).float() / 255.0 # uint8 to float32, 0 - 255 to 0.0 - 1.0
84 | targets = targets.to(device)
85 | nb, _, height, width = imgs.shape # batch size, channels, height, width
86 | whwh = torch.Tensor([width, height, width, height]).to(device)
87 |
88 | # Disable gradients
89 | with torch.no_grad():
90 | # Run model
91 | t = torch_utils.time_synchronized()
92 | inf_out, train_out = model(imgs, augment=augment) # inference and training outputs
93 | t0 += torch_utils.time_synchronized() - t
94 |
95 | # Compute loss
96 | if training: # if model has loss hyperparameters
97 | loss += compute_loss(train_out, targets, model)[1][:3] # GIoU, obj, cls
98 |
99 | # Run NMS
100 | t = torch_utils.time_synchronized()
101 | output = non_max_suppression(inf_out, conf_thres=conf_thres, iou_thres=iou_thres, fast=fast)
102 | t1 += torch_utils.time_synchronized() - t
103 |
104 | # Statistics per image
105 | for si, pred in enumerate(output):
106 | labels = targets[targets[:, 0] == si, 1:]
107 | nl = len(labels)
108 | tcls = labels[:, 0].tolist() if nl else [] # target class
109 | seen += 1
110 |
111 | if pred is None:
112 | if nl:
113 | stats.append((torch.zeros(0, niou, dtype=torch.bool), torch.Tensor(), torch.Tensor(), tcls))
114 | continue
115 |
116 | # Append to text file
117 | # with open('test.txt', 'a') as file:
118 | # [file.write('%11.5g' * 7 % tuple(x) + '\n') for x in pred]
119 |
120 | # Clip boxes to image bounds
121 | clip_coords(pred, (height, width))
122 |
123 | # Append to pycocotools JSON dictionary
124 | if save_json:
125 | # [{"image_id": 42, "category_id": 18, "bbox": [258.15, 41.29, 348.26, 243.78], "score": 0.236}, ...
126 | image_id = int(Path(paths[si]).stem.split('_')[-1])
127 | box = pred[:, :4].clone() # xyxy
128 | scale_coords(imgs[si].shape[1:], box, shapes[si][0], shapes[si][1]) # to original shape
129 | box = xyxy2xywh(box) # xywh
130 | box[:, :2] -= box[:, 2:] / 2 # xy center to top-left corner
131 | for p, b in zip(pred.tolist(), box.tolist()):
132 | jdict.append({'image_id': image_id,
133 | 'category_id': coco91class[int(p[5])],
134 | 'bbox': [round(x, 3) for x in b],
135 | 'score': round(p[4], 5)})
136 |
137 | # Assign all predictions as incorrect
138 | correct = torch.zeros(pred.shape[0], niou, dtype=torch.bool, device=device)
139 | if nl:
140 | detected = [] # target indices
141 | tcls_tensor = labels[:, 0]
142 |
143 | # target boxes
144 | tbox = xywh2xyxy(labels[:, 1:5]) * whwh
145 |
146 | # Per target class
147 | for cls in torch.unique(tcls_tensor):
148 |                 ti = (cls == tcls_tensor).nonzero().view(-1)  # target indices
149 |                 pi = (cls == pred[:, 5]).nonzero().view(-1)  # prediction indices
150 |
151 | # Search for detections
152 | if pi.shape[0]:
153 | # Prediction to target ious
154 | ious, i = box_iou(pred[pi, :4], tbox[ti]).max(1) # best ious, indices
155 |
156 | # Append detections
157 | for j in (ious > iouv[0]).nonzero():
158 | d = ti[i[j]] # detected target
159 | if d not in detected:
160 | detected.append(d)
161 | correct[pi[j]] = ious[j] > iouv # iou_thres is 1xn
162 | if len(detected) == nl: # all targets already located in image
163 | break
164 |
165 | # Append statistics (correct, conf, pcls, tcls)
166 | stats.append((correct.cpu(), pred[:, 4].cpu(), pred[:, 5].cpu(), tcls))
167 |
168 | # Plot images
169 | if batch_i < 1:
170 | f = 'test_batch%g_gt.jpg' % batch_i # filename
171 | plot_images(imgs, targets, paths, f, names) # ground truth
172 | f = 'test_batch%g_pred.jpg' % batch_i
173 | plot_images(imgs, output_to_target(output, width, height), paths, f, names) # predictions
174 |
175 | # Compute statistics
176 | stats = [np.concatenate(x, 0) for x in zip(*stats)] # to numpy
177 | if len(stats):
178 | p, r, ap, f1, ap_class = ap_per_class(*stats)
179 | p, r, ap50, ap = p[:, 0], r[:, 0], ap[:, 0], ap.mean(1) # [P, R, AP@0.5, AP@0.5:0.95]
180 | mp, mr, map50, map = p.mean(), r.mean(), ap50.mean(), ap.mean()
181 | nt = np.bincount(stats[3].astype(np.int64), minlength=nc) # number of targets per class
182 | else:
183 | nt = torch.zeros(1)
184 |
185 | # Print results
186 | pf = '%20s' + '%12.3g' * 6 # print format
187 | print(pf % ('all', seen, nt.sum(), mp, mr, map50, map))
188 |
189 | # Print results per class
190 | if verbose and nc > 1 and len(stats):
191 | for i, c in enumerate(ap_class):
192 | print(pf % (names[c], seen, nt[c], p[i], r[i], ap50[i], ap[i]))
193 |
194 | # Print speeds
195 | t = tuple(x / seen * 1E3 for x in (t0, t1, t0 + t1)) + (imgsz, imgsz, batch_size) # tuple
196 | if not training:
197 | print('Speed: %.1f/%.1f/%.1f ms inference/NMS/total per %gx%g image at batch-size %g' % t)
198 |
199 | # Save JSON
200 | if save_json and map50 and len(jdict):
201 | imgIds = [int(Path(x).stem.split('_')[-1]) for x in dataloader.dataset.img_files]
202 | f = 'detections_val2017_%s_results.json' % \
203 | (weights.split(os.sep)[-1].replace('.pt', '') if weights else '') # filename
204 | print('\nCOCO mAP with pycocotools... saving %s...' % f)
205 | with open(f, 'w') as file:
206 | json.dump(jdict, file)
207 |
208 | try:
209 | from pycocotools.coco import COCO
210 | from pycocotools.cocoeval import COCOeval
211 |
212 | # https://github.com/cocodataset/cocoapi/blob/master/PythonAPI/pycocoEvalDemo.ipynb
213 | cocoGt = COCO(glob.glob('../coco/annotations/instances_val*.json')[0]) # initialize COCO ground truth api
214 | cocoDt = cocoGt.loadRes(f) # initialize COCO pred api
215 |
216 | cocoEval = COCOeval(cocoGt, cocoDt, 'bbox')
217 | cocoEval.params.imgIds = imgIds # [:32] # only evaluate these images
218 | cocoEval.evaluate()
219 | cocoEval.accumulate()
220 | cocoEval.summarize()
221 | map, map50 = cocoEval.stats[:2] # update to pycocotools results (mAP@0.5:0.95, mAP@0.5)
222 | except:
223 | print('WARNING: pycocotools must be installed with numpy==1.17 to run correctly. '
224 | 'See https://github.com/cocodataset/cocoapi/issues/356')
225 |
226 | # Return results
227 | maps = np.zeros(nc) + map
228 | for i, c in enumerate(ap_class):
229 | maps[c] = ap[i]
230 | return (mp, mr, map50, map, *(loss.cpu() / len(dataloader)).tolist()), maps, t
231 |
232 |
233 | if __name__ == '__main__':
234 | parser = argparse.ArgumentParser(prog='test.py')
235 | parser.add_argument('--weights', type=str, default='weights/yolov5s.pt', help='model.pt path')
236 | parser.add_argument('--data', type=str, default='data/coco.yaml', help='*.data path')
237 | parser.add_argument('--batch-size', type=int, default=32, help='size of each image batch')
238 | parser.add_argument('--img-size', type=int, default=640, help='inference size (pixels)')
239 | parser.add_argument('--conf-thres', type=float, default=0.001, help='object confidence threshold')
240 | parser.add_argument('--iou-thres', type=float, default=0.65, help='IOU threshold for NMS')
241 | parser.add_argument('--save-json', action='store_true', help='save a cocoapi-compatible JSON results file')
242 | parser.add_argument('--task', default='val', help="'val', 'test', 'study'")
243 | parser.add_argument('--device', default='', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')
244 | parser.add_argument('--single-cls', action='store_true', help='treat as single-class dataset')
245 | parser.add_argument('--augment', action='store_true', help='augmented inference')
246 | parser.add_argument('--verbose', action='store_true', help='report mAP by class')
247 | opt = parser.parse_args()
248 | opt.save_json = opt.save_json or opt.data.endswith('coco.yaml')
249 | opt.data = glob.glob('./**/' + opt.data, recursive=True)[0] # find file
250 | print(opt)
251 |
252 | # task = 'val', 'test', 'study'
253 | if opt.task in ['val', 'test']: # (default) run normally
254 | test(opt.data,
255 | opt.weights,
256 | opt.batch_size,
257 | opt.img_size,
258 | opt.conf_thres,
259 | opt.iou_thres,
260 | opt.save_json,
261 | opt.single_cls,
262 | opt.augment)
263 |
264 | elif opt.task == 'study': # run over a range of settings and save/plot
265 | for weights in ['yolov5s.pt', 'yolov5m.pt', 'yolov5l.pt', 'yolov5x.pt']:
266 | f = 'study_%s_%s.txt' % (Path(opt.data).stem, Path(weights).stem) # filename to save to
267 | x = list(range(288, 896, 64)) # x axis
268 | y = [] # y axis
269 | for i in x: # img-size
270 | print('\nRunning %s point %s...' % (f, i))
271 | r, _, t = test(opt.data, weights, opt.batch_size, i, opt.conf_thres, opt.iou_thres, opt.save_json)
272 | y.append(r + t) # results and times
273 | np.savetxt(f, y, fmt='%10.4g') # save
274 | os.system('zip -r study.zip study_*.txt')
275 | # plot_study_txt(f, x) # plot
276 |
--------------------------------------------------------------------------------
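One detail worth spelling out from the matching loop in test.py: `correct` is not a single flag per prediction but a row of booleans over the ten IoU thresholds in `iouv` (0.50 to 0.95), which is what lets one pass report both mAP@0.5 and mAP@0.5:0.95. A small numeric sketch, assuming a prediction matched a target with IoU 0.62:

```python
import torch

iouv = torch.linspace(0.5, 0.95, 10)        # same IoU vector as in test(): 0.50, 0.55, ..., 0.95
iou_with_matched_target = torch.tensor(0.62)

correct_row = iou_with_matched_target > iouv  # what correct[pi[j]] = ious[j] > iouv stores
print(iouv)
print(correct_row)  # True at 0.50/0.55/0.60, False from 0.65 upward
```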
/train.py:
--------------------------------------------------------------------------------
1 | import argparse
2 | import torch.distributed as dist
3 | import torch.nn.functional as F
4 | import torch.optim as optim
5 | import torch.optim.lr_scheduler as lr_scheduler
6 | import yaml
7 | from torch.utils.tensorboard import SummaryWriter
8 | import test # import test.py to get mAP after each epoch
9 | from models.yolo import Model
10 | from utils.datasets import *
11 | from utils.utils import *
12 | mixed_precision = True
13 | try: # Mixed precision training https://github.com/NVIDIA/apex
14 | from apex import amp
15 | except:
16 | print('Apex recommended for faster mixed precision training: https://github.com/NVIDIA/apex')
17 | mixed_precision = False # not installed
18 | wdir = 'weights' + os.sep # weights dir
19 | last = wdir + 'last.pt'
20 | best = wdir + 'best.pt'
21 | results_file = 'results.txt'
22 | # Hyperparameters
23 | hyp = {'lr0': 0.01, # initial learning rate (SGD=1E-2, Adam=1E-3)
24 | 'momentum': 0.937, # SGD momentum
25 | 'weight_decay': 5e-4, # optimizer weight decay
26 | 'giou': 0.05, # giou loss gain
27 | 'cls': 0.58, # cls loss gain
28 | 'cls_pw': 1.0, # cls BCELoss positive_weight
29 | 'obj': 1.0, # obj loss gain (*=img_size/320 if img_size != 320)
30 | 'obj_pw': 1.0, # obj BCELoss positive_weight
31 | 'iou_t': 0.20, # iou training threshold
32 | 'anchor_t': 4.0, # anchor-multiple threshold
33 | 'fl_gamma': 0, # focal loss gamma (efficientDet default is gamma=1.5)
34 | 'hsv_h': 0.014, # image HSV-Hue augmentation (fraction)
35 | 'hsv_s': 0.68, # image HSV-Saturation augmentation (fraction)
36 | 'hsv_v': 0.36, # image HSV-Value augmentation (fraction)
37 | 'degrees': 0.0, # image rotation (+/- deg)
38 | 'translate': 0.0, # image translation (+/- fraction)
39 | 'scale': 0.5, # image scale (+/- gain)
40 | 'shear': 0.0} # image shear (+/- deg)
41 | print(hyp)
42 |
43 | # Overwrite hyp with hyp*.txt (optional)
44 | f = glob.glob('hyp*.txt')
45 | if f:
46 | print('Using %s' % f[0])
47 | for k, v in zip(hyp.keys(), np.loadtxt(f[0])):
48 | hyp[k] = v
49 |
50 | # Print focal loss if gamma > 0
51 | if hyp['fl_gamma']:
52 | print('Using FocalLoss(gamma=%g)' % hyp['fl_gamma'])
53 |
54 | def train(hyp):
55 | epochs = opt.epochs # 300
56 | batch_size = opt.batch_size # 64
57 | weights = opt.weights # initial training weights
58 |
59 | # Configure
60 | init_seeds(1)
61 | with open(opt.data,'r',encoding='UTF-8') as f:
62 | data_dict = yaml.load(f, Loader=yaml.FullLoader) # model dict
63 | train_path = data_dict['train']
64 | test_path = data_dict['val']
65 | nc = 1 if opt.single_cls else int(data_dict['nc']) # number of classes
66 |
67 | # Remove previous results
68 | for f in glob.glob('*_batch*.jpg') + glob.glob(results_file):
69 | os.remove(f)
70 |
71 | # Create model
72 | model = Model(opt.cfg).to(device)
73 | assert model.md['nc'] == nc, '%s nc=%g classes but %s nc=%g classes' % (opt.data, nc, opt.cfg, model.md['nc'])
74 |
75 | # Image sizes
76 | gs = int(max(model.stride)) # grid size (max stride)
77 | if any(x % gs != 0 for x in opt.img_size):
78 | print('WARNING: --img-size %g,%g must be multiple of %s max stride %g' % (*opt.img_size, opt.cfg, gs))
79 | imgsz, imgsz_test = [make_divisible(x, gs) for x in opt.img_size] # image sizes (train, test)
80 |
81 | # Optimizer
82 | nbs = 64 # nominal batch size
83 | accumulate = max(round(nbs / batch_size), 1) # accumulate loss before optimizing
84 | hyp['weight_decay'] *= batch_size * accumulate / nbs # scale weight_decay
85 | pg0, pg1, pg2 = [], [], [] # optimizer parameter groups
86 | for k, v in model.named_parameters():
87 | if v.requires_grad:
88 | if '.bias' in k:
89 | pg2.append(v) # biases
90 | elif '.weight' in k and '.bn' not in k:
91 | pg1.append(v) # apply weight decay
92 | else:
93 | pg0.append(v) # all else
94 |
95 | optimizer = optim.Adam(pg0, lr=hyp['lr0']) if opt.adam else \
96 | optim.SGD(pg0, lr=hyp['lr0'], momentum=hyp['momentum'], nesterov=True)
97 | optimizer.add_param_group({'params': pg1, 'weight_decay': hyp['weight_decay']}) # add pg1 with weight_decay
98 | optimizer.add_param_group({'params': pg2}) # add pg2 (biases)
99 | print('Optimizer groups: %g .bias, %g conv.weight, %g other' % (len(pg2), len(pg1), len(pg0)))
100 | del pg0, pg1, pg2
101 |
102 | # Load Model
103 | google_utils.attempt_download(weights)
104 | start_epoch, best_fitness = 0, 0.0
105 | if weights.endswith('.pt'): # pytorch format
106 | ckpt = torch.load(weights, map_location=device) # load checkpoint
107 |
108 | # load model
109 | try:
110 | ckpt['model'] = \
111 | {k: v for k, v in ckpt['model'].state_dict().items() if model.state_dict()[k].numel() == v.numel()}
112 | model.load_state_dict(ckpt['model'], strict=False)
113 | except KeyError as e:
114 | s = "%s is not compatible with %s. Specify --weights '' or specify a --cfg compatible with %s." \
115 | % (opt.weights, opt.cfg, opt.weights)
116 | raise KeyError(s) from e
117 |
118 | # load optimizer
119 | if ckpt['optimizer'] is not None:
120 | optimizer.load_state_dict(ckpt['optimizer'])
121 | best_fitness = ckpt['best_fitness']
122 |
123 | # load results
124 | if ckpt.get('training_results') is not None:
125 | with open(results_file, 'w') as file:
126 | file.write(ckpt['training_results']) # write results.txt
127 |
128 | start_epoch = ckpt['epoch'] + 1
129 | del ckpt
130 |
131 | if mixed_precision:
132 | model, optimizer = amp.initialize(model, optimizer, opt_level='O1', verbosity=0)
133 |
134 | lf = lambda x: (((1 + math.cos(x * math.pi / epochs)) / 2) ** 1.0) * 0.9 + 0.1 # cosine
135 | scheduler = lr_scheduler.LambdaLR(optimizer, lr_lambda=lf)
136 | scheduler.last_epoch = start_epoch - 1 # do not move
137 |
138 | # Initialize distributed training
139 | if device.type != 'cpu' and torch.cuda.device_count() > 1 and torch.distributed.is_available():
140 | dist.init_process_group(backend='nccl', # distributed backend
141 | init_method='tcp://127.0.0.1:9999', # init method
142 | world_size=1, # number of nodes
143 | rank=0) # node rank
144 | model = torch.nn.parallel.DistributedDataParallel(model)
145 |
146 | # Dataset
147 | dataset = LoadImagesAndLabels(train_path, imgsz, batch_size,
148 | augment=True,
149 | hyp=hyp, # augmentation hyperparameters
150 | rect=opt.rect, # rectangular training
151 | cache_images=opt.cache_images,
152 | single_cls=opt.single_cls)
153 | mlc = np.concatenate(dataset.labels, 0)[:, 0].max() # max label class
154 | assert mlc < nc, 'Label class %g exceeds nc=%g in %s. Correct your labels or your model.' % (mlc, nc, opt.cfg)
155 |
156 | # Dataloader
157 | batch_size = min(batch_size, len(dataset))
158 | nw = min([os.cpu_count(), batch_size if batch_size > 1 else 0, 8]) # number of workers
159 | nw = 0
160 | dataloader = torch.utils.data.DataLoader(dataset,
161 | batch_size=batch_size,
162 | num_workers=nw,
163 | shuffle=not opt.rect, # Shuffle=True unless rectangular training is used
164 | pin_memory=True,
165 | collate_fn=dataset.collate_fn)
166 |
167 | # Testloader
168 | testloader = torch.utils.data.DataLoader(LoadImagesAndLabels(test_path, imgsz_test, batch_size,
169 | hyp=hyp,
170 | rect=True,
171 | cache_images=opt.cache_images,
172 | single_cls=opt.single_cls),
173 | batch_size=batch_size,
174 | num_workers=nw,
175 | pin_memory=True,
176 | collate_fn=dataset.collate_fn)
177 |
178 | # Model parameters
179 | hyp['cls'] *= nc / 80. # scale coco-tuned hyp['cls'] to current dataset
180 | model.nc = nc # attach number of classes to model
181 | model.hyp = hyp # attach hyperparameters to model
182 | model.gr = 1.0 # giou loss ratio (obj_loss = 1.0 or giou)
183 | model.class_weights = labels_to_class_weights(dataset.labels, nc).to(device) # attach class weights
184 | model.names = data_dict['names']
185 |
186 | # class frequency
187 | labels = np.concatenate(dataset.labels, 0)
188 | c = torch.tensor(labels[:, 0]) # classes
189 | tb_writer.add_histogram('classes', c, 0)
190 |
191 | # Exponential moving average
192 | ema = torch_utils.ModelEMA(model)
193 |
194 | # Start training
195 | t0 = time.time()
196 | nb = len(dataloader) # number of batches
197 | n_burn = max(3 * nb, 1e3) # burn-in iterations, max(3 epochs, 1k iterations)
198 | maps = np.zeros(nc) # mAP per class
199 | results = (0, 0, 0, 0, 0, 0, 0) # 'P', 'R', 'mAP', 'F1', 'val GIoU', 'val Objectness', 'val Classification'
200 | print('Image sizes %g train, %g test' % (imgsz, imgsz_test))
201 | print('Using %g dataloader workers' % nw)
202 | print('Starting training for %g epochs...' % epochs)
203 | # torch.autograd.set_detect_anomaly(True)
204 | for epoch in range(start_epoch, epochs): # epoch ------------------------------------------------------------------
205 | model.train()
206 |
207 | # Update image weights (optional)
208 | if dataset.image_weights:
209 | w = model.class_weights.cpu().numpy() * (1 - maps) ** 2 # class weights
210 | image_weights = labels_to_image_weights(dataset.labels, nc=nc, class_weights=w)
211 | dataset.indices = random.choices(range(dataset.n), weights=image_weights, k=dataset.n) # rand weighted idx
212 |
213 | mloss = torch.zeros(4, device=device) # mean losses
214 | print(('\n' + '%10s' * 8) % ('Epoch', 'gpu_mem', 'GIoU', 'obj', 'cls', 'total', 'targets', 'img_size'))
215 | try:
216 | pbar = tqdm(enumerate(dataloader), total=nb) # progress bar
217 | for i, (imgs, targets, paths, _) in pbar: # batch -------------------------------------------------------------
218 | ni = i + nb * epoch # number integrated batches (since train start)
219 | imgs = imgs.to(device).float() / 255.0 # uint8 to float32, 0 - 255 to 0.0 - 1.0
220 |
221 | # Burn-in
222 | if ni <= n_burn:
223 | xi = [0, n_burn] # x interp
224 | # model.gr = np.interp(ni, xi, [0.0, 1.0]) # giou loss ratio (obj_loss = 1.0 or giou)
225 | accumulate = max(1, np.interp(ni, xi, [1, nbs / batch_size]).round())
226 | for j, x in enumerate(optimizer.param_groups):
227 | # bias lr falls from 0.1 to lr0, all other lrs rise from 0.0 to lr0
228 | x['lr'] = np.interp(ni, xi, [0.1 if j == 2 else 0.0, x['initial_lr'] * lf(epoch)])
229 | if 'momentum' in x:
230 | x['momentum'] = np.interp(ni, xi, [0.9, hyp['momentum']])
231 |
232 | # Multi-scale
233 | if opt.multi_scale:
234 | sz = random.randrange(imgsz * 0.5, imgsz * 1.5 + gs) // gs * gs # size
235 | sf = sz / max(imgs.shape[2:]) # scale factor
236 | if sf != 1:
237 | ns = [math.ceil(x * sf / gs) * gs for x in imgs.shape[2:]] # new shape (stretched to gs-multiple)
238 | imgs = F.interpolate(imgs, size=ns, mode='bilinear', align_corners=False)
239 |
240 | # Forward
241 | pred = model(imgs)
242 |
243 | # Loss
244 | loss, loss_items = compute_loss(pred, targets.to(device), model)
245 | if not torch.isfinite(loss):
246 | print('WARNING: non-finite loss, ending training ', loss_items)
247 | return results
248 |
249 | # Backward
250 | if mixed_precision:
251 | with amp.scale_loss(loss, optimizer) as scaled_loss:
252 | scaled_loss.backward()
253 | else:
254 | loss.backward()
255 |
256 | # Optimize
257 | if ni % accumulate == 0:
258 | optimizer.step()
259 | optimizer.zero_grad()
260 | ema.update(model)
261 |
262 | # Print
263 | mloss = (mloss * i + loss_items) / (i + 1) # update mean losses
264 | mem = '%.3gG' % (torch.cuda.memory_cached() / 1E9 if torch.cuda.is_available() else 0) # (GB)
265 | s = ('%10s' * 2 + '%10.4g' * 6) % (
266 | '%g/%g' % (epoch, epochs - 1), mem, *mloss, targets.shape[0], imgs.shape[-1])
267 | pbar.set_description(s)
268 |
269 | # Plot
270 | if ni < 3:
271 | f = 'train_batch%g.jpg' % i # filename
272 | res = plot_images(images=imgs, targets=targets, paths=paths, fname=f)
273 | if tb_writer:
274 | tb_writer.add_image(f, res, dataformats='HWC', global_step=epoch)
275 | # tb_writer.add_graph(model, imgs) # add model to tensorboard
276 | # end batch ------------------------------------------------------------------------------------------------
277 |         except Exception as e:  # keep the epoch running, but report the failure instead of silently swallowing it
278 |             print('WARNING: exception in batch loop: %s' % e)
279 | # Scheduler
280 | scheduler.step()
281 |
282 | torch.cuda.empty_cache()
283 | # mAP
284 | ema.update_attr(model)
285 | final_epoch = epoch + 1 == epochs
286 | if not opt.notest or final_epoch: # Calculate mAP
287 | results, maps, times = test.test(opt.data,
288 | batch_size=batch_size,
289 | imgsz=imgsz_test,
290 | save_json=final_epoch and opt.data.endswith(os.sep + 'coco.yaml'),
291 | model=ema.ema,
292 | single_cls=opt.single_cls,
293 | dataloader=testloader,
294 | fast=ni < n_burn)
295 |
296 | # Write
297 | with open(results_file, 'a') as f:
298 | f.write(s + '%10.4g' * 7 % results + '\n') # P, R, mAP, F1, test_losses=(GIoU, obj, cls)
299 | if len(opt.name) and opt.bucket:
300 | os.system('gsutil cp results.txt gs://%s/results/results%s.txt' % (opt.bucket, opt.name))
301 |
302 | # Tensorboard
303 | if tb_writer:
304 | tags = ['train/giou_loss', 'train/obj_loss', 'train/cls_loss',
305 | 'metrics/precision', 'metrics/recall', 'metrics/mAP_0.5', 'metrics/F1',
306 | 'val/giou_loss', 'val/obj_loss', 'val/cls_loss']
307 | for x, tag in zip(list(mloss[:-1]) + list(results), tags):
308 | tb_writer.add_scalar(tag, x, epoch)
309 |
310 | # Update best mAP
311 | fi = fitness(np.array(results).reshape(1, -1)) # fitness_i = weighted combination of [P, R, mAP, F1]
312 | if fi > best_fitness:
313 | best_fitness = fi
314 |
315 | # Save model
316 | save = (not opt.nosave) or (final_epoch and not opt.evolve)
317 | if save:
318 | with open(results_file, 'r') as f: # create checkpoint
319 | ckpt = {'epoch': epoch,
320 | 'best_fitness': best_fitness,
321 | 'training_results': f.read(),
322 | 'model': ema.ema.module if hasattr(model, 'module') else ema.ema,
323 | 'optimizer': None if final_epoch else optimizer.state_dict()}
324 |
325 | # Save last, best and delete
326 | torch.save(ckpt, last)
327 | if (best_fitness == fi) and not final_epoch:
328 | torch.save(ckpt, best)
329 | del ckpt
330 |
331 | # end epoch ----------------------------------------------------------------------------------------------------
332 | # end training
333 |
334 | n = opt.name
335 | if len(n):
336 | n = '_' + n if not n.isnumeric() else n
337 | fresults, flast, fbest = 'results%s.txt' % n, wdir + 'last%s.pt' % n, wdir + 'best%s.pt' % n
338 | for f1, f2 in zip([wdir + 'last.pt', wdir + 'best.pt', 'results.txt'], [flast, fbest, fresults]):
339 | if os.path.exists(f1):
340 | os.rename(f1, f2) # rename
341 | ispt = f2.endswith('.pt') # is *.pt
342 | strip_optimizer(f2) if ispt else None # strip optimizer
343 | os.system('gsutil cp %s gs://%s/weights' % (f2, opt.bucket)) if opt.bucket and ispt else None # upload
344 |
345 | if not opt.evolve:
346 | # plot_results() # save as results.png
347 | pass
348 | print('%g epochs completed in %.3f hours.\n' % (epoch - start_epoch + 1, (time.time() - t0) / 3600))
349 | dist.destroy_process_group() if torch.cuda.device_count() > 1 else None
350 | torch.cuda.empty_cache()
351 | return results
352 |
353 | if __name__ == '__main__':
354 | parser = argparse.ArgumentParser()
355 | parser.add_argument('--epochs', type=int, default=300)
356 | parser.add_argument('--batch-size', type=int, default=1)
357 | parser.add_argument('--cfg', type=str, default='./config/yolov5l.yaml', help='*.cfg path')
358 | parser.add_argument('--data', type=str, default='./config/score.yaml', help='*.data path')
359 | parser.add_argument('--img-size', nargs='+', type=int, default=[640, 640], help='train,test sizes')
360 | parser.add_argument('--rect', action='store_true', help='rectangular training')
361 | parser.add_argument('--resume', action='store_true', help='resume training from last.pt')
362 | parser.add_argument('--nosave', action='store_true', help='only save final checkpoint')
363 | parser.add_argument('--notest', action='store_true', help='only test final epoch')
364 | parser.add_argument('--evolve', action='store_true', help='evolve hyperparameters')
365 | parser.add_argument('--bucket', type=str, default='', help='gsutil bucket')
366 | parser.add_argument('--cache-images', action='store_true', help='cache images for faster training')
367 | parser.add_argument('--weights', type=str, default='', help='initial weights path')
368 | parser.add_argument('--name', default='', help='renames results.txt to results_name.txt if supplied')
369 | parser.add_argument('--device', default='', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')
370 | parser.add_argument('--adam', action='store_true', help='use adam optimizer')
371 |     parser.add_argument('--multi-scale', action='store_true', help='vary img-size +/- 50%%')
372 | parser.add_argument('--single-cls', action='store_true', help='train as single-class dataset')
373 | opt = parser.parse_args()
374 | opt.weights = last if opt.resume else opt.weights
375 | print(opt)
376 | opt.img_size.extend([opt.img_size[-1]] * (2 - len(opt.img_size))) # extend to 2 sizes (train, test)
377 | device = torch_utils.select_device(opt.device, apex=mixed_precision, batch_size=opt.batch_size)
378 | # check_git_status()
379 | if device.type == 'cpu':
380 | mixed_precision = False
381 | # Train
382 | # if not opt.evolve:
383 | tb_writer = SummaryWriter(comment=opt.name)
384 | print('Start Tensorboard with "tensorboard --logdir=runs", view at http://localhost:6006/')
385 | train(hyp)
386 | # Evolve hyperparameters (optional)
387 | # else:
388 | # tb_writer = None
389 | # opt.notest, opt.nosave = True, True # only test/save final epoch
390 | # if opt.bucket:
391 | # os.system('gsutil cp gs://%s/evolve.txt .' % opt.bucket) # download evolve.txt if exists
392 | # for _ in range(10): # generations to evolve
393 | # if os.path.exists('evolve.txt'): # if evolve.txt exists: select best hyps and mutate
394 | # # Select parent(s)
395 | # parent = 'single' # parent selection method: 'single' or 'weighted'
396 | # x = np.loadtxt('evolve.txt', ndmin=2)
397 | # n = min(5, len(x)) # number of previous results to consider
398 | # x = x[np.argsort(-fitness(x))][:n] # top n mutations
399 | # w = fitness(x) - fitness(x).min() # weights
400 | # if parent == 'single' or len(x) == 1:
401 | # # x = x[random.randint(0, n - 1)] # random selection
402 | # x = x[random.choices(range(n), weights=w)[0]] # weighted selection
403 | # elif parent == 'weighted':
404 | # x = (x * w.reshape(n, 1)).sum(0) / w.sum() # weighted combination
405 |
406 | # # Mutate
407 | # mp, s = 0.9, 0.2 # mutation probability, sigma
408 | # npr = np.random
409 | # npr.seed(int(time.time()))
410 | # g = np.array([1, 1, 1, 1, 1, 1, 1, 0, .1, 1, 0, 1, 1, 1, 1, 1, 1, 1]) # gains
411 | # ng = len(g)
412 | # v = np.ones(ng)
413 | # while all(v == 1): # mutate until a change occurs (prevent duplicates)
414 | # v = (g * (npr.random(ng) < mp) * npr.randn(ng) * npr.random() * s + 1).clip(0.3, 3.0)
415 | # for i, k in enumerate(hyp.keys()): # plt.hist(v.ravel(), 300)
416 | # hyp[k] = x[i + 7] * v[i] # mutate
417 |
418 | # # Clip to limits
419 | # keys = ['lr0', 'iou_t', 'momentum', 'weight_decay', 'hsv_s', 'hsv_v', 'translate', 'scale', 'fl_gamma']
420 | # limits = [(1e-5, 1e-2), (0.00, 0.70), (0.60, 0.98), (0, 0.001), (0, .9), (0, .9), (0, .9), (0, .9), (0, 3)]
421 | # for k, v in zip(keys, limits):
422 | # hyp[k] = np.clip(hyp[k], v[0], v[1])
423 | # # Train mutation
424 | # results = train(hyp.copy())
425 | # # Write mutation results
426 | # print_mutation(hyp, results, opt.bucket)
427 | # # Plot results
428 | # # plot_evolution_results(hyp)
429 |
--------------------------------------------------------------------------------
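The learning-rate schedule in train() is the cosine lambda `lf(x) = ((1 + cos(x·π/epochs))/2)·0.9 + 0.1`: LambdaLR multiplies `lr0` by a factor that decays smoothly from 1.0 at epoch 0 to 0.1 at the final epoch. A quick check of its shape with the default 300 epochs:

```python
import math

epochs = 300
lf = lambda x: (((1 + math.cos(x * math.pi / epochs)) / 2) ** 1.0) * 0.9 + 0.1  # same lambda as train.py

for e in (0, 75, 150, 225, 299):
    print(e, round(lf(e), 3))
# 0 -> 1.0, 75 -> ~0.868, 150 -> 0.55, 225 -> ~0.232, 299 -> ~0.1
```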
/utils/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Transformer-man/yolov5-flask/36573a0b6e91d5a91f3394af278f5a5e768efae7/utils/__init__.py
--------------------------------------------------------------------------------
/utils/activations.py:
--------------------------------------------------------------------------------
1 | import torch
2 | import torch.nn.functional as F  # hardtanh/softplus used below live in torch.nn.functional
3 | import torch.nn as nn
4 |
5 |
6 | # Swish ------------------------------------------------------------------------
7 | class SwishImplementation(torch.autograd.Function):
8 | @staticmethod
9 | def forward(ctx, x):
10 | ctx.save_for_backward(x)
11 | return x * torch.sigmoid(x)
12 |
13 | @staticmethod
14 | def backward(ctx, grad_output):
15 | x = ctx.saved_tensors[0]
16 | sx = torch.sigmoid(x)
17 | return grad_output * (sx * (1 + x * (1 - sx)))
18 |
19 |
20 | class MemoryEfficientSwish(nn.Module):
21 | @staticmethod
22 | def forward(x):
23 | return SwishImplementation.apply(x)
24 |
25 |
26 | class HardSwish(nn.Module): # https://arxiv.org/pdf/1905.02244.pdf
27 | @staticmethod
28 | def forward(x):
29 | return x * F.hardtanh(x + 3, 0., 6., True) / 6.
30 |
31 |
32 | class Swish(nn.Module):
33 | @staticmethod
34 | def forward(x):
35 | return x * torch.sigmoid(x)
36 |
37 |
38 | # Mish ------------------------------------------------------------------------
39 | class MishImplementation(torch.autograd.Function):
40 | @staticmethod
41 | def forward(ctx, x):
42 | ctx.save_for_backward(x)
43 | return x.mul(torch.tanh(F.softplus(x))) # x * tanh(ln(1 + exp(x)))
44 |
45 | @staticmethod
46 | def backward(ctx, grad_output):
47 | x = ctx.saved_tensors[0]
48 | sx = torch.sigmoid(x)
49 | fx = F.softplus(x).tanh()
50 | return grad_output * (fx + x * sx * (1 - fx * fx))
51 |
52 |
53 | class MemoryEfficientMish(nn.Module):
54 | @staticmethod
55 | def forward(x):
56 | return MishImplementation.apply(x)
57 |
58 |
59 | class Mish(nn.Module): # https://github.com/digantamisra98/Mish
60 | @staticmethod
61 | def forward(x):
62 | return x * F.softplus(x).tanh()
63 |
--------------------------------------------------------------------------------
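The memory-efficient Swish above saves only `x` in forward and reconstructs the gradient analytically in backward: d/dx[x·σ(x)] = σ(x)·(1 + x·(1 − σ(x))). A small sanity check of that formula against autograd on the plain expression:

```python
import torch

def swish_grad_manual(x):
    # gradient used by SwishImplementation.backward: sigma(x) * (1 + x * (1 - sigma(x)))
    sx = torch.sigmoid(x)
    return sx * (1 + x * (1 - sx))

x = torch.randn(5, requires_grad=True)
(x * torch.sigmoid(x)).sum().backward()

print(torch.allclose(x.grad, swish_grad_manual(x.detach())))  # True
```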
/utils/datasets.py:
--------------------------------------------------------------------------------
1 | import glob
2 | import math
3 | import os
4 | import random
5 | import shutil
6 | import time
7 | from pathlib import Path
8 | from threading import Thread
9 |
10 | import cv2
11 | import numpy as np
12 | import torch
13 | from PIL import Image, ExifTags
14 | from torch.utils.data import Dataset
15 | from tqdm import tqdm
16 |
17 | from utils.utils import xyxy2xywh, xywh2xyxy
18 |
19 | help_url = 'https://github.com/ultralytics/yolov5/wiki/Train-Custom-Data'
20 | img_formats = ['.bmp', '.jpg', '.jpeg', '.png', '.tif', '.dng']
21 | vid_formats = ['.mov', '.avi', '.mp4']
22 |
23 | # Get orientation exif tag
24 | for orientation in ExifTags.TAGS.keys():
25 | if ExifTags.TAGS[orientation] == 'Orientation':
26 | break
27 |
28 |
29 | def exif_size(img):
30 | # Returns exif-corrected PIL size
31 | s = img.size # (width, height)
32 | try:
33 | rotation = dict(img._getexif().items())[orientation]
34 | if rotation == 6: # rotation 270
35 | s = (s[1], s[0])
36 | elif rotation == 8: # rotation 90
37 | s = (s[1], s[0])
38 | except:
39 | pass
40 |
41 | return s
42 |
43 |
44 | class LoadImages: # for inference
45 | def __init__(self, path, img_size=416):
46 | path = str(Path(path)) # os-agnostic
47 | files = []
48 | if os.path.isdir(path):
49 | files = sorted(glob.glob(os.path.join(path, '*.*')))
50 | elif os.path.isfile(path):
51 | files = [path]
52 |
53 | images = [x for x in files if os.path.splitext(x)[-1].lower() in img_formats]
54 | videos = [x for x in files if os.path.splitext(x)[-1].lower() in vid_formats]
55 | nI, nV = len(images), len(videos)
56 |
57 | self.img_size = img_size
58 | self.files = images + videos
59 | self.nF = nI + nV # number of files
60 | self.video_flag = [False] * nI + [True] * nV
61 | self.mode = 'images'
62 | if any(videos):
63 | self.new_video(videos[0]) # new video
64 | else:
65 | self.cap = None
66 | assert self.nF > 0, 'No images or videos found in ' + path
67 |
68 | def __iter__(self):
69 | self.count = 0
70 | return self
71 |
72 | def __next__(self):
73 | if self.count == self.nF:
74 | raise StopIteration
75 | path = self.files[self.count]
76 |
77 | if self.video_flag[self.count]:
78 | # Read video
79 | self.mode = 'video'
80 | ret_val, img0 = self.cap.read()
81 | if not ret_val:
82 | self.count += 1
83 | self.cap.release()
84 | if self.count == self.nF: # last video
85 | raise StopIteration
86 | else:
87 | path = self.files[self.count]
88 | self.new_video(path)
89 | ret_val, img0 = self.cap.read()
90 |
91 | self.frame += 1
92 | print('video %g/%g (%g/%g) %s: ' % (self.count + 1, self.nF, self.frame, self.nframes, path), end='')
93 |
94 | else:
95 | # Read image
96 | self.count += 1
97 | img0 = cv2.imread(path) # BGR
98 | assert img0 is not None, 'Image Not Found ' + path
99 | print('image %g/%g %s: ' % (self.count, self.nF, path), end='')
100 |
101 | # Padded resize
102 | img = letterbox(img0, new_shape=self.img_size)[0]
103 |
104 | # Convert
105 | img = img[:, :, ::-1].transpose(2, 0, 1) # BGR to RGB, to 3x416x416
106 | img = np.ascontiguousarray(img)
107 |
108 | # cv2.imwrite(path + '.letterbox.jpg', 255 * img.transpose((1, 2, 0))[:, :, ::-1]) # save letterbox image
109 | return path, img, img0, self.cap
110 |
111 | def new_video(self, path):
112 | self.frame = 0
113 | self.cap = cv2.VideoCapture(path)
114 | self.nframes = int(self.cap.get(cv2.CAP_PROP_FRAME_COUNT))
115 |
116 | def __len__(self):
117 | return self.nF # number of files
118 |
119 | # def LoadImages(img0): # for inference
120 | #
121 | # img = letterbox(img0, new_shape=640)[0]
122 | #
123 | # img = img[:, :, ::-1].transpose(2, 0, 1) # BGR to RGB, to 3x416x416
124 | # img = np.ascontiguousarray(img)
125 | #
126 | # return img, img0
127 |
128 |
129 |
130 | class LoadWebcam: # for inference
131 | def __init__(self, pipe=0, img_size=416):
132 | self.img_size = img_size
133 |
134 | if pipe == '0':
135 | pipe = 0 # local camera
136 | # pipe = 'rtsp://192.168.1.64/1' # IP camera
137 | # pipe = 'rtsp://username:password@192.168.1.64/1' # IP camera with login
138 | # pipe = 'rtsp://170.93.143.139/rtplive/470011e600ef003a004ee33696235daa' # IP traffic camera
139 | # pipe = 'http://wmccpinetop.axiscam.net/mjpg/video.mjpg' # IP golf camera
140 |
141 | # https://answers.opencv.org/question/215996/changing-gstreamer-pipeline-to-opencv-in-pythonsolved/
142 | # pipe = '"rtspsrc location="rtsp://username:password@192.168.1.64/1" latency=10 ! appsink' # GStreamer
143 |
144 | # https://answers.opencv.org/question/200787/video-acceleration-gstremer-pipeline-in-videocapture/
145 | # https://stackoverflow.com/questions/54095699/install-gstreamer-support-for-opencv-python-package # install help
146 | # pipe = "rtspsrc location=rtsp://root:root@192.168.0.91:554/axis-media/media.amp?videocodec=h264&resolution=3840x2160 protocols=GST_RTSP_LOWER_TRANS_TCP ! rtph264depay ! queue ! vaapih264dec ! videoconvert ! appsink" # GStreamer
147 |
148 | self.pipe = pipe
149 | self.cap = cv2.VideoCapture(pipe) # video capture object
150 | self.cap.set(cv2.CAP_PROP_BUFFERSIZE, 3) # set buffer size
151 |
152 | def __iter__(self):
153 | self.count = -1
154 | return self
155 |
156 | def __next__(self):
157 | self.count += 1
158 | if cv2.waitKey(1) == ord('q'): # q to quit
159 | self.cap.release()
160 | cv2.destroyAllWindows()
161 | raise StopIteration
162 |
163 | # Read frame
164 | if self.pipe == 0: # local camera
165 | ret_val, img0 = self.cap.read()
166 | img0 = cv2.flip(img0, 1) # flip left-right
167 | else: # IP camera
168 | n = 0
169 | while True:
170 | n += 1
171 | self.cap.grab()
172 | if n % 30 == 0: # skip frames
173 | ret_val, img0 = self.cap.retrieve()
174 | if ret_val:
175 | break
176 |
177 | # Print
178 | assert ret_val, 'Camera Error %s' % self.pipe
179 | img_path = 'webcam.jpg'
180 | print('webcam %g: ' % self.count, end='')
181 |
182 | # Padded resize
183 | img = letterbox(img0, new_shape=self.img_size)[0]
184 |
185 | # Convert
186 | img = img[:, :, ::-1].transpose(2, 0, 1) # BGR to RGB, to 3x416x416
187 | img = np.ascontiguousarray(img)
188 |
189 | return img_path, img, img0, None
190 |
191 | def __len__(self):
192 | return 0
193 |
194 |
195 | class LoadStreams: # multiple IP or RTSP cameras
196 | def __init__(self, sources='streams.txt', img_size=416):
197 | self.mode = 'images'
198 | self.img_size = img_size
199 |
200 | if os.path.isfile(sources):
201 | with open(sources, 'r') as f:
202 | sources = [x.strip() for x in f.read().splitlines() if len(x.strip())]
203 | else:
204 | sources = [sources]
205 |
206 | n = len(sources)
207 | self.imgs = [None] * n
208 | self.sources = sources
209 | for i, s in enumerate(sources):
210 | # Start the thread to read frames from the video stream
211 | print('%g/%g: %s... ' % (i + 1, n, s), end='')
212 | cap = cv2.VideoCapture(0 if s == '0' else s)
213 | assert cap.isOpened(), 'Failed to open %s' % s
214 | w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
215 | h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
216 | fps = cap.get(cv2.CAP_PROP_FPS) % 100
217 | _, self.imgs[i] = cap.read() # guarantee first frame
218 | thread = Thread(target=self.update, args=([i, cap]), daemon=True)
219 | print(' success (%gx%g at %.2f FPS).' % (w, h, fps))
220 | thread.start()
221 | print('') # newline
222 |
223 | # check for common shapes
224 | s = np.stack([letterbox(x, new_shape=self.img_size)[0].shape for x in self.imgs], 0) # inference shapes
225 | self.rect = np.unique(s, axis=0).shape[0] == 1 # rect inference if all shapes equal
226 | if not self.rect:
227 | print('WARNING: Different stream shapes detected. For optimal performance supply similarly-shaped streams.')
228 |
229 | def update(self, index, cap):
230 | # Read next stream frame in a daemon thread
231 | n = 0
232 | while cap.isOpened():
233 | n += 1
234 | # _, self.imgs[index] = cap.read()
235 | cap.grab()
236 | if n == 4: # read every 4th frame
237 | _, self.imgs[index] = cap.retrieve()
238 | n = 0
239 | time.sleep(0.01) # wait time
240 |
241 | def __iter__(self):
242 | self.count = -1
243 | return self
244 |
245 | def __next__(self):
246 | self.count += 1
247 | img0 = self.imgs.copy()
248 | if cv2.waitKey(1) == ord('q'): # q to quit
249 | cv2.destroyAllWindows()
250 | raise StopIteration
251 |
252 | # Letterbox
253 | img = [letterbox(x, new_shape=self.img_size, auto=self.rect)[0] for x in img0]
254 |
255 | # Stack
256 | img = np.stack(img, 0)
257 |
258 | # Convert
259 | img = img[:, :, :, ::-1].transpose(0, 3, 1, 2) # BGR to RGB, to bsx3x416x416
260 | img = np.ascontiguousarray(img)
261 |
262 | return self.sources, img, img0, None
263 |
264 | def __len__(self):
265 | return 0 # 1E12 frames = 32 streams at 30 FPS for 30 years
266 |
267 |
268 | class LoadImagesAndLabels(Dataset): # for training/testing
269 | def __init__(self, path, img_size=416, batch_size=16, augment=False, hyp=None, rect=False, image_weights=False,
270 | cache_images=False, single_cls=False, pad=0.0):
271 | try:
272 | path = str(Path(path)) # os-agnostic
273 | parent = str(Path(path).parent) + os.sep
274 | if os.path.isfile(path): # file
275 | with open(path, 'r') as f:
276 | f = f.read().splitlines()
277 | f = [x.replace('./', parent) if x.startswith('./') else x for x in f] # local to global path
278 | elif os.path.isdir(path): # folder
279 | f = glob.iglob(path + os.sep + '*.*')
280 | else:
281 | raise Exception('%s does not exist' % path)
282 | self.img_files = [x.replace('/', os.sep) for x in f if os.path.splitext(x)[-1].lower() in img_formats]
283 | except:
284 | raise Exception('Error loading data from %s. See %s' % (path, help_url))
285 |
286 | n = len(self.img_files)
287 | assert n > 0, 'No images found in %s. See %s' % (path, help_url)
288 |         bi = np.floor(np.arange(n) / batch_size).astype(int)  # batch index (np.int is removed in recent numpy)
289 | nb = bi[-1] + 1 # number of batches
290 |
291 | self.n = n # number of images
292 | self.batch = bi # batch index of image
293 | self.img_size = img_size
294 | self.augment = augment
295 | self.hyp = hyp
296 | self.image_weights = image_weights
297 | self.rect = False if image_weights else rect
298 | self.mosaic = self.augment and not self.rect # load 4 images at a time into a mosaic (only during training)
299 |
300 | # Define labels
301 | self.label_files = [x.replace('images', 'labels').replace(os.path.splitext(x)[-1], '.txt')
302 | for x in self.img_files]
303 |
304 | # Rectangular Training https://github.com/ultralytics/yolov3/issues/232
305 | if self.rect:
306 | # Read image shapes (wh)
307 | sp = path.replace('.txt', '') + '.shapes' # shapefile path
308 | try:
309 | with open(sp, 'r') as f: # read existing shapefile
310 | s = [x.split() for x in f.read().splitlines()]
311 | assert len(s) == n, 'Shapefile out of sync'
312 | except:
313 | s = [exif_size(Image.open(f)) for f in tqdm(self.img_files, desc='Reading image shapes')]
314 | np.savetxt(sp, s, fmt='%g') # overwrites existing (if any)
315 |
316 | # Sort by aspect ratio
317 | s = np.array(s, dtype=np.float64)
318 | ar = s[:, 1] / s[:, 0] # aspect ratio
319 | irect = ar.argsort()
320 | self.img_files = [self.img_files[i] for i in irect]
321 | self.label_files = [self.label_files[i] for i in irect]
322 | self.shapes = s[irect] # wh
323 | ar = ar[irect]
324 |
325 | # Set training image shapes
326 | shapes = [[1, 1]] * nb
327 | for i in range(nb):
328 | ari = ar[bi == i]
329 | mini, maxi = ari.min(), ari.max()
330 | if maxi < 1:
331 | shapes[i] = [maxi, 1]
332 | elif mini > 1:
333 | shapes[i] = [1, 1 / mini]
334 |
335 |             self.batch_shapes = np.ceil(np.array(shapes) * img_size / 32. + pad).astype(int) * 32
336 |
337 | # Cache labels
338 | self.imgs = [None] * n
339 | self.labels = [np.zeros((0, 5), dtype=np.float32)] * n
340 | create_datasubset, extract_bounding_boxes, labels_loaded = False, False, False
341 | nm, nf, ne, ns, nd = 0, 0, 0, 0, 0 # number missing, found, empty, datasubset, duplicate
342 | np_labels_path = str(Path(self.label_files[0]).parent) + '.npy' # saved labels in *.npy file
343 | if os.path.isfile(np_labels_path):
344 | s = np_labels_path # print string
345 | x = np.load(np_labels_path, allow_pickle=True)
346 | if len(x) == n:
347 | self.labels = x
348 | labels_loaded = True
349 | else:
350 | s = path.replace('images', 'labels')
351 |
352 | pbar = tqdm(self.label_files)
353 | for i, file in enumerate(pbar):
354 | if labels_loaded:
355 | l = self.labels[i]
356 | # np.savetxt(file, l, '%g') # save *.txt from *.npy file
357 | else:
358 | try:
359 | with open(file, 'r') as f:
360 | l = np.array([x.split() for x in f.read().splitlines()], dtype=np.float32)
361 | except:
362 | nm += 1 # print('missing labels for image %s' % self.img_files[i]) # file missing
363 | continue
364 |
365 | if l.shape[0]:
366 | assert l.shape[1] == 5, '> 5 label columns: %s' % file
367 | assert (l >= 0).all(), 'negative labels: %s' % file
368 | assert (l[:, 1:] <= 1).all(), 'non-normalized or out of bounds coordinate labels: %s' % file
369 | if np.unique(l, axis=0).shape[0] < l.shape[0]: # duplicate rows
370 | nd += 1 # print('WARNING: duplicate rows in %s' % self.label_files[i]) # duplicate rows
371 | if single_cls:
372 | l[:, 0] = 0 # force dataset into single-class mode
373 | self.labels[i] = l
374 | nf += 1 # file found
375 |
376 | # Create subdataset (a smaller dataset)
377 | if create_datasubset and ns < 1E4:
378 | if ns == 0:
379 | create_folder(path='./datasubset')
380 | os.makedirs('./datasubset/images')
381 | exclude_classes = 43
382 | if exclude_classes not in l[:, 0]:
383 | ns += 1
384 | # shutil.copy(src=self.img_files[i], dst='./datasubset/images/') # copy image
385 | with open('./datasubset/images.txt', 'a') as f:
386 | f.write(self.img_files[i] + '\n')
387 |
388 | # Extract object detection boxes for a second stage classifier
389 | if extract_bounding_boxes:
390 | p = Path(self.img_files[i])
391 | img = cv2.imread(str(p))
392 | h, w = img.shape[:2]
393 | for j, x in enumerate(l):
394 | f = '%s%sclassifier%s%g_%g_%s' % (p.parent.parent, os.sep, os.sep, x[0], j, p.name)
395 | if not os.path.exists(Path(f).parent):
396 | os.makedirs(Path(f).parent) # make new output folder
397 |
398 | b = x[1:] * [w, h, w, h] # box
399 | b[2:] = b[2:].max() # rectangle to square
400 | b[2:] = b[2:] * 1.3 + 30 # pad
401 | b = xywh2xyxy(b.reshape(-1, 4)).ravel().astype(np.int)
402 |
403 | b[[0, 2]] = np.clip(b[[0, 2]], 0, w) # clip boxes outside of image
404 | b[[1, 3]] = np.clip(b[[1, 3]], 0, h)
405 | assert cv2.imwrite(f, img[b[1]:b[3], b[0]:b[2]]), 'Failure extracting classifier boxes'
406 | else:
407 | ne += 1 # print('empty labels for image %s' % self.img_files[i]) # file empty
408 | # os.system("rm '%s' '%s'" % (self.img_files[i], self.label_files[i])) # remove
409 |
410 | pbar.desc = 'Caching labels %s (%g found, %g missing, %g empty, %g duplicate, for %g images)' % (
411 | s, nf, nm, ne, nd, n)
412 | assert nf > 0 or n == 20288, 'No labels found in %s. See %s' % (os.path.dirname(file) + os.sep, help_url)
413 | if not labels_loaded and n > 1000:
414 | print('Saving labels to %s for faster future loading' % np_labels_path)
415 | np.save(np_labels_path, self.labels) # save for next time
416 |
417 | # Cache images into memory for faster training (WARNING: large datasets may exceed system RAM)
418 | if cache_images: # if training
419 | gb = 0 # Gigabytes of cached images
420 | pbar = tqdm(range(len(self.img_files)), desc='Caching images')
421 | self.img_hw0, self.img_hw = [None] * n, [None] * n
422 | for i in pbar: # max 10k images
423 | self.imgs[i], self.img_hw0[i], self.img_hw[i] = load_image(self, i) # img, hw_original, hw_resized
424 | gb += self.imgs[i].nbytes
425 | pbar.desc = 'Caching images (%.1fGB)' % (gb / 1E9)
426 |
427 | # Detect corrupted images https://medium.com/joelthchao/programmatically-detect-corrupted-image-8c1b2006c3d3
428 | detect_corrupted_images = False
429 | if detect_corrupted_images:
430 | from skimage import io # conda install -c conda-forge scikit-image
431 | for file in tqdm(self.img_files, desc='Detecting corrupted images'):
432 | try:
433 | _ = io.imread(file)
434 | except:
435 | print('Corrupted image detected: %s' % file)
436 |
437 | def __len__(self):
438 | return len(self.img_files)
439 |
440 | # def __iter__(self):
441 | # self.count = -1
442 | # print('ran dataset iter')
443 | # #self.shuffled_vector = np.random.permutation(self.nF) if self.augment else np.arange(self.nF)
444 | # return self
445 |
446 | def __getitem__(self, index):
447 | if self.image_weights:
448 | index = self.indices[index]
449 |
450 | hyp = self.hyp
451 | if self.mosaic:
452 | # Load mosaic
453 | img, labels = load_mosaic(self, index)
454 | shapes = None
455 |
456 | else:
457 | # Load image
458 | img, (h0, w0), (h, w) = load_image(self, index)
459 |
460 | # Letterbox
461 | shape = self.batch_shapes[self.batch[index]] if self.rect else self.img_size # final letterboxed shape
462 | img, ratio, pad = letterbox(img, shape, auto=False, scaleup=self.augment)
463 | shapes = (h0, w0), ((h / h0, w / w0), pad) # for COCO mAP rescaling
464 |
465 | # Load labels
466 | labels = []
467 | x = self.labels[index]
468 | if x.size > 0:
469 | # Normalized xywh to pixel xyxy format
470 | labels = x.copy()
471 | labels[:, 1] = ratio[0] * w * (x[:, 1] - x[:, 3] / 2) + pad[0] # pad width
472 | labels[:, 2] = ratio[1] * h * (x[:, 2] - x[:, 4] / 2) + pad[1] # pad height
473 | labels[:, 3] = ratio[0] * w * (x[:, 1] + x[:, 3] / 2) + pad[0]
474 | labels[:, 4] = ratio[1] * h * (x[:, 2] + x[:, 4] / 2) + pad[1]
475 |
476 | if self.augment:
477 | # Augment imagespace
478 | if not self.mosaic:
479 | img, labels = random_affine(img, labels,
480 | degrees=hyp['degrees'],
481 | translate=hyp['translate'],
482 | scale=hyp['scale'],
483 | shear=hyp['shear'])
484 |
485 | # Augment colorspace
486 | augment_hsv(img, hgain=hyp['hsv_h'], sgain=hyp['hsv_s'], vgain=hyp['hsv_v'])
487 |
488 | # Apply cutouts
489 | # if random.random() < 0.9:
490 | # labels = cutout(img, labels)
491 |
492 | nL = len(labels) # number of labels
493 | if nL:
494 | # convert xyxy to xywh
495 | labels[:, 1:5] = xyxy2xywh(labels[:, 1:5])
496 |
497 | # Normalize coordinates 0 - 1
498 | labels[:, [2, 4]] /= img.shape[0] # height
499 | labels[:, [1, 3]] /= img.shape[1] # width
500 |
501 | if self.augment:
502 | # random left-right flip
503 | lr_flip = True
504 | if lr_flip and random.random() < 0.5:
505 | img = np.fliplr(img)
506 | if nL:
507 | labels[:, 1] = 1 - labels[:, 1]
508 |
509 | # random up-down flip
510 | ud_flip = False
511 | if ud_flip and random.random() < 0.5:
512 | img = np.flipud(img)
513 | if nL:
514 | labels[:, 2] = 1 - labels[:, 2]
515 |
516 | labels_out = torch.zeros((nL, 6))
517 | if nL:
518 | labels_out[:, 1:] = torch.from_numpy(labels)
519 |
520 | # Convert
521 | img = img[:, :, ::-1].transpose(2, 0, 1) # BGR to RGB, to 3x416x416
522 | img = np.ascontiguousarray(img)
523 |
524 | return torch.from_numpy(img), labels_out, self.img_files[index], shapes
525 |
526 | @staticmethod
527 | def collate_fn(batch):
528 | img, label, path, shapes = zip(*batch) # transposed
529 | for i, l in enumerate(label):
530 | l[:, 0] = i # add target image index for build_targets()
531 | return torch.stack(img, 0), torch.cat(label, 0), path, shapes
532 |
533 |
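# --- Editor's usage sketch (not part of the upstream file) ---
# collate_fn is what lets a torch DataLoader batch the variable-length label tensors
# returned by __getitem__. Assuming the constructor arguments defined earlier in this
# class (path, img_size, batch_size, ...) and a placeholder image folder:
#   dataset = LoadImagesAndLabels('traindata/images/train', img_size=416, batch_size=16)
#   loader = torch.utils.data.DataLoader(dataset, batch_size=16, shuffle=True,
#                                        collate_fn=LoadImagesAndLabels.collate_fn)
#   imgs, targets, paths, shapes = next(iter(loader))  # targets is (n, 6): image index + class + xywh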
534 | def load_image(self, index):
535 | # loads 1 image from dataset, returns img, original hw, resized hw
536 | img = self.imgs[index]
537 | if img is None: # not cached
538 | path = self.img_files[index]
539 | img = cv2.imread(path) # BGR
540 | assert img is not None, 'Image Not Found ' + path
541 | h0, w0 = img.shape[:2] # orig hw
542 | r = self.img_size / max(h0, w0) # resize image to img_size
543 | if r != 1: # always resize down, only resize up if training with augmentation
544 | interp = cv2.INTER_AREA if r < 1 and not self.augment else cv2.INTER_LINEAR
545 | img = cv2.resize(img, (int(w0 * r), int(h0 * r)), interpolation=interp)
546 | return img, (h0, w0), img.shape[:2] # img, hw_original, hw_resized
547 | else:
548 | return self.imgs[index], self.img_hw0[index], self.img_hw[index] # img, hw_original, hw_resized
549 |
550 |
551 | def augment_hsv(img, hgain=0.5, sgain=0.5, vgain=0.5):
552 | r = np.random.uniform(-1, 1, 3) * [hgain, sgain, vgain] + 1 # random gains
553 | hue, sat, val = cv2.split(cv2.cvtColor(img, cv2.COLOR_BGR2HSV))
554 | dtype = img.dtype # uint8
555 |
556 | x = np.arange(0, 256, dtype=np.int16)
557 | lut_hue = ((x * r[0]) % 180).astype(dtype)
558 | lut_sat = np.clip(x * r[1], 0, 255).astype(dtype)
559 | lut_val = np.clip(x * r[2], 0, 255).astype(dtype)
560 |
561 | img_hsv = cv2.merge((cv2.LUT(hue, lut_hue), cv2.LUT(sat, lut_sat), cv2.LUT(val, lut_val))).astype(dtype)
562 | cv2.cvtColor(img_hsv, cv2.COLOR_HSV2BGR, dst=img) # no return needed
563 |
564 | # Histogram equalization
565 | # if random.random() < 0.2:
566 | # for i in range(3):
567 | # img[:, :, i] = cv2.equalizeHist(img[:, :, i])
568 |
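# Editor's sketch: augment_hsv modifies img in place via per-channel lookup tables, e.g.
#   im = cv2.imread('test.jpg')                               # any BGR uint8 image (placeholder path)
#   augment_hsv(im, hgain=hyp['hsv_h'], sgain=hyp['hsv_s'], vgain=hyp['hsv_v'])
# (the hyp keys mirror the call in __getitem__ above)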
569 |
570 | def load_mosaic(self, index):
571 | # loads images in a mosaic
572 |
573 | labels4 = []
574 | s = self.img_size
575 | xc, yc = [int(random.uniform(s * 0.5, s * 1.5)) for _ in range(2)] # mosaic center x, y
576 | indices = [index] + [random.randint(0, len(self.labels) - 1) for _ in range(3)] # 3 additional image indices
577 | for i, index in enumerate(indices):
578 | # Load image
579 | img, _, (h, w) = load_image(self, index)
580 |
581 | # place img in img4
582 | if i == 0: # top left
583 | img4 = np.full((s * 2, s * 2, img.shape[2]), 114, dtype=np.uint8) # base image with 4 tiles
584 | x1a, y1a, x2a, y2a = max(xc - w, 0), max(yc - h, 0), xc, yc # xmin, ymin, xmax, ymax (large image)
585 | x1b, y1b, x2b, y2b = w - (x2a - x1a), h - (y2a - y1a), w, h # xmin, ymin, xmax, ymax (small image)
586 | elif i == 1: # top right
587 | x1a, y1a, x2a, y2a = xc, max(yc - h, 0), min(xc + w, s * 2), yc
588 | x1b, y1b, x2b, y2b = 0, h - (y2a - y1a), min(w, x2a - x1a), h
589 | elif i == 2: # bottom left
590 | x1a, y1a, x2a, y2a = max(xc - w, 0), yc, xc, min(s * 2, yc + h)
591 | x1b, y1b, x2b, y2b = w - (x2a - x1a), 0, max(xc, w), min(y2a - y1a, h)
592 | elif i == 3: # bottom right
593 | x1a, y1a, x2a, y2a = xc, yc, min(xc + w, s * 2), min(s * 2, yc + h)
594 | x1b, y1b, x2b, y2b = 0, 0, min(w, x2a - x1a), min(y2a - y1a, h)
595 |
596 | img4[y1a:y2a, x1a:x2a] = img[y1b:y2b, x1b:x2b] # img4[ymin:ymax, xmin:xmax]
597 | padw = x1a - x1b
598 | padh = y1a - y1b
599 |
600 | # Labels
601 | x = self.labels[index]
602 | labels = x.copy()
603 | if x.size > 0: # Normalized xywh to pixel xyxy format
604 | labels[:, 1] = w * (x[:, 1] - x[:, 3] / 2) + padw
605 | labels[:, 2] = h * (x[:, 2] - x[:, 4] / 2) + padh
606 | labels[:, 3] = w * (x[:, 1] + x[:, 3] / 2) + padw
607 | labels[:, 4] = h * (x[:, 2] + x[:, 4] / 2) + padh
608 | labels4.append(labels)
609 |
610 | # Concat/clip labels
611 | if len(labels4):
612 | labels4 = np.concatenate(labels4, 0)
613 | # np.clip(labels4[:, 1:] - s / 2, 0, s, out=labels4[:, 1:]) # use with center crop
614 | np.clip(labels4[:, 1:], 0, 2 * s, out=labels4[:, 1:]) # use with random_affine
615 |
616 | # Augment
617 | # img4 = img4[s // 2: int(s * 1.5), s // 2:int(s * 1.5)] # center crop (WARNING, requires box pruning)
618 | img4, labels4 = random_affine(img4, labels4,
619 | degrees=self.hyp['degrees'],
620 | translate=self.hyp['translate'],
621 | scale=self.hyp['scale'],
622 | shear=self.hyp['shear'],
623 | border=-s // 2) # border to remove
624 |
625 | return img4, labels4
626 |
627 |
628 | def letterbox(img, new_shape=(416, 416), color=(114, 114, 114), auto=True, scaleFill=False, scaleup=True):
629 | # Resize image to a 32-pixel-multiple rectangle https://github.com/ultralytics/yolov3/issues/232
630 | shape = img.shape[:2] # current shape [height, width]
631 | if isinstance(new_shape, int):
632 | new_shape = (new_shape, new_shape)
633 |
634 | # Scale ratio (new / old)
635 | r = min(new_shape[0] / shape[0], new_shape[1] / shape[1])
636 | if not scaleup: # only scale down, do not scale up (for better test mAP)
637 | r = min(r, 1.0)
638 |
639 | # Compute padding
640 | ratio = r, r # width, height ratios
641 | new_unpad = int(round(shape[1] * r)), int(round(shape[0] * r))
642 | dw, dh = new_shape[1] - new_unpad[0], new_shape[0] - new_unpad[1] # wh padding
643 | if auto: # minimum rectangle
644 | dw, dh = np.mod(dw, 64), np.mod(dh, 64) # wh padding
645 | elif scaleFill: # stretch
646 | dw, dh = 0.0, 0.0
647 | new_unpad = new_shape
648 | ratio = new_shape[0] / shape[1], new_shape[1] / shape[0] # width, height ratios
649 |
650 | dw /= 2 # divide padding into 2 sides
651 | dh /= 2
652 |
653 | if shape[::-1] != new_unpad: # resize
654 | img = cv2.resize(img, new_unpad, interpolation=cv2.INTER_LINEAR)
655 | top, bottom = int(round(dh - 0.1)), int(round(dh + 0.1))
656 | left, right = int(round(dw - 0.1)), int(round(dw + 0.1))
657 | img = cv2.copyMakeBorder(img, top, bottom, left, right, cv2.BORDER_CONSTANT, value=color) # add border
658 | return img, ratio, (dw, dh)
659 |
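# Editor's usage sketch (assumes a local test image):
#   im0 = cv2.imread('test.jpg')                              # original BGR image
#   im, ratio, (dw, dh) = letterbox(im0, new_shape=416, auto=False)
# 'ratio' and the (dw, dh) padding are what scale_coords() in utils/utils.py uses to
# map detections made on the letterboxed image back onto im0.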
660 |
661 | def random_affine(img, targets=(), degrees=10, translate=.1, scale=.1, shear=10, border=0):
662 | # torchvision.transforms.RandomAffine(degrees=(-10, 10), translate=(.1, .1), scale=(.9, 1.1), shear=(-10, 10))
663 | # https://medium.com/uruvideo/dataset-augmentation-with-random-homographies-a8f4b44830d4
664 | # targets = [cls, xyxy]
665 |
666 | height = img.shape[0] + border * 2
667 | width = img.shape[1] + border * 2
668 |
669 | # Rotation and Scale
670 | R = np.eye(3)
671 | a = random.uniform(-degrees, degrees)
672 | # a += random.choice([-180, -90, 0, 90]) # add 90deg rotations to small rotations
673 | s = random.uniform(1 - scale, 1 + scale)
674 | # s = 2 ** random.uniform(-scale, scale)
675 | R[:2] = cv2.getRotationMatrix2D(angle=a, center=(img.shape[1] / 2, img.shape[0] / 2), scale=s)
676 |
677 | # Translation
678 | T = np.eye(3)
679 | T[0, 2] = random.uniform(-translate, translate) * img.shape[0] + border # x translation (pixels)
680 | T[1, 2] = random.uniform(-translate, translate) * img.shape[1] + border # y translation (pixels)
681 |
682 | # Shear
683 | S = np.eye(3)
684 | S[0, 1] = math.tan(random.uniform(-shear, shear) * math.pi / 180) # x shear (deg)
685 | S[1, 0] = math.tan(random.uniform(-shear, shear) * math.pi / 180) # y shear (deg)
686 |
687 | # Combined rotation matrix
688 | M = S @ T @ R # ORDER IS IMPORTANT HERE!!
689 | if (border != 0) or (M != np.eye(3)).any(): # image changed
690 | img = cv2.warpAffine(img, M[:2], dsize=(width, height), flags=cv2.INTER_LINEAR, borderValue=(114, 114, 114))
691 |
692 | # Transform label coordinates
693 | n = len(targets)
694 | if n:
695 | # warp points
696 | xy = np.ones((n * 4, 3))
697 | xy[:, :2] = targets[:, [1, 2, 3, 4, 1, 4, 3, 2]].reshape(n * 4, 2) # x1y1, x2y2, x1y2, x2y1
698 | xy = (xy @ M.T)[:, :2].reshape(n, 8)
699 |
700 | # create new boxes
701 | x = xy[:, [0, 2, 4, 6]]
702 | y = xy[:, [1, 3, 5, 7]]
703 | xy = np.concatenate((x.min(1), y.min(1), x.max(1), y.max(1))).reshape(4, n).T
704 |
705 | # # apply angle-based reduction of bounding boxes
706 | # radians = a * math.pi / 180
707 | # reduction = max(abs(math.sin(radians)), abs(math.cos(radians))) ** 0.5
708 | # x = (xy[:, 2] + xy[:, 0]) / 2
709 | # y = (xy[:, 3] + xy[:, 1]) / 2
710 | # w = (xy[:, 2] - xy[:, 0]) * reduction
711 | # h = (xy[:, 3] - xy[:, 1]) * reduction
712 | # xy = np.concatenate((x - w / 2, y - h / 2, x + w / 2, y + h / 2)).reshape(4, n).T
713 |
714 | # reject warped points outside of image
715 | xy[:, [0, 2]] = xy[:, [0, 2]].clip(0, width)
716 | xy[:, [1, 3]] = xy[:, [1, 3]].clip(0, height)
717 | w = xy[:, 2] - xy[:, 0]
718 | h = xy[:, 3] - xy[:, 1]
719 | area = w * h
720 | area0 = (targets[:, 3] - targets[:, 1]) * (targets[:, 4] - targets[:, 2])
721 | ar = np.maximum(w / (h + 1e-16), h / (w + 1e-16)) # aspect ratio
722 | i = (w > 4) & (h > 4) & (area / (area0 * s + 1e-16) > 0.2) & (ar < 10)
723 |
724 | targets = targets[i]
725 | targets[:, 1:5] = xy[i]
726 |
727 | return img, targets
728 |
729 |
730 | def cutout(image, labels):
731 | # https://arxiv.org/abs/1708.04552
732 | # https://github.com/hysts/pytorch_cutout/blob/master/dataloader.py
733 | # https://towardsdatascience.com/when-conventional-wisdom-fails-revisiting-data-augmentation-for-self-driving-cars-4831998c5509
734 | h, w = image.shape[:2]
735 |
736 | def bbox_ioa(box1, box2):
737 | # Returns the intersection over box2 area given box1, box2. box1 is 4, box2 is nx4. boxes are x1y1x2y2
738 | box2 = box2.transpose()
739 |
740 | # Get the coordinates of bounding boxes
741 | b1_x1, b1_y1, b1_x2, b1_y2 = box1[0], box1[1], box1[2], box1[3]
742 | b2_x1, b2_y1, b2_x2, b2_y2 = box2[0], box2[1], box2[2], box2[3]
743 |
744 | # Intersection area
745 | inter_area = (np.minimum(b1_x2, b2_x2) - np.maximum(b1_x1, b2_x1)).clip(0) * \
746 | (np.minimum(b1_y2, b2_y2) - np.maximum(b1_y1, b2_y1)).clip(0)
747 |
748 | # box2 area
749 | box2_area = (b2_x2 - b2_x1) * (b2_y2 - b2_y1) + 1e-16
750 |
751 | # Intersection over box2 area
752 | return inter_area / box2_area
753 |
754 | # create random masks
755 | scales = [0.5] * 1 + [0.25] * 2 + [0.125] * 4 + [0.0625] * 8 + [0.03125] * 16 # image size fraction
756 | for s in scales:
757 | mask_h = random.randint(1, int(h * s))
758 | mask_w = random.randint(1, int(w * s))
759 |
760 | # box
761 | xmin = max(0, random.randint(0, w) - mask_w // 2)
762 | ymin = max(0, random.randint(0, h) - mask_h // 2)
763 | xmax = min(w, xmin + mask_w)
764 | ymax = min(h, ymin + mask_h)
765 |
766 | # apply random color mask
767 | image[ymin:ymax, xmin:xmax] = [random.randint(64, 191) for _ in range(3)]
768 |
769 | # return unobscured labels
770 | if len(labels) and s > 0.03:
771 | box = np.array([xmin, ymin, xmax, ymax], dtype=np.float32)
772 | ioa = bbox_ioa(box, labels[:, 1:5]) # intersection over area
773 | labels = labels[ioa < 0.60] # remove >60% obscured labels
774 |
775 | return labels
776 |
777 |
778 | def reduce_img_size(path='../data/sm4/images', img_size=1024): # from utils.datasets import *; reduce_img_size()
779 | # creates a sibling '<path>_reduced' folder with copies of the images downsized to at most img_size
780 | path_new = path + '_reduced' # reduced images path
781 | create_folder(path_new)
782 | for f in tqdm(glob.glob('%s/*.*' % path)):
783 | try:
784 | img = cv2.imread(f)
785 | h, w = img.shape[:2]
786 | r = img_size / max(h, w) # size ratio
787 | if r < 1.0:
788 | img = cv2.resize(img, (int(w * r), int(h * r)), interpolation=cv2.INTER_AREA) # _LINEAR fastest
789 | fnew = f.replace(path, path_new) # .replace(Path(f).suffix, '.jpg')
790 | cv2.imwrite(fnew, img)
791 | except:
792 | print('WARNING: image failure %s' % f)
793 |
794 |
795 | def convert_images2bmp(): # from utils.datasets import *; convert_images2bmp()
796 | # Save images
797 | formats = [x.lower() for x in img_formats] + [x.upper() for x in img_formats]
798 | # for path in ['../coco/images/val2014', '../coco/images/train2014']:
799 | for path in ['../data/sm4/images', '../data/sm4/background']:
800 | create_folder(path + 'bmp')
801 | for ext in formats: # ['.bmp', '.jpg', '.jpeg', '.png', '.tif', '.dng']
802 | for f in tqdm(glob.glob('%s/*%s' % (path, ext)), desc='Converting %s' % ext):
803 | cv2.imwrite(f.replace(ext.lower(), '.bmp').replace(path, path + 'bmp'), cv2.imread(f))
804 |
805 | # Save labels
806 | # for path in ['../coco/trainvalno5k.txt', '../coco/5k.txt']:
807 | for file in ['../data/sm4/out_train.txt', '../data/sm4/out_test.txt']:
808 | with open(file, 'r') as f:
809 | lines = f.read()
810 | # lines = f.read().replace('2014/', '2014bmp/') # coco
811 | lines = lines.replace('/images', '/imagesbmp')
812 | lines = lines.replace('/background', '/backgroundbmp')
813 | for ext in formats:
814 | lines = lines.replace(ext, '.bmp')
815 | with open(file.replace('.txt', 'bmp.txt'), 'w') as f:
816 | f.write(lines)
817 |
818 |
819 | def recursive_dataset2bmp(dataset='../data/sm4_bmp'): # from utils.datasets import *; recursive_dataset2bmp()
820 | # Converts dataset to bmp (for faster training)
821 | formats = [x.lower() for x in img_formats] + [x.upper() for x in img_formats]
822 | for a, b, files in os.walk(dataset):
823 | for file in tqdm(files, desc=a):
824 | p = a + '/' + file
825 | s = Path(file).suffix
826 | if s == '.txt': # replace text
827 | with open(p, 'r') as f:
828 | lines = f.read()
829 | for f in formats:
830 | lines = lines.replace(f, '.bmp')
831 | with open(p, 'w') as f:
832 | f.write(lines)
833 | elif s in formats: # replace image
834 | cv2.imwrite(p.replace(s, '.bmp'), cv2.imread(p))
835 | if s != '.bmp':
836 | os.system("rm '%s'" % p)
837 |
838 |
839 | def imagelist2folder(path='data/coco_64img.txt'): # from utils.datasets import *; imagelist2folder()
840 | # Copies all the images in a text file (list of images) into a folder
841 | create_folder(path[:-4])
842 | with open(path, 'r') as f:
843 | for line in f.read().splitlines():
844 | os.system('cp "%s" %s' % (line, path[:-4]))
845 | print(line)
846 |
847 |
848 | def create_folder(path='./new_folder'):
849 | # Create folder
850 | if os.path.exists(path):
851 | shutil.rmtree(path) # delete output folder
852 | os.makedirs(path) # make new output folder
853 |
--------------------------------------------------------------------------------
/utils/google_utils.py:
--------------------------------------------------------------------------------
1 | # This file contains google utils: https://cloud.google.com/storage/docs/reference/libraries
2 | # pip install --upgrade google-cloud-storage
3 | # from google.cloud import storage
4 |
5 | import os
6 | import time
7 | from pathlib import Path
8 |
9 |
10 | def attempt_download(weights):
11 | # Attempt to download pretrained weights if not found locally
12 | weights = weights.strip()
13 | msg = weights + ' missing, try downloading from https://drive.google.com/drive/folders/1Drs_Aiu7xx6S-ix95f9kNsA6ueKRpN2J'
14 |
15 | r = 1
16 | if len(weights) > 0 and not os.path.isfile(weights):
17 | d = {'yolov3-spp.pt': '1mM67oNw4fZoIOL1c8M3hHmj66d8e-ni_', # yolov3-spp.yaml
18 | 'yolov5s.pt': '1R5T6rIyy3lLwgFXNms8whc-387H0tMQO', # yolov5s.yaml
19 | 'yolov5m.pt': '1vobuEExpWQVpXExsJ2w-Mbf3HJjWkQJr', # yolov5m.yaml
20 | 'yolov5l.pt': '1hrlqD1Wdei7UT4OgT785BEk1JwnSvNEV', # yolov5l.yaml
21 | 'yolov5x.pt': '1mM8aZJlWTxOg7BZJvNUMrTnA2AbeCVzS', # yolov5x.yaml
22 | }
23 |
24 | file = Path(weights).name
25 | if file in d:
26 | r = gdrive_download(id=d[file], name=weights)
27 |
28 | # Error check
29 | if not (r == 0 and os.path.exists(weights) and os.path.getsize(weights) > 1E6): # weights exist and > 1MB
30 | os.system('rm ' + weights) # remove partial downloads
31 | raise Exception(msg)
32 |
33 |
34 | def gdrive_download(id='1HaXkef9z6y5l4vUnCYgdmEAj61c6bfWO', name='coco.zip'):
35 | # https://gist.github.com/tanaikech/f0f2d122e05bf5f971611258c22c110f
36 | # Downloads a file from Google Drive, handling the large-file confirmation prompt
37 | # from utils.google_utils import *; gdrive_download()
38 | t = time.time()
39 |
40 | print('Downloading https://drive.google.com/uc?export=download&id=%s as %s... ' % (id, name), end='')
41 | os.remove(name) if os.path.exists(name) else None # remove existing
42 | os.remove('cookie') if os.path.exists('cookie') else None
43 |
44 | # Attempt file download
45 | os.system("curl -c ./cookie -s -L \"https://drive.google.com/uc?export=download&id=%s\" > /dev/null" % id)
46 | if os.path.exists('cookie'): # large file
47 | s = "curl -Lb ./cookie \"https://drive.google.com/uc?export=download&confirm=`awk '/download/ {print $NF}' ./cookie`&id=%s\" -o %s" % (
48 | id, name)
49 | else: # small file
50 | s = "curl -s -L -o %s 'https://drive.google.com/uc?export=download&id=%s'" % (name, id)
51 | r = os.system(s) # execute, capture return values
52 | os.remove('cookie') if os.path.exists('cookie') else None
53 |
54 | # Error check
55 | if r != 0:
56 | os.remove(name) if os.path.exists(name) else None # remove partial
57 | print('Download error ') # raise Exception('Download error')
58 | return r
59 |
60 | # Unzip if archive
61 | if name.endswith('.zip'):
62 | print('unzipping... ', end='')
63 | os.system('unzip -q %s' % name) # unzip
64 | os.remove(name) # remove zip to free space
65 |
66 | print('Done (%.1fs)' % (time.time() - t))
67 | return r
68 |
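# Editor's usage sketch:
#   attempt_download('weights/yolov5s.pt')   # fetches the file via gdrive_download() if missing
#   gdrive_download(id='1R5T6rIyy3lLwgFXNms8whc-387H0tMQO', name='weights/yolov5s.pt')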
69 | # def upload_blob(bucket_name, source_file_name, destination_blob_name):
70 | # # Uploads a file to a bucket
71 | # # https://cloud.google.com/storage/docs/uploading-objects#storage-upload-object-python
72 | #
73 | # storage_client = storage.Client()
74 | # bucket = storage_client.get_bucket(bucket_name)
75 | # blob = bucket.blob(destination_blob_name)
76 | #
77 | # blob.upload_from_filename(source_file_name)
78 | #
79 | # print('File {} uploaded to {}.'.format(
80 | # source_file_name,
81 | # destination_blob_name))
82 | #
83 | #
84 | # def download_blob(bucket_name, source_blob_name, destination_file_name):
85 | # # Downloads a blob from a bucket
86 | # storage_client = storage.Client()
87 | # bucket = storage_client.get_bucket(bucket_name)
88 | # blob = bucket.blob(source_blob_name)
89 | #
90 | # blob.download_to_filename(destination_file_name)
91 | #
92 | # print('Blob {} downloaded to {}.'.format(
93 | # source_blob_name,
94 | # destination_file_name))
95 |
--------------------------------------------------------------------------------
/utils/torch_utils.py:
--------------------------------------------------------------------------------
1 | import math
2 | import os
3 | import time
4 | from copy import deepcopy
5 |
6 | import torch
7 | import torch.backends.cudnn as cudnn
8 | import torch.nn as nn
9 | import torch.nn.functional as F
10 |
11 |
12 | def init_seeds(seed=0):
13 | torch.manual_seed(seed)
14 |
15 | # Speed-reproducibility tradeoff https://pytorch.org/docs/stable/notes/randomness.html
16 | if seed == 0: # slower, more reproducible
17 | cudnn.deterministic = True
18 | cudnn.benchmark = False
19 | else: # faster, less reproducible
20 | cudnn.deterministic = False
21 | cudnn.benchmark = True
22 |
23 |
24 | def select_device(device='', apex=False, batch_size=None):
25 | # device = 'cpu' or '0' or '0,1,2,3'
26 | cpu_request = device.lower() == 'cpu'
27 | if device and not cpu_request: # if device requested other than 'cpu'
28 | os.environ['CUDA_VISIBLE_DEVICES'] = device # set environment variable
29 | assert torch.cuda.is_available(), 'CUDA unavailable, invalid device %s requested' % device # check availability
30 |
31 | cuda = False if cpu_request else torch.cuda.is_available()
32 | if cuda:
33 | c = 1024 ** 2 # bytes to MB
34 | ng = torch.cuda.device_count()
35 | if ng > 1 and batch_size: # check that batch_size is compatible with device_count
36 | assert batch_size % ng == 0, 'batch-size %g not multiple of GPU count %g' % (batch_size, ng)
37 | x = [torch.cuda.get_device_properties(i) for i in range(ng)]
38 | s = 'Using CUDA ' + ('Apex ' if apex else '') # apex for mixed precision https://github.com/NVIDIA/apex
39 | for i in range(0, ng):
40 | if i == 1:
41 | s = ' ' * len(s)
42 | print("%sdevice%g _CudaDeviceProperties(name='%s', total_memory=%dMB)" %
43 | (s, i, x[i].name, x[i].total_memory / c))
44 | else:
45 | print('Using CPU')
46 |
47 | print('') # skip a line
48 | return torch.device('cuda:0' if cuda else 'cpu')
49 |
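# Editor's usage sketch:
#   device = select_device('')                    # first CUDA device if available, else CPU
#   device = select_device('0,1', batch_size=64)  # multi-GPU; batch size must divide evenly
#   device = select_device('cpu')                 # force CPU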
50 |
51 | def time_synchronized():
52 | torch.cuda.synchronize() if torch.cuda.is_available() else None
53 | return time.time()
54 |
55 |
56 | def initialize_weights(model):
57 | for m in model.modules():
58 | t = type(m)
59 | if t is nn.Conv2d:
60 | pass # nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
61 | elif t is nn.BatchNorm2d:
62 | m.eps = 1e-4
63 | m.momentum = 0.03
64 | elif t in [nn.LeakyReLU, nn.ReLU, nn.ReLU6]:
65 | m.inplace = True
66 |
67 |
68 | def find_modules(model, mclass=nn.Conv2d):
69 | # finds layer indices matching module class 'mclass'
70 | return [i for i, m in enumerate(model.module_list) if isinstance(m, mclass)]
71 |
72 |
73 | def fuse_conv_and_bn(conv, bn):
74 | # https://tehnokv.com/posts/fusing-batchnorm-and-conv/
75 | with torch.no_grad():
76 | # init
77 | fusedconv = torch.nn.Conv2d(conv.in_channels,
78 | conv.out_channels,
79 | kernel_size=conv.kernel_size,
80 | stride=conv.stride,
81 | padding=conv.padding,
82 | bias=True)
83 |
84 | # prepare filters
85 | w_conv = conv.weight.clone().view(conv.out_channels, -1)
86 | w_bn = torch.diag(bn.weight.div(torch.sqrt(bn.eps + bn.running_var)))
87 | fusedconv.weight.copy_(torch.mm(w_bn, w_conv).view(fusedconv.weight.size()))
88 |
89 | # prepare spatial bias
90 | if conv.bias is not None:
91 | b_conv = conv.bias
92 | else:
93 | b_conv = torch.zeros(conv.weight.size(0), device=conv.weight.device)
94 | b_bn = bn.bias - bn.weight.mul(bn.running_mean).div(torch.sqrt(bn.running_var + bn.eps))
95 | fusedconv.bias.copy_(torch.mm(w_bn, b_conv.reshape(-1, 1)).reshape(-1) + b_bn)
96 |
97 | return fusedconv
98 |
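# Editor's note: the fusion above folds BatchNorm into the preceding convolution:
#   w_fused = diag(gamma / sqrt(running_var + eps)) @ w_conv
#   b_fused = beta + gamma * (b_conv - running_mean) / sqrt(running_var + eps)
# Hedged sketch on a standalone pair (eval mode, running statistics used):
#   conv = nn.Conv2d(3, 16, 3, padding=1, bias=False)
#   bn = nn.BatchNorm2d(16).eval()
#   fused = fuse_conv_and_bn(conv, bn)            # fused(x) ~= bn(conv(x))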
99 |
100 | def model_info(model, verbose=False):
101 | # Plots a line-by-line description of a PyTorch model
102 | n_p = sum(x.numel() for x in model.parameters()) # number parameters
103 | n_g = sum(x.numel() for x in model.parameters() if x.requires_grad) # number gradients
104 | if verbose:
105 | print('%5s %40s %9s %12s %20s %10s %10s' % ('layer', 'name', 'gradient', 'parameters', 'shape', 'mu', 'sigma'))
106 | for i, (name, p) in enumerate(model.named_parameters()):
107 | name = name.replace('module_list.', '')
108 | print('%5g %40s %9s %12g %20s %10.3g %10.3g' %
109 | (i, name, p.requires_grad, p.numel(), list(p.shape), p.mean(), p.std()))
110 |
111 | try: # FLOPS
112 | from thop import profile
113 | macs, _ = profile(model, inputs=(torch.zeros(1, 3, 480, 640),), verbose=False)
114 | fs = ', %.1f GFLOPS' % (macs / 1E9 * 2)
115 | except:
116 | fs = ''
117 |
118 | print('Model Summary: %g layers, %g parameters, %g gradients%s' % (len(list(model.parameters())), n_p, n_g, fs))
119 |
120 |
121 | def load_classifier(name='resnet101', n=2):
122 | # Loads a pretrained model reshaped to n-class output
123 | import pretrainedmodels # https://github.com/Cadene/pretrained-models.pytorch#torchvision
124 | model = pretrainedmodels.__dict__[name](num_classes=1000, pretrained='imagenet')
125 |
126 | # Display model properties
127 | for x in ['model.input_size', 'model.input_space', 'model.input_range', 'model.mean', 'model.std']:
128 | print(x + ' =', eval(x))
129 |
130 | # Reshape output to n classes
131 | filters = model.last_linear.weight.shape[1]
132 | model.last_linear.bias = torch.nn.Parameter(torch.zeros(n))
133 | model.last_linear.weight = torch.nn.Parameter(torch.zeros(n, filters))
134 | model.last_linear.out_features = n
135 | return model
136 |
137 |
138 | def scale_img(img, ratio=1.0, same_shape=False): # img(16,3,256,416), r=ratio
139 | # scales img(bs,3,y,x) by ratio
140 | h, w = img.shape[2:]
141 | s = (int(h * ratio), int(w * ratio)) # new size
142 | img = F.interpolate(img, size=s, mode='bilinear', align_corners=False) # resize
143 | if not same_shape: # pad/crop img
144 | gs = 32 # (pixels) grid size
145 | h, w = [math.ceil(x * ratio / gs) * gs for x in (h, w)]
146 | return F.pad(img, [0, w - s[1], 0, h - s[0]], value=0.447) # value = imagenet mean
147 |
148 |
149 | class ModelEMA:
150 | """ Model Exponential Moving Average from https://github.com/rwightman/pytorch-image-models
151 | Keep a moving average of everything in the model state_dict (parameters and buffers).
152 | This is intended to allow functionality like
153 | https://www.tensorflow.org/api_docs/python/tf/train/ExponentialMovingAverage
154 | A smoothed version of the weights is necessary for some training schemes to perform well.
155 | E.g. Google's hyper-params for training MNASNet, MobileNet-V3, EfficientNet, etc that use
156 | RMSprop with a short 2.4-3 epoch decay period and slow LR decay rate of .96-.99 require EMA
157 | smoothing of weights to match results. Pay attention to the decay constant you are using
158 | relative to your update count per epoch.
159 | To keep EMA from using GPU resources, set device='cpu'. This will save a bit of memory but
160 | disable validation of the EMA weights. Validation will have to be done manually in a separate
161 | process, or after the training stops converging.
162 | This class is sensitive to where it is initialized in the sequence of model init,
163 | GPU assignment and distributed training wrappers.
164 | I've tested with the sequence in my own train.py for torch.DataParallel, apex.DDP, and single-GPU.
165 | """
166 |
167 | def __init__(self, model, decay=0.9999, device=''):
168 | # make a copy of the model for accumulating moving average of weights
169 | self.ema = deepcopy(model)
170 | self.ema.eval()
171 | self.updates = 0 # number of EMA updates
172 | self.decay = lambda x: decay * (1 - math.exp(-x / 2000)) # decay exponential ramp (to help early epochs)
173 | self.device = device # perform ema on different device from model if set
174 | if device:
175 | self.ema.to(device=device)
176 | for p in self.ema.parameters():
177 | p.requires_grad_(False)
178 |
179 | def update(self, model):
180 | self.updates += 1
181 | d = self.decay(self.updates)
182 | with torch.no_grad():
183 | if type(model) in (nn.parallel.DataParallel, nn.parallel.DistributedDataParallel):
184 | msd, esd = model.module.state_dict(), self.ema.module.state_dict()
185 | else:
186 | msd, esd = model.state_dict(), self.ema.state_dict()
187 |
188 | for k, v in esd.items():
189 | if v.dtype.is_floating_point:
190 | v *= d
191 | v += (1. - d) * msd[k].detach()
192 |
193 | def update_attr(self, model):
194 | # Assign attributes (which may change during training)
195 | for k in model.__dict__.keys():
196 | if not k.startswith('_'):
197 | setattr(self.ema, k, getattr(model, k))
198 |
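# Editor's usage sketch (typical training-loop integration, assumed):
#   ema = ModelEMA(model)
#   for imgs, targets, paths, _ in dataloader:
#       ...forward, backward...
#       optimizer.step()
#       ema.update(model)          # after every optimizer step
#   ema.update_attr(model)         # copy non-parameter attributes before validating/saving ema.ema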
--------------------------------------------------------------------------------
/utils/utils.py:
--------------------------------------------------------------------------------
1 | import glob
2 | import math
3 | import os
4 | import random
5 | import shutil
6 | import subprocess
7 | import time
8 | from copy import copy
9 | from pathlib import Path
10 | from sys import platform
11 |
12 | import cv2
13 | import matplotlib
14 | import matplotlib.pyplot as plt
15 | import numpy as np
16 | import torch
17 | import torch.nn as nn
18 | import torchvision
19 | from scipy.signal import butter, filtfilt
20 | from tqdm import tqdm
21 |
22 | from PIL import Image,ImageDraw,ImageFont
23 |
24 | from . import torch_utils, google_utils # torch_utils, google_utils
25 |
26 | # Set printoptions
27 | torch.set_printoptions(linewidth=320, precision=5, profile='long')
28 | np.set_printoptions(linewidth=320, formatter={'float_kind': '{:11.5g}'.format}) # format short g, %precision=5
29 | matplotlib.rc('font', **{'size': 11})
30 |
31 | # Prevent OpenCV from multithreading (to use PyTorch DataLoader)
32 | cv2.setNumThreads(0)
33 |
34 |
35 | def init_seeds(seed=0):
36 | random.seed(seed)
37 | np.random.seed(seed)
38 | torch_utils.init_seeds(seed=seed)
39 |
40 |
41 | def check_git_status():
42 | if platform in ['linux', 'darwin']:
43 | # Suggest 'git pull' if repo is out of date
44 | s = subprocess.check_output('if [ -d .git ]; then git fetch && git status -uno; fi', shell=True).decode('utf-8')
45 | if 'Your branch is behind' in s:
46 | print(s[s.find('Your branch is behind'):s.find('\n\n')] + '\n')
47 |
48 |
49 | def make_divisible(x, divisor):
50 | # Returns x evenly divisible by divisor
51 | return math.ceil(x / divisor) * divisor
52 |
53 |
54 | def labels_to_class_weights(labels, nc=80):
55 | # Get class weights (inverse frequency) from training labels
56 | if labels[0] is None: # no labels loaded
57 | return torch.Tensor()
58 |
59 | labels = np.concatenate(labels, 0) # labels.shape = (866643, 5) for COCO
60 | classes = labels[:, 0].astype(np.int) # labels = [class xywh]
61 | weights = np.bincount(classes, minlength=nc) # occurrences per class
62 |
63 | # Prepend gridpoint count (for uCE training)
64 | # gpi = ((320 / 32 * np.array([1, 2, 4])) ** 2 * 3).sum() # gridpoints per image
65 | # weights = np.hstack([gpi * len(labels) - weights.sum() * 9, weights * 9]) ** 0.5 # prepend gridpoints to start
66 |
67 | weights[weights == 0] = 1 # replace empty bins with 1
68 | weights = 1 / weights # number of targets per class
69 | weights /= weights.sum() # normalize
70 | return torch.from_numpy(weights)
71 |
72 |
73 | def labels_to_image_weights(labels, nc=80, class_weights=np.ones(80)):
74 | # Produces image weights based on class mAPs
75 | n = len(labels)
76 | class_counts = np.array([np.bincount(labels[i][:, 0].astype(np.int), minlength=nc) for i in range(n)])
77 | image_weights = (class_weights.reshape(1, nc) * class_counts).sum(1)
78 | # index = random.choices(range(n), weights=image_weights, k=1) # weight image sample
79 | return image_weights
80 |
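# Editor's sketch: with dataset.labels (a list of (n, 5) label arrays, one per image):
#   cw = labels_to_class_weights(dataset.labels, nc=20)                        # inverse class frequency
#   iw = labels_to_image_weights(dataset.labels, nc=20, class_weights=cw.numpy())
#   indices = random.choices(range(len(dataset)), weights=iw, k=len(dataset))  # weighted resampling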
81 |
82 | def coco80_to_coco91_class(): # converts 80-index (val2014) to 91-index (paper)
83 | # https://tech.amikelive.com/node-718/what-object-categories-labels-are-in-coco-dataset/
84 | # a = np.loadtxt('data/coco.names', dtype='str', delimiter='\n')
85 | # b = np.loadtxt('data/coco_paper.names', dtype='str', delimiter='\n')
86 | # x1 = [list(a[i] == b).index(True) + 1 for i in range(80)] # darknet to coco
87 | # x2 = [list(b[i] == a).index(True) if any(b[i] == a) else None for i in range(91)] # coco to darknet
88 | x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 27, 28, 31, 32, 33, 34,
89 | 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63,
90 | 64, 65, 67, 70, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 84, 85, 86, 87, 88, 89, 90]
91 | return x
92 |
93 |
94 | def xyxy2xywh(x):
95 | # Convert nx4 boxes from [x1, y1, x2, y2] to [x, y, w, h] where xy1=top-left, xy2=bottom-right
96 | y = torch.zeros_like(x) if isinstance(x, torch.Tensor) else np.zeros_like(x)
97 | y[:, 0] = (x[:, 0] + x[:, 2]) / 2 # x center
98 | y[:, 1] = (x[:, 1] + x[:, 3]) / 2 # y center
99 | y[:, 2] = x[:, 2] - x[:, 0] # width
100 | y[:, 3] = x[:, 3] - x[:, 1] # height
101 | return y
102 |
103 |
104 | def xywh2xyxy(x):
105 | # Convert nx4 boxes from [x, y, w, h] to [x1, y1, x2, y2] where xy1=top-left, xy2=bottom-right
106 | y = torch.zeros_like(x) if isinstance(x, torch.Tensor) else np.zeros_like(x)
107 | y[:, 0] = x[:, 0] - x[:, 2] / 2 # top left x
108 | y[:, 1] = x[:, 1] - x[:, 3] / 2 # top left y
109 | y[:, 2] = x[:, 0] + x[:, 2] / 2 # bottom right x
110 | y[:, 3] = x[:, 1] + x[:, 3] / 2 # bottom right y
111 | return y
112 |
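# Editor's worked example: a centered box (x, y, w, h) = (0.5, 0.5, 0.4, 0.2) maps to
# corners (0.3, 0.4, 0.7, 0.6), and the two conversions are inverses:
#   b = np.array([[0.5, 0.5, 0.4, 0.2]], dtype=np.float32)
#   assert np.allclose(xyxy2xywh(xywh2xyxy(b)), b)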
113 |
114 | def scale_coords(img1_shape, coords, img0_shape, ratio_pad=None):
115 | # Rescale coords (xyxy) from img1_shape to img0_shape
116 | if ratio_pad is None: # calculate from img0_shape
117 | gain = max(img1_shape) / max(img0_shape) # gain = old / new
118 | pad = (img1_shape[1] - img0_shape[1] * gain) / 2, (img1_shape[0] - img0_shape[0] * gain) / 2 # wh padding
119 | else:
120 | gain = ratio_pad[0][0]
121 | pad = ratio_pad[1]
122 |
123 | coords[:, [0, 2]] -= pad[0] # x padding
124 | coords[:, [1, 3]] -= pad[1] # y padding
125 | coords[:, :4] /= gain
126 | clip_coords(coords, img0_shape)
127 | return coords
128 |
129 |
130 | def clip_coords(boxes, img_shape):
131 | # Clip xyxy bounding boxes to image shape (height, width)
132 | boxes[:, 0].clamp_(0, img_shape[1]) # x1
133 | boxes[:, 1].clamp_(0, img_shape[0]) # y1
134 | boxes[:, 2].clamp_(0, img_shape[1]) # x2
135 | boxes[:, 3].clamp_(0, img_shape[0]) # y2
136 |
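# Editor's sketch (mirrors how detect.py consumes these helpers; exact call assumed):
#   det[:, :4] = scale_coords(img.shape[2:], det[:, :4], im0.shape).round()
# i.e. boxes predicted on the letterboxed img are rescaled and clipped back onto the original im0.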
137 |
138 | def ap_per_class(tp, conf, pred_cls, target_cls):
139 | """ Compute the average precision, given the recall and precision curves.
140 | Source: https://github.com/rafaelpadilla/Object-Detection-Metrics.
141 | # Arguments
142 | tp: True positives (nparray, nx1 or nx10).
143 | conf: Objectness value from 0-1 (nparray).
144 | pred_cls: Predicted object classes (nparray).
145 | target_cls: True object classes (nparray).
146 | # Returns
147 | The average precision as computed in py-faster-rcnn.
148 | """
149 |
150 | # Sort by objectness
151 | i = np.argsort(-conf)
152 | tp, conf, pred_cls = tp[i], conf[i], pred_cls[i]
153 |
154 | # Find unique classes
155 | unique_classes = np.unique(target_cls)
156 |
157 | # Create Precision-Recall curve and compute AP for each class
158 | pr_score = 0.1 # score to evaluate P and R https://github.com/ultralytics/yolov3/issues/898
159 | s = [unique_classes.shape[0], tp.shape[1]] # number class, number iou thresholds (i.e. 10 for mAP0.5...0.95)
160 | ap, p, r = np.zeros(s), np.zeros(s), np.zeros(s)
161 | for ci, c in enumerate(unique_classes):
162 | i = pred_cls == c
163 | n_gt = (target_cls == c).sum() # Number of ground truth objects
164 | n_p = i.sum() # Number of predicted objects
165 |
166 | if n_p == 0 or n_gt == 0:
167 | continue
168 | else:
169 | # Accumulate FPs and TPs
170 | fpc = (1 - tp[i]).cumsum(0)
171 | tpc = tp[i].cumsum(0)
172 |
173 | # Recall
174 | recall = tpc / (n_gt + 1e-16) # recall curve
175 | r[ci] = np.interp(-pr_score, -conf[i], recall[:, 0]) # r at pr_score, negative x, xp because xp decreases
176 |
177 | # Precision
178 | precision = tpc / (tpc + fpc) # precision curve
179 | p[ci] = np.interp(-pr_score, -conf[i], precision[:, 0]) # p at pr_score
180 |
181 | # AP from recall-precision curve
182 | for j in range(tp.shape[1]):
183 | ap[ci, j] = compute_ap(recall[:, j], precision[:, j])
184 |
185 | # Plot
186 | # fig, ax = plt.subplots(1, 1, figsize=(5, 5))
187 | # ax.plot(recall, precision)
188 | # ax.set_xlabel('Recall')
189 | # ax.set_ylabel('Precision')
190 | # ax.set_xlim(0, 1.01)
191 | # ax.set_ylim(0, 1.01)
192 | # fig.tight_layout()
193 | # fig.savefig('PR_curve.png', dpi=300)
194 |
195 | # Compute F1 score (harmonic mean of precision and recall)
196 | f1 = 2 * p * r / (p + r + 1e-16)
197 |
198 | return p, r, ap, f1, unique_classes.astype('int32')
199 |
200 |
201 | def compute_ap(recall, precision):
202 | """ Compute the average precision, given the recall and precision curves.
203 | Source: https://github.com/rbgirshick/py-faster-rcnn.
204 | # Arguments
205 | recall: The recall curve (list).
206 | precision: The precision curve (list).
207 | # Returns
208 | The average precision as computed in py-faster-rcnn.
209 | """
210 |
211 | # Append sentinel values to beginning and end
212 | mrec = np.concatenate(([0.], recall, [min(recall[-1] + 1E-3, 1.)]))
213 | mpre = np.concatenate(([0.], precision, [0.]))
214 |
215 | # Compute the precision envelope
216 | mpre = np.flip(np.maximum.accumulate(np.flip(mpre)))
217 |
218 | # Integrate area under curve
219 | method = 'interp' # methods: 'continuous', 'interp'
220 | if method == 'interp':
221 | x = np.linspace(0, 1, 101) # 101-point interp (COCO)
222 | ap = np.trapz(np.interp(x, mrec, mpre), x) # integrate
223 | else: # 'continuous'
224 | i = np.where(mrec[1:] != mrec[:-1])[0] # points where x axis (recall) changes
225 | ap = np.sum((mrec[i + 1] - mrec[i]) * mpre[i + 1]) # area under curve
226 |
227 | return ap
228 |
229 |
230 | def bbox_iou(box1, box2, x1y1x2y2=True, GIoU=False, DIoU=False, CIoU=False):
231 | # Returns the IoU of box1 to box2. box1 is 4, box2 is nx4
232 | box2 = box2.t()
233 |
234 | # Get the coordinates of bounding boxes
235 | if x1y1x2y2: # x1, y1, x2, y2 = box1
236 | b1_x1, b1_y1, b1_x2, b1_y2 = box1[0], box1[1], box1[2], box1[3]
237 | b2_x1, b2_y1, b2_x2, b2_y2 = box2[0], box2[1], box2[2], box2[3]
238 | else: # transform from xywh to xyxy
239 | b1_x1, b1_x2 = box1[0] - box1[2] / 2, box1[0] + box1[2] / 2
240 | b1_y1, b1_y2 = box1[1] - box1[3] / 2, box1[1] + box1[3] / 2
241 | b2_x1, b2_x2 = box2[0] - box2[2] / 2, box2[0] + box2[2] / 2
242 | b2_y1, b2_y2 = box2[1] - box2[3] / 2, box2[1] + box2[3] / 2
243 |
244 | # Intersection area
245 | inter = (torch.min(b1_x2, b2_x2) - torch.max(b1_x1, b2_x1)).clamp(0) * \
246 | (torch.min(b1_y2, b2_y2) - torch.max(b1_y1, b2_y1)).clamp(0)
247 |
248 | # Union Area
249 | w1, h1 = b1_x2 - b1_x1, b1_y2 - b1_y1
250 | w2, h2 = b2_x2 - b2_x1, b2_y2 - b2_y1
251 | union = (w1 * h1 + 1e-16) + w2 * h2 - inter
252 |
253 | iou = inter / union # iou
254 | if GIoU or DIoU or CIoU:
255 | cw = torch.max(b1_x2, b2_x2) - torch.min(b1_x1, b2_x1) # convex (smallest enclosing box) width
256 | ch = torch.max(b1_y2, b2_y2) - torch.min(b1_y1, b2_y1) # convex height
257 | if GIoU: # Generalized IoU https://arxiv.org/pdf/1902.09630.pdf
258 | c_area = cw * ch + 1e-16 # convex area
259 | return iou - (c_area - union) / c_area # GIoU
260 | if DIoU or CIoU: # Distance or Complete IoU https://arxiv.org/abs/1911.08287v1
261 | # convex diagonal squared
262 | c2 = cw ** 2 + ch ** 2 + 1e-16
263 | # centerpoint distance squared
264 | rho2 = ((b2_x1 + b2_x2) - (b1_x1 + b1_x2)) ** 2 / 4 + ((b2_y1 + b2_y2) - (b1_y1 + b1_y2)) ** 2 / 4
265 | if DIoU:
266 | return iou - rho2 / c2 # DIoU
267 | elif CIoU: # https://github.com/Zzh-tju/DIoU-SSD-pytorch/blob/master/utils/box/box_utils.py#L47
268 | v = (4 / math.pi ** 2) * torch.pow(torch.atan(w2 / h2) - torch.atan(w1 / h1), 2)
269 | with torch.no_grad():
270 | alpha = v / (1 - iou + v)
271 | return iou - (rho2 / c2 + v * alpha) # CIoU
272 |
273 | return iou
274 |
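# Editor's note: with C the smallest box enclosing both boxes,
#   GIoU = IoU - (|C| - |union|) / |C|
#   DIoU = IoU - rho^2(b1, b2) / c^2          (rho = centre distance, c = diagonal of C)
#   CIoU = DIoU - alpha * v                   (v = the aspect-ratio consistency term above)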
275 |
276 | def box_iou(box1, box2):
277 | # https://github.com/pytorch/vision/blob/master/torchvision/ops/boxes.py
278 | """
279 | Return intersection-over-union (Jaccard index) of boxes.
280 | Both sets of boxes are expected to be in (x1, y1, x2, y2) format.
281 | Arguments:
282 | box1 (Tensor[N, 4])
283 | box2 (Tensor[M, 4])
284 | Returns:
285 | iou (Tensor[N, M]): the NxM matrix containing the pairwise
286 | IoU values for every element in boxes1 and boxes2
287 | """
288 |
289 | def box_area(box):
290 | # box = 4xn
291 | return (box[2] - box[0]) * (box[3] - box[1])
292 |
293 | area1 = box_area(box1.t())
294 | area2 = box_area(box2.t())
295 |
296 | # inter(N,M) = (rb(N,M,2) - lt(N,M,2)).clamp(0).prod(2)
297 | inter = (torch.min(box1[:, None, 2:], box2[:, 2:]) - torch.max(box1[:, None, :2], box2[:, :2])).clamp(0).prod(2)
298 | return inter / (area1[:, None] + area2 - inter) # iou = inter / (area1 + area2 - inter)
299 |
300 |
301 | def wh_iou(wh1, wh2):
302 | # Returns the nxm IoU matrix. wh1 is nx2, wh2 is mx2
303 | wh1 = wh1[:, None] # [N,1,2]
304 | wh2 = wh2[None] # [1,M,2]
305 | inter = torch.min(wh1, wh2).prod(2) # [N,M]
306 | return inter / (wh1.prod(2) + wh2.prod(2) - inter) # iou = inter / (area1 + area2 - inter)
307 |
308 |
309 | class FocalLoss(nn.Module):
310 | # Wraps focal loss around existing loss_fcn(), i.e. criteria = FocalLoss(nn.BCEWithLogitsLoss(), gamma=1.5)
311 | def __init__(self, loss_fcn, gamma=1.5, alpha=0.25):
312 | super(FocalLoss, self).__init__()
313 | self.loss_fcn = loss_fcn # must be nn.BCEWithLogitsLoss()
314 | self.gamma = gamma
315 | self.alpha = alpha
316 | self.reduction = loss_fcn.reduction
317 | self.loss_fcn.reduction = 'none' # required to apply FL to each element
318 |
319 | def forward(self, pred, true):
320 | loss = self.loss_fcn(pred, true)
321 | # p_t = torch.exp(-loss)
322 | # loss *= self.alpha * (1.000001 - p_t) ** self.gamma # non-zero power for gradient stability
323 |
324 | # TF implementation https://github.com/tensorflow/addons/blob/v0.7.1/tensorflow_addons/losses/focal_loss.py
325 | pred_prob = torch.sigmoid(pred) # prob from logits
326 | p_t = true * pred_prob + (1 - true) * (1 - pred_prob)
327 | alpha_factor = true * self.alpha + (1 - true) * (1 - self.alpha)
328 | modulating_factor = (1.0 - p_t) ** self.gamma
329 | loss *= alpha_factor * modulating_factor
330 |
331 | if self.reduction == 'mean':
332 | return loss.mean()
333 | elif self.reduction == 'sum':
334 | return loss.sum()
335 | else: # 'none'
336 | return loss
337 |
338 |
339 | def smooth_BCE(eps=0.1): # https://github.com/ultralytics/yolov3/issues/238#issuecomment-598028441
340 | # return positive, negative label smoothing BCE targets
341 | return 1.0 - 0.5 * eps, 0.5 * eps
342 |
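# Editor's example: smooth_BCE(eps=0.1) returns (0.95, 0.05), i.e. positives are trained
# towards 0.95 and negatives towards 0.05 instead of hard 1/0 targets.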
343 |
344 | class BCEBlurWithLogitsLoss(nn.Module):
345 | # BCEWithLogitsLoss() with reduced missing label effects.
346 | def __init__(self, alpha=0.05):
347 | super(BCEBlurWithLogitsLoss, self).__init__()
348 | self.loss_fcn = nn.BCEWithLogitsLoss(reduction='none') # must be nn.BCEWithLogitsLoss()
349 | self.alpha = alpha
350 |
351 | def forward(self, pred, true):
352 | loss = self.loss_fcn(pred, true)
353 | pred = torch.sigmoid(pred) # prob from logits
354 | dx = pred - true # reduce only missing label effects
355 | # dx = (pred - true).abs() # reduce missing label and false label effects
356 | alpha_factor = 1 - torch.exp((dx - 1) / (self.alpha + 1e-4))
357 | loss *= alpha_factor
358 | return loss.mean()
359 |
360 |
361 | def compute_loss(p, targets, model): # predictions, targets, model
362 | ft = torch.cuda.FloatTensor if p[0].is_cuda else torch.Tensor
363 | lcls, lbox, lobj = ft([0]), ft([0]), ft([0])
364 | tcls, tbox, indices, anchors = build_targets(p, targets, model) # targets
365 | h = model.hyp # hyperparameters
366 | red = 'mean' # Loss reduction (sum or mean)
367 |
368 | # Define criteria
369 | BCEcls = nn.BCEWithLogitsLoss(pos_weight=ft([h['cls_pw']]), reduction=red)
370 | BCEobj = nn.BCEWithLogitsLoss(pos_weight=ft([h['obj_pw']]), reduction=red)
371 |
372 | # class label smoothing https://arxiv.org/pdf/1902.04103.pdf eqn 3
373 | cp, cn = smooth_BCE(eps=0.0)
374 |
375 | # focal loss
376 | g = h['fl_gamma'] # focal loss gamma
377 | if g > 0:
378 | BCEcls, BCEobj = FocalLoss(BCEcls, g), FocalLoss(BCEobj, g)
379 |
380 | # per output
381 | nt = 0 # targets
382 | for i, pi in enumerate(p): # layer index, layer predictions
383 | b, a, gj, gi = indices[i] # image, anchor, gridy, gridx
384 | tobj = torch.zeros_like(pi[..., 0]) # target obj
385 |
386 | nb = b.shape[0] # number of targets
387 | if nb:
388 | nt += nb # cumulative targets
389 | ps = pi[b, a, gj, gi] # prediction subset corresponding to targets
390 |
391 | # GIoU
392 | pxy = ps[:, :2].sigmoid() * 2. - 0.5
393 | pwh = (ps[:, 2:4].sigmoid() * 2) ** 2 * anchors[i]
394 | pbox = torch.cat((pxy, pwh), 1) # predicted box
395 | giou = bbox_iou(pbox.t(), tbox[i], x1y1x2y2=False, GIoU=True) # giou(prediction, target)
396 | lbox += (1.0 - giou).sum() if red == 'sum' else (1.0 - giou).mean() # giou loss
397 |
398 | # Obj
399 | tobj[b, a, gj, gi] = (1.0 - model.gr) + model.gr * giou.detach().clamp(0).type(tobj.dtype) # giou ratio
400 |
401 | # Class
402 | if model.nc > 1: # cls loss (only if multiple classes)
403 | t = torch.full_like(ps[:, 5:], cn) # targets
404 | t[range(nb), tcls[i]] = cp
405 | lcls += BCEcls(ps[:, 5:], t) # BCE
406 |
407 | # Append targets to text file
408 | # with open('targets.txt', 'a') as file:
409 | # [file.write('%11.5g ' * 4 % tuple(x) + '\n') for x in torch.cat((txy[i], twh[i]), 1)]
410 |
411 | lobj += BCEobj(pi[..., 4], tobj) # obj loss
412 |
413 | lbox *= h['giou']
414 | lobj *= h['obj']
415 | lcls *= h['cls']
416 | bs = tobj.shape[0] # batch size
417 | if red == 'sum':
418 | g = 3.0 # loss gain
419 | lobj *= g / bs
420 | if nt:
421 | lcls *= g / nt / model.nc
422 | lbox *= g / nt
423 |
424 | loss = lbox + lobj + lcls
425 | return loss * bs, torch.cat((lbox, lobj, lcls, loss)).detach()
426 |
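# Editor's note: the total loss is h['giou']*lbox + h['obj']*lobj + h['cls']*lcls, scaled by
# batch size; a typical call from the training loop (assumed) is
#   loss, loss_items = compute_loss(pred, targets.to(device), model)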
427 |
428 | def build_targets(p, targets, model):
429 | # Build targets for compute_loss(), input targets(image,class,x,y,w,h)
430 | det = model.module.model[-1] if type(model) in (nn.parallel.DataParallel, nn.parallel.DistributedDataParallel) \
431 | else model.model[-1] # Detect() module
432 | na, nt = det.na, targets.shape[0] # number of anchors, targets
433 | tcls, tbox, indices, anch = [], [], [], []
434 | gain = torch.ones(6, device=targets.device) # normalized to gridspace gain
435 | off = torch.tensor([[1, 0], [0, 1], [-1, 0], [0, -1]], device=targets.device).float() # overlap offsets
436 | at = torch.arange(na).view(na, 1).repeat(1, nt) # anchor tensor, same as .repeat_interleave(nt)
437 |
438 | style = 'rect4'
439 | for i in range(det.nl):
440 | anchors = det.anchors[i]
441 | gain[2:] = torch.tensor(p[i].shape)[[3, 2, 3, 2]] # xyxy gain
442 |
443 | # Match targets to anchors
444 | a, t, offsets = [], targets * gain, 0
445 | if nt:
446 | r = t[None, :, 4:6] / anchors[:, None] # wh ratio
447 | j = torch.max(r, 1. / r).max(2)[0] < model.hyp['anchor_t'] # compare
448 | # j = wh_iou(anchors, t[:, 4:6]) > model.hyp['iou_t'] # iou(3,n) = wh_iou(anchors(3,2), gwh(n,2))
449 | a, t = at[j], t.repeat(na, 1, 1)[j] # filter
450 |
451 | # overlaps
452 | gxy = t[:, 2:4] # grid xy
453 | z = torch.zeros_like(gxy)
454 | if style == 'rect2':
455 | g = 0.2 # offset
456 | j, k = ((gxy % 1. < g) & (gxy > 1.)).T
457 | a, t = torch.cat((a, a[j], a[k]), 0), torch.cat((t, t[j], t[k]), 0)
458 | offsets = torch.cat((z, z[j] + off[0], z[k] + off[1]), 0) * g
459 |
460 | elif style == 'rect4':
461 | g = 0.5 # offset
462 | j, k = ((gxy % 1. < g) & (gxy > 1.)).T
463 | l, m = ((gxy % 1. > (1 - g)) & (gxy < (gain[[2, 3]] - 1.))).T
464 | a, t = torch.cat((a, a[j], a[k], a[l], a[m]), 0), torch.cat((t, t[j], t[k], t[l], t[m]), 0)
465 | offsets = torch.cat((z, z[j] + off[0], z[k] + off[1], z[l] + off[2], z[m] + off[3]), 0) * g
466 |
467 | # Define
468 | b, c = t[:, :2].long().T # image, class
469 | gxy = t[:, 2:4] # grid xy
470 | gwh = t[:, 4:6] # grid wh
471 | gij = (gxy - offsets).long()
472 | gi, gj = gij.T # grid xy indices
473 |
474 | # Append
475 | indices.append((b, a, gj, gi)) # image, anchor, grid indices
476 | tbox.append(torch.cat((gxy - gij, gwh), 1)) # box
477 | anch.append(anchors[a]) # anchors
478 | tcls.append(c) # class
479 |
480 | return tcls, tbox, indices, anch
481 |
482 |
483 | def non_max_suppression(prediction, conf_thres=0.1, iou_thres=0.6, fast=False, classes=None, agnostic=False):
484 | """
485 | Performs Non-Maximum Suppression on inference results
486 | Returns detections with shape:
487 | nx6 (x1, y1, x2, y2, conf, cls)
488 | """
489 | nc = prediction[0].shape[1] - 5 # number of classes
490 | xc = prediction[..., 4] > conf_thres # candidates
491 |
492 | # Settings
493 | min_wh, max_wh = 2, 4096 # (pixels) minimum and maximum box width and height
494 | max_det = 300 # maximum number of detections per image
495 | time_limit = 10.0 # seconds to quit after
496 | redundant = True # require redundant detections
497 | fast |= conf_thres > 0.001 # fast mode
498 | if fast:
499 | merge = False
500 | multi_label = False
501 | else:
502 | merge = True # merge for best mAP (adds 0.5ms/img)
503 | multi_label = nc > 1 # multiple labels per box (adds 0.5ms/img)
504 |
505 | t = time.time()
506 | output = [None] * prediction.shape[0]
507 | for xi, x in enumerate(prediction): # image index, image inference
508 | # Apply constraints
509 | # x[((x[..., 2:4] < min_wh) | (x[..., 2:4] > max_wh)).any(1), 4] = 0 # width-height
510 | x = x[xc[xi]] # confidence
511 |
512 | # If none remain process next image
513 | if not x.shape[0]:
514 | continue
515 |
516 | # Compute conf
517 | x[:, 5:] *= x[:, 4:5] # conf = obj_conf * cls_conf
518 |
519 | # Box (center x, center y, width, height) to (x1, y1, x2, y2)
520 | box = xywh2xyxy(x[:, :4])
521 |
522 | # Detections matrix nx6 (xyxy, conf, cls)
523 | if multi_label:
524 | i, j = (x[:, 5:] > conf_thres).nonzero().t()
525 | x = torch.cat((box[i], x[i, j + 5, None], j[:, None].float()), 1)
526 | else: # best class only
527 | conf, j = x[:, 5:].max(1, keepdim=True)
528 | x = torch.cat((box, conf, j.float()), 1)[conf.view(-1) > conf_thres]
529 |
530 | # Filter by class
531 | if classes:
532 | x = x[(x[:, 5:6] == torch.tensor(classes, device=x.device)).any(1)]
533 |
534 | # Apply finite constraint
535 | # if not torch.isfinite(x).all():
536 | # x = x[torch.isfinite(x).all(1)]
537 |
538 | # If none remain process next image
539 | n = x.shape[0] # number of boxes
540 | if not n:
541 | continue
542 |
543 | # Sort by confidence
544 | # x = x[x[:, 4].argsort(descending=True)]
545 |
546 | # Batched NMS
547 | c = x[:, 5:6] * (0 if agnostic else max_wh) # classes
548 | boxes, scores = x[:, :4] + c, x[:, 4] # boxes (offset by class), scores
549 | i = torchvision.ops.boxes.nms(boxes, scores, iou_thres)
550 | if i.shape[0] > max_det: # limit detections
551 | i = i[:max_det]
552 | if merge and (1 < n < 3E3): # Merge NMS (boxes merged using weighted mean)
553 | try: # update boxes as boxes(i,4) = weights(i,n) * boxes(n,4)
554 | iou = box_iou(boxes[i], boxes) > iou_thres # iou matrix
555 | weights = iou * scores[None] # box weights
556 | x[i, :4] = torch.mm(weights, x[:, :4]).float() / weights.sum(1, keepdim=True) # merged boxes
557 | if redundant:
558 | i = i[iou.sum(1) > 1] # require redundancy
559 | except: # possible CUDA error https://github.com/ultralytics/yolov3/issues/1139
560 | print(x, i, x.shape, i.shape)
561 | pass
562 |
563 | output[xi] = x[i]
564 | if (time.time() - t) > time_limit:
565 | break # time limit exceeded
566 |
567 | return output
568 |
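# Editor's usage sketch (assumed inference flow):
#   pred = model(img)[0]                                         # (bs, n, 5 + nc) raw predictions
#   pred = non_max_suppression(pred, conf_thres=0.4, iou_thres=0.5)
#   for det in pred:                                             # one (k, 6) tensor or None per image
#       if det is not None and len(det):
#           det[:, :4] = scale_coords(img.shape[2:], det[:, :4], im0.shape).round()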
569 |
570 | def strip_optimizer(f='weights/best.pt'): # from utils.utils import *; strip_optimizer()
571 | # Strip the optimizer state from *.pt checkpoints to roughly halve file size
572 | x = torch.load(f, map_location=torch.device('cpu'))
573 | x['optimizer'] = None
574 | torch.save(x, f)
575 | print('Optimizer stripped from %s' % f)
576 |
577 |
578 | def create_backbone(f='weights/best.pt', s='weights/backbone.pt'): # from utils.utils import *; create_backbone()
579 | # create backbone 's' from 'f'
580 | device = torch.device('cpu')
581 | x = torch.load(f, map_location=device)
582 | torch.save(x, s) # update model if SourceChangeWarning
583 | x = torch.load(s, map_location=device)
584 |
585 | x['optimizer'] = None
586 | x['training_results'] = None
587 | x['epoch'] = -1
588 | for p in x['model'].parameters():
589 | p.requires_grad = True
590 | torch.save(x, s)
591 | print('%s modified for backbone use and saved as %s' % (f, s))
592 |
593 |
594 | def coco_class_count(path='../coco/labels/train2014/'):
595 | # Histogram of occurrences per class
596 | nc = 80 # number classes
597 | x = np.zeros(nc, dtype='int32')
598 | files = sorted(glob.glob('%s/*.*' % path))
599 | for i, file in enumerate(files):
600 | labels = np.loadtxt(file, dtype=np.float32).reshape(-1, 5)
601 | x += np.bincount(labels[:, 0].astype('int32'), minlength=nc)
602 | print(i, len(files))
603 |
604 |
605 | def coco_only_people(path='../coco/labels/train2017/'): # from utils.utils import *; coco_only_people()
606 | # Find images with only people
607 | files = sorted(glob.glob('%s/*.*' % path))
608 | for i, file in enumerate(files):
609 | labels = np.loadtxt(file, dtype=np.float32).reshape(-1, 5)
610 | if all(labels[:, 0] == 0):
611 | print(labels.shape[0], file)
612 |
613 |
614 | def crop_images_random(path='../images/', scale=0.50): # from utils.utils import *; crop_images_random()
615 | # crops images into random squares up to scale fraction
616 | # WARNING: overwrites images!
617 | for file in tqdm(sorted(glob.glob('%s/*.*' % path))):
618 | img = cv2.imread(file) # BGR
619 | if img is not None:
620 | h, w = img.shape[:2]
621 |
622 | # create random mask
623 | a = 30 # minimum size (pixels)
624 | mask_h = random.randint(a, int(max(a, h * scale))) # mask height
625 | mask_w = mask_h # mask width
626 |
627 | # box
628 | xmin = max(0, random.randint(0, w) - mask_w // 2)
629 | ymin = max(0, random.randint(0, h) - mask_h // 2)
630 | xmax = min(w, xmin + mask_w)
631 | ymax = min(h, ymin + mask_h)
632 |
633 |             # overwrite the original image with the random square crop
634 | cv2.imwrite(file, img[ymin:ymax, xmin:xmax])
635 |
636 |
637 | def coco_single_class_labels(path='../coco/labels/train2014/', label_class=43):
638 | # Makes single-class coco datasets. from utils.utils import *; coco_single_class_labels()
639 | if os.path.exists('new/'):
640 | shutil.rmtree('new/') # delete output folder
641 | os.makedirs('new/') # make new output folder
642 | os.makedirs('new/labels/')
643 | os.makedirs('new/images/')
644 | for file in tqdm(sorted(glob.glob('%s/*.*' % path))):
645 | with open(file, 'r') as f:
646 | labels = np.array([x.split() for x in f.read().splitlines()], dtype=np.float32)
647 | i = labels[:, 0] == label_class
648 | if any(i):
649 | img_file = file.replace('labels', 'images').replace('txt', 'jpg')
650 | labels[:, 0] = 0 # reset class to 0
651 | with open('new/images.txt', 'a') as f: # add image to dataset list
652 | f.write(img_file + '\n')
653 | with open('new/labels/' + Path(file).name, 'a') as f: # write label
654 | for l in labels[i]:
655 | f.write('%g %.6f %.6f %.6f %.6f\n' % tuple(l))
656 | shutil.copyfile(src=img_file, dst='new/images/' + Path(file).name.replace('txt', 'jpg')) # copy images
657 |
658 |
659 | def kmean_anchors(path='./data/coco128.txt', n=9, img_size=(640, 640), thr=0.20, gen=1000):
660 | # Creates kmeans anchors for use in *.cfg files: from utils.utils import *; _ = kmean_anchors()
661 | # n: number of anchors
662 | # img_size: (min, max) image size used for multi-scale training (can be same values)
663 | # thr: IoU threshold hyperparameter used for training (0.0 - 1.0)
664 | # gen: generations to evolve anchors using genetic algorithm
665 | from utils.datasets import LoadImagesAndLabels
666 |
667 | def print_results(k):
668 | k = k[np.argsort(k.prod(1))] # sort small to large
669 | iou = wh_iou(wh, torch.Tensor(k))
670 | max_iou = iou.max(1)[0]
671 | bpr, aat = (max_iou > thr).float().mean(), (iou > thr).float().mean() * n # best possible recall, anch > thr
672 |
673 | # thr = 5.0
674 | # r = wh[:, None] / k[None]
675 | # ar = torch.max(r, 1. / r).max(2)[0]
676 | # max_ar = ar.min(1)[0]
677 | # bpr, aat = (max_ar < thr).float().mean(), (ar < thr).float().mean() * n # best possible recall, anch > thr
678 |
679 | print('%.2f iou_thr: %.3f best possible recall, %.2f anchors > thr' % (thr, bpr, aat))
680 | print('n=%g, img_size=%s, IoU_all=%.3f/%.3f-mean/best, IoU>thr=%.3f-mean: ' %
681 | (n, img_size, iou.mean(), max_iou.mean(), iou[iou > thr].mean()), end='')
682 | for i, x in enumerate(k):
683 | print('%i,%i' % (round(x[0]), round(x[1])), end=', ' if i < len(k) - 1 else '\n') # use in *.cfg
684 | return k
685 |
686 | def fitness(k): # mutation fitness
687 | iou = wh_iou(wh, torch.Tensor(k)) # iou
688 | max_iou = iou.max(1)[0]
689 | return (max_iou * (max_iou > thr).float()).mean() # product
690 |
691 | # def fitness_ratio(k): # mutation fitness
692 | # # wh(5316,2), k(9,2)
693 | # r = wh[:, None] / k[None]
694 | # x = torch.max(r, 1. / r).max(2)[0]
695 | # m = x.min(1)[0]
696 | # return 1. / (m * (m < 5).float()).mean() # product
697 |
698 | # Get label wh
699 | wh = []
700 | dataset = LoadImagesAndLabels(path, augment=True, rect=True)
701 | nr = 1 if img_size[0] == img_size[1] else 3 # number augmentation repetitions
702 | for s, l in zip(dataset.shapes, dataset.labels):
703 | # wh.append(l[:, 3:5] * (s / s.max())) # image normalized to letterbox normalized wh
704 | wh.append(l[:, 3:5] * s) # image normalized to pixels
705 |     wh = np.concatenate(wh, 0).repeat(nr, axis=0)  # repeat nr times (3x for multi-scale, 1x otherwise)
706 | # wh *= np.random.uniform(img_size[0], img_size[1], size=(wh.shape[0], 1)) # normalized to pixels (multi-scale)
707 | wh = wh[(wh > 2.0).all(1)] # remove below threshold boxes (< 2 pixels wh)
708 |
709 | # Kmeans calculation
710 | from scipy.cluster.vq import kmeans
711 | print('Running kmeans for %g anchors on %g points...' % (n, len(wh)))
712 | s = wh.std(0) # sigmas for whitening
713 | k, dist = kmeans(wh / s, n, iter=30) # points, mean distance
714 | k *= s
715 | wh = torch.Tensor(wh)
716 | k = print_results(k)
717 |
718 | # # Plot
719 | # k, d = [None] * 20, [None] * 20
720 | # for i in tqdm(range(1, 21)):
721 | # k[i-1], d[i-1] = kmeans(wh / s, i) # points, mean distance
722 | # fig, ax = plt.subplots(1, 2, figsize=(14, 7))
723 | # ax = ax.ravel()
724 | # ax[0].plot(np.arange(1, 21), np.array(d) ** 2, marker='.')
725 | # fig, ax = plt.subplots(1, 2, figsize=(14, 7)) # plot wh
726 | # ax[0].hist(wh[wh[:, 0]<100, 0],400)
727 | # ax[1].hist(wh[wh[:, 1]<100, 1],400)
728 | # fig.tight_layout()
729 | # fig.savefig('wh.png', dpi=200)
730 |
731 | # Evolve
732 | npr = np.random
733 |     f, sh, mp, s = fitness(k), k.shape, 0.9, 0.1  # fitness, anchor array shape, mutation probability, sigma
734 | for _ in tqdm(range(gen), desc='Evolving anchors'):
735 | v = np.ones(sh)
736 | while (v == 1).all(): # mutate until a change occurs (prevent duplicates)
737 | v = ((npr.random(sh) < mp) * npr.random() * npr.randn(*sh) * s + 1).clip(0.3, 3.0)
738 | kg = (k.copy() * v).clip(min=2.0)
739 | fg = fitness(kg)
740 | if fg > f:
741 | f, k = fg, kg.copy()
742 | print_results(k)
743 | k = print_results(k)
744 |
745 | return k
746 |
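# Illustrative sketch (added comment, not part of the original code): the default arguments
# below match the function signature; point `path` at your own training list, then copy the
# printed "w,h" pairs into your model's anchor configuration:
#   k = kmean_anchors(path='./data/coco128.txt', n=9, img_size=(640, 640), thr=0.20, gen=1000)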
747 |
748 | def print_mutation(hyp, results, bucket=''):
749 | # Print mutation results to evolve.txt (for use with train.py --evolve)
750 | a = '%10s' * len(hyp) % tuple(hyp.keys()) # hyperparam keys
751 | b = '%10.3g' * len(hyp) % tuple(hyp.values()) # hyperparam values
752 | c = '%10.4g' * len(results) % results # results (P, R, mAP, F1, test_loss)
753 | print('\n%s\n%s\nEvolved fitness: %s\n' % (a, b, c))
754 |
755 | if bucket:
756 | os.system('gsutil cp gs://%s/evolve.txt .' % bucket) # download evolve.txt
757 |
758 | with open('evolve.txt', 'a') as f: # append result
759 | f.write(c + b + '\n')
760 | x = np.unique(np.loadtxt('evolve.txt', ndmin=2), axis=0) # load unique rows
761 | np.savetxt('evolve.txt', x[np.argsort(-fitness(x))], '%10.3g') # save sort by fitness
762 |
763 | if bucket:
764 | os.system('gsutil cp evolve.txt gs://%s' % bucket) # upload evolve.txt
765 |
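# Illustrative note (added comment): each evolve.txt row stores the result columns first and
# the hyperparameter values after them, which is why plot_evolution_results() later in this
# file reads hyperparameters starting at column index 7 (assuming the usual 7-value results tuple).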
766 |
767 | def apply_classifier(x, model, img, im0):
768 | # applies a second stage classifier to yolo outputs
769 | im0 = [im0] if isinstance(im0, np.ndarray) else im0
770 | for i, d in enumerate(x): # per image
771 | if d is not None and len(d):
772 | d = d.clone()
773 |
774 | # Reshape and pad cutouts
775 | b = xyxy2xywh(d[:, :4]) # boxes
776 | b[:, 2:] = b[:, 2:].max(1)[0].unsqueeze(1) # rectangle to square
777 | b[:, 2:] = b[:, 2:] * 1.3 + 30 # pad
778 | d[:, :4] = xywh2xyxy(b).long()
779 |
780 | # Rescale boxes from img_size to im0 size
781 | scale_coords(img.shape[2:], d[:, :4], im0[i].shape)
782 |
783 | # Classes
784 | pred_cls1 = d[:, 5].long()
785 | ims = []
786 | for j, a in enumerate(d): # per item
787 | cutout = im0[i][int(a[1]):int(a[3]), int(a[0]):int(a[2])]
788 | im = cv2.resize(cutout, (224, 224)) # BGR
789 | # cv2.imwrite('test%i.jpg' % j, cutout)
790 |
791 |                 im = im[:, :, ::-1].transpose(2, 0, 1)  # BGR to RGB, to 3x224x224
792 | im = np.ascontiguousarray(im, dtype=np.float32) # uint8 to float32
793 | im /= 255.0 # 0 - 255 to 0.0 - 1.0
794 | ims.append(im)
795 |
796 | pred_cls2 = model(torch.Tensor(ims).to(d.device)).argmax(1) # classifier prediction
797 | x[i] = x[i][pred_cls1 == pred_cls2] # retain matching class detections
798 |
799 | return x
800 |
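# Illustrative note (added comment): `model` is assumed to be a 224x224 image classifier
# (for example a torchvision ResNet); detections whose classifier prediction disagrees with
# the original YOLO class are discarded.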
801 |
802 | def fitness(x):
803 | # Returns fitness (for use with results.txt or evolve.txt)
804 | w = [0.0, 0.0, 0.1, 0.9] # weights for [P, R, mAP@0.5, mAP@0.5:0.95]
805 | return (x[:, :4] * w).sum(1)
806 |
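# Illustrative worked example (added comment): with w = [0.0, 0.0, 0.1, 0.9], a row with
# P=0.9, R=0.8, mAP@0.5=0.6, mAP@0.5:0.95=0.4 scores 0.1 * 0.6 + 0.9 * 0.4 = 0.42.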
807 |
808 | def output_to_target(output, width, height):
809 | """
810 | Convert a YOLO model output to target format
811 | [batch_id, class_id, x, y, w, h, conf]
812 | """
813 | if isinstance(output, torch.Tensor):
814 | output = output.cpu().numpy()
815 |
816 | targets = []
817 | for i, o in enumerate(output):
818 | if o is not None:
819 | for pred in o:
820 | box = pred[:4]
821 | w = (box[2] - box[0]) / width
822 | h = (box[3] - box[1]) / height
823 | x = box[0] / width + w / 2
824 | y = box[1] / height + h / 2
825 | conf = pred[4]
826 | cls = int(pred[5])
827 |
828 | targets.append([i, cls, x, y, w, h, conf])
829 |
830 | return np.array(targets)
831 |
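# Illustrative worked example (added comment): for a 640x480 image, a prediction
# [x1=0, y1=0, x2=320, y2=240, conf=0.9, cls=7] becomes the target row
# [batch_id=0, class_id=7, x=0.25, y=0.25, w=0.5, h=0.5, conf=0.9] in normalized xywh.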
832 |
833 | # Plotting functions ---------------------------------------------------------------------------------------------------
834 | def butter_lowpass_filtfilt(data, cutoff=1500, fs=50000, order=5):
835 | # https://stackoverflow.com/questions/28536191/how-to-filter-smooth-with-scipy-numpy
836 | def butter_lowpass(cutoff, fs, order):
837 | nyq = 0.5 * fs
838 | normal_cutoff = cutoff / nyq
839 | b, a = butter(order, normal_cutoff, btype='low', analog=False)
840 | return b, a
841 |
842 | b, a = butter_lowpass(cutoff, fs, order=order)
843 | return filtfilt(b, a, data) # forward-backward filter
844 |
845 |
846 | def cv2AddChineseText(img, text, position, textColor=(0, 255, 0), textSize=30):
847 |     if isinstance(img, np.ndarray):  # convert an OpenCV (numpy, BGR) image to a PIL image
848 |         img = Image.fromarray(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
849 |     # create a drawing handle on the image
850 |     draw = ImageDraw.Draw(img)
851 |     # font settings (SimSun supports Chinese glyphs)
852 |     fontStyle = ImageFont.truetype(
853 |         "simsun.ttc", textSize, encoding="utf-8")
854 |     # draw the text
855 |     draw.text(position, text, textColor, font=fontStyle)
856 |     # convert back to an OpenCV BGR image
857 |     return cv2.cvtColor(np.asarray(img), cv2.COLOR_RGB2BGR)
858 |
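# Note (added comment): cv2AddChineseText loads the SimSun font file "simsun.ttc" via
# PIL.ImageFont.truetype, so that font must be available on the system (or in the working
# directory) for the plot_one_box labels below to render.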
859 | def plot_one_box(x, img, color=None, label=None, line_thickness=None):
860 | # Plots one bounding box on image img
861 | tl = line_thickness or round(0.002 * (img.shape[0] + img.shape[1]) / 2) + 1 # line/font thickness
862 | color = color or [random.randint(0, 255) for _ in range(3)]
863 | c1, c2 = (int(x[0]), int(x[1])), (int(x[2]), int(x[3]))
864 | cv2.rectangle(img, c1, c2, color, thickness=tl, lineType=cv2.LINE_AA)
865 | if label:
866 | tf = max(tl - 1, 1) # font thickness
867 | t_size = cv2.getTextSize(label, 0, fontScale=tl / 3, thickness=tf)[0]
868 | c2 = c1[0] + t_size[0], c1[1] - t_size[1] - 3
869 | cv2.rectangle(img, c1, c2, color, -1, cv2.LINE_AA) # filled
870 | img = cv2AddChineseText(img, label, (c1[0], c1[1] - 14), textColor=(255, 255, 255), textSize=15)
871 | font = cv2.FONT_HERSHEY_SIMPLEX
872 |         cv2.putText(img, "YOLO v5 by HuBin", (40, 40), font, 0.1, (0, 255, 0), 1)
873 | # cv2.putText(img, label, (c1[0], c1[1] - 2), 0, tl / 3, [225, 255, 255], thickness=tf, lineType=cv2.LINE_AA)
874 |
875 | return img
876 |
877 | def plot_wh_methods(): # from utils.utils import *; plot_wh_methods()
878 | # Compares the two methods for width-height anchor multiplication
879 | # https://github.com/ultralytics/yolov3/issues/168
880 | x = np.arange(-4.0, 4.0, .1)
881 | ya = np.exp(x)
882 | yb = torch.sigmoid(torch.from_numpy(x)).numpy() * 2
883 |
884 | fig = plt.figure(figsize=(6, 3), dpi=150)
885 | plt.plot(x, ya, '.-', label='yolo method')
886 | plt.plot(x, yb ** 2, '.-', label='^2 power method')
887 | plt.plot(x, yb ** 2.5, '.-', label='^2.5 power method')
888 | plt.xlim(left=-4, right=4)
889 | plt.ylim(bottom=0, top=6)
890 | plt.xlabel('input')
891 | plt.ylabel('output')
892 | plt.legend()
893 | fig.tight_layout()
894 | fig.savefig('comparison.png', dpi=200)
895 |
896 |
897 | def plot_images(images, targets, paths=None, fname='images.jpg', names=None, max_size=640, max_subplots=16):
898 | tl = 3 # line thickness
899 | tf = max(tl - 1, 1) # font thickness
900 | if os.path.isfile(fname): # do not overwrite
901 | return None
902 |
903 | if isinstance(images, torch.Tensor):
904 | images = images.cpu().numpy()
905 |
906 | if isinstance(targets, torch.Tensor):
907 | targets = targets.cpu().numpy()
908 |
909 | # un-normalise
910 | if np.max(images[0]) <= 1:
911 | images *= 255
912 |
913 | bs, _, h, w = images.shape # batch size, _, height, width
914 | bs = min(bs, max_subplots) # limit plot images
915 | ns = np.ceil(bs ** 0.5) # number of subplots (square)
916 |
917 | # Check if we should resize
918 | scale_factor = max_size / max(h, w)
919 | if scale_factor < 1:
920 | h = math.ceil(scale_factor * h)
921 | w = math.ceil(scale_factor * w)
922 |
923 | # Empty array for output
924 | mosaic = np.full((int(ns * h), int(ns * w), 3), 255, dtype=np.uint8)
925 |
926 | # Fix class - colour map
927 | prop_cycle = plt.rcParams['axes.prop_cycle']
928 | # https://stackoverflow.com/questions/51350872/python-from-color-name-to-rgb
929 | hex2rgb = lambda h: tuple(int(h[1 + i:1 + i + 2], 16) for i in (0, 2, 4))
930 | color_lut = [hex2rgb(h) for h in prop_cycle.by_key()['color']]
931 |
932 | for i, img in enumerate(images):
933 |         if i == max_subplots:  # stop once max_subplots images have been drawn
934 | break
935 |
936 | block_x = int(w * (i // ns))
937 | block_y = int(h * (i % ns))
938 |
939 | img = img.transpose(1, 2, 0)
940 | if scale_factor < 1:
941 | img = cv2.resize(img, (w, h))
942 |
943 | mosaic[block_y:block_y + h, block_x:block_x + w, :] = img
944 | if len(targets) > 0:
945 | image_targets = targets[targets[:, 0] == i]
946 | boxes = xywh2xyxy(image_targets[:, 2:6]).T
947 | classes = image_targets[:, 1].astype('int')
948 | gt = image_targets.shape[1] == 6 # ground truth if no conf column
949 | conf = None if gt else image_targets[:, 6] # check for confidence presence (gt vs pred)
950 |
951 | boxes[[0, 2]] *= w
952 | boxes[[0, 2]] += block_x
953 | boxes[[1, 3]] *= h
954 | boxes[[1, 3]] += block_y
955 | for j, box in enumerate(boxes.T):
956 | cls = int(classes[j])
957 | color = color_lut[cls % len(color_lut)]
958 | cls = names[cls] if names else cls
959 | if gt or conf[j] > 0.3: # 0.3 conf thresh
960 | label = '%s' % cls if gt else '%s %.1f' % (cls, conf[j])
961 | plot_one_box(box, mosaic, label=label, color=color, line_thickness=tl)
962 |
963 | # Draw image filename labels
964 | if paths is not None:
965 | label = os.path.basename(paths[i])[:40] # trim to 40 char
966 | t_size = cv2.getTextSize(label, 0, fontScale=tl / 3, thickness=tf)[0]
967 | cv2.putText(mosaic, label, (block_x + 5, block_y + t_size[1] + 5), 0, tl / 3, [220, 220, 220], thickness=tf,
968 | lineType=cv2.LINE_AA)
969 |
970 | # Image border
971 | cv2.rectangle(mosaic, (block_x, block_y), (block_x + w, block_y + h), (255, 255, 255), thickness=3)
972 |
973 | if fname is not None:
974 | mosaic = cv2.resize(mosaic, (int(ns * w * 0.5), int(ns * h * 0.5)), interpolation=cv2.INTER_AREA)
975 | cv2.imwrite(fname, cv2.cvtColor(mosaic, cv2.COLOR_BGR2RGB))
976 |
977 | return mosaic
978 |
979 |
980 | def plot_lr_scheduler(optimizer, scheduler, epochs=300):
981 | # Plot LR simulating training for full epochs
982 | optimizer, scheduler = copy(optimizer), copy(scheduler) # do not modify originals
983 | y = []
984 | for _ in range(epochs):
985 | scheduler.step()
986 | y.append(optimizer.param_groups[0]['lr'])
987 | plt.plot(y, '.-', label='LR')
988 | plt.xlabel('epoch')
989 | plt.ylabel('LR')
990 | plt.grid()
991 | plt.xlim(0, epochs)
992 | plt.ylim(0)
993 | plt.tight_layout()
994 | plt.savefig('LR.png', dpi=200)
995 |
996 |
997 | def plot_test_txt(): # from utils.utils import *; plot_test()
998 | # Plot test.txt histograms
999 | x = np.loadtxt('test.txt', dtype=np.float32)
1000 | box = xyxy2xywh(x[:, :4])
1001 | cx, cy = box[:, 0], box[:, 1]
1002 |
1003 | fig, ax = plt.subplots(1, 1, figsize=(6, 6), tight_layout=True)
1004 | ax.hist2d(cx, cy, bins=600, cmax=10, cmin=0)
1005 | ax.set_aspect('equal')
1006 | plt.savefig('hist2d.png', dpi=300)
1007 |
1008 | fig, ax = plt.subplots(1, 2, figsize=(12, 6), tight_layout=True)
1009 | ax[0].hist(cx, bins=600)
1010 | ax[1].hist(cy, bins=600)
1011 | plt.savefig('hist1d.png', dpi=200)
1012 |
1013 |
1014 | def plot_targets_txt(): # from utils.utils import *; plot_targets_txt()
1015 | # Plot targets.txt histograms
1016 | x = np.loadtxt('targets.txt', dtype=np.float32).T
1017 | s = ['x targets', 'y targets', 'width targets', 'height targets']
1018 | fig, ax = plt.subplots(2, 2, figsize=(8, 8), tight_layout=True)
1019 | ax = ax.ravel()
1020 | for i in range(4):
1021 | ax[i].hist(x[i], bins=100, label='%.3g +/- %.3g' % (x[i].mean(), x[i].std()))
1022 | ax[i].legend()
1023 | ax[i].set_title(s[i])
1024 | plt.savefig('targets.jpg', dpi=200)
1025 |
1026 |
1027 | def plot_study_txt(f='study.txt', x=None): # from utils.utils import *; plot_study_txt()
1028 | # Plot study.txt generated by test.py
1029 | fig, ax = plt.subplots(2, 4, figsize=(10, 6), tight_layout=True)
1030 | ax = ax.ravel()
1031 |
1032 | fig2, ax2 = plt.subplots(1, 1, figsize=(8, 4), tight_layout=True)
1033 | for f in ['coco_study/study_coco_yolov5%s.txt' % x for x in ['s', 'm', 'l', 'x']]:
1034 | y = np.loadtxt(f, dtype=np.float32, usecols=[0, 1, 2, 3, 7, 8, 9], ndmin=2).T
1035 | x = np.arange(y.shape[1]) if x is None else np.array(x)
1036 | s = ['P', 'R', 'mAP@.5', 'mAP@.5:.95', 't_inference (ms/img)', 't_NMS (ms/img)', 't_total (ms/img)']
1037 | for i in range(7):
1038 | ax[i].plot(x, y[i], '.-', linewidth=2, markersize=8)
1039 | ax[i].set_title(s[i])
1040 |
1041 | j = y[3].argmax() + 1
1042 | ax2.plot(y[6, :j], y[3, :j] * 1E2, '.-', linewidth=2, markersize=8,
1043 | label=Path(f).stem.replace('study_coco_', '').replace('yolo', 'YOLO'))
1044 |
1045 | ax2.plot(1E3 / np.array([209, 140, 97, 58, 35, 18]), [33.5, 39.1, 42.5, 45.9, 49., 50.5],
1046 | 'k.-', linewidth=2, markersize=8, alpha=.25, label='EfficientDet')
1047 | ax2.set_xlim(0, 30)
1048 | ax2.set_ylim(25, 50)
1049 | ax2.set_xlabel('GPU Latency (ms)')
1050 | ax2.set_ylabel('COCO AP val')
1051 | ax2.legend(loc='lower right')
1052 | ax2.grid()
1053 | plt.savefig('study_mAP_latency.png', dpi=300)
1054 | plt.savefig(f.replace('.txt', '.png'), dpi=200)
1055 |
1056 |
1057 | def plot_labels(labels):
1058 | # plot dataset labels
1059 |     c, b = labels[:, 0], labels[:, 1:].transpose()  # classes, boxes
1060 |
1061 | def hist2d(x, y, n=100):
1062 | xedges, yedges = np.linspace(x.min(), x.max(), n), np.linspace(y.min(), y.max(), n)
1063 | hist, xedges, yedges = np.histogram2d(x, y, (xedges, yedges))
1064 | xidx = np.clip(np.digitize(x, xedges) - 1, 0, hist.shape[0] - 1)
1065 | yidx = np.clip(np.digitize(y, yedges) - 1, 0, hist.shape[1] - 1)
1066 | return np.log(hist[xidx, yidx])
1067 |
1068 | fig, ax = plt.subplots(2, 2, figsize=(8, 8), tight_layout=True)
1069 | ax = ax.ravel()
1070 | ax[0].hist(c, bins=int(c.max() + 1))
1071 | ax[0].set_xlabel('classes')
1072 | ax[1].scatter(b[0], b[1], c=hist2d(b[0], b[1], 90), cmap='jet')
1073 | ax[1].set_xlabel('x')
1074 | ax[1].set_ylabel('y')
1075 | ax[2].scatter(b[2], b[3], c=hist2d(b[2], b[3], 90), cmap='jet')
1076 | ax[2].set_xlabel('width')
1077 | ax[2].set_ylabel('height')
1078 | plt.savefig('labels.png', dpi=200)
1079 |
1080 |
1081 | def plot_evolution_results(hyp): # from utils.utils import *; plot_evolution_results(hyp)
1082 | # Plot hyperparameter evolution results in evolve.txt
1083 | x = np.loadtxt('evolve.txt', ndmin=2)
1084 | f = fitness(x)
1085 | # weights = (f - f.min()) ** 2 # for weighted results
1086 | plt.figure(figsize=(12, 10), tight_layout=True)
1087 | matplotlib.rc('font', **{'size': 8})
1088 | for i, (k, v) in enumerate(hyp.items()):
1089 | y = x[:, i + 7]
1090 | # mu = (y * weights).sum() / weights.sum() # best weighted result
1091 | mu = y[f.argmax()] # best single result
1092 | plt.subplot(4, 5, i + 1)
1093 | plt.plot(mu, f.max(), 'o', markersize=10)
1094 | plt.plot(y, f, '.')
1095 | plt.title('%s = %.3g' % (k, mu), fontdict={'size': 9}) # limit to 40 characters
1096 | print('%15s: %.3g' % (k, mu))
1097 | plt.savefig('evolve.png', dpi=200)
1098 |
1099 |
1100 | def plot_results_overlay(start=0, stop=0): # from utils.utils import *; plot_results_overlay()
1101 | # Plot training 'results*.txt', overlaying train and val losses
1102 | s = ['train', 'train', 'train', 'Precision', 'mAP@0.5', 'val', 'val', 'val', 'Recall', 'mAP@0.5:0.95'] # legends
1103 | t = ['GIoU', 'Objectness', 'Classification', 'P-R', 'mAP-F1'] # titles
1104 | for f in sorted(glob.glob('results*.txt') + glob.glob('../../Downloads/results*.txt')):
1105 | results = np.loadtxt(f, usecols=[2, 3, 4, 8, 9, 12, 13, 14, 10, 11], ndmin=2).T
1106 | n = results.shape[1] # number of rows
1107 | x = range(start, min(stop, n) if stop else n)
1108 | fig, ax = plt.subplots(1, 5, figsize=(14, 3.5), tight_layout=True)
1109 | ax = ax.ravel()
1110 | for i in range(5):
1111 | for j in [i, i + 5]:
1112 | y = results[j, x]
1113 | ax[i].plot(x, y, marker='.', label=s[j])
1114 | # y_smooth = butter_lowpass_filtfilt(y)
1115 | # ax[i].plot(x, np.gradient(y_smooth), marker='.', label=s[j])
1116 |
1117 | ax[i].set_title(t[i])
1118 | ax[i].legend()
1119 | ax[i].set_ylabel(f) if i == 0 else None # add filename
1120 | fig.savefig(f.replace('.txt', '.png'), dpi=200)
1121 |
1122 |
1123 | def plot_results(start=0, stop=0, bucket='', id=(), labels=()): # from utils.utils import *; plot_results()
1124 | # Plot training 'results*.txt' as seen in https://github.com/ultralytics/yolov5#reproduce-our-training
1125 | fig, ax = plt.subplots(2, 5, figsize=(12, 6))
1126 | ax = ax.ravel()
1127 | s = ['GIoU', 'Objectness', 'Classification', 'Precision', 'Recall',
1128 | 'val GIoU', 'val Objectness', 'val Classification', 'mAP@0.5', 'mAP@0.5:0.95']
1129 | if bucket:
1130 | os.system('rm -rf storage.googleapis.com')
1131 | files = ['https://storage.googleapis.com/%s/results%g.txt' % (bucket, x) for x in id]
1132 | else:
1133 | files = glob.glob('results*.txt') + glob.glob('../../Downloads/results*.txt')
1134 | for fi, f in enumerate(files):
1135 | try:
1136 | results = np.loadtxt(f, usecols=[2, 3, 4, 8, 9, 12, 13, 14, 10, 11], ndmin=2).T
1137 | n = results.shape[1] # number of rows
1138 | x = range(start, min(stop, n) if stop else n)
1139 | for i in range(10):
1140 | y = results[i, x]
1141 | if i in [0, 1, 2, 5, 6, 7]:
1142 | y[y == 0] = np.nan # dont show zero loss values
1143 | # y /= y[0] # normalize
1144 | label = labels[fi] if len(labels) else Path(f).stem
1145 | ax[i].plot(x, y, marker='.', label=label, linewidth=2, markersize=8)
1146 | ax[i].set_title(s[i])
1147 | # if i in [5, 6, 7]: # share train and val loss y axes
1148 | # ax[i].get_shared_y_axes().join(ax[i], ax[i - 5])
1149 | except:
1150 | print('Warning: Plotting error for %s, skipping file' % f)
1151 |
1152 | fig.tight_layout()
1153 | ax[1].legend()
1154 | fig.savefig('results.png', dpi=200)
1155 |
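# Illustrative note (added comment): after training, calling plot_results() from the run
# directory gathers every results*.txt it finds and saves the combined training curves to
# results.png.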
--------------------------------------------------------------------------------