├── README.md ├── app.py ├── cam ├── 1.png ├── 2.png ├── __pycache__ │ ├── base_camera.cpython-37.pyc │ └── base_camera.cpython-38.pyc ├── base_camera.py ├── camera.py ├── coco.names ├── result.png ├── test.jpg ├── test_re.jpg └── train.jpg ├── center ├── get_train_val.py └── xml_yolo.py ├── config ├── score.yaml ├── yolov3-spp.yaml ├── yolov5l.yaml ├── yolov5m.yaml ├── yolov5s.yaml └── yolov5x.yaml ├── detect.py ├── inference ├── inputs │ └── 2007_000033.jpg └── outputs │ └── 2007_000033.jpg ├── models ├── __pycache__ │ ├── common.cpython-37.pyc │ ├── de.cpython-37.pyc │ ├── experimental.cpython-37.pyc │ └── yolo.cpython-37.pyc ├── common.py ├── de.py ├── experimental.py ├── onnx_export.py └── yolo.py ├── requirements.txt ├── static ├── client.js ├── style.css ├── style1.css └── worker.js ├── templates └── index1.html ├── test.py ├── train.py └── utils ├── __init__.py ├── __pycache__ ├── __init__.cpython-37.pyc ├── datasets.cpython-37.pyc ├── google_utils.cpython-37.pyc ├── torch_utils.cpython-37.pyc └── utils.cpython-37.pyc ├── activations.py ├── datasets.py ├── google_utils.py ├── torch_utils.py └── utils.py /README.md: -------------------------------------------------------------------------------- 1 | # Training YOLOv5 on your own dataset (detailed walkthrough) and deploying it with Flask 2 | 3 | #### Dependencies 4 | - torch 5 | - torchvision 6 | - numpy 7 | - opencv-python 8 | - lxml 9 | - tqdm 10 | - flask 11 | - pillow 12 | - tensorboard 13 | - matplotlib 14 | - pycocotools 15 | 16 | #### On Windows, use pycocotools-windows instead of pycocotools 17 | 18 | #### Install all dependencies 19 | ``` 20 | pip install -r requirements.txt 21 | ``` 22 | ### 1. Prepare the dataset 23 | 24 | This walkthrough uses the PASCAL VOC dataset as an example, [Baidu netdisk, extraction code: 07wp](https://pan.baidu.com/s/1u8k9wlLUklyLxQnaSrG4xQ) 25 | Put the downloaded dataset under the datasets directory. 26 | The dataset is structured as follows: 27 | ``` 28 | ---VOC2012 29 | --------Annotations 30 | ---------------xml0 31 | ---------------xml1 32 | --------JPEGImages 33 | ---------------img0 34 | ---------------img1 35 | --------pascal_voc_classes.txt 36 | ``` 37 | Annotations holds all the xml files, JPEGImages holds all the images, and pascal_voc_classes.txt is the class list. 38 | 39 | #### Generating the label files 40 | YOLO label files have the following format: 41 | ``` 42 | 102 0.682813 0.415278 0.237500 0.502778 43 | 102 0.914844 0.396528 0.168750 0.451389 44 | 45 | The first value is the label, i.e. the class of the object in the image. 46 | The next four values give the object's position (x_center, y_center, w, h): the relative coordinates of the object's center and its relative width and height. 47 | The example above contains two objects. 48 | ``` 49 | If you already have label files like these, skip straight to the next step. 50 | If you do not, you can annotate with [labelimg, extraction code dbi2](https://pan.baidu.com/s/1oEFodW83koHLcGasRoBZhA), which produces xml label files that are then converted to YOLO-format label files. labelimg is very easy to use, so it is not covered here. 51 | 52 | To convert xml label files to YOLO format: 53 | 54 | ``` 55 | python center/xml_yolo.py 56 | ``` 57 | 58 | pascal_voc_classes.txt is the file that lists your classes as a JSON-style array. The VOC classes look like this: 59 | ```python 60 | ["aeroplane","bicycle", "bird","boat","bottle","bus","car","cat","chair","cow","diningtable","dog","horse","motorbike","person","pottedplant","sheep","sofa","train", "tvmonitor"] 61 | ``` 62 | #### Directory structure after running the script above 63 | ``` 64 | ---VOC2012 65 | --------Annotations 66 | --------JPEGImages 67 | --------pascal_voc_classes.json 68 | ---yolodata 69 | --------images 70 | --------labels 71 | ``` 72 | 73 | ### 2. Split into training and validation sets 74 | The split is simple: shuffle the original data and divide it 9:1 into a training set and a validation set (a minimal sketch of this logic is shown after the directory listing below). Run: 75 | 76 | ``` 77 | python center/get_train_val.py 78 | ``` 79 | ##### Running the script above produces the following structure 80 | ``` 81 | ---VOC2012 82 | --------Annotations 83 | --------JPEGImages 84 | --------pascal_voc_classes.json 85 | ---yolodata 86 | --------images 87 | --------labels 88 | ---traindata 89 | --------images 90 | ----------------train 91 | ----------------val 92 | --------labels 93 | ----------------train 94 | ----------------val 95 | ```
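The split performed by center/get_train_val.py boils down to the logic below. This is only a minimal sketch, assuming the yolo_data layout from step 1 and .jpg images; the actual script does the same thing with numpy shuffling and a tqdm progress bar.

```python
import os
import random
import shutil

# Sketch of the 9:1 train/val split (same idea as center/get_train_val.py).
image_root = './datasets/yolo_data/images'
label_root = './datasets/yolo_data/labels'
save_root = './datasets/traindata'

labels = sorted(os.listdir(label_root))
random.seed(10101)                      # fixed seed so the split is reproducible
random.shuffle(labels)
num_val = int(len(labels) * 0.1)        # 10% validation, 90% training
splits = {'val': labels[:num_val], 'train': labels[num_val:]}

for split, files in splits.items():
    for sub in ('images', 'labels'):
        os.makedirs(os.path.join(save_root, sub, split), exist_ok=True)
    for label_file in files:
        stem = os.path.splitext(label_file)[0]
        shutil.copyfile(os.path.join(image_root, stem + '.jpg'),
                        os.path.join(save_root, 'images', split, stem + '.jpg'))
        shutil.copyfile(os.path.join(label_root, label_file),
                        os.path.join(save_root, 'labels', split, label_file))
```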
96 | ##### traindata is the final training data you need 97 | 98 | ### 3. Train the model 99 | 100 | Training YOLOv5 is straightforward. The code in this repository has been simplified; it is organized as follows: 101 | 102 | ``` 103 | dataset # datasets 104 | ------traindata # training data 105 | inference # input/output 106 | ------inputs # input data 107 | ------outputs # output data 108 | config # configuration files 109 | ------score.yaml # training configuration 110 | ------yolov5l.yaml # model configuration 111 | models # model code 112 | runs # logs 113 | utils # utility code 114 | weights # saved weights, last.pt and best.pt 115 | train.py # training script 116 | detect.py # detection/test script 117 | ``` 118 | 119 | score.yaml is explained below: 120 | ``` 121 | # train and val datasets (image directory) 122 | train: ./datasets/traindata/images/train/ 123 | val: ./datasets/traindata/images/val/ 124 | # number of classes 125 | nc: 2 126 | # class names 127 | names: ['苹果','香蕉'] 128 | ``` 129 | 130 | - train: path to the training images 131 | - val: path to the validation images 132 | - nc: number of classes 133 | - names: the corresponding class names 134 | 135 | 136 | ##### yolov5l.yaml is explained below: 137 | 138 | ``` 139 | nc: 2 # number of classes 140 | depth_multiple: 1.0 # model depth multiple 141 | width_multiple: 1.0 # layer channel multiple 142 | anchors: 143 | - [10,13, 16,30, 33,23] # P3/8 144 | - [30,61, 62,45, 59,119] # P4/16 145 | - [116,90, 156,198, 373,326] # P5/32 146 | backbone: 147 | # [from, number, module, args] 148 | [[-1, 1, Focus, [64, 3]], # 1-P1/2 149 | [-1, 1, Conv, [128, 3, 2]], # 2-P2/4 150 | [-1, 3, Bottleneck, [128]], 151 | [-1, 1, Conv, [256, 3, 2]], # 4-P3/8 152 | [-1, 9, BottleneckCSP, [256]], 153 | [-1, 1, Conv, [512, 3, 2]], # 6-P4/16 154 | [-1, 9, BottleneckCSP, [512]], 155 | [-1, 1, Conv, [1024, 3, 2]], # 8-P5/32 156 | [-1, 1, SPP, [1024, [5, 9, 13]]], 157 | [-1, 6, BottleneckCSP, [1024]], # 10 158 | ] 159 | head: 160 | [[-1, 3, BottleneckCSP, [1024, False]], # 11 161 | [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1, 0]], # 12 (P5/32-large) 162 | [-2, 1, nn.Upsample, [None, 2, 'nearest']], 163 | [[-1, 6], 1, Concat, [1]], # cat backbone P4 164 | [-1, 1, Conv, [512, 1, 1]], 165 | [-1, 3, BottleneckCSP, [512, False]], 166 | [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1, 0]], # 17 (P4/16-medium) 167 | [-2, 1, nn.Upsample, [None, 2, 'nearest']], 168 | [[-1, 4], 1, Concat, [1]], # cat backbone P3 169 | [-1, 1, Conv, [256, 1, 1]], 170 | [-1, 3, BottleneckCSP, [256, False]], 171 | [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1, 0]], # 22 (P3/8-small) 172 | [[], 1, Detect, [nc, anchors]], # Detect(P3, P4, P5) 173 | ] 174 | ``` 175 | - nc: number of target classes 176 | - depth_multiple and width_multiple: control model depth and width; different values correspond to the s, m, l and x models. 177 | - anchors: prior boxes obtained by k-means clustering of the ground-truth boxes; the target boxes are predicted relative to these priors. 178 | - YOLOv5 generates anchors automatically: it runs k-means with Euclidean distance and then applies a genetic algorithm to mutate them into the final anchors. In my experience, k-means with Euclidean distance gives slightly worse results than k-means with a 1 - IoU distance (message me if you want that source; a minimal sketch of the idea is also shown at the end of this section), although in practice the difference is negligible. 179 | - backbone: the network that extracts image features. 180 | - head: the network that produces the final predictions. 181 | 182 | 183 | ##### Configuring train.py is very simple: 184 | ![train.py arguments](cam/1.png) 185 | 186 | We only need to modify the following parameters: 187 | ``` 188 | epochs: number of training epochs 189 | batch_size: number of images per batch 190 | cfg: path to the model configuration file 191 | data: path to the training configuration file 192 | weights: weights to load, for resuming training from a checkpoint 193 | ``` 194 | Run in a terminal (yolov5l by default) 195 | ``` 196 | python train.py 197 | ``` 198 | to start training. 199 | 200 | ##### Training process 201 | 202 | ![](cam/train.jpg) 203 | 204 | ##### Training results 205 | 206 | ![](cam/result.png) 207 |
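The anchors note above mentions k-means with a 1 - IoU distance. Below is a minimal sketch of that idea, not the exact code referred to above; the label directory and the 640x640 training resolution are assumptions, and the printed (w, h) pairs would replace the anchors entries in the model yaml, three per detection scale.

```python
import glob
import numpy as np

def iou_wh(wh, centers):
    # IoU between every (w, h) box and every cluster center, with boxes aligned at a common corner.
    inter = np.minimum(wh[:, None, 0], centers[None, :, 0]) * np.minimum(wh[:, None, 1], centers[None, :, 1])
    union = wh[:, None, 0] * wh[:, None, 1] + centers[None, :, 0] * centers[None, :, 1] - inter
    return inter / union

def kmeans_iou(wh, k=9, iters=300, seed=0):
    # Lloyd-style k-means that uses 1 - IoU instead of Euclidean distance.
    rng = np.random.default_rng(seed)
    centers = wh[rng.choice(len(wh), size=k, replace=False)]
    for _ in range(iters):
        assign = np.argmax(iou_wh(wh, centers), axis=1)               # highest IoU = smallest 1 - IoU
        new_centers = np.array([wh[assign == i].mean(axis=0) if np.any(assign == i) else centers[i]
                                for i in range(k)])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers[np.argsort(centers.prod(axis=1))]                  # sort by area, small to large

if __name__ == '__main__':
    wh = []
    for txt in glob.glob('./datasets/traindata/labels/train/*.txt'):  # assumed label location
        with open(txt) as f:
            for line in f:
                _, _, _, w, h = map(float, line.split())              # class x_center y_center w h
                wh.append([w * 640, h * 640])                         # assume 640x640 training resolution
    print(kmeans_iou(np.array(wh)).round(1))
```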
208 | ### 4. Test the model 209 | 210 | ![](cam/2.png) 211 | 212 | ##### Three parameters need to be modified 213 | ``` 214 | source: path to the images/videos to run detection on 215 | out: path where the results are saved 216 | weights: path to the trained model weights 217 | ``` 218 | ##### You can also test with weights trained on the COCO dataset; put them in the weights folder 219 | 220 | [Baidu netdisk, extraction code: hhbb](https://pan.baidu.com/s/18AD8HpLhcRGSKOwGwPJMMg) 221 | 222 | Run in a terminal 223 | ``` 224 | python detect.py 225 | ``` 226 | to start detection. 227 | 228 | ##### Test results 229 | 230 | ![](cam/test.jpg) 231 | 232 | ![](cam/test_re.jpg) 233 | 234 | ### 5. Deploy with Flask 235 | 236 | Deployment with Flask is very simple. If anything is unclear, see my earlier blog posts: 237 | 238 | [Deploying a Python/Flask project on Alibaba Cloud ECS, simple and clear, without nginx or uwsgi](https://blog.csdn.net/qq_44523137/article/details/112676287?spm=1001.2014.3001.5501) 239 | 240 | [An object detection and multi-object tracking web platform based on yolov3-deepsort-flask](https://blog.csdn.net/qq_44523137/article/details/116323516?spm=1001.2014.3001.5501) 241 | 242 | 243 | 244 | Run in a terminal 245 | ``` 246 | python app.py 247 | ``` 248 | then open the page in your browser and upload an image to run detection. 249 | 250 | 251 | 252 | 253 | -------------------------------------------------------------------------------- /app.py: -------------------------------------------------------------------------------- 1 | import cv2 2 | import time 3 | from flask import Flask, request, Response,render_template 4 | import json 5 | from cam.base_camera import BaseCamera 6 | 7 | from models.de import detect,get_model 8 | import os 9 | os.environ["KMP_DUPLICATE_LIB_OK"]="TRUE" 10 | app = Flask(__name__) 11 | class_names = [c.strip() for c in open(r'cam/coco.names').readlines()] 12 | file_name = ['jpg','jpeg','png'] 13 | 14 | yolov5_model = get_model() 15 | 16 | @app.route('/images', methods= ['POST']) 17 | def get_image(): 18 | image = request.files["images"] 19 | image_name = image.filename 20 | image.save(os.path.join(os.getcwd(), image_name)) 21 | if image_name.split(".")[-1] in file_name: 22 | img = cv2.imread(image_name) 23 | img = detect(yolov5_model,img) 24 | _, img_encoded = cv2.imencode('.jpg', img) 25 | response = img_encoded.tobytes() 26 | os.remove(image_name) 27 | try: 28 | return Response(response=response, status=200, mimetype='image/jpg') 29 | except: 30 | return render_template('index1.html') 31 | @app.route('/') 32 | def upload_file(): 33 | return render_template('index1.html') 34 | if __name__ == '__main__': 35 | # Run locally 36 | app.run(debug=True, host='127.0.0.1', port=5000) 37 | #Run on the server 38 | # app.run(debug=True, host = '0.0.0.0', port=5000) 39 | -------------------------------------------------------------------------------- /cam/1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Transformer-man/yolov5-flask/36573a0b6e91d5a91f3394af278f5a5e768efae7/cam/1.png -------------------------------------------------------------------------------- /cam/2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Transformer-man/yolov5-flask/36573a0b6e91d5a91f3394af278f5a5e768efae7/cam/2.png -------------------------------------------------------------------------------- /cam/__pycache__/base_camera.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Transformer-man/yolov5-flask/36573a0b6e91d5a91f3394af278f5a5e768efae7/cam/__pycache__/base_camera.cpython-37.pyc -------------------------------------------------------------------------------- /cam/__pycache__/base_camera.cpython-38.pyc: --------------------------------------------------------------------------------
https://raw.githubusercontent.com/Transformer-man/yolov5-flask/36573a0b6e91d5a91f3394af278f5a5e768efae7/cam/__pycache__/base_camera.cpython-38.pyc -------------------------------------------------------------------------------- /cam/base_camera.py: -------------------------------------------------------------------------------- 1 | import time 2 | import threading 3 | try: 4 | from greenlet import getcurrent as get_ident 5 | except ImportError: 6 | try: 7 | from thread import get_ident 8 | except ImportError: 9 | from _thread import get_ident 10 | 11 | 12 | class CameraEvent(object): 13 | """An Event-like class that signals all active clients when a new frame is 14 | available. 15 | """ 16 | def __init__(self): 17 | self.events = {} 18 | 19 | def wait(self): 20 | """Invoked from each client's thread to wait for the next frame.""" 21 | ident = get_ident() 22 | if ident not in self.events: 23 | # this is a new client 24 | # add an entry for it in the self.events dict 25 | # each entry has two elements, a threading.Event() and a timestamp 26 | self.events[ident] = [threading.Event(), time.time()] 27 | return self.events[ident][0].wait() 28 | 29 | def set(self): 30 | """Invoked by the camera thread when a new frame is available.""" 31 | now = time.time() 32 | remove = None 33 | for ident, event in self.events.items(): 34 | if not event[0].isSet(): 35 | # if this client's event is not set, then set it 36 | # also update the last set timestamp to now 37 | event[0].set() 38 | event[1] = now 39 | else: 40 | # if the client's event is already set, it means the client 41 | # did not process a previous frame 42 | # if the event stays set for more than 5 seconds, then assume 43 | # the client is gone and remove it 44 | if now - event[1] > 5: 45 | remove = ident 46 | if remove: 47 | del self.events[remove] 48 | 49 | def clear(self): 50 | """Invoked from each client's thread after a frame was processed.""" 51 | self.events[get_ident()][0].clear() 52 | 53 | 54 | class BaseCamera(object): 55 | thread = None # background thread that reads frames from camera 56 | frame = None # current frame is stored here by background thread 57 | last_access = 0 # time of last client access to the camera 58 | event = CameraEvent() 59 | 60 | def __init__(self): 61 | """Start the background camera thread if it isn't running yet.""" 62 | if BaseCamera.thread is None: 63 | BaseCamera.last_access = time.time() 64 | 65 | # start background frame thread 66 | BaseCamera.thread = threading.Thread(target=self._thread) 67 | BaseCamera.thread.start() 68 | 69 | # wait until frames are available 70 | while self.get_frame() is None: 71 | time.sleep(0) 72 | 73 | def get_frame(self): 74 | """Return the current camera frame.""" 75 | BaseCamera.last_access = time.time() 76 | 77 | # wait for a signal from the camera thread 78 | BaseCamera.event.wait() 79 | BaseCamera.event.clear() 80 | 81 | return BaseCamera.frame 82 | 83 | @staticmethod 84 | def frames(path): 85 | """"Generator that returns frames from the camera.""" 86 | raise RuntimeError('Must be implemented by subclasses.') 87 | 88 | @classmethod 89 | def _thread(cls): 90 | """Camera background thread.""" 91 | print('Starting camera thread.') 92 | frames_iterator = cls.frames() 93 | for frame in frames_iterator: 94 | BaseCamera.frame = frame 95 | BaseCamera.event.set() # send signal to clients 96 | time.sleep(0) 97 | 98 | # if there hasn't been any clients asking for frames in 99 | # the last 10 seconds then stop the thread 100 | if time.time() - BaseCamera.last_access > 60: 101 | 
frames_iterator.close() 102 | print('Stopping camera thread due to inactivity.') 103 | break 104 | BaseCamera.thread = None -------------------------------------------------------------------------------- /cam/camera.py: -------------------------------------------------------------------------------- 1 | 2 | from cam.base_camera import BaseCamera 3 | import cv2 4 | import tensorflow as tf 5 | from yolov3_tf2.models import YoloV3 6 | from yolov3_tf2.dataset import transform_images 7 | from yolov3_tf2.utils import draw_outputs 8 | 9 | # customize your API through the following parameters 10 | classes_path = 'coco.names' 11 | weights_path = './weights/yolov3.tf' 12 | tiny = False # set to True if using a Yolov3 Tiny model 13 | size = 416 # size images are resized to for model 14 | output_path = './detections/' # path to output folder where images with detections are saved 15 | num_classes = 80 # number of classes in model 16 | 17 | # load in weights and classes 18 | physical_devices = tf.config.experimental.list_physical_devices('GPU') 19 | if len(physical_devices) > 0: 20 | tf.config.experimental.set_memory_growth(physical_devices[0], True) 21 | 22 | 23 | yolo = YoloV3(classes=num_classes) 24 | 25 | yolo.load_weights(weights_path).expect_partial() 26 | print('weights loaded') 27 | 28 | class_names = [c.strip() for c in open(classes_path).readlines()] 29 | print('classes loaded') 30 | 31 | 32 | class Camera(BaseCamera): 33 | 34 | @staticmethod 35 | def frames(): 36 | cam = cv2.VideoCapture(r'./finish.mp4') 37 | if not cam.isOpened(): 38 | raise RuntimeError('Could not start camera.') 39 | 40 | while True: 41 | # read current frame 42 | _, img = cam.read() 43 | try: 44 | if CameraParams.gray: 45 | img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY) 46 | if CameraParams.gaussian: 47 | img_raw = tf.convert_to_tensor(img) 48 | img_raw = tf.expand_dims(img_raw, 0) 49 | # img detect 50 | img_raw = transform_images(img_raw, size) 51 | boxes, scores, classes, nums = yolo(img_raw) 52 | img = cv2.cvtColor(img, cv2.COLOR_RGB2BGR) 53 | img = draw_outputs(img, (boxes, scores, classes, nums), class_names) 54 | if CameraParams.sobel: 55 | if(len(img.shape) == 3): 56 | img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY) 57 | img = cv2.Sobel(img,cv2.CV_64F,1,0,ksize=5) # x 58 | img = cv2.Sobel(img,cv2.CV_64F,0,1,ksize=5) # y 59 | if CameraParams.canny: 60 | img = cv2.Canny(img, 100, 200, 3, L2gradient=True) 61 | except Exception as e: 62 | print(e) 63 | # encode as a jpeg image and return it 64 | yield cv2.imencode('.jpg', img)[1].tobytes() 65 | 66 | class CameraParams(): 67 | 68 | gray = False 69 | gaussian = False 70 | sobel = False 71 | canny = False 72 | def __init__(self, gray, gaussian, sobel, canny, yolo): 73 | self.gray = gray 74 | self.gaussian = gaussian 75 | self.sobel = sobel 76 | self.canny = canny 77 | self.yolo 78 | -------------------------------------------------------------------------------- /cam/coco.names: -------------------------------------------------------------------------------- 1 | person 2 | bicycle 3 | car 4 | motorbike 5 | aeroplane 6 | bus 7 | train 8 | truck 9 | boat 10 | traffic light 11 | fire hydrant 12 | stop sign 13 | parking meter 14 | bench 15 | bird 16 | cat 17 | dog 18 | horse 19 | sheep 20 | cow 21 | elephant 22 | bear 23 | zebra 24 | giraffe 25 | backpack 26 | umbrella 27 | handbag 28 | tie 29 | suitcase 30 | frisbee 31 | skis 32 | snowboard 33 | sports ball 34 | kite 35 | baseball bat 36 | baseball glove 37 | skateboard 38 | surfboard 39 | tennis racket 40 | bottle 41 | wine 
glass 42 | cup 43 | fork 44 | knife 45 | spoon 46 | bowl 47 | banana 48 | apple 49 | sandwich 50 | orange 51 | broccoli 52 | carrot 53 | hot dog 54 | pizza 55 | donut 56 | cake 57 | chair 58 | sofa 59 | pottedplant 60 | bed 61 | diningtable 62 | toilet 63 | tvmonitor 64 | laptop 65 | mouse 66 | remote 67 | keyboard 68 | cell phone 69 | microwave 70 | oven 71 | toaster 72 | sink 73 | refrigerator 74 | book 75 | clock 76 | vase 77 | scissors 78 | teddy bear 79 | hair drier 80 | toothbrush 81 | -------------------------------------------------------------------------------- /cam/result.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Transformer-man/yolov5-flask/36573a0b6e91d5a91f3394af278f5a5e768efae7/cam/result.png -------------------------------------------------------------------------------- /cam/test.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Transformer-man/yolov5-flask/36573a0b6e91d5a91f3394af278f5a5e768efae7/cam/test.jpg -------------------------------------------------------------------------------- /cam/test_re.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Transformer-man/yolov5-flask/36573a0b6e91d5a91f3394af278f5a5e768efae7/cam/test_re.jpg -------------------------------------------------------------------------------- /cam/train.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Transformer-man/yolov5-flask/36573a0b6e91d5a91f3394af278f5a5e768efae7/cam/train.jpg -------------------------------------------------------------------------------- /center/get_train_val.py: -------------------------------------------------------------------------------- 1 | import os,shutil 2 | import numpy as np 3 | import cv2 4 | from tqdm import tqdm 5 | #上一步保存的所有image和label文件路径 6 | image_root = r'../datasets/yolo_data/images' 7 | label_root = r'../datasets/yolo_data/labels' 8 | names = [] 9 | for root,dir,files in os.walk(label_root ): 10 | for file in files: 11 | names.append(file) 12 | val_split = 0.1 13 | np.random.seed(10101) 14 | np.random.shuffle(names) 15 | num_val = int(len(names)*val_split) 16 | num_train = len(names) - num_val 17 | trains = names[:num_train] 18 | vals = names[num_train:] 19 | #保存路径 20 | save_path_img = r'../datasets//traindata' 21 | if not os.path.exists(save_path_img): 22 | os.mkdir(save_path_img) 23 | def get_train_val_data(img_root,txt_root,save_path_img,files,typ): 24 | def get_path(root_path,path1): 25 | path = os.path.join(root_path,path1) 26 | if not os.path.exists(path): 27 | os.mkdir(path) 28 | return path 29 | for val in tqdm(files): 30 | txt_path = os.path.join(txt_root,val) 31 | img_path = os.path.join(img_root,val.split('.')[0]+'.jpg') 32 | img_path1 = get_path(save_path_img,'images') 33 | txt_path1 = get_path(save_path_img,'labels') 34 | rt_img = get_path(img_path1,typ) 35 | rt_txt = get_path(txt_path1,typ) 36 | txt_path1 = os.path.join(rt_txt,val) 37 | img_path1 = os.path.join(rt_img,val.split('.')[0]+'.jpg') 38 | shutil.copyfile(img_path, img_path1) 39 | shutil.copyfile(txt_path,txt_path1) 40 | get_train_val_data(image_root,label_root,save_path_img,vals,'val') 41 | get_train_val_data(image_root,label_root,save_path_img,trains,'train') 42 | 43 | 44 | 45 | 46 | 47 | -------------------------------------------------------------------------------- /center/xml_yolo.py: 
-------------------------------------------------------------------------------- 1 | import os 2 | from tqdm import tqdm 3 | from lxml import etree 4 | import json 5 | import shutil 6 | # 原始xml路径和image路径 7 | xml_root_path = r'../datasets/VOC2012/Annotations' 8 | img_root_path = r'../datasets/VOC2012/JPEGImages' 9 | # 保存的图片和yolo格式label路径。要新建文件夹 10 | def get_path(path): 11 | if not os.path.exists(path): 12 | os.mkdir(path) 13 | return path 14 | get_path(r'../datasets/yolo_data') 15 | save_label_path = get_path(r'../datasets/yolo_data/labels') 16 | save_images_path = get_path(r'../datasets/yolo_data/images') 17 | def parse_xml_to_dict(xml): 18 | if len(xml) == 0: # 遍历到底层,直接返回tag对应的信息 19 | return {xml.tag: xml.text} 20 | result = {} 21 | for child in xml: 22 | child_result = parse_xml_to_dict(child) # 递归遍历标签信息 23 | if child.tag != 'object': 24 | result[child.tag] = child_result[child.tag] 25 | else: 26 | if child.tag not in result: # 因为object可能有多个,所以需要放入列表里 27 | result[child.tag] = [] 28 | result[child.tag].append(child_result[child.tag]) 29 | return {xml.tag: result} 30 | def translate_info(file_names, img_root_path, class_list): 31 | for root,dirs,files in os.walk(file_names): 32 | for file in tqdm(files): 33 | # 检查xml文件是否存在 34 | xml_path = os.path.join(root, file) 35 | # read xml 36 | with open(xml_path) as fid: 37 | xml_str = fid.read() 38 | xml = etree.fromstring(xml_str) 39 | data = parse_xml_to_dict(xml)["annotation"] 40 | img_height = int(data["size"]["height"]) 41 | img_width = int(data["size"]["width"]) 42 | img_path = data["filename"] 43 | 44 | # write object info into txt 45 | assert "object" in data.keys(), "file: '{}' lack of object key.".format(xml_path) 46 | if len(data["object"]) == 0: 47 | # 如果xml文件中没有目标就直接忽略该样本 48 | print("Warning: in '{}' xml, there are no objects.".format(xml_path)) 49 | continue 50 | with open(os.path.join(save_label_path, file.split(".")[0] + ".txt"), "w") as f: 51 | for index, obj in enumerate(data["object"]): 52 | # 获取每个object的box信息 53 | xmin = float(obj["bndbox"]["xmin"]) 54 | xmax = float(obj["bndbox"]["xmax"]) 55 | ymin = float(obj["bndbox"]["ymin"]) 56 | ymax = float(obj["bndbox"]["ymax"]) 57 | class_name = obj["name"] 58 | class_index = class_list.index(class_name) 59 | # 进一步检查数据,有的标注信息中可能有w或h为0的情况,这样的数据会导致计算回归loss为nan 60 | if xmax <= xmin or ymax <= ymin: 61 | print("Warning: in '{}' xml, there are some bbox w/h <=0".format(xml_path)) 62 | continue 63 | # 将box信息转换到yolo格式 64 | xcenter = xmin + (xmax - xmin) / 2 65 | ycenter = ymin + (ymax - ymin) / 2 66 | w = xmax - xmin 67 | h = ymax - ymin 68 | # 绝对坐标转相对坐标,保存6位小数 69 | xcenter = round(xcenter / img_width, 6) 70 | ycenter = round(ycenter / img_height, 6) 71 | w = round(w / img_width, 6) 72 | h = round(h / img_height, 6) 73 | info = [str(i) for i in [class_index, xcenter, ycenter, w, h]] 74 | if index == 0: 75 | f.write(" ".join(info)) 76 | else: 77 | f.write("\n" + " ".join(info)) 78 | # copy image into save_images_path 79 | path_copy_to = os.path.join(save_images_path,file.split(".")[0] + ".jpg") 80 | shutil.copyfile(os.path.join(img_root_path, img_path), path_copy_to) 81 | 82 | label_json_path = r'../datasets/VOC2012/pascal_voc_classes.txt' 83 | with open(label_json_path, 'r') as f: 84 | label_file = f.readlines() 85 | class_list = label_file[0].split(',') 86 | translate_info(xml_root_path, img_root_path, class_list) -------------------------------------------------------------------------------- /config/score.yaml: -------------------------------------------------------------------------------- 
1 | # train and val datasets (image directory or *.txt file with image paths) 2 | train: ./datasets/traindata/images/train/ 3 | val: ./datasets/traindata/images/val/ 4 | # number of classes 5 | nc: 20 6 | # class names 7 | names: ["aeroplane","bicycle","bird","boat","bottle","bus","car","cat","chair","cow","diningtable","dog","horse","motorbike","person","pottedplant","sheep","sofa","train","tvmonitor"] -------------------------------------------------------------------------------- /config/yolov3-spp.yaml: -------------------------------------------------------------------------------- 1 | # parameters 2 | nc: 80 # number of classes 3 | depth_multiple: 1.0 # expand model depth 4 | width_multiple: 1.0 # expand layer channels 5 | 6 | # anchors 7 | anchors: 8 | - [10,13, 16,30, 33,23] # P3/8 9 | - [30,61, 62,45, 59,119] # P4/16 10 | - [116,90, 156,198, 373,326] # P5/32 11 | 12 | # darknet53 backbone 13 | backbone: 14 | # [from, number, module, args] 15 | [[-1, 1, Conv, [32, 3, 1]], # 0 16 | [-1, 1, Conv, [64, 3, 2]], # 1-P1/2 17 | [-1, 1, Bottleneck, [64]], 18 | [-1, 1, Conv, [128, 3, 2]], # 3-P2/4 19 | [-1, 2, Bottleneck, [128]], 20 | [-1, 1, Conv, [256, 3, 2]], # 5-P3/8 21 | [-1, 8, Bottleneck, [256]], 22 | [-1, 1, Conv, [512, 3, 2]], # 7-P4/16 23 | [-1, 8, Bottleneck, [512]], 24 | [-1, 1, Conv, [1024, 3, 2]], # 9-P5/32 25 | [-1, 4, Bottleneck, [1024]], # 10 26 | ] 27 | 28 | # yolov3-spp head 29 | # na = len(anchors[0]) 30 | head: 31 | [[-1, 1, Bottleneck, [1024, False]], # 11 32 | [-1, 1, SPP, [512, [5, 9, 13]]], 33 | [-1, 1, Conv, [1024, 3, 1]], 34 | [-1, 1, Conv, [512, 1, 1]], 35 | [-1, 1, Conv, [1024, 3, 1]], 36 | [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1]], # 16 (P5/32-large) 37 | 38 | [-3, 1, Conv, [256, 1, 1]], 39 | [-1, 1, nn.Upsample, [None, 2, 'nearest']], 40 | [[-1, 8], 1, Concat, [1]], # cat backbone P4 41 | [-1, 1, Bottleneck, [512, False]], 42 | [-1, 1, Bottleneck, [512, False]], 43 | [-1, 1, Conv, [256, 1, 1]], 44 | [-1, 1, Conv, [512, 3, 1]], 45 | [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1]], # 24 (P4/16-medium) 46 | 47 | [-3, 1, Conv, [128, 1, 1]], 48 | [-1, 1, nn.Upsample, [None, 2, 'nearest']], 49 | [[-1, 6], 1, Concat, [1]], # cat backbone P3 50 | [-1, 1, Bottleneck, [256, False]], 51 | [-1, 2, Bottleneck, [256, False]], 52 | [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1]], # 30 (P3/8-small) 53 | 54 | [[], 1, Detect, [nc, anchors]], # Detect(P3, P4, P5) 55 | ] 56 | -------------------------------------------------------------------------------- /config/yolov5l.yaml: -------------------------------------------------------------------------------- 1 | # parameters 2 | nc: 20 # number of classes 3 | depth_multiple: 1.0 # model depth multiple 4 | width_multiple: 1.0 # layer channel multiple 5 | 6 | # anchors 7 | anchors: 8 | - [10,13, 16,30, 33,23] # P3/8 9 | - [30,61, 62,45, 59,119] # P4/16 10 | - [116,90, 156,198, 373,326] # P5/32 11 | 12 | # yolov5 backbone 13 | backbone: 14 | # [from, number, module, args] 15 | [[-1, 1, Focus, [64, 3]], # 1-P1/2 16 | [-1, 1, Conv, [128, 3, 2]], # 2-P2/4 17 | [-1, 3, Bottleneck, [128]], 18 | [-1, 1, Conv, [256, 3, 2]], # 4-P3/8 19 | [-1, 9, BottleneckCSP, [256]], 20 | [-1, 1, Conv, [512, 3, 2]], # 6-P4/16 21 | [-1, 9, BottleneckCSP, [512]], 22 | [-1, 1, Conv, [1024, 3, 2]], # 8-P5/32 23 | [-1, 1, SPP, [1024, [5, 9, 13]]], 24 | [-1, 6, BottleneckCSP, [1024]], # 10 25 | ] 26 | 27 | # yolov5 head 28 | head: 29 | [[-1, 3, BottleneckCSP, [1024, False]], # 11 30 | [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1, 0]], # 12 (P5/32-large) 31 | 32 | [-2, 1, 
nn.Upsample, [None, 2, 'nearest']], 33 | [[-1, 6], 1, Concat, [1]], # cat backbone P4 34 | [-1, 1, Conv, [512, 1, 1]], 35 | [-1, 3, BottleneckCSP, [512, False]], 36 | [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1, 0]], # 17 (P4/16-medium) 37 | 38 | [-2, 1, nn.Upsample, [None, 2, 'nearest']], 39 | [[-1, 4], 1, Concat, [1]], # cat backbone P3 40 | [-1, 1, Conv, [256, 1, 1]], 41 | [-1, 3, BottleneckCSP, [256, False]], 42 | [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1, 0]], # 22 (P3/8-small) 43 | 44 | [[], 1, Detect, [nc, anchors]], # Detect(P3, P4, P5) 45 | ] 46 | -------------------------------------------------------------------------------- /config/yolov5m.yaml: -------------------------------------------------------------------------------- 1 | # parameters 2 | nc: 20 # number of classes 3 | depth_multiple: 0.67 # model depth multiple 4 | width_multiple: 0.75 # layer channel multiple 5 | 6 | # anchors 7 | anchors: 8 | - [10,13, 16,30, 33,23] # P3/8 9 | - [30,61, 62,45, 59,119] # P4/16 10 | - [116,90, 156,198, 373,326] # P5/32 11 | 12 | # yolov5 backbone 13 | backbone: 14 | # [from, number, module, args] 15 | [[-1, 1, Focus, [64, 3]], # 1-P1/2 16 | [-1, 1, Conv, [128, 3, 2]], # 2-P2/4 17 | [-1, 3, Bottleneck, [128]], 18 | [-1, 1, Conv, [256, 3, 2]], # 4-P3/8 19 | [-1, 9, BottleneckCSP, [256]], 20 | [-1, 1, Conv, [512, 3, 2]], # 6-P4/16 21 | [-1, 9, BottleneckCSP, [512]], 22 | [-1, 1, Conv, [1024, 3, 2]], # 8-P5/32 23 | [-1, 1, SPP, [1024, [5, 9, 13]]], 24 | [-1, 6, BottleneckCSP, [1024]], # 10 25 | ] 26 | 27 | # yolov5 head 28 | head: 29 | [[-1, 3, BottleneckCSP, [1024, False]], # 11 30 | [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1, 0]], # 12 (P5/32-large) 31 | 32 | [-2, 1, nn.Upsample, [None, 2, 'nearest']], 33 | [[-1, 6], 1, Concat, [1]], # cat backbone P4 34 | [-1, 1, Conv, [512, 1, 1]], 35 | [-1, 3, BottleneckCSP, [512, False]], 36 | [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1, 0]], # 17 (P4/16-medium) 37 | 38 | [-2, 1, nn.Upsample, [None, 2, 'nearest']], 39 | [[-1, 4], 1, Concat, [1]], # cat backbone P3 40 | [-1, 1, Conv, [256, 1, 1]], 41 | [-1, 3, BottleneckCSP, [256, False]], 42 | [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1, 0]], # 22 (P3/8-small) 43 | 44 | [[], 1, Detect, [nc, anchors]], # Detect(P3, P4, P5) 45 | ] 46 | -------------------------------------------------------------------------------- /config/yolov5s.yaml: -------------------------------------------------------------------------------- 1 | # parameters 2 | nc: 20 # number of classes 3 | depth_multiple: 0.33 # model depth multiple 4 | width_multiple: 0.50 # layer channel multiple 5 | 6 | # anchors 7 | anchors: 8 | - [10,13, 16,30, 33,23] # P3/8 9 | - [30,61, 62,45, 59,119] # P4/16 10 | - [116,90, 156,198, 373,326] # P5/32 11 | 12 | # yolov5 backbone 13 | backbone: 14 | # [from, number, module, args] 15 | [[-1, 1, Focus, [64, 3]], # 1-P1/2 16 | [-1, 1, Conv, [128, 3, 2]], # 2-P2/4 17 | [-1, 3, Bottleneck, [128]], 18 | [-1, 1, Conv, [256, 3, 2]], # 4-P3/8 19 | [-1, 9, BottleneckCSP, [256]], 20 | [-1, 1, Conv, [512, 3, 2]], # 6-P4/16 21 | [-1, 9, BottleneckCSP, [512]], 22 | [-1, 1, Conv, [1024, 3, 2]], # 8-P5/32 23 | [-1, 1, SPP, [1024, [5, 9, 13]]], 24 | [-1, 6, BottleneckCSP, [1024]], # 10 25 | ] 26 | 27 | # yolov5 head 28 | head: 29 | [[-1, 3, BottleneckCSP, [1024, False]], # 11 30 | [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1, 0]], # 12 (P5/32-large) 31 | 32 | [-2, 1, nn.Upsample, [None, 2, 'nearest']], 33 | [[-1, 6], 1, Concat, [1]], # cat backbone P4 34 | [-1, 1, Conv, [512, 1, 1]], 35 | [-1, 3, BottleneckCSP, [512, False]], 36 | [-1, 1, 
nn.Conv2d, [na * (nc + 5), 1, 1, 0]], # 17 (P4/16-medium) 37 | 38 | [-2, 1, nn.Upsample, [None, 2, 'nearest']], 39 | [[-1, 4], 1, Concat, [1]], # cat backbone P3 40 | [-1, 1, Conv, [256, 1, 1]], 41 | [-1, 3, BottleneckCSP, [256, False]], 42 | [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1, 0]], # 22 (P3/8-small) 43 | 44 | [[], 1, Detect, [nc, anchors]], # Detect(P3, P4, P5) 45 | ] 46 | -------------------------------------------------------------------------------- /config/yolov5x.yaml: -------------------------------------------------------------------------------- 1 | # parameters 2 | nc: 80 # number of classes 3 | depth_multiple: 1.33 # model depth multiple 4 | width_multiple: 1.25 # layer channel multiple 5 | 6 | # anchors 7 | anchors: 8 | - [10,13, 16,30, 33,23] # P3/8 9 | - [30,61, 62,45, 59,119] # P4/16 10 | - [116,90, 156,198, 373,326] # P5/32 11 | 12 | # yolov5 backbone 13 | backbone: 14 | # [from, number, module, args] 15 | [[-1, 1, Focus, [64, 3]], # 1-P1/2 16 | [-1, 1, Conv, [128, 3, 2]], # 2-P2/4 17 | [-1, 3, Bottleneck, [128]], 18 | [-1, 1, Conv, [256, 3, 2]], # 4-P3/8 19 | [-1, 9, BottleneckCSP, [256]], 20 | [-1, 1, Conv, [512, 3, 2]], # 6-P4/16 21 | [-1, 9, BottleneckCSP, [512]], 22 | [-1, 1, Conv, [1024, 3, 2]], # 8-P5/32 23 | [-1, 1, SPP, [1024, [5, 9, 13]]], 24 | [-1, 6, BottleneckCSP, [1024]], # 10 25 | ] 26 | 27 | # yolov5 head 28 | head: 29 | [[-1, 3, BottleneckCSP, [1024, False]], # 11 30 | [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1, 0]], # 12 (P5/32-large) 31 | 32 | [-2, 1, nn.Upsample, [None, 2, 'nearest']], 33 | [[-1, 6], 1, Concat, [1]], # cat backbone P4 34 | [-1, 1, Conv, [512, 1, 1]], 35 | [-1, 3, BottleneckCSP, [512, False]], 36 | [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1, 0]], # 17 (P4/16-medium) 37 | 38 | [-2, 1, nn.Upsample, [None, 2, 'nearest']], 39 | [[-1, 4], 1, Concat, [1]], # cat backbone P3 40 | [-1, 1, Conv, [256, 1, 1]], 41 | [-1, 3, BottleneckCSP, [256, False]], 42 | [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1, 0]], # 22 (P3/8-small) 43 | 44 | [[], 1, Detect, [nc, anchors]], # Detect(P3, P4, P5) 45 | ] 46 | -------------------------------------------------------------------------------- /detect.py: -------------------------------------------------------------------------------- 1 | from utils.datasets import * 2 | from utils.utils import * 3 | 4 | def detect(source, out, weights): 5 | source, out, weights, imgsz = source, out, weights, 640 6 | # Initialize 7 | device = torch_utils.select_device('cpu') 8 | if os.path.exists(out): 9 | shutil.rmtree(out) # delete output folder 10 | os.makedirs(out) # make new output folder 11 | # Load model 12 | google_utils.attempt_download(weights) 13 | model = torch.load(weights, map_location=device)['model'] 14 | model.to(device).eval() 15 | vid_path, vid_writer = None, None 16 | dataset = LoadImages(source, img_size=imgsz) 17 | # Get names and colors 18 | names = model.names if hasattr(model, 'names') else model.modules.names 19 | colors = [[random.randint(0, 255) for _ in range(3)] for _ in range(len(names))] 20 | # Run inference 21 | t0 = time.time() 22 | for path, img, im0s, vid_cap in dataset: 23 | t1 = time.time() 24 | img = torch.from_numpy(img).to(device) 25 | img = img.float() # uint8 to fp16/32 26 | img /= 255.0 # 0 - 255 to 0.0 - 1.0 27 | if img.ndimension() == 3: 28 | img = img.unsqueeze(0) 29 | # Inference 30 | pred = model(img, augment=False)[0] 31 | pred = non_max_suppression(pred, 0.4, 0.5, 32 | fast=True, classes=None, agnostic=False) 33 | # Process detections 34 | for i, det in enumerate(pred): # detections per 
image 35 | p, s, im0 = path, '', im0s 36 | save_path = str(Path(out) / Path(p).name) 37 | s += '%gx%g ' % img.shape[2:] # print string 38 | if det is not None and len(det): 39 | # Rescale boxes from img_size to im0 size 40 | det[:, :4] = scale_coords(img.shape[2:], det[:, :4], im0.shape).round() 41 | # Print results 42 | for c in det[:, -1].unique(): 43 | n = (det[:, -1] == c).sum() # detections per class 44 | s += '%g %ss, ' % (n, names[int(c)]) # add to string 45 | for *xyxy, conf, cls in det: 46 | # Add bbox to image 47 | label = '%s%.2f' % (names[int(cls)], conf) 48 | im0 = plot_one_box(xyxy, im0, label=label, color=colors[int(cls)], line_thickness=1) 49 | # xmin,ymin, xmax,ymax = int(xyxy[0]), int(xyxy[1]),int(xyxy[2]), int(xyxy[3]) 50 | # xcenter = xmin + (xmax - xmin) / 2 51 | # ycenter = ymin + (ymax - ymin) / 2 52 | # w = xmax - xmin 53 | # h = ymax - ymin 54 | # Save results (image with detections) 55 | print('%sDone. (%.3fs)' % (s, time.time() - t1)) 56 | if dataset.mode == 'images': 57 | cv2.imwrite(save_path, im0) 58 | else: 59 | if vid_path != save_path: # new video 60 | vid_path = save_path 61 | if isinstance(vid_writer, cv2.VideoWriter): 62 | vid_writer.release() # release previous video writer 63 | fps = vid_cap.get(cv2.CAP_PROP_FPS) 64 | w = int(vid_cap.get(cv2.CAP_PROP_FRAME_WIDTH)) 65 | h = int(vid_cap.get(cv2.CAP_PROP_FRAME_HEIGHT)) 66 | vid_writer = cv2.VideoWriter(save_path, cv2.VideoWriter_fourcc(*opt.fourcc), fps, (w, h)) 67 | vid_writer.write(im0) 68 | 69 | print('Done. (%.3fs)' % (time.time() - t0)) 70 | 71 | 72 | source = './inference/inputs' 73 | out = './inference/outputs' 74 | weights = './weights/yolov5l.pt' 75 | 76 | with torch.no_grad(): 77 | detect(source, out, weights) 78 | -------------------------------------------------------------------------------- /inference/inputs/2007_000033.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Transformer-man/yolov5-flask/36573a0b6e91d5a91f3394af278f5a5e768efae7/inference/inputs/2007_000033.jpg -------------------------------------------------------------------------------- /inference/outputs/2007_000033.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Transformer-man/yolov5-flask/36573a0b6e91d5a91f3394af278f5a5e768efae7/inference/outputs/2007_000033.jpg -------------------------------------------------------------------------------- /models/__pycache__/common.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Transformer-man/yolov5-flask/36573a0b6e91d5a91f3394af278f5a5e768efae7/models/__pycache__/common.cpython-37.pyc -------------------------------------------------------------------------------- /models/__pycache__/de.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Transformer-man/yolov5-flask/36573a0b6e91d5a91f3394af278f5a5e768efae7/models/__pycache__/de.cpython-37.pyc -------------------------------------------------------------------------------- /models/__pycache__/experimental.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Transformer-man/yolov5-flask/36573a0b6e91d5a91f3394af278f5a5e768efae7/models/__pycache__/experimental.cpython-37.pyc -------------------------------------------------------------------------------- 
/models/__pycache__/yolo.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Transformer-man/yolov5-flask/36573a0b6e91d5a91f3394af278f5a5e768efae7/models/__pycache__/yolo.cpython-37.pyc -------------------------------------------------------------------------------- /models/common.py: -------------------------------------------------------------------------------- 1 | # This file contains modules common to various models 2 | 3 | 4 | from utils.utils import * 5 | 6 | 7 | def DWConv(c1, c2, k=1, s=1, act=True): 8 | # Depthwise convolution 9 | return Conv(c1, c2, k, s, g=math.gcd(c1, c2), act=act) 10 | 11 | 12 | class Conv(nn.Module): 13 | # Standard convolution 14 | def __init__(self, c1, c2, k=1, s=1, g=1, act=True): # ch_in, ch_out, kernel, stride, groups 15 | super(Conv, self).__init__() 16 | self.conv = nn.Conv2d(c1, c2, k, s, k // 2, groups=g, bias=False) 17 | self.bn = nn.BatchNorm2d(c2) 18 | self.act = nn.LeakyReLU(0.1, inplace=True) if act else nn.Identity() 19 | 20 | def forward(self, x): 21 | return self.act(self.bn(self.conv(x))) 22 | 23 | def fuseforward(self, x): 24 | return self.act(self.conv(x)) 25 | 26 | 27 | class Bottleneck(nn.Module): 28 | # Standard bottleneck 29 | def __init__(self, c1, c2, shortcut=True, g=1, e=0.5): # ch_in, ch_out, shortcut, groups, expansion 30 | super(Bottleneck, self).__init__() 31 | c_ = int(c2 * e) # hidden channels 32 | self.cv1 = Conv(c1, c_, 1, 1) 33 | self.cv2 = Conv(c_, c2, 3, 1, g=g) 34 | self.add = shortcut and c1 == c2 35 | 36 | def forward(self, x): 37 | return x + self.cv2(self.cv1(x)) if self.add else self.cv2(self.cv1(x)) 38 | 39 | 40 | class BottleneckCSP(nn.Module): 41 | # CSP Bottleneck https://github.com/WongKinYiu/CrossStagePartialNetworks 42 | def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5): # ch_in, ch_out, number, shortcut, groups, expansion 43 | super(BottleneckCSP, self).__init__() 44 | c_ = int(c2 * e) # hidden channels 45 | self.cv1 = Conv(c1, c_, 1, 1) 46 | self.cv2 = nn.Conv2d(c1, c_, 1, 1, bias=False) 47 | self.cv3 = nn.Conv2d(c_, c_, 1, 1, bias=False) 48 | self.cv4 = Conv(c2, c2, 1, 1) 49 | self.bn = nn.BatchNorm2d(2 * c_) # applied to cat(cv2, cv3) 50 | self.act = nn.LeakyReLU(0.1, inplace=True) 51 | self.m = nn.Sequential(*[Bottleneck(c_, c_, shortcut, g, e=1.0) for _ in range(n)]) 52 | 53 | def forward(self, x): 54 | y1 = self.cv3(self.m(self.cv1(x))) 55 | y2 = self.cv2(x) 56 | return self.cv4(self.act(self.bn(torch.cat((y1, y2), dim=1)))) 57 | 58 | 59 | class SPP(nn.Module): 60 | # Spatial pyramid pooling layer used in YOLOv3-SPP 61 | def __init__(self, c1, c2, k=(5, 9, 13)): 62 | super(SPP, self).__init__() 63 | c_ = c1 // 2 # hidden channels 64 | self.cv1 = Conv(c1, c_, 1, 1) 65 | self.cv2 = Conv(c_ * (len(k) + 1), c2, 1, 1) 66 | self.m = nn.ModuleList([nn.MaxPool2d(kernel_size=x, stride=1, padding=x // 2) for x in k]) 67 | 68 | def forward(self, x): 69 | x = self.cv1(x) 70 | return self.cv2(torch.cat([x] + [m(x) for m in self.m], 1)) 71 | 72 | 73 | class Flatten(nn.Module): 74 | # Use after nn.AdaptiveAvgPool2d(1) to remove last 2 dimensions 75 | def forward(self, x): 76 | return x.view(x.size(0), -1) 77 | 78 | 79 | class Focus(nn.Module): 80 | # Focus wh information into c-space 81 | def __init__(self, c1, c2, k=1): 82 | super(Focus, self).__init__() 83 | self.conv = Conv(c1 * 4, c2, k, 1) 84 | 85 | def forward(self, x): # x(b,c,w,h) -> y(b,4c,w/2,h/2) 86 | return self.conv(torch.cat([x[..., ::2, ::2], x[..., 1::2, ::2], 
x[..., ::2, 1::2], x[..., 1::2, 1::2]], 1)) 87 | 88 | 89 | class Concat(nn.Module): 90 | # Concatenate a list of tensors along dimension 91 | def __init__(self, dimension=1): 92 | super(Concat, self).__init__() 93 | self.d = dimension 94 | 95 | def forward(self, x): 96 | return torch.cat(x, self.d) 97 | -------------------------------------------------------------------------------- /models/de.py: -------------------------------------------------------------------------------- 1 | from utils.datasets import * 2 | from utils.utils import * 3 | 4 | 5 | def get_model(): 6 | weights = r'./weights/yolov5s.pt' 7 | device = torch.device("cuda" if (torch.cuda.is_available()) else "cpu") 8 | google_utils.attempt_download(weights) 9 | model = torch.load(weights, map_location=device)['model'] 10 | model.to(device).eval() 11 | return model 12 | 13 | 14 | def letterbox(img, new_shape=(416, 416), color=(114, 114, 114), auto=True, scaleFill=False, scaleup=True): 15 | # Resize image to a 32-pixel-multiple rectangle https://github.com/ultralytics/yolov3/issues/232 16 | shape = img.shape[:2] # current shape [height, width] 17 | if isinstance(new_shape, int): 18 | new_shape = (new_shape, new_shape) 19 | 20 | # Scale ratio (new / old) 21 | r = min(new_shape[0] / shape[0], new_shape[1] / shape[1]) 22 | if not scaleup: # only scale down, do not scale up (for better test mAP) 23 | r = min(r, 1.0) 24 | # Compute padding 25 | ratio = r, r # width, height ratios 26 | new_unpad = int(round(shape[1] * r)), int(round(shape[0] * r)) 27 | dw, dh = new_shape[1] - new_unpad[0], new_shape[0] - new_unpad[1] # wh padding 28 | if auto: # minimum rectangle 29 | dw, dh = np.mod(dw, 64), np.mod(dh, 64) # wh padding 30 | elif scaleFill: # stretch 31 | dw, dh = 0.0, 0.0 32 | new_unpad = new_shape 33 | ratio = new_shape[0] / shape[1], new_shape[1] / shape[0] # width, height ratios 34 | 35 | dw /= 2 # divide padding into 2 sides 36 | dh /= 2 37 | if shape[::-1] != new_unpad: # resize 38 | img = cv2.resize(img, new_unpad, interpolation=cv2.INTER_LINEAR) 39 | top, bottom = int(round(dh - 0.1)), int(round(dh + 0.1)) 40 | left, right = int(round(dw - 0.1)), int(round(dw + 0.1)) 41 | img = cv2.copyMakeBorder(img, top, bottom, left, right, cv2.BORDER_CONSTANT, value=color) # add border 42 | return img, ratio, (dw, dh) 43 | 44 | def detect(model, im0s): 45 | t0 = time.time() 46 | device = torch.device("cuda" if (torch.cuda.is_available()) else "cpu") 47 | names = model.names if hasattr(model, 'names') else model.modules.names 48 | colors = [[random.randint(0, 255) for _ in range(3)] for _ in range(len(names))] 49 | img = letterbox(im0s, new_shape=640)[0] 50 | img = img[:, :, ::-1].transpose(2, 0, 1) # BGR to RGB, to 3x416x416 51 | img = np.ascontiguousarray(img) 52 | img = torch.from_numpy(img).to(device) 53 | img = img.float() 54 | img /= 255.0 # 0 - 255 to 0.0 - 1.0 55 | if img.ndimension() == 3: 56 | img = img.unsqueeze(0) 57 | pred = model(img, augment=False)[0] 58 | pred = non_max_suppression(pred, 0.4, 0.5, 59 | fast=True, classes=None, agnostic=False) 60 | for i, det in enumerate(pred): # detections per image 61 | im0 = im0s 62 | if det is not None and len(det): 63 | det[:, :4] = scale_coords(img.shape[2:], det[:, :4], im0.shape).round() 64 | for *xyxy, conf, cls in det: 65 | label = '%s%.2f' % (names[int(cls)], conf) 66 | im0 = plot_one_box(xyxy, im0, label=label, color=colors[int(cls)], line_thickness=1) 67 | print('Done. 
(%.3fs)' % (time.time() - t0)) 68 | return im0 69 | 70 | -------------------------------------------------------------------------------- /models/experimental.py: -------------------------------------------------------------------------------- 1 | from models.common import * 2 | 3 | 4 | class Sum(nn.Module): 5 | # Weighted sum of 2 or more layers https://arxiv.org/abs/1911.09070 6 | def __init__(self, n, weight=False): # n: number of inputs 7 | super(Sum, self).__init__() 8 | self.weight = weight # apply weights boolean 9 | self.iter = range(n - 1) # iter object 10 | if weight: 11 | self.w = nn.Parameter(-torch.arange(1., n) / 2, requires_grad=True) # layer weights 12 | 13 | def forward(self, x): 14 | y = x[0] # no weight 15 | if self.weight: 16 | w = torch.sigmoid(self.w) * 2 17 | for i in self.iter: 18 | y = y + x[i + 1] * w[i] 19 | else: 20 | for i in self.iter: 21 | y = y + x[i + 1] 22 | return y 23 | 24 | 25 | class GhostConv(nn.Module): 26 | # Ghost Convolution https://github.com/huawei-noah/ghostnet 27 | def __init__(self, c1, c2, k=1, s=1, g=1, act=True): # ch_in, ch_out, kernel, stride, groups 28 | super(GhostConv, self).__init__() 29 | c_ = c2 // 2 # hidden channels 30 | self.cv1 = Conv(c1, c_, k, s, g, act) 31 | self.cv2 = Conv(c_, c_, 5, 1, c_, act) 32 | 33 | def forward(self, x): 34 | y = self.cv1(x) 35 | return torch.cat([y, self.cv2(y)], 1) 36 | 37 | 38 | class GhostBottleneck(nn.Module): 39 | # Ghost Bottleneck https://github.com/huawei-noah/ghostnet 40 | def __init__(self, c1, c2, k, s): 41 | super(GhostBottleneck, self).__init__() 42 | c_ = c2 // 2 43 | self.conv = nn.Sequential(GhostConv(c1, c_, 1, 1), # pw 44 | DWConv(c_, c_, k, s, act=False) if s == 2 else nn.Identity(), # dw 45 | GhostConv(c_, c2, 1, 1, act=False)) # pw-linear 46 | self.shortcut = nn.Sequential(DWConv(c1, c1, k, s, act=False), 47 | Conv(c1, c2, 1, 1, act=False)) if s == 2 else nn.Identity() 48 | 49 | def forward(self, x): 50 | return self.conv(x) + self.shortcut(x) 51 | 52 | 53 | class ConvPlus(nn.Module): 54 | # Plus-shaped convolution 55 | def __init__(self, c1, c2, k=3, s=1, g=1, bias=True): # ch_in, ch_out, kernel, stride, groups 56 | super(ConvPlus, self).__init__() 57 | self.cv1 = nn.Conv2d(c1, c2, (k, 1), s, (k // 2, 0), groups=g, bias=bias) 58 | self.cv2 = nn.Conv2d(c1, c2, (1, k), s, (0, k // 2), groups=g, bias=bias) 59 | 60 | def forward(self, x): 61 | return self.cv1(x) + self.cv2(x) 62 | 63 | 64 | class MixConv2d(nn.Module): 65 | # Mixed Depthwise Conv https://arxiv.org/abs/1907.09595 66 | def __init__(self, c1, c2, k=(1, 3), s=1, equal_ch=True): 67 | super(MixConv2d, self).__init__() 68 | groups = len(k) 69 | if equal_ch: # equal c_ per group 70 | i = torch.linspace(0, groups - 1E-6, c2).floor() # c2 indices 71 | c_ = [(i == g).sum() for g in range(groups)] # intermediate channels 72 | else: # equal weight.numel() per group 73 | b = [c2] + [0] * groups 74 | a = np.eye(groups + 1, groups, k=-1) 75 | a -= np.roll(a, 1, axis=1) 76 | a *= np.array(k) ** 2 77 | a[0] = 1 78 | c_ = np.linalg.lstsq(a, b, rcond=None)[0].round() # solve for equal weight indices, ax = b 79 | 80 | self.m = nn.ModuleList([nn.Conv2d(c1, int(c_[g]), k[g], s, k[g] // 2, bias=False) for g in range(groups)]) 81 | self.bn = nn.BatchNorm2d(c2) 82 | self.act = nn.LeakyReLU(0.1, inplace=True) 83 | 84 | def forward(self, x): 85 | return x + self.act(self.bn(torch.cat([m(x) for m in self.m], 1))) 86 | -------------------------------------------------------------------------------- /models/onnx_export.py: 
-------------------------------------------------------------------------------- 1 | """Exports a pytorch *.pt model to *.onnx format 2 | 3 | Usage: 4 | import torch 5 | $ export PYTHONPATH="$PWD" && python models/onnx_export.py --weights ./weights/yolov5s.pt --img 640 --batch 1 6 | """ 7 | 8 | import argparse 9 | 10 | import onnx 11 | 12 | from models.common import * 13 | 14 | if __name__ == '__main__': 15 | parser = argparse.ArgumentParser() 16 | parser.add_argument('--weights', type=str, default='./yolov5s.pt', help='weights path') 17 | parser.add_argument('--img-size', nargs='+', type=int, default=[640, 640], help='image size') 18 | parser.add_argument('--batch-size', type=int, default=1, help='batch size') 19 | opt = parser.parse_args() 20 | print(opt) 21 | 22 | # Parameters 23 | f = opt.weights.replace('.pt', '.onnx') # onnx filename 24 | img = torch.zeros((opt.batch_size, 3, *opt.img_size)) # image size, (1, 3, 320, 192) iDetection 25 | 26 | # Load pytorch model 27 | google_utils.attempt_download(opt.weights) 28 | model = torch.load(opt.weights)['model'] 29 | model.eval() 30 | model.fuse() 31 | 32 | # Export to onnx 33 | model.model[-1].export = True # set Detect() layer export=True 34 | _ = model(img) # dry run 35 | torch.onnx.export(model, img, f, verbose=False, opset_version=11, input_names=['images'], 36 | output_names=['output']) # output_names=['classes', 'boxes'] 37 | 38 | # Check onnx model 39 | model = onnx.load(f) # load onnx model 40 | onnx.checker.check_model(model) # check onnx model 41 | print(onnx.helper.printable_graph(model.graph)) # print a human readable representation of the graph 42 | print('Export complete. ONNX model saved to %s\nView with https://github.com/lutzroeder/netron' % f) 43 | -------------------------------------------------------------------------------- /models/yolo.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | 3 | import yaml 4 | 5 | from models.experimental import * 6 | 7 | 8 | class Detect(nn.Module): 9 | def __init__(self, nc=80, anchors=()): # detection layer 10 | super(Detect, self).__init__() 11 | self.stride = None # strides computed during build 12 | self.nc = nc # number of classes 13 | self.no = nc + 5 # number of outputs per anchor 14 | self.nl = len(anchors) # number of detection layers 15 | self.na = len(anchors[0]) // 2 # number of anchors 16 | self.grid = [torch.zeros(1)] * self.nl # init grid 17 | a = torch.tensor(anchors).float().view(self.nl, -1, 2) 18 | self.register_buffer('anchors', a) # shape(nl,na,2) 19 | self.register_buffer('anchor_grid', a.clone().view(self.nl, 1, -1, 1, 1, 2)) # shape(nl,1,na,1,1,2) 20 | self.export = False # onnx export 21 | 22 | def forward(self, x): 23 | # x = x.copy() # for profiling 24 | z = [] # inference output 25 | self.training |= self.export 26 | for i in range(self.nl): 27 | bs, _, ny, nx = x[i].shape # x(bs,255,20,20) to x(bs,3,20,20,85) 28 | x[i] = x[i].view(bs, self.na, self.no, ny, nx).permute(0, 1, 3, 4, 2).contiguous() 29 | 30 | if not self.training: # inference 31 | if self.grid[i].shape[2:4] != x[i].shape[2:4]: 32 | self.grid[i] = self._make_grid(nx, ny).to(x[i].device) 33 | 34 | y = x[i].sigmoid() 35 | y[..., 0:2] = (y[..., 0:2] * 2. 
- 0.5 + self.grid[i].to(x[i].device)) * self.stride[i] # xy 36 | y[..., 2:4] = (y[..., 2:4] * 2) ** 2 * self.anchor_grid[i] # wh 37 | z.append(y.view(bs, -1, self.no)) 38 | 39 | return x if self.training else (torch.cat(z, 1), x) 40 | 41 | @staticmethod 42 | def _make_grid(nx=20, ny=20): 43 | yv, xv = torch.meshgrid([torch.arange(ny), torch.arange(nx)]) 44 | return torch.stack((xv, yv), 2).view((1, 1, ny, nx, 2)).float() 45 | 46 | 47 | class Model(nn.Module): 48 | def __init__(self, model_cfg='yolov5s.yaml', ch=3, nc=None): # model, input channels, number of classes 49 | super(Model, self).__init__() 50 | if type(model_cfg) is dict: 51 | self.md = model_cfg # model dict 52 | else: # is *.yaml 53 | with open(model_cfg) as f: 54 | self.md = yaml.load(f, Loader=yaml.FullLoader) # model dict 55 | 56 | # Define model 57 | if nc: 58 | self.md['nc'] = nc # override yaml value 59 | self.model, self.save = parse_model(self.md, ch=[ch]) # model, savelist, ch_out 60 | # print([x.shape for x in self.forward(torch.zeros(1, ch, 64, 64))]) 61 | 62 | # Build strides, anchors 63 | m = self.model[-1] # Detect() 64 | m.stride = torch.tensor([64 / x.shape[-2] for x in self.forward(torch.zeros(1, ch, 64, 64))]) # forward 65 | m.anchors /= m.stride.view(-1, 1, 1) 66 | self.stride = m.stride 67 | 68 | # Init weights, biases 69 | torch_utils.initialize_weights(self) 70 | self._initialize_biases() # only run once 71 | torch_utils.model_info(self) 72 | print('') 73 | 74 | def forward(self, x, augment=False, profile=False): 75 | if augment: 76 | img_size = x.shape[-2:] # height, width 77 | s = [0.83, 0.67] # scales 78 | y = [] 79 | for i, xi in enumerate((x, 80 | torch_utils.scale_img(x.flip(3), s[0]), # flip-lr and scale 81 | torch_utils.scale_img(x, s[1]), # scale 82 | )): 83 | # cv2.imwrite('img%g.jpg' % i, 255 * xi[0].numpy().transpose((1, 2, 0))[:, :, ::-1]) 84 | y.append(self.forward_once(xi)[0]) 85 | 86 | y[1][..., :4] /= s[0] # scale 87 | y[1][..., 0] = img_size[1] - y[1][..., 0] # flip lr 88 | y[2][..., :4] /= s[1] # scale 89 | return torch.cat(y, 1), None # augmented inference, train 90 | else: 91 | return self.forward_once(x, profile) # single-scale inference, train 92 | 93 | def forward_once(self, x, profile=False): 94 | y, dt = [], [] # outputs 95 | for m in self.model: 96 | if m.f != -1: # if not from previous layer 97 | x = y[m.f] if isinstance(m.f, int) else [x if j == -1 else y[j] for j in m.f] # from earlier layers 98 | 99 | if profile: 100 | import thop 101 | o = thop.profile(m, inputs=(x,), verbose=False)[0] / 1E9 * 2 # FLOPS 102 | t = torch_utils.time_synchronized() 103 | for _ in range(10): 104 | _ = m(x) 105 | dt.append((torch_utils.time_synchronized() - t) * 100) 106 | print('%10.1f%10.0f%10.1fms %-40s' % (o, m.np, dt[-1], m.type)) 107 | 108 | x = m(x) # run 109 | y.append(x if m.i in self.save else None) # save output 110 | 111 | if profile: 112 | print('%.1fms total' % sum(dt)) 113 | return x 114 | 115 | def _initialize_biases(self, cf=None): # initialize biases into Detect(), cf is class frequency 116 | # cf = torch.bincount(torch.tensor(np.concatenate(dataset.labels, 0)[:, 0]).long(), minlength=nc) + 1. 
117 | m = self.model[-1] # Detect() module 118 | for f, s in zip(m.f, m.stride): #  from 119 | mi = self.model[f % m.i] 120 | b = mi.bias.view(m.na, -1) # conv.bias(255) to (3,85) 121 | # b[:, 4] += math.log(8 / (640 / s) ** 2) # obj (8 objects per 640 image) 122 | # b[:, 5:] += math.log(0.6 / (m.nc - 0.99)) if cf is None else torch.log(cf / cf.sum()) # cls 123 | b.data[:, 4] += math.log(8 / (640 / s) ** 2) # obj (8 objects per 640 image) 124 | b.data[:, 5:] += math.log(0.6 / (m.nc - 0.99)) if cf is None else torch.log(cf / cf.sum()) # cls 125 | 126 | mi.bias = torch.nn.Parameter(b.view(-1), requires_grad=True) 127 | # def _initialize_biases(self, cf=None): # initialize biases into Detect(), cf is class frequency 128 | # # cf = torch.bincount(torch.tensor(np.concatenate(dataset.labels, 0)[:, 0]).long(), minlength=nc) + 1. 129 | # m = self.model[-1] # Detect() module 130 | # for mi, s in zip(m.m, m.stride): # from 131 | # b = mi.bias.view(m.na, -1) # conv.bias(255) to (3,85) 132 | # with torch.no_grad(): 133 | # b[:, 4] += math.log(8 / (640 / s) ** 2) # obj (8 objects per 640 image) 134 | # b[:, 5:] += math.log(0.6 / (m.nc - 0.99)) if cf is None else torch.log(cf / cf.sum()) # cls 135 | # mi.bias = torch.nn.Parameter(b.view(-1), requires_grad=True) 136 | 137 | def _print_biases(self): 138 | m = self.model[-1] # Detect() module 139 | for f in sorted([x % m.i for x in m.f]): #  from 140 | b = self.model[f].bias.detach().view(m.na, -1).T # conv.bias(255) to (3,85) 141 | print(('%g Conv2d.bias:' + '%10.3g' * 6) % (f, *b[:5].mean(1).tolist(), b[5:].mean())) 142 | 143 | # def _print_weights(self): 144 | # for m in self.model.modules(): 145 | # if type(m) is Bottleneck: 146 | # print('%10.3g' % (m.w.detach().sigmoid() * 2)) # shortcut weights 147 | 148 | def fuse(self): # fuse model Conv2d() + BatchNorm2d() layers 149 | print('Fusing layers...') 150 | for m in self.model.modules(): 151 | if type(m) is Conv: 152 | m.conv = torch_utils.fuse_conv_and_bn(m.conv, m.bn) # update conv 153 | m.bn = None # remove batchnorm 154 | m.forward = m.fuseforward # update forward 155 | torch_utils.model_info(self) 156 | 157 | 158 | def parse_model(md, ch): # model_dict, input_channels(3) 159 | print('\n%3s%15s%3s%10s %-40s%-30s' % ('', 'from', 'n', 'params', 'module', 'arguments')) 160 | anchors, nc, gd, gw = md['anchors'], md['nc'], md['depth_multiple'], md['width_multiple'] 161 | na = (len(anchors[0]) // 2) # number of anchors 162 | no = na * (nc + 5) # number of outputs = anchors * (classes + 5) 163 | 164 | layers, save, c2 = [], [], ch[-1] # layers, savelist, ch out 165 | for i, (f, n, m, args) in enumerate(md['backbone'] + md['head']): # from, number, module, args 166 | m = eval(m) if isinstance(m, str) else m # eval strings 167 | for j, a in enumerate(args): 168 | try: 169 | args[j] = eval(a) if isinstance(a, str) else a # eval strings 170 | except: 171 | pass 172 | 173 | n = max(round(n * gd), 1) if n > 1 else n # depth gain 174 | if m in [nn.Conv2d, Conv, Bottleneck, SPP, DWConv, MixConv2d, Focus, ConvPlus, BottleneckCSP]: 175 | c1, c2 = ch[f], args[0] 176 | 177 | # Normal 178 | # if i > 0 and args[0] != no: # channel expansion factor 179 | # ex = 1.75 # exponential (default 2.0) 180 | # e = math.log(c2 / ch[1]) / math.log(2) 181 | # c2 = int(ch[1] * ex ** e) 182 | # if m != Focus: 183 | c2 = make_divisible(c2 * gw, 8) if c2 != no else c2 184 | 185 | # Experimental 186 | # if i > 0 and args[0] != no: # channel expansion factor 187 | # ex = 1 + gw # exponential (default 2.0) 188 | # ch1 = 32 # ch[1] 189 | # 
e = math.log(c2 / ch1) / math.log(2) # level 1-n 190 | # c2 = int(ch1 * ex ** e) 191 | # if m != Focus: 192 | # c2 = make_divisible(c2, 8) if c2 != no else c2 193 | 194 | args = [c1, c2, *args[1:]] 195 | if m is BottleneckCSP: 196 | args.insert(2, n) 197 | n = 1 198 | elif m is nn.BatchNorm2d: 199 | args = [ch[f]] 200 | elif m is Concat: 201 | c2 = sum([ch[-1 if x == -1 else x + 1] for x in f]) 202 | elif m is Detect: 203 | f = f or list(reversed([(-1 if j == i else j - 1) for j, x in enumerate(ch) if x == no])) 204 | else: 205 | c2 = ch[f] 206 | 207 | m_ = nn.Sequential(*[m(*args) for _ in range(n)]) if n > 1 else m(*args) # module 208 | t = str(m)[8:-2].replace('__main__.', '') # module type 209 | np = sum([x.numel() for x in m_.parameters()]) # number params 210 | m_.i, m_.f, m_.type, m_.np = i, f, t, np # attach index, 'from' index, type, number params 211 | print('%3s%15s%3s%10.0f %-40s%-30s' % (i, f, n, np, t, args)) # print 212 | save.extend(x % i for x in ([f] if isinstance(f, int) else f) if x != -1) # append to savelist 213 | layers.append(m_) 214 | ch.append(c2) 215 | return nn.Sequential(*layers), sorted(save) 216 | 217 | 218 | if __name__ == '__main__': 219 | parser = argparse.ArgumentParser() 220 | parser.add_argument('--cfg', type=str, default='yolov5s.yaml', help='model.yaml') 221 | parser.add_argument('--device', default='', help='cuda device, i.e. 0 or 0,1,2,3 or cpu') 222 | opt = parser.parse_args() 223 | opt.cfg = glob.glob('./**/' + opt.cfg, recursive=True)[0] # find file 224 | 225 | device = torch_utils.select_device(opt.device) 226 | 227 | # Create model 228 | model = Model(opt.cfg).to(device) 229 | model.train() 230 | 231 | # Profile 232 | # img = torch.rand(8 if torch.cuda.is_available() else 1, 3, 640, 640).to(device) 233 | # y = model(img, profile=True) 234 | # print([y[0].shape] + [x.shape for x in y[1]]) 235 | 236 | # ONNX export 237 | # model.model[-1].export = True 238 | # torch.onnx.export(model, img, f.replace('.yaml', '.onnx'), verbose=True, opset_version=11) 239 | 240 | # Tensorboard 241 | # from torch.utils.tensorboard import SummaryWriter 242 | # tb_writer = SummaryWriter() 243 | # print("Run 'tensorboard --logdir=models/runs' to view tensorboard at http://localhost:6006/") 244 | # tb_writer.add_graph(model.model, img) # add model to tensorboard 245 | # tb_writer.add_image('test', img[0], dataformats='CWH') # add model to tensorboard 246 | -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | torch==1.8.0 2 | torchvision==0.9.0 3 | numpy 4 | opencv-python 5 | lxml 6 | tqdm 7 | flask 8 | pillow 9 | tensorboard 10 | pycocotools # pycocotools-windows -------------------------------------------------------------------------------- /static/client.js: -------------------------------------------------------------------------------- 1 | var el = x => document.getElementById(x); 2 | 3 | function showPicker() { 4 | el("file-input").click(); 5 | } 6 | 7 | function showPicked(input) { 8 | el("upload-label").innerHTML = input.files[0].name; 9 | 10 | var reader = new FileReader(); 11 | reader.onload = function (e) { 12 | if (e.target.result.split("/")[0].split(":")[1] == "image"){ 13 | el("image-picked").src = e.target.result; 14 | el("image-picked").className = ""; 15 | el("image-picked1").className = "no-display"; 16 | } 17 | else{ 18 | el("image-picked1").src = e.target.result; 19 | el("image-picked1").className = ""; 20 | 
el("image-picked").className = "no-display"; 21 | } 22 | }; 23 | reader.readAsDataURL(input.files[0]); 24 | } -------------------------------------------------------------------------------- /static/style.css: -------------------------------------------------------------------------------- 1 | .modal { 2 | display: none; 3 | position: fixed; 4 | z-index: 1000; 5 | top: 0; 6 | left: 0; 7 | height: 100%; 8 | width: 100%; 9 | background: rgba( 255, 255, 255, .8 ) 10 | url('/static/ajax-loader.gif') 11 | 50% 50% 12 | no-repeat; 13 | } 14 | 15 | /* When the body has the loading class, we turn 16 | the scrollbar off with overflow:hidden */ 17 | body.loading .modal { 18 | overflow: hidden; 19 | } 20 | 21 | /* Anytime the body has the loading class, our 22 | modal element will be visible */ 23 | body.loading .modal { 24 | display: block; 25 | } -------------------------------------------------------------------------------- /static/style1.css: -------------------------------------------------------------------------------- 1 | body { 2 | background-color: #fff; 3 | } 4 | 5 | .no-display { 6 | display: none; 7 | } 8 | 9 | .center { 10 | margin: auto; 11 | padding: 10px 50px; 12 | text-align: center; 13 | font-size: 14px; 14 | } 15 | 16 | .title { 17 | font-size: 30px; 18 | margin-top: 1em; 19 | margin-bottom: 1em; 20 | color: #262626; 21 | } 22 | 23 | .content { 24 | margin-top: 10em; 25 | } 26 | 27 | .analyze { 28 | margin-top: 5em; 29 | } 30 | 31 | .upload-label { 32 | padding: 10px; 33 | font-size: 12px; 34 | } 35 | 36 | .result-label { 37 | margin-top: 0.5em; 38 | padding: 10px; 39 | font-size: 13px; 40 | } 41 | 42 | button.choose-file-button { 43 | width: 200px; 44 | height: 40px; 45 | border-radius: 2px; 46 | background-color: #ffffff; 47 | border: solid 1px #ff8100; 48 | font-size: 13px; 49 | color: #ff8100; 50 | } 51 | 52 | button.analyze-button { 53 | width: 200px; 54 | height: 40px; 55 | border: solid 1px #ff8100; 56 | border-radius: 2px; 57 | background-color: #ff8100; 58 | font-size: 13px; 59 | color: #ffffff; 60 | } 61 | 62 | button:focus { 63 | outline: 0; 64 | } 65 | -------------------------------------------------------------------------------- /static/worker.js: -------------------------------------------------------------------------------- 1 | $('#detections').hide() 2 | var $loading = $('#loading').hide(); 3 | 4 | $('#updateCamera').click(function (event) { 5 | event.preventDefault(); 6 | const data = { 7 | "gray": $('#gray').is(":checked"), 8 | "gaussian": $('#gaussian').is(":checked"), 9 | "sobel": $('#sobel').is(":checked"), 10 | "canny": $('#canny').is(":checked"), 11 | } 12 | console.log(data) 13 | $.ajax({ 14 | type: 'POST', 15 | url: '/cameraParams', 16 | data: data, 17 | success: function (success) { 18 | console.log(success) 19 | }, error: function (error) { 20 | console.log(error) 21 | } 22 | }) 23 | }); 24 | 25 | var loadFile = function (event) { 26 | var output = document.getElementById('input'); 27 | output.src = URL.createObjectURL(event.target.files[0]); 28 | }; 29 | 30 | $(document) 31 | .ajaxStart(function () { 32 | $loading.show(); 33 | }) 34 | .ajaxStop(function () { 35 | $loading.hide(); 36 | }); 37 | -------------------------------------------------------------------------------- /templates/index1.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 10 | 11 | 12 | 13 | yolo deepsort 14 | 15 | 16 |
[templates/index1.html, source lines 17-53: the HTML markup was lost when this dump was generated; the only visible text that survives is the page heading "Target Detection and Multi-Target Tracking Platform" and the image-preview label "Chosen Image".]
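Because the template markup above did not survive, here is a minimal sketch of how a Flask app could serve this page and back the two endpoints that static/client.js and static/worker.js expect. It is illustrative only and is not the repository's app.py (whose contents are not included in this dump): the '/analyze' route name and the 'file' form field are hypothetical, while the '/cameraParams' route and its gray/gaussian/sobel/canny flags are taken from worker.js, and the checkpoint is loaded the same way test.py loads weights.
```python
# Illustrative sketch only; not the repository's app.py.
# Hypothetical names: the '/analyze' route and the 'file' form field.
# Grounded names: '/cameraParams' and its flags come from static/worker.js,
# 'weights/best.pt' is where train.py saves the best checkpoint, and the
# torch.load(...)['model'] pattern mirrors test.py.
import io

import torch
from flask import Flask, render_template, request
from PIL import Image

app = Flask(__name__)

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = torch.load('weights/best.pt', map_location=device)['model'].to(device).eval()


@app.route('/')
def index():
    # Serve the upload page defined by templates/index1.html
    return render_template('index1.html')


@app.route('/analyze', methods=['POST'])  # hypothetical endpoint name
def analyze():
    # Decode the uploaded image from the multipart form field
    img = Image.open(io.BytesIO(request.files['file'].read())).convert('RGB')
    # ... letterbox-resize, run `model`, apply non_max_suppression(), and
    # return the annotated image or the detections as JSON ...
    return 'ok'


@app.route('/cameraParams', methods=['POST'])  # posted to by static/worker.js
def camera_params():
    # Checkbox states sent by worker.js arrive as the strings 'true'/'false'
    flags = {k: request.form.get(k) == 'true' for k in ('gray', 'gaussian', 'sobel', 'canny')}
    print(flags)
    return 'ok'


if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)
```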
54 | 55 | -------------------------------------------------------------------------------- /test.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | import json 3 | 4 | import yaml 5 | from torch.utils.data import DataLoader 6 | 7 | from utils.datasets import * 8 | from utils.utils import * 9 | 10 | 11 | def test(data, 12 | weights=None, 13 | batch_size=16, 14 | imgsz=640, 15 | conf_thres=0.001, 16 | iou_thres=0.6, # for nms 17 | save_json=False, 18 | single_cls=False, 19 | augment=False, 20 | model=None, 21 | dataloader=None, 22 | fast=False, 23 | verbose=False): # 0 fast, 1 accurate 24 | # Initialize/load model and set device 25 | if model is None: 26 | device = torch_utils.select_device(opt.device, batch_size=batch_size) 27 | 28 | # Remove previous 29 | for f in glob.glob('test_batch*.jpg'): 30 | os.remove(f) 31 | 32 | # Load model 33 | google_utils.attempt_download(weights) 34 | model = torch.load(weights, map_location=device)['model'] 35 | torch_utils.model_info(model) 36 | # model.fuse() 37 | model.to(device) 38 | 39 | if device.type != 'cpu' and torch.cuda.device_count() > 1: 40 | model = nn.DataParallel(model) 41 | 42 | training = False 43 | else: # called by train.py 44 | device = next(model.parameters()).device # get model device 45 | training = True 46 | 47 | # Configure run 48 | with open(data) as f: 49 | data = yaml.load(f, Loader=yaml.FullLoader) # model dict 50 | nc = 1 if single_cls else int(data['nc']) # number of classes 51 | iouv = torch.linspace(0.5, 0.95, 10).to(device) # iou vector for mAP@0.5:0.95 52 | # iouv = iouv[0].view(1) # comment for mAP@0.5:0.95 53 | niou = iouv.numel() 54 | 55 | # Dataloader 56 | if dataloader is None: 57 | fast |= conf_thres > 0.001 # enable fast mode 58 | path = data['test'] if opt.task == 'test' else data['val'] # path to val/test images 59 | dataset = LoadImagesAndLabels(path, 60 | imgsz, 61 | batch_size, 62 | rect=True, # rectangular inference 63 | single_cls=opt.single_cls, # single class mode 64 | pad=0.0 if fast else 0.5) # padding 65 | batch_size = min(batch_size, len(dataset)) 66 | nw = min([os.cpu_count(), batch_size if batch_size > 1 else 0, 8]) # number of workers 67 | dataloader = DataLoader(dataset, 68 | batch_size=batch_size, 69 | num_workers=nw, 70 | pin_memory=True, 71 | collate_fn=dataset.collate_fn) 72 | 73 | seen = 0 74 | model.eval() 75 | _ = model(torch.zeros((1, 3, imgsz, imgsz), device=device)) if device.type != 'cpu' else None # run once 76 | names = model.names if hasattr(model, 'names') else model.module.names 77 | coco91class = coco80_to_coco91_class() 78 | s = ('%20s' + '%12s' * 6) % ('Class', 'Images', 'Targets', 'P', 'R', 'mAP@.5', 'mAP@.5:.95') 79 | p, r, f1, mp, mr, map50, map, t0, t1 = 0., 0., 0., 0., 0., 0., 0., 0., 0. 
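# Accumulators above: p/r/f1 become per-class precision, recall and F1 arrays,
# mp/mr are their means, map50 is mAP@0.5, map is mAP@0.5:0.95, and t0/t1
# collect inference and NMS time for the speed summary printed after evaluation.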
80 | loss = torch.zeros(3, device=device) 81 | jdict, stats, ap, ap_class = [], [], [], [] 82 | for batch_i, (imgs, targets, paths, shapes) in enumerate(tqdm(dataloader, desc=s)): 83 | imgs = imgs.to(device).float() / 255.0 # uint8 to float32, 0 - 255 to 0.0 - 1.0 84 | targets = targets.to(device) 85 | nb, _, height, width = imgs.shape # batch size, channels, height, width 86 | whwh = torch.Tensor([width, height, width, height]).to(device) 87 | 88 | # Disable gradients 89 | with torch.no_grad(): 90 | # Run model 91 | t = torch_utils.time_synchronized() 92 | inf_out, train_out = model(imgs, augment=augment) # inference and training outputs 93 | t0 += torch_utils.time_synchronized() - t 94 | 95 | # Compute loss 96 | if training: # if model has loss hyperparameters 97 | loss += compute_loss(train_out, targets, model)[1][:3] # GIoU, obj, cls 98 | 99 | # Run NMS 100 | t = torch_utils.time_synchronized() 101 | output = non_max_suppression(inf_out, conf_thres=conf_thres, iou_thres=iou_thres, fast=fast) 102 | t1 += torch_utils.time_synchronized() - t 103 | 104 | # Statistics per image 105 | for si, pred in enumerate(output): 106 | labels = targets[targets[:, 0] == si, 1:] 107 | nl = len(labels) 108 | tcls = labels[:, 0].tolist() if nl else [] # target class 109 | seen += 1 110 | 111 | if pred is None: 112 | if nl: 113 | stats.append((torch.zeros(0, niou, dtype=torch.bool), torch.Tensor(), torch.Tensor(), tcls)) 114 | continue 115 | 116 | # Append to text file 117 | # with open('test.txt', 'a') as file: 118 | # [file.write('%11.5g' * 7 % tuple(x) + '\n') for x in pred] 119 | 120 | # Clip boxes to image bounds 121 | clip_coords(pred, (height, width)) 122 | 123 | # Append to pycocotools JSON dictionary 124 | if save_json: 125 | # [{"image_id": 42, "category_id": 18, "bbox": [258.15, 41.29, 348.26, 243.78], "score": 0.236}, ... 
126 | image_id = int(Path(paths[si]).stem.split('_')[-1]) 127 | box = pred[:, :4].clone() # xyxy 128 | scale_coords(imgs[si].shape[1:], box, shapes[si][0], shapes[si][1]) # to original shape 129 | box = xyxy2xywh(box) # xywh 130 | box[:, :2] -= box[:, 2:] / 2 # xy center to top-left corner 131 | for p, b in zip(pred.tolist(), box.tolist()): 132 | jdict.append({'image_id': image_id, 133 | 'category_id': coco91class[int(p[5])], 134 | 'bbox': [round(x, 3) for x in b], 135 | 'score': round(p[4], 5)}) 136 | 137 | # Assign all predictions as incorrect 138 | correct = torch.zeros(pred.shape[0], niou, dtype=torch.bool, device=device) 139 | if nl: 140 | detected = [] # target indices 141 | tcls_tensor = labels[:, 0] 142 | 143 | # target boxes 144 | tbox = xywh2xyxy(labels[:, 1:5]) * whwh 145 | 146 | # Per target class 147 | for cls in torch.unique(tcls_tensor): 148 | ti = (cls == tcls_tensor).nonzero().view(-1) # prediction indices 149 | pi = (cls == pred[:, 5]).nonzero().view(-1) # target indices 150 | 151 | # Search for detections 152 | if pi.shape[0]: 153 | # Prediction to target ious 154 | ious, i = box_iou(pred[pi, :4], tbox[ti]).max(1) # best ious, indices 155 | 156 | # Append detections 157 | for j in (ious > iouv[0]).nonzero(): 158 | d = ti[i[j]] # detected target 159 | if d not in detected: 160 | detected.append(d) 161 | correct[pi[j]] = ious[j] > iouv # iou_thres is 1xn 162 | if len(detected) == nl: # all targets already located in image 163 | break 164 | 165 | # Append statistics (correct, conf, pcls, tcls) 166 | stats.append((correct.cpu(), pred[:, 4].cpu(), pred[:, 5].cpu(), tcls)) 167 | 168 | # Plot images 169 | if batch_i < 1: 170 | f = 'test_batch%g_gt.jpg' % batch_i # filename 171 | plot_images(imgs, targets, paths, f, names) # ground truth 172 | f = 'test_batch%g_pred.jpg' % batch_i 173 | plot_images(imgs, output_to_target(output, width, height), paths, f, names) # predictions 174 | 175 | # Compute statistics 176 | stats = [np.concatenate(x, 0) for x in zip(*stats)] # to numpy 177 | if len(stats): 178 | p, r, ap, f1, ap_class = ap_per_class(*stats) 179 | p, r, ap50, ap = p[:, 0], r[:, 0], ap[:, 0], ap.mean(1) # [P, R, AP@0.5, AP@0.5:0.95] 180 | mp, mr, map50, map = p.mean(), r.mean(), ap50.mean(), ap.mean() 181 | nt = np.bincount(stats[3].astype(np.int64), minlength=nc) # number of targets per class 182 | else: 183 | nt = torch.zeros(1) 184 | 185 | # Print results 186 | pf = '%20s' + '%12.3g' * 6 # print format 187 | print(pf % ('all', seen, nt.sum(), mp, mr, map50, map)) 188 | 189 | # Print results per class 190 | if verbose and nc > 1 and len(stats): 191 | for i, c in enumerate(ap_class): 192 | print(pf % (names[c], seen, nt[c], p[i], r[i], ap50[i], ap[i])) 193 | 194 | # Print speeds 195 | t = tuple(x / seen * 1E3 for x in (t0, t1, t0 + t1)) + (imgsz, imgsz, batch_size) # tuple 196 | if not training: 197 | print('Speed: %.1f/%.1f/%.1f ms inference/NMS/total per %gx%g image at batch-size %g' % t) 198 | 199 | # Save JSON 200 | if save_json and map50 and len(jdict): 201 | imgIds = [int(Path(x).stem.split('_')[-1]) for x in dataloader.dataset.img_files] 202 | f = 'detections_val2017_%s_results.json' % \ 203 | (weights.split(os.sep)[-1].replace('.pt', '') if weights else '') # filename 204 | print('\nCOCO mAP with pycocotools... saving %s...' 
% f) 205 | with open(f, 'w') as file: 206 | json.dump(jdict, file) 207 | 208 | try: 209 | from pycocotools.coco import COCO 210 | from pycocotools.cocoeval import COCOeval 211 | 212 | # https://github.com/cocodataset/cocoapi/blob/master/PythonAPI/pycocoEvalDemo.ipynb 213 | cocoGt = COCO(glob.glob('../coco/annotations/instances_val*.json')[0]) # initialize COCO ground truth api 214 | cocoDt = cocoGt.loadRes(f) # initialize COCO pred api 215 | 216 | cocoEval = COCOeval(cocoGt, cocoDt, 'bbox') 217 | cocoEval.params.imgIds = imgIds # [:32] # only evaluate these images 218 | cocoEval.evaluate() 219 | cocoEval.accumulate() 220 | cocoEval.summarize() 221 | map, map50 = cocoEval.stats[:2] # update to pycocotools results (mAP@0.5:0.95, mAP@0.5) 222 | except: 223 | print('WARNING: pycocotools must be installed with numpy==1.17 to run correctly. ' 224 | 'See https://github.com/cocodataset/cocoapi/issues/356') 225 | 226 | # Return results 227 | maps = np.zeros(nc) + map 228 | for i, c in enumerate(ap_class): 229 | maps[c] = ap[i] 230 | return (mp, mr, map50, map, *(loss.cpu() / len(dataloader)).tolist()), maps, t 231 | 232 | 233 | if __name__ == '__main__': 234 | parser = argparse.ArgumentParser(prog='test.py') 235 | parser.add_argument('--weights', type=str, default='weights/yolov5s.pt', help='model.pt path') 236 | parser.add_argument('--data', type=str, default='data/coco.yaml', help='*.data path') 237 | parser.add_argument('--batch-size', type=int, default=32, help='size of each image batch') 238 | parser.add_argument('--img-size', type=int, default=640, help='inference size (pixels)') 239 | parser.add_argument('--conf-thres', type=float, default=0.001, help='object confidence threshold') 240 | parser.add_argument('--iou-thres', type=float, default=0.65, help='IOU threshold for NMS') 241 | parser.add_argument('--save-json', action='store_true', help='save a cocoapi-compatible JSON results file') 242 | parser.add_argument('--task', default='val', help="'val', 'test', 'study'") 243 | parser.add_argument('--device', default='', help='cuda device, i.e. 0 or 0,1,2,3 or cpu') 244 | parser.add_argument('--single-cls', action='store_true', help='treat as single-class dataset') 245 | parser.add_argument('--augment', action='store_true', help='augmented inference') 246 | parser.add_argument('--verbose', action='store_true', help='report mAP by class') 247 | opt = parser.parse_args() 248 | opt.save_json = opt.save_json or opt.data.endswith('coco.yaml') 249 | opt.data = glob.glob('./**/' + opt.data, recursive=True)[0] # find file 250 | print(opt) 251 | 252 | # task = 'val', 'test', 'study' 253 | if opt.task in ['val', 'test']: # (default) run normally 254 | test(opt.data, 255 | opt.weights, 256 | opt.batch_size, 257 | opt.img_size, 258 | opt.conf_thres, 259 | opt.iou_thres, 260 | opt.save_json, 261 | opt.single_cls, 262 | opt.augment) 263 | 264 | elif opt.task == 'study': # run over a range of settings and save/plot 265 | for weights in ['yolov5s.pt', 'yolov5m.pt', 'yolov5l.pt', 'yolov5x.pt']: 266 | f = 'study_%s_%s.txt' % (Path(opt.data).stem, Path(weights).stem) # filename to save to 267 | x = list(range(288, 896, 64)) # x axis 268 | y = [] # y axis 269 | for i in x: # img-size 270 | print('\nRunning %s point %s...' 
% (f, i)) 271 | r, _, t = test(opt.data, weights, opt.batch_size, i, opt.conf_thres, opt.iou_thres, opt.save_json) 272 | y.append(r + t) # results and times 273 | np.savetxt(f, y, fmt='%10.4g') # save 274 | os.system('zip -r study.zip study_*.txt') 275 | # plot_study_txt(f, x) # plot 276 | -------------------------------------------------------------------------------- /train.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | import torch.distributed as dist 3 | import torch.nn.functional as F 4 | import torch.optim as optim 5 | import torch.optim.lr_scheduler as lr_scheduler 6 | import yaml 7 | from torch.utils.tensorboard import SummaryWriter 8 | import test # import test.py to get mAP after each epoch 9 | from models.yolo import Model 10 | from utils.datasets import * 11 | from utils.utils import * 12 | mixed_precision = True 13 | try: # Mixed precision training https://github.com/NVIDIA/apex 14 | from apex import amp 15 | except: 16 | print('Apex recommended for faster mixed precision training: https://github.com/NVIDIA/apex') 17 | mixed_precision = False # not installed 18 | wdir = 'weights' + os.sep # weights dir 19 | last = wdir + 'last.pt' 20 | best = wdir + 'best.pt' 21 | results_file = 'results.txt' 22 | # Hyperparameters 23 | hyp = {'lr0': 0.01, # initial learning rate (SGD=1E-2, Adam=1E-3) 24 | 'momentum': 0.937, # SGD momentum 25 | 'weight_decay': 5e-4, # optimizer weight decay 26 | 'giou': 0.05, # giou loss gain 27 | 'cls': 0.58, # cls loss gain 28 | 'cls_pw': 1.0, # cls BCELoss positive_weight 29 | 'obj': 1.0, # obj loss gain (*=img_size/320 if img_size != 320) 30 | 'obj_pw': 1.0, # obj BCELoss positive_weight 31 | 'iou_t': 0.20, # iou training threshold 32 | 'anchor_t': 4.0, # anchor-multiple threshold 33 | 'fl_gamma': 0, # focal loss gamma (efficientDet default is gamma=1.5) 34 | 'hsv_h': 0.014, # image HSV-Hue augmentation (fraction) 35 | 'hsv_s': 0.68, # image HSV-Saturation augmentation (fraction) 36 | 'hsv_v': 0.36, # image HSV-Value augmentation (fraction) 37 | 'degrees': 0.0, # image rotation (+/- deg) 38 | 'translate': 0.0, # image translation (+/- fraction) 39 | 'scale': 0.5, # image scale (+/- gain) 40 | 'shear': 0.0} # image shear (+/- deg) 41 | print(hyp) 42 | 43 | # Overwrite hyp with hyp*.txt (optional) 44 | f = glob.glob('hyp*.txt') 45 | if f: 46 | print('Using %s' % f[0]) 47 | for k, v in zip(hyp.keys(), np.loadtxt(f[0])): 48 | hyp[k] = v 49 | 50 | # Print focal loss if gamma > 0 51 | if hyp['fl_gamma']: 52 | print('Using FocalLoss(gamma=%g)' % hyp['fl_gamma']) 53 | 54 | def train(hyp): 55 | epochs = opt.epochs # 300 56 | batch_size = opt.batch_size # 64 57 | weights = opt.weights # initial training weights 58 | 59 | # Configure 60 | init_seeds(1) 61 | with open(opt.data,'r',encoding='UTF-8') as f: 62 | data_dict = yaml.load(f, Loader=yaml.FullLoader) # model dict 63 | train_path = data_dict['train'] 64 | test_path = data_dict['val'] 65 | nc = 1 if opt.single_cls else int(data_dict['nc']) # number of classes 66 | 67 | # Remove previous results 68 | for f in glob.glob('*_batch*.jpg') + glob.glob(results_file): 69 | os.remove(f) 70 | 71 | # Create model 72 | model = Model(opt.cfg).to(device) 73 | assert model.md['nc'] == nc, '%s nc=%g classes but %s nc=%g classes' % (opt.data, nc, opt.cfg, model.md['nc']) 74 | 75 | # Image sizes 76 | gs = int(max(model.stride)) # grid size (max stride) 77 | if any(x % gs != 0 for x in opt.img_size): 78 | print('WARNING: --img-size %g,%g must be multiple of %s max stride 
%g' % (*opt.img_size, opt.cfg, gs)) 79 | imgsz, imgsz_test = [make_divisible(x, gs) for x in opt.img_size] # image sizes (train, test) 80 | 81 | # Optimizer 82 | nbs = 64 # nominal batch size 83 | accumulate = max(round(nbs / batch_size), 1) # accumulate loss before optimizing 84 | hyp['weight_decay'] *= batch_size * accumulate / nbs # scale weight_decay 85 | pg0, pg1, pg2 = [], [], [] # optimizer parameter groups 86 | for k, v in model.named_parameters(): 87 | if v.requires_grad: 88 | if '.bias' in k: 89 | pg2.append(v) # biases 90 | elif '.weight' in k and '.bn' not in k: 91 | pg1.append(v) # apply weight decay 92 | else: 93 | pg0.append(v) # all else 94 | 95 | optimizer = optim.Adam(pg0, lr=hyp['lr0']) if opt.adam else \ 96 | optim.SGD(pg0, lr=hyp['lr0'], momentum=hyp['momentum'], nesterov=True) 97 | optimizer.add_param_group({'params': pg1, 'weight_decay': hyp['weight_decay']}) # add pg1 with weight_decay 98 | optimizer.add_param_group({'params': pg2}) # add pg2 (biases) 99 | print('Optimizer groups: %g .bias, %g conv.weight, %g other' % (len(pg2), len(pg1), len(pg0))) 100 | del pg0, pg1, pg2 101 | 102 | # Load Model 103 | google_utils.attempt_download(weights) 104 | start_epoch, best_fitness = 0, 0.0 105 | if weights.endswith('.pt'): # pytorch format 106 | ckpt = torch.load(weights, map_location=device) # load checkpoint 107 | 108 | # load model 109 | try: 110 | ckpt['model'] = \ 111 | {k: v for k, v in ckpt['model'].state_dict().items() if model.state_dict()[k].numel() == v.numel()} 112 | model.load_state_dict(ckpt['model'], strict=False) 113 | except KeyError as e: 114 | s = "%s is not compatible with %s. Specify --weights '' or specify a --cfg compatible with %s." \ 115 | % (opt.weights, opt.cfg, opt.weights) 116 | raise KeyError(s) from e 117 | 118 | # load optimizer 119 | if ckpt['optimizer'] is not None: 120 | optimizer.load_state_dict(ckpt['optimizer']) 121 | best_fitness = ckpt['best_fitness'] 122 | 123 | # load results 124 | if ckpt.get('training_results') is not None: 125 | with open(results_file, 'w') as file: 126 | file.write(ckpt['training_results']) # write results.txt 127 | 128 | start_epoch = ckpt['epoch'] + 1 129 | del ckpt 130 | 131 | if mixed_precision: 132 | model, optimizer = amp.initialize(model, optimizer, opt_level='O1', verbosity=0) 133 | 134 | lf = lambda x: (((1 + math.cos(x * math.pi / epochs)) / 2) ** 1.0) * 0.9 + 0.1 # cosine 135 | scheduler = lr_scheduler.LambdaLR(optimizer, lr_lambda=lf) 136 | scheduler.last_epoch = start_epoch - 1 # do not move 137 | 138 | # Initialize distributed training 139 | if device.type != 'cpu' and torch.cuda.device_count() > 1 and torch.distributed.is_available(): 140 | dist.init_process_group(backend='nccl', # distributed backend 141 | init_method='tcp://127.0.0.1:9999', # init method 142 | world_size=1, # number of nodes 143 | rank=0) # node rank 144 | model = torch.nn.parallel.DistributedDataParallel(model) 145 | 146 | # Dataset 147 | dataset = LoadImagesAndLabels(train_path, imgsz, batch_size, 148 | augment=True, 149 | hyp=hyp, # augmentation hyperparameters 150 | rect=opt.rect, # rectangular training 151 | cache_images=opt.cache_images, 152 | single_cls=opt.single_cls) 153 | mlc = np.concatenate(dataset.labels, 0)[:, 0].max() # max label class 154 | assert mlc < nc, 'Label class %g exceeds nc=%g in %s. Correct your labels or your model.' 
% (mlc, nc, opt.cfg) 155 | 156 | # Dataloader 157 | batch_size = min(batch_size, len(dataset)) 158 | nw = min([os.cpu_count(), batch_size if batch_size > 1 else 0, 8]) # number of workers 159 | nw = 0 160 | dataloader = torch.utils.data.DataLoader(dataset, 161 | batch_size=batch_size, 162 | num_workers=nw, 163 | shuffle=not opt.rect, # Shuffle=True unless rectangular training is used 164 | pin_memory=True, 165 | collate_fn=dataset.collate_fn) 166 | 167 | # Testloader 168 | testloader = torch.utils.data.DataLoader(LoadImagesAndLabels(test_path, imgsz_test, batch_size, 169 | hyp=hyp, 170 | rect=True, 171 | cache_images=opt.cache_images, 172 | single_cls=opt.single_cls), 173 | batch_size=batch_size, 174 | num_workers=nw, 175 | pin_memory=True, 176 | collate_fn=dataset.collate_fn) 177 | 178 | # Model parameters 179 | hyp['cls'] *= nc / 80. # scale coco-tuned hyp['cls'] to current dataset 180 | model.nc = nc # attach number of classes to model 181 | model.hyp = hyp # attach hyperparameters to model 182 | model.gr = 1.0 # giou loss ratio (obj_loss = 1.0 or giou) 183 | model.class_weights = labels_to_class_weights(dataset.labels, nc).to(device) # attach class weights 184 | model.names = data_dict['names'] 185 | 186 | # class frequency 187 | labels = np.concatenate(dataset.labels, 0) 188 | c = torch.tensor(labels[:, 0]) # classes 189 | tb_writer.add_histogram('classes', c, 0) 190 | 191 | # Exponential moving average 192 | ema = torch_utils.ModelEMA(model) 193 | 194 | # Start training 195 | t0 = time.time() 196 | nb = len(dataloader) # number of batches 197 | n_burn = max(3 * nb, 1e3) # burn-in iterations, max(3 epochs, 1k iterations) 198 | maps = np.zeros(nc) # mAP per class 199 | results = (0, 0, 0, 0, 0, 0, 0) # 'P', 'R', 'mAP', 'F1', 'val GIoU', 'val Objectness', 'val Classification' 200 | print('Image sizes %g train, %g test' % (imgsz, imgsz_test)) 201 | print('Using %g dataloader workers' % nw) 202 | print('Starting training for %g epochs...' 
% epochs) 203 | # torch.autograd.set_detect_anomaly(True) 204 | for epoch in range(start_epoch, epochs): # epoch ------------------------------------------------------------------ 205 | model.train() 206 | 207 | # Update image weights (optional) 208 | if dataset.image_weights: 209 | w = model.class_weights.cpu().numpy() * (1 - maps) ** 2 # class weights 210 | image_weights = labels_to_image_weights(dataset.labels, nc=nc, class_weights=w) 211 | dataset.indices = random.choices(range(dataset.n), weights=image_weights, k=dataset.n) # rand weighted idx 212 | 213 | mloss = torch.zeros(4, device=device) # mean losses 214 | print(('\n' + '%10s' * 8) % ('Epoch', 'gpu_mem', 'GIoU', 'obj', 'cls', 'total', 'targets', 'img_size')) 215 | try: 216 | pbar = tqdm(enumerate(dataloader), total=nb) # progress bar 217 | for i, (imgs, targets, paths, _) in pbar: # batch ------------------------------------------------------------- 218 | ni = i + nb * epoch # number integrated batches (since train start) 219 | imgs = imgs.to(device).float() / 255.0 # uint8 to float32, 0 - 255 to 0.0 - 1.0 220 | 221 | # Burn-in 222 | if ni <= n_burn: 223 | xi = [0, n_burn] # x interp 224 | # model.gr = np.interp(ni, xi, [0.0, 1.0]) # giou loss ratio (obj_loss = 1.0 or giou) 225 | accumulate = max(1, np.interp(ni, xi, [1, nbs / batch_size]).round()) 226 | for j, x in enumerate(optimizer.param_groups): 227 | # bias lr falls from 0.1 to lr0, all other lrs rise from 0.0 to lr0 228 | x['lr'] = np.interp(ni, xi, [0.1 if j == 2 else 0.0, x['initial_lr'] * lf(epoch)]) 229 | if 'momentum' in x: 230 | x['momentum'] = np.interp(ni, xi, [0.9, hyp['momentum']]) 231 | 232 | # Multi-scale 233 | if opt.multi_scale: 234 | sz = random.randrange(imgsz * 0.5, imgsz * 1.5 + gs) // gs * gs # size 235 | sf = sz / max(imgs.shape[2:]) # scale factor 236 | if sf != 1: 237 | ns = [math.ceil(x * sf / gs) * gs for x in imgs.shape[2:]] # new shape (stretched to gs-multiple) 238 | imgs = F.interpolate(imgs, size=ns, mode='bilinear', align_corners=False) 239 | 240 | # Forward 241 | pred = model(imgs) 242 | 243 | # Loss 244 | loss, loss_items = compute_loss(pred, targets.to(device), model) 245 | if not torch.isfinite(loss): 246 | print('WARNING: non-finite loss, ending training ', loss_items) 247 | return results 248 | 249 | # Backward 250 | if mixed_precision: 251 | with amp.scale_loss(loss, optimizer) as scaled_loss: 252 | scaled_loss.backward() 253 | else: 254 | loss.backward() 255 | 256 | # Optimize 257 | if ni % accumulate == 0: 258 | optimizer.step() 259 | optimizer.zero_grad() 260 | ema.update(model) 261 | 262 | # Print 263 | mloss = (mloss * i + loss_items) / (i + 1) # update mean losses 264 | mem = '%.3gG' % (torch.cuda.memory_cached() / 1E9 if torch.cuda.is_available() else 0) # (GB) 265 | s = ('%10s' * 2 + '%10.4g' * 6) % ( 266 | '%g/%g' % (epoch, epochs - 1), mem, *mloss, targets.shape[0], imgs.shape[-1]) 267 | pbar.set_description(s) 268 | 269 | # Plot 270 | if ni < 3: 271 | f = 'train_batch%g.jpg' % i # filename 272 | res = plot_images(images=imgs, targets=targets, paths=paths, fname=f) 273 | if tb_writer: 274 | tb_writer.add_image(f, res, dataformats='HWC', global_step=epoch) 275 | # tb_writer.add_graph(model, imgs) # add model to tensorboard 276 | # end batch ------------------------------------------------------------------------------------------------ 277 | except: 278 | pass 279 | # Scheduler 280 | scheduler.step() 281 | 282 | torch.cuda.empty_cache() 283 | # mAP 284 | ema.update_attr(model) 285 | final_epoch = epoch + 1 == epochs 286 | if 
not opt.notest or final_epoch: # Calculate mAP 287 | results, maps, times = test.test(opt.data, 288 | batch_size=batch_size, 289 | imgsz=imgsz_test, 290 | save_json=final_epoch and opt.data.endswith(os.sep + 'coco.yaml'), 291 | model=ema.ema, 292 | single_cls=opt.single_cls, 293 | dataloader=testloader, 294 | fast=ni < n_burn) 295 | 296 | # Write 297 | with open(results_file, 'a') as f: 298 | f.write(s + '%10.4g' * 7 % results + '\n') # P, R, mAP, F1, test_losses=(GIoU, obj, cls) 299 | if len(opt.name) and opt.bucket: 300 | os.system('gsutil cp results.txt gs://%s/results/results%s.txt' % (opt.bucket, opt.name)) 301 | 302 | # Tensorboard 303 | if tb_writer: 304 | tags = ['train/giou_loss', 'train/obj_loss', 'train/cls_loss', 305 | 'metrics/precision', 'metrics/recall', 'metrics/mAP_0.5', 'metrics/F1', 306 | 'val/giou_loss', 'val/obj_loss', 'val/cls_loss'] 307 | for x, tag in zip(list(mloss[:-1]) + list(results), tags): 308 | tb_writer.add_scalar(tag, x, epoch) 309 | 310 | # Update best mAP 311 | fi = fitness(np.array(results).reshape(1, -1)) # fitness_i = weighted combination of [P, R, mAP, F1] 312 | if fi > best_fitness: 313 | best_fitness = fi 314 | 315 | # Save model 316 | save = (not opt.nosave) or (final_epoch and not opt.evolve) 317 | if save: 318 | with open(results_file, 'r') as f: # create checkpoint 319 | ckpt = {'epoch': epoch, 320 | 'best_fitness': best_fitness, 321 | 'training_results': f.read(), 322 | 'model': ema.ema.module if hasattr(model, 'module') else ema.ema, 323 | 'optimizer': None if final_epoch else optimizer.state_dict()} 324 | 325 | # Save last, best and delete 326 | torch.save(ckpt, last) 327 | if (best_fitness == fi) and not final_epoch: 328 | torch.save(ckpt, best) 329 | del ckpt 330 | 331 | # end epoch ---------------------------------------------------------------------------------------------------- 332 | # end training 333 | 334 | n = opt.name 335 | if len(n): 336 | n = '_' + n if not n.isnumeric() else n 337 | fresults, flast, fbest = 'results%s.txt' % n, wdir + 'last%s.pt' % n, wdir + 'best%s.pt' % n 338 | for f1, f2 in zip([wdir + 'last.pt', wdir + 'best.pt', 'results.txt'], [flast, fbest, fresults]): 339 | if os.path.exists(f1): 340 | os.rename(f1, f2) # rename 341 | ispt = f2.endswith('.pt') # is *.pt 342 | strip_optimizer(f2) if ispt else None # strip optimizer 343 | os.system('gsutil cp %s gs://%s/weights' % (f2, opt.bucket)) if opt.bucket and ispt else None # upload 344 | 345 | if not opt.evolve: 346 | # plot_results() # save as results.png 347 | pass 348 | print('%g epochs completed in %.3f hours.\n' % (epoch - start_epoch + 1, (time.time() - t0) / 3600)) 349 | dist.destroy_process_group() if torch.cuda.device_count() > 1 else None 350 | torch.cuda.empty_cache() 351 | return results 352 | 353 | if __name__ == '__main__': 354 | parser = argparse.ArgumentParser() 355 | parser.add_argument('--epochs', type=int, default=300) 356 | parser.add_argument('--batch-size', type=int, default=1) 357 | parser.add_argument('--cfg', type=str, default='./config/yolov5l.yaml', help='*.cfg path') 358 | parser.add_argument('--data', type=str, default='./config/score.yaml', help='*.data path') 359 | parser.add_argument('--img-size', nargs='+', type=int, default=[640, 640], help='train,test sizes') 360 | parser.add_argument('--rect', action='store_true', help='rectangular training') 361 | parser.add_argument('--resume', action='store_true', help='resume training from last.pt') 362 | parser.add_argument('--nosave', action='store_true', help='only save final checkpoint') 
363 | parser.add_argument('--notest', action='store_true', help='only test final epoch') 364 | parser.add_argument('--evolve', action='store_true', help='evolve hyperparameters') 365 | parser.add_argument('--bucket', type=str, default='', help='gsutil bucket') 366 | parser.add_argument('--cache-images', action='store_true', help='cache images for faster training') 367 | parser.add_argument('--weights', type=str, default='', help='initial weights path') 368 | parser.add_argument('--name', default='', help='renames results.txt to results_name.txt if supplied') 369 | parser.add_argument('--device', default='', help='cuda device, i.e. 0 or 0,1,2,3 or cpu') 370 | parser.add_argument('--adam', action='store_true', help='use adam optimizer') 371 | parser.add_argument('--multi-scale', action='store_true', help='vary img-size +/- 50%') 372 | parser.add_argument('--single-cls', action='store_true', help='train as single-class dataset') 373 | opt = parser.parse_args() 374 | opt.weights = last if opt.resume else opt.weights 375 | print(opt) 376 | opt.img_size.extend([opt.img_size[-1]] * (2 - len(opt.img_size))) # extend to 2 sizes (train, test) 377 | device = torch_utils.select_device(opt.device, apex=mixed_precision, batch_size=opt.batch_size) 378 | # check_git_status() 379 | if device.type == 'cpu': 380 | mixed_precision = False 381 | # Train 382 | # if not opt.evolve: 383 | tb_writer = SummaryWriter(comment=opt.name) 384 | print('Start Tensorboard with "tensorboard --logdir=runs", view at http://localhost:6006/') 385 | train(hyp) 386 | # Evolve hyperparameters (optional) 387 | # else: 388 | # tb_writer = None 389 | # opt.notest, opt.nosave = True, True # only test/save final epoch 390 | # if opt.bucket: 391 | # os.system('gsutil cp gs://%s/evolve.txt .' % opt.bucket) # download evolve.txt if exists 392 | # for _ in range(10): # generations to evolve 393 | # if os.path.exists('evolve.txt'): # if evolve.txt exists: select best hyps and mutate 394 | # # Select parent(s) 395 | # parent = 'single' # parent selection method: 'single' or 'weighted' 396 | # x = np.loadtxt('evolve.txt', ndmin=2) 397 | # n = min(5, len(x)) # number of previous results to consider 398 | # x = x[np.argsort(-fitness(x))][:n] # top n mutations 399 | # w = fitness(x) - fitness(x).min() # weights 400 | # if parent == 'single' or len(x) == 1: 401 | # # x = x[random.randint(0, n - 1)] # random selection 402 | # x = x[random.choices(range(n), weights=w)[0]] # weighted selection 403 | # elif parent == 'weighted': 404 | # x = (x * w.reshape(n, 1)).sum(0) / w.sum() # weighted combination 405 | 406 | # # Mutate 407 | # mp, s = 0.9, 0.2 # mutation probability, sigma 408 | # npr = np.random 409 | # npr.seed(int(time.time())) 410 | # g = np.array([1, 1, 1, 1, 1, 1, 1, 0, .1, 1, 0, 1, 1, 1, 1, 1, 1, 1]) # gains 411 | # ng = len(g) 412 | # v = np.ones(ng) 413 | # while all(v == 1): # mutate until a change occurs (prevent duplicates) 414 | # v = (g * (npr.random(ng) < mp) * npr.randn(ng) * npr.random() * s + 1).clip(0.3, 3.0) 415 | # for i, k in enumerate(hyp.keys()): # plt.hist(v.ravel(), 300) 416 | # hyp[k] = x[i + 7] * v[i] # mutate 417 | 418 | # # Clip to limits 419 | # keys = ['lr0', 'iou_t', 'momentum', 'weight_decay', 'hsv_s', 'hsv_v', 'translate', 'scale', 'fl_gamma'] 420 | # limits = [(1e-5, 1e-2), (0.00, 0.70), (0.60, 0.98), (0, 0.001), (0, .9), (0, .9), (0, .9), (0, .9), (0, 3)] 421 | # for k, v in zip(keys, limits): 422 | # hyp[k] = np.clip(hyp[k], v[0], v[1]) 423 | # # Train mutation 424 | # results = train(hyp.copy()) 425 | # # 
Write mutation results 426 | # print_mutation(hyp, results, opt.bucket) 427 | # # Plot results 428 | # # plot_evolution_results(hyp) 429 | -------------------------------------------------------------------------------- /utils/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Transformer-man/yolov5-flask/36573a0b6e91d5a91f3394af278f5a5e768efae7/utils/__init__.py -------------------------------------------------------------------------------- /utils/__pycache__/__init__.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Transformer-man/yolov5-flask/36573a0b6e91d5a91f3394af278f5a5e768efae7/utils/__pycache__/__init__.cpython-37.pyc -------------------------------------------------------------------------------- /utils/__pycache__/datasets.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Transformer-man/yolov5-flask/36573a0b6e91d5a91f3394af278f5a5e768efae7/utils/__pycache__/datasets.cpython-37.pyc -------------------------------------------------------------------------------- /utils/__pycache__/google_utils.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Transformer-man/yolov5-flask/36573a0b6e91d5a91f3394af278f5a5e768efae7/utils/__pycache__/google_utils.cpython-37.pyc -------------------------------------------------------------------------------- /utils/__pycache__/torch_utils.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Transformer-man/yolov5-flask/36573a0b6e91d5a91f3394af278f5a5e768efae7/utils/__pycache__/torch_utils.cpython-37.pyc -------------------------------------------------------------------------------- /utils/__pycache__/utils.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Transformer-man/yolov5-flask/36573a0b6e91d5a91f3394af278f5a5e768efae7/utils/__pycache__/utils.cpython-37.pyc -------------------------------------------------------------------------------- /utils/activations.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.functional as F 3 | import torch.nn as nn 4 | 5 | 6 | # Swish ------------------------------------------------------------------------ 7 | class SwishImplementation(torch.autograd.Function): 8 | @staticmethod 9 | def forward(ctx, x): 10 | ctx.save_for_backward(x) 11 | return x * torch.sigmoid(x) 12 | 13 | @staticmethod 14 | def backward(ctx, grad_output): 15 | x = ctx.saved_tensors[0] 16 | sx = torch.sigmoid(x) 17 | return grad_output * (sx * (1 + x * (1 - sx))) 18 | 19 | 20 | class MemoryEfficientSwish(nn.Module): 21 | @staticmethod 22 | def forward(x): 23 | return SwishImplementation.apply(x) 24 | 25 | 26 | class HardSwish(nn.Module): # https://arxiv.org/pdf/1905.02244.pdf 27 | @staticmethod 28 | def forward(x): 29 | return x * F.hardtanh(x + 3, 0., 6., True) / 6. 
30 | 31 | 32 | class Swish(nn.Module): 33 | @staticmethod 34 | def forward(x): 35 | return x * torch.sigmoid(x) 36 | 37 | 38 | # Mish ------------------------------------------------------------------------ 39 | class MishImplementation(torch.autograd.Function): 40 | @staticmethod 41 | def forward(ctx, x): 42 | ctx.save_for_backward(x) 43 | return x.mul(torch.tanh(F.softplus(x))) # x * tanh(ln(1 + exp(x))) 44 | 45 | @staticmethod 46 | def backward(ctx, grad_output): 47 | x = ctx.saved_tensors[0] 48 | sx = torch.sigmoid(x) 49 | fx = F.softplus(x).tanh() 50 | return grad_output * (fx + x * sx * (1 - fx * fx)) 51 | 52 | 53 | class MemoryEfficientMish(nn.Module): 54 | @staticmethod 55 | def forward(x): 56 | return MishImplementation.apply(x) 57 | 58 | 59 | class Mish(nn.Module): # https://github.com/digantamisra98/Mish 60 | @staticmethod 61 | def forward(x): 62 | return x * F.softplus(x).tanh() 63 | -------------------------------------------------------------------------------- /utils/datasets.py: -------------------------------------------------------------------------------- 1 | import glob 2 | import math 3 | import os 4 | import random 5 | import shutil 6 | import time 7 | from pathlib import Path 8 | from threading import Thread 9 | 10 | import cv2 11 | import numpy as np 12 | import torch 13 | from PIL import Image, ExifTags 14 | from torch.utils.data import Dataset 15 | from tqdm import tqdm 16 | 17 | from utils.utils import xyxy2xywh, xywh2xyxy 18 | 19 | help_url = 'https://github.com/ultralytics/yolov5/wiki/Train-Custom-Data' 20 | img_formats = ['.bmp', '.jpg', '.jpeg', '.png', '.tif', '.dng'] 21 | vid_formats = ['.mov', '.avi', '.mp4'] 22 | 23 | # Get orientation exif tag 24 | for orientation in ExifTags.TAGS.keys(): 25 | if ExifTags.TAGS[orientation] == 'Orientation': 26 | break 27 | 28 | 29 | def exif_size(img): 30 | # Returns exif-corrected PIL size 31 | s = img.size # (width, height) 32 | try: 33 | rotation = dict(img._getexif().items())[orientation] 34 | if rotation == 6: # rotation 270 35 | s = (s[1], s[0]) 36 | elif rotation == 8: # rotation 90 37 | s = (s[1], s[0]) 38 | except: 39 | pass 40 | 41 | return s 42 | 43 | 44 | class LoadImages: # for inference 45 | def __init__(self, path, img_size=416): 46 | path = str(Path(path)) # os-agnostic 47 | files = [] 48 | if os.path.isdir(path): 49 | files = sorted(glob.glob(os.path.join(path, '*.*'))) 50 | elif os.path.isfile(path): 51 | files = [path] 52 | 53 | images = [x for x in files if os.path.splitext(x)[-1].lower() in img_formats] 54 | videos = [x for x in files if os.path.splitext(x)[-1].lower() in vid_formats] 55 | nI, nV = len(images), len(videos) 56 | 57 | self.img_size = img_size 58 | self.files = images + videos 59 | self.nF = nI + nV # number of files 60 | self.video_flag = [False] * nI + [True] * nV 61 | self.mode = 'images' 62 | if any(videos): 63 | self.new_video(videos[0]) # new video 64 | else: 65 | self.cap = None 66 | assert self.nF > 0, 'No images or videos found in ' + path 67 | 68 | def __iter__(self): 69 | self.count = 0 70 | return self 71 | 72 | def __next__(self): 73 | if self.count == self.nF: 74 | raise StopIteration 75 | path = self.files[self.count] 76 | 77 | if self.video_flag[self.count]: 78 | # Read video 79 | self.mode = 'video' 80 | ret_val, img0 = self.cap.read() 81 | if not ret_val: 82 | self.count += 1 83 | self.cap.release() 84 | if self.count == self.nF: # last video 85 | raise StopIteration 86 | else: 87 | path = self.files[self.count] 88 | self.new_video(path) 89 | ret_val, img0 = 
self.cap.read() 90 | 91 | self.frame += 1 92 | print('video %g/%g (%g/%g) %s: ' % (self.count + 1, self.nF, self.frame, self.nframes, path), end='') 93 | 94 | else: 95 | # Read image 96 | self.count += 1 97 | img0 = cv2.imread(path) # BGR 98 | assert img0 is not None, 'Image Not Found ' + path 99 | print('image %g/%g %s: ' % (self.count, self.nF, path), end='') 100 | 101 | # Padded resize 102 | img = letterbox(img0, new_shape=self.img_size)[0] 103 | 104 | # Convert 105 | img = img[:, :, ::-1].transpose(2, 0, 1) # BGR to RGB, to 3x416x416 106 | img = np.ascontiguousarray(img) 107 | 108 | # cv2.imwrite(path + '.letterbox.jpg', 255 * img.transpose((1, 2, 0))[:, :, ::-1]) # save letterbox image 109 | return path, img, img0, self.cap 110 | 111 | def new_video(self, path): 112 | self.frame = 0 113 | self.cap = cv2.VideoCapture(path) 114 | self.nframes = int(self.cap.get(cv2.CAP_PROP_FRAME_COUNT)) 115 | 116 | def __len__(self): 117 | return self.nF # number of files 118 | 119 | # def LoadImages(img0): # for inference 120 | # 121 | # img = letterbox(img0, new_shape=640)[0] 122 | # 123 | # img = img[:, :, ::-1].transpose(2, 0, 1) # BGR to RGB, to 3x416x416 124 | # img = np.ascontiguousarray(img) 125 | # 126 | # return img, img0 127 | 128 | 129 | 130 | class LoadWebcam: # for inference 131 | def __init__(self, pipe=0, img_size=416): 132 | self.img_size = img_size 133 | 134 | if pipe == '0': 135 | pipe = 0 # local camera 136 | # pipe = 'rtsp://192.168.1.64/1' # IP camera 137 | # pipe = 'rtsp://username:password@192.168.1.64/1' # IP camera with login 138 | # pipe = 'rtsp://170.93.143.139/rtplive/470011e600ef003a004ee33696235daa' # IP traffic camera 139 | # pipe = 'http://wmccpinetop.axiscam.net/mjpg/video.mjpg' # IP golf camera 140 | 141 | # https://answers.opencv.org/question/215996/changing-gstreamer-pipeline-to-opencv-in-pythonsolved/ 142 | # pipe = '"rtspsrc location="rtsp://username:password@192.168.1.64/1" latency=10 ! appsink' # GStreamer 143 | 144 | # https://answers.opencv.org/question/200787/video-acceleration-gstremer-pipeline-in-videocapture/ 145 | # https://stackoverflow.com/questions/54095699/install-gstreamer-support-for-opencv-python-package # install help 146 | # pipe = "rtspsrc location=rtsp://root:root@192.168.0.91:554/axis-media/media.amp?videocodec=h264&resolution=3840x2160 protocols=GST_RTSP_LOWER_TRANS_TCP ! rtph264depay ! queue ! vaapih264dec ! videoconvert ! 
appsink" # GStreamer 147 | 148 | self.pipe = pipe 149 | self.cap = cv2.VideoCapture(pipe) # video capture object 150 | self.cap.set(cv2.CAP_PROP_BUFFERSIZE, 3) # set buffer size 151 | 152 | def __iter__(self): 153 | self.count = -1 154 | return self 155 | 156 | def __next__(self): 157 | self.count += 1 158 | if cv2.waitKey(1) == ord('q'): # q to quit 159 | self.cap.release() 160 | cv2.destroyAllWindows() 161 | raise StopIteration 162 | 163 | # Read frame 164 | if self.pipe == 0: # local camera 165 | ret_val, img0 = self.cap.read() 166 | img0 = cv2.flip(img0, 1) # flip left-right 167 | else: # IP camera 168 | n = 0 169 | while True: 170 | n += 1 171 | self.cap.grab() 172 | if n % 30 == 0: # skip frames 173 | ret_val, img0 = self.cap.retrieve() 174 | if ret_val: 175 | break 176 | 177 | # Print 178 | assert ret_val, 'Camera Error %s' % self.pipe 179 | img_path = 'webcam.jpg' 180 | print('webcam %g: ' % self.count, end='') 181 | 182 | # Padded resize 183 | img = letterbox(img0, new_shape=self.img_size)[0] 184 | 185 | # Convert 186 | img = img[:, :, ::-1].transpose(2, 0, 1) # BGR to RGB, to 3x416x416 187 | img = np.ascontiguousarray(img) 188 | 189 | return img_path, img, img0, None 190 | 191 | def __len__(self): 192 | return 0 193 | 194 | 195 | class LoadStreams: # multiple IP or RTSP cameras 196 | def __init__(self, sources='streams.txt', img_size=416): 197 | self.mode = 'images' 198 | self.img_size = img_size 199 | 200 | if os.path.isfile(sources): 201 | with open(sources, 'r') as f: 202 | sources = [x.strip() for x in f.read().splitlines() if len(x.strip())] 203 | else: 204 | sources = [sources] 205 | 206 | n = len(sources) 207 | self.imgs = [None] * n 208 | self.sources = sources 209 | for i, s in enumerate(sources): 210 | # Start the thread to read frames from the video stream 211 | print('%g/%g: %s... ' % (i + 1, n, s), end='') 212 | cap = cv2.VideoCapture(0 if s == '0' else s) 213 | assert cap.isOpened(), 'Failed to open %s' % s 214 | w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)) 215 | h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)) 216 | fps = cap.get(cv2.CAP_PROP_FPS) % 100 217 | _, self.imgs[i] = cap.read() # guarantee first frame 218 | thread = Thread(target=self.update, args=([i, cap]), daemon=True) 219 | print(' success (%gx%g at %.2f FPS).' % (w, h, fps)) 220 | thread.start() 221 | print('') # newline 222 | 223 | # check for common shapes 224 | s = np.stack([letterbox(x, new_shape=self.img_size)[0].shape for x in self.imgs], 0) # inference shapes 225 | self.rect = np.unique(s, axis=0).shape[0] == 1 # rect inference if all shapes equal 226 | if not self.rect: 227 | print('WARNING: Different stream shapes detected. 
For optimal performance supply similarly-shaped streams.') 228 | 229 | def update(self, index, cap): 230 | # Read next stream frame in a daemon thread 231 | n = 0 232 | while cap.isOpened(): 233 | n += 1 234 | # _, self.imgs[index] = cap.read() 235 | cap.grab() 236 | if n == 4: # read every 4th frame 237 | _, self.imgs[index] = cap.retrieve() 238 | n = 0 239 | time.sleep(0.01) # wait time 240 | 241 | def __iter__(self): 242 | self.count = -1 243 | return self 244 | 245 | def __next__(self): 246 | self.count += 1 247 | img0 = self.imgs.copy() 248 | if cv2.waitKey(1) == ord('q'): # q to quit 249 | cv2.destroyAllWindows() 250 | raise StopIteration 251 | 252 | # Letterbox 253 | img = [letterbox(x, new_shape=self.img_size, auto=self.rect)[0] for x in img0] 254 | 255 | # Stack 256 | img = np.stack(img, 0) 257 | 258 | # Convert 259 | img = img[:, :, :, ::-1].transpose(0, 3, 1, 2) # BGR to RGB, to bsx3x416x416 260 | img = np.ascontiguousarray(img) 261 | 262 | return self.sources, img, img0, None 263 | 264 | def __len__(self): 265 | return 0 # 1E12 frames = 32 streams at 30 FPS for 30 years 266 | 267 | 268 | class LoadImagesAndLabels(Dataset): # for training/testing 269 | def __init__(self, path, img_size=416, batch_size=16, augment=False, hyp=None, rect=False, image_weights=False, 270 | cache_images=False, single_cls=False, pad=0.0): 271 | try: 272 | path = str(Path(path)) # os-agnostic 273 | parent = str(Path(path).parent) + os.sep 274 | if os.path.isfile(path): # file 275 | with open(path, 'r') as f: 276 | f = f.read().splitlines() 277 | f = [x.replace('./', parent) if x.startswith('./') else x for x in f] # local to global path 278 | elif os.path.isdir(path): # folder 279 | f = glob.iglob(path + os.sep + '*.*') 280 | else: 281 | raise Exception('%s does not exist' % path) 282 | self.img_files = [x.replace('/', os.sep) for x in f if os.path.splitext(x)[-1].lower() in img_formats] 283 | except: 284 | raise Exception('Error loading data from %s. See %s' % (path, help_url)) 285 | 286 | n = len(self.img_files) 287 | assert n > 0, 'No images found in %s. 
See %s' % (path, help_url) 288 | bi = np.floor(np.arange(n) / batch_size).astype(np.int) # batch index 289 | nb = bi[-1] + 1 # number of batches 290 | 291 | self.n = n # number of images 292 | self.batch = bi # batch index of image 293 | self.img_size = img_size 294 | self.augment = augment 295 | self.hyp = hyp 296 | self.image_weights = image_weights 297 | self.rect = False if image_weights else rect 298 | self.mosaic = self.augment and not self.rect # load 4 images at a time into a mosaic (only during training) 299 | 300 | # Define labels 301 | self.label_files = [x.replace('images', 'labels').replace(os.path.splitext(x)[-1], '.txt') 302 | for x in self.img_files] 303 | 304 | # Rectangular Training https://github.com/ultralytics/yolov3/issues/232 305 | if self.rect: 306 | # Read image shapes (wh) 307 | sp = path.replace('.txt', '') + '.shapes' # shapefile path 308 | try: 309 | with open(sp, 'r') as f: # read existing shapefile 310 | s = [x.split() for x in f.read().splitlines()] 311 | assert len(s) == n, 'Shapefile out of sync' 312 | except: 313 | s = [exif_size(Image.open(f)) for f in tqdm(self.img_files, desc='Reading image shapes')] 314 | np.savetxt(sp, s, fmt='%g') # overwrites existing (if any) 315 | 316 | # Sort by aspect ratio 317 | s = np.array(s, dtype=np.float64) 318 | ar = s[:, 1] / s[:, 0] # aspect ratio 319 | irect = ar.argsort() 320 | self.img_files = [self.img_files[i] for i in irect] 321 | self.label_files = [self.label_files[i] for i in irect] 322 | self.shapes = s[irect] # wh 323 | ar = ar[irect] 324 | 325 | # Set training image shapes 326 | shapes = [[1, 1]] * nb 327 | for i in range(nb): 328 | ari = ar[bi == i] 329 | mini, maxi = ari.min(), ari.max() 330 | if maxi < 1: 331 | shapes[i] = [maxi, 1] 332 | elif mini > 1: 333 | shapes[i] = [1, 1 / mini] 334 | 335 | self.batch_shapes = np.ceil(np.array(shapes) * img_size / 32. 
+ pad).astype(np.int) * 32 336 | 337 | # Cache labels 338 | self.imgs = [None] * n 339 | self.labels = [np.zeros((0, 5), dtype=np.float32)] * n 340 | create_datasubset, extract_bounding_boxes, labels_loaded = False, False, False 341 | nm, nf, ne, ns, nd = 0, 0, 0, 0, 0 # number missing, found, empty, datasubset, duplicate 342 | np_labels_path = str(Path(self.label_files[0]).parent) + '.npy' # saved labels in *.npy file 343 | if os.path.isfile(np_labels_path): 344 | s = np_labels_path # print string 345 | x = np.load(np_labels_path, allow_pickle=True) 346 | if len(x) == n: 347 | self.labels = x 348 | labels_loaded = True 349 | else: 350 | s = path.replace('images', 'labels') 351 | 352 | pbar = tqdm(self.label_files) 353 | for i, file in enumerate(pbar): 354 | if labels_loaded: 355 | l = self.labels[i] 356 | # np.savetxt(file, l, '%g') # save *.txt from *.npy file 357 | else: 358 | try: 359 | with open(file, 'r') as f: 360 | l = np.array([x.split() for x in f.read().splitlines()], dtype=np.float32) 361 | except: 362 | nm += 1 # print('missing labels for image %s' % self.img_files[i]) # file missing 363 | continue 364 | 365 | if l.shape[0]: 366 | assert l.shape[1] == 5, '> 5 label columns: %s' % file 367 | assert (l >= 0).all(), 'negative labels: %s' % file 368 | assert (l[:, 1:] <= 1).all(), 'non-normalized or out of bounds coordinate labels: %s' % file 369 | if np.unique(l, axis=0).shape[0] < l.shape[0]: # duplicate rows 370 | nd += 1 # print('WARNING: duplicate rows in %s' % self.label_files[i]) # duplicate rows 371 | if single_cls: 372 | l[:, 0] = 0 # force dataset into single-class mode 373 | self.labels[i] = l 374 | nf += 1 # file found 375 | 376 | # Create subdataset (a smaller dataset) 377 | if create_datasubset and ns < 1E4: 378 | if ns == 0: 379 | create_folder(path='./datasubset') 380 | os.makedirs('./datasubset/images') 381 | exclude_classes = 43 382 | if exclude_classes not in l[:, 0]: 383 | ns += 1 384 | # shutil.copy(src=self.img_files[i], dst='./datasubset/images/') # copy image 385 | with open('./datasubset/images.txt', 'a') as f: 386 | f.write(self.img_files[i] + '\n') 387 | 388 | # Extract object detection boxes for a second stage classifier 389 | if extract_bounding_boxes: 390 | p = Path(self.img_files[i]) 391 | img = cv2.imread(str(p)) 392 | h, w = img.shape[:2] 393 | for j, x in enumerate(l): 394 | f = '%s%sclassifier%s%g_%g_%s' % (p.parent.parent, os.sep, os.sep, x[0], j, p.name) 395 | if not os.path.exists(Path(f).parent): 396 | os.makedirs(Path(f).parent) # make new output folder 397 | 398 | b = x[1:] * [w, h, w, h] # box 399 | b[2:] = b[2:].max() # rectangle to square 400 | b[2:] = b[2:] * 1.3 + 30 # pad 401 | b = xywh2xyxy(b.reshape(-1, 4)).ravel().astype(np.int) 402 | 403 | b[[0, 2]] = np.clip(b[[0, 2]], 0, w) # clip boxes outside of image 404 | b[[1, 3]] = np.clip(b[[1, 3]], 0, h) 405 | assert cv2.imwrite(f, img[b[1]:b[3], b[0]:b[2]]), 'Failure extracting classifier boxes' 406 | else: 407 | ne += 1 # print('empty labels for image %s' % self.img_files[i]) # file empty 408 | # os.system("rm '%s' '%s'" % (self.img_files[i], self.label_files[i])) # remove 409 | 410 | pbar.desc = 'Caching labels %s (%g found, %g missing, %g empty, %g duplicate, for %g images)' % ( 411 | s, nf, nm, ne, nd, n) 412 | assert nf > 0 or n == 20288, 'No labels found in %s. 
See %s' % (os.path.dirname(file) + os.sep, help_url) 413 | if not labels_loaded and n > 1000: 414 | print('Saving labels to %s for faster future loading' % np_labels_path) 415 | np.save(np_labels_path, self.labels) # save for next time 416 | 417 | # Cache images into memory for faster training (WARNING: large datasets may exceed system RAM) 418 | if cache_images: # if training 419 | gb = 0 # Gigabytes of cached images 420 | pbar = tqdm(range(len(self.img_files)), desc='Caching images') 421 | self.img_hw0, self.img_hw = [None] * n, [None] * n 422 | for i in pbar: # max 10k images 423 | self.imgs[i], self.img_hw0[i], self.img_hw[i] = load_image(self, i) # img, hw_original, hw_resized 424 | gb += self.imgs[i].nbytes 425 | pbar.desc = 'Caching images (%.1fGB)' % (gb / 1E9) 426 | 427 | # Detect corrupted images https://medium.com/joelthchao/programmatically-detect-corrupted-image-8c1b2006c3d3 428 | detect_corrupted_images = False 429 | if detect_corrupted_images: 430 | from skimage import io # conda install -c conda-forge scikit-image 431 | for file in tqdm(self.img_files, desc='Detecting corrupted images'): 432 | try: 433 | _ = io.imread(file) 434 | except: 435 | print('Corrupted image detected: %s' % file) 436 | 437 | def __len__(self): 438 | return len(self.img_files) 439 | 440 | # def __iter__(self): 441 | # self.count = -1 442 | # print('ran dataset iter') 443 | # #self.shuffled_vector = np.random.permutation(self.nF) if self.augment else np.arange(self.nF) 444 | # return self 445 | 446 | def __getitem__(self, index): 447 | if self.image_weights: 448 | index = self.indices[index] 449 | 450 | hyp = self.hyp 451 | if self.mosaic: 452 | # Load mosaic 453 | img, labels = load_mosaic(self, index) 454 | shapes = None 455 | 456 | else: 457 | # Load image 458 | img, (h0, w0), (h, w) = load_image(self, index) 459 | 460 | # Letterbox 461 | shape = self.batch_shapes[self.batch[index]] if self.rect else self.img_size # final letterboxed shape 462 | img, ratio, pad = letterbox(img, shape, auto=False, scaleup=self.augment) 463 | shapes = (h0, w0), ((h / h0, w / w0), pad) # for COCO mAP rescaling 464 | 465 | # Load labels 466 | labels = [] 467 | x = self.labels[index] 468 | if x.size > 0: 469 | # Normalized xywh to pixel xyxy format 470 | labels = x.copy() 471 | labels[:, 1] = ratio[0] * w * (x[:, 1] - x[:, 3] / 2) + pad[0] # pad width 472 | labels[:, 2] = ratio[1] * h * (x[:, 2] - x[:, 4] / 2) + pad[1] # pad height 473 | labels[:, 3] = ratio[0] * w * (x[:, 1] + x[:, 3] / 2) + pad[0] 474 | labels[:, 4] = ratio[1] * h * (x[:, 2] + x[:, 4] / 2) + pad[1] 475 | 476 | if self.augment: 477 | # Augment imagespace 478 | if not self.mosaic: 479 | img, labels = random_affine(img, labels, 480 | degrees=hyp['degrees'], 481 | translate=hyp['translate'], 482 | scale=hyp['scale'], 483 | shear=hyp['shear']) 484 | 485 | # Augment colorspace 486 | augment_hsv(img, hgain=hyp['hsv_h'], sgain=hyp['hsv_s'], vgain=hyp['hsv_v']) 487 | 488 | # Apply cutouts 489 | # if random.random() < 0.9: 490 | # labels = cutout(img, labels) 491 | 492 | nL = len(labels) # number of labels 493 | if nL: 494 | # convert xyxy to xywh 495 | labels[:, 1:5] = xyxy2xywh(labels[:, 1:5]) 496 | 497 | # Normalize coordinates 0 - 1 498 | labels[:, [2, 4]] /= img.shape[0] # height 499 | labels[:, [1, 3]] /= img.shape[1] # width 500 | 501 | if self.augment: 502 | # random left-right flip 503 | lr_flip = True 504 | if lr_flip and random.random() < 0.5: 505 | img = np.fliplr(img) 506 | if nL: 507 | labels[:, 1] = 1 - labels[:, 1] 508 | 509 | # random up-down 
flip 510 | ud_flip = False 511 | if ud_flip and random.random() < 0.5: 512 | img = np.flipud(img) 513 | if nL: 514 | labels[:, 2] = 1 - labels[:, 2] 515 | 516 | labels_out = torch.zeros((nL, 6)) 517 | if nL: 518 | labels_out[:, 1:] = torch.from_numpy(labels) 519 | 520 | # Convert 521 | img = img[:, :, ::-1].transpose(2, 0, 1) # BGR to RGB, to 3x416x416 522 | img = np.ascontiguousarray(img) 523 | 524 | return torch.from_numpy(img), labels_out, self.img_files[index], shapes 525 | 526 | @staticmethod 527 | def collate_fn(batch): 528 | img, label, path, shapes = zip(*batch) # transposed 529 | for i, l in enumerate(label): 530 | l[:, 0] = i # add target image index for build_targets() 531 | return torch.stack(img, 0), torch.cat(label, 0), path, shapes 532 | 533 | 534 | def load_image(self, index): 535 | # loads 1 image from dataset, returns img, original hw, resized hw 536 | img = self.imgs[index] 537 | if img is None: # not cached 538 | path = self.img_files[index] 539 | img = cv2.imread(path) # BGR 540 | assert img is not None, 'Image Not Found ' + path 541 | h0, w0 = img.shape[:2] # orig hw 542 | r = self.img_size / max(h0, w0) # resize image to img_size 543 | if r != 1: # always resize down, only resize up if training with augmentation 544 | interp = cv2.INTER_AREA if r < 1 and not self.augment else cv2.INTER_LINEAR 545 | img = cv2.resize(img, (int(w0 * r), int(h0 * r)), interpolation=interp) 546 | return img, (h0, w0), img.shape[:2] # img, hw_original, hw_resized 547 | else: 548 | return self.imgs[index], self.img_hw0[index], self.img_hw[index] # img, hw_original, hw_resized 549 | 550 | 551 | def augment_hsv(img, hgain=0.5, sgain=0.5, vgain=0.5): 552 | r = np.random.uniform(-1, 1, 3) * [hgain, sgain, vgain] + 1 # random gains 553 | hue, sat, val = cv2.split(cv2.cvtColor(img, cv2.COLOR_BGR2HSV)) 554 | dtype = img.dtype # uint8 555 | 556 | x = np.arange(0, 256, dtype=np.int16) 557 | lut_hue = ((x * r[0]) % 180).astype(dtype) 558 | lut_sat = np.clip(x * r[1], 0, 255).astype(dtype) 559 | lut_val = np.clip(x * r[2], 0, 255).astype(dtype) 560 | 561 | img_hsv = cv2.merge((cv2.LUT(hue, lut_hue), cv2.LUT(sat, lut_sat), cv2.LUT(val, lut_val))).astype(dtype) 562 | cv2.cvtColor(img_hsv, cv2.COLOR_HSV2BGR, dst=img) # no return needed 563 | 564 | # Histogram equalization 565 | # if random.random() < 0.2: 566 | # for i in range(3): 567 | # img[:, :, i] = cv2.equalizeHist(img[:, :, i]) 568 | 569 | 570 | def load_mosaic(self, index): 571 | # loads images in a mosaic 572 | 573 | labels4 = [] 574 | s = self.img_size 575 | xc, yc = [int(random.uniform(s * 0.5, s * 1.5)) for _ in range(2)] # mosaic center x, y 576 | indices = [index] + [random.randint(0, len(self.labels) - 1) for _ in range(3)] # 3 additional image indices 577 | for i, index in enumerate(indices): 578 | # Load image 579 | img, _, (h, w) = load_image(self, index) 580 | 581 | # place img in img4 582 | if i == 0: # top left 583 | img4 = np.full((s * 2, s * 2, img.shape[2]), 114, dtype=np.uint8) # base image with 4 tiles 584 | x1a, y1a, x2a, y2a = max(xc - w, 0), max(yc - h, 0), xc, yc # xmin, ymin, xmax, ymax (large image) 585 | x1b, y1b, x2b, y2b = w - (x2a - x1a), h - (y2a - y1a), w, h # xmin, ymin, xmax, ymax (small image) 586 | elif i == 1: # top right 587 | x1a, y1a, x2a, y2a = xc, max(yc - h, 0), min(xc + w, s * 2), yc 588 | x1b, y1b, x2b, y2b = 0, h - (y2a - y1a), min(w, x2a - x1a), h 589 | elif i == 2: # bottom left 590 | x1a, y1a, x2a, y2a = max(xc - w, 0), yc, xc, min(s * 2, yc + h) 591 | x1b, y1b, x2b, y2b = w - (x2a - x1a), 0, max(xc, 
w), min(y2a - y1a, h) 592 | elif i == 3: # bottom right 593 | x1a, y1a, x2a, y2a = xc, yc, min(xc + w, s * 2), min(s * 2, yc + h) 594 | x1b, y1b, x2b, y2b = 0, 0, min(w, x2a - x1a), min(y2a - y1a, h) 595 | 596 | img4[y1a:y2a, x1a:x2a] = img[y1b:y2b, x1b:x2b] # img4[ymin:ymax, xmin:xmax] 597 | padw = x1a - x1b 598 | padh = y1a - y1b 599 | 600 | # Labels 601 | x = self.labels[index] 602 | labels = x.copy() 603 | if x.size > 0: # Normalized xywh to pixel xyxy format 604 | labels[:, 1] = w * (x[:, 1] - x[:, 3] / 2) + padw 605 | labels[:, 2] = h * (x[:, 2] - x[:, 4] / 2) + padh 606 | labels[:, 3] = w * (x[:, 1] + x[:, 3] / 2) + padw 607 | labels[:, 4] = h * (x[:, 2] + x[:, 4] / 2) + padh 608 | labels4.append(labels) 609 | 610 | # Concat/clip labels 611 | if len(labels4): 612 | labels4 = np.concatenate(labels4, 0) 613 | # np.clip(labels4[:, 1:] - s / 2, 0, s, out=labels4[:, 1:]) # use with center crop 614 | np.clip(labels4[:, 1:], 0, 2 * s, out=labels4[:, 1:]) # use with random_affine 615 | 616 | # Augment 617 | # img4 = img4[s // 2: int(s * 1.5), s // 2:int(s * 1.5)] # center crop (WARNING, requires box pruning) 618 | img4, labels4 = random_affine(img4, labels4, 619 | degrees=self.hyp['degrees'], 620 | translate=self.hyp['translate'], 621 | scale=self.hyp['scale'], 622 | shear=self.hyp['shear'], 623 | border=-s // 2) # border to remove 624 | 625 | return img4, labels4 626 | 627 | 628 | def letterbox(img, new_shape=(416, 416), color=(114, 114, 114), auto=True, scaleFill=False, scaleup=True): 629 | # Resize image to a 32-pixel-multiple rectangle https://github.com/ultralytics/yolov3/issues/232 630 | shape = img.shape[:2] # current shape [height, width] 631 | if isinstance(new_shape, int): 632 | new_shape = (new_shape, new_shape) 633 | 634 | # Scale ratio (new / old) 635 | r = min(new_shape[0] / shape[0], new_shape[1] / shape[1]) 636 | if not scaleup: # only scale down, do not scale up (for better test mAP) 637 | r = min(r, 1.0) 638 | 639 | # Compute padding 640 | ratio = r, r # width, height ratios 641 | new_unpad = int(round(shape[1] * r)), int(round(shape[0] * r)) 642 | dw, dh = new_shape[1] - new_unpad[0], new_shape[0] - new_unpad[1] # wh padding 643 | if auto: # minimum rectangle 644 | dw, dh = np.mod(dw, 64), np.mod(dh, 64) # wh padding 645 | elif scaleFill: # stretch 646 | dw, dh = 0.0, 0.0 647 | new_unpad = new_shape 648 | ratio = new_shape[0] / shape[1], new_shape[1] / shape[0] # width, height ratios 649 | 650 | dw /= 2 # divide padding into 2 sides 651 | dh /= 2 652 | 653 | if shape[::-1] != new_unpad: # resize 654 | img = cv2.resize(img, new_unpad, interpolation=cv2.INTER_LINEAR) 655 | top, bottom = int(round(dh - 0.1)), int(round(dh + 0.1)) 656 | left, right = int(round(dw - 0.1)), int(round(dw + 0.1)) 657 | img = cv2.copyMakeBorder(img, top, bottom, left, right, cv2.BORDER_CONSTANT, value=color) # add border 658 | return img, ratio, (dw, dh) 659 | 660 | 661 | def random_affine(img, targets=(), degrees=10, translate=.1, scale=.1, shear=10, border=0): 662 | # torchvision.transforms.RandomAffine(degrees=(-10, 10), translate=(.1, .1), scale=(.9, 1.1), shear=(-10, 10)) 663 | # https://medium.com/uruvideo/dataset-augmentation-with-random-homographies-a8f4b44830d4 664 | # targets = [cls, xyxy] 665 | 666 | height = img.shape[0] + border * 2 667 | width = img.shape[1] + border * 2 668 | 669 | # Rotation and Scale 670 | R = np.eye(3) 671 | a = random.uniform(-degrees, degrees) 672 | # a += random.choice([-180, -90, 0, 90]) # add 90deg rotations to small rotations 673 | s = random.uniform(1 - 
scale, 1 + scale) 674 | # s = 2 ** random.uniform(-scale, scale) 675 | R[:2] = cv2.getRotationMatrix2D(angle=a, center=(img.shape[1] / 2, img.shape[0] / 2), scale=s) 676 | 677 | # Translation 678 | T = np.eye(3) 679 | T[0, 2] = random.uniform(-translate, translate) * img.shape[0] + border # x translation (pixels) 680 | T[1, 2] = random.uniform(-translate, translate) * img.shape[1] + border # y translation (pixels) 681 | 682 | # Shear 683 | S = np.eye(3) 684 | S[0, 1] = math.tan(random.uniform(-shear, shear) * math.pi / 180) # x shear (deg) 685 | S[1, 0] = math.tan(random.uniform(-shear, shear) * math.pi / 180) # y shear (deg) 686 | 687 | # Combined rotation matrix 688 | M = S @ T @ R # ORDER IS IMPORTANT HERE!! 689 | if (border != 0) or (M != np.eye(3)).any(): # image changed 690 | img = cv2.warpAffine(img, M[:2], dsize=(width, height), flags=cv2.INTER_LINEAR, borderValue=(114, 114, 114)) 691 | 692 | # Transform label coordinates 693 | n = len(targets) 694 | if n: 695 | # warp points 696 | xy = np.ones((n * 4, 3)) 697 | xy[:, :2] = targets[:, [1, 2, 3, 4, 1, 4, 3, 2]].reshape(n * 4, 2) # x1y1, x2y2, x1y2, x2y1 698 | xy = (xy @ M.T)[:, :2].reshape(n, 8) 699 | 700 | # create new boxes 701 | x = xy[:, [0, 2, 4, 6]] 702 | y = xy[:, [1, 3, 5, 7]] 703 | xy = np.concatenate((x.min(1), y.min(1), x.max(1), y.max(1))).reshape(4, n).T 704 | 705 | # # apply angle-based reduction of bounding boxes 706 | # radians = a * math.pi / 180 707 | # reduction = max(abs(math.sin(radians)), abs(math.cos(radians))) ** 0.5 708 | # x = (xy[:, 2] + xy[:, 0]) / 2 709 | # y = (xy[:, 3] + xy[:, 1]) / 2 710 | # w = (xy[:, 2] - xy[:, 0]) * reduction 711 | # h = (xy[:, 3] - xy[:, 1]) * reduction 712 | # xy = np.concatenate((x - w / 2, y - h / 2, x + w / 2, y + h / 2)).reshape(4, n).T 713 | 714 | # reject warped points outside of image 715 | xy[:, [0, 2]] = xy[:, [0, 2]].clip(0, width) 716 | xy[:, [1, 3]] = xy[:, [1, 3]].clip(0, height) 717 | w = xy[:, 2] - xy[:, 0] 718 | h = xy[:, 3] - xy[:, 1] 719 | area = w * h 720 | area0 = (targets[:, 3] - targets[:, 1]) * (targets[:, 4] - targets[:, 2]) 721 | ar = np.maximum(w / (h + 1e-16), h / (w + 1e-16)) # aspect ratio 722 | i = (w > 4) & (h > 4) & (area / (area0 * s + 1e-16) > 0.2) & (ar < 10) 723 | 724 | targets = targets[i] 725 | targets[:, 1:5] = xy[i] 726 | 727 | return img, targets 728 | 729 | 730 | def cutout(image, labels): 731 | # https://arxiv.org/abs/1708.04552 732 | # https://github.com/hysts/pytorch_cutout/blob/master/dataloader.py 733 | # https://towardsdatascience.com/when-conventional-wisdom-fails-revisiting-data-augmentation-for-self-driving-cars-4831998c5509 734 | h, w = image.shape[:2] 735 | 736 | def bbox_ioa(box1, box2): 737 | # Returns the intersection over box2 area given box1, box2. box1 is 4, box2 is nx4. 
boxes are x1y1x2y2 738 | box2 = box2.transpose() 739 | 740 | # Get the coordinates of bounding boxes 741 | b1_x1, b1_y1, b1_x2, b1_y2 = box1[0], box1[1], box1[2], box1[3] 742 | b2_x1, b2_y1, b2_x2, b2_y2 = box2[0], box2[1], box2[2], box2[3] 743 | 744 | # Intersection area 745 | inter_area = (np.minimum(b1_x2, b2_x2) - np.maximum(b1_x1, b2_x1)).clip(0) * \ 746 | (np.minimum(b1_y2, b2_y2) - np.maximum(b1_y1, b2_y1)).clip(0) 747 | 748 | # box2 area 749 | box2_area = (b2_x2 - b2_x1) * (b2_y2 - b2_y1) + 1e-16 750 | 751 | # Intersection over box2 area 752 | return inter_area / box2_area 753 | 754 | # create random masks 755 | scales = [0.5] * 1 + [0.25] * 2 + [0.125] * 4 + [0.0625] * 8 + [0.03125] * 16 # image size fraction 756 | for s in scales: 757 | mask_h = random.randint(1, int(h * s)) 758 | mask_w = random.randint(1, int(w * s)) 759 | 760 | # box 761 | xmin = max(0, random.randint(0, w) - mask_w // 2) 762 | ymin = max(0, random.randint(0, h) - mask_h // 2) 763 | xmax = min(w, xmin + mask_w) 764 | ymax = min(h, ymin + mask_h) 765 | 766 | # apply random color mask 767 | image[ymin:ymax, xmin:xmax] = [random.randint(64, 191) for _ in range(3)] 768 | 769 | # return unobscured labels 770 | if len(labels) and s > 0.03: 771 | box = np.array([xmin, ymin, xmax, ymax], dtype=np.float32) 772 | ioa = bbox_ioa(box, labels[:, 1:5]) # intersection over area 773 | labels = labels[ioa < 0.60] # remove >60% obscured labels 774 | 775 | return labels 776 | 777 | 778 | def reduce_img_size(path='../data/sm4/images', img_size=1024): # from utils.datasets import *; reduce_img_size() 779 | # creates a new ./images_reduced folder with reduced size images of maximum size img_size 780 | path_new = path + '_reduced' # reduced images path 781 | create_folder(path_new) 782 | for f in tqdm(glob.glob('%s/*.*' % path)): 783 | try: 784 | img = cv2.imread(f) 785 | h, w = img.shape[:2] 786 | r = img_size / max(h, w) # size ratio 787 | if r < 1.0: 788 | img = cv2.resize(img, (int(w * r), int(h * r)), interpolation=cv2.INTER_AREA) # _LINEAR fastest 789 | fnew = f.replace(path, path_new) # .replace(Path(f).suffix, '.jpg') 790 | cv2.imwrite(fnew, img) 791 | except: 792 | print('WARNING: image failure %s' % f) 793 | 794 | 795 | def convert_images2bmp(): # from utils.datasets import *; convert_images2bmp() 796 | # Save images 797 | formats = [x.lower() for x in img_formats] + [x.upper() for x in img_formats] 798 | # for path in ['../coco/images/val2014', '../coco/images/train2014']: 799 | for path in ['../data/sm4/images', '../data/sm4/background']: 800 | create_folder(path + 'bmp') 801 | for ext in formats: # ['.bmp', '.jpg', '.jpeg', '.png', '.tif', '.dng'] 802 | for f in tqdm(glob.glob('%s/*%s' % (path, ext)), desc='Converting %s' % ext): 803 | cv2.imwrite(f.replace(ext.lower(), '.bmp').replace(path, path + 'bmp'), cv2.imread(f)) 804 | 805 | # Save labels 806 | # for path in ['../coco/trainvalno5k.txt', '../coco/5k.txt']: 807 | for file in ['../data/sm4/out_train.txt', '../data/sm4/out_test.txt']: 808 | with open(file, 'r') as f: 809 | lines = f.read() 810 | # lines = f.read().replace('2014/', '2014bmp/') # coco 811 | lines = lines.replace('/images', '/imagesbmp') 812 | lines = lines.replace('/background', '/backgroundbmp') 813 | for ext in formats: 814 | lines = lines.replace(ext, '.bmp') 815 | with open(file.replace('.txt', 'bmp.txt'), 'w') as f: 816 | f.write(lines) 817 | 818 | 819 | def recursive_dataset2bmp(dataset='../data/sm4_bmp'): # from utils.datasets import *; recursive_dataset2bmp() 820 | # Converts dataset to bmp 
(for faster training) 821 | formats = [x.lower() for x in img_formats] + [x.upper() for x in img_formats] 822 | for a, b, files in os.walk(dataset): 823 | for file in tqdm(files, desc=a): 824 | p = a + '/' + file 825 | s = Path(file).suffix 826 | if s == '.txt': # replace text 827 | with open(p, 'r') as f: 828 | lines = f.read() 829 | for f in formats: 830 | lines = lines.replace(f, '.bmp') 831 | with open(p, 'w') as f: 832 | f.write(lines) 833 | elif s in formats: # replace image 834 | cv2.imwrite(p.replace(s, '.bmp'), cv2.imread(p)) 835 | if s != '.bmp': 836 | os.system("rm '%s'" % p) 837 | 838 | 839 | def imagelist2folder(path='data/coco_64img.txt'): # from utils.datasets import *; imagelist2folder() 840 | # Copies all the images in a text file (list of images) into a folder 841 | create_folder(path[:-4]) 842 | with open(path, 'r') as f: 843 | for line in f.read().splitlines(): 844 | os.system('cp "%s" %s' % (line, path[:-4])) 845 | print(line) 846 | 847 | 848 | def create_folder(path='./new_folder'): 849 | # Create folder 850 | if os.path.exists(path): 851 | shutil.rmtree(path) # delete output folder 852 | os.makedirs(path) # make new output folder 853 | -------------------------------------------------------------------------------- /utils/google_utils.py: -------------------------------------------------------------------------------- 1 | # This file contains google utils: https://cloud.google.com/storage/docs/reference/libraries 2 | # pip install --upgrade google-cloud-storage 3 | # from google.cloud import storage 4 | 5 | import os 6 | import time 7 | from pathlib import Path 8 | 9 | 10 | def attempt_download(weights): 11 | # Attempt to download pretrained weights if not found locally 12 | weights = weights.strip() 13 | msg = weights + ' missing, try downloading from https://drive.google.com/drive/folders/1Drs_Aiu7xx6S-ix95f9kNsA6ueKRpN2J' 14 | 15 | r = 1 16 | if len(weights) > 0 and not os.path.isfile(weights): 17 | d = {'yolov3-spp.pt': '1mM67oNw4fZoIOL1c8M3hHmj66d8e-ni_', # yolov3-spp.yaml 18 | 'yolov5s.pt': '1R5T6rIyy3lLwgFXNms8whc-387H0tMQO', # yolov5s.yaml 19 | 'yolov5m.pt': '1vobuEExpWQVpXExsJ2w-Mbf3HJjWkQJr', # yolov5m.yaml 20 | 'yolov5l.pt': '1hrlqD1Wdei7UT4OgT785BEk1JwnSvNEV', # yolov5l.yaml 21 | 'yolov5x.pt': '1mM8aZJlWTxOg7BZJvNUMrTnA2AbeCVzS', # yolov5x.yaml 22 | } 23 | 24 | file = Path(weights).name 25 | if file in d: 26 | r = gdrive_download(id=d[file], name=weights) 27 | 28 | # Error check 29 | if not (r == 0 and os.path.exists(weights) and os.path.getsize(weights) > 1E6): # weights exist and > 1MB 30 | os.system('rm ' + weights) # remove partial downloads 31 | raise Exception(msg) 32 | 33 | 34 | def gdrive_download(id='1HaXkef9z6y5l4vUnCYgdmEAj61c6bfWO', name='coco.zip'): 35 | # https://gist.github.com/tanaikech/f0f2d122e05bf5f971611258c22c110f 36 | # Downloads a file from Google Drive, accepting presented query 37 | # from utils.google_utils import *; gdrive_download() 38 | t = time.time() 39 | 40 | print('Downloading https://drive.google.com/uc?export=download&id=%s as %s... 
' % (id, name), end='') 41 | os.remove(name) if os.path.exists(name) else None # remove existing 42 | os.remove('cookie') if os.path.exists('cookie') else None 43 | 44 | # Attempt file download 45 | os.system("curl -c ./cookie -s -L \"https://drive.google.com/uc?export=download&id=%s\" > /dev/null" % id) 46 | if os.path.exists('cookie'): # large file 47 | s = "curl -Lb ./cookie \"https://drive.google.com/uc?export=download&confirm=`awk '/download/ {print $NF}' ./cookie`&id=%s\" -o %s" % ( 48 | id, name) 49 | else: # small file 50 | s = "curl -s -L -o %s 'https://drive.google.com/uc?export=download&id=%s'" % (name, id) 51 | r = os.system(s) # execute, capture return values 52 | os.remove('cookie') if os.path.exists('cookie') else None 53 | 54 | # Error check 55 | if r != 0: 56 | os.remove(name) if os.path.exists(name) else None # remove partial 57 | print('Download error ') # raise Exception('Download error') 58 | return r 59 | 60 | # Unzip if archive 61 | if name.endswith('.zip'): 62 | print('unzipping... ', end='') 63 | os.system('unzip -q %s' % name) # unzip 64 | os.remove(name) # remove zip to free space 65 | 66 | print('Done (%.1fs)' % (time.time() - t)) 67 | return r 68 | 69 | # def upload_blob(bucket_name, source_file_name, destination_blob_name): 70 | # # Uploads a file to a bucket 71 | # # https://cloud.google.com/storage/docs/uploading-objects#storage-upload-object-python 72 | # 73 | # storage_client = storage.Client() 74 | # bucket = storage_client.get_bucket(bucket_name) 75 | # blob = bucket.blob(destination_blob_name) 76 | # 77 | # blob.upload_from_filename(source_file_name) 78 | # 79 | # print('File {} uploaded to {}.'.format( 80 | # source_file_name, 81 | # destination_blob_name)) 82 | # 83 | # 84 | # def download_blob(bucket_name, source_blob_name, destination_file_name): 85 | # # Uploads a blob from a bucket 86 | # storage_client = storage.Client() 87 | # bucket = storage_client.get_bucket(bucket_name) 88 | # blob = bucket.blob(source_blob_name) 89 | # 90 | # blob.download_to_filename(destination_file_name) 91 | # 92 | # print('Blob {} downloaded to {}.'.format( 93 | # source_blob_name, 94 | # destination_file_name)) 95 | -------------------------------------------------------------------------------- /utils/torch_utils.py: -------------------------------------------------------------------------------- 1 | import math 2 | import os 3 | import time 4 | from copy import deepcopy 5 | 6 | import torch 7 | import torch.backends.cudnn as cudnn 8 | import torch.nn as nn 9 | import torch.nn.functional as F 10 | 11 | 12 | def init_seeds(seed=0): 13 | torch.manual_seed(seed) 14 | 15 | # Speed-reproducibility tradeoff https://pytorch.org/docs/stable/notes/randomness.html 16 | if seed == 0: # slower, more reproducible 17 | cudnn.deterministic = True 18 | cudnn.benchmark = False 19 | else: # faster, less reproducible 20 | cudnn.deterministic = False 21 | cudnn.benchmark = True 22 | 23 | 24 | def select_device(device='', apex=False, batch_size=None): 25 | # device = 'cpu' or '0' or '0,1,2,3' 26 | cpu_request = device.lower() == 'cpu' 27 | if device and not cpu_request: # if device requested other than 'cpu' 28 | os.environ['CUDA_VISIBLE_DEVICES'] = device # set environment variable 29 | assert torch.cuda.is_available(), 'CUDA unavailable, invalid device %s requested' % device # check availablity 30 | 31 | cuda = False if cpu_request else torch.cuda.is_available() 32 | if cuda: 33 | c = 1024 ** 2 # bytes to MB 34 | ng = torch.cuda.device_count() 35 | if ng > 1 and batch_size: # check 
that batch_size is compatible with device_count 36 | assert batch_size % ng == 0, 'batch-size %g not multiple of GPU count %g' % (batch_size, ng) 37 | x = [torch.cuda.get_device_properties(i) for i in range(ng)] 38 | s = 'Using CUDA ' + ('Apex ' if apex else '') # apex for mixed precision https://github.com/NVIDIA/apex 39 | for i in range(0, ng): 40 | if i == 1: 41 | s = ' ' * len(s) 42 | print("%sdevice%g _CudaDeviceProperties(name='%s', total_memory=%dMB)" % 43 | (s, i, x[i].name, x[i].total_memory / c)) 44 | else: 45 | print('Using CPU') 46 | 47 | print('') # skip a line 48 | return torch.device('cuda:0' if cuda else 'cpu') 49 | 50 | 51 | def time_synchronized(): 52 | torch.cuda.synchronize() if torch.cuda.is_available() else None 53 | return time.time() 54 | 55 | 56 | def initialize_weights(model): 57 | for m in model.modules(): 58 | t = type(m) 59 | if t is nn.Conv2d: 60 | pass # nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu') 61 | elif t is nn.BatchNorm2d: 62 | m.eps = 1e-4 63 | m.momentum = 0.03 64 | elif t in [nn.LeakyReLU, nn.ReLU, nn.ReLU6]: 65 | m.inplace = True 66 | 67 | 68 | def find_modules(model, mclass=nn.Conv2d): 69 | # finds layer indices matching module class 'mclass' 70 | return [i for i, m in enumerate(model.module_list) if isinstance(m, mclass)] 71 | 72 | 73 | def fuse_conv_and_bn(conv, bn): 74 | # https://tehnokv.com/posts/fusing-batchnorm-and-conv/ 75 | with torch.no_grad(): 76 | # init 77 | fusedconv = torch.nn.Conv2d(conv.in_channels, 78 | conv.out_channels, 79 | kernel_size=conv.kernel_size, 80 | stride=conv.stride, 81 | padding=conv.padding, 82 | bias=True) 83 | 84 | # prepare filters 85 | w_conv = conv.weight.clone().view(conv.out_channels, -1) 86 | w_bn = torch.diag(bn.weight.div(torch.sqrt(bn.eps + bn.running_var))) 87 | fusedconv.weight.copy_(torch.mm(w_bn, w_conv).view(fusedconv.weight.size())) 88 | 89 | # prepare spatial bias 90 | if conv.bias is not None: 91 | b_conv = conv.bias 92 | else: 93 | b_conv = torch.zeros(conv.weight.size(0), device=conv.weight.device) 94 | b_bn = bn.bias - bn.weight.mul(bn.running_mean).div(torch.sqrt(bn.running_var + bn.eps)) 95 | fusedconv.bias.copy_(torch.mm(w_bn, b_conv.reshape(-1, 1)).reshape(-1) + b_bn) 96 | 97 | return fusedconv 98 | 99 | 100 | def model_info(model, verbose=False): 101 | # Plots a line-by-line description of a PyTorch model 102 | n_p = sum(x.numel() for x in model.parameters()) # number parameters 103 | n_g = sum(x.numel() for x in model.parameters() if x.requires_grad) # number gradients 104 | if verbose: 105 | print('%5s %40s %9s %12s %20s %10s %10s' % ('layer', 'name', 'gradient', 'parameters', 'shape', 'mu', 'sigma')) 106 | for i, (name, p) in enumerate(model.named_parameters()): 107 | name = name.replace('module_list.', '') 108 | print('%5g %40s %9s %12g %20s %10.3g %10.3g' % 109 | (i, name, p.requires_grad, p.numel(), list(p.shape), p.mean(), p.std())) 110 | 111 | try: # FLOPS 112 | from thop import profile 113 | macs, _ = profile(model, inputs=(torch.zeros(1, 3, 480, 640),), verbose=False) 114 | fs = ', %.1f GFLOPS' % (macs / 1E9 * 2) 115 | except: 116 | fs = '' 117 | 118 | print('Model Summary: %g layers, %g parameters, %g gradients%s' % (len(list(model.parameters())), n_p, n_g, fs)) 119 | 120 | 121 | def load_classifier(name='resnet101', n=2): 122 | # Loads a pretrained model reshaped to n-class output 123 | import pretrainedmodels # https://github.com/Cadene/pretrained-models.pytorch#torchvision 124 | model = pretrainedmodels.__dict__[name](num_classes=1000, 
pretrained='imagenet') 125 | 126 | # Display model properties 127 | for x in ['model.input_size', 'model.input_space', 'model.input_range', 'model.mean', 'model.std']: 128 | print(x + ' =', eval(x)) 129 | 130 | # Reshape output to n classes 131 | filters = model.last_linear.weight.shape[1] 132 | model.last_linear.bias = torch.nn.Parameter(torch.zeros(n)) 133 | model.last_linear.weight = torch.nn.Parameter(torch.zeros(n, filters)) 134 | model.last_linear.out_features = n 135 | return model 136 | 137 | 138 | def scale_img(img, ratio=1.0, same_shape=False): # img(16,3,256,416), r=ratio 139 | # scales img(bs,3,y,x) by ratio 140 | h, w = img.shape[2:] 141 | s = (int(h * ratio), int(w * ratio)) # new size 142 | img = F.interpolate(img, size=s, mode='bilinear', align_corners=False) # resize 143 | if not same_shape: # pad/crop img 144 | gs = 32 # (pixels) grid size 145 | h, w = [math.ceil(x * ratio / gs) * gs for x in (h, w)] 146 | return F.pad(img, [0, w - s[1], 0, h - s[0]], value=0.447) # value = imagenet mean 147 | 148 | 149 | class ModelEMA: 150 | """ Model Exponential Moving Average from https://github.com/rwightman/pytorch-image-models 151 | Keep a moving average of everything in the model state_dict (parameters and buffers). 152 | This is intended to allow functionality like 153 | https://www.tensorflow.org/api_docs/python/tf/train/ExponentialMovingAverage 154 | A smoothed version of the weights is necessary for some training schemes to perform well. 155 | E.g. Google's hyper-params for training MNASNet, MobileNet-V3, EfficientNet, etc that use 156 | RMSprop with a short 2.4-3 epoch decay period and slow LR decay rate of .96-.99 requires EMA 157 | smoothing of weights to match results. Pay attention to the decay constant you are using 158 | relative to your update count per epoch. 159 | To keep EMA from using GPU resources, set device='cpu'. This will save a bit of memory but 160 | disable validation of the EMA weights. Validation will have to be done manually in a separate 161 | process, or after the training stops converging. 162 | This class is sensitive where it is initialized in the sequence of model init, 163 | GPU assignment and distributed training wrappers. 164 | I've tested with the sequence in my own train.py for torch.DataParallel, apex.DDP, and single-GPU. 165 | """ 166 | 167 | def __init__(self, model, decay=0.9999, device=''): 168 | # make a copy of the model for accumulating moving average of weights 169 | self.ema = deepcopy(model) 170 | self.ema.eval() 171 | self.updates = 0 # number of EMA updates 172 | self.decay = lambda x: decay * (1 - math.exp(-x / 2000)) # decay exponential ramp (to help early epochs) 173 | self.device = device # perform ema on different device from model if set 174 | if device: 175 | self.ema.to(device=device) 176 | for p in self.ema.parameters(): 177 | p.requires_grad_(False) 178 | 179 | def update(self, model): 180 | self.updates += 1 181 | d = self.decay(self.updates) 182 | with torch.no_grad(): 183 | if type(model) in (nn.parallel.DataParallel, nn.parallel.DistributedDataParallel): 184 | msd, esd = model.module.state_dict(), self.ema.module.state_dict() 185 | else: 186 | msd, esd = model.state_dict(), self.ema.state_dict() 187 | 188 | for k, v in esd.items(): 189 | if v.dtype.is_floating_point: 190 | v *= d 191 | v += (1. 
- d) * msd[k].detach() 192 | 193 | def update_attr(self, model): 194 | # Assign attributes (which may change during training) 195 | for k in model.__dict__.keys(): 196 | if not k.startswith('_'): 197 | setattr(self.ema, k, getattr(model, k)) 198 | -------------------------------------------------------------------------------- /utils/utils.py: -------------------------------------------------------------------------------- 1 | import glob 2 | import math 3 | import os 4 | import random 5 | import shutil 6 | import subprocess 7 | import time 8 | from copy import copy 9 | from pathlib import Path 10 | from sys import platform 11 | 12 | import cv2 13 | import matplotlib 14 | import matplotlib.pyplot as plt 15 | import numpy as np 16 | import torch 17 | import torch.nn as nn 18 | import torchvision 19 | from scipy.signal import butter, filtfilt 20 | from tqdm import tqdm 21 | 22 | from PIL import Image,ImageDraw,ImageFont 23 | 24 | from . import torch_utils, google_utils #  torch_utils, google_utils 25 | 26 | # Set printoptions 27 | torch.set_printoptions(linewidth=320, precision=5, profile='long') 28 | np.set_printoptions(linewidth=320, formatter={'float_kind': '{:11.5g}'.format}) # format short g, %precision=5 29 | matplotlib.rc('font', **{'size': 11}) 30 | 31 | # Prevent OpenCV from multithreading (to use PyTorch DataLoader) 32 | cv2.setNumThreads(0) 33 | 34 | 35 | def init_seeds(seed=0): 36 | random.seed(seed) 37 | np.random.seed(seed) 38 | torch_utils.init_seeds(seed=seed) 39 | 40 | 41 | def check_git_status(): 42 | if platform in ['linux', 'darwin']: 43 | # Suggest 'git pull' if repo is out of date 44 | s = subprocess.check_output('if [ -d .git ]; then git fetch && git status -uno; fi', shell=True).decode('utf-8') 45 | if 'Your branch is behind' in s: 46 | print(s[s.find('Your branch is behind'):s.find('\n\n')] + '\n') 47 | 48 | 49 | def make_divisible(x, divisor): 50 | # Returns x evenly divisble by divisor 51 | return math.ceil(x / divisor) * divisor 52 | 53 | 54 | def labels_to_class_weights(labels, nc=80): 55 | # Get class weights (inverse frequency) from training labels 56 | if labels[0] is None: # no labels loaded 57 | return torch.Tensor() 58 | 59 | labels = np.concatenate(labels, 0) # labels.shape = (866643, 5) for COCO 60 | classes = labels[:, 0].astype(np.int) # labels = [class xywh] 61 | weights = np.bincount(classes, minlength=nc) # occurences per class 62 | 63 | # Prepend gridpoint count (for uCE trianing) 64 | # gpi = ((320 / 32 * np.array([1, 2, 4])) ** 2 * 3).sum() # gridpoints per image 65 | # weights = np.hstack([gpi * len(labels) - weights.sum() * 9, weights * 9]) ** 0.5 # prepend gridpoints to start 66 | 67 | weights[weights == 0] = 1 # replace empty bins with 1 68 | weights = 1 / weights # number of targets per class 69 | weights /= weights.sum() # normalize 70 | return torch.from_numpy(weights) 71 | 72 | 73 | def labels_to_image_weights(labels, nc=80, class_weights=np.ones(80)): 74 | # Produces image weights based on class mAPs 75 | n = len(labels) 76 | class_counts = np.array([np.bincount(labels[i][:, 0].astype(np.int), minlength=nc) for i in range(n)]) 77 | image_weights = (class_weights.reshape(1, nc) * class_counts).sum(1) 78 | # index = random.choices(range(n), weights=image_weights, k=1) # weight image sample 79 | return image_weights 80 | 81 | 82 | def coco80_to_coco91_class(): # converts 80-index (val2014) to 91-index (paper) 83 | # https://tech.amikelive.com/node-718/what-object-categories-labels-are-in-coco-dataset/ 84 | # a = np.loadtxt('data/coco.names', 
dtype='str', delimiter='\n') 85 | # b = np.loadtxt('data/coco_paper.names', dtype='str', delimiter='\n') 86 | # x1 = [list(a[i] == b).index(True) + 1 for i in range(80)] # darknet to coco 87 | # x2 = [list(b[i] == a).index(True) if any(b[i] == a) else None for i in range(91)] # coco to darknet 88 | x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 27, 28, 31, 32, 33, 34, 89 | 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 90 | 64, 65, 67, 70, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 84, 85, 86, 87, 88, 89, 90] 91 | return x 92 | 93 | 94 | def xyxy2xywh(x): 95 | # Convert nx4 boxes from [x1, y1, x2, y2] to [x, y, w, h] where xy1=top-left, xy2=bottom-right 96 | y = torch.zeros_like(x) if isinstance(x, torch.Tensor) else np.zeros_like(x) 97 | y[:, 0] = (x[:, 0] + x[:, 2]) / 2 # x center 98 | y[:, 1] = (x[:, 1] + x[:, 3]) / 2 # y center 99 | y[:, 2] = x[:, 2] - x[:, 0] # width 100 | y[:, 3] = x[:, 3] - x[:, 1] # height 101 | return y 102 | 103 | 104 | def xywh2xyxy(x): 105 | # Convert nx4 boxes from [x, y, w, h] to [x1, y1, x2, y2] where xy1=top-left, xy2=bottom-right 106 | y = torch.zeros_like(x) if isinstance(x, torch.Tensor) else np.zeros_like(x) 107 | y[:, 0] = x[:, 0] - x[:, 2] / 2 # top left x 108 | y[:, 1] = x[:, 1] - x[:, 3] / 2 # top left y 109 | y[:, 2] = x[:, 0] + x[:, 2] / 2 # bottom right x 110 | y[:, 3] = x[:, 1] + x[:, 3] / 2 # bottom right y 111 | return y 112 | 113 | 114 | def scale_coords(img1_shape, coords, img0_shape, ratio_pad=None): 115 | # Rescale coords (xyxy) from img1_shape to img0_shape 116 | if ratio_pad is None: # calculate from img0_shape 117 | gain = max(img1_shape) / max(img0_shape) # gain = old / new 118 | pad = (img1_shape[1] - img0_shape[1] * gain) / 2, (img1_shape[0] - img0_shape[0] * gain) / 2 # wh padding 119 | else: 120 | gain = ratio_pad[0][0] 121 | pad = ratio_pad[1] 122 | 123 | coords[:, [0, 2]] -= pad[0] # x padding 124 | coords[:, [1, 3]] -= pad[1] # y padding 125 | coords[:, :4] /= gain 126 | clip_coords(coords, img0_shape) 127 | return coords 128 | 129 | 130 | def clip_coords(boxes, img_shape): 131 | # Clip bounding xyxy bounding boxes to image shape (height, width) 132 | boxes[:, 0].clamp_(0, img_shape[1]) # x1 133 | boxes[:, 1].clamp_(0, img_shape[0]) # y1 134 | boxes[:, 2].clamp_(0, img_shape[1]) # x2 135 | boxes[:, 3].clamp_(0, img_shape[0]) # y2 136 | 137 | 138 | def ap_per_class(tp, conf, pred_cls, target_cls): 139 | """ Compute the average precision, given the recall and precision curves. 140 | Source: https://github.com/rafaelpadilla/Object-Detection-Metrics. 141 | # Arguments 142 | tp: True positives (nparray, nx1 or nx10). 143 | conf: Objectness value from 0-1 (nparray). 144 | pred_cls: Predicted object classes (nparray). 145 | target_cls: True object classes (nparray). 146 | # Returns 147 | The average precision as computed in py-faster-rcnn. 148 | """ 149 | 150 | # Sort by objectness 151 | i = np.argsort(-conf) 152 | tp, conf, pred_cls = tp[i], conf[i], pred_cls[i] 153 | 154 | # Find unique classes 155 | unique_classes = np.unique(target_cls) 156 | 157 | # Create Precision-Recall curve and compute AP for each class 158 | pr_score = 0.1 # score to evaluate P and R https://github.com/ultralytics/yolov3/issues/898 159 | s = [unique_classes.shape[0], tp.shape[1]] # number class, number iou thresholds (i.e. 
10 for mAP0.5...0.95) 160 | ap, p, r = np.zeros(s), np.zeros(s), np.zeros(s) 161 | for ci, c in enumerate(unique_classes): 162 | i = pred_cls == c 163 | n_gt = (target_cls == c).sum() # Number of ground truth objects 164 | n_p = i.sum() # Number of predicted objects 165 | 166 | if n_p == 0 or n_gt == 0: 167 | continue 168 | else: 169 | # Accumulate FPs and TPs 170 | fpc = (1 - tp[i]).cumsum(0) 171 | tpc = tp[i].cumsum(0) 172 | 173 | # Recall 174 | recall = tpc / (n_gt + 1e-16) # recall curve 175 | r[ci] = np.interp(-pr_score, -conf[i], recall[:, 0]) # r at pr_score, negative x, xp because xp decreases 176 | 177 | # Precision 178 | precision = tpc / (tpc + fpc) # precision curve 179 | p[ci] = np.interp(-pr_score, -conf[i], precision[:, 0]) # p at pr_score 180 | 181 | # AP from recall-precision curve 182 | for j in range(tp.shape[1]): 183 | ap[ci, j] = compute_ap(recall[:, j], precision[:, j]) 184 | 185 | # Plot 186 | # fig, ax = plt.subplots(1, 1, figsize=(5, 5)) 187 | # ax.plot(recall, precision) 188 | # ax.set_xlabel('Recall') 189 | # ax.set_ylabel('Precision') 190 | # ax.set_xlim(0, 1.01) 191 | # ax.set_ylim(0, 1.01) 192 | # fig.tight_layout() 193 | # fig.savefig('PR_curve.png', dpi=300) 194 | 195 | # Compute F1 score (harmonic mean of precision and recall) 196 | f1 = 2 * p * r / (p + r + 1e-16) 197 | 198 | return p, r, ap, f1, unique_classes.astype('int32') 199 | 200 | 201 | def compute_ap(recall, precision): 202 | """ Compute the average precision, given the recall and precision curves. 203 | Source: https://github.com/rbgirshick/py-faster-rcnn. 204 | # Arguments 205 | recall: The recall curve (list). 206 | precision: The precision curve (list). 207 | # Returns 208 | The average precision as computed in py-faster-rcnn. 209 | """ 210 | 211 | # Append sentinel values to beginning and end 212 | mrec = np.concatenate(([0.], recall, [min(recall[-1] + 1E-3, 1.)])) 213 | mpre = np.concatenate(([0.], precision, [0.])) 214 | 215 | # Compute the precision envelope 216 | mpre = np.flip(np.maximum.accumulate(np.flip(mpre))) 217 | 218 | # Integrate area under curve 219 | method = 'interp' # methods: 'continuous', 'interp' 220 | if method == 'interp': 221 | x = np.linspace(0, 1, 101) # 101-point interp (COCO) 222 | ap = np.trapz(np.interp(x, mrec, mpre), x) # integrate 223 | else: # 'continuous' 224 | i = np.where(mrec[1:] != mrec[:-1])[0] # points where x axis (recall) changes 225 | ap = np.sum((mrec[i + 1] - mrec[i]) * mpre[i + 1]) # area under curve 226 | 227 | return ap 228 | 229 | 230 | def bbox_iou(box1, box2, x1y1x2y2=True, GIoU=False, DIoU=False, CIoU=False): 231 | # Returns the IoU of box1 to box2. 
box1 is 4, box2 is nx4 232 | box2 = box2.t() 233 | 234 | # Get the coordinates of bounding boxes 235 | if x1y1x2y2: # x1, y1, x2, y2 = box1 236 | b1_x1, b1_y1, b1_x2, b1_y2 = box1[0], box1[1], box1[2], box1[3] 237 | b2_x1, b2_y1, b2_x2, b2_y2 = box2[0], box2[1], box2[2], box2[3] 238 | else: # transform from xywh to xyxy 239 | b1_x1, b1_x2 = box1[0] - box1[2] / 2, box1[0] + box1[2] / 2 240 | b1_y1, b1_y2 = box1[1] - box1[3] / 2, box1[1] + box1[3] / 2 241 | b2_x1, b2_x2 = box2[0] - box2[2] / 2, box2[0] + box2[2] / 2 242 | b2_y1, b2_y2 = box2[1] - box2[3] / 2, box2[1] + box2[3] / 2 243 | 244 | # Intersection area 245 | inter = (torch.min(b1_x2, b2_x2) - torch.max(b1_x1, b2_x1)).clamp(0) * \ 246 | (torch.min(b1_y2, b2_y2) - torch.max(b1_y1, b2_y1)).clamp(0) 247 | 248 | # Union Area 249 | w1, h1 = b1_x2 - b1_x1, b1_y2 - b1_y1 250 | w2, h2 = b2_x2 - b2_x1, b2_y2 - b2_y1 251 | union = (w1 * h1 + 1e-16) + w2 * h2 - inter 252 | 253 | iou = inter / union # iou 254 | if GIoU or DIoU or CIoU: 255 | cw = torch.max(b1_x2, b2_x2) - torch.min(b1_x1, b2_x1) # convex (smallest enclosing box) width 256 | ch = torch.max(b1_y2, b2_y2) - torch.min(b1_y1, b2_y1) # convex height 257 | if GIoU: # Generalized IoU https://arxiv.org/pdf/1902.09630.pdf 258 | c_area = cw * ch + 1e-16 # convex area 259 | return iou - (c_area - union) / c_area # GIoU 260 | if DIoU or CIoU: # Distance or Complete IoU https://arxiv.org/abs/1911.08287v1 261 | # convex diagonal squared 262 | c2 = cw ** 2 + ch ** 2 + 1e-16 263 | # centerpoint distance squared 264 | rho2 = ((b2_x1 + b2_x2) - (b1_x1 + b1_x2)) ** 2 / 4 + ((b2_y1 + b2_y2) - (b1_y1 + b1_y2)) ** 2 / 4 265 | if DIoU: 266 | return iou - rho2 / c2 # DIoU 267 | elif CIoU: # https://github.com/Zzh-tju/DIoU-SSD-pytorch/blob/master/utils/box/box_utils.py#L47 268 | v = (4 / math.pi ** 2) * torch.pow(torch.atan(w2 / h2) - torch.atan(w1 / h1), 2) 269 | with torch.no_grad(): 270 | alpha = v / (1 - iou + v) 271 | return iou - (rho2 / c2 + v * alpha) # CIoU 272 | 273 | return iou 274 | 275 | 276 | def box_iou(box1, box2): 277 | # https://github.com/pytorch/vision/blob/master/torchvision/ops/boxes.py 278 | """ 279 | Return intersection-over-union (Jaccard index) of boxes. 280 | Both sets of boxes are expected to be in (x1, y1, x2, y2) format. 281 | Arguments: 282 | box1 (Tensor[N, 4]) 283 | box2 (Tensor[M, 4]) 284 | Returns: 285 | iou (Tensor[N, M]): the NxM matrix containing the pairwise 286 | IoU values for every element in boxes1 and boxes2 287 | """ 288 | 289 | def box_area(box): 290 | # box = 4xn 291 | return (box[2] - box[0]) * (box[3] - box[1]) 292 | 293 | area1 = box_area(box1.t()) 294 | area2 = box_area(box2.t()) 295 | 296 | # inter(N,M) = (rb(N,M,2) - lt(N,M,2)).clamp(0).prod(2) 297 | inter = (torch.min(box1[:, None, 2:], box2[:, 2:]) - torch.max(box1[:, None, :2], box2[:, :2])).clamp(0).prod(2) 298 | return inter / (area1[:, None] + area2 - inter) # iou = inter / (area1 + area2 - inter) 299 | 300 | 301 | def wh_iou(wh1, wh2): 302 | # Returns the nxm IoU matrix. wh1 is nx2, wh2 is mx2 303 | wh1 = wh1[:, None] # [N,1,2] 304 | wh2 = wh2[None] # [1,M,2] 305 | inter = torch.min(wh1, wh2).prod(2) # [N,M] 306 | return inter / (wh1.prod(2) + wh2.prod(2) - inter) # iou = inter / (area1 + area2 - inter) 307 | 308 | 309 | class FocalLoss(nn.Module): 310 | # Wraps focal loss around existing loss_fcn(), i.e. 
criteria = FocalLoss(nn.BCEWithLogitsLoss(), gamma=1.5) 311 | def __init__(self, loss_fcn, gamma=1.5, alpha=0.25): 312 | super(FocalLoss, self).__init__() 313 | self.loss_fcn = loss_fcn # must be nn.BCEWithLogitsLoss() 314 | self.gamma = gamma 315 | self.alpha = alpha 316 | self.reduction = loss_fcn.reduction 317 | self.loss_fcn.reduction = 'none' # required to apply FL to each element 318 | 319 | def forward(self, pred, true): 320 | loss = self.loss_fcn(pred, true) 321 | # p_t = torch.exp(-loss) 322 | # loss *= self.alpha * (1.000001 - p_t) ** self.gamma # non-zero power for gradient stability 323 | 324 | # TF implementation https://github.com/tensorflow/addons/blob/v0.7.1/tensorflow_addons/losses/focal_loss.py 325 | pred_prob = torch.sigmoid(pred) # prob from logits 326 | p_t = true * pred_prob + (1 - true) * (1 - pred_prob) 327 | alpha_factor = true * self.alpha + (1 - true) * (1 - self.alpha) 328 | modulating_factor = (1.0 - p_t) ** self.gamma 329 | loss *= alpha_factor * modulating_factor 330 | 331 | if self.reduction == 'mean': 332 | return loss.mean() 333 | elif self.reduction == 'sum': 334 | return loss.sum() 335 | else: # 'none' 336 | return loss 337 | 338 | 339 | def smooth_BCE(eps=0.1): # https://github.com/ultralytics/yolov3/issues/238#issuecomment-598028441 340 | # return positive, negative label smoothing BCE targets 341 | return 1.0 - 0.5 * eps, 0.5 * eps 342 | 343 | 344 | class BCEBlurWithLogitsLoss(nn.Module): 345 | # BCEwithLogitLoss() with reduced missing label effects. 346 | def __init__(self, alpha=0.05): 347 | super(BCEBlurWithLogitsLoss, self).__init__() 348 | self.loss_fcn = nn.BCEWithLogitsLoss(reduction='none') # must be nn.BCEWithLogitsLoss() 349 | self.alpha = alpha 350 | 351 | def forward(self, pred, true): 352 | loss = self.loss_fcn(pred, true) 353 | pred = torch.sigmoid(pred) # prob from logits 354 | dx = pred - true # reduce only missing label effects 355 | # dx = (pred - true).abs() # reduce missing label and false label effects 356 | alpha_factor = 1 - torch.exp((dx - 1) / (self.alpha + 1e-4)) 357 | loss *= alpha_factor 358 | return loss.mean() 359 | 360 | 361 | def compute_loss(p, targets, model): # predictions, targets, model 362 | ft = torch.cuda.FloatTensor if p[0].is_cuda else torch.Tensor 363 | lcls, lbox, lobj = ft([0]), ft([0]), ft([0]) 364 | tcls, tbox, indices, anchors = build_targets(p, targets, model) # targets 365 | h = model.hyp # hyperparameters 366 | red = 'mean' # Loss reduction (sum or mean) 367 | 368 | # Define criteria 369 | BCEcls = nn.BCEWithLogitsLoss(pos_weight=ft([h['cls_pw']]), reduction=red) 370 | BCEobj = nn.BCEWithLogitsLoss(pos_weight=ft([h['obj_pw']]), reduction=red) 371 | 372 | # class label smoothing https://arxiv.org/pdf/1902.04103.pdf eqn 3 373 | cp, cn = smooth_BCE(eps=0.0) 374 | 375 | # focal loss 376 | g = h['fl_gamma'] # focal loss gamma 377 | if g > 0: 378 | BCEcls, BCEobj = FocalLoss(BCEcls, g), FocalLoss(BCEobj, g) 379 | 380 | # per output 381 | nt = 0 # targets 382 | for i, pi in enumerate(p): # layer index, layer predictions 383 | b, a, gj, gi = indices[i] # image, anchor, gridy, gridx 384 | tobj = torch.zeros_like(pi[..., 0]) # target obj 385 | 386 | nb = b.shape[0] # number of targets 387 | if nb: 388 | nt += nb # cumulative targets 389 | ps = pi[b, a, gj, gi] # prediction subset corresponding to targets 390 | 391 | # GIoU 392 | pxy = ps[:, :2].sigmoid() * 2. 
- 0.5 393 | pwh = (ps[:, 2:4].sigmoid() * 2) ** 2 * anchors[i] 394 | pbox = torch.cat((pxy, pwh), 1) # predicted box 395 | giou = bbox_iou(pbox.t(), tbox[i], x1y1x2y2=False, GIoU=True) # giou(prediction, target) 396 | lbox += (1.0 - giou).sum() if red == 'sum' else (1.0 - giou).mean() # giou loss 397 | 398 | # Obj 399 | tobj[b, a, gj, gi] = (1.0 - model.gr) + model.gr * giou.detach().clamp(0).type(tobj.dtype) # giou ratio 400 | 401 | # Class 402 | if model.nc > 1: # cls loss (only if multiple classes) 403 | t = torch.full_like(ps[:, 5:], cn) # targets 404 | t[range(nb), tcls[i]] = cp 405 | lcls += BCEcls(ps[:, 5:], t) # BCE 406 | 407 | # Append targets to text file 408 | # with open('targets.txt', 'a') as file: 409 | # [file.write('%11.5g ' * 4 % tuple(x) + '\n') for x in torch.cat((txy[i], twh[i]), 1)] 410 | 411 | lobj += BCEobj(pi[..., 4], tobj) # obj loss 412 | 413 | lbox *= h['giou'] 414 | lobj *= h['obj'] 415 | lcls *= h['cls'] 416 | bs = tobj.shape[0] # batch size 417 | if red == 'sum': 418 | g = 3.0 # loss gain 419 | lobj *= g / bs 420 | if nt: 421 | lcls *= g / nt / model.nc 422 | lbox *= g / nt 423 | 424 | loss = lbox + lobj + lcls 425 | return loss * bs, torch.cat((lbox, lobj, lcls, loss)).detach() 426 | 427 | 428 | def build_targets(p, targets, model): 429 | # Build targets for compute_loss(), input targets(image,class,x,y,w,h) 430 | det = model.module.model[-1] if type(model) in (nn.parallel.DataParallel, nn.parallel.DistributedDataParallel) \ 431 | else model.model[-1] # Detect() module 432 | na, nt = det.na, targets.shape[0] # number of anchors, targets 433 | tcls, tbox, indices, anch = [], [], [], [] 434 | gain = torch.ones(6, device=targets.device) # normalized to gridspace gain 435 | off = torch.tensor([[1, 0], [0, 1], [-1, 0], [0, -1]], device=targets.device).float() # overlap offsets 436 | at = torch.arange(na).view(na, 1).repeat(1, nt) # anchor tensor, same as .repeat_interleave(nt) 437 | 438 | style = 'rect4' 439 | for i in range(det.nl): 440 | anchors = det.anchors[i] 441 | gain[2:] = torch.tensor(p[i].shape)[[3, 2, 3, 2]] # xyxy gain 442 | 443 | # Match targets to anchors 444 | a, t, offsets = [], targets * gain, 0 445 | if nt: 446 | r = t[None, :, 4:6] / anchors[:, None] # wh ratio 447 | j = torch.max(r, 1. / r).max(2)[0] < model.hyp['anchor_t'] # compare 448 | # j = wh_iou(anchors, t[:, 4:6]) > model.hyp['iou_t'] # iou(3,n) = wh_iou(anchors(3,2), gwh(n,2)) 449 | a, t = at[j], t.repeat(na, 1, 1)[j] # filter 450 | 451 | # overlaps 452 | gxy = t[:, 2:4] # grid xy 453 | z = torch.zeros_like(gxy) 454 | if style == 'rect2': 455 | g = 0.2 # offset 456 | j, k = ((gxy % 1. < g) & (gxy > 1.)).T 457 | a, t = torch.cat((a, a[j], a[k]), 0), torch.cat((t, t[j], t[k]), 0) 458 | offsets = torch.cat((z, z[j] + off[0], z[k] + off[1]), 0) * g 459 | 460 | elif style == 'rect4': 461 | g = 0.5 # offset 462 | j, k = ((gxy % 1. < g) & (gxy > 1.)).T 463 | l, m = ((gxy % 1. 
> (1 - g)) & (gxy < (gain[[2, 3]] - 1.))).T 464 | a, t = torch.cat((a, a[j], a[k], a[l], a[m]), 0), torch.cat((t, t[j], t[k], t[l], t[m]), 0) 465 | offsets = torch.cat((z, z[j] + off[0], z[k] + off[1], z[l] + off[2], z[m] + off[3]), 0) * g 466 | 467 | # Define 468 | b, c = t[:, :2].long().T # image, class 469 | gxy = t[:, 2:4] # grid xy 470 | gwh = t[:, 4:6] # grid wh 471 | gij = (gxy - offsets).long() 472 | gi, gj = gij.T # grid xy indices 473 | 474 | # Append 475 | indices.append((b, a, gj, gi)) # image, anchor, grid indices 476 | tbox.append(torch.cat((gxy - gij, gwh), 1)) # box 477 | anch.append(anchors[a]) # anchors 478 | tcls.append(c) # class 479 | 480 | return tcls, tbox, indices, anch 481 | 482 | 483 | def non_max_suppression(prediction, conf_thres=0.1, iou_thres=0.6, fast=False, classes=None, agnostic=False): 484 | """ 485 | Performs Non-Maximum Suppression on inference results 486 | Returns detections with shape: 487 | nx6 (x1, y1, x2, y2, conf, cls) 488 | """ 489 | nc = prediction[0].shape[1] - 5 # number of classes 490 | xc = prediction[..., 4] > conf_thres # candidates 491 | 492 | # Settings 493 | min_wh, max_wh = 2, 4096 # (pixels) minimum and maximum box width and height 494 | max_det = 300 # maximum number of detections per image 495 | time_limit = 10.0 # seconds to quit after 496 | redundant = True # require redundant detections 497 | fast |= conf_thres > 0.001 # fast mode 498 | if fast: 499 | merge = False 500 | multi_label = False 501 | else: 502 | merge = True # merge for best mAP (adds 0.5ms/img) 503 | multi_label = nc > 1 # multiple labels per box (adds 0.5ms/img) 504 | 505 | t = time.time() 506 | output = [None] * prediction.shape[0] 507 | for xi, x in enumerate(prediction): # image index, image inference 508 | # Apply constraints 509 | # x[((x[..., 2:4] < min_wh) | (x[..., 2:4] > max_wh)).any(1), 4] = 0 # width-height 510 | x = x[xc[xi]] # confidence 511 | 512 | # If none remain process next image 513 | if not x.shape[0]: 514 | continue 515 | 516 | # Compute conf 517 | x[:, 5:] *= x[:, 4:5] # conf = obj_conf * cls_conf 518 | 519 | # Box (center x, center y, width, height) to (x1, y1, x2, y2) 520 | box = xywh2xyxy(x[:, :4]) 521 | 522 | # Detections matrix nx6 (xyxy, conf, cls) 523 | if multi_label: 524 | i, j = (x[:, 5:] > conf_thres).nonzero().t() 525 | x = torch.cat((box[i], x[i, j + 5, None], j[:, None].float()), 1) 526 | else: # best class only 527 | conf, j = x[:, 5:].max(1, keepdim=True) 528 | x = torch.cat((box, conf, j.float()), 1)[conf.view(-1) > conf_thres] 529 | 530 | # Filter by class 531 | if classes: 532 | x = x[(x[:, 5:6] == torch.tensor(classes, device=x.device)).any(1)] 533 | 534 | # Apply finite constraint 535 | # if not torch.isfinite(x).all(): 536 | # x = x[torch.isfinite(x).all(1)] 537 | 538 | # If none remain process next image 539 | n = x.shape[0] # number of boxes 540 | if not n: 541 | continue 542 | 543 | # Sort by confidence 544 | # x = x[x[:, 4].argsort(descending=True)] 545 | 546 | # Batched NMS 547 | c = x[:, 5:6] * (0 if agnostic else max_wh) # classes 548 | boxes, scores = x[:, :4] + c, x[:, 4] # boxes (offset by class), scores 549 | i = torchvision.ops.boxes.nms(boxes, scores, iou_thres) 550 | if i.shape[0] > max_det: # limit detections 551 | i = i[:max_det] 552 | if merge and (1 < n < 3E3): # Merge NMS (boxes merged using weighted mean) 553 | try: # update boxes as boxes(i,4) = weights(i,n) * boxes(n,4) 554 | iou = box_iou(boxes[i], boxes) > iou_thres # iou matrix 555 | weights = iou * scores[None] # box weights 556 | x[i, :4] = 
torch.mm(weights, x[:, :4]).float() / weights.sum(1, keepdim=True) # merged boxes 557 | if redundant: 558 | i = i[iou.sum(1) > 1] # require redundancy 559 | except: # possible CUDA error https://github.com/ultralytics/yolov3/issues/1139 560 | print(x, i, x.shape, i.shape) 561 | pass 562 | 563 | output[xi] = x[i] 564 | if (time.time() - t) > time_limit: 565 | break # time limit exceeded 566 | 567 | return output 568 | 569 | 570 | def strip_optimizer(f='weights/best.pt'): # from utils.utils import *; strip_optimizer() 571 | # Strip optimizer from *.pt files for lighter files (reduced by 1/2 size) 572 | x = torch.load(f, map_location=torch.device('cpu')) 573 | x['optimizer'] = None 574 | torch.save(x, f) 575 | print('Optimizer stripped from %s' % f) 576 | 577 | 578 | def create_backbone(f='weights/best.pt', s='weights/backbone.pt'): # from utils.utils import *; create_backbone() 579 | # create backbone 's' from 'f' 580 | device = torch.device('cpu') 581 | x = torch.load(f, map_location=device) 582 | torch.save(x, s) # update model if SourceChangeWarning 583 | x = torch.load(s, map_location=device) 584 | 585 | x['optimizer'] = None 586 | x['training_results'] = None 587 | x['epoch'] = -1 588 | for p in x['model'].parameters(): 589 | p.requires_grad = True 590 | torch.save(x, s) 591 | print('%s modified for backbone use and saved as %s' % (f, s)) 592 | 593 | 594 | def coco_class_count(path='../coco/labels/train2014/'): 595 | # Histogram of occurrences per class 596 | nc = 80 # number classes 597 | x = np.zeros(nc, dtype='int32') 598 | files = sorted(glob.glob('%s/*.*' % path)) 599 | for i, file in enumerate(files): 600 | labels = np.loadtxt(file, dtype=np.float32).reshape(-1, 5) 601 | x += np.bincount(labels[:, 0].astype('int32'), minlength=nc) 602 | print(i, len(files)) 603 | 604 | 605 | def coco_only_people(path='../coco/labels/train2017/'): # from utils.utils import *; coco_only_people() 606 | # Find images with only people 607 | files = sorted(glob.glob('%s/*.*' % path)) 608 | for i, file in enumerate(files): 609 | labels = np.loadtxt(file, dtype=np.float32).reshape(-1, 5) 610 | if all(labels[:, 0] == 0): 611 | print(labels.shape[0], file) 612 | 613 | 614 | def crop_images_random(path='../images/', scale=0.50): # from utils.utils import *; crop_images_random() 615 | # crops images into random squares up to scale fraction 616 | # WARNING: overwrites images! 617 | for file in tqdm(sorted(glob.glob('%s/*.*' % path))): 618 | img = cv2.imread(file) # BGR 619 | if img is not None: 620 | h, w = img.shape[:2] 621 | 622 | # create random mask 623 | a = 30 # minimum size (pixels) 624 | mask_h = random.randint(a, int(max(a, h * scale))) # mask height 625 | mask_w = mask_h # mask width 626 | 627 | # box 628 | xmin = max(0, random.randint(0, w) - mask_w // 2) 629 | ymin = max(0, random.randint(0, h) - mask_h // 2) 630 | xmax = min(w, xmin + mask_w) 631 | ymax = min(h, ymin + mask_h) 632 | 633 | # apply random color mask 634 | cv2.imwrite(file, img[ymin:ymax, xmin:xmax]) 635 | 636 | 637 | def coco_single_class_labels(path='../coco/labels/train2014/', label_class=43): 638 | # Makes single-class coco datasets. 
from utils.utils import *; coco_single_class_labels() 639 | if os.path.exists('new/'): 640 | shutil.rmtree('new/') # delete output folder 641 | os.makedirs('new/') # make new output folder 642 | os.makedirs('new/labels/') 643 | os.makedirs('new/images/') 644 | for file in tqdm(sorted(glob.glob('%s/*.*' % path))): 645 | with open(file, 'r') as f: 646 | labels = np.array([x.split() for x in f.read().splitlines()], dtype=np.float32) 647 | i = labels[:, 0] == label_class 648 | if any(i): 649 | img_file = file.replace('labels', 'images').replace('txt', 'jpg') 650 | labels[:, 0] = 0 # reset class to 0 651 | with open('new/images.txt', 'a') as f: # add image to dataset list 652 | f.write(img_file + '\n') 653 | with open('new/labels/' + Path(file).name, 'a') as f: # write label 654 | for l in labels[i]: 655 | f.write('%g %.6f %.6f %.6f %.6f\n' % tuple(l)) 656 | shutil.copyfile(src=img_file, dst='new/images/' + Path(file).name.replace('txt', 'jpg')) # copy images 657 | 658 | 659 | def kmean_anchors(path='./data/coco128.txt', n=9, img_size=(640, 640), thr=0.20, gen=1000): 660 | # Creates kmeans anchors for use in *.cfg files: from utils.utils import *; _ = kmean_anchors() 661 | # n: number of anchors 662 | # img_size: (min, max) image size used for multi-scale training (can be same values) 663 | # thr: IoU threshold hyperparameter used for training (0.0 - 1.0) 664 | # gen: generations to evolve anchors using genetic algorithm 665 | from utils.datasets import LoadImagesAndLabels 666 | 667 | def print_results(k): 668 | k = k[np.argsort(k.prod(1))] # sort small to large 669 | iou = wh_iou(wh, torch.Tensor(k)) 670 | max_iou = iou.max(1)[0] 671 | bpr, aat = (max_iou > thr).float().mean(), (iou > thr).float().mean() * n # best possible recall, anch > thr 672 | 673 | # thr = 5.0 674 | # r = wh[:, None] / k[None] 675 | # ar = torch.max(r, 1. / r).max(2)[0] 676 | # max_ar = ar.min(1)[0] 677 | # bpr, aat = (max_ar < thr).float().mean(), (ar < thr).float().mean() * n # best possible recall, anch > thr 678 | 679 | print('%.2f iou_thr: %.3f best possible recall, %.2f anchors > thr' % (thr, bpr, aat)) 680 | print('n=%g, img_size=%s, IoU_all=%.3f/%.3f-mean/best, IoU>thr=%.3f-mean: ' % 681 | (n, img_size, iou.mean(), max_iou.mean(), iou[iou > thr].mean()), end='') 682 | for i, x in enumerate(k): 683 | print('%i,%i' % (round(x[0]), round(x[1])), end=', ' if i < len(k) - 1 else '\n') # use in *.cfg 684 | return k 685 | 686 | def fitness(k): # mutation fitness 687 | iou = wh_iou(wh, torch.Tensor(k)) # iou 688 | max_iou = iou.max(1)[0] 689 | return (max_iou * (max_iou > thr).float()).mean() # product 690 | 691 | # def fitness_ratio(k): # mutation fitness 692 | # # wh(5316,2), k(9,2) 693 | # r = wh[:, None] / k[None] 694 | # x = torch.max(r, 1. / r).max(2)[0] 695 | # m = x.min(1)[0] 696 | # return 1. 
/ (m * (m < 5).float()).mean() # product 697 | 698 | # Get label wh 699 | wh = [] 700 | dataset = LoadImagesAndLabels(path, augment=True, rect=True) 701 | nr = 1 if img_size[0] == img_size[1] else 3 # number augmentation repetitions 702 | for s, l in zip(dataset.shapes, dataset.labels): 703 | # wh.append(l[:, 3:5] * (s / s.max())) # image normalized to letterbox normalized wh 704 | wh.append(l[:, 3:5] * s) # image normalized to pixels 705 | wh = np.concatenate(wh, 0).repeat(nr, axis=0) # augment 3x 706 | # wh *= np.random.uniform(img_size[0], img_size[1], size=(wh.shape[0], 1)) # normalized to pixels (multi-scale) 707 | wh = wh[(wh > 2.0).all(1)] # remove below threshold boxes (< 2 pixels wh) 708 | 709 | # Kmeans calculation 710 | from scipy.cluster.vq import kmeans 711 | print('Running kmeans for %g anchors on %g points...' % (n, len(wh))) 712 | s = wh.std(0) # sigmas for whitening 713 | k, dist = kmeans(wh / s, n, iter=30) # points, mean distance 714 | k *= s 715 | wh = torch.Tensor(wh) 716 | k = print_results(k) 717 | 718 | # # Plot 719 | # k, d = [None] * 20, [None] * 20 720 | # for i in tqdm(range(1, 21)): 721 | # k[i-1], d[i-1] = kmeans(wh / s, i) # points, mean distance 722 | # fig, ax = plt.subplots(1, 2, figsize=(14, 7)) 723 | # ax = ax.ravel() 724 | # ax[0].plot(np.arange(1, 21), np.array(d) ** 2, marker='.') 725 | # fig, ax = plt.subplots(1, 2, figsize=(14, 7)) # plot wh 726 | # ax[0].hist(wh[wh[:, 0]<100, 0],400) 727 | # ax[1].hist(wh[wh[:, 1]<100, 1],400) 728 | # fig.tight_layout() 729 | # fig.savefig('wh.png', dpi=200) 730 | 731 | # Evolve 732 | npr = np.random 733 | f, sh, mp, s = fitness(k), k.shape, 0.9, 0.1 # fitness, generations, mutation prob, sigma 734 | for _ in tqdm(range(gen), desc='Evolving anchors'): 735 | v = np.ones(sh) 736 | while (v == 1).all(): # mutate until a change occurs (prevent duplicates) 737 | v = ((npr.random(sh) < mp) * npr.random() * npr.randn(*sh) * s + 1).clip(0.3, 3.0) 738 | kg = (k.copy() * v).clip(min=2.0) 739 | fg = fitness(kg) 740 | if fg > f: 741 | f, k = fg, kg.copy() 742 | print_results(k) 743 | k = print_results(k) 744 | 745 | return k 746 | 747 | 748 | def print_mutation(hyp, results, bucket=''): 749 | # Print mutation results to evolve.txt (for use with train.py --evolve) 750 | a = '%10s' * len(hyp) % tuple(hyp.keys()) # hyperparam keys 751 | b = '%10.3g' * len(hyp) % tuple(hyp.values()) # hyperparam values 752 | c = '%10.4g' * len(results) % results # results (P, R, mAP, F1, test_loss) 753 | print('\n%s\n%s\nEvolved fitness: %s\n' % (a, b, c)) 754 | 755 | if bucket: 756 | os.system('gsutil cp gs://%s/evolve.txt .' 
% bucket) # download evolve.txt 757 | 758 |     with open('evolve.txt', 'a') as f: # append result 759 |         f.write(c + b + '\n') 760 |     x = np.unique(np.loadtxt('evolve.txt', ndmin=2), axis=0) # load unique rows 761 |     np.savetxt('evolve.txt', x[np.argsort(-fitness(x))], '%10.3g') # save sort by fitness 762 | 763 |     if bucket: 764 |         os.system('gsutil cp evolve.txt gs://%s' % bucket) # upload evolve.txt 765 | 766 | 767 | def apply_classifier(x, model, img, im0): 768 |     # applies a second stage classifier to yolo outputs 769 |     im0 = [im0] if isinstance(im0, np.ndarray) else im0 770 |     for i, d in enumerate(x): # per image 771 |         if d is not None and len(d): 772 |             d = d.clone() 773 | 774 |             # Reshape and pad cutouts 775 |             b = xyxy2xywh(d[:, :4]) # boxes 776 |             b[:, 2:] = b[:, 2:].max(1)[0].unsqueeze(1) # rectangle to square 777 |             b[:, 2:] = b[:, 2:] * 1.3 + 30 # pad 778 |             d[:, :4] = xywh2xyxy(b).long() 779 | 780 |             # Rescale boxes from img_size to im0 size 781 |             scale_coords(img.shape[2:], d[:, :4], im0[i].shape) 782 | 783 |             # Classes 784 |             pred_cls1 = d[:, 5].long() 785 |             ims = [] 786 |             for j, a in enumerate(d): # per item 787 |                 cutout = im0[i][int(a[1]):int(a[3]), int(a[0]):int(a[2])] 788 |                 im = cv2.resize(cutout, (224, 224)) # BGR 789 |                 # cv2.imwrite('test%i.jpg' % j, cutout) 790 | 791 |                 im = im[:, :, ::-1].transpose(2, 0, 1) # BGR to RGB, to 3x224x224 792 |                 im = np.ascontiguousarray(im, dtype=np.float32) # uint8 to float32 793 |                 im /= 255.0 # 0 - 255 to 0.0 - 1.0 794 |                 ims.append(im) 795 | 796 |             pred_cls2 = model(torch.Tensor(ims).to(d.device)).argmax(1) # classifier prediction 797 |             x[i] = x[i][pred_cls1 == pred_cls2] # retain matching class detections 798 | 799 |     return x 800 | 801 | 802 | def fitness(x): 803 |     # Returns fitness (for use with results.txt or evolve.txt) 804 |     w = [0.0, 0.0, 0.1, 0.9] # weights for [P, R, mAP@0.5, mAP@0.5:0.95] 805 |     return (x[:, :4] * w).sum(1) 806 | 807 | 808 | def output_to_target(output, width, height): 809 |     """ 810 |     Convert a YOLO model output to target format 811 |     [batch_id, class_id, x, y, w, h, conf] 812 |     """ 813 |     if isinstance(output, torch.Tensor): 814 |         output = output.cpu().numpy() 815 | 816 |     targets = [] 817 |     for i, o in enumerate(output): 818 |         if o is not None: 819 |             for pred in o: 820 |                 box = pred[:4] 821 |                 w = (box[2] - box[0]) / width 822 |                 h = (box[3] - box[1]) / height 823 |                 x = box[0] / width + w / 2 824 |                 y = box[1] / height + h / 2 825 |                 conf = pred[4] 826 |                 cls = int(pred[5]) 827 | 828 |                 targets.append([i, cls, x, y, w, h, conf]) 829 | 830 |     return np.array(targets) 831 | 832 | 833 | # Plotting functions --------------------------------------------------------------------------------------------------- 834 | def butter_lowpass_filtfilt(data, cutoff=1500, fs=50000, order=5): 835 |     # https://stackoverflow.com/questions/28536191/how-to-filter-smooth-with-scipy-numpy 836 |     def butter_lowpass(cutoff, fs, order): 837 |         nyq = 0.5 * fs 838 |         normal_cutoff = cutoff / nyq 839 |         b, a = butter(order, normal_cutoff, btype='low', analog=False) 840 |         return b, a 841 | 842 |     b, a = butter_lowpass(cutoff, fs, order=order) 843 |     return filtfilt(b, a, data) # forward-backward filter 844 | 845 | 846 | def cv2AddChineseText(img, text, position, textColor=(0, 255, 0), textSize=30): 847 |     if (isinstance(img, np.ndarray)): # check whether this is an OpenCV image (BGR ndarray) 848 |         img = Image.fromarray(cv2.cvtColor(img, cv2.COLOR_BGR2RGB)) 849 |     # create a drawing handle for the image 850 |     draw = ImageDraw.Draw(img) 851 |     # font style 852 |     fontStyle = ImageFont.truetype( 853 |         "simsun.ttc", textSize, encoding="utf-8")
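    # NOTE: cv2.putText only supports Hershey fonts and cannot draw Chinese/CJK glyphs,
    # so the image is converted to a PIL Image above, the label is rendered below with a
    # TrueType font, and the result is converted back to a BGR ndarray before returning.
    # "simsun.ttc" is assumed to be resolvable on the system (it ships with Windows);
    # on Linux/macOS a different font file or an explicit font path will likely be needed.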
854 |     # draw the text 855 |     draw.text(position, text, textColor, font=fontStyle) 856 |     # convert back to an OpenCV (BGR) image 857 |     return cv2.cvtColor(np.asarray(img), cv2.COLOR_RGB2BGR) 858 | 859 | def plot_one_box(x, img, color=None, label=None, line_thickness=None): 860 |     # Plots one bounding box on image img 861 |     tl = line_thickness or round(0.002 * (img.shape[0] + img.shape[1]) / 2) + 1 # line/font thickness 862 |     color = color or [random.randint(0, 255) for _ in range(3)] 863 |     c1, c2 = (int(x[0]), int(x[1])), (int(x[2]), int(x[3])) 864 |     cv2.rectangle(img, c1, c2, color, thickness=tl, lineType=cv2.LINE_AA) 865 |     if label: 866 |         tf = max(tl - 1, 1) # font thickness 867 |         t_size = cv2.getTextSize(label, 0, fontScale=tl / 3, thickness=tf)[0] 868 |         c2 = c1[0] + t_size[0], c1[1] - t_size[1] - 3 869 |         cv2.rectangle(img, c1, c2, color, -1, cv2.LINE_AA) # filled 870 |         img = cv2AddChineseText(img, label, (c1[0], c1[1] - 14), textColor=(255, 255, 255), textSize=15) 871 |         font = cv2.FONT_HERSHEY_SIMPLEX 872 |         cv2.putText(img, "YOLO v5 by HuBin", (40, 40), font, 0.1, (0, 255, 0), 1) 873 |         # cv2.putText(img, label, (c1[0], c1[1] - 2), 0, tl / 3, [225, 255, 255], thickness=tf, lineType=cv2.LINE_AA) 874 | 875 |     return img 876 | 877 | def plot_wh_methods(): # from utils.utils import *; plot_wh_methods() 878 |     # Compares the two methods for width-height anchor multiplication 879 |     # https://github.com/ultralytics/yolov3/issues/168 880 |     x = np.arange(-4.0, 4.0, .1) 881 |     ya = np.exp(x) 882 |     yb = torch.sigmoid(torch.from_numpy(x)).numpy() * 2 883 | 884 |     fig = plt.figure(figsize=(6, 3), dpi=150) 885 |     plt.plot(x, ya, '.-', label='yolo method') 886 |     plt.plot(x, yb ** 2, '.-', label='^2 power method') 887 |     plt.plot(x, yb ** 2.5, '.-', label='^2.5 power method') 888 |     plt.xlim(left=-4, right=4) 889 |     plt.ylim(bottom=0, top=6) 890 |     plt.xlabel('input') 891 |     plt.ylabel('output') 892 |     plt.legend() 893 |     fig.tight_layout() 894 |     fig.savefig('comparison.png', dpi=200) 895 | 896 | 897 | def plot_images(images, targets, paths=None, fname='images.jpg', names=None, max_size=640, max_subplots=16): 898 |     tl = 3 # line thickness 899 |     tf = max(tl - 1, 1) # font thickness 900 |     if os.path.isfile(fname): # do not overwrite 901 |         return None 902 | 903 |     if isinstance(images, torch.Tensor): 904 |         images = images.cpu().numpy() 905 | 906 |     if isinstance(targets, torch.Tensor): 907 |         targets = targets.cpu().numpy() 908 | 909 |     # un-normalise 910 |     if np.max(images[0]) <= 1: 911 |         images *= 255 912 | 913 |     bs, _, h, w = images.shape # batch size, _, height, width 914 |     bs = min(bs, max_subplots) # limit plot images 915 |     ns = np.ceil(bs ** 0.5) # number of subplots (square) 916 | 917 |     # Check if we should resize 918 |     scale_factor = max_size / max(h, w) 919 |     if scale_factor < 1: 920 |         h = math.ceil(scale_factor * h) 921 |         w = math.ceil(scale_factor * w) 922 | 923 |     # Empty array for output 924 |     mosaic = np.full((int(ns * h), int(ns * w), 3), 255, dtype=np.uint8) 925 | 926 |     # Fix class - colour map 927 |     prop_cycle = plt.rcParams['axes.prop_cycle'] 928 |     # https://stackoverflow.com/questions/51350872/python-from-color-name-to-rgb 929 |     hex2rgb = lambda h: tuple(int(h[1 + i:1 + i + 2], 16) for i in (0, 2, 4)) 930 |     color_lut = [hex2rgb(h) for h in prop_cycle.by_key()['color']] 931 | 932 |     for i, img in enumerate(images): 933 |         if i == max_subplots: # if last batch has fewer images than we expect 934 |             break 935 | 936 |         block_x = int(w * (i // ns)) 937 |         block_y = int(h * (i % ns)) 938 | 939 |         img = img.transpose(1, 2, 0) 940 |         if scale_factor < 1: 941
| img = cv2.resize(img, (w, h)) 942 | 943 | mosaic[block_y:block_y + h, block_x:block_x + w, :] = img 944 | if len(targets) > 0: 945 | image_targets = targets[targets[:, 0] == i] 946 | boxes = xywh2xyxy(image_targets[:, 2:6]).T 947 | classes = image_targets[:, 1].astype('int') 948 | gt = image_targets.shape[1] == 6 # ground truth if no conf column 949 | conf = None if gt else image_targets[:, 6] # check for confidence presence (gt vs pred) 950 | 951 | boxes[[0, 2]] *= w 952 | boxes[[0, 2]] += block_x 953 | boxes[[1, 3]] *= h 954 | boxes[[1, 3]] += block_y 955 | for j, box in enumerate(boxes.T): 956 | cls = int(classes[j]) 957 | color = color_lut[cls % len(color_lut)] 958 | cls = names[cls] if names else cls 959 | if gt or conf[j] > 0.3: # 0.3 conf thresh 960 | label = '%s' % cls if gt else '%s %.1f' % (cls, conf[j]) 961 | plot_one_box(box, mosaic, label=label, color=color, line_thickness=tl) 962 | 963 | # Draw image filename labels 964 | if paths is not None: 965 | label = os.path.basename(paths[i])[:40] # trim to 40 char 966 | t_size = cv2.getTextSize(label, 0, fontScale=tl / 3, thickness=tf)[0] 967 | cv2.putText(mosaic, label, (block_x + 5, block_y + t_size[1] + 5), 0, tl / 3, [220, 220, 220], thickness=tf, 968 | lineType=cv2.LINE_AA) 969 | 970 | # Image border 971 | cv2.rectangle(mosaic, (block_x, block_y), (block_x + w, block_y + h), (255, 255, 255), thickness=3) 972 | 973 | if fname is not None: 974 | mosaic = cv2.resize(mosaic, (int(ns * w * 0.5), int(ns * h * 0.5)), interpolation=cv2.INTER_AREA) 975 | cv2.imwrite(fname, cv2.cvtColor(mosaic, cv2.COLOR_BGR2RGB)) 976 | 977 | return mosaic 978 | 979 | 980 | def plot_lr_scheduler(optimizer, scheduler, epochs=300): 981 | # Plot LR simulating training for full epochs 982 | optimizer, scheduler = copy(optimizer), copy(scheduler) # do not modify originals 983 | y = [] 984 | for _ in range(epochs): 985 | scheduler.step() 986 | y.append(optimizer.param_groups[0]['lr']) 987 | plt.plot(y, '.-', label='LR') 988 | plt.xlabel('epoch') 989 | plt.ylabel('LR') 990 | plt.grid() 991 | plt.xlim(0, epochs) 992 | plt.ylim(0) 993 | plt.tight_layout() 994 | plt.savefig('LR.png', dpi=200) 995 | 996 | 997 | def plot_test_txt(): # from utils.utils import *; plot_test() 998 | # Plot test.txt histograms 999 | x = np.loadtxt('test.txt', dtype=np.float32) 1000 | box = xyxy2xywh(x[:, :4]) 1001 | cx, cy = box[:, 0], box[:, 1] 1002 | 1003 | fig, ax = plt.subplots(1, 1, figsize=(6, 6), tight_layout=True) 1004 | ax.hist2d(cx, cy, bins=600, cmax=10, cmin=0) 1005 | ax.set_aspect('equal') 1006 | plt.savefig('hist2d.png', dpi=300) 1007 | 1008 | fig, ax = plt.subplots(1, 2, figsize=(12, 6), tight_layout=True) 1009 | ax[0].hist(cx, bins=600) 1010 | ax[1].hist(cy, bins=600) 1011 | plt.savefig('hist1d.png', dpi=200) 1012 | 1013 | 1014 | def plot_targets_txt(): # from utils.utils import *; plot_targets_txt() 1015 | # Plot targets.txt histograms 1016 | x = np.loadtxt('targets.txt', dtype=np.float32).T 1017 | s = ['x targets', 'y targets', 'width targets', 'height targets'] 1018 | fig, ax = plt.subplots(2, 2, figsize=(8, 8), tight_layout=True) 1019 | ax = ax.ravel() 1020 | for i in range(4): 1021 | ax[i].hist(x[i], bins=100, label='%.3g +/- %.3g' % (x[i].mean(), x[i].std())) 1022 | ax[i].legend() 1023 | ax[i].set_title(s[i]) 1024 | plt.savefig('targets.jpg', dpi=200) 1025 | 1026 | 1027 | def plot_study_txt(f='study.txt', x=None): # from utils.utils import *; plot_study_txt() 1028 | # Plot study.txt generated by test.py 1029 | fig, ax = plt.subplots(2, 4, figsize=(10, 6), 
tight_layout=True) 1030 | ax = ax.ravel() 1031 | 1032 | fig2, ax2 = plt.subplots(1, 1, figsize=(8, 4), tight_layout=True) 1033 | for f in ['coco_study/study_coco_yolov5%s.txt' % x for x in ['s', 'm', 'l', 'x']]: 1034 | y = np.loadtxt(f, dtype=np.float32, usecols=[0, 1, 2, 3, 7, 8, 9], ndmin=2).T 1035 | x = np.arange(y.shape[1]) if x is None else np.array(x) 1036 | s = ['P', 'R', 'mAP@.5', 'mAP@.5:.95', 't_inference (ms/img)', 't_NMS (ms/img)', 't_total (ms/img)'] 1037 | for i in range(7): 1038 | ax[i].plot(x, y[i], '.-', linewidth=2, markersize=8) 1039 | ax[i].set_title(s[i]) 1040 | 1041 | j = y[3].argmax() + 1 1042 | ax2.plot(y[6, :j], y[3, :j] * 1E2, '.-', linewidth=2, markersize=8, 1043 | label=Path(f).stem.replace('study_coco_', '').replace('yolo', 'YOLO')) 1044 | 1045 | ax2.plot(1E3 / np.array([209, 140, 97, 58, 35, 18]), [33.5, 39.1, 42.5, 45.9, 49., 50.5], 1046 | 'k.-', linewidth=2, markersize=8, alpha=.25, label='EfficientDet') 1047 | ax2.set_xlim(0, 30) 1048 | ax2.set_ylim(25, 50) 1049 | ax2.set_xlabel('GPU Latency (ms)') 1050 | ax2.set_ylabel('COCO AP val') 1051 | ax2.legend(loc='lower right') 1052 | ax2.grid() 1053 | plt.savefig('study_mAP_latency.png', dpi=300) 1054 | plt.savefig(f.replace('.txt', '.png'), dpi=200) 1055 | 1056 | 1057 | def plot_labels(labels): 1058 | # plot dataset labels 1059 | c, b = labels[:, 0], labels[:, 1:].transpose() # classees, boxes 1060 | 1061 | def hist2d(x, y, n=100): 1062 | xedges, yedges = np.linspace(x.min(), x.max(), n), np.linspace(y.min(), y.max(), n) 1063 | hist, xedges, yedges = np.histogram2d(x, y, (xedges, yedges)) 1064 | xidx = np.clip(np.digitize(x, xedges) - 1, 0, hist.shape[0] - 1) 1065 | yidx = np.clip(np.digitize(y, yedges) - 1, 0, hist.shape[1] - 1) 1066 | return np.log(hist[xidx, yidx]) 1067 | 1068 | fig, ax = plt.subplots(2, 2, figsize=(8, 8), tight_layout=True) 1069 | ax = ax.ravel() 1070 | ax[0].hist(c, bins=int(c.max() + 1)) 1071 | ax[0].set_xlabel('classes') 1072 | ax[1].scatter(b[0], b[1], c=hist2d(b[0], b[1], 90), cmap='jet') 1073 | ax[1].set_xlabel('x') 1074 | ax[1].set_ylabel('y') 1075 | ax[2].scatter(b[2], b[3], c=hist2d(b[2], b[3], 90), cmap='jet') 1076 | ax[2].set_xlabel('width') 1077 | ax[2].set_ylabel('height') 1078 | plt.savefig('labels.png', dpi=200) 1079 | 1080 | 1081 | def plot_evolution_results(hyp): # from utils.utils import *; plot_evolution_results(hyp) 1082 | # Plot hyperparameter evolution results in evolve.txt 1083 | x = np.loadtxt('evolve.txt', ndmin=2) 1084 | f = fitness(x) 1085 | # weights = (f - f.min()) ** 2 # for weighted results 1086 | plt.figure(figsize=(12, 10), tight_layout=True) 1087 | matplotlib.rc('font', **{'size': 8}) 1088 | for i, (k, v) in enumerate(hyp.items()): 1089 | y = x[:, i + 7] 1090 | # mu = (y * weights).sum() / weights.sum() # best weighted result 1091 | mu = y[f.argmax()] # best single result 1092 | plt.subplot(4, 5, i + 1) 1093 | plt.plot(mu, f.max(), 'o', markersize=10) 1094 | plt.plot(y, f, '.') 1095 | plt.title('%s = %.3g' % (k, mu), fontdict={'size': 9}) # limit to 40 characters 1096 | print('%15s: %.3g' % (k, mu)) 1097 | plt.savefig('evolve.png', dpi=200) 1098 | 1099 | 1100 | def plot_results_overlay(start=0, stop=0): # from utils.utils import *; plot_results_overlay() 1101 | # Plot training 'results*.txt', overlaying train and val losses 1102 | s = ['train', 'train', 'train', 'Precision', 'mAP@0.5', 'val', 'val', 'val', 'Recall', 'mAP@0.5:0.95'] # legends 1103 | t = ['GIoU', 'Objectness', 'Classification', 'P-R', 'mAP-F1'] # titles 1104 | for f in 
sorted(glob.glob('results*.txt') + glob.glob('../../Downloads/results*.txt')): 1105 | results = np.loadtxt(f, usecols=[2, 3, 4, 8, 9, 12, 13, 14, 10, 11], ndmin=2).T 1106 | n = results.shape[1] # number of rows 1107 | x = range(start, min(stop, n) if stop else n) 1108 | fig, ax = plt.subplots(1, 5, figsize=(14, 3.5), tight_layout=True) 1109 | ax = ax.ravel() 1110 | for i in range(5): 1111 | for j in [i, i + 5]: 1112 | y = results[j, x] 1113 | ax[i].plot(x, y, marker='.', label=s[j]) 1114 | # y_smooth = butter_lowpass_filtfilt(y) 1115 | # ax[i].plot(x, np.gradient(y_smooth), marker='.', label=s[j]) 1116 | 1117 | ax[i].set_title(t[i]) 1118 | ax[i].legend() 1119 | ax[i].set_ylabel(f) if i == 0 else None # add filename 1120 | fig.savefig(f.replace('.txt', '.png'), dpi=200) 1121 | 1122 | 1123 | def plot_results(start=0, stop=0, bucket='', id=(), labels=()): # from utils.utils import *; plot_results() 1124 | # Plot training 'results*.txt' as seen in https://github.com/ultralytics/yolov5#reproduce-our-training 1125 | fig, ax = plt.subplots(2, 5, figsize=(12, 6)) 1126 | ax = ax.ravel() 1127 | s = ['GIoU', 'Objectness', 'Classification', 'Precision', 'Recall', 1128 | 'val GIoU', 'val Objectness', 'val Classification', 'mAP@0.5', 'mAP@0.5:0.95'] 1129 | if bucket: 1130 | os.system('rm -rf storage.googleapis.com') 1131 | files = ['https://storage.googleapis.com/%s/results%g.txt' % (bucket, x) for x in id] 1132 | else: 1133 | files = glob.glob('results*.txt') + glob.glob('../../Downloads/results*.txt') 1134 | for fi, f in enumerate(files): 1135 | try: 1136 | results = np.loadtxt(f, usecols=[2, 3, 4, 8, 9, 12, 13, 14, 10, 11], ndmin=2).T 1137 | n = results.shape[1] # number of rows 1138 | x = range(start, min(stop, n) if stop else n) 1139 | for i in range(10): 1140 | y = results[i, x] 1141 | if i in [0, 1, 2, 5, 6, 7]: 1142 | y[y == 0] = np.nan # dont show zero loss values 1143 | # y /= y[0] # normalize 1144 | label = labels[fi] if len(labels) else Path(f).stem 1145 | ax[i].plot(x, y, marker='.', label=label, linewidth=2, markersize=8) 1146 | ax[i].set_title(s[i]) 1147 | # if i in [5, 6, 7]: # share train and val loss y axes 1148 | # ax[i].get_shared_y_axes().join(ax[i], ax[i - 5]) 1149 | except: 1150 | print('Warning: Plotting error for %s, skipping file' % f) 1151 | 1152 | fig.tight_layout() 1153 | ax[1].legend() 1154 | fig.savefig('results.png', dpi=200) 1155 | --------------------------------------------------------------------------------
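The inline "# from utils.utils import *; ..." comments above indicate that several of these helpers can be run standalone as well as being imported by the training and testing scripts. A minimal usage sketch, assuming the working directory is the repository root; the dataset and checkpoint paths are placeholders and must match your own layout:

```python
# Illustrative standalone use of a few utils.utils helpers (paths are placeholders).
from utils.utils import kmean_anchors, strip_optimizer, plot_results

# Re-cluster anchors for a custom dataset; 'path' must be something
# LoadImagesAndLabels accepts (an image folder or a list file).
anchors = kmean_anchors(path='./datasets/traindata/images/train/', n=9,
                        img_size=(640, 640), thr=0.20, gen=1000)

# Drop the optimizer state from the final checkpoint to roughly halve its size.
strip_optimizer('weights/best.pt')

# Plot the curves from any results*.txt in the working directory to results.png.
plot_results()
```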