├── .gitignore
├── README.md
├── config
│   ├── data_cfg.yaml
│   ├── hyps
│   │   └── hyp_finetune.yaml
│   └── train_cfg.yaml
├── detect.py
├── models
│   ├── __init__.py
│   ├── common.py
│   ├── experimental.py
│   ├── tf.py
│   ├── yolo.py
│   └── yolov5s.yaml
├── pretrains
│   └── pretrain.pt
├── requirements.txt
├── train.py
├── utils
│   ├── __init__.py
│   ├── activations.py
│   ├── augmentations.py
│   ├── autoanchor.py
│   ├── callbacks.py
│   ├── datasets.py
│   ├── downloads.py
│   ├── general.py
│   ├── loggers
│   │   ├── __init__.py
│   │   └── wandb
│   │       ├── README.md
│   │       ├── __init__.py
│   │       ├── log_dataset.py
│   │       ├── sweep.py
│   │       ├── sweep.yaml
│   │       └── wandb_utils.py
│   ├── loss.py
│   ├── metrics.py
│   ├── plots.py
│   └── torch_utils.py
└── val.py
--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
1 | dataset
2 | .idea
3 | *.pyc
4 | results
5 | dataset.zip
6 | best.pt
7 | *.zip
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | #### Table of contents
2 | 1. [Introduction](#introduction)
3 | 2. [Dataset](#dataset)
4 | 3. [Model & Metrics](#model--metrics)
5 | 4. [How to Run](#how-to-run)
6 |    - [Quickstart](#quickstart)
7 |    - [Install](#install-requirements)
8 |    - [Training](#training)
9 |    - [Evaluation](#evaluation)
10 |    - [Detection](#detection)
11 | 
12 | 
13 | 
14 | # DATA COMPETITION
15 | 
16 | ![](https://drive.google.com/uc?export=view&id=1DEpMpgsX-MU3-de4Gqoa7Nk3VD12_Vwk)
17 | 
18 | The COVID-19 pandemic, caused by the SARS-CoV-2 virus, is still going strong, infecting hundreds of millions of people and killing millions. Face masks reduce transmission by preventing aerosols and droplets from spreading too far into the atmosphere, so there is a growing demand for automated systems that can detect people who are not wearing masks or are wearing them incorrectly. This competition was designed to address that problem, and it differs from any competition that has come before it: the model is fixed. Participants receive the model code and the training configuration that the organizers use; the task is to improve the model's performance with data processing and data generation techniques, then submit the dataset to the organizing team for training and evaluation on the private test set. The winner is the team with the highest score on the private test set.
19 | 
20 | ## Dataset
21 | * A dataset of 1100 images will be sent to you. This is an object detection dataset consisting of images of employees at the office.
22 | We have annotated the dataset with 3 labels: no mask, mask, and incorrect mask, encoded as 0, 1, and 2 respectively.
23 | 
24 | * The dataset has been divided into three parts for you: train, valid, and public test. We have also prepared a private test set to evaluate each candidate's model.
25 | The private test set will be made public after the contest ends; the public test set gives you a basic idea of what it looks like. **Download the dataset** [here](https://drive.google.com/file/d/1wiu8nb7zFu9gxJRKlhs9lWO7ZyN_Tssh/view?usp=sharing)
26 | 
27 | * To improve the model's performance, you can re-label the data and employ data augmentation to generate more images (up to 3000 in total). Labels use the standard YOLO text format, shown in the example after the table below.
28 | 
29 | The number of boxes for each label in each split is shown below:
30 | |             | No mask | Mask | Incorrect mask |
31 | |-------------|:-------:|:----:|:--------------:|
32 | | Train       | 308     | 882  | 51             |
33 | | Val         | 97      | 190  | 9              |
34 | | Public_test | 47      | 95   | 13             |
35 | 
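Each image has a matching `.txt` file in the standard YOLO label format: one row per box, written as `class x_center y_center width height`, with coordinates normalized by the image width and height. A made-up example with one mask box (class 1) and one no-mask box (class 0):

```
1 0.481250 0.372917 0.118750 0.162500
0 0.726562 0.410417 0.103125 0.145833
```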
36 | ## Model & Metrics
37 | * The challenge is defined as an object detection challenge. In the competition,
38 | we use [YOLOv5s](https://github.com/ultralytics/yolov5/releases) together with a model
39 | pre-trained on an easy mask dataset, which greatly reduces training time.
40 | * We fix all [hyperparameters](config/hyps/hyp_finetune.yaml) of the model
41 | and **do not use any augmentation techniques** in the source code.
42 | Therefore, each participant needs to build the best possible dataset by relabeling
43 | incorrect labels, splitting train/val, applying offline augmentation (a sketch follows the Training section), adding new data, etc.
44 | 
45 | * During training, early stopping with patience set to 100 epochs
46 | is used, tracking the validation set's wAP@0.5. Details of the wAP@0.5 metric:
48 | wAP@0.5 = weighted_AP@0.5 = 0.2 * AP50_w + 0.3 * AP50_nw + 0.5 * AP50_wi 49 |

50 | 
51 | Where:
52 | AP50_w: AP@0.5 on valid-mask boxes (class `mask`)
53 | AP50_nw: AP@0.5 on no-mask boxes (class `no_mask`)
54 | AP50_wi: AP@0.5 on invalid-mask boxes (class `incorrect_mask`)
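As a quick sanity check of the weighting, a minimal sketch in Python; the three per-class AP values are invented for illustration:

```python
# Hypothetical per-class AP@0.5 values (illustration only)
ap50_w, ap50_nw, ap50_wi = 0.90, 0.80, 0.50

# The rare incorrect-mask class carries half of the total weight
wap50 = 0.2 * ap50_w + 0.3 * ap50_nw + 0.5 * ap50_wi
print(f"wAP@0.5 = {wap50:.3f}")  # 0.180 + 0.240 + 0.250 = 0.670
```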
55 | 
56 | * The wAP@0.5 metric is also used as the main metric
57 | to evaluate participants' submissions on the private test set.
58 | 
59 | 
60 | ## How to Run
61 | ### Quickstart
62 | Click the image below
63 | 
64 | 
65 | Open In Colab
66 | 
67 | 
68 | ### Install requirements
69 | 
70 | * All requirements are included in [requirements.txt](https://github.com/fsoft-ailab/Data-Competition/blob/main/requirements.txt)
71 | 
72 | 
73 | * Run the script below to clone the repository and install all requirements
74 | 
75 | 
76 | ```bash
77 | git clone https://github.com/fsoft-ailab/Data-Competition
78 | cd Data-Competition
79 | pip3 install -r requirements.txt
80 | ```
81 | 
82 | ### Training
83 | 
84 | 
85 | * Put your dataset into the Data-Competition folder.
86 | Structure your dataset folder as shown below:
87 | ```bash
88 | folder-name
89 | ├── images
90 | │   ├── train
91 | │   │   ├── train_img1.jpg
92 | │   │   ├── train_img2.jpg
93 | │   │   └── ...
94 | │   │
95 | │   └── val
96 | │       ├── val_img1.jpg
97 | │       ├── val_img2.jpg
98 | │       └── ...
99 | │
100 | └── labels
101 |     ├── train
102 |     │   ├── train_img1.txt
103 |     │   ├── train_img2.txt
104 |     │   └── ...
105 |     │
106 |     └── val
107 |         ├── val_img1.txt
108 |         ├── val_img2.txt
109 |         └── ...
110 | 
111 | ```
112 | * Change the relative paths to the train and val image folders in the `config/data_cfg.yaml` [file](config/data_cfg.yaml).
113 | 
114 | * [train_cfg.yaml](config/train_cfg.yaml) is where the model is configured for training.
115 | You should not change these parameters; doing so will produce invalid results. Training results are saved
116 | in `results/train/`.
117 | * Run the script below to train the model. Specify a particular name to identify your experiment:
118 | ```bash
119 | python3 train.py --batch-size 64 --device 0 --name <experiment_name>
120 | ```
121 | `Note`: If you get an out-of-memory error, decrease the batch size to a smaller power of 2, such as 32 or 16.
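Since every augmentation hyperparameter in [hyp_finetune.yaml](config/hyps/hyp_finetune.yaml) is fixed at 0, any augmentation has to be applied offline, while you build the dataset folder above. A minimal sketch of one such step, a horizontal flip that keeps the YOLO labels consistent (the paths and file names are illustrative, not part of the competition code):

```python
import cv2

def hflip_pair(img_path, label_path, out_img_path, out_label_path):
    """Horizontally flip one image and mirror its YOLO-format labels."""
    cv2.imwrite(out_img_path, cv2.flip(cv2.imread(img_path), 1))  # flipCode 1 = flip around the vertical axis
    with open(label_path) as src, open(out_label_path, 'w') as dst:
        for line in src:
            cls, x, y, w, h = line.split()
            # Only x_center changes under a horizontal flip; class and box size stay the same
            dst.write(f"{cls} {1.0 - float(x):.6f} {y} {w} {h}\n")

# Hypothetical file pair following the folder structure above
hflip_pair('dataset/images/train/train_img1.jpg', 'dataset/labels/train/train_img1.txt',
           'dataset/images/train/train_img1_flip.jpg', 'dataset/labels/train/train_img1_flip.txt')
```

Remember that generated images count toward the 3000-image limit.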
122 | 
123 | ### Evaluation
124 | * Run the script below to evaluate on a particular dataset.
125 | * The `--task` value must be one of `train`, `val`, or `test`, which respectively
126 | evaluate on the training set, validation set, or public test set.
127 | * `Note`: Specify the relative path to the image folder you want to
128 | evaluate in the `config/data_cfg.yaml` [file](config/data_cfg.yaml).
129 | 
130 | ```bash
131 | python3 val.py --weights <weight_path> --task test --name <experiment_name> --batch-size 64 --device 0
132 |                                               val
133 |                                               train
134 | ```
135 | * Results are saved at `results/evaluate/<task>/<name>`.
136 | 
137 | ### Detection
138 | 
139 | * You can use this script to run inference on a particular folder
140 | 
141 | * Results are saved at `<dir>`.
142 | ```bash
143 | python3 detect.py --weights <weight_path> --source <folder_path> --dir <save_dir> --device 0
144 | ```
145 | 
146 | * You can find more default arguments at [detect.py](https://github.com/fsoft-ailab/Data-Competition/blob/main/detect.py)
147 | 
148 | ## References
149 | * Our source code is based on Ultralytics' implementation: https://github.com/ultralytics/yolov5
150 | * Scaled-YOLOv4: https://github.com/WongKinYiu/ScaledYOLOv4
151 | 
--------------------------------------------------------------------------------
/config/data_cfg.yaml:
--------------------------------------------------------------------------------
1 | train: ./dataset/images/train # relative path to train images
2 | val: ./dataset/images/val # relative path to val images
3 | test: ./dataset/images/public_test # relative path to public test images
4 | 
5 | # Classes
6 | # Please don't change
7 | num_class: 3 # number of classes
8 | # names: ['0', '1', '2']
9 | names: ['no_mask', 'mask', 'incorrect_mask']
10 | 
11 | 
--------------------------------------------------------------------------------
/config/hyps/hyp_finetune.yaml:
--------------------------------------------------------------------------------
1 | # Please don't change
2 | 
3 | lr0: 0.0032
4 | lrf: 0.2
5 | momentum: 0.937
6 | weight_decay: 0.0005
7 | warmup_epochs: 3.0
8 | warmup_momentum: 0.8
9 | warmup_bias_lr: 0.1
10 | box: 0.05
11 | cls: 0.5
12 | cls_pw: 1.0
13 | obj: 1.0
14 | obj_pw: 1.0
15 | iou_t: 0.20
16 | anchor_t: 4.0
17 | 
18 | # Augment
19 | fl_gamma: 0.0
20 | hsv_h: 0.0
21 | hsv_s: 0.0
22 | hsv_v: 0.0
23 | degrees: 0.0
24 | translate: 0.0
25 | scale: 0.0
26 | shear: 0.0
27 | perspective: 0.0
28 | flipud: 0.0
29 | fliplr: 0.0
30 | mosaic: 0.0
31 | mixup: 0.0
32 | copy_paste: 0.0
--------------------------------------------------------------------------------
/config/train_cfg.yaml:
--------------------------------------------------------------------------------
1 | # Please don't change any parameters
2 | 
3 | weights: 'pretrains/pretrain.pt' # path to pretrained model weight
4 | model_cfg: 'models/yolov5s.yaml' # path to model config
5 | data_cfg: 'config/data_cfg.yaml' # path to data config
6 | hyp: 'config/hyps/hyp_finetune.yaml' # path to hyper parameters config
7 | project: 'results/train'
8 | artifact_alias: 'latest'
9 | epochs: 100
10 | img_size: 640
11 | rect: False
12 | resume: False
13 | nosave: False
14 | noval: False
15 | noautoanchor: False
16 | evolve: False
17 | bucket: ''
18 | image_weights: False
19 | multi_scale: False
20 | single_cls: False
21 | adam: False
22 | sync_bn: False
23 | entity: ''
24 | exist_ok: False
25 | quad: False
26 | label_smoothing: 0.0
27 | linear_lr: False
28 | bbox_interval: -1
29 | save_period: -1
30 | patience: 100
--------------------------------------------------------------------------------
/detect.py:
--------------------------------------------------------------------------------
1 | import argparse
2 | import sys
3 | from pathlib import Path
4 | 
5 | import cv2
6 | import torch
7 | 
8 | from models.experimental import attempt_load
9 | from utils.datasets import LoadImages
10 | from utils.general import check_img_size, check_requirements, colorstr, is_ascii, \
11 |     non_max_suppression, scale_coords, xyxy2xywh, set_logging, increment_path, \
12 |     save_one_box
13 | from utils.plots import Annotator, colors
14 | from utils.torch_utils import select_device, time_sync
15 | 
16 | 
17 | FILE = Path(__file__).resolve()
18 | sys.path.append(FILE.parents[0].as_posix())
19 | 
20 | 
21 | @torch.no_grad()
22 | def run(weights, # 
model.pt path(s) 23 | source, # file/dir 24 | img_size, # inference size (pixels) 25 | conf_threshold, # confidence threshold 26 | iou_threshold, # NMS IOU threshold 27 | max_det, # maximum detections per image 28 | device, # cuda device, i.e. 0 or 0,1,2,3 or cpu 29 | view_img, # show results 30 | save_txt, # save results to *.txt 31 | save_conf, # save confidences in --save-txt labels 32 | save_crop, # save cropped prediction boxes 33 | nosave, # do not save images 34 | classes, # filter by class: --class 0, or --class 0 2 3 35 | agnostic_nms, # class-agnostic NMS 36 | augment, # augmented inference 37 | visualize, # visualize features 38 | dir, # save results to results/detect/ 39 | exist_ok, # existing results/detect/ ok, do not increment 40 | line_thickness, # bounding box thickness (pixels) 41 | hide_labels, # hide labels 42 | hide_conf, # hide confidences 43 | half, # use FP16 half-precision inference 44 | ): 45 | save_img = not nosave and not source.endswith('.txt') # save inference images 46 | 47 | # Directories 48 | save_dir = increment_path(Path(dir), exist_ok=exist_ok) # increment run 49 | (save_dir / 'labels' if save_txt else save_dir).mkdir(parents=True, exist_ok=True) # make dir 50 | 51 | # Initialize 52 | set_logging() 53 | device = select_device(device) 54 | half &= device.type != 'cpu' # half precision only supported on CUDA 55 | 56 | # Load model 57 | w = weights[0] if isinstance(weights, list) else weights 58 | suffix = Path(w).suffix.lower() 59 | assert suffix == ".pt" 60 | 61 | model = attempt_load(weights, map_location=device) # load FP32 model 62 | stride = int(model.stride.max()) # model stride 63 | names = model.module.names if hasattr(model, 'module') else model.names # get class names 64 | if half: 65 | model.half() # to FP16 66 | 67 | img_size = check_img_size(img_size, s=stride) # check image size 68 | ascii = is_ascii(names) # names are ascii (use PIL for UTF-8) 69 | 70 | # Dataloader 71 | dataset = LoadImages(source, img_size=img_size, stride=stride, auto=True) 72 | 73 | # Run inference 74 | if device.type != 'cpu': 75 | model(torch.zeros(1, 3, *img_size).to(device).type_as(next(model.parameters()))) # run once 76 | dt, seen = [0.0, 0.0, 0.0], 0 77 | for path, img, im0s, _ in dataset: 78 | t1 = time_sync() 79 | img = torch.from_numpy(img).to(device) 80 | img = img.half() if half else img.float() # uint8 to fp16/32 81 | img = img / 255.0 # 0 - 255 to 0.0 - 1.0 82 | if len(img.shape) == 3: 83 | img = img[None] # expand for batch dim 84 | t2 = time_sync() 85 | dt[0] += t2 - t1 86 | 87 | # Inference 88 | 89 | visualize = increment_path(save_dir / Path(path).stem, mkdir=True) if visualize else False 90 | pred = model(img, augment=augment, visualize=visualize)[0] 91 | t3 = time_sync() 92 | dt[1] += t3 - t2 93 | 94 | # NMS 95 | pred = non_max_suppression(pred, conf_threshold, iou_threshold, classes, agnostic_nms, max_det=max_det) 96 | dt[2] += time_sync() - t3 97 | 98 | # Process predictions 99 | for i, det in enumerate(pred): # per image 100 | seen += 1 101 | p, s, im0, frame = path, '', im0s.copy(), getattr(dataset, 'frame', 0) 102 | 103 | p = Path(p) # to Path 104 | save_path = str(save_dir / p.name) # img.jpg 105 | txt_path = str(save_dir / 'labels' / p.stem) + ('' if dataset.mode == 'image' else f'_{frame}') # img.txt 106 | s += '%gx%g ' % img.shape[2:] # print string 107 | gn = torch.tensor(im0.shape)[[1, 0, 1, 0]] # normalization gain whwh 108 | imc = im0.copy() if save_crop else im0 # for save_crop 109 | annotator = Annotator(im0, line_width=line_thickness, 
pil=not ascii) 110 | if len(det): 111 | # Rescale boxes from img_size to im0 size 112 | det[:, :4] = scale_coords(img.shape[2:], det[:, :4], im0.shape).round() 113 | 114 | # Print results 115 | for c in det[:, -1].unique(): 116 | n = (det[:, -1] == c).sum() # detections per class 117 | s += f"{n} {names[int(c)]}{'s' * (n > 1)}, " # add to string 118 | 119 | # Write results 120 | for *xyxy, conf, cls in reversed(det): 121 | if save_txt: # Write to file 122 | xywh = (xyxy2xywh(torch.tensor(xyxy).view(1, 4)) / gn).view(-1).tolist() # normalized xywh 123 | line = (cls, *xywh, conf) if save_conf else (cls, *xywh) # label format 124 | with open(txt_path + '.txt', 'a') as f: 125 | f.write(('%g ' * len(line)).rstrip() % line + '\n') 126 | 127 | if save_img or save_crop or view_img: # Add bbox to image 128 | c = int(cls) # integer class 129 | label = None if hide_labels else (names[c] if hide_conf else f'{names[c]} {conf:.2f}') 130 | annotator.box_label(xyxy, label, color=colors(c, True)) 131 | if save_crop: 132 | save_one_box(xyxy, imc, file=save_dir / 'crops' / names[c] / f'{p.stem}.jpg', BGR=True) 133 | 134 | # Print time (inference-only) 135 | print(f'{s}Done. ({t3 - t2:.3f}s)') 136 | 137 | im0 = annotator.result() 138 | # Save results (image with detections) 139 | if save_img: 140 | cv2.imwrite(save_path, im0) 141 | 142 | # Print results 143 | t = tuple(x / seen * 1E3 for x in dt) # speeds per image 144 | print(f'Speed: %.1fms pre-process, %.1fms inference, %.1fms NMS per image at shape {(1, 3, *img_size)}' % t) 145 | if save_txt or save_img: 146 | s = f"\n{len(list(save_dir.glob('labels/*.txt')))} labels saved to {save_dir / 'labels'}" if save_txt else '' 147 | print(f"Results saved to {save_dir}") 148 | 149 | 150 | def parser(): 151 | args = argparse.ArgumentParser() 152 | args.add_argument('--weights', type=str, help='specify your weight path', required=True) 153 | args.add_argument('--source', type=str, help='folder contain image', required=True) 154 | args.add_argument('--dir',type=str, help='save results to dir', required=True) 155 | args.add_argument('--conf-threshold', type=float, default=0.25, help='confidence threshold') 156 | args.add_argument('--iou-threshold', type=float, default=0.6, help='NMS IoU threshold') 157 | args.add_argument('--device', default='', help='cuda device, i.e. 
0 or 0,1,2,3 or cpu') 158 | args.add_argument('--save-txt', action='store_true', help='save results to *.txt') 159 | args.add_argument('--save-conf', action='store_true', help='save confidences in --save-txt labels') 160 | args.add_argument('--save-crop', action='store_true', help='save cropped prediction boxes') 161 | args.add_argument('--hide-labels', default=False, action='store_true', help='hide labels') 162 | args.add_argument('--hide-conf', default=False, action='store_true', help='hide confidences') 163 | args.add_argument('--half', action='store_true', help='use FP16 half-precision inference') 164 | args = args.parse_args() 165 | 166 | args.agnostic_nms = False 167 | args.augment = False 168 | args.classes = None 169 | args.exist_ok = False 170 | args.img_size = [640, 640] 171 | args.nosave = False 172 | args.view_img = False 173 | args.visualize = False 174 | args.max_det = 1000 175 | args.line_thickness = 2 176 | 177 | return args 178 | 179 | 180 | def main(opt): 181 | print(colorstr('detect: ') + ', '.join(f'{k}={v}' for k, v in vars(opt).items())) 182 | check_requirements(exclude=('tensorboard', 'thop')) 183 | run(**vars(opt)) 184 | 185 | 186 | if __name__ == "__main__": 187 | main(parser()) 188 | -------------------------------------------------------------------------------- /models/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/fsoft-ailab/Data-Competition/63741b681885467a563d5500e4658673d6d737d7/models/__init__.py -------------------------------------------------------------------------------- /models/common.py: -------------------------------------------------------------------------------- 1 | """ 2 | Source: YOLOv5 🚀 by Ultralytics https://github.com/ultralytics/yolov5 3 | Common modules 4 | """ 5 | 6 | import logging 7 | import math 8 | import warnings 9 | from copy import copy 10 | from pathlib import Path 11 | 12 | import numpy as np 13 | import pandas as pd 14 | import requests 15 | import torch 16 | import torch.nn as nn 17 | from PIL import Image 18 | from torch.cuda import amp 19 | 20 | from utils.datasets import exif_transpose, letterbox 21 | from utils.general import colorstr, increment_path, is_ascii, make_divisible, non_max_suppression, save_one_box, \ 22 | scale_coords, xyxy2xywh 23 | from utils.plots import Annotator, colors 24 | from utils.torch_utils import time_sync 25 | 26 | LOGGER = logging.getLogger(__name__) 27 | 28 | 29 | def autopad(k, p=None): # kernel, padding 30 | # Pad to 'same' 31 | if p is None: 32 | p = k // 2 if isinstance(k, int) else [x // 2 for x in k] # auto-pad 33 | return p 34 | 35 | 36 | class Conv(nn.Module): 37 | # Standard convolution 38 | def __init__(self, c1, c2, k=1, s=1, p=None, g=1, act=True): # ch_in, ch_out, kernel, stride, padding, groups 39 | super().__init__() 40 | self.conv = nn.Conv2d(c1, c2, k, s, autopad(k, p), groups=g, bias=False) 41 | self.bn = nn.BatchNorm2d(c2) 42 | self.act = nn.SiLU() if act is True else (act if isinstance(act, nn.Module) else nn.Identity()) 43 | 44 | def forward(self, x): 45 | return self.act(self.bn(self.conv(x))) 46 | 47 | def forward_fuse(self, x): 48 | return self.act(self.conv(x)) 49 | 50 | 51 | class DWConv(Conv): 52 | # Depth-wise convolution class 53 | def __init__(self, c1, c2, k=1, s=1, act=True): # ch_in, ch_out, kernel, stride, padding, groups 54 | super().__init__(c1, c2, k, s, g=math.gcd(c1, c2), act=act) 55 | 56 | 57 | class TransformerLayer(nn.Module): 58 | # Transformer layer 
https://arxiv.org/abs/2010.11929 (LayerNorm layers removed for better performance) 59 | def __init__(self, c, num_heads): 60 | super().__init__() 61 | self.q = nn.Linear(c, c, bias=False) 62 | self.k = nn.Linear(c, c, bias=False) 63 | self.v = nn.Linear(c, c, bias=False) 64 | self.ma = nn.MultiheadAttention(embed_dim=c, num_heads=num_heads) 65 | self.fc1 = nn.Linear(c, c, bias=False) 66 | self.fc2 = nn.Linear(c, c, bias=False) 67 | 68 | def forward(self, x): 69 | x = self.ma(self.q(x), self.k(x), self.v(x))[0] + x 70 | x = self.fc2(self.fc1(x)) + x 71 | return x 72 | 73 | 74 | class TransformerBlock(nn.Module): 75 | # Vision Transformer https://arxiv.org/abs/2010.11929 76 | def __init__(self, c1, c2, num_heads, num_layers): 77 | super().__init__() 78 | self.conv = None 79 | if c1 != c2: 80 | self.conv = Conv(c1, c2) 81 | self.linear = nn.Linear(c2, c2) # learnable position embedding 82 | self.tr = nn.Sequential(*[TransformerLayer(c2, num_heads) for _ in range(num_layers)]) 83 | self.c2 = c2 84 | 85 | def forward(self, x): 86 | if self.conv is not None: 87 | x = self.conv(x) 88 | b, _, w, h = x.shape 89 | p = x.flatten(2).unsqueeze(0).transpose(0, 3).squeeze(3) 90 | return self.tr(p + self.linear(p)).unsqueeze(3).transpose(0, 3).reshape(b, self.c2, w, h) 91 | 92 | 93 | class Bottleneck(nn.Module): 94 | # Standard bottleneck 95 | def __init__(self, c1, c2, shortcut=True, g=1, e=0.5): # ch_in, ch_out, shortcut, groups, expansion 96 | super().__init__() 97 | c_ = int(c2 * e) # hidden channels 98 | self.cv1 = Conv(c1, c_, 1, 1) 99 | self.cv2 = Conv(c_, c2, 3, 1, g=g) 100 | self.add = shortcut and c1 == c2 101 | 102 | def forward(self, x): 103 | return x + self.cv2(self.cv1(x)) if self.add else self.cv2(self.cv1(x)) 104 | 105 | 106 | class BottleneckCSP(nn.Module): 107 | # CSP Bottleneck https://github.com/WongKinYiu/CrossStagePartialNetworks 108 | def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5): # ch_in, ch_out, number, shortcut, groups, expansion 109 | super().__init__() 110 | c_ = int(c2 * e) # hidden channels 111 | self.cv1 = Conv(c1, c_, 1, 1) 112 | self.cv2 = nn.Conv2d(c1, c_, 1, 1, bias=False) 113 | self.cv3 = nn.Conv2d(c_, c_, 1, 1, bias=False) 114 | self.cv4 = Conv(2 * c_, c2, 1, 1) 115 | self.bn = nn.BatchNorm2d(2 * c_) # applied to cat(cv2, cv3) 116 | self.act = nn.LeakyReLU(0.1, inplace=True) 117 | self.m = nn.Sequential(*[Bottleneck(c_, c_, shortcut, g, e=1.0) for _ in range(n)]) 118 | 119 | def forward(self, x): 120 | y1 = self.cv3(self.m(self.cv1(x))) 121 | y2 = self.cv2(x) 122 | return self.cv4(self.act(self.bn(torch.cat((y1, y2), dim=1)))) 123 | 124 | 125 | class C3(nn.Module): 126 | # CSP Bottleneck with 3 convolutions 127 | def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5): # ch_in, ch_out, number, shortcut, groups, expansion 128 | super().__init__() 129 | c_ = int(c2 * e) # hidden channels 130 | self.cv1 = Conv(c1, c_, 1, 1) 131 | self.cv2 = Conv(c1, c_, 1, 1) 132 | self.cv3 = Conv(2 * c_, c2, 1) # act=FReLU(c2) 133 | self.m = nn.Sequential(*[Bottleneck(c_, c_, shortcut, g, e=1.0) for _ in range(n)]) 134 | # self.m = nn.Sequential(*[CrossConv(c_, c_, 3, 1, g, 1.0, shortcut) for _ in range(n)]) 135 | 136 | def forward(self, x): 137 | return self.cv3(torch.cat((self.m(self.cv1(x)), self.cv2(x)), dim=1)) 138 | 139 | 140 | class C3TR(C3): 141 | # C3 module with TransformerBlock() 142 | def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5): 143 | super().__init__(c1, c2, n, shortcut, g, e) 144 | c_ = int(c2 * e) 145 | self.m = TransformerBlock(c_, c_, 4, 
n) 146 | 147 | 148 | class C3SPP(C3): 149 | # C3 module with SPP() 150 | def __init__(self, c1, c2, k=(5, 9, 13), n=1, shortcut=True, g=1, e=0.5): 151 | super().__init__(c1, c2, n, shortcut, g, e) 152 | c_ = int(c2 * e) 153 | self.m = SPP(c_, c_, k) 154 | 155 | 156 | class C3Ghost(C3): 157 | # C3 module with GhostBottleneck() 158 | def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5): 159 | super().__init__(c1, c2, n, shortcut, g, e) 160 | c_ = int(c2 * e) # hidden channels 161 | self.m = nn.Sequential(*[GhostBottleneck(c_, c_) for _ in range(n)]) 162 | 163 | 164 | class SPP(nn.Module): 165 | # Spatial Pyramid Pooling (SPP) layer https://arxiv.org/abs/1406.4729 166 | def __init__(self, c1, c2, k=(5, 9, 13)): 167 | super().__init__() 168 | c_ = c1 // 2 # hidden channels 169 | self.cv1 = Conv(c1, c_, 1, 1) 170 | self.cv2 = Conv(c_ * (len(k) + 1), c2, 1, 1) 171 | self.m = nn.ModuleList([nn.MaxPool2d(kernel_size=x, stride=1, padding=x // 2) for x in k]) 172 | 173 | def forward(self, x): 174 | x = self.cv1(x) 175 | with warnings.catch_warnings(): 176 | warnings.simplefilter('ignore') # suppress torch 1.9.0 max_pool2d() warning 177 | return self.cv2(torch.cat([x] + [m(x) for m in self.m], 1)) 178 | 179 | 180 | class SPPF(nn.Module): 181 | # Spatial Pyramid Pooling - Fast (SPPF) layer for YOLOv5 by Glenn Jocher 182 | def __init__(self, c1, c2, k=5): # equivalent to SPP(k=(5, 9, 13)) 183 | super().__init__() 184 | c_ = c1 // 2 # hidden channels 185 | self.cv1 = Conv(c1, c_, 1, 1) 186 | self.cv2 = Conv(c_ * 4, c2, 1, 1) 187 | self.m = nn.MaxPool2d(kernel_size=k, stride=1, padding=k // 2) 188 | 189 | def forward(self, x): 190 | x = self.cv1(x) 191 | with warnings.catch_warnings(): 192 | warnings.simplefilter('ignore') # suppress torch 1.9.0 max_pool2d() warning 193 | y1 = self.m(x) 194 | y2 = self.m(y1) 195 | return self.cv2(torch.cat([x, y1, y2, self.m(y2)], 1)) 196 | 197 | 198 | class Focus(nn.Module): 199 | # Focus wh information into c-space 200 | def __init__(self, c1, c2, k=1, s=1, p=None, g=1, act=True): # ch_in, ch_out, kernel, stride, padding, groups 201 | super().__init__() 202 | self.conv = Conv(c1 * 4, c2, k, s, p, g, act) 203 | # self.contract = Contract(gain=2) 204 | 205 | def forward(self, x): # x(b,c,w,h) -> y(b,4c,w/2,h/2) 206 | return self.conv(torch.cat([x[..., ::2, ::2], x[..., 1::2, ::2], x[..., ::2, 1::2], x[..., 1::2, 1::2]], 1)) 207 | # return self.conv(self.contract(x)) 208 | 209 | 210 | class GhostConv(nn.Module): 211 | # Ghost Convolution https://github.com/huawei-noah/ghostnet 212 | def __init__(self, c1, c2, k=1, s=1, g=1, act=True): # ch_in, ch_out, kernel, stride, groups 213 | super().__init__() 214 | c_ = c2 // 2 # hidden channels 215 | self.cv1 = Conv(c1, c_, k, s, None, g, act) 216 | self.cv2 = Conv(c_, c_, 5, 1, None, c_, act) 217 | 218 | def forward(self, x): 219 | y = self.cv1(x) 220 | return torch.cat([y, self.cv2(y)], 1) 221 | 222 | 223 | class GhostBottleneck(nn.Module): 224 | # Ghost Bottleneck https://github.com/huawei-noah/ghostnet 225 | def __init__(self, c1, c2, k=3, s=1): # ch_in, ch_out, kernel, stride 226 | super().__init__() 227 | c_ = c2 // 2 228 | self.conv = nn.Sequential(GhostConv(c1, c_, 1, 1), # pw 229 | DWConv(c_, c_, k, s, act=False) if s == 2 else nn.Identity(), # dw 230 | GhostConv(c_, c2, 1, 1, act=False)) # pw-linear 231 | self.shortcut = nn.Sequential(DWConv(c1, c1, k, s, act=False), 232 | Conv(c1, c2, 1, 1, act=False)) if s == 2 else nn.Identity() 233 | 234 | def forward(self, x): 235 | return self.conv(x) + self.shortcut(x) 236 | 
237 | 238 | class Contract(nn.Module): 239 | # Contract width-height into channels, i.e. x(1,64,80,80) to x(1,256,40,40) 240 | def __init__(self, gain=2): 241 | super().__init__() 242 | self.gain = gain 243 | 244 | def forward(self, x): 245 | b, c, h, w = x.size() # assert (h / s == 0) and (W / s == 0), 'Indivisible gain' 246 | s = self.gain 247 | x = x.view(b, c, h // s, s, w // s, s) # x(1,64,40,2,40,2) 248 | x = x.permute(0, 3, 5, 1, 2, 4).contiguous() # x(1,2,2,64,40,40) 249 | return x.view(b, c * s * s, h // s, w // s) # x(1,256,40,40) 250 | 251 | 252 | class Expand(nn.Module): 253 | # Expand channels into width-height, i.e. x(1,64,80,80) to x(1,16,160,160) 254 | def __init__(self, gain=2): 255 | super().__init__() 256 | self.gain = gain 257 | 258 | def forward(self, x): 259 | b, c, h, w = x.size() # assert C / s ** 2 == 0, 'Indivisible gain' 260 | s = self.gain 261 | x = x.view(b, s, s, c // s ** 2, h, w) # x(1,2,2,16,80,80) 262 | x = x.permute(0, 3, 4, 1, 5, 2).contiguous() # x(1,16,80,2,80,2) 263 | return x.view(b, c // s ** 2, h * s, w * s) # x(1,16,160,160) 264 | 265 | 266 | class Concat(nn.Module): 267 | # Concatenate a list of tensors along dimension 268 | def __init__(self, dimension=1): 269 | super().__init__() 270 | self.d = dimension 271 | 272 | def forward(self, x): 273 | return torch.cat(x, self.d) 274 | 275 | 276 | class AutoShape(nn.Module): 277 | # YOLOv5 input-robust model wrapper for passing cv2/np/PIL/torch inputs. Includes preprocessing, inference and NMS 278 | conf = 0.25 # NMS confidence threshold 279 | iou = 0.45 # NMS IoU threshold 280 | classes = None # (optional list) filter by class 281 | multi_label = False # NMS multiple labels per box 282 | max_det = 1000 # maximum number of detections per image 283 | 284 | def __init__(self, model): 285 | super().__init__() 286 | self.model = model.eval() 287 | 288 | def autoshape(self): 289 | LOGGER.info('AutoShape already enabled, skipping... ') # model already converted to model.autoshape() 290 | return self 291 | 292 | @torch.no_grad() 293 | def forward(self, imgs, size=640, augment=False, profile=False): 294 | # Inference from various sources. For height=640, width=1280, RGB images example inputs are: 295 | # file: imgs = 'config/images/zidane.jpg' # str or PosixPath 296 | # URI: = 'https://ultralytics.com/images/zidane.jpg' 297 | # OpenCV: = cv2.imread('image.jpg')[:,:,::-1] # HWC BGR to RGB x(640,1280,3) 298 | # PIL: = Image.open('image.jpg') or ImageGrab.grab() # HWC x(640,1280,3) 299 | # numpy: = np.zeros((640,1280,3)) # HWC 300 | # torch: = torch.zeros(16,3,320,640) # BCHW (scaled to size=640, 0-1 values) 301 | # multiple: = [Image.open('image1.jpg'), Image.open('image2.jpg'), ...] 
# list of images 302 | 303 | t = [time_sync()] 304 | p = next(self.model.parameters()) # for device and type 305 | if isinstance(imgs, torch.Tensor): # torch 306 | with amp.autocast(enabled=p.device.type != 'cpu'): 307 | return self.model(imgs.to(p.device).type_as(p), augment, profile) # inference 308 | 309 | # Pre-process 310 | n, imgs = (len(imgs), imgs) if isinstance(imgs, list) else (1, [imgs]) # number of images, list of images 311 | shape0, shape1, files = [], [], [] # image and inference shapes, filenames 312 | for i, im in enumerate(imgs): 313 | f = f'image{i}' # filename 314 | if isinstance(im, (str, Path)): # filename or uri 315 | im, f = Image.open(requests.get(im, stream=True).raw if str(im).startswith('http') else im), im 316 | im = np.asarray(exif_transpose(im)) 317 | elif isinstance(im, Image.Image): # PIL Image 318 | im, f = np.asarray(exif_transpose(im)), getattr(im, 'filename', f) or f 319 | files.append(Path(f).with_suffix('.jpg').name) 320 | if im.shape[0] < 5: # image in CHW 321 | im = im.transpose((1, 2, 0)) # reverse dataloader .transpose(2, 0, 1) 322 | im = im[..., :3] if im.ndim == 3 else np.tile(im[..., None], 3) # enforce 3ch input 323 | s = im.shape[:2] # HWC 324 | shape0.append(s) # image shape 325 | g = (size / max(s)) # gain 326 | shape1.append([y * g for y in s]) 327 | imgs[i] = im if im.data.contiguous else np.ascontiguousarray(im) # update 328 | shape1 = [make_divisible(x, int(self.stride.max())) for x in np.stack(shape1, 0).max(0)] # inference shape 329 | x = [letterbox(im, new_shape=shape1, auto=False)[0] for im in imgs] # pad 330 | x = np.stack(x, 0) if n > 1 else x[0][None] # stack 331 | x = np.ascontiguousarray(x.transpose((0, 3, 1, 2))) # BHWC to BCHW 332 | x = torch.from_numpy(x).to(p.device).type_as(p) / 255. 
# uint8 to fp16/32 333 | t.append(time_sync()) 334 | 335 | with amp.autocast(enabled=p.device.type != 'cpu'): 336 | # Inference 337 | y = self.model(x, augment, profile)[0] # forward 338 | t.append(time_sync()) 339 | 340 | # Post-process 341 | y = non_max_suppression(y, self.conf, iou_thres=self.iou, classes=self.classes, 342 | multi_label=self.multi_label, max_det=self.max_det) # NMS 343 | for i in range(n): 344 | scale_coords(shape1, y[i][:, :4], shape0[i]) 345 | 346 | t.append(time_sync()) 347 | return Detections(imgs, y, files, t, self.names, x.shape) 348 | 349 | 350 | class Detections: 351 | # YOLOv5 detections class for inference results 352 | def __init__(self, imgs, pred, files, times=None, names=None, shape=None): 353 | super().__init__() 354 | d = pred[0].device # device 355 | gn = [torch.tensor([*[im.shape[i] for i in [1, 0, 1, 0]], 1., 1.], device=d) for im in imgs] # normalizations 356 | self.imgs = imgs # list of images as numpy arrays 357 | self.pred = pred # list of tensors pred[0] = (xyxy, conf, cls) 358 | self.names = names # class names 359 | self.ascii = is_ascii(names) # names are ascii (use PIL for UTF-8) 360 | self.files = files # image filenames 361 | self.xyxy = pred # xyxy pixels 362 | self.xywh = [xyxy2xywh(x) for x in pred] # xywh pixels 363 | self.xyxyn = [x / g for x, g in zip(self.xyxy, gn)] # xyxy normalized 364 | self.xywhn = [x / g for x, g in zip(self.xywh, gn)] # xywh normalized 365 | self.n = len(self.pred) # number of images (batch size) 366 | self.t = tuple((times[i + 1] - times[i]) * 1000 / self.n for i in range(3)) # timestamps (ms) 367 | self.s = shape # inference BCHW shape 368 | 369 | def display(self, pprint=False, show=False, save=False, crop=False, render=False, save_dir=Path('')): 370 | crops = [] 371 | for i, (im, pred) in enumerate(zip(self.imgs, self.pred)): 372 | str = f'image {i + 1}/{len(self.pred)}: {im.shape[0]}x{im.shape[1]} ' 373 | if pred.shape[0]: 374 | for c in pred[:, -1].unique(): 375 | n = (pred[:, -1] == c).sum() # detections per class 376 | str += f"{n} {self.names[int(c)]}{'s' * (n > 1)}, " # add to string 377 | if show or save or render or crop: 378 | annotator = Annotator(im, pil=not self.ascii) 379 | for *box, conf, cls in reversed(pred): # xyxy, confidence, class 380 | label = f'{self.names[int(cls)]} {conf:.2f}' 381 | if crop: 382 | file = save_dir / 'crops' / self.names[int(cls)] / self.files[i] if save else None 383 | crops.append({'box': box, 'conf': conf, 'cls': cls, 'label': label, 384 | 'im': save_one_box(box, im, file=file, save=save)}) 385 | else: # all others 386 | annotator.box_label(box, label, color=colors(cls)) 387 | im = annotator.im 388 | else: 389 | str += '(no detections)' 390 | 391 | im = Image.fromarray(im.astype(np.uint8)) if isinstance(im, np.ndarray) else im # from np 392 | if pprint: 393 | LOGGER.info(str.rstrip(', ')) 394 | if show: 395 | im.show(self.files[i]) # show 396 | if save: 397 | f = self.files[i] 398 | im.save(save_dir / f) # save 399 | if i == self.n - 1: 400 | LOGGER.info(f"Saved {self.n} image{'s' * (self.n > 1)} to {colorstr('bold', save_dir)}") 401 | if render: 402 | self.imgs[i] = np.asarray(im) 403 | if crop: 404 | if save: 405 | LOGGER.info(f'Saved results to {save_dir}\n') 406 | return crops 407 | 408 | def print(self): 409 | self.display(pprint=True) # print results 410 | LOGGER.info(f'Speed: %.1fms pre-process, %.1fms inference, %.1fms NMS per image at shape {tuple(self.s)}' % 411 | self.t) 412 | 413 | def show(self): 414 | self.display(show=True) # show results 415 | 416 | 
def save(self, save_dir='runs/detect/exp'): 417 | save_dir = increment_path(save_dir, exist_ok=save_dir != 'runs/detect/exp', mkdir=True) # increment save_dir 418 | self.display(save=True, save_dir=save_dir) # save results 419 | 420 | def crop(self, save=True, save_dir='runs/detect/exp'): 421 | save_dir = increment_path(save_dir, exist_ok=save_dir != 'runs/detect/exp', mkdir=True) if save else None 422 | return self.display(crop=True, save=save, save_dir=save_dir) # crop results 423 | 424 | def render(self): 425 | self.display(render=True) # render results 426 | return self.imgs 427 | 428 | def pandas(self): 429 | # return detections as pandas DataFrames, i.e. print(results.pandas().xyxy[0]) 430 | new = copy(self) # return copy 431 | ca = 'xmin', 'ymin', 'xmax', 'ymax', 'confidence', 'class', 'name' # xyxy columns 432 | cb = 'xcenter', 'ycenter', 'width', 'height', 'confidence', 'class', 'name' # xywh columns 433 | for k, c in zip(['xyxy', 'xyxyn', 'xywh', 'xywhn'], [ca, ca, cb, cb]): 434 | a = [[x[:5] + [int(x[5]), self.names[int(x[5])]] for x in x.tolist()] for x in getattr(self, k)] # update 435 | setattr(new, k, [pd.DataFrame(x, columns=c) for x in a]) 436 | return new 437 | 438 | def tolist(self): 439 | # return a list of Detections objects, i.e. 'for result in results.tolist():' 440 | x = [Detections([self.imgs[i]], [self.pred[i]], self.names, self.s) for i in range(self.n)] 441 | for d in x: 442 | for k in ['imgs', 'pred', 'xyxy', 'xyxyn', 'xywh', 'xywhn']: 443 | setattr(d, k, getattr(d, k)[0]) # pop out of list 444 | return x 445 | 446 | def __len__(self): 447 | return self.n 448 | 449 | 450 | class Classify(nn.Module): 451 | # Classification head, i.e. x(b,c1,20,20) to x(b,c2) 452 | def __init__(self, c1, c2, k=1, s=1, p=None, g=1): # ch_in, ch_out, kernel, stride, padding, groups 453 | super().__init__() 454 | self.aap = nn.AdaptiveAvgPool2d(1) # to x(b,c1,1,1) 455 | self.conv = nn.Conv2d(c1, c2, k, s, autopad(k, p), groups=g) # to x(b,c2,1,1) 456 | self.flat = nn.Flatten() 457 | 458 | def forward(self, x): 459 | z = torch.cat([self.aap(y) for y in (x if isinstance(x, list) else [x])], 1) # cat if list 460 | return self.flat(self.conv(z)) # flatten to x(b,c2) 461 | -------------------------------------------------------------------------------- /models/experimental.py: -------------------------------------------------------------------------------- 1 | """ 2 | Source: YOLOv5 🚀 by Ultralytics https://github.com/ultralytics/yolov5 3 | Experimental modules 4 | """ 5 | 6 | import numpy as np 7 | import torch 8 | import torch.nn as nn 9 | 10 | from models.common import Conv 11 | from utils.downloads import attempt_download 12 | 13 | 14 | class CrossConv(nn.Module): 15 | # Cross Convolution Downsample 16 | def __init__(self, c1, c2, k=3, s=1, g=1, e=1.0, shortcut=False): 17 | # ch_in, ch_out, kernel, stride, groups, expansion, shortcut 18 | super().__init__() 19 | c_ = int(c2 * e) # hidden channels 20 | self.cv1 = Conv(c1, c_, (1, k), (1, s)) 21 | self.cv2 = Conv(c_, c2, (k, 1), (s, 1), g=g) 22 | self.add = shortcut and c1 == c2 23 | 24 | def forward(self, x): 25 | return x + self.cv2(self.cv1(x)) if self.add else self.cv2(self.cv1(x)) 26 | 27 | 28 | class Sum(nn.Module): 29 | # Weighted sum of 2 or more layers https://arxiv.org/abs/1911.09070 30 | def __init__(self, n, weight=False): # n: number of inputs 31 | super().__init__() 32 | self.weight = weight # apply weights boolean 33 | self.iter = range(n - 1) # iter object 34 | if weight: 35 | self.w = nn.Parameter(-torch.arange(1., n) 
/ 2, requires_grad=True) # layer weights 36 | 37 | def forward(self, x): 38 | y = x[0] # no weight 39 | if self.weight: 40 | w = torch.sigmoid(self.w) * 2 41 | for i in self.iter: 42 | y = y + x[i + 1] * w[i] 43 | else: 44 | for i in self.iter: 45 | y = y + x[i + 1] 46 | return y 47 | 48 | 49 | class MixConv2d(nn.Module): 50 | # Mixed Depth-wise Conv https://arxiv.org/abs/1907.09595 51 | def __init__(self, c1, c2, k=(1, 3), s=1, equal_ch=True): 52 | super().__init__() 53 | groups = len(k) 54 | if equal_ch: # equal c_ per group 55 | i = torch.linspace(0, groups - 1E-6, c2).floor() # c2 indices 56 | c_ = [(i == g).sum() for g in range(groups)] # intermediate channels 57 | else: # equal weight.numel() per group 58 | b = [c2] + [0] * groups 59 | a = np.eye(groups + 1, groups, k=-1) 60 | a -= np.roll(a, 1, axis=1) 61 | a *= np.array(k) ** 2 62 | a[0] = 1 63 | c_ = np.linalg.lstsq(a, b, rcond=None)[0].round() # solve for equal weight indices, ax = b 64 | 65 | self.m = nn.ModuleList([nn.Conv2d(c1, int(c_[g]), k[g], s, k[g] // 2, bias=False) for g in range(groups)]) 66 | self.bn = nn.BatchNorm2d(c2) 67 | self.act = nn.LeakyReLU(0.1, inplace=True) 68 | 69 | def forward(self, x): 70 | return x + self.act(self.bn(torch.cat([m(x) for m in self.m], 1))) 71 | 72 | 73 | class Ensemble(nn.ModuleList): 74 | # Ensemble of models 75 | def __init__(self): 76 | super().__init__() 77 | 78 | def forward(self, x, augment=False, profile=False, visualize=False): 79 | y = [] 80 | for module in self: 81 | y.append(module(x, augment, profile, visualize)[0]) 82 | # y = torch.stack(y).max(0)[0] # max ensemble 83 | # y = torch.stack(y).mean(0) # mean ensemble 84 | y = torch.cat(y, 1) # nms ensemble 85 | return y, None # inference, train output 86 | 87 | 88 | def attempt_load(weights, map_location=None, inplace=True, fuse=True): 89 | from models.yolo import Detect, Model 90 | 91 | # Loads an ensemble of models weights=[a,b,c] or a single model weights=[a] or weights=a 92 | model = Ensemble() 93 | for w in weights if isinstance(weights, list) else [weights]: 94 | ckpt = torch.load(attempt_download(w), map_location=map_location) # load 95 | if fuse: 96 | model.append(ckpt['ema' if ckpt.get('ema') else 'model'].float().fuse().eval()) # FP32 model 97 | else: 98 | model.append(ckpt['ema' if ckpt.get('ema') else 'model'].float().eval()) # without layer fuse 99 | 100 | 101 | # Compatibility updates 102 | for m in model.modules(): 103 | if type(m) in [nn.Hardswish, nn.LeakyReLU, nn.ReLU, nn.ReLU6, nn.SiLU, Detect, Model]: 104 | m.inplace = inplace # pytorch 1.7.0 compatibility 105 | elif type(m) is Conv: 106 | m._non_persistent_buffers_set = set() # pytorch 1.6.0 compatibility 107 | 108 | if len(model) == 1: 109 | return model[-1] # return model 110 | else: 111 | print(f'Ensemble created with {weights}\n') 112 | for k in ['names']: 113 | setattr(model, k, getattr(model[-1], k)) 114 | model.stride = model[torch.argmax(torch.tensor([m.stride.max() for m in model])).int()].stride # max stride 115 | return model # return ensemble 116 | -------------------------------------------------------------------------------- /models/yolo.py: -------------------------------------------------------------------------------- 1 | """ 2 | Source: YOLOv5 🚀 by Ultralytics https://github.com/ultralytics/yolov5 3 | YOLO-specific modules 4 | 5 | Usage: 6 | $ python path/to/models/yolo.py --cfg yolov5s.yaml 7 | """ 8 | 9 | import argparse 10 | import sys 11 | from copy import deepcopy 12 | from pathlib import Path 13 | 14 | FILE = Path(__file__).resolve() 
15 | sys.path.append(FILE.parents[1].as_posix()) # add yolov5/ to path 16 | 17 | from models.common import * 18 | from models.experimental import * 19 | from utils.autoanchor import check_anchor_order 20 | from utils.general import check_yaml, make_divisible, set_logging 21 | from utils.plots import feature_visualization 22 | from utils.torch_utils import copy_attr, fuse_conv_and_bn, initialize_weights, model_info, scale_img, \ 23 | select_device, time_sync 24 | 25 | try: 26 | import thop # for FLOPs computation 27 | except ImportError: 28 | thop = None 29 | 30 | LOGGER = logging.getLogger(__name__) 31 | 32 | 33 | class Detect(nn.Module): 34 | stride = None # strides computed during build 35 | onnx_dynamic = False # ONNX export parameter 36 | 37 | def __init__(self, nc=80, anchors=(), ch=(), inplace=True): # detection layer 38 | super().__init__() 39 | self.nc = nc # number of classes 40 | self.no = nc + 5 # number of outputs per anchor 41 | self.nl = len(anchors) # number of detection layers 42 | self.na = len(anchors[0]) // 2 # number of anchors 43 | self.grid = [torch.zeros(1)] * self.nl # init grid 44 | a = torch.tensor(anchors).float().view(self.nl, -1, 2) 45 | self.register_buffer('anchors', a) # shape(nl,na,2) 46 | self.register_buffer('anchor_grid', a.clone().view(self.nl, 1, -1, 1, 1, 2)) # shape(nl,1,na,1,1,2) 47 | self.m = nn.ModuleList(nn.Conv2d(x, self.no * self.na, 1) for x in ch) # output conv 48 | self.inplace = inplace # use in-place ops (e.g. slice assignment) 49 | 50 | def forward(self, x): 51 | z = [] # inference output 52 | for i in range(self.nl): 53 | x[i] = self.m[i](x[i]) # conv 54 | bs, _, ny, nx = x[i].shape # x(bs,255,20,20) to x(bs,3,20,20,85) 55 | x[i] = x[i].view(bs, self.na, self.no, ny, nx).permute(0, 1, 3, 4, 2).contiguous() 56 | 57 | if not self.training: # inference 58 | if self.grid[i].shape[2:4] != x[i].shape[2:4] or self.onnx_dynamic: 59 | self.grid[i] = self._make_grid(nx, ny).to(x[i].device) 60 | 61 | y = x[i].sigmoid() 62 | if self.inplace: 63 | y[..., 0:2] = (y[..., 0:2] * 2. - 0.5 + self.grid[i]) * self.stride[i] # xy 64 | y[..., 2:4] = (y[..., 2:4] * 2) ** 2 * self.anchor_grid[i] # wh 65 | else: # for YOLOv5 on AWS Inferentia https://github.com/ultralytics/yolov5/pull/2953 66 | xy = (y[..., 0:2] * 2. 
- 0.5 + self.grid[i]) * self.stride[i] # xy 67 | wh = (y[..., 2:4] * 2) ** 2 * self.anchor_grid[i].view(1, self.na, 1, 1, 2) # wh 68 | y = torch.cat((xy, wh, y[..., 4:]), -1) 69 | z.append(y.view(bs, -1, self.no)) 70 | 71 | return x if self.training else (torch.cat(z, 1), x) 72 | 73 | @staticmethod 74 | def _make_grid(nx=20, ny=20): 75 | yv, xv = torch.meshgrid([torch.arange(ny), torch.arange(nx)]) 76 | return torch.stack((xv, yv), 2).view((1, 1, ny, nx, 2)).float() 77 | 78 | 79 | class Model(nn.Module): 80 | def __init__(self, cfg='yolov5s.yaml', ch=3, nc=None, anchors=None): # model, input channels, number of classes 81 | super().__init__() 82 | if isinstance(cfg, dict): 83 | self.yaml = cfg # model dict 84 | else: # is *.yaml 85 | import yaml # for torch hub 86 | self.yaml_file = Path(cfg).name 87 | with open(cfg) as f: 88 | self.yaml = yaml.safe_load(f) # model dict 89 | 90 | # Define model 91 | ch = self.yaml['ch'] = self.yaml.get('ch', ch) # input channels 92 | if nc and nc != self.yaml['nc']: 93 | LOGGER.info(f"Overriding model.yaml nc={self.yaml['nc']} with nc={nc}") 94 | self.yaml['nc'] = nc # override yaml value 95 | if anchors: 96 | LOGGER.info(f'Overriding model.yaml anchors with anchors={anchors}') 97 | self.yaml['anchors'] = round(anchors) # override yaml value 98 | self.model, self.save = parse_model(deepcopy(self.yaml), ch=[ch]) # model, savelist 99 | self.names = [str(i) for i in range(self.yaml['nc'])] # default names 100 | self.inplace = self.yaml.get('inplace', True) 101 | # LOGGER.info([x.shape for x in self.forward(torch.zeros(1, ch, 64, 64))]) 102 | 103 | # Build strides, anchors 104 | m = self.model[-1] # Detect() 105 | if isinstance(m, Detect): 106 | s = 256 # 2x min stride 107 | m.inplace = self.inplace 108 | m.stride = torch.tensor([s / x.shape[-2] for x in self.forward(torch.zeros(1, ch, s, s))]) # forward 109 | m.anchors /= m.stride.view(-1, 1, 1) 110 | check_anchor_order(m) 111 | self.stride = m.stride 112 | self._initialize_biases() # only run once 113 | # LOGGER.info('Strides: %s' % m.stride.tolist()) 114 | 115 | # Init weights, biases 116 | initialize_weights(self) 117 | self.info() 118 | LOGGER.info('') 119 | 120 | def forward(self, x, augment=False, profile=False, visualize=False): 121 | if augment: 122 | return self.forward_augment(x) # augmented inference, None 123 | return self.forward_once(x, profile, visualize) # single-scale inference, train 124 | 125 | def forward_augment(self, x): 126 | img_size = x.shape[-2:] # height, width 127 | s = [1, 0.83, 0.67] # scales 128 | f = [None, 3, None] # flips (2-ud, 3-lr) 129 | y = [] # outputs 130 | for si, fi in zip(s, f): 131 | xi = scale_img(x.flip(fi) if fi else x, si, gs=int(self.stride.max())) 132 | yi = self.forward_once(xi)[0] # forward 133 | # cv2.imwrite(f'img_{si}.jpg', 255 * xi[0].cpu().numpy().transpose((1, 2, 0))[:, :, ::-1]) # save 134 | yi = self._descale_pred(yi, fi, si, img_size) 135 | y.append(yi) 136 | return torch.cat(y, 1), None # augmented inference, train 137 | 138 | def forward_once(self, x, profile=False, visualize=False): 139 | y, dt = [], [] # outputs 140 | for m in self.model: 141 | if m.f != -1: # if not from previous layer 142 | x = y[m.f] if isinstance(m.f, int) else [x if j == -1 else y[j] for j in m.f] # from earlier layers 143 | 144 | if profile: 145 | c = isinstance(m, Detect) # copy input as inplace fix 146 | o = thop.profile(m, inputs=(x.copy() if c else x,), verbose=False)[0] / 1E9 * 2 if thop else 0 # FLOPs 147 | t = time_sync() 148 | for _ in range(10): 149 | m(x.copy() 
if c else x) 150 | dt.append((time_sync() - t) * 100) 151 | if m == self.model[0]: 152 | LOGGER.info(f"{'time (ms)':>10s} {'GFLOPs':>10s} {'params':>10s} {'module'}") 153 | LOGGER.info(f'{dt[-1]:10.2f} {o:10.2f} {m.np:10.0f} {m.type}') 154 | 155 | x = m(x) # run 156 | y.append(x if m.i in self.save else None) # save output 157 | 158 | if visualize: 159 | feature_visualization(x, m.type, m.i, save_dir=visualize) 160 | 161 | if profile: 162 | LOGGER.info('%.1fms total' % sum(dt)) 163 | return x 164 | 165 | def _descale_pred(self, p, flips, scale, img_size): 166 | # de-scale predictions following augmented inference (inverse operation) 167 | if self.inplace: 168 | p[..., :4] /= scale # de-scale 169 | if flips == 2: 170 | p[..., 1] = img_size[0] - p[..., 1] # de-flip ud 171 | elif flips == 3: 172 | p[..., 0] = img_size[1] - p[..., 0] # de-flip lr 173 | else: 174 | x, y, wh = p[..., 0:1] / scale, p[..., 1:2] / scale, p[..., 2:4] / scale # de-scale 175 | if flips == 2: 176 | y = img_size[0] - y # de-flip ud 177 | elif flips == 3: 178 | x = img_size[1] - x # de-flip lr 179 | p = torch.cat((x, y, wh, p[..., 4:]), -1) 180 | return p 181 | 182 | def _initialize_biases(self, cf=None): # initialize biases into Detect(), cf is class frequency 183 | # https://arxiv.org/abs/1708.02002 section 3.3 184 | # cf = torch.bincount(torch.tensor(np.concatenate(dataset.labels, 0)[:, 0]).long(), minlength=nc) + 1. 185 | m = self.model[-1] # Detect() module 186 | for mi, s in zip(m.m, m.stride): # from 187 | b = mi.bias.view(m.na, -1) # conv.bias(255) to (3,85) 188 | b.data[:, 4] += math.log(8 / (640 / s) ** 2) # obj (8 objects per 640 image) 189 | b.data[:, 5:] += math.log(0.6 / (m.nc - 0.99)) if cf is None else torch.log(cf / cf.sum()) # cls 190 | mi.bias = torch.nn.Parameter(b.view(-1), requires_grad=True) 191 | 192 | def _print_biases(self): 193 | m = self.model[-1] # Detect() module 194 | for mi in m.m: # from 195 | b = mi.bias.detach().view(m.na, -1).T # conv.bias(255) to (3,85) 196 | LOGGER.info( 197 | ('%6g Conv2d.bias:' + '%10.3g' * 6) % (mi.weight.shape[1], *b[:5].mean(1).tolist(), b[5:].mean())) 198 | 199 | # def _print_weights(self): 200 | # for m in self.model.modules(): 201 | # if type(m) is Bottleneck: 202 | # LOGGER.info('%10.3g' % (m.w.detach().sigmoid() * 2)) # shortcut weights 203 | 204 | def fuse(self): # fuse model Conv2d() + BatchNorm2d() layers 205 | LOGGER.info('Fusing layers... ') 206 | for m in self.model.modules(): 207 | if isinstance(m, (Conv, DWConv)) and hasattr(m, 'bn'): 208 | m.conv = fuse_conv_and_bn(m.conv, m.bn) # update conv 209 | delattr(m, 'bn') # remove batchnorm 210 | m.forward = m.forward_fuse # update forward 211 | self.info() 212 | return self 213 | 214 | def autoshape(self): # add AutoShape module 215 | LOGGER.info('Adding AutoShape... 
') 216 | m = AutoShape(self) # wrap model 217 | copy_attr(m, self, include=('yaml', 'nc', 'hyp', 'names', 'stride'), exclude=()) # copy attributes 218 | return m 219 | 220 | def info(self, verbose=False, img_size=640): # print model information 221 | model_info(self, verbose, img_size) 222 | 223 | 224 | def parse_model(d, ch): # model_dict, input_channels(3) 225 | LOGGER.info('\n%3s%18s%3s%10s %-40s%-30s' % ('', 'from', 'n', 'params', 'module', 'arguments')) 226 | anchors, nc, gd, gw = d['anchors'], d['nc'], d['depth_multiple'], d['width_multiple'] 227 | na = (len(anchors[0]) // 2) if isinstance(anchors, list) else anchors # number of anchors 228 | no = na * (nc + 5) # number of outputs = anchors * (classes + 5) 229 | 230 | layers, save, c2 = [], [], ch[-1] # layers, savelist, ch out 231 | for i, (f, n, m, args) in enumerate(d['backbone'] + d['head']): # from, number, module, args 232 | m = eval(m) if isinstance(m, str) else m # eval strings 233 | for j, a in enumerate(args): 234 | try: 235 | args[j] = eval(a) if isinstance(a, str) else a # eval strings 236 | except: 237 | pass 238 | 239 | n = n_ = max(round(n * gd), 1) if n > 1 else n # depth gain 240 | if m in [Conv, GhostConv, Bottleneck, GhostBottleneck, SPP, SPPF, DWConv, MixConv2d, Focus, CrossConv, 241 | BottleneckCSP, C3, C3TR, C3SPP, C3Ghost]: 242 | c1, c2 = ch[f], args[0] 243 | if c2 != no: # if not output 244 | c2 = make_divisible(c2 * gw, 8) 245 | 246 | args = [c1, c2, *args[1:]] 247 | if m in [BottleneckCSP, C3, C3TR, C3Ghost]: 248 | args.insert(2, n) # number of repeats 249 | n = 1 250 | elif m is nn.BatchNorm2d: 251 | args = [ch[f]] 252 | elif m is Concat: 253 | c2 = sum([ch[x] for x in f]) 254 | elif m is Detect: 255 | args.append([ch[x] for x in f]) 256 | if isinstance(args[1], int): # number of anchors 257 | args[1] = [list(range(args[1] * 2))] * len(f) 258 | elif m is Contract: 259 | c2 = ch[f] * args[0] ** 2 260 | elif m is Expand: 261 | c2 = ch[f] // args[0] ** 2 262 | else: 263 | c2 = ch[f] 264 | 265 | m_ = nn.Sequential(*[m(*args) for _ in range(n)]) if n > 1 else m(*args) # module 266 | t = str(m)[8:-2].replace('__main__.', '') # module type 267 | np = sum([x.numel() for x in m_.parameters()]) # number params 268 | m_.i, m_.f, m_.type, m_.np = i, f, t, np # attach index, 'from' index, type, number params 269 | LOGGER.info('%3s%18s%3s%10.0f %-40s%-30s' % (i, f, n_, np, t, args)) # print 270 | save.extend(x % i for x in ([f] if isinstance(f, int) else f) if x != -1) # append to savelist 271 | layers.append(m_) 272 | if i == 0: 273 | ch = [] 274 | ch.append(c2) 275 | return nn.Sequential(*layers), sorted(save) 276 | 277 | 278 | if __name__ == '__main__': 279 | parser = argparse.ArgumentParser() 280 | parser.add_argument('--cfg', type=str, default='yolov5s.yaml', help='model.yaml') 281 | parser.add_argument('--device', default='', help='cuda device, i.e. 
0 or 0,1,2,3 or cpu') 282 | parser.add_argument('--profile', action='store_true', help='profile model speed') 283 | opt = parser.parse_args() 284 | opt.cfg = check_yaml(opt.cfg) # check YAML 285 | set_logging() 286 | device = select_device(opt.device) 287 | 288 | # Create model 289 | model = Model(opt.cfg).to(device) 290 | model.train() 291 | 292 | # Profile 293 | if opt.profile: 294 | img = torch.rand(8 if torch.cuda.is_available() else 1, 3, 640, 640).to(device) 295 | y = model(img, profile=True) 296 | 297 | # Tensorboard (not working https://github.com/ultralytics/yolov5/issues/2898) 298 | # from torch.utils.tensorboard import SummaryWriter 299 | # tb_writer = SummaryWriter('.') 300 | # LOGGER.info("Run 'tensorboard --logdir=models' to view tensorboard at http://localhost:6006/") 301 | # tb_writer.add_graph(torch.jit.trace(model, img, strict=False), []) # add model graph 302 | -------------------------------------------------------------------------------- /models/yolov5s.yaml: -------------------------------------------------------------------------------- 1 | # Source: YOLOv5 🚀 by Ultralytics https://github.com/ultralytics/yolov5 2 | 3 | # Parameters 4 | nc: 80 # number of classes 5 | depth_multiple: 0.33 # model depth multiple 6 | width_multiple: 0.50 # layer channel multiple 7 | anchors: 8 | - [10,13, 16,30, 33,23] # P3/8 9 | - [30,61, 62,45, 59,119] # P4/16 10 | - [116,90, 156,198, 373,326] # P5/32 11 | 12 | # YOLOv5 backbone 13 | backbone: 14 | # [from, number, module, args] 15 | [[-1, 1, Focus, [64, 3]], # 0-P1/2 16 | [-1, 1, Conv, [128, 3, 2]], # 1-P2/4 17 | [-1, 3, C3, [128]], 18 | [-1, 1, Conv, [256, 3, 2]], # 3-P3/8 19 | [-1, 9, C3, [256]], 20 | [-1, 1, Conv, [512, 3, 2]], # 5-P4/16 21 | [-1, 9, C3, [512]], 22 | [-1, 1, Conv, [1024, 3, 2]], # 7-P5/32 23 | [-1, 1, SPP, [1024, [5, 9, 13]]], 24 | [-1, 3, C3, [1024, False]], # 9 25 | ] 26 | 27 | # YOLOv5 head 28 | head: 29 | [[-1, 1, Conv, [512, 1, 1]], 30 | [-1, 1, nn.Upsample, [None, 2, 'nearest']], 31 | [[-1, 6], 1, Concat, [1]], # cat backbone P4 32 | [-1, 3, C3, [512, False]], # 13 33 | 34 | [-1, 1, Conv, [256, 1, 1]], 35 | [-1, 1, nn.Upsample, [None, 2, 'nearest']], 36 | [[-1, 4], 1, Concat, [1]], # cat backbone P3 37 | [-1, 3, C3, [256, False]], # 17 (P3/8-small) 38 | 39 | [-1, 1, Conv, [256, 3, 2]], 40 | [[-1, 14], 1, Concat, [1]], # cat head P4 41 | [-1, 3, C3, [512, False]], # 20 (P4/16-medium) 42 | 43 | [-1, 1, Conv, [512, 3, 2]], 44 | [[-1, 10], 1, Concat, [1]], # cat head P5 45 | [-1, 3, C3, [1024, False]], # 23 (P5/32-large) 46 | 47 | [[17, 20, 23], 1, Detect, [nc, anchors]], # Detect(P3, P4, P5) 48 | ] 49 | -------------------------------------------------------------------------------- /pretrains/pretrain.pt: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/fsoft-ailab/Data-Competition/63741b681885467a563d5500e4658673d6d737d7/pretrains/pretrain.pt -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | # pip install -r requirements.txt 2 | 3 | # Base ---------------------------------------- 4 | matplotlib>=3.2.2 5 | numpy>=1.18.5 6 | opencv-python>=4.1.2 7 | Pillow>=8.0.0 8 | PyYAML>=5.3.1 9 | scipy>=1.4.1 10 | torch>=1.7.0 11 | torchvision>=0.8.1 12 | tqdm>=4.41.0 13 | 14 | # Logging ------------------------------------- 15 | tensorboard>=2.4.1 16 | # wandb 17 | 18 | # Plotting ------------------------------------ 19 | 
seaborn>=0.11.0 20 | pandas 21 | 22 | # Export -------------------------------------- 23 | # coremltools>=4.1 # CoreML export 24 | # onnx>=1.9.0 # ONNX export 25 | # onnx-simplifier>=0.3.6 # ONNX simplifier 26 | # scikit-learn==0.19.2 # CoreML quantization 27 | # tensorflow>=2.4.1 # TFLite export 28 | # tensorflowjs>=3.9.0 # TF.js export 29 | 30 | # Extras -------------------------------------- 31 | # Cython # for pycocotools https://github.com/cocodataset/cocoapi/issues/172 32 | # pycocotools>=2.0 # COCO mAP 33 | # albumentations>=1.0.3 34 | thop # FLOPs computation 35 | -------------------------------------------------------------------------------- /train.py: -------------------------------------------------------------------------------- 1 | import sys 2 | import time 3 | import math 4 | import random 5 | import logging 6 | import argparse 7 | from pathlib import Path 8 | from copy import deepcopy 9 | 10 | import yaml 11 | import torch 12 | import numpy as np 13 | from tqdm import tqdm 14 | import torch.nn as nn 15 | from torch.cuda import amp 16 | import torch.nn.functional as F 17 | from torch.optim import Adam, SGD, lr_scheduler 18 | 19 | import val 20 | from models.yolo import Model 21 | from utils.loss import ComputeLoss 22 | from utils.plots import plot_labels, plot_lr_scheduler 23 | from utils.metrics import fitness 24 | from utils.loggers import Loggers 25 | from utils.callbacks import Callbacks 26 | from utils.autoanchor import check_anchors 27 | from utils.datasets import create_dataloader 28 | from utils.torch_utils import EarlyStopping, ModelEMA, de_parallel, intersect_dicts, select_device 29 | from utils.general import labels_to_class_weights, increment_path, labels_to_image_weights, init_seeds, \ 30 | strip_optimizer, check_dataset, check_img_size, check_requirements, check_file,\ 31 | check_yaml, check_suffix, one_cycle, colorstr, methods, set_logging 32 | 33 | 34 | FILE = Path(__file__).resolve() 35 | sys.path.append(FILE.parents[0].as_posix()) 36 | LOGGER = logging.getLogger(__name__) 37 | 38 | 39 | def train(hyp, 40 | args, 41 | device, 42 | callbacks 43 | ): 44 | [save_dir, epochs, batch_size, pretrained_path, 45 | evolve, data_cfg, model_cfg, resume, no_val, no_save, workers] = Path(args.save_dir), args.epochs, \ 46 | args.batch_size, args.weights, \ 47 | args.evolve, args.data_cfg, args.model_cfg, \ 48 | args.resume, args.noval, args.nosave, args.workers 49 | 50 | # Directories 51 | weight_path = save_dir / 'weights' # weights dir 52 | weight_path.mkdir(parents=True, exist_ok=True) # make dir 53 | last, best = weight_path / 'last.pt', weight_path / 'best.pt' 54 | 55 | # Hyper parameters 56 | if isinstance(hyp, str): 57 | with open(hyp) as f: 58 | hyp: dict = yaml.safe_load(f) # load hyper parameter dict 59 | LOGGER.info(colorstr('Hyper parameters: ') + ', '.join(f'{k}={v}' for k, v in hyp.items())) 60 | 61 | # Save run settings 62 | with open(save_dir / 'hyp.yaml', 'w') as f: 63 | yaml.safe_dump(hyp, f, sort_keys=False) 64 | with open(save_dir / 'opt.yaml', 'w') as f: 65 | yaml.safe_dump(vars(args), f, sort_keys=False) 66 | 67 | # Loggers 68 | loggers = Loggers(save_dir, pretrained_path, args, hyp, LOGGER) 69 | 70 | # Register actions 71 | for k in methods(loggers): 72 | callbacks.register_action(k, callback=getattr(loggers, k)) 73 | 74 | """ 75 | =============================== 76 | Config 77 | =============================== 78 | """ 79 | plots: bool = not evolve 80 | cuda: bool = device.type != 'cpu' 81 | init_seeds(0) 82 | 83 | data_dict = 
check_dataset(data_cfg) 84 | train_path, val_path = data_dict['train'], data_dict['val'] 85 | num_class = int(data_dict['num_class']) # number of classes 86 | class_name = data_dict['names'] 87 | 88 | """ 89 | =============================== 90 | Model 91 | =============================== 92 | """ 93 | check_suffix(pretrained_path, '.pt') 94 | use_pretrained = pretrained_path.endswith('.pt') 95 | check_point = None 96 | if use_pretrained: 97 | check_point = torch.load(pretrained_path, map_location=device) # load checkpoint 98 | 99 | # create model 100 | model = Model(model_cfg or check_point['model'].yaml, ch=3, nc=num_class, anchors=hyp.get('anchors')).to(device) 101 | exclude = ['anchor'] if (model_cfg or hyp.get('anchors')) and not resume else [] # exclude keys 102 | csd = check_point['model'].float().state_dict() # checkpoint state_dict as FP32 103 | csd = intersect_dicts(csd, model.state_dict(), exclude=exclude) # intersect 104 | model.load_state_dict(csd, strict=False) # load 105 | LOGGER.info(f'Transferred {len(csd)}/{len(model.state_dict())} items from {pretrained_path}') # report 106 | else: 107 | # create model 108 | model = Model(model_cfg, ch=3, nc=num_class, anchors=hyp.get('anchors')).to(device) 109 | 110 | """ 111 | =============================== 112 | Optimizer 113 | =============================== 114 | """ 115 | nbs = 64 # nominal batch size 116 | accumulate = max(round(nbs / batch_size), 1) # accumulate loss before optimizing 117 | hyp['weight_decay'] *= batch_size * accumulate / nbs # scale weight_decay 118 | LOGGER.info(f"Scaled weight_decay = {hyp['weight_decay']}") 119 | 120 | g0, g1, g2 = [], [], [] # optimizer parameter groups 121 | for v in model.modules(): 122 | if hasattr(v, 'bias') and isinstance(v.bias, nn.Parameter): # bias 123 | g2.append(v.bias) 124 | if isinstance(v, nn.BatchNorm2d): # weight (no decay) 125 | g0.append(v.weight) 126 | elif hasattr(v, 'weight') and isinstance(v.weight, nn.Parameter): # weight (with decay) 127 | g1.append(v.weight) 128 | 129 | if args.adam: 130 | optimizer = Adam(g0, lr=hyp['lr0'], betas=(hyp['momentum'], 0.999)) # adjust beta1 to momentum 131 | else: 132 | optimizer = SGD(g0, lr=hyp['lr0'], momentum=hyp['momentum'], nesterov=True) 133 | 134 | optimizer.add_param_group({'params': g1, 'weight_decay': hyp['weight_decay']}) # add g1 with weight_decay 135 | optimizer.add_param_group({'params': g2}) # add g2 (biases) 136 | LOGGER.info(f"{colorstr('Optimizer:')} {type(optimizer).__name__} with parameter groups " 137 | f"{len(g0)} weight, {len(g1)} weight (no decay), {len(g2)} bias") 138 | del g0, g1, g2 139 | 140 | # Scheduler 141 | if args.linear_lr: 142 | lr_lambda = lambda y: (1 - y / (epochs - 1)) * (1.0 - hyp['lrf']) + hyp['lrf'] # linear 143 | else: 144 | lr_lambda = one_cycle(1, hyp['lrf'], epochs) # cosine 1->hyp['lrf'] 145 | scheduler = lr_scheduler.LambdaLR(optimizer, lr_lambda=lr_lambda) 146 | # plot_lr_scheduler(optimizer, scheduler, epochs) 147 | 148 | # EMA 149 | ema = ModelEMA(model) 150 | 151 | start_epoch, best_fitness = 0, 0.0 152 | if use_pretrained: 153 | # Optimizer 154 | if check_point['optimizer'] is not None: 155 | optimizer.load_state_dict(check_point['optimizer']) 156 | best_fitness = check_point['best_fitness'] 157 | 158 | # EMA 159 | if ema and check_point.get('ema'): 160 | ema.ema.load_state_dict(check_point['ema'].float().state_dict()) 161 | ema.updates = check_point['updates'] 162 | 163 | # Epochs 164 | start_epoch = check_point['epoch'] + 1 165 | if resume: 166 | assert start_epoch > 0, 
f'{pretrained_path} training to {epochs} epochs is finished, nothing to resume.' 167 | if epochs < start_epoch: 168 | LOGGER.info("{} has been trained for {} epochs. Fine-tuning for {} more epochs.".format( 169 | pretrained_path, 170 | check_point['epoch'], 171 | epochs 172 | )) 173 | 174 | del check_point, csd 175 | 176 | # Image sizes 177 | grid_size = max(int(model.stride.max()), 32) 178 | nl = model.model[-1].nl # number of detection layers (used for scaling hyp['obj']) 179 | img_size = check_img_size(args.img_size, grid_size, floor=grid_size * 2) # verify img_size is gs-multiple 180 | 181 | # Train Loader 182 | train_loader, dataset = create_dataloader(train_path, img_size, batch_size, grid_size, 183 | hyp=hyp, augment=False, cache=args.cache, rect=args.rect, 184 | workers=workers, image_weights=args.image_weights, quad=args.quad, 185 | prefix=colorstr('Train: ')) 186 | 187 | max_label_class = int(np.concatenate(dataset.labels, 0)[:, 0].max()) 188 | num_batches = len(train_loader) 189 | assert max_label_class < num_class, \ 190 | 'Label class {} exceeds num_class={} in {}. Possible class labels are 0-{}'.format( 191 | max_label_class, 192 | num_class, 193 | data_cfg, 194 | num_class - 1 195 | ) 196 | 197 | # Val Loader 198 | val_loader = create_dataloader(val_path, img_size, batch_size * 2, grid_size, 199 | hyp=hyp, cache=None if no_val else args.cache, rect=True, 200 | workers=workers, pad=0.5, 201 | prefix=colorstr('Val: '))[0] 202 | 203 | if not resume: 204 | labels = np.concatenate(dataset.labels, 0) 205 | 206 | if plots: 207 | plot_labels(labels, class_name, save_dir) 208 | 209 | # Anchors 210 | if not args.noautoanchor: 211 | check_anchors(dataset, model=model, thr=hyp['anchor_t'], imgsz=img_size) 212 | model.half().float() # pre-reduce anchor precision 213 | 214 | callbacks.run('on_pretrain_routine_end') 215 | 216 | # Model parameters 217 | hyp['box'] *= 3. / nl # scale to layers 218 | hyp['cls'] *= num_class / 80. * 3. / nl # scale to classes and layers 219 | hyp['obj'] *= (img_size / 640) ** 2 * 3. 
/ nl # scale to image size and layers 220 | hyp['label_smoothing'] = args.label_smoothing 221 | model.nc = num_class # attach number of classes to model 222 | model.hyp = hyp # attach hyper parameters to model 223 | model.class_weights = labels_to_class_weights(dataset.labels, num_class).to( 224 | device) * num_class # attach class weights 225 | model.names = class_name 226 | 227 | # Start training 228 | t0 = time.time() 229 | num_warmup_inters = min(round(hyp['warmup_epochs'] * num_batches), 1000) 230 | last_opt_step = -1 231 | maps = np.zeros(num_class) 232 | results = (0, 0, 0, 0, 0, 0, 0) # P, R, mAP@.5, mAP@.5-.95, val_loss(box, obj, cls) 233 | scheduler.last_epoch = start_epoch - 1 # do not move 234 | scaler = amp.GradScaler(enabled=cuda) 235 | stopper = EarlyStopping(patience=args.patience) 236 | compute_loss = ComputeLoss(model) # init loss class 237 | LOGGER.info(f'Image sizes {img_size} train, {img_size} val\n' 238 | f'Using {train_loader.num_workers} dataloader workers\n' 239 | f"Logging results to {colorstr('bold', save_dir)}\n" 240 | f'Starting training for {epochs} epochs...') 241 | 242 | final_epoch = 0 243 | for epoch in range(start_epoch, epochs): 244 | final_epoch = max(final_epoch, epoch) 245 | model.train() 246 | if args.image_weights: 247 | class_weight = model.class_weights.cpu().numpy() * (1 - maps) ** 2 / num_class 248 | image_weight = labels_to_image_weights(dataset.labels, nc=num_class, class_weights=class_weight) 249 | dataset.indices = random.choices(range(dataset.n), weights=image_weight, k=dataset.n) # rand weighted idx 250 | 251 | mean_losses = torch.zeros(3, device=device) 252 | 253 | plot_bar = enumerate(train_loader) 254 | LOGGER.info(('\n' + '%10s' * 7) % ('Epoch', 'gpu_mem', 'box', 'obj', 'cls', 'labels', 'img_size')) 255 | 256 | plot_bar = tqdm(plot_bar, total=num_batches) 257 | optimizer.zero_grad() 258 | 259 | for i, (img_batch, targets, paths, _) in plot_bar: 260 | num_inters = i + num_batches * epoch 261 | 262 | # Preprocess 263 | img_batch = img_batch.to(device, non_blocking=True).float() / 255.0 264 | 265 | # Warmup 266 | if num_inters <= num_warmup_inters: 267 | xi = [0, num_warmup_inters] # x interp 268 | 269 | accumulate = max(1, np.interp(num_inters, xi, [1, nbs / batch_size]).round()) 270 | for j, x in enumerate(optimizer.param_groups): 271 | x['lr'] = np.interp(num_inters, xi, 272 | [hyp['warmup_bias_lr'] if j == 2 else 0.0, x['initial_lr'] * lr_lambda(epoch)]) 273 | if 'momentum' in x: 274 | x['momentum'] = np.interp(num_inters, xi, [hyp['warmup_momentum'], hyp['momentum']]) 275 | 276 | # Multi-scale 277 | if args.multi_scale: 278 | size = random.randrange(img_size * 0.5, img_size * 1.5 + grid_size) // grid_size * grid_size 279 | scale_factor = size / max(img_batch.shape[2:]) 280 | if scale_factor != 1: 281 | new_shape = [math.ceil(x * scale_factor / grid_size) * grid_size for x in img_batch.shape[2:]] 282 | img_batch = F.interpolate(img_batch, size=new_shape, mode='bilinear', align_corners=False) 283 | 284 | # Forward 285 | with amp.autocast(enabled=cuda): 286 | pred = model(img_batch) # forward 287 | loss, loss_items = compute_loss(pred, targets.to(device)) 288 | 289 | if args.quad: 290 | loss *= 4. 
291 | 292 | # Backward 293 | scaler.scale(loss).backward() 294 | 295 | # Optimize 296 | if num_inters - last_opt_step >= accumulate: 297 | scaler.step(optimizer) 298 | scaler.update() 299 | optimizer.zero_grad() 300 | if ema: 301 | ema.update(model) 302 | last_opt_step = num_inters 303 | 304 | # Log 305 | # Update mean losses 306 | mean_losses = (mean_losses * i + loss_items) / (i + 1) 307 | mem = f'{torch.cuda.memory_reserved() / 1E9 if torch.cuda.is_available() else 0:.3g}G' # (GB) 308 | plot_bar.set_description(('%10s' * 2 + '%10.4g' * 5) % ( 309 | f'{epoch}/{epochs - 1}', mem, *mean_losses, targets.shape[0], img_batch.shape[-1])) 310 | callbacks.run('on_train_batch_end', num_inters, model, img_batch, targets, paths, plots, args.sync_bn) 311 | 312 | # Scheduler 313 | lr = [x['lr'] for x in optimizer.param_groups] 314 | scheduler.step() 315 | 316 | # mAP 317 | callbacks.run('on_train_epoch_end', epoch=epoch) 318 | ema.update_attr(model, include=['yaml', 'nc', 'hyp', 'names', 'stride', 'class_weights']) 319 | final_epoch = (epoch + 1 == epochs) or stopper.possible_stop 320 | 321 | if not no_val or final_epoch: # Calculate mAP 322 | results, maps, _ = val.run(data_dict, 323 | batch_size=batch_size * 2, 324 | img_size=img_size, 325 | model=ema.ema, 326 | dataloader=val_loader, 327 | save_dir=save_dir, 328 | verbose=num_class < 50 and final_epoch, 329 | plots=plots and final_epoch, 330 | callbacks=callbacks, 331 | compute_loss=compute_loss) 332 | 333 | # Update best mAP 334 | fi = fitness(np.array(results).reshape(1, -1)) # weighted combination of [P, R, wAP@.5, mAP@.5, mAP@.5-.95] 335 | if fi > best_fitness: 336 | best_fitness = fi 337 | log_val = list(mean_losses) + list(results) + lr 338 | callbacks.run('on_fit_epoch_end', log_val, epoch, best_fitness, fi) 339 | 340 | # Save model 341 | if (not no_save) or (final_epoch and not evolve): # if save 342 | check_point = {'epoch': epoch, 343 | 'best_fitness': best_fitness, 344 | 'model': deepcopy(de_parallel(model)).half(), 345 | 'ema': deepcopy(ema.ema).half(), 346 | 'updates': ema.updates, 347 | 'optimizer': optimizer.state_dict()} 348 | 349 | # Save last, best and delete 350 | torch.save(check_point, last) 351 | if best_fitness == fi: 352 | torch.save(check_point, best) 353 | del check_point 354 | callbacks.run('on_model_save', last, epoch, final_epoch, best_fitness, fi) 355 | 356 | # Stop Single-GPU 357 | if stopper(epoch=epoch, fitness=fi): 358 | break 359 | 360 | # End training 361 | LOGGER.info('{} epochs completed in {:.3f} hours.'.format( 362 | final_epoch - start_epoch + 1, 363 | (time.time() - t0) / 3600 364 | )) 365 | 366 | if not evolve: 367 | # Strip optimizers 368 | for f in last, best: 369 | if f.exists(): 370 | strip_optimizer(f) # strip optimizers 371 | callbacks.run('on_train_end', last, best, plots, final_epoch) 372 | LOGGER.info(f"Results saved to {colorstr('bold', save_dir)}") 373 | 374 | # Release gpu memory 375 | torch.cuda.empty_cache() 376 | 377 | return results 378 | 379 | 380 | def parser(known=False): 381 | args = argparse.ArgumentParser() 382 | args.add_argument('--data_cfg', type=str, default='config/data_cfg.yaml', help='dataset config file path') 383 | args.add_argument('--batch-size', type=int, default=64, help='batch size') 384 | args.add_argument('--cache', type=str, nargs='?', const='ram', help='--cache images in "ram" (default) or "disk"') 385 | args.add_argument('--device', default='', help='cuda device, i.e. 
0 or 0,1,2,3 or cpu') 386 | args.add_argument('--workers', type=int, default=8, help='maximum number of dataloader workers') 387 | args.add_argument('--name', type=str, help='name of your experiment version', required=True) 388 | args = args.parse_known_args()[0] if known else args.parse_args() 389 | 390 | with open(Path('config') / 'train_cfg.yaml') as f: 391 | temp_args: dict = yaml.safe_load(f) 392 | 393 | keys = list(temp_args.keys()) 394 | already_keys = list(args.__dict__.keys()) 395 | 396 | for key in keys: 397 | if key not in already_keys: 398 | args.__setattr__(key, temp_args[key]) 399 | 400 | return args 401 | 402 | 403 | def main(args, callbacks=Callbacks()): 404 | 405 | set_logging() 406 | print(colorstr('Train: ') + ', '.join(f'{k}={v}' for k, v in vars(args).items())) 407 | 408 | # Check requirements 409 | check_requirements(requirements=FILE.parent / 'requirements.txt', exclude=['thop']) 410 | 411 | args.data_cfg = check_file(args.data_cfg) 412 | args.model_cfg = check_yaml(args.model_cfg) 413 | args.hyp = check_yaml(args.hyp) 414 | assert len(args.model_cfg) or len(args.weights), 'either --cfg or --weights must be specified' 415 | 416 | args.save_dir = str(increment_path(Path(args.project) / args.name, exist_ok=args.exist_ok)) 417 | 418 | # DDP mode 419 | device = select_device(args.device, batch_size=args.batch_size) 420 | print(device) 421 | 422 | train(args.hyp, args, device, callbacks) 423 | 424 | 425 | if __name__ == "__main__": 426 | main(args=parser()) 427 | -------------------------------------------------------------------------------- /utils/__init__.py: -------------------------------------------------------------------------------- 1 | # import sys 2 | # from pathlib import Path 3 | # 4 | # import torch 5 | # from PIL import ImageFont 6 | # 7 | # FILE = Path(__file__).resolve() 8 | # ROOT = FILE.parents[1] # yolov5/ dir 9 | # if str(ROOT) not in sys.path: 10 | # sys.path.append(str(ROOT)) # add ROOT to PATH 11 | # 12 | # # Check YOLOv5 Annotator font 13 | # font = 'Arial.ttf' 14 | # try: 15 | # ImageFont.truetype(font) 16 | # except Exception as e: # download if missing 17 | # url = "https://ultralytics.com/assets/" + font 18 | # print(f'Downloading {url} to {ROOT / font}...') 19 | # torch.hub.download_url_to_file(url, str(ROOT / font)) 20 | -------------------------------------------------------------------------------- /utils/activations.py: -------------------------------------------------------------------------------- 1 | """ 2 | Source: YOLOv5 🚀 by Ultralytics https://github.com/ultralytics/yolov5 3 | Activation functions 4 | """ 5 | 6 | import torch 7 | import torch.nn as nn 8 | import torch.nn.functional as F 9 | 10 | 11 | # SiLU https://arxiv.org/pdf/1606.08415.pdf ---------------------------------------------------------------------------- 12 | class SiLU(nn.Module): # export-friendly version of nn.SiLU() 13 | @staticmethod 14 | def forward(x): 15 | return x * torch.sigmoid(x) 16 | 17 | 18 | class Hardswish(nn.Module): # export-friendly version of nn.Hardswish() 19 | @staticmethod 20 | def forward(x): 21 | # return x * F.hardsigmoid(x) # for torchscript and CoreML 22 | return x * F.hardtanh(x + 3, 0., 6.) / 6. 
# for torchscript, CoreML and ONNX 23 | 24 | 25 | # Mish https://github.com/digantamisra98/Mish -------------------------------------------------------------------------- 26 | class Mish(nn.Module): 27 | @staticmethod 28 | def forward(x): 29 | return x * F.softplus(x).tanh() 30 | 31 | 32 | class MemoryEfficientMish(nn.Module): 33 | class F(torch.autograd.Function): 34 | @staticmethod 35 | def forward(ctx, x): 36 | ctx.save_for_backward(x) 37 | return x.mul(torch.tanh(F.softplus(x))) # x * tanh(ln(1 + exp(x))) 38 | 39 | @staticmethod 40 | def backward(ctx, grad_output): 41 | x = ctx.saved_tensors[0] 42 | sx = torch.sigmoid(x) 43 | fx = F.softplus(x).tanh() 44 | return grad_output * (fx + x * sx * (1 - fx * fx)) 45 | 46 | def forward(self, x): 47 | return self.F.apply(x) 48 | 49 | 50 | # FReLU https://arxiv.org/abs/2007.11824 ------------------------------------------------------------------------------- 51 | class FReLU(nn.Module): 52 | def __init__(self, c1, k=3): # ch_in, kernel 53 | super().__init__() 54 | self.conv = nn.Conv2d(c1, c1, k, 1, 1, groups=c1, bias=False) 55 | self.bn = nn.BatchNorm2d(c1) 56 | 57 | def forward(self, x): 58 | return torch.max(x, self.bn(self.conv(x))) 59 | 60 | 61 | # ACON https://arxiv.org/pdf/2009.04759.pdf ---------------------------------------------------------------------------- 62 | class AconC(nn.Module): 63 | r""" ACON activation (activate or not). 64 | AconC: (p1*x-p2*x) * sigmoid(beta*(p1*x-p2*x)) + p2*x, beta is a learnable parameter 65 | according to "Activate or Not: Learning Customized Activation" . 66 | """ 67 | 68 | def __init__(self, c1): 69 | super().__init__() 70 | self.p1 = nn.Parameter(torch.randn(1, c1, 1, 1)) 71 | self.p2 = nn.Parameter(torch.randn(1, c1, 1, 1)) 72 | self.beta = nn.Parameter(torch.ones(1, c1, 1, 1)) 73 | 74 | def forward(self, x): 75 | dpx = (self.p1 - self.p2) * x 76 | return dpx * torch.sigmoid(self.beta * dpx) + self.p2 * x 77 | 78 | 79 | class MetaAconC(nn.Module): 80 | r""" ACON activation (activate or not). 81 | MetaAconC: (p1*x-p2*x) * sigmoid(beta*(p1*x-p2*x)) + p2*x, beta is generated by a small network 82 | according to "Activate or Not: Learning Customized Activation" . 
83 | """ 84 | 85 | def __init__(self, c1, k=1, s=1, r=16): # ch_in, kernel, stride, r 86 | super().__init__() 87 | c2 = max(r, c1 // r) 88 | self.p1 = nn.Parameter(torch.randn(1, c1, 1, 1)) 89 | self.p2 = nn.Parameter(torch.randn(1, c1, 1, 1)) 90 | self.fc1 = nn.Conv2d(c1, c2, k, s, bias=True) 91 | self.fc2 = nn.Conv2d(c2, c1, k, s, bias=True) 92 | # self.bn1 = nn.BatchNorm2d(c2) 93 | # self.bn2 = nn.BatchNorm2d(c1) 94 | 95 | def forward(self, x): 96 | y = x.mean(dim=2, keepdims=True).mean(dim=3, keepdims=True) 97 | # batch-size 1 bug/instabilities https://github.com/ultralytics/yolov5/issues/2891 98 | # beta = torch.sigmoid(self.bn2(self.fc2(self.bn1(self.fc1(y))))) # bug/unstable 99 | beta = torch.sigmoid(self.fc2(self.fc1(y))) # bug patch BN layers removed 100 | dpx = (self.p1 - self.p2) * x 101 | return dpx * torch.sigmoid(beta * dpx) + self.p2 * x 102 | -------------------------------------------------------------------------------- /utils/augmentations.py: -------------------------------------------------------------------------------- 1 | """ 2 | Source: YOLOv5 🚀 by Ultralytics https://github.com/ultralytics/yolov5 3 | Image augmentation functions 4 | """ 5 | 6 | import logging 7 | import math 8 | import random 9 | 10 | import cv2 11 | import numpy as np 12 | 13 | from utils.general import colorstr, segment2box, resample_segments, check_version 14 | from utils.metrics import bbox_ioa 15 | 16 | 17 | class Albumentations: 18 | # YOLOv5 Albumentations class (optional, only used if package is installed) 19 | def __init__(self): 20 | self.transform = None 21 | try: 22 | import albumentations as A 23 | check_version(A.__version__, '1.0.3') # version requirement 24 | 25 | self.transform = A.Compose([ 26 | A.Blur(p=0.1), 27 | A.MedianBlur(p=0.1), 28 | A.ToGray(p=0.01)], 29 | bbox_params=A.BboxParams(format='yolo', label_fields=['class_labels'])) 30 | 31 | logging.info(colorstr('albumentations: ') + ', '.join(f'{x}' for x in self.transform.transforms if x.p)) 32 | except ImportError: # package not installed, skip 33 | pass 34 | except Exception as e: 35 | logging.info(colorstr('albumentations: ') + f'{e}') 36 | 37 | def __call__(self, im, labels, p=1.0): 38 | if self.transform and random.random() < p: 39 | new = self.transform(image=im, bboxes=labels[:, 1:], class_labels=labels[:, 0]) # transformed 40 | im, labels = new['image'], np.array([[c, *b] for c, b in zip(new['class_labels'], new['bboxes'])]) 41 | return im, labels 42 | 43 | 44 | def augment_hsv(im, hgain=0.5, sgain=0.5, vgain=0.5): 45 | # HSV color-space augmentation 46 | if hgain or sgain or vgain: 47 | r = np.random.uniform(-1, 1, 3) * [hgain, sgain, vgain] + 1 # random gains 48 | hue, sat, val = cv2.split(cv2.cvtColor(im, cv2.COLOR_BGR2HSV)) 49 | dtype = im.dtype # uint8 50 | 51 | x = np.arange(0, 256, dtype=r.dtype) 52 | lut_hue = ((x * r[0]) % 180).astype(dtype) 53 | lut_sat = np.clip(x * r[1], 0, 255).astype(dtype) 54 | lut_val = np.clip(x * r[2], 0, 255).astype(dtype) 55 | 56 | im_hsv = cv2.merge((cv2.LUT(hue, lut_hue), cv2.LUT(sat, lut_sat), cv2.LUT(val, lut_val))) 57 | cv2.cvtColor(im_hsv, cv2.COLOR_HSV2BGR, dst=im) # no return needed 58 | 59 | 60 | def hist_equalize(im, clahe=True, bgr=False): 61 | # Equalize histogram on BGR image 'im' with im.shape(n,m,3) and range 0-255 62 | yuv = cv2.cvtColor(im, cv2.COLOR_BGR2YUV if bgr else cv2.COLOR_RGB2YUV) 63 | if clahe: 64 | c = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8)) 65 | yuv[:, :, 0] = c.apply(yuv[:, :, 0]) 66 | else: 67 | yuv[:, :, 0] = cv2.equalizeHist(yuv[:, :, 
0]) # equalize Y channel histogram 68 | return cv2.cvtColor(yuv, cv2.COLOR_YUV2BGR if bgr else cv2.COLOR_YUV2RGB) # convert YUV image to RGB 69 | 70 | 71 | def replicate(im, labels): 72 | # Replicate labels 73 | h, w = im.shape[:2] 74 | boxes = labels[:, 1:].astype(int) 75 | x1, y1, x2, y2 = boxes.T 76 | s = ((x2 - x1) + (y2 - y1)) / 2 # side length (pixels) 77 | for i in s.argsort()[:round(s.size * 0.5)]: # smallest indices 78 | x1b, y1b, x2b, y2b = boxes[i] 79 | bh, bw = y2b - y1b, x2b - x1b 80 | yc, xc = int(random.uniform(0, h - bh)), int(random.uniform(0, w - bw)) # offset x, y 81 | x1a, y1a, x2a, y2a = [xc, yc, xc + bw, yc + bh] 82 | im[y1a:y2a, x1a:x2a] = im[y1b:y2b, x1b:x2b] # im4[ymin:ymax, xmin:xmax] 83 | labels = np.append(labels, [[labels[i, 0], x1a, y1a, x2a, y2a]], axis=0) 84 | 85 | return im, labels 86 | 87 | 88 | def letterbox(im, new_shape=(640, 640), color=(114, 114, 114), auto=True, scaleFill=False, scaleup=True, stride=32): 89 | # Resize and pad image while meeting stride-multiple constraints 90 | shape = im.shape[:2] # current shape [height, width] 91 | if isinstance(new_shape, int): 92 | new_shape = (new_shape, new_shape) 93 | 94 | # Scale ratio (new / old) 95 | r = min(new_shape[0] / shape[0], new_shape[1] / shape[1]) 96 | if not scaleup: # only scale down, do not scale up (for better val mAP) 97 | r = min(r, 1.0) 98 | 99 | # Compute padding 100 | ratio = r, r # width, height ratios 101 | new_unpad = int(round(shape[1] * r)), int(round(shape[0] * r)) 102 | dw, dh = new_shape[1] - new_unpad[0], new_shape[0] - new_unpad[1] # wh padding 103 | if auto: # minimum rectangle 104 | dw, dh = np.mod(dw, stride), np.mod(dh, stride) # wh padding 105 | elif scaleFill: # stretch 106 | dw, dh = 0.0, 0.0 107 | new_unpad = (new_shape[1], new_shape[0]) 108 | ratio = new_shape[1] / shape[1], new_shape[0] / shape[0] # width, height ratios 109 | 110 | dw /= 2 # divide padding into 2 sides 111 | dh /= 2 112 | 113 | if shape[::-1] != new_unpad: # resize 114 | im = cv2.resize(im, new_unpad, interpolation=cv2.INTER_LINEAR) 115 | top, bottom = int(round(dh - 0.1)), int(round(dh + 0.1)) 116 | left, right = int(round(dw - 0.1)), int(round(dw + 0.1)) 117 | im = cv2.copyMakeBorder(im, top, bottom, left, right, cv2.BORDER_CONSTANT, value=color) # add border 118 | return im, ratio, (dw, dh) 119 | 120 | 121 | def random_perspective(im, targets=(), segments=(), degrees=10, translate=.1, scale=.1, shear=10, perspective=0.0, 122 | border=(0, 0)): 123 | # torchvision.transforms.RandomAffine(degrees=(-10, 10), translate=(.1, .1), scale=(.9, 1.1), shear=(-10, 10)) 124 | # targets = [cls, xyxy] 125 | 126 | height = im.shape[0] + border[0] * 2 # shape(h,w,c) 127 | width = im.shape[1] + border[1] * 2 128 | 129 | # Center 130 | C = np.eye(3) 131 | C[0, 2] = -im.shape[1] / 2 # x translation (pixels) 132 | C[1, 2] = -im.shape[0] / 2 # y translation (pixels) 133 | 134 | # Perspective 135 | P = np.eye(3) 136 | P[2, 0] = random.uniform(-perspective, perspective) # x perspective (about y) 137 | P[2, 1] = random.uniform(-perspective, perspective) # y perspective (about x) 138 | 139 | # Rotation and Scale 140 | R = np.eye(3) 141 | a = random.uniform(-degrees, degrees) 142 | # a += random.choice([-180, -90, 0, 90]) # add 90deg rotations to small rotations 143 | s = random.uniform(1 - scale, 1 + scale) 144 | # s = 2 ** random.uniform(-scale, scale) 145 | R[:2] = cv2.getRotationMatrix2D(angle=a, center=(0, 0), scale=s) 146 | 147 | # Shear 148 | S = np.eye(3) 149 | S[0, 1] = math.tan(random.uniform(-shear, shear) * 
math.pi / 180) # x shear (deg) 150 | S[1, 0] = math.tan(random.uniform(-shear, shear) * math.pi / 180) # y shear (deg) 151 | 152 | # Translation 153 | T = np.eye(3) 154 | T[0, 2] = random.uniform(0.5 - translate, 0.5 + translate) * width # x translation (pixels) 155 | T[1, 2] = random.uniform(0.5 - translate, 0.5 + translate) * height # y translation (pixels) 156 | 157 | # Combined rotation matrix 158 | M = T @ S @ R @ P @ C # order of operations (right to left) is IMPORTANT 159 | if (border[0] != 0) or (border[1] != 0) or (M != np.eye(3)).any(): # image changed 160 | if perspective: 161 | im = cv2.warpPerspective(im, M, dsize=(width, height), borderValue=(114, 114, 114)) 162 | else: # affine 163 | im = cv2.warpAffine(im, M[:2], dsize=(width, height), borderValue=(114, 114, 114)) 164 | 165 | # Visualize 166 | # import matplotlib.pyplot as plt 167 | # ax = plt.subplots(1, 2, figsize=(12, 6))[1].ravel() 168 | # ax[0].imshow(im[:, :, ::-1]) # base 169 | # ax[1].imshow(im2[:, :, ::-1]) # warped 170 | 171 | # Transform label coordinates 172 | n = len(targets) 173 | if n: 174 | use_segments = any(x.any() for x in segments) 175 | new = np.zeros((n, 4)) 176 | if use_segments: # warp segments 177 | segments = resample_segments(segments) # upsample 178 | for i, segment in enumerate(segments): 179 | xy = np.ones((len(segment), 3)) 180 | xy[:, :2] = segment 181 | xy = xy @ M.T # transform 182 | xy = xy[:, :2] / xy[:, 2:3] if perspective else xy[:, :2] # perspective rescale or affine 183 | 184 | # clip 185 | new[i] = segment2box(xy, width, height) 186 | 187 | else: # warp boxes 188 | xy = np.ones((n * 4, 3)) 189 | xy[:, :2] = targets[:, [1, 2, 3, 4, 1, 4, 3, 2]].reshape(n * 4, 2) # x1y1, x2y2, x1y2, x2y1 190 | xy = xy @ M.T # transform 191 | xy = (xy[:, :2] / xy[:, 2:3] if perspective else xy[:, :2]).reshape(n, 8) # perspective rescale or affine 192 | 193 | # create new boxes 194 | x = xy[:, [0, 2, 4, 6]] 195 | y = xy[:, [1, 3, 5, 7]] 196 | new = np.concatenate((x.min(1), y.min(1), x.max(1), y.max(1))).reshape(4, n).T 197 | 198 | # clip 199 | new[:, [0, 2]] = new[:, [0, 2]].clip(0, width) 200 | new[:, [1, 3]] = new[:, [1, 3]].clip(0, height) 201 | 202 | # filter candidates 203 | i = box_candidates(box1=targets[:, 1:5].T * s, box2=new.T, area_thr=0.01 if use_segments else 0.10) 204 | targets = targets[i] 205 | targets[:, 1:5] = new[i] 206 | 207 | return im, targets 208 | 209 | 210 | def copy_paste(im, labels, segments, p=0.5): 211 | # Implement Copy-Paste augmentation https://arxiv.org/abs/2012.07177, labels as nx5 np.array(cls, xyxy) 212 | n = len(segments) 213 | if p and n: 214 | h, w, c = im.shape # height, width, channels 215 | im_new = np.zeros(im.shape, np.uint8) 216 | for j in random.sample(range(n), k=round(p * n)): 217 | l, s = labels[j], segments[j] 218 | box = w - l[3], l[2], w - l[1], l[4] 219 | ioa = bbox_ioa(box, labels[:, 1:5]) # intersection over area 220 | if (ioa < 0.30).all(): # allow 30% obscuration of existing labels 221 | labels = np.concatenate((labels, [[l[0], *box]]), 0) 222 | segments.append(np.concatenate((w - s[:, 0:1], s[:, 1:2]), 1)) 223 | cv2.drawContours(im_new, [segments[j].astype(np.int32)], -1, (255, 255, 255), cv2.FILLED) 224 | 225 | result = cv2.bitwise_and(src1=im, src2=im_new) 226 | result = cv2.flip(result, 1) # augment segments (flip left-right) 227 | i = result > 0 # pixels to replace 228 | # i[:, :] = result.max(2).reshape(h, w, 1) # act over ch 229 | im[i] = result[i] # cv2.imwrite('debug.jpg', im) # debug 230 | 231 | return im, labels, segments 232 | 233 | 
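# Offline-augmentation sketch using the helpers above, e.g. to expand a training set
# before submission. Illustrative values only; assumes `im` is a BGR uint8 image and
# `labels` is an nx5 float array of (cls, x1, y1, x2, y2) in pixels:
#   im = cv2.imread('example.jpg')  # hypothetical image path
#   labels = np.array([[0, 50, 60, 120, 200]], dtype=np.float32)
#   augment_hsv(im, hgain=0.015, sgain=0.7, vgain=0.4)  # in-place HSV jitter (YOLOv5 default gains)
#   im, labels = random_perspective(im, targets=labels, degrees=5, translate=.1, scale=.1, shear=2)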
234 | def cutout(im, labels, p=0.5): 235 | # Applies image cutout augmentation https://arxiv.org/abs/1708.04552 236 | if random.random() < p: 237 | h, w = im.shape[:2] 238 | scales = [0.5] * 1 + [0.25] * 2 + [0.125] * 4 + [0.0625] * 8 + [0.03125] * 16 # image size fraction 239 | for s in scales: 240 | mask_h = random.randint(1, int(h * s)) # create random masks 241 | mask_w = random.randint(1, int(w * s)) 242 | 243 | # box 244 | xmin = max(0, random.randint(0, w) - mask_w // 2) 245 | ymin = max(0, random.randint(0, h) - mask_h // 2) 246 | xmax = min(w, xmin + mask_w) 247 | ymax = min(h, ymin + mask_h) 248 | 249 | # apply random color mask 250 | im[ymin:ymax, xmin:xmax] = [random.randint(64, 191) for _ in range(3)] 251 | 252 | # return unobscured labels 253 | if len(labels) and s > 0.03: 254 | box = np.array([xmin, ymin, xmax, ymax], dtype=np.float32) 255 | ioa = bbox_ioa(box, labels[:, 1:5]) # intersection over area 256 | labels = labels[ioa < 0.60] # remove >60% obscured labels 257 | 258 | return labels 259 | 260 | 261 | def mixup(im, labels, im2, labels2): 262 | # Applies MixUp augmentation https://arxiv.org/pdf/1710.09412.pdf 263 | r = np.random.beta(32.0, 32.0) # mixup ratio, alpha=beta=32.0 264 | im = (im * r + im2 * (1 - r)).astype(np.uint8) 265 | labels = np.concatenate((labels, labels2), 0) 266 | return im, labels 267 | 268 | 269 | def box_candidates(box1, box2, wh_thr=2, ar_thr=20, area_thr=0.1, eps=1e-16): # box1(4,n), box2(4,n) 270 | # Compute candidate boxes: box1 before augment, box2 after augment, wh_thr (pixels), aspect_ratio_thr, area_ratio 271 | w1, h1 = box1[2] - box1[0], box1[3] - box1[1] 272 | w2, h2 = box2[2] - box2[0], box2[3] - box2[1] 273 | ar = np.maximum(w2 / (h2 + eps), h2 / (w2 + eps)) # aspect ratio 274 | return (w2 > wh_thr) & (h2 > wh_thr) & (w2 * h2 / (w1 * h1 + eps) > area_thr) & (ar < ar_thr) # candidates 275 | -------------------------------------------------------------------------------- /utils/autoanchor.py: -------------------------------------------------------------------------------- 1 | """ 2 | Source: YOLOv5 🚀 by Ultralytics https://github.com/ultralytics/yolov5 3 | Auto-anchor utils 4 | """ 5 | 6 | import random 7 | 8 | import numpy as np 9 | import torch 10 | import yaml 11 | from tqdm import tqdm 12 | 13 | from utils.general import colorstr 14 | 15 | 16 | def check_anchor_order(m): 17 | # Check anchor order against stride order for YOLOv5 Detect() module m, and correct if necessary 18 | a = m.anchor_grid.prod(-1).view(-1) # anchor area 19 | da = a[-1] - a[0] # delta a 20 | ds = m.stride[-1] - m.stride[0] # delta s 21 | if da.sign() != ds.sign(): # same order 22 | print('Reversing anchor order') 23 | m.anchors[:] = m.anchors.flip(0) 24 | m.anchor_grid[:] = m.anchor_grid.flip(0) 25 | 26 | 27 | def check_anchors(dataset, model, thr=4.0, imgsz=640): 28 | # Check anchor fit to config, recompute if necessary 29 | prefix = colorstr('autoanchor: ') 30 | print(f'\n{prefix}Analyzing anchors... ', end='') 31 | m = model.module.model[-1] if hasattr(model, 'module') else model.model[-1] # Detect() 32 | shapes = imgsz * dataset.shapes / dataset.shapes.max(1, keepdims=True) 33 | scale = np.random.uniform(0.9, 1.1, size=(shapes.shape[0], 1)) # augment scale 34 | wh = torch.tensor(np.concatenate([l[:, 3:5] * s for s, l in zip(shapes * scale, dataset.labels)])).float() # wh 35 | 36 | def metric(k): # compute metric 37 | r = wh[:, None] / k[None] 38 | x = torch.min(r, 1. / r).min(2)[0] # ratio metric 39 | best = x.max(1)[0] # best_x 40 | aat = (x > 1. 
/ thr).float().sum(1).mean() # anchors above threshold 41 | bpr = (best > 1. / thr).float().mean() # best possible recall 42 | return bpr, aat 43 | 44 | anchors = m.anchor_grid.clone().cpu().view(-1, 2) # current anchors 45 | bpr, aat = metric(anchors) 46 | print(f'anchors/target = {aat:.2f}, Best Possible Recall (BPR) = {bpr:.4f}', end='') 47 | if bpr < 0.98: # threshold to recompute 48 | print('. Attempting to improve anchors, please wait...') 49 | na = m.anchor_grid.numel() // 2 # number of anchors 50 | try: 51 | anchors = kmean_anchors(dataset, n=na, img_size=imgsz, thr=thr, gen=1000, verbose=False) 52 | except Exception as e: 53 | print(f'{prefix}ERROR: {e}') 54 | new_bpr = metric(anchors)[0] 55 | if new_bpr > bpr: # replace anchors 56 | anchors = torch.tensor(anchors, device=m.anchors.device).type_as(m.anchors) 57 | m.anchor_grid[:] = anchors.clone().view_as(m.anchor_grid) # for inference 58 | m.anchors[:] = anchors.clone().view_as(m.anchors) / m.stride.to(m.anchors.device).view(-1, 1, 1) # loss 59 | check_anchor_order(m) 60 | print(f'{prefix}New anchors saved to model. Update model *.yaml to use these anchors in the future.') 61 | else: 62 | print(f'{prefix}Original anchors better than new anchors. Proceeding with original anchors.') 63 | print('') # newline 64 | 65 | 66 | def kmean_anchors(dataset='./config/coco128.yaml', n=9, img_size=640, thr=4.0, gen=1000, verbose=True): 67 | """ Creates kmeans-evolved anchors from training dataset 68 | 69 | Arguments: 70 | dataset: path to config.yaml, or a loaded dataset 71 | n: number of anchors 72 | img_size: image size used for training 73 | thr: anchor-label wh ratio threshold hyperparameter hyp['anchor_t'] used for training, default=4.0 74 | gen: generations to evolve anchors using genetic algorithm 75 | verbose: print all results 76 | 77 | Return: 78 | k: kmeans evolved anchors 79 | 80 | Usage: 81 | from utils.autoanchor import *; _ = kmean_anchors() 82 | """ 83 | from scipy.cluster.vq import kmeans 84 | 85 | thr = 1. / thr 86 | prefix = colorstr('autoanchor: ') 87 | 88 | def metric(k, wh): # compute metrics 89 | r = wh[:, None] / k[None] 90 | x = torch.min(r, 1. 
/ r).min(2)[0] # ratio metric 91 | # x = wh_iou(wh, torch.tensor(k)) # iou metric 92 | return x, x.max(1)[0] # x, best_x 93 | 94 | def anchor_fitness(k): # mutation fitness 95 | _, best = metric(torch.tensor(k, dtype=torch.float32), wh) 96 | return (best * (best > thr).float()).mean() # fitness 97 | 98 | def print_results(k): 99 | k = k[np.argsort(k.prod(1))] # sort small to large 100 | x, best = metric(k, wh0) 101 | bpr, aat = (best > thr).float().mean(), (x > thr).float().mean() * n # best possible recall, anch > thr 102 | print(f'{prefix}thr={thr:.2f}: {bpr:.4f} best possible recall, {aat:.2f} anchors past thr') 103 | print(f'{prefix}n={n}, img_size={img_size}, metric_all={x.mean():.3f}/{best.mean():.3f}-mean/best, ' 104 | f'past_thr={x[x > thr].mean():.3f}-mean: ', end='') 105 | for i, x in enumerate(k): 106 | print('%i,%i' % (round(x[0]), round(x[1])), end=', ' if i < len(k) - 1 else '\n') # use in *.cfg 107 | return k 108 | 109 | if isinstance(dataset, str): # *.yaml file 110 | with open(dataset, errors='ignore') as f: 111 | data_dict = yaml.safe_load(f) # model dict 112 | from utils.datasets import LoadImagesAndLabels 113 | dataset = LoadImagesAndLabels(data_dict['train'], augment=True, rect=True) 114 | 115 | # Get label wh 116 | shapes = img_size * dataset.shapes / dataset.shapes.max(1, keepdims=True) 117 | wh0 = np.concatenate([l[:, 3:5] * s for s, l in zip(shapes, dataset.labels)]) # wh 118 | 119 | # Filter 120 | i = (wh0 < 3.0).any(1).sum() 121 | if i: 122 | print(f'{prefix}WARNING: Extremely small objects found. {i} of {len(wh0)} labels are < 3 pixels in size.') 123 | wh = wh0[(wh0 >= 2.0).any(1)] # filter > 2 pixels 124 | # wh = wh * (np.random.rand(wh.shape[0], 1) * 0.9 + 0.1) # multiply by random scale 0-1 125 | 126 | # Kmeans calculation 127 | print(f'{prefix}Running kmeans for {n} anchors on {len(wh)} points...') 128 | s = wh.std(0) # sigmas for whitening 129 | k, dist = kmeans(wh / s, n, iter=30) # points, mean distance 130 | assert len(k) == n, print(f'{prefix}ERROR: scipy.cluster.vq.kmeans requested {n} points but returned only {len(k)}') 131 | k *= s 132 | wh = torch.tensor(wh, dtype=torch.float32) # filtered 133 | wh0 = torch.tensor(wh0, dtype=torch.float32) # unfiltered 134 | k = print_results(k) 135 | 136 | # Plot 137 | # k, d = [None] * 20, [None] * 20 138 | # for i in tqdm(range(1, 21)): 139 | # k[i-1], d[i-1] = kmeans(wh / s, i) # points, mean distance 140 | # fig, ax = plt.subplots(1, 2, figsize=(14, 7), tight_layout=True) 141 | # ax = ax.ravel() 142 | # ax[0].plot(np.arange(1, 21), np.array(d) ** 2, marker='.') 143 | # fig, ax = plt.subplots(1, 2, figsize=(14, 7)) # plot wh 144 | # ax[0].hist(wh[wh[:, 0]<100, 0],400) 145 | # ax[1].hist(wh[wh[:, 1]<100, 1],400) 146 | # fig.savefig('wh.png', dpi=200) 147 | 148 | # Evolve 149 | npr = np.random 150 | f, sh, mp, s = anchor_fitness(k), k.shape, 0.9, 0.1 # fitness, generations, mutation prob, sigma 151 | pbar = tqdm(range(gen), desc=f'{prefix}Evolving anchors with Genetic Algorithm:') # progress bar 152 | for _ in pbar: 153 | v = np.ones(sh) 154 | while (v == 1).all(): # mutate until a change occurs (prevent duplicates) 155 | v = ((npr.random(sh) < mp) * random.random() * npr.randn(*sh) * s + 1).clip(0.3, 3.0) 156 | kg = (k.copy() * v).clip(min=2.0) 157 | fg = anchor_fitness(kg) 158 | if fg > f: 159 | f, k = fg, kg.copy() 160 | pbar.desc = f'{prefix}Evolving anchors with Genetic Algorithm: fitness = {f:.4f}' 161 | if verbose: 162 | print_results(k) 163 | 164 | return print_results(k) 165 | 
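# Worked example of the ratio metric above (illustrative values only):
#   wh = torch.tensor([[30., 40.], [120., 90.], [15., 60.]])  # made-up label wh (pixels)
#   k = torch.tensor([[10., 13.], [30., 61.], [116., 90.]])   # made-up anchor wh
#   r = wh[:, None] / k[None]               # wh ratio of every label/anchor pair
#   x = torch.min(r, 1. / r).min(2)[0]      # worst of the w- and h-ratios per pair
#   best = x.max(1)[0]                      # best-matching anchor per label
#   bpr = (best > 1. / 4.0).float().mean()  # fraction of labels matched at thr=4.0, i.e. BPR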
-------------------------------------------------------------------------------- /utils/callbacks.py: -------------------------------------------------------------------------------- 1 | """ 2 | Source: YOLOv5 🚀 by Ultralytics https://github.com/ultralytics/yolov5 3 | Callback utils 4 | """ 5 | 6 | 7 | class Callbacks: 8 | """" 9 | Handles all registered callbacks for YOLOv5 Hooks 10 | """ 11 | 12 | # Define the available callbacks 13 | _callbacks = { 14 | 'on_pretrain_routine_start': [], 15 | 'on_pretrain_routine_end': [], 16 | 17 | 'on_train_start': [], 18 | 'on_train_epoch_start': [], 19 | 'on_train_batch_start': [], 20 | 'optimizer_step': [], 21 | 'on_before_zero_grad': [], 22 | 'on_train_batch_end': [], 23 | 'on_train_epoch_end': [], 24 | 25 | 'on_val_start': [], 26 | 'on_val_batch_start': [], 27 | 'on_val_image_end': [], 28 | 'on_val_batch_end': [], 29 | 'on_val_end': [], 30 | 31 | 'on_fit_epoch_end': [], # fit = train + val 32 | 'on_model_save': [], 33 | 'on_train_end': [], 34 | 35 | 'teardown': [], 36 | } 37 | 38 | def register_action(self, hook, name='', callback=None): 39 | """ 40 | Register a new action to a callback hook 41 | 42 | Args: 43 | hook The callback hook name to register the action to 44 | name The name of the action for later reference 45 | callback The callback to fire 46 | """ 47 | assert hook in self._callbacks, f"hook '{hook}' not found in callbacks {self._callbacks}" 48 | assert callable(callback), f"callback '{callback}' is not callable" 49 | self._callbacks[hook] = [{'name': name, 'callback': callback}] 50 | 51 | def get_registered_actions(self, hook=None): 52 | """" 53 | Returns all the registered actions by callback hook 54 | 55 | Args: 56 | hook The name of the hook to check, defaults to all 57 | """ 58 | if hook: 59 | return self._callbacks[hook] 60 | else: 61 | return self._callbacks 62 | 63 | def run(self, hook, *args, **kwargs): 64 | """ 65 | Loop through the registered actions and fire all callbacks 66 | 67 | Args: 68 | hook The name of the hook to check, defaults to all 69 | args Arguments to receive from YOLOv5 70 | kwargs Keyword Arguments to receive from YOLOv5 71 | """ 72 | 73 | assert hook in self._callbacks, f"hook '{hook}' not found in callbacks {self._callbacks}" 74 | 75 | for logger in self._callbacks[hook]: 76 | logger['callback'](*args, **kwargs) 77 | -------------------------------------------------------------------------------- /utils/downloads.py: -------------------------------------------------------------------------------- 1 | """ 2 | Source: YOLOv5 🚀 by Ultralytics https://github.com/ultralytics/yolov5 3 | 4 | Download utils 5 | """ 6 | 7 | import os 8 | import platform 9 | import subprocess 10 | import time 11 | import urllib 12 | from pathlib import Path 13 | 14 | import requests 15 | import torch 16 | 17 | 18 | def gsutil_getsize(url=''): 19 | # gs://bucket/file size https://cloud.google.com/storage/docs/gsutil/commands/du 20 | s = subprocess.check_output(f'gsutil du {url}', shell=True).decode('utf-8') 21 | return eval(s.split(' ')[0]) if len(s) else 0 # bytes 22 | 23 | 24 | def safe_download(file, url, url2=None, min_bytes=1E0, error_msg=''): 25 | # Attempts to download file from url or url2, checks and removes incomplete downloads < min_bytes 26 | file = Path(file) 27 | assert_msg = f"Downloaded file '{file}' does not exist or size is < min_bytes={min_bytes}" 28 | try: # url1 29 | print(f'Downloading {url} to {file}...') 30 | torch.hub.download_url_to_file(url, str(file)) 31 | assert file.exists() and file.stat().st_size > 
min_bytes, assert_msg # check 32 | except Exception as e: # url2 33 | file.unlink(missing_ok=True) # remove partial downloads 34 | print(f'ERROR: {e}\nRe-attempting {url2 or url} to {file}...') 35 | os.system(f"curl -L '{url2 or url}' -o '{file}' --retry 3 -C -") # curl download, retry and resume on fail 36 | finally: 37 | if not file.exists() or file.stat().st_size < min_bytes: # check 38 | file.unlink(missing_ok=True) # remove partial downloads 39 | print(f"ERROR: {assert_msg}\n{error_msg}") 40 | print('') 41 | 42 | 43 | def attempt_download(file, repo='ultralytics/yolov5'): # from utils.downloads import *; attempt_download() 44 | # Attempt file download if does not exist 45 | file = Path(str(file).strip().replace("'", '')) 46 | 47 | if not file.exists(): 48 | # URL specified 49 | name = Path(urllib.parse.unquote(str(file))).name # decode '%2F' to '/' etc. 50 | if str(file).startswith(('http:/', 'https:/')): # download 51 | url = str(file).replace(':/', '://') # Pathlib turns :// -> :/ 52 | name = name.split('?')[0] # parse authentication https://url.com/file.txt?auth... 53 | safe_download(file=name, url=url, min_bytes=1E5) 54 | return name 55 | 56 | # GitHub assets 57 | file.parent.mkdir(parents=True, exist_ok=True) # make parent dir (if required) 58 | try: 59 | response = requests.get(f'https://api.github.com/repos/{repo}/releases/latest').json() # github api 60 | assets = [x['name'] for x in response['assets']] # release assets, i.e. ['yolov5s.pt', 'yolov5m.pt', ...] 61 | tag = response['tag_name'] # i.e. 'v1.0' 62 | except: # fallback plan 63 | assets = ['yolov5s.pt', 'yolov5m.pt', 'yolov5l.pt', 'yolov5x.pt', 64 | 'yolov5s6.pt', 'yolov5m6.pt', 'yolov5l6.pt', 'yolov5x6.pt'] 65 | try: 66 | tag = subprocess.check_output('git tag', shell=True, stderr=subprocess.STDOUT).decode().split()[-1] 67 | except: 68 | tag = 'v5.0' # current release 69 | 70 | if name in assets: 71 | safe_download(file, 72 | url=f'https://github.com/{repo}/releases/download/{tag}/{name}', 73 | # url2=f'https://storage.googleapis.com/{repo}/ckpt/{name}', # backup url (optional) 74 | min_bytes=1E5, 75 | error_msg=f'{file} missing, try downloading from https://github.com/{repo}/releases/') 76 | 77 | return str(file) 78 | 79 | 80 | def gdrive_download(id='16TiPfZj7htmTyhntwcZyEEAejOUxuT6m', file='tmp.zip'): 81 | # Downloads a file from Google Drive. from yolov5.utils.downloads import *; gdrive_download() 82 | t = time.time() 83 | file = Path(file) 84 | cookie = Path('cookie') # gdrive cookie 85 | print(f'Downloading https://drive.google.com/uc?export=download&id={id} as {file}... ', end='') 86 | file.unlink(missing_ok=True) # remove existing file 87 | cookie.unlink(missing_ok=True) # remove existing cookie 88 | 89 | # Attempt file download 90 | out = "NUL" if platform.system() == "Windows" else "/dev/null" 91 | os.system(f'curl -c ./cookie -s -L "drive.google.com/uc?export=download&id={id}" > {out}') 92 | if os.path.exists('cookie'): # large file 93 | s = f'curl -Lb ./cookie "drive.google.com/uc?export=download&confirm={get_token()}&id={id}" -o {file}' 94 | else: # small file 95 | s = f'curl -s -L -o {file} "drive.google.com/uc?export=download&id={id}"' 96 | r = os.system(s) # execute, capture return 97 | cookie.unlink(missing_ok=True) # remove existing cookie 98 | 99 | # Error check 100 | if r != 0: 101 | file.unlink(missing_ok=True) # remove partial 102 | print('Download error ') # raise Exception('Download error') 103 | return r 104 | 105 | # Unzip if archive 106 | if file.suffix == '.zip': 107 | print('unzipping... 
', end='') 108 | os.system(f'unzip -q {file}') # unzip 109 | file.unlink() # remove zip to free space 110 | 111 | print(f'Done ({time.time() - t:.1f}s)') 112 | return r 113 | 114 | 115 | def get_token(cookie="./cookie"): 116 | with open(cookie) as f: 117 | for line in f: 118 | if "download" in line: 119 | return line.split()[-1] 120 | return "" 121 | 122 | # Google utils: https://cloud.google.com/storage/docs/reference/libraries ---------------------------------------------- 123 | # 124 | # 125 | # def upload_blob(bucket_name, source_file_name, destination_blob_name): 126 | # # Uploads a file to a bucket 127 | # # https://cloud.google.com/storage/docs/uploading-objects#storage-upload-object-python 128 | # 129 | # storage_client = storage.Client() 130 | # bucket = storage_client.get_bucket(bucket_name) 131 | # blob = bucket.blob(destination_blob_name) 132 | # 133 | # blob.upload_from_filename(source_file_name) 134 | # 135 | # print('File {} uploaded to {}.'.format( 136 | # source_file_name, 137 | # destination_blob_name)) 138 | # 139 | # 140 | # def download_blob(bucket_name, source_blob_name, destination_file_name): 141 | # # Uploads a blob from a bucket 142 | # storage_client = storage.Client() 143 | # bucket = storage_client.get_bucket(bucket_name) 144 | # blob = bucket.blob(source_blob_name) 145 | # 146 | # blob.download_to_filename(destination_file_name) 147 | # 148 | # print('Blob {} downloaded to {}.'.format( 149 | # source_blob_name, 150 | # destination_file_name)) 151 | -------------------------------------------------------------------------------- /utils/loggers/__init__.py: -------------------------------------------------------------------------------- 1 | # YOLOv5 🚀 by Ultralytics, GPL-3.0 license 2 | """ 3 | Logging utils 4 | """ 5 | 6 | import warnings 7 | from threading import Thread 8 | 9 | import torch 10 | from torch.utils.tensorboard import SummaryWriter 11 | 12 | from utils.general import colorstr, emojis 13 | from utils.loggers.wandb.wandb_utils import WandbLogger 14 | from utils.plots import plot_images, plot_results 15 | from utils.torch_utils import de_parallel 16 | 17 | LOGGERS = ('csv', 'tb') # text-file, TensorBoard, Weights & Biases 18 | 19 | try: 20 | import wandb 21 | 22 | assert hasattr(wandb, '__version__') # verify package import not local dir 23 | except (ImportError, AssertionError): 24 | wandb = None 25 | 26 | 27 | class Loggers: 28 | # YOLOv5 Loggers class 29 | def __init__(self, save_dir=None, weights=None, opt=None, hyp=None, logger=None, include: tuple = LOGGERS): 30 | self.save_dir = save_dir 31 | self.weights = weights 32 | self.opt = opt 33 | self.hyp = hyp 34 | self.logger = logger # for printing results to console 35 | self.include = include 36 | self.keys = ['train/box_loss', 'train/obj_loss', 'train/cls_loss', # train loss 37 | 'metrics/precision', 'metrics/recall', 'metrics/wAP_0.5','metrics/mAP_0.5', 'metrics/mAP_0.5:0.95', # metrics 38 | 'val/box_loss', 'val/obj_loss', 'val/cls_loss', # val loss 39 | 'x/lr0', 'x/lr1', 'x/lr2'] # params 40 | for k in LOGGERS: 41 | setattr(self, k, None) # init empty logger dictionary 42 | self.csv = True # always log to csv 43 | 44 | # TensorBoard 45 | s = self.save_dir 46 | if 'tb' in self.include and not self.opt.evolve: 47 | prefix = colorstr('TensorBoard: ') 48 | self.logger.info(f"{prefix}Start with 'tensorboard --logdir {s.parent}', view at http://localhost:6006/") 49 | self.tb = SummaryWriter(str(s)) 50 | 51 | # W&B 52 | if wandb and 'wandb' in self.include: 53 | wandb_artifact_resume = 
isinstance(self.opt.resume, str) and self.opt.resume.startswith('wandb-artifact://') 54 | run_id = torch.load(self.weights).get('wandb_id') if self.opt.resume and not wandb_artifact_resume else None 55 | self.opt.hyp = self.hyp # add hyperparameters 56 | self.wandb = WandbLogger(self.opt, run_id) 57 | else: 58 | self.wandb = None 59 | 60 | def on_pretrain_routine_end(self): 61 | # Callback runs on pre-train routine end 62 | paths = self.save_dir.glob('*labels*.jpg') # training labels 63 | if self.wandb: 64 | self.wandb.log({"Labels": [wandb.Image(str(x), caption=x.name) for x in paths]}) 65 | 66 | def on_train_batch_end(self, ni, model, imgs, targets, paths, plots, sync_bn): 67 | # Callback runs on train batch end 68 | if plots: 69 | if ni == 0: 70 | if not sync_bn: # tb.add_graph() --sync known issue https://github.com/ultralytics/yolov5/issues/3754 71 | with warnings.catch_warnings(): 72 | warnings.simplefilter('ignore') # suppress jit trace warning 73 | self.tb.add_graph(torch.jit.trace(de_parallel(model), imgs[0:1], strict=False), []) 74 | if ni < 3: 75 | f = self.save_dir / f'train_batch{ni}.jpg' # filename 76 | Thread(target=plot_images, args=(imgs, targets, paths, f), daemon=True).start() 77 | if self.wandb and ni == 10: 78 | files = sorted(self.save_dir.glob('train*.jpg')) 79 | self.wandb.log({'Mosaics': [wandb.Image(str(f), caption=f.name) for f in files if f.exists()]}) 80 | 81 | def on_train_epoch_end(self, epoch): 82 | # Callback runs on train epoch end 83 | if self.wandb: 84 | self.wandb.current_epoch = epoch + 1 85 | 86 | def on_val_image_end(self, pred, predn, path, names, im): 87 | # Callback runs on val image end 88 | if self.wandb: 89 | self.wandb.val_one_image(pred, predn, path, names, im) 90 | 91 | def on_val_end(self): 92 | # Callback runs on val end 93 | if self.wandb: 94 | files = sorted(self.save_dir.glob('val*.jpg')) 95 | self.wandb.log({"Validation": [wandb.Image(str(f), caption=f.name) for f in files]}) 96 | 97 | def on_fit_epoch_end(self, vals, epoch, best_fitness, fi): 98 | # Callback runs at the end of each fit (train+val) epoch 99 | x = {k: v for k, v in zip(self.keys, vals)} # dict 100 | if self.csv: 101 | file = self.save_dir / 'results.csv' 102 | n = len(x) + 1 # number of cols 103 | s = '' if file.exists() else (('%20s,' * n % tuple(['epoch'] + self.keys)).rstrip(',') + '\n') # add header 104 | with open(file, 'a') as f: 105 | f.write(s + ('%20.5g,' * n % tuple([epoch] + vals)).rstrip(',') + '\n') 106 | 107 | if self.tb: 108 | for k, v in x.items(): 109 | self.tb.add_scalar(k, v, epoch) 110 | 111 | if self.wandb: 112 | self.wandb.log(x) 113 | self.wandb.end_epoch(best_result=best_fitness == fi) 114 | 115 | def on_model_save(self, last, epoch, final_epoch, best_fitness, fi): 116 | # Callback runs on model save event 117 | if self.wandb: 118 | if ((epoch + 1) % self.opt.save_period == 0 and not final_epoch) and self.opt.save_period != -1: 119 | self.wandb.log_model(last.parent, self.opt, epoch, fi, best_model=best_fitness == fi) 120 | 121 | def on_train_end(self, last, best, plots, epoch): 122 | # Callback runs on training end 123 | if plots: 124 | plot_results(file=self.save_dir / 'results.csv') # save results.png 125 | files = ['results.png', 'confusion_matrix.png', *[f'{x}_curve.png' for x in ('F1', 'PR', 'P', 'R')]] 126 | files = [(self.save_dir / f) for f in files if (self.save_dir / f).exists()] # filter 127 | 128 | if self.tb: 129 | import cv2 130 | for f in files: 131 | self.tb.add_image(f.stem, cv2.imread(str(f))[..., ::-1], epoch, 
dataformats='HWC') 132 | 133 | if self.wandb: 134 | self.wandb.log({"Results": [wandb.Image(str(f), caption=f.name) for f in files]}) 135 | # Calling wandb.log. TODO: Refactor this into WandbLogger.log_model 136 | if not self.opt.evolve: 137 | wandb.log_artifact(str(best if best.exists() else last), type='model', 138 | name='run_' + self.wandb.wandb_run.id + '_model', 139 | aliases=['latest', 'best', 'stripped']) 140 | self.wandb.finish_run() 141 | else: 142 | self.wandb.finish_run() 143 | self.wandb = WandbLogger(self.opt) 144 | -------------------------------------------------------------------------------- /utils/loggers/wandb/README.md: -------------------------------------------------------------------------------- 1 | 📚 This guide explains how to use **Weights & Biases** (W&B) with YOLOv5 🚀. 2 | * [About Weights & Biases](#about-weights-&-biases) 3 | * [First-Time Setup](#first-time-setup) 4 | * [Viewing runs](#viewing-runs) 5 | * [Advanced Usage: Dataset Versioning and Evaluation](#advanced-usage) 6 | * [Reports: Share your work with the world!](#reports) 7 | 8 | ## About Weights & Biases 9 | Think of [W&B](https://wandb.ai/site?utm_campaign=repo_yolo_wandbtutorial) like GitHub for machine learning models. With a few lines of code, save everything you need to debug, compare and reproduce your models — architecture, hyperparameters, git commits, model weights, GPU usage, and even datasets and predictions. 10 | 11 | Used by top researchers including teams at OpenAI, Lyft, Github, and MILA, W&B is part of the new standard of best practices for machine learning. How W&B can help you optimize your machine learning workflows: 12 | 13 | * [Debug](https://wandb.ai/wandb/getting-started/reports/Visualize-Debug-Machine-Learning-Models--VmlldzoyNzY5MDk#Free-2) model performance in real time 14 | * [GPU usage](https://wandb.ai/wandb/getting-started/reports/Visualize-Debug-Machine-Learning-Models--VmlldzoyNzY5MDk#System-4), visualized automatically 15 | * [Custom charts](https://wandb.ai/wandb/customizable-charts/reports/Powerful-Custom-Charts-To-Debug-Model-Peformance--VmlldzoyNzY4ODI) for powerful, extensible visualization 16 | * [Share insights](https://wandb.ai/wandb/getting-started/reports/Visualize-Debug-Machine-Learning-Models--VmlldzoyNzY5MDk#Share-8) interactively with collaborators 17 | * [Optimize hyperparameters](https://docs.wandb.com/sweeps) efficiently 18 | * [Track](https://docs.wandb.com/artifacts) datasets, pipelines, and production models 19 | 20 | ## First-Time Setup 21 |
22 | Toggle Details 23 | When you first train, W&B will prompt you to create a new account and will generate an **API key** for you. If you are an existing user, you can retrieve your key from https://wandb.ai/authorize. This key is used to tell W&B where to log your data. You only need to supply your key once, and then it is remembered on the same device. 24 | 25 | W&B will create a cloud **project** (default is 'YOLOv5') for your training runs, and each new training run will be provided a unique run **name** within that project as project/name. You can also manually set your project and run name as: 26 | 27 | ```shell 28 | $ python train.py --project ... --name ... 29 | ``` 30 | 31 |
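For reference, here is a minimal Python sketch of what those two flags map to in the `wandb` API; the project and run names below are illustrative, and `wandb.login()` uses your stored API key (or the `WANDB_API_KEY` environment variable) or prompts for one:

```python
import wandb

wandb.login()  # one-time authentication; reads a stored key or prompts for it

# Equivalent of `--project ... --name ...`: the run is grouped under the project
# and shows up in the W&B console under the given display name
run = wandb.init(project='YOLOv5', name='baseline-exp')
run.finish()
```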
33 | 34 | ## Viewing Runs 35 |
36 | Toggle Details 37 | Run information streams from your environment to the W&B cloud console as you train. This allows you to monitor and even cancel runs in real time. All important information is logged (see the sketch after this list): 38 | 39 | * Training & Validation losses 40 | * Metrics: Precision, Recall, mAP@0.5, mAP@0.5:0.95 41 | * Learning Rate over time 42 | * A bounding box debugging panel, showing the training progress over time 43 | * GPU: Type, **GPU Utilization**, power, temperature, **CUDA memory usage** 44 | * System: Disk I/O, CPU utilization, RAM memory usage 45 | * Your trained model as W&B Artifact 46 | * Environment: OS and Python types, Git repository and state, **training command** 47 | 48 | 49 |
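As a rough illustration, the per-epoch logging that produces these panels boils down to a dictionary of scalars sent to `wandb.log`. The metric values below are made up, but `metrics/mAP_0.5` is the same key the sweep configuration later maximizes:

```python
import wandb

run = wandb.init(project='YOLOv5')

# Illustrative values only; the training callbacks build this dict
# from their internal list of metric keys at the end of each fit epoch
run.log({'metrics/precision': 0.82,
         'metrics/recall': 0.79,
         'metrics/mAP_0.5': 0.85})
run.finish()
```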
50 | 51 | ## Advanced Usage 52 | You can leverage W&B artifacts and Tables integration to easily visualize and manage your datasets, models and training evaluations. Here are some quick examples to get you started. 53 |
54 |

1. Visualize and Version Datasets

55 | Log, visualize, dynamically query, and understand your data with W&B Tables. You can use the following command to log your dataset as a W&B Table. This will generate a {dataset}_wandb.yaml file which can be used to train from the dataset artifact (a sketch of the underlying upload follows the usage below). 56 |
57 | Usage 58 | Code $ python utils/loggers/wandb/log_dataset.py --project ... --name ... --data .. 59 | 60 | ![Screenshot (64)](https://user-images.githubusercontent.com/15766192/128486078-d8433890-98a3-4d12-8986-b6c0e3fc64b9.png) 61 |
62 | 63 |
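Under the hood, this kind of upload amounts to packaging the local dataset directory as a W&B artifact. A minimal hand-rolled sketch (the artifact name and path are illustrative; `job_type` matches the one log_dataset.py passes to WandbLogger):

```python
import wandb

run = wandb.init(project='YOLOv5', job_type='Dataset Creation')
artifact = wandb.Artifact('train_dataset', type='dataset')  # illustrative name
artifact.add_dir('dataset/images/train')                    # illustrative local path
run.log_artifact(artifact)
run.finish()
```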

2: Train and Log Evaluation simultaneously

64 | This is an extension of the previous section, but it will also start training after uploading the dataset, and it logs an evaluation Table as well (sketched after the usage below). 65 | The evaluation Table compares your predictions and ground truths across the validation set for each epoch. It uses references to the already uploaded datasets, 66 | so no images will be uploaded from your system more than once. 67 |
68 | Usage 69 | Code $ python utils/loggers/wandb/log_dataset.py --data .. --upload_data 70 | 71 | ![Screenshot (72)](https://user-images.githubusercontent.com/15766192/128979739-4cf63aeb-a76f-483f-8861-1c0100b938a5.png) 72 |
73 | 74 |
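For intuition, an evaluation Table is just a `wandb.Table` with one row per validation image. A minimal sketch with made-up columns and values (the real logger fills its rows from the `on_val_image_end` callback):

```python
import wandb

run = wandb.init(project='YOLOv5')

table = wandb.Table(columns=['image', 'ground truth', 'prediction', 'confidence'])
table.add_data('0001.jpg', 'mask', 'mask', 0.91)  # illustrative row
run.log({'evaluation': table})
run.finish()
```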

3: Train using dataset artifact

75 | When you upload a dataset as described in the first section, you get a new config file with `_wandb` appended to its name. This file contains the information that 76 | can be used to train a model directly from the dataset artifact. It also logs the evaluation Table (see the sketch below). 77 |
78 | Usage 79 | Code $ python train.py --data {data}_wandb.yaml 80 | 81 | ![Screenshot (72)](https://user-images.githubusercontent.com/15766192/128979739-4cf63aeb-a76f-483f-8861-1c0100b938a5.png) 82 |
83 | 84 |
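A hypothetical sketch of how such a config can be inspected: entries that start with the `wandb-artifact://` prefix (the same constant log_dataset.py defines) are resolved from W&B rather than from local disk. The filename below is illustrative:

```python
import yaml

WANDB_ARTIFACT_PREFIX = 'wandb-artifact://'  # same prefix as in log_dataset.py

with open('data_wandb.yaml') as f:  # illustrative filename
    data_dict = yaml.safe_load(f)

for split in ('train', 'val'):
    path = str(data_dict.get(split, ''))
    if path.startswith(WANDB_ARTIFACT_PREFIX):
        print(f'{split}: pulled from artifact {path}')
```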

4: Save model checkpoints as artifacts

85 | To enable saving and versioning checkpoints of your experiment, pass `--save_period n` with the base command, where `n` represents the checkpoint interval in epochs (the gating condition is sketched below). 86 | You can also log both the dataset and model checkpoints simultaneously. If `--save_period` is not passed, only the final model will be logged. 87 | 88 |
89 | Usage 90 | Code $ python train.py --save_period 1 91 | 92 | ![Screenshot (68)](https://user-images.githubusercontent.com/15766192/128726138-ec6c1f60-639d-437d-b4ee-3acd9de47ef3.png) 93 |
94 | 95 |
96 | 97 |
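The gating logic lives in the `on_model_save` callback in utils/loggers/__init__.py; restated as a standalone sketch:

```python
# With `--save_period n`, every n-th epoch checkpoint (except the final one,
# which is handled separately) is uploaded as a model artifact; -1 disables it
def should_log_checkpoint(epoch, final_epoch, save_period):
    return save_period != -1 and (epoch + 1) % save_period == 0 and not final_epoch

assert should_log_checkpoint(epoch=0, final_epoch=False, save_period=1)
assert not should_log_checkpoint(epoch=4, final_epoch=False, save_period=-1)
```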

5: Resume runs from checkpoint artifacts.

98 | Any run can be resumed using artifacts if the --resume argument starts with the wandb-artifact:// prefix followed by the run path, i.e., wandb-artifact://username/project/runid (the parsing is sketched below). This doesn't require the model checkpoint to be present on the local system. 99 | 100 |
101 | Usage 102 | Code $ python train.py --resume wandb-artifact://{run_path} 103 | 104 | ![Screenshot (70)](https://user-images.githubusercontent.com/15766192/128728988-4e84b355-6c87-41ae-a591-14aecf45343e.png) 105 |
106 | 107 |
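This mirrors the resume branch in utils/loggers/__init__.py: a `wandb-artifact://` value bypasses the local checkpoint, while a plain `--resume` reads the stored run id out of the local weights file. A condensed sketch (the default weights path is illustrative):

```python
import torch

WANDB_ARTIFACT_PREFIX = 'wandb-artifact://'

def get_resume_run_id(resume, weights='pretrains/pretrain.pt'):  # illustrative path
    if isinstance(resume, str) and resume.startswith(WANDB_ARTIFACT_PREFIX):
        return None  # WandbLogger pulls the checkpoint from the artifact itself
    return torch.load(weights).get('wandb_id') if resume else None
```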

6: Resume runs from dataset artifact & checkpoint artifacts.

108 | Local dataset or model checkpoints are not required. This can be used to resume runs directly on a different device. 109 | The syntax is the same as in the previous section, but you'll need to log both the dataset and the model checkpoints as artifacts, i.e., either set --upload_dataset or 110 | train from a _wandb.yaml file, and also set --save_period. 111 | 112 |
113 | Usage 114 | Code $ python train.py --resume wandb-artifact://{run_path} 115 | 116 | ![Screenshot (70)](https://user-images.githubusercontent.com/15766192/128728988-4e84b355-6c87-41ae-a591-14aecf45343e.png) 117 |
118 | 119 | 120 | 121 | 122 | 123 |

Reports

124 | W&B Reports can be created from your saved runs for sharing online. Once a report is created, you will receive a link you can use to publicly share your results. Here is an example report created from the COCO128 tutorial trainings of all four YOLOv5 models ([link](https://wandb.ai/glenn-jocher/yolov5_tutorial/reports/YOLOv5-COCO128-Tutorial-Results--VmlldzozMDI5OTY)). 125 | 126 | 127 | 128 | ## Environments 129 | YOLOv5 may be run in any of the following up-to-date verified environments (with all dependencies including [CUDA](https://developer.nvidia.com/cuda)/[CUDNN](https://developer.nvidia.com/cudnn), [Python](https://www.python.org/) and [PyTorch](https://pytorch.org/) preinstalled): 130 | 131 | * **Google Colab and Kaggle** notebooks with free GPU: [![Open In Colab](https://camo.githubusercontent.com/84f0493939e0c4de4e6dbe113251b4bfb5353e57134ffd9fcab6b8714514d4d1/68747470733a2f2f636f6c61622e72657365617263682e676f6f676c652e636f6d2f6173736574732f636f6c61622d62616467652e737667)](https://colab.research.google.com/github/ultralytics/yolov5/blob/master/tutorial.ipynb) [![Open In Kaggle](https://camo.githubusercontent.com/a08ca511178e691ace596a95d334f73cf4ce06e83a5c4a5169b8bb68cac27bef/68747470733a2f2f6b6167676c652e636f6d2f7374617469632f696d616765732f6f70656e2d696e2d6b6167676c652e737667)](https://www.kaggle.com/ultralytics/yolov5) 132 | * **Google Cloud** Deep Learning VM. See [GCP Quickstart Guide](https://github.com/ultralytics/yolov5/wiki/GCP-Quickstart) 133 | * **Amazon** Deep Learning AMI. See [AWS Quickstart Guide](https://github.com/ultralytics/yolov5/wiki/AWS-Quickstart) 134 | * **Docker Image**. See [Docker Quickstart Guide](https://github.com/ultralytics/yolov5/wiki/Docker-Quickstart) [![Docker Pulls](https://camo.githubusercontent.com/280faedaf431e4c0c24fdb30ec00a66d627404e5c4c498210d3f014dd58c2c7e/68747470733a2f2f696d672e736869656c64732e696f2f646f636b65722f70756c6c732f756c7472616c79746963732f796f6c6f76353f6c6f676f3d646f636b6572)](https://hub.docker.com/r/ultralytics/yolov5) 135 | 136 | ## Status 137 | ![CI CPU testing](https://github.com/ultralytics/yolov5/workflows/CI%20CPU%20testing/badge.svg) 138 | 139 | If this badge is green, all [YOLOv5 GitHub Actions](https://github.com/ultralytics/yolov5/actions) Continuous Integration (CI) tests are currently passing. CI tests verify correct operation of YOLOv5 training ([train.py](https://github.com/ultralytics/yolov5/blob/master/train.py)), validation ([val.py](https://github.com/ultralytics/yolov5/blob/master/val.py)), inference ([detect.py](https://github.com/ultralytics/yolov5/blob/master/detect.py)) and export ([export.py](https://github.com/ultralytics/yolov5/blob/master/export.py)) on macOS, Windows, and Ubuntu every 24 hours and on every commit.
140 | 141 | -------------------------------------------------------------------------------- /utils/loggers/wandb/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/fsoft-ailab/Data-Competition/63741b681885467a563d5500e4658673d6d737d7/utils/loggers/wandb/__init__.py -------------------------------------------------------------------------------- /utils/loggers/wandb/log_dataset.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | 3 | from wandb_utils import WandbLogger 4 | 5 | WANDB_ARTIFACT_PREFIX = 'wandb-artifact://' 6 | 7 | 8 | def create_dataset_artifact(opt): 9 | logger = WandbLogger(opt, None, job_type='Dataset Creation') # TODO: return value unused 10 | 11 | 12 | if __name__ == '__main__': 13 | parser = argparse.ArgumentParser() 14 | parser.add_argument('--config', type=str, default='config/coco128.yaml', help='config.yaml path') 15 | parser.add_argument('--single-cls', action='store_true', help='train as single-class dataset') 16 | parser.add_argument('--project', type=str, default='YOLOv5', help='name of W&B Project') 17 | parser.add_argument('--entity', default=None, help='W&B entity') 18 | parser.add_argument('--name', type=str, default='log dataset', help='name of W&B run') 19 | 20 | opt = parser.parse_args() 21 | opt.resume = False # Explicitly disallow resume check for dataset upload job 22 | 23 | create_dataset_artifact(opt) 24 | -------------------------------------------------------------------------------- /utils/loggers/wandb/sweep.py: -------------------------------------------------------------------------------- 1 | import sys 2 | from pathlib import Path 3 | 4 | import wandb 5 | 6 | FILE = Path(__file__).resolve() 7 | sys.path.append(FILE.parents[3].as_posix()) # add utils/ to path 8 | 9 | from train import train, parser 10 | from utils.general import increment_path 11 | from utils.torch_utils import select_device 12 | from utils.callbacks import Callbacks 13 | 14 | 15 | def sweep(): 16 | wandb.init() 17 | # Get hyp dict from sweep agent 18 | hyp_dict = vars(wandb.config).get("_items") 19 | 20 | # Workaround: get necessary opt args 21 | opt = parser(known=True) 22 | opt.batch_size = hyp_dict.get("batch_size") 23 | opt.save_dir = str(increment_path(Path(opt.project) / opt.name, exist_ok=opt.exist_ok or opt.evolve)) 24 | opt.epochs = hyp_dict.get("epochs") 25 | opt.no_save = True 26 | opt.data = hyp_dict.get("config") 27 | device = select_device(opt.device, batch_size=opt.batch_size) 28 | 29 | # train 30 | train(hyp_dict, opt, device, callbacks=Callbacks()) 31 | 32 | 33 | if __name__ == "__main__": 34 | sweep() 35 | -------------------------------------------------------------------------------- /utils/loggers/wandb/sweep.yaml: -------------------------------------------------------------------------------- 1 | # Hyperparameters for training 2 | # To set range- 3 | # Provide min and max values as: 4 | # parameter: 5 | # 6 | # min: scalar 7 | # max: scalar 8 | # OR 9 | # 10 | # Set a specific list of search space- 11 | # parameter: 12 | # values: [scalar1, scalar2, scalar3...] 
13 | # 14 | # You can use grid, bayesian and hyperopt search strategy 15 | # For more info on configuring sweeps visit - https://docs.wandb.ai/guides/sweeps/configuration 16 | 17 | program: utils/loggers/wandb/sweep.py 18 | method: random 19 | metric: 20 | name: metrics/mAP_0.5 21 | goal: maximize 22 | 23 | parameters: 24 | # hyperparameters: set either min, max range or values list 25 | data: 26 | value: "config/coco128.yaml" 27 | batch_size: 28 | values: [64] 29 | epochs: 30 | values: [10] 31 | 32 | lr0: 33 | distribution: uniform 34 | min: 1e-5 35 | max: 1e-1 36 | lrf: 37 | distribution: uniform 38 | min: 0.01 39 | max: 1.0 40 | momentum: 41 | distribution: uniform 42 | min: 0.6 43 | max: 0.98 44 | weight_decay: 45 | distribution: uniform 46 | min: 0.0 47 | max: 0.001 48 | warmup_epochs: 49 | distribution: uniform 50 | min: 0.0 51 | max: 5.0 52 | warmup_momentum: 53 | distribution: uniform 54 | min: 0.0 55 | max: 0.95 56 | warmup_bias_lr: 57 | distribution: uniform 58 | min: 0.0 59 | max: 0.2 60 | box: 61 | distribution: uniform 62 | min: 0.02 63 | max: 0.2 64 | cls: 65 | distribution: uniform 66 | min: 0.2 67 | max: 4.0 68 | cls_pw: 69 | distribution: uniform 70 | min: 0.5 71 | max: 2.0 72 | obj: 73 | distribution: uniform 74 | min: 0.2 75 | max: 4.0 76 | obj_pw: 77 | distribution: uniform 78 | min: 0.5 79 | max: 2.0 80 | iou_t: 81 | distribution: uniform 82 | min: 0.1 83 | max: 0.7 84 | anchor_t: 85 | distribution: uniform 86 | min: 2.0 87 | max: 8.0 88 | fl_gamma: 89 | distribution: uniform 90 | min: 0.0 91 | max: 0.1 92 | hsv_h: 93 | distribution: uniform 94 | min: 0.0 95 | max: 0.1 96 | hsv_s: 97 | distribution: uniform 98 | min: 0.0 99 | max: 0.9 100 | hsv_v: 101 | distribution: uniform 102 | min: 0.0 103 | max: 0.9 104 | degrees: 105 | distribution: uniform 106 | min: 0.0 107 | max: 45.0 108 | translate: 109 | distribution: uniform 110 | min: 0.0 111 | max: 0.9 112 | scale: 113 | distribution: uniform 114 | min: 0.0 115 | max: 0.9 116 | shear: 117 | distribution: uniform 118 | min: 0.0 119 | max: 10.0 120 | perspective: 121 | distribution: uniform 122 | min: 0.0 123 | max: 0.001 124 | flipud: 125 | distribution: uniform 126 | min: 0.0 127 | max: 1.0 128 | fliplr: 129 | distribution: uniform 130 | min: 0.0 131 | max: 1.0 132 | mosaic: 133 | distribution: uniform 134 | min: 0.0 135 | max: 1.0 136 | mixup: 137 | distribution: uniform 138 | min: 0.0 139 | max: 1.0 140 | copy_paste: 141 | distribution: uniform 142 | min: 0.0 143 | max: 1.0 144 | -------------------------------------------------------------------------------- /utils/loss.py: -------------------------------------------------------------------------------- 1 | """ 2 | Source: YOLOv5 🚀 by Ultralytics https://github.com/ultralytics/yolov5 3 | 4 | Loss functions 5 | """ 6 | 7 | import torch 8 | import torch.nn as nn 9 | 10 | from utils.metrics import bbox_iou 11 | from utils.torch_utils import is_parallel 12 | 13 | 14 | def smooth_BCE(eps=0.1): # https://github.com/ultralytics/yolov3/issues/238#issuecomment-598028441 15 | # return positive, negative label smoothing BCE targets 16 | return 1.0 - 0.5 * eps, 0.5 * eps 17 | 18 | 19 | class BCEBlurWithLogitsLoss(nn.Module): 20 | # BCEwithLogitLoss() with reduced missing label effects. 
21 | def __init__(self, alpha=0.05): 22 | super(BCEBlurWithLogitsLoss, self).__init__() 23 | self.loss_fcn = nn.BCEWithLogitsLoss(reduction='none') # must be nn.BCEWithLogitsLoss() 24 | self.alpha = alpha 25 | 26 | def forward(self, pred, true): 27 | loss = self.loss_fcn(pred, true) 28 | pred = torch.sigmoid(pred) # prob from logits 29 | dx = pred - true # reduce only missing label effects 30 | # dx = (pred - true).abs() # reduce missing label and false label effects 31 | alpha_factor = 1 - torch.exp((dx - 1) / (self.alpha + 1e-4)) 32 | loss *= alpha_factor 33 | return loss.mean() 34 | 35 | 36 | class FocalLoss(nn.Module): 37 | # Wraps focal loss around existing loss_fcn(), i.e. criteria = FocalLoss(nn.BCEWithLogitsLoss(), gamma=1.5) 38 | def __init__(self, loss_fcn, gamma=1.5, alpha=0.25): 39 | super(FocalLoss, self).__init__() 40 | self.loss_fcn = loss_fcn # must be nn.BCEWithLogitsLoss() 41 | self.gamma = gamma 42 | self.alpha = alpha 43 | self.reduction = loss_fcn.reduction 44 | self.loss_fcn.reduction = 'none' # required to apply FL to each element 45 | 46 | def forward(self, pred, true): 47 | loss = self.loss_fcn(pred, true) 48 | # p_t = torch.exp(-loss) 49 | # loss *= self.alpha * (1.000001 - p_t) ** self.gamma # non-zero power for gradient stability 50 | 51 | # TF implementation https://github.com/tensorflow/addons/blob/v0.7.1/tensorflow_addons/losses/focal_loss.py 52 | pred_prob = torch.sigmoid(pred) # prob from logits 53 | p_t = true * pred_prob + (1 - true) * (1 - pred_prob) 54 | alpha_factor = true * self.alpha + (1 - true) * (1 - self.alpha) 55 | modulating_factor = (1.0 - p_t) ** self.gamma 56 | loss *= alpha_factor * modulating_factor 57 | 58 | if self.reduction == 'mean': 59 | return loss.mean() 60 | elif self.reduction == 'sum': 61 | return loss.sum() 62 | else: # 'none' 63 | return loss 64 | 65 | 66 | class QFocalLoss(nn.Module): 67 | # Wraps Quality focal loss around existing loss_fcn(), i.e. 
criteria = FocalLoss(nn.BCEWithLogitsLoss(), gamma=1.5) 68 | def __init__(self, loss_fcn, gamma=1.5, alpha=0.25): 69 | super(QFocalLoss, self).__init__() 70 | self.loss_fcn = loss_fcn # must be nn.BCEWithLogitsLoss() 71 | self.gamma = gamma 72 | self.alpha = alpha 73 | self.reduction = loss_fcn.reduction 74 | self.loss_fcn.reduction = 'none' # required to apply FL to each element 75 | 76 | def forward(self, pred, true): 77 | loss = self.loss_fcn(pred, true) 78 | 79 | pred_prob = torch.sigmoid(pred) # prob from logits 80 | alpha_factor = true * self.alpha + (1 - true) * (1 - self.alpha) 81 | modulating_factor = torch.abs(true - pred_prob) ** self.gamma 82 | loss *= alpha_factor * modulating_factor 83 | 84 | if self.reduction == 'mean': 85 | return loss.mean() 86 | elif self.reduction == 'sum': 87 | return loss.sum() 88 | else: # 'none' 89 | return loss 90 | 91 | 92 | class ComputeLoss: 93 | # Compute losses 94 | def __init__(self, model, autobalance=False): 95 | self.sort_obj_iou = False 96 | device = next(model.parameters()).device # get model device 97 | h = model.hyp # hyperparameters 98 | 99 | # Define criteria 100 | BCEcls = nn.BCEWithLogitsLoss(pos_weight=torch.tensor([h['cls_pw']], device=device)) 101 | BCEobj = nn.BCEWithLogitsLoss(pos_weight=torch.tensor([h['obj_pw']], device=device)) 102 | 103 | # Class label smoothing https://arxiv.org/pdf/1902.04103.pdf eqn 3 104 | self.cp, self.cn = smooth_BCE(eps=h.get('label_smoothing', 0.0)) # positive, negative BCE targets 105 | 106 | # Focal loss 107 | g = h['fl_gamma'] # focal loss gamma 108 | if g > 0: 109 | BCEcls, BCEobj = FocalLoss(BCEcls, g), FocalLoss(BCEobj, g) 110 | 111 | det = model.module.model[-1] if is_parallel(model) else model.model[-1] # Detect() module 112 | self.balance = {3: [4.0, 1.0, 0.4]}.get(det.nl, [4.0, 1.0, 0.25, 0.06, .02]) # P3-P7 113 | self.ssi = list(det.stride).index(16) if autobalance else 0 # stride 16 index 114 | self.BCEcls, self.BCEobj, self.gr, self.hyp, self.autobalance = BCEcls, BCEobj, 1.0, h, autobalance 115 | for k in 'na', 'nc', 'nl', 'anchors': 116 | setattr(self, k, getattr(det, k)) 117 | 118 | def __call__(self, p, targets): # predictions, targets, model 119 | device = targets.device 120 | lcls, lbox, lobj = torch.zeros(1, device=device), torch.zeros(1, device=device), torch.zeros(1, device=device) 121 | tcls, tbox, indices, anchors = self.build_targets(p, targets) # targets 122 | 123 | # Losses 124 | for i, pi in enumerate(p): # layer index, layer predictions 125 | b, a, gj, gi = indices[i] # image, anchor, gridy, gridx 126 | tobj = torch.zeros_like(pi[..., 0], device=device) # target obj 127 | 128 | n = b.shape[0] # number of targets 129 | if n: 130 | ps = pi[b, a, gj, gi] # prediction subset corresponding to targets 131 | 132 | # Regression 133 | pxy = ps[:, :2].sigmoid() * 2. 
- 0.5 134 | pwh = (ps[:, 2:4].sigmoid() * 2) ** 2 * anchors[i] 135 | pbox = torch.cat((pxy, pwh), 1) # predicted box 136 | iou = bbox_iou(pbox.T, tbox[i], x1y1x2y2=False, CIoU=True) # iou(prediction, target) 137 | lbox += (1.0 - iou).mean() # iou loss 138 | 139 | # Objectness 140 | score_iou = iou.detach().clamp(0).type(tobj.dtype) 141 | if self.sort_obj_iou: 142 | sort_id = torch.argsort(score_iou) 143 | b, a, gj, gi, score_iou = b[sort_id], a[sort_id], gj[sort_id], gi[sort_id], score_iou[sort_id] 144 | tobj[b, a, gj, gi] = (1.0 - self.gr) + self.gr * score_iou # iou ratio 145 | 146 | # Classification 147 | if self.nc > 1: # cls loss (only if multiple classes) 148 | t = torch.full_like(ps[:, 5:], self.cn, device=device) # targets 149 | t[range(n), tcls[i]] = self.cp 150 | lcls += self.BCEcls(ps[:, 5:], t) # BCE 151 | 152 | # Append targets to text file 153 | # with open('targets.txt', 'a') as file: 154 | # [file.write('%11.5g ' * 4 % tuple(x) + '\n') for x in torch.cat((txy[i], twh[i]), 1)] 155 | 156 | obji = self.BCEobj(pi[..., 4], tobj) 157 | lobj += obji * self.balance[i] # obj loss 158 | if self.autobalance: 159 | self.balance[i] = self.balance[i] * 0.9999 + 0.0001 / obji.detach().item() 160 | 161 | if self.autobalance: 162 | self.balance = [x / self.balance[self.ssi] for x in self.balance] 163 | lbox *= self.hyp['box'] 164 | lobj *= self.hyp['obj'] 165 | lcls *= self.hyp['cls'] 166 | bs = tobj.shape[0] # batch size 167 | 168 | return (lbox + lobj + lcls) * bs, torch.cat((lbox, lobj, lcls)).detach() 169 | 170 | def build_targets(self, p, targets): 171 | # Build targets for compute_loss(), input targets(image,class,x,y,w,h) 172 | na, nt = self.na, targets.shape[0] # number of anchors, targets 173 | tcls, tbox, indices, anch = [], [], [], [] 174 | gain = torch.ones(7, device=targets.device) # normalized to gridspace gain 175 | ai = torch.arange(na, device=targets.device).float().view(na, 1).repeat(1, nt) # same as .repeat_interleave(nt) 176 | targets = torch.cat((targets.repeat(na, 1, 1), ai[:, :, None]), 2) # append anchor indices 177 | 178 | g = 0.5 # bias 179 | off = torch.tensor([[0, 0], 180 | [1, 0], [0, 1], [-1, 0], [0, -1], # j,k,l,m 181 | # [1, 1], [1, -1], [-1, 1], [-1, -1], # jk,jm,lk,lm 182 | ], device=targets.device).float() * g # offsets 183 | 184 | for i in range(self.nl): 185 | anchors = self.anchors[i] 186 | gain[2:6] = torch.tensor(p[i].shape)[[3, 2, 3, 2]] # xyxy gain 187 | 188 | # Match targets to anchors 189 | t = targets * gain 190 | if nt: 191 | # Matches 192 | r = t[:, :, 4:6] / anchors[:, None] # wh ratio 193 | j = torch.max(r, 1. / r).max(2)[0] < self.hyp['anchor_t'] # compare 194 | # j = wh_iou(anchors, t[:, 4:6]) > model.hyp['iou_t'] # iou(3,n)=wh_iou(anchors(3,2), gwh(n,2)) 195 | t = t[j] # filter 196 | 197 | # Offsets 198 | gxy = t[:, 2:4] # grid xy 199 | gxi = gain[[2, 3]] - gxy # inverse 200 | j, k = ((gxy % 1. < g) & (gxy > 1.)).T 201 | l, m = ((gxi % 1. 
< g) & (gxi > 1.)).T 202 | j = torch.stack((torch.ones_like(j), j, k, l, m)) 203 | t = t.repeat((5, 1, 1))[j] 204 | offsets = (torch.zeros_like(gxy)[None] + off[:, None])[j] 205 | else: 206 | t = targets[0] 207 | offsets = 0 208 | 209 | # Define 210 | b, c = t[:, :2].long().T # image, class 211 | gxy = t[:, 2:4] # grid xy 212 | gwh = t[:, 4:6] # grid wh 213 | gij = (gxy - offsets).long() 214 | gi, gj = gij.T # grid xy indices 215 | 216 | # Append 217 | a = t[:, 6].long() # anchor indices 218 | indices.append((b, a, gj.clamp_(0, gain[3] - 1), gi.clamp_(0, gain[2] - 1))) # image, anchor, grid indices 219 | tbox.append(torch.cat((gxy - gij, gwh), 1)) # box 220 | anch.append(anchors[a]) # anchors 221 | tcls.append(c) # class 222 | 223 | return tcls, tbox, indices, anch 224 | -------------------------------------------------------------------------------- /utils/metrics.py: -------------------------------------------------------------------------------- 1 | """ 2 | Source: YOLOv5 🚀 by Ultralytics https://github.com/ultralytics/yolov5 3 | 4 | Model validation metrics 5 | """ 6 | 7 | import math 8 | import warnings 9 | from pathlib import Path 10 | 11 | import matplotlib.pyplot as plt 12 | import numpy as np 13 | import torch 14 | 15 | 16 | def fitness(x): 17 | # Model fitness as a weighted combination of metrics 18 | w = [0.0, 0.0, 1, 0.0, 0.0] # weights for [P, R, wAP@0.5, mAP@0.5, mAP@0.5:0.95] 19 | return (x[:, :5] * w).sum(1) 20 | 21 | 22 | def ap_per_class(tp, conf, pred_cls, target_cls, plot=False, save_dir='.', names=()): 23 | """ Compute the average precision, given the recall and precision curves. 24 | Source: https://github.com/rafaelpadilla/Object-Detection-Metrics. 25 | # Arguments 26 | tp: True positives (nparray, nx1 or nx10). 27 | conf: Objectness value from 0-1 (nparray). 28 | pred_cls: Predicted object classes (nparray). 29 | target_cls: True object classes (nparray). 30 | plot: Plot precision-recall curve at mAP@0.5 31 | save_dir: Plot save directory 32 | # Returns 33 | The average precision as computed in py-faster-rcnn. 
34 | """ 35 | 36 | # Sort by objectness 37 | i = np.argsort(-conf) 38 | tp, conf, pred_cls = tp[i], conf[i], pred_cls[i] 39 | 40 | # Find unique classes 41 | unique_classes = np.unique(target_cls) 42 | nc = unique_classes.shape[0] # number of classes, number of detections 43 | 44 | # Create Precision-Recall curve and compute AP for each class 45 | px, py = np.linspace(0, 1, 1000), [] # for plotting 46 | ap, p, r = np.zeros((nc, tp.shape[1])), np.zeros((nc, 1000)), np.zeros((nc, 1000)) 47 | for ci, c in enumerate(unique_classes): 48 | i = pred_cls == c 49 | n_l = (target_cls == c).sum() # number of labels 50 | n_p = i.sum() # number of predictions 51 | 52 | if n_p == 0 or n_l == 0: 53 | continue 54 | else: 55 | # Accumulate FPs and TPs 56 | fpc = (1 - tp[i]).cumsum(0) 57 | tpc = tp[i].cumsum(0) 58 | 59 | # Recall 60 | recall = tpc / (n_l + 1e-16) # recall curve 61 | r[ci] = np.interp(-px, -conf[i], recall[:, 0], left=0) # negative x, xp because xp decreases 62 | 63 | # Precision 64 | precision = tpc / (tpc + fpc) # precision curve 65 | p[ci] = np.interp(-px, -conf[i], precision[:, 0], left=1) # p at pr_score 66 | 67 | # AP from recall-precision curve 68 | for j in range(tp.shape[1]): 69 | ap[ci, j], mpre, mrec = compute_ap(recall[:, j], precision[:, j]) 70 | if plot and j == 0: 71 | py.append(np.interp(px, mrec, mpre)) # precision at mAP@0.5 72 | 73 | # Compute F1 (harmonic mean of precision and recall) 74 | f1 = 2 * p * r / (p + r + 1e-16) 75 | if plot: 76 | plot_pr_curve(px, py, ap, Path(save_dir) / 'PR_curve.png', names) 77 | plot_mc_curve(px, f1, Path(save_dir) / 'F1_curve.png', names, ylabel='F1') 78 | plot_mc_curve(px, p, Path(save_dir) / 'P_curve.png', names, ylabel='Precision') 79 | plot_mc_curve(px, r, Path(save_dir) / 'R_curve.png', names, ylabel='Recall') 80 | 81 | i = f1.mean(0).argmax() # max F1 index 82 | return p[:, i], r[:, i], ap, f1[:, i], unique_classes.astype('int32') 83 | 84 | 85 | def compute_ap(recall, precision): 86 | """ Compute the average precision, given the recall and precision curves 87 | # Arguments 88 | recall: The recall curve (list) 89 | precision: The precision curve (list) 90 | # Returns 91 | Average precision, precision curve, recall curve 92 | """ 93 | 94 | # Append sentinel values to beginning and end 95 | mrec = np.concatenate(([0.0], recall, [1.0])) 96 | mpre = np.concatenate(([1.0], precision, [0.0])) 97 | 98 | # Compute the precision envelope 99 | mpre = np.flip(np.maximum.accumulate(np.flip(mpre))) 100 | 101 | # Integrate area under curve 102 | method = 'interp' # methods: 'continuous', 'interp' 103 | if method == 'interp': 104 | x = np.linspace(0, 1, 101) # 101-point interp (COCO) 105 | ap = np.trapz(np.interp(x, mrec, mpre), x) # integrate 106 | else: # 'continuous' 107 | i = np.where(mrec[1:] != mrec[:-1])[0] # points where x axis (recall) changes 108 | ap = np.sum((mrec[i + 1] - mrec[i]) * mpre[i + 1]) # area under curve 109 | 110 | return ap, mpre, mrec 111 | 112 | 113 | class ConfusionMatrix: 114 | # Updated version of https://github.com/kaanakan/object_detection_confusion_matrix 115 | def __init__(self, nc, conf=0.25, iou_thres=0.45): 116 | self.matrix = np.zeros((nc + 1, nc + 1)) 117 | self.nc = nc # number of classes 118 | self.conf = conf 119 | self.iou_thres = iou_thres 120 | 121 | def process_batch(self, detections, labels): 122 | """ 123 | Return intersection-over-union (Jaccard index) of boxes. 124 | Both sets of boxes are expected to be in (x1, y1, x2, y2) format. 
125 | Arguments: 126 | detections (Array[N, 6]), x1, y1, x2, y2, conf, class 127 | labels (Array[M, 5]), class, x1, y1, x2, y2 128 | Returns: 129 | None, updates confusion matrix accordingly 130 | """ 131 | detections = detections[detections[:, 4] > self.conf] 132 | gt_classes = labels[:, 0].int() 133 | detection_classes = detections[:, 5].int() 134 | iou = box_iou(labels[:, 1:], detections[:, :4]) 135 | 136 | x = torch.where(iou > self.iou_thres) 137 | if x[0].shape[0]: 138 | matches = torch.cat((torch.stack(x, 1), iou[x[0], x[1]][:, None]), 1).cpu().numpy() 139 | if x[0].shape[0] > 1: 140 | matches = matches[matches[:, 2].argsort()[::-1]] 141 | matches = matches[np.unique(matches[:, 1], return_index=True)[1]] 142 | matches = matches[matches[:, 2].argsort()[::-1]] 143 | matches = matches[np.unique(matches[:, 0], return_index=True)[1]] 144 | else: 145 | matches = np.zeros((0, 3)) 146 | 147 | n = matches.shape[0] > 0 148 | m0, m1, _ = matches.transpose().astype(np.int16) 149 | for i, gc in enumerate(gt_classes): 150 | j = m0 == i 151 | if n and sum(j) == 1: 152 | self.matrix[detection_classes[m1[j]], gc] += 1 # correct 153 | else: 154 | self.matrix[self.nc, gc] += 1 # background FP 155 | 156 | if n: 157 | for i, dc in enumerate(detection_classes): 158 | if not any(m1 == i): 159 | self.matrix[dc, self.nc] += 1 # background FN 160 | 161 | def matrix(self): 162 | return self.matrix 163 | 164 | def plot(self, normalize=True, save_dir='', names=()): 165 | try: 166 | import seaborn as sn 167 | 168 | array = self.matrix / ((self.matrix.sum(0).reshape(1, -1) + 1E-6) if normalize else 1) # normalize columns 169 | array[array < 0.005] = np.nan # don't annotate (would appear as 0.00) 170 | 171 | fig = plt.figure(figsize=(12, 9), tight_layout=True) 172 | sn.set(font_scale=1.0 if self.nc < 50 else 0.8) # for label size 173 | labels = (0 < len(names) < 99) and len(names) == self.nc # apply names to ticklabels 174 | with warnings.catch_warnings(): 175 | warnings.simplefilter('ignore') # suppress empty matrix RuntimeWarning: All-NaN slice encountered 176 | sn.heatmap(array, annot=self.nc < 30, annot_kws={"size": 8}, cmap='Blues', fmt='.2f', square=True, 177 | xticklabels=names + ['background FP'] if labels else "auto", 178 | yticklabels=names + ['background FN'] if labels else "auto").set_facecolor((1, 1, 1)) 179 | fig.axes[0].set_xlabel('True') 180 | fig.axes[0].set_ylabel('Predicted') 181 | fig.savefig(Path(save_dir) / 'confusion_matrix.png', dpi=250) 182 | plt.close() 183 | except Exception as e: 184 | print(f'WARNING: ConfusionMatrix plot failure: {e}') 185 | 186 | def print(self): 187 | for i in range(self.nc + 1): 188 | print(' '.join(map(str, self.matrix[i]))) 189 | 190 | 191 | def bbox_iou(box1, box2, x1y1x2y2=True, GIoU=False, DIoU=False, CIoU=False, eps=1e-7): 192 | # Returns the IoU of box1 to box2. 
box1 is 4, box2 is nx4 193 | box2 = box2.T 194 | 195 | # Get the coordinates of bounding boxes 196 | if x1y1x2y2: # x1, y1, x2, y2 = box1 197 | b1_x1, b1_y1, b1_x2, b1_y2 = box1[0], box1[1], box1[2], box1[3] 198 | b2_x1, b2_y1, b2_x2, b2_y2 = box2[0], box2[1], box2[2], box2[3] 199 | else: # transform from xywh to xyxy 200 | b1_x1, b1_x2 = box1[0] - box1[2] / 2, box1[0] + box1[2] / 2 201 | b1_y1, b1_y2 = box1[1] - box1[3] / 2, box1[1] + box1[3] / 2 202 | b2_x1, b2_x2 = box2[0] - box2[2] / 2, box2[0] + box2[2] / 2 203 | b2_y1, b2_y2 = box2[1] - box2[3] / 2, box2[1] + box2[3] / 2 204 | 205 | # Intersection area 206 | inter = (torch.min(b1_x2, b2_x2) - torch.max(b1_x1, b2_x1)).clamp(0) * \ 207 | (torch.min(b1_y2, b2_y2) - torch.max(b1_y1, b2_y1)).clamp(0) 208 | 209 | # Union Area 210 | w1, h1 = b1_x2 - b1_x1, b1_y2 - b1_y1 + eps 211 | w2, h2 = b2_x2 - b2_x1, b2_y2 - b2_y1 + eps 212 | union = w1 * h1 + w2 * h2 - inter + eps 213 | 214 | iou = inter / union 215 | if GIoU or DIoU or CIoU: 216 | cw = torch.max(b1_x2, b2_x2) - torch.min(b1_x1, b2_x1) # convex (smallest enclosing box) width 217 | ch = torch.max(b1_y2, b2_y2) - torch.min(b1_y1, b2_y1) # convex height 218 | if CIoU or DIoU: # Distance or Complete IoU https://arxiv.org/abs/1911.08287v1 219 | c2 = cw ** 2 + ch ** 2 + eps # convex diagonal squared 220 | rho2 = ((b2_x1 + b2_x2 - b1_x1 - b1_x2) ** 2 + 221 | (b2_y1 + b2_y2 - b1_y1 - b1_y2) ** 2) / 4 # center distance squared 222 | if DIoU: 223 | return iou - rho2 / c2 # DIoU 224 | elif CIoU: # https://github.com/Zzh-tju/DIoU-SSD-pytorch/blob/master/utils/box/box_utils.py#L47 225 | v = (4 / math.pi ** 2) * torch.pow(torch.atan(w2 / h2) - torch.atan(w1 / h1), 2) 226 | with torch.no_grad(): 227 | alpha = v / (v - iou + (1 + eps)) 228 | return iou - (rho2 / c2 + v * alpha) # CIoU 229 | else: # GIoU https://arxiv.org/pdf/1902.09630.pdf 230 | c_area = cw * ch + eps # convex area 231 | return iou - (c_area - union) / c_area # GIoU 232 | else: 233 | return iou # IoU 234 | 235 | 236 | def box_iou(box1, box2): 237 | # https://github.com/pytorch/vision/blob/master/torchvision/ops/boxes.py 238 | """ 239 | Return intersection-over-union (Jaccard index) of boxes. 240 | Both sets of boxes are expected to be in (x1, y1, x2, y2) format. 241 | Arguments: 242 | box1 (Tensor[N, 4]) 243 | box2 (Tensor[M, 4]) 244 | Returns: 245 | iou (Tensor[N, M]): the NxM matrix containing the pairwise 246 | IoU values for every element in boxes1 and boxes2 247 | """ 248 | 249 | def box_area(box): 250 | # box = 4xn 251 | return (box[2] - box[0]) * (box[3] - box[1]) 252 | 253 | area1 = box_area(box1.T) 254 | area2 = box_area(box2.T) 255 | 256 | # inter(N,M) = (rb(N,M,2) - lt(N,M,2)).clamp(0).prod(2) 257 | inter = (torch.min(box1[:, None, 2:], box2[:, 2:]) - torch.max(box1[:, None, :2], box2[:, :2])).clamp(0).prod(2) 258 | return inter / (area1[:, None] + area2 - inter) # iou = inter / (area1 + area2 - inter) 259 | 260 | 261 | def bbox_ioa(box1, box2, eps=1E-7): 262 | """ Returns the intersection over box2 area given box1, box2. 
Boxes are x1y1x2y2 263 | box1: np.array of shape(4) 264 | box2: np.array of shape(nx4) 265 | returns: np.array of shape(n) 266 | """ 267 | 268 | box2 = box2.transpose() 269 | 270 | # Get the coordinates of bounding boxes 271 | b1_x1, b1_y1, b1_x2, b1_y2 = box1[0], box1[1], box1[2], box1[3] 272 | b2_x1, b2_y1, b2_x2, b2_y2 = box2[0], box2[1], box2[2], box2[3] 273 | 274 | # Intersection area 275 | inter_area = (np.minimum(b1_x2, b2_x2) - np.maximum(b1_x1, b2_x1)).clip(0) * \ 276 | (np.minimum(b1_y2, b2_y2) - np.maximum(b1_y1, b2_y1)).clip(0) 277 | 278 | # box2 area 279 | box2_area = (b2_x2 - b2_x1) * (b2_y2 - b2_y1) + eps 280 | 281 | # Intersection over box2 area 282 | return inter_area / box2_area 283 | 284 | 285 | def wh_iou(wh1, wh2): 286 | # Returns the nxm IoU matrix. wh1 is nx2, wh2 is mx2 287 | wh1 = wh1[:, None] # [N,1,2] 288 | wh2 = wh2[None] # [1,M,2] 289 | inter = torch.min(wh1, wh2).prod(2) # [N,M] 290 | return inter / (wh1.prod(2) + wh2.prod(2) - inter) # iou = inter / (area1 + area2 - inter) 291 | 292 | 293 | # Plots ---------------------------------------------------------------------------------------------------------------- 294 | 295 | def plot_pr_curve(px, py, ap, save_dir='pr_curve.png', names=()): 296 | # Precision-recall curve 297 | 298 | weight = [0.3, 0.2, 0.5] 299 | wAP = (np.array(py).T * weight) 300 | fig, ax = plt.subplots(1, 1, figsize=(9, 6), tight_layout=True) 301 | py = np.stack(py, axis=1) 302 | 303 | if 0 < len(names) < 21: # display per-class legend if < 21 classes 304 | for i, y in enumerate(py.T): 305 | ax.plot(px, y, linewidth=1, label=f'{names[i]} {ap[i, 0]:.3f}') # plot(recall, precision) 306 | else: 307 | ax.plot(px, py, linewidth=1, color='grey') # plot(recall, precision) 308 | 309 | ax.plot(px, py.mean(1), linewidth=2, color='blue', label='all classes %.3f mAP@0.5' % ap[:, 0].mean()) 310 | ax.plot(px, np.sum(wAP, axis=1), linewidth=3, color='red', label='{:.3f} wAP@0.5'.format(np.sum(ap[:, 0] * weight))) 311 | ax.set_xlabel('Recall') 312 | ax.set_ylabel('Precision') 313 | ax.set_xlim(0, 1) 314 | ax.set_ylim(0, 1) 315 | plt.legend(bbox_to_anchor=(1.04, 1), loc="upper left") 316 | fig.savefig(Path(save_dir), dpi=250) 317 | plt.close() 318 | 319 | 320 | def plot_mc_curve(px, py, save_dir='mc_curve.png', names=(), xlabel='Confidence', ylabel='Metric'): 321 | # Metric-confidence curve 322 | fig, ax = plt.subplots(1, 1, figsize=(9, 6), tight_layout=True) 323 | 324 | if 0 < len(names) < 21: # display per-class legend if < 21 classes 325 | for i, y in enumerate(py): 326 | ax.plot(px, y, linewidth=1, label=f'{names[i]}') # plot(confidence, metric) 327 | else: 328 | ax.plot(px, py.T, linewidth=1, color='grey') # plot(confidence, metric) 329 | 330 | y = py.mean(0) 331 | ax.plot(px, y, linewidth=3, color='blue', label=f'all classes {y.max():.2f} at {px[y.argmax()]:.3f}') 332 | ax.set_xlabel(xlabel) 333 | ax.set_ylabel(ylabel) 334 | ax.set_xlim(0, 1) 335 | ax.set_ylim(0, 1) 336 | plt.legend(bbox_to_anchor=(1.04, 1), loc="upper left") 337 | fig.savefig(Path(save_dir), dpi=250) 338 | plt.close() 339 | -------------------------------------------------------------------------------- /utils/plots.py: -------------------------------------------------------------------------------- 1 | """ 2 | Source: YOLOv5 🚀 by Ultralytics https://github.com/ultralytics/yolov5 3 | 4 | Plotting utils 5 | """ 6 | 7 | import math 8 | from copy import copy 9 | from pathlib import Path 10 | 11 | import cv2 12 | import matplotlib 13 | import matplotlib.pyplot as plt 14 | import numpy as np 
15 | import pandas as pd 16 | import seaborn as sn 17 | import torch 18 | from PIL import Image, ImageDraw, ImageFont 19 | 20 | from utils.general import user_config_dir, is_ascii, xywh2xyxy, xyxy2xywh 21 | from utils.metrics import fitness 22 | 23 | # Settings 24 | CONFIG_DIR = user_config_dir() # Ultralytics settings dir 25 | matplotlib.rc('font', **{'size': 11}) 26 | matplotlib.use('Agg') # for writing to files only 27 | 28 | 29 | class Colors: 30 | # Ultralytics color palette https://ultralytics.com/ 31 | def __init__(self): 32 | # hex = matplotlib.colors.TABLEAU_COLORS.values() 33 | hex = ('FF3838', 'FF9D97', 'FF701F', 'FFB21D', 'CFD231', '48F90A', '92CC17', '3DDB86', '1A9334', '00D4BB', 34 | '2C99A8', '00C2FF', '344593', '6473FF', '0018EC', '8438FF', '520085', 'CB38FF', 'FF95C8', 'FF37C7') 35 | self.palette = [self.hex2rgb('#' + c) for c in hex] 36 | self.n = len(self.palette) 37 | 38 | def __call__(self, i, bgr=False): 39 | c = self.palette[int(i) % self.n] 40 | return (c[2], c[1], c[0]) if bgr else c 41 | 42 | @staticmethod 43 | def hex2rgb(h): # rgb order (PIL) 44 | return tuple(int(h[1 + i:1 + i + 2], 16) for i in (0, 2, 4)) 45 | 46 | 47 | colors = Colors() # create instance for 'from utils.plots import colors' 48 | 49 | 50 | def check_font(font='Arial.ttf', size=10): 51 | # Return a PIL TrueType Font, downloading to CONFIG_DIR if necessary 52 | font = Path(font) 53 | font = font if font.exists() else (CONFIG_DIR / font.name) 54 | try: 55 | return ImageFont.truetype(str(font) if font.exists() else font.name, size) 56 | except Exception as e: # download if missing 57 | url = "https://ultralytics.com/assets/" + font.name 58 | print(f'Downloading {url} to {font}...') 59 | torch.hub.download_url_to_file(url, str(font)) 60 | return ImageFont.truetype(str(font), size) 61 | 62 | 63 | class Annotator: 64 | check_font() # download TTF if necessary 65 | 66 | # YOLOv5 Annotator for train/val mosaics and jpgs and detect/hub inference annotations 67 | def __init__(self, im, line_width=None, font_size=None, font='Arial.ttf', pil=True): 68 | assert im.data.contiguous, 'Image not contiguous. Apply np.ascontiguousarray(im) to Annotator() input images.' 
69 | self.pil = pil 70 | if self.pil: # use PIL 71 | self.im = im if isinstance(im, Image.Image) else Image.fromarray(im) 72 | self.draw = ImageDraw.Draw(self.im) 73 | self.font = check_font(font, size=font_size or max(round(sum(self.im.size) / 2 * 0.035), 12)) 74 | self.fh = self.font.getsize('a')[1] - 3 # font height 75 | else: # use cv2 76 | self.im = im 77 | self.lw = line_width or max(round(sum(im.shape) / 2 * 0.003), 2) # line width 78 | 79 | def box_label(self, box, label='', color=(128, 128, 128), txt_color=(255, 255, 255)): 80 | # Add one xyxy box to image with label 81 | if self.pil or not is_ascii(label): 82 | self.draw.rectangle(box, width=self.lw, outline=color) # box 83 | if label: 84 | w = self.font.getsize(label)[0] # text width 85 | self.draw.rectangle([box[0], box[1] - self.fh, box[0] + w + 1, box[1] + 1], fill=color) 86 | self.draw.text((box[0], box[1]), label, fill=txt_color, font=self.font, anchor='ls') 87 | else: # cv2 88 | c1, c2 = (int(box[0]), int(box[1])), (int(box[2]), int(box[3])) 89 | cv2.rectangle(self.im, c1, c2, color, thickness=self.lw, lineType=cv2.LINE_AA) 90 | if label: 91 | tf = max(self.lw - 1, 1) # font thickness 92 | w, h = cv2.getTextSize(label, 0, fontScale=self.lw / 3, thickness=tf)[0] 93 | c2 = c1[0] + w, c1[1] - h - 3 94 | cv2.rectangle(self.im, c1, c2, color, -1, cv2.LINE_AA) # filled 95 | cv2.putText(self.im, label, (c1[0], c1[1] - 2), 0, self.lw / 3, txt_color, thickness=tf, 96 | lineType=cv2.LINE_AA) 97 | 98 | def rectangle(self, xy, fill=None, outline=None, width=1): 99 | # Add rectangle to image (PIL-only) 100 | self.draw.rectangle(xy, fill, outline, width) 101 | 102 | def text(self, xy, text, txt_color=(255, 255, 255)): 103 | # Add text to image (PIL-only) 104 | w, h = self.font.getsize(text) # text width, height 105 | self.draw.text((xy[0], xy[1] - h + 1), text, fill=txt_color, font=self.font) 106 | 107 | def result(self): 108 | # Return annotated image as array 109 | return np.asarray(self.im) 110 | 111 | 112 | def hist2d(x, y, n=100): 113 | # 2d histogram used in labels.png and evolve.png 114 | xedges, yedges = np.linspace(x.min(), x.max(), n), np.linspace(y.min(), y.max(), n) 115 | hist, xedges, yedges = np.histogram2d(x, y, (xedges, yedges)) 116 | xidx = np.clip(np.digitize(x, xedges) - 1, 0, hist.shape[0] - 1) 117 | yidx = np.clip(np.digitize(y, yedges) - 1, 0, hist.shape[1] - 1) 118 | return np.log(hist[xidx, yidx]) 119 | 120 | 121 | def butter_lowpass_filtfilt(data, cutoff=1500, fs=50000, order=5): 122 | from scipy.signal import butter, filtfilt 123 | 124 | # https://stackoverflow.com/questions/28536191/how-to-filter-smooth-with-scipy-numpy 125 | def butter_lowpass(cutoff, fs, order): 126 | nyq = 0.5 * fs 127 | normal_cutoff = cutoff / nyq 128 | return butter(order, normal_cutoff, btype='low', analog=False) 129 | 130 | b, a = butter_lowpass(cutoff, fs, order=order) 131 | return filtfilt(b, a, data) # forward-backward filter 132 | 133 | 134 | def output_to_target(output): 135 | # Convert model output to target format [batch_id, class_id, x, y, w, h, conf] 136 | targets = [] 137 | for i, o in enumerate(output): 138 | for *box, conf, cls in o.cpu().numpy(): 139 | targets.append([i, cls, *list(*xyxy2xywh(np.array(box)[None])), conf]) 140 | return np.array(targets) 141 | 142 | 143 | def plot_images(images, targets, paths=None, fname='images.jpg', names=None, max_size=1920, max_subplots=16): 144 | # Plot image grid with labels 145 | if isinstance(images, torch.Tensor): 146 | images = images.cpu().float().numpy() 147 | if 
isinstance(targets, torch.Tensor): 148 | targets = targets.cpu().numpy() 149 | if np.max(images[0]) <= 1: 150 | images *= 255.0 # de-normalise (optional) 151 | bs, _, h, w = images.shape # batch size, _, height, width 152 | bs = min(bs, max_subplots) # limit plot images 153 | ns = np.ceil(bs ** 0.5) # number of subplots (square) 154 | 155 | # Build Image 156 | mosaic = np.full((int(ns * h), int(ns * w), 3), 255, dtype=np.uint8) # init 157 | for i, im in enumerate(images): 158 | if i == max_subplots: # if last batch has fewer images than we expect 159 | break 160 | x, y = int(w * (i // ns)), int(h * (i % ns)) # block origin 161 | im = im.transpose(1, 2, 0) 162 | mosaic[y:y + h, x:x + w, :] = im 163 | 164 | # Resize (optional) 165 | scale = max_size / ns / max(h, w) 166 | if scale < 1: 167 | h = math.ceil(scale * h) 168 | w = math.ceil(scale * w) 169 | mosaic = cv2.resize(mosaic, tuple(int(x * ns) for x in (w, h))) 170 | 171 | # Annotate 172 | fs = int((h + w) * ns * 0.01) # font size 173 | annotator = Annotator(mosaic, line_width=round(fs / 10), font_size=fs) 174 | for i in range(i + 1): 175 | x, y = int(w * (i // ns)), int(h * (i % ns)) # block origin 176 | annotator.rectangle([x, y, x + w, y + h], None, (255, 255, 255), width=2) # borders 177 | if paths: 178 | annotator.text((x + 5, y + 5 + h), text=Path(paths[i]).name[:40], txt_color=(220, 220, 220)) # filenames 179 | if len(targets) > 0: 180 | ti = targets[targets[:, 0] == i] # image targets 181 | boxes = xywh2xyxy(ti[:, 2:6]).T 182 | classes = ti[:, 1].astype('int') 183 | labels = ti.shape[1] == 6 # labels if no conf column 184 | conf = None if labels else ti[:, 6] # check for confidence presence (label vs pred) 185 | 186 | if boxes.shape[1]: 187 | if boxes.max() <= 1.01: # if normalized with tolerance 0.01 188 | boxes[[0, 2]] *= w # scale to pixels 189 | boxes[[1, 3]] *= h 190 | elif scale < 1: # absolute coords need scale if image scales 191 | boxes *= scale 192 | boxes[[0, 2]] += x 193 | boxes[[1, 3]] += y 194 | for j, box in enumerate(boxes.T.tolist()): 195 | cls = classes[j] 196 | color = colors(cls) 197 | cls = names[cls] if names else cls 198 | if labels or conf[j] > 0.25: # 0.25 conf thresh 199 | label = f'{cls}' if labels else f'{cls} {conf[j]:.1f}' 200 | annotator.box_label(box, label, color=color) 201 | annotator.im.save(fname) # save 202 | 203 | 204 | def plot_lr_scheduler(optimizer, scheduler, epochs=300, save_dir=''): 205 | # Plot LR simulating training for full epochs 206 | optimizer, scheduler = copy(optimizer), copy(scheduler) # do not modify originals 207 | y = [] 208 | for _ in range(epochs): 209 | scheduler.step() 210 | y.append(optimizer.param_groups[0]['lr']) 211 | plt.plot(y, '.-', label='LR') 212 | plt.xlabel('epoch') 213 | plt.ylabel('LR') 214 | plt.grid() 215 | plt.xlim(0, epochs) 216 | plt.ylim(0) 217 | plt.savefig(Path(save_dir) / 'LR.png', dpi=200) 218 | plt.close() 219 | 220 | 221 | def plot_val_txt(): # from utils.plots import *; plot_val() 222 | # Plot val.txt histograms 223 | x = np.loadtxt('val.txt', dtype=np.float32) 224 | box = xyxy2xywh(x[:, :4]) 225 | cx, cy = box[:, 0], box[:, 1] 226 | 227 | fig, ax = plt.subplots(1, 1, figsize=(6, 6), tight_layout=True) 228 | ax.hist2d(cx, cy, bins=600, cmax=10, cmin=0) 229 | ax.set_aspect('equal') 230 | plt.savefig('hist2d.png', dpi=300) 231 | 232 | fig, ax = plt.subplots(1, 2, figsize=(12, 6), tight_layout=True) 233 | ax[0].hist(cx, bins=600) 234 | ax[1].hist(cy, bins=600) 235 | plt.savefig('hist1d.png', dpi=200) 236 | 237 | 238 | def plot_targets_txt(): # from 
utils.plots import *; plot_targets_txt() 239 | # Plot targets.txt histograms 240 | x = np.loadtxt('targets.txt', dtype=np.float32).T 241 | s = ['x targets', 'y targets', 'width targets', 'height targets'] 242 | fig, ax = plt.subplots(2, 2, figsize=(8, 8), tight_layout=True) 243 | ax = ax.ravel() 244 | for i in range(4): 245 | ax[i].hist(x[i], bins=100, label='%.3g +/- %.3g' % (x[i].mean(), x[i].std())) 246 | ax[i].legend() 247 | ax[i].set_title(s[i]) 248 | plt.savefig('targets.jpg', dpi=200) 249 | 250 | 251 | def plot_study_txt(path='', x=None): # from utils.plots import *; plot_study_txt() 252 | # Plot study.txt generated by val.py 253 | plot2 = False # plot additional results 254 | if plot2: 255 | ax = plt.subplots(2, 4, figsize=(10, 6), tight_layout=True)[1].ravel() 256 | 257 | fig2, ax2 = plt.subplots(1, 1, figsize=(8, 4), tight_layout=True) 258 | # for f in [Path(path) / f'study_coco_{x}.txt' for x in ['yolov5s6', 'yolov5m6', 'yolov5l6', 'yolov5x6']]: 259 | for f in sorted(Path(path).glob('study*.txt')): 260 | y = np.loadtxt(f, dtype=np.float32, usecols=[0, 1, 2, 3, 7, 8, 9], ndmin=2).T 261 | x = np.arange(y.shape[1]) if x is None else np.array(x) 262 | if plot2: 263 | s = ['P', 'R', 'mAP@.5', 'mAP@.5:.95', 't_preprocess (ms/img)', 't_inference (ms/img)', 't_NMS (ms/img)'] 264 | for i in range(7): 265 | ax[i].plot(x, y[i], '.-', linewidth=2, markersize=8) 266 | ax[i].set_title(s[i]) 267 | 268 | j = y[3].argmax() + 1 269 | ax2.plot(y[5, 1:j], y[3, 1:j] * 1E2, '.-', linewidth=2, markersize=8, 270 | label=f.stem.replace('study_coco_', '').replace('yolo', 'YOLO')) 271 | 272 | ax2.plot(1E3 / np.array([209, 140, 97, 58, 35, 18]), [34.6, 40.5, 43.0, 47.5, 49.7, 51.5], 273 | 'k.-', linewidth=2, markersize=8, alpha=.25, label='EfficientDet') 274 | 275 | ax2.grid(alpha=0.2) 276 | ax2.set_yticks(np.arange(20, 60, 5)) 277 | ax2.set_xlim(0, 57) 278 | ax2.set_ylim(30, 55) 279 | ax2.set_xlabel('GPU Speed (ms/img)') 280 | ax2.set_ylabel('COCO AP val') 281 | ax2.legend(loc='lower right') 282 | plt.savefig(str(Path(path).name) + '.png', dpi=300) 283 | 284 | 285 | def plot_labels(labels, names=(), save_dir=Path('')): 286 | # plot dataset labels 287 | print('Plotting labels... 
') 288 | c, b = labels[:, 0], labels[:, 1:].transpose() # classes, boxes 289 | nc = int(c.max() + 1) # number of classes 290 | x = pd.DataFrame(b.transpose(), columns=['x', 'y', 'width', 'height']) 291 | 292 | # seaborn correlogram 293 | sn.pairplot(x, corner=True, diag_kind='auto', kind='hist', diag_kws=dict(bins=50), plot_kws=dict(pmax=0.9)) 294 | plt.savefig(save_dir / 'labels_correlogram.jpg', dpi=200) 295 | plt.close() 296 | 297 | # matplotlib labels 298 | matplotlib.use('svg') # faster 299 | ax = plt.subplots(2, 2, figsize=(8, 8), tight_layout=True)[1].ravel() 300 | y = ax[0].hist(c, bins=np.linspace(0, nc, nc + 1) - 0.5, rwidth=0.8) 301 | # [y[2].patches[i].set_color([x / 255 for x in colors(i)]) for i in range(nc)] # update colors bug #3195 302 | ax[0].set_ylabel('instances') 303 | if 0 < len(names) < 30: 304 | ax[0].set_xticks(range(len(names))) 305 | ax[0].set_xticklabels(names, rotation=90, fontsize=10) 306 | else: 307 | ax[0].set_xlabel('classes') 308 | sn.histplot(x, x='x', y='y', ax=ax[2], bins=50, pmax=0.9) 309 | sn.histplot(x, x='width', y='height', ax=ax[3], bins=50, pmax=0.9) 310 | 311 | # rectangles 312 | labels[:, 1:3] = 0.5 # center 313 | labels[:, 1:] = xywh2xyxy(labels[:, 1:]) * 2000 314 | img = Image.fromarray(np.ones((2000, 2000, 3), dtype=np.uint8) * 255) 315 | for cls, *box in labels[:1000]: 316 | ImageDraw.Draw(img).rectangle(box, width=1, outline=colors(cls)) # plot 317 | ax[1].imshow(img) 318 | ax[1].axis('off') 319 | 320 | for a in [0, 1, 2, 3]: 321 | for s in ['top', 'right', 'left', 'bottom']: 322 | ax[a].spines[s].set_visible(False) 323 | 324 | plt.savefig(save_dir / 'labels.jpg', dpi=200) 325 | matplotlib.use('Agg') 326 | plt.close() 327 | 328 | 329 | def profile_idetection(start=0, stop=0, labels=(), save_dir=''): 330 | # Plot iDetection '*.txt' per-image logs. 
from utils.plots import *; profile_idetection() 331 | ax = plt.subplots(2, 4, figsize=(12, 6), tight_layout=True)[1].ravel() 332 | s = ['Images', 'Free Storage (GB)', 'RAM Usage (GB)', 'Battery', 'dt_raw (ms)', 'dt_smooth (ms)', 'real-world FPS'] 333 | files = list(Path(save_dir).glob('frames*.txt')) 334 | for fi, f in enumerate(files): 335 | try: 336 | results = np.loadtxt(f, ndmin=2).T[:, 90:-30] # clip first and last rows 337 | n = results.shape[1] # number of rows 338 | x = np.arange(start, min(stop, n) if stop else n) 339 | results = results[:, x] 340 | t = (results[0] - results[0].min()) # set t0=0s 341 | results[0] = x 342 | for i, a in enumerate(ax): 343 | if i < len(results): 344 | label = labels[fi] if len(labels) else f.stem.replace('frames_', '') 345 | a.plot(t, results[i], marker='.', label=label, linewidth=1, markersize=5) 346 | a.set_title(s[i]) 347 | a.set_xlabel('time (s)') 348 | # if fi == len(files) - 1: 349 | # a.set_ylim(bottom=0) 350 | for side in ['top', 'right']: 351 | a.spines[side].set_visible(False) 352 | else: 353 | a.remove() 354 | except Exception as e: 355 | print('Warning: Plotting error for %s; %s' % (f, e)) 356 | ax[1].legend() 357 | plt.savefig(Path(save_dir) / 'idetection_profile.png', dpi=200) 358 | 359 | 360 | def plot_evolve(evolve_csv='path/to/evolve.csv'): # from utils.plots import *; plot_evolve() 361 | # Plot evolve.csv hyp evolution results 362 | evolve_csv = Path(evolve_csv) 363 | data = pd.read_csv(evolve_csv) 364 | keys = [x.strip() for x in data.columns] 365 | x = data.values 366 | f = fitness(x) 367 | j = np.argmax(f) # max fitness index 368 | plt.figure(figsize=(10, 12), tight_layout=True) 369 | matplotlib.rc('font', **{'size': 8}) 370 | for i, k in enumerate(keys[7:]): 371 | v = x[:, 7 + i] 372 | mu = v[j] # best single result 373 | plt.subplot(6, 5, i + 1) 374 | plt.scatter(v, f, c=hist2d(v, f, 20), cmap='viridis', alpha=.8, edgecolors='none') 375 | plt.plot(mu, f.max(), 'k+', markersize=15) 376 | plt.title('%s = %.3g' % (k, mu), fontdict={'size': 9}) # limit to 40 characters 377 | if i % 5 != 0: 378 | plt.yticks([]) 379 | print('%15s: %.3g' % (k, mu)) 380 | f = evolve_csv.with_suffix('.png') # filename 381 | plt.savefig(f, dpi=200) 382 | plt.close() 383 | print(f'Saved {f}') 384 | 385 | 386 | def plot_results(file='path/to/results.csv', dir=''): 387 | # Plot training results.csv. Usage: from utils.plots import *; plot_results('path/to/results.csv') 388 | save_dir = Path(file).parent if file else Path(dir) 389 | fig, ax = plt.subplots(2, 5, figsize=(12, 6), tight_layout=True) 390 | ax = ax.ravel() 391 | files = list(save_dir.glob('results*.csv')) 392 | assert len(files), f'No results.csv files found in {save_dir.resolve()}, nothing to plot.' 
393 | for fi, f in enumerate(files): 394 | try: 395 | data = pd.read_csv(f) 396 | s = [x.strip() for x in data.columns] 397 | x = data.values[:, 0] 398 | for i, j in enumerate([1, 2, 3, 4, 5, 8, 9, 10, 6, 7]): 399 | y = data.values[:, j] 400 | # y[y == 0] = np.nan # don't show zero values 401 | ax[i].plot(x, y, marker='.', label=f.stem, linewidth=2, markersize=8) 402 | ax[i].set_title(s[j], fontsize=12) 403 | # if j in [8, 9, 10]: # share train and val loss y axes 404 | # ax[i].get_shared_y_axes().join(ax[i], ax[i - 5]) 405 | except Exception as e: 406 | print(f'Warning: Plotting error for {f}: {e}') 407 | ax[1].legend() 408 | fig.savefig(save_dir / 'results.png', dpi=200) 409 | plt.close() 410 | 411 | 412 | def feature_visualization(x, module_type, stage, n=32, save_dir=Path('runs/detect/exp')): 413 | """ 414 | x: Features to be visualized 415 | module_type: Module type 416 | stage: Module stage within model 417 | n: Maximum number of feature maps to plot 418 | save_dir: Directory to save results 419 | """ 420 | if 'Detect' not in module_type: 421 | batch, channels, height, width = x.shape # batch, channels, height, width 422 | if height > 1 and width > 1: 423 | f = f"stage{stage}_{module_type.split('.')[-1]}_features.png" # filename 424 | 425 | blocks = torch.chunk(x[0].cpu(), channels, dim=0) # select batch index 0, block by channels 426 | n = min(n, channels) # number of plots 427 | fig, ax = plt.subplots(math.ceil(n / 8), 8, tight_layout=True) # 8 rows x n/8 cols 428 | ax = ax.ravel() 429 | plt.subplots_adjust(wspace=0.05, hspace=0.05) 430 | for i in range(n): 431 | ax[i].imshow(blocks[i].squeeze()) # cmap='gray' 432 | ax[i].axis('off') 433 | 434 | print(f'Saving {save_dir / f}... ({n}/{channels})') 435 | plt.savefig(save_dir / f, dpi=300, bbox_inches='tight') 436 | plt.close() 437 | -------------------------------------------------------------------------------- /utils/torch_utils.py: -------------------------------------------------------------------------------- 1 | """ 2 | Source: YOLOv5 🚀 by Ultralytics https://github.com/ultralytics/yolov5 3 | 4 | PyTorch utils 5 | """ 6 | 7 | import datetime 8 | import logging 9 | import math 10 | import os 11 | import platform 12 | import subprocess 13 | import time 14 | from contextlib import contextmanager 15 | from copy import deepcopy 16 | from pathlib import Path 17 | 18 | import torch 19 | import torch.backends.cudnn as cudnn 20 | import torch.distributed as dist 21 | import torch.nn as nn 22 | import torch.nn.functional as F 23 | import torchvision 24 | 25 | try: 26 | import thop # for FLOPs computation 27 | except ImportError: 28 | thop = None 29 | 30 | LOGGER = logging.getLogger(__name__) 31 | 32 | 33 | @contextmanager 34 | def torch_distributed_zero_first(local_rank: int): 35 | """ 36 | Decorator to make all processes in distributed training wait for each local_master to do something. 
37 | """ 38 | if local_rank not in [-1, 0]: 39 | dist.barrier(device_ids=[local_rank]) 40 | yield 41 | if local_rank == 0: 42 | dist.barrier(device_ids=[0]) 43 | 44 | 45 | def init_torch_seeds(seed=0): 46 | # Speed-reproducibility tradeoff https://pytorch.org/docs/stable/notes/randomness.html 47 | torch.manual_seed(seed) 48 | torch.cuda.manual_seed(seed) 49 | torch.cuda.manual_seed_all(seed) 50 | 51 | if seed == 0: # slower, more reproducible 52 | cudnn.benchmark, cudnn.deterministic = False, True 53 | else: # faster, less reproducible 54 | cudnn.benchmark, cudnn.deterministic = True, False 55 | 56 | 57 | def date_modified(path=__file__): 58 | # return human-readable file modification date, i.e. '2021-3-26' 59 | t = datetime.datetime.fromtimestamp(Path(path).stat().st_mtime) 60 | return f'{t.year}-{t.month}-{t.day}' 61 | 62 | 63 | def git_describe(path=Path(__file__).parent): # path must be a directory 64 | # return human-readable git description, i.e. v5.0-5-g3e25f1e https://git-scm.com/docs/git-describe 65 | s = f'git -C {path} describe --tags --long --always' 66 | try: 67 | return subprocess.check_output(s, shell=True, stderr=subprocess.STDOUT).decode()[:-1] 68 | except subprocess.CalledProcessError as e: 69 | return '' # not a git repository 70 | 71 | 72 | def select_device(device='', batch_size=None): 73 | # device = 'cpu' or '0' or '0,1,2,3' 74 | s = f'YOLOv5 🚀 {git_describe() or date_modified()} torch {torch.__version__} ' # string 75 | device = str(device).strip().lower().replace('cuda:', '') # to string, 'cuda:0' to '0' 76 | cpu = device == 'cpu' 77 | if cpu: 78 | os.environ['CUDA_VISIBLE_DEVICES'] = '-1' # force torch.cuda.is_available() = False 79 | elif device: # non-cpu device requested 80 | os.environ['CUDA_VISIBLE_DEVICES'] = device # set environment variable 81 | assert torch.cuda.is_available(), f'CUDA unavailable, invalid device {device} requested' # check availability 82 | 83 | cuda = not cpu and torch.cuda.is_available() 84 | if cuda: 85 | devices = device.split(',') if device else '0' # range(torch.cuda.device_count()) # i.e. 
def time_sync():
    # PyTorch-accurate time
    if torch.cuda.is_available():
        torch.cuda.synchronize()
    return time.time()


def profile(input, ops, n=10, device=None):
    # YOLOv5 speed/memory/FLOPs profiler
    #
    # Usage:
    #     input = torch.randn(16, 3, 640, 640)
    #     m1 = lambda x: x * torch.sigmoid(x)
    #     m2 = nn.SiLU()
    #     profile(input, [m1, m2], n=100)  # profile over 100 iterations

    results = []
    logging.basicConfig(format="%(message)s", level=logging.INFO)
    device = device or select_device()
    print(f"{'Params':>12s}{'GFLOPs':>12s}{'GPU_mem (GB)':>14s}{'forward (ms)':>14s}{'backward (ms)':>14s}"
          f"{'input':>24s}{'output':>24s}")

    for x in input if isinstance(input, list) else [input]:
        x = x.to(device)
        x.requires_grad = True
        for m in ops if isinstance(ops, list) else [ops]:
            m = m.to(device) if hasattr(m, 'to') else m  # device
            m = m.half() if hasattr(m, 'half') and isinstance(x, torch.Tensor) and x.dtype is torch.float16 else m
            tf, tb, t = 0., 0., [0., 0., 0.]  # dt forward, backward
            try:
                flops = thop.profile(m, inputs=(x,), verbose=False)[0] / 1E9 * 2  # GFLOPs
            except Exception:  # thop missing or op unsupported
                flops = 0

            try:
                for _ in range(n):
                    t[0] = time_sync()
                    y = m(x)
                    t[1] = time_sync()
                    try:
                        _ = (sum([yi.sum() for yi in y]) if isinstance(y, list) else y).sum().backward()
                        t[2] = time_sync()
                    except Exception as e:  # no backward method
                        print(e)
                        t[2] = float('nan')
                    tf += (t[1] - t[0]) * 1000 / n  # ms per op forward
                    tb += (t[2] - t[1]) * 1000 / n  # ms per op backward
                mem = torch.cuda.memory_reserved() / 1E9 if torch.cuda.is_available() else 0  # (GB)
                s_in = tuple(x.shape) if isinstance(x, torch.Tensor) else 'list'
                s_out = tuple(y.shape) if isinstance(y, torch.Tensor) else 'list'
                p = sum(x.numel() for x in m.parameters()) if isinstance(m, nn.Module) else 0  # parameters
                print(f'{p:12}{flops:12.4g}{mem:>14.3f}{tf:14.4g}{tb:14.4g}{str(s_in):>24s}{str(s_out):>24s}')
                results.append([p, flops, mem, tf, tb, s_in, s_out])
            except Exception as e:
                print(e)
                results.append(None)
            torch.cuda.empty_cache()
    return results


def is_parallel(model):
    # Returns True if model is of type DP or DDP
    return type(model) in (nn.parallel.DataParallel, nn.parallel.DistributedDataParallel)


def de_parallel(model):
    # De-parallelize a model: returns a single-GPU model if model is of type DP or DDP
    return model.module if is_parallel(model) else model


def intersect_dicts(da, db, exclude=()):
    # Dictionary intersection of matching keys and shapes, omitting 'exclude' keys, using da values
    return {k: v for k, v in da.items() if k in db and not any(x in k for x in exclude) and v.shape == db[k].shape}
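
# Usage sketch (comment only): this mirrors how YOLOv5-style training code typically
# loads pretrained weights; 'anchor' keys are excluded because their shapes depend on
# the dataset configuration. The checkpoint layout of pretrains/pretrain.pt is assumed:
#   ckpt = torch.load('pretrains/pretrain.pt', map_location=device)
#   csd = ckpt['model'].float().state_dict()
#   csd = intersect_dicts(csd, model.state_dict(), exclude=['anchor'])
#   model.load_state_dict(csd, strict=False)
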
def initialize_weights(model):
    for m in model.modules():
        t = type(m)
        if t is nn.Conv2d:
            pass  # nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
        elif t is nn.BatchNorm2d:
            m.eps = 1e-3
            m.momentum = 0.03
        elif t in [nn.Hardswish, nn.LeakyReLU, nn.ReLU, nn.ReLU6]:
            m.inplace = True


def find_modules(model, mclass=nn.Conv2d):
    # Finds layer indices matching module class 'mclass'
    return [i for i, m in enumerate(model.module_list) if isinstance(m, mclass)]


def sparsity(model):
    # Return global model sparsity
    a, b = 0., 0.
    for p in model.parameters():
        a += p.numel()
        b += (p == 0).sum()
    return b / a


def prune(model, amount=0.3):
    # Prune model to requested global sparsity
    import torch.nn.utils.prune as prune
    print('Pruning model... ', end='')
    for name, m in model.named_modules():
        if isinstance(m, nn.Conv2d):
            prune.l1_unstructured(m, name='weight', amount=amount)  # prune
            prune.remove(m, 'weight')  # make permanent
    print(' %.3g global sparsity' % sparsity(model))


def fuse_conv_and_bn(conv, bn):
    # Fuse convolution and batchnorm layers https://tehnokv.com/posts/fusing-batchnorm-and-conv/
    fusedconv = nn.Conv2d(conv.in_channels,
                          conv.out_channels,
                          kernel_size=conv.kernel_size,
                          stride=conv.stride,
                          padding=conv.padding,
                          groups=conv.groups,
                          bias=True).requires_grad_(False).to(conv.weight.device)

    # prepare filters
    w_conv = conv.weight.clone().view(conv.out_channels, -1)
    w_bn = torch.diag(bn.weight.div(torch.sqrt(bn.eps + bn.running_var)))
    fusedconv.weight.copy_(torch.mm(w_bn, w_conv).view(fusedconv.weight.shape))

    # prepare spatial bias
    b_conv = torch.zeros(conv.weight.size(0), device=conv.weight.device) if conv.bias is None else conv.bias
    b_bn = bn.bias - bn.weight.mul(bn.running_mean).div(torch.sqrt(bn.running_var + bn.eps))
    fusedconv.bias.copy_(torch.mm(w_bn, b_conv.reshape(-1, 1)).reshape(-1) + b_bn)

    return fusedconv
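
# The fusion above folds BatchNorm into the preceding convolution. With BN parameters
# (gamma, beta, mu, var), the fused layer is:
#   W' = diag(gamma / sqrt(var + eps)) @ W
#   b' = beta + gamma / sqrt(var + eps) * (b - mu)
# Quick numerical check (sketch; eval() makes BN use its running statistics):
#   conv = nn.Conv2d(3, 8, 3, bias=False).eval()
#   bn = nn.BatchNorm2d(8).eval()
#   x = torch.randn(1, 3, 32, 32)
#   fused = fuse_conv_and_bn(conv, bn)
#   assert torch.allclose(bn(conv(x)), fused(x), atol=1e-5)
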
def model_info(model, verbose=False, img_size=640):
    # Model information. img_size may be int or list, i.e. img_size=640 or img_size=[640, 320]
    n_p = sum(x.numel() for x in model.parameters())  # number parameters
    n_g = sum(x.numel() for x in model.parameters() if x.requires_grad)  # number gradients
    if verbose:
        print('%5s %40s %9s %12s %20s %10s %10s' % ('layer', 'name', 'gradient', 'parameters', 'shape', 'mu', 'sigma'))
        for i, (name, p) in enumerate(model.named_parameters()):
            name = name.replace('module_list.', '')
            print('%5g %40s %9s %12g %20s %10.3g %10.3g' %
                  (i, name, p.requires_grad, p.numel(), list(p.shape), p.mean(), p.std()))

    try:  # FLOPs
        from thop import profile
        stride = max(int(model.stride.max()), 32) if hasattr(model, 'stride') else 32
        img = torch.zeros((1, model.yaml.get('ch', 3), stride, stride), device=next(model.parameters()).device)  # input
        flops = profile(deepcopy(model), inputs=(img,), verbose=False)[0] / 1E9 * 2  # stride GFLOPs
        img_size = img_size if isinstance(img_size, list) else [img_size, img_size]  # expand if int/float
        fs = ', %.1f GFLOPs' % (flops * img_size[0] / stride * img_size[1] / stride)  # 640x640 GFLOPs
    except Exception:
        fs = ''

    LOGGER.info(f"Model Summary: {len(list(model.modules()))} layers, {n_p} parameters, {n_g} gradients{fs}")


def load_classifier(name='resnet101', n=2):
    # Loads a pretrained model reshaped to an n-class output
    model = torchvision.models.__dict__[name](pretrained=True)

    # ResNet model properties
    # input_size = [3, 224, 224]
    # input_space = 'RGB'
    # input_range = [0, 1]
    # mean = [0.485, 0.456, 0.406]
    # std = [0.229, 0.224, 0.225]

    # Reshape output to n classes
    filters = model.fc.weight.shape[1]
    model.fc.bias = nn.Parameter(torch.zeros(n), requires_grad=True)
    model.fc.weight = nn.Parameter(torch.zeros(n, filters), requires_grad=True)
    model.fc.out_features = n
    return model


def scale_img(img, ratio=1.0, same_shape=False, gs=32):  # img(16,3,256,416)
    # Scales img(bs,3,y,x) by ratio, constrained to a gs-multiple
    if ratio == 1.0:
        return img
    else:
        h, w = img.shape[2:]
        s = (int(h * ratio), int(w * ratio))  # new size
        img = F.interpolate(img, size=s, mode='bilinear', align_corners=False)  # resize
        if not same_shape:  # pad/crop img
            h, w = [math.ceil(x * ratio / gs) * gs for x in (h, w)]
        return F.pad(img, [0, w - s[1], 0, h - s[0]], value=0.447)  # value = imagenet mean


def copy_attr(a, b, include=(), exclude=()):
    # Copy attributes from b to a, with options to only include [...] and to exclude [...]
    for k, v in b.__dict__.items():
        if (len(include) and k not in include) or k.startswith('_') or k in exclude:
            continue
        else:
            setattr(a, k, v)
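
# scale_img usage sketch (comment only; hypothetical shapes). Output dimensions stay
# a multiple of the grid stride gs, padded with the ImageNet mean:
#   x = torch.zeros(16, 3, 256, 416)
#   y = scale_img(x, ratio=0.83, same_shape=False, gs=32)  # resized to (212, 345), padded to (16, 3, 224, 352)
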
class EarlyStopping:
    # YOLOv5 simple early stopper
    def __init__(self, patience=30):
        self.best_fitness = 0.0  # i.e. mAP
        self.best_epoch = 0
        self.patience = patience or float('inf')  # epochs to wait after fitness stops improving before stopping
        self.possible_stop = False  # possible stop may occur next epoch

    def __call__(self, epoch, fitness):
        if fitness >= self.best_fitness:  # >= 0 to allow for the early zero-fitness stage of training
            self.best_epoch = epoch
            self.best_fitness = fitness
        delta = epoch - self.best_epoch  # epochs without improvement
        self.possible_stop = delta >= (self.patience - 1)  # possible stop may occur next epoch
        stop = delta >= self.patience  # stop training if patience exceeded
        if stop:
            LOGGER.info(f'EarlyStopping patience {self.patience} exceeded, stopping training.')
        return stop
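
# Usage sketch (comment only). The competition trains with patience=100 on the
# validation wAP@0.5; validate() is a hypothetical helper returning that fitness:
#   stopper = EarlyStopping(patience=100)
#   for epoch in range(epochs):
#       fitness = validate(model)  # wAP@0.5 on the validation set
#       if stopper(epoch, fitness):
#           break
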
class ModelEMA:
    """ Model Exponential Moving Average from https://github.com/rwightman/pytorch-image-models
    Keep a moving average of everything in the model state_dict (parameters and buffers).
    This is intended to allow functionality like
    https://www.tensorflow.org/api_docs/python/tf/train/ExponentialMovingAverage
    A smoothed version of the weights is necessary for some training schemes to perform well.
    This class is sensitive to where it is initialized in the sequence of model init,
    GPU assignment and distributed training wrappers.
    """

    def __init__(self, model, decay=0.9999, updates=0):
        # Create EMA
        self.ema = deepcopy(model.module if is_parallel(model) else model).eval()  # FP32 EMA
        # if next(model.parameters()).device.type != 'cpu':
        #     self.ema.half()  # FP16 EMA
        self.updates = updates  # number of EMA updates
        self.decay = lambda x: decay * (1 - math.exp(-x / 2000))  # decay exponential ramp (to help early epochs)
        for p in self.ema.parameters():
            p.requires_grad_(False)

    def update(self, model):
        # Update EMA parameters
        with torch.no_grad():
            self.updates += 1
            d = self.decay(self.updates)

            msd = model.module.state_dict() if is_parallel(model) else model.state_dict()  # model state_dict
            for k, v in self.ema.state_dict().items():
                if v.dtype.is_floating_point:
                    v *= d
                    v += (1. - d) * msd[k].detach()

    def update_attr(self, model, include=(), exclude=('process_group', 'reducer')):
        # Update EMA attributes
        copy_attr(self.ema, model, include, exclude)
--------------------------------------------------------------------------------
/val.py:
--------------------------------------------------------------------------------
import sys
import argparse
from pathlib import Path
from threading import Thread

import torch
import numpy as np
from tqdm import tqdm

from utils.callbacks import Callbacks
from models.experimental import attempt_load
from utils.datasets import create_dataloader
from utils.plots import plot_images, output_to_target
from utils.metrics import ap_per_class, ConfusionMatrix
from utils.torch_utils import select_device, time_sync
from utils.general import check_dataset, check_img_size, check_suffix, check_yaml, box_iou,\
    non_max_suppression, scale_coords, xyxy2xywh, xywh2xyxy, set_logging, increment_path, colorstr


FILE = Path(__file__).resolve()
sys.path.append(FILE.parents[0].as_posix())


def save_one_txt(normed_pred, save_conf, shape, file):
    # Save one txt result
    gn = torch.tensor(shape)[[1, 0, 1, 0]]  # normalization gain whwh
    for *xyxy, conf, cls in normed_pred.tolist():
        xywh = (xyxy2xywh(torch.tensor(xyxy).view(1, 4)) / gn).view(-1).tolist()  # normalized xywh
        line = (cls, *xywh, conf) if save_conf else (cls, *xywh)  # label format
        with open(file, 'a') as f:
            f.write(('%g ' * len(line)).rstrip() % line + '\n')


def process_batch(detections, labels, iou_thresholds):
    """
    Return a correct-predictions matrix. Both sets of boxes are in (x1, y1, x2, y2) format.
    Arguments:
        detections (Array[N, 6]): x1, y1, x2, y2, conf, class
        labels (Array[M, 5]): class, x1, y1, x2, y2
        iou_thresholds: tensor of 10 IoU thresholds from 0.5 to 0.95
    Returns:
        correct (Array[N, 10]), for 10 IoU levels
    """
    correct = torch.zeros(detections.shape[0], iou_thresholds.shape[0], dtype=torch.bool, device=iou_thresholds.device)
    iou = box_iou(labels[:, 1:], detections[:, :4])
    x = torch.where((iou >= iou_thresholds[0]) & (labels[:, 0:1] == detections[:, 5]))  # IoU above 0.5, classes match
    if x[0].shape[0]:
        # [label, detection, iou]
        matches = torch.cat((torch.stack(x, 1), iou[x[0], x[1]][:, None]), 1).cpu().numpy()
        if x[0].shape[0] > 1:
            matches = matches[matches[:, 2].argsort()[::-1]]
            matches = matches[np.unique(matches[:, 1], return_index=True)[1]]
            matches = matches[np.unique(matches[:, 0], return_index=True)[1]]
        matches = torch.Tensor(matches).to(iou_thresholds.device)
        correct[matches[:, 1].long()] = matches[:, 2:3] >= iou_thresholds

    return correct


def cal_weighted_ap(ap50):
    # Weighted AP@0.5 (wAP@0.5): class APs weighted 0.3 (no mask), 0.2 (mask) and 0.5 (incorrect mask)
    return 0.2 * ap50[1] + 0.3 * ap50[0] + 0.5 * ap50[2]
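
# Worked example (comment only). When all three classes appear, ap_per_class() returns
# class APs in class-index order (0 = no mask, 1 = mask, 2 = incorrect mask), so:
#   ap50 = np.array([0.90, 0.95, 0.60])
#   cal_weighted_ap(ap50)  # = 0.3*0.90 + 0.2*0.95 + 0.5*0.60 = 0.76
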
@torch.no_grad()
def run(data,
        weights=None,  # model.pt path(s)
        batch_size=32,  # batch size
        img_size=640,  # inference size (pixels)
        conf_threshold=0.001,  # confidence threshold
        iou_threshold=0.6,  # NMS IoU threshold
        task='val',  # train, val, test, speed or study
        device='',  # cuda device, i.e. 0 or 0,1,2,3 or cpu
        augment=False,  # augmented inference
        verbose=False,  # verbose output
        save_txt=False,  # save results to *.txt
        save_hybrid=False,  # save label+prediction hybrid results to *.txt
        save_conf=False,  # save confidences in --save-txt labels
        project='results/val',  # save to project/name
        name='exp',  # save to project/name
        exist_ok=False,  # existing project/name ok, do not increment
        half=True,  # use FP16 half-precision inference
        model=None,
        dataloader=None,
        save_dir=Path(''),
        plots=True,
        callbacks=Callbacks(),
        compute_loss=None,
        ):

    # Initialize/load model and set device
    is_loaded_model = model is not None
    grid_size = None

    if is_loaded_model:  # called during training with a live model and dataloader
        device = next(model.parameters()).device
    else:  # called directly: load a checkpoint and prepare directories
        device = select_device(device, batch_size=batch_size)

        # Directories
        save_dir = increment_path(Path(project) / name, exist_ok=exist_ok)
        (save_dir / 'labels' if save_txt else save_dir).mkdir(parents=True, exist_ok=True)

        # Load model
        check_suffix(weights, '.pt')
        model = attempt_load(weights, map_location=device)
        grid_size = max(int(model.stride.max()), 32)
        img_size = check_img_size(img_size, s=grid_size)

        # Data
        data = check_dataset(data)

    # Half
    half &= device.type != 'cpu'  # half precision only supported on CUDA
    if half:
        model.half()

    # Configure
    model.eval()
    num_class = int(data['num_class'])
    iou_thresholds = torch.linspace(0.5, 0.95, 10).to(device)  # IoU vector for mAP@0.5:0.95
    num_thresholds = iou_thresholds.numel()

    # Dataloader
    if not is_loaded_model:
        if device.type != 'cpu':
            model(torch.zeros(1, 3, img_size, img_size).to(device).type_as(next(model.parameters())))  # warm up
        task = task if task in ('train', 'val', 'test') else 'val'
        dataloader = create_dataloader(data[task], img_size, batch_size, grid_size, pad=0.5, rect=True,
                                       prefix=colorstr(f'{task}: '))[0]

    seen = 0
    num_per_class = [0] * num_class

    confusion_matrix = ConfusionMatrix(nc=num_class)
    names = {k: v for k, v in enumerate(model.names if hasattr(model, 'names') else model.module.names)}
    s = ('%20s' + '%11s' * 8) % ('Class', 'Images', 'Labels', 'Boxes', 'P', 'R', 'wAP@.5', 'mAP@.5', 'mAP@.5:.95')
    dt, p, r, f1, mp, mr, map50, map, wap50 = [0.0, 0.0, 0.0], 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0

    loss = torch.zeros(3, device=device)
    stats, ap, ap_class = [], [], []
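
    # Evaluation sketch (comment only): each prediction is scored at the 10 IoU levels
    # iou_thresholds = [0.50, 0.55, ..., 0.95], so process_batch() yields a boolean
    # [N, 10] 'correct' matrix per image; column 0 (IoU >= 0.5) is the basis of the
    # competition's wAP@0.5 metric.
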
    for batch_i, (img, targets, paths, shapes) in enumerate(tqdm(dataloader, desc=s)):
        t1 = time_sync()

        # Preprocess
        img = img.to(device, non_blocking=True)
        img = img.half() if half else img.float()  # uint8 to fp16/32
        img /= 255.0  # 0-255 to 0.0-1.0

        for i in range(num_class):
            num_per_class[i] += len(np.where(targets[:, 1] == i)[0])
        targets = targets.to(device)

        batch_size, _, height, width = img.shape  # batch size, channels, height, width
        t2 = time_sync()
        dt[0] += t2 - t1

        # Run model
        out, train_out = model(img, augment=augment)  # inference and training outputs
        dt[1] += time_sync() - t2

        # Compute loss
        if compute_loss:
            # box, obj, cls
            loss += compute_loss([x.float() for x in train_out], targets)[1]

        # Run NMS
        targets[:, 2:] *= torch.Tensor([width, height, width, height]).to(device)  # to pixels
        lb = [targets[targets[:, 0] == i, 1:] for i in range(batch_size)] if save_hybrid else []
        t3 = time_sync()

        # NMS trims each prediction from depth 8 (xywh + objectness + 3 class scores) to 6 (xyxy, conf, cls)
        out = non_max_suppression(out, conf_threshold, iou_threshold, labels=lb, multi_label=True)
        dt[2] += time_sync() - t3

        # Statistics per image
        for si, pred in enumerate(out):
            labels = targets[targets[:, 0] == si, 1:]
            nl = len(labels)
            target_class = labels[:, 0].tolist() if nl else []  # target class
            path, shape = Path(paths[si]), shapes[si][0]
            seen += 1

            if len(pred) == 0:
                if nl:
                    stats.append((torch.zeros(0, num_thresholds, dtype=torch.bool),
                                  torch.Tensor(), torch.Tensor(), target_class))
                continue

            normed_pred = pred.clone()
            scale_coords(img[si].shape[1:], normed_pred[:, :4], shape, shapes[si][1])  # native-space pred

            # Evaluate
            if nl:
                target_boxes = xywh2xyxy(labels[:, 1:5])  # target boxes
                scale_coords(img[si].shape[1:], target_boxes, shape, shapes[si][1])  # native-space labels
                labels_per_img = torch.cat((labels[:, 0:1], target_boxes), 1)  # native-space labels
                correct = process_batch(normed_pred, labels_per_img, iou_thresholds)
                if plots:
                    confusion_matrix.process_batch(normed_pred, labels_per_img)
            else:
                correct = torch.zeros(pred.shape[0], num_thresholds, dtype=torch.bool)

            # correct, confidence, pred_label, target_label
            stats.append((correct.cpu(), pred[:, 4].cpu(), pred[:, 5].cpu(), target_class))

            # Save/log
            if save_txt:
                save_one_txt(normed_pred, save_conf, shape, file=save_dir / 'labels' / (path.stem + '.txt'))
            callbacks.run('on_val_image_end', pred, normed_pred, path, names, img[si])

        # Plot images
        if plots and batch_i < 3:
            f = save_dir / f'val_batch{batch_i}_labels.jpg'  # labels
            Thread(target=plot_images, args=(img, targets, paths, f, names), daemon=True).start()
            f = save_dir / f'val_batch{batch_i}_pred.jpg'  # predictions
            Thread(target=plot_images, args=(img, output_to_target(out), paths, f, names), daemon=True).start()

    # Compute statistics
    stats = [np.concatenate(x, 0) for x in zip(*stats)]
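
    # At this point stats holds four concatenated arrays (shape sketch, for N predictions
    # and M labels over the whole split):
    #   stats[0]: correct    -> bool  [N, 10]
    #   stats[1]: conf       -> float [N]
    #   stats[2]: pred_cls   -> float [N]
    #   stats[3]: target_cls -> float [M]
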
    # Count detected boxes per class
    boxes_per_class = np.bincount(stats[2].astype(np.int64), minlength=num_class)
    ap50 = None

    if len(stats) and stats[0].any():
        p, r, ap, f1, ap_class = ap_per_class(*stats, plot=plots, save_dir=save_dir, names=names)
        ap50, ap = ap[:, 0], ap.mean(1)  # AP@0.5, AP@0.5:0.95
        mp, mr, wap50, map50, map = p.mean(), r.mean(), cal_weighted_ap(ap50), ap50.mean(), ap.mean()
        nt = np.bincount(stats[3].astype(np.int64), minlength=num_class)  # number of targets per class
    else:
        nt = torch.zeros(1)

    # Print results
    print_format = '%20s' + '%11i' * 3 + '%11.3g' * 5  # print format
    print(print_format % ('all', seen, nt.sum(), sum(boxes_per_class), mp, mr, wap50, map50, map))

    # Print results per class (ap50 fills both the wAP@.5 and mAP@.5 columns, since per-class wAP is just AP@0.5)
    if (verbose or (num_class < 50 and not is_loaded_model)) and num_class > 1 and len(stats):
        for i, c in enumerate(ap_class):
            print(print_format % (names[c], num_per_class[c], nt[c],
                                  boxes_per_class[c], p[i], r[i], ap50[i], ap50[i], ap[i]))

    # Print speeds
    t = tuple(x / seen * 1E3 for x in dt)  # ms per image
    if not is_loaded_model:
        shape = (batch_size, 3, img_size, img_size)
        print(f'Speed: %.1fms pre-process, %.1fms inference, %.1fms NMS per image at shape {shape}' % t)

    # Plots
    if plots:
        confusion_matrix.plot(save_dir=save_dir, names=list(names.values()))
        callbacks.run('on_val_end')

    # Return results
    model.float()
    if not is_loaded_model:
        s = f"\n{len(list(save_dir.glob('labels/*.txt')))} labels saved to {save_dir / 'labels'}" if save_txt else ''
        print(f"Results saved to {colorstr('bold', save_dir)}{s}")
    maps = np.zeros(num_class) + map
    for i, c in enumerate(ap_class):
        maps[c] = ap[i]
    return (mp, mr, wap50, map50, map, *(loss.cpu() / len(dataloader)).tolist()), maps, t


def parser():
    args = argparse.ArgumentParser(prog='val.py')
    args.add_argument('--data', type=str, default='config/data_cfg.yaml', help='dataset.yaml path')
    args.add_argument('--weights', type=str, help='specify your weight path', required=True)
    args.add_argument('--task', help='train, val, test', required=True)
    args.add_argument('--name', help='save to project/name', required=True)
    args.add_argument('--batch-size', type=int, default=64, help='batch size')
    args.add_argument('--device', default='cpu', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')
    args = args.parse_args()

    args.img_size = 640
    args.conf_threshold = 0.001
    args.iou_threshold = 0.6
    args.augment = False
    args.exist_ok = False
    args.half = False
    args.project = 'results/evaluate/' + args.task
    args.save_conf = False
    args.save_hybrid = False
    args.save_txt = False
    args.verbose = False
    args.plots = True

    args.save_txt |= args.save_hybrid
    args.data = check_yaml(args.data)

    return args


def main(args):
    set_logging()
    print(colorstr('val: ') + ', '.join(f'{k}={v}' for k, v in vars(args).items()))

    if args.task in ('train', 'val', 'test'):  # run normally
        run(**vars(args))


if __name__ == "__main__":
    main(parser())
--------------------------------------------------------------------------------
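For reference, evaluation can also be driven programmatically instead of via the CLI. A minimal sketch, assuming a trained checkpoint (the `best.pt` path here is hypothetical):

```python
# Command-line equivalent: python val.py --weights best.pt --task val --name exp
import val

# data defaults to config/data_cfg.yaml, matching parser(); device='cpu' avoids a GPU requirement
results, maps, times = val.run(data='config/data_cfg.yaml', weights='best.pt',
                               task='val', name='exp', device='cpu', half=False)
```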