├── .DS_Store
├── .gitignore
├── README.md
├── common
│   ├── .DS_Store
│   ├── __init__.py
│   ├── coco_dataset.py
│   ├── data_transforms.py
│   ├── demo
│   │   ├── demo0.jpg
│   │   ├── demo1.jpg
│   │   └── loss_curve.png
│   └── utils.py
├── data
│   ├── .DS_Store
│   ├── coco.names
│   └── get_coco_dataset.sh
├── evaluate
│   ├── coco_index2category.json
│   ├── eval.py
│   ├── eval_coco.py
│   └── params.py
├── nets
│   ├── .DS_Store
│   ├── __init__.py
│   ├── backbone
│   │   ├── .DS_Store
│   │   ├── __init__.py
│   │   ├── darknet.py
│   │   └── shufflenet.py
│   ├── model_main.py
│   └── yolo_loss.py
├── requirements.txt
├── test
│   ├── .DS_Store
│   ├── images
│   │   ├── test1.jpg
│   │   ├── test2.jpg
│   │   ├── test3.jpg
│   │   └── test4.jpg
│   ├── params.py
│   ├── test_fps.py
│   └── test_images.py
├── training
│   ├── params.py
│   └── training.py
└── weights
    ├── .DS_Store
    └── README.md

/.DS_Store:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ZhuYun97/ShuffleNetv2-YOLOv3/6d8e5ccff90519f307cb3112513f50ca044662f3/.DS_Store
--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
1 | # Byte-compiled / optimized / DLL files
2 | __pycache__/
3 | *.py[cod]
4 | *$py.class
5 |
6 | # C extensions
7 | *.so
8 | *.pth
9 |
10 | # Distribution / packaging
11 | .Python
12 | build/
13 | develop-eggs/
14 | dist/
15 | downloads/
16 | eggs/
17 | .eggs/
18 | lib/
19 | lib64/
20 | parts/
21 | sdist/
22 | var/
23 | wheels/
24 | *.egg-info/
25 | .installed.cfg
26 | *.egg
27 | MANIFEST
28 |
29 | # PyInstaller
30 | # Usually these files are written by a python script from a template
31 | # before PyInstaller builds the exe, so as to inject date/other infos into it.
32 | *.manifest
33 | *.spec
34 |
35 | # Installer logs
36 | pip-log.txt
37 | pip-delete-this-directory.txt
38 |
39 | # Unit test / coverage reports
40 | htmlcov/
41 | .tox/
42 | .coverage
43 | .coverage.*
44 | .cache
45 | nosetests.xml
46 | coverage.xml
47 | *.cover
48 | .hypothesis/
49 | .pytest_cache/
50 |
51 | # Translations
52 | *.mo
53 | *.pot
54 |
55 | # Django stuff:
56 | *.log
57 | local_settings.py
58 | db.sqlite3
59 |
60 | # Flask stuff:
61 | instance/
62 | .webassets-cache
63 |
64 | # Scrapy stuff:
65 | .scrapy
66 |
67 | # Sphinx documentation
68 | docs/_build/
69 |
70 | # PyBuilder
71 | target/
72 |
73 | # Jupyter Notebook
74 | .ipynb_checkpoints
75 |
76 | # pyenv
77 | .python-version
78 |
79 | # celery beat schedule file
80 | celerybeat-schedule
81 |
82 | # SageMath parsed files
83 | *.sage.py
84 |
85 | # Environments
86 | .env
87 | .venv
88 | env/
89 | venv/
90 | ENV/
91 | env.bak/
92 | venv.bak/
93 |
94 | # Spyder project settings
95 | .spyderproject
96 | .spyproject
97 |
98 | # Rope project settings
99 | .ropeproject
100 |
101 | # mkdocs documentation
102 | /site
103 |
104 | # mypy
105 | .mypy_cache/
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # ShuffleNetv2-YOLOv3
2 | This work is based on [YOLOv3_PyTorch](https://github.com/BobLiu20/YOLOv3_PyTorch). I replace the backbone with ShuffleNet v2, but after testing I cannot train a good detector. Many people have said that the [work](https://github.com/BobLiu20/YOLOv3_PyTorch) has many problems.
**So I don't recommend this repo; if you want to use ShuffleNetV2 + YOLOv3, use [this](https://github.com/TencentYoutuResearch/ObjectDetection-OneStageDet) instead.**
3 | ## Why this project
4 | Darknet-53 is computationally expensive, and I want to speed up the network. So I replace the backbone with ShuffleNet v2, a lightweight network, in order to run the detector on mobile devices such as smartphones.
5 | ## Installation
6 | ### Environment
7 | - pytorch >= 0.4.0
8 | - python >= 3.6.0
9 |
10 | ### Get code
11 | ```git clone https://github.com/ZhuYun97/ShufflNetv2-YOLOv3.git```
12 |
13 | ### Download COCO dataset
14 | ```
15 | cd ShufflNetv2-YOLOv3/data
16 | bash get_coco_dataset.sh
17 | ```
18 |
19 | ## Training
20 | ### Download pretrained weights
21 | - If you want to use ShuffleNetv2, you can download the pretrained weights (still in training).
22 | - If you want to use Darknet, just follow [the original author](https://github.com/BobLiu20/YOLOv3_PyTorch).
23 |
24 | ### Modify training parameters
25 | 1. Review the config file training/params.py.
26 | 2. Replace YOUR_WORKING_DIR with your working directory; it is used to save models and temporary files.
27 | 3. Adjust your GPU devices; see parallels.
28 | 4. Adjust other parameters as needed.
29 |
30 | ### Start training
31 | ```
32 | cd training
33 | python training.py params.py
34 | ```
35 |
36 | ### Option: Visualizing training
37 | ```
38 | # please install tensorboard first
39 | python -m tensorboard.main --logdir=YOUR_WORKING_DIR
40 | ```
41 | ## Evaluate
42 | ### Download pretrained weights
43 | - If you want to use ShuffleNetv2, you can download the pretrained weights (still in training).
44 | - If you want to use Darknet, just follow [the original author](https://github.com/BobLiu20/YOLOv3_PyTorch).
45 | Move the downloaded file to the weights folder of this project.
46 |
47 | ### Start evaluation
48 | ```
49 | cd evaluate
50 | python eval_coco.py params.py
51 | ```
52 |
53 | ## Quick test
54 | ### Pretrained weights
55 | Please download the pretrained weights [in progress]() or use your own checkpoint.
56 | ```
57 | # Start test
58 | cd test
59 | python test_images.py params.py
60 | # The result images are written to the output folder.
61 | ```
62 |
63 | ## Measure FPS
64 | ### Pretrained weights
65 | Please download the pretrained weights [in progress]() or use your own checkpoint.
66 |
67 | ### Start test
68 | ```
69 | cd test
70 | python test_fps.py params.py
71 | ```
72 | ## Results
73 | Tested on a Titan X GPU with different input sizes and batch sizes.
74 | Keep in mind that this is a full end-to-end YOLOv3 test: not only the backbone but also the YOLO layers and NMS.
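
Concretely, the scripts under test/ all follow the same pattern: build `ModelMain`, decode each of the three output scales with `YOLOLoss`, and filter boxes with `non_max_suppression`. Below is a minimal sketch of that flow (an illustration, not a file shipped in this repo); it assumes it is run from the test/ directory on a CUDA machine, with `pretrain_snapshot` in test/params.py pointing at a valid checkpoint.

```
# Minimal inference sketch, condensed from test/test_images.py and test/test_fps.py.
import sys
sys.path.insert(0, "..")  # run from test/ so nets/ and common/ resolve

import importlib
import torch
import torch.nn as nn

from nets.model_main import ModelMain
from nets.yolo_loss import YOLOLoss
from common.utils import non_max_suppression

config = importlib.import_module("params").TRAINING_PARAMS  # test/params.py

# Build the network and load a checkpoint.
net = ModelMain(config, is_training=False)
net.train(False)
net = nn.DataParallel(net).cuda()
net.load_state_dict(torch.load(config["pretrain_snapshot"]))

# One YOLO decoder per output scale.
yolo_losses = [YOLOLoss(config["yolo"]["anchors"][i],
                        config["yolo"]["classes"],
                        (config["img_w"], config["img_h"])) for i in range(3)]

# Placeholder batch; in practice this is a resized, normalized, CHW image tensor.
images = torch.randn(1, 3, config["img_h"], config["img_w"]).cuda()
with torch.no_grad():
    outputs = net(images)
    output = torch.cat([yolo_losses[i](outputs[i]) for i in range(3)], 1)
    detections = non_max_suppression(output, config["yolo"]["classes"],
                                     conf_thres=config["confidence_threshold"])
# detections[i] is None or a tensor of (x1, y1, x2, y2, obj_conf, class_conf, class_pred).
```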
75 | 76 | ## References 77 | [YOLOv3_PyTorch](https://github.com/BobLiu20/YOLOv3_PyTorch) 78 | -------------------------------------------------------------------------------- /common/.DS_Store: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ZhuYun97/ShuffleNetv2-YOLOv3/6d8e5ccff90519f307cb3112513f50ca044662f3/common/.DS_Store -------------------------------------------------------------------------------- /common/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ZhuYun97/ShuffleNetv2-YOLOv3/6d8e5ccff90519f307cb3112513f50ca044662f3/common/__init__.py -------------------------------------------------------------------------------- /common/coco_dataset.py: -------------------------------------------------------------------------------- 1 | import os 2 | import numpy as np 3 | import logging 4 | import cv2 5 | 6 | import torch 7 | from torch.utils.data import Dataset 8 | 9 | from . import data_transforms 10 | 11 | 12 | class COCODataset(Dataset): 13 | def __init__(self, list_path, img_size, is_training, is_debug=False): 14 | self.img_files = [] 15 | self.label_files = [] 16 | for path in open(list_path, 'r'): 17 | label_path = path.replace('images', 'labels').replace('.png', '.txt').replace( 18 | '.jpg', '.txt').strip() 19 | if os.path.isfile(label_path): 20 | self.img_files.append(path) 21 | self.label_files.append(label_path) 22 | else: 23 | logging.info("no label found. skip it: {}".format(path)) 24 | logging.info("Total images: {}".format(len(self.img_files))) 25 | self.img_size = img_size # (w, h) 26 | self.max_objects = 50 27 | self.is_debug = is_debug 28 | 29 | # transforms and augmentation 30 | self.transforms = data_transforms.Compose() 31 | if is_training: 32 | self.transforms.add(data_transforms.ImageBaseAug()) 33 | # self.transforms.add(data_transforms.KeepAspect()) 34 | self.transforms.add(data_transforms.ResizeImage(self.img_size)) 35 | self.transforms.add(data_transforms.ToTensor(self.max_objects, self.is_debug)) 36 | 37 | def __getitem__(self, index): 38 | img_path = self.img_files[index % len(self.img_files)].rstrip() 39 | img = cv2.imread(img_path, cv2.IMREAD_COLOR) 40 | if img is None: 41 | raise Exception("Read image error: {}".format(img_path)) 42 | ori_h, ori_w = img.shape[:2] 43 | img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB) 44 | 45 | label_path = self.label_files[index % len(self.img_files)].rstrip() 46 | if os.path.exists(label_path): 47 | labels = np.loadtxt(label_path).reshape(-1, 5) 48 | else: 49 | logging.info("label does not exist: {}".format(label_path)) 50 | labels = np.zeros((1, 5), np.float32) 51 | 52 | sample = {'image': img, 'label': labels} 53 | if self.transforms is not None: 54 | sample = self.transforms(sample) 55 | sample["image_path"] = img_path 56 | sample["origin_size"] = str([ori_w, ori_h]) 57 | return sample 58 | 59 | def __len__(self): 60 | return len(self.img_files) 61 | 62 | 63 | # use for test dataloader 64 | if __name__ == "__main__": 65 | dataloader = torch.utils.data.DataLoader(COCODataset("../data/coco/trainvalno5k.txt", 66 | (416, 416), True, is_debug=True), 67 | batch_size=2, 68 | shuffle=False, num_workers=1, pin_memory=False) 69 | for step, sample in enumerate(dataloader): 70 | for i, (image, label) in enumerate(zip(sample['image'], sample['label'])): 71 | image = image.numpy() 72 | h, w = image.shape[:2] 73 | for l in label: 74 | if l.sum() == 0: 75 | continue 76 | x1 = int((l[1] - l[3] / 2) * 
w) 77 | y1 = int((l[2] - l[4] / 2) * h) 78 | x2 = int((l[1] + l[3] / 2) * w) 79 | y2 = int((l[2] + l[4] / 2) * h) 80 | cv2.rectangle(image, (x1, y1), (x2, y2), (0, 0, 255)) 81 | image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR) 82 | cv2.imwrite("step{}_{}.jpg".format(step, i), image) 83 | # only one batch 84 | break 85 | -------------------------------------------------------------------------------- /common/data_transforms.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import cv2 3 | import torch 4 | 5 | # pip install imgaug 6 | import imgaug as ia 7 | from imgaug import augmenters as iaa 8 | 9 | 10 | class Compose(object): 11 | """Composes several transforms together. 12 | Args: 13 | transforms (list of ``Transform`` objects): list of transforms to compose. 14 | """ 15 | def __init__(self, transforms=[]): 16 | self.transforms = transforms 17 | 18 | def __call__(self, img): 19 | for t in self.transforms: 20 | img = t(img) 21 | return img 22 | 23 | def add(self, transform): 24 | self.transforms.append(transform) 25 | 26 | 27 | class ToTensor(object): 28 | def __init__(self, max_objects=50, is_debug=False): 29 | self.max_objects = max_objects 30 | self.is_debug = is_debug 31 | 32 | def __call__(self, sample): 33 | image, labels = sample['image'], sample['label'] 34 | if self.is_debug == False: 35 | image = image.astype(np.float32) 36 | image /= 255.0 37 | image = np.transpose(image, (2, 0, 1)) 38 | image = image.astype(np.float32) 39 | 40 | filled_labels = np.zeros((self.max_objects, 5), np.float32) 41 | filled_labels[range(len(labels))[:self.max_objects]] = labels[:self.max_objects] 42 | return {'image': torch.from_numpy(image), 'label': torch.from_numpy(filled_labels)} 43 | 44 | class KeepAspect(object): 45 | def __init__(self): 46 | pass 47 | 48 | def __call__(self, sample): 49 | image, label = sample['image'], sample['label'] 50 | 51 | h, w, _ = image.shape 52 | dim_diff = np.abs(h - w) 53 | # Upper (left) and lower (right) padding 54 | pad1, pad2 = dim_diff // 2, dim_diff - dim_diff // 2 55 | # Determine padding 56 | pad = ((pad1, pad2), (0, 0), (0, 0)) if h <= w else ((0, 0), (pad1, pad2), (0, 0)) 57 | # Add padding 58 | image_new = np.pad(image, pad, 'constant', constant_values=128) 59 | padded_h, padded_w, _ = image_new.shape 60 | 61 | # Extract coordinates for unpadded + unscaled image 62 | x1 = w * (label[:, 1] - label[:, 3]/2) 63 | y1 = h * (label[:, 2] - label[:, 4]/2) 64 | x2 = w * (label[:, 1] + label[:, 3]/2) 65 | y2 = h * (label[:, 2] + label[:, 4]/2) 66 | # Adjust for added padding 67 | x1 += pad[1][0] 68 | y1 += pad[0][0] 69 | x2 += pad[1][0] 70 | y2 += pad[0][0] 71 | # Calculate ratios from coordinates 72 | label[:, 1] = ((x1 + x2) / 2) / padded_w 73 | label[:, 2] = ((y1 + y2) / 2) / padded_h 74 | label[:, 3] *= w / padded_w 75 | label[:, 4] *= h / padded_h 76 | 77 | return {'image': image_new, 'label': label} 78 | 79 | class ResizeImage(object): 80 | def __init__(self, new_size, interpolation=cv2.INTER_LINEAR): 81 | self.new_size = tuple(new_size) # (w, h) 82 | self.interpolation = interpolation 83 | 84 | def __call__(self, sample): 85 | image, label = sample['image'], sample['label'] 86 | image = cv2.resize(image, self.new_size, interpolation=self.interpolation) 87 | return {'image': image, 'label': label} 88 | 89 | class ImageBaseAug(object): 90 | def __init__(self): 91 | sometimes = lambda aug: iaa.Sometimes(0.5, aug) 92 | self.seq = iaa.Sequential( 93 | [ 94 | # Blur each image with varying strength using 
95 | # gaussian blur (sigma between 0 and 3.0), 96 | # average/uniform blur (kernel size between 2x2 and 7x7) 97 | # median blur (kernel size between 3x3 and 11x11). 98 | iaa.OneOf([ 99 | iaa.GaussianBlur((0, 3.0)), 100 | iaa.AverageBlur(k=(2, 7)), 101 | iaa.MedianBlur(k=(3, 11)), 102 | ]), 103 | # Sharpen each image, overlay the result with the original 104 | # image using an alpha between 0 (no sharpening) and 1 105 | # (full sharpening effect). 106 | sometimes(iaa.Sharpen(alpha=(0, 0.5), lightness=(0.75, 1.5))), 107 | # Add gaussian noise to some images. 108 | sometimes(iaa.AdditiveGaussianNoise(loc=0, scale=(0.0, 0.05*255), per_channel=0.5)), 109 | # Add a value of -5 to 5 to each pixel. 110 | sometimes(iaa.Add((-5, 5), per_channel=0.5)), 111 | # Change brightness of images (80-120% of original value). 112 | sometimes(iaa.Multiply((0.8, 1.2), per_channel=0.5)), 113 | # Improve or worsen the contrast of images. 114 | sometimes(iaa.ContrastNormalization((0.5, 2.0), per_channel=0.5)), 115 | ], 116 | # do all of the above augmentations in random order 117 | random_order=True 118 | ) 119 | 120 | def __call__(self, sample): 121 | seq_det = self.seq.to_deterministic() 122 | image, label = sample['image'], sample['label'] 123 | image = seq_det.augment_images([image])[0] 124 | return {'image': image, 'label': label} 125 | -------------------------------------------------------------------------------- /common/demo/demo0.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ZhuYun97/ShuffleNetv2-YOLOv3/6d8e5ccff90519f307cb3112513f50ca044662f3/common/demo/demo0.jpg -------------------------------------------------------------------------------- /common/demo/demo1.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ZhuYun97/ShuffleNetv2-YOLOv3/6d8e5ccff90519f307cb3112513f50ca044662f3/common/demo/demo1.jpg -------------------------------------------------------------------------------- /common/demo/loss_curve.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ZhuYun97/ShuffleNetv2-YOLOv3/6d8e5ccff90519f307cb3112513f50ca044662f3/common/demo/loss_curve.png -------------------------------------------------------------------------------- /common/utils.py: -------------------------------------------------------------------------------- 1 | from __future__ import division 2 | import math 3 | import time 4 | import torch 5 | import torch.nn as nn 6 | import torch.nn.functional as F 7 | from torch.autograd import Variable 8 | import numpy as np 9 | 10 | 11 | def bbox_iou(box1, box2, x1y1x2y2=True): 12 | """ 13 | Returns the IoU of two bounding boxes 14 | """ 15 | if not x1y1x2y2: 16 | # Transform from center and width to exact coordinates 17 | b1_x1, b1_x2 = box1[:, 0] - box1[:, 2] / 2, box1[:, 0] + box1[:, 2] / 2 18 | b1_y1, b1_y2 = box1[:, 1] - box1[:, 3] / 2, box1[:, 1] + box1[:, 3] / 2 19 | b2_x1, b2_x2 = box2[:, 0] - box2[:, 2] / 2, box2[:, 0] + box2[:, 2] / 2 20 | b2_y1, b2_y2 = box2[:, 1] - box2[:, 3] / 2, box2[:, 1] + box2[:, 3] / 2 21 | else: 22 | # Get the coordinates of bounding boxes 23 | b1_x1, b1_y1, b1_x2, b1_y2 = box1[:,0], box1[:,1], box1[:,2], box1[:,3] 24 | b2_x1, b2_y1, b2_x2, b2_y2 = box2[:,0], box2[:,1], box2[:,2], box2[:,3] 25 | 26 | # get the corrdinates of the intersection rectangle 27 | inter_rect_x1 = torch.max(b1_x1, b2_x1) 28 | inter_rect_y1 = torch.max(b1_y1, b2_y1) 29 | 
inter_rect_x2 = torch.min(b1_x2, b2_x2) 30 | inter_rect_y2 = torch.min(b1_y2, b2_y2) 31 | # Intersection area 32 | inter_area = torch.clamp(inter_rect_x2 - inter_rect_x1 + 1, min=0) * \ 33 | torch.clamp(inter_rect_y2 - inter_rect_y1 + 1, min=0) 34 | # Union Area 35 | b1_area = (b1_x2 - b1_x1 + 1) * (b1_y2 - b1_y1 + 1) 36 | b2_area = (b2_x2 - b2_x1 + 1) * (b2_y2 - b2_y1 + 1) 37 | 38 | iou = inter_area / (b1_area + b2_area - inter_area + 1e-16) 39 | 40 | return iou 41 | 42 | 43 | def non_max_suppression(prediction, num_classes, conf_thres=0.5, nms_thres=0.4): 44 | """ 45 | Removes detections with lower object confidence score than 'conf_thres' and performs 46 | Non-Maximum Suppression to further filter detections. 47 | Returns detections with shape: 48 | (x1, y1, x2, y2, object_conf, class_score, class_pred) 49 | """ 50 | 51 | # From (center x, center y, width, height) to (x1, y1, x2, y2) 52 | box_corner = prediction.new(prediction.shape) 53 | box_corner[:, :, 0] = prediction[:, :, 0] - prediction[:, :, 2] / 2 54 | box_corner[:, :, 1] = prediction[:, :, 1] - prediction[:, :, 3] / 2 55 | box_corner[:, :, 2] = prediction[:, :, 0] + prediction[:, :, 2] / 2 56 | box_corner[:, :, 3] = prediction[:, :, 1] + prediction[:, :, 3] / 2 57 | prediction[:, :, :4] = box_corner[:, :, :4] 58 | 59 | output = [None for _ in range(len(prediction))] 60 | for image_i, image_pred in enumerate(prediction): 61 | # Filter out confidence scores below threshold 62 | conf_mask = (image_pred[:, 4] >= conf_thres).squeeze() 63 | image_pred = image_pred[conf_mask] 64 | # If none are remaining => process next image 65 | if not image_pred.size(0): 66 | continue 67 | # Get score and class with highest confidence 68 | class_conf, class_pred = torch.max(image_pred[:, 5:5 + num_classes], 1, keepdim=True) 69 | # Detections ordered as (x1, y1, x2, y2, obj_conf, class_conf, class_pred) 70 | detections = torch.cat((image_pred[:, :5], class_conf.float(), class_pred.float()), 1) 71 | # Iterate through all predicted classes 72 | unique_labels = detections[:, -1].cpu().unique() 73 | if prediction.is_cuda: 74 | unique_labels = unique_labels.cuda() 75 | for c in unique_labels: 76 | # Get the detections with the particular class 77 | detections_class = detections[detections[:, -1] == c] 78 | # Sort the detections by maximum objectness confidence 79 | _, conf_sort_index = torch.sort(detections_class[:, 4], descending=True) 80 | detections_class = detections_class[conf_sort_index] 81 | # Perform non-maximum suppression 82 | max_detections = [] 83 | while detections_class.size(0): 84 | # Get detection with highest confidence and save as max detection 85 | max_detections.append(detections_class[0].unsqueeze(0)) 86 | # Stop if we're at the last detection 87 | if len(detections_class) == 1: 88 | break 89 | # Get the IOUs for all boxes with lower confidence 90 | ious = bbox_iou(max_detections[-1], detections_class[1:]) 91 | # Remove detections with IoU >= NMS threshold 92 | detections_class = detections_class[1:][ious < nms_thres] 93 | 94 | max_detections = torch.cat(max_detections).data 95 | # Add max detections to outputs 96 | output[image_i] = max_detections if output[image_i] is None else torch.cat((output[image_i], max_detections)) 97 | 98 | return output 99 | -------------------------------------------------------------------------------- /data/.DS_Store: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/ZhuYun97/ShuffleNetv2-YOLOv3/6d8e5ccff90519f307cb3112513f50ca044662f3/data/.DS_Store -------------------------------------------------------------------------------- /data/coco.names: -------------------------------------------------------------------------------- 1 | person 2 | bicycle 3 | car 4 | motorbike 5 | aeroplane 6 | bus 7 | train 8 | truck 9 | boat 10 | traffic light 11 | fire hydrant 12 | stop sign 13 | parking meter 14 | bench 15 | bird 16 | cat 17 | dog 18 | horse 19 | sheep 20 | cow 21 | elephant 22 | bear 23 | zebra 24 | giraffe 25 | backpack 26 | umbrella 27 | handbag 28 | tie 29 | suitcase 30 | frisbee 31 | skis 32 | snowboard 33 | sports ball 34 | kite 35 | baseball bat 36 | baseball glove 37 | skateboard 38 | surfboard 39 | tennis racket 40 | bottle 41 | wine glass 42 | cup 43 | fork 44 | knife 45 | spoon 46 | bowl 47 | banana 48 | apple 49 | sandwich 50 | orange 51 | broccoli 52 | carrot 53 | hot dog 54 | pizza 55 | donut 56 | cake 57 | chair 58 | sofa 59 | pottedplant 60 | bed 61 | diningtable 62 | toilet 63 | tvmonitor 64 | laptop 65 | mouse 66 | remote 67 | keyboard 68 | cell phone 69 | microwave 70 | oven 71 | toaster 72 | sink 73 | refrigerator 74 | book 75 | clock 76 | vase 77 | scissors 78 | teddy bear 79 | hair drier 80 | toothbrush 81 | -------------------------------------------------------------------------------- /data/get_coco_dataset.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | # CREDIT: https://github.com/pjreddie/darknet/tree/master/scripts/get_coco_dataset.sh 4 | 5 | # Clone COCO API 6 | git clone https://github.com/pdollar/coco 7 | cd coco 8 | 9 | mkdir images 10 | cd images 11 | 12 | # Download Images 13 | wget -c https://pjreddie.com/media/files/train2014.zip 14 | wget -c https://pjreddie.com/media/files/val2014.zip 15 | 16 | # Unzip 17 | unzip -q train2014.zip 18 | unzip -q val2014.zip 19 | 20 | cd .. 
21 | 22 | # Download COCO Metadata 23 | wget -c https://pjreddie.com/media/files/instances_train-val2014.zip 24 | wget -c https://pjreddie.com/media/files/coco/5k.part 25 | wget -c https://pjreddie.com/media/files/coco/trainvalno5k.part 26 | wget -c https://pjreddie.com/media/files/coco/labels.tgz 27 | tar xzf labels.tgz 28 | unzip -q instances_train-val2014.zip 29 | 30 | # Set Up Image Lists 31 | paste <(awk "{print \"$PWD\"}" <5k.part) 5k.part | tr -d '\t' > 5k.txt 32 | paste <(awk "{print \"$PWD\"}" trainvalno5k.txt 33 | -------------------------------------------------------------------------------- /evaluate/coco_index2category.json: -------------------------------------------------------------------------------- 1 | {"0": 1, "1": 2, "2": 3, "3": 4, "4": 5, "5": 6, "6": 7, "7": 8, "8": 9, "9": 10, "10": 11, "11": 13, "12": 14, "13": 15, "14": 16, "15": 17, "16": 18, "17": 19, "18": 20, "19": 21, "20": 22, "21": 23, "22": 24, "23": 25, "24": 27, "25": 28, "26": 31, "27": 32, "28": 33, "29": 34, "30": 35, "31": 36, "32": 37, "33": 38, "34": 39, "35": 40, "36": 41, "37": 42, "38": 43, "39": 44, "40": 46, "41": 47, "42": 48, "43": 49, "44": 50, "45": 51, "46": 52, "47": 53, "48": 54, "49": 55, "50": 56, "51": 57, "52": 58, "53": 59, "54": 60, "55": 61, "56": 62, "57": 63, "58": 64, "59": 65, "60": 67, "61": 70, "62": 72, "63": 73, "64": 74, "65": 75, "66": 76, "67": 77, "68": 78, "69": 79, "70": 80, "71": 81, "72": 82, "73": 84, "74": 85, "75": 86, "76": 87, "77": 88, "78": 89, "79": 90} -------------------------------------------------------------------------------- /evaluate/eval.py: -------------------------------------------------------------------------------- 1 | # coding='utf-8' 2 | import os 3 | import sys 4 | import numpy as np 5 | import time 6 | import datetime 7 | import json 8 | import importlib 9 | import logging 10 | import shutil 11 | 12 | import torch 13 | import torch.nn as nn 14 | 15 | 16 | MY_DIRNAME = os.path.dirname(os.path.abspath(__file__)) 17 | sys.path.insert(0, os.path.join(MY_DIRNAME, '..')) 18 | from nets.model_main import ModelMain 19 | from nets.yolo_loss import YOLOLoss 20 | from common.coco_dataset import COCODataset 21 | from common.utils import non_max_suppression, bbox_iou 22 | 23 | 24 | def evaluate(config): 25 | is_training = False 26 | # Load and initialize network 27 | net = ModelMain(config, is_training=is_training) 28 | net.train(is_training) 29 | 30 | # Set data parallel 31 | net = nn.DataParallel(net) 32 | net = net.cuda() 33 | 34 | # Restore pretrain model 35 | if config["pretrain_snapshot"]: 36 | state_dict = torch.load(config["pretrain_snapshot"]) 37 | net.load_state_dict(state_dict) 38 | else: 39 | logging.warning("missing pretrain_snapshot!!!") 40 | 41 | # YOLO loss with 3 scales 42 | yolo_losses = [] 43 | for i in range(3): 44 | yolo_losses.append(YOLOLoss(config["yolo"]["anchors"][i], 45 | config["yolo"]["classes"], (config["img_w"], config["img_h"]))) 46 | 47 | # DataLoader 48 | dataloader = torch.utils.data.DataLoader(COCODataset(config["val_path"], 49 | (config["img_w"], config["img_h"]), 50 | is_training=False), 51 | batch_size=config["batch_size"], 52 | shuffle=False, num_workers=16, pin_memory=False) 53 | 54 | # Start the eval loop 55 | logging.info("Start eval.") 56 | n_gt = 0 57 | correct = 0 58 | for step, samples in enumerate(dataloader): 59 | images, labels = samples["image"], samples["label"] 60 | labels = labels.cuda() 61 | with torch.no_grad(): 62 | outputs = net(images) 63 | output_list = [] 64 | for i in range(3): 65 | 
output_list.append(yolo_losses[i](outputs[i])) 66 | output = torch.cat(output_list, 1) 67 | output = non_max_suppression(output, config["yolo"]["classes"], conf_thres=0.2) 68 | # calculate 69 | for sample_i in range(labels.size(0)): 70 | # Get labels for sample where width is not zero (dummies) 71 | target_sample = labels[sample_i, labels[sample_i, :, 3] != 0] 72 | for obj_cls, tx, ty, tw, th in target_sample: 73 | # Get rescaled gt coordinates 74 | tx1, tx2 = config["img_w"] * (tx - tw / 2), config["img_w"] * (tx + tw / 2) 75 | ty1, ty2 = config["img_h"] * (ty - th / 2), config["img_h"] * (ty + th / 2) 76 | n_gt += 1 77 | box_gt = torch.cat([coord.unsqueeze(0) for coord in [tx1, ty1, tx2, ty2]]).view(1, -1) 78 | sample_pred = output[sample_i] 79 | if sample_pred is not None: 80 | # Iterate through predictions where the class predicted is same as gt 81 | for x1, y1, x2, y2, conf, obj_conf, obj_pred in sample_pred[sample_pred[:, 6] == obj_cls]: 82 | box_pred = torch.cat([coord.unsqueeze(0) for coord in [x1, y1, x2, y2]]).view(1, -1) 83 | iou = bbox_iou(box_pred, box_gt) 84 | if iou >= config["iou_thres"]: 85 | correct += 1 86 | break 87 | if n_gt: 88 | logging.info('Batch [%d/%d] mAP: %.5f' % (step, len(dataloader), float(correct / n_gt))) 89 | 90 | logging.info('Mean Average Precision: %.5f' % float(correct / n_gt)) 91 | 92 | def main(): 93 | logging.basicConfig(level=logging.DEBUG, 94 | format="[%(asctime)s %(filename)s] %(message)s") 95 | 96 | if len(sys.argv) != 2: 97 | logging.error("Usage: python eval.py params.py") 98 | sys.exit() 99 | params_path = sys.argv[1] 100 | if not os.path.isfile(params_path): 101 | logging.error("no params file found! path: {}".format(params_path)) 102 | sys.exit() 103 | config = importlib.import_module(params_path[:-3]).TRAINING_PARAMS 104 | config["batch_size"] *= len(config["parallels"]) 105 | 106 | # Start training 107 | os.environ["CUDA_VISIBLE_DEVICES"] = ','.join(map(str, config["parallels"])) 108 | evaluate(config) 109 | 110 | if __name__ == "__main__": 111 | main() 112 | -------------------------------------------------------------------------------- /evaluate/eval_coco.py: -------------------------------------------------------------------------------- 1 | # coding='utf-8' 2 | import os 3 | import json 4 | from json import encoder 5 | encoder.FLOAT_REPR = lambda o: format(o, '.2f') 6 | import sys 7 | import numpy as np 8 | import time 9 | import datetime 10 | import importlib 11 | import logging 12 | import shutil 13 | 14 | import matplotlib 15 | matplotlib.use('Agg') # disable display 16 | from pycocotools.coco import COCO 17 | from pycocotools.cocoeval import COCOeval 18 | 19 | import torch 20 | import torch.nn as nn 21 | 22 | 23 | MY_DIRNAME = os.path.dirname(os.path.abspath(__file__)) 24 | sys.path.insert(0, os.path.join(MY_DIRNAME, '..')) 25 | from nets.model_main import ModelMain 26 | from nets.yolo_loss import YOLOLoss 27 | from common.coco_dataset import COCODataset 28 | from common.utils import non_max_suppression 29 | 30 | 31 | def evaluate(config): 32 | is_training = False 33 | # Load and initialize network 34 | net = ModelMain(config, is_training=is_training) 35 | net.train(is_training) 36 | 37 | # Set data parallel 38 | net = nn.DataParallel(net) 39 | net = net.cuda() 40 | 41 | # Restore pretrain model 42 | if config["pretrain_snapshot"]: 43 | logging.info("Load checkpoint: {}".format(config["pretrain_snapshot"])) 44 | state_dict = torch.load(config["pretrain_snapshot"]) 45 | net.load_state_dict(state_dict) 46 | else: 47 | 
logging.warning("missing pretrain_snapshot!!!") 48 | 49 | # YOLO loss with 3 scales 50 | yolo_losses = [] 51 | for i in range(3): 52 | yolo_losses.append(YOLOLoss(config["yolo"]["anchors"][i], 53 | config["yolo"]["classes"], (config["img_w"], config["img_h"]))) 54 | 55 | # DataLoader. 56 | dataloader = torch.utils.data.DataLoader(COCODataset(config["val_path"], 57 | (config["img_w"], config["img_h"]), 58 | is_training=False), 59 | batch_size=config["batch_size"], 60 | shuffle=False, num_workers=8, pin_memory=False) 61 | 62 | # Coco Prepare. 63 | index2category = json.load(open("coco_index2category.json")) 64 | 65 | # Start the eval loop 66 | logging.info("Start eval.") 67 | coco_results = [] 68 | coco_img_ids= set([]) 69 | for step, samples in enumerate(dataloader): 70 | images, labels = samples["image"], samples["label"] 71 | image_paths, origin_sizes = samples["image_path"], samples["origin_size"] 72 | with torch.no_grad(): 73 | outputs = net(images) 74 | output_list = [] 75 | for i in range(3): 76 | output_list.append(yolo_losses[i](outputs[i])) 77 | output = torch.cat(output_list, 1) 78 | batch_detections = non_max_suppression(output, config["yolo"]["classes"], 79 | conf_thres=0.01, 80 | nms_thres=0.45) 81 | for idx, detections in enumerate(batch_detections): 82 | image_id = int(os.path.basename(image_paths[idx])[-16:-4]) 83 | coco_img_ids.add(image_id) 84 | if detections is not None: 85 | origin_size = eval(origin_sizes[idx]) 86 | detections = detections.cpu().numpy() 87 | for x1, y1, x2, y2, conf, cls_conf, cls_pred in detections: 88 | x1 = x1 / config["img_w"] * origin_size[0] 89 | x2 = x2 / config["img_w"] * origin_size[0] 90 | y1 = y1 / config["img_h"] * origin_size[1] 91 | y2 = y2 / config["img_h"] * origin_size[1] 92 | w = x2 - x1 93 | h = y2 - y1 94 | coco_results.append({ 95 | "image_id": image_id, 96 | "category_id": index2category[str(int(cls_pred.item()))], 97 | "bbox": (float(x1), float(y1), float(w), float(h)), 98 | "score": float(conf), 99 | }) 100 | logging.info("Now {}/{}".format(step, len(dataloader))) 101 | save_results_path = "coco_results.json" 102 | with open(save_results_path, "w") as f: 103 | json.dump(coco_results, f, sort_keys=True, indent=4, separators=(',', ':')) 104 | logging.info("Save coco format results to {}".format(save_results_path)) 105 | 106 | # COCO api 107 | logging.info("Using coco-evaluate tools to evaluate.") 108 | cocoGt = COCO(config["annotation_path"]) 109 | cocoDt = cocoGt.loadRes(save_results_path) 110 | cocoEval = COCOeval(cocoGt, cocoDt, "bbox") 111 | cocoEval.params.imgIds = list(coco_img_ids) # real imgIds 112 | cocoEval.evaluate() 113 | cocoEval.accumulate() 114 | cocoEval.summarize() 115 | 116 | 117 | def main(): 118 | logging.basicConfig(level=logging.DEBUG, 119 | format="[%(asctime)s %(filename)s] %(message)s") 120 | 121 | if len(sys.argv) != 2: 122 | logging.error("Usage: python eval_coco.py params.py") 123 | sys.exit() 124 | params_path = sys.argv[1] 125 | if not os.path.isfile(params_path): 126 | logging.error("no params file found! 
path: {}".format(params_path)) 127 | sys.exit() 128 | config = importlib.import_module(params_path[:-3]).TRAINING_PARAMS 129 | config["batch_size"] *= len(config["parallels"]) 130 | 131 | # Start training 132 | os.environ["CUDA_VISIBLE_DEVICES"] = ','.join(map(str, config["parallels"])) 133 | evaluate(config) 134 | 135 | if __name__ == "__main__": 136 | main() 137 | -------------------------------------------------------------------------------- /evaluate/params.py: -------------------------------------------------------------------------------- 1 | TRAINING_PARAMS = \ 2 | { 3 | "model_params": { 4 | "backbone_name": "shufflenet_2", 5 | "backbone_pretrained": "", 6 | }, 7 | "yolo": { 8 | "anchors": [[[116, 90], [156, 198], [373, 326]], 9 | [[30, 61], [62, 45], [59, 119]], 10 | [[10, 13], [16, 30], [33, 23]]], 11 | "classes": 80, 12 | }, 13 | "batch_size": 16, 14 | "iou_thres": 0.5, 15 | "val_path": "../data/coco/5k.txt", 16 | "annotation_path": "../data/coco/annotations/instances_val2014.json", 17 | "img_h": 416, 18 | "img_w": 416, 19 | "parallels": [0], 20 | "pretrain_snapshot": "../training/YOUR_WORKING_DIR/shufflenet_2/size416x416_try2/20190719210856/model.pth", 21 | } 22 | -------------------------------------------------------------------------------- /nets/.DS_Store: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ZhuYun97/ShuffleNetv2-YOLOv3/6d8e5ccff90519f307cb3112513f50ca044662f3/nets/.DS_Store -------------------------------------------------------------------------------- /nets/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ZhuYun97/ShuffleNetv2-YOLOv3/6d8e5ccff90519f307cb3112513f50ca044662f3/nets/__init__.py -------------------------------------------------------------------------------- /nets/backbone/.DS_Store: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ZhuYun97/ShuffleNetv2-YOLOv3/6d8e5ccff90519f307cb3112513f50ca044662f3/nets/backbone/.DS_Store -------------------------------------------------------------------------------- /nets/backbone/__init__.py: -------------------------------------------------------------------------------- 1 | from . import darknet 2 | from . 
import shufflenet 3 | 4 | backbone_fn = { 5 | "darknet_21": darknet.darknet21, 6 | "darknet_53": darknet.darknet53, 7 | "shufflenet_2": shufflenet.shufflenet2 8 | } 9 | -------------------------------------------------------------------------------- /nets/backbone/darknet.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import math 4 | from collections import OrderedDict 5 | 6 | __all__ = ['darknet21', 'darknet53'] 7 | 8 | 9 | class BasicBlock(nn.Module): 10 | def __init__(self, inplanes, planes): 11 | super(BasicBlock, self).__init__() 12 | self.conv1 = nn.Conv2d(inplanes, planes[0], kernel_size=1, 13 | stride=1, padding=0, bias=False) 14 | self.bn1 = nn.BatchNorm2d(planes[0]) 15 | self.relu1 = nn.LeakyReLU(0.1) 16 | self.conv2 = nn.Conv2d(planes[0], planes[1], kernel_size=3, 17 | stride=1, padding=1, bias=False) 18 | self.bn2 = nn.BatchNorm2d(planes[1]) 19 | self.relu2 = nn.LeakyReLU(0.1) 20 | 21 | def forward(self, x): 22 | residual = x 23 | 24 | out = self.conv1(x) 25 | out = self.bn1(out) 26 | out = self.relu1(out) 27 | 28 | out = self.conv2(out) 29 | out = self.bn2(out) 30 | out = self.relu2(out) 31 | 32 | out += residual 33 | return out 34 | 35 | 36 | class DarkNet(nn.Module): 37 | def __init__(self, layers): 38 | super(DarkNet, self).__init__() 39 | self.inplanes = 32 40 | self.conv1 = nn.Conv2d(3, self.inplanes, kernel_size=3, stride=1, padding=1, bias=False) 41 | self.bn1 = nn.BatchNorm2d(self.inplanes) 42 | self.relu1 = nn.LeakyReLU(0.1) 43 | 44 | self.layer1 = self._make_layer([32, 64], layers[0]) 45 | self.layer2 = self._make_layer([64, 128], layers[1]) 46 | self.layer3 = self._make_layer([128, 256], layers[2]) 47 | self.layer4 = self._make_layer([256, 512], layers[3]) 48 | self.layer5 = self._make_layer([512, 1024], layers[4]) 49 | 50 | self.layers_out_filters = [64, 128, 256, 512, 1024] 51 | 52 | 53 | for m in self.modules(): 54 | if isinstance(m, nn.Conv2d): 55 | n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels 56 | m.weight.data.normal_(0, math.sqrt(2. / n)) 57 | elif isinstance(m, nn.BatchNorm2d): 58 | m.weight.data.fill_(1) 59 | m.bias.data.zero_() 60 | 61 | def _make_layer(self, planes, blocks): 62 | layers = [] 63 | # downsample 64 | layers.append(("ds_conv", nn.Conv2d(self.inplanes, planes[1], kernel_size=3, 65 | stride=2, padding=1, bias=False))) 66 | layers.append(("ds_bn", nn.BatchNorm2d(planes[1]))) 67 | layers.append(("ds_relu", nn.LeakyReLU(0.1))) 68 | # blocks 69 | self.inplanes = planes[1] 70 | for i in range(0, blocks): 71 | layers.append(("residual_{}".format(i), BasicBlock(self.inplanes, planes))) 72 | return nn.Sequential(OrderedDict(layers)) 73 | 74 | def forward(self, x): 75 | x = self.conv1(x) 76 | x = self.bn1(x) 77 | x = self.relu1(x) 78 | 79 | x = self.layer1(x) 80 | x = self.layer2(x) 81 | out3 = self.layer3(x) 82 | out4 = self.layer4(out3) 83 | out5 = self.layer5(out4) 84 | 85 | return out3, out4, out5 86 | 87 | def darknet21(pretrained, **kwargs): 88 | """Constructs a darknet-21 model. 89 | """ 90 | model = DarkNet([1, 1, 2, 2, 1]) 91 | if pretrained: 92 | if isinstance(pretrained, str): 93 | model.load_state_dict(torch.load(pretrained)) 94 | else: 95 | raise Exception("darknet request a pretrained path. got [{}]".format(pretrained)) 96 | return model 97 | 98 | def darknet53(pretrained, **kwargs): 99 | """Constructs a darknet-53 model. 
100 | """ 101 | model = DarkNet([1, 2, 8, 8, 4]) 102 | if pretrained: 103 | if isinstance(pretrained, str): 104 | model.load_state_dict(torch.load(pretrained)) 105 | else: 106 | raise Exception("darknet request a pretrained path. got [{}]".format(pretrained)) 107 | return model 108 | 109 | 110 | 111 | -------------------------------------------------------------------------------- /nets/backbone/shufflenet.py: -------------------------------------------------------------------------------- 1 | import torch as t 2 | import torch.nn as nn 3 | import math 4 | from collections import OrderedDict 5 | 6 | __all__ = ['shufflenet2'] 7 | 8 | #### The model below is defined by myself 9 | 10 | 11 | def channel_shuffle(x, groups=2): 12 | bat_size, channels, w, h = x.shape 13 | group_c = channels // groups 14 | x = x.view(bat_size, groups, group_c, w, h) 15 | x = t.transpose(x, 1, 2).contiguous() 16 | x = x.view(bat_size, -1, w, h) 17 | return x 18 | 19 | # used in the block 20 | def conv_1x1_bn(in_c, out_c, stride=1): 21 | return nn.Sequential( 22 | nn.Conv2d(in_c, out_c, 1, stride, 0, bias=False), 23 | nn.BatchNorm2d(out_c), 24 | nn.ReLU(True) 25 | ) 26 | 27 | def conv_bn(in_c, out_c, stride=2): 28 | return nn.Sequential( 29 | nn.Conv2d(in_c, out_c, 3, stride, 1, bias=False), 30 | nn.BatchNorm2d(out_c), 31 | nn.ReLU(True) 32 | ) 33 | 34 | 35 | class ShuffleBlock(nn.Module): 36 | def __init__(self, in_c, out_c, downsample=False): 37 | super(ShuffleBlock, self).__init__() 38 | self.downsample = downsample 39 | half_c = out_c // 2 40 | if downsample: 41 | self.branch1 = nn.Sequential( 42 | # 3*3 dw conv, stride = 2 43 | nn.Conv2d(in_c, in_c, 3, 2, 1, groups=in_c, bias=False), 44 | nn.BatchNorm2d(in_c), 45 | # 1*1 pw conv 46 | nn.Conv2d(in_c, half_c, 1, 1, 0, bias=False), 47 | nn.BatchNorm2d(half_c), 48 | nn.ReLU(True) 49 | ) 50 | 51 | self.branch2 = nn.Sequential( 52 | # 1*1 pw conv 53 | nn.Conv2d(in_c, half_c, 1, 1, 0, bias=False), 54 | nn.BatchNorm2d(half_c), 55 | nn.ReLU(True), 56 | # 3*3 dw conv, stride = 2 57 | nn.Conv2d(half_c, half_c, 3, 2, 1, groups=half_c, bias=False), 58 | nn.BatchNorm2d(half_c), 59 | # 1*1 pw conv 60 | nn.Conv2d(half_c, half_c, 1, 1, 0, bias=False), 61 | nn.BatchNorm2d(half_c), 62 | nn.ReLU(True) 63 | ) 64 | else: 65 | # in_c = out_c 66 | assert in_c == out_c 67 | 68 | self.branch2 = nn.Sequential( 69 | # 1*1 pw conv 70 | nn.Conv2d(half_c, half_c, 1, 1, 0, bias=False), 71 | nn.BatchNorm2d(half_c), 72 | nn.ReLU(True), 73 | # 3*3 dw conv, stride = 1 74 | nn.Conv2d(half_c, half_c, 3, 1, 1, groups=half_c, bias=False), 75 | nn.BatchNorm2d(half_c), 76 | # 1*1 pw conv 77 | nn.Conv2d(half_c, half_c, 1, 1, 0, bias=False), 78 | nn.BatchNorm2d(half_c), 79 | nn.ReLU(True) 80 | ) 81 | 82 | 83 | def forward(self, x): 84 | out = None 85 | if self.downsample: 86 | # if it is downsampling, we don't need to do channel split 87 | out = t.cat((self.branch1(x), self.branch2(x)), 1) 88 | else: 89 | # channel split 90 | channels = x.shape[1] 91 | c = channels // 2 92 | x1 = x[:, :c, :, :] 93 | x2 = x[:, c:, :, :] 94 | out = t.cat((x1, self.branch2(x2)), 1) 95 | return channel_shuffle(out, 2) 96 | 97 | 98 | class ShuffleNet2(nn.Module): 99 | def __init__(self, input_size=416, net_type=1): 100 | super(ShuffleNet2, self).__init__() 101 | assert input_size % 32 == 0 # 因为一共会下采样32倍 102 | self.layers_out_filters = [24, 116, 232, 1024] # used for shufflenet v2 103 | 104 | self.stage_repeat_num = [4, 8, 4] 105 | if net_type == 0.5: 106 | self.out_channels = [3, 24, 48, 96, 192, 1024] 107 | elif net_type == 
1: 108 | self.out_channels = [3, 24, 116, 232, 464, 1024] 109 | elif net_type == 1.5: 110 | self.out_channels = [3, 24, 176, 352, 704, 1024] 111 | elif net_type == 2: 112 | self.out_channels = [3, 24, 244, 488, 976, 2948] 113 | elif net_type == -1: 114 | self.out_channels = [3, 24, 128, 256, 512, 1024] 115 | else: 116 | print("the type is error, you should choose 0.5, 1, 1.5 or 2") 117 | 118 | # let's start building layers 119 | self.conv1 = nn.Conv2d(3, self.out_channels[1], 3, 2, 1) 120 | self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1) 121 | in_c = self.out_channels[1] 122 | 123 | self.stage2 = [] 124 | self.stage3 = [] 125 | self.stage4 = [] 126 | for stage_idx in range(len(self.stage_repeat_num)): 127 | out_c = self.out_channels[2+stage_idx] 128 | repeat_num = self.stage_repeat_num[stage_idx] 129 | stage = [] 130 | for i in range(repeat_num): 131 | if i == 0: 132 | stage.append(ShuffleBlock(in_c, out_c, downsample=True)) 133 | else: 134 | stage.append(ShuffleBlock(in_c, in_c, downsample=False)) 135 | in_c = out_c 136 | if stage_idx == 0: 137 | self.stage2 = stage 138 | elif stage_idx == 1: 139 | self.stage3 = stage 140 | elif stage_idx == 2: 141 | self.stage4 = stage 142 | else: 143 | print("error") 144 | # self.stages = nn.Sequential(*self.stages) 145 | self.stage2 = nn.Sequential(*self.stage2) # 58 * 58 * 116 146 | self.stage3 = nn.Sequential(*self.stage3) # 26 * 26 * 232 147 | self.stage4 = nn.Sequential(*self.stage4) 148 | in_c = self.out_channels[-2] 149 | out_c = self.out_channels[-1] 150 | self.conv5 = conv_1x1_bn(in_c, out_c, 1) # 13 * 13 * 1024 151 | # self.g_avg_pool = nn.AvgPool2d(kernel_size=(int)(input_size/32)) # 如果输入的是224,则此处为7 152 | 153 | # # fc layer 154 | # self.fc = nn.Linear(out_c, num_classes) 155 | 156 | 157 | def forward(self, x): 158 | x = self.conv1(x) 159 | x = self.maxpool(x) 160 | out3 = self.stage2(x) 161 | out4 = self.stage3(out3) 162 | out5 = self.stage4(out4) 163 | out5 = self.conv5(out5) 164 | # x = self.g_avg_pool(x) 165 | # x = x.view(-1, self.out_channels[-1]) 166 | # x = self.fc(x) 167 | return out3, out4, out5 168 | 169 | def shufflenet2(pretrained, **kwargs): 170 | """Constructs a darknet-53 model. 171 | """ 172 | model = ShuffleNet2() 173 | if pretrained: 174 | if isinstance(pretrained, str): 175 | model.load_state_dict(t.load(pretrained)) 176 | else: 177 | raise Exception("darknet request a pretrained path. 
got [{}]".format(pretrained)) 178 | return model 179 | -------------------------------------------------------------------------------- /nets/model_main.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | from collections import OrderedDict 4 | 5 | from .backbone import backbone_fn 6 | 7 | 8 | class ModelMain(nn.Module): 9 | def __init__(self, config, is_training=True): 10 | super(ModelMain, self).__init__() 11 | self.config = config 12 | self.training = is_training 13 | self.model_params = config["model_params"] 14 | # backbone 15 | _backbone_fn = backbone_fn[self.model_params["backbone_name"]] 16 | self.backbone = _backbone_fn(self.model_params["backbone_pretrained"]) 17 | _out_filters = self.backbone.layers_out_filters 18 | # embedding0 19 | final_out_filter0 = len(config["yolo"]["anchors"][0]) * (5 + config["yolo"]["classes"]) 20 | self.embedding0 = self._make_embedding([512, 1024], _out_filters[-1], final_out_filter0) 21 | # embedding1 22 | final_out_filter1 = len(config["yolo"]["anchors"][1]) * (5 + config["yolo"]["classes"]) 23 | self.embedding1_cbl = self._make_cbl(512, 256, 1) 24 | self.embedding1_upsample = nn.Upsample(scale_factor=2, mode='nearest') 25 | self.embedding1 = self._make_embedding([256, 512], _out_filters[-2] + 256, final_out_filter1) 26 | # embedding2 27 | final_out_filter2 = len(config["yolo"]["anchors"][2]) * (5 + config["yolo"]["classes"]) 28 | self.embedding2_cbl = self._make_cbl(256, 128, 1) 29 | self.embedding2_upsample = nn.Upsample(scale_factor=2, mode='nearest') 30 | self.embedding2 = self._make_embedding([128, 256], _out_filters[-3] + 128, final_out_filter2) 31 | 32 | def _make_cbl(self, _in, _out, ks): 33 | ''' cbl = conv + batch_norm + leaky_relu 34 | ''' 35 | pad = (ks - 1) // 2 if ks else 0 36 | return nn.Sequential(OrderedDict([ 37 | ("conv", nn.Conv2d(_in, _out, kernel_size=ks, stride=1, padding=pad, bias=False)), 38 | ("bn", nn.BatchNorm2d(_out)), 39 | ("relu", nn.LeakyReLU(0.1)), 40 | ])) 41 | 42 | def _make_embedding(self, filters_list, in_filters, out_filter): 43 | m = nn.ModuleList([ 44 | self._make_cbl(in_filters, filters_list[0], 1), 45 | self._make_cbl(filters_list[0], filters_list[1], 3), 46 | self._make_cbl(filters_list[1], filters_list[0], 1), 47 | self._make_cbl(filters_list[0], filters_list[1], 3), 48 | self._make_cbl(filters_list[1], filters_list[0], 1), 49 | self._make_cbl(filters_list[0], filters_list[1], 3)]) 50 | m.add_module("conv_out", nn.Conv2d(filters_list[1], out_filter, kernel_size=1, 51 | stride=1, padding=0, bias=True)) 52 | return m 53 | 54 | def forward(self, x): 55 | def _branch(_embedding, _in): 56 | for i, e in enumerate(_embedding): 57 | _in = e(_in) 58 | if i == 4: 59 | out_branch = _in 60 | return _in, out_branch 61 | # backbone 62 | x2, x1, x0 = self.backbone(x) 63 | # yolo branch 0 64 | out0, out0_branch = _branch(self.embedding0, x0) 65 | # yolo branch 1 66 | x1_in = self.embedding1_cbl(out0_branch) 67 | x1_in = self.embedding1_upsample(x1_in) 68 | x1_in = torch.cat([x1_in, x1], 1) 69 | out1, out1_branch = _branch(self.embedding1, x1_in) 70 | # yolo branch 2 71 | x2_in = self.embedding2_cbl(out1_branch) 72 | x2_in = self.embedding2_upsample(x2_in) 73 | x2_in = torch.cat([x2_in, x2], 1) 74 | out2, out2_branch = _branch(self.embedding2, x2_in) 75 | return out0, out1, out2 76 | 77 | def load_darknet_weights(self, weights_path): 78 | import numpy as np 79 | #Open the weights file 80 | fp = open(weights_path, "rb") 81 | header = 
np.fromfile(fp, dtype=np.int32, count=5) # First five are header values 82 | # Needed to write header when saving weights 83 | weights = np.fromfile(fp, dtype=np.float32) # The rest are weights 84 | print ("total len weights = ", weights.shape) 85 | fp.close() 86 | 87 | ptr = 0 88 | all_dict = self.state_dict() 89 | all_keys = self.state_dict().keys() 90 | print (all_keys) 91 | last_bn_weight = None 92 | last_conv = None 93 | for i, (k, v) in enumerate(all_dict.items()): 94 | if 'bn' in k: 95 | if 'weight' in k: 96 | last_bn_weight = v 97 | elif 'bias' in k: 98 | num_b = v.numel() 99 | vv = torch.from_numpy(weights[ptr:ptr + num_b]).view_as(v) 100 | v.copy_(vv) 101 | print ("bn_bias: ", ptr, num_b, k) 102 | ptr += num_b 103 | # weight 104 | v = last_bn_weight 105 | num_b = v.numel() 106 | vv = torch.from_numpy(weights[ptr:ptr + num_b]).view_as(v) 107 | v.copy_(vv) 108 | print ("bn_weight: ", ptr, num_b, k) 109 | ptr += num_b 110 | last_bn_weight = None 111 | elif 'running_mean' in k: 112 | num_b = v.numel() 113 | vv = torch.from_numpy(weights[ptr:ptr + num_b]).view_as(v) 114 | v.copy_(vv) 115 | print ("bn_mean: ", ptr, num_b, k) 116 | ptr += num_b 117 | elif 'running_var' in k: 118 | num_b = v.numel() 119 | vv = torch.from_numpy(weights[ptr:ptr + num_b]).view_as(v) 120 | v.copy_(vv) 121 | print ("bn_var: ", ptr, num_b, k) 122 | ptr += num_b 123 | # conv 124 | v = last_conv 125 | num_b = v.numel() 126 | vv = torch.from_numpy(weights[ptr:ptr + num_b]).view_as(v) 127 | v.copy_(vv) 128 | print ("conv wight: ", ptr, num_b, k) 129 | ptr += num_b 130 | last_conv = None 131 | else: 132 | raise Exception("Error for bn") 133 | elif 'conv' in k: 134 | if 'weight' in k: 135 | last_conv = v 136 | else: 137 | num_b = v.numel() 138 | vv = torch.from_numpy(weights[ptr:ptr + num_b]).view_as(v) 139 | v.copy_(vv) 140 | print ("conv bias: ", ptr, num_b, k) 141 | ptr += num_b 142 | # conv 143 | v = last_conv 144 | num_b = v.numel() 145 | vv = torch.from_numpy(weights[ptr:ptr + num_b]).view_as(v) 146 | v.copy_(vv) 147 | print ("conv wight: ", ptr, num_b, k) 148 | ptr += num_b 149 | last_conv = None 150 | print("Total ptr = ", ptr) 151 | print("real size = ", weights.shape) 152 | 153 | 154 | if __name__ == "__main__": 155 | config = {"model_params": {"backbone_name": "darknet_53"}} 156 | m = ModelMain(config) 157 | x = torch.randn(1, 3, 416, 416) 158 | y0, y1, y2 = m(x) 159 | print(y0.size()) 160 | print(y1.size()) 161 | print(y2.size()) 162 | 163 | -------------------------------------------------------------------------------- /nets/yolo_loss.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import numpy as np 4 | import math 5 | 6 | from common.utils import bbox_iou 7 | 8 | 9 | class YOLOLoss(nn.Module): 10 | def __init__(self, anchors, num_classes, img_size): 11 | super(YOLOLoss, self).__init__() 12 | self.anchors = anchors 13 | self.num_anchors = len(anchors) 14 | self.num_classes = num_classes 15 | self.bbox_attrs = 5 + num_classes 16 | self.img_size = img_size 17 | 18 | self.ignore_threshold = 0.5 19 | self.lambda_xy = 2.5 20 | self.lambda_wh = 2.5 21 | self.lambda_conf = 1.0 22 | self.lambda_cls = 1.0 23 | 24 | self.mse_loss = nn.MSELoss() 25 | self.bce_loss = nn.BCELoss() 26 | 27 | def forward(self, input, targets=None): 28 | bs = input.size(0) 29 | in_h = input.size(2) 30 | in_w = input.size(3) 31 | stride_h = self.img_size[1] / in_h 32 | stride_w = self.img_size[0] / in_w 33 | scaled_anchors = [(a_w / stride_w, a_h / 
stride_h) for a_w, a_h in self.anchors] 34 | 35 | prediction = input.view(bs, self.num_anchors, 36 | self.bbox_attrs, in_h, in_w).permute(0, 1, 3, 4, 2).contiguous() 37 | 38 | # Get outputs 39 | x = torch.sigmoid(prediction[..., 0]) # Center x 40 | y = torch.sigmoid(prediction[..., 1]) # Center y 41 | w = prediction[..., 2] # Width 42 | h = prediction[..., 3] # Height 43 | conf = torch.sigmoid(prediction[..., 4]) # Conf 44 | pred_cls = torch.sigmoid(prediction[..., 5:]) # Cls pred. 45 | 46 | if targets is not None: 47 | # build target 48 | mask, noobj_mask, tx, ty, tw, th, tconf, tcls = self.get_target(targets, scaled_anchors, 49 | in_w, in_h, 50 | self.ignore_threshold) 51 | mask, noobj_mask = mask.cuda(), noobj_mask.cuda() 52 | tx, ty, tw, th = tx.cuda(), ty.cuda(), tw.cuda(), th.cuda() 53 | tconf, tcls = tconf.cuda(), tcls.cuda() 54 | # losses. 55 | loss_x = self.bce_loss(x * mask, tx * mask) 56 | loss_y = self.bce_loss(y * mask, ty * mask) 57 | loss_w = self.mse_loss(w * mask, tw * mask) 58 | loss_h = self.mse_loss(h * mask, th * mask) 59 | loss_conf = self.bce_loss(conf * mask, mask) + \ 60 | 0.5 * self.bce_loss(conf * noobj_mask, noobj_mask * 0.0) 61 | loss_cls = self.bce_loss(pred_cls[mask == 1], tcls[mask == 1]) 62 | # total loss = losses * weight 63 | loss = loss_x * self.lambda_xy + loss_y * self.lambda_xy + \ 64 | loss_w * self.lambda_wh + loss_h * self.lambda_wh + \ 65 | loss_conf * self.lambda_conf + loss_cls * self.lambda_cls 66 | 67 | return loss, loss_x.item(), loss_y.item(), loss_w.item(),\ 68 | loss_h.item(), loss_conf.item(), loss_cls.item() 69 | else: 70 | FloatTensor = torch.cuda.FloatTensor if x.is_cuda else torch.FloatTensor 71 | LongTensor = torch.cuda.LongTensor if x.is_cuda else torch.LongTensor 72 | # Calculate offsets for each grid 73 | grid_x = torch.linspace(0, in_w-1, in_w).repeat(in_w, 1).repeat( 74 | bs * self.num_anchors, 1, 1).view(x.shape).type(FloatTensor) 75 | grid_y = torch.linspace(0, in_h-1, in_h).repeat(in_h, 1).t().repeat( 76 | bs * self.num_anchors, 1, 1).view(y.shape).type(FloatTensor) 77 | # Calculate anchor w, h 78 | anchor_w = FloatTensor(scaled_anchors).index_select(1, LongTensor([0])) 79 | anchor_h = FloatTensor(scaled_anchors).index_select(1, LongTensor([1])) 80 | anchor_w = anchor_w.repeat(bs, 1).repeat(1, 1, in_h * in_w).view(w.shape) 81 | anchor_h = anchor_h.repeat(bs, 1).repeat(1, 1, in_h * in_w).view(h.shape) 82 | # Add offset and scale with anchors 83 | pred_boxes = FloatTensor(prediction[..., :4].shape) 84 | pred_boxes[..., 0] = x.data + grid_x 85 | pred_boxes[..., 1] = y.data + grid_y 86 | pred_boxes[..., 2] = torch.exp(w.data) * anchor_w 87 | pred_boxes[..., 3] = torch.exp(h.data) * anchor_h 88 | # Results 89 | _scale = torch.Tensor([stride_w, stride_h] * 2).type(FloatTensor) 90 | output = torch.cat((pred_boxes.view(bs, -1, 4) * _scale, 91 | conf.view(bs, -1, 1), pred_cls.view(bs, -1, self.num_classes)), -1) 92 | return output.data 93 | 94 | def get_target(self, target, anchors, in_w, in_h, ignore_threshold): 95 | bs = target.size(0) 96 | 97 | mask = torch.zeros(bs, self.num_anchors, in_h, in_w, requires_grad=False) 98 | noobj_mask = torch.ones(bs, self.num_anchors, in_h, in_w, requires_grad=False) 99 | tx = torch.zeros(bs, self.num_anchors, in_h, in_w, requires_grad=False) 100 | ty = torch.zeros(bs, self.num_anchors, in_h, in_w, requires_grad=False) 101 | tw = torch.zeros(bs, self.num_anchors, in_h, in_w, requires_grad=False) 102 | th = torch.zeros(bs, self.num_anchors, in_h, in_w, requires_grad=False) 103 | tconf = 
torch.zeros(bs, self.num_anchors, in_h, in_w, requires_grad=False) 104 | tcls = torch.zeros(bs, self.num_anchors, in_h, in_w, self.num_classes, requires_grad=False) 105 | for b in range(bs): 106 | for t in range(target.shape[1]): 107 | if target[b, t].sum() == 0: 108 | continue 109 | # Convert to position relative to box 110 | gx = target[b, t, 1] * in_w 111 | gy = target[b, t, 2] * in_h 112 | gw = target[b, t, 3] * in_w 113 | gh = target[b, t, 4] * in_h 114 | # Get grid box indices 115 | gi = int(gx) 116 | gj = int(gy) 117 | # Get shape of gt box 118 | gt_box = torch.FloatTensor(np.array([0, 0, gw, gh])).unsqueeze(0) 119 | # Get shape of anchor box 120 | anchor_shapes = torch.FloatTensor(np.concatenate((np.zeros((self.num_anchors, 2)), 121 | np.array(anchors)), 1)) 122 | # Calculate iou between gt and anchor shapes 123 | anch_ious = bbox_iou(gt_box, anchor_shapes) 124 | # Where the overlap is larger than threshold set mask to zero (ignore) 125 | noobj_mask[b, anch_ious > ignore_threshold, gj, gi] = 0 126 | # Find the best matching anchor box 127 | best_n = np.argmax(anch_ious) 128 | 129 | # Masks 130 | mask[b, best_n, gj, gi] = 1 131 | # Coordinates 132 | tx[b, best_n, gj, gi] = gx - gi 133 | ty[b, best_n, gj, gi] = gy - gj 134 | # Width and height 135 | tw[b, best_n, gj, gi] = math.log(gw/anchors[best_n][0] + 1e-16) 136 | th[b, best_n, gj, gi] = math.log(gh/anchors[best_n][1] + 1e-16) 137 | # object 138 | tconf[b, best_n, gj, gi] = 1 139 | # One-hot encoding of label 140 | tcls[b, best_n, gj, gi, int(target[b, t, 0])] = 1 141 | 142 | return mask, noobj_mask, tx, ty, tw, th, tconf, tcls 143 | -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | numpy 2 | torch>=0.4.0 3 | torchvision 4 | pillow 5 | tensorboardX 6 | imgaug>=0.2.5 7 | opencv-python 8 | matplotlib 9 | cython 10 | pycocotools 11 | -------------------------------------------------------------------------------- /test/.DS_Store: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ZhuYun97/ShuffleNetv2-YOLOv3/6d8e5ccff90519f307cb3112513f50ca044662f3/test/.DS_Store -------------------------------------------------------------------------------- /test/images/test1.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ZhuYun97/ShuffleNetv2-YOLOv3/6d8e5ccff90519f307cb3112513f50ca044662f3/test/images/test1.jpg -------------------------------------------------------------------------------- /test/images/test2.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ZhuYun97/ShuffleNetv2-YOLOv3/6d8e5ccff90519f307cb3112513f50ca044662f3/test/images/test2.jpg -------------------------------------------------------------------------------- /test/images/test3.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ZhuYun97/ShuffleNetv2-YOLOv3/6d8e5ccff90519f307cb3112513f50ca044662f3/test/images/test3.jpg -------------------------------------------------------------------------------- /test/images/test4.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ZhuYun97/ShuffleNetv2-YOLOv3/6d8e5ccff90519f307cb3112513f50ca044662f3/test/images/test4.jpg 
-------------------------------------------------------------------------------- /test/params.py: -------------------------------------------------------------------------------- 1 | TRAINING_PARAMS = \ 2 | { 3 | "model_params": { 4 | "backbone_name": "shufflenet_2", 5 | "backbone_pretrained": "", 6 | }, 7 | "yolo": { 8 | "anchors": [[[116, 90], [156, 198], [373, 326]], 9 | [[30, 61], [62, 45], [59, 119]], 10 | [[10, 13], [16, 30], [33, 23]]], 11 | "classes": 80, 12 | }, 13 | "batch_size": 16, 14 | "confidence_threshold": 0.5, 15 | "images_path": "./images/", 16 | "classes_names_path": "../data/coco.names", 17 | "img_h": 416, 18 | "img_w": 416, 19 | "parallels": [0], 20 | "pretrain_snapshot": "../weights/official_yolov3_weights_pytorch.pth", 21 | } 22 | -------------------------------------------------------------------------------- /test/test_fps.py: -------------------------------------------------------------------------------- 1 | # coding='utf-8' 2 | import os 3 | import sys 4 | import numpy as np 5 | import time 6 | import datetime 7 | import json 8 | import importlib 9 | import logging 10 | import shutil 11 | import cv2 12 | import random 13 | 14 | import torch 15 | import torch.nn as nn 16 | 17 | 18 | MY_DIRNAME = os.path.dirname(os.path.abspath(__file__)) 19 | sys.path.insert(0, os.path.join(MY_DIRNAME, '..')) 20 | from nets.model_main import ModelMain 21 | from nets.yolo_loss import YOLOLoss 22 | from common.utils import non_max_suppression, bbox_iou 23 | 24 | 25 | def test(config): 26 | is_training = False 27 | # Load and initialize network 28 | net = ModelMain(config, is_training=is_training) 29 | net.train(is_training) 30 | 31 | # Set data parallel 32 | net = nn.DataParallel(net) 33 | net = net.cuda() 34 | 35 | # Restore pretrain model 36 | if config["pretrain_snapshot"]: 37 | logging.info("load checkpoint from {}".format(config["pretrain_snapshot"])) 38 | state_dict = torch.load(config["pretrain_snapshot"]) 39 | net.load_state_dict(state_dict) 40 | else: 41 | raise Exception("missing pretrain_snapshot!!!") 42 | 43 | # YOLO loss with 3 scales 44 | yolo_losses = [] 45 | for i in range(3): 46 | yolo_losses.append(YOLOLoss(config["yolo"]["anchors"][i], 47 | config["yolo"]["classes"], (config["img_w"], config["img_h"]))) 48 | 49 | # prepare images path 50 | images_name = os.listdir(config["images_path"]) 51 | images_path = [os.path.join(config["images_path"], name) for name in images_name] 52 | if len(images_path) == 0: 53 | raise Exception("no image found in {}".format(config["images_path"])) 54 | 55 | # Start testing FPS of different batch size 56 | for batch_size in range(1, 10): 57 | # preprocess 58 | images = [] 59 | for path in images_path[: batch_size]: 60 | image = cv2.imread(path, cv2.IMREAD_COLOR) 61 | if image is None: 62 | logging.error("read path error: {}. 
skip it.".format(path)) 63 | continue 64 | image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB) 65 | image = cv2.resize(image, (config["img_w"], config["img_h"]), 66 | interpolation=cv2.INTER_LINEAR) 67 | image = image.astype(np.float32) 68 | image /= 255.0 69 | image = np.transpose(image, (2, 0, 1)) 70 | image = image.astype(np.float32) 71 | images.append(image) 72 | for i in range(batch_size-len(images)): 73 | images.append(images[0]) # fill len to batch_sze 74 | images = np.asarray(images) 75 | images = torch.from_numpy(images).cuda() 76 | # inference in 30 times and calculate average 77 | inference_times = [] 78 | for i in range(30): 79 | start_time = time.time() 80 | with torch.no_grad(): 81 | outputs = net(images) 82 | output_list = [] 83 | for i in range(3): 84 | output_list.append(yolo_losses[i](outputs[i])) 85 | output = torch.cat(output_list, 1) 86 | batch_detections = non_max_suppression(output, config["yolo"]["classes"], 87 | conf_thres=config["confidence_threshold"]) 88 | torch.cuda.synchronize() # wait all done. 89 | end_time = time.time() 90 | inference_times.append(end_time - start_time) 91 | inference_time = sum(inference_times) / len(inference_times) / batch_size 92 | fps = 1.0 / inference_time 93 | logging.info("Batch_Size: {}, Inference_Time: {:.5f} s/image, FPS: {}".format(batch_size, 94 | inference_time, 95 | fps)) 96 | 97 | 98 | 99 | def main(): 100 | logging.basicConfig(level=logging.DEBUG, 101 | format="[%(asctime)s %(filename)s] %(message)s") 102 | 103 | if len(sys.argv) != 2: 104 | logging.error("Usage: python test_images.py params.py") 105 | sys.exit() 106 | params_path = sys.argv[1] 107 | if not os.path.isfile(params_path): 108 | logging.error("no params file found! path: {}".format(params_path)) 109 | sys.exit() 110 | config = importlib.import_module(params_path[:-3]).TRAINING_PARAMS 111 | config["batch_size"] *= len(config["parallels"]) 112 | 113 | # Start training 114 | os.environ["CUDA_VISIBLE_DEVICES"] = ','.join(map(str, config["parallels"])) 115 | test(config) 116 | 117 | 118 | if __name__ == "__main__": 119 | main() 120 | -------------------------------------------------------------------------------- /test/test_images.py: -------------------------------------------------------------------------------- 1 | # coding='utf-8' 2 | import os 3 | import sys 4 | import numpy as np 5 | import time 6 | import datetime 7 | import json 8 | import importlib 9 | import logging 10 | import shutil 11 | import cv2 12 | import random 13 | 14 | import matplotlib 15 | matplotlib.use('Agg') 16 | import matplotlib.pyplot as plt 17 | import matplotlib.patches as patches 18 | from matplotlib.ticker import NullLocator 19 | 20 | import torch 21 | import torch.nn as nn 22 | 23 | 24 | MY_DIRNAME = os.path.dirname(os.path.abspath(__file__)) 25 | sys.path.insert(0, os.path.join(MY_DIRNAME, '..')) 26 | from nets.model_main import ModelMain 27 | from nets.yolo_loss import YOLOLoss 28 | from common.utils import non_max_suppression, bbox_iou 29 | 30 | cmap = plt.get_cmap('tab20b') 31 | colors = [cmap(i) for i in np.linspace(0, 1, 20)] 32 | 33 | 34 | def test(config): 35 | is_training = False 36 | # Load and initialize network 37 | net = ModelMain(config, is_training=is_training) 38 | net.train(is_training) 39 | 40 | # Set data parallel 41 | net = nn.DataParallel(net) 42 | net = net.cuda() 43 | 44 | # Restore pretrain model 45 | if config["pretrain_snapshot"]: 46 | logging.info("load checkpoint from {}".format(config["pretrain_snapshot"])) 47 | state_dict = 
torch.load(config["pretrain_snapshot"]) 48 | net.load_state_dict(state_dict) 49 | else: 50 | raise Exception("missing pretrain_snapshot!!!") 51 | 52 | # YOLO loss with 3 scales 53 | yolo_losses = [] 54 | for i in range(3): 55 | yolo_losses.append(YOLOLoss(config["yolo"]["anchors"][i], 56 | config["yolo"]["classes"], (config["img_w"], config["img_h"]))) 57 | 58 | # prepare images path 59 | images_name = os.listdir(config["images_path"]) 60 | images_path = [os.path.join(config["images_path"], name) for name in images_name] 61 | if len(images_path) == 0: 62 | raise Exception("no image found in {}".format(config["images_path"])) 63 | 64 | # Start inference 65 | batch_size = config["batch_size"] 66 | for step in range(0, len(images_path), batch_size): 67 | # preprocess 68 | images = [] 69 | images_origin = [] 70 | for path in images_path[step: step + batch_size]: 71 | logging.info("processing: {}".format(path)) 72 | image = cv2.imread(path, cv2.IMREAD_COLOR) 73 | if image is None: 74 | logging.error("read path error: {}. skip it.".format(path)) 75 | continue 76 | image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB) 77 | images_origin.append(image) # keep the original image for drawing results 78 | image = cv2.resize(image, (config["img_w"], config["img_h"]), 79 | interpolation=cv2.INTER_LINEAR) 80 | image = image.astype(np.float32) 81 | image /= 255.0 82 | image = np.transpose(image, (2, 0, 1)) 83 | image = image.astype(np.float32) 84 | images.append(image) 85 | images = np.asarray(images) 86 | images = torch.from_numpy(images).cuda() 87 | # inference 88 | with torch.no_grad(): 89 | outputs = net(images) 90 | output_list = [] 91 | for i in range(3): 92 | output_list.append(yolo_losses[i](outputs[i])) 93 | output = torch.cat(output_list, 1) 94 | batch_detections = non_max_suppression(output, config["yolo"]["classes"], 95 | conf_thres=config["confidence_threshold"], 96 | nms_thres=0.45) 97 | 98 | # write result images.
Draw bounding boxes and labels of detections 99 | classes = open(config["classes_names_path"], "r").read().split("\n")[:-1] 100 | if not os.path.isdir("./output/"): 101 | os.makedirs("./output/") 102 | for idx, detections in enumerate(batch_detections): 103 | plt.figure() 104 | fig, ax = plt.subplots(1) 105 | ax.imshow(images_origin[idx]) 106 | if detections is not None: 107 | unique_labels = detections[:, -1].cpu().unique() 108 | n_cls_preds = len(unique_labels) 109 | bbox_colors = random.sample(colors, n_cls_preds) 110 | for x1, y1, x2, y2, conf, cls_conf, cls_pred in detections: 111 | color = bbox_colors[int(np.where(unique_labels == int(cls_pred))[0])] 112 | # Rescale coordinates to original dimensions 113 | ori_h, ori_w = images_origin[idx].shape[:2] 114 | pre_h, pre_w = config["img_h"], config["img_w"] 115 | box_h = ((y2 - y1) / pre_h) * ori_h 116 | box_w = ((x2 - x1) / pre_w) * ori_w 117 | y1 = (y1 / pre_h) * ori_h 118 | x1 = (x1 / pre_w) * ori_w 119 | # Create a Rectangle patch 120 | bbox = patches.Rectangle((x1, y1), box_w, box_h, linewidth=2, 121 | edgecolor=color, 122 | facecolor='none') 123 | # Add the bbox to the plot 124 | ax.add_patch(bbox) 125 | # Add label 126 | plt.text(x1, y1, s=classes[int(cls_pred)], color='white', 127 | verticalalignment='top', 128 | bbox={'color': color, 'pad': 0}) 129 | # Save generated image with detections 130 | plt.axis('off') 131 | plt.gca().xaxis.set_major_locator(NullLocator()) 132 | plt.gca().yaxis.set_major_locator(NullLocator()) 133 | plt.savefig('output/{}_{}.jpg'.format(step, idx), bbox_inches='tight', pad_inches=0.0) 134 | plt.close() 135 | logging.info("Saved all results to ./output/") 136 | 137 | 138 | def main(): 139 | logging.basicConfig(level=logging.DEBUG, 140 | format="[%(asctime)s %(filename)s] %(message)s") 141 | 142 | if len(sys.argv) != 2: 143 | logging.error("Usage: python test_images.py params.py") 144 | sys.exit() 145 | params_path = sys.argv[1] 146 | if not os.path.isfile(params_path): 147 | logging.error("no params file found! path: {}".format(params_path)) 148 | sys.exit() 149 | config = importlib.import_module(params_path[:-3]).TRAINING_PARAMS 150 | config["batch_size"] *= len(config["parallels"]) 151 | 152 | # Start inference 153 | os.environ["CUDA_VISIBLE_DEVICES"] = ','.join(map(str, config["parallels"])) 154 | test(config) 155 | 156 | 157 | if __name__ == "__main__": 158 | main() 159 | -------------------------------------------------------------------------------- /training/params.py: -------------------------------------------------------------------------------- 1 | TRAINING_PARAMS = \ 2 | { 3 | "model_params": { 4 | "backbone_name": "shufflenet_2", 5 | "backbone_pretrained": "", # set empty to disable 6 | }, 7 | "yolo": { 8 | "anchors": [[[116, 90], [156, 198], [373, 326]], 9 | [[30, 61], [62, 45], [59, 119]], 10 | [[10, 13], [16, 30], [33, 23]]], 11 | "classes": 80, 12 | }, 13 | "lr": { 14 | "backbone_lr": 0.001, 15 | "other_lr": 0.01, 16 | "freeze_backbone": False, # freeze backbone weights to finetune 17 | "decay_gamma": 0.1, 18 | "decay_step": 20, # decay lr once every decay_step
epochs 19 | }, 20 | "optimizer": { 21 | "type": "sgd", 22 | "weight_decay": 4e-05, 23 | }, 24 | "batch_size": 8, 25 | "train_path": "../data/coco/trainvalno5k.txt", 26 | "epochs": 2, 27 | "img_h": 416, 28 | "img_w": 416, 29 | "parallels": [0,1,2,3], # config GPU device 30 | "working_dir": "YOUR_WORKING_DIR", # replace with your working dir 31 | "pretrain_snapshot": "", # load checkpoint 32 | "evaluate_type": "", 33 | "try": 1, 34 | "export_onnx": False, 35 | } 36 | -------------------------------------------------------------------------------- /training/training.py: -------------------------------------------------------------------------------- 1 | # coding='utf-8' 2 | import os 3 | import sys 4 | import numpy as np 5 | import time 6 | import datetime 7 | import json 8 | import importlib 9 | import logging 10 | import shutil 11 | 12 | import torch 13 | import torch.nn as nn 14 | import torch.optim as optim 15 | import torch.nn.functional as F 16 | 17 | from tensorboardX import SummaryWriter 18 | 19 | MY_DIRNAME = os.path.dirname(os.path.abspath(__file__)) 20 | sys.path.insert(0, os.path.join(MY_DIRNAME, '..')) 21 | # sys.path.insert(0, os.path.join(MY_DIRNAME, '..', 'evaluate')) 22 | from nets.model_main import ModelMain 23 | from nets.yolo_loss import YOLOLoss 24 | from common.coco_dataset import COCODataset 25 | 26 | 27 | def train(config): 28 | config["global_step"] = config.get("start_step", 0) 29 | is_training = False if config.get("export_onnx") else True 30 | 31 | # Load and initialize network 32 | net = ModelMain(config, is_training=is_training) 33 | net.train(is_training) 34 | 35 | # Optimizer and learning rate 36 | optimizer = _get_optimizer(config, net) 37 | lr_scheduler = optim.lr_scheduler.StepLR( 38 | optimizer, 39 | step_size=config["lr"]["decay_step"], 40 | gamma=config["lr"]["decay_gamma"]) 41 | 42 | # Set data parallel 43 | net = nn.DataParallel(net) 44 | net = net.cuda() 45 | 46 | # Restore pretrain model 47 | if config["pretrain_snapshot"]: 48 | logging.info("Load pretrained weights from {}".format(config["pretrain_snapshot"])) 49 | state_dict = torch.load(config["pretrain_snapshot"]) 50 | net.load_state_dict(state_dict) 51 | 52 | # Only export onnx 53 | # if config.get("export_onnx"): 54 | # real_model = net.module 55 | # real_model.eval() 56 | # dummy_input = torch.randn(8, 3, config["img_h"], config["img_w"]).cuda() 57 | # save_path = os.path.join(config["sub_working_dir"], "pytorch.onnx") 58 | # logging.info("Exporting onnx to {}".format(save_path)) 59 | # torch.onnx.export(real_model, dummy_input, save_path, verbose=False) 60 | # logging.info("Done. 
Exiting now.") 61 | # sys.exit() 62 | 63 | # Evaluate interface 64 | # if config["evaluate_type"]: 65 | # logging.info("Using {} to evaluate model.".format(config["evaluate_type"])) 66 | # evaluate_func = importlib.import_module(config["evaluate_type"]).run_eval 67 | # config["online_net"] = net 68 | 69 | # YOLO loss with 3 scales 70 | yolo_losses = [] 71 | for i in range(3): 72 | yolo_losses.append(YOLOLoss(config["yolo"]["anchors"][i], 73 | config["yolo"]["classes"], (config["img_w"], config["img_h"]))) 74 | 75 | # DataLoader 76 | dataloader = torch.utils.data.DataLoader(COCODataset(config["train_path"], 77 | (config["img_w"], config["img_h"]), 78 | is_training=True), 79 | batch_size=config["batch_size"], 80 | shuffle=True, num_workers=32, pin_memory=True) 81 | 82 | # Start the training loop 83 | logging.info("Start training.") 84 | for epoch in range(config["epochs"]): 85 | for step, samples in enumerate(dataloader): 86 | images, labels = samples["image"], samples["label"] 87 | start_time = time.time() 88 | config["global_step"] += 1 89 | 90 | # Forward and backward 91 | optimizer.zero_grad() 92 | outputs = net(images) 93 | losses_name = ["total_loss", "x", "y", "w", "h", "conf", "cls"] 94 | losses = [] 95 | for _ in range(len(losses_name)): 96 | losses.append([]) 97 | for i in range(3): 98 | _loss_item = yolo_losses[i](outputs[i], labels) 99 | for j, l in enumerate(_loss_item): 100 | losses[j].append(l) 101 | losses = [sum(l) for l in losses] 102 | loss = losses[0] 103 | loss.backward() 104 | optimizer.step() 105 | 106 | if step > 0 and step % 10 == 0: 107 | _loss = loss.item() 108 | duration = float(time.time() - start_time) 109 | example_per_second = config["batch_size"] / duration 110 | lr = optimizer.param_groups[0]['lr'] 111 | logging.info( 112 | "epoch [%.3d] iter = %d loss = %.2f example/sec = %.3f lr = %.5f "% 113 | (epoch, step, _loss, example_per_second, lr) 114 | ) 115 | config["tensorboard_writer"].add_scalar("lr", 116 | lr, 117 | config["global_step"]) 118 | config["tensorboard_writer"].add_scalar("example/sec", 119 | example_per_second, 120 | config["global_step"]) 121 | for i, name in enumerate(losses_name): 122 | value = _loss if i == 0 else losses[i] 123 | config["tensorboard_writer"].add_scalar(name, 124 | value, 125 | config["global_step"]) 126 | 127 | if step > 0 and step % 1000 == 0: 128 | # net.train(False) 129 | _save_checkpoint(net.state_dict(), config) 130 | # net.train(True) 131 | 132 | lr_scheduler.step() 133 | 134 | # net.train(False) 135 | _save_checkpoint(net.state_dict(), config) 136 | # net.train(True) 137 | logging.info("Bye~") 138 | 139 | # best_eval_result = 0.0 140 | def _save_checkpoint(state_dict, config, evaluate_func=None): 141 | # global best_eval_result 142 | checkpoint_path = os.path.join(config["sub_working_dir"], "model.pth") 143 | torch.save(state_dict, checkpoint_path) 144 | logging.info("Model checkpoint saved to %s" % checkpoint_path) 145 | # eval_result = evaluate_func(config) 146 | # if eval_result > best_eval_result: 147 | # best_eval_result = eval_result 148 | # logging.info("New best result: {}".format(best_eval_result)) 149 | # best_checkpoint_path = os.path.join(config["sub_working_dir"], 'model_best.pth') 150 | # shutil.copyfile(checkpoint_path, best_checkpoint_path) 151 | # logging.info("Best checkpoint saved to {}".format(best_checkpoint_path)) 152 | # else: 153 | # logging.info("Best result: {}".format(best_eval_result)) 154 | 155 | 156 | def _get_optimizer(config, net): 157 | optimizer = None 158 | 159 | # Assign different lr 
for each layer 160 | params = None 161 | base_params = list( 162 | map(id, net.backbone.parameters()) 163 | ) 164 | logits_params = filter(lambda p: id(p) not in base_params, net.parameters()) 165 | 166 | if not config["lr"]["freeze_backbone"]: 167 | params = [ 168 | {"params": logits_params, "lr": config["lr"]["other_lr"]}, 169 | {"params": net.backbone.parameters(), "lr": config["lr"]["backbone_lr"]}, 170 | ] 171 | else: 172 | logging.info("freeze backbone's parameters.") 173 | for p in net.backbone.parameters(): 174 | p.requires_grad = False 175 | params = [ 176 | {"params": logits_params, "lr": config["lr"]["other_lr"]}, 177 | ] 178 | 179 | # Initialize optimizer class 180 | if config["optimizer"]["type"] == "adam": 181 | optimizer = optim.Adam(params, weight_decay=config["optimizer"]["weight_decay"]) 182 | elif config["optimizer"]["type"] == "amsgrad": 183 | optimizer = optim.Adam(params, weight_decay=config["optimizer"]["weight_decay"], 184 | amsgrad=True) 185 | elif config["optimizer"]["type"] == "rmsprop": 186 | optimizer = optim.RMSprop(params, weight_decay=config["optimizer"]["weight_decay"]) 187 | else: 188 | # Default to sgd 189 | logging.info("Using SGD optimizer.") 190 | optimizer = optim.SGD(params, momentum=0.9, 191 | weight_decay=config["optimizer"]["weight_decay"], 192 | nesterov=(config["optimizer"]["type"] == "nesterov")) 193 | 194 | return optimizer 195 | 196 | def main(): 197 | logging.basicConfig(level=logging.DEBUG, 198 | format="[%(asctime)s %(filename)s] %(message)s") 199 | 200 | if len(sys.argv) != 2: 201 | logging.error("Usage: python training.py params.py") 202 | sys.exit() 203 | params_path = sys.argv[1] 204 | if not os.path.isfile(params_path): 205 | logging.error("no params file found! path: {}".format(params_path)) 206 | sys.exit() 207 | config = importlib.import_module(params_path[:-3]).TRAINING_PARAMS 208 | config["batch_size"] *= len(config["parallels"]) 209 | 210 | # Create sub_working_dir 211 | sub_working_dir = '{}/{}/size{}x{}_try{}/{}'.format( 212 | config['working_dir'], config['model_params']['backbone_name'], 213 | config['img_w'], config['img_h'], config['try'], 214 | time.strftime("%Y%m%d%H%M%S", time.localtime())) 215 | if not os.path.exists(sub_working_dir): 216 | os.makedirs(sub_working_dir) 217 | config["sub_working_dir"] = sub_working_dir 218 | logging.info("sub working dir: %s" % sub_working_dir) 219 | 220 | # Create tensorboard summary writer 221 | config["tensorboard_writer"] = SummaryWriter(sub_working_dir) 222 | logging.info("Please use 'python -m tensorboard.main --logdir={}' to monitor training".format(sub_working_dir)) 223 | 224 | # Start training 225 | os.environ["CUDA_VISIBLE_DEVICES"] = ','.join(map(str, config["parallels"])) 226 | train(config) 227 | 228 | if __name__ == "__main__": 229 | main() 230 | -------------------------------------------------------------------------------- /weights/.DS_Store: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ZhuYun97/ShuffleNetv2-YOLOv3/6d8e5ccff90519f307cb3112513f50ca044662f3/weights/.DS_Store -------------------------------------------------------------------------------- /weights/README.md: -------------------------------------------------------------------------------- 1 | ### 0. Overview 2 | All of these weights are in PyTorch format and work with this project. 3 | 4 | ### 1.
YOLO v3 weights based on the darknet_53 backbone (mAP=59.66%) 5 | * Name: yolov3_weights_pytorch.pth 6 | * Download: [Google Drive](https://drive.google.com/open?id=1Bm_CLv9hP3mMQ5cyerKRjvt7_t1duvjI) or [Baidu Drive](https://pan.baidu.com/s/1gx-XRUE1NTfIMKkQ1L0awQ) 7 | 8 | ### 2. Backbone weights 9 | * This is a pretrained backbone model. Use it to train on your own dataset. 10 | * Name: darknet53_weights_pytorch.pth 11 | * Download: [Google Drive](https://drive.google.com/open?id=1VYwHUznM3jLD7ftmOSCHnpkVpBJcFIOA) or [Baidu Drive](https://pan.baidu.com/s/1axXjz6ct9Rn9GtDTust6DA) 12 | 13 | ### 3. Official weights 14 | * Name: official_yolov3_weights_pytorch.pth 15 | * Download: [Google Drive](https://drive.google.com/file/d/1SnFAlSvsx37J7MDNs3WWLgeKY0iknikP/view?usp=sharing) or [Baidu Drive](https://pan.baidu.com/s/1YCcRLPWPNhsQfn5f8bs_0g) 16 | --------------------------------------------------------------------------------
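For reference, here is a minimal sketch of restoring one of these PyTorch checkpoints, following the loading pattern used in test/test_images.py and training/training.py. The trimmed-down config mirrors test/params.py, and the checkpoint path is a placeholder (for example a model.pth saved by training/training.py, or one of the downloads above if its backbone matches the config); it is an illustration, not part of the repo.

```
import torch
import torch.nn as nn

from nets.model_main import ModelMain

# Illustrative config, trimmed down from test/params.py to the keys the model needs.
config = {
    "model_params": {"backbone_name": "shufflenet_2", "backbone_pretrained": ""},
    "yolo": {"anchors": [[[116, 90], [156, 198], [373, 326]],
                         [[30, 61], [62, 45], [59, 119]],
                         [[10, 13], [16, 30], [33, 23]]],
             "classes": 80},
    "img_h": 416, "img_w": 416,
}

net = ModelMain(config, is_training=False)
net.train(False)  # inference mode, as in the test scripts

# Checkpoints saved by training/training.py come from a DataParallel-wrapped model,
# so wrap the network the same way before loading the state dict.
net = nn.DataParallel(net).cuda()

checkpoint_path = "../weights/model.pth"  # placeholder: your own checkpoint or a download above
state_dict = torch.load(checkpoint_path)
net.load_state_dict(state_dict)
```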