├── CIoU.png
├── LICENSE
├── README.md
├── config
│   ├── __init__.py
│   ├── __pycache__
│   │   ├── __init__.cpython-35.pyc
│   │   ├── __init__.cpython-36.pyc
│   │   ├── config.cpython-35.pyc
│   │   └── config.cpython-36.pyc
│   └── config.py
├── data
│   ├── CRACK.py
│   ├── VOC.py
│   ├── __init__.py
│   ├── __pycache__
│   │   ├── CRACK.cpython-36.pyc
│   │   ├── VOC.cpython-36.pyc
│   │   └── __init__.cpython-36.pyc
│   └── utils
│       ├── __init__.py
│       ├── __pycache__
│       │   ├── __init__.cpython-35.pyc
│       │   ├── __init__.cpython-36.pyc
│       │   ├── augmentations.cpython-35.pyc
│       │   └── augmentations.cpython-36.pyc
│       └── augmentations.py
├── model
│   ├── __init__.py
│   ├── __pycache__
│   │   ├── __init__.cpython-35.pyc
│   │   ├── __init__.cpython-36.pyc
│   │   ├── build_ssd.cpython-35.pyc
│   │   └── build_ssd.cpython-36.pyc
│   ├── backbone
│   │   ├── __init__.py
│   │   ├── __pycache__
│   │   │   ├── __init__.cpython-35.pyc
│   │   │   ├── __init__.cpython-36.pyc
│   │   │   ├── build_backbone.cpython-35.pyc
│   │   │   └── build_backbone.cpython-36.pyc
│   │   └── build_backbone.py
│   ├── build_ssd.py
│   ├── head
│   │   ├── __init__.py
│   │   ├── __pycache__
│   │   │   ├── __init__.cpython-35.pyc
│   │   │   ├── __init__.cpython-36.pyc
│   │   │   ├── build_head.cpython-35.pyc
│   │   │   └── build_head.cpython-36.pyc
│   │   └── build_head.py
│   ├── neck
│   │   ├── __init__.py
│   │   ├── __pycache__
│   │   │   ├── __init__.cpython-35.pyc
│   │   │   ├── __init__.cpython-36.pyc
│   │   │   ├── build_neck.cpython-35.pyc
│   │   │   ├── build_neck.cpython-36.pyc
│   │   │   ├── ssd_neck.cpython-35.pyc
│   │   │   └── ssd_neck.cpython-36.pyc
│   │   ├── build_neck.py
│   │   └── ssd_neck.py
│   └── utils
│       ├── __init__.py
│       ├── __pycache__
│       │   ├── __init__.cpython-35.pyc
│       │   ├── __init__.cpython-36.pyc
│       │   ├── conv_module.cpython-35.pyc
│       │   ├── conv_module.cpython-36.pyc
│       │   ├── norm.cpython-35.pyc
│       │   ├── norm.cpython-36.pyc
│       │   ├── weight_init.cpython-35.pyc
│       │   └── weight_init.cpython-36.pyc
│       ├── conv_module.py
│       ├── norm.py
│       └── weight_init.py
├── tools
│   ├── ap.py
│   ├── eval.py
│   ├── test.py
│   └── train.py
├── utils
│   ├── __init__.py
│   ├── __pycache__
│   │   ├── __init__.cpython-35.pyc
│   │   ├── __init__.cpython-35.sublime-workspace
│   │   └── __init__.cpython-36.pyc
│   ├── box
│   │   ├── __init__.py
│   │   ├── __pycache__
│   │   │   ├── __init__.cpython-35.pyc
│   │   │   ├── __init__.cpython-36.pyc
│   │   │   ├── box_utils.cpython-35.pyc
│   │   │   ├── box_utils.cpython-36.pyc
│   │   │   ├── prior_box.cpython-35.pyc
│   │   │   └── prior_box.cpython-36.pyc
│   │   ├── box_utils.py
│   │   └── prior_box.py
│   ├── detection
│   │   ├── __init__.py
│   │   ├── __pycache__
│   │   │   ├── __init__.cpython-35.pyc
│   │   │   ├── __init__.cpython-36.pyc
│   │   │   ├── detection.cpython-35.pyc
│   │   │   └── detection.cpython-36.pyc
│   │   └── detection.py
│   └── loss
│       ├── __init__.py
│       ├── __pycache__
│       │   ├── __init__.cpython-35.pyc
│       │   ├── __init__.cpython-36.pyc
│       │   ├── multibox_loss.cpython-35.pyc
│       │   └── multibox_loss.cpython-36.pyc
│       └── multibox_loss.py
└── work_dir
    └── DIoU-NMS.txt
/CIoU.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Zzh-tju/DIoU-SSD-pytorch/cec038bc1057f0cd532752413b24924fde427f09/CIoU.png
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 |
2 |
3 | ## Complete-IoU Loss and Cluster-NMS for improving Object Detection and Instance Segmentation.
4 |
5 | This is the code for our papers:
6 | - [Distance-IoU Loss: Faster and Better Learning for Bounding Box Regression](https://arxiv.org/abs/1911.08287)
7 | - [Enhancing Geometric Factors into Model Learning and Inference for Object Detection and Instance Segmentation](https://arxiv.org/abs/2005.03572)
8 |
9 | ```
10 | @Inproceedings{zheng2020diou,
11 | author = {Zheng, Zhaohui and Wang, Ping and Liu, Wei and Li, Jinze and Ye, Rongguang and Ren, Dongwei},
12 | title = {Distance-IoU Loss: Faster and Better Learning for Bounding Box Regression},
13 | booktitle = {The AAAI Conference on Artificial Intelligence (AAAI)},
14 | year = {2020},
15 | }
16 |
17 | @Article{zheng2021ciou,
18 | author = {Zheng, Zhaohui and Wang, Ping and Ren, Dongwei and Liu, Wei and Ye, Rongguang and Hu, Qinghua and Zuo, Wangmeng},
19 | title = {Enhancing Geometric Factors in Model Learning and Inference for Object Detection and Instance Segmentation},
 20 |   journal = {IEEE Transactions on Cybernetics},
21 | year = {2021},
22 | }
23 | ```
24 |
25 | ## SSD_FPN_DIoU,CIoU in PyTorch
 26 | The code references [SSD: Single Shot MultiBox Object Detector, in PyTorch](https://github.com/amdegroot/ssd.pytorch), [mmdet](https://github.com/open-mmlab/mmdetection) and [**JavierHuang**](https://github.com/JaryHuang). Currently, the experiments are carried out on the VOC dataset; if you want to train on your own dataset, see the links above for details.
27 |
28 | ### Losses
29 |
 30 | Losses can be chosen with the `losstype` option in the `config/config.py` file. The valid options are currently: `[Iou|Giou|Diou|Ciou|SmoothL1]`.
31 |
32 | ```
33 | VOC:
34 | 'losstype': 'Ciou'
35 | ```
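
For reference, DIoU adds a normalized center-distance penalty to the IoU term, and CIoU further adds an aspect-ratio consistency term on top of it. Below is a minimal PyTorch sketch of the DIoU loss for corner-form boxes; it illustrates the formula from the first paper and is independent of the repository's implementation in `utils/loss/multibox_loss.py`:

```
import torch

def diou_loss(pred, target, eps=1e-7):
    # pred/target: [N, 4] matched boxes in (x1, y1, x2, y2) form
    lt = torch.max(pred[:, :2], target[:, :2])
    rb = torch.min(pred[:, 2:], target[:, 2:])
    wh = (rb - lt).clamp(min=0)
    inter = wh[:, 0] * wh[:, 1]
    area_p = (pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
    area_t = (target[:, 2] - target[:, 0]) * (target[:, 3] - target[:, 1])
    iou = inter / (area_p + area_t - inter + eps)
    # squared distance between box centers
    rho2 = (((pred[:, :2] + pred[:, 2:]) - (target[:, :2] + target[:, 2:])) ** 2).sum(dim=1) / 4
    # squared diagonal of the smallest enclosing box
    enc_lt = torch.min(pred[:, :2], target[:, :2])
    enc_rb = torch.max(pred[:, 2:], target[:, 2:])
    c2 = ((enc_rb - enc_lt) ** 2).sum(dim=1) + eps
    return 1 - iou + rho2 / c2  # L_DIoU = 1 - IoU + rho^2 / c^2
```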
36 |
 37 | ## Folder Structure
 38 | The folder structure is as follows:
39 | - config/
40 | - config.py
41 | - __init__.py
42 | - data/
43 | - __init__.py
44 | - VOC.py
45 | - VOCdevkit/
46 | - model/
47 | - build_ssd.py
48 | - __init__.py
49 | - backbone/
50 | - neck/
51 | - head/
52 | - utils/
53 | - utils/
54 | - box/
55 | - detection/
56 | - loss/
57 | - __init__.py
58 | - tools/
59 | - train.py
60 | - eval.py
61 | - test.py
62 | - work_dir/
63 |
64 |
65 | ## Environment
66 | - pytorch 0.4.1
67 | - python3+
68 | - visdom
69 | - for real-time loss visualization during training!
70 | ```Shell
71 | pip install visdom
72 | ```
73 | - Start the server (probably in a screen or tmux)
74 | ```Shell
 75 | python -m visdom.server
76 | ```
77 | * Then (during training) navigate to http://localhost:8097/ (see the Train section below for training details).
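
As a reference, a minimal sketch of pushing a loss curve to the running Visdom server (the window title and the stand-in loss values are illustrative, not taken from `tools/train.py`):

```
import numpy as np
import visdom

viz = visdom.Visdom(port=8097)  # assumes the server started above is running
win = viz.line(X=np.array([0]), Y=np.array([1.0]), opts=dict(title='train loss'))
for it, loss in enumerate([0.9, 0.7, 0.6], start=1):  # stand-ins for real loss values
    viz.line(X=np.array([it]), Y=np.array([loss]), win=win, update='append')
```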
78 |
79 |
80 | ## Datasets
 81 | - PASCAL VOC: Download the VOC2007 and VOC2012 datasets, then put `VOCdevkit` in the `data` directory
82 |
83 |
84 | ## Training
85 |
86 | ### Training VOC
 87 | - The pretrained backbone weights come from [pretrained-models.pytorch](https://github.com/Cadene/pretrained-models.pytorch); you can download them there.
88 |
 89 | - In the DIoU-SSD-pytorch folder:
90 | ```Shell
91 | python tools/train.py
92 | ```
93 |
94 | - Note:
 95 |   * Training uses an NVIDIA GPU by default.
 96 |   * You can set the parameters in train.py (see `tools/train.py` for options).
 97 |   * In the config, you can set `work_dir` to choose where training weights are saved (see `config/config.py`).
98 |
99 | ## Evaluation
100 | - To evaluate a trained network:
101 |
102 | ```Shell
103 | python tools/ap.py --trained_model {your_weight_address}
104 | ```
105 |
106 | For example (the last line of the output gives AP50, AP75 and AP of our CIoU loss):
107 | ```
108 | Results:
109 | 0.033
110 | 0.015
111 | 0.009
112 | 0.011
113 | 0.008
114 | 0.083
115 | 0.044
116 | 0.042
117 | 0.004
118 | 0.014
119 | 0.026
120 | 0.034
121 | 0.010
122 | 0.006
123 | 0.009
124 | 0.006
125 | 0.009
126 | 0.013
127 | 0.106
128 | 0.011
129 | 0.025
130 | ~~~~~~~~
131 |
132 | --------------------------------------------------------------
133 | Results computed with the **unofficial** Python eval code.
134 | Results should be very close to the official MATLAB eval code.
135 | --------------------------------------------------------------
136 | 0.7884902583981603 0.5615516772893671 0.5143832356646468
137 | ```
138 |
139 | ## Test
140 | - To test a trained network:
141 |
142 | ```Shell
143 | python tools/test.py --trained_model {your_weight_address}
144 | ```
145 | If you want to visualize the boxes, add the flag `--visbox True` (default: `False`).
146 |
147 | ## Performance
148 |
149 | #### VOC2007 Test mAP
150 | - Backbone is ResNet50-FPN:
151 |
152 | | Loss | AP | AP75 |
153 | |:-:|:-:|:-:|
154 | |IoU|51.0|54.7|
155 | |GIoU|51.1|55.4|
156 | |DIoU|51.3|55.7|
157 | |CIoU|51.5|56.4|
158 | |CIoU 16|53.3|58.2|
159 |
160 | ##### "16" means bbox regression weight is set to 16.
161 | ## Cluster-NMS
162 |
163 | See `Detect` function of [utils/detection/detection.py](utils/detection/detection.py) for our Cluster-NMS implementation.
164 |
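At its core, Cluster-NMS is a fixed-point iteration on the upper-triangular IoU matrix of the score-sorted boxes. A condensed PyTorch sketch of that loop (it uses `torchvision.ops.box_iou` for brevity and omits the DIoU and weighted variants that `Detect` adds):

```
import torch
from torchvision.ops import box_iou

def cluster_nms(boxes, scores, iou_threshold=0.5):
    # sort boxes by descending score
    scores, idx = scores.sort(descending=True)
    boxes = boxes[idx]
    # upper-triangular IoU matrix: entry (i, j) compares box i with lower-scored box j
    iou = box_iou(boxes, boxes).triu_(diagonal=1)
    C = iou
    for _ in range(boxes.size(0)):
        prev = C
        keep = (C.max(dim=0).values < iou_threshold).float()
        C = iou * keep.unsqueeze(1)  # suppressed boxes may no longer suppress others
        if prev.equal(C):  # converged: identical result to sequential NMS
            break
    return idx[C.max(dim=0).values < iou_threshold]
```
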
165 | Currently, NMS only supports `cluster_nms`, `cluster_diounms`, `cluster_weighted_nms` and `cluster_weighted_diounms`. (See `'nms_kind'` in [config/config.py](config/config.py))
166 |
167 | #### Hardware
168 | - 1 RTX 2080 Ti
169 | - Intel(R) Xeon(R) CPU E5-2678 v3 @ 2.50GHz
170 |
171 | | Backbone | Loss | Regression weight | NMS | FPS | time (ms) | box AP | box AP75 |
172 | |:-------------:|:-------:|:-------:|:------------------------------------:|:----:|:----:|:----:|:----:|
173 | | Resnet50-FPN | CIoU | 5 | Fast NMS |**28.8**|**34.7**| 50.7 | 56.2 |
174 | | Resnet50-FPN | CIoU | 5 | Original NMS | 17.8 | 56.1 | 51.5 | 56.4 |
175 | | Resnet50-FPN | CIoU | 5 | DIoU-NMS | 11.4 | 87.6 | 51.9 | 56.6 |
176 | | Resnet50-FPN | CIoU | 5 | Cluster-NMS | 28.0 | 35.7 | 51.5 | 56.4 |
177 | | Resnet50-FPN | CIoU | 5 | Cluster-DIoU-NMS | 27.7 | 36.1 | 51.9 | 56.6 |
178 | | Resnet50-FPN | CIoU | 5 | Weighted Cluster-NMS | 26.8 | 37.3 | 51.9 | 56.3 |
179 | | Resnet50-FPN | CIoU | 5 | Weighted + Cluster-DIoU-NMS | 26.5 | 37.8 |**52.4**|**57.0**|
180 |
181 | #### Hardware
182 | - 1 RTX 2080
183 | - Intel(R) Core(TM) i7-9700K CPU @ 3.60GHz
184 |
185 | | Backbone | Loss | Regression weight | NMS | FPS | time (ms) | box AP | box AP75 |
186 | |:-------------:|:-------:|:-------:|:------------------------------------:|:----:|:----:|:----:|:----:|
187 | | Resnet50-FPN | CIoU | 16 | Original NMS | 19.7 | 50.9 | 53.3 | 58.2 |
188 | | Resnet50-FPN | CIoU | 16 | Cluster-NMS | 28.0 | 35.7 | 53.4 | 58.2 |
189 | | Resnet50-FPN | CIoU | 16 | Cluster-DIoU-NMS | 26.5 | 37.7 | 53.7 | 58.6 |
190 | | Resnet50-FPN | CIoU | 16 | Weighted Cluster-NMS | 26.9 | 37.2 | 53.8 | 58.7 |
191 | | Resnet50-FPN | CIoU | 16 | Weighted + Cluster-DIoU-NMS | 26.3 | 38.0 |**54.1**|**59.0**|
192 | #### Note:
193 | - Here the weighted average of box coordinates is only performed for box pairs with `IoU > 0.8` (a rough sketch follows this list). We found that using `IoU > NMS thresh` is not good for SSD, and `IoU > 0.9` gives almost the same result as plain `Cluster-NMS`. (Refer to [CAD](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8265304) for the details of Weighted-NMS.)
194 |
195 | - We further incorporate DIoU into Weighted Cluster-NMS for SSD, which yields higher AP.
196 |
197 | - Note that Torchvision NMS is the fastest, owing to its CUDA implementation and engineering accelerations (such as computing only the upper-triangular IoU matrix). However, our Cluster-NMS requires fewer iterations and can also be further accelerated by adopting such engineering tricks.
198 |
199 | - Currently, Torchvision NMS uses IoU as its criterion, not DIoU. If we directly replace IoU with DIoU in Original NMS, it costs much more time due to the sequential operation, whereas Cluster-DIoU-NMS significantly speeds up DIoU-NMS while producing exactly the same result.
200 |
201 | - Torchvision NMS is available in Torchvision>=0.3, while our Cluster-NMS can be applied to any project that uses a lower version of Torchvision, and to other deep learning frameworks, as long as they support matrix operations. **No other import, no need to compile, fewer iterations, fully GPU-accelerated and better performance.**
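
As a rough sketch of the weighted step from the first note above (assuming `iou_triu` is the upper-triangular IoU matrix of the score-sorted boxes and `keep` is the boolean mask produced by the cluster loop; this is a simplification, not the repository's exact `Detect` code):

```
import torch

def weighted_coords(boxes, scores, iou_triu, keep, overlap=0.8):
    # replace every box by the score-weighted mean of the boxes that
    # overlap it by more than `overlap`, itself included
    n = boxes.size(0)
    member = ((iou_triu + iou_triu.t()) > overlap).float() + torch.eye(n, device=boxes.device)
    weights = member * scores.unsqueeze(0)
    boxes = weights.mm(boxes) / weights.sum(dim=1, keepdim=True)
    return boxes[keep]
```
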
202 | ## Pretrained weights
203 |
204 | Here are the trained models using the configurations in this repository.
205 |
206 | - [IoU bbox regression weight 5](https://pan.baidu.com/s/1eNcD9CrnRL79VIH5lsOTPA)
207 | - [GIoU bbox regression weight 5](https://pan.baidu.com/s/1_b1RS5qaRVJUwi27mcpXow)
208 | - [DIoU bbox regression weight 5](https://pan.baidu.com/s/1x1keVP958-DyN_OuWdDAXA)
209 | - [CIoU bbox regression weight 5](https://share.weiyun.com/5LSzur7)
210 | - [CIoU bbox regression weight 16](https://share.weiyun.com/5U3OHez)
211 |
--------------------------------------------------------------------------------
/config/__init__.py:
--------------------------------------------------------------------------------
1 | from .config import voc,crack,coco,trafic
--------------------------------------------------------------------------------
/config/__pycache__/__init__.cpython-35.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Zzh-tju/DIoU-SSD-pytorch/cec038bc1057f0cd532752413b24924fde427f09/config/__pycache__/__init__.cpython-35.pyc
--------------------------------------------------------------------------------
/config/__pycache__/__init__.cpython-36.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Zzh-tju/DIoU-SSD-pytorch/cec038bc1057f0cd532752413b24924fde427f09/config/__pycache__/__init__.cpython-36.pyc
--------------------------------------------------------------------------------
/config/__pycache__/config.cpython-35.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Zzh-tju/DIoU-SSD-pytorch/cec038bc1057f0cd532752413b24924fde427f09/config/__pycache__/config.cpython-35.pyc
--------------------------------------------------------------------------------
/config/__pycache__/config.cpython-36.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Zzh-tju/DIoU-SSD-pytorch/cec038bc1057f0cd532752413b24924fde427f09/config/__pycache__/config.cpython-36.pyc
--------------------------------------------------------------------------------
/config/config.py:
--------------------------------------------------------------------------------
1 | # config.py
2 | import os.path
3 |
4 | # gets home dir cross platform
5 | #HOME = os.path.expanduser("~")
6 | #HOME = os.path.abspath(os.path.dirname(__file__)).split("/") this path
7 | HOME = os.path.join(os.getcwd()) #../path
8 | # for making bounding boxes pretty
9 | COLORS = ((255, 0, 0, 128), (0, 255, 0, 128), (0, 0, 255, 128),
10 | (0, 255, 255, 128), (255, 0, 255, 128), (255, 255, 0, 128))
11 |
12 | #crack (104, 117, 123)
13 |
14 | # SSD300 CONFIGS
15 |
16 |
17 | voc= {
18 | 'model':"resnet50",
19 | 'losstype':'Ciou',
20 | 'num_classes':21,
21 | 'mean':(123.675, 116.28, 103.53),
22 | 'std':(1.0,1.0,1.0),#(58.395, 57.12, 57.375),
23 | 'lr_steps': (80000, 100000,120000),
24 | 'max_iter': 120000,
25 | 'max_epoch':80,
26 | 'feature_maps': [38, 19, 10, 5, 3, 1],
27 | 'min_dim': 300,
28 | 'backbone_out':[512,1024,2048,512,256,256],
29 | 'neck_out':[256,256,256,256,256,256],
30 | 'steps':[8, 16, 32, 64, 100, 300],
31 | 'min_sizes': [30, 60, 111, 162, 213, 264],
32 | 'max_sizes': [60, 111, 162, 213, 264, 315],
33 | 'aspect_ratios': [[2], [2, 3], [2, 3], [2, 3], [2], [2]],
34 | 'variance': [0.1, 0.2],
35 | 'clip': True,
36 |     'nms_kind': "cluster_weighted_diounms", #currently NMS only supports 'cluster_nms', 'cluster_diounms', 'cluster_weighted_nms', 'cluster_weighted_diounms'
37 | 'beta1':0.5,
38 | 'name': 'VOC',
39 | 'work_name':"SSD300_VOC_FPN_GIOU",
40 | }
41 |
42 |
43 | crack = {
44 | 'model':"resnet50",
45 | 'num_classes': 2,
46 | 'mean':(127.5, 127.5, 127.5),
47 | 'std':(1.0, 1.0, 1.0),
48 | 'lr_steps': (25000, 35000, 45000),
49 | 'max_iter': 50000,
50 | 'max_epoch':2000,
51 | 'feature_maps': [38, 19, 10, 5, 3, 1],
52 | 'min_dim': 300,
53 | 'backbone_out':[512,1024,2048,512,256,256],
54 | 'neck_out':[256,256,256,256,256,256],
55 | 'steps': [8, 16, 32, 64, 100, 300],
56 | 'min_sizes': [21, 45, 99, 153, 207, 261],
57 | 'max_sizes': [45, 99, 153, 207, 261, 315],
58 | 'aspect_ratios': [[2], [2, 3], [2, 3], [2, 3], [2], [2]],#[[1/0.49], [1/0.16,1/0.09], [1/0.16,1/0.09], [1/0.16,1/0.09], [1/0.09], [1/0.09]],
59 | 'variance': [0.1, 0.2],
60 | 'clip': True,
61 | 'name': 'CRACK',
62 | 'work_name':"SSD300_CRACK_FPN_GIOU",
63 | }
64 |
65 | coco = {
66 | 'num_classes': 201,
67 | 'lr_steps': (280000, 360000, 400000),
68 | 'max_iter': 400000,
69 | 'max_epoch':80,
70 | 'feature_maps': [38, 19, 10, 5, 3, 1],
71 | 'min_dim': 300,
72 | 'steps': [8, 16, 32, 64, 100, 300],
73 | 'min_sizes': [21, 45, 99, 153, 207, 261],
74 | 'max_sizes': [45, 99, 153, 207, 261, 315],
75 | 'aspect_ratios': [[2], [2, 3], [2, 3], [2, 3], [2], [2]],
76 | 'variance': [0.1, 0.2],
77 | 'clip': True,
78 | 'name': 'COCO',
79 | }
80 |
81 |
82 |
83 | trafic = {
84 | 'num_classes': 21,
85 | 'lr_steps': (80000, 100000, 120000),
86 |     'max_iter': 120000,
87 | 'max_epoch':2000,
88 | 'feature_maps': [50, 25, 13, 7, 5, 3],
89 | 'min_dim': 800,
90 | 'steps': [16, 32, 64, 100, 300, 600],
91 | 'min_sizes': [16, 32, 64, 128, 256, 512],
92 | 'max_sizes': [32, 64, 128, 256, 512, 630],
93 | 'aspect_ratios': [[1], [1,1/2], [1/2,1], [1/2,1], [1], [1]],
94 | 'variance': [0.1, 0.2],
95 | 'clip': True,
96 | 'name': 'TRAFIC',
97 | }
98 |
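
# Illustrative sanity check (not part of the original file): the prior-box
# settings above yield the standard 8732 SSD300 priors, since each cell of
# feature map k gets 2 square boxes plus an (ar, 1/ar) pair per listed ratio.
_fmaps = [38, 19, 10, 5, 3, 1]
_ars = [[2], [2, 3], [2, 3], [2, 3], [2], [2]]
assert sum(f * f * (2 + 2 * len(a)) for f, a in zip(_fmaps, _ars)) == 8732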
--------------------------------------------------------------------------------
/data/CRACK.py:
--------------------------------------------------------------------------------
1 | """VOC Dataset Classes
2 |
3 | Original author: Francisco Massa
4 | https://github.com/fmassa/vision/blob/voc_dataset/torchvision/datasets/voc.py
5 |
6 | Updated by: Ellis Brown, Max deGroot
7 | """
8 | import os
9 | import sys
10 | import cv2
11 | import numpy as np
12 | if sys.version_info[0] == 2:
13 | import xml.etree.cElementTree as ET
14 | else:
15 | import xml.etree.ElementTree as ET
16 |
17 | from .VOC import VOCDetection,VOCAnnotationTransform
18 |
19 |
20 |
21 |
22 | CRACK_CLASSES = ( # always index 0
23 | 'neg',)
24 |
25 | HOME = os.path.join(os.getcwd())
26 | CRACK_ROOT = os.path.join(HOME, "data/CrackData/")
27 |
28 | class CRACKDetection(VOCDetection):
29 |     """CRACK Detection Dataset Object
30 |
31 | input is image, target is annotation
32 |
33 | Arguments:
34 | root (string): filepath to VOCdevkit folder.
35 | image_set (string): imageset to use (eg. 'train', 'val', 'test')
36 | transform (callable, optional): transformation to perform on the
37 | input image
38 | target_transform (callable, optional): transformation to perform on the
39 | target `annotation`
40 | (eg: take in caption string, return tensor of word indices)
41 | dataset_name (string, optional): which dataset to load
42 |             (default: 'CRACK')
43 | """
44 |
45 | def __init__(self, root = CRACK_ROOT,
46 | image_sets= 'trainval.txt',transform=None,
47 | bbox_transform = VOCAnnotationTransform(class_to_ind = CRACK_CLASSES),
48 | dataset_name = 'CRACK'):
49 | self.root = root
50 | self.transform = transform
51 | self.bbox = bbox_transform
52 | self.name = dataset_name
53 | self._annopath = os.path.join('%s', 'Annotations', '%s.xml')
54 | self._imgpath = os.path.join('%s', 'JPEGImages', '%s.jpg')
55 | self.ids = list()
56 | rootpath = os.path.join(self.root, 'crack/')
57 | for line in open(os.path.join(rootpath, 'ImageSets', 'Main',image_sets)):
58 | self.ids.append((rootpath, line.strip()))
59 | #self.ids = self.ids[0:20]
60 |
61 |
62 | def mix_up(self,fir_index):
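        '''Mix up this sample with a randomly chosen second image: blend the
        pixels with a Beta(1.5, 1.5) weight and merge both annotation lists,
        unless one image dominates the blend (lam > 0.9 or lam < 0.1).'''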
63 | fir_id = self.ids[fir_index]
64 | sec_index = np.random.randint(0,len(self.ids))
65 | sec_id = self.ids[sec_index]
66 | first_img = cv2.imread(self._imgpath % fir_id)
67 | second_img = cv2.imread(self._imgpath % sec_id)
68 |         if first_img.shape != second_img.shape:
69 |             raise Exception("The image shapes do not match: the first img is {}, shape = {}, \
70 |                 the second is {}, shape = {}".format(fir_index,str(first_img.shape),sec_index,str(second_img.shape)))
71 |
72 | else:
73 | height,width,channels = first_img.shape
74 |
75 | lam = np.random.beta(1.5, 1.5)
76 | res = cv2.addWeighted(first_img, lam, second_img, 1-lam,0)
77 |
78 | first_target = ET.parse(self._annopath % fir_id).getroot()
79 | second_target = ET.parse(self._annopath % sec_id).getroot()
80 |
81 | target = []
82 |         if lam <= 0.9 and lam >= 0.1:
83 |             target = self.bbox(first_target, width, height)
84 |             target+= self.bbox(second_target, width, height)
85 |         elif lam>0.9:
86 |             target = self.bbox(first_target, width, height)
87 |         else:
88 |             target = self.bbox(second_target, width, height)
89 | return res,target,height,width
90 |
91 |
92 |
93 |
94 |
95 |
96 |
--------------------------------------------------------------------------------
/data/VOC.py:
--------------------------------------------------------------------------------
1 | """VOC Dataset Classes
2 | Original author: Francisco Massa
3 | https://github.com/fmassa/vision/blob/voc_dataset/torchvision/datasets/voc.py
4 | Updated by: Ellis Brown, Max deGroot
5 | """
6 | import os
7 | import os.path as osp
8 | import sys
9 | import torch
10 | import torch.utils.data as data
11 | import cv2
12 | import numpy as np
13 | if sys.version_info[0] == 2:
14 | import xml.etree.cElementTree as ET
15 | else:
16 | import xml.etree.ElementTree as ET
17 |
18 |
19 | VOC_CLASSES = ( # always index 0
20 | 'aeroplane', 'bicycle', 'bird', 'boat',
21 | 'bottle', 'bus', 'car', 'cat', 'chair',
22 | 'cow', 'diningtable', 'dog', 'horse',
23 | 'motorbike', 'person', 'pottedplant',
24 | 'sheep', 'sofa', 'train', 'tvmonitor')
25 |
26 | # note: if you used our download scripts, this should be right
27 | HOME = osp.join(os.getcwd())
28 | VOC_ROOT = osp.join(HOME, "data/VOCdevkit/")
29 |
30 | class VOCAnnotationTransform(object):
31 | """Transforms a VOC annotation into a Tensor of bbox coords and label index
32 | Initilized with a dictionary lookup of classnames to indexes
33 |
34 | Arguments:
35 |         class_to_ind (sequence): class names, mapped to indices in the
36 |             order given (e.g. VOC_CLASSES for alphabetic VOC indexing)
37 | keep_difficult (bool, optional): keep difficult instances or not
38 | (default: False)
39 | height (int): height
40 | width (int): width
41 | """
42 |
43 | def __init__(self, class_to_ind=None, keep_difficult=False):
44 | self.class_to_ind = dict(
45 | zip(class_to_ind, range(len(class_to_ind))))
46 |
47 | self.keep_difficult = keep_difficult
48 |
49 | def __call__(self, target, width, height):
50 | """
51 | Arguments:
52 | target (annotation) : the target annotation to be made usable
53 | will be an ET.Element
54 | Returns:
55 | a list containing lists of bounding boxes [bbox coords, class name]
56 | """
57 | res = []
58 | for obj in target.iter('object'):
59 | difficult = int(obj.find('difficult').text) == 1
60 | if not self.keep_difficult and difficult:
61 | continue
62 | name = obj.find('name').text.lower().strip()
63 | bbox = obj.find('bndbox')
64 |
65 | pts = ['xmin', 'ymin', 'xmax', 'ymax']
66 | bndbox = []
67 | for i, pt in enumerate(pts):
68 | cur_pt = int(bbox.find(pt).text) - 1
69 | # scale height or width:x/width,y/height
70 | cur_pt = cur_pt / width if i % 2 == 0 else cur_pt / height
71 | bndbox.append(cur_pt)
72 | label_idx = self.class_to_ind[name]
73 | bndbox.append(label_idx)
74 | res += [bndbox] # [xmin, ymin, xmax, ymax, label_ind]
75 | #img_id = target.find('filename').text[:-4]
76 |
77 | return res # [[xmin, ymin, xmax, ymax, label_ind], ... ]
78 |
79 |
80 | class VOCDetection(data.Dataset):
81 | """VOC Detection Dataset Object
82 | input is image, target is annotation
83 | Arguments:
84 | root (string): filepath to VOCdevkit folder.
85 | image_set (string): imageset to use (eg. 'train', 'val', 'test')
86 | transform (callable, optional): transformation to perform on the
87 | input image
88 | bbox_transform (callable, optional): transformation to perform on the
89 | target `annotation`
90 | (eg: take in caption string, return tensor of word indices)
91 | dataset_name (string, optional): which dataset to load
92 | (default: 'VOC2007')
93 | """
94 | def __init__(self, root,
95 | image_sets=[('2007', 'trainval'), ('2012', 'trainval')],
96 | transform=None, bbox_transform=VOCAnnotationTransform(class_to_ind = VOC_CLASSES),
97 | dataset_name='VOC0712'):
98 | self.root = root
99 | self.image_set = image_sets
100 | self.transform = transform
101 | self.bbox = bbox_transform
102 | self.name = dataset_name
103 | self._annopath = osp.join('%s', 'Annotations', '%s.xml')
104 | self._imgpath = osp.join('%s', 'JPEGImages', '%s.jpg')
105 | self.ids = list()
106 | for (year, name) in image_sets:
107 | rootpath = osp.join(self.root, 'VOC' + year)
108 | for line in open(osp.join(rootpath, 'ImageSets', 'Main', name + '.txt')):
109 | self.ids.append((rootpath, line.strip()))
110 | #self.ids = self.ids[0:400]
111 | def __getitem__(self, index):
112 | im, gt, h, w = self.pull_item(index)
113 |
114 | return im, gt
115 |
116 | def __len__(self):
117 | return len(self.ids)
118 |
119 | def pull_item(self, index):
120 | img_id = self.ids[index]
121 |
122 | target = ET.parse(self._annopath % img_id).getroot()
123 | img = cv2.imread(self._imgpath % img_id)
124 | height, width, channels = img.shape
125 |
126 | if self.bbox is not None:
127 | target = self.bbox(target, width, height)
128 |
129 | if self.transform is not None:
130 | target = np.array(target)
131 | img, boxes, labels = self.transform(img, target[:, :4], target[:, 4])
132 | # to rgb
133 | img = img[:, :, (2, 1, 0)]
134 | # img = img.transpose(2, 0, 1)
135 | target = np.hstack((boxes, np.expand_dims(labels, axis=1)))
136 | return torch.from_numpy(img).permute(2, 0, 1), target, height, width
137 |
138 | def pull_image(self, index):
139 | '''Returns the original image object at index in PIL form
140 |
141 | Note: not using self.__getitem__(), as any transformations passed in
142 | could mess up this functionality.
143 |
144 | Argument:
145 | index (int): index of img to show
146 | Return:
147 | PIL img
148 | '''
149 | img_id = self.ids[index]
150 | image = cv2.imread(self._imgpath % img_id, cv2.IMREAD_COLOR)
151 | rgb_image = cv2.cvtColor(image,cv2.COLOR_BGR2RGB)
153 | return rgb_image
154 |
155 | def pull_anno(self, index):
156 | '''Returns the original annotation of image at index
157 |
158 | Note: not using self.__getitem__(), as any transformations passed in
159 | could mess up this functionality.
160 |
161 | Argument:
162 | index (int): index of img to get annotation of
163 | Return:
164 | list: [img_id, [(label, bbox coords),...]]
165 | eg: ('001718', [('dog', (96, 13, 438, 332))])
166 | '''
167 | img_id = self.ids[index]
168 | anno = ET.parse(self._annopath % img_id).getroot()
169 | gt = self.bbox(anno, 1, 1)
170 | return img_id[1], gt
171 |
172 | def pull_tensor(self, index):
173 | '''Returns the original image at an index in tensor form
174 |
175 | Note: not using self.__getitem__(), as any transformations passed in
176 | could mess up this functionality.
177 |
178 | Argument:
179 | index (int): index of img to show
180 | Return:
181 | tensorized version of img, squeezed
182 | '''
183 | return torch.Tensor(self.pull_image(index)).unsqueeze_(0)
184 |
185 |
186 | def detection_collate(batch):
187 | """Custom collate fn for dealing with batches of images that have a different
188 | number of associated object annotations (bounding boxes).
189 |
190 | Arguments:
191 | batch: (tuple) A tuple of tensor images and lists of annotations
192 |
193 | Return:
194 | A tuple containing:
195 | 1) (tensor) batch of images stacked on their 0 dim
196 | 2) (list of tensors) annotations for a given image are stacked on
197 | 0 dim
198 | """
199 | targets = []
200 | imgs = []
201 | for sample in batch:
202 | imgs.append(sample[0])
203 | targets.append(torch.FloatTensor(sample[1]))
204 | return torch.stack(imgs, 0), targets
205 |
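# Usage sketch (illustrative): detection_collate lets a DataLoader batch
# images whose annotations contain different numbers of boxes, e.g.
#   loader = torch.utils.data.DataLoader(
#       VOCDetection(VOC_ROOT, transform=SSDAugmentation()),
#       batch_size=32, shuffle=True, collate_fn=detection_collate)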
--------------------------------------------------------------------------------
/data/__init__.py:
--------------------------------------------------------------------------------
1 | from .VOC import VOC_ROOT, VOC_CLASSES, VOCAnnotationTransform, VOCDetection
2 | from .VOC import detection_collate
3 | from .CRACK import CRACK_ROOT, CRACK_CLASSES, CRACKDetection
4 | from .utils import SSDAugmentation,BaseTransform
5 |
--------------------------------------------------------------------------------
/data/__pycache__/CRACK.cpython-36.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Zzh-tju/DIoU-SSD-pytorch/cec038bc1057f0cd532752413b24924fde427f09/data/__pycache__/CRACK.cpython-36.pyc
--------------------------------------------------------------------------------
/data/__pycache__/VOC.cpython-36.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Zzh-tju/DIoU-SSD-pytorch/cec038bc1057f0cd532752413b24924fde427f09/data/__pycache__/VOC.cpython-36.pyc
--------------------------------------------------------------------------------
/data/__pycache__/__init__.cpython-36.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Zzh-tju/DIoU-SSD-pytorch/cec038bc1057f0cd532752413b24924fde427f09/data/__pycache__/__init__.cpython-36.pyc
--------------------------------------------------------------------------------
/data/utils/__init__.py:
--------------------------------------------------------------------------------
1 | from .augmentations import SSDAugmentation, BaseTransform
2 |
--------------------------------------------------------------------------------
/data/utils/__pycache__/__init__.cpython-35.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Zzh-tju/DIoU-SSD-pytorch/cec038bc1057f0cd532752413b24924fde427f09/data/utils/__pycache__/__init__.cpython-35.pyc
--------------------------------------------------------------------------------
/data/utils/__pycache__/__init__.cpython-36.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Zzh-tju/DIoU-SSD-pytorch/cec038bc1057f0cd532752413b24924fde427f09/data/utils/__pycache__/__init__.cpython-36.pyc
--------------------------------------------------------------------------------
/data/utils/__pycache__/augmentations.cpython-35.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Zzh-tju/DIoU-SSD-pytorch/cec038bc1057f0cd532752413b24924fde427f09/data/utils/__pycache__/augmentations.cpython-35.pyc
--------------------------------------------------------------------------------
/data/utils/__pycache__/augmentations.cpython-36.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Zzh-tju/DIoU-SSD-pytorch/cec038bc1057f0cd532752413b24924fde427f09/data/utils/__pycache__/augmentations.cpython-36.pyc
--------------------------------------------------------------------------------
/data/utils/augmentations.py:
--------------------------------------------------------------------------------
1 | import torch
2 | from torchvision import transforms
3 | import cv2
4 | import numpy as np
5 | import types
6 | from numpy import random
7 |
8 |
9 | def intersect(box_a, box_b):
10 | '''
11 |     calculate the intersection areas of boxes
12 | args:
13 | box_a = [boxs_num,4]
14 | box_b = [4]
15 |
16 |     return inter_area = [boxs_num]
17 | '''
18 | max_xy = np.minimum(box_a[:, 2:], box_b[2:])
19 | min_xy = np.maximum(box_a[:, :2], box_b[:2])
20 | inter = np.clip((max_xy - min_xy), a_min=0, a_max=np.inf)
21 | return inter[:, 0] * inter[:, 1]
22 |
23 |
24 | def jaccard_numpy(box_a, box_b):
25 | """Compute the jaccard overlap of two sets of boxes. The jaccard overlap
26 | is simply the intersection over union of two boxes.
27 | E.g.:
28 | A ∩ B / A ∪ B = A ∩ B / (area(A) + area(B) - A ∩ B)
29 | Args:
30 | box_a: Multiple bounding boxes, Shape: [num_boxes,4]
31 | box_b: Single bounding box, Shape: [4]
32 | Return:
33 |         jaccard overlap: Shape: [box_a.shape[0]]
34 | """
35 | inter = intersect(box_a, box_b)
36 | area_a = ((box_a[:, 2]-box_a[:, 0]) *
37 | (box_a[:, 3]-box_a[:, 1])) # [A,B]
38 | area_b = ((box_b[2]-box_b[0]) *
39 | (box_b[3]-box_b[1])) # [A,B]
40 | union = area_a + area_b - inter
41 | return inter / union # [A,B]
42 |
43 |
44 | class Compose(object):
45 | """
46 | Composes several augmentations together.
47 | Args:
48 | transforms (List[Transform]): list of transforms to compose.
49 | Example:
50 | augmentations.Compose([
51 | transforms.CenterCrop(10),
52 | transforms.ToTensor(),
53 | ])
54 | """
55 |
56 | def __init__(self, transforms):
57 | self.transforms = transforms
58 |
59 | def __call__(self, img, boxes=None, labels=None):
60 | for t in self.transforms:
61 | img, boxes, labels = t(img, boxes, labels)
62 | return img, boxes, labels
63 |
64 |
65 | class Lambda(object):
66 | """Applies a lambda as a transform."""
67 |
68 | def __init__(self, lambd):
69 | assert isinstance(lambd, types.LambdaType)
70 | self.lambd = lambd
71 |
72 | def __call__(self, img, boxes=None, labels=None):
73 | return self.lambd(img, boxes, labels)
74 |
75 |
76 | class ConvertFromInts(object):
77 | '''
78 |     Convert the image from ints to float32
79 | '''
80 | def __call__(self, image, boxes=None, labels=None):
81 | return image.astype(np.float32), boxes, labels
82 |
83 |
84 | class SubtractMeans(object):
85 | '''
86 | Sub the image means
87 | '''
88 | def __init__(self, mean):
89 | self.mean = np.array(mean, dtype=np.float32)
90 |
91 | def __call__(self, image, boxes=None, labels=None):
92 | image = image.astype(np.float32)
93 | image -= self.mean
94 | return image.astype(np.float32), boxes, labels
95 |
96 |
97 | class Standform(object):
98 | '''
99 |     standardize the image: (image - mean) / std
100 | '''
101 | def __init__(self,mean,std):
102 | self.means = np.array(mean,dtype = np.float32)
103 | self.std = np.array(std,dtype = np.float32)
104 | def __call__(self, image, boxes=None, labels=None):
105 | image = image.astype(np.float32)
106 | return (image - self.means)/self.std,boxes,labels
107 |
108 |
109 | class ToAbsoluteCoords(object):
110 | '''
111 | make the boxes to Absolute Coords
112 | '''
113 | def __call__(self, image, boxes=None, labels=None):
114 | height, width, channels = image.shape
115 | boxes[:, 0] *= width
116 | boxes[:, 2] *= width
117 | boxes[:, 1] *= height
118 | boxes[:, 3] *= height
119 |
120 | return image, boxes, labels
121 |
122 |
123 | class ToPercentCoords(object):
124 | '''
125 | make the boxes to Percent Coords
126 | '''
127 | def __call__(self, image, boxes=None, labels=None):
128 | height, width, channels = image.shape
129 | boxes[:, 0] /= width
130 | boxes[:, 2] /= width
131 | boxes[:, 1] /= height
132 | boxes[:, 3] /= height
133 |
134 | return image, boxes, labels
135 |
136 |
137 | class Resize(object):
138 | '''
139 | resize the image
140 | args:
141 | size = (size,size)
142 | '''
143 | def __init__(self, size=300):
144 | if isinstance(size,int):
145 | self.size = (size,size)
146 | elif isinstance(size,tuple):
147 | self.size = size
148 | else:
149 |             raise Exception("size must be an int or a tuple")
150 |
151 | def __call__(self, image, boxes=None, labels=None):
152 | image = cv2.resize(image, self.size)
153 | return image, boxes, labels
154 |
155 |
156 | class RandomSaturation(object):
157 | '''
158 | Random to change the Saturation(HSV):0.0~1.0
159 | assert: this image is HSV
160 | args:
161 | lower,upper is the parameter to random the saturation
162 | '''
163 | def __init__(self, lower=0.5, upper=1.5):
164 | self.lower = lower
165 | self.upper = upper
166 |         assert self.upper >= self.lower, "saturation upper must be >= lower."
167 |         assert self.lower >= 0, "saturation lower must be non-negative."
168 |
169 | def __call__(self, image, boxes=None, labels=None):
170 | if random.randint(2):
171 | image[:, :, 1] *= random.uniform(self.lower, self.upper)
172 |
173 | return image, boxes, labels
174 |
175 |
176 | class RandomHue(object):
177 | '''
178 | Random to change the Hue(HSV):0~360
179 | assert: this image is HSV
180 | args:
181 | delta is the parameters to random change the hue.
182 |
183 | '''
184 | def __init__(self, delta=18.0):
185 | assert delta >= 0.0 and delta <= 360.0
186 | self.delta = delta
187 |
188 | def __call__(self, image, boxes=None, labels=None):
189 | if random.randint(2):
190 | image[:, :, 0] += random.uniform(-self.delta, self.delta)
191 | image[:, :, 0][image[:, :, 0] > 360.0] -= 360.0
192 | image[:, :, 0][image[:, :, 0] < 0.0] += 360.0
193 | return image, boxes, labels
194 |
195 |
196 | class RandomLightingNoise(object):
197 | def __init__(self):
198 | self.perms = ((0, 1, 2), (0, 2, 1),
199 | (1, 0, 2), (1, 2, 0),
200 | (2, 0, 1), (2, 1, 0))
201 |
202 | def __call__(self, image, boxes=None, labels=None):
203 | if random.randint(2):
204 | swap = self.perms[random.randint(len(self.perms))]
205 | shuffle = SwapChannels(swap) # shuffle channels
206 | image = shuffle(image)
207 | return image, boxes, labels
208 |
209 |
210 | class ConvertColor(object):
211 | '''
212 | change the image from HSV to BGR or from BGR to HSV color
213 | args:
214 | current
215 | transform
216 | '''
217 | def __init__(self, current='RGB', transform='HSV'):
218 | self.transform = transform
219 | self.current = current
220 |
221 | def __call__(self, image, boxes=None, labels=None):
222 | if self.current == 'RGB' and self.transform == 'HSV':
223 | image = cv2.cvtColor(image, cv2.COLOR_RGB2HSV)
224 | elif self.current == 'HSV' and self.transform == 'RGB':
225 | image = cv2.cvtColor(image, cv2.COLOR_HSV2RGB)
226 | else:
227 | raise NotImplementedError
228 | return image, boxes, labels
229 |
230 |
231 | class RandomContrast(object):
232 | '''
233 | Random to improve the image contrast:g(i,j) = alpha*f(i,j)
234 | '''
235 | def __init__(self, lower=0.5, upper=1.5):
236 | self.lower = lower
237 | self.upper = upper
238 | assert self.upper >= self.lower, "contrast upper must be >= lower."
239 | assert self.lower >= 0, "contrast lower must be non-negative."
240 |
241 | # expects float image
242 | def __call__(self, image, boxes=None, labels=None):
243 | if random.randint(2):
244 | alpha = random.uniform(self.lower, self.upper)
245 | image *= alpha
246 | return image, boxes, labels
247 |
248 |
249 | class RandomBrightness(object):
250 | '''
251 | Random to improve the image bright:g(i,j) = f(i,j) + beta
252 | '''
253 | def __init__(self, delta=32):
254 | assert delta >= 0.0
255 | assert delta <= 255.0
256 | self.delta = delta
257 |
258 | def __call__(self, image, boxes=None, labels=None):
259 | if random.randint(2):
260 | delta = random.uniform(-self.delta, self.delta)
261 | image += delta
262 | return image, boxes, labels
263 |
264 |
265 | class ToCV2Image(object):
266 | '''
267 |     convert a tensor image (c,h,w) to a cv2 ndarray (h,w,c)
268 | '''
269 | def __call__(self, tensor, boxes=None, labels=None):
270 | return tensor.cpu().numpy().astype(np.float32).transpose((1, 2, 0)), boxes, labels
271 |
272 |
273 | class ToTensor(object):
274 | '''
275 |     convert a cv2 image (h,w,c) to a tensor (c,h,w)
276 | '''
277 |
278 | def __call__(self, cvimage, boxes=None, labels=None):
279 | return torch.from_numpy(cvimage.astype(np.float32)).permute(2, 0, 1), boxes, labels
280 |
281 |
282 | class RandomSampleCrop(object):
283 | """Crop
284 | Arguments:
285 | img (Image): the image being input during training
286 | boxes (Tensor): the original bounding boxes in pt form
287 | labels (Tensor): the class labels for each bbox
288 | mode (float tuple): the min and max jaccard overlaps
289 | Return:
290 | (img, boxes, classes)
291 | img (Image): the cropped image
292 | boxes (Tensor): the adjusted bounding boxes in pt form
293 | labels (Tensor): the class labels for each bbox
294 | """
295 | def __init__(self):
296 | self.sample_options = (
297 | # using entire original input image
298 | None,
299 | # sample a patch s.t. MIN jaccard w/ obj in .1,.3,.4,.7,.9
300 | (0.1, None),
301 | (0.3, None),
302 | (0.7, None),
303 | (0.9, None),
304 | # randomly sample a patch
305 | (None, None),
306 | )
307 |
308 | def __call__(self, image, boxes=None, labels=None):
309 | height, width, _ = image.shape
310 | while True:
311 | # randomly choose a mode
312 | mode = random.choice(self.sample_options)
313 | if mode is None:
314 | return image, boxes, labels
315 |
316 | min_iou, max_iou = mode
317 | if min_iou is None:
318 | min_iou = float('-inf')
319 | if max_iou is None:
320 | max_iou = float('inf')
321 |
322 |             # max trials (50)
323 | for _ in range(50):
324 | current_image = image
325 |
326 | w = random.uniform(0.3 * width, width)
327 | h = random.uniform(0.3 * height, height)
328 |
329 | # aspect ratio constraint b/t .5 & 2
330 | if h / w < 0.5 or h / w > 2:
331 | continue
332 |
333 | left = random.uniform(width - w)
334 | top = random.uniform(height - h)
335 |
336 | # convert to integer rect x1,y1,x2,y2
337 | rect = np.array([int(left), int(top), int(left+w), int(top+h)])
338 |
339 | # calculate IoU (jaccard overlap) b/t the cropped and gt boxes
340 | overlap = jaccard_numpy(boxes, rect)
341 |
342 | # is min and max overlap constraint satisfied? if not try again
343 | if overlap.min() < min_iou and max_iou < overlap.max():
344 | continue
345 |
346 | # cut the crop from the image
347 | current_image = current_image[rect[1]:rect[3], rect[0]:rect[2],
348 | :]
349 |
350 | # keep overlap with gt box IF center in sampled patch
351 | #calcute the center in the boxes
352 | centers = (boxes[:, :2] + boxes[:, 2:]) / 2.0
353 |
354 | # mask in all gt boxes that above and to the left of centers
355 | m1 = (rect[0] < centers[:, 0]) * (rect[1] < centers[:, 1])
356 |
357 | # mask in all gt boxes that under and to the right of centers
358 | m2 = (rect[2] > centers[:, 0]) * (rect[3] > centers[:, 1])
359 |
360 | # mask in that both m1 and m2 are true
361 | #select the valid box that center in the rect
362 | mask = m1 * m2
363 |
364 | # have any valid boxes? try again if not
365 | if not mask.any():
366 | continue
367 |
368 | # take only matching gt boxes
369 | current_boxes = boxes[mask, :].copy()
370 |
371 | # take only matching gt labels
372 | current_labels = labels[mask]
373 |
374 | # should we use the box left and top corner or the crop's
375 | current_boxes[:, :2] = np.maximum(current_boxes[:, :2],
376 | rect[:2])
377 | # adjust to crop (by substracting crop's left,top)
378 | current_boxes[:, :2] -= rect[:2]
379 |
380 | current_boxes[:, 2:] = np.minimum(current_boxes[:, 2:],
381 | rect[2:])
382 | # adjust to crop (by substracting crop's left,top)
383 | current_boxes[:, 2:] -= rect[:2]
384 |
385 | return current_image, current_boxes, current_labels
386 |
387 |
388 | class Expand(object):
389 | '''
390 |     randomly place the image on a larger mean-filled canvas (probability 0.5, scale drawn from [1, 4])
391 | '''
392 | def __init__(self, mean):
393 | self.mean = mean
394 |
395 | def __call__(self, image, boxes, labels):
396 | if random.randint(2):
397 | return image, boxes, labels
398 |
399 | height, width, depth = image.shape
400 | ratio = random.uniform(1, 4)
401 | # random to make the left and top
402 | left = random.uniform(0, width*ratio - width)
403 | top = random.uniform(0, height*ratio - height)
404 |
405 | expand_image = np.zeros(
406 | (int(height*ratio), int(width*ratio), depth),
407 | dtype=image.dtype)
408 | expand_image[:, :, :] = self.mean
409 | #put the image to the expand image
410 | expand_image[int(top):int(top + height),
411 | int(left):int(left + width)] = image
412 | image = expand_image
413 |
414 | boxes = boxes.copy()
415 | #match the box left and top
416 | boxes[:, :2] += (int(left), int(top))
417 | boxes[:, 2:] += (int(left), int(top))
418 |
419 | return image, boxes, labels
420 |
421 | class RandomMirror(object):
422 |     '''
423 |     horizontal flip with probability 0.5
424 |     '''
425 | def __call__(self, image, boxes, classes):
426 | _, width, _ = image.shape
427 | if random.randint(2):
428 | image = image[:, ::-1]
429 | boxes = boxes.copy()
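            # columns [2, 0] reversed: new xmin = width - old xmax,
            # new xmax = width - old xmin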
430 | boxes[:, 0::2] = width - boxes[:, 2::-2]
431 | return image, boxes, classes
432 |
433 |
434 | class SwapChannels(object):
435 | """Transforms a tensorized image by swapping the channels in the order
436 | specified in the swap tuple.
437 | Args:
438 | swaps (int triple): final order of channels
439 | eg: (2, 1, 0)
440 | """
441 |
442 | def __init__(self, swaps):
443 | self.swaps = swaps
444 |
445 | def __call__(self, image):
446 | """
447 | Args:
448 | image (Tensor): image tensor to be transformed
449 | Return:
450 | a tensor with channels swapped according to swap
451 | """
452 | # if torch.is_tensor(image):
453 | # image = image.data.cpu().numpy()
454 | # else:
455 | # image = np.array(image)
456 | image = image[:, :, self.swaps]
457 | return image
458 |
459 |
460 | class PhotometricDistort(object):
461 | def __init__(self):
462 | self.pd = [
463 | RandomContrast(),
464 | ConvertColor(transform='HSV'),
465 | RandomSaturation(),
466 | RandomHue(),
467 | ConvertColor(current='HSV', transform='RGB'),
468 | RandomContrast()
469 | ]
470 | self.rand_brightness = RandomBrightness()
471 | self.rand_light_noise = RandomLightingNoise()
472 |
473 | def __call__(self, image, boxes, labels):
474 | im = image.copy()
475 | im, boxes, labels = self.rand_brightness(im, boxes, labels)
476 | if random.randint(2):
477 | distort = Compose(self.pd[:-1])
478 | else:
479 | distort = Compose(self.pd[1:])
480 | im, boxes, labels = distort(im, boxes, labels)
481 | return self.rand_light_noise(im, boxes, labels)
482 |
483 |
484 | class SSDAugmentation(object):
485 | def __init__(self, size=300, mean=(104, 117, 123),std =(104, 117, 123)):
486 | self.mean = mean
487 | self.std = std
488 | self.size = size
489 | self.augment = Compose([
490 | ConvertFromInts(),
491 | ToAbsoluteCoords(),
492 | PhotometricDistort(),
493 | Expand(self.mean),
494 | RandomSampleCrop(),
495 | RandomMirror(),
496 | ToPercentCoords(),
497 | Resize(self.size),
498 | Standform(self.mean,self.std)
499 | #SubtractMeans(self.mean)
500 | ])
501 |
502 | def __call__(self, img, boxes, labels):
503 | return self.augment(img, boxes, labels)
504 |
505 | def base_transform(image, size, mean):
506 |     x = cv2.resize(image, (size, size)).astype(np.float32)
507 |     x -= mean
508 |     x = x.astype(np.float32)
509 |     return x
511 |
512 |
513 | class BaseTransform:
514 | def __init__(self, size, mean,std):
515 | self.mean = mean
516 | self.std = std
517 | self.size = size
518 | self.augment = Compose([
519 | ConvertFromInts(),
520 | Resize(self.size),
521 | Standform(self.mean,self.std)
522 |
523 | ])
524 |
525 | def __call__(self, image, boxes=None, labels=None):
526 | return self.augment(image, boxes, labels)
527 |
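# Usage sketch (illustrative): SSDAugmentation consumes a cv2 BGR image
# together with percent-form corner boxes and integer labels, as produced
# by the datasets in data/:
#   aug = SSDAugmentation(size=300, mean=(123.675, 116.28, 103.53), std=(1.0, 1.0, 1.0))
#   image, boxes, labels = aug(image, boxes, labels)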
--------------------------------------------------------------------------------
/model/__init__.py:
--------------------------------------------------------------------------------
1 | from .build_ssd import build_ssd
2 |
--------------------------------------------------------------------------------
/model/__pycache__/__init__.cpython-35.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Zzh-tju/DIoU-SSD-pytorch/cec038bc1057f0cd532752413b24924fde427f09/model/__pycache__/__init__.cpython-35.pyc
--------------------------------------------------------------------------------
/model/__pycache__/__init__.cpython-36.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Zzh-tju/DIoU-SSD-pytorch/cec038bc1057f0cd532752413b24924fde427f09/model/__pycache__/__init__.cpython-36.pyc
--------------------------------------------------------------------------------
/model/__pycache__/build_ssd.cpython-35.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Zzh-tju/DIoU-SSD-pytorch/cec038bc1057f0cd532752413b24924fde427f09/model/__pycache__/build_ssd.cpython-35.pyc
--------------------------------------------------------------------------------
/model/__pycache__/build_ssd.cpython-36.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Zzh-tju/DIoU-SSD-pytorch/cec038bc1057f0cd532752413b24924fde427f09/model/__pycache__/build_ssd.cpython-36.pyc
--------------------------------------------------------------------------------
/model/backbone/__init__.py:
--------------------------------------------------------------------------------
1 | from .build_backbone import Backbone
2 |
--------------------------------------------------------------------------------
/model/backbone/__pycache__/__init__.cpython-35.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Zzh-tju/DIoU-SSD-pytorch/cec038bc1057f0cd532752413b24924fde427f09/model/backbone/__pycache__/__init__.cpython-35.pyc
--------------------------------------------------------------------------------
/model/backbone/__pycache__/__init__.cpython-36.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Zzh-tju/DIoU-SSD-pytorch/cec038bc1057f0cd532752413b24924fde427f09/model/backbone/__pycache__/__init__.cpython-36.pyc
--------------------------------------------------------------------------------
/model/backbone/__pycache__/build_backbone.cpython-35.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Zzh-tju/DIoU-SSD-pytorch/cec038bc1057f0cd532752413b24924fde427f09/model/backbone/__pycache__/build_backbone.cpython-35.pyc
--------------------------------------------------------------------------------
/model/backbone/__pycache__/build_backbone.cpython-36.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Zzh-tju/DIoU-SSD-pytorch/cec038bc1057f0cd532752413b24924fde427f09/model/backbone/__pycache__/build_backbone.cpython-36.pyc
--------------------------------------------------------------------------------
/model/backbone/build_backbone.py:
--------------------------------------------------------------------------------
1 | import pretrainedmodels
2 | import torch.nn as nn
3 | from torchsummary import summary
4 | from ..utils import ConvModule
5 |
6 |
7 |
8 |
9 |
10 | class Backbone(nn.Module):
11 | def __init__(self, model_name, feature_map):
12 | super(Backbone,self).__init__()
13 | self.normalize = {'type':'BN'}
14 | lay,channal = self.get_pretrainedmodel(model_name)
15 | self.model = self.add_extras(lay, channal)
16 | self.model_length = len(self.model)
17 | self.feature_map = feature_map
18 |
19 |
20 |
21 | def get_pretrainedmodel(self,model_name,pretrained = 'imagenet'):#'imagenet'
22 | '''
23 |         get the pretrained model layers
24 | args:
25 | model_name
26 |             pretrained: None or 'imagenet'
27 | '''
28 | model = pretrainedmodels.__dict__[model_name](num_classes = 1000,pretrained = pretrained)
29 | #get the model lay,it's a list
30 | if model_name in ['resnet18','resnet34','resnet50','resnet101','resnet152']:
31 | lay = nn.Sequential(*list(model.children())[:-2])
32 |             # 2048 channels for bottleneck resnets, 512 for resnet18/34
33 |             out_channels = 2048 if model_name in ['resnet50','resnet101','resnet152'] else 512
34 |
35 | return lay,out_channels
36 |
37 | def add_extras(self,lay,in_channel):
38 | exts1 = nn.Sequential(
39 | ConvModule(2048,256,1,normalize=None,stride = 1,
40 | bias=True,inplace=False),
41 | ConvModule(256,512,3,normalize=None,stride = 2,padding = 1,
42 | bias=True,inplace=False)
43 |
44 |
45 | #nn.Conv2d(in_channels = 256, out_channels = 512, kernel_size = 3 ,stride = 2, padding = 1)
46 | )
47 | lay.add_module("exts1",exts1)
48 |
49 | exts2 = nn.Sequential(
50 | ConvModule(512,128,1,normalize=None,stride = 1,
51 | bias=True,inplace=False),
52 | ConvModule(128,256,3,normalize=None,stride = 2,padding = 1,
53 | bias=True,inplace=False)
54 |
55 | )
56 | lay.add_module("exts2",exts2)
57 |
58 | exts3 = nn.Sequential(
59 | ConvModule(256,128,1,normalize=None,stride = 1,
60 | bias=True,inplace=False),
61 | ConvModule(128,256,3,normalize=None,stride = 1,padding = 0,
62 | bias=True,inplace=False)
63 | )
64 | lay.add_module("exts3",exts3)
65 |
66 | return lay
67 |
68 | def forward(self,x):
69 | outs = []
70 |
71 | for i in range(self.model_length):
72 | x = self.model[i](x)
73 |
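            # tap this module's output when its 1-based index is listed in
            # feature_map; for resnet50 with [6, ..., 11] these are layer2/3/4
            # and the three extra blocks (512/1024/2048/512/256/256 channels)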
74 | if i+1 in self.feature_map:
75 |
76 | outs.append(x)
77 | #for i in range(len(outs)):
78 | #print(outs[i].shape[1])
79 | if len(outs) == 1:
80 | return outs[0]
81 | else:
82 | return tuple(outs)
83 |
84 |
85 |
86 | if __name__=='__main__':
87 | import torch.nn as nn
88 | use_gpu = True
89 | model_name = 'resnet50'
90 |
91 |
92 | # could be fbresnet152 or inceptionresnetv2
93 | feature_map = [6,7,8,9,10,11]
94 | bone_model = Backbone(model_name,feature_map)
95 | if use_gpu:
96 | bone_model.cuda()
97 | summary(bone_model, (3,300, 300))
98 |
--------------------------------------------------------------------------------
/model/build_ssd.py:
--------------------------------------------------------------------------------
  1 | import os
  2 | import torch
  3 | import torch.nn as nn
  4 | import torch.nn.functional as F
  5 | from model.backbone import Backbone
  6 | from model.neck import Neck,SSDNeck
  7 | from model.head import SSDHead
  8 | from utils import PriorBox,Detect
17 |
18 | class SSD(nn.Module):
19 | """Single Shot Multibox Architecture
20 | The network is composed of a base VGG network followed by the
21 | added multibox conv layers. Each multibox layer branches into
22 | 1) conv2d for class conf scores
23 | 2) conv2d for localization predictions
24 | 3) associated priorbox layer to produce default bounding
25 | boxes specific to the layer's feature map size.
26 | See: https://arxiv.org/pdf/1512.02325.pdf for more details.
27 |
28 | Args:
29 | phase: (string) Can be "test" or "train"
30 | size: input image size
31 | base: VGG16 layers for input, size of either 300 or 500
32 | extras: extra layers that feed to multibox loc and conf layers
33 | head: "multibox head" consists of loc and conf conv layers
34 | """
35 |
36 | def __init__(self, phase, size, Backbone, Neck, Head, cfg):
37 | super(SSD, self).__init__()
38 | self.phase = phase
39 | self.cfg = cfg
40 | self.priorbox = PriorBox(self.cfg)
41 | self.priors = self.priorbox.forward()
42 | self.size = size
43 | # SSD network
44 | self.backbone = Backbone
45 | self.neck = Neck
46 | self.head = Head
47 | self.num_classes = cfg['num_classes']
48 | self.softmax = nn.Softmax(dim=-1)
49 | self.detect = Detect(self.num_classes , 0, 200, 0.01, 0.45,variance = cfg['variance'], nms_kind=cfg['nms_kind'], beta1=cfg['beta1'])
50 |
51 | def forward(self, x, phase):
52 | """Applies network layers and ops on input image(s) x.
53 |
54 | Args:
55 | x: input image or batch of images. Shape: [batch,3,300,300].
56 |
57 | Return:
58 | Depending on phase:
59 | test:
60 | Variable(tensor) of output class label predictions,
61 | confidence score, and corresponding location predictions for
62 | each object detected. Shape: [batch,topk,7]
63 |
64 | train:
65 | list of concat outputs from:
66 | 1: confidence layers, Shape: [batch*num_priors,num_classes]
67 | 2: localization layers, Shape: [batch,num_priors*4]
68 | 3: priorbox layers, Shape: [2,num_priors*4]
69 | """
70 |
71 |
72 | x = self.backbone(x)
73 | if self.neck is not None:
74 | x = self.neck(x)
75 |
76 | conf,loc = self.head(x)
77 |
78 | loc = torch.cat([o.view(o.size(0), -1) for o in loc], 1)
79 | conf = torch.cat([o.view(o.size(0), -1) for o in conf], 1)
80 | if phase == "test":
81 | output = self.detect(
82 | loc.view(loc.size(0), -1, 4), # loc preds
83 | self.softmax(conf.view(conf.size(0), -1,
84 | self.num_classes)), # conf preds
85 | #self.priors.type(type(x.data)) # default boxes
86 | self.priors
87 | )
88 | else:
89 | output = (
90 | loc.view(loc.size(0), -1, 4),
91 | conf.view(conf.size(0), -1, self.num_classes),
92 | self.priors
93 | )
94 | return output
95 |
96 | def load_weights(self, base_file):
97 | other, ext = os.path.splitext(base_file)
 98 |         if ext == '.pkl' or ext == '.pth':
99 | print('Loading weights into state dict...')
100 | self.load_state_dict(torch.load(base_file,
101 | map_location=lambda storage, loc: storage))
102 | print('Finished!')
103 | else:
104 | print('Sorry only .pth and .pkl files supported.')
105 |
106 |
107 | def build_ssd(phase, size=300, cfg=None):
108 | if phase != "test" and phase != "train":
109 | print("ERROR: Phase: " + phase + " not recognized")
110 | return
111 | if size != 300 and size!=600 and size!=800:
112 | print("ERROR: You specified size " + repr(size) + ". However, " +
113 | "currently only SSD300 (size=300) is supported!")
114 | return
115 | print(phase)
116 | base = Backbone(cfg['model'],[6,7,8,9,10,11])
117 |
118 | neck = Neck(in_channels = cfg['backbone_out'], out_channels = cfg['neck_out'])
119 | head = SSDHead(num_classes = cfg['num_classes'],in_channels = cfg['neck_out'],aspect_ratios = cfg['aspect_ratios'])
120 |
121 | return SSD(phase, size, base, neck, head, cfg)
122 |
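# Usage sketch (illustrative):
#   from config import voc
#   net = build_ssd('train', size=300, cfg=voc)
#   loc, conf, priors = net(torch.randn(1, 3, 300, 300), phase='train')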
--------------------------------------------------------------------------------
/model/head/__init__.py:
--------------------------------------------------------------------------------
1 | from .build_head import SSDHead
--------------------------------------------------------------------------------
/model/head/__pycache__/__init__.cpython-35.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Zzh-tju/DIoU-SSD-pytorch/cec038bc1057f0cd532752413b24924fde427f09/model/head/__pycache__/__init__.cpython-35.pyc
--------------------------------------------------------------------------------
/model/head/__pycache__/__init__.cpython-36.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Zzh-tju/DIoU-SSD-pytorch/cec038bc1057f0cd532752413b24924fde427f09/model/head/__pycache__/__init__.cpython-36.pyc
--------------------------------------------------------------------------------
/model/head/__pycache__/build_head.cpython-35.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Zzh-tju/DIoU-SSD-pytorch/cec038bc1057f0cd532752413b24924fde427f09/model/head/__pycache__/build_head.cpython-35.pyc
--------------------------------------------------------------------------------
/model/head/__pycache__/build_head.cpython-36.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Zzh-tju/DIoU-SSD-pytorch/cec038bc1057f0cd532752413b24924fde427f09/model/head/__pycache__/build_head.cpython-36.pyc
--------------------------------------------------------------------------------
/model/head/build_head.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | import torch
3 | import torch.nn as nn
4 | import torch.nn.functional as F
5 |
6 |
7 |
8 | class SSDHead(nn.Module):
9 |
10 | def __init__(self,
11 | num_classes=81,
12 | in_channels=[256,256,256,256,256],
13 | aspect_ratios=([2], [2, 3], [2, 3], [2, 3], [2], [2])):
14 | super(SSDHead, self).__init__()
15 | self.num_classes = num_classes
16 | self.in_channels = in_channels
17 | num_anchors = [len(ratios) * 2 + 2 for ratios in aspect_ratios]
18 | reg_convs = []
19 | cls_convs = []
20 | for i in range(len(in_channels)):
21 | reg_convs.append(
22 | nn.Conv2d(
23 | in_channels[i],
24 | num_anchors[i] * 4,
25 | kernel_size=3,
26 | padding=1))
27 | cls_convs.append(
28 | nn.Conv2d(
29 | in_channels[i],
30 | num_anchors[i] * num_classes,
31 | kernel_size=3,
32 | padding=1))
33 | self.reg_convs = nn.ModuleList(reg_convs)
34 | self.cls_convs = nn.ModuleList(cls_convs)
35 |
36 | self.init_weights()
37 | def init_weights(self):
38 | for m in self.modules():
39 | if isinstance(m, nn.Conv2d):
40 | torch.nn.init.xavier_uniform_(m.weight)
41 |
42 | def forward(self, feats):
43 | cls_scores = []
44 | bbox_preds = []
45 | for feat, reg_conv, cls_conv in zip(feats, self.reg_convs,
46 | self.cls_convs):
47 | #[num_featuremap,w,h,c]
48 | cls_scores.append(cls_conv(feat).permute(0, 2, 3, 1).contiguous())
49 | bbox_preds.append(reg_conv(feat).permute(0, 2, 3, 1).contiguous())
50 |
51 | return cls_scores, bbox_preds
--------------------------------------------------------------------------------
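A standalone shape check for `SSDHead`, runnable with just PyTorch: each level's conv output is permuted to `[N, H, W, C]` so it can later be flattened and concatenated across levels. Note that the constructor indexes `num_anchors` by `len(in_channels)`, so the two argument lists should have matching lengths. The channel counts and feature sizes below are illustrative, not the repo's VOC configuration:

```python
import torch
from model.head import SSDHead

# two levels: len([2]) * 2 + 2 = 4 anchors, len([2, 3]) * 2 + 2 = 6 anchors
head = SSDHead(num_classes=21,
               in_channels=[256, 256],
               aspect_ratios=([2], [2, 3]))
feats = [torch.randn(1, 256, 38, 38), torch.randn(1, 256, 19, 19)]
cls_scores, bbox_preds = head(feats)
for c, b in zip(cls_scores, bbox_preds):
    print(c.shape, b.shape)   # e.g. [1, 38, 38, 4*21] and [1, 38, 38, 4*4]
```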
/model/neck/__init__.py:
--------------------------------------------------------------------------------
1 | from .build_neck import Neck
2 | from .ssd_neck import SSDNeck
3 |
--------------------------------------------------------------------------------
/model/neck/__pycache__/__init__.cpython-35.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Zzh-tju/DIoU-SSD-pytorch/cec038bc1057f0cd532752413b24924fde427f09/model/neck/__pycache__/__init__.cpython-35.pyc
--------------------------------------------------------------------------------
/model/neck/__pycache__/__init__.cpython-36.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Zzh-tju/DIoU-SSD-pytorch/cec038bc1057f0cd532752413b24924fde427f09/model/neck/__pycache__/__init__.cpython-36.pyc
--------------------------------------------------------------------------------
/model/neck/__pycache__/build_neck.cpython-35.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Zzh-tju/DIoU-SSD-pytorch/cec038bc1057f0cd532752413b24924fde427f09/model/neck/__pycache__/build_neck.cpython-35.pyc
--------------------------------------------------------------------------------
/model/neck/__pycache__/build_neck.cpython-36.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Zzh-tju/DIoU-SSD-pytorch/cec038bc1057f0cd532752413b24924fde427f09/model/neck/__pycache__/build_neck.cpython-36.pyc
--------------------------------------------------------------------------------
/model/neck/__pycache__/ssd_neck.cpython-35.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Zzh-tju/DIoU-SSD-pytorch/cec038bc1057f0cd532752413b24924fde427f09/model/neck/__pycache__/ssd_neck.cpython-35.pyc
--------------------------------------------------------------------------------
/model/neck/__pycache__/ssd_neck.cpython-36.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Zzh-tju/DIoU-SSD-pytorch/cec038bc1057f0cd532752413b24924fde427f09/model/neck/__pycache__/ssd_neck.cpython-36.pyc
--------------------------------------------------------------------------------
/model/neck/build_neck.py:
--------------------------------------------------------------------------------
1 | import torch.nn as nn
2 | import torch.nn.functional as F
3 | import sys
4 | from ..utils import ConvModule
5 |
6 |
7 | class Neck(nn.Module):
8 | def __init__(self, in_channels = [64,256,512,1024,2048],out_channels = 256,out_map=None,start_level = 0,end_level = None):
9 | super(Neck,self).__init__()
10 | self.in_channels = in_channels
11 | if isinstance(out_channels,int):
12 | out_channels = [out_channels for i in range(len(self.in_channels))]
13 | self.out_channels = out_channels
14 |
15 | # select which output maps to return
16 | self.out_map = out_map
17 | self.start_level = start_level
18 | self.end_level = end_level
19 | self.normalize = {'type':'BN'}
20 | if self.end_level is None:
21 | self.end_level = len(self.in_channels)
22 |
23 | if self.start_level<0 or self.end_level>len(self.in_channels):
24 | raise ValueError("start_level or end_level is out of range")
25 |
26 |
27 | self.lateral_convs = nn.ModuleList()
28 | self.fpn_convs = nn.ModuleList()
29 |
30 | for i in range(self.start_level, self.end_level):
31 | l_conv = ConvModule(
32 | self.in_channels[i],
33 | self.out_channels[i],
34 | 1,
35 | normalize=self.normalize,
36 | bias=False,
37 | inplace=False)
38 | fpn_conv = ConvModule(
39 | out_channels[i],
40 | out_channels[i],
41 | 3,
42 | padding=1,
43 | normalize=self.normalize,
44 | bias=False,
45 | inplace=True)
46 |
47 | self.lateral_convs.append(l_conv)
48 | self.fpn_convs.append(fpn_conv)
49 |
50 | self.init_weights()
51 |
52 |
53 | def init_weights(self):
54 | for m in self.modules():
55 | if isinstance(m, nn.Conv2d):
56 | nn.init.xavier_uniform_(m.weight)
57 |
58 | def forward(self,inputs):
59 | assert len(inputs) == len(self.in_channels)
60 |
61 | # build laterals
62 | laterals = [
63 | lateral_conv(inputs[i + self.start_level])
64 | for i, lateral_conv in enumerate(self.lateral_convs)
65 | ]
66 |
67 | # build top-down path
68 | used_backbone_levels = len(laterals)
69 | for i in range(used_backbone_levels - 1, 0, -1):
70 |
71 | laterals[i - 1] += F.interpolate(
72 | laterals[i],size = laterals[i-1].shape[2:], mode='nearest')
73 | # build outputs
74 | # part 1: from original levels
75 | outs = [
76 | self.fpn_convs[i](laterals[i]) for i in range(used_backbone_levels)
77 | ]
78 | if self.out_map is not None:
79 | outs = outs[self.out_map]
80 | '''
81 | for i in range(len(outs)):
82 | print(outs[i].shape)
83 | '''
84 | return tuple(outs)
85 |
86 |
87 |
88 |
--------------------------------------------------------------------------------
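A shape-flow sketch for the FPN-style `Neck`: the lateral 1x1 convs unify channel counts, then each coarser map is upsampled and summed into the next finer one before the 3x3 `fpn_convs` smooth the results. Assumes `mmcv` is installed, since `ConvModule` imports from `mmcv.cnn`; the feature sizes are illustrative:

```python
import torch
from model.neck import Neck

neck = Neck(in_channels=[512, 1024, 2048], out_channels=256)
feats = [torch.randn(1, 512, 38, 38),
         torch.randn(1, 1024, 19, 19),
         torch.randn(1, 2048, 10, 10)]
outs = neck(feats)
for o in outs:
    print(o.shape)   # all 256-channel; spatial sizes 38, 19, 10 are preserved
```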
/model/neck/ssd_neck.py:
--------------------------------------------------------------------------------
1 | import torch.nn as nn
2 | import torch.nn.functional as F
3 | from ..utils import ConvModule
4 |
5 |
6 | class SSDNeck(nn.Module):
7 | def __init__(self, in_channels = [1024,2048],out_channels = 256,out_map=None,start_level = 0,end_level = None):
8 | super(SSDNeck,self).__init__()
9 | self.in_channels = in_channels
10 | if isinstance(out_channels,int):
11 | out_channels = [out_channels for i in range(len(self.in_channels))]
12 | self.out_channels = out_channels
13 |
14 | # select which output maps to return
15 | self.out_map = out_map
16 | self.start_level = start_level
17 | self.end_level = end_level
18 | self.normalize = {'type':'BN'}
19 | if self.end_level is None:
20 | self.end_level = len(self.out_channels)
21 |
22 |
23 | self.fpn_convs = nn.ModuleList()
24 |
25 | for i in range(self.start_level, self.end_level):
26 | if i == 0 :
27 | fpn_conv = ConvModule(
28 | in_channels[-1],
29 | out_channels[0],
30 | 3,
31 | stride = 2,
32 | padding=1,
33 | normalize=self.normalize,
34 | bias=True,
35 | inplace=True)
36 | else:
37 | fpn_conv = ConvModule(
38 | out_channels[i-1],
39 | out_channels[i],
40 | 3,
41 | stride = 1,
42 | padding=0,
43 | normalize=self.normalize,
44 | bias=True,
45 | inplace=True)
46 |
47 | self.fpn_convs.append(fpn_conv)
48 |
49 | self.init_weights()
50 | def init_weights(self):
51 | for m in self.modules():
52 | if isinstance(m, nn.Conv2d):
53 | nn.init.xavier_uniform_(m.weight)
54 |
55 | def forward(self,inputs):
56 | assert len(inputs) == len(self.in_channels)
57 |
58 | outs = []
59 | # build outputs
60 | # part 1: from original levels
61 |
62 | x = inputs[-1]
63 | outs += inputs
64 | for i in range(self.start_level, self.end_level):
65 | x = self.fpn_convs[i](x)
66 | outs.append(x)
67 | if self.out_map is not None:
68 | outs = outs[self.out_map]
69 | '''
70 | for i in range(len(outs)):
71 | print(outs[i].shape)
72 | '''
73 | return tuple(outs)
74 |
75 |
76 |
77 |
--------------------------------------------------------------------------------
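`SSDNeck`, by contrast, keeps the backbone maps as-is and appends progressively smaller extra maps in the classic SSD fashion (a stride-2 3x3 conv first, then padding-0 3x3 convs). A sketch under the same `mmcv` assumption:

```python
import torch
from model.neck import SSDNeck

neck = SSDNeck(in_channels=[1024, 2048], out_channels=256)
feats = [torch.randn(1, 1024, 19, 19), torch.randn(1, 2048, 10, 10)]
outs = neck(feats)
for o in outs:
    print(o.shape)   # the 19x19 and 10x10 inputs, then 5x5 and 3x3 extras
```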
/model/utils/__init__.py:
--------------------------------------------------------------------------------
1 | from .conv_module import ConvModule
2 | from .norm import build_norm_layer
3 | from .weight_init import (xavier_init, normal_init, uniform_init, kaiming_init,
4 | bias_init_with_prob)
5 |
6 | __all__ = [
7 | 'ConvModule', 'build_norm_layer', 'xavier_init', 'normal_init',
8 | 'uniform_init', 'kaiming_init', 'bias_init_with_prob'
9 | ]
10 |
--------------------------------------------------------------------------------
/model/utils/__pycache__/__init__.cpython-35.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Zzh-tju/DIoU-SSD-pytorch/cec038bc1057f0cd532752413b24924fde427f09/model/utils/__pycache__/__init__.cpython-35.pyc
--------------------------------------------------------------------------------
/model/utils/__pycache__/__init__.cpython-36.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Zzh-tju/DIoU-SSD-pytorch/cec038bc1057f0cd532752413b24924fde427f09/model/utils/__pycache__/__init__.cpython-36.pyc
--------------------------------------------------------------------------------
/model/utils/__pycache__/conv_module.cpython-35.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Zzh-tju/DIoU-SSD-pytorch/cec038bc1057f0cd532752413b24924fde427f09/model/utils/__pycache__/conv_module.cpython-35.pyc
--------------------------------------------------------------------------------
/model/utils/__pycache__/conv_module.cpython-36.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Zzh-tju/DIoU-SSD-pytorch/cec038bc1057f0cd532752413b24924fde427f09/model/utils/__pycache__/conv_module.cpython-36.pyc
--------------------------------------------------------------------------------
/model/utils/__pycache__/norm.cpython-35.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Zzh-tju/DIoU-SSD-pytorch/cec038bc1057f0cd532752413b24924fde427f09/model/utils/__pycache__/norm.cpython-35.pyc
--------------------------------------------------------------------------------
/model/utils/__pycache__/norm.cpython-36.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Zzh-tju/DIoU-SSD-pytorch/cec038bc1057f0cd532752413b24924fde427f09/model/utils/__pycache__/norm.cpython-36.pyc
--------------------------------------------------------------------------------
/model/utils/__pycache__/weight_init.cpython-35.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Zzh-tju/DIoU-SSD-pytorch/cec038bc1057f0cd532752413b24924fde427f09/model/utils/__pycache__/weight_init.cpython-35.pyc
--------------------------------------------------------------------------------
/model/utils/__pycache__/weight_init.cpython-36.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Zzh-tju/DIoU-SSD-pytorch/cec038bc1057f0cd532752413b24924fde427f09/model/utils/__pycache__/weight_init.cpython-36.pyc
--------------------------------------------------------------------------------
/model/utils/conv_module.py:
--------------------------------------------------------------------------------
1 | import warnings
2 |
3 | import torch.nn as nn
4 | from mmcv.cnn import kaiming_init, constant_init
5 |
6 | from .norm import build_norm_layer
7 |
8 |
9 | class ConvModule(nn.Module):
10 |
11 | def __init__(self,
12 | in_channels,
13 | out_channels,
14 | kernel_size,
15 | stride=1,
16 | padding=0,
17 | dilation=1,
18 | groups=1,
19 | bias=True,
20 | normalize=None,
21 | activation='relu',
22 | inplace=True,
23 | activate_last=True):
24 | super(ConvModule, self).__init__()
25 | self.with_norm = normalize is not None
26 | self.with_activation = activation is not None
27 | self.with_bias = bias
28 | self.activation = activation
29 | self.activate_last = activate_last
30 |
31 | if self.with_norm and self.with_bias:
32 | warnings.warn('ConvModule has norm and bias at the same time')
33 |
34 | self.conv = nn.Conv2d(
35 | in_channels,
36 | out_channels,
37 | kernel_size,
38 | stride,
39 | padding,
40 | dilation,
41 | groups,
42 | bias=bias)
43 |
44 | self.in_channels = self.conv.in_channels
45 | self.out_channels = self.conv.out_channels
46 | self.kernel_size = self.conv.kernel_size
47 | self.stride = self.conv.stride
48 | self.padding = self.conv.padding
49 | self.dilation = self.conv.dilation
50 | self.transposed = self.conv.transposed
51 | self.output_padding = self.conv.output_padding
52 | self.groups = self.conv.groups
53 |
54 | if self.with_norm:
55 | # norm after conv or not
56 | norm_channels = out_channels if self.activate_last else in_channels
57 | self.norm_name, norm = build_norm_layer(normalize, norm_channels)
58 | self.add_module(self.norm_name, norm)
59 |
60 | if self.with_activation:
61 | assert activation in ['relu'], 'Only ReLU supported.'
62 | if self.activation == 'relu':
63 | self.activate = nn.ReLU(inplace=inplace)
64 |
65 | @property
66 | def norm(self):
67 | return getattr(self, self.norm_name)
68 |
69 | def forward(self, x, activate=True, norm=True):
70 | if self.activate_last:
71 | x = self.conv(x)
72 | if norm and self.with_norm:
73 | x = self.norm(x)
74 | if activate and self.with_activation:
75 | x = self.activate(x)
76 | else:
77 | if norm and self.with_norm:
78 | x = self.norm(x)
80 | if activate and self.with_activation:
80 | x = self.activate(x)
81 | x = self.conv(x)
82 | return x
83 |
--------------------------------------------------------------------------------
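`ConvModule` bundles conv, optional norm, and optional ReLU into one block; `activate_last=True` (the default) gives the usual conv -> norm -> activation order. A small sketch, again assuming `mmcv` is installed:

```python
import torch
from model.utils import ConvModule

m = ConvModule(64, 128, 3, padding=1, normalize={'type': 'BN'}, bias=False)
y = m(torch.randn(2, 64, 32, 32))
print(y.shape)   # torch.Size([2, 128, 32, 32])
print(m.norm)    # the BatchNorm2d registered under the name 'bn'
```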
/model/utils/norm.py:
--------------------------------------------------------------------------------
1 | import torch.nn as nn
2 |
3 |
4 | norm_cfg = {
5 | # format: layer_type: (abbreviation, module)
6 | 'BN': ('bn', nn.BatchNorm2d),
7 | 'SyncBN': ('bn', None),
8 | 'GN': ('gn', nn.GroupNorm),
9 | # and potentially 'SN'
10 | }
11 |
12 |
13 | def build_norm_layer(cfg, num_features, postfix=''):
14 | """ Build normalization layer
15 |
16 | Args:
17 | cfg (dict): cfg should contain:
18 | type (str): identify norm layer type.
19 | layer args: args needed to instantiate a norm layer.
20 | frozen (bool): [optional] whether to stop gradient updates
21 | of the norm layer; freezing is helpful for the
22 | norms in a backbone.
23 | num_features (int): number of input channels
24 | postfix (int, str): appended to the norm abbreviation to
25 | create the layer name.
26 |
27 | Returns:
28 | name (str): abbreviation + postfix
29 | layer (nn.Module): created norm layer
30 | """
31 | assert isinstance(cfg, dict) and 'type' in cfg
32 | cfg_ = cfg.copy()
33 |
34 | layer_type = cfg_.pop('type')
35 | if layer_type not in norm_cfg:
36 | raise KeyError('Unrecognized norm type {}'.format(layer_type))
37 | else:
38 | abbr, norm_layer = norm_cfg[layer_type]
39 | if norm_layer is None:
40 | raise NotImplementedError
41 |
42 | assert isinstance(postfix, (int, str))
43 | name = abbr + str(postfix)
44 |
45 | frozen = cfg_.pop('frozen', False)
46 | cfg_.setdefault('eps', 1e-5)
47 | if layer_type != 'GN':
48 | layer = norm_layer(num_features, **cfg_)
49 | else:
50 | assert 'num_groups' in cfg_
51 | layer = norm_layer(num_channels=num_features, **cfg_)
52 |
53 | if frozen:
54 | for param in layer.parameters():
55 | param.requires_grad = False
56 |
57 | return name, layer
58 |
--------------------------------------------------------------------------------
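`build_norm_layer` returns a `(name, module)` pair so callers can register the layer under a readable name, and the optional `frozen` flag turns off gradient updates. For example:

```python
from model.utils import build_norm_layer

name, bn = build_norm_layer({'type': 'BN'}, 256, postfix=1)
print(name, bn)   # 'bn1', BatchNorm2d(256, eps=1e-05, ...)

name, gn = build_norm_layer({'type': 'GN', 'num_groups': 32}, 256)
print(name, gn)   # 'gn', GroupNorm(32, 256, eps=1e-05, affine=True)
```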
/model/utils/weight_init.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | import torch.nn as nn
3 |
4 |
5 | def xavier_init(module, gain=1, bias=0, distribution='normal'):
6 | assert distribution in ['uniform', 'normal']
7 | if distribution == 'uniform':
8 | nn.init.xavier_uniform_(module.weight, gain=gain)
9 | else:
10 | nn.init.xavier_normal_(module.weight, gain=gain)
11 | if hasattr(module, 'bias') and module.bias is not None:
12 | nn.init.constant_(module.bias, bias)
13 |
14 |
15 | def normal_init(module, mean=0, std=1, bias=0):
16 | nn.init.normal_(module.weight, mean, std)
17 | if hasattr(module, 'bias') and module.bias is not None:
18 | nn.init.constant_(module.bias, bias)
19 |
20 |
21 | def uniform_init(module, a=0, b=1, bias=0):
22 | nn.init.uniform_(module.weight, a, b)
23 | if hasattr(module, 'bias') and module.bias is not None:
24 | nn.init.constant_(module.bias, bias)
25 |
26 |
27 | def kaiming_init(module,
28 | mode='fan_out',
29 | nonlinearity='relu',
30 | bias=0,
31 | distribution='normal'):
32 | assert distribution in ['uniform', 'normal']
33 | if distribution == 'uniform':
34 | nn.init.kaiming_uniform_(
35 | module.weight, mode=mode, nonlinearity=nonlinearity)
36 | else:
37 | nn.init.kaiming_normal_(
38 | module.weight, mode=mode, nonlinearity=nonlinearity)
39 | if hasattr(module, 'bias') and module.bias is not None:
40 | nn.init.constant_(module.bias, bias)
41 |
42 |
43 | def bias_init_with_prob(prior_prob):
44 | """ initialize conv/fc bias value according to giving probablity"""
45 | bias_init = float(-np.log((1 - prior_prob) / prior_prob))
46 | return bias_init
47 |
--------------------------------------------------------------------------------
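The helpers above mutate a module's parameters in place; `bias_init_with_prob` converts a desired foreground prior into a logit bias (the focal-loss-style initialization). A quick check:

```python
import torch.nn as nn
from model.utils import kaiming_init, bias_init_with_prob

conv = nn.Conv2d(256, 81, 3, padding=1)
kaiming_init(conv, bias=bias_init_with_prob(0.01))
print(float(conv.bias[0]))   # about -4.595, i.e. -log((1 - 0.01) / 0.01)
```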
/tools/ap.py:
--------------------------------------------------------------------------------
1 | """Adapted from:
2 | @longcw faster_rcnn_pytorch: https://github.com/longcw/faster_rcnn_pytorch
3 | @rbgirshick py-faster-rcnn https://github.com/rbgirshick/py-faster-rcnn
4 | Licensed under The MIT License [see LICENSE for details]
5 | """
6 |
7 | from __future__ import print_function
8 | import os
9 | import sys
10 | import time
11 | sys.path.append(os.getcwd())
12 |
13 | import torch
14 | from data import *
15 | from config import crack,voc,coco,trafic
16 | from data import VOC_CLASSES as labelmap
17 | from model import build_ssd
18 |
19 | import argparse
20 | import numpy as np
21 | import pickle
22 |
23 | from tqdm import tqdm
24 |
25 | if sys.version_info[0] == 2:
26 | import xml.etree.cElementTree as ET
27 | else:
28 | import xml.etree.ElementTree as ET
29 |
30 |
31 | def str2bool(v):
32 | return v.lower() in ("yes", "true", "t", "1")
33 |
34 |
35 | parser = argparse.ArgumentParser(
36 | description='Single Shot MultiBox Detector Evaluation')
37 | parser.add_argument('--trained_model',
38 | default='weights/ssd300_mAP_77.43_v2.pth', type=str,
39 | help='Trained state_dict file path to open')
40 | parser.add_argument('--save_folder', default='eval/', type=str,
41 | help='File path to save results')
42 | parser.add_argument('--confidence_threshold', default=0.01, type=float,
43 | help='Detection confidence threshold')
44 | parser.add_argument('--top_k', default=5, type=int,
45 | help='Further restrict the number of predictions to parse')
46 | parser.add_argument('--cuda', default=True, type=str2bool,
47 | help='Use cuda to train model')
48 | parser.add_argument('--voc_root', default=VOC_ROOT,
49 | help='Location of VOC root directory')
50 | parser.add_argument('--over_thresh', default=0.5, type=float,
51 | help='Cleanup and remove results files following eval')
52 | args = parser.parse_args()
53 |
54 | if not os.path.exists(args.save_folder):
55 | os.mkdir(args.save_folder)
56 |
57 |
58 |
59 |
60 |
61 | class Timer(object):
62 | """A simple timer."""
63 | def __init__(self):
64 | self.total_time = 0.
65 | self.calls = 0
66 | self.start_time = 0.
67 | self.diff = 0.
68 | self.average_time = 0.
69 |
70 | def tic(self):
71 | # using time.time instead of time.clock because time.clock
72 | # does not normalize for multithreading
73 | self.start_time = time.time()
74 |
75 | def toc(self, average=True):
76 | self.diff = time.time() - self.start_time
77 | self.total_time += self.diff
78 | self.calls += 1
79 | self.average_time = self.total_time / self.calls
80 | if average:
81 | return self.average_time
82 | else:
83 | return self.diff
84 |
85 |
86 | def parse_rec(filename):
87 | """ Parse a PASCAL VOC xml file """
88 | tree = ET.parse(filename)
89 | objects = []
90 | for obj in tree.findall('object'):
91 | obj_struct = {}
92 | obj_struct['name'] = obj.find('name').text
93 | obj_struct['pose'] = obj.find('pose').text
94 | obj_struct['truncated'] = int(obj.find('truncated').text)
95 | obj_struct['difficult'] = int(obj.find('difficult').text)
96 | bbox = obj.find('bndbox')
97 | obj_struct['bbox'] = [int(bbox.find('xmin').text) - 1,
98 | int(bbox.find('ymin').text) - 1,
99 | int(bbox.find('xmax').text) - 1,
100 | int(bbox.find('ymax').text) - 1]
101 | objects.append(obj_struct)
102 |
103 | return objects
104 |
105 |
106 | def get_output_dir(name, phase):
107 | """Return the directory where experimental artifacts are placed.
108 | If the directory does not exist, it is created.
109 | A canonical path is built using the name from an imdb and a network
110 | (if not None).
111 | """
112 | filedir = os.path.join(name, phase)
113 | if not os.path.exists(filedir):
114 | os.makedirs(filedir)
115 | return filedir
116 |
117 |
118 | def get_voc_results_file_template(data_dir,image_set, cls):
119 | # VOCdevkit/VOC2007/results/det_test_aeroplane.txt
120 | filename = 'det_' + image_set + '_%s.txt' % (cls)
121 | filedir = os.path.join(data_dir, 'results')
122 | if not os.path.exists(filedir):
123 | os.makedirs(filedir)
124 | path = os.path.join(filedir, filename)
125 | return path
126 |
127 |
128 | def write_voc_results_file(data_dir,all_boxes, dataset ,set_type):
129 | for cls_ind, cls in enumerate(labelmap):
130 | # write each class's detections to its own results file
131 | filename = get_voc_results_file_template(data_dir,set_type, cls)
132 | with open(filename, 'wt') as f:
133 | for im_ind, index in enumerate(dataset.ids):
134 | dets = all_boxes[cls_ind+1][im_ind]
135 | if len(dets) == 0:
136 | continue
137 | # the VOCdevkit expects 1-based indices
138 | for k in range(dets.shape[0]):
139 | f.write('{:s} {:.3f} {:.1f} {:.1f} {:.1f} {:.1f}\n'.
140 | format(index[1], dets[k, -1],
141 | dets[k, 0] + 1, dets[k, 1] + 1,
142 | dets[k, 2] + 1, dets[k, 3] + 1))
143 |
144 |
145 | def do_python_eval(output_dir, set_type,thresh, use_07=True):
146 | cachedir = os.path.join(output_dir, 'annotations_cache')
147 | imgsetpath = os.path.join(output_dir,'ImageSets',
148 | 'Main', '{:s}.txt')
149 | annopath = os.path.join(output_dir, 'Annotations', '%s.xml')
150 | if not os.path.isdir(cachedir):
151 | os.mkdir(cachedir)
152 | #print(cachedir)
153 | aps = []
154 | # The PASCAL VOC metric changed in 2010
155 | use_07_metric = use_07
156 | print('VOC07 metric? ' + ('Yes' if use_07_metric else 'No'))
157 |
158 | for i, cls in enumerate(labelmap):
159 | filename = get_voc_results_file_template(output_dir,set_type, cls)
160 | rec, prec, ap = voc_eval(
161 | filename, annopath, imgsetpath.format(set_type), cls, cachedir,
162 | ovthresh=thresh, use_07_metric=use_07_metric)
163 | aps += [ap]
164 | print('AP for {} = {:.4f}'.format(cls, ap))
165 | #with open(os.path.join(output_dir, cls + '_pr.pkl'), 'wb') as f:
166 | #pickle.dump({'rec': rec, 'prec': prec, 'ap': ap}, f)
167 | print('Mean AP = {:.4f}'.format(np.mean(aps)))
168 | print('~~~~~~~~')
169 | print('Results:')
170 | for ap in aps:
171 | print('{:.3f}'.format(ap))
172 | print('{:.3f}'.format(np.mean(aps)))
173 | print('~~~~~~~~')
174 | print('')
175 | print('--------------------------------------------------------------')
176 | print('Results computed with the **unofficial** Python eval code.')
177 | print('Results should be very close to the official MATLAB eval code.')
178 | print('--------------------------------------------------------------')
179 | return np.mean(aps)
180 |
181 | def voc_ap(rec, prec, use_07_metric=True):
182 | """ ap = voc_ap(rec, prec, [use_07_metric])
183 | Compute VOC AP given precision and recall.
184 | If use_07_metric is true, uses the
185 | VOC 07 11 point method (default:True).
186 | """
187 | if use_07_metric:
188 | # 11 point metric
189 | ap = 0.
190 | for t in np.arange(0., 1.1, 0.1):
191 | if np.sum(rec >= t) == 0:
192 | p = 0
193 | else:
194 | p = np.max(prec[rec >= t])
195 | ap = ap + p / 11.
196 | else:
197 | # correct AP calculation
198 | # first append sentinel values at the end
199 | mrec = np.concatenate(([0.], rec, [1.]))
200 | mpre = np.concatenate(([0.], prec, [0.]))
201 |
202 | # compute the precision envelope
203 | for i in range(mpre.size - 1, 0, -1):
204 | mpre[i - 1] = np.maximum(mpre[i - 1], mpre[i])
205 |
206 | # to calculate area under PR curve, look for points
207 | # where X axis (recall) changes value
208 | i = np.where(mrec[1:] != mrec[:-1])[0]
209 |
210 | # and sum (\Delta recall) * prec
211 | ap = np.sum((mrec[i + 1] - mrec[i]) * mpre[i + 1])
212 | return ap
213 |
214 |
215 | def voc_eval(detpath,
216 | annopath,
217 | imagesetfile,
218 | classname,
219 | cachedir,
220 | ovthresh=0.5,
221 | use_07_metric=True):
222 | """rec, prec, ap = voc_eval(detpath,
223 | annopath,
224 | imagesetfile,
225 | classname,
226 | [ovthresh],
227 | [use_07_metric])
228 | Top level function that does the PASCAL VOC evaluation.
229 | detpath: Path to detections
230 | detpath.format(classname) should produce the detection results file.
231 | annopath: Path to annotations
232 | annopath.format(imagename) should be the xml annotations file.
233 | imagesetfile: Text file containing the list of images, one image per line.
234 | classname: Category name (duh)
235 | cachedir: Directory for caching the annotations
236 | [ovthresh]: Overlap threshold (default = 0.5)
237 | [use_07_metric]: Whether to use VOC07's 11 point AP computation
238 | (default True)
239 | """
240 | # assumes detections are in detpath.format(classname)
241 | # assumes annotations are in annopath.format(imagename)
242 | # assumes imagesetfile is a text file with each line an image name
243 | # cachedir caches the annotations in a pickle file
244 | # first load gt
245 | cachefile = os.path.join(cachedir, 'annots.pkl')
246 | # read list of images
247 | with open(imagesetfile, 'r') as f:
248 | lines = f.readlines()
249 | imagenames = [x.strip() for x in lines]
250 | # cache the ground truth as a pickle; if the cache file already exists, just load it.
251 | if not os.path.isfile(cachefile):
252 | #load annots
253 | recs = {}
254 | for i, imagename in enumerate(imagenames):
255 | recs[imagename] = parse_rec(annopath % (imagename))
256 | # save
257 | print('Saving cached annotations to {:s}'.format(cachefile))
258 | with open(cachefile, 'wb') as f:
259 | pickle.dump(recs, f)
260 | else:
261 | # load
262 | with open(cachefile, 'rb') as f:
263 | recs = pickle.load(f)
264 |
265 | # extract gt objects for this class
266 | class_recs = {}
267 | npos = 0
268 | for imagename in imagenames:
269 |
270 | R = [obj for obj in recs[imagename] if obj['name'] == classname]
271 | bbox = np.array([x['bbox'] for x in R])
272 | difficult = np.array([x['difficult'] for x in R]).astype(bool)
273 | det = [False] * len(R)
274 | npos = npos + sum(~difficult)
275 | class_recs[imagename] = {'bbox': bbox,
276 | 'difficult': difficult,
277 | 'det': det}
278 |
279 | # read dets
280 | detfile = detpath.format(classname)
281 | with open(detfile, 'r') as f:
282 | lines = f.readlines()
283 | if any(lines):
284 |
285 | splitlines = [x.strip().split(' ') for x in lines]
286 | image_ids = [x[0] for x in splitlines]
287 | confidence = np.array([float(x[1]) for x in splitlines])
288 | BB = np.array([[float(z) for z in x[2:]] for x in splitlines])
289 |
290 | # sort by confidence
291 | sorted_ind = np.argsort(-confidence)
292 | sorted_scores = np.sort(-confidence)
293 | BB = BB[sorted_ind, :]
294 | image_ids = [image_ids[x] for x in sorted_ind]
295 |
296 | # go down dets and mark TPs and FPs
297 | nd = len(image_ids)
298 | tp = np.zeros(nd)
299 | fp = np.zeros(nd)
300 | for d in range(nd):
301 | R = class_recs[image_ids[d]]
302 | bb = BB[d, :].astype(float)
303 | ovmax = -np.inf
304 | BBGT = R['bbox'].astype(float)
305 | if BBGT.size > 0:
306 | # compute overlaps
307 | # intersection
308 | ixmin = np.maximum(BBGT[:, 0], bb[0])
309 | iymin = np.maximum(BBGT[:, 1], bb[1])
310 | ixmax = np.minimum(BBGT[:, 2], bb[2])
311 | iymax = np.minimum(BBGT[:, 3], bb[3])
312 | iw = np.maximum(ixmax - ixmin, 0.)
313 | ih = np.maximum(iymax - iymin, 0.)
314 | inters = iw * ih
315 | uni = ((bb[2] - bb[0]) * (bb[3] - bb[1]) +
316 | (BBGT[:, 2] - BBGT[:, 0]) *
317 | (BBGT[:, 3] - BBGT[:, 1]) - inters)
318 | overlaps = inters / uni
319 | ovmax = np.max(overlaps)
320 | jmax = np.argmax(overlaps)
321 |
322 | if ovmax > ovthresh:
323 | if not R['difficult'][jmax]:
324 | if not R['det'][jmax]:
325 | tp[d] = 1.
326 | R['det'][jmax] = 1
327 | else:
328 | fp[d] = 1.
329 | else:
330 | fp[d] = 1.
331 |
332 | # compute precision recall
333 | fp = np.cumsum(fp)
334 | tp = np.cumsum(tp)
335 | rec = tp / float(npos)
336 | # avoid divide by zero in case the first detection matches a difficult
337 | # ground truth
338 | prec = tp / np.maximum(tp + fp, np.finfo(np.float64).eps)
339 | ap = voc_ap(rec, prec, use_07_metric)
340 | else:
341 | rec = -1.
342 | prec = -1.
343 | ap = -1.
344 |
345 | return rec, prec, ap
346 |
347 |
348 | def test_net(save_folder, net, cuda, dataset, top_k,im_size=300, thresh=0.05):
349 | # number of images in the dataset
350 | num_images = len(dataset)
351 | # all detections are collected into:[21,4952,0]
352 | # all_boxes[cls][image] = N x 5 array of detections in
353 | # (x1, y1, x2, y2, score)
354 | all_boxes = [[[] for _ in range(num_images)]
355 | for _ in range(len(labelmap)+1)]
356 | # timers
357 | _t = {'im_detect': Timer(), 'misc': Timer()}
358 |
359 | print(num_images)
360 | for i in tqdm(range(num_images)):
361 | with torch.no_grad():
362 | im, gt, h, w = dataset.pull_item(i)
363 | x = im.unsqueeze(0)
364 | if args.cuda:
365 | x = x.cuda()
366 | _t['im_detect'].tic()
367 | detections = net(x,'test').data
368 | detect_time = _t['im_detect'].toc(average=False)
369 |
370 | # skip j = 0, because it's the background class
371 | for j in range(1, detections.size(1)):
372 | dets = detections[0, j, :]
373 | mask = dets[:, 0].gt(0.).expand(5, dets.size(0)).t()
374 | dets = torch.masked_select(dets, mask).view(-1, 5)
375 | if dets.size(0) == 0:
376 | continue
377 | boxes = dets[:, 1:]
378 | boxes[:, 0] *= w
379 | boxes[:, 2] *= w
380 | boxes[:, 1] *= h
381 | boxes[:, 3] *= h
382 | scores = dets[:, 0].cpu().numpy()
383 | cls_dets = np.hstack((boxes.cpu().numpy(),
384 | scores[:, np.newaxis])).astype(np.float32,
385 | copy=False)
386 | all_boxes[j][i] = cls_dets
387 | return all_boxes
388 |
389 | def evaluate_detections(data_dir,box_list,dataset, thresh, eval_type = 'test'):
390 | # write the detection results to the output dir
391 | write_voc_results_file(data_dir,box_list, dataset, eval_type)
392 | return do_python_eval(data_dir,eval_type,thresh=thresh)
393 |
394 |
395 | if __name__ == '__main__':
396 | if torch.cuda.is_available():
397 | if args.cuda:
398 | torch.set_default_tensor_type('torch.cuda.FloatTensor')
399 | if not args.cuda:
400 | print("WARNING: It looks like you have a CUDA device, but aren't using \
401 | CUDA. Run with --cuda for optimal eval speed.")
402 | torch.set_default_tensor_type('torch.FloatTensor')
403 | else:
404 | torch.set_default_tensor_type('torch.FloatTensor')
405 | num_classes = len(labelmap) + 1 # +1 for background
406 | net = build_ssd('test', size = 300, cfg = voc) # initialize SSD
407 | net.load_state_dict(torch.load(args.trained_model))
408 |
409 | print('Finished loading model!')
410 | # load data
411 | dataset = VOCDetection(args.voc_root, [('2007', 'test')],
412 | BaseTransform(300, voc['mean'],voc['std']))
413 | if args.cuda:
414 | net = net.cuda()
415 | torch.backends.cudnn.benchmark = False
416 | net.eval()
417 |
418 | # evaluation
419 | devkit_path = VOC_ROOT +'VOC2007/'
420 |
421 | all_boxes = test_net(args.save_folder, net, args.cuda, dataset,args.top_k, 300,
422 | thresh=args.confidence_threshold)
423 | print('Evaluating detections')
424 | results = []
425 | for thresh in np.arange(0.5,1,0.05):
426 | result = evaluate_detections(devkit_path,all_boxes, dataset,thresh, 'test')
427 | results.append(result)
428 | print(results[0], results[5], sum(results)/10)  # AP50, AP75, mean AP over IoU 0.50:0.95
429 |
430 |
--------------------------------------------------------------------------------
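A tiny worked example of the two AP variants implemented in `voc_ap` above. With a single precision step, the VOC07 11-point interpolation and the exact area-under-curve computation differ slightly, as expected:

```python
import numpy as np

rec = np.array([0.5, 1.0])
prec = np.array([1.0, 0.5])

# 11-point metric: max precision at recall >= t for t = 0.0, 0.1, ..., 1.0
ap_07 = 0.
for t in np.arange(0., 1.1, 0.1):
    p = np.max(prec[rec >= t]) if np.sum(rec >= t) > 0 else 0.
    ap_07 += p / 11.
print(round(ap_07, 4))        # 0.7727 (6 points at p=1.0, 5 at p=0.5)

# exact area under the precision envelope
print(0.5 * 1.0 + 0.5 * 0.5)  # 0.75
```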
/tools/eval.py:
--------------------------------------------------------------------------------
1 | """Adapted from:
2 | @longcw faster_rcnn_pytorch: https://github.com/longcw/faster_rcnn_pytorch
3 | @rbgirshick py-faster-rcnn https://github.com/rbgirshick/py-faster-rcnn
4 | Licensed under The MIT License [see LICENSE for details]
5 | """
6 |
7 | from __future__ import print_function
8 | import os
9 | import sys
10 | import time
11 | sys.path.append(os.getcwd())
12 |
13 | import torch
14 | from data import *
15 | from config import crack,voc,coco,trafic
16 | from data import VOC_CLASSES as labelmap
17 | from model import build_ssd
18 |
19 | import argparse
20 | import numpy as np
21 | import pickle
22 |
23 | from tqdm import tqdm
24 |
25 | if sys.version_info[0] == 2:
26 | import xml.etree.cElementTree as ET
27 | else:
28 | import xml.etree.ElementTree as ET
29 |
30 |
31 | def str2bool(v):
32 | return v.lower() in ("yes", "true", "t", "1")
33 |
34 |
35 | parser = argparse.ArgumentParser(
36 | description='Single Shot MultiBox Detector Evaluation')
37 | parser.add_argument('--trained_model',
38 | default='weights/ssd300_mAP_77.43_v2.pth', type=str,
39 | help='Trained state_dict file path to open')
40 | parser.add_argument('--save_folder', default='eval/', type=str,
41 | help='File path to save results')
42 | parser.add_argument('--confidence_threshold', default=0.01, type=float,
43 | help='Detection confidence threshold')
44 | parser.add_argument('--top_k', default=5, type=int,
45 | help='Further restrict the number of predictions to parse')
46 | parser.add_argument('--cuda', default=True, type=str2bool,
47 | help='Use cuda to train model')
48 | parser.add_argument('--voc_root', default=VOC_ROOT,
49 | help='Location of VOC root directory')
50 | parser.add_argument('--over_thresh', default=0.5, type=float,
51 | help='Cleanup and remove results files following eval')
52 | args = parser.parse_args()
53 |
54 | if not os.path.exists(args.save_folder):
55 | os.mkdir(args.save_folder)
56 |
57 |
58 |
59 |
60 |
61 | class Timer(object):
62 | """A simple timer."""
63 | def __init__(self):
64 | self.total_time = 0.
65 | self.calls = 0
66 | self.start_time = 0.
67 | self.diff = 0.
68 | self.average_time = 0.
69 |
70 | def tic(self):
71 | # using time.time instead of time.clock because time.clock
72 | # does not normalize for multithreading
73 | self.start_time = time.time()
74 |
75 | def toc(self, average=True):
76 | self.diff = time.time() - self.start_time
77 | self.total_time += self.diff
78 | self.calls += 1
79 | self.average_time = self.total_time / self.calls
80 | if average:
81 | return self.average_time
82 | else:
83 | return self.diff
84 |
85 |
86 | def parse_rec(filename):
87 | """ Parse a PASCAL VOC xml file """
88 | tree = ET.parse(filename)
89 | objects = []
90 | for obj in tree.findall('object'):
91 | obj_struct = {}
92 | obj_struct['name'] = obj.find('name').text
93 | obj_struct['pose'] = obj.find('pose').text
94 | obj_struct['truncated'] = int(obj.find('truncated').text)
95 | obj_struct['difficult'] = int(obj.find('difficult').text)
96 | bbox = obj.find('bndbox')
97 | obj_struct['bbox'] = [int(bbox.find('xmin').text) - 1,
98 | int(bbox.find('ymin').text) - 1,
99 | int(bbox.find('xmax').text) - 1,
100 | int(bbox.find('ymax').text) - 1]
101 | objects.append(obj_struct)
102 |
103 | return objects
104 |
105 |
106 | def get_output_dir(name, phase):
107 | """Return the directory where experimental artifacts are placed.
108 | If the directory does not exist, it is created.
109 | A canonical path is built using the name from an imdb and a network
110 | (if not None).
111 | """
112 | filedir = os.path.join(name, phase)
113 | if not os.path.exists(filedir):
114 | os.makedirs(filedir)
115 | return filedir
116 |
117 |
118 | def get_voc_results_file_template(data_dir,image_set, cls):
119 | # VOCdevkit/VOC2007/results/det_test_aeroplane.txt
120 | filename = 'det_' + image_set + '_%s.txt' % (cls)
121 | filedir = os.path.join(data_dir, 'results')
122 | if not os.path.exists(filedir):
123 | os.makedirs(filedir)
124 | path = os.path.join(filedir, filename)
125 | return path
126 |
127 |
128 | def write_voc_results_file(data_dir,all_boxes, dataset ,set_type):
129 | for cls_ind, cls in enumerate(labelmap):
130 | # write each class's detections to its own results file
131 | filename = get_voc_results_file_template(data_dir,set_type, cls)
132 | with open(filename, 'wt') as f:
133 | for im_ind, index in enumerate(dataset.ids):
134 | dets = all_boxes[cls_ind+1][im_ind]
135 | if len(dets) == 0:
136 | continue
137 | # the VOCdevkit expects 1-based indices
138 | for k in range(dets.shape[0]):
139 | f.write('{:s} {:.3f} {:.1f} {:.1f} {:.1f} {:.1f}\n'.
140 | format(index[1], dets[k, -1],
141 | dets[k, 0] + 1, dets[k, 1] + 1,
142 | dets[k, 2] + 1, dets[k, 3] + 1))
143 |
144 |
145 | def do_python_eval(output_dir, set_type,use_07=True):
146 | cachedir = os.path.join(output_dir, 'annotations_cache')
147 | imgsetpath = os.path.join(output_dir,'ImageSets',
148 | 'Main', '{:s}.txt')
149 | annopath = os.path.join(output_dir, 'Annotations', '%s.xml')
150 | if not os.path.isdir(cachedir):
151 | os.mkdir(cachedir)
152 | #print(cachedir)
153 | aps = []
154 | # The PASCAL VOC metric changed in 2010
155 | use_07_metric = use_07
156 | print('VOC07 metric? ' + ('Yes' if use_07_metric else 'No'))
157 |
158 | for i, cls in enumerate(labelmap):
159 | filename = get_voc_results_file_template(output_dir,set_type, cls)
160 | rec, prec, ap = voc_eval(
161 | filename, annopath, imgsetpath.format(set_type), cls, cachedir,
162 | ovthresh=args.over_thresh, use_07_metric=use_07_metric)
163 | aps += [ap]
164 | print('AP for {} = {:.4f}'.format(cls, ap))
165 | #with open(os.path.join(output_dir, cls + '_pr.pkl'), 'wb') as f:
166 | #pickle.dump({'rec': rec, 'prec': prec, 'ap': ap}, f)
167 | print('Mean AP = {:.4f}'.format(np.mean(aps)))
168 | print('~~~~~~~~')
169 | print('Results:')
170 | for ap in aps:
171 | print('{:.3f}'.format(ap))
172 | print('{:.3f}'.format(np.mean(aps)))
173 | print('~~~~~~~~')
174 | print('')
175 | print('--------------------------------------------------------------')
176 | print('Results computed with the **unofficial** Python eval code.')
177 | print('Results should be very close to the official MATLAB eval code.')
178 | print('--------------------------------------------------------------')
179 | return np.mean(aps)
180 |
181 | def voc_ap(rec, prec, use_07_metric=True):
182 | """ ap = voc_ap(rec, prec, [use_07_metric])
183 | Compute VOC AP given precision and recall.
184 | If use_07_metric is true, uses the
185 | VOC 07 11 point method (default:True).
186 | """
187 | if use_07_metric:
188 | # 11 point metric
189 | ap = 0.
190 | for t in np.arange(0., 1.1, 0.1):
191 | if np.sum(rec >= t) == 0:
192 | p = 0
193 | else:
194 | p = np.max(prec[rec >= t])
195 | ap = ap + p / 11.
196 | else:
197 | # correct AP calculation
198 | # first append sentinel values at the end
199 | mrec = np.concatenate(([0.], rec, [1.]))
200 | mpre = np.concatenate(([0.], prec, [0.]))
201 |
202 | # compute the precision envelope
203 | for i in range(mpre.size - 1, 0, -1):
204 | mpre[i - 1] = np.maximum(mpre[i - 1], mpre[i])
205 |
206 | # to calculate area under PR curve, look for points
207 | # where X axis (recall) changes value
208 | i = np.where(mrec[1:] != mrec[:-1])[0]
209 |
210 | # and sum (\Delta recall) * prec
211 | ap = np.sum((mrec[i + 1] - mrec[i]) * mpre[i + 1])
212 | return ap
213 |
214 |
215 | def voc_eval(detpath,
216 | annopath,
217 | imagesetfile,
218 | classname,
219 | cachedir,
220 | ovthresh=0.5,
221 | use_07_metric=True):
222 | """rec, prec, ap = voc_eval(detpath,
223 | annopath,
224 | imagesetfile,
225 | classname,
226 | [ovthresh],
227 | [use_07_metric])
228 | Top level function that does the PASCAL VOC evaluation.
229 | detpath: Path to detections
230 | detpath.format(classname) should produce the detection results file.
231 | annopath: Path to annotations
232 | annopath.format(imagename) should be the xml annotations file.
233 | imagesetfile: Text file containing the list of images, one image per line.
234 | classname: Category name (duh)
235 | cachedir: Directory for caching the annotations
236 | [ovthresh]: Overlap threshold (default = 0.5)
237 | [use_07_metric]: Whether to use VOC07's 11 point AP computation
238 | (default True)
239 | """
240 | # assumes detections are in detpath.format(classname)
241 | # assumes annotations are in annopath.format(imagename)
242 | # assumes imagesetfile is a text file with each line an image name
243 | # cachedir caches the annotations in a pickle file
244 | # first load gt
245 | cachefile = os.path.join(cachedir, 'annots.pkl')
246 | # read list of images
247 | with open(imagesetfile, 'r') as f:
248 | lines = f.readlines()
249 | imagenames = [x.strip() for x in lines]
250 | # cache the ground truth as a pickle; if the cache file already exists, just load it.
251 | if not os.path.isfile(cachefile):
252 | #load annots
253 | recs = {}
254 | for i, imagename in enumerate(imagenames):
255 | recs[imagename] = parse_rec(annopath % (imagename))
256 | # save
257 | print('Saving cached annotations to {:s}'.format(cachefile))
258 | with open(cachefile, 'wb') as f:
259 | pickle.dump(recs, f)
260 | else:
261 | # load
262 | with open(cachefile, 'rb') as f:
263 | recs = pickle.load(f)
264 |
265 | # extract gt objects for this class
266 | class_recs = {}
267 | npos = 0
268 | for imagename in imagenames:
269 |
270 | R = [obj for obj in recs[imagename] if obj['name'] == classname]
271 | bbox = np.array([x['bbox'] for x in R])
272 | difficult = np.array([x['difficult'] for x in R]).astype(bool)
273 | det = [False] * len(R)
274 | npos = npos + sum(~difficult)
275 | class_recs[imagename] = {'bbox': bbox,
276 | 'difficult': difficult,
277 | 'det': det}
278 |
279 | # read dets
280 | detfile = detpath.format(classname)
281 | with open(detfile, 'r') as f:
282 | lines = f.readlines()
283 | if any(lines):
284 |
285 | splitlines = [x.strip().split(' ') for x in lines]
286 | image_ids = [x[0] for x in splitlines]
287 | confidence = np.array([float(x[1]) for x in splitlines])
288 | BB = np.array([[float(z) for z in x[2:]] for x in splitlines])
289 |
290 | # sort by confidence
291 | sorted_ind = np.argsort(-confidence)
292 | sorted_scores = np.sort(-confidence)
293 | BB = BB[sorted_ind, :]
294 | image_ids = [image_ids[x] for x in sorted_ind]
295 |
296 | # go down dets and mark TPs and FPs
297 | nd = len(image_ids)
298 | tp = np.zeros(nd)
299 | fp = np.zeros(nd)
300 | for d in range(nd):
301 | R = class_recs[image_ids[d]]
302 | bb = BB[d, :].astype(float)
303 | ovmax = -np.inf
304 | BBGT = R['bbox'].astype(float)
305 | if BBGT.size > 0:
306 | # compute overlaps
307 | # intersection
308 | ixmin = np.maximum(BBGT[:, 0], bb[0])
309 | iymin = np.maximum(BBGT[:, 1], bb[1])
310 | ixmax = np.minimum(BBGT[:, 2], bb[2])
311 | iymax = np.minimum(BBGT[:, 3], bb[3])
312 | iw = np.maximum(ixmax - ixmin, 0.)
313 | ih = np.maximum(iymax - iymin, 0.)
314 | inters = iw * ih
315 | uni = ((bb[2] - bb[0]) * (bb[3] - bb[1]) +
316 | (BBGT[:, 2] - BBGT[:, 0]) *
317 | (BBGT[:, 3] - BBGT[:, 1]) - inters)
318 | overlaps = inters / uni
319 | ovmax = np.max(overlaps)
320 | jmax = np.argmax(overlaps)
321 |
322 | if ovmax > ovthresh:
323 | if not R['difficult'][jmax]:
324 | if not R['det'][jmax]:
325 | tp[d] = 1.
326 | R['det'][jmax] = 1
327 | else:
328 | fp[d] = 1.
329 | else:
330 | fp[d] = 1.
331 |
332 | # compute precision recall
333 | fp = np.cumsum(fp)
334 | tp = np.cumsum(tp)
335 | rec = tp / float(npos)
336 | # avoid divide by zero in case the first detection matches a difficult
337 | # ground truth
338 | prec = tp / np.maximum(tp + fp, np.finfo(np.float64).eps)
339 | ap = voc_ap(rec, prec, use_07_metric)
340 | else:
341 | rec = -1.
342 | prec = -1.
343 | ap = -1.
344 |
345 | return rec, prec, ap
346 |
347 |
348 | def test_net(save_folder, net, cuda, dataset, top_k,im_size=300, thresh=0.05):
349 | # number of images in the dataset
350 | num_images = len(dataset)
351 | # all detections are collected into:[21,4952,0]
352 | # all_boxes[cls][image] = N x 5 array of detections in
353 | # (x1, y1, x2, y2, score)
354 | all_boxes = [[[] for _ in range(num_images)]
355 | for _ in range(len(labelmap)+1)]
356 | # timers
357 | _t = {'im_detect': Timer(), 'misc': Timer()}
358 |
359 | print(num_images)
360 | for i in tqdm(range(num_images)):
361 | with torch.no_grad():
362 | im, gt, h, w = dataset.pull_item(i)
363 | x = im.unsqueeze(0)
364 | if args.cuda:
365 | x = x.cuda()
366 | _t['im_detect'].tic()
367 | detections = net(x,'test').data
368 | detect_time = _t['im_detect'].toc(average=False)
369 |
370 | # skip j = 0, because it's the background class
371 | for j in range(1, detections.size(1)):
372 | dets = detections[0, j, :]
373 | mask = dets[:, 0].gt(0.).expand(5, dets.size(0)).t()
374 | dets = torch.masked_select(dets, mask).view(-1, 5)
375 | if dets.size(0) == 0:
376 | continue
377 | boxes = dets[:, 1:]
378 | boxes[:, 0] *= w
379 | boxes[:, 2] *= w
380 | boxes[:, 1] *= h
381 | boxes[:, 3] *= h
382 | scores = dets[:, 0].cpu().numpy()
383 | cls_dets = np.hstack((boxes.cpu().numpy(),
384 | scores[:, np.newaxis])).astype(np.float32,
385 | copy=False)
386 | all_boxes[j][i] = cls_dets
387 | return all_boxes
388 |
389 | def evaluate_detections(data_dir,box_list,dataset,eval_type = 'test'):
390 | # write the detection results to the output dir
391 | write_voc_results_file(data_dir,box_list, dataset, eval_type)
392 | return do_python_eval(data_dir,eval_type)
393 |
394 |
395 | if __name__ == '__main__':
396 | if torch.cuda.is_available():
397 | if args.cuda:
398 | torch.set_default_tensor_type('torch.cuda.FloatTensor')
399 | if not args.cuda:
400 | print("WARNING: It looks like you have a CUDA device, but aren't using \
401 | CUDA. Run with --cuda for optimal eval speed.")
402 | torch.set_default_tensor_type('torch.FloatTensor')
403 | else:
404 | torch.set_default_tensor_type('torch.FloatTensor')
405 | num_classes = len(labelmap) + 1 # +1 for background
406 | net = build_ssd('test', size = 300, cfg = voc) # initialize SSD
407 | net.load_state_dict(torch.load(args.trained_model))
408 |
409 | print('Finished loading model!')
410 | # load data
411 | dataset = VOCDetection(args.voc_root, [('2007', 'test')],
412 | BaseTransform(300, voc['mean'],voc['std']))
413 | if args.cuda:
414 | net = net.cuda()
415 | #torch.backends.cudnn.benchmark = True
416 | net.eval()
417 |
418 | # evaluation
419 | devkit_path = VOC_ROOT +'VOC2007/'
420 |
421 | all_boxes = test_net(args.save_folder, net, args.cuda, dataset,args.top_k, 300,
422 | thresh=args.confidence_threshold)
423 | print('Evaluating detections')
424 | result = evaluate_detections(devkit_path,all_boxes, dataset,'test')
425 |
--------------------------------------------------------------------------------
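A self-contained replay of the one-vs-many IoU block inside `voc_eval` above: a single detection `bb` is intersected with every ground-truth box in `BBGT` at once via broadcasting:

```python
import numpy as np

bb = np.array([10., 10., 50., 50.])                 # one detection
BBGT = np.array([[12., 12., 48., 48.],              # overlapping gt
                 [60., 60., 90., 90.]])             # disjoint gt

ixmin = np.maximum(BBGT[:, 0], bb[0])
iymin = np.maximum(BBGT[:, 1], bb[1])
ixmax = np.minimum(BBGT[:, 2], bb[2])
iymax = np.minimum(BBGT[:, 3], bb[3])
iw = np.maximum(ixmax - ixmin, 0.)                  # clamp: no overlap -> 0
ih = np.maximum(iymax - iymin, 0.)
inters = iw * ih
uni = ((bb[2] - bb[0]) * (bb[3] - bb[1])
       + (BBGT[:, 2] - BBGT[:, 0]) * (BBGT[:, 3] - BBGT[:, 1]) - inters)
print(inters / uni)   # [0.81, 0.0]
```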
/tools/test.py:
--------------------------------------------------------------------------------
1 | from __future__ import print_function
2 | import sys
3 | import os
4 | sys.path.append(os.getcwd())
5 | import PIL
6 | from PIL import Image,ImageDraw,ImageFont
7 | import argparse
8 | import torch
9 | import torch.nn as nn
10 | import torchvision.transforms as transforms
11 | from data import VOC_CLASSES as labelmap
12 | from PIL import Image
13 | from data import VOCAnnotationTransform, VOCDetection, BaseTransform, VOC_CLASSES,CRACK_CLASSES,CRACKDetection
14 | import torch.utils.data as data
15 | from model import build_ssd
16 | from data import *
17 | from tqdm import tqdm
18 | import pandas as pd
19 | from matplotlib import pyplot as plot
20 | import numpy as np
21 | import matplotlib.pyplot as plt
22 | from config import voc
23 | parser = argparse.ArgumentParser(description='Single Shot MultiBox Detection')
24 | parser.add_argument('--trained_model', default='weights/ssd300_COCO_14100_0.9832380952380952.pth',
25 | type=str, help='Trained state_dict file path to open')
26 | parser.add_argument('--save_folder', default='result/', type=str,
27 | help='Dir to save results')
28 | parser.add_argument('--visual_threshold', default=0.6, type=float,
29 | help='Final confidence threshold')
30 | parser.add_argument('--cuda', default=True, type=lambda v: str(v).lower() in ("yes", "true", "t", "1"),
31 | help='Use cuda to train model')
32 | parser.add_argument('--voc_root', default=VOC_ROOT, help='Location of VOC root directory')
33 | parser.add_argument('--visbox', default=False, action='store_true', help="vis the boxes")
34 | args = parser.parse_args()
35 |
36 |
37 |
38 | def vis_image(img, ax=None):
39 | """Visualize a color image.
40 |
41 | Args:
42 | img (~numpy.ndarray): An array of shape :math:`(height, width, 3)`.
43 | This is in RGB format and the range of its value is
44 | :math:`[0, 255]`.
45 | ax (matplotlib.axes.Axis): The visualization is displayed on this
46 | axis. If this is :obj:`None` (default), a new axis is created.
47 |
48 | Returns:
50 | ~matplotlib.axes.Axes:
50 | Returns the Axes object with the plot for further tweaking.
51 |
52 | """
53 | #print(img.shape)
54 | if ax is None:
55 | fig = plot.figure()
56 | ax = fig.add_subplot(1, 1, 1)
57 | # CHW -> HWC
58 | #img = np.transpose(img,(1, 2, 0))
59 | ax.imshow(img.astype(np.uint8))
60 | return ax
61 |
62 |
63 | def vis_bbox(img, bbox, label=None, score=None, ax=None):
64 | """Visualize bounding boxes inside image.
65 |
66 | Args:
67 | img (~numpy.ndarray): An array of shape :math:`(height, width, 3)`.
68 | This is in RGB format and the range of its value is
69 | :math:`[0, 255]`.
70 | bbox (~numpy.ndarray): An array of shape :math:`(R, 4)`, where
71 | :math:`R` is the number of bounding boxes in the image.
72 | Each element is organized
73 | by :math:`(x_{min}, y_{min}, x_{max}, y_{max})` in the second axis.
74 | label (~numpy.ndarray): An integer array of shape :math:`(R,)`.
75 | The values correspond to id for label names stored in
76 | :obj:`label_names`. This is optional.
77 | score (~numpy.ndarray): A float array of shape :math:`(R,)`.
78 | Each value indicates how confident the prediction is.
79 | This is optional.
80 | label_names (iterable of strings): Name of labels ordered according
81 | to label ids. If this is :obj:`None`, labels will be skipped.
82 | ax (matplotlib.axes.Axis): The visualization is displayed on this
83 | axis. If this is :obj:`None` (default), a new axis is created.
84 |
85 | Returns:
86 | ~matplotlib.axes.Axes:
87 | Returns the Axes object with the plot for further tweaking.
88 |
89 | """
90 | #label_names = ['neg','bg']
91 | label_names = list(labelmap) + ['bg']
92 | # add for index `-1`
93 | if label is not None and not len(bbox) == len(label):
94 | raise ValueError('The length of label must be same as that of bbox')
95 | if score is not None and not len(bbox) == len(score):
96 | raise ValueError('The length of score must be same as that of bbox')
97 |
98 | # Returns newly instantiated matplotlib.axes.Axes object if ax is None
99 | ax = vis_image(img, ax=ax)
100 | # If there is no bounding box to display, visualize the image and exit.
101 | if len(bbox) == 0:
102 | return ax
103 |
104 | for i, bb in enumerate(bbox):
105 | xy = (bb[0], bb[1])
106 | width = bb[2] - bb[0]
107 | height = bb[3] - bb[1]
108 |
109 | ax.add_patch(plot.Rectangle(
110 | xy, width, height, fill=False, edgecolor='red', linewidth=2))
111 | #plt.text(bb[0],bb[1],score,family='fantasy',fontsize=36,style='italic',color='mediumvioletred')
112 | caption = list()
113 |
114 | if label is not None and label_names is not None:
115 | lb = label[i]
116 | if not (-1 <= lb < len(label_names)): # modify here to add background
117 | raise ValueError('No corresponding name is given')
118 | caption.append(label_names[lb])
119 | if score is not None:
120 | sc = score[i]
121 | caption.append('{:.2f}'.format(sc))
122 |
123 | if len(caption) > 0:
124 | ax.text(bb[0], bb[1],
125 | ': '.join(caption),
126 | style='italic',
127 | bbox={'facecolor': 'white', 'alpha': 0.5, 'pad': 0})
128 | return ax
129 |
130 | def test_net(save_folder, net, cuda, testset, transform, thresh):
131 | # dump predictions and assoc. ground truth to text file for now
132 | filename = save_folder+'test.txt'
133 | num_images = len(testset)
134 | for i in range(num_images):
135 | print('Testing image {:d}/{:d}....'.format(i+1, num_images))
136 | img = testset.pull_image(i)
137 | img_id, annotation = testset.pull_anno(i)
138 | x = torch.from_numpy(transform(img)[0]).permute(2, 0, 1)
139 | x = x.unsqueeze(0)
140 |
141 | with open(filename, mode='a') as f:
142 | f.write('\nGROUND TRUTH FOR: '+img_id+'\n')
143 | for box in annotation:
144 | f.write('label: '+' || '.join(str(b) for b in box)+'\n')
145 | if cuda:
146 | x = x.cuda()
147 |
148 | y = net(x,'test') # forward pass
149 | detections = y.data # [batch_size, num_classes, top_k, conf+loc] = [1, 21, 200, 5]
150 | # scale each detection back up to the image
151 | scale = torch.Tensor([img.shape[1], img.shape[0],
152 | img.shape[1], img.shape[0]])
153 | if args.visbox:
154 | boxs = detections[0,1:,:,:]
155 | #print(boxs.shape)
156 | boxs = boxs[boxs[:,:,0]>args.visual_threshold]
157 | #print(boxs.shape)
158 | for t in range(21):
159 | boxes = detections[0,t,:,:]
160 | for gg in range(200):
161 | if boxes[gg,0]>=args.visual_threshold:
162 | tt= boxes[gg,:]
163 | print(tt)
164 | with open(r'/mnt/home/test_ciou.txt','a') as f1:
165 | f1.write(str(i))
166 | f1.write(' ')
167 | f1.write(str(t))
168 | f1.write(' ')
169 | f1.write(str(tt))
170 | f1.write('\n')
171 | continue
172 | #print(boxs)
173 | if boxs.shape[0] != 0:
174 | boxs= boxs[:,1:]
175 | vis_bbox(np.array(img),boxs*scale)
176 | #x1=boxs[:,0]
177 | #y2=boxs[:,1]
178 | #x2=boxs[:,2]
179 | #y2=boxs[:,3]
180 | #print(y2)
181 | #r=boxs.shape
182 | #print(r[0])
183 | #plt.text(bb[0],bb[1],score,family='fantasy',fontsize=36,style='italic',color='mediumvioletred')
184 | plt.axis('off')
185 | plt.savefig('/mnt/home/ciou/%d.png'%(i))
186 | plot.show()
187 | pred_num = 0
188 | for i in range(detections.size(1)):
189 | j = 0
190 |
191 | while detections[0, i, j, 0] >= 0.6:
192 | if pred_num == 0:
193 | with open(filename, mode='a') as f:
194 | f.write('PREDICTIONS: '+'\n')
195 |
196 |
197 | score = detections[0, i, j, 0]
198 | label_name = labelmap[i-1]
199 | pt = (detections[0, i, j, 1:]*scale).cpu().numpy()
200 | coords = (pt[0], pt[1], pt[2], pt[3])
201 |
202 | pred_num += 1
203 | with open(filename, mode='a') as f:
204 | f.write(str(pred_num)+' label: '+label_name+' score: ' +
205 | str(score) + ' '+' || '.join(str(c) for c in coords) + '\n')
206 | j += 1
207 |
208 |
209 | def test_voc():
210 | # load net
211 | net = build_ssd('test', 300, voc) # initialize SSD
212 | net.load_state_dict(torch.load(args.trained_model))
213 | net.eval()
214 | print('Finished loading model!')
215 | # load data
216 | testset = VOCDetection(args.voc_root, [('2007', 'test')],
217 | BaseTransform(300, voc['mean'],voc['std']))
218 | if args.cuda:
219 | net = net.cuda()
220 | #torch.backends.cudnn.benchmark = True
221 | # evaluation
222 | test_net(args.save_folder, net, args.cuda, testset,
223 | BaseTransform(300, voc['mean'],voc['std']),
224 | thresh=args.visual_threshold)
225 |
226 |
227 |
228 | if __name__ == '__main__':
229 |
230 | if args.cuda and torch.cuda.is_available():
231 | torch.set_default_tensor_type('torch.cuda.FloatTensor')
232 | else:
233 | torch.set_default_tensor_type('torch.FloatTensor')
234 |
235 | if not os.path.exists(args.save_folder):
236 | os.mkdir(args.save_folder)
237 | test_voc()
238 |
--------------------------------------------------------------------------------
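A minimal invocation sketch for the test script above. The flag spellings are
inferred from the args.* attributes used in test_net/test_voc and may differ
from the script's actual argparse definitions; the values are illustrative:

    python tools/test.py --trained_model work_dir/ssd300.pth --cuda True --visual_threshold 0.6

With args.visbox enabled, detections above the threshold are additionally
rendered with vis_bbox and written out as PNG images.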
/tools/train.py:
--------------------------------------------------------------------------------
1 | import os
2 | import sys
3 | sys.path.append(os.getcwd())
4 | from model import build_ssd
5 | from data import *
6 | from config import crack,voc
7 | from utils import MultiBoxLoss
8 |
9 |
10 | import time
11 | import torch
12 | import torch.nn as nn
13 | import torch.optim as optim
14 | import torch.backends.cudnn as cudnn
15 | import torch.nn.init as init
16 | import torch.utils.data as data
17 |
18 | import argparse
19 | from tqdm import tqdm
20 |
21 |
22 | def str2bool(v):
23 | return v.lower() in ("yes", "true", "t", "1")
24 | '''
25 | from eval import test_net
26 | '''
27 | parser = argparse.ArgumentParser(description=
28 | 'Single Shot MultiBox Detector Training With Pytorch')
29 | train_set = parser.add_mutually_exclusive_group()
30 | parser.add_argument('--dataset', default='VOC', choices=['VOC', 'COCO', 'CRACK', 'TRAFIC'],
31 | type=str, help='Dataset to train on: VOC, COCO, CRACK or TRAFIC')
32 | parser.add_argument('--basenet', default=None,#'vgg16_reducedfc.pth',
33 | help='Pretrained base model')
34 | parser.add_argument('--batch_size', default=32, type=int,
35 | help='Batch size for training')
36 | parser.add_argument('--max_epoch', default=232, type=int,
37 | help='Max Epoch for training')
38 | parser.add_argument('--resume', default=None, type=str,
39 | help='Checkpoint state_dict file to resume training from')
40 | parser.add_argument('--start_iter', default=0, type=int,
41 | help='Resume training at this iter')
42 | parser.add_argument('--num_workers', default=4, type=int,
43 | help='Number of workers used in dataloading')
44 | parser.add_argument('--cuda', default=True, type=str2bool,
45 | help='Use CUDA to train model')
46 | parser.add_argument('--lr', '--learning-rate', default=1e-3, type=float,
47 | help='initial learning rate')
48 | parser.add_argument('--momentum', default=0.9, type=float,
49 | help='Momentum value for optim')
50 | parser.add_argument('--weight_decay', default=5e-4, type=float,
51 | help='Weight decay for SGD')
52 | parser.add_argument('--gamma', default=0.1, type=float,
53 | help='Gamma update for SGD')
54 | parser.add_argument('--visdom', default='VOC', type=str,
55 | help='Use visdom for loss visualization (any non-empty value enables it)')
56 | parser.add_argument('--work_dir', default='work_dir/',
57 | help='Directory for saving checkpoint models')
58 |
59 | parser.add_argument('--weight', default=5, type=int)
60 | args = parser.parse_args()
61 |
62 | weight = args.weight
63 |
64 | if torch.cuda.is_available():
65 | if args.cuda:
66 | torch.set_default_tensor_type('torch.cuda.FloatTensor')
67 | if not args.cuda:
68 | print("WARNING: It looks like you have a CUDA device, but aren't " +
69 | "using CUDA.\nRun with --cuda for optimal training speed.")
70 | torch.set_default_tensor_type('torch.FloatTensor')
71 | else:
72 | torch.set_default_tensor_type('torch.FloatTensor')
73 |
74 | if not os.path.exists(args.work_dir):
75 | os.mkdir(args.work_dir)
76 |
77 |
78 | def data_eval(dataset, net): # unused: requires the 'from eval import test_net' commented out above
79 | return test_net('eval/', net, True, dataset,
80 | BaseTransform(trafic['min_dim'], MEANS), 5, 300,
81 | thresh=0.05)
82 |
83 |
84 | def train():
85 | '''
86 | get the dataset and dataloader
87 | '''
88 | print(args.dataset)
89 | if args.dataset == 'COCO':
90 | if not os.path.exists(COCO_ROOT):
91 | parser.error('Must specify dataset_root if specifying dataset')
92 |
93 | cfg = coco
94 | dataset = COCODetection(root=COCO_ROOT,
95 | transform=SSDAugmentation(cfg['min_dim'],
96 | MEANS),filename = 'train.txt')
97 | elif args.dataset == 'VOC':
98 | if not os.path.exists(VOC_ROOT):
99 | parser.error('Must specify dataset_root if specifying dataset')
100 |
101 | cfg = voc
102 | dataset = VOCDetection(root=VOC_ROOT,
103 | transform = SSDAugmentation(cfg['min_dim'],
104 | mean = cfg['mean'],std = cfg['std']))
105 | print(len(dataset))
106 | elif args.dataset == 'CRACK':
107 | if not os.path.exists(CRACK_ROOT):
108 | parser.error('Must specify dataset_root if specifying dataset')
109 |
110 | cfg = crack
111 | dataset = CRACKDetection(root=CRACK_ROOT,
112 | transform=SSDAugmentation(cfg['min_dim'],
113 | mean = cfg['mean'],std = cfg['std']))
114 |
115 | data_loader = data.DataLoader(dataset, args.batch_size,
116 | num_workers=args.num_workers,
117 | shuffle=True, collate_fn=detection_collate,
118 | pin_memory=True)
119 |
120 | # build the net and optionally load resumed weights
121 | ssd_net = build_ssd('train',size = cfg['min_dim'],cfg = cfg)
122 | '''
123 | for name,param in ssd_net.named_parameters():
124 | if param.requires_grad:
125 | print(name)
126 | '''
127 | if args.resume:
128 | print('Resuming training, loading {}...'.format(args.resume))
129 | ssd_net.load_state_dict(torch.load(args.resume))
130 |
131 | if args.cuda:
132 | net = ssd_net.cuda()
133 | net.train()
134 |
135 | #optimizer
136 | optimizer = optim.SGD(net.parameters(), lr=args.lr, momentum=args.momentum,
137 | weight_decay=args.weight_decay)
138 |
139 |
140 | # loss: SmoothL1 / IoU / GIoU / DIoU / CIoU, selected via cfg['losstype']
141 | print(cfg['losstype'])
142 | criterion = MultiBoxLoss(cfg = cfg,overlap_thresh = 0.5,
143 | prior_for_matching = True,bkg_label = 0,
144 | neg_mining = True, neg_pos = 3,neg_overlap = 0.5,
145 | encode_target = False, use_gpu = args.cuda,loss_name = cfg['losstype'])
146 |
147 | if args.visdom:
148 | import visdom
149 | viz = visdom.Visdom(env=cfg['work_name'])
150 | vis_title = 'SSD on ' + args.dataset
151 | vis_legend = ['Loc Loss', 'Conf Loss', 'Total Loss']
152 | iter_plot = create_vis_plot(viz,'Iteration', 'Loss', vis_title, vis_legend)
153 | epoch_plot = create_vis_plot(viz,'Epoch', 'Loss', vis_title+" epoch loss", vis_legend)
154 | #epoch_acc = create_acc_plot(viz,'Epoch', 'acc', args.dataset+" Acc",["Acc"])
155 |
156 |
157 |
158 |
159 |
160 | epoch_size = len(dataset) // args.batch_size
161 | print('Training SSD on:', dataset.name,epoch_size)
162 | iteration = args.start_iter
163 | step_index = 0
164 | loc_loss = 0
165 | conf_loss = 0
166 | for epoch in range(args.max_epoch):
167 | for ii, batch_iterator in tqdm(enumerate(data_loader)):
168 | iteration += 1
169 |
170 | if iteration in cfg['lr_steps']:
171 | step_index += 1
172 | adjust_learning_rate(optimizer, args.gamma, step_index)
173 |
174 | # load train data
175 | images, targets = batch_iterator
176 | #print(images,targets)
177 | if args.cuda:
178 | images = images.cuda()
179 | targets = [ann.cuda() for ann in targets]
180 | else:
181 | images = images
182 | targets = [ann for ann in targets]
183 | t0 = time.time()
184 | out = net(images,'train')
185 | optimizer.zero_grad()
186 | loss_l, loss_c = criterion(out, targets)
187 | loss = weight * loss_l + loss_c
188 | loss.backward()
189 | optimizer.step()
190 | t1 = time.time()
191 | loc_loss += loss_l.item()
192 | conf_loss += loss_c.item()
193 | #print(iteration)
194 | if iteration % 10 == 0:
195 | print('timer: %.4f sec.' % (t1 - t0))
196 | print('iter ' + repr(iteration) + ' || Loss: %.4f ||' % (loss.item()), end=' ')
197 |
198 |
199 | if args.visdom:
200 | if iteration>20 and iteration% 10 == 0:
201 | update_vis_plot(viz,iteration, loss_l.item(), loss_c.item(),
202 | iter_plot, epoch_plot, 'append')
203 |
204 | if epoch % 10 == 0 and epoch > 60: # save a checkpoint every 10 epochs once past epoch 60
205 | print('Saving state, iter:', iteration)
206 | #print('loss_l:'+weight * loss_l+', loss_c:'+'loss_c')
207 | save_folder = args.work_dir+cfg['work_name']
208 | if not os.path.exists(save_folder):
209 | os.mkdir(save_folder)
210 | torch.save(net.state_dict(),args.work_dir+cfg['work_name']+'/ssd'+
211 | repr(epoch)+'_.pth')
212 | if args.visdom:
213 | update_vis_plot(viz, epoch, loc_loss, conf_loss, epoch_plot, epoch_plot,
214 | 'append', epoch_size)
215 | loc_loss = 0
216 | conf_loss = 0
217 |
218 | torch.save(net.state_dict(),args.work_dir+cfg['work_name']+'/ssd'+repr(epoch)+ str(args.weight) +'_.pth')
219 |
220 | def adjust_learning_rate(optimizer, gamma, step):
221 | """Sets the learning rate to the initial LR decayed by 10 at every
222 | specified step
223 | # Adapted from PyTorch Imagenet example:
224 | # https://github.com/pytorch/examples/blob/master/imagenet/main.py
225 | """
226 | lr = args.lr * (gamma ** (step))
227 | for param_group in optimizer.param_groups:
228 | param_group['lr'] = lr
229 | print(param_group['lr'])
230 |
231 |
232 | def create_vis_plot(viz,_xlabel, _ylabel, _title, _legend):
233 | return viz.line(
234 | X=torch.zeros((1,)).cpu(),
235 | Y=torch.zeros((1, 3)).cpu(),
236 | opts=dict(
237 | xlabel=_xlabel,
238 | ylabel=_ylabel,
239 | title=_title,
240 | legend=_legend
241 | )
242 | )
243 |
244 | def create_acc_plot(viz,_xlabel, _ylabel, _title, _legend):
245 | return viz.line(
246 | X=torch.zeros((1,)).cpu(),
247 | Y=torch.zeros((1,)).cpu(),
248 | opts=dict(
249 | xlabel=_xlabel,
250 | ylabel=_ylabel,
251 | title=_title,
252 | legend=_legend
253 | )
254 | )
255 |
256 |
257 | def update_vis_plot(viz,iteration, loc, conf, window1, window2, update_type,
258 | epoch_size=1):
259 | viz.line(
260 | X=torch.ones((1, 3)).cpu() * iteration,
261 | Y=torch.Tensor([loc, conf, loc + conf]).unsqueeze(0).cpu() / epoch_size,
262 | win=window1,
263 | update=update_type
264 | )
265 |
266 |
267 | def update_acc_plot(viz,iteration,acc, window1,update_type,
268 | epoch_size=1):
269 | viz.line(
270 | X=torch.ones((1, 1)).cpu()*iteration,
271 | Y=torch.Tensor([acc]).unsqueeze(0).cpu(),
272 | win=window1,
273 | update=update_type
274 | )
275 | # initialize epoch plot on first iteration
276 | '''
277 | if iteration == 0:
278 | print(loc, conf, loc + conf)
279 | viz.line(
280 | X=torch.zeros((1, 3)).cpu(),
281 | Y=torch.Tensor([loc, conf, loc + conf]).unsqueeze(0).cpu(),
282 | win=window2,
283 | update=True
284 | )
285 | '''
286 | if __name__ == '__main__':
287 | train()
288 |
--------------------------------------------------------------------------------
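A typical training run for the script above, using the flags defined in its
argparse block (the values shown are illustrative):

    python tools/train.py --dataset VOC --batch_size 32 --lr 1e-3 --max_epoch 232 --weight 5 --cuda True

The --weight flag scales the localization term: the objective minimized in
train() is weight * loss_l + loss_c, with the regression flavour (SmoothL1,
IoU, GIoU, DIoU or CIoU) selected by cfg['losstype'].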
/utils/__init__.py:
--------------------------------------------------------------------------------
1 | from .box import *
2 | from .detection import *
3 | from .loss import *
4 |
--------------------------------------------------------------------------------
/utils/__pycache__/__init__.cpython-35.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Zzh-tju/DIoU-SSD-pytorch/cec038bc1057f0cd532752413b24924fde427f09/utils/__pycache__/__init__.cpython-35.pyc
--------------------------------------------------------------------------------
/utils/__pycache__/__init__.cpython-35.sublime-workspace:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Zzh-tju/DIoU-SSD-pytorch/cec038bc1057f0cd532752413b24924fde427f09/utils/__pycache__/__init__.cpython-35.sublime-workspace
--------------------------------------------------------------------------------
/utils/__pycache__/__init__.cpython-36.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Zzh-tju/DIoU-SSD-pytorch/cec038bc1057f0cd532752413b24924fde427f09/utils/__pycache__/__init__.cpython-36.pyc
--------------------------------------------------------------------------------
/utils/box/__init__.py:
--------------------------------------------------------------------------------
1 | from .prior_box import PriorBox
2 | from .box_utils import decode,nms, diounms
3 | from .box_utils import match, log_sum_exp,match_ious,bbox_overlaps_iou, bbox_overlaps_giou, bbox_overlaps_diou, bbox_overlaps_ciou
4 |
5 |
6 |
7 |
--------------------------------------------------------------------------------
/utils/box/__pycache__/__init__.cpython-35.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Zzh-tju/DIoU-SSD-pytorch/cec038bc1057f0cd532752413b24924fde427f09/utils/box/__pycache__/__init__.cpython-35.pyc
--------------------------------------------------------------------------------
/utils/box/__pycache__/__init__.cpython-36.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Zzh-tju/DIoU-SSD-pytorch/cec038bc1057f0cd532752413b24924fde427f09/utils/box/__pycache__/__init__.cpython-36.pyc
--------------------------------------------------------------------------------
/utils/box/__pycache__/box_utils.cpython-35.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Zzh-tju/DIoU-SSD-pytorch/cec038bc1057f0cd532752413b24924fde427f09/utils/box/__pycache__/box_utils.cpython-35.pyc
--------------------------------------------------------------------------------
/utils/box/__pycache__/box_utils.cpython-36.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Zzh-tju/DIoU-SSD-pytorch/cec038bc1057f0cd532752413b24924fde427f09/utils/box/__pycache__/box_utils.cpython-36.pyc
--------------------------------------------------------------------------------
/utils/box/__pycache__/prior_box.cpython-35.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Zzh-tju/DIoU-SSD-pytorch/cec038bc1057f0cd532752413b24924fde427f09/utils/box/__pycache__/prior_box.cpython-35.pyc
--------------------------------------------------------------------------------
/utils/box/__pycache__/prior_box.cpython-36.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Zzh-tju/DIoU-SSD-pytorch/cec038bc1057f0cd532752413b24924fde427f09/utils/box/__pycache__/prior_box.cpython-36.pyc
--------------------------------------------------------------------------------
/utils/box/box_utils.py:
--------------------------------------------------------------------------------
1 | # -*- coding: utf-8 -*-
2 | import torch
3 | import math
4 |
5 | def bbox_overlaps_diou(bboxes1, bboxes2):
6 | # elementwise DIoU for aligned/broadcastable (x1, y1, x2, y2) box sets
7 | rows = bboxes1.shape[0]
8 | cols = bboxes2.shape[0]
9 | dious = torch.zeros((rows, cols))
10 | if rows * cols == 0:
11 | return dious
12 | exchange = False
13 | if bboxes1.shape[0] > bboxes2.shape[0]:
14 | bboxes1, bboxes2 = bboxes2, bboxes1
15 | dious = torch.zeros((cols, rows))
16 | exchange = True
17 |
18 | w1 = bboxes1[:, 2] - bboxes1[:, 0]
19 | h1 = bboxes1[:, 3] - bboxes1[:, 1]
20 | w2 = bboxes2[:, 2] - bboxes2[:, 0]
21 | h2 = bboxes2[:, 3] - bboxes2[:, 1]
22 |
23 | area1 = w1 * h1
24 | area2 = w2 * h2
25 | center_x1 = (bboxes1[:, 2] + bboxes1[:, 0]) / 2
26 | center_y1 = (bboxes1[:, 3] + bboxes1[:, 1]) / 2
27 | center_x2 = (bboxes2[:, 2] + bboxes2[:, 0]) / 2
28 | center_y2 = (bboxes2[:, 3] + bboxes2[:, 1]) / 2
29 |
30 | inter_max_xy = torch.min(bboxes1[:, 2:],bboxes2[:, 2:])
31 | inter_min_xy = torch.max(bboxes1[:, :2],bboxes2[:, :2])
32 | out_max_xy = torch.max(bboxes1[:, 2:],bboxes2[:, 2:])
33 | out_min_xy = torch.min(bboxes1[:, :2],bboxes2[:, :2])
34 |
35 | inter = torch.clamp((inter_max_xy - inter_min_xy), min=0)
36 | inter_area = inter[:, 0] * inter[:, 1]
37 | inter_diag = (center_x2 - center_x1)**2 + (center_y2 - center_y1)**2
38 | outer = torch.clamp((out_max_xy - out_min_xy), min=0)
39 | outer_diag = (outer[:, 0] ** 2) + (outer[:, 1] ** 2)
40 | union = area1+area2-inter_area
41 | dious = inter_area / union - (inter_diag) / outer_diag
42 | dious = torch.clamp(dious, min=-1.0, max=1.0)
43 | if exchange:
44 | dious = dious.T
45 | return dious
46 |
47 | def bbox_overlaps_ciou(bboxes1, bboxes2): # elementwise CIoU for aligned boxes
48 | rows = bboxes1.shape[0]
49 | cols = bboxes2.shape[0]
50 | cious = torch.zeros((rows, cols))
51 | if rows * cols == 0:
52 | return cious
53 | exchange = False
54 | if bboxes1.shape[0] > bboxes2.shape[0]:
55 | bboxes1, bboxes2 = bboxes2, bboxes1
56 | cious = torch.zeros((cols, rows))
57 | exchange = True
58 |
59 | w1 = bboxes1[:, 2] - bboxes1[:, 0]
60 | h1 = bboxes1[:, 3] - bboxes1[:, 1]
61 | w2 = bboxes2[:, 2] - bboxes2[:, 0]
62 | h2 = bboxes2[:, 3] - bboxes2[:, 1]
63 |
64 | area1 = w1 * h1
65 | area2 = w2 * h2
66 |
67 | center_x1 = (bboxes1[:, 2] + bboxes1[:, 0]) / 2
68 | center_y1 = (bboxes1[:, 3] + bboxes1[:, 1]) / 2
69 | center_x2 = (bboxes2[:, 2] + bboxes2[:, 0]) / 2
70 | center_y2 = (bboxes2[:, 3] + bboxes2[:, 1]) / 2
71 |
72 | inter_max_xy = torch.min(bboxes1[:, 2:],bboxes2[:, 2:])
73 | inter_min_xy = torch.max(bboxes1[:, :2],bboxes2[:, :2])
74 | out_max_xy = torch.max(bboxes1[:, 2:],bboxes2[:, 2:])
75 | out_min_xy = torch.min(bboxes1[:, :2],bboxes2[:, :2])
76 |
77 | inter = torch.clamp((inter_max_xy - inter_min_xy), min=0)
78 | inter_area = inter[:, 0] * inter[:, 1]
79 | inter_diag = (center_x2 - center_x1)**2 + (center_y2 - center_y1)**2
80 | outer = torch.clamp((out_max_xy - out_min_xy), min=0)
81 | outer_diag = (outer[:, 0] ** 2) + (outer[:, 1] ** 2)
82 | union = area1+area2-inter_area
83 | u = (inter_diag) / outer_diag
84 | iou = inter_area / union
85 | v = (4 / (math.pi ** 2)) * torch.pow((torch.atan(w2 / h2) - torch.atan(w1 / h1)), 2)
86 | with torch.no_grad():
87 | S = 1 - iou
88 | alpha = v / (S + v)
89 | cious = iou - (u + alpha * v)
90 | cious = torch.clamp(cious, min=-1.0, max=1.0)
91 | if exchange:
92 | cious = cious.T
93 | return cious
94 |
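   | # Worked check for the pairwise DIoU/CIoU above (illustrative values):
   | # for b1 = [[0., 0., 2., 2.]] and b2 = [[1., 1., 3., 3.]], inter = 1 and
   | # union = 7, so IoU = 1/7 ≈ 0.1429; the squared centre distance is 2 and
   | # the squared enclosing diagonal is 18, so DIoU ≈ 0.1429 - 0.1111 ≈ 0.0317.
   | # Both boxes are square, so the aspect-ratio term v = 0 and CIoU = DIoU here.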
95 | def bbox_overlaps_iou(bboxes1, bboxes2): # elementwise IoU for aligned boxes
96 | rows = bboxes1.shape[0]
97 | cols = bboxes2.shape[0]
98 | ious = torch.zeros((rows, cols))
99 | if rows * cols == 0:
100 | return ious
101 | exchange = False
102 | if bboxes1.shape[0] > bboxes2.shape[0]:
103 | bboxes1, bboxes2 = bboxes2, bboxes1
104 | ious = torch.zeros((cols, rows))
105 | exchange = True
106 | area1 = (bboxes1[:, 2] - bboxes1[:, 0]) * (
107 | bboxes1[:, 3] - bboxes1[:, 1])
108 | area2 = (bboxes2[:, 2] - bboxes2[:, 0]) * (
109 | bboxes2[:, 3] - bboxes2[:, 1])
110 |
111 | inter_max_xy = torch.min(bboxes1[:, 2:],bboxes2[:, 2:])
112 | inter_min_xy = torch.max(bboxes1[:, :2],bboxes2[:, :2])
113 |
114 | inter = torch.clamp((inter_max_xy - inter_min_xy), min=0)
115 | inter_area = inter[:, 0] * inter[:, 1]
116 | union = area1+area2-inter_area
117 | ious = inter_area / union
118 | ious = torch.clamp(ious, min=0, max=1.0)
119 | if exchange:
120 | ious = ious.T
121 | return ious
122 |
123 | def bbox_overlaps_giou(bboxes1, bboxes2): # elementwise GIoU for aligned boxes
124 | rows = bboxes1.shape[0]
125 | cols = bboxes2.shape[0]
126 | ious = torch.zeros((rows, cols))
127 | if rows * cols == 0:
128 | return ious
129 | exchange = False
130 | if bboxes1.shape[0] > bboxes2.shape[0]:
131 | bboxes1, bboxes2 = bboxes2, bboxes1
132 | ious = torch.zeros((cols, rows))
133 | exchange = True
134 | area1 = (bboxes1[:, 2] - bboxes1[:, 0]) * (
135 | bboxes1[:, 3] - bboxes1[:, 1])
136 | area2 = (bboxes2[:, 2] - bboxes2[:, 0]) * (
137 | bboxes2[:, 3] - bboxes2[:, 1])
138 |
139 | inter_max_xy = torch.min(bboxes1[:, 2:],bboxes2[:, 2:])
140 |
141 | inter_min_xy = torch.max(bboxes1[:, :2],bboxes2[:, :2])
142 |
143 | out_max_xy = torch.max(bboxes1[:, 2:],bboxes2[:, 2:])
144 |
145 | out_min_xy = torch.min(bboxes1[:, :2],bboxes2[:, :2])
146 |
147 | inter = torch.clamp((inter_max_xy - inter_min_xy), min=0)
148 | inter_area = inter[:, 0] * inter[:, 1]
149 | outer = torch.clamp((out_max_xy - out_min_xy), min=0)
150 | outer_area = outer[:, 0] * outer[:, 1]
151 | union = area1+area2-inter_area
152 | closure = outer_area
153 |
154 | ious = inter_area / union - (closure - union) / closure
155 | ious = torch.clamp(ious, min=-1.0, max=1.0)
156 | if exchange:
157 | ious = ious.T
158 | return ious
159 |
160 | def point_form(boxes):
161 | """ Convert prior_boxes to (xmin, ymin, xmax, ymax)
162 | representation for comparison to point form ground truth data.
163 | Args:
164 | boxes: (tensor) center-size default boxes from priorbox layers.
165 | Return:
166 | boxes: (tensor) Converted xmin, ymin, xmax, ymax form of boxes.
167 | """
168 | #print(boxes)
169 | return torch.cat((boxes[:, :2] - boxes[:, 2:]/2, # xmin, ymin
170 | boxes[:, :2] + boxes[:, 2:]/2), 1) # xmax, ymax
171 |
172 |
173 | def center_size(boxes):
174 | """ Convert prior_boxes to (cx, cy, w, h)
175 | representation for comparison to center-size form ground truth data.
176 | Args:
177 | boxes: (tensor) point_form boxes
178 | Return:
179 | boxes: (tensor) Converted (cx, cy, w, h) form of boxes.
180 | """
181 | return torch.cat(((boxes[:, 2:] + boxes[:, :2])/2, # cx, cy
182 | boxes[:, 2:] - boxes[:, :2]), 1) # w, h
183 |
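   | # e.g. center_size(torch.tensor([[0., 0., 2., 4.]])) -> tensor([[1., 2., 2., 4.]])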
184 |
185 | def intersect(box_a, box_b):
186 | """ We resize both tensors to [A,B,2] without new malloc:
187 | [A,2] -> [A,1,2] -> [A,B,2]
188 | [B,2] -> [1,B,2] -> [A,B,2]
189 | Then we compute the area of intersect between box_a and box_b.
190 | Args:
191 | box_a: (tensor) bounding boxes, Shape: [A,4].
192 | box_b: (tensor) bounding boxes, Shape: [B,4].
193 | Return:
194 | (tensor) intersection area, Shape: [A,B].
195 | """
196 | #print(box_a)
197 | #print(box_b)
198 | A = box_a.size(0)
199 | B = box_b.size(0)
200 | max_xy = torch.min(box_a[:, 2:].unsqueeze(1).expand(A, B, 2),
201 | box_b[:, 2:].unsqueeze(0).expand(A, B, 2))
202 | min_xy = torch.max(box_a[:, :2].unsqueeze(1).expand(A, B, 2),
203 | box_b[:, :2].unsqueeze(0).expand(A, B, 2))
204 | inter = torch.clamp((max_xy - min_xy), min=0)
205 | return inter[:, :, 0] * inter[:, :, 1]
206 |
207 |
208 | def jaccard(box_a, box_b):
209 | """Compute the jaccard overlap of two sets of boxes. The jaccard overlap
210 | is simply the intersection over union of two boxes. Here we operate on
211 | ground truth boxes and default boxes.
212 | E.g.:
213 | A ∩ B / A ∪ B = A ∩ B / (area(A) + area(B) - A ∩ B)
214 | Args:
215 | box_a: (tensor) Ground truth bounding boxes, Shape: [num_objects,4]
216 | box_b: (tensor) Prior boxes from priorbox layers, Shape: [num_priors,4]
217 | Return:
218 | jaccard overlap: (tensor) Shape: [box_a.size(0), box_b.size(0)]
219 | """
220 | inter = intersect(box_a, box_b)
221 | area_a = ((box_a[:, 2]-box_a[:, 0]) *
222 | (box_a[:, 3]-box_a[:, 1])).unsqueeze(1).expand_as(inter) # [A,B]
223 | area_b = ((box_b[:, 2]-box_b[:, 0]) *
224 | (box_b[:, 3]-box_b[:, 1])).unsqueeze(0).expand_as(inter) # [A,B]
225 | union = area_a + area_b - inter
226 | return inter / union # [A,B]
227 |
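   | # e.g. jaccard(torch.tensor([[0., 0., 2., 2.]]),
   | #             torch.tensor([[1., 1., 3., 3.]])) -> tensor([[0.1429]])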
228 |
229 | def match_ious(threshold, truths, priors, variances, labels, loc_t, conf_t, idx):
230 | """Match each prior box with the ground truth box of the highest jaccard
231 | overlap, keep the matched boxes in point form (no offset encoding; the
232 | IoU-based losses need raw coordinates), then return the matched indices.
233 | Args:
234 | threshold: (float) The overlap threshold used when matching boxes.
235 | truths: (tensor) Ground truth boxes, Shape: [num_obj, 4].
236 | priors: (tensor) Prior boxes from priorbox layers, Shape: [n_priors,4].
237 | variances: (tensor) Variances corresponding to each prior coord,
238 | Shape: [num_priors, 4].
239 | labels: (tensor) All the class labels for the image, Shape: [num_obj].
240 | loc_t: (tensor) Tensor to be filled w/ matched point-form location targets.
241 | conf_t: (tensor) Tensor to be filled w/ matched indices for conf preds.
242 | idx: (int) current batch index
243 | Return:
244 | The matched indices corresponding to 1)location and 2)confidence preds.
245 | """
246 | # jaccard index
247 | loc_t[idx] = point_form(priors) # placeholder; overwritten with matches below
248 | overlaps = jaccard(
249 | truths,
250 | point_form(priors)
251 | )
252 | # (Bipartite Matching)
253 | # [1,num_objects] best prior for each ground truth
254 | best_prior_overlap, best_prior_idx = overlaps.max(1, keepdim=True)
255 | # [1,num_priors] best ground truth for each prior
256 | best_truth_overlap, best_truth_idx = overlaps.max(0, keepdim=True)
257 | best_truth_idx.squeeze_(0)
258 | best_truth_overlap.squeeze_(0)
259 | best_prior_idx.squeeze_(1)
260 | best_prior_overlap.squeeze_(1)
261 | best_truth_overlap.index_fill_(0, best_prior_idx, 2) # ensure best prior
262 | # TODO refactor: index best_prior_idx with long tensor
263 | # ensure every gt matches with its prior of max overlap
264 | for j in range(best_prior_idx.size(0)):
265 | best_truth_idx[best_prior_idx[j]] = j
266 | matches = truths[best_truth_idx] # Shape: [num_priors,4]
267 | conf = labels[best_truth_idx] + 1 # Shape: [num_priors]
268 | conf[best_truth_overlap < threshold] = 0 # label as background
269 | loc_t[idx] = matches # [num_priors,4] encoded offsets to learn
270 | conf_t[idx] = conf # [num_priors] top class label for each prior
271 |
272 |
273 | def match(threshold, truths, priors, variances, labels, loc_t, conf_t, idx):
274 | """Match each prior box with the ground truth box of the highest jaccard
275 | overlap, encode the bounding boxes, then return the matched indices
276 | corresponding to both confidence and location preds.
277 | Args:
278 | threshold: (float) The overlap threshold used when matching boxes.
279 | truths: (tensor) Ground truth boxes, Shape: [num_obj, 4].
280 | priors: (tensor) Prior boxes from priorbox layers, Shape: [n_priors,4].
281 | variances: (tensor) Variances corresponding to each prior coord,
282 | Shape: [num_priors, 4].
283 | labels: (tensor) All the class labels for the image, Shape: [num_obj].
284 | loc_t: (tensor) Tensor to be filled w/ encoded location targets.
285 | conf_t: (tensor) Tensor to be filled w/ matched indices for conf preds.
286 | idx: (int) current batch index
287 | Return:
288 | The matched indices corresponding to 1)location and 2)confidence preds.
289 | """
290 | # jaccard index
291 | overlaps = jaccard(
292 | truths,
293 | point_form(priors)
294 | )
295 | # (Bipartite Matching)
296 | # [1,num_objects] best prior for each ground truth
297 | best_prior_overlap, best_prior_idx = overlaps.max(1, keepdim=True)
298 | # [1,num_priors] best ground truth for each prior
299 | best_truth_overlap, best_truth_idx = overlaps.max(0, keepdim=True)
300 | best_truth_idx.squeeze_(0)
301 | best_truth_overlap.squeeze_(0)
302 | best_prior_idx.squeeze_(1)
303 | best_prior_overlap.squeeze_(1)
304 | best_truth_overlap.index_fill_(0, best_prior_idx, 2) # ensure best prior
305 | # TODO refactor: index best_prior_idx with long tensor
306 | # ensure every gt matches with its prior of max overlap
307 | for j in range(best_prior_idx.size(0)):
308 | best_truth_idx[best_prior_idx[j]] = j
309 | matches = truths[best_truth_idx] # Shape: [num_priors,4]
310 | conf = labels[best_truth_idx] + 1 # Shape: [num_priors]
311 | conf[best_truth_overlap < threshold] = 0 # label as background
312 | loc = encode(matches, priors, variances)
313 | loc_t[idx] = loc # [num_priors,4] encoded offsets to learn
314 | conf_t[idx] = conf # [num_priors] top class label for each prior
315 |
316 |
317 | def encode(matched, priors, variances):
318 | """Encode the variances from the priorbox layers into the ground truth boxes
319 | we have matched (based on jaccard overlap) with the prior boxes.
320 | Args:
321 | matched: (tensor) Coords of ground truth for each prior in point-form
322 | Shape: [num_priors, 4].
323 | priors: (tensor) Prior boxes in center-offset form
324 | Shape: [num_priors,4].
325 | variances: (list[float]) Variances of priorboxes
326 | Return:
327 | encoded boxes (tensor), Shape: [num_priors, 4]
328 | """
329 |
330 | # dist b/t match center and prior's center
331 | g_cxcy = (matched[:, :2] + matched[:, 2:])/2 - priors[:, :2]
332 | # encode variance
333 | g_cxcy /= (variances[0] * priors[:, 2:])
334 | # match wh / prior wh
335 | g_wh = (matched[:, 2:] - matched[:, :2]) / priors[:, 2:]
336 | g_wh = torch.log(g_wh) / variances[1]
337 | # return target for smooth_l1_loss
338 | return torch.cat([g_cxcy, g_wh], 1) # [num_priors,4]
339 |
340 |
341 | # Adapted from https://github.com/Hakuyume/chainer-ssd
342 | def decode(loc, priors, variances):
343 | """Decode locations from predictions using priors to undo
344 | the encoding we did for offset regression at train time.
345 | Args:
346 | loc (tensor): location predictions for loc layers,
347 | Shape: [num_priors,4]
348 | priors (tensor): Prior boxes in center-offset form.
349 | Shape: [num_priors,4].
350 | variances: (list[float]) Variances of priorboxes
351 | Return:
352 | decoded bounding box predictions
353 | """
354 |
355 | boxes = torch.cat((
356 | priors[:, :2] + loc[:, :2] * variances[0] * priors[:, 2:],
357 | priors[:, 2:] * torch.exp(loc[:, 2:] * variances[1])), 1)
358 | boxes[:, :2] -= boxes[:, 2:] / 2
359 | boxes[:, 2:] += boxes[:, :2]
360 | #print(boxes)
361 | return boxes
362 |
363 |
364 | def log_sum_exp(x):
365 |     """Utility function for computing log_sum_exp in a numerically stable way.
366 |     This will be used to determine the unaveraged confidence loss across
367 |     all examples in a batch.
368 | Args:
369 | x (Variable(tensor)): conf_preds from conf layers
370 | """
371 | x_max = x.data.max()
372 | return torch.log(torch.sum(torch.exp(x-x_max), 1, keepdim=True)) + x_max
373 |
374 |
375 | # Original author: Francisco Massa:
376 | # https://github.com/fmassa/object-detection.torch
377 | # Ported to PyTorch by Max deGroot (02/01/2017)
378 | def nms(boxes, scores, overlap=0.5, top_k=200):
379 | """Apply non-maximum suppression at test time to avoid detecting too many
380 | overlapping bounding boxes for a given object.
381 | Args:
382 | boxes: (tensor) The location preds for the img, Shape: [num_priors,4].
383 |         scores: (tensor) The class pred scores for the img, Shape: [num_priors].
384 |         overlap: (float) The overlap thresh for suppressing unnecessary boxes.
385 |         top_k: (int) The maximum number of box preds to consider.
386 | Return:
387 | The indices of the kept boxes with respect to num_priors.
388 | """
389 |
390 | keep = scores.new(scores.size(0)).zero_().long()
391 | if boxes.numel() == 0:
392 | return keep
393 | x1 = boxes[:, 0]
394 | y1 = boxes[:, 1]
395 | x2 = boxes[:, 2]
396 | y2 = boxes[:, 3]
397 | area = torch.mul(x2 - x1, y2 - y1)
398 | v, idx = scores.sort(0) # sort in ascending order
399 | # I = I[v >= 0.01]
400 | idx = idx[-top_k:] # indices of the top-k largest vals
401 | xx1 = boxes.new()
402 | yy1 = boxes.new()
403 | xx2 = boxes.new()
404 | yy2 = boxes.new()
405 | w = boxes.new()
406 | h = boxes.new()
407 |
408 | # keep = torch.Tensor()
409 | count = 0
410 | while idx.numel() > 0:
411 | i = idx[-1] # index of current largest val
412 | # keep.append(i)
413 | keep[count] = i
414 | count += 1
415 | if idx.size(0) == 1:
416 | break
417 | idx = idx[:-1] # remove kept element from view
418 | # load bboxes of next highest vals
419 | torch.index_select(x1, 0, idx, out=xx1)
420 | torch.index_select(y1, 0, idx, out=yy1)
421 | torch.index_select(x2, 0, idx, out=xx2)
422 | torch.index_select(y2, 0, idx, out=yy2)
423 | # store element-wise max with next highest score
424 | xx1 = torch.clamp(xx1, min=x1[i])
425 | yy1 = torch.clamp(yy1, min=y1[i])
426 | xx2 = torch.clamp(xx2, max=x2[i])
427 | yy2 = torch.clamp(yy2, max=y2[i])
428 | w.resize_as_(xx2)
429 | h.resize_as_(yy2)
430 | w = xx2 - xx1
431 | h = yy2 - yy1
432 | # check sizes of xx1 and xx2.. after each iteration
433 | w = torch.clamp(w, min=0.0)
434 | h = torch.clamp(h, min=0.0)
435 | inter = w*h
436 | # IoU = i / (area(a) + area(b) - i)
437 |         rem_areas = torch.index_select(area, 0, idx)  # load remaining areas
438 | union = (rem_areas - inter) + area[i]
439 | IoU = inter/union # store result in iou
440 | # keep only elements with an IoU <= overlap
441 | idx = idx[IoU.le(overlap)]
442 | return keep, count
443 |
444 | def diounms(boxes, scores, overlap=0.5, top_k=200, beta1=1.0):
445 | """Apply DIoU-NMS at test time to avoid detecting too many
446 | overlapping bounding boxes for a given object.
447 | Args:
448 | boxes: (tensor) The location preds for the img, Shape: [num_priors,4].
449 |         scores: (tensor) The class pred scores for the img, Shape: [num_priors].
450 |         overlap: (float) The overlap thresh for suppressing unnecessary boxes.
451 |         top_k: (int) The maximum number of box preds to consider.
452 |         beta1: (float) penalty exponent: DIoU = IoU - (d^2/c^2)^{beta1}.
453 | Return:
454 | The indices of the kept boxes with respect to num_priors.
455 | """
456 |
457 | keep = scores.new(scores.size(0)).zero_().long()
458 | if boxes.numel() == 0:
459 | return keep
460 | x1 = boxes[:, 0]
461 | y1 = boxes[:, 1]
462 | x2 = boxes[:, 2]
463 | y2 = boxes[:, 3]
464 | area = torch.mul(x2 - x1, y2 - y1)
465 | v, idx = scores.sort(0) # sort in ascending order
466 | # I = I[v >= 0.01]
467 | idx = idx[-top_k:] # indices of the top-k largest vals
468 | xx1 = boxes.new()
469 | yy1 = boxes.new()
470 | xx2 = boxes.new()
471 | yy2 = boxes.new()
472 | w = boxes.new()
473 | h = boxes.new()
474 |
475 | # keep = torch.Tensor()
476 | count = 0
477 | while idx.numel() > 0:
478 | i = idx[-1] # index of current largest val
479 | # keep.append(i)
480 | keep[count] = i
481 | count += 1
482 | if idx.size(0) == 1:
483 | break
484 | idx = idx[:-1] # remove kept element from view
485 | # load bboxes of next highest vals
486 | torch.index_select(x1, 0, idx, out=xx1)
487 | torch.index_select(y1, 0, idx, out=yy1)
488 | torch.index_select(x2, 0, idx, out=xx2)
489 | torch.index_select(y2, 0, idx, out=yy2)
490 | # store element-wise max with next highest score
491 | inx1 = torch.clamp(xx1, min=x1[i])
492 | iny1 = torch.clamp(yy1, min=y1[i])
493 | inx2 = torch.clamp(xx2, max=x2[i])
494 | iny2 = torch.clamp(yy2, max=y2[i])
495 | center_x1 = (x1[i] + x2[i]) / 2
496 | center_y1 = (y1[i] + y2[i]) / 2
497 | center_x2 = (xx1 + xx2) / 2
498 | center_y2 = (yy1 + yy2) / 2
499 | d = (center_x1 - center_x2) ** 2 + (center_y1 - center_y2) ** 2
500 | cx1 = torch.clamp(xx1, max=x1[i])
501 | cy1 = torch.clamp(yy1, max=y1[i])
502 | cx2 = torch.clamp(xx2, min=x2[i])
503 | cy2 = torch.clamp(yy2, min=y2[i])
504 | c = (cx2 - cx1) ** 2 + (cy2 - cy1) ** 2
505 |         u = d / c  # squared center distance over squared enclosing-box diagonal
506 | w.resize_as_(xx2)
507 | h.resize_as_(yy2)
508 | w = inx2 - inx1
509 | h = iny2 - iny1
510 | # check sizes of xx1 and xx2.. after each iteration
511 | w = torch.clamp(w, min=0.0)
512 | h = torch.clamp(h, min=0.0)
513 | inter = w*h
514 | # IoU = i / (area(a) + area(b) - i)
515 |         rem_areas = torch.index_select(area, 0, idx)  # load remaining areas
516 |         union = (rem_areas - inter) + area[i]
517 |         IoU = inter / union - u ** beta1  # DIoU = IoU - (d^2/c^2)^beta1
518 | # keep only elements with an IoU <= overlap
519 | idx = idx[IoU.le(overlap)]
520 | return keep, count
521 |
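522 | 
523 | # ---------------------------------------------------------------------------
524 | # A minimal usage sketch (appended for illustration; the values below are
525 | # made-up toy inputs, with the usual SSD variances [0.1, 0.2] assumed):
526 | # decode() should invert encode(), and diounms() should suppress the
527 | # near-duplicate box while keeping the distant one.
528 | if __name__ == '__main__':
529 |     variances = [0.1, 0.2]
530 |     priors = torch.tensor([[0.5, 0.5, 0.2, 0.2]])  # center-offset form
531 |     gt = torch.tensor([[0.45, 0.45, 0.65, 0.70]])  # point form
532 |     loc = encode(gt, priors, variances)
533 |     print(torch.allclose(decode(loc, priors, variances), gt, atol=1e-6))  # True
534 | 
535 |     boxes = torch.tensor([[0., 0., 10., 10.],
536 |                           [1., 1., 11., 11.],
537 |                           [20., 20., 30., 30.]])
538 |     scores = torch.tensor([0.9, 0.8, 0.7])
539 |     keep, count = diounms(boxes, scores, overlap=0.5, top_k=200)
540 |     print(keep[:count])  # tensor([0, 2]): box 1 is suppressed by box 0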
--------------------------------------------------------------------------------
/utils/box/prior_box.py:
--------------------------------------------------------------------------------
1 | from __future__ import division
2 | from math import sqrt as sqrt
3 | from itertools import product as product
4 | import torch
5 |
6 |
7 | class PriorBox(object):
8 | """Compute priorbox coordinates in center-offset form for each source
9 | feature map.
10 | """
11 | def __init__(self, cfg):
12 | super(PriorBox, self).__init__()
13 | self.image_size = cfg['min_dim']
14 |         # one entry per source feature map (each location gets 4 or 6 priors)
15 |         self.num_priors = len(cfg['aspect_ratios'])
16 | self.variance = cfg['variance'] or [0.1]
17 | self.feature_maps = cfg['feature_maps']
18 | self.min_sizes = cfg['min_sizes']
19 | self.max_sizes = cfg['max_sizes']
20 | self.steps = cfg['steps']
21 | self.aspect_ratios = cfg['aspect_ratios']
22 | self.clip = cfg['clip']
23 | self.version = cfg['name']
24 | for v in self.variance:
25 | if v <= 0:
26 | raise ValueError('Variances must be greater than 0')
27 |
28 | def forward(self):
29 | mean = []
30 | for k, f in enumerate(self.feature_maps):
31 | for i, j in product(range(f), repeat=2):
32 | f_k = self.image_size / self.steps[k]
33 | # unit center x,y
34 | cx = (j + 0.5) / f_k
35 | cy = (i + 0.5) / f_k
36 |
37 | # aspect_ratio: 1
38 | # rel size: min_size
39 | s_k = self.min_sizes[k]/self.image_size
40 | mean += [cx, cy, s_k, s_k]
41 |
42 | # aspect_ratio: 1
43 | # rel size: sqrt(s_k * s_(k+1))
44 | s_k_prime = sqrt(s_k * (self.max_sizes[k]/self.image_size))
45 | mean += [cx, cy, s_k_prime, s_k_prime]
46 |
47 | # rest of aspect ratios
48 | #print(self.aspect_ratios[k])
49 | for ar in self.aspect_ratios[k]:
50 | mean += [cx, cy, s_k*sqrt(ar), s_k/sqrt(ar)]
51 | mean += [cx, cy, s_k/sqrt(ar), s_k*sqrt(ar)]
52 | # back to torch land
53 | output = torch.Tensor(mean).view(-1, 4)
54 | if self.clip:
55 | output.clamp_(max=1, min=0)
56 | #print(output.shape)
57 | return output
58 |
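59 | 
60 | # ---------------------------------------------------------------------------
61 | # A usage sketch with the canonical SSD300/VOC settings (standard values from
62 | # the SSD paper, not necessarily identical to this repo's config/): PriorBox
63 | # should yield 38^2*4 + 19^2*6 + 10^2*6 + 5^2*6 + 3^2*4 + 1^2*4 = 8732 priors.
64 | if __name__ == '__main__':
65 |     cfg = {
66 |         'min_dim': 300, 'name': 'VOC', 'clip': True,
67 |         'feature_maps': [38, 19, 10, 5, 3, 1],
68 |         'steps': [8, 16, 32, 64, 100, 300],
69 |         'min_sizes': [30, 60, 111, 162, 213, 264],
70 |         'max_sizes': [60, 111, 162, 213, 264, 315],
71 |         'aspect_ratios': [[2], [2, 3], [2, 3], [2, 3], [2], [2]],
72 |         'variance': [0.1, 0.2],
73 |     }
74 |     print(PriorBox(cfg).forward().shape)  # torch.Size([8732, 4])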
--------------------------------------------------------------------------------
/utils/detection/__init__.py:
--------------------------------------------------------------------------------
1 | from .detection import Detect
2 |
3 |
--------------------------------------------------------------------------------
/utils/detection/__pycache__/__init__.cpython-35.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Zzh-tju/DIoU-SSD-pytorch/cec038bc1057f0cd532752413b24924fde427f09/utils/detection/__pycache__/__init__.cpython-35.pyc
--------------------------------------------------------------------------------
/utils/detection/__pycache__/__init__.cpython-36.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Zzh-tju/DIoU-SSD-pytorch/cec038bc1057f0cd532752413b24924fde427f09/utils/detection/__pycache__/__init__.cpython-36.pyc
--------------------------------------------------------------------------------
/utils/detection/__pycache__/detection.cpython-35.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Zzh-tju/DIoU-SSD-pytorch/cec038bc1057f0cd532752413b24924fde427f09/utils/detection/__pycache__/detection.cpython-35.pyc
--------------------------------------------------------------------------------
/utils/detection/__pycache__/detection.cpython-36.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Zzh-tju/DIoU-SSD-pytorch/cec038bc1057f0cd532752413b24924fde427f09/utils/detection/__pycache__/detection.cpython-36.pyc
--------------------------------------------------------------------------------
/utils/detection/detection.py:
--------------------------------------------------------------------------------
1 | import torch
2 | from torch.autograd import Function
3 | from ..box import decode, nms, diounms
4 |
5 | def intersect(box_a, box_b):
6 |
7 | n = box_a.size(0)
8 | A = box_a.size(1)
9 | B = box_b.size(1)
10 | max_xy = torch.min(box_a[:, :, 2:].unsqueeze(2).expand(n, A, B, 2),
11 | box_b[:, :, 2:].unsqueeze(1).expand(n, A, B, 2))
12 | min_xy = torch.max(box_a[:, :, :2].unsqueeze(2).expand(n, A, B, 2),
13 | box_b[:, :, :2].unsqueeze(1).expand(n, A, B, 2))
14 | inter = torch.clamp((max_xy - min_xy), min=0)
15 | return inter[:, :, :, 0] * inter[:, :, :, 1]
16 |
17 | def jaccard(box_a, box_b, iscrowd:bool=False):
18 |
19 | use_batch = True
20 | if box_a.dim() == 2:
21 | use_batch = False
22 | box_a = box_a[None, ...]
23 | box_b = box_b[None, ...]
24 |
25 | inter = intersect(box_a, box_b)
26 | area_a = ((box_a[:, :, 2]-box_a[:, :, 0]) *
27 | (box_a[:, :, 3]-box_a[:, :, 1])).unsqueeze(2).expand_as(inter) # [A,B]
28 | area_b = ((box_b[:, :, 2]-box_b[:, :, 0]) *
29 | (box_b[:, :, 3]-box_b[:, :, 1])).unsqueeze(1).expand_as(inter) # [A,B]
30 | union = area_a + area_b - inter
31 | out = inter / area_a if iscrowd else inter / union
32 |
33 | return out if use_batch else out.squeeze(0)
34 |
35 | def box_diou(boxes1, boxes2, beta):
36 |
37 | def box_area(box):
38 | # box = 4xn
39 | return (box[2] - box[0]) * (box[3] - box[1])
40 |
41 | area1 = box_area(boxes1.t())
42 | area2 = box_area(boxes2.t())
43 |
44 | lt = torch.max(boxes1[:, None, :2], boxes2[:, :2]) # [N,M,2]
45 | rb = torch.min(boxes1[:, None, 2:], boxes2[:, 2:]) # [N,M,2]
46 | clt=torch.min(boxes1[:, None, :2], boxes2[:, :2])
47 | crb=torch.max(boxes1[:, None, 2:], boxes2[:, 2:])
48 | x1=(boxes1[:, None, 0] + boxes1[:, None, 2])/2
49 | y1=(boxes1[:, None, 1] + boxes1[:, None, 3])/2
50 | x2=(boxes2[:, None, 0] + boxes2[:, None, 2])/2
51 | y2=(boxes2[:, None, 1] + boxes2[:, None, 3])/2
52 | d=(x1-x2.t())**2 + (y1-y2.t())**2
53 | c=((crb-clt)**2).sum(dim=2)
54 | inter = (rb - lt).clamp(min=0).prod(2) # [N,M]
55 | return inter / (area1[:, None] + area2 - inter) - (d / c) ** beta # iou = inter / (area1 + area2 - inter)
56 |
57 | class Detect(Function):
58 | """At test time, Detect is the final layer of SSD. Decode location preds,
59 | apply non-maximum suppression to location predictions based on conf
60 | scores and threshold to a top_k number of output predictions for both
61 | confidence score and locations.
62 | """
63 | def __init__(self, num_classes, bkg_label, top_k, conf_thresh, nms_thresh,variance, nms_kind, beta1):
64 | self.num_classes = num_classes
65 | self.background_label = bkg_label
66 | self.top_k = top_k
67 | # Parameters used in nms.
68 | self.nms_thresh = nms_thresh
69 | if nms_thresh <= 0:
70 |             raise ValueError('nms_threshold must be positive.')
71 | self.conf_thresh = conf_thresh
72 | self.variance = variance
73 | self.nms_kind = nms_kind
74 | self.beta1 = beta1
75 |
76 | def forward(self, loc_data, conf_data, prior_data):
77 | """
78 | Args:
79 | loc_data: (tensor) Loc preds from loc layers
80 | Shape: [batch,num_priors*4]
81 |             conf_data: (tensor) Conf preds from conf layers
82 | Shape: [batch*num_priors,num_classes]
83 | prior_data: (tensor) Prior boxes and variances from priorbox layers
84 | Shape: [1,num_priors,4]
85 |             nms_kind: one of 'cluster_nms', 'cluster_diounms', 'cluster_weighted_nms', 'cluster_weighted_diounms'
86 | """
87 | num = loc_data.size(0)
88 | num_priors = prior_data.size(0)
89 | output = torch.zeros(num, self.num_classes, self.top_k, 5)
90 | conf_preds = conf_data.view(num, num_priors,
91 | self.num_classes).transpose(2, 1)
92 |
93 | # Decode predictions into bboxes.
94 | for i in range(num):
95 | decoded_boxes = decode(loc_data[i], prior_data, self.variance)
96 | # For each class, perform nms
97 | conf_scores = conf_preds[i].clone()
98 | sort_scores, idx = conf_scores.sort(1, descending=True)
99 | c_mask = (sort_scores>=self.conf_thresh)[:,:self.top_k]
100 |
101 | s1,s2 = decoded_boxes.size()
102 | z = decoded_boxes[idx]
103 |
104 |             h = (torch.arange(0, self.num_classes).cuda()).float().unsqueeze(1).unsqueeze(1)
105 |             one = torch.ones(self.num_classes, s1, s2).cuda().mul(h)
106 |             boxes = z[:, :self.top_k][c_mask]  # [N,4] box
107 |             z = one * 2 + z  # per-class coordinate offset, so one NMS pass handles all classes
108 |
109 | boxes_batch = z[:,:self.top_k][c_mask] #[N,4] box with offset
110 |
111 | scores = sort_scores[:,:self.top_k][c_mask] #[N,1]
112 | classes = one[:,:self.top_k][c_mask][:,0] #[N,1]
113 |
114 |             # Fast NMS is not supported here, since it degrades performance.
115 |
116 |             if self.nms_kind == "cluster_nms" or self.nms_kind == "cluster_weighted_nms":
117 |                 iou = jaccard(boxes_batch, boxes_batch).triu_(diagonal=1)
118 |             elif self.nms_kind == "cluster_diounms" or self.nms_kind == "cluster_weighted_diounms":
119 |                 iou = box_diou(boxes_batch, boxes_batch, self.beta1).triu_(diagonal=1)
120 |             else:
121 |                 raise Exception("Currently, NMS only supports 'cluster_nms', 'cluster_diounms', "
122 |                                 "'cluster_weighted_nms', 'cluster_weighted_diounms'.")
123 | B = iou
124 |             for j in range(999):
125 |                 A = B
126 |                 maxA = A.max(dim=0)[0]
127 |                 E = (maxA <= self.nms_thresh).float().unsqueeze(1).expand_as(A)  # zero out rows of suppressed boxes
128 |                 B = iou.mul(E)
129 |                 if A.equal(B):
130 |                     break
131 |             keep = (maxA <= self.nms_thresh)
132 |             if self.nms_kind == "cluster_weighted_nms" or self.nms_kind == "cluster_weighted_diounms":
133 |                 n = len(scores)
134 |                 weights = (B * (B > 0.8).float() + torch.eye(n).cuda()) * (scores.reshape((1, n)))
135 | xx1 = boxes[:,0].expand(n,n)
136 | yy1 = boxes[:,1].expand(n,n)
137 | xx2 = boxes[:,2].expand(n,n)
138 | yy2 = boxes[:,3].expand(n,n)
139 |
140 | weightsum=weights.sum(dim=1)
141 | xx1 = (xx1*weights).sum(dim=1)/(weightsum)
142 | yy1 = (yy1*weights).sum(dim=1)/(weightsum)
143 | xx2 = (xx2*weights).sum(dim=1)/(weightsum)
144 | yy2 = (yy2*weights).sum(dim=1)/(weightsum)
145 | boxes = torch.stack([xx1, yy1, xx2, yy2], 1)
146 |
147 | boxes = boxes[keep]
148 | scores = scores[keep]
149 | classes = classes[keep]
150 |
151 | score_box = torch.cat((scores.unsqueeze(1),boxes), 1)
152 |
153 | for cl in range(1, self.num_classes):
154 | mask = (classes == cl)
155 | output[i, cl, :]=torch.cat((score_box[mask],output[i, cl, :]),0)[:self.top_k]
156 | return output
157 |
158 | def forward_traditional_nms(self, loc_data, conf_data, prior_data):
159 | """
160 | Args:
161 | loc_data: (tensor) Loc preds from loc layers
162 | Shape: [batch,num_priors*4]
163 | conf_data: (tensor) Shape: Conf preds from conf layers
164 | Shape: [batch*num_priors,num_classes]
165 | prior_data: (tensor) Prior boxes and variances from priorbox layers
166 | Shape: [1,num_priors,4]
167 | nms_kind: greedynms or diounms
168 | """
169 |
170 |         # This function is no longer supported, as it is extremely time-consuming.
171 |
172 | num = loc_data.size(0)
173 | num_priors = prior_data.size(0)
174 | output = torch.zeros(num, self.num_classes, self.top_k, 5)
175 | conf_preds = conf_data.view(num, num_priors,
176 | self.num_classes).transpose(2, 1)
177 |
178 | # Decode predictions into bboxes.
179 | for i in range(num):
180 | decoded_boxes = decode(loc_data[i], prior_data, self.variance)
181 | # For each class, perform nms
182 | conf_scores = conf_preds[i].clone()
183 |
184 | for cl in range(1, self.num_classes):
185 | c_mask = conf_scores[cl].gt(self.conf_thresh)
186 | scores = conf_scores[cl][c_mask]
187 | if scores.size(0) == 0:
188 | continue
189 | l_mask = c_mask.unsqueeze(1).expand_as(decoded_boxes)
190 | boxes = decoded_boxes[l_mask].view(-1, 4)
191 | # idx of highest scoring and non-overlapping boxes per class
192 |                 if self.nms_kind == "greedynms":
193 |                     ids, count = nms(boxes, scores, self.nms_thresh, self.top_k)
194 |                 elif self.nms_kind == "diounms":
195 |                     ids, count = diounms(boxes, scores, self.nms_thresh, self.top_k, self.beta1)
196 |                 else:
197 |                     # unknown nms_kind: fall back to greedy NMS
198 |                     print("use default greedy-NMS")
199 |                     ids, count = nms(boxes, scores, self.nms_thresh, self.top_k)
200 | output[i, cl, :count] = \
201 | torch.cat((scores[ids[:count]].unsqueeze(1),
202 | boxes[ids[:count]]), 1)
203 | flt = output.contiguous().view(num, -1, 5)
204 | _, idx = flt[:, :, 0].sort(1, descending=True)
205 | _, rank = idx.sort(1)
206 |         flt.masked_fill_((rank >= self.top_k).unsqueeze(-1).expand_as(flt), 0)  # zero entries ranked beyond top_k (in-place)
207 | return output
208 |
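209 | 
210 | # ---------------------------------------------------------------------------
211 | # A standalone sketch of the Cluster-NMS iteration used in Detect.forward(),
212 | # run on CPU with toy boxes and without the class-offset machinery: the IoU
213 | # matrix is repeatedly re-masked by the rows of still-alive boxes until it
214 | # stops changing, reproducing greedy-NMS keep decisions.
215 | if __name__ == '__main__':
216 |     boxes = torch.tensor([[0., 0., 10., 10.],
217 |                           [1., 1., 11., 11.],
218 |                           [20., 20., 30., 30.]])
219 |     scores = torch.tensor([0.9, 0.8, 0.7])  # already sorted descending
220 |     iou = jaccard(boxes, boxes).triu_(diagonal=1)
221 |     B, thresh = iou, 0.5
222 |     for _ in range(999):
223 |         A = B
224 |         maxA = A.max(dim=0)[0]
225 |         E = (maxA <= thresh).float().unsqueeze(1).expand_as(A)
226 |         B = iou.mul(E)
227 |         if A.equal(B):
228 |             break
229 |     print((maxA <= thresh).nonzero().squeeze(1))  # tensor([0, 2])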
--------------------------------------------------------------------------------
/utils/loss/__init__.py:
--------------------------------------------------------------------------------
1 | from .multibox_loss import MultiBoxLoss
2 |
--------------------------------------------------------------------------------
/utils/loss/__pycache__/__init__.cpython-35.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Zzh-tju/DIoU-SSD-pytorch/cec038bc1057f0cd532752413b24924fde427f09/utils/loss/__pycache__/__init__.cpython-35.pyc
--------------------------------------------------------------------------------
/utils/loss/__pycache__/__init__.cpython-36.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Zzh-tju/DIoU-SSD-pytorch/cec038bc1057f0cd532752413b24924fde427f09/utils/loss/__pycache__/__init__.cpython-36.pyc
--------------------------------------------------------------------------------
/utils/loss/__pycache__/multibox_loss.cpython-35.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Zzh-tju/DIoU-SSD-pytorch/cec038bc1057f0cd532752413b24924fde427f09/utils/loss/__pycache__/multibox_loss.cpython-35.pyc
--------------------------------------------------------------------------------
/utils/loss/__pycache__/multibox_loss.cpython-36.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Zzh-tju/DIoU-SSD-pytorch/cec038bc1057f0cd532752413b24924fde427f09/utils/loss/__pycache__/multibox_loss.cpython-36.pyc
--------------------------------------------------------------------------------
/utils/loss/multibox_loss.py:
--------------------------------------------------------------------------------
1 | # -*- coding: utf-8 -*-
2 | import torch
3 | import torch.nn as nn
4 | import torch.nn.functional as F
5 | from ..box import match, log_sum_exp
6 | from ..box import match_ious, bbox_overlaps_iou, bbox_overlaps_giou, bbox_overlaps_diou, bbox_overlaps_ciou, decode
7 |
8 | class FocalLoss(nn.Module):
9 | """
10 |     This criterion is an implementation of Focal Loss, as proposed in
11 | Focal Loss for Dense Object Detection.
12 |
13 | Loss(x, class) = - \alpha (1-softmax(x)[class])^gamma \log(softmax(x)[class])
14 |
15 | The losses are averaged across observations for each minibatch.
16 |
17 | Args:
18 | alpha(1D Tensor, Variable) : the scalar factor for this criterion
19 | gamma(float, double) : gamma > 0; reduces the relative loss for well-classified examples (p > .5),
20 | putting more focus on hard, misclassified examples
21 | size_average(bool): By default, the losses are averaged over observations for each minibatch.
22 | However, if the field size_average is set to False, the losses are
23 | instead summed for each minibatch.
24 | """
25 | def __init__(self, class_num, alpha=None, gamma=2, size_average=True):
26 | super(FocalLoss, self).__init__()
27 | if alpha is None:
28 | self.alpha = torch.ones(class_num, 1)
29 | else:
30 |             # the two original branches were identical (and `Variable` is not
31 |             # imported here); coerce whatever was passed into a tensor instead
32 |             self.alpha = torch.as_tensor(alpha)
33 | 
34 | self.gamma = gamma
35 | self.class_num = class_num
36 | self.size_average = size_average
37 |         # print(self.gamma)  # debug output, disabled
38 | def forward(self, inputs, targets):
39 | N = inputs.size(0)
40 | C = inputs.size(1)
41 | P = F.softmax(inputs,dim= 1)
42 | class_mask = inputs.data.new(N, C).fill_(0)
43 |         # (Variable wrapper removed: plain tensors are autograd-aware in modern PyTorch)
44 | ids = targets.view(-1, 1)
45 | class_mask.scatter_(1, ids.data, 1.)
46 |
47 | if inputs.is_cuda and not self.alpha.is_cuda:
48 | self.alpha = self.alpha.cuda()
49 | alpha = self.alpha[ids.data.view(-1)]
50 |
51 | probs = (P*class_mask).sum(1).view(-1,1)
52 |
53 | log_p = probs.log()
54 |
55 | batch_loss = -alpha*(torch.pow((1-probs), self.gamma))*log_p
56 |
57 | if self.size_average:
58 | loss = batch_loss.mean()
59 | else:
60 | loss = batch_loss.sum()
61 | return loss
62 |
63 |
64 | class IouLoss(nn.Module):
65 |
66 | def __init__(self,pred_mode = 'Center',size_sum=True,variances=None,losstype='Giou'):
67 | super(IouLoss, self).__init__()
68 | self.size_sum = size_sum
69 | self.pred_mode = pred_mode
70 | self.variances = variances
71 | self.loss = losstype
72 | def forward(self, loc_p, loc_t,prior_data):
73 | num = loc_p.shape[0]
74 |
75 | if self.pred_mode == 'Center':
76 | decoded_boxes = decode(loc_p, prior_data, self.variances)
77 | else:
78 | decoded_boxes = loc_p
79 |         if self.loss == 'Iou':
80 |             loss = torch.sum(1.0 - bbox_overlaps_iou(decoded_boxes, loc_t))
81 |         elif self.loss == 'Giou':
82 |             loss = torch.sum(1.0 - bbox_overlaps_giou(decoded_boxes, loc_t))
83 |         elif self.loss == 'Diou':
84 |             loss = torch.sum(1.0 - bbox_overlaps_diou(decoded_boxes, loc_t))
85 |         else:
86 |             # default: Complete-IoU (CIoU) loss
87 |             loss = torch.sum(1.0 - bbox_overlaps_ciou(decoded_boxes, loc_t))
88 | 
89 |
90 |         # size_sum=True returns the summed loss; otherwise average over boxes
91 |         if not self.size_sum:
92 |             loss = loss / num
93 | 
94 | return loss
95 |
96 | class MultiBoxLoss(nn.Module):
97 | """SSD Weighted Loss Function
98 | Compute Targets:
99 | 1) Produce Confidence Target Indices by matching ground truth boxes
100 | with (default) 'priorboxes' that have jaccard index > threshold parameter
101 | (default threshold: 0.5).
102 | 2) Produce localization target by 'encoding' variance into offsets of ground
103 | truth boxes and their matched 'priorboxes'.
104 | 3) Hard negative mining to filter the excessive number of negative examples
105 | that comes with using a large number of default bounding boxes.
106 | (default negative:positive ratio 3:1)
107 | Objective Loss:
108 | L(x,c,l,g) = (Lconf(x, c) + αLloc(x,l,g)) / N
109 | Where, Lconf is the CrossEntropy Loss and Lloc is the SmoothL1 Loss
110 | weighted by α which is set to 1 by cross val.
111 | Args:
112 | c: class confidences,
113 | l: predicted boxes,
114 | g: ground truth boxes
115 | N: number of matched default boxes
116 | See: https://arxiv.org/pdf/1512.02325.pdf for more details.
117 | """
118 |
119 | def __init__(self, cfg, overlap_thresh, prior_for_matching,
120 | bkg_label, neg_mining, neg_pos, neg_overlap, encode_target,
121 | use_gpu=True,loss_name = 'SmoothL1'):
122 | super(MultiBoxLoss, self).__init__()
123 | self.use_gpu = use_gpu
124 |
125 | self.num_classes = cfg['num_classes']
126 | self.threshold = overlap_thresh
127 | self.background_label = bkg_label
128 | self.encode_target = encode_target
129 | self.use_prior_for_matching = prior_for_matching
130 | self.do_neg_mining = neg_mining
131 | self.negpos_ratio = neg_pos
132 | self.neg_overlap = neg_overlap
133 | self.variance = cfg['variance']
134 | self.focalloss = FocalLoss(self.num_classes,gamma=2,size_average = False)
135 | self.loss = loss_name
136 | self.gious = IouLoss(pred_mode = 'Center',size_sum=True,variances=self.variance, losstype=self.loss)
137 |         # validate the regression loss type; 'SmoothL1' uses the classic
138 |         # offset-encoding match(), the IoU-family losses use match_ious()
139 |         if self.loss not in ('SmoothL1', 'Iou', 'Giou', 'Diou', 'Ciou'):
140 |             raise ValueError("loss name must be one of "
141 |                              "'SmoothL1', 'Iou', 'Giou', 'Diou', 'Ciou'")
142 | 
143 |
144 | def forward(self, predictions, targets):
145 | """Multibox Loss
146 | Args:
147 | predictions (tuple): A tuple containing loc preds, conf preds,
148 | and prior boxes from SSD net.
149 | conf shape: torch.size(batch_size,num_priors,num_classes)
150 | loc shape: torch.size(batch_size,num_priors,4)
151 | priors shape: torch.size(num_priors,4)
152 |
153 | targets (tensor): Ground truth boxes and labels for a batch,
154 | shape: [batch_size,num_objs,5] (last idx is the label).
155 | """
156 | loc_data, conf_data, priors = predictions
157 | num = loc_data.size(0)
158 |
159 | priors = priors[:loc_data.size(1), :]
160 |
161 | num_priors = (priors.size(0))
162 |
163 | # match priors (default boxes) and ground truth boxes
164 | loc_t = torch.Tensor(num, num_priors, 4)
165 |
166 | conf_t = torch.LongTensor(num, num_priors)
167 | for idx in range(num):
168 | truths = targets[idx][:, :-1].data
169 | labels = targets[idx][:, -1].data
170 | defaults = priors.data
171 | if self.loss == 'SmoothL1':
172 | match(self.threshold, truths, defaults, self.variance, labels,
173 | loc_t, conf_t, idx)
174 | else:
175 | match_ious(self.threshold, truths, defaults, self.variance, labels,
176 | loc_t, conf_t, idx)
177 |
178 | if self.use_gpu:
179 | loc_t = loc_t.cuda()
180 | conf_t = conf_t.cuda()
181 | # wrap targets
182 | #loc_t = Variable(loc_t, requires_grad=True)
183 | #conf_t = Variable(conf_t, requires_grad=True)
184 |
185 | pos = conf_t > 0
186 | num_pos = pos.sum(dim=1, keepdim=True)
187 | # Localization Loss (Smooth L1)
188 | # Shape: [batch,num_priors,4]
189 | pos_idx = pos.unsqueeze(pos.dim()).expand_as(loc_data)
190 |
191 | loc_p = loc_data[pos_idx].view(-1, 4)
192 | loc_t = loc_t[pos_idx].view(-1, 4)
193 |
194 | if self.loss == 'SmoothL1':
195 | loss_l = F.smooth_l1_loss(loc_p, loc_t, reduction='sum')
196 | else:
197 | giou_priors = priors.data.unsqueeze(0).expand_as(loc_data)
198 | loss_l = self.gious(loc_p,loc_t,giou_priors[pos_idx].view(-1, 4))
199 | # Compute max conf across batch for hard negative mining
200 | batch_conf = conf_data.view(-1, self.num_classes)
201 | loss_c = log_sum_exp(batch_conf) - batch_conf.gather(1, conf_t.view(-1, 1))
202 |
203 | # Hard Negative Mining
204 | loss_c = loss_c.view(num, -1)
205 | loss_c[pos] = 0
206 | _, loss_idx = loss_c.sort(1, descending=True)
207 | _, idx_rank = loss_idx.sort(1)
208 | num_pos = pos.long().sum(1, keepdim=True)
209 | num_neg = torch.clamp(self.negpos_ratio*num_pos, max=pos.size(1)-1)
210 | neg = idx_rank < num_neg.expand_as(idx_rank)
211 |
212 | # Confidence Loss Including Positive and Negative Examples
213 | pos_idx = pos.unsqueeze(2).expand_as(conf_data)
214 | neg_idx = neg.unsqueeze(2).expand_as(conf_data)
215 | conf_p = conf_data[(pos_idx+neg_idx).gt(0)].view(-1, self.num_classes)
216 | targets_weighted = conf_t[(pos+neg).gt(0)]
217 | loss_c = F.cross_entropy(conf_p, targets_weighted, reduction='sum')
218 |
219 | # Sum of losses: L(x,c,l,g) = (Lconf(x, c) + αLloc(x,l,g)) / N
220 | '''
221 | batch_conf = conf_data.view(-1, self.num_classes)
222 | loss_c = self.focalloss(batch_conf,conf_t)
223 | '''
224 | N = num_pos.data.sum().double()
225 | loss_l = loss_l.double()
226 | loss_c = loss_c.double()
227 | loss_l /= N
228 | loss_c /= N
229 |
230 | return loss_l, loss_c
231 |
232 |
233 |
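234 | 
235 | # ---------------------------------------------------------------------------
236 | # Two standalone sketches (toy inputs, CPU only). First, the double-argsort
237 | # trick used above for hard negative mining: sorting the losses and then
238 | # sorting the resulting indices gives each prior's rank, so `idx_rank < num_neg`
239 | # picks the hardest negatives. Second, a FocalLoss smoke test.
240 | if __name__ == '__main__':
241 |     loss_c = torch.tensor([[0.2, 0.9, 0.1, 0.5]])  # per-prior conf loss
242 |     _, loss_idx = loss_c.sort(1, descending=True)
243 |     _, idx_rank = loss_idx.sort(1)
244 |     print(idx_rank)      # tensor([[2, 0, 3, 1]]): the 0.9 loss has rank 0
245 |     print(idx_rank < 2)  # keeps the two largest losses (0.9 and 0.5)
246 | 
247 |     focal = FocalLoss(class_num=3, gamma=2, size_average=True)
248 |     logits = torch.tensor([[2.0, 0.5, 0.1]])
249 |     print(focal(logits, torch.tensor([0])))  # small loss (~0.023): confident and correct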
--------------------------------------------------------------------------------
/work_dir/DIoU-NMS.txt:
--------------------------------------------------------------------------------
1 | --------------------------------------------------------------
2 | Results computed with the **unofficial** Python eval code.
3 | Results should be very close to the official MATLAB eval code.
4 | --------------------------------------------------------------
5 | 0.7857084352769135 0.5633757991885664 0.5162622076659289
6 |
--------------------------------------------------------------------------------