├── LICENSE
├── README.md
├── assets
│   └── eval_baseline_idd.png
├── cfg.py
├── coco_eval.py
├── coco_utils.py
├── datasets
│   ├── .ipynb_checkpoints
│   │   └── idd-checkpoint.py
│   ├── bdd.py
│   ├── cityscapes.py
│   └── idd.py
├── engine.py
├── eval_idd_bdd.py
├── evaluation_baseline.py
├── exp
│   ├── evaluate_script.py
│   ├── evaluation_transport.py
│   ├── exp.ipynb
│   ├── optimal_transport.ipynb
│   └── train_script.py
├── get_datalists.py
├── imports.py
├── inference.ipynb
├── train_baseline.py
├── transforms.py
└── utils.py
/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 |
3 | Copyright (c) 2019 Prajjwal Bhargava
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | ## Object detection for autonomous navigation
2 | This repository provides core support for performing object detection on autonomous-navigation datasets. Support for 3D object detection and domain adaptation is experimental and will be added later. The project covers training, evaluation, inference, and visualization.
3 |
4 | ### This repo also contains the code for:
5 | - [On Generalizing Detection Models for Unconstrained Environments (ICCV W 2019)](https://arxiv.org/abs/1909.13080) in `exp`
6 |
7 | If you use the code in any way, please consider citing:
8 | ```
9 | @InProceedings{Bhargava_2019_ICCV,
10 | author = {Bhargava, Prajjwal},
11 | title = {On Generalizing Detection Models for Unconstrained Environments},
12 | booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops},
13 | month = {Oct},
14 | year = {2019}
15 | }
16 | ```
17 |
18 | #### NEW: Pretrained models are now available
19 |
20 | ## Prerequisites
21 | - PyTorch >= 1.1
22 | - torchvision >= 0.3
23 | - tensorboardX (optional, required for visualization)
24 |
25 | ## Datasets
26 | This work provides support for the following datasets (related to object detection for autonomous navigation):
27 | - [India Driving Dataset](https://idd.insaan.iiit.ac.in/)
28 | - [Berkeley DeepDrive](https://bdd-data.berkeley.edu/)
29 | - [Cityscapes](https://www.cityscapes-dataset.com/)
30 |
31 | Directory structure:
32 | ```
33 | +-- data
34 | | +-- bdd100k
35 | | +-- IDD_Detection
36 | | +-- cityscapes
37 | +-- autonomous-object-detection
38 | .......
39 | ```
40 | ### Getting started
41 | 1. Download the required dataset
42 | 2. Setup dataset paths in `cfg.py`
43 | 3. Create datalists
44 | 4. Start training and evaluating
45 |
46 | ## Documentation
47 |
48 | ### Setting up Config
49 | By default, all paths and hyperparameters are loaded from `cfg.py`. Users only need to specify dataset paths and hyperparameters once.
50 | Any value can also be overridden by the user, as sketched below.
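For example, a value can be overridden programmatically before the dataloaders are built (a minimal sketch; `batch_size`, `ds`, and `dset_list` are the names actually defined in `cfg.py`):

```
import cfg

# Override defaults from cfg.py at runtime. Scripts that do
# `from cfg import *` must apply overrides before reading the values.
cfg.batch_size = 4         # e.g. smaller batches for a low-memory GPU
cfg.ds = cfg.dset_list[0]  # select "bdd100k"
```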
51 |
52 | ### Datalists
53 | We use datalists: lists containing the paths to images and their labels. They exist because some images don't have proper labels; building the datalists filters those out, so the lists contain only structured, usable data and the dataloader works seamlessly. Data cleaning happens as part of this process.
54 |
55 | Set the dataset path and the `ds` variable in `cfg.py` to select the dataset you want to use, then run:
56 | ```
57 | python3 get_datalists.py
58 | ```
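Datalists are stored as pickled lists of path strings under `datalists/` (the same files the training and evaluation scripts read). A quick sanity check, assuming the IDD datalists were generated:

```
import pickle

with open("datalists/idd_images_path_list.txt", "rb") as fp:
    image_paths = pickle.load(fp)
with open("datalists/idd_anno_path_list.txt", "rb") as fp:
    anno_paths = pickle.load(fp)

# one image path per annotation path, already cleaned
assert len(image_paths) == len(anno_paths)
print(image_paths[0], anno_paths[0])
```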
59 |
60 | ### Datasets
61 | This step assumes that the datalists have been created; it ensures the dataloader never yields bad samples while iterating. Create a directory named `data` and put all datasets inside it.
62 | This library uses a common API (similar to torchvision).
63 | All dataset classes expect the same inputs:
64 | ```
65 | Input:
66 | idd_image_path_list
67 | idd_anno_path_list
68 | get_transform: A transformation function.
69 | ```
70 | ```
71 | Output:
72 | A dict containing boxes, labels, image_id, area, and iscrowd, each stored as a torch.Tensor.
73 | ```
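As a sketch of this contract, indexing any of the dataset objects constructed below yields an `(image, target)` pair:

```
img, target = dset[0]             # dset: any dataset object from this repo
print(target["boxes"].shape)      # [N, 4] boxes as (xmin, ymin, xmax, ymax)
print(target["labels"])           # [N] integer class ids
print(target["image_id"], target["area"], target["iscrowd"])
```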
74 | - IDD
75 |
76 | ```
77 | dset = IDD(idd_image_path_list, idd_anno_path_list, transforms=None)
78 | ```
79 |
80 | - BDD100K
81 |
82 | ```
83 | dset = BDD(bdd_img_path_list, train_anno_json_path, transforms=None)
84 | ```
85 |
86 | BDD100k doesn't provide per-image ground-truth files; a single JSON file covers the whole split, so constructing the dataset takes a little longer than usual while that JSON is parsed. The sketch below shows the record layout.
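For reference, each image record in that JSON holds a `labels` list, and only the entries carrying a `box2d` field become boxes; a sketch of the layout that `get_ground_truths` in `datasets/bdd.py` walks:

```
import json

with open(train_anno_json_path) as f:  # one JSON file for the whole split
    anno_data = json.load(f)

entry = anno_data[0]["labels"][0]
if "box2d" in entry:                   # entries without box2d are skipped
    print(entry["category"])           # e.g. "car"
    print(entry["box2d"])              # {"x1": ..., "y1": ..., "x2": ..., "y2": ...}
```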
87 |
88 | - Cityscapes
89 |
90 | ```
91 | dset = Cityscapes(image_path_list, target_path_list, split='train', transforms=None)
92 | ```
93 |
94 | This was tested with CityPersons (ground truths for the person classes). You can extract ground truths from the segmentation annotations as well, but then you have to manage the datalists yourself.
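All three datasets plug into a standard `DataLoader`; the only repo-specific piece is `utils.collate_fn`, which batches variable-length detection targets as lists (the same pattern used by the training and evaluation scripts):

```
import torch
import utils

dl = torch.utils.data.DataLoader(
    dset,                         # any of the datasets above
    batch_size=8,
    shuffle=True,
    num_workers=4,
    collate_fn=utils.collate_fn,  # keeps per-image targets as a list
)
```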
95 |
96 | ### Transforms
97 | - ```get_transform(train: bool)```
98 |
99 | Converts images into tensors and, when `train=True`, applies random horizontal flipping to the input data; usage is sketched below.
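A typical pairing with the dataset classes, using the `get_transform` helper defined alongside each dataset (e.g. in `datasets/idd.py`):

```
from datasets.idd import IDD, get_transform

train_dset = IDD(idd_image_path_list, idd_anno_path_list,
                 transforms=get_transform(train=True))   # ToTensor + random flip
val_dset = IDD(idd_image_path_list, idd_anno_path_list,
               transforms=get_transform(train=False))    # ToTensor only
```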
100 |
101 | ### Model
102 | Any detection model can be used (YOLO, Faster R-CNN, SSD). Currently we provide support through torchvision.
103 |
104 | ```
105 | from train_baseline import get_model
106 | model = get_model(len(classes)) # Returns a Faster R-CNN with a ResNet-50 backbone pretrained on COCO.
107 | ```
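Internally this is the standard torchvision head-swap recipe; a sketch of the same pattern used by the scripts in this repo:

```
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

def get_model(num_classes):
    # Faster R-CNN with a ResNet-50 FPN backbone pretrained on COCO
    model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)
    # replace the stock COCO box head with one sized for our classes
    in_features = model.roi_heads.box_predictor.cls_score.in_features
    model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)
    return model
```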
108 |
109 | ### Training
110 | Support for baseline training has been added; domain-adaptive features will be added later.
111 | Specify the dataset and its path in the user-defined settings section of the script, then run:
112 |
113 | ```
114 | $ python train_baseline.py
115 | ```
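Under the hood the script drives `train_one_epoch` from `engine.py`. A condensed sketch of the loop (`train_dl` and the checkpoint naming are illustrative, not the script's exact conventions):

```
import torch
from engine import train_one_epoch

for epoch in range(num_epochs):
    # logs loss/lr to tensorboard and applies LR warmup on epoch 0
    train_one_epoch(model, optimizer, train_dl, device, epoch, print_freq=100)
    lr_scheduler.step()
    torch.save({"model": model.state_dict()},
               "saved_models/{}_{}.pth".format(ds, epoch))
```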
116 |
117 | ### Evaluation
118 | Evaluation is performed in COCO format. Specify the saved `model_name` in `cfg.py` for the checkpoint you want to evaluate.
119 |
120 | The COCO API needs to be compiled first. Download it from [here](https://github.com/cocodataset/cocoapi), then build it:
121 | ```
122 | $ cd cocoapi/PythonAPI
123 | $ python setup.py build_ext install
124 | ```
125 |
126 | Now evaluation can be performed.
127 |
128 | ```
129 | $ python3 evaluation_baseline.py
130 | ```
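This script ultimately calls `evaluate` from `engine.py`, which converts the dataset to the COCO API format and prints the standard AP/AR summary:

```
from engine import evaluate

coco_evaluator = evaluate(model, val_dl, device=device)  # prints the COCO mAP table
```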
131 |
132 | ## Pretrained models
133 | Pretrained models for IDD and BDD100k are available [here](https://drive.google.com/open?id=1EGMce4aHlo7QpvMsxXgato87gQo8aYrk). The BDD100k model can be used as-is. It also served as the base network for the incremental-learning experiments on IDD described in the paper: its trained layers were reused with new task-specific layers to train on IDD, as sketched below.
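A condensed sketch of that head reuse, following `eval_idd_bdd.py` (15 classes for IDD, 12 for BDD100k):

```
model = get_model(15)      # fresh network sized for IDD's classes
model_bdd = get_model(12)  # network sized for BDD100k's classes
ckpt = torch.load("saved_models/bdd100k_24.pth")
model_bdd.load_state_dict(ckpt["model"])
model.roi_heads = model_bdd.roi_heads  # graft the trained BDD100k task head
```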
134 |
135 | ## Incremental learning support
136 | Please refer to the `exp` directory; the Jupyter notebooks are self-explanatory. Here are the results from the paper.
137 |
138 | | S and T | Epoch | Active Components (with LR) | LR Range | mAP (%) at specified epochs |
139 | |---------|-------|-----------------------------|----------|------------------------------|
140 | | BDD -> IDD <br> IDD -> BDD | 5 <br> Eval | +ROI head (1e-3) | 1e-3, 6e-3 <br> - | 24.3 <br> 45.7 |
141 | | BDD -> IDD <br> IDD -> BDD | 5,9 <br> Eval | +RPN (1e-4) <br> +ROI head (1e-3) | 1e-4, 6e-4 <br> - | 24.7, 24.9 <br> 45.3, 45.0 |
142 | | BDD -> IDD <br> IDD -> BDD | 1,5,6,7 <br> Eval | +RPN (1e-4) +ROI head (1e-3) | 1e-4, 6e-3 <br> - | 24.3, 24.9, 24.9, 25.0 <br> 45.7, 44.8, 44.7, 44.7 |
143 | | BDD -> IDD <br> IDD -> BDD | 1,5,10 <br> Eval | +ROI head (1e-3) <br> +RPN (4e-4) +FPN (2e-4) | 1e-4, 6e-3 <br> - | 24.9, 25.4, 25.9 <br> 45.2, 43.9, 43.3 |
144 |
145 | ### Inference
146 |
147 | Refer to `inference.ipynb` for plotting images with model's predictions.
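The gist of the notebook, as a minimal sketch (the 0.5 score threshold and styling are illustrative):

```
import matplotlib.pyplot as plt
import matplotlib.patches as patches
import torch

model.eval()
with torch.no_grad():
    pred = model([img.to(device)])[0]  # img: CxHxW float tensor in [0, 1]

fig, ax = plt.subplots()
ax.imshow(img.permute(1, 2, 0).cpu())
for box, score in zip(pred["boxes"], pred["scores"]):
    if score > 0.5:
        x0, y0, x1, y1 = box.tolist()
        ax.add_patch(patches.Rectangle((x0, y0), x1 - x0, y1 - y0,
                                       fill=False, edgecolor="red", linewidth=2))
plt.show()
```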
148 |
149 | ### Visualization
150 |
151 | By default, `engine.py` logs `loss` and `learning_rate` to tensorboard. Start TensorBoard with:
152 | ```
153 | $ tensorboard --logdir /path/ --port=8888
154 | ```
155 |
156 | ### Example
157 |
158 | 
159 |
--------------------------------------------------------------------------------
/assets/eval_baseline_idd.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/prajjwal1/autonomous-object-detection/d52fd71d28209dbbbc064c97194e3b1171d7e825/assets/eval_baseline_idd.png
--------------------------------------------------------------------------------
/cfg.py:
--------------------------------------------------------------------------------
1 | ########## User specific settings ##########################
2 | idd_path = "/home/jupyter/autonue/data/IDD_Detection/"
3 | bdd_path = "/home/jupyter/autonue/data/bdd100k"
4 | cityscapes_path = "/ml/temp/autonue/data/cityscapes"
5 | cityscapes_split = "train"
6 |
7 | idx = 1
8 | batch_size = 8
9 |
10 | num_epochs = 25
11 | lr = 0.001
12 | ckpt = False
13 | idd_hq = False
14 | model_name = "bdd100k_24.pth"
15 | ##############################################################
16 |
17 | dset_list = ["bdd100k", "idd_non_hq", "idd_hq", "Cityscapes"]
18 | ds = dset_list[idx]
19 |
--------------------------------------------------------------------------------
/coco_eval.py:
--------------------------------------------------------------------------------
1 | import copy
2 | import json
3 | import tempfile
4 | import time
5 | from collections import defaultdict
6 |
7 | import numpy as np
8 |
9 | import pycocotools.mask as mask_util
10 | import torch
11 | import torch._six
12 | import utils
13 | from pycocotools.coco import COCO
14 | from pycocotools.cocoeval import COCOeval
15 |
16 |
17 | class CocoEvaluator(object):
18 | def __init__(self, coco_gt, iou_types):
19 | assert isinstance(iou_types, (list, tuple))
20 | coco_gt = copy.deepcopy(coco_gt)
21 | self.coco_gt = coco_gt
22 |
23 | self.iou_types = iou_types
24 | self.coco_eval = {}
25 | for iou_type in iou_types:
26 | self.coco_eval[iou_type] = COCOeval(coco_gt, iouType=iou_type)
27 |
28 | self.img_ids = []
29 | self.eval_imgs = {k: [] for k in iou_types}
30 |
31 | def update(self, predictions):
32 | img_ids = list(np.unique(list(predictions.keys())))
33 | self.img_ids.extend(img_ids)
34 |
35 | for iou_type in self.iou_types:
36 | results = self.prepare(predictions, iou_type)
37 | coco_dt = loadRes(self.coco_gt, results) if results else COCO()
38 | coco_eval = self.coco_eval[iou_type]
39 |
40 | coco_eval.cocoDt = coco_dt
41 | coco_eval.params.imgIds = list(img_ids)
42 | img_ids, eval_imgs = evaluate(coco_eval)
43 |
44 | self.eval_imgs[iou_type].append(eval_imgs)
45 |
46 | def synchronize_between_processes(self):
47 | for iou_type in self.iou_types:
48 | self.eval_imgs[iou_type] = np.concatenate(self.eval_imgs[iou_type], 2)
49 | create_common_coco_eval(
50 | self.coco_eval[iou_type], self.img_ids, self.eval_imgs[iou_type]
51 | )
52 |
53 | def accumulate(self):
54 | for coco_eval in self.coco_eval.values():
55 | coco_eval.accumulate()
56 |
57 | def summarize(self):
58 | for iou_type, coco_eval in self.coco_eval.items():
59 | print("IoU metric: {}".format(iou_type))
60 | coco_eval.summarize()
61 |
62 | def prepare(self, predictions, iou_type):
63 | if iou_type == "bbox":
64 | return self.prepare_for_coco_detection(predictions)
65 | elif iou_type == "segm":
66 | return self.prepare_for_coco_segmentation(predictions)
67 | elif iou_type == "keypoints":
68 | return self.prepare_for_coco_keypoint(predictions)
69 | else:
70 | raise ValueError("Unknown iou type {}".format(iou_type))
71 |
72 | def prepare_for_coco_detection(self, predictions):
73 | coco_results = []
74 | for original_id, prediction in predictions.items():
75 | if len(prediction) == 0:
76 | continue
77 |
78 | boxes = prediction["boxes"]
79 | boxes = convert_to_xywh(boxes).tolist()
80 | scores = prediction["scores"].tolist()
81 | labels = prediction["labels"].tolist()
82 |
83 | coco_results.extend(
84 | [
85 | {
86 | "image_id": original_id,
87 | "category_id": labels[k],
88 | "bbox": box,
89 | "score": scores[k],
90 | }
91 | for k, box in enumerate(boxes)
92 | ]
93 | )
94 | return coco_results
95 |
96 | def prepare_for_coco_segmentation(self, predictions):
97 | coco_results = []
98 | for original_id, prediction in predictions.items():
99 | if len(prediction) == 0:
100 | continue
101 |
102 | scores = prediction["scores"]
103 | labels = prediction["labels"]
104 | masks = prediction["masks"]
105 |
106 | masks = masks > 0.5
107 |
108 | scores = prediction["scores"].tolist()
109 | labels = prediction["labels"].tolist()
110 |
111 | rles = [
112 | mask_util.encode(np.array(mask[0, :, :, np.newaxis], order="F"))[0]
113 | for mask in masks
114 | ]
115 | for rle in rles:
116 | rle["counts"] = rle["counts"].decode("utf-8")
117 |
118 | coco_results.extend(
119 | [
120 | {
121 | "image_id": original_id,
122 | "category_id": labels[k],
123 | "segmentation": rle,
124 | "score": scores[k],
125 | }
126 | for k, rle in enumerate(rles)
127 | ]
128 | )
129 | return coco_results
130 |
131 | def prepare_for_coco_keypoint(self, predictions):
132 | coco_results = []
133 | for original_id, prediction in predictions.items():
134 | if len(prediction) == 0:
135 | continue
136 |
137 | boxes = prediction["boxes"]
138 | boxes = convert_to_xywh(boxes).tolist()
139 | scores = prediction["scores"].tolist()
140 | labels = prediction["labels"].tolist()
141 | keypoints = prediction["keypoints"]
142 | keypoints = keypoints.flatten(start_dim=1).tolist()
143 |
144 | coco_results.extend(
145 | [
146 | {
147 | "image_id": original_id,
148 | "category_id": labels[k],
149 | "keypoints": keypoint,
150 | "score": scores[k],
151 | }
152 | for k, keypoint in enumerate(keypoints)
153 | ]
154 | )
155 | return coco_results
156 |
157 |
158 | def convert_to_xywh(boxes):
159 | xmin, ymin, xmax, ymax = boxes.unbind(1)
160 | return torch.stack((xmin, ymin, xmax - xmin, ymax - ymin), dim=1)
161 |
162 |
163 | def merge(img_ids, eval_imgs):
164 | all_img_ids = utils.all_gather(img_ids)
165 | all_eval_imgs = utils.all_gather(eval_imgs)
166 |
167 | merged_img_ids = []
168 | for p in all_img_ids:
169 | merged_img_ids.extend(p)
170 |
171 | merged_eval_imgs = []
172 | for p in all_eval_imgs:
173 | merged_eval_imgs.append(p)
174 |
175 | merged_img_ids = np.array(merged_img_ids)
176 | merged_eval_imgs = np.concatenate(merged_eval_imgs, 2)
177 |
178 | # keep only unique (and in sorted order) images
179 | merged_img_ids, idx = np.unique(merged_img_ids, return_index=True)
180 | merged_eval_imgs = merged_eval_imgs[..., idx]
181 |
182 | return merged_img_ids, merged_eval_imgs
183 |
184 |
185 | def create_common_coco_eval(coco_eval, img_ids, eval_imgs):
186 | img_ids, eval_imgs = merge(img_ids, eval_imgs)
187 | img_ids = list(img_ids)
188 | eval_imgs = list(eval_imgs.flatten())
189 |
190 | coco_eval.evalImgs = eval_imgs
191 | coco_eval.params.imgIds = img_ids
192 | coco_eval._paramsEval = copy.deepcopy(coco_eval.params)
193 |
194 |
195 | #################################################################
196 | # From pycocotools, just removed the prints and fixed
197 | # a Python3 bug about unicode not defined
198 | #################################################################
199 |
200 | # Ideally, pycocotools wouldn't have hard-coded prints
201 | # so that we could avoid copy-pasting those two functions
202 |
203 |
204 | def createIndex(self):
205 | # create index
206 | # print('creating index...')
207 | anns, cats, imgs = {}, {}, {}
208 | imgToAnns, catToImgs = defaultdict(list), defaultdict(list)
209 | if "annotations" in self.dataset:
210 | for ann in self.dataset["annotations"]:
211 | imgToAnns[ann["image_id"]].append(ann)
212 | anns[ann["id"]] = ann
213 |
214 | if "images" in self.dataset:
215 | for img in self.dataset["images"]:
216 | imgs[img["id"]] = img
217 |
218 | if "categories" in self.dataset:
219 | for cat in self.dataset["categories"]:
220 | cats[cat["id"]] = cat
221 |
222 | if "annotations" in self.dataset and "categories" in self.dataset:
223 | for ann in self.dataset["annotations"]:
224 | catToImgs[ann["category_id"]].append(ann["image_id"])
225 |
226 | # print('index created!')
227 |
228 | # create class members
229 | self.anns = anns
230 | self.imgToAnns = imgToAnns
231 | self.catToImgs = catToImgs
232 | self.imgs = imgs
233 | self.cats = cats
234 |
235 |
236 | maskUtils = mask_util
237 |
238 |
239 | def loadRes(self, resFile):
240 | """
241 | Load result file and return a result api object.
242 | :param resFile (str) : file name of result file
243 | :return: res (obj) : result api object
244 | """
245 | res = COCO()
246 | res.dataset["images"] = [img for img in self.dataset["images"]]
247 |
248 | # print('Loading and preparing results...')
249 | # tic = time.time()
250 | if isinstance(resFile, torch._six.string_classes):
251 | anns = json.load(open(resFile))
252 | elif type(resFile) == np.ndarray:
253 | anns = self.loadNumpyAnnotations(resFile)
254 | else:
255 | anns = resFile
256 |     assert type(anns) == list, "results is not an array of objects"
257 | annsImgIds = [ann["image_id"] for ann in anns]
258 | assert set(annsImgIds) == (
259 | set(annsImgIds) & set(self.getImgIds())
260 | ), "Results do not correspond to current coco set"
261 | if "caption" in anns[0]:
262 | imgIds = set([img["id"] for img in res.dataset["images"]]) & set(
263 | [ann["image_id"] for ann in anns]
264 | )
265 | res.dataset["images"] = [
266 | img for img in res.dataset["images"] if img["id"] in imgIds
267 | ]
268 | for id, ann in enumerate(anns):
269 | ann["id"] = id + 1
270 | elif "bbox" in anns[0] and not anns[0]["bbox"] == []:
271 | res.dataset["categories"] = copy.deepcopy(self.dataset["categories"])
272 | for id, ann in enumerate(anns):
273 | bb = ann["bbox"]
274 | x1, x2, y1, y2 = [bb[0], bb[0] + bb[2], bb[1], bb[1] + bb[3]]
275 | if "segmentation" not in ann:
276 | ann["segmentation"] = [[x1, y1, x1, y2, x2, y2, x2, y1]]
277 | ann["area"] = bb[2] * bb[3]
278 | ann["id"] = id + 1
279 | ann["iscrowd"] = 0
280 | elif "segmentation" in anns[0]:
281 | res.dataset["categories"] = copy.deepcopy(self.dataset["categories"])
282 | for id, ann in enumerate(anns):
283 | # now only support compressed RLE format as segmentation results
284 | ann["area"] = maskUtils.area(ann["segmentation"])
285 | if "bbox" not in ann:
286 | ann["bbox"] = maskUtils.toBbox(ann["segmentation"])
287 | ann["id"] = id + 1
288 | ann["iscrowd"] = 0
289 | elif "keypoints" in anns[0]:
290 | res.dataset["categories"] = copy.deepcopy(self.dataset["categories"])
291 | for id, ann in enumerate(anns):
292 | s = ann["keypoints"]
293 | x = s[0::3]
294 | y = s[1::3]
295 | x0, x1, y0, y1 = np.min(x), np.max(x), np.min(y), np.max(y)
296 | ann["area"] = (x1 - x0) * (y1 - y0)
297 | ann["id"] = id + 1
298 | ann["bbox"] = [x0, y0, x1 - x0, y1 - y0]
299 | # print('DONE (t={:0.2f}s)'.format(time.time()- tic))
300 |
301 | res.dataset["annotations"] = anns
302 | createIndex(res)
303 | return res
304 |
305 |
306 | def evaluate(self):
307 | """
308 | Run per image evaluation on given images and store results (a list of dict) in self.evalImgs
309 | :return: None
310 | """
311 | # tic = time.time()
312 | # print('Running per image evaluation...')
313 | p = self.params
314 | # add backward compatibility if useSegm is specified in params
315 | if p.useSegm is not None:
316 | p.iouType = "segm" if p.useSegm == 1 else "bbox"
317 | print(
318 | "useSegm (deprecated) is not None. Running {} evaluation".format(p.iouType)
319 | )
320 | # print('Evaluate annotation type *{}*'.format(p.iouType))
321 | p.imgIds = list(np.unique(p.imgIds))
322 | if p.useCats:
323 | p.catIds = list(np.unique(p.catIds))
324 | p.maxDets = sorted(p.maxDets)
325 | self.params = p
326 |
327 | self._prepare()
328 | # loop through images, area range, max detection number
329 | catIds = p.catIds if p.useCats else [-1]
330 |
331 | if p.iouType == "segm" or p.iouType == "bbox":
332 | computeIoU = self.computeIoU
333 | elif p.iouType == "keypoints":
334 | computeIoU = self.computeOks
335 | self.ious = {
336 | (imgId, catId): computeIoU(imgId, catId)
337 | for imgId in p.imgIds
338 | for catId in catIds
339 | }
340 |
341 | evaluateImg = self.evaluateImg
342 | maxDet = p.maxDets[-1]
343 | evalImgs = [
344 | evaluateImg(imgId, catId, areaRng, maxDet)
345 | for catId in catIds
346 | for areaRng in p.areaRng
347 | for imgId in p.imgIds
348 | ]
349 | # this is NOT in the pycocotools code, but could be done outside
350 | evalImgs = np.asarray(evalImgs).reshape(len(catIds), len(p.areaRng), len(p.imgIds))
351 | self._paramsEval = copy.deepcopy(self.params)
352 | # toc = time.time()
353 | # print('DONE (t={:0.2f}s).'.format(toc-tic))
354 | return p.imgIds, evalImgs
355 |
--------------------------------------------------------------------------------
/coco_utils.py:
--------------------------------------------------------------------------------
1 | import copy
2 | import os
3 |
4 | import torchvision
5 | from PIL import Image
6 | from tqdm import tqdm
7 |
8 | import torch
9 | import torch.utils.data
10 | import transforms as T
11 | from pycocotools import mask as coco_mask
12 | from pycocotools.coco import COCO
13 |
14 |
15 | class FilterAndRemapCocoCategories(object):
16 | def __init__(self, categories, remap=True):
17 | self.categories = categories
18 | self.remap = remap
19 |
20 | def __call__(self, image, target):
21 | anno = target["annotations"]
22 | anno = [obj for obj in anno if obj["category_id"] in self.categories]
23 | if not self.remap:
24 | target["annotations"] = anno
25 | return image, target
26 | anno = copy.deepcopy(anno)
27 | for obj in anno:
28 | obj["category_id"] = self.categories.index(obj["category_id"])
29 | target["annotations"] = anno
30 | return image, target
31 |
32 |
33 | def convert_coco_poly_to_mask(segmentations, height, width):
34 | masks = []
35 | for polygons in segmentations:
36 | rles = coco_mask.frPyObjects(polygons, height, width)
37 | mask = coco_mask.decode(rles)
38 | if len(mask.shape) < 3:
39 | mask = mask[..., None]
40 | mask = torch.as_tensor(mask, dtype=torch.uint8)
41 | mask = mask.any(dim=2)
42 | masks.append(mask)
43 | if masks:
44 | masks = torch.stack(masks, dim=0)
45 | else:
46 | masks = torch.zeros((0, height, width), dtype=torch.uint8)
47 | return masks
48 |
49 |
50 | class ConvertCocoPolysToMask(object):
51 | def __call__(self, image, target):
52 | w, h = image.size
53 |
54 | image_id = target["image_id"]
55 | image_id = torch.tensor([image_id])
56 |
57 | anno = target["annotations"]
58 |
59 | anno = [obj for obj in anno if obj["iscrowd"] == 0]
60 |
61 | boxes = [obj["bbox"] for obj in anno]
62 | # guard against no boxes via resizing
63 | boxes = torch.as_tensor(boxes, dtype=torch.float32).reshape(-1, 4)
64 | boxes[:, 2:] += boxes[:, :2]
65 | boxes[:, 0::2].clamp_(min=0, max=w)
66 | boxes[:, 1::2].clamp_(min=0, max=h)
67 |
68 | classes = [obj["category_id"] for obj in anno]
69 | classes = torch.tensor(classes, dtype=torch.int64)
70 |
71 | segmentations = [obj["segmentation"] for obj in anno]
72 | masks = convert_coco_poly_to_mask(segmentations, h, w)
73 |
74 | keypoints = None
75 | if anno and "keypoints" in anno[0]:
76 | keypoints = [obj["keypoints"] for obj in anno]
77 | keypoints = torch.as_tensor(keypoints, dtype=torch.float32)
78 | num_keypoints = keypoints.shape[0]
79 | if num_keypoints:
80 | keypoints = keypoints.view(num_keypoints, -1, 3)
81 |
82 | keep = (boxes[:, 3] > boxes[:, 1]) & (boxes[:, 2] > boxes[:, 0])
83 | boxes = boxes[keep]
84 | classes = classes[keep]
85 | masks = masks[keep]
86 | if keypoints is not None:
87 | keypoints = keypoints[keep]
88 |
89 | target = {}
90 | target["boxes"] = boxes
91 | target["labels"] = classes
92 | target["masks"] = masks
93 | target["image_id"] = image_id
94 | if keypoints is not None:
95 | target["keypoints"] = keypoints
96 |
97 | # for conversion to coco api
98 | area = torch.tensor([obj["area"] for obj in anno])
99 | iscrowd = torch.tensor([obj["iscrowd"] for obj in anno])
100 | target["area"] = area
101 | target["iscrowd"] = iscrowd
102 |
103 | return image, target
104 |
105 |
106 | def _coco_remove_images_without_annotations(dataset, cat_list=None):
107 | def _has_only_empty_bbox(anno):
108 | return all(any(o <= 1 for o in obj["bbox"][2:]) for obj in anno)
109 |
110 | def _count_visible_keypoints(anno):
111 | return sum(sum(1 for v in ann["keypoints"][2::3] if v > 0) for ann in anno)
112 |
113 | min_keypoints_per_image = 10
114 |
115 | def _has_valid_annotation(anno):
116 | # if it's empty, there is no annotation
117 | if len(anno) == 0:
118 | return False
119 | # if all boxes have close to zero area, there is no annotation
120 | if _has_only_empty_bbox(anno):
121 | return False
122 |         # keypoint tasks have a slightly different criterion for considering
123 | # if an annotation is valid
124 | if "keypoints" not in anno[0]:
125 | return True
126 | # for keypoint detection tasks, only consider valid images those
127 | # containing at least min_keypoints_per_image
128 | if _count_visible_keypoints(anno) >= min_keypoints_per_image:
129 | return True
130 | return False
131 |
132 | assert isinstance(dataset, torchvision.datasets.CocoDetection)
133 | ids = []
134 | for ds_idx, img_id in enumerate(dataset.ids):
135 | ann_ids = dataset.coco.getAnnIds(imgIds=img_id, iscrowd=None)
136 | anno = dataset.coco.loadAnns(ann_ids)
137 | if cat_list:
138 | anno = [obj for obj in anno if obj["category_id"] in cat_list]
139 | if _has_valid_annotation(anno):
140 | ids.append(ds_idx)
141 |
142 | dataset = torch.utils.data.Subset(dataset, ids)
143 | return dataset
144 |
145 |
146 | def convert_to_coco_api(ds):
147 | coco_ds = COCO()
148 | ann_id = 0
149 | dataset = {"images": [], "categories": [], "annotations": []}
150 | categories = set()
151 | for img_idx in tqdm(range(len(ds))):
152 | # find better way to get target
153 | # targets = ds.get_annotations(img_idx)
154 | img, targets = ds[img_idx]
155 | img = torchvision.transforms.ToTensor()(img)
156 | image_id = targets["image_id"].item()
157 | img_dict = {}
158 | img_dict["id"] = image_id
159 | img_dict["height"] = img.shape[-2]
160 | img_dict["width"] = img.shape[-1]
161 | dataset["images"].append(img_dict)
162 | bboxes = targets["boxes"]
163 | bboxes[:, 2:] -= bboxes[:, :2]
164 | bboxes = bboxes.tolist()
165 | labels = targets["labels"].tolist()
166 | areas = targets["area"].tolist()
167 | iscrowd = targets["iscrowd"].tolist()
168 | if "masks" in targets:
169 | masks = targets["masks"]
170 | # make masks Fortran contiguous for coco_mask
171 | masks = masks.permute(0, 2, 1).contiguous().permute(0, 2, 1)
172 | if "keypoints" in targets:
173 | keypoints = targets["keypoints"]
174 | keypoints = keypoints.reshape(keypoints.shape[0], -1).tolist()
175 | num_objs = len(bboxes)
176 | for i in range(num_objs):
177 | ann = {}
178 | ann["image_id"] = image_id
179 | ann["bbox"] = bboxes[i]
180 | ann["category_id"] = labels[i]
181 | categories.add(labels[i])
182 | ann["area"] = areas[i]
183 | ann["iscrowd"] = iscrowd[i]
184 | ann["id"] = ann_id
185 | if "masks" in targets:
186 | ann["segmentation"] = coco_mask.encode(masks[i].numpy())
187 | if "keypoints" in targets:
188 | ann["keypoints"] = keypoints[i]
189 | ann["num_keypoints"] = sum(k != 0 for k in keypoints[i][2::3])
190 | dataset["annotations"].append(ann)
191 | ann_id += 1
192 | dataset["categories"] = [{"id": i} for i in sorted(categories)]
193 | coco_ds.dataset = dataset
194 | coco_ds.createIndex()
195 | return coco_ds
196 |
197 |
198 | def get_coco_api_from_dataset(dataset):
199 | for i in range(10):
200 | if isinstance(dataset, torchvision.datasets.CocoDetection):
201 | break
202 | if isinstance(dataset, torch.utils.data.Subset):
203 | dataset = dataset.dataset
204 | if isinstance(dataset, torchvision.datasets.CocoDetection):
205 | return dataset.coco
206 | return convert_to_coco_api(dataset)
207 |
208 |
209 | class CocoDetection(torchvision.datasets.CocoDetection):
210 | def __init__(self, img_folder, ann_file, transforms):
211 | super(CocoDetection, self).__init__(img_folder, ann_file)
212 | self._transforms = transforms
213 |
214 | def __getitem__(self, idx):
215 | img, target = super(CocoDetection, self).__getitem__(idx)
216 | image_id = self.ids[idx]
217 | target = dict(image_id=image_id, annotations=target)
218 | if self._transforms is not None:
219 | img, target = self._transforms(img, target)
220 | return img, target
221 |
222 |
223 | def get_coco(root, image_set, transforms, mode="instances"):
224 | anno_file_template = "{}_{}2017.json"
225 | PATHS = {
226 | "train": (
227 | "train2017",
228 | os.path.join("annotations", anno_file_template.format(mode, "train")),
229 | ),
230 | "val": (
231 | "val2017",
232 | os.path.join("annotations", anno_file_template.format(mode, "val")),
233 | ),
234 | # "train": ("val2017", os.path.join("annotations", anno_file_template.format(mode, "val")))
235 | }
236 |
237 | t = [ConvertCocoPolysToMask()]
238 |
239 | if transforms is not None:
240 | t.append(transforms)
241 | transforms = T.Compose(t)
242 |
243 | img_folder, ann_file = PATHS[image_set]
244 | img_folder = os.path.join(root, img_folder)
245 | ann_file = os.path.join(root, ann_file)
246 |
247 | dataset = CocoDetection(img_folder, ann_file, transforms=transforms)
248 |
249 | if image_set == "train":
250 | dataset = _coco_remove_images_without_annotations(dataset)
251 |
252 | # dataset = torch.utils.data.Subset(dataset, [i for i in range(500)])
253 |
254 | return dataset
255 |
256 |
257 | def get_coco_kp(root, image_set, transforms):
258 | return get_coco(root, image_set, transforms, mode="person_keypoints")
259 |
--------------------------------------------------------------------------------
/datasets/.ipynb_checkpoints/idd-checkpoint.py:
--------------------------------------------------------------------------------
1 | import os
2 | import xml.etree.ElementTree as ET
3 | from glob import glob
4 | from pathlib import Path
5 |
6 | import matplotlib
7 | import matplotlib.pyplot as plt
8 | import numpy as np
9 | import torchvision
10 | from PIL import Image
11 | from torchvision import transforms
12 |
13 | import torch
14 | import transforms as T
15 | import utils
16 | from torch import FloatTensor, Tensor
17 | from torch.utils.data import (DataLoader, Dataset, RandomSampler,
18 | SequentialSampler)
19 |
20 |
21 | def get_transform(train):
22 | transforms = []
23 | transforms.append(T.ToTensor())
24 | # transforms.append(T.Normalize(mean=(0.3520, 0.3520, 0.3520),std=(0.2930, 0.2930, 0.2930)))
25 | if train:
26 | transforms.append(T.RandomHorizontalFlip(0.5))
27 | return T.Compose(transforms)
28 |
29 |
30 | class IDD(torch.utils.data.Dataset):
31 | def __init__(self, list_img_path, list_anno_path, transforms=None):
32 | super(IDD, self).__init__()
33 | self.img = list_img_path
34 | self.anno = list_anno_path
35 | self.transforms = transforms
36 | self.classes = {
37 | "person": 0,
38 | "rider": 1,
39 | "car": 2,
40 | "truck": 3,
41 | "bus": 4,
42 | "motorcycle": 5,
43 | "bicycle": 6,
44 | "autorickshaw": 7,
45 | "animal": 8,
46 | "traffic light": 9,
47 | "traffic sign": 10,
48 | "vehicle fallback": 11,
49 | } #'caravan':12,'trailer':13,'train':14}
50 |
51 | def __len__(self):
52 | return len(self.img)
53 |
54 | def get_height_and_width(self, idx):
55 |         img_path = self.img[idx]  # datalist entries are already full paths
56 | img = Image.open(img_path).convert("RGB")
57 | dim_tensor = torchvision.transforms.ToTensor()(img).shape
58 | height, width = dim_tensor[1], dim_tensor[2]
59 | return height, width
60 |
61 | def get_label_bboxes(self, xml_obj):
62 | xml_obj = ET.parse(xml_obj)
63 | objects, bboxes = [], []
64 |
65 | for node in xml_obj.getroot().iter("object"):
66 | object_present = node.find("name").text
67 | xmin = int(node.find("bndbox/xmin").text)
68 | xmax = int(node.find("bndbox/xmax").text)
69 | ymin = int(node.find("bndbox/ymin").text)
70 | ymax = int(node.find("bndbox/ymax").text)
71 | if object_present in self.classes:
72 | objects.append(self.classes[object_present])
73 | bboxes.append((xmin, ymin, xmax, ymax))
74 | return Tensor(objects), Tensor(bboxes)
75 |
76 | def __getitem__(self, idx):
77 | img_path = self.img[idx]
78 | img = Image.open(img_path).convert("RGB")
79 |
80 |         labels, bboxes = self.get_label_bboxes(self.anno[idx])
81 |         labels = labels.type(torch.int64)  # detection losses expect int64 labels
82 |
83 | img_id = Tensor([idx])
84 | area = (bboxes[:, 3] - bboxes[:, 1]) * (bboxes[:, 2] - bboxes[:, 0])
85 |
86 |         iscrowd = torch.zeros(len(bboxes), dtype=torch.int64)
87 | target = {}
88 | target["boxes"] = bboxes
89 | target["labels"] = labels
90 | target["image_id"] = img_id
91 | target["area"] = area
92 | target["iscrowd"] = iscrowd
93 |
94 | if self.transforms is not None:
95 | img, target = self.transforms(img, target)
96 |
97 | return img, target
98 |
99 |
100 | class IDD_Test(torch.utils.data.Dataset):
101 | def __init__(self, list_img_path, list_anno_path):
102 | super(IDD_Test, self).__init__()
103 | self.img = sorted(list_img_path)
104 | self.anno = sorted(list_anno_path)
105 | self.classes = {
106 | "person": 0,
107 | "rider": 1,
108 | "car": 2,
109 | "truck": 3,
110 | "bus": 4,
111 | "motorcycle": 5,
112 | "bicycle": 6,
113 | "autorickshaw": 7,
114 | "animal": 8,
115 | "traffic light": 9,
116 | "traffic sign": 10,
117 | "vehicle fallback": 11,
118 | "caravan": 12,
119 | "trailer": 13,
120 | "train": 14,
121 | }
122 |
123 | def __len__(self):
124 | return len(self.img)
125 |
126 | def get_height_and_width(self, idx):
127 |         img_path = self.img[idx]  # datalist entries are already full paths
128 | img = Image.open(img_path).convert("RGB")
129 | dim_tensor = torchvision.transforms.ToTensor()(img).shape
130 | height, width = dim_tensor[1], dim_tensor[2]
131 | return height, width
132 |
133 | def get_label_bboxes(self, xml_obj):
134 | xml_obj = ET.parse(xml_obj)
135 | objects, bboxes = [], []
136 |
137 | for node in xml_obj.getroot().iter("object"):
138 | object_present = node.find("name").text
139 | xmin = int(node.find("bndbox/xmin").text)
140 | xmax = int(node.find("bndbox/xmax").text)
141 | ymin = int(node.find("bndbox/ymin").text)
142 | ymax = int(node.find("bndbox/ymax").text)
143 | objects.append(self.classes[object_present])
144 | bboxes.append((xmin, ymin, xmax, ymax))
145 | return Tensor(objects), Tensor(bboxes)
146 |
147 | def __getitem__(self, idx):
148 | img_path = self.img[idx]
149 | img = Image.open(img_path).convert("RGB")
150 |
151 |         labels, bboxes = self.get_label_bboxes(self.anno[idx])
152 |         labels = labels.type(torch.int64)  # detection losses expect int64 labels
153 |
154 | img_id = Tensor([idx])
155 | area = (bboxes[:, 3] - bboxes[:, 1]) * (bboxes[:, 2] - bboxes[:, 0])
156 |
157 |         iscrowd = torch.zeros(len(bboxes), dtype=torch.int64)
158 | target = {}
159 | target["boxes"] = bboxes
160 | target["labels"] = labels
161 | target["image_id"] = img_id
162 | target["area"] = area
163 | target["iscrowd"] = iscrowd
164 |
165 | return img, target
166 |
--------------------------------------------------------------------------------
/datasets/bdd.py:
--------------------------------------------------------------------------------
1 | import json
2 | import os
3 | from pathlib import Path
4 |
5 | import numpy as np
6 | import torchvision
7 | from PIL import Image
8 | from torchvision import transforms
9 | from tqdm import tqdm
10 |
11 | import torch
12 | import transforms as T
13 | import utils
14 | from torch import Tensor, nn
15 | from torch.utils.data import Dataset
16 |
17 |
18 | def get_ground_truths(train_img_path_list, anno_data):
19 |
20 | bboxes, total_bboxes = [], []
21 | labels, total_labels = [], []
22 | classes = {
23 | "bus": 0,
24 | "traffic light": 1,
25 | "traffic sign": 2,
26 | "person": 3,
27 | "bike": 4,
28 | "truck": 5,
29 | "motor": 6,
30 | "car": 7,
31 | "train": 8,
32 | "rider": 9,
33 | "drivable area": 10,
34 | "lane": 11,
35 | }
36 |
37 | for i in tqdm(range(len(train_img_path_list))):
38 | for j in range(len(anno_data[i]["labels"])):
39 | if "box2d" in anno_data[i]["labels"][j]:
40 | xmin = anno_data[i]["labels"][j]["box2d"]["x1"]
41 | ymin = anno_data[i]["labels"][j]["box2d"]["y1"]
42 | xmax = anno_data[i]["labels"][j]["box2d"]["x2"]
43 | ymax = anno_data[i]["labels"][j]["box2d"]["y2"]
44 | bbox = [xmin, ymin, xmax, ymax]
45 | category = anno_data[i]["labels"][j]["category"]
46 | cls = classes[category]
47 |
48 | bboxes.append(bbox)
49 | labels.append(cls)
50 |
51 | total_bboxes.append(torch.tensor(bboxes))
52 | total_labels.append(torch.tensor(labels))
53 | bboxes = []
54 | labels = []
55 |
56 | return total_bboxes, total_labels
57 |
58 |
59 | def _load_json(path_list_idx):
60 | with open(path_list_idx, "r") as file:
61 | data = json.load(file)
62 | return data
63 |
64 |
65 | def get_transform(train):
66 | transforms = []
67 | transforms.append(T.ToTensor())
68 | if train:
69 | transforms.append(T.RandomHorizontalFlip(0.5))
70 | return T.Compose(transforms)
71 |
72 |
73 | class BDD(torch.utils.data.Dataset):
74 | def __init__(
75 | self, img_path, anno_json_path, transforms=None
76 | ): # total_bboxes_list,total_labels_list,transforms=None):
77 | super(BDD, self).__init__()
78 | self.img_path = img_path
79 | self.anno_data = _load_json(anno_json_path)
80 | self.total_bboxes_list, self.total_labels_list = get_ground_truths(
81 | self.img_path, self.anno_data
82 | )
83 | self.transforms = transforms
84 | self.classes = {
85 | "bus": 0,
86 | "traffic light": 1,
87 | "traffic sign": 2,
88 | "person": 3,
89 | "bike": 4,
90 | "truck": 5,
91 | "motor": 6,
92 | "car": 7,
93 | "train": 8,
94 | "rider": 9,
95 | "drivable area": 10,
96 | "lane": 11,
97 | }
98 |
99 | def __len__(self):
100 | return len(self.img_path)
101 |
102 | def __getitem__(self, idx):
103 | img_path = self.img_path[idx]
104 | img = Image.open(img_path).convert("RGB")
105 |
106 | labels = self.total_labels_list[idx]
107 | bboxes = self.total_bboxes_list[idx]
108 | area = (bboxes[:, 3] - bboxes[:, 1]) * (bboxes[:, 2] - bboxes[:, 0])
109 |
110 | img_id = torch.tensor([idx])
111 |         iscrowd = torch.zeros(len(bboxes), dtype=torch.int64)
112 | target = {}
113 | target["boxes"] = bboxes
114 | target["labels"] = labels
115 | target["image_id"] = img_id
116 | target["area"] = area
117 | target["iscrowd"] = iscrowd
118 |
119 | if self.transforms is not None:
120 | img, target = self.transforms(img, target)
121 |
122 | return img, target
123 |
--------------------------------------------------------------------------------
/datasets/cityscapes.py:
--------------------------------------------------------------------------------
1 | import json
2 | import os
3 | import pickle
4 |
5 | import numpy as np
6 | import torchvision
7 | from PIL import Image
8 | from torchvision import transforms
9 |
10 | import torch
11 | import transforms as T
12 | import utils
13 | from torch import FloatTensor, Tensor
14 | from torch.utils.data import (DataLoader, Dataset, RandomSampler,
15 | SequentialSampler)
16 | from torch.utils.data.dataloader import default_collate
17 | from transforms import *
18 |
19 |
20 | class Cityscapes(torch.utils.data.Dataset):
21 | def __init__(
22 | self, image_path_list, target_path_list, split="train", transforms=None
23 | ):
24 | super(Cityscapes, self).__init__()
25 | self.images = image_path_list
26 |         self.targets = target_path_list
27 |         self.split = split  # stored so extra_repr can report it
28 |         self.transforms = transforms
28 | self.classes = {
29 | "pedestrian": 0,
30 | "rider": 1,
31 | "person group": 2,
32 | "person (other)": 3,
33 | "sitting person": 4,
34 | "ignore": 5,
35 | }
36 |
37 | def get_label_bboxes(self, label):
38 | """
39 |         Bounding boxes are in the form [x0, y0, w, h]
40 | """
41 | bboxes = []
42 | labels = []
43 | for data in label["objects"]:
44 | x0 = data["bbox"][0]
45 | y0 = data["bbox"][1]
46 | x1 = x0 + data["bbox"][2]
47 | y1 = y0 + data["bbox"][3]
48 | bbox_list = [x0, y0, x1, y1]
49 | labels.append(self.classes[data["label"]])
50 | bboxes.append(bbox_list)
51 | return Tensor(bboxes), Tensor(labels)
52 |
53 | def __len__(self):
54 | return len(self.images)
55 |
56 |     def extra_repr(self):
57 |         # only `split` is stored on this dataset
58 |         return "Split: {}".format(self.split)
59 |
60 | def _load_json(self, path_list_idx):
61 | with open(path_list_idx, "r") as file:
62 | data = json.load(file)
63 | return data
64 |
65 | def __getitem__(self, idx):
66 |
67 | image = Image.open(self.images[idx]).convert("RGB")
68 |
69 | data = self._load_json(self.targets[idx])
70 |
71 |         bboxes, labels = self.get_label_bboxes(data)
72 |         labels = labels.type(torch.int64)  # detection losses expect int64 labels
73 | area = (bboxes[:, 3] - bboxes[:, 1]) * (bboxes[:, 2] - bboxes[:, 0])
74 |         iscrowd = torch.zeros(len(bboxes), dtype=torch.int64)
75 |
76 | img_id = Tensor([idx])
77 | target = {}
78 | target["boxes"] = bboxes
79 | target["labels"] = labels
80 | target["image_id"] = img_id
81 | target["area"] = area
82 | target["iscrowd"] = iscrowd
83 |
84 | if self.transforms is not None:
85 | image, target = self.transforms(image, target)
86 | return image, target
87 |
88 |
89 | def get_transform(train):
90 | transforms = []
91 | transforms.append(T.ToTensor())
92 | # transforms.append(T.Normalize(mean=(0.485, 0.456, 0.406),std=(0.229, 0.224, 0.225)))
93 | if train:
94 | transforms.append(T.RandomHorizontalFlip(0.5))
95 | return T.Compose(transforms)
96 |
--------------------------------------------------------------------------------
/datasets/idd.py:
--------------------------------------------------------------------------------
1 | import os
2 | import time
3 | import xml.etree.ElementTree as ET
4 | from pathlib import Path
5 |
6 | import numpy as np
7 | import torchvision
8 | from PIL import Image
9 | from torchvision import transforms
10 |
11 | import torch
12 | import transforms as T
13 | import utils
14 | from coco_eval import CocoEvaluator
15 | from coco_utils import get_coco_api_from_dataset
16 | from torch import FloatTensor, Tensor
17 | from torch.utils.data import (DataLoader, Dataset, RandomSampler,
18 | SequentialSampler)
19 |
20 |
21 | def get_transform(train):
22 | transforms = []
23 | transforms.append(T.ToTensor())
24 | # transforms.append(T.Normalize(mean=(0.3520, 0.3520, 0.3520),std=(0.2930, 0.2930, 0.2930)))
25 | if train:
26 | transforms.append(T.RandomHorizontalFlip(0.5))
27 | return T.Compose(transforms)
28 |
29 |
30 | class IDD(torch.utils.data.Dataset):
31 | def __init__(self, list_img_path, list_anno_path, transforms=None):
32 | super(IDD, self).__init__()
33 | self.img = list_img_path
34 | self.anno = list_anno_path
35 | self.transforms = transforms
36 | self.classes = {
37 | "person": 0,
38 | "rider": 1,
39 | "car": 2,
40 | "truck": 3,
41 | "bus": 4,
42 | "motorcycle": 5,
43 | "bicycle": 6,
44 | "autorickshaw": 7,
45 | "animal": 8,
46 | "traffic light": 9,
47 | "traffic sign": 10,
48 | "vehicle fallback": 11,
49 | "caravan": 12,
50 | "trailer": 13,
51 | "train": 14,
52 | }
53 |
54 | def __len__(self):
55 | return len(self.img)
56 |
57 | def get_height_and_width(self, idx):
58 |         img_path = self.img[idx]  # datalist entries are already full paths
59 | img = Image.open(img_path).convert("RGB")
60 | dim_tensor = torchvision.transforms.ToTensor()(img).shape
61 | height, width = dim_tensor[1], dim_tensor[2]
62 | return height, width
63 |
64 | def get_label_bboxes(self, xml_obj):
65 | xml_obj = ET.parse(xml_obj)
66 | objects, bboxes = [], []
67 |
68 | for node in xml_obj.getroot().iter("object"):
69 | object_present = node.find("name").text
70 | xmin = int(node.find("bndbox/xmin").text)
71 | xmax = int(node.find("bndbox/xmax").text)
72 | ymin = int(node.find("bndbox/ymin").text)
73 | ymax = int(node.find("bndbox/ymax").text)
74 | objects.append(self.classes[object_present])
75 | bboxes.append((xmin, ymin, xmax, ymax))
76 | return Tensor(objects), Tensor(bboxes)
77 |
78 | def __getitem__(self, idx):
79 | img_path = self.img[idx]
80 | img = Image.open(img_path).convert("RGB")
81 |
82 |         # parse the XML annotation once and unpack labels and boxes
83 |         labels, bboxes = self.get_label_bboxes(self.anno[idx])
84 | labels = labels.type(torch.int64)
85 | img_id = Tensor([idx])
86 | area = (bboxes[:, 3] - bboxes[:, 1]) * (bboxes[:, 2] - bboxes[:, 0])
87 |
88 |         iscrowd = torch.zeros(len(bboxes), dtype=torch.int64)
89 | target = {}
90 | target["boxes"] = bboxes
91 | target["labels"] = labels
92 | target["image_id"] = img_id
93 | target["area"] = area
94 | target["iscrowd"] = iscrowd
95 |
96 | if self.transforms is not None:
97 | img, target = self.transforms(img, target)
98 |
99 | return img, target
100 |
--------------------------------------------------------------------------------
/engine.py:
--------------------------------------------------------------------------------
1 | # Adapted from torchvision, changes include tensorboard support
2 |
3 | import math
4 | import sys
5 | import time
6 |
7 | from tensorboardX import SummaryWriter
8 |
9 | import torch
10 | import utils
11 | from coco_eval import CocoEvaluator
12 | from coco_utils import get_coco_api_from_dataset
13 | from imports import *
14 |
15 | writer = SummaryWriter()
16 | num_iters = 0
17 |
18 |
19 | def train_one_epoch(model, optimizer, data_loader, device, epoch, print_freq):
20 | global num_iters
21 | model.train()
22 | metric_logger = utils.MetricLogger(delimiter=" ")
23 | metric_logger.add_meter("lr", utils.SmoothedValue(window_size=1, fmt="{value:.6f}"))
24 | header = "Epoch: [{}]".format(epoch)
25 |
26 | lr_scheduler = None
27 | if epoch == 0:
28 | warmup_factor = 1.0 / 1000
29 | warmup_iters = min(1000, len(data_loader) - 1)
30 |
31 | lr_scheduler = utils.warmup_lr_scheduler(optimizer, warmup_iters, warmup_factor)
32 |
33 | for images, targets in metric_logger.log_every(data_loader, print_freq, header):
34 | images = list(image.to(device) for image in images)
35 |
36 | targets = [{k: v.to(device) for k, v in t.items()} for t in targets]
37 |
38 | loss_dict = model(images, targets)
39 | num_iters += 1
40 | losses = sum(loss for loss in loss_dict.values())
41 |
42 | # reduce losses over all GPUs for logging purposes
43 | loss_dict_reduced = utils.reduce_dict(loss_dict)
44 | losses_reduced = sum(loss for loss in loss_dict_reduced.values())
45 |
46 | loss_value = losses_reduced.item()
47 |
48 | writer.add_scalar("Loss/train", loss_value, num_iters)
49 | writer.add_scalar("Learning rate", optimizer.param_groups[0]["lr"], num_iters)
50 | writer.add_scalar("Momentum", optimizer.param_groups[0]["momentum"], num_iters)
51 |
52 | if not math.isfinite(loss_value):
53 | print("Loss is {}, stopping training".format(loss_value))
54 | print(loss_dict_reduced)
55 | sys.exit(1)
56 |
57 | optimizer.zero_grad()
58 | losses.backward()
59 | optimizer.step()
60 |
61 | if lr_scheduler is not None:
62 | lr_scheduler.step()
63 |
64 | metric_logger.update(loss=losses_reduced, **loss_dict_reduced)
65 | metric_logger.update(lr=optimizer.param_groups[0]["lr"])
66 |
67 |
68 | def _get_iou_types(model):
69 | model_without_ddp = model
70 | if isinstance(model, torch.nn.parallel.DistributedDataParallel):
71 | model_without_ddp = model.module
72 | iou_types = ["bbox"]
73 | return iou_types
74 |
75 |
76 | @torch.no_grad()
77 | def evaluate(model, data_loader, device):
78 |     coco = get_coco_api_from_dataset(data_loader.dataset)
80 | n_threads = torch.get_num_threads()
81 | torch.set_num_threads(1)
82 | cpu_device = torch.device("cpu")
83 | model.eval()
84 | metric_logger = utils.MetricLogger(delimiter=" ")
85 | header = "Test:"
86 | model.to(device)
87 | iou_types = _get_iou_types(model)
88 | coco_evaluator = CocoEvaluator(coco, iou_types)
89 | to_tensor = torchvision.transforms.ToTensor()
90 | for image, targets in metric_logger.log_every(data_loader, 100, header):
91 |
92 | image = list(to_tensor(img).to(device) for img in image)
93 | targets = [{k: v.to(device) for k, v in t.items()} for t in targets]
94 | torch.cuda.synchronize()
95 | model_time = time.time()
96 |
97 | outputs = model(image)
98 |
99 | outputs = [{k: v.to(cpu_device) for k, v in t.items()} for t in outputs]
100 | model_time = time.time() - model_time
101 |
102 | res = {
103 | target["image_id"].item(): output
104 | for target, output in zip(targets, outputs)
105 | }
106 | evaluator_time = time.time()
107 | coco_evaluator.update(res)
108 | evaluator_time = time.time() - evaluator_time
109 | metric_logger.update(model_time=model_time, evaluator_time=evaluator_time)
110 |
111 | # gather the stats from all processes
112 | metric_logger.synchronize_between_processes()
113 | print("Averaged stats:", metric_logger)
114 | coco_evaluator.synchronize_between_processes()
115 |
116 | # accumulate predictions from all images
117 | coco_evaluator.accumulate()
118 | coco_evaluator.summarize()
119 | torch.set_num_threads(n_threads)
120 | return coco_evaluator
121 |
--------------------------------------------------------------------------------
/eval_idd_bdd.py:
--------------------------------------------------------------------------------
1 | # Adapted from torchvision, changes made to support evaluation on idd and bdd100k
2 |
3 | import pickle
4 | import time
5 |
6 | from coco_eval import CocoEvaluator
7 | from coco_utils import get_coco_api_from_dataset
8 | from datasets.bdd import *
9 | from datasets.idd import *
10 | from imports import *
11 |
12 | device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")
13 |
14 | ########################### User Defined settings ########################
15 | ds = "BDD"
16 | bdd_path = "/home/jupyter/autonue/data/bdd100k/"
17 | batch_size = 8
18 | model_name = "bdd100k_24.pth"
19 | idd_path = "/home/jupyter/autonue/data/IDD_Detection/"
20 | # name = 'do_ft_trained_bdd_eval_idd_ready.pth'
21 | use_checkpoint = False
22 | ################################ Dataset and Dataloader Management ##########################################
23 |
24 | print("Loading files")
25 |
26 | if ds == "IDD":
27 | print("Evaluation on India Driving dataset")
28 | with open("datalists/idd_images_path_list.txt", "rb") as fp:
29 | idd_image_path_list = pickle.load(fp)
30 | with open("datalists/idd_anno_path_list.txt", "rb") as fp:
31 | idd_anno_path_list = pickle.load(fp)
32 |
33 | val_img_paths = []
34 | with open(idd_path + "val.txt") as f:
35 | val_img_paths = f.readlines()
36 | for i in range(len(val_img_paths)):
37 | val_img_paths[i] = val_img_paths[i].strip("\n")
38 | val_img_paths[i] = val_img_paths[i] + ".jpg"
39 | val_img_paths[i] = os.path.join(idd_path + "JPEGImages", val_img_paths[i])
40 |
41 | val_anno_paths = []
42 | for i in range(len(val_img_paths)):
43 | val_anno_paths.append(val_img_paths[i].replace("JPEGImages", "Annotations"))
44 | val_anno_paths[i] = val_anno_paths[i].replace(".jpg", ".xml")
45 |
46 | val_img_paths, val_anno_paths = sorted(val_img_paths), sorted(val_anno_paths)
47 |
48 | assert len(val_img_paths) == len(val_anno_paths)
49 | # val_img_paths = val_img_paths[:10]
50 | # val_anno_paths = val_anno_paths[:10]
51 |
52 | val_dataset = IDD_Test(val_img_paths, val_anno_paths)
53 | val_dl = torch.utils.data.DataLoader(
54 | val_dataset,
55 | batch_size=batch_size,
56 | shuffle=True,
57 | num_workers=4,
58 | collate_fn=utils.collate_fn,
59 | )
60 |
61 | if ds == "BDD":
62 | print("Evaluation on Berkeley Deep Drive")
63 | root_img_path = os.path.join(bdd_path, "bdd100k_images_100k", "images", "100k")
64 | root_anno_path = os.path.join(bdd_path, "bdd100k_labels_release", "labels")
65 |
66 | val_img_path = root_img_path + "/val/"
67 | val_anno_json_path = root_anno_path + "/bdd100k_labels_images_val.json"
68 |
69 | with open("datalists/bdd100k_val_images_path.txt", "rb") as fp:
70 | bdd_img_path_list = pickle.load(fp)
71 |
72 | val_dataset = BDD(bdd_img_path_list, val_anno_json_path)
73 | val_dl = torch.utils.data.DataLoader(
74 | val_dataset,
75 | batch_size=batch_size,
76 | shuffle=True,
77 | num_workers=0,
78 | collate_fn=utils.collate_fn,
79 | pin_memory=True,
80 | )
81 |
82 | ###################################################################################################
83 |
84 |
85 | def get_model(num_classes):
86 | model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=False)
87 | in_features = model.roi_heads.box_predictor.cls_score.in_features
88 | model.roi_heads.box_predictor = torchvision.models.detection.faster_rcnn.FastRCNNPredictor(
89 | in_features, num_classes
90 | ) # replace the pre-trained head with a new one
91 | return model.cuda()
92 |
93 |
94 | ckpt = torch.load("saved_models/ulm_det_ft0.pth")
95 | model = get_model(15)
96 | model.load_state_dict(ckpt["model"])
97 |
98 | model_bdd = get_model(12)
99 | ckpt2 = torch.load("saved_models/bdd100k_24.pth")
100 | model_bdd.load_state_dict(ckpt2["model"])
101 |
102 | model.roi_heads = model_bdd.roi_heads
103 | model.roi_heads.load_state_dict(model_bdd.roi_heads.state_dict())
104 |
105 | model.cuda()
106 |
107 | params = [p for p in model.parameters() if p.requires_grad]
108 | optimizer = torch.optim.SGD(params, lr=0.005, momentum=0.9, weight_decay=0.0005)
109 | lr_scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=3, gamma=0.1)
110 |
111 | if use_checkpoint:
112 | checkpoint = torch.load("saved_models/" + model_name)
113 | model.load_state_dict(checkpoint["model"])
114 | print("Model Loaded successfully")
115 |
116 |
117 | def _get_iou_types(model):
118 | model_without_ddp = model
119 | if isinstance(model, torch.nn.parallel.DistributedDataParallel):
120 | model_without_ddp = model.module
121 | iou_types = ["bbox"]
122 | return iou_types
123 |
124 |
125 | print("##### Dataloader is ready #######")
126 | iou_types = _get_iou_types(model)
127 |
128 | print("Getting coco api from dataset")
129 | coco = get_coco_api_from_dataset(val_dl.dataset)
130 | print("Done")
131 |
132 |
133 | @torch.no_grad()
134 | def evaluate(model, data_loader, device):
135 | n_threads = torch.get_num_threads()
136 | # FIXME remove this and make paste_masks_in_image run on the GPU
137 | torch.set_num_threads(1)
138 | cpu_device = torch.device("cpu")
139 | model.eval()
140 | metric_logger = utils.MetricLogger(delimiter=" ")
141 | header = "Test:"
142 | model.cuda()
143 | # coco = get_coco_api_from_dataset(data_loader.dataset)
144 | iou_types = _get_iou_types(model)
145 | coco_evaluator = CocoEvaluator(coco, iou_types)
146 |
147 | for image, targets in metric_logger.log_every(data_loader, 100, header):
148 | # print(image)
149 | # image = torchvision.transforms.ToTensor()(image[0]) # Returns a scaler tuple
150 | # print(image.shape) # dim of image 1080x1920
151 |
152 | image = torchvision.transforms.ToTensor()(image[0]).to(device)
153 | # image = img.to(device) for img in image
154 | targets = [{k: v.to(device) for k, v in t.items()} for t in targets]
155 | torch.cuda.synchronize()
156 | model_time = time.time()
157 |
158 | outputs = model([image])
159 |
160 | outputs = [{k: v.to(cpu_device) for k, v in t.items()} for t in outputs]
161 | model_time = time.time() - model_time
162 |
163 | res = {
164 | target["image_id"].item(): output
165 | for target, output in zip(targets, outputs)
166 | }
167 | evaluator_time = time.time()
168 | coco_evaluator.update(res)
169 | evaluator_time = time.time() - evaluator_time
170 | metric_logger.update(model_time=model_time, evaluator_time=evaluator_time)
171 |
172 | # gather the stats from all processes
173 | metric_logger.synchronize_between_processes()
174 | print("Averaged stats:", metric_logger)
175 | coco_evaluator.synchronize_between_processes()
176 |
177 | # accumulate predictions from all images
178 | coco_evaluator.accumulate()
179 | coco_evaluator.summarize()
180 | torch.set_num_threads(n_threads)
181 | return coco_evaluator
182 |
183 |
184 | print("Evaluation in progress")
185 | evaluate(model, val_dl, device=device)
186 |
--------------------------------------------------------------------------------
/evaluation_baseline.py:
--------------------------------------------------------------------------------
1 | import pickle
2 | import time
3 |
4 | from cfg import *
5 | from coco_eval import CocoEvaluator
6 | from coco_utils import get_coco_api_from_dataset
7 | from datasets.bdd import *
8 | from datasets.cityscapes import *
9 | from datasets.idd import *
10 | from imports import *
10 |
11 | device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")
12 |
13 | print("Loading files")
14 |
15 | if ds in ["idd_non_hq", "idd_hq"]:
16 | print("Evaluation on India Driving dataset")
17 | with open("datalists/idd_images_path_list.txt", "rb") as fp:
18 | idd_image_path_list = pickle.load(fp)
19 | with open("datalists/idd_anno_path_list.txt", "rb") as fp:
20 | idd_anno_path_list = pickle.load(fp)
21 |
22 | val_img_paths = []
23 | with open(idd_path + "val.txt") as f:
24 | val_img_paths = f.readlines()
25 | for i in range(len(val_img_paths)):
26 | val_img_paths[i] = val_img_paths[i].strip("\n")
27 | val_img_paths[i] = val_img_paths[i] + ".jpg"
28 | val_img_paths[i] = os.path.join(idd_path + "JPEGImages", val_img_paths[i])
29 |
30 | val_anno_paths = []
31 | for i in range(len(val_img_paths)):
32 | val_anno_paths.append(val_img_paths[i].replace("JPEGImages", "Annotations"))
33 | val_anno_paths[i] = val_anno_paths[i].replace(".jpg", ".xml")
34 |
35 | val_img_paths, val_anno_paths = sorted(val_img_paths), sorted(val_anno_paths)
36 |
37 | assert len(val_img_paths) == len(val_anno_paths)
38 |     # Uncomment to smoke-test on a small subset:
39 |     # val_img_paths, val_anno_paths = val_img_paths[:10], val_anno_paths[:10]
40 |
41 | val_dataset = IDD(val_img_paths, val_anno_paths, None)
42 | val_dl = torch.utils.data.DataLoader(
43 | val_dataset,
44 | batch_size=batch_size,
45 | shuffle=True,
46 | num_workers=4,
47 | collate_fn=utils.collate_fn,
48 | )
49 |
50 | if ds == "bdd100k":
51 | print("Evaluation on Berkeley Deep Drive")
52 | root_img_path = os.path.join(bdd_path, "bdd100k_images_100k", "images", "100k")
53 | root_anno_path = os.path.join(bdd_path, "bdd100k_labels_release", "labels")
54 |
55 | val_img_path = root_img_path + "/val/"
56 | val_anno_json_path = root_anno_path + "/bdd100k_labels_images_val.json"
57 |
58 | with open("datalists/bdd100k_val_images_path.txt", "rb") as fp:
59 | bdd_img_path_list = pickle.load(fp)
60 | # bdd_img_path_list = bdd_img_path_list[:10]
61 | val_dataset = BDD(bdd_img_path_list, val_anno_json_path)
62 | val_dl = torch.utils.data.DataLoader(
63 | val_dataset,
64 | batch_size=batch_size,
65 | shuffle=True,
66 | num_workers=0,
67 | collate_fn=utils.collate_fn,
68 | pin_memory=True,
69 | )
70 |
71 | if ds == "Cityscapes":
72 | with open("datalists/cityscapes_val_images_path.txt", "rb") as fp:
73 | images = pickle.load(fp)
74 | with open("datalists/cityscapes_val_targets_path.txt", "rb") as fp:
75 | targets = pickle.load(fp)
76 |
77 | val_dataset = Cityscapes(images, targets)
78 | val_dl = torch.utils.data.DataLoader(
79 | val_dataset,
80 | batch_size=batch_size,
81 | shuffle=True,
82 | num_workers=4,
83 | collate_fn=utils.collate_fn,
84 | )
85 |
86 | ###################################################################################################
87 |
88 |
89 | def get_model(num_classes):
90 | model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=False)
91 | in_features = model.roi_heads.box_predictor.cls_score.in_features
92 | model.roi_heads.box_predictor = torchvision.models.detection.faster_rcnn.FastRCNNPredictor(
93 | in_features, num_classes
94 | ) # replace the pre-trained head with a new one
95 | return model.cuda()
96 |
97 |
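    | # Note: num_classes for FastRCNNPredictor includes the background class.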
98 | model = get_model(12)
99 | model.to(device)
100 | params = [p for p in model.parameters() if p.requires_grad]
101 | optimizer = torch.optim.SGD(params, lr=0.005, momentum=0.9, weight_decay=0.0005)
102 | lr_scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=3, gamma=0.1)
103 |
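    | # Checkpoints are stored as dicts; the weights live under the "model" key.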
104 | checkpoint = torch.load("saved_models/" + model_name)
105 | model.load_state_dict(checkpoint["model"])
106 | print("Model Loaded successfully")
107 |
108 | print("##### Dataloader is ready #####")
109 |
110 |
111 | print("Getting coco api from dataset")
112 | coco = get_coco_api_from_dataset(val_dl.dataset)
113 | print("Done")
114 |
115 | print("Evaluation in progress")
116 | evaluate(model, val_dl, device=device)
117 |
--------------------------------------------------------------------------------
/exp/evaluate_script.py:
--------------------------------------------------------------------------------
1 | from collections import OrderedDict
2 |
3 | from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
4 |
5 | from cfg import *
6 | from datasets.bdd import *
7 | from datasets.idd import *
8 | from imports import *
9 |
10 | batch_size = 16
11 |
12 |
13 | def get_model(num_classes):
14 | model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True).cpu()
15 | in_features = model.roi_heads.box_predictor.cls_score.in_features
16 | model.roi_heads.box_predictor = torchvision.models.detection.faster_rcnn.FastRCNNPredictor(
17 | in_features, num_classes
18 | ).cpu() # replace the pre-trained head with a new one
19 | return model.cpu()
20 |
21 |
22 | with open("datalists/idd_val_images_path_list.txt", "rb") as fp:
23 | val_img_paths = pickle.load(fp)
24 |
25 | with open("datalists/idd_val_anno_path_list.txt", "rb") as fp:
26 | val_anno_paths = pickle.load(fp)
27 | # val_img_paths = val_img_paths[:10]
28 | # val_anno_paths = val_anno_paths[:10]
29 | val_dataset_idd = IDD(val_img_paths, val_anno_paths)
30 | val_dl_idd = torch.utils.data.DataLoader(
31 | val_dataset_idd,
32 | batch_size=batch_size,
33 | shuffle=True,
34 | num_workers=4,
35 | collate_fn=utils.collate_fn,
36 | )
37 |
38 | root_img_path = os.path.join(bdd_path, "bdd100k_images_100k", "images", "100k")
39 | root_anno_path = os.path.join(bdd_path, "bdd100k_labels_release", "labels")
40 |
41 | val_img_path = root_img_path + "/val/"
42 | val_anno_json_path = root_anno_path + "/bdd100k_labels_images_val.json"
43 |
44 | with open("datalists/bdd100k_val_images_path.txt", "rb") as fp:
45 | bdd_img_path_list = pickle.load(fp)
46 | # bdd_img_path_list = bdd_img_path_list[:10]
47 | val_dataset_bdd = BDD(bdd_img_path_list, val_anno_json_path)
48 | val_dl_bdd = torch.utils.data.DataLoader(
49 | val_dataset_bdd,
50 | batch_size=batch_size,
51 | shuffle=True,
52 | num_workers=0,
53 | collate_fn=utils.collate_fn,
54 | pin_memory=True,
55 | )
56 |
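    | # Build the COCO ground-truth index once per dataset; it is reused for
    | # every checkpoint evaluated below.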
57 | coco_idd = get_coco_api_from_dataset(val_dl_idd.dataset)
58 | coco_bdd = get_coco_api_from_dataset(val_dl_bdd.dataset)
59 |
60 |
61 | @torch.no_grad()
62 | def evaluate_(model, coco_dset, data_loader, device):
63 |     coco = coco_dset
65 | n_threads = torch.get_num_threads()
66 | # FIXME remove this and make paste_masks_in_image run on the GPU
67 | torch.set_num_threads(1)
68 | cpu_device = torch.device("cpu")
69 | model.eval()
70 | metric_logger = utils.MetricLogger(delimiter=" ")
71 | header = "Test:"
72 | model.to(device)
73 | iou_types = _get_iou_types(model)
74 | coco_evaluator = CocoEvaluator(coco, iou_types)
75 | to_tensor = torchvision.transforms.ToTensor()
76 | for image, targets in metric_logger.log_every(data_loader, 100, header):
77 |
78 | image = list(to_tensor(img).to(device) for img in image)
79 | targets = [{k: v.to(device) for k, v in t.items()} for t in targets]
80 | torch.cuda.synchronize()
81 | model_time = time.time()
82 |
83 | outputs = model(image)
84 |
85 | outputs = [{k: v.to(cpu_device) for k, v in t.items()} for t in outputs]
86 | model_time = time.time() - model_time
87 |
88 | res = {
89 | target["image_id"].item(): output
90 | for target, output in zip(targets, outputs)
91 | }
92 | evaluator_time = time.time()
93 | coco_evaluator.update(res)
94 | evaluator_time = time.time() - evaluator_time
95 | metric_logger.update(model_time=model_time, evaluator_time=evaluator_time)
96 |
97 | # gather the stats from all processes
98 | metric_logger.synchronize_between_processes()
99 | print("Averaged stats:", metric_logger)
100 | coco_evaluator.synchronize_between_processes()
101 |
102 | # accumulate predictions from all images
103 | coco_evaluator.accumulate()
104 | coco_evaluator.summarize()
105 | torch.set_num_threads(n_threads)
106 | return coco_evaluator
107 |
108 |
109 | def _get_iou_types(model):
110 | model_without_ddp = model
111 | if isinstance(model, torch.nn.parallel.DistributedDataParallel):
112 | model_without_ddp = model.module
113 | iou_types = ["bbox"]
114 | return iou_types
115 |
116 |
117 | device = torch.device("cuda")
118 |
119 | trained_models = [
120 | # 'task_2_1/s_bdd_t_idd_task_new_2_1_epoch_0.pth',
121 | # 'task_2_1/s_bdd_t_idd_task_new_2_1_epoch_1.pth',
122 | # 'task_2_1/s_bdd_t_idd_task_new_2_1_epoch_2.pth',
124 | # 'task_2_1/s_bdd_t_idd_task_new_2_1_epoch_3.pth',
125 | # 'task_2_1/s_bdd_t_idd_task_new_2_1_epoch_4.pth',
126 | # 'task_2_1/s_bdd_t_idd_task_new_2_1_epoch_5.pth',
127 | # 'task_2_1/s_bdd_t_idd_task_new_2_1_epoch_6.pth',
128 | # 'task_2_1/s_bdd_t_idd_task_new_2_1_epoch_7.pth',
129 | # 'task_2_1/s_bdd_t_idd_task_new_2_1_epoch_8.pth',
130 | # 'task_2_1/s_bdd_t_idd_task_new_2_1_epoch_9.pth',
131 | "task_2_1/s_bdd_t_idd_task_new_2_1_epoch_10.pth",
132 | # 'task_2_1/s_bdd_t_idd_task_new_2_1_epoch_11.pth'
133 | ]
134 |
135 | for idx in tqdm(range(0, len(trained_models))):
136 | model = get_model(15)
137 | ckpt = torch.load("saved_models/" + trained_models[idx])
138 | model.load_state_dict(ckpt["model"])
139 |
140 | model.to(device)
141 |
142 | print("########## Evaluation of IDD ", "### IDX ", trained_models[idx])
143 |
144 | evaluate_(model, coco_idd, val_dl_idd, device=torch.device("cuda"))
145 |
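    |     # After scoring IDD, swap in a 12-class box predictor and load the RoI
    |     # heads from the BDD-trained checkpoint so the same backbone is scored on BDD.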
146 | model.roi_heads.box_predictor = FastRCNNPredictor(1024, 12)
147 |
148 | model_bdd = get_model(12)
149 | checkpoint = torch.load("saved_models/" + "bdd100k_24.pth")
150 | model_bdd.load_state_dict(checkpoint["model"])
151 |
152 | model.roi_heads.load_state_dict(model_bdd.roi_heads.state_dict())
153 |
154 | model.cuda()
155 |
156 |     # Freeze everything, then unfreeze only the RPN and the RoI heads.
157 |     for n, p in model.named_parameters():
158 |         p.requires_grad = False
159 |
160 |     for n, p in model.rpn.named_parameters():
161 |         p.requires_grad = True  # the RPN holds 593,935 trainable parameters
162 |     for n, p in model.roi_heads.named_parameters():
163 |         p.requires_grad = True
164 |
165 | print("########## Evaluation of BDD ", "### IDX ", trained_models[idx])
166 | evaluate_(model, coco_bdd, val_dl_bdd, device=torch.device("cuda"))
167 |
168 | del model, model_bdd
169 |
--------------------------------------------------------------------------------
/exp/evaluation_transport.py:
--------------------------------------------------------------------------------
1 | import pickle
2 | import time
3 |
4 | from coco_eval import CocoEvaluator
5 | from coco_utils import get_coco_api_from_dataset
6 | from datasets.bdd import *
7 | from datasets.cityscapes import *
8 | from datasets.idd import *
8 | from detection import faster_rcnn
9 |
10 | device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")
11 |
12 | ########################### User Defined settings ########################
13 | ds = "IDD"
14 | bdd_path = "/home/jupyter/autonue/data/bdd100k/"
15 | idd_path = "/home/jupyter/autonue/data/IDD_Detection/"
16 | batch_size = 8
17 | model_name = "bdd100k_24.pth"
18 | # name = 'do_ft_trained_bdd_eval_idd_ready.pth'
19 | ################################ Dataset and Dataloader Management ##########################################
20 |
21 | print("Loading files")
22 |
23 | if ds == "IDD":
24 | # with open("datalists/idd_images_path_list.txt", "rb") as fp:
25 | # idd_image_path_list = pickle.load(fp)
26 | # with open("datalists/idd_anno_path_list.txt", "rb") as fp:
27 | # idd_anno_path_list = pickle.load(fp)
28 |
29 | val_img_paths = []
30 | with open(idd_path + "val.txt") as f:
31 | val_img_paths = f.readlines()
32 | for i in range(len(val_img_paths)):
33 | val_img_paths[i] = val_img_paths[i].strip("\n")
34 | val_img_paths[i] = val_img_paths[i] + ".jpg"
35 | val_img_paths[i] = os.path.join(idd_path + "JPEGImages", val_img_paths[i])
36 |
37 | val_anno_paths = []
38 | for i in range(len(val_img_paths)):
39 | val_anno_paths.append(val_img_paths[i].replace("JPEGImages", "Annotations"))
40 | val_anno_paths[i] = val_anno_paths[i].replace(".jpg", ".xml")
41 |
42 | val_img_paths, val_anno_paths = sorted(val_img_paths), sorted(val_anno_paths)
43 |
44 | assert len(val_img_paths) == len(val_anno_paths)
45 | # val_img_paths = val_img_paths[:10]
46 | # val_anno_paths = val_anno_paths[:10]
47 |
48 | val_dataset = IDD_Test(val_img_paths, val_anno_paths)
49 | val_dl = torch.utils.data.DataLoader(
50 | val_dataset,
51 | batch_size=batch_size,
52 | shuffle=True,
53 | num_workers=4,
54 | collate_fn=utils.collate_fn,
55 | )
56 |
57 | if ds == "BDD":
58 | root_img_path = os.path.join(bdd_path, "bdd100k_images_100k", "images", "100k")
59 | root_anno_path = os.path.join(bdd_path, "bdd100k_labels_release", "labels")
60 |
61 | val_img_path = root_img_path + "/val/"
62 | val_anno_json_path = root_anno_path + "/bdd100k_labels_images_val.json"
63 |
64 | with open("datalists/bdd100k_val_images_path.txt", "rb") as fp:
65 | bdd_img_path_list = pickle.load(fp)
66 |
67 | val_dataset = BDD(bdd_img_path_list, val_anno_json_path)
68 | val_dl = torch.utils.data.DataLoader(
69 | val_dataset,
70 | batch_size=batch_size,
71 | shuffle=True,
72 | num_workers=4,
73 | collate_fn=utils.collate_fn,
74 | )
75 |
76 | if ds == "Cityscapes":
77 | with open("datalists/cityscapes_val_images_path.txt", "rb") as fp:
78 | images = pickle.load(fp)
79 | with open("datalists/cityscapes_val_targets_path.txt", "rb") as fp:
80 | targets = pickle.load(fp)
81 |
82 | val_dataset = Cityscapes(images, targets)
83 | val_dl = torch.utils.data.DataLoader(
84 | val_dataset,
85 | batch_size=batch_size,
86 | shuffle=True,
87 | num_workers=4,
88 | collate_fn=utils.collate_fn,
89 | )
90 |
91 | ###################################################################################################
92 |
93 |
94 | def get_model(num_classes):
95 | model = faster_rcnn.fasterrcnn_resnet50_fpn(pretrained=True)
96 | in_features = model.roi_heads.box_predictor.cls_score.in_features
97 | model.roi_heads.box_predictor = torchvision.models.detection.faster_rcnn.FastRCNNPredictor(
98 | in_features, num_classes
99 | ) # replace the pre-trained head with a new one
100 | return model.cuda()
101 |
102 |
103 | model = get_model(12)
104 | model.to(device)
105 | params = [p for p in model.parameters() if p.requires_grad]
106 | optimizer = torch.optim.SGD(params, lr=0.005, momentum=0.9, weight_decay=0.0005)
107 | lr_scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=3, gamma=0.1)
108 |
109 | checkpoint = torch.load("saved_models/" + model_name)
110 | model.load_state_dict(checkpoint["model"])
111 | print("Model Loaded successfully")
112 |
113 |
114 | def _get_iou_types(model):
115 | model_without_ddp = model
116 | if isinstance(model, torch.nn.parallel.DistributedDataParallel):
117 | model_without_ddp = model.module
118 | iou_types = ["bbox"]
119 | return iou_types
120 |
121 |
122 | print("##### Dataloader is ready #####")
123 | iou_types = _get_iou_types(model)
124 |
125 | print("Getting coco api from dataset")
126 | coco = get_coco_api_from_dataset(val_dl.dataset)
127 | print("Done")
128 |
129 |
130 | @torch.no_grad()
131 | def evaluate(model, data_loader, device):
132 | n_threads = torch.get_num_threads()
133 | # FIXME remove this and make paste_masks_in_image run on the GPU
134 | torch.set_num_threads(1)
135 | cpu_device = torch.device("cpu")
136 | model.eval()
137 | metric_logger = utils.MetricLogger(delimiter=" ")
138 | header = "Test:"
139 |     model.to(device)  # honour the device argument instead of forcing CUDA
140 | # coco = get_coco_api_from_dataset(data_loader.dataset)
141 | iou_types = _get_iou_types(model)
142 | coco_evaluator = CocoEvaluator(coco, iou_types)
143 |
144 |     for image, targets in metric_logger.log_every(data_loader, 100, header):
145 |         # The datasets return PIL images; convert every image in the batch,
146 |         # not just the first one, so that each target below is paired with
147 |         # the prediction for its own image.
148 |         images = [torchvision.transforms.ToTensor()(img).to(device) for img in image]
149 |         targets = [{k: v.to(device) for k, v in t.items()} for t in targets]
150 |
151 |         if torch.cuda.is_available():
152 |             torch.cuda.synchronize()
153 |         model_time = time.time()
154 |
155 |         outputs = model(images)
156 |
157 | outputs = [{k: v.to(cpu_device) for k, v in t.items()} for t in outputs]
158 | model_time = time.time() - model_time
159 |
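    |         # Predictions are keyed by image_id so CocoEvaluator can match ground truth.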
160 | res = {
161 | target["image_id"].item(): output
162 | for target, output in zip(targets, outputs)
163 | }
164 | evaluator_time = time.time()
165 | coco_evaluator.update(res)
166 | evaluator_time = time.time() - evaluator_time
167 | metric_logger.update(model_time=model_time, evaluator_time=evaluator_time)
168 |
169 | # gather the stats from all processes
170 | metric_logger.synchronize_between_processes()
171 | print("Averaged stats:", metric_logger)
172 | coco_evaluator.synchronize_between_processes()
173 |
174 | # accumulate predictions from all images
175 | coco_evaluator.accumulate()
176 | coco_evaluator.summarize()
177 | torch.set_num_threads(n_threads)
178 | return coco_evaluator
179 |
180 |
181 | print("Evaluation in progress")
182 | evaluate(model, val_dl, device=device)
183 |
--------------------------------------------------------------------------------
/exp/optimal_transport.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "code",
5 | "execution_count": 1,
6 | "metadata": {},
7 | "outputs": [
8 | {
9 | "name": "stdout",
10 | "output_type": "stream",
11 | "text": [
12 | "Unet loaded successfully\n"
13 | ]
14 | }
15 | ],
16 | "source": [
17 | "from imports import *\n",
18 | "from datasets.idd import *\n",
19 | "from datasets.bdd import *\n",
20 | "from detection.unet import *\n",
21 | "from collections import OrderedDict\n",
22 | "from torch_cluster import nearest\n",
23 | "from fastprogress import master_bar, progress_bar"
24 | ]
25 | },
26 | {
27 | "cell_type": "code",
28 | "execution_count": 2,
29 | "metadata": {},
30 | "outputs": [],
31 | "source": [
32 | "batch_size=8\n",
33 | "num_epochs=1"
34 | ]
35 | },
36 | {
37 | "cell_type": "code",
38 | "execution_count": 3,
39 | "metadata": {},
40 | "outputs": [
41 | {
42 | "name": "stdout",
43 | "output_type": "stream",
44 | "text": [
45 | "Loading files\n"
46 | ]
47 | },
48 | {
49 | "name": "stderr",
50 | "output_type": "stream",
51 | "text": [
52 | "100%|██████████| 69863/69863 [00:02<00:00, 25953.05it/s]\n"
53 | ]
54 | }
55 | ],
56 | "source": [
57 | "path = '/home/jupyter/autonue/data'\n",
58 | "root_img_path = os.path.join(path,'bdd100k','images','100k')\n",
59 | "root_anno_path = os.path.join(path,'bdd100k','labels')\n",
60 | "\n",
61 | "train_img_path = root_img_path+'/train/'\n",
62 | "val_img_path = root_img_path+'/val/'\n",
63 | "\n",
64 | "train_anno_json_path = root_anno_path+'/bdd100k_labels_images_train.json'\n",
65 | "val_anno_json_path = root_anno_path+'/bdd100k_labels_images_val.json'\n",
66 | "\n",
67 | "print(\"Loading files\")\n",
68 | "\n",
69 | "with open(\"datalists/bdd100k_train_images_path.txt\", \"rb\") as fp:\n",
70 | " train_img_path_list = pickle.load(fp)\n",
71 | "with open(\"datalists/bdd100k_val_images_path.txt\", \"rb\") as fp:\n",
72 | " val_img_path_list = pickle.load(fp)\n",
73 | "\n",
74 | "src_dataset = dset = BDD(train_img_path_list,train_anno_json_path,get_transform(train=True))\n",
75 | "src_dl = torch.utils.data.DataLoader(src_dataset, batch_size=batch_size, shuffle=True, num_workers=4,collate_fn=utils.collate_fn) "
76 | ]
77 | },
78 | {
79 | "cell_type": "code",
80 | "execution_count": 4,
81 | "metadata": {},
82 | "outputs": [],
83 | "source": [
84 | "with open(\"datalists/idd_images_path_list.txt\", \"rb\") as fp:\n",
85 | " non_hq_img_paths = pickle.load(fp)\n",
86 | "with open(\"datalists/idd_anno_path_list.txt\", \"rb\") as fp:\n",
87 | " non_hq_anno_paths = pickle.load(fp)\n",
88 | "\n",
89 | "with open(\"datalists/idd_hq_images_path_list.txt\", \"rb\") as fp:\n",
90 | " hq_img_paths = pickle.load(fp)\n",
91 | "with open(\"datalists/idd_hq_anno_path_list.txt\", \"rb\") as fp:\n",
92 | " hq_anno_paths = pickle.load(fp)\n",
93 | " \n",
94 | "trgt_images = hq_img_paths #non_hq_img_paths #\n",
95 | "trgt_annos = hq_anno_paths #non_hq_anno_paths #hq_anno_paths + \n",
96 | "trgt_dataset = IDD(trgt_images,trgt_annos,get_transform(train=True))\n",
97 | "trgt_dl = torch.utils.data.DataLoader(trgt_dataset, batch_size=batch_size, shuffle=True, num_workers=4,collate_fn=utils.collate_fn)"
98 | ]
99 | },
100 | {
101 | "cell_type": "code",
102 | "execution_count": 5,
103 | "metadata": {},
104 | "outputs": [],
105 | "source": [
106 | "#src_dataset[0][0].shape,trgt_dataset[0][0].shape"
107 | ]
108 | },
109 | {
110 | "cell_type": "code",
111 | "execution_count": 6,
112 | "metadata": {},
113 | "outputs": [],
114 | "source": [
115 | "class TransportBlock(nn.Module):\n",
116 | " def __init__(self,backbone,n_channels=256,batch_size=2):\n",
117 | " super(TransportBlock, self).__init__()\n",
118 | " self.backbone = backbone.cuda()\n",
119 | "        self.stats = [0.485, 0.456, 0.406], [0.229, 0.224, 0.225]  # ImageNet mean/std\n",
120 | "        self.batch_size = batch_size\n",
121 | "        self.unet = Unet(n_channels).cuda()\n",
122 | "\n",
123 | "        # The detector backbone stays frozen; only the U-Net is trained.\n",
124 | "        for name, p in self.backbone.named_parameters():\n",
125 | "            p.requires_grad = False\n",
125 | " \n",
126 | " def unet_forward(self,x):\n",
127 | " return self.unet(x)\n",
128 | " \n",
129 | " def transport_loss(self,S_embeddings, T_embeddings, N_cluster=5):\n",
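    | "        # Sketch of the transport objective: cluster source features,\n",
    | "        # assign target features to the nearest source centroid, and\n",
    | "        # penalise centroid drift weighted by intra-cluster variance.\n",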
130 | "        Loss = 0.\n",
131 | "        for batch in range(self.batch_size):\n",
132 | "            # Work on per-sample views; reassigning S_embeddings/T_embeddings\n",
133 | "            # here would clobber the full batch for later iterations.\n",
134 | "            S = S_embeddings[batch].view(256, -1)\n",
135 | "            T = T_embeddings[batch].view(256, -1)\n",
136 | "\n",
137 | "            N_random_vec = S[np.random.choice(S.shape[0], N_cluster)]\n",
138 | "\n",
139 | "            cluster_labels = nearest(S, N_random_vec)\n",
140 | "            cluster_centroids = torch.cat([torch.mean(S[cluster_labels == label], dim=0).unsqueeze(0) for label in cluster_labels])\n",
141 | "\n",
142 | "            Target_labels = nearest(T, cluster_centroids)\n",
143 | "\n",
144 | "            target_centroids = []\n",
145 | "            for label in cluster_labels:\n",
146 | "                if label in Target_labels:\n",
147 | "                    target_centroids.append(torch.mean(T[Target_labels == label], dim=0))\n",
148 | "                else:\n",
149 | "                    target_centroids.append(cluster_centroids[label])\n",
150 | "\n",
151 | "            target_centroids = torch.cat(target_centroids)\n",
152 | "\n",
153 | "            dist = lambda x, y: torch.mean((x - y) ** 2)\n",
154 | "            intra_class_variance = torch.cat([dist(T[Target_labels[label]], target_centroids[label]).unsqueeze(0) for label in cluster_labels])\n",
155 | "            centroid_distance = torch.cat([dist(target_centroids[label], cluster_centroids[label]).unsqueeze(0) for label in cluster_labels])\n",
156 | "\n",
157 | "            Loss += torch.mean(centroid_distance * intra_class_variance)  # akin to an earth-mover distance\n",
158 | "        return Loss"
157 | ]
158 | },
159 | {
160 | "cell_type": "code",
161 | "execution_count": 7,
162 | "metadata": {},
163 | "outputs": [],
164 | "source": [
165 | "def get_model(num_classes):\n",
166 | " model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True).cpu()\n",
167 | " in_features = model.roi_heads.box_predictor.cls_score.in_features\n",
168 | " model.roi_heads.box_predictor = torchvision.models.detection.faster_rcnn.FastRCNNPredictor(in_features, num_classes).cpu() # replace the pre-trained head with a new one\n",
169 | " return model.cpu()"
170 | ]
171 | },
172 | {
173 | "cell_type": "code",
174 | "execution_count": 8,
175 | "metadata": {},
176 | "outputs": [],
177 | "source": [
178 | "ckpt = torch.load('saved_models/bdd100k_24.pth')"
179 | ]
180 | },
181 | {
182 | "cell_type": "code",
183 | "execution_count": 9,
184 | "metadata": {},
185 | "outputs": [
186 | {
187 | "data": {
188 | "text/plain": [
189 | "IncompatibleKeys(missing_keys=[], unexpected_keys=[])"
190 | ]
191 | },
192 | "execution_count": 9,
193 | "metadata": {},
194 | "output_type": "execute_result"
195 | }
196 | ],
197 | "source": [
198 | "model = get_model(12)\n",
199 | "model.load_state_dict(torch.load('saved_models/bdd100k_24.pth')['model'])"
200 | ]
201 | },
202 | {
203 | "cell_type": "code",
204 | "execution_count": 10,
205 | "metadata": {},
206 | "outputs": [],
207 | "source": [
208 | "ot = TransportBlock(model.backbone)\n",
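    | "# Only the U-Net is optimised; the backbone inside TransportBlock is frozen.\n",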
209 | "params = [p for p in ot.unet.parameters() if p.requires_grad]\n",
210 | "optimizer = torch.optim.SGD(params, lr=1e-3,momentum=0.9, weight_decay=0.0005)\n",
211 | "lr_scheduler = torch.optim.lr_scheduler.CyclicLR(optimizer,base_lr=1e-3,max_lr=6e-3)"
212 | ]
213 | },
214 | {
215 | "cell_type": "code",
216 | "execution_count": 11,
217 | "metadata": {},
218 | "outputs": [
219 | {
220 | "data": {
221 | "text/plain": [
222 | "GeneralizedRCNNTransform()"
223 | ]
224 | },
225 | "execution_count": 11,
226 | "metadata": {},
227 | "output_type": "execute_result"
228 | }
229 | ],
230 | "source": [
231 | "from detection import transform\n",
232 | "transform = transform.GeneralizedRCNNTransform(min_size=800, max_size=1333, image_mean=[0.485, 0.456, 0.406], image_std=[0.229, 0.224, 0.225])\n",
233 | "transform.eval()"
234 | ]
235 | },
236 | {
237 | "cell_type": "code",
238 | "execution_count": 12,
239 | "metadata": {},
240 | "outputs": [
241 | {
242 | "data": {
243 | "text/html": [
244 | "\n",
245 | "