├── LICENSE ├── README.md ├── assets └── eval_baseline_idd.png ├── cfg.py ├── coco_eval.py ├── coco_utils.py ├── datasets ├── .ipynb_checkpoints │ └── idd-checkpoint.py ├── bdd.py ├── cityscapes.py └── idd.py ├── engine.py ├── eval_idd_bdd.py ├── evaluation_baseline.py ├── exp ├── evaluate_script.py ├── evaluation_transport.py ├── exp.ipynb ├── optimal_transport.ipynb └── train_script.py ├── get_datalists.py ├── imports.py ├── inference.ipynb ├── train_baseline.py ├── transforms.py └── utils.py /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2019 Prajjwal Bhargava 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | ## Object detection for autonomous navigation 2 | This repository provides core support for performing object detection on navigation datasets. Support for 3D object detection and domain adaptation are in experimental phase and will be added later. This project provides support for training, evaluation, inference, visualization. 3 | 4 | ### This repo also contains the code for: 5 | - [On Generalizing Detection Models for Unconstrained Environments (ICCV W 2019)](https://arxiv.org/abs/1909.13080) in `exp` 6 | 7 | If you use the code in any way, please consider citing: 8 | ``` 9 | @InProceedings{Bhargava_2019_ICCV, 10 | author = {Bhargava, Prajjwal}, 11 | title = {On Generalizing Detection Models for Unconstrained Environments}, 12 | booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops}, 13 | month = {Oct}, 14 | year = {2019} 15 | } 16 | ``` 17 | 18 | #### NEW: Pretrained models are now available 19 | 20 | ## Prerequisites 21 | - Pytorch >= 1.1 22 | - torchvision >= 0.3 23 | - tensorboardX (optional, required for visualizing) 24 | 25 | ## Datasets 26 | This work provides support for the following datasets (related to object detection for autonomous navigation): 27 | - [India Driving Dataset](https://idd.insaan.iiit.ac.in/) 28 | - [Berkeley Deep drive](https://bdd-data.berkeley.edu/) 29 | - [Cityscapes](https://www.cityscapes-dataset.com/) 30 | 31 | Directory structure : 32 | ``` 33 | +-- data 34 | | +-- bdd100k 35 | | +-- IDD_Detection 36 | | +-- cityscapes 37 | +-- autonmous-object-detection 38 | ....... 
39 | ```
40 | ### Getting started
41 | 1. Download the required dataset
42 | 2. Set up dataset paths in `cfg.py`
43 | 3. Create datalists
44 | 4. Start training and evaluating
45 | 
46 | ## Documentation
47 | 
48 | ### Setting up Config
49 | By default, all paths and hyperparameters are loaded from `cfg.py`. Users are required to specify the dataset paths and hyperparameters once.
50 | These can also be overridden by the user.
51 | 
52 | ### Datalists
53 | We use something called datalists. Datalists are lists that contain the paths to images and their labels. They are needed because some images don't have proper labels. Datalists ensure that only structured, usable data is kept (so the dataloader works seamlessly); data cleaning happens in the process.
54 | 
55 | Set the dataset path and the `ds` variable in `cfg.py` to select the dataset you want to use.
56 | ```
57 | python3 get_datalists.py
58 | ```
59 | 
60 | ### Datasets
61 | This step assumes that datalists have been created; it ensures that you won't get bad samples while the dataloader iterates. Create a directory named `data` and put all datasets inside it.
62 | This library uses a common API (similar to torchvision).
63 | All dataset classes expect the same inputs:
64 | ```
65 | Input:
66 | idd_image_path_list
67 | idd_anno_path_list
68 | get_transform: A transformation function.
69 | ```
70 | ```
71 | Output:
72 | A dict containing boxes, labels, image_id, area and iscrowd as torch tensors.
73 | ```
74 | - IDD
75 | 
76 | ```
77 | dset = IDD(idd_image_path_list, idd_anno_path_list, transforms=None)
78 | ```
79 | 
80 | - BDD100K
81 | 
82 | ```
83 | dset = BDD(bdd_img_path_list, train_anno_json_path, transforms=None)
84 | ```
85 | 
86 | BDD100k doesn't provide individual ground-truth files; a single JSON file is provided, so creating the dataset takes a little longer than usual while the JSON is parsed.
87 | 
88 | - Cityscapes
89 | 
90 | ```
91 | dset = Cityscapes(image_path_list, target_path_list, split='train', transforms=None)
92 | ```
93 | 
94 | This was tested with CityPersons (ground truths for the person class). You can extract ground truths from the segmentation labels as well, but you would have to manage the datalists yourself.
95 | 
96 | ### Transforms
97 | - ```get_transform(train: bool)```
98 | 
99 | Converts images into tensors and applies random horizontal flipping to the input data when `train` is True.
100 | 
101 | ### Model
102 | Any detection model can be used (YOLO, Faster R-CNN, SSD). Currently we provide support through torchvision.
103 | 
104 | ```
105 | from train_baseline import get_model
106 | model = get_model(len(classes)) # Returns a Faster R-CNN with a ResNet-50 backbone pretrained on COCO.
107 | ```
108 | 
109 | ### Training
110 | Support for baseline training has been added. Domain-adaptive features will be added later.
111 | Users need to specify the paths and the dataset in the script (in the user-defined settings section).
112 | 
113 | ```
114 | $ python train_baseline.py
115 | ```
116 | 
117 | ### Evaluation
118 | Evaluation is performed in COCO format. Users need to specify the saved `model_name` in `cfg.py` on which evaluation is supposed to occur.
119 | 
120 | The COCO API needs to be compiled. First, download it from [here](https://github.com/cocodataset/cocoapi):
121 | ```
122 | $ cd cocoapi/PythonAPI
123 | $ python setup.py build_ext install
124 | ```
125 | 
126 | Now evaluation can be performed.
127 | 
128 | ```
129 | $ python3 evaluation_baseline.py
130 | ```
131 | 
132 | ## Pretrained models
133 | Pretrained models for IDD and BDD100k are available [here](https://drive.google.com/open?id=1EGMce4aHlo7QpvMsxXgato87gQo8aYrk). 
For BDD100k, you can use the model straightaway. This model was used to perform incremental learning on IDD, as mentioned in the paper: the base network (the BDD100k model) was reused with new task-specific layers and trained on IDD.
134 | 
135 | ## Incremental learning support
136 | Please refer to the `exp` directory; the Jupyter notebooks are self-explanatory. Here are the results from the paper.
137 | 
138 | | S and T | Epoch | Active Components (with LR) | LR Range | mAP (%) at specified epochs |
139 | |------------------------------|---------------------|--------------------------------------------------------|---------------------|------------------------------------------------------|
140 | | BDD -> IDD<br>IDD -> BDD | 5<br>Eval | +ROI Head (1e-3) | 1e-3, 6e-3<br>- | 24.3<br>45.7 |
141 | | BDD -> IDD<br>IDD -> BDD | 5,9<br>Eval | +RPN (1e-4)<br>+ROI head (1e-3) | 1e-4, 6e-4<br>- | 24.7, 24.9<br>45.3, 45.0 |
142 | | BDD -> IDD<br>IDD -> BDD | 1,5,6,7<br>Eval | +RPN (1e-4) +ROI head (1e-3) | 1e-4, 6e-3<br>- | 24.3, 24.9, 24.9, 25.0<br>45.7, 44.8, 44.7, 44.7 |
143 | | BDD -> IDD<br>IDD -> BDD | 1,5,10<br>Eval | +ROI head (1e-3)<br>+RPN (4e-4) +FPN (2e-4) | 1e-4, 6e-3<br>- | 24.9, 25.4, 25.9<br>45.2, 43.9, 43.3 
| 144 | 145 | ### Inference 146 | 147 | Refer to `inference.ipynb` for plotting images with model's predictions. 148 | 149 | ### Visualization 150 | 151 | By default, tensorboard will start logging `loss` and `learning_rate` in `engine.py`. You can start by using 152 | ``` 153 | $ tensorboard /path/ --port=8888 154 | ``` 155 | 156 | ### Example 157 | 158 | ![img](assets/eval_baseline_idd.png) 159 | -------------------------------------------------------------------------------- /assets/eval_baseline_idd.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/prajjwal1/autonomous-object-detection/d52fd71d28209dbbbc064c97194e3b1171d7e825/assets/eval_baseline_idd.png -------------------------------------------------------------------------------- /cfg.py: -------------------------------------------------------------------------------- 1 | ########## User specific settings ########################## 2 | idd_path = "/home/jupyter/autonue/data/IDD_Detection/" 3 | bdd_path = "/home/jupyter/autonue/data/bdd100k" 4 | cityscapes_path = "/ml/temp/autonue/data/cityscapes" 5 | cityscapes_split = "train" 6 | 7 | idx = 1 8 | batch_size = 8 9 | 10 | num_epochs = 25 11 | lr = 0.001 12 | ckpt = False 13 | idd_hq = False 14 | model_name = "bdd100k_24.pth" 15 | ############################################################## 16 | 17 | dset_list = ["bdd100k", "idd_non_hq", "idd_hq", "Cityscapes"] 18 | ds = dset_list[idx] 19 | -------------------------------------------------------------------------------- /coco_eval.py: -------------------------------------------------------------------------------- 1 | import copy 2 | import json 3 | import tempfile 4 | import time 5 | from collections import defaultdict 6 | 7 | import numpy as np 8 | 9 | import pycocotools.mask as mask_util 10 | import torch 11 | import torch._six 12 | import utils 13 | from pycocotools.coco import COCO 14 | from pycocotools.cocoeval import COCOeval 15 | 16 | 17 | class CocoEvaluator(object): 18 | def __init__(self, coco_gt, iou_types): 19 | assert isinstance(iou_types, (list, tuple)) 20 | coco_gt = copy.deepcopy(coco_gt) 21 | self.coco_gt = coco_gt 22 | 23 | self.iou_types = iou_types 24 | self.coco_eval = {} 25 | for iou_type in iou_types: 26 | self.coco_eval[iou_type] = COCOeval(coco_gt, iouType=iou_type) 27 | 28 | self.img_ids = [] 29 | self.eval_imgs = {k: [] for k in iou_types} 30 | 31 | def update(self, predictions): 32 | img_ids = list(np.unique(list(predictions.keys()))) 33 | self.img_ids.extend(img_ids) 34 | 35 | for iou_type in self.iou_types: 36 | results = self.prepare(predictions, iou_type) 37 | coco_dt = loadRes(self.coco_gt, results) if results else COCO() 38 | coco_eval = self.coco_eval[iou_type] 39 | 40 | coco_eval.cocoDt = coco_dt 41 | coco_eval.params.imgIds = list(img_ids) 42 | img_ids, eval_imgs = evaluate(coco_eval) 43 | 44 | self.eval_imgs[iou_type].append(eval_imgs) 45 | 46 | def synchronize_between_processes(self): 47 | for iou_type in self.iou_types: 48 | self.eval_imgs[iou_type] = np.concatenate(self.eval_imgs[iou_type], 2) 49 | create_common_coco_eval( 50 | self.coco_eval[iou_type], self.img_ids, self.eval_imgs[iou_type] 51 | ) 52 | 53 | def accumulate(self): 54 | for coco_eval in self.coco_eval.values(): 55 | coco_eval.accumulate() 56 | 57 | def summarize(self): 58 | for iou_type, coco_eval in self.coco_eval.items(): 59 | print("IoU metric: {}".format(iou_type)) 60 | coco_eval.summarize() 61 | 62 | def prepare(self, predictions, iou_type): 63 | if 
iou_type == "bbox": 64 | return self.prepare_for_coco_detection(predictions) 65 | elif iou_type == "segm": 66 | return self.prepare_for_coco_segmentation(predictions) 67 | elif iou_type == "keypoints": 68 | return self.prepare_for_coco_keypoint(predictions) 69 | else: 70 | raise ValueError("Unknown iou type {}".format(iou_type)) 71 | 72 | def prepare_for_coco_detection(self, predictions): 73 | coco_results = [] 74 | for original_id, prediction in predictions.items(): 75 | if len(prediction) == 0: 76 | continue 77 | 78 | boxes = prediction["boxes"] 79 | boxes = convert_to_xywh(boxes).tolist() 80 | scores = prediction["scores"].tolist() 81 | labels = prediction["labels"].tolist() 82 | 83 | coco_results.extend( 84 | [ 85 | { 86 | "image_id": original_id, 87 | "category_id": labels[k], 88 | "bbox": box, 89 | "score": scores[k], 90 | } 91 | for k, box in enumerate(boxes) 92 | ] 93 | ) 94 | return coco_results 95 | 96 | def prepare_for_coco_segmentation(self, predictions): 97 | coco_results = [] 98 | for original_id, prediction in predictions.items(): 99 | if len(prediction) == 0: 100 | continue 101 | 102 | scores = prediction["scores"] 103 | labels = prediction["labels"] 104 | masks = prediction["masks"] 105 | 106 | masks = masks > 0.5 107 | 108 | scores = prediction["scores"].tolist() 109 | labels = prediction["labels"].tolist() 110 | 111 | rles = [ 112 | mask_util.encode(np.array(mask[0, :, :, np.newaxis], order="F"))[0] 113 | for mask in masks 114 | ] 115 | for rle in rles: 116 | rle["counts"] = rle["counts"].decode("utf-8") 117 | 118 | coco_results.extend( 119 | [ 120 | { 121 | "image_id": original_id, 122 | "category_id": labels[k], 123 | "segmentation": rle, 124 | "score": scores[k], 125 | } 126 | for k, rle in enumerate(rles) 127 | ] 128 | ) 129 | return coco_results 130 | 131 | def prepare_for_coco_keypoint(self, predictions): 132 | coco_results = [] 133 | for original_id, prediction in predictions.items(): 134 | if len(prediction) == 0: 135 | continue 136 | 137 | boxes = prediction["boxes"] 138 | boxes = convert_to_xywh(boxes).tolist() 139 | scores = prediction["scores"].tolist() 140 | labels = prediction["labels"].tolist() 141 | keypoints = prediction["keypoints"] 142 | keypoints = keypoints.flatten(start_dim=1).tolist() 143 | 144 | coco_results.extend( 145 | [ 146 | { 147 | "image_id": original_id, 148 | "category_id": labels[k], 149 | "keypoints": keypoint, 150 | "score": scores[k], 151 | } 152 | for k, keypoint in enumerate(keypoints) 153 | ] 154 | ) 155 | return coco_results 156 | 157 | 158 | def convert_to_xywh(boxes): 159 | xmin, ymin, xmax, ymax = boxes.unbind(1) 160 | return torch.stack((xmin, ymin, xmax - xmin, ymax - ymin), dim=1) 161 | 162 | 163 | def merge(img_ids, eval_imgs): 164 | all_img_ids = utils.all_gather(img_ids) 165 | all_eval_imgs = utils.all_gather(eval_imgs) 166 | 167 | merged_img_ids = [] 168 | for p in all_img_ids: 169 | merged_img_ids.extend(p) 170 | 171 | merged_eval_imgs = [] 172 | for p in all_eval_imgs: 173 | merged_eval_imgs.append(p) 174 | 175 | merged_img_ids = np.array(merged_img_ids) 176 | merged_eval_imgs = np.concatenate(merged_eval_imgs, 2) 177 | 178 | # keep only unique (and in sorted order) images 179 | merged_img_ids, idx = np.unique(merged_img_ids, return_index=True) 180 | merged_eval_imgs = merged_eval_imgs[..., idx] 181 | 182 | return merged_img_ids, merged_eval_imgs 183 | 184 | 185 | def create_common_coco_eval(coco_eval, img_ids, eval_imgs): 186 | img_ids, eval_imgs = merge(img_ids, eval_imgs) 187 | img_ids = list(img_ids) 188 | 
eval_imgs = list(eval_imgs.flatten()) 189 | 190 | coco_eval.evalImgs = eval_imgs 191 | coco_eval.params.imgIds = img_ids 192 | coco_eval._paramsEval = copy.deepcopy(coco_eval.params) 193 | 194 | 195 | ################################################################# 196 | # From pycocotools, just removed the prints and fixed 197 | # a Python3 bug about unicode not defined 198 | ################################################################# 199 | 200 | # Ideally, pycocotools wouldn't have hard-coded prints 201 | # so that we could avoid copy-pasting those two functions 202 | 203 | 204 | def createIndex(self): 205 | # create index 206 | # print('creating index...') 207 | anns, cats, imgs = {}, {}, {} 208 | imgToAnns, catToImgs = defaultdict(list), defaultdict(list) 209 | if "annotations" in self.dataset: 210 | for ann in self.dataset["annotations"]: 211 | imgToAnns[ann["image_id"]].append(ann) 212 | anns[ann["id"]] = ann 213 | 214 | if "images" in self.dataset: 215 | for img in self.dataset["images"]: 216 | imgs[img["id"]] = img 217 | 218 | if "categories" in self.dataset: 219 | for cat in self.dataset["categories"]: 220 | cats[cat["id"]] = cat 221 | 222 | if "annotations" in self.dataset and "categories" in self.dataset: 223 | for ann in self.dataset["annotations"]: 224 | catToImgs[ann["category_id"]].append(ann["image_id"]) 225 | 226 | # print('index created!') 227 | 228 | # create class members 229 | self.anns = anns 230 | self.imgToAnns = imgToAnns 231 | self.catToImgs = catToImgs 232 | self.imgs = imgs 233 | self.cats = cats 234 | 235 | 236 | maskUtils = mask_util 237 | 238 | 239 | def loadRes(self, resFile): 240 | """ 241 | Load result file and return a result api object. 242 | :param resFile (str) : file name of result file 243 | :return: res (obj) : result api object 244 | """ 245 | res = COCO() 246 | res.dataset["images"] = [img for img in self.dataset["images"]] 247 | 248 | # print('Loading and preparing results...') 249 | # tic = time.time() 250 | if isinstance(resFile, torch._six.string_classes): 251 | anns = json.load(open(resFile)) 252 | elif type(resFile) == np.ndarray: 253 | anns = self.loadNumpyAnnotations(resFile) 254 | else: 255 | anns = resFile 256 | assert type(anns) == list, "results in not an array of objects" 257 | annsImgIds = [ann["image_id"] for ann in anns] 258 | assert set(annsImgIds) == ( 259 | set(annsImgIds) & set(self.getImgIds()) 260 | ), "Results do not correspond to current coco set" 261 | if "caption" in anns[0]: 262 | imgIds = set([img["id"] for img in res.dataset["images"]]) & set( 263 | [ann["image_id"] for ann in anns] 264 | ) 265 | res.dataset["images"] = [ 266 | img for img in res.dataset["images"] if img["id"] in imgIds 267 | ] 268 | for id, ann in enumerate(anns): 269 | ann["id"] = id + 1 270 | elif "bbox" in anns[0] and not anns[0]["bbox"] == []: 271 | res.dataset["categories"] = copy.deepcopy(self.dataset["categories"]) 272 | for id, ann in enumerate(anns): 273 | bb = ann["bbox"] 274 | x1, x2, y1, y2 = [bb[0], bb[0] + bb[2], bb[1], bb[1] + bb[3]] 275 | if "segmentation" not in ann: 276 | ann["segmentation"] = [[x1, y1, x1, y2, x2, y2, x2, y1]] 277 | ann["area"] = bb[2] * bb[3] 278 | ann["id"] = id + 1 279 | ann["iscrowd"] = 0 280 | elif "segmentation" in anns[0]: 281 | res.dataset["categories"] = copy.deepcopy(self.dataset["categories"]) 282 | for id, ann in enumerate(anns): 283 | # now only support compressed RLE format as segmentation results 284 | ann["area"] = maskUtils.area(ann["segmentation"]) 285 | if "bbox" not in ann: 286 | 
ann["bbox"] = maskUtils.toBbox(ann["segmentation"]) 287 | ann["id"] = id + 1 288 | ann["iscrowd"] = 0 289 | elif "keypoints" in anns[0]: 290 | res.dataset["categories"] = copy.deepcopy(self.dataset["categories"]) 291 | for id, ann in enumerate(anns): 292 | s = ann["keypoints"] 293 | x = s[0::3] 294 | y = s[1::3] 295 | x0, x1, y0, y1 = np.min(x), np.max(x), np.min(y), np.max(y) 296 | ann["area"] = (x1 - x0) * (y1 - y0) 297 | ann["id"] = id + 1 298 | ann["bbox"] = [x0, y0, x1 - x0, y1 - y0] 299 | # print('DONE (t={:0.2f}s)'.format(time.time()- tic)) 300 | 301 | res.dataset["annotations"] = anns 302 | createIndex(res) 303 | return res 304 | 305 | 306 | def evaluate(self): 307 | """ 308 | Run per image evaluation on given images and store results (a list of dict) in self.evalImgs 309 | :return: None 310 | """ 311 | # tic = time.time() 312 | # print('Running per image evaluation...') 313 | p = self.params 314 | # add backward compatibility if useSegm is specified in params 315 | if p.useSegm is not None: 316 | p.iouType = "segm" if p.useSegm == 1 else "bbox" 317 | print( 318 | "useSegm (deprecated) is not None. Running {} evaluation".format(p.iouType) 319 | ) 320 | # print('Evaluate annotation type *{}*'.format(p.iouType)) 321 | p.imgIds = list(np.unique(p.imgIds)) 322 | if p.useCats: 323 | p.catIds = list(np.unique(p.catIds)) 324 | p.maxDets = sorted(p.maxDets) 325 | self.params = p 326 | 327 | self._prepare() 328 | # loop through images, area range, max detection number 329 | catIds = p.catIds if p.useCats else [-1] 330 | 331 | if p.iouType == "segm" or p.iouType == "bbox": 332 | computeIoU = self.computeIoU 333 | elif p.iouType == "keypoints": 334 | computeIoU = self.computeOks 335 | self.ious = { 336 | (imgId, catId): computeIoU(imgId, catId) 337 | for imgId in p.imgIds 338 | for catId in catIds 339 | } 340 | 341 | evaluateImg = self.evaluateImg 342 | maxDet = p.maxDets[-1] 343 | evalImgs = [ 344 | evaluateImg(imgId, catId, areaRng, maxDet) 345 | for catId in catIds 346 | for areaRng in p.areaRng 347 | for imgId in p.imgIds 348 | ] 349 | # this is NOT in the pycocotools code, but could be done outside 350 | evalImgs = np.asarray(evalImgs).reshape(len(catIds), len(p.areaRng), len(p.imgIds)) 351 | self._paramsEval = copy.deepcopy(self.params) 352 | # toc = time.time() 353 | # print('DONE (t={:0.2f}s).'.format(toc-tic)) 354 | return p.imgIds, evalImgs 355 | -------------------------------------------------------------------------------- /coco_utils.py: -------------------------------------------------------------------------------- 1 | import copy 2 | import os 3 | 4 | import torchvision 5 | from PIL import Image 6 | from tqdm import tqdm 7 | 8 | import torch 9 | import torch.utils.data 10 | import transforms as T 11 | from pycocotools import mask as coco_mask 12 | from pycocotools.coco import COCO 13 | 14 | 15 | class FilterAndRemapCocoCategories(object): 16 | def __init__(self, categories, remap=True): 17 | self.categories = categories 18 | self.remap = remap 19 | 20 | def __call__(self, image, target): 21 | anno = target["annotations"] 22 | anno = [obj for obj in anno if obj["category_id"] in self.categories] 23 | if not self.remap: 24 | target["annotations"] = anno 25 | return image, target 26 | anno = copy.deepcopy(anno) 27 | for obj in anno: 28 | obj["category_id"] = self.categories.index(obj["category_id"]) 29 | target["annotations"] = anno 30 | return image, target 31 | 32 | 33 | def convert_coco_poly_to_mask(segmentations, height, width): 34 | masks = [] 35 | for polygons in 
segmentations: 36 | rles = coco_mask.frPyObjects(polygons, height, width) 37 | mask = coco_mask.decode(rles) 38 | if len(mask.shape) < 3: 39 | mask = mask[..., None] 40 | mask = torch.as_tensor(mask, dtype=torch.uint8) 41 | mask = mask.any(dim=2) 42 | masks.append(mask) 43 | if masks: 44 | masks = torch.stack(masks, dim=0) 45 | else: 46 | masks = torch.zeros((0, height, width), dtype=torch.uint8) 47 | return masks 48 | 49 | 50 | class ConvertCocoPolysToMask(object): 51 | def __call__(self, image, target): 52 | w, h = image.size 53 | 54 | image_id = target["image_id"] 55 | image_id = torch.tensor([image_id]) 56 | 57 | anno = target["annotations"] 58 | 59 | anno = [obj for obj in anno if obj["iscrowd"] == 0] 60 | 61 | boxes = [obj["bbox"] for obj in anno] 62 | # guard against no boxes via resizing 63 | boxes = torch.as_tensor(boxes, dtype=torch.float32).reshape(-1, 4) 64 | boxes[:, 2:] += boxes[:, :2] 65 | boxes[:, 0::2].clamp_(min=0, max=w) 66 | boxes[:, 1::2].clamp_(min=0, max=h) 67 | 68 | classes = [obj["category_id"] for obj in anno] 69 | classes = torch.tensor(classes, dtype=torch.int64) 70 | 71 | segmentations = [obj["segmentation"] for obj in anno] 72 | masks = convert_coco_poly_to_mask(segmentations, h, w) 73 | 74 | keypoints = None 75 | if anno and "keypoints" in anno[0]: 76 | keypoints = [obj["keypoints"] for obj in anno] 77 | keypoints = torch.as_tensor(keypoints, dtype=torch.float32) 78 | num_keypoints = keypoints.shape[0] 79 | if num_keypoints: 80 | keypoints = keypoints.view(num_keypoints, -1, 3) 81 | 82 | keep = (boxes[:, 3] > boxes[:, 1]) & (boxes[:, 2] > boxes[:, 0]) 83 | boxes = boxes[keep] 84 | classes = classes[keep] 85 | masks = masks[keep] 86 | if keypoints is not None: 87 | keypoints = keypoints[keep] 88 | 89 | target = {} 90 | target["boxes"] = boxes 91 | target["labels"] = classes 92 | target["masks"] = masks 93 | target["image_id"] = image_id 94 | if keypoints is not None: 95 | target["keypoints"] = keypoints 96 | 97 | # for conversion to coco api 98 | area = torch.tensor([obj["area"] for obj in anno]) 99 | iscrowd = torch.tensor([obj["iscrowd"] for obj in anno]) 100 | target["area"] = area 101 | target["iscrowd"] = iscrowd 102 | 103 | return image, target 104 | 105 | 106 | def _coco_remove_images_without_annotations(dataset, cat_list=None): 107 | def _has_only_empty_bbox(anno): 108 | return all(any(o <= 1 for o in obj["bbox"][2:]) for obj in anno) 109 | 110 | def _count_visible_keypoints(anno): 111 | return sum(sum(1 for v in ann["keypoints"][2::3] if v > 0) for ann in anno) 112 | 113 | min_keypoints_per_image = 10 114 | 115 | def _has_valid_annotation(anno): 116 | # if it's empty, there is no annotation 117 | if len(anno) == 0: 118 | return False 119 | # if all boxes have close to zero area, there is no annotation 120 | if _has_only_empty_bbox(anno): 121 | return False 122 | # keypoints task have a slight different critera for considering 123 | # if an annotation is valid 124 | if "keypoints" not in anno[0]: 125 | return True 126 | # for keypoint detection tasks, only consider valid images those 127 | # containing at least min_keypoints_per_image 128 | if _count_visible_keypoints(anno) >= min_keypoints_per_image: 129 | return True 130 | return False 131 | 132 | assert isinstance(dataset, torchvision.datasets.CocoDetection) 133 | ids = [] 134 | for ds_idx, img_id in enumerate(dataset.ids): 135 | ann_ids = dataset.coco.getAnnIds(imgIds=img_id, iscrowd=None) 136 | anno = dataset.coco.loadAnns(ann_ids) 137 | if cat_list: 138 | anno = [obj for obj in anno if 
obj["category_id"] in cat_list] 139 | if _has_valid_annotation(anno): 140 | ids.append(ds_idx) 141 | 142 | dataset = torch.utils.data.Subset(dataset, ids) 143 | return dataset 144 | 145 | 146 | def convert_to_coco_api(ds): 147 | coco_ds = COCO() 148 | ann_id = 0 149 | dataset = {"images": [], "categories": [], "annotations": []} 150 | categories = set() 151 | for img_idx in tqdm(range(len(ds))): 152 | # find better way to get target 153 | # targets = ds.get_annotations(img_idx) 154 | img, targets = ds[img_idx] 155 | img = torchvision.transforms.ToTensor()(img) 156 | image_id = targets["image_id"].item() 157 | img_dict = {} 158 | img_dict["id"] = image_id 159 | img_dict["height"] = img.shape[-2] 160 | img_dict["width"] = img.shape[-1] 161 | dataset["images"].append(img_dict) 162 | bboxes = targets["boxes"] 163 | bboxes[:, 2:] -= bboxes[:, :2] 164 | bboxes = bboxes.tolist() 165 | labels = targets["labels"].tolist() 166 | areas = targets["area"].tolist() 167 | iscrowd = targets["iscrowd"].tolist() 168 | if "masks" in targets: 169 | masks = targets["masks"] 170 | # make masks Fortran contiguous for coco_mask 171 | masks = masks.permute(0, 2, 1).contiguous().permute(0, 2, 1) 172 | if "keypoints" in targets: 173 | keypoints = targets["keypoints"] 174 | keypoints = keypoints.reshape(keypoints.shape[0], -1).tolist() 175 | num_objs = len(bboxes) 176 | for i in range(num_objs): 177 | ann = {} 178 | ann["image_id"] = image_id 179 | ann["bbox"] = bboxes[i] 180 | ann["category_id"] = labels[i] 181 | categories.add(labels[i]) 182 | ann["area"] = areas[i] 183 | ann["iscrowd"] = iscrowd[i] 184 | ann["id"] = ann_id 185 | if "masks" in targets: 186 | ann["segmentation"] = coco_mask.encode(masks[i].numpy()) 187 | if "keypoints" in targets: 188 | ann["keypoints"] = keypoints[i] 189 | ann["num_keypoints"] = sum(k != 0 for k in keypoints[i][2::3]) 190 | dataset["annotations"].append(ann) 191 | ann_id += 1 192 | dataset["categories"] = [{"id": i} for i in sorted(categories)] 193 | coco_ds.dataset = dataset 194 | coco_ds.createIndex() 195 | return coco_ds 196 | 197 | 198 | def get_coco_api_from_dataset(dataset): 199 | for i in range(10): 200 | if isinstance(dataset, torchvision.datasets.CocoDetection): 201 | break 202 | if isinstance(dataset, torch.utils.data.Subset): 203 | dataset = dataset.dataset 204 | if isinstance(dataset, torchvision.datasets.CocoDetection): 205 | return dataset.coco 206 | return convert_to_coco_api(dataset) 207 | 208 | 209 | class CocoDetection(torchvision.datasets.CocoDetection): 210 | def __init__(self, img_folder, ann_file, transforms): 211 | super(CocoDetection, self).__init__(img_folder, ann_file) 212 | self._transforms = transforms 213 | 214 | def __getitem__(self, idx): 215 | img, target = super(CocoDetection, self).__getitem__(idx) 216 | image_id = self.ids[idx] 217 | target = dict(image_id=image_id, annotations=target) 218 | if self._transforms is not None: 219 | img, target = self._transforms(img, target) 220 | return img, target 221 | 222 | 223 | def get_coco(root, image_set, transforms, mode="instances"): 224 | anno_file_template = "{}_{}2017.json" 225 | PATHS = { 226 | "train": ( 227 | "train2017", 228 | os.path.join("annotations", anno_file_template.format(mode, "train")), 229 | ), 230 | "val": ( 231 | "val2017", 232 | os.path.join("annotations", anno_file_template.format(mode, "val")), 233 | ), 234 | # "train": ("val2017", os.path.join("annotations", anno_file_template.format(mode, "val"))) 235 | } 236 | 237 | t = [ConvertCocoPolysToMask()] 238 | 239 | if transforms is not 
None: 240 | t.append(transforms) 241 | transforms = T.Compose(t) 242 | 243 | img_folder, ann_file = PATHS[image_set] 244 | img_folder = os.path.join(root, img_folder) 245 | ann_file = os.path.join(root, ann_file) 246 | 247 | dataset = CocoDetection(img_folder, ann_file, transforms=transforms) 248 | 249 | if image_set == "train": 250 | dataset = _coco_remove_images_without_annotations(dataset) 251 | 252 | # dataset = torch.utils.data.Subset(dataset, [i for i in range(500)]) 253 | 254 | return dataset 255 | 256 | 257 | def get_coco_kp(root, image_set, transforms): 258 | return get_coco(root, image_set, transforms, mode="person_keypoints") 259 | -------------------------------------------------------------------------------- /datasets/.ipynb_checkpoints/idd-checkpoint.py: -------------------------------------------------------------------------------- 1 | import os 2 | import xml.etree.ElementTree as ET 3 | from glob import glob 4 | from pathlib import Path 5 | 6 | import matplotlib 7 | import matplotlib.pyplot as plt 8 | import numpy as np 9 | import torchvision 10 | from PIL import Image 11 | from torchvision import transforms 12 | 13 | import torch 14 | import transforms as T 15 | import utils 16 | from torch import FloatTensor, Tensor 17 | from torch.utils.data import (DataLoader, Dataset, RandomSampler, 18 | SequentialSampler) 19 | 20 | 21 | def get_transform(train): 22 | transforms = [] 23 | transforms.append(T.ToTensor()) 24 | # transforms.append(T.Normalize(mean=(0.3520, 0.3520, 0.3520),std=(0.2930, 0.2930, 0.2930))) 25 | if train: 26 | transforms.append(T.RandomHorizontalFlip(0.5)) 27 | return T.Compose(transforms) 28 | 29 | 30 | class IDD(torch.utils.data.Dataset): 31 | def __init__(self, list_img_path, list_anno_path, transforms=None): 32 | super(IDD, self).__init__() 33 | self.img = list_img_path 34 | self.anno = list_anno_path 35 | self.transforms = transforms 36 | self.classes = { 37 | "person": 0, 38 | "rider": 1, 39 | "car": 2, 40 | "truck": 3, 41 | "bus": 4, 42 | "motorcycle": 5, 43 | "bicycle": 6, 44 | "autorickshaw": 7, 45 | "animal": 8, 46 | "traffic light": 9, 47 | "traffic sign": 10, 48 | "vehicle fallback": 11, 49 | } #'caravan':12,'trailer':13,'train':14} 50 | 51 | def __len__(self): 52 | return len(self.img) 53 | 54 | def get_height_and_width(self, idx): 55 | img_path = os.path.join(img_path, self.img[idx]) 56 | img = Image.open(img_path).convert("RGB") 57 | dim_tensor = torchvision.transforms.ToTensor()(img).shape 58 | height, width = dim_tensor[1], dim_tensor[2] 59 | return height, width 60 | 61 | def get_label_bboxes(self, xml_obj): 62 | xml_obj = ET.parse(xml_obj) 63 | objects, bboxes = [], [] 64 | 65 | for node in xml_obj.getroot().iter("object"): 66 | object_present = node.find("name").text 67 | xmin = int(node.find("bndbox/xmin").text) 68 | xmax = int(node.find("bndbox/xmax").text) 69 | ymin = int(node.find("bndbox/ymin").text) 70 | ymax = int(node.find("bndbox/ymax").text) 71 | if object_present in self.classes: 72 | objects.append(self.classes[object_present]) 73 | bboxes.append((xmin, ymin, xmax, ymax)) 74 | return Tensor(objects), Tensor(bboxes) 75 | 76 | def __getitem__(self, idx): 77 | img_path = self.img[idx] 78 | img = Image.open(img_path).convert("RGB") 79 | 80 | labels = self.get_label_bboxes(self.anno[idx])[0] 81 | bboxes = self.get_label_bboxes(self.anno[idx])[1] 82 | 83 | img_id = Tensor([idx]) 84 | area = (bboxes[:, 3] - bboxes[:, 1]) * (bboxes[:, 2] - bboxes[:, 0]) 85 | 86 | iscrowd = torch.zeros(len(bboxes,), dtype=torch.int64) 87 | target = {} 
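# Assemble the target dict in the format expected by torchvision detection models: boxes, labels, image_id, area, iscrowd.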
88 | target["boxes"] = bboxes 89 | target["labels"] = labels 90 | target["image_id"] = img_id 91 | target["area"] = area 92 | target["iscrowd"] = iscrowd 93 | 94 | if self.transforms is not None: 95 | img, target = self.transforms(img, target) 96 | 97 | return img, target 98 | 99 | 100 | class IDD_Test(torch.utils.data.Dataset): 101 | def __init__(self, list_img_path, list_anno_path): 102 | super(IDD_Test, self).__init__() 103 | self.img = sorted(list_img_path) 104 | self.anno = sorted(list_anno_path) 105 | self.classes = { 106 | "person": 0, 107 | "rider": 1, 108 | "car": 2, 109 | "truck": 3, 110 | "bus": 4, 111 | "motorcycle": 5, 112 | "bicycle": 6, 113 | "autorickshaw": 7, 114 | "animal": 8, 115 | "traffic light": 9, 116 | "traffic sign": 10, 117 | "vehicle fallback": 11, 118 | "caravan": 12, 119 | "trailer": 13, 120 | "train": 14, 121 | } 122 | 123 | def __len__(self): 124 | return len(self.img) 125 | 126 | def get_height_and_width(self, idx): 127 | img_path = os.path.join(img_path, self.imgs[idx]) 128 | img = Image.open(img_path).convert("RGB") 129 | dim_tensor = torchvision.transforms.ToTensor()(img).shape 130 | height, width = dim_tensor[1], dim_tensor[2] 131 | return height, width 132 | 133 | def get_label_bboxes(self, xml_obj): 134 | xml_obj = ET.parse(xml_obj) 135 | objects, bboxes = [], [] 136 | 137 | for node in xml_obj.getroot().iter("object"): 138 | object_present = node.find("name").text 139 | xmin = int(node.find("bndbox/xmin").text) 140 | xmax = int(node.find("bndbox/xmax").text) 141 | ymin = int(node.find("bndbox/ymin").text) 142 | ymax = int(node.find("bndbox/ymax").text) 143 | objects.append(self.classes[object_present]) 144 | bboxes.append((xmin, ymin, xmax, ymax)) 145 | return Tensor(objects), Tensor(bboxes) 146 | 147 | def __getitem__(self, idx): 148 | img_path = self.img[idx] 149 | img = Image.open(img_path).convert("RGB") 150 | 151 | labels = self.get_label_bboxes(self.anno[idx])[0] 152 | bboxes = self.get_label_bboxes(self.anno[idx])[1] 153 | 154 | img_id = Tensor([idx]) 155 | area = (bboxes[:, 3] - bboxes[:, 1]) * (bboxes[:, 2] - bboxes[:, 0]) 156 | 157 | iscrowd = torch.zeros(len(bboxes,), dtype=torch.int64) 158 | target = {} 159 | target["boxes"] = bboxes 160 | target["labels"] = labels 161 | target["image_id"] = img_id 162 | target["area"] = area 163 | target["iscrowd"] = iscrowd 164 | 165 | return img, target 166 | -------------------------------------------------------------------------------- /datasets/bdd.py: -------------------------------------------------------------------------------- 1 | import json 2 | import os 3 | from pathlib import Path 4 | 5 | import numpy as np 6 | import torchvision 7 | from PIL import Image 8 | from torchvision import transforms 9 | from tqdm import tqdm 10 | 11 | import torch 12 | import transforms as T 13 | import utils 14 | from torch import Tensor, nn 15 | from torch.utils.data import Dataset 16 | 17 | 18 | def get_ground_truths(train_img_path_list, anno_data): 19 | 20 | bboxes, total_bboxes = [], [] 21 | labels, total_labels = [], [] 22 | classes = { 23 | "bus": 0, 24 | "traffic light": 1, 25 | "traffic sign": 2, 26 | "person": 3, 27 | "bike": 4, 28 | "truck": 5, 29 | "motor": 6, 30 | "car": 7, 31 | "train": 8, 32 | "rider": 9, 33 | "drivable area": 10, 34 | "lane": 11, 35 | } 36 | 37 | for i in tqdm(range(len(train_img_path_list))): 38 | for j in range(len(anno_data[i]["labels"])): 39 | if "box2d" in anno_data[i]["labels"][j]: 40 | xmin = anno_data[i]["labels"][j]["box2d"]["x1"] 41 | ymin = 
anno_data[i]["labels"][j]["box2d"]["y1"] 42 | xmax = anno_data[i]["labels"][j]["box2d"]["x2"] 43 | ymax = anno_data[i]["labels"][j]["box2d"]["y2"] 44 | bbox = [xmin, ymin, xmax, ymax] 45 | category = anno_data[i]["labels"][j]["category"] 46 | cls = classes[category] 47 | 48 | bboxes.append(bbox) 49 | labels.append(cls) 50 | 51 | total_bboxes.append(torch.tensor(bboxes)) 52 | total_labels.append(torch.tensor(labels)) 53 | bboxes = [] 54 | labels = [] 55 | 56 | return total_bboxes, total_labels 57 | 58 | 59 | def _load_json(path_list_idx): 60 | with open(path_list_idx, "r") as file: 61 | data = json.load(file) 62 | return data 63 | 64 | 65 | def get_transform(train): 66 | transforms = [] 67 | transforms.append(T.ToTensor()) 68 | if train: 69 | transforms.append(T.RandomHorizontalFlip(0.5)) 70 | return T.Compose(transforms) 71 | 72 | 73 | class BDD(torch.utils.data.Dataset): 74 | def __init__( 75 | self, img_path, anno_json_path, transforms=None 76 | ): # total_bboxes_list,total_labels_list,transforms=None): 77 | super(BDD, self).__init__() 78 | self.img_path = img_path 79 | self.anno_data = _load_json(anno_json_path) 80 | self.total_bboxes_list, self.total_labels_list = get_ground_truths( 81 | self.img_path, self.anno_data 82 | ) 83 | self.transforms = transforms 84 | self.classes = { 85 | "bus": 0, 86 | "traffic light": 1, 87 | "traffic sign": 2, 88 | "person": 3, 89 | "bike": 4, 90 | "truck": 5, 91 | "motor": 6, 92 | "car": 7, 93 | "train": 8, 94 | "rider": 9, 95 | "drivable area": 10, 96 | "lane": 11, 97 | } 98 | 99 | def __len__(self): 100 | return len(self.img_path) 101 | 102 | def __getitem__(self, idx): 103 | img_path = self.img_path[idx] 104 | img = Image.open(img_path).convert("RGB") 105 | 106 | labels = self.total_labels_list[idx] 107 | bboxes = self.total_bboxes_list[idx] 108 | area = (bboxes[:, 3] - bboxes[:, 1]) * (bboxes[:, 2] - bboxes[:, 0]) 109 | 110 | img_id = torch.tensor([idx]) 111 | iscrowd = torch.zeros(len(bboxes,), dtype=torch.int64) 112 | target = {} 113 | target["boxes"] = bboxes 114 | target["labels"] = labels 115 | target["image_id"] = img_id 116 | target["area"] = area 117 | target["iscrowd"] = iscrowd 118 | 119 | if self.transforms is not None: 120 | img, target = self.transforms(img, target) 121 | 122 | return img, target 123 | -------------------------------------------------------------------------------- /datasets/cityscapes.py: -------------------------------------------------------------------------------- 1 | import json 2 | import os 3 | import pickle 4 | 5 | import numpy as np 6 | import torchvision 7 | from PIL import Image 8 | from torchvision import transforms 9 | 10 | import torch 11 | import transforms as T 12 | import utils 13 | from torch import FloatTensor, Tensor 14 | from torch.utils.data import (DataLoader, Dataset, RandomSampler, 15 | SequentialSampler) 16 | from torch.utils.data.dataloader import default_collate 17 | from transforms import * 18 | 19 | 20 | class Cityscapes(torch.utils.data.Dataset): 21 | def __init__( 22 | self, image_path_list, target_path_list, split="train", transforms=None 23 | ): 24 | super(Cityscapes, self).__init__() 25 | self.images = image_path_list 26 | self.targets = target_path_list 27 | self.transforms = transforms 28 | self.classes = { 29 | "pedestrian": 0, 30 | "rider": 1, 31 | "person group": 2, 32 | "person (other)": 3, 33 | "sitting person": 4, 34 | "ignore": 5, 35 | } 36 | 37 | def get_label_bboxes(self, label): 38 | """ 39 | Bounding boxes are in the form [x0,y0.w,h] 40 | """ 41 | bboxes = [] 42 | labels 
= [] 43 | for data in label["objects"]: 44 | x0 = data["bbox"][0] 45 | y0 = data["bbox"][1] 46 | x1 = x0 + data["bbox"][2] 47 | y1 = y0 + data["bbox"][3] 48 | bbox_list = [x0, y0, x1, y1] 49 | labels.append(self.classes[data["label"]]) 50 | bboxes.append(bbox_list) 51 | return Tensor(bboxes), Tensor(labels) 52 | 53 | def __len__(self): 54 | return len(self.images) 55 | 56 | def extra_repr(self): 57 | lines = ["Split: {split}", "Mode: {mode}", "Type: {target_type}"] 58 | return "\n".join(lines).format(**self.__dict__) 59 | 60 | def _load_json(self, path_list_idx): 61 | with open(path_list_idx, "r") as file: 62 | data = json.load(file) 63 | return data 64 | 65 | def __getitem__(self, idx): 66 | 67 | image = Image.open(self.images[idx]).convert("RGB") 68 | 69 | data = self._load_json(self.targets[idx]) 70 | 71 | labels = self.get_label_bboxes(data)[1] 72 | bboxes = self.get_label_bboxes(data)[0] 73 | area = (bboxes[:, 3] - bboxes[:, 1]) * (bboxes[:, 2] - bboxes[:, 0]) 74 | iscrowd = torch.zeros(len(bboxes,), dtype=torch.int64) 75 | 76 | img_id = Tensor([idx]) 77 | target = {} 78 | target["boxes"] = bboxes 79 | target["labels"] = labels 80 | target["image_id"] = img_id 81 | target["area"] = area 82 | target["iscrowd"] = iscrowd 83 | 84 | if self.transforms is not None: 85 | image, target = self.transforms(image, target) 86 | return image, target 87 | 88 | 89 | def get_transform(train): 90 | transforms = [] 91 | transforms.append(T.ToTensor()) 92 | # transforms.append(T.Normalize(mean=(0.485, 0.456, 0.406),std=(0.229, 0.224, 0.225))) 93 | if train: 94 | transforms.append(T.RandomHorizontalFlip(0.5)) 95 | return T.Compose(transforms) 96 | -------------------------------------------------------------------------------- /datasets/idd.py: -------------------------------------------------------------------------------- 1 | import os 2 | import time 3 | import xml.etree.ElementTree as ET 4 | from pathlib import Path 5 | 6 | import numpy as np 7 | import torchvision 8 | from PIL import Image 9 | from torchvision import transforms 10 | 11 | import torch 12 | import transforms as T 13 | import utils 14 | from coco_eval import CocoEvaluator 15 | from coco_utils import get_coco_api_from_dataset 16 | from torch import FloatTensor, Tensor 17 | from torch.utils.data import (DataLoader, Dataset, RandomSampler, 18 | SequentialSampler) 19 | 20 | 21 | def get_transform(train): 22 | transforms = [] 23 | transforms.append(T.ToTensor()) 24 | # transforms.append(T.Normalize(mean=(0.3520, 0.3520, 0.3520),std=(0.2930, 0.2930, 0.2930))) 25 | if train: 26 | transforms.append(T.RandomHorizontalFlip(0.5)) 27 | return T.Compose(transforms) 28 | 29 | 30 | class IDD(torch.utils.data.Dataset): 31 | def __init__(self, list_img_path, list_anno_path, transforms=None): 32 | super(IDD, self).__init__() 33 | self.img = list_img_path 34 | self.anno = list_anno_path 35 | self.transforms = transforms 36 | self.classes = { 37 | "person": 0, 38 | "rider": 1, 39 | "car": 2, 40 | "truck": 3, 41 | "bus": 4, 42 | "motorcycle": 5, 43 | "bicycle": 6, 44 | "autorickshaw": 7, 45 | "animal": 8, 46 | "traffic light": 9, 47 | "traffic sign": 10, 48 | "vehicle fallback": 11, 49 | "caravan": 12, 50 | "trailer": 13, 51 | "train": 14, 52 | } 53 | 54 | def __len__(self): 55 | return len(self.img) 56 | 57 | def get_height_and_width(self, idx): 58 | img_path = os.path.join(img_path, self.img[idx]) 59 | img = Image.open(img_path).convert("RGB") 60 | dim_tensor = torchvision.transforms.ToTensor()(img).shape 61 | height, width = dim_tensor[1], dim_tensor[2] 
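# ToTensor() yields a C x H x W tensor, so indices 1 and 2 are the image height and width.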
62 | return height, width 63 | 64 | def get_label_bboxes(self, xml_obj): 65 | xml_obj = ET.parse(xml_obj) 66 | objects, bboxes = [], [] 67 | 68 | for node in xml_obj.getroot().iter("object"): 69 | object_present = node.find("name").text 70 | xmin = int(node.find("bndbox/xmin").text) 71 | xmax = int(node.find("bndbox/xmax").text) 72 | ymin = int(node.find("bndbox/ymin").text) 73 | ymax = int(node.find("bndbox/ymax").text) 74 | objects.append(self.classes[object_present]) 75 | bboxes.append((xmin, ymin, xmax, ymax)) 76 | return Tensor(objects), Tensor(bboxes) 77 | 78 | def __getitem__(self, idx): 79 | img_path = self.img[idx] 80 | img = Image.open(img_path).convert("RGB") 81 | 82 | labels = self.get_label_bboxes(self.anno[idx])[0] 83 | bboxes = self.get_label_bboxes(self.anno[idx])[1] 84 | labels = labels.type(torch.int64) 85 | img_id = Tensor([idx]) 86 | area = (bboxes[:, 3] - bboxes[:, 1]) * (bboxes[:, 2] - bboxes[:, 0]) 87 | 88 | iscrowd = torch.zeros(len(bboxes,), dtype=torch.int64) 89 | target = {} 90 | target["boxes"] = bboxes 91 | target["labels"] = labels 92 | target["image_id"] = img_id 93 | target["area"] = area 94 | target["iscrowd"] = iscrowd 95 | 96 | if self.transforms is not None: 97 | img, target = self.transforms(img, target) 98 | 99 | return img, target 100 | -------------------------------------------------------------------------------- /engine.py: -------------------------------------------------------------------------------- 1 | # Adapted from torchvision, changes include tensorboard support 2 | 3 | import math 4 | import sys 5 | import time 6 | 7 | from tensorboardX import SummaryWriter 8 | 9 | import torch 10 | import utils 11 | from coco_eval import CocoEvaluator 12 | from coco_utils import get_coco_api_from_dataset 13 | from imports import * 14 | 15 | writer = SummaryWriter() 16 | num_iters = 0 17 | 18 | 19 | def train_one_epoch(model, optimizer, data_loader, device, epoch, print_freq): 20 | global num_iters 21 | model.train() 22 | metric_logger = utils.MetricLogger(delimiter=" ") 23 | metric_logger.add_meter("lr", utils.SmoothedValue(window_size=1, fmt="{value:.6f}")) 24 | header = "Epoch: [{}]".format(epoch) 25 | 26 | lr_scheduler = None 27 | if epoch == 0: 28 | warmup_factor = 1.0 / 1000 29 | warmup_iters = min(1000, len(data_loader) - 1) 30 | 31 | lr_scheduler = utils.warmup_lr_scheduler(optimizer, warmup_iters, warmup_factor) 32 | 33 | for images, targets in metric_logger.log_every(data_loader, print_freq, header): 34 | images = list(image.to(device) for image in images) 35 | 36 | targets = [{k: v.to(device) for k, v in t.items()} for t in targets] 37 | 38 | loss_dict = model(images, targets) 39 | num_iters += 1 40 | losses = sum(loss for loss in loss_dict.values()) 41 | 42 | # reduce losses over all GPUs for logging purposes 43 | loss_dict_reduced = utils.reduce_dict(loss_dict) 44 | losses_reduced = sum(loss for loss in loss_dict_reduced.values()) 45 | 46 | loss_value = losses_reduced.item() 47 | 48 | writer.add_scalar("Loss/train", loss_value, num_iters) 49 | writer.add_scalar("Learning rate", optimizer.param_groups[0]["lr"], num_iters) 50 | writer.add_scalar("Momentum", optimizer.param_groups[0]["momentum"], num_iters) 51 | 52 | if not math.isfinite(loss_value): 53 | print("Loss is {}, stopping training".format(loss_value)) 54 | print(loss_dict_reduced) 55 | sys.exit(1) 56 | 57 | optimizer.zero_grad() 58 | losses.backward() 59 | optimizer.step() 60 | 61 | if lr_scheduler is not None: 62 | lr_scheduler.step() 63 | 64 | 
metric_logger.update(loss=losses_reduced, **loss_dict_reduced) 65 | metric_logger.update(lr=optimizer.param_groups[0]["lr"]) 66 | 67 | 68 | def _get_iou_types(model): 69 | model_without_ddp = model 70 | if isinstance(model, torch.nn.parallel.DistributedDataParallel): 71 | model_without_ddp = model.module 72 | iou_types = ["bbox"] 73 | return iou_types 74 | 75 | 76 | @torch.no_grad() 77 | def evaluate(model, data_loader, device): 78 | iou_types = ["bbox"] 79 | coco = get_coco_api_from_dataset(data_loader.dataset) 80 | n_threads = torch.get_num_threads() 81 | torch.set_num_threads(1) 82 | cpu_device = torch.device("cpu") 83 | model.eval() 84 | metric_logger = utils.MetricLogger(delimiter=" ") 85 | header = "Test:" 86 | model.to(device) 87 | iou_types = _get_iou_types(model) 88 | coco_evaluator = CocoEvaluator(coco, iou_types) 89 | to_tensor = torchvision.transforms.ToTensor() 90 | for image, targets in metric_logger.log_every(data_loader, 100, header): 91 | 92 | image = list(to_tensor(img).to(device) for img in image) 93 | targets = [{k: v.to(device) for k, v in t.items()} for t in targets] 94 | torch.cuda.synchronize() 95 | model_time = time.time() 96 | 97 | outputs = model(image) 98 | 99 | outputs = [{k: v.to(cpu_device) for k, v in t.items()} for t in outputs] 100 | model_time = time.time() - model_time 101 | 102 | res = { 103 | target["image_id"].item(): output 104 | for target, output in zip(targets, outputs) 105 | } 106 | evaluator_time = time.time() 107 | coco_evaluator.update(res) 108 | evaluator_time = time.time() - evaluator_time 109 | metric_logger.update(model_time=model_time, evaluator_time=evaluator_time) 110 | 111 | # gather the stats from all processes 112 | metric_logger.synchronize_between_processes() 113 | print("Averaged stats:", metric_logger) 114 | coco_evaluator.synchronize_between_processes() 115 | 116 | # accumulate predictions from all images 117 | coco_evaluator.accumulate() 118 | coco_evaluator.summarize() 119 | torch.set_num_threads(n_threads) 120 | return coco_evaluator 121 | -------------------------------------------------------------------------------- /eval_idd_bdd.py: -------------------------------------------------------------------------------- 1 | # Adapted from torchvision, changes made to support evaluation on idd and bdd100k 2 | 3 | import pickle 4 | import time 5 | 6 | from coco_eval import CocoEvaluator 7 | from coco_utils import get_coco_api_from_dataset 8 | from datasets.bdd import * 9 | from datasets.idd import * 10 | from imports import * 11 | 12 | device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu") 13 | 14 | ########################### User Defined settings ######################## 15 | ds = "BDD" 16 | bdd_path = "/home/jupyter/autonue/data/bdd100k/" 17 | batch_size = 8 18 | model_name = "bdd100k_24.pth" 19 | idd_path = "/home/jupyter/autonue/data/IDD_Detection/" 20 | # name = 'do_ft_trained_bdd_eval_idd_ready.pth' 21 | use_checkpoint = False 22 | ################################ Dataset and Dataloader Management ########################################## 23 | 24 | print("Loading files") 25 | 26 | if ds == "IDD": 27 | print("Evaluation on India Driving dataset") 28 | with open("datalists/idd_images_path_list.txt", "rb") as fp: 29 | idd_image_path_list = pickle.load(fp) 30 | with open("datalists/idd_anno_path_list.txt", "rb") as fp: 31 | idd_anno_path_list = pickle.load(fp) 32 | 33 | val_img_paths = [] 34 | with open(idd_path + "val.txt") as f: 35 | val_img_paths = f.readlines() 36 | for i in 
range(len(val_img_paths)): 37 | val_img_paths[i] = val_img_paths[i].strip("\n") 38 | val_img_paths[i] = val_img_paths[i] + ".jpg" 39 | val_img_paths[i] = os.path.join(idd_path + "JPEGImages", val_img_paths[i]) 40 | 41 | val_anno_paths = [] 42 | for i in range(len(val_img_paths)): 43 | val_anno_paths.append(val_img_paths[i].replace("JPEGImages", "Annotations")) 44 | val_anno_paths[i] = val_anno_paths[i].replace(".jpg", ".xml") 45 | 46 | val_img_paths, val_anno_paths = sorted(val_img_paths), sorted(val_anno_paths) 47 | 48 | assert len(val_img_paths) == len(val_anno_paths) 49 | # val_img_paths = val_img_paths[:10] 50 | # val_anno_paths = val_anno_paths[:10] 51 | 52 | val_dataset = IDD_Test(val_img_paths, val_anno_paths) 53 | val_dl = torch.utils.data.DataLoader( 54 | val_dataset, 55 | batch_size=batch_size, 56 | shuffle=True, 57 | num_workers=4, 58 | collate_fn=utils.collate_fn, 59 | ) 60 | 61 | if ds == "BDD": 62 | print("Evaluation on Berkeley Deep Drive") 63 | root_img_path = os.path.join(bdd_path, "bdd100k_images_100k", "images", "100k") 64 | root_anno_path = os.path.join(bdd_path, "bdd100k_labels_release", "labels") 65 | 66 | val_img_path = root_img_path + "/val/" 67 | val_anno_json_path = root_anno_path + "/bdd100k_labels_images_val.json" 68 | 69 | with open("datalists/bdd100k_val_images_path.txt", "rb") as fp: 70 | bdd_img_path_list = pickle.load(fp) 71 | 72 | val_dataset = BDD(bdd_img_path_list, val_anno_json_path) 73 | val_dl = torch.utils.data.DataLoader( 74 | val_dataset, 75 | batch_size=batch_size, 76 | shuffle=True, 77 | num_workers=0, 78 | collate_fn=utils.collate_fn, 79 | pin_memory=True, 80 | ) 81 | 82 | ###################################################################################################3 83 | 84 | 85 | def get_model(num_classes): 86 | model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=False) 87 | in_features = model.roi_heads.box_predictor.cls_score.in_features 88 | model.roi_heads.box_predictor = torchvision.models.detection.faster_rcnn.FastRCNNPredictor( 89 | in_features, num_classes 90 | ) # replace the pre-trained head with a new one 91 | return model.cuda() 92 | 93 | 94 | ckpt = torch.load("saved_models/ulm_det_ft0.pth") 95 | model = get_model(15) 96 | model.load_state_dict(ckpt["model"]) 97 | 98 | model_bdd = get_model(12) 99 | ckpt2 = torch.load("saved_models/bdd100k_24.pth") 100 | model_bdd.load_state_dict(ckpt2["model"]) 101 | 102 | model.roi_heads = model_bdd.roi_heads 103 | model.roi_heads.load_state_dict(model_bdd.roi_heads.state_dict()) 104 | 105 | model.cuda() 106 | 107 | params = [p for p in model.parameters() if p.requires_grad] 108 | optimizer = torch.optim.SGD(params, lr=0.005, momentum=0.9, weight_decay=0.0005) 109 | lr_scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=3, gamma=0.1) 110 | 111 | if use_checkpoint: 112 | checkpoint = torch.load("saved_models/" + model_name) 113 | model.load_state_dict(checkpoint["model"]) 114 | print("Model Loaded successfully") 115 | 116 | 117 | def _get_iou_types(model): 118 | model_without_ddp = model 119 | if isinstance(model, torch.nn.parallel.DistributedDataParallel): 120 | model_without_ddp = model.module 121 | iou_types = ["bbox"] 122 | return iou_types 123 | 124 | 125 | print("##### Dataloader is ready #######") 126 | iou_types = _get_iou_types(model) 127 | 128 | print("Getting coco api from dataset") 129 | coco = get_coco_api_from_dataset(val_dl.dataset) 130 | print("Done") 131 | 132 | 133 | @torch.no_grad() 134 | def evaluate(model, data_loader, device): 135 | 
n_threads = torch.get_num_threads() 136 | # FIXME remove this and make paste_masks_in_image run on the GPU 137 | torch.set_num_threads(1) 138 | cpu_device = torch.device("cpu") 139 | model.eval() 140 | metric_logger = utils.MetricLogger(delimiter=" ") 141 | header = "Test:" 142 | model.cuda() 143 | # coco = get_coco_api_from_dataset(data_loader.dataset) 144 | iou_types = _get_iou_types(model) 145 | coco_evaluator = CocoEvaluator(coco, iou_types) 146 | 147 | for image, targets in metric_logger.log_every(data_loader, 100, header): 148 | # print(image) 149 | # image = torchvision.transforms.ToTensor()(image[0]) # Returns a scaler tuple 150 | # print(image.shape) # dim of image 1080x1920 151 | 152 | image = torchvision.transforms.ToTensor()(image[0]).to(device) 153 | # image = img.to(device) for img in image 154 | targets = [{k: v.to(device) for k, v in t.items()} for t in targets] 155 | torch.cuda.synchronize() 156 | model_time = time.time() 157 | 158 | outputs = model([image]) 159 | 160 | outputs = [{k: v.to(cpu_device) for k, v in t.items()} for t in outputs] 161 | model_time = time.time() - model_time 162 | 163 | res = { 164 | target["image_id"].item(): output 165 | for target, output in zip(targets, outputs) 166 | } 167 | evaluator_time = time.time() 168 | coco_evaluator.update(res) 169 | evaluator_time = time.time() - evaluator_time 170 | metric_logger.update(model_time=model_time, evaluator_time=evaluator_time) 171 | 172 | # gather the stats from all processes 173 | metric_logger.synchronize_between_processes() 174 | print("Averaged stats:", metric_logger) 175 | coco_evaluator.synchronize_between_processes() 176 | 177 | # accumulate predictions from all images 178 | coco_evaluator.accumulate() 179 | coco_evaluator.summarize() 180 | torch.set_num_threads(n_threads) 181 | return coco_evaluator 182 | 183 | 184 | print("Evaluation in progress") 185 | evaluate(model, val_dl, device=device) 186 | -------------------------------------------------------------------------------- /evaluation_baseline.py: -------------------------------------------------------------------------------- 1 | import pickle 2 | import time 3 | 4 | from cfg import * 5 | from coco_eval import CocoEvaluator 6 | from coco_utils import get_coco_api_from_dataset 7 | from datasets.bdd import * 8 | from datasets.idd import * 9 | from imports import * 10 | 11 | device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu") 12 | 13 | print("Loading files") 14 | 15 | if ds in ["idd_non_hq", "idd_hq"]: 16 | print("Evaluation on India Driving dataset") 17 | with open("datalists/idd_images_path_list.txt", "rb") as fp: 18 | idd_image_path_list = pickle.load(fp) 19 | with open("datalists/idd_anno_path_list.txt", "rb") as fp: 20 | idd_anno_path_list = pickle.load(fp) 21 | 22 | val_img_paths = [] 23 | with open(idd_path + "val.txt") as f: 24 | val_img_paths = f.readlines() 25 | for i in range(len(val_img_paths)): 26 | val_img_paths[i] = val_img_paths[i].strip("\n") 27 | val_img_paths[i] = val_img_paths[i] + ".jpg" 28 | val_img_paths[i] = os.path.join(idd_path + "JPEGImages", val_img_paths[i]) 29 | 30 | val_anno_paths = [] 31 | for i in range(len(val_img_paths)): 32 | val_anno_paths.append(val_img_paths[i].replace("JPEGImages", "Annotations")) 33 | val_anno_paths[i] = val_anno_paths[i].replace(".jpg", ".xml") 34 | 35 | val_img_paths, val_anno_paths = sorted(val_img_paths), sorted(val_anno_paths) 36 | 37 | assert len(val_img_paths) == len(val_anno_paths) 38 | val_img_paths = val_img_paths[:10] 39 | val_anno_paths = 
val_anno_paths[:10] 40 | 41 | val_dataset = IDD(val_img_paths, val_anno_paths, None) 42 | val_dl = torch.utils.data.DataLoader( 43 | val_dataset, 44 | batch_size=batch_size, 45 | shuffle=True, 46 | num_workers=4, 47 | collate_fn=utils.collate_fn, 48 | ) 49 | 50 | if ds == "bdd100k": 51 | print("Evaluation on Berkeley Deep Drive") 52 | root_img_path = os.path.join(bdd_path, "bdd100k_images_100k", "images", "100k") 53 | root_anno_path = os.path.join(bdd_path, "bdd100k_labels_release", "labels") 54 | 55 | val_img_path = root_img_path + "/val/" 56 | val_anno_json_path = root_anno_path + "/bdd100k_labels_images_val.json" 57 | 58 | with open("datalists/bdd100k_val_images_path.txt", "rb") as fp: 59 | bdd_img_path_list = pickle.load(fp) 60 | # bdd_img_path_list = bdd_img_path_list[:10] 61 | val_dataset = BDD(bdd_img_path_list, val_anno_json_path) 62 | val_dl = torch.utils.data.DataLoader( 63 | val_dataset, 64 | batch_size=batch_size, 65 | shuffle=True, 66 | num_workers=0, 67 | collate_fn=utils.collate_fn, 68 | pin_memory=True, 69 | ) 70 | 71 | if ds == "Cityscapes": 72 | with open("datalists/cityscapes_val_images_path.txt", "rb") as fp: 73 | images = pickle.load(fp) 74 | with open("datalists/cityscapes_val_targets_path.txt", "rb") as fp: 75 | targets = pickle.load(fp) 76 | 77 | val_dataset = Cityscapes(images, targets) 78 | val_dl = torch.utils.data.DataLoader( 79 | val_dataset, 80 | batch_size=batch_size, 81 | shuffle=True, 82 | num_workers=4, 83 | collate_fn=utils.collate_fn, 84 | ) 85 | 86 | ###################################################################################################3 87 | 88 | 89 | def get_model(num_classes): 90 | model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=False) 91 | in_features = model.roi_heads.box_predictor.cls_score.in_features 92 | model.roi_heads.box_predictor = torchvision.models.detection.faster_rcnn.FastRCNNPredictor( 93 | in_features, num_classes 94 | ) # replace the pre-trained head with a new one 95 | return model.cuda() 96 | 97 | 98 | model = get_model(12) 99 | model.to(device) 100 | params = [p for p in model.parameters() if p.requires_grad] 101 | optimizer = torch.optim.SGD(params, lr=0.005, momentum=0.9, weight_decay=0.0005) 102 | lr_scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=3, gamma=0.1) 103 | 104 | checkpoint = torch.load("saved_models/" + model_name) 105 | model.load_state_dict(checkpoint["model"]) 106 | print("Model Loaded successfully") 107 | 108 | print("##### Dataloader is ready #######") 109 | 110 | 111 | print("Getting coco api from dataset") 112 | coco = get_coco_api_from_dataset(val_dl.dataset) 113 | print("Done") 114 | 115 | print("Evaluation in progress") 116 | evaluate(model, val_dl, device=device) 117 | -------------------------------------------------------------------------------- /exp/evaluate_script.py: -------------------------------------------------------------------------------- 1 | from collections import OrderedDict 2 | 3 | from torchvision.models.detection.faster_rcnn import FastRCNNPredictor 4 | 5 | from cfg import * 6 | from datasets.bdd import * 7 | from datasets.idd import * 8 | from imports import * 9 | 10 | batch_size = 16 11 | 12 | 13 | def get_model(num_classes): 14 | model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True).cpu() 15 | in_features = model.roi_heads.box_predictor.cls_score.in_features 16 | model.roi_heads.box_predictor = torchvision.models.detection.faster_rcnn.FastRCNNPredictor( 17 | in_features, num_classes 18 | ).cpu() # replace 
the pre-trained head with a new one 19 | return model.cpu() 20 | 21 | 22 | with open("datalists/idd_val_images_path_list.txt", "rb") as fp: 23 | val_img_paths = pickle.load(fp) 24 | 25 | with open("datalists/idd_val_anno_path_list.txt", "rb") as fp: 26 | val_anno_paths = pickle.load(fp) 27 | # val_img_paths = val_img_paths[:10] 28 | # val_anno_paths = val_anno_paths[:10] 29 | val_dataset_idd = IDD(val_img_paths, val_anno_paths) 30 | val_dl_idd = torch.utils.data.DataLoader( 31 | val_dataset_idd, 32 | batch_size=batch_size, 33 | shuffle=True, 34 | num_workers=4, 35 | collate_fn=utils.collate_fn, 36 | ) 37 | 38 | root_img_path = os.path.join(bdd_path, "bdd100k_images_100k", "images", "100k") 39 | root_anno_path = os.path.join(bdd_path, "bdd100k_labels_release", "labels") 40 | 41 | val_img_path = root_img_path + "/val/" 42 | val_anno_json_path = root_anno_path + "/bdd100k_labels_images_val.json" 43 | 44 | with open("datalists/bdd100k_val_images_path.txt", "rb") as fp: 45 | bdd_img_path_list = pickle.load(fp) 46 | # bdd_img_path_list = bdd_img_path_list[:10] 47 | val_dataset_bdd = BDD(bdd_img_path_list, val_anno_json_path) 48 | val_dl_bdd = torch.utils.data.DataLoader( 49 | val_dataset_bdd, 50 | batch_size=batch_size, 51 | shuffle=True, 52 | num_workers=0, 53 | collate_fn=utils.collate_fn, 54 | pin_memory=True, 55 | ) 56 | 57 | coco_idd = get_coco_api_from_dataset(val_dl_idd.dataset) 58 | coco_bdd = get_coco_api_from_dataset(val_dl_bdd.dataset) 59 | 60 | 61 | @torch.no_grad() 62 | def evaluate_(model, coco_dset, data_loader, device): 63 | iou_types = ["bbox"] 64 | coco = coco_dset 65 | n_threads = torch.get_num_threads() 66 | # FIXME remove this and make paste_masks_in_image run on the GPU 67 | torch.set_num_threads(1) 68 | cpu_device = torch.device("cpu") 69 | model.eval() 70 | metric_logger = utils.MetricLogger(delimiter=" ") 71 | header = "Test:" 72 | model.to(device) 73 | iou_types = _get_iou_types(model) 74 | coco_evaluator = CocoEvaluator(coco, iou_types) 75 | to_tensor = torchvision.transforms.ToTensor() 76 | for image, targets in metric_logger.log_every(data_loader, 100, header): 77 | 78 | image = list(to_tensor(img).to(device) for img in image) 79 | targets = [{k: v.to(device) for k, v in t.items()} for t in targets] 80 | torch.cuda.synchronize() 81 | model_time = time.time() 82 | 83 | outputs = model(image) 84 | 85 | outputs = [{k: v.to(cpu_device) for k, v in t.items()} for t in outputs] 86 | model_time = time.time() - model_time 87 | 88 | res = { 89 | target["image_id"].item(): output 90 | for target, output in zip(targets, outputs) 91 | } 92 | evaluator_time = time.time() 93 | coco_evaluator.update(res) 94 | evaluator_time = time.time() - evaluator_time 95 | metric_logger.update(model_time=model_time, evaluator_time=evaluator_time) 96 | 97 | # gather the stats from all processes 98 | metric_logger.synchronize_between_processes() 99 | print("Averaged stats:", metric_logger) 100 | coco_evaluator.synchronize_between_processes() 101 | 102 | # accumulate predictions from all images 103 | coco_evaluator.accumulate() 104 | coco_evaluator.summarize() 105 | torch.set_num_threads(n_threads) 106 | return coco_evaluator 107 | 108 | 109 | def _get_iou_types(model): 110 | model_without_ddp = model 111 | if isinstance(model, torch.nn.parallel.DistributedDataParallel): 112 | model_without_ddp = model.module 113 | iou_types = ["bbox"] 114 | return iou_types 115 | 116 | 117 | device = torch.device("cuda") 118 | 119 | trained_models = [ 120 | # 'task_2_1/s_bdd_t_idd_task_new_2_1_epoch_0.pth', 121 | 
# 'task_2_1/s_bdd_t_idd_task_new_2_1_epoch_1.pth', 122 | # 'task_2_1/s_bdd_t_idd_task_new_2_1_epoch_2.pth', 123 | # 'task_2_1/s_bdd_t_idd_task_new_2_1_epoch_2.pth', 124 | # 'task_2_1/s_bdd_t_idd_task_new_2_1_epoch_3.pth', 125 | # 'task_2_1/s_bdd_t_idd_task_new_2_1_epoch_4.pth', 126 | # 'task_2_1/s_bdd_t_idd_task_new_2_1_epoch_5.pth', 127 | # 'task_2_1/s_bdd_t_idd_task_new_2_1_epoch_6.pth', 128 | # 'task_2_1/s_bdd_t_idd_task_new_2_1_epoch_7.pth', 129 | # 'task_2_1/s_bdd_t_idd_task_new_2_1_epoch_8.pth', 130 | # 'task_2_1/s_bdd_t_idd_task_new_2_1_epoch_9.pth', 131 | "task_2_1/s_bdd_t_idd_task_new_2_1_epoch_10.pth", 132 | # 'task_2_1/s_bdd_t_idd_task_new_2_1_epoch_11.pth' 133 | ] 134 | 135 | for idx in tqdm(range(0, len(trained_models))): 136 | model = get_model(15) 137 | ckpt = torch.load("saved_models/" + trained_models[idx]) 138 | model.load_state_dict(ckpt["model"]) 139 | 140 | model.to(device) 141 | 142 | print("########## Evaluation of IDD ", "### IDX ", trained_models[idx]) 143 | 144 | evaluate_(model, coco_idd, val_dl_idd, device=torch.device("cuda")) 145 | 146 | model.roi_heads.box_predictor = FastRCNNPredictor(1024, 12) 147 | 148 | model_bdd = get_model(12) 149 | checkpoint = torch.load("saved_models/" + "bdd100k_24.pth") 150 | model_bdd.load_state_dict(checkpoint["model"]) 151 | 152 | model.roi_heads.load_state_dict(model_bdd.roi_heads.state_dict()) 153 | 154 | model.cuda() 155 | 156 | for n, p in model.named_parameters(): 157 | p.requires_grad = False # Number of params in RPN = 593935 158 | 159 | for n, p in model.rpn.named_parameters(): 160 | p.requires_grad = True 161 | 162 | for n, p in model.roi_heads.named_parameters(): 163 | p.requires_grad = True # Number of params in RPN = 593935 164 | 165 | print("########## Evaluation of BDD ", "### IDX ", trained_models[idx]) 166 | evaluate_(model, coco_bdd, val_dl_bdd, device=torch.device("cuda")) 167 | 168 | del model, model_bdd 169 | -------------------------------------------------------------------------------- /exp/evaluation_transport.py: -------------------------------------------------------------------------------- 1 | import pickle 2 | import time 3 | 4 | from coco_eval import CocoEvaluator 5 | from coco_utils import get_coco_api_from_dataset 6 | from datasets.bdd import * 7 | from datasets.idd import * 8 | from detection import faster_rcnn 9 | 10 | device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu") 11 | 12 | ########################### User Defined settings ######################## 13 | ds = "IDD" 14 | bdd_path = "/home/jupyter/autonue/data/bdd100k/" 15 | idd_path = "/home/jupyter/autonue/data/IDD_Detection/" 16 | batch_size = 8 17 | model_name = "bdd100k_24.pth" #'bdd100k_24.pth' 18 | # name = 'do_ft_trained_bdd_eval_idd_ready.pth' 19 | ################################ Dataset and Dataloader Management ########################################## 20 | 21 | print("Loading files") 22 | 23 | if ds == "IDD": 24 | # with open("datalists/idd_images_path_list.txt", "rb") as fp: 25 | # idd_image_path_list = pickle.load(fp) 26 | # with open("datalists/idd_anno_path_list.txt", "rb") as fp: 27 | # idd_anno_path_list = pickle.load(fp) 28 | 29 | val_img_paths = [] 30 | with open(idd_path + "val.txt") as f: 31 | val_img_paths = f.readlines() 32 | for i in range(len(val_img_paths)): 33 | val_img_paths[i] = val_img_paths[i].strip("\n") 34 | val_img_paths[i] = val_img_paths[i] + ".jpg" 35 | val_img_paths[i] = os.path.join(idd_path + "JPEGImages", val_img_paths[i]) 36 | 37 | val_anno_paths = [] 38 | for i in 
range(len(val_img_paths)): 39 | val_anno_paths.append(val_img_paths[i].replace("JPEGImages", "Annotations")) 40 | val_anno_paths[i] = val_anno_paths[i].replace(".jpg", ".xml") 41 | 42 | val_img_paths, val_anno_paths = sorted(val_img_paths), sorted(val_anno_paths) 43 | 44 | assert len(val_img_paths) == len(val_anno_paths) 45 | # val_img_paths = val_img_paths[:10] 46 | # val_anno_paths = val_anno_paths[:10] 47 | 48 | val_dataset = IDD_Test(val_img_paths, val_anno_paths) 49 | val_dl = torch.utils.data.DataLoader( 50 | val_dataset, 51 | batch_size=batch_size, 52 | shuffle=True, 53 | num_workers=4, 54 | collate_fn=utils.collate_fn, 55 | ) 56 | 57 | if ds == "BDD": 58 | root_img_path = os.path.join(bdd_path, "bdd100k_images_100k", "images", "100k") 59 | root_anno_path = os.path.join(bdd_path, "bdd100k_labels_release", "labels") 60 | 61 | val_img_path = root_img_path + "/val/" 62 | val_anno_json_path = root_anno_path + "/bdd100k_labels_images_val.json" 63 | 64 | with open("datalists/bdd100k_val_images_path.txt", "rb") as fp: 65 | bdd_img_path_list = pickle.load(fp) 66 | 67 | val_dataset = BDD(bdd_img_path_list, val_anno_json_path) 68 | val_dl = torch.utils.data.DataLoader( 69 | val_dataset, 70 | batch_size=batch_size, 71 | shuffle=True, 72 | num_workers=4, 73 | collate_fn=utils.collate_fn, 74 | ) 75 | 76 | if ds == "Cityscapes": 77 | with open("datalists/cityscapes_val_images_path.txt", "rb") as fp: 78 | images = pickle.load(fp) 79 | with open("datalists/cityscapes_val_targets_path.txt", "rb") as fp: 80 | targets = pickle.load(fp) 81 | 82 | val_dataset = Cityscapes(images, targets) 83 | val_dl = torch.utils.data.DataLoader( 84 | val_dataset, 85 | batch_size=batch_size, 86 | shuffle=True, 87 | num_workers=4, 88 | collate_fn=utils.collate_fn, 89 | ) 90 | 91 | ###################################################################################################3 92 | 93 | 94 | def get_model(num_classes): 95 | model = faster_rcnn.fasterrcnn_resnet50_fpn(pretrained=True) 96 | in_features = model.roi_heads.box_predictor.cls_score.in_features 97 | model.roi_heads.box_predictor = torchvision.models.detection.faster_rcnn.FastRCNNPredictor( 98 | in_features, num_classes 99 | ) # replace the pre-trained head with a new one 100 | return model.cuda() 101 | 102 | 103 | model = get_model(12) 104 | model.to(device) 105 | params = [p for p in model.parameters() if p.requires_grad] 106 | optimizer = torch.optim.SGD(params, lr=0.005, momentum=0.9, weight_decay=0.0005) 107 | lr_scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=3, gamma=0.1) 108 | 109 | checkpoint = torch.load("saved_models/" + model_name) 110 | model.load_state_dict(checkpoint["model"]) 111 | print("Model Loaded successfully") 112 | 113 | 114 | def _get_iou_types(model): 115 | model_without_ddp = model 116 | if isinstance(model, torch.nn.parallel.DistributedDataParallel): 117 | model_without_ddp = model.module 118 | iou_types = ["bbox"] 119 | return iou_types 120 | 121 | 122 | print("##### Dataloader is ready #######") 123 | iou_types = _get_iou_types(model) 124 | 125 | print("Getting coco api from dataset") 126 | coco = get_coco_api_from_dataset(val_dl.dataset) 127 | print("Done") 128 | 129 | 130 | @torch.no_grad() 131 | def evaluate(model, data_loader, device): 132 | n_threads = torch.get_num_threads() 133 | # FIXME remove this and make paste_masks_in_image run on the GPU 134 | torch.set_num_threads(1) 135 | cpu_device = torch.device("cpu") 136 | model.eval() 137 | metric_logger = utils.MetricLogger(delimiter=" ") 138 | header = 
"Test:" 139 | model.cuda() 140 | # coco = get_coco_api_from_dataset(data_loader.dataset) 141 | iou_types = _get_iou_types(model) 142 | coco_evaluator = CocoEvaluator(coco, iou_types) 143 | 144 | for image, targets in metric_logger.log_every(data_loader, 100, header): 145 | # print(image) 146 | # image = torchvision.transforms.ToTensor()(image[0]) # Returns a scaler tuple 147 | # print(image.shape) # dim of image 1080x1920 148 | 149 | image = torchvision.transforms.ToTensor()(image[0]).to(device) 150 | # image = img.to(device) for img in image 151 | targets = [{k: v.to(device) for k, v in t.items()} for t in targets] 152 | torch.cuda.synchronize() 153 | model_time = time.time() 154 | 155 | outputs = model([image]) 156 | 157 | outputs = [{k: v.to(cpu_device) for k, v in t.items()} for t in outputs] 158 | model_time = time.time() - model_time 159 | 160 | res = { 161 | target["image_id"].item(): output 162 | for target, output in zip(targets, outputs) 163 | } 164 | evaluator_time = time.time() 165 | coco_evaluator.update(res) 166 | evaluator_time = time.time() - evaluator_time 167 | metric_logger.update(model_time=model_time, evaluator_time=evaluator_time) 168 | 169 | # gather the stats from all processes 170 | metric_logger.synchronize_between_processes() 171 | print("Averaged stats:", metric_logger) 172 | coco_evaluator.synchronize_between_processes() 173 | 174 | # accumulate predictions from all images 175 | coco_evaluator.accumulate() 176 | coco_evaluator.summarize() 177 | torch.set_num_threads(n_threads) 178 | return coco_evaluator 179 | 180 | 181 | print("Evaluation in progress") 182 | evaluate(model, val_dl, device=device) 183 | -------------------------------------------------------------------------------- /exp/optimal_transport.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": 1, 6 | "metadata": {}, 7 | "outputs": [ 8 | { 9 | "name": "stdout", 10 | "output_type": "stream", 11 | "text": [ 12 | "Unet loaded successfully\n" 13 | ] 14 | } 15 | ], 16 | "source": [ 17 | "from imports import *\n", 18 | "from datasets.idd import *\n", 19 | "from datasets.bdd import *\n", 20 | "from detection.unet import *\n", 21 | "from collections import OrderedDict\n", 22 | "from torch_cluster import nearest\n", 23 | "from fastprogress import master_bar, progress_bar" 24 | ] 25 | }, 26 | { 27 | "cell_type": "code", 28 | "execution_count": 2, 29 | "metadata": {}, 30 | "outputs": [], 31 | "source": [ 32 | "batch_size=8\n", 33 | "num_epochs=1" 34 | ] 35 | }, 36 | { 37 | "cell_type": "code", 38 | "execution_count": 3, 39 | "metadata": {}, 40 | "outputs": [ 41 | { 42 | "name": "stdout", 43 | "output_type": "stream", 44 | "text": [ 45 | "Loading files\n" 46 | ] 47 | }, 48 | { 49 | "name": "stderr", 50 | "output_type": "stream", 51 | "text": [ 52 | "100%|██████████| 69863/69863 [00:02<00:00, 25953.05it/s]\n" 53 | ] 54 | } 55 | ], 56 | "source": [ 57 | "path = '/home/jupyter/autonue/data'\n", 58 | "root_img_path = os.path.join(path,'bdd100k','images','100k')\n", 59 | "root_anno_path = os.path.join(path,'bdd100k','labels')\n", 60 | "\n", 61 | "train_img_path = root_img_path+'/train/'\n", 62 | "val_img_path = root_img_path+'/val/'\n", 63 | "\n", 64 | "train_anno_json_path = root_anno_path+'/bdd100k_labels_images_train.json'\n", 65 | "val_anno_json_path = root_anno_path+'/bdd100k_labels_images_val.json'\n", 66 | "\n", 67 | "print(\"Loading files\")\n", 68 | "\n", 69 | "with 
open(\"datalists/bdd100k_train_images_path.txt\", \"rb\") as fp:\n", 70 | " train_img_path_list = pickle.load(fp)\n", 71 | "with open(\"datalists/bdd100k_val_images_path.txt\", \"rb\") as fp:\n", 72 | " val_img_path_list = pickle.load(fp)\n", 73 | "\n", 74 | "src_dataset = dset = BDD(train_img_path_list,train_anno_json_path,get_transform(train=True))\n", 75 | "src_dl = torch.utils.data.DataLoader(src_dataset, batch_size=batch_size, shuffle=True, num_workers=4,collate_fn=utils.collate_fn) " 76 | ] 77 | }, 78 | { 79 | "cell_type": "code", 80 | "execution_count": 4, 81 | "metadata": {}, 82 | "outputs": [], 83 | "source": [ 84 | "with open(\"datalists/idd_images_path_list.txt\", \"rb\") as fp:\n", 85 | " non_hq_img_paths = pickle.load(fp)\n", 86 | "with open(\"datalists/idd_anno_path_list.txt\", \"rb\") as fp:\n", 87 | " non_hq_anno_paths = pickle.load(fp)\n", 88 | "\n", 89 | "with open(\"datalists/idd_hq_images_path_list.txt\", \"rb\") as fp:\n", 90 | " hq_img_paths = pickle.load(fp)\n", 91 | "with open(\"datalists/idd_hq_anno_path_list.txt\", \"rb\") as fp:\n", 92 | " hq_anno_paths = pickle.load(fp)\n", 93 | " \n", 94 | "trgt_images = hq_img_paths #non_hq_img_paths #\n", 95 | "trgt_annos = hq_anno_paths #non_hq_anno_paths #hq_anno_paths + \n", 96 | "trgt_dataset = IDD(trgt_images,trgt_annos,get_transform(train=True))\n", 97 | "trgt_dl = torch.utils.data.DataLoader(trgt_dataset, batch_size=batch_size, shuffle=True, num_workers=4,collate_fn=utils.collate_fn)" 98 | ] 99 | }, 100 | { 101 | "cell_type": "code", 102 | "execution_count": 5, 103 | "metadata": {}, 104 | "outputs": [], 105 | "source": [ 106 | "#src_dataset[0][0].shape,trgt_dataset[0][0].shape" 107 | ] 108 | }, 109 | { 110 | "cell_type": "code", 111 | "execution_count": 6, 112 | "metadata": {}, 113 | "outputs": [], 114 | "source": [ 115 | "class TransportBlock(nn.Module):\n", 116 | " def __init__(self,backbone,n_channels=256,batch_size=2):\n", 117 | " super(TransportBlock, self).__init__()\n", 118 | " self.backbone = backbone.cuda()\n", 119 | " self.stats = [0.485, 0.456, 0.406],[0.229, 0.224, 0.225]\n", 120 | " self.batch_size=2\n", 121 | " self.unet = Unet(n_channels).cuda()\n", 122 | " \n", 123 | " for name,p in self.backbone.named_parameters():\n", 124 | " p.requires_grad=False\n", 125 | " \n", 126 | " def unet_forward(self,x):\n", 127 | " return self.unet(x)\n", 128 | " \n", 129 | " def transport_loss(self,S_embeddings, T_embeddings, N_cluster=5):\n", 130 | " Loss = 0. 
\n", 131 | " for batch in range(self.batch_size):\n", 132 | " S_embeddings = S_embeddings[batch].view(256,-1)\n", 133 | " T_embeddings = T_embeddings[batch].view(256,-1)\n", 134 | " \n", 135 | " N_random_vec = S_embeddings[np.random.choice(S_embeddings.shape[0], N_cluster)]\n", 136 | "\n", 137 | " cluster_labels = nearest(S_embeddings, N_random_vec)\n", 138 | " cluster_centroids = torch.cat([torch.mean(S_embeddings[cluster_labels == label], dim=0).unsqueeze(0) for label in cluster_labels])\n", 139 | "\n", 140 | " Target_labels = nearest(T_embeddings, cluster_centroids)\n", 141 | "\n", 142 | " target_centroids = []\n", 143 | " for label in cluster_labels:\n", 144 | " if label in Target_labels:\n", 145 | " target_centroids.append(torch.mean(T_embeddings[Target_labels == label], dim=0))\n", 146 | " else:\n", 147 | " target_centroids.append(cluster_centroids[label]) \n", 148 | "\n", 149 | " target_centroids = torch.cat(target_centroids)\n", 150 | "\n", 151 | " dist = lambda x,y: torch.mean((x -y)**2)\n", 152 | " intra_class_variance = torch.cat([dist(T_embeddings[Target_labels[label]], target_centroids[label]).unsqueeze(0) for label in cluster_labels])\n", 153 | " centroid_distance = torch.cat([dist(target_centroids[label], cluster_centroids[label]).unsqueeze(0) for label in cluster_labels])\n", 154 | "\n", 155 | " Loss += torch.mean(centroid_distance*intra_class_variance) # similar to earth mover distance\n", 156 | " return Loss" 157 | ] 158 | }, 159 | { 160 | "cell_type": "code", 161 | "execution_count": 7, 162 | "metadata": {}, 163 | "outputs": [], 164 | "source": [ 165 | "def get_model(num_classes):\n", 166 | " model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True).cpu()\n", 167 | " in_features = model.roi_heads.box_predictor.cls_score.in_features\n", 168 | " model.roi_heads.box_predictor = torchvision.models.detection.faster_rcnn.FastRCNNPredictor(in_features, num_classes).cpu() # replace the pre-trained head with a new one\n", 169 | " return model.cpu()" 170 | ] 171 | }, 172 | { 173 | "cell_type": "code", 174 | "execution_count": 8, 175 | "metadata": {}, 176 | "outputs": [], 177 | "source": [ 178 | "ckpt = torch.load('saved_models/bdd100k_24.pth')" 179 | ] 180 | }, 181 | { 182 | "cell_type": "code", 183 | "execution_count": 9, 184 | "metadata": {}, 185 | "outputs": [ 186 | { 187 | "data": { 188 | "text/plain": [ 189 | "IncompatibleKeys(missing_keys=[], unexpected_keys=[])" 190 | ] 191 | }, 192 | "execution_count": 9, 193 | "metadata": {}, 194 | "output_type": "execute_result" 195 | } 196 | ], 197 | "source": [ 198 | "model = get_model(12)\n", 199 | "model.load_state_dict(torch.load('saved_models/bdd100k_24.pth')['model'])" 200 | ] 201 | }, 202 | { 203 | "cell_type": "code", 204 | "execution_count": 10, 205 | "metadata": {}, 206 | "outputs": [], 207 | "source": [ 208 | "ot = TransportBlock(model.backbone)\n", 209 | "params = [p for p in ot.unet.parameters() if p.requires_grad]\n", 210 | "optimizer = torch.optim.SGD(params, lr=1e-3,momentum=0.9, weight_decay=0.0005)\n", 211 | "lr_scheduler = torch.optim.lr_scheduler.CyclicLR(optimizer,base_lr=1e-3,max_lr=6e-3)" 212 | ] 213 | }, 214 | { 215 | "cell_type": "code", 216 | "execution_count": 11, 217 | "metadata": {}, 218 | "outputs": [ 219 | { 220 | "data": { 221 | "text/plain": [ 222 | "GeneralizedRCNNTransform()" 223 | ] 224 | }, 225 | "execution_count": 11, 226 | "metadata": {}, 227 | "output_type": "execute_result" 228 | } 229 | ], 230 | "source": [ 231 | "from detection import transform\n", 232 | "transform = 
transform.GeneralizedRCNNTransform(min_size=800, max_size=1333, image_mean=[0.485, 0.456, 0.406], image_std=[0.229, 0.224, 0.225])\n", 233 | "transform.eval()" 234 | ] 235 | }, 236 | { 237 | "cell_type": "code", 238 | "execution_count": 12, 239 | "metadata": {}, 240 | "outputs": [ 241 | { 242 | "data": { 243 | "text/html": [ 244 | "\n", 245 | "
\n", 246 | " \n", 258 | " \n", 259 | " 0.00% [0/1 00:00<00:00]\n", 260 | "
\n", 261 | " \n", 262 | "\n", 263 | "\n", 264 | "
\n", 265 | " \n", 277 | " \n", 278 | " Interrupted\n", 279 | "
\n", 280 | " " 281 | ], 282 | "text/plain": [ 283 | "" 284 | ] 285 | }, 286 | "metadata": {}, 287 | "output_type": "display_data" 288 | } 289 | ], 290 | "source": [ 291 | "mb = master_bar(range(num_epochs))\n", 292 | "for i in mb:\n", 293 | " for trgt_img, _ in progress_bar(trgt_dl,parent=mb):\n", 294 | " src_img, _ = next(iter(src_dl))\n", 295 | "\n", 296 | " src_images = list(image.cuda() for image in src_img)\n", 297 | " trgt_images = list(image.cuda() for image in trgt_img)\n", 298 | "\n", 299 | " src_images, _ = transform(src_images, None)\n", 300 | " src_features = ot.backbone(src_images.tensors)[0]\n", 301 | "\n", 302 | " trgt_images, _ = transform(trgt_images, None)\n", 303 | " trgt_features = ot.backbone(trgt_images.tensors)[0]\n", 304 | " \n", 305 | " torch.save(src_features,'src_features.pth')\n", 306 | " torch.save(trgt_features,'trgt_features.pth')\n", 307 | " \n", 308 | " modified_trgt_features = ot.unet_forward(trgt_features)\n", 309 | " \n", 310 | " torch.save(modified_trgt_features,'modified_trgt_features.pth')\n", 311 | " \n", 312 | " break\n", 313 | " #print(src_features.shape,modified_trgt_features.shape)\n", 314 | " \n", 315 | " # pad if dim of feature maps are not same\n", 316 | " if src_features.shape!=modified_trgt_features.shape:\n", 317 | " print(\"Earlier\", src_features.shape,modified_trgt_features.shape)\n", 318 | " print(\"Fixing\")\n", 319 | " if src_features.size(3)<336:\n", 320 | " src_features = F.pad(src_features,(336-src_features.size(3),0,0,0)).contiguous()\n", 321 | " if modified_trgt_features.size(3)>192:\n", 322 | " modified_trgt_features = F.pad(modified_trgt_features,(0,0,192-modified_trgt_features.size(2),0)).contiguous()\n", 323 | " if modified_trgt_features.size(3)<336:\n", 324 | " modified_trgt_features = F.pad(modified_trgt_features,(336-modified_trgt_features.size(3),0,0,0)).contiguous()\n", 325 | " ############################################################ \n", 326 | " #print(\"Now\", src_features.shape,modified_trgt_features.shape)\n", 327 | " assert src_features.shape==modified_trgt_features.shape\n", 328 | "\n", 329 | " loss = ot.transport_loss(src_features,modified_trgt_features)\n", 330 | "\n", 331 | " print (\"transport_loss: \",loss.item(),\"lr: \", optimizer.param_groups[0][\"lr\"])\n", 332 | " optimizer.zero_grad()\n", 333 | " loss.backward()\n", 334 | " optimizer.step()\n", 335 | " lr_scheduler.step()\n", 336 | "\n", 337 | " del src_images,trgt_images,src_features,trgt_features,_\n", 338 | " break" 339 | ] 340 | }, 341 | { 342 | "cell_type": "code", 343 | "execution_count": 13, 344 | "metadata": {}, 345 | "outputs": [], 346 | "source": [ 347 | "torch.save({\n", 348 | " 'model_state_dict': ot.unet.state_dict(),\n", 349 | " 'optimizer_state_dict': optimizer.state_dict(),\n", 350 | " }, 'saved_models/unet.pth')" 351 | ] 352 | }, 353 | { 354 | "cell_type": "code", 355 | "execution_count": null, 356 | "metadata": {}, 357 | "outputs": [], 358 | "source": [] 359 | } 360 | ], 361 | "metadata": { 362 | "kernelspec": { 363 | "display_name": "Python 3", 364 | "language": "python", 365 | "name": "python3" 366 | }, 367 | "language_info": { 368 | "codemirror_mode": { 369 | "name": "ipython", 370 | "version": 3 371 | }, 372 | "file_extension": ".py", 373 | "mimetype": "text/x-python", 374 | "name": "python", 375 | "nbconvert_exporter": "python", 376 | "pygments_lexer": "ipython3", 377 | "version": "3.6.7" 378 | } 379 | }, 380 | "nbformat": 4, 381 | "nbformat_minor": 2 382 | } 383 | 
-------------------------------------------------------------------------------- /exp/train_script.py: -------------------------------------------------------------------------------- 1 | from collections import OrderedDict 2 | 3 | from cfg import * 4 | from datasets.bdd import * 5 | from datasets.idd import * 6 | from imports import * 7 | 8 | batch_size = 16 9 | 10 | with open("datalists/idd_images_path_list.txt", "rb") as fp: 11 | non_hq_img_paths = pickle.load(fp) 12 | with open("datalists/idd_anno_path_list.txt", "rb") as fp: 13 | non_hq_anno_paths = pickle.load(fp) 14 | 15 | with open("datalists/idd_hq_images_path_list.txt", "rb") as fp: 16 | hq_img_paths = pickle.load(fp) 17 | with open("datalists/idd_hq_anno_path_list.txt", "rb") as fp: 18 | hq_anno_paths = pickle.load(fp) 19 | 20 | trgt_images = non_hq_img_paths # hq_img_paths 21 | trgt_annos = non_hq_anno_paths # hq_anno_paths + hq_anno_paths 22 | trgt_dataset = IDD(trgt_images, trgt_annos, get_transform(train=True)) 23 | trgt_dl = torch.utils.data.DataLoader( 24 | trgt_dataset, 25 | batch_size=batch_size, 26 | shuffle=True, 27 | num_workers=4, 28 | collate_fn=utils.collate_fn, 29 | ) 30 | 31 | 32 | def get_model(num_classes): 33 | model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True).cpu() 34 | in_features = model.roi_heads.box_predictor.cls_score.in_features 35 | model.roi_heads.box_predictor = torchvision.models.detection.faster_rcnn.FastRCNNPredictor( 36 | in_features, num_classes 37 | ).cpu() # replace the pre-trained head with a new one 38 | return model.cpu() 39 | 40 | 41 | model = get_model(15) 42 | ckpt = torch.load("saved_models/task_2_1/s_bdd_t_idd_task_new_2_1_epoch_2.pth") 43 | model.load_state_dict(ckpt["model"]) 44 | 45 | for n, p in model.backbone.body.named_parameters(): 46 | p.requires_grad = False # Number of params in RPN = 593935 47 | 48 | for n, p in model.rpn.named_parameters(): 49 | p.requires_grad = True 50 | 51 | for n, p in model.backbone.fpn.named_parameters(): 52 | p.requires_grad = True 53 | 54 | for n, p in model.roi_heads.named_parameters(): 55 | p.requires_grad = True # Number of params in RPN = 593935 56 | 57 | device = torch.device("cuda") 58 | model.to(device) 59 | 60 | optimizer = torch.optim.SGD( 61 | [ 62 | {"params": model.backbone.body.parameters(), "lr": 1e-5}, 63 | {"params": model.backbone.fpn.parameters(), "lr": 2e-4}, 64 | {"params": model.rpn.parameters(), "lr": 4e-4}, 65 | {"params": model.roi_heads.parameters(), "lr": 1e-3}, 66 | ] 67 | ) 68 | 69 | lr_scheduler = torch.optim.lr_scheduler.CyclicLR(optimizer, base_lr=1e-4, max_lr=6e-3) 70 | 71 | for epoch in tqdm(range(3, 16)): 72 | train_one_epoch(model, optimizer, trgt_dl, device, epoch, print_freq=50) 73 | 74 | lr_scheduler.step() 75 | 76 | save_name = ( 77 | "saved_models/task_2_1/s_bdd_t_idd_task_new_2_1_epoch_" + str(epoch) + ".pth" 78 | ) 79 | torch.save( 80 | {"model": model.state_dict(), "optimizer": optimizer.state_dict(),}, save_name 81 | ) 82 | print("Saved model", save_name) 83 | -------------------------------------------------------------------------------- /get_datalists.py: -------------------------------------------------------------------------------- 1 | from cfg import * 2 | from imports import * 3 | 4 | device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu") 5 | 6 | if ds == "bdd100k": 7 | print("Creating datalist for Berkeley Deep Drive") 8 | root_img_path = os.path.join(bdd_path, "bdd100k_images_100k", "images", "100k") 9 | root_anno_path = 
os.path.join(bdd_path, "bdd100k_labels_release", "labels") 10 | 11 | train_img_path = root_img_path + "/train/" 12 | val_img_path = root_img_path + "/val/" 13 | 14 | train_anno_json = root_anno_path + "/bdd100k_labels_images_train.json" 15 | val_anno_json = root_anno_path + "/bdd100k_labels_images_val.json" 16 | 17 | def _load_json(path_list_idx): 18 | with open(path_list_idx, "r") as file: 19 | data = json.load(file) 20 | return data 21 | 22 | train_anno_data = _load_json(train_anno_json) 23 | 24 | img_datalist = [] 25 | for i in tqdm(range(len(train_anno_data))): 26 | img_path = train_img_path + train_anno_data[i]["name"] 27 | img_datalist.append(img_path) 28 | 29 | val_anno_data = _load_json(val_anno_json) 30 | 31 | val_datalist = [] 32 | 33 | for i in range(len(val_anno_data)): 34 | img_path = val_img_path + val_anno_data[i]["name"] 35 | val_datalist.append(img_path) 36 | 37 | try: 38 | os.mkdir("datalists") 39 | except: 40 | pass 41 | 42 | with open("datalists/bdd100k_train_images_path.txt", "wb") as fp: 43 | pickle.dump(img_datalist, fp) 44 | 45 | with open("datalists/bdd100k_val_images_path.txt", "wb") as fp: 46 | pickle.dump(val_datalist, fp) 47 | 48 | print("Done") 49 | 50 | if ds == "idd_non_hq": 51 | print("Creating datalist for India Driving Dataset (non HQ)") 52 | ###################################################################################### 53 | root_anno_path = os.path.join(idd_path, "Annotations", "highquality_16k") 54 | root_img_path = os.path.join(idd_path, "JPEGImages", "highquality_16k") 55 | 56 | img_id = os.listdir(root_img_path) 57 | anno_id = os.listdir(root_anno_path) 58 | 59 | img_idxs = [value for value in img_id if value in anno_id] 60 | anno_idxs = [value for value in anno_id if value in img_idxs] 61 | 62 | img_paths = [] 63 | for i in range(len(img_idxs)): 64 | img_paths.append(os.path.join(root_img_path, img_idxs[i])) 65 | assert len(img_paths) == len(img_idxs) 66 | total_img_paths = [] 67 | for i in tqdm(range(len(img_paths))): 68 | img_names = os.listdir(img_paths[i]) 69 | for j in range(len(img_names)): 70 | img_name = os.path.join(img_paths[i], img_names[j]) 71 | total_img_paths.append(img_name) 72 | 73 | anno_paths = [] 74 | for i in range(len(anno_idxs)): 75 | anno_paths.append(os.path.join(root_anno_path, anno_idxs[i])) 76 | assert len(anno_paths) == len(anno_idxs) 77 | total_anno_paths = [] 78 | for i in tqdm(range(len(anno_paths))): 79 | anno_names = os.listdir(anno_paths[i]) 80 | for j in range(len(anno_names)): 81 | anno_name = os.path.join(anno_paths[i], anno_names[j]) 82 | # print(img_name) 83 | total_anno_paths.append(anno_name) 84 | 85 | total_img_paths, total_anno_paths = ( 86 | sorted(total_img_paths), 87 | sorted(total_anno_paths), 88 | ) 89 | len(total_img_paths), len(total_anno_paths) 90 | 91 | ############################################################### 92 | def get_obj_bboxes(xml_obj): 93 | xml_obj = ET.parse(xml_obj) 94 | objects, bboxes = [], [] 95 | 96 | for node in xml_obj.getroot().iter("object"): 97 | object_present = node.find("name").text 98 | xmin = int(node.find("bndbox/xmin").text) 99 | xmax = int(node.find("bndbox/xmax").text) 100 | ymin = int(node.find("bndbox/ymin").text) 101 | ymax = int(node.find("bndbox/ymax").text) 102 | objects.append(object_present) 103 | bboxes.append((xmin, ymin, xmax, ymax)) 104 | return objects, bboxes 105 | 106 | def get_label_bboxes(xml_obj): 107 | xml_obj = ET.parse(xml_obj) 108 | objects, bboxes = [], [] 109 | 110 | for node in xml_obj.getroot().iter("object"): 111 | object_present 
= node.find("name").text 112 | xmin = int(node.find("bndbox/xmin").text) 113 | xmax = int(node.find("bndbox/xmax").text) 114 | ymin = int(node.find("bndbox/ymin").text) 115 | ymax = int(node.find("bndbox/ymax").text) 116 | objects.append(labels[object_present]) 117 | bboxes.append((xmin, ymin, xmax, ymax)) 118 | return Tensor(objects), Tensor(bboxes) 119 | 120 | ############################################################## 121 | 122 | print("######### Checking ############") 123 | print(total_img_paths[100], total_anno_paths[100]) 124 | 125 | print("Images without annotations found, fixing them") 126 | cnt = 0 127 | for i, a in tqdm(enumerate(total_anno_paths)): 128 | obj_anno_0 = get_obj_bboxes(total_anno_paths[i]) 129 | if not obj_anno_0[0]: 130 | total_anno_paths.remove(a) 131 | a = a.replace("Annotations", "JPEGImages") 132 | a = a.replace("xml", "jpg") 133 | total_img_paths.remove(a) 134 | # print("Problematic", a) 135 | cnt += 1 136 | 137 | print("Total number of images without annotations: " + str(cnt)) 138 | 139 | # total_img_paths = total_img_paths[:10000] 140 | # total_anno_paths = total_anno_paths[:10000] 141 | print(total_img_paths[2000], total_anno_paths[2000]) 142 | 143 | assert len(total_anno_paths) == len(total_img_paths) 144 | 145 | with open("datalists/idd_hq_images_path_list.txt", "wb") as fp: 146 | pickle.dump(total_img_paths, fp) 147 | 148 | with open("datalists/idd_hq_anno_path_list.txt", "wb") as fp: 149 | pickle.dump(total_anno_paths, fp) 150 | 151 | print("Saved successfully", "datalists/idd_hq_images_path_list.txt") 152 | 153 | if ds == "idd_hq": 154 | print("Creating datalist for India Driving Dataset (HQ)") 155 | root_anno_path = os.path.join(idd_path, "Annotations", "highquality_16k") 156 | root_img_path = os.path.join(idd_path, "JPEGImages", "highquality_16k") 157 | 158 | img_id = os.listdir(root_img_path) 159 | anno_id = os.listdir(root_anno_path) 160 | 161 | img_idxs = [value for value in img_id if value in anno_id] 162 | anno_idxs = [value for value in anno_id if value in img_idxs] 163 | 164 | img_paths = [] 165 | for i in range(len(img_idxs)): 166 | img_paths.append(os.path.join(root_img_path, img_idxs[i])) 167 | assert len(img_paths) == len(img_idxs) 168 | total_img_paths = [] 169 | for i in tqdm(range(len(img_paths))): 170 | img_names = os.listdir(img_paths[i]) 171 | for j in range(len(img_names)): 172 | img_name = os.path.join(img_paths[i], img_names[j]) 173 | total_img_paths.append(img_name) 174 | 175 | anno_paths = [] 176 | for i in range(len(anno_idxs)): 177 | anno_paths.append(os.path.join(root_anno_path, anno_idxs[i])) 178 | assert len(anno_paths) == len(anno_idxs) 179 | total_anno_paths = [] 180 | for i in tqdm(range(len(anno_paths))): 181 | anno_names = os.listdir(anno_paths[i]) 182 | for j in range(len(anno_names)): 183 | anno_name = os.path.join(anno_paths[i], anno_names[j]) 184 | # print(img_name) 185 | total_anno_paths.append(anno_name) 186 | 187 | total_img_paths, total_anno_paths = ( 188 | sorted(total_img_paths), 189 | sorted(total_anno_paths), 190 | ) 191 | len(total_img_paths), len(total_anno_paths) 192 | 193 | ############################################################### 194 | def get_obj_bboxes(xml_obj): 195 | xml_obj = ET.parse(xml_obj) 196 | objects, bboxes = [], [] 197 | 198 | for node in xml_obj.getroot().iter("object"): 199 | object_present = node.find("name").text 200 | xmin = int(node.find("bndbox/xmin").text) 201 | xmax = int(node.find("bndbox/xmax").text) 202 | ymin = int(node.find("bndbox/ymin").text) 203 | ymax = 
int(node.find("bndbox/ymax").text) 204 | objects.append(object_present) 205 | bboxes.append((xmin, ymin, xmax, ymax)) 206 | return objects, bboxes 207 | 208 | def get_label_bboxes(xml_obj): 209 | xml_obj = ET.parse(xml_obj) 210 | objects, bboxes = [], [] 211 | 212 | for node in xml_obj.getroot().iter("object"): 213 | object_present = node.find("name").text 214 | xmin = int(node.find("bndbox/xmin").text) 215 | xmax = int(node.find("bndbox/xmax").text) 216 | ymin = int(node.find("bndbox/ymin").text) 217 | ymax = int(node.find("bndbox/ymax").text) 218 | objects.append(labels[object_present]) 219 | bboxes.append((xmin, ymin, xmax, ymax)) 220 | return Tensor(objects), Tensor(bboxes) 221 | 222 | ############################################################## 223 | 224 | print("######### Checking ############") 225 | print(total_img_paths[100], total_anno_paths[100]) 226 | 227 | print("images without annotations found, fixing them") 228 | cnt = 0 229 | for i, a in tqdm(enumerate(total_anno_paths)): 230 | obj_anno_0 = get_obj_bboxes(total_anno_paths[i]) 231 | if not obj_anno_0[0]: 232 | total_anno_paths.remove(a) 233 | a = a.replace("Annotations", "JPEGImages") 234 | a = a.replace("xml", "jpg") 235 | total_img_paths.remove(a) 236 | # print("Problematic", a) 237 | cnt += 1 238 | 239 | print("Total number of images without annotations: " + str(cnt)) 240 | 241 | # total_img_paths = total_img_paths[:10000] 242 | # total_anno_paths = total_anno_paths[:10000] 243 | print(total_img_paths[2000], total_anno_paths[2000]) 244 | 245 | assert len(total_anno_paths) == len(total_img_paths) 246 | 247 | with open("datalists/idd_hq_images_path_list.txt", "wb") as fp: 248 | pickle.dump(total_img_paths, fp) 249 | 250 | with open("datalists/idd_hq_anno_path_list.txt", "wb") as fp: 251 | pickle.dump(total_anno_paths, fp) 252 | 253 | print("Saved successfully", "datalists/idd_hq_images_path_list.txt") 254 | 255 | if ds == "Cityscapes": 256 | root = cityscapes_path 257 | images_dir = os.path.join(root, "images", cityscapes_split) 258 | targets_dir = os.path.join(root, "bboxes", cityscapes_split) 259 | images_val_dir = os.path.join(root, "images", "val") 260 | targets_val_dir = os.path.join(root, "bboxes", "val") 261 | 262 | images, targets = [], [] 263 | val_images, val_targets = [], [] 264 | 265 | print("Images Directory", images_dir) 266 | print("Targets Directory", targets_dir) 267 | print("Validation Images Directory", images_val_dir) 268 | print("Validation Targets Directory", targets_val_dir) 269 | 270 | if split not in ["train", "test", "val"]: 271 | raise ValueError( 272 | 'Invalid split for mode "fine"! Please use split="train", split="test"' 273 | ' or split="val"' 274 | ) 275 | 276 | if not os.path.isdir(images_dir) or not os.path.isdir(targets_dir): 277 | raise RuntimeError( 278 | "Dataset not found or incomplete. 
Please make sure all required folders for the" 279 | ' specified "split" and "mode" are inside the "root" directory' 280 | ) 281 | 282 | ##################### For Training Set ################################### 283 | for city in os.listdir(images_dir): 284 | img_dir = os.path.join(images_dir, city) 285 | target_dir = os.path.join(targets_dir, city) 286 | 287 | for file_name in os.listdir(img_dir): 288 | # target_types = [] 289 | target_name = "{}_{}".format( 290 | file_name.split("_leftImg8bit")[0], "gtBboxCityPersons.json" 291 | ) 292 | targets.append(os.path.join(target_dir, target_name)) 293 | 294 | images.append(os.path.join(img_dir, file_name)) 295 | # targets.append(target_types) 296 | 297 | ###################### For Validation Set ########################## 298 | 299 | for city in os.listdir(images_val_dir): 300 | img_val_dir = os.path.join(images_val_dir, city) 301 | target_val_dir = os.path.join(targets_val_dir, city) 302 | 303 | for file_name in os.listdir(img_val_dir): 304 | # target_types = [] 305 | target_val_name = "{}_{}".format( 306 | file_name.split("_leftImg8bit")[0], "gtBboxCityPersons.json" 307 | ) 308 | val_targets.append(os.path.join(target_val_dir, target_val_name)) 309 | 310 | val_images.append(os.path.join(img_val_dir, file_name)) 311 | ####################################################################### 312 | 313 | print("Length of images and targets", len(images), len(targets)) 314 | print("Lenght of Validation images and targets", len(val_images), len(val_targets)) 315 | 316 | images, targets = sorted(images), sorted(targets) 317 | val_images, val_targets = sorted(val_images), sorted(val_targets) 318 | 319 | cityscapes_classes = { 320 | "pedestrian": 0, 321 | "rider": 1, 322 | "person group": 2, 323 | "person (other)": 3, 324 | "sitting person": 4, 325 | "ignore": 5, 326 | } 327 | 328 | def _load_json(path): 329 | with open(path, "r") as file: 330 | data = json.load(file) 331 | return data 332 | 333 | def get_label_bboxes(label): 334 | bboxes = [] 335 | labels = [] 336 | for data in label["objects"]: 337 | bboxes.append(data["bbox"]) 338 | labels.append(cityscapes_classes[data["label"]]) 339 | return bboxes, labels 340 | 341 | ##################################### Fixing annotations with empty labels ########################3 342 | empty_target_paths = [] 343 | 344 | for i in tqdm(range(2975)): 345 | data = _load_json(targets[i]) 346 | obj, bbox_coords = get_label_bboxes(data)[0], get_label_bboxes(data)[1] 347 | if len(bbox_coords) == 0: # Check if the list is empty 348 | fname = targets[i] 349 | empty_target_paths.append(fname) 350 | 351 | print("Length of Empty targets: ", len(empty_target_paths)) 352 | 353 | img_files_to_remove = [] 354 | 355 | for i in range(len(empty_target_paths)): 356 | fname = empty_target_paths[i] 357 | fname = fname.replace("json", "png") 358 | fname = fname.replace("gtBboxCityPersons", "leftImg8bit") 359 | fname = fname.replace("bboxes", "images") 360 | img_files_to_remove.append(fname) 361 | 362 | print("Image files to remove", len(img_files_to_remove)) 363 | print(empty_target_paths[0]) 364 | print(img_files_to_remove[0]) 365 | 366 | for i in range(len(empty_target_paths)): 367 | target_fname = empty_target_paths[i] 368 | image_fname = img_files_to_remove[i] 369 | if target_fname in targets: 370 | targets.remove(target_fname) 371 | if image_fname in images: 372 | images.remove(image_fname) 373 | #################################### Validation Set : Fixing annotations ################################ 374 | 
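    # The validation split is cleaned the same way as the training split above:
    # annotation files with no boxes are dropped, and the matching image path is
    # derived by swapping "gtBboxCityPersons.json" -> "leftImg8bit.png" and
    # "bboxes" -> "images" in the file name.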
val_target_files_to_remove = [] 375 | 376 | for i in tqdm(range(500)): 377 | data = _load_json(val_targets[i]) 378 | obj, bbox_coords = get_label_bboxes(data)[0], get_label_bboxes(data)[1] 379 | if len(bbox_coords) == 0: # Check if the list is empty 380 | fname = val_targets[i] 381 | val_target_files_to_remove.append(fname) 382 | 383 | print("Length of Empty targets: ", len(val_target_files_to_remove)) 384 | 385 | val_img_files_to_remove = [] 386 | 387 | for i in range(len(val_target_files_to_remove)): 388 | fname = val_target_files_to_remove[i] 389 | fname = fname.replace("json", "png") 390 | fname = fname.replace("gtBboxCityPersons", "leftImg8bit") 391 | fname = fname.replace("bboxes", "images") 392 | # fname = fname.replace('train','val') 393 | val_img_files_to_remove.append(fname) 394 | 395 | print("Image files to remove", len(val_img_files_to_remove)) 396 | print(val_target_files_to_remove[0]) 397 | print(val_img_files_to_remove[0], val_images[0]) 398 | 399 | for i in range(len(val_img_files_to_remove)): 400 | target_fname = val_target_files_to_remove[i] 401 | image_fname = val_img_files_to_remove[i] 402 | 403 | if image_fname in val_images: 404 | val_images.remove(image_fname) 405 | 406 | if target_fname in val_targets: 407 | val_targets.remove(target_fname) 408 | 409 | ############################################################################################################### 410 | 411 | print("Updated Length", len(images), len(targets)) 412 | # assert len(val_images)==len(val_targets)==500 413 | print("Length of Validation set", len(val_images)) 414 | 415 | with open("datalists/cityscapes_images_path.txt", "wb") as fp: 416 | pickle.dump(images, fp) 417 | 418 | with open("datalists/cityscapes_targets_path.txt", "wb") as fp: 419 | pickle.dump(targets, fp) 420 | 421 | with open("datalists/cityscapes_val_images_path.txt", "wb") as fp: 422 | pickle.dump(val_images, fp) 423 | 424 | with open("datalists/cityscapes_val_targets_path.txt", "wb") as fp: 425 | pickle.dump(val_targets, fp) 426 | ################################################################################################ 427 | print("Done") 428 | -------------------------------------------------------------------------------- /imports.py: -------------------------------------------------------------------------------- 1 | import json 2 | import math 3 | import os 4 | import pickle 5 | import xml.etree.ElementTree as ET 6 | from glob import glob 7 | from pathlib import Path 8 | 9 | import matplotlib 10 | import matplotlib.patches as patches 11 | import matplotlib.pyplot as plt 12 | import numpy as np 13 | import torchvision 14 | from PIL import Image 15 | from torchvision import transforms 16 | from tqdm import tqdm 17 | 18 | import torch 19 | import transforms as T 20 | import utils 21 | from engine import * 22 | from torch import FloatTensor, Tensor, nn 23 | from torch.utils.data import (DataLoader, Dataset, RandomSampler, 24 | SequentialSampler) 25 | 26 | COLOR = "yellow" 27 | matplotlib.rcParams["text.color"] = COLOR 28 | 29 | import ssl 30 | ssl._create_default_https_context = ssl._create_unverified_context 31 | -------------------------------------------------------------------------------- /train_baseline.py: -------------------------------------------------------------------------------- 1 | import pickle 2 | 3 | from cfg import * 4 | from datasets.bdd import * 5 | from datasets.idd import * 6 | from imports import * 7 | 8 | device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu") 9 | 10 
| if ds == "bdd100k": 11 | root_img_path = os.path.join(bdd_path, "bdd100k_images_100k", "images", "100k") 12 | root_anno_path = os.path.join(bdd_path, "bdd100k_labels_release", "labels") 13 | 14 | train_img_path = root_img_path + "/train/" 15 | val_img_path = root_img_path + "/val/" 16 | 17 | train_anno_json_path = root_anno_path + "/bdd100k_labels_images_train.json" 18 | val_anno_json_path = root_anno_path + "/bdd100k_labels_images_val.json" 19 | 20 | print("Loading files") 21 | 22 | with open("datalists/bdd100k_train_images_path.txt", "rb") as fp: 23 | train_img_path_list = pickle.load(fp) 24 | with open("datalists/bdd100k_val_images_path.txt", "rb") as fp: 25 | val_img_path_list = pickle.load(fp) 26 | 27 | dataset_train = BDD( 28 | train_img_path_list, train_anno_json_path, get_transform(train=True) 29 | ) 30 | dl = torch.utils.data.DataLoader( 31 | dataset_train, 32 | batch_size=batch_size, 33 | shuffle=True, 34 | num_workers=4, 35 | collate_fn=utils.collate_fn, 36 | ) 37 | 38 | if ds in ["idd_non_hq", "idd_hq"]: 39 | 40 | with open("datalists/idd_images_path_list.txt", "rb") as fp: 41 | non_hq_img_paths = pickle.load(fp) 42 | with open("datalists/idd_anno_path_list.txt", "rb") as fp: 43 | non_hq_anno_paths = pickle.load(fp) 44 | 45 | if idd_hq == True: 46 | images = non_hq_img_paths + hq_img_paths 47 | annos = non_hq_anno_paths + hq_anno_paths 48 | else: 49 | images = non_hq_img_paths 50 | annos = non_hq_anno_paths 51 | dataset_train = IDD(images, annos, get_transform(train=True)) 52 | dl = torch.utils.data.DataLoader( 53 | dataset_train, 54 | batch_size=batch_size, 55 | shuffle=True, 56 | num_workers=4, 57 | collate_fn=utils.collate_fn, 58 | ) 59 | 60 | print("Loading done") 61 | 62 | 63 | def get_model(num_classes): 64 | model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True) 65 | in_features = model.roi_heads.box_predictor.cls_score.in_features 66 | model.roi_heads.box_predictor = torchvision.models.detection.faster_rcnn.FastRCNNPredictor( 67 | in_features, num_classes 68 | ) # replace the pre-trained head with a new one 69 | return model.cuda() if torch.cuda.is_available() else model.cpu() 70 | 71 | print("Model initialization") 72 | model = get_model(len(dataset_train.classes)) 73 | model.to(device) 74 | params = [p for p in model.parameters() if p.requires_grad] 75 | optimizer = torch.optim.SGD(params, lr=lr, momentum=0.9, weight_decay=0.0005) 76 | lr_scheduler = torch.optim.lr_scheduler.CyclicLR(optimizer, base_lr=1e-3, max_lr=6e-3) 77 | 78 | try: 79 | os.mkdir("saved_models/") 80 | except: 81 | pass 82 | 83 | 84 | if ckpt: 85 | checkpoint = torch.load("saved_models/sideRight.pth") 86 | model.load_state_dict(checkpoint["model_state_dict"]) 87 | optimizer.load_state_dict(checkpoint["optimizer_state_dict"]) 88 | # epoch = checkpoint['epoch'] 89 | 90 | 91 | print("Training started") 92 | 93 | 94 | for epoch in tqdm(range(num_epochs)): 95 | train_one_epoch(model, optimizer, dl, device, epoch, print_freq=200) 96 | lr_scheduler.step() 97 | 98 | if epoch == 5 or epoch == 10 or epoch == 15 or epoch == 20 or epoch == 24: 99 | save_name = "saved_models/bdd100k_" + str(epoch) + ".pth" 100 | torch.save( 101 | {"model": model.state_dict(), "optimizer": optimizer.state_dict(),}, 102 | save_name, 103 | ) 104 | print("Saved model", save_name) 105 | -------------------------------------------------------------------------------- /transforms.py: -------------------------------------------------------------------------------- 1 | import random 2 | 3 | from 
torchvision.transforms import functional as F 4 | 5 | import torch 6 | 7 | 8 | class Compose(object): 9 | def __init__(self, transforms): 10 | self.transforms = transforms 11 | 12 | def __call__(self, image, target): 13 | for t in self.transforms: 14 | image, target = t(image, target) 15 | return image, target 16 | 17 | 18 | class RandomHorizontalFlip(object): 19 | def __init__(self, prob): 20 | self.prob = prob 21 | 22 | def __call__(self, image, target): 23 | if random.random() < self.prob: 24 | height, width = image.shape[-2:] 25 | image = image.flip(-1) 26 | bbox = target["boxes"] 27 | bbox[:, [0, 2]] = width - bbox[:, [2, 0]] 28 | target["boxes"] = bbox 29 | return image, target 30 | 31 | 32 | class ToTensor(object): 33 | def __call__(self, image, target): 34 | image = F.to_tensor(image) 35 | return image, target 36 | -------------------------------------------------------------------------------- /utils.py: -------------------------------------------------------------------------------- 1 | from __future__ import print_function 2 | 3 | import datetime 4 | import errno 5 | import os 6 | import pickle 7 | import time 8 | from collections import defaultdict, deque 9 | 10 | import torch 11 | import torch.distributed as dist 12 | 13 | cuda_avail = torch.cuda.is_available() 14 | 15 | 16 | class SmoothedValue(object): 17 | """Track a series of values and provide access to smoothed values over a 18 | window or the global series average. 19 | """ 20 | 21 | def __init__(self, window_size=20, fmt=None): 22 | if fmt is None: 23 | fmt = "{median:.4f} ({global_avg:.4f})" 24 | self.deque = deque(maxlen=window_size) 25 | self.total = 0.0 26 | self.count = 0 27 | self.fmt = fmt 28 | 29 | def update(self, value, n=1): 30 | self.deque.append(value) 31 | self.count += n 32 | self.total += value * n 33 | 34 | def synchronize_between_processes(self): 35 | """ 36 | Warning: does not synchronize the deque! 
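        Only self.count and self.total are all-reduced across processes; the windowed
        statistics (median, avg, max, value) stay local to each process.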
37 | """ 38 | if not is_dist_avail_and_initialized(): 39 | return 40 | t = torch.tensor([self.count, self.total], dtype=torch.float64, device="cuda") 41 | dist.barrier() 42 | dist.all_reduce(t) 43 | t = t.tolist() 44 | self.count = int(t[0]) 45 | self.total = t[1] 46 | 47 | @property 48 | def median(self): 49 | d = torch.tensor(list(self.deque)) 50 | return d.median().item() 51 | 52 | @property 53 | def avg(self): 54 | d = torch.tensor(list(self.deque), dtype=torch.float32) 55 | return d.mean().item() 56 | 57 | @property 58 | def global_avg(self): 59 | return self.total / self.count 60 | 61 | @property 62 | def max(self): 63 | return max(self.deque) 64 | 65 | @property 66 | def value(self): 67 | return self.deque[-1] 68 | 69 | def __str__(self): 70 | return self.fmt.format( 71 | median=self.median, 72 | avg=self.avg, 73 | global_avg=self.global_avg, 74 | max=self.max, 75 | value=self.value, 76 | ) 77 | 78 | 79 | def all_gather(data): 80 | """ 81 | Run all_gather on arbitrary picklable data (not necessarily tensors) 82 | Args: 83 | data: any picklable object 84 | Returns: 85 | list[data]: list of data gathered from each rank 86 | """ 87 | world_size = get_world_size() 88 | if world_size == 1: 89 | return [data] 90 | 91 | # serialized to a Tensor 92 | buffer = pickle.dumps(data) 93 | storage = torch.ByteStorage.from_buffer(buffer) 94 | tensor = torch.ByteTensor(storage).to("cuda") 95 | 96 | # obtain Tensor size of each rank 97 | local_size = torch.tensor([tensor.numel()], device="cuda") 98 | size_list = [torch.tensor([0], device="cuda") for _ in range(world_size)] 99 | dist.all_gather(size_list, local_size) 100 | size_list = [int(size.item()) for size in size_list] 101 | max_size = max(size_list) 102 | 103 | # receiving Tensor from all ranks 104 | # we pad the tensor because torch all_gather does not support 105 | # gathering tensors of different shapes 106 | tensor_list = [] 107 | for _ in size_list: 108 | tensor_list.append(torch.empty((max_size,), dtype=torch.uint8, device="cuda")) 109 | if local_size != max_size: 110 | padding = torch.empty( 111 | size=(max_size - local_size,), dtype=torch.uint8, device="cuda" 112 | ) 113 | tensor = torch.cat((tensor, padding), dim=0) 114 | dist.all_gather(tensor_list, tensor) 115 | 116 | data_list = [] 117 | for size, tensor in zip(size_list, tensor_list): 118 | buffer = tensor.cpu().numpy().tobytes()[:size] 119 | data_list.append(pickle.loads(buffer)) 120 | 121 | return data_list 122 | 123 | 124 | def reduce_dict(input_dict, average=True): 125 | """ 126 | Args: 127 | input_dict (dict): all the values will be reduced 128 | average (bool): whether to do average or sum 129 | Reduce the values in the dictionary from all processes so that all processes 130 | have the averaged results. Returns a dict with the same fields as 131 | input_dict, after reduction. 
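    If torch.distributed is unavailable or uninitialized (world size < 2),
    input_dict is returned unchanged.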
132 | """ 133 | world_size = get_world_size() 134 | if world_size < 2: 135 | return input_dict 136 | with torch.no_grad(): 137 | names = [] 138 | values = [] 139 | # sort the keys so that they are consistent across processes 140 | for k in sorted(input_dict.keys()): 141 | names.append(k) 142 | values.append(input_dict[k]) 143 | values = torch.stack(values, dim=0) 144 | dist.all_reduce(values) 145 | if average: 146 | values /= world_size 147 | reduced_dict = {k: v for k, v in zip(names, values)} 148 | return reduced_dict 149 | 150 | 151 | class MetricLogger(object): 152 | def __init__(self, delimiter="\t"): 153 | self.meters = defaultdict(SmoothedValue) 154 | self.delimiter = delimiter 155 | 156 | def update(self, **kwargs): 157 | for k, v in kwargs.items(): 158 | if isinstance(v, torch.Tensor): 159 | v = v.item() 160 | assert isinstance(v, (float, int)) 161 | self.meters[k].update(v) 162 | 163 | def __getattr__(self, attr): 164 | if attr in self.meters: 165 | return self.meters[attr] 166 | if attr in self.__dict__: 167 | return self.__dict__[attr] 168 | raise AttributeError( 169 | "'{}' object has no attribute '{}'".format(type(self).__name__, attr) 170 | ) 171 | 172 | def __str__(self): 173 | loss_str = [] 174 | for name, meter in self.meters.items(): 175 | loss_str.append("{}: {}".format(name, str(meter))) 176 | return self.delimiter.join(loss_str) 177 | 178 | def synchronize_between_processes(self): 179 | for meter in self.meters.values(): 180 | meter.synchronize_between_processes() 181 | 182 | def add_meter(self, name, meter): 183 | self.meters[name] = meter 184 | 185 | def log_every(self, iterable, print_freq, header=None): 186 | i = 0 187 | if not header: 188 | header = "" 189 | start_time = time.time() 190 | end = time.time() 191 | iter_time = SmoothedValue(fmt="{avg:.4f}") 192 | data_time = SmoothedValue(fmt="{avg:.4f}") 193 | space_fmt = ":" + str(len(str(len(iterable)))) + "d" 194 | log_msg = self.delimiter.join( 195 | [ 196 | header, 197 | "[{0" + space_fmt + "}/{1}]", 198 | "eta: {eta}", 199 | "{meters}", 200 | "time: {time}", 201 | "data: {data}", 202 | "max mem: {memory:.0f}", 203 | ] 204 | ) 205 | MB = 1024.0 * 1024.0 206 | for obj in iterable: 207 | data_time.update(time.time() - end) 208 | yield obj 209 | iter_time.update(time.time() - end) 210 | if i % print_freq == 0 or i == len(iterable) - 1: 211 | eta_seconds = iter_time.global_avg * (len(iterable) - i) 212 | eta_string = str(datetime.timedelta(seconds=int(eta_seconds))) 213 | print( 214 | log_msg.format( 215 | i, 216 | len(iterable), 217 | eta=eta_string, 218 | meters=str(self), 219 | time=str(iter_time), 220 | data=str(data_time), 221 | memory=None 222 | if not cuda_avail 223 | else torch.cuda.max_memory_allocated() / MB, 224 | ) 225 | ) 226 | i += 1 227 | end = time.time() 228 | total_time = time.time() - start_time 229 | total_time_str = str(datetime.timedelta(seconds=int(total_time))) 230 | print( 231 | "{} Total time: {} ({:.4f} s / it)".format( 232 | header, total_time_str, total_time / len(iterable) 233 | ) 234 | ) 235 | 236 | 237 | def collate_fn(batch): 238 | return tuple(zip(*batch)) 239 | 240 | 241 | def warmup_lr_scheduler(optimizer, warmup_iters, warmup_factor): 242 | def f(x): 243 | if x >= warmup_iters: 244 | return 1 245 | alpha = float(x) / warmup_iters 246 | return warmup_factor * (1 - alpha) + alpha 247 | 248 | return torch.optim.lr_scheduler.LambdaLR(optimizer, f) 249 | 250 | 251 | def mkdir(path): 252 | try: 253 | os.makedirs(path) 254 | except OSError as e: 255 | if e.errno != errno.EEXIST: 256 | 
raise 257 | 258 | 259 | def setup_for_distributed(is_master): 260 | """ 261 | This function disables printing when not in master process 262 | """ 263 | import builtins as __builtin__ 264 | 265 | builtin_print = __builtin__.print 266 | 267 | def print(*args, **kwargs): 268 | force = kwargs.pop("force", False) 269 | if is_master or force: 270 | builtin_print(*args, **kwargs) 271 | 272 | __builtin__.print = print 273 | 274 | 275 | def is_dist_avail_and_initialized(): 276 | if not dist.is_available(): 277 | return False 278 | if not dist.is_initialized(): 279 | return False 280 | return True 281 | 282 | 283 | def get_world_size(): 284 | if not is_dist_avail_and_initialized(): 285 | return 1 286 | return dist.get_world_size() 287 | 288 | 289 | def get_rank(): 290 | if not is_dist_avail_and_initialized(): 291 | return 0 292 | return dist.get_rank() 293 | 294 | 295 | def is_main_process(): 296 | return get_rank() == 0 297 | 298 | 299 | def save_on_master(*args, **kwargs): 300 | if is_main_process(): 301 | torch.save(*args, **kwargs) 302 | 303 | 304 | def init_distributed_mode(args): 305 | if "RANK" in os.environ and "WORLD_SIZE" in os.environ: 306 | args.rank = int(os.environ["RANK"]) 307 | args.world_size = int(os.environ["WORLD_SIZE"]) 308 | args.gpu = int(os.environ["LOCAL_RANK"]) 309 | elif "SLURM_PROCID" in os.environ: 310 | args.rank = int(os.environ["SLURM_PROCID"]) 311 | args.gpu = args.rank % torch.cuda.device_count() 312 | else: 313 | print("Not using distributed mode") 314 | args.distributed = False 315 | return 316 | 317 | args.distributed = True 318 | 319 | torch.cuda.set_device(args.gpu) 320 | args.dist_backend = "nccl" 321 | print( 322 | "| distributed init (rank {}): {}".format(args.rank, args.dist_url), flush=True 323 | ) 324 | torch.distributed.init_process_group( 325 | backend=args.dist_backend, 326 | init_method=args.dist_url, 327 | world_size=args.world_size, 328 | rank=args.rank, 329 | ) 330 | torch.distributed.barrier() 331 | setup_for_distributed(args.rank == 0) 332 | --------------------------------------------------------------------------------