├── .gitignore ├── README.md ├── __init__.py ├── apply_net.py ├── apply_net_single.py ├── configs ├── exp01.yaml ├── exp02.yaml ├── exp03.yaml ├── exp04.yaml ├── exp05.yaml └── exp06.yaml ├── evaluate_net.py ├── imaterialist ├── __init__.py ├── config.py ├── data │ ├── __init__.py │ ├── dataset_mapper.py │ ├── datasets │ │ ├── __init__.py │ │ ├── coco.py │ │ ├── make_dataset.py │ │ ├── rle_utils.py │ │ ├── rle_utils_old.py │ │ └── test_rle.py │ └── structures.py ├── evaluator.py ├── modeling │ ├── __init__.py │ ├── attributes_rcnn.py │ └── roi_heads │ │ ├── __init__.py │ │ ├── attributes_head.py │ │ └── roi_heads.py └── submission_utils │ ├── resize_longest_edge.py │ └── test_csv_write.py ├── notebooks ├── 01-EDA.ipynb ├── 02-rle_encoder_decoder.ipynb ├── 03-Create-dataset.ipynb ├── 04-Inference.ipynb ├── 05-Training_and_inference_experiments.ipynb ├── 06-Attribute-inference.ipynb ├── 07-Results.ipynb └── InteractiveLabelExplorer.ipynb ├── requirements.txt └── train_net.py /.gitignore: -------------------------------------------------------------------------------- 1 | # Byte-compiled / optimized / DLL files 2 | __pycache__/ 3 | *.py[cod] 4 | 5 | # C extensions 6 | *.so 7 | 8 | # Distribution / packaging 9 | .Python 10 | env/ 11 | build/ 12 | develop-eggs/ 13 | dist/ 14 | downloads/ 15 | eggs/ 16 | .eggs/ 17 | lib/ 18 | lib64/ 19 | parts/ 20 | sdist/ 21 | var/ 22 | *.egg-info/ 23 | .installed.cfg 24 | *.egg 25 | 26 | # PyInstaller 27 | # Usually these files are written by a python script from a template 28 | # before PyInstaller builds the exe, so as to inject date/other infos into it. 29 | *.manifest 30 | *.spec 31 | results_bengali/ 32 | # Installer logs 33 | pip-log.txt 34 | pip-delete-this-directory.txt 35 | 36 | # Unit test / coverage reports 37 | htmlcov/ 38 | .tox/ 39 | .coverage 40 | .coverage.* 41 | .cache 42 | nosetests.xml 43 | coverage.xml 44 | *.cover 45 | 46 | # Translations 47 | *.mo 48 | *.pot 49 | 50 | # Django stuff: 51 | *.log 52 | 53 | # Sphinx documentation 54 | docs/_build/ 55 | 56 | # PyBuilder 57 | target/ 58 | 59 | # DotEnv configuration 60 | .env_bengali 61 | 62 | # Database 63 | *.db 64 | *.rdb 65 | 66 | # Pycharm 67 | .idea 68 | 69 | # VS Code 70 | .vscode/ 71 | 72 | # Spyder 73 | .spyproject/ 74 | 75 | # Jupyter NB Checkpoints 76 | .ipynb_checkpoints/ 77 | */.ipynb_checkpoints/ 78 | 79 | # exclude data from source control by default 80 | /home 81 | /data 82 | /data_* 83 | /results_* 84 | /notebooks_* 85 | /output 86 | /iMaterialist2020/configs/* 87 | 88 | # Mac OS-specific storage files 89 | .DS_Store 90 | 91 | # vim 92 | *.swp 93 | *.swo 94 | 95 | # Mypy cache 96 | .mypy_cache/ 97 | 98 | .idea/ 99 | __pycache__ 100 | configs/eai_server_paths.yaml 101 | 102 | # inbox folder for experiment runs should stay local 103 | configs/inbox 104 | 105 | 106 | configs/Archive/ 107 | 108 | models/mobilenet_v2-b0353104.pth 109 | 110 | /depends/depends.zip 111 | /depends/dill.pkl 112 | /depends/wheelhouse 113 | submission.csv 114 | /depends/wheelhouse/ 115 | .env* 116 | .flake8 117 | =2.0.1 118 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # iMaterialist 2020 Kaggle Competition in Detectron2 2 | 3 | In this competition we are tasked to do instance segmentation as well as attribute localization (recognize one or multiple attributes for the instances) on a fashion and apparel dataset. 
[Here is the link to the competition](https://www.kaggle.com/c/imaterialist-fashion-2020-fgvc7/overview). 4 | 5 |

6 | 7 | ## Model and Training 8 | 9 | To solve the challenging problems entailed in this task we use and extend Detectron2’s MaskRCNN architecture and added a new attribute head as shown in orange below. 10 | 11 |
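The head itself is implemented in `imaterialist/modeling/roi_heads/attributes_head.py` (see also `roi_heads.py` and `attributes_rcnn.py`). As a rough illustration of the idea only (names and signatures below are ours, not the repo's API), it boils down to a small multi-label classifier over the 295 attribute slots, applied to the pooled per-instance box features and trained with binary cross-entropy against the multi-hot ground-truth attributes:

```python
# Illustrative sketch, not the actual implementation.
import torch.nn as nn
import torch.nn.functional as F


class AttributeHead(nn.Module):
    """Multi-label attribute branch over pooled per-instance box features."""

    def __init__(self, in_features: int, num_attributes: int = 295):
        super().__init__()
        self.predictor = nn.Linear(in_features, num_attributes)

    def forward(self, box_features, gt_attributes=None):
        logits = self.predictor(box_features)  # (num_instances, num_attributes)
        if self.training:
            # Attributes are multi-hot vectors, so a per-attribute binary
            # cross-entropy is the natural loss.
            return {"loss_attributes": F.binary_cross_entropy_with_logits(logits, gt_attributes)}
        # At inference the per-attribute scores are thresholded downstream (e.g. > 0.5).
        return logits.sigmoid()
```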

12 | 13 | - In prior steps in the MaskRCNN architecture we leverage a ResNet-50 with a feature pyramid network (FPN) as backbone. 14 | - The input image is resized to 1300 of the longer edge to feed the network. 15 | - Random horizontal flipping was applied during the training. 16 | - The model was trained on top of pre-trained COCO dataset weights for 300,000 iterations. 17 | 18 | ## Kaggle Submission 19 | 20 | The submission to Kaggle required specific encoding (run length encoding - RLE) for all the predicted masks in order to reduce the size of the submitted file. This posed a number of challenges since RLE is not standardized amongst COCO, Detectron2 and Kaggle. Also, Kaggle required that each pixel of the masks do not overlap, so mask refining was required. 21 | 22 | ## Evaluation 23 | 24 | Submissions are evaluated on the mean average precision at two different thresholds. 25 | 26 | 1. IoU: intersection over union (IoU) thresholds. The IoU of a proposed set of object pixels and a set of true object pixels is calculated as: 27 | 28 |
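`IoU(A, B) = |A ∩ B| / |A ∪ B|`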

29 | 30 | 2. F1: f1 score between a set of predicted attributes and a set of true attributes of one segmentation mask 31 | 32 | The metric sweeps over a range of IoU thresholds and F1 thresholds, at each point calculating an average precision value. The threshold values range from 0.5 to 0.95 with a step size of 0.05: (0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.8, 0.85, 0.9, 0.95). In other words, at an IoU threshold of 0.5 and an F1 threshold of 0.5, a predicted object is considered a "hit" if it satisfies the following conditions: 33 | 34 | 1. Its intersection over union with a ground truth object is greater than 0.5 35 | 2. If the ground truth object has attributes, the f1 scores of predicted attributes and ground-truth attributes is greater than 0.5. 36 | At each threshold pair, t=(ti, tf), a precision value is calculated based on the number of true positives (TP), false negatives (FN), and false positives (FP) resulting from comparing the predicted object to all ground truth objects: 37 | 38 |
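`Precision(t) = TP(t) / (TP(t) + FP(t) + FN(t))`

In code, the per-prediction "hit" test described above is roughly the following (illustrative sketch, not the competition's scoring code):

```python
def is_hit(iou: float, attr_f1: float, gt_has_attributes: bool, t_iou: float, t_f1: float) -> bool:
    # Condition 1: the mask IoU must clear the IoU threshold.
    if iou <= t_iou:
        return False
    # Condition 2 only applies when the ground-truth instance has attributes.
    return (not gt_has_attributes) or attr_f1 > t_f1
```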

39 | 40 | ## Category and Attributes Analysis 41 | 42 | There are 46 apparel categories and 294 attributes presented in the Fashionpedia dataset. On average, each image was annotated with 7.3 instances, 5.4 categories, and 16.7 attributes. Of all the masks with categories and attributes, each mask has 3.7 attributes on average (max 14 attributes). 43 | 44 | ## Docker 45 | 46 | A Docker image is available at https://hub.docker.com/r/cvnnig/detectron2. 47 | 48 | ## WIP 49 | 50 | This repo is still being cleaned and organized. 51 | 52 | ## Authors 53 | 54 | Julien Beaulieu, Yang Ding 55 | -------------------------------------------------------------------------------- /__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Julienbeaulieu/iMaterialist2020-Image-Segmentation-on-Detectron2/8d96069bb021dd374fb8de310bb454d351971974/__init__.py -------------------------------------------------------------------------------- /apply_net.py: -------------------------------------------------------------------------------- 1 | 2 | """ 3 | Run model inference on a set of image. 4 | 5 | Files will be generated one by one to save memory 6 | 7 | Predicted masks are encoded into RLE format to save memory as well. 8 | 9 | Predictions can be visualized and will also be saved to csv. 10 | 11 | """ 12 | 13 | import logging 14 | import csv 15 | import torch 16 | import os 17 | import pickle 18 | import pandas as pd 19 | import numpy as np 20 | from datetime import datetime 21 | from environs import Env 22 | from pathlib import Path 23 | from matplotlib.pyplot import imsave 24 | from typing import Any, Dict, List 25 | 26 | from detectron2.data.detection_utils import read_image 27 | from detectron2.engine import default_argument_parser, launch 28 | from detectron2.structures.instances import Instances 29 | from detectron2.utils.visualizer import ColorMode, Visualizer 30 | 31 | from imaterialist.data.datasets.coco import register_datadict, MetadataCatalog 32 | from imaterialist.config import setup_prediction 33 | from imaterialist.evaluator import iMatPredictor 34 | from imaterialist.data.datasets.rle_utils_old import mask_to_KaggleRLE 35 | from imaterialist.data.datasets.rle_utils import mask_to_KaggleRLE_downscale 36 | 37 | 38 | LOGGER_NAME = "apply_net" 39 | logger = logging.getLogger(LOGGER_NAME) 40 | 41 | env = Env() 42 | env.read_env() 43 | 44 | path_data_interim = Path(env("path_interim")) 45 | path_test_data = Path(env("path_test")) 46 | path_output = Path(env("path_output")) 47 | 48 | class FileGen: 49 | ''' 50 | Class that lazily builds a list of file_paths from a directory. 51 | This is done by returning a generator using a generator expression. 52 | Helps not run into memory issues 53 | ''' 54 | 55 | def __init__(self, file_path): 56 | self.file_path = file_path 57 | 58 | def __iter__(self): 59 | return (os.path.join(self.file_path, fname) 60 | for fname in os.listdir(self.file_path) 61 | if os.path.isfile(os.path.join(self.file_path, fname))) 62 | 63 | 64 | def execute_on_outputs(entry: Dict[str, Any], outputs: Instances) -> List[dict]: 65 | """ 66 | Parse instance from prediction to return a dict of the easier to read attributes. 
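Each dict in the returned list describes one predicted instance and carries the keys ImageId, EncodedPixels (space-separated Kaggle RLE), ClassId and AttributesIds (comma-separated, sorted attribute ids).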
67 | :param entry: 68 | :param outputs: 69 | :return: 70 | """ 71 | 72 | image_fpath = entry["file_name"] 73 | logger.info(f"Processing {image_fpath}") 74 | 75 | # Get predicted classes from outputs 76 | pred_classes = np.array(outputs.pred_classes.cpu().tolist()) 77 | 78 | # Get attribute scores from outputs 79 | attr_scores = np.array(outputs.attr_scores.cpu()) 80 | 81 | # Keep only attributes with a score > 0.5 82 | attr_filter = attr_scores > 0.5 83 | 84 | # Get the index of attributes where the score is > 0. Each item in the list 85 | # corresponds to the predicted attributes for one instance 86 | # attr_filtered = [np.array(torch.where(attr_filter[i])[0].to("cpu")) for i in range(len(attr_filter))] 87 | attr_filtered = [np.where(attr_filter[i])[0] for i in range(len(attr_filter))] 88 | 89 | 90 | # Get masks from outputs 91 | has_mask = outputs.has("pred_masks") 92 | if has_mask: 93 | # use RLE to encode the masks, because they are too large and takes memory 94 | # since this evaluator stores outputs of the entire dataset 95 | 96 | # Old Non-Union RLE 97 | # rles = [ 98 | # mask_to_KaggleRLE(mask) for mask in outputs.pred_masks.cpu() 99 | # ] 100 | 101 | # New Union RLE 102 | rles = refine_masks(outputs.pred_masks.cpu()) 103 | 104 | 105 | ############################## 106 | # Uncomment following code to encode the masks to compressed RLE 107 | # instead of uncompressed RLE like above: 108 | ############################## 109 | 110 | # use RLE to encode the masks, because they are too large and takes memory 111 | # since this evaluator stores outputs of the entire dataset 112 | # rles = [ 113 | # mask_util.encode(np.array(mask[:, :, None], order="F", dtype="uint8"))[0] 114 | # for mask in outputs.pred_masks.cpu() 115 | # ] 116 | # for rle in rles: 117 | # # "counts" is an array encoded by mask_util as a byte-stream. Python3's 118 | # # json writer which always produces strings cannot serialize a bytestream 119 | # # unless you decode it. Thankfully, utf-8 works out (which is also what 120 | # # the pycocotools/_mask.pyx does). 121 | # rle["counts"] = rle["counts"].decode("utf-8") 122 | results = [] 123 | 124 | # Cycle each instance of an image 125 | for k in range(len(outputs)): 126 | # Attribute 294 is the category for empty attributes so we make the tensor empty 127 | # if it contains 294 128 | if 294 in attr_filtered[k]: 129 | # Must be sent to CPU 130 | attr_filtered[k] = np.array(torch.tensor([], device='cuda:0').cpu()) 131 | 132 | 133 | # per Kaggle requirement. 134 | attributes_sorted = get_attribute_ids(list(attr_filtered[k])) 135 | # Get image ID from full path string 136 | image_id = Path(image_fpath).stem 137 | class_id = str(pred_classes[k]) 138 | if has_mask: 139 | result = {"ImageId": image_id, 140 | "EncodedPixels": rles[k], 141 | "ClassId": class_id, 142 | "AttributesIds": attributes_sorted, # attribute IDs must be comma separated and sorted. 143 | } 144 | # Encoded Pixels MUST be SPACE separated 145 | else: 146 | result = {"ImageId": image_id, 147 | "ClassId": str(pred_classes[k]), 148 | "AttributesIds": attributes_sorted, # attribute IDs must be comma separated and sorted. 149 | } 150 | results.append(result) 151 | return results 152 | 153 | def get_attribute_ids(att_ids: List[int]): 154 | """ 155 | Get concatenated AttributesIds 156 | Args: 157 | att_ids: [int], list of apparel attributes 158 | Returns: 159 | att_ids: string, e.g. 
"2,10,55,91" 160 | """ 161 | # Source: https://www.kaggle.com/c/imaterialist-fashion-2020-fgvc7/overview/evaluation 162 | att_ids.sort() # need to be sorted before concatenation 163 | return ','.join([str(a) for a in att_ids]) 164 | 165 | def export_results(result: Dict[str, Any]): 166 | """ 167 | Take the results and write them out into CSV and PKL for upload and future review 168 | :param result: 169 | :return: 170 | """ 171 | # Get and create the folder path name to ensure the subsequent write operation will be successful. 172 | out_fname = result["out_fname"] 173 | out_dir = os.path.dirname(out_fname) 174 | if len(out_dir) > 0 and not os.path.exists(out_dir): 175 | os.makedirs(out_dir) 176 | 177 | # Write out pickle 178 | path_pickle = path_output / f"{out_fname}.pkl" 179 | pickle_write(result, path_pickle) 180 | 181 | # Write out CSV 182 | path_csv = path_output / f"{out_fname}.csv" 183 | # Filter out the blank rows where no encoded pixel is given but class prediction is given... 184 | filter_csv_write(result["results"], path_csv) 185 | 186 | 187 | def pickle_write(result, path_pickle): 188 | with open(path_pickle, "wb") as pickle_file: 189 | pickle.dump(result["results"], pickle_file) 190 | logger.info(f"Output saved to {path_pickle}") 191 | 192 | 193 | def main(args, visualize=True): 194 | # datadic_train = pd.read_feather(path_data_interim / 'imaterialist_train_multihot_n=4000.feather') 195 | # datadic_val = pd.read_feather(path_data_interim / 'imterailist_val_multihot_n=1000.feather') 196 | 197 | # register_datadict(datadic_train, "sample_fashion_train") 198 | # register_datadict(datadic_val, "sample_fashion_test") 199 | 200 | # This small set of data just to provide label./home/nasty/imaterialis 201 | datadic_test = pd.read_feather(path_data_interim / 'imateralist_val_multihot_n=1000.feather') 202 | register_datadict(datadic_test, "sample_fashion_test") 203 | fashion_metadata = get_fashion_metadata() 204 | 205 | # This update the prediction weight and output path automatically. 206 | cfg = setup_prediction(args) 207 | 208 | # cfg must have (cfg.DATASETS.TEST[0]) 209 | 210 | # Generate the predictor 211 | predictor = iMatPredictor(cfg) 212 | 213 | # Create a list of image files 214 | # Loop through all data and generate. 
215 | file_list = FileGen(path_test_data) 216 | 217 | # Dictionary where we'll append the results of all images and instances 218 | all_results = {"results": [], "out_fname": 'result_file'} 219 | 220 | for file_name in file_list: 221 | img = read_image(file_name, format="BGR") # predictor takes BGR format 222 | with torch.no_grad(): 223 | 224 | if visualize: 225 | #========================================== 226 | # Call the visualizer, label and save data 227 | # ========================================= 228 | 229 | #show_predicted_image(file_name, predictor, fashion_metadata) 230 | 231 | v = Visualizer(img[:, :, ::-1], 232 | metadata=fashion_metadata, 233 | scale=0.8, 234 | instance_mode=ColorMode.IMAGE_BW # remove the colors of unsegmented pixels 235 | ) 236 | 237 | # Get outputs of model in eval mode 238 | outputs = predictor(img)["instances"] 239 | v = v.draw_instance_predictions(outputs.to("cpu")) 240 | time_stamp = datetime.now().isoformat().replace(":", "") 241 | name = Path(file_name).stem 242 | imsave(f"{path_output}/{name}_{time_stamp}.png", v.get_image()[:, :, ::-1]) 243 | 244 | # Get results for image 245 | result = execute_on_outputs({"file_name": file_name, "image": img}, outputs) 246 | 247 | all_results["results"].append(result) 248 | 249 | # Dump all results to output path 250 | # Pkl and CSV 251 | export_results(all_results) 252 | print("Example prediction result: ") 253 | print(all_results["results"][0]) # verification 254 | 255 | 256 | def filter_csv_write(list_list_dict: List[List[dict]], path_csv): 257 | """ 258 | Write the list of csv predictions into CSV but omit the rows where EncodedPixels are empty 259 | :param list_dict: 260 | :param path_csv: 261 | :return: 262 | """ 263 | # Flatten the two list. 264 | # Feturn item if they the encoded pixel is not flat. 265 | flat_list = [] 266 | # Iterate through image list. 267 | for sublist in list_list_dict: 268 | # Iterate through mask list 269 | for item in sublist: 270 | # If the EncodedPixel is empty, skip. 271 | # if item["EncodedPixels"] == "": 272 | # continue 273 | # else: 274 | flat_list.append(item) 275 | 276 | # With blanks. 277 | # flat_list = [item for sublist in list_list_dict for item in sublist] 278 | 279 | # Source: https://stackoverflow.com/questions/3086973/how-do-i-convert-this-list-of-dictionaries-to-a-csv-file 280 | keys = flat_list[0].keys() 281 | with open(path_csv, 'w') as output_file: 282 | # quote char prevent dict_writer to quote string that contain separtor: , 283 | # The attributes are separated by COMMA, and must be quoted, by using space, 284 | dict_writer = csv.DictWriter(output_file, keys) 285 | dict_writer.writeheader() 286 | dict_writer.writerows(flat_list) 287 | 288 | 289 | def csv_write(list_list_dict: List[List[dict]], path_csv): 290 | """ 291 | Write the list of csv predictions into CSV. 292 | :param list_dict: 293 | :param path_csv: 294 | :return: 295 | """ 296 | # Flatten the two list. 
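# (outer list: one entry per image; inner list: one dict per predicted instance)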
297 | flat_list = [item for sublist in list_list_dict for item in sublist] 298 | 299 | 300 | # Source: https://stackoverflow.com/questions/3086973/how-do-i-convert-this-list-of-dictionaries-to-a-csv-file 301 | keys = flat_list[0].keys() 302 | with open(path_csv, 'w') as output_file: 303 | # quote char prevent dict_writer to quote string that contain separtor: , 304 | # The attributes are separated by COMMA, and must be quoted, by using space, 305 | dict_writer = csv.DictWriter(output_file, keys) 306 | dict_writer.writeheader() 307 | dict_writer.writerows(flat_list) 308 | 309 | #path_submission = Path(path_csv).parent / "submission.csv" 310 | #reader = csv.reader(open(path_csv, "r"), skipinitialspace=True) 311 | #writer = csv.writer(open(path_submission, "w"), quoting=csv.QUOTE_NONE) 312 | #writer.writerows(reader) 313 | 314 | 315 | def get_fashion_metadata(): 316 | # datadic_val = pd.read_feather(path_data_interim / 'imaterailist_test_multihot_n=100.feather') 317 | # register_datadict(datadic_val, "sample_fashion_test") 318 | fashion_metadata = MetadataCatalog.get("sample_fashion_test") 319 | return fashion_metadata 320 | 321 | 322 | if __name__ == '__main__': 323 | args = default_argument_parser().parse_args() 324 | args.eval_only = True 325 | args.config_file = "/home/nasty/imaterialist2020/iMaterialist2020/configs/exp05.yaml" 326 | print("Command Line Args:", args) 327 | launch( 328 | main, 329 | args.num_gpus, 330 | num_machines=args.num_machines, 331 | machine_rank=args.machine_rank, 332 | dist_url=args.dist_url, 333 | args=(args,), 334 | ) 335 | -------------------------------------------------------------------------------- /apply_net_single.py: -------------------------------------------------------------------------------- 1 | """ 2 | Runs inference on one image 3 | """ 4 | 5 | import cv2 6 | import pickle 7 | import os 8 | from pathlib import Path 9 | from environs import Env 10 | from datetime import datetime 11 | from matplotlib.pyplot import imshow, imsave 12 | 13 | from apply_net import get_fashion_metadata 14 | from imaterialist.data.datasets.coco import register_datadict 15 | from detectron2.utils.visualizer import Visualizer, ColorMode 16 | 17 | from imaterialist.config import initialize_imaterialist_config, update_weights_outpath 18 | from imaterialist.evaluator import iMatPredictor 19 | 20 | env = Env() 21 | env.read_env() 22 | 23 | path_output = Path(env("path_output_images")) 24 | path_data_interim = Path(env("path_interim")) 25 | path_images_local = Path(env("path_images_local")) 26 | 27 | def predicted_image_datadict(datadic_test, predictor, fashion_metadata): 28 | """ 29 | Show 3 predicted images from the Fashion Dict (make sure it is the test set!) 30 | :param dict_test: 31 | :return: 32 | """ 33 | import random 34 | from datetime import datetime 35 | from imaterialist.data.datasets.make_dataset import load_category_attributes 36 | seed = random.randint(0, 99999999) 37 | 38 | # Randomly Grab 9 samples, iterate through rows of them, convert to list of tuple. 
: 39 | list_tuple = list(datadic_test.sample(n=50, random_state=seed).iterrows()) 40 | _, list_datadic = zip(*list_tuple) 41 | 42 | for i, d in enumerate(list_datadic): 43 | time_stamp = datetime.now().isoformat().replace(":", "") 44 | 45 | im = cv2.imread(d["ImageId"]) 46 | im = cv2.cvtColor(im, cv2.COLOR_BGR2RGB) 47 | 48 | # Run through predictor 49 | outputs = predictor(im) 50 | 51 | # Visualize 52 | v = Visualizer(im[:, :, ::-1], 53 | metadata=fashion_metadata, 54 | scale=0.8, 55 | instance_mode=ColorMode.IMAGE_BW # remove the colors of unsegmented pixels 56 | ) 57 | # Bring the data back to CPU before passing to Numpy to draw 58 | v = v.draw_instance_predictions(outputs["instances"].to("cpu")) 59 | imshow(v.get_image()[:, :, ::-1]) 60 | 61 | imsave(f"{path_output}/{time_stamp}.png", v.get_image()[:, :, ::-1]) 62 | 63 | 64 | def predicted_image_show(path_image_file, predictor, fashion_metadata): 65 | """ 66 | Visualize and save an image predicted using the given predictor and labelled with the fashion data. 67 | 68 | :param dict_test: 69 | :return: 70 | """ 71 | time_stamp = datetime.now().isoformat().replace(":", "") 72 | 73 | for path in os.listdir(path_image_file): 74 | 75 | full_path = os.path.join(path_image_file, path) 76 | print(full_path) 77 | 78 | 79 | im = cv2.imread(full_path) 80 | im = cv2.cvtColor(im, cv2.COLOR_BGR2RGB) 81 | 82 | # Run through predictor 83 | outputs = predictor(im) 84 | 85 | # Visualize 86 | v = Visualizer(im[:, :, ::-1], 87 | metadata=fashion_metadata, 88 | scale=0.8, 89 | instance_mode=ColorMode.IMAGE_BW # remove the colors of unsegmented pixels 90 | ) 91 | # Bring the data back to CPU before passing to Numpy to draw 92 | v = v.draw_instance_predictions(outputs["instances"].to("cpu")) 93 | imshow(v.get_image()[:, :, ::-1]) 94 | 95 | imsave(f"{path_output}/{path}.png", v.get_image()[:, :, ::-1]) 96 | 97 | 98 | def load_model_predict_image(path_image=path_images_local): 99 | # cfg = setup(args) 100 | cfg = initialize_imaterialist_config() 101 | 102 | # Merge from TRAINED config file. 103 | cfg.merge_from_file("/home/julien/data-science/kaggle/imaterialist/configs/exp06.yaml") 104 | update_weights_outpath(cfg, "/home/julien/data-science/kaggle/imaterialist/output/exp03/model_0109999.pth") 105 | 106 | # Set max input size 107 | cfg.INPUT.MAX_SIZE_TEST = 1024 108 | 109 | # Generate Predictor 110 | predictor = iMatPredictor(cfg) 111 | 112 | datadict_val = pickle.load(open(path_data_interim / 'imaterialist_test_multihot_n=100.p', 'rb')) 113 | register_datadict(datadict_val, "sample_fashion_test") 114 | 115 | # This small set of data just to provide label. 116 | fashion_metadata = get_fashion_metadata() 117 | 118 | # Call the visualizer. 
119 | predicted_image_show(path_image, predictor, fashion_metadata) 120 | 121 | if __name__ == '__main__': 122 | load_model_predict_image() 123 | 124 | -------------------------------------------------------------------------------- /configs/exp01.yaml: -------------------------------------------------------------------------------- 1 | DATALOADER: 2 | NUM_WORKERS: 1 3 | MODEL: 4 | ROI_HEADS: 5 | NAME: "StandardROIHeads" 6 | SCORE_THRESH_TEST: 0.3 7 | DATASETS: 8 | TRAIN: ("sample_fashion_train", ) 9 | TEST: ("sample_fashion_test",) 10 | SOLVER: 11 | MAX_ITER: 1000 12 | OUTPUT_DIR: '/home/julien/data-science/kaggle/imaterialist/output/exp01' 13 | 14 | 15 | -------------------------------------------------------------------------------- /configs/exp02.yaml: -------------------------------------------------------------------------------- 1 | 2 | MODEL: 3 | ROI_HEADS: 4 | NAME: "StandardROIHeads" 5 | SCORE_THRESH_TEST: 0.5 6 | DATASETS: 7 | TRAIN: ("sample_fashion_train", ) 8 | TEST: ("sample_fashion_test",) 9 | SOLVER: 10 | MAX_ITER: 50000 11 | IMS_PER_BATCH: 4 12 | OUTPUT_DIR: '/home/julien/data-science/kaggle/imaterialist/output/exp02' -------------------------------------------------------------------------------- /configs/exp03.yaml: -------------------------------------------------------------------------------- 1 | 2 | MODEL: 3 | ROI_BOX_HEAD: 4 | NUM_FC: 2 5 | ROI_HEADS: 6 | NAME: "StandardROIHeads" 7 | SCORE_THRESH_TEST: 0.5 8 | DATASETS: 9 | TRAIN: ("sample_fashion_train", ) 10 | TEST: ("sample_fashion_test",) 11 | SOLVER: 12 | MAX_ITER: 120000 13 | IMS_PER_BATCH: 8 14 | BASE_LR: 0.0004 15 | TEST: 16 | DETECTIONS_PER_IMAGE: 30 17 | OUTPUT_DIR: '/home/julien/data-science/kaggle/imaterialist/output/exp03' 18 | 19 | -------------------------------------------------------------------------------- /configs/exp04.yaml: -------------------------------------------------------------------------------- 1 | DATALOADER: 2 | NUM_WORKERS: 1 3 | MODEL: 4 | ROI_BOX_HEAD: 5 | NUM_FC: 2 6 | ROI_HEADS: 7 | NAME: "StandardROIHeads" 8 | SCORE_THRESH_TEST: 0.5 9 | DATASETS: 10 | TRAIN: ("sample_fashion_train", ) 11 | TEST: ("sample_fashion_test",) 12 | SOLVER: 13 | MAX_ITER: 120000 14 | IMS_PER_BATCH: 2 15 | BASE_LR: 0.0004 16 | OUTPUT_DIR: '/home/julien/data-science/kaggle/imaterialist/output/exp04' 17 | -------------------------------------------------------------------------------- /configs/exp05.yaml: -------------------------------------------------------------------------------- 1 | 2 | INPUT: 3 | MAX_SIZE_TEST: 1024 4 | MODEL: 5 | ROI_BOX_HEAD: 6 | NUM_FC: 2 7 | ROI_HEADS: 8 | NAME: "StandardROIHeads" 9 | SCORE_THRESH_TEST: 0.5 10 | DATASETS: 11 | TRAIN: ("sample_fashion_train", ) 12 | TEST: ("sample_fashion_test",) 13 | SOLVER: 14 | MAX_ITER: 120000 15 | IMS_PER_BATCH: 8 16 | BASE_LR: 0.0004 17 | OUTPUT_DIR: '/home/julien/data-science/kaggle/imaterialist/output/exp05' 18 | -------------------------------------------------------------------------------- /configs/exp06.yaml: -------------------------------------------------------------------------------- 1 | 2 | INPUT: 3 | MAX_SIZE_TEST: 1024 4 | MODEL: 5 | ROI_BOX_HEAD: 6 | NUM_FC: 2 7 | ROI_HEADS: 8 | NAME: "StandardROIHeads" 9 | SCORE_THRESH_TEST: 0.5 10 | DATASETS: 11 | TRAIN: ("sample_fashion_train", ) 12 | TEST: ("sample_fashion_test",) 13 | SOLVER: 14 | MAX_ITER: 500 15 | IMS_PER_BATCH: 4 16 | BASE_LR: 0.004 17 | OUTPUT_DIR: '/home/julien/data-science/kaggle/imaterialist/output/images' 18 | 
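# Note: each configs/exp*.yaml only lists overrides. At run time it is merged on
# top of the base config built by add_imaterialist_config() in
# imaterialist/config.py, which starts from the COCO mask_rcnn_R_50_FPN_3x
# model-zoo config and sets ROI_HEADS.NUM_CLASSES=46 and ROI_HEADS.NUM_ATTRIBUTES=295.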
-------------------------------------------------------------------------------- /evaluate_net.py: -------------------------------------------------------------------------------- 1 | """ 2 | Run the coco evaluator on "sample_fashion_test" to evaluate performance using 3 | AP metric 4 | 5 | TODO: add command line interface for all dataset and model weight inputs instead of 6 | having them hard coded. 7 | """ 8 | 9 | 10 | from pathlib import Path 11 | 12 | from detectron2.engine import DefaultTrainer 13 | from detectron2.config import get_cfg 14 | from detectron2.evaluation import COCOEvaluator, inference_on_dataset 15 | from detectron2.data import build_detection_test_loader 16 | 17 | from iMaterialist2020.imaterialist.config import add_imaterialist_config 18 | from iMaterialist2020.imaterialist.data.datasets.coco import register_datadict 19 | from iMaterialist2020.imaterialist.data.dataset_mapper import iMatDatasetMapper 20 | from environs import Env 21 | 22 | env = Env() 23 | env.read_env() 24 | 25 | # Get training dataframe 26 | path_data = Path(env("path_raw")) 27 | path_image = path_data / "train/" 28 | path_output = Path(env("path_output")) 29 | path_eval = Path(env("path_eval")) 30 | path_data_interim = Path(env("path_interim")) 31 | path_model = Path(env("path_model")) 32 | 33 | if __name__=="__main__": 34 | # load dataframe 35 | # fixme: this number needs to update or dynamic 36 | datadic_train = pd.read_feather(path_data_interim / 'imaterialist_train_multihot_n=266721.feather') 37 | datadic_test = pd.read_feather(path_data_interim / 'imaterailist_test_multihot_n=66680.feather') 38 | 39 | register_datadict(datadic_train, "sample_fashion_train") 40 | register_datadict(datadic_test, "sample_fashion_test") 41 | 42 | # cfg = setup(args) 43 | cfg = get_cfg() 44 | 45 | # Add Solver etc. 46 | add_imaterialist_config(cfg) 47 | 48 | # Merge from config file. 49 | config_file = "/home/dyt811/Git/cvnnig/iMaterialist2020/configs/config.yaml" 50 | cfg.merge_from_file(config_file) 51 | 52 | # Load the final weight. 
53 | cfg.MODEL.WEIGHTS = str(path_model / "model_0109999.pth") 54 | cfg.OUTPUT_DIR = str(path_output) 55 | 56 | trainer = DefaultTrainer(cfg) 57 | 58 | # load weights 59 | trainer.resume_or_load(resume=False) 60 | 61 | # Evaluate performance using AP metric implemented in COCO API 62 | evaluator = COCOEvaluator("sample_fashion_test", cfg, False, output_dir=str(path_output)) 63 | val_loader = build_detection_test_loader(cfg, "sample_fashion_test", mapper=iMatDatasetMapper(cfg)) 64 | inference_on_dataset(trainer.model, val_loader, evaluator) -------------------------------------------------------------------------------- /imaterialist/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Julienbeaulieu/iMaterialist2020-Image-Segmentation-on-Detectron2/8d96069bb021dd374fb8de310bb454d351971974/imaterialist/__init__.py -------------------------------------------------------------------------------- /imaterialist/config.py: -------------------------------------------------------------------------------- 1 | from detectron2.config import CfgNode as CN 2 | from detectron2 import model_zoo 3 | from detectron2.config import get_cfg 4 | from pathlib import Path 5 | from environs import Env 6 | from detectron2.engine import default_setup 7 | import detectron2.utils.comm as comm 8 | from detectron2.utils.logger import setup_logger 9 | import os 10 | 11 | env = Env() 12 | env.read_env() 13 | 14 | def add_imaterialist_config(cfg: CN): 15 | """ 16 | Add config for imaterialist2 head 17 | """ 18 | 19 | _C = cfg 20 | 21 | _C.merge_from_file(model_zoo.get_config_file( 22 | "COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")) 23 | _C.MODEL.WEIGHTS = model_zoo.get_checkpoint_url( 24 | "COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml") 25 | 26 | ##### Input ##### 27 | # Set a smaller image size than default to avoid memory problems 28 | 29 | # Size of the smallest side of the image during training 30 | # _C.INPUT.MIN_SIZE_TRAIN = (400,) 31 | # # Maximum size of the side of the image during training 32 | # _C.INPUT.MAX_SIZE_TRAIN = 600 33 | 34 | # # Size of the smallest side of the image during testing. Set to zero to disable resize in testing. 35 | # _C.INPUT.MIN_SIZE_TEST = 400 36 | # # Maximum size of the side of the image during testing 37 | # _C.INPUT.MAX_SIZE_TEST = 600 38 | 39 | _C.SOLVER.IMS_PER_BATCH = 2 40 | _C.SOLVER.BASE_LR = 0.0004 41 | _C.SOLVER.MAX_ITER = 50000 42 | _C.MODEL.ROI_HEADS.BATCH_SIZE_PER_IMAGE = 512 # default: 512 43 | _C.MODEL.ROI_HEADS.NUM_CLASSES = 46 # 46 classes in iMaterialist 44 | _C.MODEL.ROI_HEADS.NUM_ATTRIBUTES = 295 45 | # this should ALWAYS be left at 1 because it will double or more memory usage if higher. 46 | _C.DATALOADER.NUM_WORKERS = 1 47 | 48 | def initialize_imaterialist_config(): 49 | """ 50 | Cannot directly merge until intialize the imaterialist config properly in the first place. 51 | :return: 52 | """ 53 | cfg = get_cfg() 54 | add_imaterialist_config(cfg) 55 | return cfg 56 | 57 | def setup_prediction(args): 58 | """ 59 | Setup up the cfg per the prediction requirement. 60 | Will use weight specified in the environmental variable. 
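The weights path is read from the `path_trained_weights` environment variable and the output directory from `path_output` (see update_weights_outpath below).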
61 | :param args: 62 | :return: 63 | """ 64 | cfg = initialize_imaterialist_config() 65 | 66 | # Merge from pretrained or opts 67 | cfg.merge_from_file(args.config_file) 68 | cfg.merge_from_list(args.opts) 69 | 70 | cfg = update_weights_outpath(cfg, env("path_trained_weights")) 71 | # cfg must have (cfg.DATASETS.TEST[0]) 72 | 73 | # Set max input size 74 | #cfg.INPUT.MAX_SIZE_TEST = 1024 75 | 76 | cfg.freeze() 77 | default_setup(cfg, args) 78 | # Setup logger for "imaterialist" module 79 | setup_logger(output=cfg.OUTPUT_DIR, distributed_rank=comm.get_rank(), name="imaterialist") 80 | return cfg 81 | 82 | def update_weights_outpath(cfg, weights_path): 83 | """ 84 | Update these two attributes using environmental variable because the CFG past along was hard coded. 85 | :param cfg: 86 | :param weights_path: 87 | :return: 88 | """ 89 | # Add the trained weights 90 | cfg.MODEL.WEIGHTS = weights_path 91 | cfg.OUTPUT_DIR = env("path_output") 92 | 93 | return cfg 94 | -------------------------------------------------------------------------------- /imaterialist/data/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Julienbeaulieu/iMaterialist2020-Image-Segmentation-on-Detectron2/8d96069bb021dd374fb8de310bb454d351971974/imaterialist/data/__init__.py -------------------------------------------------------------------------------- /imaterialist/data/dataset_mapper.py: -------------------------------------------------------------------------------- 1 | import copy 2 | import torch 3 | import numpy as np 4 | from fvcore.common.file_io import PathManager 5 | 6 | from detectron2.data import DatasetMapper 7 | from detectron2.data import transforms as T 8 | from detectron2.data import detection_utils as utils 9 | 10 | from .structures import Attributes 11 | 12 | class iMatDatasetMapper(DatasetMapper): 13 | """ 14 | A callable which takes a dataset dict in Detectron2 Dataset format, 15 | and maps it into a format used by the model. 16 | 17 | This is a customized version of the default DatasetMapper where we add attributes 18 | to the Instances class. 19 | 20 | The callable currently does the following: 21 | 22 | 1. Read the image from "file_name" 23 | 2. Applies cropping/geometric transforms to the image and annotations 24 | 3. Prepare data and annotations to Tensor and :class:`Instances` (including 25 | attributes) 26 | """ 27 | def __call__(self, dataset_dict): 28 | """ 29 | Args: 30 | dataset_dict (dict): Metadata of one image, in Detectron2 Dataset format. 31 | 32 | Returns: 33 | dict: a format that builtin models in detectron2 accept 34 | """ 35 | dataset_dict = copy.deepcopy(dataset_dict) # it will be modified by code below 36 | # USER: Write your own image loading if it's not from a file 37 | image = utils.read_image(dataset_dict["file_name"], format=self.img_format) 38 | utils.check_image_size(dataset_dict, image) 39 | 40 | if "annotations" not in dataset_dict: 41 | image, transforms = T.apply_transform_gens( 42 | ([self.crop_gen] if self.crop_gen else []) + self.tfm_gens, image 43 | ) 44 | else: 45 | # Crop around an instance if there are instances in the image. 
46 | # USER: Remove if you don't use cropping 47 | if self.crop_gen: 48 | crop_tfm = utils.gen_crop_transform_with_instance( 49 | self.crop_gen.get_crop_size(image.shape[:2]), 50 | image.shape[:2], 51 | np.random.choice(dataset_dict["annotations"]), 52 | ) 53 | image = crop_tfm.apply_image(image) 54 | image, transforms = T.apply_transform_gens(self.tfm_gens, image) 55 | if self.crop_gen: 56 | transforms = crop_tfm + transforms 57 | 58 | image_shape = image.shape[:2] # h, w 59 | 60 | # Pytorch's dataloader is efficient on torch.Tensor due to shared-memory, 61 | # but not efficient on large generic data structures due to the use of pickle & mp.Queue. 62 | # Therefore it's important to use torch.Tensor. 63 | dataset_dict["image"] = torch.as_tensor(np.ascontiguousarray(image.transpose(2, 0, 1))) 64 | 65 | # USER: Remove if you don't use pre-computed proposals. 66 | if self.load_proposals: 67 | utils.transform_proposals( 68 | dataset_dict, image_shape, transforms, self.min_box_side_len, self.proposal_topk 69 | ) 70 | 71 | if not self.is_train: 72 | # USER: Modify this if you want to keep them for some reason. 73 | dataset_dict.pop("annotations", None) 74 | dataset_dict.pop("sem_seg_file_name", None) 75 | return dataset_dict 76 | 77 | if "annotations" in dataset_dict: 78 | # USER: Modify this if you want to keep them for some reason. 79 | for anno in dataset_dict["annotations"]: 80 | if not self.mask_on: 81 | anno.pop("segmentation", None) 82 | if not self.keypoint_on: 83 | anno.pop("keypoints", None) 84 | 85 | # USER: Implement additional transformations if you have other types of data 86 | annos = [ 87 | utils.transform_instance_annotations( 88 | obj, transforms, image_shape, keypoint_hflip_indices=self.keypoint_hflip_indices 89 | ) 90 | for obj in dataset_dict.pop("annotations") 91 | if obj.get("iscrowd", 0) == 0 92 | ] 93 | instances = utils.annotations_to_instances( 94 | annos, image_shape, mask_format=self.mask_format 95 | ) 96 | # Create a tight bounding box from masks, useful when image is cropped 97 | if self.crop_gen and instances.has("gt_masks"): 98 | instances.gt_boxes = instances.gt_masks.get_bounding_boxes() 99 | 100 | ################################# 101 | # Custom attributes section 102 | ################################# 103 | 104 | # Get attributes from annos 105 | if len(annos) and 'attributes' in annos[0]: 106 | 107 | # get a list of list of attributes 108 | gt_attributes = [x['attributes'] for x in annos] 109 | gt_attributes = torch.tensor(gt_attributes, dtype=torch.float32) 110 | # Put attributes in Attributes class holder and add them to instances 111 | 112 | # Using Attributes(gt_attributes) needs more work - currently fails 113 | instances.gt_attributes = gt_attributes 114 | 115 | # End attributes section 116 | 117 | dataset_dict["instances"] = utils.filter_empty_instances(instances) 118 | return dataset_dict -------------------------------------------------------------------------------- /imaterialist/data/datasets/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Julienbeaulieu/iMaterialist2020-Image-Segmentation-on-Detectron2/8d96069bb021dd374fb8de310bb454d351971974/imaterialist/data/datasets/__init__.py -------------------------------------------------------------------------------- /imaterialist/data/datasets/coco.py: -------------------------------------------------------------------------------- 1 | import cv2 2 | import random 3 | import numpy as np 4 | 5 | from detectron2.structures 
import BoxMode 6 | from detectron2.data import DatasetCatalog, MetadataCatalog 7 | from imaterialist.data.datasets.make_dataset import load_dataset_into_dataframes 8 | from imaterialist.data.datasets.rle_utils import rle_decode_string 9 | 10 | 11 | 12 | # https://detectron2.readthedocs.io/tutorials/datasets.html 13 | # https://colab.research.google.com/drive/16jcaJoc6bCFAQ96jDe2HwtXj7BMD_-m5 14 | def convert_to_datadict(df_input): 15 | """ 16 | :param df_input: 17 | :return: 18 | """ 19 | dataset_dicts = [] 20 | 21 | # Find the unique list of imageId, we will build the 22 | list_unique_ImageIds = df_input['ImageId'].unique().tolist() 23 | for idx, filename in enumerate(list_unique_ImageIds): 24 | 25 | record = {} 26 | 27 | # Convert to int otherwise evaluation will throw an error 28 | record['height'] = int(df_input[df_input['ImageId'] == filename]['Height'].values[0]) 29 | record['width'] = int(df_input[df_input['ImageId'] == filename]['Width'].values[0]) 30 | 31 | record['file_name'] = filename 32 | record['image_id'] = idx 33 | 34 | objs = [] 35 | for index, row in df_input[(df_input['ImageId'] == filename)].iterrows(): 36 | 37 | # Get binary mask 38 | mask = rle_decode_string(row['EncodedPixels'], row['Height'], row['Width']) 39 | 40 | # opencv 4.2+ 41 | # Transform the mask from binary to polygon format 42 | contours, hierarchy = cv2.findContours((mask).astype(np.uint8), cv2.RETR_TREE, 43 | cv2.CHAIN_APPROX_SIMPLE) 44 | 45 | # opencv 3.2 46 | # mask_new, contours, hierarchy = cv2.findContours((mask).astype(np.uint8), cv2.RETR_TREE, 47 | # cv2.CHAIN_APPROX_SIMPLE) 48 | 49 | segmentation = [] 50 | 51 | for contour in contours: 52 | contour = contour.flatten().tolist() 53 | # segmentation.append(contour) 54 | if len(contour) > 4: 55 | segmentation.append(contour) 56 | 57 | # Data for each mask 58 | obj = { 59 | 'bbox': [row['x0'], row['y0'], row['x1'], row['y1']], 60 | 'bbox_mode': BoxMode.XYXY_ABS, 61 | 'category_id': row['ClassId'], 62 | 'attributes': row['AttributesIds'], # New key: attributes 63 | 'segmentation': segmentation, 64 | } 65 | objs.append(obj) 66 | record['annotations'] = objs 67 | dataset_dicts.append(record) 68 | return dataset_dicts 69 | 70 | def register_datadict(datadict_input, label_dataset:str = "sample_fashion_train"): 71 | """ 72 | Register the data type with the Catalog function from Detectron2 code base. 
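:param datadict_input: DataFrame with one row per mask (ImageId, Height, Width, EncodedPixels, ClassId, AttributesIds plus x0/y0/x1/y1 box columns), as produced by make_dataset.create_datadict.
:param label_dataset: name under which the dataset is registered in DatasetCatalog/MetadataCatalog.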
73 | fixme: currently hard coded as sample_fashion_train sample_fashion_test 74 | """ 75 | _, _, df_categories = load_dataset_into_dataframes() 76 | # Register the train and test and set metadata 77 | 78 | DatasetCatalog.register(label_dataset, lambda d=datadict_input: convert_to_datadict(d)) 79 | MetadataCatalog.get(label_dataset).set(thing_classes=list(df_categories.name)) -------------------------------------------------------------------------------- /imaterialist/data/datasets/make_dataset.py: -------------------------------------------------------------------------------- 1 | import json 2 | import logging 3 | import pickle 4 | import numpy as np 5 | import pandas as pd 6 | from pathlib import Path 7 | from sklearn import preprocessing 8 | 9 | from imaterialist.data.datasets.rle_utils import rle_decode_string, rle2bbox 10 | 11 | from environs import Env 12 | 13 | env = Env() 14 | env.read_env() 15 | 16 | # Get training dataframe 17 | path_data = Path(env("path_raw")) 18 | path_image = path_data / "train/" 19 | path_data_interim = Path(env("path_interim")) 20 | 21 | def load_category_attributes(path_data: Path = path_data): 22 | # Get label descriptions 23 | with open(path_data / 'label_descriptions.json', 'r') as file: 24 | label_desc = json.load(file) 25 | 26 | df_categories = pd.DataFrame(label_desc['categories']) 27 | df_attributes = pd.DataFrame(label_desc['attributes']) 28 | 29 | return df_attributes, df_categories 30 | 31 | 32 | def load_dataset_into_dataframes(path_data: Path = path_data, n_cases: int = 0): 33 | """ 34 | Get all the CSV from the competition into dataframes. 35 | """ 36 | 37 | path_label = path_data / 'train.csv' 38 | df = pd.read_csv(path_label) 39 | 40 | # Just getting a smaller df to make the rest run faster 41 | if n_cases == 0: 42 | df = df.copy() 43 | elif n_cases != 0: 44 | df = df[:n_cases].copy() 45 | 46 | # Get label descriptions 47 | with open(path_data/'label_descriptions.json', 'r') as file: 48 | label_desc = json.load(file) 49 | 50 | df_categories = pd.DataFrame(label_desc['categories']) 51 | df_attributes = pd.DataFrame(label_desc['attributes']) 52 | 53 | return df, df_attributes, df_categories 54 | 55 | 56 | def attr_str_to_list(df, df_attributes): 57 | ''' 58 | Function that transforms DataFrame AttributeIds which are of type string into a 59 | list of integers. 
Strings must be converted because they cannot be transformed into Tensors 60 | ''' 61 | lb = preprocessing.LabelBinarizer() 62 | 63 | attribute_list = df_attributes.id.unique() 64 | attribute_list = np.sort(np.insert(attribute_list, 1, 999)) 65 | 66 | lb.fit(attribute_list) 67 | 68 | 69 | # cycle through all the non NaN rows - NaN causes an error 70 | for index, row in df.iterrows(): 71 | 72 | # Treating str differently than int 73 | if isinstance(row['AttributesIds'], str): 74 | 75 | # Convert each row's string into a list of strings 76 | df['AttributesIds'][index] = row['AttributesIds'].split(',') 77 | 78 | # Convert each string in the list to int 79 | df['AttributesIds'][index] = [int(x) for x in df['AttributesIds'][index]] 80 | 81 | # If int - make it a list of length 1 82 | if isinstance(row['AttributesIds'], int): 83 | df['AttributesIds'][index] = [999] 84 | 85 | df['AttributesIds'][index] = lb.transform(df['AttributesIds'][index]).sum(axis=0) 86 | 87 | 88 | def create_datadict(df_labels_masks, df_attributes): 89 | """ 90 | Creates the data dictionary necessary for Detectron2 which incorporated the additional following information: 91 | ImageId 92 | x0 93 | y0 94 | x1 95 | y1 96 | """ 97 | 98 | # Get image file path required for dict and add it to our data frame 99 | 100 | # Get only the first 50K labels, out of 333K labels. 101 | datedic_labels_masks = df_labels_masks.copy() # df sample 102 | 103 | # Append ImageId information. 104 | datedic_labels_masks['ImageId'] = str(path_image) + "/" + datedic_labels_masks['ImageId'] + ".jpg" 105 | 106 | # Get bboxes for each mask with our helper function 107 | bboxes = [rle2bbox(c.EncodedPixels, (c.Height, c.Width)) for n, c in datedic_labels_masks.iterrows()] 108 | 109 | # Turn list into array for proper indexing 110 | bboxes_array = np.array(bboxes) 111 | 112 | # Add each x, y coordinate as a column 113 | datedic_labels_masks['x0'], datedic_labels_masks['y0'], datedic_labels_masks['x1'], datedic_labels_masks['y1'] = bboxes_array[:, 0], bboxes_array[:, 1], bboxes_array[:,2], bboxes_array[:, 3] 114 | 115 | datedic_labels_masks = datedic_labels_masks.astype({"x0": int, "y0": int, "x1":int, 'y1':int}) 116 | 117 | #Replace NaNs from AttributeIds by 999 118 | datedic_labels_masks = datedic_labels_masks.fillna(999) 119 | 120 | # Turn attributes from string to list of ints with padding 121 | attr_str_to_list(datedic_labels_masks, df_attributes) 122 | 123 | return datedic_labels_masks 124 | 125 | def main(n_sample_size: int = 0, train_test_split: float = 0.8): 126 | """ 127 | Runs data processing scripts to turn raw train.csv dataframe from (../raw) into 128 | cleaned dataframe ready to be used by our dataset_dict. 129 | :param n_sample_size: 130 | :return: 131 | """ 132 | logger = logging.getLogger(__name__) 133 | logger.info('making final data set from raw data') 134 | 135 | data_full, df_attributes, _ = load_dataset_into_dataframes(n_cases=500) 136 | datadic_full = create_datadict(data_full, df_attributes) 137 | 138 | # if n_sample_size not specified, use entire data set. 139 | # If too large, use entire data set. 
140 | if n_sample_size == 0 or int(n_sample_size) > datadic_full.shape[0]: 141 | n_sample_size = len(datadic_full) 142 | 143 | 144 | # Arbitrary split in training / testing dataframes 145 | n_train: int = round(n_sample_size * train_test_split) 146 | n_test: int = n_sample_size - n_train 147 | datadic_train = datadic_full[:n_train].copy() 148 | datadic_val = datadic_full[-n_test:].copy() 149 | 150 | pickle.dump(datadic_train, open(path_data_interim / f'imaterialist_train_multihot_n={n_train}.p', "wb")) 151 | pickle.dump(datadic_val, open(path_data_interim / f'imaterialist_test_multihot_n={n_test}.p', "wb")) 152 | 153 | # # Saving to feather format - faster than pickle 154 | # datadic_train.reset_index().to_feather(path_data_interim / f'imaterialist_train_multihot_n={n_train}.feather') 155 | # datadic_val.reset_index().to_feather(path_data_interim / f'imaterailist_test_multihot_n={n_test}.feather') 156 | 157 | if __name__ == '__main__': 158 | log_fmt = '%(asctime)s - %(name)s - %(levelname)s - %(message)s' 159 | logging.basicConfig(level=logging.INFO, format=log_fmt) 160 | 161 | # Change size to create larger datasets 162 | main(500) 163 | -------------------------------------------------------------------------------- /imaterialist/data/datasets/rle_utils.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | from PIL import Image 3 | import math 4 | 5 | # Rle helper functions 6 | 7 | def rle_decode_string(rle, h, w): 8 | ''' 9 | rle: run-length encoded image mask, as string 10 | h: heigh of image on which RLE was produced 11 | w: width of image on which RLE was produced 12 | returns a binary mask with the same shape 13 | ''' 14 | mask = np.full(h * w, 0, dtype=np.uint8) 15 | annotation = [int(x) for x in rle.split(' ')] 16 | for i, start_pixel in enumerate(annotation[::2]): 17 | mask[start_pixel: start_pixel + annotation[2 * i + 1]] = 1 18 | mask = mask.reshape((h, w), order='F') 19 | 20 | return mask 21 | 22 | 23 | def mask_to_KaggleRLE_old(img): 24 | ''' 25 | Source: https://www.kaggle.com/lifa08/run-length-encode-and-decode 26 | img: numpy array, 1 - mask, 0 - background 27 | Returns run length as string formated 28 | ''' 29 | pixels = img.flatten() 30 | pixels = np.concatenate([[0], pixels, [0]]) 31 | runs = np.where(pixels[1:] != pixels[:-1])[0] + 1 32 | runs[1::2] -= runs[::2] 33 | return ' '.join(str(x) for x in runs) 34 | 35 | def mask_to_KaggleRLENew(img): 36 | pixels = img.T.flatten() 37 | # We need to allow for cases where there is a '1' at either end of the sequence. 38 | # We do this by padding with a zero at each end when needed. 
39 | use_padding = False 40 | if pixels[0] or pixels[-1]: 41 | use_padding = True 42 | pixel_padded = np.zeros([len(pixels) + 2], dtype=pixels.dtype) 43 | pixel_padded[1:-1] = pixels 44 | pixels = pixel_padded 45 | rle = np.where(pixels[1:] != pixels[:-1])[0] + 2 46 | if use_padding: 47 | rle = rle - 1 48 | rle[1::2] = rle[1::2] - rle[:-1:2] 49 | return ' '.join(str(x) for x in rle) 50 | 51 | 52 | 53 | def mask_to_KaggleRLE_downscale(img, max_size=1024): 54 | ''' 55 | Adaptive funciton to first DOWNSCALE the image before running RLE 56 | # Source: https://stackoverflow.com/a/28453021 57 | img: numpy array, 1 - mask, 0 - background 58 | Returns run length as string formated 59 | ''' 60 | # img is a tensor 61 | img_array = np.array(img) 62 | pil_image = Image.fromarray(img_array) 63 | width_current = pil_image.size[0] 64 | height_current = pil_image.size[1] 65 | 66 | longest_edge = max(width_current, height_current) 67 | 68 | # Always rescale 69 | if (width_current > height_current): 70 | new_width = max_size 71 | scaled_height = max_size / float(width_current) * height_current 72 | # Must floor the mask 73 | new_height = int(math.floor(scaled_height)) 74 | else: 75 | scale_width = max_size / float(height_current) * width_current 76 | # Must floor the mask 77 | new_width = int(math.floor(scale_width)) 78 | new_height = max_size 79 | 80 | pil_image = pil_image.resize((new_width, new_height), Image.NEAREST) 81 | 82 | image_array = np.array(pil_image) 83 | return mask_to_KaggleRLENew(image_array) 84 | 85 | 86 | def rle_encode(mask): 87 | pixels = mask.T.flatten() 88 | # We need to allow for cases where there is a '1' at either end of the sequence. 89 | # We do this by padding with a zero at each end when needed. 90 | use_padding = False 91 | if pixels[0] or pixels[-1]: 92 | use_padding = True 93 | pixel_padded = np.zeros([len(pixels) + 2], dtype=pixels.dtype) 94 | pixel_padded[1:-1] = pixels 95 | pixels = pixel_padded 96 | rle = np.where(pixels[1:] != pixels[:-1])[0] + 1 97 | if use_padding: 98 | rle = rle - 1 99 | rle[1::2] = rle[1::2] - rle[:-1:2] 100 | return rle 101 | 102 | 103 | def rle2bbox(rle, shape): 104 | ''' 105 | Get a bbox from a mask which is required for Detectron 2 dataset 106 | rle: run-length encoded image mask, as string 107 | shape: (height, width) of image on which RLE was produced 108 | Returns (x0, y0, x1, y1) tuple describing the bounding box of the rle mask 109 | 110 | Note on image vs np.array dimensions: 111 | 112 | np.array implies the `[y, x]` indexing order in terms of image dimensions, 113 | so the variable on `shape[0]` is `y`, and the variable on the `shape[1]` is `x`, 114 | hence the result would be correct (x0,y0,x1,y1) in terms of image dimensions 115 | for RLE-encoded indices of np.array (which are produced by widely used kernels 116 | and are used in most kaggle competitions datasets) 117 | ''' 118 | 119 | a = np.fromiter(rle.split(), dtype=np.uint) 120 | a = a.reshape((-1, 2)) # an array of (start, length) pairs 121 | a[:, 0] -= 1 # `start` is 1-indexed 122 | 123 | y0 = a[:, 0] % shape[0] 124 | y1 = y0 + a[:, 1] 125 | if np.any(y1 > shape[0]): 126 | # got `y` overrun, meaning that there are a pixels in mask on 0 and shape[0] position 127 | y0 = 0 128 | y1 = shape[0] 129 | else: 130 | y0 = np.min(y0) 131 | y1 = np.max(y1) 132 | 133 | x0 = a[:, 0] // shape[0] 134 | x1 = (a[:, 0] + a[:, 1]) // shape[0] 135 | x0 = np.min(x0) 136 | x1 = np.max(x1) 137 | 138 | if x1 > shape[1]: 139 | # just went out of the image dimensions 140 | raise ValueError("invalid RLE or 
image dimensions: x1=%d > shape[1]=%d" % ( 141 | x1, shape[1] 142 | )) 143 | 144 | return x0, y0, x1, y1 145 | 146 | def rle_decode_new(rle_str, mask_shape): 147 | # This is the reverse decoding mechanism 148 | # Source: https://www.kaggle.com/stainsby/fast-tested-rle-and-input-routines 149 | s = rle_str.split() 150 | starts, lengths = [np.asarray(x, dtype=int) for x in (s[0:][::2], s[1:][::2])] 151 | starts -= 1 152 | ends = starts + lengths 153 | mask = np.zeros(np.prod(mask_shape), dtype=np.uint8) 154 | for lo, hi in zip(starts, ends): 155 | mask[lo:hi] = 1 156 | return mask.reshape(mask_shape[::-1]).T 157 | 158 | def refine_masks(masks): 159 | """ 160 | Take a series of masks 161 | sort them from small to large (yet preserve order information) 162 | Add each one to union, keep track of already counted pixel coordinate positions to make the masks "uniquefy" 163 | Convert each "uniquefied" mask to RLE 164 | return them in the order originally found 165 | 166 | :param masks: 167 | :param n_labels: 168 | :return: 169 | """ 170 | # Source: https://github.com/abhishekkrthakur/imat-fashion/blob/master/predict.py 171 | n_labels = len(masks) 172 | 173 | # Early return if nothing. 174 | if n_labels == 0: 175 | # Return empty string 176 | return [] 177 | 178 | masks = np.array(masks) 179 | 180 | # Compute the areas of each mask 181 | #mask_areas = np.sum(masks.reshape(-1, masks.shape[0]), axis=1) 182 | masks_areas = [np.sum(np.sum(mask)) for mask in masks] 183 | 184 | # Preserve the original order? 185 | masks_areas_ordered = list(enumerate(masks_areas)) 186 | masks_areas_ordered_sorted = sorted(masks_areas_ordered, key=lambda a: a[1]) 187 | 188 | # One reference mask is created to be incrementally populated 189 | # This has same as number of labels. 190 | 191 | # Generate a blank union mask, which all labels will be iteratively updating to ensure no overlap pixels. 192 | union_mask = np.zeros(masks[0].shape, dtype=bool) 193 | 194 | # Refined masks 195 | uniquefied_mask = [] 196 | 197 | # Iterate from the smallest, so smallest ones are preserved 198 | # Second parameter is area, useless. 199 | for mask_index, _ in masks_areas_ordered_sorted: 200 | # Current Mask: 201 | mask_current = masks[mask_index, :, :] 202 | 203 | # unionized version fo the current mask: Default to false/0 if not defined. 204 | # not logical_not, it turns all False (default) to True. True&True = True 205 | # All true for the first iteration only. 206 | union_mask_inverted = np.logical_not(union_mask) 207 | 208 | uniquefied_mask_current = np.logical_and( 209 | mask_current, 210 | union_mask_inverted # not logical_not, it turns all False (default) to True. True&True = True 211 | ) 212 | 213 | uniquefied_mask.append((mask_index, uniquefied_mask_current)) 214 | 215 | # update the union mask to include the latest calculation 216 | union_mask = np.logical_or(mask_current, union_mask) 217 | 218 | # sort this by original index. 
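# so the refined RLEs are returned in the same order as the input masks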
219 | uniquefied_mask.sort(key=lambda a: a[0]) 220 | 221 | refined_rle = [] 222 | 223 | # Iterate through masks last axis 224 | for mask_index, uniquefied_mask_current in uniquefied_mask: 225 | 226 | # Change this line to determine whether to use downscaled KaggleRLE or regular KaggleRLE conversion process 227 | rle = mask_to_KaggleRLE_downscale(uniquefied_mask_current) 228 | #rle = mask_to_KaggleRLE_old(uniquefied_mask_current) 229 | #rle = mask_to_KaggleRLE_downscale(uniquefied_mask_current) 230 | refined_rle.append(rle) 231 | # Sanity check on uniquefying reduction 232 | # print(f"Original: {masks_areas[mask_index]}, {np.sum(np.sum(uniquefied_mask_current))}") 233 | 234 | return refined_rle 235 | -------------------------------------------------------------------------------- /imaterialist/data/datasets/rle_utils_old.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | from pycocotools.mask import encode, toBbox 3 | from typing import List 4 | from itertools import groupby 5 | 6 | 7 | def mask_to_uncompressed_CocoRLE(binary_mask): 8 | """ 9 | Source: https://stackoverflow.com/questions/49494337/encode-numpy-array-using-uncompressed-rle-for-coco-dataset 10 | """ 11 | rle = {'counts': [], 'size': list(binary_mask.shape)} 12 | counts = rle.get('counts') 13 | for i, (value, elements) in enumerate(groupby(binary_mask.ravel(order='F'))): 14 | if i == 0 and value == 1: 15 | counts.append(0) 16 | counts.append(len(list(elements))) 17 | return rle 18 | 19 | def mask_to_KaggleRLE(img): 20 | ''' 21 | Source: https://www.kaggle.com/lifa08/run-length-encode-and-decode 22 | img: numpy array, 1 - mask, 0 - background 23 | Returns run length as string formated 24 | ''' 25 | pixels = img.flatten() 26 | pixels = np.concatenate([[0], pixels, [0]]) 27 | runs = np.where(pixels[1:] != pixels[:-1])[0] + 1 28 | runs[1::2] -= runs[::2] 29 | return ' '.join(str(x) for x in runs) 30 | 31 | 32 | def KaggleRLE_to_mask1(mask_rle, shape): 33 | ''' 34 | # Source: https://www.kaggle.com/lifa08/run-length-encode-and-decode 35 | mask_rle: run-length as string formated (start length) 36 | shape: (height,width) of array to return, Height Width Format 37 | Returns numpy array, 1 - mask, 0 - background 38 | 39 | (in fortran format?) 40 | 41 | ''' 42 | s = mask_rle.split() 43 | starts, lengths = [np.asarray(x, dtype=int) for x in (s[0:][::2], s[1:][::2])] 44 | starts -= 1 45 | ends = starts + lengths 46 | img = np.zeros(shape[0] * shape[1], dtype=np.uint8) 47 | 48 | for lo, hi in zip(starts, ends): 49 | img[lo:hi] = 1 50 | return img.reshape(shape) 51 | 52 | 53 | def KaggleRLE_to_CocoRLE(KaggleRLE: str, h: int, w: int) -> List[dict]: 54 | """ 55 | This wrapper function converts kaggle KaggleRLE to binary, then convert that binary to COCORLE. 56 | :param KaggleRLE: 57 | :param h: 58 | :param w: 59 | :return: 60 | """ 61 | # Conver to binary using tried and true masks. 62 | mask = KaggleRLE_to_mask(KaggleRLE, h, w) 63 | 64 | # using PyCoCoAPI to convert binary mask to CocoRLE format, which are re a LIST of dict of run-length encoding of binary masks. 65 | CocoRLE = encode(np.asfortranarray(mask)) 66 | 67 | return CocoRLE 68 | 69 | def KaggleRLE_to_CocoBoundBoxes(KaggleRLE: str, h: int, w: int) -> List[int]: 70 | 71 | # Generate the KaggleRLE in coco format 72 | CocoRLE = KaggleRLE_to_CocoRLE(KaggleRLE, h, w) 73 | 74 | # Generate the BBS using the CocoAPI. 
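# (pycocotools' toBbox returns the box in COCO [x, y, width, height] format)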
75 | CocoBBS = toBbox(CocoRLE) 76 | 77 | return CocoBBS 78 | 79 | def KaggleRLE_to_mask(rle, h, w): 80 | ''' 81 | rle: run-length encoded image mask, as string 82 | h: heigh of image on which KaggleRLE was produced 83 | w: width of image on which KaggleRLE was produced 84 | 85 | returns a binary mask with the same shape 86 | 87 | ''' 88 | mask = np.full(h * w, 0, dtype=np.uint8) 89 | annotation = [int(x) for x in rle.split(' ')] 90 | for i, start_pixel in enumerate(annotation[::2]): 91 | mask[start_pixel: start_pixel + annotation[2 * i + 1]] = 1 92 | mask = mask.reshape((h, w), order='F') 93 | 94 | return mask 95 | 96 | 97 | def KaggleRLE_to_bbox(rle, shape): 98 | ''' 99 | Get a bbox from a mask which is required for Detectron 2 dataset 100 | rle: run-length encoded image mask, as string 101 | shape: (height, width) of image on which KaggleRLE was produced 102 | Returns (x0, y0, x1, y1) tuple describing the bounding box of the rle mask 103 | 104 | Note on image vs np.array dimensions: 105 | 106 | np.array implies the `[y, x]` indexing order in terms of image dimensions, 107 | so the variable on `shape[0]` is `y`, and the variable on the `shape[1]` is `x`, 108 | hence the result would be correct (x0,y0,x1,y1) in terms of image dimensions 109 | for KaggleRLE-encoded indices of np.array (which are produced by widely used kernels 110 | and are used in most kaggle competitions datasets) 111 | ''' 112 | 113 | a = np.fromiter(rle.split(), dtype=np.uint) 114 | a = a.reshape((-1, 2)) # an array of (start, length) pairs 115 | a[:, 0] -= 1 # `start` is 1-indexed 116 | 117 | y0 = a[:, 0] % shape[0] 118 | y1 = y0 + a[:, 1] 119 | if np.any(y1 > shape[0]): 120 | # got `y` overrun, meaning that there are a pixels in mask on 0 and shape[0] position 121 | y0 = 0 122 | y1 = shape[0] 123 | else: 124 | y0 = np.min(y0) 125 | y1 = np.max(y1) 126 | 127 | x0 = a[:, 0] // shape[0] 128 | x1 = (a[:, 0] + a[:, 1]) // shape[0] 129 | x0 = np.min(x0) 130 | x1 = np.max(x1) 131 | 132 | if x1 > shape[1]: 133 | # just went out of the image dimensions 134 | raise ValueError("invalid KaggleRLE or image dimensions: x1=%d > shape[1]=%d" % ( 135 | x1, shape[1] 136 | )) 137 | 138 | return x0, y0, x1, y1 -------------------------------------------------------------------------------- /imaterialist/data/datasets/test_rle.py: -------------------------------------------------------------------------------- 1 | from pycocotools.mask import decode, frPyObjects 2 | import pytest 3 | import numpy as np 4 | import pickle 5 | from pycocotools import mask 6 | import pytest 7 | from rle_utils_old import KaggleRLE_to_CocoRLE, KaggleRLE_to_mask, KaggleRLE_to_mask1, mask_to_uncompressed_CocoRLE, mask_to_KaggleRLE,KaggleRLE_to_CocoBoundBoxes, KaggleRLE_to_bbox 8 | from rle_utils import refine_masks, rle_decode_new 9 | from PIL import Image 10 | def test_pycocoapiRLE(): 11 | 12 | # Example prediction data. 13 | data1 = "PUi>9Td0j0\\O 1024 or raw_image.size[1] > 1024: 123 | pass 124 | total_pixel = raw_image.size[0] * raw_image.size[1] 125 | total_index = (int(RLElist[-2])+int(RLElist[-1])) 126 | outofbound = total_index - total_pixel 127 | if outofbound > 0: 128 | shithitfans = outofbound 129 | elif outofbound == 0 or total_index >= 1024*1024: 130 | shithitfans = 1 131 | else: 132 | shithitfans = 0 133 | 134 | #if shithitfans != 0: 135 | result.append(f"For image {name} sized at {raw_image.size}, RLE = total: {RLElist[-2:]}, {total_pixel} vs {total_index}. 
Out of bound {outofbound} {'FuckedUp ' * shithitfans} ") 136 | 137 | with open("/home/dyt811/Desktop/NewRLE_sizecheck.csv", mode='w', newline="") as f: 138 | writer = csv.writer(f) 139 | for row in result: 140 | writer.writerow(row) 141 | 142 | -------------------------------------------------------------------------------- /imaterialist/data/structures.py: -------------------------------------------------------------------------------- 1 | import torch 2 | from typing import Iterator, List, Tuple, Union 3 | import torch.nn.functional as F 4 | 5 | # Base Attribute holder 6 | class Attributes: 7 | """ 8 | This structure stores a list of attributes as a Nx13 torch.Tensor. 9 | It behaves like a Tensor 10 | (support indexing, `to(device)`, `.device`, and iteration over all attributes) 11 | """ 12 | 13 | AttributeSizeType = Union[List[int], Tuple[int, int]] 14 | 15 | def __init__(self, tensor: torch.Tensor): 16 | """ 17 | Args: 18 | tensor (Tensor[float]): a Nx14 matrix. Each row is [attribute_1, attribute_2, ...]. 19 | """ 20 | device = tensor.device if isinstance(tensor, torch.Tensor) else torch.device("cpu") 21 | tensor = torch.as_tensor(tensor, dtype=torch.int64, device=device) 22 | if tensor.numel() == 0: 23 | # Use reshape, so we don't end up creating a new tensor that does not depend on 24 | # the inputs (and consequently confuses jit) 25 | tensor = tensor.reshape((0, 295)).to(dtype=torch.int64, device=device) 26 | assert tensor.dim() == 2 and tensor.size(-1) == 295, tensor.size() 27 | 28 | self.tensor = tensor 29 | 30 | 31 | def __getitem__(self, item: Union[int, slice, torch.BoolTensor]) -> "Boxes": 32 | """ 33 | Returns: 34 | Attributes: Create a new :class:`Attributes` by indexing. 35 | The following usage are allowed: 36 | 1. `new_attributes = attributes[3]`: return a `Attributes` which contains only one Attribute. 37 | 2. `new_attributes = attributes[2:10]`: return a slice of attributes. 38 | 3. `new_attributes = attributes[vector]`, where vector is a torch.BoolTensor 39 | with `length = len(attributes)`. Nonzero elements in the vector will be selected. 40 | Note that the returned Attributes might share storage with this Attributes, 41 | subject to Pytorch's indexing semantics. 42 | """ 43 | if isinstance(item, int): 44 | return Attributes(self.tensor[item].view(1, -1)) 45 | b = self.tensor[item] 46 | assert b.dim() == 2, "Indexing on Attributes with {} failed to return a matrix!".format(item) 47 | return Attributes(b) 48 | 49 | def __len__(self) -> int: 50 | return self.tensor.shape[0] 51 | 52 | def to(self, device: str) -> "Attributes": 53 | return Attributes(self.tensor.to(device)) 54 | 55 | def nonempty(self, threshold: float = 0.0) -> torch.Tensor: 56 | """ 57 | Find attributes that are non-empty. 58 | An attribute is considered empty if its first attribute in the list is 999. 59 | Returns: 60 | Tensor: 61 | a binary vector which represents whether each attribute is empty 62 | (False) or non-empty (True). 
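        Example (editor's sketch with a hypothetical 2 x 295 attribute tensor, where
        999 in column 0 marks a padding row):
            attrs = Attributes(torch.tensor([[999] + [0] * 294,
                                             [1]   + [0] * 294]))
            attrs.nonempty()   # -> tensor([False,  True])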
63 | """ 64 | attributes = self.tensor 65 | first_attr = attributes[:, 0] 66 | keep = (first_attr != 999) 67 | return keep 68 | 69 | def __repr__(self) -> str: 70 | return "Attributes(" + str(self.tensor) + ")" 71 | 72 | 73 | def remove_padding(self, attribute): 74 | pass 75 | 76 | @classmethod 77 | def cat(cls, attributes_list: List["Attributes"]) -> "Attributes": 78 | """ 79 | Concatenates a list of Attributes into a single Attributes 80 | Arguments: 81 | Attributes_list (list[Attributes]) 82 | Returns: 83 | Attributes: the concatenated Attributes 84 | """ 85 | assert isinstance(attributes_list, (list, tuple)) 86 | if len(attributes_list) == 0: 87 | return cls(torch.empty(0)) 88 | assert all(isinstance(attribute, Attributes) for attribute in attributes_list) 89 | 90 | # use torch.cat (v.s. layers.cat) so the returned boxes never share storage with input 91 | cat_attributes = cls(torch.cat([b.tensor for b in attributes_list], dim=0)) 92 | return cat_attributes 93 | 94 | def size(self): 95 | 'required in order to pass loss function assertions' 96 | return (len(self), 295) 97 | 98 | def numel(self): 99 | 'required in order to pass loss function assertions' 100 | return len(self) 101 | 102 | @property 103 | def device(self) -> torch.device: 104 | return self.tensor.device 105 | 106 | def __iter__(self) -> Iterator[torch.Tensor]: 107 | """ 108 | Yield attributes as a Tensor of shape (14,) at a time. 109 | """ 110 | yield from self.tensor -------------------------------------------------------------------------------- /imaterialist/evaluator.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import detectron2.data.transforms as T 3 | from detectron2.checkpoint import DetectionCheckpointer 4 | 5 | from imaterialist.data.datasets.coco import MetadataCatalog 6 | from imaterialist.modeling import build_model 7 | 8 | class iMatPredictor: 9 | """ 10 | A specailized predictor with attributes added! 11 | # Note the reference to the imaterialst MetaDataCatalog and modeling specifically! 12 | """ 13 | def __init__(self, cfg): 14 | self.cfg = cfg.clone() # cfg can be modified by model 15 | self.model = build_model(self.cfg) 16 | self.model.eval() 17 | self.metadata = MetadataCatalog.get(cfg.DATASETS.TEST[0]) 18 | 19 | checkpointer = DetectionCheckpointer(self.model) 20 | checkpointer.load(cfg.MODEL.WEIGHTS) 21 | 22 | self.transform_gen = T.ResizeShortestEdge( 23 | [cfg.INPUT.MIN_SIZE_TEST, cfg.INPUT.MIN_SIZE_TEST], cfg.INPUT.MAX_SIZE_TEST 24 | ) 25 | 26 | self.input_format = cfg.INPUT.FORMAT 27 | assert self.input_format in ["RGB", "BGR"], self.input_format 28 | 29 | def __call__(self, original_image): 30 | """ 31 | Args: 32 | original_image (np.ndarray): an image of shape (H, W, C) (in BGR order). 33 | Returns: 34 | predictions (dict): 35 | the output of the model for one image only. 36 | See :doc:`/tutorials/models` for details about the format. 37 | """ 38 | with torch.no_grad(): # https://github.com/sphinx-doc/sphinx/issues/4258 39 | # Apply pre-processing to image. 
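            # Editor's note: typical end-to-end usage of this predictor (a sketch;
            # get_cfg, add_imaterialist_config and cv2 are assumed to be imported,
            # and the config/weight paths below are placeholders, not repo files):
            #     cfg = get_cfg(); add_imaterialist_config(cfg)
            #     cfg.merge_from_file("configs/exp01.yaml")
            #     cfg.MODEL.WEIGHTS = "/path/to/model_final.pth"
            #     predictor = iMatPredictor(cfg)
            #     outputs = predictor(cv2.imread("image.jpg"))    # BGR ndarray in
            #     instances = outputs["instances"]                # carries attr_scores too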
40 | if self.input_format == "RGB": 41 | # whether the model expects BGR inputs or RGB 42 | original_image = original_image[:, :, ::-1] 43 | height, width = original_image.shape[:2] 44 | image = self.transform_gen.get_transform(original_image).apply_image(original_image) 45 | image = torch.as_tensor(image.astype("float32").transpose(2, 0, 1)) 46 | 47 | inputs = {"image": image, "height": height, "width": width} 48 | predictions = self.model([inputs])[0] 49 | return predictions -------------------------------------------------------------------------------- /imaterialist/modeling/__init__.py: -------------------------------------------------------------------------------- 1 | from .attributes_rcnn import build_model -------------------------------------------------------------------------------- /imaterialist/modeling/attributes_rcnn.py: -------------------------------------------------------------------------------- 1 | # Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved 2 | import logging 3 | import numpy as np 4 | import torch 5 | from torch import nn 6 | 7 | from .roi_heads.roi_heads import build_roi_heads 8 | 9 | from detectron2.structures import ImageList 10 | from detectron2.utils.events import get_event_storage 11 | from detectron2.utils.logger import log_first_n 12 | from detectron2.utils.registry import Registry 13 | from detectron2.modeling.backbone import build_backbone 14 | from detectron2.modeling.postprocessing import detector_postprocess 15 | from detectron2.modeling.proposal_generator import build_proposal_generator 16 | from detectron2.modeling.meta_arch.build import META_ARCH_REGISTRY 17 | 18 | 19 | META_ARCH_REGISTRY = Registry("META_ARCH") # noqa F401 isort:skip 20 | META_ARCH_REGISTRY.__doc__ = """ 21 | Registry for meta-architectures, i.e. the whole model. 22 | 23 | The registered object will be called with `obj(cfg)` 24 | and expected to return a `nn.Module` object. 25 | """ 26 | 27 | __all__ = ["GeneralizedRCNN", "ProposalNetwork", "build_model"] 28 | 29 | 30 | @META_ARCH_REGISTRY.register() 31 | class GeneralizedRCNN(nn.Module): 32 | """ 33 | Generalized R-CNN. Any models that contains the following three components: 34 | 1. Per-image feature extraction (aka backbone) 35 | 2. Region proposal generation 36 | 3. Per-region feature extraction and prediction 37 | """ 38 | 39 | def __init__(self, cfg): 40 | super().__init__() 41 | 42 | self.backbone = build_backbone(cfg) 43 | self.proposal_generator = build_proposal_generator(cfg, self.backbone.output_shape()) 44 | self.roi_heads = build_roi_heads(cfg, self.backbone.output_shape()) 45 | self.vis_period = cfg.VIS_PERIOD 46 | self.input_format = cfg.INPUT.FORMAT 47 | 48 | assert len(cfg.MODEL.PIXEL_MEAN) == len(cfg.MODEL.PIXEL_STD) 49 | self.register_buffer("pixel_mean", torch.Tensor(cfg.MODEL.PIXEL_MEAN).view(-1, 1, 1)) 50 | self.register_buffer("pixel_std", torch.Tensor(cfg.MODEL.PIXEL_STD).view(-1, 1, 1)) 51 | 52 | @property 53 | def device(self): 54 | return self.pixel_mean.device 55 | 56 | def visualize_training(self, batched_inputs, proposals): 57 | """ 58 | A function used to visualize images and proposals. It shows ground truth 59 | bounding boxes on the original image and up to 20 predicted object 60 | proposals on the original image. Users can implement different 61 | visualization functions for different models. 62 | 63 | Args: 64 | batched_inputs (list): a list that contains input to the model. 65 | proposals (list): a list that contains predicted proposals. 
Both 66 | batched_inputs and proposals should have the same length. 67 | """ 68 | from detectron2.utils.visualizer import Visualizer 69 | 70 | storage = get_event_storage() 71 | max_vis_prop = 20 72 | 73 | for input, prop in zip(batched_inputs, proposals): 74 | img = input["image"].cpu().numpy() 75 | assert img.shape[0] == 3, "Images should have 3 channels." 76 | if self.input_format == "BGR": 77 | img = img[::-1, :, :] 78 | img = img.transpose(1, 2, 0) 79 | v_gt = Visualizer(img, None) 80 | v_gt = v_gt.overlay_instances(boxes=input["instances"].gt_boxes) 81 | anno_img = v_gt.get_image() 82 | box_size = min(len(prop.proposal_boxes), max_vis_prop) 83 | v_pred = Visualizer(img, None) 84 | v_pred = v_pred.overlay_instances( 85 | boxes=prop.proposal_boxes[0:box_size].tensor.cpu().numpy() 86 | ) 87 | prop_img = v_pred.get_image() 88 | vis_img = np.concatenate((anno_img, prop_img), axis=1) 89 | vis_img = vis_img.transpose(2, 0, 1) 90 | vis_name = "Left: GT bounding boxes; Right: Predicted proposals" 91 | storage.put_image(vis_name, vis_img) 92 | break # only visualize one image in a batch 93 | 94 | def forward(self, batched_inputs): 95 | """ 96 | Args: 97 | batched_inputs: a list, batched outputs of :class:`DatasetMapper` . 98 | Each item in the list contains the inputs for one image. 99 | For now, each item in the list is a dict that contains: 100 | 101 | * image: Tensor, image in (C, H, W) format. 102 | * instances (optional): groundtruth :class:`Instances` 103 | * proposals (optional): :class:`Instances`, precomputed proposals. 104 | 105 | Other information that's included in the original dicts, such as: 106 | 107 | * "height", "width" (int): the output resolution of the model, used in inference. 108 | See :meth:`postprocess` for details. 109 | 110 | Returns: 111 | list[dict]: 112 | Each dict is the output for one input image. 113 | The dict contains one key "instances" whose value is a :class:`Instances`. 114 | The :class:`Instances` object has the following keys: 115 | "pred_boxes", "pred_classes", "scores", "pred_masks", "pred_keypoints" 116 | """ 117 | if not self.training: 118 | return self.inference(batched_inputs) 119 | 120 | images = self.preprocess_image(batched_inputs) 121 | if "instances" in batched_inputs[0]: 122 | gt_instances = [x["instances"].to(self.device) for x in batched_inputs] 123 | elif "targets" in batched_inputs[0]: 124 | log_first_n( 125 | logging.WARN, "'targets' in the model inputs is now renamed to 'instances'!", n=10 126 | ) 127 | gt_instances = [x["targets"].to(self.device) for x in batched_inputs] 128 | else: 129 | gt_instances = None 130 | 131 | features = self.backbone(images.tensor) 132 | 133 | if self.proposal_generator: 134 | proposals, proposal_losses = self.proposal_generator(images, features, gt_instances) 135 | else: 136 | assert "proposals" in batched_inputs[0] 137 | proposals = [x["proposals"].to(self.device) for x in batched_inputs] 138 | proposal_losses = {} 139 | 140 | _, detector_losses = self.roi_heads(images, features, proposals, gt_instances) 141 | if self.vis_period > 0: 142 | storage = get_event_storage() 143 | if storage.iter % self.vis_period == 0: 144 | self.visualize_training(batched_inputs, proposals) 145 | 146 | losses = {} 147 | losses.update(detector_losses) 148 | losses.update(proposal_losses) 149 | return losses 150 | 151 | def inference(self, batched_inputs, detected_instances=None, do_postprocess=True): 152 | """ 153 | Run inference on the given inputs. 
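        Minimal input sketch (editor's addition; tensor values are hypothetical):
            inputs  = [{"image": torch.rand(3, 800, 600), "height": 800, "width": 600}]
            results = model.inference(inputs)   # -> [{"instances": Instances(...)}]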
154 | 155 | Args: 156 | batched_inputs (list[dict]): same as in :meth:`forward` 157 | detected_instances (None or list[Instances]): if not None, it 158 | contains an `Instances` object per image. The `Instances` 159 | object contains "pred_boxes" and "pred_classes" which are 160 | known boxes in the image. 161 | The inference will then skip the detection of bounding boxes, 162 | and only predict other per-ROI outputs. 163 | do_postprocess (bool): whether to apply post-processing on the outputs. 164 | 165 | Returns: 166 | same as in :meth:`forward`. 167 | """ 168 | assert not self.training 169 | 170 | images = self.preprocess_image(batched_inputs) 171 | features = self.backbone(images.tensor) 172 | 173 | if detected_instances is None: 174 | if self.proposal_generator: 175 | proposals, _ = self.proposal_generator(images, features, None) 176 | else: 177 | assert "proposals" in batched_inputs[0] 178 | proposals = [x["proposals"].to(self.device) for x in batched_inputs] 179 | 180 | results, _ = self.roi_heads(images, features, proposals, None) 181 | else: 182 | detected_instances = [x.to(self.device) for x in detected_instances] 183 | results = self.roi_heads.forward_with_given_boxes(features, detected_instances) 184 | 185 | if do_postprocess: 186 | return GeneralizedRCNN._postprocess(results, batched_inputs, images.image_sizes) 187 | else: 188 | return results 189 | 190 | def preprocess_image(self, batched_inputs): 191 | """ 192 | Normalize, pad and batch the input images. 193 | """ 194 | images = [x["image"].to(self.device) for x in batched_inputs] 195 | images = [(x - self.pixel_mean) / self.pixel_std for x in images] 196 | images = ImageList.from_tensors(images, self.backbone.size_divisibility) 197 | return images 198 | 199 | @staticmethod 200 | def _postprocess(instances, batched_inputs, image_sizes): 201 | """ 202 | Rescale the output instances to the target size. 203 | """ 204 | # note: private function; subject to changes 205 | processed_results = [] 206 | for results_per_image, input_per_image, image_size in zip( 207 | instances, batched_inputs, image_sizes 208 | ): 209 | height = input_per_image.get("height", image_size[0]) 210 | width = input_per_image.get("width", image_size[1]) 211 | r = detector_postprocess(results_per_image, height, width) 212 | processed_results.append({"instances": r}) 213 | return processed_results 214 | 215 | 216 | @META_ARCH_REGISTRY.register() 217 | class ProposalNetwork(nn.Module): 218 | """ 219 | A meta architecture that only predicts object proposals. 220 | """ 221 | 222 | def __init__(self, cfg): 223 | super().__init__() 224 | self.backbone = build_backbone(cfg) 225 | self.proposal_generator = build_proposal_generator(cfg, self.backbone.output_shape()) 226 | 227 | self.register_buffer("pixel_mean", torch.Tensor(cfg.MODEL.PIXEL_MEAN).view(-1, 1, 1)) 228 | self.register_buffer("pixel_std", torch.Tensor(cfg.MODEL.PIXEL_STD).view(-1, 1, 1)) 229 | 230 | @property 231 | def device(self): 232 | return self.pixel_mean.device 233 | 234 | def forward(self, batched_inputs): 235 | """ 236 | Args: 237 | Same as in :class:`GeneralizedRCNN.forward` 238 | 239 | Returns: 240 | list[dict]: 241 | Each dict is the output for one input image. 242 | The dict contains one key "proposals" whose value is a 243 | :class:`Instances` with keys "proposal_boxes" and "objectness_logits". 
244 | """ 245 | images = [x["image"].to(self.device) for x in batched_inputs] 246 | images = [(x - self.pixel_mean) / self.pixel_std for x in images] 247 | images = ImageList.from_tensors(images, self.backbone.size_divisibility) 248 | features = self.backbone(images.tensor) 249 | 250 | if "instances" in batched_inputs[0]: 251 | gt_instances = [x["instances"].to(self.device) for x in batched_inputs] 252 | elif "targets" in batched_inputs[0]: 253 | log_first_n( 254 | logging.WARN, "'targets' in the model inputs is now renamed to 'instances'!", n=10 255 | ) 256 | gt_instances = [x["targets"].to(self.device) for x in batched_inputs] 257 | else: 258 | gt_instances = None 259 | proposals, proposal_losses = self.proposal_generator(images, features, gt_instances) 260 | # In training, the proposals are not useful at all but we generate them anyway. 261 | # This makes RPN-only models about 5% slower. 262 | if self.training: 263 | return proposal_losses 264 | 265 | processed_results = [] 266 | for results_per_image, input_per_image, image_size in zip( 267 | proposals, batched_inputs, images.image_sizes 268 | ): 269 | height = input_per_image.get("height", image_size[0]) 270 | width = input_per_image.get("width", image_size[1]) 271 | r = detector_postprocess(results_per_image, height, width) 272 | processed_results.append({"proposals": r}) 273 | return processed_results 274 | 275 | def build_model(cfg): 276 | """ 277 | Build the whole model architecture, defined by ``cfg.MODEL.META_ARCHITECTURE``. 278 | Note that it does not load any weights from ``cfg``. 279 | """ 280 | meta_arch = cfg.MODEL.META_ARCHITECTURE 281 | model = META_ARCH_REGISTRY.get(meta_arch)(cfg) 282 | model.to(torch.device(cfg.MODEL.DEVICE)) 283 | return model 284 | -------------------------------------------------------------------------------- /imaterialist/modeling/roi_heads/__init__.py: -------------------------------------------------------------------------------- 1 | from .roi_heads import ( 2 | ROI_HEADS_REGISTRY, 3 | build_roi_heads, 4 | StandardROIHeads, 5 | ) 6 | 7 | -------------------------------------------------------------------------------- /imaterialist/modeling/roi_heads/attributes_head.py: -------------------------------------------------------------------------------- 1 | # Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved 2 | import logging 3 | import torch 4 | from torch import nn 5 | from torch.nn import functional as F 6 | 7 | from detectron2.config import configurable 8 | from detectron2.layers import Linear, ShapeSpec, batched_nms, cat 9 | from detectron2.modeling.box_regression import Box2BoxTransform 10 | from detectron2.structures import Boxes, Instances 11 | from detectron2.utils.events import get_event_storage 12 | from detectron2.modeling.roi_heads.fast_rcnn import FastRCNNOutputLayers, FastRCNNOutputs 13 | from detectron2.modeling.box_regression import Box2BoxTransform 14 | 15 | __all__ = ["fast_rcnn_inference", "AttributesFastRCNNOutputLayers"] 16 | 17 | 18 | logger = logging.getLogger(__name__) 19 | 20 | """ 21 | Shape shorthand in this module: 22 | 23 | N: number of images in the minibatch 24 | R: number of ROIs, combined over all images, in the minibatch 25 | Ri: number of ROIs in image i 26 | K: number of foreground classes. E.g.,there are 80 foreground classes in COCO. 27 | 28 | Naming convention: 29 | 30 | deltas: refers to the 4-d (dx, dy, dw, dh) deltas that parameterize the box2box 31 | transform (see :class:`box_regression.Box2BoxTransform`). 
32 | 33 | pred_class_logits: predicted class scores in [-inf, +inf]; use 34 | softmax(pred_class_logits) to estimate P(class). 35 | 36 | gt_classes: ground-truth classification labels in [0, K], where [0, K) represent 37 | foreground object classes and K represents the background class. 38 | 39 | pred_proposal_deltas: predicted box2box transform deltas for transforming proposals 40 | to detection box predictions. 41 | 42 | gt_proposal_deltas: ground-truth box2box transform deltas 43 | """ 44 | 45 | 46 | def fast_rcnn_inference(boxes, scores, attr_scores, image_shapes, score_thresh, nms_thresh, topk_per_image): 47 | """ 48 | Call `fast_rcnn_inference_single_image` for all images. 49 | 50 | Args: 51 | boxes (list[Tensor]): A list of Tensors of predicted class-specific or class-agnostic 52 | boxes for each image. Element i has shape (Ri, K * 4) if doing 53 | class-specific regression, or (Ri, 4) if doing class-agnostic 54 | regression, where Ri is the number of predicted objects for image i. 55 | This is compatible with the output of :meth:`FastRCNNOutputLayers.predict_boxes`. 56 | scores (list[Tensor]): A list of Tensors of predicted class scores for each image. 57 | Element i has shape (Ri, K + 1), where Ri is the number of predicted objects 58 | for image i. Compatible with the output of :meth:`FastRCNNOutputLayers.predict_probs`. 59 | 60 | New: 61 | attributes (list[Tensor]): A list of Tensors of predicted attributes for each images. 62 | Element i has shape (Ri, K * 14). 63 | 64 | image_shapes (list[tuple]): A list of (width, height) tuples for each image in the batch. 65 | score_thresh (float): Only return detections with a confidence score exceeding this 66 | threshold. 67 | nms_thresh (float): The threshold to use for box non-maximum suppression. Value in [0, 1]. 68 | topk_per_image (int): The number of top scoring detections to return. Set < 0 to return 69 | all detections. 70 | 71 | Returns: 72 | instances: (list[Instances]): A list of N instances, one for each image in the batch, 73 | that stores the topk most confidence detections. 74 | kept_indices: (list[Tensor]): A list of 1D tensor of length of N, each element indicates 75 | the corresponding boxes/scores index in [0, Ri) from the input, for image i. 76 | """ 77 | result_per_image = [ 78 | fast_rcnn_inference_single_image( 79 | boxes_per_image, 80 | scores_per_image, 81 | attributes_per_image, 82 | image_shape, 83 | score_thresh, 84 | nms_thresh, 85 | topk_per_image 86 | ) 87 | for scores_per_image, boxes_per_image, attributes_per_image, image_shape in zip(scores, boxes, attr_scores, image_shapes) 88 | ] 89 | return [x[0] for x in result_per_image], [x[1] for x in result_per_image] 90 | 91 | 92 | def fast_rcnn_inference_single_image( 93 | boxes, scores, attr_scores, image_shape, score_thresh, nms_thresh, topk_per_image): 94 | """ 95 | Single-image inference. Return bounding-box detection results by thresholding 96 | on scores and applying non-maximum suppression (NMS). 97 | 98 | Args: 99 | Same as `fast_rcnn_inference`, but with boxes, scores, and image shapes 100 | per image. 101 | 102 | Returns: 103 | Same as `fast_rcnn_inference`, but for only one image. 
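        Shape sketch (editor's addition; 295 attribute columns are assumed here,
        matching the Attributes structure used elsewhere in this repo):
            boxes:       (Ri, K * 4) or (Ri, 4)
            scores:      (Ri, K + 1)
            attr_scores: (Ri, 295)
        and the returned Instances carries pred_boxes, scores, attr_scores, pred_classes.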
104 | """ 105 | # Make sure boxes and scores don't contain infinite or Nan 106 | valid_mask = torch.isfinite(boxes).all(dim=1) & torch.isfinite(scores).all(dim=1) \ 107 | & torch.isfinite(attr_scores).all(dim=1) 108 | 109 | # Get scores from finite boxes and scores 110 | if not valid_mask.all(): 111 | boxes = boxes[valid_mask] 112 | scores = scores[valid_mask] 113 | attr_scores = attr_scores[valid_mask] 114 | 115 | scores = scores[:, :-1] # Remove background class? 116 | num_bbox_reg_classes = boxes.shape[1] // 4 117 | # Convert to Boxes to use the `clip` function ... 118 | boxes = Boxes(boxes.reshape(-1, 4)) 119 | boxes.clip(image_shape) 120 | boxes = boxes.tensor.view(-1, num_bbox_reg_classes, 4) # R x C x 4 121 | 122 | # If using Attributes class: 123 | # attributes = Attributes(attributes.reshape(-1, 295)) 124 | # attributes = attributes.tensor.view(-1, num_bbox_reg_classes, 295) 125 | 126 | # Filter results based on detection scores 127 | filter_mask = scores > score_thresh # R x K 128 | # R' x 2. First column contains indices of the R predictions; 129 | # Second column contains indices of classes. 130 | filter_inds = filter_mask.nonzero() 131 | 132 | if num_bbox_reg_classes == 1: 133 | boxes = boxes[filter_inds[:, 0], 0] 134 | else: 135 | boxes = boxes[filter_mask] 136 | scores = scores[filter_mask] 137 | 138 | # Apply per-class NMS 139 | keep = batched_nms(boxes, scores, filter_inds[:, 1], nms_thresh) 140 | if topk_per_image >= 0: 141 | keep = keep[:topk_per_image] 142 | boxes, scores, attr_scores, filter_inds, = boxes[keep], scores[keep], attr_scores[keep], filter_inds[keep] 143 | 144 | result = Instances(image_shape) 145 | result.pred_boxes = Boxes(boxes) 146 | result.scores = scores 147 | result.attr_scores = attr_scores 148 | result.pred_classes = filter_inds[:, 1] 149 | return result, filter_inds[:, 0] 150 | 151 | 152 | class AttributesFastRCNNOutputs(FastRCNNOutputs): 153 | """ 154 | A class that stores information about outputs of a Fast R-CNN head. 155 | It provides methods that are used to decode the outputs of a Fast R-CNN head. 156 | """ 157 | 158 | def __init__( 159 | self, 160 | box2box_transform, 161 | pred_class_logits, 162 | pred_attributes, 163 | pred_proposal_deltas, 164 | proposals, 165 | smooth_l1_beta=0, 166 | ): 167 | """ 168 | Args: 169 | box2box_transform (Box2BoxTransform/Box2BoxTransformRotated): 170 | box2box transform instance for proposal-to-detection transformations. 171 | pred_class_logits (Tensor): A tensor of shape (R, K + 1) storing the predicted class 172 | logits for all R predicted object instances. 173 | Each row corresponds to a predicted object instance. 174 | pred_proposal_deltas (Tensor): A tensor of shape (R, K * B) or (R, B) for 175 | class-specific or class-agnostic regression. It stores the predicted deltas that 176 | transform proposals into final box detections. 177 | B is the box dimension (4 or 5). 178 | When B is 4, each row is [dx, dy, dw, dh (, ....)]. 179 | When B is 5, each row is [dx, dy, dw, dh, da (, ....)]. 180 | proposals (list[Instances]): A list of N Instances, where Instances i stores the 181 | proposals for image i, in the field "proposal_boxes". 182 | When training, each Instances must have ground-truth labels 183 | stored in the field "gt_classes" and "gt_boxes". 184 | The total number of all instances must be equal to R. 185 | smooth_l1_beta (float): The transition point between L1 and L2 loss in 186 | the smooth L1 loss function. When set to 0, the loss becomes L1. 
When 187 | set to +inf, the loss becomes constant 0. 188 | """ 189 | self.box2box_transform = box2box_transform 190 | self.num_preds_per_image = [len(p) for p in proposals] 191 | self.pred_class_logits = pred_class_logits 192 | self.pred_attributes = pred_attributes # attribute predictions 193 | self.pred_proposal_deltas = pred_proposal_deltas 194 | self.smooth_l1_beta = smooth_l1_beta 195 | self.image_shapes = [x.image_size for x in proposals] 196 | 197 | if len(proposals): 198 | box_type = type(proposals[0].proposal_boxes) 199 | 200 | # Used if we take the Attributes class 201 | attribute_type = type(proposals[0].gt_attributes) 202 | 203 | # cat(..., dim=0) concatenates over all images in the batch 204 | self.proposals = box_type.cat([p.proposal_boxes for p in proposals]) 205 | assert ( 206 | not self.proposals.tensor.requires_grad 207 | ), "Proposals should not require gradients!" 208 | 209 | # The following fields should exist only when training. 210 | if proposals[0].has("gt_boxes"): 211 | self.gt_boxes = box_type.cat([p.gt_boxes for p in proposals]) 212 | assert proposals[0].has("gt_classes") 213 | self.gt_classes = cat([p.gt_classes for p in proposals], dim=0) 214 | self.gt_attributes = cat([p.gt_attributes for p in proposals], dim=0) 215 | 216 | # use this line if using Attributes class 217 | #self.gt_attributes = attribute_type.cat([p.gt_attributes for p in proposals]) 218 | else: 219 | self.proposals = Boxes(torch.zeros(0, 4, device=self.pred_proposal_deltas.device)) 220 | self._no_instances = len(proposals) == 0 # no instances found 221 | 222 | def _log_accuracy(self): 223 | """ 224 | Log the accuracy metrics to EventStorage. 225 | """ 226 | num_instances = self.gt_classes.numel() 227 | pred_classes = self.pred_class_logits.argmax(dim=1) 228 | bg_class_ind = self.pred_class_logits.shape[1] - 1 229 | 230 | fg_inds = (self.gt_classes >= 0) & (self.gt_classes < bg_class_ind) 231 | num_fg = fg_inds.nonzero().numel() 232 | fg_gt_classes = self.gt_classes[fg_inds] 233 | fg_pred_classes = pred_classes[fg_inds] 234 | 235 | num_false_negative = (fg_pred_classes == bg_class_ind).nonzero().numel() 236 | num_accurate = (pred_classes == self.gt_classes).nonzero().numel() 237 | fg_num_accurate = (fg_pred_classes == fg_gt_classes).nonzero().numel() 238 | 239 | storage = get_event_storage() 240 | if num_instances > 0: 241 | storage.put_scalar("fast_rcnn/cls_accuracy", num_accurate / num_instances) 242 | if num_fg > 0: 243 | storage.put_scalar("fast_rcnn/fg_cls_accuracy", fg_num_accurate / num_fg) 244 | storage.put_scalar("fast_rcnn/false_negative", num_false_negative / num_fg) 245 | 246 | def binary_cross_entropy_loss(self): 247 | """ 248 | Compute the binary cross entropy loss for attribute classification. 249 | 250 | Returns: 251 | scalar Tensor 252 | """ 253 | if self._no_instances: 254 | return 0.0 255 | else: 256 | return F.binary_cross_entropy_with_logits( 257 | self.pred_attributes, 258 | self.gt_attributes, 259 | reduction="mean") 260 | 261 | def losses(self): 262 | """ 263 | Compute the default losses for box head in Fast(er) R-CNN, 264 | with softmax cross entropy loss and smooth L1 loss. 265 | 266 | Returns: 267 | A dict of losses (scalar tensors) containing keys "loss_cls" and "loss_box_reg". 
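            The attribute branch adds a third key, "loss_attr" (see the return value below).

        Example of the attribute term in isolation (editor's sketch with hypothetical
        2 x 295 logits/targets; all-zero logits give log(2) per element):
            logits  = torch.zeros(2, 295)
            targets = torch.zeros(2, 295); targets[0, 3] = 1.0
            F.binary_cross_entropy_with_logits(logits, targets, reduction="mean")
            # -> tensor(0.6931)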
268 | """ 269 | return { 270 | "loss_cls": self.softmax_cross_entropy_loss(), 271 | "loss_box_reg": self.smooth_l1_loss(), 272 | "loss_attr": self.binary_cross_entropy_loss() 273 | } 274 | 275 | class AttributesFastRCNNOutputLayers(FastRCNNOutputLayers): 276 | """ 277 | Two linear layers for predicting Fast R-CNN outputs: 278 | (1) proposal-to-detection box regression deltas 279 | (2) classification scores 280 | (3) attribute scores 281 | """ 282 | 283 | @configurable 284 | def __init__( 285 | self, 286 | input_shape, 287 | *, 288 | box2box_transform, 289 | num_classes, 290 | num_attributes, 291 | cls_agnostic_bbox_reg=False, 292 | smooth_l1_beta=0.0, 293 | test_score_thresh=0.0, 294 | test_nms_thresh=0.5, 295 | test_topk_per_image=100, 296 | ): 297 | """ 298 | NOTE: this interface is experimental. 299 | 300 | Args: 301 | input_shape (ShapeSpec): shape of the input feature to this module 302 | box2box_transform (Box2BoxTransform or Box2BoxTransformRotated): 303 | num_classes (int): number of foreground classes 304 | cls_agnostic_bbox_reg (bool): whether to use class agnostic for bbox regression 305 | smooth_l1_beta (float): transition point from L1 to L2 loss. 306 | test_score_thresh (float): threshold to filter predictions results. 307 | test_nms_thresh (float): NMS threshold for prediction results. 308 | test_topk_per_image (int): number of top predictions to produce per image. 309 | """ 310 | super().__init__(input_shape, box2box_transform=box2box_transform, num_classes=num_classes) 311 | if isinstance(input_shape, int): # some backward compatbility 312 | input_shape = ShapeSpec(channels=input_shape) 313 | input_size = input_shape.channels * (input_shape.width or 1) * (input_shape.height or 1) 314 | # The prediction layer for num_classes foreground classes and one background class 315 | # (hence + 1) 316 | self.cls_score = Linear(input_size, num_classes + 1) 317 | 318 | # Add attribute branch 319 | self.attr_scores = Linear(input_size, num_attributes) 320 | 321 | num_bbox_reg_classes = 1 if cls_agnostic_bbox_reg else num_classes 322 | box_dim = len(box2box_transform.weights) 323 | self.bbox_pred = Linear(input_size, num_bbox_reg_classes * box_dim) 324 | 325 | nn.init.normal_(self.cls_score.weight, std=0.01) 326 | nn.init.normal_(self.attr_scores.weight, std=0.01) 327 | nn.init.normal_(self.bbox_pred.weight, std=0.001) 328 | for l in [self.cls_score, self.attr_scores, self.bbox_pred]: 329 | nn.init.constant_(l.bias, 0) 330 | 331 | self.box2box_transform = box2box_transform 332 | self.smooth_l1_beta = smooth_l1_beta 333 | self.test_score_thresh = test_score_thresh 334 | self.test_nms_thresh = test_nms_thresh 335 | self.test_topk_per_image = test_topk_per_image 336 | 337 | @classmethod 338 | def from_config(cls, cfg, input_shape): 339 | return { 340 | "input_shape": input_shape, 341 | "box2box_transform": Box2BoxTransform(weights=cfg.MODEL.ROI_BOX_HEAD.BBOX_REG_WEIGHTS), 342 | # fmt: off 343 | "num_classes" : cfg.MODEL.ROI_HEADS.NUM_CLASSES, 344 | "num_attributes" : cfg.MODEL.ROI_HEADS.NUM_ATTRIBUTES, 345 | "cls_agnostic_bbox_reg" : cfg.MODEL.ROI_BOX_HEAD.CLS_AGNOSTIC_BBOX_REG, 346 | "smooth_l1_beta" : cfg.MODEL.ROI_BOX_HEAD.SMOOTH_L1_BETA, 347 | "test_score_thresh" : cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST, 348 | "test_nms_thresh" : cfg.MODEL.ROI_HEADS.NMS_THRESH_TEST, 349 | "test_topk_per_image" : cfg.TEST.DETECTIONS_PER_IMAGE 350 | # fmt: on 351 | } 352 | 353 | def forward(self, x): 354 | """ 355 | Returns: 356 | Tensor: Nx(K+1) scores for each box 357 | Tensor: Nx4 or Nx(Kx4) bounding 
box regression deltas. 358 | """ 359 | if x.dim() > 2: 360 | x = torch.flatten(x, start_dim=1) 361 | scores = self.cls_score(x) 362 | attr_scores = self.attr_scores(x) 363 | proposal_deltas = self.bbox_pred(x) 364 | return scores, attr_scores, proposal_deltas 365 | 366 | # TODO: move the implementation to this class. 367 | def losses(self, predictions, proposals): 368 | """ 369 | Args: 370 | predictions: return values of :meth:`forward()`. 371 | proposals (list[Instances]): proposals that match the features 372 | that were used to compute predictions. 373 | """ 374 | scores, attr_scores, proposal_deltas = predictions 375 | return AttributesFastRCNNOutputs( 376 | self.box2box_transform, scores, attr_scores, proposal_deltas, proposals, self.smooth_l1_beta 377 | ).losses() 378 | 379 | def inference(self, predictions, proposals): 380 | """ 381 | Returns: 382 | list[Instances]: same as `fast_rcnn_inference`. 383 | list[Tensor]: same as `fast_rcnn_inference`. 384 | """ 385 | boxes = self.predict_boxes(predictions, proposals) 386 | scores = self.predict_probs(predictions, proposals) 387 | attr_scores = self.predict_attribute_probs(predictions, proposals) 388 | image_shapes = [x.image_size for x in proposals] 389 | return fast_rcnn_inference( 390 | boxes, 391 | scores, 392 | attr_scores, 393 | image_shapes, 394 | self.test_score_thresh, 395 | self.test_nms_thresh, 396 | self.test_topk_per_image, 397 | ) 398 | 399 | def predict_boxes_for_gt_classes(self, predictions, proposals): 400 | """ 401 | Returns: 402 | list[Tensor]: A list of Tensors of predicted boxes for GT classes in case of 403 | class-specific box head. Element i of the list has shape (Ri, B), where Ri is 404 | the number of predicted objects for image i and B is the box dimension (4 or 5) 405 | """ 406 | if not len(proposals): 407 | return [] 408 | scores, _, proposal_deltas = predictions 409 | proposal_boxes = [p.proposal_boxes for p in proposals] 410 | proposal_boxes = proposal_boxes[0].cat(proposal_boxes).tensor 411 | N, B = proposal_boxes.shape 412 | predict_boxes = apply_deltas_broadcast( 413 | self.box2box_transform, proposal_deltas, proposal_boxes 414 | ) # Nx(KxB) 415 | 416 | K = predict_boxes.shape[1] // B 417 | if K > 1: 418 | gt_classes = torch.cat([p.gt_classes for p in proposals], dim=0) 419 | # Some proposals are ignored or have a background class. Their gt_classes 420 | # cannot be used as index. 421 | gt_classes = gt_classes.clamp_(0, K - 1) 422 | 423 | predict_boxes = predict_boxes.view(N, K, B)[ 424 | torch.arange(N, dtype=torch.long, device=predict_boxes.device), gt_classes 425 | ] 426 | num_prop_per_image = [len(p) for p in proposals] 427 | return predict_boxes.split(num_prop_per_image) 428 | 429 | def predict_boxes(self, predictions, proposals): 430 | """ 431 | Returns: 432 | list[Tensor]: A list of Tensors of predicted class-specific or class-agnostic boxes 433 | for each image. 
Element i has shape (Ri, K * B) or (Ri, B), where Ri is 434 | the number of predicted objects for image i and B is the box dimension (4 or 5) 435 | """ 436 | if not len(proposals): 437 | return [] 438 | _, _, proposal_deltas = predictions 439 | num_prop_per_image = [len(p) for p in proposals] 440 | proposal_boxes = [p.proposal_boxes for p in proposals] 441 | proposal_boxes = proposal_boxes[0].cat(proposal_boxes).tensor 442 | predict_boxes = self.box2box_transform.apply_deltas( 443 | proposal_deltas, proposal_boxes 444 | ) # Nx(KxB) 445 | return predict_boxes.split(num_prop_per_image) 446 | 447 | def predict_probs(self, predictions, proposals): 448 | """ 449 | Returns: 450 | list[Tensor]: A list of Tensors of predicted class probabilities for each image. 451 | Element i has shape (Ri, K + 1), where Ri is the number of predicted objects 452 | for image i. 453 | """ 454 | scores, _, _ = predictions 455 | num_inst_per_image = [len(p) for p in proposals] 456 | probs = F.softmax(scores, dim=-1) 457 | return probs.split(num_inst_per_image, dim=0) 458 | 459 | def predict_attribute_probs(self, predictions, proposals): 460 | """ 461 | Returns: 462 | list[Tensor]: A list of Tensors of predicted class probabilities for each image. 463 | Element i has shape (Ri, K + 1), where Ri is the number of predicted objects 464 | for image i. 465 | """ 466 | _, attr_scores, _ = predictions 467 | num_inst_per_image = [len(p) for p in proposals] 468 | probs = torch.sigmoid(attr_scores) 469 | return probs.split(num_inst_per_image, dim=0) 470 | -------------------------------------------------------------------------------- /imaterialist/modeling/roi_heads/roi_heads.py: -------------------------------------------------------------------------------- 1 | # Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved 2 | import logging 3 | from typing import Dict 4 | 5 | from detectron2.layers import ShapeSpec 6 | from detectron2.utils.registry import Registry 7 | from detectron2.modeling.roi_heads.roi_heads import StandardROIHeads 8 | from detectron2.modeling.poolers import ROIPooler 9 | from detectron2.modeling.roi_heads.box_head import build_box_head 10 | 11 | from .attributes_head import AttributesFastRCNNOutputLayers 12 | 13 | ROI_HEADS_REGISTRY = Registry("ROI_HEADS") 14 | ROI_HEADS_REGISTRY.__doc__ = """ 15 | Registry for ROI heads in a generalized R-CNN model. 16 | ROIHeads take feature maps and region proposals, and 17 | perform per-region computation. 18 | 19 | The registered object will be called with `obj(cfg, input_shape)`. 20 | The call is expected to return an :class:`ROIHeads`. 21 | """ 22 | 23 | logger = logging.getLogger(__name__) 24 | 25 | 26 | def build_roi_heads(cfg, input_shape): 27 | """ 28 | Build ROIHeads defined by `cfg.MODEL.ROI_HEADS.NAME`. 29 | """ 30 | name = cfg.MODEL.ROI_HEADS.NAME 31 | return ROI_HEADS_REGISTRY.get(name)(cfg, input_shape) 32 | 33 | 34 | @ROI_HEADS_REGISTRY.register() 35 | class StandardROIHeads(StandardROIHeads): 36 | """ 37 | It's "standard" in a sense that there is no ROI transform sharing 38 | or feature sharing between tasks. 39 | The cropped rois go to separate branches (boxes and masks) directly. 40 | This way, it is easier to make separate abstractions for different branches. 41 | 42 | This class is used by most models, such as FPN and C5. 43 | To implement more models, you can subclass it and implement a different 44 | :meth:`forward()` or a head. 
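    This variant differs from the upstream class mainly in :meth:`_init_box_head`,
    which swaps in :class:`AttributesFastRCNNOutputLayers` as the box predictor.

    Usage sketch (editor's addition; ``cfg`` is assumed to carry the extra
    ``MODEL.ROI_HEADS.NUM_ATTRIBUTES`` key consumed by the attribute head, and
    ``backbone`` is a detectron2 backbone):
        cfg.MODEL.ROI_HEADS.NAME = "StandardROIHeads"
        roi_heads = build_roi_heads(cfg, backbone.output_shape())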
45 | """ 46 | 47 | def __init__(self, cfg, input_shape: Dict[str, ShapeSpec]): 48 | super().__init__(cfg, input_shape) 49 | self.in_features = cfg.MODEL.ROI_HEADS.IN_FEATURES 50 | 51 | self._init_box_head(cfg, input_shape) 52 | 53 | @classmethod 54 | def _init_box_head(cls, cfg, input_shape): 55 | # fmt: off 56 | in_features = cfg.MODEL.ROI_HEADS.IN_FEATURES 57 | pooler_resolution = cfg.MODEL.ROI_BOX_HEAD.POOLER_RESOLUTION 58 | pooler_scales = tuple(1.0 / input_shape[k].stride for k in in_features) 59 | sampling_ratio = cfg.MODEL.ROI_BOX_HEAD.POOLER_SAMPLING_RATIO 60 | pooler_type = cfg.MODEL.ROI_BOX_HEAD.POOLER_TYPE 61 | # fmt: on 62 | 63 | # If StandardROIHeads is applied on multiple feature maps (as in FPN), 64 | # then we share the same predictors and therefore the channel counts must be the same 65 | in_channels = [input_shape[f].channels for f in in_features] 66 | # Check all channel counts are equal 67 | assert len(set(in_channels)) == 1, in_channels 68 | in_channels = in_channels[0] 69 | 70 | box_pooler = ROIPooler( 71 | output_size=pooler_resolution, 72 | scales=pooler_scales, 73 | sampling_ratio=sampling_ratio, 74 | pooler_type=pooler_type, 75 | ) 76 | # Here we split "box head" and "box predictor", which is mainly due to historical reasons. 77 | # They are used together so the "box predictor" layers should be part of the "box head". 78 | # New subclasses of ROIHeads do not need "box predictor"s. 79 | box_head = build_box_head( 80 | cfg, ShapeSpec(channels=in_channels, height=pooler_resolution, width=pooler_resolution) 81 | ) 82 | box_predictor = AttributesFastRCNNOutputLayers(cfg, box_head.output_shape) 83 | return { 84 | "box_in_features": in_features, 85 | "box_pooler": box_pooler, 86 | "box_head": box_head, 87 | "box_predictor": box_predictor, 88 | } 89 | -------------------------------------------------------------------------------- /imaterialist/submission_utils/resize_longest_edge.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | from PIL import Image 3 | # Rle helper functions 4 | from matplotlib.pyplot import imsave 5 | from environs import Env 6 | from pathlib import Path 7 | from tqdm import tqdm 8 | import os 9 | import glob 10 | import math 11 | env = Env() 12 | env.read_env() 13 | 14 | path_data_interim = Path(env("path_interim")) 15 | path_test_data = Path(env("path_test")) 16 | path_output = Path(env("path_output")) 17 | 18 | def downscale_folder(path_test_data="/home/dyt811/Git/cvnnig/data_imaterialist2020/raw/test"): 19 | """ 20 | Take the entire test data set, try to downscale the long edge to 1024. 
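    For example (editor's note), a 3000 x 2000 (W x H) image becomes 1024 x 682,
    since floor(2000 * 1024 / 3000) = 682; portrait images are rescaled symmetrically.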
21 | :param path_test_data: 22 | :return: 23 | """ 24 | list_files = glob.glob(f"{path_test_data}/*.jpg") 25 | for file in tqdm(list_files): 26 | downscale_image(file, path_data_interim) 27 | 28 | def downscale_image(img, path_data_interim = path_data_interim, max_size=1024): 29 | ''' 30 | Adaptive funciton to first DOWNSCALE the image before running RLE 31 | # Source: https://stackoverflow.com/a/28453021 32 | img: numpy array, 1 - mask, 0 - background 33 | Returns run length as string formated 34 | ''' 35 | #img is a tensor 36 | img_data = Image.open(img) 37 | img_array = np.array(img_data) 38 | pil_image = Image.fromarray(img_array) 39 | width_current = pil_image.size[0] 40 | height_current = pil_image.size[1] 41 | 42 | longest_edge = max(width_current, height_current) 43 | 44 | # Longest edge MUST be 1024, even if smaller or larger images 45 | if (width_current > height_current): 46 | new_width = max_size 47 | scaled_height = max_size / float(width_current) * height_current 48 | new_height = int(math.floor(scaled_height)) 49 | else: 50 | scale_width = max_size / float(height_current) * width_current 51 | new_width = int(math.floor(scale_width)) 52 | new_height = max_size 53 | # Always resizing. 54 | pil_image = pil_image.resize((new_width, new_height), Image.NEAREST) 55 | 56 | image_array = np.array(pil_image) 57 | 58 | imsave(f"{path_data_interim}/resized_test/{Path(img).stem}.jpg", image_array) 59 | 60 | if __name__ =="__main__": 61 | downscale_folder() -------------------------------------------------------------------------------- /imaterialist/submission_utils/test_csv_write.py: -------------------------------------------------------------------------------- 1 | import csv 2 | from typing import List 3 | import pickle 4 | 5 | def filter_csv_write(list_list_dict: List[List[dict]], path_csv): 6 | """ 7 | Write the list of csv predictions into CSV. 8 | :param list_dict: 9 | :param path_csv: 10 | :return: 11 | """ 12 | # Flatten the two list. 13 | # Feturn item if they the encoded pixel is not flat. 14 | flat_list = [] 15 | # Iterate through image list. 16 | for sublist in list_list_dict: 17 | # Iterate through mask list 18 | for item in sublist: 19 | # If the EncodedPixel is empty, skip. 20 | if item["EncodedPixels"] == "": 21 | continue 22 | else: 23 | flat_list.append(item) 24 | 25 | # With blanks. 
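    # Editor's note: the expected input is a per-image list of per-mask dicts; a
    # hypothetical two-image example (field names mirror the columns used elsewhere
    # in this repo):
    #     preds = [
    #         [{"ImageId": "abc", "EncodedPixels": "1 3 10 2", "ClassId": 0, "AttributesIds": ""}],
    #         [{"ImageId": "def", "EncodedPixels": "",         "ClassId": 5, "AttributesIds": "12 33"}],
    #     ]
    #     filter_csv_write(preds, "submission.csv")   # the empty-mask row for "def" is skipped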
26 | # flat_list = [item for sublist in list_list_dict for item in sublist] 27 | 28 | # Source: https://stackoverflow.com/questions/3086973/how-do-i-convert-this-list-of-dictionaries-to-a-csv-file 29 | keys = flat_list[0].keys() 30 | with open(path_csv, 'w') as output_file: 31 | # quote char prevent dict_writer to quote string that contain separtor: , 32 | # The attributes are separated by COMMA, and must be quoted, by using space, 33 | dict_writer = csv.DictWriter(output_file, keys) 34 | dict_writer.writeheader() 35 | dict_writer.writerows(flat_list) 36 | 37 | def /test_filter_csvwrite(): 38 | # Load masks 39 | data = pickle.load(open("/home/dyt811/Git/cvnnig/data_imaterialist2020/2020-05-25T014759_NSM0.75Prediction/result_file.pkl", 'rb')) 40 | filter_csv_write(data, "2020-05-26T005749_csvBlank.csv") -------------------------------------------------------------------------------- /notebooks/03-Create-dataset.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": 1, 6 | "metadata": {}, 7 | "outputs": [], 8 | "source": [ 9 | "import json\n", 10 | "import logging\n", 11 | "import pickle\n", 12 | "import numpy as np\n", 13 | "import pandas as pd\n", 14 | "from pathlib import Path\n", 15 | "from sklearn import preprocessing\n", 16 | "import sys\n", 17 | "from environs import Env\n", 18 | "import torch\n", 19 | "import detectron2" 20 | ] 21 | }, 22 | { 23 | "cell_type": "code", 24 | "execution_count": 2, 25 | "metadata": {}, 26 | "outputs": [ 27 | { 28 | "data": { 29 | "text/plain": [ 30 | "'0.1.3'" 31 | ] 32 | }, 33 | "execution_count": 2, 34 | "metadata": {}, 35 | "output_type": "execute_result" 36 | } 37 | ], 38 | "source": [ 39 | "detectron2.__version__" 40 | ] 41 | }, 42 | { 43 | "cell_type": "code", 44 | "execution_count": 2, 45 | "metadata": {}, 46 | "outputs": [], 47 | "source": [ 48 | "sys.path.append('../')\n", 49 | "\n", 50 | "env = Env()\n", 51 | "env.read_env()\n", 52 | "\n", 53 | "# Get training dataframe\n", 54 | "path_data = Path(env(\"path_raw\"))\n", 55 | "path_image = path_data / \"train/\"\n", 56 | "path_data_interim = Path(env(\"path_interim\"))" 57 | ] 58 | }, 59 | { 60 | "cell_type": "code", 61 | "execution_count": 3, 62 | "metadata": {}, 63 | "outputs": [], 64 | "source": [ 65 | "from imaterialist.data.datasets.make_dataset import load_dataset_into_dataframes, create_datadict, attr_str_to_list " 66 | ] 67 | }, 68 | { 69 | "cell_type": "code", 70 | "execution_count": 4, 71 | "metadata": {}, 72 | "outputs": [ 73 | { 74 | "name": "stderr", 75 | "output_type": "stream", 76 | "text": [ 77 | "../imaterialist/data/datasets/make_dataset.py:76: SettingWithCopyWarning: \n", 78 | "A value is trying to be set on a copy of a slice from a DataFrame\n", 79 | "\n", 80 | "See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy\n", 81 | " df['AttributesIds'][index] = row['AttributesIds'].split(',')\n", 82 | "../imaterialist/data/datasets/make_dataset.py:79: SettingWithCopyWarning: \n", 83 | "A value is trying to be set on a copy of a slice from a DataFrame\n", 84 | "\n", 85 | "See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy\n", 86 | " df['AttributesIds'][index] = [int(x) for x in df['AttributesIds'][index]]\n", 87 | "../imaterialist/data/datasets/make_dataset.py:85: SettingWithCopyWarning: \n", 88 | "A value is 
trying to be set on a copy of a slice from a DataFrame\n", 89 | "\n", 90 | "See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy\n", 91 | " df['AttributesIds'][index] = lb.transform(df['AttributesIds'][index]).sum(axis=0)\n", 92 | "../imaterialist/data/datasets/make_dataset.py:83: SettingWithCopyWarning: \n", 93 | "A value is trying to be set on a copy of a slice from a DataFrame\n", 94 | "\n", 95 | "See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy\n", 96 | " df['AttributesIds'][index] = [999]\n" 97 | ] 98 | } 99 | ], 100 | "source": [ 101 | "data_full, df_attributes, _ = load_dataset_into_dataframes(n_cases=500)\n", 102 | "datadic_full = create_datadict(data_full, df_attributes)" 103 | ] 104 | }, 105 | { 106 | "cell_type": "code", 107 | "execution_count": 5, 108 | "metadata": {}, 109 | "outputs": [], 110 | "source": [ 111 | "n_train = 400\n", 112 | "n_test = 100\n", 113 | "\n", 114 | "datadic_train = datadic_full[:n_train].copy()\n", 115 | "datadic_val = datadic_full[-n_test:].copy()" 116 | ] 117 | }, 118 | { 119 | "cell_type": "code", 120 | "execution_count": 6, 121 | "metadata": {}, 122 | "outputs": [ 123 | { 124 | "data": { 125 | "text/html": [ 126 | "
"[HTML rendering of datadic_train.sample(10) was stripped of its tags during extraction and is omitted here; the same rows appear in the text/plain output below]\n
" 290 | ], 291 | "text/plain": [ 292 | " ImageId \\\n", 293 | "322 /home/julien/data-science/kaggle/imaterialist/... \n", 294 | "307 /home/julien/data-science/kaggle/imaterialist/... \n", 295 | "136 /home/julien/data-science/kaggle/imaterialist/... \n", 296 | "370 /home/julien/data-science/kaggle/imaterialist/... \n", 297 | "288 /home/julien/data-science/kaggle/imaterialist/... \n", 298 | "167 /home/julien/data-science/kaggle/imaterialist/... \n", 299 | "53 /home/julien/data-science/kaggle/imaterialist/... \n", 300 | "75 /home/julien/data-science/kaggle/imaterialist/... \n", 301 | "350 /home/julien/data-science/kaggle/imaterialist/... \n", 302 | "265 /home/julien/data-science/kaggle/imaterialist/... \n", 303 | "\n", 304 | " EncodedPixels Height Width \\\n", 305 | "322 1131858 13 1131917 36 1133630 42 1133718 40 11... 1800 1200 \n", 306 | "307 212864 7 213881 21 214899 31 215920 35 216943 ... 1024 683 \n", 307 | "136 179500 6 180515 16 181534 22 182557 25 183579 ... 1024 680 \n", 308 | "370 5669654 1 5673613 1 5677571 3 5681530 4 568548... 3960 2640 \n", 309 | "288 896338 1 897937 4 899537 7 901137 9 902736 13 ... 1600 1067 \n", 310 | "167 247017 4 248399 11 249780 16 251162 18 252543 ... 1383 900 \n", 311 | "53 701292 5 703597 15 705903 25 708209 34 710515 ... 2310 1536 \n", 312 | "75 1590945 1 1593942 4 1596939 7 1599938 8 160293... 3000 2000 \n", 313 | "350 243646 2 244218 6 244714 16 244743 3 244760 5 ... 576 1024 \n", 314 | "265 1660869 5 1663151 15 1665436 21 1667723 22 167... 2287 1522 \n", 315 | "\n", 316 | " ClassId AttributesIds x0 y0 \\\n", 317 | "322 23 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ... 628 1418 \n", 318 | "307 23 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ... 207 878 \n", 319 | "136 10 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ... 175 163 \n", 320 | "370 31 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ... 1431 1546 \n", 321 | "288 33 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ... 560 282 \n", 322 | "167 4 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ... 178 337 \n", 323 | "53 6 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ... 303 1181 \n", 324 | "75 37 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ... 530 929 \n", 325 | "350 24 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ... 422 217 \n", 326 | "265 33 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ... 
726 499 \n", 327 | "\n", 328 | " x1 y1 \n", 329 | "322 692 1583 \n", 330 | "307 257 961 \n", 331 | "136 530 956 \n", 332 | "370 2057 2923 \n", 333 | "288 631 349 \n", 334 | "167 682 984 \n", 335 | "53 824 2136 \n", 336 | "75 741 1003 \n", 337 | "350 456 575 \n", 338 | "265 887 571 " 339 | ] 340 | }, 341 | "execution_count": 6, 342 | "metadata": {}, 343 | "output_type": "execute_result" 344 | } 345 | ], 346 | "source": [ 347 | "datadic_train.sample(10)" 348 | ] 349 | }, 350 | { 351 | "cell_type": "code", 352 | "execution_count": 7, 353 | "metadata": {}, 354 | "outputs": [ 355 | { 356 | "name": "stdout", 357 | "output_type": "stream", 358 | "text": [ 359 | "\n", 360 | "RangeIndex: 400 entries, 0 to 399\n", 361 | "Data columns (total 10 columns):\n", 362 | " # Column Non-Null Count Dtype \n", 363 | "--- ------ -------------- ----- \n", 364 | " 0 ImageId 400 non-null object\n", 365 | " 1 EncodedPixels 400 non-null object\n", 366 | " 2 Height 400 non-null int64 \n", 367 | " 3 Width 400 non-null int64 \n", 368 | " 4 ClassId 400 non-null int64 \n", 369 | " 5 AttributesIds 400 non-null object\n", 370 | " 6 x0 400 non-null int64 \n", 371 | " 7 y0 400 non-null int64 \n", 372 | " 8 x1 400 non-null int64 \n", 373 | " 9 y1 400 non-null int64 \n", 374 | "dtypes: int64(7), object(3)\n", 375 | "memory usage: 31.4+ KB\n" 376 | ] 377 | } 378 | ], 379 | "source": [ 380 | "\n", 381 | "datadic_train.info()" 382 | ] 383 | }, 384 | { 385 | "cell_type": "code", 386 | "execution_count": 20, 387 | "metadata": {}, 388 | "outputs": [], 389 | "source": [ 390 | "import pickle" 391 | ] 392 | }, 393 | { 394 | "cell_type": "code", 395 | "execution_count": 10, 396 | "metadata": {}, 397 | "outputs": [], 398 | "source": [ 399 | "datadict_train = pickle.load(open(path_data_interim / 'imaterialist_train_multihot_n=400.p', 'rb'))" 400 | ] 401 | }, 402 | { 403 | "cell_type": "code", 404 | "execution_count": null, 405 | "metadata": {}, 406 | "outputs": [], 407 | "source": [] 408 | } 409 | ], 410 | "metadata": { 411 | "kernelspec": { 412 | "display_name": "Python (imaterialist)", 413 | "language": "python", 414 | "name": "imaterialist" 415 | }, 416 | "language_info": { 417 | "codemirror_mode": { 418 | "name": "ipython", 419 | "version": 3 420 | }, 421 | "file_extension": ".py", 422 | "mimetype": "text/x-python", 423 | "name": "python", 424 | "nbconvert_exporter": "python", 425 | "pygments_lexer": "ipython3", 426 | "version": "3.8.3" 427 | } 428 | }, 429 | "nbformat": 4, 430 | "nbformat_minor": 4 431 | } 432 | -------------------------------------------------------------------------------- /notebooks/06-Attribute-inference.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": 1, 6 | "metadata": {}, 7 | "outputs": [], 8 | "source": [ 9 | "%reload_ext autoreload\n", 10 | "%autoreload 2\n", 11 | "%matplotlib inline" 12 | ] 13 | }, 14 | { 15 | "cell_type": "code", 16 | "execution_count": 2, 17 | "metadata": {}, 18 | "outputs": [], 19 | "source": [ 20 | "import numpy as np \n", 21 | "import pandas as pd \n", 22 | "import collections\n", 23 | "import torch\n", 24 | "import feather\n", 25 | "import json\n", 26 | "import os\n", 27 | "import cv2\n", 28 | "import random\n", 29 | "import gc\n", 30 | "import pycocotools\n", 31 | "\n", 32 | "from tqdm import tqdm\n", 33 | "import matplotlib.pyplot as plt\n", 34 | "import PIL\n", 35 | "from PIL import Image, ImageFile\n", 36 | "from torch.utils.data import Dataset, DataLoader\n", 37 | 
"\n", 38 | "from pathlib import Path\n", 39 | "from environs import Env\n", 40 | "\n", 41 | "from detectron2 import model_zoo\n", 42 | "from detectron2.structures import BoxMode\n", 43 | "from detectron2.engine import DefaultPredictor, default_argument_parser, default_setup\n", 44 | "from detectron2.config import get_cfg\n", 45 | "from detectron2.utils.visualizer import Visualizer\n", 46 | "from detectron2.data import MetadataCatalog\n", 47 | "from detectron2.utils.logger import setup_logger\n", 48 | "\n", 49 | "import sys\n", 50 | "sys.path.append('../')\n", 51 | "\n", 52 | "from iMaterialist2020.imaterialist.data.datasets.coco import register_datadict\n", 53 | "from iMaterialist2020.imaterialist.config import add_imaterialist_config\n", 54 | "from iMaterialist2020.imaterialist.modeling import build_model\n", 55 | "\n", 56 | "env = Env()\n", 57 | "env.read_env()" 58 | ] 59 | }, 60 | { 61 | "cell_type": "code", 62 | "execution_count": 3, 63 | "metadata": {}, 64 | "outputs": [], 65 | "source": [ 66 | "# Get training dataframe\n", 67 | "data_dir = Path(env('path_raw'))\n", 68 | "image_dir = Path(env('path_images'))\n", 69 | "df = pd.read_csv(data_dir/'train.csv')\n", 70 | "\n", 71 | "# Load modified df for Detectron2 dataset dict \n", 72 | "df_detectron = pd.read_feather('../data/interim/imaterialist_train_multihot_n=4000.feather') \n", 73 | "\n", 74 | "# Get label descriptions\n", 75 | "with open(data_dir/'label_descriptions.json', 'r') as file:\n", 76 | " label_desc = json.load(file)\n", 77 | "df_categories = pd.DataFrame(label_desc['categories'])\n", 78 | "df_attributes = pd.DataFrame(label_desc['attributes'])" 79 | ] 80 | }, 81 | { 82 | "cell_type": "code", 83 | "execution_count": 4, 84 | "metadata": {}, 85 | "outputs": [ 86 | { 87 | "output_type": "execute_result", 88 | "data": { 89 | "text/plain": "'/home/nasty/imaterialist2020/data/raw/train/00000663ed1ff0c4e0132b9b9ac53f6e.jpg'" 90 | }, 91 | "metadata": {}, 92 | "execution_count": 4 93 | } 94 | ], 95 | "source": [ 96 | "\n", 97 | "df_detectron.ImageId[0]" 98 | ] 99 | }, 100 | { 101 | "cell_type": "code", 102 | "execution_count": 5, 103 | "metadata": {}, 104 | "outputs": [ 105 | { 106 | "output_type": "error", 107 | "ename": "TypeError", 108 | "evalue": "Image data of dtype \u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mplt\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mimshow\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mdf_detectron\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mImageId\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", 113 | "\u001b[0;32m~/anaconda3/envs/imaterialist/lib/python3.8/site-packages/matplotlib/pyplot.py\u001b[0m in \u001b[0;36mimshow\u001b[0;34m(X, cmap, norm, aspect, interpolation, alpha, vmin, vmax, origin, extent, shape, filternorm, filterrad, imlim, resample, url, data, **kwargs)\u001b[0m\n\u001b[1;32m 2676\u001b[0m \u001b[0mfilterrad\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;36m4.0\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mimlim\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mcbook\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mdeprecation\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_deprecated_parameter\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 2677\u001b[0m resample=None, url=None, *, data=None, **kwargs):\n\u001b[0;32m-> 2678\u001b[0;31m __ret = gca().imshow(\n\u001b[0m\u001b[1;32m 2679\u001b[0m \u001b[0mX\u001b[0m\u001b[0;34m,\u001b[0m 
\u001b[0mcmap\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mcmap\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mnorm\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mnorm\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0maspect\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0maspect\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 2680\u001b[0m \u001b[0minterpolation\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0minterpolation\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0malpha\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0malpha\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mvmin\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mvmin\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", 114 | "\u001b[0;32m~/anaconda3/envs/imaterialist/lib/python3.8/site-packages/matplotlib/__init__.py\u001b[0m in \u001b[0;36minner\u001b[0;34m(ax, data, *args, **kwargs)\u001b[0m\n\u001b[1;32m 1597\u001b[0m \u001b[0;32mdef\u001b[0m \u001b[0minner\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0max\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m*\u001b[0m\u001b[0margs\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mdata\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m**\u001b[0m\u001b[0mkwargs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 1598\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0mdata\u001b[0m \u001b[0;32mis\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m-> 1599\u001b[0;31m \u001b[0;32mreturn\u001b[0m \u001b[0mfunc\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0max\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m*\u001b[0m\u001b[0mmap\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0msanitize_sequence\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0margs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m**\u001b[0m\u001b[0mkwargs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 1600\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 1601\u001b[0m \u001b[0mbound\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mnew_sig\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mbind\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0max\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m*\u001b[0m\u001b[0margs\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m**\u001b[0m\u001b[0mkwargs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", 115 | "\u001b[0;32m~/anaconda3/envs/imaterialist/lib/python3.8/site-packages/matplotlib/cbook/deprecation.py\u001b[0m in \u001b[0;36mwrapper\u001b[0;34m(*args, **kwargs)\u001b[0m\n\u001b[1;32m 367\u001b[0m \u001b[0;34mf\"%(removal)s. 
If any parameter follows {name!r}, they \"\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 368\u001b[0m f\"should be pass as keyword, not positionally.\")\n\u001b[0;32m--> 369\u001b[0;31m \u001b[0;32mreturn\u001b[0m \u001b[0mfunc\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m*\u001b[0m\u001b[0margs\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m**\u001b[0m\u001b[0mkwargs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 370\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 371\u001b[0m \u001b[0;32mreturn\u001b[0m \u001b[0mwrapper\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", 116 | "\u001b[0;32m~/anaconda3/envs/imaterialist/lib/python3.8/site-packages/matplotlib/cbook/deprecation.py\u001b[0m in \u001b[0;36mwrapper\u001b[0;34m(*args, **kwargs)\u001b[0m\n\u001b[1;32m 367\u001b[0m \u001b[0;34mf\"%(removal)s. If any parameter follows {name!r}, they \"\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 368\u001b[0m f\"should be pass as keyword, not positionally.\")\n\u001b[0;32m--> 369\u001b[0;31m \u001b[0;32mreturn\u001b[0m \u001b[0mfunc\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m*\u001b[0m\u001b[0margs\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m**\u001b[0m\u001b[0mkwargs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 370\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 371\u001b[0m \u001b[0;32mreturn\u001b[0m \u001b[0mwrapper\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", 117 | "\u001b[0;32m~/anaconda3/envs/imaterialist/lib/python3.8/site-packages/matplotlib/axes/_axes.py\u001b[0m in \u001b[0;36mimshow\u001b[0;34m(self, X, cmap, norm, aspect, interpolation, alpha, vmin, vmax, origin, extent, shape, filternorm, filterrad, imlim, resample, url, **kwargs)\u001b[0m\n\u001b[1;32m 5677\u001b[0m resample=resample, **kwargs)\n\u001b[1;32m 5678\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m-> 5679\u001b[0;31m \u001b[0mim\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mset_data\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mX\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 5680\u001b[0m \u001b[0mim\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mset_alpha\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0malpha\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 5681\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0mim\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mget_clip_path\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;32mis\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", 118 | "\u001b[0;32m~/anaconda3/envs/imaterialist/lib/python3.8/site-packages/matplotlib/image.py\u001b[0m in \u001b[0;36mset_data\u001b[0;34m(self, A)\u001b[0m\n\u001b[1;32m 682\u001b[0m if (self._A.dtype != np.uint8 and\n\u001b[1;32m 683\u001b[0m not np.can_cast(self._A.dtype, float, \"same_kind\")):\n\u001b[0;32m--> 684\u001b[0;31m raise TypeError(\"Image data of dtype {} cannot be converted to \"\n\u001b[0m\u001b[1;32m 685\u001b[0m \"float\".format(self._A.dtype))\n\u001b[1;32m 686\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n", 119 | "\u001b[0;31mTypeError\u001b[0m: Image data of dtype .container { width:100% !important; }" 16 | ], 17 | "text/plain": [ 18 | "" 19 | ] 20 | }, 21 | "metadata": {}, 22 | "output_type": "display_data" 23 | } 24 | ], 25 | "source": [ 26 | "import pickle\n", 27 | "import os\n", 
28 | "from pathlib import Path\n", 29 | "from PIL import Image as PILImage\n", 30 | "from IPython.display import Image \n", 31 | "# Notebook widget for interactive exploration\n", 32 | "import ipywidgets as widgets\n", 33 | "from ipywidgets import interact, interact_manual\n", 34 | "import matplotlib.pyplot as plt\n", 35 | "from matplotlib.pyplot import imshow\n", 36 | "import cv2 as cv\n", 37 | "from IPython.core.display import display, HTML\n", 38 | "display(HTML(\"\"))\n", 39 | "\n", 40 | "from PythonUtils.rle_decoding import RLE_decoding\n", 41 | "\n", 42 | "import numpy as np\n", 43 | "import pandas as pd\n", 44 | "from dotenv import load_dotenv, find_dotenv\n", 45 | "from src.data.csv_label_read import pandaread_image_labels\n", 46 | "from dotenv import load_dotenv, find_dotenv\n", 47 | "load_dotenv(find_dotenv())\n", 48 | "PATH_DATA_RAW = os.getenv(\"PATH_DATA_RAW\")\n" 49 | ] 50 | }, 51 | { 52 | "cell_type": "code", 53 | "execution_count": 3, 54 | "metadata": { 55 | "pycharm": { 56 | "is_executing": true 57 | } 58 | }, 59 | "outputs": [], 60 | "source": [ 61 | "dataframe = pandaread_image_labels()" 62 | ] 63 | }, 64 | { 65 | "cell_type": "code", 66 | "execution_count": 4, 67 | "metadata": { 68 | "pycharm": { 69 | "is_executing": true 70 | } 71 | }, 72 | "outputs": [], 73 | "source": [ 74 | "# Interactively Explorer the DataFrame\n", 75 | "import dtale\n", 76 | "d = dtale.show(dataframe)\n", 77 | "d.open_browser()" 78 | ] 79 | }, 80 | { 81 | "cell_type": "code", 82 | "execution_count": 5, 83 | "metadata": { 84 | "pycharm": { 85 | "is_executing": true 86 | }, 87 | "scrolled": false 88 | }, 89 | "outputs": [ 90 | { 91 | "data": { 92 | "application/vnd.jupyter.widget-view+json": { 93 | "model_id": "74d20e20c9d34601bcc05a3a43793a43", 94 | "version_major": 2, 95 | "version_minor": 0 96 | }, 97 | "text/plain": [ 98 | "interactive(children=(IntSlider(value=166700, description='index_label', max=333400), Output()), _dom_classes=…" 99 | ] 100 | }, 101 | "metadata": {}, 102 | "output_type": "display_data" 103 | } 104 | ], 105 | "source": [ 106 | "@interact\n", 107 | "def show_count(index_label=(0, len(dataframe)-1)):\n", 108 | " df_label = dataframe.loc[index_label, :] \n", 109 | " print(df_label.ClassId)\n", 110 | " #print(df_label.EncodedPixels)\n", 111 | " \n", 112 | " (order, length) = RLE_decoding.parse_order_length_string(df_label.EncodedPixels)\n", 113 | " \n", 114 | " test = RLE_decoding(order, length, x_max=df_label.Width, y_max=df_label.Height, y_encoded_first=False)\n", 115 | " test.decode()\n", 116 | " \n", 117 | " path_original = Path(PATH_DATA_RAW) / f\"train/{df_label.ImageId}.jpg\"\n", 118 | " \n", 119 | " image_original = PILImage.open(path_original)\n", 120 | " \n", 121 | " scale = 4\n", 122 | " \n", 123 | " \n", 124 | " original_resized = image_original.resize((image_original.size[0]//scale,image_original.size[1]//scale ))\n", 125 | " \n", 126 | " from matplotlib import rcParams\n", 127 | "\n", 128 | " # figure size in inches optional\n", 129 | " rcParams['figure.figsize'] = 11 ,8\n", 130 | " \n", 131 | " mask = test.get_mask()\n", 132 | " mask_resized = mask.resize((df_label.Width//scale, df_label.Height//scale))\n", 133 | " \n", 134 | " # display images\n", 135 | " fig, ax = plt.subplots(1,2)\n", 136 | "\n", 137 | " ax[0].imshow(original_resized, cmap='gray');\n", 138 | " ax[1].imshow(mask_resized, cmap='gray');\n", 139 | " fig.set_size_inches(20,20)" 140 | ] 141 | }, 142 | { 143 | "cell_type": "code", 144 | "execution_count": null, 145 | "metadata": { 146 | "pycharm": { 
147 | "is_executing": true 148 | } 149 | }, 150 | "outputs": [], 151 | "source": [ 152 | "image_original.size" 153 | ] 154 | }, 155 | { 156 | "cell_type": "code", 157 | "execution_count": null, 158 | "metadata": { 159 | "pycharm": { 160 | "is_executing": true 161 | } 162 | }, 163 | "outputs": [], 164 | "source": [] 165 | } 166 | ], 167 | "metadata": { 168 | "kernelspec": { 169 | "display_name": "Python 3", 170 | "language": "python", 171 | "name": "python3" 172 | }, 173 | "language_info": { 174 | "codemirror_mode": { 175 | "name": "ipython", 176 | "version": 3 177 | }, 178 | "file_extension": ".py", 179 | "mimetype": "text/x-python", 180 | "name": "python", 181 | "nbconvert_exporter": "python", 182 | "pygments_lexer": "ipython3", 183 | "version": "3.7.6" 184 | } 185 | }, 186 | "nbformat": 4, 187 | "nbformat_minor": 2 188 | } 189 | -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | absl-py==0.9.0 2 | astroid==2.4.0 3 | attrs==19.3.0 4 | backcall==0.1.0 5 | bleach==3.1.4 6 | cachetools==4.1.0 7 | certifi==2020.4.5.1 8 | chardet==3.0.4 9 | cloudpickle==1.4.1 10 | cycler==0.10.0 11 | Cython==0.29.17 12 | decorator==4.4.2 13 | defusedxml==0.6.0 14 | detectron2==0.1.2+cu101 15 | entrypoints==0.3 16 | environs==7.4.0 17 | feather-format==0.4.1 18 | flake8==3.7.9 19 | future==0.18.2 20 | fvcore==0.1.dev200506 21 | google-auth==1.14.2 22 | google-auth-oauthlib==0.4.1 23 | grpcio==1.28.1 24 | idna==2.9 25 | importlib-metadata==1.5.0 26 | ipykernel==5.1.4 27 | ipython==7.13.0 28 | ipython-genutils==0.2.0 29 | isort==4.3.21 30 | jedi==0.17.0 31 | Jinja2==2.11.2 32 | joblib==0.14.1 33 | jsonschema==3.2.0 34 | jupyter-client==6.1.3 35 | jupyter-core==4.6.3 36 | kaggle==1.5.6 37 | kiwisolver==1.2.0 38 | lazy-object-proxy==1.4.3 39 | Markdown==3.2.2 40 | MarkupSafe==1.1.1 41 | marshmallow==3.6.0 42 | matplotlib==3.1.3 43 | mccabe==0.6.1 44 | mistune==0.8.4 45 | mkl-fft==1.0.15 46 | mkl-random==1.1.0 47 | mkl-service==2.3.0 48 | mock==4.0.2 49 | nbconvert==5.6.1 50 | nbformat==5.0.6 51 | notebook==6.0.3 52 | numpy==1.18.1 53 | oauthlib==3.1.0 54 | olefile==0.46 55 | opencv-python==4.2.0.34 56 | pandas==1.0.3 57 | pandocfilters==1.4.2 58 | parso==0.7.0 59 | pexpect==4.8.0 60 | pickleshare==0.7.5 61 | Pillow==7.1.2 62 | portalocker==1.7.0 63 | prometheus-client==0.7.1 64 | prompt-toolkit==3.0.4 65 | protobuf==3.11.3 66 | ptyprocess==0.6.0 67 | pyarrow==0.17.0 68 | pyasn1==0.4.8 69 | pyasn1-modules==0.2.8 70 | pycocotools==2.0 71 | pycodestyle==2.5.0 72 | pydot==1.4.1 73 | pyflakes==2.1.1 74 | Pygments==2.6.1 75 | pylint==2.5.0 76 | pyparsing==2.4.7 77 | pyrsistent==0.16.0 78 | python-dateutil==2.8.1 79 | python-dotenv==0.13.0 80 | python-slugify==4.0.0 81 | pytz==2020.1 82 | PyYAML==5.3.1 83 | pyzmq==18.1.1 84 | requests==2.23.0 85 | requests-oauthlib==1.3.0 86 | rsa==4.0 87 | scikit-learn==0.23.0 88 | scipy==1.4.1 89 | seaborn==0.10.1 90 | Send2Trash==1.5.0 91 | sip==4.19.13 92 | six==1.14.0 93 | tabulate==0.8.7 94 | tensorboard==2.2.1 95 | tensorboard-plugin-wit==1.6.0.post3 96 | termcolor==1.1.0 97 | terminado==0.8.3 98 | testpath==0.4.4 99 | text-unidecode==1.3 100 | threadpoolctl==2.0.0 101 | toml==0.10.0 102 | torch==1.5.0+cu101 103 | torchvision==0.6.0+cu101 104 | tornado==6.0.4 105 | tqdm==4.46.0 106 | traitlets==4.3.3 107 | urllib3==1.24.3 108 | wcwidth==0.1.9 109 | webencodings==0.5.1 110 | Werkzeug==1.0.1 111 | wrapt==1.11.2 112 | yacs==0.1.7 113 | zipp==3.1.0 114 | 
-------------------------------------------------------------------------------- /train_net.py: -------------------------------------------------------------------------------- 1 | """ 2 | iMaterialist 2020 training script. 3 | 4 | This script runs a trainer where we pass in custom dataset mapper which contains all the 5 | attributes of each instance. 6 | 7 | We register the data dictionnaries, load the configs, and run the trainer 8 | """ 9 | 10 | 11 | import pandas as pd 12 | import logging 13 | from environs import Env 14 | from pathlib import Path 15 | import pickle 16 | 17 | import detectron2.utils.comm as comm 18 | from detectron2 import model_zoo 19 | from detectron2.config import get_cfg 20 | from detectron2.engine import DefaultTrainer, default_argument_parser, default_setup, launch 21 | from detectron2.data import build_detection_train_loader, build_detection_test_loader 22 | from detectron2.utils.logger import setup_logger 23 | 24 | from imaterialist.data.dataset_mapper import iMatDatasetMapper 25 | from imaterialist.config import add_imaterialist_config 26 | from imaterialist.data.datasets.coco import register_datadict 27 | from imaterialist.modeling import build_model 28 | 29 | from imaterialist.modeling import roi_heads 30 | 31 | # Get environment variables 32 | env = Env() 33 | env.read_env() 34 | 35 | # Set path to the data 36 | path_data_interim = Path(env("path_interim")) 37 | 38 | class FashionTrainer(DefaultTrainer): 39 | 'A customized version of DefaultTrainer. We add a custom mapping to the dataloader' 40 | 41 | @classmethod 42 | def build_train_loader(cls, cfg): 43 | return build_detection_train_loader(cfg, mapper=iMatDatasetMapper(cfg)) 44 | 45 | @classmethod 46 | def build_test_loader(cls, cfg, dataset_name): 47 | return build_detection_test_loader(cfg, dataset_name, mapper=iMatDatasetMapper(cfg)) 48 | 49 | @classmethod 50 | def build_model(cls, cfg): 51 | """ 52 | Returns: 53 | torch.nn.Module: 54 | 55 | It now calls :func:`detectron2.modeling.build_model`. 
56 | """ 57 | model = build_model(cfg) 58 | logger = logging.getLogger(__name__) 59 | logger.info("Model:\n{}".format(model)) 60 | return model 61 | 62 | def setup(args): 63 | """ 64 | Setup all the custom and default configs before training 65 | """ 66 | cfg = get_cfg() 67 | add_imaterialist_config(cfg) 68 | cfg.merge_from_file(model_zoo.get_config_file("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")) 69 | cfg.merge_from_file(args.config_file) 70 | 71 | cfg.merge_from_list(args.opts) 72 | cfg.freeze() 73 | default_setup(cfg, args) 74 | # Setup logger for "imaterialist" module 75 | setup_logger(output=cfg.OUTPUT_DIR, distributed_rank=comm.get_rank(), name="imaterialist") 76 | return cfg 77 | 78 | def main(args): 79 | """ 80 | load dataframes 81 | register detectron2 datadictionnaries 82 | setup config 83 | initialize the trainer 84 | run trainer to train the model 85 | """ 86 | # load dataframe 87 | # fixme: this number needs to update or dynamic 88 | # datadic_train = pd.read_feather(path_data_interim / 'imaterialist_train_multihot_n=400.feather') 89 | # datadic_val = pd.read_feather(path_data_interim / 'imaterailist_test_multihot_n=100.feather') 90 | 91 | datadict_train = pickle.load(open(path_data_interim / 'imaterialist_train_multihot_n=400.p', 'rb')) 92 | datadict_val = pickle.load(open(path_data_interim / 'imaterialist_test_multihot_n=100.p', 'rb')) 93 | 94 | register_datadict(datadict_train, "sample_fashion_train") 95 | register_datadict(datadict_val, "sample_fashion_test") 96 | 97 | cfg = setup(args) 98 | 99 | trainer = FashionTrainer(cfg) 100 | trainer.resume_or_load(resume=args.resume) 101 | return trainer.train() 102 | 103 | if __name__ == '__main__': 104 | args = default_argument_parser().parse_args() 105 | args.config_file = "/home/julien/data-science/kaggle/imaterialist/configs/exp06.yaml" 106 | print("Command Line Args:", args) 107 | launch( 108 | main, 109 | args.num_gpus, 110 | num_machines=args.num_machines, 111 | machine_rank=args.machine_rank, 112 | dist_url=args.dist_url, 113 | args=(args,), 114 | ) --------------------------------------------------------------------------------