├── README.md
├── costum_imagenet.py
├── examples
│   ├── test1.png
│   └── test2.png
├── get_indices.py
├── read_xml.py
└── test_foreground.py

/README.md:
--------------------------------------------------------------------------------
# ImageNet 1K Bounding Boxes
For some experiments, you might want to feed a model only the `background` of ImageNet images versus only the `foreground`. This repository contains the code to extract the bounding-box metadata, clean up the downloaded archives, and adapt the ImageNet loader so that it serves only the images that have box annotations.

# How to use:
```python
import torch
import torchvision
import torchvision.transforms as trans

from costum_imagenet import BackgroundForegroundImageNet

tr = trans.Compose([trans.Resize(224), trans.CenterCrop(224), trans.ToTensor(), ])
dataset = BackgroundForegroundImageNet(root='./data/imagenet/train', download=True, transform=tr)
x, b, f, y = dataset[0]
torchvision.utils.save_image(torch.stack([x, b, f]), 'test1.png')
```
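
Each item is a 4-tuple `(image, background, foreground, label)`, so the default collate function batches all four. A minimal batching sketch (standard PyTorch, not part of this repo's API), assuming the `dataset` from the snippet above:
```python
from torch.utils.data import DataLoader

# Each sample is (image, background, foreground, label), so the default
# collate yields three image batches plus a label batch.
loader = DataLoader(dataset, batch_size=32, shuffle=True, num_workers=4)
for x, b, f, y in loader:
    print(x.shape, b.shape, f.shape, y.shape)  # [32, 3, 224, 224] x3, then [32]
    break
```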

# Example:
![Test1](/examples/test1.png)
![Test2](/examples/test2.png)

If you set `download=True`, the bounding boxes and the indices of the ImageNet train split that have box annotations are downloaded automatically. If you would rather build the bounding-box files from scratch, follow these steps:

# Rebuilding from scratch
Download the annotation archive from [here](https://image-net.org/data/bboxes_annotations.tar.gz):
```bash
wget "https://image-net.org/data/bboxes_annotations.tar.gz"
```

Extract the archive:
```bash
tar -xvf bboxes_annotations.tar.gz
```

Extract every per-class archive inside it:
```bash
cd bboxes_annotations
for f in *.tar.gz ; do tar -xvf "${f}" ; done
```

Convert the XML annotations into a single PyTorch dictionary (`boxes.pt`):
```bash
python read_xml.py
```

Clean up the roughly 50 GB of extracted files:
```bash
rm *.tar.gz
rm -rf n*/
```

Compute the indices of the train-split images that have bounding boxes:
```bash
python get_indices.py
```

Then simply pass the paths of `boxes.pt` and `indices.pt` to the `BackgroundForegroundImageNet` constructor:
```python
dataset = BackgroundForegroundImageNet(root='.', download=False, boxes='boxes.pt', indices='indices.pt')
```
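
For reference, this is the layout the two files end up with, as built by `read_xml.py` and `get_indices.py` (the synset id, filename, and coordinate values below are illustrative):
```python
import torch

boxes = torch.load('boxes.pt')      # {synset_id: {filename: [[box, box, ...]]}}
indices = torch.load('indices.pt')  # plain list of ints into ImageFolder.samples

# e.g. boxes['n01440764']['n01440764_10026.JPEG'][0] -> [['60', '340', '24', '300']]
# Each box is [xmin, xmax, ymin, ymax], kept as strings straight from the XML;
# the extra [0] is there because each annotation file contributes one group of boxes.
```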
--------------------------------------------------------------------------------
/costum_imagenet.py:
--------------------------------------------------------------------------------
from torch.utils import model_zoo
from torchvision.datasets import ImageFolder
import torch
from torchvision.transforms import ToTensor, ToPILImage


class BackgroundForegroundImageNet(ImageFolder):
    boxes_url = 'https://github.com/AminJun/ImageNet1KBoundingBoxes/releases/download/files/boxes.pt'
    indices_url = 'https://github.com/AminJun/ImageNet1KBoundingBoxes/releases/download/files/indices.pt'

    def __init__(self, root: str = '.', download: bool = True, boxes: str = None, indices: str = None,
                 *args, **kwargs):
        assert download or (boxes is not None and indices is not None)
        if download:
            self.boxes = model_zoo.load_url(self.boxes_url, map_location='cpu')
            self.b_indices = model_zoo.load_url(self.indices_url, map_location='cpu')
        else:
            self.boxes = torch.load(boxes)
            self.b_indices = torch.load(indices)

        # boxes.pt maps synset_id -> {filename: box groups}; flatten it into a
        # single filename -> box groups dict so __getitem__ can look up by filename.
        merged = {}
        for k, v in self.boxes.items():
            merged.update(v)
        self.boxes = merged

        self.pre_transform = ToTensor()
        self.back_transform = ToPILImage()
        print('loading imagenet folders')
        super(BackgroundForegroundImageNet, self).__init__(root, *args, **kwargs)

    def __len__(self):
        # Only the samples that have box annotations are exposed.
        return len(self.b_indices)

    def __getitem__(self, item):
        real_i = self.b_indices[item]
        path, target = self.samples[real_i]
        sample = self.pre_transform(self.loader(path))
        # Background: the image with every annotated box zeroed out.
        # Foreground: the complement, so background + foreground == sample.
        background = sample.clone().detach()
        for box in self.boxes[path.split('/')[-1]][0]:
            x1, x2, y1, y2 = box  # [xmin, xmax, ymin, ymax], stored as strings
            background[:, int(y1):int(y2), int(x1):int(x2)] = 0
        foreground = (sample - background).detach().clone()

        # Convert back to PIL so user transforms (Resize, CenterCrop, ...) apply
        # uniformly to all three images.
        sample, background, foreground = self.back_transform(sample), self.back_transform(
            background), self.back_transform(foreground)

        if self.transform is not None:
            sample, background, foreground = self.transform(sample), self.transform(background), self.transform(
                foreground)
        if self.target_transform is not None:
            target = self.target_transform(target)

        return sample, background, foreground, target
--------------------------------------------------------------------------------
/examples/test1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AminJun/ImageNet1KBoundingBoxes/c5958eaeb4553a6eb154c8c031f2b9df0f494790/examples/test1.png
--------------------------------------------------------------------------------
/examples/test2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AminJun/ImageNet1KBoundingBoxes/c5958eaeb4553a6eb154c8c031f2b9df0f494790/examples/test2.png
--------------------------------------------------------------------------------
/get_indices.py:
--------------------------------------------------------------------------------
from torchvision.datasets import ImageFolder
import torch


def main():
    dataset = ImageFolder(root='./data/imagenet/train', transform=None)
    # boxes.pt is keyed by synset id; flatten into one filename -> boxes dict.
    boxes = torch.load('boxes.pt')
    merged = {}
    for k, v in boxes.items():
        merged.update(v)

    # Keep only the indices of samples whose filename has a box annotation.
    indices = [i for i, (name, _) in enumerate(dataset.samples) if name.split('/')[-1] in merged]
    torch.save(indices, 'indices.pt')


if __name__ == '__main__':
    main()
--------------------------------------------------------------------------------
/read_xml.py:
--------------------------------------------------------------------------------
import torch
from tqdm import tqdm
import xmltodict
import os


def translate_obj(cls: str, obj) -> list:
    # xmltodict yields a list when an annotation holds several objects and a
    # single dict when it holds exactly one.
    if isinstance(obj, list):
        return [[v['bndbox']['xmin'], v['bndbox']['xmax'], v['bndbox']['ymin'], v['bndbox']['ymax']] for v in obj if
                v['name'] == cls]
    if obj['name'] == cls:
        v = obj
        return [[v['bndbox']['xmin'], v['bndbox']['xmax'], v['bndbox']['ymin'], v['bndbox']['ymax']]]
    return []  # a single object of a different class contributes no boxes


def objects(expected_class: str, path: str) -> list:
    with open(path, 'r') as f:
        data = f.read()
    xml = xmltodict.parse(data)
    return [translate_obj(expected_class, v) for k, v in xml['annotation'].items() if k == 'object']


def get_path(xml_path: str) -> str:
    # 'n01440764_10026.xml' -> 'n01440764_10026.JPEG'
    return xml_path[:-4] + '.JPEG'


def translate_folder(xml_folder: str, root: str) -> dict:
    parent = os.path.join(root, xml_folder)
    return {f'{get_path(path)}': objects(xml_folder, os.path.join(parent, path)) for path in os.listdir(parent)}


def translate_dataset(root: str, classes: list) -> dict:
    return {f'{dr}': translate_folder(dr, root) for dr in tqdm(os.listdir(root)) if
            os.path.isdir(os.path.join(root, dr)) and dr in classes}


def main():
    # im1knames.txt lists the 1K ImageNet synset ids, one per line.
    with open('im1knames.txt', 'r') as f:
        im1k_classes = f.read().split('\n')
    # Adjust this path to wherever you extracted bboxes_annotations.tar.gz.
    dataset = translate_dataset('/cmlscratch/aminjun/Datasets/ImageNetBoxes/Annotation/', im1k_classes)
    torch.save(dataset, 'boxes.pt')


if __name__ == '__main__':
    main()
--------------------------------------------------------------------------------
/test_foreground.py:
--------------------------------------------------------------------------------
import torchvision.utils
import torch
import torchvision.transforms as trans

from costum_imagenet import BackgroundForegroundImageNet


def test1():
    # Download boxes.pt / indices.pt from the GitHub release.
    tr = trans.Compose([trans.Resize(224), trans.CenterCrop(224), trans.ToTensor(), ])
    dataset = BackgroundForegroundImageNet(root='./data/imagenet/train', download=True, transform=tr)
    x, b, f, y = dataset[0]
    torchvision.utils.save_image(torch.stack([x, b, f]), 'test1.png')


def test2():
    # Use locally built boxes.pt / indices.pt instead of downloading.
    tr = trans.Compose([trans.Resize(224), trans.CenterCrop(224), trans.ToTensor(), ])
    dataset = BackgroundForegroundImageNet(root='./data/imagenet/train', download=False, boxes='boxes.pt',
                                           indices='indices.pt', transform=tr)
    x, b, f, y = dataset[1]
    torchvision.utils.save_image(torch.stack([x, b, f]), 'test2.png')


if __name__ == '__main__':
    test1()
    test2()
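
# --- Illustrative addition, not part of the original repo: since __getitem__
# defines foreground = sample - background, the two should sum back to the
# image, up to rounding in the tensor -> PIL -> tensor round-trip. Run it
# manually, e.g. `python -c "import test_foreground; test_foreground.test3()"`.
def test3():
    tr = trans.Compose([trans.Resize(224), trans.CenterCrop(224), trans.ToTensor(), ])
    dataset = BackgroundForegroundImageNet(root='./data/imagenet/train', download=True, transform=tr)
    x, b, f, _ = dataset[0]
    assert torch.allclose(x, b + f, atol=2 / 255)
--------------------------------------------------------------------------------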