├── .gitignore ├── LICENSE ├── README.md ├── all.sh ├── configs └── fsod │ ├── Base-FSOD-C4.yaml │ ├── R_50_C4_1x.yaml │ └── finetune_R_50_C4_1x.yaml ├── datasets ├── coco │ ├── 1_split_filter.py │ ├── 2_balance.py │ ├── 3_gen_support_pool.py │ ├── 4_gen_support_pool_10_shot.py │ ├── 5_voc_part.py │ ├── 6_voc_few_shot.py │ └── new_annotations │ │ └── final_split_voc_10_shot_instances_train2017.json └── generate_support_data.sh ├── fewx ├── __init__.py ├── config │ ├── __init__.py │ ├── config.py │ └── defaults.py ├── data │ ├── __init__.py │ ├── build.py │ ├── dataset_mapper.py │ └── datasets │ │ ├── __init__.py │ │ ├── builtin.py │ │ └── register_coco.py ├── evaluation │ ├── __init__.py │ └── coco_evaluation.py ├── layers │ ├── __init__.py │ ├── boundary.py │ ├── conv_with_kaiming_uniform.py │ ├── deform_conv.py │ ├── iou_loss.py │ ├── misc.py │ ├── ml_nms.py │ └── naive_group_norm.py ├── modeling │ ├── __init__.py │ └── fsod │ │ ├── __init__.py │ │ ├── fsod_fast_rcnn.py │ │ ├── fsod_rcnn.py │ │ ├── fsod_roi_heads.py │ │ └── fsod_rpn.py ├── solver │ ├── __init__.py │ └── build.py └── utils │ ├── __init__.py │ ├── comm.py │ └── measures.py ├── fsod_train_net.py └── log ├── fsod_finetune_test_log.txt ├── fsod_finetune_train_log.txt ├── fsod_train_log.txt └── metric.txt /.gitignore: -------------------------------------------------------------------------------- 1 | # output dir 2 | output 3 | instant_test_output 4 | inference_test_output 5 | 6 | 7 | *.jpg 8 | *.png 9 | #*.txt 10 | #*.json 11 | *.diff 12 | 13 | # compilation and distribution 14 | __pycache__ 15 | _ext 16 | *.pyc 17 | *.so 18 | detectron2.egg-info/ 19 | build/ 20 | dist/ 21 | wheels/ 22 | 23 | # pytorch/python/numpy formats 24 | *.pth 25 | *.pkl 26 | *.npy 27 | 28 | # ipython/jupyter notebooks 29 | *.ipynb 30 | **/.ipynb_checkpoints/ 31 | 32 | # Editor temporaries 33 | *.swn 34 | *.swo 35 | *.swp 36 | *~ 37 | 38 | # editor settings 39 | .idea 40 | .vscode 41 | 42 | # project dirs 43 | /detectron2/model_zoo/configs 44 | #/datasets 45 | /projects/*/datasets 46 | /models 47 | #/datasets/coco/new_annotations 48 | /datasets/coco/support 49 | /datasets/coco/10_shot_support 50 | 51 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2024 Qi Fan 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 
22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # FewX 2 | 3 | **FewX** is an open source toolbox on top of Detectron2 for data-limited instance-level recognition tasks, e.g., few-shot object detection, few-shot instance segmentation, partially supervised instance segmentation and so on. 4 | 5 | All data-limited instance-level recognition works from **Qi Fan** (HKUST, fanqics@gmail.com) are open-sourced here. 6 | 7 | To date, FewX implements the following algorithms: 8 | 9 | - [FSOD](https://arxiv.org/abs/1908.01998): few-shot object detection with [FSOD dataset](https://github.com/fanq15/Few-Shot-Object-Detection-Dataset). 10 | - [CPMask](https://arxiv.org/abs/2007.12387): partially supervised/fully supervised/few-shot instance segmentation. (**20220725 working on it**) 11 | - [FSVOD](https://arxiv.org/abs/2104.14805): few-shot video object detection with [FSVOD-500 dataset](https://drive.google.com/drive/folders/1DDQ81A8yVj7D8vLUS01657ATr2sK1zgC?usp=sharing) and [FSYTV-40 dataset](https://drive.google.com/drive/folders/1a1PpfAxeYL7AbxYViDDnx7ACFtRohVL5?usp=sharing). (**20220725 working on it**) 12 | 13 | ## Highlights 14 | - **State-of-the-art performance.** 15 | - FSOD is the best few-shot object detection model. (This model can be directly applied to novel classes without finetuning. And finetuning can bring better performance.) 16 | - CPMask is the best partially supervised/few-shot instance segmentation model. 17 | - **Easy to use.** You only need to run 3 code lines to conduct the entire experiment. 18 | - Install Pre-Built Detectron2 in one code line. 19 | - Prepare dataset in one code line. (You need to first download the dataset and change the **data path** in the script.) 20 | - Training and evaluation in one code line. 21 | 22 | ## Updates 23 | - FewX has been released. (09/08/2020) 24 | 25 | ## Results on MS COCO 26 | 27 | ### Few Shot Object Detection 28 | 29 | |Method|Training Dataset|Evaluation way&shot|box AP|download| 30 | |:--------:|:--------:|:--------:|:--------:|:--:| 31 | |FSOD (paper)|COCO (non-voc)|full-way 10-shot|11.1|-| 32 | |FSOD (this implementation)|COCO (non-voc)|full-way 10-shot|**12.0**|model \| metrics| 33 | 34 | The results are reported on the COCO voc subset with **ResNet-50** backbone. 35 | 36 | The model only trained on base classes is base model \. 37 | 38 | You can reference the [original FSOD implementation](https://github.com/fanq15/FSOD-code) on the [Few-Shot-Object-Detection-Dataset](https://github.com/fanq15/Few-Shot-Object-Detection-Dataset). 39 | 40 | ## Step 1: Installation 41 | You only need to install [detectron2](https://github.com/facebookresearch/detectron2/blob/master/INSTALL.md). We recommend the Pre-Built Detectron2 (Linux only) version with pytorch 1.7. I use the Pre-Built Detectron2 with CUDA 10.1 and pytorch 1.7 and you can run this code to install it. 42 | 43 | ``` 44 | python -m pip install detectron2 -f \ 45 | https://dl.fbaipublicfiles.com/detectron2/wheels/cu101/torch1.7/index.html 46 | ``` 47 | 48 | ## Step 2: Prepare dataset 49 | - Prepare for coco dataset following [this instruction](https://github.com/facebookresearch/detectron2/tree/master/datasets). 50 | 51 | - `cd datasets`, change the `DATA_ROOT` in the `generate_support_data.sh` to your data path and run `sh generate_support_data.sh`. 
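Running the script symlinks your COCO `train2017`, `val2017` and `annotations` folders into `datasets/coco`, writes the non-voc training split to `datasets/coco/new_annotations/final_split_non_voc_instances_train2017.json`, and crops the support patches into `datasets/coco/support` and `datasets/coco/10_shot_support`, together with the `train_support_df.pkl` and `10_shot_support_df.pkl` index files read by the data loader during training: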
52 | 53 | ``` 54 | cd FewX/datasets 55 | sh generate_support_data.sh 56 | ``` 57 | 58 | ## Step 3: Training and Evaluation 59 | 60 | Run `sh all.sh` in the root dir. (This script uses `4 GPUs`. You can change the GPU number. If you use 2 GPUs with unchanged batch size (8), please [halve the learning rate](https://github.com/fanq15/FewX/issues/6#issuecomment-674367388).) 61 | 62 | ``` 63 | cd FewX 64 | sh all.sh 65 | ``` 66 | 67 | 68 | ## TODO 69 | - [ ] Add FSVOD and CPMask codes to this repo. 70 | - [ ] Add other dataset results to FSOD. 71 | - [ ] Add [CPMask](https://arxiv.org/abs/2007.12387) code with partially supervised instance segmentation, fully supervised instance segmentation and few-shot instance segmentation. 72 | 73 | ## Citing FewX 74 | If you use this toolbox in your research or wish to refer to the baseline results, please use the following BibTeX entries. 75 | 76 | ``` 77 | @inproceedings{fan2021fsvod, 78 | title={Few-Shot Video Object Detection}, 79 | author={Fan, Qi and Tang, Chi-Keung and Tai, Yu-Wing}, 80 | booktitle={ECCV}, 81 | year={2022} 82 | } 83 | @inproceedings{fan2020cpmask, 84 | title={Commonality-Parsing Network across Shape and Appearance for Partially Supervised Instance Segmentation}, 85 | author={Fan, Qi and Ke, Lei and Pei, Wenjie and Tang, Chi-Keung and Tai, Yu-Wing}, 86 | booktitle={ECCV}, 87 | year={2020} 88 | } 89 | @inproceedings{fan2020fsod, 90 | title={Few-Shot Object Detection with Attention-RPN and Multi-Relation Detector}, 91 | author={Fan, Qi and Zhuo, Wei and Tang, Chi-Keung and Tai, Yu-Wing}, 92 | booktitle={CVPR}, 93 | year={2020} 94 | } 95 | ``` 96 | 97 | ## Special Thanks 98 | [Detectron2](https://github.com/facebookresearch/detectron2), [AdelaiDet](https://github.com/aim-uofa/AdelaiDet), [centermask2](https://github.com/youngwanLEE/centermask2) 99 | -------------------------------------------------------------------------------- /all.sh: -------------------------------------------------------------------------------- 1 | rm support_dir/support_feature.pkl 2 | CUDA_VISIBLE_DEVICES=0,1,2,3 python3 fsod_train_net.py --num-gpus 4 \ 3 | --config-file configs/fsod/R_50_C4_1x.yaml 2>&1 | tee log/fsod_train_log.txt 4 | 5 | #CUDA_VISIBLE_DEVICES=0,1,2,3 python3 tools/train_net.py --num-gpus 4 \ 6 | # --config-file configs/fsod/R_50_C4_1x.yaml \ 7 | # --eval-only MODEL.WEIGHTS ./output/fsod/R_50_C4_1x/model_final.pth 2>&1 | tee log/fsod_test_log.txt 8 | 9 | rm support_dir/support_feature.pkl 10 | CUDA_VISIBLE_DEVICES=0,1,2,3 python3 fsod_train_net.py --num-gpus 4 \ 11 | --config-file configs/fsod/finetune_R_50_C4_1x.yaml 2>&1 | tee log/fsod_finetune_train_log.txt 12 | CUDA_VISIBLE_DEVICES=0,1,2,3 python3 fsod_train_net.py --num-gpus 4 \ 13 | --config-file configs/fsod/finetune_R_50_C4_1x.yaml \ 14 | --eval-only MODEL.WEIGHTS ./output/fsod/finetune_dir/R_50_C4_1x/model_final.pth 2>&1 | tee log/fsod_finetune_test_log.txt 15 | 16 | #CUDA_VISIBLE_DEVICES=0,1,2,3 python3 fsod_train_net.py --num-gpus 4 \ 17 | # --config-file configs/fsod/finetune_R_50_C4_1x.yaml \ 18 | # --eval-only MODEL.WEIGHTS ./output/fsod/finetune_dir/R_50_C4_1x/model_final.pth 2>&1 | tee log/fsod_finetune_test_log.txt 19 | 20 | -------------------------------------------------------------------------------- /configs/fsod/Base-FSOD-C4.yaml: -------------------------------------------------------------------------------- 1 | MODEL: 2 | META_ARCHITECTURE: "FsodRCNN" 3 | PROPOSAL_GENERATOR: 4 | NAME: "FsodRPN" 5 | RPN: 6 | PRE_NMS_TOPK_TEST: 6000 7 | POST_NMS_TOPK_TEST: 100 8 | 
ROI_HEADS: 9 | NAME: "FsodRes5ROIHeads" 10 | BATCH_SIZE_PER_IMAGE: 128 11 | POSITIVE_FRACTION: 0.5 12 | NUM_CLASSES: 1 13 | BACKBONE: 14 | FREEZE_AT: 3 15 | #PIXEL_MEAN: [102.9801, 115.9465, 122.7717] 16 | DATASETS: 17 | TRAIN: ("coco_2017_train_nonvoc",) #("coco_2017_train",) 18 | TEST: ("coco_2017_val",) 19 | DATALOADER: 20 | NUM_WORKERS: 8 21 | SOLVER: 22 | IMS_PER_BATCH: 8 #16 23 | BASE_LR: 0.004 #0.02 24 | STEPS: (112000, 120000) #(30000, 40000) #(56000,) #(60000, 80000) 25 | MAX_ITER: 120000 #300000 #45000 #60000 #90000 26 | WARMUP_ITERS: 1000 #500 27 | WARMUP_FACTOR: 0.1 28 | CHECKPOINT_PERIOD: 30000 29 | HEAD_LR_FACTOR: 2.0 30 | #WEIGHT_DECAY_BIAS: 0.0 31 | INPUT: 32 | FS: 33 | SUPPORT_WAY: 2 34 | SUPPORT_SHOT: 10 35 | MIN_SIZE_TRAIN: (440, 472, 504, 536, 568, 600) #(600,) #(640, 672, 704, 736, 768, 800) 36 | MAX_SIZE_TRAIN: 1000 37 | MIN_SIZE_TEST: 600 38 | MAX_SIZE_TEST: 1000 39 | VERSION: 2 40 | -------------------------------------------------------------------------------- /configs/fsod/R_50_C4_1x.yaml: -------------------------------------------------------------------------------- 1 | _BASE_: "Base-FSOD-C4.yaml" 2 | MODEL: 3 | WEIGHTS: "detectron2://ImageNetPretrained/MSRA/R-50.pkl" 4 | MASK_ON: False 5 | RESNETS: 6 | DEPTH: 50 7 | OUTPUT_DIR: './output/fsod/R_50_C4_1x' 8 | -------------------------------------------------------------------------------- /configs/fsod/finetune_R_50_C4_1x.yaml: -------------------------------------------------------------------------------- 1 | _BASE_: "Base-FSOD-C4.yaml" 2 | MODEL: 3 | WEIGHTS: "./output/fsod/R_50_C4_1x/model_final.pth" 4 | MASK_ON: False 5 | RESNETS: 6 | DEPTH: 50 7 | BACKBONE: 8 | FREEZE_AT: 5 9 | DATASETS: 10 | TRAIN: ("coco_2017_train_voc_10_shot",) 11 | TEST: ("coco_2017_val",) 12 | SOLVER: 13 | IMS_PER_BATCH: 4 14 | BASE_LR: 0.001 15 | STEPS: (2000, 3000) 16 | MAX_ITER: 3000 17 | WARMUP_ITERS: 200 18 | INPUT: 19 | FS: 20 | FEW_SHOT: True 21 | SUPPORT_WAY: 2 22 | SUPPORT_SHOT: 9 23 | MIN_SIZE_TRAIN: (440, 472, 504, 536, 568, 600) 24 | MAX_SIZE_TRAIN: 1000 25 | MIN_SIZE_TEST: 600 26 | MAX_SIZE_TEST: 1000 27 | OUTPUT_DIR: './output/fsod/finetune_dir/R_50_C4_1x' 28 | 29 | -------------------------------------------------------------------------------- /datasets/coco/1_split_filter.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | # -*- coding: utf-8 -*- 3 | """ 4 | Created on Fri Jun 5 15:27:52 2020 5 | 6 | @author: fanq15 7 | """ 8 | 9 | from pycocotools.coco import COCO 10 | import cv2 11 | import numpy as np 12 | from os.path import join, isdir 13 | from os import mkdir, makedirs 14 | from concurrent import futures 15 | import sys 16 | import time 17 | import math 18 | import matplotlib.pyplot as plt 19 | import os 20 | import pandas as pd 21 | import json 22 | import sys 23 | 24 | 25 | def filter_coco(coco, cls_split): 26 | new_anns = [] 27 | all_cls_dict = {} 28 | for img_id, id in enumerate(coco.imgs): 29 | img = coco.loadImgs(id)[0] 30 | anns = coco.loadAnns(coco.getAnnIds(imgIds=id, iscrowd=None)) 31 | skip_flag = False 32 | img_cls_dict = {} 33 | if len(anns) == 0: 34 | continue 35 | for ann in anns: 36 | segmentation = ann['segmentation'] 37 | area = ann['area'] 38 | iscrowd = ann['iscrowd'] 39 | image_id = ann['image_id'] 40 | bbox = ann['bbox'] 41 | category_id = ann['category_id'] 42 | id = ann['id'] 43 | bbox_area = bbox[2] * bbox[3] 44 | 45 | # filter images with small boxes 46 | if category_id in cls_split: 47 | if bbox_area < 32 * 32: 48 | 
skip_flag = True 49 | 50 | if skip_flag: 51 | continue 52 | else: 53 | for ann in anns: 54 | category_id = ann['category_id'] 55 | if category_id in cls_split: 56 | new_anns.append(ann) 57 | 58 | if category_id in all_cls_dict.keys(): 59 | all_cls_dict[category_id] += 1 60 | else: 61 | all_cls_dict[category_id] = 1 62 | 63 | print(len(new_anns)) 64 | print(sorted(all_cls_dict.items(), key = lambda kv:(kv[1], kv[0]))) 65 | return new_anns 66 | 67 | 68 | root_path = sys.argv[1] 69 | print(root_path) 70 | #root_path = '/home/fanqi/data/COCO' 71 | dataDir = './annotations' 72 | support_dict = {} 73 | 74 | support_dict['support_box'] = [] 75 | support_dict['category_id'] = [] 76 | support_dict['image_id'] = [] 77 | support_dict['id'] = [] 78 | support_dict['file_path'] = [] 79 | 80 | voc_inds = (0, 1, 2, 3, 4, 5, 6, 8, 14, 15, 16, 17, 18, 19, 39, 56, 57, 58, 60, 62) 81 | 82 | 83 | for dataType in ['instances_train2017.json']: #, 'split_voc_instances_train2017.json']: 84 | annFile = join(dataDir, dataType) 85 | 86 | with open(annFile,'r') as load_f: 87 | dataset = json.load(load_f) 88 | print(dataset.keys()) 89 | save_info = dataset['info'] 90 | save_licenses = dataset['licenses'] 91 | save_images = dataset['images'] 92 | save_categories = dataset['categories'] 93 | save_annotations = dataset['annotations'] 94 | 95 | 96 | inds_split2 = [i for i in range(len(save_categories)) if i not in voc_inds] 97 | 98 | # split annotations according to categories 99 | categories_split1 = [save_categories[i] for i in voc_inds] 100 | categories_split2 = [save_categories[i] for i in inds_split2] 101 | cids_split1 = [c['id'] for c in categories_split1] 102 | cids_split2 = [c['id'] for c in categories_split2] 103 | print('Split 1: {} classes'.format(len(categories_split1))) 104 | for c in categories_split1: 105 | print('\t', c['name']) 106 | print('Split 2: {} classes'.format(len(categories_split2))) 107 | for c in categories_split2: 108 | print('\t', c['name']) 109 | 110 | coco = COCO(annFile) 111 | 112 | # for non-voc, there can be non_voc images 113 | annotations_split2 = filter_coco(coco, cids_split2) 114 | 115 | # for voc, there can be non_voc images 116 | annotations = dataset['annotations'] 117 | annotations_split1 = [] 118 | 119 | for ann in annotations: 120 | if ann['category_id'] in cids_split1: # voc 20 121 | annotations_split1.append(ann) 122 | 123 | dataset_split1 = { 124 | 'info': save_info, 125 | 'licenses': save_licenses, 126 | 'images': save_images, 127 | 'annotations': annotations_split1, 128 | 'categories': save_categories} 129 | dataset_split2 = { 130 | 'info': save_info, 131 | 'licenses': save_licenses, 132 | 'images': save_images, 133 | 'annotations': annotations_split2, 134 | 'categories': save_categories} 135 | new_annotations_path = os.path.join(root_path, 'new_annotations') 136 | if not os.path.exists(new_annotations_path): 137 | os.makedirs(new_annotations_path) 138 | split2_file = os.path.join(root_path, 'new_annotations/final_split_non_voc_instances_train2017.json') 139 | 140 | with open(split2_file, 'w') as f: 141 | json.dump(dataset_split2, f) 142 | -------------------------------------------------------------------------------- /datasets/coco/2_balance.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | # -*- coding: utf-8 -*- 3 | """ 4 | Created on Fri Jun 5 16:04:06 2020 5 | 6 | @author: fanq15 7 | """ 8 | 9 | from pycocotools.coco import COCO 10 | import cv2 11 | import numpy as np 12 | from os.path import join, 
isdir 13 | from os import mkdir, makedirs 14 | from concurrent import futures 15 | import sys 16 | import time 17 | import math 18 | import matplotlib.pyplot as plt 19 | import os 20 | import pandas as pd 21 | import json 22 | import sys 23 | 24 | def balance_coco(coco): 25 | all_cls_dict = {} 26 | for img_id, id in enumerate(coco.imgs): 27 | anns = coco.loadAnns(coco.getAnnIds(imgIds=id, iscrowd=None)) 28 | img_cls_dict = {} 29 | if len(anns) == 0: 30 | continue 31 | for ann in anns: 32 | category_id = ann['category_id'] 33 | id = ann['id'] 34 | 35 | if category_id in img_cls_dict.keys(): 36 | img_cls_dict[category_id] += 1 37 | else: 38 | img_cls_dict[category_id] = 1 39 | 40 | for category_id, num in img_cls_dict.items(): 41 | if category_id in all_cls_dict.keys(): 42 | all_cls_dict[category_id] += 1 # count the image number containing the target category 43 | else: 44 | all_cls_dict[category_id] = 1 45 | print('Image number of non-voc classes before class balancing: ', sorted(all_cls_dict.items(), key = lambda kv:(kv[1], kv[0]))) 46 | 47 | new_anns = [] 48 | for img_id, id in enumerate(coco.imgs): 49 | save_flag = False 50 | anns = coco.loadAnns(coco.getAnnIds(imgIds=id, iscrowd=None)) 51 | if len(anns) == 0: 52 | continue 53 | # in this image, if there is at least one category with less than 2000 images, save this image. 54 | # otherwise, remove this image. 55 | for ann in anns: 56 | category_id = ann['category_id'] 57 | 58 | if all_cls_dict[category_id] <= 2000: 59 | save_flag = True 60 | 61 | if save_flag: 62 | for ann in anns: 63 | new_anns.append(ann) 64 | else: 65 | for ann in anns: 66 | category_id = ann['category_id'] 67 | all_cls_dict[category_id] -= 1 68 | print('Instance number of non-voc classes after class balancing: ', len(new_anns)) 69 | print('Image number of non-voc classes after class balancing: ', sorted(all_cls_dict.items(), key = lambda kv:(kv[1], kv[0]))) 70 | 71 | return new_anns 72 | 73 | root_path = sys.argv[1] 74 | #root_path = '/home/fanqi/data/COCO' 75 | dataDir = os.path.join(root_path, 'new_annotations') 76 | support_dict = {} 77 | 78 | support_dict['support_box'] = [] 79 | support_dict['category_id'] = [] 80 | support_dict['image_id'] = [] 81 | support_dict['id'] = [] 82 | support_dict['file_path'] = [] 83 | 84 | 85 | for dataType in ['split_non_voc_instances_train2017.json']: 86 | annFile = join(dataDir, dataType) 87 | 88 | with open(annFile,'r') as load_f: 89 | dataset = json.load(load_f) 90 | print(dataset.keys()) 91 | save_info = dataset['info'] 92 | save_licenses = dataset['licenses'] 93 | save_images = dataset['images'] 94 | save_categories = dataset['categories'] 95 | 96 | print(annFile) 97 | shot_num = 10 98 | coco = COCO(annFile) 99 | print(coco) 100 | 101 | annotations = balance_coco(coco) 102 | dataset_split = { 103 | 'info': save_info, 104 | 'licenses': save_licenses, 105 | 'images': save_images, 106 | 'annotations': annotations, 107 | 'categories': save_categories} 108 | split_file = os.path.join(root_path, 'new_annotations/final_split_non_voc_instances_train2017.json') 109 | 110 | with open(split_file, 'w') as f: 111 | json.dump(dataset_split, f) 112 | -------------------------------------------------------------------------------- /datasets/coco/3_gen_support_pool.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | # -*- coding: utf-8 -*- 3 | """ 4 | Created on Thu Jun 4 15:30:24 2020 5 | 6 | @author: fanq15 7 | """ 8 | 9 | from pycocotools.coco import COCO 10 | import cv2 11 | 
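# This script builds the support-image pool used for base training: every ground-truth
# box in the non-voc split is cropped as a square patch with 16 px of context, resized
# to 320x320, saved under <root>/support/train2017/<image_name>/, and indexed in
# train_support_df.pkl (support_box, category_id, image_id, id, file_path).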
import numpy as np 12 | from os.path import join, isdir 13 | from os import mkdir, makedirs 14 | from concurrent import futures 15 | import sys 16 | import time 17 | import math 18 | import matplotlib.pyplot as plt 19 | import os 20 | import pandas as pd 21 | import json 22 | import shutil 23 | import sys 24 | 25 | def vis_image(im, bboxs, im_name): 26 | dpi = 300 27 | fig, ax = plt.subplots() 28 | ax.imshow(im, aspect='equal') 29 | plt.axis('off') 30 | height, width, channels = im.shape 31 | fig.set_size_inches(width/100.0/3.0, height/100.0/3.0) 32 | plt.gca().xaxis.set_major_locator(plt.NullLocator()) 33 | plt.gca().yaxis.set_major_locator(plt.NullLocator()) 34 | plt.subplots_adjust(top=1,bottom=0,left=0,right=1,hspace=0,wspace=0) 35 | plt.margins(0,0) 36 | # Show box (off by default, box_alpha=0.0) 37 | for bbox in bboxs: 38 | ax.add_patch( 39 | plt.Rectangle((bbox[0], bbox[1]), 40 | bbox[2] - bbox[0], 41 | bbox[3] - bbox[1], 42 | fill=False, edgecolor='r', 43 | linewidth=0.5, alpha=1)) 44 | output_name = os.path.basename(im_name) 45 | plt.savefig(im_name, dpi=dpi, bbox_inches='tight', pad_inches=0) 46 | plt.close('all') 47 | 48 | 49 | def crop_support(img, bbox): 50 | image_shape = img.shape[:2] # h, w 51 | data_height, data_width = image_shape 52 | 53 | img = img.transpose(2, 0, 1) 54 | 55 | x1 = int(bbox[0]) 56 | y1 = int(bbox[1]) 57 | x2 = int(bbox[2]) 58 | y2 = int(bbox[3]) 59 | 60 | width = x2 - x1 61 | height = y2 - y1 62 | context_pixel = 16 #int(16 * im_scale) 63 | 64 | new_x1 = 0 65 | new_y1 = 0 66 | new_x2 = width 67 | new_y2 = height 68 | target_size = (320, 320) #(384, 384) 69 | 70 | if width >= height: 71 | crop_x1 = x1 - context_pixel 72 | crop_x2 = x2 + context_pixel 73 | 74 | # New_x1 and new_x2 will change when crop context or overflow 75 | new_x1 = new_x1 + context_pixel 76 | new_x2 = new_x1 + width 77 | if crop_x1 < 0: 78 | new_x1 = new_x1 + crop_x1 79 | new_x2 = new_x1 + width 80 | crop_x1 = 0 81 | if crop_x2 > data_width: 82 | crop_x2 = data_width 83 | 84 | short_size = height 85 | long_size = crop_x2 - crop_x1 86 | y_center = int((y2+y1) / 2) #math.ceil((y2 + y1) / 2) 87 | crop_y1 = int(y_center - (long_size / 2)) #int(y_center - math.ceil(long_size / 2)) 88 | crop_y2 = int(y_center + (long_size / 2)) #int(y_center + math.floor(long_size / 2)) 89 | 90 | # New_y1 and new_y2 will change when crop context or overflow 91 | new_y1 = new_y1 + math.ceil((long_size - short_size) / 2) 92 | new_y2 = new_y1 + height 93 | if crop_y1 < 0: 94 | new_y1 = new_y1 + crop_y1 95 | new_y2 = new_y1 + height 96 | crop_y1 = 0 97 | if crop_y2 > data_height: 98 | crop_y2 = data_height 99 | 100 | crop_short_size = crop_y2 - crop_y1 101 | crop_long_size = crop_x2 - crop_x1 102 | square = np.zeros((3, crop_long_size, crop_long_size), dtype = np.uint8) 103 | delta = int((crop_long_size - crop_short_size) / 2) #int(math.ceil((crop_long_size - crop_short_size) / 2)) 104 | square_y1 = delta 105 | square_y2 = delta + crop_short_size 106 | 107 | new_y1 = new_y1 + delta 108 | new_y2 = new_y2 + delta 109 | 110 | crop_box = img[:, crop_y1:crop_y2, crop_x1:crop_x2] 111 | square[:, square_y1:square_y2, :] = crop_box 112 | 113 | #show_square = np.zeros((crop_long_size, crop_long_size, 3))#, dtype=np.int16) 114 | #show_crop_box = original_img[crop_y1:crop_y2, crop_x1:crop_x2, :] 115 | #show_square[square_y1:square_y2, :, :] = show_crop_box 116 | #show_square = show_square.astype(np.int16) 117 | else: 118 | crop_y1 = y1 - context_pixel 119 | crop_y2 = y2 + context_pixel 120 | 121 | # New_y1 and new_y2 will 
change when crop context or overflow 122 | new_y1 = new_y1 + context_pixel 123 | new_y2 = new_y1 + height 124 | if crop_y1 < 0: 125 | new_y1 = new_y1 + crop_y1 126 | new_y2 = new_y1 + height 127 | crop_y1 = 0 128 | if crop_y2 > data_height: 129 | crop_y2 = data_height 130 | 131 | short_size = width 132 | long_size = crop_y2 - crop_y1 133 | x_center = int((x2 + x1) / 2) #math.ceil((x2 + x1) / 2) 134 | crop_x1 = int(x_center - (long_size / 2)) #int(x_center - math.ceil(long_size / 2)) 135 | crop_x2 = int(x_center + (long_size / 2)) #int(x_center + math.floor(long_size / 2)) 136 | 137 | # New_x1 and new_x2 will change when crop context or overflow 138 | new_x1 = new_x1 + math.ceil((long_size - short_size) / 2) 139 | new_x2 = new_x1 + width 140 | if crop_x1 < 0: 141 | new_x1 = new_x1 + crop_x1 142 | new_x2 = new_x1 + width 143 | crop_x1 = 0 144 | if crop_x2 > data_width: 145 | crop_x2 = data_width 146 | 147 | crop_short_size = crop_x2 - crop_x1 148 | crop_long_size = crop_y2 - crop_y1 149 | square = np.zeros((3, crop_long_size, crop_long_size), dtype = np.uint8) 150 | delta = int((crop_long_size - crop_short_size) / 2) #int(math.ceil((crop_long_size - crop_short_size) / 2)) 151 | square_x1 = delta 152 | square_x2 = delta + crop_short_size 153 | 154 | new_x1 = new_x1 + delta 155 | new_x2 = new_x2 + delta 156 | crop_box = img[:, crop_y1:crop_y2, crop_x1:crop_x2] 157 | square[:, :, square_x1:square_x2] = crop_box 158 | 159 | #show_square = np.zeros((crop_long_size, crop_long_size, 3)) #, dtype=np.int16) 160 | #show_crop_box = original_img[crop_y1:crop_y2, crop_x1:crop_x2, :] 161 | #show_square[:, square_x1:square_x2, :] = show_crop_box 162 | #show_square = show_square.astype(np.int16) 163 | #print(crop_y2 - crop_y1, crop_x2 - crop_x1, bbox, data_height, data_width) 164 | 165 | square = square.astype(np.float32, copy=False) 166 | square_scale = float(target_size[0]) / long_size 167 | square = square.transpose(1,2,0) 168 | square = cv2.resize(square, target_size, interpolation=cv2.INTER_LINEAR) # None, None, fx=square_scale, fy=square_scale, interpolation=cv2.INTER_LINEAR) 169 | #square = square.transpose(2,0,1) 170 | square = square.astype(np.uint8) 171 | 172 | new_x1 = int(new_x1 * square_scale) 173 | new_y1 = int(new_y1 * square_scale) 174 | new_x2 = int(new_x2 * square_scale) 175 | new_y2 = int(new_y2 * square_scale) 176 | 177 | # For test 178 | #show_square = cv2.resize(show_square, target_size, interpolation=cv2.INTER_LINEAR) # None, None, fx=square_scale, fy=square_scale, interpolation=cv2.INTER_LINEAR) 179 | #self.vis_image(show_square, [new_x1, new_y1, new_x2, new_y2], img_path.split('/')[-1][:-4]+'_crop.jpg', './test') 180 | 181 | support_data = square 182 | support_box = np.array([new_x1, new_y1, new_x2, new_y2]).astype(np.float32) 183 | return support_data, support_box 184 | 185 | 186 | def main(): 187 | dataDir = '.' 
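    # dataDir '.' assumes the script is launched from datasets/coco, where
    # generate_support_data.sh has symlinked train2017/ and annotations/.
    # crop_support() above can also be exercised on its own, e.g. (illustrative
    # image path and box, not part of the actual pipeline):
    #   im = cv2.imread('train2017/000000000009.jpg')
    #   patch, box = crop_support(im, [10., 20., 110., 220.])  # box is [x1, y1, x2, y2]
    #   assert patch.shape == (320, 320, 3) and patch.dtype == np.uint8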
188 | #root_path = '/home/fanqi/data/COCO' 189 | root_path = sys.argv[1] 190 | support_path = os.path.join(root_path, 'support') 191 | if not isdir(support_path): 192 | mkdir(support_path) 193 | #else: 194 | # shutil.rmtree(support_path) 195 | 196 | support_dict = {} 197 | 198 | support_dict['support_box'] = [] 199 | support_dict['category_id'] = [] 200 | support_dict['image_id'] = [] 201 | support_dict['id'] = [] 202 | support_dict['file_path'] = [] 203 | 204 | for dataType in ['train2017']: #, 'train2017']: 205 | set_crop_base_path = join(support_path, dataType) 206 | set_img_base_path = join(dataDir, dataType) 207 | 208 | annFile = os.path.join(root_path, 'new_annotations/final_split_non_voc_instances_train2017.json') 209 | with open(annFile,'r') as load_f: 210 | dataset = json.load(load_f) 211 | print(dataset.keys()) 212 | save_info = dataset['info'] 213 | save_licenses = dataset['licenses'] 214 | save_images = dataset['images'] 215 | save_categories = dataset['categories'] 216 | 217 | coco = COCO(annFile) 218 | 219 | for img_id, id in enumerate(coco.imgs): 220 | if img_id % 100 == 0: 221 | print(img_id) 222 | img = coco.loadImgs(id)[0] 223 | anns = coco.loadAnns(coco.getAnnIds(imgIds=id, iscrowd=None)) 224 | 225 | if len(anns) == 0: 226 | continue 227 | 228 | frame_crop_base_path = join(set_crop_base_path, img['file_name'].split('/')[-1].split('.')[0]) 229 | if not isdir(frame_crop_base_path): makedirs(frame_crop_base_path) 230 | im = cv2.imread('{}/{}'.format(set_img_base_path, img['file_name'])) 231 | #print('{}/{}'.format(set_img_base_path, img['file_name'])) 232 | for item_id, ann in enumerate(anns): 233 | rect = ann['bbox'] 234 | bbox = [rect[0], rect[1], rect[0] + rect[2], rect[1] + rect[3]] 235 | support_img, support_box = crop_support(im, bbox) 236 | #im_name = img['file_name'].split('.')[0] + '_' + str(item_id) + '.jpg' 237 | #output_dir = './fig' 238 | #vis_image(support_img[:, :, ::-1], support_box, join(frame_crop_base_path, '{:04d}.jpg'.format(item_id))) 239 | if rect[2] <= 0 or rect[3] <=0: 240 | print(rect) 241 | continue 242 | file_path = join(frame_crop_base_path, '{:04d}.jpg'.format(item_id)) 243 | cv2.imwrite(file_path, support_img) 244 | #print(file_path) 245 | support_dict['support_box'].append(support_box.tolist()) 246 | support_dict['category_id'].append(ann['category_id']) 247 | support_dict['image_id'].append(ann['image_id']) 248 | support_dict['id'].append(ann['id']) 249 | support_dict['file_path'].append(file_path) 250 | 251 | support_df = pd.DataFrame.from_dict(support_dict) 252 | 253 | return support_df 254 | 255 | 256 | if __name__ == '__main__': 257 | since = time.time() 258 | support_df = main() 259 | support_df.to_pickle("./train_support_df.pkl") 260 | 261 | time_elapsed = time.time() - since 262 | print('Total complete in {:.0f}m {:.0f}s'.format( 263 | time_elapsed // 60, time_elapsed % 60)) 264 | 265 | -------------------------------------------------------------------------------- /datasets/coco/4_gen_support_pool_10_shot.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | # -*- coding: utf-8 -*- 3 | """ 4 | Created on Thu Jun 4 15:30:24 2020 5 | 6 | @author: fanq15 7 | """ 8 | 9 | from pycocotools.coco import COCO 10 | import cv2 11 | import numpy as np 12 | from os.path import join, isdir 13 | from os import mkdir, makedirs 14 | from concurrent import futures 15 | import sys 16 | import time 17 | import math 18 | import matplotlib.pyplot as plt 19 | import os 20 | import pandas as pd 
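# Same cropping logic as 3_gen_support_pool.py, but applied to the 10-shot split in
# new_annotations/final_split_voc_10_shot_instances_train2017.json: the patches go to
# <root>/10_shot_support/ and the index to 10_shot_support_df.pkl, which the data
# loader uses when INPUT.FS.FEW_SHOT is True (the finetuning stage).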
21 | import json 22 | import shutil 23 | import sys 24 | 25 | def vis_image(im, bboxs, im_name): 26 | dpi = 300 27 | fig, ax = plt.subplots() 28 | ax.imshow(im, aspect='equal') 29 | plt.axis('off') 30 | height, width, channels = im.shape 31 | fig.set_size_inches(width/100.0/3.0, height/100.0/3.0) 32 | plt.gca().xaxis.set_major_locator(plt.NullLocator()) 33 | plt.gca().yaxis.set_major_locator(plt.NullLocator()) 34 | plt.subplots_adjust(top=1,bottom=0,left=0,right=1,hspace=0,wspace=0) 35 | plt.margins(0,0) 36 | # Show box (off by default, box_alpha=0.0) 37 | for bbox in bboxs: 38 | ax.add_patch( 39 | plt.Rectangle((bbox[0], bbox[1]), 40 | bbox[2] - bbox[0], 41 | bbox[3] - bbox[1], 42 | fill=False, edgecolor='r', 43 | linewidth=0.5, alpha=1)) 44 | output_name = os.path.basename(im_name) 45 | plt.savefig(im_name, dpi=dpi, bbox_inches='tight', pad_inches=0) 46 | plt.close('all') 47 | 48 | 49 | def crop_support(img, bbox): 50 | image_shape = img.shape[:2] # h, w 51 | data_height, data_width = image_shape 52 | 53 | img = img.transpose(2, 0, 1) 54 | 55 | x1 = int(bbox[0]) 56 | y1 = int(bbox[1]) 57 | x2 = int(bbox[2]) 58 | y2 = int(bbox[3]) 59 | 60 | width = x2 - x1 61 | height = y2 - y1 62 | context_pixel = 16 #int(16 * im_scale) 63 | 64 | new_x1 = 0 65 | new_y1 = 0 66 | new_x2 = width 67 | new_y2 = height 68 | target_size = (320, 320) #(384, 384) 69 | 70 | if width >= height: 71 | crop_x1 = x1 - context_pixel 72 | crop_x2 = x2 + context_pixel 73 | 74 | # New_x1 and new_x2 will change when crop context or overflow 75 | new_x1 = new_x1 + context_pixel 76 | new_x2 = new_x1 + width 77 | if crop_x1 < 0: 78 | new_x1 = new_x1 + crop_x1 79 | new_x2 = new_x1 + width 80 | crop_x1 = 0 81 | if crop_x2 > data_width: 82 | crop_x2 = data_width 83 | 84 | short_size = height 85 | long_size = crop_x2 - crop_x1 86 | y_center = int((y2+y1) / 2) #math.ceil((y2 + y1) / 2) 87 | crop_y1 = int(y_center - (long_size / 2)) #int(y_center - math.ceil(long_size / 2)) 88 | crop_y2 = int(y_center + (long_size / 2)) #int(y_center + math.floor(long_size / 2)) 89 | 90 | # New_y1 and new_y2 will change when crop context or overflow 91 | new_y1 = new_y1 + math.ceil((long_size - short_size) / 2) 92 | new_y2 = new_y1 + height 93 | if crop_y1 < 0: 94 | new_y1 = new_y1 + crop_y1 95 | new_y2 = new_y1 + height 96 | crop_y1 = 0 97 | if crop_y2 > data_height: 98 | crop_y2 = data_height 99 | 100 | crop_short_size = crop_y2 - crop_y1 101 | crop_long_size = crop_x2 - crop_x1 102 | square = np.zeros((3, crop_long_size, crop_long_size), dtype = np.uint8) 103 | delta = int((crop_long_size - crop_short_size) / 2) #int(math.ceil((crop_long_size - crop_short_size) / 2)) 104 | square_y1 = delta 105 | square_y2 = delta + crop_short_size 106 | 107 | new_y1 = new_y1 + delta 108 | new_y2 = new_y2 + delta 109 | 110 | crop_box = img[:, crop_y1:crop_y2, crop_x1:crop_x2] 111 | square[:, square_y1:square_y2, :] = crop_box 112 | 113 | #show_square = np.zeros((crop_long_size, crop_long_size, 3))#, dtype=np.int16) 114 | #show_crop_box = original_img[crop_y1:crop_y2, crop_x1:crop_x2, :] 115 | #show_square[square_y1:square_y2, :, :] = show_crop_box 116 | #show_square = show_square.astype(np.int16) 117 | else: 118 | crop_y1 = y1 - context_pixel 119 | crop_y2 = y2 + context_pixel 120 | 121 | # New_y1 and new_y2 will change when crop context or overflow 122 | new_y1 = new_y1 + context_pixel 123 | new_y2 = new_y1 + height 124 | if crop_y1 < 0: 125 | new_y1 = new_y1 + crop_y1 126 | new_y2 = new_y1 + height 127 | crop_y1 = 0 128 | if crop_y2 > data_height: 129 | crop_y2 
= data_height 130 | 131 | short_size = width 132 | long_size = crop_y2 - crop_y1 133 | x_center = int((x2 + x1) / 2) #math.ceil((x2 + x1) / 2) 134 | crop_x1 = int(x_center - (long_size / 2)) #int(x_center - math.ceil(long_size / 2)) 135 | crop_x2 = int(x_center + (long_size / 2)) #int(x_center + math.floor(long_size / 2)) 136 | 137 | # New_x1 and new_x2 will change when crop context or overflow 138 | new_x1 = new_x1 + math.ceil((long_size - short_size) / 2) 139 | new_x2 = new_x1 + width 140 | if crop_x1 < 0: 141 | new_x1 = new_x1 + crop_x1 142 | new_x2 = new_x1 + width 143 | crop_x1 = 0 144 | if crop_x2 > data_width: 145 | crop_x2 = data_width 146 | 147 | crop_short_size = crop_x2 - crop_x1 148 | crop_long_size = crop_y2 - crop_y1 149 | square = np.zeros((3, crop_long_size, crop_long_size), dtype = np.uint8) 150 | delta = int((crop_long_size - crop_short_size) / 2) #int(math.ceil((crop_long_size - crop_short_size) / 2)) 151 | square_x1 = delta 152 | square_x2 = delta + crop_short_size 153 | 154 | new_x1 = new_x1 + delta 155 | new_x2 = new_x2 + delta 156 | crop_box = img[:, crop_y1:crop_y2, crop_x1:crop_x2] 157 | square[:, :, square_x1:square_x2] = crop_box 158 | 159 | #show_square = np.zeros((crop_long_size, crop_long_size, 3)) #, dtype=np.int16) 160 | #show_crop_box = original_img[crop_y1:crop_y2, crop_x1:crop_x2, :] 161 | #show_square[:, square_x1:square_x2, :] = show_crop_box 162 | #show_square = show_square.astype(np.int16) 163 | #print(crop_y2 - crop_y1, crop_x2 - crop_x1, bbox, data_height, data_width) 164 | 165 | square = square.astype(np.float32, copy=False) 166 | square_scale = float(target_size[0]) / long_size 167 | square = square.transpose(1,2,0) 168 | square = cv2.resize(square, target_size, interpolation=cv2.INTER_LINEAR) # None, None, fx=square_scale, fy=square_scale, interpolation=cv2.INTER_LINEAR) 169 | #square = square.transpose(2,0,1) 170 | square = square.astype(np.uint8) 171 | 172 | new_x1 = int(new_x1 * square_scale) 173 | new_y1 = int(new_y1 * square_scale) 174 | new_x2 = int(new_x2 * square_scale) 175 | new_y2 = int(new_y2 * square_scale) 176 | 177 | # For test 178 | #show_square = cv2.resize(show_square, target_size, interpolation=cv2.INTER_LINEAR) # None, None, fx=square_scale, fy=square_scale, interpolation=cv2.INTER_LINEAR) 179 | #self.vis_image(show_square, [new_x1, new_y1, new_x2, new_y2], img_path.split('/')[-1][:-4]+'_crop.jpg', './test') 180 | 181 | support_data = square 182 | support_box = np.array([new_x1, new_y1, new_x2, new_y2]).astype(np.float32) 183 | return support_data, support_box 184 | 185 | 186 | def main(): 187 | dataDir = '.' 
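    # Quick sanity check once the pool has been generated (a sketch, run separately):
    #   df = pd.read_pickle('./10_shot_support_df.pkl')
    #   print(df.groupby('category_id').size())  # roughly 10 support instances per class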
188 | 189 | #root_path = '/home/fanqi/data/COCO' 190 | root_path = sys.argv[1] 191 | support_path = os.path.join(root_path, '10_shot_support') 192 | #support_path = '10_shot_support' 193 | if not isdir(support_path): 194 | mkdir(support_path) 195 | #else: 196 | # shutil.rmtree(support_path) 197 | 198 | support_dict = {} 199 | 200 | support_dict['support_box'] = [] 201 | support_dict['category_id'] = [] 202 | support_dict['image_id'] = [] 203 | support_dict['id'] = [] 204 | support_dict['file_path'] = [] 205 | 206 | for dataType in ['train2017']: #, 'train2017']: 207 | set_crop_base_path = join(support_path, dataType) 208 | set_img_base_path = join(dataDir, dataType) 209 | 210 | # other information 211 | #annFile = '{}/annotations/instances_{}.json'.format(dataDir,dataType) 212 | #annFile = './new_annotations/final_split_voc_10_shot_instances_train2017.json' 213 | 214 | annFile = os.path.join(root_path, 'new_annotations/final_split_voc_10_shot_instances_train2017.json') 215 | 216 | with open(annFile,'r') as load_f: 217 | dataset = json.load(load_f) 218 | print(dataset.keys()) 219 | save_info = dataset['info'] 220 | save_licenses = dataset['licenses'] 221 | save_images = dataset['images'] 222 | save_categories = dataset['categories'] 223 | 224 | coco = COCO(annFile) 225 | 226 | for img_id, id in enumerate(coco.imgs): 227 | if img_id % 100 == 0: 228 | print(img_id) 229 | img = coco.loadImgs(id)[0] 230 | anns = coco.loadAnns(coco.getAnnIds(imgIds=id, iscrowd=None)) 231 | 232 | if len(anns) == 0: 233 | continue 234 | 235 | #print(img['file_name']) 236 | frame_crop_base_path = join(set_crop_base_path, img['file_name'].split('/')[-1].split('.')[0]) 237 | if not isdir(frame_crop_base_path): makedirs(frame_crop_base_path) 238 | im = cv2.imread('{}/{}'.format(set_img_base_path, img['file_name'])) 239 | for item_id, ann in enumerate(anns): 240 | #print(ann) 241 | rect = ann['bbox'] 242 | bbox = [rect[0], rect[1], rect[0] + rect[2], rect[1] + rect[3]] 243 | support_img, support_box = crop_support(im, bbox) 244 | #im_name = img['file_name'].split('.')[0] + '_' + str(item_id) + '.jpg' 245 | #output_dir = './fig' 246 | #vis_image(support_img[:, :, ::-1], support_box, join(frame_crop_base_path, '{:04d}.jpg'.format(item_id))) 247 | if rect[2] <= 0 or rect[3] <=0: 248 | print(rect) 249 | continue 250 | file_path = join(frame_crop_base_path, '{:04d}.jpg'.format(item_id)) 251 | cv2.imwrite(file_path, support_img) 252 | #print(file_path) 253 | support_dict['support_box'].append(support_box.tolist()) 254 | support_dict['category_id'].append(ann['category_id']) 255 | support_dict['image_id'].append(ann['image_id']) 256 | support_dict['id'].append(ann['id']) 257 | support_dict['file_path'].append(file_path) 258 | 259 | support_df = pd.DataFrame.from_dict(support_dict) 260 | 261 | return support_df 262 | 263 | 264 | if __name__ == '__main__': 265 | since = time.time() 266 | support_df = main() 267 | support_df.to_pickle("./10_shot_support_df.pkl") 268 | 269 | time_elapsed = time.time() - since 270 | print('Total complete in {:.0f}m {:.0f}s'.format( 271 | time_elapsed // 60, time_elapsed % 60)) 272 | 273 | -------------------------------------------------------------------------------- /datasets/coco/5_voc_part.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | # -*- coding: utf-8 -*- 3 | """ 4 | Created on Fri Jun 5 15:27:52 2020 5 | 6 | @author: fanq15 7 | """ 8 | 9 | from pycocotools.coco import COCO 10 | import cv2 11 | import numpy as np 12 | 
from os.path import join, isdir 13 | from os import mkdir, makedirs 14 | from concurrent import futures 15 | import sys 16 | import time 17 | import math 18 | import matplotlib.pyplot as plt 19 | import os 20 | import pandas as pd 21 | import json 22 | import sys 23 | 24 | root_path = './' 25 | print(root_path) 26 | #root_path = '/home/fanqi/data/COCO' 27 | dataDir = './annotations' 28 | support_dict = {} 29 | 30 | support_dict['support_box'] = [] 31 | support_dict['category_id'] = [] 32 | support_dict['image_id'] = [] 33 | support_dict['id'] = [] 34 | support_dict['file_path'] = [] 35 | 36 | voc_inds = (0, 1, 2, 3, 4, 5, 6, 8, 14, 15, 16, 17, 18, 19, 39, 56, 57, 58, 60, 62) 37 | 38 | 39 | for dataType in ['instances_train2017.json']: #, 'split_voc_instances_train2017.json']: 40 | annFile = join(dataDir, dataType) 41 | 42 | with open(annFile,'r') as load_f: 43 | dataset = json.load(load_f) 44 | print(dataset.keys()) 45 | save_info = dataset['info'] 46 | save_licenses = dataset['licenses'] 47 | save_images = dataset['images'] 48 | save_categories = dataset['categories'] 49 | save_annotations = dataset['annotations'] 50 | 51 | 52 | inds_split2 = [i for i in range(len(save_categories)) if i not in voc_inds] 53 | 54 | # split annotations according to categories 55 | categories_split1 = [save_categories[i] for i in voc_inds] 56 | categories_split2 = [save_categories[i] for i in inds_split2] 57 | cids_split1 = [c['id'] for c in categories_split1] 58 | cids_split2 = [c['id'] for c in categories_split2] 59 | print('Split 1: {} classes'.format(len(categories_split1))) 60 | for c in categories_split1: 61 | print('\t', c['name']) 62 | print('Split 2: {} classes'.format(len(categories_split2))) 63 | for c in categories_split2: 64 | print('\t', c['name']) 65 | 66 | coco = COCO(annFile) 67 | 68 | # for voc, there can be non_voc images 69 | annotations = dataset['annotations'] 70 | annotations_split1 = [] 71 | 72 | for ann in annotations: 73 | if ann['category_id'] in cids_split1: # voc 20 74 | annotations_split1.append(ann) 75 | 76 | dataset_split1 = { 77 | 'info': save_info, 78 | 'licenses': save_licenses, 79 | 'images': save_images, 80 | 'annotations': annotations_split1, 81 | 'categories': save_categories} 82 | 83 | new_annotations_path = os.path.join(root_path, 'new_annotations') 84 | if not os.path.exists(new_annotations_path): 85 | os.makedirs(new_annotations_path) 86 | split1_file = os.path.join(root_path, 'new_annotations/split_voc_instances_train2017.json') 87 | 88 | with open(split1_file, 'w') as f: 89 | json.dump(dataset_split1, f) 90 | 91 | 92 | 93 | -------------------------------------------------------------------------------- /datasets/coco/6_voc_few_shot.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | # -*- coding: utf-8 -*- 3 | """ 4 | Created on Fri Jun 5 16:04:06 2020 5 | 6 | @author: fanq15 7 | """ 8 | 9 | from pycocotools.coco import COCO 10 | import cv2 11 | import numpy as np 12 | from os.path import join, isdir 13 | from os import mkdir, makedirs 14 | from concurrent import futures 15 | import sys 16 | import time 17 | import math 18 | import matplotlib.pyplot as plt 19 | import os 20 | import pandas as pd 21 | import json 22 | import sys 23 | 24 | def few_shot(coco, shot_num): 25 | new_anns = [] 26 | all_cls_dict = {} 27 | for img_id, id in enumerate(coco.imgs): 28 | anns = coco.loadAnns(coco.getAnnIds(imgIds=id, iscrowd=None)) 29 | skip_flag = False 30 | img_cls_dict = {} 31 | if len(anns) != 1: 32 | continue 
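        # Selection rule applied below: only single-annotation images are kept,
        # boxes with area outside [64*64, 224*224] are rejected, and a class stops
        # accepting instances once it has collected shot_num (10) of them.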
33 | for ann in anns: 34 | area = ann['area'] 35 | category_id = ann['category_id'] 36 | id = ann['id'] 37 | 38 | if category_id in img_cls_dict.keys(): 39 | img_cls_dict[category_id] += 1 40 | else: 41 | img_cls_dict[category_id] = 1 42 | 43 | # filter images with small boxes 44 | if area < 64 * 64 or area > 224 * 224: 45 | skip_flag = True 46 | 47 | if category_id in all_cls_dict.keys(): 48 | if all_cls_dict[category_id] == shot_num: 49 | skip_flag = True 50 | 51 | if skip_flag: 52 | continue 53 | else: 54 | for ann in anns: 55 | new_anns.append(ann) 56 | for category_id, num in img_cls_dict.items(): 57 | if category_id in all_cls_dict.keys(): 58 | all_cls_dict[category_id] += num 59 | else: 60 | all_cls_dict[category_id] = num 61 | print(len(new_anns)) 62 | print(sorted(all_cls_dict.items(), key = lambda kv:(kv[1], kv[0]))) 63 | return new_anns 64 | 65 | 66 | root_path = './' 67 | #root_path = '/home/fanqi/data/COCO' 68 | dataDir = os.path.join(root_path, 'new_annotations') 69 | support_dict = {} 70 | 71 | support_dict['support_box'] = [] 72 | support_dict['category_id'] = [] 73 | support_dict['image_id'] = [] 74 | support_dict['id'] = [] 75 | support_dict['file_path'] = [] 76 | 77 | 78 | for dataType in ['split_voc_instances_train2017.json']: 79 | annFile = join(dataDir, dataType) 80 | 81 | with open(annFile,'r') as load_f: 82 | dataset = json.load(load_f) 83 | print(dataset.keys()) 84 | save_info = dataset['info'] 85 | save_licenses = dataset['licenses'] 86 | save_images = dataset['images'] 87 | save_categories = dataset['categories'] 88 | 89 | print(annFile) 90 | shot_num = 10 91 | coco = COCO(annFile) 92 | print(coco) 93 | 94 | annotations = few_shot(coco, shot_num) 95 | dataset_split = { 96 | 'info': save_info, 97 | 'licenses': save_licenses, 98 | 'images': save_images, 99 | 'annotations': annotations, 100 | 'categories': save_categories} 101 | #split_file = os.path.join(root_path, 'new_annotations/final_split_voc_10_shot_instances_train2017.json') 102 | split_file = './new_annotations/final_split_voc_10_shot_instances_train2017.json' 103 | 104 | with open(split_file, 'w') as f: 105 | json.dump(dataset_split, f) 106 | 107 | 108 | 109 | 110 | -------------------------------------------------------------------------------- /datasets/generate_support_data.sh: -------------------------------------------------------------------------------- 1 | DATA_ROOT=/home/fanqi/data/COCO 2 | 3 | cd coco 4 | 5 | ln -s $DATA_ROOT/train2017 ./ 6 | ln -s $DATA_ROOT/val2017 ./ 7 | ln -s $DATA_ROOT/annotations ./ 8 | 9 | python3 1_split_filter.py ./ 10 | #python3 2_balance.py ./ 11 | python3 3_gen_support_pool.py ./ 12 | python3 4_gen_support_pool_10_shot.py ./ 13 | 14 | -------------------------------------------------------------------------------- /fewx/__init__.py: -------------------------------------------------------------------------------- 1 | from fewx import modeling 2 | from fewx import config 3 | from fewx import layers 4 | 5 | __version__ = "0.1" 6 | -------------------------------------------------------------------------------- /fewx/config/__init__.py: -------------------------------------------------------------------------------- 1 | from .config import get_cfg 2 | 3 | __all__ = [ 4 | "get_cfg", 5 | ] 6 | -------------------------------------------------------------------------------- /fewx/config/config.py: -------------------------------------------------------------------------------- 1 | from detectron2.config import CfgNode 2 | 3 | 4 | def get_cfg() -> CfgNode: 5 | """ 6 | Get a 
copy of the default config. 7 | Returns: 8 | a detectron2 CfgNode instance. 9 | """ 10 | from .defaults import _C 11 | 12 | return _C.clone() 13 | -------------------------------------------------------------------------------- /fewx/config/defaults.py: -------------------------------------------------------------------------------- 1 | from detectron2.config.defaults import _C 2 | from detectron2.config import CfgNode as CN 3 | 4 | 5 | # ---------------------------------------------------------------------------- # 6 | # Additional Configs 7 | # ---------------------------------------------------------------------------- # 8 | _C.SOLVER.HEAD_LR_FACTOR = 1.0 9 | 10 | # ---------------------------------------------------------------------------- # 11 | # Few shot setting 12 | # ---------------------------------------------------------------------------- # 13 | _C.INPUT.FS = CN() 14 | _C.INPUT.FS.FEW_SHOT = False 15 | _C.INPUT.FS.SUPPORT_WAY = 2 16 | _C.INPUT.FS.SUPPORT_SHOT = 10 17 | -------------------------------------------------------------------------------- /fewx/data/__init__.py: -------------------------------------------------------------------------------- 1 | from .dataset_mapper import DatasetMapperWithSupport 2 | # ensure the builtin datasets are registered 3 | from . import datasets # isort:skip 4 | 5 | __all__ = [k for k in globals().keys() if not k.startswith("_")] 6 | -------------------------------------------------------------------------------- /fewx/data/build.py: -------------------------------------------------------------------------------- 1 | # Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved 2 | import bisect 3 | import copy 4 | import itertools 5 | import logging 6 | import numpy as np 7 | import operator 8 | import pickle 9 | import torch.utils.data 10 | from fvcore.common.file_io import PathManager 11 | from tabulate import tabulate 12 | from termcolor import colored 13 | 14 | from detectron2.structures import BoxMode 15 | from detectron2.utils.comm import get_world_size 16 | from detectron2.utils.env import seed_all_rng 17 | from detectron2.utils.logger import log_first_n 18 | 19 | from detectron2.data.catalog import DatasetCatalog, MetadataCatalog 20 | from detectron2.data.common import AspectRatioGroupedDataset, DatasetFromList, MapDataset 21 | from detectron2.data.dataset_mapper import DatasetMapper 22 | from detectron2.data.detection_utils import check_metadata_consistency 23 | from detectron2.data.samplers import InferenceSampler, RepeatFactorTrainingSampler, TrainingSampler 24 | 25 | from detectron2.data.build import build_batch_data_loader, filter_images_with_only_crowd_annotations, load_proposals_into_dataset, filter_images_with_few_keypoints, print_instances_class_histogram, trivial_batch_collator, get_detection_dataset_dicts 26 | 27 | def fsod_get_detection_dataset_dicts( 28 | dataset_names, filter_empty=True, min_keypoints=0, proposal_files=None 29 | ): 30 | """ 31 | Load and prepare dataset dicts for instance detection/segmentation and semantic segmentation. 32 | Args: 33 | dataset_names (list[str]): a list of dataset names 34 | filter_empty (bool): whether to filter out images without instance annotations 35 | min_keypoints (int): filter out images with fewer keypoints than 36 | `min_keypoints`. Set to 0 to do nothing. 37 | proposal_files (list[str]): if given, a list of object proposal files 38 | that match each dataset in `dataset_names`. 
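    Compared to detectron2's `get_detection_dataset_dicts`, training dicts are additionally
    split per category: each returned record keeps one image's annotations of a single
    category, so that a few-shot training episode can be built around that category.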
39 | """ 40 | assert len(dataset_names) 41 | dataset_dicts_original = [DatasetCatalog.get(dataset_name) for dataset_name in dataset_names] 42 | for dataset_name, dicts in zip(dataset_names, dataset_dicts_original): 43 | assert len(dicts), "Dataset '{}' is empty!".format(dataset_name) 44 | 45 | if proposal_files is not None: 46 | assert len(dataset_names) == len(proposal_files) 47 | # load precomputed proposals from proposal files 48 | dataset_dicts_original = [ 49 | load_proposals_into_dataset(dataset_i_dicts, proposal_file) 50 | for dataset_i_dicts, proposal_file in zip(dataset_dicts_original, proposal_files) 51 | ] 52 | 53 | if 'train' not in dataset_names[0]: 54 | dataset_dicts = list(itertools.chain.from_iterable(dataset_dicts_original)) 55 | else: 56 | dataset_dicts_original = list(itertools.chain.from_iterable(dataset_dicts_original)) 57 | dataset_dicts_original = filter_images_with_only_crowd_annotations(dataset_dicts_original) 58 | ################################################################################### 59 | # split image-based annotations to instance-based annotations for few-shot learning 60 | dataset_dicts = [] 61 | index_dicts = [] 62 | split_flag = True 63 | if split_flag: 64 | for record in dataset_dicts_original: 65 | file_name = record['file_name'] 66 | height = record['height'] 67 | width = record['width'] 68 | image_id = record['image_id'] 69 | annotations = record['annotations'] 70 | category_dict = {} 71 | for ann_id, ann in enumerate(annotations): 72 | 73 | ann.pop("segmentation", None) 74 | ann.pop("keypoints", None) 75 | 76 | category_id = ann['category_id'] 77 | if category_id not in category_dict.keys(): 78 | category_dict[category_id] = [ann] 79 | else: 80 | category_dict[category_id].append(ann) 81 | 82 | for key, item in category_dict.items(): 83 | instance_ann = {} 84 | instance_ann['file_name'] = file_name 85 | instance_ann['height'] = height 86 | instance_ann['width'] = width 87 | 88 | instance_ann['annotations'] = item 89 | 90 | dataset_dicts.append(instance_ann) 91 | 92 | 93 | has_instances = "annotations" in dataset_dicts[0] 94 | # Keep images without instance-level GT if the dataset has semantic labels. 95 | if filter_empty and has_instances and "sem_seg_file_name" not in dataset_dicts[0]: 96 | dataset_dicts = filter_images_with_only_crowd_annotations(dataset_dicts) 97 | 98 | if min_keypoints > 0 and has_instances: 99 | dataset_dicts = filter_images_with_few_keypoints(dataset_dicts, min_keypoints) 100 | 101 | if has_instances: 102 | try: 103 | class_names = MetadataCatalog.get(dataset_names[0]).thing_classes 104 | check_metadata_consistency("thing_classes", dataset_names) 105 | print_instances_class_histogram(dataset_dicts, class_names) 106 | except AttributeError: # class names are not available for this dataset 107 | pass 108 | return dataset_dicts 109 | 110 | def build_detection_train_loader(cfg, mapper=None): 111 | """ 112 | A data loader is created by the following steps: 113 | 1. Use the dataset names in config to query :class:`DatasetCatalog`, and obtain a list of dicts. 114 | 2. Coordinate a random shuffle order shared among all processes (all GPUs) 115 | 3. Each process spawn another few workers to process the dicts. Each worker will: 116 | * Map each metadata dict into another format to be consumed by the model. 117 | * Batch them by simply putting dicts into a list. 118 | The batched ``list[mapped_dict]`` is what this dataloader will yield. 
119 | Args: 120 | cfg (CfgNode): the config 121 | mapper (callable): a callable which takes a sample (dict) from dataset and 122 | returns the format to be consumed by the model. 123 | By default it will be `DatasetMapper(cfg, True)`. 124 | Returns: 125 | an infinite iterator of training data 126 | """ 127 | dataset_dicts = fsod_get_detection_dataset_dicts( 128 | cfg.DATASETS.TRAIN, 129 | filter_empty=cfg.DATALOADER.FILTER_EMPTY_ANNOTATIONS, 130 | min_keypoints=cfg.MODEL.ROI_KEYPOINT_HEAD.MIN_KEYPOINTS_PER_IMAGE 131 | if cfg.MODEL.KEYPOINT_ON 132 | else 0, 133 | proposal_files=cfg.DATASETS.PROPOSAL_FILES_TRAIN if cfg.MODEL.LOAD_PROPOSALS else None, 134 | ) 135 | dataset = DatasetFromList(dataset_dicts, copy=False) 136 | 137 | if mapper is None: 138 | mapper = DatasetMapper(cfg, True) 139 | dataset = MapDataset(dataset, mapper) 140 | 141 | sampler_name = cfg.DATALOADER.SAMPLER_TRAIN 142 | logger = logging.getLogger(__name__) 143 | logger.info("Using training sampler {}".format(sampler_name)) 144 | # TODO avoid if-else? 145 | if sampler_name == "TrainingSampler": 146 | sampler = TrainingSampler(len(dataset)) 147 | elif sampler_name == "RepeatFactorTrainingSampler": 148 | repeat_factors = RepeatFactorTrainingSampler.repeat_factors_from_category_frequency( 149 | dataset_dicts, cfg.DATALOADER.REPEAT_THRESHOLD 150 | ) 151 | sampler = RepeatFactorTrainingSampler(repeat_factors) 152 | else: 153 | raise ValueError("Unknown training sampler: {}".format(sampler_name)) 154 | return build_batch_data_loader( 155 | dataset, 156 | sampler, 157 | cfg.SOLVER.IMS_PER_BATCH, 158 | aspect_ratio_grouping=cfg.DATALOADER.ASPECT_RATIO_GROUPING, 159 | num_workers=cfg.DATALOADER.NUM_WORKERS, 160 | ) 161 | 162 | def build_detection_test_loader(cfg, dataset_name, mapper=None): 163 | """ 164 | Similar to `build_detection_train_loader`. 165 | But this function uses the given `dataset_name` argument (instead of the names in cfg), 166 | and uses batch size 1. 167 | Args: 168 | cfg: a detectron2 CfgNode 169 | dataset_name (str): a name of the dataset that's available in the DatasetCatalog 170 | mapper (callable): a callable which takes a sample (dict) from dataset 171 | and returns the format to be consumed by the model. 172 | By default it will be `DatasetMapper(cfg, False)`. 173 | Returns: 174 | DataLoader: a torch DataLoader, that loads the given detection 175 | dataset, with test-time transformation and batching. 176 | """ 177 | dataset_dicts = get_detection_dataset_dicts( 178 | [dataset_name], 179 | filter_empty=False, # True, 180 | proposal_files=[ 181 | cfg.DATASETS.PROPOSAL_FILES_TEST[list(cfg.DATASETS.TEST).index(dataset_name)] 182 | ] 183 | if cfg.MODEL.LOAD_PROPOSALS 184 | else None, 185 | ) 186 | 187 | dataset = DatasetFromList(dataset_dicts) 188 | if mapper is None: 189 | mapper = DatasetMapper(cfg, False) # True) 190 | dataset = MapDataset(dataset, mapper) 191 | 192 | sampler = InferenceSampler(len(dataset)) 193 | # Always use 1 image per worker during inference since this is the 194 | # standard when reporting inference time in papers. 
195 | batch_sampler = torch.utils.data.sampler.BatchSampler(sampler, 1, drop_last=False) 196 | 197 | data_loader = torch.utils.data.DataLoader( 198 | dataset, 199 | num_workers=cfg.DATALOADER.NUM_WORKERS, 200 | batch_sampler=batch_sampler, 201 | collate_fn=trivial_batch_collator, 202 | ) 203 | return data_loader 204 | -------------------------------------------------------------------------------- /fewx/data/dataset_mapper.py: -------------------------------------------------------------------------------- 1 | # Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved 2 | import copy 3 | import logging 4 | import numpy as np 5 | import torch 6 | from fvcore.common.file_io import PathManager 7 | from PIL import Image 8 | 9 | from detectron2.data import detection_utils as utils 10 | from detectron2.data import transforms as T 11 | 12 | import pandas as pd 13 | from detectron2.data.catalog import MetadataCatalog 14 | 15 | """ 16 | This file contains the default mapping that's applied to "dataset dicts". 17 | """ 18 | 19 | __all__ = ["DatasetMapperWithSupport"] 20 | 21 | 22 | class DatasetMapperWithSupport: 23 | """ 24 | A callable which takes a dataset dict in Detectron2 Dataset format, 25 | and map it into a format used by the model. 26 | 27 | This is the default callable to be used to map your dataset dict into training data. 28 | You may need to follow it to implement your own one for customized logic, 29 | such as a different way to read or transform images. 30 | See :doc:`/tutorials/data_loading` for details. 31 | 32 | The callable currently does the following: 33 | 34 | 1. Read the image from "file_name" 35 | 2. Applies cropping/geometric transforms to the image and annotations 36 | 3. Prepare data and annotations to Tensor and :class:`Instances` 37 | """ 38 | 39 | def __init__(self, cfg, is_train=True): 40 | if cfg.INPUT.CROP.ENABLED and is_train: 41 | self.crop_gen = T.RandomCrop(cfg.INPUT.CROP.TYPE, cfg.INPUT.CROP.SIZE) 42 | logging.getLogger(__name__).info("CropGen used in training: " + str(self.crop_gen)) 43 | else: 44 | self.crop_gen = None 45 | 46 | self.tfm_gens = utils.build_transform_gen(cfg, is_train) 47 | 48 | # fmt: off 49 | self.img_format = cfg.INPUT.FORMAT 50 | self.mask_on = cfg.MODEL.MASK_ON 51 | self.mask_format = cfg.INPUT.MASK_FORMAT 52 | self.keypoint_on = cfg.MODEL.KEYPOINT_ON 53 | self.load_proposals = cfg.MODEL.LOAD_PROPOSALS 54 | 55 | self.few_shot = cfg.INPUT.FS.FEW_SHOT 56 | self.support_way = cfg.INPUT.FS.SUPPORT_WAY 57 | self.support_shot = cfg.INPUT.FS.SUPPORT_SHOT 58 | # fmt: on 59 | if self.keypoint_on and is_train: 60 | # Flip only makes sense in training 61 | self.keypoint_hflip_indices = utils.create_keypoint_hflip_indices(cfg.DATASETS.TRAIN) 62 | else: 63 | self.keypoint_hflip_indices = None 64 | 65 | if self.load_proposals: 66 | self.proposal_min_box_size = cfg.MODEL.PROPOSAL_GENERATOR.MIN_SIZE 67 | self.proposal_topk = ( 68 | cfg.DATASETS.PRECOMPUTED_PROPOSAL_TOPK_TRAIN 69 | if is_train 70 | else cfg.DATASETS.PRECOMPUTED_PROPOSAL_TOPK_TEST 71 | ) 72 | self.is_train = is_train 73 | 74 | if self.is_train: 75 | # support_df 76 | self.support_on = True 77 | if self.few_shot: 78 | self.support_df = pd.read_pickle("./datasets/coco/10_shot_support_df.pkl") 79 | else: 80 | self.support_df = pd.read_pickle("./datasets/coco/train_support_df.pkl") 81 | 82 | metadata = MetadataCatalog.get('coco_2017_train') 83 | # unmap the category mapping ids for COCO 84 | reverse_id_mapper = lambda dataset_id: 
metadata.thing_dataset_id_to_contiguous_id[dataset_id] # noqa 85 | self.support_df['category_id'] = self.support_df['category_id'].map(reverse_id_mapper) 86 | 87 | 88 | def __call__(self, dataset_dict): 89 | """ 90 | Args: 91 | dataset_dict (dict): Metadata of one image, in Detectron2 Dataset format. 92 | 93 | Returns: 94 | dict: a format that builtin models in detectron2 accept 95 | """ 96 | dataset_dict = copy.deepcopy(dataset_dict) # it will be modified by code below 97 | # USER: Write your own image loading if it's not from a file 98 | image = utils.read_image(dataset_dict["file_name"], format=self.img_format) 99 | utils.check_image_size(dataset_dict, image) 100 | if self.is_train: 101 | # support 102 | if self.support_on: 103 | if "annotations" in dataset_dict: 104 | # USER: Modify this if you want to keep them for some reason. 105 | for anno in dataset_dict["annotations"]: 106 | if not self.mask_on: 107 | anno.pop("segmentation", None) 108 | if not self.keypoint_on: 109 | anno.pop("keypoints", None) 110 | support_images, support_bboxes, support_cls = self.generate_support(dataset_dict) 111 | dataset_dict['support_images'] = torch.as_tensor(np.ascontiguousarray(support_images)) 112 | dataset_dict['support_bboxes'] = support_bboxes 113 | dataset_dict['support_cls'] = support_cls 114 | 115 | if "annotations" not in dataset_dict: 116 | image, transforms = T.apply_transform_gens( 117 | ([self.crop_gen] if self.crop_gen else []) + self.tfm_gens, image 118 | ) 119 | else: 120 | # Crop around an instance if there are instances in the image. 121 | # USER: Remove if you don't use cropping 122 | if self.crop_gen: 123 | crop_tfm = utils.gen_crop_transform_with_instance( 124 | self.crop_gen.get_crop_size(image.shape[:2]), 125 | image.shape[:2], 126 | np.random.choice(dataset_dict["annotations"]), 127 | ) 128 | image = crop_tfm.apply_image(image) 129 | image, transforms = T.apply_transform_gens(self.tfm_gens, image) 130 | if self.crop_gen: 131 | transforms = crop_tfm + transforms 132 | 133 | image_shape = image.shape[:2] # h, w 134 | 135 | # Pytorch's dataloader is efficient on torch.Tensor due to shared-memory, 136 | # but not efficient on large generic data structures due to the use of pickle & mp.Queue. 137 | # Therefore it's important to use torch.Tensor. 138 | dataset_dict["image"] = torch.as_tensor(np.ascontiguousarray(image.transpose(2, 0, 1))) 139 | 140 | # USER: Remove if you don't use pre-computed proposals. 141 | # Most users would not need this feature. 142 | if self.load_proposals: 143 | utils.transform_proposals( 144 | dataset_dict, 145 | image_shape, 146 | transforms, 147 | self.proposal_min_box_size, 148 | self.proposal_topk, 149 | ) 150 | 151 | if not self.is_train: 152 | # USER: Modify this if you want to keep them for some reason. 153 | dataset_dict.pop("annotations", None) 154 | dataset_dict.pop("sem_seg_file_name", None) 155 | return dataset_dict 156 | 157 | if "annotations" in dataset_dict: 158 | # USER: Modify this if you want to keep them for some reason. 
159 | for anno in dataset_dict["annotations"]: 160 | if not self.mask_on: 161 | anno.pop("segmentation", None) 162 | if not self.keypoint_on: 163 | anno.pop("keypoints", None) 164 | 165 | # USER: Implement additional transformations if you have other types of data 166 | annos = [ 167 | utils.transform_instance_annotations( 168 | obj, transforms, image_shape, keypoint_hflip_indices=self.keypoint_hflip_indices 169 | ) 170 | for obj in dataset_dict.pop("annotations") 171 | if obj.get("iscrowd", 0) == 0 172 | ] 173 | instances = utils.annotations_to_instances( 174 | annos, image_shape, mask_format=self.mask_format 175 | ) 176 | # Create a tight bounding box from masks, useful when image is cropped 177 | if self.crop_gen and instances.has("gt_masks"): 178 | instances.gt_boxes = instances.gt_masks.get_bounding_boxes() 179 | dataset_dict["instances"] = utils.filter_empty_instances(instances) 180 | 181 | # USER: Remove if you don't do semantic/panoptic segmentation. 182 | if "sem_seg_file_name" in dataset_dict: 183 | with PathManager.open(dataset_dict.pop("sem_seg_file_name"), "rb") as f: 184 | sem_seg_gt = Image.open(f) 185 | sem_seg_gt = np.asarray(sem_seg_gt, dtype="uint8") 186 | sem_seg_gt = transforms.apply_segmentation(sem_seg_gt) 187 | sem_seg_gt = torch.as_tensor(sem_seg_gt.astype("long")) 188 | dataset_dict["sem_seg"] = sem_seg_gt 189 | return dataset_dict 190 | 191 | def generate_support(self, dataset_dict): 192 | support_way = self.support_way #2 193 | support_shot = self.support_shot #5 194 | 195 | id = dataset_dict['annotations'][0]['id'] 196 | query_cls = self.support_df.loc[self.support_df['id']==id, 'category_id'].tolist()[0] # they share the same category_id and image_id 197 | query_img = self.support_df.loc[self.support_df['id']==id, 'image_id'].tolist()[0] 198 | all_cls = self.support_df.loc[self.support_df['image_id']==query_img, 'category_id'].tolist() 199 | 200 | # Crop support data and get new support box in the support data 201 | support_data_all = np.zeros((support_way * support_shot, 3, 320, 320), dtype = np.float32) 202 | support_box_all = np.zeros((support_way * support_shot, 4), dtype = np.float32) 203 | used_image_id = [query_img] 204 | 205 | used_id_ls = [] 206 | for item in dataset_dict['annotations']: 207 | used_id_ls.append(item['id']) 208 | #used_category_id = [query_cls] 209 | used_category_id = list(set(all_cls)) 210 | support_category_id = [] 211 | mixup_i = 0 212 | 213 | for shot in range(support_shot): 214 | # Support image and box 215 | support_id = self.support_df.loc[(self.support_df['category_id'] == query_cls) & (~self.support_df['image_id'].isin(used_image_id)) & (~self.support_df['id'].isin(used_id_ls)), 'id'].sample(random_state=id).tolist()[0] 216 | support_cls = self.support_df.loc[self.support_df['id'] == support_id, 'category_id'].tolist()[0] 217 | support_img = self.support_df.loc[self.support_df['id'] == support_id, 'image_id'].tolist()[0] 218 | used_id_ls.append(support_id) 219 | used_image_id.append(support_img) 220 | 221 | support_db = self.support_df.loc[self.support_df['id'] == support_id, :] 222 | assert support_db['id'].values[0] == support_id 223 | 224 | support_data = utils.read_image('./datasets/coco/' + support_db["file_path"].tolist()[0], format=self.img_format) 225 | support_data = torch.as_tensor(np.ascontiguousarray(support_data.transpose(2, 0, 1))) 226 | support_box = support_db['support_box'].tolist()[0] 227 | #print(support_data) 228 | support_data_all[mixup_i] = support_data 229 | support_box_all[mixup_i] = support_box 230 | 
support_category_id.append(0) #support_cls) 231 | mixup_i += 1 232 | 233 | if support_way == 1: 234 | pass 235 | else: 236 | for way in range(support_way-1): 237 | other_cls = self.support_df.loc[(~self.support_df['category_id'].isin(used_category_id)), 'category_id'].drop_duplicates().sample(random_state=id).tolist()[0] 238 | used_category_id.append(other_cls) 239 | for shot in range(support_shot): 240 | # Support image and box 241 | 242 | support_id = self.support_df.loc[(self.support_df['category_id'] == other_cls) & (~self.support_df['image_id'].isin(used_image_id)) & (~self.support_df['id'].isin(used_id_ls)), 'id'].sample(random_state=id).tolist()[0] 243 | 244 | support_cls = self.support_df.loc[self.support_df['id'] == support_id, 'category_id'].tolist()[0] 245 | support_img = self.support_df.loc[self.support_df['id'] == support_id, 'image_id'].tolist()[0] 246 | 247 | used_id_ls.append(support_id) 248 | used_image_id.append(support_img) 249 | 250 | support_db = self.support_df.loc[self.support_df['id'] == support_id, :] 251 | assert support_db['id'].values[0] == support_id 252 | 253 | support_data = utils.read_image('./datasets/coco/' + support_db["file_path"].tolist()[0], format=self.img_format) 254 | support_data = torch.as_tensor(np.ascontiguousarray(support_data.transpose(2, 0, 1))) 255 | support_box = support_db['support_box'].tolist()[0] 256 | support_data_all[mixup_i] = support_data 257 | support_box_all[mixup_i] = support_box 258 | support_category_id.append(1) #support_cls) 259 | mixup_i += 1 260 | 261 | return support_data_all, support_box_all, support_category_id 262 | -------------------------------------------------------------------------------- /fewx/data/datasets/__init__.py: -------------------------------------------------------------------------------- 1 | from . import builtin # ensure the builtin datasets are registered 2 | from .register_coco import register_coco_instances 3 | 4 | __all__ = [k for k in globals().keys() if "builtin" not in k and not k.startswith("_")] 5 | -------------------------------------------------------------------------------- /fewx/data/datasets/builtin.py: -------------------------------------------------------------------------------- 1 | import os 2 | 3 | from .register_coco import register_coco_instances 4 | from detectron2.data.datasets.builtin_meta import _get_builtin_metadata 5 | 6 | # ==== Predefined datasets and splits for COCO ========== 7 | 8 | _PREDEFINED_SPLITS_COCO = {} 9 | _PREDEFINED_SPLITS_COCO["coco"] = { 10 | "coco_2017_train_nonvoc": ("coco/train2017", "coco/new_annotations/final_split_non_voc_instances_train2017.json"), 11 | "coco_2017_train_voc_10_shot": ("coco/train2017", "coco/new_annotations/final_split_voc_10_shot_instances_train2017.json"), 12 | } 13 | 14 | def register_all_coco(root): 15 | for dataset_name, splits_per_dataset in _PREDEFINED_SPLITS_COCO.items(): 16 | for key, (image_root, json_file) in splits_per_dataset.items(): 17 | # Assume pre-defined datasets live in `./datasets`. 
18 | register_coco_instances( 19 | key, 20 | _get_builtin_metadata(dataset_name), 21 | os.path.join(root, json_file) if "://" not in json_file else json_file, 22 | os.path.join(root, image_root), 23 | ) 24 | 25 | # Register them all under "./datasets" 26 | _root = os.getenv("DETECTRON2_DATASETS", "datasets") 27 | register_all_coco(_root) 28 | -------------------------------------------------------------------------------- /fewx/data/datasets/register_coco.py: -------------------------------------------------------------------------------- 1 | # Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved 2 | import copy 3 | import os 4 | 5 | from detectron2.data import DatasetCatalog, MetadataCatalog 6 | 7 | from detectron2.data.datasets.coco import load_coco_json, load_sem_seg 8 | 9 | """ 10 | This file contains functions to register a COCO-format dataset to the DatasetCatalog. 11 | """ 12 | 13 | __all__ = ["register_coco_instances"]  # only this function is defined in this file 14 | 15 | 16 | def register_coco_instances(name, metadata, json_file, image_root): 17 | """ 18 | Register a dataset in COCO's json annotation format for 19 | instance detection, instance segmentation and keypoint detection. 20 | (i.e., Type 1 and 2 in http://cocodataset.org/#format-data. 21 | `instances*.json` and `person_keypoints*.json` in the dataset). 22 | This is an example of how to register a new dataset. 23 | You can do something similar to this function, to register new datasets. 24 | Args: 25 | name (str): the name that identifies a dataset, e.g. "coco_2014_train". 26 | metadata (dict): extra metadata associated with this dataset. You can 27 | leave it as an empty dict. 28 | json_file (str): path to the json instance annotation file. 29 | image_root (str or path-like): directory which contains all the images. 30 | """ 31 | assert isinstance(name, str), name 32 | assert isinstance(json_file, (str, os.PathLike)), json_file 33 | assert isinstance(image_root, (str, os.PathLike)), image_root 34 | # 1. register a function which returns dicts 35 | DatasetCatalog.register(name, lambda: load_coco_json(json_file, image_root, name, extra_annotation_keys=['id'])) 36 | 37 | # 2. Optionally, add metadata about this dataset, 38 | # since they might be useful in evaluation, visualization or logging 39 | MetadataCatalog.get(name).set( 40 | json_file=json_file, image_root=image_root, evaluator_type="coco", **metadata 41 | ) 42 | -------------------------------------------------------------------------------- /fewx/evaluation/__init__.py: -------------------------------------------------------------------------------- 1 | from .coco_evaluation import COCOEvaluator 2 | 3 | __all__ = [k for k in globals().keys() if not k.startswith("_")] 4 | -------------------------------------------------------------------------------- /fewx/evaluation/coco_evaluation.py: -------------------------------------------------------------------------------- 1 | # Copyright (c) Facebook, Inc. and its affiliates. 
All Rights Reserved 2 | import contextlib 3 | import copy 4 | import io 5 | import itertools 6 | import json 7 | import logging 8 | import numpy as np 9 | import os 10 | import pickle 11 | from collections import OrderedDict 12 | import pycocotools.mask as mask_util 13 | import torch 14 | from fvcore.common.file_io import PathManager 15 | from pycocotools.coco import COCO 16 | from tabulate import tabulate 17 | 18 | import detectron2.utils.comm as comm 19 | from detectron2.data import MetadataCatalog 20 | from detectron2.data.datasets.coco import convert_to_coco_json 21 | from detectron2.evaluation.fast_eval_api import COCOeval_opt as COCOeval 22 | from detectron2.structures import Boxes, BoxMode, pairwise_iou 23 | from detectron2.utils.logger import create_small_table 24 | 25 | from detectron2.evaluation.evaluator import DatasetEvaluator 26 | 27 | CLASS_NAMES = [ 28 | "airplane", "bicycle", "bird", "boat", "bottle", "bus", "car", "cat", 29 | "chair", "cow", "dining table", "dog", "horse", "motorcycle", "person", 30 | "potted plant", "sheep", "couch", "train", "tv", 31 | ] 32 | 33 | class COCOEvaluator(DatasetEvaluator): 34 | """ 35 | Evaluate AR for object proposals, AP for instance detection/segmentation, AP 36 | for keypoint detection outputs using COCO's metrics. 37 | See http://cocodataset.org/#detection-eval and 38 | http://cocodataset.org/#keypoints-eval to understand its metrics. 39 | 40 | In addition to COCO, this evaluator is able to support any bounding box detection, 41 | instance segmentation, or keypoint detection dataset. 42 | """ 43 | 44 | def __init__(self, dataset_name, cfg, distributed, output_dir=None): 45 | """ 46 | Args: 47 | dataset_name (str): name of the dataset to be evaluated. 48 | It must have either the following corresponding metadata: 49 | 50 | "json_file": the path to the COCO format annotation 51 | 52 | Or it must be in detectron2's standard dataset format 53 | so it can be converted to COCO format automatically. 54 | cfg (CfgNode): config instance 55 | distributed (True): if True, will collect results from all ranks and run evaluation 56 | in the main process. 57 | Otherwise, will evaluate the results in the current process. 58 | output_dir (str): optional, an output directory to dump all 59 | results predicted on the dataset. The dump contains two files: 60 | 61 | 1. "instance_predictions.pth" a file in torch serialization 62 | format that contains all the raw original predictions. 63 | 2. "coco_instances_results.json" a json file in COCO's result 64 | format. 65 | """ 66 | self._tasks = self._tasks_from_config(cfg) 67 | self._distributed = distributed 68 | self._output_dir = output_dir 69 | 70 | self._cpu_device = torch.device("cpu") 71 | 72 | # replace fewx with d2 73 | self._logger = logging.getLogger(__name__) 74 | self._metadata = MetadataCatalog.get(dataset_name) 75 | if not hasattr(self._metadata, "json_file"): 76 | self._logger.info( 77 | f"'{dataset_name}' is not registered by `register_coco_instances`." 78 | " Therefore trying to convert it to COCO format ..." 
79 | ) 80 | 81 | cache_path = os.path.join(output_dir, f"{dataset_name}_coco_format.json") 82 | self._metadata.json_file = cache_path 83 | convert_to_coco_json(dataset_name, cache_path) 84 | 85 | json_file = PathManager.get_local_path(self._metadata.json_file) 86 | with contextlib.redirect_stdout(io.StringIO()): 87 | self._coco_api = COCO(json_file) 88 | 89 | self._kpt_oks_sigmas = cfg.TEST.KEYPOINT_OKS_SIGMAS 90 | # Test set json files do not contain annotations (evaluation must be 91 | # performed using the COCO evaluation server). 92 | self._do_evaluation = "annotations" in self._coco_api.dataset 93 | 94 | def reset(self): 95 | self._predictions = [] 96 | 97 | def _tasks_from_config(self, cfg): 98 | """ 99 | Returns: 100 | tuple[str]: tasks that can be evaluated under the given configuration. 101 | """ 102 | tasks = ("bbox",) 103 | if cfg.MODEL.MASK_ON: 104 | tasks = tasks + ("segm",) 105 | if cfg.MODEL.KEYPOINT_ON: 106 | tasks = tasks + ("keypoints",) 107 | return tasks 108 | 109 | def process(self, inputs, outputs): 110 | """ 111 | Args: 112 | inputs: the inputs to a COCO model (e.g., GeneralizedRCNN). 113 | It is a list of dict. Each dict corresponds to an image and 114 | contains keys like "height", "width", "file_name", "image_id". 115 | outputs: the outputs of a COCO model. It is a list of dicts with key 116 | "instances" that contains :class:`Instances`. 117 | """ 118 | for input, output in zip(inputs, outputs): 119 | prediction = {"image_id": input["image_id"]} 120 | 121 | # TODO this is ugly 122 | if "instances" in output: 123 | instances = output["instances"].to(self._cpu_device) 124 | prediction["instances"] = instances_to_coco_json(instances, input["image_id"]) 125 | if "proposals" in output: 126 | prediction["proposals"] = output["proposals"].to(self._cpu_device) 127 | self._predictions.append(prediction) 128 | 129 | def evaluate(self): 130 | if self._distributed: 131 | comm.synchronize() 132 | predictions = comm.gather(self._predictions, dst=0) 133 | predictions = list(itertools.chain(*predictions)) 134 | 135 | if not comm.is_main_process(): 136 | return {} 137 | else: 138 | predictions = self._predictions 139 | 140 | if len(predictions) == 0: 141 | self._logger.warning("[COCOEvaluator] Did not receive valid predictions.") 142 | return {} 143 | 144 | if self._output_dir: 145 | PathManager.mkdirs(self._output_dir) 146 | file_path = os.path.join(self._output_dir, "instances_predictions.pth") 147 | with PathManager.open(file_path, "wb") as f: 148 | torch.save(predictions, f) 149 | 150 | self._results = OrderedDict() 151 | if "proposals" in predictions[0]: 152 | self._eval_box_proposals(predictions) 153 | if "instances" in predictions[0]: 154 | self._eval_predictions(set(self._tasks), predictions) 155 | # Copy so the caller can do whatever with results 156 | return copy.deepcopy(self._results) 157 | 158 | def _eval_predictions(self, tasks, predictions): 159 | """ 160 | Evaluate predictions on the given tasks. 161 | Fill self._results with the metrics of the tasks. 
162 | """ 163 | self._logger.info("Preparing results for COCO format ...") 164 | coco_results = list(itertools.chain(*[x["instances"] for x in predictions])) 165 | 166 | # unmap the category ids for COCO 167 | if hasattr(self._metadata, "thing_dataset_id_to_contiguous_id"): 168 | reverse_id_mapping = { 169 | v: k for k, v in self._metadata.thing_dataset_id_to_contiguous_id.items() 170 | } 171 | for result in coco_results: 172 | category_id = result["category_id"] 173 | assert ( 174 | category_id in reverse_id_mapping 175 | ), "A prediction has category_id={}, which is not available in the dataset.".format( 176 | category_id 177 | ) 178 | result["category_id"] = reverse_id_mapping[category_id] 179 | 180 | if self._output_dir: 181 | file_path = os.path.join(self._output_dir, "coco_instances_results.json") 182 | self._logger.info("Saving results to {}".format(file_path)) 183 | with PathManager.open(file_path, "w") as f: 184 | f.write(json.dumps(coco_results)) 185 | f.flush() 186 | 187 | if not self._do_evaluation: 188 | self._logger.info("Annotations are not available for evaluation.") 189 | return 190 | 191 | self._logger.info("Evaluating predictions ...") 192 | for task in sorted(tasks): 193 | coco_eval = ( 194 | _evaluate_predictions_on_coco( 195 | self._coco_api, coco_results, task, kpt_oks_sigmas=self._kpt_oks_sigmas 196 | ) 197 | if len(coco_results) > 0 198 | else None # cocoapi does not handle empty results very well 199 | ) 200 | 201 | res = self._derive_coco_results( 202 | coco_eval, task, class_names=self._metadata.get("thing_classes") 203 | ) 204 | self._results[task] = res 205 | 206 | def _eval_box_proposals(self, predictions): 207 | """ 208 | Evaluate the box proposals in predictions. 209 | Fill self._results with the metrics for "box_proposals" task. 210 | """ 211 | if self._output_dir: 212 | # Saving generated box proposals to file. 213 | # Predicted box_proposals are in XYXY_ABS mode. 
214 | bbox_mode = BoxMode.XYXY_ABS.value 215 | ids, boxes, objectness_logits = [], [], [] 216 | for prediction in predictions: 217 | ids.append(prediction["image_id"]) 218 | boxes.append(prediction["proposals"].proposal_boxes.tensor.numpy()) 219 | objectness_logits.append(prediction["proposals"].objectness_logits.numpy()) 220 | 221 | proposal_data = { 222 | "boxes": boxes, 223 | "objectness_logits": objectness_logits, 224 | "ids": ids, 225 | "bbox_mode": bbox_mode, 226 | } 227 | with PathManager.open(os.path.join(self._output_dir, "box_proposals.pkl"), "wb") as f: 228 | pickle.dump(proposal_data, f) 229 | 230 | if not self._do_evaluation: 231 | self._logger.info("Annotations are not available for evaluation.") 232 | return 233 | 234 | self._logger.info("Evaluating bbox proposals ...") 235 | res = {} 236 | areas = {"all": "", "small": "s", "medium": "m", "large": "l"} 237 | for limit in [100, 1000]: 238 | for area, suffix in areas.items(): 239 | stats = _evaluate_box_proposals(predictions, self._coco_api, area=area, limit=limit) 240 | key = "AR{}@{:d}".format(suffix, limit) 241 | res[key] = float(stats["ar"].item() * 100) 242 | self._logger.info("Proposal metrics: \n" + create_small_table(res)) 243 | self._results["box_proposals"] = res 244 | 245 | def _calculate_ap(self, class_names, precisions, T=None, A=None): 246 | ################## ap ##################### 247 | voc_ls = [] 248 | non_voc_ls = [] 249 | 250 | for idx, name in enumerate(class_names): 251 | # area range index 0: all area ranges 252 | # max dets index -1: typically 100 per image 253 | if T is not None and A is None: 254 | precision = precisions[T, :, idx, 0, -1] 255 | elif A is not None and T is None: 256 | precision = precisions[:, :, idx, A, -1] 257 | elif T is None and A is None: 258 | precision = precisions[:, :, idx, 0, -1] 259 | else: 260 | assert False 261 | 262 | precision = precision[precision > -1] 263 | ap = np.mean(precision) if precision.size else float("nan") 264 | 265 | # calculate voc ap and non-voc ap 266 | if name in CLASS_NAMES: 267 | voc_ls.append(ap * 100) 268 | else: 269 | non_voc_ls.append(ap * 100) 270 | if len(voc_ls) > 0: 271 | voc_ap = sum(voc_ls) * 1.0 / len(voc_ls) 272 | else: 273 | voc_ap = 0.0 274 | if len(non_voc_ls) > 0: 275 | non_voc_ap = sum(non_voc_ls) * 1.0 / len(non_voc_ls) 276 | else: 277 | non_voc_ap = 0.0 278 | 279 | return voc_ap, non_voc_ap 280 | 281 | def _derive_coco_results(self, coco_eval, iou_type, class_names=None): 282 | """ 283 | Derive the desired score numbers from summarized COCOeval. 284 | 285 | Args: 286 | coco_eval (None or COCOEval): None represents no predictions from model. 287 | iou_type (str): 288 | class_names (None or list[str]): if provided, will use it to predict 289 | per-category AP. 
290 | 291 | Returns: 292 | a dict of {metric name: score} 293 | """ 294 | 295 | metrics = { 296 | "bbox": ["AP", "AP50", "AP75", "APs", "APm", "APl"], 297 | "segm": ["AP", "AP50", "AP75", "APs", "APm", "APl"], 298 | "keypoints": ["AP", "AP50", "AP75", "APm", "APl"], 299 | }[iou_type] 300 | 301 | if coco_eval is None: 302 | self._logger.warn("No predictions from the model!") 303 | return {metric: float("nan") for metric in metrics} 304 | 305 | # the standard metrics 306 | results = { 307 | metric: float(coco_eval.stats[idx] * 100 if coco_eval.stats[idx] >= 0 else "nan") 308 | for idx, metric in enumerate(metrics) 309 | } 310 | self._logger.info( 311 | "Evaluation results for {}: \n".format(iou_type) + create_small_table(results) 312 | ) 313 | if not np.isfinite(sum(results.values())): 314 | self._logger.info("Some metrics cannot be computed and is shown as NaN.") 315 | 316 | if class_names is None or len(class_names) <= 1: 317 | return results 318 | # Compute per-category AP 319 | # from https://github.com/facebookresearch/Detectron/blob/a6a835f5b8208c45d0dce217ce9bbda915f44df7/detectron/datasets/json_dataset_evaluator.py#L222-L252 # noqa 320 | precisions = coco_eval.eval["precision"] 321 | # precision has dims (iou, recall, cls, area range, max dets) 322 | assert len(class_names) == precisions.shape[2] 323 | 324 | results_per_category = [] 325 | voc_ls = [] 326 | non_voc_ls = [] 327 | 328 | ################## ap ##################### 329 | for idx, name in enumerate(class_names): 330 | # area range index 0: all area ranges 331 | # max dets index -1: typically 100 per image 332 | precision = precisions[:, :, idx, 0, -1] 333 | precision = precision[precision > -1] 334 | ap = np.mean(precision) if precision.size else float("nan") 335 | results_per_category.append(("{}".format(name), float(ap * 100))) 336 | # calculate voc and non voc metrics 337 | voc_ap, non_voc_ap = self._calculate_ap(class_names, precisions) 338 | voc_ap_50, non_voc_ap_50 = self._calculate_ap(class_names, precisions, T=0) # T=0, iou_thresh = 0.5 339 | voc_ap_75, non_voc_ap_75 = self._calculate_ap(class_names, precisions, T=5) # T=5, iou_thresh = 0.75 340 | 341 | voc_ap_small, non_voc_ap_small = self._calculate_ap(class_names, precisions, A=1) # A=1, small 342 | voc_ap_medium, non_voc_ap_medium = self._calculate_ap(class_names, precisions, A=2) # A=2, medium 343 | voc_ap_large, non_voc_ap_large = self._calculate_ap(class_names, precisions, A=3) # A=3, large 344 | 345 | # print voc ap 346 | self._logger.info("Evaluation results for VOC 20 categories =======> AP : " + str('%.2f' % voc_ap)) 347 | self._logger.info("Evaluation results for VOC 20 categories =======> AP50: " + str('%.2f' % voc_ap_50)) 348 | self._logger.info("Evaluation results for VOC 20 categories =======> AP75: " + str('%.2f' % voc_ap_75)) 349 | self._logger.info("Evaluation results for VOC 20 categories =======> APs : " + str('%.2f' % voc_ap_small)) 350 | self._logger.info("Evaluation results for VOC 20 categories =======> APm : " + str('%.2f' % voc_ap_medium)) 351 | self._logger.info("Evaluation results for VOC 20 categories =======> APl : " + str('%.2f' % voc_ap_large)) 352 | 353 | # print voc ap 354 | self._logger.info("Evaluation results for Non VOC 60 categories =======> AP : " + str('%.2f' % non_voc_ap)) 355 | self._logger.info("Evaluation results for Non VOC 60 categories =======> AP50: " + str('%.2f' % non_voc_ap_50)) 356 | self._logger.info("Evaluation results for Non VOC 60 categories =======> AP75: " + str('%.2f' % non_voc_ap_75)) 357 | 
self._logger.info("Evaluation results for Non VOC 60 categories =======> APs : " + str('%.2f' % non_voc_ap_small)) 358 | self._logger.info("Evaluation results for Non VOC 60 categories =======> APm : " + str('%.2f' % non_voc_ap_medium)) 359 | self._logger.info("Evaluation results for Non VOC 60 categories =======> APl : " + str('%.2f' % non_voc_ap_large)) 360 | 361 | ''' 362 | # log evaluation results in csv 363 | # type, AP, AP50, AP75, APs, APm, APl 364 | eval_log = iou_type + ',' + 'AP,AP50,AP75,APs,APm,APl' + '\n' 365 | eval_log += 'all' + ',' + str('%.2f' % results['AP']) + ',' + str('%.2f' % results['AP50']) + ',' + str('%.2f' % results['AP75']) + ',' + str('%.2f' % results['APs']) + ',' + str('%.2f' % results['APm']) + ',' + str('%.2f' % results['APl']) + '\n' 366 | eval_log += 'VOC' + ',' + str('%.2f' % voc_ap) + ',' + str('%.2f' % voc_ap_50) + ',' + str('%.2f' % voc_ap_75) + ',' + str('%.2f' % voc_ap_small) + ',' + str('%.2f' % voc_ap_medium) + ',' + str('%.2f' % voc_ap_large) + '\n' 367 | eval_log += 'Non-VOC' + ',' + str('%.2f' % non_voc_ap) + ',' + str('%.2f' % non_voc_ap_50) + ',' + str('%.2f' % non_voc_ap_75) + ',' + str('%.2f' % non_voc_ap_small) + ',' + str('%.2f' % non_voc_ap_medium) + ',' + str('%.2f' % non_voc_ap_large) + '\n' 368 | 369 | with open('evaluation_result.csv', 'a') as f: 370 | f.write(eval_log) 371 | ''' 372 | # tabulate it 373 | N_COLS = min(6, len(results_per_category) * 2) 374 | results_flatten = list(itertools.chain(*results_per_category)) 375 | results_2d = itertools.zip_longest(*[results_flatten[i::N_COLS] for i in range(N_COLS)]) 376 | table = tabulate( 377 | results_2d, 378 | tablefmt="pipe", 379 | floatfmt=".3f", 380 | headers=["category", "AP"] * (N_COLS // 2), 381 | numalign="left", 382 | ) 383 | self._logger.info("Per-category {} AP: \n".format(iou_type) + table) 384 | 385 | results.update({"AP-" + name: ap for name, ap in results_per_category}) 386 | return results 387 | 388 | def instances_to_coco_json(instances, img_id): 389 | """ 390 | Dump an "Instances" object to a COCO-format json that's used for evaluation. 391 | 392 | Args: 393 | instances (Instances): 394 | img_id (int): the image id 395 | 396 | Returns: 397 | list[dict]: list of json annotations in COCO format. 398 | """ 399 | num_instance = len(instances) 400 | if num_instance == 0: 401 | return [] 402 | 403 | boxes = instances.pred_boxes.tensor.numpy() 404 | boxes = BoxMode.convert(boxes, BoxMode.XYXY_ABS, BoxMode.XYWH_ABS) 405 | boxes = boxes.tolist() 406 | scores = instances.scores.tolist() 407 | classes = instances.pred_classes.tolist() 408 | 409 | has_mask = instances.has("pred_masks") 410 | if has_mask: 411 | # use RLE to encode the masks, because they are too large and takes memory 412 | # since this evaluator stores outputs of the entire dataset 413 | rles = [ 414 | mask_util.encode(np.array(mask[:, :, None], order="F", dtype="uint8"))[0] 415 | for mask in instances.pred_masks 416 | ] 417 | for rle in rles: 418 | # "counts" is an array encoded by mask_util as a byte-stream. Python3's 419 | # json writer which always produces strings cannot serialize a bytestream 420 | # unless you decode it. Thankfully, utf-8 works out (which is also what 421 | # the pycocotools/_mask.pyx does). 
422 | rle["counts"] = rle["counts"].decode("utf-8") 423 | 424 | has_keypoints = instances.has("pred_keypoints") 425 | if has_keypoints: 426 | keypoints = instances.pred_keypoints 427 | 428 | results = [] 429 | for k in range(num_instance): 430 | result = { 431 | "image_id": img_id, 432 | "category_id": classes[k], 433 | "bbox": boxes[k], 434 | "score": scores[k], 435 | } 436 | if has_mask: 437 | result["segmentation"] = rles[k] 438 | if has_keypoints: 439 | # In COCO annotations, 440 | # keypoints coordinates are pixel indices. 441 | # However our predictions are floating point coordinates. 442 | # Therefore we subtract 0.5 to be consistent with the annotation format. 443 | # This is the inverse of data loading logic in `datasets/coco.py`. 444 | keypoints[k][:, :2] -= 0.5 445 | result["keypoints"] = keypoints[k].flatten().tolist() 446 | results.append(result) 447 | return results 448 | 449 | 450 | # inspired from Detectron: 451 | # https://github.com/facebookresearch/Detectron/blob/a6a835f5b8208c45d0dce217ce9bbda915f44df7/detectron/datasets/json_dataset_evaluator.py#L255 # noqa 452 | def _evaluate_box_proposals(dataset_predictions, coco_api, thresholds=None, area="all", limit=None): 453 | """ 454 | Evaluate detection proposal recall metrics. This function is a much 455 | faster alternative to the official COCO API recall evaluation code. However, 456 | it produces slightly different results. 457 | """ 458 | # Record max overlap value for each gt box 459 | # Return vector of overlap values 460 | areas = { 461 | "all": 0, 462 | "small": 1, 463 | "medium": 2, 464 | "large": 3, 465 | "96-128": 4, 466 | "128-256": 5, 467 | "256-512": 6, 468 | "512-inf": 7, 469 | } 470 | area_ranges = [ 471 | [0 ** 2, 1e5 ** 2], # all 472 | [0 ** 2, 32 ** 2], # small 473 | [32 ** 2, 96 ** 2], # medium 474 | [96 ** 2, 1e5 ** 2], # large 475 | [96 ** 2, 128 ** 2], # 96-128 476 | [128 ** 2, 256 ** 2], # 128-256 477 | [256 ** 2, 512 ** 2], # 256-512 478 | [512 ** 2, 1e5 ** 2], 479 | ] # 512-inf 480 | assert area in areas, "Unknown area range: {}".format(area) 481 | area_range = area_ranges[areas[area]] 482 | gt_overlaps = [] 483 | num_pos = 0 484 | 485 | for prediction_dict in dataset_predictions: 486 | predictions = prediction_dict["proposals"] 487 | 488 | # sort predictions in descending order 489 | # TODO maybe remove this and make it explicit in the documentation 490 | inds = predictions.objectness_logits.sort(descending=True)[1] 491 | predictions = predictions[inds] 492 | 493 | ann_ids = coco_api.getAnnIds(imgIds=prediction_dict["image_id"]) 494 | anno = coco_api.loadAnns(ann_ids) 495 | gt_boxes = [ 496 | BoxMode.convert(obj["bbox"], BoxMode.XYWH_ABS, BoxMode.XYXY_ABS) 497 | for obj in anno 498 | if obj["iscrowd"] == 0 499 | ] 500 | gt_boxes = torch.as_tensor(gt_boxes).reshape(-1, 4) # guard against no boxes 501 | gt_boxes = Boxes(gt_boxes) 502 | gt_areas = torch.as_tensor([obj["area"] for obj in anno if obj["iscrowd"] == 0]) 503 | 504 | if len(gt_boxes) == 0 or len(predictions) == 0: 505 | continue 506 | 507 | valid_gt_inds = (gt_areas >= area_range[0]) & (gt_areas <= area_range[1]) 508 | gt_boxes = gt_boxes[valid_gt_inds] 509 | 510 | num_pos += len(gt_boxes) 511 | 512 | if len(gt_boxes) == 0: 513 | continue 514 | 515 | if limit is not None and len(predictions) > limit: 516 | predictions = predictions[:limit] 517 | 518 | overlaps = pairwise_iou(predictions.proposal_boxes, gt_boxes) 519 | 520 | _gt_overlaps = torch.zeros(len(gt_boxes)) 521 | for j in range(min(len(predictions), len(gt_boxes))): 522 | # find 
which proposal box maximally covers each gt box 523 | # and get the iou amount of coverage for each gt box 524 | max_overlaps, argmax_overlaps = overlaps.max(dim=0) 525 | 526 | # find which gt box is 'best' covered (i.e. 'best' = most iou) 527 | gt_ovr, gt_ind = max_overlaps.max(dim=0) 528 | assert gt_ovr >= 0 529 | # find the proposal box that covers the best covered gt box 530 | box_ind = argmax_overlaps[gt_ind] 531 | # record the iou coverage of this gt box 532 | _gt_overlaps[j] = overlaps[box_ind, gt_ind] 533 | assert _gt_overlaps[j] == gt_ovr 534 | # mark the proposal box and the gt box as used 535 | overlaps[box_ind, :] = -1 536 | overlaps[:, gt_ind] = -1 537 | 538 | # append recorded iou coverage level 539 | gt_overlaps.append(_gt_overlaps) 540 | gt_overlaps = ( 541 | torch.cat(gt_overlaps, dim=0) if len(gt_overlaps) else torch.zeros(0, dtype=torch.float32) 542 | ) 543 | gt_overlaps, _ = torch.sort(gt_overlaps) 544 | 545 | if thresholds is None: 546 | step = 0.05 547 | thresholds = torch.arange(0.5, 0.95 + 1e-5, step, dtype=torch.float32) 548 | recalls = torch.zeros_like(thresholds) 549 | # compute recall for each iou threshold 550 | for i, t in enumerate(thresholds): 551 | recalls[i] = (gt_overlaps >= t).float().sum() / float(num_pos) 552 | # ar = 2 * np.trapz(recalls, thresholds) 553 | ar = recalls.mean() 554 | return { 555 | "ar": ar, 556 | "recalls": recalls, 557 | "thresholds": thresholds, 558 | "gt_overlaps": gt_overlaps, 559 | "num_pos": num_pos, 560 | } 561 | 562 | 563 | def _evaluate_predictions_on_coco(coco_gt, coco_results, iou_type, kpt_oks_sigmas=None): 564 | """ 565 | Evaluate the coco results using COCOEval API. 566 | """ 567 | assert len(coco_results) > 0 568 | 569 | if iou_type == "segm": 570 | coco_results = copy.deepcopy(coco_results) 571 | # When evaluating mask AP, if the results contain bbox, cocoapi will 572 | # use the box area as the area of the instance, instead of the mask area. 573 | # This leads to a different definition of small/medium/large. 574 | # We remove the bbox field to let mask AP use mask area. 575 | for c in coco_results: 576 | c.pop("bbox", None) 577 | 578 | coco_dt = coco_gt.loadRes(coco_results) 579 | coco_eval = COCOeval(coco_gt, coco_dt, iou_type) 580 | 581 | if iou_type == "keypoints": 582 | # Use the COCO default keypoint OKS sigmas unless overrides are specified 583 | if kpt_oks_sigmas: 584 | assert hasattr(coco_eval.params, "kpt_oks_sigmas"), "pycocotools is too old!" 585 | coco_eval.params.kpt_oks_sigmas = np.array(kpt_oks_sigmas) 586 | # COCOAPI requires every detection and every gt to have keypoints, so 587 | # we just take the first entry from both 588 | num_keypoints_dt = len(coco_results[0]["keypoints"]) // 3 589 | num_keypoints_gt = len(next(iter(coco_gt.anns.values()))["keypoints"]) // 3 590 | num_keypoints_oks = len(coco_eval.params.kpt_oks_sigmas) 591 | assert num_keypoints_oks == num_keypoints_dt == num_keypoints_gt, ( 592 | f"[COCOEvaluator] Prediction contain {num_keypoints_dt} keypoints. " 593 | f"Ground truth contains {num_keypoints_gt} keypoints. " 594 | f"The length of cfg.TEST.KEYPOINT_OKS_SIGMAS is {num_keypoints_oks}. " 595 | "They have to agree with each other. For meaning of OKS, please refer to " 596 | "http://cocodataset.org/#keypoints-eval." 
597 | ) 598 | 599 | coco_eval.evaluate() 600 | coco_eval.accumulate() 601 | coco_eval.summarize() 602 | 603 | return coco_eval 604 | -------------------------------------------------------------------------------- /fewx/layers/__init__.py: -------------------------------------------------------------------------------- 1 | from .deform_conv import DFConv2d 2 | from .ml_nms import ml_nms 3 | from .iou_loss import IOULoss 4 | from .conv_with_kaiming_uniform import conv_with_kaiming_uniform 5 | from .naive_group_norm import NaiveGroupNorm 6 | from .boundary import get_instances_contour_interior 7 | from .misc import interpolate 8 | 9 | from PIL import Image, ImageDraw  # needed by get_center() below 10 | from skimage import filters, img_as_ubyte 11 | from skimage.morphology import remove_small_objects, dilation, erosion, binary_dilation, binary_erosion, square 12 | #from scipy.ndimage.interpolation import map_coordinates 13 | #from scipy.ndimage.morphology import binary_fill_holes 14 | from scipy.ndimage.filters import gaussian_filter 15 | from scipy.ndimage.measurements import center_of_mass 16 | 17 | def get_contour_interior(mask, bold=False): 18 | if True: #'camunet' == config['param']['model']: 19 | # 2-pixel contour (1out+1in), 2-pixel shrunk interior 20 | outer = binary_dilation(mask) #, square(9)) 21 | if bold: 22 | outer = binary_dilation(outer) #, square(9)) 23 | inner = binary_erosion(mask) #, square(9)) 24 | contour = ((outer != inner) > 0).astype(np.uint8) 25 | interior = (erosion(inner) > 0).astype(np.uint8) 26 | else: 27 | contour = filters.scharr(mask) 28 | scharr_threshold = np.amax(abs(contour)) / 2. 
29 | contour = (np.abs(contour) > scharr_threshold).astype(np.uint8) 30 | interior = (mask - contour > 0).astype(np.uint8) 31 | return contour, interior 32 | 33 | def get_center(mask): 34 | r = 2 35 | y, x = center_of_mass(mask) 36 | center_img = Image.fromarray(np.zeros_like(mask).astype(np.uint8)) 37 | if not np.isnan(x) and not np.isnan(y): 38 | draw = ImageDraw.Draw(center_img) 39 | draw.ellipse([x-r, y-r, x+r, y+r], fill='White') 40 | center = np.asarray(center_img) 41 | return center 42 | 43 | def get_instances_contour_interior(instances_mask): 44 | adjacent_boundary_only = False #False #config['contour'].getboolean('adjacent_boundary_only') 45 | instances_mask = instances_mask.data 46 | result_c = np.zeros_like(instances_mask, dtype=np.uint8) 47 | result_i = np.zeros_like(instances_mask, dtype=np.uint8) 48 | weight = np.ones_like(instances_mask, dtype=np.float32) 49 | #masks = decompose_mask(instances_mask) 50 | #for m in masks: 51 | contour, interior = get_contour_interior(instances_mask, bold=adjacent_boundary_only) 52 | #center = get_center(m) 53 | if adjacent_boundary_only: 54 | result_c += contour 55 | else: 56 | result_c = np.maximum(result_c, contour) 57 | result_i = np.maximum(result_i, interior) 58 | #contour += center 59 | contour = np.where(contour > 0, 1, 0) 60 | # magic number 50 make weight distributed to [1, 5) roughly 61 | weight *= (1 + gaussian_filter(contour, sigma=1) / 50) 62 | if adjacent_boundary_only: 63 | result_c = (result_c > 1).astype(np.uint8) 64 | return result_c, result_i, weight 65 | -------------------------------------------------------------------------------- /fewx/layers/conv_with_kaiming_uniform.py: -------------------------------------------------------------------------------- 1 | from torch import nn 2 | 3 | from detectron2.layers import Conv2d 4 | from .deform_conv import DFConv2d 5 | from detectron2.layers.batch_norm import get_norm 6 | 7 | 8 | def conv_with_kaiming_uniform( 9 | norm=None, activation=None, 10 | use_deformable=False, use_sep=False): 11 | def make_conv( 12 | in_channels, out_channels, kernel_size, stride=1, dilation=1 13 | ): 14 | if use_deformable: 15 | conv_func = DFConv2d 16 | else: 17 | conv_func = Conv2d 18 | if use_sep: 19 | assert in_channels == out_channels 20 | groups = in_channels 21 | else: 22 | groups = 1 23 | conv = conv_func( 24 | in_channels, 25 | out_channels, 26 | kernel_size=kernel_size, 27 | stride=stride, 28 | padding=dilation * (kernel_size - 1) // 2, 29 | dilation=dilation, 30 | groups=groups, 31 | bias=(norm is None) 32 | ) 33 | if not use_deformable: 34 | # Caffe2 implementation uses XavierFill, which in fact 35 | # corresponds to kaiming_uniform_ in PyTorch 36 | nn.init.kaiming_uniform_(conv.weight, a=1) 37 | if norm is None: 38 | nn.init.constant_(conv.bias, 0) 39 | module = [conv,] 40 | if norm is not None and len(norm) > 0: 41 | if norm == "GN": 42 | norm_module = nn.GroupNorm(32, out_channels) 43 | else: 44 | norm_module = get_norm(norm, out_channels) 45 | module.append(norm_module) 46 | if activation is not None: 47 | module.append(nn.ReLU(inplace=True)) 48 | if len(module) > 1: 49 | return nn.Sequential(*module) 50 | return conv 51 | 52 | return make_conv 53 | -------------------------------------------------------------------------------- /fewx/layers/deform_conv.py: -------------------------------------------------------------------------------- 1 | import torch 2 | from torch import nn 3 | 4 | from detectron2.layers import Conv2d 5 | 6 | 7 | class _NewEmptyTensorOp(torch.autograd.Function): 8 
| @staticmethod 9 | def forward(ctx, x, new_shape): 10 | ctx.shape = x.shape 11 | return x.new_empty(new_shape) 12 | 13 | @staticmethod 14 | def backward(ctx, grad): 15 | shape = ctx.shape 16 | return _NewEmptyTensorOp.apply(grad, shape), None 17 | 18 | 19 | class DFConv2d(nn.Module): 20 | """ 21 | Deformable convolutional layer with configurable 22 | deformable groups, dilations and groups. 23 | 24 | Code is from: 25 | https://github.com/facebookresearch/maskrcnn-benchmark/blob/master/maskrcnn_benchmark/layers/misc.py 26 | 27 | 28 | """ 29 | def __init__( 30 | self, 31 | in_channels, 32 | out_channels, 33 | with_modulated_dcn=True, 34 | kernel_size=3, 35 | stride=1, 36 | groups=1, 37 | dilation=1, 38 | deformable_groups=1, 39 | bias=False, 40 | padding=None 41 | ): 42 | super(DFConv2d, self).__init__() 43 | if isinstance(kernel_size, (list, tuple)): 44 | assert isinstance(stride, (list, tuple)) 45 | assert isinstance(dilation, (list, tuple)) 46 | assert len(kernel_size) == 2 47 | assert len(stride) == 2 48 | assert len(dilation) == 2 49 | padding = ( 50 | dilation[0] * (kernel_size[0] - 1) // 2, 51 | dilation[1] * (kernel_size[1] - 1) // 2 52 | ) 53 | offset_base_channels = kernel_size[0] * kernel_size[1] 54 | else: 55 | padding = dilation * (kernel_size - 1) // 2 56 | offset_base_channels = kernel_size * kernel_size 57 | if with_modulated_dcn: 58 | from detectron2.layers.deform_conv import ModulatedDeformConv 59 | offset_channels = offset_base_channels * 3 # default: 27 60 | conv_block = ModulatedDeformConv 61 | else: 62 | from detectron2.layers.deform_conv import DeformConv 63 | offset_channels = offset_base_channels * 2 # default: 18 64 | conv_block = DeformConv 65 | self.offset = Conv2d( 66 | in_channels, 67 | deformable_groups * offset_channels, 68 | kernel_size=kernel_size, 69 | stride=stride, 70 | padding=padding, 71 | groups=1, 72 | dilation=dilation 73 | ) 74 | for l in [self.offset, ]: 75 | nn.init.kaiming_uniform_(l.weight, a=1) 76 | torch.nn.init.constant_(l.bias, 0.) 
77 | self.conv = conv_block( 78 | in_channels, 79 | out_channels, 80 | kernel_size=kernel_size, 81 | stride=stride, 82 | padding=padding, 83 | dilation=dilation, 84 | groups=groups, 85 | deformable_groups=deformable_groups, 86 | bias=bias 87 | ) 88 | self.with_modulated_dcn = with_modulated_dcn 89 | self.kernel_size = kernel_size 90 | self.stride = stride 91 | self.padding = padding 92 | self.dilation = dilation 93 | self.offset_split = offset_base_channels * deformable_groups * 2 94 | 95 | def forward(self, x, return_offset=False): 96 | if x.numel() > 0: 97 | if not self.with_modulated_dcn: 98 | offset_mask = self.offset(x) 99 | x = self.conv(x, offset_mask) 100 | else: 101 | offset_mask = self.offset(x) 102 | offset = offset_mask[:, :self.offset_split, :, :] 103 | mask = offset_mask[:, self.offset_split:, :, :].sigmoid() 104 | x = self.conv(x, offset, mask) 105 | if return_offset: 106 | return x, offset_mask 107 | return x 108 | # get output shape 109 | output_shape = [ 110 | (i + 2 * p - (di * (k - 1) + 1)) // d + 1 111 | for i, p, di, k, d in zip( 112 | x.shape[-2:], 113 | self.padding, 114 | self.dilation, 115 | self.kernel_size, 116 | self.stride 117 | ) 118 | ] 119 | output_shape = [x.shape[0], self.conv.weight.shape[0]] + output_shape 120 | return _NewEmptyTensorOp.apply(x, output_shape) 121 | -------------------------------------------------------------------------------- /fewx/layers/iou_loss.py: -------------------------------------------------------------------------------- 1 | import torch 2 | from torch import nn 3 | 4 | 5 | class IOULoss(nn.Module): 6 | """ 7 | Intersetion Over Union (IoU) loss which supports three 8 | different IoU computations: 9 | * IoU 10 | * Linear IoU 11 | * gIoU 12 | """ 13 | def __init__(self, loc_loss_type='iou'): 14 | super(IOULoss, self).__init__() 15 | self.loc_loss_type = loc_loss_type 16 | 17 | def forward(self, pred, target, weight=None): 18 | """ 19 | Args: 20 | pred: Nx4 predicted bounding boxes 21 | target: Nx4 target bounding boxes 22 | weight: N loss weight for each instance 23 | """ 24 | pred_left = pred[:, 0] 25 | pred_top = pred[:, 1] 26 | pred_right = pred[:, 2] 27 | pred_bottom = pred[:, 3] 28 | 29 | target_left = target[:, 0] 30 | target_top = target[:, 1] 31 | target_right = target[:, 2] 32 | target_bottom = target[:, 3] 33 | 34 | target_aera = (target_left + target_right) * \ 35 | (target_top + target_bottom) 36 | pred_aera = (pred_left + pred_right) * \ 37 | (pred_top + pred_bottom) 38 | 39 | w_intersect = torch.min(pred_left, target_left) + \ 40 | torch.min(pred_right, target_right) 41 | h_intersect = torch.min(pred_bottom, target_bottom) + \ 42 | torch.min(pred_top, target_top) 43 | 44 | g_w_intersect = torch.max(pred_left, target_left) + \ 45 | torch.max(pred_right, target_right) 46 | g_h_intersect = torch.max(pred_bottom, target_bottom) + \ 47 | torch.max(pred_top, target_top) 48 | ac_uion = g_w_intersect * g_h_intersect 49 | 50 | area_intersect = w_intersect * h_intersect 51 | area_union = target_aera + pred_aera - area_intersect 52 | 53 | ious = (area_intersect + 1.0) / (area_union + 1.0) 54 | gious = ious - (ac_uion - area_union) / ac_uion 55 | if self.loc_loss_type == 'iou': 56 | losses = -torch.log(ious) 57 | elif self.loc_loss_type == 'linear_iou': 58 | losses = 1 - ious 59 | elif self.loc_loss_type == 'giou': 60 | losses = 1 - gious 61 | else: 62 | raise NotImplementedError 63 | 64 | if weight is not None: 65 | return (losses * weight).sum() 66 | else: 67 | return losses.sum() 68 | 
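A minimal usage sketch for `IOULoss` (editor annotation, not part of the repository files). As the area computation `(left + right) * (top + bottom)` in `forward` shows, `pred` and `target` are not `(x1, y1, x2, y2)` corner boxes but FCOS-style non-negative distances from a sample location to the left/top/right/bottom box edges; the tensor values below are illustrative only.

```python
# Hypothetical usage sketch for fewx.layers.IOULoss -- inputs are made-up values.
import torch
from fewx.layers import IOULoss

criterion = IOULoss(loc_loss_type='giou')  # also supports 'iou' and 'linear_iou'

# 8 instances, each encoded as (left, top, right, bottom) distances >= 0
pred = torch.rand(8, 4) * 50
target = torch.rand(8, 4) * 50
weight = torch.ones(8)  # optional per-instance weight

loss = criterion(pred, target, weight)  # scalar: sum of weighted per-instance losses
```

Note that the loss is returned as a sum rather than a mean, so a caller would typically normalize it, e.g. by the number of positive samples.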
-------------------------------------------------------------------------------- /fewx/layers/misc.py: -------------------------------------------------------------------------------- 1 | # Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved. 2 | """ 3 | helper class that supports empty tensors on some nn functions. 4 | Ideally, add support directly in PyTorch to empty tensors in 5 | those functions. 6 | This can be removed once https://github.com/pytorch/pytorch/issues/12013 7 | is implemented 8 | """ 9 | 10 | import math 11 | import torch 12 | from torch.nn.modules.utils import _ntuple 13 | from torch import nn 14 | 15 | class _NewEmptyTensorOp(torch.autograd.Function): 16 | @staticmethod 17 | def forward(ctx, x, new_shape): 18 | ctx.shape = x.shape 19 | return x.new_empty(new_shape) 20 | 21 | @staticmethod 22 | def backward(ctx, grad): 23 | shape = ctx.shape 24 | return _NewEmptyTensorOp.apply(grad, shape), None 25 | 26 | 27 | def interpolate( 28 | input, size=None, scale_factor=None, mode="nearest", align_corners=None 29 | ): 30 | if input.numel() > 0: 31 | return torch.nn.functional.interpolate( 32 | input, size, scale_factor, mode, align_corners 33 | ) 34 | 35 | def _check_size_scale_factor(dim): 36 | if size is None and scale_factor is None: 37 | raise ValueError("either size or scale_factor should be defined") 38 | if size is not None and scale_factor is not None: 39 | raise ValueError("only one of size or scale_factor should be defined") 40 | if ( 41 | scale_factor is not None 42 | and isinstance(scale_factor, tuple) 43 | and len(scale_factor) != dim 44 | ): 45 | raise ValueError( 46 | "scale_factor shape must match input shape. " 47 | "Input is {}D, scale_factor size is {}".format(dim, len(scale_factor)) 48 | ) 49 | 50 | def _output_size(dim): 51 | _check_size_scale_factor(dim) 52 | if size is not None: 53 | return size 54 | scale_factors = _ntuple(dim)(scale_factor) 55 | # math.floor might return float in py2.7 56 | return [ 57 | int(math.floor(input.size(i + 2) * scale_factors[i])) for i in range(dim) 58 | ] 59 | 60 | output_shape = tuple(_output_size(2)) 61 | output_shape = input.shape[:-2] + output_shape 62 | return _NewEmptyTensorOp.apply(input, output_shape) 63 | -------------------------------------------------------------------------------- /fewx/layers/ml_nms.py: -------------------------------------------------------------------------------- 1 | from detectron2.layers import batched_nms 2 | 3 | 4 | def ml_nms(boxlist, nms_thresh, max_proposals=-1, 5 | score_field="scores", label_field="labels"): 6 | """ 7 | Performs non-maximum suppression on a boxlist, with scores specified 8 | in a boxlist field via score_field. 
9 | 10 | Args: 11 | boxlist (detectron2.structures.Boxes): 12 | nms_thresh (float): 13 | max_proposals (int): if > 0, then only the top max_proposals are kept 14 | after non-maximum suppression 15 | score_field (str): 16 | """ 17 | if nms_thresh <= 0: 18 | return boxlist 19 | boxes = boxlist.pred_boxes.tensor 20 | scores = boxlist.scores 21 | labels = boxlist.pred_classes 22 | keep = batched_nms(boxes, scores, labels, nms_thresh) 23 | if max_proposals > 0: 24 | keep = keep[: max_proposals] 25 | boxlist = boxlist[keep] 26 | return boxlist 27 | -------------------------------------------------------------------------------- /fewx/layers/naive_group_norm.py: -------------------------------------------------------------------------------- 1 | import torch 2 | from torch.nn import Module, Parameter 3 | from torch.nn import init 4 | 5 | 6 | class NaiveGroupNorm(Module): 7 | r"""NaiveGroupNorm implements Group Normalization with the high-level matrix operations in PyTorch. 8 | It is a temporary solution to export GN by ONNX before the official GN can be exported by ONNX. 9 | The usage of NaiveGroupNorm is exactly the same as the official :class:`torch.nn.GroupNorm`. 10 | Args: 11 | num_groups (int): number of groups to separate the channels into 12 | num_channels (int): number of channels expected in input 13 | eps: a value added to the denominator for numerical stability. Default: 1e-5 14 | affine: a boolean value that when set to ``True``, this module 15 | has learnable per-channel affine parameters initialized to ones (for weights) 16 | and zeros (for biases). Default: ``True``. 17 | Shape: 18 | - Input: :math:`(N, C, *)` where :math:`C=\text{num\_channels}` 19 | - Output: :math:`(N, C, *)` (same shape as input) 20 | Examples:: 21 | >>> input = torch.randn(20, 6, 10, 10) 22 | >>> # Separate 6 channels into 3 groups 23 | >>> m = NaiveGroupNorm(3, 6) 24 | >>> # Separate 6 channels into 6 groups (equivalent with InstanceNorm) 25 | >>> m = NaiveGroupNorm(6, 6) 26 | >>> # Put all 6 channels into a single group (equivalent with LayerNorm) 27 | >>> m = NaiveGroupNorm(1, 6) 28 | >>> # Activating the module 29 | >>> output = m(input) 30 | .. 
_`Group Normalization`: https://arxiv.org/abs/1803.08494 31 | """ 32 | __constants__ = ['num_groups', 'num_channels', 'eps', 'affine', 'weight', 33 | 'bias'] 34 | 35 | def __init__(self, num_groups, num_channels, eps=1e-5, affine=True): 36 | super(NaiveGroupNorm, self).__init__() 37 | self.num_groups = num_groups 38 | self.num_channels = num_channels 39 | self.eps = eps 40 | self.affine = affine 41 | if self.affine: 42 | self.weight = Parameter(torch.Tensor(num_channels)) 43 | self.bias = Parameter(torch.Tensor(num_channels)) 44 | else: 45 | self.register_parameter('weight', None) 46 | self.register_parameter('bias', None) 47 | self.reset_parameters() 48 | 49 | def reset_parameters(self): 50 | if self.affine: 51 | init.ones_(self.weight) 52 | init.zeros_(self.bias) 53 | 54 | def forward(self, input): 55 | N, C, H, W = input.size() 56 | assert C % self.num_groups == 0 57 | input = input.reshape(N, self.num_groups, -1) 58 | mean = input.mean(dim=-1, keepdim=True) 59 | var = (input ** 2).mean(dim=-1, keepdim=True) - mean ** 2 60 | std = torch.sqrt(var + self.eps) 61 | 62 | input = (input - mean) / std 63 | input = input.reshape(N, C, H, W) 64 | if self.affine: 65 | input = input * self.weight.reshape(1, C, 1, 1) + self.bias.reshape(1, C, 1, 1) 66 | return input 67 | 68 | def extra_repr(self): 69 | return '{num_groups}, {num_channels}, eps={eps}, ' \ 70 | 'affine={affine}'.format(**self.__dict__) 71 | -------------------------------------------------------------------------------- /fewx/modeling/__init__.py: -------------------------------------------------------------------------------- 1 | # Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved 2 | from .fsod import FsodRCNN, FsodRes5ROIHeads, FsodFastRCNNOutputLayers, FsodRPN 3 | 4 | _EXCLUDE = {"torch", "ShapeSpec"} 5 | __all__ = [k for k in globals().keys() if k not in _EXCLUDE and not k.startswith("_")] 6 | -------------------------------------------------------------------------------- /fewx/modeling/fsod/__init__.py: -------------------------------------------------------------------------------- 1 | from .fsod_rcnn import FsodRCNN 2 | from .fsod_roi_heads import FsodRes5ROIHeads 3 | from .fsod_fast_rcnn import FsodFastRCNNOutputLayers 4 | from .fsod_rpn import FsodRPN 5 | -------------------------------------------------------------------------------- /fewx/modeling/fsod/fsod_rcnn.py: -------------------------------------------------------------------------------- 1 | # Copyright (c) Facebook, Inc. and its affiliates. 
All Rights Reserved 2 | import logging 3 | import numpy as np 4 | import torch 5 | from torch import nn 6 | 7 | from detectron2.data.detection_utils import convert_image_to_rgb 8 | from detectron2.structures import ImageList, Boxes, Instances 9 | from detectron2.utils.events import get_event_storage 10 | from detectron2.utils.logger import log_first_n 11 | 12 | from detectron2.modeling.backbone import build_backbone 13 | from detectron2.modeling.postprocessing import detector_postprocess 14 | from detectron2.modeling.proposal_generator import build_proposal_generator 15 | from .fsod_roi_heads import build_roi_heads 16 | from detectron2.modeling.meta_arch.build import META_ARCH_REGISTRY 17 | 18 | from detectron2.modeling.poolers import ROIPooler 19 | import torch.nn.functional as F 20 | 21 | from .fsod_fast_rcnn import FsodFastRCNNOutputs 22 | 23 | import os 24 | 25 | import matplotlib.pyplot as plt 26 | import pandas as pd 27 | 28 | from detectron2.data.catalog import MetadataCatalog 29 | import detectron2.data.detection_utils as utils 30 | import pickle 31 | import sys 32 | 33 | __all__ = ["FsodRCNN"] 34 | 35 | 36 | @META_ARCH_REGISTRY.register() 37 | class FsodRCNN(nn.Module): 38 | """ 39 | Generalized R-CNN. Any models that contains the following three components: 40 | 1. Per-image feature extraction (aka backbone) 41 | 2. Region proposal generation 42 | 3. Per-region feature extraction and prediction 43 | """ 44 | 45 | def __init__(self, cfg): 46 | super().__init__() 47 | 48 | self.backbone = build_backbone(cfg) 49 | self.proposal_generator = build_proposal_generator(cfg, self.backbone.output_shape()) 50 | self.roi_heads = build_roi_heads(cfg, self.backbone.output_shape()) 51 | self.vis_period = cfg.VIS_PERIOD 52 | self.input_format = cfg.INPUT.FORMAT 53 | 54 | assert len(cfg.MODEL.PIXEL_MEAN) == len(cfg.MODEL.PIXEL_STD) 55 | self.register_buffer("pixel_mean", torch.Tensor(cfg.MODEL.PIXEL_MEAN).view(-1, 1, 1)) 56 | self.register_buffer("pixel_std", torch.Tensor(cfg.MODEL.PIXEL_STD).view(-1, 1, 1)) 57 | 58 | self.in_features = cfg.MODEL.ROI_HEADS.IN_FEATURES 59 | 60 | self.support_way = cfg.INPUT.FS.SUPPORT_WAY 61 | self.support_shot = cfg.INPUT.FS.SUPPORT_SHOT 62 | self.logger = logging.getLogger(__name__) 63 | 64 | @property 65 | def device(self): 66 | return self.pixel_mean.device 67 | 68 | def visualize_training(self, batched_inputs, proposals): 69 | """ 70 | A function used to visualize images and proposals. It shows ground truth 71 | bounding boxes on the original image and up to 20 predicted object 72 | proposals on the original image. Users can implement different 73 | visualization functions for different models. 74 | 75 | Args: 76 | batched_inputs (list): a list that contains input to the model. 77 | proposals (list): a list that contains predicted proposals. Both 78 | batched_inputs and proposals should have the same length. 
79 | """ 80 | from detectron2.utils.visualizer import Visualizer 81 | 82 | storage = get_event_storage() 83 | max_vis_prop = 20 84 | 85 | for input, prop in zip(batched_inputs, proposals): 86 | img = input["image"] 87 | img = convert_image_to_rgb(img.permute(1, 2, 0), self.input_format) 88 | v_gt = Visualizer(img, None) 89 | v_gt = v_gt.overlay_instances(boxes=input["instances"].gt_boxes) 90 | anno_img = v_gt.get_image() 91 | box_size = min(len(prop.proposal_boxes), max_vis_prop) 92 | v_pred = Visualizer(img, None) 93 | v_pred = v_pred.overlay_instances( 94 | boxes=prop.proposal_boxes[0:box_size].tensor.cpu().numpy() 95 | ) 96 | prop_img = v_pred.get_image() 97 | vis_img = np.concatenate((anno_img, prop_img), axis=1) 98 | vis_img = vis_img.transpose(2, 0, 1) 99 | vis_name = "Left: GT bounding boxes; Right: Predicted proposals" 100 | storage.put_image(vis_name, vis_img) 101 | break # only visualize one image in a batch 102 | 103 | def forward(self, batched_inputs): 104 | """ 105 | Args: 106 | batched_inputs: a list, batched outputs of :class:`DatasetMapper` . 107 | Each item in the list contains the inputs for one image. 108 | For now, each item in the list is a dict that contains: 109 | 110 | * image: Tensor, image in (C, H, W) format. 111 | * instances (optional): groundtruth :class:`Instances` 112 | * proposals (optional): :class:`Instances`, precomputed proposals. 113 | 114 | Other information that's included in the original dicts, such as: 115 | 116 | * "height", "width" (int): the output resolution of the model, used in inference. 117 | See :meth:`postprocess` for details. 118 | 119 | Returns: 120 | list[dict]: 121 | Each dict is the output for one input image. 122 | The dict contains one key "instances" whose value is a :class:`Instances`. 123 | The :class:`Instances` object has the following keys: 124 | "pred_boxes", "pred_classes", "scores", "pred_masks", "pred_keypoints" 125 | """ 126 | if not self.training: 127 | self.init_model() 128 | return self.inference(batched_inputs) 129 | 130 | images, support_images = self.preprocess_image(batched_inputs) 131 | if "instances" in batched_inputs[0]: 132 | for x in batched_inputs: 133 | x['instances'].set('gt_classes', torch.full_like(x['instances'].get('gt_classes'), 0)) 134 | 135 | gt_instances = [x["instances"].to(self.device) for x in batched_inputs] 136 | else: 137 | gt_instances = None 138 | 139 | features = self.backbone(images.tensor) 140 | 141 | # support branches 142 | support_bboxes_ls = [] 143 | for item in batched_inputs: 144 | bboxes = item['support_bboxes'] 145 | for box in bboxes: 146 | box = Boxes(box[np.newaxis, :]) 147 | support_bboxes_ls.append(box.to(self.device)) 148 | 149 | B, N, C, H, W = support_images.tensor.shape 150 | assert N == self.support_way * self.support_shot 151 | 152 | support_images = support_images.tensor.reshape(B*N, C, H, W) 153 | support_features = self.backbone(support_images) 154 | 155 | # support feature roi pooling 156 | feature_pooled = self.roi_heads.roi_pooling(support_features, support_bboxes_ls) 157 | 158 | support_box_features = self.roi_heads._shared_roi_transform([support_features[f] for f in self.in_features], support_bboxes_ls) 159 | assert self.support_way == 2 # now only 2 way support 160 | 161 | detector_loss_cls = [] 162 | detector_loss_box_reg = [] 163 | rpn_loss_rpn_cls = [] 164 | rpn_loss_rpn_loc = [] 165 | for i in range(B): # batch 166 | # query 167 | query_gt_instances = [gt_instances[i]] # one query gt instances 168 | query_images = ImageList.from_tensors([images[i]]) # one 
query image 169 | 170 | query_feature_res4 = features['res4'][i].unsqueeze(0) # one query feature for attention rpn 171 | query_features = {'res4': query_feature_res4} # one query feature for rcnn 172 | 173 | # positive support branch ################################## 174 | pos_begin = i * self.support_shot * self.support_way 175 | pos_end = pos_begin + self.support_shot 176 | pos_support_features = feature_pooled[pos_begin:pos_end].mean(0, True) # pos support features from res4, average all supports, for rcnn 177 | pos_support_features_pool = pos_support_features.mean(dim=[2, 3], keepdim=True) # average pooling support feature for attention rpn 178 | pos_correlation = F.conv2d(query_feature_res4, pos_support_features_pool.permute(1,0,2,3), groups=1024) # attention map 179 | 180 | pos_features = {'res4': pos_correlation} # attention map for attention rpn 181 | pos_support_box_features = support_box_features[pos_begin:pos_end].mean(0, True) 182 | pos_proposals, pos_anchors, pos_pred_objectness_logits, pos_gt_labels, pos_pred_anchor_deltas, pos_gt_boxes = self.proposal_generator(query_images, pos_features, query_gt_instances) # attention rpn 183 | pos_pred_class_logits, pos_pred_proposal_deltas, pos_detector_proposals = self.roi_heads(query_images, query_features, pos_support_box_features, pos_proposals, query_gt_instances) # pos rcnn 184 | 185 | # negative support branch ################################## 186 | neg_begin = pos_end 187 | neg_end = neg_begin + self.support_shot 188 | 189 | neg_support_features = feature_pooled[neg_begin:neg_end].mean(0, True) 190 | neg_support_features_pool = neg_support_features.mean(dim=[2, 3], keepdim=True) 191 | neg_correlation = F.conv2d(query_feature_res4, neg_support_features_pool.permute(1,0,2,3), groups=1024) 192 | 193 | neg_features = {'res4': neg_correlation} 194 | 195 | neg_support_box_features = support_box_features[neg_begin:neg_end].mean(0, True) 196 | neg_proposals, neg_anchors, neg_pred_objectness_logits, neg_gt_labels, neg_pred_anchor_deltas, neg_gt_boxes = self.proposal_generator(query_images, neg_features, query_gt_instances) 197 | neg_pred_class_logits, neg_pred_proposal_deltas, neg_detector_proposals = self.roi_heads(query_images, query_features, neg_support_box_features, neg_proposals, query_gt_instances) 198 | 199 | # rpn loss 200 | outputs_images = ImageList.from_tensors([images[i], images[i]]) 201 | 202 | outputs_pred_objectness_logits = [torch.cat(pos_pred_objectness_logits + neg_pred_objectness_logits, dim=0)] 203 | outputs_pred_anchor_deltas = [torch.cat(pos_pred_anchor_deltas + neg_pred_anchor_deltas, dim=0)] 204 | 205 | outputs_anchors = pos_anchors # + neg_anchors 206 | 207 | # convert 1 in neg_gt_labels to 0 208 | for item in neg_gt_labels: 209 | item[item == 1] = 0 210 | 211 | outputs_gt_boxes = pos_gt_boxes + neg_gt_boxes #[None] 212 | outputs_gt_labels = pos_gt_labels + neg_gt_labels 213 | 214 | if self.training: 215 | proposal_losses = self.proposal_generator.losses( 216 | outputs_anchors, outputs_pred_objectness_logits, outputs_gt_labels, outputs_pred_anchor_deltas, outputs_gt_boxes) 217 | proposal_losses = {k: v * self.proposal_generator.loss_weight for k, v in proposal_losses.items()} 218 | else: 219 | proposal_losses = {} 220 | 221 | # detector loss 222 | detector_pred_class_logits = torch.cat([pos_pred_class_logits, neg_pred_class_logits], dim=0) 223 | detector_pred_proposal_deltas = torch.cat([pos_pred_proposal_deltas, neg_pred_proposal_deltas], dim=0) 224 | for item in neg_detector_proposals: 225 | item.gt_classes 
= torch.full_like(item.gt_classes, 1) 226 | 227 | #detector_proposals = pos_detector_proposals + neg_detector_proposals 228 | detector_proposals = [Instances.cat(pos_detector_proposals + neg_detector_proposals)] 229 | if self.training: 230 | predictions = detector_pred_class_logits, detector_pred_proposal_deltas 231 | detector_losses = self.roi_heads.box_predictor.losses(predictions, detector_proposals) 232 | 233 | rpn_loss_rpn_cls.append(proposal_losses['loss_rpn_cls']) 234 | rpn_loss_rpn_loc.append(proposal_losses['loss_rpn_loc']) 235 | detector_loss_cls.append(detector_losses['loss_cls']) 236 | detector_loss_box_reg.append(detector_losses['loss_box_reg']) 237 | 238 | proposal_losses = {} 239 | detector_losses = {} 240 | 241 | proposal_losses['loss_rpn_cls'] = torch.stack(rpn_loss_rpn_cls).mean() 242 | proposal_losses['loss_rpn_loc'] = torch.stack(rpn_loss_rpn_loc).mean() 243 | detector_losses['loss_cls'] = torch.stack(detector_loss_cls).mean() 244 | detector_losses['loss_box_reg'] = torch.stack(detector_loss_box_reg).mean() 245 | 246 | 247 | losses = {} 248 | losses.update(detector_losses) 249 | losses.update(proposal_losses) 250 | return losses 251 | 252 | def init_model(self): 253 | self.support_on = True #False 254 | 255 | support_dir = './support_dir' 256 | if not os.path.exists(support_dir): 257 | os.makedirs(support_dir) 258 | 259 | support_file_name = os.path.join(support_dir, 'support_feature.pkl') 260 | if not os.path.exists(support_file_name): 261 | support_path = './datasets/coco/10_shot_support_df.pkl' 262 | support_df = pd.read_pickle(support_path) 263 | 264 | metadata = MetadataCatalog.get('coco_2017_train') 265 | # unmap the category mapping ids for COCO 266 | reverse_id_mapper = lambda dataset_id: metadata.thing_dataset_id_to_contiguous_id[dataset_id] # noqa 267 | support_df['category_id'] = support_df['category_id'].map(reverse_id_mapper) 268 | 269 | support_dict = {'res4_avg': {}, 'res5_avg': {}} 270 | for cls in support_df['category_id'].unique(): 271 | support_cls_df = support_df.loc[support_df['category_id'] == cls, :].reset_index() 272 | support_data_all = [] 273 | support_box_all = [] 274 | 275 | for index, support_img_df in support_cls_df.iterrows(): 276 | img_path = os.path.join('./datasets/coco', support_img_df['file_path']) 277 | support_data = utils.read_image(img_path, format='BGR') 278 | support_data = torch.as_tensor(np.ascontiguousarray(support_data.transpose(2, 0, 1))) 279 | support_data_all.append(support_data) 280 | 281 | support_box = support_img_df['support_box'] 282 | support_box_all.append(Boxes([support_box]).to(self.device)) 283 | 284 | # support images 285 | support_images = [x.to(self.device) for x in support_data_all] 286 | support_images = [(x - self.pixel_mean) / self.pixel_std for x in support_images] 287 | support_images = ImageList.from_tensors(support_images, self.backbone.size_divisibility) 288 | support_features = self.backbone(support_images.tensor) 289 | 290 | res4_pooled = self.roi_heads.roi_pooling(support_features, support_box_all) 291 | res4_avg = res4_pooled.mean(0, True) 292 | res4_avg = res4_avg.mean(dim=[2,3], keepdim=True) 293 | support_dict['res4_avg'][cls] = res4_avg.detach().cpu().data 294 | 295 | res5_feature = self.roi_heads._shared_roi_transform([support_features[f] for f in self.in_features], support_box_all) 296 | res5_avg = res5_feature.mean(0, True) 297 | support_dict['res5_avg'][cls] = res5_avg.detach().cpu().data 298 | 299 | del res4_avg 300 | del res4_pooled 301 | del support_features 302 | del res5_feature 303 
| del res5_avg 304 | 305 | with open(support_file_name, 'wb') as f: 306 | pickle.dump(support_dict, f) 307 | self.logger.info("=========== Offline support features are generated. ===========") 308 | self.logger.info("============ Few-shot object detection will start. =============") 309 | sys.exit(0) 310 | 311 | else: 312 | with open(support_file_name, "rb") as hFile: 313 | self.support_dict = pickle.load(hFile, encoding="latin1") 314 | for res_key, res_dict in self.support_dict.items(): 315 | for cls_key, feature in res_dict.items(): 316 | self.support_dict[res_key][cls_key] = feature.cuda() 317 | 318 | def inference(self, batched_inputs, detected_instances=None, do_postprocess=True): 319 | """ 320 | Run inference on the given inputs. 321 | Args: 322 | batched_inputs (list[dict]): same as in :meth:`forward` 323 | detected_instances (None or list[Instances]): if not None, it 324 | contains an `Instances` object per image. The `Instances` 325 | object contains "pred_boxes" and "pred_classes" which are 326 | known boxes in the image. 327 | The inference will then skip the detection of bounding boxes, 328 | and only predict other per-ROI outputs. 329 | do_postprocess (bool): whether to apply post-processing on the outputs. 330 | Returns: 331 | same as in :meth:`forward`. 332 | """ 333 | assert not self.training 334 | 335 | images = self.preprocess_image(batched_inputs) 336 | features = self.backbone(images.tensor) 337 | 338 | B, _, _, _ = features['res4'].shape 339 | assert B == 1 # only support 1 query image in test 340 | assert len(images) == 1 341 | support_proposals_dict = {} 342 | support_box_features_dict = {} 343 | proposal_num_dict = {} 344 | 345 | for cls_id, res4_avg in self.support_dict['res4_avg'].items(): 346 | query_images = ImageList.from_tensors([images[0]]) # one query image 347 | 348 | query_features_res4 = features['res4'] # one query feature for attention rpn 349 | query_features = {'res4': query_features_res4} # one query feature for rcnn 350 | 351 | # support branch ################################## 352 | support_box_features = self.support_dict['res5_avg'][cls_id] 353 | 354 | correlation = F.conv2d(query_features_res4, res4_avg.permute(1,0,2,3), groups=1024) # attention map 355 | 356 | support_correlation = {'res4': correlation} # attention map for attention rpn 357 | 358 | proposals, _ = self.proposal_generator(query_images, support_correlation, None) 359 | support_proposals_dict[cls_id] = proposals 360 | support_box_features_dict[cls_id] = support_box_features 361 | 362 | if cls_id not in proposal_num_dict.keys(): 363 | proposal_num_dict[cls_id] = [] 364 | proposal_num_dict[cls_id].append(len(proposals[0])) 365 | 366 | del support_box_features 367 | del correlation 368 | del res4_avg 369 | del query_features_res4 370 | 371 | results, _ = self.roi_heads.eval_with_support(query_images, query_features, support_proposals_dict, support_box_features_dict) 372 | 373 | if do_postprocess: 374 | return FsodRCNN._postprocess(results, batched_inputs, images.image_sizes) 375 | else: 376 | return results 377 | 378 | def preprocess_image(self, batched_inputs): 379 | """ 380 | Normalize, pad and batch the input images.
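During training this method also normalizes and batches the support crops stored under the ``support_images`` key of each input dict and returns the pair ``(images, support_images)``; at test time only the batched query ``images`` are returned.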
381 | """ 382 | images = [x["image"].to(self.device) for x in batched_inputs] 383 | images = [(x - self.pixel_mean) / self.pixel_std for x in images] 384 | images = ImageList.from_tensors(images, self.backbone.size_divisibility) 385 | if self.training: 386 | # support images 387 | support_images = [x['support_images'].to(self.device) for x in batched_inputs] 388 | support_images = [(x - self.pixel_mean) / self.pixel_std for x in support_images] 389 | support_images = ImageList.from_tensors(support_images, self.backbone.size_divisibility) 390 | 391 | return images, support_images 392 | else: 393 | return images 394 | 395 | @staticmethod 396 | def _postprocess(instances, batched_inputs, image_sizes): 397 | """ 398 | Rescale the output instances to the target size. 399 | """ 400 | # note: private function; subject to changes 401 | processed_results = [] 402 | for results_per_image, input_per_image, image_size in zip( 403 | instances, batched_inputs, image_sizes 404 | ): 405 | height = input_per_image.get("height", image_size[0]) 406 | width = input_per_image.get("width", image_size[1]) 407 | r = detector_postprocess(results_per_image, height, width) 408 | processed_results.append({"instances": r}) 409 | return processed_results 410 | -------------------------------------------------------------------------------- /fewx/modeling/fsod/fsod_roi_heads.py: -------------------------------------------------------------------------------- 1 | # Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved 2 | import inspect 3 | import logging 4 | import numpy as np 5 | from typing import Dict, List, Optional, Tuple, Union 6 | import torch 7 | from torch import nn 8 | 9 | from detectron2.config import configurable 10 | from detectron2.layers import ShapeSpec, nonzero_tuple 11 | from detectron2.structures import Boxes, ImageList, Instances, pairwise_iou 12 | from detectron2.utils.events import get_event_storage 13 | from detectron2.utils.registry import Registry 14 | 15 | from detectron2.modeling.backbone.resnet import BottleneckBlock, make_stage 16 | from detectron2.modeling.matcher import Matcher 17 | from detectron2.modeling.poolers import ROIPooler 18 | from detectron2.modeling.proposal_generator.proposal_utils import add_ground_truth_to_proposals 19 | from detectron2.modeling.sampling import subsample_labels 20 | from detectron2.modeling.roi_heads.box_head import build_box_head 21 | from detectron2.modeling.roi_heads.roi_heads import ROIHeads 22 | from .fsod_fast_rcnn import FsodFastRCNNOutputLayers 23 | 24 | import time 25 | from detectron2.structures import Boxes, Instances 26 | 27 | ROI_HEADS_REGISTRY = Registry("ROI_HEADS") 28 | ROI_HEADS_REGISTRY.__doc__ = """ 29 | Registry for ROI heads in a generalized R-CNN model. 30 | ROIHeads take feature maps and region proposals, and 31 | perform per-region computation. 32 | 33 | The registered object will be called with `obj(cfg, input_shape)`. 34 | The call is expected to return an :class:`ROIHeads`. 35 | """ 36 | 37 | logger = logging.getLogger(__name__) 38 | 39 | def build_roi_heads(cfg, input_shape): 40 | """ 41 | Build ROIHeads defined by `cfg.MODEL.ROI_HEADS.NAME`. 42 | """ 43 | name = cfg.MODEL.ROI_HEADS.NAME 44 | return ROI_HEADS_REGISTRY.get(name)(cfg, input_shape) 45 | 46 | @ROI_HEADS_REGISTRY.register() 47 | class FsodRes5ROIHeads(ROIHeads): 48 | """ 49 | The ROIHeads in a typical "C4" R-CNN model, where 50 | the box and mask head share the cropping and 51 | the per-region feature computation by a Res5 block. 
52 | """ 53 | 54 | def __init__(self, cfg, input_shape): 55 | super().__init__(cfg) 56 | 57 | # fmt: off 58 | self.in_features = cfg.MODEL.ROI_HEADS.IN_FEATURES 59 | pooler_resolution = cfg.MODEL.ROI_BOX_HEAD.POOLER_RESOLUTION 60 | pooler_type = cfg.MODEL.ROI_BOX_HEAD.POOLER_TYPE 61 | pooler_scales = (1.0 / input_shape[self.in_features[0]].stride, ) 62 | sampling_ratio = cfg.MODEL.ROI_BOX_HEAD.POOLER_SAMPLING_RATIO 63 | self.mask_on = cfg.MODEL.MASK_ON 64 | # fmt: on 65 | assert not cfg.MODEL.KEYPOINT_ON 66 | assert len(self.in_features) == 1 67 | 68 | self.pooler = ROIPooler( 69 | output_size=pooler_resolution, 70 | scales=pooler_scales, 71 | sampling_ratio=sampling_ratio, 72 | pooler_type=pooler_type, 73 | ) 74 | 75 | self.res5, out_channels = self._build_res5_block(cfg) 76 | self.box_predictor = FsodFastRCNNOutputLayers( 77 | cfg, ShapeSpec(channels=out_channels, height=1, width=1) 78 | ) 79 | 80 | def _build_res5_block(self, cfg): 81 | # fmt: off 82 | stage_channel_factor = 2 ** 3 # res5 is 8x res2 83 | num_groups = cfg.MODEL.RESNETS.NUM_GROUPS 84 | width_per_group = cfg.MODEL.RESNETS.WIDTH_PER_GROUP 85 | bottleneck_channels = num_groups * width_per_group * stage_channel_factor 86 | out_channels = cfg.MODEL.RESNETS.RES2_OUT_CHANNELS * stage_channel_factor 87 | stride_in_1x1 = cfg.MODEL.RESNETS.STRIDE_IN_1X1 88 | norm = cfg.MODEL.RESNETS.NORM 89 | assert not cfg.MODEL.RESNETS.DEFORM_ON_PER_STAGE[-1], \ 90 | "Deformable conv is not yet supported in res5 head." 91 | # fmt: on 92 | 93 | blocks = make_stage( 94 | BottleneckBlock, 95 | 3, 96 | first_stride=2, 97 | in_channels=out_channels // 2, 98 | bottleneck_channels=bottleneck_channels, 99 | out_channels=out_channels, 100 | num_groups=num_groups, 101 | norm=norm, 102 | stride_in_1x1=stride_in_1x1, 103 | ) 104 | return nn.Sequential(*blocks), out_channels 105 | 106 | def _shared_roi_transform(self, features, boxes): 107 | x = self.pooler(features, boxes) 108 | return self.res5(x) 109 | 110 | def roi_pooling(self, features, boxes): 111 | box_features = self.pooler( 112 | [features[f] for f in self.in_features], boxes 113 | ) 114 | #feature_pooled = box_features.mean(dim=[2, 3], keepdim=True) # pooled to 1x1 115 | 116 | return box_features #feature_pooled 117 | 118 | def forward(self, images, features, support_box_features, proposals, targets=None): 119 | """ 120 | See :meth:`ROIHeads.forward`. 121 | """ 122 | del images 123 | 124 | if self.training: 125 | assert targets 126 | proposals = self.label_and_sample_proposals(proposals, targets) 127 | del targets 128 | proposal_boxes = [x.proposal_boxes for x in proposals] 129 | box_features = self._shared_roi_transform( 130 | [features[f] for f in self.in_features], proposal_boxes 131 | ) 132 | 133 | #support_features = self.res5(support_features) 134 | pred_class_logits, pred_proposal_deltas = self.box_predictor(box_features, support_box_features) 135 | 136 | return pred_class_logits, pred_proposal_deltas, proposals 137 | 138 | @torch.no_grad() 139 | def eval_with_support(self, images, features, support_proposals_dict, support_box_features_dict): 140 | """ 141 | See :meth:`ROIHeads.forward`. 
142 | """ 143 | del images 144 | 145 | full_proposals_ls = [] 146 | cls_ls = [] 147 | for cls_id, proposals in support_proposals_dict.items(): 148 | full_proposals_ls.append(proposals[0]) 149 | cls_ls.append(cls_id) 150 | 151 | full_proposals_ls = [Instances.cat(full_proposals_ls)] 152 | 153 | proposal_boxes = [x.proposal_boxes for x in full_proposals_ls] 154 | assert len(proposal_boxes[0]) == 2000 155 | 156 | box_features = self._shared_roi_transform( 157 | [features[f] for f in self.in_features], proposal_boxes 158 | ) 159 | 160 | full_scores_ls = [] 161 | full_bboxes_ls = [] 162 | full_cls_ls = [] 163 | cnt = 0 164 | #for cls_id, support_box_features in support_box_features_dict.items(): 165 | for cls_id in cls_ls: 166 | support_box_features = support_box_features_dict[cls_id] 167 | query_features = box_features[cnt*100:(cnt+1)*100] 168 | pred_class_logits, pred_proposal_deltas = self.box_predictor(query_features, support_box_features) 169 | full_scores_ls.append(pred_class_logits) 170 | full_bboxes_ls.append(pred_proposal_deltas) 171 | full_cls_ls.append(torch.full_like(pred_class_logits[:, 0].unsqueeze(-1), cls_id).to(torch.int8)) 172 | del query_features 173 | del support_box_features 174 | 175 | cnt += 1 176 | 177 | class_logits = torch.cat(full_scores_ls, dim=0) 178 | proposal_deltas = torch.cat(full_bboxes_ls, dim=0) 179 | pred_cls = torch.cat(full_cls_ls, dim=0) #.unsqueeze(-1) 180 | 181 | predictions = class_logits, proposal_deltas 182 | proposals = full_proposals_ls 183 | pred_instances, _ = self.box_predictor.inference(pred_cls, predictions, proposals) 184 | pred_instances = self.forward_with_given_boxes(features, pred_instances) 185 | 186 | return pred_instances, {} 187 | 188 | def forward_with_given_boxes(self, features, instances): 189 | """ 190 | Use the given boxes in `instances` to produce other (non-box) per-ROI outputs. 191 | 192 | Args: 193 | features: same as in `forward()` 194 | instances (list[Instances]): instances to predict other outputs. Expect the keys 195 | "pred_boxes" and "pred_classes" to exist. 196 | 197 | Returns: 198 | instances (Instances): 199 | the same `Instances` object, with extra 200 | fields such as `pred_masks` or `pred_keypoints`. 201 | """ 202 | assert not self.training 203 | assert instances[0].has("pred_boxes") and instances[0].has("pred_classes") 204 | 205 | if self.mask_on: 206 | features = [features[f] for f in self.in_features] 207 | x = self._shared_roi_transform(features, [x.pred_boxes for x in instances]) 208 | return self.mask_head(x, instances) 209 | else: 210 | return instances 211 | -------------------------------------------------------------------------------- /fewx/modeling/fsod/fsod_rpn.py: -------------------------------------------------------------------------------- 1 | # Copyright (c) Facebook, Inc. and its affiliates. 
All Rights Reserved 2 | from typing import Dict, List, Optional, Tuple 3 | import torch 4 | import torch.nn.functional as F 5 | from fvcore.nn import smooth_l1_loss 6 | from torch import nn 7 | 8 | from detectron2.config import configurable 9 | from detectron2.layers import ShapeSpec, cat 10 | from detectron2.structures import Boxes, ImageList, Instances, pairwise_iou 11 | from detectron2.utils.events import get_event_storage 12 | from detectron2.utils.memory import retry_if_cuda_oom 13 | from detectron2.utils.registry import Registry 14 | 15 | from detectron2.modeling.anchor_generator import build_anchor_generator 16 | from detectron2.modeling.box_regression import Box2BoxTransform 17 | from detectron2.modeling.matcher import Matcher 18 | from detectron2.modeling.sampling import subsample_labels 19 | from detectron2.modeling.proposal_generator.build import PROPOSAL_GENERATOR_REGISTRY 20 | from detectron2.modeling.proposal_generator.proposal_utils import find_top_rpn_proposals 21 | 22 | RPN_HEAD_REGISTRY = Registry("RPN_HEAD") 23 | RPN_HEAD_REGISTRY.__doc__ = """ 24 | Registry for RPN heads, which take feature maps and perform 25 | objectness classification and bounding box regression for anchors. 26 | 27 | The registered object will be called with `obj(cfg, input_shape)`. 28 | The call should return a `nn.Module` object. 29 | """ 30 | 31 | 32 | """ 33 | Shape shorthand in this module: 34 | 35 | N: number of images in the minibatch 36 | L: number of feature maps per image on which RPN is run 37 | A: number of cell anchors (must be the same for all feature maps) 38 | Hi, Wi: height and width of the i-th feature map 39 | B: size of the box parameterization 40 | 41 | Naming convention: 42 | 43 | objectness: refers to the binary classification of an anchor as object vs. not object. 44 | 45 | deltas: refers to the 4-d (dx, dy, dw, dh) deltas that parameterize the box2box 46 | transform (see :class:`box_regression.Box2BoxTransform`), or 5d for rotated boxes. 47 | 48 | pred_objectness_logits: predicted objectness scores in [-inf, +inf]; use 49 | sigmoid(pred_objectness_logits) to estimate P(object). 50 | 51 | gt_labels: ground-truth binary classification labels for objectness 52 | 53 | pred_anchor_deltas: predicted box2box transform deltas 54 | 55 | gt_anchor_deltas: ground-truth box2box transform deltas 56 | """ 57 | 58 | 59 | def build_rpn_head(cfg, input_shape): 60 | """ 61 | Build an RPN head defined by `cfg.MODEL.RPN.HEAD_NAME`. 62 | """ 63 | name = cfg.MODEL.RPN.HEAD_NAME 64 | return RPN_HEAD_REGISTRY.get(name)(cfg, input_shape) 65 | 66 | 67 | @RPN_HEAD_REGISTRY.register() 68 | class StandardRPNHead(nn.Module): 69 | """ 70 | Standard RPN classification and regression heads described in :paper:`Faster R-CNN`. 71 | Uses a 3x3 conv to produce a shared hidden state from which one 1x1 conv predicts 72 | objectness logits for each anchor and a second 1x1 conv predicts bounding-box deltas 73 | specifying how to deform each anchor into an object proposal. 74 | """ 75 | 76 | @configurable 77 | def __init__(self, *, in_channels: int, num_anchors: int, box_dim: int = 4): 78 | """ 79 | NOTE: this interface is experimental. 80 | 81 | Args: 82 | in_channels (int): number of input feature channels. When using multiple 83 | input features, they must have the same number of channels. 84 | num_anchors (int): number of anchors to predict for *each spatial position* 85 | on the feature map. The total number of anchors for each 86 | feature map will be `num_anchors * H * W`. 
87 | box_dim (int): dimension of a box, which is also the number of box regression 88 | predictions to make for each anchor. An axis aligned box has 89 | box_dim=4, while a rotated box has box_dim=5. 90 | """ 91 | super().__init__() 92 | # 3x3 conv for the hidden representation 93 | self.conv = nn.Conv2d(in_channels, in_channels, kernel_size=3, stride=1, padding=1) 94 | # 1x1 conv for predicting objectness logits 95 | self.objectness_logits = nn.Conv2d(in_channels, num_anchors, kernel_size=1, stride=1) 96 | # 1x1 conv for predicting box2box transform deltas 97 | self.anchor_deltas = nn.Conv2d(in_channels, num_anchors * box_dim, kernel_size=1, stride=1) 98 | 99 | for l in [self.conv, self.objectness_logits, self.anchor_deltas]: 100 | nn.init.normal_(l.weight, std=0.01) 101 | nn.init.constant_(l.bias, 0) 102 | 103 | @classmethod 104 | def from_config(cls, cfg, input_shape): 105 | # Standard RPN is shared across levels: 106 | in_channels = [s.channels for s in input_shape] 107 | assert len(set(in_channels)) == 1, "Each level must have the same channel!" 108 | in_channels = in_channels[0] 109 | 110 | # RPNHead should take the same input as anchor generator 111 | # NOTE: it assumes that creating an anchor generator does not have unwanted side effect. 112 | anchor_generator = build_anchor_generator(cfg, input_shape) 113 | num_anchors = anchor_generator.num_anchors 114 | box_dim = anchor_generator.box_dim 115 | assert ( 116 | len(set(num_anchors)) == 1 117 | ), "Each level must have the same number of anchors per spatial position" 118 | return {"in_channels": in_channels, "num_anchors": num_anchors[0], "box_dim": box_dim} 119 | 120 | def forward(self, features: List[torch.Tensor]): 121 | """ 122 | Args: 123 | features (list[Tensor]): list of feature maps 124 | 125 | Returns: 126 | list[Tensor]: A list of L elements. 127 | Element i is a tensor of shape (N, A, Hi, Wi) representing 128 | the predicted objectness logits for all anchors. A is the number of cell anchors. 129 | list[Tensor]: A list of L elements. Element i is a tensor of shape 130 | (N, A*box_dim, Hi, Wi) representing the predicted "deltas" used to transform anchors 131 | to proposals. 132 | """ 133 | pred_objectness_logits = [] 134 | pred_anchor_deltas = [] 135 | for x in features: 136 | t = F.relu(self.conv(x)) 137 | pred_objectness_logits.append(self.objectness_logits(t)) 138 | pred_anchor_deltas.append(self.anchor_deltas(t)) 139 | return pred_objectness_logits, pred_anchor_deltas 140 | 141 | 142 | @PROPOSAL_GENERATOR_REGISTRY.register() 143 | class FsodRPN(nn.Module): 144 | """ 145 | Region Proposal Network, introduced by :paper:`Faster R-CNN`. 146 | """ 147 | 148 | @configurable 149 | def __init__( 150 | self, 151 | *, 152 | in_features: List[str], 153 | head: nn.Module, 154 | anchor_generator: nn.Module, 155 | anchor_matcher: Matcher, 156 | box2box_transform: Box2BoxTransform, 157 | batch_size_per_image: int, 158 | positive_fraction: float, 159 | pre_nms_topk: Tuple[float, float], 160 | post_nms_topk: Tuple[float, float], 161 | nms_thresh: float = 0.7, 162 | min_box_size: float = 0.0, 163 | anchor_boundary_thresh: float = -1.0, 164 | loss_weight: float = 1.0, 165 | smooth_l1_beta: float = 0.0 166 | ): 167 | """ 168 | NOTE: this interface is experimental. 
169 | 170 | Args: 171 | in_features (list[str]): list of names of input features to use 172 | head (nn.Module): a module that predicts logits and regression deltas 173 | for each level from a list of per-level features 174 | anchor_generator (nn.Module): a module that creates anchors from a 175 | list of features. Usually an instance of :class:`AnchorGenerator` 176 | anchor_matcher (Matcher): label the anchors by matching them with ground truth. 177 | box2box_transform (Box2BoxTransform): defines the transform from anchors boxes to 178 | instance boxes 179 | batch_size_per_image (int): number of anchors per image to sample for training 180 | positive_fraction (float): fraction of foreground anchors to sample for training 181 | pre_nms_topk (tuple[float]): (train, test) that represents the 182 | number of top k proposals to select before NMS, in 183 | training and testing. 184 | post_nms_topk (tuple[float]): (train, test) that represents the 185 | number of top k proposals to select after NMS, in 186 | training and testing. 187 | nms_thresh (float): NMS threshold used to de-duplicate the predicted proposals 188 | min_box_size (float): remove proposal boxes with any side smaller than this threshold, 189 | in the unit of input image pixels 190 | anchor_boundary_thresh (float): legacy option 191 | loss_weight (float): weight to be multiplied to the loss 192 | smooth_l1_beta (float): beta parameter for the smooth L1 193 | regression loss. Default to use L1 loss. 194 | """ 195 | super().__init__() 196 | self.in_features = in_features 197 | self.rpn_head = head 198 | self.anchor_generator = anchor_generator 199 | self.anchor_matcher = anchor_matcher 200 | self.box2box_transform = box2box_transform 201 | self.batch_size_per_image = batch_size_per_image 202 | self.positive_fraction = positive_fraction 203 | # Map from self.training state to train/test settings 204 | self.pre_nms_topk = {True: pre_nms_topk[0], False: pre_nms_topk[1]} 205 | self.post_nms_topk = {True: post_nms_topk[0], False: post_nms_topk[1]} 206 | self.nms_thresh = nms_thresh 207 | self.min_box_size = min_box_size 208 | self.anchor_boundary_thresh = anchor_boundary_thresh 209 | self.loss_weight = loss_weight 210 | self.smooth_l1_beta = smooth_l1_beta 211 | 212 | @classmethod 213 | def from_config(cls, cfg, input_shape: Dict[str, ShapeSpec]): 214 | in_features = cfg.MODEL.RPN.IN_FEATURES 215 | ret = { 216 | "in_features": in_features, 217 | "min_box_size": cfg.MODEL.PROPOSAL_GENERATOR.MIN_SIZE, 218 | "nms_thresh": cfg.MODEL.RPN.NMS_THRESH, 219 | "batch_size_per_image": cfg.MODEL.RPN.BATCH_SIZE_PER_IMAGE, 220 | "positive_fraction": cfg.MODEL.RPN.POSITIVE_FRACTION, 221 | "smooth_l1_beta": cfg.MODEL.RPN.SMOOTH_L1_BETA, 222 | "loss_weight": cfg.MODEL.RPN.LOSS_WEIGHT, 223 | "anchor_boundary_thresh": cfg.MODEL.RPN.BOUNDARY_THRESH, 224 | "box2box_transform": Box2BoxTransform(weights=cfg.MODEL.RPN.BBOX_REG_WEIGHTS), 225 | } 226 | 227 | ret["pre_nms_topk"] = (cfg.MODEL.RPN.PRE_NMS_TOPK_TRAIN, cfg.MODEL.RPN.PRE_NMS_TOPK_TEST) 228 | ret["post_nms_topk"] = (cfg.MODEL.RPN.POST_NMS_TOPK_TRAIN, cfg.MODEL.RPN.POST_NMS_TOPK_TEST) 229 | 230 | ret["anchor_generator"] = build_anchor_generator(cfg, [input_shape[f] for f in in_features]) 231 | ret["anchor_matcher"] = Matcher( 232 | cfg.MODEL.RPN.IOU_THRESHOLDS, cfg.MODEL.RPN.IOU_LABELS, allow_low_quality_matches=True 233 | ) 234 | ret["head"] = build_rpn_head(cfg, [input_shape[f] for f in in_features]) 235 | return ret 236 | 237 | def _subsample_labels(self, label): 238 | """ 239 | Randomly sample a 
subset of positive and negative examples, and overwrite 240 | the label vector to the ignore value (-1) for all elements that are not 241 | included in the sample. 242 | 243 | Args: 244 | labels (Tensor): a vector of -1, 0, 1. Will be modified in-place and returned. 245 | """ 246 | pos_idx, neg_idx = subsample_labels( 247 | label, self.batch_size_per_image, self.positive_fraction, 0 248 | ) 249 | # Fill with the ignore label (-1), then set positive and negative labels 250 | label.fill_(-1) 251 | label.scatter_(0, pos_idx, 1) 252 | label.scatter_(0, neg_idx, 0) 253 | return label 254 | 255 | @torch.no_grad() 256 | def label_and_sample_anchors(self, anchors: List[Boxes], gt_instances: List[Instances]): 257 | """ 258 | Args: 259 | anchors (list[Boxes]): anchors for each feature map. 260 | gt_instances: the ground-truth instances for each image. 261 | 262 | Returns: 263 | list[Tensor]: 264 | List of #img tensors. i-th element is a vector of labels whose length is 265 | the total number of anchors across all feature maps R = sum(Hi * Wi * A). 266 | Label values are in {-1, 0, 1}, with meanings: -1 = ignore; 0 = negative 267 | class; 1 = positive class. 268 | list[Tensor]: 269 | i-th element is a Rx4 tensor. The values are the matched gt boxes for each 270 | anchor. Values are undefined for those anchors not labeled as 1. 271 | """ 272 | anchors = Boxes.cat(anchors) 273 | 274 | gt_boxes = [x.gt_boxes for x in gt_instances] 275 | image_sizes = [x.image_size for x in gt_instances] 276 | del gt_instances 277 | 278 | gt_labels = [] 279 | matched_gt_boxes = [] 280 | for image_size_i, gt_boxes_i in zip(image_sizes, gt_boxes): 281 | """ 282 | image_size_i: (h, w) for the i-th image 283 | gt_boxes_i: ground-truth boxes for i-th image 284 | """ 285 | 286 | match_quality_matrix = retry_if_cuda_oom(pairwise_iou)(gt_boxes_i, anchors) 287 | matched_idxs, gt_labels_i = retry_if_cuda_oom(self.anchor_matcher)(match_quality_matrix) 288 | # Matching is memory-expensive and may result in CPU tensors. But the result is small 289 | gt_labels_i = gt_labels_i.to(device=gt_boxes_i.device) 290 | del match_quality_matrix 291 | 292 | if self.anchor_boundary_thresh >= 0: 293 | # Discard anchors that go out of the boundaries of the image 294 | # NOTE: This is legacy functionality that is turned off by default in Detectron2 295 | anchors_inside_image = anchors.inside_box(image_size_i, self.anchor_boundary_thresh) 296 | gt_labels_i[~anchors_inside_image] = -1 297 | 298 | # A vector of labels (-1, 0, 1) for each anchor 299 | gt_labels_i = self._subsample_labels(gt_labels_i) 300 | 301 | if len(gt_boxes_i) == 0: 302 | # These values won't be used anyway since the anchor is labeled as background 303 | matched_gt_boxes_i = torch.zeros_like(anchors.tensor) 304 | else: 305 | # TODO wasted indexing computation for ignored boxes 306 | matched_gt_boxes_i = gt_boxes_i[matched_idxs].tensor 307 | 308 | gt_labels.append(gt_labels_i) # N,AHW 309 | matched_gt_boxes.append(matched_gt_boxes_i) 310 | return gt_labels, matched_gt_boxes 311 | 312 | def losses( 313 | self, 314 | anchors, 315 | pred_objectness_logits: List[torch.Tensor], 316 | gt_labels: List[torch.Tensor], 317 | pred_anchor_deltas: List[torch.Tensor], 318 | gt_boxes, 319 | ): 320 | """ 321 | Return the losses from a set of RPN predictions and their associated ground-truth. 322 | 323 | Args: 324 | anchors (list[Boxes or RotatedBoxes]): anchors for each feature map, each 325 | has shape (Hi*Wi*A, B), where B is box dimension (4 or 5). 
326 | pred_objectness_logits (list[Tensor]): A list of L elements. 327 | Element i is a tensor of shape (N, Hi*Wi*A) representing 328 | the predicted objectness logits for all anchors. 329 | gt_labels (list[Tensor]): Output of :meth:`label_and_sample_anchors`. 330 | pred_anchor_deltas (list[Tensor]): A list of L elements. Element i is a tensor of shape 331 | (N, Hi*Wi*A, 4 or 5) representing the predicted "deltas" used to transform anchors 332 | to proposals. 333 | gt_boxes (list[Boxes or RotatedBoxes]): Output of :meth:`label_and_sample_anchors`. 334 | 335 | Returns: 336 | dict[loss name -> loss value]: A dict mapping from loss name to loss value. 337 | Loss names are: `loss_rpn_cls` for objectness classification and 338 | `loss_rpn_loc` for proposal localization. 339 | """ 340 | num_images = len(gt_labels) 341 | gt_labels = torch.stack(gt_labels) # (N, sum(Hi*Wi*Ai)) 342 | anchors = type(anchors[0]).cat(anchors).tensor # Ax(4 or 5) 343 | #print(anchors.shape, gt_boxes[0].shape, len(gt_boxes)) 344 | gt_anchor_deltas = [self.box2box_transform.get_deltas(anchors, k) for k in gt_boxes] 345 | gt_anchor_deltas = torch.stack(gt_anchor_deltas) # (N, sum(Hi*Wi*Ai), 4 or 5) 346 | 347 | # Log the number of positive/negative anchors per-image that's used in training 348 | pos_mask = gt_labels == 1 349 | num_pos_anchors = pos_mask.sum().item() 350 | num_neg_anchors = (gt_labels == 0).sum().item() 351 | storage = get_event_storage() 352 | storage.put_scalar("rpn/num_pos_anchors", num_pos_anchors / num_images) 353 | storage.put_scalar("rpn/num_neg_anchors", num_neg_anchors / num_images) 354 | 355 | localization_loss = smooth_l1_loss( 356 | cat(pred_anchor_deltas, dim=1)[pos_mask], 357 | gt_anchor_deltas[pos_mask], 358 | self.smooth_l1_beta, 359 | reduction="sum", 360 | ) 361 | valid_mask = gt_labels >= 0 362 | objectness_loss = F.binary_cross_entropy_with_logits( 363 | cat(pred_objectness_logits, dim=1)[valid_mask], 364 | gt_labels[valid_mask].to(torch.float32), 365 | reduction="sum", 366 | ) 367 | normalizer = self.batch_size_per_image * num_images 368 | return { 369 | "loss_rpn_cls": objectness_loss / normalizer, 370 | "loss_rpn_loc": localization_loss / normalizer, 371 | } 372 | 373 | def forward( 374 | self, 375 | images: ImageList, 376 | features: Dict[str, torch.Tensor], 377 | gt_instances: Optional[Instances] = None, 378 | ): 379 | """ 380 | Args: 381 | images (ImageList): input images of length `N` 382 | features (dict[str, Tensor]): input data as a mapping from feature 383 | map name to tensor. Axis 0 represents the number of images `N` in 384 | the input data; axes 1-3 are channels, height, and width, which may 385 | vary between feature maps (e.g., if a feature pyramid is used). 386 | gt_instances (list[Instances], optional): a length `N` list of `Instances`s. 387 | Each `Instances` stores ground-truth instances for the corresponding image. 
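            Note that, unlike the standard detectron2 RPN, during training this method does not compute the RPN losses itself: it returns the decoded proposals together with the anchors, predicted objectness logits, sampled ground-truth labels, predicted anchor deltas and matched ground-truth boxes, and :class:`FsodRCNN` merges the positive- and negative-support outputs before calling :meth:`losses`.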
388 | 389 | Returns: 390 | proposals: list[Instances]: contains fields "proposal_boxes", "objectness_logits" 391 | loss: dict[Tensor] or None 392 | """ 393 | features = [features[f] for f in self.in_features] 394 | anchors = self.anchor_generator(features) 395 | 396 | pred_objectness_logits, pred_anchor_deltas = self.rpn_head(features) 397 | # Transpose the Hi*Wi*A dimension to the middle: 398 | pred_objectness_logits = [ 399 | # (N, A, Hi, Wi) -> (N, Hi, Wi, A) -> (N, Hi*Wi*A) 400 | score.permute(0, 2, 3, 1).flatten(1) 401 | for score in pred_objectness_logits 402 | ] 403 | pred_anchor_deltas = [ 404 | # (N, A*B, Hi, Wi) -> (N, A, B, Hi, Wi) -> (N, Hi, Wi, A, B) -> (N, Hi*Wi*A, B) 405 | x.view(x.shape[0], -1, self.anchor_generator.box_dim, x.shape[-2], x.shape[-1]) 406 | .permute(0, 3, 4, 1, 2) 407 | .flatten(1, -2) 408 | for x in pred_anchor_deltas 409 | ] 410 | 411 | if self.training: 412 | gt_labels, gt_boxes = self.label_and_sample_anchors(anchors, gt_instances) 413 | #losses = self.losses( 414 | # anchors, pred_objectness_logits, gt_labels, pred_anchor_deltas, gt_boxes 415 | #) 416 | #losses = {k: v * self.loss_weight for k, v in losses.items()} 417 | proposals = self.predict_proposals( 418 | anchors, pred_objectness_logits, pred_anchor_deltas, images.image_sizes 419 | ) 420 | 421 | return proposals, anchors, pred_objectness_logits, gt_labels, pred_anchor_deltas, gt_boxes #, losses 422 | else: 423 | losses = {} 424 | proposals = self.predict_proposals( 425 | anchors, pred_objectness_logits, pred_anchor_deltas, images.image_sizes 426 | ) 427 | return proposals, losses 428 | 429 | @torch.no_grad() 430 | def predict_proposals( 431 | self, 432 | anchors, 433 | pred_objectness_logits: List[torch.Tensor], 434 | pred_anchor_deltas: List[torch.Tensor], 435 | image_sizes: List[Tuple[int, int]], 436 | ): 437 | """ 438 | Decode all the predicted box regression deltas to proposals. Find the top proposals 439 | by applying NMS and removing boxes that are too small. 440 | 441 | Returns: 442 | proposals (list[Instances]): list of N Instances. The i-th Instances 443 | stores post_nms_topk object proposals for image i, sorted by their 444 | objectness score in descending order. 445 | """ 446 | # The proposals are treated as fixed for approximate joint training with roi heads. 447 | # This approach ignores the derivative w.r.t. the proposal boxes’ coordinates that 448 | # are also network responses, so is approximate. 449 | pred_proposals = self._decode_proposals(anchors, pred_anchor_deltas) 450 | return find_top_rpn_proposals( 451 | pred_proposals, 452 | pred_objectness_logits, 453 | image_sizes, 454 | self.nms_thresh, 455 | self.pre_nms_topk[self.training], 456 | self.post_nms_topk[self.training], 457 | self.min_box_size, 458 | self.training, 459 | ) 460 | 461 | def _decode_proposals(self, anchors, pred_anchor_deltas: List[torch.Tensor]): 462 | """ 463 | Transform anchors into proposals by applying the predicted anchor deltas. 464 | 465 | Returns: 466 | proposals (list[Tensor]): A list of L tensors. 
Tensor i has shape 467 | (N, Hi*Wi*A, B) 468 | """ 469 | N = pred_anchor_deltas[0].shape[0] 470 | proposals = [] 471 | # For each feature map 472 | for anchors_i, pred_anchor_deltas_i in zip(anchors, pred_anchor_deltas): 473 | B = anchors_i.tensor.size(1) 474 | pred_anchor_deltas_i = pred_anchor_deltas_i.reshape(-1, B) 475 | # Expand anchors to shape (N*Hi*Wi*A, B) 476 | anchors_i = anchors_i.tensor.unsqueeze(0).expand(N, -1, -1).reshape(-1, B) 477 | proposals_i = self.box2box_transform.apply_deltas(pred_anchor_deltas_i, anchors_i) 478 | # Append feature map proposals with shape (N, Hi*Wi*A, B) 479 | proposals.append(proposals_i.view(N, -1, B)) 480 | return proposals 481 | -------------------------------------------------------------------------------- /fewx/solver/__init__.py: -------------------------------------------------------------------------------- 1 | from .build import build_lr_scheduler, build_optimizer 2 | 3 | __all__ = [k for k in globals().keys() if not k.startswith("_")] 4 | -------------------------------------------------------------------------------- /fewx/solver/build.py: -------------------------------------------------------------------------------- 1 | # Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved 2 | from enum import Enum 3 | from typing import Any, Callable, Dict, Iterable, List, Set, Type, Union 4 | import torch 5 | 6 | from detectron2.config import CfgNode 7 | 8 | from detectron2.solver.lr_scheduler import WarmupCosineLR, WarmupMultiStepLR 9 | 10 | _GradientClipperInput = Union[torch.Tensor, Iterable[torch.Tensor]] 11 | _GradientClipper = Callable[[_GradientClipperInput], None] 12 | 13 | 14 | class GradientClipType(Enum): 15 | VALUE = "value" 16 | NORM = "norm" 17 | 18 | 19 | def _create_gradient_clipper(cfg: CfgNode) -> _GradientClipper: 20 | """ 21 | Creates gradient clipping closure to clip by value or by norm, 22 | according to the provided config. 
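    The function receives the ``SOLVER.CLIP_GRADIENTS`` sub-config. A minimal sketch of the fields it reads (the values below are illustrative, not defaults):

        CLIP_TYPE: "norm"    # or "value"
        CLIP_VALUE: 1.0      # max norm (for "norm") or max absolute value (for "value")
        NORM_TYPE: 2.0       # p-norm degree, used only when CLIP_TYPE is "norm"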
23 | """ 24 | cfg = cfg.clone() 25 | 26 | def clip_grad_norm(p: _GradientClipperInput): 27 | torch.nn.utils.clip_grad_norm_(p, cfg.CLIP_VALUE, cfg.NORM_TYPE) 28 | 29 | def clip_grad_value(p: _GradientClipperInput): 30 | torch.nn.utils.clip_grad_value_(p, cfg.CLIP_VALUE) 31 | 32 | _GRADIENT_CLIP_TYPE_TO_CLIPPER = { 33 | GradientClipType.VALUE: clip_grad_value, 34 | GradientClipType.NORM: clip_grad_norm, 35 | } 36 | return _GRADIENT_CLIP_TYPE_TO_CLIPPER[GradientClipType(cfg.CLIP_TYPE)] 37 | 38 | 39 | def _generate_optimizer_class_with_gradient_clipping( 40 | optimizer_type: Type[torch.optim.Optimizer], gradient_clipper: _GradientClipper 41 | ) -> Type[torch.optim.Optimizer]: 42 | """ 43 | Dynamically creates a new type that inherits the type of a given instance 44 | and overrides the `step` method to add gradient clipping 45 | """ 46 | 47 | def optimizer_wgc_step(self, closure=None): 48 | for group in self.param_groups: 49 | for p in group["params"]: 50 | gradient_clipper(p) 51 | super(type(self), self).step(closure) 52 | 53 | OptimizerWithGradientClip = type( 54 | optimizer_type.__name__ + "WithGradientClip", 55 | (optimizer_type,), 56 | {"step": optimizer_wgc_step}, 57 | ) 58 | return OptimizerWithGradientClip 59 | 60 | 61 | def maybe_add_gradient_clipping( 62 | cfg: CfgNode, optimizer: torch.optim.Optimizer 63 | ) -> torch.optim.Optimizer: 64 | """ 65 | If gradient clipping is enabled through config options, wraps the existing 66 | optimizer instance of some type OptimizerType to become an instance 67 | of the new dynamically created class OptimizerTypeWithGradientClip 68 | that inherits OptimizerType and overrides the `step` method to 69 | include gradient clipping. 70 | 71 | Args: 72 | cfg: CfgNode 73 | configuration options 74 | optimizer: torch.optim.Optimizer 75 | existing optimizer instance 76 | 77 | Return: 78 | optimizer: torch.optim.Optimizer 79 | either the unmodified optimizer instance (if gradient clipping is 80 | disabled), or the same instance with adjusted __class__ to override 81 | the `step` method and include gradient clipping 82 | """ 83 | if not cfg.SOLVER.CLIP_GRADIENTS.ENABLED: 84 | return optimizer 85 | grad_clipper = _create_gradient_clipper(cfg.SOLVER.CLIP_GRADIENTS) 86 | OptimizerWithGradientClip = _generate_optimizer_class_with_gradient_clipping( 87 | type(optimizer), grad_clipper 88 | ) 89 | optimizer.__class__ = OptimizerWithGradientClip 90 | return optimizer 91 | 92 | 93 | def build_optimizer(cfg: CfgNode, model: torch.nn.Module) -> torch.optim.Optimizer: 94 | """ 95 | Build an optimizer from config. 
96 | """ 97 | norm_module_types = ( 98 | torch.nn.BatchNorm1d, 99 | torch.nn.BatchNorm2d, 100 | torch.nn.BatchNorm3d, 101 | torch.nn.SyncBatchNorm, 102 | # NaiveSyncBatchNorm inherits from BatchNorm2d 103 | torch.nn.GroupNorm, 104 | torch.nn.InstanceNorm1d, 105 | torch.nn.InstanceNorm2d, 106 | torch.nn.InstanceNorm3d, 107 | torch.nn.LayerNorm, 108 | torch.nn.LocalResponseNorm, 109 | ) 110 | params: List[Dict[str, Any]] = [] 111 | memo: Set[torch.nn.parameter.Parameter] = set() 112 | for module in model.modules(): 113 | for key, value in module.named_parameters(): #recurse=False): 114 | if not value.requires_grad: 115 | continue 116 | # Avoid duplicating parameters 117 | if value in memo: 118 | continue 119 | memo.add(value) 120 | lr = cfg.SOLVER.BASE_LR 121 | weight_decay = cfg.SOLVER.WEIGHT_DECAY 122 | if isinstance(module, norm_module_types): 123 | weight_decay = cfg.SOLVER.WEIGHT_DECAY_NORM 124 | elif "bias" in key: #key == "bias": 125 | # NOTE: unlike Detectron v1, we now default BIAS_LR_FACTOR to 1.0 126 | # and WEIGHT_DECAY_BIAS to WEIGHT_DECAY so that bias optimizer 127 | # hyperparameters are by default exactly the same as for regular 128 | # weights. 129 | lr = cfg.SOLVER.BASE_LR * cfg.SOLVER.BIAS_LR_FACTOR 130 | weight_decay = cfg.SOLVER.WEIGHT_DECAY_BIAS 131 | 132 | if 'box_predictor' in key: 133 | lr = cfg.SOLVER.BASE_LR * cfg.SOLVER.HEAD_LR_FACTOR 134 | params += [{"params": [value], "lr": lr, "weight_decay": weight_decay}] 135 | optimizer = torch.optim.SGD( 136 | params, cfg.SOLVER.BASE_LR, momentum=cfg.SOLVER.MOMENTUM, nesterov=cfg.SOLVER.NESTEROV 137 | ) 138 | optimizer = maybe_add_gradient_clipping(cfg, optimizer) 139 | return optimizer 140 | 141 | 142 | def build_lr_scheduler( 143 | cfg: CfgNode, optimizer: torch.optim.Optimizer 144 | ) -> torch.optim.lr_scheduler._LRScheduler: 145 | """ 146 | Build a LR scheduler from config. 
147 | """ 148 | name = cfg.SOLVER.LR_SCHEDULER_NAME 149 | if name == "WarmupMultiStepLR": 150 | return WarmupMultiStepLR( 151 | optimizer, 152 | cfg.SOLVER.STEPS, 153 | cfg.SOLVER.GAMMA, 154 | warmup_factor=cfg.SOLVER.WARMUP_FACTOR, 155 | warmup_iters=cfg.SOLVER.WARMUP_ITERS, 156 | warmup_method=cfg.SOLVER.WARMUP_METHOD, 157 | ) 158 | elif name == "WarmupCosineLR": 159 | return WarmupCosineLR( 160 | optimizer, 161 | cfg.SOLVER.MAX_ITER, 162 | warmup_factor=cfg.SOLVER.WARMUP_FACTOR, 163 | warmup_iters=cfg.SOLVER.WARMUP_ITERS, 164 | warmup_method=cfg.SOLVER.WARMUP_METHOD, 165 | ) 166 | else: 167 | raise ValueError("Unknown LR scheduler: {}".format(name)) 168 | -------------------------------------------------------------------------------- /fewx/utils/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/fanq15/FewX/6347165aa24ba20d03ef06321f82bd818e3a8021/fewx/utils/__init__.py -------------------------------------------------------------------------------- /fewx/utils/comm.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn.functional as F 3 | import torch.distributed as dist 4 | 5 | from detectron2.utils.comm import get_world_size 6 | 7 | 8 | def reduce_sum(tensor): 9 | world_size = get_world_size() 10 | if world_size < 2: 11 | return tensor 12 | tensor = tensor.clone() 13 | dist.all_reduce(tensor, op=dist.ReduceOp.SUM) 14 | return tensor 15 | 16 | 17 | def aligned_bilinear(tensor, factor): 18 | assert tensor.dim() == 4 19 | assert factor >= 1 20 | assert int(factor) == factor 21 | 22 | if factor == 1: 23 | return tensor 24 | 25 | h, w = tensor.size()[2:] 26 | tensor = F.pad(tensor, pad=(0, 1, 0, 1), mode="replicate") 27 | oh = factor * h + 1 28 | ow = factor * w + 1 29 | tensor = F.interpolate( 30 | tensor, size=(oh, ow), 31 | mode='bilinear', 32 | align_corners=True 33 | ) 34 | tensor = F.pad( 35 | tensor, pad=(factor // 2, 0, factor // 2, 0), 36 | mode="replicate" 37 | ) 38 | 39 | return tensor[:, :, :oh - 1, :ow - 1] 40 | 41 | 42 | def compute_locations(h, w, stride, device): 43 | shifts_x = torch.arange( 44 | 0, w * stride, step=stride, 45 | dtype=torch.float32, device=device 46 | ) 47 | shifts_y = torch.arange( 48 | 0, h * stride, step=stride, 49 | dtype=torch.float32, device=device 50 | ) 51 | shift_y, shift_x = torch.meshgrid(shifts_y, shifts_x) 52 | shift_x = shift_x.reshape(-1) 53 | shift_y = shift_y.reshape(-1) 54 | locations = torch.stack((shift_x, shift_y), dim=1) + stride // 2 55 | return locations 56 | -------------------------------------------------------------------------------- /fewx/utils/measures.py: -------------------------------------------------------------------------------- 1 | # coding: utf-8 2 | # Adapted from https://github.com/ShichenLiu/CondenseNet/blob/master/utils.py 3 | from __future__ import absolute_import 4 | from __future__ import unicode_literals 5 | from __future__ import print_function 6 | from __future__ import division 7 | 8 | import operator 9 | 10 | from functools import reduce 11 | 12 | 13 | def get_num_gen(gen): 14 | return sum(1 for x in gen) 15 | 16 | 17 | def is_pruned(layer): 18 | try: 19 | layer.mask 20 | return True 21 | except AttributeError: 22 | return False 23 | 24 | 25 | def is_leaf(model): 26 | return get_num_gen(model.children()) == 0 27 | 28 | 29 | def get_layer_info(layer): 30 | layer_str = str(layer) 31 | type_name = layer_str[:layer_str.find('(')].strip() 32 | return type_name 33 | 
34 | 35 | def get_layer_param(model): 36 | return sum([reduce(operator.mul, i.size(), 1) for i in model.parameters()]) 37 | 38 | 39 | ### The input batch size should be 1 to call this function 40 | def measure_layer(layer, *args): 41 | global count_ops, count_params 42 | 43 | for x in args: 44 | delta_ops = 0 45 | delta_params = 0 46 | multi_add = 1 47 | type_name = get_layer_info(layer) 48 | 49 | ### ops_conv 50 | if type_name in ['Conv2d']: 51 | out_h = int((x.size()[2] + 2 * layer.padding[0] / layer.dilation[0] - layer.kernel_size[0]) / 52 | layer.stride[0] + 1) 53 | out_w = int((x.size()[3] + 2 * layer.padding[1] / layer.dilation[1] - layer.kernel_size[1]) / 54 | layer.stride[1] + 1) 55 | delta_ops = layer.in_channels * layer.out_channels * layer.kernel_size[0] * layer.kernel_size[1] * out_h * out_w / layer.groups * multi_add 56 | delta_params = get_layer_param(layer) 57 | 58 | elif type_name in ['ConvTranspose2d']: 59 | _, _, in_h, in_w = x.size() 60 | out_h = int((in_h-1)*layer.stride[0] - 2 * layer.padding[0] + layer.kernel_size[0] + layer.output_padding[0]) 61 | out_w = int((in_w-1)*layer.stride[1] - 2 * layer.padding[1] + layer.kernel_size[1] + layer.output_padding[1]) 62 | delta_ops = layer.in_channels * layer.out_channels * layer.kernel_size[0] * \ 63 | layer.kernel_size[1] * out_h * out_w / layer.groups * multi_add 64 | delta_params = get_layer_param(layer) 65 | 66 | ### ops_learned_conv 67 | elif type_name in ['LearnedGroupConv']: 68 | measure_layer(layer.relu, x) 69 | measure_layer(layer.norm, x) 70 | conv = layer.conv 71 | out_h = int((x.size()[2] + 2 * conv.padding[0] - conv.kernel_size[0]) / 72 | conv.stride[0] + 1) 73 | out_w = int((x.size()[3] + 2 * conv.padding[1] - conv.kernel_size[1]) / 74 | conv.stride[1] + 1) 75 | delta_ops = conv.in_channels * conv.out_channels * conv.kernel_size[0] * conv.kernel_size[1] * out_h * out_w / layer.condense_factor * multi_add 76 | delta_params = get_layer_param(conv) / layer.condense_factor 77 | 78 | ### ops_nonlinearity 79 | elif type_name in ['ReLU', 'ReLU6']: 80 | delta_ops = x.numel() 81 | delta_params = get_layer_param(layer) 82 | 83 | ### ops_pooling 84 | elif type_name in ['AvgPool2d', 'MaxPool2d']: 85 | in_w = x.size()[2] 86 | kernel_ops = layer.kernel_size * layer.kernel_size 87 | out_w = int((in_w + 2 * layer.padding - layer.kernel_size) / layer.stride + 1) 88 | out_h = int((in_w + 2 * layer.padding - layer.kernel_size) / layer.stride + 1) 89 | delta_ops = x.size()[0] * x.size()[1] * out_w * out_h * kernel_ops 90 | delta_params = get_layer_param(layer) 91 | 92 | elif type_name in ['LastLevelMaxPool']: 93 | pass 94 | 95 | elif type_name in ['AdaptiveAvgPool2d']: 96 | delta_ops = x.size()[0] * x.size()[1] * x.size()[2] * x.size()[3] 97 | delta_params = get_layer_param(layer) 98 | 99 | elif type_name in ['ZeroPad2d', 'RetinaNetPostProcessor']: 100 | pass 101 | #delta_ops = x.size()[0] * x.size()[1] * x.size()[2] * x.size()[3] 102 | #delta_params = get_layer_param(layer) 103 | 104 | ### ops_linear 105 | elif type_name in ['Linear']: 106 | weight_ops = layer.weight.numel() * multi_add 107 | bias_ops = layer.bias.numel() 108 | delta_ops = x.size()[0] * (weight_ops + bias_ops) 109 | delta_params = get_layer_param(layer) 110 | 111 | ### ops_nothing 112 | elif type_name in ['BatchNorm2d', 'Dropout2d', 'DropChannel', 'Dropout', 'FrozenBatchNorm2d', 'GroupNorm']: 113 | delta_params = get_layer_param(layer) 114 | 115 | elif type_name in ['SumTwo']: 116 | delta_ops = x.numel() 117 | 118 | elif type_name in ['AggregateCell']: 119 | if not 
layer.pre_transform: 120 | delta_ops = 2 * x.numel() # twice for each input 121 | else: 122 | measure_layer(layer.branch_1, x) 123 | measure_layer(layer.branch_2, x) 124 | delta_params = get_layer_param(layer) 125 | 126 | elif type_name in ['Identity', 'Zero']: 127 | pass 128 | 129 | elif type_name in ['Scale']: 130 | delta_params = get_layer_param(layer) 131 | delta_ops = x.numel() 132 | 133 | elif type_name in ['FCOSPostProcessor', 'RPNPostProcessor', 'KeypointPostProcessor', 134 | 'ROIAlign', 'PostProcessor', 'KeypointRCNNPredictor', 135 | 'NaiveSyncBatchNorm', 'Upsample', 'Sequential']: 136 | pass 137 | 138 | elif type_name in ['DeformConv']: 139 | # don't count bilinear 140 | offset_conv = list(layer.parameters())[0] 141 | delta_ops = reduce(operator.mul, offset_conv.size(), x.size()[2] * x.size()[3]) 142 | out_h = int((x.size()[2] + 2 * layer.padding[0] / layer.dilation[0] 143 | - layer.kernel_size[0]) / layer.stride[0] + 1) 144 | out_w = int((x.size()[3] + 2 * layer.padding[1] / layer.dilation[1] 145 | - layer.kernel_size[1]) / layer.stride[1] + 1) 146 | delta_ops += layer.in_channels * layer.out_channels * layer.kernel_size[0] * layer.kernel_size[1] * out_h * out_w / layer.groups * multi_add 147 | delta_params = get_layer_param(layer) 148 | 149 | ### unknown layer type 150 | else: 151 | raise TypeError('unknown layer type: %s' % type_name) 152 | 153 | count_ops += delta_ops 154 | count_params += delta_params 155 | return 156 | 157 | 158 | def measure_model(model, x): 159 | global count_ops, count_params 160 | count_ops = 0 161 | count_params = 0 162 | 163 | def should_measure(x): 164 | return is_leaf(x) or is_pruned(x) 165 | 166 | def modify_forward(model): 167 | for child in model.children(): 168 | if should_measure(child): 169 | def new_forward(m): 170 | def lambda_forward(*args): 171 | measure_layer(m, *args) 172 | return m.old_forward(*args) 173 | return lambda_forward 174 | child.old_forward = child.forward 175 | child.forward = new_forward(child) 176 | else: 177 | modify_forward(child) 178 | 179 | def restore_forward(model): 180 | for child in model.children(): 181 | # leaf node 182 | if is_leaf(child) and hasattr(child, 'old_forward'): 183 | child.forward = child.old_forward 184 | child.old_forward = None 185 | else: 186 | restore_forward(child) 187 | 188 | modify_forward(model) 189 | out = model.forward(x) 190 | restore_forward(model) 191 | 192 | return out, count_ops, count_params 193 | -------------------------------------------------------------------------------- /fsod_train_net.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | # Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved 3 | 4 | """ 5 | FSOD Training Script. 6 | 7 | This script is a simplified version of the training script in detectron2/tools.
8 | """ 9 | 10 | import os 11 | 12 | from detectron2.checkpoint import DetectionCheckpointer 13 | from detectron2.config import get_cfg 14 | from detectron2.engine import DefaultTrainer, default_argument_parser, default_setup, launch 15 | #from detectron2.evaluation import COCOEvaluator 16 | #from detectron2.data import build_detection_train_loader 17 | from detectron2.data import build_batch_data_loader 18 | 19 | from fewx.config import get_cfg 20 | from fewx.data.dataset_mapper import DatasetMapperWithSupport 21 | from fewx.data.build import build_detection_train_loader, build_detection_test_loader 22 | from fewx.solver import build_optimizer 23 | from fewx.evaluation import COCOEvaluator 24 | 25 | import bisect 26 | import copy 27 | import itertools 28 | import logging 29 | import numpy as np 30 | import operator 31 | import pickle 32 | import torch.utils.data 33 | 34 | import detectron2.utils.comm as comm 35 | from detectron2.utils.logger import setup_logger 36 | 37 | class Trainer(DefaultTrainer): 38 | 39 | @classmethod 40 | def build_train_loader(cls, cfg): 41 | """ 42 | Returns: 43 | iterable 44 | It calls :func:`detectron2.data.build_detection_train_loader` with a customized 45 | DatasetMapper, which adds categorical labels as a semantic mask. 46 | """ 47 | mapper = DatasetMapperWithSupport(cfg) 48 | return build_detection_train_loader(cfg, mapper) 49 | 50 | @classmethod 51 | def build_test_loader(cls, cfg, dataset_name): 52 | """ 53 | Returns: 54 | iterable 55 | It now calls :func:`detectron2.data.build_detection_test_loader`. 56 | Overwrite it if you'd like a different data loader. 57 | """ 58 | return build_detection_test_loader(cfg, dataset_name) 59 | 60 | @classmethod 61 | def build_optimizer(cls, cfg, model): 62 | """ 63 | Returns: 64 | torch.optim.Optimizer: 65 | It now calls :func:`detectron2.solver.build_optimizer`. 66 | Overwrite it if you'd like a different optimizer. 67 | """ 68 | return build_optimizer(cfg, model) 69 | 70 | @classmethod 71 | def build_evaluator(cls, cfg, dataset_name, output_folder=None): 72 | if output_folder is None: 73 | output_folder = os.path.join(cfg.OUTPUT_DIR, "inference") 74 | return COCOEvaluator(dataset_name, cfg, True, output_folder) 75 | 76 | 77 | def setup(args): 78 | """ 79 | Create configs and perform basic setups. 80 | """ 81 | cfg = get_cfg() 82 | cfg.merge_from_file(args.config_file) 83 | cfg.merge_from_list(args.opts) 84 | cfg.freeze() 85 | default_setup(cfg, args) 86 | 87 | rank = comm.get_rank() 88 | setup_logger(cfg.OUTPUT_DIR, distributed_rank=rank, name="fewx") 89 | 90 | return cfg 91 | 92 | 93 | def main(args): 94 | cfg = setup(args) 95 | 96 | if args.eval_only: 97 | model = Trainer.build_model(cfg) 98 | DetectionCheckpointer(model, save_dir=cfg.OUTPUT_DIR).resume_or_load( 99 | cfg.MODEL.WEIGHTS, resume=args.resume 100 | ) 101 | res = Trainer.test(cfg, model) 102 | return res 103 | 104 | trainer = Trainer(cfg) 105 | trainer.resume_or_load(resume=args.resume) 106 | return trainer.train() 107 | 108 | 109 | if __name__ == "__main__": 110 | args = default_argument_parser().parse_args() 111 | print("Command Line Args:", args) 112 | launch( 113 | main, 114 | args.num_gpus, 115 | num_machines=args.num_machines, 116 | machine_rank=args.machine_rank, 117 | dist_url=args.dist_url, 118 | args=(args,), 119 | ) 120 | -------------------------------------------------------------------------------- /log/metric.txt: -------------------------------------------------------------------------------- 1 | Running per image evaluation... 
2 | Evaluate annotation type *bbox* 3 | COCOeval_opt.evaluate() finished in 7.92 seconds. 4 | Accumulating evaluation results... 5 | COCOeval_opt.accumulate() finished in 0.96 seconds. 6 | Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.030 7 | Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.056 8 | Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.029 9 | Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.007 10 | Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.031 11 | Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.052 12 | Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.047 13 | Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.066 14 | Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.066 15 | Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.009 16 | Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.059 17 | Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.114 18 | [08/08 18:10:48 fewx.evaluation.coco_evaluation]: Evaluation results for bbox: 19 | | AP | AP50 | AP75 | APs | APm | APl | 20 | |:-----:|:------:|:------:|:-----:|:-----:|:-----:| 21 | | 2.989 | 5.592 | 2.948 | 0.724 | 3.057 | 5.165 | 22 | [08/08 18:10:48 fewx.evaluation.coco_evaluation]: Evaluation results for VOC 20 categories =======> AP : 11.95 23 | [08/08 18:10:48 fewx.evaluation.coco_evaluation]: Evaluation results for VOC 20 categories =======> AP50: 22.37 24 | [08/08 18:10:48 fewx.evaluation.coco_evaluation]: Evaluation results for VOC 20 categories =======> AP75: 11.79 25 | [08/08 18:10:48 fewx.evaluation.coco_evaluation]: Evaluation results for VOC 20 categories =======> APs : 2.89 26 | [08/08 18:10:48 fewx.evaluation.coco_evaluation]: Evaluation results for VOC 20 categories =======> APm : 12.23 27 | [08/08 18:10:48 fewx.evaluation.coco_evaluation]: Evaluation results for VOC 20 categories =======> APl : 20.66 28 | [08/08 18:10:48 fewx.evaluation.coco_evaluation]: Evaluation results for Non VOC 60 categories =======> AP : 0.00 29 | [08/08 18:10:48 fewx.evaluation.coco_evaluation]: Evaluation results for Non VOC 60 categories =======> AP50: 0.00 30 | [08/08 18:10:48 fewx.evaluation.coco_evaluation]: Evaluation results for Non VOC 60 categories =======> AP75: 0.00 31 | [08/08 18:10:48 fewx.evaluation.coco_evaluation]: Evaluation results for Non VOC 60 categories =======> APs : 0.00 32 | [08/08 18:10:48 fewx.evaluation.coco_evaluation]: Evaluation results for Non VOC 60 categories =======> APm : 0.00 33 | [08/08 18:10:48 fewx.evaluation.coco_evaluation]: Evaluation results for Non VOC 60 categories =======> APl : 0.00 34 | [08/08 18:10:48 fewx.evaluation.coco_evaluation]: Per-category bbox AP: 35 | | category | AP | category | AP | category | AP | 36 | |:--------------|:-------|:-------------|:-------|:---------------|:-------| 37 | | person | 10.446 | bicycle | 5.604 | car | 11.692 | 38 | | motorcycle | 10.597 | airplane | 23.452 | bus | 20.706 | 39 | | train | 16.856 | truck | 0.000 | boat | 1.730 | 40 | | traffic light | 0.000 | fire hydrant | 0.000 | stop sign | 0.000 | 41 | | parking meter | 0.000 | bench | 0.000 | bird | 8.258 | 42 | | cat | 25.047 | dog | 17.904 | horse | 11.009 | 43 | | sheep | 6.285 | cow | 9.031 | elephant | 0.000 | 44 | | bear | 0.000 | zebra | 0.000 | giraffe | 0.000 | 45 | | backpack | 0.000 | umbrella | 0.000 | handbag | 0.000 | 46 
| | tie | 0.000 | suitcase | 0.000 | frisbee | 0.000 | 47 | | skis | 0.000 | snowboard | 0.000 | sports ball | 0.000 | 48 | | kite | 0.000 | baseball bat | 0.000 | baseball glove | 0.000 | 49 | | skateboard | 0.000 | surfboard | 0.000 | tennis racket | 0.000 | 50 | | bottle | 7.444 | wine glass | 0.000 | cup | 0.000 | 51 | | fork | 0.000 | knife | 0.000 | spoon | 0.000 | 52 | | bowl | 0.000 | banana | 0.000 | apple | 0.000 | 53 | | sandwich | 0.000 | orange | 0.000 | broccoli | 0.000 | 54 | | carrot | 0.000 | hot dog | 0.000 | pizza | 0.000 | 55 | | donut | 0.000 | cake | 0.000 | chair | 2.958 | 56 | | couch | 11.579 | potted plant | 4.004 | bed | 0.000 | 57 | | dining table | 8.955 | toilet | 0.000 | tv | 25.533 | 58 | | laptop | 0.000 | mouse | 0.000 | remote | 0.000 | 59 | | keyboard | 0.000 | cell phone | 0.000 | microwave | 0.000 | 60 | | oven | 0.000 | toaster | 0.000 | sink | 0.000 | 61 | | refrigerator | 0.000 | book | 0.000 | clock | 0.000 | 62 | | vase | 0.000 | scissors | 0.000 | teddy bear | 0.000 | 63 | | hair drier | 0.000 | toothbrush | 0.000 | | | 64 | [08/08 18:10:48 d2.engine.defaults]: Evaluation results for coco_2017_val in csv format: 65 | [08/08 18:10:48 d2.evaluation.testing]: copypaste: Task: bbox 66 | [08/08 18:10:48 d2.evaluation.testing]: copypaste: AP,AP50,AP75,APs,APm,APl 67 | [08/08 18:10:48 d2.evaluation.testing]: copypaste: 2.9886,5.5922,2.9480,0.7236,3.0575,5.1649 68 | --------------------------------------------------------------------------------
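The `log/metric.txt` report above is the kind of output an evaluation-only run of `fsod_train_net.py` produces. Because the script is driven by detectron2's `default_argument_parser`, a command along the lines of `python3 fsod_train_net.py --num-gpus 4 --config-file <your FSOD config> --eval-only MODEL.WEIGHTS <path/to/checkpoint.pth>` (GPU count, config path, and weight path are placeholders, not values taken from this repository) should yield a similar summary, with the per-category table and the VOC / non-VOC breakdown produced by `fewx.evaluation.COCOEvaluator`.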
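`fewx/utils/measures.py` (shown above) estimates op and parameter counts by temporarily monkey-patching the `forward()` of every leaf module, letting `measure_layer()` accumulate into the module-level `count_ops` / `count_params` globals, and then restoring the original `forward()` methods. The repository itself does not show a caller for it, so the snippet below is only a minimal usage sketch: it assumes the module is importable as `fewx.utils.measures`, and the toy model and input shape are illustrative rather than taken from the codebase.

```
# Minimal usage sketch (assumed import path; toy model and input are illustrative).
import torch
import torch.nn as nn

from fewx.utils.measures import measure_model  # assumed location of measures.py

# Every leaf module here is a type that measure_layer() handles
# (Conv2d, BatchNorm2d, ReLU, AvgPool2d), so no TypeError is raised.
model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),
    nn.BatchNorm2d(16),
    nn.ReLU(),
    nn.AvgPool2d(kernel_size=2, stride=2, padding=0),
)
model.eval()

# measure_layer() assumes the input batch size is 1.
dummy = torch.randn(1, 3, 224, 224)

out, ops, params = measure_model(model, dummy)
print('estimated ops: %.2fM, params: %.2fM' % (ops / 1e6, params / 1e6))
```

Note that `measure_layer()` raises a `TypeError` for any module type it does not recognize, so measuring the full FSOD network would likely require extending the handled type lists.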