├── .gitignore
├── README.md
├── configs
│   ├── nwpu
│   │   ├── faster_rcnn_robust_Nb=0_Ns=50.yaml
│   │   ├── faster_rcnn_robust_Nb=0_Ns=ex.yaml
│   │   ├── faster_rcnn_robust_Nb=20_Ns=0.yaml
│   │   ├── faster_rcnn_robust_Nb=20_Ns=50.yaml
│   │   ├── faster_rcnn_robust_Nb=20_Ns=ex.yaml
│   │   ├── faster_rcnn_robust_Nb=40_Ns=0.yaml
│   │   ├── faster_rcnn_robust_Nb=40_Ns=50.yaml
│   │   ├── faster_rcnn_robust_Nb=40_Ns=ex.yaml
│   │   ├── faster_rcnn_standard_Nb=0_Ns=50.yaml
│   │   ├── faster_rcnn_standard_Nb=0_Ns=ex.yaml
│   │   ├── faster_rcnn_standard_Nb=20_Ns=0.yaml
│   │   ├── faster_rcnn_standard_Nb=20_Ns=50.yaml
│   │   ├── faster_rcnn_standard_Nb=20_Ns=ex.yaml
│   │   ├── faster_rcnn_standard_Nb=40_Ns=0.yaml
│   │   ├── faster_rcnn_standard_Nb=40_Ns=50.yaml
│   │   ├── faster_rcnn_standard_Nb=40_Ns=ex.yaml
│   │   └── faster_rcnn_standard_clean.yaml
│   └── voc
│       ├── faster_rcnn_robust_Nb=0_Ns=50.yaml
│       ├── faster_rcnn_robust_Nb=0_Ns=ex.yaml
│       ├── faster_rcnn_robust_Nb=20_Ns=0.yaml
│       ├── faster_rcnn_robust_Nb=20_Ns=50.yaml
│       ├── faster_rcnn_robust_Nb=20_Ns=ex.yaml
│       ├── faster_rcnn_robust_Nb=40_Ns=0.yaml
│       ├── faster_rcnn_robust_Nb=40_Ns=50.yaml
│       ├── faster_rcnn_robust_Nb=40_Ns=ex.yaml
│       ├── faster_rcnn_standard_Nb=0_Ns=50.yaml
│       ├── faster_rcnn_standard_Nb=0_Ns=ex.yaml
│       ├── faster_rcnn_standard_Nb=20_Ns=0.yaml
│       ├── faster_rcnn_standard_Nb=20_Ns=50.yaml
│       ├── faster_rcnn_standard_Nb=20_Ns=ex.yaml
│       ├── faster_rcnn_standard_Nb=40_Ns=0.yaml
│       ├── faster_rcnn_standard_Nb=40_Ns=50.yaml
│       ├── faster_rcnn_standard_Nb=40_Ns=ex.yaml
│       └── faster_rcnn_standard_clean.yaml
├── datasets
│   ├── NWPU
│   │   └── Splits
│   │       ├── test_set_negative.txt
│   │       ├── test_set_positive.txt
│   │       ├── train_set_negative.txt
│   │       ├── train_set_positive.txt
│   │       ├── val_set_negative.txt
│   │       └── val_set_positive.txt
│   └── VOC2007
│       └── ImageSets
│           └── Main
│               ├── custom_train.txt
│               ├── custom_val.txt
│               ├── test.txt
│               ├── train.txt
│               ├── trainval.txt
│               └── val.txt
├── environment.yaml
├── requirements.txt
├── run.py
├── src
│   ├── config
│   │   └── utils.py
│   ├── correction
│   │   ├── augmentation.py
│   │   └── correction_module.py
│   ├── datasets
│   │   ├── build.py
│   │   ├── noise.py
│   │   ├── nwpu.py
│   │   └── register.py
│   ├── engine
│   │   ├── train_standard.py
│   │   └── train_teacher_student.py
│   ├── evaluation
│   │   ├── build.py
│   │   └── visualization.py
│   ├── models
│   │   ├── adet
│   │   │   ├── __init__.py
│   │   │   ├── layers
│   │   │   │   ├── __init__.py
│   │   │   │   ├── deform_conv.py
│   │   │   │   ├── iou_loss.py
│   │   │   │   ├── ml_nms.py
│   │   │   │   └── naive_group_norm.py
│   │   │   ├── modeling
│   │   │   │   ├── backbone
│   │   │   │   │   ├── __init__.py
│   │   │   │   │   ├── fpn.py
│   │   │   │   │   ├── lpf.py
│   │   │   │   │   ├── mobilenet.py
│   │   │   │   │   ├── resnet_interval.py
│   │   │   │   │   └── resnet_lpf.py
│   │   │   │   ├── fcos
│   │   │   │   │   ├── __init__.py
│   │   │   │   │   ├── fcos.py
│   │   │   │   │   └── fcos_outputs.py
│   │   │   │   └── one_stage_detector.py
│   │   │   └── utils
│   │   │       └── comm.py
│   │   ├── build.py
│   │   ├── faster_rcnn.py
│   │   ├── fcos.py
│   │   └── retinanet.py
│   └── utils
│       ├── boxutils.py
│       ├── checkpoint.py
│       └── misc.py
└── test.py
--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
1 | *.pyc
2 | *.pt
3 | *.pth
4 | .vscode/
5 | runs/
6 | pretrained_models/
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Robust Object Detection
2 | 
3 | This repository contains the source code for the paper "Robust Object Detection in Remote Sensing Imagery with Noisy
4 | and Sparse Geo-Annotations".
5 | 
6 | > Recently, the availability of remote sensing imagery from aerial vehicles and satellites constantly improved. For an automated interpretation of such data, deep-learning-based object detectors achieve state-of-the-art performance. However, established object detectors require complete, precise, and correct bounding box annotations for training. In order to create the necessary training annotations for object detectors, imagery can be georeferenced and combined with data from other sources, such as points of interest localized by GPS sensors.
Unfortunately, this combination often leads to poor object localization and missing annotations. Therefore, training object detectors with such data often results in insufficient detection performance. In this paper, we present a novel approach for training object detectors with extremely noisy and incomplete annotations. Our method is based on a teacher-student learning framework and a correction module accounting for imprecise and missing annotations. Thus, our method is easy to use and can be combined with arbitrary object detectors. We demonstrate that our approach improves standard detectors by 37.1% AP_50 on a noisy real-world remote-sensing dataset. Furthermore, our method achieves great performance gains on two datasets with synthetic noise. Code is available at https://github.com/mxbh/robust_object_detection.
7 | 
8 | An extended version of the paper with more detailed explanations can be found [here](http://arxiv.org/abs/2210.12989).
9 | 
10 | ## Usage
11 | 1. Download the desired datasets and place them into `./datasets`.
12 | 2. For NWPU and Pascal VOC2007, place the provided split files under the corresponding dataset folder.
13 | ```
14 | datasets/
15 |   NWPU/
16 |     ground truth/
17 |       ...
18 |     negative image set/
19 |       ...
20 |     positive image set/
21 |       ...
22 |     Splits/
23 |       test_set_negative.txt
24 |       test_set_positive.txt
25 |       train_set_negative.txt
26 |       train_set_positive.txt
27 |       val_set_negative.txt
28 |       val_set_positive.txt
29 |   VOC2007/
30 |     ImageSets/
31 |       Main/
32 |         custom_train.txt
33 |         custom_val.txt
34 |         ...
35 |       ...
36 |     ...
37 |   VOC2012/
38 |     ...
39 | 
40 | ```
41 | 3. Install the requirements; in particular, we used `torch==1.9.1`, `torchvision==0.10.1`, and `detectron2==0.5`.
42 | A superset of the required packages is listed in `./requirements.txt` and `environment.yaml`.
43 | 4. Download [pretrained backbone weights](https://dl.fbaipublicfiles.com/detectron2/ImageNetPretrained/MSRA/R-50.pkl) (other backbones can be found [here](https://github.com/facebookresearch/detectron2/blob/main/MODEL_ZOO.md#imagenet-pretrained-models)) into `./pretrained_backbones`. Make sure the files are named as expected in `./src/models/build.py`.
44 | 5. Optional: Modify the config files provided in `./configs`.
45 | 6. Run the training scripts, e.g.
46 | ```
47 | python run.py \
48 |     --method=standard \
49 |     --config=./configs/voc/faster_rcnn_standard_Nb=40_Ns=0.yaml
50 | 
51 | python run.py \
52 |     --method=robust \
53 |     --config=./configs/voc/faster_rcnn_robust_Nb=40_Ns=0.yaml
54 | 
55 | ```
56 | Note: before conducting a run with robust training, first pretrain the network with standard training to obtain a better initialization.
57 | 7. To assess the performance on the test set, run
58 | ```
59 | python test.py --run=./runs/voc_Nb=40_Ns=0_faster_rcnn_robust
60 | ```
61 | 
62 | ## Citation
63 | If you use this repository in your research, please cite
64 | ```
65 | @article{bernhard2022robust_obj_det,
66 |   title={Robust Object Detection in Remote Sensing Imagery with Noisy and Sparse Geo-Annotations (Full Version)},
67 |   author={Bernhard, Maximilian and Schubert, Matthias},
68 |   journal={arXiv preprint arXiv:2210.12989},
69 |   year={2022}
70 | }
71 | ```
--------------------------------------------------------------------------------
/configs/nwpu/faster_rcnn_standard_Nb=0_Ns=50.yaml:
--------------------------------------------------------------------------------
1 | CORRECTION: 2 | AUG: [] 3 | BOXES: 4 | DISTANCE_LIMIT: null 5 | LOWER_THRESH: null 6 | SOFTMAX_TEMP: null 7 | TYPE: null 8 | DETECTIONS_PER_IMAGE_TRAIN: null 9 | LABELS: 10 | MINING_THRESH: null 11 | TYPE: null 12 | NMS_THRESH_TRAIN: null 13 | SCORE_THRESH_TRAIN: null 14 | WARMUP: null 15 | CUDNN_BENCHMARK: false 16 | DATALOADER: 17 | ASPECT_RATIO_GROUPING: true 18 | FILTER_EMPTY_ANNOTATIONS: 
false 19 | NUM_WORKERS: 1 20 | REPEAT_THRESHOLD: 0.0 21 | SAMPLER_TRAIN: TrainingSampler 22 | DATASETS: 23 | NOISE: 24 | DROP_LABELS: 25 | P: 0.5 26 | UNIFORM_BBOX_NOISE: 27 | P: 0.0 28 | PRECOMPUTED_PROPOSAL_TOPK_TEST: 1000 29 | PRECOMPUTED_PROPOSAL_TOPK_TRAIN: 2000 30 | PROPOSAL_FILES_TEST: [] 31 | PROPOSAL_FILES_TRAIN: [] 32 | TEST: 33 | - nwpu_test 34 | TRAIN: 35 | - noisy_nwpu_train 36 | VAL: 37 | - nwpu_val 38 | GLOBAL: 39 | HACK: 1.0 40 | INPUT: 41 | CROP: 42 | ENABLED: false 43 | SIZE: 44 | - 0.9 45 | - 0.9 46 | TYPE: relative_range 47 | FORMAT: BGR 48 | MASK_FORMAT: polygon 49 | MAX_SIZE_TEST: 1333 50 | MAX_SIZE_TRAIN: 1333 51 | MIN_SIZE_TEST: 800 52 | MIN_SIZE_TRAIN: 53 | - 480 54 | - 512 55 | - 544 56 | - 576 57 | - 608 58 | - 640 59 | - 672 60 | - 704 61 | - 736 62 | - 768 63 | - 800 64 | MIN_SIZE_TRAIN_SAMPLING: choice 65 | RANDOM_FLIP: horizontal&vertical 66 | MODEL: 67 | ANCHOR_GENERATOR: 68 | ANGLES: 69 | - - -90 70 | - 0 71 | - 90 72 | ASPECT_RATIOS: 73 | - - 0.5 74 | - 1.0 75 | - 2.0 76 | NAME: DefaultAnchorGenerator 77 | OFFSET: 0.0 78 | SIZES: 79 | - - 32 80 | - - 64 81 | - - 128 82 | - - 256 83 | - - 512 84 | BACKBONE: 85 | ANTI_ALIAS: null 86 | FREEZE_AT: 2 87 | NAME: build_resnet_fpn_backbone 88 | DEVICE: cuda 89 | FCOS: 90 | BOX_QUALITY: null 91 | CENTER_SAMPLE: null 92 | FPN_STRIDES: null 93 | INFERENCE_TH_TEST: null 94 | INFERENCE_TH_TRAIN: null 95 | IN_FEATURES: null 96 | LOC_LOSS_TYPE: null 97 | LOSS_ALPHA: null 98 | LOSS_GAMMA: null 99 | LOSS_NORMALIZER_CLS: null 100 | LOSS_WEIGHT_CLS: null 101 | NMS_TH: null 102 | NORM: null 103 | NUM_BOX_CONVS: null 104 | NUM_CLASSES: 10 105 | NUM_CLS_CONVS: null 106 | NUM_SHARE_CONVS: null 107 | POST_NMS_TOPK_TEST: null 108 | POST_NMS_TOPK_TRAIN: null 109 | POS_RADIUS: null 110 | PRE_NMS_TOPK_TEST: null 111 | PRE_NMS_TOPK_TRAIN: null 112 | PRIOR_PROB: null 113 | SIZES_OF_INTEREST: null 114 | THRESH_WITH_CTR: null 115 | TOP_LEVELS: null 116 | USE_DEFORMABLE: null 117 | USE_RELU: null 118 | USE_SCALE: 
null 119 | YIELD_BOX_FEATURES: null 120 | YIELD_PROPOSAL: null 121 | FPN: 122 | FUSE_TYPE: sum 123 | IN_FEATURES: 124 | - res2 125 | - res3 126 | - res4 127 | - res5 128 | NORM: '' 129 | OUT_CHANNELS: 256 130 | KEYPOINT_ON: false 131 | LOAD_PROPOSALS: false 132 | MASK_ON: false 133 | META_ARCHITECTURE: GeneralizedRCNN 134 | MOBILENET: null 135 | PIXEL_MEAN: 136 | - 103.53 137 | - 116.28 138 | - 123.675 139 | PIXEL_STD: 140 | - 1.0 141 | - 1.0 142 | - 1.0 143 | PROPOSAL_GENERATOR: 144 | MIN_SIZE: 0 145 | NAME: RPN 146 | RESNETS: 147 | DEFORM_INTERVAL: null 148 | DEFORM_MODULATED: false 149 | DEFORM_NUM_GROUPS: 1 150 | DEFORM_ON_PER_STAGE: 151 | - false 152 | - false 153 | - false 154 | - false 155 | DEPTH: 50 156 | NORM: FrozenBN 157 | NUM_GROUPS: 1 158 | OUT_FEATURES: 159 | - res2 160 | - res3 161 | - res4 162 | - res5 163 | RES2_OUT_CHANNELS: 256 164 | RES5_DILATION: 1 165 | STEM_OUT_CHANNELS: 64 166 | STRIDE_IN_1X1: true 167 | WIDTH_PER_GROUP: 64 168 | RETINANET: 169 | BBOX_REG_LOSS_TYPE: null 170 | BBOX_REG_WEIGHTS: null 171 | FOCAL_LOSS_ALPHA: null 172 | FOCAL_LOSS_GAMMA: null 173 | IN_FEATURES: null 174 | IOU_LABELS: null 175 | IOU_THRESHOLDS: null 176 | NMS_THRESH_TEST: null 177 | NORM: null 178 | NUM_CLASSES: 10 179 | NUM_CONVS: null 180 | PRIOR_PROB: null 181 | SCORE_THRESH_TEST: null 182 | SMOOTH_L1_LOSS_BETA: null 183 | TOPK_CANDIDATES_TEST: null 184 | ROI_BOX_HEAD: 185 | BBOX_REG_LOSS_TYPE: smooth_l1 186 | BBOX_REG_LOSS_WEIGHT: 1.0 187 | BBOX_REG_WEIGHTS: 188 | - 10.0 189 | - 10.0 190 | - 5.0 191 | - 5.0 192 | CLS_AGNOSTIC_BBOX_REG: false 193 | CONV_DIM: 256 194 | FC_DIM: 1024 195 | NAME: FastRCNNConvFCHead 196 | NORM: '' 197 | NUM_CONV: 0 198 | NUM_FC: 2 199 | POOLER_RESOLUTION: 7 200 | POOLER_SAMPLING_RATIO: 0 201 | POOLER_TYPE: ROIAlignV2 202 | SMOOTH_L1_BETA: 0.0 203 | TRAIN_ON_PRED_BOXES: false 204 | ROI_HEADS: 205 | BATCH_SIZE_PER_IMAGE: 512 206 | DETECTIONS_PER_IMAGE_TRAIN: null 207 | IN_FEATURES: 208 | - p2 209 | - p3 210 | - p4 211 | - p5 212 | 
IOU_LABELS: 213 | - 0 214 | - 1 215 | IOU_THRESHOLDS: 216 | - 0.5 217 | NAME: StandardROIHeads 218 | NMS_THRESH_TEST: 0.5 219 | NMS_THRESH_TRAIN: null 220 | NUM_CLASSES: 10 221 | POSITIVE_FRACTION: 0.25 222 | PROPOSAL_APPEND_GT: true 223 | SCORE_THRESH_TEST: 0.05 224 | SCORE_THRESH_TRAIN: null 225 | RPN: 226 | BATCH_SIZE_PER_IMAGE: 256 227 | BBOX_REG_LOSS_TYPE: smooth_l1 228 | BBOX_REG_LOSS_WEIGHT: 1.0 229 | BBOX_REG_WEIGHTS: 230 | - 1.0 231 | - 1.0 232 | - 1.0 233 | - 1.0 234 | BOUNDARY_THRESH: -1 235 | CONV_DIMS: 236 | - -1 237 | HEAD_NAME: StandardRPNHead 238 | IN_FEATURES: 239 | - p2 240 | - p3 241 | - p4 242 | - p5 243 | - p6 244 | IOU_LABELS: 245 | - 0 246 | - -1 247 | - 1 248 | IOU_THRESHOLDS: 249 | - 0.3 250 | - 0.7 251 | LOSS_WEIGHT: 1.0 252 | NMS_THRESH: 0.7 253 | POSITIVE_FRACTION: 0.5 254 | POST_NMS_TOPK_TEST: 1000 255 | POST_NMS_TOPK_TRAIN: 1000 256 | PRE_NMS_TOPK_TEST: 1000 257 | PRE_NMS_TOPK_TRAIN: 2000 258 | SMOOTH_L1_BETA: 0.0 259 | EMA_TEACHER: 260 | KEEP_RATE: null 261 | WEIGHTS: detectron2://ImageNetPretrained/MSRA/R-50.pkl 262 | OUTPUT_DIR: ./runs/nwpu_Nb=0_Ns=50_faster_rcnn_standard 263 | SEED: -1 264 | SOLVER: 265 | AMP: 266 | ENABLED: false 267 | BACKBONE_MULTIPLIER: null 268 | BASE_LR: 0.02 269 | BIAS_LR_FACTOR: 1.0 270 | CHECKPOINTS: [] 271 | CLIP_GRADIENTS: 272 | CLIP_TYPE: value 273 | CLIP_VALUE: 1.0 274 | ENABLED: false 275 | NORM_TYPE: 2.0 276 | GAMMA: 0.1 277 | IMS_PER_BATCH: 16 278 | LR_SCHEDULER_NAME: WarmupMultiStepLR 279 | MAX_ITER: 4000 280 | MOMENTUM: 0.9 281 | NESTEROV: false 282 | OPTIMIZER: null 283 | PATIENCE: 16 284 | REFERENCE_WORLD_SIZE: 0 285 | STEPS: [] 286 | USE_CHECKPOINT: null 287 | WARMUP_FACTOR: 0.001 288 | WARMUP_ITERS: 100 289 | WARMUP_METHOD: linear 290 | WEIGHT_DECAY: 0.0001 291 | WEIGHT_DECAY_BIAS: 0.0001 292 | WEIGHT_DECAY_NORM: 0.0 293 | TEST: 294 | AUG: 295 | ENABLED: false 296 | FLIP: false 297 | MAX_SIZE: 4000 298 | MIN_SIZES: 299 | - 400 300 | - 500 301 | - 600 302 | - 700 303 | - 800 304 | - 900 305 | - 
1000 306 | - 1100 307 | - 1200 308 | DETECTIONS_PER_IMAGE: 100 309 | EVAL_PERIOD: 500 310 | EXPECTED_RESULTS: [] 311 | KEYPOINT_OKS_SIGMAS: [] 312 | PRECISE_BN: 313 | ENABLED: false 314 | NUM_ITER: 200 315 | VERSION: 2 316 | VIS_PERIOD: 1000 317 | -------------------------------------------------------------------------------- /configs/nwpu/faster_rcnn_standard_Nb=0_Ns=ex.yaml: -------------------------------------------------------------------------------- 1 | CORRECTION: 2 | AUG: [] 3 | BOXES: 4 | DISTANCE_LIMIT: null 5 | LOWER_THRESH: null 6 | SOFTMAX_TEMP: null 7 | TYPE: null 8 | DETECTIONS_PER_IMAGE_TRAIN: null 9 | LABELS: 10 | MINING_THRESH: null 11 | TYPE: null 12 | NMS_THRESH_TRAIN: null 13 | SCORE_THRESH_TRAIN: null 14 | WARMUP: null 15 | CUDNN_BENCHMARK: false 16 | DATALOADER: 17 | ASPECT_RATIO_GROUPING: true 18 | FILTER_EMPTY_ANNOTATIONS: false 19 | NUM_WORKERS: 1 20 | REPEAT_THRESHOLD: 0.0 21 | SAMPLER_TRAIN: TrainingSampler 22 | DATASETS: 23 | NOISE: 24 | DROP_LABELS: 25 | P: 1.0 26 | UNIFORM_BBOX_NOISE: 27 | P: 0.0 28 | PRECOMPUTED_PROPOSAL_TOPK_TEST: 1000 29 | PRECOMPUTED_PROPOSAL_TOPK_TRAIN: 2000 30 | PROPOSAL_FILES_TEST: [] 31 | PROPOSAL_FILES_TRAIN: [] 32 | TEST: 33 | - nwpu_test 34 | TRAIN: 35 | - noisy_nwpu_train 36 | VAL: 37 | - nwpu_val 38 | GLOBAL: 39 | HACK: 1.0 40 | INPUT: 41 | CROP: 42 | ENABLED: false 43 | SIZE: 44 | - 0.9 45 | - 0.9 46 | TYPE: relative_range 47 | FORMAT: BGR 48 | MASK_FORMAT: polygon 49 | MAX_SIZE_TEST: 1333 50 | MAX_SIZE_TRAIN: 1333 51 | MIN_SIZE_TEST: 800 52 | MIN_SIZE_TRAIN: 53 | - 480 54 | - 512 55 | - 544 56 | - 576 57 | - 608 58 | - 640 59 | - 672 60 | - 704 61 | - 736 62 | - 768 63 | - 800 64 | MIN_SIZE_TRAIN_SAMPLING: choice 65 | RANDOM_FLIP: horizontal&vertical 66 | MODEL: 67 | ANCHOR_GENERATOR: 68 | ANGLES: 69 | - - -90 70 | - 0 71 | - 90 72 | ASPECT_RATIOS: 73 | - - 0.5 74 | - 1.0 75 | - 2.0 76 | NAME: DefaultAnchorGenerator 77 | OFFSET: 0.0 78 | SIZES: 79 | - - 32 80 | - - 64 81 | - - 128 82 | - - 256 83 | - 
- 512 84 | BACKBONE: 85 | ANTI_ALIAS: null 86 | FREEZE_AT: 2 87 | NAME: build_resnet_fpn_backbone 88 | DEVICE: cuda 89 | FCOS: 90 | BOX_QUALITY: null 91 | CENTER_SAMPLE: null 92 | FPN_STRIDES: null 93 | INFERENCE_TH_TEST: null 94 | INFERENCE_TH_TRAIN: null 95 | IN_FEATURES: null 96 | LOC_LOSS_TYPE: null 97 | LOSS_ALPHA: null 98 | LOSS_GAMMA: null 99 | LOSS_NORMALIZER_CLS: null 100 | LOSS_WEIGHT_CLS: null 101 | NMS_TH: null 102 | NORM: null 103 | NUM_BOX_CONVS: null 104 | NUM_CLASSES: 10 105 | NUM_CLS_CONVS: null 106 | NUM_SHARE_CONVS: null 107 | POST_NMS_TOPK_TEST: null 108 | POST_NMS_TOPK_TRAIN: null 109 | POS_RADIUS: null 110 | PRE_NMS_TOPK_TEST: null 111 | PRE_NMS_TOPK_TRAIN: null 112 | PRIOR_PROB: null 113 | SIZES_OF_INTEREST: null 114 | THRESH_WITH_CTR: null 115 | TOP_LEVELS: null 116 | USE_DEFORMABLE: null 117 | USE_RELU: null 118 | USE_SCALE: null 119 | YIELD_BOX_FEATURES: null 120 | YIELD_PROPOSAL: null 121 | FPN: 122 | FUSE_TYPE: sum 123 | IN_FEATURES: 124 | - res2 125 | - res3 126 | - res4 127 | - res5 128 | NORM: '' 129 | OUT_CHANNELS: 256 130 | KEYPOINT_ON: false 131 | LOAD_PROPOSALS: false 132 | MASK_ON: false 133 | META_ARCHITECTURE: GeneralizedRCNN 134 | MOBILENET: null 135 | PIXEL_MEAN: 136 | - 103.53 137 | - 116.28 138 | - 123.675 139 | PIXEL_STD: 140 | - 1.0 141 | - 1.0 142 | - 1.0 143 | PROPOSAL_GENERATOR: 144 | MIN_SIZE: 0 145 | NAME: RPN 146 | RESNETS: 147 | DEFORM_INTERVAL: null 148 | DEFORM_MODULATED: false 149 | DEFORM_NUM_GROUPS: 1 150 | DEFORM_ON_PER_STAGE: 151 | - false 152 | - false 153 | - false 154 | - false 155 | DEPTH: 50 156 | NORM: FrozenBN 157 | NUM_GROUPS: 1 158 | OUT_FEATURES: 159 | - res2 160 | - res3 161 | - res4 162 | - res5 163 | RES2_OUT_CHANNELS: 256 164 | RES5_DILATION: 1 165 | STEM_OUT_CHANNELS: 64 166 | STRIDE_IN_1X1: true 167 | WIDTH_PER_GROUP: 64 168 | RETINANET: 169 | BBOX_REG_LOSS_TYPE: null 170 | BBOX_REG_WEIGHTS: null 171 | FOCAL_LOSS_ALPHA: null 172 | FOCAL_LOSS_GAMMA: null 173 | IN_FEATURES: null 174 | 
IOU_LABELS: null 175 | IOU_THRESHOLDS: null 176 | NMS_THRESH_TEST: null 177 | NORM: null 178 | NUM_CLASSES: 10 179 | NUM_CONVS: null 180 | PRIOR_PROB: null 181 | SCORE_THRESH_TEST: null 182 | SMOOTH_L1_LOSS_BETA: null 183 | TOPK_CANDIDATES_TEST: null 184 | ROI_BOX_HEAD: 185 | BBOX_REG_LOSS_TYPE: smooth_l1 186 | BBOX_REG_LOSS_WEIGHT: 1.0 187 | BBOX_REG_WEIGHTS: 188 | - 10.0 189 | - 10.0 190 | - 5.0 191 | - 5.0 192 | CLS_AGNOSTIC_BBOX_REG: false 193 | CONV_DIM: 256 194 | FC_DIM: 1024 195 | NAME: FastRCNNConvFCHead 196 | NORM: '' 197 | NUM_CONV: 0 198 | NUM_FC: 2 199 | POOLER_RESOLUTION: 7 200 | POOLER_SAMPLING_RATIO: 0 201 | POOLER_TYPE: ROIAlignV2 202 | SMOOTH_L1_BETA: 0.0 203 | TRAIN_ON_PRED_BOXES: false 204 | ROI_HEADS: 205 | BATCH_SIZE_PER_IMAGE: 512 206 | DETECTIONS_PER_IMAGE_TRAIN: null 207 | IN_FEATURES: 208 | - p2 209 | - p3 210 | - p4 211 | - p5 212 | IOU_LABELS: 213 | - 0 214 | - 1 215 | IOU_THRESHOLDS: 216 | - 0.5 217 | NAME: StandardROIHeads 218 | NMS_THRESH_TEST: 0.5 219 | NMS_THRESH_TRAIN: null 220 | NUM_CLASSES: 10 221 | POSITIVE_FRACTION: 0.25 222 | PROPOSAL_APPEND_GT: true 223 | SCORE_THRESH_TEST: 0.05 224 | SCORE_THRESH_TRAIN: null 225 | RPN: 226 | BATCH_SIZE_PER_IMAGE: 256 227 | BBOX_REG_LOSS_TYPE: smooth_l1 228 | BBOX_REG_LOSS_WEIGHT: 1.0 229 | BBOX_REG_WEIGHTS: 230 | - 1.0 231 | - 1.0 232 | - 1.0 233 | - 1.0 234 | BOUNDARY_THRESH: -1 235 | CONV_DIMS: 236 | - -1 237 | HEAD_NAME: StandardRPNHead 238 | IN_FEATURES: 239 | - p2 240 | - p3 241 | - p4 242 | - p5 243 | - p6 244 | IOU_LABELS: 245 | - 0 246 | - -1 247 | - 1 248 | IOU_THRESHOLDS: 249 | - 0.3 250 | - 0.7 251 | LOSS_WEIGHT: 1.0 252 | NMS_THRESH: 0.7 253 | POSITIVE_FRACTION: 0.5 254 | POST_NMS_TOPK_TEST: 1000 255 | POST_NMS_TOPK_TRAIN: 1000 256 | PRE_NMS_TOPK_TEST: 1000 257 | PRE_NMS_TOPK_TRAIN: 2000 258 | SMOOTH_L1_BETA: 0.0 259 | EMA_TEACHER: 260 | KEEP_RATE: null 261 | WEIGHTS: detectron2://ImageNetPretrained/MSRA/R-50.pkl 262 | OUTPUT_DIR: ./runs/nwpu_Nb=0_Ns=ex_faster_rcnn_standard 263 | 
SEED: -1 264 | SOLVER: 265 | AMP: 266 | ENABLED: false 267 | BACKBONE_MULTIPLIER: null 268 | BASE_LR: 0.02 269 | BIAS_LR_FACTOR: 1.0 270 | CHECKPOINTS: [] 271 | CLIP_GRADIENTS: 272 | CLIP_TYPE: value 273 | CLIP_VALUE: 1.0 274 | ENABLED: false 275 | NORM_TYPE: 2.0 276 | GAMMA: 0.1 277 | IMS_PER_BATCH: 16 278 | LR_SCHEDULER_NAME: WarmupMultiStepLR 279 | MAX_ITER: 4000 280 | MOMENTUM: 0.9 281 | NESTEROV: false 282 | OPTIMIZER: null 283 | PATIENCE: 16 284 | REFERENCE_WORLD_SIZE: 0 285 | STEPS: [] 286 | USE_CHECKPOINT: null 287 | WARMUP_FACTOR: 0.001 288 | WARMUP_ITERS: 100 289 | WARMUP_METHOD: linear 290 | WEIGHT_DECAY: 0.0001 291 | WEIGHT_DECAY_BIAS: 0.0001 292 | WEIGHT_DECAY_NORM: 0.0 293 | TEST: 294 | AUG: 295 | ENABLED: false 296 | FLIP: false 297 | MAX_SIZE: 4000 298 | MIN_SIZES: 299 | - 400 300 | - 500 301 | - 600 302 | - 700 303 | - 800 304 | - 900 305 | - 1000 306 | - 1100 307 | - 1200 308 | DETECTIONS_PER_IMAGE: 100 309 | EVAL_PERIOD: 500 310 | EXPECTED_RESULTS: [] 311 | KEYPOINT_OKS_SIGMAS: [] 312 | PRECISE_BN: 313 | ENABLED: false 314 | NUM_ITER: 200 315 | VERSION: 2 316 | VIS_PERIOD: 1000 317 | -------------------------------------------------------------------------------- /configs/nwpu/faster_rcnn_standard_Nb=20_Ns=0.yaml: -------------------------------------------------------------------------------- 1 | CORRECTION: 2 | AUG: [] 3 | BOXES: 4 | DISTANCE_LIMIT: null 5 | LOWER_THRESH: null 6 | SOFTMAX_TEMP: null 7 | TYPE: null 8 | DETECTIONS_PER_IMAGE_TRAIN: null 9 | LABELS: 10 | MINING_THRESH: null 11 | TYPE: null 12 | NMS_THRESH_TRAIN: null 13 | SCORE_THRESH_TRAIN: null 14 | WARMUP: null 15 | CUDNN_BENCHMARK: false 16 | DATALOADER: 17 | ASPECT_RATIO_GROUPING: true 18 | FILTER_EMPTY_ANNOTATIONS: false 19 | NUM_WORKERS: 1 20 | REPEAT_THRESHOLD: 0.0 21 | SAMPLER_TRAIN: TrainingSampler 22 | DATASETS: 23 | NOISE: 24 | DROP_LABELS: 25 | P: 0.0 26 | UNIFORM_BBOX_NOISE: 27 | P: 0.2 28 | PRECOMPUTED_PROPOSAL_TOPK_TEST: 1000 29 | PRECOMPUTED_PROPOSAL_TOPK_TRAIN: 
2000 30 | PROPOSAL_FILES_TEST: [] 31 | PROPOSAL_FILES_TRAIN: [] 32 | TEST: 33 | - nwpu_test 34 | TRAIN: 35 | - noisy_nwpu_train 36 | VAL: 37 | - nwpu_val 38 | GLOBAL: 39 | HACK: 1.0 40 | INPUT: 41 | CROP: 42 | ENABLED: false 43 | SIZE: 44 | - 0.9 45 | - 0.9 46 | TYPE: relative_range 47 | FORMAT: BGR 48 | MASK_FORMAT: polygon 49 | MAX_SIZE_TEST: 1333 50 | MAX_SIZE_TRAIN: 1333 51 | MIN_SIZE_TEST: 800 52 | MIN_SIZE_TRAIN: 53 | - 480 54 | - 512 55 | - 544 56 | - 576 57 | - 608 58 | - 640 59 | - 672 60 | - 704 61 | - 736 62 | - 768 63 | - 800 64 | MIN_SIZE_TRAIN_SAMPLING: choice 65 | RANDOM_FLIP: horizontal&vertical 66 | MODEL: 67 | ANCHOR_GENERATOR: 68 | ANGLES: 69 | - - -90 70 | - 0 71 | - 90 72 | ASPECT_RATIOS: 73 | - - 0.5 74 | - 1.0 75 | - 2.0 76 | NAME: DefaultAnchorGenerator 77 | OFFSET: 0.0 78 | SIZES: 79 | - - 32 80 | - - 64 81 | - - 128 82 | - - 256 83 | - - 512 84 | BACKBONE: 85 | ANTI_ALIAS: null 86 | FREEZE_AT: 2 87 | NAME: build_resnet_fpn_backbone 88 | DEVICE: cuda 89 | FCOS: 90 | BOX_QUALITY: null 91 | CENTER_SAMPLE: null 92 | FPN_STRIDES: null 93 | INFERENCE_TH_TEST: null 94 | INFERENCE_TH_TRAIN: null 95 | IN_FEATURES: null 96 | LOC_LOSS_TYPE: null 97 | LOSS_ALPHA: null 98 | LOSS_GAMMA: null 99 | LOSS_NORMALIZER_CLS: null 100 | LOSS_WEIGHT_CLS: null 101 | NMS_TH: null 102 | NORM: null 103 | NUM_BOX_CONVS: null 104 | NUM_CLASSES: 10 105 | NUM_CLS_CONVS: null 106 | NUM_SHARE_CONVS: null 107 | POST_NMS_TOPK_TEST: null 108 | POST_NMS_TOPK_TRAIN: null 109 | POS_RADIUS: null 110 | PRE_NMS_TOPK_TEST: null 111 | PRE_NMS_TOPK_TRAIN: null 112 | PRIOR_PROB: null 113 | SIZES_OF_INTEREST: null 114 | THRESH_WITH_CTR: null 115 | TOP_LEVELS: null 116 | USE_DEFORMABLE: null 117 | USE_RELU: null 118 | USE_SCALE: null 119 | YIELD_BOX_FEATURES: null 120 | YIELD_PROPOSAL: null 121 | FPN: 122 | FUSE_TYPE: sum 123 | IN_FEATURES: 124 | - res2 125 | - res3 126 | - res4 127 | - res5 128 | NORM: '' 129 | OUT_CHANNELS: 256 130 | KEYPOINT_ON: false 131 | LOAD_PROPOSALS: false 132 | 
MASK_ON: false 133 | META_ARCHITECTURE: GeneralizedRCNN 134 | MOBILENET: null 135 | PIXEL_MEAN: 136 | - 103.53 137 | - 116.28 138 | - 123.675 139 | PIXEL_STD: 140 | - 1.0 141 | - 1.0 142 | - 1.0 143 | PROPOSAL_GENERATOR: 144 | MIN_SIZE: 0 145 | NAME: RPN 146 | RESNETS: 147 | DEFORM_INTERVAL: null 148 | DEFORM_MODULATED: false 149 | DEFORM_NUM_GROUPS: 1 150 | DEFORM_ON_PER_STAGE: 151 | - false 152 | - false 153 | - false 154 | - false 155 | DEPTH: 50 156 | NORM: FrozenBN 157 | NUM_GROUPS: 1 158 | OUT_FEATURES: 159 | - res2 160 | - res3 161 | - res4 162 | - res5 163 | RES2_OUT_CHANNELS: 256 164 | RES5_DILATION: 1 165 | STEM_OUT_CHANNELS: 64 166 | STRIDE_IN_1X1: true 167 | WIDTH_PER_GROUP: 64 168 | RETINANET: 169 | BBOX_REG_LOSS_TYPE: null 170 | BBOX_REG_WEIGHTS: null 171 | FOCAL_LOSS_ALPHA: null 172 | FOCAL_LOSS_GAMMA: null 173 | IN_FEATURES: null 174 | IOU_LABELS: null 175 | IOU_THRESHOLDS: null 176 | NMS_THRESH_TEST: null 177 | NORM: null 178 | NUM_CLASSES: 10 179 | NUM_CONVS: null 180 | PRIOR_PROB: null 181 | SCORE_THRESH_TEST: null 182 | SMOOTH_L1_LOSS_BETA: null 183 | TOPK_CANDIDATES_TEST: null 184 | ROI_BOX_HEAD: 185 | BBOX_REG_LOSS_TYPE: smooth_l1 186 | BBOX_REG_LOSS_WEIGHT: 1.0 187 | BBOX_REG_WEIGHTS: 188 | - 10.0 189 | - 10.0 190 | - 5.0 191 | - 5.0 192 | CLS_AGNOSTIC_BBOX_REG: false 193 | CONV_DIM: 256 194 | FC_DIM: 1024 195 | NAME: FastRCNNConvFCHead 196 | NORM: '' 197 | NUM_CONV: 0 198 | NUM_FC: 2 199 | POOLER_RESOLUTION: 7 200 | POOLER_SAMPLING_RATIO: 0 201 | POOLER_TYPE: ROIAlignV2 202 | SMOOTH_L1_BETA: 0.0 203 | TRAIN_ON_PRED_BOXES: false 204 | ROI_HEADS: 205 | BATCH_SIZE_PER_IMAGE: 512 206 | DETECTIONS_PER_IMAGE_TRAIN: null 207 | IN_FEATURES: 208 | - p2 209 | - p3 210 | - p4 211 | - p5 212 | IOU_LABELS: 213 | - 0 214 | - 1 215 | IOU_THRESHOLDS: 216 | - 0.5 217 | NAME: StandardROIHeads 218 | NMS_THRESH_TEST: 0.5 219 | NMS_THRESH_TRAIN: null 220 | NUM_CLASSES: 10 221 | POSITIVE_FRACTION: 0.25 222 | PROPOSAL_APPEND_GT: true 223 | SCORE_THRESH_TEST: 0.05 
224 | SCORE_THRESH_TRAIN: null 225 | RPN: 226 | BATCH_SIZE_PER_IMAGE: 256 227 | BBOX_REG_LOSS_TYPE: smooth_l1 228 | BBOX_REG_LOSS_WEIGHT: 1.0 229 | BBOX_REG_WEIGHTS: 230 | - 1.0 231 | - 1.0 232 | - 1.0 233 | - 1.0 234 | BOUNDARY_THRESH: -1 235 | CONV_DIMS: 236 | - -1 237 | HEAD_NAME: StandardRPNHead 238 | IN_FEATURES: 239 | - p2 240 | - p3 241 | - p4 242 | - p5 243 | - p6 244 | IOU_LABELS: 245 | - 0 246 | - -1 247 | - 1 248 | IOU_THRESHOLDS: 249 | - 0.3 250 | - 0.7 251 | LOSS_WEIGHT: 1.0 252 | NMS_THRESH: 0.7 253 | POSITIVE_FRACTION: 0.5 254 | POST_NMS_TOPK_TEST: 1000 255 | POST_NMS_TOPK_TRAIN: 1000 256 | PRE_NMS_TOPK_TEST: 1000 257 | PRE_NMS_TOPK_TRAIN: 2000 258 | SMOOTH_L1_BETA: 0.0 259 | EMA_TEACHER: 260 | KEEP_RATE: null 261 | WEIGHTS: detectron2://ImageNetPretrained/MSRA/R-50.pkl 262 | OUTPUT_DIR: ./runs/nwpu_Nb=20_Ns=0_faster_rcnn_standard 263 | SEED: -1 264 | SOLVER: 265 | AMP: 266 | ENABLED: false 267 | BACKBONE_MULTIPLIER: null 268 | BASE_LR: 0.02 269 | BIAS_LR_FACTOR: 1.0 270 | CHECKPOINTS: [] 271 | CLIP_GRADIENTS: 272 | CLIP_TYPE: value 273 | CLIP_VALUE: 1.0 274 | ENABLED: false 275 | NORM_TYPE: 2.0 276 | GAMMA: 0.1 277 | IMS_PER_BATCH: 16 278 | LR_SCHEDULER_NAME: WarmupMultiStepLR 279 | MAX_ITER: 4000 280 | MOMENTUM: 0.9 281 | NESTEROV: false 282 | OPTIMIZER: null 283 | PATIENCE: 16 284 | REFERENCE_WORLD_SIZE: 0 285 | STEPS: [] 286 | USE_CHECKPOINT: null 287 | WARMUP_FACTOR: 0.001 288 | WARMUP_ITERS: 100 289 | WARMUP_METHOD: linear 290 | WEIGHT_DECAY: 0.0001 291 | WEIGHT_DECAY_BIAS: 0.0001 292 | WEIGHT_DECAY_NORM: 0.0 293 | TEST: 294 | AUG: 295 | ENABLED: false 296 | FLIP: false 297 | MAX_SIZE: 4000 298 | MIN_SIZES: 299 | - 400 300 | - 500 301 | - 600 302 | - 700 303 | - 800 304 | - 900 305 | - 1000 306 | - 1100 307 | - 1200 308 | DETECTIONS_PER_IMAGE: 100 309 | EVAL_PERIOD: 500 310 | EXPECTED_RESULTS: [] 311 | KEYPOINT_OKS_SIGMAS: [] 312 | PRECISE_BN: 313 | ENABLED: false 314 | NUM_ITER: 200 315 | VERSION: 2 316 | VIS_PERIOD: 1000 317 | 
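The `Nb`/`Ns` values in the config file names correspond to the synthetic noise settings under `DATASETS.NOISE`: `UNIFORM_BBOX_NOISE.P` controls the relative bounding-box perturbation (`Nb`), and `DROP_LABELS.P` controls the fraction of dropped annotations (`Ns`; the `Ns=ex` configs set `DROP_LABELS.P: 1.0`). The repository's actual implementation lives in `src/datasets/noise.py` and is not shown here, so the following is only a hypothetical sketch of how such corruption could be simulated:

```python
import random

def corrupt_annotations(boxes, p_drop=0.5, p_box_noise=0.2, seed=0):
    """Illustrative sketch: simulate sparse and noisy box annotations.

    boxes: list of (x1, y1, x2, y2) tuples.
    p_drop: probability of dropping an annotation entirely (label sparsity, Ns).
    p_box_noise: maximum coordinate shift relative to box size (box noise, Nb).
    """
    rng = random.Random(seed)  # seeded for reproducible corruption
    corrupted = []
    for (x1, y1, x2, y2) in boxes:
        if rng.random() < p_drop:  # simulate a missing annotation
            continue
        w, h = x2 - x1, y2 - y1
        # jitter each coordinate uniformly by up to p_box_noise * side length
        x1 += rng.uniform(-p_box_noise, p_box_noise) * w
        x2 += rng.uniform(-p_box_noise, p_box_noise) * w
        y1 += rng.uniform(-p_box_noise, p_box_noise) * h
        y2 += rng.uniform(-p_box_noise, p_box_noise) * h
        corrupted.append((x1, y1, x2, y2))
    return corrupted
```

With `p_drop=1.0` every annotation is removed, and with `p_drop=0.0, p_box_noise=0.0` the boxes pass through unchanged; the function name and exact jitter model here are illustrative assumptions, not the repository's API.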
-------------------------------------------------------------------------------- /configs/nwpu/faster_rcnn_standard_Nb=20_Ns=50.yaml: -------------------------------------------------------------------------------- 1 | CORRECTION: 2 | AUG: [] 3 | BOXES: 4 | DISTANCE_LIMIT: null 5 | LOWER_THRESH: null 6 | SOFTMAX_TEMP: null 7 | TYPE: null 8 | DETECTIONS_PER_IMAGE_TRAIN: null 9 | LABELS: 10 | MINING_THRESH: null 11 | TYPE: null 12 | NMS_THRESH_TRAIN: null 13 | SCORE_THRESH_TRAIN: null 14 | WARMUP: null 15 | CUDNN_BENCHMARK: false 16 | DATALOADER: 17 | ASPECT_RATIO_GROUPING: true 18 | FILTER_EMPTY_ANNOTATIONS: false 19 | NUM_WORKERS: 1 20 | REPEAT_THRESHOLD: 0.0 21 | SAMPLER_TRAIN: TrainingSampler 22 | DATASETS: 23 | NOISE: 24 | DROP_LABELS: 25 | P: 0.5 26 | UNIFORM_BBOX_NOISE: 27 | P: 0.2 28 | PRECOMPUTED_PROPOSAL_TOPK_TEST: 1000 29 | PRECOMPUTED_PROPOSAL_TOPK_TRAIN: 2000 30 | PROPOSAL_FILES_TEST: [] 31 | PROPOSAL_FILES_TRAIN: [] 32 | TEST: 33 | - nwpu_test 34 | TRAIN: 35 | - noisy_nwpu_train 36 | VAL: 37 | - nwpu_val 38 | GLOBAL: 39 | HACK: 1.0 40 | INPUT: 41 | CROP: 42 | ENABLED: false 43 | SIZE: 44 | - 0.9 45 | - 0.9 46 | TYPE: relative_range 47 | FORMAT: BGR 48 | MASK_FORMAT: polygon 49 | MAX_SIZE_TEST: 1333 50 | MAX_SIZE_TRAIN: 1333 51 | MIN_SIZE_TEST: 800 52 | MIN_SIZE_TRAIN: 53 | - 480 54 | - 512 55 | - 544 56 | - 576 57 | - 608 58 | - 640 59 | - 672 60 | - 704 61 | - 736 62 | - 768 63 | - 800 64 | MIN_SIZE_TRAIN_SAMPLING: choice 65 | RANDOM_FLIP: horizontal&vertical 66 | MODEL: 67 | ANCHOR_GENERATOR: 68 | ANGLES: 69 | - - -90 70 | - 0 71 | - 90 72 | ASPECT_RATIOS: 73 | - - 0.5 74 | - 1.0 75 | - 2.0 76 | NAME: DefaultAnchorGenerator 77 | OFFSET: 0.0 78 | SIZES: 79 | - - 32 80 | - - 64 81 | - - 128 82 | - - 256 83 | - - 512 84 | BACKBONE: 85 | ANTI_ALIAS: null 86 | FREEZE_AT: 2 87 | NAME: build_resnet_fpn_backbone 88 | DEVICE: cuda 89 | FCOS: 90 | BOX_QUALITY: null 91 | CENTER_SAMPLE: null 92 | FPN_STRIDES: null 93 | INFERENCE_TH_TEST: null 94 | 
INFERENCE_TH_TRAIN: null 95 | IN_FEATURES: null 96 | LOC_LOSS_TYPE: null 97 | LOSS_ALPHA: null 98 | LOSS_GAMMA: null 99 | LOSS_NORMALIZER_CLS: null 100 | LOSS_WEIGHT_CLS: null 101 | NMS_TH: null 102 | NORM: null 103 | NUM_BOX_CONVS: null 104 | NUM_CLASSES: 10 105 | NUM_CLS_CONVS: null 106 | NUM_SHARE_CONVS: null 107 | POST_NMS_TOPK_TEST: null 108 | POST_NMS_TOPK_TRAIN: null 109 | POS_RADIUS: null 110 | PRE_NMS_TOPK_TEST: null 111 | PRE_NMS_TOPK_TRAIN: null 112 | PRIOR_PROB: null 113 | SIZES_OF_INTEREST: null 114 | THRESH_WITH_CTR: null 115 | TOP_LEVELS: null 116 | USE_DEFORMABLE: null 117 | USE_RELU: null 118 | USE_SCALE: null 119 | YIELD_BOX_FEATURES: null 120 | YIELD_PROPOSAL: null 121 | FPN: 122 | FUSE_TYPE: sum 123 | IN_FEATURES: 124 | - res2 125 | - res3 126 | - res4 127 | - res5 128 | NORM: '' 129 | OUT_CHANNELS: 256 130 | KEYPOINT_ON: false 131 | LOAD_PROPOSALS: false 132 | MASK_ON: false 133 | META_ARCHITECTURE: GeneralizedRCNN 134 | MOBILENET: null 135 | PIXEL_MEAN: 136 | - 103.53 137 | - 116.28 138 | - 123.675 139 | PIXEL_STD: 140 | - 1.0 141 | - 1.0 142 | - 1.0 143 | PROPOSAL_GENERATOR: 144 | MIN_SIZE: 0 145 | NAME: RPN 146 | RESNETS: 147 | DEFORM_INTERVAL: null 148 | DEFORM_MODULATED: false 149 | DEFORM_NUM_GROUPS: 1 150 | DEFORM_ON_PER_STAGE: 151 | - false 152 | - false 153 | - false 154 | - false 155 | DEPTH: 50 156 | NORM: FrozenBN 157 | NUM_GROUPS: 1 158 | OUT_FEATURES: 159 | - res2 160 | - res3 161 | - res4 162 | - res5 163 | RES2_OUT_CHANNELS: 256 164 | RES5_DILATION: 1 165 | STEM_OUT_CHANNELS: 64 166 | STRIDE_IN_1X1: true 167 | WIDTH_PER_GROUP: 64 168 | RETINANET: 169 | BBOX_REG_LOSS_TYPE: null 170 | BBOX_REG_WEIGHTS: null 171 | FOCAL_LOSS_ALPHA: null 172 | FOCAL_LOSS_GAMMA: null 173 | IN_FEATURES: null 174 | IOU_LABELS: null 175 | IOU_THRESHOLDS: null 176 | NMS_THRESH_TEST: null 177 | NORM: null 178 | NUM_CLASSES: 10 179 | NUM_CONVS: null 180 | PRIOR_PROB: null 181 | SCORE_THRESH_TEST: null 182 | SMOOTH_L1_LOSS_BETA: null 183 | 
TOPK_CANDIDATES_TEST: null 184 | ROI_BOX_HEAD: 185 | BBOX_REG_LOSS_TYPE: smooth_l1 186 | BBOX_REG_LOSS_WEIGHT: 1.0 187 | BBOX_REG_WEIGHTS: 188 | - 10.0 189 | - 10.0 190 | - 5.0 191 | - 5.0 192 | CLS_AGNOSTIC_BBOX_REG: false 193 | CONV_DIM: 256 194 | FC_DIM: 1024 195 | NAME: FastRCNNConvFCHead 196 | NORM: '' 197 | NUM_CONV: 0 198 | NUM_FC: 2 199 | POOLER_RESOLUTION: 7 200 | POOLER_SAMPLING_RATIO: 0 201 | POOLER_TYPE: ROIAlignV2 202 | SMOOTH_L1_BETA: 0.0 203 | TRAIN_ON_PRED_BOXES: false 204 | ROI_HEADS: 205 | BATCH_SIZE_PER_IMAGE: 512 206 | DETECTIONS_PER_IMAGE_TRAIN: null 207 | IN_FEATURES: 208 | - p2 209 | - p3 210 | - p4 211 | - p5 212 | IOU_LABELS: 213 | - 0 214 | - 1 215 | IOU_THRESHOLDS: 216 | - 0.5 217 | NAME: StandardROIHeads 218 | NMS_THRESH_TEST: 0.5 219 | NMS_THRESH_TRAIN: null 220 | NUM_CLASSES: 10 221 | POSITIVE_FRACTION: 0.25 222 | PROPOSAL_APPEND_GT: true 223 | SCORE_THRESH_TEST: 0.05 224 | SCORE_THRESH_TRAIN: null 225 | RPN: 226 | BATCH_SIZE_PER_IMAGE: 256 227 | BBOX_REG_LOSS_TYPE: smooth_l1 228 | BBOX_REG_LOSS_WEIGHT: 1.0 229 | BBOX_REG_WEIGHTS: 230 | - 1.0 231 | - 1.0 232 | - 1.0 233 | - 1.0 234 | BOUNDARY_THRESH: -1 235 | CONV_DIMS: 236 | - -1 237 | HEAD_NAME: StandardRPNHead 238 | IN_FEATURES: 239 | - p2 240 | - p3 241 | - p4 242 | - p5 243 | - p6 244 | IOU_LABELS: 245 | - 0 246 | - -1 247 | - 1 248 | IOU_THRESHOLDS: 249 | - 0.3 250 | - 0.7 251 | LOSS_WEIGHT: 1.0 252 | NMS_THRESH: 0.7 253 | POSITIVE_FRACTION: 0.5 254 | POST_NMS_TOPK_TEST: 1000 255 | POST_NMS_TOPK_TRAIN: 1000 256 | PRE_NMS_TOPK_TEST: 1000 257 | PRE_NMS_TOPK_TRAIN: 2000 258 | SMOOTH_L1_BETA: 0.0 259 | EMA_TEACHER: 260 | KEEP_RATE: null 261 | WEIGHTS: detectron2://ImageNetPretrained/MSRA/R-50.pkl 262 | OUTPUT_DIR: ./runs/nwpu_Nb=20_Ns=50_faster_rcnn_standard 263 | SEED: -1 264 | SOLVER: 265 | AMP: 266 | ENABLED: false 267 | BACKBONE_MULTIPLIER: null 268 | BASE_LR: 0.02 269 | BIAS_LR_FACTOR: 1.0 270 | CHECKPOINTS: [] 271 | CLIP_GRADIENTS: 272 | CLIP_TYPE: value 273 | CLIP_VALUE: 1.0 
274 | ENABLED: false 275 | NORM_TYPE: 2.0 276 | GAMMA: 0.1 277 | IMS_PER_BATCH: 16 278 | LR_SCHEDULER_NAME: WarmupMultiStepLR 279 | MAX_ITER: 4000 280 | MOMENTUM: 0.9 281 | NESTEROV: false 282 | OPTIMIZER: null 283 | PATIENCE: 16 284 | REFERENCE_WORLD_SIZE: 0 285 | STEPS: [] 286 | USE_CHECKPOINT: null 287 | WARMUP_FACTOR: 0.001 288 | WARMUP_ITERS: 100 289 | WARMUP_METHOD: linear 290 | WEIGHT_DECAY: 0.0001 291 | WEIGHT_DECAY_BIAS: 0.0001 292 | WEIGHT_DECAY_NORM: 0.0 293 | TEST: 294 | AUG: 295 | ENABLED: false 296 | FLIP: false 297 | MAX_SIZE: 4000 298 | MIN_SIZES: 299 | - 400 300 | - 500 301 | - 600 302 | - 700 303 | - 800 304 | - 900 305 | - 1000 306 | - 1100 307 | - 1200 308 | DETECTIONS_PER_IMAGE: 100 309 | EVAL_PERIOD: 500 310 | EXPECTED_RESULTS: [] 311 | KEYPOINT_OKS_SIGMAS: [] 312 | PRECISE_BN: 313 | ENABLED: false 314 | NUM_ITER: 200 315 | VERSION: 2 316 | VIS_PERIOD: 1000 317 | -------------------------------------------------------------------------------- /configs/nwpu/faster_rcnn_standard_Nb=20_Ns=ex.yaml: -------------------------------------------------------------------------------- 1 | CORRECTION: 2 | AUG: [] 3 | BOXES: 4 | DISTANCE_LIMIT: null 5 | LOWER_THRESH: null 6 | SOFTMAX_TEMP: null 7 | TYPE: null 8 | DETECTIONS_PER_IMAGE_TRAIN: null 9 | LABELS: 10 | MINING_THRESH: null 11 | TYPE: null 12 | NMS_THRESH_TRAIN: null 13 | SCORE_THRESH_TRAIN: null 14 | WARMUP: null 15 | CUDNN_BENCHMARK: false 16 | DATALOADER: 17 | ASPECT_RATIO_GROUPING: true 18 | FILTER_EMPTY_ANNOTATIONS: false 19 | NUM_WORKERS: 1 20 | REPEAT_THRESHOLD: 0.0 21 | SAMPLER_TRAIN: TrainingSampler 22 | DATASETS: 23 | NOISE: 24 | DROP_LABELS: 25 | P: 1.0 26 | UNIFORM_BBOX_NOISE: 27 | P: 0.2 28 | PRECOMPUTED_PROPOSAL_TOPK_TEST: 1000 29 | PRECOMPUTED_PROPOSAL_TOPK_TRAIN: 2000 30 | PROPOSAL_FILES_TEST: [] 31 | PROPOSAL_FILES_TRAIN: [] 32 | TEST: 33 | - nwpu_test 34 | TRAIN: 35 | - noisy_nwpu_train 36 | VAL: 37 | - nwpu_val 38 | GLOBAL: 39 | HACK: 1.0 40 | INPUT: 41 | CROP: 42 | ENABLED: 
false 43 | SIZE: 44 | - 0.9 45 | - 0.9 46 | TYPE: relative_range 47 | FORMAT: BGR 48 | MASK_FORMAT: polygon 49 | MAX_SIZE_TEST: 1333 50 | MAX_SIZE_TRAIN: 1333 51 | MIN_SIZE_TEST: 800 52 | MIN_SIZE_TRAIN: 53 | - 480 54 | - 512 55 | - 544 56 | - 576 57 | - 608 58 | - 640 59 | - 672 60 | - 704 61 | - 736 62 | - 768 63 | - 800 64 | MIN_SIZE_TRAIN_SAMPLING: choice 65 | RANDOM_FLIP: horizontal&vertical 66 | MODEL: 67 | ANCHOR_GENERATOR: 68 | ANGLES: 69 | - - -90 70 | - 0 71 | - 90 72 | ASPECT_RATIOS: 73 | - - 0.5 74 | - 1.0 75 | - 2.0 76 | NAME: DefaultAnchorGenerator 77 | OFFSET: 0.0 78 | SIZES: 79 | - - 32 80 | - - 64 81 | - - 128 82 | - - 256 83 | - - 512 84 | BACKBONE: 85 | ANTI_ALIAS: null 86 | FREEZE_AT: 2 87 | NAME: build_resnet_fpn_backbone 88 | DEVICE: cuda 89 | FCOS: 90 | BOX_QUALITY: null 91 | CENTER_SAMPLE: null 92 | FPN_STRIDES: null 93 | INFERENCE_TH_TEST: null 94 | INFERENCE_TH_TRAIN: null 95 | IN_FEATURES: null 96 | LOC_LOSS_TYPE: null 97 | LOSS_ALPHA: null 98 | LOSS_GAMMA: null 99 | LOSS_NORMALIZER_CLS: null 100 | LOSS_WEIGHT_CLS: null 101 | NMS_TH: null 102 | NORM: null 103 | NUM_BOX_CONVS: null 104 | NUM_CLASSES: 10 105 | NUM_CLS_CONVS: null 106 | NUM_SHARE_CONVS: null 107 | POST_NMS_TOPK_TEST: null 108 | POST_NMS_TOPK_TRAIN: null 109 | POS_RADIUS: null 110 | PRE_NMS_TOPK_TEST: null 111 | PRE_NMS_TOPK_TRAIN: null 112 | PRIOR_PROB: null 113 | SIZES_OF_INTEREST: null 114 | THRESH_WITH_CTR: null 115 | TOP_LEVELS: null 116 | USE_DEFORMABLE: null 117 | USE_RELU: null 118 | USE_SCALE: null 119 | YIELD_BOX_FEATURES: null 120 | YIELD_PROPOSAL: null 121 | FPN: 122 | FUSE_TYPE: sum 123 | IN_FEATURES: 124 | - res2 125 | - res3 126 | - res4 127 | - res5 128 | NORM: '' 129 | OUT_CHANNELS: 256 130 | KEYPOINT_ON: false 131 | LOAD_PROPOSALS: false 132 | MASK_ON: false 133 | META_ARCHITECTURE: GeneralizedRCNN 134 | MOBILENET: null 135 | PIXEL_MEAN: 136 | - 103.53 137 | - 116.28 138 | - 123.675 139 | PIXEL_STD: 140 | - 1.0 141 | - 1.0 142 | - 1.0 143 | 
PROPOSAL_GENERATOR: 144 | MIN_SIZE: 0 145 | NAME: RPN 146 | RESNETS: 147 | DEFORM_INTERVAL: null 148 | DEFORM_MODULATED: false 149 | DEFORM_NUM_GROUPS: 1 150 | DEFORM_ON_PER_STAGE: 151 | - false 152 | - false 153 | - false 154 | - false 155 | DEPTH: 50 156 | NORM: FrozenBN 157 | NUM_GROUPS: 1 158 | OUT_FEATURES: 159 | - res2 160 | - res3 161 | - res4 162 | - res5 163 | RES2_OUT_CHANNELS: 256 164 | RES5_DILATION: 1 165 | STEM_OUT_CHANNELS: 64 166 | STRIDE_IN_1X1: true 167 | WIDTH_PER_GROUP: 64 168 | RETINANET: 169 | BBOX_REG_LOSS_TYPE: null 170 | BBOX_REG_WEIGHTS: null 171 | FOCAL_LOSS_ALPHA: null 172 | FOCAL_LOSS_GAMMA: null 173 | IN_FEATURES: null 174 | IOU_LABELS: null 175 | IOU_THRESHOLDS: null 176 | NMS_THRESH_TEST: null 177 | NORM: null 178 | NUM_CLASSES: 10 179 | NUM_CONVS: null 180 | PRIOR_PROB: null 181 | SCORE_THRESH_TEST: null 182 | SMOOTH_L1_LOSS_BETA: null 183 | TOPK_CANDIDATES_TEST: null 184 | ROI_BOX_HEAD: 185 | BBOX_REG_LOSS_TYPE: smooth_l1 186 | BBOX_REG_LOSS_WEIGHT: 1.0 187 | BBOX_REG_WEIGHTS: 188 | - 10.0 189 | - 10.0 190 | - 5.0 191 | - 5.0 192 | CLS_AGNOSTIC_BBOX_REG: false 193 | CONV_DIM: 256 194 | FC_DIM: 1024 195 | NAME: FastRCNNConvFCHead 196 | NORM: '' 197 | NUM_CONV: 0 198 | NUM_FC: 2 199 | POOLER_RESOLUTION: 7 200 | POOLER_SAMPLING_RATIO: 0 201 | POOLER_TYPE: ROIAlignV2 202 | SMOOTH_L1_BETA: 0.0 203 | TRAIN_ON_PRED_BOXES: false 204 | ROI_HEADS: 205 | BATCH_SIZE_PER_IMAGE: 512 206 | DETECTIONS_PER_IMAGE_TRAIN: null 207 | IN_FEATURES: 208 | - p2 209 | - p3 210 | - p4 211 | - p5 212 | IOU_LABELS: 213 | - 0 214 | - 1 215 | IOU_THRESHOLDS: 216 | - 0.5 217 | NAME: StandardROIHeads 218 | NMS_THRESH_TEST: 0.5 219 | NMS_THRESH_TRAIN: null 220 | NUM_CLASSES: 10 221 | POSITIVE_FRACTION: 0.25 222 | PROPOSAL_APPEND_GT: true 223 | SCORE_THRESH_TEST: 0.05 224 | SCORE_THRESH_TRAIN: null 225 | RPN: 226 | BATCH_SIZE_PER_IMAGE: 256 227 | BBOX_REG_LOSS_TYPE: smooth_l1 228 | BBOX_REG_LOSS_WEIGHT: 1.0 229 | BBOX_REG_WEIGHTS: 230 | - 1.0 231 | - 1.0 232 | - 1.0 
233 | - 1.0 234 | BOUNDARY_THRESH: -1 235 | CONV_DIMS: 236 | - -1 237 | HEAD_NAME: StandardRPNHead 238 | IN_FEATURES: 239 | - p2 240 | - p3 241 | - p4 242 | - p5 243 | - p6 244 | IOU_LABELS: 245 | - 0 246 | - -1 247 | - 1 248 | IOU_THRESHOLDS: 249 | - 0.3 250 | - 0.7 251 | LOSS_WEIGHT: 1.0 252 | NMS_THRESH: 0.7 253 | POSITIVE_FRACTION: 0.5 254 | POST_NMS_TOPK_TEST: 1000 255 | POST_NMS_TOPK_TRAIN: 1000 256 | PRE_NMS_TOPK_TEST: 1000 257 | PRE_NMS_TOPK_TRAIN: 2000 258 | SMOOTH_L1_BETA: 0.0 259 | EMA_TEACHER: 260 | KEEP_RATE: null 261 | WEIGHTS: detectron2://ImageNetPretrained/MSRA/R-50.pkl 262 | OUTPUT_DIR: ./runs/nwpu_Nb=20_Ns=ex_faster_rcnn_standard 263 | SEED: -1 264 | SOLVER: 265 | AMP: 266 | ENABLED: false 267 | BACKBONE_MULTIPLIER: null 268 | BASE_LR: 0.02 269 | BIAS_LR_FACTOR: 1.0 270 | CHECKPOINTS: [] 271 | CLIP_GRADIENTS: 272 | CLIP_TYPE: value 273 | CLIP_VALUE: 1.0 274 | ENABLED: false 275 | NORM_TYPE: 2.0 276 | GAMMA: 0.1 277 | IMS_PER_BATCH: 16 278 | LR_SCHEDULER_NAME: WarmupMultiStepLR 279 | MAX_ITER: 4000 280 | MOMENTUM: 0.9 281 | NESTEROV: false 282 | OPTIMIZER: null 283 | PATIENCE: 16 284 | REFERENCE_WORLD_SIZE: 0 285 | STEPS: [] 286 | USE_CHECKPOINT: null 287 | WARMUP_FACTOR: 0.001 288 | WARMUP_ITERS: 100 289 | WARMUP_METHOD: linear 290 | WEIGHT_DECAY: 0.0001 291 | WEIGHT_DECAY_BIAS: 0.0001 292 | WEIGHT_DECAY_NORM: 0.0 293 | TEST: 294 | AUG: 295 | ENABLED: false 296 | FLIP: false 297 | MAX_SIZE: 4000 298 | MIN_SIZES: 299 | - 400 300 | - 500 301 | - 600 302 | - 700 303 | - 800 304 | - 900 305 | - 1000 306 | - 1100 307 | - 1200 308 | DETECTIONS_PER_IMAGE: 100 309 | EVAL_PERIOD: 500 310 | EXPECTED_RESULTS: [] 311 | KEYPOINT_OKS_SIGMAS: [] 312 | PRECISE_BN: 313 | ENABLED: false 314 | NUM_ITER: 200 315 | VERSION: 2 316 | VIS_PERIOD: 1000 317 | -------------------------------------------------------------------------------- /configs/nwpu/faster_rcnn_standard_Nb=40_Ns=0.yaml: -------------------------------------------------------------------------------- 1 
| CORRECTION: 2 | AUG: [] 3 | BOXES: 4 | DISTANCE_LIMIT: null 5 | LOWER_THRESH: null 6 | SOFTMAX_TEMP: null 7 | TYPE: null 8 | DETECTIONS_PER_IMAGE_TRAIN: null 9 | LABELS: 10 | MINING_THRESH: null 11 | TYPE: null 12 | NMS_THRESH_TRAIN: null 13 | SCORE_THRESH_TRAIN: null 14 | WARMUP: null 15 | CUDNN_BENCHMARK: false 16 | DATALOADER: 17 | ASPECT_RATIO_GROUPING: true 18 | FILTER_EMPTY_ANNOTATIONS: false 19 | NUM_WORKERS: 1 20 | REPEAT_THRESHOLD: 0.0 21 | SAMPLER_TRAIN: TrainingSampler 22 | DATASETS: 23 | NOISE: 24 | DROP_LABELS: 25 | P: 0.0 26 | UNIFORM_BBOX_NOISE: 27 | P: 0.4 28 | PRECOMPUTED_PROPOSAL_TOPK_TEST: 1000 29 | PRECOMPUTED_PROPOSAL_TOPK_TRAIN: 2000 30 | PROPOSAL_FILES_TEST: [] 31 | PROPOSAL_FILES_TRAIN: [] 32 | TEST: 33 | - nwpu_test 34 | TRAIN: 35 | - noisy_nwpu_train 36 | VAL: 37 | - nwpu_val 38 | GLOBAL: 39 | HACK: 1.0 40 | INPUT: 41 | CROP: 42 | ENABLED: false 43 | SIZE: 44 | - 0.9 45 | - 0.9 46 | TYPE: relative_range 47 | FORMAT: BGR 48 | MASK_FORMAT: polygon 49 | MAX_SIZE_TEST: 1333 50 | MAX_SIZE_TRAIN: 1333 51 | MIN_SIZE_TEST: 800 52 | MIN_SIZE_TRAIN: 53 | - 480 54 | - 512 55 | - 544 56 | - 576 57 | - 608 58 | - 640 59 | - 672 60 | - 704 61 | - 736 62 | - 768 63 | - 800 64 | MIN_SIZE_TRAIN_SAMPLING: choice 65 | RANDOM_FLIP: horizontal&vertical 66 | MODEL: 67 | ANCHOR_GENERATOR: 68 | ANGLES: 69 | - - -90 70 | - 0 71 | - 90 72 | ASPECT_RATIOS: 73 | - - 0.5 74 | - 1.0 75 | - 2.0 76 | NAME: DefaultAnchorGenerator 77 | OFFSET: 0.0 78 | SIZES: 79 | - - 32 80 | - - 64 81 | - - 128 82 | - - 256 83 | - - 512 84 | BACKBONE: 85 | ANTI_ALIAS: null 86 | FREEZE_AT: 2 87 | NAME: build_resnet_fpn_backbone 88 | DEVICE: cuda 89 | FCOS: 90 | BOX_QUALITY: null 91 | CENTER_SAMPLE: null 92 | FPN_STRIDES: null 93 | INFERENCE_TH_TEST: null 94 | INFERENCE_TH_TRAIN: null 95 | IN_FEATURES: null 96 | LOC_LOSS_TYPE: null 97 | LOSS_ALPHA: null 98 | LOSS_GAMMA: null 99 | LOSS_NORMALIZER_CLS: null 100 | LOSS_WEIGHT_CLS: null 101 | NMS_TH: null 102 | NORM: null 103 | NUM_BOX_CONVS: 
null 104 | NUM_CLASSES: 10 105 | NUM_CLS_CONVS: null 106 | NUM_SHARE_CONVS: null 107 | POST_NMS_TOPK_TEST: null 108 | POST_NMS_TOPK_TRAIN: null 109 | POS_RADIUS: null 110 | PRE_NMS_TOPK_TEST: null 111 | PRE_NMS_TOPK_TRAIN: null 112 | PRIOR_PROB: null 113 | SIZES_OF_INTEREST: null 114 | THRESH_WITH_CTR: null 115 | TOP_LEVELS: null 116 | USE_DEFORMABLE: null 117 | USE_RELU: null 118 | USE_SCALE: null 119 | YIELD_BOX_FEATURES: null 120 | YIELD_PROPOSAL: null 121 | FPN: 122 | FUSE_TYPE: sum 123 | IN_FEATURES: 124 | - res2 125 | - res3 126 | - res4 127 | - res5 128 | NORM: '' 129 | OUT_CHANNELS: 256 130 | KEYPOINT_ON: false 131 | LOAD_PROPOSALS: false 132 | MASK_ON: false 133 | META_ARCHITECTURE: GeneralizedRCNN 134 | MOBILENET: null 135 | PIXEL_MEAN: 136 | - 103.53 137 | - 116.28 138 | - 123.675 139 | PIXEL_STD: 140 | - 1.0 141 | - 1.0 142 | - 1.0 143 | PROPOSAL_GENERATOR: 144 | MIN_SIZE: 0 145 | NAME: RPN 146 | RESNETS: 147 | DEFORM_INTERVAL: null 148 | DEFORM_MODULATED: false 149 | DEFORM_NUM_GROUPS: 1 150 | DEFORM_ON_PER_STAGE: 151 | - false 152 | - false 153 | - false 154 | - false 155 | DEPTH: 50 156 | NORM: FrozenBN 157 | NUM_GROUPS: 1 158 | OUT_FEATURES: 159 | - res2 160 | - res3 161 | - res4 162 | - res5 163 | RES2_OUT_CHANNELS: 256 164 | RES5_DILATION: 1 165 | STEM_OUT_CHANNELS: 64 166 | STRIDE_IN_1X1: true 167 | WIDTH_PER_GROUP: 64 168 | RETINANET: 169 | BBOX_REG_LOSS_TYPE: null 170 | BBOX_REG_WEIGHTS: null 171 | FOCAL_LOSS_ALPHA: null 172 | FOCAL_LOSS_GAMMA: null 173 | IN_FEATURES: null 174 | IOU_LABELS: null 175 | IOU_THRESHOLDS: null 176 | NMS_THRESH_TEST: null 177 | NORM: null 178 | NUM_CLASSES: 10 179 | NUM_CONVS: null 180 | PRIOR_PROB: null 181 | SCORE_THRESH_TEST: null 182 | SMOOTH_L1_LOSS_BETA: null 183 | TOPK_CANDIDATES_TEST: null 184 | ROI_BOX_HEAD: 185 | BBOX_REG_LOSS_TYPE: smooth_l1 186 | BBOX_REG_LOSS_WEIGHT: 1.0 187 | BBOX_REG_WEIGHTS: 188 | - 10.0 189 | - 10.0 190 | - 5.0 191 | - 5.0 192 | CLS_AGNOSTIC_BBOX_REG: false 193 | CONV_DIM: 256 194 | 
FC_DIM: 1024 195 | NAME: FastRCNNConvFCHead 196 | NORM: '' 197 | NUM_CONV: 0 198 | NUM_FC: 2 199 | POOLER_RESOLUTION: 7 200 | POOLER_SAMPLING_RATIO: 0 201 | POOLER_TYPE: ROIAlignV2 202 | SMOOTH_L1_BETA: 0.0 203 | TRAIN_ON_PRED_BOXES: false 204 | ROI_HEADS: 205 | BATCH_SIZE_PER_IMAGE: 512 206 | DETECTIONS_PER_IMAGE_TRAIN: null 207 | IN_FEATURES: 208 | - p2 209 | - p3 210 | - p4 211 | - p5 212 | IOU_LABELS: 213 | - 0 214 | - 1 215 | IOU_THRESHOLDS: 216 | - 0.5 217 | NAME: StandardROIHeads 218 | NMS_THRESH_TEST: 0.5 219 | NMS_THRESH_TRAIN: null 220 | NUM_CLASSES: 10 221 | POSITIVE_FRACTION: 0.25 222 | PROPOSAL_APPEND_GT: true 223 | SCORE_THRESH_TEST: 0.05 224 | SCORE_THRESH_TRAIN: null 225 | RPN: 226 | BATCH_SIZE_PER_IMAGE: 256 227 | BBOX_REG_LOSS_TYPE: smooth_l1 228 | BBOX_REG_LOSS_WEIGHT: 1.0 229 | BBOX_REG_WEIGHTS: 230 | - 1.0 231 | - 1.0 232 | - 1.0 233 | - 1.0 234 | BOUNDARY_THRESH: -1 235 | CONV_DIMS: 236 | - -1 237 | HEAD_NAME: StandardRPNHead 238 | IN_FEATURES: 239 | - p2 240 | - p3 241 | - p4 242 | - p5 243 | - p6 244 | IOU_LABELS: 245 | - 0 246 | - -1 247 | - 1 248 | IOU_THRESHOLDS: 249 | - 0.3 250 | - 0.7 251 | LOSS_WEIGHT: 1.0 252 | NMS_THRESH: 0.7 253 | POSITIVE_FRACTION: 0.5 254 | POST_NMS_TOPK_TEST: 1000 255 | POST_NMS_TOPK_TRAIN: 1000 256 | PRE_NMS_TOPK_TEST: 1000 257 | PRE_NMS_TOPK_TRAIN: 2000 258 | SMOOTH_L1_BETA: 0.0 259 | EMA_TEACHER: 260 | KEEP_RATE: null 261 | WEIGHTS: detectron2://ImageNetPretrained/MSRA/R-50.pkl 262 | OUTPUT_DIR: ./runs/nwpu_Nb=40_Ns=0_faster_rcnn_standard 263 | SEED: -1 264 | SOLVER: 265 | AMP: 266 | ENABLED: false 267 | BACKBONE_MULTIPLIER: null 268 | BASE_LR: 0.02 269 | BIAS_LR_FACTOR: 1.0 270 | CHECKPOINTS: [] 271 | CLIP_GRADIENTS: 272 | CLIP_TYPE: value 273 | CLIP_VALUE: 1.0 274 | ENABLED: false 275 | NORM_TYPE: 2.0 276 | GAMMA: 0.1 277 | IMS_PER_BATCH: 16 278 | LR_SCHEDULER_NAME: WarmupMultiStepLR 279 | MAX_ITER: 4000 280 | MOMENTUM: 0.9 281 | NESTEROV: false 282 | OPTIMIZER: null 283 | PATIENCE: 16 284 | 
REFERENCE_WORLD_SIZE: 0 285 | STEPS: [] 286 | USE_CHECKPOINT: null 287 | WARMUP_FACTOR: 0.001 288 | WARMUP_ITERS: 100 289 | WARMUP_METHOD: linear 290 | WEIGHT_DECAY: 0.0001 291 | WEIGHT_DECAY_BIAS: 0.0001 292 | WEIGHT_DECAY_NORM: 0.0 293 | TEST: 294 | AUG: 295 | ENABLED: false 296 | FLIP: false 297 | MAX_SIZE: 4000 298 | MIN_SIZES: 299 | - 400 300 | - 500 301 | - 600 302 | - 700 303 | - 800 304 | - 900 305 | - 1000 306 | - 1100 307 | - 1200 308 | DETECTIONS_PER_IMAGE: 100 309 | EVAL_PERIOD: 500 310 | EXPECTED_RESULTS: [] 311 | KEYPOINT_OKS_SIGMAS: [] 312 | PRECISE_BN: 313 | ENABLED: false 314 | NUM_ITER: 200 315 | VERSION: 2 316 | VIS_PERIOD: 1000 317 | -------------------------------------------------------------------------------- /configs/nwpu/faster_rcnn_standard_Nb=40_Ns=50.yaml: -------------------------------------------------------------------------------- 1 | CORRECTION: 2 | AUG: [] 3 | BOXES: 4 | DISTANCE_LIMIT: null 5 | LOWER_THRESH: null 6 | SOFTMAX_TEMP: null 7 | TYPE: null 8 | DETECTIONS_PER_IMAGE_TRAIN: null 9 | LABELS: 10 | MINING_THRESH: null 11 | TYPE: null 12 | NMS_THRESH_TRAIN: null 13 | SCORE_THRESH_TRAIN: null 14 | WARMUP: null 15 | CUDNN_BENCHMARK: false 16 | DATALOADER: 17 | ASPECT_RATIO_GROUPING: true 18 | FILTER_EMPTY_ANNOTATIONS: false 19 | NUM_WORKERS: 1 20 | REPEAT_THRESHOLD: 0.0 21 | SAMPLER_TRAIN: TrainingSampler 22 | DATASETS: 23 | NOISE: 24 | DROP_LABELS: 25 | P: 0.5 26 | UNIFORM_BBOX_NOISE: 27 | P: 0.4 28 | PRECOMPUTED_PROPOSAL_TOPK_TEST: 1000 29 | PRECOMPUTED_PROPOSAL_TOPK_TRAIN: 2000 30 | PROPOSAL_FILES_TEST: [] 31 | PROPOSAL_FILES_TRAIN: [] 32 | TEST: 33 | - nwpu_test 34 | TRAIN: 35 | - noisy_nwpu_train 36 | VAL: 37 | - nwpu_val 38 | GLOBAL: 39 | HACK: 1.0 40 | INPUT: 41 | CROP: 42 | ENABLED: false 43 | SIZE: 44 | - 0.9 45 | - 0.9 46 | TYPE: relative_range 47 | FORMAT: BGR 48 | MASK_FORMAT: polygon 49 | MAX_SIZE_TEST: 1333 50 | MAX_SIZE_TRAIN: 1333 51 | MIN_SIZE_TEST: 800 52 | MIN_SIZE_TRAIN: 53 | - 480 54 | - 512 55 | - 544 56 
| - 576 57 | - 608 58 | - 640 59 | - 672 60 | - 704 61 | - 736 62 | - 768 63 | - 800 64 | MIN_SIZE_TRAIN_SAMPLING: choice 65 | RANDOM_FLIP: horizontal&vertical 66 | MODEL: 67 | ANCHOR_GENERATOR: 68 | ANGLES: 69 | - - -90 70 | - 0 71 | - 90 72 | ASPECT_RATIOS: 73 | - - 0.5 74 | - 1.0 75 | - 2.0 76 | NAME: DefaultAnchorGenerator 77 | OFFSET: 0.0 78 | SIZES: 79 | - - 32 80 | - - 64 81 | - - 128 82 | - - 256 83 | - - 512 84 | BACKBONE: 85 | ANTI_ALIAS: null 86 | FREEZE_AT: 2 87 | NAME: build_resnet_fpn_backbone 88 | DEVICE: cuda 89 | FCOS: 90 | BOX_QUALITY: null 91 | CENTER_SAMPLE: null 92 | FPN_STRIDES: null 93 | INFERENCE_TH_TEST: null 94 | INFERENCE_TH_TRAIN: null 95 | IN_FEATURES: null 96 | LOC_LOSS_TYPE: null 97 | LOSS_ALPHA: null 98 | LOSS_GAMMA: null 99 | LOSS_NORMALIZER_CLS: null 100 | LOSS_WEIGHT_CLS: null 101 | NMS_TH: null 102 | NORM: null 103 | NUM_BOX_CONVS: null 104 | NUM_CLASSES: 10 105 | NUM_CLS_CONVS: null 106 | NUM_SHARE_CONVS: null 107 | POST_NMS_TOPK_TEST: null 108 | POST_NMS_TOPK_TRAIN: null 109 | POS_RADIUS: null 110 | PRE_NMS_TOPK_TEST: null 111 | PRE_NMS_TOPK_TRAIN: null 112 | PRIOR_PROB: null 113 | SIZES_OF_INTEREST: null 114 | THRESH_WITH_CTR: null 115 | TOP_LEVELS: null 116 | USE_DEFORMABLE: null 117 | USE_RELU: null 118 | USE_SCALE: null 119 | YIELD_BOX_FEATURES: null 120 | YIELD_PROPOSAL: null 121 | FPN: 122 | FUSE_TYPE: sum 123 | IN_FEATURES: 124 | - res2 125 | - res3 126 | - res4 127 | - res5 128 | NORM: '' 129 | OUT_CHANNELS: 256 130 | KEYPOINT_ON: false 131 | LOAD_PROPOSALS: false 132 | MASK_ON: false 133 | META_ARCHITECTURE: GeneralizedRCNN 134 | MOBILENET: null 135 | PIXEL_MEAN: 136 | - 103.53 137 | - 116.28 138 | - 123.675 139 | PIXEL_STD: 140 | - 1.0 141 | - 1.0 142 | - 1.0 143 | PROPOSAL_GENERATOR: 144 | MIN_SIZE: 0 145 | NAME: RPN 146 | RESNETS: 147 | DEFORM_INTERVAL: null 148 | DEFORM_MODULATED: false 149 | DEFORM_NUM_GROUPS: 1 150 | DEFORM_ON_PER_STAGE: 151 | - false 152 | - false 153 | - false 154 | - false 155 | DEPTH: 50 156 
| NORM: FrozenBN 157 | NUM_GROUPS: 1 158 | OUT_FEATURES: 159 | - res2 160 | - res3 161 | - res4 162 | - res5 163 | RES2_OUT_CHANNELS: 256 164 | RES5_DILATION: 1 165 | STEM_OUT_CHANNELS: 64 166 | STRIDE_IN_1X1: true 167 | WIDTH_PER_GROUP: 64 168 | RETINANET: 169 | BBOX_REG_LOSS_TYPE: null 170 | BBOX_REG_WEIGHTS: null 171 | FOCAL_LOSS_ALPHA: null 172 | FOCAL_LOSS_GAMMA: null 173 | IN_FEATURES: null 174 | IOU_LABELS: null 175 | IOU_THRESHOLDS: null 176 | NMS_THRESH_TEST: null 177 | NORM: null 178 | NUM_CLASSES: 10 179 | NUM_CONVS: null 180 | PRIOR_PROB: null 181 | SCORE_THRESH_TEST: null 182 | SMOOTH_L1_LOSS_BETA: null 183 | TOPK_CANDIDATES_TEST: null 184 | ROI_BOX_HEAD: 185 | BBOX_REG_LOSS_TYPE: smooth_l1 186 | BBOX_REG_LOSS_WEIGHT: 1.0 187 | BBOX_REG_WEIGHTS: 188 | - 10.0 189 | - 10.0 190 | - 5.0 191 | - 5.0 192 | CLS_AGNOSTIC_BBOX_REG: false 193 | CONV_DIM: 256 194 | FC_DIM: 1024 195 | NAME: FastRCNNConvFCHead 196 | NORM: '' 197 | NUM_CONV: 0 198 | NUM_FC: 2 199 | POOLER_RESOLUTION: 7 200 | POOLER_SAMPLING_RATIO: 0 201 | POOLER_TYPE: ROIAlignV2 202 | SMOOTH_L1_BETA: 0.0 203 | TRAIN_ON_PRED_BOXES: false 204 | ROI_HEADS: 205 | BATCH_SIZE_PER_IMAGE: 512 206 | DETECTIONS_PER_IMAGE_TRAIN: null 207 | IN_FEATURES: 208 | - p2 209 | - p3 210 | - p4 211 | - p5 212 | IOU_LABELS: 213 | - 0 214 | - 1 215 | IOU_THRESHOLDS: 216 | - 0.5 217 | NAME: StandardROIHeads 218 | NMS_THRESH_TEST: 0.5 219 | NMS_THRESH_TRAIN: null 220 | NUM_CLASSES: 10 221 | POSITIVE_FRACTION: 0.25 222 | PROPOSAL_APPEND_GT: true 223 | SCORE_THRESH_TEST: 0.05 224 | SCORE_THRESH_TRAIN: null 225 | RPN: 226 | BATCH_SIZE_PER_IMAGE: 256 227 | BBOX_REG_LOSS_TYPE: smooth_l1 228 | BBOX_REG_LOSS_WEIGHT: 1.0 229 | BBOX_REG_WEIGHTS: 230 | - 1.0 231 | - 1.0 232 | - 1.0 233 | - 1.0 234 | BOUNDARY_THRESH: -1 235 | CONV_DIMS: 236 | - -1 237 | HEAD_NAME: StandardRPNHead 238 | IN_FEATURES: 239 | - p2 240 | - p3 241 | - p4 242 | - p5 243 | - p6 244 | IOU_LABELS: 245 | - 0 246 | - -1 247 | - 1 248 | IOU_THRESHOLDS: 249 | - 0.3 
250 | - 0.7 251 | LOSS_WEIGHT: 1.0 252 | NMS_THRESH: 0.7 253 | POSITIVE_FRACTION: 0.5 254 | POST_NMS_TOPK_TEST: 1000 255 | POST_NMS_TOPK_TRAIN: 1000 256 | PRE_NMS_TOPK_TEST: 1000 257 | PRE_NMS_TOPK_TRAIN: 2000 258 | SMOOTH_L1_BETA: 0.0 259 | EMA_TEACHER: 260 | KEEP_RATE: null 261 | WEIGHTS: detectron2://ImageNetPretrained/MSRA/R-50.pkl 262 | OUTPUT_DIR: ./runs/nwpu_Nb=40_Ns=50_faster_rcnn_standard 263 | SEED: -1 264 | SOLVER: 265 | AMP: 266 | ENABLED: false 267 | BACKBONE_MULTIPLIER: null 268 | BASE_LR: 0.02 269 | BIAS_LR_FACTOR: 1.0 270 | CHECKPOINTS: [] 271 | CLIP_GRADIENTS: 272 | CLIP_TYPE: value 273 | CLIP_VALUE: 1.0 274 | ENABLED: false 275 | NORM_TYPE: 2.0 276 | GAMMA: 0.1 277 | IMS_PER_BATCH: 16 278 | LR_SCHEDULER_NAME: WarmupMultiStepLR 279 | MAX_ITER: 4000 280 | MOMENTUM: 0.9 281 | NESTEROV: false 282 | OPTIMIZER: null 283 | PATIENCE: 16 284 | REFERENCE_WORLD_SIZE: 0 285 | STEPS: [] 286 | USE_CHECKPOINT: null 287 | WARMUP_FACTOR: 0.001 288 | WARMUP_ITERS: 100 289 | WARMUP_METHOD: linear 290 | WEIGHT_DECAY: 0.0001 291 | WEIGHT_DECAY_BIAS: 0.0001 292 | WEIGHT_DECAY_NORM: 0.0 293 | TEST: 294 | AUG: 295 | ENABLED: false 296 | FLIP: false 297 | MAX_SIZE: 4000 298 | MIN_SIZES: 299 | - 400 300 | - 500 301 | - 600 302 | - 700 303 | - 800 304 | - 900 305 | - 1000 306 | - 1100 307 | - 1200 308 | DETECTIONS_PER_IMAGE: 100 309 | EVAL_PERIOD: 500 310 | EXPECTED_RESULTS: [] 311 | KEYPOINT_OKS_SIGMAS: [] 312 | PRECISE_BN: 313 | ENABLED: false 314 | NUM_ITER: 200 315 | VERSION: 2 316 | VIS_PERIOD: 1000 317 | -------------------------------------------------------------------------------- /configs/nwpu/faster_rcnn_standard_Nb=40_Ns=ex.yaml: -------------------------------------------------------------------------------- 1 | CORRECTION: 2 | AUG: [] 3 | BOXES: 4 | DISTANCE_LIMIT: null 5 | LOWER_THRESH: null 6 | SOFTMAX_TEMP: null 7 | TYPE: null 8 | DETECTIONS_PER_IMAGE_TRAIN: null 9 | LABELS: 10 | MINING_THRESH: null 11 | TYPE: null 12 | NMS_THRESH_TRAIN: null 13 | 
SCORE_THRESH_TRAIN: null 14 | WARMUP: null 15 | CUDNN_BENCHMARK: false 16 | DATALOADER: 17 | ASPECT_RATIO_GROUPING: true 18 | FILTER_EMPTY_ANNOTATIONS: false 19 | NUM_WORKERS: 1 20 | REPEAT_THRESHOLD: 0.0 21 | SAMPLER_TRAIN: TrainingSampler 22 | DATASETS: 23 | NOISE: 24 | DROP_LABELS: 25 | P: 1.0 26 | UNIFORM_BBOX_NOISE: 27 | P: 0.4 28 | PRECOMPUTED_PROPOSAL_TOPK_TEST: 1000 29 | PRECOMPUTED_PROPOSAL_TOPK_TRAIN: 2000 30 | PROPOSAL_FILES_TEST: [] 31 | PROPOSAL_FILES_TRAIN: [] 32 | TEST: 33 | - nwpu_test 34 | TRAIN: 35 | - noisy_nwpu_train 36 | VAL: 37 | - nwpu_val 38 | GLOBAL: 39 | HACK: 1.0 40 | INPUT: 41 | CROP: 42 | ENABLED: false 43 | SIZE: 44 | - 0.9 45 | - 0.9 46 | TYPE: relative_range 47 | FORMAT: BGR 48 | MASK_FORMAT: polygon 49 | MAX_SIZE_TEST: 1333 50 | MAX_SIZE_TRAIN: 1333 51 | MIN_SIZE_TEST: 800 52 | MIN_SIZE_TRAIN: 53 | - 480 54 | - 512 55 | - 544 56 | - 576 57 | - 608 58 | - 640 59 | - 672 60 | - 704 61 | - 736 62 | - 768 63 | - 800 64 | MIN_SIZE_TRAIN_SAMPLING: choice 65 | RANDOM_FLIP: horizontal&vertical 66 | MODEL: 67 | ANCHOR_GENERATOR: 68 | ANGLES: 69 | - - -90 70 | - 0 71 | - 90 72 | ASPECT_RATIOS: 73 | - - 0.5 74 | - 1.0 75 | - 2.0 76 | NAME: DefaultAnchorGenerator 77 | OFFSET: 0.0 78 | SIZES: 79 | - - 32 80 | - - 64 81 | - - 128 82 | - - 256 83 | - - 512 84 | BACKBONE: 85 | ANTI_ALIAS: null 86 | FREEZE_AT: 2 87 | NAME: build_resnet_fpn_backbone 88 | DEVICE: cuda 89 | FCOS: 90 | BOX_QUALITY: null 91 | CENTER_SAMPLE: null 92 | FPN_STRIDES: null 93 | INFERENCE_TH_TEST: null 94 | INFERENCE_TH_TRAIN: null 95 | IN_FEATURES: null 96 | LOC_LOSS_TYPE: null 97 | LOSS_ALPHA: null 98 | LOSS_GAMMA: null 99 | LOSS_NORMALIZER_CLS: null 100 | LOSS_WEIGHT_CLS: null 101 | NMS_TH: null 102 | NORM: null 103 | NUM_BOX_CONVS: null 104 | NUM_CLASSES: 10 105 | NUM_CLS_CONVS: null 106 | NUM_SHARE_CONVS: null 107 | POST_NMS_TOPK_TEST: null 108 | POST_NMS_TOPK_TRAIN: null 109 | POS_RADIUS: null 110 | PRE_NMS_TOPK_TEST: null 111 | PRE_NMS_TOPK_TRAIN: null 112 | PRIOR_PROB: 
null 113 | SIZES_OF_INTEREST: null 114 | THRESH_WITH_CTR: null 115 | TOP_LEVELS: null 116 | USE_DEFORMABLE: null 117 | USE_RELU: null 118 | USE_SCALE: null 119 | YIELD_BOX_FEATURES: null 120 | YIELD_PROPOSAL: null 121 | FPN: 122 | FUSE_TYPE: sum 123 | IN_FEATURES: 124 | - res2 125 | - res3 126 | - res4 127 | - res5 128 | NORM: '' 129 | OUT_CHANNELS: 256 130 | KEYPOINT_ON: false 131 | LOAD_PROPOSALS: false 132 | MASK_ON: false 133 | META_ARCHITECTURE: GeneralizedRCNN 134 | MOBILENET: null 135 | PIXEL_MEAN: 136 | - 103.53 137 | - 116.28 138 | - 123.675 139 | PIXEL_STD: 140 | - 1.0 141 | - 1.0 142 | - 1.0 143 | PROPOSAL_GENERATOR: 144 | MIN_SIZE: 0 145 | NAME: RPN 146 | RESNETS: 147 | DEFORM_INTERVAL: null 148 | DEFORM_MODULATED: false 149 | DEFORM_NUM_GROUPS: 1 150 | DEFORM_ON_PER_STAGE: 151 | - false 152 | - false 153 | - false 154 | - false 155 | DEPTH: 50 156 | NORM: FrozenBN 157 | NUM_GROUPS: 1 158 | OUT_FEATURES: 159 | - res2 160 | - res3 161 | - res4 162 | - res5 163 | RES2_OUT_CHANNELS: 256 164 | RES5_DILATION: 1 165 | STEM_OUT_CHANNELS: 64 166 | STRIDE_IN_1X1: true 167 | WIDTH_PER_GROUP: 64 168 | RETINANET: 169 | BBOX_REG_LOSS_TYPE: null 170 | BBOX_REG_WEIGHTS: null 171 | FOCAL_LOSS_ALPHA: null 172 | FOCAL_LOSS_GAMMA: null 173 | IN_FEATURES: null 174 | IOU_LABELS: null 175 | IOU_THRESHOLDS: null 176 | NMS_THRESH_TEST: null 177 | NORM: null 178 | NUM_CLASSES: 10 179 | NUM_CONVS: null 180 | PRIOR_PROB: null 181 | SCORE_THRESH_TEST: null 182 | SMOOTH_L1_LOSS_BETA: null 183 | TOPK_CANDIDATES_TEST: null 184 | ROI_BOX_HEAD: 185 | BBOX_REG_LOSS_TYPE: smooth_l1 186 | BBOX_REG_LOSS_WEIGHT: 1.0 187 | BBOX_REG_WEIGHTS: 188 | - 10.0 189 | - 10.0 190 | - 5.0 191 | - 5.0 192 | CLS_AGNOSTIC_BBOX_REG: false 193 | CONV_DIM: 256 194 | FC_DIM: 1024 195 | NAME: FastRCNNConvFCHead 196 | NORM: '' 197 | NUM_CONV: 0 198 | NUM_FC: 2 199 | POOLER_RESOLUTION: 7 200 | POOLER_SAMPLING_RATIO: 0 201 | POOLER_TYPE: ROIAlignV2 202 | SMOOTH_L1_BETA: 0.0 203 | TRAIN_ON_PRED_BOXES: false 204 | 
ROI_HEADS: 205 | BATCH_SIZE_PER_IMAGE: 512 206 | DETECTIONS_PER_IMAGE_TRAIN: null 207 | IN_FEATURES: 208 | - p2 209 | - p3 210 | - p4 211 | - p5 212 | IOU_LABELS: 213 | - 0 214 | - 1 215 | IOU_THRESHOLDS: 216 | - 0.5 217 | NAME: StandardROIHeads 218 | NMS_THRESH_TEST: 0.5 219 | NMS_THRESH_TRAIN: null 220 | NUM_CLASSES: 10 221 | POSITIVE_FRACTION: 0.25 222 | PROPOSAL_APPEND_GT: true 223 | SCORE_THRESH_TEST: 0.05 224 | SCORE_THRESH_TRAIN: null 225 | RPN: 226 | BATCH_SIZE_PER_IMAGE: 256 227 | BBOX_REG_LOSS_TYPE: smooth_l1 228 | BBOX_REG_LOSS_WEIGHT: 1.0 229 | BBOX_REG_WEIGHTS: 230 | - 1.0 231 | - 1.0 232 | - 1.0 233 | - 1.0 234 | BOUNDARY_THRESH: -1 235 | CONV_DIMS: 236 | - -1 237 | HEAD_NAME: StandardRPNHead 238 | IN_FEATURES: 239 | - p2 240 | - p3 241 | - p4 242 | - p5 243 | - p6 244 | IOU_LABELS: 245 | - 0 246 | - -1 247 | - 1 248 | IOU_THRESHOLDS: 249 | - 0.3 250 | - 0.7 251 | LOSS_WEIGHT: 1.0 252 | NMS_THRESH: 0.7 253 | POSITIVE_FRACTION: 0.5 254 | POST_NMS_TOPK_TEST: 1000 255 | POST_NMS_TOPK_TRAIN: 1000 256 | PRE_NMS_TOPK_TEST: 1000 257 | PRE_NMS_TOPK_TRAIN: 2000 258 | SMOOTH_L1_BETA: 0.0 259 | EMA_TEACHER: 260 | KEEP_RATE: null 261 | WEIGHTS: detectron2://ImageNetPretrained/MSRA/R-50.pkl 262 | OUTPUT_DIR: ./runs/nwpu_Nb=40_Ns=ex_faster_rcnn_standard 263 | SEED: -1 264 | SOLVER: 265 | AMP: 266 | ENABLED: false 267 | BACKBONE_MULTIPLIER: null 268 | BASE_LR: 0.02 269 | BIAS_LR_FACTOR: 1.0 270 | CHECKPOINTS: [] 271 | CLIP_GRADIENTS: 272 | CLIP_TYPE: value 273 | CLIP_VALUE: 1.0 274 | ENABLED: false 275 | NORM_TYPE: 2.0 276 | GAMMA: 0.1 277 | IMS_PER_BATCH: 16 278 | LR_SCHEDULER_NAME: WarmupMultiStepLR 279 | MAX_ITER: 4000 280 | MOMENTUM: 0.9 281 | NESTEROV: false 282 | OPTIMIZER: null 283 | PATIENCE: 16 284 | REFERENCE_WORLD_SIZE: 0 285 | STEPS: [] 286 | USE_CHECKPOINT: null 287 | WARMUP_FACTOR: 0.001 288 | WARMUP_ITERS: 100 289 | WARMUP_METHOD: linear 290 | WEIGHT_DECAY: 0.0001 291 | WEIGHT_DECAY_BIAS: 0.0001 292 | WEIGHT_DECAY_NORM: 0.0 293 | TEST: 294 | AUG: 295 
| ENABLED: false 296 | FLIP: false 297 | MAX_SIZE: 4000 298 | MIN_SIZES: 299 | - 400 300 | - 500 301 | - 600 302 | - 700 303 | - 800 304 | - 900 305 | - 1000 306 | - 1100 307 | - 1200 308 | DETECTIONS_PER_IMAGE: 100 309 | EVAL_PERIOD: 500 310 | EXPECTED_RESULTS: [] 311 | KEYPOINT_OKS_SIGMAS: [] 312 | PRECISE_BN: 313 | ENABLED: false 314 | NUM_ITER: 200 315 | VERSION: 2 316 | VIS_PERIOD: 1000 317 | -------------------------------------------------------------------------------- /configs/nwpu/faster_rcnn_standard_clean.yaml: -------------------------------------------------------------------------------- 1 | CORRECTION: 2 | AUG: [] 3 | BOXES: 4 | DISTANCE_LIMIT: null 5 | LOWER_THRESH: null 6 | SOFTMAX_TEMP: null 7 | TYPE: null 8 | DETECTIONS_PER_IMAGE_TRAIN: null 9 | LABELS: 10 | MINING_THRESH: null 11 | TYPE: null 12 | NMS_THRESH_TRAIN: null 13 | SCORE_THRESH_TRAIN: null 14 | WARMUP: null 15 | CUDNN_BENCHMARK: false 16 | DATALOADER: 17 | ASPECT_RATIO_GROUPING: true 18 | FILTER_EMPTY_ANNOTATIONS: false 19 | NUM_WORKERS: 1 20 | REPEAT_THRESHOLD: 0.0 21 | SAMPLER_TRAIN: TrainingSampler 22 | DATASETS: 23 | NOISE: 24 | DROP_LABELS: 25 | P: 0.0 26 | UNIFORM_BBOX_NOISE: 27 | P: 0.0 28 | PRECOMPUTED_PROPOSAL_TOPK_TEST: 1000 29 | PRECOMPUTED_PROPOSAL_TOPK_TRAIN: 2000 30 | PROPOSAL_FILES_TEST: [] 31 | PROPOSAL_FILES_TRAIN: [] 32 | TEST: 33 | - nwpu_test 34 | TRAIN: 35 | - nwpu_train 36 | VAL: 37 | - nwpu_val 38 | GLOBAL: 39 | HACK: 1.0 40 | INPUT: 41 | CROP: 42 | ENABLED: false 43 | SIZE: 44 | - 0.9 45 | - 0.9 46 | TYPE: relative_range 47 | FORMAT: BGR 48 | MASK_FORMAT: polygon 49 | MAX_SIZE_TEST: 1333 50 | MAX_SIZE_TRAIN: 1333 51 | MIN_SIZE_TEST: 800 52 | MIN_SIZE_TRAIN: 53 | - 480 54 | - 512 55 | - 544 56 | - 576 57 | - 608 58 | - 640 59 | - 672 60 | - 704 61 | - 736 62 | - 768 63 | - 800 64 | MIN_SIZE_TRAIN_SAMPLING: choice 65 | RANDOM_FLIP: horizontal&vertical 66 | MODEL: 67 | ANCHOR_GENERATOR: 68 | ANGLES: 69 | - - -90 70 | - 0 71 | - 90 72 | ASPECT_RATIOS: 73 | - - 0.5 
74 | - 1.0 75 | - 2.0 76 | NAME: DefaultAnchorGenerator 77 | OFFSET: 0.0 78 | SIZES: 79 | - - 32 80 | - - 64 81 | - - 128 82 | - - 256 83 | - - 512 84 | BACKBONE: 85 | ANTI_ALIAS: null 86 | FREEZE_AT: 2 87 | NAME: build_resnet_fpn_backbone 88 | DEVICE: cuda 89 | FCOS: 90 | BOX_QUALITY: null 91 | CENTER_SAMPLE: null 92 | FPN_STRIDES: null 93 | INFERENCE_TH_TEST: null 94 | INFERENCE_TH_TRAIN: null 95 | IN_FEATURES: null 96 | LOC_LOSS_TYPE: null 97 | LOSS_ALPHA: null 98 | LOSS_GAMMA: null 99 | LOSS_NORMALIZER_CLS: null 100 | LOSS_WEIGHT_CLS: null 101 | NMS_TH: null 102 | NORM: null 103 | NUM_BOX_CONVS: null 104 | NUM_CLASSES: 10 105 | NUM_CLS_CONVS: null 106 | NUM_SHARE_CONVS: null 107 | POST_NMS_TOPK_TEST: null 108 | POST_NMS_TOPK_TRAIN: null 109 | POS_RADIUS: null 110 | PRE_NMS_TOPK_TEST: null 111 | PRE_NMS_TOPK_TRAIN: null 112 | PRIOR_PROB: null 113 | SIZES_OF_INTEREST: null 114 | THRESH_WITH_CTR: null 115 | TOP_LEVELS: null 116 | USE_DEFORMABLE: null 117 | USE_RELU: null 118 | USE_SCALE: null 119 | YIELD_BOX_FEATURES: null 120 | YIELD_PROPOSAL: null 121 | FPN: 122 | FUSE_TYPE: sum 123 | IN_FEATURES: 124 | - res2 125 | - res3 126 | - res4 127 | - res5 128 | NORM: '' 129 | OUT_CHANNELS: 256 130 | KEYPOINT_ON: false 131 | LOAD_PROPOSALS: false 132 | MASK_ON: false 133 | META_ARCHITECTURE: GeneralizedRCNN 134 | MOBILENET: null 135 | PIXEL_MEAN: 136 | - 103.53 137 | - 116.28 138 | - 123.675 139 | PIXEL_STD: 140 | - 1.0 141 | - 1.0 142 | - 1.0 143 | PROPOSAL_GENERATOR: 144 | MIN_SIZE: 0 145 | NAME: RPN 146 | RESNETS: 147 | DEFORM_INTERVAL: null 148 | DEFORM_MODULATED: false 149 | DEFORM_NUM_GROUPS: 1 150 | DEFORM_ON_PER_STAGE: 151 | - false 152 | - false 153 | - false 154 | - false 155 | DEPTH: 50 156 | NORM: FrozenBN 157 | NUM_GROUPS: 1 158 | OUT_FEATURES: 159 | - res2 160 | - res3 161 | - res4 162 | - res5 163 | RES2_OUT_CHANNELS: 256 164 | RES5_DILATION: 1 165 | STEM_OUT_CHANNELS: 64 166 | STRIDE_IN_1X1: true 167 | WIDTH_PER_GROUP: 64 168 | RETINANET: 169 | 
BBOX_REG_LOSS_TYPE: null 170 | BBOX_REG_WEIGHTS: null 171 | FOCAL_LOSS_ALPHA: null 172 | FOCAL_LOSS_GAMMA: null 173 | IN_FEATURES: null 174 | IOU_LABELS: null 175 | IOU_THRESHOLDS: null 176 | NMS_THRESH_TEST: null 177 | NORM: null 178 | NUM_CLASSES: 10 179 | NUM_CONVS: null 180 | PRIOR_PROB: null 181 | SCORE_THRESH_TEST: null 182 | SMOOTH_L1_LOSS_BETA: null 183 | TOPK_CANDIDATES_TEST: null 184 | ROI_BOX_HEAD: 185 | BBOX_REG_LOSS_TYPE: smooth_l1 186 | BBOX_REG_LOSS_WEIGHT: 1.0 187 | BBOX_REG_WEIGHTS: 188 | - 10.0 189 | - 10.0 190 | - 5.0 191 | - 5.0 192 | CLS_AGNOSTIC_BBOX_REG: false 193 | CONV_DIM: 256 194 | FC_DIM: 1024 195 | NAME: FastRCNNConvFCHead 196 | NORM: '' 197 | NUM_CONV: 0 198 | NUM_FC: 2 199 | POOLER_RESOLUTION: 7 200 | POOLER_SAMPLING_RATIO: 0 201 | POOLER_TYPE: ROIAlignV2 202 | SMOOTH_L1_BETA: 0.0 203 | TRAIN_ON_PRED_BOXES: false 204 | ROI_HEADS: 205 | BATCH_SIZE_PER_IMAGE: 512 206 | DETECTIONS_PER_IMAGE_TRAIN: null 207 | IN_FEATURES: 208 | - p2 209 | - p3 210 | - p4 211 | - p5 212 | IOU_LABELS: 213 | - 0 214 | - 1 215 | IOU_THRESHOLDS: 216 | - 0.5 217 | NAME: StandardROIHeads 218 | NMS_THRESH_TEST: 0.5 219 | NMS_THRESH_TRAIN: null 220 | NUM_CLASSES: 10 221 | POSITIVE_FRACTION: 0.25 222 | PROPOSAL_APPEND_GT: true 223 | SCORE_THRESH_TEST: 0.05 224 | SCORE_THRESH_TRAIN: null 225 | RPN: 226 | BATCH_SIZE_PER_IMAGE: 256 227 | BBOX_REG_LOSS_TYPE: smooth_l1 228 | BBOX_REG_LOSS_WEIGHT: 1.0 229 | BBOX_REG_WEIGHTS: 230 | - 1.0 231 | - 1.0 232 | - 1.0 233 | - 1.0 234 | BOUNDARY_THRESH: -1 235 | CONV_DIMS: 236 | - -1 237 | HEAD_NAME: StandardRPNHead 238 | IN_FEATURES: 239 | - p2 240 | - p3 241 | - p4 242 | - p5 243 | - p6 244 | IOU_LABELS: 245 | - 0 246 | - -1 247 | - 1 248 | IOU_THRESHOLDS: 249 | - 0.3 250 | - 0.7 251 | LOSS_WEIGHT: 1.0 252 | NMS_THRESH: 0.7 253 | POSITIVE_FRACTION: 0.5 254 | POST_NMS_TOPK_TEST: 1000 255 | POST_NMS_TOPK_TRAIN: 1000 256 | PRE_NMS_TOPK_TEST: 1000 257 | PRE_NMS_TOPK_TRAIN: 2000 258 | SMOOTH_L1_BETA: 0.0 259 | EMA_TEACHER: 260 | 
KEEP_RATE: null 261 | WEIGHTS: detectron2://ImageNetPretrained/MSRA/R-50.pkl 262 | OUTPUT_DIR: ./runs/nwpu_clean_faster_rcnn_standard 263 | SEED: -1 264 | SOLVER: 265 | AMP: 266 | ENABLED: false 267 | BACKBONE_MULTIPLIER: null 268 | BASE_LR: 0.02 269 | BIAS_LR_FACTOR: 1.0 270 | CHECKPOINTS: [] 271 | CLIP_GRADIENTS: 272 | CLIP_TYPE: value 273 | CLIP_VALUE: 1.0 274 | ENABLED: false 275 | NORM_TYPE: 2.0 276 | GAMMA: 0.1 277 | IMS_PER_BATCH: 16 278 | LR_SCHEDULER_NAME: WarmupMultiStepLR 279 | MAX_ITER: 4000 280 | MOMENTUM: 0.9 281 | NESTEROV: false 282 | OPTIMIZER: null 283 | PATIENCE: 16 284 | REFERENCE_WORLD_SIZE: 0 285 | STEPS: [] 286 | USE_CHECKPOINT: null 287 | WARMUP_FACTOR: 0.001 288 | WARMUP_ITERS: 100 289 | WARMUP_METHOD: linear 290 | WEIGHT_DECAY: 0.0001 291 | WEIGHT_DECAY_BIAS: 0.0001 292 | WEIGHT_DECAY_NORM: 0.0 293 | TEST: 294 | AUG: 295 | ENABLED: false 296 | FLIP: false 297 | MAX_SIZE: 4000 298 | MIN_SIZES: 299 | - 400 300 | - 500 301 | - 600 302 | - 700 303 | - 800 304 | - 900 305 | - 1000 306 | - 1100 307 | - 1200 308 | DETECTIONS_PER_IMAGE: 100 309 | EVAL_PERIOD: 500 310 | EXPECTED_RESULTS: [] 311 | KEYPOINT_OKS_SIGMAS: [] 312 | PRECISE_BN: 313 | ENABLED: false 314 | NUM_ITER: 200 315 | VERSION: 2 316 | VIS_PERIOD: 1000 317 | -------------------------------------------------------------------------------- /configs/voc/faster_rcnn_standard_Nb=0_Ns=50.yaml: -------------------------------------------------------------------------------- 1 | CORRECTION: 2 | AUG: [] 3 | BOXES: 4 | DISTANCE_LIMIT: null 5 | LOWER_THRESH: null 6 | SOFTMAX_TEMP: null 7 | TYPE: null 8 | DETECTIONS_PER_IMAGE_TRAIN: null 9 | LABELS: 10 | MINING_THRESH: null 11 | TYPE: null 12 | NMS_THRESH_TRAIN: null 13 | SCORE_THRESH_TRAIN: null 14 | WARMUP: null 15 | CUDNN_BENCHMARK: false 16 | DATALOADER: 17 | ASPECT_RATIO_GROUPING: true 18 | FILTER_EMPTY_ANNOTATIONS: false 19 | NUM_WORKERS: 8 20 | REPEAT_THRESHOLD: 0.0 21 | SAMPLER_TRAIN: TrainingSampler 22 | DATASETS: 23 | NOISE: 24 | 
DROP_LABELS: 25 | P: 0.5 26 | UNIFORM_BBOX_NOISE: 27 | P: 0.0 28 | PRECOMPUTED_PROPOSAL_TOPK_TEST: 1000 29 | PRECOMPUTED_PROPOSAL_TOPK_TRAIN: 2000 30 | PROPOSAL_FILES_TEST: [] 31 | PROPOSAL_FILES_TRAIN: [] 32 | TEST: 33 | - voc_2007_test 34 | TRAIN: 35 | - noisy_voc_2007_custom_train 36 | - noisy_voc_2012_trainval 37 | VAL: 38 | - voc_2007_custom_val 39 | GLOBAL: 40 | HACK: 1.0 41 | INPUT: 42 | CROP: 43 | ENABLED: false 44 | SIZE: 45 | - 0.9 46 | - 0.9 47 | TYPE: relative_range 48 | FORMAT: BGR 49 | MASK_FORMAT: polygon 50 | MAX_SIZE_TEST: 1333 51 | MAX_SIZE_TRAIN: 1333 52 | MIN_SIZE_TEST: 800 53 | MIN_SIZE_TRAIN: 54 | - 480 55 | - 512 56 | - 544 57 | - 576 58 | - 608 59 | - 640 60 | - 672 61 | - 704 62 | - 736 63 | - 768 64 | - 800 65 | MIN_SIZE_TRAIN_SAMPLING: choice 66 | RANDOM_FLIP: horizontal 67 | MODEL: 68 | ANCHOR_GENERATOR: 69 | ANGLES: 70 | - - -90 71 | - 0 72 | - 90 73 | ASPECT_RATIOS: 74 | - - 0.5 75 | - 1.0 76 | - 2.0 77 | NAME: DefaultAnchorGenerator 78 | OFFSET: 0.0 79 | SIZES: 80 | - - 32 81 | - - 64 82 | - - 128 83 | - - 256 84 | - - 512 85 | BACKBONE: 86 | ANTI_ALIAS: null 87 | FREEZE_AT: 2 88 | NAME: build_resnet_fpn_backbone 89 | DEVICE: cuda 90 | FCOS: 91 | BOX_QUALITY: null 92 | CENTER_SAMPLE: null 93 | FPN_STRIDES: null 94 | INFERENCE_TH_TEST: null 95 | INFERENCE_TH_TRAIN: null 96 | IN_FEATURES: null 97 | LOC_LOSS_TYPE: null 98 | LOSS_ALPHA: null 99 | LOSS_GAMMA: null 100 | LOSS_NORMALIZER_CLS: null 101 | LOSS_WEIGHT_CLS: null 102 | NMS_TH: null 103 | NORM: null 104 | NUM_BOX_CONVS: null 105 | NUM_CLASSES: 20 106 | NUM_CLS_CONVS: null 107 | NUM_SHARE_CONVS: null 108 | POST_NMS_TOPK_TEST: null 109 | POST_NMS_TOPK_TRAIN: null 110 | POS_RADIUS: null 111 | PRE_NMS_TOPK_TEST: null 112 | PRE_NMS_TOPK_TRAIN: null 113 | PRIOR_PROB: null 114 | SIZES_OF_INTEREST: null 115 | THRESH_WITH_CTR: null 116 | TOP_LEVELS: null 117 | USE_DEFORMABLE: null 118 | USE_RELU: null 119 | USE_SCALE: null 120 | YIELD_BOX_FEATURES: null 121 | YIELD_PROPOSAL: null 122 | 
FPN: 123 | FUSE_TYPE: sum 124 | IN_FEATURES: 125 | - res2 126 | - res3 127 | - res4 128 | - res5 129 | NORM: '' 130 | OUT_CHANNELS: 256 131 | KEYPOINT_ON: false 132 | LOAD_PROPOSALS: false 133 | MASK_ON: false 134 | META_ARCHITECTURE: GeneralizedRCNN 135 | MOBILENET: null 136 | PIXEL_MEAN: 137 | - 103.53 138 | - 116.28 139 | - 123.675 140 | PIXEL_STD: 141 | - 1.0 142 | - 1.0 143 | - 1.0 144 | PROPOSAL_GENERATOR: 145 | MIN_SIZE: 0 146 | NAME: RPN 147 | RESNETS: 148 | DEFORM_INTERVAL: null 149 | DEFORM_MODULATED: false 150 | DEFORM_NUM_GROUPS: 1 151 | DEFORM_ON_PER_STAGE: 152 | - false 153 | - false 154 | - false 155 | - false 156 | DEPTH: 50 157 | NORM: FrozenBN 158 | NUM_GROUPS: 1 159 | OUT_FEATURES: 160 | - res2 161 | - res3 162 | - res4 163 | - res5 164 | RES2_OUT_CHANNELS: 256 165 | RES5_DILATION: 1 166 | STEM_OUT_CHANNELS: 64 167 | STRIDE_IN_1X1: true 168 | WIDTH_PER_GROUP: 64 169 | RETINANET: 170 | BBOX_REG_LOSS_TYPE: null 171 | BBOX_REG_WEIGHTS: null 172 | FOCAL_LOSS_ALPHA: null 173 | FOCAL_LOSS_GAMMA: null 174 | IN_FEATURES: null 175 | IOU_LABELS: null 176 | IOU_THRESHOLDS: null 177 | NMS_THRESH_TEST: null 178 | NORM: null 179 | NUM_CLASSES: 20 180 | NUM_CONVS: null 181 | PRIOR_PROB: null 182 | SCORE_THRESH_TEST: null 183 | SMOOTH_L1_LOSS_BETA: null 184 | TOPK_CANDIDATES_TEST: null 185 | ROI_BOX_HEAD: 186 | BBOX_REG_LOSS_TYPE: smooth_l1 187 | BBOX_REG_LOSS_WEIGHT: 1.0 188 | BBOX_REG_WEIGHTS: 189 | - 10.0 190 | - 10.0 191 | - 5.0 192 | - 5.0 193 | CLS_AGNOSTIC_BBOX_REG: false 194 | CONV_DIM: 256 195 | FC_DIM: 1024 196 | NAME: FastRCNNConvFCHead 197 | NORM: '' 198 | NUM_CONV: 0 199 | NUM_FC: 2 200 | POOLER_RESOLUTION: 7 201 | POOLER_SAMPLING_RATIO: 0 202 | POOLER_TYPE: ROIAlignV2 203 | SMOOTH_L1_BETA: 0.0 204 | TRAIN_ON_PRED_BOXES: false 205 | ROI_HEADS: 206 | BATCH_SIZE_PER_IMAGE: 512 207 | DETECTIONS_PER_IMAGE_TRAIN: null 208 | IN_FEATURES: 209 | - p2 210 | - p3 211 | - p4 212 | - p5 213 | IOU_LABELS: 214 | - 0 215 | - 1 216 | IOU_THRESHOLDS: 217 | - 0.5 218 
| NAME: StandardROIHeads 219 | NMS_THRESH_TEST: 0.5 220 | NMS_THRESH_TRAIN: null 221 | NUM_CLASSES: 20 222 | POSITIVE_FRACTION: 0.25 223 | PROPOSAL_APPEND_GT: true 224 | SCORE_THRESH_TEST: 0.05 225 | SCORE_THRESH_TRAIN: null 226 | RPN: 227 | BATCH_SIZE_PER_IMAGE: 256 228 | BBOX_REG_LOSS_TYPE: smooth_l1 229 | BBOX_REG_LOSS_WEIGHT: 1.0 230 | BBOX_REG_WEIGHTS: 231 | - 1.0 232 | - 1.0 233 | - 1.0 234 | - 1.0 235 | BOUNDARY_THRESH: -1 236 | CONV_DIMS: 237 | - -1 238 | HEAD_NAME: StandardRPNHead 239 | IN_FEATURES: 240 | - p2 241 | - p3 242 | - p4 243 | - p5 244 | - p6 245 | IOU_LABELS: 246 | - 0 247 | - -1 248 | - 1 249 | IOU_THRESHOLDS: 250 | - 0.3 251 | - 0.7 252 | LOSS_WEIGHT: 1.0 253 | NMS_THRESH: 0.7 254 | POSITIVE_FRACTION: 0.5 255 | POST_NMS_TOPK_TEST: 1000 256 | POST_NMS_TOPK_TRAIN: 1000 257 | PRE_NMS_TOPK_TEST: 1000 258 | PRE_NMS_TOPK_TRAIN: 2000 259 | SMOOTH_L1_BETA: 0.0 260 | EMA_TEACHER: 261 | KEEP_RATE: null 262 | WEIGHTS: detectron2://ImageNetPretrained/MSRA/R-50.pkl 263 | OUTPUT_DIR: ./runs/voc_Nb=0_Ns=50_faster_rcnn_standard 264 | SEED: -1 265 | SOLVER: 266 | AMP: 267 | ENABLED: false 268 | BACKBONE_MULTIPLIER: null 269 | BASE_LR: 0.02 270 | BIAS_LR_FACTOR: 1.0 271 | CHECKPOINTS: 272 | - 20000 273 | - 32000 274 | CLIP_GRADIENTS: 275 | CLIP_TYPE: value 276 | CLIP_VALUE: 1.0 277 | ENABLED: false 278 | NORM_TYPE: 2.0 279 | GAMMA: 0.1 280 | IMS_PER_BATCH: 16 281 | LR_SCHEDULER_NAME: WarmupMultiStepLR 282 | MAX_ITER: 36000 283 | MOMENTUM: 0.9 284 | NESTEROV: false 285 | OPTIMIZER: null 286 | PATIENCE: 10 287 | REFERENCE_WORLD_SIZE: 0 288 | STEPS: 289 | - 20000 290 | - 32000 291 | USE_CHECKPOINT: null 292 | WARMUP_FACTOR: 0.001 293 | WARMUP_ITERS: 100 294 | WARMUP_METHOD: linear 295 | WEIGHT_DECAY: 0.0001 296 | WEIGHT_DECAY_BIAS: 0.0001 297 | WEIGHT_DECAY_NORM: 0.0 298 | TEST: 299 | AUG: 300 | ENABLED: false 301 | FLIP: false 302 | MAX_SIZE: 4000 303 | MIN_SIZES: 304 | - 400 305 | - 500 306 | - 600 307 | - 700 308 | - 800 309 | - 900 310 | - 1000 311 | - 1100 
312 | - 1200 313 | DETECTIONS_PER_IMAGE: 100 314 | EVAL_PERIOD: 1000 315 | EXPECTED_RESULTS: [] 316 | KEYPOINT_OKS_SIGMAS: [] 317 | PRECISE_BN: 318 | ENABLED: false 319 | NUM_ITER: 200 320 | VERSION: 2 321 | VIS_PERIOD: 1000 322 | -------------------------------------------------------------------------------- /configs/voc/faster_rcnn_standard_Nb=0_Ns=ex.yaml: -------------------------------------------------------------------------------- 1 | CORRECTION: 2 | AUG: [] 3 | BOXES: 4 | DISTANCE_LIMIT: null 5 | LOWER_THRESH: null 6 | SOFTMAX_TEMP: null 7 | TYPE: null 8 | DETECTIONS_PER_IMAGE_TRAIN: null 9 | LABELS: 10 | MINING_THRESH: null 11 | TYPE: null 12 | NMS_THRESH_TRAIN: null 13 | SCORE_THRESH_TRAIN: null 14 | WARMUP: null 15 | CUDNN_BENCHMARK: false 16 | DATALOADER: 17 | ASPECT_RATIO_GROUPING: true 18 | FILTER_EMPTY_ANNOTATIONS: false 19 | NUM_WORKERS: 8 20 | REPEAT_THRESHOLD: 0.0 21 | SAMPLER_TRAIN: TrainingSampler 22 | DATASETS: 23 | NOISE: 24 | DROP_LABELS: 25 | P: 1.0 26 | UNIFORM_BBOX_NOISE: 27 | P: 0.0 28 | PRECOMPUTED_PROPOSAL_TOPK_TEST: 1000 29 | PRECOMPUTED_PROPOSAL_TOPK_TRAIN: 2000 30 | PROPOSAL_FILES_TEST: [] 31 | PROPOSAL_FILES_TRAIN: [] 32 | TEST: 33 | - voc_2007_test 34 | TRAIN: 35 | - noisy_voc_2007_custom_train 36 | - noisy_voc_2012_trainval 37 | VAL: 38 | - voc_2007_custom_val 39 | GLOBAL: 40 | HACK: 1.0 41 | INPUT: 42 | CROP: 43 | ENABLED: false 44 | SIZE: 45 | - 0.9 46 | - 0.9 47 | TYPE: relative_range 48 | FORMAT: BGR 49 | MASK_FORMAT: polygon 50 | MAX_SIZE_TEST: 1333 51 | MAX_SIZE_TRAIN: 1333 52 | MIN_SIZE_TEST: 800 53 | MIN_SIZE_TRAIN: 54 | - 480 55 | - 512 56 | - 544 57 | - 576 58 | - 608 59 | - 640 60 | - 672 61 | - 704 62 | - 736 63 | - 768 64 | - 800 65 | MIN_SIZE_TRAIN_SAMPLING: choice 66 | RANDOM_FLIP: horizontal 67 | MODEL: 68 | ANCHOR_GENERATOR: 69 | ANGLES: 70 | - - -90 71 | - 0 72 | - 90 73 | ASPECT_RATIOS: 74 | - - 0.5 75 | - 1.0 76 | - 2.0 77 | NAME: DefaultAnchorGenerator 78 | OFFSET: 0.0 79 | SIZES: 80 | - - 32 81 | - - 64 82 
| - - 128 83 | - - 256 84 | - - 512 85 | BACKBONE: 86 | ANTI_ALIAS: null 87 | FREEZE_AT: 2 88 | NAME: build_resnet_fpn_backbone 89 | DEVICE: cuda 90 | FCOS: 91 | BOX_QUALITY: null 92 | CENTER_SAMPLE: null 93 | FPN_STRIDES: null 94 | INFERENCE_TH_TEST: null 95 | INFERENCE_TH_TRAIN: null 96 | IN_FEATURES: null 97 | LOC_LOSS_TYPE: null 98 | LOSS_ALPHA: null 99 | LOSS_GAMMA: null 100 | LOSS_NORMALIZER_CLS: null 101 | LOSS_WEIGHT_CLS: null 102 | NMS_TH: null 103 | NORM: null 104 | NUM_BOX_CONVS: null 105 | NUM_CLASSES: 20 106 | NUM_CLS_CONVS: null 107 | NUM_SHARE_CONVS: null 108 | POST_NMS_TOPK_TEST: null 109 | POST_NMS_TOPK_TRAIN: null 110 | POS_RADIUS: null 111 | PRE_NMS_TOPK_TEST: null 112 | PRE_NMS_TOPK_TRAIN: null 113 | PRIOR_PROB: null 114 | SIZES_OF_INTEREST: null 115 | THRESH_WITH_CTR: null 116 | TOP_LEVELS: null 117 | USE_DEFORMABLE: null 118 | USE_RELU: null 119 | USE_SCALE: null 120 | YIELD_BOX_FEATURES: null 121 | YIELD_PROPOSAL: null 122 | FPN: 123 | FUSE_TYPE: sum 124 | IN_FEATURES: 125 | - res2 126 | - res3 127 | - res4 128 | - res5 129 | NORM: '' 130 | OUT_CHANNELS: 256 131 | KEYPOINT_ON: false 132 | LOAD_PROPOSALS: false 133 | MASK_ON: false 134 | META_ARCHITECTURE: GeneralizedRCNN 135 | MOBILENET: null 136 | PIXEL_MEAN: 137 | - 103.53 138 | - 116.28 139 | - 123.675 140 | PIXEL_STD: 141 | - 1.0 142 | - 1.0 143 | - 1.0 144 | PROPOSAL_GENERATOR: 145 | MIN_SIZE: 0 146 | NAME: RPN 147 | RESNETS: 148 | DEFORM_INTERVAL: null 149 | DEFORM_MODULATED: false 150 | DEFORM_NUM_GROUPS: 1 151 | DEFORM_ON_PER_STAGE: 152 | - false 153 | - false 154 | - false 155 | - false 156 | DEPTH: 50 157 | NORM: FrozenBN 158 | NUM_GROUPS: 1 159 | OUT_FEATURES: 160 | - res2 161 | - res3 162 | - res4 163 | - res5 164 | RES2_OUT_CHANNELS: 256 165 | RES5_DILATION: 1 166 | STEM_OUT_CHANNELS: 64 167 | STRIDE_IN_1X1: true 168 | WIDTH_PER_GROUP: 64 169 | RETINANET: 170 | BBOX_REG_LOSS_TYPE: null 171 | BBOX_REG_WEIGHTS: null 172 | FOCAL_LOSS_ALPHA: null 173 | FOCAL_LOSS_GAMMA: null 174 | 
IN_FEATURES: null 175 | IOU_LABELS: null 176 | IOU_THRESHOLDS: null 177 | NMS_THRESH_TEST: null 178 | NORM: null 179 | NUM_CLASSES: 20 180 | NUM_CONVS: null 181 | PRIOR_PROB: null 182 | SCORE_THRESH_TEST: null 183 | SMOOTH_L1_LOSS_BETA: null 184 | TOPK_CANDIDATES_TEST: null 185 | ROI_BOX_HEAD: 186 | BBOX_REG_LOSS_TYPE: smooth_l1 187 | BBOX_REG_LOSS_WEIGHT: 1.0 188 | BBOX_REG_WEIGHTS: 189 | - 10.0 190 | - 10.0 191 | - 5.0 192 | - 5.0 193 | CLS_AGNOSTIC_BBOX_REG: false 194 | CONV_DIM: 256 195 | FC_DIM: 1024 196 | NAME: FastRCNNConvFCHead 197 | NORM: '' 198 | NUM_CONV: 0 199 | NUM_FC: 2 200 | POOLER_RESOLUTION: 7 201 | POOLER_SAMPLING_RATIO: 0 202 | POOLER_TYPE: ROIAlignV2 203 | SMOOTH_L1_BETA: 0.0 204 | TRAIN_ON_PRED_BOXES: false 205 | ROI_HEADS: 206 | BATCH_SIZE_PER_IMAGE: 512 207 | DETECTIONS_PER_IMAGE_TRAIN: null 208 | IN_FEATURES: 209 | - p2 210 | - p3 211 | - p4 212 | - p5 213 | IOU_LABELS: 214 | - 0 215 | - 1 216 | IOU_THRESHOLDS: 217 | - 0.5 218 | NAME: StandardROIHeads 219 | NMS_THRESH_TEST: 0.5 220 | NMS_THRESH_TRAIN: null 221 | NUM_CLASSES: 20 222 | POSITIVE_FRACTION: 0.25 223 | PROPOSAL_APPEND_GT: true 224 | SCORE_THRESH_TEST: 0.05 225 | SCORE_THRESH_TRAIN: null 226 | RPN: 227 | BATCH_SIZE_PER_IMAGE: 256 228 | BBOX_REG_LOSS_TYPE: smooth_l1 229 | BBOX_REG_LOSS_WEIGHT: 1.0 230 | BBOX_REG_WEIGHTS: 231 | - 1.0 232 | - 1.0 233 | - 1.0 234 | - 1.0 235 | BOUNDARY_THRESH: -1 236 | CONV_DIMS: 237 | - -1 238 | HEAD_NAME: StandardRPNHead 239 | IN_FEATURES: 240 | - p2 241 | - p3 242 | - p4 243 | - p5 244 | - p6 245 | IOU_LABELS: 246 | - 0 247 | - -1 248 | - 1 249 | IOU_THRESHOLDS: 250 | - 0.3 251 | - 0.7 252 | LOSS_WEIGHT: 1.0 253 | NMS_THRESH: 0.7 254 | POSITIVE_FRACTION: 0.5 255 | POST_NMS_TOPK_TEST: 1000 256 | POST_NMS_TOPK_TRAIN: 1000 257 | PRE_NMS_TOPK_TEST: 1000 258 | PRE_NMS_TOPK_TRAIN: 2000 259 | SMOOTH_L1_BETA: 0.0 260 | EMA_TEACHER: 261 | KEEP_RATE: null 262 | WEIGHTS: detectron2://ImageNetPretrained/MSRA/R-50.pkl 263 | OUTPUT_DIR: 
./runs/voc_Nb=0_Ns=ex_faster_rcnn_standard 264 | SEED: -1 265 | SOLVER: 266 | AMP: 267 | ENABLED: false 268 | BACKBONE_MULTIPLIER: null 269 | BASE_LR: 0.02 270 | BIAS_LR_FACTOR: 1.0 271 | CHECKPOINTS: 272 | - 20000 273 | - 32000 274 | CLIP_GRADIENTS: 275 | CLIP_TYPE: value 276 | CLIP_VALUE: 1.0 277 | ENABLED: false 278 | NORM_TYPE: 2.0 279 | GAMMA: 0.1 280 | IMS_PER_BATCH: 16 281 | LR_SCHEDULER_NAME: WarmupMultiStepLR 282 | MAX_ITER: 36000 283 | MOMENTUM: 0.9 284 | NESTEROV: false 285 | OPTIMIZER: null 286 | PATIENCE: 10 287 | REFERENCE_WORLD_SIZE: 0 288 | STEPS: 289 | - 20000 290 | - 32000 291 | USE_CHECKPOINT: null 292 | WARMUP_FACTOR: 0.001 293 | WARMUP_ITERS: 100 294 | WARMUP_METHOD: linear 295 | WEIGHT_DECAY: 0.0001 296 | WEIGHT_DECAY_BIAS: 0.0001 297 | WEIGHT_DECAY_NORM: 0.0 298 | TEST: 299 | AUG: 300 | ENABLED: false 301 | FLIP: false 302 | MAX_SIZE: 4000 303 | MIN_SIZES: 304 | - 400 305 | - 500 306 | - 600 307 | - 700 308 | - 800 309 | - 900 310 | - 1000 311 | - 1100 312 | - 1200 313 | DETECTIONS_PER_IMAGE: 100 314 | EVAL_PERIOD: 1000 315 | EXPECTED_RESULTS: [] 316 | KEYPOINT_OKS_SIGMAS: [] 317 | PRECISE_BN: 318 | ENABLED: false 319 | NUM_ITER: 200 320 | VERSION: 2 321 | VIS_PERIOD: 1000 322 | -------------------------------------------------------------------------------- /configs/voc/faster_rcnn_standard_Nb=20_Ns=0.yaml: -------------------------------------------------------------------------------- 1 | CORRECTION: 2 | AUG: [] 3 | BOXES: 4 | DISTANCE_LIMIT: null 5 | LOWER_THRESH: null 6 | SOFTMAX_TEMP: null 7 | TYPE: null 8 | DETECTIONS_PER_IMAGE_TRAIN: null 9 | LABELS: 10 | MINING_THRESH: null 11 | TYPE: null 12 | NMS_THRESH_TRAIN: null 13 | SCORE_THRESH_TRAIN: null 14 | WARMUP: null 15 | CUDNN_BENCHMARK: false 16 | DATALOADER: 17 | ASPECT_RATIO_GROUPING: true 18 | FILTER_EMPTY_ANNOTATIONS: false 19 | NUM_WORKERS: 8 20 | REPEAT_THRESHOLD: 0.0 21 | SAMPLER_TRAIN: TrainingSampler 22 | DATASETS: 23 | NOISE: 24 | DROP_LABELS: 25 | P: 0.0 26 | 
UNIFORM_BBOX_NOISE: 27 | P: 0.2 28 | PRECOMPUTED_PROPOSAL_TOPK_TEST: 1000 29 | PRECOMPUTED_PROPOSAL_TOPK_TRAIN: 2000 30 | PROPOSAL_FILES_TEST: [] 31 | PROPOSAL_FILES_TRAIN: [] 32 | TEST: 33 | - voc_2007_test 34 | TRAIN: 35 | - noisy_voc_2007_custom_train 36 | - noisy_voc_2012_trainval 37 | VAL: 38 | - voc_2007_custom_val 39 | GLOBAL: 40 | HACK: 1.0 41 | INPUT: 42 | CROP: 43 | ENABLED: false 44 | SIZE: 45 | - 0.9 46 | - 0.9 47 | TYPE: relative_range 48 | FORMAT: BGR 49 | MASK_FORMAT: polygon 50 | MAX_SIZE_TEST: 1333 51 | MAX_SIZE_TRAIN: 1333 52 | MIN_SIZE_TEST: 800 53 | MIN_SIZE_TRAIN: 54 | - 480 55 | - 512 56 | - 544 57 | - 576 58 | - 608 59 | - 640 60 | - 672 61 | - 704 62 | - 736 63 | - 768 64 | - 800 65 | MIN_SIZE_TRAIN_SAMPLING: choice 66 | RANDOM_FLIP: horizontal 67 | MODEL: 68 | ANCHOR_GENERATOR: 69 | ANGLES: 70 | - - -90 71 | - 0 72 | - 90 73 | ASPECT_RATIOS: 74 | - - 0.5 75 | - 1.0 76 | - 2.0 77 | NAME: DefaultAnchorGenerator 78 | OFFSET: 0.0 79 | SIZES: 80 | - - 32 81 | - - 64 82 | - - 128 83 | - - 256 84 | - - 512 85 | BACKBONE: 86 | ANTI_ALIAS: null 87 | FREEZE_AT: 2 88 | NAME: build_resnet_fpn_backbone 89 | DEVICE: cuda 90 | FCOS: 91 | BOX_QUALITY: null 92 | CENTER_SAMPLE: null 93 | FPN_STRIDES: null 94 | INFERENCE_TH_TEST: null 95 | INFERENCE_TH_TRAIN: null 96 | IN_FEATURES: null 97 | LOC_LOSS_TYPE: null 98 | LOSS_ALPHA: null 99 | LOSS_GAMMA: null 100 | LOSS_NORMALIZER_CLS: null 101 | LOSS_WEIGHT_CLS: null 102 | NMS_TH: null 103 | NORM: null 104 | NUM_BOX_CONVS: null 105 | NUM_CLASSES: 20 106 | NUM_CLS_CONVS: null 107 | NUM_SHARE_CONVS: null 108 | POST_NMS_TOPK_TEST: null 109 | POST_NMS_TOPK_TRAIN: null 110 | POS_RADIUS: null 111 | PRE_NMS_TOPK_TEST: null 112 | PRE_NMS_TOPK_TRAIN: null 113 | PRIOR_PROB: null 114 | SIZES_OF_INTEREST: null 115 | THRESH_WITH_CTR: null 116 | TOP_LEVELS: null 117 | USE_DEFORMABLE: null 118 | USE_RELU: null 119 | USE_SCALE: null 120 | YIELD_BOX_FEATURES: null 121 | YIELD_PROPOSAL: null 122 | FPN: 123 | FUSE_TYPE: sum 124 | 
IN_FEATURES: 125 | - res2 126 | - res3 127 | - res4 128 | - res5 129 | NORM: '' 130 | OUT_CHANNELS: 256 131 | KEYPOINT_ON: false 132 | LOAD_PROPOSALS: false 133 | MASK_ON: false 134 | META_ARCHITECTURE: GeneralizedRCNN 135 | MOBILENET: null 136 | PIXEL_MEAN: 137 | - 103.53 138 | - 116.28 139 | - 123.675 140 | PIXEL_STD: 141 | - 1.0 142 | - 1.0 143 | - 1.0 144 | PROPOSAL_GENERATOR: 145 | MIN_SIZE: 0 146 | NAME: RPN 147 | RESNETS: 148 | DEFORM_INTERVAL: null 149 | DEFORM_MODULATED: false 150 | DEFORM_NUM_GROUPS: 1 151 | DEFORM_ON_PER_STAGE: 152 | - false 153 | - false 154 | - false 155 | - false 156 | DEPTH: 50 157 | NORM: FrozenBN 158 | NUM_GROUPS: 1 159 | OUT_FEATURES: 160 | - res2 161 | - res3 162 | - res4 163 | - res5 164 | RES2_OUT_CHANNELS: 256 165 | RES5_DILATION: 1 166 | STEM_OUT_CHANNELS: 64 167 | STRIDE_IN_1X1: true 168 | WIDTH_PER_GROUP: 64 169 | RETINANET: 170 | BBOX_REG_LOSS_TYPE: null 171 | BBOX_REG_WEIGHTS: null 172 | FOCAL_LOSS_ALPHA: null 173 | FOCAL_LOSS_GAMMA: null 174 | IN_FEATURES: null 175 | IOU_LABELS: null 176 | IOU_THRESHOLDS: null 177 | NMS_THRESH_TEST: null 178 | NORM: null 179 | NUM_CLASSES: 20 180 | NUM_CONVS: null 181 | PRIOR_PROB: null 182 | SCORE_THRESH_TEST: null 183 | SMOOTH_L1_LOSS_BETA: null 184 | TOPK_CANDIDATES_TEST: null 185 | ROI_BOX_HEAD: 186 | BBOX_REG_LOSS_TYPE: smooth_l1 187 | BBOX_REG_LOSS_WEIGHT: 1.0 188 | BBOX_REG_WEIGHTS: 189 | - 10.0 190 | - 10.0 191 | - 5.0 192 | - 5.0 193 | CLS_AGNOSTIC_BBOX_REG: false 194 | CONV_DIM: 256 195 | FC_DIM: 1024 196 | NAME: FastRCNNConvFCHead 197 | NORM: '' 198 | NUM_CONV: 0 199 | NUM_FC: 2 200 | POOLER_RESOLUTION: 7 201 | POOLER_SAMPLING_RATIO: 0 202 | POOLER_TYPE: ROIAlignV2 203 | SMOOTH_L1_BETA: 0.0 204 | TRAIN_ON_PRED_BOXES: false 205 | ROI_HEADS: 206 | BATCH_SIZE_PER_IMAGE: 512 207 | DETECTIONS_PER_IMAGE_TRAIN: null 208 | IN_FEATURES: 209 | - p2 210 | - p3 211 | - p4 212 | - p5 213 | IOU_LABELS: 214 | - 0 215 | - 1 216 | IOU_THRESHOLDS: 217 | - 0.5 218 | NAME: StandardROIHeads 219 | 
NMS_THRESH_TEST: 0.5 220 | NMS_THRESH_TRAIN: null 221 | NUM_CLASSES: 20 222 | POSITIVE_FRACTION: 0.25 223 | PROPOSAL_APPEND_GT: true 224 | SCORE_THRESH_TEST: 0.05 225 | SCORE_THRESH_TRAIN: null 226 | RPN: 227 | BATCH_SIZE_PER_IMAGE: 256 228 | BBOX_REG_LOSS_TYPE: smooth_l1 229 | BBOX_REG_LOSS_WEIGHT: 1.0 230 | BBOX_REG_WEIGHTS: 231 | - 1.0 232 | - 1.0 233 | - 1.0 234 | - 1.0 235 | BOUNDARY_THRESH: -1 236 | CONV_DIMS: 237 | - -1 238 | HEAD_NAME: StandardRPNHead 239 | IN_FEATURES: 240 | - p2 241 | - p3 242 | - p4 243 | - p5 244 | - p6 245 | IOU_LABELS: 246 | - 0 247 | - -1 248 | - 1 249 | IOU_THRESHOLDS: 250 | - 0.3 251 | - 0.7 252 | LOSS_WEIGHT: 1.0 253 | NMS_THRESH: 0.7 254 | POSITIVE_FRACTION: 0.5 255 | POST_NMS_TOPK_TEST: 1000 256 | POST_NMS_TOPK_TRAIN: 1000 257 | PRE_NMS_TOPK_TEST: 1000 258 | PRE_NMS_TOPK_TRAIN: 2000 259 | SMOOTH_L1_BETA: 0.0 260 | EMA_TEACHER: 261 | KEEP_RATE: null 262 | WEIGHTS: detectron2://ImageNetPretrained/MSRA/R-50.pkl 263 | OUTPUT_DIR: ./runs/voc_Nb=20_Ns=0_faster_rcnn_standard 264 | SEED: -1 265 | SOLVER: 266 | AMP: 267 | ENABLED: false 268 | BACKBONE_MULTIPLIER: null 269 | BASE_LR: 0.02 270 | BIAS_LR_FACTOR: 1.0 271 | CHECKPOINTS: 272 | - 20000 273 | - 32000 274 | CLIP_GRADIENTS: 275 | CLIP_TYPE: value 276 | CLIP_VALUE: 1.0 277 | ENABLED: false 278 | NORM_TYPE: 2.0 279 | GAMMA: 0.1 280 | IMS_PER_BATCH: 16 281 | LR_SCHEDULER_NAME: WarmupMultiStepLR 282 | MAX_ITER: 36000 283 | MOMENTUM: 0.9 284 | NESTEROV: false 285 | OPTIMIZER: null 286 | PATIENCE: 10 287 | REFERENCE_WORLD_SIZE: 0 288 | STEPS: 289 | - 20000 290 | - 32000 291 | USE_CHECKPOINT: null 292 | WARMUP_FACTOR: 0.001 293 | WARMUP_ITERS: 100 294 | WARMUP_METHOD: linear 295 | WEIGHT_DECAY: 0.0001 296 | WEIGHT_DECAY_BIAS: 0.0001 297 | WEIGHT_DECAY_NORM: 0.0 298 | TEST: 299 | AUG: 300 | ENABLED: false 301 | FLIP: false 302 | MAX_SIZE: 4000 303 | MIN_SIZES: 304 | - 400 305 | - 500 306 | - 600 307 | - 700 308 | - 800 309 | - 900 310 | - 1000 311 | - 1100 312 | - 1200 313 | 
DETECTIONS_PER_IMAGE: 100 314 | EVAL_PERIOD: 1000 315 | EXPECTED_RESULTS: [] 316 | KEYPOINT_OKS_SIGMAS: [] 317 | PRECISE_BN: 318 | ENABLED: false 319 | NUM_ITER: 200 320 | VERSION: 2 321 | VIS_PERIOD: 1000 322 | -------------------------------------------------------------------------------- /configs/voc/faster_rcnn_standard_Nb=20_Ns=50.yaml: -------------------------------------------------------------------------------- 1 | CORRECTION: 2 | AUG: [] 3 | BOXES: 4 | DISTANCE_LIMIT: null 5 | LOWER_THRESH: null 6 | SOFTMAX_TEMP: null 7 | TYPE: null 8 | DETECTIONS_PER_IMAGE_TRAIN: null 9 | LABELS: 10 | MINING_THRESH: null 11 | TYPE: null 12 | NMS_THRESH_TRAIN: null 13 | SCORE_THRESH_TRAIN: null 14 | WARMUP: null 15 | CUDNN_BENCHMARK: false 16 | DATALOADER: 17 | ASPECT_RATIO_GROUPING: true 18 | FILTER_EMPTY_ANNOTATIONS: false 19 | NUM_WORKERS: 8 20 | REPEAT_THRESHOLD: 0.0 21 | SAMPLER_TRAIN: TrainingSampler 22 | DATASETS: 23 | NOISE: 24 | DROP_LABELS: 25 | P: 0.5 26 | UNIFORM_BBOX_NOISE: 27 | P: 0.2 28 | PRECOMPUTED_PROPOSAL_TOPK_TEST: 1000 29 | PRECOMPUTED_PROPOSAL_TOPK_TRAIN: 2000 30 | PROPOSAL_FILES_TEST: [] 31 | PROPOSAL_FILES_TRAIN: [] 32 | TEST: 33 | - voc_2007_test 34 | TRAIN: 35 | - noisy_voc_2007_custom_train 36 | - noisy_voc_2012_trainval 37 | VAL: 38 | - voc_2007_custom_val 39 | GLOBAL: 40 | HACK: 1.0 41 | INPUT: 42 | CROP: 43 | ENABLED: false 44 | SIZE: 45 | - 0.9 46 | - 0.9 47 | TYPE: relative_range 48 | FORMAT: BGR 49 | MASK_FORMAT: polygon 50 | MAX_SIZE_TEST: 1333 51 | MAX_SIZE_TRAIN: 1333 52 | MIN_SIZE_TEST: 800 53 | MIN_SIZE_TRAIN: 54 | - 480 55 | - 512 56 | - 544 57 | - 576 58 | - 608 59 | - 640 60 | - 672 61 | - 704 62 | - 736 63 | - 768 64 | - 800 65 | MIN_SIZE_TRAIN_SAMPLING: choice 66 | RANDOM_FLIP: horizontal 67 | MODEL: 68 | ANCHOR_GENERATOR: 69 | ANGLES: 70 | - - -90 71 | - 0 72 | - 90 73 | ASPECT_RATIOS: 74 | - - 0.5 75 | - 1.0 76 | - 2.0 77 | NAME: DefaultAnchorGenerator 78 | OFFSET: 0.0 79 | SIZES: 80 | - - 32 81 | - - 64 82 | - - 128 83 | - - 
256 84 | - - 512 85 | BACKBONE: 86 | ANTI_ALIAS: null 87 | FREEZE_AT: 2 88 | NAME: build_resnet_fpn_backbone 89 | DEVICE: cuda 90 | FCOS: 91 | BOX_QUALITY: null 92 | CENTER_SAMPLE: null 93 | FPN_STRIDES: null 94 | INFERENCE_TH_TEST: null 95 | INFERENCE_TH_TRAIN: null 96 | IN_FEATURES: null 97 | LOC_LOSS_TYPE: null 98 | LOSS_ALPHA: null 99 | LOSS_GAMMA: null 100 | LOSS_NORMALIZER_CLS: null 101 | LOSS_WEIGHT_CLS: null 102 | NMS_TH: null 103 | NORM: null 104 | NUM_BOX_CONVS: null 105 | NUM_CLASSES: 20 106 | NUM_CLS_CONVS: null 107 | NUM_SHARE_CONVS: null 108 | POST_NMS_TOPK_TEST: null 109 | POST_NMS_TOPK_TRAIN: null 110 | POS_RADIUS: null 111 | PRE_NMS_TOPK_TEST: null 112 | PRE_NMS_TOPK_TRAIN: null 113 | PRIOR_PROB: null 114 | SIZES_OF_INTEREST: null 115 | THRESH_WITH_CTR: null 116 | TOP_LEVELS: null 117 | USE_DEFORMABLE: null 118 | USE_RELU: null 119 | USE_SCALE: null 120 | YIELD_BOX_FEATURES: null 121 | YIELD_PROPOSAL: null 122 | FPN: 123 | FUSE_TYPE: sum 124 | IN_FEATURES: 125 | - res2 126 | - res3 127 | - res4 128 | - res5 129 | NORM: '' 130 | OUT_CHANNELS: 256 131 | KEYPOINT_ON: false 132 | LOAD_PROPOSALS: false 133 | MASK_ON: false 134 | META_ARCHITECTURE: GeneralizedRCNN 135 | MOBILENET: null 136 | PIXEL_MEAN: 137 | - 103.53 138 | - 116.28 139 | - 123.675 140 | PIXEL_STD: 141 | - 1.0 142 | - 1.0 143 | - 1.0 144 | PROPOSAL_GENERATOR: 145 | MIN_SIZE: 0 146 | NAME: RPN 147 | RESNETS: 148 | DEFORM_INTERVAL: null 149 | DEFORM_MODULATED: false 150 | DEFORM_NUM_GROUPS: 1 151 | DEFORM_ON_PER_STAGE: 152 | - false 153 | - false 154 | - false 155 | - false 156 | DEPTH: 50 157 | NORM: FrozenBN 158 | NUM_GROUPS: 1 159 | OUT_FEATURES: 160 | - res2 161 | - res3 162 | - res4 163 | - res5 164 | RES2_OUT_CHANNELS: 256 165 | RES5_DILATION: 1 166 | STEM_OUT_CHANNELS: 64 167 | STRIDE_IN_1X1: true 168 | WIDTH_PER_GROUP: 64 169 | RETINANET: 170 | BBOX_REG_LOSS_TYPE: null 171 | BBOX_REG_WEIGHTS: null 172 | FOCAL_LOSS_ALPHA: null 173 | FOCAL_LOSS_GAMMA: null 174 | IN_FEATURES: null 175 
| IOU_LABELS: null 176 | IOU_THRESHOLDS: null 177 | NMS_THRESH_TEST: null 178 | NORM: null 179 | NUM_CLASSES: 20 180 | NUM_CONVS: null 181 | PRIOR_PROB: null 182 | SCORE_THRESH_TEST: null 183 | SMOOTH_L1_LOSS_BETA: null 184 | TOPK_CANDIDATES_TEST: null 185 | ROI_BOX_HEAD: 186 | BBOX_REG_LOSS_TYPE: smooth_l1 187 | BBOX_REG_LOSS_WEIGHT: 1.0 188 | BBOX_REG_WEIGHTS: 189 | - 10.0 190 | - 10.0 191 | - 5.0 192 | - 5.0 193 | CLS_AGNOSTIC_BBOX_REG: false 194 | CONV_DIM: 256 195 | FC_DIM: 1024 196 | NAME: FastRCNNConvFCHead 197 | NORM: '' 198 | NUM_CONV: 0 199 | NUM_FC: 2 200 | POOLER_RESOLUTION: 7 201 | POOLER_SAMPLING_RATIO: 0 202 | POOLER_TYPE: ROIAlignV2 203 | SMOOTH_L1_BETA: 0.0 204 | TRAIN_ON_PRED_BOXES: false 205 | ROI_HEADS: 206 | BATCH_SIZE_PER_IMAGE: 512 207 | DETECTIONS_PER_IMAGE_TRAIN: null 208 | IN_FEATURES: 209 | - p2 210 | - p3 211 | - p4 212 | - p5 213 | IOU_LABELS: 214 | - 0 215 | - 1 216 | IOU_THRESHOLDS: 217 | - 0.5 218 | NAME: StandardROIHeads 219 | NMS_THRESH_TEST: 0.5 220 | NMS_THRESH_TRAIN: null 221 | NUM_CLASSES: 20 222 | POSITIVE_FRACTION: 0.25 223 | PROPOSAL_APPEND_GT: true 224 | SCORE_THRESH_TEST: 0.05 225 | SCORE_THRESH_TRAIN: null 226 | RPN: 227 | BATCH_SIZE_PER_IMAGE: 256 228 | BBOX_REG_LOSS_TYPE: smooth_l1 229 | BBOX_REG_LOSS_WEIGHT: 1.0 230 | BBOX_REG_WEIGHTS: 231 | - 1.0 232 | - 1.0 233 | - 1.0 234 | - 1.0 235 | BOUNDARY_THRESH: -1 236 | CONV_DIMS: 237 | - -1 238 | HEAD_NAME: StandardRPNHead 239 | IN_FEATURES: 240 | - p2 241 | - p3 242 | - p4 243 | - p5 244 | - p6 245 | IOU_LABELS: 246 | - 0 247 | - -1 248 | - 1 249 | IOU_THRESHOLDS: 250 | - 0.3 251 | - 0.7 252 | LOSS_WEIGHT: 1.0 253 | NMS_THRESH: 0.7 254 | POSITIVE_FRACTION: 0.5 255 | POST_NMS_TOPK_TEST: 1000 256 | POST_NMS_TOPK_TRAIN: 1000 257 | PRE_NMS_TOPK_TEST: 1000 258 | PRE_NMS_TOPK_TRAIN: 2000 259 | SMOOTH_L1_BETA: 0.0 260 | EMA_TEACHER: 261 | KEEP_RATE: null 262 | WEIGHTS: detectron2://ImageNetPretrained/MSRA/R-50.pkl 263 | OUTPUT_DIR: ./runs/voc_Nb=20_Ns=50_faster_rcnn_standard 264 
| SEED: -1 265 | SOLVER: 266 | AMP: 267 | ENABLED: false 268 | BACKBONE_MULTIPLIER: null 269 | BASE_LR: 0.02 270 | BIAS_LR_FACTOR: 1.0 271 | CHECKPOINTS: 272 | - 20000 273 | - 32000 274 | CLIP_GRADIENTS: 275 | CLIP_TYPE: value 276 | CLIP_VALUE: 1.0 277 | ENABLED: false 278 | NORM_TYPE: 2.0 279 | GAMMA: 0.1 280 | IMS_PER_BATCH: 16 281 | LR_SCHEDULER_NAME: WarmupMultiStepLR 282 | MAX_ITER: 36000 283 | MOMENTUM: 0.9 284 | NESTEROV: false 285 | OPTIMIZER: null 286 | PATIENCE: 10 287 | REFERENCE_WORLD_SIZE: 0 288 | STEPS: 289 | - 20000 290 | - 32000 291 | USE_CHECKPOINT: null 292 | WARMUP_FACTOR: 0.001 293 | WARMUP_ITERS: 100 294 | WARMUP_METHOD: linear 295 | WEIGHT_DECAY: 0.0001 296 | WEIGHT_DECAY_BIAS: 0.0001 297 | WEIGHT_DECAY_NORM: 0.0 298 | TEST: 299 | AUG: 300 | ENABLED: false 301 | FLIP: false 302 | MAX_SIZE: 4000 303 | MIN_SIZES: 304 | - 400 305 | - 500 306 | - 600 307 | - 700 308 | - 800 309 | - 900 310 | - 1000 311 | - 1100 312 | - 1200 313 | DETECTIONS_PER_IMAGE: 100 314 | EVAL_PERIOD: 1000 315 | EXPECTED_RESULTS: [] 316 | KEYPOINT_OKS_SIGMAS: [] 317 | PRECISE_BN: 318 | ENABLED: false 319 | NUM_ITER: 200 320 | VERSION: 2 321 | VIS_PERIOD: 1000 322 | -------------------------------------------------------------------------------- /configs/voc/faster_rcnn_standard_Nb=20_Ns=ex.yaml: -------------------------------------------------------------------------------- 1 | CORRECTION: 2 | AUG: [] 3 | BOXES: 4 | DISTANCE_LIMIT: null 5 | LOWER_THRESH: null 6 | SOFTMAX_TEMP: null 7 | TYPE: null 8 | DETECTIONS_PER_IMAGE_TRAIN: null 9 | LABELS: 10 | MINING_THRESH: null 11 | TYPE: null 12 | NMS_THRESH_TRAIN: null 13 | SCORE_THRESH_TRAIN: null 14 | WARMUP: null 15 | CUDNN_BENCHMARK: false 16 | DATALOADER: 17 | ASPECT_RATIO_GROUPING: true 18 | FILTER_EMPTY_ANNOTATIONS: false 19 | NUM_WORKERS: 8 20 | REPEAT_THRESHOLD: 0.0 21 | SAMPLER_TRAIN: TrainingSampler 22 | DATASETS: 23 | NOISE: 24 | DROP_LABELS: 25 | P: 1.0 26 | UNIFORM_BBOX_NOISE: 27 | P: 0.2 28 | 
PRECOMPUTED_PROPOSAL_TOPK_TEST: 1000 29 | PRECOMPUTED_PROPOSAL_TOPK_TRAIN: 2000 30 | PROPOSAL_FILES_TEST: [] 31 | PROPOSAL_FILES_TRAIN: [] 32 | TEST: 33 | - voc_2007_test 34 | TRAIN: 35 | - noisy_voc_2007_custom_train 36 | - noisy_voc_2012_trainval 37 | VAL: 38 | - voc_2007_custom_val 39 | GLOBAL: 40 | HACK: 1.0 41 | INPUT: 42 | CROP: 43 | ENABLED: false 44 | SIZE: 45 | - 0.9 46 | - 0.9 47 | TYPE: relative_range 48 | FORMAT: BGR 49 | MASK_FORMAT: polygon 50 | MAX_SIZE_TEST: 1333 51 | MAX_SIZE_TRAIN: 1333 52 | MIN_SIZE_TEST: 800 53 | MIN_SIZE_TRAIN: 54 | - 480 55 | - 512 56 | - 544 57 | - 576 58 | - 608 59 | - 640 60 | - 672 61 | - 704 62 | - 736 63 | - 768 64 | - 800 65 | MIN_SIZE_TRAIN_SAMPLING: choice 66 | RANDOM_FLIP: horizontal 67 | MODEL: 68 | ANCHOR_GENERATOR: 69 | ANGLES: 70 | - - -90 71 | - 0 72 | - 90 73 | ASPECT_RATIOS: 74 | - - 0.5 75 | - 1.0 76 | - 2.0 77 | NAME: DefaultAnchorGenerator 78 | OFFSET: 0.0 79 | SIZES: 80 | - - 32 81 | - - 64 82 | - - 128 83 | - - 256 84 | - - 512 85 | BACKBONE: 86 | ANTI_ALIAS: null 87 | FREEZE_AT: 2 88 | NAME: build_resnet_fpn_backbone 89 | DEVICE: cuda 90 | FCOS: 91 | BOX_QUALITY: null 92 | CENTER_SAMPLE: null 93 | FPN_STRIDES: null 94 | INFERENCE_TH_TEST: null 95 | INFERENCE_TH_TRAIN: null 96 | IN_FEATURES: null 97 | LOC_LOSS_TYPE: null 98 | LOSS_ALPHA: null 99 | LOSS_GAMMA: null 100 | LOSS_NORMALIZER_CLS: null 101 | LOSS_WEIGHT_CLS: null 102 | NMS_TH: null 103 | NORM: null 104 | NUM_BOX_CONVS: null 105 | NUM_CLASSES: 20 106 | NUM_CLS_CONVS: null 107 | NUM_SHARE_CONVS: null 108 | POST_NMS_TOPK_TEST: null 109 | POST_NMS_TOPK_TRAIN: null 110 | POS_RADIUS: null 111 | PRE_NMS_TOPK_TEST: null 112 | PRE_NMS_TOPK_TRAIN: null 113 | PRIOR_PROB: null 114 | SIZES_OF_INTEREST: null 115 | THRESH_WITH_CTR: null 116 | TOP_LEVELS: null 117 | USE_DEFORMABLE: null 118 | USE_RELU: null 119 | USE_SCALE: null 120 | YIELD_BOX_FEATURES: null 121 | YIELD_PROPOSAL: null 122 | FPN: 123 | FUSE_TYPE: sum 124 | IN_FEATURES: 125 | - res2 126 | - res3 
127 | - res4 128 | - res5 129 | NORM: '' 130 | OUT_CHANNELS: 256 131 | KEYPOINT_ON: false 132 | LOAD_PROPOSALS: false 133 | MASK_ON: false 134 | META_ARCHITECTURE: GeneralizedRCNN 135 | MOBILENET: null 136 | PIXEL_MEAN: 137 | - 103.53 138 | - 116.28 139 | - 123.675 140 | PIXEL_STD: 141 | - 1.0 142 | - 1.0 143 | - 1.0 144 | PROPOSAL_GENERATOR: 145 | MIN_SIZE: 0 146 | NAME: RPN 147 | RESNETS: 148 | DEFORM_INTERVAL: null 149 | DEFORM_MODULATED: false 150 | DEFORM_NUM_GROUPS: 1 151 | DEFORM_ON_PER_STAGE: 152 | - false 153 | - false 154 | - false 155 | - false 156 | DEPTH: 50 157 | NORM: FrozenBN 158 | NUM_GROUPS: 1 159 | OUT_FEATURES: 160 | - res2 161 | - res3 162 | - res4 163 | - res5 164 | RES2_OUT_CHANNELS: 256 165 | RES5_DILATION: 1 166 | STEM_OUT_CHANNELS: 64 167 | STRIDE_IN_1X1: true 168 | WIDTH_PER_GROUP: 64 169 | RETINANET: 170 | BBOX_REG_LOSS_TYPE: null 171 | BBOX_REG_WEIGHTS: null 172 | FOCAL_LOSS_ALPHA: null 173 | FOCAL_LOSS_GAMMA: null 174 | IN_FEATURES: null 175 | IOU_LABELS: null 176 | IOU_THRESHOLDS: null 177 | NMS_THRESH_TEST: null 178 | NORM: null 179 | NUM_CLASSES: 20 180 | NUM_CONVS: null 181 | PRIOR_PROB: null 182 | SCORE_THRESH_TEST: null 183 | SMOOTH_L1_LOSS_BETA: null 184 | TOPK_CANDIDATES_TEST: null 185 | ROI_BOX_HEAD: 186 | BBOX_REG_LOSS_TYPE: smooth_l1 187 | BBOX_REG_LOSS_WEIGHT: 1.0 188 | BBOX_REG_WEIGHTS: 189 | - 10.0 190 | - 10.0 191 | - 5.0 192 | - 5.0 193 | CLS_AGNOSTIC_BBOX_REG: false 194 | CONV_DIM: 256 195 | FC_DIM: 1024 196 | NAME: FastRCNNConvFCHead 197 | NORM: '' 198 | NUM_CONV: 0 199 | NUM_FC: 2 200 | POOLER_RESOLUTION: 7 201 | POOLER_SAMPLING_RATIO: 0 202 | POOLER_TYPE: ROIAlignV2 203 | SMOOTH_L1_BETA: 0.0 204 | TRAIN_ON_PRED_BOXES: false 205 | ROI_HEADS: 206 | BATCH_SIZE_PER_IMAGE: 512 207 | DETECTIONS_PER_IMAGE_TRAIN: null 208 | IN_FEATURES: 209 | - p2 210 | - p3 211 | - p4 212 | - p5 213 | IOU_LABELS: 214 | - 0 215 | - 1 216 | IOU_THRESHOLDS: 217 | - 0.5 218 | NAME: StandardROIHeads 219 | NMS_THRESH_TEST: 0.5 220 | 
NMS_THRESH_TRAIN: null 221 | NUM_CLASSES: 20 222 | POSITIVE_FRACTION: 0.25 223 | PROPOSAL_APPEND_GT: true 224 | SCORE_THRESH_TEST: 0.05 225 | SCORE_THRESH_TRAIN: null 226 | RPN: 227 | BATCH_SIZE_PER_IMAGE: 256 228 | BBOX_REG_LOSS_TYPE: smooth_l1 229 | BBOX_REG_LOSS_WEIGHT: 1.0 230 | BBOX_REG_WEIGHTS: 231 | - 1.0 232 | - 1.0 233 | - 1.0 234 | - 1.0 235 | BOUNDARY_THRESH: -1 236 | CONV_DIMS: 237 | - -1 238 | HEAD_NAME: StandardRPNHead 239 | IN_FEATURES: 240 | - p2 241 | - p3 242 | - p4 243 | - p5 244 | - p6 245 | IOU_LABELS: 246 | - 0 247 | - -1 248 | - 1 249 | IOU_THRESHOLDS: 250 | - 0.3 251 | - 0.7 252 | LOSS_WEIGHT: 1.0 253 | NMS_THRESH: 0.7 254 | POSITIVE_FRACTION: 0.5 255 | POST_NMS_TOPK_TEST: 1000 256 | POST_NMS_TOPK_TRAIN: 1000 257 | PRE_NMS_TOPK_TEST: 1000 258 | PRE_NMS_TOPK_TRAIN: 2000 259 | SMOOTH_L1_BETA: 0.0 260 | EMA_TEACHER: 261 | KEEP_RATE: null 262 | WEIGHTS: detectron2://ImageNetPretrained/MSRA/R-50.pkl 263 | OUTPUT_DIR: ./runs/voc_Nb=20_Ns=ex_faster_rcnn_standard 264 | SEED: -1 265 | SOLVER: 266 | AMP: 267 | ENABLED: false 268 | BACKBONE_MULTIPLIER: null 269 | BASE_LR: 0.02 270 | BIAS_LR_FACTOR: 1.0 271 | CHECKPOINTS: 272 | - 20000 273 | - 32000 274 | CLIP_GRADIENTS: 275 | CLIP_TYPE: value 276 | CLIP_VALUE: 1.0 277 | ENABLED: false 278 | NORM_TYPE: 2.0 279 | GAMMA: 0.1 280 | IMS_PER_BATCH: 16 281 | LR_SCHEDULER_NAME: WarmupMultiStepLR 282 | MAX_ITER: 36000 283 | MOMENTUM: 0.9 284 | NESTEROV: false 285 | OPTIMIZER: null 286 | PATIENCE: 10 287 | REFERENCE_WORLD_SIZE: 0 288 | STEPS: 289 | - 20000 290 | - 32000 291 | USE_CHECKPOINT: null 292 | WARMUP_FACTOR: 0.001 293 | WARMUP_ITERS: 100 294 | WARMUP_METHOD: linear 295 | WEIGHT_DECAY: 0.0001 296 | WEIGHT_DECAY_BIAS: 0.0001 297 | WEIGHT_DECAY_NORM: 0.0 298 | TEST: 299 | AUG: 300 | ENABLED: false 301 | FLIP: false 302 | MAX_SIZE: 4000 303 | MIN_SIZES: 304 | - 400 305 | - 500 306 | - 600 307 | - 700 308 | - 800 309 | - 900 310 | - 1000 311 | - 1100 312 | - 1200 313 | DETECTIONS_PER_IMAGE: 100 314 | 
EVAL_PERIOD: 1000 315 | EXPECTED_RESULTS: [] 316 | KEYPOINT_OKS_SIGMAS: [] 317 | PRECISE_BN: 318 | ENABLED: false 319 | NUM_ITER: 200 320 | VERSION: 2 321 | VIS_PERIOD: 1000 322 | -------------------------------------------------------------------------------- /configs/voc/faster_rcnn_standard_Nb=40_Ns=0.yaml: -------------------------------------------------------------------------------- 1 | CORRECTION: 2 | AUG: [] 3 | BOXES: 4 | DISTANCE_LIMIT: null 5 | LOWER_THRESH: null 6 | SOFTMAX_TEMP: null 7 | TYPE: null 8 | DETECTIONS_PER_IMAGE_TRAIN: null 9 | LABELS: 10 | MINING_THRESH: null 11 | TYPE: null 12 | NMS_THRESH_TRAIN: null 13 | SCORE_THRESH_TRAIN: null 14 | WARMUP: null 15 | CUDNN_BENCHMARK: false 16 | DATALOADER: 17 | ASPECT_RATIO_GROUPING: true 18 | FILTER_EMPTY_ANNOTATIONS: false 19 | NUM_WORKERS: 8 20 | REPEAT_THRESHOLD: 0.0 21 | SAMPLER_TRAIN: TrainingSampler 22 | DATASETS: 23 | NOISE: 24 | DROP_LABELS: 25 | P: 0.0 26 | UNIFORM_BBOX_NOISE: 27 | P: 0.4 28 | PRECOMPUTED_PROPOSAL_TOPK_TEST: 1000 29 | PRECOMPUTED_PROPOSAL_TOPK_TRAIN: 2000 30 | PROPOSAL_FILES_TEST: [] 31 | PROPOSAL_FILES_TRAIN: [] 32 | TEST: 33 | - voc_2007_test 34 | TRAIN: 35 | - noisy_voc_2007_custom_train 36 | - noisy_voc_2012_trainval 37 | VAL: 38 | - voc_2007_custom_val 39 | GLOBAL: 40 | HACK: 1.0 41 | INPUT: 42 | CROP: 43 | ENABLED: false 44 | SIZE: 45 | - 0.9 46 | - 0.9 47 | TYPE: relative_range 48 | FORMAT: BGR 49 | MASK_FORMAT: polygon 50 | MAX_SIZE_TEST: 1333 51 | MAX_SIZE_TRAIN: 1333 52 | MIN_SIZE_TEST: 800 53 | MIN_SIZE_TRAIN: 54 | - 480 55 | - 512 56 | - 544 57 | - 576 58 | - 608 59 | - 640 60 | - 672 61 | - 704 62 | - 736 63 | - 768 64 | - 800 65 | MIN_SIZE_TRAIN_SAMPLING: choice 66 | RANDOM_FLIP: horizontal 67 | MODEL: 68 | ANCHOR_GENERATOR: 69 | ANGLES: 70 | - - -90 71 | - 0 72 | - 90 73 | ASPECT_RATIOS: 74 | - - 0.5 75 | - 1.0 76 | - 2.0 77 | NAME: DefaultAnchorGenerator 78 | OFFSET: 0.0 79 | SIZES: 80 | - - 32 81 | - - 64 82 | - - 128 83 | - - 256 84 | - - 512 85 | BACKBONE: 
86 | ANTI_ALIAS: null 87 | FREEZE_AT: 2 88 | NAME: build_resnet_fpn_backbone 89 | DEVICE: cuda 90 | FCOS: 91 | BOX_QUALITY: null 92 | CENTER_SAMPLE: null 93 | FPN_STRIDES: null 94 | INFERENCE_TH_TEST: null 95 | INFERENCE_TH_TRAIN: null 96 | IN_FEATURES: null 97 | LOC_LOSS_TYPE: null 98 | LOSS_ALPHA: null 99 | LOSS_GAMMA: null 100 | LOSS_NORMALIZER_CLS: null 101 | LOSS_WEIGHT_CLS: null 102 | NMS_TH: null 103 | NORM: null 104 | NUM_BOX_CONVS: null 105 | NUM_CLASSES: 20 106 | NUM_CLS_CONVS: null 107 | NUM_SHARE_CONVS: null 108 | POST_NMS_TOPK_TEST: null 109 | POST_NMS_TOPK_TRAIN: null 110 | POS_RADIUS: null 111 | PRE_NMS_TOPK_TEST: null 112 | PRE_NMS_TOPK_TRAIN: null 113 | PRIOR_PROB: null 114 | SIZES_OF_INTEREST: null 115 | THRESH_WITH_CTR: null 116 | TOP_LEVELS: null 117 | USE_DEFORMABLE: null 118 | USE_RELU: null 119 | USE_SCALE: null 120 | YIELD_BOX_FEATURES: null 121 | YIELD_PROPOSAL: null 122 | FPN: 123 | FUSE_TYPE: sum 124 | IN_FEATURES: 125 | - res2 126 | - res3 127 | - res4 128 | - res5 129 | NORM: '' 130 | OUT_CHANNELS: 256 131 | KEYPOINT_ON: false 132 | LOAD_PROPOSALS: false 133 | MASK_ON: false 134 | META_ARCHITECTURE: GeneralizedRCNN 135 | MOBILENET: null 136 | PIXEL_MEAN: 137 | - 103.53 138 | - 116.28 139 | - 123.675 140 | PIXEL_STD: 141 | - 1.0 142 | - 1.0 143 | - 1.0 144 | PROPOSAL_GENERATOR: 145 | MIN_SIZE: 0 146 | NAME: RPN 147 | RESNETS: 148 | DEFORM_INTERVAL: null 149 | DEFORM_MODULATED: false 150 | DEFORM_NUM_GROUPS: 1 151 | DEFORM_ON_PER_STAGE: 152 | - false 153 | - false 154 | - false 155 | - false 156 | DEPTH: 50 157 | NORM: FrozenBN 158 | NUM_GROUPS: 1 159 | OUT_FEATURES: 160 | - res2 161 | - res3 162 | - res4 163 | - res5 164 | RES2_OUT_CHANNELS: 256 165 | RES5_DILATION: 1 166 | STEM_OUT_CHANNELS: 64 167 | STRIDE_IN_1X1: true 168 | WIDTH_PER_GROUP: 64 169 | RETINANET: 170 | BBOX_REG_LOSS_TYPE: null 171 | BBOX_REG_WEIGHTS: null 172 | FOCAL_LOSS_ALPHA: null 173 | FOCAL_LOSS_GAMMA: null 174 | IN_FEATURES: null 175 | IOU_LABELS: null 176 | 
IOU_THRESHOLDS: null 177 | NMS_THRESH_TEST: null 178 | NORM: null 179 | NUM_CLASSES: 20 180 | NUM_CONVS: null 181 | PRIOR_PROB: null 182 | SCORE_THRESH_TEST: null 183 | SMOOTH_L1_LOSS_BETA: null 184 | TOPK_CANDIDATES_TEST: null 185 | ROI_BOX_HEAD: 186 | BBOX_REG_LOSS_TYPE: smooth_l1 187 | BBOX_REG_LOSS_WEIGHT: 1.0 188 | BBOX_REG_WEIGHTS: 189 | - 10.0 190 | - 10.0 191 | - 5.0 192 | - 5.0 193 | CLS_AGNOSTIC_BBOX_REG: false 194 | CONV_DIM: 256 195 | FC_DIM: 1024 196 | NAME: FastRCNNConvFCHead 197 | NORM: '' 198 | NUM_CONV: 0 199 | NUM_FC: 2 200 | POOLER_RESOLUTION: 7 201 | POOLER_SAMPLING_RATIO: 0 202 | POOLER_TYPE: ROIAlignV2 203 | SMOOTH_L1_BETA: 0.0 204 | TRAIN_ON_PRED_BOXES: false 205 | ROI_HEADS: 206 | BATCH_SIZE_PER_IMAGE: 512 207 | DETECTIONS_PER_IMAGE_TRAIN: null 208 | IN_FEATURES: 209 | - p2 210 | - p3 211 | - p4 212 | - p5 213 | IOU_LABELS: 214 | - 0 215 | - 1 216 | IOU_THRESHOLDS: 217 | - 0.5 218 | NAME: StandardROIHeads 219 | NMS_THRESH_TEST: 0.5 220 | NMS_THRESH_TRAIN: null 221 | NUM_CLASSES: 20 222 | POSITIVE_FRACTION: 0.25 223 | PROPOSAL_APPEND_GT: true 224 | SCORE_THRESH_TEST: 0.05 225 | SCORE_THRESH_TRAIN: null 226 | RPN: 227 | BATCH_SIZE_PER_IMAGE: 256 228 | BBOX_REG_LOSS_TYPE: smooth_l1 229 | BBOX_REG_LOSS_WEIGHT: 1.0 230 | BBOX_REG_WEIGHTS: 231 | - 1.0 232 | - 1.0 233 | - 1.0 234 | - 1.0 235 | BOUNDARY_THRESH: -1 236 | CONV_DIMS: 237 | - -1 238 | HEAD_NAME: StandardRPNHead 239 | IN_FEATURES: 240 | - p2 241 | - p3 242 | - p4 243 | - p5 244 | - p6 245 | IOU_LABELS: 246 | - 0 247 | - -1 248 | - 1 249 | IOU_THRESHOLDS: 250 | - 0.3 251 | - 0.7 252 | LOSS_WEIGHT: 1.0 253 | NMS_THRESH: 0.7 254 | POSITIVE_FRACTION: 0.5 255 | POST_NMS_TOPK_TEST: 1000 256 | POST_NMS_TOPK_TRAIN: 1000 257 | PRE_NMS_TOPK_TEST: 1000 258 | PRE_NMS_TOPK_TRAIN: 2000 259 | SMOOTH_L1_BETA: 0.0 260 | EMA_TEACHER: 261 | KEEP_RATE: null 262 | WEIGHTS: detectron2://ImageNetPretrained/MSRA/R-50.pkl 263 | OUTPUT_DIR: ./runs/voc_Nb=40_Ns=0_faster_rcnn_standard 264 | SEED: -1 265 | SOLVER: 
266 | AMP: 267 | ENABLED: false 268 | BACKBONE_MULTIPLIER: null 269 | BASE_LR: 0.02 270 | BIAS_LR_FACTOR: 1.0 271 | CHECKPOINTS: 272 | - 20000 273 | - 32000 274 | CLIP_GRADIENTS: 275 | CLIP_TYPE: value 276 | CLIP_VALUE: 1.0 277 | ENABLED: false 278 | NORM_TYPE: 2.0 279 | GAMMA: 0.1 280 | IMS_PER_BATCH: 16 281 | LR_SCHEDULER_NAME: WarmupMultiStepLR 282 | MAX_ITER: 36000 283 | MOMENTUM: 0.9 284 | NESTEROV: false 285 | OPTIMIZER: null 286 | PATIENCE: 10 287 | REFERENCE_WORLD_SIZE: 0 288 | STEPS: 289 | - 20000 290 | - 32000 291 | USE_CHECKPOINT: null 292 | WARMUP_FACTOR: 0.001 293 | WARMUP_ITERS: 100 294 | WARMUP_METHOD: linear 295 | WEIGHT_DECAY: 0.0001 296 | WEIGHT_DECAY_BIAS: 0.0001 297 | WEIGHT_DECAY_NORM: 0.0 298 | TEST: 299 | AUG: 300 | ENABLED: false 301 | FLIP: false 302 | MAX_SIZE: 4000 303 | MIN_SIZES: 304 | - 400 305 | - 500 306 | - 600 307 | - 700 308 | - 800 309 | - 900 310 | - 1000 311 | - 1100 312 | - 1200 313 | DETECTIONS_PER_IMAGE: 100 314 | EVAL_PERIOD: 1000 315 | EXPECTED_RESULTS: [] 316 | KEYPOINT_OKS_SIGMAS: [] 317 | PRECISE_BN: 318 | ENABLED: false 319 | NUM_ITER: 200 320 | VERSION: 2 321 | VIS_PERIOD: 1000 322 | -------------------------------------------------------------------------------- /configs/voc/faster_rcnn_standard_clean.yaml: -------------------------------------------------------------------------------- 1 | CORRECTION: 2 | AUG: [] 3 | BOXES: 4 | DISTANCE_LIMIT: null 5 | LOWER_THRESH: null 6 | SOFTMAX_TEMP: null 7 | TYPE: null 8 | DETECTIONS_PER_IMAGE_TRAIN: null 9 | LABELS: 10 | MINING_THRESH: null 11 | TYPE: null 12 | NMS_THRESH_TRAIN: null 13 | SCORE_THRESH_TRAIN: null 14 | WARMUP: null 15 | CUDNN_BENCHMARK: false 16 | DATALOADER: 17 | ASPECT_RATIO_GROUPING: true 18 | FILTER_EMPTY_ANNOTATIONS: false 19 | NUM_WORKERS: 8 20 | REPEAT_THRESHOLD: 0.0 21 | SAMPLER_TRAIN: TrainingSampler 22 | DATASETS: 23 | NOISE: 24 | DROP_LABELS: 25 | P: 0.0 26 | UNIFORM_BBOX_NOISE: 27 | P: 0.0 28 | PRECOMPUTED_PROPOSAL_TOPK_TEST: 1000 29 | 
PRECOMPUTED_PROPOSAL_TOPK_TRAIN: 2000 30 | PROPOSAL_FILES_TEST: [] 31 | PROPOSAL_FILES_TRAIN: [] 32 | TEST: 33 | - voc_2007_test 34 | TRAIN: 35 | - voc_2007_custom_train 36 | - voc_2012_trainval 37 | VAL: 38 | - voc_2007_custom_val 39 | GLOBAL: 40 | HACK: 1.0 41 | INPUT: 42 | CROP: 43 | ENABLED: false 44 | SIZE: 45 | - 0.9 46 | - 0.9 47 | TYPE: relative_range 48 | FORMAT: BGR 49 | MASK_FORMAT: polygon 50 | MAX_SIZE_TEST: 1333 51 | MAX_SIZE_TRAIN: 1333 52 | MIN_SIZE_TEST: 800 53 | MIN_SIZE_TRAIN: 54 | - 480 55 | - 512 56 | - 544 57 | - 576 58 | - 608 59 | - 640 60 | - 672 61 | - 704 62 | - 736 63 | - 768 64 | - 800 65 | MIN_SIZE_TRAIN_SAMPLING: choice 66 | RANDOM_FLIP: horizontal 67 | MODEL: 68 | ANCHOR_GENERATOR: 69 | ANGLES: 70 | - - -90 71 | - 0 72 | - 90 73 | ASPECT_RATIOS: 74 | - - 0.5 75 | - 1.0 76 | - 2.0 77 | NAME: DefaultAnchorGenerator 78 | OFFSET: 0.0 79 | SIZES: 80 | - - 32 81 | - - 64 82 | - - 128 83 | - - 256 84 | - - 512 85 | BACKBONE: 86 | ANTI_ALIAS: null 87 | FREEZE_AT: 2 88 | NAME: build_resnet_fpn_backbone 89 | DEVICE: cuda 90 | FCOS: 91 | BOX_QUALITY: null 92 | CENTER_SAMPLE: null 93 | FPN_STRIDES: null 94 | INFERENCE_TH_TEST: null 95 | INFERENCE_TH_TRAIN: null 96 | IN_FEATURES: null 97 | LOC_LOSS_TYPE: null 98 | LOSS_ALPHA: null 99 | LOSS_GAMMA: null 100 | LOSS_NORMALIZER_CLS: null 101 | LOSS_WEIGHT_CLS: null 102 | NMS_TH: null 103 | NORM: null 104 | NUM_BOX_CONVS: null 105 | NUM_CLASSES: 20 106 | NUM_CLS_CONVS: null 107 | NUM_SHARE_CONVS: null 108 | POST_NMS_TOPK_TEST: null 109 | POST_NMS_TOPK_TRAIN: null 110 | POS_RADIUS: null 111 | PRE_NMS_TOPK_TEST: null 112 | PRE_NMS_TOPK_TRAIN: null 113 | PRIOR_PROB: null 114 | SIZES_OF_INTEREST: null 115 | THRESH_WITH_CTR: null 116 | TOP_LEVELS: null 117 | USE_DEFORMABLE: null 118 | USE_RELU: null 119 | USE_SCALE: null 120 | YIELD_BOX_FEATURES: null 121 | YIELD_PROPOSAL: null 122 | FPN: 123 | FUSE_TYPE: sum 124 | IN_FEATURES: 125 | - res2 126 | - res3 127 | - res4 128 | - res5 129 | NORM: '' 130 | 
OUT_CHANNELS: 256 131 | KEYPOINT_ON: false 132 | LOAD_PROPOSALS: false 133 | MASK_ON: false 134 | META_ARCHITECTURE: GeneralizedRCNN 135 | MOBILENET: null 136 | PIXEL_MEAN: 137 | - 103.53 138 | - 116.28 139 | - 123.675 140 | PIXEL_STD: 141 | - 1.0 142 | - 1.0 143 | - 1.0 144 | PROPOSAL_GENERATOR: 145 | MIN_SIZE: 0 146 | NAME: RPN 147 | RESNETS: 148 | DEFORM_INTERVAL: null 149 | DEFORM_MODULATED: false 150 | DEFORM_NUM_GROUPS: 1 151 | DEFORM_ON_PER_STAGE: 152 | - false 153 | - false 154 | - false 155 | - false 156 | DEPTH: 50 157 | NORM: FrozenBN 158 | NUM_GROUPS: 1 159 | OUT_FEATURES: 160 | - res2 161 | - res3 162 | - res4 163 | - res5 164 | RES2_OUT_CHANNELS: 256 165 | RES5_DILATION: 1 166 | STEM_OUT_CHANNELS: 64 167 | STRIDE_IN_1X1: true 168 | WIDTH_PER_GROUP: 64 169 | RETINANET: 170 | BBOX_REG_LOSS_TYPE: null 171 | BBOX_REG_WEIGHTS: null 172 | FOCAL_LOSS_ALPHA: null 173 | FOCAL_LOSS_GAMMA: null 174 | IN_FEATURES: null 175 | IOU_LABELS: null 176 | IOU_THRESHOLDS: null 177 | NMS_THRESH_TEST: null 178 | NORM: null 179 | NUM_CLASSES: 20 180 | NUM_CONVS: null 181 | PRIOR_PROB: null 182 | SCORE_THRESH_TEST: null 183 | SMOOTH_L1_LOSS_BETA: null 184 | TOPK_CANDIDATES_TEST: null 185 | ROI_BOX_HEAD: 186 | BBOX_REG_LOSS_TYPE: smooth_l1 187 | BBOX_REG_LOSS_WEIGHT: 1.0 188 | BBOX_REG_WEIGHTS: 189 | - 10.0 190 | - 10.0 191 | - 5.0 192 | - 5.0 193 | CLS_AGNOSTIC_BBOX_REG: false 194 | CONV_DIM: 256 195 | FC_DIM: 1024 196 | NAME: FastRCNNConvFCHead 197 | NORM: '' 198 | NUM_CONV: 0 199 | NUM_FC: 2 200 | POOLER_RESOLUTION: 7 201 | POOLER_SAMPLING_RATIO: 0 202 | POOLER_TYPE: ROIAlignV2 203 | SMOOTH_L1_BETA: 0.0 204 | TRAIN_ON_PRED_BOXES: false 205 | ROI_HEADS: 206 | BATCH_SIZE_PER_IMAGE: 512 207 | DETECTIONS_PER_IMAGE_TRAIN: null 208 | IN_FEATURES: 209 | - p2 210 | - p3 211 | - p4 212 | - p5 213 | IOU_LABELS: 214 | - 0 215 | - 1 216 | IOU_THRESHOLDS: 217 | - 0.5 218 | NAME: StandardROIHeads 219 | NMS_THRESH_TEST: 0.5 220 | NMS_THRESH_TRAIN: null 221 | NUM_CLASSES: 20 222 | 
POSITIVE_FRACTION: 0.25 223 | PROPOSAL_APPEND_GT: true 224 | SCORE_THRESH_TEST: 0.05 225 | SCORE_THRESH_TRAIN: null 226 | RPN: 227 | BATCH_SIZE_PER_IMAGE: 256 228 | BBOX_REG_LOSS_TYPE: smooth_l1 229 | BBOX_REG_LOSS_WEIGHT: 1.0 230 | BBOX_REG_WEIGHTS: 231 | - 1.0 232 | - 1.0 233 | - 1.0 234 | - 1.0 235 | BOUNDARY_THRESH: -1 236 | CONV_DIMS: 237 | - -1 238 | HEAD_NAME: StandardRPNHead 239 | IN_FEATURES: 240 | - p2 241 | - p3 242 | - p4 243 | - p5 244 | - p6 245 | IOU_LABELS: 246 | - 0 247 | - -1 248 | - 1 249 | IOU_THRESHOLDS: 250 | - 0.3 251 | - 0.7 252 | LOSS_WEIGHT: 1.0 253 | NMS_THRESH: 0.7 254 | POSITIVE_FRACTION: 0.5 255 | POST_NMS_TOPK_TEST: 1000 256 | POST_NMS_TOPK_TRAIN: 1000 257 | PRE_NMS_TOPK_TEST: 1000 258 | PRE_NMS_TOPK_TRAIN: 2000 259 | SMOOTH_L1_BETA: 0.0 260 | EMA_TEACHER: 261 | KEEP_RATE: null 262 | WEIGHTS: detectron2://ImageNetPretrained/MSRA/R-50.pkl 263 | OUTPUT_DIR: ./runs/voc_clean_faster_rcnn_standard 264 | SEED: -1 265 | SOLVER: 266 | AMP: 267 | ENABLED: false 268 | BACKBONE_MULTIPLIER: null 269 | BASE_LR: 0.02 270 | BIAS_LR_FACTOR: 1.0 271 | CHECKPOINTS: 272 | - 20000 273 | - 32000 274 | CLIP_GRADIENTS: 275 | CLIP_TYPE: value 276 | CLIP_VALUE: 1.0 277 | ENABLED: false 278 | NORM_TYPE: 2.0 279 | GAMMA: 0.1 280 | IMS_PER_BATCH: 16 281 | LR_SCHEDULER_NAME: WarmupMultiStepLR 282 | MAX_ITER: 36000 283 | MOMENTUM: 0.9 284 | NESTEROV: false 285 | OPTIMIZER: null 286 | PATIENCE: 10 287 | REFERENCE_WORLD_SIZE: 0 288 | STEPS: 289 | - 20000 290 | - 32000 291 | USE_CHECKPOINT: null 292 | WARMUP_FACTOR: 0.001 293 | WARMUP_ITERS: 100 294 | WARMUP_METHOD: linear 295 | WEIGHT_DECAY: 0.0001 296 | WEIGHT_DECAY_BIAS: 0.0001 297 | WEIGHT_DECAY_NORM: 0.0 298 | TEST: 299 | AUG: 300 | ENABLED: false 301 | FLIP: false 302 | MAX_SIZE: 4000 303 | MIN_SIZES: 304 | - 400 305 | - 500 306 | - 600 307 | - 700 308 | - 800 309 | - 900 310 | - 1000 311 | - 1100 312 | - 1200 313 | DETECTIONS_PER_IMAGE: 100 314 | EVAL_PERIOD: 1000 315 | EXPECTED_RESULTS: [] 316 | 
KEYPOINT_OKS_SIGMAS: [] 317 | PRECISE_BN: 318 | ENABLED: false 319 | NUM_ITER: 200 320 | VERSION: 2 321 | VIS_PERIOD: 1000 322 | -------------------------------------------------------------------------------- /datasets/NWPU/Splits/test_set_negative.txt: -------------------------------------------------------------------------------- 1 | 011.jpg 2 | 072.jpg 3 | 074.jpg 4 | 104.jpg 5 | 073.jpg 6 | 051.jpg 7 | 075.jpg 8 | 050.jpg 9 | 144.jpg 10 | 041.jpg 11 | 031.jpg 12 | 090.jpg 13 | 049.jpg 14 | 112.jpg 15 | 070.jpg 16 | 147.jpg 17 | 130.jpg 18 | 077.jpg 19 | 108.jpg 20 | 016.jpg 21 | 099.jpg 22 | 056.jpg 23 | 033.jpg 24 | 097.jpg 25 | 001.jpg -------------------------------------------------------------------------------- /datasets/NWPU/Splits/test_set_positive.txt: -------------------------------------------------------------------------------- 1 | 162.jpg 2 | 198.jpg 3 | 054.jpg 4 | 400.jpg 5 | 595.jpg 6 | 299.jpg 7 | 621.jpg 8 | 529.jpg 9 | 233.jpg 10 | 060.jpg 11 | 469.jpg 12 | 638.jpg 13 | 200.jpg 14 | 045.jpg 15 | 102.jpg 16 | 509.jpg 17 | 167.jpg 18 | 622.jpg 19 | 107.jpg 20 | 158.jpg 21 | 275.jpg 22 | 005.jpg 23 | 629.jpg 24 | 497.jpg 25 | 063.jpg 26 | 593.jpg 27 | 248.jpg 28 | 349.jpg 29 | 061.jpg 30 | 459.jpg 31 | 346.jpg 32 | 057.jpg 33 | 237.jpg 34 | 117.jpg 35 | 476.jpg 36 | 225.jpg 37 | 207.jpg 38 | 580.jpg 39 | 353.jpg 40 | 456.jpg 41 | 064.jpg 42 | 274.jpg 43 | 212.jpg 44 | 556.jpg 45 | 599.jpg 46 | 083.jpg 47 | 177.jpg 48 | 009.jpg 49 | 345.jpg 50 | 635.jpg 51 | 454.jpg 52 | 347.jpg 53 | 084.jpg 54 | 224.jpg 55 | 123.jpg 56 | 561.jpg 57 | 485.jpg 58 | 503.jpg 59 | 355.jpg 60 | 315.jpg 61 | 432.jpg 62 | 254.jpg 63 | 468.jpg 64 | 582.jpg 65 | 512.jpg 66 | 634.jpg 67 | 620.jpg 68 | 440.jpg 69 | 323.jpg 70 | 027.jpg 71 | 524.jpg 72 | 528.jpg 73 | 263.jpg 74 | 281.jpg 75 | 583.jpg 76 | 339.jpg 77 | 015.jpg 78 | 429.jpg 79 | 141.jpg 80 | 215.jpg 81 | 111.jpg 82 | 488.jpg 83 | 588.jpg 84 | 309.jpg 85 | 548.jpg 86 | 071.jpg 87 | 093.jpg 88 | 407.jpg 89 | 
554.jpg 90 | 208.jpg 91 | 472.jpg 92 | 128.jpg 93 | 522.jpg 94 | 343.jpg 95 | 421.jpg 96 | 229.jpg 97 | 601.jpg 98 | 427.jpg 99 | 600.jpg 100 | 170.jpg 101 | 447.jpg 102 | 401.jpg 103 | 569.jpg 104 | 384.jpg 105 | 249.jpg 106 | 417.jpg 107 | 097.jpg 108 | 565.jpg 109 | 049.jpg 110 | 550.jpg 111 | 068.jpg 112 | 373.jpg 113 | 644.jpg 114 | 131.jpg 115 | 002.jpg 116 | 606.jpg 117 | 016.jpg 118 | 566.jpg 119 | 562.jpg 120 | 350.jpg 121 | 279.jpg 122 | 351.jpg 123 | 446.jpg 124 | 613.jpg -------------------------------------------------------------------------------- /datasets/NWPU/Splits/train_set_negative.txt: -------------------------------------------------------------------------------- 1 | 054.jpg 2 | 094.jpg 3 | 088.jpg 4 | 009.jpg 5 | 102.jpg 6 | 052.jpg 7 | 028.jpg 8 | 081.jpg 9 | 098.jpg 10 | 089.jpg 11 | 043.jpg 12 | 061.jpg 13 | 063.jpg 14 | 117.jpg 15 | 067.jpg 16 | 093.jpg 17 | 022.jpg 18 | 100.jpg 19 | 024.jpg 20 | 122.jpg 21 | 064.jpg 22 | 029.jpg 23 | 030.jpg 24 | 023.jpg 25 | 044.jpg 26 | 124.jpg 27 | 046.jpg 28 | 015.jpg 29 | 057.jpg 30 | 083.jpg 31 | 012.jpg 32 | 116.jpg 33 | 140.jpg 34 | 128.jpg 35 | 143.jpg 36 | 136.jpg 37 | 037.jpg 38 | 125.jpg 39 | 036.jpg 40 | 110.jpg 41 | 111.jpg 42 | 086.jpg 43 | 042.jpg 44 | 082.jpg 45 | 126.jpg 46 | 096.jpg 47 | 150.jpg 48 | 103.jpg 49 | 114.jpg 50 | 101.jpg 51 | 118.jpg 52 | 040.jpg 53 | 142.jpg 54 | 133.jpg 55 | 092.jpg 56 | 039.jpg 57 | 026.jpg 58 | 055.jpg 59 | 095.jpg 60 | 071.jpg 61 | 053.jpg 62 | 034.jpg 63 | 005.jpg 64 | 038.jpg 65 | 025.jpg 66 | 008.jpg 67 | 078.jpg 68 | 132.jpg 69 | 003.jpg 70 | 139.jpg 71 | 020.jpg 72 | 105.jpg 73 | 109.jpg 74 | 076.jpg 75 | 069.jpg 76 | 006.jpg 77 | 045.jpg 78 | 062.jpg 79 | 017.jpg 80 | 091.jpg 81 | 120.jpg 82 | 087.jpg 83 | 121.jpg 84 | 080.jpg 85 | 004.jpg 86 | 119.jpg 87 | 137.jpg 88 | 035.jpg 89 | 014.jpg 90 | 065.jpg 91 | 060.jpg 92 | 047.jpg 93 | 018.jpg 94 | 007.jpg 95 | 058.jpg 96 | 084.jpg 97 | 141.jpg 98 | 059.jpg 99 | 010.jpg 100 | 066.jpg 
-------------------------------------------------------------------------------- /datasets/NWPU/Splits/train_set_positive.txt: -------------------------------------------------------------------------------- 1 | 232.jpg 2 | 371.jpg 3 | 302.jpg 4 | 647.jpg 5 | 048.jpg 6 | 513.jpg 7 | 493.jpg 8 | 612.jpg 9 | 397.jpg 10 | 537.jpg 11 | 163.jpg 12 | 482.jpg 13 | 380.jpg 14 | 260.jpg 15 | 221.jpg 16 | 379.jpg 17 | 361.jpg 18 | 377.jpg 19 | 575.jpg 20 | 501.jpg 21 | 112.jpg 22 | 337.jpg 23 | 169.jpg 24 | 616.jpg 25 | 357.jpg 26 | 542.jpg 27 | 334.jpg 28 | 268.jpg 29 | 135.jpg 30 | 399.jpg 31 | 406.jpg 32 | 558.jpg 33 | 451.jpg 34 | 270.jpg 35 | 619.jpg 36 | 130.jpg 37 | 587.jpg 38 | 414.jpg 39 | 552.jpg 40 | 043.jpg 41 | 478.jpg 42 | 010.jpg 43 | 398.jpg 44 | 242.jpg 45 | 286.jpg 46 | 088.jpg 47 | 490.jpg 48 | 499.jpg 49 | 292.jpg 50 | 051.jpg 51 | 195.jpg 52 | 464.jpg 53 | 443.jpg 54 | 574.jpg 55 | 603.jpg 56 | 223.jpg 57 | 033.jpg 58 | 307.jpg 59 | 098.jpg 60 | 517.jpg 61 | 549.jpg 62 | 409.jpg 63 | 113.jpg 64 | 030.jpg 65 | 610.jpg 66 | 100.jpg 67 | 547.jpg 68 | 226.jpg 69 | 573.jpg 70 | 442.jpg 71 | 462.jpg 72 | 127.jpg 73 | 175.jpg 74 | 034.jpg 75 | 222.jpg 76 | 387.jpg 77 | 189.jpg 78 | 402.jpg 79 | 581.jpg 80 | 508.jpg 81 | 624.jpg 82 | 492.jpg 83 | 253.jpg 84 | 543.jpg 85 | 389.jpg 86 | 172.jpg 87 | 553.jpg 88 | 121.jpg 89 | 251.jpg 90 | 012.jpg 91 | 031.jpg 92 | 006.jpg 93 | 489.jpg 94 | 086.jpg 95 | 641.jpg 96 | 025.jpg 97 | 024.jpg 98 | 161.jpg 99 | 403.jpg 100 | 138.jpg 101 | 494.jpg 102 | 210.jpg 103 | 183.jpg 104 | 416.jpg 105 | 295.jpg 106 | 082.jpg 107 | 640.jpg 108 | 434.jpg 109 | 415.jpg 110 | 340.jpg 111 | 596.jpg 112 | 018.jpg 113 | 182.jpg 114 | 019.jpg 115 | 318.jpg 116 | 079.jpg 117 | 192.jpg 118 | 405.jpg 119 | 065.jpg 120 | 197.jpg 121 | 367.jpg 122 | 313.jpg 123 | 056.jpg 124 | 216.jpg 125 | 374.jpg 126 | 278.jpg 127 | 396.jpg 128 | 393.jpg 129 | 125.jpg 130 | 650.jpg 131 | 630.jpg 132 | 344.jpg 133 | 535.jpg 134 | 092.jpg 135 | 411.jpg 136 | 
358.jpg 137 | 611.jpg 138 | 072.jpg 139 | 649.jpg 140 | 546.jpg 141 | 146.jpg 142 | 480.jpg 143 | 261.jpg 144 | 058.jpg 145 | 011.jpg 146 | 557.jpg 147 | 059.jpg 148 | 037.jpg 149 | 157.jpg 150 | 455.jpg 151 | 091.jpg 152 | 023.jpg 153 | 598.jpg 154 | 483.jpg 155 | 321.jpg 156 | 028.jpg 157 | 262.jpg 158 | 465.jpg 159 | 317.jpg 160 | 096.jpg 161 | 184.jpg 162 | 166.jpg 163 | 271.jpg 164 | 231.jpg 165 | 467.jpg 166 | 531.jpg 167 | 202.jpg 168 | 445.jpg 169 | 525.jpg 170 | 394.jpg 171 | 511.jpg 172 | 314.jpg 173 | 627.jpg 174 | 559.jpg 175 | 181.jpg 176 | 137.jpg 177 | 243.jpg 178 | 450.jpg 179 | 615.jpg 180 | 085.jpg 181 | 364.jpg 182 | 090.jpg 183 | 320.jpg 184 | 120.jpg 185 | 143.jpg 186 | 505.jpg 187 | 473.jpg 188 | 538.jpg 189 | 132.jpg 190 | 101.jpg 191 | 475.jpg 192 | 155.jpg 193 | 267.jpg 194 | 041.jpg 195 | 077.jpg 196 | 259.jpg 197 | 329.jpg 198 | 156.jpg 199 | 148.jpg 200 | 186.jpg 201 | 408.jpg 202 | 191.jpg 203 | 217.jpg 204 | 277.jpg 205 | 424.jpg 206 | 514.jpg 207 | 036.jpg 208 | 338.jpg 209 | 378.jpg 210 | 519.jpg 211 | 545.jpg 212 | 592.jpg 213 | 298.jpg 214 | 333.jpg 215 | 331.jpg 216 | 042.jpg 217 | 376.jpg 218 | 487.jpg 219 | 474.jpg 220 | 218.jpg 221 | 632.jpg 222 | 078.jpg 223 | 283.jpg 224 | 324.jpg 225 | 265.jpg 226 | 294.jpg 227 | 122.jpg 228 | 523.jpg 229 | 452.jpg 230 | 423.jpg 231 | 204.jpg 232 | 007.jpg 233 | 570.jpg 234 | 625.jpg 235 | 105.jpg 236 | 255.jpg 237 | 069.jpg 238 | 203.jpg 239 | 206.jpg 240 | 118.jpg 241 | 213.jpg 242 | 272.jpg 243 | 484.jpg 244 | 477.jpg 245 | 289.jpg 246 | 139.jpg 247 | 536.jpg 248 | 114.jpg 249 | 020.jpg 250 | 171.jpg 251 | 419.jpg 252 | 152.jpg 253 | 504.jpg 254 | 047.jpg 255 | 410.jpg 256 | 354.jpg 257 | 129.jpg 258 | 230.jpg 259 | 134.jpg 260 | 648.jpg 261 | 608.jpg 262 | 176.jpg 263 | 585.jpg 264 | 140.jpg 265 | 109.jpg 266 | 418.jpg 267 | 381.jpg 268 | 510.jpg 269 | 422.jpg 270 | 173.jpg 271 | 395.jpg 272 | 304.jpg 273 | 433.jpg 274 | 341.jpg 275 | 234.jpg 276 | 238.jpg 277 | 567.jpg 278 | 160.jpg 279 
| 342.jpg 280 | 288.jpg 281 | 193.jpg 282 | 185.jpg 283 | 055.jpg 284 | 062.jpg 285 | 578.jpg 286 | 369.jpg 287 | 614.jpg 288 | 437.jpg 289 | 067.jpg 290 | 196.jpg 291 | 479.jpg 292 | 322.jpg 293 | 151.jpg 294 | 382.jpg 295 | 521.jpg 296 | 003.jpg 297 | 441.jpg 298 | 247.jpg 299 | 375.jpg 300 | 555.jpg 301 | 081.jpg 302 | 420.jpg 303 | 022.jpg 304 | 290.jpg 305 | 594.jpg 306 | 179.jpg 307 | 001.jpg 308 | 188.jpg 309 | 352.jpg 310 | 532.jpg 311 | 448.jpg 312 | 356.jpg 313 | 089.jpg 314 | 205.jpg 315 | 287.jpg 316 | 589.jpg 317 | 266.jpg 318 | 154.jpg 319 | 413.jpg 320 | 052.jpg 321 | 273.jpg 322 | 327.jpg 323 | 425.jpg 324 | 520.jpg 325 | 576.jpg 326 | 639.jpg 327 | 228.jpg 328 | 236.jpg 329 | 227.jpg 330 | 074.jpg 331 | 110.jpg 332 | 040.jpg 333 | 319.jpg 334 | 481.jpg 335 | 046.jpg 336 | 643.jpg 337 | 269.jpg 338 | 438.jpg 339 | 240.jpg 340 | 636.jpg 341 | 316.jpg 342 | 590.jpg 343 | 124.jpg 344 | 534.jpg 345 | 256.jpg 346 | 435.jpg 347 | 359.jpg 348 | 070.jpg 349 | 190.jpg 350 | 164.jpg 351 | 220.jpg 352 | 252.jpg 353 | 311.jpg 354 | 436.jpg 355 | 264.jpg 356 | 646.jpg 357 | 245.jpg 358 | 428.jpg 359 | 457.jpg 360 | 150.jpg 361 | 301.jpg 362 | 029.jpg 363 | 326.jpg 364 | 370.jpg 365 | 463.jpg 366 | 336.jpg 367 | 385.jpg 368 | 383.jpg 369 | 391.jpg 370 | 602.jpg 371 | 533.jpg 372 | 145.jpg 373 | 568.jpg 374 | 116.jpg 375 | 021.jpg 376 | 250.jpg 377 | 551.jpg 378 | 518.jpg 379 | 014.jpg 380 | 366.jpg 381 | 449.jpg 382 | 328.jpg 383 | 530.jpg 384 | 362.jpg 385 | 106.jpg 386 | 330.jpg 387 | 515.jpg 388 | 153.jpg 389 | 312.jpg 390 | 645.jpg 391 | 136.jpg 392 | 285.jpg 393 | 458.jpg 394 | 147.jpg 395 | 617.jpg 396 | 133.jpg 397 | 038.jpg 398 | 087.jpg 399 | 495.jpg 400 | 470.jpg -------------------------------------------------------------------------------- /datasets/NWPU/Splits/val_set_negative.txt: -------------------------------------------------------------------------------- 1 | 138.jpg 2 | 068.jpg 3 | 002.jpg 4 | 079.jpg 5 | 131.jpg 6 | 107.jpg 7 | 113.jpg 8 | 
021.jpg 9 | 032.jpg 10 | 123.jpg 11 | 149.jpg 12 | 145.jpg 13 | 106.jpg 14 | 019.jpg 15 | 134.jpg 16 | 148.jpg 17 | 127.jpg 18 | 129.jpg 19 | 027.jpg 20 | 085.jpg 21 | 013.jpg 22 | 048.jpg 23 | 115.jpg 24 | 146.jpg 25 | 135.jpg -------------------------------------------------------------------------------- /datasets/NWPU/Splits/val_set_positive.txt: -------------------------------------------------------------------------------- 1 | 500.jpg 2 | 560.jpg 3 | 365.jpg 4 | 539.jpg 5 | 165.jpg 6 | 631.jpg 7 | 461.jpg 8 | 050.jpg 9 | 564.jpg 10 | 572.jpg 11 | 187.jpg 12 | 626.jpg 13 | 004.jpg 14 | 108.jpg 15 | 591.jpg 16 | 035.jpg 17 | 453.jpg 18 | 404.jpg 19 | 142.jpg 20 | 115.jpg 21 | 386.jpg 22 | 526.jpg 23 | 491.jpg 24 | 293.jpg 25 | 296.jpg 26 | 244.jpg 27 | 258.jpg 28 | 431.jpg 29 | 159.jpg 30 | 506.jpg 31 | 426.jpg 32 | 544.jpg 33 | 104.jpg 34 | 241.jpg 35 | 642.jpg 36 | 306.jpg 37 | 368.jpg 38 | 332.jpg 39 | 246.jpg 40 | 199.jpg 41 | 392.jpg 42 | 303.jpg 43 | 094.jpg 44 | 310.jpg 45 | 412.jpg 46 | 080.jpg 47 | 597.jpg 48 | 444.jpg 49 | 053.jpg 50 | 390.jpg 51 | 075.jpg 52 | 013.jpg 53 | 586.jpg 54 | 623.jpg 55 | 119.jpg 56 | 103.jpg 57 | 471.jpg 58 | 460.jpg 59 | 527.jpg 60 | 219.jpg 61 | 297.jpg 62 | 618.jpg 63 | 032.jpg 64 | 284.jpg 65 | 149.jpg 66 | 099.jpg 67 | 211.jpg 68 | 507.jpg 69 | 095.jpg 70 | 008.jpg 71 | 305.jpg 72 | 174.jpg 73 | 308.jpg 74 | 502.jpg 75 | 540.jpg 76 | 276.jpg 77 | 214.jpg 78 | 201.jpg 79 | 209.jpg 80 | 628.jpg 81 | 239.jpg 82 | 325.jpg 83 | 607.jpg 84 | 026.jpg 85 | 579.jpg 86 | 180.jpg 87 | 235.jpg 88 | 335.jpg 89 | 466.jpg 90 | 496.jpg 91 | 633.jpg 92 | 178.jpg 93 | 194.jpg 94 | 291.jpg 95 | 066.jpg 96 | 563.jpg 97 | 257.jpg 98 | 430.jpg 99 | 439.jpg 100 | 584.jpg 101 | 039.jpg 102 | 486.jpg 103 | 541.jpg 104 | 571.jpg 105 | 280.jpg 106 | 637.jpg 107 | 388.jpg 108 | 516.jpg 109 | 360.jpg 110 | 144.jpg 111 | 348.jpg 112 | 168.jpg 113 | 363.jpg 114 | 282.jpg 115 | 498.jpg 116 | 073.jpg 117 | 044.jpg 118 | 609.jpg 119 | 577.jpg 120 | 
604.jpg 121 | 126.jpg 122 | 605.jpg 123 | 372.jpg 124 | 076.jpg 125 | 017.jpg -------------------------------------------------------------------------------- /run.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | from detectron2.config import CfgNode 3 | from copy import deepcopy 4 | 5 | parser = argparse.ArgumentParser() 6 | parser.add_argument('--method', required=True, type=str, help='Which training method to use (standard vs robust).') 7 | parser.add_argument('--runs', required=False, default=1, type=int, help='How many training runs.') 8 | parser.add_argument('--test', action='store_true', help='Whether to test.') 9 | parser.add_argument('--config', required=True, type=str, help='Config file to use.') 10 | args = parser.parse_args() 11 | 12 | with open(args.config, 'r') as f: 13 | cfg = CfgNode.load_cfg(f) 14 | 15 | if args.method == 'standard': 16 | from src.engine.train_standard import train 17 | elif args.method == 'robust': 18 | from src.engine.train_teacher_student import train 19 | else: 20 | raise ValueError('Unknown train method.')
21 | 22 | for run in range(args.runs): 23 | cfg_run = deepcopy(cfg) 24 | model, best_score = train(cfg_run) 25 | if args.test: 26 | from test import test 27 | test_score = test(run_path=None, cfg=cfg_run, model=model, val=False, test=True) -------------------------------------------------------------------------------- /src/config/utils.py: -------------------------------------------------------------------------------- 1 | import os 2 | 3 | 4 | def set_output_dir(cfg): 5 | output_dir = cfg.OUTPUT_DIR 6 | os.makedirs(output_dir, exist_ok=False) 7 | 8 | with open(os.path.join(output_dir, 'config.yaml'), 'w') as f: 9 | f.write(cfg.dump()) 10 | -------------------------------------------------------------------------------- /src/correction/augmentation.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import torch 3 | from copy import deepcopy 4 | from detectron2.config.config import configurable 5 | from detectron2.data.transforms.augmentation import AugInput, Augmentation 6 | from detectron2.data import transforms as T 7 | from fvcore.transforms.transform import Transform, NoOpTransform 8 | from scipy.ndimage import gaussian_filter 9 | 10 | 11 | 12 | class BatchAugmentation: 13 | ''' 14 | Wrapper around detectron2's Augmentation to allow for augmentations on batches with tensors.
15 | ''' 16 | @configurable 17 | def __init__(self, augmentation): 18 | self.augmentation = augmentation 19 | self.transforms = [] 20 | 21 | @classmethod 22 | def from_config(cls, cfg): 23 | aug_list = [] 24 | for aug_string in cfg.CORRECTION.AUG: 25 | aug_string_splitted = aug_string.split('-') 26 | aug_type = aug_string_splitted[0] 27 | 28 | if aug_type == 'hflip': 29 | p = float(aug_string_splitted[1]) 30 | aug_list.append(T.RandomFlip(prob=p, horizontal=True, vertical=False)) 31 | elif aug_type == 'vflip': 32 | p = float(aug_string_splitted[1]) 33 | aug_list.append(T.RandomFlip(prob=p, horizontal=False, vertical=True)) 34 | elif aug_type == 'brightness': 35 | intensity_min, intensity_max = float(aug_string_splitted[1]), float(aug_string_splitted[2]) 36 | aug_list.append(T.RandomBrightness(intensity_min, intensity_max)) 37 | elif aug_type == 'contrast': 38 | intensity_min, intensity_max = float(aug_string_splitted[1]), float(aug_string_splitted[2]) 39 | aug_list.append(T.RandomContrast(intensity_min, intensity_max)) 40 | elif aug_type == 'saturation': 41 | intensity_min, intensity_max = float(aug_string_splitted[1]), float(aug_string_splitted[2]) 42 | aug_list.append(T.RandomSaturation(intensity_min, intensity_max)) 43 | elif aug_type == 'erase': 44 | p, max_size = float(aug_string_splitted[1]), int(aug_string_splitted[2]) 45 | aug_list.append(RandomErase(p, max_size)) 46 | elif aug_type == 'blur': 47 | p, sig = float(aug_string_splitted[1]), float(aug_string_splitted[2]) 48 | aug_list.append(RandomBlur(p, sig)) 49 | 50 | return {'augmentation': T.AugmentationList(aug_list)} 51 | 52 | def apply_batch(self, batch): 53 | self.transforms = [] 54 | 55 | new_batch = [] 56 | for sample in batch: 57 | new_sample = {key:value for key,value in sample.items()} 58 | new_sample['instances'] = deepcopy(new_sample['instances']) 59 | 60 | image = numpy_image(sample['image']) 61 | boxes = sample['instances'].gt_boxes.tensor.numpy() 62 | input_ = AugInput(image=image, 
boxes=boxes) 63 | 64 | tfm = self.augmentation(input_) 65 | self.transforms.append(tfm) 66 | 67 | new_sample['image'] = torch_image(input_.image) 68 | new_sample['instances'].gt_boxes.tensor = torch.from_numpy(input_.boxes) 69 | 70 | new_batch.append(new_sample) 71 | 72 | return new_batch 73 | 74 | def apply_gt_instances(self, instance_list, inverse=False, inplace=False): 75 | if not inplace: 76 | instance_list = deepcopy(instance_list) 77 | if inverse: 78 | assert self.transforms != [] 79 | 80 | for instances, tfm in zip(instance_list, self.transforms): 81 | if inverse: 82 | tfm = tfm.inverse() 83 | 84 | boxes = instances.gt_boxes.tensor.detach().cpu().numpy() 85 | boxes_transformed = tfm.apply_box(boxes) 86 | instances.gt_boxes.tensor = torch.from_numpy(boxes_transformed) 87 | 88 | return instance_list 89 | 90 | def numpy_image(img): 91 | return img.detach().cpu().permute(1,2,0).numpy() 92 | 93 | def torch_image(img): 94 | return torch.from_numpy(np.copy(img)).permute(2,0,1) 95 | 96 | class RandomErase(Augmentation): 97 | def __init__(self, prob=0.5, max_size=64): 98 | super().__init__() 99 | self.prob = prob 100 | self.max_size = max_size 101 | 102 | def get_transform(self, image): 103 | H, W = image.shape[:2] 104 | if np.random.uniform() < self.prob: 105 | x = np.random.randint(low=0, high=W) 106 | y = np.random.randint(low=0, high=H) 107 | h,w = np.random.randint(low=self.max_size // 2, high=self.max_size, size=2) 108 | 109 | x1 = max(0, x - w // 2) 110 | x2 = min(W, x + w // 2) 111 | y1 = max(0, y - h // 2) 112 | y2 = min(H, y + h // 2) 113 | 114 | return BoxEraseTransform(x1=x1,x2=x2,y1=y1,y2=y2) 115 | else: 116 | return NoOpTransform() 117 | 118 | class BoxEraseTransform(Transform): 119 | def __init__(self, x1, x2, y1, y2): 120 | super().__init__() 121 | self.x1 = x1 122 | self.x2 = x2 123 | self.y1 = y1 124 | self.y2 = y2 125 | 126 | def apply_image(self, img: np.ndarray) -> np.ndarray: 127 | tensor = torch.from_numpy(np.ascontiguousarray(img)) 128 | 
tensor[self.y1:self.y2,self.x1:self.x2] = 0 129 | return tensor.numpy() 130 | 131 | def apply_coords(self, coords: np.ndarray) -> np.ndarray: 132 | """ 133 | Apply no transform on the coordinates. 134 | """ 135 | return coords 136 | 137 | def apply_segmentation(self, segmentation: np.ndarray) -> np.ndarray: 138 | """ 139 | Apply no transform on the full-image segmentation. 140 | """ 141 | return segmentation 142 | 143 | def inverse(self) -> Transform: 144 | """ 145 | The inverse is a no-op. 146 | """ 147 | return NoOpTransform() 148 | 149 | 150 | class RandomBlur(Augmentation): 151 | def __init__(self, prob, sigma): 152 | super().__init__() 153 | self.prob = prob 154 | self.sigma = sigma 155 | 156 | def get_transform(self, image): 157 | if np.random.uniform() < self.prob: 158 | return BlurTransform(sigma=self.sigma) 159 | else: 160 | return NoOpTransform() 161 | 162 | class BlurTransform(Transform): 163 | def __init__(self, sigma): 164 | super().__init__() 165 | self.sigma = sigma 166 | 167 | def apply_image(self, img: np.ndarray) -> np.ndarray: 168 | return gaussian_filter(img, sigma=(self.sigma, self.sigma, 0)) 169 | 170 | def apply_coords(self, coords: np.ndarray) -> np.ndarray: 171 | """ 172 | Apply no transform on the coordinates. 173 | """ 174 | return coords 175 | 176 | def apply_segmentation(self, segmentation: np.ndarray) -> np.ndarray: 177 | """ 178 | Apply no transform on the full-image segmentation. 179 | """ 180 | return segmentation 181 | 182 | def inverse(self) -> Transform: 183 | """ 184 | The inverse is a no-op. 
185 | """ 186 | return NoOpTransform() 187 | 188 | -------------------------------------------------------------------------------- /src/datasets/build.py: -------------------------------------------------------------------------------- 1 | from detectron2.data import build_detection_train_loader, build_detection_test_loader, DatasetMapper, transforms as T 2 | from .register import register_datasets 3 | 4 | def build_dataloaders(cfg): 5 | register_datasets(cfg) 6 | 7 | augs = build_train_augmentation(cfg) 8 | mapper = DatasetMapper( 9 | is_train=True, 10 | augmentations=augs, 11 | image_format=cfg.INPUT.FORMAT, 12 | use_instance_mask=cfg.MODEL.MASK_ON, 13 | use_keypoint=cfg.MODEL.KEYPOINT_ON, 14 | instance_mask_format=cfg.INPUT.MASK_FORMAT, 15 | keypoint_hflip_indices=None, 16 | precomputed_proposal_topk=None, 17 | recompute_boxes=False, 18 | ) 19 | train_loader = build_detection_train_loader(cfg, mapper=mapper)#, cfg.DATASETS.TRAIN) 20 | val_loader = build_detection_test_loader(cfg, cfg.DATASETS.VAL) 21 | val_loader.dataset._map_func._obj.is_train = True # ensures that annotations are not discarded 22 | test_loader = build_detection_test_loader(cfg, cfg.DATASETS.TEST) 23 | test_loader.dataset._map_func._obj.is_train = True # ensures that annotations are not discarded 24 | 25 | return train_loader, val_loader, test_loader 26 | 27 | 28 | def build_train_augmentation(cfg): 29 | min_size = cfg.INPUT.MIN_SIZE_TRAIN 30 | max_size = cfg.INPUT.MAX_SIZE_TRAIN 31 | sample_style = cfg.INPUT.MIN_SIZE_TRAIN_SAMPLING 32 | augmentation = [T.ResizeShortestEdge(min_size, max_size, sample_style)] 33 | 34 | if cfg.INPUT.RANDOM_FLIP == "horizontal": 35 | augmentation.append( 36 | T.RandomFlip( 37 | horizontal=True, 38 | vertical=False, 39 | ) 40 | ) 41 | elif cfg.INPUT.RANDOM_FLIP == "vertical": 42 | augmentation.append( 43 | T.RandomFlip( 44 | horizontal=False, 45 | vertical=True, 46 | ) 47 | ) 48 | elif cfg.INPUT.RANDOM_FLIP in ["vertical&horizontal", 49 | "horizontal&vertical", 50 
| "both"]: 51 | augmentation.append( 52 | T.RandomFlip( 53 | horizontal=False, 54 | vertical=True, 55 | ) 56 | ) 57 | augmentation.append( 58 | T.RandomFlip( 59 | horizontal=True, 60 | vertical=False, 61 | ) 62 | ) 63 | if cfg.INPUT.CROP.ENABLED: 64 | augmentation.insert(0, T.RandomCrop(cfg.INPUT.CROP.TYPE, cfg.INPUT.CROP.SIZE)) 65 | return augmentation -------------------------------------------------------------------------------- /src/datasets/noise.py: -------------------------------------------------------------------------------- 1 | import torch 2 | from copy import deepcopy 3 | from detectron2.config.config import CfgNode 4 | from ..utils.misc import get_random_generator 5 | 6 | 7 | def load_noisy_instances(dicts, transform=lambda x:x['annotations']): 8 | ''' 9 | Adds noisy instances to the instances in dicts usingthe provided transform. 10 | ''' 11 | for d in dicts: 12 | d['ground_truth'] = deepcopy(d['annotations']) 13 | d['annotations'] = transform(d) 14 | return dicts 15 | 16 | 17 | def build_label_transform(cfg): 18 | num_classes = cfg.MODEL.RETINANET.NUM_CLASSES 19 | assert num_classes == cfg.MODEL.ROI_HEADS.NUM_CLASSES 20 | 21 | try: 22 | noise_cfg = cfg.DATASETS.NOISE 23 | except: 24 | noise_cfg = CfgNode() 25 | 26 | transforms_list = [] 27 | for key, value in noise_cfg.items(): 28 | if key == 'UNIFORM_BBOX_NOISE': 29 | p = value.get('P') 30 | if p != 0: 31 | transforms_list.append(UniformBoxNoise(p=p)) 32 | elif key == 'DROP_LABELS': 33 | p = value.get('P') 34 | if p != 0: 35 | transforms_list.append(DropLabels(p=p)) 36 | else: 37 | raise ValueError(f'Unknown noise transform: {key}') 38 | 39 | if len(transforms_list) == 0: 40 | transforms_list.append(lambda x:x['annotations']) 41 | 42 | return CompositeLabelTransform(*transforms_list) 43 | 44 | class CompositeLabelTransform: 45 | def __init__(self, *transforms): 46 | self.transforms = transforms 47 | 48 | def __call__(self, sample): 49 | for t in self.transforms: 50 | annotations = t(sample) 51 | 
sample['annotations'] = annotations 52 | return annotations 53 | 54 | class UniformBoxNoise: 55 | def __init__(self, p): 56 | self.p = p 57 | 58 | def __call__(self, sample): 59 | annotations = sample['annotations'] 60 | boxes = torch.tensor([box['bbox'] for box in annotations]).reshape(-1,4) 61 | w = boxes[:, 2] - boxes[:, 0] 62 | h = boxes[:, 3] - boxes[:, 1] 63 | 64 | # fix seed with generator 65 | generator = get_random_generator(sample['image_id']) 66 | 67 | # sample noise 68 | noise = 2 * self.p * \ 69 | torch.rand(boxes.shape, generator=generator) - self.p # [-p,+p] 70 | noise[:, [0, 2]] = noise[:, [0, 2]] * w.view(-1, 1) # [-wp,+wp] 71 | noise[:, [1, 3]] = noise[:, [1, 3]] * h.view(-1, 1) # [-hp,+hp] 72 | boxes = boxes + noise 73 | 74 | # remove degenerate boxes 75 | mask = torch.logical_and(boxes[:, 2] > boxes[:, 0], boxes[:, 3] > boxes[:, 1]) 76 | 77 | # reverse order because of popping 78 | for i in range(len(annotations))[::-1]: 79 | annotations[i]['bbox'] = boxes[i].tolist() 80 | if not mask[i]: 81 | annotations.pop(i) 82 | return annotations 83 | 84 | 85 | class DropLabels: 86 | # randomly drops annotations 87 | def __init__(self, p): 88 | ''' 89 | :param p: Ratio of annotations to drop. A value of one means that exactly one annotation per image is kept.
90 | ''' 91 | self.p = p 92 | 93 | def __call__(self, sample): 94 | annotations = sample['annotations'] 95 | num_objects = len(annotations) 96 | if num_objects > 0: 97 | # fix seed with generator 98 | generator = get_random_generator(sample['image_id']) 99 | if self.p == 1: 100 | mask = torch.full((num_objects,), False) 101 | i = torch.randint(low=0, high=num_objects, size=(1,), generator=generator) 102 | mask[i] = True 103 | else: 104 | mask = self.p < torch.rand(num_objects, generator=generator) 105 | 106 | 107 | # go through in reverse order 108 | for i in range(num_objects)[::-1]: 109 | if not mask[i]: 110 | annotations.pop(i) 111 | return annotations 112 | -------------------------------------------------------------------------------- /src/datasets/nwpu.py: -------------------------------------------------------------------------------- 1 | import os 2 | from PIL import Image 3 | 4 | NWPU_ROOT_DIR = './datasets/NWPU' 5 | NWPU_CLASS_NAMES = ('airplane', 6 | 'ship', 7 | 'storage tank', 8 | 'baseball diamond', 9 | 'tennis court', 10 | 'basketball court', 11 | 'ground track field', 12 | 'harbor', 13 | 'bridge', 14 | 'vehicle') 15 | 16 | def get_annotations(annotation_path): 17 | annotations = [] 18 | 19 | with open(annotation_path, 'r') as f: # encoding='utf-8-sig' 20 | content = f.read() 21 | objects = content.split('\n') 22 | objects = [x for x in objects if len(x) > 0] 23 | for obj in objects: 24 | info = obj.replace('(', '').replace(')', '').strip().split(',') 25 | assert len(info) == 5, 'error occurred during label conversion!'
26 | label = int(info[4]) - 1 # -1 because we want to start with 0 27 | x1, y1, x2, y2 = [float(x) for x in info[:4]] 28 | 29 | annotations.append(dict(bbox=[x1,y1,x2,y2], 30 | bbox_mode=0, #BoxMode.XYXY_ABS 31 | category_id=label)) 32 | return annotations 33 | 34 | def load_nwpu_instances(split, label_transform=None): 35 | with open(os.path.join(NWPU_ROOT_DIR, 'Splits/{}_set_positive.txt'.format(split)), 'r') as f: 36 | positive_imgs = f.readlines() 37 | with open(os.path.join(NWPU_ROOT_DIR, 'Splits/{}_set_negative.txt'.format(split)), 'r') as f: 38 | negative_imgs = f.readlines() 39 | 40 | dicts = [] 41 | 42 | for name in positive_imgs: 43 | name = name.strip() 44 | img_path = os.path.join(NWPU_ROOT_DIR, 'positive image set', name) 45 | annotation_path = os.path.join(NWPU_ROOT_DIR, 'ground truth', name[:-3]+'txt') 46 | 47 | img = Image.open(img_path) 48 | width, height = img.size 49 | img_id = os.path.join('positive image set', name) 50 | annotations = get_annotations(annotation_path) 51 | 52 | dicts.append(dict(file_name=img_path, 53 | height=height, 54 | width=width, 55 | image_id=img_id, 56 | annotations=annotations 57 | )) 58 | 59 | for name in negative_imgs: 60 | name = name.strip() 61 | img_path = os.path.join(NWPU_ROOT_DIR, 'negative image set', name) 62 | img = Image.open(img_path) 63 | width, height = img.size 64 | img_id = os.path.join('negative image set', name) 65 | 66 | dicts.append(dict(file_name=img_path, 67 | height=height, 68 | width=width, 69 | image_id=img_id, 70 | annotations=[] 71 | )) 72 | 73 | return dicts 74 | 75 | 76 | 77 | 78 | -------------------------------------------------------------------------------- /src/datasets/register.py: -------------------------------------------------------------------------------- 1 | from detectron2.data import DatasetCatalog, MetadataCatalog 2 | from detectron2.data.datasets.pascal_voc import load_voc_instances, CLASS_NAMES as VOC_CLASS_NAMES 3 | from .nwpu import * 4 | from .noise import 
load_noisy_instances, build_label_transform 5 | 6 | 7 | def register_datasets(cfg): 8 | train_datasets = cfg.DATASETS.TRAIN 9 | val_datasets = cfg.DATASETS.VAL 10 | test_datasets = cfg.DATASETS.TEST 11 | if isinstance(train_datasets, str): 12 | train_datasets = [train_datasets] 13 | if isinstance(val_datasets, str): 14 | val_datasets = [val_datasets] 15 | if isinstance(test_datasets, str): 16 | test_datasets = [test_datasets] 17 | 18 | if 'voc_2007_custom' in ''.join(train_datasets) or 'voc_2007_custom' in ''.join(val_datasets) or 'voc_2007_custom' in ''.join(test_datasets): 19 | for split in ['custom_train', 'custom_val']: 20 | name = 'voc_2007_' + split 21 | try: 22 | DatasetCatalog.remove(name) 23 | except: 24 | pass 25 | 26 | year = '2007' 27 | dirname = os.path.join( 28 | os.getenv("DETECTRON2_DATASETS", "datasets"), 'VOC{}'.format(year)) 29 | DatasetCatalog.register(name, lambda split=split, dirname=dirname: load_voc_instances(dirname=dirname, 30 | split=split, 31 | class_names=list(VOC_CLASS_NAMES))) 32 | MetadataCatalog.get(name).set(thing_classes=list(VOC_CLASS_NAMES), 33 | evaluator_type='pascal_voc', 34 | dirname=dirname, 35 | year=int(year), 36 | split=split) 37 | 38 | if 'nwpu' in ''.join(train_datasets) or 'nwpu' in ''.join(val_datasets) or 'nwpu' in ''.join(test_datasets): 39 | for split in ['train', 'val', 'test']: 40 | name = 'nwpu_' + split 41 | try: 42 | DatasetCatalog.remove(name) 43 | except: 44 | pass 45 | DatasetCatalog.register( 46 | name, lambda split=split: load_nwpu_instances(split=split)) 47 | MetadataCatalog.get( 48 | name).set(thing_classes=NWPU_CLASS_NAMES, evaluator_type='coco') 49 | 50 | if 'noisy' in ''.join(train_datasets) or 'noisy' in ''.join(val_datasets) or 'noisy' in ''.join(test_datasets): 51 | transform = build_label_transform(cfg) 52 | if 'nwpu' in ''.join(train_datasets) or 'nwpu' in ''.join(val_datasets) or 'nwpu' in ''.join(test_datasets): 53 | for split in ['train', 'val', 'test']: 54 | name = 'noisy_nwpu_' + split 55 
| try: 56 | DatasetCatalog.remove(name) 57 | except: 58 | pass 59 | DatasetCatalog.register(name, 60 | lambda split=split: load_noisy_instances(dicts=load_nwpu_instances(split=split), 61 | transform=transform)) 62 | MetadataCatalog.get(name).set(thing_classes=NWPU_CLASS_NAMES, 63 | evaluator_type='coco') 64 | 65 | if 'voc' in ''.join(train_datasets) or 'voc' in ''.join(val_datasets) or 'voc' in ''.join(test_datasets): 66 | for year in ['2007', '2012']: 67 | 68 | if year == '2007': 69 | splits = ['train', 'val', 'test', 'trainval', 'custom_train', 'custom_val'] 70 | elif year == '2012': 71 | splits = ['train', 'val', 'test', 'trainval'] 72 | 73 | for split in splits: 74 | dirname = os.path.join( 75 | os.getenv("DETECTRON2_DATASETS", "datasets"), 'VOC{}'.format(year)) 76 | name = 'noisy_voc_{}_{}'.format(year, split) 77 | try: 78 | DatasetCatalog.remove(name) 79 | except: 80 | pass 81 | DatasetCatalog.register(name, 82 | lambda split=split, year=year: load_noisy_instances(dicts=DatasetCatalog.get('voc_{}_{}'.format(year, split)), 83 | transform=transform)) 84 | MetadataCatalog.get(name).set(thing_classes=list(VOC_CLASS_NAMES), 85 | evaluator_type='pascal_voc', 86 | dirname=dirname, 87 | year=int(year), 88 | split=split) 89 | -------------------------------------------------------------------------------- /src/engine/train_standard.py: -------------------------------------------------------------------------------- 1 | import os 2 | import numpy as np 3 | import torch 4 | 5 | from detectron2.solver import build_lr_scheduler, build_optimizer 6 | from detectron2.evaluation import inference_on_dataset 7 | from detectron2.utils.events import CommonMetricPrinter, EventStorage 8 | from detectron2.engine import default_writers 9 | from detectron2.utils.logger import setup_logger 10 | import detectron2.utils.comm as comm 11 | 12 | from src.evaluation.build import build_evaluator 13 | from src.models.build import build_model 14 | from src.datasets.build import 
build_dataloaders 15 | from src.config.utils import set_output_dir 16 | from src.utils.checkpoint import save_checkpoint, load_checkpoint 17 | 18 | 19 | def train(cfg, model=None): 20 | set_output_dir(cfg) 21 | writers = default_writers(cfg.OUTPUT_DIR, max_iter=cfg.SOLVER.MAX_ITER) 22 | logger = setup_logger(cfg.OUTPUT_DIR) 23 | for h in logger.handlers[:-2]: # remove handlers such that multiple runs don't affect each other 24 | logger.removeHandler(h) 25 | logger.info('Output dir: {}'.format(cfg.OUTPUT_DIR)) 26 | 27 | train_loader, val_loader, test_loader = build_dataloaders(cfg) 28 | evaluator = build_evaluator(cfg) 29 | 30 | if model is None: 31 | model = build_model(cfg) 32 | optimizer = build_optimizer(cfg, model) 33 | scheduler = build_lr_scheduler(cfg, optimizer) 34 | 35 | start_iter = 0 36 | max_iter = cfg.SOLVER.MAX_ITER 37 | best_score = np.inf 38 | patience_left = cfg.SOLVER.PATIENCE 39 | 40 | if cfg.SOLVER.USE_CHECKPOINT is not None and cfg.SOLVER.USE_CHECKPOINT != '': 41 | model_state_dict,\ 42 | optimizer_state_dict,\ 43 | scheduler_state_dict,\ 44 | checkpoint_iter = load_checkpoint(cfg.SOLVER.USE_CHECKPOINT) 45 | model.load_state_dict(model_state_dict) 46 | optimizer.load_state_dict(optimizer_state_dict) 47 | scheduler.load_state_dict(scheduler_state_dict) 48 | scheduler._max_iter = max_iter + 1 49 | start_iter = checkpoint_iter 50 | 51 | model.to(cfg.MODEL.DEVICE) 52 | 53 | with EventStorage(start_iter) as storage: 54 | logger.info('START.') 55 | for data, iteration in zip(train_loader, range(start_iter, max_iter)): 56 | storage.iter = iteration 57 | 58 | loss_dict = model(data) 59 | losses = sum(loss_dict.values()) 60 | 61 | loss_dict_reduced = {k: v.item() for k, v in comm.reduce_dict(loss_dict).items()} 62 | losses_reduced = sum(loss for loss in loss_dict_reduced.values()) 63 | storage.put_scalars(train_loss=losses_reduced, **loss_dict_reduced) 64 | 65 | if not torch.isfinite(losses).all(): 66 | for writer in writers: 67 | if isinstance(writer, 
CommonMetricPrinter): 68 | logger.info('NaN in losses: {}'.format(loss_dict)) 69 | writer.write() 70 | raise FloatingPointError('Loss is not finite, stopping training.') 71 | 72 | optimizer.zero_grad() 73 | losses.backward() 74 | optimizer.step() 75 | storage.put_scalar("lr", optimizer.param_groups[0]["lr"], smoothing_hint=False) 76 | scheduler.step() 77 | 78 | do_test = False 79 | if (cfg.TEST.EVAL_PERIOD > 0 and (iteration + 1) % cfg.TEST.EVAL_PERIOD == 0) or (iteration == max_iter - 1): 80 | do_test = True 81 | results = inference_on_dataset(model, val_loader, evaluator) 82 | 83 | val_score = results['score'] if 'score' in results else -results['bbox']['AP50'] 84 | storage.put_scalar('val_score', val_score, smoothing_hint=False) 85 | 86 | if val_score < best_score: 87 | best_score = val_score 88 | model.cpu() 89 | torch.save(model.state_dict(), os.path.join(cfg.OUTPUT_DIR, 'final_model.pt')) 90 | model.to(cfg.MODEL.DEVICE) 91 | patience_left = cfg.SOLVER.PATIENCE 92 | elif val_score == best_score: 93 | patience_left = cfg.SOLVER.PATIENCE 94 | else: 95 | patience_left -= 1 96 | 97 | if ((iteration + 1) % 20 == 0) or (iteration == max_iter - 1) or (do_test): 98 | for writer in writers: 99 | writer.write() 100 | if do_test and isinstance(writer, CommonMetricPrinter): 101 | logger.info('val_score: {}, best_val_score: {}'.format(val_score, best_score)) 102 | 103 | if (iteration + 1) in cfg.SOLVER.CHECKPOINTS: 104 | save_checkpoint(model, optimizer, scheduler, iteration, cfg.OUTPUT_DIR) 105 | logger.info('Saved checkpoint {}'.format(iteration)) 106 | 107 | if patience_left < 0: 108 | logger.info('Stopping early!') 109 | break 110 | 111 | try: 112 | model.load_state_dict(torch.load(os.path.join(cfg.OUTPUT_DIR, 'final_model.pt'))) 113 | except FileNotFoundError: 114 | pass 115 | logger.info('DONE.') 116 | return model, best_score 117 | -------------------------------------------------------------------------------- /src/evaluation/build.py: -------------------------------------------------------------------------------- 1 | import os 2 |
import json 3 | from detectron2.evaluation import PascalVOCDetectionEvaluator, COCOEvaluator 4 | from detectron2.data.datasets.coco import convert_to_coco_dict 5 | from detectron2.data import MetadataCatalog 6 | 7 | 8 | def build_evaluator(cfg, val=True): 9 | if val: 10 | dataset_name = cfg.DATASETS.VAL 11 | else: 12 | dataset_name = cfg.DATASETS.TEST 13 | 14 | if not isinstance(dataset_name, str): 15 | assert len(dataset_name) == 1 16 | dataset_name = dataset_name[0] 17 | evaluator_type = MetadataCatalog.get(dataset_name).evaluator_type 18 | 19 | if evaluator_type == 'pascal_voc': 20 | return PascalVOCDetectionEvaluator(dataset_name) 21 | elif evaluator_type == 'coco': 22 | if not hasattr(MetadataCatalog.get(dataset_name), "json_file"): 23 | # create and register coco json first 24 | # we do this manually because convert_to_coco_json() gets stuck with file_lock() 25 | coco_dict = convert_to_coco_dict(dataset_name) 26 | json_path = os.path.join(cfg.OUTPUT_DIR, dataset_name + '_coco_format.json') 27 | with open(json_path, 'w') as f: 28 | json.dump(coco_dict, f) 29 | MetadataCatalog.get(dataset_name).set(json_file=json_path) 30 | 31 | return COCOEvaluator(dataset_name=dataset_name, 32 | tasks=('bbox',), 33 | distributed=False, 34 | output_dir=cfg.OUTPUT_DIR) 35 | else: 36 | raise NotImplementedError('Unknown evaluator type {}!'.format(evaluator_type))
self.metadata.thing_classes 12 | self.rgb = rgb 13 | self.scale = scale 14 | self.labels = labels 15 | self.color = color 16 | 17 | def visualize_batch_ground_truth(self, dict_): 18 | img = dict_['image'].permute(1,2,0).numpy() 19 | img = img[:,:,::-1] if not self.rgb else img 20 | vis = D2Visualizer(img_rgb=img, metadata=self.metadata, scale=self.scale) 21 | vis._default_font_size = 12. / self.scale 22 | 23 | objects = dict_['ground_truth'] 24 | boxes = np.stack([np.array(obj['bbox']) for obj in objects]) 25 | # scale boxes 26 | orig_height, orig_width = dict_['height'], dict_['width'] 27 | actual_height, actual_width = img.shape[:2] 28 | boxes[:,[0,2]] *= actual_width / orig_width 29 | boxes[:,[1,3]] *= actual_height / orig_height 30 | 31 | class_ids = [obj['category_id'] for obj in objects] 32 | labels = _create_text_labels(classes=class_ids, scores=None, class_names=self.class_names) \ 33 | if self.labels else None 34 | 35 | colors = None if self.color is None else ([self.color] * len(boxes)) 36 | result = vis.overlay_instances(boxes=boxes, labels=labels, assigned_colors=colors) 37 | 38 | return Image.fromarray(result.get_image()) 39 | 40 | 41 | 42 | def visualize_batch_annotation(self, dict_, instances=None): 43 | img = dict_['image'].permute(1,2,0).numpy() 44 | img = img[:,:,::-1] if not self.rgb else img 45 | vis = D2Visualizer(img_rgb=img, metadata=self.metadata, scale=self.scale) 46 | vis._default_font_size = 12. 
/ self.scale 47 | 48 | if instances is None: 49 | instances = dict_['instances'] 50 | boxes = instances.gt_boxes.tensor.numpy() 51 | class_ids = instances.gt_classes.numpy() 52 | labels = _create_text_labels(classes=class_ids, scores=None, class_names=self.class_names) \ 53 | if self.labels else None 54 | colors = None if self.color is None else ([self.color] * len(boxes)) 55 | result = vis.overlay_instances(boxes=boxes, labels=labels, assigned_colors=colors) 56 | 57 | return Image.fromarray(result.get_image()) 58 | 59 | 60 | 61 | def visualize_model_output(self, input_dict, output_dict, threshold=0.5, rescale=True): 62 | img = input_dict['image'].permute(1,2,0).numpy() 63 | img = img[:,:,::-1] if not self.rgb else img 64 | vis = D2Visualizer(img_rgb=img, metadata=self.metadata, scale=self.scale) 65 | vis._default_font_size = 12. / self.scale 66 | 67 | boxes = output_dict['instances'].pred_boxes.tensor.detach().cpu().numpy() 68 | class_ids = output_dict['instances'].pred_classes.detach().cpu().numpy() 69 | scores = output_dict['instances'].scores.detach().cpu().numpy() 70 | keep = scores >= threshold 71 | 72 | if rescale: # scale boxes 73 | orig_height, orig_width = input_dict['height'], input_dict['width'] 74 | actual_height, actual_width = img.shape[:2] 75 | boxes[:,[0,2]] *= actual_width / orig_width 76 | boxes[:,[1,3]] *= actual_height / orig_height 77 | 78 | labels = _create_text_labels(classes=class_ids[keep], scores=scores[keep], class_names=self.class_names) \ 79 | if self.labels else None 80 | colors = None if self.color is None else ([self.color] * len(boxes)) 81 | result = vis.overlay_instances(boxes=boxes[keep], labels=labels, assigned_colors=colors) 82 | 83 | return Image.fromarray(result.get_image()) -------------------------------------------------------------------------------- /src/models/adet/__init__.py: -------------------------------------------------------------------------------- 1 | from src.models.adet.modeling.fcos.fcos import FCOS 2 | 
from .modeling.one_stage_detector import OneStageDetector 3 | from .modeling.backbone import build_fcos_resnet_fpn_backbone 4 | from .modeling.fcos import FCOS -------------------------------------------------------------------------------- /src/models/adet/layers/__init__.py: -------------------------------------------------------------------------------- 1 | from .deform_conv import DFConv2d 2 | from .ml_nms import ml_nms 3 | from .iou_loss import IOULoss 4 | from .naive_group_norm import NaiveGroupNorm -------------------------------------------------------------------------------- /src/models/adet/layers/deform_conv.py: -------------------------------------------------------------------------------- 1 | import torch 2 | from torch import nn 3 | 4 | from detectron2.layers import Conv2d 5 | 6 | 7 | class _NewEmptyTensorOp(torch.autograd.Function): 8 | @staticmethod 9 | def forward(ctx, x, new_shape): 10 | ctx.shape = x.shape 11 | return x.new_empty(new_shape) 12 | 13 | @staticmethod 14 | def backward(ctx, grad): 15 | shape = ctx.shape 16 | return _NewEmptyTensorOp.apply(grad, shape), None 17 | 18 | 19 | class DFConv2d(nn.Module): 20 | """ 21 | Deformable convolutional layer with configurable 22 | deformable groups, dilations and groups. 
23 | Code is from: 24 | https://github.com/facebookresearch/maskrcnn-benchmark/blob/master/maskrcnn_benchmark/layers/misc.py 25 | """ 26 | def __init__( 27 | self, 28 | in_channels, 29 | out_channels, 30 | with_modulated_dcn=True, 31 | kernel_size=3, 32 | stride=1, 33 | groups=1, 34 | dilation=1, 35 | deformable_groups=1, 36 | bias=False, 37 | padding=None 38 | ): 39 | super(DFConv2d, self).__init__() 40 | if isinstance(kernel_size, (list, tuple)): 41 | assert isinstance(stride, (list, tuple)) 42 | assert isinstance(dilation, (list, tuple)) 43 | assert len(kernel_size) == 2 44 | assert len(stride) == 2 45 | assert len(dilation) == 2 46 | padding = ( 47 | dilation[0] * (kernel_size[0] - 1) // 2, 48 | dilation[1] * (kernel_size[1] - 1) // 2 49 | ) 50 | offset_base_channels = kernel_size[0] * kernel_size[1] 51 | else: 52 | padding = dilation * (kernel_size - 1) // 2 53 | offset_base_channels = kernel_size * kernel_size 54 | if with_modulated_dcn: 55 | from detectron2.layers.deform_conv import ModulatedDeformConv 56 | offset_channels = offset_base_channels * 3 # default: 27 57 | conv_block = ModulatedDeformConv 58 | else: 59 | from detectron2.layers.deform_conv import DeformConv 60 | offset_channels = offset_base_channels * 2 # default: 18 61 | conv_block = DeformConv 62 | self.offset = Conv2d( 63 | in_channels, 64 | deformable_groups * offset_channels, 65 | kernel_size=kernel_size, 66 | stride=stride, 67 | padding=padding, 68 | groups=1, 69 | dilation=dilation 70 | ) 71 | for l in [self.offset, ]: 72 | nn.init.kaiming_uniform_(l.weight, a=1) 73 | torch.nn.init.constant_(l.bias, 0.) 
74 | self.conv = conv_block( 75 | in_channels, 76 | out_channels, 77 | kernel_size=kernel_size, 78 | stride=stride, 79 | padding=padding, 80 | dilation=dilation, 81 | groups=groups, 82 | deformable_groups=deformable_groups, 83 | bias=bias 84 | ) 85 | self.with_modulated_dcn = with_modulated_dcn 86 | self.kernel_size = kernel_size 87 | self.stride = stride 88 | self.padding = padding 89 | self.dilation = dilation 90 | self.offset_split = offset_base_channels * deformable_groups * 2 91 | 92 | def forward(self, x, return_offset=False): 93 | if x.numel() > 0: 94 | if not self.with_modulated_dcn: 95 | offset_mask = self.offset(x) 96 | x = self.conv(x, offset_mask) 97 | else: 98 | offset_mask = self.offset(x) 99 | offset = offset_mask[:, :self.offset_split, :, :] 100 | mask = offset_mask[:, self.offset_split:, :, :].sigmoid() 101 | x = self.conv(x, offset, mask) 102 | if return_offset: 103 | return x, offset_mask 104 | return x 105 | # get output shape 106 | output_shape = [ 107 | (i + 2 * p - (di * (k - 1) + 1)) // d + 1 108 | for i, p, di, k, d in zip( 109 | x.shape[-2:], 110 | self.padding, 111 | self.dilation, 112 | self.kernel_size, 113 | self.stride 114 | ) 115 | ] 116 | output_shape = [x.shape[0], self.conv.weight.shape[0]] + output_shape 117 | return _NewEmptyTensorOp.apply(x, output_shape) -------------------------------------------------------------------------------- /src/models/adet/layers/iou_loss.py: -------------------------------------------------------------------------------- 1 | import torch 2 | from torch import nn 3 | 4 | 5 | class IOULoss(nn.Module): 6 | """ 7 | Intersection Over Union (IoU) loss which supports three 8 | different IoU computations: 9 | * IoU 10 | * Linear IoU 11 | * gIoU 12 | """ 13 | def __init__(self, loc_loss_type='iou'): 14 | super(IOULoss, self).__init__() 15 | self.loc_loss_type = loc_loss_type 16 | 17 | def forward(self, ious, gious=None, weight=None): 18 | if self.loc_loss_type == 'iou': 19 | losses = -torch.log(ious) 20 |
elif self.loc_loss_type == 'linear_iou': 21 | losses = 1 - ious 22 | elif self.loc_loss_type == 'giou': 23 | assert gious is not None 24 | losses = 1 - gious 25 | else: 26 | raise NotImplementedError 27 | 28 | if weight is not None: 29 | return (losses * weight).sum() 30 | else: 31 | return losses.sum() -------------------------------------------------------------------------------- /src/models/adet/layers/ml_nms.py: -------------------------------------------------------------------------------- 1 | from detectron2.layers import batched_nms 2 | 3 | 4 | def ml_nms(boxlist, nms_thresh, max_proposals=-1, 5 | score_field="scores", label_field="labels"): 6 | """ 7 | Performs non-maximum suppression on a boxlist, with scores specified 8 | in a boxlist field via score_field. 9 | 10 | Args: 11 | boxlist (detectron2.structures.Boxes): 12 | nms_thresh (float): 13 | max_proposals (int): if > 0, then only the top max_proposals are kept 14 | after non-maximum suppression 15 | score_field (str): 16 | """ 17 | if nms_thresh <= 0: 18 | return boxlist 19 | boxes = boxlist.pred_boxes.tensor 20 | scores = boxlist.scores 21 | labels = boxlist.pred_classes 22 | keep = batched_nms(boxes, scores, labels, nms_thresh) 23 | if max_proposals > 0: 24 | keep = keep[: max_proposals] 25 | boxlist = boxlist[keep] 26 | return boxlist -------------------------------------------------------------------------------- /src/models/adet/layers/naive_group_norm.py: -------------------------------------------------------------------------------- 1 | import torch 2 | from torch.nn import Module, Parameter 3 | from torch.nn import init 4 | 5 | 6 | class NaiveGroupNorm(Module): 7 | r"""NaiveGroupNorm implements Group Normalization with the high-level matrix operations in PyTorch. 8 | It is a temporary workaround for exporting GN to ONNX until the official GN can be exported directly. 9 | The usage of NaiveGroupNorm is exactly the same as the official :class:`torch.nn.GroupNorm`.
10 | Args: 11 | num_groups (int): number of groups to separate the channels into 12 | num_channels (int): number of channels expected in input 13 | eps: a value added to the denominator for numerical stability. Default: 1e-5 14 | affine: a boolean value that when set to ``True``, this module 15 | has learnable per-channel affine parameters initialized to ones (for weights) 16 | and zeros (for biases). Default: ``True``. 17 | Shape: 18 | - Input: :math:`(N, C, *)` where :math:`C=\text{num\_channels}` 19 | - Output: :math:`(N, C, *)` (same shape as input) 20 | Examples:: 21 | >>> input = torch.randn(20, 6, 10, 10) 22 | >>> # Separate 6 channels into 3 groups 23 | >>> m = NaiveGroupNorm(3, 6) 24 | >>> # Separate 6 channels into 6 groups (equivalent with InstanceNorm) 25 | >>> m = NaiveGroupNorm(6, 6) 26 | >>> # Put all 6 channels into a single group (equivalent with LayerNorm) 27 | >>> m = NaiveGroupNorm(1, 6) 28 | >>> # Activating the module 29 | >>> output = m(input) 30 | .. _`Group Normalization`: https://arxiv.org/abs/1803.08494 31 | """ 32 | __constants__ = ['num_groups', 'num_channels', 'eps', 'affine', 'weight', 33 | 'bias'] 34 | 35 | def __init__(self, num_groups, num_channels, eps=1e-5, affine=True): 36 | super(NaiveGroupNorm, self).__init__() 37 | self.num_groups = num_groups 38 | self.num_channels = num_channels 39 | self.eps = eps 40 | self.affine = affine 41 | if self.affine: 42 | self.weight = Parameter(torch.Tensor(num_channels)) 43 | self.bias = Parameter(torch.Tensor(num_channels)) 44 | else: 45 | self.register_parameter('weight', None) 46 | self.register_parameter('bias', None) 47 | self.reset_parameters() 48 | 49 | def reset_parameters(self): 50 | if self.affine: 51 | init.ones_(self.weight) 52 | init.zeros_(self.bias) 53 | 54 | def forward(self, input): 55 | N, C, H, W = input.size() 56 | assert C % self.num_groups == 0 57 | input = input.reshape(N, self.num_groups, -1) 58 | mean = input.mean(dim=-1, keepdim=True) 59 | var = (input ** 
2).mean(dim=-1, keepdim=True) - mean ** 2 60 | std = torch.sqrt(var + self.eps) 61 | 62 | input = (input - mean) / std 63 | input = input.reshape(N, C, H, W) 64 | if self.affine: 65 | input = input * self.weight.reshape(1, C, 1, 1) + self.bias.reshape(1, C, 1, 1) 66 | return input 67 | 68 | def extra_repr(self): 69 | return '{num_groups}, {num_channels}, eps={eps}, ' \ 70 | 'affine={affine}'.format(**self.__dict__) -------------------------------------------------------------------------------- /src/models/adet/modeling/backbone/__init__.py: -------------------------------------------------------------------------------- 1 | 2 | from .fpn import build_fcos_resnet_fpn_backbone -------------------------------------------------------------------------------- /src/models/adet/modeling/backbone/fpn.py: -------------------------------------------------------------------------------- 1 | from torch import nn 2 | import torch.nn.functional as F 3 | import fvcore.nn.weight_init as weight_init 4 | 5 | from detectron2.modeling.backbone import FPN, build_resnet_backbone 6 | from detectron2.layers import ShapeSpec 7 | from detectron2.modeling.backbone.build import BACKBONE_REGISTRY 8 | 9 | from .resnet_lpf import build_resnet_lpf_backbone 10 | from .resnet_interval import build_resnet_interval_backbone 11 | from .mobilenet import build_mnv2_backbone 12 | 13 | 14 | class LastLevelP6P7(nn.Module): 15 | """ 16 | This module is used in RetinaNet and FCOS to generate extra layers, P6 and P7 from 17 | C5 or P5 feature. 
18 | """ 19 | 20 | def __init__(self, in_channels, out_channels, in_features="res5"): 21 | super().__init__() 22 | self.num_levels = 2 23 | self.in_feature = in_features 24 | self.p6 = nn.Conv2d(in_channels, out_channels, 3, 2, 1) 25 | self.p7 = nn.Conv2d(out_channels, out_channels, 3, 2, 1) 26 | for module in [self.p6, self.p7]: 27 | weight_init.c2_xavier_fill(module) 28 | 29 | def forward(self, x): 30 | p6 = self.p6(x) 31 | p7 = self.p7(F.relu(p6)) 32 | return [p6, p7] 33 | 34 | 35 | class LastLevelP6(nn.Module): 36 | """ 37 | This module is used in FCOS to generate extra layers 38 | """ 39 | 40 | def __init__(self, in_channels, out_channels, in_features="res5"): 41 | super().__init__() 42 | self.num_levels = 1 43 | self.in_feature = in_features 44 | self.p6 = nn.Conv2d(in_channels, out_channels, 3, 2, 1) 45 | for module in [self.p6]: 46 | weight_init.c2_xavier_fill(module) 47 | 48 | def forward(self, x): 49 | p6 = self.p6(x) 50 | return [p6] 51 | 52 | 53 | @BACKBONE_REGISTRY.register() 54 | def build_fcos_resnet_fpn_backbone(cfg, input_shape: ShapeSpec): 55 | """ 56 | Args: 57 | cfg: a detectron2 CfgNode 58 | Returns: 59 | backbone (Backbone): backbone module, must be a subclass of :class:`Backbone`. 
60 | """ 61 | if cfg.MODEL.BACKBONE.ANTI_ALIAS: 62 | bottom_up = build_resnet_lpf_backbone(cfg, input_shape) 63 | elif cfg.MODEL.RESNETS.DEFORM_INTERVAL > 1: 64 | bottom_up = build_resnet_interval_backbone(cfg, input_shape) 65 | elif cfg.MODEL.MOBILENET: 66 | bottom_up = build_mnv2_backbone(cfg, input_shape) 67 | else: 68 | bottom_up = build_resnet_backbone(cfg, input_shape) 69 | in_features = cfg.MODEL.FPN.IN_FEATURES 70 | out_channels = cfg.MODEL.FPN.OUT_CHANNELS 71 | top_levels = cfg.MODEL.FCOS.TOP_LEVELS 72 | in_channels_top = out_channels 73 | if top_levels == 2: 74 | top_block = LastLevelP6P7(in_channels_top, out_channels, "p5") 75 | if top_levels == 1: 76 | top_block = LastLevelP6(in_channels_top, out_channels, "p5") 77 | elif top_levels == 0: 78 | top_block = None 79 | backbone = FPN( 80 | bottom_up=bottom_up, 81 | in_features=in_features, 82 | out_channels=out_channels, 83 | norm=cfg.MODEL.FPN.NORM, 84 | top_block=top_block, 85 | fuse_type=cfg.MODEL.FPN.FUSE_TYPE, 86 | ) 87 | return backbone -------------------------------------------------------------------------------- /src/models/adet/modeling/backbone/lpf.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn.parallel 3 | import numpy as np 4 | import torch.nn as nn 5 | import torch.nn.functional as F 6 | 7 | 8 | class Downsample(nn.Module): 9 | def __init__(self, pad_type='reflect', filt_size=3, stride=2, channels=None, pad_off=0): 10 | super(Downsample, self).__init__() 11 | self.filt_size = filt_size 12 | self.pad_off = pad_off 13 | self.pad_sizes = [int(1.*(filt_size-1)/2), int(np.ceil(1.*(filt_size-1)/2)), int(1.*(filt_size-1)/2), int(np.ceil(1.*(filt_size-1)/2))] 14 | self.pad_sizes = [pad_size+pad_off for pad_size in self.pad_sizes] 15 | self.stride = stride 16 | self.off = int((self.stride-1)/2.) 
17 | self.channels = channels 18 | 19 | # print('Filter size [%i]'%filt_size) 20 | if(self.filt_size==1): 21 | a = np.array([1.,]) 22 | elif(self.filt_size==2): 23 | a = np.array([1., 1.]) 24 | elif(self.filt_size==3): 25 | a = np.array([1., 2., 1.]) 26 | elif(self.filt_size==4): 27 | a = np.array([1., 3., 3., 1.]) 28 | elif(self.filt_size==5): 29 | a = np.array([1., 4., 6., 4., 1.]) 30 | elif(self.filt_size==6): 31 | a = np.array([1., 5., 10., 10., 5., 1.]) 32 | elif(self.filt_size==7): 33 | a = np.array([1., 6., 15., 20., 15., 6., 1.]) 34 | 35 | filt = torch.Tensor(a[:,None]*a[None,:]) 36 | filt = filt/torch.sum(filt) 37 | self.register_buffer('filt', filt[None,None,:,:].repeat((self.channels,1,1,1))) 38 | 39 | self.pad = get_pad_layer(pad_type)(self.pad_sizes) 40 | 41 | def forward(self, inp): 42 | if(self.filt_size==1): 43 | if(self.pad_off==0): 44 | return inp[:,:,::self.stride,::self.stride] 45 | else: 46 | return self.pad(inp)[:,:,::self.stride,::self.stride] 47 | else: 48 | return F.conv2d(self.pad(inp), self.filt, stride=self.stride, groups=inp.shape[1]) 49 | 50 | def get_pad_layer(pad_type): 51 | if(pad_type in ['refl','reflect']): 52 | PadLayer = nn.ReflectionPad2d 53 | elif(pad_type in ['repl','replicate']): 54 | PadLayer = nn.ReplicationPad2d 55 | elif(pad_type=='zero'): 56 | PadLayer = nn.ZeroPad2d 57 | else: 58 | raise ValueError('Pad type [%s] not recognized'%pad_type) 59 | return PadLayer 60 | 61 | 62 | class Downsample1D(nn.Module): 63 | def __init__(self, pad_type='reflect', filt_size=3, stride=2, channels=None, pad_off=0): 64 | super(Downsample1D, self).__init__() 65 | self.filt_size = filt_size 66 | self.pad_off = pad_off 67 | self.pad_sizes = [int(1. * (filt_size - 1) / 2), int(np.ceil(1. * (filt_size - 1) / 2))] 68 | self.pad_sizes = [pad_size + pad_off for pad_size in self.pad_sizes] 69 | self.stride = stride 70 | self.off = int((self.stride - 1) / 2.)
71 | self.channels = channels 72 | 73 | # print('Filter size [%i]' % filt_size) 74 | if(self.filt_size == 1): 75 | a = np.array([1., ]) 76 | elif(self.filt_size == 2): 77 | a = np.array([1., 1.]) 78 | elif(self.filt_size == 3): 79 | a = np.array([1., 2., 1.]) 80 | elif(self.filt_size == 4): 81 | a = np.array([1., 3., 3., 1.]) 82 | elif(self.filt_size == 5): 83 | a = np.array([1., 4., 6., 4., 1.]) 84 | elif(self.filt_size == 6): 85 | a = np.array([1., 5., 10., 10., 5., 1.]) 86 | elif(self.filt_size == 7): 87 | a = np.array([1., 6., 15., 20., 15., 6., 1.]) 88 | 89 | filt = torch.Tensor(a) 90 | filt = filt / torch.sum(filt) 91 | self.register_buffer('filt', filt[None, None, :].repeat((self.channels, 1, 1))) 92 | 93 | self.pad = get_pad_layer_1d(pad_type)(self.pad_sizes) 94 | 95 | def forward(self, inp): 96 | if(self.filt_size == 1): 97 | if(self.pad_off == 0): 98 | return inp[:, :, ::self.stride] 99 | else: 100 | return self.pad(inp)[:, :, ::self.stride] 101 | else: 102 | return F.conv1d(self.pad(inp), self.filt, stride=self.stride, groups=inp.shape[1]) 103 | 104 | 105 | def get_pad_layer_1d(pad_type): 106 | if(pad_type in ['refl', 'reflect']): 107 | PadLayer = nn.ReflectionPad1d 108 | elif(pad_type in ['repl', 'replicate']): 109 | PadLayer = nn.ReplicationPad1d 110 | elif(pad_type == 'zero'): 111 | PadLayer = nn.ZeroPad1d 112 | else: 113 | raise ValueError('Pad type [%s] not recognized' % pad_type) 114 | return PadLayer -------------------------------------------------------------------------------- /src/models/adet/modeling/backbone/mobilenet.py: -------------------------------------------------------------------------------- 1 | # taken from https://github.com/tonylins/pytorch-mobilenet-v2/ 2 | # Published by Ji Lin, tonylins 3 | # licensed under the Apache License, Version 2.0, January 2004 4 | 5 | from torch import nn 6 | from torch.nn import BatchNorm2d 7 | #from detectron2.layers.batch_norm import NaiveSyncBatchNorm as BatchNorm2d 8 | from detectron2.layers import Conv2d
9 | from detectron2.modeling.backbone.build import BACKBONE_REGISTRY 10 | from detectron2.modeling.backbone import Backbone 11 | 12 | 13 | def conv_bn(inp, oup, stride): 14 | return nn.Sequential( 15 | Conv2d(inp, oup, 3, stride, 1, bias=False), 16 | BatchNorm2d(oup), 17 | nn.ReLU6(inplace=True) 18 | ) 19 | 20 | 21 | def conv_1x1_bn(inp, oup): 22 | return nn.Sequential( 23 | Conv2d(inp, oup, 1, 1, 0, bias=False), 24 | BatchNorm2d(oup), 25 | nn.ReLU6(inplace=True) 26 | ) 27 | 28 | 29 | class InvertedResidual(nn.Module): 30 | def __init__(self, inp, oup, stride, expand_ratio): 31 | super(InvertedResidual, self).__init__() 32 | self.stride = stride 33 | assert stride in [1, 2] 34 | 35 | hidden_dim = int(round(inp * expand_ratio)) 36 | self.use_res_connect = self.stride == 1 and inp == oup 37 | 38 | if expand_ratio == 1: 39 | self.conv = nn.Sequential( 40 | # dw 41 | Conv2d(hidden_dim, hidden_dim, 3, stride, 1, groups=hidden_dim, bias=False), 42 | BatchNorm2d(hidden_dim), 43 | nn.ReLU6(inplace=True), 44 | # pw-linear 45 | Conv2d(hidden_dim, oup, 1, 1, 0, bias=False), 46 | BatchNorm2d(oup), 47 | ) 48 | else: 49 | self.conv = nn.Sequential( 50 | # pw 51 | Conv2d(inp, hidden_dim, 1, 1, 0, bias=False), 52 | BatchNorm2d(hidden_dim), 53 | nn.ReLU6(inplace=True), 54 | # dw 55 | Conv2d(hidden_dim, hidden_dim, 3, stride, 1, groups=hidden_dim, bias=False), 56 | BatchNorm2d(hidden_dim), 57 | nn.ReLU6(inplace=True), 58 | # pw-linear 59 | Conv2d(hidden_dim, oup, 1, 1, 0, bias=False), 60 | BatchNorm2d(oup), 61 | ) 62 | 63 | def forward(self, x): 64 | if self.use_res_connect: 65 | return x + self.conv(x) 66 | else: 67 | return self.conv(x) 68 | 69 | 70 | class MobileNetV2(Backbone): 71 | """ 72 | Should freeze bn 73 | """ 74 | def __init__(self, cfg, n_class=1000, input_size=224, width_mult=1.): 75 | super(MobileNetV2, self).__init__() 76 | block = InvertedResidual 77 | input_channel = 32 78 | interverted_residual_setting = [ 79 | # t, c, n, s 80 | [1, 16, 1, 1], 81 | [6, 24, 2, 2], 
82 | [6, 32, 3, 2], 83 | [6, 64, 4, 2], 84 | [6, 96, 3, 1], 85 | [6, 160, 3, 2], 86 | [6, 320, 1, 1], 87 | ] 88 | 89 | # building first layer 90 | assert input_size % 32 == 0 91 | input_channel = int(input_channel * width_mult) 92 | self.return_features_indices = [3, 6, 13, 17] 93 | self.return_features_num_channels = [] 94 | self.features = nn.ModuleList([conv_bn(3, input_channel, 2)]) 95 | # building inverted residual blocks 96 | for t, c, n, s in interverted_residual_setting: 97 | output_channel = int(c * width_mult) 98 | for i in range(n): 99 | if i == 0: 100 | self.features.append(block(input_channel, output_channel, s, expand_ratio=t)) 101 | else: 102 | self.features.append(block(input_channel, output_channel, 1, expand_ratio=t)) 103 | input_channel = output_channel 104 | if len(self.features) - 1 in self.return_features_indices: 105 | self.return_features_num_channels.append(output_channel) 106 | 107 | self._initialize_weights() 108 | self._freeze_backbone(cfg.MODEL.BACKBONE.FREEZE_AT) 109 | 110 | def _freeze_backbone(self, freeze_at): 111 | for layer_index in range(freeze_at): 112 | for p in self.features[layer_index].parameters(): 113 | p.requires_grad = False 114 | 115 | def forward(self, x): 116 | res = [] 117 | for i, m in enumerate(self.features): 118 | x = m(x) 119 | if i in self.return_features_indices: 120 | res.append(x) 121 | return {'res{}'.format(i + 2): r for i, r in enumerate(res)} 122 | 123 | def _initialize_weights(self): 124 | for m in self.modules(): 125 | if isinstance(m, Conv2d): 126 | n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels 127 | m.weight.data.normal_(0, (2. 
/ n) ** 0.5) 128 | if m.bias is not None: 129 | m.bias.data.zero_() 130 | elif isinstance(m, BatchNorm2d): 131 | m.weight.data.fill_(1) 132 | m.bias.data.zero_() 133 | elif isinstance(m, nn.Linear): 134 | n = m.weight.size(1) 135 | m.weight.data.normal_(0, 0.01) 136 | m.bias.data.zero_() 137 | 138 | @BACKBONE_REGISTRY.register() 139 | def build_mnv2_backbone(cfg, input_shape): 140 | """ 141 | Create a ResNet instance from config. 142 | Returns: 143 | ResNet: a :class:`ResNet` instance. 144 | """ 145 | out_features = cfg.MODEL.RESNETS.OUT_FEATURES 146 | 147 | out_feature_channels = {"res2": 24, "res3": 32, 148 | "res4": 96, "res5": 320} 149 | out_feature_strides = {"res2": 4, "res3": 8, "res4": 16, "res5": 32} 150 | model = MobileNetV2(cfg) 151 | model._out_features = out_features 152 | model._out_feature_channels = out_feature_channels 153 | model._out_feature_strides = out_feature_strides 154 | return model -------------------------------------------------------------------------------- /src/models/adet/modeling/backbone/resnet_interval.py: -------------------------------------------------------------------------------- 1 | # Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved 2 | from detectron2.layers import FrozenBatchNorm2d 3 | from detectron2.modeling.backbone import BACKBONE_REGISTRY 4 | from detectron2.modeling.backbone.resnet import ( 5 | BasicStem, 6 | DeformBottleneckBlock, 7 | BottleneckBlock, 8 | ResNet, 9 | ) 10 | 11 | 12 | def make_stage_intervals(block_class, num_blocks, first_stride, **kwargs): 13 | """ 14 | Create a resnet stage by creating many blocks. 15 | Args: 16 | block_class (class): a subclass of ResNetBlockBase 17 | num_blocks (int): 18 | first_stride (int): the stride of the first block. The other blocks will have stride=1. 19 | A `stride` argument will be passed to the block constructor. 20 | kwargs: other arguments passed to the block constructor. 21 | Returns: 22 | list[nn.Module]: a list of block module. 
23 | """ 24 | blocks = [] 25 | conv_kwargs = {key: kwargs[key] for key in kwargs if "deform" not in key} 26 | deform_kwargs = {key: kwargs[key] for key in kwargs if key != "deform_interval"} 27 | deform_interval = kwargs.get("deform_interval", None) 28 | for i in range(num_blocks): 29 | if deform_interval and i % deform_interval == 0: 30 | blocks.append(block_class(stride=first_stride if i == 0 else 1, **deform_kwargs)) 31 | else: 32 | blocks.append(BottleneckBlock(stride=first_stride if i == 0 else 1, **conv_kwargs)) 33 | conv_kwargs["in_channels"] = conv_kwargs["out_channels"] 34 | deform_kwargs["in_channels"] = deform_kwargs["out_channels"] 35 | return blocks 36 | 37 | 38 | @BACKBONE_REGISTRY.register() 39 | def build_resnet_interval_backbone(cfg, input_shape): 40 | """ 41 | Create a ResNet instance from config. 42 | Returns: 43 | ResNet: a :class:`ResNet` instance. 44 | """ 45 | # need registration of new blocks/stems? 46 | norm = cfg.MODEL.RESNETS.NORM 47 | stem = BasicStem( 48 | in_channels=input_shape.channels, 49 | out_channels=cfg.MODEL.RESNETS.STEM_OUT_CHANNELS, 50 | norm=norm, 51 | ) 52 | freeze_at = cfg.MODEL.BACKBONE.FREEZE_AT 53 | 54 | if freeze_at >= 1: 55 | for p in stem.parameters(): 56 | p.requires_grad = False 57 | stem = FrozenBatchNorm2d.convert_frozen_batchnorm(stem) 58 | 59 | # fmt: off 60 | out_features = cfg.MODEL.RESNETS.OUT_FEATURES 61 | depth = cfg.MODEL.RESNETS.DEPTH 62 | num_groups = cfg.MODEL.RESNETS.NUM_GROUPS 63 | width_per_group = cfg.MODEL.RESNETS.WIDTH_PER_GROUP 64 | bottleneck_channels = num_groups * width_per_group 65 | in_channels = cfg.MODEL.RESNETS.STEM_OUT_CHANNELS 66 | out_channels = cfg.MODEL.RESNETS.RES2_OUT_CHANNELS 67 | stride_in_1x1 = cfg.MODEL.RESNETS.STRIDE_IN_1X1 68 | res5_dilation = cfg.MODEL.RESNETS.RES5_DILATION 69 | deform_on_per_stage = cfg.MODEL.RESNETS.DEFORM_ON_PER_STAGE 70 | deform_modulated = cfg.MODEL.RESNETS.DEFORM_MODULATED 71 | deform_num_groups = cfg.MODEL.RESNETS.DEFORM_NUM_GROUPS 72 | 
deform_interval = cfg.MODEL.RESNETS.DEFORM_INTERVAL 73 | # fmt: on 74 | assert res5_dilation in {1, 2}, "res5_dilation cannot be {}.".format(res5_dilation) 75 | 76 | num_blocks_per_stage = {50: [3, 4, 6, 3], 101: [3, 4, 23, 3], 152: [3, 8, 36, 3]}[depth] 77 | 78 | stages = [] 79 | 80 | # Avoid creating variables without gradients 81 | # It consumes extra memory and may cause allreduce to fail 82 | out_stage_idx = [{"res2": 2, "res3": 3, "res4": 4, "res5": 5}[f] for f in out_features] 83 | max_stage_idx = max(out_stage_idx) 84 | for idx, stage_idx in enumerate(range(2, max_stage_idx + 1)): 85 | dilation = res5_dilation if stage_idx == 5 else 1 86 | first_stride = 1 if idx == 0 or (stage_idx == 5 and dilation == 2) else 2 87 | stage_kargs = { 88 | "num_blocks": num_blocks_per_stage[idx], 89 | "first_stride": first_stride, 90 | "in_channels": in_channels, 91 | "bottleneck_channels": bottleneck_channels, 92 | "out_channels": out_channels, 93 | "num_groups": num_groups, 94 | "norm": norm, 95 | "stride_in_1x1": stride_in_1x1, 96 | "dilation": dilation, 97 | } 98 | if deform_on_per_stage[idx]: 99 | stage_kargs["block_class"] = DeformBottleneckBlock 100 | stage_kargs["deform_modulated"] = deform_modulated 101 | stage_kargs["deform_num_groups"] = deform_num_groups 102 | stage_kargs["deform_interval"] = deform_interval 103 | else: 104 | stage_kargs["block_class"] = BottleneckBlock 105 | blocks = make_stage_intervals(**stage_kargs) 106 | in_channels = out_channels 107 | out_channels *= 2 108 | bottleneck_channels *= 2 109 | 110 | if freeze_at >= stage_idx: 111 | for block in blocks: 112 | block.freeze() 113 | stages.append(blocks) 114 | return ResNet(stem, stages, out_features=out_features) -------------------------------------------------------------------------------- /src/models/adet/modeling/fcos/__init__.py: -------------------------------------------------------------------------------- 1 | 2 | from .fcos import FCOS 
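The placement rule in `make_stage_intervals` above can be hard to parse at a glance: block index `i` of a stage receives a deformable block whenever `deform_interval` is set and `i % deform_interval == 0`, and a plain `BottleneckBlock` otherwise. A minimal standalone sketch of just that rule (illustrative only — `block_schedule` is not a function in this repository):

```python
# Illustrative sketch of the block-placement rule used by make_stage_intervals:
# index i gets a deformable block when deform_interval is set and
# i % deform_interval == 0; every other index gets a plain bottleneck block.
def block_schedule(num_blocks, deform_interval=None):
    return [
        "deform" if deform_interval and i % deform_interval == 0 else "plain"
        for i in range(num_blocks)
    ]

# e.g. a 6-block res4 stage with DEFORM_INTERVAL = 3
print(block_schedule(6, deform_interval=3))
# -> ['deform', 'plain', 'plain', 'deform', 'plain', 'plain']
```

With `deform_interval=1` every block is deformable, matching the behavior of `DEFORM_ON_PER_STAGE` without intervals.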
-------------------------------------------------------------------------------- /src/models/adet/utils/comm.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn.functional as F 3 | import torch.distributed as dist 4 | 5 | from detectron2.utils.comm import get_world_size 6 | 7 | 8 | def reduce_sum(tensor): 9 | world_size = get_world_size() 10 | if world_size < 2: 11 | return tensor 12 | tensor = tensor.clone() 13 | dist.all_reduce(tensor, op=dist.ReduceOp.SUM) 14 | return tensor 15 | 16 | 17 | def reduce_mean(tensor): 18 | num_gpus = get_world_size() 19 | total = reduce_sum(tensor) 20 | return total.float() / num_gpus 21 | 22 | 23 | def aligned_bilinear(tensor, factor): 24 | assert tensor.dim() == 4 25 | assert factor >= 1 26 | assert int(factor) == factor 27 | 28 | if factor == 1: 29 | return tensor 30 | 31 | h, w = tensor.size()[2:] 32 | tensor = F.pad(tensor, pad=(0, 1, 0, 1), mode="replicate") 33 | oh = factor * h + 1 34 | ow = factor * w + 1 35 | tensor = F.interpolate( 36 | tensor, size=(oh, ow), 37 | mode='bilinear', 38 | align_corners=True 39 | ) 40 | tensor = F.pad( 41 | tensor, pad=(factor // 2, 0, factor // 2, 0), 42 | mode="replicate" 43 | ) 44 | 45 | return tensor[:, :, :oh - 1, :ow - 1] 46 | 47 | 48 | def compute_locations(h, w, stride, device): 49 | shifts_x = torch.arange( 50 | 0, w * stride, step=stride, 51 | dtype=torch.float32, device=device 52 | ) 53 | shifts_y = torch.arange( 54 | 0, h * stride, step=stride, 55 | dtype=torch.float32, device=device 56 | ) 57 | shift_y, shift_x = torch.meshgrid(shifts_y, shifts_x) 58 | shift_x = shift_x.reshape(-1) 59 | shift_y = shift_y.reshape(-1) 60 | locations = torch.stack((shift_x, shift_y), dim=1) + stride // 2 61 | return locations 62 | 63 | 64 | def compute_ious(pred, target): 65 | """ 66 | Args: 67 | pred: Nx4 predicted bounding boxes 68 | target: Nx4 target bounding boxes 69 | Both are in the form of FCOS prediction (l, t, r, b) 70 | 
""" 71 | pred_left = pred[:, 0] 72 | pred_top = pred[:, 1] 73 | pred_right = pred[:, 2] 74 | pred_bottom = pred[:, 3] 75 | 76 | target_left = target[:, 0] 77 | target_top = target[:, 1] 78 | target_right = target[:, 2] 79 | target_bottom = target[:, 3] 80 | 81 | target_aera = (target_left + target_right) * \ 82 | (target_top + target_bottom) 83 | pred_aera = (pred_left + pred_right) * \ 84 | (pred_top + pred_bottom) 85 | 86 | w_intersect = torch.min(pred_left, target_left) + \ 87 | torch.min(pred_right, target_right) 88 | h_intersect = torch.min(pred_bottom, target_bottom) + \ 89 | torch.min(pred_top, target_top) 90 | 91 | g_w_intersect = torch.max(pred_left, target_left) + \ 92 | torch.max(pred_right, target_right) 93 | g_h_intersect = torch.max(pred_bottom, target_bottom) + \ 94 | torch.max(pred_top, target_top) 95 | ac_uion = g_w_intersect * g_h_intersect 96 | 97 | area_intersect = w_intersect * h_intersect 98 | area_union = target_aera + pred_aera - area_intersect 99 | 100 | ious = (area_intersect + 1.0) / (area_union + 1.0) 101 | gious = ious - (ac_uion - area_union) / ac_uion 102 | 103 | return ious, gious -------------------------------------------------------------------------------- /src/models/build.py: -------------------------------------------------------------------------------- 1 | import torch 2 | from detectron2.modeling import build_model as build_architecture 3 | from .faster_rcnn import CustomFasterRCNN 4 | from .retinanet import CustomRetinaNet 5 | from .fcos import CustomFCOS 6 | 7 | def build_model(cfg): 8 | ''' 9 | Wrapper around detectron's build_model() to do the backbone (or the whole model) initialization myself. 
10 | ''' 11 | # create new model 12 | model = build_architecture(cfg) 13 | if cfg.MODEL.WEIGHTS == 'detectron2://ImageNetPretrained/MSRA/R-50.pkl': 14 | model.backbone.bottom_up.load_state_dict(torch.load('./pretrained_backbones/ImageNetPretrained-MSRA-R-50.pt', 15 | map_location=cfg.MODEL.DEVICE)) 16 | print('Init model from MSRA.') 17 | assert cfg.INPUT.FORMAT == 'BGR', 'Input format for MSRA should be BGR!' 18 | 19 | elif cfg.MODEL.WEIGHTS == 'detectron2://ImageNetPretrained/torchvision/R-50.pkl': 20 | model.backbone.bottom_up.load_state_dict(torch.load('./pretrained_backbones/ImageNetPretrained-torchvision-R-50.pt', 21 | map_location=cfg.MODEL.DEVICE)) 22 | print('Init model from torchvision.') 23 | assert cfg.INPUT.FORMAT == 'RGB', 'Input format for torchvision should be RGB!' 24 | 25 | elif cfg.MODEL.WEIGHTS: 26 | model.load_state_dict(torch.load(cfg.MODEL.WEIGHTS, 27 | map_location=cfg.MODEL.DEVICE)) 28 | print(f'Init model from {cfg.MODEL.WEIGHTS}') 29 | else: 30 | print('Init model randomly.') 31 | 32 | # TODO: move this to constructors 33 | if isinstance(model, CustomFasterRCNN) or isinstance(model, CustomRetinaNet): 34 | model.nms_thresh_train = cfg.CORRECTION.get('NMS_THRESH_TRAIN', None) 35 | model.score_thresh_train = cfg.CORRECTION.get('SCORE_THRESH_TRAIN', None) 36 | model.detections_per_image_train = cfg.CORRECTION.get('DETECTIONS_PER_IMAGE_TRAIN', None) 37 | if isinstance(model, CustomFCOS): 38 | assert cfg.CORRECTION.DETECTIONS_PER_IMAGE_TRAIN == cfg.MODEL.FCOS.POST_NMS_TOPK_TRAIN 39 | assert cfg.CORRECTION.SCORE_THRESH_TRAIN == cfg.MODEL.FCOS.INFERENCE_TH_TRAIN 40 | # FCOS always uses the NMS threshold from its own model config 41 | model.nms_thresh_train = cfg.MODEL.FCOS.NMS_TH 42 | 43 | return model 44 | -------------------------------------------------------------------------------- /src/models/faster_rcnn.py: -------------------------------------------------------------------------------- 1 | import torch 2 | from typing import Dict,
List 3 | from detectron2.modeling.meta_arch.rcnn import GeneralizedRCNN 4 | from detectron2.modeling.roi_heads.fast_rcnn import fast_rcnn_inference 5 | from detectron2.modeling.meta_arch.build import META_ARCH_REGISTRY 6 | from detectron2.utils.events import get_event_storage 7 | 8 | 9 | @META_ARCH_REGISTRY.register() 10 | class CustomFasterRCNN(GeneralizedRCNN): 11 | ''' 12 | With train_forward() and compute_losses(), we can separate 13 | the forward pass of inputs from the loss generation. 14 | ''' 15 | 16 | def train_forward(self, batched_inputs: List[Dict[str, torch.Tensor]]): 17 | ''' 18 | Do not use the gt_instances of the inputs here. 19 | ''' 20 | loss_inputs = {} 21 | 22 | images = self.preprocess_image(batched_inputs) 23 | features = self.backbone(images.tensor) 24 | 25 | # start proposal_generator.forward: proposals, proposal_losses = self.proposal_generator(images, features, gt_instances) 26 | proposal_generator_features = [features[f] for f in self.proposal_generator.in_features] 27 | anchors = self.proposal_generator.anchor_generator(proposal_generator_features) 28 | pred_objectness_logits, pred_anchor_deltas = self.proposal_generator.rpn_head(proposal_generator_features) 29 | # Transpose the Hi*Wi*A dimension to the middle: 30 | pred_objectness_logits = [ 31 | # (N, A, Hi, Wi) -> (N, Hi, Wi, A) -> (N, Hi*Wi*A) 32 | score.permute(0, 2, 3, 1).flatten(1) 33 | for score in pred_objectness_logits 34 | ] 35 | pred_anchor_deltas = [ 36 | # (N, A*B, Hi, Wi) -> (N, A, B, Hi, Wi) -> (N, Hi, Wi, A, B) -> (N, Hi*Wi*A, B) 37 | x.view(x.shape[0], -1, self.proposal_generator.anchor_generator.box_dim, x.shape[-2], x.shape[-1]) 38 | .permute(0, 3, 4, 1, 2) 39 | .flatten(1, -2) 40 | for x in pred_anchor_deltas 41 | ] 42 | #gt_labels, gt_boxes = self.label_and_sample_anchors(anchors, gt_instances) 43 | #losses = self.losses( 44 | # anchors, pred_objectness_logits, gt_labels, pred_anchor_deltas, gt_boxes 45 | #) 46 | loss_inputs['proposal_generator'] = 
dict(anchors=anchors, 47 | pred_objectness_logits=pred_objectness_logits, 48 | pred_anchor_deltas=pred_anchor_deltas) 49 | 50 | proposals = self.proposal_generator.predict_proposals( 51 | anchors, pred_objectness_logits, pred_anchor_deltas, images.image_sizes 52 | ) 53 | # end proposal_generator.forward 54 | 55 | # start roi_heads.forward: _, detector_losses = self.roi_heads(images, features, proposals, gt_instances) 56 | # start roi_heads._forward_box: pred_instances = self._forward_box(features, proposals) 57 | roi_heads_features = [features[f] for f in self.roi_heads.box_in_features] 58 | loss_inputs['roi_heads'] = dict(features=roi_heads_features, 59 | proposals=proposals) 60 | roi_heads_box_features = self.roi_heads.box_pooler(roi_heads_features, [x.proposal_boxes for x in proposals]) 61 | roi_heads_box_features = self.roi_heads.box_head(roi_heads_box_features) 62 | roi_heads_predictions = self.roi_heads.box_predictor(roi_heads_box_features) 63 | # start roi_heads.box_predictor.inference: pred_instances, _ = self.roi_heads.box_predictor.inference(roi_heads_predictions, proposals) 64 | boxes = self.roi_heads.box_predictor.predict_boxes(roi_heads_predictions, proposals) 65 | scores = self.roi_heads.box_predictor.predict_probs(roi_heads_predictions, proposals) 66 | image_shapes = [x.image_size for x in proposals] 67 | pred_instances, _ = fast_rcnn_inference(boxes=boxes, 68 | scores=scores, 69 | image_shapes=image_shapes, 70 | score_thresh=self.score_thresh_train, 71 | nms_thresh=self.nms_thresh_train, 72 | topk_per_image=self.detections_per_image_train, 73 | # initialized in build_model() 74 | ) 75 | # end roi_heads.box_predictor.inference 76 | # end roi_heads._forward_box 77 | # end roi_heads.forward 78 | 79 | if self.vis_period > 0: 80 | storage = get_event_storage() 81 | if storage.iter % self.vis_period == 0: 82 | self.visualize_training(batched_inputs, proposals) 83 | 84 | # postprocessing to obtain proper predictions 85 | assert not 
torch.jit.is_scripting(), "Scripting is not supported for postprocess." 86 | #predictions = GeneralizedRCNN._postprocess(pred_instances, batched_inputs, images.image_sizes) # this brings data to original scale (before padding AND scaling) 87 | predictions = [{'instances': x} for x in pred_instances] 88 | return predictions, loss_inputs 89 | 90 | 91 | def compute_losses(self, gt_instances, loss_inputs): 92 | gt_instances = [x.to(self.device) for x in gt_instances] 93 | 94 | # proposal losses 95 | anchors = loss_inputs['proposal_generator']['anchors'] 96 | pred_objectness_logits = loss_inputs['proposal_generator']['pred_objectness_logits'] 97 | pred_anchor_deltas = loss_inputs['proposal_generator']['pred_anchor_deltas'] 98 | 99 | gt_labels, gt_boxes = self.proposal_generator.label_and_sample_anchors(anchors, gt_instances) 100 | proposal_losses = self.proposal_generator.losses( 101 | anchors, pred_objectness_logits, gt_labels, pred_anchor_deltas, gt_boxes 102 | ) 103 | 104 | # roi_heads/detector losses 105 | targets = gt_instances 106 | proposals = loss_inputs['roi_heads']['proposals'] 107 | features = loss_inputs['roi_heads']['features'] 108 | 109 | proposals = self.roi_heads.label_and_sample_proposals(proposals, targets) 110 | box_features = self.roi_heads.box_pooler(features, [x.proposal_boxes for x in proposals]) 111 | box_features = self.roi_heads.box_head(box_features) 112 | predictions = self.roi_heads.box_predictor(box_features) 113 | detector_losses = self.roi_heads.box_predictor.losses(predictions, proposals) 114 | 115 | ########################################################## 116 | losses = {} 117 | losses.update(detector_losses) 118 | losses.update(proposal_losses) 119 | return losses -------------------------------------------------------------------------------- /src/models/fcos.py: -------------------------------------------------------------------------------- 1 | import torch 2 | from typing import Dict, List 3 | from 
detectron2.modeling.meta_arch.build import META_ARCH_REGISTRY
from detectron2.structures import ImageList

from .adet import OneStageDetector


@META_ARCH_REGISTRY.register()
class CustomFCOS(OneStageDetector):
    '''
    With train_forward() and compute_losses(), we can separate
    the forward pass of the inputs from the loss computation.
    '''
    def train_forward(self, batched_inputs: List[Dict[str, torch.Tensor]]):
        '''
        Do not use the gt_instances of the inputs here.
        '''
        # if self.training:
        #     return super().forward(batched_inputs)
        images = [x["image"].to(self.device) for x in batched_inputs]
        images = [(x - self.pixel_mean) / self.pixel_std for x in images]
        images = ImageList.from_tensors(images, self.backbone.size_divisibility)
        features = self.backbone(images.tensor)

        # Inlined from self.proposal_generator.forward:
        #   proposals, proposal_losses = self.proposal_generator(images, features, gt_instances)
        proposal_features = [features[f] for f in self.proposal_generator.in_features]
        locations = self.proposal_generator.compute_locations(proposal_features)
        logits_pred, reg_pred, ctrness_pred, top_feats, bbox_towers = self.proposal_generator.fcos_head(
            proposal_features, None,
            self.proposal_generator.yield_proposal or self.proposal_generator.yield_box_feats
        )
        loss_inputs = {'logits_pred': logits_pred,
                       'reg_pred': reg_pred,
                       'ctrness_pred': ctrness_pred,
                       'locations': locations,
                       'top_feats': top_feats}

        # Temporarily swap in the training-time NMS threshold of fcos_outputs;
        # the test-time value is restored right after proposal generation.
        self.nms_thresh_test = self.proposal_generator.fcos_outputs.nms_thresh
        self.proposal_generator.fcos_outputs.nms_thresh = self.nms_thresh_train
        proposals = self.proposal_generator.fcos_outputs.predict_proposals(
            logits_pred, reg_pred, ctrness_pred,
            locations, images.image_sizes, top_feats)
        self.proposal_generator.fcos_outputs.nms_thresh = self.nms_thresh_test
        # End of inlined self.proposal_generator.forward.

        # Unlike the stock detectron2 forward, the proposals are kept and
        # returned as per-image predictions for the caller.
        predictions = []
        for results_per_image, input_per_image, image_size in zip(
            proposals, batched_inputs, images.image_sizes
        ):
            # height = input_per_image.get("height", image_size[0])
            # width = input_per_image.get("width", image_size[1])
            # r = detector_postprocess(results_per_image, height, width)
            r = results_per_image
            predictions.append({"instances": r})
        return predictions, loss_inputs

    def compute_losses(self, gt_instances, loss_inputs):
        gt_instances = [x.to(self.device) for x in gt_instances]
        logits_pred = loss_inputs['logits_pred']
        reg_pred = loss_inputs['reg_pred']
        ctrness_pred = loss_inputs['ctrness_pred']
        locations = loss_inputs['locations']
        top_feats = loss_inputs['top_feats']

        _, proposal_losses = self.proposal_generator.fcos_outputs.losses(
            logits_pred, reg_pred, ctrness_pred,
            locations, gt_instances, top_feats
        )
        return proposal_losses
--------------------------------------------------------------------------------
/src/models/retinanet.py:
--------------------------------------------------------------------------------
import torch
from typing import Dict, List
from detectron2.modeling.meta_arch.build import META_ARCH_REGISTRY
from detectron2.modeling.meta_arch.retinanet import RetinaNet, permute_to_N_HWA_K
from detectron2.structures import Boxes, Instances
from detectron2.utils.events import get_event_storage
from detectron2.layers import batched_nms, cat, nonzero_tuple


@META_ARCH_REGISTRY.register()
class CustomRetinaNet(RetinaNet):
    '''
    With train_forward() and compute_losses(), we can separate
    the forward pass of the inputs from the loss computation.
    '''
    def train_forward(self, batched_inputs: List[Dict[str, torch.Tensor]]):
        '''
        Do not use the gt_instances of the inputs here.
        '''
        loss_inputs = {}

        images = self.preprocess_image(batched_inputs)
        features = self.backbone(images.tensor)
        features = [features[f] for f in self.head_in_features]

        anchors = self.anchor_generator(features)
        pred_logits, pred_anchor_deltas = self.head(features)
        # Transpose the Hi*Wi*A dimension to the middle:
        pred_logits = [permute_to_N_HWA_K(x, self.num_classes) for x in pred_logits]
        pred_anchor_deltas = [permute_to_N_HWA_K(x, 4) for x in pred_anchor_deltas]

        loss_inputs['anchors'] = anchors
        loss_inputs['pred_logits'] = pred_logits
        loss_inputs['pred_anchor_deltas'] = pred_anchor_deltas

        # Clone before the in-place ops below; otherwise the cached loss inputs
        # would be overridden.
        pred_logits = [torch.clone(x).detach() for x in pred_logits]
        pred_anchor_deltas = [torch.clone(x).detach() for x in pred_anchor_deltas]

        # if self.training:
        #     assert not torch.jit.is_scripting(), "Not supported"
        #     assert "instances" in batched_inputs[0], "Instance annotations are missing in training!"
        #     gt_instances = [x["instances"].to(self.device) for x in batched_inputs]
        #
        #     gt_labels, gt_boxes = self.label_anchors(anchors, gt_instances)
        #     losses = self.losses(anchors, pred_logits, gt_labels, pred_anchor_deltas, gt_boxes)

        if self.vis_period > 0:
            storage = get_event_storage()
            if storage.iter % self.vis_period == 0:
                results = self.inference(
                    anchors, [torch.clone(x) for x in pred_logits], pred_anchor_deltas, images.image_sizes
                )
                self.visualize_training(batched_inputs, results)

        # return losses

        # Inlined from self.inference(anchors, pred_logits, pred_anchor_deltas, images.image_sizes),
        # but using the training-time thresholds below.
        results: List[Instances] = []
        for img_idx, image_size in enumerate(images.image_sizes):
            pred_logits_per_image = [x[img_idx] for x in pred_logits]
            deltas_per_image = [x[img_idx] for x in pred_anchor_deltas]
            # Inlined from self.inference_single_image(anchors, pred_logits_per_image, deltas_per_image, image_size).
            box_cls, box_delta = pred_logits_per_image, deltas_per_image

            boxes_all = []
            scores_all = []
            class_idxs_all = []

            # Iterate over every feature level.
            for box_cls_i, box_reg_i, anchors_i in zip(box_cls, box_delta, anchors):
                # (HxWxAxK,)
                predicted_prob = box_cls_i.flatten().sigmoid_()

                # Apply the two filters below to make NMS faster.
                # 1. Keep boxes with a confidence score above the threshold.
                keep_idxs = predicted_prob > self.score_thresh_train
                predicted_prob = predicted_prob[keep_idxs]
                topk_idxs = nonzero_tuple(keep_idxs)[0]

                # 2. Keep only the top k scoring boxes.
                num_topk = min(self.detections_per_image_train, topk_idxs.size(0))
                # torch.sort is actually faster than .topk (at least on GPUs)
                predicted_prob, idxs = predicted_prob.sort(descending=True)
                predicted_prob = predicted_prob[:num_topk]
                topk_idxs = topk_idxs[idxs[:num_topk]]

                anchor_idxs = topk_idxs // self.num_classes
                classes_idxs = topk_idxs % self.num_classes

                box_reg_i = box_reg_i[anchor_idxs]
                anchors_i = anchors_i[anchor_idxs]
                # Predict boxes.
                predicted_boxes = self.box2box_transform.apply_deltas(box_reg_i, anchors_i.tensor)

                boxes_all.append(predicted_boxes)
                scores_all.append(predicted_prob)
                class_idxs_all.append(classes_idxs)

            boxes_all, scores_all, class_idxs_all = [
                cat(x) for x in [boxes_all, scores_all, class_idxs_all]
            ]
            keep = batched_nms(boxes_all, scores_all, class_idxs_all, self.nms_thresh_train)
            keep = keep[: self.detections_per_image_train]  # instead of self.max_detections_per_image

            results_per_image = Instances(image_size)
            results_per_image.pred_boxes = Boxes(boxes_all[keep])
            results_per_image.scores = scores_all[keep]
            results_per_image.pred_classes = class_idxs_all[keep]
            # End of inlined self.inference_single_image.
            results.append(results_per_image)
        # End of inlined self.inference.

        predictions = []
        for results_per_image, input_per_image, image_size in zip(
            results, batched_inputs, images.image_sizes
        ):
            # height = input_per_image.get("height", image_size[0])
            # width = input_per_image.get("width", image_size[1])
            # r = detector_postprocess(results_per_image, height, width)
            r = results_per_image
            predictions.append({"instances": r})
        return predictions, loss_inputs

    def compute_losses(self, gt_instances, loss_inputs):
        gt_instances = [x.to(self.device) for x in gt_instances]
        anchors = loss_inputs['anchors']
        pred_logits = loss_inputs['pred_logits']
        pred_anchor_deltas = loss_inputs['pred_anchor_deltas']

        gt_labels, gt_boxes = self.label_anchors(anchors, gt_instances)
        losses = self.losses(anchors, pred_logits, gt_labels, pred_anchor_deltas, gt_boxes)
        return losses
--------------------------------------------------------------------------------
/src/utils/boxutils.py:
--------------------------------------------------------------------------------
from torchvision.ops import box_iou
import torch
import numpy as np


def box_iou_distance(boxes1, boxes2):
    '''
    Returned values are 1 - IoU, i.e. values in [0, 1].
    '''
    if isinstance(boxes1, np.ndarray):
        boxes1 = torch.from_numpy(boxes1)
    if isinstance(boxes2, np.ndarray):
        boxes2 = torch.from_numpy(boxes2)
    return 1 - box_iou(boxes1, boxes2)  # (n1, n2)


def box_center_distance(boxes1, boxes2):
    '''
    Returned values are the Euclidean center distances normalized by the sizes of the boxes.
    '''
    if isinstance(boxes1, np.ndarray):
        boxes1 = torch.from_numpy(boxes1)
    if isinstance(boxes2, np.ndarray):
        boxes2 = torch.from_numpy(boxes2)

    centers1 = 0.5 * torch.stack([boxes1[:, 0] + boxes1[:, 2],
                                  boxes1[:, 1] + boxes1[:, 3]], dim=1)
    centers2 = 0.5 * torch.stack([boxes2[:, 0] + boxes2[:, 2],
                                  boxes2[:, 1] + boxes2[:, 3]], dim=1)
    sizes1 = 0.5 * (boxes1[:, 2] - boxes1[:, 0] + boxes1[:, 3] - boxes1[:, 1])
    sizes2 = 0.5 * (boxes2[:, 2] - boxes2[:, 0] + boxes2[:, 3] - boxes2[:, 1])

    distances = torch.cdist(centers1, centers2, p=2)
    # Normalize by the geometric mean of the two box sizes.
    distances = distances / torch.sqrt(sizes1).reshape(-1, 1)
    distances = distances / torch.sqrt(sizes2).reshape(1, -1)
    return distances
--------------------------------------------------------------------------------
/src/utils/checkpoint.py:
--------------------------------------------------------------------------------
import os
import torch


def save_checkpoint(model, optim, scheduler, iteration, output_dir, teacher_model=None):
    checkpoint_dir = os.path.join(output_dir, 'checkpoints')
    os.makedirs(checkpoint_dir, exist_ok=True)

    checkpoint = dict(iteration=iteration,
                      model=model.state_dict(),
                      optim=optim.state_dict(),
                      scheduler=scheduler.state_dict())
    if teacher_model is not None:
        checkpoint['teacher_model'] = teacher_model.state_dict()
    torch.save(checkpoint, os.path.join(checkpoint_dir, str(iteration) + '.pt'))


def load_checkpoint(checkpoint_path):
    checkpoint = torch.load(checkpoint_path)
    model_state_dict = checkpoint['model']
    optim_state_dict = checkpoint['optim']
    scheduler_state_dict = checkpoint['scheduler']
    iteration = checkpoint['iteration']

    if 'teacher_model' in checkpoint:
        teacher_model_state_dict = checkpoint['teacher_model']
        return model_state_dict, optim_state_dict, scheduler_state_dict, iteration, teacher_model_state_dict
    else:
        return model_state_dict, optim_state_dict, scheduler_state_dict, iteration
--------------------------------------------------------------------------------
/src/utils/misc.py:
--------------------------------------------------------------------------------
import torch
import hashlib

assert torch.rand(1, generator=torch.Generator().manual_seed(0)).item() == 0.49625658988952637, \
    'Random seeding is different on this machine compared to our experiments'


def to_numpy(x):
    if isinstance(x, torch.Tensor):
        return x.detach().cpu().numpy()
    else:
        raise NotImplementedError


def get_random_generator(seed):
    # PyTorch recommends not seeding with numbers that have too many zero bytes.
    assert isinstance(seed, (str, int, float))
    generator = torch.Generator()
    seed = str(seed)
    # Only take half the hash to avoid overflows.
    seed = int(hashlib.md5(seed.encode()).hexdigest()[:16], 16)
    generator.manual_seed(seed)
    return generator
--------------------------------------------------------------------------------
/test.py:
--------------------------------------------------------------------------------
import os
import torch
import argparse

from detectron2.config import CfgNode
from detectron2.evaluation import inference_on_dataset
from detectron2.utils.events import EventStorage
from detectron2.utils.logger import setup_logger

from src.evaluation.build import build_evaluator
from src.models.build import build_model
from src.datasets.build import build_dataloaders


def test(run_path=None, cfg=None, model=None, val=False, test=True):
    assert val or test, 'At least one of val and test must be enabled.'
    if run_path is not None:
        assert cfg is None
        cfg = CfgNode(init_dict=CfgNode.load_yaml_with_base(filename=os.path.join(run_path, 'config.yaml')))
        cfg.OUTPUT_DIR = run_path
    else:
        assert cfg is not None

    logger = setup_logger(os.path.join(cfg.OUTPUT_DIR, 'evaluation'))
    # Remove old handlers such that multiple runs can be tested in one script.
    for h in logger.handlers[:-2]:
        logger.removeHandler(h)

    train_loader, val_loader, test_loader = build_dataloaders(cfg)
    val_evaluator = build_evaluator(cfg, val=True)
    test_evaluator = build_evaluator(cfg, val=False)

    if model is None:
        model_path = os.path.join(cfg.OUTPUT_DIR, 'final_model.pt')
        model = build_model(cfg)
        model.load_state_dict(torch.load(model_path))
    if isinstance(model, str):
        model_path = model
        model = build_model(cfg)
        model.load_state_dict(torch.load(model_path))
    model.to(cfg.MODEL.DEVICE)

    with EventStorage(0):
        if val:
            val_results = inference_on_dataset(model, val_loader, val_evaluator)
            val_score = val_results['score'] if 'score' in val_results else -val_results['bbox']['AP50']
            logger.info(val_results)
            logger.info('Val score: {}'.format(val_score))

        if test:
            test_results = inference_on_dataset(model, test_loader, test_evaluator)
            test_score = test_results['score'] if 'score' in test_results else -test_results['bbox']['AP50']
            logger.info(test_results)
            logger.info('Test score: {}'.format(test_score))

    return test_score if test else val_score


def str2bool(v):
    # argparse's type=bool would treat any non-empty string (even 'False') as True.
    return str(v).lower() in ('true', '1', 'yes')


if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='Evaluation args.')
    parser.add_argument('--run_path', type=str, required=True, help='Path to the run you want to evaluate.')
    parser.add_argument('--val', type=str2bool, default=False, help='Whether to do validation.')
    parser.add_argument('--test', type=str2bool, default=True, help='Whether to do testing.')

    args = parser.parse_args()
    score = test(run_path=args.run_path, val=args.val, test=args.test)
--------------------------------------------------------------------------------
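
The `train_forward()` / `compute_losses()` split in `CustomFCOS` and `CustomRetinaNet` exists so that a trainer can inspect the predictions and correct the (noisy) targets before any loss is computed. The following detectron2-free sketch illustrates that flow; `TinyDetector` and `correct_targets` are hypothetical stand-ins operating on plain numbers, not the repository's actual API:

```python
class TinyDetector:
    '''Stand-in for CustomFCOS / CustomRetinaNet (numbers instead of boxes).'''

    def train_forward(self, batch):
        # Forward pass only: return predictions plus everything the loss
        # will need later, but touch no ground truth.
        preds = [x * 2 for x in batch]
        loss_inputs = {'pred': preds}
        return preds, loss_inputs

    def compute_losses(self, targets, loss_inputs):
        # Loss on the (possibly corrected) targets; no second forward pass.
        return {'loss': sum(abs(p - t) for p, t in zip(loss_inputs['pred'], targets))}


def correct_targets(preds, noisy_targets):
    # Stand-in for a correction step: snap a noisy target onto the
    # prediction when the two are close enough.
    return [p if abs(p - t) <= 1 else t for p, t in zip(preds, noisy_targets)]


model = TinyDetector()
preds, loss_inputs = model.train_forward([1, 2, 3])  # preds == [2, 4, 6]
targets = correct_targets(preds, [2, 9, 5])          # -> [2, 9, 6]
losses = model.compute_losses(targets, loss_inputs)
print(losses)  # {'loss': 5}
```

The design choice mirrors the real classes: `loss_inputs` caches the head outputs so the loss can be evaluated later against corrected targets without rerunning the backbone.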
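
As a hand-computed sanity check for the normalization in `box_center_distance` (src/utils/boxutils.py): the center distance is divided by the square roots of both box sizes, i.e. by the geometric mean of the two half-perimeters. A plain-Python version with made-up boxes:

```python
import math

# Two axis-aligned boxes in (x1, y1, x2, y2) format, chosen for easy arithmetic.
b1 = (0, 0, 10, 10)
b2 = (5, 5, 15, 15)

c1 = ((b1[0] + b1[2]) / 2, (b1[1] + b1[3]) / 2)   # center (5, 5)
c2 = ((b2[0] + b2[2]) / 2, (b2[1] + b2[3]) / 2)   # center (10, 10)

# "Size" as in box_center_distance: half of (width + height).
s1 = 0.5 * ((b1[2] - b1[0]) + (b1[3] - b1[1]))    # 10.0
s2 = 0.5 * ((b2[2] - b2[0]) + (b2[3] - b2[1]))    # 10.0

# Euclidean center distance, normalized by sqrt(s1) * sqrt(s2).
d = math.dist(c1, c2) / (math.sqrt(s1) * math.sqrt(s2))
print(round(d, 4))  # 0.7071  (= sqrt(50) / 10)
```

For same-size boxes the normalizer reduces to the common size, so the result is the center distance measured in box units.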