├── mmdet ├── ops │ ├── dcn │ │ ├── modules │ │ │ └── __init__.py │ │ ├── functions │ │ │ ├── __init__.py │ │ │ └── deform_pool.py │ │ ├── setup.py │ │ ├── __init__.py │ │ └── src │ │ │ └── deform_pool_cuda.cpp │ ├── roi_pool │ │ ├── modules │ │ │ ├── __init__.py │ │ │ └── roi_pool.py │ │ ├── functions │ │ │ ├── __init__.py │ │ │ └── roi_pool.py │ │ ├── __init__.py │ │ ├── setup.py │ │ ├── gradcheck.py │ │ └── src │ │ │ └── roi_pool_cuda.cpp │ ├── roi_align │ │ ├── functions │ │ │ ├── __init__.py │ │ │ └── roi_align.py │ │ ├── modules │ │ │ ├── __init__.py │ │ │ └── roi_align.py │ │ ├── __init__.py │ │ ├── setup.py │ │ ├── gradcheck.py │ │ └── src │ │ │ └── roi_align_cuda.cpp │ ├── sigmoid_focal_loss │ │ ├── modules │ │ │ ├── __init__.py │ │ │ └── sigmoid_focal_loss.py │ │ ├── functions │ │ │ ├── __init__.py │ │ │ └── sigmoid_focal_loss.py │ │ ├── __init__.py │ │ ├── setup.py │ │ └── src │ │ │ └── sigmoid_focal_loss.cpp │ ├── nms │ │ ├── __init__.py │ │ ├── src │ │ │ ├── nms_cuda.cpp │ │ │ ├── nms_cpu.cpp │ │ │ ├── soft_nms_cpu.pyx │ │ │ └── nms_kernel.cu │ │ ├── nms_wrapper.py │ │ └── setup.py │ └── __init__.py ├── models │ ├── necks │ │ ├── __init__.py │ │ └── fpn.py │ ├── roi_extractors │ │ ├── __init__.py │ │ └── single_level.py │ ├── anchor_heads │ │ ├── __init__.py │ │ └── rpn_head.py │ ├── backbones │ │ └── __init__.py │ ├── losses │ │ ├── __init__.py │ │ ├── smooth_l1_loss.py │ │ └── cross_entropy_loss.py │ ├── bbox_heads │ │ └── __init__.py │ ├── detectors │ │ ├── __init__.py │ │ ├── faster_rcnn.py │ │ ├── rpn.py │ │ └── base.py │ ├── utils │ │ ├── scale.py │ │ ├── gaussian_kernel.py │ │ ├── __init__.py │ │ ├── gumbel_sigmoid.py │ │ ├── conv_ws.py │ │ ├── weight_init.py │ │ └── norm.py │ ├── __init__.py │ ├── registry.py │ └── builder.py ├── __init__.py ├── core │ ├── anchor │ │ ├── __init__.py │ │ └── anchor_generator.py │ ├── bbox │ │ ├── assigners │ │ │ ├── __init__.py │ │ │ ├── base_assigner.py │ │ │ └── assign_result.py │ │ ├── samplers │ │ │ ├── combined_sampler.py │ │ │ ├── __init__.py │ │ │ ├── sampling_result.py │ │ │ ├── pseudo_sampler.py │ │ │ ├── instance_balanced_pos_sampler.py │ │ │ ├── random_sampler.py │ │ │ ├── iou_balanced_neg_sampler.py │ │ │ ├── ohem_sampler.py │ │ │ └── base_sampler.py │ │ ├── __init__.py │ │ ├── assign_sampling.py │ │ ├── geometry.py │ │ └── bbox_target.py │ ├── utils │ │ ├── __init__.py │ │ ├── misc.py │ │ └── dist_utils.py │ ├── __init__.py │ ├── post_processing │ │ ├── __init__.py │ │ ├── bbox_nms.py │ │ └── merge_augs.py │ ├── loss │ │ ├── __init__.py │ │ └── losses.py │ └── evaluation │ │ ├── __init__.py │ │ ├── bbox_overlaps.py │ │ ├── coco_utils.py │ │ └── class_names.py ├── datasets │ ├── loader │ │ ├── __init__.py │ │ └── build_loader.py │ ├── repeat_dataset.py │ ├── voc.py │ ├── __init__.py │ ├── concat_dataset.py │ ├── imagenet.py │ ├── xml_style.py │ ├── utils.py │ ├── transforms.py │ └── coco.py ├── utils │ ├── __init__.py │ └── distributed.py └── apis │ ├── __init__.py │ ├── env.py │ └── inference.py ├── mmcv_custom ├── __init__.py ├── parameters.py ├── image_io.py ├── runner.py └── zipreader.py ├── demo ├── github_raw_image.png ├── github_pipeline_gumbel.png ├── github_stochastic_sampling.png └── github_deterministic_sampling.png ├── init.sh ├── tools ├── dist_train.sh ├── dist_test.sh ├── coco_eval.py ├── publish_model.py └── train.py ├── compile.sh ├── .gitignore ├── setup.py └── README.md /mmdet/ops/dcn/modules/__init__.py: -------------------------------------------------------------------------------- 1 | 
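Each package under mmdet/ops above follows the same layout: the raw autograd function lives in functions/, its nn.Module wrapper in modules/, the CUDA sources in src/, and a per-op setup.py driven by compile.sh. A minimal usage sketch, not a file from the repo: the import names come from mmdet/ops/__init__.py further down in this dump, while the tensor shapes and values are illustrative only.

import torch
from mmdet.ops import RoIAlign

feats = torch.randn(2, 16, 32, 32).cuda()
# each RoI row is [batch_index, x1, y1, x2, y2] in image coordinates
rois = torch.tensor([[0., 4., 4., 20., 20.]]).cuda()
pooled = RoIAlign(out_size=7, spatial_scale=1.0 / 8)(feats, rois)  # -> (1, 16, 7, 7)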
-------------------------------------------------------------------------------- /mmdet/ops/dcn/functions/__init__.py: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /mmdet/ops/roi_pool/modules/__init__.py: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /mmdet/ops/roi_align/functions/__init__.py: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /mmdet/ops/roi_align/modules/__init__.py: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /mmdet/ops/roi_pool/functions/__init__.py: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /mmdet/ops/sigmoid_focal_loss/modules/__init__.py: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /mmdet/ops/sigmoid_focal_loss/functions/__init__.py: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /mmdet/models/necks/__init__.py: -------------------------------------------------------------------------------- 1 | from .fpn import FPN 2 | 3 | __all__ = ['FPN'] 4 | -------------------------------------------------------------------------------- /mmdet/ops/nms/__init__.py: -------------------------------------------------------------------------------- 1 | from .nms_wrapper import nms, soft_nms 2 | 3 | __all__ = ['nms', 'soft_nms'] 4 | -------------------------------------------------------------------------------- /mmcv_custom/__init__.py: -------------------------------------------------------------------------------- 1 | from .image_io import imread 2 | from .runner import Runner 3 | 4 | __all__ = ['imread', 'Runner'] 5 | -------------------------------------------------------------------------------- /mmdet/__init__.py: -------------------------------------------------------------------------------- 1 | from .version import __version__, short_version 2 | 3 | __all__ = ['__version__', 'short_version'] 4 | -------------------------------------------------------------------------------- /demo/github_raw_image.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/zdaxie/SpatiallyAdaptiveInference-Detection/HEAD/demo/github_raw_image.png -------------------------------------------------------------------------------- /mmdet/models/roi_extractors/__init__.py: -------------------------------------------------------------------------------- 1 | from .single_level import SingleRoIExtractor 2 | 3 | __all__ = ['SingleRoIExtractor'] 4 | -------------------------------------------------------------------------------- /demo/github_pipeline_gumbel.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/zdaxie/SpatiallyAdaptiveInference-Detection/HEAD/demo/github_pipeline_gumbel.png
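A quick sketch of how the nms/soft_nms wrappers exported above are called. This is not a file from the repo: the (N, 5) row layout of [x1, y1, x2, y2, score] is the usual mmdet convention, and the assumed return value of (kept detections, kept indices) should be checked against nms_wrapper.py.

import numpy as np
from mmdet.ops.nms import nms, soft_nms

dets = np.array([[10, 10, 60, 60, 0.9],
                 [12, 12, 62, 62, 0.8],
                 [100, 100, 150, 150, 0.7]], dtype=np.float32)
kept, inds = nms(dets, 0.5)        # drop boxes overlapping a higher-scoring box by IoU > 0.5
soft, sinds = soft_nms(dets, 0.5)  # decay overlapping scores instead of dropping boxes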
-------------------------------------------------------------------------------- /demo/github_stochastic_sampling.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/zdaxie/SpatiallyAdaptiveInference-Detection/HEAD/demo/github_stochastic_sampling.png -------------------------------------------------------------------------------- /demo/github_deterministic_sampling.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/zdaxie/SpatiallyAdaptiveInference-Detection/HEAD/demo/github_deterministic_sampling.png -------------------------------------------------------------------------------- /mmdet/models/anchor_heads/__init__.py: -------------------------------------------------------------------------------- 1 | from .anchor_head import AnchorHead 2 | from .rpn_head import RPNHead 3 | 4 | __all__ = ['AnchorHead', 'RPNHead'] 5 | -------------------------------------------------------------------------------- /mmdet/ops/roi_pool/__init__.py: -------------------------------------------------------------------------------- 1 | from .functions.roi_pool import roi_pool 2 | from .modules.roi_pool import RoIPool 3 | 4 | __all__ = ['roi_pool', 'RoIPool'] 5 | -------------------------------------------------------------------------------- /mmdet/models/backbones/__init__.py: -------------------------------------------------------------------------------- 1 | from .resnet import ResNet, make_res_layer 2 | from .sparse_resnet import SparseResNet 3 | 4 | __all__ = ['ResNet', 'make_res_layer', 'SparseResNet'] -------------------------------------------------------------------------------- /mmdet/ops/roi_align/__init__.py: -------------------------------------------------------------------------------- 1 | from .functions.roi_align import roi_align 2 | from .modules.roi_align import RoIAlign 3 | 4 | __all__ = ['roi_align', 'RoIAlign'] 5 | -------------------------------------------------------------------------------- /mmdet/core/anchor/__init__.py: -------------------------------------------------------------------------------- 1 | from .anchor_generator import AnchorGenerator 2 | from .anchor_target import anchor_target 3 | 4 | __all__ = ['AnchorGenerator', 'anchor_target'] 5 | -------------------------------------------------------------------------------- /mmdet/models/losses/__init__.py: -------------------------------------------------------------------------------- 1 | from .cross_entropy_loss import CrossEntropyLoss 2 | from .smooth_l1_loss import SmoothL1Loss 3 | 4 | __all__ = ['CrossEntropyLoss', 'SmoothL1Loss'] 5 | -------------------------------------------------------------------------------- /mmdet/ops/sigmoid_focal_loss/__init__.py: -------------------------------------------------------------------------------- 1 | from .modules.sigmoid_focal_loss import SigmoidFocalLoss, sigmoid_focal_loss 2 | 3 | __all__ = ['SigmoidFocalLoss', 'sigmoid_focal_loss'] 4 | -------------------------------------------------------------------------------- /mmdet/models/bbox_heads/__init__.py: -------------------------------------------------------------------------------- 1 | from .bbox_head import BBoxHead 2 | from .convfc_bbox_head import ConvFCBBoxHead, SharedFCBBoxHead 3 | 4 | __all__ = ['BBoxHead', 'ConvFCBBoxHead', 'SharedFCBBoxHead'] 5 | -------------------------------------------------------------------------------- /init.sh:
-------------------------------------------------------------------------------- 1 | #!/usr/bin/env bash 2 | pip install --user mmcv==0.2.8 numpy==1.16 matplotlib cython pillow 3 | ./compile.sh 4 | python setup.py develop --user 5 | pip install tensorflow --user 6 | pip install tensorboardX --user -------------------------------------------------------------------------------- /mmdet/datasets/loader/__init__.py: -------------------------------------------------------------------------------- 1 | from .build_loader import build_dataloader 2 | from .sampler import GroupSampler, DistributedGroupSampler 3 | 4 | __all__ = ['GroupSampler', 'DistributedGroupSampler', 'build_dataloader'] 5 | -------------------------------------------------------------------------------- /mmdet/core/bbox/assigners/__init__.py: -------------------------------------------------------------------------------- 1 | from .base_assigner import BaseAssigner 2 | from .max_iou_assigner import MaxIoUAssigner 3 | from .assign_result import AssignResult 4 | 5 | __all__ = ['BaseAssigner', 'MaxIoUAssigner', 'AssignResult'] 6 | -------------------------------------------------------------------------------- /mmdet/models/detectors/__init__.py: -------------------------------------------------------------------------------- 1 | from .base import BaseDetector 2 | from .two_stage import TwoStageDetector 3 | from .faster_rcnn import FasterRCNN 4 | from .rpn import RPN 5 | 6 | __all__ = ['BaseDetector', 'TwoStageDetector', 'FasterRCNN', 'RPN'] 7 | -------------------------------------------------------------------------------- /mmdet/utils/__init__.py: -------------------------------------------------------------------------------- 1 | from .distributed import gpu_indices, ompi_size, ompi_rank 2 | from .flops import FlopsCalculator 3 | 4 | __all__ = [ 5 | 'gpu_indices', 6 | 'ompi_size', 7 | 'ompi_rank', 8 | 'FlopsCalculator', 9 | ] 10 | -------------------------------------------------------------------------------- /tools/dist_train.sh: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bash 2 | 3 | PYTHON=${PYTHON:-"python"} 4 | 5 | CONFIG=$1 6 | GPUS=$2 7 | 8 | $PYTHON -m torch.distributed.launch --nproc_per_node=$GPUS \ 9 | $(dirname "$0")/train.py $CONFIG --launcher pytorch ${@:3} 10 | -------------------------------------------------------------------------------- /mmdet/core/utils/__init__.py: -------------------------------------------------------------------------------- 1 | from .dist_utils import allreduce_grads, DistOptimizerHook 2 | from .misc import tensor2imgs, unmap, multi_apply 3 | 4 | __all__ = [ 5 | 'allreduce_grads', 'DistOptimizerHook', 'tensor2imgs', 'unmap', 6 | 'multi_apply' 7 | ] 8 | -------------------------------------------------------------------------------- /mmdet/core/bbox/assigners/base_assigner.py: -------------------------------------------------------------------------------- 1 | from abc import ABCMeta, abstractmethod 2 | 3 | 4 | class BaseAssigner(metaclass=ABCMeta): 5 | 6 | @abstractmethod 7 | def assign(self, bboxes, gt_bboxes, gt_bboxes_ignore=None, gt_labels=None): 8 | pass 9 | -------------------------------------------------------------------------------- /tools/dist_test.sh: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bash 2 | 3 | PYTHON=${PYTHON:-"python"} 4 | 5 | CONFIG=$1 6 | CHECKPOINT=$2 7 | GPUS=$3 8 | 9 | $PYTHON -m torch.distributed.launch --nproc_per_node=$GPUS \ 10 | $(dirname 
"$0")/test.py $CONFIG $CHECKPOINT --launcher pytorch --eval bbox ${@:4} 11 | -------------------------------------------------------------------------------- /mmdet/core/__init__.py: -------------------------------------------------------------------------------- 1 | from .anchor import * # noqa: F401, F403 2 | from .bbox import * # noqa: F401, F403 3 | from .loss import * # noqa: F401, F403 4 | from .evaluation import * # noqa: F401, F403 5 | from .post_processing import * # noqa: F401, F403 6 | from .utils import * # noqa: F401, F403 7 | -------------------------------------------------------------------------------- /mmdet/core/post_processing/__init__.py: -------------------------------------------------------------------------------- 1 | from .bbox_nms import multiclass_nms 2 | from .merge_augs import (merge_aug_proposals, merge_aug_bboxes, 3 | merge_aug_scores, merge_aug_masks) 4 | 5 | __all__ = [ 6 | 'multiclass_nms', 'merge_aug_proposals', 'merge_aug_bboxes', 7 | 'merge_aug_scores', 'merge_aug_masks' 8 | ] 9 | -------------------------------------------------------------------------------- /mmdet/models/utils/scale.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | 4 | 5 | class Scale(nn.Module): 6 | 7 | def __init__(self, scale=1.0): 8 | super(Scale, self).__init__() 9 | self.scale = nn.Parameter(torch.tensor(scale, dtype=torch.float)) 10 | 11 | def forward(self, x): 12 | return x * self.scale 13 | -------------------------------------------------------------------------------- /mmdet/apis/__init__.py: -------------------------------------------------------------------------------- 1 | from .env import init_dist, get_root_logger, set_random_seed, get_git_hash 2 | from .train import train_detector 3 | from .inference import init_detector, inference_detector, show_result 4 | 5 | __all__ = [ 6 | 'init_dist', 'get_root_logger', 'set_random_seed', 'get_git_hash', 7 | 'train_detector', 'init_detector', 'inference_detector', 'show_result', 8 | ] 9 | -------------------------------------------------------------------------------- /mmdet/ops/roi_pool/setup.py: -------------------------------------------------------------------------------- 1 | from setuptools import setup 2 | from torch.utils.cpp_extension import BuildExtension, CUDAExtension 3 | 4 | setup( 5 | name='roi_pool', 6 | ext_modules=[ 7 | CUDAExtension('roi_pool_cuda', [ 8 | 'src/roi_pool_cuda.cpp', 9 | 'src/roi_pool_kernel.cu', 10 | ]) 11 | ], 12 | cmdclass={'build_ext': BuildExtension}) 13 | -------------------------------------------------------------------------------- /mmdet/ops/roi_align/setup.py: -------------------------------------------------------------------------------- 1 | from setuptools import setup 2 | from torch.utils.cpp_extension import BuildExtension, CUDAExtension 3 | 4 | setup( 5 | name='roi_align_cuda', 6 | ext_modules=[ 7 | CUDAExtension('roi_align_cuda', [ 8 | 'src/roi_align_cuda.cpp', 9 | 'src/roi_align_kernel.cu', 10 | ]), 11 | ], 12 | cmdclass={'build_ext': BuildExtension}) 13 | -------------------------------------------------------------------------------- /mmdet/ops/sigmoid_focal_loss/setup.py: -------------------------------------------------------------------------------- 1 | from setuptools import setup 2 | from torch.utils.cpp_extension import BuildExtension, CUDAExtension 3 | 4 | setup( 5 | name='SigmoidFocalLoss', 6 | ext_modules=[ 7 | CUDAExtension('sigmoid_focal_loss_cuda', [ 8 | 'src/sigmoid_focal_loss.cpp', 
9 | 'src/sigmoid_focal_loss_cuda.cu', 10 | ]), 11 | ], 12 | cmdclass={'build_ext': BuildExtension}) 13 | -------------------------------------------------------------------------------- /mmdet/ops/roi_pool/modules/roi_pool.py: -------------------------------------------------------------------------------- 1 | from torch.nn.modules.module import Module 2 | from ..functions.roi_pool import roi_pool 3 | 4 | 5 | class RoIPool(Module): 6 | 7 | def __init__(self, out_size, spatial_scale): 8 | super(RoIPool, self).__init__() 9 | 10 | self.out_size = out_size 11 | self.spatial_scale = float(spatial_scale) 12 | 13 | def forward(self, features, rois): 14 | return roi_pool(features, rois, self.out_size, self.spatial_scale) 15 | -------------------------------------------------------------------------------- /mmdet/ops/dcn/setup.py: -------------------------------------------------------------------------------- 1 | from setuptools import setup 2 | from torch.utils.cpp_extension import BuildExtension, CUDAExtension 3 | 4 | setup( 5 | name='deform_conv', 6 | ext_modules=[ 7 | CUDAExtension('deform_conv_cuda', [ 8 | 'src/deform_conv_cuda.cpp', 9 | 'src/deform_conv_cuda_kernel.cu', 10 | ]), 11 | CUDAExtension( 12 | 'deform_pool_cuda', 13 | ['src/deform_pool_cuda.cpp', 'src/deform_pool_cuda_kernel.cu']), 14 | ], 15 | cmdclass={'build_ext': BuildExtension}) 16 | -------------------------------------------------------------------------------- /mmdet/datasets/repeat_dataset.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | 3 | 4 | class RepeatDataset(object): 5 | 6 | def __init__(self, dataset, times): 7 | self.dataset = dataset 8 | self.times = times 9 | self.CLASSES = dataset.CLASSES 10 | if hasattr(self.dataset, 'flag'): 11 | self.flag = np.tile(self.dataset.flag, times) 12 | 13 | self._ori_len = len(self.dataset) 14 | 15 | def __getitem__(self, idx): 16 | return self.dataset[idx % self._ori_len] 17 | 18 | def __len__(self): 19 | return self.times * self._ori_len 20 | -------------------------------------------------------------------------------- /mmdet/ops/roi_pool/gradcheck.py: -------------------------------------------------------------------------------- 1 | import torch 2 | from torch.autograd import gradcheck 3 | 4 | import os.path as osp 5 | import sys 6 | sys.path.append(osp.abspath(osp.join(__file__, '../../'))) 7 | from roi_pool import RoIPool # noqa: E402 8 | 9 | feat = torch.randn(4, 16, 15, 15, requires_grad=True).cuda() 10 | rois = torch.Tensor([[0, 0, 0, 50, 50], [0, 10, 30, 43, 55], 11 | [1, 67, 40, 110, 120]]).cuda() 12 | inputs = (feat, rois) 13 | print('Gradcheck for roi pooling...') 14 | test = gradcheck(RoIPool(4, 1.0 / 8), inputs, eps=1e-5, atol=1e-3) 15 | print(test) 16 | -------------------------------------------------------------------------------- /mmdet/core/loss/__init__.py: -------------------------------------------------------------------------------- 1 | from .losses import ( 2 | weighted_nll_loss, weighted_cross_entropy, weighted_binary_cross_entropy, 3 | sigmoid_focal_loss, py_sigmoid_focal_loss, weighted_sigmoid_focal_loss, 4 | mask_cross_entropy, smooth_l1_loss, weighted_smoothl1, accuracy, iou_loss) 5 | 6 | __all__ = [ 7 | 'weighted_nll_loss', 'weighted_cross_entropy', 8 | 'weighted_binary_cross_entropy', 'sigmoid_focal_loss', 9 | 'py_sigmoid_focal_loss', 'weighted_sigmoid_focal_loss', 10 | 'mask_cross_entropy', 'smooth_l1_loss', 'weighted_smoothl1', 'accuracy', 11 | 'iou_loss' 12 | ] 13 | 
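For reference, the smooth L1 behind the smooth_l1_loss/weighted_smoothl1 wrappers exported above follows the standard definition; a self-contained sketch, not the repo's implementation in core/loss/losses.py, whose defaults and reduction handling may differ:

import torch

def smooth_l1_ref(pred, target, beta=1.0):
    # quadratic for |pred - target| < beta, linear outside,
    # with the two pieces joined smoothly at beta
    diff = (pred - target).abs()
    return torch.where(diff < beta,
                       0.5 * diff * diff / beta,
                       diff - 0.5 * beta)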
-------------------------------------------------------------------------------- /mmdet/core/bbox/samplers/combined_sampler.py: -------------------------------------------------------------------------------- 1 | from .base_sampler import BaseSampler 2 | from ..assign_sampling import build_sampler 3 | 4 | 5 | class CombinedSampler(BaseSampler): 6 | 7 | def __init__(self, pos_sampler, neg_sampler, **kwargs): 8 | super(CombinedSampler, self).__init__(**kwargs) 9 | self.pos_sampler = build_sampler(pos_sampler, **kwargs) 10 | self.neg_sampler = build_sampler(neg_sampler, **kwargs) 11 | 12 | def _sample_pos(self, **kwargs): 13 | raise NotImplementedError 14 | 15 | def _sample_neg(self, **kwargs): 16 | raise NotImplementedError 17 | -------------------------------------------------------------------------------- /mmdet/models/losses/smooth_l1_loss.py: -------------------------------------------------------------------------------- 1 | import torch.nn as nn 2 | from mmdet.core import weighted_smoothl1 3 | 4 | from ..registry import LOSSES 5 | 6 | 7 | @LOSSES.register_module 8 | class SmoothL1Loss(nn.Module): 9 | 10 | def __init__(self, beta=1.0, loss_weight=1.0): 11 | super(SmoothL1Loss, self).__init__() 12 | self.beta = beta 13 | self.loss_weight = loss_weight 14 | 15 | def forward(self, pred, target, weight, *args, **kwargs): 16 | loss_bbox = self.loss_weight * weighted_smoothl1( 17 | pred, target, weight, beta=self.beta, *args, **kwargs) 18 | return loss_bbox 19 | -------------------------------------------------------------------------------- /mmdet/ops/roi_align/modules/roi_align.py: -------------------------------------------------------------------------------- 1 | from torch.nn.modules.module import Module 2 | from ..functions.roi_align import RoIAlignFunction 3 | 4 | 5 | class RoIAlign(Module): 6 | 7 | def __init__(self, out_size, spatial_scale, sample_num=0): 8 | super(RoIAlign, self).__init__() 9 | 10 | self.out_size = out_size 11 | self.spatial_scale = float(spatial_scale) 12 | self.sample_num = int(sample_num) 13 | 14 | def forward(self, features, rois): 15 | return RoIAlignFunction.apply(features, rois, self.out_size, 16 | self.spatial_scale, self.sample_num) 17 | -------------------------------------------------------------------------------- /mmdet/core/bbox/samplers/__init__.py: -------------------------------------------------------------------------------- 1 | from .base_sampler import BaseSampler 2 | from .pseudo_sampler import PseudoSampler 3 | from .random_sampler import RandomSampler 4 | from .instance_balanced_pos_sampler import InstanceBalancedPosSampler 5 | from .iou_balanced_neg_sampler import IoUBalancedNegSampler 6 | from .combined_sampler import CombinedSampler 7 | from .ohem_sampler import OHEMSampler 8 | from .sampling_result import SamplingResult 9 | 10 | __all__ = [ 11 | 'BaseSampler', 'PseudoSampler', 'RandomSampler', 12 | 'InstanceBalancedPosSampler', 'IoUBalancedNegSampler', 'CombinedSampler', 13 | 'OHEMSampler', 'SamplingResult' 14 | ] 15 | -------------------------------------------------------------------------------- /mmdet/ops/nms/src/nms_cuda.cpp: -------------------------------------------------------------------------------- 1 | // Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved. 
2 | #include <torch/extension.h> 3 | 4 | #define CHECK_CUDA(x) AT_CHECK(x.type().is_cuda(), #x, " must be a CUDAtensor ") 5 | 6 | at::Tensor nms_cuda(const at::Tensor boxes, float nms_overlap_thresh); 7 | 8 | at::Tensor nms(const at::Tensor& dets, const float threshold) { 9 | CHECK_CUDA(dets); 10 | if (dets.numel() == 0) 11 | return at::empty({0}, dets.options().dtype(at::kLong).device(at::kCPU)); 12 | return nms_cuda(dets, threshold); 13 | } 14 | 15 | PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) { 16 | m.def("nms", &nms, "non-maximum suppression"); 17 | } -------------------------------------------------------------------------------- /mmdet/models/utils/gaussian_kernel.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import torch 3 | import torch.nn as nn 4 | 5 | 6 | class GaussianKernel(nn.Module): 7 | def __init__(self, size): 8 | super(GaussianKernel, self).__init__() 9 | 10 | s = (size - 1) // 2 11 | _x = torch.linspace(-s, s, size).reshape((size, 1)).repeat((1, size)) 12 | _y = torch.linspace(-s, s, size).reshape((1, size)).repeat((size, 1)) 13 | self.d = _x ** 2 + _y ** 2 14 | 15 | def forward(self, sigma): 16 | k = sigma ** 2 17 | A = k / (2. * np.pi) 18 | d = -k / 2. * self.d.cuda() 19 | B = torch.exp(d) 20 | B = A * B 21 | return B 22 | -------------------------------------------------------------------------------- /mmdet/models/utils/__init__.py: -------------------------------------------------------------------------------- 1 | from .conv_ws import conv_ws_2d, ConvWS2d 2 | from .conv_module import build_conv_layer, ConvModule 3 | from .norm import build_norm_layer 4 | from .scale import Scale 5 | from .weight_init import (xavier_init, normal_init, uniform_init, kaiming_init, 6 | bias_init_with_prob) 7 | 8 | from .gaussian_kernel import GaussianKernel 9 | from .gumbel_sigmoid import GumbelSigmoid 10 | 11 | __all__ = [ 12 | 'conv_ws_2d', 'ConvWS2d', 'build_conv_layer', 'ConvModule', 13 | 'build_norm_layer', 'xavier_init', 'normal_init', 'uniform_init', 14 | 'kaiming_init', 'bias_init_with_prob', 'Scale', 15 | 'GaussianKernel', 'GumbelSigmoid' 16 | ] 17 | -------------------------------------------------------------------------------- /mmdet/ops/dcn/__init__.py: -------------------------------------------------------------------------------- 1 | from .functions.deform_conv import deform_conv, modulated_deform_conv 2 | from .functions.deform_pool import deform_roi_pooling 3 | from .modules.deform_conv import (DeformConv, ModulatedDeformConv, 4 | DeformConvPack, ModulatedDeformConvPack) 5 | from .modules.deform_pool import (DeformRoIPooling, DeformRoIPoolingPack, 6 | ModulatedDeformRoIPoolingPack) 7 | 8 | __all__ = [ 9 | 'DeformConv', 'DeformConvPack', 'ModulatedDeformConv', 10 | 'ModulatedDeformConvPack', 'DeformRoIPooling', 'DeformRoIPoolingPack', 11 | 'ModulatedDeformRoIPoolingPack', 'deform_conv', 'modulated_deform_conv', 12 | 'deform_roi_pooling' 13 | ] 14 | -------------------------------------------------------------------------------- /mmdet/datasets/voc.py: -------------------------------------------------------------------------------- 1 | from .xml_style import XMLDataset 2 | 3 | 4 | class VOCDataset(XMLDataset): 5 | 6 | CLASSES = ('aeroplane', 'bicycle', 'bird', 'boat', 'bottle', 'bus', 'car', 7 | 'cat', 'chair', 'cow', 'diningtable', 'dog', 'horse', 8 | 'motorbike', 'person', 'pottedplant', 'sheep', 'sofa', 'train', 9 | 'tvmonitor') 10 | 11 | def __init__(self, **kwargs): 12 | super(VOCDataset, self).__init__(**kwargs) 13 | if
'VOC2007' in self.img_prefix: 14 | self.year = 2007 15 | elif 'VOC2012' in self.img_prefix: 16 | self.year = 2012 17 | else: 18 | raise ValueError('Cannot infer dataset year from img_prefix') 19 | -------------------------------------------------------------------------------- /mmdet/core/bbox/assigners/assign_result.py: -------------------------------------------------------------------------------- 1 | import torch 2 | 3 | 4 | class AssignResult(object): 5 | 6 | def __init__(self, num_gts, gt_inds, max_overlaps, labels=None): 7 | self.num_gts = num_gts 8 | self.gt_inds = gt_inds 9 | self.max_overlaps = max_overlaps 10 | self.labels = labels 11 | 12 | def add_gt_(self, gt_labels): 13 | self_inds = torch.arange( 14 | 1, len(gt_labels) + 1, dtype=torch.long, device=gt_labels.device) 15 | self.gt_inds = torch.cat([self_inds, self.gt_inds]) 16 | self.max_overlaps = torch.cat( 17 | [self.max_overlaps.new_ones(self.num_gts), self.max_overlaps]) 18 | if self.labels is not None: 19 | self.labels = torch.cat([gt_labels, self.labels]) 20 | -------------------------------------------------------------------------------- /mmdet/ops/sigmoid_focal_loss/modules/sigmoid_focal_loss.py: -------------------------------------------------------------------------------- 1 | from torch import nn 2 | 3 | from ..functions.sigmoid_focal_loss import sigmoid_focal_loss 4 | 5 | 6 | class SigmoidFocalLoss(nn.Module): 7 | 8 | def __init__(self, gamma, alpha): 9 | super(SigmoidFocalLoss, self).__init__() 10 | self.gamma = gamma 11 | self.alpha = alpha 12 | 13 | def forward(self, logits, targets): 14 | assert logits.is_cuda 15 | loss = sigmoid_focal_loss(logits, targets, self.gamma, self.alpha) 16 | return loss.sum() 17 | 18 | def __repr__(self): 19 | tmpstr = self.__class__.__name__ + "(" 20 | tmpstr += "gamma=" + str(self.gamma) 21 | tmpstr += ", alpha=" + str(self.alpha) 22 | tmpstr += ")" 23 | return tmpstr 24 | -------------------------------------------------------------------------------- /mmdet/datasets/__init__.py: -------------------------------------------------------------------------------- 1 | from .custom import CustomDataset 2 | from .xml_style import XMLDataset 3 | from .coco import CocoDataset 4 | from .voc import VOCDataset 5 | from .loader import GroupSampler, DistributedGroupSampler, build_dataloader 6 | from .utils import to_tensor, random_scale, show_ann, get_dataset 7 | from .concat_dataset import ConcatDataset 8 | from .repeat_dataset import RepeatDataset 9 | from .extra_aug import ExtraAugmentation 10 | from .imagenet import ImageNetDataset 11 | 12 | __all__ = [ 13 | 'CustomDataset', 'XMLDataset', 'CocoDataset', 'VOCDataset', 'GroupSampler', 14 | 'DistributedGroupSampler', 'build_dataloader', 'to_tensor', 'random_scale', 15 | 'show_ann', 'get_dataset', 'ConcatDataset', 'RepeatDataset', 16 | 'ExtraAugmentation', 'ImageNetDataset' 17 | ] 18 | -------------------------------------------------------------------------------- /mmdet/datasets/concat_dataset.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | from torch.utils.data.dataset import ConcatDataset as _ConcatDataset 3 | 4 | 5 | class ConcatDataset(_ConcatDataset): 6 | """A wrapper of concatenated dataset. 7 | 8 | Same as :obj:`torch.utils.data.dataset.ConcatDataset`, but 9 | concat the group flag for image aspect ratio. 10 | 11 | Args: 12 | datasets (list[:obj:`Dataset`]): A list of datasets. 
13 | """ 14 | 15 | def __init__(self, datasets): 16 | super(ConcatDataset, self).__init__(datasets) 17 | self.CLASSES = datasets[0].CLASSES 18 | if hasattr(datasets[0], 'flag'): 19 | flags = [] 20 | for i in range(0, len(datasets)): 21 | flags.append(datasets[i].flag) 22 | self.flag = np.concatenate(flags) 23 | -------------------------------------------------------------------------------- /compile.sh: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bash 2 | 3 | PYTHON=${PYTHON:-"python"} 4 | 5 | echo "Building roi align op..." 6 | cd mmdet/ops/roi_align 7 | if [ -d "build" ]; then 8 | rm -r build 9 | fi 10 | $PYTHON setup.py build_ext --inplace 11 | 12 | echo "Building roi pool op..." 13 | cd ../roi_pool 14 | if [ -d "build" ]; then 15 | rm -r build 16 | fi 17 | $PYTHON setup.py build_ext --inplace 18 | 19 | echo "Building nms op..." 20 | cd ../nms 21 | if [ -d "build" ]; then 22 | rm -r build 23 | fi 24 | $PYTHON setup.py build_ext --inplace 25 | 26 | echo "Building dcn..." 27 | cd ../dcn 28 | if [ -d "build" ]; then 29 | rm -r build 30 | fi 31 | $PYTHON setup.py build_ext --inplace 32 | 33 | echo "Building sigmoid focal loss op..." 34 | cd ../sigmoid_focal_loss 35 | if [ -d "build" ]; then 36 | rm -r build 37 | fi 38 | $PYTHON setup.py build_ext --inplace 39 | -------------------------------------------------------------------------------- /mmdet/ops/__init__.py: -------------------------------------------------------------------------------- 1 | from .dcn import (DeformConv, DeformConvPack, ModulatedDeformConv, 2 | ModulatedDeformConvPack, DeformRoIPooling, 3 | DeformRoIPoolingPack, ModulatedDeformRoIPoolingPack, 4 | deform_conv, modulated_deform_conv, deform_roi_pooling) 5 | from .nms import nms, soft_nms 6 | from .roi_align import RoIAlign, roi_align 7 | from .roi_pool import RoIPool, roi_pool 8 | from .sigmoid_focal_loss import SigmoidFocalLoss, sigmoid_focal_loss 9 | __all__ = [ 10 | 'nms', 'soft_nms', 'RoIAlign', 'roi_align', 'RoIPool', 'roi_pool', 11 | 'DeformConv', 'DeformConvPack', 'DeformRoIPooling', 'DeformRoIPoolingPack', 12 | 'ModulatedDeformRoIPoolingPack', 'ModulatedDeformConv', 13 | 'ModulatedDeformConvPack', 'deform_conv', 'modulated_deform_conv', 14 | 'deform_roi_pooling', 15 | 'SigmoidFocalLoss', 'sigmoid_focal_loss', 16 | ] 17 | -------------------------------------------------------------------------------- /tools/coco_eval.py: -------------------------------------------------------------------------------- 1 | from argparse import ArgumentParser 2 | 3 | from mmdet.core import coco_eval 4 | 5 | 6 | def main(): 7 | parser = ArgumentParser(description='COCO Evaluation') 8 | parser.add_argument('result', help='result file path') 9 | parser.add_argument('--ann', help='annotation file path') 10 | parser.add_argument( 11 | '--types', 12 | type=str, 13 | nargs='+', 14 | choices=['proposal_fast', 'proposal', 'bbox', 'segm', 'keypoint'], 15 | default=['bbox'], 16 | help='result types') 17 | parser.add_argument( 18 | '--max-dets', 19 | type=int, 20 | nargs='+', 21 | default=[100, 300, 1000], 22 | help='proposal numbers, only used for recall evaluation') 23 | args = parser.parse_args() 24 | coco_eval(args.result, args.types, args.ann, args.max_dets) 25 | 26 | 27 | if __name__ == '__main__': 28 | main() 29 | -------------------------------------------------------------------------------- /mmdet/core/bbox/samplers/sampling_result.py: -------------------------------------------------------------------------------- 1 | import 
torch 2 | 3 | 4 | class SamplingResult(object): 5 | 6 | def __init__(self, pos_inds, neg_inds, bboxes, gt_bboxes, assign_result, 7 | gt_flags): 8 | self.pos_inds = pos_inds 9 | self.neg_inds = neg_inds 10 | self.pos_bboxes = bboxes[pos_inds] 11 | self.neg_bboxes = bboxes[neg_inds] 12 | self.pos_is_gt = gt_flags[pos_inds] 13 | 14 | self.num_gts = gt_bboxes.shape[0] 15 | self.pos_assigned_gt_inds = assign_result.gt_inds[pos_inds] - 1 16 | self.pos_gt_bboxes = gt_bboxes[self.pos_assigned_gt_inds, :] 17 | if assign_result.labels is not None: 18 | self.pos_gt_labels = assign_result.labels[pos_inds] 19 | else: 20 | self.pos_gt_labels = None 21 | 22 | @property 23 | def bboxes(self): 24 | return torch.cat([self.pos_bboxes, self.neg_bboxes]) 25 | -------------------------------------------------------------------------------- /mmdet/models/__init__.py: -------------------------------------------------------------------------------- 1 | from .backbones import * # noqa: F401,F403 2 | from .necks import * # noqa: F401,F403 3 | from .roi_extractors import * # noqa: F401,F403 4 | from .anchor_heads import * # noqa: F401,F403 5 | from .bbox_heads import * # noqa: F401,F403 6 | from .losses import * # noqa: F401,F403 7 | from .detectors import * # noqa: F401,F403 8 | from .registry import (BACKBONES, NECKS, ROI_EXTRACTORS, SHARED_HEADS, HEADS, 9 | LOSSES, DETECTORS) 10 | from .builder import (build_backbone, build_neck, build_roi_extractor, 11 | build_shared_head, build_head, build_loss, 12 | build_detector) 13 | 14 | __all__ = [ 15 | 'BACKBONES', 'NECKS', 'ROI_EXTRACTORS', 'SHARED_HEADS', 'HEADS', 'LOSSES', 16 | 'DETECTORS', 'build_backbone', 'build_neck', 'build_roi_extractor', 17 | 'build_shared_head', 'build_head', 'build_loss', 'build_detector' 18 | ] 19 | -------------------------------------------------------------------------------- /mmdet/models/detectors/faster_rcnn.py: -------------------------------------------------------------------------------- 1 | from .two_stage import TwoStageDetector 2 | from ..registry import DETECTORS 3 | 4 | 5 | @DETECTORS.register_module 6 | class FasterRCNN(TwoStageDetector): 7 | 8 | def __init__(self, 9 | backbone, 10 | rpn_head, 11 | bbox_roi_extractor, 12 | bbox_head, 13 | train_cfg, 14 | test_cfg, 15 | neck=None, 16 | shared_head=None, 17 | pretrained=None): 18 | super(FasterRCNN, self).__init__( 19 | backbone=backbone, 20 | neck=neck, 21 | shared_head=shared_head, 22 | rpn_head=rpn_head, 23 | bbox_roi_extractor=bbox_roi_extractor, 24 | bbox_head=bbox_head, 25 | train_cfg=train_cfg, 26 | test_cfg=test_cfg, 27 | pretrained=pretrained 28 | ) 29 | -------------------------------------------------------------------------------- /mmdet/core/bbox/samplers/pseudo_sampler.py: -------------------------------------------------------------------------------- 1 | import torch 2 | 3 | from .base_sampler import BaseSampler 4 | from .sampling_result import SamplingResult 5 | 6 | 7 | class PseudoSampler(BaseSampler): 8 | 9 | def __init__(self, **kwargs): 10 | pass 11 | 12 | def _sample_pos(self, **kwargs): 13 | raise NotImplementedError 14 | 15 | def _sample_neg(self, **kwargs): 16 | raise NotImplementedError 17 | 18 | def sample(self, assign_result, bboxes, gt_bboxes, **kwargs): 19 | pos_inds = torch.nonzero( 20 | assign_result.gt_inds > 0).squeeze(-1).unique() 21 | neg_inds = torch.nonzero( 22 | assign_result.gt_inds == 0).squeeze(-1).unique() 23 | gt_flags = bboxes.new_zeros(bboxes.shape[0], dtype=torch.uint8) 24 | sampling_result = SamplingResult(pos_inds, neg_inds, 
bboxes, gt_bboxes, 25 | assign_result, gt_flags) 26 | return sampling_result 27 | -------------------------------------------------------------------------------- /mmdet/ops/roi_align/gradcheck.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import torch 3 | from torch.autograd import gradcheck 4 | 5 | import os.path as osp 6 | import sys 7 | sys.path.append(osp.abspath(osp.join(__file__, '../../'))) 8 | from roi_align import RoIAlign # noqa: E402 9 | 10 | feat_size = 15 11 | spatial_scale = 1.0 / 8 12 | img_size = feat_size / spatial_scale 13 | num_imgs = 2 14 | num_rois = 20 15 | 16 | batch_ind = np.random.randint(num_imgs, size=(num_rois, 1)) 17 | rois = np.random.rand(num_rois, 4) * img_size * 0.5 18 | rois[:, 2:] += img_size * 0.5 19 | rois = np.hstack((batch_ind, rois)) 20 | 21 | feat = torch.randn( 22 | num_imgs, 16, feat_size, feat_size, requires_grad=True, device='cuda:0') 23 | rois = torch.from_numpy(rois).float().cuda() 24 | inputs = (feat, rois) 25 | print('Gradcheck for roi align...') 26 | test = gradcheck(RoIAlign(3, spatial_scale), inputs, atol=1e-3, eps=1e-3) 27 | print(test) 28 | test = gradcheck(RoIAlign(3, spatial_scale, 2), inputs, atol=1e-3, eps=1e-3) 29 | print(test) 30 | -------------------------------------------------------------------------------- /mmdet/core/evaluation/__init__.py: -------------------------------------------------------------------------------- 1 | from .class_names import (voc_classes, imagenet_det_classes, 2 | imagenet_vid_classes, coco_classes, dataset_aliases, 3 | get_classes) 4 | from .coco_utils import coco_eval, fast_eval_recall, results2json 5 | from .eval_hooks import (DistEvalHook, DistEvalmAPHook, CocoDistEvalRecallHook, 6 | CocoDistEvalmAPHook) 7 | from .mean_ap import average_precision, eval_map, print_map_summary 8 | from .recall import (eval_recalls, print_recall_summary, plot_num_recall, 9 | plot_iou_recall) 10 | 11 | __all__ = [ 12 | 'voc_classes', 'imagenet_det_classes', 'imagenet_vid_classes', 13 | 'coco_classes', 'dataset_aliases', 'get_classes', 'coco_eval', 14 | 'fast_eval_recall', 'results2json', 'DistEvalHook', 'DistEvalmAPHook', 15 | 'CocoDistEvalRecallHook', 'CocoDistEvalmAPHook', 'average_precision', 16 | 'eval_map', 'print_map_summary', 'eval_recalls', 'print_recall_summary', 17 | 'plot_num_recall', 'plot_iou_recall' 18 | ] 19 | -------------------------------------------------------------------------------- /mmdet/models/losses/cross_entropy_loss.py: -------------------------------------------------------------------------------- 1 | import torch.nn as nn 2 | from mmdet.core import (weighted_cross_entropy, weighted_binary_cross_entropy, 3 | mask_cross_entropy) 4 | 5 | from ..registry import LOSSES 6 | 7 | 8 | @LOSSES.register_module 9 | class CrossEntropyLoss(nn.Module): 10 | 11 | def __init__(self, use_sigmoid=False, use_mask=False, loss_weight=1.0): 12 | super(CrossEntropyLoss, self).__init__() 13 | assert (use_sigmoid is False) or (use_mask is False) 14 | self.use_sigmoid = use_sigmoid 15 | self.use_mask = use_mask 16 | self.loss_weight = loss_weight 17 | 18 | if self.use_sigmoid: 19 | self.cls_criterion = weighted_binary_cross_entropy 20 | elif self.use_mask: 21 | self.cls_criterion = mask_cross_entropy 22 | else: 23 | self.cls_criterion = weighted_cross_entropy 24 | 25 | def forward(self, cls_score, label, label_weight, *args, **kwargs): 26 | loss_cls = self.loss_weight * self.cls_criterion( 27 | cls_score, label, label_weight, *args, **kwargs) 28 | 
return loss_cls 29 | -------------------------------------------------------------------------------- /mmdet/core/bbox/__init__.py: -------------------------------------------------------------------------------- 1 | from .geometry import bbox_overlaps 2 | from .assigners import BaseAssigner, MaxIoUAssigner, AssignResult 3 | from .samplers import (BaseSampler, PseudoSampler, RandomSampler, 4 | InstanceBalancedPosSampler, IoUBalancedNegSampler, 5 | CombinedSampler, SamplingResult) 6 | from .assign_sampling import build_assigner, build_sampler, assign_and_sample 7 | from .transforms import (bbox2delta, delta2bbox, bbox_flip, bbox_mapping, 8 | bbox_mapping_back, bbox2roi, roi2bbox, bbox2result, 9 | distance2bbox) 10 | from .bbox_target import bbox_target 11 | 12 | __all__ = [ 13 | 'bbox_overlaps', 'BaseAssigner', 'MaxIoUAssigner', 'AssignResult', 14 | 'BaseSampler', 'PseudoSampler', 'RandomSampler', 15 | 'InstanceBalancedPosSampler', 'IoUBalancedNegSampler', 'CombinedSampler', 16 | 'SamplingResult', 'build_assigner', 'build_sampler', 'assign_and_sample', 17 | 'bbox2delta', 'delta2bbox', 'bbox_flip', 'bbox_mapping', 18 | 'bbox_mapping_back', 'bbox2roi', 'roi2bbox', 'bbox2result', 19 | 'distance2bbox', 'bbox_target' 20 | ] 21 | -------------------------------------------------------------------------------- /tools/publish_model.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | import subprocess 3 | import torch 4 | 5 | 6 | def parse_args(): 7 | parser = argparse.ArgumentParser( 8 | description='Process a checkpoint to be published') 9 | parser.add_argument('in_file', help='input checkpoint filename') 10 | parser.add_argument('out_file', help='output checkpoint filename') 11 | args = parser.parse_args() 12 | return args 13 | 14 | 15 | def process_checkpoint(in_file, out_file): 16 | checkpoint = torch.load(in_file, map_location='cpu') 17 | # remove optimizer for smaller file size 18 | if 'optimizer' in checkpoint: 19 | del checkpoint['optimizer'] 20 | del checkpoint['meta'] 21 | # if it is necessary to remove some sensitive data in checkpoint['meta'], 22 | # add the code here. 
23 | torch.save(checkpoint, out_file) 24 | sha = subprocess.check_output(['sha256sum', out_file]).decode() 25 | # insert the first 8 chars of the sha256 before the '.pth' suffix 26 | out_stem = out_file[:-4] if out_file.endswith('.pth') else out_file 27 | final_file = out_stem + '-{}.pth'.format(sha[:8]) 28 | subprocess.Popen(['mv', out_file, final_file]) 29 | 30 | 31 | def main(): 32 | args = parse_args() 33 | process_checkpoint(args.in_file, args.out_file) 34 | 35 | 36 | if __name__ == '__main__': 37 | main() 38 | -------------------------------------------------------------------------------- /mmcv_custom/parameters.py: -------------------------------------------------------------------------------- 1 | def parameters(net, base_lr): 2 | total_length = 0 3 | 4 | default_lr_param_group = [] 5 | lr_mult_param_groups = {} 6 | for m in net.modules(): 7 | # print(type(m), len(list(m.named_parameters(recurse=False)))) 8 | # print(list(m.named_parameters(recurse=False))) 9 | total_length += len(list(m.parameters(recurse=False))) 10 | if hasattr(m, 'lr_mult'): 11 | lr_mult_param_groups.setdefault(m.lr_mult, []) 12 | lr_mult_param_groups[m.lr_mult] += list( 13 | m.parameters(recurse=False)) 14 | else: 15 | default_lr_param_group += list(m.parameters(recurse=False)) 16 | param_list = [{ 17 | 'params': default_lr_param_group 18 | }] + [{ 19 | 'params': p, 20 | 'lr': base_lr * lm 21 | } for lm, p in lr_mult_param_groups.items()] 22 | 23 | _total_length = len(list(net.parameters())) 24 | assert total_length == _total_length, '{} vs {}'.format( 25 | total_length, _total_length) 26 | 27 | _total_length = sum([len(p['params']) for p in param_list]) 28 | assert total_length == _total_length, '{} vs {}'.format( 29 | total_length, _total_length) 30 | 31 | return param_list 32 | -------------------------------------------------------------------------------- /mmdet/core/utils/misc.py: -------------------------------------------------------------------------------- 1 | from functools import partial 2 | 3 | import mmcv 4 | import numpy as np 5 | from six.moves import map, zip 6 | 7 | 8 | def tensor2imgs(tensor, mean=(0, 0, 0), std=(1, 1, 1), to_rgb=True): 9 | num_imgs = tensor.size(0) 10 | mean = np.array(mean, dtype=np.float32) 11 | std = np.array(std, dtype=np.float32) 12 | imgs = [] 13 | for img_id in range(num_imgs): 14 | img = tensor[img_id, ...].cpu().numpy().transpose(1, 2, 0) 15 | img = mmcv.imdenormalize( 16 | img, mean, std, to_bgr=to_rgb).astype(np.uint8) 17 | imgs.append(np.ascontiguousarray(img)) 18 | return imgs 19 | 20 | 21 | def multi_apply(func, *args, **kwargs): 22 | pfunc = partial(func, **kwargs) if kwargs else func 23 | map_results = map(pfunc, *args) 24 | return tuple(map(list, zip(*map_results))) 25 | 26 | 27 | def unmap(data, count, inds, fill=0): 28 | """ Unmap a subset of items (data) back to the original set of items (of 29 | size count) """ 30 | if data.dim() == 1: 31 | ret = data.new_full((count, ), fill) 32 | ret[inds] = data 33 | else: 34 | new_size = (count, ) + data.size()[1:] 35 | ret = data.new_full(new_size, fill) 36 | ret[inds, :] = data 37 | return ret 38 | -------------------------------------------------------------------------------- /mmdet/core/bbox/assign_sampling.py: -------------------------------------------------------------------------------- 1 | import mmcv 2 | 3 | from .
import assigners, samplers 4 | 5 | 6 | def build_assigner(cfg, **kwargs): 7 | if isinstance(cfg, assigners.BaseAssigner): 8 | return cfg 9 | elif isinstance(cfg, dict): 10 | return mmcv.runner.obj_from_dict(cfg, assigners, default_args=kwargs) 11 | else: 12 | raise TypeError('Invalid type {} for building an assigner'.format( 13 | type(cfg))) 14 | 15 | 16 | def build_sampler(cfg, **kwargs): 17 | if isinstance(cfg, samplers.BaseSampler): 18 | return cfg 19 | elif isinstance(cfg, dict): 20 | return mmcv.runner.obj_from_dict(cfg, samplers, default_args=kwargs) 21 | else: 22 | raise TypeError('Invalid type {} for building a sampler'.format( 23 | type(cfg))) 24 | 25 | 26 | def assign_and_sample(bboxes, gt_bboxes, gt_bboxes_ignore, gt_labels, cfg): 27 | bbox_assigner = build_assigner(cfg.assigner) 28 | bbox_sampler = build_sampler(cfg.sampler) 29 | assign_result = bbox_assigner.assign(bboxes, gt_bboxes, gt_bboxes_ignore, 30 | gt_labels) 31 | sampling_result = bbox_sampler.sample(assign_result, bboxes, gt_bboxes, 32 | gt_labels) 33 | return assign_result, sampling_result 34 | -------------------------------------------------------------------------------- /mmdet/models/registry.py: -------------------------------------------------------------------------------- 1 | import torch.nn as nn 2 | 3 | 4 | class Registry(object): 5 | 6 | def __init__(self, name): 7 | self._name = name 8 | self._module_dict = dict() 9 | 10 | @property 11 | def name(self): 12 | return self._name 13 | 14 | @property 15 | def module_dict(self): 16 | return self._module_dict 17 | 18 | def _register_module(self, module_class): 19 | """Register a module. 20 | 21 | Args: 22 | module_class (class): Module class to be registered (must be a subclass of nn.Module). 23 | """ 24 | if not issubclass(module_class, nn.Module): 25 | raise TypeError( 26 | 'module must be a child of nn.Module, but got {}'.format( 27 | module_class)) 28 | module_name = module_class.__name__ 29 | if module_name in self._module_dict: 30 | raise KeyError('{} is already registered in {}'.format( 31 | module_name, self.name)) 32 | self._module_dict[module_name] = module_class 33 | 34 | def register_module(self, cls): 35 | self._register_module(cls) 36 | return cls 37 | 38 | 39 | BACKBONES = Registry('backbone') 40 | NECKS = Registry('neck') 41 | ROI_EXTRACTORS = Registry('roi_extractor') 42 | SHARED_HEADS = Registry('shared_head') 43 | HEADS = Registry('head') 44 | LOSSES = Registry('loss') 45 | DETECTORS = Registry('detector') 46 | -------------------------------------------------------------------------------- /mmdet/models/utils/gumbel_sigmoid.py: -------------------------------------------------------------------------------- 1 | import torch 2 | from torch import nn 3 | 4 | 5 | class GumbelSigmoid(nn.Module): 6 | def __init__(self, max_T, decay_alpha): 7 | super(GumbelSigmoid, self).__init__() 8 | 9 | self.max_T = max_T 10 | self.decay_alpha = decay_alpha 11 | self.softmax = nn.Softmax(dim=1) 12 | self.p_value = 1e-8 13 | 14 | self.register_buffer('cur_T', torch.tensor(max_T)) 15 | 16 | def forward(self, x): 17 | if self.training: 18 | _cur_T = self.cur_T 19 | else: 20 | _cur_T = 0.03 21 | 22 | # x : probability map, shape [N, C, H, W] 23 | # r : its complement, same shape 24 | r = 1 - x 25 | x = (x + self.p_value).log() 26 | r = (r + self.p_value).log() 27 | 28 | # Generate Gumbel noise: -log(-log(u)) with u ~ Uniform(0, 1) 29 | x_N = torch.rand_like(x) 30 | r_N = torch.rand_like(r) 31 | x_N = -1 * (x_N + self.p_value).log() 32 | r_N = -1 * (r_N + self.p_value).log() 33 | x_N = -1 * (x_N + self.p_value).log() 34 | r_N = -1 * (r_N + self.p_value).log() 35 | 36 | # Get
Final Distribution 37 | x = x + x_N 38 | x = x / (_cur_T + self.p_value) 39 | r = r + r_N 40 | r = r / (_cur_T + self.p_value) 41 | 42 | x = torch.cat((x, r), dim=1) 43 | x = self.softmax(x) 44 | x = x[:, [0], :, :] 45 | 46 | if self.training: 47 | self.cur_T = self.cur_T * self.decay_alpha 48 | 49 | return x -------------------------------------------------------------------------------- /mmdet/models/utils/conv_ws.py: -------------------------------------------------------------------------------- 1 | import torch.nn as nn 2 | import torch.nn.functional as F 3 | 4 | 5 | def conv_ws_2d(input, 6 | weight, 7 | bias=None, 8 | stride=1, 9 | padding=0, 10 | dilation=1, 11 | groups=1, 12 | eps=1e-5): 13 | c_in = weight.size(0) 14 | weight_flat = weight.view(c_in, -1) 15 | mean = weight_flat.mean(dim=1, keepdim=True).view(c_in, 1, 1, 1) 16 | std = weight_flat.std(dim=1, keepdim=True).view(c_in, 1, 1, 1) 17 | weight = (weight - mean) / (std + eps) 18 | return F.conv2d(input, weight, bias, stride, padding, dilation, groups) 19 | 20 | 21 | class ConvWS2d(nn.Conv2d): 22 | 23 | def __init__(self, 24 | in_channels, 25 | out_channels, 26 | kernel_size, 27 | stride=1, 28 | padding=0, 29 | dilation=1, 30 | groups=1, 31 | bias=True, 32 | eps=1e-5): 33 | super(ConvWS2d, self).__init__( 34 | in_channels, 35 | out_channels, 36 | kernel_size, 37 | stride=stride, 38 | padding=padding, 39 | dilation=dilation, 40 | groups=groups, 41 | bias=bias) 42 | self.eps = eps 43 | 44 | def forward(self, x): 45 | return conv_ws_2d(x, self.weight, self.bias, self.stride, self.padding, 46 | self.dilation, self.groups, self.eps) 47 | -------------------------------------------------------------------------------- /mmdet/datasets/imagenet.py: -------------------------------------------------------------------------------- 1 | import mmcv 2 | from .custom import CustomDataset 3 | 4 | 5 | class ImageNetDataset(CustomDataset): 6 | 7 | CLASSES = ('person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', 8 | 'train', 'truck', 'boat', 'traffic_light', 'fire_hydrant', 9 | 'stop_sign', 'parking_meter', 'bench', 'bird', 'cat', 'dog', 10 | 'horse', 'sheep', 'cow', 'elephant', 'bear', 'zebra', 'giraffe', 11 | 'backpack', 'umbrella', 'handbag', 'tie', 'suitcase', 'frisbee', 12 | 'skis', 'snowboard', 'sports_ball', 'kite', 'baseball_bat', 13 | 'baseball_glove', 'skateboard', 'surfboard', 'tennis_racket', 14 | 'bottle', 'wine_glass', 'cup', 'fork', 'knife', 'spoon', 'bowl', 15 | 'banana', 'apple', 'sandwich', 'orange', 'broccoli', 'carrot', 16 | 'hot_dog', 'pizza', 'donut', 'cake', 'chair', 'couch', 17 | 'potted_plant', 'bed', 'dining_table', 'toilet', 'tv', 'laptop', 18 | 'mouse', 'remote', 'keyboard', 'cell_phone', 'microwave', 19 | 'oven', 'toaster', 'sink', 'refrigerator', 'book', 'clock', 20 | 'vase', 'scissors', 'teddy_bear', 'hair_drier', 'toothbrush') 21 | 22 | def load_annotations(self, ann_file): 23 | img_infos = mmcv.load(ann_file) 24 | for item in img_infos: 25 | item.update({'id': item['filename'].split('/')[-1].split('.')[0]}) 26 | return img_infos 27 | 28 | def get_ann_info(self, idx): 29 | return self.img_infos[idx] 30 | 31 | -------------------------------------------------------------------------------- /mmdet/ops/sigmoid_focal_loss/functions/sigmoid_focal_loss.py: -------------------------------------------------------------------------------- 1 | import torch.nn.functional as F 2 | from torch.autograd import Function 3 | from torch.autograd.function import once_differentiable 4 | 5 | from .. 
import sigmoid_focal_loss_cuda 6 | 7 | 8 | class SigmoidFocalLossFunction(Function): 9 | 10 | @staticmethod 11 | def forward(ctx, input, target, gamma=2.0, alpha=0.25, reduction='mean'): 12 | ctx.save_for_backward(input, target) 13 | num_classes = input.shape[1] 14 | ctx.num_classes = num_classes 15 | ctx.gamma = gamma 16 | ctx.alpha = alpha 17 | 18 | loss = sigmoid_focal_loss_cuda.forward(input, target, num_classes, 19 | gamma, alpha) 20 | reduction_enum = F._Reduction.get_enum(reduction) 21 | # none: 0, mean:1, sum: 2 22 | if reduction_enum == 0: 23 | return loss 24 | elif reduction_enum == 1: 25 | return loss.mean() 26 | elif reduction_enum == 2: 27 | return loss.sum() 28 | 29 | @staticmethod 30 | @once_differentiable 31 | def backward(ctx, d_loss): 32 | input, target = ctx.saved_tensors 33 | num_classes = ctx.num_classes 34 | gamma = ctx.gamma 35 | alpha = ctx.alpha 36 | d_loss = d_loss.contiguous() 37 | d_input = sigmoid_focal_loss_cuda.backward(input, target, d_loss, 38 | num_classes, gamma, alpha) 39 | return d_input, None, None, None, None 40 | 41 | 42 | sigmoid_focal_loss = SigmoidFocalLossFunction.apply 43 | -------------------------------------------------------------------------------- /mmdet/models/utils/weight_init.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import torch.nn as nn 3 | 4 | 5 | def xavier_init(module, gain=1, bias=0, distribution='normal'): 6 | assert distribution in ['uniform', 'normal'] 7 | if distribution == 'uniform': 8 | nn.init.xavier_uniform_(module.weight, gain=gain) 9 | else: 10 | nn.init.xavier_normal_(module.weight, gain=gain) 11 | if hasattr(module, 'bias') and module.bias is not None: 12 | nn.init.constant_(module.bias, bias) 13 | 14 | 15 | def normal_init(module, mean=0, std=1, bias=0): 16 | nn.init.normal_(module.weight, mean, std) 17 | if hasattr(module, 'bias') and module.bias is not None: 18 | nn.init.constant_(module.bias, bias) 19 | 20 | 21 | def uniform_init(module, a=0, b=1, bias=0): 22 | nn.init.uniform_(module.weight, a, b) 23 | if hasattr(module, 'bias') and module.bias is not None: 24 | nn.init.constant_(module.bias, bias) 25 | 26 | 27 | def kaiming_init(module, 28 | mode='fan_out', 29 | nonlinearity='relu', 30 | bias=0, 31 | distribution='normal'): 32 | assert distribution in ['uniform', 'normal'] 33 | if distribution == 'uniform': 34 | nn.init.kaiming_uniform_( 35 | module.weight, mode=mode, nonlinearity=nonlinearity) 36 | else: 37 | nn.init.kaiming_normal_( 38 | module.weight, mode=mode, nonlinearity=nonlinearity) 39 | if hasattr(module, 'bias') and module.bias is not None: 40 | nn.init.constant_(module.bias, bias) 41 | 42 | 43 | def bias_init_with_prob(prior_prob): 44 | """Initialize conv/fc bias value according to a given probability.""" 45 | bias_init = float(-np.log((1 - prior_prob) / prior_prob)) 46 | return bias_init 47 | -------------------------------------------------------------------------------- /mmdet/datasets/loader/build_loader.py: -------------------------------------------------------------------------------- 1 | from functools import partial 2 | 3 | from mmcv.runner import get_dist_info 4 | from mmcv.parallel import collate 5 | from torch.utils.data import DataLoader 6 | 7 | from .sampler import GroupSampler, DistributedGroupSampler, DistributedSampler 8 | 9 | # https://github.com/pytorch/pytorch/issues/973 10 | import resource 11 | rlimit = resource.getrlimit(resource.RLIMIT_NOFILE) 12 | resource.setrlimit(resource.RLIMIT_NOFILE, (4096, rlimit[1])) 13 | 14 | 15 | def build_dataloader(dataset, 16 | imgs_per_gpu, 17 | workers_per_gpu, 18 | num_gpus=1, 19 |
dist=True, 20 | **kwargs): 21 | shuffle = kwargs.get('shuffle', True) 22 | if dist: 23 | rank, world_size = get_dist_info() 24 | if shuffle: 25 | sampler = DistributedGroupSampler(dataset, imgs_per_gpu, 26 | world_size, rank) 27 | else: 28 | sampler = DistributedSampler( 29 | dataset, world_size, rank, shuffle=False) 30 | batch_size = imgs_per_gpu 31 | num_workers = workers_per_gpu 32 | else: 33 | sampler = GroupSampler(dataset, imgs_per_gpu) if shuffle else None 34 | batch_size = num_gpus * imgs_per_gpu 35 | num_workers = num_gpus * workers_per_gpu 36 | 37 | data_loader = DataLoader( 38 | dataset, 39 | batch_size=batch_size, 40 | sampler=sampler, 41 | num_workers=num_workers, 42 | collate_fn=partial(collate, samples_per_gpu=imgs_per_gpu), 43 | pin_memory=False, 44 | **kwargs) 45 | 46 | return data_loader 47 | -------------------------------------------------------------------------------- /mmdet/core/evaluation/bbox_overlaps.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | 3 | 4 | def bbox_overlaps(bboxes1, bboxes2, mode='iou'): 5 | """Calculate the ious between each bbox of bboxes1 and bboxes2. 6 | 7 | Args: 8 | bboxes1(ndarray): shape (n, 4) 9 | bboxes2(ndarray): shape (k, 4) 10 | mode(str): iou (intersection over union) or iof (intersection 11 | over foreground) 12 | 13 | Returns: 14 | ious(ndarray): shape (n, k) 15 | """ 16 | 17 | assert mode in ['iou', 'iof'] 18 | 19 | bboxes1 = bboxes1.astype(np.float32) 20 | bboxes2 = bboxes2.astype(np.float32) 21 | rows = bboxes1.shape[0] 22 | cols = bboxes2.shape[0] 23 | ious = np.zeros((rows, cols), dtype=np.float32) 24 | if rows * cols == 0: 25 | return ious 26 | exchange = False 27 | if bboxes1.shape[0] > bboxes2.shape[0]: 28 | bboxes1, bboxes2 = bboxes2, bboxes1 29 | ious = np.zeros((cols, rows), dtype=np.float32) 30 | exchange = True 31 | area1 = (bboxes1[:, 2] - bboxes1[:, 0] + 1) * ( 32 | bboxes1[:, 3] - bboxes1[:, 1] + 1) 33 | area2 = (bboxes2[:, 2] - bboxes2[:, 0] + 1) * ( 34 | bboxes2[:, 3] - bboxes2[:, 1] + 1) 35 | for i in range(bboxes1.shape[0]): 36 | x_start = np.maximum(bboxes1[i, 0], bboxes2[:, 0]) 37 | y_start = np.maximum(bboxes1[i, 1], bboxes2[:, 1]) 38 | x_end = np.minimum(bboxes1[i, 2], bboxes2[:, 2]) 39 | y_end = np.minimum(bboxes1[i, 3], bboxes2[:, 3]) 40 | overlap = np.maximum(x_end - x_start + 1, 0) * np.maximum( 41 | y_end - y_start + 1, 0) 42 | if mode == 'iou': 43 | union = area1[i] + area2 - overlap 44 | else: 45 | union = area1[i] if not exchange else area2 46 | ious[i, :] = overlap / union 47 | if exchange: 48 | ious = ious.T 49 | return ious 50 | -------------------------------------------------------------------------------- /mmdet/models/utils/norm.py: -------------------------------------------------------------------------------- 1 | import torch.nn as nn 2 | 3 | norm_cfg = { 4 | # format: layer_type: (abbreviation, module) 5 | 'BN': ('bn', nn.BatchNorm2d), 6 | 'SyncBN': ('bn', nn.SyncBatchNorm), 7 | 'GN': ('gn', nn.GroupNorm), 8 | # and potentially 'SN' 9 | } 10 | 11 | 12 | def build_norm_layer(cfg, num_features, postfix=''): 13 | """Build a normalization layer. 14 | 15 | Args: 16 | cfg (dict): cfg should contain: 17 | type (str): identifies the norm layer type. 18 | layer args: args needed to instantiate a norm layer. 19 | requires_grad (bool): [optional] whether to stop gradient updates 20 | num_features (int): number of input channels. 21 | postfix (int, str): appended to the norm abbreviation to 22 | create the layer name.
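        Example (an illustrative sketch of the call; the outputs shown follow
        directly from the mapping in ``norm_cfg`` above):
            >>> name, layer = build_norm_layer(dict(type='BN'), 64, postfix=1)
            >>> name
            'bn1'
            >>> isinstance(layer, nn.BatchNorm2d)
            True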
23 | 24 | Returns: 25 | name (str): abbreviation + postfix 26 | layer (nn.Module): created norm layer 27 | """ 28 | assert isinstance(cfg, dict) and 'type' in cfg 29 | cfg_ = cfg.copy() 30 | 31 | layer_type = cfg_.pop('type') 32 | if layer_type not in norm_cfg: 33 | raise KeyError('Unrecognized norm type {}'.format(layer_type)) 34 | else: 35 | abbr, norm_layer = norm_cfg[layer_type] 36 | if norm_layer is None: 37 | raise NotImplementedError 38 | 39 | assert isinstance(postfix, (int, str)) 40 | name = abbr + str(postfix) 41 | 42 | requires_grad = cfg_.pop('requires_grad', True) 43 | cfg_.setdefault('eps', 1e-5) 44 | if layer_type != 'GN': 45 | layer = norm_layer(num_features, **cfg_) 46 | if layer_type == 'SyncBN': 47 | layer._specify_ddp_gpu_num(1) 48 | else: 49 | assert 'num_groups' in cfg_ 50 | layer = norm_layer(num_channels=num_features, **cfg_) 51 | 52 | for param in layer.parameters(): 53 | param.requires_grad = requires_grad 54 | 55 | return name, layer 56 | -------------------------------------------------------------------------------- /mmdet/models/builder.py: -------------------------------------------------------------------------------- 1 | import mmcv 2 | from torch import nn 3 | 4 | from .registry import (BACKBONES, NECKS, ROI_EXTRACTORS, SHARED_HEADS, HEADS, 5 | LOSSES, DETECTORS) 6 | 7 | 8 | def _build_module(cfg, registry, default_args): 9 | assert isinstance(cfg, dict) and 'type' in cfg 10 | assert isinstance(default_args, dict) or default_args is None 11 | args = cfg.copy() 12 | obj_type = args.pop('type') 13 | if mmcv.is_str(obj_type): 14 | if obj_type not in registry.module_dict: 15 | raise KeyError('{} is not in the {} registry'.format( 16 | obj_type, registry.name)) 17 | obj_type = registry.module_dict[obj_type] 18 | elif not isinstance(obj_type, type): 19 | raise TypeError('type must be a str or valid type, but got {}'.format( 20 | type(obj_type))) 21 | if default_args is not None: 22 | for name, value in default_args.items(): 23 | args.setdefault(name, value) 24 | return obj_type(**args) 25 | 26 | 27 | def build(cfg, registry, default_args=None): 28 | if isinstance(cfg, list): 29 | modules = [_build_module(cfg_, registry, default_args) for cfg_ in cfg] 30 | return nn.Sequential(*modules) 31 | else: 32 | return _build_module(cfg, registry, default_args) 33 | 34 | 35 | def build_backbone(cfg): 36 | return build(cfg, BACKBONES) 37 | 38 | 39 | def build_neck(cfg): 40 | return build(cfg, NECKS) 41 | 42 | 43 | def build_roi_extractor(cfg): 44 | return build(cfg, ROI_EXTRACTORS) 45 | 46 | 47 | def build_shared_head(cfg): 48 | return build(cfg, SHARED_HEADS) 49 | 50 | 51 | def build_head(cfg): 52 | return build(cfg, HEADS) 53 | 54 | 55 | def build_loss(cfg): 56 | return build(cfg, LOSSES) 57 | 58 | 59 | def build_detector(cfg, train_cfg=None, test_cfg=None): 60 | return build(cfg, DETECTORS, dict(train_cfg=train_cfg, test_cfg=test_cfg)) 61 | -------------------------------------------------------------------------------- /mmdet/core/bbox/samplers/instance_balanced_pos_sampler.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import torch 3 | 4 | from .random_sampler import RandomSampler 5 | 6 | 7 | class InstanceBalancedPosSampler(RandomSampler): 8 | 9 | def _sample_pos(self, assign_result, num_expected, **kwargs): 10 | pos_inds = torch.nonzero(assign_result.gt_inds > 0) 11 | if pos_inds.numel() != 0: 12 | pos_inds = pos_inds.squeeze(1) 13 | if pos_inds.numel() <= num_expected: 14 | return pos_inds 15 | else: 
16 | unique_gt_inds = assign_result.gt_inds[pos_inds].unique() 17 | num_gts = len(unique_gt_inds) 18 | num_per_gt = int(round(num_expected / float(num_gts)) + 1) 19 | sampled_inds = [] 20 | for i in unique_gt_inds: 21 | inds = torch.nonzero(assign_result.gt_inds == i.item()) 22 | if inds.numel() != 0: 23 | inds = inds.squeeze(1) 24 | else: 25 | continue 26 | if len(inds) > num_per_gt: 27 | inds = self.random_choice(inds, num_per_gt) 28 | sampled_inds.append(inds) 29 | sampled_inds = torch.cat(sampled_inds) 30 | if len(sampled_inds) < num_expected: 31 | num_extra = num_expected - len(sampled_inds) 32 | extra_inds = np.array( 33 | list(set(pos_inds.cpu()) - set(sampled_inds.cpu()))) 34 | if len(extra_inds) > num_extra: 35 | extra_inds = self.random_choice(extra_inds, num_extra) 36 | extra_inds = torch.from_numpy(extra_inds).to( 37 | assign_result.gt_inds.device).long() 38 | sampled_inds = torch.cat([sampled_inds, extra_inds]) 39 | elif len(sampled_inds) > num_expected: 40 | sampled_inds = self.random_choice(sampled_inds, num_expected) 41 | return sampled_inds 42 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | # Byte-compiled / optimized / DLL files 2 | __pycache__/ 3 | *.py[cod] 4 | *$py.class 5 | 6 | # C extensions 7 | *.so 8 | 9 | # Distribution / packaging 10 | .Python 11 | build/ 12 | develop-eggs/ 13 | dist/ 14 | downloads/ 15 | eggs/ 16 | .eggs/ 17 | lib/ 18 | lib64/ 19 | parts/ 20 | sdist/ 21 | var/ 22 | wheels/ 23 | *.egg-info/ 24 | .installed.cfg 25 | *.egg 26 | MANIFEST 27 | 28 | # PyInstaller 29 | # Usually these files are written by a python script from a template 30 | # before PyInstaller builds the exe, so as to inject date/other infos into it. 31 | *.manifest 32 | *.spec 33 | 34 | # Installer logs 35 | pip-log.txt 36 | pip-delete-this-directory.txt 37 | 38 | # Unit test / coverage reports 39 | htmlcov/ 40 | .tox/ 41 | .coverage 42 | .coverage.* 43 | .cache 44 | nosetests.xml 45 | coverage.xml 46 | *.cover 47 | .hypothesis/ 48 | .pytest_cache/ 49 | 50 | # Translations 51 | *.mo 52 | *.pot 53 | 54 | # Django stuff: 55 | *.log 56 | local_settings.py 57 | db.sqlite3 58 | 59 | # Flask stuff: 60 | instance/ 61 | .webassets-cache 62 | 63 | # Scrapy stuff: 64 | .scrapy 65 | 66 | # Sphinx documentation 67 | docs/_build/ 68 | 69 | # PyBuilder 70 | target/ 71 | 72 | # Jupyter Notebook 73 | .ipynb_checkpoints 74 | 75 | # pyenv 76 | .python-version 77 | 78 | # celery beat schedule file 79 | celerybeat-schedule 80 | 81 | # SageMath parsed files 82 | *.sage.py 83 | 84 | # Environments 85 | .env 86 | .venv 87 | env/ 88 | venv/ 89 | ENV/ 90 | env.bak/ 91 | venv.bak/ 92 | 93 | # Spyder project settings 94 | .spyderproject 95 | .spyproject 96 | 97 | # Rope project settings 98 | .ropeproject 99 | 100 | # mkdocs documentation 101 | /site 102 | 103 | # mypy 104 | .mypy_cache/ 105 | 106 | # cython generated cpp 107 | mmdet/ops/nms/src/soft_nms_cpu.cpp 108 | mmdet/version.py 109 | .vscode 110 | .idea 111 | .DS_Store 112 | 113 | # data & work_dirs 114 | data 115 | work_dirs -------------------------------------------------------------------------------- /mmdet/ops/roi_pool/functions/roi_pool.py: -------------------------------------------------------------------------------- 1 | import torch 2 | from torch.autograd import Function 3 | 4 | from .. 
import roi_pool_cuda 5 | 6 | 7 | class RoIPoolFunction(Function): 8 | 9 | @staticmethod 10 | def forward(ctx, features, rois, out_size, spatial_scale): 11 | if isinstance(out_size, int): 12 | out_h = out_size 13 | out_w = out_size 14 | elif isinstance(out_size, tuple): 15 | assert len(out_size) == 2 16 | assert isinstance(out_size[0], int) 17 | assert isinstance(out_size[1], int) 18 | out_h, out_w = out_size 19 | else: 20 | raise TypeError( 21 | '"out_size" must be an integer or tuple of integers') 22 | assert features.is_cuda 23 | ctx.save_for_backward(rois) 24 | num_channels = features.size(1) 25 | num_rois = rois.size(0) 26 | out_size = (num_rois, num_channels, out_h, out_w) 27 | output = features.new_zeros(out_size) 28 | argmax = features.new_zeros(out_size, dtype=torch.int) 29 | roi_pool_cuda.forward(features, rois, out_h, out_w, spatial_scale, 30 | output, argmax) 31 | ctx.spatial_scale = spatial_scale 32 | ctx.feature_size = features.size() 33 | ctx.argmax = argmax 34 | 35 | return output 36 | 37 | @staticmethod 38 | def backward(ctx, grad_output): 39 | assert grad_output.is_cuda 40 | spatial_scale = ctx.spatial_scale 41 | feature_size = ctx.feature_size 42 | argmax = ctx.argmax 43 | rois = ctx.saved_tensors[0] 44 | assert feature_size is not None 45 | 46 | grad_input = grad_rois = None 47 | if ctx.needs_input_grad[0]: 48 | grad_input = grad_output.new_zeros(feature_size) 49 | roi_pool_cuda.backward(grad_output.contiguous(), rois, argmax, 50 | spatial_scale, grad_input) 51 | 52 | return grad_input, grad_rois, None, None 53 | 54 | 55 | roi_pool = RoIPoolFunction.apply 56 | -------------------------------------------------------------------------------- /mmdet/core/bbox/samplers/random_sampler.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import torch 3 | 4 | from .base_sampler import BaseSampler 5 | 6 | 7 | class RandomSampler(BaseSampler): 8 | 9 | def __init__(self, 10 | num, 11 | pos_fraction, 12 | neg_pos_ub=-1, 13 | add_gt_as_proposals=True, 14 | **kwargs): 15 | super(RandomSampler, self).__init__(num, pos_fraction, neg_pos_ub, 16 | add_gt_as_proposals) 17 | 18 | @staticmethod 19 | def random_choice(gallery, num): 20 | """Randomly select some elements from the gallery. 21 | 22 | It seems that PyTorch's implementation is slower than numpy, so we use 23 | numpy to randperm the indices.
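        Example (illustrative only; the selection is random, so the exact
        elements returned will vary from run to run):
            >>> gallery = torch.arange(10)
            >>> RandomSampler.random_choice(gallery, 3)  # e.g. tensor([8, 2, 5])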
24 | """ 25 | assert len(gallery) >= num 26 | if isinstance(gallery, list): 27 | gallery = np.array(gallery) 28 | cands = np.arange(len(gallery)) 29 | np.random.shuffle(cands) 30 | rand_inds = cands[:num] 31 | if not isinstance(gallery, np.ndarray): 32 | rand_inds = torch.from_numpy(rand_inds).long().to(gallery.device) 33 | return gallery[rand_inds] 34 | 35 | def _sample_pos(self, assign_result, num_expected, **kwargs): 36 | """Randomly sample some positive samples.""" 37 | pos_inds = torch.nonzero(assign_result.gt_inds > 0) 38 | if pos_inds.numel() != 0: 39 | pos_inds = pos_inds.squeeze(1) 40 | if pos_inds.numel() <= num_expected: 41 | return pos_inds 42 | else: 43 | return self.random_choice(pos_inds, num_expected) 44 | 45 | def _sample_neg(self, assign_result, num_expected, **kwargs): 46 | """Randomly sample some negative samples.""" 47 | neg_inds = torch.nonzero(assign_result.gt_inds == 0) 48 | if neg_inds.numel() != 0: 49 | neg_inds = neg_inds.squeeze(1) 50 | if len(neg_inds) <= num_expected: 51 | return neg_inds 52 | else: 53 | return self.random_choice(neg_inds, num_expected) 54 | -------------------------------------------------------------------------------- /mmdet/ops/sigmoid_focal_loss/src/sigmoid_focal_loss.cpp: -------------------------------------------------------------------------------- 1 | // modify from 2 | // https://github.com/facebookresearch/maskrcnn-benchmark/blob/master/maskrcnn_benchmark/csrc/SigmoidFocalLoss.h 3 | #include 4 | 5 | at::Tensor SigmoidFocalLoss_forward_cuda(const at::Tensor &logits, 6 | const at::Tensor &targets, 7 | const int num_classes, 8 | const float gamma, const float alpha); 9 | 10 | at::Tensor SigmoidFocalLoss_backward_cuda(const at::Tensor &logits, 11 | const at::Tensor &targets, 12 | const at::Tensor &d_losses, 13 | const int num_classes, 14 | const float gamma, const float alpha); 15 | 16 | // Interface for Python 17 | at::Tensor SigmoidFocalLoss_forward(const at::Tensor &logits, 18 | const at::Tensor &targets, 19 | const int num_classes, const float gamma, 20 | const float alpha) { 21 | if (logits.type().is_cuda()) { 22 | return SigmoidFocalLoss_forward_cuda(logits, targets, num_classes, gamma, 23 | alpha); 24 | } 25 | } 26 | 27 | at::Tensor SigmoidFocalLoss_backward(const at::Tensor &logits, 28 | const at::Tensor &targets, 29 | const at::Tensor &d_losses, 30 | const int num_classes, const float gamma, 31 | const float alpha) { 32 | if (logits.type().is_cuda()) { 33 | return SigmoidFocalLoss_backward_cuda(logits, targets, d_losses, 34 | num_classes, gamma, alpha); 35 | } 36 | } 37 | 38 | PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) { 39 | m.def("forward", &SigmoidFocalLoss_forward, 40 | "SigmoidFocalLoss forward (CUDA)"); 41 | m.def("backward", &SigmoidFocalLoss_backward, 42 | "SigmoidFocalLoss backward (CUDA)"); 43 | } 44 | -------------------------------------------------------------------------------- /mmdet/core/utils/dist_utils.py: -------------------------------------------------------------------------------- 1 | from collections import OrderedDict 2 | 3 | import torch.distributed as dist 4 | from torch._utils import (_flatten_dense_tensors, _unflatten_dense_tensors, 5 | _take_tensors) 6 | from mmcv.runner import OptimizerHook 7 | 8 | 9 | def _allreduce_coalesced(tensors, world_size, bucket_size_mb=-1): 10 | if bucket_size_mb > 0: 11 | bucket_size_bytes = bucket_size_mb * 1024 * 1024 12 | buckets = _take_tensors(tensors, bucket_size_bytes) 13 | else: 14 | buckets = OrderedDict() 15 | for tensor in tensors: 16 | tp = tensor.type() 17 | 
if tp not in buckets: 18 | buckets[tp] = [] 19 | buckets[tp].append(tensor) 20 | buckets = buckets.values() 21 | 22 | for bucket in buckets: 23 | flat_tensors = _flatten_dense_tensors(bucket) 24 | dist.all_reduce(flat_tensors) 25 | flat_tensors.div_(world_size) 26 | for tensor, synced in zip( 27 | bucket, _unflatten_dense_tensors(flat_tensors, bucket)): 28 | tensor.copy_(synced) 29 | 30 | 31 | def allreduce_grads(model, coalesce=True, bucket_size_mb=-1): 32 | grads = [ 33 | param.grad.data for param in model.parameters() 34 | if param.requires_grad and param.grad is not None 35 | ] 36 | world_size = dist.get_world_size() 37 | if coalesce: 38 | _allreduce_coalesced(grads, world_size, bucket_size_mb) 39 | else: 40 | for tensor in grads: 41 | dist.all_reduce(tensor.div_(world_size)) 42 | 43 | 44 | class DistOptimizerHook(OptimizerHook): 45 | 46 | def __init__(self, grad_clip=None, coalesce=True, bucket_size_mb=-1): 47 | self.grad_clip = grad_clip 48 | self.coalesce = coalesce 49 | self.bucket_size_mb = bucket_size_mb 50 | 51 | def after_train_iter(self, runner): 52 | runner.optimizer.zero_grad() 53 | runner.outputs['loss'].backward() 54 | allreduce_grads(runner.model, self.coalesce, self.bucket_size_mb) 55 | if self.grad_clip is not None: 56 | self.clip_grads(runner.model.parameters()) 57 | runner.optimizer.step() 58 | -------------------------------------------------------------------------------- /mmdet/ops/roi_align/functions/roi_align.py: -------------------------------------------------------------------------------- 1 | from torch.autograd import Function 2 | 3 | from .. import roi_align_cuda 4 | 5 | 6 | class RoIAlignFunction(Function): 7 | 8 | @staticmethod 9 | def forward(ctx, features, rois, out_size, spatial_scale, sample_num=0): 10 | if isinstance(out_size, int): 11 | out_h = out_size 12 | out_w = out_size 13 | elif isinstance(out_size, tuple): 14 | assert len(out_size) == 2 15 | assert isinstance(out_size[0], int) 16 | assert isinstance(out_size[1], int) 17 | out_h, out_w = out_size 18 | else: 19 | raise TypeError( 20 | '"out_size" must be an integer or tuple of integers') 21 | ctx.spatial_scale = spatial_scale 22 | ctx.sample_num = sample_num 23 | ctx.save_for_backward(rois) 24 | ctx.feature_size = features.size() 25 | 26 | batch_size, num_channels, data_height, data_width = features.size() 27 | num_rois = rois.size(0) 28 | 29 | output = features.new_zeros(num_rois, num_channels, out_h, out_w) 30 | if features.is_cuda: 31 | roi_align_cuda.forward(features, rois, out_h, out_w, spatial_scale, 32 | sample_num, output) 33 | else: 34 | raise NotImplementedError 35 | 36 | return output 37 | 38 | @staticmethod 39 | def backward(ctx, grad_output): 40 | feature_size = ctx.feature_size 41 | spatial_scale = ctx.spatial_scale 42 | sample_num = ctx.sample_num 43 | rois = ctx.saved_tensors[0] 44 | assert (feature_size is not None and grad_output.is_cuda) 45 | 46 | batch_size, num_channels, data_height, data_width = feature_size 47 | out_w = grad_output.size(3) 48 | out_h = grad_output.size(2) 49 | 50 | grad_input = grad_rois = None 51 | if ctx.needs_input_grad[0]: 52 | grad_input = rois.new_zeros(batch_size, num_channels, data_height, 53 | data_width) 54 | roi_align_cuda.backward(grad_output.contiguous(), rois, out_h, 55 | out_w, spatial_scale, sample_num, 56 | grad_input) 57 | 58 | return grad_input, grad_rois, None, None, None 59 | 60 | 61 | roi_align = RoIAlignFunction.apply 62 | -------------------------------------------------------------------------------- /mmdet/core/bbox/geometry.py: 
-------------------------------------------------------------------------------- 1 | import torch 2 | 3 | 4 | def bbox_overlaps(bboxes1, bboxes2, mode='iou', is_aligned=False): 5 | """Calculate overlap between two set of bboxes. 6 | 7 | If ``is_aligned`` is ``False``, then calculate the ious between each bbox 8 | of bboxes1 and bboxes2, otherwise the ious between each aligned pair of 9 | bboxes1 and bboxes2. 10 | 11 | Args: 12 | bboxes1 (Tensor): shape (m, 4) 13 | bboxes2 (Tensor): shape (n, 4), if is_aligned is ``True``, then m and n 14 | must be equal. 15 | mode (str): "iou" (intersection over union) or iof (intersection over 16 | foreground). 17 | 18 | Returns: 19 | ious(Tensor): shape (m, n) if is_aligned == False else shape (m, 1) 20 | """ 21 | 22 | assert mode in ['iou', 'iof'] 23 | 24 | rows = bboxes1.size(0) 25 | cols = bboxes2.size(0) 26 | if is_aligned: 27 | assert rows == cols 28 | 29 | if rows * cols == 0: 30 | return bboxes1.new(rows, 1) if is_aligned else bboxes1.new(rows, cols) 31 | 32 | if is_aligned: 33 | lt = torch.max(bboxes1[:, :2], bboxes2[:, :2]) # [rows, 2] 34 | rb = torch.min(bboxes1[:, 2:], bboxes2[:, 2:]) # [rows, 2] 35 | 36 | wh = (rb - lt + 1).clamp(min=0) # [rows, 2] 37 | overlap = wh[:, 0] * wh[:, 1] 38 | area1 = (bboxes1[:, 2] - bboxes1[:, 0] + 1) * ( 39 | bboxes1[:, 3] - bboxes1[:, 1] + 1) 40 | 41 | if mode == 'iou': 42 | area2 = (bboxes2[:, 2] - bboxes2[:, 0] + 1) * ( 43 | bboxes2[:, 3] - bboxes2[:, 1] + 1) 44 | ious = overlap / (area1 + area2 - overlap) 45 | else: 46 | ious = overlap / area1 47 | else: 48 | lt = torch.max(bboxes1[:, None, :2], bboxes2[:, :2]) # [rows, cols, 2] 49 | rb = torch.min(bboxes1[:, None, 2:], bboxes2[:, 2:]) # [rows, cols, 2] 50 | 51 | wh = (rb - lt + 1).clamp(min=0) # [rows, cols, 2] 52 | overlap = wh[:, :, 0] * wh[:, :, 1] 53 | area1 = (bboxes1[:, 2] - bboxes1[:, 0] + 1) * ( 54 | bboxes1[:, 3] - bboxes1[:, 1] + 1) 55 | 56 | if mode == 'iou': 57 | area2 = (bboxes2[:, 2] - bboxes2[:, 0] + 1) * ( 58 | bboxes2[:, 3] - bboxes2[:, 1] + 1) 59 | ious = overlap / (area1[:, None] + area2 - overlap) 60 | else: 61 | ious = overlap / (area1[:, None]) 62 | 63 | return ious 64 | -------------------------------------------------------------------------------- /mmdet/apis/env.py: -------------------------------------------------------------------------------- 1 | import logging 2 | import os 3 | import random 4 | import subprocess 5 | 6 | import numpy as np 7 | import torch 8 | import torch.distributed as dist 9 | import torch.multiprocessing as mp 10 | from mmcv.runner import get_dist_info 11 | 12 | 13 | def init_dist(launcher, backend='nccl', **kwargs): 14 | if mp.get_start_method(allow_none=True) is None: 15 | mp.set_start_method('spawn') 16 | if launcher == 'pytorch': 17 | _init_dist_pytorch(backend, **kwargs) 18 | elif launcher == 'mpi': 19 | _init_dist_mpi(backend, **kwargs) 20 | elif launcher == 'slurm': 21 | _init_dist_slurm(backend, **kwargs) 22 | else: 23 | raise ValueError('Invalid launcher type: {}'.format(launcher)) 24 | 25 | 26 | def _init_dist_pytorch(backend, **kwargs): 27 | # TODO: use local_rank instead of rank % num_gpus 28 | rank = int(os.environ['RANK']) 29 | num_gpus = torch.cuda.device_count() 30 | torch.cuda.set_device(rank % num_gpus) 31 | dist.init_process_group(backend=backend, **kwargs) 32 | 33 | 34 | def _init_dist_mpi(backend, **kwargs): 35 | raise NotImplementedError 36 | 37 | 38 | def _init_dist_slurm(backend, port=29500, **kwargs): 39 | proc_id = int(os.environ['SLURM_PROCID']) 40 | ntasks = 
int(os.environ['SLURM_NTASKS']) 41 | node_list = os.environ['SLURM_NODELIST'] 42 | num_gpus = torch.cuda.device_count() 43 | torch.cuda.set_device(proc_id % num_gpus) 44 | addr = subprocess.getoutput( 45 | 'scontrol show hostname {} | head -n1'.format(node_list)) 46 | os.environ['MASTER_PORT'] = str(port) 47 | os.environ['MASTER_ADDR'] = addr 48 | os.environ['WORLD_SIZE'] = str(ntasks) 49 | os.environ['RANK'] = str(proc_id) 50 | dist.init_process_group(backend=backend) 51 | 52 | 53 | def set_random_seed(seed): 54 | random.seed(seed) 55 | np.random.seed(seed) 56 | torch.manual_seed(seed) 57 | torch.cuda.manual_seed_all(seed) 58 | 59 | 60 | def get_root_logger(log_level=logging.INFO): 61 | logger = logging.getLogger() 62 | if not logger.hasHandlers(): 63 | logging.basicConfig( 64 | format='%(asctime)s - %(levelname)s - %(message)s', 65 | level=log_level) 66 | rank, _ = get_dist_info() 67 | if rank != 0: 68 | logger.setLevel('ERROR') 69 | return logger 70 | 71 | 72 | def get_git_hash(): 73 | return subprocess.check_output('git log --oneline -1', shell=True).decode() -------------------------------------------------------------------------------- /mmdet/core/post_processing/bbox_nms.py: -------------------------------------------------------------------------------- 1 | import torch 2 | 3 | from mmdet.ops.nms import nms_wrapper 4 | 5 | 6 | def multiclass_nms(multi_bboxes, 7 | multi_scores, 8 | score_thr, 9 | nms_cfg, 10 | max_num=-1, 11 | score_factors=None): 12 | """NMS for multi-class bboxes. 13 | 14 | Args: 15 | multi_bboxes (Tensor): shape (n, #class*4) or (n, 4) 16 | multi_scores (Tensor): shape (n, #class) 17 | score_thr (float): bbox threshold, bboxes with scores lower than it 18 | will not be considered. 19 | nms_cfg (dict): NMS config, e.g. dict(type='nms', iou_thr=0.5) 20 | max_num (int): if there are more than max_num bboxes after NMS, 21 | only top max_num will be kept. 22 | score_factors (Tensor): The factors multiplied to scores before 23 | applying NMS 24 | 25 | Returns: 26 | tuple: (bboxes, labels), tensors of shape (k, 5) and (k, 1). Labels 27 | are 0-based.
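        Example (a hypothetical sketch; the 81 score columns assume a
        COCO-style head, where column 0 is treated as background and skipped):
            >>> multi_bboxes = torch.rand(1000, 4) * 100  # boxes shared across classes
            >>> multi_scores = torch.rand(1000, 81)       # shape (n, #class)
            >>> dets, labels = multiclass_nms(
            ...     multi_bboxes, multi_scores, score_thr=0.05,
            ...     nms_cfg=dict(type='nms', iou_thr=0.5), max_num=100)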
28 | """ 29 | num_classes = multi_scores.shape[1] 30 | bboxes, labels = [], [] 31 | nms_cfg_ = nms_cfg.copy() 32 | nms_type = nms_cfg_.pop('type', 'nms') 33 | nms_op = getattr(nms_wrapper, nms_type) 34 | for i in range(1, num_classes): 35 | cls_inds = multi_scores[:, i] > score_thr 36 | if not cls_inds.any(): 37 | continue 38 | # get bboxes and scores of this class 39 | if multi_bboxes.shape[1] == 4: 40 | _bboxes = multi_bboxes[cls_inds, :] 41 | else: 42 | _bboxes = multi_bboxes[cls_inds, i * 4:(i + 1) * 4] 43 | _scores = multi_scores[cls_inds, i] 44 | if score_factors is not None: 45 | _scores *= score_factors[cls_inds] 46 | cls_dets = torch.cat([_bboxes, _scores[:, None]], dim=1) 47 | cls_dets, _ = nms_op(cls_dets, **nms_cfg_) 48 | cls_labels = multi_bboxes.new_full( 49 | (cls_dets.shape[0], ), i - 1, dtype=torch.long) 50 | bboxes.append(cls_dets) 51 | labels.append(cls_labels) 52 | if bboxes: 53 | bboxes = torch.cat(bboxes) 54 | labels = torch.cat(labels) 55 | if bboxes.shape[0] > max_num: 56 | _, inds = bboxes[:, -1].sort(descending=True) 57 | inds = inds[:max_num] 58 | bboxes = bboxes[inds] 59 | labels = labels[inds] 60 | else: 61 | bboxes = multi_bboxes.new_zeros((0, 5)) 62 | labels = multi_bboxes.new_zeros((0, ), dtype=torch.long) 63 | 64 | return bboxes, labels 65 | -------------------------------------------------------------------------------- /mmdet/ops/nms/src/nms_cpu.cpp: -------------------------------------------------------------------------------- 1 | // Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved. 2 | #include 3 | 4 | template 5 | at::Tensor nms_cpu_kernel(const at::Tensor& dets, const float threshold) { 6 | AT_ASSERTM(!dets.type().is_cuda(), "dets must be a CPU tensor"); 7 | 8 | if (dets.numel() == 0) { 9 | return at::empty({0}, dets.options().dtype(at::kLong).device(at::kCPU)); 10 | } 11 | 12 | auto x1_t = dets.select(1, 0).contiguous(); 13 | auto y1_t = dets.select(1, 1).contiguous(); 14 | auto x2_t = dets.select(1, 2).contiguous(); 15 | auto y2_t = dets.select(1, 3).contiguous(); 16 | auto scores = dets.select(1, 4).contiguous(); 17 | 18 | at::Tensor areas_t = (x2_t - x1_t + 1) * (y2_t - y1_t + 1); 19 | 20 | auto order_t = std::get<1>(scores.sort(0, /* descending=*/true)); 21 | 22 | auto ndets = dets.size(0); 23 | at::Tensor suppressed_t = 24 | at::zeros({ndets}, dets.options().dtype(at::kByte).device(at::kCPU)); 25 | 26 | auto suppressed = suppressed_t.data(); 27 | auto order = order_t.data(); 28 | auto x1 = x1_t.data(); 29 | auto y1 = y1_t.data(); 30 | auto x2 = x2_t.data(); 31 | auto y2 = y2_t.data(); 32 | auto areas = areas_t.data(); 33 | 34 | for (int64_t _i = 0; _i < ndets; _i++) { 35 | auto i = order[_i]; 36 | if (suppressed[i] == 1) continue; 37 | auto ix1 = x1[i]; 38 | auto iy1 = y1[i]; 39 | auto ix2 = x2[i]; 40 | auto iy2 = y2[i]; 41 | auto iarea = areas[i]; 42 | 43 | for (int64_t _j = _i + 1; _j < ndets; _j++) { 44 | auto j = order[_j]; 45 | if (suppressed[j] == 1) continue; 46 | auto xx1 = std::max(ix1, x1[j]); 47 | auto yy1 = std::max(iy1, y1[j]); 48 | auto xx2 = std::min(ix2, x2[j]); 49 | auto yy2 = std::min(iy2, y2[j]); 50 | 51 | auto w = std::max(static_cast(0), xx2 - xx1 + 1); 52 | auto h = std::max(static_cast(0), yy2 - yy1 + 1); 53 | auto inter = w * h; 54 | auto ovr = inter / (iarea + areas[j] - inter); 55 | if (ovr >= threshold) suppressed[j] = 1; 56 | } 57 | } 58 | return at::nonzero(suppressed_t == 0).squeeze(1); 59 | } 60 | 61 | at::Tensor nms(const at::Tensor& dets, const float threshold) { 62 | at::Tensor result; 63 | 
AT_DISPATCH_FLOATING_TYPES(dets.type(), "nms", [&] { 64 | result = nms_cpu_kernel<scalar_t>(dets, threshold); 65 | }); 66 | return result; 67 | } 68 | 69 | PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) { 70 | m.def("nms", &nms, "non-maximum suppression"); 71 | } -------------------------------------------------------------------------------- /mmdet/ops/dcn/functions/deform_pool.py: -------------------------------------------------------------------------------- 1 | import torch 2 | from torch.autograd import Function 3 | 4 | from .. import deform_pool_cuda 5 | 6 | 7 | class DeformRoIPoolingFunction(Function): 8 | 9 | @staticmethod 10 | def forward(ctx, 11 | data, 12 | rois, 13 | offset, 14 | spatial_scale, 15 | out_size, 16 | out_channels, 17 | no_trans, 18 | group_size=1, 19 | part_size=None, 20 | sample_per_part=4, 21 | trans_std=.0): 22 | ctx.spatial_scale = spatial_scale 23 | ctx.out_size = out_size 24 | ctx.out_channels = out_channels 25 | ctx.no_trans = no_trans 26 | ctx.group_size = group_size 27 | ctx.part_size = out_size if part_size is None else part_size 28 | ctx.sample_per_part = sample_per_part 29 | ctx.trans_std = trans_std 30 | 31 | assert 0.0 <= ctx.trans_std <= 1.0 32 | if not data.is_cuda: 33 | raise NotImplementedError 34 | 35 | n = rois.shape[0] 36 | output = data.new_empty(n, out_channels, out_size, out_size) 37 | output_count = data.new_empty(n, out_channels, out_size, out_size) 38 | deform_pool_cuda.deform_psroi_pooling_cuda_forward( 39 | data, rois, offset, output, output_count, ctx.no_trans, 40 | ctx.spatial_scale, ctx.out_channels, ctx.group_size, ctx.out_size, 41 | ctx.part_size, ctx.sample_per_part, ctx.trans_std) 42 | 43 | if data.requires_grad or rois.requires_grad or offset.requires_grad: 44 | ctx.save_for_backward(data, rois, offset) 45 | ctx.output_count = output_count 46 | 47 | return output 48 | 49 | @staticmethod 50 | def backward(ctx, grad_output): 51 | if not grad_output.is_cuda: 52 | raise NotImplementedError 53 | 54 | data, rois, offset = ctx.saved_tensors 55 | output_count = ctx.output_count 56 | grad_input = torch.zeros_like(data) 57 | grad_rois = None 58 | grad_offset = torch.zeros_like(offset) 59 | 60 | deform_pool_cuda.deform_psroi_pooling_cuda_backward( 61 | grad_output, data, rois, offset, output_count, grad_input, 62 | grad_offset, ctx.no_trans, ctx.spatial_scale, ctx.out_channels, 63 | ctx.group_size, ctx.out_size, ctx.part_size, ctx.sample_per_part, 64 | ctx.trans_std) 65 | return (grad_input, grad_rois, grad_offset, None, None, None, None, 66 | None, None, None, None) 67 | 68 | 69 | deform_roi_pooling = DeformRoIPoolingFunction.apply 70 | -------------------------------------------------------------------------------- /mmdet/ops/nms/nms_wrapper.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import torch 3 | 4 | from . import nms_cuda, nms_cpu 5 | from .soft_nms_cpu import soft_nms_cpu 6 | 7 | 8 | def nms(dets, iou_thr, device_id=None): 9 | """Dispatch to either CPU or GPU NMS implementations. 10 | 11 | The input can be either a torch tensor or numpy array. GPU NMS will be used 12 | if the input is a gpu tensor or device_id is specified, otherwise CPU NMS 13 | will be used. The returned type will always be the same as inputs. 14 | 15 | Arguments: 16 | dets (torch.Tensor or np.ndarray): bboxes with scores. 17 | iou_thr (float): IoU threshold for NMS. 18 | device_id (int, optional): when `dets` is a numpy array, if `device_id` 19 | is None, then cpu nms is used, otherwise gpu_nms will be used.
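    Example (illustrative values; a numpy input with no device_id runs the
    CPU path, and scores are the last column):
        >>> dets = np.array([[49.1, 32.4, 51.0, 35.9, 0.9],
        ...                  [49.3, 32.9, 51.0, 35.3, 0.8],
        ...                  [35.1, 11.5, 39.1, 15.7, 0.5]], dtype=np.float32)
        >>> kept, inds = nms(dets, iou_thr=0.7)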
20 | 21 | Returns: 22 | tuple: kept bboxes and indices, which are always the same data type as 23 | the input. 24 | """ 25 | # convert dets (tensor or numpy array) to tensor 26 | if isinstance(dets, torch.Tensor): 27 | is_numpy = False 28 | dets_th = dets 29 | elif isinstance(dets, np.ndarray): 30 | is_numpy = True 31 | device = 'cpu' if device_id is None else 'cuda:{}'.format(device_id) 32 | dets_th = torch.from_numpy(dets).to(device) 33 | else: 34 | raise TypeError( 35 | 'dets must be either a Tensor or numpy array, but got {}'.format( 36 | type(dets))) 37 | 38 | # execute cpu or cuda nms 39 | if dets_th.shape[0] == 0: 40 | inds = dets_th.new_zeros(0, dtype=torch.long) 41 | else: 42 | if dets_th.is_cuda: 43 | inds = nms_cuda.nms(dets_th, iou_thr) 44 | else: 45 | inds = nms_cpu.nms(dets_th, iou_thr) 46 | 47 | if is_numpy: 48 | inds = inds.cpu().numpy() 49 | return dets[inds, :], inds 50 | 51 | 52 | def soft_nms(dets, iou_thr, method='linear', sigma=0.5, min_score=1e-3): 53 | if isinstance(dets, torch.Tensor): 54 | is_tensor = True 55 | dets_np = dets.detach().cpu().numpy() 56 | elif isinstance(dets, np.ndarray): 57 | is_tensor = False 58 | dets_np = dets 59 | else: 60 | raise TypeError( 61 | 'dets must be either a Tensor or numpy array, but got {}'.format( 62 | type(dets))) 63 | 64 | method_codes = {'linear': 1, 'gaussian': 2} 65 | if method not in method_codes: 66 | raise ValueError('Invalid method for SoftNMS: {}'.format(method)) 67 | new_dets, inds = soft_nms_cpu( 68 | dets_np, 69 | iou_thr, 70 | method=method_codes[method], 71 | sigma=sigma, 72 | min_score=min_score) 73 | 74 | if is_tensor: 75 | return dets.new_tensor(new_dets), dets.new_tensor( 76 | inds, dtype=torch.long) 77 | else: 78 | return new_dets.astype(np.float32), inds.astype(np.int64) 79 | -------------------------------------------------------------------------------- /mmcv_custom/image_io.py: -------------------------------------------------------------------------------- 1 | import os.path as osp 2 | 3 | import cv2 4 | import numpy as np 5 | 6 | from mmcv.utils import is_str, check_file_exist, mkdir_or_exist 7 | from mmcv.opencv_info import USE_OPENCV2 8 | from mmcv_custom.zipreader import ZipReader 9 | 10 | if not USE_OPENCV2: 11 | from cv2 import IMREAD_COLOR, IMREAD_GRAYSCALE, IMREAD_UNCHANGED 12 | else: 13 | from cv2 import CV_LOAD_IMAGE_COLOR as IMREAD_COLOR 14 | from cv2 import CV_LOAD_IMAGE_GRAYSCALE as IMREAD_GRAYSCALE 15 | from cv2 import CV_LOAD_IMAGE_UNCHANGED as IMREAD_UNCHANGED 16 | 17 | 18 | imread_flags = { 19 | 'color': IMREAD_COLOR, 20 | 'grayscale': IMREAD_GRAYSCALE, 21 | 'unchanged': IMREAD_UNCHANGED 22 | } 23 | 24 | 25 | def is_zip_path(img_or_path): 26 | return '.zip@' in img_or_path 27 | 28 | 29 | def imread(img_or_path, flag='color'): 30 | """Read an image. 31 | 32 | Args: 33 | img_or_path (ndarray or str): Either a numpy array or image path. 34 | If it is a numpy array (loaded image), then it will be returned 35 | as is. 36 | flag (str): Flags specifying the color type of a loaded image, 37 | candidates are `color`, `grayscale` and `unchanged`. 38 | 39 | Returns: 40 | ndarray: Loaded image array.
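    Example (the paths are hypothetical; the '.zip@' form is the convention
    that is_zip_path() above checks for to read an entry inside a zip archive):
        >>> img = imread('data/demo.jpg')
        >>> gray = imread('data/images.zip@val/demo.jpg', flag='grayscale')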
41 | """ 42 | if isinstance(img_or_path, np.ndarray): 43 | return img_or_path 44 | elif is_str(img_or_path): 45 | flag = imread_flags[flag] if is_str(flag) else flag 46 | if is_zip_path(img_or_path): 47 | return imfrombytes(ZipReader.read(img_or_path), flag) 48 | check_file_exist(img_or_path, 49 | 'img file does not exist: {}'.format(img_or_path)) 50 | return cv2.imread(img_or_path, flag) 51 | else: 52 | raise TypeError('"img" must be a numpy array or a filename') 53 | 54 | 55 | def imfrombytes(content, flag='color'): 56 | """Read an image from bytes. 57 | 58 | Args: 59 | content (bytes): Image bytes got from files or other streams. 60 | flag (str): Same as :func:`imread`. 61 | 62 | Returns: 63 | ndarray: Loaded image array. 64 | """ 65 | img_np = np.frombuffer(content, np.uint8) 66 | flag = imread_flags[flag] if is_str(flag) else flag 67 | img = cv2.imdecode(img_np, flag) 68 | return img 69 | 70 | 71 | def imwrite(img, file_path, params=None, auto_mkdir=True): 72 | """Write image to file 73 | 74 | Args: 75 | img (ndarray): Image array to be written. 76 | file_path (str): Image file path. 77 | params (None or list): Same as opencv's :func:`imwrite` interface. 78 | auto_mkdir (bool): If the parrent folder of `file_path` does not exist, 79 | whether to create it automatically. 80 | 81 | Returns: 82 | bool: Successful or not. 83 | """ 84 | if auto_mkdir: 85 | dir_name = osp.abspath(osp.dirname(file_path)) 86 | mkdir_or_exist(dir_name) 87 | return cv2.imwrite(file_path, img, params) 88 | -------------------------------------------------------------------------------- /mmdet/datasets/xml_style.py: -------------------------------------------------------------------------------- 1 | import os.path as osp 2 | import xml.etree.ElementTree as ET 3 | 4 | import mmcv 5 | import numpy as np 6 | 7 | from .custom import CustomDataset 8 | 9 | 10 | class XMLDataset(CustomDataset): 11 | 12 | def __init__(self, **kwargs): 13 | super(XMLDataset, self).__init__(**kwargs) 14 | self.cat2label = {cat: i + 1 for i, cat in enumerate(self.CLASSES)} 15 | 16 | def load_annotations(self, ann_file): 17 | img_infos = [] 18 | img_ids = mmcv.list_from_file(ann_file) 19 | for img_id in img_ids: 20 | filename = 'JPEGImages/{}.jpg'.format(img_id) 21 | xml_path = osp.join(self.img_prefix, 'Annotations', 22 | '{}.xml'.format(img_id)) 23 | tree = ET.parse(xml_path) 24 | root = tree.getroot() 25 | size = root.find('size') 26 | width = int(size.find('width').text) 27 | height = int(size.find('height').text) 28 | img_infos.append( 29 | dict(id=img_id, filename=filename, width=width, height=height)) 30 | return img_infos 31 | 32 | def get_ann_info(self, idx): 33 | img_id = self.img_infos[idx]['id'] 34 | xml_path = osp.join(self.img_prefix, 'Annotations', 35 | '{}.xml'.format(img_id)) 36 | tree = ET.parse(xml_path) 37 | root = tree.getroot() 38 | bboxes = [] 39 | labels = [] 40 | bboxes_ignore = [] 41 | labels_ignore = [] 42 | for obj in root.findall('object'): 43 | name = obj.find('name').text 44 | label = self.cat2label[name] 45 | difficult = int(obj.find('difficult').text) 46 | bnd_box = obj.find('bndbox') 47 | bbox = [ 48 | int(bnd_box.find('xmin').text), 49 | int(bnd_box.find('ymin').text), 50 | int(bnd_box.find('xmax').text), 51 | int(bnd_box.find('ymax').text) 52 | ] 53 | if difficult: 54 | bboxes_ignore.append(bbox) 55 | labels_ignore.append(label) 56 | else: 57 | bboxes.append(bbox) 58 | labels.append(label) 59 | if not bboxes: 60 | bboxes = np.zeros((0, 4)) 61 | labels = np.zeros((0, )) 62 | else: 63 | bboxes = 
np.array(bboxes, ndmin=2) - 1 64 | labels = np.array(labels) 65 | if not bboxes_ignore: 66 | bboxes_ignore = np.zeros((0, 4)) 67 | labels_ignore = np.zeros((0, )) 68 | else: 69 | bboxes_ignore = np.array(bboxes_ignore, ndmin=2) - 1 70 | labels_ignore = np.array(labels_ignore) 71 | ann = dict( 72 | bboxes=bboxes.astype(np.float32), 73 | labels=labels.astype(np.int64), 74 | bboxes_ignore=bboxes_ignore.astype(np.float32), 75 | labels_ignore=labels_ignore.astype(np.int64)) 76 | return ann 77 | -------------------------------------------------------------------------------- /mmdet/core/bbox/samplers/iou_balanced_neg_sampler.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import torch 3 | 4 | from .random_sampler import RandomSampler 5 | 6 | 7 | class IoUBalancedNegSampler(RandomSampler): 8 | 9 | def __init__(self, 10 | num, 11 | pos_fraction, 12 | hard_thr=0.1, 13 | hard_fraction=0.5, 14 | **kwargs): 15 | super(IoUBalancedNegSampler, self).__init__(num, pos_fraction, 16 | **kwargs) 17 | assert hard_thr > 0 18 | assert 0 < hard_fraction < 1 19 | self.hard_thr = hard_thr 20 | self.hard_fraction = hard_fraction 21 | 22 | def _sample_neg(self, assign_result, num_expected, **kwargs): 23 | neg_inds = torch.nonzero(assign_result.gt_inds == 0) 24 | if neg_inds.numel() != 0: 25 | neg_inds = neg_inds.squeeze(1) 26 | if len(neg_inds) <= num_expected: 27 | return neg_inds 28 | else: 29 | max_overlaps = assign_result.max_overlaps.cpu().numpy() 30 | # balance sampling for negative samples 31 | neg_set = set(neg_inds.cpu().numpy()) 32 | easy_set = set( 33 | np.where( 34 | np.logical_and(max_overlaps >= 0, 35 | max_overlaps < self.hard_thr))[0]) 36 | hard_set = set(np.where(max_overlaps >= self.hard_thr)[0]) 37 | easy_neg_inds = list(easy_set & neg_set) 38 | hard_neg_inds = list(hard_set & neg_set) 39 | 40 | num_expected_hard = int(num_expected * self.hard_fraction) 41 | if len(hard_neg_inds) > num_expected_hard: 42 | sampled_hard_inds = self.random_choice(hard_neg_inds, 43 | num_expected_hard) 44 | else: 45 | sampled_hard_inds = np.array(hard_neg_inds, dtype=np.int) 46 | num_expected_easy = num_expected - len(sampled_hard_inds) 47 | if len(easy_neg_inds) > num_expected_easy: 48 | sampled_easy_inds = self.random_choice(easy_neg_inds, 49 | num_expected_easy) 50 | else: 51 | sampled_easy_inds = np.array(easy_neg_inds, dtype=np.int) 52 | sampled_inds = np.concatenate((sampled_easy_inds, 53 | sampled_hard_inds)) 54 | if len(sampled_inds) < num_expected: 55 | num_extra = num_expected - len(sampled_inds) 56 | extra_inds = np.array(list(neg_set - set(sampled_inds))) 57 | if len(extra_inds) > num_extra: 58 | extra_inds = self.random_choice(extra_inds, num_extra) 59 | sampled_inds = np.concatenate((sampled_inds, extra_inds)) 60 | sampled_inds = torch.from_numpy(sampled_inds).long().to( 61 | assign_result.gt_inds.device) 62 | return sampled_inds 63 | -------------------------------------------------------------------------------- /mmdet/ops/nms/setup.py: -------------------------------------------------------------------------------- 1 | import os.path as osp 2 | from setuptools import setup, Extension 3 | 4 | import numpy as np 5 | from Cython.Build import cythonize 6 | from Cython.Distutils import build_ext 7 | from torch.utils.cpp_extension import BuildExtension, CUDAExtension 8 | 9 | ext_args = dict( 10 | include_dirs=[np.get_include()], 11 | language='c++', 12 | extra_compile_args={ 13 | 'cc': ['-Wno-unused-function', '-Wno-write-strings'], 14 | 
'nvcc': ['-c', '--compiler-options', '-fPIC'], 15 | }, 16 | ) 17 | 18 | extensions = [ 19 | Extension('soft_nms_cpu', ['src/soft_nms_cpu.pyx'], **ext_args), 20 | ] 21 | 22 | 23 | def customize_compiler_for_nvcc(self): 24 | """inject deep into distutils to customize how the dispatch 25 | to cc/nvcc works. 26 | If you subclass UnixCCompiler, it's not trivial to get your subclass 27 | injected in, and still have the right customizations (i.e. 28 | distutils.sysconfig.customize_compiler) run on it. So instead of going 29 | the OO route, I have this. Note, it's kind of like a weird functional 30 | subclassing going on.""" 31 | 32 | # tell the compiler it can process .cu 33 | self.src_extensions.append('.cu') 34 | 35 | # save references to the default compiler_so and _compile methods 36 | default_compiler_so = self.compiler_so 37 | super = self._compile 38 | 39 | # now redefine the _compile method. This gets executed for each 40 | # object but distutils doesn't have the ability to change compilers 41 | # based on source extension: we add it. 42 | def _compile(obj, src, ext, cc_args, extra_postargs, pp_opts): 43 | if osp.splitext(src)[1] == '.cu': 44 | # use the cuda for .cu files 45 | self.set_executable('compiler_so', 'nvcc') 46 | # use only a subset of the extra_postargs, which are 1-1 translated 47 | # from the extra_compile_args in the Extension class 48 | postargs = extra_postargs['nvcc'] 49 | else: 50 | postargs = extra_postargs['cc'] 51 | 52 | super(obj, src, ext, cc_args, postargs, pp_opts) 53 | # reset the default compiler_so, which we might have changed for cuda 54 | self.compiler_so = default_compiler_so 55 | 56 | # inject our redefined _compile method into the class 57 | self._compile = _compile 58 | 59 | 60 | class custom_build_ext(build_ext): 61 | 62 | def build_extensions(self): 63 | customize_compiler_for_nvcc(self.compiler) 64 | build_ext.build_extensions(self) 65 | 66 | 67 | setup( 68 | name='soft_nms', 69 | cmdclass={'build_ext': custom_build_ext}, 70 | ext_modules=cythonize(extensions), 71 | ) 72 | 73 | setup( 74 | name='nms_cuda', 75 | ext_modules=[ 76 | CUDAExtension('nms_cuda', [ 77 | 'src/nms_cuda.cpp', 78 | 'src/nms_kernel.cu', 79 | ]), 80 | CUDAExtension('nms_cpu', [ 81 | 'src/nms_cpu.cpp', 82 | ]), 83 | ], 84 | cmdclass={'build_ext': BuildExtension}) 85 | -------------------------------------------------------------------------------- /mmdet/core/bbox/samplers/ohem_sampler.py: -------------------------------------------------------------------------------- 1 | import torch 2 | 3 | from .base_sampler import BaseSampler 4 | from ..transforms import bbox2roi 5 | 6 | 7 | class OHEMSampler(BaseSampler): 8 | 9 | def __init__(self, 10 | num, 11 | pos_fraction, 12 | context, 13 | neg_pos_ub=-1, 14 | add_gt_as_proposals=True, 15 | **kwargs): 16 | super(OHEMSampler, self).__init__(num, pos_fraction, neg_pos_ub, 17 | add_gt_as_proposals) 18 | if not hasattr(context, 'num_stages'): 19 | self.bbox_roi_extractor = context.bbox_roi_extractor 20 | self.bbox_head = context.bbox_head 21 | else: 22 | self.bbox_roi_extractor = context.bbox_roi_extractor[ 23 | context.current_stage] 24 | self.bbox_head = context.bbox_head[context.current_stage] 25 | 26 | def hard_mining(self, inds, num_expected, bboxes, labels, feats): 27 | with torch.no_grad(): 28 | rois = bbox2roi([bboxes]) 29 | bbox_feats = self.bbox_roi_extractor( 30 | feats[:self.bbox_roi_extractor.num_inputs], rois) 31 | cls_score, _ = self.bbox_head(bbox_feats) 32 | loss = self.bbox_head.loss( 33 | cls_score=cls_score, 34 |
bbox_pred=None, 35 | labels=labels, 36 | label_weights=cls_score.new_ones(cls_score.size(0)), 37 | bbox_targets=None, 38 | bbox_weights=None, 39 | reduce=False)['loss_cls'] 40 | _, topk_loss_inds = loss.topk(num_expected) 41 | return inds[topk_loss_inds] 42 | 43 | def _sample_pos(self, 44 | assign_result, 45 | num_expected, 46 | bboxes=None, 47 | feats=None, 48 | **kwargs): 49 | # Sample some hard positive samples 50 | pos_inds = torch.nonzero(assign_result.gt_inds > 0) 51 | if pos_inds.numel() != 0: 52 | pos_inds = pos_inds.squeeze(1) 53 | if pos_inds.numel() <= num_expected: 54 | return pos_inds 55 | else: 56 | return self.hard_mining(pos_inds, num_expected, bboxes[pos_inds], 57 | assign_result.labels[pos_inds], feats) 58 | 59 | def _sample_neg(self, 60 | assign_result, 61 | num_expected, 62 | bboxes=None, 63 | feats=None, 64 | **kwargs): 65 | # Sample some hard negative samples 66 | neg_inds = torch.nonzero(assign_result.gt_inds == 0) 67 | if neg_inds.numel() != 0: 68 | neg_inds = neg_inds.squeeze(1) 69 | if len(neg_inds) <= num_expected: 70 | return neg_inds 71 | else: 72 | return self.hard_mining(neg_inds, num_expected, bboxes[neg_inds], 73 | assign_result.labels[neg_inds], feats) 74 | -------------------------------------------------------------------------------- /mmdet/core/bbox/samplers/base_sampler.py: -------------------------------------------------------------------------------- 1 | from abc import ABCMeta, abstractmethod 2 | 3 | import torch 4 | 5 | from .sampling_result import SamplingResult 6 | 7 | 8 | class BaseSampler(metaclass=ABCMeta): 9 | 10 | def __init__(self, 11 | num, 12 | pos_fraction, 13 | neg_pos_ub=-1, 14 | add_gt_as_proposals=True, 15 | **kwargs): 16 | self.num = num 17 | self.pos_fraction = pos_fraction 18 | self.neg_pos_ub = neg_pos_ub 19 | self.add_gt_as_proposals = add_gt_as_proposals 20 | self.pos_sampler = self 21 | self.neg_sampler = self 22 | 23 | @abstractmethod 24 | def _sample_pos(self, assign_result, num_expected, **kwargs): 25 | pass 26 | 27 | @abstractmethod 28 | def _sample_neg(self, assign_result, num_expected, **kwargs): 29 | pass 30 | 31 | def sample(self, 32 | assign_result, 33 | bboxes, 34 | gt_bboxes, 35 | gt_labels=None, 36 | **kwargs): 37 | """Sample positive and negative bboxes. 38 | 39 | This is a simple implementation of bbox sampling given candidates, 40 | assigning results and ground truth bboxes. 41 | 42 | Args: 43 | assign_result (:obj:`AssignResult`): Bbox assigning results. 44 | bboxes (Tensor): Boxes to be sampled from. 45 | gt_bboxes (Tensor): Ground truth bboxes. 46 | gt_labels (Tensor, optional): Class labels of ground truth bboxes. 47 | 48 | Returns: 49 | :obj:`SamplingResult`: Sampling result. 50 | """ 51 | bboxes = bboxes[:, :4] 52 | 53 | gt_flags = bboxes.new_zeros((bboxes.shape[0], ), dtype=torch.uint8) 54 | if self.add_gt_as_proposals: 55 | bboxes = torch.cat([gt_bboxes, bboxes], dim=0) 56 | assign_result.add_gt_(gt_labels) 57 | gt_ones = bboxes.new_ones(gt_bboxes.shape[0], dtype=torch.uint8) 58 | gt_flags = torch.cat([gt_ones, gt_flags]) 59 | 60 | num_expected_pos = int(self.num * self.pos_fraction) 61 | pos_inds = self.pos_sampler._sample_pos( 62 | assign_result, num_expected_pos, bboxes=bboxes, **kwargs) 63 | # We found that sampled indices have duplicated items occasionally. 
64 | # (may be a bug of PyTorch) 65 | pos_inds = pos_inds.unique() 66 | num_sampled_pos = pos_inds.numel() 67 | num_expected_neg = self.num - num_sampled_pos 68 | if self.neg_pos_ub >= 0: 69 | _pos = max(1, num_sampled_pos) 70 | neg_upper_bound = int(self.neg_pos_ub * _pos) 71 | if num_expected_neg > neg_upper_bound: 72 | num_expected_neg = neg_upper_bound 73 | neg_inds = self.neg_sampler._sample_neg( 74 | assign_result, num_expected_neg, bboxes=bboxes, **kwargs) 75 | neg_inds = neg_inds.unique() 76 | 77 | return SamplingResult(pos_inds, neg_inds, bboxes, gt_bboxes, 78 | assign_result, gt_flags) 79 | -------------------------------------------------------------------------------- /mmdet/core/bbox/bbox_target.py: -------------------------------------------------------------------------------- 1 | import torch 2 | 3 | from .transforms import bbox2delta 4 | from ..utils import multi_apply 5 | 6 | 7 | def bbox_target(pos_bboxes_list, 8 | neg_bboxes_list, 9 | pos_gt_bboxes_list, 10 | pos_gt_labels_list, 11 | cfg, 12 | reg_classes=1, 13 | target_means=[.0, .0, .0, .0], 14 | target_stds=[1.0, 1.0, 1.0, 1.0], 15 | concat=True): 16 | labels, label_weights, bbox_targets, bbox_weights = multi_apply( 17 | bbox_target_single, 18 | pos_bboxes_list, 19 | neg_bboxes_list, 20 | pos_gt_bboxes_list, 21 | pos_gt_labels_list, 22 | cfg=cfg, 23 | reg_classes=reg_classes, 24 | target_means=target_means, 25 | target_stds=target_stds) 26 | 27 | if concat: 28 | labels = torch.cat(labels, 0) 29 | label_weights = torch.cat(label_weights, 0) 30 | bbox_targets = torch.cat(bbox_targets, 0) 31 | bbox_weights = torch.cat(bbox_weights, 0) 32 | return labels, label_weights, bbox_targets, bbox_weights 33 | 34 | 35 | def bbox_target_single(pos_bboxes, 36 | neg_bboxes, 37 | pos_gt_bboxes, 38 | pos_gt_labels, 39 | cfg, 40 | reg_classes=1, 41 | target_means=[.0, .0, .0, .0], 42 | target_stds=[1.0, 1.0, 1.0, 1.0]): 43 | num_pos = pos_bboxes.size(0) 44 | num_neg = neg_bboxes.size(0) 45 | num_samples = num_pos + num_neg 46 | labels = pos_bboxes.new_zeros(num_samples, dtype=torch.long) 47 | label_weights = pos_bboxes.new_zeros(num_samples) 48 | bbox_targets = pos_bboxes.new_zeros(num_samples, 4) 49 | bbox_weights = pos_bboxes.new_zeros(num_samples, 4) 50 | if num_pos > 0: 51 | labels[:num_pos] = pos_gt_labels 52 | pos_weight = 1.0 if cfg.pos_weight <= 0 else cfg.pos_weight 53 | label_weights[:num_pos] = pos_weight 54 | pos_bbox_targets = bbox2delta(pos_bboxes, pos_gt_bboxes, target_means, 55 | target_stds) 56 | bbox_targets[:num_pos, :] = pos_bbox_targets 57 | bbox_weights[:num_pos, :] = 1 58 | if num_neg > 0: 59 | label_weights[-num_neg:] = 1.0 60 | 61 | return labels, label_weights, bbox_targets, bbox_weights 62 | 63 | 64 | def expand_target(bbox_targets, bbox_weights, labels, num_classes): 65 | bbox_targets_expand = bbox_targets.new_zeros((bbox_targets.size(0), 66 | 4 * num_classes)) 67 | bbox_weights_expand = bbox_weights.new_zeros((bbox_weights.size(0), 68 | 4 * num_classes)) 69 | for i in torch.nonzero(labels > 0).squeeze(-1): 70 | start, end = labels[i] * 4, (labels[i] + 1) * 4 71 | bbox_targets_expand[i, start:end] = bbox_targets[i, :] 72 | bbox_weights_expand[i, start:end] = bbox_weights[i, :] 73 | return bbox_targets_expand, bbox_weights_expand 74 | -------------------------------------------------------------------------------- /mmcv_custom/runner.py: -------------------------------------------------------------------------------- 1 | import os.path as osp 2 | import mmcv 3 | from mmcv.runner.checkpoint import 
save_checkpoint 4 | from mmcv.runner.utils import obj_from_dict 5 | import torch 6 | from .parameters import parameters 7 | 8 | 9 | class Runner(mmcv.runner.Runner): 10 | """A training helper for PyTorch. 11 | 12 | Custom version of the mmcv runner, overwriting the init_optimizer method. 13 | """ 14 | 15 | def init_optimizer(self, optimizer): 16 | """Init the optimizer. 17 | 18 | Args: 19 | optimizer (dict or :obj:`~torch.optim.Optimizer`): Either an 20 | optimizer object or a dict used for constructing the optimizer. 21 | 22 | Returns: 23 | :obj:`~torch.optim.Optimizer`: An optimizer object. 24 | 25 | Examples: 26 | >>> optimizer = dict(type='SGD', lr=0.01, momentum=0.9) 27 | >>> type(runner.init_optimizer(optimizer)) 28 | <class 'torch.optim.sgd.SGD'> 29 | """ 30 | if isinstance(optimizer, dict): 31 | optimizer = obj_from_dict( 32 | optimizer, torch.optim, 33 | dict(params=parameters(self.model, optimizer.lr))) 34 | elif not isinstance(optimizer, torch.optim.Optimizer): 35 | raise TypeError( 36 | 'optimizer must be either an Optimizer object or a dict, ' 37 | 'but got {}'.format(type(optimizer))) 38 | return optimizer 39 | 40 | def resume(self, checkpoint, resume_optimizer=True, 41 | map_location='default'): 42 | if map_location == 'default': 43 | device_id = torch.cuda.current_device() 44 | checkpoint = self.load_checkpoint( 45 | checkpoint, 46 | map_location=lambda storage, loc: storage.cuda(device_id)) 47 | else: 48 | checkpoint = self.load_checkpoint( 49 | checkpoint, map_location=map_location) 50 | 51 | self._epoch = checkpoint['meta']['epoch'] 52 | self._iter = checkpoint['meta']['iter'] 53 | if 'optimizer' in checkpoint and resume_optimizer: 54 | self.optimizer.load_state_dict(checkpoint['optimizer']) 55 | 56 | self.logger.info('resumed epoch %d, iter %d', self.epoch, self.iter) 57 | 58 | def auto_resume(self): 59 | linkname = osp.join(self.work_dir, 'latest.pth') 60 | if osp.exists(linkname): 61 | self.logger.info('latest checkpoint found') 62 | self.resume(linkname) 63 | 64 | def save_checkpoint(self, 65 | out_dir, 66 | filename_tmpl='epoch_{}.pth', 67 | save_optimizer=True, 68 | meta=None): 69 | if meta is None: 70 | meta = dict(epoch=self.epoch + 1, iter=self.iter) 71 | else: 72 | meta.update(epoch=self.epoch + 1, iter=self.iter) 73 | 74 | filename = filename_tmpl.format(self.epoch + 1) 75 | filepath = osp.join(out_dir, filename) 76 | optimizer = self.optimizer if save_optimizer else None 77 | save_checkpoint(self.model, filepath, optimizer=optimizer, meta=meta) 78 | -------------------------------------------------------------------------------- /mmdet/ops/roi_pool/src/roi_pool_cuda.cpp: -------------------------------------------------------------------------------- 1 | #include <torch/extension.h> 2 | 3 | #include <cmath> 4 | #include <vector> 5 | 6 | int ROIPoolForwardLaucher(const at::Tensor features, const at::Tensor rois, 7 | const float spatial_scale, const int channels, 8 | const int height, const int width, const int num_rois, 9 | const int pooled_h, const int pooled_w, 10 | at::Tensor output, at::Tensor argmax); 11 | 12 | int ROIPoolBackwardLaucher(const at::Tensor top_grad, const at::Tensor rois, 13 | const at::Tensor argmax, const float spatial_scale, 14 | const int batch_size, const int channels, 15 | const int height, const int width, 16 | const int num_rois, const int pooled_h, 17 | const int pooled_w, at::Tensor bottom_grad); 18 | 19 | #define CHECK_CUDA(x) AT_CHECK(x.type().is_cuda(), #x, " must be a CUDAtensor ") 20 | #define CHECK_CONTIGUOUS(x) \ 21 | AT_CHECK(x.is_contiguous(), #x, " must be contiguous ") 22 | #define CHECK_INPUT(x)
\ 23 | CHECK_CUDA(x); \ 24 | CHECK_CONTIGUOUS(x) 25 | 26 | int roi_pooling_forward_cuda(at::Tensor features, at::Tensor rois, 27 | int pooled_height, int pooled_width, 28 | float spatial_scale, at::Tensor output, 29 | at::Tensor argmax) { 30 | CHECK_INPUT(features); 31 | CHECK_INPUT(rois); 32 | CHECK_INPUT(output); 33 | CHECK_INPUT(argmax); 34 | 35 | // Number of ROIs 36 | int num_rois = rois.size(0); 37 | int size_rois = rois.size(1); 38 | 39 | if (size_rois != 5) { 40 | printf("wrong roi size\n"); 41 | return 0; 42 | } 43 | 44 | int channels = features.size(1); 45 | int height = features.size(2); 46 | int width = features.size(3); 47 | 48 | ROIPoolForwardLaucher(features, rois, spatial_scale, channels, height, width, 49 | num_rois, pooled_height, pooled_width, output, argmax); 50 | 51 | return 1; 52 | } 53 | 54 | int roi_pooling_backward_cuda(at::Tensor top_grad, at::Tensor rois, 55 | at::Tensor argmax, float spatial_scale, 56 | at::Tensor bottom_grad) { 57 | CHECK_INPUT(top_grad); 58 | CHECK_INPUT(rois); 59 | CHECK_INPUT(argmax); 60 | CHECK_INPUT(bottom_grad); 61 | 62 | int pooled_height = top_grad.size(2); 63 | int pooled_width = top_grad.size(3); 64 | int num_rois = rois.size(0); 65 | int size_rois = rois.size(1); 66 | 67 | if (size_rois != 5) { 68 | printf("wrong roi size\n"); 69 | return 0; 70 | } 71 | int batch_size = bottom_grad.size(0); 72 | int channels = bottom_grad.size(1); 73 | int height = bottom_grad.size(2); 74 | int width = bottom_grad.size(3); 75 | 76 | ROIPoolBackwardLaucher(top_grad, rois, argmax, spatial_scale, batch_size, 77 | channels, height, width, num_rois, pooled_height, 78 | pooled_width, bottom_grad); 79 | 80 | return 1; 81 | } 82 | 83 | PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) { 84 | m.def("forward", &roi_pooling_forward_cuda, "Roi_Pooling forward (CUDA)"); 85 | m.def("backward", &roi_pooling_backward_cuda, "Roi_Pooling backward (CUDA)"); 86 | } 87 | -------------------------------------------------------------------------------- /mmdet/ops/roi_align/src/roi_align_cuda.cpp: -------------------------------------------------------------------------------- 1 | #include <torch/extension.h> 2 | 3 | #include <cmath> 4 | #include <vector> 5 | 6 | int ROIAlignForwardLaucher(const at::Tensor features, const at::Tensor rois, 7 | const float spatial_scale, const int sample_num, 8 | const int channels, const int height, 9 | const int width, const int num_rois, 10 | const int pooled_height, const int pooled_width, 11 | at::Tensor output); 12 | 13 | int ROIAlignBackwardLaucher(const at::Tensor top_grad, const at::Tensor rois, 14 | const float spatial_scale, const int sample_num, 15 | const int channels, const int height, 16 | const int width, const int num_rois, 17 | const int pooled_height, const int pooled_width, 18 | at::Tensor bottom_grad); 19 | 20 | #define CHECK_CUDA(x) AT_CHECK(x.type().is_cuda(), #x, " must be a CUDA tensor ") 21 | #define CHECK_CONTIGUOUS(x) \ 22 | AT_CHECK(x.is_contiguous(), #x, " must be contiguous ") 23 | #define CHECK_INPUT(x) \ 24 | CHECK_CUDA(x); \ 25 | CHECK_CONTIGUOUS(x) 26 | 27 | int roi_align_forward_cuda(at::Tensor features, at::Tensor rois, 28 | int pooled_height, int pooled_width, 29 | float spatial_scale, int sample_num, 30 | at::Tensor output) { 31 | CHECK_INPUT(features); 32 | CHECK_INPUT(rois); 33 | CHECK_INPUT(output); 34 | 35 | // Number of ROIs 36 | int num_rois = rois.size(0); 37 | int size_rois = rois.size(1); 38 | 39 | if (size_rois != 5) { 40 | printf("wrong roi size\n"); 41 | return 0; 42 | } 43 | 44 | int num_channels = features.size(1); 45 | int data_height
= features.size(2); 46 | int data_width = features.size(3); 47 | 48 | ROIAlignForwardLaucher(features, rois, spatial_scale, sample_num, 49 | num_channels, data_height, data_width, num_rois, 50 | pooled_height, pooled_width, output); 51 | 52 | return 1; 53 | } 54 | 55 | int roi_align_backward_cuda(at::Tensor top_grad, at::Tensor rois, 56 | int pooled_height, int pooled_width, 57 | float spatial_scale, int sample_num, 58 | at::Tensor bottom_grad) { 59 | CHECK_INPUT(top_grad); 60 | CHECK_INPUT(rois); 61 | CHECK_INPUT(bottom_grad); 62 | 63 | // Number of ROIs 64 | int num_rois = rois.size(0); 65 | int size_rois = rois.size(1); 66 | if (size_rois != 5) { 67 | printf("wrong roi size\n"); 68 | return 0; 69 | } 70 | 71 | int num_channels = bottom_grad.size(1); 72 | int data_height = bottom_grad.size(2); 73 | int data_width = bottom_grad.size(3); 74 | 75 | ROIAlignBackwardLaucher(top_grad, rois, spatial_scale, sample_num, 76 | num_channels, data_height, data_width, num_rois, 77 | pooled_height, pooled_width, bottom_grad); 78 | 79 | return 1; 80 | } 81 | 82 | PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) { 83 | m.def("forward", &roi_align_forward_cuda, "Roi_Align forward (CUDA)"); 84 | m.def("backward", &roi_align_backward_cuda, "Roi_Align backward (CUDA)"); 85 | } 86 | -------------------------------------------------------------------------------- /mmdet/utils/distributed.py: -------------------------------------------------------------------------------- 1 | import os 2 | import numpy as np 3 | import subprocess 4 | from contextlib import contextmanager 5 | import logging 6 | 7 | 8 | def ompi_rank(): 9 | """Find OMPI world rank without calling mpi functions 10 | :rtype: int 11 | """ 12 | return int(os.environ.get('OMPI_COMM_WORLD_RANK') or 0) 13 | 14 | 15 | def ompi_size(): 16 | """Find OMPI world size without calling mpi functions 17 | :rtype: int 18 | """ 19 | return int(os.environ.get('OMPI_COMM_WORLD_SIZE') or 1) 20 | 21 | 22 | def ompi_local_rank(): 23 | """Find OMPI local rank without calling mpi functions 24 | :rtype: int 25 | """ 26 | return int(os.environ.get('OMPI_COMM_WORLD_LOCAL_RANK') or 0) 27 | 28 | 29 | def ompi_local_size(): 30 | """Find OMPI local size without calling mpi functions 31 | :rtype: int 32 | """ 33 | return int(os.environ.get('OMPI_COMM_WORLD_LOCAL_SIZE') or 1) 34 | 35 | 36 | def ompi_universe_size(): 37 | """Find OMPI universe size without calling mpi functions 38 | :rtype: int 39 | """ 40 | return int(os.environ.get('OMPI_UNIVERSE_SIZE') or 1) 41 | 42 | 43 | @contextmanager 44 | def run_and_terminate_process(*args, **kwargs): 45 | """Run a process and terminate it at the end 46 | """ 47 | p = None 48 | try: 49 | p = subprocess.Popen(*args, **kwargs) 50 | yield p 51 | finally: 52 | if not p: 53 | return 54 | try: 55 | p.terminate() # send sigterm 56 | except OSError: 57 | pass 58 | try: 59 | p.kill() # send sigkill 60 | except OSError: 61 | pass 62 | 63 | 64 | def get_gpus_nocache(): 65 | """List of NVIDIA GPUs 66 | """ 67 | cmds = 'nvidia-smi --query-gpu=name --format=csv,noheader'.split(' ') 68 | with run_and_terminate_process( 69 | cmds, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, 70 | bufsize=1) as process: 71 | return [ 72 | line.decode().strip() for line in iter(process.stdout.readline, b'') 73 | ] 74 | 75 | 76 | _GPUS = get_gpus_nocache() 77 | 78 | 79 | def get_gpus(): 80 | """List of NVIDIA GPUs 81 | """ 82 | return _GPUS 83 | 84 | 85 | def gpu_indices(divisible=True): 86 | """Get the GPU device indices for this process/rank 87 | :param divisible: whether each local rank
must be assigned the same number of GPUs 88 | :rtype: list[int] 89 | """ 90 | local_size = ompi_local_size() 91 | local_rank = ompi_local_rank() 92 | assert 0 <= local_rank < local_size, \ 93 | "Invalid local_rank: {} local_size: {}".format(local_rank, local_size) 94 | gpu_count = len(get_gpus()) 95 | assert gpu_count >= local_size > 0, \ 96 | "GPU count: {} must be >= LOCAL_SIZE: {} > 0".format(gpu_count, local_size) 97 | if divisible: 98 | ngpu = int(gpu_count / local_size) 99 | gpus = np.arange(local_rank * ngpu, (local_rank + 1) * ngpu) 100 | if gpu_count % local_size != 0: 101 | logging.warning( 102 | "gpu_count: {} not divisible by local_size: {}; " 103 | "some GPUs may be unused".format(gpu_count, local_size)) 104 | else: 105 | gpus = np.array_split(range(gpu_count), local_size)[local_rank] 106 | return gpus.astype(int) 107 | -------------------------------------------------------------------------------- /mmdet/models/roi_extractors/single_level.py: -------------------------------------------------------------------------------- 1 | from __future__ import division 2 | 3 | import torch 4 | import torch.nn as nn 5 | 6 | from mmdet import ops 7 | from ..registry import ROI_EXTRACTORS 8 | 9 | 10 | @ROI_EXTRACTORS.register_module 11 | class SingleRoIExtractor(nn.Module): 12 | """Extract RoI features from a single level feature map. 13 | 14 | If there are multiple input feature levels, each RoI is mapped to a level 15 | according to its scale. 16 | 17 | Args: 18 | roi_layer (dict): Specify RoI layer type and arguments. 19 | out_channels (int): Output channels of RoI layers. 20 | featmap_strides (list[int]): Strides of input feature maps. 21 | finest_scale (int): Scale threshold of mapping to level 0. 22 | """ 23 | 24 | def __init__(self, 25 | roi_layer, 26 | out_channels, 27 | featmap_strides, 28 | finest_scale=56): 29 | super(SingleRoIExtractor, self).__init__() 30 | self.roi_layers = self.build_roi_layers(roi_layer, featmap_strides) 31 | self.out_channels = out_channels 32 | self.featmap_strides = featmap_strides 33 | self.finest_scale = finest_scale 34 | 35 | @property 36 | def num_inputs(self): 37 | """int: Input feature map levels.""" 38 | return len(self.featmap_strides) 39 | 40 | def init_weights(self): 41 | pass 42 | 43 | def build_roi_layers(self, layer_cfg, featmap_strides): 44 | cfg = layer_cfg.copy() 45 | layer_type = cfg.pop('type') 46 | assert hasattr(ops, layer_type) 47 | layer_cls = getattr(ops, layer_type) 48 | roi_layers = nn.ModuleList( 49 | [layer_cls(spatial_scale=1 / s, **cfg) for s in featmap_strides]) 50 | return roi_layers 51 | 52 | def map_roi_levels(self, rois, num_levels): 53 | """Map rois to corresponding feature levels by scales. 54 | 55 | - scale < finest_scale: level 0 56 | - finest_scale <= scale < finest_scale * 2: level 1 57 | - finest_scale * 2 <= scale < finest_scale * 4: level 2 58 | - scale >= finest_scale * 4: level 3 59 | 60 | Args: 61 | rois (Tensor): Input RoIs, shape (k, 5). 62 | num_levels (int): Total level number.
63 | 64 | Returns: 65 | Tensor: Level index (0-based) of each RoI, shape (k, ) 66 | """ 67 | scale = torch.sqrt( 68 | (rois[:, 3] - rois[:, 1] + 1) * (rois[:, 4] - rois[:, 2] + 1)) 69 | target_lvls = torch.floor(torch.log2(scale / self.finest_scale + 1e-6)) 70 | target_lvls = target_lvls.clamp(min=0, max=num_levels - 1).long() 71 | return target_lvls 72 | 73 | def forward(self, feats, rois): 74 | if len(feats) == 1: 75 | return self.roi_layers[0](feats[0], rois) 76 | 77 | out_size = self.roi_layers[0].out_size 78 | num_levels = len(feats) 79 | target_lvls = self.map_roi_levels(rois, num_levels) 80 | roi_feats = torch.cuda.FloatTensor(rois.size()[0], self.out_channels, 81 | out_size, out_size).fill_(0) 82 | for i in range(num_levels): 83 | inds = target_lvls == i 84 | if inds.any(): 85 | rois_ = rois[inds, :] 86 | roi_feats_t = self.roi_layers[i](feats[i], rois_) 87 | roi_feats[inds] += roi_feats_t 88 | return roi_feats 89 | -------------------------------------------------------------------------------- /mmdet/core/anchor/anchor_generator.py: -------------------------------------------------------------------------------- 1 | import torch 2 | 3 | 4 | class AnchorGenerator(object): 5 | 6 | def __init__(self, base_size, scales, ratios, scale_major=True, ctr=None): 7 | self.base_size = base_size 8 | self.scales = torch.Tensor(scales) 9 | self.ratios = torch.Tensor(ratios) 10 | self.scale_major = scale_major 11 | self.ctr = ctr 12 | self.base_anchors = self.gen_base_anchors() 13 | 14 | @property 15 | def num_base_anchors(self): 16 | return self.base_anchors.size(0) 17 | 18 | def gen_base_anchors(self): 19 | w = self.base_size 20 | h = self.base_size 21 | if self.ctr is None: 22 | x_ctr = 0.5 * (w - 1) 23 | y_ctr = 0.5 * (h - 1) 24 | else: 25 | x_ctr, y_ctr = self.ctr 26 | 27 | h_ratios = torch.sqrt(self.ratios) 28 | w_ratios = 1 / h_ratios 29 | if self.scale_major: 30 | ws = (w * w_ratios[:, None] * self.scales[None, :]).view(-1) 31 | hs = (h * h_ratios[:, None] * self.scales[None, :]).view(-1) 32 | else: 33 | ws = (w * self.scales[:, None] * w_ratios[None, :]).view(-1) 34 | hs = (h * self.scales[:, None] * h_ratios[None, :]).view(-1) 35 | 36 | base_anchors = torch.stack( 37 | [ 38 | x_ctr - 0.5 * (ws - 1), y_ctr - 0.5 * (hs - 1), 39 | x_ctr + 0.5 * (ws - 1), y_ctr + 0.5 * (hs - 1) 40 | ], 41 | dim=-1).round() 42 | 43 | return base_anchors 44 | 45 | def _meshgrid(self, x, y, row_major=True): 46 | xx = x.repeat(len(y)) 47 | yy = y.view(-1, 1).repeat(1, len(x)).view(-1) 48 | if row_major: 49 | return xx, yy 50 | else: 51 | return yy, xx 52 | 53 | def grid_anchors(self, featmap_size, stride=16, device='cuda'): 54 | base_anchors = self.base_anchors.to(device) 55 | 56 | feat_h, feat_w = featmap_size 57 | shift_x = torch.arange(0, feat_w, device=device) * stride 58 | shift_y = torch.arange(0, feat_h, device=device) * stride 59 | shift_xx, shift_yy = self._meshgrid(shift_x, shift_y) 60 | shifts = torch.stack([shift_xx, shift_yy, shift_xx, shift_yy], dim=-1) 61 | shifts = shifts.type_as(base_anchors) 62 | # first feat_w elements correspond to the first row of shifts 63 | # add A anchors (1, A, 4) to K shifts (K, 1, 4) to get 64 | # shifted anchors (K, A, 4), reshape to (K*A, 4) 65 | 66 | all_anchors = base_anchors[None, :, :] + shifts[:, None, :] 67 | all_anchors = all_anchors.view(-1, 4) 68 | # first A rows correspond to A anchors of (0, 0) in feature map, 69 | # then (0, 1), (0, 2), ... 
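        # A minimal sketch of the broadcast above, with hypothetical sizes
        # (A = 3 base anchors, K = 25 shifts from a 5x5 feature map):
        #   >>> base = torch.zeros(3, 4)
        #   >>> shifts = torch.zeros(25, 4)
        #   >>> (base[None, :, :] + shifts[:, None, :]).view(-1, 4).shape
        #   torch.Size([75, 4])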
70 | return all_anchors 71 | 72 | def valid_flags(self, featmap_size, valid_size, device='cuda'): 73 | feat_h, feat_w = featmap_size 74 | valid_h, valid_w = valid_size 75 | assert valid_h <= feat_h and valid_w <= feat_w 76 | valid_x = torch.zeros(feat_w, dtype=torch.uint8, device=device) 77 | valid_y = torch.zeros(feat_h, dtype=torch.uint8, device=device) 78 | valid_x[:valid_w] = 1 79 | valid_y[:valid_h] = 1 80 | valid_xx, valid_yy = self._meshgrid(valid_x, valid_y) 81 | valid = valid_xx & valid_yy 82 | valid = valid[:, None].expand( 83 | valid.size(0), self.num_base_anchors).contiguous().view(-1) 84 | return valid 85 | -------------------------------------------------------------------------------- /setup.py: -------------------------------------------------------------------------------- 1 | import os 2 | import subprocess 3 | import time 4 | from setuptools import find_packages, setup 5 | 6 | 7 | def readme(): 8 | with open('README.md', encoding='utf-8') as f: 9 | content = f.read() 10 | return content 11 | 12 | 13 | MAJOR = 0 14 | MINOR = 6 15 | PATCH = 0 16 | SUFFIX = '' 17 | SHORT_VERSION = '{}.{}.{}{}'.format(MAJOR, MINOR, PATCH, SUFFIX) 18 | 19 | version_file = 'mmdet/version.py' 20 | 21 | 22 | def get_git_hash(): 23 | 24 | def _minimal_ext_cmd(cmd): 25 | # construct minimal environment 26 | env = {} 27 | for k in ['SYSTEMROOT', 'PATH', 'HOME']: 28 | v = os.environ.get(k) 29 | if v is not None: 30 | env[k] = v 31 | # LANGUAGE is used on win32 32 | env['LANGUAGE'] = 'C' 33 | env['LANG'] = 'C' 34 | env['LC_ALL'] = 'C' 35 | out = subprocess.Popen( 36 | cmd, stdout=subprocess.PIPE, env=env).communicate()[0] 37 | return out 38 | 39 | try: 40 | out = _minimal_ext_cmd(['git', 'rev-parse', 'HEAD']) 41 | sha = out.strip().decode('ascii') 42 | except OSError: 43 | sha = 'unknown' 44 | 45 | return sha 46 | 47 | 48 | def get_hash(): 49 | if os.path.exists('.git'): 50 | sha = get_git_hash()[:7] 51 | elif os.path.exists(version_file): 52 | try: 53 | from mmdet.version import __version__ 54 | sha = __version__.split('+')[-1] 55 | except ImportError: 56 | raise ImportError('Unable to get git version') 57 | else: 58 | sha = 'unknown' 59 | 60 | return sha 61 | 62 | 63 | def write_version_py(): 64 | content = """# GENERATED VERSION FILE 65 | # TIME: {} 66 | 67 | __version__ = '{}' 68 | short_version = '{}' 69 | """ 70 | sha = get_hash() 71 | VERSION = SHORT_VERSION + '+' + sha 72 | 73 | with open(version_file, 'w') as f: 74 | f.write(content.format(time.asctime(), VERSION, SHORT_VERSION)) 75 | 76 | 77 | def get_version(): 78 | with open(version_file, 'r') as f: 79 | exec(compile(f.read(), version_file, 'exec')) 80 | return locals()['__version__'] 81 | 82 | 83 | if __name__ == '__main__': 84 | write_version_py() 85 | setup( 86 | name='mmdet', 87 | version=get_version(), 88 | description='Open MMLab Detection Toolbox', 89 | long_description=readme(), 90 | keywords='computer vision, object detection', 91 | url='https://github.com/open-mmlab/mmdetection', 92 | packages=find_packages(exclude=('configs', 'tools', 'demo')), 93 | package_data={'mmdet.ops': ['*/*.so']}, 94 | classifiers=[ 95 | 'Development Status :: 4 - Beta', 96 | 'License :: OSI Approved :: Apache Software License', 97 | 'Operating System :: OS Independent', 98 | 'Programming Language :: Python :: 2', 99 | 'Programming Language :: Python :: 2.7', 100 | 'Programming Language :: Python :: 3', 101 | 'Programming Language :: Python :: 3.4', 102 | 'Programming Language :: Python :: 3.5', 103 | 'Programming Language :: Python :: 3.6', 104 | ], 
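        # Worked example of the version logic above (hash is hypothetical):
        #   SHORT_VERSION = '0.6.0', get_hash() -> '1a2b3c4'
        #   __version__ written to mmdet/version.py -> '0.6.0+1a2b3c4'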
105 | license='Apache License 2.0', 106 | setup_requires=['pytest-runner'], 107 | tests_require=['pytest'], 108 | install_requires=[ 109 | 'mmcv>=0.2.6', 'numpy', 'matplotlib', 'six', 'terminaltables', 110 | 'pycocotools' 111 | ], 112 | zip_safe=False) 113 | -------------------------------------------------------------------------------- /mmdet/core/post_processing/merge_augs.py: -------------------------------------------------------------------------------- 1 | import torch 2 | 3 | import numpy as np 4 | 5 | from mmdet.ops import nms 6 | from ..bbox import bbox_mapping_back 7 | 8 | 9 | def merge_aug_proposals(aug_proposals, img_metas, rpn_test_cfg): 10 | """Merge augmented proposals (multiscale, flip, etc.) 11 | 12 | Args: 13 | aug_proposals (list[Tensor]): proposals from different testing 14 | schemes, shape (n, 5). Note that they are not rescaled to the 15 | original image size. 16 | img_metas (list[dict]): image info including "img_shape", "scale_factor" and "flip". 17 | rpn_test_cfg (dict): rpn test config. 18 | 19 | Returns: 20 | Tensor: shape (n, 5), proposals with scores, corresponding to the original image scale. 21 | """ 22 | recovered_proposals = [] 23 | for proposals, img_info in zip(aug_proposals, img_metas): 24 | img_shape = img_info['img_shape'] 25 | scale_factor = img_info['scale_factor'] 26 | flip = img_info['flip'] 27 | _proposals = proposals.clone() 28 | _proposals[:, :4] = bbox_mapping_back(_proposals[:, :4], img_shape, 29 | scale_factor, flip) 30 | recovered_proposals.append(_proposals) 31 | aug_proposals = torch.cat(recovered_proposals, dim=0) 32 | merged_proposals, _ = nms(aug_proposals, rpn_test_cfg.nms_thr) 33 | scores = merged_proposals[:, 4] 34 | _, order = scores.sort(0, descending=True) 35 | num = min(rpn_test_cfg.max_num, merged_proposals.shape[0]) 36 | order = order[:num] 37 | merged_proposals = merged_proposals[order, :] 38 | return merged_proposals 39 | 40 | 41 | def merge_aug_bboxes(aug_bboxes, aug_scores, img_metas, rcnn_test_cfg): 42 | """Merge augmented detection bboxes and scores. 43 | 44 | Args: 45 | aug_bboxes (list[Tensor]): shape (n, 4*#class) 46 | aug_scores (list[Tensor] or None): shape (n, #class) 47 | img_metas (list[list[dict]]): image info for each augmentation. 48 | rcnn_test_cfg (dict): rcnn test config. 49 | 50 | Returns: 51 | tuple: (bboxes, scores) 52 | """ 53 | recovered_bboxes = [] 54 | for bboxes, img_info in zip(aug_bboxes, img_metas): 55 | img_shape = img_info[0]['img_shape'] 56 | scale_factor = img_info[0]['scale_factor'] 57 | flip = img_info[0]['flip'] 58 | bboxes = bbox_mapping_back(bboxes, img_shape, scale_factor, flip) 59 | recovered_bboxes.append(bboxes) 60 | bboxes = torch.stack(recovered_bboxes).mean(dim=0) 61 | if aug_scores is None: 62 | return bboxes 63 | else: 64 | scores = torch.stack(aug_scores).mean(dim=0) 65 | return bboxes, scores 66 | 67 | 68 | def merge_aug_scores(aug_scores): 69 | """Merge augmented bbox scores.""" 70 | if isinstance(aug_scores[0], torch.Tensor): 71 | return torch.mean(torch.stack(aug_scores), dim=0) 72 | else: 73 | return np.mean(aug_scores, axis=0) 74 | 75 | 76 | def merge_aug_masks(aug_masks, img_metas, rcnn_test_cfg, weights=None): 77 | """Merge augmented mask prediction. 78 | 79 | Args: 80 | aug_masks (list[ndarray]): shape (n, #class, h, w) 81 | img_metas (list[list[dict]]): image info for each augmentation. 82 | rcnn_test_cfg (dict): rcnn test config.
83 | 84 | Returns: 85 | ndarray: merged masks. 86 | """ 87 | recovered_masks = [ 88 | mask if not img_info[0]['flip'] else mask[..., ::-1] 89 | for mask, img_info in zip(aug_masks, img_metas) 90 | ] 91 | if weights is None: 92 | merged_masks = np.mean(recovered_masks, axis=0) 93 | else: 94 | merged_masks = np.average( 95 | np.array(recovered_masks), axis=0, weights=np.array(weights)) 96 | return merged_masks 97 | -------------------------------------------------------------------------------- /mmdet/models/detectors/rpn.py: -------------------------------------------------------------------------------- 1 | import mmcv 2 | 3 | from mmdet.core import tensor2imgs, bbox_mapping 4 | from .base import BaseDetector 5 | from .test_mixins import RPNTestMixin 6 | from .. import builder 7 | from ..registry import DETECTORS 8 | 9 | 10 | @DETECTORS.register_module 11 | class RPN(BaseDetector, RPNTestMixin): 12 | 13 | def __init__(self, 14 | backbone, 15 | neck, 16 | rpn_head, 17 | train_cfg, 18 | test_cfg, 19 | pretrained=None): 20 | super(RPN, self).__init__() 21 | self.backbone = builder.build_backbone(backbone) 22 | self.neck = builder.build_neck(neck) if neck is not None else None 23 | self.rpn_head = builder.build_head(rpn_head) 24 | self.train_cfg = train_cfg 25 | self.test_cfg = test_cfg 26 | self.init_weights(pretrained=pretrained) 27 | 28 | def init_weights(self, pretrained=None): 29 | super(RPN, self).init_weights(pretrained) 30 | self.backbone.init_weights(pretrained=pretrained) 31 | if self.with_neck: 32 | self.neck.init_weights() 33 | self.rpn_head.init_weights() 34 | 35 | def extract_feat(self, img): 36 | x = self.backbone(img) 37 | if self.with_neck: 38 | x = self.neck(x) 39 | return x 40 | 41 | def forward_train(self, 42 | img, 43 | img_meta, 44 | gt_bboxes=None, 45 | gt_bboxes_ignore=None): 46 | if self.train_cfg.rpn.get('debug', False): 47 | self.rpn_head.debug_imgs = tensor2imgs(img) 48 | 49 | x = self.extract_feat(img) 50 | rpn_outs = self.rpn_head(x) 51 | 52 | rpn_loss_inputs = rpn_outs + (gt_bboxes, img_meta, self.train_cfg.rpn) 53 | losses = self.rpn_head.loss( 54 | *rpn_loss_inputs, gt_bboxes_ignore=gt_bboxes_ignore) 55 | return losses 56 | 57 | def simple_test(self, img, img_meta, rescale=False): 58 | x = self.extract_feat(img) 59 | proposal_list = self.simple_test_rpn(x, img_meta, self.test_cfg.rpn) 60 | if rescale: 61 | for proposals, meta in zip(proposal_list, img_meta): 62 | proposals[:, :4] /= meta['scale_factor'] 63 | # TODO: remove this restriction 64 | return proposal_list[0].cpu().numpy() 65 | 66 | def aug_test(self, imgs, img_metas, rescale=False): 67 | proposal_list = self.aug_test_rpn( 68 | self.extract_feats(imgs), img_metas, self.test_cfg.rpn) 69 | if not rescale: 70 | for proposals, img_meta in zip(proposal_list, img_metas[0]): 71 | img_shape = img_meta['img_shape'] 72 | scale_factor = img_meta['scale_factor'] 73 | flip = img_meta['flip'] 74 | proposals[:, :4] = bbox_mapping(proposals[:, :4], img_shape, 75 | scale_factor, flip) 76 | # TODO: remove this restriction 77 | return proposal_list[0].cpu().numpy() 78 | 79 | def show_result(self, data, result, img_norm_cfg, dataset=None, top_k=20): 80 | """Show RPN proposals on the image. 81 | 82 | Although we assume batch size is 1, this method supports arbitrary 83 | batch size.
84 | """ 85 | img_tensor = data['img'][0] 86 | img_metas = data['img_meta'][0].data[0] 87 | imgs = tensor2imgs(img_tensor, **img_norm_cfg) 88 | assert len(imgs) == len(img_metas) 89 | for img, img_meta in zip(imgs, img_metas): 90 | h, w, _ = img_meta['img_shape'] 91 | img_show = img[:h, :w, :] 92 | mmcv.imshow_bboxes(img_show, result, top_k=top_k) 93 | -------------------------------------------------------------------------------- /mmcv_custom/zipreader.py: -------------------------------------------------------------------------------- 1 | import zipfile 2 | import os 3 | 4 | import cv2 5 | import numpy as np 6 | 7 | from mmcv.utils import is_str 8 | from mmcv.opencv_info import USE_OPENCV2 9 | 10 | if not USE_OPENCV2: 11 | from cv2 import IMREAD_COLOR, IMREAD_GRAYSCALE, IMREAD_UNCHANGED 12 | else: 13 | from cv2 import CV_LOAD_IMAGE_COLOR as IMREAD_COLOR 14 | from cv2 import CV_LOAD_IMAGE_GRAYSCALE as IMREAD_GRAYSCALE 15 | from cv2 import CV_LOAD_IMAGE_UNCHANGED as IMREAD_UNCHANGED 16 | 17 | imread_flags = { 18 | 'color': IMREAD_COLOR, 19 | 'grayscale': IMREAD_GRAYSCALE, 20 | 'unchanged': IMREAD_UNCHANGED 21 | } 22 | 23 | 24 | class ZipReader(object): 25 | zip_bank = dict() 26 | 27 | def __init__(self): 28 | super(ZipReader, self).__init__() 29 | 30 | @staticmethod 31 | def get_zipfile(path): 32 | zip_bank = ZipReader.zip_bank 33 | if path in zip_bank: 34 | return zip_bank[path] 35 | else: 36 | # open the archive once and cache it for later reads 37 | zfile = zipfile.ZipFile(path, 'r') 38 | zip_bank[path] = zfile 39 | return zip_bank[path] 40 | 41 | @staticmethod 42 | def split_zip_style_path(path): 43 | pos_at = path.find('@') 44 | if pos_at == -1: 45 | raise ValueError( 46 | "character '@' is not found from the given path '%s'" % path) 47 | 48 | zip_path = path[0:pos_at] 49 | folder_path = path[pos_at + 1:] 50 | folder_path = str.strip(folder_path, '/') 51 | return zip_path, folder_path 52 | 53 | @staticmethod 54 | def list_folder(path): 55 | zip_path, folder_path = ZipReader.split_zip_style_path(path) 56 | 57 | zfile = ZipReader.get_zipfile(zip_path) 58 | folder_list = [] 59 | for file_folder_name in zfile.namelist(): 60 | file_folder_name = str.strip(file_folder_name, '/') 61 | if file_folder_name.startswith(folder_path) and \ 62 | len(os.path.splitext(file_folder_name)[-1]) == 0 and \ 63 | file_folder_name != folder_path: 64 | if len(folder_path) == 0: 65 | folder_list.append(file_folder_name) 66 | else: 67 | folder_list.append(file_folder_name[len(folder_path) + 1:]) 68 | 69 | return folder_list 70 | 71 | @staticmethod 72 | def list_files(path, extension=['.*']): 73 | zip_path, folder_path = ZipReader.split_zip_style_path(path) 74 | 75 | zfile = ZipReader.get_zipfile(zip_path) 76 | file_lists = [] 77 | for file_folder_name in zfile.namelist(): 78 | file_folder_name = str.strip(file_folder_name, '/') 79 | if file_folder_name.startswith(folder_path) and str.lower( 80 | os.path.splitext(file_folder_name)[-1]) in extension: 81 | if len(folder_path) == 0: 82 | file_lists.append(file_folder_name) 83 | else: 84 | file_lists.append(file_folder_name[len(folder_path) + 1:]) 85 | 86 | return file_lists 87 | 88 | @staticmethod 89 | def imread(path, flag='color'): 90 | zip_path, path_img = ZipReader.split_zip_style_path(path) 91 | zfile = ZipReader.get_zipfile(zip_path) 92 | data = zfile.read(path_img) 93 | flag = imread_flags[flag] if is_str(flag) else flag 94 | im = cv2.imdecode(np.frombuffer(data, np.uint8), flag) 95 | return im 96 | 97 | @staticmethod 98 | def read(path): 99 | zip_path, path_img =
ZipReader.split_zip_style_path(path) 100 | zfile = ZipReader.get_zipfile(zip_path) 101 | data = zfile.read(path_img) 102 | return data 103 | -------------------------------------------------------------------------------- /tools/train.py: -------------------------------------------------------------------------------- 1 | from __future__ import division 2 | import pprint 3 | import argparse 4 | from mmcv import Config 5 | 6 | from mmdet import __version__ 7 | from mmdet.datasets import get_dataset 8 | from mmdet.apis import (train_detector, init_dist, get_root_logger, 9 | set_random_seed, get_git_hash) 10 | from mmdet.models import build_detector 11 | 12 | import torch 13 | import torch.distributed as dist 14 | import time 15 | 16 | def parse_args(): 17 | parser = argparse.ArgumentParser(description='Train a detector') 18 | parser.add_argument('config', help='train config file path') 19 | parser.add_argument('--work_dir', help='the dir to save logs and models') 20 | parser.add_argument( 21 | '--resume_from', help='the checkpoint file to resume from') 22 | parser.add_argument( 23 | '--validate', 24 | action='store_true', 25 | help='whether to evaluate the checkpoint during training') 26 | parser.add_argument( 27 | '--gpus', 28 | type=int, 29 | default=1, 30 | help='number of gpus to use ' 31 | '(only applicable to non-distributed training)') 32 | parser.add_argument('--seed', type=int, default=None, help='random seed') 33 | parser.add_argument( 34 | '--launcher', 35 | choices=['none', 'pytorch', 'slurm', 'mpi'], 36 | default='none', 37 | help='job launcher') 38 | parser.add_argument('--local_rank', type=int, default=0) 39 | args, _ = parser.parse_known_args() 40 | 41 | return args 42 | 43 | 44 | def main(): 45 | args = parse_args() 46 | 47 | cfg = Config.fromfile(args.config) 48 | # set cudnn_benchmark 49 | if cfg.get('cudnn_benchmark', False): 50 | torch.backends.cudnn.benchmark = True 51 | # update configs according to CLI args 52 | 53 | cfg.work_dir = cfg.work_dir + '_' + time.strftime('Time_%m%d_%H%M%S', time.localtime()) 54 | 55 | if args.work_dir is not None: 56 | cfg.work_dir = args.work_dir 57 | if args.resume_from is not None: 58 | cfg.resume_from = args.resume_from 59 | cfg.gpus = args.gpus 60 | 61 | # init distributed env first, since logger depends on the dist info. 
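    # A sketch of how the two modes are typically launched (config path and
    # GPU count are hypothetical):
    #   non-distributed: python tools/train.py configs/faster_rcnn.py --gpus 1
    #   distributed:     ./tools/dist_train.sh configs/faster_rcnn.py 8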
62 | if args.launcher == 'none': 63 | distributed = False 64 | else: 65 | distributed = True 66 | init_dist(args.launcher, **cfg.dist_params) 67 | 68 | # init logger before other steps 69 | logger = get_root_logger(cfg.log_level) 70 | logger.info('Distributed training: {}'.format(distributed)) 71 | 72 | # log cfg 73 | logger.info('training config:{}\n'.format(pprint.pformat(cfg._cfg_dict))) 74 | 75 | # log git hash 76 | logger.info('git hash: {}'.format(get_git_hash())) 77 | 78 | # set random seeds 79 | if args.seed is not None: 80 | logger.info('Set random seed to {}'.format(args.seed)) 81 | set_random_seed(args.seed) 82 | 83 | model = build_detector( 84 | cfg.model, train_cfg=cfg.train_cfg, test_cfg=cfg.test_cfg) 85 | 86 | train_dataset = get_dataset(cfg.data.train) 87 | if cfg.checkpoint_config is not None: 88 | # save mmdet version, config file content and class names in 89 | # checkpoints as meta data 90 | cfg.checkpoint_config.meta = dict( 91 | mmdet_version=__version__, 92 | config=cfg.text, 93 | classes=train_dataset.CLASSES) 94 | # add an attribute for visualization convenience 95 | model.CLASSES = train_dataset.CLASSES 96 | train_detector( 97 | model, 98 | train_dataset, 99 | cfg, 100 | distributed=distributed, 101 | validate=args.validate, 102 | logger=logger) 103 | 104 | if __name__ == '__main__': 105 | main() 106 | -------------------------------------------------------------------------------- /mmdet/ops/dcn/src/deform_pool_cuda.cpp: -------------------------------------------------------------------------------- 1 | // modify from 2 | // https://github.com/chengdazhi/Deformable-Convolution-V2-PyTorch/blob/mmdetection/mmdet/ops/dcn/src/modulated_dcn_cuda.c 3 | 4 | // based on 5 | // author: Charles Shang 6 | // https://github.com/torch/cunn/blob/master/lib/THCUNN/generic/SpatialConvolutionMM.cu 7 | 8 | #include <torch/extension.h> 9 | 10 | #include <cmath> 11 | #include <vector> 12 | 13 | void DeformablePSROIPoolForward( 14 | const at::Tensor data, const at::Tensor bbox, const at::Tensor trans, 15 | at::Tensor out, at::Tensor top_count, const int batch, const int channels, 16 | const int height, const int width, const int num_bbox, 17 | const int channels_trans, const int no_trans, const float spatial_scale, 18 | const int output_dim, const int group_size, const int pooled_size, 19 | const int part_size, const int sample_per_part, const float trans_std); 20 | 21 | void DeformablePSROIPoolBackwardAcc( 22 | const at::Tensor out_grad, const at::Tensor data, const at::Tensor bbox, 23 | const at::Tensor trans, const at::Tensor top_count, at::Tensor in_grad, 24 | at::Tensor trans_grad, const int batch, const int channels, 25 | const int height, const int width, const int num_bbox, 26 | const int channels_trans, const int no_trans, const float spatial_scale, 27 | const int output_dim, const int group_size, const int pooled_size, 28 | const int part_size, const int sample_per_part, const float trans_std); 29 | 30 | void deform_psroi_pooling_cuda_forward( 31 | at::Tensor input, at::Tensor bbox, at::Tensor trans, at::Tensor out, 32 | at::Tensor top_count, const int no_trans, const float spatial_scale, 33 | const int output_dim, const int group_size, const int pooled_size, 34 | const int part_size, const int sample_per_part, const float trans_std) { 35 | AT_CHECK(input.is_contiguous(), "input tensor has to be contiguous"); 36 | 37 | const int batch = input.size(0); 38 | const int channels = input.size(1); 39 | const int height = input.size(2); 40 | const int width = input.size(3); 41 | const int channels_trans = no_trans ?
2 : trans.size(1); 42 | 43 | const int num_bbox = bbox.size(0); 44 | if (num_bbox != out.size(0)) 45 | AT_ERROR("Output shape and bbox number won't match: (%d vs %d).", 46 | out.size(0), num_bbox); 47 | 48 | DeformablePSROIPoolForward( 49 | input, bbox, trans, out, top_count, batch, channels, height, width, 50 | num_bbox, channels_trans, no_trans, spatial_scale, output_dim, group_size, 51 | pooled_size, part_size, sample_per_part, trans_std); 52 | } 53 | 54 | void deform_psroi_pooling_cuda_backward( 55 | at::Tensor out_grad, at::Tensor input, at::Tensor bbox, at::Tensor trans, 56 | at::Tensor top_count, at::Tensor input_grad, at::Tensor trans_grad, 57 | const int no_trans, const float spatial_scale, const int output_dim, 58 | const int group_size, const int pooled_size, const int part_size, 59 | const int sample_per_part, const float trans_std) { 60 | AT_CHECK(out_grad.is_contiguous(), "out_grad tensor has to be contiguous"); 61 | AT_CHECK(input.is_contiguous(), "input tensor has to be contiguous"); 62 | 63 | const int batch = input.size(0); 64 | const int channels = input.size(1); 65 | const int height = input.size(2); 66 | const int width = input.size(3); 67 | const int channels_trans = no_trans ? 2 : trans.size(1); 68 | 69 | const int num_bbox = bbox.size(0); 70 | if (num_bbox != out_grad.size(0)) 71 | AT_ERROR("Output shape and bbox number won't match: (%d vs %d).", 72 | out_grad.size(0), num_bbox); 73 | 74 | DeformablePSROIPoolBackwardAcc( 75 | out_grad, input, bbox, trans, top_count, input_grad, trans_grad, batch, 76 | channels, height, width, num_bbox, channels_trans, no_trans, 77 | spatial_scale, output_dim, group_size, pooled_size, part_size, 78 | sample_per_part, trans_std); 79 | } 80 | 81 | PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) { 82 | m.def("deform_psroi_pooling_cuda_forward", &deform_psroi_pooling_cuda_forward, 83 | "deform psroi pooling forward(CUDA)"); 84 | m.def("deform_psroi_pooling_cuda_backward", 85 | &deform_psroi_pooling_cuda_backward, 86 | "deform psroi pooling backward(CUDA)"); 87 | } -------------------------------------------------------------------------------- /mmdet/datasets/utils.py: -------------------------------------------------------------------------------- 1 | import copy 2 | from collections.abc import Sequence 3 | 4 | import mmcv 5 | from mmcv.runner import obj_from_dict 6 | import torch 7 | 8 | import matplotlib.pyplot as plt 9 | import numpy as np 10 | from .concat_dataset import ConcatDataset 11 | from .repeat_dataset import RepeatDataset 12 | from .. import datasets 13 | 14 | 15 | def to_tensor(data): 16 | """Convert objects of various python types to :obj:`torch.Tensor`. 17 | 18 | Supported types are: :class:`numpy.ndarray`, :class:`torch.Tensor`, 19 | :class:`Sequence`, :class:`int` and :class:`float`. 20 | """ 21 | if isinstance(data, torch.Tensor): 22 | return data 23 | elif isinstance(data, np.ndarray): 24 | return torch.from_numpy(data) 25 | elif isinstance(data, Sequence) and not mmcv.is_str(data): 26 | return torch.tensor(data) 27 | elif isinstance(data, int): 28 | return torch.LongTensor([data]) 29 | elif isinstance(data, float): 30 | return torch.FloatTensor([data]) 31 | else: 32 | raise TypeError('type {} cannot be converted to tensor.'.format( 33 | type(data))) 34 | 35 | 36 | def random_scale(img_scales, mode='range'): 37 | """Randomly select a scale from a list of scales or scale ranges. 38 | 39 | Args: 40 | img_scales (list[tuple]): Image scale or scale range. 41 | mode (str): "range" or "value".
42 | 43 | Returns: 44 | tuple: Sampled image scale. 45 | """ 46 | num_scales = len(img_scales) 47 | if num_scales == 1: # fixed scale is specified 48 | img_scale = img_scales[0] 49 | elif num_scales == 2: # randomly sample a scale 50 | if mode == 'range': 51 | img_scale_long = [max(s) for s in img_scales] 52 | img_scale_short = [min(s) for s in img_scales] 53 | long_edge = np.random.randint( 54 | min(img_scale_long), 55 | max(img_scale_long) + 1) 56 | short_edge = np.random.randint( 57 | min(img_scale_short), 58 | max(img_scale_short) + 1) 59 | img_scale = (long_edge, short_edge) 60 | elif mode == 'value': 61 | img_scale = img_scales[np.random.randint(num_scales)] 62 | else: 63 | if mode != 'value': 64 | raise ValueError( 65 | 'Only "value" mode supports more than 2 image scales') 66 | img_scale = img_scales[np.random.randint(num_scales)] 67 | return img_scale 68 | 69 | 70 | def show_ann(coco, img, ann_info): 71 | plt.imshow(mmcv.bgr2rgb(img)) 72 | plt.axis('off') 73 | coco.showAnns(ann_info) 74 | plt.show() 75 | 76 | 77 | def get_dataset(data_cfg): 78 | if data_cfg['type'] == 'RepeatDataset': 79 | return RepeatDataset( 80 | get_dataset(data_cfg['dataset']), data_cfg['times']) 81 | 82 | if isinstance(data_cfg['ann_file'], (list, tuple)): 83 | ann_files = data_cfg['ann_file'] 84 | num_dset = len(ann_files) 85 | else: 86 | ann_files = [data_cfg['ann_file']] 87 | num_dset = 1 88 | 89 | if 'proposal_file' in data_cfg.keys(): 90 | if isinstance(data_cfg['proposal_file'], (list, tuple)): 91 | proposal_files = data_cfg['proposal_file'] 92 | else: 93 | proposal_files = [data_cfg['proposal_file']] 94 | else: 95 | proposal_files = [None] * num_dset 96 | assert len(proposal_files) == num_dset 97 | 98 | if isinstance(data_cfg['img_prefix'], (list, tuple)): 99 | img_prefixes = data_cfg['img_prefix'] 100 | else: 101 | img_prefixes = [data_cfg['img_prefix']] * num_dset 102 | assert len(img_prefixes) == num_dset 103 | 104 | dsets = [] 105 | for i in range(num_dset): 106 | data_info = copy.deepcopy(data_cfg) 107 | data_info['ann_file'] = ann_files[i] 108 | data_info['proposal_file'] = proposal_files[i] 109 | data_info['img_prefix'] = img_prefixes[i] 110 | dset = obj_from_dict(data_info, datasets) 111 | dsets.append(dset) 112 | if len(dsets) > 1: 113 | dset = ConcatDataset(dsets) 114 | else: 115 | dset = dsets[0] 116 | return dset 117 | -------------------------------------------------------------------------------- /mmdet/models/anchor_heads/rpn_head.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import torch.nn.functional as F 4 | from mmcv.cnn import normal_init 5 | 6 | from mmdet.core import delta2bbox 7 | from mmdet.ops import nms 8 | from .anchor_head import AnchorHead 9 | from ..registry import HEADS 10 | 11 | 12 | @HEADS.register_module 13 | class RPNHead(AnchorHead): 14 | 15 | def __init__(self, in_channels, **kwargs): 16 | super(RPNHead, self).__init__(2, in_channels, **kwargs) 17 | 18 | def _init_layers(self): 19 | self.rpn_conv = nn.Conv2d( 20 | self.in_channels, self.feat_channels, 3, padding=1) 21 | self.rpn_cls = nn.Conv2d(self.feat_channels, 22 | self.num_anchors * self.cls_out_channels, 1) 23 | self.rpn_reg = nn.Conv2d(self.feat_channels, self.num_anchors * 4, 1) 24 | 25 | def init_weights(self): 26 | normal_init(self.rpn_conv, std=0.01) 27 | normal_init(self.rpn_cls, std=0.01) 28 | normal_init(self.rpn_reg, std=0.01) 29 | 30 | def forward_single(self, x): 31 | x = self.rpn_conv(x) 32 | x = F.relu(x, 
inplace=True) 33 | rpn_cls_score = self.rpn_cls(x) 34 | rpn_bbox_pred = self.rpn_reg(x) 35 | return rpn_cls_score, rpn_bbox_pred 36 | 37 | def loss(self, 38 | cls_scores, 39 | bbox_preds, 40 | gt_bboxes, 41 | img_metas, 42 | cfg, 43 | gt_bboxes_ignore=None): 44 | losses = super(RPNHead, self).loss( 45 | cls_scores, 46 | bbox_preds, 47 | gt_bboxes, 48 | None, 49 | img_metas, 50 | cfg, 51 | gt_bboxes_ignore=gt_bboxes_ignore) 52 | return dict( 53 | loss_rpn_cls=losses['loss_cls'], loss_rpn_bbox=losses['loss_bbox']) 54 | 55 | def get_bboxes_single(self, 56 | cls_scores, 57 | bbox_preds, 58 | mlvl_anchors, 59 | img_shape, 60 | scale_factor, 61 | cfg, 62 | rescale=False): 63 | mlvl_proposals = [] 64 | for idx in range(len(cls_scores)): 65 | rpn_cls_score = cls_scores[idx] 66 | rpn_bbox_pred = bbox_preds[idx] 67 | assert rpn_cls_score.size()[-2:] == rpn_bbox_pred.size()[-2:] 68 | anchors = mlvl_anchors[idx] 69 | rpn_cls_score = rpn_cls_score.permute(1, 2, 0) 70 | if self.use_sigmoid_cls: 71 | rpn_cls_score = rpn_cls_score.reshape(-1) 72 | scores = rpn_cls_score.sigmoid() 73 | else: 74 | rpn_cls_score = rpn_cls_score.reshape(-1, 2) 75 | scores = rpn_cls_score.softmax(dim=1)[:, 1] 76 | rpn_bbox_pred = rpn_bbox_pred.permute(1, 2, 0).reshape(-1, 4) 77 | if cfg.nms_pre > 0 and scores.shape[0] > cfg.nms_pre: 78 | _, topk_inds = scores.topk(cfg.nms_pre) 79 | rpn_bbox_pred = rpn_bbox_pred[topk_inds, :] 80 | anchors = anchors[topk_inds, :] 81 | scores = scores[topk_inds] 82 | proposals = delta2bbox(anchors, rpn_bbox_pred, self.target_means, 83 | self.target_stds, img_shape) 84 | if cfg.min_bbox_size > 0: 85 | w = proposals[:, 2] - proposals[:, 0] + 1 86 | h = proposals[:, 3] - proposals[:, 1] + 1 87 | valid_inds = torch.nonzero((w >= cfg.min_bbox_size) & 88 | (h >= cfg.min_bbox_size)).squeeze() 89 | proposals = proposals[valid_inds, :] 90 | scores = scores[valid_inds] 91 | proposals = torch.cat([proposals, scores.unsqueeze(-1)], dim=-1) 92 | proposals, _ = nms(proposals, cfg.nms_thr) 93 | proposals = proposals[:cfg.nms_post, :] 94 | mlvl_proposals.append(proposals) 95 | proposals = torch.cat(mlvl_proposals, 0) 96 | if cfg.nms_across_levels: 97 | proposals, _ = nms(proposals, cfg.nms_thr) 98 | proposals = proposals[:cfg.max_num, :] 99 | else: 100 | scores = proposals[:, 4] 101 | num = min(cfg.max_num, proposals.shape[0]) 102 | _, topk_inds = scores.topk(num) 103 | proposals = proposals[topk_inds, :] 104 | return proposals 105 | -------------------------------------------------------------------------------- /mmdet/ops/nms/src/soft_nms_cpu.pyx: -------------------------------------------------------------------------------- 1 | # ---------------------------------------------------------- 2 | # Soft-NMS: Improving Object Detection With One Line of Code 3 | # Copyright (c) University of Maryland, College Park 4 | # Licensed under The MIT License [see LICENSE for details] 5 | # Written by Navaneeth Bodla and Bharat Singh 6 | # Modified by Kai Chen 7 | # ---------------------------------------------------------- 8 | 9 | # cython: language_level=3, boundscheck=False 10 | 11 | import numpy as np 12 | cimport numpy as np 13 | 14 | 15 | cdef inline np.float32_t max(np.float32_t a, np.float32_t b): 16 | return a if a >= b else b 17 | 18 | cdef inline np.float32_t min(np.float32_t a, np.float32_t b): 19 | return a if a <= b else b 20 | 21 | 22 | def soft_nms_cpu( 23 | np.ndarray[float, ndim=2] boxes_in, 24 | float iou_thr, 25 | unsigned int method=1, 26 | float sigma=0.5, 27 | float min_score=0.001, 28 | ): 29 | 
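    # A quick numeric sketch of the three re-weighting modes applied below,
    # for a box overlapping the current max with ov = 0.8, iou_thr = 0.5,
    # sigma = 0.5 (hypothetical values):
    #   method == 1 (linear):   weight = 1 - 0.8 = 0.2
    #   method == 2 (gaussian): weight = exp(-0.8**2 / 0.5) ~= 0.28
    #   otherwise (hard NMS):   weight = 0
    # The surviving score is the old score times this weight; boxes whose
    # score drops below min_score are discarded.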
boxes = boxes_in.copy() 30 | cdef unsigned int N = boxes.shape[0] 31 | cdef float iw, ih, box_area 32 | cdef float ua 33 | cdef int pos = 0 34 | cdef float maxscore = 0 35 | cdef int maxpos = 0 36 | cdef float x1, x2, y1, y2, tx1, tx2, ty1, ty2, ts, area, weight, ov 37 | inds = np.arange(N) 38 | 39 | for i in range(N): 40 | maxscore = boxes[i, 4] 41 | maxpos = i 42 | 43 | tx1 = boxes[i, 0] 44 | ty1 = boxes[i, 1] 45 | tx2 = boxes[i, 2] 46 | ty2 = boxes[i, 3] 47 | ts = boxes[i, 4] 48 | ti = inds[i] 49 | 50 | pos = i + 1 51 | # get max box 52 | while pos < N: 53 | if maxscore < boxes[pos, 4]: 54 | maxscore = boxes[pos, 4] 55 | maxpos = pos 56 | pos = pos + 1 57 | 58 | # add max box as a detection 59 | boxes[i, 0] = boxes[maxpos, 0] 60 | boxes[i, 1] = boxes[maxpos, 1] 61 | boxes[i, 2] = boxes[maxpos, 2] 62 | boxes[i, 3] = boxes[maxpos, 3] 63 | boxes[i, 4] = boxes[maxpos, 4] 64 | inds[i] = inds[maxpos] 65 | 66 | # swap ith box with position of max box 67 | boxes[maxpos, 0] = tx1 68 | boxes[maxpos, 1] = ty1 69 | boxes[maxpos, 2] = tx2 70 | boxes[maxpos, 3] = ty2 71 | boxes[maxpos, 4] = ts 72 | inds[maxpos] = ti 73 | 74 | tx1 = boxes[i, 0] 75 | ty1 = boxes[i, 1] 76 | tx2 = boxes[i, 2] 77 | ty2 = boxes[i, 3] 78 | ts = boxes[i, 4] 79 | 80 | pos = i + 1 81 | # NMS iterations, note that N changes if detection boxes fall below 82 | # threshold 83 | while pos < N: 84 | x1 = boxes[pos, 0] 85 | y1 = boxes[pos, 1] 86 | x2 = boxes[pos, 2] 87 | y2 = boxes[pos, 3] 88 | s = boxes[pos, 4] 89 | 90 | area = (x2 - x1 + 1) * (y2 - y1 + 1) 91 | iw = (min(tx2, x2) - max(tx1, x1) + 1) 92 | if iw > 0: 93 | ih = (min(ty2, y2) - max(ty1, y1) + 1) 94 | if ih > 0: 95 | ua = float((tx2 - tx1 + 1) * (ty2 - ty1 + 1) + area - iw * ih) 96 | ov = iw * ih / ua # iou between max box and detection box 97 | 98 | if method == 1: # linear 99 | if ov > iou_thr: 100 | weight = 1 - ov 101 | else: 102 | weight = 1 103 | elif method == 2: # gaussian 104 | weight = np.exp(-(ov * ov) / sigma) 105 | else: # original NMS 106 | if ov > iou_thr: 107 | weight = 0 108 | else: 109 | weight = 1 110 | 111 | boxes[pos, 4] = weight * boxes[pos, 4] 112 | 113 | # if box score falls below threshold, discard the box by 114 | # swapping with last box update N 115 | if boxes[pos, 4] < min_score: 116 | boxes[pos, 0] = boxes[N-1, 0] 117 | boxes[pos, 1] = boxes[N-1, 1] 118 | boxes[pos, 2] = boxes[N-1, 2] 119 | boxes[pos, 3] = boxes[N-1, 3] 120 | boxes[pos, 4] = boxes[N-1, 4] 121 | inds[pos] = inds[N - 1] 122 | N = N - 1 123 | pos = pos - 1 124 | 125 | pos = pos + 1 126 | 127 | return boxes[:N], inds[:N] 128 | -------------------------------------------------------------------------------- /mmdet/datasets/transforms.py: -------------------------------------------------------------------------------- 1 | import mmcv 2 | import numpy as np 3 | import torch 4 | 5 | __all__ = [ 6 | 'ImageTransform', 'BboxTransform', 'MaskTransform', 'SegMapTransform', 7 | 'Numpy2Tensor' 8 | ] 9 | 10 | 11 | class ImageTransform(object): 12 | """Preprocess an image. 13 | 14 | 1. rescale the image to expected size 15 | 2. normalize the image 16 | 3. flip the image (if needed) 17 | 4. pad the image (if needed) 18 | 5. 
transpose to (c, h, w) 19 | """ 20 | 21 | def __init__(self, 22 | mean=(0, 0, 0), 23 | std=(1, 1, 1), 24 | to_rgb=True, 25 | size_divisor=None): 26 | self.mean = np.array(mean, dtype=np.float32) 27 | self.std = np.array(std, dtype=np.float32) 28 | self.to_rgb = to_rgb 29 | self.size_divisor = size_divisor 30 | 31 | def __call__(self, img, scale, flip=False, keep_ratio=True): 32 | if keep_ratio: 33 | img, scale_factor = mmcv.imrescale(img, scale, return_scale=True) 34 | else: 35 | img, w_scale, h_scale = mmcv.imresize( 36 | img, scale, return_scale=True) 37 | scale_factor = np.array( 38 | [w_scale, h_scale, w_scale, h_scale], dtype=np.float32) 39 | img_shape = img.shape 40 | img = mmcv.imnormalize(img, self.mean, self.std, self.to_rgb) 41 | if flip: 42 | img = mmcv.imflip(img) 43 | if self.size_divisor is not None: 44 | img = mmcv.impad_to_multiple(img, self.size_divisor) 45 | pad_shape = img.shape 46 | else: 47 | pad_shape = img_shape 48 | img = img.transpose(2, 0, 1) 49 | return img, img_shape, pad_shape, scale_factor 50 | 51 | 52 | def bbox_flip(bboxes, img_shape): 53 | """Flip bboxes horizontally. 54 | 55 | Args: 56 | bboxes(ndarray): shape (..., 4*k) 57 | img_shape(tuple): (height, width) 58 | """ 59 | assert bboxes.shape[-1] % 4 == 0 60 | w = img_shape[1] 61 | flipped = bboxes.copy() 62 | flipped[..., 0::4] = w - bboxes[..., 2::4] - 1 63 | flipped[..., 2::4] = w - bboxes[..., 0::4] - 1 64 | return flipped 65 | 66 | 67 | class BboxTransform(object): 68 | """Preprocess gt bboxes. 69 | 70 | 1. rescale bboxes according to image size 71 | 2. flip bboxes (if needed) 72 | 3. pad the first dimension to `max_num_gts` 73 | """ 74 | 75 | def __init__(self, max_num_gts=None): 76 | self.max_num_gts = max_num_gts 77 | 78 | def __call__(self, bboxes, img_shape, scale_factor, flip=False): 79 | gt_bboxes = bboxes * scale_factor 80 | if flip: 81 | gt_bboxes = bbox_flip(gt_bboxes, img_shape) 82 | gt_bboxes[:, 0::2] = np.clip(gt_bboxes[:, 0::2], 0, img_shape[1] - 1) 83 | gt_bboxes[:, 1::2] = np.clip(gt_bboxes[:, 1::2], 0, img_shape[0] - 1) 84 | if self.max_num_gts is None: 85 | return gt_bboxes 86 | else: 87 | num_gts = gt_bboxes.shape[0] 88 | padded_bboxes = np.zeros((self.max_num_gts, 4), dtype=np.float32) 89 | padded_bboxes[:num_gts, :] = gt_bboxes 90 | return padded_bboxes 91 | 92 | 93 | class MaskTransform(object): 94 | """Preprocess masks. 95 | 96 | 1. resize masks to expected size and stack to a single array 97 | 2. flip the masks (if needed) 98 | 3. pad the masks (if needed) 99 | """ 100 | 101 | def __call__(self, masks, pad_shape, scale_factor, flip=False): 102 | masks = [ 103 | mmcv.imrescale(mask, scale_factor, interpolation='nearest') 104 | for mask in masks 105 | ] 106 | if flip: 107 | masks = [mask[:, ::-1] for mask in masks] 108 | padded_masks = [ 109 | mmcv.impad(mask, pad_shape[:2], pad_val=0) for mask in masks 110 | ] 111 | padded_masks = np.stack(padded_masks, axis=0) 112 | return padded_masks 113 | 114 | 115 | class SegMapTransform(object): 116 | """Preprocess semantic segmentation maps. 117 | 118 | 1. rescale the segmentation map to expected size 119 | 2. flip the image (if needed) 120 | 3. 
pad the image (if needed) 121 | """ 122 | 123 | def __init__(self, size_divisor=None): 124 | self.size_divisor = size_divisor 125 | 126 | def __call__(self, img, scale, flip=False, keep_ratio=True): 127 | if keep_ratio: 128 | img = mmcv.imrescale(img, scale, interpolation='nearest') 129 | else: 130 | img = mmcv.imresize(img, scale, interpolation='nearest') 131 | if flip: 132 | img = mmcv.imflip(img) 133 | if self.size_divisor is not None: 134 | img = mmcv.impad_to_multiple(img, self.size_divisor) 135 | return img 136 | 137 | 138 | class Numpy2Tensor(object): 139 | 140 | def __init__(self): 141 | pass 142 | 143 | def __call__(self, *args): 144 | if len(args) == 1: 145 | return torch.from_numpy(args[0]) 146 | else: 147 | return tuple([torch.from_numpy(np.array(array)) for array in args]) 148 | -------------------------------------------------------------------------------- /mmdet/models/detectors/base.py: -------------------------------------------------------------------------------- 1 | import logging 2 | from abc import ABCMeta, abstractmethod 3 | 4 | import mmcv 5 | import numpy as np 6 | import torch.nn as nn 7 | import pycocotools.mask as maskUtils 8 | 9 | from mmdet.core import tensor2imgs, get_classes 10 | 11 | 12 | class BaseDetector(nn.Module, metaclass=ABCMeta): 13 | """Base class for detectors""" 14 | 15 | 16 | 17 | def __init__(self): 18 | super(BaseDetector, self).__init__() 19 | 20 | @property 21 | def with_neck(self): 22 | return hasattr(self, 'neck') and self.neck is not None 23 | 24 | @property 25 | def with_shared_head(self): 26 | return hasattr(self, 'shared_head') and self.shared_head is not None 27 | 28 | @property 29 | def with_bbox(self): 30 | return hasattr(self, 'bbox_head') and self.bbox_head is not None 31 | 32 | @property 33 | def with_mask(self): 34 | return hasattr(self, 'mask_head') and self.mask_head is not None 35 | 36 | @abstractmethod 37 | def extract_feat(self, imgs): 38 | pass 39 | 40 | def extract_feats(self, imgs): 41 | assert isinstance(imgs, list) 42 | for img in imgs: 43 | yield self.extract_feat(img) 44 | 45 | @abstractmethod 46 | def forward_train(self, imgs, img_metas, **kwargs): 47 | pass 48 | 49 | @abstractmethod 50 | def simple_test(self, img, img_meta, **kwargs): 51 | pass 52 | 53 | @abstractmethod 54 | def aug_test(self, imgs, img_metas, **kwargs): 55 | pass 56 | 57 | def init_weights(self, pretrained=None): 58 | if pretrained is not None: 59 | logger = logging.getLogger() 60 | logger.info('load model from: {}'.format(pretrained)) 61 | 62 | def forward_test(self, imgs, img_metas, **kwargs): 63 | for var, name in [(imgs, 'imgs'), (img_metas, 'img_metas')]: 64 | if not isinstance(var, list): 65 | raise TypeError('{} must be a list, but got {}'.format( 66 | name, type(var))) 67 | 68 | num_augs = len(imgs) 69 | if num_augs != len(img_metas): 70 | raise ValueError( 71 | 'num of augmentations ({}) != num of image meta ({})'.format( 72 | len(imgs), len(img_metas))) 73 | # TODO: remove the restriction of imgs_per_gpu == 1 when prepared 74 | imgs_per_gpu = imgs[0].size(0) 75 | assert imgs_per_gpu == 1 76 | 77 | if num_augs == 1: 78 | return self.simple_test(imgs[0], img_metas[0], **kwargs) 79 | else: 80 | return self.aug_test(imgs, img_metas, **kwargs) 81 | 82 | def forward(self, img, img_meta, return_loss=True, **kwargs): 83 | if return_loss: 84 | return self.forward_train(img, img_meta, **kwargs) 85 | else: 86 | return self.forward_test(img, img_meta, **kwargs) 87 | 88 | def show_result(self, 89 | data, 90 | result, 91 | img_norm_cfg, 92 | 
dataset=None, 93 | score_thr=0.3): 94 | if isinstance(result, tuple): 95 | bbox_result, segm_result = result 96 | else: 97 | bbox_result, segm_result = result, None 98 | 99 | img_tensor = data['img'][0] 100 | img_metas = data['img_meta'][0].data[0] 101 | imgs = tensor2imgs(img_tensor, **img_norm_cfg) 102 | assert len(imgs) == len(img_metas) 103 | 104 | if dataset is None: 105 | class_names = self.CLASSES 106 | elif isinstance(dataset, str): 107 | class_names = get_classes(dataset) 108 | elif isinstance(dataset, (list, tuple)): 109 | class_names = dataset 110 | else: 111 | raise TypeError( 112 | 'dataset must be a valid dataset name or a sequence' 113 | ' of class names, not {}'.format(type(dataset))) 114 | 115 | for img, img_meta in zip(imgs, img_metas): 116 | h, w, _ = img_meta['img_shape'] 117 | img_show = img[:h, :w, :] 118 | 119 | bboxes = np.vstack(bbox_result) 120 | # draw segmentation masks 121 | if segm_result is not None: 122 | segms = mmcv.concat_list(segm_result) 123 | inds = np.where(bboxes[:, -1] > score_thr)[0] 124 | for i in inds: 125 | color_mask = np.random.randint( 126 | 0, 256, (1, 3), dtype=np.uint8) 127 | mask = maskUtils.decode(segms[i]).astype(bool) 128 | img_show[mask] = img_show[mask] * 0.5 + color_mask * 0.5 129 | # draw bounding boxes 130 | labels = [ 131 | np.full(bbox.shape[0], i, dtype=np.int32) 132 | for i, bbox in enumerate(bbox_result) 133 | ] 134 | labels = np.concatenate(labels) 135 | mmcv.imshow_det_bboxes( 136 | img_show, 137 | bboxes, 138 | labels, 139 | class_names=class_names, 140 | score_thr=score_thr) 141 | -------------------------------------------------------------------------------- /mmdet/datasets/coco.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | from pycocotools.coco import COCO 3 | 4 | from .custom import CustomDataset 5 | 6 | 7 | class CocoDataset(CustomDataset): 8 | 9 | CLASSES = ('person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', 10 | 'train', 'truck', 'boat', 'traffic_light', 'fire_hydrant', 11 | 'stop_sign', 'parking_meter', 'bench', 'bird', 'cat', 'dog', 12 | 'horse', 'sheep', 'cow', 'elephant', 'bear', 'zebra', 'giraffe', 13 | 'backpack', 'umbrella', 'handbag', 'tie', 'suitcase', 'frisbee', 14 | 'skis', 'snowboard', 'sports_ball', 'kite', 'baseball_bat', 15 | 'baseball_glove', 'skateboard', 'surfboard', 'tennis_racket', 16 | 'bottle', 'wine_glass', 'cup', 'fork', 'knife', 'spoon', 'bowl', 17 | 'banana', 'apple', 'sandwich', 'orange', 'broccoli', 'carrot', 18 | 'hot_dog', 'pizza', 'donut', 'cake', 'chair', 'couch', 19 | 'potted_plant', 'bed', 'dining_table', 'toilet', 'tv', 'laptop', 20 | 'mouse', 'remote', 'keyboard', 'cell_phone', 'microwave', 21 | 'oven', 'toaster', 'sink', 'refrigerator', 'book', 'clock', 22 | 'vase', 'scissors', 'teddy_bear', 'hair_drier', 'toothbrush') 23 | 24 | def load_annotations(self, ann_file): 25 | self.coco = COCO(ann_file) 26 | self.cat_ids = self.coco.getCatIds() 27 | self.cat2label = { 28 | cat_id: i + 1 29 | for i, cat_id in enumerate(self.cat_ids) 30 | } 31 | self.img_ids = self.coco.getImgIds() 32 | img_infos = [] 33 | for i in self.img_ids: 34 | info = self.coco.loadImgs([i])[0] 35 | info['filename'] = info['file_name'] 36 | img_infos.append(info) 37 | return img_infos 38 | 39 | def get_ann_info(self, idx): 40 | img_id = self.img_infos[idx]['id'] 41 | ann_ids = self.coco.getAnnIds(imgIds=[img_id]) 42 | ann_info = self.coco.loadAnns(ann_ids) 43 | return self._parse_ann_info(ann_info, self.with_mask) 44 | 45 | def 
_filter_imgs(self, min_size=32): 46 | """Filter images too small or without ground truths.""" 47 | valid_inds = [] 48 | ids_with_ann = set(_['image_id'] for _ in self.coco.anns.values()) 49 | for i, img_info in enumerate(self.img_infos): 50 | if self.img_ids[i] not in ids_with_ann: 51 | continue 52 | if min(img_info['width'], img_info['height']) >= min_size: 53 | valid_inds.append(i) 54 | return valid_inds 55 | 56 | def _parse_ann_info(self, ann_info, with_mask=True): 57 | """Parse bbox and mask annotation. 58 | 59 | Args: 60 | ann_info (list[dict]): Annotation info of an image. 61 | with_mask (bool): Whether to parse mask annotations. 62 | 63 | Returns: 64 | dict: A dict containing the following keys: bboxes, bboxes_ignore, 65 | labels, masks, mask_polys, poly_lens. 66 | """ 67 | gt_bboxes = [] 68 | gt_labels = [] 69 | gt_bboxes_ignore = [] 70 | # Two formats are provided. 71 | # 1. mask: a binary map of the same size of the image. 72 | # 2. polys: each mask consists of one or several polys, each poly is a 73 | # list of float. 74 | if with_mask: 75 | gt_masks = [] 76 | gt_mask_polys = [] 77 | gt_poly_lens = [] 78 | for i, ann in enumerate(ann_info): 79 | if ann.get('ignore', False): 80 | continue 81 | x1, y1, w, h = ann['bbox'] 82 | if ann['area'] <= 0 or w < 1 or h < 1: 83 | continue 84 | bbox = [x1, y1, x1 + w - 1, y1 + h - 1] 85 | if ann['iscrowd']: 86 | gt_bboxes_ignore.append(bbox) 87 | else: 88 | gt_bboxes.append(bbox) 89 | gt_labels.append(self.cat2label[ann['category_id']]) 90 | if with_mask: 91 | gt_masks.append(self.coco.annToMask(ann)) 92 | mask_polys = [ 93 | p for p in ann['segmentation'] if len(p) >= 6 94 | ] # valid polygons have >= 3 points (6 coordinates) 95 | poly_lens = [len(p) for p in mask_polys] 96 | gt_mask_polys.append(mask_polys) 97 | gt_poly_lens.extend(poly_lens) 98 | if gt_bboxes: 99 | gt_bboxes = np.array(gt_bboxes, dtype=np.float32) 100 | gt_labels = np.array(gt_labels, dtype=np.int64) 101 | else: 102 | gt_bboxes = np.zeros((0, 4), dtype=np.float32) 103 | gt_labels = np.array([], dtype=np.int64) 104 | 105 | if gt_bboxes_ignore: 106 | gt_bboxes_ignore = np.array(gt_bboxes_ignore, dtype=np.float32) 107 | else: 108 | gt_bboxes_ignore = np.zeros((0, 4), dtype=np.float32) 109 | 110 | ann = dict( 111 | bboxes=gt_bboxes, labels=gt_labels, bboxes_ignore=gt_bboxes_ignore) 112 | 113 | if with_mask: 114 | ann['masks'] = gt_masks 115 | # poly format is not used in the current implementation 116 | ann['mask_polys'] = gt_mask_polys 117 | ann['poly_lens'] = gt_poly_lens 118 | return ann 119 | -------------------------------------------------------------------------------- /mmdet/ops/nms/src/nms_kernel.cu: -------------------------------------------------------------------------------- 1 | // Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved. 
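// Algorithm note (annotation): nms_cuda() below first sorts boxes by score;
// nms_kernel then compares each 64-box "row" chunk against each 64-box
// "column" chunk, and every thread records, as bits of one 64-bit word, which
// column boxes its row box suppresses (IoU > nms_overlap_thresh). A final
// greedy pass over these bitmasks on the CPU selects the kept boxes.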
2 | #include <ATen/ATen.h>
3 | #include <ATen/cuda/CUDAContext.h>
4 | 
5 | #include <THC/THC.h>
6 | #include <THC/THCDeviceUtils.cuh>
7 | 
8 | #include <vector>
9 | #include <iostream>
10 | 
11 | int const threadsPerBlock = sizeof(unsigned long long) * 8;
12 | 
13 | __device__ inline float devIoU(float const * const a, float const * const b) {
14 |   float left = max(a[0], b[0]), right = min(a[2], b[2]);
15 |   float top = max(a[1], b[1]), bottom = min(a[3], b[3]);
16 |   float width = max(right - left + 1, 0.f), height = max(bottom - top + 1, 0.f);
17 |   float interS = width * height;
18 |   float Sa = (a[2] - a[0] + 1) * (a[3] - a[1] + 1);
19 |   float Sb = (b[2] - b[0] + 1) * (b[3] - b[1] + 1);
20 |   return interS / (Sa + Sb - interS);
21 | }
22 | 
23 | __global__ void nms_kernel(const int n_boxes, const float nms_overlap_thresh,
24 |                            const float *dev_boxes, unsigned long long *dev_mask) {
25 |   const int row_start = blockIdx.y;
26 |   const int col_start = blockIdx.x;
27 | 
28 |   // if (row_start > col_start) return;
29 | 
30 |   const int row_size =
31 |         min(n_boxes - row_start * threadsPerBlock, threadsPerBlock);
32 |   const int col_size =
33 |         min(n_boxes - col_start * threadsPerBlock, threadsPerBlock);
34 | 
35 |   __shared__ float block_boxes[threadsPerBlock * 5];
36 |   if (threadIdx.x < col_size) {
37 |     block_boxes[threadIdx.x * 5 + 0] =
38 |         dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 5 + 0];
39 |     block_boxes[threadIdx.x * 5 + 1] =
40 |         dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 5 + 1];
41 |     block_boxes[threadIdx.x * 5 + 2] =
42 |         dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 5 + 2];
43 |     block_boxes[threadIdx.x * 5 + 3] =
44 |         dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 5 + 3];
45 |     block_boxes[threadIdx.x * 5 + 4] =
46 |         dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 5 + 4];
47 |   }
48 |   __syncthreads();
49 | 
50 |   if (threadIdx.x < row_size) {
51 |     const int cur_box_idx = threadsPerBlock * row_start + threadIdx.x;
52 |     const float *cur_box = dev_boxes + cur_box_idx * 5;
53 |     int i = 0;
54 |     unsigned long long t = 0;
55 |     int start = 0;
56 |     if (row_start == col_start) {
57 |       start = threadIdx.x + 1;
58 |     }
59 |     for (i = start; i < col_size; i++) {
60 |       if (devIoU(cur_box, block_boxes + i * 5) > nms_overlap_thresh) {
61 |         t |= 1ULL << i;  // bit i: column box i overlaps cur_box above threshold
62 |       }
63 |     }
64 |     const int col_blocks = THCCeilDiv(n_boxes, threadsPerBlock);
65 |     dev_mask[cur_box_idx * col_blocks + col_start] = t;
66 |   }
67 | }
68 | 
69 | // boxes is a N x 5 tensor
70 | at::Tensor nms_cuda(const at::Tensor boxes, float nms_overlap_thresh) {
71 |   using scalar_t = float;
72 |   AT_ASSERTM(boxes.type().is_cuda(), "boxes must be a CUDA tensor");
73 |   auto scores = boxes.select(1, 4);
74 |   auto order_t = std::get<1>(scores.sort(0, /* descending=*/true));
75 |   auto boxes_sorted = boxes.index_select(0, order_t);
76 | 
77 |   int boxes_num = boxes.size(0);
78 | 
79 |   const int col_blocks = THCCeilDiv(boxes_num, threadsPerBlock);
80 | 
81 |   scalar_t* boxes_dev = boxes_sorted.data<scalar_t>();
82 | 
83 |   THCState *state = at::globalContext().lazyInitCUDA(); // TODO replace with getTHCState
84 | 
85 |   unsigned long long* mask_dev = NULL;
86 |   //THCudaCheck(THCudaMalloc(state, (void**) &mask_dev,
87 |   //                      boxes_num * col_blocks * sizeof(unsigned long long)));
88 | 
89 |   mask_dev = (unsigned long long*) THCudaMalloc(state, boxes_num * col_blocks * sizeof(unsigned long long));
90 | 
91 |   dim3 blocks(THCCeilDiv(boxes_num, threadsPerBlock),
92 |               THCCeilDiv(boxes_num, threadsPerBlock));
93 |   dim3 threads(threadsPerBlock);
94 |   nms_kernel<<<blocks, threads>>>(boxes_num,
95 |                                   nms_overlap_thresh,
96 |                                   boxes_dev,
97 |                                   mask_dev);
98 | 
99 |   std::vector<unsigned long long> mask_host(boxes_num * col_blocks);
100 |   THCudaCheck(cudaMemcpy(&mask_host[0],
101 |                         mask_dev,
102 |                         sizeof(unsigned long long) * boxes_num * col_blocks,
103 |                         cudaMemcpyDeviceToHost));
104 | 
105 |   std::vector<unsigned long long> remv(col_blocks);
106 |   memset(&remv[0], 0, sizeof(unsigned long long) * col_blocks);
107 | 
108 |   at::Tensor keep = at::empty({boxes_num}, boxes.options().dtype(at::kLong).device(at::kCPU));
109 |   int64_t* keep_out = keep.data<int64_t>();
110 | 
111 |   int num_to_keep = 0;
112 |   for (int i = 0; i < boxes_num; i++) {
113 |     int nblock = i / threadsPerBlock;
114 |     int inblock = i % threadsPerBlock;
115 | 
116 |     if (!(remv[nblock] & (1ULL << inblock))) {  // box i not suppressed yet
117 |       keep_out[num_to_keep++] = i;
118 |       unsigned long long *p = &mask_host[0] + i * col_blocks;
119 |       for (int j = nblock; j < col_blocks; j++) {
120 |         remv[j] |= p[j];  // suppress every box that box i overlaps
121 |       }
122 |     }
123 |   }
124 | 
125 |   THCudaFree(state, mask_dev);
126 |   // TODO improve this part
127 |   return std::get<0>(order_t.index({
128 |                        keep.narrow(/*dim=*/0, /*start=*/0, /*length=*/num_to_keep).to(
129 |                            order_t.device(), keep.scalar_type())
130 |                      }).sort(0, false));
131 | }
--------------------------------------------------------------------------------
/mmdet/core/evaluation/coco_utils.py:
--------------------------------------------------------------------------------
1 | import mmcv
2 | import numpy as np
3 | from pycocotools.coco import COCO
4 | from tools.cocoeval import COCOeval
5 | 
6 | from .recall import eval_recalls
7 | 
8 | 
9 | def coco_eval(result_file, result_types, coco, max_dets=(100, 300, 1000)):
10 |     for res_type in result_types:
11 |         assert res_type in [
12 |             'proposal', 'proposal_fast', 'bbox', 'segm', 'keypoints'
13 |         ]
14 | 
15 |     if mmcv.is_str(coco):
16 |         coco = COCO(coco)
17 |     assert isinstance(coco, COCO)
18 | 
19 |     if result_types == ['proposal_fast']:
20 |         ar = fast_eval_recall(result_file, coco, np.array(max_dets))
21 |         for i, num in enumerate(max_dets):
22 |             print('AR@{}\t= {:.4f}'.format(num, ar[i]))
23 |         return
24 | 
25 |     assert result_file.endswith('.json')
26 |     coco_dets = coco.loadRes(result_file)
27 | 
28 |     img_ids = coco.getImgIds()
29 |     for res_type in result_types:
30 |         iou_type = 'bbox' if res_type == 'proposal' else res_type
31 |         cocoEval = COCOeval(coco, coco_dets, iou_type)
32 |         cocoEval.params.imgIds = img_ids
33 |         if res_type == 'proposal':
34 |             cocoEval.params.useCats = 0
35 |             cocoEval.params.maxDets = list(max_dets)
36 |         cocoEval.evaluate()
37 |         cocoEval.accumulate()
38 |         cocoEval.summarize()
39 | 
40 | 
41 | def fast_eval_recall(results,
42 |                      coco,
43 |                      max_dets,
44 |                      iou_thrs=np.arange(0.5, 0.96, 0.05)):
45 |     if mmcv.is_str(results):
46 |         assert results.endswith('.pkl')
47 |         results = mmcv.load(results)
48 |     elif not isinstance(results, list):
49 |         raise TypeError(
50 |             'results must be a list of numpy arrays or a filename, not {}'.
51 | format(type(results))) 52 | 53 | gt_bboxes = [] 54 | img_ids = coco.getImgIds() 55 | for i in range(len(img_ids)): 56 | ann_ids = coco.getAnnIds(imgIds=img_ids[i]) 57 | ann_info = coco.loadAnns(ann_ids) 58 | if len(ann_info) == 0: 59 | gt_bboxes.append(np.zeros((0, 4))) 60 | continue 61 | bboxes = [] 62 | for ann in ann_info: 63 | if ann.get('ignore', False) or ann['iscrowd']: 64 | continue 65 | x1, y1, w, h = ann['bbox'] 66 | bboxes.append([x1, y1, x1 + w - 1, y1 + h - 1]) 67 | bboxes = np.array(bboxes, dtype=np.float32) 68 | if bboxes.shape[0] == 0: 69 | bboxes = np.zeros((0, 4)) 70 | gt_bboxes.append(bboxes) 71 | 72 | recalls = eval_recalls( 73 | gt_bboxes, results, max_dets, iou_thrs, print_summary=False) 74 | ar = recalls.mean(axis=1) 75 | return ar 76 | 77 | 78 | def xyxy2xywh(bbox): 79 | _bbox = bbox.tolist() 80 | return [ 81 | _bbox[0], 82 | _bbox[1], 83 | _bbox[2] - _bbox[0] + 1, 84 | _bbox[3] - _bbox[1] + 1, 85 | ] 86 | 87 | 88 | def proposal2json(dataset, results): 89 | json_results = [] 90 | for idx in range(len(dataset)): 91 | img_id = dataset.img_ids[idx] 92 | bboxes = results[idx] 93 | for i in range(bboxes.shape[0]): 94 | data = dict() 95 | data['image_id'] = img_id 96 | data['bbox'] = xyxy2xywh(bboxes[i]) 97 | data['score'] = float(bboxes[i][4]) 98 | data['category_id'] = 1 99 | json_results.append(data) 100 | return json_results 101 | 102 | 103 | def det2json(dataset, results): 104 | json_results = [] 105 | for idx in range(len(dataset)): 106 | img_id = dataset.img_ids[idx] 107 | result = results[idx] 108 | for label in range(len(result)): 109 | bboxes = result[label] 110 | for i in range(bboxes.shape[0]): 111 | data = dict() 112 | data['image_id'] = img_id 113 | data['bbox'] = xyxy2xywh(bboxes[i]) 114 | data['score'] = float(bboxes[i][4]) 115 | data['category_id'] = dataset.cat_ids[label] 116 | json_results.append(data) 117 | return json_results 118 | 119 | 120 | def segm2json(dataset, results): 121 | json_results = [] 122 | for idx in range(len(dataset)): 123 | img_id = dataset.img_ids[idx] 124 | det, seg = results[idx] 125 | for label in range(len(det)): 126 | bboxes = det[label] 127 | segms = seg[label] 128 | for i in range(bboxes.shape[0]): 129 | data = dict() 130 | data['image_id'] = img_id 131 | data['bbox'] = xyxy2xywh(bboxes[i]) 132 | data['score'] = float(bboxes[i][4]) 133 | data['category_id'] = dataset.cat_ids[label] 134 | segms[i]['counts'] = segms[i]['counts'].decode() 135 | data['segmentation'] = segms[i] 136 | json_results.append(data) 137 | return json_results 138 | 139 | 140 | def results2json(dataset, results, out_file): 141 | if isinstance(results[0], list): 142 | json_results = det2json(dataset, results) 143 | elif isinstance(results[0], tuple): 144 | json_results = segm2json(dataset, results) 145 | elif isinstance(results[0], np.ndarray): 146 | json_results = proposal2json(dataset, results) 147 | else: 148 | raise TypeError('invalid type of results') 149 | mmcv.dump(json_results, out_file) 150 | -------------------------------------------------------------------------------- /mmdet/core/evaluation/class_names.py: -------------------------------------------------------------------------------- 1 | import mmcv 2 | 3 | 4 | def voc_classes(): 5 | return [ 6 | 'aeroplane', 'bicycle', 'bird', 'boat', 'bottle', 'bus', 'car', 'cat', 7 | 'chair', 'cow', 'diningtable', 'dog', 'horse', 'motorbike', 'person', 8 | 'pottedplant', 'sheep', 'sofa', 'train', 'tvmonitor' 9 | ] 10 | 11 | 12 | def imagenet_det_classes(): 13 | return [ 14 | 'accordion', 'airplane', 
'ant', 'antelope', 'apple', 'armadillo', 15 | 'artichoke', 'axe', 'baby_bed', 'backpack', 'bagel', 'balance_beam', 16 | 'banana', 'band_aid', 'banjo', 'baseball', 'basketball', 'bathing_cap', 17 | 'beaker', 'bear', 'bee', 'bell_pepper', 'bench', 'bicycle', 'binder', 18 | 'bird', 'bookshelf', 'bow_tie', 'bow', 'bowl', 'brassiere', 'burrito', 19 | 'bus', 'butterfly', 'camel', 'can_opener', 'car', 'cart', 'cattle', 20 | 'cello', 'centipede', 'chain_saw', 'chair', 'chime', 'cocktail_shaker', 21 | 'coffee_maker', 'computer_keyboard', 'computer_mouse', 'corkscrew', 22 | 'cream', 'croquet_ball', 'crutch', 'cucumber', 'cup_or_mug', 'diaper', 23 | 'digital_clock', 'dishwasher', 'dog', 'domestic_cat', 'dragonfly', 24 | 'drum', 'dumbbell', 'electric_fan', 'elephant', 'face_powder', 'fig', 25 | 'filing_cabinet', 'flower_pot', 'flute', 'fox', 'french_horn', 'frog', 26 | 'frying_pan', 'giant_panda', 'goldfish', 'golf_ball', 'golfcart', 27 | 'guacamole', 'guitar', 'hair_dryer', 'hair_spray', 'hamburger', 28 | 'hammer', 'hamster', 'harmonica', 'harp', 'hat_with_a_wide_brim', 29 | 'head_cabbage', 'helmet', 'hippopotamus', 'horizontal_bar', 'horse', 30 | 'hotdog', 'iPod', 'isopod', 'jellyfish', 'koala_bear', 'ladle', 31 | 'ladybug', 'lamp', 'laptop', 'lemon', 'lion', 'lipstick', 'lizard', 32 | 'lobster', 'maillot', 'maraca', 'microphone', 'microwave', 'milk_can', 33 | 'miniskirt', 'monkey', 'motorcycle', 'mushroom', 'nail', 'neck_brace', 34 | 'oboe', 'orange', 'otter', 'pencil_box', 'pencil_sharpener', 'perfume', 35 | 'person', 'piano', 'pineapple', 'ping-pong_ball', 'pitcher', 'pizza', 36 | 'plastic_bag', 'plate_rack', 'pomegranate', 'popsicle', 'porcupine', 37 | 'power_drill', 'pretzel', 'printer', 'puck', 'punching_bag', 'purse', 38 | 'rabbit', 'racket', 'ray', 'red_panda', 'refrigerator', 39 | 'remote_control', 'rubber_eraser', 'rugby_ball', 'ruler', 40 | 'salt_or_pepper_shaker', 'saxophone', 'scorpion', 'screwdriver', 41 | 'seal', 'sheep', 'ski', 'skunk', 'snail', 'snake', 'snowmobile', 42 | 'snowplow', 'soap_dispenser', 'soccer_ball', 'sofa', 'spatula', 43 | 'squirrel', 'starfish', 'stethoscope', 'stove', 'strainer', 44 | 'strawberry', 'stretcher', 'sunglasses', 'swimming_trunks', 'swine', 45 | 'syringe', 'table', 'tape_player', 'tennis_ball', 'tick', 'tie', 46 | 'tiger', 'toaster', 'traffic_light', 'train', 'trombone', 'trumpet', 47 | 'turtle', 'tv_or_monitor', 'unicycle', 'vacuum', 'violin', 48 | 'volleyball', 'waffle_iron', 'washer', 'water_bottle', 'watercraft', 49 | 'whale', 'wine_bottle', 'zebra' 50 | ] 51 | 52 | 53 | def imagenet_vid_classes(): 54 | return [ 55 | 'airplane', 'antelope', 'bear', 'bicycle', 'bird', 'bus', 'car', 56 | 'cattle', 'dog', 'domestic_cat', 'elephant', 'fox', 'giant_panda', 57 | 'hamster', 'horse', 'lion', 'lizard', 'monkey', 'motorcycle', 'rabbit', 58 | 'red_panda', 'sheep', 'snake', 'squirrel', 'tiger', 'train', 'turtle', 59 | 'watercraft', 'whale', 'zebra' 60 | ] 61 | 62 | 63 | def coco_classes(): 64 | return [ 65 | 'person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', 'train', 66 | 'truck', 'boat', 'traffic_light', 'fire_hydrant', 'stop_sign', 67 | 'parking_meter', 'bench', 'bird', 'cat', 'dog', 'horse', 'sheep', 68 | 'cow', 'elephant', 'bear', 'zebra', 'giraffe', 'backpack', 'umbrella', 69 | 'handbag', 'tie', 'suitcase', 'frisbee', 'skis', 'snowboard', 70 | 'sports_ball', 'kite', 'baseball_bat', 'baseball_glove', 'skateboard', 71 | 'surfboard', 'tennis_racket', 'bottle', 'wine_glass', 'cup', 'fork', 72 | 'knife', 'spoon', 'bowl', 'banana', 'apple', 
'sandwich', 'orange',
73 |         'broccoli', 'carrot', 'hot_dog', 'pizza', 'donut', 'cake', 'chair',
74 |         'couch', 'potted_plant', 'bed', 'dining_table', 'toilet', 'tv',
75 |         'laptop', 'mouse', 'remote', 'keyboard', 'cell_phone', 'microwave',
76 |         'oven', 'toaster', 'sink', 'refrigerator', 'book', 'clock', 'vase',
77 |         'scissors', 'teddy_bear', 'hair_drier', 'toothbrush'
78 |     ]
79 | 
80 | 
81 | dataset_aliases = {
82 |     'voc': ['voc', 'pascal_voc', 'voc07', 'voc12'],
83 |     'imagenet_det': ['det', 'imagenet_det', 'ilsvrc_det'],
84 |     'imagenet_vid': ['vid', 'imagenet_vid', 'ilsvrc_vid'],
85 |     'coco': ['coco', 'mscoco', 'ms_coco']
86 | }
87 | 
88 | 
89 | def get_classes(dataset):
90 |     """Get class names of a dataset."""
91 |     alias2name = {}
92 |     for name, aliases in dataset_aliases.items():
93 |         for alias in aliases:
94 |             alias2name[alias] = name
95 | 
96 |     if mmcv.is_str(dataset):
97 |         if dataset in alias2name:
98 |             labels = eval(alias2name[dataset] + '_classes()')
99 |         else:
100 |             raise ValueError('Unrecognized dataset: {}'.format(dataset))
101 |     else:
102 |         raise TypeError('dataset must be a str, but got {}'.format(type(dataset)))
103 |     return labels
104 | 
--------------------------------------------------------------------------------
/mmdet/core/loss/losses.py:
--------------------------------------------------------------------------------
1 | # TODO merge naive and weighted loss.
2 | import torch
3 | import torch.nn.functional as F
4 | 
5 | from ..bbox import bbox_overlaps
6 | from ...ops import sigmoid_focal_loss
7 | 
8 | 
9 | def weighted_nll_loss(pred, label, weight, avg_factor=None):
10 |     if avg_factor is None:
11 |         avg_factor = max(torch.sum(weight > 0).float().item(), 1.)
12 |     raw = F.nll_loss(pred, label, reduction='none')
13 |     return torch.sum(raw * weight)[None] / avg_factor
14 | 
15 | 
16 | def weighted_cross_entropy(pred, label, weight, avg_factor=None, reduce=True):
17 |     if avg_factor is None:
18 |         avg_factor = max(torch.sum(weight > 0).float().item(), 1.)
19 |     raw = F.cross_entropy(pred, label, reduction='none')
20 |     if reduce:
21 |         return torch.sum(raw * weight)[None] / avg_factor
22 |     else:
23 |         return raw * weight / avg_factor
24 | 
25 | 
26 | def weighted_binary_cross_entropy(pred, label, weight, avg_factor=None):
27 |     if pred.dim() != label.dim():
28 |         label, weight = _expand_binary_labels(label, weight, pred.size(-1))
29 |     if avg_factor is None:
30 |         avg_factor = max(torch.sum(weight > 0).float().item(), 1.)
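# Note (annotation): weight.float() is applied elementwise inside the BCE
# call; the summed loss is then normalized by avg_factor (the count of
# positive weights, floored at 1) rather than by the number of elements.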
31 | return F.binary_cross_entropy_with_logits( 32 | pred, label.float(), weight.float(), 33 | reduction='sum')[None] / avg_factor 34 | 35 | 36 | def py_sigmoid_focal_loss(pred, 37 | target, 38 | weight, 39 | gamma=2.0, 40 | alpha=0.25, 41 | reduction='mean'): 42 | pred_sigmoid = pred.sigmoid() 43 | target = target.type_as(pred) 44 | pt = (1 - pred_sigmoid) * target + pred_sigmoid * (1 - target) 45 | weight = (alpha * target + (1 - alpha) * (1 - target)) * weight 46 | weight = weight * pt.pow(gamma) 47 | loss = F.binary_cross_entropy_with_logits( 48 | pred, target, reduction='none') * weight 49 | reduction_enum = F._Reduction.get_enum(reduction) 50 | # none: 0, mean:1, sum: 2 51 | if reduction_enum == 0: 52 | return loss 53 | elif reduction_enum == 1: 54 | return loss.mean() 55 | elif reduction_enum == 2: 56 | return loss.sum() 57 | 58 | 59 | def weighted_sigmoid_focal_loss(pred, 60 | target, 61 | weight, 62 | gamma=2.0, 63 | alpha=0.25, 64 | avg_factor=None, 65 | num_classes=80): 66 | if avg_factor is None: 67 | avg_factor = torch.sum(weight > 0).float().item() / num_classes + 1e-6 68 | return torch.sum( 69 | sigmoid_focal_loss(pred, target, gamma, alpha, 'none') * weight.view( 70 | -1, 1))[None] / avg_factor 71 | 72 | 73 | def mask_cross_entropy(pred, target, label): 74 | num_rois = pred.size()[0] 75 | inds = torch.arange(0, num_rois, dtype=torch.long, device=pred.device) 76 | pred_slice = pred[inds, label].squeeze(1) 77 | return F.binary_cross_entropy_with_logits( 78 | pred_slice, target, reduction='mean')[None] 79 | 80 | 81 | def smooth_l1_loss(pred, target, beta=1.0, reduction='mean'): 82 | assert beta > 0 83 | assert pred.size() == target.size() and target.numel() > 0 84 | diff = torch.abs(pred - target) 85 | loss = torch.where(diff < beta, 0.5 * diff * diff / beta, 86 | diff - 0.5 * beta) 87 | reduction_enum = F._Reduction.get_enum(reduction) 88 | # none: 0, mean:1, sum: 2 89 | if reduction_enum == 0: 90 | return loss 91 | elif reduction_enum == 1: 92 | return loss.sum() / pred.numel() 93 | elif reduction_enum == 2: 94 | return loss.sum() 95 | 96 | 97 | def weighted_smoothl1(pred, target, weight, beta=1.0, avg_factor=None): 98 | if avg_factor is None: 99 | avg_factor = torch.sum(weight > 0).float().item() / 4 + 1e-6 100 | loss = smooth_l1_loss(pred, target, beta, reduction='none') 101 | return torch.sum(loss * weight)[None] / avg_factor 102 | 103 | 104 | def accuracy(pred, target, topk=1): 105 | if isinstance(topk, int): 106 | topk = (topk, ) 107 | return_single = True 108 | else: 109 | return_single = False 110 | 111 | maxk = max(topk) 112 | _, pred_label = pred.topk(maxk, 1, True, True) 113 | pred_label = pred_label.t() 114 | correct = pred_label.eq(target.view(1, -1).expand_as(pred_label)) 115 | 116 | res = [] 117 | for k in topk: 118 | correct_k = correct[:k].view(-1).float().sum(0, keepdim=True) 119 | res.append(correct_k.mul_(100.0 / pred.size(0))) 120 | return res[0] if return_single else res 121 | 122 | 123 | def _expand_binary_labels(labels, label_weights, label_channels): 124 | bin_labels = labels.new_full((labels.size(0), label_channels), 0) 125 | inds = torch.nonzero(labels >= 1).squeeze() 126 | if inds.numel() > 0: 127 | bin_labels[inds, labels[inds] - 1] = 1 128 | bin_label_weights = label_weights.view(-1, 1).expand( 129 | label_weights.size(0), label_channels) 130 | return bin_labels, bin_label_weights 131 | 132 | 133 | def iou_loss(pred_bboxes, target_bboxes, reduction='mean'): 134 | ious = bbox_overlaps(pred_bboxes, target_bboxes, is_aligned=True) 135 | loss = 
-ious.log()
136 | 
137 |     reduction_enum = F._Reduction.get_enum(reduction)
138 |     if reduction_enum == 0:
139 |         return loss
140 |     elif reduction_enum == 1:
141 |         return loss.mean()
142 |     elif reduction_enum == 2:
143 |         return loss.sum()
144 | 
--------------------------------------------------------------------------------
/mmdet/apis/inference.py:
--------------------------------------------------------------------------------
1 | import warnings
2 | 
3 | import mmcv
4 | import mmcv_custom
5 | import numpy as np
6 | import pycocotools.mask as maskUtils
7 | import torch
8 | from mmcv.runner import load_checkpoint
9 | 
10 | from mmdet.core import get_classes
11 | from mmdet.datasets import to_tensor
12 | from mmdet.datasets.transforms import ImageTransform
13 | from mmdet.models import build_detector
14 | 
15 | 
16 | def init_detector(config, checkpoint=None, device='cuda:0'):
17 |     """Initialize a detector from config file.
18 | 
19 |     Args:
20 |         config (str or :obj:`mmcv.Config`): Config file path or the config
21 |             object.
22 |         checkpoint (str, optional): Checkpoint path. If left as None, the model
23 |             will not load any weights.
24 | 
25 |     Returns:
26 |         nn.Module: The constructed detector.
27 |     """
28 |     if isinstance(config, str):
29 |         config = mmcv.Config.fromfile(config)
30 |     elif not isinstance(config, mmcv.Config):
31 |         raise TypeError('config must be a filename or Config object, '
32 |                         'but got {}'.format(type(config)))
33 |     config.model.pretrained = None
34 |     model = build_detector(config.model, test_cfg=config.test_cfg)
35 |     if checkpoint is not None:
36 |         checkpoint = load_checkpoint(model, checkpoint)
37 |         if 'CLASSES' in checkpoint['meta']:
38 |             model.CLASSES = checkpoint['meta']['CLASSES']
39 |         else:
40 |             warnings.warn('Class names are not saved in the checkpoint\'s '
41 |                           'meta data, use COCO classes by default.')
42 |             model.CLASSES = get_classes('coco')
43 |     model.cfg = config  # save the config in the model for convenience
44 |     model.to(device)
45 |     model.eval()
46 |     return model
47 | 
48 | 
49 | def inference_detector(model, imgs):
50 |     """Inference image(s) with the detector.
51 | 
52 |     Args:
53 |         model (nn.Module): The loaded detector.
54 |         imgs (str/ndarray or list[str/ndarray]): Either image files or loaded
55 |             images.
56 | 
57 |     Returns:
58 |         If imgs is a list, a generator over the per-image results is returned;
59 |         otherwise the detection result for the single image is returned directly.
60 | """ 61 | cfg = model.cfg 62 | img_transform = ImageTransform( 63 | size_divisor=cfg.data.test.size_divisor, **cfg.img_norm_cfg) 64 | 65 | device = next(model.parameters()).device # model device 66 | if not isinstance(imgs, list): 67 | return _inference_single(model, imgs, img_transform, device) 68 | else: 69 | return _inference_generator(model, imgs, img_transform, device) 70 | 71 | 72 | def _prepare_data(img, img_transform, cfg, device): 73 | ori_shape = img.shape 74 | img, img_shape, pad_shape, scale_factor = img_transform( 75 | img, 76 | scale=cfg.data.test.img_scale, 77 | keep_ratio=cfg.data.test.get('resize_keep_ratio', True)) 78 | img = to_tensor(img).to(device).unsqueeze(0) 79 | img_meta = [ 80 | dict( 81 | ori_shape=ori_shape, 82 | img_shape=img_shape, 83 | pad_shape=pad_shape, 84 | scale_factor=scale_factor, 85 | flip=False) 86 | ] 87 | return dict(img=[img], img_meta=[img_meta]) 88 | 89 | 90 | def _inference_single(model, img, img_transform, device): 91 | img = mmcv.imread(img) 92 | data = _prepare_data(img, img_transform, model.cfg, device) 93 | with torch.no_grad(): 94 | result = model(return_loss=False, rescale=True, **data) 95 | return result 96 | 97 | 98 | def _inference_generator(model, imgs, img_transform, device): 99 | for img in imgs: 100 | yield _inference_single(model, img, img_transform, device) 101 | 102 | 103 | # TODO: merge this method with the one in BaseDetector 104 | def show_result(img, result, class_names, score_thr=0.3, out_file=None): 105 | """Visualize the detection results on the image. 106 | 107 | Args: 108 | img (str or np.ndarray): Image filename or loaded image. 109 | result (tuple[list] or list): The detection result, can be either 110 | (bbox, segm) or just bbox. 111 | class_names (list[str] or tuple[str]): A list of class names. 112 | score_thr (float): The threshold to visualize the bboxes and masks. 113 | out_file (str, optional): If specified, the visualization result will 114 | be written to the out file instead of shown in a window. 
115 | """ 116 | assert isinstance(class_names, (tuple, list)) 117 | img = mmcv_custom.imread(img) 118 | if isinstance(result, tuple): 119 | bbox_result, segm_result = result 120 | else: 121 | bbox_result, segm_result = result, None 122 | bboxes = np.vstack(bbox_result) 123 | # draw segmentation masks 124 | if segm_result is not None: 125 | segms = mmcv.concat_list(segm_result) 126 | inds = np.where(bboxes[:, -1] > score_thr)[0] 127 | for i in inds: 128 | color_mask = np.random.randint(0, 256, (1, 3), dtype=np.uint8) 129 | mask = maskUtils.decode(segms[i]).astype(np.bool) 130 | img[mask] = img[mask] * 0.5 + color_mask * 0.5 131 | # draw bounding boxes 132 | labels = [ 133 | np.full(bbox.shape[0], i, dtype=np.int32) 134 | for i, bbox in enumerate(bbox_result) 135 | ] 136 | labels = np.concatenate(labels) 137 | mmcv.imshow_det_bboxes( 138 | img.copy(), 139 | bboxes, 140 | labels, 141 | class_names=class_names, 142 | score_thr=score_thr, 143 | show=out_file is None, 144 | out_file=out_file) 145 | -------------------------------------------------------------------------------- /mmdet/models/necks/fpn.py: -------------------------------------------------------------------------------- 1 | import torch.nn as nn 2 | import torch.nn.functional as F 3 | from mmcv.cnn import xavier_init 4 | 5 | from ..registry import NECKS 6 | from ..utils import ConvModule 7 | 8 | 9 | @NECKS.register_module 10 | class FPN(nn.Module): 11 | 12 | def __init__(self, 13 | in_channels, 14 | out_channels, 15 | num_outs, 16 | start_level=0, 17 | end_level=-1, 18 | add_extra_convs=False, 19 | extra_convs_on_inputs=True, 20 | relu_before_extra_convs=False, 21 | conv_cfg=None, 22 | norm_cfg=None, 23 | activation=None): 24 | super(FPN, self).__init__() 25 | assert isinstance(in_channels, list) 26 | self.in_channels = in_channels 27 | self.out_channels = out_channels 28 | self.num_ins = len(in_channels) 29 | self.num_outs = num_outs 30 | self.activation = activation 31 | self.relu_before_extra_convs = relu_before_extra_convs 32 | 33 | if end_level == -1: 34 | self.backbone_end_level = self.num_ins 35 | assert num_outs >= self.num_ins - start_level 36 | else: 37 | # if end_level < inputs, no extra level is allowed 38 | self.backbone_end_level = end_level 39 | assert end_level <= len(in_channels) 40 | assert num_outs == end_level - start_level 41 | self.start_level = start_level 42 | self.end_level = end_level 43 | self.add_extra_convs = add_extra_convs 44 | self.extra_convs_on_inputs = extra_convs_on_inputs 45 | 46 | self.lateral_convs = nn.ModuleList() 47 | self.fpn_convs = nn.ModuleList() 48 | 49 | for i in range(self.start_level, self.backbone_end_level): 50 | l_conv = ConvModule( 51 | in_channels[i], 52 | out_channels, 53 | 1, 54 | conv_cfg=conv_cfg, 55 | norm_cfg=norm_cfg, 56 | activation=self.activation, 57 | inplace=False) 58 | fpn_conv = ConvModule( 59 | out_channels, 60 | out_channels, 61 | 3, 62 | padding=1, 63 | conv_cfg=conv_cfg, 64 | norm_cfg=norm_cfg, 65 | activation=self.activation, 66 | inplace=False) 67 | 68 | self.lateral_convs.append(l_conv) 69 | self.fpn_convs.append(fpn_conv) 70 | 71 | # add extra conv layers (e.g., RetinaNet) 72 | extra_levels = num_outs - self.backbone_end_level + self.start_level 73 | if add_extra_convs and extra_levels >= 1: 74 | for i in range(extra_levels): 75 | if i == 0 and self.extra_convs_on_inputs: 76 | in_channels = self.in_channels[self.backbone_end_level - 1] 77 | else: 78 | in_channels = out_channels 79 | extra_fpn_conv = ConvModule( 80 | in_channels, 81 | out_channels, 82 | 3, 
83 | stride=2, 84 | padding=1, 85 | conv_cfg=conv_cfg, 86 | norm_cfg=norm_cfg, 87 | activation=self.activation, 88 | inplace=False) 89 | self.fpn_convs.append(extra_fpn_conv) 90 | 91 | # default init_weights for conv(msra) and norm in ConvModule 92 | def init_weights(self): 93 | for m in self.modules(): 94 | if isinstance(m, nn.Conv2d): 95 | xavier_init(m, distribution='uniform') 96 | 97 | def forward(self, inputs): 98 | assert len(inputs) == len(self.in_channels) 99 | 100 | # build laterals 101 | laterals = [ 102 | lateral_conv(inputs[i + self.start_level]) 103 | for i, lateral_conv in enumerate(self.lateral_convs) 104 | ] 105 | 106 | # build top-down path 107 | used_backbone_levels = len(laterals) 108 | for i in range(used_backbone_levels - 1, 0, -1): 109 | laterals[i - 1] += F.interpolate( 110 | laterals[i], scale_factor=2, mode='nearest') 111 | 112 | # build outputs 113 | # part 1: from original levels 114 | outs = [ 115 | self.fpn_convs[i](laterals[i]) for i in range(used_backbone_levels) 116 | ] 117 | # part 2: add extra levels 118 | if self.num_outs > len(outs): 119 | # use max pool to get more levels on top of outputs 120 | # (e.g., Faster R-CNN, Mask R-CNN) 121 | if not self.add_extra_convs: 122 | for i in range(self.num_outs - used_backbone_levels): 123 | outs.append(F.max_pool2d(outs[-1], 1, stride=2)) 124 | # add conv layers on top of original feature maps (RetinaNet) 125 | else: 126 | if self.extra_convs_on_inputs: 127 | orig = inputs[self.backbone_end_level - 1] 128 | outs.append(self.fpn_convs[used_backbone_levels](orig)) 129 | else: 130 | outs.append(self.fpn_convs[used_backbone_levels](outs[-1])) 131 | for i in range(used_backbone_levels + 1, self.num_outs): 132 | if self.relu_before_extra_convs: 133 | outs.append(self.fpn_convs[i](F.relu(outs[-1]))) 134 | else: 135 | outs.append(self.fpn_convs[i](outs[-1])) 136 | return tuple(outs) 137 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Spatially Adaptive Inference with Stochastic Feature Sampling and Interpolation 2 | 3 | ## Introduction 4 | 5 | This repo is the official implementation of ["Spatially Adaptive Inference with Stochastic Feature Sampling and Interpolation"](https://arxiv.org/abs/2003.08866) on COCO object detection. The code is based on [MMDetection](https://github.com/open-mmlab/mmdetection) v0.6.0. 6 | 7 |
8 | *(Figure: `demo/github_raw_image.png`, `demo/github_deterministic_sampling.png`, `demo/github_stochastic_sampling.png`)*
9 | 
10 | Deterministic sampling (middle) and stochastic sampling (right) from the raw image (left).
11 | 
12 | *(Figure: `demo/github_pipeline_gumbel.png`)*
13 | 
14 | Stochastic sampling-interpolation network (a) and comparison between deterministic sampling (b, left) and stochastic sampling (b, right).
15 | 
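The stochastic mask shown in the pipeline figure is trained with a Gumbel-based reparameterization (motivated in the abstract below; the repository's implementation lives in `mmdet/models/utils/gumbel_sigmoid.py`). As a rough, minimal sketch of the idea — an illustration, not the repository's exact code — a Gumbel-Sigmoid sample with straight-through gradients can be written as:

```python
import torch

def gumbel_sigmoid(logits, tau=1.0, hard=True):
    """Relaxed Bernoulli sample via the binary Gumbel-Softmax trick (sketch)."""
    # If E ~ Exp(1), then -log(E) ~ Gumbel(0, 1); draw two independent samples.
    g1 = -torch.empty_like(logits).exponential_().log()
    g2 = -torch.empty_like(logits).exponential_().log()
    # Soft sample in (0, 1); smaller tau pushes it toward binary values.
    soft = torch.sigmoid((logits + g1 - g2) / tau)
    if hard:
        # Straight-through: binary mask in the forward pass, gradients of the
        # soft sample in the backward pass.
        return (soft > 0.5).float() + soft - soft.detach()
    return soft

# Example: sample a sparse spatial mask from per-location logits.
mask = gumbel_sigmoid(torch.randn(1, 1, 32, 32), tau=0.5)
```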
16 | 
17 | **Abstract.** In the feature maps of CNNs, there commonly exists considerable spatial redundancy that leads to much repetitive processing. Towards reducing this superfluous computation, we propose to compute features only at sparsely sampled locations, which are probabilistically chosen according to activation responses, and then densely reconstruct the feature map with an efficient interpolation procedure. With this sampling-interpolation scheme, our network avoids expending computation on spatial locations that can be effectively interpolated, while being robust to activation prediction errors through broadly distributed sampling. A technical challenge of this sampling-based approach is that the binary decision variables for representing discrete sampling locations are non-differentiable, making them incompatible with backpropagation. To circumvent this issue, we make use of a reparameterization trick based on the Gumbel-Softmax distribution, with which backpropagation can iterate these variables towards binary values. The presented network is experimentally shown to save substantial computation while maintaining accuracy over a variety of computer vision tasks.
18 | 
19 | ## Model Zoo
20 | 
21 | All models are based on Faster R-CNN with FPN.
22 | 
23 | |Backbone|Resolution|Sparse Loss Weight|mAP|GFLOPs|Config|Model URL|Model sha256sum|
24 | |-----|-----|-----|-----|-----|-----|-----|-----|
25 | |ResNet-101| 500|-|38.5| 70.0|[config](./configs/resnet_101_faster_rcnn_res500.py) |[model](https://drive.google.com/file/d/1QP1s5EJ3ld5H0tCqW_XatSOHdf7vpO7E/view?usp=sharing)|206b4c0e|
26 | |ResNet-101| 600|-|40.4|100.2|[config](./configs/resnet_101_faster_rcnn_res600.py) |[model](https://drive.google.com/file/d/1gPITSogNwTwbPdA6Cbez4rmjrq2ENSSi/view?usp=sharing)|c4e102de|
27 | |ResNet-101| 800|-|42.3|184.1|[config](./configs/resnet_101_faster_rcnn_res800.py) |[model](https://drive.google.com/file/d/1U_-7b9a2VMM81IJu4-gEyGtsL7lupfAa/view?usp=sharing)|3fc2af7a|
28 | |ResNet-101|1000|-|43.4|289.5|[config](./configs/resnet_101_faster_rcnn_res1000.py)|[model](https://drive.google.com/file/d/1XOyzBgNndl8TKaMs0_5QSu-obHscWsWp/view?usp=sharing)|e043c999|
29 | |SparseResNet-101|1000|0.02|43.3|164.8|[config](./configs/sparse_resnet_101_faster_rcnn_sparse_loss_weight_0_02.py)|[model](https://drive.google.com/file/d/1z942uPLEsfa0Cra7C63qbeWXU7FMURyS/view?usp=sharing)|16a152e0|
30 | |SparseResNet-101|1000|0.05|42.7|120.3|[config](./configs/sparse_resnet_101_faster_rcnn_sparse_loss_weight_0_05.py)|[model](https://drive.google.com/file/d/16X-IaqlJwHhZnbVMOdDeXkNfICNCL3Dw/view?usp=sharing)|f0a467c8|
31 | |SparseResNet-101|1000| 0.1|41.9| 94.4|[config](./configs/sparse_resnet_101_faster_rcnn_sparse_loss_weight_0_1.py) |[model](https://drive.google.com/file/d/1KrCMXTvSJR7QslsfXRnZCQJUvtCyS8SW/view?usp=sharing)|1c9bf665|
32 | |SparseResNet-101|1000| 0.2|40.7| 71.4|[config](./configs/sparse_resnet_101_faster_rcnn_sparse_loss_weight_0_2.py) |[model](https://drive.google.com/file/d/1El9uroSGNjRAixNpOAAjczTyKGa9HISu/view?usp=sharing)|46044e4a|
33 | 
34 | ## Getting Started
35 | 
36 | ### Requirements
37 | 
38 | At present, we have not checked the compatibility of the code with other versions of the packages, so we only recommend the following configuration.
39 | 
40 | - Python 3.7
41 | - PyTorch == 1.1.0
42 | - Torchvision == 0.3.0
43 | - CUDA 9.0
44 | - Other dependencies
45 | 
46 | ### Installation
47 | 
48 | We recommend using a conda env to set up the experimental environment.
49 | 
50 | ```shell script
51 | # Create environment
52 | conda create -n SAI_Det python=3.7 -y
53 | conda activate SAI_Det
54 | 
55 | # Install PyTorch & Torchvision
56 | conda install pytorch=1.1.0 cudatoolkit=9.0 torchvision -c pytorch -y
57 | 
58 | # Clone repo
59 | git clone https://github.com/zdaxie/SpatiallyAdaptiveInference-Detection ./SAI_Det
60 | cd ./SAI_Det
61 | 
62 | # Create soft link for data
63 | mkdir data
64 | cd data
65 | ln -s ${COCO-Path} ./coco
66 | cd ..
67 | 
68 | # Install requirements and compile operators
69 | ./init.sh
70 | ```
71 | 
72 | ### Running
73 | 
74 | For now, we only support training with 8 GPUs.
75 | 
76 | ```shell script
77 | # Test with the given config & model
78 | ./tools/dist_test.sh ${config-path} ${model-path} ${num-gpus} --out ${output-file.pkl}
79 | 
80 | # Train with the given config
81 | ./tools/dist_train.sh ${config-path} ${num-gpus}
82 | ```
83 | 
84 | ## License
85 | 
86 | This project is released under the [Apache 2.0 license](LICENSE).
87 | 
88 | ## Citation
89 | 
90 | If you use our codebase or models in your research, please cite this project.
91 | 
92 | ```
93 | @InProceedings{xie2020spatially,
94 |     author = {Xie, Zhenda and Zhang, Zheng and Zhu, Xizhou and Huang, Gao and Lin, Stephen},
95 |     title = {Spatially Adaptive Inference with Stochastic Feature Sampling and Interpolation},
96 |     booktitle = {European Conference on Computer Vision (ECCV)},
97 |     year = {2020},
98 |     month = {August},
99 | }
100 | ```
101 | 
--------------------------------------------------------------------------------
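Beyond the distributed scripts above, single images can be run through a trained model with the helpers from `mmdet/apis/inference.py`. A minimal sketch — the checkpoint and output paths below are placeholders, and the config is one of the Model Zoo entries:

```python
from mmdet.apis.inference import init_detector, inference_detector, show_result

config_file = 'configs/resnet_101_faster_rcnn_res800.py'  # from the Model Zoo table
checkpoint_file = 'checkpoints/res800.pth'                # placeholder: downloaded model

# Build the detector, load its weights, and move it to the GPU.
model = init_detector(config_file, checkpoint_file, device='cuda:0')

# A single (non-list) input returns the result directly.
result = inference_detector(model, 'demo/github_raw_image.png')

# Draw boxes above the score threshold; writes to out_file instead of a window.
show_result('demo/github_raw_image.png', result, model.CLASSES,
            score_thr=0.3, out_file='demo/detections.png')
```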