├── README.md ├── config └── nanodet-m.yml ├── detect_main.py ├── docs ├── Model_arch.png ├── Title.jpg └── config_file_detail.md ├── model └── nanodet_m.pth ├── nanodet ├── data │ ├── collate.py │ ├── dataset │ │ ├── __init__.py │ │ ├── base.py │ │ └── coco.py │ └── transform │ │ ├── __init__.py │ │ ├── __pycache__ │ │ ├── __init__.cpython-38.pyc │ │ ├── color.cpython-38.pyc │ │ ├── pipeline.cpython-38.pyc │ │ └── warp.cpython-38.pyc │ │ ├── color.py │ │ ├── mosaic.py │ │ ├── pipeline.py │ │ └── warp.py ├── evaluator │ ├── __init__.py │ └── coco_detection.py ├── model │ ├── arch │ │ ├── __init__.py │ │ ├── __pycache__ │ │ │ ├── __init__.cpython-38.pyc │ │ │ ├── gfl.cpython-38.pyc │ │ │ └── one_stage.cpython-38.pyc │ │ ├── gfl.py │ │ └── one_stage.py │ ├── backbone │ │ ├── __init__.py │ │ ├── __pycache__ │ │ │ ├── __init__.cpython-38.pyc │ │ │ ├── ghostnet.cpython-38.pyc │ │ │ ├── mobilenetv2.cpython-38.pyc │ │ │ ├── resnet.cpython-38.pyc │ │ │ └── shufflenetv2.cpython-38.pyc │ │ ├── ghostnet.py │ │ ├── mobilenetv2.py │ │ ├── resnet.py │ │ └── shufflenetv2.py │ ├── fpn │ │ ├── __init__.py │ │ ├── __pycache__ │ │ │ ├── __init__.cpython-38.pyc │ │ │ ├── fpn.cpython-38.pyc │ │ │ └── pan.cpython-38.pyc │ │ ├── fpn.py │ │ └── pan.py │ ├── head │ │ ├── __init__.py │ │ ├── __pycache__ │ │ │ ├── __init__.cpython-38.pyc │ │ │ ├── gfl_head.cpython-38.pyc │ │ │ └── nanodet_head.cpython-38.pyc │ │ ├── anchor │ │ │ ├── __pycache__ │ │ │ │ ├── anchor_generator.cpython-38.pyc │ │ │ │ ├── anchor_target.cpython-38.pyc │ │ │ │ └── base_anchor_head.cpython-38.pyc │ │ │ ├── anchor_generator.py │ │ │ ├── anchor_target.py │ │ │ └── base_anchor_head.py │ │ ├── assigner │ │ │ ├── __pycache__ │ │ │ │ ├── assign_result.cpython-38.pyc │ │ │ │ ├── atss_assigner.cpython-38.pyc │ │ │ │ └── base_assigner.cpython-38.pyc │ │ │ ├── assign_result.py │ │ │ ├── atss_assigner.py │ │ │ └── base_assigner.py │ │ ├── gfl_head.py │ │ ├── nanodet_head.py │ │ └── sampler │ │ │ ├── __pycache__ │ │ │ ├── base_sampler.cpython-38.pyc │ │ │ ├── pseudo_sampler.cpython-38.pyc │ │ │ └── sampling_result.cpython-38.pyc │ │ │ ├── base_sampler.py │ │ │ ├── pseudo_sampler.py │ │ │ └── sampling_result.py │ ├── loss │ │ ├── __pycache__ │ │ │ ├── gfocal_loss.cpython-38.pyc │ │ │ ├── iou_loss.cpython-38.pyc │ │ │ └── utils.cpython-38.pyc │ │ ├── gfocal_loss.py │ │ ├── iou_loss.py │ │ ├── utils.py │ │ └── varifocal_loss.py │ └── module │ │ ├── __pycache__ │ │ ├── activation.cpython-38.pyc │ │ ├── conv.cpython-38.pyc │ │ ├── init_weights.cpython-38.pyc │ │ ├── nms.cpython-38.pyc │ │ ├── norm.cpython-38.pyc │ │ └── scale.cpython-38.pyc │ │ ├── activation.py │ │ ├── conv.py │ │ ├── init_weights.py │ │ ├── nms.py │ │ ├── norm.py │ │ └── scale.py ├── nanodet.zip ├── trainer │ ├── __init__.py │ ├── dist_trainer.py │ └── trainer.py └── util │ ├── __init__.py │ ├── __pycache__ │ ├── __init__.cpython-38.pyc │ ├── box_transform.cpython-38.pyc │ ├── check_point.cpython-38.pyc │ ├── config.cpython-38.pyc │ ├── data_parallel.cpython-38.pyc │ ├── distributed_data_parallel.cpython-38.pyc │ ├── flops_counter.cpython-38.pyc │ ├── logger.cpython-38.pyc │ ├── path.cpython-38.pyc │ ├── rank_filter.cpython-38.pyc │ ├── scatter_gather.cpython-38.pyc │ ├── util_mixins.cpython-38.pyc │ ├── visualization.cpython-38.pyc │ └── yacs.cpython-38.pyc │ ├── box_transform.py │ ├── check_point.py │ ├── config.py │ ├── data_parallel.py │ ├── distributed_data_parallel.py │ ├── flops_counter.py │ ├── logger.py │ ├── path.py │ ├── rank_filter.py │ ├── scatter_gather.py │ ├── 
util_mixins.py │ ├── visualization.py │ └── yacs.py ├── requirements.txt ├── street.png └── tools ├── export.py ├── flops.py ├── inference.py ├── test.py └── train.py /README.md: -------------------------------------------------------------------------------- 1 | # NanoDet-PyTorch 2 | 3 | * Note: the original author's open-source NanoDet code lives at https://github.com/RangiLyu/nanodet (much respect) 4 | * **This code is a lightly trimmed fork of the NanoDet project, dedicated to the pure-Python, PyTorch implementation. It runs right after download and supports object detection on images, video files, and live webcam streams.** 5 | 6 | 7 | - Models such as YOLO, SSD and Fast R-CNN detect objects quickly and accurately, but they are fairly large and hard to port to mobile or embedded devices; 8 | - The lightweight NanoDet-m model slims down all three modules of a one-stage detector (Head, Neck, Backbone), so detection is very fast and the model file is only a few megabytes (under 4 MB). 9 | - NanoDet is an FCOS-style one-stage anchor-free detector that uses ATSS for target sampling and Generalized Focal Loss for classification and box regression 10 | 11 | ## Model performance 12 | 13 | Model |Resolution|COCO mAP|Latency(ARM 4xCore)|FLOPS|Params | Model Size(ncnn bin) 14 | :--------:|:--------:|:------:|:-----------------:|:---:|:-------:|:-------: 15 | NanoDet-m | 320*320 | 20.6 | 10.23ms | 0.72B | 0.95M | 1.8mb 16 | NanoDet-m | 416*416 | 21.7 | 16.44ms | 1.2B | 0.95M | 1.8mb 17 | YoloV3-Tiny| 416*416 | 16.6 | 37.6ms | 5.62B | 8.86M | 33.7mb 18 | YoloV4-Tiny| 416*416 | 21.7 | 32.81ms | 6.96B | 6.06M | 23.0mb 19 | 20 | Notes: 21 | * The numbers above were measured with ncnn on a Kirin 980 (4xA76+4xA55) ARM CPU. 22 | * COCO mAP (0.5:0.95) is used as the evaluation metric, covering both detection and localization accuracy, measured on the 5000 images of COCO val without Testing-Time-Augmentation. 23 | 24 | ## NanoDet loss function 25 | * NanoDet uses the Generalized Focal Loss proposed by Xiang Li et al. This loss removes the Centerness branch of FCOS, and with it the many convolutions on that branch, which cuts the computational cost of the detection head and suits lightweight mobile deployment very well. 26 | * For details see: Generalized Focal Loss: Learning Qualified and Distributed Bounding Boxes for Dense Object Detection 27 | 28 | ## NanoDet advantages 29 | * Ultra lightweight: the model file is only a few megabytes (nanodet_m.pth is under 4 MB); 30 | * Ultra fast: 97 fps (10.23 ms) on a mobile ARM CPU; 31 | * Training friendly: much lower GPU memory cost than other models; it trains with batch size 80 on a GTX 1060 6G; 32 | * Easy to deploy: a C++ implementation based on the ncnn inference framework and an Android demo are provided. 33 | 34 | ## Development environment 35 | ```text 36 | Cython 37 | termcolor 38 | numpy 39 | torch>=1.3 40 | torchvision 41 | tensorboard 42 | pycocotools 43 | matplotlib 44 | pyaml 45 | opencv-python 46 | tqdm 47 | ``` 48 | In practice, GPU acceleration (graphics driver, cudatoolkit, cudnn), PyTorch and pycocotools tend to be the trickiest pieces to install. 49 | 50 | For setting up a Windows development environment you can refer to: 51 | ```text 52 | cudatoolkit 10.1 and cudnn 7.6: https://blog.csdn.net/qq_41204464/article/details/108807165 53 | PyTorch: https://blog.csdn.net/u014723479/article/details/103001861 54 | pycocotools: https://blog.csdn.net/weixin_41166529/article/details/109997105 55 | ``` 56 | 57 | ## Running the demos 58 | ```text 59 | '''Object detection - image''' 60 | # python detect_main.py image --config ./config/nanodet-m.yml --model model/nanodet_m.pth --path street.png 61 | 62 | '''Object detection - video file''' 63 | # python detect_main.py video --config ./config/nanodet-m.yml --model model/nanodet_m.pth --path test.mp4 64 | 65 | '''Object detection - webcam''' 66 | # python detect_main.py webcam --config ./config/nanodet-m.yml --model model/nanodet_m.pth --path 0 67 | ``` 68 | 69 | 70 | ## Summary 71 | * Testing confirms that NanoDet really is fast, but its detection accuracy and quality are noticeably below YOLOv4's. 72 | * It suits mobile or embedded devices that need real-time speed and can tolerate lower detection accuracy. 73 | 74 | ## Detailed write-up 75 | https://guo-pu.blog.csdn.net/article/details/110410940 76 | 77 | ## Other versions 78 | * For object detection with a small, fast model on embedded devices without a GPU, such as the Raspberry Pi, ARM development boards and other embedded boards. https://github.com/guo-pu/NanoDet-PyTorch-CPU 79 | -------------------------------------------------------------------------------- /config/nanodet-m.yml: -------------------------------------------------------------------------------- 1 | #Config File example 2 | save_dir: workspace/nanodet_m 3 | model: 4 | arch: 5 | name: GFL 6 | backbone: 7 | name: ShuffleNetV2 8 | model_size: 1.0x 9 | out_stages: [2,3,4]
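# Note: out_stages selects which backbone stages feed the FPN; for ShuffleNetV2 1.0x these three stages output 116/232/464 channels, which is why fpn.in_channels below is [116, 232, 464].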
10 | activation: LeakyReLU 11 | fpn: 12 | name: PAN 13 | in_channels: [116, 232, 464] 14 | out_channels: 96 15 | start_level: 0 16 | num_outs: 3 17 | head: 18 | name: NanoDetHead 19 | num_classes: 80 20 | input_channel: 96 21 | feat_channels: 96 22 | stacked_convs: 2 23 | share_cls_reg: True 24 | octave_base_scale: 5 25 | scales_per_octave: 1 26 | strides: [8, 16, 32] 27 | reg_max: 7 28 | norm_cfg: 29 | type: BN 30 | loss: 31 | loss_qfl: 32 | name: QualityFocalLoss 33 | use_sigmoid: True 34 | beta: 2.0 35 | loss_weight: 1.0 36 | loss_dfl: 37 | name: DistributionFocalLoss 38 | loss_weight: 0.25 39 | loss_bbox: 40 | name: GIoULoss 41 | loss_weight: 2.0 42 | data: 43 | train: 44 | name: coco 45 | img_path: coco/train2017 46 | ann_path: coco/annotations/instances_train2017.json 47 | input_size: [320,320] #[w,h] 48 | keep_ratio: True 49 | pipeline: 50 | perspective: 0.0 51 | scale: [0.6, 1.4] 52 | stretch: [[1, 1], [1, 1]] 53 | rotation: 0 54 | shear: 0 55 | translate: 0 56 | flip: 0.5 57 | brightness: 0.2 58 | contrast: [0.8, 1.2] 59 | saturation: [0.8, 1.2] 60 | normalize: [[103.53, 116.28, 123.675], [57.375, 57.12, 58.395]] 61 | val: 62 | name: coco 63 | img_path: coco/val2017 64 | ann_path: coco/annotations/instances_val2017.json 65 | input_size: [320,320] #[w,h] 66 | keep_ratio: True 67 | pipeline: 68 | normalize: [[103.53, 116.28, 123.675], [57.375, 57.12, 58.395]] 69 | device: 70 | gpu_ids: [0] 71 | workers_per_gpu: 12 72 | batchsize_per_gpu: 160 73 | schedule: 74 | # resume: 75 | # load_model: YOUR_MODEL_PATH 76 | optimizer: 77 | name: SGD 78 | lr: 0.14 79 | momentum: 0.9 80 | weight_decay: 0.0001 81 | warmup: 82 | name: linear 83 | steps: 300 84 | ratio: 0.1 85 | total_epochs: 70 86 | lr_schedule: 87 | name: MultiStepLR 88 | milestones: [40,55,60,65] 89 | gamma: 0.1 90 | val_intervals: 10 91 | evaluator: 92 | name: CocoDetectionEvaluator 93 | save_key: mAP 94 | 95 | log: 96 | interval: 10 97 | 98 | class_names: ['person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', 99 | 'train', 'truck', 'boat', 'traffic_light', 'fire_hydrant', 100 | 'stop_sign', 'parking_meter', 'bench', 'bird', 'cat', 'dog', 101 | 'horse', 'sheep', 'cow', 'elephant', 'bear', 'zebra', 'giraffe', 102 | 'backpack', 'umbrella', 'handbag', 'tie', 'suitcase', 'frisbee', 103 | 'skis', 'snowboard', 'sports_ball', 'kite', 'baseball_bat', 104 | 'baseball_glove', 'skateboard', 'surfboard', 'tennis_racket', 105 | 'bottle', 'wine_glass', 'cup', 'fork', 'knife', 'spoon', 'bowl', 106 | 'banana', 'apple', 'sandwich', 'orange', 'broccoli', 'carrot', 107 | 'hot_dog', 'pizza', 'donut', 'cake', 'chair', 'couch', 108 | 'potted_plant', 'bed', 'dining_table', 'toilet', 'tv', 'laptop', 109 | 'mouse', 'remote', 'keyboard', 'cell_phone', 'microwave', 110 | 'oven', 'toaster', 'sink', 'refrigerator', 'book', 'clock', 111 | 'vase', 'scissors', 'teddy_bear', 'hair_drier', 'toothbrush'] 112 | -------------------------------------------------------------------------------- /detect_main.py: -------------------------------------------------------------------------------- 1 | import cv2 2 | import os 3 | import time 4 | import torch 5 | import argparse 6 | from nanodet.util import cfg, load_config, Logger 7 | from nanodet.model.arch import build_model 8 | from nanodet.util import load_model_weight 9 | from nanodet.data.transform import Pipeline 10 | 11 | image_ext = ['.jpg', '.jpeg', '.webp', '.bmp', '.png'] 12 | video_ext = ['mp4', 'mov', 'avi', 'mkv'] 13 | 14 | '''Object detection - image''' 15 | # python detect_main.py image --config ./config/nanodet-m.yml
--model model/nanodet_m.pth --path street.png 16 | 17 | '''Object detection - video file''' 18 | # python detect_main.py video --config ./config/nanodet-m.yml --model model/nanodet_m.pth --path test.mp4 19 | 20 | '''Object detection - webcam''' 21 | # python detect_main.py webcam --config ./config/nanodet-m.yml --model model/nanodet_m.pth --path 0 22 | 23 | def parse_args(): 24 | parser = argparse.ArgumentParser() 25 | parser.add_argument('demo', default='image', help='demo type, e.g. image, video and webcam') 26 | parser.add_argument('--config', help='model config file path') 27 | parser.add_argument('--model', help='model file path') 28 | parser.add_argument('--path', default='./demo', help='path to images or video') 29 | parser.add_argument('--camid', type=int, default=0, help='webcam demo camera id') 30 | args = parser.parse_args() 31 | return args 32 | 33 | 34 | class Predictor(object): 35 | def __init__(self, cfg, model_path, logger, device='cuda:0'): 36 | self.cfg = cfg 37 | self.device = device 38 | model = build_model(cfg.model) 39 | ckpt = torch.load(model_path, map_location=lambda storage, loc: storage) 40 | load_model_weight(model, ckpt, logger) 41 | self.model = model.to(device).eval() 42 | self.pipeline = Pipeline(cfg.data.val.pipeline, cfg.data.val.keep_ratio) 43 | 44 | def inference(self, img): 45 | img_info = {} 46 | if isinstance(img, str): 47 | img_info['file_name'] = os.path.basename(img) 48 | img = cv2.imread(img) 49 | else: 50 | img_info['file_name'] = None 51 | 52 | height, width = img.shape[:2] 53 | img_info['height'] = height 54 | img_info['width'] = width 55 | meta = dict(img_info=img_info, 56 | raw_img=img, 57 | img=img) 58 | meta = self.pipeline(meta, self.cfg.data.val.input_size) 59 | meta['img'] = torch.from_numpy(meta['img'].transpose(2, 0, 1)).unsqueeze(0).to(self.device) 60 | with torch.no_grad(): 61 | results = self.model.inference(meta) 62 | return meta, results 63 | 64 | def visualize(self, dets, meta, class_names, score_thres, wait=0): 65 | time1 = time.time() 66 | self.model.head.show_result(meta['raw_img'], dets, class_names, score_thres=score_thres, show=True) 67 | print('viz time: {:.3f}s'.format(time.time()-time1)) 68 | 69 | 70 | def get_image_list(path): 71 | image_names = [] 72 | for maindir, subdir, file_name_list in os.walk(path): 73 | for filename in file_name_list: 74 | apath = os.path.join(maindir, filename) 75 | ext = os.path.splitext(apath)[1] 76 | if ext in image_ext: 77 | image_names.append(apath) 78 | return image_names 79 | 80 | 81 | def main(): 82 | args = parse_args() 83 | torch.backends.cudnn.enabled = True 84 | torch.backends.cudnn.benchmark = True 85 | 86 | load_config(cfg, args.config) 87 | logger = Logger(-1, use_tensorboard=False) 88 | predictor = Predictor(cfg, args.model, logger, device='cuda:0') 89 | logger.log('Press "Esc", "q" or "Q" to exit.') 90 | if args.demo == 'image': 91 | if os.path.isdir(args.path): 92 | files = get_image_list(args.path) 93 | else: 94 | files = [args.path] 95 | files.sort() 96 | for image_name in files: 97 | meta, res = predictor.inference(image_name) 98 | predictor.visualize(res, meta, cfg.class_names, 0.35) 99 | ch = cv2.waitKey(0) 100 | if ch == 27 or ch == ord('q') or ch == ord('Q'): 101 | break 102 | elif args.demo == 'video' or args.demo == 'webcam': 103 | cap = cv2.VideoCapture(args.path if args.demo == 'video' else args.camid) 104 | while True: 105 | ret_val, frame = cap.read() 106 | meta, res = predictor.inference(frame) 107 | predictor.visualize(res, meta, cfg.class_names, 0.35) 108 | ch = cv2.waitKey(1) 109 | if ch == 27 or
ch == ord('q') or ch == ord('Q'): 110 | break 111 | 112 | 113 | if __name__ == '__main__': 114 | main() 115 | -------------------------------------------------------------------------------- /docs/Model_arch.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/guo-pu/NanoDet-PyTorch/bac88abfaaff0b5bf1b2c42bbe8d742014bd14dd/docs/Model_arch.png -------------------------------------------------------------------------------- /docs/Title.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/guo-pu/NanoDet-PyTorch/bac88abfaaff0b5bf1b2c42bbe8d742014bd14dd/docs/Title.jpg -------------------------------------------------------------------------------- /docs/config_file_detail.md: -------------------------------------------------------------------------------- 1 | # NanoDet Config File Analysis 2 | 3 | NanoDet uses [yacs](https://github.com/rbgirshick/yacs) to read its yaml config files. 4 | 5 | ## Saving path 6 | 7 | ```yaml 8 | save_dir: PATH_TO_SAVE 9 | ``` 10 | 11 | Change save_dir to where you want to save logs and models. If the path does not exist, NanoDet will create it. 12 | 13 | ## Model 14 | 15 | ```yaml 16 | model: 17 | arch: 18 | name: xxx 19 | backbone: xxx 20 | fpn: xxx 21 | head: xxx 22 | ``` 23 | 24 | Most detection architectures can be divided into 3 parts: a backbone, a task head, and a connector between them (e.g. FPN, BiFPN, PAN...). 25 | 26 | ### Backbone 27 | 28 | ```yaml 29 | backbone: 30 | name: ShuffleNetV2 31 | model_size: 1.0x 32 | out_stages: [2,3,4] 33 | activation: LeakyReLU 34 | with_last_conv: False 35 | ``` 36 | 37 | NanoDet uses ShuffleNetV2 as its backbone. You can modify the model size, output feature levels and activation function. Moreover, NanoDet provides other lightweight backbones like **GhostNet** and **MobileNetV2**. You can also add your own backbone network by importing it in `nanodet/model/backbone/__init__.py`. 38 | 39 | ### FPN 40 | 41 | ```yaml 42 | fpn: 43 | name: PAN 44 | in_channels: [116, 232, 464] 45 | out_channels: 96 46 | start_level: 0 47 | num_outs: 3 48 | ``` 49 | 50 | NanoDet uses a modified [PAN](http://arxiv.org/abs/1803.01534) (the downsample convs are replaced with interpolation to reduce the amount of computation). 51 | 52 | `in_channels` : a list of feature map channels extracted from the backbone. 53 | 54 | `out_channels` : output feature map channel count.
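As a quick cross-check of how the `backbone` and `fpn` settings fit together, here is a minimal sketch that builds the model from the yml and inspects the fused feature maps. It reuses `load_config` and `build_model` exactly as this repo's `detect_main.py` does; calling `model.backbone` and `model.fpn` directly mirrors `OneStage.forward`, and the expected shapes are inferred from the config above rather than verified output:

```python
import torch
from nanodet.util import cfg, load_config
from nanodet.model.arch import build_model

# Parse the yaml config and assemble backbone + FPN + head.
load_config(cfg, 'config/nanodet-m.yml')
model = build_model(cfg.model).eval()

# Dummy 320x320 image. The backbone emits one feature map per entry of
# out_stages; the PAN fuses them to out_channels channels at each stride.
x = torch.randn(1, 3, 320, 320)
with torch.no_grad():
    feats = model.backbone(x)    # tuple of 3 maps with 116/232/464 channels
    fpn_outs = model.fpn(feats)  # same call pattern as OneStage.forward

for f in fpn_outs:
    print(tuple(f.shape))  # expect 96 channels at strides 8/16/32:
                           # (1, 96, 40, 40), (1, 96, 20, 20), (1, 96, 10, 10)
```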
55 | 56 | ### Head 57 | 58 | ```yaml 59 | head: 60 | name: NanoDetHead 61 | num_classes: 80 62 | input_channel: 96 63 | feat_channels: 96 64 | stacked_convs: 2 65 | share_cls_reg: True 66 | octave_base_scale: 8 67 | scales_per_octave: 1 68 | strides: [8, 16, 32] 69 | reg_max: 7 70 | norm_cfg: 71 | type: BN 72 | loss: 73 | ``` 74 | 75 | `name`: Task head class name 76 | 77 | `num_classes`: number of classes 78 | 79 | `input_channel`: input feature map channel 80 | 81 | `feat_channels`: channel count of the task head convs 82 | 83 | `stacked_convs`: how many conv blocks are used in one task head 84 | 85 | `share_cls_reg`: whether to use the same conv blocks for classification and box regression 86 | 87 | `octave_base_scale`: base box scale 88 | 89 | `scales_per_octave`: anchor-free models have only one base box per location, so keep the default value of 1 90 | 91 | `strides`: downsample stride of each feature map level 92 | 93 | `reg_max`: max value of the per-level l-r-t-b distance bins 94 | 95 | `norm_cfg`: normalization layer setting 96 | 97 | `loss`: adjust loss functions and weights 98 | 99 | ## Data 100 | 101 | ```yaml 102 | data: 103 | train: 104 | name: coco 105 | img_path: coco/train2017 106 | ann_path: coco/annotations/instances_train2017.json 107 | input_size: [320,320] 108 | keep_ratio: True 109 | pipeline: 110 | val: 111 | ..... 112 | ``` 113 | 114 | In `data` you need to set your training and validation datasets. 115 | 116 | `name`: Dataset format name. You can create your own dataset format in `nanodet/data/dataset`. 117 | `input_size`: [width, height] 118 | `keep_ratio`: whether to maintain the original image ratio when resizing to input size 119 | `pipeline`: data preprocessing and augmentation pipeline 120 | 121 | ## Device 122 | 123 | ```yaml 124 | device: 125 | gpu_ids: [0] 126 | workers_per_gpu: 12 127 | batchsize_per_gpu: 160 128 | ``` 129 | 130 | `gpu_ids`: CUDA device ids. For multi-gpu training, set [0, 1, 2...]. 131 | 132 | `workers_per_gpu`: number of dataloader worker processes per GPU 133 | 134 | `batchsize_per_gpu`: number of images in one batch on each GPU 135 | 136 | ## Schedule 137 | 138 | ```yaml 139 | schedule: 140 | # resume: 141 | # load_model: YOUR_MODEL_PATH 142 | optimizer: 143 | name: SGD 144 | lr: 0.14 145 | momentum: 0.9 146 | weight_decay: 0.0001 147 | warmup: 148 | name: linear 149 | steps: 300 150 | ratio: 0.1 151 | total_epochs: 70 152 | lr_schedule: 153 | name: MultiStepLR 154 | milestones: [40,55,60,65] 155 | gamma: 0.1 156 | val_intervals: 10 157 | ``` 158 | 159 | Set the training schedule here. 160 | 161 | `resume`: whether to resume the last training run 162 | 163 | `load_model`: path to trained weights 164 | 165 | `optimizer`: any optimizer provided by PyTorch is supported. 166 | 167 | You should scale the lr with the total batch size, following the linear scaling rule in the paper *[Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour](https://research.fb.com/wp-content/uploads/2017/06/imagenet1kin1h5.pdf)*; e.g. the default lr of 0.14 pairs with batchsize_per_gpu 160 here, so halving the batch size would mean halving the lr. 168 | 169 | `warmup`: warms up the network before training; `constant`, `exp` and `linear` warm-up types are supported. 170 | 171 | `total_epochs`: total epochs to train 172 | 173 | `lr_schedule`: please refer to the [pytorch lr_scheduler documentation](https://pytorch.org/docs/stable/optim.html?highlight=lr_scheduler#torch.optim.lr_scheduler) 174 | 175 | `val_intervals`: number of epochs between evaluations during training 176 | 177 | ## Evaluate 178 | 179 | ```yaml 180 | evaluator: 181 | name: CocoDetectionEvaluator 182 | save_key: mAP 183 | ``` 184 | 185 | Currently only COCO evaluation is supported. 186 | 187 | `save_key`: metric used to select the best model.
Support mAP, AP50, AP75.... 188 | 189 | **** 190 | 191 | `class_names`: used in visualization -------------------------------------------------------------------------------- /model/nanodet_m.pth: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/guo-pu/NanoDet-PyTorch/bac88abfaaff0b5bf1b2c42bbe8d742014bd14dd/model/nanodet_m.pth -------------------------------------------------------------------------------- /nanodet/data/collate.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn.functional as F 3 | 4 | import re 5 | from torch._six import container_abcs, string_classes, int_classes 6 | 7 | 8 | np_str_obj_array_pattern = re.compile(r'[SaUO]') 9 | 10 | 11 | default_collate_err_msg_format = ( 12 | "default_collate: batch must contain tensors, numpy arrays, numbers, " 13 | "dicts or lists; found {}") 14 | 15 | 16 | def collate_function(batch): 17 | r"""Puts each data field into a tensor with outer dimension batch size""" 18 | 19 | elem = batch[0] 20 | elem_type = type(elem) 21 | if isinstance(elem, torch.Tensor): 22 | out = None 23 | # TODO: support pytorch < 1.3 24 | # if torch.utils.data.get_worker_info() is not None: 25 | # # If we're in a background process, concatenate directly into a 26 | # # shared memory tensor to avoid an extra copy 27 | # numel = sum([x.numel() for x in batch]) 28 | # storage = elem.storage()._new_shared(numel) 29 | # out = elem.new(storage) 30 | return torch.stack(batch, 0, out=out) 31 | elif elem_type.__module__ == 'numpy' and elem_type.__name__ != 'str_' \ 32 | and elem_type.__name__ != 'string_': 33 | elem = batch[0] 34 | if elem_type.__name__ == 'ndarray': 35 | # array of string classes and object 36 | if np_str_obj_array_pattern.search(elem.dtype.str) is not None: 37 | raise TypeError(default_collate_err_msg_format.format(elem.dtype)) 38 | 39 | # return collate_function([torch.as_tensor(b) for b in batch]) 40 | return batch 41 | elif elem.shape == (): # scalars 42 | # return torch.as_tensor(batch) 43 | return batch 44 | elif isinstance(elem, float): 45 | return torch.tensor(batch, dtype=torch.float64) 46 | elif isinstance(elem, int_classes): 47 | return torch.tensor(batch) 48 | elif isinstance(elem, string_classes): 49 | return batch 50 | elif isinstance(elem, container_abcs.Mapping): 51 | return {key: collate_function([d[key] for d in batch]) for key in elem} 52 | elif isinstance(elem, tuple) and hasattr(elem, '_fields'): # namedtuple 53 | return elem_type(*(collate_function(samples) for samples in zip(*batch))) 54 | elif isinstance(elem, container_abcs.Sequence): 55 | transposed = zip(*batch) 56 | return [collate_function(samples) for samples in transposed] 57 | 58 | raise TypeError(default_collate_err_msg_format.format(elem_type)) 59 | -------------------------------------------------------------------------------- /nanodet/data/dataset/__init__.py: -------------------------------------------------------------------------------- 1 | import copy 2 | from .coco import CocoDataset 3 | 4 | 5 | def build_dataset(cfg, mode): 6 | dataset_cfg = copy.deepcopy(cfg) 7 | if dataset_cfg['name'] == 'coco': 8 | dataset_cfg.pop('name') 9 | return CocoDataset(mode=mode, **dataset_cfg) 10 | -------------------------------------------------------------------------------- /nanodet/data/dataset/base.py: -------------------------------------------------------------------------------- 1 | from abc import ABCMeta, abstractmethod 2 | import torch 3 
| import numpy as np 4 | from torch.utils.data import Dataset 5 | from ..transform import Pipeline 6 | 7 | 8 | class BaseDataset(Dataset, metaclass=ABCMeta): 9 | """ 10 | A base class of detection dataset. Referring from MMDetection. 11 | A dataset should have images, annotations and preprocessing pipelines 12 | NanoDet use [xmin, ymin, xmax, ymax] format for box and 13 | [[x0,y0], [x1,y1] ... [xn,yn]] format for key points. 14 | instance masks should decode into binary masks for each instance like 15 | { 16 | 'bbox': [xmin,ymin,xmax,ymax], 17 | 'mask': mask 18 | } 19 | segmentation mask should decode into binary masks for each class. 20 | 21 | :param img_path: image data folder 22 | :param ann_path: annotation file path or folder 23 | :param use_instance_mask: load instance segmentation data 24 | :param use_seg_mask: load semantic segmentation data 25 | :param use_keypoint: load pose keypoint data 26 | :param load_mosaic: using mosaic data augmentation from yolov4 27 | :param mode: train or val or test 28 | """ 29 | def __init__(self, 30 | img_path, 31 | ann_path, 32 | input_size, 33 | pipeline, 34 | keep_ratio=True, 35 | use_instance_mask=False, 36 | use_seg_mask=False, 37 | use_keypoint=False, 38 | load_mosaic=False, 39 | mode='train' 40 | ): 41 | self.img_path = img_path 42 | self.ann_path = ann_path 43 | self.input_size = input_size 44 | self.pipeline = Pipeline(pipeline, keep_ratio) 45 | self.keep_ratio = keep_ratio 46 | self.use_instance_mask = use_instance_mask 47 | self.use_seg_mask = use_seg_mask 48 | self.use_keypoint = use_keypoint 49 | self.load_mosaic = load_mosaic 50 | self.mode = mode 51 | 52 | self.data_info = self.get_data_info(ann_path) 53 | 54 | def __len__(self): 55 | return len(self.data_info) 56 | 57 | def __getitem__(self, idx): 58 | if self.mode == 'val' or self.mode == 'test': 59 | return self.get_val_data(idx) 60 | else: 61 | while True: 62 | data = self.get_train_data(idx) 63 | if data is None: 64 | idx = self.get_another_id() 65 | continue 66 | return data 67 | 68 | @abstractmethod 69 | def get_data_info(self, ann_path): 70 | pass 71 | 72 | @abstractmethod 73 | def get_train_data(self, idx): 74 | pass 75 | 76 | @abstractmethod 77 | def get_val_data(self, idx): 78 | pass 79 | 80 | def get_another_id(self): 81 | return np.random.random_integers(0, len(self.data_info)-1) 82 | 83 | 84 | 85 | 86 | -------------------------------------------------------------------------------- /nanodet/data/dataset/coco.py: -------------------------------------------------------------------------------- 1 | import os 2 | import torch 3 | import numpy as np 4 | import cv2 5 | from pycocotools.coco import COCO 6 | from .base import BaseDataset 7 | 8 | 9 | class CocoDataset(BaseDataset): 10 | 11 | def get_data_info(self, ann_path): 12 | """ 13 | Load basic information of dataset such as image path, label and so on. 14 | :param ann_path: coco json file path 15 | :return: image info: 16 | [{'license': 2, 17 | 'file_name': '000000000139.jpg', 18 | 'coco_url': 'http://images.cocodataset.org/val2017/000000000139.jpg', 19 | 'height': 426, 20 | 'width': 640, 21 | 'date_captured': '2013-11-21 01:34:01', 22 | 'flickr_url': 'http://farm9.staticflickr.com/8035/8024364858_9c41dc1666_z.jpg', 23 | 'id': 139}, 24 | ... 
25 | ] 26 | """ 27 | self.coco_api = COCO(ann_path) 28 | self.cat_ids = sorted(self.coco_api.getCatIds()) 29 | self.cat2label = {cat_id: i for i, cat_id in enumerate(self.cat_ids)} 30 | self.cats = self.coco_api.loadCats(self.cat_ids) 31 | self.img_ids = sorted(self.coco_api.imgs.keys()) 32 | img_info = self.coco_api.loadImgs(self.img_ids) 33 | return img_info 34 | 35 | def get_img_annotation(self, idx): 36 | """ 37 | load per image annotation 38 | :param idx: index in dataloader 39 | :return: annotation dict 40 | """ 41 | img_id = self.img_ids[idx] 42 | ann_ids = self.coco_api.getAnnIds([img_id]) 43 | anns = self.coco_api.loadAnns(ann_ids) 44 | gt_bboxes = [] 45 | gt_labels = [] 46 | gt_bboxes_ignore = [] 47 | if self.use_instance_mask: 48 | gt_masks = [] 49 | if self.use_keypoint: 50 | gt_keypoints = [] 51 | for ann in anns: 52 | if ann.get('ignore', False): 53 | continue 54 | x1, y1, w, h = ann['bbox'] 55 | if ann['area'] <= 0 or w < 1 or h < 1: 56 | continue 57 | if ann['category_id'] not in self.cat_ids: 58 | continue 59 | bbox = [x1, y1, x1 + w, y1 + h] 60 | if ann['iscrowd']: 61 | gt_bboxes_ignore.append(bbox) 62 | else: 63 | gt_bboxes.append(bbox) 64 | gt_labels.append(self.cat2label[ann['category_id']]) 65 | if self.use_instance_mask: 66 | gt_masks.append(self.coco_api.annToMask(ann)) 67 | if self.use_keypoint: 68 | gt_keypoints.append(ann['keypoints']) 69 | if gt_bboxes: 70 | gt_bboxes = np.array(gt_bboxes, dtype=np.float32) 71 | gt_labels = np.array(gt_labels, dtype=np.int64) 72 | else: 73 | gt_bboxes = np.zeros((0, 4), dtype=np.float32) 74 | gt_labels = np.array([], dtype=np.int64) 75 | if gt_bboxes_ignore: 76 | gt_bboxes_ignore = np.array(gt_bboxes_ignore, dtype=np.float32) 77 | else: 78 | gt_bboxes_ignore = np.zeros((0, 4), dtype=np.float32) 79 | annotation = dict( 80 | bboxes=gt_bboxes, labels=gt_labels, bboxes_ignore=gt_bboxes_ignore) 81 | if self.use_instance_mask: 82 | annotation['masks'] = gt_masks 83 | if self.use_keypoint: 84 | if gt_keypoints: 85 | annotation['keypoints'] = np.array(gt_keypoints, dtype=np.float32) 86 | else: 87 | annotation['keypoints'] = np.zeros((0, 51), dtype=np.float32) 88 | return annotation 89 | 90 | def get_train_data(self, idx): 91 | """ 92 | Load image and annotation 93 | :param idx: 94 | :return: meta-data (a dict containing image, annotation and other information) 95 | """ 96 | img_info = self.data_info[idx] 97 | file_name = img_info['file_name'] 98 | image_path = os.path.join(self.img_path, file_name) 99 | img = cv2.imread(image_path) 100 | ann = self.get_img_annotation(idx) 101 | meta = dict(img=img, 102 | img_info=img_info, 103 | gt_bboxes=ann['bboxes'], 104 | gt_labels=ann['labels']) 105 | if self.use_instance_mask: 106 | meta['gt_masks'] = ann['masks'] 107 | if self.use_keypoint: 108 | meta['gt_keypoints'] = ann['keypoints'] 109 | 110 | meta = self.pipeline(meta, self.input_size) 111 | meta['img'] = torch.from_numpy(meta['img'].transpose(2, 0, 1)) 112 | return meta 113 | 114 | def get_val_data(self, idx): 115 | """ 116 | Currently no difference from get_train_data. 117 | Not support TTA(testing time augmentation) yet. 
118 | :param idx: 119 | :return: 120 | """ 121 | # TODO: support TTA 122 | return self.get_train_data(idx) 123 | -------------------------------------------------------------------------------- /nanodet/data/transform/__init__.py: -------------------------------------------------------------------------------- 1 | 2 | from .pipeline import Pipeline -------------------------------------------------------------------------------- /nanodet/data/transform/__pycache__/__init__.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/guo-pu/NanoDet-PyTorch/bac88abfaaff0b5bf1b2c42bbe8d742014bd14dd/nanodet/data/transform/__pycache__/__init__.cpython-38.pyc -------------------------------------------------------------------------------- /nanodet/data/transform/__pycache__/color.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/guo-pu/NanoDet-PyTorch/bac88abfaaff0b5bf1b2c42bbe8d742014bd14dd/nanodet/data/transform/__pycache__/color.cpython-38.pyc -------------------------------------------------------------------------------- /nanodet/data/transform/__pycache__/pipeline.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/guo-pu/NanoDet-PyTorch/bac88abfaaff0b5bf1b2c42bbe8d742014bd14dd/nanodet/data/transform/__pycache__/pipeline.cpython-38.pyc -------------------------------------------------------------------------------- /nanodet/data/transform/__pycache__/warp.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/guo-pu/NanoDet-PyTorch/bac88abfaaff0b5bf1b2c42bbe8d742014bd14dd/nanodet/data/transform/__pycache__/warp.cpython-38.pyc -------------------------------------------------------------------------------- /nanodet/data/transform/color.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import cv2 3 | import random 4 | 5 | 6 | def random_brightness(img, delta): 7 | img += random.uniform(-delta, delta) 8 | return img 9 | 10 | 11 | def random_contrast(img, alpha_low, alpha_up): 12 | img *= random.uniform(alpha_low, alpha_up) 13 | return img 14 | 15 | 16 | def random_saturation(img, alpha_low, alpha_up): 17 | hsv_img = cv2.cvtColor(img.astype(np.float32)/255, cv2.COLOR_BGR2HSV) 18 | hsv_img[..., 1] *= random.uniform(alpha_low, alpha_up) 19 | img = cv2.cvtColor(hsv_img, cv2.COLOR_HSV2BGR) * 255 20 | # cv2.imshow('img', img/255) 21 | return img 22 | 23 | 24 | def normalize(meta, mean, std): 25 | img = meta['img'].astype(np.float32) 26 | mean = np.array(mean, dtype=np.float64).reshape(1, -1) 27 | stdinv = 1 / np.array(std, dtype=np.float64).reshape(1, -1) 28 | cv2.subtract(img, mean, img) 29 | cv2.multiply(img, stdinv, img) 30 | meta['img'] = img 31 | return meta 32 | 33 | 34 | def _normalize(img, mean, std): 35 | mean = np.array(mean, dtype=np.float32).reshape(1, 1, 3) / 255 36 | std = np.array(std, dtype=np.float32).reshape(1, 1, 3) / 255 37 | img = (img - mean) / std 38 | return img 39 | 40 | 41 | def color_aug_and_norm(meta, kwargs): 42 | img = meta['img'].astype(np.float32) / 255 43 | 44 | if 'brightness' in kwargs and random.randint(0, 1): 45 | img = random_brightness(img, kwargs['brightness']) 46 | 47 | if 'contrast' in kwargs and random.randint(0, 1): 48 | img = random_contrast(img, *kwargs['contrast']) 49 | 50 | if 'saturation' 
in kwargs and random.randint(0, 1): 51 | img = random_saturation(img, *kwargs['saturation']) 52 | # cv2.imshow('trans', img) 53 | # cv2.waitKey(0) 54 | img = _normalize(img, *kwargs['normalize']) 55 | meta['img'] = img 56 | return meta 57 | 58 | 59 | -------------------------------------------------------------------------------- /nanodet/data/transform/mosaic.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/guo-pu/NanoDet-PyTorch/bac88abfaaff0b5bf1b2c42bbe8d742014bd14dd/nanodet/data/transform/mosaic.py -------------------------------------------------------------------------------- /nanodet/data/transform/pipeline.py: -------------------------------------------------------------------------------- 1 | from .warp import warp_and_resize 2 | from .color import color_aug_and_norm 3 | import functools 4 | 5 | 6 | class Pipeline: 7 | def __init__(self, 8 | cfg, 9 | keep_ratio): 10 | self.warp = functools.partial(warp_and_resize, 11 | warp_kwargs=cfg, 12 | keep_ratio=keep_ratio) 13 | self.color = functools.partial(color_aug_and_norm, 14 | kwargs=cfg) 15 | 16 | def __call__(self, meta, dst_shape): 17 | meta = self.warp(meta=meta, dst_shape=dst_shape) 18 | meta = self.color(meta=meta) 19 | return meta 20 | -------------------------------------------------------------------------------- /nanodet/data/transform/warp.py: -------------------------------------------------------------------------------- 1 | import random 2 | import numpy as np 3 | import cv2 4 | import math 5 | 6 | def get_flip_matrix(prob=0.5): 7 | F = np.eye(3) 8 | if random.random() < prob: 9 | F[0, 0] = -1 10 | return F 11 | 12 | def get_perspective_matrix(perspective=0): 13 | """ 14 | 15 | :param perspective: 16 | :return: 17 | """ 18 | P = np.eye(3) 19 | P[2, 0] = random.uniform(-perspective, perspective) # x perspective (about y) 20 | P[2, 1] = random.uniform(-perspective, perspective) # y perspective (about x) 21 | return P 22 | 23 | 24 | def get_rotation_matrix(degree=0): 25 | """ 26 | 27 | :param degree: 28 | :return: 29 | """ 30 | R = np.eye(3) 31 | a = random.uniform(-degree, degree) 32 | R[:2] = cv2.getRotationMatrix2D(angle=a, center=(0, 0), scale=1) 33 | return R 34 | 35 | 36 | def get_scale_matrix(ratio=(1, 1)): 37 | """ 38 | 39 | :param width_ratio: 40 | :param height_ratio: 41 | """ 42 | Scl = np.eye(3) 43 | scale = random.uniform(*ratio) 44 | Scl[0, 0] *= scale 45 | Scl[1, 1] *= scale 46 | return Scl 47 | 48 | 49 | def get_stretch_matrix(width_ratio=(1, 1), height_ratio=(1, 1)): 50 | """ 51 | 52 | :param width_ratio: 53 | :param height_ratio: 54 | """ 55 | Str = np.eye(3) 56 | Str[0, 0] *= random.uniform(*width_ratio) 57 | Str[1, 1] *= random.uniform(*height_ratio) 58 | return Str 59 | 60 | 61 | def get_shear_matrix(degree): 62 | """ 63 | 64 | :param degree: 65 | :return: 66 | """ 67 | Sh = np.eye(3) 68 | Sh[0, 1] = math.tan(random.uniform(-degree, degree) * math.pi / 180) # x shear (deg) 69 | Sh[1, 0] = math.tan(random.uniform(-degree, degree) * math.pi / 180) # y shear (deg) 70 | return Sh 71 | 72 | 73 | def get_translate_matrix(translate, width, height): 74 | """ 75 | 76 | :param translate: 77 | :return: 78 | """ 79 | T = np.eye(3) 80 | T[0, 2] = random.uniform(0.5 - translate, 0.5 + translate) * width # x translation 81 | T[1, 2] = random.uniform(0.5 - translate, 0.5 + translate) * height # y translation 82 | return T 83 | 84 | 85 | def get_resize_matrix(raw_shape, dst_shape, keep_ratio): 86 | """ 87 | Get resize matrix for resizing raw img to 
input size 88 | :param raw_shape: (width, height) of raw image 89 | :param dst_shape: (width, height) of input image 90 | :param keep_ratio: whether keep original ratio 91 | :return: 3x3 Matrix 92 | """ 93 | r_w, r_h = raw_shape 94 | d_w, d_h = dst_shape 95 | Rs = np.eye(3) 96 | if keep_ratio: 97 | C = np.eye(3) 98 | C[0, 2] = - r_w / 2 99 | C[1, 2] = - r_h / 2 100 | 101 | if r_w / r_h < d_w / d_h: 102 | ratio = d_h / r_h 103 | else: 104 | ratio = d_w / r_w 105 | Rs[0, 0] *= ratio 106 | Rs[1, 1] *= ratio 107 | 108 | T = np.eye(3) 109 | T[0, 2] = 0.5 * d_w 110 | T[1, 2] = 0.5 * d_h 111 | return T @ Rs @ C 112 | else: 113 | Rs[0, 0] *= d_w / r_w 114 | Rs[1, 1] *= d_h / r_h 115 | return Rs 116 | 117 | def warp_and_resize(meta, warp_kwargs, dst_shape, keep_ratio=True): 118 | # TODO: background, type 119 | raw_img = meta['img'] 120 | height = raw_img.shape[0] # shape(h,w,c) 121 | width = raw_img.shape[1] 122 | 123 | # center 124 | C = np.eye(3) 125 | C[0, 2] = - width / 2 126 | C[1, 2] = - height / 2 127 | 128 | # do not change the order of mat mul 129 | if 'perspective' in warp_kwargs and random.randint(0, 1): 130 | P = get_perspective_matrix(warp_kwargs['perspective']) 131 | C = P @ C 132 | if 'scale' in warp_kwargs and random.randint(0, 1): 133 | Scl = get_scale_matrix(warp_kwargs['scale']) 134 | C = Scl @ C 135 | if 'stretch' in warp_kwargs and random.randint(0, 1): 136 | Str = get_stretch_matrix(*warp_kwargs['stretch']) 137 | C = Str @ C 138 | if 'rotation' in warp_kwargs and random.randint(0, 1): 139 | R = get_rotation_matrix(warp_kwargs['rotation']) 140 | C = R @ C 141 | if 'shear' in warp_kwargs and random.randint(0, 1): 142 | Sh = get_shear_matrix(warp_kwargs['shear']) 143 | C = Sh @ C 144 | if 'flip' in warp_kwargs: 145 | F = get_flip_matrix(warp_kwargs['flip']) 146 | C = F @ C 147 | if 'translate' in warp_kwargs and random.randint(0, 1): 148 | T = get_translate_matrix(warp_kwargs['translate'], width, height) 149 | else: 150 | T = get_translate_matrix(0, width, height) 151 | M = T @ C 152 | # M = T @ Sh @ R @ Str @ P @ C 153 | ResizeM = get_resize_matrix((width, height), dst_shape, keep_ratio) 154 | M = ResizeM @ M 155 | img = cv2.warpPerspective(raw_img, M, dsize=tuple(dst_shape)) 156 | meta['img'] = img 157 | meta['warp_matrix'] = M 158 | if 'gt_bboxes' in meta: 159 | boxes = meta['gt_bboxes'] 160 | meta['gt_bboxes'] = warp_boxes(boxes, M, dst_shape[0], dst_shape[1]) 161 | if 'gt_masks' in meta: 162 | for i, mask in enumerate(meta['gt_masks']): 163 | meta['gt_masks'][i] = cv2.warpPerspective(mask, M, dsize=tuple(dst_shape)) 164 | 165 | # TODO: keypoints 166 | # if 'gt_keypoints' in meta: 167 | 168 | return meta 169 | 170 | 171 | def warp_boxes(boxes, M, width, height): 172 | n = len(boxes) 173 | if n: 174 | # warp points 175 | xy = np.ones((n * 4, 3)) 176 | xy[:, :2] = boxes[:, [0, 1, 2, 3, 0, 3, 2, 1]].reshape(n * 4, 2) # x1y1, x2y2, x1y2, x2y1 177 | xy = xy @ M.T # transform 178 | xy = (xy[:, :2] / xy[:, 2:3]).reshape(n, 8) # rescale 179 | # create new boxes 180 | x = xy[:, [0, 2, 4, 6]] 181 | y = xy[:, [1, 3, 5, 7]] 182 | xy = np.concatenate((x.min(1), y.min(1), x.max(1), y.max(1))).reshape(4, n).T 183 | # clip boxes 184 | xy[:, [0, 2]] = xy[:, [0, 2]].clip(0, width) 185 | xy[:, [1, 3]] = xy[:, [1, 3]].clip(0, height) 186 | return xy.astype(np.float32) 187 | else: 188 | return boxes 189 | 190 | # def warp_keypoints(keypoints, M, width, height): 191 | # n = len(keypoints) 192 | # if n: 193 | # 194 | # # warp points 195 | # xy = np.ones((n * 4, 3)) 196 | # xy[:, :2] = boxes[:, [0, 
1, 2, 3, 0, 3, 2, 1]].reshape(n * 4, 2) # x1y1, x2y2, x1y2, x2y1 197 | # xy = xy @ M.T # transform 198 | # xy = (xy[:, :2] / xy[:, 2:3]).reshape(n, 8) # rescale 199 | # # create new boxes 200 | # x = xy[:, [0, 2, 4, 6]] 201 | # y = xy[:, [1, 3, 5, 7]] 202 | # xy = np.concatenate((x.min(1), y.min(1), x.max(1), y.max(1))).reshape(4, n).T 203 | # # clip boxes 204 | # xy[:, [0, 2]] = xy[:, [0, 2]].clip(0, width) 205 | # xy[:, [1, 3]] = xy[:, [1, 3]].clip(0, height) 206 | # return xy 207 | 208 | 209 | 210 | 211 | 212 | 213 | 214 | 215 | 216 | 217 | 218 | 219 | 220 | -------------------------------------------------------------------------------- /nanodet/evaluator/__init__.py: -------------------------------------------------------------------------------- 1 | from .coco_detection import CocoDetectionEvaluator 2 | 3 | 4 | def build_evaluator(cfg, dataset): 5 | if cfg.evaluator.name == 'CocoDetectionEvaluator': 6 | return CocoDetectionEvaluator(dataset) 7 | else: 8 | raise NotImplementedError 9 | -------------------------------------------------------------------------------- /nanodet/evaluator/coco_detection.py: -------------------------------------------------------------------------------- 1 | import pycocotools.coco as coco 2 | from pycocotools.cocoeval import COCOeval 3 | import json 4 | import os 5 | import copy 6 | 7 | 8 | def xyxy2xywh(bbox): 9 | """ 10 | change bbox to coco format 11 | :param bbox: [x1, y1, x2, y2] 12 | :return: [x, y, w, h] 13 | """ 14 | return [ 15 | bbox[0], 16 | bbox[1], 17 | bbox[2] - bbox[0], 18 | bbox[3] - bbox[1], 19 | ] 20 | 21 | 22 | class CocoDetectionEvaluator: 23 | def __init__(self, dataset): 24 | assert hasattr(dataset, 'coco_api') 25 | self.coco_api = dataset.coco_api 26 | self.cat_ids = dataset.cat_ids 27 | self.metric_names = ['mAP', 'AP_50', 'AP_75', 'AP_small', 'AP_m', 'AP_l'] 28 | 29 | def results2json(self, results): 30 | """ 31 | results: {image_id: {label: [bboxes...] 
} } 32 | :return coco json format: {image_id: 33 | category_id: 34 | bbox: 35 | score: } 36 | """ 37 | json_results = [] 38 | for image_id, dets in results.items(): 39 | for label, bboxes in dets.items(): 40 | category_id = self.cat_ids[label] 41 | for bbox in bboxes: 42 | score = float(bbox[4]) 43 | detection = dict( 44 | image_id=int(image_id), 45 | category_id=int(category_id), 46 | bbox=xyxy2xywh(bbox), 47 | score=score) 48 | json_results.append(detection) 49 | return json_results 50 | 51 | def evaluate(self, results, save_dir, epoch, logger, rank=-1): 52 | results_json = self.results2json(results) 53 | json_path = os.path.join(save_dir, 'results{}.json'.format(rank)) 54 | json.dump(results_json, open(json_path, 'w')) 55 | coco_dets = self.coco_api.loadRes(json_path) 56 | coco_eval = COCOeval(copy.deepcopy(self.coco_api), copy.deepcopy(coco_dets), "bbox") 57 | coco_eval.evaluate() 58 | coco_eval.accumulate() 59 | coco_eval.summarize() 60 | aps = coco_eval.stats[:6] 61 | eval_results = {} 62 | for k, v in zip(self.metric_names, aps): 63 | eval_results[k] = v 64 | logger.scalar_summary('Val_coco_bbox/' + k, 'val', v, epoch) 65 | return eval_results 66 | -------------------------------------------------------------------------------- /nanodet/model/arch/__init__.py: -------------------------------------------------------------------------------- 1 | from .gfl import GFL 2 | 3 | 4 | def build_model(model_cfg): 5 | if model_cfg.arch.name == 'GFL': 6 | model = GFL(model_cfg.arch.backbone, model_cfg.arch.fpn, model_cfg.arch.head) 7 | else: 8 | raise NotImplementedError 9 | return model 10 | -------------------------------------------------------------------------------- /nanodet/model/arch/__pycache__/__init__.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/guo-pu/NanoDet-PyTorch/bac88abfaaff0b5bf1b2c42bbe8d742014bd14dd/nanodet/model/arch/__pycache__/__init__.cpython-38.pyc -------------------------------------------------------------------------------- /nanodet/model/arch/__pycache__/gfl.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/guo-pu/NanoDet-PyTorch/bac88abfaaff0b5bf1b2c42bbe8d742014bd14dd/nanodet/model/arch/__pycache__/gfl.cpython-38.pyc -------------------------------------------------------------------------------- /nanodet/model/arch/__pycache__/one_stage.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/guo-pu/NanoDet-PyTorch/bac88abfaaff0b5bf1b2c42bbe8d742014bd14dd/nanodet/model/arch/__pycache__/one_stage.cpython-38.pyc -------------------------------------------------------------------------------- /nanodet/model/arch/gfl.py: -------------------------------------------------------------------------------- 1 | from .one_stage import OneStage 2 | 3 | 4 | class GFL(OneStage): 5 | def __init__(self, 6 | backbone_cfg, 7 | fpn_cfg, 8 | head_cfg, ): 9 | super(GFL, self).__init__(backbone_cfg, 10 | fpn_cfg, 11 | head_cfg) 12 | 13 | def forward(self, x): 14 | x = self.backbone(x) 15 | x = self.fpn(x) 16 | x = self.head(x) 17 | return x 18 | -------------------------------------------------------------------------------- /nanodet/model/arch/one_stage.py: -------------------------------------------------------------------------------- 1 | import time 2 | import torch 3 | import torch.nn as nn 4 | from ..backbone import build_backbone 5 | from 
..fpn import build_fpn 6 | from ..head import build_head 7 | 8 | 9 | class OneStage(nn.Module): 10 | def __init__(self, 11 | backbone_cfg, 12 | fpn_cfg=None, 13 | head_cfg=None,): 14 | super(OneStage, self).__init__() 15 | self.backbone = build_backbone(backbone_cfg) 16 | if fpn_cfg is not None: 17 | self.fpn = build_fpn(fpn_cfg) 18 | if head_cfg is not None: 19 | self.head = build_head(head_cfg) 20 | 21 | def forward(self, x): 22 | x = self.backbone(x) 23 | if hasattr(self, 'fpn') and self.fpn is not None: 24 | x = self.fpn(x) 25 | if hasattr(self, 'head'): 26 | out = [] 27 | for xx in x: 28 | out.append(self.head(xx)) 29 | x = tuple(out) 30 | return x 31 | 32 | def inference(self, meta): 33 | with torch.no_grad(): 34 | torch.cuda.synchronize() 35 | time1 = time.time() 36 | preds = self(meta['img']) 37 | torch.cuda.synchronize() 38 | time2 = time.time() 39 | print('forward time: {:.3f}s'.format((time2 - time1)), end=' | ') 40 | results = self.head.post_process(preds, meta) 41 | torch.cuda.synchronize() 42 | print('decode time: {:.3f}s'.format((time.time() - time2)), end=' | ') 43 | return results 44 | 45 | def forward_train(self, gt_meta): 46 | preds = self(gt_meta['img']) 47 | loss, loss_states = self.head.loss(preds, gt_meta) 48 | 49 | return preds, loss, loss_states 50 | -------------------------------------------------------------------------------- /nanodet/model/backbone/__init__.py: -------------------------------------------------------------------------------- 1 | import copy 2 | from .resnet import ResNet 3 | from .ghostnet import GhostNet 4 | from .shufflenetv2 import ShuffleNetV2 5 | from .mobilenetv2 import MobileNetV2 6 | 7 | 8 | def build_backbone(cfg): 9 | backbone_cfg = copy.deepcopy(cfg) 10 | name = backbone_cfg.pop('name') 11 | if name == 'ResNet': 12 | return ResNet(**backbone_cfg) 13 | elif name == 'ShuffleNetV2': 14 | return ShuffleNetV2(**backbone_cfg) 15 | elif name == 'GhostNet': 16 | return GhostNet(**backbone_cfg) 17 | elif name == 'MobileNetV2': 18 | return MobileNetV2(**backbone_cfg) 19 | else: 20 | raise NotImplementedError 21 | 22 | -------------------------------------------------------------------------------- /nanodet/model/backbone/__pycache__/__init__.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/guo-pu/NanoDet-PyTorch/bac88abfaaff0b5bf1b2c42bbe8d742014bd14dd/nanodet/model/backbone/__pycache__/__init__.cpython-38.pyc -------------------------------------------------------------------------------- /nanodet/model/backbone/__pycache__/ghostnet.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/guo-pu/NanoDet-PyTorch/bac88abfaaff0b5bf1b2c42bbe8d742014bd14dd/nanodet/model/backbone/__pycache__/ghostnet.cpython-38.pyc -------------------------------------------------------------------------------- /nanodet/model/backbone/__pycache__/mobilenetv2.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/guo-pu/NanoDet-PyTorch/bac88abfaaff0b5bf1b2c42bbe8d742014bd14dd/nanodet/model/backbone/__pycache__/mobilenetv2.cpython-38.pyc -------------------------------------------------------------------------------- /nanodet/model/backbone/__pycache__/resnet.cpython-38.pyc: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/guo-pu/NanoDet-PyTorch/bac88abfaaff0b5bf1b2c42bbe8d742014bd14dd/nanodet/model/backbone/__pycache__/resnet.cpython-38.pyc -------------------------------------------------------------------------------- /nanodet/model/backbone/__pycache__/shufflenetv2.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/guo-pu/NanoDet-PyTorch/bac88abfaaff0b5bf1b2c42bbe8d742014bd14dd/nanodet/model/backbone/__pycache__/shufflenetv2.cpython-38.pyc -------------------------------------------------------------------------------- /nanodet/model/backbone/ghostnet.py: -------------------------------------------------------------------------------- 1 | # 2020.06.09-Changed for building GhostNet 2 | # Huawei Technologies Co., Ltd. 3 | """ 4 | Creates a GhostNet Model as defined in: 5 | GhostNet: More Features from Cheap Operations By Kai Han, Yunhe Wang, Qi Tian, Jianyuan Guo, Chunjing Xu, Chang Xu. 6 | https://arxiv.org/abs/1911.11907 7 | Modified from https://github.com/d-li14/mobilenetv3.pytorch and https://github.com/rwightman/pytorch-image-models 8 | """ 9 | import logging 10 | import torch 11 | import torch.nn as nn 12 | import torch.nn.functional as F 13 | import math 14 | from ..module.activation import act_layers 15 | 16 | 17 | def get_url(width_mult=1.0): 18 | if width_mult==1.0: 19 | return 'https://github.com/huawei-noah/ghostnet/raw/master/pytorch/models/state_dict_93.98.pth' 20 | else: 21 | logging.info('GhostNet only has 1.0 pretrain model. ') 22 | return None 23 | 24 | 25 | def _make_divisible(v, divisor, min_value=None): 26 | """ 27 | This function is taken from the original tf repo. 28 | It ensures that all layers have a channel number that is divisible by 8 29 | It can be seen here: 30 | https://github.com/tensorflow/models/blob/master/research/slim/nets/mobilenet/mobilenet.py 31 | """ 32 | if min_value is None: 33 | min_value = divisor 34 | new_v = max(min_value, int(v + divisor / 2) // divisor * divisor) 35 | # Make sure that round down does not go down by more than 10%. 36 | if new_v < 0.9 * v: 37 | new_v += divisor 38 | return new_v 39 | 40 | 41 | def hard_sigmoid(x, inplace: bool = False): 42 | if inplace: 43 | return x.add_(3.).clamp_(0., 6.).div_(6.) 44 | else: 45 | return F.relu6(x + 3.) / 6. 
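# Sanity check for hard_sigmoid: relu6(x + 3) / 6 maps x = -3 -> 0.0, x = 0 -> 0.5
# and x = +3 -> 1.0, i.e. a piecewise-linear surrogate for the sigmoid gate that
# avoids computing exp() in the SqueezeExcite module below.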
46 | 47 | 48 | class SqueezeExcite(nn.Module): 49 | def __init__(self, in_chs, se_ratio=0.25, reduced_base_chs=None, 50 | act="ReLU", gate_fn=hard_sigmoid, divisor=4, **_): 51 | super(SqueezeExcite, self).__init__() 52 | self.gate_fn = gate_fn 53 | reduced_chs = _make_divisible((reduced_base_chs or in_chs) * se_ratio, divisor) 54 | self.avg_pool = nn.AdaptiveAvgPool2d(1) 55 | self.conv_reduce = nn.Conv2d(in_chs, reduced_chs, 1, bias=True) 56 | self.act1 = act_layers(act) 57 | self.conv_expand = nn.Conv2d(reduced_chs, in_chs, 1, bias=True) 58 | 59 | def forward(self, x): 60 | x_se = self.avg_pool(x) 61 | x_se = self.conv_reduce(x_se) 62 | x_se = self.act1(x_se) 63 | x_se = self.conv_expand(x_se) 64 | x = x * self.gate_fn(x_se) 65 | return x 66 | 67 | 68 | class ConvBnAct(nn.Module): 69 | def __init__(self, in_chs, out_chs, kernel_size, 70 | stride=1, act="ReLU"): 71 | super(ConvBnAct, self).__init__() 72 | self.conv = nn.Conv2d(in_chs, out_chs, kernel_size, stride, kernel_size // 2, bias=False) 73 | self.bn1 = nn.BatchNorm2d(out_chs) 74 | self.act1 = act_layers(act) 75 | 76 | def forward(self, x): 77 | x = self.conv(x) 78 | x = self.bn1(x) 79 | x = self.act1(x) 80 | return x 81 | 82 | 83 | class GhostModule(nn.Module): 84 | def __init__(self, inp, oup, kernel_size=1, ratio=2, dw_size=3, stride=1, act="ReLU"): 85 | super(GhostModule, self).__init__() 86 | self.oup = oup 87 | init_channels = math.ceil(oup / ratio) 88 | new_channels = init_channels * (ratio - 1) 89 | 90 | self.primary_conv = nn.Sequential( 91 | nn.Conv2d(inp, init_channels, kernel_size, stride, kernel_size // 2, bias=False), 92 | nn.BatchNorm2d(init_channels), 93 | act_layers(act) if act else nn.Sequential(), 94 | ) 95 | 96 | self.cheap_operation = nn.Sequential( 97 | nn.Conv2d(init_channels, new_channels, dw_size, 1, dw_size // 2, groups=init_channels, bias=False), 98 | nn.BatchNorm2d(new_channels), 99 | act_layers(act) if act else nn.Sequential(), 100 | ) 101 | 102 | def forward(self, x): 103 | x1 = self.primary_conv(x) 104 | x2 = self.cheap_operation(x1) 105 | out = torch.cat([x1, x2], dim=1) 106 | return out 107 | 108 | 109 | class GhostBottleneck(nn.Module): 110 | """ Ghost bottleneck w/ optional SE""" 111 | 112 | def __init__(self, in_chs, mid_chs, out_chs, dw_kernel_size=3, 113 | stride=1, act="ReLU", se_ratio=0.): 114 | super(GhostBottleneck, self).__init__() 115 | has_se = se_ratio is not None and se_ratio > 0. 
116 | self.stride = stride 117 | 118 | # Point-wise expansion 119 | self.ghost1 = GhostModule(in_chs, mid_chs, act=act) 120 | 121 | # Depth-wise convolution 122 | if self.stride > 1: 123 | self.conv_dw = nn.Conv2d(mid_chs, mid_chs, dw_kernel_size, stride=stride, 124 | padding=(dw_kernel_size - 1) // 2, 125 | groups=mid_chs, bias=False) 126 | self.bn_dw = nn.BatchNorm2d(mid_chs) 127 | 128 | # Squeeze-and-excitation 129 | if has_se: 130 | self.se = SqueezeExcite(mid_chs, se_ratio=se_ratio) 131 | else: 132 | self.se = None 133 | 134 | # Point-wise linear projection 135 | self.ghost2 = GhostModule(mid_chs, out_chs, act=None) 136 | 137 | # shortcut 138 | if in_chs == out_chs and self.stride == 1: 139 | self.shortcut = nn.Sequential() 140 | else: 141 | self.shortcut = nn.Sequential( 142 | nn.Conv2d(in_chs, in_chs, dw_kernel_size, stride=stride, 143 | padding=(dw_kernel_size - 1) // 2, groups=in_chs, bias=False), 144 | nn.BatchNorm2d(in_chs), 145 | nn.Conv2d(in_chs, out_chs, 1, stride=1, padding=0, bias=False), 146 | nn.BatchNorm2d(out_chs), 147 | ) 148 | 149 | def forward(self, x): 150 | residual = x 151 | 152 | # 1st ghost bottleneck 153 | x = self.ghost1(x) 154 | 155 | # Depth-wise convolution 156 | if self.stride > 1: 157 | x = self.conv_dw(x) 158 | x = self.bn_dw(x) 159 | 160 | # Squeeze-and-excitation 161 | if self.se is not None: 162 | x = self.se(x) 163 | 164 | # 2nd ghost bottleneck 165 | x = self.ghost2(x) 166 | 167 | x += self.shortcut(residual) 168 | return x 169 | 170 | 171 | class GhostNet(nn.Module): 172 | def __init__(self, width_mult=1.0, out_stages=(4, 6, 9), act='ReLU', pretrain=True): 173 | super(GhostNet, self).__init__() 174 | self.width_mult = width_mult 175 | self.out_stages = out_stages 176 | # setting of inverted residual blocks 177 | self.cfgs = [ 178 | # k, t, c, SE, s 179 | # stage1 180 | [[3, 16, 16, 0, 1]], # 0 181 | # stage2 182 | [[3, 48, 24, 0, 2]], # 1 183 | [[3, 72, 24, 0, 1]], # 2 1/4 184 | # stage3 185 | [[5, 72, 40, 0.25, 2]], # 3 186 | [[5, 120, 40, 0.25, 1]], # 4 1/8 187 | # stage4 188 | [[3, 240, 80, 0, 2]], # 5 189 | [[3, 200, 80, 0, 1], 190 | [3, 184, 80, 0, 1], 191 | [3, 184, 80, 0, 1], 192 | [3, 480, 112, 0.25, 1], 193 | [3, 672, 112, 0.25, 1] 194 | ], # 6 1/16 195 | # stage5 196 | [[5, 672, 160, 0.25, 2]], # 7 197 | [[5, 960, 160, 0, 1], 198 | [5, 960, 160, 0.25, 1], 199 | [5, 960, 160, 0, 1], 200 | [5, 960, 160, 0.25, 1] 201 | ] # 8 202 | ] 203 | # ------conv+bn+act----------# 9 1/32 204 | 205 | # building first layer 206 | output_channel = _make_divisible(16 * width_mult, 4) 207 | self.conv_stem = nn.Conv2d(3, output_channel, 3, 2, 1, bias=False) 208 | self.bn1 = nn.BatchNorm2d(output_channel) 209 | self.act1 = act_layers(act) 210 | input_channel = output_channel 211 | 212 | # building inverted residual blocks 213 | stages = [] 214 | block = GhostBottleneck 215 | for cfg in self.cfgs: 216 | layers = [] 217 | for k, exp_size, c, se_ratio, s in cfg: 218 | output_channel = _make_divisible(c * width_mult, 4) 219 | hidden_channel = _make_divisible(exp_size * width_mult, 4) 220 | layers.append(block(input_channel, hidden_channel, output_channel, k, s, 221 | act=act, se_ratio=se_ratio)) 222 | input_channel = output_channel 223 | stages.append(nn.Sequential(*layers)) 224 | 225 | output_channel = _make_divisible(exp_size * width_mult, 4) 226 | stages.append(nn.Sequential(ConvBnAct(input_channel, output_channel, 1, act=act))) #9 227 | 228 | self.blocks = nn.Sequential(*stages) 229 | 230 | self._initialize_weights(pretrain) 231 | 232 | def forward(self, x): 
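# GhostNet forward pass: stride-2 stem conv, then the 10 sequential block groups
# built above; feature maps of the stages listed in self.out_stages are collected
# and returned as a tuple for the FPN.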
233 | x = self.conv_stem(x) 234 | x = self.bn1(x) 235 | x = self.act1(x) 236 | output = [] 237 | for i in range(10): 238 | x = self.blocks[i](x) 239 | if i in self.out_stages: 240 | output.append(x) 241 | return tuple(output) 242 | 243 | def _initialize_weights(self, pretrain=True): 244 | print('init weights...') 245 | for name, m in self.named_modules(): 246 | if isinstance(m, nn.Conv2d): 247 | if 'conv_stem' in name: 248 | nn.init.normal_(m.weight, 0, 0.01) 249 | else: 250 | nn.init.normal_(m.weight, 0, 1.0 / m.weight.shape[1]) 251 | if m.bias is not None: 252 | nn.init.constant_(m.bias, 0) 253 | elif isinstance(m, nn.BatchNorm2d): 254 | nn.init.constant_(m.weight, 1) 255 | if m.bias is not None: 256 | nn.init.constant_(m.bias, 0.0001) 257 | nn.init.constant_(m.running_mean, 0) 258 | elif isinstance(m, nn.BatchNorm1d): 259 | nn.init.constant_(m.weight, 1) 260 | if m.bias is not None: 261 | nn.init.constant_(m.bias, 0.0001) 262 | nn.init.constant_(m.running_mean, 0) 263 | elif isinstance(m, nn.Linear): 264 | nn.init.normal_(m.weight, 0, 0.01) 265 | if m.bias is not None: 266 | nn.init.constant_(m.bias, 0) 267 | if pretrain: 268 | url = get_url(self.width_mult) 269 | if url is not None: 270 | state_dict = torch.hub.load_state_dict_from_url(url, progress=True) 271 | self.load_state_dict(state_dict, strict=False) 272 | -------------------------------------------------------------------------------- /nanodet/model/backbone/mobilenetv2.py: -------------------------------------------------------------------------------- 1 | from __future__ import absolute_import 2 | from __future__ import division 3 | from __future__ import print_function 4 | 5 | import torch 6 | import torch.nn as nn 7 | from ..module.activation import act_layers 8 | 9 | 10 | class ConvBNReLU(nn.Sequential): 11 | def __init__(self, in_planes, out_planes, kernel_size=3, stride=1, groups=1, act='ReLU'): 12 | padding = (kernel_size - 1) // 2 13 | super(ConvBNReLU, self).__init__( 14 | nn.Conv2d(in_planes, out_planes, kernel_size, stride, padding, groups=groups, bias=False), 15 | nn.BatchNorm2d(out_planes), 16 | act_layers(act) 17 | ) 18 | 19 | 20 | class InvertedResidual(nn.Module): 21 | def __init__(self, inp, oup, stride, expand_ratio, act='ReLU'): 22 | super(InvertedResidual, self).__init__() 23 | self.stride = stride 24 | assert stride in [1, 2] 25 | 26 | hidden_dim = int(round(inp * expand_ratio)) 27 | self.use_res_connect = self.stride == 1 and inp == oup 28 | 29 | layers = [] 30 | if expand_ratio != 1: 31 | # pw 32 | layers.append(ConvBNReLU(inp, hidden_dim, kernel_size=1, act=act)) 33 | layers.extend([ 34 | # dw 35 | ConvBNReLU(hidden_dim, hidden_dim, stride=stride, groups=hidden_dim, act=act), 36 | # pw-linear 37 | nn.Conv2d(hidden_dim, oup, 1, 1, 0, bias=False), 38 | nn.BatchNorm2d(oup), 39 | ]) 40 | self.conv = nn.Sequential(*layers) 41 | 42 | def forward(self, x): 43 | if self.use_res_connect: 44 | return x + self.conv(x) 45 | else: 46 | return self.conv(x) 47 | 48 | 49 | class MobileNetV2(nn.Module): 50 | def __init__(self, width_mult=1., out_stages=(1, 2, 4, 6), last_channel=1280, act='ReLU'): 51 | super(MobileNetV2, self).__init__() 52 | self.width_mult = width_mult 53 | self.out_stages = out_stages 54 | input_channel = 32 55 | self.last_channel = last_channel 56 | self.act = act 57 | self.interverted_residual_setting = [ 58 | # t, c, n, s 59 | [1, 16, 1, 1], 60 | [6, 24, 2, 2], 61 | [6, 32, 3, 2], 62 | [6, 64, 4, 2], 63 | [6, 96, 3, 1], 64 | [6, 160, 3, 2], 65 | [6, 320, 1, 1], 66 | ] 67 | 68 | # building first 
layer
69 |         self.input_channel = int(input_channel * width_mult)
70 |         self.first_layer = ConvBNReLU(3, self.input_channel, stride=2, act=self.act)
71 |         # building inverted residual blocks
72 |         for i in range(7):
73 |             name = 'stage{}'.format(i)
74 |             setattr(self, name, self.build_mobilenet_stage(stage_num=i))
75 | 
76 |     def build_mobilenet_stage(self, stage_num):
77 |         stage = []
78 |         t, c, n, s = self.interverted_residual_setting[stage_num]
79 |         output_channel = int(c * self.width_mult)
80 |         for i in range(n):
81 |             if i == 0:
82 |                 stage.append(InvertedResidual(self.input_channel, output_channel, s, expand_ratio=t, act=self.act))
83 |             else:
84 |                 stage.append(InvertedResidual(self.input_channel, output_channel, 1, expand_ratio=t, act=self.act))
85 |             self.input_channel = output_channel
86 |         if stage_num == 6:
87 |             last_layer = ConvBNReLU(self.input_channel, self.last_channel, kernel_size=1, act=self.act)
88 |             stage.append(last_layer)
89 |         stage = nn.Sequential(*stage)
90 |         return stage
91 | 
92 |     def forward(self, x):
93 |         x = self.first_layer(x)
94 |         output = []
95 |         for i in range(0, 7):
96 |             stage = getattr(self, 'stage{}'.format(i))
97 |             x = stage(x)
98 |             if i in self.out_stages:
99 |                 output.append(x)
100 | 
101 |         return tuple(output)
102 | 
103 |     def init_weights(self):
104 |         for m in self.modules():
105 |             if isinstance(m, nn.Conv2d):
106 |                 nn.init.normal_(m.weight, std=0.001)
107 |                 if m.bias is not None:
108 |                     m.bias.data.zero_()
109 |             elif isinstance(m, nn.BatchNorm2d):
110 |                 m.weight.data.fill_(1)
111 |                 m.bias.data.zero_()
112 | 
113 | 
--------------------------------------------------------------------------------
/nanodet/model/backbone/resnet.py:
--------------------------------------------------------------------------------
1 | from __future__ import absolute_import
2 | from __future__ import division
3 | from __future__ import print_function
4 | 
5 | 
6 | import torch
7 | import torch.nn as nn
8 | import torch.utils.model_zoo as model_zoo
9 | from ..module.activation import act_layers
10 | 
11 | 
12 | model_urls = {
13 |     'resnet18': 'https://download.pytorch.org/models/resnet18-5c106cde.pth',
14 |     'resnet34': 'https://download.pytorch.org/models/resnet34-333f7ec4.pth',
15 |     'resnet50': 'https://download.pytorch.org/models/resnet50-19c8e357.pth',
16 |     'resnet101': 'https://download.pytorch.org/models/resnet101-5d3b4d8f.pth',
17 |     'resnet152': 'https://download.pytorch.org/models/resnet152-b121ed2d.pth',
18 | }
19 | 
20 | def conv3x3(in_planes, out_planes, stride=1):
21 |     """3x3 convolution with padding"""
22 |     return nn.Conv2d(in_planes, out_planes, kernel_size=3, stride=stride,
23 |                      padding=1, bias=False)
24 | 
25 | 
26 | class BasicBlock(nn.Module):
27 |     expansion = 1
28 | 
29 |     def __init__(self, inplanes, planes, stride=1, downsample=None, activation='ReLU'):
30 |         super(BasicBlock, self).__init__()
31 |         self.conv1 = conv3x3(inplanes, planes, stride)
32 |         self.bn1 = nn.BatchNorm2d(planes)
33 |         self.act = act_layers(activation)
34 |         self.conv2 = conv3x3(planes, planes)
35 |         self.bn2 = nn.BatchNorm2d(planes)
36 |         self.downsample = downsample
37 |         self.stride = stride
38 | 
39 |     def forward(self, x):
40 |         residual = x
41 | 
42 |         out = self.conv1(x)
43 |         out = self.bn1(out)
44 |         out = self.act(out)
45 | 
46 |         out = self.conv2(out)
47 |         out = self.bn2(out)
48 | 
49 |         if self.downsample is not None:
50 |             residual = self.downsample(x)
51 | 
52 |         out += residual
53 |         out = self.act(out)
54 | 
55 |         return out
56 | 
57 | 
58 | class Bottleneck(nn.Module):
59 |     expansion = 4
60 | 
61 |     def __init__(self, inplanes,
planes, stride=1, downsample=None, activation='ReLU'): 62 | super(Bottleneck, self).__init__() 63 | self.conv1 = nn.Conv2d(inplanes, planes, kernel_size=1, bias=False) 64 | self.bn1 = nn.BatchNorm2d(planes) 65 | self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=stride, 66 | padding=1, bias=False) 67 | self.bn2 = nn.BatchNorm2d(planes) 68 | self.conv3 = nn.Conv2d(planes, planes * self.expansion, kernel_size=1, 69 | bias=False) 70 | self.bn3 = nn.BatchNorm2d(planes * self.expansion) 71 | self.act = act_layers(activation) 72 | self.downsample = downsample 73 | self.stride = stride 74 | 75 | def forward(self, x): 76 | residual = x 77 | 78 | out = self.conv1(x) 79 | out = self.bn1(out) 80 | out = self.act(out) 81 | 82 | out = self.conv2(out) 83 | out = self.bn2(out) 84 | out = self.act(out) 85 | 86 | out = self.conv3(out) 87 | out = self.bn3(out) 88 | 89 | if self.downsample is not None: 90 | residual = self.downsample(x) 91 | 92 | out += residual 93 | out = self.act(out) 94 | 95 | return out 96 | 97 | 98 | def fill_fc_weights(layers): 99 | for m in layers.modules(): 100 | if isinstance(m, nn.Conv2d): 101 | nn.init.normal_(m.weight, std=0.001) 102 | # torch.nn.init.kaiming_normal_(m.weight.data, nonlinearity='relu') 103 | # torch.nn.init.xavier_normal_(m.weight.data) 104 | if m.bias is not None: 105 | nn.init.constant_(m.bias, 0) 106 | 107 | 108 | class ResNet(nn.Module): 109 | resnet_spec = {18: (BasicBlock, [2, 2, 2, 2]), 110 | 34: (BasicBlock, [3, 4, 6, 3]), 111 | 50: (Bottleneck, [3, 4, 6, 3]), 112 | 101: (Bottleneck, [3, 4, 23, 3]), 113 | 152: (Bottleneck, [3, 8, 36, 3])} 114 | 115 | def __init__(self, 116 | depth, 117 | out_stages=(1, 2, 3, 4), 118 | activation='ReLU', 119 | pretrain=True 120 | ): 121 | super(ResNet, self).__init__() 122 | if depth not in self.resnet_spec: 123 | raise KeyError('invalid resnet depth {}'.format(depth)) 124 | self.activation = activation 125 | block, layers = self.resnet_spec[depth] 126 | self.depth = depth 127 | self.inplanes = 64 128 | self.out_stages = out_stages 129 | 130 | self.conv1 = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3, 131 | bias=False) 132 | self.bn1 = nn.BatchNorm2d(64) 133 | self.act = act_layers(self.activation) 134 | self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1) 135 | self.layer1 = self._make_layer(block, 64, layers[0]) 136 | self.layer2 = self._make_layer(block, 128, layers[1], stride=2) 137 | self.layer3 = self._make_layer(block, 256, layers[2], stride=2) 138 | self.layer4 = self._make_layer(block, 512, layers[3], stride=2) 139 | self.init_weights(pretrain=pretrain) 140 | 141 | def _make_layer(self, block, planes, blocks, stride=1): 142 | downsample = None 143 | if stride != 1 or self.inplanes != planes * block.expansion: 144 | downsample = nn.Sequential( 145 | nn.Conv2d(self.inplanes, planes * block.expansion, 146 | kernel_size=1, stride=stride, bias=False), 147 | nn.BatchNorm2d(planes * block.expansion), 148 | ) 149 | 150 | layers = [] 151 | layers.append(block(self.inplanes, planes, stride, downsample, activation=self.activation)) 152 | self.inplanes = planes * block.expansion 153 | for i in range(1, blocks): 154 | layers.append(block(self.inplanes, planes, activation=self.activation)) 155 | 156 | return nn.Sequential(*layers) 157 | 158 | def forward(self, x): 159 | x = self.conv1(x) 160 | x = self.bn1(x) 161 | x = self.act(x) 162 | x = self.maxpool(x) 163 | output = [] 164 | for i in range(1,5): 165 | res_layer = getattr(self, 'layer{}'.format(i)) 166 | x = res_layer(x) 167 | if i in 
self.out_stages: 168 | output.append(x) 169 | 170 | return tuple(output) 171 | 172 | def init_weights(self, pretrain=True): 173 | if pretrain: 174 | url = model_urls['resnet{}'.format(self.depth)] 175 | pretrained_state_dict = model_zoo.load_url(url) 176 | print('=> loading pretrained model {}'.format(url)) 177 | self.load_state_dict(pretrained_state_dict, strict=False) 178 | else: 179 | for m in self.modules(): 180 | if self.activation == 'LeakyReLU': 181 | nonlinearity = 'leaky_relu' 182 | else: 183 | nonlinearity = 'relu' 184 | if isinstance(m, nn.Conv2d): 185 | nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity=nonlinearity) 186 | elif isinstance(m, nn.BatchNorm2d): 187 | m.weight.data.fill_(1) 188 | m.bias.data.zero_() -------------------------------------------------------------------------------- /nanodet/model/backbone/shufflenetv2.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import torch.utils.model_zoo as model_zoo 4 | from ..module.activation import act_layers 5 | 6 | model_urls = { 7 | 'shufflenetv2_0.5x': 'https://download.pytorch.org/models/shufflenetv2_x0.5-f707e7126e.pth', 8 | 'shufflenetv2_1.0x': 'https://download.pytorch.org/models/shufflenetv2_x1-5666bf0f80.pth', 9 | 'shufflenetv2_1.5x': None, 10 | 'shufflenetv2_2.0x': None, 11 | } 12 | 13 | 14 | def channel_shuffle(x, groups): 15 | # type: (torch.Tensor, int) -> torch.Tensor 16 | batchsize, num_channels, height, width = x.data.size() 17 | channels_per_group = num_channels // groups 18 | 19 | # reshape 20 | x = x.view(batchsize, groups, 21 | channels_per_group, height, width) 22 | 23 | x = torch.transpose(x, 1, 2).contiguous() 24 | 25 | # flatten 26 | x = x.view(batchsize, -1, height, width) 27 | 28 | return x 29 | 30 | 31 | class ShuffleV2Block(nn.Module): 32 | def __init__(self, inp, oup, stride, activation='ReLU'): 33 | super(ShuffleV2Block, self).__init__() 34 | 35 | if not (1 <= stride <= 3): 36 | raise ValueError('illegal stride value') 37 | self.stride = stride 38 | 39 | branch_features = oup // 2 40 | assert (self.stride != 1) or (inp == branch_features << 1) 41 | 42 | if self.stride > 1: 43 | self.branch1 = nn.Sequential( 44 | self.depthwise_conv(inp, inp, kernel_size=3, stride=self.stride, padding=1), 45 | nn.BatchNorm2d(inp), 46 | nn.Conv2d(inp, branch_features, kernel_size=1, stride=1, padding=0, bias=False), 47 | nn.BatchNorm2d(branch_features), 48 | act_layers(activation), 49 | ) 50 | else: 51 | self.branch1 = nn.Sequential() 52 | 53 | self.branch2 = nn.Sequential( 54 | nn.Conv2d(inp if (self.stride > 1) else branch_features, 55 | branch_features, kernel_size=1, stride=1, padding=0, bias=False), 56 | nn.BatchNorm2d(branch_features), 57 | act_layers(activation), 58 | self.depthwise_conv(branch_features, branch_features, kernel_size=3, stride=self.stride, padding=1), 59 | nn.BatchNorm2d(branch_features), 60 | nn.Conv2d(branch_features, branch_features, kernel_size=1, stride=1, padding=0, bias=False), 61 | nn.BatchNorm2d(branch_features), 62 | act_layers(activation), 63 | ) 64 | 65 | @staticmethod 66 | def depthwise_conv(i, o, kernel_size, stride=1, padding=0, bias=False): 67 | return nn.Conv2d(i, o, kernel_size, stride, padding, bias=bias, groups=i) 68 | 69 | def forward(self, x): 70 | if self.stride == 1: 71 | x1, x2 = x.chunk(2, dim=1) 72 | out = torch.cat((x1, self.branch2(x2)), dim=1) 73 | else: 74 | out = torch.cat((self.branch1(x), self.branch2(x)), dim=1) 75 | 76 | out = channel_shuffle(out, 2) 77 | 78 | 
return out 79 | 80 | 81 | class ShuffleNetV2(nn.Module): 82 | def __init__(self, 83 | model_size='1.5x', 84 | out_stages=(2, 3, 4), 85 | with_last_conv=False, 86 | kernal_size=3, 87 | activation='ReLU'): 88 | super(ShuffleNetV2, self).__init__() 89 | print('model size is ', model_size) 90 | 91 | self.stage_repeats = [4, 8, 4] 92 | self.model_size = model_size 93 | self.out_stages = out_stages 94 | self.with_last_conv = with_last_conv 95 | self.kernal_size = kernal_size 96 | self.activation = activation 97 | if model_size == '0.5x': 98 | self._stage_out_channels = [24, 48, 96, 192, 1024] 99 | elif model_size == '1.0x': 100 | self._stage_out_channels = [24, 116, 232, 464, 1024] 101 | elif model_size == '1.5x': 102 | self._stage_out_channels = [24, 176, 352, 704, 1024] 103 | elif model_size == '2.0x': 104 | self._stage_out_channels = [24, 244, 488, 976, 2048] 105 | else: 106 | raise NotImplementedError 107 | 108 | # building first layer 109 | input_channels = 3 110 | output_channels = self._stage_out_channels[0] 111 | self.conv1 = nn.Sequential( 112 | nn.Conv2d(input_channels, output_channels, 3, 2, 1, bias=False), 113 | nn.BatchNorm2d(output_channels), 114 | act_layers(activation), 115 | ) 116 | input_channels = output_channels 117 | 118 | self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1) 119 | 120 | stage_names = ['stage{}'.format(i) for i in [2, 3, 4]] 121 | for name, repeats, output_channels in zip( 122 | stage_names, self.stage_repeats, self._stage_out_channels[1:]): 123 | seq = [ShuffleV2Block(input_channels, output_channels, 2, activation=activation)] 124 | for i in range(repeats - 1): 125 | seq.append(ShuffleV2Block(output_channels, output_channels, 1, activation=activation)) 126 | setattr(self, name, nn.Sequential(*seq)) 127 | input_channels = output_channels 128 | output_channels = self._stage_out_channels[-1] 129 | if self.with_last_conv: 130 | self.conv5 = nn.Sequential( 131 | nn.Conv2d(input_channels, output_channels, 1, 1, 0, bias=False), 132 | nn.BatchNorm2d(output_channels), 133 | act_layers(activation), 134 | ) 135 | self.stage4.add_module('conv5', self.conv5) 136 | self._initialize_weights() 137 | 138 | def forward(self, x): 139 | x = self.conv1(x) 140 | x = self.maxpool(x) 141 | output = [] 142 | for i in range(2, 5): 143 | stage = getattr(self, 'stage{}'.format(i)) 144 | x = stage(x) 145 | if i in self.out_stages: 146 | output.append(x) 147 | return tuple(output) 148 | 149 | def _initialize_weights(self, pretrain=True): 150 | print('init weights...') 151 | for name, m in self.named_modules(): 152 | if isinstance(m, nn.Conv2d): 153 | if 'first' in name: 154 | nn.init.normal_(m.weight, 0, 0.01) 155 | else: 156 | nn.init.normal_(m.weight, 0, 1.0 / m.weight.shape[1]) 157 | if m.bias is not None: 158 | nn.init.constant_(m.bias, 0) 159 | elif isinstance(m, nn.BatchNorm2d): 160 | nn.init.constant_(m.weight, 1) 161 | if m.bias is not None: 162 | nn.init.constant_(m.bias, 0.0001) 163 | nn.init.constant_(m.running_mean, 0) 164 | elif isinstance(m, nn.BatchNorm1d): 165 | nn.init.constant_(m.weight, 1) 166 | if m.bias is not None: 167 | nn.init.constant_(m.bias, 0.0001) 168 | nn.init.constant_(m.running_mean, 0) 169 | elif isinstance(m, nn.Linear): 170 | nn.init.normal_(m.weight, 0, 0.01) 171 | if m.bias is not None: 172 | nn.init.constant_(m.bias, 0) 173 | if pretrain: 174 | url = model_urls['shufflenetv2_{}'.format(self.model_size)] 175 | if url is not None: 176 | pretrained_state_dict = model_zoo.load_url(url) 177 | print('=> loading pretrained model {}'.format(url)) 
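        # Note on the call below: strict=False is deliberate.  The torchvision
        # checkpoint also carries classifier weights ('fc.*') and a final
        # 'conv5.*' block that have no counterpart in this backbone when
        # with_last_conv=False, so a non-strict load skips those keys instead
        # of raising an error.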
178 | self.load_state_dict(pretrained_state_dict, strict=False) 179 | 180 | 181 | if __name__ == "__main__": 182 | model = ShuffleNetV2(model_size='1.0x', ) 183 | print(model) 184 | test_data = torch.rand(5, 3, 320, 320) 185 | test_outputs = model(test_data) 186 | for out in test_outputs: 187 | print(out.size()) 188 | -------------------------------------------------------------------------------- /nanodet/model/fpn/__init__.py: -------------------------------------------------------------------------------- 1 | import copy 2 | from .fpn import FPN 3 | from .pan import PAN 4 | 5 | 6 | def build_fpn(cfg): 7 | fpn_cfg = copy.deepcopy(cfg) 8 | name = fpn_cfg.pop('name') 9 | if name == 'FPN': 10 | return FPN(**fpn_cfg) 11 | elif name == 'PAN': 12 | return PAN(**fpn_cfg) 13 | else: 14 | raise NotImplementedError -------------------------------------------------------------------------------- /nanodet/model/fpn/__pycache__/__init__.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/guo-pu/NanoDet-PyTorch/bac88abfaaff0b5bf1b2c42bbe8d742014bd14dd/nanodet/model/fpn/__pycache__/__init__.cpython-38.pyc -------------------------------------------------------------------------------- /nanodet/model/fpn/__pycache__/fpn.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/guo-pu/NanoDet-PyTorch/bac88abfaaff0b5bf1b2c42bbe8d742014bd14dd/nanodet/model/fpn/__pycache__/fpn.cpython-38.pyc -------------------------------------------------------------------------------- /nanodet/model/fpn/__pycache__/pan.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/guo-pu/NanoDet-PyTorch/bac88abfaaff0b5bf1b2c42bbe8d742014bd14dd/nanodet/model/fpn/__pycache__/pan.cpython-38.pyc -------------------------------------------------------------------------------- /nanodet/model/fpn/fpn.py: -------------------------------------------------------------------------------- 1 | """ 2 | from MMDetection 3 | """ 4 | 5 | import torch.nn as nn 6 | import torch.nn.functional as F 7 | from ..module.conv import ConvModule 8 | from ..module.init_weights import xavier_init 9 | 10 | 11 | class FPN(nn.Module): 12 | 13 | def __init__(self, 14 | in_channels, 15 | out_channels, 16 | num_outs, 17 | start_level=0, 18 | end_level=-1, 19 | conv_cfg=None, 20 | norm_cfg=None, 21 | activation=None 22 | ): 23 | super(FPN, self).__init__() 24 | assert isinstance(in_channels, list) 25 | self.in_channels = in_channels 26 | self.out_channels = out_channels 27 | self.num_ins = len(in_channels) 28 | self.num_outs = num_outs 29 | self.fp16_enabled = False 30 | 31 | if end_level == -1: 32 | self.backbone_end_level = self.num_ins 33 | assert num_outs >= self.num_ins - start_level 34 | else: 35 | # if end_level < inputs, no extra level is allowed 36 | self.backbone_end_level = end_level 37 | assert end_level <= len(in_channels) 38 | assert num_outs == end_level - start_level 39 | self.start_level = start_level 40 | self.end_level = end_level 41 | self.lateral_convs = nn.ModuleList() 42 | 43 | for i in range(self.start_level, self.backbone_end_level): 44 | l_conv = ConvModule( 45 | in_channels[i], 46 | out_channels, 47 | 1, 48 | conv_cfg=conv_cfg, 49 | norm_cfg=norm_cfg, 50 | activation=activation, 51 | inplace=False) 52 | 53 | self.lateral_convs.append(l_conv) 54 | self.init_weights() 55 | 56 | # default init_weights for conv(msra) 
and norm in ConvModule
57 |     def init_weights(self):
58 |         for m in self.modules():
59 |             if isinstance(m, nn.Conv2d):
60 |                 xavier_init(m, distribution='uniform')
61 | 
62 |     def forward(self, inputs):
63 |         assert len(inputs) == len(self.in_channels)
64 | 
65 |         # build laterals
66 |         laterals = [
67 |             lateral_conv(inputs[i + self.start_level])
68 |             for i, lateral_conv in enumerate(self.lateral_convs)
69 |         ]
70 | 
71 |         # build top-down path
72 |         used_backbone_levels = len(laterals)
73 |         for i in range(used_backbone_levels - 1, 0, -1):
74 |             prev_shape = laterals[i - 1].shape[2:]
75 |             laterals[i - 1] += F.interpolate(
76 |                 laterals[i], size=prev_shape, mode='bilinear')
77 | 
78 |         # build outputs
79 |         outs = [
80 |             # self.fpn_convs[i](laterals[i]) for i in range(used_backbone_levels)
81 |             laterals[i] for i in range(used_backbone_levels)
82 |         ]
83 |         return tuple(outs)
84 | 
85 | 
86 | # if __name__ == '__main__':
--------------------------------------------------------------------------------
/nanodet/model/fpn/pan.py:
--------------------------------------------------------------------------------
1 | import torch.nn as nn
2 | import torch.nn.functional as F
3 | from ..module.conv import ConvModule
4 | from .fpn import FPN
5 | 
6 | 
7 | class PAN(FPN):
8 |     """Path Aggregation Network for Instance Segmentation.
9 | 
10 |     This is an implementation of the `PAN in Path Aggregation Network
11 |     <https://arxiv.org/abs/1803.01534>`_.
12 | 
13 |     Args:
14 |         in_channels (List[int]): Number of input channels per scale.
15 |         out_channels (int): Number of output channels (used at each scale)
16 |         num_outs (int): Number of output scales.
17 |         start_level (int): Index of the start input backbone level used to
18 |             build the feature pyramid. Default: 0.
19 |         end_level (int): Index of the end input backbone level (exclusive) to
20 |             build the feature pyramid. Default: -1, which means the last level.
21 |         add_extra_convs (bool): Whether to add conv layers on top of the
22 |             original feature maps. Default: False.
23 |         extra_convs_on_inputs (bool): Whether to apply extra conv on
24 |             the original feature from the backbone. Default: False.
25 |         relu_before_extra_convs (bool): Whether to apply relu before the extra
26 |             conv. Default: False.
27 |         no_norm_on_lateral (bool): Whether to apply norm on lateral.
28 |             Default: False.
29 |         conv_cfg (dict): Config dict for convolution layer. Default: None.
30 |         norm_cfg (dict): Config dict for normalization layer. Default: None.
31 |         activation (str): Type of activation layer used in ConvModule.
32 |             Default: None.
33 | """ 34 | 35 | def __init__(self, 36 | in_channels, 37 | out_channels, 38 | num_outs, 39 | start_level=0, 40 | end_level=-1, 41 | conv_cfg=None, 42 | norm_cfg=None, 43 | activation=None): 44 | super(PAN, 45 | self).__init__(in_channels, out_channels, num_outs, start_level, 46 | end_level, conv_cfg, norm_cfg, activation) 47 | self.init_weights() 48 | 49 | def forward(self, inputs): 50 | """Forward function.""" 51 | assert len(inputs) == len(self.in_channels) 52 | 53 | # build laterals 54 | laterals = [ 55 | lateral_conv(inputs[i + self.start_level]) 56 | for i, lateral_conv in enumerate(self.lateral_convs) 57 | ] 58 | 59 | # build top-down path 60 | used_backbone_levels = len(laterals) 61 | for i in range(used_backbone_levels - 1, 0, -1): 62 | prev_shape = laterals[i - 1].shape[2:] 63 | laterals[i - 1] += F.interpolate( 64 | laterals[i], size=prev_shape, mode='bilinear') 65 | 66 | # build outputs 67 | # part 1: from original levels 68 | inter_outs = [ 69 | laterals[i] for i in range(used_backbone_levels) 70 | ] 71 | 72 | # part 2: add bottom-up path 73 | for i in range(0, used_backbone_levels - 1): 74 | prev_shape = inter_outs[i + 1].shape[2:] 75 | inter_outs[i + 1] += F.interpolate(inter_outs[i], size=prev_shape, mode='bilinear') 76 | 77 | outs = [] 78 | outs.append(inter_outs[0]) 79 | outs.extend([ 80 | inter_outs[i] for i in range(1, used_backbone_levels) 81 | ]) 82 | return tuple(outs) 83 | -------------------------------------------------------------------------------- /nanodet/model/head/__init__.py: -------------------------------------------------------------------------------- 1 | import copy 2 | from .gfl_head import GFLHead 3 | from .nanodet_head import NanoDetHead 4 | 5 | 6 | def build_head(cfg): 7 | head_cfg = copy.deepcopy(cfg) 8 | name = head_cfg.pop('name') 9 | if name == 'GFLHead': 10 | return GFLHead(**head_cfg) 11 | elif name == 'NanoDetHead': 12 | return NanoDetHead(**head_cfg) 13 | else: 14 | raise NotImplementedError -------------------------------------------------------------------------------- /nanodet/model/head/__pycache__/__init__.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/guo-pu/NanoDet-PyTorch/bac88abfaaff0b5bf1b2c42bbe8d742014bd14dd/nanodet/model/head/__pycache__/__init__.cpython-38.pyc -------------------------------------------------------------------------------- /nanodet/model/head/__pycache__/gfl_head.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/guo-pu/NanoDet-PyTorch/bac88abfaaff0b5bf1b2c42bbe8d742014bd14dd/nanodet/model/head/__pycache__/gfl_head.cpython-38.pyc -------------------------------------------------------------------------------- /nanodet/model/head/__pycache__/nanodet_head.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/guo-pu/NanoDet-PyTorch/bac88abfaaff0b5bf1b2c42bbe8d742014bd14dd/nanodet/model/head/__pycache__/nanodet_head.cpython-38.pyc -------------------------------------------------------------------------------- /nanodet/model/head/anchor/__pycache__/anchor_generator.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/guo-pu/NanoDet-PyTorch/bac88abfaaff0b5bf1b2c42bbe8d742014bd14dd/nanodet/model/head/anchor/__pycache__/anchor_generator.cpython-38.pyc 
-------------------------------------------------------------------------------- /nanodet/model/head/anchor/__pycache__/anchor_target.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/guo-pu/NanoDet-PyTorch/bac88abfaaff0b5bf1b2c42bbe8d742014bd14dd/nanodet/model/head/anchor/__pycache__/anchor_target.cpython-38.pyc -------------------------------------------------------------------------------- /nanodet/model/head/anchor/__pycache__/base_anchor_head.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/guo-pu/NanoDet-PyTorch/bac88abfaaff0b5bf1b2c42bbe8d742014bd14dd/nanodet/model/head/anchor/__pycache__/base_anchor_head.cpython-38.pyc -------------------------------------------------------------------------------- /nanodet/model/head/anchor/anchor_generator.py: -------------------------------------------------------------------------------- 1 | import torch 2 | 3 | 4 | class AnchorGenerator(object): 5 | """ 6 | Examples: 7 | >>> self = AnchorGenerator(9, [1.], [1.]) 8 | >>> all_anchors = self.grid_anchors((2, 2), device='cpu') 9 | >>> print(all_anchors) 10 | tensor([[ 0., 0., 8., 8.], 11 | [16., 0., 24., 8.], 12 | [ 0., 16., 8., 24.], 13 | [16., 16., 24., 24.]]) 14 | """ 15 | 16 | def __init__(self, base_size, scales, ratios, scale_major=True, ctr=None): 17 | self.base_size = base_size 18 | self.scales = torch.Tensor(scales) 19 | self.ratios = torch.Tensor(ratios) 20 | self.scale_major = scale_major 21 | self.ctr = ctr 22 | self.base_anchors = self.gen_base_anchors() 23 | 24 | @property 25 | def num_base_anchors(self): 26 | return self.base_anchors.size(0) 27 | 28 | def gen_base_anchors(self): 29 | w = self.base_size 30 | h = self.base_size 31 | if self.ctr is None: 32 | x_ctr = 0.5 * (w - 1) 33 | y_ctr = 0.5 * (h - 1) 34 | else: 35 | x_ctr, y_ctr = self.ctr 36 | 37 | h_ratios = torch.sqrt(self.ratios) 38 | w_ratios = 1 / h_ratios 39 | if self.scale_major: 40 | ws = (w * w_ratios[:, None] * self.scales[None, :]).view(-1) 41 | hs = (h * h_ratios[:, None] * self.scales[None, :]).view(-1) 42 | else: 43 | ws = (w * self.scales[:, None] * w_ratios[None, :]).view(-1) 44 | hs = (h * self.scales[:, None] * h_ratios[None, :]).view(-1) 45 | 46 | # yapf: disable 47 | base_anchors = torch.stack( 48 | [ 49 | x_ctr - 0.5 * (ws - 1), y_ctr - 0.5 * (hs - 1), 50 | x_ctr + 0.5 * (ws - 1), y_ctr + 0.5 * (hs - 1) 51 | ], 52 | dim=-1).round() 53 | # yapf: enable 54 | 55 | return base_anchors 56 | 57 | def _meshgrid(self, x, y, row_major=True): 58 | xx = x.repeat(len(y)) 59 | yy = y.view(-1, 1).repeat(1, len(x)).view(-1) 60 | if row_major: 61 | return xx, yy 62 | else: 63 | return yy, xx 64 | 65 | def grid_anchors(self, featmap_size, stride=16, device='cuda'): 66 | base_anchors = self.base_anchors.to(device) 67 | 68 | feat_h, feat_w = featmap_size 69 | shift_x = torch.arange(0, feat_w, device=device) * stride 70 | shift_y = torch.arange(0, feat_h, device=device) * stride 71 | shift_xx, shift_yy = self._meshgrid(shift_x, shift_y) 72 | shifts = torch.stack([shift_xx, shift_yy, shift_xx, shift_yy], dim=-1) 73 | shifts = shifts.type_as(base_anchors) 74 | # first feat_w elements correspond to the first row of shifts 75 | # add A anchors (1, A, 4) to K shifts (K, 1, 4) to get 76 | # shifted anchors (K, A, 4), reshape to (K*A, 4) 77 | 78 | all_anchors = base_anchors[None, :, :] + shifts[:, None, :] 79 | all_anchors = all_anchors.view(-1, 4) 80 | # first A rows 
correspond to A anchors of (0, 0) in feature map, 81 | # then (0, 1), (0, 2), ... 82 | return all_anchors 83 | 84 | def valid_flags(self, featmap_size, valid_size, device='cuda'): 85 | feat_h, feat_w = featmap_size 86 | valid_h, valid_w = valid_size 87 | assert valid_h <= feat_h and valid_w <= feat_w 88 | valid_x = torch.zeros(feat_w, dtype=torch.bool, device=device) 89 | valid_y = torch.zeros(feat_h, dtype=torch.bool, device=device) 90 | valid_x[:valid_w] = 1 91 | valid_y[:valid_h] = 1 92 | valid_xx, valid_yy = self._meshgrid(valid_x, valid_y) 93 | valid = valid_xx & valid_yy 94 | valid = valid[:, None].expand(valid.size(0), 95 | self.num_base_anchors).contiguous().view(-1) 96 | return valid 97 | -------------------------------------------------------------------------------- /nanodet/model/head/anchor/anchor_target.py: -------------------------------------------------------------------------------- 1 | import torch 2 | from functools import partial 3 | 4 | 5 | def multi_apply(func, *args, **kwargs): 6 | pfunc = partial(func, **kwargs) if kwargs else func 7 | map_results = map(pfunc, *args) 8 | return tuple(map(list, zip(*map_results))) 9 | 10 | 11 | def images_to_levels(target, num_level_anchors): 12 | """Convert targets by image to targets by feature level. 13 | 14 | [target_img0, target_img1] -> [target_level0, target_level1, ...] 15 | """ 16 | target = torch.stack(target, 0) 17 | level_targets = [] 18 | start = 0 19 | for n in num_level_anchors: 20 | end = start + n 21 | level_targets.append(target[:, start:end].squeeze(0)) 22 | start = end 23 | return level_targets 24 | 25 | 26 | def anchor_inside_flags(flat_anchors, 27 | valid_flags, 28 | img_shape, 29 | allowed_border=0): 30 | img_h, img_w = img_shape 31 | if allowed_border >= 0: 32 | inside_flags = valid_flags & \ 33 | (flat_anchors[:, 0] >= -allowed_border) & \ 34 | (flat_anchors[:, 1] >= -allowed_border) & \ 35 | (flat_anchors[:, 2] < img_w + allowed_border) & \ 36 | (flat_anchors[:, 3] < img_h + allowed_border) 37 | else: 38 | inside_flags = valid_flags 39 | return inside_flags 40 | 41 | 42 | def unmap(data, count, inds, fill=0): 43 | """ Unmap a subset of item (data) back to the original set of items (of 44 | size count) """ 45 | if data.dim() == 1: 46 | ret = data.new_full((count, ), fill) 47 | ret[inds.type(torch.bool)] = data 48 | else: 49 | new_size = (count, ) + data.size()[1:] 50 | ret = data.new_full(new_size, fill) 51 | ret[inds.type(torch.bool), :] = data 52 | return ret 53 | -------------------------------------------------------------------------------- /nanodet/model/head/anchor/base_anchor_head.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import torch 3 | import torch.nn as nn 4 | from nanodet.model.module.init_weights import normal_init 5 | 6 | from .anchor_generator import AnchorGenerator 7 | from .anchor_target import multi_apply 8 | 9 | 10 | class AnchorHead(nn.Module): 11 | """Anchor-based head (RPN, RetinaNet, SSD, etc.). 12 | 13 | Args: 14 | num_classes (int): Number of categories including the background 15 | category. 16 | in_channels (int): Number of channels in the input feature map. 17 | feat_channels (int): Number of hidden channels. Used in child classes. 18 | anchor_scales (Iterable): Anchor scales. 19 | anchor_ratios (Iterable): Anchor aspect ratios. 20 | anchor_strides (Iterable): Anchor strides. 21 | anchor_base_sizes (Iterable): Anchor base sizes. 22 | target_means (Iterable): Mean values of regression targets. 
23 | target_stds (Iterable): Std values of regression targets. 24 | loss_cls (dict): Config of classification loss. 25 | loss_bbox (dict): Config of localization loss. 26 | """ # noqa: W605 27 | 28 | def __init__(self, 29 | num_classes, 30 | loss, 31 | use_sigmoid, 32 | input_channel, 33 | feat_channels=256, 34 | anchor_scales=[8], 35 | anchor_ratios=[1.0], 36 | strides=[8, 16, 32], 37 | anchor_base_sizes=None, 38 | target_means=(.0, .0, .0, .0), 39 | target_stds=(0.1, 0.1, 0.2, 0.2), 40 | ): 41 | super(AnchorHead, self).__init__() 42 | self.in_channels = input_channel 43 | self.num_classes = num_classes 44 | self.loss_cfg = loss 45 | self.feat_channels = feat_channels 46 | self.anchor_scales = anchor_scales 47 | self.anchor_ratios = anchor_ratios 48 | self.anchor_strides = strides 49 | self.anchor_base_sizes = list( 50 | strides) if anchor_base_sizes is None else anchor_base_sizes 51 | self.target_means = target_means 52 | self.target_stds = target_stds 53 | 54 | self.use_sigmoid_cls = use_sigmoid 55 | # self.sampling = self.loss_cfg.loss_cls['name'] not in ['FocalLoss', 'GHMC'] 56 | if self.use_sigmoid_cls: 57 | self.cls_out_channels = num_classes 58 | else: 59 | self.cls_out_channels = num_classes + 1 60 | 61 | if self.cls_out_channels <= 0: 62 | raise ValueError('num_classes={} is too small'.format(num_classes)) 63 | 64 | # self.loss_cls = build_loss(loss_cls) 65 | # self.loss_bbox = build_loss(loss_bbox) 66 | self.fp16_enabled = False 67 | 68 | self.anchor_generators = [] 69 | for anchor_base in self.anchor_base_sizes: 70 | self.anchor_generators.append( 71 | AnchorGenerator(anchor_base, anchor_scales, anchor_ratios)) 72 | 73 | self.num_anchors = len(self.anchor_ratios) * len(self.anchor_scales) 74 | self._init_layers() 75 | 76 | def _init_layers(self): 77 | self.conv_cls = nn.Conv2d(self.in_channels, 78 | self.num_anchors * self.cls_out_channels, 1) 79 | self.conv_reg = nn.Conv2d(self.in_channels, self.num_anchors * 4, 1) 80 | 81 | def init_weights(self): 82 | normal_init(self.conv_cls, std=0.01) 83 | normal_init(self.conv_reg, std=0.01) 84 | 85 | def forward_single(self, x): 86 | cls_score = self.conv_cls(x) 87 | bbox_pred = self.conv_reg(x) 88 | return cls_score, bbox_pred 89 | 90 | def forward(self, feats): 91 | return multi_apply(self.forward_single, feats) 92 | 93 | def get_anchors(self, featmap_sizes, img_shapes, device='cuda'): # checked! 94 | """Get anchors according to feature map sizes. 95 | 96 | Args: 97 | featmap_sizes (list[tuple]): Multi-level feature map sizes. 98 | img_shapes (h,w): Image meta info. 
99 | device (torch.device | str): device for returned tensors 100 | 101 | Returns: 102 | tuple: anchors of each image, valid flags of each image 103 | """ 104 | num_imgs = len(img_shapes) 105 | num_levels = len(featmap_sizes) 106 | 107 | # since feature map sizes of all images are the same, we only compute 108 | # anchors for one time 109 | multi_level_anchors = [] 110 | for i in range(num_levels): 111 | anchors = self.anchor_generators[i].grid_anchors( 112 | featmap_sizes[i], self.anchor_strides[i], device=device) 113 | multi_level_anchors.append(anchors) 114 | anchor_list = [multi_level_anchors for _ in range(num_imgs)] 115 | 116 | # for each image, we compute valid flags of multi level anchors 117 | valid_flag_list = [] 118 | for img_id, img_shape in enumerate(img_shapes): 119 | multi_level_flags = [] 120 | for i in range(num_levels): 121 | anchor_stride = self.anchor_strides[i] 122 | feat_h, feat_w = featmap_sizes[i] 123 | h, w = img_shape 124 | valid_feat_h = min(int(np.ceil(h / anchor_stride)), feat_h) 125 | valid_feat_w = min(int(np.ceil(w / anchor_stride)), feat_w) 126 | flags = self.anchor_generators[i].valid_flags( 127 | (feat_h, feat_w), (valid_feat_h, valid_feat_w), 128 | device=device) 129 | multi_level_flags.append(flags) 130 | valid_flag_list.append(multi_level_flags) 131 | 132 | return anchor_list, valid_flag_list 133 | -------------------------------------------------------------------------------- /nanodet/model/head/assigner/__pycache__/assign_result.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/guo-pu/NanoDet-PyTorch/bac88abfaaff0b5bf1b2c42bbe8d742014bd14dd/nanodet/model/head/assigner/__pycache__/assign_result.cpython-38.pyc -------------------------------------------------------------------------------- /nanodet/model/head/assigner/__pycache__/atss_assigner.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/guo-pu/NanoDet-PyTorch/bac88abfaaff0b5bf1b2c42bbe8d742014bd14dd/nanodet/model/head/assigner/__pycache__/atss_assigner.cpython-38.pyc -------------------------------------------------------------------------------- /nanodet/model/head/assigner/__pycache__/base_assigner.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/guo-pu/NanoDet-PyTorch/bac88abfaaff0b5bf1b2c42bbe8d742014bd14dd/nanodet/model/head/assigner/__pycache__/base_assigner.cpython-38.pyc -------------------------------------------------------------------------------- /nanodet/model/head/assigner/assign_result.py: -------------------------------------------------------------------------------- 1 | import torch 2 | 3 | from nanodet.util import util_mixins 4 | 5 | 6 | class AssignResult(util_mixins.NiceRepr): 7 | """ 8 | Stores assignments between predicted and truth boxes. 9 | 10 | Attributes: 11 | num_gts (int): the number of truth boxes considered when computing this 12 | assignment 13 | 14 | gt_inds (LongTensor): for each predicted box indicates the 1-based 15 | index of the assigned truth box. 0 means unassigned and -1 means 16 | ignore. 17 | 18 | max_overlaps (FloatTensor): the iou between the predicted box and its 19 | assigned truth box. 20 | 21 | labels (None | LongTensor): If specified, for each predicted box 22 | indicates the category label of the assigned truth box. 
23 | 
24 |     Example:
25 |         >>> # An assign result between 4 predicted boxes and 9 true boxes
26 |         >>> # where only two boxes were assigned.
27 |         >>> num_gts = 9
28 |         >>> max_overlaps = torch.LongTensor([0, .5, .9, 0])
29 |         >>> gt_inds = torch.LongTensor([-1, 1, 2, 0])
30 |         >>> labels = torch.LongTensor([0, 3, 4, 0])
31 |         >>> self = AssignResult(num_gts, gt_inds, max_overlaps, labels)
32 |         >>> print(str(self))  # xdoctest: +IGNORE_WANT
33 |         <AssignResult(num_gts=9, gt_inds.shape=(4,),
34 |                       max_overlaps.shape=(4,), labels.shape=(4,))>
35 |         >>> # Force addition of gt labels (when adding gt as proposals)
36 |         >>> new_labels = torch.LongTensor([3, 4, 5])
37 |         >>> self.add_gt_(new_labels)
38 |         >>> print(str(self))  # xdoctest: +IGNORE_WANT
39 |         <AssignResult(num_gts=9, gt_inds.shape=(7,),
40 |                       max_overlaps.shape=(7,), labels.shape=(7,))>
41 |     """
42 | 
43 |     def __init__(self, num_gts, gt_inds, max_overlaps, labels=None):
44 |         self.num_gts = num_gts
45 |         self.gt_inds = gt_inds
46 |         self.max_overlaps = max_overlaps
47 |         self.labels = labels
48 |         # Interface for possible user-defined properties
49 |         self._extra_properties = {}
50 | 
51 |     @property
52 |     def num_preds(self):
53 |         """int: the number of predictions in this assignment"""
54 |         return len(self.gt_inds)
55 | 
56 |     def set_extra_property(self, key, value):
57 |         """Set user-defined new property."""
58 |         assert key not in self.info
59 |         self._extra_properties[key] = value
60 | 
61 |     def get_extra_property(self, key):
62 |         """Get user-defined property."""
63 |         return self._extra_properties.get(key, None)
64 | 
65 |     @property
66 |     def info(self):
67 |         """dict: a dictionary of info about the object"""
68 |         basic_info = {
69 |             'num_gts': self.num_gts,
70 |             'num_preds': self.num_preds,
71 |             'gt_inds': self.gt_inds,
72 |             'max_overlaps': self.max_overlaps,
73 |             'labels': self.labels,
74 |         }
75 |         basic_info.update(self._extra_properties)
76 |         return basic_info
77 | 
78 |     def __nice__(self):
79 |         """str: a "nice" summary string describing this assign result"""
80 |         parts = []
81 |         parts.append(f'num_gts={self.num_gts!r}')
82 |         if self.gt_inds is None:
83 |             parts.append(f'gt_inds={self.gt_inds!r}')
84 |         else:
85 |             parts.append(f'gt_inds.shape={tuple(self.gt_inds.shape)!r}')
86 |         if self.max_overlaps is None:
87 |             parts.append(f'max_overlaps={self.max_overlaps!r}')
88 |         else:
89 |             parts.append('max_overlaps.shape='
90 |                          f'{tuple(self.max_overlaps.shape)!r}')
91 |         if self.labels is None:
92 |             parts.append(f'labels={self.labels!r}')
93 |         else:
94 |             parts.append(f'labels.shape={tuple(self.labels.shape)!r}')
95 |         return ', '.join(parts)
96 | 
97 |     @classmethod
98 |     def random(cls, **kwargs):
99 |         """Create random AssignResult for tests or debugging.
100 | 
101 |         Args:
102 |             num_preds: number of predicted boxes
103 |             num_gts: number of true boxes
104 |             p_ignore (float): probability of a predicted box assigned to an
105 |                 ignored truth
106 |             p_assigned (float): probability of a predicted box being
107 |                 assigned
108 |             p_use_label (float | bool): with labels or not
109 |             rng (None | int | numpy.random.RandomState): seed or state
110 | 
111 |         Returns:
112 |             :obj:`AssignResult`: Randomly generated assign results.
113 | 
114 |         Example:
115 |             >>> from mmdet.core.bbox.assigners.assign_result import *  # NOQA
116 |             >>> self = AssignResult.random()
117 |             >>> print(self.info)
118 |         """
119 |         from mmdet.core.bbox import demodata
120 |         rng = demodata.ensure_rng(kwargs.get('rng', None))
121 | 
122 |         num_gts = kwargs.get('num_gts', None)
123 |         num_preds = kwargs.get('num_preds', None)
124 |         p_ignore = kwargs.get('p_ignore', 0.3)
125 |         p_assigned = kwargs.get('p_assigned', 0.7)
126 |         p_use_label = kwargs.get('p_use_label', 0.5)
127 |         num_classes = kwargs.get('num_classes', 3)
128 | 
129 |         if num_gts is None:
130 |             num_gts = rng.randint(0, 8)
131 |         if num_preds is None:
132 |             num_preds = rng.randint(0, 16)
133 | 
134 |         if num_gts == 0:
135 |             max_overlaps = torch.zeros(num_preds, dtype=torch.float32)
136 |             gt_inds = torch.zeros(num_preds, dtype=torch.int64)
137 |             if p_use_label is True or p_use_label < rng.rand():
138 |                 labels = torch.zeros(num_preds, dtype=torch.int64)
139 |             else:
140 |                 labels = None
141 |         else:
142 |             import numpy as np
143 |             # Create an overlap for each predicted box
144 |             max_overlaps = torch.from_numpy(rng.rand(num_preds))
145 | 
146 |             # Construct gt_inds for each predicted box
147 |             is_assigned = torch.from_numpy(rng.rand(num_preds) < p_assigned)
148 |             # maximum number of assignments constraints
149 |             n_assigned = min(num_preds, min(num_gts, is_assigned.sum()))
150 | 
151 |             assigned_idxs = np.where(is_assigned)[0]
152 |             rng.shuffle(assigned_idxs)
153 |             assigned_idxs = assigned_idxs[0:n_assigned]
154 |             assigned_idxs.sort()
155 | 
156 |             is_assigned[:] = 0
157 |             is_assigned[assigned_idxs] = True
158 | 
159 |             is_ignore = torch.from_numpy(
160 |                 rng.rand(num_preds) < p_ignore) & is_assigned
161 | 
162 |             gt_inds = torch.zeros(num_preds, dtype=torch.int64)
163 | 
164 |             true_idxs = np.arange(num_gts)
165 |             rng.shuffle(true_idxs)
166 |             true_idxs = torch.from_numpy(true_idxs)
167 |             gt_inds[is_assigned] = true_idxs[:n_assigned]
168 | 
169 |             gt_inds = torch.from_numpy(
170 |                 rng.randint(1, num_gts + 1, size=num_preds))
171 |             gt_inds[is_ignore] = -1
172 |             gt_inds[~is_assigned] = 0
173 |             max_overlaps[~is_assigned] = 0
174 | 
175 |             if p_use_label is True or p_use_label < rng.rand():
176 |                 if num_classes == 0:
177 |                     labels = torch.zeros(num_preds, dtype=torch.int64)
178 |                 else:
179 |                     labels = torch.from_numpy(
180 |                         # note that we set FG labels to [0, num_class-1]
181 |                         # since mmdet v2.0
182 |                         # BG cat_id: num_class
183 |                         rng.randint(0, num_classes, size=num_preds))
184 |                     labels[~is_assigned] = 0
185 |             else:
186 |                 labels = None
187 | 
188 |         self = cls(num_gts, gt_inds, max_overlaps, labels)
189 |         return self
190 | 
191 |     def add_gt_(self, gt_labels):
192 |         """Add ground truth as assigned results.
193 | 
194 |         Args:
195 |             gt_labels (torch.Tensor): Labels of gt boxes
196 |         """
197 |         self_inds = torch.arange(
198 |             1, len(gt_labels) + 1, dtype=torch.long, device=gt_labels.device)
199 |         self.gt_inds = torch.cat([self_inds, self.gt_inds])
200 | 
201 |         self.max_overlaps = torch.cat(
202 |             [self.max_overlaps.new_ones(len(gt_labels)), self.max_overlaps])
203 | 
204 |         if self.labels is not None:
205 |             self.labels = torch.cat([gt_labels, self.labels])
206 | 
--------------------------------------------------------------------------------
/nanodet/model/head/assigner/atss_assigner.py:
--------------------------------------------------------------------------------
1 | import torch
2 | 
3 | from ...loss.iou_loss import bbox_overlaps
4 | from .base_assigner import BaseAssigner
5 | from .assign_result import AssignResult
6 | 
7 | 
8 | class ATSSAssigner(BaseAssigner):
9 |     """Assign a corresponding gt bbox or background to each bbox.
10 | 
11 |     Each proposal will be assigned with `0` or a positive integer
12 |     indicating the ground truth index.
13 | 
14 |     - 0: negative sample, no assigned gt
15 |     - positive integer: positive sample, index (1-based) of assigned gt
16 | 
17 |     Args:
18 |         topk (int): number of bboxes selected on each level
19 |     """
20 | 
21 |     def __init__(self, topk):
22 |         self.topk = topk
23 | 
24 |     # https://github.com/sfzhang15/ATSS/blob/master/atss_core/modeling/rpn/atss/loss.py
25 | 
26 |     def assign(self,
27 |                bboxes,
28 |                num_level_bboxes,
29 |                gt_bboxes,
30 |                gt_bboxes_ignore=None,
31 |                gt_labels=None):
32 |         """Assign gt to bboxes.
33 | 
34 |         The assignment is done in the following steps
35 | 
36 |         1. compute iou between all bboxes (bboxes of all pyramid levels) and gts
37 |         2. compute center distance between all bboxes and gts
38 |         3. on each pyramid level, for each gt, select k bboxes whose centers
39 |            are closest to the gt center, so that k*l bboxes are selected in
40 |            total as candidates for each gt
41 |         4. get the corresponding iou for these candidates, and compute the
42 |            mean and std; set mean + std as the iou threshold
43 |         5. select those candidates whose iou is greater than or equal to
44 |            the threshold as positive
45 |         6. keep only positive candidates whose center lies inside the gt
46 | 
47 | 
48 |         Args:
49 |             bboxes (Tensor): Bounding boxes to be assigned, shape(n, 4).
50 |             num_level_bboxes (List): num of bboxes in each level
51 |             gt_bboxes (Tensor): Groundtruth boxes, shape (k, 4).
52 |             gt_bboxes_ignore (Tensor, optional): Ground truth bboxes that are
53 |                 labelled as `ignored`, e.g., crowd boxes in COCO.
54 |             gt_labels (Tensor, optional): Label of gt_bboxes, shape (k, ).
55 | 
56 |         Returns:
57 |             :obj:`AssignResult`: The assign result.
58 | """ 59 | INF = 100000000 60 | bboxes = bboxes[:, :4] 61 | num_gt, num_bboxes = gt_bboxes.size(0), bboxes.size(0) 62 | 63 | # compute iou between all bbox and gt 64 | overlaps = bbox_overlaps(bboxes, gt_bboxes) 65 | 66 | # assign 0 by default 67 | assigned_gt_inds = overlaps.new_full((num_bboxes,), 68 | 0, 69 | dtype=torch.long) 70 | 71 | if num_gt == 0 or num_bboxes == 0: 72 | # No ground truth or boxes, return empty assignment 73 | max_overlaps = overlaps.new_zeros((num_bboxes,)) 74 | if num_gt == 0: 75 | # No truth, assign everything to background 76 | assigned_gt_inds[:] = 0 77 | if gt_labels is None: 78 | assigned_labels = None 79 | else: 80 | assigned_labels = overlaps.new_full((num_bboxes,), 81 | -1, 82 | dtype=torch.long) 83 | return AssignResult( 84 | num_gt, assigned_gt_inds, max_overlaps, labels=assigned_labels) 85 | 86 | # compute center distance between all bbox and gt 87 | gt_cx = (gt_bboxes[:, 0] + gt_bboxes[:, 2]) / 2.0 88 | gt_cy = (gt_bboxes[:, 1] + gt_bboxes[:, 3]) / 2.0 89 | gt_points = torch.stack((gt_cx, gt_cy), dim=1) 90 | 91 | bboxes_cx = (bboxes[:, 0] + bboxes[:, 2]) / 2.0 92 | bboxes_cy = (bboxes[:, 1] + bboxes[:, 3]) / 2.0 93 | bboxes_points = torch.stack((bboxes_cx, bboxes_cy), dim=1) 94 | 95 | distances = (bboxes_points[:, None, :] - 96 | gt_points[None, :, :]).pow(2).sum(-1).sqrt() 97 | 98 | # Selecting candidates based on the center distance 99 | candidate_idxs = [] 100 | start_idx = 0 101 | for level, bboxes_per_level in enumerate(num_level_bboxes): 102 | # on each pyramid level, for each gt, 103 | # select k bbox whose center are closest to the gt center 104 | end_idx = start_idx + bboxes_per_level 105 | distances_per_level = distances[start_idx:end_idx, :] 106 | selectable_k = min(self.topk, bboxes_per_level) 107 | _, topk_idxs_per_level = distances_per_level.topk( 108 | selectable_k, dim=0, largest=False) 109 | candidate_idxs.append(topk_idxs_per_level + start_idx) 110 | start_idx = end_idx 111 | candidate_idxs = torch.cat(candidate_idxs, dim=0) 112 | 113 | # get corresponding iou for the these candidates, and compute the 114 | # mean and std, set mean + std as the iou threshold 115 | candidate_overlaps = overlaps[candidate_idxs, torch.arange(num_gt)] 116 | overlaps_mean_per_gt = candidate_overlaps.mean(0) 117 | overlaps_std_per_gt = candidate_overlaps.std(0) 118 | overlaps_thr_per_gt = overlaps_mean_per_gt + overlaps_std_per_gt 119 | 120 | is_pos = candidate_overlaps >= overlaps_thr_per_gt[None, :] 121 | 122 | # limit the positive sample's center in gt 123 | for gt_idx in range(num_gt): 124 | candidate_idxs[:, gt_idx] += gt_idx * num_bboxes 125 | ep_bboxes_cx = bboxes_cx.view(1, -1).expand( 126 | num_gt, num_bboxes).contiguous().view(-1) 127 | ep_bboxes_cy = bboxes_cy.view(1, -1).expand( 128 | num_gt, num_bboxes).contiguous().view(-1) 129 | candidate_idxs = candidate_idxs.view(-1) 130 | 131 | # calculate the left, top, right, bottom distance between positive 132 | # bbox center and gt side 133 | l_ = ep_bboxes_cx[candidate_idxs].view(-1, num_gt) - gt_bboxes[:, 0] 134 | t_ = ep_bboxes_cy[candidate_idxs].view(-1, num_gt) - gt_bboxes[:, 1] 135 | r_ = gt_bboxes[:, 2] - ep_bboxes_cx[candidate_idxs].view(-1, num_gt) 136 | b_ = gt_bboxes[:, 3] - ep_bboxes_cy[candidate_idxs].view(-1, num_gt) 137 | is_in_gts = torch.stack([l_, t_, r_, b_], dim=1).min(dim=1)[0] > 0.01 138 | is_pos = is_pos & is_in_gts 139 | 140 | # if an anchor box is assigned to multiple gts, 141 | # the one with the highest IoU will be selected. 
142 | overlaps_inf = torch.full_like(overlaps, 143 | -INF).t().contiguous().view(-1) 144 | index = candidate_idxs.view(-1)[is_pos.view(-1)] 145 | overlaps_inf[index] = overlaps.t().contiguous().view(-1)[index] 146 | overlaps_inf = overlaps_inf.view(num_gt, -1).t() 147 | 148 | max_overlaps, argmax_overlaps = overlaps_inf.max(dim=1) 149 | assigned_gt_inds[ 150 | max_overlaps != -INF] = argmax_overlaps[max_overlaps != -INF] + 1 151 | 152 | if gt_labels is not None: 153 | assigned_labels = assigned_gt_inds.new_full((num_bboxes,), -1) 154 | pos_inds = torch.nonzero( 155 | assigned_gt_inds > 0, as_tuple=False).squeeze() 156 | if pos_inds.numel() > 0: 157 | assigned_labels[pos_inds] = gt_labels[ 158 | assigned_gt_inds[pos_inds] - 1] 159 | else: 160 | assigned_labels = None 161 | return AssignResult( 162 | num_gt, assigned_gt_inds, max_overlaps, labels=assigned_labels) 163 | -------------------------------------------------------------------------------- /nanodet/model/head/assigner/base_assigner.py: -------------------------------------------------------------------------------- 1 | from abc import ABCMeta, abstractmethod 2 | 3 | 4 | class BaseAssigner(metaclass=ABCMeta): 5 | 6 | @abstractmethod 7 | def assign(self, bboxes, gt_bboxes, gt_bboxes_ignore=None, gt_labels=None): 8 | pass -------------------------------------------------------------------------------- /nanodet/model/head/nanodet_head.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | 4 | from ..module.conv import ConvModule, DepthwiseConvModule 5 | from ..module.init_weights import normal_init 6 | from .gfl_head import GFLHead 7 | from .anchor.anchor_target import multi_apply 8 | 9 | 10 | class NanoDetHead(GFLHead): 11 | """ 12 | Modified from GFL, use same loss functions but much lightweight convolution heads 13 | """ 14 | 15 | def __init__(self, 16 | num_classes, 17 | loss, 18 | input_channel, 19 | stacked_convs=2, 20 | octave_base_scale=5, 21 | scales_per_octave=1, 22 | conv_cfg=None, 23 | norm_cfg=dict(type='BN'), 24 | reg_max=16, 25 | share_cls_reg=False, 26 | activation='LeakyReLU', 27 | **kwargs): 28 | self.share_cls_reg = share_cls_reg 29 | self.activation = activation 30 | super(NanoDetHead, self).__init__(num_classes, 31 | loss, 32 | input_channel, 33 | stacked_convs, 34 | octave_base_scale, 35 | scales_per_octave, 36 | conv_cfg, 37 | norm_cfg, 38 | reg_max, 39 | **kwargs) 40 | 41 | def _init_layers(self): 42 | self.cls_convs = nn.ModuleList() 43 | self.reg_convs = nn.ModuleList() 44 | for _ in self.anchor_strides: 45 | cls_convs, reg_convs = self._buid_not_shared_head() 46 | self.cls_convs.append(cls_convs) 47 | self.reg_convs.append(reg_convs) 48 | 49 | self.gfl_cls = nn.ModuleList([nn.Conv2d(self.feat_channels, 50 | self.cls_out_channels + 51 | 4 * (self.reg_max + 1) if self.share_cls_reg else self.cls_out_channels, 52 | 1, 53 | padding=0) for _ in self.anchor_strides]) 54 | # TODO: if 55 | self.gfl_reg = nn.ModuleList([nn.Conv2d(self.feat_channels, 56 | 4 * (self.reg_max + 1), 57 | 1, 58 | padding=0) for _ in self.anchor_strides]) 59 | 60 | def _buid_not_shared_head(self): 61 | cls_convs = nn.ModuleList() 62 | reg_convs = nn.ModuleList() 63 | for i in range(self.stacked_convs): 64 | chn = self.in_channels if i == 0 else self.feat_channels 65 | cls_convs.append( 66 | DepthwiseConvModule(chn, 67 | self.feat_channels, 68 | 3, 69 | stride=1, 70 | padding=1, 71 | norm_cfg=self.norm_cfg, 72 | bias=self.norm_cfg is None, 73 | 
activation=self.activation))
74 |             if not self.share_cls_reg:
75 |                 reg_convs.append(
76 |                     DepthwiseConvModule(chn,
77 |                                         self.feat_channels,
78 |                                         3,
79 |                                         stride=1,
80 |                                         padding=1,
81 |                                         norm_cfg=self.norm_cfg,
82 |                                         bias=self.norm_cfg is None,
83 |                                         activation=self.activation))
84 | 
85 |         return cls_convs, reg_convs
86 | 
87 |     def init_weights(self):
88 |         for seq in self.cls_convs:
89 |             for m in seq:
90 |                 normal_init(m.depthwise, std=0.01)
91 |                 normal_init(m.pointwise, std=0.01)
92 |         for seq in self.reg_convs:
93 |             for m in seq:
94 |                 normal_init(m.depthwise, std=0.01)
95 |                 normal_init(m.pointwise, std=0.01)
96 |         bias_cls = -4.595  # initialize the classification bias so the prior confidence is 0.01
97 |         for i in range(len(self.anchor_strides)):
98 |             normal_init(self.gfl_cls[i], std=0.01, bias=bias_cls)
99 |             normal_init(self.gfl_reg[i], std=0.01)
100 |         print('Finished initializing Lite GFL Head.')
101 | 
102 |     def forward(self, feats):
103 |         return multi_apply(self.forward_single,
104 |                            feats,
105 |                            self.cls_convs,
106 |                            self.reg_convs,
107 |                            self.gfl_cls,
108 |                            self.gfl_reg,
109 |                            )
110 | 
111 |     def forward_single(self, x, cls_convs, reg_convs, gfl_cls, gfl_reg):
112 |         cls_feat = x
113 |         reg_feat = x
114 |         for cls_conv in cls_convs:
115 |             cls_feat = cls_conv(cls_feat)
116 |         for reg_conv in reg_convs:
117 |             reg_feat = reg_conv(reg_feat)
118 |         if self.share_cls_reg:
119 |             feat = gfl_cls(cls_feat)
120 |             cls_score, bbox_pred = torch.split(feat, [self.cls_out_channels, 4 * (self.reg_max + 1)], dim=1)
121 |         else:
122 |             cls_score = gfl_cls(cls_feat)
123 |             bbox_pred = gfl_reg(reg_feat)
124 | 
125 |         if torch.onnx.is_in_onnx_export():
126 |             cls_score = torch.sigmoid(cls_score).reshape(1, self.num_classes, -1).permute(0, 2, 1)
127 |             bbox_pred = bbox_pred.reshape(1, (self.reg_max+1)*4, -1).permute(0, 2, 1)
128 |         return cls_score, bbox_pred
129 | 
130 | 
131 | 
--------------------------------------------------------------------------------
/nanodet/model/head/sampler/__pycache__/base_sampler.cpython-38.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/guo-pu/NanoDet-PyTorch/bac88abfaaff0b5bf1b2c42bbe8d742014bd14dd/nanodet/model/head/sampler/__pycache__/base_sampler.cpython-38.pyc
--------------------------------------------------------------------------------
/nanodet/model/head/sampler/__pycache__/pseudo_sampler.cpython-38.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/guo-pu/NanoDet-PyTorch/bac88abfaaff0b5bf1b2c42bbe8d742014bd14dd/nanodet/model/head/sampler/__pycache__/pseudo_sampler.cpython-38.pyc
--------------------------------------------------------------------------------
/nanodet/model/head/sampler/__pycache__/sampling_result.cpython-38.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/guo-pu/NanoDet-PyTorch/bac88abfaaff0b5bf1b2c42bbe8d742014bd14dd/nanodet/model/head/sampler/__pycache__/sampling_result.cpython-38.pyc
--------------------------------------------------------------------------------
/nanodet/model/head/sampler/base_sampler.py:
--------------------------------------------------------------------------------
1 | from abc import ABCMeta, abstractmethod
2 | 
3 | import torch
4 | 
5 | from .sampling_result import SamplingResult
6 | 
7 | 
8 | class BaseSampler(metaclass=ABCMeta):
9 | 
10 |     def __init__(self,
11 |                  num,
12 |                  pos_fraction,
13 |                  neg_pos_ub=-1,
14 |                  add_gt_as_proposals=True,
15 |                  **kwargs):
16 |         self.num = num
17 |         self.pos_fraction =
pos_fraction 18 | self.neg_pos_ub = neg_pos_ub 19 | self.add_gt_as_proposals = add_gt_as_proposals 20 | self.pos_sampler = self 21 | self.neg_sampler = self 22 | 23 | @abstractmethod 24 | def _sample_pos(self, assign_result, num_expected, **kwargs): 25 | pass 26 | 27 | @abstractmethod 28 | def _sample_neg(self, assign_result, num_expected, **kwargs): 29 | pass 30 | 31 | def sample(self, 32 | assign_result, 33 | bboxes, 34 | gt_bboxes, 35 | gt_labels=None, 36 | **kwargs): 37 | """Sample positive and negative bboxes. 38 | 39 | This is a simple implementation of bbox sampling given candidates, 40 | assigning results and ground truth bboxes. 41 | 42 | Args: 43 | assign_result (:obj:`AssignResult`): Bbox assigning results. 44 | bboxes (Tensor): Boxes to be sampled from. 45 | gt_bboxes (Tensor): Ground truth bboxes. 46 | gt_labels (Tensor, optional): Class labels of ground truth bboxes. 47 | 48 | Returns: 49 | :obj:`SamplingResult`: Sampling result. 50 | 51 | """ 52 | if len(bboxes.shape) < 2: 53 | bboxes = bboxes[None, :] 54 | 55 | bboxes = bboxes[:, :4] 56 | 57 | gt_flags = bboxes.new_zeros((bboxes.shape[0],), dtype=torch.uint8) 58 | if self.add_gt_as_proposals and len(gt_bboxes) > 0: 59 | if gt_labels is None: 60 | raise ValueError( 61 | 'gt_labels must be given when add_gt_as_proposals is True') 62 | bboxes = torch.cat([gt_bboxes, bboxes], dim=0) 63 | assign_result.add_gt_(gt_labels) 64 | gt_ones = bboxes.new_ones(gt_bboxes.shape[0], dtype=torch.uint8) 65 | gt_flags = torch.cat([gt_ones, gt_flags]) 66 | 67 | num_expected_pos = int(self.num * self.pos_fraction) 68 | pos_inds = self.pos_sampler._sample_pos( 69 | assign_result, num_expected_pos, bboxes=bboxes, **kwargs) 70 | # We found that sampled indices have duplicated items occasionally. 71 | # (may be a bug of PyTorch) 72 | pos_inds = pos_inds.unique() 73 | num_sampled_pos = pos_inds.numel() 74 | num_expected_neg = self.num - num_sampled_pos 75 | if self.neg_pos_ub >= 0: 76 | _pos = max(1, num_sampled_pos) 77 | neg_upper_bound = int(self.neg_pos_ub * _pos) 78 | if num_expected_neg > neg_upper_bound: 79 | num_expected_neg = neg_upper_bound 80 | neg_inds = self.neg_sampler._sample_neg( 81 | assign_result, num_expected_neg, bboxes=bboxes, **kwargs) 82 | neg_inds = neg_inds.unique() 83 | 84 | sampling_result = SamplingResult(pos_inds, neg_inds, bboxes, gt_bboxes, 85 | assign_result, gt_flags) 86 | return sampling_result 87 | -------------------------------------------------------------------------------- /nanodet/model/head/sampler/pseudo_sampler.py: -------------------------------------------------------------------------------- 1 | import torch 2 | 3 | from .base_sampler import BaseSampler 4 | from .sampling_result import SamplingResult 5 | 6 | 7 | class PseudoSampler(BaseSampler): 8 | 9 | def __init__(self, **kwargs): 10 | pass 11 | 12 | def _sample_pos(self, **kwargs): 13 | raise NotImplementedError 14 | 15 | def _sample_neg(self, **kwargs): 16 | raise NotImplementedError 17 | 18 | def sample(self, assign_result, bboxes, gt_bboxes, **kwargs): 19 | pos_inds = torch.nonzero( 20 | assign_result.gt_inds > 0, as_tuple=False).squeeze(-1).unique() 21 | neg_inds = torch.nonzero( 22 | assign_result.gt_inds == 0, as_tuple=False).squeeze(-1).unique() 23 | gt_flags = bboxes.new_zeros(bboxes.shape[0], dtype=torch.uint8) 24 | sampling_result = SamplingResult(pos_inds, neg_inds, bboxes, gt_bboxes, 25 | assign_result, gt_flags) 26 | return sampling_result 27 | -------------------------------------------------------------------------------- 
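Taken together, the two samplers split the work cleanly: `BaseSampler.sample` implements ratio-based subsampling for anchor-based heads, while `PseudoSampler` simply keeps every assigned box, which is all an anchor-free head like NanoDet's GFL head needs. Below is a minimal sketch of the assign-then-sample flow, assuming the repo root is on `PYTHONPATH` and the requirements are installed; the tensors and the `FakeAssignResult` stand-in are invented for illustration (in training, the real `AssignResult` comes from the ATSS assigner in atss_assigner.py):

```python
import torch

from nanodet.model.head.sampler.pseudo_sampler import PseudoSampler


class FakeAssignResult:
    """Stand-in for AssignResult with only the fields PseudoSampler reads."""
    def __init__(self, gt_inds, labels=None):
        self.gt_inds = gt_inds  # 0 = negative, i > 0 = matched to gt box i-1
        self.labels = labels    # class label per prior, -1 for negatives


# three candidate boxes and one ground-truth box, all in xyxy format
bboxes = torch.tensor([[0., 0., 10., 10.],
                       [5., 5., 15., 15.],
                       [20., 20., 30., 30.]])
gt_bboxes = torch.tensor([[4., 4., 16., 16.]])
assign = FakeAssignResult(gt_inds=torch.tensor([0, 1, 0]),
                          labels=torch.tensor([-1, 2, -1]))

result = PseudoSampler().sample(assign, bboxes, gt_bboxes)
print(result.pos_inds)       # tensor([1])  -> only box 1 was assigned
print(result.neg_inds)       # tensor([0, 2])
print(result.pos_gt_bboxes)  # tensor([[ 4.,  4., 16., 16.]])
print(result.pos_gt_labels)  # tensor([2])
```

Because every assigned prior is kept, no sampling hyper-parameters are needed, which is why `PseudoSampler.__init__` accepts and ignores `**kwargs`.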
/nanodet/model/head/sampler/sampling_result.py: -------------------------------------------------------------------------------- 1 | import torch 2 | 3 | from nanodet.util import util_mixins 4 | 5 | 6 | class SamplingResult(util_mixins.NiceRepr): 7 | """ 8 | Example: 9 | >>> # xdoctest: +IGNORE_WANT 10 | >>> self = SamplingResult.random(rng=10) 11 | >>> print('self = {}'.format(self)) 12 | self = 21 | """ 22 | 23 | def __init__(self, pos_inds, neg_inds, bboxes, gt_bboxes, assign_result, 24 | gt_flags): 25 | self.pos_inds = pos_inds 26 | self.neg_inds = neg_inds 27 | self.pos_bboxes = bboxes[pos_inds] 28 | self.neg_bboxes = bboxes[neg_inds] 29 | self.pos_is_gt = gt_flags[pos_inds] 30 | 31 | self.num_gts = gt_bboxes.shape[0] 32 | self.pos_assigned_gt_inds = assign_result.gt_inds[pos_inds] - 1 33 | 34 | if gt_bboxes.numel() == 0: 35 | # hack for index error case 36 | assert self.pos_assigned_gt_inds.numel() == 0 37 | self.pos_gt_bboxes = torch.empty_like(gt_bboxes).view(-1, 4) 38 | else: 39 | if len(gt_bboxes.shape) < 2: 40 | gt_bboxes = gt_bboxes.view(-1, 4) 41 | 42 | self.pos_gt_bboxes = gt_bboxes[self.pos_assigned_gt_inds, :] 43 | 44 | if assign_result.labels is not None: 45 | self.pos_gt_labels = assign_result.labels[pos_inds] 46 | else: 47 | self.pos_gt_labels = None 48 | 49 | @property 50 | def bboxes(self): 51 | return torch.cat([self.pos_bboxes, self.neg_bboxes]) 52 | 53 | def to(self, device): 54 | """ 55 | Change the device of the data inplace. 56 | 57 | Example: 58 | >>> self = SamplingResult.random() 59 | >>> print('self = {}'.format(self.to(None))) 60 | >>> # xdoctest: +REQUIRES(--gpu) 61 | >>> print('self = {}'.format(self.to(0))) 62 | """ 63 | _dict = self.__dict__ 64 | for key, value in _dict.items(): 65 | if isinstance(value, torch.Tensor): 66 | _dict[key] = value.to(device) 67 | return self 68 | 69 | def __nice__(self): 70 | data = self.info.copy() 71 | data['pos_bboxes'] = data.pop('pos_bboxes').shape 72 | data['neg_bboxes'] = data.pop('neg_bboxes').shape 73 | parts = ['\'{}\': {!r}'.format(k, v) for k, v in sorted(data.items())] 74 | body = ' ' + ',\n '.join(parts) 75 | return '{\n' + body + '\n}' 76 | 77 | @property 78 | def info(self): 79 | """ 80 | Returns a dictionary of info about the object 81 | """ 82 | return { 83 | 'pos_inds': self.pos_inds, 84 | 'neg_inds': self.neg_inds, 85 | 'pos_bboxes': self.pos_bboxes, 86 | 'neg_bboxes': self.neg_bboxes, 87 | 'pos_is_gt': self.pos_is_gt, 88 | 'num_gts': self.num_gts, 89 | 'pos_assigned_gt_inds': self.pos_assigned_gt_inds, 90 | } 91 | -------------------------------------------------------------------------------- /nanodet/model/loss/__pycache__/gfocal_loss.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/guo-pu/NanoDet-PyTorch/bac88abfaaff0b5bf1b2c42bbe8d742014bd14dd/nanodet/model/loss/__pycache__/gfocal_loss.cpython-38.pyc -------------------------------------------------------------------------------- /nanodet/model/loss/__pycache__/iou_loss.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/guo-pu/NanoDet-PyTorch/bac88abfaaff0b5bf1b2c42bbe8d742014bd14dd/nanodet/model/loss/__pycache__/iou_loss.cpython-38.pyc -------------------------------------------------------------------------------- /nanodet/model/loss/__pycache__/utils.cpython-38.pyc: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/guo-pu/NanoDet-PyTorch/bac88abfaaff0b5bf1b2c42bbe8d742014bd14dd/nanodet/model/loss/__pycache__/utils.cpython-38.pyc -------------------------------------------------------------------------------- /nanodet/model/loss/gfocal_loss.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import torch.nn.functional as F 4 | 5 | from .utils import weighted_loss 6 | 7 | 8 | @weighted_loss 9 | def quality_focal_loss(pred, target, beta=2.0): 10 | r"""Quality Focal Loss (QFL) is from `Generalized Focal Loss: Learning 11 | Qualified and Distributed Bounding Boxes for Dense Object Detection 12 | `_. 13 | 14 | Args: 15 | pred (torch.Tensor): Predicted joint representation of classification 16 | and quality (IoU) estimation with shape (N, C), C is the number of 17 | classes. 18 | target (tuple([torch.Tensor])): Target category label with shape (N,) 19 | and target quality label with shape (N,). 20 | beta (float): The beta parameter for calculating the modulating factor. 21 | Defaults to 2.0. 22 | 23 | Returns: 24 | torch.Tensor: Loss tensor with shape (N,). 25 | """ 26 | assert len(target) == 2, """target for QFL must be a tuple of two elements, 27 | including category label and quality label, respectively""" 28 | # label denotes the category id, score denotes the quality score 29 | label, score = target 30 | 31 | # negatives are supervised by 0 quality score 32 | pred_sigmoid = pred.sigmoid() 33 | scale_factor = pred_sigmoid 34 | zerolabel = scale_factor.new_zeros(pred.shape) 35 | loss = F.binary_cross_entropy_with_logits( 36 | pred, zerolabel, reduction='none') * scale_factor.pow(beta) 37 | 38 | # FG cat_id: [0, num_classes -1], BG cat_id: num_classes 39 | bg_class_ind = pred.size(1) 40 | pos = torch.nonzero((label >= 0) & (label < bg_class_ind), as_tuple=False).squeeze(1) 41 | pos_label = label[pos].long() 42 | # positives are supervised by bbox quality (IoU) score 43 | scale_factor = score[pos] - pred_sigmoid[pos, pos_label] 44 | loss[pos, pos_label] = F.binary_cross_entropy_with_logits( 45 | pred[pos, pos_label], score[pos], 46 | reduction='none') * scale_factor.abs().pow(beta) 47 | 48 | loss = loss.sum(dim=1, keepdim=False) 49 | return loss 50 | 51 | 52 | @weighted_loss 53 | def distribution_focal_loss(pred, label): 54 | r"""Distribution Focal Loss (DFL) is from `Generalized Focal Loss: Learning 55 | Qualified and Distributed Bounding Boxes for Dense Object Detection 56 | `_. 57 | 58 | Args: 59 | pred (torch.Tensor): Predicted general distribution of bounding boxes 60 | (before softmax) with shape (N, n+1), n is the max value of the 61 | integral set `{0, ..., n}` in paper. 62 | label (torch.Tensor): Target distance label for bounding boxes with 63 | shape (N,). 64 | 65 | Returns: 66 | torch.Tensor: Loss tensor with shape (N,). 67 | """ 68 | dis_left = label.long() 69 | dis_right = dis_left + 1 70 | weight_left = dis_right.float() - label 71 | weight_right = label - dis_left.float() 72 | loss = F.cross_entropy(pred, dis_left, reduction='none') * weight_left \ 73 | + F.cross_entropy(pred, dis_right, reduction='none') * weight_right 74 | return loss 75 | 76 | 77 | class QualityFocalLoss(nn.Module): 78 | r"""Quality Focal Loss (QFL) is a variant of `Generalized Focal Loss: 79 | Learning Qualified and Distributed Bounding Boxes for Dense Object 80 | Detection `_. 81 | 82 | Args: 83 | use_sigmoid (bool): Whether sigmoid operation is conducted in QFL. 84 | Defaults to True. 
85 | beta (float): The beta parameter for calculating the modulating factor. 86 | Defaults to 2.0. 87 | reduction (str): Options are "none", "mean" and "sum". 88 | loss_weight (float): Loss weight of current loss. 89 | """ 90 | 91 | def __init__(self, 92 | use_sigmoid=True, 93 | beta=2.0, 94 | reduction='mean', 95 | loss_weight=1.0): 96 | super(QualityFocalLoss, self).__init__() 97 | assert use_sigmoid is True, 'Only sigmoid in QFL supported now.' 98 | self.use_sigmoid = use_sigmoid 99 | self.beta = beta 100 | self.reduction = reduction 101 | self.loss_weight = loss_weight 102 | 103 | def forward(self, 104 | pred, 105 | target, 106 | weight=None, 107 | avg_factor=None, 108 | reduction_override=None): 109 | """Forward function. 110 | 111 | Args: 112 | pred (torch.Tensor): Predicted joint representation of 113 | classification and quality (IoU) estimation with shape (N, C), 114 | C is the number of classes. 115 | target (tuple([torch.Tensor])): Target category label with shape 116 | (N,) and target quality label with shape (N,). 117 | weight (torch.Tensor, optional): The weight of loss for each 118 | prediction. Defaults to None. 119 | avg_factor (int, optional): Average factor that is used to average 120 | the loss. Defaults to None. 121 | reduction_override (str, optional): The reduction method used to 122 | override the original reduction method of the loss. 123 | Defaults to None. 124 | """ 125 | assert reduction_override in (None, 'none', 'mean', 'sum') 126 | reduction = ( 127 | reduction_override if reduction_override else self.reduction) 128 | if self.use_sigmoid: 129 | loss_cls = self.loss_weight * quality_focal_loss( 130 | pred, 131 | target, 132 | weight, 133 | beta=self.beta, 134 | reduction=reduction, 135 | avg_factor=avg_factor) 136 | else: 137 | raise NotImplementedError 138 | return loss_cls 139 | 140 | 141 | class DistributionFocalLoss(nn.Module): 142 | r"""Distribution Focal Loss (DFL) is a variant of `Generalized Focal Loss: 143 | Learning Qualified and Distributed Bounding Boxes for Dense Object 144 | Detection `_. 145 | 146 | Args: 147 | reduction (str): Options are `'none'`, `'mean'` and `'sum'`. 148 | loss_weight (float): Loss weight of current loss. 149 | """ 150 | 151 | def __init__(self, reduction='mean', loss_weight=1.0): 152 | super(DistributionFocalLoss, self).__init__() 153 | self.reduction = reduction 154 | self.loss_weight = loss_weight 155 | 156 | def forward(self, 157 | pred, 158 | target, 159 | weight=None, 160 | avg_factor=None, 161 | reduction_override=None): 162 | """Forward function. 163 | 164 | Args: 165 | pred (torch.Tensor): Predicted general distribution of bounding 166 | boxes (before softmax) with shape (N, n+1), n is the max value 167 | of the integral set `{0, ..., n}` in paper. 168 | target (torch.Tensor): Target distance label for bounding boxes 169 | with shape (N,). 170 | weight (torch.Tensor, optional): The weight of loss for each 171 | prediction. Defaults to None. 172 | avg_factor (int, optional): Average factor that is used to average 173 | the loss. Defaults to None. 174 | reduction_override (str, optional): The reduction method used to 175 | override the original reduction method of the loss. 176 | Defaults to None. 
177 | """ 178 | assert reduction_override in (None, 'none', 'mean', 'sum') 179 | reduction = ( 180 | reduction_override if reduction_override else self.reduction) 181 | loss_cls = self.loss_weight * distribution_focal_loss( 182 | pred, target, weight, reduction=reduction, avg_factor=avg_factor) 183 | return loss_cls 184 | -------------------------------------------------------------------------------- /nanodet/model/loss/utils.py: -------------------------------------------------------------------------------- 1 | import functools 2 | 3 | import torch.nn.functional as F 4 | 5 | 6 | def reduce_loss(loss, reduction): 7 | """Reduce loss as specified. 8 | 9 | Args: 10 | loss (Tensor): Elementwise loss tensor. 11 | reduction (str): Options are "none", "mean" and "sum". 12 | 13 | Return: 14 | Tensor: Reduced loss tensor. 15 | """ 16 | reduction_enum = F._Reduction.get_enum(reduction) 17 | # none: 0, elementwise_mean:1, sum: 2 18 | if reduction_enum == 0: 19 | return loss 20 | elif reduction_enum == 1: 21 | return loss.mean() 22 | elif reduction_enum == 2: 23 | return loss.sum() 24 | 25 | 26 | def weight_reduce_loss(loss, weight=None, reduction='mean', avg_factor=None): 27 | """Apply element-wise weight and reduce loss. 28 | 29 | Args: 30 | loss (Tensor): Element-wise loss. 31 | weight (Tensor): Element-wise weights. 32 | reduction (str): Same as built-in losses of PyTorch. 33 | avg_factor (float): Average factor when computing the mean of losses. 34 | 35 | Returns: 36 | Tensor: Processed loss values. 37 | """ 38 | # if weight is specified, apply element-wise weight 39 | if weight is not None: 40 | loss = loss * weight 41 | 42 | # if avg_factor is not specified, just reduce the loss 43 | if avg_factor is None: 44 | loss = reduce_loss(loss, reduction) 45 | else: 46 | # if reduction is mean, then average the loss by avg_factor 47 | if reduction == 'mean': 48 | loss = loss.sum() / avg_factor 49 | # if reduction is 'none', then do nothing, otherwise raise an error 50 | elif reduction != 'none': 51 | raise ValueError('avg_factor cannot be used with reduction="sum"') 52 | return loss 53 | 54 | 55 | def weighted_loss(loss_func): 56 | """Create a weighted version of a given loss function. 57 | 58 | To use this decorator, the loss function must have the signature like 59 | `loss_func(pred, target, **kwargs)`. The function only needs to compute 60 | element-wise loss without any reduction. This decorator will add weight 61 | and reduction arguments to the function. The decorated function will have 62 | the signature like `loss_func(pred, target, weight=None, reduction='mean', 63 | avg_factor=None, **kwargs)`. 64 | 65 | :Example: 66 | 67 | >>> import torch 68 | >>> @weighted_loss 69 | >>> def l1_loss(pred, target): 70 | >>> return (pred - target).abs() 71 | 72 | >>> pred = torch.Tensor([0, 2, 3]) 73 | >>> target = torch.Tensor([1, 1, 1]) 74 | >>> weight = torch.Tensor([1, 0, 1]) 75 | 76 | >>> l1_loss(pred, target) 77 | tensor(1.3333) 78 | >>> l1_loss(pred, target, weight) 79 | tensor(1.)
80 | >>> l1_loss(pred, target, reduction='none') 81 | tensor([1., 1., 2.]) 82 | >>> l1_loss(pred, target, weight, avg_factor=2) 83 | tensor(1.5000) 84 | """ 85 | 86 | @functools.wraps(loss_func) 87 | def wrapper(pred, 88 | target, 89 | weight=None, 90 | reduction='mean', 91 | avg_factor=None, 92 | **kwargs): 93 | # get element-wise loss 94 | loss = loss_func(pred, target, **kwargs) 95 | loss = weight_reduce_loss(loss, weight, reduction, avg_factor) 96 | return loss 97 | 98 | return wrapper 99 | -------------------------------------------------------------------------------- /nanodet/model/loss/varifocal_loss.py: -------------------------------------------------------------------------------- 1 | import torch.nn as nn 2 | import torch.nn.functional as F 3 | from .utils import weight_reduce_loss 4 | 5 | 6 | def varifocal_loss(pred, 7 | target, 8 | weight=None, 9 | alpha=0.75, 10 | gamma=2.0, 11 | iou_weighted=True, 12 | reduction='mean', 13 | avg_factor=None): 14 | """`Varifocal Loss `_ 15 | 16 | Args: 17 | pred (torch.Tensor): The prediction with shape (N, C), C is the 18 | number of classes 19 | target (torch.Tensor): The learning target of the iou-aware 20 | classification score with shape (N, C), C is the number of classes. 21 | weight (torch.Tensor, optional): The weight of loss for each 22 | prediction. Defaults to None. 23 | alpha (float, optional): A balance factor for the negative part of 24 | Varifocal Loss, which is different from the alpha of Focal Loss. 25 | Defaults to 0.75. 26 | gamma (float, optional): The gamma for calculating the modulating 27 | factor. Defaults to 2.0. 28 | iou_weighted (bool, optional): Whether to weight the loss of the 29 | positive example with the iou target. Defaults to True. 30 | reduction (str, optional): The method used to reduce the loss into 31 | a scalar. Defaults to 'mean'. Options are "none", "mean" and 32 | "sum". 33 | avg_factor (int, optional): Average factor that is used to average 34 | the loss. Defaults to None. 35 | """ 36 | # pred and target should be of the same size 37 | assert pred.size() == target.size() 38 | pred_sigmoid = pred.sigmoid() 39 | target = target.type_as(pred) 40 | if iou_weighted: 41 | focal_weight = target * (target > 0.0).float() + \ 42 | alpha * (pred_sigmoid - target).abs().pow(gamma) * \ 43 | (target <= 0.0).float() 44 | else: 45 | focal_weight = (target > 0.0).float() + \ 46 | alpha * (pred_sigmoid - target).abs().pow(gamma) * \ 47 | (target <= 0.0).float() 48 | loss = F.binary_cross_entropy_with_logits( 49 | pred, target, reduction='none') * focal_weight 50 | loss = weight_reduce_loss(loss, weight, reduction, avg_factor) 51 | return loss 52 | 53 | 54 | class VarifocalLoss(nn.Module): 55 | 56 | def __init__(self, 57 | use_sigmoid=True, 58 | alpha=0.75, 59 | gamma=2.0, 60 | iou_weighted=True, 61 | reduction='mean', 62 | loss_weight=1.0): 63 | """`Varifocal Loss `_ 64 | 65 | Args: 66 | use_sigmoid (bool, optional): Whether the prediction is 67 | used for sigmoid or softmax. Defaults to True. 68 | alpha (float, optional): A balance factor for the negative part of 69 | Varifocal Loss, which is different from the alpha of Focal 70 | Loss. Defaults to 0.75. 71 | gamma (float, optional): The gamma for calculating the modulating 72 | factor. Defaults to 2.0. 73 | iou_weighted (bool, optional): Whether to weight the loss of the 74 | positive examples with the iou target. Defaults to True. 75 | reduction (str, optional): The method used to reduce the loss into 76 | a scalar. Defaults to 'mean'. 
Options are "none", "mean" and 77 | "sum". 78 | loss_weight (float, optional): Weight of loss. Defaults to 1.0. 79 | """ 80 | super(VarifocalLoss, self).__init__() 81 | assert use_sigmoid is True, \ 82 | 'Only sigmoid varifocal loss supported now.' 83 | assert alpha >= 0.0 84 | self.use_sigmoid = use_sigmoid 85 | self.alpha = alpha 86 | self.gamma = gamma 87 | self.iou_weighted = iou_weighted 88 | self.reduction = reduction 89 | self.loss_weight = loss_weight 90 | 91 | def forward(self, 92 | pred, 93 | target, 94 | weight=None, 95 | avg_factor=None, 96 | reduction_override=None): 97 | """Forward function. 98 | 99 | Args: 100 | pred (torch.Tensor): The prediction. 101 | target (torch.Tensor): The learning target of the prediction. 102 | weight (torch.Tensor, optional): The weight of loss for each 103 | prediction. Defaults to None. 104 | avg_factor (int, optional): Average factor that is used to average 105 | the loss. Defaults to None. 106 | reduction_override (str, optional): The reduction method used to 107 | override the original reduction method of the loss. 108 | Options are "none", "mean" and "sum". 109 | 110 | Returns: 111 | torch.Tensor: The calculated loss 112 | """ 113 | assert reduction_override in (None, 'none', 'mean', 'sum') 114 | reduction = ( 115 | reduction_override if reduction_override else self.reduction) 116 | if self.use_sigmoid: 117 | loss_cls = self.loss_weight * varifocal_loss( 118 | pred, 119 | target, 120 | weight, 121 | alpha=self.alpha, 122 | gamma=self.gamma, 123 | iou_weighted=self.iou_weighted, 124 | reduction=reduction, 125 | avg_factor=avg_factor) 126 | else: 127 | raise NotImplementedError 128 | return loss_cls 129 | -------------------------------------------------------------------------------- /nanodet/model/module/__pycache__/activation.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/guo-pu/NanoDet-PyTorch/bac88abfaaff0b5bf1b2c42bbe8d742014bd14dd/nanodet/model/module/__pycache__/activation.cpython-38.pyc -------------------------------------------------------------------------------- /nanodet/model/module/__pycache__/conv.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/guo-pu/NanoDet-PyTorch/bac88abfaaff0b5bf1b2c42bbe8d742014bd14dd/nanodet/model/module/__pycache__/conv.cpython-38.pyc -------------------------------------------------------------------------------- /nanodet/model/module/__pycache__/init_weights.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/guo-pu/NanoDet-PyTorch/bac88abfaaff0b5bf1b2c42bbe8d742014bd14dd/nanodet/model/module/__pycache__/init_weights.cpython-38.pyc -------------------------------------------------------------------------------- /nanodet/model/module/__pycache__/nms.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/guo-pu/NanoDet-PyTorch/bac88abfaaff0b5bf1b2c42bbe8d742014bd14dd/nanodet/model/module/__pycache__/nms.cpython-38.pyc -------------------------------------------------------------------------------- /nanodet/model/module/__pycache__/norm.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/guo-pu/NanoDet-PyTorch/bac88abfaaff0b5bf1b2c42bbe8d742014bd14dd/nanodet/model/module/__pycache__/norm.cpython-38.pyc 
-------------------------------------------------------------------------------- /nanodet/model/module/__pycache__/scale.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/guo-pu/NanoDet-PyTorch/bac88abfaaff0b5bf1b2c42bbe8d742014bd14dd/nanodet/model/module/__pycache__/scale.cpython-38.pyc -------------------------------------------------------------------------------- /nanodet/model/module/activation.py: -------------------------------------------------------------------------------- 1 | import torch.nn as nn 2 | 3 | activations = {'ReLU': nn.ReLU, 4 | 'LeakyReLU': nn.LeakyReLU, 5 | 'ReLU6': nn.ReLU6, 6 | 'SELU': nn.SELU, 7 | 'ELU': nn.ELU, 8 | None: nn.Identity 9 | } 10 | 11 | 12 | def act_layers(name): 13 | assert name in activations.keys() 14 | if name == 'LeakyReLU': 15 | return nn.LeakyReLU(negative_slope=0.1, inplace=True) 16 | else: 17 | return activations[name](inplace=True) 18 | -------------------------------------------------------------------------------- /nanodet/model/module/conv.py: -------------------------------------------------------------------------------- 1 | """ 2 | from MMDetection 3 | """ 4 | import torch.nn as nn 5 | import warnings 6 | from .init_weights import kaiming_init, normal_init, xavier_init, constant_init 7 | from .norm import build_norm_layer 8 | from .activation import act_layers 9 | 10 | 11 | class ConvModule(nn.Module): 12 | """A conv block that contains conv/norm/activation layers. 13 | 14 | Args: 15 | in_channels (int): Same as nn.Conv2d. 16 | out_channels (int): Same as nn.Conv2d. 17 | kernel_size (int or tuple[int]): Same as nn.Conv2d. 18 | stride (int or tuple[int]): Same as nn.Conv2d. 19 | padding (int or tuple[int]): Same as nn.Conv2d. 20 | dilation (int or tuple[int]): Same as nn.Conv2d. 21 | groups (int): Same as nn.Conv2d. 22 | bias (bool or str): If specified as `auto`, it will be decided by the 23 | norm_cfg. Bias will be set as True if norm_cfg is None, otherwise 24 | False. 25 | conv_cfg (dict): Config dict for convolution layer. 26 | norm_cfg (dict): Config dict for normalization layer. 27 | activation (str): activation layer, "ReLU" by default. 28 | inplace (bool): Whether to use inplace mode for activation. 29 | order (tuple[str]): The order of conv/norm/activation layers. It is a 30 | sequence of "conv", "norm" and "act". Examples are 31 | ("conv", "norm", "act") and ("act", "conv", "norm"). 32 | """ 33 | 34 | def __init__(self, 35 | in_channels, 36 | out_channels, 37 | kernel_size, 38 | stride=1, 39 | padding=0, 40 | dilation=1, 41 | groups=1, 42 | bias='auto', 43 | conv_cfg=None, 44 | norm_cfg=None, 45 | activation='ReLU', 46 | inplace=True, 47 | order=('conv', 'norm', 'act')): 48 | super(ConvModule, self).__init__() 49 | assert conv_cfg is None or isinstance(conv_cfg, dict) 50 | assert norm_cfg is None or isinstance(norm_cfg, dict) 51 | assert activation is None or isinstance(activation, str) 52 | self.conv_cfg = conv_cfg 53 | self.norm_cfg = norm_cfg 54 | self.activation = activation 55 | self.inplace = inplace 56 | self.order = order 57 | assert isinstance(self.order, tuple) and len(self.order) == 3 58 | assert set(order) == set(['conv', 'norm', 'act']) 59 | 60 | self.with_norm = norm_cfg is not None 61 | # if the conv layer is before a norm layer, bias is unnecessary. 
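# ('auto' thus resolves to bias=False whenever a norm layer is configured,
# since the norm's learnable shift makes a separate conv bias redundant,
# and to bias=True when norm_cfg is None.)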
62 | if bias == 'auto': 63 | bias = False if self.with_norm else True 64 | self.with_bias = bias 65 | 66 | if self.with_norm and self.with_bias: 67 | warnings.warn('ConvModule has norm and bias at the same time') 68 | 69 | # build convolution layer 70 | self.conv = nn.Conv2d( # 71 | in_channels, 72 | out_channels, 73 | kernel_size, 74 | stride=stride, 75 | padding=padding, 76 | dilation=dilation, 77 | groups=groups, 78 | bias=bias) 79 | # export the attributes of self.conv to a higher level for convenience 80 | self.in_channels = self.conv.in_channels 81 | self.out_channels = self.conv.out_channels 82 | self.kernel_size = self.conv.kernel_size 83 | self.stride = self.conv.stride 84 | self.padding = self.conv.padding 85 | self.dilation = self.conv.dilation 86 | self.transposed = self.conv.transposed 87 | self.output_padding = self.conv.output_padding 88 | self.groups = self.conv.groups 89 | 90 | # build normalization layers 91 | if self.with_norm: 92 | # norm layer is after conv layer 93 | if order.index('norm') > order.index('conv'): 94 | norm_channels = out_channels 95 | else: 96 | norm_channels = in_channels 97 | self.norm_name, norm = build_norm_layer(norm_cfg, norm_channels) 98 | self.add_module(self.norm_name, norm) 99 | 100 | # build activation layer 101 | if self.activation: 102 | self.act = act_layers(self.activation) 103 | 104 | # Use msra init by default 105 | self.init_weights() 106 | 107 | @property 108 | def norm(self): 109 | return getattr(self, self.norm_name) 110 | 111 | def init_weights(self): 112 | if self.activation == 'LeakyReLU': 113 | nonlinearity = 'leaky_relu' 114 | else: 115 | nonlinearity = 'relu' 116 | kaiming_init(self.conv, nonlinearity=nonlinearity) 117 | if self.with_norm: 118 | constant_init(self.norm, 1, bias=0) 119 | 120 | def forward(self, x, norm=True): 121 | for layer in self.order: 122 | if layer == 'conv': 123 | x = self.conv(x) 124 | elif layer == 'norm' and norm and self.with_norm: 125 | x = self.norm(x) 126 | elif layer == 'act' and self.activation: 127 | x = self.act(x) 128 | return x 129 | 130 | 131 | class DepthwiseConvModule(nn.Module): 132 | 133 | def __init__(self, 134 | in_channels, 135 | out_channels, 136 | kernel_size, 137 | stride=1, 138 | padding=0, 139 | dilation=1, 140 | bias='auto', 141 | norm_cfg=dict(type='BN'), 142 | activation='ReLU', 143 | inplace=True, 144 | order=('depthwise', 'dwnorm', 'act', 'pointwise', 'pwnorm', 'act')): 145 | super(DepthwiseConvModule, self).__init__() 146 | assert activation is None or isinstance(activation, str) 147 | self.activation = activation 148 | self.inplace = inplace 149 | self.order = order 150 | assert isinstance(self.order, tuple) and len(self.order) == 6 151 | assert set(order) == set(['depthwise', 'dwnorm', 'act', 'pointwise', 'pwnorm', 'act']) 152 | 153 | self.with_norm = norm_cfg is not None 154 | # if the conv layer is before a norm layer, bias is unnecessary. 
155 | if bias == 'auto': 156 | bias = False if self.with_norm else True 157 | self.with_bias = bias 158 | 159 | if self.with_norm and self.with_bias: 160 | warnings.warn('DepthwiseConvModule has norm and bias at the same time') 161 | 162 | # build convolution layer 163 | self.depthwise = nn.Conv2d(in_channels, 164 | in_channels, 165 | kernel_size, 166 | stride=stride, 167 | padding=padding, 168 | dilation=dilation, 169 | groups=in_channels, 170 | bias=bias) 171 | self.pointwise = nn.Conv2d(in_channels, 172 | out_channels, 173 | kernel_size=1, 174 | stride=1, 175 | padding=0, 176 | bias=bias) 177 | 178 | # export the attributes of self.depthwise/self.pointwise to a higher level for convenience 179 | self.in_channels = self.depthwise.in_channels 180 | self.out_channels = self.pointwise.out_channels 181 | self.kernel_size = self.depthwise.kernel_size 182 | self.stride = self.depthwise.stride 183 | self.padding = self.depthwise.padding 184 | self.dilation = self.depthwise.dilation 185 | self.transposed = self.depthwise.transposed 186 | self.output_padding = self.depthwise.output_padding 187 | 188 | # build normalization layers 189 | if self.with_norm: 190 | # norm layer is after conv layer 191 | _, self.dwnorm = build_norm_layer(norm_cfg, in_channels) 192 | _, self.pwnorm = build_norm_layer(norm_cfg, out_channels) 193 | 194 | # build activation layer 195 | if self.activation: 196 | self.act = act_layers(self.activation) 197 | 198 | # Use msra init by default 199 | self.init_weights() 200 | 201 | def init_weights(self): 202 | if self.activation == 'LeakyReLU': 203 | nonlinearity = 'leaky_relu' 204 | else: 205 | nonlinearity = 'relu' 206 | kaiming_init(self.depthwise, nonlinearity=nonlinearity) 207 | kaiming_init(self.pointwise, nonlinearity=nonlinearity) 208 | if self.with_norm: 209 | constant_init(self.dwnorm, 1, bias=0) 210 | constant_init(self.pwnorm, 1, bias=0) 211 | 212 | def forward(self, x, norm=True): 213 | for layer_name in self.order: 214 | if layer_name != 'act': 215 | layer = self.__getattr__(layer_name) 216 | x = layer(x) 217 | elif layer_name == 'act' and self.activation: 218 | x = self.act(x) 219 | return x 220 | -------------------------------------------------------------------------------- /nanodet/model/module/init_weights.py: -------------------------------------------------------------------------------- 1 | """ 2 | from MMDetection 3 | """ 4 | import torch.nn as nn 5 | 6 | 7 | def kaiming_init(module, 8 | a=0, 9 | mode='fan_out', 10 | nonlinearity='relu', 11 | bias=0, 12 | distribution='normal'): 13 | assert distribution in ['uniform', 'normal'] 14 | if distribution == 'uniform': 15 | nn.init.kaiming_uniform_( 16 | module.weight, a=a, mode=mode, nonlinearity=nonlinearity) 17 | else: 18 | nn.init.kaiming_normal_( 19 | module.weight, a=a, mode=mode, nonlinearity=nonlinearity) 20 | if hasattr(module, 'bias') and module.bias is not None: 21 | nn.init.constant_(module.bias, bias) 22 | 23 | 24 | def xavier_init(module, gain=1, bias=0, distribution='normal'): 25 | assert distribution in ['uniform', 'normal'] 26 | if distribution == 'uniform': 27 | nn.init.xavier_uniform_(module.weight, gain=gain) 28 | else: 29 | nn.init.xavier_normal_(module.weight, gain=gain) 30 | if hasattr(module, 'bias') and module.bias is not None: 31 | nn.init.constant_(module.bias, bias) 32 | 33 | 34 | def normal_init(module, mean=0, std=1, bias=0): 35 | nn.init.normal_(module.weight, mean, std) 36 | if hasattr(module, 'bias') and module.bias is not None: 37 | nn.init.constant_(module.bias, bias) 38 | 39 | 40 | def
constant_init(module, val, bias=0): 41 | if hasattr(module, 'weight') and module.weight is not None: 42 | nn.init.constant_(module.weight, val) 43 | if hasattr(module, 'bias') and module.bias is not None: 44 | nn.init.constant_(module.bias, bias) -------------------------------------------------------------------------------- /nanodet/model/module/nms.py: -------------------------------------------------------------------------------- 1 | import torch 2 | from torchvision.ops import nms 3 | 4 | 5 | def multiclass_nms(multi_bboxes, 6 | multi_scores, 7 | score_thr, 8 | nms_cfg, 9 | max_num=-1, 10 | score_factors=None): 11 | """NMS for multi-class bboxes. 12 | 13 | Args: 14 | multi_bboxes (Tensor): shape (n, #class*4) or (n, 4) 15 | multi_scores (Tensor): shape (n, #class), where the last column 16 | contains scores of the background class, but this will be ignored. 17 | score_thr (float): bbox threshold, bboxes with scores lower than it 18 | will not be considered. 19 | nms_cfg (dict): NMS config dict, specifying the NMS type and its 20 | parameters (e.g. the IoU threshold) 21 | max_num (int): if there are more than max_num bboxes after NMS, 22 | only top max_num will be kept. 23 | score_factors (Tensor): The factors multiplied to scores before 24 | applying NMS 25 | 26 | Returns: 27 | tuple: (bboxes, labels), tensors of shape (k, 5) and (k,). Labels \ 28 | are 0-based. 29 | """ 30 | num_classes = multi_scores.size(1) - 1 31 | # exclude background category 32 | if multi_bboxes.shape[1] > 4: 33 | bboxes = multi_bboxes.view(multi_scores.size(0), -1, 4) 34 | else: 35 | bboxes = multi_bboxes[:, None].expand( 36 | multi_scores.size(0), num_classes, 4) 37 | scores = multi_scores[:, :-1] 38 | 39 | # filter out boxes with low scores 40 | valid_mask = scores > score_thr 41 | 42 | # We use masked_select for ONNX exporting purpose, 43 | # which is equivalent to bboxes = bboxes[valid_mask] 44 | # (TODO): as ONNX does not support repeat now, 45 | # we have to use this ugly code 46 | bboxes = torch.masked_select( 47 | bboxes, 48 | torch.stack((valid_mask, valid_mask, valid_mask, valid_mask), 49 | -1)).view(-1, 4) 50 | if score_factors is not None: 51 | scores = scores * score_factors[:, None] 52 | scores = torch.masked_select(scores, valid_mask) 53 | labels = valid_mask.nonzero(as_tuple=False)[:, 1] 54 | 55 | if bboxes.numel() == 0: 56 | bboxes = multi_bboxes.new_zeros((0, 5)) 57 | labels = multi_bboxes.new_zeros((0, ), dtype=torch.long) 58 | 59 | if torch.onnx.is_in_onnx_export(): 60 | raise RuntimeError('[ONNX Error] Can not record NMS ' 61 | 'as it has not been executed this time') 62 | return bboxes, labels 63 | 64 | dets, keep = batched_nms(bboxes, scores, labels, nms_cfg) 65 | 66 | if max_num > 0: 67 | dets = dets[:max_num] 68 | keep = keep[:max_num] 69 | 70 | return dets, labels[keep] 71 | 72 | 73 | def batched_nms(boxes, scores, idxs, nms_cfg, class_agnostic=False): 74 | """Performs non-maximum suppression in a batched fashion. 75 | Modified from https://github.com/pytorch/vision/blob 76 | /505cd6957711af790211896d32b40291bea1bc21/torchvision/ops/boxes.py#L39. 77 | In order to perform NMS independently per class, we add an offset to all 78 | the boxes. The offset is dependent only on the class idx, and is large 79 | enough so that boxes from different classes do not overlap. 80 | Arguments: 81 | boxes (torch.Tensor): boxes in shape (N, 4). 82 | scores (torch.Tensor): scores in shape (N, ). 83 | idxs (torch.Tensor): each index value corresponds to a bbox cluster, 84 | and NMS will not be applied between elements of different idxs, 85 | shape (N, ).
85 | nms_cfg (dict): specify nms type and other parameters like iou_thr. 86 | Possible keys includes the following. 87 | - iou_thr (float): IoU threshold used for NMS. 88 | - split_thr (float): threshold number of boxes. In some cases the 89 | number of boxes is large (e.g., 200k). To avoid OOM during 90 | training, the users could set `split_thr` to a small value. 91 | If the number of boxes is greater than the threshold, it will 92 | perform NMS on each group of boxes separately and sequentially. 93 | Defaults to 10000. 94 | class_agnostic (bool): if true, nms is class agnostic, 95 | i.e. IoU thresholding happens over all boxes, 96 | regardless of the predicted class. 97 | Returns: 98 | tuple: kept dets and indice. 99 | """ 100 | nms_cfg_ = nms_cfg.copy() 101 | class_agnostic = nms_cfg_.pop('class_agnostic', class_agnostic) 102 | if class_agnostic: 103 | boxes_for_nms = boxes 104 | else: 105 | max_coordinate = boxes.max() 106 | offsets = idxs.to(boxes) * (max_coordinate + 1) 107 | boxes_for_nms = boxes + offsets[:, None] 108 | 109 | nms_type = nms_cfg_.pop('type', 'nms') 110 | # nms_op = eval(nms_type) 111 | 112 | split_thr = nms_cfg_.pop('split_thr', 10000) 113 | if len(boxes_for_nms) < split_thr: 114 | # dets, keep = nms_op(boxes_for_nms, scores, **nms_cfg_) 115 | keep = nms(boxes_for_nms, scores, **nms_cfg_) 116 | boxes = boxes[keep] 117 | # scores = dets[:, -1] 118 | scores = scores[keep] 119 | else: 120 | total_mask = scores.new_zeros(scores.size(), dtype=torch.bool) 121 | for id in torch.unique(idxs): 122 | mask = (idxs == id).nonzero(as_tuple=False).view(-1) 123 | # dets, keep = nms_op(boxes_for_nms[mask], scores[mask], **nms_cfg_) 124 | keep = nms(boxes_for_nms[mask], scores[mask], **nms_cfg_) 125 | total_mask[mask[keep]] = True 126 | 127 | keep = total_mask.nonzero(as_tuple=False).view(-1) 128 | keep = keep[scores[keep].argsort(descending=True)] 129 | boxes = boxes[keep] 130 | scores = scores[keep] 131 | 132 | return torch.cat([boxes, scores[:, None]], -1), keep -------------------------------------------------------------------------------- /nanodet/model/module/norm.py: -------------------------------------------------------------------------------- 1 | import torch.nn as nn 2 | 3 | norm_cfg = { 4 | # format: layer_type: (abbreviation, module) 5 | 'BN': ('bn', nn.BatchNorm2d), 6 | 'SyncBN': ('bn', nn.SyncBatchNorm), 7 | 'GN': ('gn', nn.GroupNorm), 8 | # and potentially 'SN' 9 | } 10 | 11 | 12 | def build_norm_layer(cfg, num_features, postfix=''): 13 | """ Build normalization layer 14 | 15 | Args: 16 | cfg (dict): cfg should contain: 17 | type (str): identify norm layer type. 18 | layer args: args needed to instantiate a norm layer. 19 | requires_grad (bool): [optional] whether stop gradient updates 20 | num_features (int): number of channels from input. 21 | postfix (int, str): appended into norm abbreviation to 22 | create named layer. 
23 | 24 | Returns: 25 | name (str): abbreviation + postfix 26 | layer (nn.Module): created norm layer 27 | """ 28 | assert isinstance(cfg, dict) and 'type' in cfg 29 | cfg_ = cfg.copy() 30 | 31 | layer_type = cfg_.pop('type') 32 | if layer_type not in norm_cfg: 33 | raise KeyError('Unrecognized norm type {}'.format(layer_type)) 34 | else: 35 | abbr, norm_layer = norm_cfg[layer_type] 36 | if norm_layer is None: 37 | raise NotImplementedError 38 | 39 | assert isinstance(postfix, (int, str)) 40 | name = abbr + str(postfix) 41 | 42 | requires_grad = cfg_.pop('requires_grad', True) 43 | cfg_.setdefault('eps', 1e-5) 44 | if layer_type != 'GN': 45 | layer = norm_layer(num_features, **cfg_) 46 | if layer_type == 'SyncBN': 47 | layer._specify_ddp_gpu_num(1) 48 | else: 49 | assert 'num_groups' in cfg_ 50 | layer = norm_layer(num_channels=num_features, **cfg_) 51 | 52 | for param in layer.parameters(): 53 | param.requires_grad = requires_grad 54 | 55 | return name, layer 56 | -------------------------------------------------------------------------------- /nanodet/model/module/scale.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | 4 | 5 | class Scale(nn.Module): 6 | """ 7 | A learnable scale parameter 8 | """ 9 | 10 | def __init__(self, scale=1.0): 11 | super(Scale, self).__init__() 12 | self.scale = nn.Parameter(torch.tensor(scale, dtype=torch.float)) 13 | 14 | def forward(self, x): 15 | return x * self.scale 16 | -------------------------------------------------------------------------------- /nanodet/nanodet.zip: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/guo-pu/NanoDet-PyTorch/bac88abfaaff0b5bf1b2c42bbe8d742014bd14dd/nanodet/nanodet.zip -------------------------------------------------------------------------------- /nanodet/trainer/__init__.py: -------------------------------------------------------------------------------- 1 | import torch 2 | from .trainer import Trainer 3 | from .dist_trainer import DistTrainer 4 | 5 | 6 | def build_trainer(rank, cfg, model, logger): 7 | if len(cfg.device.gpu_ids) > 1: 8 | trainer = DistTrainer(rank, cfg, model, logger) 9 | trainer.set_device(cfg.device.batchsize_per_gpu, rank, device=torch.device('cuda')) # TODO: device 10 | else: 11 | trainer = Trainer(rank, cfg, model, logger) 12 | trainer.set_device(cfg.device.batchsize_per_gpu, cfg.device.gpu_ids, device=torch.device('cuda')) 13 | return trainer 14 | 15 | -------------------------------------------------------------------------------- /nanodet/trainer/dist_trainer.py: -------------------------------------------------------------------------------- 1 | import torch.distributed as dist 2 | from .trainer import Trainer 3 | from ..util import DDP 4 | 5 | 6 | def average_gradients(model): 7 | """ Gradient averaging. """ 8 | size = float(dist.get_world_size()) 9 | for param in model.parameters(): 10 | if param.grad is not None: 11 | dist.all_reduce(param.grad.data, op=dist.ReduceOp.SUM) 12 | param.grad.data /= size 13 | 14 | 15 | 16 | class DistTrainer(Trainer): 17 | """ 18 | Distributed trainer for multi-gpu training. 
(not finished yet) 19 | """ 20 | def run_step(self, model, batch, mode='train'): 21 | output, loss, loss_stats = model.module.forward_train(batch) 22 | loss = loss.mean() 23 | if mode == 'train': 24 | self.optimizer.zero_grad() 25 | loss.backward() 26 | average_gradients(model) 27 | self.optimizer.step() 28 | return output, loss, loss_stats 29 | 30 | def set_device(self, batch_per_gpu, rank, device): 31 | """ 32 | Set model device for Distributed-Data-Parallel 33 | :param batch_per_gpu: batch size of each gpu 34 | :param rank: distributed training process rank 35 | :param device: cuda 36 | """ 37 | self.rank = rank 38 | self.model = DDP(batch_per_gpu, module=self.model.cuda(), device_ids=[rank], output_device=rank) 39 | 40 | 41 | -------------------------------------------------------------------------------- /nanodet/trainer/trainer.py: -------------------------------------------------------------------------------- 1 | import os 2 | import copy 3 | import warnings 4 | import torch 5 | from nanodet.util import mkdir, DataParallel, load_model_weight, save_model, MovingAverage, AverageMeter 6 | 7 | 8 | class Trainer: 9 | """ 10 | Epoch based trainer 11 | """ 12 | def __init__(self, rank, cfg, model, logger): 13 | self.rank = rank # local rank for distributed training. For single gpu training, default is -1 14 | self.cfg = cfg 15 | self.model = model 16 | self.logger = logger 17 | self._init_optimizer() 18 | self._iter = 1 19 | self.epoch = 1 20 | 21 | def set_device(self, batch_per_gpu, gpu_ids, device): 22 | """ 23 | Set model device to GPU. 24 | :param batch_per_gpu: batch size of each gpu 25 | :param gpu_ids: a list of gpu ids 26 | :param device: cuda 27 | """ 28 | num_gpu = len(gpu_ids) 29 | batch_sizes = [batch_per_gpu for i in range(num_gpu)] 30 | self.logger.log('Training batch size: {}'.format(batch_per_gpu*num_gpu)) 31 | self.model = DataParallel(self.model, gpu_ids, chunk_sizes=batch_sizes).to(device) 32 | 33 | def _init_optimizer(self): 34 | optimizer_cfg = copy.deepcopy(self.cfg.schedule.optimizer) 35 | name = optimizer_cfg.pop('name') 36 | Optimizer = getattr(torch.optim, name) 37 | self.optimizer = Optimizer(params=self.model.parameters(), **optimizer_cfg) 38 | 39 | def _init_scheduler(self): 40 | schedule_cfg = copy.deepcopy(self.cfg.schedule.lr_schedule) 41 | name = schedule_cfg.pop('name') 42 | Scheduler = getattr(torch.optim.lr_scheduler, name) 43 | self.lr_scheduler = Scheduler(optimizer=self.optimizer, **schedule_cfg) 44 | 45 | def run_step(self, model, meta, mode='train'): 46 | """ 47 | Training step including forward and backward 48 | :param model: model to train 49 | :param meta: a batch of input data 50 | :param mode: train or val or test 51 | :return: result, total loss and a dict of all losses 52 | """ 53 | output, loss, loss_dict = model.module.forward_train(meta) 54 | loss = loss.mean() 55 | if mode == 'train': 56 | self.optimizer.zero_grad() 57 | loss.backward() 58 | self.optimizer.step() 59 | return output, loss, loss_dict 60 | 61 | def run_epoch(self, epoch, data_loader, mode): 62 | """ 63 | train or validate one epoch 64 | :param epoch: current epoch number 65 | :param data_loader: dataloader of train or test dataset 66 | :param mode: train or val or test 67 | :return: outputs and a dict of epoch average losses 68 | """ 69 | model = self.model 70 | if mode == 'train': 71 | model.train() 72 | if self.rank > -1: # Using distributed training, need to set epoch for sampler 73 | self.logger.log("distributed sampler set epoch at {}".format(epoch)) 74 |
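# Note: DistributedSampler derives its shuffle seed from the epoch number,
# so set_epoch() must be called once per epoch, otherwise every epoch
# would iterate the data in the same order.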
data_loader.sampler.set_epoch(epoch) 75 | else: 76 | model.eval() 77 | torch.cuda.empty_cache() 78 | results = {} 79 | epoch_losses = {} 80 | step_losses = {} 81 | num_iters = len(data_loader) 82 | for iter_id, meta in enumerate(data_loader): 83 | if iter_id >= num_iters: 84 | break 85 | meta['img'] = meta['img'].to(device=torch.device('cuda'), non_blocking=True) 86 | output, loss, loss_stats = self.run_step(model, meta, mode) 87 | if mode == 'val': # TODO: eval 88 | dets = model.module.head.post_process(output, meta) 89 | results[meta['img_info']['id'].cpu().numpy()[0]] = dets 90 | for k in loss_stats: 91 | if k not in epoch_losses: 92 | epoch_losses[k] = AverageMeter(loss_stats[k].mean().item()) 93 | step_losses[k] = MovingAverage(loss_stats[k].mean().item(), window_size=self.cfg.log.interval) 94 | else: 95 | epoch_losses[k].update(loss_stats[k].mean().item()) 96 | step_losses[k].push(loss_stats[k].mean().item()) 97 | 98 | if iter_id % self.cfg.log.interval == 0: 99 | log_msg = '{}|Epoch{}/{}|Iter{}({}/{})| lr:{:.2e}| '.format(mode, epoch, self.cfg.schedule.total_epochs, 100 | self._iter, iter_id, num_iters, self.optimizer.param_groups[0]['lr']) 101 | for l in step_losses: 102 | log_msg += '{}:{:.4f}| '.format(l, step_losses[l].avg()) 103 | if mode == 'train' and self.rank < 1: 104 | self.logger.scalar_summary('Train_loss/' + l, mode, step_losses[l].avg(), self._iter) 105 | self.logger.log(log_msg) 106 | if mode == 'train': 107 | self._iter += 1 108 | del output, loss, loss_stats 109 | epoch_loss_dict = {k: v.avg for k, v in epoch_losses.items()} 110 | return results, epoch_loss_dict 111 | 112 | def run(self, train_loader, val_loader, evaluator): 113 | """ 114 | start running 115 | :param train_loader: 116 | :param val_loader: 117 | :param evaluator: 118 | """ 119 | start_epoch = self.epoch 120 | save_flag = -10 121 | if self.cfg.schedule.warmup.steps > 0 and start_epoch == 1: 122 | self.logger.log('Start warming up...') 123 | self.warm_up(train_loader) 124 | for param_group in self.optimizer.param_groups: 125 | param_group['lr'] = self.cfg.schedule.optimizer.lr 126 | 127 | self._init_scheduler() 128 | self.lr_scheduler.last_epoch = start_epoch - 1 129 | 130 | for epoch in range(start_epoch, self.cfg.schedule.total_epochs + 1): 131 | results, train_loss_dict = self.run_epoch(epoch, train_loader, mode='train') 132 | self.lr_scheduler.step() 133 | save_model(self.rank, self.model, os.path.join(self.cfg.save_dir, 'model_last.pth'), epoch, self._iter, self.optimizer) 134 | for k, v in train_loss_dict.items(): 135 | self.logger.scalar_summary('Epoch_loss/' + k, 'train', v, epoch) 136 | 137 | # --------evaluate---------- 138 | if self.cfg.schedule.val_intervals > 0 and epoch % self.cfg.schedule.val_intervals == 0: 139 | with torch.no_grad(): 140 | results, val_loss_dict = self.run_epoch(self.epoch, val_loader, mode='val') 141 | for k, v in val_loss_dict.items(): 142 | self.logger.scalar_summary('Epoch_loss/' + k, 'val', v, epoch) 143 | eval_results = evaluator.evaluate(results, self.cfg.save_dir, epoch, self.logger, rank=self.rank) 144 | if self.cfg.evaluator.save_key in eval_results: 145 | metric = eval_results[self.cfg.evaluator.save_key] 146 | if metric > save_flag: 147 | # ------save best model-------- 148 | save_flag = metric 149 | best_save_path = os.path.join(self.cfg.save_dir, 'model_best') 150 | mkdir(self.rank, best_save_path) 151 | save_model(self.rank, self.model, os.path.join(best_save_path, 'model_best.pth'), epoch, 152 | self._iter, self.optimizer) 153 | txt_path = 
os.path.join(best_save_path, "eval_results.txt") 154 | if self.rank < 1: 155 | with open(txt_path, "a") as f: 156 | f.write("Epoch:{}\n".format(epoch)) 157 | for k, v in eval_results.items(): 158 | f.write("{}: {}\n".format(k, v)) 159 | else: 160 | warnings.warn('Warning! Save_key is not in eval results! Only save model last!') 161 | self.epoch += 1 162 | 163 | def get_warmup_lr(self, cur_iters): 164 | if self.cfg.schedule.warmup.name == 'constant': 165 | warmup_lr = self.cfg.schedule.optimizer.lr * self.cfg.schedule.warmup.ratio 166 | elif self.cfg.schedule.warmup.name == 'linear': 167 | k = (1 - cur_iters / self.cfg.schedule.warmup.steps) * (1 - self.cfg.schedule.warmup.ratio) 168 | warmup_lr = self.cfg.schedule.optimizer.lr * (1 - k) 169 | elif self.cfg.schedule.warmup.name == 'exp': 170 | k = self.cfg.schedule.warmup.ratio ** (1 - cur_iters / self.cfg.schedule.warmup.steps) 171 | warmup_lr = self.cfg.schedule.optimizer.lr * k 172 | else: 173 | raise Exception('Unsupported warm up type!') 174 | return warmup_lr 175 | 176 | def warm_up(self, data_loader): 177 | model = self.model 178 | model.train() 179 | step_losses = {} 180 | num_iters = self.cfg.schedule.warmup.steps 181 | cur_iter = 0 182 | while cur_iter < num_iters: 183 | for iter_id, batch in enumerate(data_loader): 184 | cur_iter += 1 185 | if cur_iter >= num_iters: 186 | break 187 | lr = self.get_warmup_lr(cur_iter) 188 | for param_group in self.optimizer.param_groups: 189 | param_group['lr'] = lr 190 | batch['img'] = batch['img'].to(device=torch.device('cuda'), non_blocking=True) 191 | output, loss, loss_stats = self.run_step(model, batch) 192 | 193 | # TODO: simplify code 194 | for k in loss_stats: 195 | if k not in step_losses: 196 | step_losses[k] = MovingAverage(loss_stats[k].mean().item(), window_size=self.cfg.log.interval) 197 | else: 198 | step_losses[k].push(loss_stats[k].mean().item()) 199 | if iter_id % self.cfg.log.interval == 0: 200 | log_msg = '{}|Iter({}/{})| lr:{:.2e}| '.format('warmup', cur_iter, num_iters, self.optimizer.param_groups[0]['lr']) 201 | for l in step_losses: 202 | log_msg += '{}:{:.4f}| '.format(l, step_losses[l].avg()) 203 | self.logger.log(log_msg) 204 | del output, loss, loss_stats 205 | 206 | def load_model(self, cfg): 207 | load_path = cfg.schedule.load_model 208 | checkpoint = torch.load(load_path, map_location=lambda storage, loc: storage) 209 | self.logger.log('loaded {}, epoch {}'.format(load_path, checkpoint['epoch'])) 210 | if hasattr(self.model, 'module'): 211 | load_model_weight(self.model.module, checkpoint, self.logger) 212 | else: 213 | load_model_weight(self.model, checkpoint, self.logger) 214 | 215 | def resume(self, cfg): 216 | """ 217 | load model and optimizer state 218 | """ 219 | if cfg.schedule.resume is not None: 220 | load_path = cfg.schedule.resume 221 | else: 222 | load_path = os.path.join(cfg.save_dir, 'model_last.pth') 223 | checkpoint = torch.load(load_path, map_location=lambda storage, loc: storage) 224 | self.logger.log('loaded {}, epoch {}'.format(load_path, checkpoint['epoch'])) 225 | if hasattr(self.model, 'module'): 226 | load_model_weight(self.model.module, checkpoint, self.logger) 227 | else: 228 | load_model_weight(self.model, checkpoint, self.logger) 229 | if 'optimizer' in checkpoint: 230 | self.optimizer.load_state_dict(checkpoint['optimizer']) 231 | self.epoch = checkpoint['epoch'] + 1 232 | self.logger.log('resumed at epoch: {}'.format(self.epoch)) 233 | if 'iter' in checkpoint: 234 | self._iter = checkpoint['iter'] + 1 235 | self.logger.log('resumed at 
steps: {}'.format(self._iter)) 236 | else: 237 | self.logger.log('No optimizer parameters in checkpoint.') 238 | 239 | -------------------------------------------------------------------------------- /nanodet/util/__init__.py: -------------------------------------------------------------------------------- 1 | from .rank_filter import rank_filter 2 | from .path import mkdir 3 | from .logger import Logger, MovingAverage, AverageMeter 4 | from .data_parallel import DataParallel 5 | from .distributed_data_parallel import DDP 6 | from .check_point import load_model_weight, save_model 7 | from .config import cfg, load_config 8 | from .box_transform import * 9 | from .util_mixins import NiceRepr 10 | from .visualization import Visualizer, overlay_bbox_cv 11 | from .flops_counter import get_model_complexity_info 12 | -------------------------------------------------------------------------------- /nanodet/util/__pycache__/__init__.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/guo-pu/NanoDet-PyTorch/bac88abfaaff0b5bf1b2c42bbe8d742014bd14dd/nanodet/util/__pycache__/__init__.cpython-38.pyc -------------------------------------------------------------------------------- /nanodet/util/__pycache__/box_transform.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/guo-pu/NanoDet-PyTorch/bac88abfaaff0b5bf1b2c42bbe8d742014bd14dd/nanodet/util/__pycache__/box_transform.cpython-38.pyc -------------------------------------------------------------------------------- /nanodet/util/__pycache__/check_point.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/guo-pu/NanoDet-PyTorch/bac88abfaaff0b5bf1b2c42bbe8d742014bd14dd/nanodet/util/__pycache__/check_point.cpython-38.pyc -------------------------------------------------------------------------------- /nanodet/util/__pycache__/config.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/guo-pu/NanoDet-PyTorch/bac88abfaaff0b5bf1b2c42bbe8d742014bd14dd/nanodet/util/__pycache__/config.cpython-38.pyc -------------------------------------------------------------------------------- /nanodet/util/__pycache__/data_parallel.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/guo-pu/NanoDet-PyTorch/bac88abfaaff0b5bf1b2c42bbe8d742014bd14dd/nanodet/util/__pycache__/data_parallel.cpython-38.pyc -------------------------------------------------------------------------------- /nanodet/util/__pycache__/distributed_data_parallel.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/guo-pu/NanoDet-PyTorch/bac88abfaaff0b5bf1b2c42bbe8d742014bd14dd/nanodet/util/__pycache__/distributed_data_parallel.cpython-38.pyc -------------------------------------------------------------------------------- /nanodet/util/__pycache__/flops_counter.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/guo-pu/NanoDet-PyTorch/bac88abfaaff0b5bf1b2c42bbe8d742014bd14dd/nanodet/util/__pycache__/flops_counter.cpython-38.pyc -------------------------------------------------------------------------------- /nanodet/util/__pycache__/logger.cpython-38.pyc: 
--------------------------------------------------------------------------------
/nanodet/util/box_transform.py:
--------------------------------------------------------------------------------
import torch


def distance2bbox(points, distance, max_shape=None):
    """Decode distance prediction to bounding box.

    Args:
        points (Tensor): Shape (n, 2), [x, y].
        distance (Tensor): Distance from the given point to 4
            boundaries (left, top, right, bottom).
        max_shape (tuple): Shape of the image.

    Returns:
        Tensor: Decoded bboxes.
    """
    x1 = points[:, 0] - distance[:, 0]
    y1 = points[:, 1] - distance[:, 1]
    x2 = points[:, 0] + distance[:, 2]
    y2 = points[:, 1] + distance[:, 3]
    if max_shape is not None:
        x1 = x1.clamp(min=0, max=max_shape[1])
        y1 = y1.clamp(min=0, max=max_shape[0])
        x2 = x2.clamp(min=0, max=max_shape[1])
        y2 = y2.clamp(min=0, max=max_shape[0])
    return torch.stack([x1, y1, x2, y2], -1)


def bbox2distance(points, bbox, max_dis=None, eps=0.1):
    """Encode a bounding box as distances from the given points to its
    four sides (the inverse of ``distance2bbox``).

    Args:
        points (Tensor): Shape (n, 2), [x, y].
        bbox (Tensor): Shape (n, 4), "xyxy" format.
        max_dis (float): Upper bound of the distance.
        eps (float): a small value to ensure target < max_dis, instead of <=.

    Returns:
        Tensor: Encoded distances (left, top, right, bottom).
    """
    left = points[:, 0] - bbox[:, 0]
    top = points[:, 1] - bbox[:, 1]
    right = bbox[:, 2] - points[:, 0]
    bottom = bbox[:, 3] - points[:, 1]
    if max_dis is not None:
        left = left.clamp(min=0, max=max_dis - eps)
        top = top.clamp(min=0, max=max_dis - eps)
        right = right.clamp(min=0, max=max_dis - eps)
        bottom = bottom.clamp(min=0, max=max_dis - eps)
    return torch.stack([left, top, right, bottom], -1)
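The two helpers invert each other (up to clamping) for points that lie inside their boxes, which is easy to sanity-check:

```python
import torch
from nanodet.util.box_transform import bbox2distance, distance2bbox

points = torch.tensor([[50., 60.]])
boxes = torch.tensor([[10., 20., 110., 120.]])  # xyxy
dist = bbox2distance(points, boxes)             # left, top, right, bottom
print(dist)                                     # tensor([[40., 40., 60., 60.]])
print(distance2bbox(points, dist))              # recovers the original box
```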
--------------------------------------------------------------------------------
/nanodet/util/check_point.py:
--------------------------------------------------------------------------------
import torch
from .rank_filter import rank_filter


def load_model_weight(model, checkpoint, logger):
    state_dict = checkpoint['state_dict']
    # strip prefix of state_dict
    if list(state_dict.keys())[0].startswith('module.'):
        state_dict = {k[7:]: v for k, v in checkpoint['state_dict'].items()}

    model_state_dict = model.module.state_dict() if hasattr(model, 'module') else model.state_dict()

    # check loaded parameters and created model parameters
    for k in state_dict:
        if k in model_state_dict:
            if state_dict[k].shape != model_state_dict[k].shape:
                logger.log('Skip loading parameter {}, required shape {}, loaded shape {}.'.format(
                    k, model_state_dict[k].shape, state_dict[k].shape))
                state_dict[k] = model_state_dict[k]
        else:
            logger.log('Drop parameter {}.'.format(k))
    for k in model_state_dict:
        if not (k in state_dict):
            logger.log('No param {}.'.format(k))
            state_dict[k] = model_state_dict[k]
    model.load_state_dict(state_dict, strict=False)


@rank_filter
def save_model(model, path, epoch, iter, optimizer=None):
    model_state_dict = model.module.state_dict() if hasattr(model, 'module') else model.state_dict()
    data = {'epoch': epoch,
            'state_dict': model_state_dict,
            'iter': iter}
    if optimizer is not None:
        data['optimizer'] = optimizer.state_dict()

    torch.save(data, path)
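A minimal save/load round trip with these helpers. Because of the `@rank_filter` decorator, the process rank is the first call argument to `save_model`; the stub logger below is a stand-in, since `load_model_weight` only needs an object with a `.log()` method:

```python
import torch
from nanodet.util.check_point import save_model, load_model_weight

class PrintLogger:
    # load_model_weight only calls .log()
    def log(self, msg):
        print(msg)

model = torch.nn.Linear(4, 2)
opt = torch.optim.SGD(model.parameters(), lr=0.1)

# rank 0 writes the file; on rank >= 1 the decorator turns this into a no-op
save_model(0, model, 'model_last.pth', 10, 5000, opt)
ckpt = torch.load('model_last.pth', map_location=lambda storage, loc: storage)
load_model_weight(model, ckpt, PrintLogger())
```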
--------------------------------------------------------------------------------
/nanodet/util/config.py:
--------------------------------------------------------------------------------
from .yacs import CfgNode

cfg = CfgNode(new_allowed=True)
cfg.save_dir = './'
# common params for NETWORK
cfg.model = CfgNode()
cfg.model.arch = CfgNode(new_allowed=True)
cfg.model.arch.backbone = CfgNode(new_allowed=True)
cfg.model.arch.neck = CfgNode(new_allowed=True)
cfg.model.arch.head = CfgNode(new_allowed=True)

# DATASET related params
cfg.data = CfgNode(new_allowed=True)
cfg.data.train = CfgNode(new_allowed=True)
cfg.data.val = CfgNode(new_allowed=True)
cfg.device = CfgNode(new_allowed=True)
# train
cfg.schedule = CfgNode(new_allowed=True)

# logger
cfg.log = CfgNode()
cfg.log.interval = 50

# testing
cfg.test = CfgNode()
# size of images for each device


def load_config(cfg, args_cfg):
    cfg.defrost()
    cfg.merge_from_file(args_cfg)
    cfg.freeze()


if __name__ == '__main__':
    import sys

    with open(sys.argv[1], 'w') as f:
        print(cfg, file=f)
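Typical usage of the `cfg` singleton, assuming the shipped `config/nanodet-m.yml`:

```python
from nanodet.util import cfg, load_config

load_config(cfg, 'config/nanodet-m.yml')   # merge the YAML into the frozen singleton
print(cfg.save_dir)                        # workspace/nanodet_m
print(cfg.model.arch.backbone)             # ShuffleNetV2 settings from the YAML
```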
--------------------------------------------------------------------------------
/nanodet/util/data_parallel.py:
--------------------------------------------------------------------------------
import torch
from torch.nn.modules import Module
from torch.nn.parallel.scatter_gather import gather
from torch.nn.parallel.replicate import replicate
from torch.nn.parallel.parallel_apply import parallel_apply

from .scatter_gather import scatter_kwargs


class DataParallel(Module):
    r"""Implements data parallelism at the module level.

    This container parallelizes the application of the given module by
    splitting the input across the specified devices by chunking in the batch
    dimension. In the forward pass, the module is replicated on each device,
    and each replica handles a portion of the input. During the backwards
    pass, gradients from each replica are summed into the original module.

    The batch size should be larger than the number of GPUs used. It should
    also be an integer multiple of the number of GPUs so that each chunk is the
    same size (so that each GPU processes the same number of samples).

    See also: :ref:`cuda-nn-dataparallel-instead`

    Arbitrary positional and keyword inputs are allowed to be passed into
    DataParallel EXCEPT Tensors. All variables will be scattered on dim
    specified (default 0). Primitive types will be broadcasted, but all
    other types will be a shallow copy and can be corrupted if written to in
    the model's forward pass.

    Args:
        module: module to be parallelized
        device_ids: CUDA devices (default: all devices)
        output_device: device location of output (default: device_ids[0])

    Example::

        >>> net = torch.nn.DataParallel(model, device_ids=[0, 1, 2])
        >>> output = net(input_var)
    """

    def __init__(self, module, device_ids=None, output_device=None, dim=0, chunk_sizes=None):
        super(DataParallel, self).__init__()

        if not torch.cuda.is_available():
            self.module = module
            self.device_ids = []
            return

        if device_ids is None:
            device_ids = list(range(torch.cuda.device_count()))
        if output_device is None:
            output_device = device_ids[0]
        self.dim = dim
        self.module = module
        self.device_ids = device_ids
        self.chunk_sizes = chunk_sizes
        self.output_device = output_device
        if len(self.device_ids) == 1:
            self.module.cuda(device_ids[0])

    def forward(self, *inputs, **kwargs):
        if not self.device_ids:
            return self.module(*inputs, **kwargs)
        inputs, kwargs = self.scatter(inputs, kwargs, self.device_ids, self.chunk_sizes)
        if len(self.device_ids) == 1:
            return self.module(*inputs[0], **kwargs[0])
        replicas = self.replicate(self.module, self.device_ids[:len(inputs)])
        outputs = self.parallel_apply(replicas, inputs, kwargs)
        return self.gather(outputs, self.output_device)

    def replicate(self, module, device_ids):
        return replicate(module, device_ids)

    def scatter(self, inputs, kwargs, device_ids, chunk_sizes):
        return scatter_kwargs(inputs, kwargs, device_ids, dim=self.dim, chunk_sizes=self.chunk_sizes)

    def parallel_apply(self, replicas, inputs, kwargs):
        return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])

    def gather(self, outputs, output_device):
        return gather(outputs, output_device, dim=self.dim)


# TODO: remove this
def data_parallel(module, inputs, device_ids=None, output_device=None, dim=0, module_kwargs=None):
    r"""Evaluates module(input) in parallel across the GPUs given in device_ids.

    This is the functional version of the DataParallel module.

    Args:
        module: the module to evaluate in parallel
        inputs: inputs to the module
        device_ids: GPU ids on which to replicate module
        output_device: GPU location of the output. Use -1 to indicate the CPU.
            (default: device_ids[0])
    Returns:
        a Variable containing the result of module(input) located on
        output_device
    """
    if not isinstance(inputs, tuple):
        inputs = (inputs,)

    if device_ids is None:
        device_ids = list(range(torch.cuda.device_count()))

    if output_device is None:
        output_device = device_ids[0]

    inputs, module_kwargs = scatter_kwargs(inputs, module_kwargs, device_ids, dim)
    if len(device_ids) == 1:
        return module(*inputs[0], **module_kwargs[0])
    used_device_ids = device_ids[:len(inputs)]
    replicas = replicate(module, used_device_ids)
    outputs = parallel_apply(replicas, inputs, module_kwargs, used_device_ids)
    return gather(outputs, output_device, dim)
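What `chunk_sizes` buys over `torch.nn.DataParallel` is an uneven split of the batch, for example a smaller share for the GPU that also holds optimizer state. A sketch that assumes a machine with two CUDA devices:

```python
import torch
from nanodet.util import DataParallel

model = torch.nn.Linear(8, 4).cuda()
# split a batch of 8 as 6 + 2 instead of the default even 4 + 4
net = DataParallel(model, device_ids=[0, 1], chunk_sizes=[6, 2])
out = net(torch.randn(8, 8).cuda())
print(out.shape)  # torch.Size([8, 4]), gathered back onto device_ids[0]
```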
--------------------------------------------------------------------------------
/nanodet/util/distributed_data_parallel.py:
--------------------------------------------------------------------------------
from torch.nn.parallel import DistributedDataParallel
from .scatter_gather import scatter_kwargs


class DDP(DistributedDataParallel):

    def __init__(self, batchsize, **kwargs):
        self.batchsize = batchsize
        super(DDP, self).__init__(**kwargs)

    def scatter(self, inputs, kwargs, device_ids):
        return scatter_kwargs(inputs, kwargs, device_ids, dim=self.dim, chunk_sizes=[self.batchsize])
--------------------------------------------------------------------------------
/nanodet/util/logger.py:
--------------------------------------------------------------------------------
import os
import logging
import torch
import numpy as np
from termcolor import colored
from .rank_filter import rank_filter
from .path import mkdir


class Logger:
    def __init__(self, local_rank, save_dir='./', use_tensorboard=True):
        mkdir(local_rank, save_dir)
        self.rank = local_rank
        fmt = colored('[%(name)s]', 'magenta', attrs=['bold']) + colored('[%(asctime)s]', 'blue') + \
              colored('%(levelname)s:', 'green') + colored('%(message)s', 'white')
        logging.basicConfig(level=logging.INFO,
                            filename=os.path.join(save_dir, 'logs.txt'),
                            filemode='w')
        self.log_dir = os.path.join(save_dir, 'logs')
        console = logging.StreamHandler()
        console.setLevel(logging.INFO)
        formatter = logging.Formatter(fmt, datefmt="%m-%d %H:%M:%S")
        console.setFormatter(formatter)
        logging.getLogger().addHandler(console)
        if use_tensorboard:
            try:
                from torch.utils.tensorboard import SummaryWriter
            except ImportError:
                raise ImportError(
                    'Please run "pip install future tensorboard" to install '
                    'the dependencies to use torch.utils.tensorboard '
                    '(applicable to PyTorch 1.1 or higher)')
            if self.rank < 1:
                logging.info('Using Tensorboard, logs will be saved in {}'.format(self.log_dir))
                self.writer = SummaryWriter(log_dir=self.log_dir)

    def log(self, string):
        if self.rank < 1:
            logging.info(string)

    def scalar_summary(self, tag, phase, value, step):
        if self.rank < 1:
            self.writer.add_scalars(tag, {phase: value}, step)


class MovingAverage(object):
    def __init__(self, val, window_size=50):
        self.window_size = window_size
        self.reset()
        self.push(val)

    def reset(self):
        self.queue = []

    def push(self, val):
        self.queue.append(val)
        if len(self.queue) > self.window_size:
            self.queue.pop(0)

    def avg(self):
        return np.mean(self.queue)


class AverageMeter(object):
    """Computes and stores the average and current value"""

    def __init__(self, val):
        self.reset()
        self.update(val)

    def reset(self):
        self.val = 0
        self.avg = 0
        self.sum = 0
        self.count = 0

    def update(self, val, n=1):
        self.val = val
        self.sum += val * n
        self.count += n
        if self.count > 0:
            self.avg = self.sum / self.count
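A quick check of the two meters (values are illustrative): `MovingAverage` only averages a sliding window, while `AverageMeter` averages everything seen so far.

```python
from nanodet.util import MovingAverage, AverageMeter

ma = MovingAverage(1.0, window_size=3)
for v in (2.0, 3.0, 4.0):
    ma.push(v)
print(ma.avg())   # mean of the last 3 pushes only -> 3.0

am = AverageMeter(1.0)
am.update(3.0)
print(am.avg)     # running mean over everything -> 2.0
```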
--------------------------------------------------------------------------------
/nanodet/util/path.py:
--------------------------------------------------------------------------------
import os
from .rank_filter import rank_filter


@rank_filter
def mkdir(path):
    if not os.path.exists(path):
        os.makedirs(path)
--------------------------------------------------------------------------------
/nanodet/util/rank_filter.py:
--------------------------------------------------------------------------------
def rank_filter(func):
    def func_filter(local_rank=-1, *args, **kwargs):
        if local_rank < 1:
            return func(*args, **kwargs)
        else:
            pass
    return func_filter
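The wrapper changes the calling convention: the first positional argument becomes the process rank and is consumed before the real function runs, which is why callers write `mkdir(local_rank, save_dir)` for a one-argument `mkdir(path)`. For example:

```python
from nanodet.util import rank_filter

@rank_filter
def touch(path):
    open(path, 'w').close()

touch(0, 'created.txt')  # rank 0 (or -1): the wrapped function runs
touch(2, 'skipped.txt')  # rank 2: silently skipped on this process
```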
--------------------------------------------------------------------------------
/nanodet/util/scatter_gather.py:
--------------------------------------------------------------------------------
import torch
from torch.autograd import Variable
from torch.nn.parallel._functions import Scatter


def list_scatter(input, target_gpus, chunk_sizes):
    ret = []
    for idx, size in enumerate(chunk_sizes):
        ret.append(input[:size])
        del input[:size]
    return tuple(ret)


def scatter(inputs, target_gpus, dim=0, chunk_sizes=None):
    """
    Slices variables into approximately equal chunks and
    distributes them across given GPUs. Duplicates
    references to objects that are not variables. Does not
    support Tensors.
    """
    def scatter_map(obj):
        if isinstance(obj, Variable):
            return Scatter.apply(target_gpus, chunk_sizes, dim, obj)
        assert not torch.is_tensor(obj), "Tensors not supported in scatter."
        if isinstance(obj, list):
            return list_scatter(obj, target_gpus, chunk_sizes)
        if isinstance(obj, tuple):
            return list(zip(*map(scatter_map, obj)))
        if isinstance(obj, dict):
            return list(map(type(obj), zip(*map(scatter_map, obj.items()))))
        return [obj for _ in target_gpus]

    return scatter_map(inputs)


def scatter_kwargs(inputs, kwargs, target_gpus, dim=0, chunk_sizes=None):
    r"""Scatter with support for kwargs dictionary"""
    inputs = scatter(inputs, target_gpus, dim, chunk_sizes) if inputs else []
    kwargs = scatter(kwargs, target_gpus, dim, chunk_sizes) if kwargs else []
    if len(inputs) < len(kwargs):
        inputs.extend([() for _ in range(len(kwargs) - len(inputs))])
    elif len(kwargs) < len(inputs):
        kwargs.extend([{} for _ in range(len(inputs) - len(kwargs))])
    inputs = tuple(inputs)
    kwargs = tuple(kwargs)
    return inputs, kwargs
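`list_scatter` splits by explicit sizes and consumes the input list in place; a CPU-only check:

```python
from nanodet.util.scatter_gather import list_scatter

batch = list(range(10))
print(list_scatter(batch, target_gpus=[0, 1], chunk_sizes=[6, 4]))
# ([0, 1, 2, 3, 4, 5], [6, 7, 8, 9])
print(batch)  # [] -- the input list is emptied as a side effect
```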
--------------------------------------------------------------------------------
/nanodet/util/util_mixins.py:
--------------------------------------------------------------------------------
"""This module defines the :class:`NiceRepr` mixin class, which defines a
``__repr__`` and ``__str__`` method that only depend on a custom ``__nice__``
method, which you must define. This means you only have to overload one
function instead of two. Furthermore, if the object defines a ``__len__``
method, then the ``__nice__`` method defaults to something sensible, otherwise
it is treated as abstract and raises ``NotImplementedError``.

To use simply have your object inherit from :class:`NiceRepr`
(multi-inheritance should be ok).

This code was copied from the ubelt library: https://github.com/Erotemic/ubelt

Example:
    >>> # Objects that define __nice__ have a default __str__ and __repr__
    >>> class Student(NiceRepr):
    ...     def __init__(self, name):
    ...         self.name = name
    ...     def __nice__(self):
    ...         return self.name
    >>> s1 = Student('Alice')
    >>> s2 = Student('Bob')
    >>> print(f's1 = {s1}')
    >>> print(f's2 = {s2}')
    s1 = <Student(Alice)>
    s2 = <Student(Bob)>

Example:
    >>> # Objects that define __len__ have a default __nice__
    >>> class Group(NiceRepr):
    ...     def __init__(self, data):
    ...         self.data = data
    ...     def __len__(self):
    ...         return len(self.data)
    >>> g = Group([1, 2, 3])
    >>> print(f'g = {g}')
    g = <Group(3)>
"""
import warnings


class NiceRepr(object):
    """Inherit from this class and define ``__nice__`` to "nicely" print your
    objects.

    Defines ``__str__`` and ``__repr__`` in terms of ``__nice__`` function
    Classes that inherit from :class:`NiceRepr` should redefine ``__nice__``.
    If the inheriting class has a ``__len__``, method then the default
    ``__nice__`` method will return its length.

    Example:
        >>> class Foo(NiceRepr):
        ...     def __nice__(self):
        ...         return 'info'
        >>> foo = Foo()
        >>> assert str(foo) == '<Foo(info)>'
        >>> assert repr(foo).startswith('<Foo(info) at ')

    Example:
        >>> class Bar(NiceRepr):
        ...     pass
        >>> bar = Bar()
        >>> import pytest
        >>> with pytest.warns(None) as record:
        >>>     assert 'object at' in str(bar)
        >>>     assert 'object at' in repr(bar)

    Example:
        >>> class Baz(NiceRepr):
        ...     def __len__(self):
        ...         return 5
        >>> baz = Baz()
        >>> assert str(baz) == '<Baz(5)>'
    """

    def __nice__(self):
        """str: a "nice" summary string describing this module"""
        if hasattr(self, '__len__'):
            # It is a common pattern for objects to use __len__ in __nice__
            # As a convenience we define a default __nice__ for these objects
            return str(len(self))
        else:
            # In all other cases force the subclass to overload __nice__
            raise NotImplementedError(
                f'Define the __nice__ method for {self.__class__!r}')

    def __repr__(self):
        """str: the string of the module"""
        try:
            nice = self.__nice__()
            classname = self.__class__.__name__
            return f'<{classname}({nice}) at {hex(id(self))}>'
        except NotImplementedError as ex:
            warnings.warn(str(ex), category=RuntimeWarning)
            return object.__repr__(self)

    def __str__(self):
        """str: the string of the module"""
        try:
            classname = self.__class__.__name__
            nice = self.__nice__()
            return f'<{classname}({nice})>'
        except NotImplementedError as ex:
            warnings.warn(str(ex), category=RuntimeWarning)
            return object.__repr__(self)
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
Cython
termcolor
numpy
torch>=1.3
torchvision
tensorboard
pycocotools
matplotlib
pyaml
opencv-python
tqdm
--------------------------------------------------------------------------------
/street.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/guo-pu/NanoDet-PyTorch/bac88abfaaff0b5bf1b2c42bbe8d742014bd14dd/street.png
--------------------------------------------------------------------------------
/tools/export.py:
--------------------------------------------------------------------------------
import os
import torch
from nanodet.model.arch import build_model
from nanodet.util import Logger, cfg, load_config, load_model_weight


def main(config, model_path, output_path, input_shape=(320, 320)):
    logger = Logger(-1, config.save_dir, False)
    model = build_model(config.model)
    checkpoint = torch.load(model_path, map_location=lambda storage, loc: storage)
    load_model_weight(model, checkpoint, logger)
    # torch.autograd.Variable is a deprecated no-op wrapper; a plain tensor suffices
    dummy_input = torch.randn(1, 3, input_shape[0], input_shape[1])
    torch.onnx.export(model, dummy_input, output_path, verbose=True, keep_initializers_as_inputs=True, opset_version=11)
    print('finished exporting onnx ')


if __name__ == '__main__':
    cfg_path = r"config/nanodet-m.yml"
    model_path = r"nanodet_m.pth"
    out_path = r'output.onnx'
    load_config(cfg, cfg_path)
    main(cfg, model_path, out_path, input_shape=(320, 320))
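Once `output.onnx` has been written, the graph can be sanity-checked with the `onnx` package (an extra dependency, assumed installed; it is not listed in requirements.txt):

```python
import onnx

m = onnx.load('output.onnx')
onnx.checker.check_model(m)                  # raises if the exported graph is malformed
print(onnx.helper.printable_graph(m.graph))  # human-readable dump of the ops
```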
--------------------------------------------------------------------------------
/tools/flops.py:
--------------------------------------------------------------------------------
import torch
from nanodet.model.arch import build_model
from nanodet.util import cfg, load_config, get_model_complexity_info


def main(config, input_shape=(3, 320, 320)):
    model = build_model(config.model)
    # get_model_complexity_info returns pre-formatted strings by default,
    # so print them as-is (multiplying a string would only repeat the text)
    flops, params = get_model_complexity_info(model, input_shape)
    split_line = '=' * 30
    print(f'{split_line}\nInput shape: {input_shape}\n'
          f'Flops: {flops}\nParams: {params}\n{split_line}')


if __name__ == '__main__':
    cfg_path = r"config/nanodet-m.yml"
    load_config(cfg, cfg_path)
    main(config=cfg,
         input_shape=(3, 320, 320)
         )
--------------------------------------------------------------------------------
/tools/inference.py:
--------------------------------------------------------------------------------
import os
import cv2
import time
import torch

from nanodet.model.arch import build_model
from nanodet.util import load_model_weight
from nanodet.data.transform import Pipeline


class Predictor(object):
    def __init__(self, cfg, model_path, logger, device='cuda:0'):
        self.cfg = cfg
        self.device = device
        model = build_model(cfg.model)
        ckpt = torch.load(model_path, map_location=lambda storage, loc: storage)
        load_model_weight(model, ckpt, logger)
        self.model = model.to(device).eval()
        self.pipeline = Pipeline(cfg.data.val.pipeline, cfg.data.val.keep_ratio)

    def inference(self, img):
        img_info = {}
        if isinstance(img, str):
            img_info['file_name'] = os.path.basename(img)
            img = cv2.imread(img)
        else:
            img_info['file_name'] = None

        height, width = img.shape[:2]
        img_info['height'] = height
        img_info['width'] = width
        meta = dict(img_info=img_info,
                    raw_img=img,
                    img=img)
        meta = self.pipeline(meta, self.cfg.data.val.input_size)
        meta['img'] = torch.from_numpy(meta['img'].transpose(2, 0, 1)).unsqueeze(0).to(self.device)
        with torch.no_grad():
            results = self.model.inference(meta)
        return meta, results

    def visualize(self, dets, meta, class_names, score_thres, wait=0):
        time1 = time.time()
        self.model.head.show_result(meta['raw_img'], dets, class_names, score_thres=score_thres, show=True)
        print('viz time: {:.3f}s'.format(time.time() - time1))
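A hedged usage sketch for `Predictor`, mirroring what detect_main.py presumably does with the shipped config and weights; `cfg.class_names` is assumed to be populated from the YAML, and `tools` is assumed to be importable from the repo root:

```python
from nanodet.util import cfg, load_config, Logger
from tools.inference import Predictor

load_config(cfg, 'config/nanodet-m.yml')
logger = Logger(-1, use_tensorboard=False)
predictor = Predictor(cfg, 'model/nanodet_m.pth', logger, device='cuda:0')
meta, res = predictor.inference('street.png')          # path or BGR ndarray
predictor.visualize(res, meta, cfg.class_names, score_thres=0.35)
```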
--------------------------------------------------------------------------------
/tools/test.py:
--------------------------------------------------------------------------------
import os
import torch
import json
import datetime
import argparse

from nanodet.util import mkdir, Logger, cfg, load_config
from nanodet.trainer import build_trainer
from nanodet.data.collate import collate_function
from nanodet.data.dataset import build_dataset
from nanodet.model.arch import build_model
from nanodet.evaluator import build_evaluator


def parse_args():
    parser = argparse.ArgumentParser()
    parser.add_argument('config', help='model config file path')
    parser.add_argument('--task', default='val', help='task to run, test or val')
    parser.add_argument('--save_result', action='store_true', default=True, help='save val results to txt')
    args = parser.parse_args()
    return args


def main(args):
    load_config(cfg, args.config)
    local_rank = -1
    torch.backends.cudnn.enabled = True
    torch.backends.cudnn.benchmark = True
    cfg.defrost()
    timestr = datetime.datetime.now().__format__('%Y%m%d%H%M%S')
    cfg.save_dir = os.path.join(cfg.save_dir, timestr)
    cfg.freeze()
    mkdir(local_rank, cfg.save_dir)
    logger = Logger(local_rank, cfg.save_dir)

    logger.log('Creating model...')
    model = build_model(cfg.model)

    logger.log('Setting up data...')
    val_dataset = build_dataset(cfg.data.val, args.task)
    val_dataloader = torch.utils.data.DataLoader(val_dataset, batch_size=1, shuffle=False, num_workers=1,
                                                 pin_memory=True, collate_fn=collate_function, drop_last=True)
    trainer = build_trainer(local_rank, cfg, model, logger)
    if 'load_model' in cfg.schedule:
        trainer.load_model(cfg)
    evaluator = build_evaluator(cfg, val_dataset)
    logger.log('Starting testing...')
    with torch.no_grad():
        results, val_loss_dict = trainer.run_epoch(0, val_dataloader, mode=args.task)
    if args.task == 'test':
        res_json = evaluator.results2json(results)
        json_path = os.path.join(cfg.save_dir, 'results{}.json'.format(timestr))
        json.dump(res_json, open(json_path, 'w'))
    elif args.task == 'val':
        eval_results = evaluator.evaluate(results, cfg.save_dir, 0, logger, rank=local_rank)
        if args.save_result:
            txt_path = os.path.join(cfg.save_dir, "eval_results{}.txt".format(timestr))
            with open(txt_path, "a") as f:
                for k, v in eval_results.items():
                    f.write("{}: {}\n".format(k, v))


if __name__ == '__main__':
    args = parse_args()
    main(args)
--------------------------------------------------------------------------------
/tools/train.py:
--------------------------------------------------------------------------------
import os
import torch
import logging
import argparse
import numpy as np
import torch.distributed as dist

from nanodet.util import mkdir, Logger, cfg, load_config
from nanodet.trainer import build_trainer
from nanodet.data.collate import collate_function
from nanodet.data.dataset import build_dataset
from nanodet.model.arch import build_model
from nanodet.evaluator import build_evaluator


def parse_args():
    parser = argparse.ArgumentParser()
    parser.add_argument('config', help='train config file path')
    parser.add_argument('--local_rank', default=-1, type=int,
                        help='node rank for distributed training')
    parser.add_argument('--seed', type=int, default=None,
                        help='random seed')
    args = parser.parse_args()
    return args


def init_seeds(seed=0):
    """
    manually set a random seed for numpy, torch and cuda
    :param seed: random seed
    """
    torch.manual_seed(seed)
    np.random.seed(seed)
    torch.cuda.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    if seed == 0:
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False


def main(args):
    load_config(cfg, args.config)
    local_rank = int(args.local_rank)
    torch.backends.cudnn.enabled = True
    torch.backends.cudnn.benchmark = True
    mkdir(local_rank, cfg.save_dir)
    logger = Logger(local_rank, cfg.save_dir)
    if args.seed is not None:
        logger.log('Set random seed to {}'.format(args.seed))
        init_seeds(args.seed)

    logger.log('Creating model...')
    model = build_model(cfg.model)

    logger.log('Setting up data...')
    train_dataset = build_dataset(cfg.data.train, 'train')
    val_dataset = build_dataset(cfg.data.val, 'test')

    if len(cfg.device.gpu_ids) > 1:
        print('rank = ', local_rank)
        num_gpus = torch.cuda.device_count()
        torch.cuda.set_device(local_rank % num_gpus)
        dist.init_process_group(backend='nccl')
        train_sampler = torch.utils.data.distributed.DistributedSampler(train_dataset)
        train_dataloader = torch.utils.data.DataLoader(train_dataset, batch_size=cfg.device.batchsize_per_gpu,
                                                       num_workers=cfg.device.workers_per_gpu, pin_memory=True,
                                                       collate_fn=collate_function, sampler=train_sampler,
                                                       drop_last=True)
    else:
        train_dataloader = torch.utils.data.DataLoader(train_dataset, batch_size=cfg.device.batchsize_per_gpu,
                                                       shuffle=True, num_workers=cfg.device.workers_per_gpu,
                                                       pin_memory=True, collate_fn=collate_function, drop_last=True)

    val_dataloader = torch.utils.data.DataLoader(val_dataset, batch_size=1, shuffle=False, num_workers=1,
                                                 pin_memory=True, collate_fn=collate_function, drop_last=True)

    trainer = build_trainer(local_rank, cfg, model, logger)

    if 'load_model' in cfg.schedule:
        trainer.load_model(cfg)
    if 'resume' in cfg.schedule:
        trainer.resume(cfg)

    evaluator = build_evaluator(cfg, val_dataset)

    logger.log('Starting training...')
    trainer.run(train_dataloader, val_dataloader, evaluator)


if __name__ == '__main__':
    args = parse_args()
    main(args)
--------------------------------------------------------------------------------