├── .gitignore ├── LICENSE ├── README.md ├── experiments ├── coco │ └── resnet │ │ └── res50_256x192_d256x3_adam_lr1e-3_advmix.yaml └── mpii │ ├── hrnet │ └── w32_256x256_adam_lr1e-3_advmix.yaml │ └── resnet │ └── res50_256x256_d256x3_adam_lr1e-3_advmix.yaml ├── figures ├── AdvMix.jpg ├── Qualitative.png ├── benchmarking_results.png └── image_corruption.png ├── lib ├── Makefile ├── config │ ├── __init__.py │ ├── default.py │ └── models.py ├── core │ ├── evaluate.py │ ├── function.py │ ├── inference.py │ └── loss.py ├── dataset │ ├── JointsDataset.py │ ├── __init__.py │ ├── advaug.py │ ├── coco.py │ └── mpii.py ├── models │ ├── Unet_generator.py │ ├── __init__.py │ ├── pose_hrnet.py │ └── pose_resnet.py ├── nms │ ├── __init__.py │ ├── cpu_nms.c │ ├── cpu_nms.pyx │ ├── gpu_nms.cpp │ ├── gpu_nms.cu │ ├── gpu_nms.hpp │ ├── gpu_nms.pyx │ ├── nms.py │ ├── nms_kernel.cu │ └── setup_linux.py └── utils │ ├── __init__.py │ ├── transforms.py │ ├── utils.py │ ├── vis.py │ └── zipreader.py ├── requirements.txt ├── scripts ├── make_datasets.sh ├── test.sh └── train.sh └── tools ├── _init_parse.py ├── _init_paths.py ├── make_datasets.py ├── test_corruption.py └── train.py /.gitignore: -------------------------------------------------------------------------------- 1 | # mypy 2 | .mypy_cache/ 3 | .vscode 4 | *.idea* 5 | 6 | # Byte-compiled / optimized / DLL files 7 | __pycache__/ 8 | *.py[cod] 9 | *$py.class 10 | *.pyc 11 | */__pycache__/ 12 | 13 | 14 | # ckpt 15 | /data/* 16 | /models/* 17 | /output/* 18 | /output_robustness/* 19 | /log/* 20 | # debug 21 | /scripts/debug.sh 22 | /scripts/debug_test.sh 23 | /scripts/make_datasets_local.sh 24 | /experiments/debug.yaml 25 | /tools/make_datasets_local.py 26 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2021 Jiahang Wang, Sheng Jin, Wentao Liu, Weizhong Liu, Chen Qian, Ping Luo 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # AdvMix 2 | Official code for our CVPR 2021 paper: ["When Human Pose Estimation Meets Robustness: Adversarial Algorithms and Benchmarks"](https://arxiv.org/abs/2105.06152). 
3 | 4 | ## Getting started 5 | * Installation 6 | ``` 7 | # clone this repo 8 | git clone https://github.com/AIprogrammer/AdvMix 9 | # install dependencies 10 | cd AdvMix 11 | pip install -r requirements.txt 12 | # make nms 13 | cd lib 14 | make 15 | # install cocoapi 16 | # COCOAPI=/path/to/clone/cocoapi 17 | git clone https://github.com/cocodataset/cocoapi.git $COCOAPI 18 | cd $COCOAPI/PythonAPI 19 | # Install into global site-packages 20 | make install 21 | # Alternatively, if you do not have permissions or prefer 22 | # not to install the COCO API into global site-packages 23 | python3 setup.py install --user 24 | ``` 25 | 26 | * Download the datasets [COCO](https://cocodataset.org/), [MPII](http://human-pose.mpi-inf.mpg.de/), and [OCHuman](https://github.com/liruilong940607/OCHumanApi). Put them under "./data". The directory structure follows [HRNet](https://github.com/leoxiaobin/deep-high-resolution-net.pytorch). 27 | 28 | ## Benchmarking 29 | ### Construct benchmarking datasets 30 | ``` 31 | sh scripts/make_datasets.sh 32 | ``` 33 | ### Visualization examples 34 | ![benchmark_dataset](./figures/image_corruption.png) 35 | ### Benchmark results 36 | ![benchmark_results](./figures/benchmarking_results.png) 37 | 38 | **Note: There may be a small gap between the results obtained via [Evaluation](#Evaluation) and the results reported in our paper, due to the randomness of operations in the 'imagecorruptions' package.** 39 | 40 | ## AdvMix 41 | ![AdvMix](./figures/AdvMix.jpg) 42 | ### Training 43 | 44 | * MPII 45 | ``` 46 | sh scripts/train.sh mpii 47 | ``` 48 | * COCO 49 | ``` 50 | sh scripts/train.sh coco 51 | ``` 52 | 53 | ### Evaluation 54 | ``` 55 | sh scripts/test.sh coco 56 | sh scripts/test.sh mpii 57 | ``` 58 | 59 | ### Quantitative results 60 | | Method | Arch | Input size | AP* | mPC | rPC | 61 | |----------|--------------------|------------|--------|--------|-------| 62 | | Standard | ResNet_50 | 256x192 | 70.4 | 47.8 | 67.9 | 63 | | AdvMix | ResNet_50 | 256x192 | 70.1 | **50.1** | **71.5** | 64 | | Standard | ResNet_101 | 256x192 | 71.4 | 49.6 | 69.5 | 65 | | AdvMix | ResNet_101 | 256x192 | 71.3 | **52.3** | **73.3** | 66 | | Standard | ResNet_152 | 256x192 | 72.0 | 50.9 | 70.7 | 67 | | AdvMix | ResNet_152 | 256x192 | 72.3 | **53.2** | **73.6** | 68 | | Standard | HRNet_W32 | 256x192 | 74.4 | 53.0 | 71.3 | 69 | | AdvMix | HRNet_W32 | 256x192 | 74.7 | **55.5** | **74.3** | 70 | | Standard | HRNet_W48 | 256x192 | 75.1 | 53.7 | 71.6 | 71 | | AdvMix | HRNet_W48 | 256x192 | 75.4 | **57.1** | **75.7** | 72 | | Standard | HrHRNet_W32 | 512x512 | 67.1 | 39.9 | 59.4 | 73 | | AdvMix | HrHRNet_W32 | 512x512 | 68.3 | **45.4** | **66.5** | 74 | 75 | 76 | Comparisons between standard training and AdvMix on COCO-C. For top-down approaches, results are obtained with the detected bounding boxes of [HRNet](https://github.com/leoxiaobin/deep-high-resolution-net.pytorch/). We see that mPC and rPC are greatly improved, while the clean performance AP* is preserved. (A short sketch of how mPC and rPC are computed from per-corruption results is given below, after the visualization results.) 77 | 78 | ### Visualization results 79 | ![AdvMix](./figures/Qualitative.png) 80 | Qualitative comparisons between HRNet without and with AdvMix. For each image triplet, the images from left to right are the ground truth, the predictions of the standard HRNet-W32, and the predictions of HRNet-W32 trained with AdvMix.
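### Computing mPC and rPC

For reference, mPC is the mean AP over all corruption types and severity levels of the corrupted benchmark, and rPC expresses mPC as a percentage of the clean AP*. The snippet below is only an illustrative sketch of that bookkeeping (it is not a script shipped with this repository); the per-corruption AP values are assumed to come from running [Evaluation](#Evaluation) on each corrupted split.

```python
import numpy as np

def robustness_metrics(ap_clean, ap_corrupted):
    # ap_clean: AP* on the clean validation set.
    # ap_corrupted: dict mapping corruption name -> {severity (1..5) -> AP on that split}.
    per_corruption = [np.mean(list(sev_to_ap.values()))
                      for sev_to_ap in ap_corrupted.values()]
    mpc = float(np.mean(per_corruption))   # mean Performance under Corruption
    rpc = 100.0 * mpc / ap_clean           # relative PC, as a percentage of AP*
    return mpc, rpc
```

For example, the standard ResNet_50 row above satisfies 100 × 47.8 / 70.4 ≈ 67.9.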
81 | 82 | # Citations 83 | If you find our work useful in your research, please consider citing: 84 | ``` 85 | @article{wang2021human, 86 | title={When Human Pose Estimation Meets Robustness: Adversarial Algorithms and Benchmarks}, 87 | author={Wang, Jiahang and Jin, Sheng and Liu, Wentao and Liu, Weizhong and Qian, Chen and Luo, Ping}, 88 | journal={arXiv preprint arXiv:2105.06152}, 89 | year={2021} 90 | } 91 | ``` 92 | 93 | # License 94 | Our research code is released under the MIT license. See [LICENSE](./LICENSE) for details. 95 | 96 | # Acknowledgments 97 | Thanks for open-source code [HRNet](https://github.com/leoxiaobin/deep-high-resolution-net.pytorch/). 98 | 99 | 100 | -------------------------------------------------------------------------------- /experiments/coco/resnet/res50_256x192_d256x3_adam_lr1e-3_advmix.yaml: -------------------------------------------------------------------------------- 1 | AUTO_RESUME: true 2 | CUDNN: 3 | BENCHMARK: true 4 | DETERMINISTIC: false 5 | ENABLED: true 6 | DATA_DIR: '' 7 | GPUS: (0,1,2,3,4,5,6,7) 8 | OUTPUT_DIR: 'output' 9 | LOG_DIR: 'log' 10 | WORKERS: 24 11 | PRINT_FREQ: 100 12 | 13 | DATASET: 14 | COLOR_RGB: false 15 | DATASET: 'coco' 16 | ROOT: 'data/coco/' 17 | TEST_SET: 'val2017' 18 | TRAIN_SET: 'train2017' 19 | FLIP: true 20 | ROT_FACTOR: 40 21 | SCALE_FACTOR: 0.3 22 | MODEL: 23 | NAME: 'pose_resnet' 24 | PRETRAINED: 'models/pytorch/imagenet/resnet50-19c8e357.pth' 25 | IMAGE_SIZE: 26 | - 192 27 | - 256 28 | HEATMAP_SIZE: 29 | - 48 30 | - 64 31 | SIGMA: 2 32 | NUM_JOINTS: 17 33 | TARGET_TYPE: 'gaussian' 34 | EXTRA: 35 | FINAL_CONV_KERNEL: 1 36 | DECONV_WITH_BIAS: false 37 | NUM_DECONV_LAYERS: 3 38 | NUM_DECONV_FILTERS: 39 | - 256 40 | - 256 41 | - 256 42 | NUM_DECONV_KERNELS: 43 | - 4 44 | - 4 45 | - 4 46 | NUM_LAYERS: 50 47 | LOSS: 48 | USE_TARGET_WEIGHT: true 49 | TRAIN: 50 | BATCH_SIZE_PER_GPU: 32 51 | SHUFFLE: true 52 | BEGIN_EPOCH: 0 53 | END_EPOCH: 140 54 | OPTIMIZER: 'adam' 55 | LR: 0.001 56 | LR_FACTOR: 0.1 57 | LR_STEP: 58 | - 90 59 | - 120 60 | WD: 0.0001 61 | GAMMA1: 0.99 62 | GAMMA2: 0.0 63 | MOMENTUM: 0.9 64 | NESTEROV: false 65 | TEST: 66 | BATCH_SIZE_PER_GPU: 128 67 | COCO_BBOX_FILE: 'data/coco/person_detection_results/COCO_val2017_detections_AP_H_56_person.json' 68 | BBOX_THRE: 1.0 69 | IMAGE_THRE: 0.0 70 | IN_VIS_THRE: 0.2 71 | MODEL_FILE: '' 72 | NMS_THRE: 1.0 73 | OKS_THRE: 0.9 74 | FLIP_TEST: true 75 | POST_PROCESS: true 76 | SHIFT_HEATMAP: true 77 | USE_GT_BBOX: true 78 | DEBUG: 79 | DEBUG: true 80 | SAVE_BATCH_IMAGES_GT: true 81 | SAVE_BATCH_IMAGES_PRED: true 82 | SAVE_HEATMAPS_GT: true 83 | SAVE_HEATMAPS_PRED: true 84 | -------------------------------------------------------------------------------- /experiments/mpii/hrnet/w32_256x256_adam_lr1e-3_advmix.yaml: -------------------------------------------------------------------------------- 1 | AUTO_RESUME: true 2 | CUDNN: 3 | BENCHMARK: true 4 | DETERMINISTIC: false 5 | ENABLED: true 6 | DATA_DIR: '' 7 | GPUS: (0,1,2,3,4,5,6,7) 8 | OUTPUT_DIR: 'output' 9 | LOG_DIR: 'log' 10 | WORKERS: 24 11 | PRINT_FREQ: 100 12 | 13 | DATASET: 14 | COLOR_RGB: true 15 | DATASET: mpii 16 | DATA_FORMAT: jpg 17 | FLIP: true 18 | NUM_JOINTS_HALF_BODY: 8 19 | PROB_HALF_BODY: -1.0 20 | ROOT: 'data/mpii/' 21 | ROT_FACTOR: 30 22 | SCALE_FACTOR: 0.25 23 | TEST_SET: valid 24 | TRAIN_SET: train 25 | MODEL: 26 | INIT_WEIGHTS: true 27 | NAME: pose_hrnet 28 | NUM_JOINTS: 16 29 | PRETRAINED: 'models/pytorch/imagenet/hrnet_w32-36af842e.pth' 30 | TARGET_TYPE: gaussian 31 | IMAGE_SIZE: 32 | - 256 33 
| - 256 34 | HEATMAP_SIZE: 35 | - 64 36 | - 64 37 | SIGMA: 2 38 | EXTRA: 39 | PRETRAINED_LAYERS: 40 | - 'conv1' 41 | - 'bn1' 42 | - 'conv2' 43 | - 'bn2' 44 | - 'layer1' 45 | - 'transition1' 46 | - 'stage2' 47 | - 'transition2' 48 | - 'stage3' 49 | - 'transition3' 50 | - 'stage4' 51 | FINAL_CONV_KERNEL: 1 52 | STAGE2: 53 | NUM_MODULES: 1 54 | NUM_BRANCHES: 2 55 | BLOCK: BASIC 56 | NUM_BLOCKS: 57 | - 4 58 | - 4 59 | NUM_CHANNELS: 60 | - 32 61 | - 64 62 | FUSE_METHOD: SUM 63 | STAGE3: 64 | NUM_MODULES: 4 65 | NUM_BRANCHES: 3 66 | BLOCK: BASIC 67 | NUM_BLOCKS: 68 | - 4 69 | - 4 70 | - 4 71 | NUM_CHANNELS: 72 | - 32 73 | - 64 74 | - 128 75 | FUSE_METHOD: SUM 76 | STAGE4: 77 | NUM_MODULES: 3 78 | NUM_BRANCHES: 4 79 | BLOCK: BASIC 80 | NUM_BLOCKS: 81 | - 4 82 | - 4 83 | - 4 84 | - 4 85 | NUM_CHANNELS: 86 | - 32 87 | - 64 88 | - 128 89 | - 256 90 | FUSE_METHOD: SUM 91 | LOSS: 92 | USE_TARGET_WEIGHT: true 93 | TRAIN: 94 | BATCH_SIZE_PER_GPU: 32 95 | SHUFFLE: true 96 | BEGIN_EPOCH: 0 97 | END_EPOCH: 210 98 | OPTIMIZER: adam 99 | LR: 0.001 100 | LR_FACTOR: 0.1 101 | LR_STEP: 102 | - 170 103 | - 200 104 | WD: 0.0001 105 | GAMMA1: 0.99 106 | GAMMA2: 0.0 107 | MOMENTUM: 0.9 108 | NESTEROV: false 109 | TEST: 110 | BATCH_SIZE_PER_GPU: 128 111 | MODEL_FILE: '' 112 | FLIP_TEST: true 113 | POST_PROCESS: true 114 | SHIFT_HEATMAP: true 115 | DEBUG: 116 | DEBUG: true 117 | SAVE_BATCH_IMAGES_GT: true 118 | SAVE_BATCH_IMAGES_PRED: true 119 | SAVE_HEATMAPS_GT: true 120 | SAVE_HEATMAPS_PRED: true 121 | -------------------------------------------------------------------------------- /experiments/mpii/resnet/res50_256x256_d256x3_adam_lr1e-3_advmix.yaml: -------------------------------------------------------------------------------- 1 | AUTO_RESUME: true 2 | CUDNN: 3 | BENCHMARK: true 4 | DETERMINISTIC: false 5 | ENABLED: true 6 | DATA_DIR: '' 7 | GPUS: (0,1,2,3,4,5,6,7) 8 | OUTPUT_DIR: 'output' 9 | LOG_DIR: 'log' 10 | WORKERS: 24 11 | PRINT_FREQ: 100 12 | 13 | DATASET: 14 | COLOR_RGB: false 15 | DATASET: mpii 16 | DATA_FORMAT: jpg 17 | FLIP: true 18 | NUM_JOINTS_HALF_BODY: 8 19 | PROB_HALF_BODY: -1.0 20 | ROOT: 'data/mpii/' 21 | ROT_FACTOR: 30 22 | SCALE_FACTOR: 0.25 23 | TEST_SET: valid 24 | TRAIN_SET: train 25 | MODEL: 26 | NAME: 'pose_resnet' 27 | PRETRAINED: 'models/pytorch/imagenet/resnet50-19c8e357.pth' 28 | IMAGE_SIZE: 29 | - 256 30 | - 256 31 | HEATMAP_SIZE: 32 | - 64 33 | - 64 34 | SIGMA: 2 35 | NUM_JOINTS: 16 36 | TARGET_TYPE: 'gaussian' 37 | EXTRA: 38 | FINAL_CONV_KERNEL: 1 39 | DECONV_WITH_BIAS: false 40 | NUM_DECONV_LAYERS: 3 41 | NUM_DECONV_FILTERS: 42 | - 256 43 | - 256 44 | - 256 45 | NUM_DECONV_KERNELS: 46 | - 4 47 | - 4 48 | - 4 49 | NUM_LAYERS: 50 50 | LOSS: 51 | USE_TARGET_WEIGHT: true 52 | TRAIN: 53 | BATCH_SIZE_PER_GPU: 32 54 | SHUFFLE: true 55 | BEGIN_EPOCH: 0 56 | END_EPOCH: 140 57 | OPTIMIZER: 'adam' 58 | LR: 0.001 59 | LR_FACTOR: 0.1 60 | LR_STEP: 61 | - 90 62 | - 120 63 | WD: 0.0001 64 | GAMMA1: 0.99 65 | GAMMA2: 0.0 66 | MOMENTUM: 0.9 67 | NESTEROV: false 68 | TEST: 69 | BATCH_SIZE_PER_GPU: 32 70 | COCO_BBOX_FILE: 'data/coco/person_detection_results/COCO_val2017_detections_AP_H_56_person.json' 71 | BBOX_THRE: 1.0 72 | IMAGE_THRE: 0.0 73 | IN_VIS_THRE: 0.2 74 | MODEL_FILE: '' 75 | NMS_THRE: 1.0 76 | OKS_THRE: 0.9 77 | FLIP_TEST: true 78 | POST_PROCESS: true 79 | SHIFT_HEATMAP: true 80 | USE_GT_BBOX: true 81 | DEBUG: 82 | DEBUG: true 83 | SAVE_BATCH_IMAGES_GT: true 84 | SAVE_BATCH_IMAGES_PRED: true 85 | SAVE_HEATMAPS_GT: true 86 | SAVE_HEATMAPS_PRED: true 87 | 
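The experiment YAMLs above only override fields that differ from the defaults defined in lib/config/default.py; every other option (e.g. PIN_MEMORY or LOSS.USE_OHKM) keeps its default value. Below is a minimal sketch of how such a file is merged into the yacs config, assuming lib/ is on PYTHONPATH and using a hypothetical argument namespace that mirrors the fields read by update_config further down in this repository:

```python
from types import SimpleNamespace

from config import cfg, update_config  # re-exported by lib/config/__init__.py

# Hypothetical stand-in for the argparse namespace built in tools/train.py.
args = SimpleNamespace(
    cfg='experiments/mpii/resnet/res50_256x256_d256x3_adam_lr1e-3_advmix.yaml',
    opts=[],               # extra 'KEY VALUE' overrides, e.g. ['TRAIN.END_EPOCH', '10']
    modelDir='', logDir='', dataDir='',
    corruption_type='', severity=0, test_robust=False,
)

update_config(cfg, args)   # merge_from_file + merge_from_list, then freeze
print(cfg.MODEL.NAME, cfg.TRAIN.LR, cfg.DATASET.ROOT)
```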
-------------------------------------------------------------------------------- /figures/AdvMix.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/AIprogrammer/AdvMix/f619fa279d9419eb452d228762c3872691e42e7d/figures/AdvMix.jpg -------------------------------------------------------------------------------- /figures/Qualitative.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/AIprogrammer/AdvMix/f619fa279d9419eb452d228762c3872691e42e7d/figures/Qualitative.png -------------------------------------------------------------------------------- /figures/benchmarking_results.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/AIprogrammer/AdvMix/f619fa279d9419eb452d228762c3872691e42e7d/figures/benchmarking_results.png -------------------------------------------------------------------------------- /figures/image_corruption.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/AIprogrammer/AdvMix/f619fa279d9419eb452d228762c3872691e42e7d/figures/image_corruption.png -------------------------------------------------------------------------------- /lib/Makefile: -------------------------------------------------------------------------------- 1 | all: 2 | cd nms; python setup_linux.py build_ext --inplace; rm -rf build; cd ../../ 3 | clean: 4 | cd nms; rm *.so; cd ../../ 5 | -------------------------------------------------------------------------------- /lib/config/__init__.py: -------------------------------------------------------------------------------- 1 | # ------------------------------------------------------------------------------ 2 | # Copyright (c) Microsoft 3 | # Licensed under the MIT License. 4 | # Written by Bin Xiao (Bin.Xiao@microsoft.com) 5 | # ------------------------------------------------------------------------------ 6 | 7 | from .default import _C as cfg 8 | from .default import update_config 9 | from .models import MODEL_EXTRAS 10 | -------------------------------------------------------------------------------- /lib/config/default.py: -------------------------------------------------------------------------------- 1 | 2 | # ------------------------------------------------------------------------------ 3 | # Copyright (c) Microsoft 4 | # Licensed under the MIT License. 
5 | # Written by Bin Xiao (Bin.Xiao@microsoft.com) 6 | # ------------------------------------------------------------------------------ 7 | 8 | from __future__ import absolute_import 9 | from __future__ import division 10 | from __future__ import print_function 11 | 12 | import os 13 | 14 | from yacs.config import CfgNode as CN 15 | 16 | 17 | _C = CN() 18 | 19 | _C.OUTPUT_DIR = '' 20 | _C.LOG_DIR = '' 21 | _C.DATA_DIR = '' 22 | _C.GPUS = (0,) 23 | _C.WORKERS = 4 24 | _C.PRINT_FREQ = 20 25 | _C.AUTO_RESUME = False 26 | _C.PIN_MEMORY = True 27 | _C.RANK = 0 28 | 29 | # Cudnn related params 30 | _C.CUDNN = CN() 31 | _C.CUDNN.BENCHMARK = True 32 | _C.CUDNN.DETERMINISTIC = False 33 | _C.CUDNN.ENABLED = True 34 | 35 | # common params for NETWORK 36 | _C.MODEL = CN() 37 | _C.MODEL.NAME = 'pose_hrnet' 38 | _C.MODEL.INIT_WEIGHTS = True 39 | _C.MODEL.PRETRAINED = '' 40 | _C.MODEL.NUM_JOINTS = 17 41 | _C.MODEL.TAG_PER_JOINT = True 42 | _C.MODEL.TARGET_TYPE = 'gaussian' 43 | _C.MODEL.IMAGE_SIZE = [256, 256] 44 | _C.MODEL.HEATMAP_SIZE = [64, 64] 45 | _C.MODEL.SIGMA = 2 46 | _C.MODEL.EXTRA = CN(new_allowed=True) 47 | 48 | _C.LOSS = CN() 49 | _C.LOSS.USE_OHKM = False 50 | _C.LOSS.TOPK = 8 51 | _C.LOSS.USE_TARGET_WEIGHT = True 52 | _C.LOSS.USE_DIFFERENT_JOINTS_WEIGHT = False 53 | 54 | # DATASET related params 55 | _C.DATASET = CN() 56 | _C.DATASET.ROOT = '' 57 | _C.DATASET.ROOT_C = '' 58 | _C.DATASET.DATASET = 'mpii' 59 | _C.DATASET.TRAIN_SET = 'train' 60 | _C.DATASET.TEST_SET = 'valid' 61 | _C.DATASET.DATA_FORMAT = 'jpg' 62 | _C.DATASET.HYBRID_JOINTS_TYPE = '' 63 | _C.DATASET.SELECT_DATA = False 64 | 65 | # training data augmentation 66 | _C.DATASET.FLIP = True 67 | _C.DATASET.SCALE_FACTOR = 0.25 68 | _C.DATASET.ROT_FACTOR = 30 69 | _C.DATASET.PROB_HALF_BODY = 0.0 70 | _C.DATASET.NUM_JOINTS_HALF_BODY = 8 71 | _C.DATASET.COLOR_RGB = False 72 | # debug mini_coco 73 | _C.DATASET.MINI_COCO = False 74 | # VAL MASK 75 | _C.DATASET.VAL_FG = False 76 | _C.DATASET.VAL_MASK = False 77 | _C.DATASET.VAL_PARSING = False 78 | # train 79 | _C.TRAIN = CN() 80 | 81 | _C.TRAIN.LR_FACTOR = 0.1 82 | _C.TRAIN.LR_STEP = [90, 110] 83 | _C.TRAIN.LR = 0.001 84 | 85 | _C.TRAIN.OPTIMIZER = 'adam' 86 | _C.TRAIN.MOMENTUM = 0.9 87 | _C.TRAIN.WD = 0.0001 88 | _C.TRAIN.NESTEROV = False 89 | _C.TRAIN.GAMMA1 = 0.99 90 | _C.TRAIN.GAMMA2 = 0.0 91 | 92 | _C.TRAIN.BEGIN_EPOCH = 0 93 | _C.TRAIN.END_EPOCH = 140 94 | 95 | _C.TRAIN.RESUME = False 96 | _C.TRAIN.CHECKPOINT = '' 97 | 98 | _C.TRAIN.BATCH_SIZE_PER_GPU = 32 99 | _C.TRAIN.SHUFFLE = True 100 | 101 | # testing 102 | _C.TEST = CN() 103 | 104 | # size of images for each device 105 | _C.TEST.BATCH_SIZE_PER_GPU = 32 106 | # Test Model Epoch 107 | _C.TEST.FLIP_TEST = False 108 | _C.TEST.POST_PROCESS = False 109 | _C.TEST.SHIFT_HEATMAP = False 110 | 111 | _C.TEST.USE_GT_BBOX = False 112 | 113 | # test robustness 114 | _C.TEST.TEST_ROBUST = False 115 | _C.TEST.CORRUPTION_TYPE = '' 116 | 117 | 118 | # nms 119 | _C.TEST.IMAGE_THRE = 0.1 120 | _C.TEST.NMS_THRE = 0.6 121 | _C.TEST.SOFT_NMS = False 122 | _C.TEST.OKS_THRE = 0.5 123 | _C.TEST.IN_VIS_THRE = 0.0 124 | _C.TEST.COCO_BBOX_FILE = '' 125 | _C.TEST.BBOX_THRE = 1.0 126 | _C.TEST.MODEL_FILE = '' 127 | _C.TEST.MASK_FILE = '' 128 | 129 | # soft_argmax 130 | _C.TEST.SOFT_ARGMAX = False 131 | _C.TEST.BIAS = 0.0 132 | 133 | # debug 134 | _C.DEBUG = CN() 135 | _C.DEBUG.DEBUG = False 136 | _C.DEBUG.SAVE_BATCH_IMAGES_GT = False 137 | _C.DEBUG.SAVE_BATCH_IMAGES_PRED = False 138 | _C.DEBUG.SAVE_HEATMAPS_GT = False 139 | _C.DEBUG.SAVE_HEATMAPS_PRED = 
False 140 | 141 | 142 | # update config by args 143 | def update_config(cfg, args): 144 | cfg.defrost() 145 | # merge and update args 146 | cfg.merge_from_file(args.cfg) 147 | cfg.merge_from_list(args.opts) 148 | 149 | if args.modelDir: 150 | cfg.OUTPUT_DIR = args.modelDir 151 | 152 | if args.logDir: 153 | cfg.LOG_DIR = args.logDir 154 | 155 | if args.dataDir: 156 | cfg.DATA_DIR = args.dataDir 157 | 158 | if args.corruption_type: 159 | cfg.TEST.CORRUPTION_TYPE = args.corruption_type 160 | 161 | cfg.TEST.SEVERITY = args.severity 162 | cfg.TEST.TEST_ROBUST = args.test_robust 163 | 164 | cfg.DATASET.ROOT = os.path.join( 165 | cfg.DATA_DIR, cfg.DATASET.ROOT 166 | ) 167 | 168 | if cfg.DATASET.DATASET == 'coco': 169 | cfg.DATASET.ROOT_C = 'data/coco-C' 170 | else: 171 | cfg.DATASET.ROOT_C = 'data/mpii-C' 172 | 173 | cfg.DATASET.ROOT_C = os.path.join( 174 | cfg.DATA_DIR, cfg.DATASET.ROOT_C 175 | ) 176 | cfg.MODEL.PRETRAINED = os.path.join( 177 | cfg.DATA_DIR, cfg.MODEL.PRETRAINED 178 | ) 179 | if cfg.TEST.MODEL_FILE: 180 | cfg.TEST.MODEL_FILE = os.path.join( 181 | cfg.DATA_DIR, cfg.TEST.MODEL_FILE 182 | ) 183 | 184 | cfg.freeze() 185 | 186 | 187 | if __name__ == '__main__': 188 | import sys 189 | with open(sys.argv[1], 'w') as f: 190 | print(_C, file=f) 191 | 192 | -------------------------------------------------------------------------------- /lib/config/models.py: -------------------------------------------------------------------------------- 1 | # ------------------------------------------------------------------------------ 2 | # Copyright (c) Microsoft 3 | # Licensed under the MIT License. 4 | # Written by Bin Xiao (Bin.Xiao@microsoft.com) 5 | # ------------------------------------------------------------------------------ 6 | 7 | from __future__ import absolute_import 8 | from __future__ import division 9 | from __future__ import print_function 10 | 11 | from yacs.config import CfgNode as CN 12 | 13 | 14 | # pose_resnet related params 15 | POSE_RESNET = CN() 16 | POSE_RESNET.NUM_LAYERS = 50 17 | POSE_RESNET.DECONV_WITH_BIAS = False 18 | POSE_RESNET.NUM_DECONV_LAYERS = 3 19 | POSE_RESNET.NUM_DECONV_FILTERS = [256, 256, 256] 20 | POSE_RESNET.NUM_DECONV_KERNELS = [4, 4, 4] 21 | POSE_RESNET.FINAL_CONV_KERNEL = 1 22 | POSE_RESNET.PRETRAINED_LAYERS = ['*'] 23 | 24 | # pose_multi_resoluton_net related params 25 | POSE_HIGH_RESOLUTION_NET = CN() 26 | POSE_HIGH_RESOLUTION_NET.PRETRAINED_LAYERS = ['*'] 27 | POSE_HIGH_RESOLUTION_NET.STEM_INPLANES = 64 28 | POSE_HIGH_RESOLUTION_NET.FINAL_CONV_KERNEL = 1 29 | 30 | POSE_HIGH_RESOLUTION_NET.STAGE2 = CN() 31 | POSE_HIGH_RESOLUTION_NET.STAGE2.NUM_MODULES = 1 32 | POSE_HIGH_RESOLUTION_NET.STAGE2.NUM_BRANCHES = 2 33 | POSE_HIGH_RESOLUTION_NET.STAGE2.NUM_BLOCKS = [4, 4] 34 | POSE_HIGH_RESOLUTION_NET.STAGE2.NUM_CHANNELS = [32, 64] 35 | POSE_HIGH_RESOLUTION_NET.STAGE2.BLOCK = 'BASIC' 36 | POSE_HIGH_RESOLUTION_NET.STAGE2.FUSE_METHOD = 'SUM' 37 | 38 | POSE_HIGH_RESOLUTION_NET.STAGE3 = CN() 39 | POSE_HIGH_RESOLUTION_NET.STAGE3.NUM_MODULES = 1 40 | POSE_HIGH_RESOLUTION_NET.STAGE3.NUM_BRANCHES = 3 41 | POSE_HIGH_RESOLUTION_NET.STAGE3.NUM_BLOCKS = [4, 4, 4] 42 | POSE_HIGH_RESOLUTION_NET.STAGE3.NUM_CHANNELS = [32, 64, 128] 43 | POSE_HIGH_RESOLUTION_NET.STAGE3.BLOCK = 'BASIC' 44 | POSE_HIGH_RESOLUTION_NET.STAGE3.FUSE_METHOD = 'SUM' 45 | 46 | POSE_HIGH_RESOLUTION_NET.STAGE4 = CN() 47 | POSE_HIGH_RESOLUTION_NET.STAGE4.NUM_MODULES = 1 48 | POSE_HIGH_RESOLUTION_NET.STAGE4.NUM_BRANCHES = 4 49 | POSE_HIGH_RESOLUTION_NET.STAGE4.NUM_BLOCKS = [4, 4, 4, 4] 50 | 
POSE_HIGH_RESOLUTION_NET.STAGE4.NUM_CHANNELS = [32, 64, 128, 256] 51 | POSE_HIGH_RESOLUTION_NET.STAGE4.BLOCK = 'BASIC' 52 | POSE_HIGH_RESOLUTION_NET.STAGE4.FUSE_METHOD = 'SUM' 53 | 54 | 55 | MODEL_EXTRAS = { 56 | 'pose_resnet': POSE_RESNET, 57 | 'pose_high_resolution_net': POSE_HIGH_RESOLUTION_NET, 58 | } 59 | -------------------------------------------------------------------------------- /lib/core/evaluate.py: -------------------------------------------------------------------------------- 1 | # ------------------------------------------------------------------------------ 2 | # Copyright (c) Microsoft 3 | # Licensed under the MIT License. 4 | # Written by Bin Xiao (Bin.Xiao@microsoft.com) 5 | # ------------------------------------------------------------------------------ 6 | 7 | from __future__ import absolute_import 8 | from __future__ import division 9 | from __future__ import print_function 10 | 11 | import numpy as np 12 | 13 | from core.inference import get_max_preds 14 | from utils.transforms import flip_back, tofloat, coord_norm, inv_coord_norm, _tocopy, _tocuda 15 | 16 | def calc_dists(preds, target, normalize): 17 | preds = preds.astype(np.float32) 18 | target = target.astype(np.float32) 19 | dists = np.zeros((preds.shape[1], preds.shape[0])) 20 | for n in range(preds.shape[0]): 21 | for c in range(preds.shape[1]): 22 | if target[n, c, 0] > 1 and target[n, c, 1] > 1: 23 | normed_preds = preds[n, c, :] / normalize[n] 24 | normed_targets = target[n, c, :] / normalize[n] 25 | dists[c, n] = np.linalg.norm(normed_preds - normed_targets) 26 | else: 27 | dists[c, n] = -1 28 | return dists 29 | 30 | 31 | def dist_acc(dists, thr=0.5): 32 | ''' Return percentage below threshold while ignoring values with a -1 ''' 33 | dist_cal = np.not_equal(dists, -1) 34 | num_dist_cal = dist_cal.sum() 35 | if num_dist_cal > 0: 36 | return np.less(dists[dist_cal], thr).sum() * 1.0 / num_dist_cal 37 | else: 38 | return -1 39 | 40 | 41 | def accuracy(outputs, target, hm_type='gaussian', thr=0.5, args=None, cfg=None): 42 | ''' 43 | Calculate accuracy according to PCK, 44 | but uses ground truth heatmap rather than x,y locations 45 | First value to be returned is average accuracy across 'idxs', 46 | followed by individual accuracies 47 | ''' 48 | if isinstance(outputs, list): 49 | for index in range(len(outputs)): 50 | outputs[index] = outputs[index].clone().detach().cpu().numpy() 51 | idx = list(range(outputs[-1].shape[1])) 52 | 53 | else: 54 | outputs = outputs.clone().detach().cpu().numpy() 55 | idx = list(range(outputs.shape[1])) 56 | 57 | if isinstance(target, list): 58 | for index in range(len(target)): 59 | target[index] = target[index].clone().detach().cpu().numpy() 60 | idx = list(range(target[-1].shape[1])) 61 | 62 | else: 63 | target = target.clone().detach().cpu().numpy() 64 | idx = list(range(target.shape[1])) 65 | 66 | norm = 1.0 67 | 68 | if hm_type == 'gaussian' and args is None: 69 | pred, _ = get_max_preds(outputs) 70 | target, _ = get_max_preds(target) 71 | h = outputs.shape[2] # y 72 | w = outputs.shape[3] # x 73 | 74 | else: 75 | assert outputs[0].ndim == 3, 'the output coord must be 3 dims' 76 | pred = outputs[0] 77 | pred = inv_coord_norm(pred, cfg, args).clone().detach().cpu().numpy() 78 | target, _ = get_max_preds(target[-1]) 79 | h = outputs[-1].shape[2] 80 | w = outputs[-1].shape[3] 81 | 82 | norm = np.ones((pred.shape[0], 2)) * np.array([h, w]) / 10 83 | 84 | dists = calc_dists(pred, target, norm) 85 | 86 | acc = np.zeros((len(idx) + 1)) 87 | avg_acc = 0 88 | cnt = 0 89 | 90 | 
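# PCK accumulation: acc[i + 1] holds the per-joint accuracy and acc[0] the mean over joints;
# dist_acc() returns -1 for joints without a valid ground-truth annotation, and those joints
# are excluded from the average via the cnt counter.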
for i in range(len(idx)): 91 | acc[i + 1] = dist_acc(dists[idx[i]]) 92 | if acc[i + 1] >= 0: 93 | avg_acc = avg_acc + acc[i + 1] 94 | cnt += 1 95 | 96 | avg_acc = avg_acc / cnt if cnt != 0 else 0 97 | if cnt != 0: 98 | acc[0] = avg_acc 99 | return acc, avg_acc, cnt, pred 100 | 101 | 102 | -------------------------------------------------------------------------------- /lib/core/function.py: -------------------------------------------------------------------------------- 1 | # ------------------------------------------------------------------------------ 2 | # Copyright (c) Microsoft 3 | # Licensed under the MIT License. 4 | # Written by Bin Xiao (Bin.Xiao@microsoft.com) 5 | # ------------------------------------------------------------------------------ 6 | 7 | from __future__ import absolute_import 8 | from __future__ import division 9 | from __future__ import print_function 10 | 11 | import time 12 | import logging 13 | import os, copy 14 | 15 | import numpy as np 16 | import torch 17 | 18 | from core.evaluate import accuracy 19 | from core.inference import get_final_preds, get_final_preds_using_softargmax, SoftArgmax2D 20 | from utils.transforms import flip_back, tofloat, coord_norm, inv_coord_norm, _tocopy, _tocuda 21 | from utils.vis import save_debug_images 22 | import torch.nn as nn 23 | from tqdm import tqdm 24 | import torch.nn.functional as F 25 | 26 | 27 | logger = logging.getLogger(__name__) 28 | 29 | 30 | def train(config, args, train_loader, model, criterion, optimizer, epoch, 31 | output_dir, tb_log_dir, writer_dict): 32 | batch_time = AverageMeter() 33 | data_time = AverageMeter() 34 | losses = AverageMeter() 35 | acc = AverageMeter() 36 | 37 | if isinstance(model, list): 38 | model = model[0].train() 39 | model_D = model[1].train() 40 | else: 41 | model.train() 42 | 43 | end = time.time() 44 | for i, (input, target, target_weight, meta) in tqdm(enumerate(train_loader)): 45 | 46 | data_time.update(time.time() - end) 47 | 48 | outputs = model(input) 49 | 50 | target = target[0].cuda(non_blocking=True) 51 | target_hm = target 52 | target_weight = target_weight.cuda(non_blocking=True) 53 | 54 | loss = criterion(outputs, target, target_weight) 55 | 56 | # compute gradient and do update step 57 | optimizer.zero_grad() 58 | loss.backward() 59 | optimizer.step() 60 | 61 | # measure accuracy and record loss 62 | losses.update(loss.item(), input.size(0)) 63 | _, avg_acc, cnt, pred = accuracy(outputs, 64 | target, args=None, cfg=config) 65 | 66 | outputs = _tocuda(outputs) 67 | 68 | acc.update(avg_acc, cnt) 69 | 70 | 71 | batch_time.update(time.time() - end) 72 | end = time.time() 73 | 74 | if i % config.PRINT_FREQ == 0: 75 | msg = 'Epoch: [{0}][{1}/{2}]\t' \ 76 | 'Time {batch_time.val:.3f}s ({batch_time.avg:.3f}s)\t' \ 77 | 'Speed {speed:.1f} samples/s\t' \ 78 | 'Data {data_time.val:.3f}s ({data_time.avg:.3f}s)\t' \ 79 | 'Loss {loss.val:.5f} ({loss.avg:.5f})\t' \ 80 | 'Accuracy {acc.val:.3f} ({acc.avg:.3f})'.format( 81 | epoch, i, len(train_loader), batch_time=batch_time, 82 | speed=input.size(0)/batch_time.val, 83 | data_time=data_time, loss=losses, acc=acc) 84 | logger.info(msg) 85 | 86 | writer = writer_dict['writer'] 87 | global_steps = writer_dict['train_global_steps'] 88 | writer.add_scalar('train_loss', losses.val, global_steps) 89 | writer.add_scalar('train_acc', acc.val, global_steps) 90 | writer_dict['train_global_steps'] = global_steps + 1 91 | 92 | prefix = '{}_{}'.format(os.path.join(output_dir, 'train'), i) 93 | 94 | save_debug_images(config, input, meta, target_hm, 
pred*4, outputs, 95 | prefix) 96 | 97 | 98 | def set_require_grad(nets, requires_grad=True): 99 | if not isinstance(nets, list): 100 | nets = [nets] 101 | for net in nets: 102 | if net is not None: 103 | for param in net.parameters(): 104 | param.requires_grad = requires_grad 105 | 106 | 107 | def train_advmix(config, args, train_loader, models, criterion, optimizers, epoch, 108 | output_dir, tb_log_dir, writer_dict): 109 | batch_time = AverageMeter() 110 | data_time = AverageMeter() 111 | losses = AverageMeter() 112 | acc = AverageMeter() 113 | 114 | if isinstance(models, list): 115 | model = models[0].train() 116 | model_G = models[1].train() 117 | model_teacher = models[2].eval() 118 | else: 119 | models.train() 120 | 121 | optimizer = optimizers[0] 122 | optimizer_G = optimizers[1] 123 | 124 | end = time.time() 125 | for i, (inputs, targets, target_weights, metas) in tqdm(enumerate(train_loader)): 126 | 127 | data_time.update(time.time() - end) 128 | # mask_channel = meta['model_supervise_channel'] > 0.5 129 | if isinstance(inputs, list): 130 | inputs = [_.cuda(non_blocking=True) for _ in inputs] 131 | target = targets[0].cuda(non_blocking=True) 132 | target_weight = target_weights[0].cuda(non_blocking=True) 133 | meta = metas[0] 134 | else: 135 | inputs = inputs.cuda(non_blocking=True) 136 | 137 | G_input = torch.cat(inputs, dim=1) 138 | mix_weight = F.softmax(model_G(G_input), dim=1) 139 | 140 | set_require_grad(model, True) 141 | optimizer.zero_grad() 142 | tmp = inputs[0] * mix_weight[:,0,...].unsqueeze(dim=1) 143 | for list_index in range(1, len(inputs)): 144 | tmp += inputs[list_index] * mix_weight[:,list_index].unsqueeze(dim=1) 145 | 146 | D_output_detach = model(tmp.detach()) 147 | 148 | with torch.no_grad(): 149 | teacher_output = model_teacher(inputs[0]) 150 | 151 | loss_D_hm = criterion(D_output_detach, target, target_weight) 152 | loss_D_kd = criterion(D_output_detach, teacher_output, target_weight) 153 | loss_D = loss_D_hm * (1 - args.alpha) + loss_D_kd * args.alpha 154 | loss_D.backward() 155 | optimizer.step() 156 | 157 | # G: compute gradient and do update step 158 | set_require_grad(model, False) 159 | optimizer_G.zero_grad() 160 | outputs = model(tmp) 161 | output = outputs 162 | loss_G = -criterion(output, target, target_weight) * args.adv_loss_weight 163 | loss_G.backward() 164 | optimizer_G.step() 165 | 166 | # measure accuracy and record loss 167 | losses.update(loss_D.item(), inputs[0].size(0)) 168 | _, avg_acc, cnt, pred = accuracy(output, 169 | target, args=None, cfg=config) 170 | 171 | acc.update(avg_acc, cnt) 172 | batch_time.update(time.time() - end) 173 | end = time.time() 174 | 175 | if i % config.PRINT_FREQ == 0: 176 | msg = 'Epoch: [{0}][{1}/{2}]\t' \ 177 | 'Time {batch_time.val:.3f}s ({batch_time.avg:.3f}s)\t' \ 178 | 'Speed {speed:.1f} samples/s\t' \ 179 | 'Data {data_time.val:.3f}s ({data_time.avg:.3f}s)\t' \ 180 | 'Loss {loss.val:.5f} ({loss.avg:.5f})\t' \ 181 | 'Accuracy {acc.val:.3f} ({acc.avg:.3f})'.format( 182 | epoch, i, len(train_loader), batch_time=batch_time, 183 | speed=inputs[0].size(0)/batch_time.val, 184 | data_time=data_time, loss=losses, acc=acc) 185 | logger.info(msg) 186 | 187 | writer = writer_dict['writer'] 188 | global_steps = writer_dict['train_global_steps'] 189 | writer.add_scalar('train_loss', losses.val, global_steps) 190 | writer.add_scalar('train_acc', acc.val, global_steps) 191 | writer_dict['train_global_steps'] = global_steps + 1 192 | 193 | prefix = '{}_{}'.format(os.path.join(output_dir, 'train'), i) 194 | 
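# Two sets of debug images are saved here: one for the clean input (prefix suffixed with
# '_clean') and one for the adversarially mixed input tmp.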
save_debug_images(config, inputs[0], copy.deepcopy(meta), target, pred*4, outputs, 195 | prefix + '_clean') 196 | save_debug_images(config, tmp, copy.deepcopy(meta), target, pred*4, outputs, 197 | prefix) 198 | 199 | 200 | def validate(config, args, val_loader, val_dataset, model, criterion, output_dir, 201 | tb_log_dir, writer_dict=None, cpu=False): 202 | batch_time = AverageMeter() 203 | losses = AverageMeter() 204 | acc = AverageMeter() 205 | 206 | # switch to evaluate mode 207 | model.eval() 208 | 209 | num_samples = len(val_dataset) 210 | all_preds = np.zeros( 211 | (num_samples, config.MODEL.NUM_JOINTS, 3), 212 | dtype=np.float32 213 | ) 214 | all_boxes = np.zeros((num_samples, 6)) 215 | image_path = [] 216 | filenames = [] 217 | imgnums = [] 218 | idx = 0 219 | feat_dict = {} 220 | 221 | with torch.no_grad(): 222 | end = time.time() 223 | time_gpu = 0. 224 | for i, (input, target, target_weight, meta) in tqdm(enumerate(val_loader)): 225 | if not cpu: 226 | input = input.cuda() 227 | # compute output 228 | torch.cuda.synchronize() 229 | infer_start = time.time() 230 | outputs = model(input) 231 | 232 | infer_end = time.time() 233 | torch.cuda.synchronize() 234 | time_gpu += (infer_end - infer_start) 235 | 236 | if isinstance(outputs, list): 237 | output = outputs[-1] 238 | else: 239 | output = outputs 240 | 241 | if config.TEST.FLIP_TEST: 242 | input_flipped = input.flip(3) 243 | outputs_flipped = model(input_flipped) 244 | 245 | if isinstance(outputs_flipped, list): 246 | output_flipped = outputs_flipped[-1] 247 | else: 248 | output_flipped = outputs_flipped 249 | 250 | output_flipped = flip_back(output_flipped.cpu().numpy(), 251 | val_dataset.flip_pairs) 252 | if not cpu: 253 | output_flipped = torch.from_numpy(output_flipped.copy()).cuda() 254 | else: 255 | output_flipped = torch.from_numpy(output_flipped.copy()) 256 | 257 | # feature is not aligned, shift flipped heatmap for higher accuracy 258 | if config.TEST.SHIFT_HEATMAP: 259 | output_flipped[:, :, :, 1:] = \ 260 | output_flipped.clone()[:, :, :, 0:-1] 261 | output = (output + output_flipped) * 0.5 262 | 263 | if not cpu: 264 | target = target[0].cuda(non_blocking=True) 265 | target_hm = target 266 | target_weight = target_weight.cuda(non_blocking=True) 267 | 268 | loss = criterion(output, target, target_weight) 269 | 270 | num_images = input.size(0) 271 | # measure accuracy and record loss 272 | losses.update(loss.item(), num_images) 273 | 274 | _, avg_acc, cnt, pred = accuracy(output, 275 | target, args=None, cfg=config) 276 | 277 | output = _tocuda(output) 278 | acc.update(avg_acc, cnt) 279 | batch_time.update(time.time() - end) 280 | end = time.time() 281 | 282 | # corresponding center scale joint 283 | c = meta['center'].numpy() 284 | s = meta['scale'].numpy() 285 | score = meta['score'].numpy() 286 | 287 | 288 | preds, maxvals = get_final_preds( 289 | config, args, output.clone().cpu().numpy(), c, s) 290 | 291 | all_preds[idx:idx + num_images, :, 0:2] = preds[:, :, 0:2] 292 | all_preds[idx:idx + num_images, :, 2:3] = maxvals 293 | # double check this all_boxes parts 294 | all_boxes[idx:idx + num_images, 0:2] = c[:, 0:2] 295 | all_boxes[idx:idx + num_images, 2:4] = s[:, 0:2] 296 | all_boxes[idx:idx + num_images, 4] = np.prod(s*200, 1) 297 | all_boxes[idx:idx + num_images, 5] = score 298 | image_path.extend(meta['image']) 299 | 300 | idx += num_images 301 | 302 | if i % config.PRINT_FREQ == 0: 303 | msg = 'Test: [{0}/{1}]\t' \ 304 | 'Time {batch_time.val:.3f} ({batch_time.avg:.3f})\t' \ 305 | 'Loss {loss.val:.4f} 
({loss.avg:.4f})\t' \ 306 | 'Accuracy {acc.val:.3f} ({acc.avg:.3f})'.format( 307 | i, len(val_loader), batch_time=batch_time, 308 | loss=losses, acc=acc) 309 | logger.info(msg) 310 | 311 | prefix = '{}_{}'.format( 312 | os.path.join(output_dir, 'val'), i 313 | ) 314 | 315 | save_debug_images(config, input, meta, target_hm, pred * 4, output, 316 | prefix) 317 | 318 | print('=> The average inference time is :', time_gpu / len(val_loader)) 319 | 320 | name_values, perf_indicator = val_dataset.evaluate( 321 | config, all_preds, output_dir, all_boxes, image_path, 322 | filenames, imgnums 323 | ) 324 | 325 | model_name = config.MODEL.NAME 326 | if isinstance(name_values, list): 327 | for name_value in name_values: 328 | _print_name_value(name_value, model_name) 329 | else: 330 | _print_name_value(name_values, model_name) 331 | 332 | if writer_dict: 333 | writer = writer_dict['writer'] 334 | global_steps = writer_dict['valid_global_steps'] 335 | writer.add_scalar( 336 | 'valid_loss', 337 | losses.avg, 338 | global_steps 339 | ) 340 | writer.add_scalar( 341 | 'valid_acc', 342 | acc.avg, 343 | global_steps 344 | ) 345 | if isinstance(name_values, list): 346 | for name_value in name_values: 347 | writer.add_scalars( 348 | 'valid', 349 | dict(name_value), 350 | global_steps 351 | ) 352 | else: 353 | writer.add_scalars( 354 | 'valid', 355 | dict(name_values), 356 | global_steps 357 | ) 358 | writer_dict['valid_global_steps'] = global_steps + 1 359 | 360 | return name_values, perf_indicator 361 | 362 | 363 | # markdown format output 364 | def _print_name_value(name_value, full_arch_name): 365 | names = name_value.keys() 366 | values = name_value.values() 367 | num_values = len(name_value) 368 | logger.info( 369 | '| Arch ' + 370 | ' '.join(['| {}'.format(name) for name in names]) + 371 | ' |' 372 | ) 373 | logger.info('|---' * (num_values+1) + '|') 374 | 375 | if len(full_arch_name) > 15: 376 | full_arch_name = full_arch_name[:8] + '...' 377 | logger.info( 378 | '| ' + full_arch_name + ' ' + 379 | ' '.join(['| {:.3f}'.format(value) for value in values]) + 380 | ' |' 381 | ) 382 | 383 | class AverageMeter(object): 384 | """Computes and stores the average and current value""" 385 | def __init__(self): 386 | self.reset() 387 | 388 | def reset(self): 389 | self.val = 0 390 | self.avg = 0 391 | self.sum = 0 392 | self.count = 0 393 | 394 | def update(self, val, n=1): 395 | self.val = val 396 | self.sum += val * n 397 | self.count += n 398 | self.avg = self.sum / self.count if self.count != 0 else 0 399 | -------------------------------------------------------------------------------- /lib/core/inference.py: -------------------------------------------------------------------------------- 1 | # ------------------------------------------------------------------------------ 2 | # Copyright (c) Microsoft 3 | # Licensed under the MIT License. 
4 | # Written by Bin Xiao (Bin.Xiao@microsoft.com) 5 | # ------------------------------------------------------------------------------ 6 | 7 | from __future__ import absolute_import 8 | from __future__ import division 9 | from __future__ import print_function 10 | 11 | import math 12 | 13 | import numpy as np 14 | 15 | from utils.transforms import transform_preds 16 | import torch.nn as nn 17 | import torch 18 | 19 | 20 | 21 | 22 | def get_max_preds(batch_heatmaps): 23 | ''' 24 | get predictions from score maps 25 | heatmaps: numpy.ndarray([batch_size, num_joints, height, width]) 26 | ''' 27 | assert isinstance(batch_heatmaps, np.ndarray), \ 28 | 'batch_heatmaps should be numpy.ndarray' 29 | assert batch_heatmaps.ndim == 4, 'batch_images should be 4-ndim' 30 | 31 | batch_size = batch_heatmaps.shape[0] 32 | num_joints = batch_heatmaps.shape[1] 33 | width = batch_heatmaps.shape[3] 34 | heatmaps_reshaped = batch_heatmaps.reshape((batch_size, num_joints, -1)) 35 | idx = np.argmax(heatmaps_reshaped, 2) 36 | maxvals = np.amax(heatmaps_reshaped, 2) 37 | 38 | maxvals = maxvals.reshape((batch_size, num_joints, 1)) 39 | idx = idx.reshape((batch_size, num_joints, 1)) 40 | 41 | preds = np.tile(idx, (1, 1, 2)).astype(np.float32) 42 | preds[:, :, 0] = (preds[:, :, 0]) % width 43 | preds[:, :, 1] = np.floor((preds[:, :, 1]) / width) 44 | 45 | pred_mask = np.tile(np.greater(maxvals, 0.0), (1, 1, 2)) 46 | pred_mask = pred_mask.astype(np.float32) 47 | 48 | preds *= pred_mask 49 | return preds, maxvals 50 | 51 | 52 | def get_final_preds(config, args, batch_heatmaps, center, scale, cal_hm_coord=True, coord=None, reg_hm=False): 53 | # default: calculate coord from heatmap 54 | if cal_hm_coord: 55 | coords, maxvals = get_max_preds(batch_heatmaps) 56 | else: 57 | coords = coord 58 | _, maxvals = get_max_preds(batch_heatmaps) 59 | 60 | heatmap_height = batch_heatmaps.shape[2] 61 | heatmap_width = batch_heatmaps.shape[3] 62 | 63 | if config.TEST.POST_PROCESS: 64 | for n in range(coords.shape[0]): 65 | for p in range(coords.shape[1]): 66 | hm = batch_heatmaps[n][p] 67 | px = int(math.floor(coords[n][p][0] + 0.5)) 68 | py = int(math.floor(coords[n][p][1] + 0.5)) 69 | if 1 < px < heatmap_width-1 and 1 < py < heatmap_height-1: 70 | diff = np.array( 71 | [ 72 | hm[py][px+1] - hm[py][px-1], 73 | hm[py+1][px]-hm[py-1][px] 74 | ] 75 | ) 76 | coords[n][p] += np.sign(diff) * .25 77 | 78 | preds = coords.copy() 79 | 80 | # Transform back the coord based on center and scale 81 | for i in range(coords.shape[0]): 82 | if coord is None: 83 | preds[i] = transform_preds( 84 | coords[i], center[i], scale[i], [heatmap_width, heatmap_height] 85 | ) 86 | if coord is not None and reg_hm: 87 | preds[i] = transform_preds( 88 | coords[i], center[i], scale[i], [heatmap_width, heatmap_height] 89 | ) 90 | # model outputs coord; not reg_hm; reg_hm = False in default 91 | if coord is not None and not reg_hm: 92 | preds[i] = transform_preds( 93 | coords[i], center[i], scale[i], [config.MODEL.IMAGE_SIZE[0], config.MODEL.IMAGE_SIZE[1]] 94 | ) 95 | return preds, maxvals 96 | 97 | class SoftArgmax2D(nn.Module): 98 | def __init__(self, height=64, width=48, beta=100): 99 | super(SoftArgmax2D, self).__init__() 100 | self.softmax = nn.Softmax(dim=-1) 101 | self.beta = beta 102 | # Note that meshgrid in pytorch behaves differently with numpy. 
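# (torch.meshgrid defaults to 'ij' indexing, so WY varies along the height axis and WX along
# the width axis, whereas np.meshgrid defaults to 'xy' indexing.)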
103 | self.WY, self.WX = torch.meshgrid(torch.arange(height, dtype=torch.float), 104 | torch.arange(width, dtype=torch.float)) 105 | 106 | def forward(self, x): 107 | b, c, h, w = x.shape 108 | device = x.device 109 | 110 | probs = self.softmax(x.view(b, c, -1) * self.beta) 111 | probs = probs.view(b, c, h, w) 112 | 113 | self.WY = self.WY.to(device) 114 | self.WX = self.WX.to(device) 115 | 116 | px = torch.sum(probs * self.WX, dim=(2, 3)) 117 | py = torch.sum(probs * self.WY, dim=(2, 3)) 118 | preds = torch.stack((px, py), dim=-1).cpu().numpy() 119 | 120 | idx = np.round(preds).astype(np.int32) 121 | maxvals = np.zeros(shape=(b, c, 1)) 122 | for bi in range(b): 123 | for ci in range(c): 124 | maxvals[bi, ci, 0] = x[bi, ci, idx[bi, ci, 1], idx[bi, ci, 0]] 125 | 126 | return preds, maxvals 127 | 128 | 129 | def get_final_preds_using_softargmax(config, batch_heatmaps, center, scale): 130 | soft_argmax = SoftArgmax2D(config.MODEL.HEATMAP_SIZE[1], config.MODEL.HEATMAP_SIZE[0], beta=160) 131 | coords, maxvals = soft_argmax(batch_heatmaps) 132 | 133 | heatmap_height = batch_heatmaps.shape[2] 134 | heatmap_width = batch_heatmaps.shape[3] 135 | 136 | batch_heatmaps = batch_heatmaps.cpu().numpy() 137 | 138 | # post-processing 139 | if config.TEST.POST_PROCESS: 140 | for n in range(coords.shape[0]): 141 | for p in range(coords.shape[1]): 142 | hm = batch_heatmaps[n][p] 143 | px = int(math.floor(coords[n][p][0] + 0.5)) 144 | py = int(math.floor(coords[n][p][1] + 0.5)) 145 | if 1 < px < heatmap_width - 1 and 1 < py < heatmap_height - 1: 146 | diff = np.array( 147 | [ 148 | hm[py][px + 1] - hm[py][px - 1], 149 | hm[py + 1][px] - hm[py - 1][px] 150 | ] 151 | ) 152 | coords[n][p] += np.sign(diff) * .25 153 | 154 | preds = coords.copy() 155 | 156 | # Transform back 157 | for i in range(coords.shape[0]): 158 | preds[i] = transform_preds( 159 | coords[i], center[i], scale[i], [heatmap_width, heatmap_height] 160 | ) 161 | 162 | return preds, maxvals 163 | -------------------------------------------------------------------------------- /lib/core/loss.py: -------------------------------------------------------------------------------- 1 | # ------------------------------------------------------------------------------ 2 | # Copyright (c) Microsoft 3 | # Licensed under the MIT License. 
4 | # Written by Bin Xiao (Bin.Xiao@microsoft.com) 5 | # ------------------------------------------------------------------------------ 6 | 7 | from __future__ import absolute_import, division, print_function 8 | 9 | import numpy as np 10 | 11 | import torch 12 | import torch.nn as nn 13 | 14 | 15 | class JointsMSELoss(nn.Module): 16 | def __init__(self, use_target_weight, smooth_L1=False): 17 | super(JointsMSELoss, self).__init__() 18 | if smooth_L1: 19 | self.criterion = nn.MSELoss(reduction='mean') 20 | else: 21 | self.criterion = nn.SmoothL1Loss() 22 | 23 | self.use_target_weight = use_target_weight 24 | 25 | def forward(self, output, target, target_weight): 26 | batch_size = output.size(0) 27 | num_joints = output.size(1) 28 | 29 | def tofloat(x): 30 | if x.dtype == torch.float64 or x.dtype == torch.double: 31 | x = x.float() 32 | return x 33 | 34 | output = tofloat(output) 35 | target = tofloat(target) 36 | target_weight = tofloat(target_weight) 37 | 38 | if output.dim() == 4: 39 | heatmaps_pred = output.reshape((batch_size, num_joints, -1)).split(1, 1) 40 | heatmaps_gt = target.reshape((batch_size, num_joints, -1)).split(1, 1) 41 | 42 | else: 43 | heatmaps_pred = output 44 | heatmaps_gt = target 45 | 46 | loss = 0 47 | 48 | for idx in range(num_joints): 49 | if not isinstance(heatmaps_pred, tuple) and heatmaps_pred.shape[2] == 2: 50 | heatmap_pred = heatmaps_pred[:,idx].squeeze() 51 | heatmap_gt = heatmaps_gt[:,idx].squeeze() 52 | 53 | else: 54 | heatmap_pred = heatmaps_pred[idx].squeeze() 55 | heatmap_gt = heatmaps_gt[idx].squeeze() 56 | 57 | if self.use_target_weight: 58 | loss += 0.5 * self.criterion( 59 | heatmap_pred.mul(target_weight[:, idx]), 60 | heatmap_gt.mul(target_weight[:, idx]) 61 | ) 62 | else: 63 | loss += 0.5 * self.criterion(heatmap_pred, heatmap_gt) 64 | 65 | return loss / num_joints 66 | 67 | 68 | class JointsOHKMMSELoss(nn.Module): 69 | def __init__(self, use_target_weight, topk=8): 70 | super(JointsOHKMMSELoss, self).__init__() 71 | self.criterion = nn.MSELoss(reduction='none') 72 | self.use_target_weight = use_target_weight 73 | self.topk = topk 74 | 75 | def ohkm(self, loss): 76 | ohkm_loss = 0. 
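# Online Hard Keypoint Mining: for each sample in the batch, keep only the top-k largest
# per-joint losses, average them, and then average the result over the batch.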
77 | for i in range(loss.size()[0]): 78 | sub_loss = loss[i] 79 | topk_val, topk_idx = torch.topk( 80 | sub_loss, k=self.topk, dim=0, sorted=False 81 | ) 82 | tmp_loss = torch.gather(sub_loss, 0, topk_idx) 83 | ohkm_loss += torch.sum(tmp_loss) / self.topk 84 | ohkm_loss /= loss.size()[0] 85 | return ohkm_loss 86 | 87 | def forward(self, output, target, target_weight): 88 | batch_size = output.size(0) 89 | num_joints = output.size(1) 90 | heatmaps_pred = output.reshape((batch_size, num_joints, -1)).split(1, 1) 91 | heatmaps_gt = target.reshape((batch_size, num_joints, -1)).split(1, 1) 92 | 93 | loss = [] 94 | for idx in range(num_joints): 95 | heatmap_pred = heatmaps_pred[idx].squeeze() 96 | heatmap_gt = heatmaps_gt[idx].squeeze() 97 | if self.use_target_weight: 98 | loss.append(0.5 * self.criterion( 99 | heatmap_pred.mul(target_weight[:, idx]), 100 | heatmap_gt.mul(target_weight[:, idx]) 101 | )) 102 | else: 103 | loss.append( 104 | 0.5 * self.criterion(heatmap_pred, heatmap_gt) 105 | ) 106 | 107 | loss = [l.mean(dim=1).unsqueeze(dim=1) for l in loss] 108 | loss = torch.cat(loss, dim=1) 109 | 110 | return self.ohkm(loss) 111 | -------------------------------------------------------------------------------- /lib/dataset/JointsDataset.py: -------------------------------------------------------------------------------- 1 | # ------------------------------------------------------------------------------ 2 | # Copyright (c) Microsoft 3 | # Licensed under the MIT License. 4 | # Written by Bin Xiao (Bin.Xiao@microsoft.com) 5 | # ------------------------------------------------------------------------------ 6 | 7 | from __future__ import absolute_import 8 | from __future__ import division 9 | from __future__ import print_function 10 | 11 | import copy 12 | import logging 13 | import random 14 | 15 | import cv2, os 16 | import numpy as np 17 | import torch 18 | from torch.utils.data import Dataset 19 | 20 | from utils.transforms import get_affine_transform 21 | from utils.transforms import affine_transform 22 | from utils.transforms import fliplr_joints 23 | from imagecorruptions import corrupt, get_corruption_names 24 | from .advaug import MixCombine 25 | 26 | logger = logging.getLogger(__name__) 27 | 28 | 29 | class JointsDataset(Dataset): 30 | def __init__(self, cfg, args, root, image_set, is_train, transform=None): 31 | 32 | self.args = args 33 | self.num_joints = 0 34 | self.pixel_std = 200 35 | self.flip_pairs = [] 36 | self.parent_ids = [] 37 | 38 | self.is_train = is_train 39 | self.root = root 40 | self.image_set = image_set 41 | 42 | self.output_path = cfg.OUTPUT_DIR 43 | self.data_format = cfg.DATASET.DATA_FORMAT 44 | 45 | self.scale_factor = cfg.DATASET.SCALE_FACTOR 46 | self.rotation_factor = cfg.DATASET.ROT_FACTOR 47 | self.flip = cfg.DATASET.FLIP 48 | self.num_joints_half_body = cfg.DATASET.NUM_JOINTS_HALF_BODY 49 | self.prob_half_body = cfg.DATASET.PROB_HALF_BODY 50 | self.color_rgb = cfg.DATASET.COLOR_RGB 51 | 52 | self.target_type = cfg.MODEL.TARGET_TYPE 53 | self.image_size = np.array(cfg.MODEL.IMAGE_SIZE) 54 | self.heatmap_size = np.array(cfg.MODEL.HEATMAP_SIZE) 55 | self.sigma = cfg.MODEL.SIGMA 56 | self.use_different_joints_weight = cfg.LOSS.USE_DIFFERENT_JOINTS_WEIGHT 57 | self.joints_weight = 1 58 | 59 | self.transform = transform 60 | self.get_varaug = MixCombine() 61 | self.db = [] 62 | 63 | def _get_db(self): 64 | raise NotImplementedError 65 | 66 | def evaluate(self, cfg, preds, output_dir, *args, **kwargs): 67 | raise NotImplementedError 68 | 69 | def 
half_body_transform(self, joints, joints_vis): 70 | upper_joints = [] 71 | lower_joints = [] 72 | for joint_id in range(self.num_joints): 73 | if joints_vis[joint_id][0] > 0: 74 | if joint_id in self.upper_body_ids: 75 | upper_joints.append(joints[joint_id]) 76 | else: 77 | lower_joints.append(joints[joint_id]) 78 | 79 | if np.random.randn() < 0.5 and len(upper_joints) > 2: 80 | selected_joints = upper_joints 81 | else: 82 | selected_joints = lower_joints \ 83 | if len(lower_joints) > 2 else upper_joints 84 | 85 | if len(selected_joints) < 2: 86 | return None, None 87 | 88 | selected_joints = np.array(selected_joints, dtype=np.float32) 89 | center = selected_joints.mean(axis=0)[:2] 90 | 91 | left_top = np.amin(selected_joints, axis=0) 92 | right_bottom = np.amax(selected_joints, axis=0) 93 | 94 | w = right_bottom[0] - left_top[0] 95 | h = right_bottom[1] - left_top[1] 96 | 97 | if w > self.aspect_ratio * h: 98 | h = w * 1.0 / self.aspect_ratio 99 | elif w < self.aspect_ratio * h: 100 | w = h * self.aspect_ratio 101 | 102 | scale = np.array( 103 | [ 104 | w * 1.0 / self.pixel_std, 105 | h * 1.0 / self.pixel_std 106 | ], 107 | dtype=np.float32 108 | ) 109 | 110 | scale = scale * 1.5 111 | return center, scale 112 | 113 | def __len__(self,): 114 | return len(self.db) 115 | 116 | 117 | def __getitem__(self, idx): 118 | if self.args.sample_times == 1 or not self.is_train: 119 | input, target, target_weight, meta = self.get_clean(idx) 120 | return input, target, target_weight, meta 121 | else: 122 | input_list, target_list, target_weight_list, meta_list = [],[],[],[] 123 | meta, input_base = self.get_base(idx) 124 | get_cleans = ['clean', 'autoaug', 'gridmask'] 125 | for sample in range(len(get_cleans)): 126 | get_clean = get_cleans[sample] 127 | input, target, target_weight, meta = self.get_var(meta, input_base, get_clean=get_clean) 128 | input_list.append(input) 129 | target_list.append(target) 130 | target_weight_list.append(target_weight) 131 | meta_list.append(meta) 132 | 133 | return input_list, target_list, target_weight_list, meta_list 134 | 135 | def get_base(self, idx): 136 | db_rec = copy.deepcopy(self.db[idx]) 137 | 138 | image_file = db_rec['image'] 139 | filename = db_rec['filename'] if 'filename' in db_rec else '' 140 | imgnum = db_rec['imgnum'] if 'imgnum' in db_rec else '' 141 | 142 | if self.data_format == 'zip': 143 | from utils import zipreader 144 | data_numpy = zipreader.imread( 145 | image_file, cv2.IMREAD_COLOR | cv2.IMREAD_IGNORE_ORIENTATION 146 | ) 147 | else: 148 | data_numpy = cv2.imread( 149 | image_file, cv2.IMREAD_COLOR | cv2.IMREAD_IGNORE_ORIENTATION 150 | ) 151 | 152 | if self.color_rgb: 153 | data_numpy = cv2.cvtColor(data_numpy, cv2.COLOR_BGR2RGB) 154 | 155 | if data_numpy is None: 156 | logger.error('=> fail to read {}'.format(image_file)) 157 | raise ValueError('Fail to read {}'.format(image_file)) 158 | 159 | joints = db_rec['joints_3d'] 160 | joints_vis = db_rec['joints_3d_vis'] 161 | 162 | c = db_rec['center'] 163 | s = db_rec['scale'] 164 | score = db_rec['score'] if 'score' in db_rec else 1 165 | r = 0 166 | 167 | if self.is_train: 168 | if (np.sum(joints_vis[:, 0]) > self.num_joints_half_body 169 | and np.random.rand() < self.prob_half_body): 170 | c_half_body, s_half_body = self.half_body_transform( 171 | joints, joints_vis 172 | ) 173 | 174 | if c_half_body is not None and s_half_body is not None: 175 | c, s = c_half_body, s_half_body 176 | 177 | sf = self.scale_factor 178 | rf = self.rotation_factor 179 | s = s * np.clip(np.random.randn()*sf + 1, 1 - 
sf, 1 + sf) 180 | 181 | r = np.clip(np.random.randn()*rf, -rf*2, rf*2) \ 182 | if random.random() <= 0.6 else 0 183 | 184 | if self.flip and random.random() <= 0.5: 185 | data_numpy = data_numpy[:, ::-1, :] 186 | joints, joints_vis = fliplr_joints( 187 | joints, joints_vis, data_numpy.shape[1], self.flip_pairs) 188 | c[0] = data_numpy.shape[1] - c[0] - 1 189 | 190 | trans = get_affine_transform(c, s, r, self.image_size) 191 | input = cv2.warpAffine( 192 | data_numpy, 193 | trans, 194 | (int(self.image_size[0]), int(self.image_size[1])), 195 | flags=cv2.INTER_LINEAR) 196 | 197 | for i in range(self.num_joints): 198 | if joints_vis[i, 0] > 0.0: 199 | joints[i, 0:2] = affine_transform(joints[i, 0:2], trans) 200 | 201 | if 'style' in filename: 202 | datasetname = 'style' 203 | else: 204 | datasetname = 'clean' 205 | 206 | if 'instance_index' not in db_rec: 207 | db_rec['instance_index'] = -1 208 | 209 | meta = { 210 | 'image': image_file, 211 | 'filename': filename, 212 | 'imgnum': imgnum, 213 | 'joints': joints, 214 | 'joints_vis': joints_vis, 215 | 'center': c, 216 | 'scale': s, 217 | 'rotation': r, 218 | 'score': score, 219 | 'dataset': datasetname, 220 | 'instance_index':db_rec['instance_index'] 221 | } 222 | 223 | return meta, input 224 | 225 | def get_var(self, meta, input, get_clean=False): 226 | joints = meta['joints'] 227 | joints_vis = meta['joints_vis'] 228 | 229 | inputs = { 230 | 'data_numpy':input, 231 | 'img_label':1, 232 | 'joints_3d':joints, 233 | 'joints_3d_vis':joints_vis, 234 | 'dataset':meta['dataset'] 235 | } 236 | 237 | inputs, _ = self.get_varaug((inputs, get_clean, self.args), self.transform) 238 | 239 | input = inputs['data_numpy'] 240 | joints = inputs['joints_3d'] 241 | joints_vis = inputs['joints_3d_vis'] 242 | 243 | target, target_weight = self.generate_target(joints, joints_vis) 244 | 245 | if isinstance(target, list): 246 | for index in range(len(target)): 247 | target[index] = torch.from_numpy(target[index]) 248 | else: 249 | target = torch.from_numpy(target) 250 | 251 | target_weight = torch.from_numpy(target_weight) 252 | 253 | if isinstance(target, list): 254 | return input, target[0], target_weight, meta 255 | else: 256 | return input, target, target_weight, meta 257 | 258 | def get_clean(self, idx): 259 | corruptions = [ 260 | 'gaussian_noise', 'shot_noise', 'impulse_noise', 261 | 'defocus_blur', 'glass_blur', 'motion_blur', 'zoom_blur', 262 | 'snow', 'frost', 'fog', 'brightness', 263 | 'contrast', 'elastic_transform', 'pixelate', 'jpeg_compression', 264 | ] 265 | db_rec = copy.deepcopy(self.db[idx]) 266 | 267 | image_file = db_rec['image'] 268 | filename = db_rec['filename'] if 'filename' in db_rec else '' 269 | imgnum = db_rec['imgnum'] if 'imgnum' in db_rec else '' 270 | 271 | if self.data_format == 'zip': 272 | from utils import zipreader 273 | data_numpy = zipreader.imread( 274 | image_file, cv2.IMREAD_COLOR | cv2.IMREAD_IGNORE_ORIENTATION 275 | ) 276 | else: 277 | data_numpy = cv2.imread( 278 | image_file, cv2.IMREAD_COLOR | cv2.IMREAD_IGNORE_ORIENTATION 279 | ) 280 | 281 | if self.color_rgb: 282 | data_numpy = cv2.cvtColor(data_numpy, cv2.COLOR_BGR2RGB) 283 | 284 | if self.args.random_corruption: 285 | print('=> random augmentation') 286 | data_numpy = corrupt(data_numpy, corruption_name=random.choice(corruptions), severity=random.randint(1,5)) 287 | 288 | if data_numpy is None: 289 | logger.error('=> fail to read {}'.format(image_file)) 290 | raise ValueError('Fail to read {}'.format(image_file)) 291 | 292 | joints = db_rec['joints_3d'] 293 | 
joints_vis = db_rec['joints_3d_vis'] 294 | 295 | c = db_rec['center'] 296 | s = db_rec['scale'] 297 | score = db_rec['score'] if 'score' in db_rec else 1 298 | r = 0 299 | 300 | if self.is_train: 301 | if (np.sum(joints_vis[:, 0]) > self.num_joints_half_body 302 | and np.random.rand() < self.prob_half_body): 303 | c_half_body, s_half_body = self.half_body_transform( 304 | joints, joints_vis 305 | ) 306 | 307 | if c_half_body is not None and s_half_body is not None: 308 | c, s = c_half_body, s_half_body 309 | 310 | sf = self.scale_factor 311 | rf = self.rotation_factor 312 | s = s * np.clip(np.random.randn()*sf + 1, 1 - sf, 1 + sf) 313 | 314 | r = np.clip(np.random.randn()*rf, -rf*2, rf*2) \ 315 | if random.random() <= 0.6 else 0 316 | 317 | if self.flip and random.random() <= 0.5: 318 | data_numpy = data_numpy[:, ::-1, :] 319 | joints, joints_vis = fliplr_joints( 320 | joints, joints_vis, data_numpy.shape[1], self.flip_pairs) 321 | c[0] = data_numpy.shape[1] - c[0] - 1 322 | 323 | 324 | trans = get_affine_transform(c, s, r, self.image_size) 325 | input = cv2.warpAffine( 326 | data_numpy, 327 | trans, 328 | (int(self.image_size[0]), int(self.image_size[1])), 329 | flags=cv2.INTER_LINEAR) 330 | 331 | if self.transform: 332 | input = self.transform(input) 333 | 334 | for i in range(self.num_joints): 335 | if joints_vis[i, 0] > 0.0: 336 | joints[i, 0:2] = affine_transform(joints[i, 0:2], trans) 337 | 338 | target, target_weight = self.generate_target(joints, joints_vis) 339 | 340 | if isinstance(target, list): 341 | for index in range(len(target)): 342 | target[index] = torch.from_numpy(target[index]) 343 | else: 344 | target = torch.from_numpy(target) 345 | 346 | target_weight = torch.from_numpy(target_weight) 347 | 348 | if 'instance_index' not in db_rec: 349 | db_rec['instance_index'] = -1 350 | 351 | meta = { 352 | 'image': image_file, 353 | 'filename': filename, 354 | 'imgnum': imgnum, 355 | 'joints': joints, 356 | 'joints_vis': joints_vis, 357 | 'center': c, 358 | 'scale': s, 359 | 'rotation': r, 360 | 'score': score, 361 | 'instance_index':db_rec['instance_index'] 362 | } 363 | 364 | return input, target, target_weight, meta 365 | 366 | def select_data(self, db): 367 | db_selected = [] 368 | for rec in db: 369 | num_vis = 0 370 | joints_x = 0.0 371 | joints_y = 0.0 372 | for joint, joint_vis in zip( 373 | rec['joints_3d'], rec['joints_3d_vis']): 374 | if joint_vis[0] <= 0: 375 | continue 376 | num_vis += 1 377 | 378 | joints_x += joint[0] 379 | joints_y += joint[1] 380 | if num_vis == 0: 381 | continue 382 | 383 | joints_x, joints_y = joints_x / num_vis, joints_y / num_vis 384 | 385 | area = rec['scale'][0] * rec['scale'][1] * (self.pixel_std**2) 386 | 387 | joints_center = np.array([joints_x, joints_y]) 388 | bbox_center = np.array(rec['center']) 389 | diff_norm2 = np.linalg.norm((joints_center-bbox_center), 2) 390 | ks = np.exp(-1.0*(diff_norm2**2) / ((0.2)**2*2.0*area)) 391 | 392 | metric = (0.2 / 16) * num_vis + 0.45 - 0.2 / 16 393 | 394 | if ks > metric: 395 | db_selected.append(rec) 396 | 397 | logger.info('=> num db: {}'.format(len(db))) 398 | logger.info('=> num selected db: {}'.format(len(db_selected))) 399 | return db_selected 400 | 401 | def get_forground_image(self, img): 402 | mask = np.zeros(img.shape[:2],np.uint8) 403 | bgdModel = np.zeros((1,65),np.float64) 404 | fgdModel = np.zeros((1,65),np.float64) 405 | rect = (0, 0, img.shape[1] - 1, img.shape[0] - 1) 406 | cv2.grabCut(img,mask,rect,bgdModel,fgdModel,5,cv2.GC_INIT_WITH_RECT) 407 | mask2 = 
np.where((mask==2)|(mask==0),0,1).astype('uint8') 408 | img = img*mask2[:,:,np.newaxis] 409 | cv2.imwrite(os.path.dirname((img)) + '_mask' + '/' + os.path.basename(img), img) 410 | return img 411 | 412 | def generate_target(self, joints, joints_vis): 413 | ''' 414 | :param joints: [num_joints, 3] 415 | :param joints_vis: [num_joints, 3] 416 | :return: target, target_weight(1: visible, 0: invisible) 417 | ''' 418 | target_weight = np.ones((self.num_joints, 1), dtype=np.float32) 419 | target_weight[:, 0] = joints_vis[:, 0] 420 | 421 | assert self.target_type == 'gaussian', \ 422 | 'Only support gaussian map now!' 423 | 424 | if self.target_type == 'gaussian': 425 | target = [np.zeros((self.num_joints, 426 | self.heatmap_size[1], 427 | self.heatmap_size[0]), 428 | dtype=np.float32), np.zeros((self.num_joints, 2), dtype=np.float32)] 429 | tmp_size = self.sigma * 3 430 | if False: 431 | tmp_hm = self.heatmap_size 432 | for joint_id in range(self.num_joints): 433 | heatmap_vis = joints_vis[joint_id, 0] 434 | target_weight[joint_id] = heatmap_vis 435 | feat_stride = self.image_size / tmp_hm 436 | mu_x = joints[joint_id][0] / feat_stride[0] 437 | mu_y = joints[joint_id][1] / feat_stride[1] 438 | ul = [mu_x - tmp_size, mu_y - tmp_size] 439 | br = [mu_x + tmp_size + 1, mu_y + tmp_size + 1] 440 | if ul[0] >= tmp_hm[0] or ul[1] >= tmp_hm[1] or br[0] < 0 or br[1] < 0: 441 | target_weight[joint_id] = 0 442 | 443 | if target_weight[joint_id] == 0: 444 | continue 445 | 446 | x = np.arange(0, tmp_hm[0], 1, np.float32) 447 | y = np.arange(0, tmp_hm[1], 1, np.float32) 448 | y = y[:, np.newaxis] 449 | 450 | v = target_weight[joint_id] 451 | if v > 0.5: 452 | target[joint_id] = np.exp(- ((x - mu_x) ** 2 + (y - mu_y) ** 2) / (2 * self.sigma ** 2)) 453 | 454 | else: 455 | for joint_id in range(self.num_joints): 456 | feat_stride = self.image_size / self.heatmap_size 457 | mu_x = int(joints[joint_id][0] / feat_stride[0] + 0.5) 458 | mu_y = int(joints[joint_id][1] / feat_stride[1] + 0.5) 459 | # Check that any part of the gaussian is in-bounds 460 | ul = [int(mu_x - tmp_size), int(mu_y - tmp_size)] 461 | br = [int(mu_x + tmp_size + 1), int(mu_y + tmp_size + 1)] 462 | 463 | if ul[0] >= self.heatmap_size[0] or ul[1] >= self.heatmap_size[1] \ 464 | or br[0] < 0 or br[1] < 0: 465 | target_weight[joint_id] = 0 466 | continue 467 | 468 | # Generate gaussian 469 | size = 2 * tmp_size + 1 470 | x = np.arange(0, size, 1, np.float32) 471 | y = x[:, np.newaxis] 472 | x0 = y0 = size // 2 473 | # The gaussian is not normalized, we want the center value to equal 1 474 | g = np.exp(- ((x - x0) ** 2 + (y - y0) ** 2) / (2 * self.sigma ** 2)) 475 | 476 | g_x = max(0, -ul[0]), min(br[0], self.heatmap_size[0]) - ul[0] 477 | g_y = max(0, -ul[1]), min(br[1], self.heatmap_size[1]) - ul[1] 478 | img_x = max(0, ul[0]), min(br[0], self.heatmap_size[0]) 479 | img_y = max(0, ul[1]), min(br[1], self.heatmap_size[1]) 480 | 481 | v = target_weight[joint_id] 482 | if v > 0.5: 483 | target[0][joint_id][img_y[0]:img_y[1], img_x[0]:img_x[1]] = \ 484 | g[g_y[0]:g_y[1], g_x[0]:g_x[1]] 485 | 486 | target[1][joint_id] = np.array([mu_x, mu_y], dtype=np.float32) 487 | 488 | if self.use_different_joints_weight: 489 | target_weight = np.multiply(target_weight, self.joints_weight) 490 | 491 | return target, target_weight 492 | 493 | 494 | 495 | 496 | 497 | -------------------------------------------------------------------------------- /lib/dataset/__init__.py: -------------------------------------------------------------------------------- 1 | # 
------------------------------------------------------------------------------
2 | # Copyright (c) Microsoft
3 | # Licensed under the MIT License.
4 | # Written by Bin Xiao (Bin.Xiao@microsoft.com)
5 | # ------------------------------------------------------------------------------
6 |
7 | from __future__ import absolute_import
8 | from __future__ import division
9 | from __future__ import print_function
10 |
11 | from .mpii import MPIIDataset as mpii
12 | from .coco import COCODataset as coco
13 |
--------------------------------------------------------------------------------
/lib/dataset/advaug.py:
--------------------------------------------------------------------------------
1 | from PIL import Image, ImageEnhance, ImageOps
2 | import numpy as np
3 | import random
4 | import torch
5 | import logging
6 |
7 | logger = logging.getLogger(__name__)
8 |
9 |
10 | class ImageNetPolicy(object):
11 |     """Randomly choose one of the 12 sub-policies below (adapted from the AutoAugment ImageNet policy).
12 |     Example:
13 |     >>> policy = ImageNetPolicy()
14 |     >>> transformed = policy(image)
15 |     Example as a PyTorch Transform:
16 |     >>> transform=transforms.Compose([
17 |     >>>     transforms.Resize(256),
18 |     >>>     ImageNetPolicy(),
19 |     >>>     transforms.ToTensor()])
20 |     """
21 |     def __init__(self, fillcolor=(128, 128, 128)):
22 |         self.policies = [
23 |             SubPolicy(0.8, "equalize", 8, 0.6, "equalize", 3, fillcolor),
24 |             SubPolicy(0.6, "posterize", 7, 0.6, "posterize", 6, fillcolor),
25 |             SubPolicy(0.4, "equalize", 7, 0.2, "solarize", 4, fillcolor),
26 |             SubPolicy(0.6, "solarize", 3, 0.6, "equalize", 7, fillcolor),
27 |             SubPolicy(0.8, "posterize", 5, 1.0, "equalize", 2, fillcolor),
28 |             SubPolicy(0.6, "equalize", 8, 0.4, "posterize", 6, fillcolor),
29 |             SubPolicy(0.0, "equalize", 7, 0.8, "equalize", 8, fillcolor),
30 |             SubPolicy(0.6, "invert", 4, 1.0, "equalize", 8, fillcolor),
31 |             SubPolicy(0.4, "sharpness", 7, 0.6, "invert", 8, fillcolor),
32 |             SubPolicy(0.4, "equalize", 7, 0.2, "solarize", 4, fillcolor),
33 |             SubPolicy(0.6, "invert", 4, 1.0, "equalize", 8, fillcolor),
34 |             SubPolicy(0.8, "equalize", 8, 0.6, "equalize", 3, fillcolor)
35 |         ]
36 |
37 |     def __call__(self, img):
38 |         policy_idx = random.randint(0, len(self.policies) - 1)
39 |         return self.policies[policy_idx](img)
40 |
41 |     def __repr__(self):
42 |         return "AutoAugment ImageNet Policy"
43 |
44 |
45 |
46 | class SubPolicy(object):
47 |     def __init__(self, p1, operation1, magnitude_idx1, p2, operation2, magnitude_idx2, fillcolor=(128, 128, 128)):
48 |         ranges = {
49 |             "shearX": np.linspace(0, 0.3, 10),
50 |             "shearY": np.linspace(0, 0.3, 10),
51 |             "translateX": np.linspace(0, 150 / 331, 10),
52 |             "translateY": np.linspace(0, 150 / 331, 10),
53 |             "rotate": np.linspace(0, 30, 10),
54 |             "color": np.linspace(0.0, 0.9, 10),
55 |             "posterize": np.round(np.linspace(8, 4, 10), 0).astype(int),
56 |             "solarize": np.linspace(256, 0, 10),
57 |             "contrast": np.linspace(0.0, 0.9, 10),
58 |             "sharpness": np.linspace(0.0, 0.9, 10),
59 |             "brightness": np.linspace(0.0, 0.9, 10),
60 |             "autocontrast": [0] * 10,
61 |             "equalize": [0] * 10,
62 |             "invert": [0] * 10
63 |         }
64 |
65 |         def rotate_with_fill(img, magnitude):
66 |             rot = img.convert("RGBA").rotate(magnitude)
67 |             return Image.composite(rot, Image.new("RGBA", rot.size, (128,) * 4), rot).convert(img.mode)
68 |
69 |         func = {
70 |             "shearX": lambda img, magnitude: img.transform(
71 |                 img.size, Image.AFFINE, (1, magnitude * random.choice([-1, 1]), 0, 0, 1, 0),
72 |                 Image.BICUBIC, fillcolor=fillcolor),
73 |             "shearY": lambda img, magnitude: img.transform(
74 | 
img.size, Image.AFFINE, (1, 0, 0, magnitude * random.choice([-1, 1]), 1, 0), 75 | Image.BICUBIC, fillcolor=fillcolor), 76 | "translateX": lambda img, magnitude: img.transform( 77 | img.size, Image.AFFINE, (1, 0, magnitude * img.size[0] * random.choice([-1, 1]), 0, 1, 0), 78 | fillcolor=fillcolor), 79 | "translateY": lambda img, magnitude: img.transform( 80 | img.size, Image.AFFINE, (1, 0, 0, 0, 1, magnitude * img.size[1] * random.choice([-1, 1])), 81 | fillcolor=fillcolor), 82 | "rotate": lambda img, magnitude: rotate_with_fill(img, magnitude), 83 | "color": lambda img, magnitude: ImageEnhance.Color(img).enhance(1 + magnitude * random.choice([-1, 1])), 84 | "posterize": lambda img, magnitude: ImageOps.posterize(img, magnitude), 85 | "solarize": lambda img, magnitude: ImageOps.solarize(img, magnitude), 86 | "contrast": lambda img, magnitude: ImageEnhance.Contrast(img).enhance( 87 | 1 + magnitude * random.choice([-1, 1])), 88 | "sharpness": lambda img, magnitude: ImageEnhance.Sharpness(img).enhance( 89 | 1 + magnitude * random.choice([-1, 1])), 90 | "brightness": lambda img, magnitude: ImageEnhance.Brightness(img).enhance( 91 | 1 + magnitude * random.choice([-1, 1])), 92 | "autocontrast": lambda img, magnitude: ImageOps.autocontrast(img), 93 | "equalize": lambda img, magnitude: ImageOps.equalize(img), 94 | "invert": lambda img, magnitude: ImageOps.invert(img) 95 | } 96 | self.p1 = p1 97 | self.operation1 = func[operation1] 98 | self.magnitude1 = ranges[operation1][magnitude_idx1] 99 | self.p2 = p2 100 | self.operation2 = func[operation2] 101 | self.magnitude2 = ranges[operation2][magnitude_idx2] 102 | 103 | 104 | def __call__(self, img): 105 | if random.random() < self.p1: img = self.operation1(img, self.magnitude1) 106 | if random.random() < self.p2: img = self.operation2(img, self.magnitude2) 107 | return img 108 | 109 | 110 | 111 | def grid_aug(cfg, img, joints, joints_vis, use_h, use_w, rotate = 1, offset=False, ratio = 0.5, mode=0, prob = 1., db_rec=None): 112 | if np.random.rand() > prob: 113 | return img, joints, joints_vis,db_rec 114 | db_rec['img_label'] = 1 115 | h = img.size(1) 116 | w = img.size(2) 117 | d1 = 2 118 | d2 = min(h, w) 119 | hh = int(1.5*h) 120 | ww = int(1.5*w) 121 | d = np.random.randint(d1, d2) 122 | if ratio == 1: 123 | l = np.random.randint(1, d) 124 | else: 125 | l = min(max(int(d*ratio+0.5),1),d-1) 126 | mask = np.ones((hh, ww), np.float32) 127 | st_h = np.random.randint(d) 128 | st_w = np.random.randint(d) 129 | if use_h: 130 | for i in range(hh//d): 131 | s = d*i + st_h 132 | t = min(s+l, hh) 133 | mask[s:t,:] *= 0 134 | if use_w: 135 | for i in range(ww//d): 136 | s = d*i + st_w 137 | t = min(s+l, ww) 138 | mask[:,s:t] *= 0 139 | 140 | r = np.random.randint(rotate) 141 | mask = Image.fromarray(np.uint8(mask)) 142 | mask = mask.rotate(r) 143 | mask = np.asarray(mask) 144 | mask = mask[(hh-h)//2:(hh-h)//2+h, (ww-w)//2:(ww-w)//2+w] 145 | 146 | mask = torch.from_numpy(mask).float() 147 | if mode == 1: 148 | mask = 1-mask 149 | 150 | tmp_mask = mask 151 | mask = mask.expand_as(img) 152 | if offset: 153 | offset = torch.from_numpy(2 * (np.random.rand(h,w) - 0.5)).float() 154 | offset = (1 - mask) * offset 155 | img = img * mask + offset 156 | else: 157 | img = img * mask 158 | 159 | for joint_id in range(cfg.joints_num): 160 | joint = joints[joint_id][:2] 161 | tmp_x = min(int(joint[0]), tmp_mask.shape[1] - 1) 162 | tmp_x = max(tmp_x, 0) 163 | tmp_y = min(int(joint[1]), tmp_mask.shape[0] - 1) 164 | tmp_y = max(tmp_y, 0) 165 | 166 | if tmp_mask[tmp_y, tmp_x] == 0: 
167 | joints_vis[joint_id][0] = 0 168 | joints_vis[joint_id][1] = 0 169 | 170 | return img, joints, joints_vis,db_rec 171 | 172 | 173 | class MixCombine(object): 174 | def __init__(self, to_float32=False): 175 | self.to_float32 = to_float32 176 | self.autoaug = ImageNetPolicy(fillcolor=(128, 128, 128)) 177 | def __call__(self, inputs, transform): 178 | db_rec, get_clean, cfg = inputs 179 | if get_clean != 'clean': 180 | if get_clean == 'autoaug': 181 | if db_rec['dataset'] != 'style' or not cfg.sp_style: 182 | db_rec['img_label'] = 1 183 | data_numpy = db_rec['data_numpy'] 184 | tmp_img = Image.fromarray(data_numpy.astype(np.uint8)) 185 | data_numpy = self.autoaug(tmp_img) 186 | db_rec['data_numpy'] = np.array(data_numpy) 187 | db_rec['data_numpy'] = transform(db_rec['data_numpy']) 188 | 189 | elif get_clean == 'gridmask': 190 | db_rec['data_numpy'] = transform(db_rec['data_numpy']) 191 | if db_rec['dataset'] != 'style' or not cfg.sp_style: 192 | rotate = 1 193 | offset=False 194 | ratio = 0.5 195 | mode=1 196 | prob = 0.7 197 | self.st_prob = prob 198 | data_numpy = db_rec['data_numpy'] 199 | joints = db_rec['joints_3d'] 200 | joints_vis = db_rec['joints_3d_vis'] 201 | 202 | data_numpy, joints, joints_vis, db_rec = grid_aug(cfg, data_numpy, joints, joints_vis, True, True, rotate, offset, ratio, mode, prob, db_rec) 203 | 204 | db_rec['data_numpy'] = data_numpy 205 | db_rec['joints_3d'] = joints 206 | db_rec['joints_3d_vis'] = joints_vis 207 | 208 | else: 209 | db_rec['data_numpy'] = transform(db_rec['data_numpy']) 210 | 211 | return db_rec, cfg 212 | 213 | def set_prob(self, epoch, max_epoch): 214 | self.prob = self.st_prob * epoch / max_epoch 215 | -------------------------------------------------------------------------------- /lib/dataset/coco.py: -------------------------------------------------------------------------------- 1 | # ------------------------------------------------------------------------------ 2 | # Copyright (c) Microsoft 3 | # Licensed under the MIT License. 
4 | # Written by Bin Xiao (Bin.Xiao@microsoft.com) 5 | # ------------------------------------------------------------------------------ 6 | 7 | from __future__ import absolute_import 8 | from __future__ import division 9 | from __future__ import print_function 10 | 11 | from collections import defaultdict 12 | from collections import OrderedDict 13 | import logging 14 | import os, cv2 15 | 16 | from pycocotools.coco import COCO 17 | from pycocotools.cocoeval import COCOeval 18 | import json_tricks as json 19 | import numpy as np 20 | 21 | from dataset.JointsDataset import JointsDataset 22 | from nms.nms import oks_nms 23 | from nms.nms import soft_oks_nms 24 | 25 | 26 | logger = logging.getLogger(__name__) 27 | 28 | 29 | class COCODataset(JointsDataset): 30 | def __init__(self, cfg, args ,root, image_set, is_train, transform=None): 31 | super().__init__(cfg, args, root, image_set, is_train, transform) 32 | self.args = args 33 | self.nms_thre = cfg.TEST.NMS_THRE 34 | self.image_thre = cfg.TEST.IMAGE_THRE 35 | self.soft_nms = cfg.TEST.SOFT_NMS 36 | self.oks_thre = cfg.TEST.OKS_THRE 37 | self.in_vis_thre = cfg.TEST.IN_VIS_THRE 38 | self.bbox_file = cfg.TEST.COCO_BBOX_FILE 39 | self.use_gt_bbox = cfg.TEST.USE_GT_BBOX 40 | self.image_width = cfg.MODEL.IMAGE_SIZE[0] 41 | self.image_height = cfg.MODEL.IMAGE_SIZE[1] 42 | self.aspect_ratio = self.image_width * 1.0 / self.image_height 43 | self.pixel_std = 200 44 | 45 | # add paramters for test robustness 46 | self.test_robust = cfg.TEST.TEST_ROBUST 47 | self.corruption_type = cfg.TEST.CORRUPTION_TYPE 48 | self.root_c = cfg.DATASET.ROOT_C 49 | self.severity = cfg.TEST.SEVERITY 50 | self.mini_coco = cfg.DATASET.MINI_COCO 51 | self.coco = COCO(self._get_ann_file_keypoint()) 52 | 53 | cats = [cat['name'] 54 | for cat in self.coco.loadCats(self.coco.getCatIds())] 55 | self.classes = ['__background__'] + cats 56 | logger.info('=> classes: {}'.format(self.classes)) 57 | self.num_classes = len(self.classes) 58 | self._class_to_ind = dict(zip(self.classes, range(self.num_classes))) 59 | self._class_to_coco_ind = dict(zip(cats, self.coco.getCatIds())) 60 | self._coco_ind_to_class_ind = dict( 61 | [ 62 | (self._class_to_coco_ind[cls], self._class_to_ind[cls]) 63 | for cls in self.classes[1:] 64 | ] 65 | ) 66 | self.image_set_index = self._load_image_set_index() 67 | self.num_images = len(self.image_set_index) 68 | logger.info('=> num_images: {}'.format(self.num_images)) 69 | 70 | self.num_joints = 17 71 | self.flip_pairs = [[1, 2], [3, 4], [5, 6], [7, 8], 72 | [9, 10], [11, 12], [13, 14], [15, 16]] 73 | self.parent_ids = None 74 | self.upper_body_ids = (0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10) 75 | self.lower_body_ids = (11, 12, 13, 14, 15, 16) 76 | 77 | self.joints_weight = np.array( 78 | [ 79 | 1., 1., 1., 1., 1., 1., 1., 1.2, 1.2, 80 | 1.5, 1.5, 1., 1., 1.2, 1.2, 1.5, 1.5 81 | ], 82 | dtype=np.float32 83 | ).reshape((self.num_joints, 1)) 84 | 85 | self.db = self._get_db() 86 | 87 | if is_train and cfg.DATASET.SELECT_DATA: 88 | self.db = self.select_data(self.db) 89 | 90 | logger.info('=> load {} samples'.format(len(self.db))) 91 | 92 | def _get_ann_file_keypoint(self): 93 | """ self.root / annotations / person_keypoints_train2017.json """ 94 | prefix = 'person_keypoints' \ 95 | if 'test' not in self.image_set else 'image_info' 96 | 97 | json_path = os.path.join( 98 | self.root, 99 | 'annotations', 100 | prefix + '_' + self.image_set + '.json' 101 | ) 102 | if not os.path.exists(json_path): 103 | json_path = os.path.join( 104 | 'data/coco', 105 | 'annotations', 
106 | prefix + '_' + self.image_set + '.json' 107 | ) 108 | return json_path 109 | 110 | def _load_image_set_index(self): 111 | """ image id: int """ 112 | if self.mini_coco: 113 | image_ids = self.coco.getImgIds()[:200] 114 | else: 115 | image_ids = self.coco.getImgIds() 116 | return image_ids 117 | 118 | def _get_db(self): 119 | if self.is_train or self.use_gt_bbox: 120 | gt_db = self._load_coco_keypoint_annotations() 121 | else: 122 | if self.mini_coco: 123 | gt_db = self._load_coco_keypoint_annotations() 124 | else: 125 | gt_db = self._load_coco_person_detection_results() 126 | return gt_db 127 | 128 | def _load_coco_keypoint_annotations(self): 129 | """ ground truth bbox and keypoints """ 130 | gt_db = [] 131 | for index in self.image_set_index: 132 | gt_db.extend(self._load_coco_keypoint_annotation_kernal(index)) 133 | return gt_db 134 | 135 | def _load_coco_keypoint_annotation_kernal(self, index): 136 | """ 137 | coco ann: [u'segmentation', u'area', u'iscrowd', u'image_id', u'bbox', u'category_id', u'id'] 138 | iscrowd: 139 | crowd instances are handled by marking their overlaps with all categories to -1 140 | and later excluded in training 141 | bbox: 142 | [x1, y1, w, h] 143 | :param index: coco image id 144 | :return: db entry 145 | """ 146 | im_ann = self.coco.loadImgs(index)[0] 147 | width = im_ann['width'] 148 | height = im_ann['height'] 149 | 150 | annIds = self.coco.getAnnIds(imgIds=index, iscrowd=False) 151 | objs = self.coco.loadAnns(annIds) 152 | 153 | valid_objs = [] 154 | for obj in objs: 155 | x, y, w, h = obj['bbox'] 156 | x1 = np.max((0, x)) 157 | y1 = np.max((0, y)) 158 | x2 = np.min((width - 1, x1 + np.max((0, w - 1)))) 159 | y2 = np.min((height - 1, y1 + np.max((0, h - 1)))) 160 | if obj['area'] > 0 and x2 >= x1 and y2 >= y1: 161 | obj['clean_bbox'] = [x1, y1, x2-x1, y2-y1] 162 | valid_objs.append(obj) 163 | objs = valid_objs 164 | 165 | rec = [] 166 | for kobj, obj in enumerate(objs): 167 | cls = self._coco_ind_to_class_ind[obj['category_id']] 168 | if cls != 1: 169 | continue 170 | 171 | if max(obj['keypoints']) == 0: 172 | continue 173 | 174 | joints_3d = np.zeros((self.num_joints, 3), dtype=np.float) 175 | joints_3d_vis = np.zeros((self.num_joints, 3), dtype=np.float) 176 | for ipt in range(self.num_joints): 177 | joints_3d[ipt, 0] = obj['keypoints'][ipt * 3 + 0] 178 | joints_3d[ipt, 1] = obj['keypoints'][ipt * 3 + 1] 179 | joints_3d[ipt, 2] = 0 180 | t_vis = obj['keypoints'][ipt * 3 + 2] 181 | if t_vis > 1: 182 | t_vis = 1 183 | joints_3d_vis[ipt, 0] = t_vis 184 | joints_3d_vis[ipt, 1] = t_vis 185 | joints_3d_vis[ipt, 2] = 0 186 | 187 | center, scale = self._box2cs(obj['clean_bbox'][:4]) 188 | rec.append({ 189 | 'image': self.image_path_from_index(index), 190 | 'center': center, 191 | 'scale': scale, 192 | 'joints_3d': joints_3d, 193 | 'joints_3d_vis': joints_3d_vis, 194 | 'filename': '', 195 | 'imgnum': 0, 196 | 'instance_index':str(index) + '_' + str(kobj) 197 | }) 198 | 199 | return rec 200 | 201 | def _box2cs(self, box): 202 | x, y, w, h = box[:4] 203 | return self._xywh2cs(x, y, w, h) 204 | 205 | def _xywh2cs(self, x, y, w, h): 206 | center = np.zeros((2), dtype=np.float32) 207 | center[0] = x + w * 0.5 208 | center[1] = y + h * 0.5 209 | 210 | if w > self.aspect_ratio * h: 211 | h = w * 1.0 / self.aspect_ratio 212 | elif w < self.aspect_ratio * h: 213 | w = h * self.aspect_ratio 214 | scale = np.array( 215 | [w * 1.0 / self.pixel_std, h * 1.0 / self.pixel_std], 216 | dtype=np.float32) 217 | 218 | if center[0] != -1: 219 | scale = scale * 1.25 220 | 
return center, scale 221 | 222 | def image_path_from_index(self, index): 223 | """ example: images / train2017 / 000000119993.jpg """ 224 | # testing the robustnss 225 | if self.test_robust and self.corruption_type != 'clean': 226 | file_name = self.corruption_type + '/' + str(self.severity) + '/' + '%012d.jpg' % index 227 | else: 228 | file_name = '%012d.jpg' % index 229 | 230 | if '2014' in self.image_set: 231 | file_name = 'COCO_%s_' % self.image_set + file_name 232 | 233 | prefix = 'test2017' if 'test' in self.image_set else self.image_set 234 | 235 | data_name = prefix + '.zip@' if self.data_format == 'zip' else prefix 236 | 237 | if self.test_robust and self.corruption_type != 'clean': 238 | image_path = os.path.join( 239 | self.root_c, file_name) 240 | else: 241 | if 'stylize_image' in self.root: 242 | image_path = os.path.join( 243 | self.root, file_name) 244 | else: 245 | image_path = os.path.join( 246 | self.root, data_name, file_name) 247 | return image_path 248 | 249 | def _load_coco_person_detection_results(self): 250 | all_boxes = None 251 | with open(self.bbox_file, 'r') as f: 252 | all_boxes = json.load(f) 253 | 254 | if not all_boxes: 255 | logger.error('=> Load %s fail!' % self.bbox_file) 256 | return None 257 | 258 | logger.info('=> Total boxes: {}'.format(len(all_boxes))) 259 | 260 | kpt_db = [] 261 | num_boxes = 0 262 | if self.mini_coco: 263 | all_boxes = all_boxes[:100] 264 | 265 | for n_img in range(0, len(all_boxes)): 266 | det_res = all_boxes[n_img] 267 | if det_res['category_id'] != 1: 268 | continue 269 | img_name = self.image_path_from_index(det_res['image_id']) 270 | box = det_res['bbox'] 271 | score = det_res['score'] 272 | 273 | if score < self.image_thre: 274 | continue 275 | 276 | num_boxes = num_boxes + 1 277 | 278 | center, scale = self._box2cs(box) 279 | joints_3d = np.zeros((self.num_joints, 3), dtype=np.float) 280 | joints_3d_vis = np.ones( 281 | (self.num_joints, 3), dtype=np.float) 282 | kpt_db.append({ 283 | 'image': img_name, 284 | 'center': center, 285 | 'scale': scale, 286 | 'score': score, 287 | 'joints_3d': joints_3d, 288 | 'joints_3d_vis': joints_3d_vis, 289 | }) 290 | 291 | logger.info('=> Total boxes after fliter low score@{}: {}'.format( 292 | self.image_thre, num_boxes)) 293 | return kpt_db 294 | 295 | def evaluate(self, cfg, preds, output_dir, all_boxes, img_path, 296 | *args, **kwargs): 297 | rank = cfg.RANK 298 | 299 | res_folder = os.path.join(output_dir, 'results') 300 | if not os.path.exists(res_folder): 301 | try: 302 | os.makedirs(res_folder) 303 | except Exception: 304 | logger.error('Fail to make {}'.format(res_folder)) 305 | 306 | if self.test_robust and self.corruption_type != 'clean': 307 | 308 | res_file = os.path.join( 309 | res_folder, 'keypoints_{}_results_{}_{}_{}.json'.format( 310 | self.image_set, rank, self.corruption_type, self.severity) 311 | ) 312 | 313 | else: 314 | res_file = os.path.join( 315 | res_folder, 'keypoints_{}_results_{}.json'.format( 316 | self.image_set, rank) 317 | ) 318 | 319 | _kpts = [] 320 | 321 | for idx, kpt in enumerate(preds): 322 | image_idx = int(img_path[idx][-16:-4]) 323 | _kpts.append({ 324 | 'keypoints': kpt, 325 | 'center': all_boxes[idx][0:2], 326 | 'scale': all_boxes[idx][2:4], 327 | 'area': all_boxes[idx][4], 328 | 'score': all_boxes[idx][5], 329 | 'image': image_idx 330 | }) 331 | kpts = defaultdict(list) 332 | for kpt in _kpts: 333 | kpts[kpt['image']].append(kpt) 334 | 335 | num_joints = self.num_joints 336 | in_vis_thre = self.in_vis_thre 337 | oks_thre = self.oks_thre 338 | 
oks_nmsed_kpts = [] 339 | for img in kpts.keys(): 340 | img_kpts = kpts[img] 341 | for n_p in img_kpts: 342 | box_score = n_p['score'] 343 | kpt_score = 0 344 | valid_num = 0 345 | for n_jt in range(0, num_joints): 346 | t_s = n_p['keypoints'][n_jt][2] 347 | if t_s > in_vis_thre: 348 | kpt_score = kpt_score + t_s 349 | valid_num = valid_num + 1 350 | if valid_num != 0: 351 | kpt_score = kpt_score / valid_num 352 | 353 | n_p['score'] = kpt_score * box_score 354 | 355 | if self.soft_nms: 356 | keep = soft_oks_nms( 357 | [img_kpts[i] for i in range(len(img_kpts))], 358 | oks_thre 359 | ) 360 | else: 361 | keep = oks_nms( 362 | [img_kpts[i] for i in range(len(img_kpts))], 363 | oks_thre 364 | ) 365 | 366 | if len(keep) == 0: 367 | oks_nmsed_kpts.append(img_kpts) 368 | else: 369 | oks_nmsed_kpts.append([img_kpts[_keep] for _keep in keep]) 370 | 371 | self._write_coco_keypoint_results( 372 | oks_nmsed_kpts, res_file) 373 | if 'test' not in self.image_set: 374 | info_str = self._do_python_keypoint_eval( 375 | res_file, res_folder) 376 | name_value = OrderedDict(info_str) 377 | return name_value, name_value['AP'] 378 | else: 379 | return {'Null': 0}, 0 380 | 381 | def _write_coco_keypoint_results(self, keypoints, res_file): 382 | data_pack = [ 383 | { 384 | 'cat_id': self._class_to_coco_ind[cls], 385 | 'cls_ind': cls_ind, 386 | 'cls': cls, 387 | 'ann_type': 'keypoints', 388 | 'keypoints': keypoints 389 | } 390 | for cls_ind, cls in enumerate(self.classes) if not cls == '__background__' 391 | ] 392 | results = self._coco_keypoint_results_one_category_kernel(data_pack[0]) 393 | logger.info('=> writing results json to %s' % res_file) 394 | with open(res_file, 'w') as f: 395 | json.dump(results, f, sort_keys=True, indent=4) 396 | try: 397 | json.load(open(res_file)) 398 | except Exception: 399 | content = [] 400 | with open(res_file, 'r') as f: 401 | for line in f: 402 | content.append(line) 403 | content[-1] = ']' 404 | with open(res_file, 'w') as f: 405 | for c in content: 406 | f.write(c) 407 | 408 | def _coco_keypoint_results_one_category_kernel(self, data_pack): 409 | cat_id = data_pack['cat_id'] 410 | keypoints = data_pack['keypoints'] 411 | cat_results = [] 412 | for img_kpts in keypoints: 413 | if len(img_kpts) == 0: 414 | continue 415 | 416 | _key_points = np.array([img_kpts[k]['keypoints'] 417 | for k in range(len(img_kpts))]) 418 | key_points = np.zeros( 419 | (_key_points.shape[0], self.num_joints * 3), dtype=np.float 420 | ) 421 | for ipt in range(self.num_joints): 422 | key_points[:, ipt * 3 + 0] = _key_points[:, ipt, 0] 423 | key_points[:, ipt * 3 + 1] = _key_points[:, ipt, 1] 424 | key_points[:, ipt * 3 + 2] = _key_points[:, ipt, 2] 425 | 426 | result = [ 427 | { 428 | 'image_id': img_kpts[k]['image'], 429 | 'category_id': cat_id, 430 | 'keypoints': list(key_points[k]), 431 | 'score': img_kpts[k]['score'], 432 | 'center': list(img_kpts[k]['center']), 433 | 'scale': list(img_kpts[k]['scale']) 434 | } 435 | for k in range(len(img_kpts)) 436 | ] 437 | cat_results.extend(result) 438 | 439 | return cat_results 440 | 441 | def _do_python_keypoint_eval(self, res_file, res_folder): 442 | coco_dt = self.coco.loadRes(res_file) 443 | coco_eval = COCOeval(self.coco, coco_dt, 'keypoints') 444 | coco_eval.params.useSegm = None 445 | coco_eval.evaluate() 446 | coco_eval.accumulate() 447 | coco_eval.summarize() 448 | 449 | stats_names = ['AP', 'Ap .5', 'AP .75', 'AP (M)', 'AP (L)', 'AR', 'AR .5', 'AR .75', 'AR (M)', 'AR (L)'] 450 | info_str = [] 451 | for ind, name in enumerate(stats_names): 452 | 
info_str.append((name, coco_eval.stats[ind])) 453 | 454 | return info_str 455 | -------------------------------------------------------------------------------- /lib/dataset/mpii.py: -------------------------------------------------------------------------------- 1 | # ------------------------------------------------------------------------------ 2 | # Copyright (c) Microsoft 3 | # Licensed under the MIT License. 4 | # Written by Bin Xiao (Bin.Xiao@microsoft.com) 5 | # ------------------------------------------------------------------------------ 6 | 7 | from __future__ import absolute_import 8 | from __future__ import division 9 | from __future__ import print_function 10 | 11 | import logging 12 | import os 13 | import json_tricks as json 14 | from collections import OrderedDict 15 | 16 | import numpy as np 17 | from scipy.io import loadmat, savemat 18 | 19 | from dataset.JointsDataset import JointsDataset 20 | 21 | 22 | logger = logging.getLogger(__name__) 23 | 24 | 25 | class MPIIDataset(JointsDataset): 26 | def __init__(self, cfg, args, root, image_set, is_train, transform=None): 27 | super().__init__(cfg, args, root, image_set, is_train, transform) 28 | 29 | # add paramters for test robustness 30 | self.args = args 31 | self.test_robust = cfg.TEST.TEST_ROBUST 32 | self.corruption_type = cfg.TEST.CORRUPTION_TYPE 33 | self.root_c = cfg.DATASET.ROOT_C 34 | self.severity = cfg.TEST.SEVERITY 35 | 36 | self.num_joints = 16 37 | self.flip_pairs = [[0, 5], [1, 4], [2, 3], [10, 15], [11, 14], [12, 13]] 38 | self.parent_ids = [1, 2, 6, 6, 3, 4, 6, 6, 7, 8, 11, 12, 7, 7, 13, 14] 39 | 40 | self.upper_body_ids = (7, 8, 9, 10, 11, 12, 13, 14, 15) 41 | self.lower_body_ids = (0, 1, 2, 3, 4, 5, 6) 42 | 43 | self.db = self._get_db() 44 | 45 | if is_train and cfg.DATASET.SELECT_DATA: 46 | self.db = self.select_data(self.db) 47 | 48 | logger.info('=> load {} samples'.format(len(self.db))) 49 | 50 | def _get_db(self): 51 | 52 | file_name = os.path.join( 53 | self.root, 'annot', self.image_set+'.json' 54 | ) 55 | with open(file_name) as anno_file: 56 | anno = json.load(anno_file) 57 | 58 | gt_db = [] 59 | for a in anno: 60 | 61 | if self.test_robust and self.corruption_type != 'clean': 62 | image_name = self.corruption_type + '/' + str(self.severity) + '/' + a['image'] 63 | else: 64 | image_name = a['image'] 65 | 66 | c = np.array(a['center'], dtype=np.float) 67 | s = np.array([a['scale'], a['scale']], dtype=np.float) 68 | 69 | # Adjust center/scale slightly to avoid cropping limbs 70 | if c[0] != -1: 71 | c[1] = c[1] + 15 * s[1] 72 | s = s * 1.25 73 | 74 | # MPII uses matlab format, index is based 1, 75 | # we should first convert to 0-based index 76 | c = c - 1 77 | 78 | joints_3d = np.zeros((self.num_joints, 3), dtype=np.float) 79 | joints_3d_vis = np.zeros((self.num_joints, 3), dtype=np.float) 80 | if self.image_set != 'test': 81 | joints = np.array(a['joints']) 82 | joints[:, 0:2] = joints[:, 0:2] - 1 83 | joints_vis = np.array(a['joints_vis']) 84 | assert len(joints) == self.num_joints, \ 85 | 'joint num diff: {} vs {}'.format(len(joints), 86 | self.num_joints) 87 | 88 | joints_3d[:, 0:2] = joints[:, 0:2] 89 | joints_3d_vis[:, 0] = joints_vis[:] 90 | joints_3d_vis[:, 1] = joints_vis[:] 91 | 92 | image_dir = 'images.zip@' if self.data_format == 'zip' else 'images' 93 | 94 | if self.test_robust and self.corruption_type != 'clean': 95 | image_path = os.path.join( 96 | self.root_c, image_name) 97 | else: 98 | image_path = os.path.join(self.root, image_dir, image_name) 99 | 100 | gt_db.append( 101 | { 102 
| 'image': image_path, 103 | 'center': c, 104 | 'scale': s, 105 | 'joints_3d': joints_3d, 106 | 'joints_3d_vis': joints_3d_vis, 107 | 'filename': '', 108 | 'imgnum': 0, 109 | } 110 | ) 111 | 112 | return gt_db 113 | 114 | def evaluate(self, cfg, preds, output_dir, *args, **kwargs): 115 | # convert 0-based index to 1-based index 116 | preds = preds[:, :, 0:2] + 1.0 117 | 118 | if output_dir: 119 | if self.test_robust: 120 | pred_file = os.path.join(output_dir, '{}_{}_pred.mat'.format(self.corruption_type, self.severity)) 121 | savemat(pred_file, mdict={'preds': preds}) 122 | else: 123 | pred_file = os.path.join(output_dir, 'pred.mat') 124 | savemat(pred_file, mdict={'preds': preds}) 125 | 126 | if 'test' in cfg.DATASET.TEST_SET: 127 | return {'Null': 0.0}, 0.0 128 | 129 | SC_BIAS = 0.6 130 | threshold = 0.5 131 | 132 | gt_file = os.path.join(cfg.DATASET.ROOT, 133 | 'annot', 134 | 'gt_{}.mat'.format(cfg.DATASET.TEST_SET)) 135 | gt_dict = loadmat(gt_file) 136 | dataset_joints = gt_dict['dataset_joints'] 137 | jnt_missing = gt_dict['jnt_missing'] 138 | pos_gt_src = gt_dict['pos_gt_src'] 139 | headboxes_src = gt_dict['headboxes_src'] 140 | 141 | pos_pred_src = np.transpose(preds, [1, 2, 0]) 142 | 143 | head = np.where(dataset_joints == 'head')[1][0] 144 | lsho = np.where(dataset_joints == 'lsho')[1][0] 145 | lelb = np.where(dataset_joints == 'lelb')[1][0] 146 | lwri = np.where(dataset_joints == 'lwri')[1][0] 147 | lhip = np.where(dataset_joints == 'lhip')[1][0] 148 | lkne = np.where(dataset_joints == 'lkne')[1][0] 149 | lank = np.where(dataset_joints == 'lank')[1][0] 150 | 151 | rsho = np.where(dataset_joints == 'rsho')[1][0] 152 | relb = np.where(dataset_joints == 'relb')[1][0] 153 | rwri = np.where(dataset_joints == 'rwri')[1][0] 154 | rkne = np.where(dataset_joints == 'rkne')[1][0] 155 | rank = np.where(dataset_joints == 'rank')[1][0] 156 | rhip = np.where(dataset_joints == 'rhip')[1][0] 157 | 158 | jnt_visible = 1 - jnt_missing 159 | uv_error = pos_pred_src - pos_gt_src 160 | uv_err = np.linalg.norm(uv_error, axis=1) 161 | headsizes = headboxes_src[1, :, :] - headboxes_src[0, :, :] 162 | headsizes = np.linalg.norm(headsizes, axis=0) 163 | headsizes *= SC_BIAS 164 | scale = np.multiply(headsizes, np.ones((len(uv_err), 1))) 165 | scaled_uv_err = np.divide(uv_err, scale) 166 | scaled_uv_err = np.multiply(scaled_uv_err, jnt_visible) 167 | jnt_count = np.sum(jnt_visible, axis=1) 168 | less_than_threshold = np.multiply((scaled_uv_err <= threshold), 169 | jnt_visible) 170 | PCKh = np.divide(100.*np.sum(less_than_threshold, axis=1), jnt_count) 171 | 172 | rng = np.arange(0, 0.5+0.01, 0.01) 173 | pckAll = np.zeros((len(rng), 16)) 174 | 175 | for r in range(len(rng)): 176 | threshold = rng[r] 177 | less_than_threshold = np.multiply(scaled_uv_err <= threshold, 178 | jnt_visible) 179 | pckAll[r, :] = np.divide(100.*np.sum(less_than_threshold, axis=1), 180 | jnt_count) 181 | 182 | PCKh = np.ma.array(PCKh, mask=False) 183 | PCKh.mask[6:8] = True 184 | 185 | jnt_count = np.ma.array(jnt_count, mask=False) 186 | jnt_count.mask[6:8] = True 187 | jnt_ratio = jnt_count / np.sum(jnt_count).astype(np.float64) 188 | 189 | name_value = [ 190 | ('Head', PCKh[head]), 191 | ('Shoulder', 0.5 * (PCKh[lsho] + PCKh[rsho])), 192 | ('Elbow', 0.5 * (PCKh[lelb] + PCKh[relb])), 193 | ('Wrist', 0.5 * (PCKh[lwri] + PCKh[rwri])), 194 | ('Hip', 0.5 * (PCKh[lhip] + PCKh[rhip])), 195 | ('Knee', 0.5 * (PCKh[lkne] + PCKh[rkne])), 196 | ('Ankle', 0.5 * (PCKh[lank] + PCKh[rank])), 197 | ('Mean', np.sum(PCKh * jnt_ratio)), 198 | 
('Mean@0.1', np.sum(pckAll[11, :] * jnt_ratio)) 199 | ] 200 | name_value = OrderedDict(name_value) 201 | 202 | return name_value, name_value['Mean'] -------------------------------------------------------------------------------- /lib/models/Unet_generator.py: -------------------------------------------------------------------------------- 1 | from __future__ import absolute_import 2 | from __future__ import division 3 | from __future__ import print_function 4 | 5 | import os 6 | import logging 7 | 8 | import torch 9 | import torch.nn as nn 10 | 11 | import functools 12 | 13 | class UnetBlock(nn.Module): 14 | """Defines the Unet submodule with skip connection. 15 | X -------------------identity---------------------- 16 | |-- downsampling -- |submodule| -- upsampling --| 17 | """ 18 | def __init__(self, outer_nc, inner_nc, input_nc=None, 19 | submodule=None, outermost=False, innermost=False, norm_layer=nn.InstanceNorm2d, use_dropout=False, with_tanh=True): 20 | """Construct a Unet submodule with skip connections. 21 | 22 | Parameters: 23 | outer_nc (int) -- the number of filters in the outer conv layer 24 | inner_nc (int) -- the number of filters in the inner conv layer 25 | input_nc (int) -- the number of channels in input images/features 26 | submodule (UnetSkipConnectionBlock) -- previously defined submodules 27 | outermost (bool) -- if this module is the outermost module 28 | innermost (bool) -- if this module is the innermost module 29 | norm_layer -- normalization layer 30 | user_dropout (bool) -- if use dropout layers. 31 | """ 32 | super(UnetBlock, self).__init__() 33 | self.outermost = outermost 34 | if type(norm_layer) == functools.partial: 35 | use_bias = norm_layer.func == nn.InstanceNorm2d 36 | else: 37 | use_bias = norm_layer == nn.InstanceNorm2d 38 | if input_nc is None: 39 | input_nc = outer_nc 40 | downconv = nn.Conv2d(input_nc, inner_nc, kernel_size=4, 41 | stride=2, padding=1, bias=use_bias) 42 | downrelu = nn.LeakyReLU(0.2, True) 43 | downnorm = norm_layer(inner_nc) 44 | uprelu = nn.ReLU(True) 45 | upnorm = norm_layer(outer_nc) 46 | 47 | if outermost: 48 | upconv = nn.ConvTranspose2d(inner_nc * 2, outer_nc, 49 | kernel_size=4, stride=2, 50 | padding=1) 51 | down = [downconv] 52 | if with_tanh: 53 | up = [uprelu, upconv, nn.Tanh()] 54 | else: 55 | up = [uprelu, upconv] 56 | model = down + [submodule] + up 57 | elif innermost: 58 | upconv = nn.ConvTranspose2d(inner_nc, outer_nc, 59 | kernel_size=4, stride=2, 60 | padding=1, bias=use_bias) 61 | down = [downrelu, downconv] 62 | 63 | up = [uprelu, upconv, upnorm] 64 | model = down + up 65 | else: 66 | upconv = nn.ConvTranspose2d(inner_nc * 2, outer_nc, 67 | kernel_size=4, stride=2, 68 | padding=1, bias=use_bias) 69 | down = [downrelu, downconv, downnorm] 70 | up = [uprelu, upconv, upnorm] 71 | 72 | if use_dropout: 73 | model = down + [submodule] + up + [nn.Dropout(0.5)] 74 | else: 75 | model = down + [submodule] + up 76 | 77 | self.model = nn.Sequential(*model) 78 | 79 | def forward(self, x): 80 | if self.outermost: 81 | return self.model(x) 82 | else: 83 | return torch.cat([x, self.model(x)], 1) 84 | 85 | class UnetGenerator(nn.Module): 86 | """Create a Unet-based generator""" 87 | 88 | def __init__(self, input_nc, output_nc, num_downs, ngf=64, norm_layer=nn.InstanceNorm2d, use_dropout=False, with_tanh=False): 89 | """Construct a Unet generator 90 | Parameters: 91 | input_nc (int) -- the number of channels in input images 92 | output_nc (int) -- the number of channels in output images 93 | num_downs (int) -- the number of 
downsamplings in UNet. For example, if |num_downs| == 7,
94 |                                 an image of size 128x128 will become of size 1x1 at the bottleneck
95 |             ngf (int) -- the number of filters in the last conv layer
96 |             norm_layer -- normalization layer
97 |
98 |         We construct the U-Net from the innermost layer to the outermost layer.
99 |         It is a recursive process.
100 |         """
101 |
102 |         super(UnetGenerator, self).__init__()
103 |         unet_block = UnetBlock(ngf * 8, ngf * 8, input_nc=None, submodule=None, norm_layer=norm_layer, innermost=True, with_tanh=with_tanh)  # add the innermost layer
104 |         for i in range(num_downs - 5):
105 |             unet_block = UnetBlock(ngf * 8, ngf * 8, input_nc=None, submodule=unet_block, norm_layer=norm_layer, use_dropout=use_dropout, with_tanh=with_tanh)
106 |         unet_block = UnetBlock(ngf * 4, ngf * 8, input_nc=None, submodule=unet_block, norm_layer=norm_layer, with_tanh=with_tanh)
107 |         unet_block = UnetBlock(ngf * 2, ngf * 4, input_nc=None, submodule=unet_block, norm_layer=norm_layer, with_tanh=with_tanh)
108 |         unet_block = UnetBlock(ngf, ngf * 2, input_nc=None, submodule=unet_block, norm_layer=norm_layer, with_tanh=with_tanh)
109 |         self.model = UnetBlock(output_nc, ngf, input_nc=input_nc, submodule=unet_block, outermost=True, norm_layer=norm_layer, with_tanh=with_tanh)  # add the outermost layer
110 |     def forward(self, input):
111 |         """Standard forward"""
112 |         return self.model(input)
113 |
--------------------------------------------------------------------------------
/lib/models/__init__.py:
--------------------------------------------------------------------------------
1 | # ------------------------------------------------------------------------------
2 | # Copyright (c) Microsoft
3 | # Licensed under the MIT License.
4 | # Written by Bin Xiao (Bin.Xiao@microsoft.com)
5 | # ------------------------------------------------------------------------------
6 |
7 | from __future__ import absolute_import
8 | from __future__ import division
9 | from __future__ import print_function
10 |
11 | import models.pose_resnet
12 | import models.pose_hrnet
13 | import models.Unet_generator
--------------------------------------------------------------------------------
/lib/models/pose_hrnet.py:
--------------------------------------------------------------------------------
1 | # ------------------------------------------------------------------------------
2 | # Copyright (c) Microsoft
3 | # Licensed under the MIT License. 
4 | # Written by Bin Xiao (Bin.Xiao@microsoft.com) 5 | # ------------------------------------------------------------------------------ 6 | 7 | from __future__ import absolute_import 8 | from __future__ import division 9 | from __future__ import print_function 10 | 11 | import os 12 | import logging 13 | 14 | import torch 15 | import torch.nn as nn 16 | 17 | 18 | BN_MOMENTUM = 0.1 19 | logger = logging.getLogger(__name__) 20 | 21 | 22 | def conv3x3(in_planes, out_planes, stride=1): 23 | """3x3 convolution with padding""" 24 | return nn.Conv2d(in_planes, out_planes, kernel_size=3, stride=stride, 25 | padding=1, bias=False) 26 | 27 | 28 | class BasicBlock(nn.Module): 29 | expansion = 1 30 | 31 | def __init__(self, inplanes, planes, stride=1, downsample=None): 32 | super(BasicBlock, self).__init__() 33 | self.conv1 = conv3x3(inplanes, planes, stride) 34 | self.bn1 = nn.BatchNorm2d(planes, momentum=BN_MOMENTUM) 35 | self.relu = nn.ReLU(inplace=True) 36 | self.conv2 = conv3x3(planes, planes) 37 | self.bn2 = nn.BatchNorm2d(planes, momentum=BN_MOMENTUM) 38 | self.downsample = downsample 39 | self.stride = stride 40 | 41 | def forward(self, x): 42 | residual = x 43 | 44 | out = self.conv1(x) 45 | out = self.bn1(out) 46 | out = self.relu(out) 47 | 48 | out = self.conv2(out) 49 | out = self.bn2(out) 50 | 51 | if self.downsample is not None: 52 | residual = self.downsample(x) 53 | 54 | out += residual 55 | out = self.relu(out) 56 | 57 | return out 58 | 59 | 60 | class Bottleneck(nn.Module): 61 | expansion = 4 62 | 63 | def __init__(self, inplanes, planes, stride=1, downsample=None): 64 | super(Bottleneck, self).__init__() 65 | self.conv1 = nn.Conv2d(inplanes, planes, kernel_size=1, bias=False) 66 | self.bn1 = nn.BatchNorm2d(planes, momentum=BN_MOMENTUM) 67 | self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=stride, 68 | padding=1, bias=False) 69 | self.bn2 = nn.BatchNorm2d(planes, momentum=BN_MOMENTUM) 70 | self.conv3 = nn.Conv2d(planes, planes * self.expansion, kernel_size=1, 71 | bias=False) 72 | self.bn3 = nn.BatchNorm2d(planes * self.expansion, 73 | momentum=BN_MOMENTUM) 74 | self.relu = nn.ReLU(inplace=True) 75 | self.downsample = downsample 76 | self.stride = stride 77 | 78 | def forward(self, x): 79 | residual = x 80 | 81 | out = self.conv1(x) 82 | out = self.bn1(out) 83 | out = self.relu(out) 84 | 85 | out = self.conv2(out) 86 | out = self.bn2(out) 87 | out = self.relu(out) 88 | 89 | out = self.conv3(out) 90 | out = self.bn3(out) 91 | 92 | if self.downsample is not None: 93 | residual = self.downsample(x) 94 | 95 | out += residual 96 | out = self.relu(out) 97 | 98 | return out 99 | 100 | 101 | class HighResolutionModule(nn.Module): 102 | def __init__(self, num_branches, blocks, num_blocks, num_inchannels, 103 | num_channels, fuse_method, multi_scale_output=True): 104 | super(HighResolutionModule, self).__init__() 105 | self._check_branches( 106 | num_branches, blocks, num_blocks, num_inchannels, num_channels) 107 | 108 | self.num_inchannels = num_inchannels 109 | self.fuse_method = fuse_method 110 | self.num_branches = num_branches 111 | 112 | self.multi_scale_output = multi_scale_output 113 | 114 | self.branches = self._make_branches( 115 | num_branches, blocks, num_blocks, num_channels) 116 | self.fuse_layers = self._make_fuse_layers() 117 | self.relu = nn.ReLU(True) 118 | 119 | def _check_branches(self, num_branches, blocks, num_blocks, 120 | num_inchannels, num_channels): 121 | if num_branches != len(num_blocks): 122 | error_msg = 'NUM_BRANCHES({}) <> NUM_BLOCKS({})'.format( 
123 | num_branches, len(num_blocks)) 124 | logger.error(error_msg) 125 | raise ValueError(error_msg) 126 | 127 | if num_branches != len(num_channels): 128 | error_msg = 'NUM_BRANCHES({}) <> NUM_CHANNELS({})'.format( 129 | num_branches, len(num_channels)) 130 | logger.error(error_msg) 131 | raise ValueError(error_msg) 132 | 133 | if num_branches != len(num_inchannels): 134 | error_msg = 'NUM_BRANCHES({}) <> NUM_INCHANNELS({})'.format( 135 | num_branches, len(num_inchannels)) 136 | logger.error(error_msg) 137 | raise ValueError(error_msg) 138 | 139 | def _make_one_branch(self, branch_index, block, num_blocks, num_channels, 140 | stride=1): 141 | downsample = None 142 | if stride != 1 or \ 143 | self.num_inchannels[branch_index] != num_channels[branch_index] * block.expansion: 144 | downsample = nn.Sequential( 145 | nn.Conv2d( 146 | self.num_inchannels[branch_index], 147 | num_channels[branch_index] * block.expansion, 148 | kernel_size=1, stride=stride, bias=False 149 | ), 150 | nn.BatchNorm2d( 151 | num_channels[branch_index] * block.expansion, 152 | momentum=BN_MOMENTUM 153 | ), 154 | ) 155 | 156 | layers = [] 157 | layers.append( 158 | block( 159 | self.num_inchannels[branch_index], 160 | num_channels[branch_index], 161 | stride, 162 | downsample 163 | ) 164 | ) 165 | self.num_inchannels[branch_index] = \ 166 | num_channels[branch_index] * block.expansion 167 | for i in range(1, num_blocks[branch_index]): 168 | layers.append( 169 | block( 170 | self.num_inchannels[branch_index], 171 | num_channels[branch_index] 172 | ) 173 | ) 174 | 175 | return nn.Sequential(*layers) 176 | 177 | def _make_branches(self, num_branches, block, num_blocks, num_channels): 178 | branches = [] 179 | 180 | for i in range(num_branches): 181 | branches.append( 182 | self._make_one_branch(i, block, num_blocks, num_channels) 183 | ) 184 | 185 | return nn.ModuleList(branches) 186 | 187 | def _make_fuse_layers(self): 188 | if self.num_branches == 1: 189 | return None 190 | 191 | num_branches = self.num_branches 192 | num_inchannels = self.num_inchannels 193 | fuse_layers = [] 194 | for i in range(num_branches if self.multi_scale_output else 1): 195 | fuse_layer = [] 196 | for j in range(num_branches): 197 | if j > i: 198 | fuse_layer.append( 199 | nn.Sequential( 200 | nn.Conv2d( 201 | num_inchannels[j], 202 | num_inchannels[i], 203 | 1, 1, 0, bias=False 204 | ), 205 | nn.BatchNorm2d(num_inchannels[i]), 206 | nn.Upsample(scale_factor=2**(j-i), mode='nearest') 207 | ) 208 | ) 209 | elif j == i: 210 | fuse_layer.append(None) 211 | else: 212 | conv3x3s = [] 213 | for k in range(i-j): 214 | if k == i - j - 1: 215 | num_outchannels_conv3x3 = num_inchannels[i] 216 | conv3x3s.append( 217 | nn.Sequential( 218 | nn.Conv2d( 219 | num_inchannels[j], 220 | num_outchannels_conv3x3, 221 | 3, 2, 1, bias=False 222 | ), 223 | nn.BatchNorm2d(num_outchannels_conv3x3) 224 | ) 225 | ) 226 | else: 227 | num_outchannels_conv3x3 = num_inchannels[j] 228 | conv3x3s.append( 229 | nn.Sequential( 230 | nn.Conv2d( 231 | num_inchannels[j], 232 | num_outchannels_conv3x3, 233 | 3, 2, 1, bias=False 234 | ), 235 | nn.BatchNorm2d(num_outchannels_conv3x3), 236 | nn.ReLU(True) 237 | ) 238 | ) 239 | fuse_layer.append(nn.Sequential(*conv3x3s)) 240 | fuse_layers.append(nn.ModuleList(fuse_layer)) 241 | 242 | return nn.ModuleList(fuse_layers) 243 | 244 | def get_num_inchannels(self): 245 | return self.num_inchannels 246 | 247 | def forward(self, x): 248 | if self.num_branches == 1: 249 | return [self.branches[0](x[0])] 250 | 251 | for i in 
range(self.num_branches): 252 | x[i] = self.branches[i](x[i]) 253 | 254 | x_fuse = [] 255 | 256 | for i in range(len(self.fuse_layers)): 257 | y = x[0] if i == 0 else self.fuse_layers[i][0](x[0]) 258 | for j in range(1, self.num_branches): 259 | if i == j: 260 | y = y + x[j] 261 | else: 262 | y = y + self.fuse_layers[i][j](x[j]) 263 | x_fuse.append(self.relu(y)) 264 | 265 | return x_fuse 266 | 267 | 268 | blocks_dict = { 269 | 'BASIC': BasicBlock, 270 | 'BOTTLENECK': Bottleneck 271 | } 272 | 273 | 274 | class PoseHighResolutionNet(nn.Module): 275 | 276 | def __init__(self, cfg, **kwargs): 277 | self.inplanes = 64 278 | extra = cfg['MODEL']['EXTRA'] 279 | super(PoseHighResolutionNet, self).__init__() 280 | 281 | # stem net 282 | self.conv1 = nn.Conv2d(3, 64, kernel_size=3, stride=2, padding=1, 283 | bias=False) 284 | self.bn1 = nn.BatchNorm2d(64, momentum=BN_MOMENTUM) 285 | self.conv2 = nn.Conv2d(64, 64, kernel_size=3, stride=2, padding=1, 286 | bias=False) 287 | self.bn2 = nn.BatchNorm2d(64, momentum=BN_MOMENTUM) 288 | self.relu = nn.ReLU(inplace=True) 289 | self.layer1 = self._make_layer(Bottleneck, 64, 4) 290 | 291 | self.stage2_cfg = extra['STAGE2'] 292 | num_channels = self.stage2_cfg['NUM_CHANNELS'] 293 | block = blocks_dict[self.stage2_cfg['BLOCK']] 294 | num_channels = [ 295 | num_channels[i] * block.expansion for i in range(len(num_channels)) 296 | ] 297 | self.transition1 = self._make_transition_layer([256], num_channels) 298 | self.stage2, pre_stage_channels = self._make_stage( 299 | self.stage2_cfg, num_channels) 300 | 301 | self.stage3_cfg = extra['STAGE3'] 302 | num_channels = self.stage3_cfg['NUM_CHANNELS'] 303 | block = blocks_dict[self.stage3_cfg['BLOCK']] 304 | num_channels = [ 305 | num_channels[i] * block.expansion for i in range(len(num_channels)) 306 | ] 307 | self.transition2 = self._make_transition_layer( 308 | pre_stage_channels, num_channels) 309 | self.stage3, pre_stage_channels = self._make_stage( 310 | self.stage3_cfg, num_channels) 311 | 312 | self.stage4_cfg = extra['STAGE4'] 313 | num_channels = self.stage4_cfg['NUM_CHANNELS'] 314 | block = blocks_dict[self.stage4_cfg['BLOCK']] 315 | num_channels = [ 316 | num_channels[i] * block.expansion for i in range(len(num_channels)) 317 | ] 318 | self.transition3 = self._make_transition_layer( 319 | pre_stage_channels, num_channels) 320 | self.stage4, pre_stage_channels = self._make_stage( 321 | self.stage4_cfg, num_channels, multi_scale_output=False) 322 | 323 | self.final_layer = nn.Conv2d( 324 | in_channels=pre_stage_channels[0], 325 | out_channels=cfg['MODEL']['NUM_JOINTS'], 326 | kernel_size=extra['FINAL_CONV_KERNEL'], 327 | stride=1, 328 | padding=1 if extra['FINAL_CONV_KERNEL'] == 3 else 0 329 | ) 330 | 331 | self.pretrained_layers = extra['PRETRAINED_LAYERS'] 332 | 333 | def _make_transition_layer( 334 | self, num_channels_pre_layer, num_channels_cur_layer): 335 | num_branches_cur = len(num_channels_cur_layer) 336 | num_branches_pre = len(num_channels_pre_layer) 337 | 338 | transition_layers = [] 339 | for i in range(num_branches_cur): 340 | if i < num_branches_pre: 341 | if num_channels_cur_layer[i] != num_channels_pre_layer[i]: 342 | transition_layers.append( 343 | nn.Sequential( 344 | nn.Conv2d( 345 | num_channels_pre_layer[i], 346 | num_channels_cur_layer[i], 347 | 3, 1, 1, bias=False 348 | ), 349 | nn.BatchNorm2d(num_channels_cur_layer[i]), 350 | nn.ReLU(inplace=True) 351 | ) 352 | ) 353 | else: 354 | transition_layers.append(None) 355 | else: 356 | conv3x3s = [] 357 | for j in range(i+1-num_branches_pre): 
358 | inchannels = num_channels_pre_layer[-1] 359 | outchannels = num_channels_cur_layer[i] \ 360 | if j == i-num_branches_pre else inchannels 361 | conv3x3s.append( 362 | nn.Sequential( 363 | nn.Conv2d( 364 | inchannels, outchannels, 3, 2, 1, bias=False 365 | ), 366 | nn.BatchNorm2d(outchannels), 367 | nn.ReLU(inplace=True) 368 | ) 369 | ) 370 | transition_layers.append(nn.Sequential(*conv3x3s)) 371 | 372 | return nn.ModuleList(transition_layers) 373 | 374 | def _make_layer(self, block, planes, blocks, stride=1): 375 | downsample = None 376 | if stride != 1 or self.inplanes != planes * block.expansion: 377 | downsample = nn.Sequential( 378 | nn.Conv2d( 379 | self.inplanes, planes * block.expansion, 380 | kernel_size=1, stride=stride, bias=False 381 | ), 382 | nn.BatchNorm2d(planes * block.expansion, momentum=BN_MOMENTUM), 383 | ) 384 | 385 | layers = [] 386 | layers.append(block(self.inplanes, planes, stride, downsample)) 387 | self.inplanes = planes * block.expansion 388 | for i in range(1, blocks): 389 | layers.append(block(self.inplanes, planes)) 390 | 391 | return nn.Sequential(*layers) 392 | 393 | def _make_stage(self, layer_config, num_inchannels, 394 | multi_scale_output=True): 395 | num_modules = layer_config['NUM_MODULES'] 396 | num_branches = layer_config['NUM_BRANCHES'] 397 | num_blocks = layer_config['NUM_BLOCKS'] 398 | num_channels = layer_config['NUM_CHANNELS'] 399 | block = blocks_dict[layer_config['BLOCK']] 400 | fuse_method = layer_config['FUSE_METHOD'] 401 | 402 | modules = [] 403 | for i in range(num_modules): 404 | # multi_scale_output is only used last module 405 | if not multi_scale_output and i == num_modules - 1: 406 | reset_multi_scale_output = False 407 | else: 408 | reset_multi_scale_output = True 409 | 410 | modules.append( 411 | HighResolutionModule( 412 | num_branches, 413 | block, 414 | num_blocks, 415 | num_inchannels, 416 | num_channels, 417 | fuse_method, 418 | reset_multi_scale_output 419 | ) 420 | ) 421 | num_inchannels = modules[-1].get_num_inchannels() 422 | 423 | return nn.Sequential(*modules), num_inchannels 424 | 425 | def forward(self, x): 426 | x = self.conv1(x) 427 | x = self.bn1(x) 428 | x = self.relu(x) 429 | x = self.conv2(x) 430 | x = self.bn2(x) 431 | x = self.relu(x) 432 | x = self.layer1(x) 433 | 434 | x_list = [] 435 | for i in range(self.stage2_cfg['NUM_BRANCHES']): 436 | if self.transition1[i] is not None: 437 | x_list.append(self.transition1[i](x)) 438 | else: 439 | x_list.append(x) 440 | y_list = self.stage2(x_list) 441 | 442 | x_list = [] 443 | for i in range(self.stage3_cfg['NUM_BRANCHES']): 444 | if self.transition2[i] is not None: 445 | x_list.append(self.transition2[i](y_list[-1])) 446 | else: 447 | x_list.append(y_list[i]) 448 | y_list = self.stage3(x_list) 449 | 450 | x_list = [] 451 | for i in range(self.stage4_cfg['NUM_BRANCHES']): 452 | if self.transition3[i] is not None: 453 | x_list.append(self.transition3[i](y_list[-1])) 454 | else: 455 | x_list.append(y_list[i]) 456 | y_list = self.stage4(x_list) 457 | 458 | x = self.final_layer(y_list[0]) 459 | 460 | return x 461 | 462 | def init_weights(self, pretrained=''): 463 | logger.info('=> init weights from normal distribution') 464 | for m in self.modules(): 465 | if isinstance(m, nn.Conv2d): 466 | # nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu') 467 | nn.init.normal_(m.weight, std=0.001) 468 | for name, _ in m.named_parameters(): 469 | if name in ['bias']: 470 | nn.init.constant_(m.bias, 0) 471 | elif isinstance(m, nn.BatchNorm2d): 472 | 
nn.init.constant_(m.weight, 1) 473 | nn.init.constant_(m.bias, 0) 474 | elif isinstance(m, nn.ConvTranspose2d): 475 | nn.init.normal_(m.weight, std=0.001) 476 | for name, _ in m.named_parameters(): 477 | if name in ['bias']: 478 | nn.init.constant_(m.bias, 0) 479 | 480 | if os.path.isfile(pretrained): 481 | pretrained_state_dict = torch.load(pretrained) 482 | logger.info('=> loading pretrained model {}'.format(pretrained)) 483 | 484 | need_init_state_dict = {} 485 | for name, m in pretrained_state_dict.items(): 486 | if name.split('.')[0] in self.pretrained_layers \ 487 | or self.pretrained_layers[0] is '*': 488 | need_init_state_dict[name] = m 489 | self.load_state_dict(need_init_state_dict, strict=False) 490 | elif pretrained: 491 | logger.error('=> please download pre-trained models first!') 492 | raise ValueError('{} is not exist!'.format(pretrained)) 493 | 494 | 495 | def get_pose_net(cfg, is_train, **kwargs): 496 | model = PoseHighResolutionNet(cfg, **kwargs) 497 | 498 | if is_train and cfg['MODEL']['INIT_WEIGHTS']: 499 | model.init_weights(cfg['MODEL']['PRETRAINED']) 500 | 501 | return model -------------------------------------------------------------------------------- /lib/models/pose_resnet.py: -------------------------------------------------------------------------------- 1 | # ------------------------------------------------------------------------------ 2 | # Copyright (c) Microsoft 3 | # Licensed under the MIT License. 4 | # Written by Bin Xiao (Bin.Xiao@microsoft.com) 5 | # ------------------------------------------------------------------------------ 6 | 7 | from __future__ import absolute_import 8 | from __future__ import division 9 | from __future__ import print_function 10 | 11 | import os 12 | import logging 13 | 14 | import torch 15 | import torch.nn as nn 16 | 17 | 18 | BN_MOMENTUM = 0.1 19 | logger = logging.getLogger(__name__) 20 | 21 | 22 | def conv3x3(in_planes, out_planes, stride=1): 23 | """3x3 convolution with padding""" 24 | return nn.Conv2d( 25 | in_planes, out_planes, kernel_size=3, stride=stride, 26 | padding=1, bias=False 27 | ) 28 | 29 | 30 | class BasicBlock(nn.Module): 31 | expansion = 1 32 | 33 | def __init__(self, inplanes, planes, stride=1, downsample=None): 34 | super(BasicBlock, self).__init__() 35 | self.conv1 = conv3x3(inplanes, planes, stride) 36 | self.bn1 = nn.BatchNorm2d(planes, momentum=BN_MOMENTUM) 37 | self.relu = nn.ReLU(inplace=True) 38 | self.conv2 = conv3x3(planes, planes) 39 | self.bn2 = nn.BatchNorm2d(planes, momentum=BN_MOMENTUM) 40 | self.downsample = downsample 41 | self.stride = stride 42 | 43 | def forward(self, x): 44 | residual = x 45 | 46 | out = self.conv1(x) 47 | out = self.bn1(out) 48 | out = self.relu(out) 49 | 50 | out = self.conv2(out) 51 | out = self.bn2(out) 52 | 53 | if self.downsample is not None: 54 | residual = self.downsample(x) 55 | 56 | out += residual 57 | out = self.relu(out) 58 | 59 | return out 60 | 61 | 62 | class Bottleneck(nn.Module): 63 | expansion = 4 64 | 65 | def __init__(self, inplanes, planes, stride=1, downsample=None): 66 | super(Bottleneck, self).__init__() 67 | self.conv1 = nn.Conv2d(inplanes, planes, kernel_size=1, bias=False) 68 | self.bn1 = nn.BatchNorm2d(planes, momentum=BN_MOMENTUM) 69 | self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=stride, 70 | padding=1, bias=False) 71 | self.bn2 = nn.BatchNorm2d(planes, momentum=BN_MOMENTUM) 72 | self.conv3 = nn.Conv2d(planes, planes * self.expansion, kernel_size=1, 73 | bias=False) 74 | self.bn3 = nn.BatchNorm2d(planes * 
self.expansion, 75 | momentum=BN_MOMENTUM) 76 | self.relu = nn.ReLU(inplace=True) 77 | self.downsample = downsample 78 | self.stride = stride 79 | 80 | def forward(self, x): 81 | residual = x 82 | 83 | out = self.conv1(x) 84 | out = self.bn1(out) 85 | out = self.relu(out) 86 | 87 | out = self.conv2(out) 88 | out = self.bn2(out) 89 | out = self.relu(out) 90 | 91 | out = self.conv3(out) 92 | out = self.bn3(out) 93 | 94 | if self.downsample is not None: 95 | residual = self.downsample(x) 96 | 97 | out += residual 98 | out = self.relu(out) 99 | 100 | return out 101 | 102 | 103 | class PoseResNet(nn.Module): 104 | 105 | def __init__(self, block, layers, cfg, **kwargs): 106 | self.inplanes = 64 107 | extra = cfg.MODEL.EXTRA 108 | self.deconv_with_bias = extra.DECONV_WITH_BIAS 109 | 110 | super(PoseResNet, self).__init__() 111 | self.conv1 = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3, 112 | bias=False) 113 | self.bn1 = nn.BatchNorm2d(64, momentum=BN_MOMENTUM) 114 | self.relu = nn.ReLU(inplace=True) 115 | self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1) 116 | self.layer1 = self._make_layer(block, 64, layers[0]) 117 | self.layer2 = self._make_layer(block, 128, layers[1], stride=2) 118 | self.layer3 = self._make_layer(block, 256, layers[2], stride=2) 119 | self.layer4 = self._make_layer(block, 512, layers[3], stride=2) 120 | 121 | # used for deconv layers 122 | self.deconv_layers = self._make_deconv_layer( 123 | extra.NUM_DECONV_LAYERS, 124 | extra.NUM_DECONV_FILTERS, 125 | extra.NUM_DECONV_KERNELS, 126 | ) 127 | 128 | self.final_layer = nn.Conv2d( 129 | in_channels=extra.NUM_DECONV_FILTERS[-1], 130 | out_channels=cfg.MODEL.NUM_JOINTS, 131 | kernel_size=extra.FINAL_CONV_KERNEL, 132 | stride=1, 133 | padding=1 if extra.FINAL_CONV_KERNEL == 3 else 0 134 | ) 135 | 136 | def _make_layer(self, block, planes, blocks, stride=1): 137 | downsample = None 138 | if stride != 1 or self.inplanes != planes * block.expansion: 139 | downsample = nn.Sequential( 140 | nn.Conv2d(self.inplanes, planes * block.expansion, 141 | kernel_size=1, stride=stride, bias=False), 142 | nn.BatchNorm2d(planes * block.expansion, momentum=BN_MOMENTUM), 143 | ) 144 | 145 | layers = [] 146 | layers.append(block(self.inplanes, planes, stride, downsample)) 147 | self.inplanes = planes * block.expansion 148 | for i in range(1, blocks): 149 | layers.append(block(self.inplanes, planes)) 150 | 151 | return nn.Sequential(*layers) 152 | 153 | def _get_deconv_cfg(self, deconv_kernel, index): 154 | if deconv_kernel == 4: 155 | padding = 1 156 | output_padding = 0 157 | elif deconv_kernel == 3: 158 | padding = 1 159 | output_padding = 1 160 | elif deconv_kernel == 2: 161 | padding = 0 162 | output_padding = 0 163 | 164 | return deconv_kernel, padding, output_padding 165 | 166 | def _make_deconv_layer(self, num_layers, num_filters, num_kernels): 167 | assert num_layers == len(num_filters), \ 168 | 'ERROR: num_deconv_layers is different len(num_deconv_filters)' 169 | assert num_layers == len(num_kernels), \ 170 | 'ERROR: num_deconv_layers is different len(num_deconv_filters)' 171 | 172 | layers = [] 173 | for i in range(num_layers): 174 | kernel, padding, output_padding = \ 175 | self._get_deconv_cfg(num_kernels[i], i) 176 | 177 | planes = num_filters[i] 178 | layers.append( 179 | nn.ConvTranspose2d( 180 | in_channels=self.inplanes, 181 | out_channels=planes, 182 | kernel_size=kernel, 183 | stride=2, 184 | padding=padding, 185 | output_padding=output_padding, 186 | bias=self.deconv_with_bias)) 187 | 
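            # each deconv stage doubles the spatial resolution (stride-2
            # transposed conv) and is followed by BatchNorm + ReLU;
            # self.inplanes is updated so the next stage chains correctly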
layers.append(nn.BatchNorm2d(planes, momentum=BN_MOMENTUM)) 188 | layers.append(nn.ReLU(inplace=True)) 189 | self.inplanes = planes 190 | 191 | return nn.Sequential(*layers) 192 | 193 | def forward(self, x): 194 | x = self.conv1(x) 195 | x = self.bn1(x) 196 | x = self.relu(x) 197 | x = self.maxpool(x) 198 | 199 | x = self.layer1(x) 200 | x = self.layer2(x) 201 | x = self.layer3(x) 202 | x = self.layer4(x) 203 | 204 | x = self.deconv_layers(x) 205 | x = self.final_layer(x) 206 | 207 | return x 208 | 209 | def init_weights(self, pretrained=''): 210 | if os.path.isfile(pretrained): 211 | logger.info('=> init deconv weights from normal distribution') 212 | for name, m in self.deconv_layers.named_modules(): 213 | if isinstance(m, nn.ConvTranspose2d): 214 | logger.info('=> init {}.weight as normal(0, 0.001)'.format(name)) 215 | logger.info('=> init {}.bias as 0'.format(name)) 216 | nn.init.normal_(m.weight, std=0.001) 217 | if self.deconv_with_bias: 218 | nn.init.constant_(m.bias, 0) 219 | elif isinstance(m, nn.BatchNorm2d): 220 | logger.info('=> init {}.weight as 1'.format(name)) 221 | logger.info('=> init {}.bias as 0'.format(name)) 222 | nn.init.constant_(m.weight, 1) 223 | nn.init.constant_(m.bias, 0) 224 | logger.info('=> init final conv weights from normal distribution') 225 | for m in self.final_layer.modules(): 226 | if isinstance(m, nn.Conv2d): 227 | # nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu') 228 | logger.info('=> init {}.weight as normal(0, 0.001)'.format(name)) 229 | logger.info('=> init {}.bias as 0'.format(name)) 230 | nn.init.normal_(m.weight, std=0.001) 231 | nn.init.constant_(m.bias, 0) 232 | 233 | pretrained_state_dict = torch.load(pretrained) 234 | logger.info('=> loading pretrained model {}'.format(pretrained)) 235 | self.load_state_dict(pretrained_state_dict, strict=False) 236 | else: 237 | logger.info('=> init weights from normal distribution') 238 | for m in self.modules(): 239 | if isinstance(m, nn.Conv2d): 240 | # nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu') 241 | nn.init.normal_(m.weight, std=0.001) 242 | # nn.init.constant_(m.bias, 0) 243 | elif isinstance(m, nn.BatchNorm2d): 244 | nn.init.constant_(m.weight, 1) 245 | nn.init.constant_(m.bias, 0) 246 | elif isinstance(m, nn.ConvTranspose2d): 247 | nn.init.normal_(m.weight, std=0.001) 248 | if self.deconv_with_bias: 249 | nn.init.constant_(m.bias, 0) 250 | 251 | 252 | resnet_spec = { 253 | 18: (BasicBlock, [2, 2, 2, 2]), 254 | 34: (BasicBlock, [3, 4, 6, 3]), 255 | 50: (Bottleneck, [3, 4, 6, 3]), 256 | 101: (Bottleneck, [3, 4, 23, 3]), 257 | 152: (Bottleneck, [3, 8, 36, 3]) 258 | } 259 | 260 | 261 | def get_pose_net(cfg, is_train, **kwargs): 262 | num_layers = cfg.MODEL.EXTRA.NUM_LAYERS 263 | 264 | block_class, layers = resnet_spec[num_layers] 265 | 266 | model = PoseResNet(block_class, layers, cfg, **kwargs) 267 | 268 | if is_train and cfg.MODEL.INIT_WEIGHTS: 269 | model.init_weights(cfg.MODEL.PRETRAINED) 270 | 271 | return model -------------------------------------------------------------------------------- /lib/nms/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/AIprogrammer/AdvMix/f619fa279d9419eb452d228762c3872691e42e7d/lib/nms/__init__.py -------------------------------------------------------------------------------- /lib/nms/cpu_nms.pyx: -------------------------------------------------------------------------------- 1 | # 
------------------------------------------------------------------------------ 2 | # Copyright (c) Microsoft 3 | # Licensed under the MIT License. 4 | # Written by Bin Xiao (Bin.Xiao@microsoft.com) 5 | # ------------------------------------------------------------------------------ 6 | 7 | from __future__ import absolute_import 8 | from __future__ import division 9 | from __future__ import print_function 10 | 11 | import numpy as np 12 | cimport numpy as np 13 | 14 | cdef inline np.float32_t max(np.float32_t a, np.float32_t b): 15 | return a if a >= b else b 16 | 17 | cdef inline np.float32_t min(np.float32_t a, np.float32_t b): 18 | return a if a <= b else b 19 | 20 | def cpu_nms(np.ndarray[np.float32_t, ndim=2] dets, np.float thresh): 21 | cdef np.ndarray[np.float32_t, ndim=1] x1 = dets[:, 0] 22 | cdef np.ndarray[np.float32_t, ndim=1] y1 = dets[:, 1] 23 | cdef np.ndarray[np.float32_t, ndim=1] x2 = dets[:, 2] 24 | cdef np.ndarray[np.float32_t, ndim=1] y2 = dets[:, 3] 25 | cdef np.ndarray[np.float32_t, ndim=1] scores = dets[:, 4] 26 | 27 | cdef np.ndarray[np.float32_t, ndim=1] areas = (x2 - x1 + 1) * (y2 - y1 + 1) 28 | cdef np.ndarray[np.int_t, ndim=1] order = scores.argsort()[::-1].astype('i') 29 | 30 | cdef int ndets = dets.shape[0] 31 | cdef np.ndarray[np.int_t, ndim=1] suppressed = \ 32 | np.zeros((ndets), dtype=np.int) 33 | 34 | # nominal indices 35 | cdef int _i, _j 36 | # sorted indices 37 | cdef int i, j 38 | # temp variables for box i's (the box currently under consideration) 39 | cdef np.float32_t ix1, iy1, ix2, iy2, iarea 40 | # variables for computing overlap with box j (lower scoring box) 41 | cdef np.float32_t xx1, yy1, xx2, yy2 42 | cdef np.float32_t w, h 43 | cdef np.float32_t inter, ovr 44 | 45 | keep = [] 46 | for _i in range(ndets): 47 | i = order[_i] 48 | if suppressed[i] == 1: 49 | continue 50 | keep.append(i) 51 | ix1 = x1[i] 52 | iy1 = y1[i] 53 | ix2 = x2[i] 54 | iy2 = y2[i] 55 | iarea = areas[i] 56 | for _j in range(_i + 1, ndets): 57 | j = order[_j] 58 | if suppressed[j] == 1: 59 | continue 60 | xx1 = max(ix1, x1[j]) 61 | yy1 = max(iy1, y1[j]) 62 | xx2 = min(ix2, x2[j]) 63 | yy2 = min(iy2, y2[j]) 64 | w = max(0.0, xx2 - xx1 + 1) 65 | h = max(0.0, yy2 - yy1 + 1) 66 | inter = w * h 67 | ovr = inter / (iarea + areas[j] - inter) 68 | if ovr >= thresh: 69 | suppressed[j] = 1 70 | 71 | return keep 72 | -------------------------------------------------------------------------------- /lib/nms/gpu_nms.hpp: -------------------------------------------------------------------------------- 1 | void _nms(int* keep_out, int* num_out, const float* boxes_host, int boxes_num, 2 | int boxes_dim, float nms_overlap_thresh, int device_id); 3 | -------------------------------------------------------------------------------- /lib/nms/gpu_nms.pyx: -------------------------------------------------------------------------------- 1 | # ------------------------------------------------------------------------------ 2 | # Copyright (c) Microsoft 3 | # Licensed under the MIT License. 
4 | # Written by Bin Xiao (Bin.Xiao@microsoft.com) 5 | # ------------------------------------------------------------------------------ 6 | 7 | from __future__ import absolute_import 8 | from __future__ import division 9 | from __future__ import print_function 10 | 11 | import numpy as np 12 | cimport numpy as np 13 | 14 | assert sizeof(int) == sizeof(np.int32_t) 15 | 16 | cdef extern from "gpu_nms.hpp": 17 | void _nms(np.int32_t*, int*, np.float32_t*, int, int, float, int) 18 | 19 | def gpu_nms(np.ndarray[np.float32_t, ndim=2] dets, np.float thresh, 20 | np.int32_t device_id=0): 21 | cdef int boxes_num = dets.shape[0] 22 | cdef int boxes_dim = dets.shape[1] 23 | cdef int num_out 24 | cdef np.ndarray[np.int32_t, ndim=1] \ 25 | keep = np.zeros(boxes_num, dtype=np.int32) 26 | cdef np.ndarray[np.float32_t, ndim=1] \ 27 | scores = dets[:, 4] 28 | cdef np.ndarray[np.int32_t, ndim=1] \ 29 | order = scores.argsort()[::-1].astype(np.int32) 30 | cdef np.ndarray[np.float32_t, ndim=2] \ 31 | sorted_dets = dets[order, :] 32 | _nms(&keep[0], &num_out, &sorted_dets[0, 0], boxes_num, boxes_dim, thresh, device_id) 33 | keep = keep[:num_out] 34 | return list(order[keep]) 35 | -------------------------------------------------------------------------------- /lib/nms/nms.py: -------------------------------------------------------------------------------- 1 | # ------------------------------------------------------------------------------ 2 | # Copyright (c) Microsoft 3 | # Licensed under the MIT License. 4 | # Modified from py-faster-rcnn (https://github.com/rbgirshick/py-faster-rcnn) 5 | # ------------------------------------------------------------------------------ 6 | 7 | from __future__ import absolute_import 8 | from __future__ import division 9 | from __future__ import print_function 10 | 11 | import numpy as np 12 | 13 | from .cpu_nms import cpu_nms 14 | from .gpu_nms import gpu_nms 15 | 16 | 17 | def py_nms_wrapper(thresh): 18 | def _nms(dets): 19 | return nms(dets, thresh) 20 | return _nms 21 | 22 | 23 | def cpu_nms_wrapper(thresh): 24 | def _nms(dets): 25 | return cpu_nms(dets, thresh) 26 | return _nms 27 | 28 | 29 | def gpu_nms_wrapper(thresh, device_id): 30 | def _nms(dets): 31 | return gpu_nms(dets, thresh, device_id) 32 | return _nms 33 | 34 | 35 | def nms(dets, thresh): 36 | """ 37 | greedily select boxes with high confidence and overlap with current maximum <= thresh 38 | rule out overlap >= thresh 39 | :param dets: [[x1, y1, x2, y2 score]] 40 | :param thresh: retain overlap < thresh 41 | :return: indexes to keep 42 | """ 43 | if dets.shape[0] == 0: 44 | return [] 45 | 46 | x1 = dets[:, 0] 47 | y1 = dets[:, 1] 48 | x2 = dets[:, 2] 49 | y2 = dets[:, 3] 50 | scores = dets[:, 4] 51 | 52 | areas = (x2 - x1 + 1) * (y2 - y1 + 1) 53 | order = scores.argsort()[::-1] 54 | 55 | keep = [] 56 | while order.size > 0: 57 | i = order[0] 58 | keep.append(i) 59 | xx1 = np.maximum(x1[i], x1[order[1:]]) 60 | yy1 = np.maximum(y1[i], y1[order[1:]]) 61 | xx2 = np.minimum(x2[i], x2[order[1:]]) 62 | yy2 = np.minimum(y2[i], y2[order[1:]]) 63 | 64 | w = np.maximum(0.0, xx2 - xx1 + 1) 65 | h = np.maximum(0.0, yy2 - yy1 + 1) 66 | inter = w * h 67 | ovr = inter / (areas[i] + areas[order[1:]] - inter) 68 | 69 | inds = np.where(ovr <= thresh)[0] 70 | order = order[inds + 1] 71 | 72 | return keep 73 | 74 | 75 | def oks_iou(g, d, a_g, a_d, sigmas=None, in_vis_thre=None): 76 | if not isinstance(sigmas, np.ndarray): 77 | sigmas = np.array([.26, .25, .25, .35, .35, .79, .79, .72, .72, .62, .62, 1.07, 1.07, .87, .87, .89, .89]) 
/ 10.0 78 | vars = (sigmas * 2) ** 2 79 | xg = g[0::3] 80 | yg = g[1::3] 81 | vg = g[2::3] 82 | ious = np.zeros((d.shape[0])) 83 | for n_d in range(0, d.shape[0]): 84 | xd = d[n_d, 0::3] 85 | yd = d[n_d, 1::3] 86 | vd = d[n_d, 2::3] 87 | dx = xd - xg 88 | dy = yd - yg 89 | e = (dx ** 2 + dy ** 2) / vars / ((a_g + a_d[n_d]) / 2 + np.spacing(1)) / 2 90 | if in_vis_thre is not None: 91 | ind = list(vg > in_vis_thre) and list(vd > in_vis_thre) 92 | e = e[ind] 93 | ious[n_d] = np.sum(np.exp(-e)) / e.shape[0] if e.shape[0] != 0 else 0.0 94 | return ious 95 | 96 | 97 | def oks_nms(kpts_db, thresh, sigmas=None, in_vis_thre=None): 98 | """ 99 | greedily select boxes with high confidence and overlap with current maximum <= thresh 100 | rule out overlap >= thresh, overlap = oks 101 | :param kpts_db 102 | :param thresh: retain overlap < thresh 103 | :return: indexes to keep 104 | """ 105 | if len(kpts_db) == 0: 106 | return [] 107 | 108 | scores = np.array([kpts_db[i]['score'] for i in range(len(kpts_db))]) 109 | kpts = np.array([kpts_db[i]['keypoints'].flatten() for i in range(len(kpts_db))]) 110 | areas = np.array([kpts_db[i]['area'] for i in range(len(kpts_db))]) 111 | 112 | order = scores.argsort()[::-1] 113 | 114 | keep = [] 115 | while order.size > 0: 116 | i = order[0] 117 | keep.append(i) 118 | 119 | oks_ovr = oks_iou(kpts[i], kpts[order[1:]], areas[i], areas[order[1:]], sigmas, in_vis_thre) 120 | 121 | inds = np.where(oks_ovr <= thresh)[0] 122 | order = order[inds + 1] 123 | 124 | return keep 125 | 126 | 127 | def rescore(overlap, scores, thresh, type='gaussian'): 128 | assert overlap.shape[0] == scores.shape[0] 129 | if type == 'linear': 130 | inds = np.where(overlap >= thresh)[0] 131 | scores[inds] = scores[inds] * (1 - overlap[inds]) 132 | else: 133 | scores = scores * np.exp(- overlap**2 / thresh) 134 | 135 | return scores 136 | 137 | 138 | def soft_oks_nms(kpts_db, thresh, sigmas=None, in_vis_thre=None): 139 | """ 140 | greedily select boxes with high confidence and overlap with current maximum <= thresh 141 | rule out overlap >= thresh, overlap = oks 142 | :param kpts_db 143 | :param thresh: retain overlap < thresh 144 | :return: indexes to keep 145 | """ 146 | if len(kpts_db) == 0: 147 | return [] 148 | 149 | scores = np.array([kpts_db[i]['score'] for i in range(len(kpts_db))]) 150 | kpts = np.array([kpts_db[i]['keypoints'].flatten() for i in range(len(kpts_db))]) 151 | areas = np.array([kpts_db[i]['area'] for i in range(len(kpts_db))]) 152 | 153 | order = scores.argsort()[::-1] 154 | scores = scores[order] 155 | 156 | # max_dets = order.size 157 | max_dets = 20 158 | keep = np.zeros(max_dets, dtype=np.intp) 159 | keep_cnt = 0 160 | while order.size > 0 and keep_cnt < max_dets: 161 | i = order[0] 162 | 163 | oks_ovr = oks_iou(kpts[i], kpts[order[1:]], areas[i], areas[order[1:]], sigmas, in_vis_thre) 164 | 165 | order = order[1:] 166 | scores = rescore(oks_ovr, scores[1:], thresh) 167 | 168 | tmp = scores.argsort()[::-1] 169 | order = order[tmp] 170 | scores = scores[tmp] 171 | 172 | keep[keep_cnt] = i 173 | keep_cnt += 1 174 | 175 | keep = keep[:keep_cnt] 176 | 177 | return keep 178 | # kpts_db = kpts_db[:keep_cnt] 179 | 180 | # return kpts_db 181 | -------------------------------------------------------------------------------- /lib/nms/nms_kernel.cu: -------------------------------------------------------------------------------- 1 | // ------------------------------------------------------------------ 2 | // Copyright (c) Microsoft 3 | // Licensed under The MIT License 4 | // 
Modified from MATLAB Faster R-CNN (https://github.com/shaoqingren/faster_rcnn) 5 | // ------------------------------------------------------------------ 6 | 7 | #include "gpu_nms.hpp" 8 | #include 9 | #include 10 | 11 | #define CUDA_CHECK(condition) \ 12 | /* Code block avoids redefinition of cudaError_t error */ \ 13 | do { \ 14 | cudaError_t error = condition; \ 15 | if (error != cudaSuccess) { \ 16 | std::cout << cudaGetErrorString(error) << std::endl; \ 17 | } \ 18 | } while (0) 19 | 20 | #define DIVUP(m,n) ((m) / (n) + ((m) % (n) > 0)) 21 | int const threadsPerBlock = sizeof(unsigned long long) * 8; 22 | 23 | __device__ inline float devIoU(float const * const a, float const * const b) { 24 | float left = max(a[0], b[0]), right = min(a[2], b[2]); 25 | float top = max(a[1], b[1]), bottom = min(a[3], b[3]); 26 | float width = max(right - left + 1, 0.f), height = max(bottom - top + 1, 0.f); 27 | float interS = width * height; 28 | float Sa = (a[2] - a[0] + 1) * (a[3] - a[1] + 1); 29 | float Sb = (b[2] - b[0] + 1) * (b[3] - b[1] + 1); 30 | return interS / (Sa + Sb - interS); 31 | } 32 | 33 | __global__ void nms_kernel(const int n_boxes, const float nms_overlap_thresh, 34 | const float *dev_boxes, unsigned long long *dev_mask) { 35 | const int row_start = blockIdx.y; 36 | const int col_start = blockIdx.x; 37 | 38 | // if (row_start > col_start) return; 39 | 40 | const int row_size = 41 | min(n_boxes - row_start * threadsPerBlock, threadsPerBlock); 42 | const int col_size = 43 | min(n_boxes - col_start * threadsPerBlock, threadsPerBlock); 44 | 45 | __shared__ float block_boxes[threadsPerBlock * 5]; 46 | if (threadIdx.x < col_size) { 47 | block_boxes[threadIdx.x * 5 + 0] = 48 | dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 5 + 0]; 49 | block_boxes[threadIdx.x * 5 + 1] = 50 | dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 5 + 1]; 51 | block_boxes[threadIdx.x * 5 + 2] = 52 | dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 5 + 2]; 53 | block_boxes[threadIdx.x * 5 + 3] = 54 | dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 5 + 3]; 55 | block_boxes[threadIdx.x * 5 + 4] = 56 | dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 5 + 4]; 57 | } 58 | __syncthreads(); 59 | 60 | if (threadIdx.x < row_size) { 61 | const int cur_box_idx = threadsPerBlock * row_start + threadIdx.x; 62 | const float *cur_box = dev_boxes + cur_box_idx * 5; 63 | int i = 0; 64 | unsigned long long t = 0; 65 | int start = 0; 66 | if (row_start == col_start) { 67 | start = threadIdx.x + 1; 68 | } 69 | for (i = start; i < col_size; i++) { 70 | if (devIoU(cur_box, block_boxes + i * 5) > nms_overlap_thresh) { 71 | t |= 1ULL << i; 72 | } 73 | } 74 | const int col_blocks = DIVUP(n_boxes, threadsPerBlock); 75 | dev_mask[cur_box_idx * col_blocks + col_start] = t; 76 | } 77 | } 78 | 79 | void _set_device(int device_id) { 80 | int current_device; 81 | CUDA_CHECK(cudaGetDevice(¤t_device)); 82 | if (current_device == device_id) { 83 | return; 84 | } 85 | // The call to cudaSetDevice must come before any calls to Get, which 86 | // may perform initialization using the GPU. 
87 | CUDA_CHECK(cudaSetDevice(device_id)); 88 | } 89 | 90 | void _nms(int* keep_out, int* num_out, const float* boxes_host, int boxes_num, 91 | int boxes_dim, float nms_overlap_thresh, int device_id) { 92 | _set_device(device_id); 93 | 94 | float* boxes_dev = NULL; 95 | unsigned long long* mask_dev = NULL; 96 | 97 | const int col_blocks = DIVUP(boxes_num, threadsPerBlock); 98 | 99 | CUDA_CHECK(cudaMalloc(&boxes_dev, 100 | boxes_num * boxes_dim * sizeof(float))); 101 | CUDA_CHECK(cudaMemcpy(boxes_dev, 102 | boxes_host, 103 | boxes_num * boxes_dim * sizeof(float), 104 | cudaMemcpyHostToDevice)); 105 | 106 | CUDA_CHECK(cudaMalloc(&mask_dev, 107 | boxes_num * col_blocks * sizeof(unsigned long long))); 108 | 109 | dim3 blocks(DIVUP(boxes_num, threadsPerBlock), 110 | DIVUP(boxes_num, threadsPerBlock)); 111 | dim3 threads(threadsPerBlock); 112 | nms_kernel<<>>(boxes_num, 113 | nms_overlap_thresh, 114 | boxes_dev, 115 | mask_dev); 116 | 117 | std::vector mask_host(boxes_num * col_blocks); 118 | CUDA_CHECK(cudaMemcpy(&mask_host[0], 119 | mask_dev, 120 | sizeof(unsigned long long) * boxes_num * col_blocks, 121 | cudaMemcpyDeviceToHost)); 122 | 123 | std::vector remv(col_blocks); 124 | memset(&remv[0], 0, sizeof(unsigned long long) * col_blocks); 125 | 126 | int num_to_keep = 0; 127 | for (int i = 0; i < boxes_num; i++) { 128 | int nblock = i / threadsPerBlock; 129 | int inblock = i % threadsPerBlock; 130 | 131 | if (!(remv[nblock] & (1ULL << inblock))) { 132 | keep_out[num_to_keep++] = i; 133 | unsigned long long *p = &mask_host[0] + i * col_blocks; 134 | for (int j = nblock; j < col_blocks; j++) { 135 | remv[j] |= p[j]; 136 | } 137 | } 138 | } 139 | *num_out = num_to_keep; 140 | 141 | CUDA_CHECK(cudaFree(boxes_dev)); 142 | CUDA_CHECK(cudaFree(mask_dev)); 143 | } 144 | -------------------------------------------------------------------------------- /lib/nms/setup_linux.py: -------------------------------------------------------------------------------- 1 | # -------------------------------------------------------- 2 | # Pose.gluon 3 | # Copyright (c) 2018-present Microsoft 4 | # Licensed under The MIT License [see LICENSE for details] 5 | # Modified from py-faster-rcnn (https://github.com/rbgirshick/py-faster-rcnn) 6 | # -------------------------------------------------------- 7 | 8 | import os 9 | from os.path import join as pjoin 10 | from setuptools import setup 11 | from distutils.extension import Extension 12 | from Cython.Distutils import build_ext 13 | import numpy as np 14 | 15 | 16 | def find_in_path(name, path): 17 | "Find a file in a search path" 18 | # Adapted fom 19 | # http://code.activestate.com/recipes/52224-find-a-file-given-a-search-path/ 20 | for dir in path.split(os.pathsep): 21 | binpath = pjoin(dir, name) 22 | if os.path.exists(binpath): 23 | return os.path.abspath(binpath) 24 | return None 25 | 26 | 27 | def locate_cuda(): 28 | """Locate the CUDA environment on the system 29 | Returns a dict with keys 'home', 'nvcc', 'include', and 'lib64' 30 | and values giving the absolute path to each directory. 31 | Starts by looking for the CUDAHOME env variable. If not found, everything 32 | is based on finding 'nvcc' in the PATH. 
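    Raises EnvironmentError if nvcc cannot be located, or if any of the
    resolved CUDA paths (home, nvcc, include, lib64) does not exist.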
33 | """ 34 | 35 | # first check if the CUDAHOME env variable is in use 36 | if 'CUDAHOME' in os.environ: 37 | home = os.environ['CUDAHOME'] 38 | nvcc = pjoin(home, 'bin', 'nvcc') 39 | else: 40 | # otherwise, search the PATH for NVCC 41 | default_path = pjoin(os.sep, 'usr', 'local', 'cuda', 'bin') 42 | nvcc = find_in_path('nvcc', os.environ['PATH'] + os.pathsep + default_path) 43 | if nvcc is None: 44 | raise EnvironmentError('The nvcc binary could not be ' 45 | 'located in your $PATH. Either add it to your path, or set $CUDAHOME') 46 | home = os.path.dirname(os.path.dirname(nvcc)) 47 | 48 | cudaconfig = {'home':home, 'nvcc':nvcc, 49 | 'include': pjoin(home, 'include'), 50 | 'lib64': pjoin(home, 'lib64')} 51 | for k, v in cudaconfig.items(): 52 | if not os.path.exists(v): 53 | raise EnvironmentError('The CUDA %s path could not be located in %s' % (k, v)) 54 | 55 | return cudaconfig 56 | CUDA = locate_cuda() 57 | 58 | 59 | # Obtain the numpy include directory. This logic works across numpy versions. 60 | try: 61 | numpy_include = np.get_include() 62 | except AttributeError: 63 | numpy_include = np.get_numpy_include() 64 | 65 | 66 | def customize_compiler_for_nvcc(self): 67 | """inject deep into distutils to customize how the dispatch 68 | to gcc/nvcc works. 69 | If you subclass UnixCCompiler, it's not trivial to get your subclass 70 | injected in, and still have the right customizations (i.e. 71 | distutils.sysconfig.customize_compiler) run on it. So instead of going 72 | the OO route, I have this. Note, it's kindof like a wierd functional 73 | subclassing going on.""" 74 | 75 | # tell the compiler it can processes .cu 76 | self.src_extensions.append('.cu') 77 | 78 | # save references to the default compiler_so and _comple methods 79 | default_compiler_so = self.compiler_so 80 | super = self._compile 81 | 82 | # now redefine the _compile method. This gets executed for each 83 | # object but distutils doesn't have the ability to change compilers 84 | # based on source extension: we add it. 
85 | def _compile(obj, src, ext, cc_args, extra_postargs, pp_opts): 86 | if os.path.splitext(src)[1] == '.cu': 87 | # use the cuda for .cu files 88 | self.set_executable('compiler_so', CUDA['nvcc']) 89 | # use only a subset of the extra_postargs, which are 1-1 translated 90 | # from the extra_compile_args in the Extension class 91 | postargs = extra_postargs['nvcc'] 92 | else: 93 | postargs = extra_postargs['gcc'] 94 | 95 | super(obj, src, ext, cc_args, postargs, pp_opts) 96 | # reset the default compiler_so, which we might have changed for cuda 97 | self.compiler_so = default_compiler_so 98 | 99 | # inject our redefined _compile method into the class 100 | self._compile = _compile 101 | 102 | 103 | # run the customize_compiler 104 | class custom_build_ext(build_ext): 105 | def build_extensions(self): 106 | customize_compiler_for_nvcc(self.compiler) 107 | build_ext.build_extensions(self) 108 | 109 | 110 | ext_modules = [ 111 | Extension( 112 | "cpu_nms", 113 | ["cpu_nms.pyx"], 114 | extra_compile_args={'gcc': ["-Wno-cpp", "-Wno-unused-function"]}, 115 | include_dirs = [numpy_include] 116 | ), 117 | Extension('gpu_nms', 118 | ['nms_kernel.cu', 'gpu_nms.pyx'], 119 | library_dirs=[CUDA['lib64']], 120 | libraries=['cudart'], 121 | language='c++', 122 | runtime_library_dirs=[CUDA['lib64']], 123 | # this syntax is specific to this build system 124 | # we're only going to use certain compiler args with nvcc and not with 125 | # gcc the implementation of this trick is in customize_compiler() below 126 | extra_compile_args={'gcc': ["-Wno-unused-function"], 127 | 'nvcc': ['-arch=sm_35', 128 | '--ptxas-options=-v', 129 | '-c', 130 | '--compiler-options', 131 | "'-fPIC'"]}, 132 | include_dirs = [numpy_include, CUDA['include']] 133 | ), 134 | ] 135 | 136 | setup( 137 | name='nms', 138 | ext_modules=ext_modules, 139 | # inject our custom trigger 140 | cmdclass={'build_ext': custom_build_ext}, 141 | ) 142 | -------------------------------------------------------------------------------- /lib/utils/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/AIprogrammer/AdvMix/f619fa279d9419eb452d228762c3872691e42e7d/lib/utils/__init__.py -------------------------------------------------------------------------------- /lib/utils/transforms.py: -------------------------------------------------------------------------------- 1 | # ------------------------------------------------------------------------------ 2 | # Copyright (c) Microsoft 3 | # Licensed under the MIT License. 
4 | # Written by Bin Xiao (Bin.Xiao@microsoft.com) 5 | # ------------------------------------------------------------------------------ 6 | 7 | from __future__ import absolute_import 8 | from __future__ import division 9 | from __future__ import print_function 10 | 11 | import numpy as np 12 | import cv2 13 | import torch 14 | 15 | 16 | def flip_back(output_flipped, matched_parts, args=None, cfg=None, dim=None): 17 | ''' 18 | ouput_flipped: numpy.ndarray(batch_size, num_joints, height, width) 19 | ''' 20 | assert output_flipped.ndim == 4 or output_flipped.ndim == 3,\ 21 | 'output_flipped should be [batch_size, num_joints, height, width] or [batch_size, num_joints, 2]' 22 | 23 | if output_flipped.ndim == 4: 24 | output_flipped = output_flipped[..., ::-1] 25 | 26 | # flip x,y 27 | else: 28 | output_flipped = inv_coord_norm(output_flipped, cfg, args) 29 | if args.reg_coord: 30 | output_flipped[...,0] = cfg.MODEL.IMAGE_SIZE[0] - 1 - output_flipped[...,0] 31 | else: 32 | output_flipped[...,0] = dim[3] - 1 - output_flipped[...,0] 33 | 34 | output_flipped = coord_norm(output_flipped, cfg, args).clone().cpu().numpy() 35 | 36 | for pair in matched_parts: 37 | tmp = output_flipped[:, pair[0], ...].copy() 38 | output_flipped[:, pair[0], ...] = output_flipped[:, pair[1], ...] 39 | output_flipped[:, pair[1], ...] = tmp 40 | 41 | return output_flipped 42 | 43 | 44 | def fliplr_joints(joints, joints_vis, width, matched_parts): 45 | """ 46 | flip coords 47 | """ 48 | # Flip horizontal 49 | joints[:, 0] = width - joints[:, 0] - 1 50 | 51 | # Change left-right parts; point index 52 | for pair in matched_parts: 53 | joints[pair[0], :], joints[pair[1], :] = \ 54 | joints[pair[1], :], joints[pair[0], :].copy() 55 | joints_vis[pair[0], :], joints_vis[pair[1], :] = \ 56 | joints_vis[pair[1], :], joints_vis[pair[0], :].copy() 57 | 58 | return joints*joints_vis, joints_vis 59 | 60 | 61 | def transform_preds(coords, center, scale, output_size): 62 | target_coords = np.zeros(coords.shape) 63 | trans = get_affine_transform(center, scale, 0, output_size, inv=1) 64 | for p in range(coords.shape[0]): 65 | target_coords[p, 0:2] = affine_transform(coords[p, 0:2], trans) 66 | return target_coords 67 | 68 | 69 | def get_affine_transform( 70 | center, scale, rot, output_size, 71 | shift=np.array([0, 0], dtype=np.float32), inv=0 72 | ): 73 | if not isinstance(scale, np.ndarray) and not isinstance(scale, list): 74 | print(scale) 75 | scale = np.array([scale, scale]) 76 | 77 | scale_tmp = scale * 200.0 78 | src_w = scale_tmp[0] 79 | dst_w = output_size[0] 80 | dst_h = output_size[1] 81 | 82 | rot_rad = np.pi * rot / 180 83 | src_dir = get_dir([0, src_w * -0.5], rot_rad) 84 | dst_dir = np.array([0, dst_w * -0.5], np.float32) 85 | 86 | src = np.zeros((3, 2), dtype=np.float32) 87 | dst = np.zeros((3, 2), dtype=np.float32) 88 | src[0, :] = center + scale_tmp * shift 89 | src[1, :] = center + src_dir + scale_tmp * shift 90 | dst[0, :] = [dst_w * 0.5, dst_h * 0.5] 91 | dst[1, :] = np.array([dst_w * 0.5, dst_h * 0.5]) + dst_dir 92 | 93 | src[2:, :] = get_3rd_point(src[0, :], src[1, :]) 94 | dst[2:, :] = get_3rd_point(dst[0, :], dst[1, :]) 95 | 96 | if inv: 97 | trans = cv2.getAffineTransform(np.float32(dst), np.float32(src)) 98 | else: 99 | trans = cv2.getAffineTransform(np.float32(src), np.float32(dst)) 100 | 101 | return trans 102 | 103 | 104 | def affine_transform(pt, t): 105 | new_pt = np.array([pt[0], pt[1], 1.]).T 106 | new_pt = np.dot(t, new_pt) 107 | return new_pt[:2] 108 | 109 | # the same rule: [1,0] [3,0] ==> [3,-2] square 
triangle 110 | def get_3rd_point(a, b): 111 | direct = a - b 112 | return b + np.array([-direct[1], direct[0]], dtype=np.float32) 113 | 114 | # rotation coordination 115 | def get_dir(src_point, rot_rad): 116 | sn, cs = np.sin(rot_rad), np.cos(rot_rad) 117 | 118 | src_result = [0, 0] 119 | src_result[0] = src_point[0] * cs - src_point[1] * sn 120 | src_result[1] = src_point[0] * sn + src_point[1] * cs 121 | 122 | return src_result 123 | 124 | 125 | def crop(img, center, scale, output_size, rot=0): 126 | trans = get_affine_transform(center, scale, rot, output_size) 127 | 128 | dst_img = cv2.warpAffine( 129 | img, trans, (int(output_size[0]), int(output_size[1])), 130 | flags=cv2.INTER_LINEAR 131 | ) 132 | 133 | return dst_img 134 | 135 | 136 | def tofloat(x): 137 | 138 | if isinstance(x, np.ndarray): 139 | return torch.Tensor(x).cuda() 140 | 141 | if x.dtype == torch.float64 or x.dtype == torch.double: 142 | x = x.float() 143 | return x 144 | 145 | def coord_norm(gt, cfg, args): 146 | 147 | gt = tofloat(gt) 148 | 149 | if args.reg_coord: 150 | image_size = [cfg.MODEL.IMAGE_SIZE[0], cfg.MODEL.IMAGE_SIZE[1]] 151 | else: 152 | image_size = [cfg.MODEL.HEATMAP_SIZE[0], cfg.MODEL.HEATMAP_SIZE[1]] 153 | 154 | gt = (gt * 2 + 1) / torch.Tensor(image_size).cuda() - 1 155 | return gt 156 | 157 | def inv_coord_norm(gt_norm, cfg, args): 158 | 159 | gt_norm = tofloat(gt_norm) 160 | 161 | if args.reg_coord: 162 | image_size = [cfg.MODEL.IMAGE_SIZE[0], cfg.MODEL.IMAGE_SIZE[1]] 163 | else: 164 | image_size = [cfg.MODEL.HEATMAP_SIZE[0], cfg.MODEL.HEATMAP_SIZE[1]] 165 | 166 | gt = ((gt_norm + 1) * torch.Tensor(image_size).cuda() - 1) / 2 167 | return gt 168 | 169 | def _tocuda(t): 170 | if isinstance(t, list): 171 | for index in range(len(t)): 172 | t[index] = _tocuda(t[index]) 173 | else: 174 | if isinstance(t, np.ndarray): 175 | t = torch.from_numpy(t.copy()).cuda() 176 | elif t.is_cuda: 177 | return t 178 | else: 179 | t = torch.Tensor(t.clone()).cuda() 180 | return t 181 | 182 | def _tocopy(t): 183 | if isinstance(t, list): 184 | for index in range(len(list)): 185 | t[index] = _tocopy(t[index]) 186 | else: 187 | if isinstance(t, np.ndarray): 188 | t = torch.from_numpy(t.copy()).cuda() 189 | else: 190 | t = torch.Tensor(t.clone()).cuda() 191 | return t -------------------------------------------------------------------------------- /lib/utils/utils.py: -------------------------------------------------------------------------------- 1 | # ------------------------------------------------------------------------------ 2 | # Copyright (c) Microsoft 3 | # Licensed under the MIT License. 
4 | # Written by Bin Xiao (Bin.Xiao@microsoft.com) 5 | # ------------------------------------------------------------------------------ 6 | 7 | from __future__ import absolute_import 8 | from __future__ import division 9 | from __future__ import print_function 10 | 11 | import os 12 | import logging 13 | import time 14 | from collections import namedtuple 15 | from pathlib import Path 16 | 17 | import torch 18 | import torch.optim as optim 19 | import torch.nn as nn 20 | 21 | 22 | def create_logger(args, cfg, cfg_name, phase='train'): 23 | root_output_dir = Path(cfg.OUTPUT_DIR) 24 | # set up logger 25 | if not root_output_dir.exists(): 26 | print('=> creating {}'.format(root_output_dir)) 27 | root_output_dir.mkdir() 28 | 29 | dataset = cfg.DATASET.DATASET + '_' + cfg.DATASET.HYBRID_JOINTS_TYPE \ 30 | if cfg.DATASET.HYBRID_JOINTS_TYPE else cfg.DATASET.DATASET 31 | dataset = dataset.replace(':', '_') 32 | model = cfg.MODEL.NAME 33 | 34 | cfg_name = os.path.basename(cfg_name).split('.')[0] 35 | cfg_name = args.save_suffix if args.save_suffix is not '' else cfg_name 36 | 37 | ### log output dir 38 | if args.test_robust: 39 | root_output_dir = Path('output_robustness') 40 | final_output_dir = root_output_dir / dataset / model / cfg_name / 'test_corruption' 41 | else: 42 | final_output_dir = root_output_dir / dataset / model / cfg_name 43 | 44 | print('=> creating {}'.format(final_output_dir)) 45 | final_output_dir.mkdir(parents=True, exist_ok=True) 46 | 47 | time_str = time.strftime('%Y-%m-%d-%H-%M') 48 | 49 | if args.test_robust: 50 | log_file = '{}_{}.log'.format(cfg_name, phase) 51 | else: 52 | log_file = '{}_{}_{}.log'.format(cfg_name, time_str, phase) 53 | 54 | final_log_file = final_output_dir / log_file 55 | head = '%(asctime)-15s %(message)s' 56 | logging.basicConfig(filename=str(final_log_file), 57 | format=head) 58 | logger = logging.getLogger() 59 | 60 | logger.setLevel(logging.INFO) 61 | 62 | console = logging.StreamHandler() 63 | logging.getLogger('').addHandler(console) 64 | 65 | tensorboard_log_dir = Path(cfg.LOG_DIR) / dataset / model / \ 66 | (cfg_name + '_' + time_str) 67 | 68 | if args.test_robust: 69 | tensorboard_log_dir = Path(cfg.LOG_DIR) / dataset / model / 'test_robustness' / \ 70 | (cfg_name + '_' + time_str) / args.corruption_type / str(args.severity) 71 | 72 | print('=> creating {}'.format(tensorboard_log_dir)) 73 | tensorboard_log_dir.mkdir(parents=True, exist_ok=True) 74 | 75 | return logger, str(final_output_dir), str(tensorboard_log_dir) 76 | 77 | 78 | def get_optimizer(cfg, model): 79 | optimizer = None 80 | if cfg.TRAIN.OPTIMIZER == 'sgd': 81 | optimizer = optim.SGD( 82 | model.parameters(), 83 | lr=cfg.TRAIN.LR, 84 | momentum=cfg.TRAIN.MOMENTUM, 85 | weight_decay=cfg.TRAIN.WD, 86 | nesterov=cfg.TRAIN.NESTEROV 87 | ) 88 | elif cfg.TRAIN.OPTIMIZER == 'adam': 89 | optimizer = optim.Adam( 90 | model.parameters(), 91 | lr=cfg.TRAIN.LR 92 | ) 93 | 94 | return optimizer 95 | 96 | 97 | def save_checkpoint(states, is_best, output_dir, 98 | filename='checkpoint.pth', suffix=""): 99 | if suffix != "": 100 | torch.save(states, os.path.join(output_dir, filename[:-4] + '_' + suffix + ".pth")) 101 | if is_best and 'state_dict' in states: 102 | torch.save(states['best_state_dict'], 103 | os.path.join(output_dir, 'model_best_{}.pth'.format(suffix))) 104 | else: 105 | torch.save(states, os.path.join(output_dir, filename)) 106 | if is_best and 'state_dict' in states: 107 | torch.save(states['best_state_dict'], 108 | os.path.join(output_dir, 'model_best.pth')) 109 | 110 | def 
get_model_summary(model, *input_tensors, item_length=26, verbose=False, return_all=False): 111 | """ 112 | :param model: 113 | :param input_tensors: 114 | :param item_length: 115 | :return: 116 | """ 117 | 118 | summary = [] 119 | 120 | ModuleDetails = namedtuple( 121 | "Layer", ["name", "input_size", "output_size", "num_parameters", "multiply_adds", "memory_access_cost"]) 122 | hooks = [] 123 | layer_instances = {} 124 | 125 | def add_hooks(module): 126 | 127 | def hook(module, input, output): 128 | class_name = str(module.__class__.__name__) 129 | instance_index = 1 130 | if class_name not in layer_instances: 131 | layer_instances[class_name] = instance_index 132 | else: 133 | instance_index = layer_instances[class_name] + 1 134 | layer_instances[class_name] = instance_index 135 | 136 | layer_name = class_name + "_" + str(instance_index) 137 | 138 | params = 0 139 | mac = 0 140 | mac_input = 0 141 | mac_output = 0 142 | mac_module = 0 143 | if class_name.find("Conv2d") != -1 or class_name.find("BatchNorm") != -1 or \ 144 | class_name.find("Linear") != -1: 145 | for param_ in module.parameters(): 146 | params += param_.view(-1).size(0) 147 | mac_module += param_.view(-1).size(0) 148 | 149 | flops = "Not Available" 150 | if class_name.find("Conv2d") != -1 and hasattr(module, "weight"): 151 | flops = ( 152 | torch.prod( 153 | torch.LongTensor(list(module.weight.data.size()))) * 154 | torch.prod( 155 | torch.LongTensor(list(output.size())[2:]))).item() 156 | elif isinstance(module, nn.Linear): 157 | flops = (torch.prod(torch.LongTensor(list(output.size()))) \ 158 | * input[0].size(1)).item() 159 | 160 | if isinstance(input[0], list): 161 | input = input[0] 162 | if isinstance(output, list): 163 | output = output[-1] 164 | 165 | mac_input = input[0].view(-1).size(0)# input shape 166 | mac_output = output.view(-1).size(0) 167 | mac += (mac_input + mac_module + mac_output) 168 | 169 | # 与 batch 无关 170 | summary.append( 171 | ModuleDetails( 172 | name=layer_name, 173 | input_size=list(input[0].size()), 174 | output_size=list(output.size()), 175 | num_parameters=params, 176 | multiply_adds=flops, 177 | memory_access_cost=mac) 178 | ) 179 | if not isinstance(module, nn.ModuleList) \ 180 | and not isinstance(module, nn.Sequential) \ 181 | and module != model: 182 | hooks.append(module.register_forward_hook(hook)) 183 | 184 | model.eval() 185 | model.apply(add_hooks) 186 | 187 | space_len = item_length 188 | 189 | model(*input_tensors) 190 | for hook in hooks: 191 | hook.remove() 192 | 193 | details = '' 194 | if verbose: 195 | details = "Model Summary" + \ 196 | os.linesep + \ 197 | "Name{}Input Size{}Output Size{}Parameters{}Multiply Adds (Flops){}MAC(memory access cost){}".format( 198 | ' ' * (space_len - len("Name")), 199 | ' ' * (space_len - len("Input Size")), 200 | ' ' * (space_len - len("Output Size")), 201 | ' ' * (space_len - len("Parameters")), 202 | ' ' * (space_len - len("Multiply Adds (Flops)")), \ 203 | ' ' * (space_len - len("MAC(memory access cost)"))) \ 204 | + os.linesep + '-' * space_len * 5 + os.linesep 205 | 206 | params_sum = 0 207 | flops_sum = 0 208 | mac_sum = 0 209 | for layer in summary: 210 | params_sum += layer.num_parameters 211 | mac_sum += layer.memory_access_cost 212 | if layer.multiply_adds != "Not Available": 213 | flops_sum += layer.multiply_adds 214 | if verbose: 215 | details += "{}{}{}{}{}{}{}{}{}{}".format( 216 | layer.name, 217 | ' ' * (space_len - len(layer.name)), 218 | layer.input_size, 219 | ' ' * (space_len - len(str(layer.input_size))), 220 | 
layer.output_size, 221 | ' ' * (space_len - len(str(layer.output_size))), 222 | layer.num_parameters, 223 | ' ' * (space_len - len(str(layer.num_parameters))), 224 | layer.multiply_adds, 225 | ' ' * (space_len - len(str(layer.multiply_adds)))) \ 226 | + os.linesep + '-' * space_len * 5 + os.linesep 227 | 228 | details += os.linesep \ 229 | + "Total Parameters: {:,}".format(params_sum) \ 230 | + os.linesep + '-' * space_len * 5 + os.linesep 231 | details += "Total Multiply Adds (For Convolution and Linear Layers only): {:,} GFLOPs".format(flops_sum/(1024**3)) \ 232 | + os.linesep + '-' * space_len * 5 + os.linesep 233 | details += "Memory Access Cost : {:,}".format(mac_sum) \ 234 | + os.linesep + '-' * space_len * 5 + os.linesep 235 | 236 | details += "Number of Layers" + os.linesep 237 | for layer in layer_instances: 238 | details += "{} : {} layers ".format(layer, layer_instances[layer]) 239 | 240 | if return_all: 241 | return [details, flops_sum, mac_sum, params_sum] 242 | return details 243 | -------------------------------------------------------------------------------- /lib/utils/vis.py: -------------------------------------------------------------------------------- 1 | # ------------------------------------------------------------------------------ 2 | # Copyright (c) Microsoft 3 | # Licensed under the MIT License. 4 | # Written by Bin Xiao (Bin.Xiao@microsoft.com) 5 | # ------------------------------------------------------------------------------ 6 | 7 | from __future__ import absolute_import 8 | from __future__ import division 9 | from __future__ import print_function 10 | 11 | import math 12 | 13 | import numpy as np 14 | import torchvision 15 | import cv2 16 | import torch 17 | import torch.nn as nn 18 | 19 | from core.inference import get_max_preds 20 | 21 | 22 | def tensor2im(input_image, imtype=np.uint8): 23 | """ 24 | Args: 25 | input_image (tensor) 26 | imtype (type) 27 | """ 28 | mean = [0.485,0.456,0.406] 29 | std = [0.229,0.224,0.225] 30 | if not isinstance(input_image, np.ndarray): 31 | if isinstance(input_image, torch.Tensor): 32 | image_tensor = input_image.data 33 | else: 34 | return input_image 35 | image_numpy = image_tensor.cpu().float().numpy() 36 | if image_numpy.shape[0] == 1: 37 | image_numpy = np.tile(image_numpy, (3, 1, 1)) 38 | for i in range(len(mean)): 39 | image_numpy[i] = image_numpy[i] * std[i] + mean[i] 40 | image_numpy = image_numpy * 255 41 | image_numpy = np.transpose(image_numpy, (1, 2, 0)) 42 | else: 43 | image_numpy = input_image 44 | return image_numpy.astype(imtype) 45 | 46 | 47 | def save_batch_image_with_joints(batch_image, batch_joints, batch_joints_vis, 48 | file_name, nrow=8, padding=2): 49 | 50 | """ 51 | Args: 52 | batch_image: [batch_size, channel, height, width] 53 | batch_joints: [batch_size, num_joints, 3], 54 | batch_joints_vis: [batch_size, num_joints, 1], 55 | """ 56 | grid = torchvision.utils.make_grid(batch_image, nrow, padding, False) 57 | ndarr = tensor2im(grid) 58 | ndarr = ndarr.copy() 59 | 60 | nmaps = batch_image.size(0) 61 | xmaps = min(nrow, nmaps) 62 | ymaps = int(math.ceil(float(nmaps) / xmaps)) 63 | height = int(batch_image.size(2) + padding) 64 | width = int(batch_image.size(3) + padding) 65 | k = 0 66 | for y in range(ymaps): 67 | for x in range(xmaps): 68 | if k >= nmaps: 69 | break 70 | joints = batch_joints[k] 71 | joints_vis = batch_joints_vis[k] 72 | 73 | for joint, joint_vis in zip(joints, joints_vis): 74 | joint[0] = x * width + padding + joint[0] 75 | joint[1] = y * height + padding + joint[1] 76 | if 
joint_vis[0]: 77 | cv2.circle(ndarr, (int(joint[0]), int(joint[1])), 2, [255, 0, 0], 2) 78 | k = k + 1 79 | cv2.imwrite(file_name, ndarr) 80 | 81 | 82 | def save_batch_heatmaps(batch_image, batch_heatmaps, file_name, 83 | normalize=True, coord=None): 84 | """ 85 | Args: 86 | batch_image: [batch_size, channel, height, width] 87 | batch_heatmaps: ['batch_size, num_joints, height, width] 88 | file_name: saved file name 89 | """ 90 | if normalize: 91 | batch_image = batch_image.clone() 92 | min = float(batch_image.min()) 93 | max = float(batch_image.max()) 94 | 95 | batch_image.add_(-min).div_(max - min + 1e-5) 96 | 97 | batch_size = batch_heatmaps.size(0) 98 | num_joints = batch_heatmaps.size(1) 99 | heatmap_height = batch_heatmaps.size(2) 100 | heatmap_width = batch_heatmaps.size(3) 101 | 102 | grid_image = np.zeros((batch_size*heatmap_height, 103 | (num_joints+1)*heatmap_width, 104 | 3), 105 | dtype=np.uint8) 106 | if coord is None: 107 | preds, maxvals = get_max_preds(batch_heatmaps.detach().cpu().numpy()) 108 | else: 109 | preds = coord 110 | 111 | for i in range(batch_size): 112 | image = batch_image[i].mul(255)\ 113 | .clamp(0, 255)\ 114 | .byte()\ 115 | .permute(1, 2, 0)\ 116 | .cpu().numpy() 117 | heatmaps = batch_heatmaps[i].mul(255)\ 118 | .clamp(0, 255)\ 119 | .byte()\ 120 | .cpu().numpy() 121 | 122 | resized_image = cv2.resize(image, 123 | (int(heatmap_width), int(heatmap_height))) 124 | 125 | height_begin = heatmap_height * i 126 | height_end = heatmap_height * (i + 1) 127 | for j in range(num_joints): 128 | cv2.circle(resized_image, 129 | (int(preds[i][j][0]), int(preds[i][j][1])), 130 | 1, [0, 0, 255], 1) 131 | heatmap = heatmaps[j, :, :] 132 | colored_heatmap = cv2.applyColorMap(heatmap, cv2.COLORMAP_JET) 133 | masked_image = colored_heatmap*0.7 + resized_image*0.3 134 | cv2.circle(masked_image, 135 | (int(preds[i][j][0]), int(preds[i][j][1])), 136 | 1, [0, 0, 255], 1) 137 | 138 | width_begin = heatmap_width * (j+1) 139 | width_end = heatmap_width * (j+2) 140 | grid_image[height_begin:height_end, width_begin:width_end, :] = \ 141 | masked_image 142 | 143 | grid_image[height_begin:height_end, 0:heatmap_width, :] = resized_image 144 | 145 | cv2.imwrite(file_name, grid_image) 146 | 147 | 148 | 149 | def save_debug_images(config, input, meta, target, joints_pred, output, 150 | prefix, coord=None): 151 | if not config.DEBUG.DEBUG: 152 | return 153 | 154 | if config.DEBUG.SAVE_BATCH_IMAGES_GT: 155 | save_batch_image_with_joints( 156 | input, meta['joints'], meta['joints_vis'], 157 | '{}_gt.jpg'.format(prefix) 158 | ) 159 | if config.DEBUG.SAVE_BATCH_IMAGES_PRED: 160 | save_batch_image_with_joints( 161 | input, joints_pred, meta['joints_vis'], 162 | '{}_pred.jpg'.format(prefix) 163 | ) 164 | if config.DEBUG.SAVE_HEATMAPS_GT: 165 | save_batch_heatmaps( 166 | input, target, '{}_hm_gt.jpg'.format(prefix) 167 | ) 168 | # normalized 169 | if config.DEBUG.SAVE_HEATMAPS_PRED: 170 | if isinstance(output, list) and len(output) > 1: 171 | save_batch_heatmaps( 172 | input, output[1], '{}_hm_pred.jpg'.format(prefix), normalize=False, coord=coord 173 | ) 174 | save_batch_heatmaps( 175 | input, output[0], '{}_hm_unnorm_pred.jpg'.format(prefix), coord=coord 176 | ) 177 | elif isinstance(output, list) and len(output) == 1: 178 | try: 179 | save_batch_heatmaps( 180 | input, output[0], '{}_hm_pred.jpg'.format(prefix), normalize=False, coord=coord 181 | ) 182 | except: 183 | print('do not support') 184 | else: 185 | save_batch_heatmaps( 186 | input, output, '{}_hm_pred.jpg'.format(prefix), 
normalize=False, coord=coord 187 | ) 188 | 189 | 190 | 191 | -------------------------------------------------------------------------------- /lib/utils/zipreader.py: -------------------------------------------------------------------------------- 1 | # ------------------------------------------------------------------------------ 2 | # Copyright (c) Microsoft 3 | # Licensed under the MIT License. 4 | # Written by Bin Xiao (Bin.Xiao@microsoft.com) 5 | # ------------------------------------------------------------------------------ 6 | 7 | from __future__ import absolute_import 8 | from __future__ import division 9 | from __future__ import print_function 10 | 11 | import os 12 | import zipfile 13 | import xml.etree.ElementTree as ET 14 | 15 | import cv2 16 | import numpy as np 17 | 18 | _im_zfile = [] 19 | _xml_path_zip = [] 20 | _xml_zfile = [] 21 | 22 | 23 | def imread(filename, flags=cv2.IMREAD_COLOR): 24 | global _im_zfile 25 | path = filename 26 | pos_at = path.index('@') 27 | if pos_at == -1: 28 | print("character '@' is not found from the given path '%s'"%(path)) 29 | assert 0 30 | path_zip = path[0: pos_at] 31 | path_img = path[pos_at + 2:] 32 | if not os.path.isfile(path_zip): 33 | print("zip file '%s' is not found"%(path_zip)) 34 | assert 0 35 | for i in range(len(_im_zfile)): 36 | if _im_zfile[i]['path'] == path_zip: 37 | data = _im_zfile[i]['zipfile'].read(path_img) 38 | return cv2.imdecode(np.frombuffer(data, np.uint8), flags) 39 | 40 | _im_zfile.append({ 41 | 'path': path_zip, 42 | 'zipfile': zipfile.ZipFile(path_zip, 'r') 43 | }) 44 | data = _im_zfile[-1]['zipfile'].read(path_img) 45 | 46 | return cv2.imdecode(np.frombuffer(data, np.uint8), flags) 47 | 48 | 49 | def xmlread(filename): 50 | global _xml_path_zip 51 | global _xml_zfile 52 | path = filename 53 | pos_at = path.index('@') 54 | if pos_at == -1: 55 | print("character '@' is not found from the given path '%s'"%(path)) 56 | assert 0 57 | path_zip = path[0: pos_at] 58 | path_xml = path[pos_at + 2:] 59 | if not os.path.isfile(path_zip): 60 | print("zip file '%s' is not found"%(path_zip)) 61 | assert 0 62 | for i in xrange(len(_xml_path_zip)): 63 | if _xml_path_zip[i] == path_zip: 64 | data = _xml_zfile[i].open(path_xml) 65 | return ET.fromstring(data.read()) 66 | _xml_path_zip.append(path_zip) 67 | print("read new xml file '%s'"%(path_zip)) 68 | _xml_zfile.append(zipfile.ZipFile(path_zip, 'r')) 69 | data = _xml_zfile[-1].open(path_xml) 70 | return ET.fromstring(data.read()) 71 | -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | Cython 2 | scipy 3 | pandas 4 | pyyaml 5 | json_tricks 6 | yacs>=0.1.5 7 | tensorboardX==1.6 8 | tqdm 9 | glob 10 | numpy 11 | Pillow 12 | imagecorruptions 13 | torch>=1.0.0 -------------------------------------------------------------------------------- /scripts/make_datasets.sh: -------------------------------------------------------------------------------- 1 | # coco 2 | python tools/make_datasets.py --dataset coco --root_dir data --data_dir coco/val2017 3 | # mpii 4 | python tools/make_datasets.py --dataset mpii --root_dir data --data_dir mpii/images -------------------------------------------------------------------------------- /scripts/test.sh: -------------------------------------------------------------------------------- 1 | which_dataset=$1 2 | 3 | if [ $which_dataset == coco ]; then 4 | exp_ID=GT_test_COCO_res50_256x192_advmix 5 | 
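    # the checkpoints below are AdvMix-trained pose-network weights; save_checkpoint
    # writes them as model_best_D.pth, where the 'D' suffix presumably marks the pose
    # network as opposed to the generator ('G', cf. --load_from_D/--load_from_G)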
cfg_file=experiments/coco/resnet/res50_256x192_d256x3_adam_lr1e-3_advmix.yaml 6 | checkpoint=output/coco/pose_resnet/MPII_res50_256x256_advmix/model_best_D.pth 7 | elif [ $which_dataset == mpii ]; then 8 | exp_ID=GT_test_MPII_res50_256x256_advmix 9 | cfg_file=experiments/mpii/resnet/res50_256x256_d256x3_adam_lr1e-3_advmix.yaml 10 | checkpoint=output/coco/pose_resnet/COCO_res50_256x192_advmix/model_best_D.pth 11 | fi 12 | 13 | 14 | python tools/test_corruption.py \ 15 | --cfg $cfg_file \ 16 | --test_robust \ 17 | --exp_id $exp_ID \ 18 | --save_suffix $exp_ID \ 19 | TEST.MODEL_FILE $checkpoint \ 20 | TEST.USE_GT_BBOX False -------------------------------------------------------------------------------- /scripts/train.sh: -------------------------------------------------------------------------------- 1 | which_dataset=$1 2 | 3 | if [ $which_dataset == coco ]; then 4 | num_joints=17 5 | exp_ID=COCO_res50_256x192_advmix 6 | cfg_file=experiments/coco/resnet/res50_256x192_d256x3_adam_lr1e-3_advmix.yaml 7 | checkpoint=models/pytorch/pose_coco/pose_resnet_50_256x192.pth 8 | elif [ $which_dataset == mpii ]; then 9 | num_joints=16 10 | exp_ID=MPII_res50_256x256_advmix 11 | cfg_file=experiments/mpii/resnet/res50_256x256_d256x3_adam_lr1e-3_advmix.yaml 12 | checkpoint=models/pytorch/pose_mpii/pose_resnet_50_256x256.pth.tar 13 | fi 14 | echo 'Start training :'$which_dataset 15 | 16 | python tools/train.py \ 17 | --cfg $cfg_file \ 18 | --exp_id $exp_ID \ 19 | --save_suffix $exp_ID \ 20 | --load_from_D $checkpoint \ 21 | --advmix \ 22 | --sample_times 3 \ 23 | --joints_num $num_joints \ 24 | --kd_mseloss \ 25 | --alpha 0.1 \ 26 | TEST.MODEL_FILE $checkpoint \ 27 | TEST.USE_GT_BBOX True -------------------------------------------------------------------------------- /tools/_init_parse.py: -------------------------------------------------------------------------------- 1 | from __future__ import absolute_import 2 | from __future__ import division 3 | from __future__ import print_function 4 | 5 | import argparse 6 | import os 7 | import numpy as np 8 | import pprint 9 | import shutil 10 | 11 | 12 | def parse_args(): 13 | parser = argparse.ArgumentParser(description='Train keypoints network') 14 | # general 15 | parser.add_argument('--cfg', 16 | help='experiment configure file name', 17 | required=True, 18 | type=str) 19 | 20 | parser.add_argument('opts', 21 | help="Modify config options using the command-line", 22 | default=None, 23 | nargs=argparse.REMAINDER) 24 | 25 | # philly 26 | parser.add_argument('--modelDir', 27 | help='model directory', 28 | type=str, 29 | default='') 30 | parser.add_argument('--logDir', 31 | help='log directory', 32 | type=str, 33 | default='') 34 | parser.add_argument('--dataDir', 35 | help='data directory', 36 | type=str, 37 | default='') 38 | parser.add_argument('--prevModelDir', 39 | help='prev Model directory', 40 | type=str, 41 | default='') 42 | parser.add_argument('--save_suffix', 43 | help='model output dir suffix', 44 | type=str, 45 | default='') 46 | 47 | ### test robustness 48 | parser.add_argument('--test_robust', 49 | help='normal test or test robustness', 50 | default=False, 51 | action='store_true') 52 | 53 | parser.add_argument('--corruption_type', 54 | help='type of corruption', 55 | type=str, 56 | default='') 57 | 58 | parser.add_argument('--severity', 59 | help='severity of corruption', 60 | type=int, 61 | default=0) 62 | 63 | # INPUT TYPE 64 | parser.add_argument('--dataset_root', 65 | help='data directory, if only dataset root is provided, then all the images 
are processed', 66 | type=str, 67 | default='', 68 | ) 69 | parser.add_argument('--load_json_file', 70 | help='load json file. The dataset root should also be given.', 71 | type=str, 72 | default='') 73 | # OUTPUT ROOT: 74 | parser.add_argument('--out_root', 75 | help='data directory', 76 | type=str, 77 | default='/mnt/lustre/share/jinsheng/res_crop') 78 | 79 | parser.add_argument('--out_file', 80 | help='data directory', 81 | type=str, 82 | default='res') 83 | 84 | # test & train 85 | parser.add_argument('--exp_id', 86 | type=str, 87 | default='') 88 | parser.add_argument('--load_from_G', 89 | type=str, 90 | default='') 91 | parser.add_argument('--load_from_D', 92 | type=str, 93 | default='') 94 | 95 | 96 | parser.add_argument('--sample_times', 97 | type=int, 98 | default=1) 99 | 100 | parser.add_argument('--adv_loss_weight', 101 | type=float, 102 | default=1) 103 | parser.add_argument('--combine_prob', 104 | type=float, 105 | default=0.2) 106 | parser.add_argument('--perturb_joint', 107 | type=float, 108 | default=0.2) 109 | parser.add_argument('--perturb_range', 110 | type=int, 111 | default=5) 112 | parser.add_argument('--sp_style', 113 | type=float, 114 | default=0) 115 | parser.add_argument('--advmix', 116 | default=False, 117 | action='store_true') 118 | 119 | parser.add_argument('--stylize_image', 120 | default=False, 121 | action='store_true') 122 | 123 | parser.add_argument('--joints_num', 124 | type=int, 125 | default=17) 126 | 127 | # generator 128 | parser.add_argument('--gen_input_chn', 129 | type=int, 130 | default=9) 131 | 132 | parser.add_argument('--downsamples', 133 | type=int, 134 | default=6) 135 | 136 | # knowledge distillation 137 | parser.add_argument('--kd_mseloss', 138 | default=False, 139 | action='store_true') 140 | 141 | parser.add_argument('--kd_klloss', 142 | default=False, 143 | action='store_true') 144 | 145 | parser.add_argument('--alpha', 146 | type=float, 147 | default=0.1) 148 | 149 | # random corruption 150 | parser.add_argument('--random_corruption', 151 | default=False, 152 | action='store_true') 153 | 154 | args = parser.parse_args() 155 | 156 | return args -------------------------------------------------------------------------------- /tools/_init_paths.py: -------------------------------------------------------------------------------- 1 | # ------------------------------------------------------------------------------ 2 | # pose.pytorch 3 | # Copyright (c) 2018-present Microsoft 4 | # Licensed under The Apache-2.0 License [see LICENSE for details] 5 | # Written by Bin Xiao (Bin.Xiao@microsoft.com) 6 | # ------------------------------------------------------------------------------ 7 | 8 | from __future__ import absolute_import 9 | from __future__ import division 10 | from __future__ import print_function 11 | 12 | import os.path as osp 13 | import sys 14 | 15 | 16 | def add_path(path): 17 | if path not in sys.path: 18 | sys.path.insert(0, path) 19 | 20 | 21 | this_dir = osp.dirname(__file__) 22 | 23 | lib_path = osp.join(this_dir, '..', 'lib') 24 | add_path(lib_path) 25 | 26 | mm_path = osp.join(this_dir, '..', 'lib/poseeval/py-motmetrics') 27 | add_path(mm_path) 28 | 29 | # mmdectection 30 | lib_path = osp.join(this_dir, '..', 'mmdetection') 31 | 32 | lib_path = osp.join(this_dir, '..', 'Synchronized_BatchNorm') 33 | add_path(lib_path) 34 | 35 | if __name__ == '__main__': 36 | import os.path as osp 37 | this_dir = osp.dirname(__file__) 38 | print(this_dir) 39 | ### only current path and sys append path 
--------------------------------------------------------------------------------
/tools/make_datasets.py:
--------------------------------------------------------------------------------
1 | from PIL import Image
2 | import numpy as np
3 | import time, os
4 | from glob import glob
5 | from tqdm import tqdm
6 | import argparse
7 | import torch
8 | from torch.utils.data import Dataset
9 | from imagecorruptions import corrupt, get_corruption_names
10 | 
11 | 
12 | parser = argparse.ArgumentParser(description='Apply different corruption types to the official validation datasets, e.g., COCO, MPII, OCHuman, etc.',
13 |                                  formatter_class=argparse.ArgumentDefaultsHelpFormatter)
14 | parser.add_argument('--batch_size', type=int, default=64, help='Batch size for processing images.')
15 | parser.add_argument('--num_workers', type=int, default=8, help='Number of worker processes for data loading.')
16 | parser.add_argument('--root_dir', type=str, default="./data", help='Root directory of data.')
17 | parser.add_argument('--data_dir', type=str, help='Directory of images.')
18 | parser.add_argument('--dataset', type=str, default="COCO", help='Dataset to process.')
19 | args = parser.parse_args()
20 | 
21 | class make_data(Dataset):
22 |     def __init__(self, root_dir, data_dir, dataset):
23 |         self.root_dir = root_dir
24 |         self.data_dir = data_dir
25 |         self.dataset = dataset
26 |         self.imglist = glob(self.root_dir + '/' + self.data_dir + '/*.jpg')
27 | 
28 |     def __getitem__(self, index):
29 |         img = self.imglist[index]
30 |         self.process(img)
31 |         return 0
32 | 
33 |     def __len__(self):
34 |         return len(self.imglist)
35 | 
36 |     def process(self, img):
37 |         image = np.asarray(Image.open(img))
38 |         for corruption in get_corruption_names('all'):
39 |             for severity in range(5):
40 |                 np.random.seed(1)
41 |                 corrupted = corrupt(image, corruption_name=corruption, severity=severity+1)
42 |                 corrupted_path = os.path.join(self.root_dir, self.dataset + '-C', corruption, str(severity), os.path.basename(img))
43 |                 if not os.path.exists(os.path.dirname(corrupted_path)):
44 |                     os.makedirs(os.path.dirname(corrupted_path))
45 |                 Image.fromarray(corrupted).save(corrupted_path)
46 | 
47 | if __name__ == '__main__':
48 |     root_dir = args.root_dir
49 |     data_dir = args.data_dir
50 |     which_dataset = args.dataset
51 |     d_dataset = make_data(root_dir, data_dir, which_dataset)
52 |     print("To process {} images.
".format(len(d_dataset))) 53 | distorted_dataset_loader = torch.utils.data.DataLoader( 54 | d_dataset, batch_size=args.batch_size, shuffle=False, num_workers=args.num_workers) 55 | 56 | for _ in tqdm(distorted_dataset_loader): continue 57 | -------------------------------------------------------------------------------- /tools/test_corruption.py: -------------------------------------------------------------------------------- 1 | # ------------------------------------------------------------------------------ 2 | # pose.pytorch 3 | # Copyright (c) 2018-present Microsoft 4 | # Licensed under The Apache-2.0 License [see LICENSE for details] 5 | # Written by Bin Xiao (Bin.Xiao@microsoft.com) 6 | # ------------------------------------------------------------------------------ 7 | 8 | from __future__ import absolute_import 9 | from __future__ import division 10 | from __future__ import print_function 11 | 12 | import argparse 13 | import os 14 | import pprint 15 | 16 | import torch 17 | import torch.nn.parallel 18 | import torch.backends.cudnn as cudnn 19 | import torch.optim 20 | import torch.utils.data 21 | import torch.utils.data.distributed 22 | import torchvision.transforms as transforms 23 | 24 | import _init_paths 25 | from config import cfg 26 | from config import update_config 27 | from core.loss import JointsMSELoss 28 | from core.function import validate 29 | from utils.utils import create_logger 30 | 31 | import dataset 32 | import models 33 | 34 | import collections 35 | from _init_parse import parse_args 36 | import pandas as pd 37 | 38 | def val_model_init(): 39 | 40 | # adjust the gpu_ids 41 | args = parse_args() 42 | update_config(cfg, args) 43 | 44 | # cudnn related setting 45 | cudnn.benchmark = cfg.CUDNN.BENCHMARK 46 | torch.backends.cudnn.deterministic = cfg.CUDNN.DETERMINISTIC 47 | torch.backends.cudnn.enabled = cfg.CUDNN.ENABLED 48 | 49 | model = eval('models.'+cfg.MODEL.NAME+'.get_pose_net')( 50 | cfg, is_train=False 51 | ) 52 | 53 | if cfg.TEST.MODEL_FILE: 54 | model.load_state_dict(torch.load(cfg.TEST.MODEL_FILE), strict=False) 55 | 56 | model = torch.nn.DataParallel(model, device_ids=cfg.GPUS).cuda() 57 | 58 | return model 59 | 60 | def val(distortion_name, severity, model): 61 | args = parse_args() 62 | args.corruption_type = distortion_name 63 | args.severity = severity 64 | 65 | update_config(cfg, args) 66 | 67 | exp_id = args.exp_id 68 | which_dataset = cfg.DATASET.DATASET 69 | 70 | logger, final_output_dir, tb_log_dir = create_logger(args, 71 | cfg, args.cfg, 'valid') 72 | 73 | logger.info(pprint.pformat(args)) 74 | 75 | if cfg.TEST.MODEL_FILE: 76 | logger.info('=> loading model from {}'.format(cfg.TEST.MODEL_FILE)) 77 | 78 | else: 79 | model_state_file = os.path.join( 80 | final_output_dir, 'final_state.pth' 81 | ) 82 | logger.info('=> loading model from {}'.format(model_state_file)) 83 | model.load_state_dict(torch.load(model_state_file)) 84 | 85 | # define loss function (criterion) and optimizer 86 | criterion = JointsMSELoss( 87 | use_target_weight=cfg.LOSS.USE_TARGET_WEIGHT 88 | ).cuda() 89 | 90 | # Data loading code 91 | normalize = transforms.Normalize( 92 | mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225] 93 | ) 94 | 95 | valid_dataset = eval('dataset.'+cfg.DATASET.DATASET)( 96 | cfg, args, cfg.DATASET.ROOT, cfg.DATASET.TEST_SET, False, 97 | transforms.Compose([ 98 | transforms.ToTensor(), 99 | normalize, 100 | ]) 101 | ) 102 | 103 | valid_loader = torch.utils.data.DataLoader( 104 | valid_dataset, 105 | 
batch_size=cfg.TEST.BATCH_SIZE_PER_GPU*len(cfg.GPUS),
106 |         shuffle=False,
107 |         num_workers=cfg.WORKERS,
108 |         pin_memory=True
109 |     )
110 | 
111 |     # evaluate on validation set
112 |     name_values, perf_indicator = validate(cfg, args, valid_loader, valid_dataset, model, criterion,
113 |                                            final_output_dir, tb_log_dir)
114 | 
115 | 
116 |     # recording overall results
117 |     overall_dir = os.path.join(final_output_dir, 'robust_C.val')
118 |     record = open(overall_dir, 'a')
119 |     record.write(distortion_name + '_' + str(severity) + ':' + '\t')
120 |     for keys, values in name_values.items():
121 |         record.write(keys + ' = ' + str(values) + '\t')
122 |     record.write('\n')
123 |     record.close()
124 |     return name_values, perf_indicator, final_output_dir, exp_id, which_dataset, overall_dir
125 | 
126 | def get_corruption_results():
127 |     distortions = [
128 |         'gaussian_noise', 'shot_noise', 'impulse_noise',
129 |         'defocus_blur', 'glass_blur', 'motion_blur', 'zoom_blur',
130 |         'snow', 'frost', 'fog', 'brightness',
131 |         'contrast', 'elastic_transform', 'pixelate', 'jpeg_compression',
132 |         'speckle_noise', 'gaussian_blur', 'spatter', 'saturate'
133 |     ]
134 |     res = []
135 |     model = val_model_init()
136 |     name_values, perf_indicator, final_output_dir, exp_id, which_dataset, overall_dir = val('clean', 0, model)
137 |     res.append(perf_indicator)
138 | 
139 |     for distortion_name in distortions:  # run all 19 corruptions; only the first 15 count towards mPC below
140 |         for severity in range(5):
141 |             name_values, perf_indicator, final_output_dir, exp_id, which_dataset, overall_dir = val(distortion_name, severity, model)
142 |             res.append(perf_indicator)
143 | 
144 |     if which_dataset == 'mpii':
145 |         get_final_results_mpii(res, distortions, final_output_dir, exp_id, mode='td')
146 |     else:
147 |         mode = 'bu' if cfg.model.type == 'BottomUp' else 'td'
148 |         get_final_results(res, distortions, final_output_dir, exp_id, mode=mode)
149 | 
150 | def get_final_results(mAP, distortions, final_output_dir, exp_id, mode='td'):
151 |     dic = {}
152 |     assert len(mAP) == 96, 'Expected 1 clean result + 19 corruptions x 5 severities'
153 |     dic['clean_mAP'] = [mAP.pop(0)]
154 |     print(mAP)
155 | 
156 |     all_tmp = 0
157 |     for dis in distortions:
158 |         tmp = []
159 |         for i in range(5):
160 |             tmp.append(mAP[distortions.index(dis) * 5 + i])
161 |         dic[dis] = [sum(tmp) / len(tmp)]
162 |         if dis in distortions[:15]:
163 |             all_tmp += dic[dis][0]
164 | 
165 |     dic['mean_corrupted_AP'] = [all_tmp / 15]
166 |     dic['rAP'] = dic['mean_corrupted_AP'][0] / dic['clean_mAP'][0]
167 | 
168 |     dataframe = pd.DataFrame(dic)
169 |     columns = ['clean_mAP', 'mean_corrupted_AP', 'rAP'] + distortions
170 |     dataframe.to_csv(final_output_dir + '/' + exp_id + ".csv", index=False, sep=',', columns=columns)
171 | 
172 | def get_final_results_mpii(mean, distortions, final_output_dir, exp_id, mode='td'):
173 |     dic = {}
174 |     assert len(mean) == 96, 'Expected 1 clean result + 19 corruptions x 5 severities'
175 |     dic['clean_mean'] = [round(mean.pop(0), 3)]
176 |     print(mean)
177 | 
178 |     all_tmp = 0
179 |     for dis in distortions:
180 |         tmp = []
181 |         for i in range(5):
182 |             tmp.append(mean[distortions.index(dis) * 5 + i])
183 |         dic[dis] = [round(sum(tmp) / len(tmp), 3)]
184 |         if dis in distortions[:15]:
185 |             all_tmp += dic[dis][0]
186 | 
187 |     dic['mean_corrupted_mean'] = [round(all_tmp / 15, 3)]
188 |     dic['rmean'] = round(dic['mean_corrupted_mean'][0] / dic['clean_mean'][0], 3)
189 | 
190 |     dataframe = pd.DataFrame(dic)
191 |     columns = ['clean_mean', 'mean_corrupted_mean', 'rmean'] + distortions
192 |     dataframe.to_csv(final_output_dir + '/' + exp_id + ".csv", index=False, sep=',', columns=columns)
193 | 
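# Illustrative sketch (added annotation, not part of the original script, and never
# called): the aggregation performed by get_final_results() above is an average over
# the 5 severity levels for each corruption, followed by an average of the first 15
# (standard) corruption scores to obtain mPC, with rPC = mPC / clean AP.
def _example_mpc_rpc(clean_ap, per_run_ap):
    # per_run_ap holds 5 consecutive AP values per corruption, in evaluation order:
    # [c0_s0, ..., c0_s4, c1_s0, ...]
    per_corruption = [sum(per_run_ap[i * 5:(i + 1) * 5]) / 5.0
                      for i in range(len(per_run_ap) // 5)]
    mpc = sum(per_corruption[:15]) / 15.0   # mean performance under corruption
    rpc = mpc / clean_ap                    # relative performance under corruption
    return mpc, rpc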
if __name__ == '__main__':
195 |     get_corruption_results()
196 | 
--------------------------------------------------------------------------------
/tools/train.py:
--------------------------------------------------------------------------------
1 | # ------------------------------------------------------------------------------
2 | # Copyright (c) Microsoft
3 | # Licensed under the MIT License.
4 | # Written by Bin Xiao (Bin.Xiao@microsoft.com)
5 | # ------------------------------------------------------------------------------
6 | 
7 | from __future__ import absolute_import
8 | from __future__ import division
9 | from __future__ import print_function
10 | 
11 | import argparse
12 | import os
13 | import pprint
14 | import shutil
15 | import copy
16 | 
17 | import torch
18 | import torch.nn.parallel
19 | import torch.backends.cudnn as cudnn
20 | import torch.optim
21 | import torch.utils.data
22 | import torch.utils.data.distributed
23 | import torchvision.transforms as transforms
24 | from tensorboardX import SummaryWriter
25 | import sys
26 | import _init_paths
27 | from _init_parse import parse_args
28 | 
29 | from config import cfg
30 | from config import update_config
31 | from core.loss import JointsMSELoss
32 | from core.function import train, train_advmix
33 | from core.function import validate
34 | from utils.utils import get_optimizer
35 | from utils.utils import save_checkpoint
36 | from utils.utils import create_logger
37 | from utils.utils import get_model_summary
38 | from torch.utils.data.dataset import ConcatDataset
39 | 
40 | 
41 | import dataset
42 | import models
43 | 
44 | 
45 | def main():
46 |     args = parse_args()
47 |     update_config(cfg, args)
48 | 
49 |     logger, final_output_dir, tb_log_dir = create_logger(
50 |         args, cfg, args.cfg, 'train')
51 | 
52 |     logger.info(pprint.pformat(args))
53 |     logger.info(cfg)
54 | 
55 |     # cudnn related setting
56 |     cudnn.benchmark = cfg.CUDNN.BENCHMARK
57 |     torch.backends.cudnn.deterministic = cfg.CUDNN.DETERMINISTIC
58 |     torch.backends.cudnn.enabled = cfg.CUDNN.ENABLED
59 | 
60 |     model = eval('models.'+cfg.MODEL.NAME+'.get_pose_net')(
61 |         cfg, is_train=True
62 |     )
63 | 
64 |     if args.advmix:
65 |         model_teacher = copy.deepcopy(model)
66 |         print('=> Training adversarially.')
67 |         model_G = models.Unet_generator.UnetGenerator(args.gen_input_chn, 3, args.downsamples)
68 |         print("=> UNet generator: {} input channels; {} downsample times".format(args.gen_input_chn, args.downsamples))
69 |         model_G = torch.nn.DataParallel(model_G, device_ids=cfg.GPUS).cuda()
70 | 
71 |     # copy model file
72 |     this_dir = os.path.dirname(__file__)
73 |     shutil.copy2(
74 |         os.path.join(this_dir, '../lib/models', cfg.MODEL.NAME + '.py'),
75 |         final_output_dir)
76 | 
77 |     shutil.copy2(
78 |         args.cfg,
79 |         final_output_dir)
80 | 
81 |     shutil.copy2(
82 |         'tools/train.py',
83 |         final_output_dir)
84 | 
85 |     # logger.info(pprint.pformat(model))
86 | 
87 |     writer_dict = {
88 |         'writer': SummaryWriter(log_dir=tb_log_dir),
89 |         'train_global_steps': 0,
90 |         'valid_global_steps': 0,
91 |     }
92 | 
93 |     dump_input = torch.rand(
94 |         (1, 3, cfg.MODEL.IMAGE_SIZE[1], cfg.MODEL.IMAGE_SIZE[0])
95 |     )
96 | 
97 |     try:
98 |         writer_dict['writer'].add_graph(model, (dump_input, ))
99 |     except:
100 |         pass
101 |     try:
102 |         logger.info(get_model_summary(model, dump_input))
103 |     except:
104 |         pass
105 | 
106 |     model = torch.nn.DataParallel(model, device_ids=cfg.GPUS).cuda()
107 | 
108 |     if args.advmix:
109 |         model_teacher = torch.nn.DataParallel(model_teacher, device_ids=cfg.GPUS).cuda()
110 | 
111 |     criterion =
JointsMSELoss( 112 | use_target_weight=cfg.LOSS.USE_TARGET_WEIGHT 113 | ).cuda() 114 | 115 | # Data loading code 116 | normalize = transforms.Normalize( 117 | mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225] 118 | ) 119 | 120 | train_dataset = eval('dataset.'+cfg.DATASET.DATASET)( 121 | cfg, args, cfg.DATASET.ROOT, cfg.DATASET.TRAIN_SET, True, 122 | transforms.Compose([ 123 | transforms.ToTensor(), 124 | normalize, 125 | ]) 126 | ) 127 | if cfg.DATASET.MINI_COCO: 128 | valid_dataset = eval('dataset.'+cfg.DATASET.DATASET)( 129 | cfg, args, cfg.DATASET.ROOT, cfg.DATASET.TRAIN_SET, False, 130 | transforms.Compose([ 131 | transforms.ToTensor(), 132 | normalize, 133 | ]) 134 | ) 135 | else: 136 | valid_dataset = eval('dataset.'+cfg.DATASET.DATASET)( 137 | cfg, args, cfg.DATASET.ROOT, cfg.DATASET.TEST_SET, False, 138 | transforms.Compose([ 139 | transforms.ToTensor(), 140 | normalize, 141 | ]) 142 | ) 143 | 144 | 145 | if args.advmix: 146 | if args.stylize_image: 147 | if cfg.DATASET.DATASET == 'mpii': 148 | style_dataset = eval('dataset.'+cfg.DATASET.DATASET)( 149 | cfg, args, 'data/stylize_image/output_mpii', cfg.DATASET.TRAIN_SET, True, 150 | transforms.Compose([ 151 | transforms.ToTensor(), 152 | normalize, 153 | ]) 154 | ) 155 | else: 156 | style_dataset = eval('dataset.'+cfg.DATASET.DATASET)( 157 | cfg, args, 'data/stylize_image/output', cfg.DATASET.TRAIN_SET, True, 158 | transforms.Compose([ 159 | transforms.ToTensor(), 160 | normalize, 161 | ]) 162 | ) 163 | train_dataset = ConcatDataset([train_dataset, style_dataset]) 164 | 165 | train_loader = torch.utils.data.DataLoader( 166 | train_dataset, 167 | batch_size=cfg.TRAIN.BATCH_SIZE_PER_GPU*len(cfg.GPUS), 168 | shuffle=cfg.TRAIN.SHUFFLE, 169 | num_workers=cfg.WORKERS, 170 | pin_memory=cfg.PIN_MEMORY 171 | ) 172 | valid_loader = torch.utils.data.DataLoader( 173 | valid_dataset, 174 | batch_size=cfg.TEST.BATCH_SIZE_PER_GPU*len(cfg.GPUS), 175 | shuffle=False, 176 | num_workers=cfg.WORKERS, 177 | pin_memory=cfg.PIN_MEMORY 178 | ) 179 | 180 | best_perf = 0.0 181 | best_model = False 182 | last_epoch = -1 183 | last_epoch_G = -1 184 | optimizer = get_optimizer(cfg, model) 185 | if args.advmix: 186 | optimizer_G = get_optimizer(cfg, model_G) 187 | 188 | begin_epoch = cfg.TRAIN.BEGIN_EPOCH 189 | checkpoint_file = os.path.join( 190 | final_output_dir, 'checkpoint_D.pth' 191 | ) 192 | 193 | checkpoint_file_G = os.path.join( 194 | final_output_dir, 'checkpoint_G.pth' 195 | ) 196 | 197 | 198 | if os.path.exists(args.load_from_D): 199 | logger.info('=> Fine tuning by loading pretrained model: {}'.format(args.load_from_D)) 200 | pretrained_dict = torch.load(args.load_from_D) 201 | pretrained_dict = {'module.' + k:v for k,v in pretrained_dict.items()} 202 | share_state = {} 203 | model_state = model.state_dict() 204 | for k, v in pretrained_dict.items(): 205 | if k in model_state and v.size() == model_state[k].size(): 206 | share_state[k] = v 207 | 208 | logger.info('Model dict :{}; shared dict :{}'.format(len(model_state), len(share_state))) 209 | model_state.update(share_state) 210 | model.load_state_dict(model_state) 211 | 212 | if os.path.exists(args.load_from_D) and args.advmix: 213 | logger.info('=> Build teacher model by loading pretrained model: {}'.format(args.load_from_D)) 214 | pretrained_dict = torch.load(args.load_from_D) 215 | pretrained_dict = {'module.' 
+ k:v for k,v in pretrained_dict.items()}
216 |         share_state = {}
217 |         model_state = model_teacher.state_dict()
218 |         for k, v in pretrained_dict.items():
219 |             if k in model_state and v.size() == model_state[k].size():
220 |                 share_state[k] = v
221 |         logger.info('Model dict :{}; shared dict :{}'.format(len(model_state), len(share_state)))
222 |         model_state.update(share_state)
223 |         model_teacher.load_state_dict(model_state)
224 | 
225 |     if os.path.exists(args.load_from_G):
226 |         pretrained_dict = torch.load(args.load_from_G)
227 |         pretrained_dict = {'module.' + k:v for k,v in pretrained_dict.items()}
228 |         share_state = {}
229 |         model_state = model_G.state_dict()
230 |         for k, v in pretrained_dict.items():
231 |             if k in model_state and v.size() == model_state[k].size():
232 |                 share_state[k] = v
233 | 
234 |         model_state.update(share_state)
235 |         model_G.load_state_dict(model_state)
236 | 
237 | 
238 |     if cfg.AUTO_RESUME and os.path.exists(checkpoint_file):
239 |         logger.info("=> loading checkpoint '{}'".format(checkpoint_file))
240 |         checkpoint = torch.load(checkpoint_file)
241 |         begin_epoch = checkpoint['epoch']
242 |         best_perf = checkpoint['perf']
243 |         last_epoch = checkpoint['epoch']
244 |         model.load_state_dict(checkpoint['state_dict'])
245 | 
246 |         optimizer.load_state_dict(checkpoint['optimizer'])
247 |         logger.info("=> loaded checkpoint '{}' (epoch {})".format(
248 |             checkpoint_file, checkpoint['epoch']))
249 | 
250 |     if cfg.AUTO_RESUME and os.path.exists(checkpoint_file) and args.advmix:
251 |         logger.info("=> loading checkpoint of teacher model '{}'".format(checkpoint_file))
252 |         checkpoint = torch.load(checkpoint_file)
253 |         begin_epoch = checkpoint['epoch']
254 |         best_perf = checkpoint['perf']
255 |         last_epoch = checkpoint['epoch']
256 |         model_teacher.load_state_dict(checkpoint['state_dict'])
257 | 
258 | 
259 |     if cfg.AUTO_RESUME and os.path.exists(checkpoint_file_G) and args.advmix:
260 |         logger.info("=> loading checkpoint of generator '{}'".format(checkpoint_file_G))
261 |         checkpoint_G = torch.load(checkpoint_file_G)
262 |         begin_epoch_G = checkpoint_G['epoch']
263 |         best_perf_G = checkpoint_G['perf']
264 |         last_epoch_G = checkpoint_G['epoch']
265 |         model_G.load_state_dict(checkpoint_G['state_dict'])
266 | 
267 |         optimizer_G.load_state_dict(checkpoint_G['optimizer'])
268 |         logger.info("=> loaded checkpoint '{}' (epoch {})".format(
269 |             checkpoint_file_G, checkpoint_G['epoch']))
270 | 
271 | 
272 |     lr_scheduler = torch.optim.lr_scheduler.MultiStepLR(
273 |         optimizer, cfg.TRAIN.LR_STEP, cfg.TRAIN.LR_FACTOR,
274 |         last_epoch=last_epoch
275 |     )
276 | 
277 |     if args.advmix:
278 |         lr_scheduler_G = torch.optim.lr_scheduler.MultiStepLR(
279 |             optimizer_G, cfg.TRAIN.LR_STEP, cfg.TRAIN.LR_FACTOR,
280 |             last_epoch=last_epoch_G
281 |         )
282 | 
283 |     for epoch in range(begin_epoch, cfg.TRAIN.END_EPOCH):
284 |         lr_scheduler.step()
285 |         if args.advmix:
286 |             lr_scheduler_G.step()
287 | 
288 |         print('=> The learning rate is: ', optimizer.param_groups[0]['lr'], optimizer_G.param_groups[0]['lr'] if args.advmix else '')  # optimizer_G only exists when --advmix is set
289 | 
290 |         if args.advmix:
291 |             train_advmix(cfg, args, train_loader, [model, model_G, model_teacher], criterion, [optimizer, optimizer_G], epoch,
292 |                          final_output_dir, tb_log_dir, writer_dict)
293 |         else:
294 |             print('=> Normal training ...')
295 |             train(cfg, args, train_loader, model, criterion, optimizer, epoch,
296 |                   final_output_dir, tb_log_dir, writer_dict)
297 | 
298 |         name_values, perf_indicator = validate(
299 |             cfg, args, valid_loader, valid_dataset, model, criterion,
300 |             final_output_dir, tb_log_dir, writer_dict
301 | ) 302 | 303 | if perf_indicator >= best_perf: 304 | best_perf = perf_indicator 305 | best_model = True 306 | else: 307 | best_model = False 308 | 309 | logger.info("==> best mAP is {}".format(best_perf)) 310 | logger.info('=> saving checkpoint to {}'.format(final_output_dir)) 311 | save_checkpoint({ 312 | 'epoch': epoch + 1, 313 | 'model': cfg.MODEL.NAME, 314 | 'state_dict': model.state_dict(), 315 | 'best_state_dict': model.module.state_dict(), 316 | 'perf': perf_indicator, 317 | 'optimizer': optimizer.state_dict(), 318 | }, best_model, final_output_dir, suffix="D") 319 | 320 | if args.advmix: 321 | save_checkpoint({ 322 | 'epoch': epoch + 1, 323 | 'model': cfg.MODEL.NAME, 324 | 'state_dict': model_G.state_dict(), 325 | 'best_state_dict': model_G.module.state_dict(), 326 | 'perf': perf_indicator, 327 | 'optimizer': optimizer_G.state_dict(), 328 | }, best_model, final_output_dir, suffix = "G") 329 | 330 | 331 | final_model_state_file = os.path.join( 332 | final_output_dir, 'final_state.pth' 333 | ) 334 | logger.info('=> saving final model state to {}'.format( 335 | final_model_state_file) 336 | ) 337 | torch.save(model.module.state_dict(), final_model_state_file) 338 | writer_dict['writer'].close() 339 | 340 | 341 | if __name__ == '__main__': 342 | main() 343 | --------------------------------------------------------------------------------
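A note on the AdvMix-specific pieces of tools/train.py: train_advmix receives the pose network, the UNet generator, and a frozen teacher copy, together with both optimizers and the --kd_mseloss / --alpha flags, but its body lives in lib/core/function.py, which is not reproduced in this listing. The snippet below is only a hedged sketch of how a knowledge-distillation term of the kind suggested by those flags could be combined with the ordinary heatmap loss; the MSE form, the detach on the teacher output, and the helper name kd_pose_loss are illustrative assumptions, not the repository's verified implementation.

    import torch.nn.functional as F

    def kd_pose_loss(criterion, student_heatmaps, teacher_heatmaps, target, target_weight, alpha=0.1):
        # supervised heatmap loss against the ground-truth targets (JointsMSELoss above)
        pose_loss = criterion(student_heatmaps, target, target_weight)
        # distillation term: keep the student close to the frozen teacher's prediction
        kd_loss = F.mse_loss(student_heatmaps, teacher_heatmaps.detach())
        return pose_loss + alpha * kd_loss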