├── 0-NICO ├── conf │ ├── __init__.py │ ├── baseline_resnet18_bf0.02.yaml │ ├── baseline_resnet18_bf0.02_mixup.yaml │ ├── baseline_t2tvit7_bf0.02.yaml │ ├── global_settings.py │ ├── ours_resnet18_multilayer4_bf0.02_noenv_pw5e5.yaml │ ├── ours_resnet18_multilayer4_bf0.02_noenv_pw5e5_mixup.yaml │ └── ours_t2tvit7_bf0.02_s4_noenv_pw5e4.yaml ├── dataset.py ├── eval_module.py ├── misc │ └── unbalance_nico_resnet18_split.npy ├── models │ ├── __pycache__ │ │ ├── bam.cpython-36.pyc │ │ ├── bam.cpython-37.pyc │ │ ├── cbam.cpython-36.pyc │ │ ├── cbam.cpython-37.pyc │ │ ├── resnet.cpython-36.pyc │ │ ├── resnet224.cpython-36.pyc │ │ ├── resnet_cbam.cpython-36.pyc │ │ ├── resnet_cbam2.cpython-36.pyc │ │ ├── resnet_ours_cbam.cpython-36.pyc │ │ ├── resnet_ours_cbam_1.cpython-36.pyc │ │ ├── resnet_ours_cbam_multi.cpython-36.pyc │ │ ├── resnet_ours_cbam_multi.cpython-37.pyc │ │ ├── resnet_senet.cpython-36.pyc │ │ ├── t2tvit.cpython-36.pyc │ │ ├── t2tvit_ours.cpython-36.pyc │ │ ├── token_performer.cpython-36.pyc │ │ ├── token_transformer.cpython-36.pyc │ │ └── transformer_block.cpython-36.pyc │ ├── attention.py │ ├── bam.py │ ├── cbam.py │ ├── densenet.py │ ├── googlenet.py │ ├── inceptionv3.py │ ├── inceptionv4.py │ ├── mobilenet.py │ ├── mobilenetv2.py │ ├── nasnet.py │ ├── preactresnet.py │ ├── resnet.py │ ├── resnet224.py │ ├── resnet_cbam.py │ ├── resnet_cbam2.py │ ├── resnet_nonlocal.py │ ├── resnet_ours_cbam.py │ ├── resnet_ours_cbam_1.py │ ├── resnet_ours_cbam_multi.py │ ├── resnet_senet.py │ ├── resnext.py │ ├── resvit.py │ ├── rir.py │ ├── senet.py │ ├── shufflenet.py │ ├── shufflenetv2.py │ ├── squeezenet.py │ ├── t2t-vit.py │ ├── t2tvit.py │ ├── t2tvit_ours.py │ ├── token_performer.py │ ├── token_transformer.py │ ├── transformer_block.py │ ├── vgg.py │ ├── wideresidual.py │ └── xception.py ├── pretrain_model │ ├── nico_resnet18_ours_caam-best.pth │ └── nico_t2tvit7_ours_caam-best.pth ├── scripts │ ├── run_baseline_resnet18.sh │ ├── run_baseline_t2tvit7.sh │ ├── run_ours_resnet18.sh │ ├── run_ours_resnet18_mixup.sh │ └── run_ours_t2tvit7.sh ├── train.py ├── train_module.py └── utils.py ├── 1-Imagenet9 ├── clusters │ ├── 9class_imagenet_val.csv │ ├── cluster_label_1.pth │ ├── cluster_label_2.pth │ └── cluster_label_3.pth ├── conf │ ├── __init__.py │ ├── baseline_resnet18_imagenet9.yaml │ ├── baseline_t2tvit7_imagenet9.yaml │ ├── global_settings.py │ ├── ours_resnet18_multi2_imagenet9_pw5e4_noenv_iter.yaml │ └── ours_t2tvit7_s4_imagenet9_noenv_pw5e4_iter.yaml ├── dataset_imagenet.py ├── eval_module.py ├── imagenet_cluster.py ├── misc │ ├── unbalance_bagnet18_imagenet9_presplit.npy │ └── unbalance_resnet18_imagenet9_presplit.npy ├── models │ ├── __pycache__ │ │ ├── bam.cpython-36.pyc │ │ ├── bam.cpython-37.pyc │ │ ├── cbam.cpython-36.pyc │ │ ├── cbam.cpython-37.pyc │ │ ├── resnet.cpython-36.pyc │ │ ├── resnet224.cpython-36.pyc │ │ ├── resnet224.cpython-37.pyc │ │ ├── resnet_cbam.cpython-36.pyc │ │ ├── resnet_cbam2.cpython-36.pyc │ │ ├── resnet_cbam2.cpython-37.pyc │ │ ├── resnet_ours_cbam.cpython-36.pyc │ │ ├── resnet_ours_cbam.cpython-37.pyc │ │ ├── resnet_ours_cbam_1.cpython-36.pyc │ │ ├── resnet_ours_cbam_multi.cpython-37.pyc │ │ ├── t2tvit.cpython-36.pyc │ │ ├── t2tvit.cpython-37.pyc │ │ ├── t2tvit_ours.cpython-36.pyc │ │ ├── t2tvit_ours.cpython-37.pyc │ │ ├── token_performer.cpython-36.pyc │ │ ├── token_performer.cpython-37.pyc │ │ ├── token_transformer.cpython-36.pyc │ │ ├── token_transformer.cpython-37.pyc │ │ ├── transformer_block.cpython-36.pyc │ │ └── transformer_block.cpython-37.pyc │ ├── 
attention.py │ ├── bam.py │ ├── cbam.py │ ├── densenet.py │ ├── googlenet.py │ ├── inceptionv3.py │ ├── inceptionv4.py │ ├── mobilenet.py │ ├── mobilenetv2.py │ ├── nasnet.py │ ├── preactresnet.py │ ├── resnet.py │ ├── resnet224.py │ ├── resnet_cbam.py │ ├── resnet_cbam2.py │ ├── resnet_nonlocal.py │ ├── resnet_ours_cbam.py │ ├── resnet_ours_cbam_1.py │ ├── resnet_ours_cbam_multi.py │ ├── resnext.py │ ├── resvit.py │ ├── rir.py │ ├── senet.py │ ├── shufflenet.py │ ├── shufflenetv2.py │ ├── squeezenet.py │ ├── t2t-vit.py │ ├── t2tvit.py │ ├── t2tvit_ours.py │ ├── token_performer.py │ ├── token_transformer.py │ ├── transformer_block.py │ ├── vgg.py │ ├── wideresidual.py │ └── xception.py ├── pre_cluster_results │ ├── backup │ │ ├── cluster_label_1.pth │ │ ├── cluster_label_2.pth │ │ ├── cluster_label_3.pth │ │ ├── cluster_sample_1.jpg │ │ ├── cluster_sample_2.jpg │ │ └── cluster_sample_3.jpg │ ├── cluster_label_1.pth │ ├── cluster_label_2.pth │ ├── cluster_label_3.pth │ ├── cluster_sample_1.jpg │ ├── cluster_sample_2.jpg │ └── cluster_sample_3.jpg ├── pretrain_model │ ├── resnet_ours │ │ └── resnet18_caam.pth │ └── t2tvit_ours │ │ └── t2tvit7_caam.pth ├── scripts │ ├── run_baseline_resnet18.sh │ └── run_ours_resnet18.sh ├── train.py ├── train_module.py └── utils.py └── README.md /0-NICO/conf/__init__.py: -------------------------------------------------------------------------------- 1 | """ dynamically load settings 2 | 3 | author baiyu 4 | """ 5 | import conf.global_settings as settings 6 | 7 | class Settings: 8 | def __init__(self, settings): 9 | 10 | for attr in dir(settings): 11 | if attr.isupper(): 12 | setattr(self, attr, getattr(settings, attr)) 13 | 14 | settings = Settings(settings) -------------------------------------------------------------------------------- /0-NICO/conf/baseline_resnet18_bf0.02.yaml: -------------------------------------------------------------------------------- 1 | exp_name: nico_resvit18_multi_unshuffle_bf0.02_lr0.01 2 | net: resnet18 3 | dataset: NICO 4 | image_folder: /data2/wangtan/causal-invariant-attention/dataset/NICO/multi_classification 5 | cxt_dic_path: /data2/wangtan/causal-invariant-attention/dataset/NICO/label_file/Context_name2label.json 6 | class_dic_path: /data2/wangtan/causal-invariant-attention/dataset/NICO/label_file/Animal_name2label.json 7 | training_opt: 8 | seed: 0 9 | batch_size: 128 10 | lr: 0.05 11 | warm: 2 12 | epoch: 200 13 | milestones: [80, 120, 160] 14 | # milestones: [80, 140, 200] 15 | save_epoch: 20 16 | print_batch: 1 17 | mean: [0.52418953, 0.5233741, 0.44896784] 18 | std: [0.21851876, 0.2175944, 0.22552039] 19 | variance_opt: 20 | balance_factor: 0.02 21 | training_dist: {'dog': ['on_grass','in_water','in_cage','eating','on_beach','lying','running'], 22 | 'cat': ['on_snow','at_home','in_street','walking','in_river','in_cage','eating'], 23 | 'bear': ['in_forest','black','brown','eating_grass','in_water','lying','on_snow'], 24 | 'bird': ['on_ground', 'in_hand','on_branch','flying','eating','on_grass','standing'], 25 | 'cow': ['in_river', 'lying', 'standing','eating','in_forest','on_grass','on_snow'], 26 | 'elephant': ['in_zoo', 'in_circus', 'in_forest', 'in_river','eating','standing','on_grass'], 27 | 'horse': ['on_beach', 'aside_people', 'running','lying','on_grass','on_snow','in_forest'], 28 | 'monkey': ['sitting', 'walking', 'in_water','on_snow','in_forest','eating','on_grass'], 29 | 'rat': ['at_home', 'in_hole', 'in_cage','in_forest','in_water','on_grass','eating'], 30 | 'sheep': ['eating', 
'on_road','walking','on_snow','on_grass','lying','in_forest']} 31 | mode: 'baseline' 32 | env_type: 'baseline' 33 | resume: False -------------------------------------------------------------------------------- /0-NICO/conf/baseline_resnet18_bf0.02_mixup.yaml: -------------------------------------------------------------------------------- 1 | exp_name: nico_resvit18_multi_unshuffle_bf0.02_lr0.01 2 | net: resnet18 3 | dataset: NICO 4 | image_folder: /data2/wangtan/causal-invariant-attention/dataset/NICO/multi_classification 5 | cxt_dic_path: /data2/wangtan/causal-invariant-attention/dataset/NICO/label_file/Context_name2label.json 6 | class_dic_path: /data2/wangtan/causal-invariant-attention/dataset/NICO/label_file/Animal_name2label.json 7 | training_opt: 8 | seed: 0 9 | batch_size: 128 10 | lr: 0.05 11 | warm: 2 12 | epoch: 200 13 | milestones: [80, 120, 160] 14 | # milestones: [80, 140, 200] 15 | save_epoch: 20 16 | print_batch: 1 17 | mean: [0.52418953, 0.5233741, 0.44896784] 18 | std: [0.21851876, 0.2175944, 0.22552039] 19 | mixup: True 20 | variance_opt: 21 | balance_factor: 0.02 22 | training_dist: {'dog': ['on_grass','in_water','in_cage','eating','on_beach','lying','running'], 23 | 'cat': ['on_snow','at_home','in_street','walking','in_river','in_cage','eating'], 24 | 'bear': ['in_forest','black','brown','eating_grass','in_water','lying','on_snow'], 25 | 'bird': ['on_ground', 'in_hand','on_branch','flying','eating','on_grass','standing'], 26 | 'cow': ['in_river', 'lying', 'standing','eating','in_forest','on_grass','on_snow'], 27 | 'elephant': ['in_zoo', 'in_circus', 'in_forest', 'in_river','eating','standing','on_grass'], 28 | 'horse': ['on_beach', 'aside_people', 'running','lying','on_grass','on_snow','in_forest'], 29 | 'monkey': ['sitting', 'walking', 'in_water','on_snow','in_forest','eating','on_grass'], 30 | 'rat': ['at_home', 'in_hole', 'in_cage','in_forest','in_water','on_grass','eating'], 31 | 'sheep': ['eating', 'on_road','walking','on_snow','on_grass','lying','in_forest']} 32 | mode: 'baseline' 33 | env_type: 'baseline' 34 | resume: False -------------------------------------------------------------------------------- /0-NICO/conf/baseline_t2tvit7_bf0.02.yaml: -------------------------------------------------------------------------------- 1 | exp_name: nico_resvit18_multi_unshuffle_bf0.02_lr0.01 2 | net: t2tvit7 3 | dataset: NICO 4 | image_folder: /data2/wangtan/causal-invariant-attention/dataset/NICO/multi_classification 5 | cxt_dic_path: /data2/wangtan/causal-invariant-attention/dataset/NICO/label_file/Context_name2label.json 6 | class_dic_path: /data2/wangtan/causal-invariant-attention/dataset/NICO/label_file/Animal_name2label.json 7 | training_opt: 8 | seed: 0 9 | batch_size: 128 10 | optim: 11 | sched: baseline 12 | lr: 0.0005 13 | warm: 2 14 | epoch: 200 15 | milestones: [80, 120, 160] 16 | # milestones: [80, 140, 200] 17 | save_epoch: 20 18 | print_batch: 1 19 | mean: [0.52418953, 0.5233741, 0.44896784] 20 | std: [0.21851876, 0.2175944, 0.22552039] 21 | variance_opt: 22 | balance_factor: 0.02 23 | training_dist: {'dog': ['on_grass','in_water','in_cage','eating','on_beach','lying','running'], 24 | 'cat': ['on_snow','at_home','in_street','walking','in_river','in_cage','eating'], 25 | 'bear': ['in_forest','black','brown','eating_grass','in_water','lying','on_snow'], 26 | 'bird': ['on_ground', 'in_hand','on_branch','flying','eating','on_grass','standing'], 27 | 'cow': ['in_river', 'lying', 'standing','eating','in_forest','on_grass','on_snow'], 28 | 'elephant': ['in_zoo', 
'in_circus', 'in_forest', 'in_river','eating','standing','on_grass'], 29 | 'horse': ['on_beach', 'aside_people', 'running','lying','on_grass','on_snow','in_forest'], 30 | 'monkey': ['sitting', 'walking', 'in_water','on_snow','in_forest','eating','on_grass'], 31 | 'rat': ['at_home', 'in_hole', 'in_cage','in_forest','in_water','on_grass','eating'], 32 | 'sheep': ['eating', 'on_road','walking','on_snow','on_grass','lying','in_forest']} 33 | mode: 'baseline' 34 | env_type: 'baseline' 35 | resume: False 36 | -------------------------------------------------------------------------------- /0-NICO/conf/global_settings.py: -------------------------------------------------------------------------------- 1 | """ configurations for this project 2 | 3 | author baiyu 4 | """ 5 | import os 6 | from datetime import datetime 7 | 8 | #CIFAR100 dataset path (python version) 9 | #CIFAR100_PATH = '/nfs/private/cifar100/cifar-100-python' 10 | 11 | #mean and std of cifar100 dataset 12 | CIFAR100_TRAIN_MEAN = (0.5070751592371323, 0.48654887331495095, 0.4409178433670343) 13 | CIFAR100_TRAIN_STD = (0.2673342858792401, 0.2564384629170883, 0.27615047132568404) 14 | 15 | #directory to save weights file 16 | CHECKPOINT_PATH = 'checkpoint' 17 | 18 | DATE_FORMAT = '%A_%d_%B_%Y_%Hh_%Mm_%Ss' 19 | #time of we run the script 20 | TIME_NOW = datetime.now().strftime(DATE_FORMAT) 21 | 22 | #tensorboard log dir 23 | LOG_DIR = 'runs' 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | -------------------------------------------------------------------------------- /0-NICO/conf/ours_resnet18_multilayer4_bf0.02_noenv_pw5e5.yaml: -------------------------------------------------------------------------------- 1 | exp_name: nico_resvit18_multi_unshuffle_bf0.02_lr0.01 2 | net: resnet18_ours_cbam_multi 3 | dataset: NICO 4 | image_folder: /data2/wangtan/causal-invariant-attention/dataset/NICO/multi_classification 5 | cxt_dic_path: /data2/wangtan/causal-invariant-attention/dataset/NICO/label_file/Context_name2label.json 6 | class_dic_path: /data2/wangtan/causal-invariant-attention/dataset/NICO/label_file/Animal_name2label.json 7 | training_opt: 8 | seed: 0 9 | classes: 10 10 | batch_size: 32 11 | lr: 0.05 12 | warm: 2 13 | epoch: 200 14 | milestones: [80, 120, 160] 15 | # milestones: [80, 140, 200] 16 | save_epoch: 20 17 | print_batch: 1 18 | mean: [0.52418953, 0.5233741, 0.44896784] 19 | std: [0.21851876, 0.2175944, 0.22552039] 20 | variance_opt: 21 | balance_factor: 0.02 22 | training_dist: {'dog': ['on_grass','in_water','in_cage','eating','on_beach','lying','running'], 23 | 'cat': ['on_snow','at_home','in_street','walking','in_river','in_cage','eating'], 24 | 'bear': ['in_forest','black','brown','eating_grass','in_water','lying','on_snow'], 25 | 'bird': ['on_ground', 'in_hand','on_branch','flying','eating','on_grass','standing'], 26 | 'cow': ['in_river', 'lying', 'standing','eating','in_forest','on_grass','on_snow'], 27 | 'elephant': ['in_zoo', 'in_circus', 'in_forest', 'in_river','eating','standing','on_grass'], 28 | 'horse': ['on_beach', 'aside_people', 'running','lying','on_grass','on_snow','in_forest'], 29 | 'monkey': ['sitting', 'walking', 'in_water','on_snow','in_forest','eating','on_grass'], 30 | 'rat': ['at_home', 'in_hole', 'in_cage','in_forest','in_water','on_grass','eating'], 31 | 'sheep': ['eating', 'on_road','walking','on_snow','on_grass','lying','in_forest']} 32 | env: True 33 | mode: 'ours' 34 | erm_flag: True 35 | sp_flag: False 36 | n_env: 4 37 | env_type: 'auto-iter' 38 | split_renew: 40 39 | split_renew_iters: 20 40 
| from_scratch: False 41 | ref_model_path: /data2/wangtan/causal-invariant-attention/multi-classification/nico/checkpoint/resnet18/multi_baseline_resnet18_bf0.02/resnet18-180-regular.pth 42 | penalty_weight: 5e5 43 | penalty_anneal_iters: 0 44 | #2 blocks, 4 layers 45 | split_layer: 2 46 | resume: False -------------------------------------------------------------------------------- /0-NICO/conf/ours_resnet18_multilayer4_bf0.02_noenv_pw5e5_mixup.yaml: -------------------------------------------------------------------------------- 1 | exp_name: nico_resvit18_multi_unshuffle_bf0.02_lr0.01 2 | net: resnet18_ours_cbam_multi 3 | dataset: NICO 4 | image_folder: /data2/wangtan/causal-invariant-attention/dataset/NICO/multi_classification 5 | cxt_dic_path: /data2/wangtan/causal-invariant-attention/dataset/NICO/label_file/Context_name2label.json 6 | class_dic_path: /data2/wangtan/causal-invariant-attention/dataset/NICO/label_file/Animal_name2label.json 7 | training_opt: 8 | seed: 0 9 | classes: 10 10 | batch_size: 32 11 | lr: 0.05 12 | warm: 2 13 | epoch: 200 14 | milestones: [80, 120, 160] 15 | # milestones: [80, 140, 200] 16 | save_epoch: 20 17 | print_batch: 1 18 | mean: [0.52418953, 0.5233741, 0.44896784] 19 | std: [0.21851876, 0.2175944, 0.22552039] 20 | mixup: True 21 | variance_opt: 22 | balance_factor: 0.02 23 | training_dist: {'dog': ['on_grass','in_water','in_cage','eating','on_beach','lying','running'], 24 | 'cat': ['on_snow','at_home','in_street','walking','in_river','in_cage','eating'], 25 | 'bear': ['in_forest','black','brown','eating_grass','in_water','lying','on_snow'], 26 | 'bird': ['on_ground', 'in_hand','on_branch','flying','eating','on_grass','standing'], 27 | 'cow': ['in_river', 'lying', 'standing','eating','in_forest','on_grass','on_snow'], 28 | 'elephant': ['in_zoo', 'in_circus', 'in_forest', 'in_river','eating','standing','on_grass'], 29 | 'horse': ['on_beach', 'aside_people', 'running','lying','on_grass','on_snow','in_forest'], 30 | 'monkey': ['sitting', 'walking', 'in_water','on_snow','in_forest','eating','on_grass'], 31 | 'rat': ['at_home', 'in_hole', 'in_cage','in_forest','in_water','on_grass','eating'], 32 | 'sheep': ['eating', 'on_road','walking','on_snow','on_grass','lying','in_forest']} 33 | env: True 34 | mode: 'ours' 35 | erm_flag: True 36 | sp_flag: False 37 | n_env: 4 38 | env_type: 'auto-iter' 39 | split_renew: 40 40 | split_renew_iters: 20 41 | from_scratch: False 42 | ref_model_path: /data2/wangtan/causal-invariant-attention/multi-classification/nico/checkpoint/resnet18/multi_baseline_resnet18_bf0.02/resnet18-180-regular.pth 43 | penalty_weight: 5e5 44 | penalty_anneal_iters: 0 45 | #2 blocks, 4 layers 46 | split_layer: 2 47 | resume: False -------------------------------------------------------------------------------- /0-NICO/conf/ours_t2tvit7_bf0.02_s4_noenv_pw5e4.yaml: -------------------------------------------------------------------------------- 1 | exp_name: nico_resvit18_multi_unshuffle_bf0.02_lr0.01 2 | net: t2tvit7_ours 3 | dataset: NICO 4 | image_folder: /data2/wangtan/causal-invariant-attention/dataset/NICO/multi_classification 5 | cxt_dic_path: /data2/wangtan/causal-invariant-attention/dataset/NICO/label_file/Context_name2label.json 6 | class_dic_path: /data2/wangtan/causal-invariant-attention/dataset/NICO/label_file/Animal_name2label.json 7 | training_opt: 8 | seed: 0 9 | classes: 10 10 | batch_size: 32 11 | optim: 12 | sched: baseline 13 | lr: 0.001 14 | warm: 2 15 | epoch: 200 16 | milestones: [80, 120, 160] 17 | # milestones: [80, 140, 200] 
18 | save_epoch: 20
19 | print_batch: 1
20 | mean: [0.52418953, 0.5233741, 0.44896784]
21 | std: [0.21851876, 0.2175944, 0.22552039]
22 | variance_opt:
23 | balance_factor: 0.02
24 | training_dist: {'dog': ['on_grass','in_water','in_cage','eating','on_beach','lying','running'],
25 | 'cat': ['on_snow','at_home','in_street','walking','in_river','in_cage','eating'],
26 | 'bear': ['in_forest','black','brown','eating_grass','in_water','lying','on_snow'],
27 | 'bird': ['on_ground', 'in_hand','on_branch','flying','eating','on_grass','standing'],
28 | 'cow': ['in_river', 'lying', 'standing','eating','in_forest','on_grass','on_snow'],
29 | 'elephant': ['in_zoo', 'in_circus', 'in_forest', 'in_river','eating','standing','on_grass'],
30 | 'horse': ['on_beach', 'aside_people', 'running','lying','on_grass','on_snow','in_forest'],
31 | 'monkey': ['sitting', 'walking', 'in_water','on_snow','in_forest','eating','on_grass'],
32 | 'rat': ['at_home', 'in_hole', 'in_cage','in_forest','in_water','on_grass','eating'],
33 | 'sheep': ['eating', 'on_road','walking','on_snow','on_grass','lying','in_forest']}
34 | env: True
35 | mode: 'ours'
36 | erm_flag: True
37 | sp_flag: False
38 | n_env: 4
39 | env_type: 'auto-iter'
40 | split_renew: 40
41 | split_renew_iters: 20
42 | from_scratch: False
43 | penalty_weight: 5e4
44 | penalty_anneal_iters: 0
45 | final_k: 4
46 | resume: False
--------------------------------------------------------------------------------
/0-NICO/dataset.py:
--------------------------------------------------------------------------------
1 | import os
2 | import sys
3 | import pickle
4 | 
5 | from skimage import io
6 | import matplotlib.pyplot as plt
7 | import numpy
8 | import torch
9 | from torch.utils.data import Dataset
10 | 
11 | class CIFAR100Train(Dataset):
12 |     """cifar100 train dataset, derived from
13 |     torch.utils.data.Dataset
14 |     """
15 | 
16 |     def __init__(self, path, transform=None):
17 |         #if a transform is given, apply it to the data in __getitem__
18 |         with open(os.path.join(path, 'train'), 'rb') as cifar100:
19 |             self.data = pickle.load(cifar100, encoding='bytes')
20 |         self.transform = transform
21 | 
22 |     def __len__(self):
23 |         return len(self.data['fine_labels'.encode()])
24 | 
25 |     def __getitem__(self, index):
26 |         label = self.data['fine_labels'.encode()][index]
27 |         r = self.data['data'.encode()][index, :1024].reshape(32, 32)
28 |         g = self.data['data'.encode()][index, 1024:2048].reshape(32, 32)
29 |         b = self.data['data'.encode()][index, 2048:].reshape(32, 32)
30 |         image = numpy.dstack((r, g, b))
31 | 
32 |         if self.transform:
33 |             image = self.transform(image)
34 |         return label, image
35 | 
36 | class CIFAR100Test(Dataset):
37 |     """cifar100 test dataset, derived from
38 |     torch.utils.data.Dataset
39 |     """
40 | 
41 |     def __init__(self, path, transform=None):
42 |         with open(os.path.join(path, 'test'), 'rb') as cifar100:
43 |             self.data = pickle.load(cifar100, encoding='bytes')
44 |         self.transform = transform
45 | 
46 |     def __len__(self):
47 |         return len(self.data['data'.encode()])
48 | 
49 |     def __getitem__(self, index):
50 |         label = self.data['fine_labels'.encode()][index]
51 |         r = self.data['data'.encode()][index, :1024].reshape(32, 32)
52 |         g = self.data['data'.encode()][index, 1024:2048].reshape(32, 32)
53 |         b = self.data['data'.encode()][index, 2048:].reshape(32, 32)
54 |         image = numpy.dstack((r, g, b))
55 | 
56 |         if self.transform:
57 |             image = self.transform(image)
58 |         return label, image
59 | 
60 | 
--------------------------------------------------------------------------------
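A minimal usage sketch for the dataset classes above. The dataset path, transform pipeline, and loader settings here are illustrative assumptions, not taken from the repo's train.py; the one thing fixed by the code above is that __getitem__ returns (label, image), not the usual (image, label) order.

import torchvision.transforms as transforms
from torch.utils.data import DataLoader

train_transform = transforms.Compose([
    transforms.ToPILImage(),           # __getitem__ yields an HxWx3 numpy array
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
])
# '/path/to/cifar-100-python' is a placeholder for the extracted pickle folder
train_set = CIFAR100Train('/path/to/cifar-100-python', transform=train_transform)
train_loader = DataLoader(train_set, batch_size=128, shuffle=True, num_workers=4)

for labels, images in train_loader:    # (label, image) order, matching __getitem__
    pass                               # training step would go here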
/0-NICO/misc/unbalance_nico_resnet18_split.npy: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Wangt-CN/CaaM/32b9bb24c10a0fd8c426ae50bb8a09ccd420d1aa/0-NICO/misc/unbalance_nico_resnet18_split.npy -------------------------------------------------------------------------------- /0-NICO/models/__pycache__/bam.cpython-36.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Wangt-CN/CaaM/32b9bb24c10a0fd8c426ae50bb8a09ccd420d1aa/0-NICO/models/__pycache__/bam.cpython-36.pyc -------------------------------------------------------------------------------- /0-NICO/models/__pycache__/bam.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Wangt-CN/CaaM/32b9bb24c10a0fd8c426ae50bb8a09ccd420d1aa/0-NICO/models/__pycache__/bam.cpython-37.pyc -------------------------------------------------------------------------------- /0-NICO/models/__pycache__/cbam.cpython-36.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Wangt-CN/CaaM/32b9bb24c10a0fd8c426ae50bb8a09ccd420d1aa/0-NICO/models/__pycache__/cbam.cpython-36.pyc -------------------------------------------------------------------------------- /0-NICO/models/__pycache__/cbam.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Wangt-CN/CaaM/32b9bb24c10a0fd8c426ae50bb8a09ccd420d1aa/0-NICO/models/__pycache__/cbam.cpython-37.pyc -------------------------------------------------------------------------------- /0-NICO/models/__pycache__/resnet.cpython-36.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Wangt-CN/CaaM/32b9bb24c10a0fd8c426ae50bb8a09ccd420d1aa/0-NICO/models/__pycache__/resnet.cpython-36.pyc -------------------------------------------------------------------------------- /0-NICO/models/__pycache__/resnet224.cpython-36.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Wangt-CN/CaaM/32b9bb24c10a0fd8c426ae50bb8a09ccd420d1aa/0-NICO/models/__pycache__/resnet224.cpython-36.pyc -------------------------------------------------------------------------------- /0-NICO/models/__pycache__/resnet_cbam.cpython-36.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Wangt-CN/CaaM/32b9bb24c10a0fd8c426ae50bb8a09ccd420d1aa/0-NICO/models/__pycache__/resnet_cbam.cpython-36.pyc -------------------------------------------------------------------------------- /0-NICO/models/__pycache__/resnet_cbam2.cpython-36.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Wangt-CN/CaaM/32b9bb24c10a0fd8c426ae50bb8a09ccd420d1aa/0-NICO/models/__pycache__/resnet_cbam2.cpython-36.pyc -------------------------------------------------------------------------------- /0-NICO/models/__pycache__/resnet_ours_cbam.cpython-36.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Wangt-CN/CaaM/32b9bb24c10a0fd8c426ae50bb8a09ccd420d1aa/0-NICO/models/__pycache__/resnet_ours_cbam.cpython-36.pyc -------------------------------------------------------------------------------- 
/0-NICO/models/__pycache__/resnet_ours_cbam_1.cpython-36.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Wangt-CN/CaaM/32b9bb24c10a0fd8c426ae50bb8a09ccd420d1aa/0-NICO/models/__pycache__/resnet_ours_cbam_1.cpython-36.pyc -------------------------------------------------------------------------------- /0-NICO/models/__pycache__/resnet_ours_cbam_multi.cpython-36.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Wangt-CN/CaaM/32b9bb24c10a0fd8c426ae50bb8a09ccd420d1aa/0-NICO/models/__pycache__/resnet_ours_cbam_multi.cpython-36.pyc -------------------------------------------------------------------------------- /0-NICO/models/__pycache__/resnet_ours_cbam_multi.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Wangt-CN/CaaM/32b9bb24c10a0fd8c426ae50bb8a09ccd420d1aa/0-NICO/models/__pycache__/resnet_ours_cbam_multi.cpython-37.pyc -------------------------------------------------------------------------------- /0-NICO/models/__pycache__/resnet_senet.cpython-36.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Wangt-CN/CaaM/32b9bb24c10a0fd8c426ae50bb8a09ccd420d1aa/0-NICO/models/__pycache__/resnet_senet.cpython-36.pyc -------------------------------------------------------------------------------- /0-NICO/models/__pycache__/t2tvit.cpython-36.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Wangt-CN/CaaM/32b9bb24c10a0fd8c426ae50bb8a09ccd420d1aa/0-NICO/models/__pycache__/t2tvit.cpython-36.pyc -------------------------------------------------------------------------------- /0-NICO/models/__pycache__/t2tvit_ours.cpython-36.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Wangt-CN/CaaM/32b9bb24c10a0fd8c426ae50bb8a09ccd420d1aa/0-NICO/models/__pycache__/t2tvit_ours.cpython-36.pyc -------------------------------------------------------------------------------- /0-NICO/models/__pycache__/token_performer.cpython-36.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Wangt-CN/CaaM/32b9bb24c10a0fd8c426ae50bb8a09ccd420d1aa/0-NICO/models/__pycache__/token_performer.cpython-36.pyc -------------------------------------------------------------------------------- /0-NICO/models/__pycache__/token_transformer.cpython-36.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Wangt-CN/CaaM/32b9bb24c10a0fd8c426ae50bb8a09ccd420d1aa/0-NICO/models/__pycache__/token_transformer.cpython-36.pyc -------------------------------------------------------------------------------- /0-NICO/models/__pycache__/transformer_block.cpython-36.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Wangt-CN/CaaM/32b9bb24c10a0fd8c426ae50bb8a09ccd420d1aa/0-NICO/models/__pycache__/transformer_block.cpython-36.pyc -------------------------------------------------------------------------------- /0-NICO/models/bam.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import math 3 | import torch.nn as nn 4 | import torch.nn.functional as F 5 | 6 | class Flatten(nn.Module): 7 | def 
forward(self, x):
8 |         return x.view(x.size(0), -1)
9 | class ChannelGate(nn.Module):
10 |     def __init__(self, gate_channel, reduction_ratio=16, num_layers=1):
11 |         super(ChannelGate, self).__init__()
12 |         # channel attention: an MLP applied to the globally average-pooled feature
13 |         self.gate_c = nn.Sequential()
14 |         self.gate_c.add_module( 'flatten', Flatten() )
15 |         gate_channels = [gate_channel]
16 |         gate_channels += [gate_channel // reduction_ratio] * num_layers
17 |         gate_channels += [gate_channel]
18 |         for i in range( len(gate_channels) - 2 ):
19 |             self.gate_c.add_module( 'gate_c_fc_%d'%i, nn.Linear(gate_channels[i], gate_channels[i+1]) )
20 |             self.gate_c.add_module( 'gate_c_bn_%d'%(i+1), nn.BatchNorm1d(gate_channels[i+1]) )
21 |             self.gate_c.add_module( 'gate_c_relu_%d'%(i+1), nn.ReLU() )
22 |         self.gate_c.add_module( 'gate_c_fc_final', nn.Linear(gate_channels[-2], gate_channels[-1]) )
23 |     def forward(self, in_tensor):
24 |         avg_pool = F.avg_pool2d( in_tensor, in_tensor.size(2), stride=in_tensor.size(2) )
25 |         return self.gate_c( avg_pool ).unsqueeze(2).unsqueeze(3).expand_as(in_tensor)
26 | 
27 | class SpatialGate(nn.Module):
28 |     def __init__(self, gate_channel, reduction_ratio=16, dilation_conv_num=2, dilation_val=4):
29 |         super(SpatialGate, self).__init__()
30 |         self.gate_s = nn.Sequential()
31 |         self.gate_s.add_module( 'gate_s_conv_reduce0', nn.Conv2d(gate_channel, gate_channel//reduction_ratio, kernel_size=1))
32 |         self.gate_s.add_module( 'gate_s_bn_reduce0', nn.BatchNorm2d(gate_channel//reduction_ratio) )
33 |         self.gate_s.add_module( 'gate_s_relu_reduce0',nn.ReLU() )
34 |         for i in range( dilation_conv_num ):
35 |             self.gate_s.add_module( 'gate_s_conv_di_%d'%i, nn.Conv2d(gate_channel//reduction_ratio, gate_channel//reduction_ratio, kernel_size=3, \
36 |                 padding=dilation_val, dilation=dilation_val) )
37 |             self.gate_s.add_module( 'gate_s_bn_di_%d'%i, nn.BatchNorm2d(gate_channel//reduction_ratio) )
38 |             self.gate_s.add_module( 'gate_s_relu_di_%d'%i, nn.ReLU() )
39 |         self.gate_s.add_module( 'gate_s_conv_final', nn.Conv2d(gate_channel//reduction_ratio, 1, kernel_size=1) )
40 |     def forward(self, in_tensor):
41 |         return self.gate_s( in_tensor ).expand_as(in_tensor)
42 | class BAM(nn.Module):
43 |     def __init__(self, gate_channel):
44 |         super(BAM, self).__init__()
45 |         self.channel_att = ChannelGate(gate_channel)
46 |         self.spatial_att = SpatialGate(gate_channel)
47 |     def forward(self,in_tensor):
48 |         att = 1 + F.sigmoid( self.channel_att(in_tensor) * self.spatial_att(in_tensor) )
49 |         return att * in_tensor
--------------------------------------------------------------------------------
/0-NICO/models/cbam.py:
--------------------------------------------------------------------------------
1 | import torch
2 | import math
3 | import torch.nn as nn
4 | import torch.nn.functional as F
5 | 
6 | class BasicConv(nn.Module):
7 |     def __init__(self, in_planes, out_planes, kernel_size, stride=1, padding=0, dilation=1, groups=1, relu=True, bn=True, bias=False):
8 |         super(BasicConv, self).__init__()
9 |         self.out_channels = out_planes
10 |         self.conv = nn.Conv2d(in_planes, out_planes, kernel_size=kernel_size, stride=stride, padding=padding, dilation=dilation, groups=groups, bias=bias)
11 |         self.bn = nn.BatchNorm2d(out_planes,eps=1e-5, momentum=0.01, affine=True) if bn else None
12 |         self.relu = nn.ReLU() if relu else None
13 | 
14 |     def forward(self, x):
15 |         x = self.conv(x)
16 |         if self.bn is not None:
17 |             x = self.bn(x)
18 |         if self.relu is not None:
19 |             x = self.relu(x)
20 |         return x
21 | 
22 | class Flatten(nn.Module):
23 |     def
forward(self, x): 24 | return x.view(x.size(0), -1) 25 | 26 | class ChannelGate(nn.Module): 27 | def __init__(self, gate_channels, reduction_ratio=16, pool_types=['avg', 'max']): 28 | super(ChannelGate, self).__init__() 29 | self.gate_channels = gate_channels 30 | self.mlp = nn.Sequential( 31 | Flatten(), 32 | nn.Linear(gate_channels, gate_channels // reduction_ratio), 33 | nn.ReLU(), 34 | nn.Linear(gate_channels // reduction_ratio, gate_channels) 35 | ) 36 | self.pool_types = pool_types 37 | def forward(self, x, return_attn=False): 38 | channel_att_sum = None 39 | for pool_type in self.pool_types: 40 | if pool_type=='avg': 41 | avg_pool = F.avg_pool2d( x, (x.size(2), x.size(3)), stride=(x.size(2), x.size(3))) 42 | channel_att_raw = self.mlp( avg_pool ) 43 | elif pool_type=='max': 44 | max_pool = F.max_pool2d( x, (x.size(2), x.size(3)), stride=(x.size(2), x.size(3))) 45 | channel_att_raw = self.mlp( max_pool ) 46 | elif pool_type=='lp': 47 | lp_pool = F.lp_pool2d( x, 2, (x.size(2), x.size(3)), stride=(x.size(2), x.size(3))) 48 | channel_att_raw = self.mlp( lp_pool ) 49 | elif pool_type=='lse': 50 | # LSE pool only 51 | lse_pool = logsumexp_2d(x) 52 | channel_att_raw = self.mlp( lse_pool ) 53 | 54 | if channel_att_sum is None: 55 | channel_att_sum = channel_att_raw 56 | else: 57 | channel_att_sum = channel_att_sum + channel_att_raw 58 | 59 | scale = F.sigmoid( channel_att_sum ).unsqueeze(2).unsqueeze(3).expand_as(x) 60 | if return_attn: 61 | return x * scale, scale 62 | else: 63 | return x * scale 64 | 65 | def logsumexp_2d(tensor): 66 | tensor_flatten = tensor.view(tensor.size(0), tensor.size(1), -1) 67 | s, _ = torch.max(tensor_flatten, dim=2, keepdim=True) 68 | outputs = s + (tensor_flatten - s).exp().sum(dim=2, keepdim=True).log() 69 | return outputs 70 | 71 | class ChannelPool(nn.Module): 72 | def forward(self, x): 73 | return torch.cat( (torch.max(x,1)[0].unsqueeze(1), torch.mean(x,1).unsqueeze(1)), dim=1 ) 74 | 75 | class SpatialGate(nn.Module): 76 | def __init__(self): 77 | super(SpatialGate, self).__init__() 78 | kernel_size = 7 79 | self.compress = ChannelPool() 80 | self.spatial = BasicConv(2, 1, kernel_size, stride=1, padding=(kernel_size-1) // 2, relu=False) 81 | def forward(self, x, return_attn=False): 82 | x_compress = self.compress(x) 83 | x_out = self.spatial(x_compress) 84 | scale = F.sigmoid(x_out) # broadcasting 85 | if return_attn: 86 | return x * scale, scale 87 | else: 88 | return x * scale 89 | 90 | class CBAM(nn.Module): 91 | def __init__(self, gate_channels, reduction_ratio=16, pool_types=['avg', 'max'], no_spatial=False): 92 | super(CBAM, self).__init__() 93 | self.ChannelGate = ChannelGate(gate_channels, reduction_ratio, pool_types) 94 | self.no_spatial=no_spatial 95 | if not no_spatial: 96 | self.SpatialGate = SpatialGate() 97 | def forward(self, x, return_attn=False): 98 | # x_fake = torch.ones_like(x) 99 | if return_attn: 100 | x_out, attn1 = self.ChannelGate(x, return_attn=True) 101 | if not self.no_spatial: 102 | x_out, attn2 = self.SpatialGate(x_out, return_attn=True) 103 | return x_out, attn1, attn2 104 | else: 105 | return x_out, attn1 106 | else: 107 | x_out = self.ChannelGate(x) 108 | if not self.no_spatial: 109 | x_out = self.SpatialGate(x_out) 110 | return x_out -------------------------------------------------------------------------------- /0-NICO/models/densenet.py: -------------------------------------------------------------------------------- 1 | """dense net in pytorch 2 | 3 | 4 | 5 | [1] Gao Huang, Zhuang Liu, Laurens van der Maaten, Kilian Q. 
Weinberger. 6 | 7 | Densely Connected Convolutional Networks 8 | https://arxiv.org/abs/1608.06993v5 9 | """ 10 | 11 | import torch 12 | import torch.nn as nn 13 | 14 | 15 | 16 | #"""Bottleneck layers. Although each layer only produces k 17 | #output feature-maps, it typically has many more inputs. It 18 | #has been noted in [37, 11] that a 1×1 convolution can be in- 19 | #troduced as bottleneck layer before each 3×3 convolution 20 | #to reduce the number of input feature-maps, and thus to 21 | #improve computational efficiency.""" 22 | class Bottleneck(nn.Module): 23 | def __init__(self, in_channels, growth_rate): 24 | super().__init__() 25 | #"""In our experiments, we let each 1×1 convolution 26 | #produce 4k feature-maps.""" 27 | inner_channel = 4 * growth_rate 28 | 29 | #"""We find this design especially effective for DenseNet and 30 | #we refer to our network with such a bottleneck layer, i.e., 31 | #to the BN-ReLU-Conv(1×1)-BN-ReLU-Conv(3×3) version of H ` , 32 | #as DenseNet-B.""" 33 | self.bottle_neck = nn.Sequential( 34 | nn.BatchNorm2d(in_channels), 35 | nn.ReLU(inplace=True), 36 | nn.Conv2d(in_channels, inner_channel, kernel_size=1, bias=False), 37 | nn.BatchNorm2d(inner_channel), 38 | nn.ReLU(inplace=True), 39 | nn.Conv2d(inner_channel, growth_rate, kernel_size=3, padding=1, bias=False) 40 | ) 41 | 42 | def forward(self, x): 43 | return torch.cat([x, self.bottle_neck(x)], 1) 44 | 45 | #"""We refer to layers between blocks as transition 46 | #layers, which do convolution and pooling.""" 47 | class Transition(nn.Module): 48 | def __init__(self, in_channels, out_channels): 49 | super().__init__() 50 | #"""The transition layers used in our experiments 51 | #consist of a batch normalization layer and an 1×1 52 | #convolutional layer followed by a 2×2 average pooling 53 | #layer""". 54 | self.down_sample = nn.Sequential( 55 | nn.BatchNorm2d(in_channels), 56 | nn.Conv2d(in_channels, out_channels, 1, bias=False), 57 | nn.AvgPool2d(2, stride=2) 58 | ) 59 | 60 | def forward(self, x): 61 | return self.down_sample(x) 62 | 63 | #DesneNet-BC 64 | #B stands for bottleneck layer(BN-RELU-CONV(1x1)-BN-RELU-CONV(3x3)) 65 | #C stands for compression factor(0<=theta<=1) 66 | class DenseNet(nn.Module): 67 | def __init__(self, block, nblocks, growth_rate=12, reduction=0.5, num_class=100): 68 | super().__init__() 69 | self.growth_rate = growth_rate 70 | 71 | #"""Before entering the first dense block, a convolution 72 | #with 16 (or twice the growth rate for DenseNet-BC) 73 | #output channels is performed on the input images.""" 74 | inner_channels = 2 * growth_rate 75 | 76 | #For convolutional layers with kernel size 3×3, each 77 | #side of the inputs is zero-padded by one pixel to keep 78 | #the feature-map size fixed. 79 | self.conv1 = nn.Conv2d(3, inner_channels, kernel_size=3, padding=1, bias=False) 80 | 81 | self.features = nn.Sequential() 82 | 83 | for index in range(len(nblocks) - 1): 84 | self.features.add_module("dense_block_layer_{}".format(index), self._make_dense_layers(block, inner_channels, nblocks[index])) 85 | inner_channels += growth_rate * nblocks[index] 86 | 87 | #"""If a dense block contains m feature-maps, we let the 88 | #following transition layer generate θm output feature- 89 | #maps, where 0 < θ ≤ 1 is referred to as the compression 90 | #fac-tor. 
91 | out_channels = int(reduction * inner_channels) # int() will automatic floor the value 92 | self.features.add_module("transition_layer_{}".format(index), Transition(inner_channels, out_channels)) 93 | inner_channels = out_channels 94 | 95 | self.features.add_module("dense_block{}".format(len(nblocks) - 1), self._make_dense_layers(block, inner_channels, nblocks[len(nblocks)-1])) 96 | inner_channels += growth_rate * nblocks[len(nblocks) - 1] 97 | self.features.add_module('bn', nn.BatchNorm2d(inner_channels)) 98 | self.features.add_module('relu', nn.ReLU(inplace=True)) 99 | 100 | self.avgpool = nn.AdaptiveAvgPool2d((1, 1)) 101 | 102 | self.linear = nn.Linear(inner_channels, num_class) 103 | 104 | def forward(self, x): 105 | output = self.conv1(x) 106 | output = self.features(output) 107 | output = self.avgpool(output) 108 | output = output.view(output.size()[0], -1) 109 | output = self.linear(output) 110 | return output 111 | 112 | def _make_dense_layers(self, block, in_channels, nblocks): 113 | dense_block = nn.Sequential() 114 | for index in range(nblocks): 115 | dense_block.add_module('bottle_neck_layer_{}'.format(index), block(in_channels, self.growth_rate)) 116 | in_channels += self.growth_rate 117 | return dense_block 118 | 119 | def densenet121(): 120 | return DenseNet(Bottleneck, [6,12,24,16], growth_rate=32) 121 | 122 | def densenet169(): 123 | return DenseNet(Bottleneck, [6,12,32,32], growth_rate=32) 124 | 125 | def densenet201(): 126 | return DenseNet(Bottleneck, [6,12,48,32], growth_rate=32) 127 | 128 | def densenet161(): 129 | return DenseNet(Bottleneck, [6,12,36,24], growth_rate=48) 130 | 131 | -------------------------------------------------------------------------------- /0-NICO/models/googlenet.py: -------------------------------------------------------------------------------- 1 | """google net in pytorch 2 | 3 | 4 | 5 | [1] Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, 6 | Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, Andrew Rabinovich. 
7 | 
8 |     Going Deeper with Convolutions
9 |     https://arxiv.org/abs/1409.4842v1
10 | """
11 | 
12 | import torch
13 | import torch.nn as nn
14 | 
15 | class Inception(nn.Module):
16 |     def __init__(self, input_channels, n1x1, n3x3_reduce, n3x3, n5x5_reduce, n5x5, pool_proj):
17 |         super().__init__()
18 | 
19 |         #1x1conv branch
20 |         self.b1 = nn.Sequential(
21 |             nn.Conv2d(input_channels, n1x1, kernel_size=1),
22 |             nn.BatchNorm2d(n1x1),
23 |             nn.ReLU(inplace=True)
24 |         )
25 | 
26 |         #1x1conv -> 3x3conv branch
27 |         self.b2 = nn.Sequential(
28 |             nn.Conv2d(input_channels, n3x3_reduce, kernel_size=1),
29 |             nn.BatchNorm2d(n3x3_reduce),
30 |             nn.ReLU(inplace=True),
31 |             nn.Conv2d(n3x3_reduce, n3x3, kernel_size=3, padding=1),
32 |             nn.BatchNorm2d(n3x3),
33 |             nn.ReLU(inplace=True)
34 |         )
35 | 
36 |         #1x1conv -> 5x5conv branch
37 |         #we use two stacked 3x3 conv filters instead
38 |         #of one 5x5 filter to obtain the same receptive
39 |         #field with fewer parameters
40 |         self.b3 = nn.Sequential(
41 |             nn.Conv2d(input_channels, n5x5_reduce, kernel_size=1),
42 |             nn.BatchNorm2d(n5x5_reduce),
43 |             nn.ReLU(inplace=True),
44 |             nn.Conv2d(n5x5_reduce, n5x5, kernel_size=3, padding=1),
45 |             nn.BatchNorm2d(n5x5),
46 |             nn.ReLU(inplace=True),
47 |             nn.Conv2d(n5x5, n5x5, kernel_size=3, padding=1),
48 |             nn.BatchNorm2d(n5x5),
49 |             nn.ReLU(inplace=True)
50 |         )
51 | 
52 |         #3x3pooling -> 1x1conv
53 |         #same conv
54 |         self.b4 = nn.Sequential(
55 |             nn.MaxPool2d(3, stride=1, padding=1),
56 |             nn.Conv2d(input_channels, pool_proj, kernel_size=1),
57 |             nn.BatchNorm2d(pool_proj),
58 |             nn.ReLU(inplace=True)
59 |         )
60 | 
61 |     def forward(self, x):
62 |         return torch.cat([self.b1(x), self.b2(x), self.b3(x), self.b4(x)], dim=1)
63 | 
64 | 
65 | class GoogleNet(nn.Module):
66 | 
67 |     def __init__(self, num_class=100):
68 |         super().__init__()
69 |         self.prelayer = nn.Sequential(
70 |             nn.Conv2d(3, 192, kernel_size=3, padding=1),
71 |             nn.BatchNorm2d(192),
72 |             nn.ReLU(inplace=True)
73 |         )
74 | 
75 |         #although we only use 1 conv layer as the prelayer,
76 |         #we still keep the paper's layer names a3, b3, ...
77 | self.a3 = Inception(192, 64, 96, 128, 16, 32, 32) 78 | self.b3 = Inception(256, 128, 128, 192, 32, 96, 64) 79 | 80 | #"""In general, an Inception network is a network consisting of 81 | #modules of the above type stacked upon each other, with occasional 82 | #max-pooling layers with stride 2 to halve the resolution of the 83 | #grid""" 84 | self.maxpool = nn.MaxPool2d(3, stride=2, padding=1) 85 | 86 | self.a4 = Inception(480, 192, 96, 208, 16, 48, 64) 87 | self.b4 = Inception(512, 160, 112, 224, 24, 64, 64) 88 | self.c4 = Inception(512, 128, 128, 256, 24, 64, 64) 89 | self.d4 = Inception(512, 112, 144, 288, 32, 64, 64) 90 | self.e4 = Inception(528, 256, 160, 320, 32, 128, 128) 91 | 92 | self.a5 = Inception(832, 256, 160, 320, 32, 128, 128) 93 | self.b5 = Inception(832, 384, 192, 384, 48, 128, 128) 94 | 95 | #input feature size: 8*8*1024 96 | self.avgpool = nn.AdaptiveAvgPool2d((1, 1)) 97 | self.dropout = nn.Dropout2d(p=0.4) 98 | self.linear = nn.Linear(1024, num_class) 99 | 100 | def forward(self, x): 101 | output = self.prelayer(x) 102 | output = self.a3(output) 103 | output = self.b3(output) 104 | 105 | output = self.maxpool(output) 106 | 107 | output = self.a4(output) 108 | output = self.b4(output) 109 | output = self.c4(output) 110 | output = self.d4(output) 111 | output = self.e4(output) 112 | 113 | output = self.maxpool(output) 114 | 115 | output = self.a5(output) 116 | output = self.b5(output) 117 | 118 | #"""It was found that a move from fully connected layers to 119 | #average pooling improved the top-1 accuracy by about 0.6%, 120 | #however the use of dropout remained essential even after 121 | #removing the fully connected layers.""" 122 | output = self.avgpool(output) 123 | output = self.dropout(output) 124 | output = output.view(output.size()[0], -1) 125 | output = self.linear(output) 126 | 127 | return output 128 | 129 | def googlenet(): 130 | return GoogleNet() 131 | 132 | 133 | -------------------------------------------------------------------------------- /0-NICO/models/mobilenet.py: -------------------------------------------------------------------------------- 1 | """mobilenet in pytorch 2 | 3 | 4 | 5 | [1] Andrew G. 
Howard, Menglong Zhu, Bo Chen, Dmitry Kalenichenko, Weijun Wang, Tobias Weyand, Marco Andreetto, Hartwig Adam 6 | 7 | MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications 8 | https://arxiv.org/abs/1704.04861 9 | """ 10 | 11 | import torch 12 | import torch.nn as nn 13 | 14 | 15 | class DepthSeperabelConv2d(nn.Module): 16 | 17 | def __init__(self, input_channels, output_channels, kernel_size, **kwargs): 18 | super().__init__() 19 | self.depthwise = nn.Sequential( 20 | nn.Conv2d( 21 | input_channels, 22 | input_channels, 23 | kernel_size, 24 | groups=input_channels, 25 | **kwargs), 26 | nn.BatchNorm2d(input_channels), 27 | nn.ReLU(inplace=True) 28 | ) 29 | 30 | self.pointwise = nn.Sequential( 31 | nn.Conv2d(input_channels, output_channels, 1), 32 | nn.BatchNorm2d(output_channels), 33 | nn.ReLU(inplace=True) 34 | ) 35 | 36 | def forward(self, x): 37 | x = self.depthwise(x) 38 | x = self.pointwise(x) 39 | 40 | return x 41 | 42 | 43 | class BasicConv2d(nn.Module): 44 | 45 | def __init__(self, input_channels, output_channels, kernel_size, **kwargs): 46 | 47 | super().__init__() 48 | self.conv = nn.Conv2d( 49 | input_channels, output_channels, kernel_size, **kwargs) 50 | self.bn = nn.BatchNorm2d(output_channels) 51 | self.relu = nn.ReLU(inplace=True) 52 | 53 | def forward(self, x): 54 | x = self.conv(x) 55 | x = self.bn(x) 56 | x = self.relu(x) 57 | 58 | return x 59 | 60 | 61 | class MobileNet(nn.Module): 62 | 63 | """ 64 | Args: 65 | width multipler: The role of the width multiplier α is to thin 66 | a network uniformly at each layer. For a given 67 | layer and width multiplier α, the number of 68 | input channels M becomes αM and the number of 69 | output channels N becomes αN. 70 | """ 71 | 72 | def __init__(self, width_multiplier=1, class_num=100): 73 | super().__init__() 74 | 75 | alpha = width_multiplier 76 | self.stem = nn.Sequential( 77 | BasicConv2d(3, int(32 * alpha), 3, padding=1, bias=False), 78 | DepthSeperabelConv2d( 79 | int(32 * alpha), 80 | int(64 * alpha), 81 | 3, 82 | padding=1, 83 | bias=False 84 | ) 85 | ) 86 | 87 | #downsample 88 | self.conv1 = nn.Sequential( 89 | DepthSeperabelConv2d( 90 | int(64 * alpha), 91 | int(128 * alpha), 92 | 3, 93 | stride=2, 94 | padding=1, 95 | bias=False 96 | ), 97 | DepthSeperabelConv2d( 98 | int(128 * alpha), 99 | int(128 * alpha), 100 | 3, 101 | padding=1, 102 | bias=False 103 | ) 104 | ) 105 | 106 | #downsample 107 | self.conv2 = nn.Sequential( 108 | DepthSeperabelConv2d( 109 | int(128 * alpha), 110 | int(256 * alpha), 111 | 3, 112 | stride=2, 113 | padding=1, 114 | bias=False 115 | ), 116 | DepthSeperabelConv2d( 117 | int(256 * alpha), 118 | int(256 * alpha), 119 | 3, 120 | padding=1, 121 | bias=False 122 | ) 123 | ) 124 | 125 | #downsample 126 | self.conv3 = nn.Sequential( 127 | DepthSeperabelConv2d( 128 | int(256 * alpha), 129 | int(512 * alpha), 130 | 3, 131 | stride=2, 132 | padding=1, 133 | bias=False 134 | ), 135 | 136 | DepthSeperabelConv2d( 137 | int(512 * alpha), 138 | int(512 * alpha), 139 | 3, 140 | padding=1, 141 | bias=False 142 | ), 143 | DepthSeperabelConv2d( 144 | int(512 * alpha), 145 | int(512 * alpha), 146 | 3, 147 | padding=1, 148 | bias=False 149 | ), 150 | DepthSeperabelConv2d( 151 | int(512 * alpha), 152 | int(512 * alpha), 153 | 3, 154 | padding=1, 155 | bias=False 156 | ), 157 | DepthSeperabelConv2d( 158 | int(512 * alpha), 159 | int(512 * alpha), 160 | 3, 161 | padding=1, 162 | bias=False 163 | ), 164 | DepthSeperabelConv2d( 165 | int(512 * alpha), 166 | int(512 * alpha), 167 | 3, 168 
| padding=1, 169 | bias=False 170 | ) 171 | ) 172 | 173 | #downsample 174 | self.conv4 = nn.Sequential( 175 | DepthSeperabelConv2d( 176 | int(512 * alpha), 177 | int(1024 * alpha), 178 | 3, 179 | stride=2, 180 | padding=1, 181 | bias=False 182 | ), 183 | DepthSeperabelConv2d( 184 | int(1024 * alpha), 185 | int(1024 * alpha), 186 | 3, 187 | padding=1, 188 | bias=False 189 | ) 190 | ) 191 | 192 | self.fc = nn.Linear(int(1024 * alpha), class_num) 193 | self.avg = nn.AdaptiveAvgPool2d(1) 194 | 195 | def forward(self, x): 196 | x = self.stem(x) 197 | 198 | x = self.conv1(x) 199 | x = self.conv2(x) 200 | x = self.conv3(x) 201 | x = self.conv4(x) 202 | 203 | x = self.avg(x) 204 | x = x.view(x.size(0), -1) 205 | x = self.fc(x) 206 | return x 207 | 208 | 209 | def mobilenet(alpha=1, class_num=100): 210 | return MobileNet(alpha, class_num) 211 | 212 | -------------------------------------------------------------------------------- /0-NICO/models/mobilenetv2.py: -------------------------------------------------------------------------------- 1 | """mobilenetv2 in pytorch 2 | 3 | 4 | 5 | [1] Mark Sandler, Andrew Howard, Menglong Zhu, Andrey Zhmoginov, Liang-Chieh Chen 6 | 7 | MobileNetV2: Inverted Residuals and Linear Bottlenecks 8 | https://arxiv.org/abs/1801.04381 9 | """ 10 | 11 | import torch 12 | import torch.nn as nn 13 | import torch.nn.functional as F 14 | 15 | 16 | class LinearBottleNeck(nn.Module): 17 | 18 | def __init__(self, in_channels, out_channels, stride, t=6, class_num=100): 19 | super().__init__() 20 | 21 | self.residual = nn.Sequential( 22 | nn.Conv2d(in_channels, in_channels * t, 1), 23 | nn.BatchNorm2d(in_channels * t), 24 | nn.ReLU6(inplace=True), 25 | 26 | nn.Conv2d(in_channels * t, in_channels * t, 3, stride=stride, padding=1, groups=in_channels * t), 27 | nn.BatchNorm2d(in_channels * t), 28 | nn.ReLU6(inplace=True), 29 | 30 | nn.Conv2d(in_channels * t, out_channels, 1), 31 | nn.BatchNorm2d(out_channels) 32 | ) 33 | 34 | self.stride = stride 35 | self.in_channels = in_channels 36 | self.out_channels = out_channels 37 | 38 | def forward(self, x): 39 | 40 | residual = self.residual(x) 41 | 42 | if self.stride == 1 and self.in_channels == self.out_channels: 43 | residual += x 44 | 45 | return residual 46 | 47 | class MobileNetV2(nn.Module): 48 | 49 | def __init__(self, class_num=100): 50 | super().__init__() 51 | 52 | self.pre = nn.Sequential( 53 | nn.Conv2d(3, 32, 1, padding=1), 54 | nn.BatchNorm2d(32), 55 | nn.ReLU6(inplace=True) 56 | ) 57 | 58 | self.stage1 = LinearBottleNeck(32, 16, 1, 1) 59 | self.stage2 = self._make_stage(2, 16, 24, 2, 6) 60 | self.stage3 = self._make_stage(3, 24, 32, 2, 6) 61 | self.stage4 = self._make_stage(4, 32, 64, 2, 6) 62 | self.stage5 = self._make_stage(3, 64, 96, 1, 6) 63 | self.stage6 = self._make_stage(3, 96, 160, 1, 6) 64 | self.stage7 = LinearBottleNeck(160, 320, 1, 6) 65 | 66 | self.conv1 = nn.Sequential( 67 | nn.Conv2d(320, 1280, 1), 68 | nn.BatchNorm2d(1280), 69 | nn.ReLU6(inplace=True) 70 | ) 71 | 72 | self.conv2 = nn.Conv2d(1280, class_num, 1) 73 | 74 | def forward(self, x): 75 | x = self.pre(x) 76 | x = self.stage1(x) 77 | x = self.stage2(x) 78 | x = self.stage3(x) 79 | x = self.stage4(x) 80 | x = self.stage5(x) 81 | x = self.stage6(x) 82 | x = self.stage7(x) 83 | x = self.conv1(x) 84 | x = F.adaptive_avg_pool2d(x, 1) 85 | x = self.conv2(x) 86 | x = x.view(x.size(0), -1) 87 | 88 | return x 89 | 90 | def _make_stage(self, repeat, in_channels, out_channels, stride, t): 91 | 92 | layers = [] 93 | layers.append(LinearBottleNeck(in_channels, 
out_channels, stride, t)) 94 | 95 | while repeat - 1: 96 | layers.append(LinearBottleNeck(out_channels, out_channels, 1, t)) 97 | repeat -= 1 98 | 99 | return nn.Sequential(*layers) 100 | 101 | def mobilenetv2(): 102 | return MobileNetV2() -------------------------------------------------------------------------------- /0-NICO/models/preactresnet.py: -------------------------------------------------------------------------------- 1 | """preactresnet in pytorch 2 | 3 | [1] Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun 4 | 5 | Identity Mappings in Deep Residual Networks 6 | https://arxiv.org/abs/1603.05027 7 | """ 8 | 9 | import torch 10 | import torch.nn as nn 11 | import torch.nn.functional as F 12 | 13 | class PreActBasic(nn.Module): 14 | 15 | expansion = 1 16 | def __init__(self, in_channels, out_channels, stride): 17 | super().__init__() 18 | self.residual = nn.Sequential( 19 | nn.BatchNorm2d(in_channels), 20 | nn.ReLU(inplace=True), 21 | nn.Conv2d(in_channels, out_channels, kernel_size=3, stride=stride, padding=1), 22 | nn.BatchNorm2d(out_channels), 23 | nn.ReLU(inplace=True), 24 | nn.Conv2d(out_channels, out_channels * PreActBasic.expansion, kernel_size=3, padding=1) 25 | ) 26 | 27 | self.shortcut = nn.Sequential() 28 | if stride != 1 or in_channels != out_channels * PreActBasic.expansion: 29 | self.shortcut = nn.Conv2d(in_channels, out_channels * PreActBasic.expansion, 1, stride=stride) 30 | 31 | def forward(self, x): 32 | 33 | res = self.residual(x) 34 | shortcut = self.shortcut(x) 35 | 36 | return res + shortcut 37 | 38 | 39 | class PreActBottleNeck(nn.Module): 40 | 41 | expansion = 4 42 | def __init__(self, in_channels, out_channels, stride): 43 | super().__init__() 44 | 45 | self.residual = nn.Sequential( 46 | nn.BatchNorm2d(in_channels), 47 | nn.ReLU(inplace=True), 48 | nn.Conv2d(in_channels, out_channels, 1, stride=stride), 49 | 50 | nn.BatchNorm2d(out_channels), 51 | nn.ReLU(inplace=True), 52 | nn.Conv2d(out_channels, out_channels, 3, padding=1), 53 | 54 | nn.BatchNorm2d(out_channels), 55 | nn.ReLU(inplace=True), 56 | nn.Conv2d(out_channels, out_channels * PreActBottleNeck.expansion, 1) 57 | ) 58 | 59 | self.shortcut = nn.Sequential() 60 | 61 | if stride != 1 or in_channels != out_channels * PreActBottleNeck.expansion: 62 | self.shortcut = nn.Conv2d(in_channels, out_channels * PreActBottleNeck.expansion, 1, stride=stride) 63 | 64 | def forward(self, x): 65 | 66 | res = self.residual(x) 67 | shortcut = self.shortcut(x) 68 | 69 | return res + shortcut 70 | 71 | class PreActResNet(nn.Module): 72 | 73 | def __init__(self, block, num_block, class_num=100): 74 | super().__init__() 75 | self.input_channels = 64 76 | 77 | self.pre = nn.Sequential( 78 | nn.Conv2d(3, 64, 3, padding=1), 79 | nn.BatchNorm2d(64), 80 | nn.ReLU(inplace=True) 81 | ) 82 | 83 | self.stage1 = self._make_layers(block, num_block[0], 64, 1) 84 | self.stage2 = self._make_layers(block, num_block[1], 128, 2) 85 | self.stage3 = self._make_layers(block, num_block[2], 256, 2) 86 | self.stage4 = self._make_layers(block, num_block[3], 512, 2) 87 | 88 | self.linear = nn.Linear(self.input_channels, class_num) 89 | 90 | def _make_layers(self, block, block_num, out_channels, stride): 91 | layers = [] 92 | 93 | layers.append(block(self.input_channels, out_channels, stride)) 94 | self.input_channels = out_channels * block.expansion 95 | 96 | while block_num - 1: 97 | layers.append(block(self.input_channels, out_channels, 1)) 98 | self.input_channels = out_channels * block.expansion 99 | block_num -= 1 100 | 101 | return 
nn.Sequential(*layers) 102 | 103 | def forward(self, x): 104 | x = self.pre(x) 105 | 106 | x = self.stage1(x) 107 | x = self.stage2(x) 108 | x = self.stage3(x) 109 | x = self.stage4(x) 110 | 111 | x = F.adaptive_avg_pool2d(x, 1) 112 | x = x.view(x.size(0), -1) 113 | x = self.linear(x) 114 | 115 | return x 116 | 117 | def preactresnet18(): 118 | return PreActResNet(PreActBasic, [2, 2, 2, 2]) 119 | 120 | def preactresnet34(): 121 | return PreActResNet(PreActBasic, [3, 4, 6, 3]) 122 | 123 | def preactresnet50(): 124 | return PreActResNet(PreActBottleNeck, [3, 4, 6, 3]) 125 | 126 | def preactresnet101(): 127 | return PreActResNet(PreActBottleNeck, [3, 4, 23, 3]) 128 | 129 | def preactresnet152(): 130 | return PreActResNet(PreActBottleNeck, [3, 8, 36, 3]) 131 | 132 | -------------------------------------------------------------------------------- /0-NICO/models/resnet.py: -------------------------------------------------------------------------------- 1 | """resnet in pytorch 2 | 3 | 4 | 5 | [1] Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun. 6 | 7 | Deep Residual Learning for Image Recognition 8 | https://arxiv.org/abs/1512.03385v1 9 | """ 10 | 11 | import torch 12 | import torch.nn as nn 13 | 14 | class BasicBlock(nn.Module): 15 | """Basic Block for resnet 18 and resnet 34 16 | 17 | """ 18 | 19 | #BasicBlock and BottleNeck block 20 | #have different output size 21 | #we use class attribute expansion 22 | #to distinct 23 | expansion = 1 24 | 25 | def __init__(self, in_channels, out_channels, stride=1): 26 | super().__init__() 27 | 28 | #residual function 29 | self.residual_function = nn.Sequential( 30 | nn.Conv2d(in_channels, out_channels, kernel_size=3, stride=stride, padding=1, bias=False), 31 | nn.BatchNorm2d(out_channels), 32 | nn.ReLU(inplace=True), 33 | nn.Conv2d(out_channels, out_channels * BasicBlock.expansion, kernel_size=3, padding=1, bias=False), 34 | nn.BatchNorm2d(out_channels * BasicBlock.expansion) 35 | ) 36 | 37 | #shortcut 38 | self.shortcut = nn.Sequential() 39 | 40 | #the shortcut output dimension is not the same with residual function 41 | #use 1*1 convolution to match the dimension 42 | if stride != 1 or in_channels != BasicBlock.expansion * out_channels: 43 | self.shortcut = nn.Sequential( 44 | nn.Conv2d(in_channels, out_channels * BasicBlock.expansion, kernel_size=1, stride=stride, bias=False), 45 | nn.BatchNorm2d(out_channels * BasicBlock.expansion) 46 | ) 47 | 48 | def forward(self, x): 49 | return nn.ReLU(inplace=True)(self.residual_function(x) + self.shortcut(x)) 50 | 51 | class BottleNeck(nn.Module): 52 | """Residual block for resnet over 50 layers 53 | 54 | """ 55 | expansion = 4 56 | def __init__(self, in_channels, out_channels, stride=1): 57 | super().__init__() 58 | self.residual_function = nn.Sequential( 59 | nn.Conv2d(in_channels, out_channels, kernel_size=1, bias=False), 60 | nn.BatchNorm2d(out_channels), 61 | nn.ReLU(inplace=True), 62 | nn.Conv2d(out_channels, out_channels, stride=stride, kernel_size=3, padding=1, bias=False), 63 | nn.BatchNorm2d(out_channels), 64 | nn.ReLU(inplace=True), 65 | nn.Conv2d(out_channels, out_channels * BottleNeck.expansion, kernel_size=1, bias=False), 66 | nn.BatchNorm2d(out_channels * BottleNeck.expansion), 67 | ) 68 | 69 | self.shortcut = nn.Sequential() 70 | 71 | if stride != 1 or in_channels != out_channels * BottleNeck.expansion: 72 | self.shortcut = nn.Sequential( 73 | nn.Conv2d(in_channels, out_channels * BottleNeck.expansion, stride=stride, kernel_size=1, bias=False), 74 | nn.BatchNorm2d(out_channels * 
77 |     def forward(self, x):
78 |         return nn.ReLU(inplace=True)(self.residual_function(x) + self.shortcut(x))
79 | 
80 | class ResNet(nn.Module):
81 | 
82 |     def __init__(self, block, num_block, num_classes=100):
83 |         super().__init__()
84 | 
85 |         self.in_channels = 64
86 | 
87 |         self.conv1 = nn.Sequential(
88 |             nn.Conv2d(3, 64, kernel_size=3, padding=1, bias=False),
89 |             nn.BatchNorm2d(64),
90 |             nn.ReLU(inplace=True))
91 |         #we use a different input size than the original paper
92 |         #so conv2_x's stride is 1
93 |         self.conv2_x = self._make_layer(block, 64, num_block[0], 1)
94 |         self.conv3_x = self._make_layer(block, 128, num_block[1], 2)
95 |         self.conv4_x = self._make_layer(block, 256, num_block[2], 2)
96 |         self.conv5_x = self._make_layer(block, 512, num_block[3], 2)
97 |         self.avg_pool = nn.AdaptiveAvgPool2d((1, 1))
98 |         self.fc = nn.Linear(512 * block.expansion, num_classes)
99 | 
100 |     def _make_layer(self, block, out_channels, num_blocks, stride):
101 |         """make a resnet layer (here 'layer' does not mean a single neural
102 |         network layer such as a conv layer); one layer may
103 |         contain more than one residual block
104 | 
105 |         Args:
106 |             block: block type, basic block or bottle neck block
107 |             out_channels: output depth channel number of this layer
108 |             num_blocks: how many blocks per layer
109 |             stride: the stride of the first block of this layer
110 | 
111 |         Return:
112 |             return a resnet layer
113 |         """
114 | 
115 |         # the stride of the first block of a layer can be 1 or 2;
116 |         # all the following blocks always use stride 1
117 |         strides = [stride] + [1] * (num_blocks - 1)
118 |         layers = []
119 |         for stride in strides:
120 |             layers.append(block(self.in_channels, out_channels, stride))
121 |             self.in_channels = out_channels * block.expansion
122 | 
123 |         return nn.Sequential(*layers)
124 | 
125 |     def forward(self, x):
126 |         output = self.conv1(x)
127 |         output = self.conv2_x(output)
128 |         output = self.conv3_x(output)
129 |         output = self.conv4_x(output)
130 |         output = self.conv5_x(output)
131 |         output = self.avg_pool(output)
132 |         output = output.view(output.size(0), -1)
133 |         output = self.fc(output)
134 | 
135 |         return output
136 | 
137 | def resnet18():
138 |     """ return a ResNet 18 object
139 |     """
140 |     return ResNet(BasicBlock, [2, 2, 2, 2])
141 | 
142 | def resnet34():
143 |     """ return a ResNet 34 object
144 |     """
145 |     return ResNet(BasicBlock, [3, 4, 6, 3])
146 | 
147 | def resnet50():
148 |     """ return a ResNet 50 object
149 |     """
150 |     return ResNet(BottleNeck, [3, 4, 6, 3])
151 | 
152 | def resnet101():
153 |     """ return a ResNet 101 object
154 |     """
155 |     return ResNet(BottleNeck, [3, 4, 23, 3])
156 | 
157 | def resnet152():
158 |     """ return a ResNet 152 object
159 |     """
160 |     return ResNet(BottleNeck, [3, 8, 36, 3])
161 | 
162 | 
163 | 
164 | 
--------------------------------------------------------------------------------
/0-NICO/models/resnet_nonlocal.py:
--------------------------------------------------------------------------------
1 | '''
2 | Non-Local ResNet2D-50 for CIFAR-10 dataset.
3 | Most of the code is borrowed from https://github.com/akamaster/pytorch_resnet_cifar10
4 | Properly implemented ResNet-s for CIFAR10 as described in paper [1].
5 | The implementation and structure of this file is hugely influenced by [2],
6 | which is implemented for ImageNet and doesn't have option A for identity.
7 | Moreover, most of the implementations on the web are copy-pasted from
8 | torchvision's resnet and have the wrong number of params.
9 | Reference: 10 | [1] Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun 11 | Deep Residual Learning for Image Recognition. arXiv:1512.03385 12 | [2] https://github.com/pytorch/vision/blob/master/torchvision/models/resnet.py 13 | ''' 14 | import torch 15 | import torch.nn as nn 16 | import torch.nn.functional as F 17 | import torch.nn.init as init 18 | 19 | from torch.autograd import Variable 20 | from models.non_local import NLBlockND 21 | 22 | 23 | def _weights_init(m): 24 | if isinstance(m, nn.Linear) or isinstance(m, nn.Conv2d): 25 | init.kaiming_normal_(m.weight) 26 | 27 | 28 | class LambdaLayer(nn.Module): 29 | def __init__(self, lambd): 30 | super(LambdaLayer, self).__init__() 31 | self.lambd = lambd 32 | 33 | def forward(self, x): 34 | return self.lambd(x) 35 | 36 | 37 | class BasicBlock(nn.Module): 38 | expansion = 1 39 | 40 | def __init__(self, in_planes, planes, stride=1, option='A'): 41 | super(BasicBlock, self).__init__() 42 | self.conv1 = nn.Conv2d(in_planes, planes, kernel_size=3, stride=stride, padding=1, bias=False) 43 | self.bn1 = nn.BatchNorm2d(planes) 44 | self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=1, padding=1, bias=False) 45 | self.bn2 = nn.BatchNorm2d(planes) 46 | 47 | self.shortcut = nn.Sequential() 48 | if stride != 1 or in_planes != planes: 49 | if option == 'A': 50 | """ 51 | For CIFAR10 ResNet paper uses option A. 52 | """ 53 | self.shortcut = LambdaLayer(lambda x: 54 | F.pad(x[:, :, ::2, ::2], (0, 0, 0, 0, planes // 4, planes // 4), "constant", 55 | 0)) 56 | elif option == 'B': 57 | self.shortcut = nn.Sequential( 58 | nn.Conv2d(in_planes, self.expansion * planes, kernel_size=1, stride=stride, bias=False), 59 | nn.BatchNorm2d(self.expansion * planes) 60 | ) 61 | 62 | def forward(self, x): 63 | out = F.relu(self.bn1(self.conv1(x))) 64 | out = self.bn2(self.conv2(out)) 65 | out += self.shortcut(x) 66 | out = F.relu(out) 67 | return out 68 | 69 | 70 | class ResNet2D(nn.Module): 71 | def __init__(self, block, num_blocks, num_classes=10, non_local=False): 72 | super(ResNet2D, self).__init__() 73 | self.in_planes = 16 74 | 75 | self.conv1 = nn.Conv2d(3, 16, kernel_size=3, stride=1, padding=1, bias=False) 76 | self.bn1 = nn.BatchNorm2d(16) 77 | self.layer1 = self._make_layer(block, 16, num_blocks[0], stride=1) 78 | 79 | # add non-local block after layer 2 80 | self.layer2 = self._make_layer(block, 32, num_blocks[1], stride=2, non_local=non_local) 81 | self.layer3 = self._make_layer(block, 64, num_blocks[2], stride=2) 82 | self.linear = nn.Linear(64, num_classes) 83 | 84 | self.apply(_weights_init) 85 | 86 | def _make_layer(self, block, planes, num_blocks, stride, non_local=False): 87 | strides = [stride] + [1] * (num_blocks - 1) 88 | layers = [] 89 | 90 | last_idx = len(strides) 91 | if non_local: 92 | last_idx = len(strides) - 1 93 | 94 | for i in range(last_idx): 95 | layers.append(block(self.in_planes, planes, strides[i])) 96 | self.in_planes = planes * block.expansion 97 | 98 | if non_local: 99 | layers.append(NLBlockND(in_channels=planes, dimension=2)) 100 | layers.append(block(self.in_planes, planes, strides[-1])) 101 | 102 | return nn.Sequential(*layers) 103 | 104 | def forward(self, x): 105 | out = F.relu(self.bn1(self.conv1(x))) 106 | out = self.layer1(out) 107 | out = self.layer2(out) 108 | out = self.layer3(out) 109 | out = F.avg_pool2d(out, out.size()[3]) 110 | out = out.view(out.size(0), -1) 111 | out = self.linear(out) 112 | return out 113 | 114 | 115 | def resnet2D56(non_local=False, **kwargs): 116 | """Constructs a ResNet-56 model. 
117 |     """
118 |     return ResNet2D(BasicBlock, [9, 9, 9], non_local=non_local, **kwargs)
--------------------------------------------------------------------------------
/0-NICO/models/resnext.py:
--------------------------------------------------------------------------------
1 | """resnext in pytorch
2 | 
3 | 
4 | 
5 | [1] Saining Xie, Ross Girshick, Piotr Dollár, Zhuowen Tu, Kaiming He.
6 | 
7 |     Aggregated Residual Transformations for Deep Neural Networks
8 |     https://arxiv.org/abs/1611.05431
9 | """
10 | 
11 | import math
12 | import torch
13 | import torch.nn as nn
14 | import torch.nn.functional as F
15 | 
16 | #only implements ResNext bottleneck c
17 | 
18 | 
19 | #"""This strategy exposes a new dimension, which we call “cardinality”
20 | #(the size of the set of transformations), as an essential factor
21 | #in addition to the dimensions of depth and width."""
22 | CARDINALITY = 32
23 | DEPTH = 4
24 | BASEWIDTH = 64
25 | 
26 | #"""The grouped convolutional layer in Fig. 3(c) performs 32 groups
27 | #of convolutions whose input and output channels are 4-dimensional.
28 | #The grouped convolutional layer concatenates them as the outputs
29 | #of the layer."""
30 | 
31 | class ResNextBottleNeckC(nn.Module):
32 | 
33 |     def __init__(self, in_channels, out_channels, stride):
34 |         super().__init__()
35 | 
36 |         C = CARDINALITY #how many groups a feature map is split into
37 | 
38 |         #"""We note that the input/output width of the template is fixed as
39 |         #256-d (Fig. 3), and all widths are doubled each time
40 |         #when the feature map is subsampled (see Table 1)."""
41 | 
42 |         D = int(DEPTH * out_channels / BASEWIDTH) #number of channels per group
43 |         self.split_transforms = nn.Sequential(
44 |             nn.Conv2d(in_channels, C * D, kernel_size=1, groups=C, bias=False),
45 |             nn.BatchNorm2d(C * D),
46 |             nn.ReLU(inplace=True),
47 |             nn.Conv2d(C * D, C * D, kernel_size=3, stride=stride, groups=C, padding=1, bias=False),
48 |             nn.BatchNorm2d(C * D),
49 |             nn.ReLU(inplace=True),
50 |             nn.Conv2d(C * D, out_channels * 4, kernel_size=1, bias=False),
51 |             nn.BatchNorm2d(out_channels * 4),
52 |         )
53 | 
54 |         self.shortcut = nn.Sequential()
55 | 
56 |         if stride != 1 or in_channels != out_channels * 4:
57 |             self.shortcut = nn.Sequential(
58 |                 nn.Conv2d(in_channels, out_channels * 4, stride=stride, kernel_size=1, bias=False),
59 |                 nn.BatchNorm2d(out_channels * 4)
60 |             )
61 | 
62 |     def forward(self, x):
63 |         return F.relu(self.split_transforms(x) + self.shortcut(x))
64 | 
65 | class ResNext(nn.Module):
66 | 
67 |     def __init__(self, block, num_blocks, class_names=100):
68 |         super().__init__()
69 |         self.in_channels = 64
70 | 
71 |         self.conv1 = nn.Sequential(
72 |             nn.Conv2d(3, 64, 3, stride=1, padding=1, bias=False),
73 |             nn.BatchNorm2d(64),
74 |             nn.ReLU(inplace=True)
75 |         )
76 | 
77 |         self.conv2 = self._make_layer(block, num_blocks[0], 64, 1)
78 |         self.conv3 = self._make_layer(block, num_blocks[1], 128, 2)
79 |         self.conv4 = self._make_layer(block, num_blocks[2], 256, 2)
80 |         self.conv5 = self._make_layer(block, num_blocks[3], 512, 2)
81 |         self.avg = nn.AdaptiveAvgPool2d((1, 1))
82 |         self.fc = nn.Linear(512 * 4, class_names)
83 | 
84 |     def forward(self, x):
85 |         x = self.conv1(x)
86 |         x = self.conv2(x)
87 |         x = self.conv3(x)
88 |         x = self.conv4(x)
89 |         x = self.conv5(x)
90 |         x = self.avg(x)
91 |         x = x.view(x.size(0), -1)
92 |         x = self.fc(x)
93 |         return x
94 | 
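# A standalone sketch (assuming only torch) of the grouped convolution that
# implements cardinality in ResNextBottleNeckC above: with C=32 groups of D=4
# channels each, one grouped 3x3 conv acts as 32 parallel 4-channel transforms
# concatenated along the channel axis, with C times fewer weights than a dense conv:
#
#     import torch
#     import torch.nn as nn
#     C, D = 32, 4
#     grouped = nn.Conv2d(C * D, C * D, kernel_size=3, groups=C, padding=1, bias=False)
#     dense = nn.Conv2d(C * D, C * D, kernel_size=3, padding=1, bias=False)
#     x = torch.randn(2, C * D, 8, 8)
#     assert grouped(x).shape == dense(x).shape
#     assert dense.weight.numel() // grouped.weight.numel() == C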
95 |     def _make_layer(self, block, num_block, out_channels, stride):
96 |         """Build a resnext layer
97 |         Args:
98 |             block: block type (default resnext bottleneck c)
99 |             num_block: number of blocks per layer
100 |             out_channels: output channels per block
101 |             stride: block stride
102 | 
103 |         Returns:
104 |             a resnext layer
105 |         """
106 |         strides = [stride] + [1] * (num_block - 1)
107 |         layers = []
108 |         for stride in strides:
109 |             layers.append(block(self.in_channels, out_channels, stride))
110 |             self.in_channels = out_channels * 4
111 | 
112 |         return nn.Sequential(*layers)
113 | 
114 | def resnext50():
115 |     """ return a resnext50(c32x4d) network
116 |     """
117 |     return ResNext(ResNextBottleNeckC, [3, 4, 6, 3])
118 | 
119 | def resnext101():
120 |     """ return a resnext101(c32x4d) network
121 |     """
122 |     return ResNext(ResNextBottleNeckC, [3, 4, 23, 3])
123 | 
124 | def resnext152():
125 |     """ return a resnext152(c32x4d) network
126 |     """
127 |     return ResNext(ResNextBottleNeckC, [3, 8, 36, 3])
128 | 
129 | 
130 | 
131 | 
--------------------------------------------------------------------------------
/0-NICO/models/senet.py:
--------------------------------------------------------------------------------
1 | """senet in pytorch
2 | 
3 | 
4 | 
5 | [1] Jie Hu, Li Shen, Samuel Albanie, Gang Sun, Enhua Wu
6 | 
7 |     Squeeze-and-Excitation Networks
8 |     https://arxiv.org/abs/1709.01507
9 | """
10 | 
11 | import torch
12 | import torch.nn as nn
13 | import torch.nn.functional as F
14 | 
15 | class BasicResidualSEBlock(nn.Module):
16 | 
17 |     expansion = 1
18 | 
19 |     def __init__(self, in_channels, out_channels, stride, r=16):
20 |         super().__init__()
21 | 
22 |         self.residual = nn.Sequential(
23 |             nn.Conv2d(in_channels, out_channels, 3, stride=stride, padding=1),
24 |             nn.BatchNorm2d(out_channels),
25 |             nn.ReLU(inplace=True),
26 | 
27 |             nn.Conv2d(out_channels, out_channels * self.expansion, 3, padding=1),
28 |             nn.BatchNorm2d(out_channels * self.expansion),
29 |             nn.ReLU(inplace=True)
30 |         )
31 | 
32 |         self.shortcut = nn.Sequential()
33 |         if stride != 1 or in_channels != out_channels * self.expansion:
34 |             self.shortcut = nn.Sequential(
35 |                 nn.Conv2d(in_channels, out_channels * self.expansion, 1, stride=stride),
36 |                 nn.BatchNorm2d(out_channels * self.expansion)
37 |             )
38 | 
39 |         self.squeeze = nn.AdaptiveAvgPool2d(1)
40 |         self.excitation = nn.Sequential(
41 |             nn.Linear(out_channels * self.expansion, out_channels * self.expansion // r),
42 |             nn.ReLU(inplace=True),
43 |             nn.Linear(out_channels * self.expansion // r, out_channels * self.expansion),
44 |             nn.Sigmoid()
45 |         )
46 | 
47 |     def forward(self, x):
48 |         shortcut = self.shortcut(x)
49 |         residual = self.residual(x)
50 | 
51 |         squeeze = self.squeeze(residual)
52 |         squeeze = squeeze.view(squeeze.size(0), -1)
53 |         excitation = self.excitation(squeeze)
54 |         excitation = excitation.view(residual.size(0), residual.size(1), 1, 1)
55 | 
56 |         x = residual * excitation.expand_as(residual) + shortcut
57 | 
58 |         return F.relu(x)
59 | 
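# A standalone numerical sketch (assuming only torch) of the squeeze-and-excitation
# step shared by both SE blocks in this file: global-average-pool each channel to a
# scalar, map it through the bottleneck MLP to a per-channel weight in (0, 1), then
# rescale the feature map channel-wise:
#
#     import torch
#     import torch.nn as nn
#     c, r = 64, 16
#     feat = torch.randn(2, c, 8, 8)
#     squeeze = nn.AdaptiveAvgPool2d(1)(feat).view(2, c)                   # (2, 64)
#     excitation = nn.Sequential(
#         nn.Linear(c, c // r), nn.ReLU(), nn.Linear(c // r, c), nn.Sigmoid()
#     )(squeeze).view(2, c, 1, 1)                                          # (2, 64, 1, 1)
#     recalibrated = feat * excitation.expand_as(feat)
#     assert recalibrated.shape == feat.shape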
60 | class BottleneckResidualSEBlock(nn.Module):
61 | 
62 |     expansion = 4
63 | 
64 |     def __init__(self, in_channels, out_channels, stride, r=16):
65 |         super().__init__()
66 | 
67 |         self.residual = nn.Sequential(
68 |             nn.Conv2d(in_channels, out_channels, 1),
69 |             nn.BatchNorm2d(out_channels),
70 |             nn.ReLU(inplace=True),
71 | 
72 |             nn.Conv2d(out_channels, out_channels, 3, stride=stride, padding=1),
73 |             nn.BatchNorm2d(out_channels),
74 |             nn.ReLU(inplace=True),
75 | 
76 |             nn.Conv2d(out_channels, out_channels * self.expansion, 1),
77 |             nn.BatchNorm2d(out_channels * self.expansion),
78 |             nn.ReLU(inplace=True)
79 |         )
80 | 
81 |         self.squeeze = nn.AdaptiveAvgPool2d(1)
82 |         self.excitation = nn.Sequential(
83 |             nn.Linear(out_channels * self.expansion, out_channels * self.expansion // r),
84 |             nn.ReLU(inplace=True),
85 |             nn.Linear(out_channels * self.expansion // r, out_channels * self.expansion),
86 |             nn.Sigmoid()
87 |         )
88 | 
89 |         self.shortcut = nn.Sequential()
90 |         if stride != 1 or in_channels != out_channels * self.expansion:
91 |             self.shortcut = nn.Sequential(
92 |                 nn.Conv2d(in_channels, out_channels * self.expansion, 1, stride=stride),
93 |                 nn.BatchNorm2d(out_channels * self.expansion)
94 |             )
95 | 
96 |     def forward(self, x):
97 | 
98 |         shortcut = self.shortcut(x)
99 | 
100 |         residual = self.residual(x)
101 |         squeeze = self.squeeze(residual)
102 |         squeeze = squeeze.view(squeeze.size(0), -1)
103 |         excitation = self.excitation(squeeze)
104 |         excitation = excitation.view(residual.size(0), residual.size(1), 1, 1)
105 | 
106 |         x = residual * excitation.expand_as(residual) + shortcut
107 | 
108 |         return F.relu(x)
109 | 
110 | class SEResNet(nn.Module):
111 | 
112 |     def __init__(self, block, block_num, class_num=100):
113 |         super().__init__()
114 | 
115 |         self.in_channels = 64
116 | 
117 |         self.pre = nn.Sequential(
118 |             nn.Conv2d(3, 64, 3, padding=1),
119 |             nn.BatchNorm2d(64),
120 |             nn.ReLU(inplace=True)
121 |         )
122 | 
123 |         self.stage1 = self._make_stage(block, block_num[0], 64, 1)
124 |         self.stage2 = self._make_stage(block, block_num[1], 128, 2)
125 |         self.stage3 = self._make_stage(block, block_num[2], 256, 2)
126 |         self.stage4 = self._make_stage(block, block_num[3], 512, 2)
127 | 
128 |         self.linear = nn.Linear(self.in_channels, class_num)
129 | 
130 |     def forward(self, x):
131 |         x = self.pre(x)
132 | 
133 |         x = self.stage1(x)
134 |         x = self.stage2(x)
135 |         x = self.stage3(x)
136 |         x = self.stage4(x)
137 | 
138 |         x = F.adaptive_avg_pool2d(x, 1)
139 |         x = x.view(x.size(0), -1)
140 | 
141 |         x = self.linear(x)
142 | 
143 |         return x
144 | 
145 | 
146 |     def _make_stage(self, block, num, out_channels, stride):
147 | 
148 |         layers = []
149 |         layers.append(block(self.in_channels, out_channels, stride))
150 |         self.in_channels = out_channels * block.expansion
151 | 
152 |         while num - 1:
153 |             layers.append(block(self.in_channels, out_channels, 1))
154 |             num -= 1
155 | 
156 |         return nn.Sequential(*layers)
157 | 
158 | def seresnet18():
159 |     return SEResNet(BasicResidualSEBlock, [2, 2, 2, 2])
160 | 
161 | def seresnet34():
162 |     return SEResNet(BasicResidualSEBlock, [3, 4, 6, 3])
163 | 
164 | def seresnet50():
165 |     return SEResNet(BottleneckResidualSEBlock, [3, 4, 6, 3])
166 | 
167 | def seresnet101():
168 |     return SEResNet(BottleneckResidualSEBlock, [3, 4, 23, 3])
169 | 
170 | def seresnet152():
171 |     return SEResNet(BottleneckResidualSEBlock, [3, 8, 36, 3])
172 | 
--------------------------------------------------------------------------------
/0-NICO/models/shufflenetv2.py:
--------------------------------------------------------------------------------
1 | """shufflenetv2 in pytorch
2 | 
3 | 
4 | 
5 | [1] Ningning Ma, Xiangyu Zhang, Hai-Tao Zheng, Jian Sun
6 | 
7 |     ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design
8 |     https://arxiv.org/abs/1807.11164
9 | """
10 | 
11 | import torch
12 | import torch.nn as nn
13 | import torch.nn.functional as F
14 | 
15 | 
16 | def channel_split(x, split):
17 |     """split a tensor into two pieces along the channel dimension
18 |     Args:
19 |         x: input tensor
20 |         split: (int) channel size of each piece
21 |     """
22 |     assert x.size(1) == split * 2
23 |     return torch.split(x, split, dim=1)
24 | 
25 | def channel_shuffle(x, groups):
26 |     """channel shuffle operation
27 |     Args:
28 |         x: input tensor
29 |         groups: input branch number
30 |     """
31 | 
32 |     batch_size, channels, height, width = x.size()
33 |     channels_per_group = int(channels / groups)
34 | 
35 |     x = x.view(batch_size, groups, channels_per_group, height, width)
36 |     x = x.transpose(1, 2).contiguous()
37 |     x = x.view(batch_size, -1, height, width)
38 | 
39 |     return x
40 | 
41 | class ShuffleUnit(nn.Module):
42 | 
43 |     def __init__(self, in_channels, out_channels, stride):
44 |         super().__init__()
45 | 
46 |         self.stride = stride
47 |         self.in_channels = in_channels
48 |         self.out_channels = out_channels
49 | 
50 |         if stride != 1 or in_channels != out_channels:
51 |             self.residual = nn.Sequential(
52 |                 nn.Conv2d(in_channels, in_channels, 1),
53 |                 nn.BatchNorm2d(in_channels),
54 |                 nn.ReLU(inplace=True),
55 |                 nn.Conv2d(in_channels, in_channels, 3, stride=stride, padding=1, groups=in_channels),
56 |                 nn.BatchNorm2d(in_channels),
57 |                 nn.Conv2d(in_channels, int(out_channels / 2), 1),
58 |                 nn.BatchNorm2d(int(out_channels / 2)),
59 |                 nn.ReLU(inplace=True)
60 |             )
61 | 
62 |             self.shortcut = nn.Sequential(
63 |                 nn.Conv2d(in_channels, in_channels, 3, stride=stride, padding=1, groups=in_channels),
64 |                 nn.BatchNorm2d(in_channels),
65 |                 nn.Conv2d(in_channels, int(out_channels / 2), 1),
66 |                 nn.BatchNorm2d(int(out_channels / 2)),
67 |                 nn.ReLU(inplace=True)
68 |             )
69 |         else:
70 |             self.shortcut = nn.Sequential()
71 | 
72 |             in_channels = int(in_channels / 2)
73 |             self.residual = nn.Sequential(
74 |                 nn.Conv2d(in_channels, in_channels, 1),
75 |                 nn.BatchNorm2d(in_channels),
76 |                 nn.ReLU(inplace=True),
77 |                 nn.Conv2d(in_channels, in_channels, 3, stride=stride, padding=1, groups=in_channels),
78 |                 nn.BatchNorm2d(in_channels),
79 |                 nn.Conv2d(in_channels, in_channels, 1),
80 |                 nn.BatchNorm2d(in_channels),
81 |                 nn.ReLU(inplace=True)
82 |             )
83 | 
84 | 
85 |     def forward(self, x):
86 | 
87 |         if self.stride == 1 and self.out_channels == self.in_channels:
88 |             shortcut, residual = channel_split(x, int(self.in_channels / 2))
89 |         else:
90 |             shortcut = x
91 |             residual = x
92 | 
93 |         shortcut = self.shortcut(shortcut)
94 |         residual = self.residual(residual)
95 |         x = torch.cat([shortcut, residual], dim=1)
96 |         x = channel_shuffle(x, 2)
97 | 
98 |         return x
99 | 
100 | class ShuffleNetV2(nn.Module):
101 | 
102 |     def __init__(self, ratio=1, class_num=100):
103 |         super().__init__()
104 |         if ratio == 0.5:
105 |             out_channels = [48, 96, 192, 1024]
106 |         elif ratio == 1:
107 |             out_channels = [116, 232, 464, 1024]
108 |         elif ratio == 1.5:
109 |             out_channels = [176, 352, 704, 1024]
110 |         elif ratio == 2:
111 |             out_channels = [244, 488, 976, 2048]
112 |         else:
113 |             raise ValueError('unsupported ratio number')
114 | 
115 |         self.pre = nn.Sequential(
116 |             nn.Conv2d(3, 24, 3, padding=1),
117 |             nn.BatchNorm2d(24)
118 |         )
119 | 
120 |         self.stage2 = self._make_stage(24, out_channels[0], 3)
121 |         self.stage3 = self._make_stage(out_channels[0], out_channels[1], 7)
122 |         self.stage4 = self._make_stage(out_channels[1], out_channels[2], 3)
123 |         self.conv5 = nn.Sequential(
124 |             nn.Conv2d(out_channels[2], out_channels[3], 1),
125 |             nn.BatchNorm2d(out_channels[3]),
126 |             nn.ReLU(inplace=True)
127 |         )
128 | 
129 |         self.fc = nn.Linear(out_channels[3], class_num)
130 | 
131 |     def forward(self, x):
132 |         x = self.pre(x)
133 |         x = self.stage2(x)
134 |         x = self.stage3(x)
135 |         x = self.stage4(x)
136 |         x = self.conv5(x)
137 |         x = F.adaptive_avg_pool2d(x, 1)
138 |         x = x.view(x.size(0), -1)
139 |         x = self.fc(x)
140 | 
141 |         return x
142 | 
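# A worked example (assuming only torch) of the channel_shuffle defined at the top
# of this file: with 6 channels and 2 groups, the channel order [0 1 2 | 3 4 5]
# becomes [0 3 1 4 2 5], interleaving the two branches after concatenation:
#
#     import torch
#     x = torch.arange(6).view(1, 6, 1, 1).float()
#     b, c, h, w = x.size()
#     y = x.view(b, 2, c // 2, h, w).transpose(1, 2).contiguous().view(b, -1, h, w)
#     assert y.flatten().tolist() == [0., 3., 1., 4., 2., 5.]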
143 |     def _make_stage(self, in_channels, out_channels, repeat):
144 |         layers = []
145 |         layers.append(ShuffleUnit(in_channels, out_channels, 2))
146 | 
147 |         while repeat:
148 |             layers.append(ShuffleUnit(out_channels, out_channels, 1))
149 |             repeat -= 1
150 | 
151 |         return nn.Sequential(*layers)
152 | 
153 | def shufflenetv2():
154 |     return ShuffleNetV2()
155 | 
156 | 
157 | 
158 | 
159 | 
160 | 
--------------------------------------------------------------------------------
/0-NICO/models/squeezenet.py:
--------------------------------------------------------------------------------
1 | """squeezenet in pytorch
2 | 
3 | 
4 | 
5 | [1] Forrest N. Iandola, Song Han, Matthew W. Moskewicz, Khalid Ashraf, William J. Dally, Kurt Keutzer
6 | 
7 |     SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size
8 |     https://arxiv.org/abs/1602.07360
9 | """
10 | 
11 | import torch
12 | import torch.nn as nn
13 | 
14 | 
15 | class Fire(nn.Module):
16 | 
17 |     def __init__(self, in_channel, out_channel, squeeze_channel):
18 | 
19 |         super().__init__()
20 |         self.squeeze = nn.Sequential(
21 |             nn.Conv2d(in_channel, squeeze_channel, 1),
22 |             nn.BatchNorm2d(squeeze_channel),
23 |             nn.ReLU(inplace=True)
24 |         )
25 | 
26 |         self.expand_1x1 = nn.Sequential(
27 |             nn.Conv2d(squeeze_channel, int(out_channel / 2), 1),
28 |             nn.BatchNorm2d(int(out_channel / 2)),
29 |             nn.ReLU(inplace=True)
30 |         )
31 | 
32 |         self.expand_3x3 = nn.Sequential(
33 |             nn.Conv2d(squeeze_channel, int(out_channel / 2), 3, padding=1),
34 |             nn.BatchNorm2d(int(out_channel / 2)),
35 |             nn.ReLU(inplace=True)
36 |         )
37 | 
38 |     def forward(self, x):
39 | 
40 |         x = self.squeeze(x)
41 |         x = torch.cat([
42 |             self.expand_1x1(x),
43 |             self.expand_3x3(x)
44 |         ], 1)
45 | 
46 |         return x
47 | 
48 | class SqueezeNet(nn.Module):
49 | 
50 |     """squeezenet with simple bypass"""
51 |     def __init__(self, class_num=100):
52 | 
53 |         super().__init__()
54 |         self.stem = nn.Sequential(
55 |             nn.Conv2d(3, 96, 3, padding=1),
56 |             nn.BatchNorm2d(96),
57 |             nn.ReLU(inplace=True),
58 |             nn.MaxPool2d(2, 2)
59 |         )
60 | 
61 |         self.fire2 = Fire(96, 128, 16)
62 |         self.fire3 = Fire(128, 128, 16)
63 |         self.fire4 = Fire(128, 256, 32)
64 |         self.fire5 = Fire(256, 256, 32)
65 |         self.fire6 = Fire(256, 384, 48)
66 |         self.fire7 = Fire(384, 384, 48)
67 |         self.fire8 = Fire(384, 512, 64)
68 |         self.fire9 = Fire(512, 512, 64)
69 | 
70 |         self.conv10 = nn.Conv2d(512, class_num, 1)
71 |         self.avg = nn.AdaptiveAvgPool2d(1)
72 |         self.maxpool = nn.MaxPool2d(2, 2)
73 | 
74 |     def forward(self, x):
75 |         x = self.stem(x)
76 | 
77 |         f2 = self.fire2(x)
78 |         f3 = self.fire3(f2) + f2
79 |         f4 = self.fire4(f3)
80 |         f4 = self.maxpool(f4)
81 | 
82 |         f5 = self.fire5(f4) + f4
83 |         f6 = self.fire6(f5)
84 |         f7 = self.fire7(f6) + f6
85 |         f8 = self.fire8(f7)
86 |         f8 = self.maxpool(f8)
87 | 
88 |         f9 = self.fire9(f8)
89 |         c10 = self.conv10(f9)
90 | 
91 |         x = self.avg(c10)
92 |         x = x.view(x.size(0), -1)
93 | 
94 |         return x
95 | 
96 | def squeezenet(class_num=100):
97 |     return SqueezeNet(class_num=class_num)
98 | 
--------------------------------------------------------------------------------
/0-NICO/models/token_performer.py:
--------------------------------------------------------------------------------
1 | """
2 | Take Performer as T2T Transformer
3 | """
4 | import math
5 | import torch
6 | import torch.nn as nn
7 | 
8 | class Token_performer(nn.Module):
9 |     def __init__(self, dim, in_dim, head_cnt=1, kernel_ratio=0.5, dp1=0.1, dp2 = 0.1):
10 |         super().__init__()
11 |         self.emb = in_dim * head_cnt # head_cnt is 1 here, so no per-head bookkeeping is needed
12 |         self.kqv = nn.Linear(dim, 3 * self.emb)
13 |         self.dp = nn.Dropout(dp1)
14 |         self.proj = 
nn.Linear(self.emb, self.emb) 15 | self.head_cnt = head_cnt 16 | self.norm1 = nn.LayerNorm(dim) 17 | self.norm2 = nn.LayerNorm(self.emb) 18 | self.epsilon = 1e-8 # for stable in division 19 | 20 | self.mlp = nn.Sequential( 21 | nn.Linear(self.emb, 1 * self.emb), 22 | nn.GELU(), 23 | nn.Linear(1 * self.emb, self.emb), 24 | nn.Dropout(dp2), 25 | ) 26 | 27 | self.m = int(self.emb * kernel_ratio) 28 | self.w = torch.randn(self.m, self.emb) 29 | self.w = nn.Parameter(nn.init.orthogonal_(self.w) * math.sqrt(self.m), requires_grad=False) 30 | 31 | def prm_exp(self, x): 32 | # part of the function is borrow from https://github.com/lucidrains/performer-pytorch 33 | # and Simo Ryu (https://github.com/cloneofsimo) 34 | # ==== positive random features for gaussian kernels ==== 35 | # x = (B, T, hs) 36 | # w = (m, hs) 37 | # return : x : B, T, m 38 | # SM(x, y) = E_w[exp(w^T x - |x|/2) exp(w^T y - |y|/2)] 39 | # therefore return exp(w^Tx - |x|/2)/sqrt(m) 40 | xd = ((x * x).sum(dim=-1, keepdim=True)).repeat(1, 1, self.m) / 2 41 | wtx = torch.einsum('bti,mi->btm', x.float(), self.w) 42 | 43 | return torch.exp(wtx - xd) / math.sqrt(self.m) 44 | 45 | def single_attn(self, x): 46 | k, q, v = torch.split(self.kqv(x), self.emb, dim=-1) 47 | kp, qp = self.prm_exp(k), self.prm_exp(q) # (B, T, m), (B, T, m) 48 | D = torch.einsum('bti,bi->bt', qp, kp.sum(dim=1)).unsqueeze(dim=2) # (B, T, m) * (B, m) -> (B, T, 1) 49 | kptv = torch.einsum('bin,bim->bnm', v.float(), kp) # (B, emb, m) 50 | y = torch.einsum('bti,bni->btn', qp, kptv) / (D.repeat(1, 1, self.emb) + self.epsilon) # (B, T, emb)/Diag 51 | # skip connection 52 | y = v + self.dp(self.proj(y)) # same as token_transformer in T2T layer, use v as skip connection 53 | 54 | return y 55 | 56 | def forward(self, x): 57 | x = self.single_attn(self.norm1(x)) 58 | x = x + self.mlp(self.norm2(x)) 59 | return x -------------------------------------------------------------------------------- /0-NICO/models/token_transformer.py: -------------------------------------------------------------------------------- 1 | # Copyright (c) [2012]-[2021] Shanghai Yitu Technology Co., Ltd. 2 | # 3 | # This source code is licensed under the Clear BSD License 4 | # LICENSE file in the root directory of this file 5 | # All rights reserved. 
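# A standalone sketch (assuming only torch; sizes are illustrative) of the
# positive-random-feature map used in Token_performer.prm_exp above:
# phi(x) = exp(w @ x - |x|^2 / 2) / sqrt(m) gives an unbiased estimate of the
# softmax kernel, so phi(q) . phi(k) ~ exp(q @ k) for large m:
#
#     import math, torch
#     d, m = 16, 4096
#     w = torch.randn(m, d)                        # random Gaussian projections
#     q, k = 0.2 * torch.randn(d), 0.2 * torch.randn(d)
#     phi = lambda x: torch.exp(w @ x - x.dot(x) / 2) / math.sqrt(m)
#     approx = (phi(q) * phi(k)).sum()
#     exact = torch.exp(q.dot(k))
#     assert torch.allclose(approx, exact, rtol=0.2)   # loose tolerance: Monte-Carlo estimate
#
# (The class above additionally orthogonalizes and rescales w, a variance-reduction
# trick from the Performer paper.)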
6 | """ 7 | Take the standard Transformer as T2T Transformer 8 | """ 9 | import torch.nn as nn 10 | from timm.models.layers import DropPath 11 | from .transformer_block import Mlp 12 | 13 | class Attention(nn.Module): 14 | def __init__(self, dim, num_heads=8, in_dim = None, qkv_bias=False, qk_scale=None, attn_drop=0., proj_drop=0.): 15 | super().__init__() 16 | self.num_heads = num_heads 17 | self.in_dim = in_dim 18 | head_dim = dim // num_heads 19 | self.scale = qk_scale or head_dim ** -0.5 20 | 21 | self.qkv = nn.Linear(dim, in_dim * 3, bias=qkv_bias) 22 | self.attn_drop = nn.Dropout(attn_drop) 23 | self.proj = nn.Linear(in_dim, in_dim) 24 | self.proj_drop = nn.Dropout(proj_drop) 25 | 26 | def forward(self, x): 27 | B, N, C = x.shape 28 | 29 | qkv = self.qkv(x).reshape(B, N, 3, self.num_heads, self.in_dim).permute(2, 0, 3, 1, 4) 30 | q, k, v = qkv[0], qkv[1], qkv[2] 31 | 32 | attn = (q @ k.transpose(-2, -1)) * self.scale 33 | attn = attn.softmax(dim=-1) 34 | attn = self.attn_drop(attn) 35 | 36 | x = (attn @ v).transpose(1, 2).reshape(B, N, self.in_dim) 37 | x = self.proj(x) 38 | x = self.proj_drop(x) 39 | 40 | # skip connection 41 | x = v.squeeze(1) + x # because the original x has different size with current x, use v to do skip connection 42 | 43 | return x 44 | 45 | class Token_transformer(nn.Module): 46 | 47 | def __init__(self, dim, in_dim, num_heads, mlp_ratio=1., qkv_bias=False, qk_scale=None, drop=0., attn_drop=0., 48 | drop_path=0., act_layer=nn.GELU, norm_layer=nn.LayerNorm): 49 | super().__init__() 50 | self.norm1 = norm_layer(dim) 51 | self.attn = Attention( 52 | dim, in_dim=in_dim, num_heads=num_heads, qkv_bias=qkv_bias, qk_scale=qk_scale, attn_drop=attn_drop, proj_drop=drop) 53 | self.drop_path = DropPath(drop_path) if drop_path > 0. else nn.Identity() 54 | self.norm2 = norm_layer(in_dim) 55 | self.mlp = Mlp(in_features=in_dim, hidden_features=int(in_dim*mlp_ratio), out_features=in_dim, act_layer=act_layer, drop=drop) 56 | 57 | def forward(self, x): 58 | x = self.attn(self.norm1(x)) 59 | x = x + self.drop_path(self.mlp(self.norm2(x))) 60 | return x -------------------------------------------------------------------------------- /0-NICO/models/transformer_block.py: -------------------------------------------------------------------------------- 1 | # Copyright (c) [2012]-[2021] Shanghai Yitu Technology Co., Ltd. 2 | # 3 | # This source code is licensed under the Clear BSD License 4 | # LICENSE file in the root directory of this file 5 | # All rights reserved. 
6 | """ 7 | Borrow from timm(https://github.com/rwightman/pytorch-image-models) 8 | """ 9 | import torch 10 | import torch.nn as nn 11 | import numpy as np 12 | from timm.models.layers import DropPath 13 | 14 | class Mlp(nn.Module): 15 | def __init__(self, in_features, hidden_features=None, out_features=None, act_layer=nn.GELU, drop=0.): 16 | super().__init__() 17 | out_features = out_features or in_features 18 | hidden_features = hidden_features or in_features 19 | self.fc1 = nn.Linear(in_features, hidden_features) 20 | self.act = act_layer() 21 | self.fc2 = nn.Linear(hidden_features, out_features) 22 | self.drop = nn.Dropout(drop) 23 | 24 | def forward(self, x): 25 | x = self.fc1(x) 26 | x = self.act(x) 27 | x = self.drop(x) 28 | x = self.fc2(x) 29 | x = self.drop(x) 30 | return x 31 | 32 | class Attention(nn.Module): 33 | def __init__(self, dim, num_heads=8, qkv_bias=False, qk_scale=None, attn_drop=0., proj_drop=0.): 34 | super().__init__() 35 | self.num_heads = num_heads 36 | head_dim = dim // num_heads 37 | 38 | self.scale = qk_scale or head_dim ** -0.5 39 | 40 | self.qkv = nn.Linear(dim, dim * 3, bias=qkv_bias) 41 | self.attn_drop = nn.Dropout(attn_drop) 42 | self.proj = nn.Linear(dim, dim) 43 | self.proj_drop = nn.Dropout(proj_drop) 44 | 45 | def forward(self, x): 46 | B, N, C = x.shape 47 | qkv = self.qkv(x).reshape(B, N, 3, self.num_heads, C // self.num_heads).permute(2, 0, 3, 1, 4) 48 | q, k, v = qkv[0], qkv[1], qkv[2] 49 | 50 | attn = (q @ k.transpose(-2, -1)) * self.scale 51 | attn = attn.softmax(dim=-1) 52 | attn = self.attn_drop(attn) 53 | 54 | x = (attn @ v).transpose(1, 2).reshape(B, N, C) 55 | x = self.proj(x) 56 | x = self.proj_drop(x) 57 | return x 58 | 59 | 60 | class Attention_ours(nn.Module): 61 | def __init__(self, dim, num_heads=8, qkv_bias=False, qk_scale=None, attn_drop=0., proj_drop=0.): 62 | super().__init__() 63 | self.num_heads = num_heads 64 | head_dim = dim // num_heads 65 | 66 | self.scale = qk_scale or head_dim ** -0.5 67 | 68 | self.qkv = nn.Linear(dim, dim * 3, bias=qkv_bias) 69 | self.attn_drop = nn.Dropout(attn_drop) 70 | self.proj = nn.Linear(dim, dim) 71 | self.proj_drop = nn.Dropout(proj_drop) 72 | 73 | def forward(self, x): 74 | B, N, C = x.shape 75 | qkv = self.qkv(x).reshape(B, N, 3, self.num_heads, C // self.num_heads).permute(2, 0, 3, 1, 4) 76 | q, k, v = qkv[0], qkv[1], qkv[2] 77 | 78 | attn = (q @ k.transpose(-2, -1)) * self.scale 79 | attn_causal = attn.softmax(dim=-1) 80 | attn_sp = (-attn).softmax(dim=-1) 81 | 82 | attn_causal = self.attn_drop(attn_causal) 83 | attn_sp = self.attn_drop(attn_sp) 84 | 85 | x = (attn_causal @ v).transpose(1, 2).reshape(B, N, C) 86 | x_comp = (attn_sp @ v).transpose(1, 2).reshape(B, N, C) 87 | 88 | x = self.proj(x) 89 | x_comp = self.proj(x_comp) 90 | x = self.proj_drop(x) 91 | x_comp = self.proj_drop(x_comp) 92 | return x, x_comp 93 | 94 | 95 | class Block(nn.Module): 96 | 97 | def __init__(self, dim, num_heads, mlp_ratio=4., qkv_bias=False, qk_scale=None, drop=0., attn_drop=0., 98 | drop_path=0., act_layer=nn.GELU, norm_layer=nn.LayerNorm): 99 | super().__init__() 100 | self.norm1 = norm_layer(dim) 101 | self.attn = Attention( 102 | dim, num_heads=num_heads, qkv_bias=qkv_bias, qk_scale=qk_scale, attn_drop=attn_drop, proj_drop=drop) 103 | self.drop_path = DropPath(drop_path) if drop_path > 0. 
else nn.Identity() 104 | self.norm2 = norm_layer(dim) 105 | mlp_hidden_dim = int(dim * mlp_ratio) 106 | self.mlp = Mlp(in_features=dim, hidden_features=mlp_hidden_dim, act_layer=act_layer, drop=drop) 107 | 108 | def forward(self, x): 109 | x = x + self.drop_path(self.attn(self.norm1(x))) 110 | x = x + self.drop_path(self.mlp(self.norm2(x))) 111 | return x 112 | 113 | 114 | class Block_ours(nn.Module): 115 | 116 | def __init__(self, dim, num_heads, mlp_ratio=4., qkv_bias=False, qk_scale=None, drop=0., attn_drop=0., 117 | drop_path=0., act_layer=nn.GELU, norm_layer=nn.LayerNorm): 118 | super().__init__() 119 | self.norm1 = norm_layer(dim) 120 | self.attn = Attention_ours( 121 | dim, num_heads=num_heads, qkv_bias=qkv_bias, qk_scale=qk_scale, attn_drop=attn_drop, proj_drop=drop) 122 | self.drop_path = DropPath(drop_path) if drop_path > 0. else nn.Identity() 123 | self.norm2 = norm_layer(dim) 124 | # self.norm3 = norm_layer(dim) 125 | mlp_hidden_dim = int(dim * mlp_ratio) 126 | self.mlp = Mlp(in_features=dim, hidden_features=mlp_hidden_dim, act_layer=act_layer, drop=drop) 127 | 128 | def forward(self, x): 129 | if isinstance(x, list): 130 | x = x[-1] # x = x_mix 131 | attn_x, attn_x_comp = self.attn(self.norm1(x)) 132 | x_causal = x + self.drop_path(attn_x) 133 | x_spurious = self.drop_path(attn_x_comp) # no residual 134 | 135 | x_causal = x_causal + self.drop_path(self.mlp(self.norm2(x_causal))) 136 | x_spurious = x_spurious + self.drop_path(self.mlp(self.norm2(x_spurious))) 137 | x_mix = x_causal + x_spurious 138 | 139 | return [x_causal, x_spurious, x_mix] 140 | 141 | 142 | def get_sinusoid_encoding(n_position, d_hid): 143 | ''' Sinusoid position encoding table ''' 144 | 145 | def get_position_angle_vec(position): 146 | return [position / np.power(10000, 2 * (hid_j // 2) / d_hid) for hid_j in range(d_hid)] 147 | 148 | sinusoid_table = np.array([get_position_angle_vec(pos_i) for pos_i in range(n_position)]) 149 | sinusoid_table[:, 0::2] = np.sin(sinusoid_table[:, 0::2]) # dim 2i 150 | sinusoid_table[:, 1::2] = np.cos(sinusoid_table[:, 1::2]) # dim 2i+1 151 | 152 | return torch.FloatTensor(sinusoid_table).unsqueeze(0) -------------------------------------------------------------------------------- /0-NICO/models/vgg.py: -------------------------------------------------------------------------------- 1 | """vgg in pytorch 2 | 3 | 4 | [1] Karen Simonyan, Andrew Zisserman 5 | 6 | Very Deep Convolutional Networks for Large-Scale Image Recognition. 
7 | https://arxiv.org/abs/1409.1556v6 8 | """ 9 | '''VGG11/13/16/19 in Pytorch.''' 10 | 11 | import torch 12 | import torch.nn as nn 13 | 14 | cfg = { 15 | 'A' : [64, 'M', 128, 'M', 256, 256, 'M', 512, 512, 'M', 512, 512, 'M'], 16 | 'B' : [64, 64, 'M', 128, 128, 'M', 256, 256, 'M', 512, 512, 'M', 512, 512, 'M'], 17 | 'D' : [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 'M', 512, 512, 512, 'M', 512, 512, 512, 'M'], 18 | 'E' : [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 256, 'M', 512, 512, 512, 512, 'M', 512, 512, 512, 512, 'M'] 19 | } 20 | 21 | class VGG(nn.Module): 22 | 23 | def __init__(self, features, num_class=100): 24 | super().__init__() 25 | self.features = features 26 | 27 | self.classifier = nn.Sequential( 28 | nn.Linear(512, 4096), 29 | nn.ReLU(inplace=True), 30 | nn.Dropout(), 31 | nn.Linear(4096, 4096), 32 | nn.ReLU(inplace=True), 33 | nn.Dropout(), 34 | nn.Linear(4096, num_class) 35 | ) 36 | 37 | def forward(self, x): 38 | output = self.features(x) 39 | output = output.view(output.size()[0], -1) 40 | output = self.classifier(output) 41 | 42 | return output 43 | 44 | def make_layers(cfg, batch_norm=False): 45 | layers = [] 46 | 47 | input_channel = 3 48 | for l in cfg: 49 | if l == 'M': 50 | layers += [nn.MaxPool2d(kernel_size=2, stride=2)] 51 | continue 52 | 53 | layers += [nn.Conv2d(input_channel, l, kernel_size=3, padding=1)] 54 | 55 | if batch_norm: 56 | layers += [nn.BatchNorm2d(l)] 57 | 58 | layers += [nn.ReLU(inplace=True)] 59 | input_channel = l 60 | 61 | return nn.Sequential(*layers) 62 | 63 | def vgg11_bn(): 64 | return VGG(make_layers(cfg['A'], batch_norm=True)) 65 | 66 | def vgg13_bn(): 67 | return VGG(make_layers(cfg['B'], batch_norm=True)) 68 | 69 | def vgg16_bn(): 70 | return VGG(make_layers(cfg['D'], batch_norm=True)) 71 | 72 | def vgg19_bn(): 73 | return VGG(make_layers(cfg['E'], batch_norm=True)) 74 | 75 | 76 | -------------------------------------------------------------------------------- /0-NICO/models/wideresidual.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | 4 | 5 | class WideBasic(nn.Module): 6 | 7 | def __init__(self, in_channels, out_channels, stride=1): 8 | super().__init__() 9 | self.residual = nn.Sequential( 10 | nn.BatchNorm2d(in_channels), 11 | nn.ReLU(inplace=True), 12 | nn.Conv2d( 13 | in_channels, 14 | out_channels, 15 | kernel_size=3, 16 | stride=stride, 17 | padding=1 18 | ), 19 | nn.BatchNorm2d(out_channels), 20 | nn.ReLU(inplace=True), 21 | nn.Dropout(), 22 | nn.Conv2d( 23 | out_channels, 24 | out_channels, 25 | kernel_size=3, 26 | stride=1, 27 | padding=1 28 | ) 29 | ) 30 | 31 | self.shortcut = nn.Sequential() 32 | 33 | if in_channels != out_channels or stride != 1: 34 | self.shortcut = nn.Sequential( 35 | nn.Conv2d(in_channels, out_channels, 1, stride=stride) 36 | ) 37 | 38 | def forward(self, x): 39 | 40 | residual = self.residual(x) 41 | shortcut = self.shortcut(x) 42 | 43 | return residual + shortcut 44 | 45 | class WideResNet(nn.Module): 46 | def __init__(self, num_classes, block, depth=50, widen_factor=1): 47 | super().__init__() 48 | 49 | self.depth = depth 50 | k = widen_factor 51 | l = int((depth - 4) / 6) 52 | self.in_channels = 16 53 | self.init_conv = nn.Conv2d(3, self.in_channels, 3, 1, padding=1) 54 | self.conv2 = self._make_layer(block, 16 * k, l, 1) 55 | self.conv3 = self._make_layer(block, 32 * k, l, 2) 56 | self.conv4 = self._make_layer(block, 64 * k, l, 2) 57 | self.bn = nn.BatchNorm2d(64 * k) 58 | self.relu = nn.ReLU(inplace=True) 59 | 
self.avg_pool = nn.AdaptiveAvgPool2d((1, 1)) 60 | self.linear = nn.Linear(64 * k, num_classes) 61 | 62 | def forward(self, x): 63 | x = self.init_conv(x) 64 | x = self.conv2(x) 65 | x = self.conv3(x) 66 | x = self.conv4(x) 67 | x = self.bn(x) 68 | x = self.relu(x) 69 | x = self.avg_pool(x) 70 | x = x.view(x.size(0), -1) 71 | x = self.linear(x) 72 | 73 | return x 74 | 75 | def _make_layer(self, block, out_channels, num_blocks, stride): 76 | """make resnet layers(by layer i didnt mean this 'layer' was the 77 | same as a neuron netowork layer, ex. conv layer), one layer may 78 | contain more than one residual block 79 | 80 | Args: 81 | block: block type, basic block or bottle neck block 82 | out_channels: output depth channel number of this layer 83 | num_blocks: how many blocks per layer 84 | stride: the stride of the first block of this layer 85 | 86 | Return: 87 | return a resnet layer 88 | """ 89 | 90 | # we have num_block blocks per layer, the first block 91 | # could be 1 or 2, other blocks would always be 1 92 | strides = [stride] + [1] * (num_blocks - 1) 93 | layers = [] 94 | for stride in strides: 95 | layers.append(block(self.in_channels, out_channels, stride)) 96 | self.in_channels = out_channels 97 | 98 | return nn.Sequential(*layers) 99 | 100 | 101 | # Table 9: Best WRN performance over various datasets, single run results. 102 | def wideresnet(depth=40, widen_factor=10): 103 | net = WideResNet(100, WideBasic, depth=depth, widen_factor=widen_factor) 104 | return net -------------------------------------------------------------------------------- /0-NICO/models/xception.py: -------------------------------------------------------------------------------- 1 | """xception in pytorch 2 | 3 | 4 | [1] François Chollet 5 | 6 | Xception: Deep Learning with Depthwise Separable Convolutions 7 | https://arxiv.org/abs/1610.02357 8 | """ 9 | 10 | import torch 11 | import torch.nn as nn 12 | 13 | class SeperableConv2d(nn.Module): 14 | 15 | #***Figure 4. 
An “extreme” version of our Inception module, 16 | #with one spatial convolution per output channel of the 1x1 17 | #convolution.""" 18 | def __init__(self, input_channels, output_channels, kernel_size, **kwargs): 19 | 20 | super().__init__() 21 | self.depthwise = nn.Conv2d( 22 | input_channels, 23 | input_channels, 24 | kernel_size, 25 | groups=input_channels, 26 | bias=False, 27 | **kwargs 28 | ) 29 | 30 | self.pointwise = nn.Conv2d(input_channels, output_channels, 1, bias=False) 31 | 32 | def forward(self, x): 33 | x = self.depthwise(x) 34 | x = self.pointwise(x) 35 | 36 | return x 37 | 38 | class EntryFlow(nn.Module): 39 | 40 | def __init__(self): 41 | 42 | super().__init__() 43 | self.conv1 = nn.Sequential( 44 | nn.Conv2d(3, 32, 3, padding=1, bias=False), 45 | nn.BatchNorm2d(32), 46 | nn.ReLU(inplace=True) 47 | ) 48 | 49 | self.conv2 = nn.Sequential( 50 | nn.Conv2d(32, 64, 3, padding=1, bias=False), 51 | nn.BatchNorm2d(64), 52 | nn.ReLU(inplace=True) 53 | ) 54 | 55 | self.conv3_residual = nn.Sequential( 56 | SeperableConv2d(64, 128, 3, padding=1), 57 | nn.BatchNorm2d(128), 58 | nn.ReLU(inplace=True), 59 | SeperableConv2d(128, 128, 3, padding=1), 60 | nn.BatchNorm2d(128), 61 | nn.MaxPool2d(3, stride=2, padding=1), 62 | ) 63 | 64 | self.conv3_shortcut = nn.Sequential( 65 | nn.Conv2d(64, 128, 1, stride=2), 66 | nn.BatchNorm2d(128), 67 | ) 68 | 69 | self.conv4_residual = nn.Sequential( 70 | nn.ReLU(inplace=True), 71 | SeperableConv2d(128, 256, 3, padding=1), 72 | nn.BatchNorm2d(256), 73 | nn.ReLU(inplace=True), 74 | SeperableConv2d(256, 256, 3, padding=1), 75 | nn.BatchNorm2d(256), 76 | nn.MaxPool2d(3, stride=2, padding=1) 77 | ) 78 | 79 | self.conv4_shortcut = nn.Sequential( 80 | nn.Conv2d(128, 256, 1, stride=2), 81 | nn.BatchNorm2d(256), 82 | ) 83 | 84 | #no downsampling 85 | self.conv5_residual = nn.Sequential( 86 | nn.ReLU(inplace=True), 87 | SeperableConv2d(256, 728, 3, padding=1), 88 | nn.BatchNorm2d(728), 89 | nn.ReLU(inplace=True), 90 | SeperableConv2d(728, 728, 3, padding=1), 91 | nn.BatchNorm2d(728), 92 | nn.MaxPool2d(3, 1, padding=1) 93 | ) 94 | 95 | #no downsampling 96 | self.conv5_shortcut = nn.Sequential( 97 | nn.Conv2d(256, 728, 1), 98 | nn.BatchNorm2d(728) 99 | ) 100 | 101 | def forward(self, x): 102 | x = self.conv1(x) 103 | x = self.conv2(x) 104 | residual = self.conv3_residual(x) 105 | shortcut = self.conv3_shortcut(x) 106 | x = residual + shortcut 107 | residual = self.conv4_residual(x) 108 | shortcut = self.conv4_shortcut(x) 109 | x = residual + shortcut 110 | residual = self.conv5_residual(x) 111 | shortcut = self.conv5_shortcut(x) 112 | x = residual + shortcut 113 | 114 | return x 115 | 116 | class MiddleFLowBlock(nn.Module): 117 | 118 | def __init__(self): 119 | super().__init__() 120 | 121 | self.shortcut = nn.Sequential() 122 | self.conv1 = nn.Sequential( 123 | nn.ReLU(inplace=True), 124 | SeperableConv2d(728, 728, 3, padding=1), 125 | nn.BatchNorm2d(728) 126 | ) 127 | self.conv2 = nn.Sequential( 128 | nn.ReLU(inplace=True), 129 | SeperableConv2d(728, 728, 3, padding=1), 130 | nn.BatchNorm2d(728) 131 | ) 132 | self.conv3 = nn.Sequential( 133 | nn.ReLU(inplace=True), 134 | SeperableConv2d(728, 728, 3, padding=1), 135 | nn.BatchNorm2d(728) 136 | ) 137 | 138 | def forward(self, x): 139 | residual = self.conv1(x) 140 | residual = self.conv2(residual) 141 | residual = self.conv3(residual) 142 | 143 | shortcut = self.shortcut(x) 144 | 145 | return shortcut + residual 146 | 147 | class MiddleFlow(nn.Module): 148 | def __init__(self, block): 149 | super().__init__() 150 
| 151 | #"""then through the middle flow which is repeated eight times""" 152 | self.middel_block = self._make_flow(block, 8) 153 | 154 | def forward(self, x): 155 | x = self.middel_block(x) 156 | return x 157 | 158 | def _make_flow(self, block, times): 159 | flows = [] 160 | for i in range(times): 161 | flows.append(block()) 162 | 163 | return nn.Sequential(*flows) 164 | 165 | 166 | class ExitFLow(nn.Module): 167 | 168 | def __init__(self): 169 | super().__init__() 170 | self.residual = nn.Sequential( 171 | nn.ReLU(), 172 | SeperableConv2d(728, 728, 3, padding=1), 173 | nn.BatchNorm2d(728), 174 | nn.ReLU(), 175 | SeperableConv2d(728, 1024, 3, padding=1), 176 | nn.BatchNorm2d(1024), 177 | nn.MaxPool2d(3, stride=2, padding=1) 178 | ) 179 | 180 | self.shortcut = nn.Sequential( 181 | nn.Conv2d(728, 1024, 1, stride=2), 182 | nn.BatchNorm2d(1024) 183 | ) 184 | 185 | self.conv = nn.Sequential( 186 | SeperableConv2d(1024, 1536, 3, padding=1), 187 | nn.BatchNorm2d(1536), 188 | nn.ReLU(inplace=True), 189 | SeperableConv2d(1536, 2048, 3, padding=1), 190 | nn.BatchNorm2d(2048), 191 | nn.ReLU(inplace=True) 192 | ) 193 | 194 | self.avgpool = nn.AdaptiveAvgPool2d((1, 1)) 195 | 196 | def forward(self, x): 197 | shortcut = self.shortcut(x) 198 | residual = self.residual(x) 199 | output = shortcut + residual 200 | output = self.conv(output) 201 | output = self.avgpool(output) 202 | 203 | return output 204 | 205 | class Xception(nn.Module): 206 | 207 | def __init__(self, block, num_class=100): 208 | super().__init__() 209 | self.entry_flow = EntryFlow() 210 | self.middel_flow = MiddleFlow(block) 211 | self.exit_flow = ExitFLow() 212 | 213 | self.fc = nn.Linear(2048, num_class) 214 | 215 | def forward(self, x): 216 | x = self.entry_flow(x) 217 | x = self.middel_flow(x) 218 | x = self.exit_flow(x) 219 | x = x.view(x.size(0), -1) 220 | x = self.fc(x) 221 | 222 | return x 223 | 224 | def xception(): 225 | return Xception(MiddleFLowBlock) 226 | 227 | 228 | -------------------------------------------------------------------------------- /0-NICO/pretrain_model/nico_resnet18_ours_caam-best.pth: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Wangt-CN/CaaM/32b9bb24c10a0fd8c426ae50bb8a09ccd420d1aa/0-NICO/pretrain_model/nico_resnet18_ours_caam-best.pth -------------------------------------------------------------------------------- /0-NICO/pretrain_model/nico_t2tvit7_ours_caam-best.pth: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Wangt-CN/CaaM/32b9bb24c10a0fd8c426ae50bb8a09ccd420d1aa/0-NICO/pretrain_model/nico_t2tvit7_ours_caam-best.pth -------------------------------------------------------------------------------- /0-NICO/scripts/run_baseline_resnet18.sh: -------------------------------------------------------------------------------- 1 | CUDA_VISIBLE_DEVICES=0 python train.py -cfg conf/baseline_resnet18_bf0.02.yaml -gpu -name baseline_resnet18 -------------------------------------------------------------------------------- /0-NICO/scripts/run_baseline_t2tvit7.sh: -------------------------------------------------------------------------------- 1 | CUDA_VISIBLE_DEVICES=0 python train.py -cfg conf/baseline_t2tvit7_bf0.02.yaml -gpu -name baseline_t2tvit7 -------------------------------------------------------------------------------- /0-NICO/scripts/run_ours_resnet18.sh: -------------------------------------------------------------------------------- 1 | CUDA_VISIBLE_DEVICES=0,1 
python train.py -cfg conf/ours_resnet18_multilayer4_bf0.02_noenv_pw5e5.yaml -gpu -name lti_ours_resnet18_multilayer4_bf0.02_noenv_pw5e5 -multigpu -------------------------------------------------------------------------------- /0-NICO/scripts/run_ours_resnet18_mixup.sh: -------------------------------------------------------------------------------- 1 | CUDA_VISIBLE_DEVICES=0,1 python train.py -cfg conf/ours_resnet18_multilayer4_bf0.02_noenv_pw5e5_mixup.yaml -gpu -name lti_ours_resnet18_multilayer4_bf0.02_noenv_pw5e5_mixup -multigpu -------------------------------------------------------------------------------- /0-NICO/scripts/run_ours_t2tvit7.sh: -------------------------------------------------------------------------------- 1 | CUDA_VISIBLE_DEVICES=0,1 python train.py -cfg conf/ours_t2tvit7_bf0.02_s4_noenv_pw5e4.yaml -gpu -name multi_ours_t2tvit7_stage4_noenv_pw5e4 -------------------------------------------------------------------------------- /1-Imagenet9/clusters/cluster_label_1.pth: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Wangt-CN/CaaM/32b9bb24c10a0fd8c426ae50bb8a09ccd420d1aa/1-Imagenet9/clusters/cluster_label_1.pth -------------------------------------------------------------------------------- /1-Imagenet9/clusters/cluster_label_2.pth: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Wangt-CN/CaaM/32b9bb24c10a0fd8c426ae50bb8a09ccd420d1aa/1-Imagenet9/clusters/cluster_label_2.pth -------------------------------------------------------------------------------- /1-Imagenet9/clusters/cluster_label_3.pth: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Wangt-CN/CaaM/32b9bb24c10a0fd8c426ae50bb8a09ccd420d1aa/1-Imagenet9/clusters/cluster_label_3.pth -------------------------------------------------------------------------------- /1-Imagenet9/conf/__init__.py: -------------------------------------------------------------------------------- 1 | """ dynamically load settings 2 | 3 | author baiyu 4 | """ 5 | import conf.global_settings as settings 6 | 7 | class Settings: 8 | def __init__(self, settings): 9 | 10 | for attr in dir(settings): 11 | if attr.isupper(): 12 | setattr(self, attr, getattr(settings, attr)) 13 | 14 | settings = Settings(settings) -------------------------------------------------------------------------------- /1-Imagenet9/conf/baseline_resnet18_imagenet9.yaml: -------------------------------------------------------------------------------- 1 | exp_name: nico_resvit18_multi_unshuffle_bf0.02_lr0.01 2 | net: resnet18 3 | dataset: NICO 4 | image_folder: /data2/wangtan/causal-invariant-attention/dataset/NICO/multi_classification 5 | cxt_dic_path: /data2/wangtan/causal-invariant-attention/dataset/NICO/label_file/Context_name2label.json 6 | class_dic_path: /data2/wangtan/causal-invariant-attention/dataset/NICO/label_file/Animal_name2label.json 7 | train_root: /disk2/wangtan/dataset/ImageNet/imagenet/ILSVRC/Data/CLS-LOC/train 8 | val_root: /disk2/wangtan/dataset/ImageNet/imagenet/ILSVRC/Data/CLS-LOC/val 9 | imageneta_root: /disk2/wangtan/dataset/ImageNet/imagenet_a/imagenet-a 10 | training_opt: 11 | seed: 0 12 | batch_size: 256 13 | val_batch_size: 128 14 | lr: 0.05 15 | warm: 2 16 | epoch: 120 17 | milestones: [50, 80, 100] 18 | # milestones: [80, 140, 200] 19 | save_epoch: 20 20 | print_batch: 10 21 | mean: [0.52418953, 0.5233741, 0.44896784] 22 | std: [0.21851876, 
0.2175944, 0.22552039] 23 | variance_opt: 24 | balance_factor: 0.02 25 | training_dist: {'dog': ['on_grass','in_water','in_cage','eating','on_beach','lying','running'], 26 | 'cat': ['on_snow','at_home','in_street','walking','in_river','in_cage','eating'], 27 | 'bear': ['in_forest','black','brown','eating_grass','in_water','lying','on_snow'], 28 | 'bird': ['on_ground', 'in_hand','on_branch','flying','eating','on_grass','standing'], 29 | 'cow': ['in_river', 'lying', 'standing','eating','in_forest','on_grass','on_snow'], 30 | 'elephant': ['in_zoo', 'in_circus', 'in_forest', 'in_river','eating','standing','on_grass'], 31 | 'horse': ['on_beach', 'aside_people', 'running','lying','on_grass','on_snow','in_forest'], 32 | 'monkey': ['sitting', 'walking', 'in_water','on_snow','in_forest','eating','on_grass'], 33 | 'rat': ['at_home', 'in_hole', 'in_cage','in_forest','in_water','on_grass','eating'], 34 | 'sheep': ['eating', 'on_road','walking','on_snow','on_grass','lying','in_forest']} 35 | mode: 'baseline' 36 | env_type: 'baseline' 37 | resume: False -------------------------------------------------------------------------------- /1-Imagenet9/conf/baseline_t2tvit7_imagenet9.yaml: -------------------------------------------------------------------------------- 1 | exp_name: nico_resvit18_multi_unshuffle_bf0.02_lr0.01 2 | net: t2tvit7 3 | dataset: NICO 4 | image_folder: /data2/wangtan/causal-invariant-attention/dataset/NICO/multi_classification 5 | cxt_dic_path: /data2/wangtan/causal-invariant-attention/dataset/NICO/label_file/Context_name2label.json 6 | class_dic_path: /data2/wangtan/causal-invariant-attention/dataset/NICO/label_file/Animal_name2label.json 7 | train_root: /disk2/simpleshinobu.qjx/imagenet/train 8 | val_root: /disk2/simpleshinobu.qjx/imagenet/val 9 | imageneta_root: /disk2/simpleshinobu.qjx/imagenet-a/imagenet-a 10 | training_opt: 11 | seed: 0 12 | batch_size: 256 13 | val_batch_size: 128 14 | optim: 15 | sched: baseline 16 | lr: 0.001 17 | warm: 3 18 | epoch: 120 19 | milestones: [50, 80, 100] 20 | # milestones: [80, 140, 200] 21 | save_epoch: 20 22 | print_batch: 10 23 | mean: [0.52418953, 0.5233741, 0.44896784] 24 | std: [0.21851876, 0.2175944, 0.22552039] 25 | variance_opt: 26 | balance_factor: 0.02 27 | training_dist: {'dog': ['on_grass','in_water','in_cage','eating','on_beach','lying','running'], 28 | 'cat': ['on_snow','at_home','in_street','walking','in_river','in_cage','eating'], 29 | 'bear': ['in_forest','black','brown','eating_grass','in_water','lying','on_snow'], 30 | 'bird': ['on_ground', 'in_hand','on_branch','flying','eating','on_grass','standing'], 31 | 'cow': ['in_river', 'lying', 'standing','eating','in_forest','on_grass','on_snow'], 32 | 'elephant': ['in_zoo', 'in_circus', 'in_forest', 'in_river','eating','standing','on_grass'], 33 | 'horse': ['on_beach', 'aside_people', 'running','lying','on_grass','on_snow','in_forest'], 34 | 'monkey': ['sitting', 'walking', 'in_water','on_snow','in_forest','eating','on_grass'], 35 | 'rat': ['at_home', 'in_hole', 'in_cage','in_forest','in_water','on_grass','eating'], 36 | 'sheep': ['eating', 'on_road','walking','on_snow','on_grass','lying','in_forest']} 37 | mode: 'baseline' 38 | env_type: 'baseline' 39 | resume: False -------------------------------------------------------------------------------- /1-Imagenet9/conf/global_settings.py: -------------------------------------------------------------------------------- 1 | """ configurations for this project 2 | 3 | author baiyu 4 | """ 5 | import os 6 | from datetime import datetime 7 
| 8 | #CIFAR100 dataset path (python version) 9 | #CIFAR100_PATH = '/nfs/private/cifar100/cifar-100-python' 10 | 11 | #mean and std of cifar100 dataset 12 | CIFAR100_TRAIN_MEAN = (0.5070751592371323, 0.48654887331495095, 0.4409178433670343) 13 | CIFAR100_TRAIN_STD = (0.2673342858792401, 0.2564384629170883, 0.27615047132568404) 14 | 15 | #directory to save weights file 16 | CHECKPOINT_PATH = 'checkpoint' 17 | 18 | DATE_FORMAT = '%A_%d_%B_%Y_%Hh_%Mm_%Ss' 19 | #time of we run the script 20 | TIME_NOW = datetime.now().strftime(DATE_FORMAT) 21 | 22 | #tensorboard log dir 23 | LOG_DIR = 'runs' 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | -------------------------------------------------------------------------------- /1-Imagenet9/conf/ours_resnet18_multi2_imagenet9_pw5e4_noenv_iter.yaml: -------------------------------------------------------------------------------- 1 | exp_name: nico_resvit18_multi_unshuffle_bf0.02_lr0.01 2 | net: resnet18_ours_cbam_multi 3 | dataset: NICO 4 | image_folder: /data2/wangtan/causal-invariant-attention/dataset/NICO/multi_classification 5 | cxt_dic_path: /data2/wangtan/causal-invariant-attention/dataset/NICO/label_file/Context_name2label.json 6 | class_dic_path: /data2/wangtan/causal-invariant-attention/dataset/NICO/label_file/Animal_name2label.json 7 | train_root: /disk2/wangtan/dataset/ImageNet/imagenet/ILSVRC/Data/CLS-LOC/train 8 | val_root: /disk2/wangtan/dataset/ImageNet/imagenet/ILSVRC/Data/CLS-LOC/val 9 | imageneta_root: /disk2/wangtan/dataset/ImageNet/imagenet_a/imagenet-a 10 | training_opt: 11 | seed: 0 12 | classes: 10 13 | batch_size: 64 14 | val_batch_size: 128 15 | lr: 0.1 16 | warm: 2 17 | epoch: 120 18 | milestones: [50, 80, 100] 19 | # milestones: [80, 140, 200] 20 | save_epoch: 20 21 | print_batch: 10 22 | mean: [0.52418953, 0.5233741, 0.44896784] 23 | std: [0.21851876, 0.2175944, 0.22552039] 24 | variance_opt: 25 | balance_factor: 0.02 26 | training_dist: {'dog': ['on_grass','in_water','in_cage','eating','on_beach','lying','running'], 27 | 'cat': ['on_snow','at_home','in_street','walking','in_river','in_cage','eating'], 28 | 'bear': ['in_forest','black','brown','eating_grass','in_water','lying','on_snow'], 29 | 'bird': ['on_ground', 'in_hand','on_branch','flying','eating','on_grass','standing'], 30 | 'cow': ['in_river', 'lying', 'standing','eating','in_forest','on_grass','on_snow'], 31 | 'elephant': ['in_zoo', 'in_circus', 'in_forest', 'in_river','eating','standing','on_grass'], 32 | 'horse': ['on_beach', 'aside_people', 'running','lying','on_grass','on_snow','in_forest'], 33 | 'monkey': ['sitting', 'walking', 'in_water','on_snow','in_forest','eating','on_grass'], 34 | 'rat': ['at_home', 'in_hole', 'in_cage','in_forest','in_water','on_grass','eating'], 35 | 'sheep': ['eating', 'on_road','walking','on_snow','on_grass','lying','in_forest']} 36 | env: True 37 | mode: 'ours' 38 | erm_flag: True 39 | sp_flag: False 40 | n_env: 4 41 | env_type: 'auto-iter' 42 | split_renew: 40 43 | split_renew_iters: 10 44 | from_scratch: False 45 | ref_model: resnet18 46 | ref_model_path: /disk2/wangtan/code/causal_invariant_attention/multi-classification/imagenet/checkpoint/resnet18/baseline_resnet18_imagenet9_lr0.05_class9/resnet18-100-regular.pth 47 | penalty_weight: 5e4 48 | penalty_anneal_iters: 0 49 | #2 blocks, 4 layers 50 | split_layer: 3 51 | resume: False -------------------------------------------------------------------------------- /1-Imagenet9/conf/ours_t2tvit7_s4_imagenet9_noenv_pw5e4_iter.yaml: 
-------------------------------------------------------------------------------- 1 | exp_name: nico_resvit18_multi_unshuffle_bf0.02_lr0.01 2 | net: t2tvit7_ours 3 | dataset: NICO 4 | image_folder: /data2/wangtan/causal-invariant-attention/dataset/NICO/multi_classification 5 | cxt_dic_path: /data2/wangtan/causal-invariant-attention/dataset/NICO/label_file/Context_name2label.json 6 | class_dic_path: /data2/wangtan/causal-invariant-attention/dataset/NICO/label_file/Animal_name2label.json 7 | train_root: /disk2/wangtan/dataset/ImageNet/imagenet/ILSVRC/Data/CLS-LOC/train 8 | val_root: /disk2/wangtan/dataset/ImageNet/imagenet/ILSVRC/Data/CLS-LOC/val 9 | imageneta_root: /disk2/wangtan/dataset/ImageNet/imagenet_a/imagenet-a 10 | training_opt: 11 | seed: 0 12 | classes: 9 13 | batch_size: 64 14 | val_batch_size: 128 15 | optim: 16 | sched: baseline 17 | lr: 0.001 18 | warm: 3 19 | epoch: 120 20 | milestones: [50, 80, 100] 21 | # milestones: [80, 140, 200] 22 | save_epoch: 20 23 | print_batch: 10 24 | mean: [0.52418953, 0.5233741, 0.44896784] 25 | std: [0.21851876, 0.2175944, 0.22552039] 26 | variance_opt: 27 | balance_factor: 0.02 28 | training_dist: {'dog': ['on_grass','in_water','in_cage','eating','on_beach','lying','running'], 29 | 'cat': ['on_snow','at_home','in_street','walking','in_river','in_cage','eating'], 30 | 'bear': ['in_forest','black','brown','eating_grass','in_water','lying','on_snow'], 31 | 'bird': ['on_ground', 'in_hand','on_branch','flying','eating','on_grass','standing'], 32 | 'cow': ['in_river', 'lying', 'standing','eating','in_forest','on_grass','on_snow'], 33 | 'elephant': ['in_zoo', 'in_circus', 'in_forest', 'in_river','eating','standing','on_grass'], 34 | 'horse': ['on_beach', 'aside_people', 'running','lying','on_grass','on_snow','in_forest'], 35 | 'monkey': ['sitting', 'walking', 'in_water','on_snow','in_forest','eating','on_grass'], 36 | 'rat': ['at_home', 'in_hole', 'in_cage','in_forest','in_water','on_grass','eating'], 37 | 'sheep': ['eating', 'on_road','walking','on_snow','on_grass','lying','in_forest']} 38 | env: True 39 | mode: 'ours' 40 | erm_flag: True 41 | sp_flag: False 42 | n_env: 4 43 | env_type: 'auto-iter' 44 | split_renew: 40 45 | split_renew_iters: 20 46 | from_scratch: False 47 | ref_model: resnet18 48 | ref_model_path: /disk2/wangtan/code/causal_invariant_attention/multi-classification/imagenet/checkpoint/resnet18/baseline_resnet18_imagenet9_lr0.05_class9/resnet18-100-regular.pth 49 | penalty_weight: 5e4 50 | penalty_anneal_iters: 0 51 | final_k: 4 52 | resume: False --------------------------------------------------------------------------------
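A note on consuming the configs above: a minimal, hypothetical loading sketch (not a file in this repo), assuming the keys nest under training_opt and variance_opt as their order suggests. One caveat worth knowing: PyYAML's YAML 1.1 resolver treats a bare scientific literal such as 5e4 as a string (its float rule requires a dot and a signed exponent), so penalty_weight needs an explicit cast before it is used as a loss weight.

import yaml

# Illustrative only: read one of the configs above and pull out a few fields.
with open('conf/ours_t2tvit7_s4_imagenet9_noenv_pw5e4_iter.yaml') as f:
    cfg = yaml.safe_load(f)

batch_size = cfg['training_opt']['batch_size']                 # 64
n_env = cfg['variance_opt']['n_env']                           # 4
# PyYAML loads '5e4' as the string '5e4', not the float 50000.0.
penalty_weight = float(cfg['variance_opt']['penalty_weight'])  # 50000.0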
/1-Imagenet9/imagenet_cluster.py: -------------------------------------------------------------------------------- 1 | """ReBias 2 | Copyright (c) 2020-present NAVER Corp. 3 | MIT license 4 | """ 5 | import argparse 6 | import os 7 | import time 8 | 9 | import torch 10 | import torch.nn as nn 11 | 12 | import torchvision 13 | from torchvision import transforms 14 | from torchvision.utils import save_image 15 | 16 | import numpy as np 17 | from PIL import Image 18 | from sklearn.cluster import MiniBatchKMeans 19 | 20 | from dataset_imagenet import get_imagenet_dataloader 21 | 22 | parser = argparse.ArgumentParser() 23 | 24 | parser.add_argument('--dataset', type=str, default='ImageNet') 25 | parser.add_argument('--num_classes', type=int, default=9, help='number of classes') 26 | parser.add_argument('--load_size', type=int, default=256, help='image load size') 27 | parser.add_argument('--image_size', type=int, default=224, help='image crop size') 28 | parser.add_argument('--k', type=int, default=9, help='number of clusters') 29 | parser.add_argument('--n_sample', type=int, default=30, help='number of samples per cluster') 30 | parser.add_argument('--batch_size', type=int, default=128, help='mini-batch size') 31 | parser.add_argument('--num_workers', type=int, default=4, help='number of data loading workers') 32 | parser.add_argument('--cluster_dir', type=str, default='pre_cluster_results') 33 | 34 | 35 | def main(n_try=None): 36 | args = parser.parse_args() 37 | 38 | # create directories if not exist 39 | if not os.path.exists(args.cluster_dir): 40 | os.makedirs(args.cluster_dir) 41 | 42 | data_loader = get_imagenet_dataloader(root='/disk2/wangtan/dataset/ImageNet/imagenet/ILSVRC/Data/CLS-LOC/train', batch_size=args.batch_size, train=False) 43 | 44 | transform = transforms.Compose([ 45 | transforms.Resize(256), 46 | transforms.CenterCrop(224), 47 | transforms.ToTensor(), 48 | transforms.Normalize(mean=[0.485, 0.456, 0.406], 49 | std=[0.229, 0.224, 0.225]) 50 | ]) 51 | 52 | extractor = nn.Sequential(*list(torchvision.models.vgg16(pretrained=True).features)[:-16]) # up to conv3_3 53 | extractor.cuda() 54 | # extractor = nn.DataParallel(extractor) 55 | 56 | # ======================================================================= # 57 | # 1. Extract features # 58 | # ======================================================================= # 59 | print('Start extracting features...') 60 | extractor.eval() 61 | N = len(data_loader.dataset.dataset) 62 | 63 | start = time.time() 64 | for i, (images, targets, _) in enumerate(data_loader): 65 | images = images.cuda() 66 | outputs = gram_matrix(extractor(images)) 67 | outputs = outputs.view(images.size(0), -1).data.cpu().numpy() 68 | 69 | if i == 0: 70 | features = np.zeros((N, outputs.shape[1])).astype('float32') 71 | 72 | if i < N - 1: 73 | features[i * args.batch_size: (i+1) * args.batch_size] = outputs.astype('float32') 74 | 75 | else: 76 | features[i * args.batch_size:] = outputs.astype('float32') 77 | 78 | # L2 normalization 79 | features = features / np.linalg.norm(features, axis=1)[:, np.newaxis] 80 | print('Finished extracting features...(time: {0:.0f} s)'.format(time.time() - start))
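    # Why Gram features: gram_matrix() (defined further down) computes G = F F^T
    # over the flattened conv activations, i.e. channel co-occurrence statistics.
    # This is the texture/style descriptor of Gatys et al.: it records which
    # filters fire together while discarding spatial layout, so the k-means step
    # below groups images by texture-like context rather than by object shape,
    # which is presumably what the pre-computed environment labels are for.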
81 | 82 | # ======================================================================= # 83 | # 2. Clustering # 84 | # ======================================================================= # 85 | start = time.time() 86 | labels, image_lists = Kmeans(args.k, features) 87 | print('Finished clustering...(time: {0:.0f} s)'.format(time.time() - start)) 88 | 89 | # save clustering results 90 | torch.save(torch.LongTensor(labels), os.path.join(args.cluster_dir, 91 | 'cluster_label_{}.pth'.format(n_try))) 92 | print('Saved cluster label...') 93 | 94 | len_list = [len(image_list) for image_list in image_lists] 95 | min_len = min(len_list) 96 | if min_len < args.n_sample: 97 | args.n_sample = min_len 98 | print('number of images in each cluster:', len_list) 99 | 100 | # sample clustering results 101 | start = time.time() 102 | samples = [[]] * args.k 103 | for k in range(args.k): 104 | idx_list = image_lists[k] # list of image indexes in each cluster 105 | for j in range(args.n_sample): # sample j indexes 106 | idx = idx_list[j] 107 | filename = data_loader.dataset.dataset[idx][0] 108 | image = transform(Image.open(filename).convert('RGB')).unsqueeze(0) 109 | samples[k] = samples[k] + [image] 110 | 111 | for k in range(args.k): 112 | samples[k] = torch.cat(samples[k], dim=3) 113 | samples = torch.cat(samples, dim=0) 114 | 115 | filename = os.path.join(args.cluster_dir, 'cluster_sample_{}.jpg'.format(n_try)) 116 | save_image(denorm(samples.data.cpu()), filename, nrow=1, padding=0) 117 | print('Finished sampling...(time: {0:.0f} s)'.format(time.time() - start)) 118 | 119 | 120 | def gram_matrix(input, normalize=True): 121 | N, C, H, W = input.size() 122 | feat = input.view(N, C, -1) 123 | G = torch.bmm(feat, feat.transpose(1, 2)) # N X C X C 124 | if normalize: 125 | G /= (C * H * W) 126 | return G 127 | 128 | 129 | def denorm(x): 130 | """Convert the range to [0, 1].""" 131 | mean = torch.tensor([0.485, 0.456, 0.406]) 132 | std = torch.tensor([0.229, 0.224, 0.225]) 133 | return x.mul_(std[:, None, None]).add_(mean[:, None, None]).clamp_(0, 1) 134 | 135 | 136 | def Kmeans(k, features): 137 | n_data, dim = features.shape 138 | features = torch.FloatTensor(features) 139 | 140 | clus = MiniBatchKMeans(n_clusters=k, 141 | batch_size=1024).fit(features) 142 | labels = clus.labels_ 143 | 144 | image_lists = [[] for _ in range(k)] 145 | feat_lists = [[] for _ in range(k)] 146 | for i in range(n_data): 147 | image_lists[labels[i]].append(i) 148 | feat_lists[labels[i]].append(features[i].unsqueeze(0)) 149 | 150 | return labels, image_lists 151 | 152 | 153 | if __name__ == '__main__': 154 | for i in range(3): 155 | main(i+1) -------------------------------------------------------------------------------- /1-Imagenet9/misc/unbalance_bagnet18_imagenet9_presplit.npy: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Wangt-CN/CaaM/32b9bb24c10a0fd8c426ae50bb8a09ccd420d1aa/1-Imagenet9/misc/unbalance_bagnet18_imagenet9_presplit.npy -------------------------------------------------------------------------------- /1-Imagenet9/misc/unbalance_resnet18_imagenet9_presplit.npy: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Wangt-CN/CaaM/32b9bb24c10a0fd8c426ae50bb8a09ccd420d1aa/1-Imagenet9/misc/unbalance_resnet18_imagenet9_presplit.npy --------------------------------------------------------------------------------
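The cluster_label_{n}.pth files written above are presumably what the training pipeline consumes as pre-computed environment assignments. A minimal sketch of reading one back (illustrative only, not a file in the repo), assuming, as in the extraction loop, that entry i corresponds to the i-th sample of the un-shuffled training split:

import torch

# One cluster id per training-sample index, saved above as a LongTensor.
labels = torch.load('pre_cluster_results/cluster_label_1.pth')
print(labels.shape, int(labels.min()), int(labels.max()))  # ids lie in [0, k-1]
env_id = labels[42].item()  # environment assignment of sample 42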
/1-Imagenet9/models/bam.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import math 3 | import torch.nn as nn 4 | import torch.nn.functional as F 5 | 6 | class Flatten(nn.Module): 7 | def forward(self, x): 8 | return x.view(x.size(0), -1) 9 | class ChannelGate(nn.Module): 10 | def __init__(self, gate_channel, reduction_ratio=16, num_layers=1): 11 | super(ChannelGate, self).__init__() 12 | # channel gate: an MLP over the globally average-pooled feature map 13 | self.gate_c = nn.Sequential() 14 | self.gate_c.add_module( 'flatten', Flatten() ) 15 | gate_channels = [gate_channel] 16 | gate_channels += [gate_channel // reduction_ratio] * num_layers 17 | gate_channels += [gate_channel] 18 | for i in range( len(gate_channels) - 2 ): 19 | self.gate_c.add_module( 'gate_c_fc_%d'%i, nn.Linear(gate_channels[i], gate_channels[i+1]) ) 20 | self.gate_c.add_module( 'gate_c_bn_%d'%(i+1), nn.BatchNorm1d(gate_channels[i+1]) ) 21 | self.gate_c.add_module( 'gate_c_relu_%d'%(i+1), nn.ReLU() ) 22 | self.gate_c.add_module( 'gate_c_fc_final', nn.Linear(gate_channels[-2], gate_channels[-1]) ) 23 | def forward(self, in_tensor): 24 | avg_pool = F.avg_pool2d( in_tensor, in_tensor.size(2), stride=in_tensor.size(2) ) 25 | return self.gate_c( avg_pool ).unsqueeze(2).unsqueeze(3).expand_as(in_tensor) 26 | 27 | class SpatialGate(nn.Module): 28 | def __init__(self, gate_channel, reduction_ratio=16, dilation_conv_num=2, dilation_val=4): 29 | super(SpatialGate, self).__init__() 30 | self.gate_s = nn.Sequential() 31 | self.gate_s.add_module( 'gate_s_conv_reduce0', nn.Conv2d(gate_channel, gate_channel//reduction_ratio, kernel_size=1)) 32 | self.gate_s.add_module( 'gate_s_bn_reduce0', nn.BatchNorm2d(gate_channel//reduction_ratio) ) 33 | self.gate_s.add_module(
'gate_s_relu_reduce0',nn.ReLU() ) 34 | for i in range( dilation_conv_num ): 35 | self.gate_s.add_module( 'gate_s_conv_di_%d'%i, nn.Conv2d(gate_channel//reduction_ratio, gate_channel//reduction_ratio, kernel_size=3, \ 36 | padding=dilation_val, dilation=dilation_val) ) 37 | self.gate_s.add_module( 'gate_s_bn_di_%d'%i, nn.BatchNorm2d(gate_channel//reduction_ratio) ) 38 | self.gate_s.add_module( 'gate_s_relu_di_%d'%i, nn.ReLU() ) 39 | self.gate_s.add_module( 'gate_s_conv_final', nn.Conv2d(gate_channel//reduction_ratio, 1, kernel_size=1) ) 40 | def forward(self, in_tensor): 41 | return self.gate_s( in_tensor ).expand_as(in_tensor) 42 | class BAM(nn.Module): 43 | def __init__(self, gate_channel): 44 | super(BAM, self).__init__() 45 | self.channel_att = ChannelGate(gate_channel) 46 | self.spatial_att = SpatialGate(gate_channel) 47 | def forward(self,in_tensor): 48 | att = 1 + F.sigmoid( self.channel_att(in_tensor) * self.spatial_att(in_tensor) ) 49 | return att * in_tensor -------------------------------------------------------------------------------- /1-Imagenet9/models/cbam.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import math 3 | import torch.nn as nn 4 | import torch.nn.functional as F 5 | 6 | class BasicConv(nn.Module): 7 | def __init__(self, in_planes, out_planes, kernel_size, stride=1, padding=0, dilation=1, groups=1, relu=True, bn=True, bias=False): 8 | super(BasicConv, self).__init__() 9 | self.out_channels = out_planes 10 | self.conv = nn.Conv2d(in_planes, out_planes, kernel_size=kernel_size, stride=stride, padding=padding, dilation=dilation, groups=groups, bias=bias) 11 | self.bn = nn.BatchNorm2d(out_planes,eps=1e-5, momentum=0.01, affine=True) if bn else None 12 | self.relu = nn.ReLU() if relu else None 13 | 14 | def forward(self, x): 15 | x = self.conv(x) 16 | if self.bn is not None: 17 | x = self.bn(x) 18 | if self.relu is not None: 19 | x = self.relu(x) 20 | return x 21 | 22 | class Flatten(nn.Module): 23 | def forward(self, x): 24 | return x.view(x.size(0), -1) 25 | 26 | class ChannelGate(nn.Module): 27 | def __init__(self, gate_channels, reduction_ratio=16, pool_types=['avg', 'max']): 28 | super(ChannelGate, self).__init__() 29 | self.gate_channels = gate_channels 30 | self.mlp = nn.Sequential( 31 | Flatten(), 32 | nn.Linear(gate_channels, gate_channels // reduction_ratio), 33 | nn.ReLU(), 34 | nn.Linear(gate_channels // reduction_ratio, gate_channels) 35 | ) 36 | self.pool_types = pool_types 37 | def forward(self, x, return_attn=False): 38 | channel_att_sum = None 39 | for pool_type in self.pool_types: 40 | if pool_type=='avg': 41 | avg_pool = F.avg_pool2d( x, (x.size(2), x.size(3)), stride=(x.size(2), x.size(3))) 42 | channel_att_raw = self.mlp( avg_pool ) 43 | elif pool_type=='max': 44 | max_pool = F.max_pool2d( x, (x.size(2), x.size(3)), stride=(x.size(2), x.size(3))) 45 | channel_att_raw = self.mlp( max_pool ) 46 | elif pool_type=='lp': 47 | lp_pool = F.lp_pool2d( x, 2, (x.size(2), x.size(3)), stride=(x.size(2), x.size(3))) 48 | channel_att_raw = self.mlp( lp_pool ) 49 | elif pool_type=='lse': 50 | # LSE pool only 51 | lse_pool = logsumexp_2d(x) 52 | channel_att_raw = self.mlp( lse_pool ) 53 | 54 | if channel_att_sum is None: 55 | channel_att_sum = channel_att_raw 56 | else: 57 | channel_att_sum = channel_att_sum + channel_att_raw 58 | 59 | scale = F.sigmoid( channel_att_sum ).unsqueeze(2).unsqueeze(3).expand_as(x) 60 | if return_attn: 61 | return x * scale, scale 62 | else: 63 | return x * scale 64 | 65 | def 
logsumexp_2d(tensor): 66 | tensor_flatten = tensor.view(tensor.size(0), tensor.size(1), -1) 67 | s, _ = torch.max(tensor_flatten, dim=2, keepdim=True) 68 | outputs = s + (tensor_flatten - s).exp().sum(dim=2, keepdim=True).log() 69 | return outputs 70 | 71 | class ChannelPool(nn.Module): 72 | def forward(self, x): 73 | return torch.cat( (torch.max(x,1)[0].unsqueeze(1), torch.mean(x,1).unsqueeze(1)), dim=1 ) 74 | 75 | class SpatialGate(nn.Module): 76 | def __init__(self): 77 | super(SpatialGate, self).__init__() 78 | kernel_size = 7 79 | self.compress = ChannelPool() 80 | self.spatial = BasicConv(2, 1, kernel_size, stride=1, padding=(kernel_size-1) // 2, relu=False) 81 | def forward(self, x, return_attn=False): 82 | x_compress = self.compress(x) 83 | x_out = self.spatial(x_compress) 84 | scale = F.sigmoid(x_out) # broadcasting 85 | if return_attn: 86 | return x * scale, scale 87 | else: 88 | return x * scale 89 | 90 | class CBAM(nn.Module): 91 | def __init__(self, gate_channels, reduction_ratio=16, pool_types=['avg', 'max'], no_spatial=False): 92 | super(CBAM, self).__init__() 93 | self.ChannelGate = ChannelGate(gate_channels, reduction_ratio, pool_types) 94 | self.no_spatial=no_spatial 95 | if not no_spatial: 96 | self.SpatialGate = SpatialGate() 97 | def forward(self, x, return_attn=False): 98 | # x_fake = torch.ones_like(x) 99 | if return_attn: 100 | x_out, attn1 = self.ChannelGate(x, return_attn=True) 101 | if not self.no_spatial: 102 | x_out, attn2 = self.SpatialGate(x_out, return_attn=True) 103 | return x_out, attn1, attn2 104 | else: 105 | return x_out, attn1 106 | else: 107 | x_out = self.ChannelGate(x) 108 | if not self.no_spatial: 109 | x_out = self.SpatialGate(x_out) 110 | return x_out -------------------------------------------------------------------------------- /1-Imagenet9/models/densenet.py: -------------------------------------------------------------------------------- 1 | """dense net in pytorch 2 | 3 | 4 | 5 | [1] Gao Huang, Zhuang Liu, Laurens van der Maaten, Kilian Q. Weinberger. 6 | 7 | Densely Connected Convolutional Networks 8 | https://arxiv.org/abs/1608.06993v5 9 | """ 10 | 11 | import torch 12 | import torch.nn as nn 13 | 14 | 15 | 16 | #"""Bottleneck layers. Although each layer only produces k 17 | #output feature-maps, it typically has many more inputs. 
It 18 | #has been noted in [37, 11] that a 1×1 convolution can be in- 19 | #troduced as bottleneck layer before each 3×3 convolution 20 | #to reduce the number of input feature-maps, and thus to 21 | #improve computational efficiency.""" 22 | class Bottleneck(nn.Module): 23 | def __init__(self, in_channels, growth_rate): 24 | super().__init__() 25 | #"""In our experiments, we let each 1×1 convolution 26 | #produce 4k feature-maps.""" 27 | inner_channel = 4 * growth_rate 28 | 29 | #"""We find this design especially effective for DenseNet and 30 | #we refer to our network with such a bottleneck layer, i.e., 31 | #to the BN-ReLU-Conv(1×1)-BN-ReLU-Conv(3×3) version of H ` , 32 | #as DenseNet-B.""" 33 | self.bottle_neck = nn.Sequential( 34 | nn.BatchNorm2d(in_channels), 35 | nn.ReLU(inplace=True), 36 | nn.Conv2d(in_channels, inner_channel, kernel_size=1, bias=False), 37 | nn.BatchNorm2d(inner_channel), 38 | nn.ReLU(inplace=True), 39 | nn.Conv2d(inner_channel, growth_rate, kernel_size=3, padding=1, bias=False) 40 | ) 41 | 42 | def forward(self, x): 43 | return torch.cat([x, self.bottle_neck(x)], 1) 44 | 45 | #"""We refer to layers between blocks as transition 46 | #layers, which do convolution and pooling.""" 47 | class Transition(nn.Module): 48 | def __init__(self, in_channels, out_channels): 49 | super().__init__() 50 | #"""The transition layers used in our experiments 51 | #consist of a batch normalization layer and an 1×1 52 | #convolutional layer followed by a 2×2 average pooling 53 | #layer""". 54 | self.down_sample = nn.Sequential( 55 | nn.BatchNorm2d(in_channels), 56 | nn.Conv2d(in_channels, out_channels, 1, bias=False), 57 | nn.AvgPool2d(2, stride=2) 58 | ) 59 | 60 | def forward(self, x): 61 | return self.down_sample(x) 62 | 63 | #DenseNet-BC 64 | #B stands for bottleneck layer(BN-RELU-CONV(1x1)-BN-RELU-CONV(3x3)) 65 | #C stands for compression factor(0<=theta<=1) 66 | class DenseNet(nn.Module): 67 | def __init__(self, block, nblocks, growth_rate=12, reduction=0.5, num_class=100): 68 | super().__init__() 69 | self.growth_rate = growth_rate 70 | 71 | #"""Before entering the first dense block, a convolution 72 | #with 16 (or twice the growth rate for DenseNet-BC) 73 | #output channels is performed on the input images.""" 74 | inner_channels = 2 * growth_rate 75 | 76 | #For convolutional layers with kernel size 3×3, each 77 | #side of the inputs is zero-padded by one pixel to keep 78 | #the feature-map size fixed. 79 | self.conv1 = nn.Conv2d(3, inner_channels, kernel_size=3, padding=1, bias=False) 80 | 81 | self.features = nn.Sequential() 82 | 83 | for index in range(len(nblocks) - 1): 84 | self.features.add_module("dense_block_layer_{}".format(index), self._make_dense_layers(block, inner_channels, nblocks[index])) 85 | inner_channels += growth_rate * nblocks[index] 86 | 87 | #"""If a dense block contains m feature-maps, we let the 88 | #following transition layer generate θm output feature- 89 | #maps, where 0 < θ ≤ 1 is referred to as the compression 90 | #factor.
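            # Worked example with densenet121() below (nblocks=[6, 12, 24, 16],
            # growth_rate=32): the stem emits 2 * 32 = 64 channels, the first
            # dense block raises that to 64 + 6 * 32 = 256, and with
            # reduction=0.5 the transition layer created next compresses it
            # to int(0.5 * 256) = 128 channels.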
91 | out_channels = int(reduction * inner_channels) # int() automatically floors the value 92 | self.features.add_module("transition_layer_{}".format(index), Transition(inner_channels, out_channels)) 93 | inner_channels = out_channels 94 | 95 | self.features.add_module("dense_block{}".format(len(nblocks) - 1), self._make_dense_layers(block, inner_channels, nblocks[len(nblocks)-1])) 96 | inner_channels += growth_rate * nblocks[len(nblocks) - 1] 97 | self.features.add_module('bn', nn.BatchNorm2d(inner_channels)) 98 | self.features.add_module('relu', nn.ReLU(inplace=True)) 99 | 100 | self.avgpool = nn.AdaptiveAvgPool2d((1, 1)) 101 | 102 | self.linear = nn.Linear(inner_channels, num_class) 103 | 104 | def forward(self, x): 105 | output = self.conv1(x) 106 | output = self.features(output) 107 | output = self.avgpool(output) 108 | output = output.view(output.size()[0], -1) 109 | output = self.linear(output) 110 | return output 111 | 112 | def _make_dense_layers(self, block, in_channels, nblocks): 113 | dense_block = nn.Sequential() 114 | for index in range(nblocks): 115 | dense_block.add_module('bottle_neck_layer_{}'.format(index), block(in_channels, self.growth_rate)) 116 | in_channels += self.growth_rate 117 | return dense_block 118 | 119 | def densenet121(): 120 | return DenseNet(Bottleneck, [6,12,24,16], growth_rate=32) 121 | 122 | def densenet169(): 123 | return DenseNet(Bottleneck, [6,12,32,32], growth_rate=32) 124 | 125 | def densenet201(): 126 | return DenseNet(Bottleneck, [6,12,48,32], growth_rate=32) 127 | 128 | def densenet161(): 129 | return DenseNet(Bottleneck, [6,12,36,24], growth_rate=48) 130 | 131 | -------------------------------------------------------------------------------- /1-Imagenet9/models/googlenet.py: -------------------------------------------------------------------------------- 1 | """google net in pytorch 2 | 3 | 4 | 5 | [1] Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, 6 | Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, Andrew Rabinovich.
7 | 8 | Going Deeper with Convolutions 9 | https://arxiv.org/abs/1409.4842v1 10 | """ 11 | 12 | import torch 13 | import torch.nn as nn 14 | 15 | class Inception(nn.Module): 16 | def __init__(self, input_channels, n1x1, n3x3_reduce, n3x3, n5x5_reduce, n5x5, pool_proj): 17 | super().__init__() 18 | 19 | #1x1conv branch 20 | self.b1 = nn.Sequential( 21 | nn.Conv2d(input_channels, n1x1, kernel_size=1), 22 | nn.BatchNorm2d(n1x1), 23 | nn.ReLU(inplace=True) 24 | ) 25 | 26 | #1x1conv -> 3x3conv branch 27 | self.b2 = nn.Sequential( 28 | nn.Conv2d(input_channels, n3x3_reduce, kernel_size=1), 29 | nn.BatchNorm2d(n3x3_reduce), 30 | nn.ReLU(inplace=True), 31 | nn.Conv2d(n3x3_reduce, n3x3, kernel_size=3, padding=1), 32 | nn.BatchNorm2d(n3x3), 33 | nn.ReLU(inplace=True) 34 | ) 35 | 36 | #1x1conv -> 5x5conv branch 37 | #we use 2 3x3 conv filters stacked instead 38 | #of one 5x5 filter to obtain the same receptive 39 | #field with fewer parameters 40 | self.b3 = nn.Sequential( 41 | nn.Conv2d(input_channels, n5x5_reduce, kernel_size=1), 42 | nn.BatchNorm2d(n5x5_reduce), 43 | nn.ReLU(inplace=True), 44 | nn.Conv2d(n5x5_reduce, n5x5, kernel_size=3, padding=1), 45 | nn.BatchNorm2d(n5x5), 46 | nn.ReLU(inplace=True), 47 | nn.Conv2d(n5x5, n5x5, kernel_size=3, padding=1), 48 | nn.BatchNorm2d(n5x5), 49 | nn.ReLU(inplace=True) 50 | ) 51 | 52 | #3x3pooling -> 1x1conv 53 | #same conv 54 | self.b4 = nn.Sequential( 55 | nn.MaxPool2d(3, stride=1, padding=1), 56 | nn.Conv2d(input_channels, pool_proj, kernel_size=1), 57 | nn.BatchNorm2d(pool_proj), 58 | nn.ReLU(inplace=True) 59 | ) 60 | 61 | def forward(self, x): 62 | return torch.cat([self.b1(x), self.b2(x), self.b3(x), self.b4(x)], dim=1) 63 | 64 | 65 | class GoogleNet(nn.Module): 66 | 67 | def __init__(self, num_class=100): 68 | super().__init__() 69 | self.prelayer = nn.Sequential( 70 | nn.Conv2d(3, 192, kernel_size=3, padding=1), 71 | nn.BatchNorm2d(192), 72 | nn.ReLU(inplace=True) 73 | ) 74 | 75 | #although we only use 1 conv layer as prelayer, 76 | #we still use the names a3, b3.......
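        # Each Inception block concatenates its four branches, so it outputs
        # n1x1 + n3x3 + n5x5 + pool_proj channels. For a3 below that is
        # 64 + 128 + 32 + 32 = 256, which matches the input_channels of b3.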
77 | self.a3 = Inception(192, 64, 96, 128, 16, 32, 32) 78 | self.b3 = Inception(256, 128, 128, 192, 32, 96, 64) 79 | 80 | #"""In general, an Inception network is a network consisting of 81 | #modules of the above type stacked upon each other, with occasional 82 | #max-pooling layers with stride 2 to halve the resolution of the 83 | #grid""" 84 | self.maxpool = nn.MaxPool2d(3, stride=2, padding=1) 85 | 86 | self.a4 = Inception(480, 192, 96, 208, 16, 48, 64) 87 | self.b4 = Inception(512, 160, 112, 224, 24, 64, 64) 88 | self.c4 = Inception(512, 128, 128, 256, 24, 64, 64) 89 | self.d4 = Inception(512, 112, 144, 288, 32, 64, 64) 90 | self.e4 = Inception(528, 256, 160, 320, 32, 128, 128) 91 | 92 | self.a5 = Inception(832, 256, 160, 320, 32, 128, 128) 93 | self.b5 = Inception(832, 384, 192, 384, 48, 128, 128) 94 | 95 | #input feature size: 8*8*1024 96 | self.avgpool = nn.AdaptiveAvgPool2d((1, 1)) 97 | self.dropout = nn.Dropout2d(p=0.4) 98 | self.linear = nn.Linear(1024, num_class) 99 | 100 | def forward(self, x): 101 | output = self.prelayer(x) 102 | output = self.a3(output) 103 | output = self.b3(output) 104 | 105 | output = self.maxpool(output) 106 | 107 | output = self.a4(output) 108 | output = self.b4(output) 109 | output = self.c4(output) 110 | output = self.d4(output) 111 | output = self.e4(output) 112 | 113 | output = self.maxpool(output) 114 | 115 | output = self.a5(output) 116 | output = self.b5(output) 117 | 118 | #"""It was found that a move from fully connected layers to 119 | #average pooling improved the top-1 accuracy by about 0.6%, 120 | #however the use of dropout remained essential even after 121 | #removing the fully connected layers.""" 122 | output = self.avgpool(output) 123 | output = self.dropout(output) 124 | output = output.view(output.size()[0], -1) 125 | output = self.linear(output) 126 | 127 | return output 128 | 129 | def googlenet(): 130 | return GoogleNet() 131 | 132 | 133 | -------------------------------------------------------------------------------- /1-Imagenet9/models/mobilenet.py: -------------------------------------------------------------------------------- 1 | """mobilenet in pytorch 2 | 3 | 4 | 5 | [1] Andrew G. 
Howard, Menglong Zhu, Bo Chen, Dmitry Kalenichenko, Weijun Wang, Tobias Weyand, Marco Andreetto, Hartwig Adam 6 | 7 | MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications 8 | https://arxiv.org/abs/1704.04861 9 | """ 10 | 11 | import torch 12 | import torch.nn as nn 13 | 14 | 15 | class DepthSeperabelConv2d(nn.Module): 16 | 17 | def __init__(self, input_channels, output_channels, kernel_size, **kwargs): 18 | super().__init__() 19 | self.depthwise = nn.Sequential( 20 | nn.Conv2d( 21 | input_channels, 22 | input_channels, 23 | kernel_size, 24 | groups=input_channels, 25 | **kwargs), 26 | nn.BatchNorm2d(input_channels), 27 | nn.ReLU(inplace=True) 28 | ) 29 | 30 | self.pointwise = nn.Sequential( 31 | nn.Conv2d(input_channels, output_channels, 1), 32 | nn.BatchNorm2d(output_channels), 33 | nn.ReLU(inplace=True) 34 | ) 35 | 36 | def forward(self, x): 37 | x = self.depthwise(x) 38 | x = self.pointwise(x) 39 | 40 | return x 41 | 42 | 43 | class BasicConv2d(nn.Module): 44 | 45 | def __init__(self, input_channels, output_channels, kernel_size, **kwargs): 46 | 47 | super().__init__() 48 | self.conv = nn.Conv2d( 49 | input_channels, output_channels, kernel_size, **kwargs) 50 | self.bn = nn.BatchNorm2d(output_channels) 51 | self.relu = nn.ReLU(inplace=True) 52 | 53 | def forward(self, x): 54 | x = self.conv(x) 55 | x = self.bn(x) 56 | x = self.relu(x) 57 | 58 | return x 59 | 60 | 61 | class MobileNet(nn.Module): 62 | 63 | """ 64 | Args: 65 | width multipler: The role of the width multiplier α is to thin 66 | a network uniformly at each layer. For a given 67 | layer and width multiplier α, the number of 68 | input channels M becomes αM and the number of 69 | output channels N becomes αN. 70 | """ 71 | 72 | def __init__(self, width_multiplier=1, class_num=100): 73 | super().__init__() 74 | 75 | alpha = width_multiplier 76 | self.stem = nn.Sequential( 77 | BasicConv2d(3, int(32 * alpha), 3, padding=1, bias=False), 78 | DepthSeperabelConv2d( 79 | int(32 * alpha), 80 | int(64 * alpha), 81 | 3, 82 | padding=1, 83 | bias=False 84 | ) 85 | ) 86 | 87 | #downsample 88 | self.conv1 = nn.Sequential( 89 | DepthSeperabelConv2d( 90 | int(64 * alpha), 91 | int(128 * alpha), 92 | 3, 93 | stride=2, 94 | padding=1, 95 | bias=False 96 | ), 97 | DepthSeperabelConv2d( 98 | int(128 * alpha), 99 | int(128 * alpha), 100 | 3, 101 | padding=1, 102 | bias=False 103 | ) 104 | ) 105 | 106 | #downsample 107 | self.conv2 = nn.Sequential( 108 | DepthSeperabelConv2d( 109 | int(128 * alpha), 110 | int(256 * alpha), 111 | 3, 112 | stride=2, 113 | padding=1, 114 | bias=False 115 | ), 116 | DepthSeperabelConv2d( 117 | int(256 * alpha), 118 | int(256 * alpha), 119 | 3, 120 | padding=1, 121 | bias=False 122 | ) 123 | ) 124 | 125 | #downsample 126 | self.conv3 = nn.Sequential( 127 | DepthSeperabelConv2d( 128 | int(256 * alpha), 129 | int(512 * alpha), 130 | 3, 131 | stride=2, 132 | padding=1, 133 | bias=False 134 | ), 135 | 136 | DepthSeperabelConv2d( 137 | int(512 * alpha), 138 | int(512 * alpha), 139 | 3, 140 | padding=1, 141 | bias=False 142 | ), 143 | DepthSeperabelConv2d( 144 | int(512 * alpha), 145 | int(512 * alpha), 146 | 3, 147 | padding=1, 148 | bias=False 149 | ), 150 | DepthSeperabelConv2d( 151 | int(512 * alpha), 152 | int(512 * alpha), 153 | 3, 154 | padding=1, 155 | bias=False 156 | ), 157 | DepthSeperabelConv2d( 158 | int(512 * alpha), 159 | int(512 * alpha), 160 | 3, 161 | padding=1, 162 | bias=False 163 | ), 164 | DepthSeperabelConv2d( 165 | int(512 * alpha), 166 | int(512 * alpha), 167 | 3, 168 
| padding=1, 169 | bias=False 170 | ) 171 | ) 172 | 173 | #downsample 174 | self.conv4 = nn.Sequential( 175 | DepthSeperabelConv2d( 176 | int(512 * alpha), 177 | int(1024 * alpha), 178 | 3, 179 | stride=2, 180 | padding=1, 181 | bias=False 182 | ), 183 | DepthSeperabelConv2d( 184 | int(1024 * alpha), 185 | int(1024 * alpha), 186 | 3, 187 | padding=1, 188 | bias=False 189 | ) 190 | ) 191 | 192 | self.fc = nn.Linear(int(1024 * alpha), class_num) 193 | self.avg = nn.AdaptiveAvgPool2d(1) 194 | 195 | def forward(self, x): 196 | x = self.stem(x) 197 | 198 | x = self.conv1(x) 199 | x = self.conv2(x) 200 | x = self.conv3(x) 201 | x = self.conv4(x) 202 | 203 | x = self.avg(x) 204 | x = x.view(x.size(0), -1) 205 | x = self.fc(x) 206 | return x 207 | 208 | 209 | def mobilenet(alpha=1, class_num=100): 210 | return MobileNet(alpha, class_num) 211 | 212 | -------------------------------------------------------------------------------- /1-Imagenet9/models/mobilenetv2.py: -------------------------------------------------------------------------------- 1 | """mobilenetv2 in pytorch 2 | 3 | 4 | 5 | [1] Mark Sandler, Andrew Howard, Menglong Zhu, Andrey Zhmoginov, Liang-Chieh Chen 6 | 7 | MobileNetV2: Inverted Residuals and Linear Bottlenecks 8 | https://arxiv.org/abs/1801.04381 9 | """ 10 | 11 | import torch 12 | import torch.nn as nn 13 | import torch.nn.functional as F 14 | 15 | 16 | class LinearBottleNeck(nn.Module): 17 | 18 | def __init__(self, in_channels, out_channels, stride, t=6, class_num=100): 19 | super().__init__() 20 | 21 | self.residual = nn.Sequential( 22 | nn.Conv2d(in_channels, in_channels * t, 1), 23 | nn.BatchNorm2d(in_channels * t), 24 | nn.ReLU6(inplace=True), 25 | 26 | nn.Conv2d(in_channels * t, in_channels * t, 3, stride=stride, padding=1, groups=in_channels * t), 27 | nn.BatchNorm2d(in_channels * t), 28 | nn.ReLU6(inplace=True), 29 | 30 | nn.Conv2d(in_channels * t, out_channels, 1), 31 | nn.BatchNorm2d(out_channels) 32 | ) 33 | 34 | self.stride = stride 35 | self.in_channels = in_channels 36 | self.out_channels = out_channels 37 | 38 | def forward(self, x): 39 | 40 | residual = self.residual(x) 41 | 42 | if self.stride == 1 and self.in_channels == self.out_channels: 43 | residual += x 44 | 45 | return residual 46 | 47 | class MobileNetV2(nn.Module): 48 | 49 | def __init__(self, class_num=100): 50 | super().__init__() 51 | 52 | self.pre = nn.Sequential( 53 | nn.Conv2d(3, 32, 1, padding=1), 54 | nn.BatchNorm2d(32), 55 | nn.ReLU6(inplace=True) 56 | ) 57 | 58 | self.stage1 = LinearBottleNeck(32, 16, 1, 1) 59 | self.stage2 = self._make_stage(2, 16, 24, 2, 6) 60 | self.stage3 = self._make_stage(3, 24, 32, 2, 6) 61 | self.stage4 = self._make_stage(4, 32, 64, 2, 6) 62 | self.stage5 = self._make_stage(3, 64, 96, 1, 6) 63 | self.stage6 = self._make_stage(3, 96, 160, 1, 6) 64 | self.stage7 = LinearBottleNeck(160, 320, 1, 6) 65 | 66 | self.conv1 = nn.Sequential( 67 | nn.Conv2d(320, 1280, 1), 68 | nn.BatchNorm2d(1280), 69 | nn.ReLU6(inplace=True) 70 | ) 71 | 72 | self.conv2 = nn.Conv2d(1280, class_num, 1) 73 | 74 | def forward(self, x): 75 | x = self.pre(x) 76 | x = self.stage1(x) 77 | x = self.stage2(x) 78 | x = self.stage3(x) 79 | x = self.stage4(x) 80 | x = self.stage5(x) 81 | x = self.stage6(x) 82 | x = self.stage7(x) 83 | x = self.conv1(x) 84 | x = F.adaptive_avg_pool2d(x, 1) 85 | x = self.conv2(x) 86 | x = x.view(x.size(0), -1) 87 | 88 | return x 89 | 90 | def _make_stage(self, repeat, in_channels, out_channels, stride, t): 91 | 92 | layers = [] 93 | 
layers.append(LinearBottleNeck(in_channels, out_channels, stride, t)) 94 | 95 | while repeat - 1: 96 | layers.append(LinearBottleNeck(out_channels, out_channels, 1, t)) 97 | repeat -= 1 98 | 99 | return nn.Sequential(*layers) 100 | 101 | def mobilenetv2(): 102 | return MobileNetV2() -------------------------------------------------------------------------------- /1-Imagenet9/models/preactresnet.py: -------------------------------------------------------------------------------- 1 | """preactresnet in pytorch 2 | 3 | [1] Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun 4 | 5 | Identity Mappings in Deep Residual Networks 6 | https://arxiv.org/abs/1603.05027 7 | """ 8 | 9 | import torch 10 | import torch.nn as nn 11 | import torch.nn.functional as F 12 | 13 | class PreActBasic(nn.Module): 14 | 15 | expansion = 1 16 | def __init__(self, in_channels, out_channels, stride): 17 | super().__init__() 18 | self.residual = nn.Sequential( 19 | nn.BatchNorm2d(in_channels), 20 | nn.ReLU(inplace=True), 21 | nn.Conv2d(in_channels, out_channels, kernel_size=3, stride=stride, padding=1), 22 | nn.BatchNorm2d(out_channels), 23 | nn.ReLU(inplace=True), 24 | nn.Conv2d(out_channels, out_channels * PreActBasic.expansion, kernel_size=3, padding=1) 25 | ) 26 | 27 | self.shortcut = nn.Sequential() 28 | if stride != 1 or in_channels != out_channels * PreActBasic.expansion: 29 | self.shortcut = nn.Conv2d(in_channels, out_channels * PreActBasic.expansion, 1, stride=stride) 30 | 31 | def forward(self, x): 32 | 33 | res = self.residual(x) 34 | shortcut = self.shortcut(x) 35 | 36 | return res + shortcut 37 | 38 | 39 | class PreActBottleNeck(nn.Module): 40 | 41 | expansion = 4 42 | def __init__(self, in_channels, out_channels, stride): 43 | super().__init__() 44 | 45 | self.residual = nn.Sequential( 46 | nn.BatchNorm2d(in_channels), 47 | nn.ReLU(inplace=True), 48 | nn.Conv2d(in_channels, out_channels, 1, stride=stride), 49 | 50 | nn.BatchNorm2d(out_channels), 51 | nn.ReLU(inplace=True), 52 | nn.Conv2d(out_channels, out_channels, 3, padding=1), 53 | 54 | nn.BatchNorm2d(out_channels), 55 | nn.ReLU(inplace=True), 56 | nn.Conv2d(out_channels, out_channels * PreActBottleNeck.expansion, 1) 57 | ) 58 | 59 | self.shortcut = nn.Sequential() 60 | 61 | if stride != 1 or in_channels != out_channels * PreActBottleNeck.expansion: 62 | self.shortcut = nn.Conv2d(in_channels, out_channels * PreActBottleNeck.expansion, 1, stride=stride) 63 | 64 | def forward(self, x): 65 | 66 | res = self.residual(x) 67 | shortcut = self.shortcut(x) 68 | 69 | return res + shortcut 70 | 71 | class PreActResNet(nn.Module): 72 | 73 | def __init__(self, block, num_block, class_num=100): 74 | super().__init__() 75 | self.input_channels = 64 76 | 77 | self.pre = nn.Sequential( 78 | nn.Conv2d(3, 64, 3, padding=1), 79 | nn.BatchNorm2d(64), 80 | nn.ReLU(inplace=True) 81 | ) 82 | 83 | self.stage1 = self._make_layers(block, num_block[0], 64, 1) 84 | self.stage2 = self._make_layers(block, num_block[1], 128, 2) 85 | self.stage3 = self._make_layers(block, num_block[2], 256, 2) 86 | self.stage4 = self._make_layers(block, num_block[3], 512, 2) 87 | 88 | self.linear = nn.Linear(self.input_channels, class_num) 89 | 90 | def _make_layers(self, block, block_num, out_channels, stride): 91 | layers = [] 92 | 93 | layers.append(block(self.input_channels, out_channels, stride)) 94 | self.input_channels = out_channels * block.expansion 95 | 96 | while block_num - 1: 97 | layers.append(block(self.input_channels, out_channels, 1)) 98 | self.input_channels = out_channels * 
block.expansion 99 | block_num -= 1 100 | 101 | return nn.Sequential(*layers) 102 | 103 | def forward(self, x): 104 | x = self.pre(x) 105 | 106 | x = self.stage1(x) 107 | x = self.stage2(x) 108 | x = self.stage3(x) 109 | x = self.stage4(x) 110 | 111 | x = F.adaptive_avg_pool2d(x, 1) 112 | x = x.view(x.size(0), -1) 113 | x = self.linear(x) 114 | 115 | return x 116 | 117 | def preactresnet18(): 118 | return PreActResNet(PreActBasic, [2, 2, 2, 2]) 119 | 120 | def preactresnet34(): 121 | return PreActResNet(PreActBasic, [3, 4, 6, 3]) 122 | 123 | def preactresnet50(): 124 | return PreActResNet(PreActBottleNeck, [3, 4, 6, 3]) 125 | 126 | def preactresnet101(): 127 | return PreActResNet(PreActBottleNeck, [3, 4, 23, 3]) 128 | 129 | def preactresnet152(): 130 | return PreActResNet(PreActBottleNeck, [3, 8, 36, 3]) 131 | 132 | -------------------------------------------------------------------------------- /1-Imagenet9/models/resnet.py: -------------------------------------------------------------------------------- 1 | """resnet in pytorch 2 | 3 | 4 | 5 | [1] Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun. 6 | 7 | Deep Residual Learning for Image Recognition 8 | https://arxiv.org/abs/1512.03385v1 9 | """ 10 | 11 | import torch 12 | import torch.nn as nn 13 | 14 | class BasicBlock(nn.Module): 15 | """Basic Block for resnet 18 and resnet 34 16 | 17 | """ 18 | 19 | #BasicBlock and BottleNeck block 20 | #have different output size 21 | #we use class attribute expansion 22 | #to distinct 23 | expansion = 1 24 | 25 | def __init__(self, in_channels, out_channels, stride=1): 26 | super().__init__() 27 | 28 | #residual function 29 | self.residual_function = nn.Sequential( 30 | nn.Conv2d(in_channels, out_channels, kernel_size=3, stride=stride, padding=1, bias=False), 31 | nn.BatchNorm2d(out_channels), 32 | nn.ReLU(inplace=True), 33 | nn.Conv2d(out_channels, out_channels * BasicBlock.expansion, kernel_size=3, padding=1, bias=False), 34 | nn.BatchNorm2d(out_channels * BasicBlock.expansion) 35 | ) 36 | 37 | #shortcut 38 | self.shortcut = nn.Sequential() 39 | 40 | #the shortcut output dimension is not the same with residual function 41 | #use 1*1 convolution to match the dimension 42 | if stride != 1 or in_channels != BasicBlock.expansion * out_channels: 43 | self.shortcut = nn.Sequential( 44 | nn.Conv2d(in_channels, out_channels * BasicBlock.expansion, kernel_size=1, stride=stride, bias=False), 45 | nn.BatchNorm2d(out_channels * BasicBlock.expansion) 46 | ) 47 | 48 | def forward(self, x): 49 | return nn.ReLU(inplace=True)(self.residual_function(x) + self.shortcut(x)) 50 | 51 | class BottleNeck(nn.Module): 52 | """Residual block for resnet over 50 layers 53 | 54 | """ 55 | expansion = 4 56 | def __init__(self, in_channels, out_channels, stride=1): 57 | super().__init__() 58 | self.residual_function = nn.Sequential( 59 | nn.Conv2d(in_channels, out_channels, kernel_size=1, bias=False), 60 | nn.BatchNorm2d(out_channels), 61 | nn.ReLU(inplace=True), 62 | nn.Conv2d(out_channels, out_channels, stride=stride, kernel_size=3, padding=1, bias=False), 63 | nn.BatchNorm2d(out_channels), 64 | nn.ReLU(inplace=True), 65 | nn.Conv2d(out_channels, out_channels * BottleNeck.expansion, kernel_size=1, bias=False), 66 | nn.BatchNorm2d(out_channels * BottleNeck.expansion), 67 | ) 68 | 69 | self.shortcut = nn.Sequential() 70 | 71 | if stride != 1 or in_channels != out_channels * BottleNeck.expansion: 72 | self.shortcut = nn.Sequential( 73 | nn.Conv2d(in_channels, out_channels * BottleNeck.expansion, stride=stride, kernel_size=1, 
bias=False), 74 | nn.BatchNorm2d(out_channels * BottleNeck.expansion) 75 | ) 76 | 77 | def forward(self, x): 78 | return nn.ReLU(inplace=True)(self.residual_function(x) + self.shortcut(x)) 79 | 80 | class ResNet(nn.Module): 81 | 82 | def __init__(self, block, num_block, num_classes=100): 83 | super().__init__() 84 | 85 | self.in_channels = 64 86 | 87 | self.conv1 = nn.Sequential( 88 | nn.Conv2d(3, 64, kernel_size=3, padding=1, bias=False), 89 | nn.BatchNorm2d(64), 90 | nn.ReLU(inplace=True)) 91 | #we use a different inputsize than the original paper 92 | #so conv2_x's stride is 1 93 | self.conv2_x = self._make_layer(block, 64, num_block[0], 1) 94 | self.conv3_x = self._make_layer(block, 128, num_block[1], 2) 95 | self.conv4_x = self._make_layer(block, 256, num_block[2], 2) 96 | self.conv5_x = self._make_layer(block, 512, num_block[3], 2) 97 | self.avg_pool = nn.AdaptiveAvgPool2d((1, 1)) 98 | self.fc = nn.Linear(512 * block.expansion, num_classes) 99 | 100 | def _make_layer(self, block, out_channels, num_blocks, stride): 101 | """make resnet layers(by layer i didnt mean this 'layer' was the 102 | same as a neuron netowork layer, ex. conv layer), one layer may 103 | contain more than one residual block 104 | 105 | Args: 106 | block: block type, basic block or bottle neck block 107 | out_channels: output depth channel number of this layer 108 | num_blocks: how many blocks per layer 109 | stride: the stride of the first block of this layer 110 | 111 | Return: 112 | return a resnet layer 113 | """ 114 | 115 | # we have num_block blocks per layer, the first block 116 | # could be 1 or 2, other blocks would always be 1 117 | strides = [stride] + [1] * (num_blocks - 1) 118 | layers = [] 119 | for stride in strides: 120 | layers.append(block(self.in_channels, out_channels, stride)) 121 | self.in_channels = out_channels * block.expansion 122 | 123 | return nn.Sequential(*layers) 124 | 125 | def forward(self, x): 126 | output = self.conv1(x) 127 | output = self.conv2_x(output) 128 | output = self.conv3_x(output) 129 | output = self.conv4_x(output) 130 | output = self.conv5_x(output) 131 | output = self.avg_pool(output) 132 | output = output.view(output.size(0), -1) 133 | output = self.fc(output) 134 | 135 | return output 136 | 137 | def resnet18(): 138 | """ return a ResNet 18 object 139 | """ 140 | return ResNet(BasicBlock, [2, 2, 2, 2]) 141 | 142 | def resnet34(): 143 | """ return a ResNet 34 object 144 | """ 145 | return ResNet(BasicBlock, [3, 4, 6, 3]) 146 | 147 | def resnet50(): 148 | """ return a ResNet 50 object 149 | """ 150 | return ResNet(BottleNeck, [3, 4, 6, 3]) 151 | 152 | def resnet101(): 153 | """ return a ResNet 101 object 154 | """ 155 | return ResNet(BottleNeck, [3, 4, 23, 3]) 156 | 157 | def resnet152(): 158 | """ return a ResNet 152 object 159 | """ 160 | return ResNet(BottleNeck, [3, 8, 36, 3]) 161 | 162 | 163 | 164 | -------------------------------------------------------------------------------- /1-Imagenet9/models/resnet_nonlocal.py: -------------------------------------------------------------------------------- 1 | ''' 2 | Non-Local ResNet2D-50 for CIFAR-10 dataset. 3 | Most of the code is borrowed from https://github.com/akamaster/pytorch_resnet_cifar10 4 | Properly implemented ResNet-s for CIFAR10 as described in paper [1]. 5 | The implementation and structure of this file is hugely influenced by [2] 6 | which is implemented for ImageNet and doesn't have option A for identity. 
7 | Moreover, most of the implementations on the web is copy-paste from 8 | torchvision's resnet and has wrong number of params. 9 | Reference: 10 | [1] Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun 11 | Deep Residual Learning for Image Recognition. arXiv:1512.03385 12 | [2] https://github.com/pytorch/vision/blob/master/torchvision/models/resnet.py 13 | ''' 14 | import torch 15 | import torch.nn as nn 16 | import torch.nn.functional as F 17 | import torch.nn.init as init 18 | 19 | from torch.autograd import Variable 20 | from models.non_local import NLBlockND 21 | 22 | 23 | def _weights_init(m): 24 | if isinstance(m, nn.Linear) or isinstance(m, nn.Conv2d): 25 | init.kaiming_normal_(m.weight) 26 | 27 | 28 | class LambdaLayer(nn.Module): 29 | def __init__(self, lambd): 30 | super(LambdaLayer, self).__init__() 31 | self.lambd = lambd 32 | 33 | def forward(self, x): 34 | return self.lambd(x) 35 | 36 | 37 | class BasicBlock(nn.Module): 38 | expansion = 1 39 | 40 | def __init__(self, in_planes, planes, stride=1, option='A'): 41 | super(BasicBlock, self).__init__() 42 | self.conv1 = nn.Conv2d(in_planes, planes, kernel_size=3, stride=stride, padding=1, bias=False) 43 | self.bn1 = nn.BatchNorm2d(planes) 44 | self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=1, padding=1, bias=False) 45 | self.bn2 = nn.BatchNorm2d(planes) 46 | 47 | self.shortcut = nn.Sequential() 48 | if stride != 1 or in_planes != planes: 49 | if option == 'A': 50 | """ 51 | For CIFAR10 ResNet paper uses option A. 52 | """ 53 | self.shortcut = LambdaLayer(lambda x: 54 | F.pad(x[:, :, ::2, ::2], (0, 0, 0, 0, planes // 4, planes // 4), "constant", 55 | 0)) 56 | elif option == 'B': 57 | self.shortcut = nn.Sequential( 58 | nn.Conv2d(in_planes, self.expansion * planes, kernel_size=1, stride=stride, bias=False), 59 | nn.BatchNorm2d(self.expansion * planes) 60 | ) 61 | 62 | def forward(self, x): 63 | out = F.relu(self.bn1(self.conv1(x))) 64 | out = self.bn2(self.conv2(out)) 65 | out += self.shortcut(x) 66 | out = F.relu(out) 67 | return out 68 | 69 | 70 | class ResNet2D(nn.Module): 71 | def __init__(self, block, num_blocks, num_classes=10, non_local=False): 72 | super(ResNet2D, self).__init__() 73 | self.in_planes = 16 74 | 75 | self.conv1 = nn.Conv2d(3, 16, kernel_size=3, stride=1, padding=1, bias=False) 76 | self.bn1 = nn.BatchNorm2d(16) 77 | self.layer1 = self._make_layer(block, 16, num_blocks[0], stride=1) 78 | 79 | # add non-local block after layer 2 80 | self.layer2 = self._make_layer(block, 32, num_blocks[1], stride=2, non_local=non_local) 81 | self.layer3 = self._make_layer(block, 64, num_blocks[2], stride=2) 82 | self.linear = nn.Linear(64, num_classes) 83 | 84 | self.apply(_weights_init) 85 | 86 | def _make_layer(self, block, planes, num_blocks, stride, non_local=False): 87 | strides = [stride] + [1] * (num_blocks - 1) 88 | layers = [] 89 | 90 | last_idx = len(strides) 91 | if non_local: 92 | last_idx = len(strides) - 1 93 | 94 | for i in range(last_idx): 95 | layers.append(block(self.in_planes, planes, strides[i])) 96 | self.in_planes = planes * block.expansion 97 | 98 | if non_local: 99 | layers.append(NLBlockND(in_channels=planes, dimension=2)) 100 | layers.append(block(self.in_planes, planes, strides[-1])) 101 | 102 | return nn.Sequential(*layers) 103 | 104 | def forward(self, x): 105 | out = F.relu(self.bn1(self.conv1(x))) 106 | out = self.layer1(out) 107 | out = self.layer2(out) 108 | out = self.layer3(out) 109 | out = F.avg_pool2d(out, out.size()[3]) 110 | out = out.view(out.size(0), -1) 111 | out = 
self.linear(out) 112 | return out 113 | 114 | 115 | def resnet2D56(non_local=False, **kwargs): 116 | """Constructs a ResNet-56 model. 117 | """ 118 | return ResNet2D(BasicBlock, [9, 9, 9], non_local=non_local, **kwargs) -------------------------------------------------------------------------------- /1-Imagenet9/models/resnext.py: -------------------------------------------------------------------------------- 1 | """resnext in pytorch 2 | 3 | 4 | 5 | [1] Saining Xie, Ross Girshick, Piotr Dollár, Zhuowen Tu, Kaiming He. 6 | 7 | Aggregated Residual Transformations for Deep Neural Networks 8 | https://arxiv.org/abs/1611.05431 9 | """ 10 | 11 | import math 12 | import torch 13 | import torch.nn as nn 14 | import torch.nn.functional as F 15 | 16 | #only implements ResNext bottleneck c 17 | 18 | 19 | #"""This strategy exposes a new dimension, which we call “cardinality” 20 | #(the size of the set of transformations), as an essential factor 21 | #in addition to the dimensions of depth and width.""" 22 | CARDINALITY = 32 23 | DEPTH = 4 24 | BASEWIDTH = 64 25 | 26 | #"""The grouped convolutional layer in Fig. 3(c) performs 32 groups 27 | #of convolutions whose input and output channels are 4-dimensional. 28 | #The grouped convolutional layer concatenates them as the outputs 29 | #of the layer.""" 30 | 31 | class ResNextBottleNeckC(nn.Module): 32 | 33 | def __init__(self, in_channels, out_channels, stride): 34 | super().__init__() 35 | 36 | C = CARDINALITY #How many groups a feature map was splitted into 37 | 38 | #"""We note that the input/output width of the template is fixed as 39 | #256-d (Fig. 3), We note that the input/output width of the template 40 | #is fixed as 256-d (Fig. 3), and all widths are dou- bled each time 41 | #when the feature map is subsampled (see Table 1).""" 42 | D = int(DEPTH * out_channels / BASEWIDTH) #number of channels per group 43 | self.split_transforms = nn.Sequential( 44 | nn.Conv2d(in_channels, C * D, kernel_size=1, groups=C, bias=False), 45 | nn.BatchNorm2d(C * D), 46 | nn.ReLU(inplace=True), 47 | nn.Conv2d(C * D, C * D, kernel_size=3, stride=stride, groups=C, padding=1, bias=False), 48 | nn.BatchNorm2d(C * D), 49 | nn.ReLU(inplace=True), 50 | nn.Conv2d(C * D, out_channels * 4, kernel_size=1, bias=False), 51 | nn.BatchNorm2d(out_channels * 4), 52 | ) 53 | 54 | self.shortcut = nn.Sequential() 55 | 56 | if stride != 1 or in_channels != out_channels * 4: 57 | self.shortcut = nn.Sequential( 58 | nn.Conv2d(in_channels, out_channels * 4, stride=stride, kernel_size=1, bias=False), 59 | nn.BatchNorm2d(out_channels * 4) 60 | ) 61 | 62 | def forward(self, x): 63 | return F.relu(self.split_transforms(x) + self.shortcut(x)) 64 | 65 | class ResNext(nn.Module): 66 | 67 | def __init__(self, block, num_blocks, class_names=100): 68 | super().__init__() 69 | self.in_channels = 64 70 | 71 | self.conv1 = nn.Sequential( 72 | nn.Conv2d(3, 64, 3, stride=1, padding=1, bias=False), 73 | nn.BatchNorm2d(64), 74 | nn.ReLU(inplace=True) 75 | ) 76 | 77 | self.conv2 = self._make_layer(block, num_blocks[0], 64, 1) 78 | self.conv3 = self._make_layer(block, num_blocks[1], 128, 2) 79 | self.conv4 = self._make_layer(block, num_blocks[2], 256, 2) 80 | self.conv5 = self._make_layer(block, num_blocks[3], 512, 2) 81 | self.avg = nn.AdaptiveAvgPool2d((1, 1)) 82 | self.fc = nn.Linear(512 * 4, 100) 83 | 84 | def forward(self, x): 85 | x = self.conv1(x) 86 | x = self.conv2(x) 87 | x = self.conv3(x) 88 | x = self.conv4(x) 89 | x = self.conv5(x) 90 | x = self.avg(x) 91 | x = x.view(x.size(0), -1) 92 | x = 
class ResNext(nn.Module):

    def __init__(self, block, num_blocks, class_names=100):
        super().__init__()
        self.in_channels = 64

        self.conv1 = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=1, padding=1, bias=False),
            nn.BatchNorm2d(64),
            nn.ReLU(inplace=True)
        )

        self.conv2 = self._make_layer(block, num_blocks[0], 64, 1)
        self.conv3 = self._make_layer(block, num_blocks[1], 128, 2)
        self.conv4 = self._make_layer(block, num_blocks[2], 256, 2)
        self.conv5 = self._make_layer(block, num_blocks[3], 512, 2)
        self.avg = nn.AdaptiveAvgPool2d((1, 1))
        self.fc = nn.Linear(512 * 4, class_names)

    def forward(self, x):
        x = self.conv1(x)
        x = self.conv2(x)
        x = self.conv3(x)
        x = self.conv4(x)
        x = self.conv5(x)
        x = self.avg(x)
        x = x.view(x.size(0), -1)
        x = self.fc(x)
        return x

    def _make_layer(self, block, num_block, out_channels, stride):
        """Build a resnext layer
        Args:
            block: block type (default: resnext bottleneck C)
            num_block: number of blocks per layer
            out_channels: output channels per block
            stride: block stride

        Returns:
            a resnext layer
        """
        strides = [stride] + [1] * (num_block - 1)
        layers = []
        for stride in strides:
            layers.append(block(self.in_channels, out_channels, stride))
            self.in_channels = out_channels * 4

        return nn.Sequential(*layers)

def resnext50():
    """ return a resnext50(c32x4d) network
    """
    return ResNext(ResNextBottleNeckC, [3, 4, 6, 3])

def resnext101():
    """ return a resnext101(c32x4d) network
    """
    return ResNext(ResNextBottleNeckC, [3, 4, 23, 3])

def resnext152():
    """ return a resnext152(c32x4d) network
    """
    return ResNext(ResNextBottleNeckC, [3, 4, 36, 3])
--------------------------------------------------------------------------------
/1-Imagenet9/models/senet.py:
--------------------------------------------------------------------------------
"""senet in pytorch



[1] Jie Hu, Li Shen, Samuel Albanie, Gang Sun, Enhua Wu

    Squeeze-and-Excitation Networks
    https://arxiv.org/abs/1709.01507
"""

import torch
import torch.nn as nn
import torch.nn.functional as F

class BasicResidualSEBlock(nn.Module):

    expansion = 1

    def __init__(self, in_channels, out_channels, stride, r=16):
        super().__init__()

        self.residual = nn.Sequential(
            nn.Conv2d(in_channels, out_channels, 3, stride=stride, padding=1),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(inplace=True),

            nn.Conv2d(out_channels, out_channels * self.expansion, 3, padding=1),
            nn.BatchNorm2d(out_channels * self.expansion),
            nn.ReLU(inplace=True)
        )

        self.shortcut = nn.Sequential()
        if stride != 1 or in_channels != out_channels * self.expansion:
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_channels, out_channels * self.expansion, 1, stride=stride),
                nn.BatchNorm2d(out_channels * self.expansion)
            )

        self.squeeze = nn.AdaptiveAvgPool2d(1)
        self.excitation = nn.Sequential(
            nn.Linear(out_channels * self.expansion, out_channels * self.expansion // r),
            nn.ReLU(inplace=True),
            nn.Linear(out_channels * self.expansion // r, out_channels * self.expansion),
            nn.Sigmoid()
        )

    def forward(self, x):
        shortcut = self.shortcut(x)
        residual = self.residual(x)

        squeeze = self.squeeze(residual)
        squeeze = squeeze.view(squeeze.size(0), -1)
        excitation = self.excitation(squeeze)
        excitation = excitation.view(residual.size(0), residual.size(1), 1, 1)

        x = residual * excitation.expand_as(residual) + shortcut

        return F.relu(x)
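
# SE recalibration sketch (illustrative): squeeze pools (B, C, H, W) down to
# (B, C), excitation maps that to per-channel weights in (0, 1), and the
# weights rescale the residual before the shortcut is added:
#
#   blk = BasicResidualSEBlock(64, 64, stride=1)
#   y = blk(torch.randn(2, 64, 8, 8))   # y.shape == torch.Size([2, 64, 8, 8])
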
class BottleneckResidualSEBlock(nn.Module):

    expansion = 4

    def __init__(self, in_channels, out_channels, stride, r=16):
        super().__init__()

        self.residual = nn.Sequential(
            nn.Conv2d(in_channels, out_channels, 1),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(inplace=True),

            nn.Conv2d(out_channels, out_channels, 3, stride=stride, padding=1),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(inplace=True),

            nn.Conv2d(out_channels, out_channels * self.expansion, 1),
            nn.BatchNorm2d(out_channels * self.expansion),
            nn.ReLU(inplace=True)
        )

        self.squeeze = nn.AdaptiveAvgPool2d(1)
        self.excitation = nn.Sequential(
            nn.Linear(out_channels * self.expansion, out_channels * self.expansion // r),
            nn.ReLU(inplace=True),
            nn.Linear(out_channels * self.expansion // r, out_channels * self.expansion),
            nn.Sigmoid()
        )

        self.shortcut = nn.Sequential()
        if stride != 1 or in_channels != out_channels * self.expansion:
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_channels, out_channels * self.expansion, 1, stride=stride),
                nn.BatchNorm2d(out_channels * self.expansion)
            )

    def forward(self, x):

        shortcut = self.shortcut(x)

        residual = self.residual(x)
        squeeze = self.squeeze(residual)
        squeeze = squeeze.view(squeeze.size(0), -1)
        excitation = self.excitation(squeeze)
        excitation = excitation.view(residual.size(0), residual.size(1), 1, 1)

        x = residual * excitation.expand_as(residual) + shortcut

        return F.relu(x)

class SEResNet(nn.Module):

    def __init__(self, block, block_num, class_num=100):
        super().__init__()

        self.in_channels = 64

        self.pre = nn.Sequential(
            nn.Conv2d(3, 64, 3, padding=1),
            nn.BatchNorm2d(64),
            nn.ReLU(inplace=True)
        )

        self.stage1 = self._make_stage(block, block_num[0], 64, 1)
        self.stage2 = self._make_stage(block, block_num[1], 128, 2)
        self.stage3 = self._make_stage(block, block_num[2], 256, 2)
        self.stage4 = self._make_stage(block, block_num[3], 512, 2)

        self.linear = nn.Linear(self.in_channels, class_num)

    def forward(self, x):
        x = self.pre(x)

        x = self.stage1(x)
        x = self.stage2(x)
        x = self.stage3(x)
        x = self.stage4(x)

        x = F.adaptive_avg_pool2d(x, 1)
        x = x.view(x.size(0), -1)

        x = self.linear(x)

        return x

    def _make_stage(self, block, num, out_channels, stride):

        layers = []
        layers.append(block(self.in_channels, out_channels, stride))
        self.in_channels = out_channels * block.expansion

        for _ in range(num - 1):
            layers.append(block(self.in_channels, out_channels, 1))

        return nn.Sequential(*layers)

def seresnet18():
    return SEResNet(BasicResidualSEBlock, [2, 2, 2, 2])

def seresnet34():
    return SEResNet(BasicResidualSEBlock, [3, 4, 6, 3])

def seresnet50():
    return SEResNet(BottleneckResidualSEBlock, [3, 4, 6, 3])

def seresnet101():
    return SEResNet(BottleneckResidualSEBlock, [3, 4, 23, 3])

def seresnet152():
    return SEResNet(BottleneckResidualSEBlock, [3, 8, 36, 3])
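
# Usage sketch (illustrative): the 3x3 stem keeps the 32x32 resolution, and
# adaptive pooling lets any input size reach the 100-way classifier:
#
#   net = seresnet18()
#   y = net(torch.randn(2, 3, 32, 32))   # y.shape == torch.Size([2, 100])
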
""" 22 | assert x.size(1) == split * 2 23 | return torch.split(x, split, dim=1) 24 | 25 | def channel_shuffle(x, groups): 26 | """channel shuffle operation 27 | Args: 28 | x: input tensor 29 | groups: input branch number 30 | """ 31 | 32 | batch_size, channels, height, width = x.size() 33 | channels_per_group = int(channels / groups) 34 | 35 | x = x.view(batch_size, groups, channels_per_group, height, width) 36 | x = x.transpose(1, 2).contiguous() 37 | x = x.view(batch_size, -1, height, width) 38 | 39 | return x 40 | 41 | class ShuffleUnit(nn.Module): 42 | 43 | def __init__(self, in_channels, out_channels, stride): 44 | super().__init__() 45 | 46 | self.stride = stride 47 | self.in_channels = in_channels 48 | self.out_channels = out_channels 49 | 50 | if stride != 1 or in_channels != out_channels: 51 | self.residual = nn.Sequential( 52 | nn.Conv2d(in_channels, in_channels, 1), 53 | nn.BatchNorm2d(in_channels), 54 | nn.ReLU(inplace=True), 55 | nn.Conv2d(in_channels, in_channels, 3, stride=stride, padding=1, groups=in_channels), 56 | nn.BatchNorm2d(in_channels), 57 | nn.Conv2d(in_channels, int(out_channels / 2), 1), 58 | nn.BatchNorm2d(int(out_channels / 2)), 59 | nn.ReLU(inplace=True) 60 | ) 61 | 62 | self.shortcut = nn.Sequential( 63 | nn.Conv2d(in_channels, in_channels, 3, stride=stride, padding=1, groups=in_channels), 64 | nn.BatchNorm2d(in_channels), 65 | nn.Conv2d(in_channels, int(out_channels / 2), 1), 66 | nn.BatchNorm2d(int(out_channels / 2)), 67 | nn.ReLU(inplace=True) 68 | ) 69 | else: 70 | self.shortcut = nn.Sequential() 71 | 72 | in_channels = int(in_channels / 2) 73 | self.residual = nn.Sequential( 74 | nn.Conv2d(in_channels, in_channels, 1), 75 | nn.BatchNorm2d(in_channels), 76 | nn.ReLU(inplace=True), 77 | nn.Conv2d(in_channels, in_channels, 3, stride=stride, padding=1, groups=in_channels), 78 | nn.BatchNorm2d(in_channels), 79 | nn.Conv2d(in_channels, in_channels, 1), 80 | nn.BatchNorm2d(in_channels), 81 | nn.ReLU(inplace=True) 82 | ) 83 | 84 | 85 | def forward(self, x): 86 | 87 | if self.stride == 1 and self.out_channels == self.in_channels: 88 | shortcut, residual = channel_split(x, int(self.in_channels / 2)) 89 | else: 90 | shortcut = x 91 | residual = x 92 | 93 | shortcut = self.shortcut(shortcut) 94 | residual = self.residual(residual) 95 | x = torch.cat([shortcut, residual], dim=1) 96 | x = channel_shuffle(x, 2) 97 | 98 | return x 99 | 100 | class ShuffleNetV2(nn.Module): 101 | 102 | def __init__(self, ratio=1, class_num=100): 103 | super().__init__() 104 | if ratio == 0.5: 105 | out_channels = [48, 96, 192, 1024] 106 | elif ratio == 1: 107 | out_channels = [116, 232, 464, 1024] 108 | elif ratio == 1.5: 109 | out_channels = [176, 352, 704, 1024] 110 | elif ratio == 2: 111 | out_channels = [244, 488, 976, 2048] 112 | else: 113 | ValueError('unsupported ratio number') 114 | 115 | self.pre = nn.Sequential( 116 | nn.Conv2d(3, 24, 3, padding=1), 117 | nn.BatchNorm2d(24) 118 | ) 119 | 120 | self.stage2 = self._make_stage(24, out_channels[0], 3) 121 | self.stage3 = self._make_stage(out_channels[0], out_channels[1], 7) 122 | self.stage4 = self._make_stage(out_channels[1], out_channels[2], 3) 123 | self.conv5 = nn.Sequential( 124 | nn.Conv2d(out_channels[2], out_channels[3], 1), 125 | nn.BatchNorm2d(out_channels[3]), 126 | nn.ReLU(inplace=True) 127 | ) 128 | 129 | self.fc = nn.Linear(out_channels[3], class_num) 130 | 131 | def forward(self, x): 132 | x = self.pre(x) 133 | x = self.stage2(x) 134 | x = self.stage3(x) 135 | x = self.stage4(x) 136 | x = self.conv5(x) 137 | x = 
class ShuffleUnit(nn.Module):

    def __init__(self, in_channels, out_channels, stride):
        super().__init__()

        self.stride = stride
        self.in_channels = in_channels
        self.out_channels = out_channels

        if stride != 1 or in_channels != out_channels:
            self.residual = nn.Sequential(
                nn.Conv2d(in_channels, in_channels, 1),
                nn.BatchNorm2d(in_channels),
                nn.ReLU(inplace=True),
                nn.Conv2d(in_channels, in_channels, 3, stride=stride, padding=1, groups=in_channels),
                nn.BatchNorm2d(in_channels),
                nn.Conv2d(in_channels, int(out_channels / 2), 1),
                nn.BatchNorm2d(int(out_channels / 2)),
                nn.ReLU(inplace=True)
            )

            self.shortcut = nn.Sequential(
                nn.Conv2d(in_channels, in_channels, 3, stride=stride, padding=1, groups=in_channels),
                nn.BatchNorm2d(in_channels),
                nn.Conv2d(in_channels, int(out_channels / 2), 1),
                nn.BatchNorm2d(int(out_channels / 2)),
                nn.ReLU(inplace=True)
            )
        else:
            self.shortcut = nn.Sequential()

            in_channels = int(in_channels / 2)
            self.residual = nn.Sequential(
                nn.Conv2d(in_channels, in_channels, 1),
                nn.BatchNorm2d(in_channels),
                nn.ReLU(inplace=True),
                nn.Conv2d(in_channels, in_channels, 3, stride=stride, padding=1, groups=in_channels),
                nn.BatchNorm2d(in_channels),
                nn.Conv2d(in_channels, in_channels, 1),
                nn.BatchNorm2d(in_channels),
                nn.ReLU(inplace=True)
            )


    def forward(self, x):

        if self.stride == 1 and self.out_channels == self.in_channels:
            shortcut, residual = channel_split(x, int(self.in_channels / 2))
        else:
            shortcut = x
            residual = x

        shortcut = self.shortcut(shortcut)
        residual = self.residual(residual)
        x = torch.cat([shortcut, residual], dim=1)
        x = channel_shuffle(x, 2)

        return x

class ShuffleNetV2(nn.Module):

    def __init__(self, ratio=1, class_num=100):
        super().__init__()
        if ratio == 0.5:
            out_channels = [48, 96, 192, 1024]
        elif ratio == 1:
            out_channels = [116, 232, 464, 1024]
        elif ratio == 1.5:
            out_channels = [176, 352, 704, 1024]
        elif ratio == 2:
            out_channels = [244, 488, 976, 2048]
        else:
            raise ValueError('unsupported ratio number')

        self.pre = nn.Sequential(
            nn.Conv2d(3, 24, 3, padding=1),
            nn.BatchNorm2d(24)
        )

        self.stage2 = self._make_stage(24, out_channels[0], 3)
        self.stage3 = self._make_stage(out_channels[0], out_channels[1], 7)
        self.stage4 = self._make_stage(out_channels[1], out_channels[2], 3)
        self.conv5 = nn.Sequential(
            nn.Conv2d(out_channels[2], out_channels[3], 1),
            nn.BatchNorm2d(out_channels[3]),
            nn.ReLU(inplace=True)
        )

        self.fc = nn.Linear(out_channels[3], class_num)

    def forward(self, x):
        x = self.pre(x)
        x = self.stage2(x)
        x = self.stage3(x)
        x = self.stage4(x)
        x = self.conv5(x)
        x = F.adaptive_avg_pool2d(x, 1)
        x = x.view(x.size(0), -1)
        x = self.fc(x)

        return x

    def _make_stage(self, in_channels, out_channels, repeat):
        layers = []
        layers.append(ShuffleUnit(in_channels, out_channels, 2))

        for _ in range(repeat):
            layers.append(ShuffleUnit(out_channels, out_channels, 1))

        return nn.Sequential(*layers)

def shufflenetv2():
    return ShuffleNetV2()
--------------------------------------------------------------------------------
/1-Imagenet9/models/squeezenet.py:
--------------------------------------------------------------------------------
"""squeezenet in pytorch



[1] Forrest N. Iandola, Song Han, Matthew W. Moskewicz, Khalid Ashraf, William J. Dally, Kurt Keutzer

    SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size
    https://arxiv.org/abs/1602.07360
"""

import torch
import torch.nn as nn


class Fire(nn.Module):

    def __init__(self, in_channel, out_channel, squeeze_channel):

        super().__init__()
        self.squeeze = nn.Sequential(
            nn.Conv2d(in_channel, squeeze_channel, 1),
            nn.BatchNorm2d(squeeze_channel),
            nn.ReLU(inplace=True)
        )

        self.expand_1x1 = nn.Sequential(
            nn.Conv2d(squeeze_channel, int(out_channel / 2), 1),
            nn.BatchNorm2d(int(out_channel / 2)),
            nn.ReLU(inplace=True)
        )

        self.expand_3x3 = nn.Sequential(
            nn.Conv2d(squeeze_channel, int(out_channel / 2), 3, padding=1),
            nn.BatchNorm2d(int(out_channel / 2)),
            nn.ReLU(inplace=True)
        )

    def forward(self, x):

        x = self.squeeze(x)
        x = torch.cat([
            self.expand_1x1(x),
            self.expand_3x3(x)
        ], 1)

        return x

class SqueezeNet(nn.Module):

    """squeezenet with simple bypass"""
    def __init__(self, class_num=100):

        super().__init__()
        self.stem = nn.Sequential(
            nn.Conv2d(3, 96, 3, padding=1),
            nn.BatchNorm2d(96),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2, 2)
        )

        self.fire2 = Fire(96, 128, 16)
        self.fire3 = Fire(128, 128, 16)
        self.fire4 = Fire(128, 256, 32)
        self.fire5 = Fire(256, 256, 32)
        self.fire6 = Fire(256, 384, 48)
        self.fire7 = Fire(384, 384, 48)
        self.fire8 = Fire(384, 512, 64)
        self.fire9 = Fire(512, 512, 64)

        self.conv10 = nn.Conv2d(512, class_num, 1)
        self.avg = nn.AdaptiveAvgPool2d(1)
        self.maxpool = nn.MaxPool2d(2, 2)

    def forward(self, x):
        x = self.stem(x)

        f2 = self.fire2(x)
        f3 = self.fire3(f2) + f2
        f4 = self.fire4(f3)
        f4 = self.maxpool(f4)

        f5 = self.fire5(f4) + f4
        f6 = self.fire6(f5)
        f7 = self.fire7(f6) + f6
        f8 = self.fire8(f7)
        f8 = self.maxpool(f8)

        f9 = self.fire9(f8)
        c10 = self.conv10(f9)

        x = self.avg(c10)
        x = x.view(x.size(0), -1)

        return x

def squeezenet(class_num=100):
    return SqueezeNet(class_num=class_num)
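
# Fire module sketch (illustrative): squeeze to a narrow 1x1 bottleneck, then
# expand with parallel 1x1 and 3x3 convs whose outputs are concatenated:
#
#   fire = Fire(96, 128, 16)
#   y = fire(torch.randn(2, 96, 16, 16))   # y.shape == torch.Size([2, 128, 16, 16])
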
--------------------------------------------------------------------------------
/1-Imagenet9/models/token_performer.py:
--------------------------------------------------------------------------------
"""
Take Performer as T2T Transformer
"""
import math
import torch
import torch.nn as nn

class Token_performer(nn.Module):
    def __init__(self, dim, in_dim, head_cnt=1, kernel_ratio=0.5, dp1=0.1, dp2=0.1):
        super().__init__()
        self.emb = in_dim * head_cnt  # head_cnt is 1 here, so emb == in_dim
        self.kqv = nn.Linear(dim, 3 * self.emb)
        self.dp = nn.Dropout(dp1)
        self.proj = nn.Linear(self.emb, self.emb)
        self.head_cnt = head_cnt
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(self.emb)
        self.epsilon = 1e-8  # for numerical stability in the division

        self.mlp = nn.Sequential(
            nn.Linear(self.emb, 1 * self.emb),
            nn.GELU(),
            nn.Linear(1 * self.emb, self.emb),
            nn.Dropout(dp2),
        )

        self.m = int(self.emb * kernel_ratio)
        self.w = torch.randn(self.m, self.emb)
        self.w = nn.Parameter(nn.init.orthogonal_(self.w) * math.sqrt(self.m), requires_grad=False)

    def prm_exp(self, x):
        # part of this function is borrowed from https://github.com/lucidrains/performer-pytorch
        # and Simo Ryu (https://github.com/cloneofsimo)
        # ==== positive random features for gaussian kernels ====
        # x = (B, T, hs)
        # w = (m, hs)
        # return : x : B, T, m
        # SM(x, y) = E_w[exp(w^T x - ||x||^2 / 2) exp(w^T y - ||y||^2 / 2)]
        # therefore return exp(w^T x - ||x||^2 / 2) / sqrt(m)
        xd = ((x * x).sum(dim=-1, keepdim=True)).repeat(1, 1, self.m) / 2
        wtx = torch.einsum('bti,mi->btm', x.float(), self.w)

        return torch.exp(wtx - xd) / math.sqrt(self.m)

    def single_attn(self, x):
        k, q, v = torch.split(self.kqv(x), self.emb, dim=-1)
        kp, qp = self.prm_exp(k), self.prm_exp(q)  # (B, T, m), (B, T, m)
        D = torch.einsum('bti,bi->bt', qp, kp.sum(dim=1)).unsqueeze(dim=2)  # (B, T, m) * (B, m) -> (B, T, 1)
        kptv = torch.einsum('bin,bim->bnm', v.float(), kp)  # (B, emb, m)
        y = torch.einsum('bti,bni->btn', qp, kptv) / (D.repeat(1, 1, self.emb) + self.epsilon)  # (B, T, emb) / Diag
        # skip connection
        y = v + self.dp(self.proj(y))  # same as token_transformer in the T2T layer: use v as the skip connection

        return y

    def forward(self, x):
        x = self.single_attn(self.norm1(x))
        x = x + self.mlp(self.norm2(x))
        return x
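
# Usage sketch (illustrative): the random-feature trick above keeps attention
# linear in the token count T, so the block maps (B, T, dim) -> (B, T, in_dim)
# without materializing a T x T attention matrix:
#
#   tp = Token_performer(dim=64, in_dim=64)
#   y = tp(torch.randn(2, 196, 64))   # y.shape == torch.Size([2, 196, 64])
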
6 | """ 7 | Take the standard Transformer as T2T Transformer 8 | """ 9 | import torch.nn as nn 10 | from timm.models.layers import DropPath 11 | from .transformer_block import Mlp 12 | 13 | class Attention(nn.Module): 14 | def __init__(self, dim, num_heads=8, in_dim = None, qkv_bias=False, qk_scale=None, attn_drop=0., proj_drop=0.): 15 | super().__init__() 16 | self.num_heads = num_heads 17 | self.in_dim = in_dim 18 | head_dim = dim // num_heads 19 | self.scale = qk_scale or head_dim ** -0.5 20 | 21 | self.qkv = nn.Linear(dim, in_dim * 3, bias=qkv_bias) 22 | self.attn_drop = nn.Dropout(attn_drop) 23 | self.proj = nn.Linear(in_dim, in_dim) 24 | self.proj_drop = nn.Dropout(proj_drop) 25 | 26 | def forward(self, x): 27 | B, N, C = x.shape 28 | 29 | qkv = self.qkv(x).reshape(B, N, 3, self.num_heads, self.in_dim).permute(2, 0, 3, 1, 4) 30 | q, k, v = qkv[0], qkv[1], qkv[2] 31 | 32 | attn = (q @ k.transpose(-2, -1)) * self.scale 33 | attn = attn.softmax(dim=-1) 34 | attn = self.attn_drop(attn) 35 | 36 | x = (attn @ v).transpose(1, 2).reshape(B, N, self.in_dim) 37 | x = self.proj(x) 38 | x = self.proj_drop(x) 39 | 40 | # skip connection 41 | x = v.squeeze(1) + x # because the original x has different size with current x, use v to do skip connection 42 | 43 | return x 44 | 45 | class Token_transformer(nn.Module): 46 | 47 | def __init__(self, dim, in_dim, num_heads, mlp_ratio=1., qkv_bias=False, qk_scale=None, drop=0., attn_drop=0., 48 | drop_path=0., act_layer=nn.GELU, norm_layer=nn.LayerNorm): 49 | super().__init__() 50 | self.norm1 = norm_layer(dim) 51 | self.attn = Attention( 52 | dim, in_dim=in_dim, num_heads=num_heads, qkv_bias=qkv_bias, qk_scale=qk_scale, attn_drop=attn_drop, proj_drop=drop) 53 | self.drop_path = DropPath(drop_path) if drop_path > 0. else nn.Identity() 54 | self.norm2 = norm_layer(in_dim) 55 | self.mlp = Mlp(in_features=in_dim, hidden_features=int(in_dim*mlp_ratio), out_features=in_dim, act_layer=act_layer, drop=drop) 56 | 57 | def forward(self, x): 58 | x = self.attn(self.norm1(x)) 59 | x = x + self.drop_path(self.mlp(self.norm2(x))) 60 | return x -------------------------------------------------------------------------------- /1-Imagenet9/models/transformer_block.py: -------------------------------------------------------------------------------- 1 | # Copyright (c) [2012]-[2021] Shanghai Yitu Technology Co., Ltd. 2 | # 3 | # This source code is licensed under the Clear BSD License 4 | # LICENSE file in the root directory of this file 5 | # All rights reserved. 
6 | """ 7 | Borrow from timm(https://github.com/rwightman/pytorch-image-models) 8 | """ 9 | import torch 10 | import torch.nn as nn 11 | import numpy as np 12 | from timm.models.layers import DropPath 13 | 14 | class Mlp(nn.Module): 15 | def __init__(self, in_features, hidden_features=None, out_features=None, act_layer=nn.GELU, drop=0.): 16 | super().__init__() 17 | out_features = out_features or in_features 18 | hidden_features = hidden_features or in_features 19 | self.fc1 = nn.Linear(in_features, hidden_features) 20 | self.act = act_layer() 21 | self.fc2 = nn.Linear(hidden_features, out_features) 22 | self.drop = nn.Dropout(drop) 23 | 24 | def forward(self, x): 25 | x = self.fc1(x) 26 | x = self.act(x) 27 | x = self.drop(x) 28 | x = self.fc2(x) 29 | x = self.drop(x) 30 | return x 31 | 32 | class Attention(nn.Module): 33 | def __init__(self, dim, num_heads=8, qkv_bias=False, qk_scale=None, attn_drop=0., proj_drop=0.): 34 | super().__init__() 35 | self.num_heads = num_heads 36 | head_dim = dim // num_heads 37 | 38 | self.scale = qk_scale or head_dim ** -0.5 39 | 40 | self.qkv = nn.Linear(dim, dim * 3, bias=qkv_bias) 41 | self.attn_drop = nn.Dropout(attn_drop) 42 | self.proj = nn.Linear(dim, dim) 43 | self.proj_drop = nn.Dropout(proj_drop) 44 | 45 | def forward(self, x): 46 | B, N, C = x.shape 47 | qkv = self.qkv(x).reshape(B, N, 3, self.num_heads, C // self.num_heads).permute(2, 0, 3, 1, 4) 48 | q, k, v = qkv[0], qkv[1], qkv[2] 49 | 50 | attn = (q @ k.transpose(-2, -1)) * self.scale 51 | attn = attn.softmax(dim=-1) 52 | attn = self.attn_drop(attn) 53 | 54 | x = (attn @ v).transpose(1, 2).reshape(B, N, C) 55 | x = self.proj(x) 56 | x = self.proj_drop(x) 57 | return x 58 | 59 | 60 | class Attention_ours(nn.Module): 61 | def __init__(self, dim, num_heads=8, qkv_bias=False, qk_scale=None, attn_drop=0., proj_drop=0.): 62 | super().__init__() 63 | self.num_heads = num_heads 64 | head_dim = dim // num_heads 65 | 66 | self.scale = qk_scale or head_dim ** -0.5 67 | 68 | self.qkv = nn.Linear(dim, dim * 3, bias=qkv_bias) 69 | self.attn_drop = nn.Dropout(attn_drop) 70 | self.proj = nn.Linear(dim, dim) 71 | self.proj_drop = nn.Dropout(proj_drop) 72 | 73 | def forward(self, x): 74 | B, N, C = x.shape 75 | qkv = self.qkv(x).reshape(B, N, 3, self.num_heads, C // self.num_heads).permute(2, 0, 3, 1, 4) 76 | q, k, v = qkv[0], qkv[1], qkv[2] 77 | 78 | attn = (q @ k.transpose(-2, -1)) * self.scale 79 | attn_causal = attn.softmax(dim=-1) 80 | attn_sp = (-attn).softmax(dim=-1) 81 | 82 | attn_causal = self.attn_drop(attn_causal) 83 | attn_sp = self.attn_drop(attn_sp) 84 | 85 | x = (attn_causal @ v).transpose(1, 2).reshape(B, N, C) 86 | x_comp = (attn_sp @ v).transpose(1, 2).reshape(B, N, C) 87 | 88 | x = self.proj(x) 89 | x_comp = self.proj(x_comp) 90 | x = self.proj_drop(x) 91 | x_comp = self.proj_drop(x_comp) 92 | return x, x_comp 93 | 94 | 95 | class Block(nn.Module): 96 | 97 | def __init__(self, dim, num_heads, mlp_ratio=4., qkv_bias=False, qk_scale=None, drop=0., attn_drop=0., 98 | drop_path=0., act_layer=nn.GELU, norm_layer=nn.LayerNorm): 99 | super().__init__() 100 | self.norm1 = norm_layer(dim) 101 | self.attn = Attention( 102 | dim, num_heads=num_heads, qkv_bias=qkv_bias, qk_scale=qk_scale, attn_drop=attn_drop, proj_drop=drop) 103 | self.drop_path = DropPath(drop_path) if drop_path > 0. 
class Block(nn.Module):

    def __init__(self, dim, num_heads, mlp_ratio=4., qkv_bias=False, qk_scale=None, drop=0., attn_drop=0.,
                 drop_path=0., act_layer=nn.GELU, norm_layer=nn.LayerNorm):
        super().__init__()
        self.norm1 = norm_layer(dim)
        self.attn = Attention(
            dim, num_heads=num_heads, qkv_bias=qkv_bias, qk_scale=qk_scale, attn_drop=attn_drop, proj_drop=drop)
        self.drop_path = DropPath(drop_path) if drop_path > 0. else nn.Identity()
        self.norm2 = norm_layer(dim)
        mlp_hidden_dim = int(dim * mlp_ratio)
        self.mlp = Mlp(in_features=dim, hidden_features=mlp_hidden_dim, act_layer=act_layer, drop=drop)

    def forward(self, x):
        x = x + self.drop_path(self.attn(self.norm1(x)))
        x = x + self.drop_path(self.mlp(self.norm2(x)))
        return x


class Block_ours(nn.Module):

    def __init__(self, dim, num_heads, mlp_ratio=4., qkv_bias=False, qk_scale=None, drop=0., attn_drop=0.,
                 drop_path=0., act_layer=nn.GELU, norm_layer=nn.LayerNorm):
        super().__init__()
        self.norm1 = norm_layer(dim)
        self.attn = Attention_ours(
            dim, num_heads=num_heads, qkv_bias=qkv_bias, qk_scale=qk_scale, attn_drop=attn_drop, proj_drop=drop)
        self.drop_path = DropPath(drop_path) if drop_path > 0. else nn.Identity()
        self.norm2 = norm_layer(dim)
        self.norm3 = norm_layer(dim)
        mlp_hidden_dim = int(dim * mlp_ratio)
        self.mlp = Mlp(in_features=dim, hidden_features=mlp_hidden_dim, act_layer=act_layer, drop=drop)

    def forward(self, x):
        if isinstance(x, list):
            x = x[-1]  # x = x_mix
        attn_x, attn_x_comp = self.attn(self.norm1(x))
        x_causal = x + self.drop_path(attn_x)
        x_spurious = self.drop_path(attn_x_comp)  # no residual

        x_causal = x_causal + self.drop_path(self.mlp(self.norm2(x_causal)))
        x_spurious = x_spurious + self.drop_path(self.mlp(self.norm3(x_spurious)))
        x_mix = x_causal + x_spurious

        return [x_causal, x_spurious, x_mix]


def get_sinusoid_encoding(n_position, d_hid):
    ''' Sinusoid position encoding table '''

    def get_position_angle_vec(position):
        return [position / np.power(10000, 2 * (hid_j // 2) / d_hid) for hid_j in range(d_hid)]

    sinusoid_table = np.array([get_position_angle_vec(pos_i) for pos_i in range(n_position)])
    sinusoid_table[:, 0::2] = np.sin(sinusoid_table[:, 0::2])  # dim 2i
    sinusoid_table[:, 1::2] = np.cos(sinusoid_table[:, 1::2])  # dim 2i+1

    return torch.FloatTensor(sinusoid_table).unsqueeze(0)
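
# Worked example (illustrative): get_sinusoid_encoding interleaves sin/cos of
# geometrically scaled position angles into a (1, n_position, d_hid) table:
#
#   pe = get_sinusoid_encoding(n_position=197, d_hid=64)
#   pe.shape    # torch.Size([1, 197, 64])
#   pe[0, 0]    # position 0: sin(0) = 0 at even dims, cos(0) = 1 at odd dims
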
--------------------------------------------------------------------------------
/1-Imagenet9/models/vgg.py:
--------------------------------------------------------------------------------
"""vgg in pytorch


[1] Karen Simonyan, Andrew Zisserman

    Very Deep Convolutional Networks for Large-Scale Image Recognition.
    https://arxiv.org/abs/1409.1556v6
"""
'''VGG11/13/16/19 in Pytorch.'''

import torch
import torch.nn as nn

cfg = {
    'A' : [64, 'M', 128, 'M', 256, 256, 'M', 512, 512, 'M', 512, 512, 'M'],
    'B' : [64, 64, 'M', 128, 128, 'M', 256, 256, 'M', 512, 512, 'M', 512, 512, 'M'],
    'D' : [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 'M', 512, 512, 512, 'M', 512, 512, 512, 'M'],
    'E' : [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 256, 'M', 512, 512, 512, 512, 'M', 512, 512, 512, 512, 'M']
}

class VGG(nn.Module):

    def __init__(self, features, num_class=100):
        super().__init__()
        self.features = features

        self.classifier = nn.Sequential(
            nn.Linear(512, 4096),
            nn.ReLU(inplace=True),
            nn.Dropout(),
            nn.Linear(4096, 4096),
            nn.ReLU(inplace=True),
            nn.Dropout(),
            nn.Linear(4096, num_class)
        )

    def forward(self, x):
        output = self.features(x)
        output = output.view(output.size()[0], -1)
        output = self.classifier(output)

        return output

def make_layers(cfg, batch_norm=False):
    layers = []

    input_channel = 3
    for l in cfg:
        if l == 'M':
            layers += [nn.MaxPool2d(kernel_size=2, stride=2)]
            continue

        layers += [nn.Conv2d(input_channel, l, kernel_size=3, padding=1)]

        if batch_norm:
            layers += [nn.BatchNorm2d(l)]

        layers += [nn.ReLU(inplace=True)]
        input_channel = l

    return nn.Sequential(*layers)

def vgg11_bn():
    return VGG(make_layers(cfg['A'], batch_norm=True))

def vgg13_bn():
    return VGG(make_layers(cfg['B'], batch_norm=True))

def vgg16_bn():
    return VGG(make_layers(cfg['D'], batch_norm=True))

def vgg19_bn():
    return VGG(make_layers(cfg['E'], batch_norm=True))
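
# Usage sketch (illustrative): cfg 'D' is the 16-layer configuration; each 'M'
# becomes a 2x2 max-pool, so a 32x32 input reaches the classifier as a 512-d
# vector (the hard-coded nn.Linear(512, 4096) assumes 32x32 inputs):
#
#   net = vgg16_bn()
#   y = net(torch.randn(2, 3, 32, 32))   # y.shape == torch.Size([2, 100])
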
--------------------------------------------------------------------------------
/1-Imagenet9/models/wideresidual.py:
--------------------------------------------------------------------------------
import torch
import torch.nn as nn


class WideBasic(nn.Module):

    def __init__(self, in_channels, out_channels, stride=1):
        super().__init__()
        self.residual = nn.Sequential(
            nn.BatchNorm2d(in_channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(
                in_channels,
                out_channels,
                kernel_size=3,
                stride=stride,
                padding=1
            ),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(inplace=True),
            nn.Dropout(),
            nn.Conv2d(
                out_channels,
                out_channels,
                kernel_size=3,
                stride=1,
                padding=1
            )
        )

        self.shortcut = nn.Sequential()

        if in_channels != out_channels or stride != 1:
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_channels, out_channels, 1, stride=stride)
            )

    def forward(self, x):

        residual = self.residual(x)
        shortcut = self.shortcut(x)

        return residual + shortcut

class WideResNet(nn.Module):
    def __init__(self, num_classes, block, depth=50, widen_factor=1):
        super().__init__()

        self.depth = depth
        k = widen_factor
        l = int((depth - 4) / 6)
        self.in_channels = 16
        self.init_conv = nn.Conv2d(3, self.in_channels, 3, 1, padding=1)
        self.conv2 = self._make_layer(block, 16 * k, l, 1)
        self.conv3 = self._make_layer(block, 32 * k, l, 2)
        self.conv4 = self._make_layer(block, 64 * k, l, 2)
        self.bn = nn.BatchNorm2d(64 * k)
        self.relu = nn.ReLU(inplace=True)
        self.avg_pool = nn.AdaptiveAvgPool2d((1, 1))
        self.linear = nn.Linear(64 * k, num_classes)

    def forward(self, x):
        x = self.init_conv(x)
        x = self.conv2(x)
        x = self.conv3(x)
        x = self.conv4(x)
        x = self.bn(x)
        x = self.relu(x)
        x = self.avg_pool(x)
        x = x.view(x.size(0), -1)
        x = self.linear(x)

        return x

    def _make_layer(self, block, out_channels, num_blocks, stride):
        """Build a WideResNet stage (a 'layer' here means a stage of residual
        blocks, not a single network layer such as a conv layer); one stage
        may contain more than one residual block.

        Args:
            block: block type, basic block or bottleneck block
            out_channels: output depth channel number of this stage
            num_blocks: how many blocks per stage
            stride: the stride of the first block of this stage

        Return:
            a wide resnet stage as nn.Sequential
        """

        # num_blocks blocks per stage; the first block's stride can be 1 or 2,
        # all remaining blocks use stride 1
        strides = [stride] + [1] * (num_blocks - 1)
        layers = []
        for stride in strides:
            layers.append(block(self.in_channels, out_channels, stride))
            self.in_channels = out_channels

        return nn.Sequential(*layers)


# Table 9: Best WRN performance over various datasets, single run results.
def wideresnet(depth=40, widen_factor=10):
    net = WideResNet(100, WideBasic, depth=depth, widen_factor=widen_factor)
    return net
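
# Usage sketch (illustrative): WRN-40-10 stacks l = (40 - 4) / 6 = 6 wide
# basic blocks per stage and widens the channels by k = 10:
#
#   net = wideresnet(depth=40, widen_factor=10)
#   y = net(torch.randn(2, 3, 32, 32))   # y.shape == torch.Size([2, 100])
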
--------------------------------------------------------------------------------
/1-Imagenet9/pre_cluster_results/backup/cluster_label_1.pth:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Wangt-CN/CaaM/32b9bb24c10a0fd8c426ae50bb8a09ccd420d1aa/1-Imagenet9/pre_cluster_results/backup/cluster_label_1.pth
--------------------------------------------------------------------------------
/1-Imagenet9/pre_cluster_results/backup/cluster_label_2.pth:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Wangt-CN/CaaM/32b9bb24c10a0fd8c426ae50bb8a09ccd420d1aa/1-Imagenet9/pre_cluster_results/backup/cluster_label_2.pth
--------------------------------------------------------------------------------
/1-Imagenet9/pre_cluster_results/backup/cluster_label_3.pth:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Wangt-CN/CaaM/32b9bb24c10a0fd8c426ae50bb8a09ccd420d1aa/1-Imagenet9/pre_cluster_results/backup/cluster_label_3.pth
--------------------------------------------------------------------------------
/1-Imagenet9/pre_cluster_results/backup/cluster_sample_1.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Wangt-CN/CaaM/32b9bb24c10a0fd8c426ae50bb8a09ccd420d1aa/1-Imagenet9/pre_cluster_results/backup/cluster_sample_1.jpg
--------------------------------------------------------------------------------
/1-Imagenet9/pre_cluster_results/backup/cluster_sample_2.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Wangt-CN/CaaM/32b9bb24c10a0fd8c426ae50bb8a09ccd420d1aa/1-Imagenet9/pre_cluster_results/backup/cluster_sample_2.jpg
--------------------------------------------------------------------------------
/1-Imagenet9/pre_cluster_results/backup/cluster_sample_3.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Wangt-CN/CaaM/32b9bb24c10a0fd8c426ae50bb8a09ccd420d1aa/1-Imagenet9/pre_cluster_results/backup/cluster_sample_3.jpg
--------------------------------------------------------------------------------
/1-Imagenet9/pre_cluster_results/cluster_label_1.pth:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Wangt-CN/CaaM/32b9bb24c10a0fd8c426ae50bb8a09ccd420d1aa/1-Imagenet9/pre_cluster_results/cluster_label_1.pth
--------------------------------------------------------------------------------
/1-Imagenet9/pre_cluster_results/cluster_label_2.pth:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Wangt-CN/CaaM/32b9bb24c10a0fd8c426ae50bb8a09ccd420d1aa/1-Imagenet9/pre_cluster_results/cluster_label_2.pth
--------------------------------------------------------------------------------
/1-Imagenet9/pre_cluster_results/cluster_label_3.pth:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Wangt-CN/CaaM/32b9bb24c10a0fd8c426ae50bb8a09ccd420d1aa/1-Imagenet9/pre_cluster_results/cluster_label_3.pth
--------------------------------------------------------------------------------
/1-Imagenet9/pre_cluster_results/cluster_sample_1.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Wangt-CN/CaaM/32b9bb24c10a0fd8c426ae50bb8a09ccd420d1aa/1-Imagenet9/pre_cluster_results/cluster_sample_1.jpg
--------------------------------------------------------------------------------
/1-Imagenet9/pre_cluster_results/cluster_sample_2.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Wangt-CN/CaaM/32b9bb24c10a0fd8c426ae50bb8a09ccd420d1aa/1-Imagenet9/pre_cluster_results/cluster_sample_2.jpg
--------------------------------------------------------------------------------
/1-Imagenet9/pre_cluster_results/cluster_sample_3.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Wangt-CN/CaaM/32b9bb24c10a0fd8c426ae50bb8a09ccd420d1aa/1-Imagenet9/pre_cluster_results/cluster_sample_3.jpg
--------------------------------------------------------------------------------
/1-Imagenet9/pretrain_model/resnet_ours/resnet18_caam.pth:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Wangt-CN/CaaM/32b9bb24c10a0fd8c426ae50bb8a09ccd420d1aa/1-Imagenet9/pretrain_model/resnet_ours/resnet18_caam.pth
--------------------------------------------------------------------------------
/1-Imagenet9/pretrain_model/t2tvit_ours/t2tvit7_caam.pth:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Wangt-CN/CaaM/32b9bb24c10a0fd8c426ae50bb8a09ccd420d1aa/1-Imagenet9/pretrain_model/t2tvit_ours/t2tvit7_caam.pth
--------------------------------------------------------------------------------
/1-Imagenet9/scripts/run_baseline_resnet18.sh:
--------------------------------------------------------------------------------
CUDA_VISIBLE_DEVICES=0,1 python train.py --cfg conf/baseline_resnet18_imagenet9.yaml --gpu --multigpu --name baseline_resnet18_imagenet9
--------------------------------------------------------------------------------
/1-Imagenet9/scripts/run_ours_resnet18.sh:
--------------------------------------------------------------------------------
CUDA_VISIBLE_DEVICES=0,1 python train.py --cfg conf/ours_resnet18_multi2_imagenet9_pw5e4_noenv_iter.yaml --gpu --multigpu --name ours_resnet18_multilayer2_imagenet9_pw5e4_noenv_iter
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# CaaM

This repo contains the code for training our [CaaM](https://arxiv.org/abs/2108.08782) on the NICO and ImageNet9 datasets. Due to my limited bandwidth recently, this codebase is still messy; it will be further refined and checked soon.

### 0. Bibtex

If you find our code helpful, please cite our paper:

```
@inproceedings{wang2021causal,
  title={Causal Attention for Unbiased Visual Recognition},
  author={Wang, Tan and Zhou, Chang and Sun, Qianru and Zhang, Hanwang},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
  year={2021}
}
```

### 1. Preparation

1) Installation: Python 3.6, PyTorch 1.6, tensorboard, timm (0.3.4), scikit-learn, opencv-python, matplotlib, yaml
2) Dataset:

- NICO: Please download from https://drive.google.com/drive/folders/17-jl0fF9BxZupG75BtpOqJaB6dJ2Pv8O?usp=sharing; we removed the damaged images in the original NICO and renamed the images. The construction details of our proposed subset are in our Appendix.
- ImageNet9: Please follow the usual practice to download the ImageNet (ILSVRC2015) dataset.

3) Please remember to change the data path in the config file.

### 2. Evaluation

1) For ResNet18 on the NICO dataset:

```
CUDA_VISIBLE_DEVICES=0 python train.py -cfg conf/ours_resnet18_multilayer2_bf0.02_noenv_pw5e5.yaml -debug -gpu -eval pretrain_model/nico_resnet18_ours_caam-best.pth
```

The results will be: Val Score: 0.4638461470603943  Test Score: 0.4661538600921631

2) For T2T-ViT7 on the NICO dataset:

```
CUDA_VISIBLE_DEVICES=0,1 python train.py -cfg conf/ours_t2tvit7_bf0.02_s4_noenv_pw5e4.yaml -debug -gpu -multigpu -eval pretrain_model/nico_t2tvit7_ours_caam-best.pth
```

The results will be: Val Score: 0.3799999952316284  Test Score: 0.3761538565158844

3) For the ImageNet-9 dataset:

Similarly, the pretrained models are in `pretrain_model`. Please note that on ImageNet9 we report the best performance for each of the 3 metrics in our paper. The released model corresponds to the best `bias` and `unbias` results; we did not save the model with the best `ImageNet-A` result.

### 3. Train

To perform training, please run the sh files in `scripts`. For example:

```
sh scripts/run_baseline_resnet18.sh
```

### **4. An interesting finding**

Recently I found an interesting thing by accident: adding `mixup` to the baseline model does not bring much performance improvement (see Table 1 in the main paper). However, when `mixup` is performed on top of our CaaM, the performance is further boosted.

Specifically, you can activate `mixup` by:

```
sh scripts/run_ours_resnet18_mixup.sh
```

This makes our CaaM achieve about **50~51%** Val & Test accuracy on the NICO dataset.
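
For reference, here is a minimal sketch of the generic `mixup` operation (the standard formulation, not necessarily the exact variant wired into our training code; `alpha` is an assumed hyperparameter):

```python
import numpy as np
import torch
import torch.nn.functional as F

def mixup_data(x, y, alpha=1.0):
    """Convexly combine a batch with a shuffled copy of itself."""
    lam = np.random.beta(alpha, alpha) if alpha > 0 else 1.0
    index = torch.randperm(x.size(0), device=x.device)
    mixed_x = lam * x + (1 - lam) * x[index]
    return mixed_x, y, y[index], lam

def mixup_criterion(pred, y_a, y_b, lam):
    """The loss is the same convex combination of the two label views."""
    return lam * F.cross_entropy(pred, y_a) + (1 - lam) * F.cross_entropy(pred, y_b)
```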

### **Acknowledgement**

Special thanks to the authors of [ReBias](https://github.com/clovaai/rebias) and [IRM](https://github.com/facebookresearch/InvariantRiskMinimization), and to the creators of the datasets used in this research project.

If you have any questions or find any bugs, please kindly email [me](TAN317@e.ntu.edu.sg).
--------------------------------------------------------------------------------