├── .gitignore
├── LICENSE
├── README.md
├── adaptive-aggregation-networks
    ├── README.md
    ├── main.py
    ├── models
    │   ├── modified_linear.py
    │   ├── modified_resnet.py
    │   ├── modified_resnet_cifar.py
    │   ├── modified_resnetmtl.py
    │   ├── modified_resnetmtl_cifar.py
    │   └── resnet_cifar.py
    ├── trainer
    │   ├── __init__.py
    │   ├── base_trainer.py
    │   ├── incremental_icarl.py
    │   ├── incremental_lucir.py
    │   ├── trainer.py
    │   └── zeroth_phase.py
    └── utils
    │   ├── __init__.py
    │   ├── gpu_tools.py
    │   ├── imagenet
    │       ├── __init__.py
    │       ├── train_and_eval.py
    │       ├── utils_dataset.py
    │       └── utils_train.py
    │   ├── incremental
    │       ├── __init__.py
    │       ├── compute_accuracy.py
    │       ├── compute_features.py
    │       └── conv2d_mtl.py
    │   ├── misc.py
    │   └── process_fp.py
└── mnemonics-training
    ├── 1_train
        ├── main.py
        ├── models
        │   ├── __init__.py
        │   ├── modified_linear.py
        │   ├── modified_resnet_cifar.py
        │   └── modified_resnetmtl_cifar.py
        ├── trainer
        │   ├── __init__.py
        │   ├── baseline.py
        │   ├── incremental.py
        │   └── mnemonics.py
        └── utils
        │   ├── __init__.py
        │   ├── compute_accuracy.py
        │   ├── compute_features.py
        │   ├── conv2d_mtl.py
        │   ├── gpu_tools.py
        │   ├── misc.py
        │   ├── process_fp.py
        │   └── process_mnemonics.py
    ├── 2_eval
        ├── README.md
        ├── main.py
        ├── models
        │   ├── modified_linear.py
        │   ├── modified_resnet.py
        │   ├── modified_resnet_cifar.py
        │   ├── modified_resnetmtl.py
        │   ├── modified_resnetmtl_cifar.py
        │   └── resnet_cifar.py
        ├── process_imagenet
        │   ├── generate_imagenet.py
        │   └── generate_imagenet_subset.py
        ├── run_eval.sh
        ├── script
        │   └── download_ckpt.sh
        ├── trainer
        │   ├── __init__.py
        │   └── train.py
        └── utils
        │   ├── __init__.py
        │   ├── gpu_tools.py
        │   ├── imagenet
        │       ├── __init__.py
        │       ├── train_and_eval.py
        │       ├── utils_dataset.py
        │       └── utils_train.py
        │   ├── incremental
        │       ├── __init__.py
        │       ├── compute_accuracy.py
        │       ├── compute_confusion_matrix.py
        │       ├── compute_features.py
        │       └── conv2d_mtl.py
        │   └── misc.py
    └── README.md


/.gitignore:
--------------------------------------------------------------------------------
 1 | # File types
 2 | *.pyc
 3 | *.npy
 4 | *.tar.gz
 5 | *.sh
 6 | *.out
 7 | 
 8 | # Folders
 9 | data
10 | logs
11 | runs
12 | __pycache__
13 | 
14 | # File
15 | .DS_Store
16 | bashrc
17 | 


--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
 1 | MIT License
 2 | 
 3 | Copyright (c) 2020-2021 Yaoyao Liu
 4 | 
 5 | Permission is hereby granted, free of charge, to any person obtaining a copy
 6 | of this software and associated documentation files (the "Software"), to deal
 7 | in the Software without restriction, including without limitation the rights
 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 | 
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 | 
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
 1 | # Class-Incremental Learning
 2 | 
 3 | [![LICENSE](https://img.shields.io/badge/license-MIT-green?style=flat-square)](https://github.com/yaoyao-liu/class-incremental-learning/blob/master/LICENSE)
 4 | [![Python](https://img.shields.io/badge/python-3.6-blue.svg?style=flat-square&logo=python&color=3776AB&logoColor=3776AB)](https://www.python.org/)
 5 | [![PyTorch](https://img.shields.io/badge/pytorch-1.2.0-%237732a8?style=flat-square&logo=PyTorch&color=EE4C2C)](https://pytorch.org/)
 6 | 
 7 | ### Papers
 8 | 
 9 | - Adaptive Aggregation Networks for Class-Incremental Learning,
10 | CVPR 2021. \[[PDF](https://openaccess.thecvf.com/content/CVPR2021/papers/Liu_Adaptive_Aggregation_Networks_for_Class-Incremental_Learning_CVPR_2021_paper.pdf)\] \[[Project Page](https://class-il.mpi-inf.mpg.de/)\]
11 | 
12 | - Mnemonics Training: Multi-Class Incremental Learning without Forgetting,
13 | CVPR 2020. \[[PDF](https://arxiv.org/pdf/2002.10211.pdf)\] \[[Project Page](https://class-il.mpi-inf.mpg.de/mnemonics-training/)\]
14 | 
15 | ### Citations
16 | 
17 | Please cite our papers if they are helpful to your work:
18 | 
19 | ```bibtex
20 | @inproceedings{Liu2020AANets,
21 |   author    = {Liu, Yaoyao and Schiele, Bernt and Sun, Qianru},
22 |   title     = {Adaptive Aggregation Networks for Class-Incremental Learning},
23 |   booktitle = {The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
 24 |   pages     = {2544--2553},
25 |   year      = {2021}
26 | }
27 | ```
28 | 
29 | ```bibtex
30 | @inproceedings{liu2020mnemonics,
31 | author    = {Liu, Yaoyao and Su, Yuting and Liu, An{-}An and Schiele, Bernt and Sun, Qianru},
32 | title     = {Mnemonics Training: Multi-Class Incremental Learning without Forgetting},
33 | booktitle = {The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
34 | pages     = {12245--12254},
35 | year      = {2020}
36 | }
37 | ```
38 | 
39 | ### Acknowledgements
40 | 
41 | Our implementation uses the source code from the following repositories:
42 | 
43 | * [Learning a Unified Classifier Incrementally via Rebalancing](https://github.com/hshustc/CVPR19_Incremental_Learning)
44 | 
45 | * [iCaRL: Incremental Classifier and Representation Learning](https://github.com/srebuffi/iCaRL)
46 | 
47 | * [Dataset Distillation](https://github.com/SsnL/dataset-distillation)
48 | 
49 | * [Generative Teaching Networks](https://github.com/uber-research/GTN)
50 | 


--------------------------------------------------------------------------------
/adaptive-aggregation-networks/README.md:
--------------------------------------------------------------------------------
  1 | ## Adaptive Aggregation Networks for Class-Incremental Learning
  2 | 
  3 | [![LICENSE](https://img.shields.io/badge/license-MIT-green?style=flat-square)](https://github.com/yaoyao-liu/class-incremental-learning/blob/master/LICENSE)
  4 | [![Python](https://img.shields.io/badge/python-3.6-blue.svg?style=flat-square&logo=python&color=3776AB)](https://www.python.org/)
  5 | [![PyTorch](https://img.shields.io/badge/pytorch-1.2.0-%237732a8?style=flat-square&logo=PyTorch&color=EE4C2C)](https://pytorch.org/)
  6 | <!--
  7 | [![Citations](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/yaoyao-liu/yaoyao-liu.github.io/google-scholar-stats/gs_data_shieldsio_aanets.json&logo=Google%20Scholar&color=5087ec&style=flat-square&label=citations)](https://scholar.google.com/citations?view_op=view_citation&hl=en&user=Uf9GqRsAAAAJ&citation_for_view=Uf9GqRsAAAAJ:u_35RYKgDlwC)
  8 | -->
  9 | 
 10 | \[[PDF](https://openaccess.thecvf.com/content/CVPR2021/papers/Liu_Adaptive_Aggregation_Networks_for_Class-Incremental_Learning_CVPR_2021_paper.pdf)\] \[[Project Page](https://class-il.mpi-inf.mpg.de/)\] \[[GitLab@MPI](https://gitlab.mpi-klsb.mpg.de/yaoyaoliu/adaptive-aggregation-networks)\] 
 11 | 
 12 | #### Summary
 13 | 
 14 | * [Introduction](#introduction)
 15 | * [Getting Started](#getting-started)
 16 | * [Download the Datasets](#download-the-datasets)
 17 | * [Running Experiments](#running-experiments)
 18 | * [Citation](#citation)
 19 | * [Acknowledgements](#acknowledgements)
 20 | 
 21 | ### Introduction
 22 | 
  23 | Class-Incremental Learning (CIL) aims to learn a classification model with the number of classes increasing phase-by-phase. The inherent problem in CIL is the stability-plasticity dilemma between the learning of old and new classes, i.e., high-plasticity models easily forget old classes, while high-stability models struggle to learn new classes. We alleviate this issue by proposing a novel network architecture called Adaptive Aggregation Networks (AANets), in which we explicitly build two residual blocks at each residual level (taking ResNet as the baseline architecture): a stable block and a plastic block. We aggregate the output feature maps from these two blocks and then feed the results to the next-level blocks. We meta-learn the aggregating weights in order to dynamically optimize and balance between the two types of blocks, i.e., between stability and plasticity. We conduct extensive experiments on three CIL benchmarks: CIFAR-100, ImageNet-Subset, and ImageNet, and show that many existing CIL methods can be straightforwardly incorporated into the architecture of AANets to boost their performance.
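
The aggregation step described above can be sketched as follows. This is a minimal, framework-free illustration, not the repository's implementation: it assumes a single scalar weight `alpha` per residual level and a convex combination of the two blocks' outputs. The names `aggregate`, `stable_out`, and `plastic_out` are hypothetical, and feature maps are flattened to plain lists of floats to keep the example self-contained.

```python
from typing import List


def aggregate(stable_out: List[float], plastic_out: List[float], alpha: float) -> List[float]:
    """Blend the outputs of a stable and a plastic block at one residual level.

    Assumption (for illustration only): one scalar weight per level, applied
    as alpha * stable + (1 - alpha) * plastic.
    """
    if len(stable_out) != len(plastic_out):
        raise ValueError("both blocks must produce outputs of the same shape")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha is expected to lie in [0, 1]")
    return [alpha * s + (1.0 - alpha) * p for s, p in zip(stable_out, plastic_out)]


# alpha = 1.0 keeps only the stable block, alpha = 0.0 only the plastic block.
stable = [1.0, 2.0, 3.0]
plastic = [3.0, 2.0, 1.0]
mixed = aggregate(stable, plastic, alpha=0.5)
print(mixed)
```

In the method itself the weights are meta-learned on the balanced set of exemplars, which this sketch does not attempt to show.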
 24 | 
 25 | <p align="center">
 26 |     <img src="https://images.yyliu.net/AANets-1.png" width="800"/>
 27 | </p>
 28 | 
  29 | > Figure: Conceptual illustrations of different CIL methods. (a) Conventional methods use all available data (imbalanced classes) to train the model (Rebuffi et al., 2017; Hou et al., 2019). (b) Castro et al. (2018), Hou et al. (2019) and Douillard et al. (2020) follow the convention but add a fine-tuning step using the balanced set of exemplars. (c) Our AANets approach uses all available data to update the plastic and stable blocks, and uses the balanced set of exemplars to meta-learn the aggregating weights. We continuously update these weights so as to dynamically balance between plastic and stable blocks, i.e., between plasticity and stability.
 30 | 
 31 | ### Getting Started
 32 | 
  33 | To run this repository, we advise installing Python 3.6 and PyTorch 1.2.0 with Anaconda.
  34 | 
  35 | You may download Anaconda and read the installation instructions on the official website:
 36 | <https://www.anaconda.com/download/>
 37 | 
 38 | Create a new environment and install PyTorch and torchvision on it:
 39 | 
 40 | ```bash
 41 | conda create --name AANets-PyTorch python=3.6
 42 | conda activate AANets-PyTorch
 43 | conda install pytorch=1.2.0 
 44 | conda install torchvision -c pytorch
 45 | ```
 46 | 
 47 | Install other requirements:
 48 | ```bash
  49 | pip install tqdm scipy scikit-learn tensorboardX Pillow==6.2.2
 50 | ```
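
Once the packages are installed, a quick import check can catch a missing dependency before a long run starts. The helper below is a sketch using only the standard library. `missing_modules` is a hypothetical name, and the list of import names (for example `PIL` for Pillow and `sklearn` for scikit-learn) is an assumption about how these packages are imported.

```python
import importlib.util
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Import names assumed for the packages listed above.
required = ["torch", "torchvision", "tqdm", "scipy", "sklearn", "tensorboardX", "PIL"]
missing = missing_modules(required)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules were found.")
```

The check only confirms that each module can be located; it does not verify versions.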
 51 | 
 52 | Clone this repository and enter the folder `adaptive-aggregation-networks`:
 53 | ```bash
 54 | git clone https://github.com/yaoyao-liu/class-incremental-learning.git
 55 | cd class-incremental-learning/adaptive-aggregation-networks
 56 | 
 57 | ```
 58 | 
 59 | ### Download the Datasets
 60 | #### CIFAR-100
 61 | It will be downloaded automatically by `torchvision` when running the experiments.
 62 | 
 63 | #### ImageNet-Subset
 64 | We create the ImageNet-Subset following [LUCIR](https://github.com/hshustc/CVPR19_Incremental_Learning).
 65 | You may download the dataset using the following links:
 66 | - [Download from Google Drive](https://drive.google.com/file/d/1n5Xg7Iye_wkzVKc0MTBao5adhYSUlMCL/view?usp=sharing)
  67 | - [Download from Baidu Netdisk](https://pan.baidu.com/s/1MnhITYKUI1i7aRBzsPrCSw) (extraction code: 6uj5)
 68 | 
 69 | File information:
 70 | ```
 71 | File name: ImageNet-Subset.tar
 72 | Size: 15.37 GB
 73 | MD5: ab2190e9dac15042a141561b9ba5d6e9
 74 | ```
 75 | You need to untar the downloaded file, and put the folder `seed_1993_subset_100_imagenet` in `class-incremental-learning/adaptive-aggregation-networks/data`.
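
Because the archive is large, it may be worth comparing its checksum against the MD5 listed in the file information before extracting it. The snippet below is a sketch using only the standard library. `file_md5` is a hypothetical helper, and the archive path is an assumption about where the file was saved.

```python
import hashlib
from pathlib import Path

# MD5 listed in the file information above.
EXPECTED_MD5 = "ab2190e9dac15042a141561b9ba5d6e9"


def file_md5(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Hypothetical location of the downloaded archive; adjust to your setup.
archive = Path("ImageNet-Subset.tar")
if archive.exists():
    status = "OK" if file_md5(archive) == EXPECTED_MD5 else "MISMATCH"
    print(f"{archive}: {status}")
else:
    print(f"{archive} not found; download it first.")
```

Reading in fixed-size chunks keeps memory use flat, which matters for a file of roughly 15 GB.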
 76 | 
 77 | Please note that the ImageNet-Subset is created from ImageNet. ImageNet is only allowed to be downloaded by researchers for non-commercial research and educational purposes. See the terms of ImageNet [here](https://image-net.org/download.php).
 78 | 
 79 | ### Running Experiments
 80 | #### Running Experiments w/ AANets on CIFAR-100
 81 | 
 82 | [LUCIR](https://github.com/hshustc/CVPR19_Incremental_Learning) w/ AANets
 83 | ```bash
 84 | python main.py --nb_cl_fg=50 --nb_cl=10 --gpu=0 --random_seed=1993 --baseline=lucir --branch_mode=dual --branch_1=ss --branch_2=free --dataset=cifar100
 85 | python main.py --nb_cl_fg=50 --nb_cl=5 --gpu=0 --random_seed=1993 --baseline=lucir --branch_mode=dual --branch_1=ss --branch_2=free --dataset=cifar100
 86 | python main.py --nb_cl_fg=50 --nb_cl=2 --gpu=0 --random_seed=1993 --baseline=lucir --branch_mode=dual --branch_1=ss --branch_2=free --dataset=cifar100
 87 | ```
 88 | 
 89 | [iCaRL](https://github.com/hshustc/CVPR19_Incremental_Learning) w/ AANets
 90 | ```bash
 91 | python main.py --nb_cl_fg=50 --nb_cl=10 --gpu=0 --random_seed=1993 --baseline=icarl --branch_mode=dual --branch_1=ss --branch_2=free --dataset=cifar100 
 92 | python main.py --nb_cl_fg=50 --nb_cl=5 --gpu=0 --random_seed=1993 --baseline=icarl --branch_mode=dual --branch_1=ss --branch_2=free --dataset=cifar100 
 93 | python main.py --nb_cl_fg=50 --nb_cl=2 --gpu=0 --random_seed=1993 --baseline=icarl --branch_mode=dual --branch_1=ss --branch_2=free --dataset=cifar100 
 94 | ```
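
The three commands in each block above differ only in `--nb_cl`. Assuming the classes that remain after the 0-th phase are split evenly across the later phases, the number of incremental phases follows from `--num_classes`, `--nb_cl_fg`, and `--nb_cl`. The helper below is a sketch of that arithmetic; `num_incremental_phases` is a hypothetical name, not a function in this repository.

```python
def num_incremental_phases(num_classes: int = 100, nb_cl_fg: int = 50, nb_cl: int = 10) -> int:
    """Number of phases after the 0-th phase, assuming an even split of the remaining classes."""
    # These two checks mirror the assertions in main.py.
    assert nb_cl_fg % nb_cl == 0
    assert nb_cl_fg >= nb_cl
    remaining = num_classes - nb_cl_fg
    # Extra check added for this sketch: the remaining classes must split evenly.
    assert remaining % nb_cl == 0
    return remaining // nb_cl


for nb_cl in (10, 5, 2):
    print(f"--nb_cl={nb_cl}: {num_incremental_phases(nb_cl=nb_cl)} incremental phases")
```

With 100 classes in total and 50 in the 0-th phase, the three settings give 5, 10, and 25 incremental phases.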
 95 | 
 96 | #### Running Baseline Experiments on CIFAR-100
 97 | 
 98 | [LUCIR](https://github.com/hshustc/CVPR19_Incremental_Learning) w/o AANets, dual branch
 99 | ```bash
100 | python main.py --nb_cl_fg=50 --nb_cl=10 --gpu=0 --random_seed=1993 --baseline=lucir --branch_mode=dual --branch_1=free --branch_2=free --fusion_lr=0.0 --dataset=cifar100
 101 | python main.py --nb_cl_fg=50 --nb_cl=5 --gpu=0 --random_seed=1993 --baseline=lucir --branch_mode=dual --branch_1=free --branch_2=free --fusion_lr=0.0 --dataset=cifar100
102 | python main.py --nb_cl_fg=50 --nb_cl=2 --gpu=0 --random_seed=1993 --baseline=lucir --branch_mode=dual --branch_1=free --branch_2=free --fusion_lr=0.0 --dataset=cifar100
103 | ```
104 | 
105 | [iCaRL](https://github.com/hshustc/CVPR19_Incremental_Learning) w/o AANets, dual branch
106 | ```bash
107 | python main.py --nb_cl_fg=50 --nb_cl=10 --gpu=0 --random_seed=1993 --baseline=icarl --branch_mode=dual --branch_1=free --branch_2=free --fusion_lr=0.0 --dataset=cifar100 
108 | python main.py --nb_cl_fg=50 --nb_cl=5 --gpu=0 --random_seed=1993 --baseline=icarl --branch_mode=dual --branch_1=free --branch_2=free --fusion_lr=0.0 --dataset=cifar100 
109 | python main.py --nb_cl_fg=50 --nb_cl=2 --gpu=0 --random_seed=1993 --baseline=icarl --branch_mode=dual --branch_1=free --branch_2=free --fusion_lr=0.0 --dataset=cifar100 
110 | ```
111 | 
112 | [LUCIR](https://github.com/hshustc/CVPR19_Incremental_Learning) w/o AANets, single branch
113 | ```bash
114 | python main.py --nb_cl_fg=50 --nb_cl=10 --gpu=0 --random_seed=1993 --baseline=lucir --branch_mode=single --branch_1=free --dataset=cifar100
 115 | python main.py --nb_cl_fg=50 --nb_cl=5 --gpu=0 --random_seed=1993 --baseline=lucir --branch_mode=single --branch_1=free --dataset=cifar100
116 | python main.py --nb_cl_fg=50 --nb_cl=2 --gpu=0 --random_seed=1993 --baseline=lucir --branch_mode=single --branch_1=free --dataset=cifar100
117 | ```
118 | 
119 | [iCaRL](https://github.com/hshustc/CVPR19_Incremental_Learning) w/o AANets, single branch
120 | ```bash
121 | python main.py --nb_cl_fg=50 --nb_cl=10 --gpu=0 --random_seed=1993 --baseline=icarl --branch_mode=single --branch_1=free --dataset=cifar100 
122 | python main.py --nb_cl_fg=50 --nb_cl=5 --gpu=0 --random_seed=1993 --baseline=icarl --branch_mode=single --branch_1=free --dataset=cifar100 
123 | python main.py --nb_cl_fg=50 --nb_cl=2 --gpu=0 --random_seed=1993 --baseline=icarl --branch_mode=single --branch_1=free --dataset=cifar100 
124 | ```
125 | 
126 | #### Running Experiments on ImageNet-Subset
127 | [LUCIR](https://github.com/hshustc/CVPR19_Incremental_Learning) w/ AANets
128 | ```bash
129 | python main.py --nb_cl_fg=50 --nb_cl=10 --gpu=0 --random_seed=1993 --baseline=lucir --branch_mode=dual --branch_1=ss --branch_2=free --dataset=imagenet_sub --test_batch_size=50 --epochs=90 --num_workers=1 --custom_weight_decay=0.0005 --the_lambda=10 --K=2 --dist=0.5 --lw_mr=1 --base_lr1=0.05 --base_lr2=0.05 --dynamic_budget
130 | python main.py --nb_cl_fg=50 --nb_cl=5 --gpu=0 --random_seed=1993 --baseline=lucir --branch_mode=dual --branch_1=ss --branch_2=free --dataset=imagenet_sub --test_batch_size=50 --epochs=90 --num_workers=1 --custom_weight_decay=0.0005 --the_lambda=10 --K=2 --dist=0.5 --lw_mr=1 --base_lr1=0.05 --base_lr2=0.05 --dynamic_budget
131 | python main.py --nb_cl_fg=50 --nb_cl=2 --gpu=0 --random_seed=1993 --baseline=lucir --branch_mode=dual --branch_1=ss --branch_2=free --dataset=imagenet_sub --test_batch_size=50 --epochs=90 --num_workers=1 --custom_weight_decay=0.0005 --the_lambda=10 --K=2 --dist=0.5 --lw_mr=1 --base_lr1=0.05 --base_lr2=0.05 --dynamic_budget
132 | ```
133 | 
134 | ### Code for [PODNet](https://github.com/arthurdouillard/incremental_learning.pytorch) w/ AANets
135 | 
 136 | We are still cleaning up the code for [PODNet](https://github.com/arthurdouillard/incremental_learning.pytorch) w/ AANets and will add it to the GitHub repository later.
137 | <br>
138 | If you need to use it now, here is a preliminary version: <https://github.com/yaoyao-liu/POD-AANets>
139 | <br>
140 | Please note that you need to install the same environment as [PODNet](https://github.com/arthurdouillard/incremental_learning.pytorch) to run this code.
141 | 
142 | ### Accuracy for Each Phase
143 | 
144 | We provide the accuracy for each phase on CIFAR-100, ImageNet-Subset, and ImageNet-Full in different settings (*N=5, 10, 25*).
145 | <br>
146 | You may view the results using the following link:
147 | [\[Google Sheet Link\]](https://docs.google.com/spreadsheets/d/1rSA0IH7OilDgfx2cvl86ixjVno4I15bmrDWkS4cUtBA/edit?usp=sharing)
148 | <br>
 149 | Please note that we re-ran some experiments, so some results differ slightly from the tables in the paper.
150 | 
151 | 
152 | ### Citation
153 | 
154 | Please cite our paper if it is helpful to your work:
155 | 
156 | ```bibtex
157 | @inproceedings{Liu2020AANets,
158 |   author    = {Liu, Yaoyao and Schiele, Bernt and Sun, Qianru},
159 |   title     = {Adaptive Aggregation Networks for Class-Incremental Learning},
160 |   booktitle = {The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
 161 |   pages     = {2544--2553},
162 |   year      = {2021}
163 | }
164 | ```
165 | 
166 | ### Acknowledgements
167 | 
168 | Our implementation uses the source code from the following repositories:
169 | 
170 | * [Learning a Unified Classifier Incrementally via Rebalancing](https://github.com/hshustc/CVPR19_Incremental_Learning)
171 | 
172 | * [iCaRL: Incremental Classifier and Representation Learning](https://github.com/srebuffi/iCaRL)
173 | 
174 | * [PODNet: Pooled Outputs Distillation for Small-Tasks Incremental Learning](https://github.com/arthurdouillard/incremental_learning.pytorch)
175 | 


--------------------------------------------------------------------------------
/adaptive-aggregation-networks/main.py:
--------------------------------------------------------------------------------
 1 | ##+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 | ## Created by: Yaoyao Liu
 3 | ## Max Planck Institute for Informatics
 4 | ## yaoyao.liu@mpi-inf.mpg.de
 5 | ## Copyright (c) 2021
 6 | ##
 7 | ## This source code is licensed under the MIT-style license found in the
 8 | ## LICENSE file in the root directory of this source tree
 9 | ##+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
10 | """ Main function for this project. """
11 | import os
12 | import argparse
13 | import numpy as np
14 | from trainer.trainer import Trainer
15 | from utils.gpu_tools import occupy_memory
16 | 
17 | if __name__ == '__main__':
18 |     parser = argparse.ArgumentParser()
19 | 
20 |     ### Basic parameters
21 |     parser.add_argument('--gpu', default='0', help='the index of GPU')
22 |     parser.add_argument('--dataset', default='cifar100', type=str, choices=['cifar100', 'imagenet_sub', 'imagenet'])
23 |     parser.add_argument('--data_dir', default='data/seed_1993_subset_100_imagenet/data', type=str)
24 |     parser.add_argument('--baseline', default='lucir', type=str, choices=['lucir', 'icarl'], help='baseline method')
25 |     parser.add_argument('--ckpt_label', type=str, default='exp01', help='the label for the checkpoints')
26 |     parser.add_argument('--ckpt_dir_fg', type=str, default='-', help='the checkpoint file for the 0-th phase')
27 |     parser.add_argument('--resume_fg', action='store_true', help='resume 0-th phase model from the checkpoint')
28 |     parser.add_argument('--resume', action='store_true', help='resume from the checkpoints')
29 |     parser.add_argument('--num_workers', default=1, type=int, help='the number of workers for loading data')
30 |     parser.add_argument('--random_seed', default=1993, type=int, help='random seed')
31 |     parser.add_argument('--train_batch_size', default=128, type=int, help='the batch size for train loader')
32 |     parser.add_argument('--test_batch_size', default=100, type=int, help='the batch size for test loader')
33 |     parser.add_argument('--eval_batch_size', default=128, type=int, help='the batch size for validation loader')
34 |     parser.add_argument('--disable_gpu_occupancy', action='store_false', help='disable GPU occupancy')
35 | 
36 |     ### Network architecture parameters
37 |     parser.add_argument('--branch_mode', default='dual', type=str, choices=['dual', 'single'], help='the branch mode for AANets')
38 |     parser.add_argument('--branch_1', default='ss', type=str, choices=['ss', 'fixed', 'free'], help='the network type for the first branch')
39 |     parser.add_argument('--branch_2', default='free', type=str, choices=['ss', 'fixed', 'free'], help='the network type for the second branch')
40 |     parser.add_argument('--imgnet_backbone', default='resnet18', type=str, choices=['resnet18', 'resnet34'], help='network backbone for ImageNet')
41 | 
42 |     ### Incremental learning parameters
43 |     parser.add_argument('--num_classes', default=100, type=int, help='the total number of classes')
44 |     parser.add_argument('--nb_cl_fg', default=50, type=int, help='the number of classes in the 0-th phase')
45 |     parser.add_argument('--nb_cl', default=10, type=int, help='the number of classes for each phase')
46 |     parser.add_argument('--nb_protos', default=20, type=int, help='the number of exemplars for each class')
47 |     parser.add_argument('--epochs', default=160, type=int, help='the number of epochs')
48 |     parser.add_argument('--dynamic_budget', action='store_true', help='using dynamic budget setting')
49 |     parser.add_argument('--fusion_lr', default=1e-8, type=float, help='the learning rate for the aggregation weights')
50 | 
51 |     ### General learning parameters
52 |     parser.add_argument('--lr_factor', default=0.1, type=float, help='learning rate decay factor')
53 |     parser.add_argument('--custom_weight_decay', default=5e-4, type=float, help='weight decay parameter for the optimizer')
54 |     parser.add_argument('--custom_momentum', default=0.9, type=float, help='momentum parameter for the optimizer')
55 |     parser.add_argument('--base_lr1', default=0.1, type=float, help='learning rate for the 0-th phase')
56 |     parser.add_argument('--base_lr2', default=0.1, type=float, help='learning rate for the following phases')
57 | 
58 |     ### LUCIR parameters
 59 |     parser.add_argument('--the_lambda', default=5, type=float, help='lambda for LF')
60 |     parser.add_argument('--dist', default=0.5, type=float, help='dist for margin ranking losses')
61 |     parser.add_argument('--K', default=2, type=int, help='K for margin ranking losses')
62 |     parser.add_argument('--lw_mr', default=1, type=float, help='loss weight for margin ranking losses')
63 | 
64 |     ### iCaRL parameters
65 |     parser.add_argument('--icarl_beta', default=0.25, type=float, help='beta for iCaRL')
66 |     parser.add_argument('--icarl_T', default=2, type=int, help='T for iCaRL')
67 | 
68 |     the_args = parser.parse_args()
69 | 
 70 |     # Check the number of classes to ensure they are reasonable
71 |     assert(the_args.nb_cl_fg % the_args.nb_cl == 0)
72 |     assert(the_args.nb_cl_fg >= the_args.nb_cl)
73 | 
74 |     # Print the parameters
75 |     print(the_args)
76 | 
77 |     # Set GPU index
78 |     os.environ['CUDA_VISIBLE_DEVICES'] = the_args.gpu
79 |     print('Using gpu:', the_args.gpu)
80 | 
 81 |     # Occupy GPU memory in advance (the flag uses store_false, so this is True unless --disable_gpu_occupancy is passed)
82 |     if the_args.disable_gpu_occupancy:
83 |         occupy_memory(the_args.gpu)
84 |         print('Occupy GPU memory in advance.')
85 | 
86 |     # Set the trainer and start training
87 |     trainer = Trainer(the_args)
88 |     trainer.train()
89 | 


--------------------------------------------------------------------------------
/adaptive-aggregation-networks/models/modified_linear.py:
--------------------------------------------------------------------------------
 1 | ##+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 | ## Created by: Yaoyao Liu
 3 | ## Modified from: https://github.com/hshustc/CVPR19_Incremental_Learning
 4 | ## Max Planck Institute for Informatics
 5 | ## yaoyao.liu@mpi-inf.mpg.de
 6 | ## Copyright (c) 2021
 7 | ##
 8 | ## This source code is licensed under the MIT-style license found in the
 9 | ## LICENSE file in the root directory of this source tree
10 | ##+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
11 | import math
12 | import torch
13 | from torch.nn.parameter import Parameter
14 | from torch.nn import functional as F
15 | from torch.nn import Module
16 | 
17 | class CosineLinear(Module):
18 |     def __init__(self, in_features, out_features, sigma=True):
19 |         super(CosineLinear, self).__init__()
20 |         self.in_features = in_features
21 |         self.out_features = out_features
22 |         self.weight = Parameter(torch.Tensor(out_features, in_features))
23 |         if sigma:
24 |             self.sigma = Parameter(torch.Tensor(1))
25 |         else:
26 |             self.register_parameter('sigma', None)
27 |         self.reset_parameters()
28 | 
29 |     def reset_parameters(self):
30 |         stdv = 1. / math.sqrt(self.weight.size(1))
31 |         self.weight.data.uniform_(-stdv, stdv)
32 |         if self.sigma is not None:
33 |             self.sigma.data.fill_(1)
34 | 
35 |     def forward(self, input):
36 |         out = F.linear(F.normalize(input, p=2,dim=1), \
37 |                 F.normalize(self.weight, p=2, dim=1))
38 |         if self.sigma is not None:
39 |             out = self.sigma * out
40 |         return out
41 | 
42 | class SplitCosineLinear(Module):
43 |     def __init__(self, in_features, out_features1, out_features2, sigma=True):
44 |         super(SplitCosineLinear, self).__init__()
45 |         self.in_features = in_features
46 |         self.out_features = out_features1 + out_features2
47 |         self.fc1 = CosineLinear(in_features, out_features1, False)
48 |         self.fc2 = CosineLinear(in_features, out_features2, False)
49 |         if sigma:
50 |             self.sigma = Parameter(torch.Tensor(1))
51 |             self.sigma.data.fill_(1)
52 |         else:
53 |             self.register_parameter('sigma', None)
54 | 
55 |     def forward(self, x):
56 |         out1 = self.fc1(x)
57 |         out2 = self.fc2(x)
58 |         out = torch.cat((out1, out2), dim=1)
59 |         if self.sigma is not None:
60 |             out = self.sigma * out
61 |         return out
62 | 


--------------------------------------------------------------------------------
/adaptive-aggregation-networks/models/modified_resnet.py:
--------------------------------------------------------------------------------
  1 | ##+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  2 | ## Created by: Yaoyao Liu
  3 | ## Modified from: https://github.com/hshustc/CVPR19_Incremental_Learning
  4 | ## Max Planck Institute for Informatics
  5 | ## yaoyao.liu@mpi-inf.mpg.de
  6 | ## Copyright (c) 2021
  7 | ##
  8 | ## This source code is licensed under the MIT-style license found in the
  9 | ## LICENSE file in the root directory of this source tree
 10 | ##+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 11 | import torch.nn as nn
 12 | import math
 13 | import torch.utils.model_zoo as model_zoo
 14 | import models.modified_linear as modified_linear
 15 | 
 16 | def conv3x3(in_planes, out_planes, stride=1):
 17 |     return nn.Conv2d(in_planes, out_planes, kernel_size=3, stride=stride,
 18 |                      padding=1, bias=False)
 19 | 
 20 | class BasicBlock(nn.Module):
 21 |     expansion = 1
 22 | 
 23 |     def __init__(self, inplanes, planes, stride=1, downsample=None, last=False):
 24 |         super(BasicBlock, self).__init__()
 25 |         self.conv1 = conv3x3(inplanes, planes, stride)
 26 |         self.bn1 = nn.BatchNorm2d(planes)
 27 |         self.relu = nn.ReLU(inplace=True)
 28 |         self.conv2 = conv3x3(planes, planes)
 29 |         self.bn2 = nn.BatchNorm2d(planes)
 30 |         self.downsample = downsample
 31 |         self.stride = stride
 32 |         self.last = last
 33 | 
 34 |     def forward(self, x):
 35 |         residual = x
 36 | 
 37 |         out = self.conv1(x)
 38 |         out = self.bn1(out)
 39 |         out = self.relu(out)
 40 | 
 41 |         out = self.conv2(out)
 42 |         out = self.bn2(out)
 43 | 
 44 |         if self.downsample is not None:
 45 |             residual = self.downsample(x)
 46 | 
 47 |         out += residual
 48 |         if not self.last: 
 49 |             out = self.relu(out)
 50 | 
 51 |         return out
 52 | 
 53 | class ResNet(nn.Module):
 54 | 
 55 |     def __init__(self, block, layers, num_classes=1000):
 56 |         self.inplanes = 64
 57 |         super(ResNet, self).__init__()
 58 |         self.conv1 = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3,
 59 |                                bias=False)
 60 |         self.bn1 = nn.BatchNorm2d(64)
 61 |         self.relu = nn.ReLU(inplace=True)
 62 |         self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
 63 |         self.layer1 = self._make_layer(block, 64, layers[0])
 64 |         self.layer2 = self._make_layer(block, 128, layers[1], stride=2)
 65 |         self.layer3 = self._make_layer(block, 256, layers[2], stride=2)
 66 |         self.layer4 = self._make_layer(block, 512, layers[3], stride=2, last_phase=True)
 67 |         self.avgpool = nn.AvgPool2d(7, stride=1)
 68 |         self.fc = modified_linear.CosineLinear(512 * block.expansion, num_classes)
 69 | 
 70 |         for m in self.modules():
 71 |             if isinstance(m, nn.Conv2d):
 72 |                 nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
 73 |             elif isinstance(m, nn.BatchNorm2d):
 74 |                 nn.init.constant_(m.weight, 1)
 75 |                 nn.init.constant_(m.bias, 0)
 76 | 
 77 |     def _make_layer(self, block, planes, blocks, stride=1, last_phase=False):
 78 |         downsample = None
 79 |         if stride != 1 or self.inplanes != planes * block.expansion:
 80 |             downsample = nn.Sequential(
 81 |                 nn.Conv2d(self.inplanes, planes * block.expansion,
 82 |                           kernel_size=1, stride=stride, bias=False),
 83 |                 nn.BatchNorm2d(planes * block.expansion),
 84 |             )
 85 | 
 86 |         layers = []
 87 |         layers.append(block(self.inplanes, planes, stride, downsample))
 88 |         self.inplanes = planes * block.expansion
 89 |         if last_phase:
 90 |             for i in range(1, blocks-1):
 91 |                 layers.append(block(self.inplanes, planes))
 92 |             layers.append(block(self.inplanes, planes, last=True))
 93 |         else: 
 94 |             for i in range(1, blocks):
 95 |                 layers.append(block(self.inplanes, planes))
 96 | 
 97 |         return nn.Sequential(*layers)
 98 | 
 99 |     def forward(self, x):
100 |         x = self.conv1(x)
101 |         x = self.bn1(x)
102 |         x = self.relu(x)
103 |         x = self.maxpool(x)
104 | 
105 |         x = self.layer1(x)
106 |         x = self.layer2(x)
107 |         x = self.layer3(x)
108 |         x = self.layer4(x)
109 | 
110 |         x = self.avgpool(x)
111 |         x = x.view(x.size(0), -1)
112 |         x = self.fc(x)
113 | 
114 |         return x
115 | 
116 | def resnet18(pretrained=False, **kwargs):
117 |     model = ResNet(BasicBlock, [2, 2, 2, 2], **kwargs)
118 |     return model
119 | 
120 | def resnet34(pretrained=False, **kwargs):
121 |     model = ResNet(BasicBlock, [3, 4, 6, 3], **kwargs)
122 |     return model
123 | 
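
The `nn.AvgPool2d(7, stride=1)` in the ImageNet `ResNet` above only reduces the feature map to 1x1 when the input is 224x224. The sketch below traces the spatial side length through the stem and the four stages using the standard convolution output-size formula. The helper names `conv_out` and `imagenet_resnet_sizes` are hypothetical and not part of the repository; the kernel, stride, and padding values are taken from the code above.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output length of a conv/pool along one spatial axis (floor mode)."""
    return (size + 2 * padding - kernel) // stride + 1


def imagenet_resnet_sizes(size=224):
    """Trace the feature-map side length through the ImageNet ResNet layout."""
    sizes = {}
    size = conv_out(size, kernel=7, stride=2, padding=3)  # conv1
    sizes["conv1"] = size
    size = conv_out(size, kernel=3, stride=2, padding=1)  # maxpool
    sizes["maxpool"] = size
    stages = [("layer1", 1), ("layer2", 2), ("layer3", 2), ("layer4", 2)]
    for name, stride in stages:
        # Only the first 3x3 conv of each stage is strided; the rest keep the size.
        size = conv_out(size, kernel=3, stride=stride, padding=1)
        sizes[name] = size
    # nn.AvgPool2d(7, stride=1) behaves like an unpadded 7x7 window.
    sizes["avgpool"] = conv_out(size, kernel=7, stride=1)
    return sizes


if __name__ == "__main__":
    print(imagenet_resnet_sizes(224))
    print(imagenet_resnet_sizes(256))
```

With a 256x256 input the final pooled map is 2x2 rather than 1x1, so the flattened feature vector would no longer match the `512 * block.expansion` inputs expected by `self.fc`. Resizing or cropping inputs to 224x224 avoids this.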


--------------------------------------------------------------------------------
/adaptive-aggregation-networks/models/modified_resnet_cifar.py:
--------------------------------------------------------------------------------
  1 | ##+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  2 | ## Created by: Yaoyao Liu
  3 | ## Modified from: https://github.com/hshustc/CVPR19_Incremental_Learning
  4 | ## Max Planck Institute for Informatics
  5 | ## yaoyao.liu@mpi-inf.mpg.de
  6 | ## Copyright (c) 2021
  7 | ##
  8 | ## This source code is licensed under the MIT-style license found in the
  9 | ## LICENSE file in the root directory of this source tree
 10 | ##+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 11 | import torch.nn as nn
 12 | import math
 13 | import torch.utils.model_zoo as model_zoo
 14 | import models.modified_linear as modified_linear
 15 | 
 16 | def conv3x3(in_planes, out_planes, stride=1):
 17 |     """3x3 convolution with padding"""
 18 |     return nn.Conv2d(in_planes, out_planes, kernel_size=3, stride=stride,
 19 |                      padding=1, bias=False)
 20 | 
 21 | class BasicBlock(nn.Module):
 22 |     expansion = 1
 23 | 
 24 |     def __init__(self, inplanes, planes, stride=1, downsample=None, last=False):
 25 |         super(BasicBlock, self).__init__()
 26 |         self.conv1 = conv3x3(inplanes, planes, stride)
 27 |         self.bn1 = nn.BatchNorm2d(planes)
 28 |         self.relu = nn.ReLU(inplace=True)
 29 |         self.conv2 = conv3x3(planes, planes)
 30 |         self.bn2 = nn.BatchNorm2d(planes)
 31 |         self.downsample = downsample
 32 |         self.stride = stride
 33 |         self.last = last
 34 | 
 35 |     def forward(self, x):
 36 |         residual = x
 37 | 
 38 |         out = self.conv1(x)
 39 |         out = self.bn1(out)
 40 |         out = self.relu(out)
 41 | 
 42 |         out = self.conv2(out)
 43 |         out = self.bn2(out)
 44 | 
 45 |         if self.downsample is not None:
 46 |             residual = self.downsample(x)
 47 | 
 48 |         out += residual
 49 |         if not self.last: 
 50 |             out = self.relu(out)
 51 | 
 52 |         return out
 53 | 
 54 | class ResNet(nn.Module):
 55 | 
 56 |     def __init__(self, block, layers, num_classes=10):
 57 |         self.inplanes = 16
 58 |         super(ResNet, self).__init__()
 59 |         self.conv1 = nn.Conv2d(3, 16, kernel_size=3, stride=1, padding=1,
 60 |                                bias=False)
 61 |         self.bn1 = nn.BatchNorm2d(16)
 62 |         self.relu = nn.ReLU(inplace=True)
 63 |         self.layer1 = self._make_layer(block, 16, layers[0])
 64 |         self.layer2 = self._make_layer(block, 32, layers[1], stride=2)
 65 |         self.layer3 = self._make_layer(block, 64, layers[2], stride=2, last_phase=True)
 66 |         self.avgpool = nn.AvgPool2d(8, stride=1)
 67 |         self.fc = modified_linear.CosineLinear(64 * block.expansion, num_classes)
 68 | 
 69 |         for m in self.modules():
 70 |             if isinstance(m, nn.Conv2d):
 71 |                 nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
 72 |             elif isinstance(m, nn.BatchNorm2d):
 73 |                 nn.init.constant_(m.weight, 1)
 74 |                 nn.init.constant_(m.bias, 0)
 75 | 
 76 |     def _make_layer(self, block, planes, blocks, stride=1, last_phase=False):
 77 |         downsample = None
 78 |         if stride != 1 or self.inplanes != planes * block.expansion:
 79 |             downsample = nn.Sequential(
 80 |                 nn.Conv2d(self.inplanes, planes * block.expansion,
 81 |                           kernel_size=1, stride=stride, bias=False),
 82 |                 nn.BatchNorm2d(planes * block.expansion),
 83 |             )
 84 | 
 85 |         layers = []
 86 |         layers.append(block(self.inplanes, planes, stride, downsample))
 87 |         self.inplanes = planes * block.expansion
 88 |         if last_phase:
 89 |             for i in range(1, blocks-1):
 90 |                 layers.append(block(self.inplanes, planes))
 91 |             layers.append(block(self.inplanes, planes, last=True))
 92 |         else: 
 93 |             for i in range(1, blocks):
 94 |                 layers.append(block(self.inplanes, planes))
 95 | 
 96 |         return nn.Sequential(*layers)
 97 | 
 98 |     def forward(self, x):
 99 |         x = self.conv1(x)
100 |         x = self.bn1(x)
101 |         x = self.relu(x)
102 | 
103 |         x = self.layer1(x)
104 |         x = self.layer2(x)
105 |         x = self.layer3(x)
106 | 
107 |         x = self.avgpool(x)
108 |         x = x.view(x.size(0), -1)
109 |         x = self.fc(x)
110 | 
111 |         return x
112 | 
113 | def resnet20(pretrained=False, **kwargs):
114 |     n = 3
115 |     model = ResNet(BasicBlock, [n, n, n], **kwargs)
116 |     return model
117 | 
118 | def resnet32(pretrained=False, **kwargs):
119 |     n = 5
120 |     model = ResNet(BasicBlock, [n, n, n], **kwargs)
121 |     return model
122 | 
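
The `last_phase` branch of `_make_layer` above decides which block is built with `last=True`, that is, which block skips its trailing ReLU. A plausible reason (an assumption, not stated in this file) is that the features fed to `CosineLinear` should be allowed to take negative values. The sketch below mirrors the block-construction logic without PyTorch; `plan_layer` is a hypothetical helper that returns one `last` flag per block.

```python
def plan_layer(blocks, last_phase=False):
    """Return one `last` flag per block, mirroring `_make_layer`."""
    flags = [False]  # first block (carries the stride and optional downsample)
    if last_phase:
        # Middle blocks: range(1, blocks - 1) in the original code.
        flags += [False] * len(range(1, blocks - 1))
        flags.append(True)  # final block skips the trailing ReLU
    else:
        flags += [False] * len(range(1, blocks))
    return flags


if __name__ == "__main__":
    print(plan_layer(3, last_phase=True))  # layer3 of resnet20
    print(plan_layer(5, last_phase=True))  # layer3 of resnet32
    print(plan_layer(1, last_phase=True))  # edge case
```

For `blocks >= 2` the stage contains exactly `blocks` blocks either way. With `blocks == 1` and `last_phase=True`, the logic yields two blocks, because the final `last=True` block is appended unconditionally; the factory functions in this file never hit that case.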


--------------------------------------------------------------------------------
/adaptive-aggregation-networks/models/modified_resnetmtl.py:
--------------------------------------------------------------------------------
  1 | ##+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  2 | ## Created by: Yaoyao Liu
  3 | ## Modified from: https://github.com/hshustc/CVPR19_Incremental_Learning
  4 | ## Max Planck Institute for Informatics
  5 | ## yaoyao.liu@mpi-inf.mpg.de
  6 | ## Copyright (c) 2021
  7 | ##
  8 | ## This source code is licensed under the MIT-style license found in the
  9 | ## LICENSE file in the root directory of this source tree
 10 | ##+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 11 | import torch.nn as nn
 12 | import math
 13 | import torch.utils.model_zoo as model_zoo
 14 | import models.modified_linear as modified_linear
 15 | from utils.incremental.conv2d_mtl import Conv2dMtl
 16 | 
 17 | def conv3x3mtl(in_planes, out_planes, stride=1):
 18 |     """3x3 convolution with padding"""
 19 |     return Conv2dMtl(in_planes, out_planes, kernel_size=3, stride=stride,
 20 |                      padding=1, bias=False)
 21 | 
 22 | 
 23 | class BasicBlockMtl(nn.Module):
 24 |     expansion = 1
 25 | 
 26 |     def __init__(self, inplanes, planes, stride=1, downsample=None, last=False):
 27 |         super(BasicBlockMtl, self).__init__()
 28 |         self.conv1 = conv3x3mtl(inplanes, planes, stride)
 29 |         self.bn1 = nn.BatchNorm2d(planes)
 30 |         self.relu = nn.ReLU(inplace=True)
 31 |         self.conv2 = conv3x3mtl(planes, planes)
 32 |         self.bn2 = nn.BatchNorm2d(planes)
 33 |         self.downsample = downsample
 34 |         self.stride = stride
 35 |         self.last = last
 36 | 
 37 |     def forward(self, x):
 38 |         residual = x
 39 | 
 40 |         out = self.conv1(x)
 41 |         out = self.bn1(out)
 42 |         out = self.relu(out)
 43 | 
 44 |         out = self.conv2(out)
 45 |         out = self.bn2(out)
 46 | 
 47 |         if self.downsample is not None:
 48 |             residual = self.downsample(x)
 49 | 
 50 |         out += residual
 51 |         if not self.last: 
 52 |             out = self.relu(out)
 53 | 
 54 |         return out
 55 | 
 56 | class ResNetMtl(nn.Module):
 57 | 
 58 |     def __init__(self, block, layers, num_classes=1000):
 59 |         self.inplanes = 64
 60 |         super(ResNetMtl, self).__init__()
 61 |         self.conv1 = Conv2dMtl(3, 64, kernel_size=7, stride=2, padding=3,
 62 |                                bias=False)
 63 |         self.bn1 = nn.BatchNorm2d(64)
 64 |         self.relu = nn.ReLU(inplace=True)
 65 |         self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
 66 |         self.layer1 = self._make_layer(block, 64, layers[0])
 67 |         self.layer2 = self._make_layer(block, 128, layers[1], stride=2)
 68 |         self.layer3 = self._make_layer(block, 256, layers[2], stride=2)
 69 |         self.layer4 = self._make_layer(block, 512, layers[3], stride=2, last_phase=True)
 70 |         self.avgpool = nn.AvgPool2d(7, stride=1)
 71 |         self.fc = modified_linear.CosineLinear(512 * block.expansion, num_classes)
 72 | 
 73 |         for m in self.modules():
 74 |             if isinstance(m, Conv2dMtl):
 75 |                 nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
 76 |             elif isinstance(m, nn.BatchNorm2d):
 77 |                 nn.init.constant_(m.weight, 1)
 78 |                 nn.init.constant_(m.bias, 0)
 79 | 
 80 |     def _make_layer(self, block, planes, blocks, stride=1, last_phase=False):
 81 |         downsample = None
 82 |         if stride != 1 or self.inplanes != planes * block.expansion:
 83 |             downsample = nn.Sequential(
 84 |                 Conv2dMtl(self.inplanes, planes * block.expansion,
 85 |                           kernel_size=1, stride=stride, bias=False),
 86 |                 nn.BatchNorm2d(planes * block.expansion),
 87 |             )
 88 | 
 89 |         layers = []
 90 |         layers.append(block(self.inplanes, planes, stride, downsample))
 91 |         self.inplanes = planes * block.expansion
 92 |         if last_phase:
 93 |             for i in range(1, blocks-1):
 94 |                 layers.append(block(self.inplanes, planes))
 95 |             layers.append(block(self.inplanes, planes, last=True))
 96 |         else: 
 97 |             for i in range(1, blocks):
 98 |                 layers.append(block(self.inplanes, planes))
 99 | 
100 |         return nn.Sequential(*layers)
101 | 
102 |     def forward(self, x):
103 |         x = self.conv1(x)
104 |         x = self.bn1(x)
105 |         x = self.relu(x)
106 |         x = self.maxpool(x)
107 | 
108 |         x = self.layer1(x)
109 |         x = self.layer2(x)
110 |         x = self.layer3(x)
111 |         x = self.layer4(x)
112 | 
113 |         x = self.avgpool(x)
114 |         x = x.view(x.size(0), -1)
115 |         x = self.fc(x)
116 | 
117 |         return x
118 | 
119 | def resnetmtl18(pretrained=False, **kwargs):
120 |     model = ResNetMtl(BasicBlockMtl, [2, 2, 2, 2], **kwargs)
121 |     return model
122 | 
123 | def resnetmtl34(pretrained=False, **kwargs):
124 |     model = ResNetMtl(BasicBlockMtl, [3, 4, 6, 3], **kwargs)
125 |     return model
126 | 


--------------------------------------------------------------------------------
/adaptive-aggregation-networks/models/modified_resnetmtl_cifar.py:
--------------------------------------------------------------------------------
  1 | ##+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  2 | ## Created by: Yaoyao Liu
  3 | ## Modified from: https://github.com/hshustc/CVPR19_Incremental_Learning
  4 | ## Max Planck Institute for Informatics
  5 | ## yaoyao.liu@mpi-inf.mpg.de
  6 | ## Copyright (c) 2021
  7 | ##
  8 | ## This source code is licensed under the MIT-style license found in the
  9 | ## LICENSE file in the root directory of this source tree
 10 | ##+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 11 | import torch.nn as nn
 12 | import math
 13 | import torch.utils.model_zoo as model_zoo
 14 | import models.modified_linear as modified_linear
 15 | from utils.incremental.conv2d_mtl import Conv2dMtl
 16 | 
 17 | def conv3x3mtl(in_planes, out_planes, stride=1):
 18 |     return Conv2dMtl(in_planes, out_planes, kernel_size=3, stride=stride,
 19 |                      padding=1, bias=False)
 20 | 
 21 | class BasicBlockMtl(nn.Module):
 22 |     expansion = 1
 23 | 
 24 |     def __init__(self, inplanes, planes, stride=1, downsample=None, last=False):
 25 |         super(BasicBlockMtl, self).__init__()
 26 |         self.conv1 = conv3x3mtl(inplanes, planes, stride)
 27 |         self.bn1 = nn.BatchNorm2d(planes)
 28 |         self.relu = nn.ReLU(inplace=True)
 29 |         self.conv2 = conv3x3mtl(planes, planes)
 30 |         self.bn2 = nn.BatchNorm2d(planes)
 31 |         self.downsample = downsample
 32 |         self.stride = stride
 33 |         self.last = last
 34 | 
 35 |     def forward(self, x):
 36 |         residual = x
 37 | 
 38 |         out = self.conv1(x)
 39 |         out = self.bn1(out)
 40 |         out = self.relu(out)
 41 | 
 42 |         out = self.conv2(out)
 43 |         out = self.bn2(out)
 44 | 
 45 |         if self.downsample is not None:
 46 |             residual = self.downsample(x)
 47 | 
 48 |         out += residual
 49 |         if not self.last:
 50 |             out = self.relu(out)
 51 | 
 52 |         return out
 53 | 
 54 | class ResNetMtl(nn.Module):
 55 | 
 56 |     def __init__(self, block, layers, num_classes=10):
 57 |         self.inplanes = 16
 58 |         super(ResNetMtl, self).__init__()
 59 |         self.conv1 = Conv2dMtl(3, 16, kernel_size=3, stride=1, padding=1,
 60 |                                bias=False)
 61 |         self.bn1 = nn.BatchNorm2d(16)
 62 |         self.relu = nn.ReLU(inplace=True)
 63 |         self.layer1 = self._make_layer(block, 16, layers[0])
 64 |         self.layer2 = self._make_layer(block, 32, layers[1], stride=2)
 65 |         self.layer3 = self._make_layer(block, 64, layers[2], stride=2, last_phase=True)
 66 |         self.avgpool = nn.AvgPool2d(8, stride=1)
 67 |         self.fc = modified_linear.CosineLinear(64 * block.expansion, num_classes)
 68 | 
 69 |         for m in self.modules():
 70 |             if isinstance(m, Conv2dMtl):
 71 |                 nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
 72 |             elif isinstance(m, nn.BatchNorm2d):
 73 |                 nn.init.constant_(m.weight, 1)
 74 |                 nn.init.constant_(m.bias, 0)
 75 | 
 76 |     def _make_layer(self, block, planes, blocks, stride=1, last_phase=False):
 77 |         downsample = None
 78 |         if stride != 1 or self.inplanes != planes * block.expansion:
 79 |             downsample = nn.Sequential(
 80 |                 Conv2dMtl(self.inplanes, planes * block.expansion,
 81 |                           kernel_size=1, stride=stride, bias=False),
 82 |                 nn.BatchNorm2d(planes * block.expansion),
 83 |             )
 84 | 
 85 |         layers = []
 86 |         layers.append(block(self.inplanes, planes, stride, downsample))
 87 |         self.inplanes = planes * block.expansion
 88 |         if last_phase:
 89 |             for i in range(1, blocks-1):
 90 |                 layers.append(block(self.inplanes, planes))
 91 |             layers.append(block(self.inplanes, planes, last=True))
 92 |         else: 
 93 |             for i in range(1, blocks):
 94 |                 layers.append(block(self.inplanes, planes))
 95 | 
 96 |         return nn.Sequential(*layers)
 97 | 
 98 |     def forward(self, x):
 99 |         x = self.conv1(x)
100 |         x = self.bn1(x)
101 |         x = self.relu(x)
102 | 
103 |         x = self.layer1(x)
104 |         x = self.layer2(x)
105 |         x = self.layer3(x)
106 | 
107 |         x = self.avgpool(x)
108 |         x = x.view(x.size(0), -1)
109 |         x = self.fc(x)
110 | 
111 |         return x
112 | 
113 | def resnetmtl20(pretrained=False, **kwargs):
114 |     n = 3
115 |     model = ResNetMtl(BasicBlockMtl, [n, n, n], **kwargs)
116 |     return model
117 | 
118 | def resnetmtl32(pretrained=False, **kwargs):
119 |     n = 5
120 |     model = ResNetMtl(BasicBlockMtl, [n, n, n], **kwargs)
121 |     return model
122 | 


--------------------------------------------------------------------------------
/adaptive-aggregation-networks/models/resnet_cifar.py:
--------------------------------------------------------------------------------
  1 | ##+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  2 | ## Created by: Yaoyao Liu
  3 | ## Modified from: https://github.com/hshustc/CVPR19_Incremental_Learning
  4 | ## Max Planck Institute for Informatics
  5 | ## yaoyao.liu@mpi-inf.mpg.de
  6 | ## Copyright (c) 2021
  7 | ##
  8 | ## This source code is licensed under the MIT-style license found in the
  9 | ## LICENSE file in the root directory of this source tree
 10 | ##+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 11 | import torch.nn as nn
 12 | import math
 13 | import torch.utils.model_zoo as model_zoo
 14 | from utils.incremental.conv2d_mtl import Conv2d
 15 | 
 16 | def conv3x3(in_planes, out_planes, stride=1):
 17 |     """3x3 convolution with padding"""
 18 |     return Conv2d(in_planes, out_planes, kernel_size=3, stride=stride,
 19 |                      padding=1, bias=False)
 20 | 
 21 | class BasicBlock(nn.Module):
 22 |     expansion = 1
 23 | 
 24 |     def __init__(self, inplanes, planes, stride=1, downsample=None):
 25 |         super(BasicBlock, self).__init__()
 26 |         self.conv1 = conv3x3(inplanes, planes, stride)
 27 |         self.bn1 = nn.BatchNorm2d(planes)
 28 |         self.relu = nn.ReLU(inplace=True)
 29 |         self.conv2 = conv3x3(planes, planes)
 30 |         self.bn2 = nn.BatchNorm2d(planes)
 31 |         self.downsample = downsample
 32 |         self.stride = stride
 33 | 
 34 |     def forward(self, x):
 35 |         residual = x
 36 | 
 37 |         # First stage: 3x3 conv -> batch norm -> ReLU
 38 |         out = self.conv1(x)
 39 |         out = self.bn1(out)
 40 |         out = self.relu(out)
 41 | 
 42 |         out = self.conv2(out)
 43 |         out = self.bn2(out)
 44 | 
 45 |         if self.downsample is not None:
 46 |             residual = self.downsample(x)
 47 | 
 48 |         out += residual
 49 |         out = self.relu(out)
 50 | 
 51 |         return out
 52 | 
 53 | 
 54 | class Bottleneck(nn.Module):
 55 |     expansion = 4
 56 | 
 57 |     def __init__(self, inplanes, planes, stride=1, downsample=None):
 58 |         super(Bottleneck, self).__init__()
 59 |         self.conv1 = nn.Conv2d(inplanes, planes, kernel_size=1, bias=False)
 60 |         self.bn1 = nn.BatchNorm2d(planes)
 61 |         self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=stride,
 62 |                                padding=1, bias=False)
 63 |         self.bn2 = nn.BatchNorm2d(planes)
 64 |         self.conv3 = nn.Conv2d(planes, planes * self.expansion, kernel_size=1, bias=False)
 65 |         self.bn3 = nn.BatchNorm2d(planes * self.expansion)
 66 |         self.relu = nn.ReLU(inplace=True)
 67 |         self.downsample = downsample
 68 |         self.stride = stride
 69 | 
 70 |     def forward(self, x):
 71 |         residual = x
 72 | 
 73 |         out = self.conv1(x)
 74 |         out = self.bn1(out)
 75 |         out = self.relu(out)
 76 | 
 77 |         out = self.conv2(out)
 78 |         out = self.bn2(out)
 79 |         out = self.relu(out)
 80 | 
 81 |         out = self.conv3(out)
 82 |         out = self.bn3(out)
 83 | 
 84 |         if self.downsample is not None:
 85 |             residual = self.downsample(x)
 86 | 
 87 |         out += residual
 88 |         out = self.relu(out)
 89 | 
 90 |         return out
 91 | 
 92 | 
 93 | class ResNet(nn.Module):
 94 | 
 95 |     def __init__(self, block, layers, num_classes=10):
 96 |         self.inplanes = 16
 97 |         super(ResNet, self).__init__()
 98 |         self.conv1 = nn.Conv2d(3, 16, kernel_size=3, stride=1, padding=1,
 99 |                                bias=False)
100 |         self.bn1 = nn.BatchNorm2d(16)
101 |         self.relu = nn.ReLU(inplace=True)
102 |         self.layer1 = self._make_layer(block, 16, layers[0])
103 |         self.layer2 = self._make_layer(block, 32, layers[1], stride=2)
104 |         self.layer3 = self._make_layer(block, 64, layers[2], stride=2)
105 |         self.avgpool = nn.AvgPool2d(8, stride=1)
106 |         self.fc = nn.Linear(64 * block.expansion, num_classes)
107 | 
108 |         for m in self.modules():
109 |             if isinstance(m, nn.Conv2d):
110 |                 nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
111 |             elif isinstance(m, nn.BatchNorm2d):
112 |                 nn.init.constant_(m.weight, 1)
113 |                 nn.init.constant_(m.bias, 0)
114 | 
115 |     def _make_layer(self, block, planes, blocks, stride=1):
116 |         downsample = None
117 |         if stride != 1 or self.inplanes != planes * block.expansion:
118 |             downsample = nn.Sequential(
119 |                 nn.Conv2d(self.inplanes, planes * block.expansion,
120 |                           kernel_size=1, stride=stride, bias=False),
121 |                 nn.BatchNorm2d(planes * block.expansion),
122 |             )
123 | 
124 |         layers = []
125 |         layers.append(block(self.inplanes, planes, stride, downsample))
126 |         self.inplanes = planes * block.expansion
127 |         for i in range(1, blocks):
128 |             layers.append(block(self.inplanes, planes))
129 | 
130 |         return nn.Sequential(*layers)
131 | 
132 |     def forward(self, x):
133 |         x = self.conv1(x)
134 |         x = self.bn1(x)
135 |         x = self.relu(x)
136 | 
137 |         x = self.layer1(x)
138 |         x = self.layer2(x)
139 |         x = self.layer3(x)
140 | 
141 |         x = self.avgpool(x)
142 |         x = x.view(x.size(0), -1)
143 |         x = self.fc(x)
144 | 
145 |         return x
146 | 
147 | def resnet20(pretrained=False, **kwargs):
148 |     n = 3
149 |     model = ResNet(BasicBlock, [n, n, n], **kwargs)
150 |     return model
151 | 
152 | def resnet32(pretrained=False, **kwargs):
153 |     n = 5
154 |     model = ResNet(BasicBlock, [n, n, n], **kwargs)
155 |     return model
156 | 
157 | def resnet56(pretrained=False, **kwargs):
158 |     n = 9
159 |     model = ResNet(Bottleneck, [n, n, n], **kwargs)
160 |     return model
161 | 
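
The factory names in this file follow the usual CIFAR ResNet convention, in which depth counts the weighted layers: three stages of `n` blocks, plus the stem convolution and the final classifier. The arithmetic below is a sketch under that convention (downsample 1x1 convolutions are not counted); `cifar_resnet_depth` and the two constants are hypothetical names introduced only for illustration.

```python
BASIC_CONVS = 2       # BasicBlock: two 3x3 convolutions
BOTTLENECK_CONVS = 3  # Bottleneck: 1x1, 3x3, 1x1 convolutions


def cifar_resnet_depth(n, convs_per_block):
    """Weighted layers: 3 stages x n blocks x convs per block, plus conv1 and fc."""
    return 3 * n * convs_per_block + 2


def fc_in_features(expansion):
    """Input width of the classifier: 64 channels times the block expansion."""
    return 64 * expansion


if __name__ == "__main__":
    print(cifar_resnet_depth(3, BASIC_CONVS))       # resnet20
    print(cifar_resnet_depth(5, BASIC_CONVS))       # resnet32
    print(cifar_resnet_depth(9, BASIC_CONVS))       # 56 layers with BasicBlock
    print(cifar_resnet_depth(9, BOTTLENECK_CONVS))  # resnet56 as written above
```

Under this count, `resnet56` as defined above (nine `Bottleneck` blocks per stage) has 83 weighted layers and a 256-dimensional classifier input, whereas nine `BasicBlock`s per stage would give the conventional 56.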


--------------------------------------------------------------------------------
/adaptive-aggregation-networks/trainer/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/yaoyao-liu/class-incremental-learning/701af9f819f559c6ab3d3ee73bb3d7c21e924572/adaptive-aggregation-networks/trainer/__init__.py


--------------------------------------------------------------------------------
/adaptive-aggregation-networks/trainer/incremental_icarl.py:
--------------------------------------------------------------------------------
  1 | ##+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  2 | ## Created by: Yaoyao Liu
  3 | ## Modified from: https://github.com/hshustc/CVPR19_Incremental_Learning
  4 | ## Max Planck Institute for Informatics
  5 | ## yaoyao.liu@mpi-inf.mpg.de
  6 | ## Copyright (c) 2019
  7 | ##
  8 | ## This source code is licensed under the MIT-style license found in the
  9 | ## LICENSE file in the root directory of this source tree
 10 | ##+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 11 | """ Training code for iCaRL """
 12 | import torch
 13 | import tqdm
 14 | import numpy as np
 15 | import torch.nn as nn
 16 | import torchvision
 17 | from torch.optim import lr_scheduler
 18 | from torchvision import datasets, models, transforms
 19 | from utils.misc import *
 20 | from utils.process_fp import process_inputs_fp
 21 | import torch.nn.functional as F
 22 | 
 23 | def incremental_train_and_eval(the_args, epochs, fusion_vars, ref_fusion_vars, b1_model, ref_model, b2_model, ref_b2_model, tg_optimizer, tg_lr_scheduler, fusion_optimizer, fusion_lr_scheduler, trainloader, testloader, iteration, start_iteration, X_protoset_cumuls, Y_protoset_cumuls, order_list,lamda, dist, K, lw_mr, balancedloader, T=None, beta=None, fix_bn=False, weight_per_class=None, device=None):
 24 | 
 25 |     # Setting up the CUDA device
 26 |     if device is None:
 27 |         device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
 28 |     # Set the 1st branch reference model to the evaluation mode
 29 |     ref_model.eval()
 30 | 
 31 |     # Get the number of old classes
 32 |     num_old_classes = ref_model.fc.out_features
 33 | 
 34 |     # If the 2nd branch reference is not None, set it to the evaluation mode
 35 |     if iteration > start_iteration+1:
 36 |         ref_b2_model.eval()
 37 | 
 38 |     for epoch in range(epochs):
 39 |         # Start training for the current phase, set the two branch models to the training mode
 40 |         b1_model.train()
 41 |         b2_model.train()
 42 | 
 43 |         # Fix the batch norm parameters according to the config
 44 |         if fix_bn:
 45 |             for m in b1_model.modules():
 46 |                 if isinstance(m, nn.BatchNorm2d):
 47 |                     m.eval()
 48 | 
 49 |         # Set all the losses to zeros
 50 |         train_loss = 0
 51 |         train_loss1 = 0
 52 |         train_loss2 = 0
 53 |         # Set the counters to zeros
 54 |         correct = 0
 55 |         total = 0
 56 | 
 57 |         # Learning rate decay
 58 |         tg_lr_scheduler.step()
 59 |         fusion_lr_scheduler.step()
 60 | 
 61 |         # Print the information
 62 |         print('\nEpoch: %d, learning rate: ' % epoch, end='')
 63 |         print(tg_lr_scheduler.get_lr()[0])
 64 | 
 65 |         for batch_idx, (inputs, targets) in enumerate(trainloader):
 66 | 
 67 |             # Get a batch of training samples, transfer them to the device
 68 |             inputs, targets = inputs.to(device), targets.to(device)
 69 | 
 70 |             # Clear the gradients of the parameters for the tg_optimizer
 71 |             tg_optimizer.zero_grad()
 72 | 
 73 |             # Forward the samples in the deep networks
 74 |             outputs, _ = process_inputs_fp(the_args, fusion_vars, b1_model, b2_model, inputs)
 75 | 
 76 |             if iteration == start_iteration+1:
 77 |                 ref_outputs = ref_model(inputs)
 78 |             else:
 79 |                 ref_outputs, ref_features_new = process_inputs_fp(the_args, ref_fusion_vars, ref_model, ref_b2_model, inputs)
 80 |             # Loss 1: logits-level distillation loss
 81 |             loss1 = nn.KLDivLoss()(F.log_softmax(outputs[:,:num_old_classes]/T, dim=1), \
 82 |                 F.softmax(ref_outputs.detach()/T, dim=1)) * T * T * beta * num_old_classes
 83 |             # Loss 2: classification loss
 84 |             loss2 = nn.CrossEntropyLoss(weight_per_class)(outputs, targets)
 85 |             # Sum up all losses
 86 |             loss = loss1 + loss2
 87 | 
 88 |             # Backward and update the parameters
 89 |             loss.backward()
 90 |             tg_optimizer.step()
 91 | 
 92 |             # Record the losses and the number of samples to compute the accuracy
 93 |             train_loss += loss.item()
 94 |             train_loss1 += loss1.item()
 95 |             train_loss2 += loss2.item()
 96 |             _, predicted = outputs.max(1)
 97 |             total += targets.size(0)
 98 |             correct += predicted.eq(targets).sum().item()
 99 | 
100 |         # Print the training losses and accuracies
101 |         print('Train set: {}, train loss1: {:.4f}, train loss2: {:.4f}, train loss: {:.4f} accuracy: {:.4f}'.format(len(trainloader), train_loss1/(batch_idx+1), train_loss2/(batch_idx+1), train_loss/(batch_idx+1), 100.*correct/total))
102 |         
103 |         # Update the aggregation weights
104 |         b1_model.eval()
105 |         b2_model.eval()
106 |         
107 |         for batch_idx, (inputs, targets) in enumerate(balancedloader):
108 |             fusion_optimizer.zero_grad()
109 |             inputs, targets = inputs.to(device), targets.to(device)
110 |             outputs, _ = process_inputs_fp(the_args, fusion_vars, b1_model, b2_model, inputs)
111 |             loss = nn.CrossEntropyLoss(weight_per_class)(outputs, targets)
112 |             loss.backward()
113 |             fusion_optimizer.step()
114 | 
115 |         # Running the test for this epoch
116 |         b1_model.eval()
117 |         b2_model.eval()
118 |         test_loss = 0
119 |         correct = 0
120 |         total = 0
121 |         with torch.no_grad():
122 |             for batch_idx, (inputs, targets) in enumerate(testloader):
123 |                 inputs, targets = inputs.to(device), targets.to(device)
124 |                 outputs, _ = process_inputs_fp(the_args, fusion_vars, b1_model, b2_model, inputs)
125 |                 loss = nn.CrossEntropyLoss(weight_per_class)(outputs, targets)
126 |                 test_loss += loss.item()
127 |                 _, predicted = outputs.max(1)
128 |                 total += targets.size(0)
129 |                 correct += predicted.eq(targets).sum().item()
130 |         print('Test set: {} test loss: {:.4f} accuracy: {:.4f}'.format(len(testloader), test_loss/(batch_idx+1), 100.*correct/total))
131 | 
132 |     print("Removing register forward hook")
133 |     return b1_model, b2_model
134 | 
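
Loss 1 in the iCaRL trainer above multiplies the `nn.KLDivLoss()` output by `T * T * beta * num_old_classes`. With its default reduction, `nn.KLDivLoss` averages over every element (batch size times classes), so the factor `num_old_classes` turns that back into a per-sample KL divergence averaged over the batch. The sketch below reproduces the arithmetic in plain Python to make this explicit. The function names and the default values `T=2.0` and `beta=0.25` are illustrative assumptions; the trainer receives `T` and `beta` from its caller.

```python
import math


def softmax(logits, T=1.0):
    """Temperature-scaled softmax over one list of logits."""
    scaled = [z / T for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kd_loss(new_logits, old_logits, T=2.0, beta=0.25):
    """Distillation loss over old-class logits (one inner list per sample)."""
    n, c = len(new_logits), len(new_logits[0])
    elem_sum = 0.0
    for new, old in zip(new_logits, old_logits):
        p_new = softmax(new, T)  # current model, old-class slice
        p_old = softmax(old, T)  # reference model (the target)
        elem_sum += sum(q * (math.log(q) - math.log(p))
                        for p, q in zip(p_new, p_old))
    mean_over_elements = elem_sum / (n * c)  # element-wise mean reduction
    return mean_over_elements * T * T * beta * c


def kd_loss_batchmean(new_logits, old_logits, T=2.0, beta=0.25):
    """Same quantity written as a per-sample KL averaged over the batch."""
    n = len(new_logits)
    total = 0.0
    for new, old in zip(new_logits, old_logits):
        p_new, p_old = softmax(new, T), softmax(old, T)
        total += sum(q * math.log(q / p) for p, q in zip(p_new, p_old))
    return total / n * T * T * beta


if __name__ == "__main__":
    new = [[1.0, 2.0, 3.0], [0.2, 0.1, 0.0]]
    old = [[3.0, 2.0, 1.0], [0.0, 0.1, 0.2]]
    print(kd_loss(new, old), kd_loss_batchmean(new, old))
```

The two functions agree up to floating-point error, and the loss is zero when the current and reference logits coincide.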


--------------------------------------------------------------------------------
/adaptive-aggregation-networks/trainer/incremental_lucir.py:
--------------------------------------------------------------------------------
  1 | ##+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  2 | ## Created by: Yaoyao Liu
  3 | ## Modified from: https://github.com/hshustc/CVPR19_Incremental_Learning
  4 | ## Max Planck Institute for Informatics
  5 | ## yaoyao.liu@mpi-inf.mpg.de
  6 | ## Copyright (c) 2021
  7 | ##
  8 | ## This source code is licensed under the MIT-style license found in the
  9 | ## LICENSE file in the root directory of this source tree
 10 | ##+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 11 | """ Training code for LUCIR """
 12 | import torch
 13 | import tqdm
 14 | import numpy as np
 15 | import torch.nn as nn
 16 | import torchvision
 17 | from torch.optim import lr_scheduler
 18 | from torchvision import datasets, models, transforms
 19 | from utils.misc import *
 20 | from utils.process_fp import process_inputs_fp
 21 | 
 22 | cur_features = []
 23 | ref_features = []
 24 | old_scores = []
 25 | new_scores = []
 26 | 
 27 | def get_ref_features(self, inputs, outputs):
 28 |     global ref_features
 29 |     ref_features = inputs[0]
 30 | 
 31 | def get_cur_features(self, inputs, outputs):
 32 |     global cur_features
 33 |     cur_features = inputs[0]
 34 | 
 35 | def get_old_scores_before_scale(self, inputs, outputs):
 36 |     global old_scores
 37 |     old_scores = outputs
 38 | 
 39 | def get_new_scores_before_scale(self, inputs, outputs):
 40 |     global new_scores
 41 |     new_scores = outputs
 42 | 
 43 | def map_labels(order_list, Y_set):
 44 |     map_Y = []
 45 |     for idx in Y_set:
 46 |         map_Y.append(order_list.index(idx))
 47 |     map_Y = np.array(map_Y)
 48 |     return map_Y
 49 | 
 50 | 
 51 | def incremental_train_and_eval(the_args, epochs, fusion_vars, ref_fusion_vars, b1_model, ref_model, b2_model, ref_b2_model, \
 52 |     tg_optimizer, tg_lr_scheduler, fusion_optimizer, fusion_lr_scheduler, trainloader, testloader, iteration, \
 53 |     start_iteration, X_protoset_cumuls, Y_protoset_cumuls, order_list, the_lambda, dist, \
 54 |     K, lw_mr, balancedloader, fix_bn=False, weight_per_class=None, device=None):
 55 | 
 56 |     # Setting up the CUDA device
 57 |     if device is None:
 58 |         device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
 59 |     # Set the 1st branch reference model to the evaluation mode
 60 |     ref_model.eval()
 61 | 
 62 |     # Get the number of old classes
 63 |     num_old_classes = ref_model.fc.out_features
 64 | 
 65 |     # Get the features from the current and the reference model
 66 |     handle_ref_features = ref_model.fc.register_forward_hook(get_ref_features)
 67 |     handle_cur_features = b1_model.fc.register_forward_hook(get_cur_features)
 68 |     handle_old_scores_bs = b1_model.fc.fc1.register_forward_hook(get_old_scores_before_scale)
 69 |     handle_new_scores_bs = b1_model.fc.fc2.register_forward_hook(get_new_scores_before_scale)
 70 | 
 71 |     # If the 2nd branch reference is not None, set it to the evaluation mode
 72 |     if iteration > start_iteration+1:
 73 |         ref_b2_model.eval()
 74 | 
 75 |     for epoch in range(epochs):
 76 |         # Start training for the current phase, set the two branch models to the training mode
 77 |         b1_model.train()
 78 |         b2_model.train()
 79 | 
 80 |         # Fix the batch norm parameters according to the config
 81 |         if fix_bn:
 82 |             for m in b1_model.modules():
 83 |                 if isinstance(m, nn.BatchNorm2d):
 84 |                     m.eval()
 85 | 
 86 |         # Set all the losses to zeros
 87 |         train_loss = 0
 88 |         train_loss1 = 0
 89 |         train_loss2 = 0
 90 |         train_loss3 = 0
 91 |         # Set the counters to zeros
 92 |         correct = 0
 93 |         total = 0
 94 |     
 95 |         # Learning rate decay
 96 |         tg_lr_scheduler.step()
 97 |         fusion_lr_scheduler.step()
 98 | 
 99 |         # Print the information
100 |         print('\nEpoch: %d, learning rate: ' % epoch, end='')
101 |         print(tg_lr_scheduler.get_lr()[0])
102 | 
103 |         for batch_idx, (inputs, targets) in enumerate(trainloader):
104 | 
105 |             # Get a batch of training samples, transfer them to the device
106 |             inputs, targets = inputs.to(device), targets.to(device)
107 | 
108 |             # Clear the gradients of the parameters for the tg_optimizer
109 |             tg_optimizer.zero_grad()
110 | 
111 |             # Forward the samples in the deep networks
112 |             outputs, _ = process_inputs_fp(the_args, fusion_vars, b1_model, b2_model, inputs)
113 |     
114 |             # Loss 1: feature-level distillation loss
115 |             if iteration == start_iteration+1:
116 |                 ref_outputs = ref_model(inputs)
117 |                 loss1 = nn.CosineEmbeddingLoss()(cur_features, ref_features.detach(), torch.ones(inputs.shape[0]).to(device)) * the_lambda
118 |             else:
119 |                 ref_outputs, ref_features_new = process_inputs_fp(the_args, ref_fusion_vars, ref_model, ref_b2_model, inputs)
120 |                 loss1 = nn.CosineEmbeddingLoss()(cur_features, ref_features_new.detach(), torch.ones(inputs.shape[0]).to(device)) * the_lambda
121 | 
122 |             # Loss 2: classification loss
123 |             loss2 = nn.CrossEntropyLoss(weight_per_class)(outputs, targets)
124 | 
125 |             # Loss 3: margin ranking loss
126 |             outputs_bs = torch.cat((old_scores, new_scores), dim=1)
127 |             assert(outputs_bs.size()==outputs.size())
128 |             gt_index = torch.zeros(outputs_bs.size()).to(device)
129 |             gt_index = gt_index.scatter(1, targets.view(-1,1), 1).ge(0.5)
130 |             gt_scores = outputs_bs.masked_select(gt_index)
131 |             max_novel_scores = outputs_bs[:, num_old_classes:].topk(K, dim=1)[0]
132 |             hard_index = targets.lt(num_old_classes)
133 |             hard_num = torch.nonzero(hard_index).size(0)
134 |             if hard_num > 0:
135 |                 gt_scores = gt_scores[hard_index].view(-1, 1).repeat(1, K)
136 |                 max_novel_scores = max_novel_scores[hard_index]
137 |                 assert(gt_scores.size() == max_novel_scores.size())
138 |                 assert(gt_scores.size(0) == hard_num)
139 |                 loss3 = nn.MarginRankingLoss(margin=dist)(gt_scores.view(-1, 1), max_novel_scores.view(-1, 1), torch.ones(hard_num*K).to(device)) * lw_mr
140 |             else:
141 |                 loss3 = torch.zeros(1).to(device)
142 | 
143 |             # Sum up all losses
144 |             loss = loss1 + loss2 + loss3
145 | 
146 |             # Backward and update the parameters
147 |             loss.backward()
148 |             tg_optimizer.step()
149 | 
150 |             # Record the losses and the number of samples to compute the accuracy
151 |             train_loss += loss.item()
152 |             train_loss1 += loss1.item()
153 |             train_loss2 += loss2.item()
154 |             train_loss3 += loss3.item()
155 |             _, predicted = outputs.max(1)
156 |             total += targets.size(0)
157 |             correct += predicted.eq(targets).sum().item()
158 | 
159 |         # Print the training losses and accuracies
160 |         print('Train set: {}, train loss1: {:.4f}, train loss2: {:.4f}, train loss3: {:.4f}, train loss: {:.4f} accuracy: {:.4f}'.format(len(trainloader), train_loss1/(batch_idx+1), train_loss2/(batch_idx+1), train_loss3/(batch_idx+1), train_loss/(batch_idx+1), 100.*correct/total))
161 |         
162 |         # Update the aggregation weights on the balanced loader (first 501 batches; fusion_optimizer.zero_grad() is not called here)
163 |         b1_model.eval()
164 |         b2_model.eval()
165 |      
166 |         for batch_idx, (inputs, targets) in enumerate(balancedloader):
167 |             if batch_idx <= 500:
168 |                 inputs, targets = inputs.to(device), targets.to(device)
169 |                 outputs, _ = process_inputs_fp(the_args, fusion_vars, b1_model, b2_model, inputs)
170 |                 loss = nn.CrossEntropyLoss(weight_per_class)(outputs, targets)
171 |                 loss.backward()
172 |                 fusion_optimizer.step()
173 | 
174 |         # Running the test for this epoch
175 |         b1_model.eval()
176 |         b2_model.eval()
177 |         test_loss = 0
178 |         correct = 0
179 |         total = 0
180 |         with torch.no_grad():
181 |             for batch_idx, (inputs, targets) in enumerate(testloader):
182 |                 inputs, targets = inputs.to(device), targets.to(device)
183 |                 outputs, _ = process_inputs_fp(the_args, fusion_vars, b1_model, b2_model, inputs)
184 |                 loss = nn.CrossEntropyLoss(weight_per_class)(outputs, targets)
185 |                 test_loss += loss.item()
186 |                 _, predicted = outputs.max(1)
187 |                 total += targets.size(0)
188 |                 correct += predicted.eq(targets).sum().item()
189 |         print('Test set: {} test loss: {:.4f} accuracy: {:.4f}'.format(len(testloader), test_loss/(batch_idx+1), 100.*correct/total))
190 | 
191 |     print("Removing register forward hook")
192 |     handle_ref_features.remove()
193 |     handle_cur_features.remove()
194 |     handle_old_scores_bs.remove()
195 |     handle_new_scores_bs.remove()
196 |     return b1_model, b2_model
197 | 
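
As a reading aid for the margin ranking term (Loss 3) in incremental_train_and_eval above, the following is a minimal sketch that restates the same selection logic in plain Python on nested lists instead of tensors. The function name margin_ranking_sketch and the toy scores are hypothetical. The sketch assumes, as in the code above, that old-class scores come first in each row, that only samples with an old-class label contribute, and that the default mean reduction of nn.MarginRankingLoss applies. The lw_mr weighting is left out.

```python
def margin_ranking_sketch(scores, targets, num_old_classes, k, margin):
    """Plain-Python restatement of the Loss 3 selection logic.

    scores:  one list of class scores per sample (old classes first).
    targets: ground-truth class index per sample.
    """
    terms = []
    for row, target in zip(scores, targets):
        if target >= num_old_classes:
            continue  # samples of new classes are not used for this term
        gt_score = row[target]
        # The k highest scores among the new classes act as hard negatives.
        top_k_novel = sorted(row[num_old_classes:], reverse=True)[:k]
        for novel_score in top_k_novel:
            # Margin ranking with y = 1: max(0, margin - (x1 - x2))
            terms.append(max(0.0, margin - (gt_score - novel_score)))
    # Mean over all (sample, negative) pairs; zero when no old-class sample exists.
    return sum(terms) / len(terms) if terms else 0.0


# Toy batch: 2 old classes followed by 2 new classes (made-up values).
scores = [[0.9, 0.1, 0.5, 0.2],
          [0.2, 0.3, 0.8, 0.1]]
targets = [0, 2]
print(margin_ranking_sketch(scores, targets, num_old_classes=2, k=2, margin=0.5))
```

Only the first sample carries an old-class label, so the result is the mean of its two pair terms, roughly 0.05.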


--------------------------------------------------------------------------------
/adaptive-aggregation-networks/trainer/trainer.py:
--------------------------------------------------------------------------------
  1 | ##+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  2 | ## Created by: Yaoyao Liu
  3 | ## Modified from: https://github.com/hshustc/CVPR19_Incremental_Learning
  4 | ## Max Planck Institute for Informatics
  5 | ## yaoyao.liu@mpi-inf.mpg.de
  6 | ## Copyright (c) 2021
  7 | ##
  8 | ## This source code is licensed under the MIT-style license found in the
  9 | ## LICENSE file in the root directory of this source tree
 10 | ##+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 11 | """ Class-incremental learning trainer. """
 12 | import torch
 13 | import torch.nn as nn
 14 | import torch.nn.functional as F
 15 | import torch.optim as optim
 16 | from torch.optim import lr_scheduler
 17 | import torchvision
 18 | from torchvision import datasets, models, transforms
 19 | from torch.autograd import Variable
 20 | from tensorboardX import SummaryWriter
 21 | import numpy as np
 22 | import time
 23 | import os
 24 | import os.path as osp
 25 | import sys
 26 | import copy
 27 | import argparse
 28 | from PIL import Image
 29 | try:
 30 |     import cPickle as pickle
 31 | except ImportError:
 32 |     import pickle
 33 | import math
 34 | import utils.misc
 35 | import models.modified_resnet_cifar as modified_resnet_cifar
 36 | import models.modified_resnetmtl_cifar as modified_resnetmtl_cifar
 37 | import models.modified_resnet as modified_resnet
 38 | import models.modified_resnetmtl as modified_resnetmtl
 39 | import models.modified_linear as modified_linear
 40 | from utils.imagenet.utils_dataset import split_images_labels
 41 | from utils.imagenet.utils_dataset import merge_images_labels
 42 | from utils.incremental.compute_accuracy import compute_accuracy
 43 | from trainer.incremental_lucir import incremental_train_and_eval as incremental_train_and_eval_lucir
 44 | from trainer.incremental_icarl import incremental_train_and_eval as incremental_train_and_eval_icarl
 45 | from trainer.zeroth_phase import incremental_train_and_eval_zeroth_phase as incremental_train_and_eval_zeroth_phase
 46 | from utils.misc import process_mnemonics
 47 | from trainer.base_trainer import BaseTrainer
 48 | import warnings
 49 | warnings.filterwarnings('ignore')
 50 | 
 51 | class Trainer(BaseTrainer):
 52 |     def train(self):
 53 |         """Train and evaluate the class-incremental learning system.
 54 |         This trainer is based on base_trainer.py in the same folder.
 55 |         The helper methods called here are defined in base_trainer.py.
 56 |         """
 57 |         
 58 |         # Set tensorboard recorder
 59 |         self.train_writer = SummaryWriter(comment=self.save_path)
 60 | 
 61 |         # Initialize the arrays that store the accuracies for each phase
 62 |         top1_acc_list_cumul = np.zeros((int(self.args.num_classes/self.args.nb_cl), 3, 1))
 63 |         top1_acc_list_ori = np.zeros((int(self.args.num_classes/self.args.nb_cl), 3, 1))
 64 | 
 65 |         # Load the training and test samples from the dataset
 66 |         X_train_total, Y_train_total, X_valid_total, Y_valid_total = self.set_dataset()
 67 | 
 68 |         # Initialize the aggregation weights
 69 |         self.init_fusion_vars()       
 70 | 
 71 |         # Initialize the class order
 72 |         order, order_list = self.init_class_order()
 73 |         np.random.seed(None)
 74 | 
 75 |         # Set empty lists for the data    
 76 |         X_valid_cumuls    = []
 77 |         X_protoset_cumuls = []
 78 |         X_train_cumuls    = []
 79 |         Y_valid_cumuls    = []
 80 |         Y_protoset_cumuls = []
 81 |         Y_train_cumuls    = []
 82 | 
 83 |         # Initialize the prototypes
 84 |         alpha_dr_herding, prototypes = self.init_prototypes(self.dictionary_size, order, X_train_total, Y_train_total)
 85 | 
 86 |         # Set the starting iteration
 87 |         # We start training the class-incremental learning system from e.g., 50 classes to provide a good initial encoder
 88 |         start_iter = int(self.args.nb_cl_fg/self.args.nb_cl)-1
 89 | 
 90 |         # Set the models and some parameters to None
 91 |         # These models and parameters will be assigned in the following phases
 92 |         b1_model = None
 93 |         ref_model = None
 94 |         b2_model = None
 95 |         ref_b2_model = None
 96 |         the_lambda_mult = None
 97 | 
 98 |         for iteration in range(start_iter, int(self.args.num_classes/self.args.nb_cl)):
 99 |             ### Initialize models for the current phase
100 |             b1_model, b2_model, ref_model, ref_b2_model, lambda_mult, cur_lambda, last_iter = self.init_current_phase_model(iteration, start_iter, b1_model, b2_model)
101 | 
102 |             ### Initialize datasets for the current phase
103 |             if iteration == start_iter:
104 |                 indices_train_10, X_valid_cumul, X_train_cumul, Y_valid_cumul, Y_train_cumul, \
105 |                     X_train_cumuls, Y_valid_cumuls, X_protoset_cumuls, Y_protoset_cumuls, X_valid_cumuls, Y_valid_cumuls, \
106 |                     X_train, map_Y_train, map_Y_valid_cumul, X_valid_ori, Y_valid_ori = \
107 |                     self.init_current_phase_dataset(iteration, \
108 |                     start_iter, last_iter, order, order_list, X_train_total, Y_train_total, X_valid_total, Y_valid_total, \
109 |                     X_train_cumuls, Y_train_cumuls, X_valid_cumuls, Y_valid_cumuls, X_protoset_cumuls, Y_protoset_cumuls)
110 |             else:
111 |                 indices_train_10, X_valid_cumul, X_train_cumul, Y_valid_cumul, Y_train_cumul, \
112 |                     X_train_cumuls, Y_valid_cumuls, X_protoset_cumuls, Y_protoset_cumuls, X_valid_cumuls, Y_valid_cumuls, \
113 |                     X_train, map_Y_train, map_Y_valid_cumul, X_protoset, Y_protoset = \
114 |                     self.init_current_phase_dataset(iteration, \
115 |                     start_iter, last_iter, order, order_list, X_train_total, Y_train_total, X_valid_total, Y_valid_total, \
116 |                     X_train_cumuls, Y_train_cumuls, X_valid_cumuls, Y_valid_cumuls, X_protoset_cumuls, Y_protoset_cumuls)                
117 | 
118 |             is_start_iteration = (iteration == start_iter)
119 | 
120 |             # Imprint weights
121 |             if iteration > start_iter:
122 |                 b1_model = self.imprint_weights(b1_model, b2_model, iteration, is_start_iteration, X_train, map_Y_train, self.dictionary_size)
123 | 
124 |             # Update training and test dataloader
125 |             trainloader, testloader = self.update_train_and_valid_loader(X_train, map_Y_train, X_valid_cumul, map_Y_valid_cumul, \
126 |                 iteration, start_iter)
127 | 
128 |             # Set the names for the checkpoints
129 |             ckp_name = osp.join(self.save_path, 'iter_{}_b1.pth'.format(iteration))
130 |             ckp_name_b2 = osp.join(self.save_path, 'iter_{}_b2.pth'.format(iteration))            
131 |             print('Check point name: ', ckp_name)
132 | 
133 |             if iteration==start_iter and self.args.resume_fg:
134 |                 # Resume the 0-th phase model according to the config
135 |                 b1_model = torch.load(self.args.ckpt_dir_fg)
136 |             elif self.args.resume and os.path.exists(ckp_name):
137 |                 # Resume other models according to the config
138 |                 b1_model = torch.load(ckp_name)
139 |                 b2_model = torch.load(ckp_name_b2)
140 |             else:
141 |                 # Start training (if we don't resume the models from the checkpoints)
142 |     
143 |                 # Set the optimizer
144 |                 tg_optimizer, tg_lr_scheduler, fusion_optimizer, fusion_lr_scheduler = self.set_optimizer(iteration, \
145 |                     start_iter, b1_model, ref_model, b2_model, ref_b2_model)     
146 | 
147 |                 if iteration > start_iter:
148 |                     # Training the class-incremental learning system from the 1st phase
149 | 
150 |                     # Set the balanced dataloader
151 |                     balancedloader = self.gen_balanced_loader(X_train_total, Y_train_total, indices_train_10, X_protoset, Y_protoset, order_list)
152 | 
153 |                     # Training the model for different baselines        
154 |                     if self.args.baseline == 'lucir':
155 |                         b1_model, b2_model = incremental_train_and_eval_lucir(self.args, self.args.epochs, self.fusion_vars, \
156 |                             self.ref_fusion_vars, b1_model, ref_model, b2_model, ref_b2_model, tg_optimizer, tg_lr_scheduler, \
157 |                             fusion_optimizer, fusion_lr_scheduler, trainloader, testloader, iteration, start_iter, \
158 |                             X_protoset_cumuls, Y_protoset_cumuls, order_list, cur_lambda, self.args.dist, self.args.K, self.args.lw_mr, balancedloader)    
159 |                     elif self.args.baseline == 'icarl':
160 |                         b1_model, b2_model = incremental_train_and_eval_icarl(self.args, self.args.epochs, self.fusion_vars, \
161 |                             self.ref_fusion_vars, b1_model, ref_model, b2_model, ref_b2_model, tg_optimizer, tg_lr_scheduler, \
162 |                             fusion_optimizer, fusion_lr_scheduler, trainloader, testloader, iteration, start_iter, \
163 |                             X_protoset_cumuls, Y_protoset_cumuls, order_list, cur_lambda, self.args.dist, self.args.K, self.args.lw_mr, balancedloader, \
164 |                             self.args.icarl_T, self.args.icarl_beta)
165 |                     else:
166 |                         raise ValueError('Please set the correct baseline.')       
167 |                 else:         
168 |                     # Training the class-incremental learning system from the 0th phase           
169 |                     b1_model = incremental_train_and_eval_zeroth_phase(self.args, self.args.epochs, b1_model, \
170 |                         ref_model, tg_optimizer, tg_lr_scheduler, trainloader, testloader, iteration, start_iter, \
171 |                         cur_lambda, self.args.dist, self.args.K, self.args.lw_mr) 
172 | 
173 |             # Select the exemplars according to the current model
174 |             X_protoset_cumuls, Y_protoset_cumuls, class_means, alpha_dr_herding = self.set_exemplar_set(b1_model, b2_model, \
175 |                 is_start_iteration, iteration, last_iter, order, alpha_dr_herding, prototypes)
176 |             
177 |             # Compute the accuracies for current phase
178 |             top1_acc_list_ori, top1_acc_list_cumul = self.compute_acc(class_means, order, order_list, b1_model, b2_model, X_protoset_cumuls, Y_protoset_cumuls, \
179 |                 X_valid_ori, Y_valid_ori, X_valid_cumul, Y_valid_cumul, iteration, is_start_iteration, top1_acc_list_ori, top1_acc_list_cumul)
180 | 
181 |             # Compute the average accuracy
182 |             num_of_testing = iteration - start_iter + 1
183 |             avg_cumul_acc_fc = np.sum(top1_acc_list_cumul[start_iter:,0])/num_of_testing
184 |             avg_cumul_acc_icarl = np.sum(top1_acc_list_cumul[start_iter:,1])/num_of_testing
185 |             print('Computing average accuracy...')
186 |             print("  Average accuracy (FC)         :\t\t{:.2f} %".format(avg_cumul_acc_fc))
187 |             print("  Average accuracy (Proto)      :\t\t{:.2f} %".format(avg_cumul_acc_icarl))
188 | 
189 |             # Write the results to the tensorboard
190 |             self.train_writer.add_scalar('avg_acc/fc', float(avg_cumul_acc_fc), iteration)
191 |             self.train_writer.add_scalar('avg_acc/proto', float(avg_cumul_acc_icarl), iteration)
192 | 
193 |         # Save the results and close the tensorboard writer
194 |         torch.save(top1_acc_list_ori, osp.join(self.save_path, 'acc_list_ori.pth'))
195 |         torch.save(top1_acc_list_cumul, osp.join(self.save_path, 'acc_list_cumul.pth'))
196 |         self.train_writer.close()
197 | 


--------------------------------------------------------------------------------
/adaptive-aggregation-networks/trainer/zeroth_phase.py:
--------------------------------------------------------------------------------
 1 | ##+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 | ## Created by: Yaoyao Liu
 3 | ## Modified from: https://github.com/hshustc/CVPR19_Incremental_Learning
 4 | ## Max Planck Institute for Informatics
 5 | ## yaoyao.liu@mpi-inf.mpg.de
 6 | ## Copyright (c) 2019
 7 | ##
 8 | ## This source code is licensed under the MIT-style license found in the
 9 | ## LICENSE file in the root directory of this source tree
10 | ##+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
11 | """ Training code for the 0-th phase """
12 | import torch
13 | import tqdm
14 | import numpy as np
15 | import torch.nn as nn
16 | import torchvision
17 | from torch.optim import lr_scheduler
18 | from torchvision import datasets, models, transforms
19 | from utils.misc import *
20 | from utils.process_fp import process_inputs_fp
21 | import torch.nn.functional as F
22 | 
23 | def incremental_train_and_eval_zeroth_phase(the_args, epochs, b1_model, ref_model, \
24 |     tg_optimizer, tg_lr_scheduler, trainloader, testloader, iteration, start_iteration, \
25 |     lamda, dist, K, lw_mr, fix_bn=False, weight_per_class=None, device=None):
26 | 
27 |     # Setting up the CUDA device
28 |     if device is None:
29 |         device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
30 | 
31 |     for epoch in range(epochs):
32 |         # Set the 1st branch model to the training mode
33 |         b1_model.train()
34 | 
35 |         # Fix the batch norm parameters according to the config
36 |         if fix_bn:
37 |             for m in b1_model.modules():
38 |                 if isinstance(m, nn.BatchNorm2d):
39 |                     m.eval()
40 | 
41 |         # Set all the losses to zeros
42 |         train_loss = 0
43 |         train_loss1 = 0
44 |         train_loss2 = 0
45 |         # Set the counters to zeros
46 |         correct = 0
47 |         total = 0
48 | 
49 |         # Learning rate decay
50 |         tg_lr_scheduler.step()
51 | 
52 |         # Print the information
53 |         print('\nEpoch: %d, learning rate: ' % epoch, end='')
54 |         print(tg_lr_scheduler.get_lr()[0])
55 | 
56 |         for batch_idx, (inputs, targets) in enumerate(trainloader):
57 |             # Get a batch of training samples, transfer them to the device
58 |             inputs, targets = inputs.to(device), targets.to(device)
59 |             # Clear the gradients of the parameters for the tg_optimizer
60 |             tg_optimizer.zero_grad()
61 |             # Forward the samples in the deep networks
62 |             outputs = b1_model(inputs)
63 |             # Compute classification loss
64 |             loss = nn.CrossEntropyLoss(weight_per_class)(outputs, targets)
65 |             # Backward and update the parameters
66 |             loss.backward()
67 |             tg_optimizer.step()
68 |             # Record the losses and the number of samples to compute the accuracy
69 |             train_loss += loss.item()
70 |             _, predicted = outputs.max(1)
71 |             total += targets.size(0)
72 |             correct += predicted.eq(targets).sum().item()
73 | 
74 |         # Print the training losses and accuracies
75 |         print('Train set: {}, train loss: {:.4f} accuracy: {:.4f}'.format(len(trainloader), train_loss/(batch_idx+1), 100.*correct/total))
76 | 
77 |         # Running the test for this epoch
78 |         b1_model.eval()
79 |         test_loss = 0
80 |         correct = 0
81 |         total = 0
82 |         with torch.no_grad():
83 |             for batch_idx, (inputs, targets) in enumerate(testloader):
84 |                 inputs, targets = inputs.to(device), targets.to(device)
85 |                 outputs = b1_model(inputs)
86 |                 loss = nn.CrossEntropyLoss(weight_per_class)(outputs, targets)
87 |                 test_loss += loss.item()
88 |                 _, predicted = outputs.max(1)
89 |                 total += targets.size(0)
90 |                 correct += predicted.eq(targets).sum().item()
91 |         print('Test set: {} test loss: {:.4f} accuracy: {:.4f}'.format(len(testloader), test_loss/(batch_idx+1), 100.*correct/total))
92 | 
93 |     return b1_model
94 | 
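
The per-epoch statistics printed by the training and test loops above follow two different averaging rules: the loss is averaged over batches, while the accuracy is computed over samples. A minimal sketch of that bookkeeping in plain Python is given below. The helper name summarize_epoch and the numbers are hypothetical.

```python
def summarize_epoch(batch_losses, batch_correct, batch_sizes):
    """Return (mean loss per batch, accuracy in percent).

    Mirrors train_loss / (batch_idx + 1) and 100. * correct / total.
    """
    mean_loss = sum(batch_losses) / len(batch_losses)
    accuracy = 100.0 * sum(batch_correct) / sum(batch_sizes)
    return mean_loss, accuracy


# Three made-up batches; the last one is smaller than the others.
losses = [2.0, 1.0, 0.6]
correct = [64, 96, 20]
sizes = [128, 128, 32]
mean_loss, accuracy = summarize_epoch(losses, correct, sizes)
print(mean_loss, accuracy)
```

Because the loss is averaged per batch, a short final batch weighs as much as a full one in the reported loss, whereas the accuracy is unaffected by batch boundaries.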


--------------------------------------------------------------------------------
/adaptive-aggregation-networks/utils/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/yaoyao-liu/class-incremental-learning/701af9f819f559c6ab3d3ee73bb3d7c21e924572/adaptive-aggregation-networks/utils/__init__.py


--------------------------------------------------------------------------------
/adaptive-aggregation-networks/utils/gpu_tools.py:
--------------------------------------------------------------------------------
 1 | ##+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 | ## Created by: Yaoyao Liu
 3 | ## Max Planck Institute for Informatics
 4 | ## yaoyao.liu@mpi-inf.mpg.de
 5 | ## Copyright (c) 2021
 6 | ##
 7 | ## This source code is licensed under the MIT-style license found in the
 8 | ## LICENSE file in the root directory of this source tree
 9 | ##+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
10 | """ GPU tools. """
11 | import os
12 | import torch
13 | import time
14 | 
15 | def check_memory(cuda_device):
16 |     """ Check the total memory and occupied memory for GPU """
17 |     devices_info = os.popen('"/usr/bin/nvidia-smi" --query-gpu=memory.total,memory.used --format=csv,nounits,noheader').read().strip().split("\n")
18 |     total, used = devices_info[int(cuda_device)].split(',')
19 |     return total,used
20 | 
21 | def occupy_memory(cuda_device):
22 |     """ Create a large tensor and delete it.
23 |     This operation occupies the GPU memory, so other processes cannot use the occupied memory.
24 |     It is used to ensure that this process won't be stopped when it requires additional GPU memory.
25 |     Be careful with this operation. It will influence other people when you are sharing GPUs with others.
26 |     """
27 |     total, used = check_memory(cuda_device)
28 |     total = int(total)
29 |     used = int(used)
30 |     max_mem = int(total * 0.90)
31 |     print('Total memory: ' + str(total) + ', used memory: ' + str(used))
32 |     block_mem = max_mem - used
33 |     if block_mem > 0:
34 |         x = torch.cuda.FloatTensor(256, 1024, block_mem)
35 |         del x
36 | 
37 | def set_gpu(x):
38 |     """ Set up which GPU we use for this process """
39 |     os.environ['CUDA_VISIBLE_DEVICES'] = x
40 |     print('Using gpu:', x)
41 |     
42 | 
43 | 
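
The arithmetic behind check_memory and occupy_memory above can be followed without a GPU. The sketch below parses a hypothetical nvidia-smi output string (not captured from a real machine) and computes the block size the same way. The helper names parse_memory and block_size_mib are assumptions for illustration. It also assumes that nvidia-smi reports memory in MiB when nounits is set, which matches the tensor shape used above: a float32 tensor of shape (256, 1024, block_mem) takes 1 MiB per unit of block_mem.

```python
def parse_memory(smi_output, cuda_device):
    """Parse 'total, used' rows (one row per GPU) as check_memory does."""
    rows = smi_output.strip().split("\n")
    total, used = rows[int(cuda_device)].split(",")
    return int(total), int(used)


def block_size_mib(total, used, fraction=0.90):
    """Amount to reserve so that usage reaches `fraction` of the total, or 0."""
    return max(int(total * fraction) - used, 0)


# Hypothetical output for two GPUs.
sample = "11178, 1024\n11178, 300\n"
total, used = parse_memory(sample, "1")
block_mem = block_size_mib(total, used)
print(total, used, block_mem)

# 256 * 1024 float32 values occupy exactly 1 MiB.
bytes_per_unit = 256 * 1024 * 4
print(bytes_per_unit == 1024 * 1024)
```

The cuda_device argument is passed as a string here because set_gpu above also takes the device as a string; parse_memory converts it with int() just as check_memory does.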


--------------------------------------------------------------------------------
/adaptive-aggregation-networks/utils/imagenet/__init__.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 | # coding=utf-8
3 | # for incremental-class train and eval
4 | 


--------------------------------------------------------------------------------
/adaptive-aggregation-networks/utils/imagenet/train_and_eval.py:
--------------------------------------------------------------------------------
 1 | ##+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 | ## Copied from: https://github.com/hshustc/CVPR19_Incremental_Learning
 3 | ##
 4 | ## This source code is licensed under the MIT-style license found in the
 5 | ## LICENSE file in the root directory of this source tree
 6 | ##+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 7 | """ Tools for ImageNet """
 8 | import argparse
 9 | import os
10 | import shutil
11 | import time
12 | 
13 | import torch
14 | import torch.nn as nn
15 | import torch.nn.parallel
16 | import torch.backends.cudnn as cudnn
17 | import torch.distributed as dist
18 | import torch.optim
19 | import torch.utils.data
20 | import torch.utils.data.distributed
21 | import torchvision.transforms as transforms
22 | import torchvision.datasets as datasets
23 | import torchvision.models as models
24 | 
25 | from .utils_train import *
26 | 
27 | def train_and_eval(epochs, start_epoch, model, optimizer, lr_scheduler, \
28 |         train_loader, val_loader, gpu=None):
29 |     for epoch in range(start_epoch, epochs):
30 |         lr_scheduler.step()
31 |         print('\nEpoch: %d, LR: ' % epoch, end='')
32 |         print(lr_scheduler.get_lr())
33 | 
34 |         train(train_loader, model, optimizer, epoch, gpu)
35 |         validate(val_loader, model, gpu)
36 | 
37 |     return model
38 | 
39 | def train(train_loader, model, optimizer, epoch, gpu=None):
40 |     batch_time = AverageMeter()
41 |     data_time = AverageMeter()
42 |     losses = AverageMeter()
43 |     top1 = AverageMeter()
44 |     top5 = AverageMeter()
45 | 
46 |     model.train()
47 |     criterion = nn.CrossEntropyLoss().cuda(gpu)
48 | 
49 |     end = time.time()
50 |     for i, (input, target) in enumerate(train_loader):
51 |         data_time.update(time.time() - end)
52 | 
53 |         if gpu is not None:
54 |             input = input.cuda(gpu, non_blocking=True)
55 |         target = target.cuda(gpu, non_blocking=True)
56 | 
57 |         output = model(input)
58 |         loss = criterion(output, target)
59 | 
60 |         prec1, prec5 = accuracy(output, target, topk=(1, 5))
61 |         losses.update(loss.item(), input.size(0))
62 |         top1.update(prec1[0], input.size(0))
63 |         top5.update(prec5[0], input.size(0))
64 | 
65 |         optimizer.zero_grad()
66 |         loss.backward()
67 |         optimizer.step()
68 | 
69 | 
70 |         batch_time.update(time.time() - end)
71 |         end = time.time()
72 | 
73 |         if i % 10 == 0:
74 |             print('Epoch: [{0}][{1}/{2}]\t'
75 |                   'Time {batch_time.val:.3f} ({batch_time.avg:.3f})\t'
76 |                   'Data {data_time.val:.3f} ({data_time.avg:.3f})\t'
77 |                   'Loss {loss.val:.4f} ({loss.avg:.4f})\t'
78 |                   'Prec@1 {top1.val:.3f} ({top1.avg:.3f})\t'
79 |                   'Prec@5 {top5.val:.3f} ({top5.avg:.3f})'.format(
80 |                    epoch, i, len(train_loader), batch_time=batch_time,
81 |                    data_time=data_time, loss=losses, top1=top1, top5=top5))
82 | 


--------------------------------------------------------------------------------
/adaptive-aggregation-networks/utils/imagenet/utils_dataset.py:
--------------------------------------------------------------------------------
 1 | ##+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 | ## Copied from: https://github.com/hshustc/CVPR19_Incremental_Learning
 3 | ##
 4 | ## This source code is licensed under the MIT-style license found in the
 5 | ## LICENSE file in the root directory of this source tree
 6 | ##+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 7 | """ Tools for ImageNet """
 8 | import argparse
 9 | import os
10 | import shutil
11 | import time
12 | import numpy as np
13 | 
14 | def split_images_labels(imgs):
15 |     images = []
16 |     labels = []
17 |     for item in imgs:
18 |         images.append(item[0])
19 |         labels.append(item[1])
20 | 
21 |     return np.array(images), np.array(labels)
22 | 
23 | def merge_images_labels(images, labels):
24 |     images = list(images)
25 |     labels = list(labels)
26 |     assert(len(images)==len(labels))
27 |     imgs = []
28 |     for i in range(len(images)):
29 |         item = (images[i], labels[i])
30 |         imgs.append(item)
31 |     
32 |     return imgs
33 | 
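The two helpers above are inverses of each other: one splits a list of (image, label) pairs into parallel sequences, the other zips them back together. A minimal round-trip sketch, using plain lists instead of NumPy arrays; the names `split_pairs` and `merge_pairs` and the sample paths are hypothetical and not part of the repository:

```python
def split_pairs(imgs):
    """Split a list of (image, label) pairs into two parallel lists."""
    images = [item[0] for item in imgs]
    labels = [item[1] for item in imgs]
    return images, labels


def merge_pairs(images, labels):
    """Zip parallel image and label sequences back into a list of pairs."""
    images, labels = list(images), list(labels)
    assert len(images) == len(labels)
    return list(zip(images, labels))


# Hypothetical sample entries in the (path, class index) form used by image folders.
samples = [("n01/a.jpg", 0), ("n01/b.jpg", 0), ("n02/c.jpg", 1)]
images, labels = split_pairs(samples)
restored = merge_pairs(images, labels)
```

Splitting followed by merging should return the original list, which makes it safe to edit the label sequence on its own before rebuilding the pairs.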


--------------------------------------------------------------------------------
/adaptive-aggregation-networks/utils/imagenet/utils_train.py:
--------------------------------------------------------------------------------
 1 | ##+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 | ## Copied from: https://github.com/hshustc/CVPR19_Incremental_Learning
 3 | ##
 4 | ## This source code is licensed under the MIT-style license found in the
 5 | ## LICENSE file in the root directory of this source tree
 6 | ##+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 7 | """ Tools for ImageNet """
 8 | import argparse
 9 | import os
10 | import shutil
11 | import time
12 | 
13 | import torch
14 | import torch.nn as nn
15 | import torch.nn.parallel
16 | import torch.backends.cudnn as cudnn
17 | import torch.distributed as dist
18 | import torch.optim
19 | import torch.utils.data
20 | import torch.utils.data.distributed
21 | import torchvision.transforms as transforms
22 | import torchvision.datasets as datasets
23 | import torchvision.models as models
24 | 
25 | def validate(val_loader, model, gpu=None):
26 |     batch_time = AverageMeter()
27 |     losses = AverageMeter()
28 |     top1 = AverageMeter()
29 |     top5 = AverageMeter()
30 | 
31 |     model.eval()
32 |     criterion = nn.CrossEntropyLoss().cuda(gpu)
33 | 
34 |     with torch.no_grad():
35 |         end = time.time()
36 |         for i, (input, target) in enumerate(val_loader):
37 |             if gpu is not None:
38 |                 input = input.cuda(gpu, non_blocking=True)
39 |             target = target.cuda(gpu, non_blocking=True)
40 | 
41 |             output = model(input)
42 |             loss = criterion(output, target)
43 | 
44 |             prec1, prec5 = accuracy(output, target, topk=(1, 5))
45 |             losses.update(loss.item(), input.size(0))
46 |             top1.update(prec1[0], input.size(0))
47 |             top5.update(prec5[0], input.size(0))
48 | 
49 |             batch_time.update(time.time() - end)
50 |             end = time.time()
51 | 
52 |             if i % 10 == 0:
53 |                 print('Test: [{0}/{1}]\t'
54 |                       'Time {batch_time.val:.3f} ({batch_time.avg:.3f})\t'
55 |                       'Loss {loss.val:.4f} ({loss.avg:.4f})\t'
56 |                       'Prec@1 {top1.val:.3f} ({top1.avg:.3f})\t'
57 |                       'Prec@5 {top5.val:.3f} ({top5.avg:.3f})'.format(
58 |                        i, len(val_loader), batch_time=batch_time, loss=losses,
59 |                        top1=top1, top5=top5))
60 | 
61 |         print(' * Prec@1 {top1.avg:.3f} Prec@5 {top5.avg:.3f}'
62 |               .format(top1=top1, top5=top5))
63 | 
64 |     return top1.avg
65 | 
66 | class AverageMeter(object):
67 |     """Computes and stores the average and current value"""
68 |     def __init__(self):
69 |         self.reset()
70 | 
71 |     def reset(self):
72 |         self.val = 0
73 |         self.avg = 0
74 |         self.sum = 0
75 |         self.count = 0
76 | 
77 |     def update(self, val, n=1):
78 |         self.val = val
79 |         self.sum += val * n
80 |         self.count += n
81 |         self.avg = self.sum / self.count
82 | 
83 | def accuracy(output, target, topk=(1,)):
84 |     """Computes the precision@k for the specified values of k"""
85 |     with torch.no_grad():
86 |         maxk = max(topk)
87 |         batch_size = target.size(0)
88 | 
89 |         _, pred = output.topk(maxk, 1, True, True)
90 |         pred = pred.t()
91 |         correct = pred.eq(target.view(1, -1).expand_as(pred))
92 | 
93 |         res = []
94 |         for k in topk:
95 |             correct_k = correct[:k].reshape(-1).float().sum(0, keepdim=True)
96 |             res.append(correct_k.mul_(100.0 / batch_size))
97 |         return res
98 | 
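The meter and the precision@k helper above can be illustrated without tensors. The sketch below is a pure-Python stand-in, not the repository's implementation: `RunningAverage` and `topk_accuracy` are hypothetical names, and the scores are made-up values. It shows that the running average is weighted by batch size and that a sample counts as correct when its target is among the k highest scores.

```python
class RunningAverage:
    """Keeps the latest value and a sample-weighted running mean."""

    def __init__(self):
        self.val = 0.0
        self.sum = 0.0
        self.count = 0
        self.avg = 0.0

    def update(self, val, n=1):
        # n is the number of samples that produced val (the batch size).
        self.val = val
        self.sum += val * n
        self.count += n
        self.avg = self.sum / self.count


def topk_accuracy(scores, targets, k):
    """Percentage of rows whose target index is among the k highest scores."""
    hits = 0
    for row, target in zip(scores, targets):
        ranked = sorted(range(len(row)), key=lambda j: row[j], reverse=True)
        if target in ranked[:k]:
            hits += 1
    return 100.0 * hits / len(targets)


meter = RunningAverage()
meter.update(50.0, n=10)   # a batch of 10 samples at 50 %
meter.update(100.0, n=30)  # a batch of 30 samples at 100 %

scores = [[0.1, 0.7, 0.2], [0.5, 0.3, 0.2], [0.2, 0.3, 0.5]]
targets = [1, 1, 0]
top1 = topk_accuracy(scores, targets, k=1)
top2 = topk_accuracy(scores, targets, k=2)
```

Because the mean is weighted by `n`, a smaller final batch does not distort the epoch-level average.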


--------------------------------------------------------------------------------
/adaptive-aggregation-networks/utils/incremental/__init__.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 | # coding=utf-8
3 | # for incremental train and eval
4 | 


--------------------------------------------------------------------------------
/adaptive-aggregation-networks/utils/incremental/compute_accuracy.py:
--------------------------------------------------------------------------------
  1 | ##+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  2 | ## Created by: Yaoyao Liu
  3 | ## Modified from: https://github.com/hshustc/CVPR19_Incremental_Learning
  4 | ## Max Planck Institute for Informatics
  5 | ## yaoyao.liu@mpi-inf.mpg.de
  6 | ## Copyright (c) 2021
  7 | ##
  8 | ## This source code is licensed under the MIT-style license found in the
  9 | ## LICENSE file in the root directory of this source tree
 10 | ##+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 11 | """ The functions that compute the accuracies """
 12 | import torch
 13 | import torch.nn as nn
 14 | import torch.nn.functional as F
 15 | import torch.optim as optim
 16 | from torch.optim import lr_scheduler
 17 | import torchvision
 18 | from torchvision import datasets, models, transforms
 19 | from torch.autograd import Variable
 20 | import numpy as np
 21 | import time
 22 | import os
 23 | import copy
 24 | import argparse
 25 | from PIL import Image
 26 | from scipy.spatial.distance import cdist
 27 | from sklearn.metrics import confusion_matrix
 28 | from utils.misc import *
 29 | from utils.imagenet.utils_dataset import merge_images_labels
 30 | from utils.process_fp import process_inputs_fp
 31 | 
 32 | def map_labels(order_list, Y_set):
 33 |     map_Y = []
 34 |     for idx in Y_set:
 35 |         map_Y.append(order_list.index(idx))
 36 |     map_Y = np.array(map_Y)
 37 |     return map_Y
 38 | 
 39 | def compute_accuracy(the_args, fusion_vars, b1_model, b2_model, tg_feature_model, class_means, \
 40 |     X_protoset_cumuls, Y_protoset_cumuls, evalloader, order_list, is_start_iteration=False, \
 41 |     fast_fc=None, scale=None, print_info=True, device=None, cifar=True, imagenet=False, \
 42 |     valdir=None):
 43 |     if device is None:
 44 |         device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
 45 |     b1_model.eval()
 46 |     tg_feature_model.eval()
 47 |     # b2_model is optional, so it is only switched to eval mode when provided
 48 |     if b2_model is not None:
 49 |         b2_model.eval()
 50 |     fast_fc = 0.0
 51 |     correct = 0
 52 |     correct_icarl = 0
 53 |     correct_icarl_cosine = 0
 54 |     correct_icarl_cosine2 = 0
 55 |     correct_ncm = 0
 56 |     correct_maml = 0
 57 |     total = 0
 58 |     with torch.no_grad():
 59 |         for batch_idx, (inputs, targets) in enumerate(evalloader):
 60 |             inputs, targets = inputs.to(device), targets.to(device)
 61 |             total += targets.size(0)
 62 | 
 63 |             if is_start_iteration:
 64 |                 outputs = b1_model(inputs)
 65 |             else:
 66 |                 outputs, outputs_feature = process_inputs_fp(the_args, fusion_vars, b1_model, b2_model, inputs)
 67 |             
 68 |             outputs = F.softmax(outputs, dim=1)
 69 |             if scale is not None:
 70 |                 assert(scale.shape[0] == 1)
 71 |                 assert(outputs.shape[1] == scale.shape[1])
 72 |                 outputs = outputs / scale.repeat(outputs.shape[0], 1).type(torch.FloatTensor).to(device)
 73 |             _, predicted = outputs.max(1)
 74 |             correct += predicted.eq(targets).sum().item()
 75 | 
 76 |             if is_start_iteration:
 77 |                 outputs_feature = np.squeeze(tg_feature_model(inputs))
 78 |             sqd_icarl = cdist(class_means[:,:,0].T, outputs_feature.cpu(), 'sqeuclidean')
 79 |             score_icarl = torch.from_numpy((-sqd_icarl).T).to(device)
 80 |             _, predicted_icarl = score_icarl.max(1)
 81 |             correct_icarl += predicted_icarl.eq(targets).sum().item()
 82 |             sqd_icarl_cosine = cdist(class_means[:,:,0].T, outputs_feature.cpu(), 'cosine')
 83 |             score_icarl_cosine = torch.from_numpy((-sqd_icarl_cosine).T).to(device)
 84 |             _, predicted_icarl_cosine = score_icarl_cosine.max(1)
 85 |             correct_icarl_cosine += predicted_icarl_cosine.eq(targets).sum().item()
 86 |             fast_weights = torch.from_numpy(np.float32(class_means[:,:,0].T)).to(device)
 87 |             sqd_icarl_cosine2 = F.linear(F.normalize(torch.squeeze(outputs_feature), p=2,dim=1), F.normalize(fast_weights, p=2, dim=1))
 88 |             score_icarl_cosine2 = sqd_icarl_cosine2
 89 |             _, predicted_icarl_cosine2 = score_icarl_cosine2.max(1)
 90 |             correct_icarl_cosine2 += predicted_icarl_cosine2.eq(targets).sum().item()
 91 |             sqd_ncm = cdist(class_means[:,:,1].T, outputs_feature.cpu(), 'sqeuclidean')
 92 |             score_ncm = torch.from_numpy((-sqd_ncm).T).to(device)
 93 |             _, predicted_ncm = score_ncm.max(1)
 94 |             correct_ncm += predicted_ncm.eq(targets).sum().item()
 95 |     if print_info:
 96 |         print("  Current accuracy (FC)         :\t\t{:.2f} %".format(100.*correct/total))
 97 |         print("  Current accuracy (Proto)      :\t\t{:.2f} %".format(100.*correct_icarl/total))
 98 |         print("  Current accuracy (Proto-UB)   :\t\t{:.2f} %".format(100.*correct_ncm/total))  
 99 |     cnn_acc = 100.*correct/total
100 |     icarl_acc = 100.*correct_icarl/total
101 |     ncm_acc = 100.*correct_ncm/total
102 |     return [cnn_acc, icarl_acc, ncm_acc], fast_fc
103 | 
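The "Proto" and "Proto-UB" accuracies above both score each class by the negative squared Euclidean distance between a feature vector and that class's mean, then pick the best-scoring class. A minimal pure-Python sketch of that nearest-class-mean rule follows; the function name and the toy vectors are assumptions for illustration, and the real code works on batched tensors via `scipy.spatial.distance.cdist`.

```python
def nearest_class_mean(class_means, feature):
    """Return the index of the class mean closest to feature (squared L2)."""

    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    distances = [squared_distance(mean, feature) for mean in class_means]
    # Smallest distance is the same as the largest negated distance.
    return min(range(len(distances)), key=distances.__getitem__)


# Toy 2-D class means and features (made-up values).
class_means = [[0.0, 0.0], [1.0, 1.0], [4.0, 0.0]]
features = [[0.2, 0.1], [0.9, 1.2], [3.0, 0.5]]
predictions = [nearest_class_mean(class_means, f) for f in features]
```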


--------------------------------------------------------------------------------
/adaptive-aggregation-networks/utils/incremental/compute_features.py:
--------------------------------------------------------------------------------
 1 | ##+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 | ## Created by: Yaoyao Liu
 3 | ## Modified from: https://github.com/hshustc/CVPR19_Incremental_Learning
 4 | ## Max Planck Institute for Informatics
 5 | ## yaoyao.liu@mpi-inf.mpg.de
 6 | ## Copyright (c) 2021
 7 | ##
 8 | ## This source code is licensed under the MIT-style license found in the
 9 | ## LICENSE file in the root directory of this source tree
10 | ##+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
11 | """ The functions that compute the features """
12 | import torch
13 | import torch.nn as nn
14 | import torch.nn.functional as F
15 | import torch.optim as optim
16 | from torch.optim import lr_scheduler
17 | import torchvision
18 | from torchvision import datasets, models, transforms
19 | from torch.autograd import Variable
20 | import numpy as np
21 | import time
22 | import os
23 | import copy
24 | import argparse
25 | from PIL import Image
26 | from scipy.spatial.distance import cdist
27 | from sklearn.metrics import confusion_matrix
28 | from utils.misc import *
29 | from utils.process_fp import process_inputs_fp
30 | 
31 | def compute_features(the_args, fusion_vars, tg_model, free_model, tg_feature_model, \
32 |     is_start_iteration, evalloader, num_samples, num_features, device=None):
33 |     if device is None:
34 |         device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
35 |     tg_feature_model.eval()
36 |     tg_model.eval()
37 |     if free_model is not None:
38 |         free_model.eval()
39 | 
40 |     features = np.zeros([num_samples, num_features])
41 |     start_idx = 0
42 |     with torch.no_grad():
43 |         for inputs, targets in evalloader:
44 |             inputs = inputs.to(device)
45 |             if is_start_iteration:
46 |                 the_feature = tg_feature_model(inputs)
47 |             else:
48 |                 the_feature = process_inputs_fp(the_args, fusion_vars, tg_model, free_model, inputs, feature_mode=True)
49 |             features[start_idx:start_idx+inputs.shape[0], :] = np.squeeze(the_feature.cpu())
50 |             start_idx = start_idx+inputs.shape[0]
51 |     assert(start_idx==num_samples)
52 |     return features
53 | 


--------------------------------------------------------------------------------
/adaptive-aggregation-networks/utils/incremental/conv2d_mtl.py:
--------------------------------------------------------------------------------
  1 | ##+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  2 | ## Created by: Yaoyao Liu
  3 | ## Modified from: https://github.com/pytorch/pytorch
  4 | ## Max Planck Institute for Informatics
  5 | ## yaoyao.liu@mpi-inf.mpg.de
  6 | ## Copyright (c) 2021
  7 | ##
  8 | ## This source code is licensed under the MIT-style license found in the
  9 | ## LICENSE file in the root directory of this source tree
 10 | ##+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 11 | """ SS CONV layers. 
 12 | This file contains the source code for the scaling and shifting weights.
 13 | If this architecture is applied, the convolution weights will be frozen, and only the channel-wise masks will be updated.
 14 | """
 15 | import math
 16 | import torch
 17 | from torch.nn.parameter import Parameter
 18 | import torch.nn.functional as F
 19 | from torch.nn.modules.module import Module
 20 | from torch.nn.modules.utils import _single, _pair, _triple
 21 | 
 22 | class _ConvNdMtl(Module):
 23 | 
 24 |     def __init__(self, in_channels, out_channels, kernel_size, stride,
 25 |                  padding, dilation, transposed, output_padding, groups, bias):
 26 |         super(_ConvNdMtl, self).__init__()
 27 |         if in_channels % groups != 0:
 28 |             raise ValueError('in_channels must be divisible by groups')
 29 |         if out_channels % groups != 0:
 30 |             raise ValueError('out_channels must be divisible by groups')
 31 |         self.in_channels = in_channels
 32 |         self.out_channels = out_channels
 33 |         self.kernel_size = kernel_size
 34 |         self.stride = stride
 35 |         self.padding = padding
 36 |         self.dilation = dilation
 37 |         self.transposed = transposed
 38 |         self.output_padding = output_padding
 39 |         self.groups = groups
 40 |         if transposed:
 41 |             self.weight = Parameter(torch.Tensor(
 42 |                 in_channels, out_channels // groups, *kernel_size))
 43 |             self.mtl_weight = Parameter(torch.ones(in_channels, out_channels // groups, 1, 1))
 44 |         else:
 45 |             self.weight = Parameter(torch.Tensor(
 46 |                 out_channels, in_channels // groups, *kernel_size))
 47 |             self.mtl_weight = Parameter(torch.ones(out_channels, in_channels // groups, 1, 1))
 48 |         self.weight.requires_grad=False
 49 |         if bias:
 50 |             self.bias = Parameter(torch.Tensor(out_channels))
 51 |             self.bias.requires_grad=False
 52 |             self.mtl_bias = Parameter(torch.zeros(out_channels))
 53 |         else:
 54 |             self.register_parameter('bias', None)
 55 |             self.register_parameter('mtl_bias', None)
 56 |         self.reset_parameters()
 57 | 
 58 |     def reset_parameters(self):
 59 |         n = self.in_channels
 60 |         for k in self.kernel_size:
 61 |             n *= k
 62 |         stdv = 1. / math.sqrt(n)
 63 |         self.weight.data.uniform_(-stdv, stdv)
 64 |         self.mtl_weight.data.uniform_(1, 1)
 65 |         if self.bias is not None:
 66 |             self.bias.data.uniform_(-stdv, stdv)
 67 |             self.mtl_bias.data.uniform_(0, 0)
 68 | 
 69 |     def extra_repr(self):
 70 |         s = ('{in_channels}, {out_channels}, kernel_size={kernel_size}'
 71 |              ', stride={stride}')
 72 |         if self.padding != (0,) * len(self.padding):
 73 |             s += ', padding={padding}'
 74 |         if self.dilation != (1,) * len(self.dilation):
 75 |             s += ', dilation={dilation}'
 76 |         if self.output_padding != (0,) * len(self.output_padding):
 77 |             s += ', output_padding={output_padding}'
 78 |         if self.groups != 1:
 79 |             s += ', groups={groups}'
 80 |         if self.bias is None:
 81 |             s += ', bias=False'
 82 |         return s.format(**self.__dict__)
 83 | 
 84 | class Conv2dMtl(_ConvNdMtl):
 85 | 
 86 |     def __init__(self, in_channels, out_channels, kernel_size, stride=1,
 87 |                  padding=0, dilation=1, groups=1, bias=True):
 88 |         kernel_size = _pair(kernel_size)
 89 |         stride = _pair(stride)
 90 |         padding = _pair(padding)
 91 |         dilation = _pair(dilation)
 92 |         super(Conv2dMtl, self).__init__(
 93 |             in_channels, out_channels, kernel_size, stride, padding, dilation,
 94 |             False, _pair(0), groups, bias)
 95 | 
 96 |     def forward(self, input):
 97 |         new_mtl_weight = self.mtl_weight.expand(self.weight.shape)
 98 |         new_weight = self.weight.mul(new_mtl_weight)
 99 |         if self.bias is not None:
100 |             new_bias = self.bias + self.mtl_bias
101 |         else:
102 |             new_bias = None
103 |         return F.conv2d(input, new_weight, new_bias, self.stride,
104 |                         self.padding, self.dilation, self.groups)
105 | 
106 | 
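The forward pass above keeps the convolution weights frozen and learns only a per-channel-pair scale and a per-output-channel shift. The sketch below reproduces just that weight arithmetic on nested lists, with no convolution and no PyTorch; `scale_and_shift` is a hypothetical helper, and the scale is stored as an `[out][in]` table of scalars rather than the `(out, in, 1, 1)` tensor that the layer expands.

```python
def scale_and_shift(weight, bias, mtl_weight, mtl_bias):
    """Apply channel-wise scaling to the kernels and shifting to the bias.

    weight:     nested lists indexed [out][in][kh][kw]
    mtl_weight: nested lists indexed [out][in], one scalar per kernel
    bias, mtl_bias: lists with one entry per output channel
    """
    new_weight = [
        [
            [[w * mtl_weight[o][i] for w in row] for row in kernel]
            for i, kernel in enumerate(in_kernels)
        ]
        for o, in_kernels in enumerate(weight)
    ]
    new_bias = [b + s for b, s in zip(bias, mtl_bias)]
    return new_weight, new_bias


# One output channel, one input channel, a 2x2 kernel (made-up values).
weight = [[[[1.0, 2.0], [3.0, 4.0]]]]
bias = [0.5]

# Scale 1 and shift 0 (the initial values) leave the layer unchanged.
same_w, same_b = scale_and_shift(weight, bias, [[1.0]], [0.0])

# A learned scale of 2 and shift of 0.25 rescale the whole kernel at once.
new_w, new_b = scale_and_shift(weight, bias, [[2.0]], [0.25])
```

Since there is one scale per kernel rather than per weight, the number of trainable values is much smaller than the number of convolution weights.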


--------------------------------------------------------------------------------
/adaptive-aggregation-networks/utils/misc.py:
--------------------------------------------------------------------------------
  1 | from __future__ import print_function, division
  2 | 
  3 | import torch
  4 | import torch.nn as nn
  5 | import torch.nn.init as init
  6 | from collections import OrderedDict
  7 | import torch.optim as optim
  8 | import torchvision
  9 | import argparse
 10 | import numpy as np
 11 | import os
 12 | import os.path as osp
 13 | import sys
 14 | import time
 15 | import math
 16 | import subprocess
 17 | try:
 18 |     import cPickle as pickle
 19 | except ImportError:
 20 |     import pickle
 21 | 
 22 | def savepickle(data, file_path):
 23 |     mkdir_p(osp.dirname(file_path), delete=False)
 24 |     print('pickle into', file_path)
 25 |     with open(file_path, 'wb') as f:
 26 |         pickle.dump(data, f, pickle.HIGHEST_PROTOCOL)
 27 | 
 28 | def unpickle(file_path):
 29 |     with open(file_path, 'rb') as f:
 30 |         data = pickle.load(f)
 31 |     return data
 32 | 
 33 | def mkdir_p(path, delete=False, print_info=True):
 34 |     if path == '': return
 35 | 
 36 |     if delete:
 37 |         subprocess.call(['rm', '-r', path])
 38 |     if not osp.exists(path):
 39 |         if print_info:
 40 |             print('mkdir -p  ' + path)
 41 |         subprocess.call(['mkdir', '-p', path])
 42 | 
 43 | def get_mean_and_std(dataset):
 44 |     '''Compute the mean and std value of dataset.'''
 45 |     dataloader = torch.utils.data.DataLoader(dataset, batch_size=1, shuffle=True, num_workers=2)
 46 |     mean = torch.zeros(3)
 47 |     std = torch.zeros(3)
 48 |     print('==> Computing mean and std..')
 49 |     for inputs, targets in dataloader:
 50 |         for i in range(3):
 51 |             mean[i] += inputs[:,i,:,:].mean()
 52 |             std[i] += inputs[:,i,:,:].std()
 53 |     mean.div_(len(dataset))
 54 |     std.div_(len(dataset))
 55 |     return mean, std
 56 | 
 57 | def init_params(net):
 58 |     '''Init layer parameters.'''
 59 |     for m in net.modules():
 60 |         if isinstance(m, nn.Conv2d):
 61 |             init.kaiming_normal_(m.weight, mode='fan_out')
 62 |             if m.bias is not None:
 63 |                 init.constant_(m.bias, 0)
 64 |         elif isinstance(m, nn.BatchNorm2d):
 65 |             init.constant_(m.weight, 1)
 66 |             init.constant_(m.bias, 0)
 67 |         elif isinstance(m, nn.Linear):
 68 |             init.normal_(m.weight, std=1e-3)
 69 |             if m.bias is not None:
 70 |                 init.constant_(m.bias, 0)
 71 | 
 72 | def map_labels(order_list, Y_set):
 73 |     map_Y = []
 74 |     for idx in Y_set:
 75 |         map_Y.append(order_list.index(idx))
 76 |     map_Y = np.array(map_Y)
 77 |     return map_Y
 78 | 
 79 | def format_time(seconds):
 80 |     days = int(seconds / 3600/24)
 81 |     seconds = seconds - days*3600*24
 82 |     hours = int(seconds / 3600)
 83 |     seconds = seconds - hours*3600
 84 |     minutes = int(seconds / 60)
 85 |     seconds = seconds - minutes*60
 86 |     secondsf = int(seconds)
 87 |     seconds = seconds - secondsf
 88 |     millis = int(seconds*1000)
 89 | 
 90 |     f = ''
 91 |     i = 1
 92 |     if days > 0:
 93 |         f += str(days) + 'D'
 94 |         i += 1
 95 |     if hours > 0 and i <= 2:
 96 |         f += str(hours) + 'h'
 97 |         i += 1
 98 |     if minutes > 0 and i <= 2:
 99 |         f += str(minutes) + 'm'
100 |         i += 1
101 |     if secondsf > 0 and i <= 2:
102 |         f += str(secondsf) + 's'
103 |         i += 1
104 |     if millis > 0 and i <= 2:
105 |         f += str(millis) + 'ms'
106 |         i += 1
107 |     if f == '':
108 |         f = '0ms'
109 |     return f
110 | 
111 | def tensor2im(input_image, imtype=np.uint8):
112 |     mean = [0.5071,  0.4866,  0.4409]
113 |     std = [0.2009,  0.1984,  0.2023]
114 |     if not isinstance(input_image, np.ndarray):
115 |         if isinstance(input_image, torch.Tensor):
116 |             image_tensor = input_image.data
117 |         else:
118 |             return input_image
119 |         image_numpy = image_tensor.cpu().detach().float().numpy()
120 |         if image_numpy.shape[0] == 1: 
121 |             image_numpy = np.tile(image_numpy, (3, 1, 1))
122 |         for i in range(len(mean)): 
123 |             image_numpy[i] = image_numpy[i] * std[i] + mean[i]
124 |         image_numpy = image_numpy * 255
125 |         image_numpy = np.transpose(image_numpy, (1, 2, 0))
126 |     else:
127 |         image_numpy = input_image
128 |     return image_numpy.astype(imtype)
129 | 
130 | def process_mnemonics(X_protoset_cumuls, Y_protoset_cumuls, mnemonics, mnemonics_label, order_list):
131 |     mnemonics_array_new = np.zeros(np.array(X_protoset_cumuls).shape)
132 |     mnemonics_list = []
133 |     mnemonics_label_list = []
134 |     for idx in range(len(mnemonics)):
135 |         this_mnemonics = []
136 |         for sub_idx in range(len(mnemonics[idx])):
137 |             processed_img = tensor2im(mnemonics[idx][sub_idx])
138 |             mnemonics_array_new[idx][sub_idx] = processed_img
139 |     
140 |     diff = len(X_protoset_cumuls) - len(mnemonics_array_new)
141 |     for idx in range(len(mnemonics_array_new)):
142 |         X_protoset_cumuls[idx+diff] = mnemonics_array_new[idx]
143 | 
144 |     return X_protoset_cumuls
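
`format_time` above prints at most the two most significant non-zero units of a duration. The compact re-implementation below is a sketch intended to mirror that behavior for illustration; `format_time_sketch` is a hypothetical name and is not part of the repository.

```python
def format_time_sketch(seconds):
    """Format a duration using at most its two largest non-zero units."""
    units = [("D", 86400), ("h", 3600), ("m", 60), ("s", 1)]
    parts = []
    remaining = seconds
    for suffix, size in units:
        count = int(remaining / size)
        remaining -= count * size
        if count > 0 and len(parts) < 2:
            parts.append(f"{count}{suffix}")
    millis = int(remaining * 1000)
    if millis > 0 and len(parts) < 2:
        parts.append(f"{millis}ms")
    return "".join(parts) or "0ms"


short = format_time_sketch(0.25)      # sub-second duration
medium = format_time_sketch(3661.5)   # 1 hour, 1 minute, 1.5 seconds
long = format_time_sketch(90061)      # 1 day, 1 hour, 1 minute, 1 second
```

Smaller units are dropped once two parts have been emitted, so 3661.5 seconds is reported as hours and minutes only.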


--------------------------------------------------------------------------------
/adaptive-aggregation-networks/utils/process_fp.py:
--------------------------------------------------------------------------------
 1 | ##+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 | ## Created by: Yaoyao Liu
 3 | ## Max Planck Institute for Informatics
 4 | ## yaoyao.liu@mpi-inf.mpg.de
 5 | ## Copyright (c) 2021
 6 | ##
 7 | ## This source code is licensed under the MIT-style license found in the
 8 | ## LICENSE file in the root directory of this source tree
 9 | ##+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
10 | """ Using the aggregation weights to compute the feature maps from two branches """
11 | import torch
12 | import torch.nn as nn
13 | from utils.misc import *
14 | 
15 | def process_inputs_fp(the_args, fusion_vars, b1_model, b2_model, inputs, feature_mode=False):
16 | 
17 |     # The 1st level
18 |     if the_args.dataset == 'cifar100':
19 |         b1_model_group1 = [b1_model.conv1, b1_model.bn1, b1_model.relu, b1_model.layer1]
20 |         b2_model_group1 = [b2_model.conv1, b2_model.bn1, b2_model.relu, b2_model.layer1]
21 |     elif the_args.dataset == 'imagenet_sub' or the_args.dataset == 'imagenet':
22 |         b1_model_group1 = [b1_model.conv1, b1_model.bn1, b1_model.relu, b1_model.maxpool, b1_model.layer1]
23 |         b2_model_group1 = [b2_model.conv1, b2_model.bn1, b2_model.relu, b2_model.maxpool, b2_model.layer1]
24 |     else:
25 |         raise ValueError('Please set correct dataset.')
26 |     b1_model_group1 = nn.Sequential(*b1_model_group1)
27 |     b1_fp1 = b1_model_group1(inputs)
28 |     b2_model_group1 = nn.Sequential(*b2_model_group1)
29 |     b2_fp1 = b2_model_group1(inputs)
30 |     fp1 = fusion_vars[0]*b1_fp1+(1-fusion_vars[0])*b2_fp1
31 | 
32 |     # The 2nd level
33 |     b1_model_group2 = b1_model.layer2
34 |     b1_fp2 = b1_model_group2(fp1)
35 |     b2_model_group2 = b2_model.layer2
36 |     b2_fp2 = b2_model_group2(fp1)
37 |     fp2 = fusion_vars[1]*b1_fp2+(1-fusion_vars[1])*b2_fp2
38 | 
39 |     # The 3rd level
40 |     if the_args.dataset == 'cifar100':
41 |         b1_model_group3 = [b1_model.layer3, b1_model.avgpool]
42 |         b2_model_group3 = [b2_model.layer3, b2_model.avgpool]
43 |     elif the_args.dataset == 'imagenet_sub' or the_args.dataset == 'imagenet':
44 |         b1_model_group3 = [b1_model.layer3]
45 |         b2_model_group3 = [b2_model.layer3]
46 |     else:
47 |         raise ValueError('Please set correct dataset.')
48 |     b1_model_group3 = nn.Sequential(*b1_model_group3)
49 |     b1_fp3 = b1_model_group3(fp2)
50 |     b2_model_group3 = nn.Sequential(*b2_model_group3)
51 |     b2_fp3 = b2_model_group3(fp2)
52 |     fp3 = fusion_vars[2]*b1_fp3+(1-fusion_vars[2])*b2_fp3
53 | 
54 |     if the_args.dataset == 'cifar100': 
55 |         fp_final = fp3.view(fp3.size(0), -1)
56 |     elif the_args.dataset == 'imagenet_sub' or the_args.dataset == 'imagenet':
57 |         # The 4th level
58 |         b1_model_group4 = [b1_model.layer4, b1_model.avgpool]
59 |         b1_model_group4 = nn.Sequential(*b1_model_group4)
60 |         b1_fp4 = b1_model_group4(fp3)
61 |         b2_model_group4 = [b2_model.layer4, b2_model.avgpool]
62 |         b2_model_group4 = nn.Sequential(*b2_model_group4)
63 |         b2_fp4 = b2_model_group4(fp3)
64 |         fp4 = fusion_vars[3]*b1_fp4+(1-fusion_vars[3])*b2_fp4
65 |         fp_final = fp4.view(fp4.size(0), -1)
66 |     else:
67 |         raise ValueError('Please set correct dataset.')
68 |     if feature_mode:
69 |         return fp_final
70 |     else:
71 |         outputs = b1_model.fc(fp_final)
72 |         return outputs, fp_final
73 | 
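At every level, `process_inputs_fp` feeds the same fused input to both branches and blends their outputs as `alpha * b1 + (1 - alpha) * b2`, where `alpha` is that level's entry in `fusion_vars`. The sketch below shows the same recurrence on plain numbers; `fuse_branches` and the toy lambda "blocks" are assumptions for illustration, standing in for the residual groups of the two networks.

```python
def fuse_branches(blocks_b1, blocks_b2, fusion_vars, x):
    """Run two branches level by level, blending their outputs at each level."""
    for block1, block2, alpha in zip(blocks_b1, blocks_b2, fusion_vars):
        # Both branches receive the same fused input from the previous level.
        x = alpha * block1(x) + (1 - alpha) * block2(x)
    return x


# Toy two-level branches operating on a scalar (made-up functions).
blocks_b1 = [lambda v: v + 1, lambda v: 2 * v]
blocks_b2 = [lambda v: v - 1, lambda v: v]

mixed = fuse_branches(blocks_b1, blocks_b2, [0.5, 1.0], 3.0)
only_b1 = fuse_branches(blocks_b1, blocks_b2, [1.0, 1.0], 3.0)
only_b2 = fuse_branches(blocks_b1, blocks_b2, [0.0, 0.0], 3.0)
```

Setting every weight to 1 or to 0 recovers the first or second branch alone, which is a convenient sanity check for the blending step.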


--------------------------------------------------------------------------------
/mnemonics-training/1_train/main.py:
--------------------------------------------------------------------------------
 1 | import os
 2 | import argparse
 3 | import numpy as np
 4 | from trainer.mnemonics import Trainer as MnemonicsTrainer
 5 | from trainer.baseline import Trainer as BaselineTrainer
 6 | 
 7 | if __name__ == '__main__':
 8 |     parser = argparse.ArgumentParser()
 9 |     parser.add_argument('--gpu', default='0')
10 |     parser.add_argument('--method', default='mnemonics', type=str, choices=['mnemonics', 'baseline'])
11 |     parser.add_argument('--dataset', default='cifar100', type=str, choices=['cifar100'])
12 |     parser.add_argument('--data_dir', default='data/seed_1993_subset_100_imagenet/data', type=str)
13 |     parser.add_argument('--num_classes', default=100, type=int)
14 |     parser.add_argument('--nb_cl_fg', default=50, type=int)
15 |     parser.add_argument('--nb_cl', default=10, type=int)
16 |     parser.add_argument('--nb_protos', default=20, type=int)
17 |     parser.add_argument('--nb_runs', default=1, type=int)
18 |     parser.add_argument('--epochs', default=160, type=int)
19 |     parser.add_argument('--T', default=2, type=float)
20 |     parser.add_argument('--beta', default=0.25, type=float)
21 |     parser.add_argument('--resume', action='store_true')
22 |     parser.add_argument('--resume_fg', action='store_true')
23 |     parser.add_argument('--ckpt_dir_fg', type=str, default='-')
24 |     parser.add_argument('--dynamic_budget', action='store_true')
25 |     parser.add_argument('--phase', type=str, default='train', choices=['train', 'eval'])
26 |     parser.add_argument('--fusion_mode', default='free', type=str, choices=['std', 'free', 'mtl'])
27 |     parser.add_argument('--ckpt_label', type=str, default='01')
28 |     parser.add_argument('--num_workers', default=2, type=int)
29 |     parser.add_argument('--load_iter', default=0, type=int)
30 |     parser.add_argument('--dictionary_size', default=500, type=int)
31 |     parser.add_argument('--mimic_score', action='store_true')
32 |     parser.add_argument('--lw_ms', default=1, type=float)
33 |     parser.add_argument('--rs_ratio', default=0, type=float)
34 |     parser.add_argument('--less_forget', action='store_true')
35 |     parser.add_argument('--lamda', default=5, type=float)
36 |     parser.add_argument('--adapt_lamda', action='store_true')
37 |     parser.add_argument('--dist', default=0.5, type=float)
38 |     parser.add_argument('--K', default=2, type=int)
39 |     parser.add_argument('--lw_mr', default=1, type=float)
40 |     parser.add_argument('--random_seed', default=1993, type=int)
41 |     parser.add_argument('--train_batch_size', default=128, type=int)
42 |     parser.add_argument('--test_batch_size', default=100, type=int)
43 |     parser.add_argument('--eval_batch_size', default=128, type=int)
44 |     parser.add_argument('--base_lr1', default=0.1, type=float)
45 |     parser.add_argument('--base_lr2', default=0.1, type=float)
46 |     parser.add_argument('--lr_factor', default=0.1, type=float)
47 |     parser.add_argument('--custom_weight_decay', default=5e-4, type=float)
48 |     parser.add_argument('--custom_momentum', default=0.9, type=float)
49 |     parser.add_argument('--load_ckpt_prefix', type=str, default='-')
50 |     parser.add_argument('--load_order', type=str, default='-')
51 |     parser.add_argument('--maml_lr', default=0.1, type=float)
52 |     parser.add_argument('--maml_epoch', default=50, type=int)
53 |     parser.add_argument('--mnemonics_images_per_class_per_step', default=1, type=int)    
54 |     parser.add_argument('--mnemonics_steps', default=20, type=int)    
55 |     parser.add_argument('--mnemonics_epochs', default=1, type=int)    
56 |     parser.add_argument('--mnemonics_lr', type=float, default=1e-5)
57 |     parser.add_argument('--mnemonics_decay_factor', type=float, default=0.5)
58 |     parser.add_argument('--mnemonics_outer_lr', type=float, default=1e-5)
59 |     parser.add_argument('--mnemonics_total_epochs', type=int, default=1)
60 |     parser.add_argument('--mnemonics_decay_epochs', type=int, default=1)
61 | 
62 |     the_args = parser.parse_args()
63 | 
64 |     assert(the_args.nb_cl_fg % the_args.nb_cl == 0)
65 |     assert(the_args.nb_cl_fg >= the_args.nb_cl)
66 | 
67 |     print(the_args)
68 | 
69 |     os.environ['CUDA_VISIBLE_DEVICES'] = the_args.gpu
70 |     print('Using gpu:', the_args.gpu)
71 | 
72 |     if the_args.method == 'mnemonics':
73 |         trainer = MnemonicsTrainer(the_args)
74 |     elif the_args.method == 'baseline':
75 |         trainer = BaselineTrainer(the_args)
76 |     else:
77 |         raise ValueError('Please set the correct method.')
78 |     trainer.train()
79 | 
80 | 


--------------------------------------------------------------------------------
/mnemonics-training/1_train/models/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/yaoyao-liu/class-incremental-learning/701af9f819f559c6ab3d3ee73bb3d7c21e924572/mnemonics-training/1_train/models/__init__.py


--------------------------------------------------------------------------------
/mnemonics-training/1_train/models/modified_linear.py:
--------------------------------------------------------------------------------
 1 | import math
 2 | import torch
 3 | from torch.nn import Module
 4 | from torch.nn.parameter import Parameter
 5 | from torch.nn import functional as F
 6 | 
 7 | class CosineLinear(Module):
 8 |     def __init__(self, in_features, out_features, sigma=True):
 9 |         super(CosineLinear, self).__init__()
10 |         self.in_features = in_features
11 |         self.out_features = out_features
12 |         self.weight = Parameter(torch.Tensor(out_features, in_features))
13 |         if sigma:
14 |             self.sigma = Parameter(torch.Tensor(1))
15 |         else:
16 |             self.register_parameter('sigma', None)
17 |         self.reset_parameters()
18 | 
19 |     def reset_parameters(self):
20 |         stdv = 1. / math.sqrt(self.weight.size(1))
21 |         self.weight.data.uniform_(-stdv, stdv)
22 |         if self.sigma is not None:
23 |             self.sigma.data.fill_(1)
24 | 
25 |     def forward(self, input):
26 |         out = F.linear(F.normalize(input, p=2,dim=1), \
27 |                 F.normalize(self.weight, p=2, dim=1))
28 |         if self.sigma is not None:
29 |             out = self.sigma * out
30 |         return out
31 | 
32 | class SplitCosineLinear(Module):
33 |     def __init__(self, in_features, out_features1, out_features2, sigma=True):
34 |         super(SplitCosineLinear, self).__init__()
35 |         self.in_features = in_features
36 |         self.out_features = out_features1 + out_features2
37 |         self.fc1 = CosineLinear(in_features, out_features1, False)
38 |         self.fc2 = CosineLinear(in_features, out_features2, False)
39 |         if sigma:
40 |             self.sigma = Parameter(torch.Tensor(1))
41 |             self.sigma.data.fill_(1)
42 |         else:
43 |             self.register_parameter('sigma', None)
44 | 
45 |     def forward(self, x):
46 |         out1 = self.fc1(x)
47 |         out2 = self.fc2(x)
48 |         out = torch.cat((out1, out2), dim=1)
49 |         if self.sigma is not None:
50 |             out = self.sigma * out
51 |         return out
52 | 
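
Illustrative sketch (not part of the repository): `CosineLinear` above L2-normalizes both the input features and every class weight row before the linear product, so each logit is a cosine similarity in [-1, 1] multiplied by `sigma`. The plain-Python version below mirrors that computation without PyTorch. The helper names `l2_normalize` and `cosine_logits` and the sample values are assumptions made for illustration only.

```python
import math

def l2_normalize(vec, eps=1e-12):
    """Scale a vector to unit L2 norm; eps guards against division by zero."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / max(norm, eps) for v in vec]

def cosine_logits(features, weights, sigma=1.0):
    """Return sigma * cos(features, w) for each class weight row w (hypothetical helper)."""
    f = l2_normalize(features)
    return [sigma * sum(a * b for a, b in zip(f, l2_normalize(w))) for w in weights]

# Two classes in a 2-D feature space (made-up values).
weights = [[1.0, 0.0], [0.0, 2.0]]
logits = cosine_logits([3.0, 4.0], weights)
# Rescaling the feature vector leaves the logits unchanged: only direction matters.
scaled = cosine_logits([30.0, 40.0], weights)
print(logits, scaled)
```

Because feature and weight magnitudes cancel out, the learnable `sigma` is the only factor that controls how peaked the softmax over these logits can become.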


--------------------------------------------------------------------------------
/mnemonics-training/1_train/models/modified_resnet_cifar.py:
--------------------------------------------------------------------------------
  1 | import math
  2 | import torch.nn as nn
  3 | import torch.utils.model_zoo as model_zoo
  4 | import models.modified_linear as modified_linear
  5 | 
  6 | def conv3x3(in_planes, out_planes, stride=1):
  7 |     return nn.Conv2d(in_planes, out_planes, kernel_size=3, stride=stride,
  8 |                      padding=1, bias=False)
  9 | 
 10 | class BasicBlock(nn.Module):
 11 |     expansion = 1
 12 | 
 13 |     def __init__(self, inplanes, planes, stride=1, downsample=None, last=False):
 14 |         super(BasicBlock, self).__init__()
 15 |         self.conv1 = conv3x3(inplanes, planes, stride)
 16 |         self.bn1 = nn.BatchNorm2d(planes)
 17 |         self.relu = nn.ReLU(inplace=True)
 18 |         self.conv2 = conv3x3(planes, planes)
 19 |         self.bn2 = nn.BatchNorm2d(planes)
 20 |         self.downsample = downsample
 21 |         self.stride = stride
 22 |         self.last = last
 23 | 
 24 |     def forward(self, x):
 25 |         residual = x
 26 | 
 27 |         out = self.conv1(x)
 28 |         out = self.bn1(out)
 29 |         out = self.relu(out)
 30 | 
 31 |         out = self.conv2(out)
 32 |         out = self.bn2(out)
 33 | 
 34 |         if self.downsample is not None:
 35 |             residual = self.downsample(x)
 36 | 
 37 |         out += residual
 38 |         if not self.last:
 39 |             out = self.relu(out)
 40 | 
 41 |         return out
 42 | 
 43 | class ResNet(nn.Module):
 44 | 
 45 |     def __init__(self, block, layers, num_classes=10):
 46 |         self.inplanes = 16
 47 |         super(ResNet, self).__init__()
 48 |         self.conv1 = nn.Conv2d(3, 16, kernel_size=3, stride=1, padding=1,
 49 |                                bias=False)
 50 |         self.bn1 = nn.BatchNorm2d(16)
 51 |         self.relu = nn.ReLU(inplace=True)
 52 |         self.layer1 = self._make_layer(block, 16, layers[0])
 53 |         self.layer2 = self._make_layer(block, 32, layers[1], stride=2)
 54 |         self.layer3 = self._make_layer(block, 64, layers[2], stride=2, last_phase=True)
 55 |         self.avgpool = nn.AvgPool2d(8, stride=1)
 56 |         self.fc = modified_linear.CosineLinear(64 * block.expansion, num_classes)
 57 | 
 58 |         for m in self.modules():
 59 |             if isinstance(m, nn.Conv2d):
 60 |                 nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
 61 |             elif isinstance(m, nn.BatchNorm2d):
 62 |                 nn.init.constant_(m.weight, 1)
 63 |                 nn.init.constant_(m.bias, 0)
 64 | 
 65 |     def _make_layer(self, block, planes, blocks, stride=1, last_phase=False):
 66 |         downsample = None
 67 |         if stride != 1 or self.inplanes != planes * block.expansion:
 68 |             downsample = nn.Sequential(
 69 |                 nn.Conv2d(self.inplanes, planes * block.expansion,
 70 |                           kernel_size=1, stride=stride, bias=False),
 71 |                 nn.BatchNorm2d(planes * block.expansion),
 72 |             )
 73 | 
 74 |         layers = []
 75 |         layers.append(block(self.inplanes, planes, stride, downsample))
 76 |         self.inplanes = planes * block.expansion
 77 |         if last_phase:
 78 |             for i in range(1, blocks-1):
 79 |                 layers.append(block(self.inplanes, planes))
 80 |             layers.append(block(self.inplanes, planes, last=True))
 81 |         else: 
 82 |             for i in range(1, blocks):
 83 |                 layers.append(block(self.inplanes, planes))
 84 | 
 85 |         return nn.Sequential(*layers)
 86 | 
 87 |     def forward(self, x):
 88 |         x = self.conv1(x)
 89 |         x = self.bn1(x)
 90 |         x = self.relu(x)
 91 | 
 92 |         x = self.layer1(x)
 93 |         x = self.layer2(x)
 94 |         x = self.layer3(x)
 95 | 
 96 |         x = self.avgpool(x)
 97 |         x = x.view(x.size(0), -1)
 98 |         x = self.fc(x)
 99 | 
100 |         return x
101 | 
102 | def resnet32(pretrained=False, **kwargs):
103 |     n = 5
104 |     model = ResNet(BasicBlock, [n, n, n], **kwargs)
105 |     return model
106 | 


--------------------------------------------------------------------------------
/mnemonics-training/1_train/models/modified_resnetmtl_cifar.py:
--------------------------------------------------------------------------------
  1 | ##+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  2 | ## Modified by: Yaoyao Liu
  3 | ## Modified from: https://github.com/hshustc/CVPR19_Incremental_Learning
  4 | ## MPI for Informatics
  5 | ## yaoyao.liu@mpi-inf.mpg.de
  6 | ## Copyright (c) 2020
  7 | ##
  8 | ## This source code is licensed under the MIT-style license found in the
  9 | ## LICENSE file in the root directory of this source tree
 10 | ##+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 11 | """Modified ResNet with transferring weights."""
 12 | import math
 13 | import torch.nn as nn
 14 | import torch.utils.model_zoo as model_zoo
 15 | import models.modified_linear as modified_linear
 16 | from utils.conv2d_mtl import Conv2dMtl
 17 | 
 18 | def conv3x3mtl(in_planes, out_planes, stride=1):
 19 |     return Conv2dMtl(in_planes, out_planes, kernel_size=3, stride=stride,
 20 |                      padding=1, bias=False)
 21 | 
 22 | class BasicBlockMtl(nn.Module):
 23 |     expansion = 1
 24 | 
 25 |     def __init__(self, inplanes, planes, stride=1, downsample=None, last=False):
 26 |         super(BasicBlockMtl, self).__init__()
 27 |         self.conv1 = conv3x3mtl(inplanes, planes, stride)
 28 |         self.bn1 = nn.BatchNorm2d(planes)
 29 |         self.relu = nn.ReLU(inplace=True)
 30 |         self.conv2 = conv3x3mtl(planes, planes)
 31 |         self.bn2 = nn.BatchNorm2d(planes)
 32 |         self.downsample = downsample
 33 |         self.stride = stride
 34 |         self.last = last
 35 | 
 36 |     def forward(self, x):
 37 |         residual = x
 38 | 
 39 |         out = self.conv1(x)
 40 |         out = self.bn1(out)
 41 |         out = self.relu(out)
 42 | 
 43 |         out = self.conv2(out)
 44 |         out = self.bn2(out)
 45 | 
 46 |         if self.downsample is not None:
 47 |             residual = self.downsample(x)
 48 | 
 49 |         out += residual
 50 |         if not self.last:
 51 |             out = self.relu(out)
 52 | 
 53 |         return out
 54 | 
 55 | class ResNetMtl(nn.Module):
 56 | 
 57 |     def __init__(self, block, layers, num_classes=10):
 58 |         self.inplanes = 16
 59 |         super(ResNetMtl, self).__init__()
 60 |         self.conv1 = Conv2dMtl(3, 16, kernel_size=3, stride=1, padding=1,
 61 |                                bias=False)
 62 |         self.bn1 = nn.BatchNorm2d(16)
 63 |         self.relu = nn.ReLU(inplace=True)
 64 |         self.layer1 = self._make_layer(block, 16, layers[0])
 65 |         self.layer2 = self._make_layer(block, 32, layers[1], stride=2)
 66 |         self.layer3 = self._make_layer(block, 64, layers[2], stride=2, last_phase=True)
 67 |         self.avgpool = nn.AvgPool2d(8, stride=1)
 68 |         self.fc = modified_linear.CosineLinear(64 * block.expansion, num_classes)
 69 | 
 70 |         for m in self.modules():
 71 |             if isinstance(m, Conv2dMtl):
 72 |                 nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
 73 |             elif isinstance(m, nn.BatchNorm2d):
 74 |                 nn.init.constant_(m.weight, 1)
 75 |                 nn.init.constant_(m.bias, 0)
 76 | 
 77 |     def _make_layer(self, block, planes, blocks, stride=1, last_phase=False):
 78 |         downsample = None
 79 |         if stride != 1 or self.inplanes != planes * block.expansion:
 80 |             downsample = nn.Sequential(
 81 |                 Conv2dMtl(self.inplanes, planes * block.expansion,
 82 |                           kernel_size=1, stride=stride, bias=False),
 83 |                 nn.BatchNorm2d(planes * block.expansion),
 84 |             )
 85 | 
 86 |         layers = []
 87 |         layers.append(block(self.inplanes, planes, stride, downsample))
 88 |         self.inplanes = planes * block.expansion
 89 |         if last_phase:
 90 |             for i in range(1, blocks-1):
 91 |                 layers.append(block(self.inplanes, planes))
 92 |             layers.append(block(self.inplanes, planes, last=True))
 93 |         else: 
 94 |             for i in range(1, blocks):
 95 |                 layers.append(block(self.inplanes, planes))
 96 | 
 97 |         return nn.Sequential(*layers)
 98 | 
 99 |     def forward(self, x):
100 |         x = self.conv1(x)
101 |         x = self.bn1(x)
102 |         x = self.relu(x)
103 | 
104 |         x = self.layer1(x)
105 |         x = self.layer2(x)
106 |         x = self.layer3(x)
107 | 
108 |         x = self.avgpool(x)
109 |         x = x.view(x.size(0), -1)
110 |         x = self.fc(x)
111 | 
112 |         return x
113 | 
114 | def resnetmtl32(pretrained=False, **kwargs):
115 |     n = 5
116 |     model = ResNetMtl(BasicBlockMtl, [n, n, n], **kwargs)
117 |     return model
118 | 


--------------------------------------------------------------------------------
/mnemonics-training/1_train/trainer/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/yaoyao-liu/class-incremental-learning/701af9f819f559c6ab3d3ee73bb3d7c21e924572/mnemonics-training/1_train/trainer/__init__.py


--------------------------------------------------------------------------------
/mnemonics-training/1_train/trainer/incremental.py:
--------------------------------------------------------------------------------
 1 | import torch
 2 | import tqdm
 3 | import torch.nn as nn
 4 | from torch.optim import lr_scheduler
 5 | from torchvision import datasets, models, transforms
 6 | from utils.misc import *
 7 | from utils.process_fp import process_inputs_fp
 8 | import torch.nn.functional as F
 9 | 
10 | def incremental_train_and_eval(epochs, tg_model, ref_model, free_model, ref_free_model, tg_optimizer, tg_lr_scheduler, trainloader, testloader, iteration, start_iteration, lamda, dist, K, lw_mr, fix_bn=False, weight_per_class=None, device=None):
11 |     if device is None:
12 |         device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
13 |     T = 2.0
14 |     beta = 0.25
15 |     if iteration > start_iteration:
16 |         ref_model.eval()
17 |         num_old_classes = ref_model.fc.out_features
18 |     for epoch in range(epochs):
19 |         tg_model.train()
20 |         if fix_bn:
21 |             for m in tg_model.modules():
22 |                 if isinstance(m, nn.BatchNorm2d):
23 |                     m.eval()
24 |         train_loss = 0
25 |         train_loss1 = 0
26 |         train_loss2 = 0
27 |         correct = 0
28 |         total = 0
29 |         tg_lr_scheduler.step()
30 |         print('\nEpoch: %d, LR: ' % epoch, end='')
31 |         print(tg_lr_scheduler.get_lr())
32 |         for batch_idx, (inputs, targets) in enumerate(trainloader):
33 |             inputs, targets = inputs.to(device), targets.to(device)
34 |             tg_optimizer.zero_grad()
35 |             outputs = tg_model(inputs)
36 |             if iteration == start_iteration:
37 |                 loss = nn.CrossEntropyLoss(weight_per_class)(outputs, targets)
38 |             else:
39 |                 ref_outputs = ref_model(inputs)
40 |                 loss1 = nn.KLDivLoss()(F.log_softmax(outputs[:,:num_old_classes]/T, dim=1), \
41 |                     F.softmax(ref_outputs.detach()/T, dim=1)) * T * T * beta * num_old_classes
42 |                 loss2 = nn.CrossEntropyLoss(weight_per_class)(outputs, targets)
43 |                 loss = loss1 + loss2
44 |             loss.backward()
45 |             tg_optimizer.step()
46 | 
47 |             train_loss += loss.item()
48 |             if iteration > start_iteration:
49 |                 train_loss1 += loss1.item()
50 |                 train_loss2 += loss2.item()
51 |             _, predicted = outputs.max(1)
52 |             total += targets.size(0)
53 |             correct += predicted.eq(targets).sum().item()
54 |         if iteration == start_iteration:
55 |             print('Train set: {}, Train Loss: {:.4f} Acc: {:.4f}'.format(\
56 |                 len(trainloader), train_loss/(batch_idx+1), 100.*correct/total))
57 |         else:
58 |             print('Train set: {}, Train Loss1: {:.4f}, Train Loss2: {:.4f},\
59 |                 Train Loss: {:.4f} Acc: {:.4f}'.format(len(trainloader), \
60 |                 train_loss1/(batch_idx+1), train_loss2/(batch_idx+1),
61 |                 train_loss/(batch_idx+1), 100.*correct/total))
62 |         tg_model.eval()
63 |         test_loss = 0
64 |         correct = 0
65 |         total = 0
66 |         with torch.no_grad():
67 |             for batch_idx, (inputs, targets) in enumerate(testloader):
68 |                 inputs, targets = inputs.to(device), targets.to(device)
69 |                 outputs = tg_model(inputs)
70 |                 loss = nn.CrossEntropyLoss(weight_per_class)(outputs, targets)
71 | 
72 |                 test_loss += loss.item()
73 |                 _, predicted = outputs.max(1)
74 |                 total += targets.size(0)
75 |                 correct += predicted.eq(targets).sum().item()
76 |         print('Test set: {} Test Loss: {:.4f} Acc: {:.4f}'.format(\
77 |             len(testloader), test_loss/(batch_idx+1), 100.*correct/total))
78 |     return tg_model
79 | 
80 | 
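
Illustrative sketch (not part of the repository): `loss1` in `incremental_train_and_eval` is a temperature-scaled distillation term between the new model's logits on the old classes and the frozen reference model's logits. With PyTorch's default `reduction='mean'`, `nn.KLDivLoss` averages over every element, so the multiplication by `num_old_classes` appears to turn that into a per-sample sum averaged over the batch. The plain-Python version below computes that quantity directly; the function names and sample logits are assumptions made for illustration.

```python
import math

def softmax(logits, T=1.0):
    """Temperature-scaled softmax with the usual max-subtraction for stability."""
    scaled = [z / T for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(new_logits, ref_logits, num_old_classes, T=2.0, beta=0.25):
    """Batch-averaged KL(reference || new) over old classes, scaled by T^2 * beta."""
    total = 0.0
    for new_row, ref_row in zip(new_logits, ref_logits):
        p = softmax(new_row[:num_old_classes], T)  # new model, old-class logits only
        q = softmax(ref_row, T)                    # frozen reference model
        total += sum(qi * (math.log(qi) - math.log(pi)) for qi, pi in zip(q, p))
    return T * T * beta * total / len(new_logits)

# One sample, two old classes plus one new class (made-up logits).
ref = [[2.0, 1.0]]
loss_same = distillation_loss([[2.0, 1.0, 0.5]], ref, num_old_classes=2)
loss_diff = distillation_loss([[1.0, 2.0, 0.5]], ref, num_old_classes=2)
print(loss_same, loss_diff)
```

The term vanishes when the new model reproduces the reference distribution on the old classes and grows as the two distributions diverge, which is what discourages forgetting.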


--------------------------------------------------------------------------------
/mnemonics-training/1_train/utils/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/yaoyao-liu/class-incremental-learning/701af9f819f559c6ab3d3ee73bb3d7c21e924572/mnemonics-training/1_train/utils/__init__.py


--------------------------------------------------------------------------------
/mnemonics-training/1_train/utils/compute_accuracy.py:
--------------------------------------------------------------------------------
 1 | import torch
 2 | import torchvision
 3 | import numpy as np
 4 | import torch.nn.functional as F
 5 | from torchvision import datasets, models, transforms
 6 | from scipy.spatial.distance import cdist
 7 | from utils.misc import *
 8 | from utils.process_fp import process_inputs_fp
 9 | 
10 | def map_labels(order_list, Y_set):
11 |     map_Y = []
12 |     for idx in Y_set:
13 |         map_Y.append(order_list.index(idx))
14 |     map_Y = np.array(map_Y)
15 |     return map_Y
16 | 
17 | def compute_accuracy(tg_model, free_model, tg_feature_model, class_means, X_protoset_cumuls, Y_protoset_cumuls, evalloader, order_list, is_start_iteration=False, fast_fc=None, scale=None, print_info=True, device=None, maml_lr=0.1, maml_epoch=50):
18 |     if device is None:
19 |         device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
20 |     tg_feature_model.eval()
21 |     tg_model.eval()
22 |     if free_model is not None:
23 |         free_model.eval()
24 |     if fast_fc is None:
25 |         transform_proto = transforms.Compose([transforms.ToTensor(),transforms.Normalize((0.5071,  0.4866,  0.4409), (0.2009,  0.1984,  0.2023)),])
26 |         protoset = torchvision.datasets.CIFAR100(root='./data', train=False, download=False, transform=transform_proto)
27 |         X_protoset_array = np.array(X_protoset_cumuls).astype('uint8')
28 |         protoset.test_data = X_protoset_array.reshape(-1, X_protoset_array.shape[2], X_protoset_array.shape[3], X_protoset_array.shape[4])
29 |         Y_protoset_cumuls = np.array(Y_protoset_cumuls).reshape(-1)
30 |         map_Y_protoset_cumuls = map_labels(order_list, Y_protoset_cumuls)
31 |         protoset.test_labels = map_Y_protoset_cumuls
32 |         protoloader = torch.utils.data.DataLoader(protoset, batch_size=128, shuffle=True, num_workers=2)  
33 | 
34 |         fast_fc = torch.from_numpy(np.float32(class_means[:,:,0].T)).to(device)
35 |         fast_fc.requires_grad=True
36 | 
37 |         epoch_num = maml_epoch
38 |         for epoch_idx in range(epoch_num):
39 |             for the_inputs, the_targets in protoloader: 
40 |                 the_inputs, the_targets = the_inputs.to(device), the_targets.to(device)
41 |                 the_features = tg_feature_model(the_inputs)
42 |                 the_logits = F.linear(F.normalize(torch.squeeze(the_features), p=2,dim=1), F.normalize(fast_fc, p=2, dim=1))
43 |                 the_loss = F.cross_entropy(the_logits, the_targets)
44 |                 the_grad = torch.autograd.grad(the_loss, fast_fc)
45 |                 fast_fc = fast_fc - maml_lr * the_grad[0]
46 |     correct = 0
47 |     correct_icarl = 0
48 |     correct_ncm = 0
49 |     correct_maml = 0
50 |     total = 0
51 |     with torch.no_grad():
52 |         for batch_idx, (inputs, targets) in enumerate(evalloader):
53 |             inputs, targets = inputs.to(device), targets.to(device)
54 |             total += targets.size(0)
55 |             if is_start_iteration:
56 |                 outputs = tg_model(inputs)
57 |             else:
58 |                 outputs, outputs_feature = process_inputs_fp(tg_model, free_model, inputs)
59 |             outputs = F.softmax(outputs, dim=1)
60 |             if scale is not None:
61 |                 assert(scale.shape[0] == 1)
62 |                 assert(outputs.shape[1] == scale.shape[1])
63 |                 outputs = outputs / scale.repeat(outputs.shape[0], 1).type(torch.FloatTensor).to(device)
64 |             _, predicted = outputs.max(1)
65 |             correct += predicted.eq(targets).sum().item()
66 | 
67 |             if is_start_iteration:
68 |                 outputs_feature = np.squeeze(tg_feature_model(inputs))
69 |             sqd_icarl = cdist(class_means[:,:,0].T, outputs_feature, 'sqeuclidean')
70 |             score_icarl = torch.from_numpy((-sqd_icarl).T).to(device)
71 |             _, predicted_icarl = score_icarl.max(1)
72 |             correct_icarl += predicted_icarl.eq(targets).sum().item()
73 |             sqd_ncm = cdist(class_means[:,:,1].T, outputs_feature, 'sqeuclidean')
74 |             score_ncm = torch.from_numpy((-sqd_ncm).T).to(device)
75 |             _, predicted_ncm = score_ncm.max(1)
76 |             correct_ncm += predicted_ncm.eq(targets).sum().item()
77 |             the_logits = F.linear(F.normalize(torch.squeeze(outputs_feature), p=2,dim=1), F.normalize(fast_fc, p=2, dim=1))
78 |             _, predicted_maml = the_logits.max(1)
79 |             correct_maml += predicted_maml.eq(targets).sum().item()
80 |     cnn_acc = 100.*correct/total
81 |     icarl_acc = 100.*correct_icarl/total
82 |     ncm_acc = 100.*correct_ncm/total
83 |     maml_acc = 100.*correct_maml/total
84 |     if print_info:
85 |         print("  Accuracy for LwF    :\t\t{:.2f} %".format(cnn_acc))
86 |         print("  Accuracy for iCaRL  :\t\t{:.2f} %".format(icarl_acc))
87 |         print("  The above results are the accuracy for the current phase.") 
88 |         print("  For the average accuracy, you need to record the results for all phases and calculate the average value.") 
89 |     return [cnn_acc, icarl_acc, ncm_acc, maml_acc], fast_fc
90 | 
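
Illustrative sketch (not part of the repository): the iCaRL and NCM accuracies in `compute_accuracy` come from a nearest-class-mean rule. Each feature vector is compared with every class mean by squared Euclidean distance (`cdist(..., 'sqeuclidean')`), and the class with the smallest distance is predicted. The plain-Python version below shows that rule in isolation; the helper names and toy values are assumptions made for illustration.

```python
def sq_euclidean(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest_class_mean(class_means, features):
    """Predict, for each feature vector, the index of the closest class mean."""
    preds = []
    for f in features:
        dists = [sq_euclidean(m, f) for m in class_means]
        preds.append(dists.index(min(dists)))
    return preds

# Three class means and three query features in 2-D (made-up values).
class_means = [[0.0, 0.0], [1.0, 1.0], [4.0, 0.0]]
features = [[0.2, -0.1], [0.9, 1.2], [3.0, 0.5]]
targets = [0, 1, 1]

preds = nearest_class_mean(class_means, features)
accuracy = 100.0 * sum(p == t for p, t in zip(preds, targets)) / len(targets)
print(preds, accuracy)
```

Taking the argmax of the negated distances, as the original code does, selects the same class as taking the argmin of the distances.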


--------------------------------------------------------------------------------
/mnemonics-training/1_train/utils/compute_features.py:
--------------------------------------------------------------------------------
 1 | import torch
 2 | import numpy as np
 3 | from torchvision import models
 4 | from utils.misc import *
 5 | from utils.process_fp import process_inputs_fp
 6 | 
 7 | def compute_features(tg_model, free_model, tg_feature_model, is_start_iteration, evalloader, num_samples, num_features, device=None):
 8 |     if device is None:
 9 |         device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
10 |     tg_feature_model.eval()
11 |     tg_model.eval()
12 |     if free_model is not None:
13 |         free_model.eval()
14 |     features = np.zeros([num_samples, num_features])
15 |     start_idx = 0
16 |     with torch.no_grad():
17 |         for inputs, targets in evalloader:
18 |             inputs = inputs.to(device)
19 |             if is_start_iteration:
20 |                 the_feature = tg_feature_model(inputs)
21 |             else:
22 |                 the_feature = process_inputs_fp(tg_model, free_model, inputs, feature_mode=True)
23 |             features[start_idx:start_idx+inputs.shape[0], :] = np.squeeze(the_feature)
24 |             start_idx = start_idx+inputs.shape[0]
25 |     assert(start_idx==num_samples)
26 |     return features
27 | 


--------------------------------------------------------------------------------
/mnemonics-training/1_train/utils/conv2d_mtl.py:
--------------------------------------------------------------------------------
 1 | import math
 2 | import torch
 3 | from torch.nn.parameter import Parameter
 4 | import torch.nn.functional as F
 5 | from torch.nn.modules.module import Module
 6 | from torch.nn.modules.utils import _single, _pair, _triple
 7 | 
 8 | class _ConvNdMtl(Module):
 9 | 
10 |     def __init__(self, in_channels, out_channels, kernel_size, stride,
11 |                  padding, dilation, transposed, output_padding, groups, bias):
12 |         super(_ConvNdMtl, self).__init__()
13 |         if in_channels % groups != 0:
14 |             raise ValueError('in_channels must be divisible by groups')
15 |         if out_channels % groups != 0:
16 |             raise ValueError('out_channels must be divisible by groups')
17 |         self.in_channels = in_channels
18 |         self.out_channels = out_channels
19 |         self.kernel_size = kernel_size
20 |         self.stride = stride
21 |         self.padding = padding
22 |         self.dilation = dilation
23 |         self.transposed = transposed
24 |         self.output_padding = output_padding
25 |         self.groups = groups
26 |         if transposed:
27 |             self.weight = Parameter(torch.Tensor(
28 |                 in_channels, out_channels // groups, *kernel_size))
29 |             self.mtl_weight = Parameter(torch.ones(in_channels, out_channels // groups, 1, 1))
30 |         else:
31 |             self.weight = Parameter(torch.Tensor(
32 |                 out_channels, in_channels // groups, *kernel_size))
33 |             self.mtl_weight = Parameter(torch.ones(out_channels, in_channels // groups, 1, 1))
34 |         self.weight.requires_grad=False
35 |         if bias:
36 |             self.bias = Parameter(torch.Tensor(out_channels))
37 |             self.bias.requires_grad=False
38 |             self.mtl_bias = Parameter(torch.zeros(out_channels))
39 |         else:
40 |             self.register_parameter('bias', None)
41 |             self.register_parameter('mtl_bias', None)
42 |         self.reset_parameters()
43 | 
44 |     def reset_parameters(self):
45 |         n = self.in_channels
46 |         for k in self.kernel_size:
47 |             n *= k
48 |         stdv = 1. / math.sqrt(n)
49 |         self.weight.data.uniform_(-stdv, stdv)
50 |         self.mtl_weight.data.uniform_(1, 1)
51 |         if self.bias is not None:
52 |             self.bias.data.uniform_(-stdv, stdv)
53 |             self.mtl_bias.data.uniform_(0, 0)
54 | 
55 |     def extra_repr(self):
56 |         s = ('{in_channels}, {out_channels}, kernel_size={kernel_size}'
57 |              ', stride={stride}')
58 |         if self.padding != (0,) * len(self.padding):
59 |             s += ', padding={padding}'
60 |         if self.dilation != (1,) * len(self.dilation):
61 |             s += ', dilation={dilation}'
62 |         if self.output_padding != (0,) * len(self.output_padding):
63 |             s += ', output_padding={output_padding}'
64 |         if self.groups != 1:
65 |             s += ', groups={groups}'
66 |         if self.bias is None:
67 |             s += ', bias=False'
68 |         return s.format(**self.__dict__)
69 | 
70 | class Conv2dMtl(_ConvNdMtl):
71 | 
72 |     def __init__(self, in_channels, out_channels, kernel_size, stride=1,
73 |                  padding=0, dilation=1, groups=1, bias=True):
74 |         kernel_size = _pair(kernel_size)
75 |         stride = _pair(stride)
76 |         padding = _pair(padding)
77 |         dilation = _pair(dilation)
78 |         super(Conv2dMtl, self).__init__(
79 |             in_channels, out_channels, kernel_size, stride, padding, dilation,
80 |             False, _pair(0), groups, bias)
81 | 
82 |     def forward(self, input):
83 |         new_mtl_weight = self.mtl_weight.expand(self.weight.shape)
84 |         new_weight = self.weight.mul(new_mtl_weight)
85 |         if self.bias is not None:
86 |             new_bias = self.bias + self.mtl_bias
87 |         else:
88 |             new_bias = None
89 |         return F.conv2d(input, new_weight, new_bias, self.stride,
90 |                         self.padding, self.dilation, self.groups)
91 | 
92 | 
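
Illustrative sketch (not part of the repository): `Conv2dMtl.forward` keeps `weight` frozen and multiplies it by a learnable `mtl_weight` of shape `(out_channels, in_channels // groups, 1, 1)`, expanded across the kernel's spatial dimensions. Every kernel is therefore rescaled by a single scalar. The plain-Python version below reproduces that broadcast on nested lists; `scale_kernels` and the sample tensors are assumptions made for illustration.

```python
def scale_kernels(weight, mtl_weight):
    """Scale each kernel by its own scalar.

    weight[o][i] is a kH x kW kernel (list of rows);
    mtl_weight[o][i] is the single scaling factor for that kernel.
    """
    return [
        [
            [[w * mtl_weight[o][i] for w in row] for row in kernel]
            for i, kernel in enumerate(per_out)
        ]
        for o, per_out in enumerate(weight)
    ]

# One output channel, one input channel, 2x2 kernel (made-up values).
frozen = [[[[1.0, 2.0], [3.0, 4.0]]]]

identity = scale_kernels(frozen, [[1.0]])  # scales start at one, as in reset_parameters
halved = scale_kernels(frozen, [[0.5]])    # a learned scale shrinks the whole kernel
print(identity, halved)
```

Since the scales are initialized to one and `mtl_bias` to zero, the layer initially behaves like the frozen convolution, and only one scalar per kernel (plus the bias shift, when a bias is used) is trained afterwards.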


--------------------------------------------------------------------------------
/mnemonics-training/1_train/utils/gpu_tools.py:
--------------------------------------------------------------------------------
 1 | import os
 2 | import torch
 3 | import time
 4 | 
 5 | def check_memory(cuda_device):
 6 |     devices_info = os.popen('"/usr/bin/nvidia-smi" --query-gpu=memory.total,memory.used --format=csv,nounits,noheader').read().strip().split("\n")
 7 |     total, used = devices_info[int(cuda_device)].split(',')
 8 |     return total, used
 9 | 
10 | def occupy_memory(cuda_device):
11 |     total, used = check_memory(cuda_device)
12 |     total = int(total)
13 |     used = int(used)
14 |     max_mem = int(total * 0.90)
15 |     print('Total memory: ' + str(total) + ', used memory: ' + str(used))
16 |     block_mem = max_mem - used
17 |     if block_mem > 0:
18 |         x = torch.cuda.FloatTensor(256, 1024, block_mem)
19 |         del x
20 | 
21 | def set_gpu(cuda_device):
22 |     os.environ['CUDA_VISIBLE_DEVICES'] = cuda_device
23 |     print('Using gpu:', cuda_device)
24 | 


--------------------------------------------------------------------------------
/mnemonics-training/1_train/utils/misc.py:
--------------------------------------------------------------------------------
  1 | from __future__ import print_function, division
  2 | import os
  3 | import torch
  4 | import sys
  5 | import time
  6 | import subprocess
  7 | import torch.nn as nn
  8 | import torch.nn.init as init
  9 | import os.path as osp
 10 | try:
 11 |     import cPickle as pickle
 12 | except ImportError:
 13 |     import pickle
 14 | 
 15 | def savepickle(data, file_path):
 16 |     mkdir_p(osp.dirname(file_path), delete=False)
 17 |     print('pickle into', file_path)
 18 |     with open(file_path, 'wb') as f:
 19 |         pickle.dump(data, f, pickle.HIGHEST_PROTOCOL)
 20 | 
 21 | def unpickle(file_path):
 22 |     with open(file_path, 'rb') as f:
 23 |         data = pickle.load(f)
 24 |     return data
 25 | 
 26 | def mkdir_p(path, delete=False, print_info=True):
 27 |     if path == '': return
 28 | 
 29 |     if delete:
 30 |         subprocess.call(('rm -r ' + path).split())
 31 |     if not osp.exists(path):
 32 |         if print_info:
 33 |             print('mkdir -p  ' + path)
 34 |         subprocess.call(('mkdir -p ' + path).split())
 35 | 
 36 | def get_mean_and_std(dataset):
 37 |     dataloader = torch.utils.data.DataLoader(dataset, batch_size=1, shuffle=True, num_workers=2)
 38 |     mean = torch.zeros(3)
 39 |     std = torch.zeros(3)
 40 |     print('==> Computing mean and std..')
 41 |     for inputs, targets in dataloader:
 42 |         for i in range(3):
 43 |             mean[i] += inputs[:,i,:,:].mean()
 44 |             std[i] += inputs[:,i,:,:].std()
 45 |     mean.div_(len(dataset))
 46 |     std.div_(len(dataset))
 47 |     return mean, std
 48 | 
 49 | def init_params(net):
 50 |     '''Init layer parameters.'''
 51 |     for m in net.modules():
 52 |         if isinstance(m, nn.Conv2d):
 53 |             init.kaiming_normal_(m.weight, mode='fan_out')
 54 |             if m.bias is not None:
 55 |                 init.constant_(m.bias, 0)
 56 |         elif isinstance(m, nn.BatchNorm2d):
 57 |             init.constant_(m.weight, 1)
 58 |             init.constant_(m.bias, 0)
 59 |         elif isinstance(m, nn.Linear):
 60 |             init.normal_(m.weight, std=1e-3)
 61 |             if m.bias is not None:
 62 |                 init.constant_(m.bias, 0)
 63 | 
 64 | import shutil  # stdlib; avoids a crash on import when no terminal is attached
 65 | term_width = shutil.get_terminal_size((80, 24)).columns  # falls back to 80 columns
 66 | 
 67 | TOTAL_BAR_LENGTH = 65.
 68 | last_time = time.time()
 69 | begin_time = last_time
 70 | def progress_bar(current, total, msg=None):
 71 |     global last_time, begin_time
 72 |     if current == 0:
 73 |         begin_time = time.time()  # Reset for new bar.
 74 | 
 75 |     cur_len = int(TOTAL_BAR_LENGTH*current/total)
 76 |     rest_len = int(TOTAL_BAR_LENGTH - cur_len) - 1
 77 | 
 78 |     sys.stdout.write(' [')
 79 |     for i in range(cur_len):
 80 |         sys.stdout.write('=')
 81 |     sys.stdout.write('>')
 82 |     for i in range(rest_len):
 83 |         sys.stdout.write('.')
 84 |     sys.stdout.write(']')
 85 | 
 86 |     cur_time = time.time()
 87 |     step_time = cur_time - last_time
 88 |     last_time = cur_time
 89 |     tot_time = cur_time - begin_time
 90 | 
 91 |     L = []
 92 |     L.append('  Step: %s' % format_time(step_time))
 93 |     L.append(' | Tot: %s' % format_time(tot_time))
 94 |     if msg:
 95 |         L.append(' | ' + msg)
 96 | 
 97 |     msg = ''.join(L)
 98 |     sys.stdout.write(msg)
 99 |     for i in range(term_width-int(TOTAL_BAR_LENGTH)-len(msg)-3):
100 |         sys.stdout.write(' ')
101 | 
102 |     for i in range(term_width-int(TOTAL_BAR_LENGTH/2)+2):
103 |         sys.stdout.write('\b')
104 |     sys.stdout.write(' %d/%d ' % (current+1, total))
105 | 
106 |     if current < total-1:
107 |         sys.stdout.write('\r')
108 |     else:
109 |         sys.stdout.write('\n')
110 |     sys.stdout.flush()
111 | 
112 | def format_time(seconds):
113 |     days = int(seconds / 3600/24)
114 |     seconds = seconds - days*3600*24
115 |     hours = int(seconds / 3600)
116 |     seconds = seconds - hours*3600
117 |     minutes = int(seconds / 60)
118 |     seconds = seconds - minutes*60
119 |     secondsf = int(seconds)
120 |     seconds = seconds - secondsf
121 |     millis = int(seconds*1000)
122 | 
123 |     f = ''
124 |     i = 1
125 |     if days > 0:
126 |         f += str(days) + 'D'
127 |         i += 1
128 |     if hours > 0 and i <= 2:
129 |         f += str(hours) + 'h'
130 |         i += 1
131 |     if minutes > 0 and i <= 2:
132 |         f += str(minutes) + 'm'
133 |         i += 1
134 |     if secondsf > 0 and i <= 2:
135 |         f += str(secondsf) + 's'
136 |         i += 1
137 |     if millis > 0 and i <= 2:
138 |         f += str(millis) + 'ms'
139 |         i += 1
140 |     if f == '':
141 |         f = '0ms'
142 |     return f
143 |     
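
To illustrate how `format_time` above behaves, here is a minimal sketch that restates its logic compactly. The name `format_time_compact` is hypothetical and not part of the repository, and the sketch assumes non-negative input. It shows that only the two largest non-zero units are kept, so seconds and milliseconds are dropped once hours and minutes are present.

```python
def format_time_compact(seconds):
    """Hypothetical restatement of format_time for non-negative durations."""
    days, seconds = divmod(seconds, 86400)
    hours, seconds = divmod(seconds, 3600)
    minutes, seconds = divmod(seconds, 60)
    whole, frac = divmod(seconds, 1)
    parts = [(int(days), 'D'), (int(hours), 'h'), (int(minutes), 'm'),
             (int(whole), 's'), (int(frac * 1000), 'ms')]
    # Keep only the first two non-zero units, mirroring the `i <= 2` checks.
    shown = ['%d%s' % (value, unit) for value, unit in parts if value > 0][:2]
    return ''.join(shown) or '0ms'


print(format_time_compact(3725.5))  # 1 hour, 2 minutes, 5.5 seconds
print(format_time_compact(0.25))
print(format_time_compact(0))
```

The truncation keeps the progress-bar message short, at the cost of hiding finer units for long-running steps.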


--------------------------------------------------------------------------------
/mnemonics-training/1_train/utils/process_fp.py:
--------------------------------------------------------------------------------
 1 | import torch
 2 | import torch.nn as nn
 3 | from utils.misc import *
 4 | 
 5 | def process_inputs_fp(tg_model, free_model, inputs, fusion_mode=False, feature_mode=False):
 6 |     tg_model_group1 = [tg_model.conv1, tg_model.bn1, tg_model.relu, tg_model.layer1]
 7 |     tg_model_group1 = nn.Sequential(*tg_model_group1)
 8 |     tg_fp1 = tg_model_group1(inputs)
 9 |     fp1 = tg_fp1
10 |     tg_model_group2 = tg_model.layer2
11 |     tg_fp2 = tg_model_group2(fp1)
12 |     fp2 = tg_fp2
13 |     tg_model_group3 = [tg_model.layer3, tg_model.avgpool]
14 |     tg_model_group3 = nn.Sequential(*tg_model_group3)
15 |     tg_fp3 = tg_model_group3(fp2)
16 |     fp3 = tg_fp3
17 |     fp3 = fp3.view(fp3.size(0), -1)
18 |     if feature_mode:
19 |         return fp3
20 |     else:
21 |         outputs = tg_model.fc(fp3)
22 |         feature = fp3
23 |         return outputs, feature
24 | 


--------------------------------------------------------------------------------
/mnemonics-training/1_train/utils/process_mnemonics.py:
--------------------------------------------------------------------------------
 1 | import torch
 2 | import torch.optim as optim
 3 | import torchvision
 4 | import time
 5 | import os
 6 | import argparse
 7 | import numpy as np
 8 | 
 9 | def tensor2im(input_image, imtype=np.uint8):
10 |     mean = [0.5071,  0.4866,  0.4409]
11 |     std = [0.2009,  0.1984,  0.2023]
12 |     if not isinstance(input_image, np.ndarray):
13 |         if isinstance(input_image, torch.Tensor):
14 |             image_tensor = input_image.data
15 |         else:
16 |             return input_image
17 |         image_numpy = image_tensor.cpu().detach().float().numpy()
18 |         if image_numpy.shape[0] == 1:
19 |             image_numpy = np.tile(image_numpy, (3, 1, 1))
20 |         for i in range(len(mean)): 
21 |             image_numpy[i] = image_numpy[i] * std[i] + mean[i]
22 |         image_numpy = image_numpy * 255
23 |         image_numpy = np.transpose(image_numpy, (1, 2, 0))
24 |     else:
25 |         image_numpy = input_image
26 |     return image_numpy.astype(imtype)
27 | 
28 | def process_mnemonics(X_protoset_cumuls, Y_protoset_cumuls, mnemonics_raw, mnemonics_label, order_list, nb_cl_fg, nb_cl, iteration, start_iter):
29 |     mnemonics = mnemonics_raw[0]
30 |     mnemonics_array_new = np.zeros((len(mnemonics), len(mnemonics[0]), 32, 32, 3))
31 |     mnemonics_list = []
32 |     mnemonics_label_list = []
33 |     for idx in range(len(mnemonics)):
34 |         this_mnemonics = []
35 |         for sub_idx in range(len(mnemonics[idx])):
36 |             processed_img = tensor2im(mnemonics[idx][sub_idx])
37 |             mnemonics_array_new[idx][sub_idx] = processed_img   
38 |     diff = len(X_protoset_cumuls) - len(mnemonics_array_new)
39 |     for idx in range(len(mnemonics_array_new)):
40 |         X_protoset_cumuls[idx+diff] = mnemonics_array_new[idx]
41 |     return X_protoset_cumuls
42 | 


--------------------------------------------------------------------------------
/mnemonics-training/2_eval/README.md:
--------------------------------------------------------------------------------
  1 | ## Evaluating our models
 2 | 
 3 | ### Clone this repository
 4 | ```bash
 5 | cd ~
 6 | git clone git@github.com:yaoyao-liu/mnemonics-training.git
 7 | ```
 8 | 
 9 | ### Processing the datasets
10 | 
11 | Process ImageNet-Sub and ImageNet:
12 | ```bash
13 | cd ~/mnemonics-training/eval/process_imagenet
14 | python generate_imagenet_subset.py
15 | python generate_imagenet.py
16 | ```
17 | 
18 | ### Download models
19 | 
20 | Download the models for CIFAR-100, ImageNet-Sub and ImageNet:
21 | ```bash
22 | cd ~/mnemonics-training/eval
23 | sh ./script/download_ckpt.sh
24 | ```
25 | You may also download the checkpoints on [Google Drive](https://drive.google.com/file/d/1sKO2BOssWgTFBNZbM50qDzgk6wqg4_l8/view).
26 | 
27 | ### Running the evaluation
28 | 
 29 | Run the evaluation code with our models:
30 | ```bash
31 | cd ~/mnemonics-training/eval
32 | sh run_eval.sh
33 | ```
34 | 


--------------------------------------------------------------------------------
/mnemonics-training/2_eval/main.py:
--------------------------------------------------------------------------------
 1 | #!/usr/bin/env python
 2 | # coding=utf-8
 3 | import os
 4 | import argparse
 5 | import numpy as np
 6 | from trainer.train import Trainer
 7 | from utils.gpu_tools import occupy_memory
 8 | 
 9 | if __name__ == '__main__':
10 |     parser = argparse.ArgumentParser()
11 |     parser.add_argument('--gpu', default='0') # GPU id 
12 |     parser.add_argument('--dataset', default='cifar100', type=str, choices=['cifar100', 'imagenet_sub', 'imagenet'])
13 |     parser.add_argument('--data_dir', default='data/seed_1993_subset_100_imagenet/data', type=str)
14 |     parser.add_argument('--num_classes', default=100, type=int)
15 |     parser.add_argument('--nb_cl_fg', default=50, type=int, help='the number of classes in first group')
16 |     parser.add_argument('--nb_cl', default=10, type=int, help='Classes per group')
17 |     parser.add_argument('--nb_protos', default=20, type=int, help='Number of prototypes per class at the end')
18 |     parser.add_argument('--nb_runs', default=1, type=int, help='Number of runs (random ordering of classes at each run)')
19 |     parser.add_argument('--epochs', default=160, type=int, help='Epochs')
 20 |     parser.add_argument('--T', default=2, type=float, help='Temperature for distillation')
 21 |     parser.add_argument('--beta', default=0.25, type=float, help='Beta for distillation')
22 |     parser.add_argument('--resume', action='store_true', help='resume from checkpoint')
23 |     parser.add_argument('--resume_fg', action='store_true', help='resume first group from checkpoint')
24 |     parser.add_argument('--ckpt_dir_fg', type=str, default='-')
25 |     parser.add_argument('--dynamic_budget', action='store_true', help='fix budget')
26 |     parser.add_argument('--phase', type=str, default='train', choices=['train', 'eval'])
27 |     parser.add_argument('--ckpt_label', type=str, default='exp01')
28 |     parser.add_argument('--use_mtl', action='store_true', help='using mtl weights')
29 |     parser.add_argument('--num_workers', default=2, type=int, help='the number of workers for loading data')
30 |     parser.add_argument('--load_iter', default=0, type=int)
31 |     parser.add_argument('--mimic_score', action='store_true', help='To mimic scores for cosine embedding')
32 |     parser.add_argument('--lw_ms', default=1, type=float, help='loss weight for mimicking score')
33 |     parser.add_argument('--rs_ratio', default=0, type=float, help='The ratio for resample')
34 |     parser.add_argument('--imprint_weights', action='store_true', help='Imprint the weights for novel classes')
35 |     parser.add_argument('--less_forget', action='store_true', help='Less forgetful')
36 |     parser.add_argument('--lamda', default=5, type=float, help='Lamda for LF')
37 |     parser.add_argument('--adapt_lamda', action='store_true', help='Adaptively change lamda')
38 |     parser.add_argument('--dist', default=0.5, type=float, help='Dist for MarginRankingLoss')
39 |     parser.add_argument('--K', default=2, type=int, help='K for MarginRankingLoss')
40 |     parser.add_argument('--lw_mr', default=1, type=float, help='loss weight for margin ranking loss')
41 |     parser.add_argument('--random_seed', default=1993, type=int, help='random seed')
42 |     parser.add_argument('--train_batch_size', default=128, type=int)
43 |     parser.add_argument('--test_batch_size', default=100, type=int)
44 |     parser.add_argument('--eval_batch_size', default=128, type=int)
45 |     parser.add_argument('--base_lr1', default=0.1, type=float)
46 |     parser.add_argument('--base_lr2', default=0.1, type=float)
47 |     parser.add_argument('--lr_factor', default=0.1, type=float)
48 |     parser.add_argument('--custom_weight_decay', default=5e-4, type=float)
49 |     parser.add_argument('--custom_momentum', default=0.9, type=float)
50 |     parser.add_argument('--load_ckpt_prefix', type=str, default='-')
51 |     parser.add_argument('--load_order', type=str, default='-')
52 |     parser.add_argument('--add_str', default=None, type=str)
53 | 
54 |     the_args = parser.parse_args()
55 |     assert(the_args.nb_cl_fg % the_args.nb_cl == 0)
56 |     assert(the_args.nb_cl_fg >= the_args.nb_cl)
57 | 
58 |     print(the_args)
59 | 
60 |     np.random.seed(the_args.random_seed)
61 | 
62 |     if not os.path.exists('./logs/cifar100_nfg50_ncls2_nproto20_mtl_exp01'):
63 |         print('Download checkpoints from Google Drive.')
64 |         os.system('sh ./script/download_ckpt.sh')
65 | 
66 |     os.environ['CUDA_VISIBLE_DEVICES'] = the_args.gpu
67 |     print('Using gpu:', the_args.gpu)
68 | 
69 |     occupy_memory(the_args.gpu)
70 |     print('Occupy GPU memory in advance.')
71 | 
72 |     trainer = Trainer(the_args)
73 |     trainer.eval()
74 | 
75 | 
76 | 
77 | 
78 | 
79 | 


--------------------------------------------------------------------------------
/mnemonics-training/2_eval/models/modified_linear.py:
--------------------------------------------------------------------------------
 1 | import math
 2 | 
 3 | import torch
 4 | from torch.nn.parameter import Parameter
 5 | from torch.nn import functional as F
 6 | from torch.nn import Module
 7 | 
 8 | class CosineLinear(Module):
 9 |     def __init__(self, in_features, out_features, sigma=True):
10 |         super(CosineLinear, self).__init__()
11 |         self.in_features = in_features
12 |         self.out_features = out_features
13 |         self.weight = Parameter(torch.Tensor(out_features, in_features))
14 |         if sigma:
15 |             self.sigma = Parameter(torch.Tensor(1))
16 |         else:
17 |             self.register_parameter('sigma', None)
18 |         self.reset_parameters()
19 | 
20 |     def reset_parameters(self):
21 |         stdv = 1. / math.sqrt(self.weight.size(1))
22 |         self.weight.data.uniform_(-stdv, stdv)
23 |         if self.sigma is not None:
 24 |             self.sigma.data.fill_(1) #for initialization of sigma
25 | 
26 |     def forward(self, input):
27 |         #w_norm = self.weight.data.norm(dim=1, keepdim=True)
28 |         #w_norm = w_norm.expand_as(self.weight).add_(self.epsilon)
29 |         #x_norm = input.data.norm(dim=1, keepdim=True)
30 |         #x_norm = x_norm.expand_as(input).add_(self.epsilon)
31 |         #w = self.weight.div(w_norm)
32 |         #x = input.div(x_norm)
33 |         out = F.linear(F.normalize(input, p=2,dim=1), \
34 |                 F.normalize(self.weight, p=2, dim=1))
35 |         if self.sigma is not None:
36 |             out = self.sigma * out
37 |         return out
38 | 
39 | class SplitCosineLinear(Module):
 40 |     #consists of two fc layers and concatenates their outputs
41 |     def __init__(self, in_features, out_features1, out_features2, sigma=True):
42 |         super(SplitCosineLinear, self).__init__()
43 |         self.in_features = in_features
44 |         self.out_features = out_features1 + out_features2
45 |         self.fc1 = CosineLinear(in_features, out_features1, False)
46 |         self.fc2 = CosineLinear(in_features, out_features2, False)
47 |         if sigma:
48 |             self.sigma = Parameter(torch.Tensor(1))
49 |             self.sigma.data.fill_(1)
50 |         else:
51 |             self.register_parameter('sigma', None)
52 | 
53 |     def forward(self, x):
54 |         out1 = self.fc1(x)
55 |         out2 = self.fc2(x)
56 |         out = torch.cat((out1, out2), dim=1) #concatenate along the channel
57 |         if self.sigma is not None:
58 |             out = self.sigma * out
59 |         return out
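
As a rough illustration of what `CosineLinear.forward` computes, the sketch below reproduces the same arithmetic in plain Python rather than PyTorch. The helper names `l2_normalize` and `cosine_linear` are hypothetical; `eps` mirrors the default of `F.normalize`. Each output is `sigma` times the cosine between the input and one weight row, so it lies in `[-sigma, sigma]` and does not change when the input is rescaled.

```python
import math


def l2_normalize(vec, eps=1e-12):
    """Scale vec to unit L2 norm; eps guards against division by zero."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / max(norm, eps) for v in vec]


def cosine_linear(x, weight, sigma=1.0):
    """Return sigma * cos(x, w_k) for every weight row w_k."""
    x_hat = l2_normalize(x)
    return [sigma * sum(a * b for a, b in zip(x_hat, l2_normalize(w)))
            for w in weight]


# Three illustrative class weight vectors with different norms.
weight = [[1.0, 0.0], [0.0, 2.0], [-3.0, 0.0]]
scores = cosine_linear([2.0, 0.0], weight)
rescaled = cosine_linear([200.0, 0.0], weight)
print(scores, rescaled)
```

Because both operands are normalized, the magnitude of a class's weight vector cannot by itself inflate that class's score; the learnable `sigma` restores a usable logit range.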


--------------------------------------------------------------------------------
/mnemonics-training/2_eval/models/modified_resnet.py:
--------------------------------------------------------------------------------
  1 | import torch.nn as nn
  2 | import math
  3 | import torch.utils.model_zoo as model_zoo
  4 | import models.modified_linear as modified_linear
  5 | 
  6 | def conv3x3(in_planes, out_planes, stride=1):
  7 |     """3x3 convolution with padding"""
  8 |     return nn.Conv2d(in_planes, out_planes, kernel_size=3, stride=stride,
  9 |                      padding=1, bias=False)
 10 | 
 11 | 
 12 | class BasicBlock(nn.Module):
 13 |     expansion = 1
 14 | 
 15 |     def __init__(self, inplanes, planes, stride=1, downsample=None, last=False):
 16 |         super(BasicBlock, self).__init__()
 17 |         self.conv1 = conv3x3(inplanes, planes, stride)
 18 |         self.bn1 = nn.BatchNorm2d(planes)
 19 |         self.relu = nn.ReLU(inplace=True)
 20 |         self.conv2 = conv3x3(planes, planes)
 21 |         self.bn2 = nn.BatchNorm2d(planes)
 22 |         self.downsample = downsample
 23 |         self.stride = stride
 24 |         self.last = last
 25 | 
 26 |     def forward(self, x):
 27 |         residual = x
 28 | 
 29 |         out = self.conv1(x)
 30 |         out = self.bn1(out)
 31 |         out = self.relu(out)
 32 | 
 33 |         out = self.conv2(out)
 34 |         out = self.bn2(out)
 35 | 
 36 |         if self.downsample is not None:
 37 |             residual = self.downsample(x)
 38 | 
 39 |         out += residual
 40 |         if not self.last: #remove ReLU in the last layer
 41 |             out = self.relu(out)
 42 | 
 43 |         return out
 44 | 
 45 | class ResNet(nn.Module):
 46 | 
 47 |     def __init__(self, block, layers, num_classes=1000):
 48 |         self.inplanes = 64
 49 |         super(ResNet, self).__init__()
 50 |         self.conv1 = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3,
 51 |                                bias=False)
 52 |         self.bn1 = nn.BatchNorm2d(64)
 53 |         self.relu = nn.ReLU(inplace=True)
 54 |         self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
 55 |         self.layer1 = self._make_layer(block, 64, layers[0])
 56 |         self.layer2 = self._make_layer(block, 128, layers[1], stride=2)
 57 |         self.layer3 = self._make_layer(block, 256, layers[2], stride=2)
 58 |         self.layer4 = self._make_layer(block, 512, layers[3], stride=2, last_phase=True)
 59 |         self.avgpool = nn.AvgPool2d(7, stride=1)
 60 |         self.fc = modified_linear.CosineLinear(512 * block.expansion, num_classes)
 61 | 
 62 |         for m in self.modules():
 63 |             if isinstance(m, nn.Conv2d):
 64 |                 nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
 65 |             elif isinstance(m, nn.BatchNorm2d):
 66 |                 nn.init.constant_(m.weight, 1)
 67 |                 nn.init.constant_(m.bias, 0)
 68 | 
 69 |     def _make_layer(self, block, planes, blocks, stride=1, last_phase=False):
 70 |         downsample = None
 71 |         if stride != 1 or self.inplanes != planes * block.expansion:
 72 |             downsample = nn.Sequential(
 73 |                 nn.Conv2d(self.inplanes, planes * block.expansion,
 74 |                           kernel_size=1, stride=stride, bias=False),
 75 |                 nn.BatchNorm2d(planes * block.expansion),
 76 |             )
 77 | 
 78 |         layers = []
 79 |         layers.append(block(self.inplanes, planes, stride, downsample))
 80 |         self.inplanes = planes * block.expansion
 81 |         if last_phase:
 82 |             for i in range(1, blocks-1):
 83 |                 layers.append(block(self.inplanes, planes))
 84 |             layers.append(block(self.inplanes, planes, last=True))
 85 |         else: 
 86 |             for i in range(1, blocks):
 87 |                 layers.append(block(self.inplanes, planes))
 88 | 
 89 |         return nn.Sequential(*layers)
 90 | 
 91 |     def forward(self, x):
 92 |         x = self.conv1(x)
 93 |         x = self.bn1(x)
 94 |         x = self.relu(x)
 95 |         x = self.maxpool(x)
 96 | 
 97 |         x = self.layer1(x)
 98 |         x = self.layer2(x)
 99 |         x = self.layer3(x)
100 |         x = self.layer4(x)
101 | 
102 |         x = self.avgpool(x)
103 |         x = x.view(x.size(0), -1)
104 |         x = self.fc(x)
105 | 
106 |         return x
107 | 
108 | 
109 | def resnet18(pretrained=False, **kwargs):
110 |     """Constructs a ResNet-18 model.
111 | 
112 |     Args:
113 |         pretrained (bool): Unused; no pre-trained weights are loaded
114 |     """
115 |     model = ResNet(BasicBlock, [2, 2, 2, 2], **kwargs)
116 |     return model
117 | 


--------------------------------------------------------------------------------
/mnemonics-training/2_eval/models/modified_resnet_cifar.py:
--------------------------------------------------------------------------------
  1 | #remove ReLU in the last layer, and use cosine layer to replace nn.Linear
  2 | import torch.nn as nn
  3 | import math
  4 | import torch.utils.model_zoo as model_zoo
  5 | import models.modified_linear as modified_linear
  6 | 
  7 | def conv3x3(in_planes, out_planes, stride=1):
  8 |     """3x3 convolution with padding"""
  9 |     return nn.Conv2d(in_planes, out_planes, kernel_size=3, stride=stride,
 10 |                      padding=1, bias=False)
 11 | 
 12 | class BasicBlock(nn.Module):
 13 |     expansion = 1
 14 | 
 15 |     def __init__(self, inplanes, planes, stride=1, downsample=None, last=False):
 16 |         super(BasicBlock, self).__init__()
 17 |         self.conv1 = conv3x3(inplanes, planes, stride)
 18 |         self.bn1 = nn.BatchNorm2d(planes)
 19 |         self.relu = nn.ReLU(inplace=True)
 20 |         self.conv2 = conv3x3(planes, planes)
 21 |         self.bn2 = nn.BatchNorm2d(planes)
 22 |         self.downsample = downsample
 23 |         self.stride = stride
 24 |         self.last = last
 25 | 
 26 |     def forward(self, x):
 27 |         residual = x
 28 | 
 29 |         out = self.conv1(x)
 30 |         out = self.bn1(out)
 31 |         out = self.relu(out)
 32 | 
 33 |         out = self.conv2(out)
 34 |         out = self.bn2(out)
 35 | 
 36 |         if self.downsample is not None:
 37 |             residual = self.downsample(x)
 38 | 
 39 |         out += residual
 40 |         if not self.last: #remove ReLU in the last layer
 41 |             out = self.relu(out)
 42 | 
 43 |         return out
 44 | 
 45 | class ResNet(nn.Module):
 46 | 
 47 |     def __init__(self, block, layers, num_classes=10):
 48 |         self.inplanes = 16
 49 |         super(ResNet, self).__init__()
 50 |         self.conv1 = nn.Conv2d(3, 16, kernel_size=3, stride=1, padding=1,
 51 |                                bias=False)
 52 |         self.bn1 = nn.BatchNorm2d(16)
 53 |         self.relu = nn.ReLU(inplace=True)
 54 |         self.layer1 = self._make_layer(block, 16, layers[0])
 55 |         self.layer2 = self._make_layer(block, 32, layers[1], stride=2)
 56 |         self.layer3 = self._make_layer(block, 64, layers[2], stride=2, last_phase=True)
 57 |         self.avgpool = nn.AvgPool2d(8, stride=1)
 58 |         self.fc = modified_linear.CosineLinear(64 * block.expansion, num_classes)
 59 | 
 60 |         for m in self.modules():
 61 |             if isinstance(m, nn.Conv2d):
 62 |                 nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
 63 |             elif isinstance(m, nn.BatchNorm2d):
 64 |                 nn.init.constant_(m.weight, 1)
 65 |                 nn.init.constant_(m.bias, 0)
 66 | 
 67 |     def _make_layer(self, block, planes, blocks, stride=1, last_phase=False):
 68 |         downsample = None
 69 |         if stride != 1 or self.inplanes != planes * block.expansion:
 70 |             downsample = nn.Sequential(
 71 |                 nn.Conv2d(self.inplanes, planes * block.expansion,
 72 |                           kernel_size=1, stride=stride, bias=False),
 73 |                 nn.BatchNorm2d(planes * block.expansion),
 74 |             )
 75 | 
 76 |         layers = []
 77 |         layers.append(block(self.inplanes, planes, stride, downsample))
 78 |         self.inplanes = planes * block.expansion
 79 |         if last_phase:
 80 |             for i in range(1, blocks-1):
 81 |                 layers.append(block(self.inplanes, planes))
 82 |             layers.append(block(self.inplanes, planes, last=True))
 83 |         else: 
 84 |             for i in range(1, blocks):
 85 |                 layers.append(block(self.inplanes, planes))
 86 | 
 87 |         return nn.Sequential(*layers)
 88 | 
 89 |     def forward(self, x):
 90 |         x = self.conv1(x)
 91 |         x = self.bn1(x)
 92 |         x = self.relu(x)
 93 | 
 94 |         x = self.layer1(x)
 95 |         x = self.layer2(x)
 96 |         x = self.layer3(x)
 97 | 
 98 |         x = self.avgpool(x)
 99 |         x = x.view(x.size(0), -1)
100 |         x = self.fc(x)
101 | 
102 |         return x
103 | 
104 | def resnet20(pretrained=False, **kwargs):
105 |     n = 3
106 |     model = ResNet(BasicBlock, [n, n, n], **kwargs)
107 |     return model
108 | 
109 | def resnet32(pretrained=False, **kwargs):
110 |     n = 5
111 |     model = ResNet(BasicBlock, [n, n, n], **kwargs)
112 |     return model
113 | 
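
The following sketch illustrates how `_make_layer` above distributes blocks in the CIFAR variant and why `n = 3` and `n = 5` give 20 and 32 layers. The functions `block_layout` and `depth` are hypothetical helpers, not part of the repository, and the layout assumes at least two blocks per stage. The depth follows the usual convention of counting the stem convolution, two 3x3 convolutions per block, and the classifier, while ignoring the 1x1 downsample convolutions.

```python
def block_layout(blocks_per_stage, num_stages=3):
    """List (stage, block_index, last) as _make_layer would assign them."""
    layout = []
    for stage in range(1, num_stages + 1):
        last_phase = stage == num_stages  # only the final stage sets last_phase
        for idx in range(blocks_per_stage):
            # Only the final block of the final stage skips its closing ReLU.
            is_last = last_phase and idx == blocks_per_stage - 1
            layout.append((stage, idx, is_last))
    return layout


def depth(blocks_per_stage, num_stages=3):
    """Stem conv + two convs per block + classifier."""
    return 2 * blocks_per_stage * num_stages + 2


print(depth(3), depth(5))
print([entry for entry in block_layout(3) if entry[2]])
```

Only one block in the whole network has `last=True`, so the features passed to the cosine classifier can take negative values.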


--------------------------------------------------------------------------------
/mnemonics-training/2_eval/models/modified_resnetmtl.py:
--------------------------------------------------------------------------------
  1 | import torch.nn as nn
  2 | import math
  3 | import torch.utils.model_zoo as model_zoo
  4 | import models.modified_linear as modified_linear
  5 | from utils.incremental.conv2d_mtl import Conv2dMtl
  6 | 
  7 | def conv3x3mtl(in_planes, out_planes, stride=1):
  8 |     """3x3 convolution with padding"""
  9 |     return Conv2dMtl(in_planes, out_planes, kernel_size=3, stride=stride,
 10 |                      padding=1, bias=False)
 11 | 
 12 | 
 13 | class BasicBlockMtl(nn.Module):
 14 |     expansion = 1
 15 | 
 16 |     def __init__(self, inplanes, planes, stride=1, downsample=None, last=False):
 17 |         super(BasicBlockMtl, self).__init__()
 18 |         self.conv1 = conv3x3mtl(inplanes, planes, stride)
 19 |         self.bn1 = nn.BatchNorm2d(planes)
 20 |         self.relu = nn.ReLU(inplace=True)
 21 |         self.conv2 = conv3x3mtl(planes, planes)
 22 |         self.bn2 = nn.BatchNorm2d(planes)
 23 |         self.downsample = downsample
 24 |         self.stride = stride
 25 |         self.last = last
 26 | 
 27 |     def forward(self, x):
 28 |         residual = x
 29 | 
 30 |         out = self.conv1(x)
 31 |         out = self.bn1(out)
 32 |         out = self.relu(out)
 33 | 
 34 |         out = self.conv2(out)
 35 |         out = self.bn2(out)
 36 | 
 37 |         if self.downsample is not None:
 38 |             residual = self.downsample(x)
 39 | 
 40 |         out += residual
 41 |         if not self.last: #remove ReLU in the last layer
 42 |             out = self.relu(out)
 43 | 
 44 |         return out
 45 | 
 46 | class ResNetMtl(nn.Module):
 47 | 
 48 |     def __init__(self, block, layers, num_classes=1000):
 49 |         self.inplanes = 64
 50 |         super(ResNetMtl, self).__init__()
 51 |         self.conv1 = Conv2dMtl(3, 64, kernel_size=7, stride=2, padding=3,
 52 |                                bias=False)
 53 |         self.bn1 = nn.BatchNorm2d(64)
 54 |         self.relu = nn.ReLU(inplace=True)
 55 |         self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
 56 |         self.layer1 = self._make_layer(block, 64, layers[0])
 57 |         self.layer2 = self._make_layer(block, 128, layers[1], stride=2)
 58 |         self.layer3 = self._make_layer(block, 256, layers[2], stride=2)
 59 |         self.layer4 = self._make_layer(block, 512, layers[3], stride=2, last_phase=True)
 60 |         self.avgpool = nn.AvgPool2d(7, stride=1)
 61 |         self.fc = modified_linear.CosineLinear(512 * block.expansion, num_classes)
 62 | 
 63 |         for m in self.modules():
 64 |             if isinstance(m, Conv2dMtl):
 65 |                 nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
 66 |             elif isinstance(m, nn.BatchNorm2d):
 67 |                 nn.init.constant_(m.weight, 1)
 68 |                 nn.init.constant_(m.bias, 0)
 69 | 
 70 |     def _make_layer(self, block, planes, blocks, stride=1, last_phase=False):
 71 |         downsample = None
 72 |         if stride != 1 or self.inplanes != planes * block.expansion:
 73 |             downsample = nn.Sequential(
 74 |                 Conv2dMtl(self.inplanes, planes * block.expansion,
 75 |                           kernel_size=1, stride=stride, bias=False),
 76 |                 nn.BatchNorm2d(planes * block.expansion),
 77 |             )
 78 | 
 79 |         layers = []
 80 |         layers.append(block(self.inplanes, planes, stride, downsample))
 81 |         self.inplanes = planes * block.expansion
 82 |         if last_phase:
 83 |             for i in range(1, blocks-1):
 84 |                 layers.append(block(self.inplanes, planes))
 85 |             layers.append(block(self.inplanes, planes, last=True))
 86 |         else: 
 87 |             for i in range(1, blocks):
 88 |                 layers.append(block(self.inplanes, planes))
 89 | 
 90 |         return nn.Sequential(*layers)
 91 | 
 92 |     def forward(self, x):
 93 |         x = self.conv1(x)
 94 |         x = self.bn1(x)
 95 |         x = self.relu(x)
 96 |         x = self.maxpool(x)
 97 | 
 98 |         x = self.layer1(x)
 99 |         x = self.layer2(x)
100 |         x = self.layer3(x)
101 |         x = self.layer4(x)
102 | 
103 |         x = self.avgpool(x)
104 |         x = x.view(x.size(0), -1)
105 |         x = self.fc(x)
106 | 
107 |         return x
108 | 
109 | 
110 | def resnetmtl18(pretrained=False, **kwargs):
111 |     """Constructs a ResNet-18 model built from Conv2dMtl layers.
112 | 
113 |     Args:
114 |         pretrained (bool): Unused; no pre-trained weights are loaded
115 |     """
116 |     model = ResNetMtl(BasicBlockMtl, [2, 2, 2, 2], **kwargs)
117 |     return model
118 | 


--------------------------------------------------------------------------------
/mnemonics-training/2_eval/models/modified_resnetmtl_cifar.py:
--------------------------------------------------------------------------------
  1 | #remove ReLU in the last layer, and use cosine layer to replace nn.Linear
  2 | import torch.nn as nn
  3 | import math
  4 | import torch.utils.model_zoo as model_zoo
  5 | import models.modified_linear as modified_linear
  6 | from utils.incremental.conv2d_mtl import Conv2dMtl
  7 | 
  8 | def conv3x3mtl(in_planes, out_planes, stride=1):
  9 |     """3x3 convolution with padding"""
 10 |     return Conv2dMtl(in_planes, out_planes, kernel_size=3, stride=stride,
 11 |                      padding=1, bias=False)
 12 | 
 13 | class BasicBlockMtl(nn.Module):
 14 |     expansion = 1
 15 | 
 16 |     def __init__(self, inplanes, planes, stride=1, downsample=None, last=False):
 17 |         super(BasicBlockMtl, self).__init__()
 18 |         self.conv1 = conv3x3mtl(inplanes, planes, stride)
 19 |         self.bn1 = nn.BatchNorm2d(planes)
 20 |         self.relu = nn.ReLU(inplace=True)
 21 |         self.conv2 = conv3x3mtl(planes, planes)
 22 |         self.bn2 = nn.BatchNorm2d(planes)
 23 |         self.downsample = downsample
 24 |         self.stride = stride
 25 |         self.last = last
 26 | 
 27 |     def forward(self, x):
 28 |         residual = x
 29 | 
 30 |         out = self.conv1(x)
 31 |         out = self.bn1(out)
 32 |         out = self.relu(out)
 33 | 
 34 |         out = self.conv2(out)
 35 |         out = self.bn2(out)
 36 | 
 37 |         if self.downsample is not None:
 38 |             residual = self.downsample(x)
 39 | 
 40 |         out += residual
 41 |         if not self.last: #remove ReLU in the last layer
 42 |             out = self.relu(out)
 43 | 
 44 |         return out
 45 | 
 46 | class ResNetMtl(nn.Module):
 47 | 
 48 |     def __init__(self, block, layers, num_classes=10):
 49 |         self.inplanes = 16
 50 |         super(ResNetMtl, self).__init__()
 51 |         self.conv1 = Conv2dMtl(3, 16, kernel_size=3, stride=1, padding=1,
 52 |                                bias=False)
 53 |         self.bn1 = nn.BatchNorm2d(16)
 54 |         self.relu = nn.ReLU(inplace=True)
 55 |         self.layer1 = self._make_layer(block, 16, layers[0])
 56 |         self.layer2 = self._make_layer(block, 32, layers[1], stride=2)
 57 |         self.layer3 = self._make_layer(block, 64, layers[2], stride=2, last_phase=True)
 58 |         self.avgpool = nn.AvgPool2d(8, stride=1)
 59 |         self.fc = modified_linear.CosineLinear(64 * block.expansion, num_classes)
 60 | 
 61 |         for m in self.modules():
 62 |             if isinstance(m, Conv2dMtl):
 63 |                 nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
 64 |             elif isinstance(m, nn.BatchNorm2d):
 65 |                 nn.init.constant_(m.weight, 1)
 66 |                 nn.init.constant_(m.bias, 0)
 67 | 
 68 |     def _make_layer(self, block, planes, blocks, stride=1, last_phase=False):
 69 |         downsample = None
 70 |         if stride != 1 or self.inplanes != planes * block.expansion:
 71 |             downsample = nn.Sequential(
 72 |                 Conv2dMtl(self.inplanes, planes * block.expansion,
 73 |                           kernel_size=1, stride=stride, bias=False),
 74 |                 nn.BatchNorm2d(planes * block.expansion),
 75 |             )
 76 | 
 77 |         layers = []
 78 |         layers.append(block(self.inplanes, planes, stride, downsample))
 79 |         self.inplanes = planes * block.expansion
 80 |         if last_phase:
 81 |             for i in range(1, blocks-1):
 82 |                 layers.append(block(self.inplanes, planes))
 83 |             layers.append(block(self.inplanes, planes, last=True))
 84 |         else: 
 85 |             for i in range(1, blocks):
 86 |                 layers.append(block(self.inplanes, planes))
 87 | 
 88 |         return nn.Sequential(*layers)
 89 | 
 90 |     def forward(self, x):
 91 |         x = self.conv1(x)
 92 |         x = self.bn1(x)
 93 |         x = self.relu(x)
 94 | 
 95 |         x = self.layer1(x)
 96 |         x = self.layer2(x)
 97 |         x = self.layer3(x)
 98 | 
 99 |         x = self.avgpool(x)
100 |         x = x.view(x.size(0), -1)
101 |         x = self.fc(x)
102 | 
103 |         return x
104 | 
105 | def resnetmtl20(pretrained=False, **kwargs):
106 |     n = 3
107 |     model = ResNetMtl(BasicBlockMtl, [n, n, n], **kwargs)
108 |     return model
109 | 
110 | def resnetmtl32(pretrained=False, **kwargs):
111 |     n = 5
112 |     model = ResNetMtl(BasicBlockMtl, [n, n, n], **kwargs)
113 |     return model
114 | 
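
The fixed `nn.AvgPool2d(8, stride=1)` in `ResNetMtl` only collapses the feature map to 1x1 when the input is 32x32, as in CIFAR. The following is a minimal sketch of that arithmetic using the standard convolution output-size formula. The helper names `conv_out` and `trace_sizes` are illustrative and not part of the repository, and the sketch assumes that only the first 3x3 convolution of each stage carries the stride (as in `BasicBlock` from `resnet_cifar.py`).

```python
def conv_out(size, kernel, stride, padding):
    """Output side length of a square convolution or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1


def trace_sizes(input_size=32):
    """Follow the spatial size through conv1, layer1-3 and avgpool."""
    sizes = {}
    s = conv_out(input_size, kernel=3, stride=1, padding=1)  # conv1
    sizes["conv1"] = s
    for name, stride in (("layer1", 1), ("layer2", 2), ("layer3", 2)):
        # Assumption: only the first 3x3 conv of each stage is strided.
        s = conv_out(s, kernel=3, stride=stride, padding=1)
        sizes[name] = s
    # AvgPool2d(8, stride=1) has no padding.
    sizes["avgpool"] = conv_out(s, kernel=8, stride=1, padding=0)
    return sizes


print(trace_sizes(32))
print(trace_sizes(64))
```

With a 32x32 input the pooled map is 1x1, so the flattened feature has `64 * block.expansion` entries and matches the `fc` layer. With a 64x64 input the pooled map would be 9x9 and the flattened feature would no longer match.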


--------------------------------------------------------------------------------
/mnemonics-training/2_eval/models/resnet_cifar.py:
--------------------------------------------------------------------------------
  1 | import torch.nn as nn
  2 | import math
  3 | import torch.utils.model_zoo as model_zoo
  4 | from utils.incremental.conv2d_mtl import Conv2d
  5 | 
  6 | def conv3x3(in_planes, out_planes, stride=1):
  7 |     """3x3 convolution with padding"""
  8 |     return Conv2d(in_planes, out_planes, kernel_size=3, stride=stride,
  9 |                      padding=1, bias=False)
 10 | 
 11 | class BasicBlock(nn.Module):
 12 |     expansion = 1
 13 | 
 14 |     def __init__(self, inplanes, planes, stride=1, downsample=None):
 15 |         super(BasicBlock, self).__init__()
 16 |         self.conv1 = conv3x3(inplanes, planes, stride)
 17 |         self.bn1 = nn.BatchNorm2d(planes)
 18 |         self.relu = nn.ReLU(inplace=True)
 19 |         self.conv2 = conv3x3(planes, planes)
 20 |         self.bn2 = nn.BatchNorm2d(planes)
 21 |         self.downsample = downsample
 22 |         self.stride = stride
 23 | 
 24 |     def forward(self, x):
 25 |         residual = x
  26 | 
 28 |         out = self.conv1(x)
 29 |         out = self.bn1(out)
 30 |         out = self.relu(out)
 31 | 
 32 |         out = self.conv2(out)
 33 |         out = self.bn2(out)
 34 | 
 35 |         if self.downsample is not None:
 36 |             residual = self.downsample(x)
 37 | 
 38 |         out += residual
 39 |         out = self.relu(out)
 40 | 
 41 |         return out
 42 | 
 43 | 
 44 | class Bottleneck(nn.Module):
 45 |     expansion = 4
 46 | 
 47 |     def __init__(self, inplanes, planes, stride=1, downsample=None):
 48 |         super(Bottleneck, self).__init__()
 49 |         self.conv1 = nn.Conv2d(inplanes, planes, kernel_size=1, bias=False)
 50 |         self.bn1 = nn.BatchNorm2d(planes)
 51 |         self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=stride,
 52 |                                padding=1, bias=False)
 53 |         self.bn2 = nn.BatchNorm2d(planes)
 54 |         self.conv3 = nn.Conv2d(planes, planes * self.expansion, kernel_size=1, bias=False)
 55 |         self.bn3 = nn.BatchNorm2d(planes * self.expansion)
 56 |         self.relu = nn.ReLU(inplace=True)
 57 |         self.downsample = downsample
 58 |         self.stride = stride
 59 | 
 60 |     def forward(self, x):
 61 |         residual = x
 62 | 
 63 |         out = self.conv1(x)
 64 |         out = self.bn1(out)
 65 |         out = self.relu(out)
 66 | 
 67 |         out = self.conv2(out)
 68 |         out = self.bn2(out)
 69 |         out = self.relu(out)
 70 | 
 71 |         out = self.conv3(out)
 72 |         out = self.bn3(out)
 73 | 
 74 |         if self.downsample is not None:
 75 |             residual = self.downsample(x)
 76 | 
 77 |         out += residual
 78 |         out = self.relu(out)
 79 | 
 80 |         return out
 81 | 
 82 | 
 83 | class ResNet(nn.Module):
 84 | 
 85 |     def __init__(self, block, layers, num_classes=10):
 86 |         self.inplanes = 16
 87 |         super(ResNet, self).__init__()
 88 |         self.conv1 = nn.Conv2d(3, 16, kernel_size=3, stride=1, padding=1,
 89 |                                bias=False)
 90 |         self.bn1 = nn.BatchNorm2d(16)
 91 |         self.relu = nn.ReLU(inplace=True)
 92 |         self.layer1 = self._make_layer(block, 16, layers[0])
 93 |         self.layer2 = self._make_layer(block, 32, layers[1], stride=2)
 94 |         self.layer3 = self._make_layer(block, 64, layers[2], stride=2)
 95 |         self.avgpool = nn.AvgPool2d(8, stride=1)
 96 |         self.fc = nn.Linear(64 * block.expansion, num_classes)
 97 | 
 98 |         for m in self.modules():
 99 |             if isinstance(m, nn.Conv2d):
100 |                 nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
101 |             elif isinstance(m, nn.BatchNorm2d):
102 |                 nn.init.constant_(m.weight, 1)
103 |                 nn.init.constant_(m.bias, 0)
104 | 
105 |     def _make_layer(self, block, planes, blocks, stride=1):
106 |         downsample = None
107 |         if stride != 1 or self.inplanes != planes * block.expansion:
108 |             downsample = nn.Sequential(
109 |                 nn.Conv2d(self.inplanes, planes * block.expansion,
110 |                           kernel_size=1, stride=stride, bias=False),
111 |                 nn.BatchNorm2d(planes * block.expansion),
112 |             )
113 | 
114 |         layers = []
115 |         layers.append(block(self.inplanes, planes, stride, downsample))
116 |         self.inplanes = planes * block.expansion
117 |         for i in range(1, blocks):
118 |             layers.append(block(self.inplanes, planes))
119 | 
120 |         return nn.Sequential(*layers)
121 | 
122 |     def forward(self, x):
123 |         x = self.conv1(x)
124 |         x = self.bn1(x)
125 |         x = self.relu(x)
126 | 
127 |         x = self.layer1(x)
128 |         x = self.layer2(x)
129 |         x = self.layer3(x)
130 | 
131 |         x = self.avgpool(x)
132 |         x = x.view(x.size(0), -1)
133 |         x = self.fc(x)
134 | 
135 |         return x
136 | 
137 | def resnet20(pretrained=False, **kwargs):
138 |     n = 3
139 |     model = ResNet(BasicBlock, [n, n, n], **kwargs)
140 |     return model
141 | 
142 | def resnet32(pretrained=False, **kwargs):
143 |     n = 5
144 |     model = ResNet(BasicBlock, [n, n, n], **kwargs)
145 |     return model
146 | 
147 | def resnet56(pretrained=False, **kwargs):
148 |     n = 9
149 |     model = ResNet(Bottleneck, [n, n, n], **kwargs)
150 |     return model
151 | 
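
The factory names above follow the usual CIFAR ResNet convention of counting weighted layers: one stem convolution, the convolutions inside the residual blocks, and the final linear layer (downsample 1x1 convolutions are conventionally not counted). A small sketch of that count, assuming two convolutions per `BasicBlock` and three per `Bottleneck` as defined in this file; `weighted_depth` is a hypothetical helper:

```python
CONVS_PER_BLOCK = {"BasicBlock": 2, "Bottleneck": 3}


def weighted_depth(block_name, n, stages=3):
    """Stem conv + convs in all residual blocks + final fc layer."""
    return 1 + stages * n * CONVS_PER_BLOCK[block_name] + 1


for name, block, n in (("resnet20", "BasicBlock", 3),
                       ("resnet32", "BasicBlock", 5),
                       ("resnet56", "Bottleneck", 9)):
    print(name, weighted_depth(block, n))
```

By this count `resnet20` and `resnet32` match their names, while `resnet56` as written (Bottleneck, n = 9) comes to 83 layers. A depth of 56 would correspond to `BasicBlock` with n = 9. Whether the difference is intentional is not clear from the source.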


--------------------------------------------------------------------------------
/mnemonics-training/2_eval/process_imagenet/generate_imagenet.py:
--------------------------------------------------------------------------------
 1 | #!/usr/bin/env python
 2 | # coding=utf-8
 3 | import argparse
 4 | import os
 5 | import random
 6 | import shutil
 7 | import time
 8 | import warnings
 9 | import numpy as np
10 | 
11 | import torch
12 | import torch.nn as nn
13 | import torch.nn.parallel
14 | import torch.backends.cudnn as cudnn
15 | import torch.distributed as dist
16 | import torch.optim
17 | import torch.utils.data
18 | import torch.utils.data.distributed
19 | import torchvision.transforms as transforms
20 | import torchvision.datasets as datasets
21 | import torchvision.models as models
22 | from PIL import Image
23 | 
24 | src_root_dir = 'data/imagenet/data/'
25 | des_root_dir = 'data/imagenet_resized_256/data/'
26 | if not os.path.exists(des_root_dir):
27 |     os.makedirs(des_root_dir)
28 | 
29 | phase_list = ['train', 'val']
30 | for phase in phase_list:
31 |     if not os.path.exists(os.path.join(des_root_dir, phase)):
32 |         os.mkdir(os.path.join(des_root_dir, phase))
33 |     data_dir = os.path.join(src_root_dir, phase)
34 |     tg_dataset = datasets.ImageFolder(data_dir)
35 |     for cls_name in tg_dataset.classes:
36 |         if not os.path.exists(os.path.join(des_root_dir, phase, cls_name)):
37 |             os.mkdir(os.path.join(des_root_dir, phase, cls_name))
38 |     cnt = 0
39 |     for item in tg_dataset.imgs:
40 |         img_path = item[0]
41 |         img = Image.open(img_path)
42 |         img = img.convert('RGB')
43 |         save_path = os.path.join(des_root_dir, os.path.relpath(img_path, src_root_dir))
44 |         resized_img = img.resize((256,256), Image.BILINEAR)
45 |         resized_img.save(save_path)
46 |         cnt = cnt+1
47 |         if cnt % 1000 == 0:
48 |             print(cnt, save_path)
49 | 
50 | print("Generation finished.")
51 | 
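
When mirroring a source tree into a destination tree, deriving the save path from the path relative to `src_root_dir` is more robust than substituting the substring `imagenet`, because a substring substitution also rewrites any other occurrence in the path. A minimal sketch of the difference; `mirror_path` and the sample file name are made up for illustration:

```python
import os


def mirror_path(img_path, src_root, dst_root):
    """Map a file under src_root to the same relative location under dst_root."""
    return os.path.join(dst_root, os.path.relpath(img_path, src_root))


src_root = "data/imagenet/data/"
dst_root = "data/imagenet_resized_256/data/"
# Hypothetical file whose name also contains the substring "imagenet".
img_path = os.path.join(src_root, "train", "n01440764", "imagenet_0001.JPEG")

# Substring substitution touches the file name as well as the root.
print(img_path.replace("imagenet", "imagenet_resized_256"))
# Relative-path mapping only changes the root.
print(mirror_path(img_path, src_root, dst_root))
```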


--------------------------------------------------------------------------------
/mnemonics-training/2_eval/process_imagenet/generate_imagenet_subset.py:
--------------------------------------------------------------------------------
 1 | #!/usr/bin/env python
 2 | # coding=utf-8
 3 | import argparse
 4 | import os
 5 | import random
 6 | import shutil
 7 | import time
 8 | import warnings
 9 | import numpy as np
10 | 
11 | import torch
12 | import torch.nn as nn
13 | import torch.nn.parallel
14 | import torch.backends.cudnn as cudnn
15 | import torch.distributed as dist
16 | import torch.optim
17 | import torch.utils.data
18 | import torch.utils.data.distributed
19 | import torchvision.transforms as transforms
20 | import torchvision.datasets as datasets
21 | import torchvision.models as models
22 | 
23 | data_dir = 'data/imagenet/data/'
24 | 
25 | # Data loading code
26 | traindir = os.path.join(data_dir, 'train')
27 | train_dataset = datasets.ImageFolder(traindir, None)
28 | classes = train_dataset.classes
29 | print("the number of total classes: {}".format(len(classes)))
30 | 
31 | seed = 1993
32 | np.random.seed(seed)
33 | subset_num = 100
34 | subset_classes = np.random.choice(classes, subset_num, replace=False)
35 | print("the number of subset classes: {}".format(len(subset_classes)))
36 | print(subset_classes)
37 | 
38 | des_root_dir = 'data/seed_{}_subset_{}_imagenet/data/'.format(seed, subset_num)
39 | if not os.path.exists(des_root_dir):
40 |     os.makedirs(des_root_dir)
41 | phase_list = ['train', 'val']
42 | for phase in phase_list:
43 |     if not os.path.exists(os.path.join(des_root_dir, phase)):
44 |         os.mkdir(os.path.join(des_root_dir, phase))
45 |     for sc in subset_classes:
46 |         src_dir = os.path.join(data_dir, phase, sc)
47 |         des_dir = os.path.join(des_root_dir, phase, sc)
48 |         print("copy {} -> {}".format(src_dir, des_dir))
49 |         if not os.path.exists(des_dir):
50 |             shutil.copytree(src_dir, des_dir)
51 | 
52 | print("Generation finished.")


--------------------------------------------------------------------------------
/mnemonics-training/2_eval/run_eval.sh:
--------------------------------------------------------------------------------
1 | python main.py --nb_cl_fg=50 --nb_cl=2 --nb_protos=20 --epochs=160 --gpu=0 --dataset=cifar100 --random_seed=1993 --use_mtl
2 | python main.py --nb_cl_fg=50 --nb_cl=2 --nb_protos=20 --epochs=90 --gpu=0 --dataset=imagenet_sub --data_dir=./data/seed_1993_subset_100_imagenet/data --num_workers=16 --test_batch_size=50 --use_mtl
3 | python main.py --nb_cl_fg=500 --nb_cl=20 --nb_protos=20 --epochs=90 --gpu=0 --dataset=imagenet --data_dir=./data/imagenet/data --num_workers=16 --test_batch_size=50 --num_classes=1000 --use_mtl
4 | 
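
The `Trainer` in `trainer/train.py` builds its checkpoint directory name from these flags, so the flags passed here have to match the directory extracted from `logs.tar.gz`. A sketch of that naming rule as a standalone function. The function name `build_save_path` is hypothetical, and the `ckpt_label` value `"01"` is a placeholder, since the argument defaults are defined in `main.py` and not shown here.

```python
def build_save_path(dataset, nb_cl_fg, nb_cl, nb_protos, use_mtl,
                    ckpt_label, add_str=None, log_dir="./logs/"):
    """Mirror the string concatenation done in Trainer.__init__."""
    path = "{}{}_nfg{}_ncls{}_nproto{}".format(
        log_dir, dataset, nb_cl_fg, nb_cl, nb_protos)
    if use_mtl:
        path += "_mtl"
    if add_str is not None:
        path += add_str
    path += "_" + str(ckpt_label)
    return path


# First command above; ckpt_label="01" is an assumed placeholder value.
print(build_save_path("cifar100", 50, 2, 20, use_mtl=True, ckpt_label="01"))
```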


--------------------------------------------------------------------------------
/mnemonics-training/2_eval/script/download_ckpt.sh:
--------------------------------------------------------------------------------
1 | wget --load-cookies /tmp/cookies.txt "https://docs.google.com/uc?export=download&confirm=$(wget --quiet --save-cookies /tmp/cookies.txt --keep-session-cookies --no-check-certificate 'https://docs.google.com/uc?export=download&id=1sKO2BOssWgTFBNZbM50qDzgk6wqg4_l8' -O- | sed -rn 's/.*confirm=([0-9A-Za-z_]+).*/\1\n/p')&id=1sKO2BOssWgTFBNZbM50qDzgk6wqg4_l8" -O logs.tar.gz && rm -rf /tmp/cookies.txt
2 | tar zxvf logs.tar.gz
3 | 


--------------------------------------------------------------------------------
/mnemonics-training/2_eval/trainer/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/yaoyao-liu/class-incremental-learning/701af9f819f559c6ab3d3ee73bb3d7c21e924572/mnemonics-training/2_eval/trainer/__init__.py


--------------------------------------------------------------------------------
/mnemonics-training/2_eval/trainer/train.py:
--------------------------------------------------------------------------------
  1 | #!/usr/bin/env python
  2 | # coding=utf-8
  3 | import torch
  4 | import torch.nn as nn
  5 | import torch.nn.functional as F
  6 | import torch.optim as optim
  7 | from torch.optim import lr_scheduler
  8 | import torchvision
  9 | from torchvision import datasets, models, transforms
 10 | from torch.autograd import Variable
 11 | from tensorboardX import SummaryWriter
 12 | import numpy as np
 13 | import time
 14 | import os
 15 | import os.path as osp
 16 | import sys
 17 | import copy
 18 | import argparse
 19 | from PIL import Image
 20 | try:
 21 |     import cPickle as pickle
  22 | except ImportError:
 23 |     import pickle
 24 | import math
 25 | import utils.misc
 26 | import models.modified_resnet_cifar as modified_resnet_cifar
 27 | import models.modified_resnetmtl_cifar as modified_resnetmtl_cifar
 28 | import models.modified_resnet as modified_resnet
 29 | import models.modified_resnetmtl as modified_resnetmtl
 30 | import models.modified_linear as modified_linear
 31 | from utils.imagenet.utils_dataset import split_images_labels
 32 | from utils.imagenet.utils_dataset import merge_images_labels
 33 | from utils.incremental.compute_features import compute_features
 34 | from utils.incremental.compute_accuracy import compute_accuracy
 35 | from utils.incremental.compute_confusion_matrix import compute_confusion_matrix
 36 | import warnings
 37 | warnings.filterwarnings('ignore')
 38 | 
 39 | class Trainer(object):
 40 |     def __init__(self, the_args):
 41 |         self.args = the_args
 42 |         self.log_dir = './logs/'
 43 |         if not osp.exists(self.log_dir):
 44 |             os.mkdir(self.log_dir)
 45 |         self.save_path = self.log_dir + self.args.dataset + '_nfg' + str(self.args.nb_cl_fg) + '_ncls' + str(self.args.nb_cl) + \
 46 |         '_nproto' + str(self.args.nb_protos) 
 47 |         if self.args.use_mtl:
 48 |             self.save_path += '_mtl'
 49 |         if self.args.add_str is not None:
 50 |             self.save_path += self.args.add_str          
 51 |         self.save_path += '_' + str(self.args.ckpt_label)
 52 |         if not osp.exists(self.save_path):
 53 |             os.mkdir(self.save_path)
 54 | 
 55 |         self.device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
 56 |         
 57 |         if self.args.dataset == 'cifar100':
 58 |             self.transform_train = transforms.Compose([ \
 59 |                 transforms.RandomCrop(32, padding=4), \
 60 |                 transforms.RandomHorizontalFlip(), \
 61 |                 transforms.ToTensor(), \
 62 |                 transforms.Normalize((0.5071,  0.4866,  0.4409), (0.2009,  0.1984,  0.2023)),])
 63 |             self.transform_test = transforms.Compose([ \
 64 |                 transforms.ToTensor(), \
 65 |                 transforms.Normalize((0.5071,  0.4866,  0.4409), (0.2009,  0.1984,  0.2023)),])
 66 |             self.trainset = torchvision.datasets.CIFAR100(root='./data', train=True, download=True, transform=self.transform_train)
 67 |             self.testset = torchvision.datasets.CIFAR100(root='./data', train=False, download=True, transform=self.transform_test)
 68 |             self.evalset = torchvision.datasets.CIFAR100(root='./data', train=False, download=False, transform=self.transform_test)
 69 | 
 70 |             self.network = modified_resnet_cifar.resnet32
 71 |             self.network_mtl = modified_resnetmtl_cifar.resnetmtl32
 72 |             self.lr_strat = [int(self.args.epochs*0.5), int(self.args.epochs*0.75)]
 73 |             self.dictionary_size = 500
 74 | 
 75 |         elif self.args.dataset == 'imagenet_sub' or self.args.dataset == 'imagenet':
 76 |             traindir = os.path.join(self.args.data_dir, 'train')
 77 |             valdir = os.path.join(self.args.data_dir, 'val')
 78 |             normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
 79 |             self.trainset = datasets.ImageFolder(traindir, \
 80 |                 transforms.Compose([transforms.RandomResizedCrop(224), \
 81 |                 transforms.RandomHorizontalFlip(), \
 82 |                 transforms.ToTensor(), normalize,]))
 83 |             self.testset =  datasets.ImageFolder(valdir, \
 84 |                 transforms.Compose([transforms.Resize(256), \
 85 |                 transforms.CenterCrop(224), \
 86 |                 transforms.ToTensor(), normalize, ]))
 87 |             self.evalset =  datasets.ImageFolder(valdir, \
 88 |                 transforms.Compose([transforms.Resize(256), \
 89 |                 transforms.CenterCrop(224), \
 90 |                 transforms.ToTensor(), normalize,]))
 91 | 
 92 |             self.network = modified_resnet.resnet18
 93 |             self.network_mtl = modified_resnetmtl.resnetmtl18
 94 |             self.lr_strat = [30, 60]
 95 |             self.dictionary_size = 1500
 96 | 
 97 |         else:
 98 |             raise ValueError('Please set correct dataset.')
 99 | 
100 | 
101 |     def eval(self):
102 |         self.train_writer = SummaryWriter(comment=self.save_path)
103 |         dictionary_size = self.dictionary_size
104 |         top1_acc_list_cumul = np.zeros((int(self.args.num_classes/self.args.nb_cl), 4, self.args.nb_runs))
105 |         top1_acc_list_ori = np.zeros((int(self.args.num_classes/self.args.nb_cl), 4, self.args.nb_runs))
106 | 
107 |         if self.args.dataset == 'cifar100':
108 |             X_train_total = np.array(self.trainset.train_data)
109 |             Y_train_total = np.array(self.trainset.train_labels)
110 |             X_valid_total = np.array(self.testset.test_data)
111 |             Y_valid_total = np.array(self.testset.test_labels)
112 |         elif self.args.dataset == 'imagenet_sub' or self.args.dataset == 'imagenet':
113 |             X_train_total, Y_train_total = split_images_labels(self.trainset.imgs)
114 |             X_valid_total, Y_valid_total = split_images_labels(self.testset.imgs)
115 |         else:
116 |             raise ValueError('Please set correct dataset.')
117 | 
118 |         for iteration_total in range(self.args.nb_runs):
119 |             order_name = osp.join(self.save_path, \
120 |                 "seed_{}_{}_order_run_{}.pkl".format(self.args.random_seed, self.args.dataset, iteration_total))
121 |             print("Order name:{}".format(order_name))
122 | 
123 |             if osp.exists(order_name):
124 |                 print("Loading orders")
125 |                 order = utils.misc.unpickle(order_name)
126 |             else:
127 |                 print("Generating orders")
128 |                 order = np.arange(self.args.num_classes)
129 |                 np.random.shuffle(order)
130 |                 utils.misc.savepickle(order, order_name)
131 |             order_list = list(order)
132 |             print(order_list)
133 | 
134 |         X_valid_cumuls    = []
135 |         X_protoset_cumuls = []
136 |         Y_valid_cumuls    = []
137 |         Y_protoset_cumuls = []
138 | 
139 |         start_iter = int(self.args.nb_cl_fg/self.args.nb_cl)-1
140 | 
141 |         for iteration in range(start_iter, int(self.args.num_classes/self.args.nb_cl)):
142 |             if iteration == start_iter:
143 |                 last_iter = 0
144 |                 tg_model = self.network(num_classes=self.args.nb_cl_fg)
145 |                 in_features = tg_model.fc.in_features
146 |                 out_features = tg_model.fc.out_features
147 |                 print("in_features:", in_features, "out_features:", out_features)
148 |                 ref_model = None
149 |             elif iteration == start_iter+1:
150 |                 last_iter = iteration
151 |                 ref_model = copy.deepcopy(tg_model)
152 |                 if self.args.use_mtl:
153 |                     tg_model = self.network_mtl(num_classes=self.args.nb_cl_fg)
154 |                 else:
155 |                     tg_model = self.network(num_classes=self.args.nb_cl_fg)
156 |                 ref_dict = ref_model.state_dict()
157 |                 tg_dict = tg_model.state_dict()
158 |                 tg_dict.update(ref_dict)
159 |                 tg_model.load_state_dict(tg_dict)
160 |                 tg_model.to(self.device)
161 |                 in_features = tg_model.fc.in_features
162 |                 out_features = tg_model.fc.out_features
163 |                 print("in_features:", in_features, "out_features:", out_features)
164 |                 new_fc = modified_linear.SplitCosineLinear(in_features, out_features, self.args.nb_cl)
165 |                 new_fc.fc1.weight.data = tg_model.fc.weight.data
166 |                 new_fc.sigma.data = tg_model.fc.sigma.data
167 |                 tg_model.fc = new_fc
168 |                 lamda_mult = out_features*1.0 / self.args.nb_cl
169 |             else:
170 |                 last_iter = iteration
171 |                 ref_model = copy.deepcopy(tg_model)
172 |                 in_features = tg_model.fc.in_features
173 |                 out_features1 = tg_model.fc.fc1.out_features
174 |                 out_features2 = tg_model.fc.fc2.out_features
175 |                 print("in_features:", in_features, "out_features1:", out_features1, "out_features2:", out_features2)
176 |                 new_fc = modified_linear.SplitCosineLinear(in_features, out_features1+out_features2, self.args.nb_cl)
177 |                 new_fc.fc1.weight.data[:out_features1] = tg_model.fc.fc1.weight.data
178 |                 new_fc.fc1.weight.data[out_features1:] = tg_model.fc.fc2.weight.data
179 |                 new_fc.sigma.data = tg_model.fc.sigma.data
180 |                 tg_model.fc = new_fc
181 |                 lamda_mult = (out_features1+out_features2)*1.0 / (self.args.nb_cl)
182 | 
183 |             actual_cl = order[range(last_iter*self.args.nb_cl,(iteration+1)*self.args.nb_cl)]
184 |             indices_train_10 = np.array([i in order[range(last_iter*self.args.nb_cl,(iteration+1)*self.args.nb_cl)] for i in Y_train_total])
185 |             indices_test_10 = np.array([i in order[range(last_iter*self.args.nb_cl,(iteration+1)*self.args.nb_cl)] for i in Y_valid_total])
186 | 
187 |             X_valid = X_valid_total[indices_test_10]
188 |             X_valid_cumuls.append(X_valid)
189 |             X_valid_cumul = np.concatenate(X_valid_cumuls)
190 | 
191 |             Y_valid = Y_valid_total[indices_test_10]
192 |             Y_valid_cumuls.append(Y_valid)
193 |             Y_valid_cumul = np.concatenate(Y_valid_cumuls)
194 | 
195 |             if iteration == start_iter:
196 |                 X_valid_ori = X_valid
197 |                 Y_valid_ori = Y_valid
198 | 
199 |             ckp_name = osp.join(self.save_path, 'run_{}_iteration_{}_model.pth'.format(iteration_total, iteration))
200 | 
201 |             print('ckp_name', ckp_name)
202 |             print("[*] Loading models from checkpoint")
203 |             tg_model = torch.load(ckp_name)
204 |             tg_feature_model = nn.Sequential(*list(tg_model.children())[:-1])
205 | 
206 |             if self.args.dataset == 'cifar100':
207 |                 map_Y_valid_ori = np.array([order_list.index(i) for i in Y_valid_ori])
208 |                 print('Computing accuracy on the original batch of classes...')
209 |                 self.evalset.test_data = X_valid_ori.astype('uint8')
210 |                 self.evalset.test_labels = map_Y_valid_ori
211 |                 evalloader = torch.utils.data.DataLoader(self.evalset, batch_size=self.args.eval_batch_size,
212 |                         shuffle=False, num_workers=self.args.num_workers)
213 |                 ori_acc = compute_accuracy(tg_model, tg_feature_model, evalloader)
214 |                 top1_acc_list_ori[iteration, :, iteration_total] = np.array(ori_acc).T
215 |                 self.train_writer.add_scalar('ori_acc/cnn', float(ori_acc), iteration)
216 |                 map_Y_valid_cumul = np.array([order_list.index(i) for i in Y_valid_cumul])
217 |                 print('Computing cumulative accuracy...')
218 |                 self.evalset.test_data = X_valid_cumul.astype('uint8')
219 |                 self.evalset.test_labels = map_Y_valid_cumul
220 |                 evalloader = torch.utils.data.DataLoader(self.evalset, batch_size=self.args.eval_batch_size,
221 |                         shuffle=False, num_workers=self.args.num_workers)        
222 |                 cumul_acc = compute_accuracy(tg_model, tg_feature_model, evalloader)
223 |                 top1_acc_list_cumul[iteration, :, iteration_total] = np.array(cumul_acc).T
224 |                 self.train_writer.add_scalar('cumul_acc/cnn', float(cumul_acc), iteration)
225 |             elif self.args.dataset == 'imagenet_sub' or self.args.dataset == 'imagenet':   
226 |                 map_Y_valid_ori = np.array([order_list.index(i) for i in Y_valid_ori])
227 |                 print('Computing accuracy on the original batch of classes...')
228 |                 current_eval_set = merge_images_labels(X_valid_ori, map_Y_valid_ori)
229 |                 self.evalset.imgs = self.evalset.samples = current_eval_set
230 |                 evalloader = torch.utils.data.DataLoader(self.evalset, batch_size=self.args.eval_batch_size,
231 |                         shuffle=False, num_workers=self.args.num_workers, pin_memory=True)
232 |                 ori_acc = compute_accuracy(tg_model, tg_feature_model, evalloader)
233 |                 top1_acc_list_ori[iteration, :, iteration_total] = np.array(ori_acc).T
234 |                 self.train_writer.add_scalar('ori_acc/cnn', float(ori_acc), iteration)
235 |                 map_Y_valid_cumul = np.array([order_list.index(i) for i in Y_valid_cumul])
236 |                 print('Computing cumulative accuracy...')
237 |                 current_eval_set = merge_images_labels(X_valid_cumul, map_Y_valid_cumul)
238 |                 self.evalset.imgs = self.evalset.samples = current_eval_set
239 |                 evalloader = torch.utils.data.DataLoader(self.evalset, batch_size=self.args.eval_batch_size,
240 |                         shuffle=False, num_workers=self.args.num_workers, pin_memory=True)        
241 |                 cumul_acc = compute_accuracy(tg_model, tg_feature_model, evalloader)
242 |                 top1_acc_list_cumul[iteration, :, iteration_total] = np.array(cumul_acc).T
243 |                 self.train_writer.add_scalar('cumul_acc/cnn', float(cumul_acc), iteration)
244 |             else:
245 |                 raise ValueError('Please set correct dataset.')
246 | 
247 |         self.train_writer.close()
248 | 
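
The evaluation loop in `Trainer.eval` starts at the phase that covers the first `nb_cl_fg` classes and then adds `nb_cl` classes per phase. The index arithmetic can be sketched on its own as follows. `phase_schedule` is an illustrative helper, and `num_classes=100` is an assumption for CIFAR-100, since the default of `--num_classes` is not shown here.

```python
def phase_schedule(num_classes, nb_cl_fg, nb_cl):
    """Yield (iteration, first_pos, end_pos) into the class order, as in Trainer.eval."""
    start_iter = int(nb_cl_fg / nb_cl) - 1
    for iteration in range(start_iter, int(num_classes / nb_cl)):
        # The first phase covers everything from position 0;
        # later phases cover only the newly added nb_cl classes.
        last_iter = 0 if iteration == start_iter else iteration
        yield iteration, last_iter * nb_cl, (iteration + 1) * nb_cl


phases = list(phase_schedule(num_classes=100, nb_cl_fg=50, nb_cl=2))
print(len(phases))   # number of evaluated phases
print(phases[0])     # base phase
print(phases[1])     # first incremental phase
print(phases[-1])    # last incremental phase
```

For the CIFAR-100 command in `run_eval.sh` this gives one base phase over positions 0 to 50 of the class order followed by 25 incremental phases of two classes each.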


--------------------------------------------------------------------------------
/mnemonics-training/2_eval/utils/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/yaoyao-liu/class-incremental-learning/701af9f819f559c6ab3d3ee73bb3d7c21e924572/mnemonics-training/2_eval/utils/__init__.py


--------------------------------------------------------------------------------
/mnemonics-training/2_eval/utils/gpu_tools.py:
--------------------------------------------------------------------------------
 1 | import os
 2 | import torch
 3 | import time
 4 | 
 5 | def check_memory(cuda_device):
 6 |     devices_info = os.popen('"/usr/bin/nvidia-smi" --query-gpu=memory.total,memory.used --format=csv,nounits,noheader').read().strip().split("\n")
 7 |     total, used = devices_info[int(cuda_device)].split(',')
 8 |     return total,used
 9 | 
10 | def occupy_memory(cuda_device):
11 |     total, used = check_memory(cuda_device)
12 |     total = int(total)
13 |     used = int(used)
14 |     max_mem = int(total * 0.90)
15 |     print('Total memory: ' + str(total) + ', used memory: ' + str(used))
16 |     block_mem = max_mem - used
17 |     if block_mem > 0:
18 |         x = torch.cuda.FloatTensor(256, 1024, block_mem)
19 |         del x
20 | 
21 | def set_gpu(x):
22 |     os.environ['CUDA_VISIBLE_DEVICES'] = x
23 |     print('Using gpu:', x)
24 |     
25 | 
26 | 
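
`occupy_memory` relies on two pieces of arithmetic: `nvidia-smi` reports memory in MiB when `nounits` is set, and a float32 tensor of shape `(256, 1024, n)` takes `n` MiB. A sketch of both steps without any GPU dependency; the helper names and the sample row are made up for illustration:

```python
def parse_memory_line(line):
    """Parse one 'total, used' CSV row (values in MiB) into integers."""
    total, used = (int(field) for field in line.split(","))
    return total, used


def block_to_reserve(total, used, fraction=0.90):
    """MiB still needed to bring usage up to `fraction` of total (0 if none)."""
    return max(int(total * fraction) - used, 0)


# One float32 is 4 bytes, so each (256, 1024) slice is exactly 1 MiB.
BYTES_PER_SLICE = 256 * 1024 * 4

total, used = parse_memory_line("11178, 523")  # made-up sample row
print(BYTES_PER_SLICE)
print(block_to_reserve(total, used))
```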


--------------------------------------------------------------------------------
/mnemonics-training/2_eval/utils/imagenet/__init__.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 | # coding=utf-8
3 | 


--------------------------------------------------------------------------------
/mnemonics-training/2_eval/utils/imagenet/train_and_eval.py:
--------------------------------------------------------------------------------
 1 | import argparse
 2 | import os
 3 | import shutil
 4 | import time
 5 | 
 6 | import torch
 7 | import torch.nn as nn
 8 | import torch.nn.parallel
 9 | import torch.backends.cudnn as cudnn
10 | import torch.distributed as dist
11 | import torch.optim
12 | import torch.utils.data
13 | import torch.utils.data.distributed
14 | import torchvision.transforms as transforms
15 | import torchvision.datasets as datasets
16 | import torchvision.models as models
17 | 
18 | from .utils_train import *
19 | 
20 | def train_and_eval(epochs, start_epoch, model, optimizer, lr_scheduler, \
21 |         train_loader, val_loader, gpu=None):
22 |     for epoch in range(start_epoch, epochs):
23 |         #adjust_learning_rate(optimizer, epoch)
24 |         lr_scheduler.step()
25 |         print('\nEpoch: %d, LR: ' % epoch, end='')
26 |         print(lr_scheduler.get_lr())
27 | 
28 |         # train for one epoch
29 |         train(train_loader, model, optimizer, epoch, gpu)
30 | 
31 |         # evaluate on validation set
32 |         validate(val_loader, model, gpu)
33 | 
34 |     return model
35 | 
36 | def train(train_loader, model, optimizer, epoch, gpu=None):
37 |     batch_time = AverageMeter()
38 |     data_time = AverageMeter()
39 |     losses = AverageMeter()
40 |     top1 = AverageMeter()
41 |     top5 = AverageMeter()
42 |     model.train()
43 |     criterion = nn.CrossEntropyLoss().cuda(gpu)
44 |     end = time.time()
45 |     for i, (input, target) in enumerate(train_loader):
46 |         data_time.update(time.time() - end)
47 |         if gpu is not None:
48 |             input = input.cuda(gpu, non_blocking=True)
49 |         target = target.cuda(gpu, non_blocking=True)
50 |         output = model(input)
51 |         loss = criterion(output, target)
52 |         prec1, prec5 = accuracy(output, target, topk=(1, 5))
53 |         losses.update(loss.item(), input.size(0))
54 |         top1.update(prec1[0], input.size(0))
55 |         top5.update(prec5[0], input.size(0))
56 |         optimizer.zero_grad()
57 |         loss.backward()
58 |         optimizer.step()
59 |         batch_time.update(time.time() - end)
60 |         end = time.time()
61 | 
62 |         if i % 10 == 0:
63 |             print('Epoch: [{0}][{1}/{2}]\t'
64 |                   'Time {batch_time.val:.3f} ({batch_time.avg:.3f})\t'
65 |                   'Data {data_time.val:.3f} ({data_time.avg:.3f})\t'
66 |                   'Loss {loss.val:.4f} ({loss.avg:.4f})\t'
67 |                   'Prec@1 {top1.val:.3f} ({top1.avg:.3f})\t'
68 |                   'Prec@5 {top5.val:.3f} ({top5.avg:.3f})'.format(
69 |                    epoch, i, len(train_loader), batch_time=batch_time,
70 |                    data_time=data_time, loss=losses, top1=top1, top5=top5))
71 | 
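
`AverageMeter` is imported from `utils_train` and its definition is not shown above. Judging by the `update(value, n)` calls and the `.val` / `.avg` fields used in the format string, it is presumably a running weighted average. A sketch under that assumption, using the hypothetical name `RunningAverage` to make clear it is not the repository's class:

```python
class RunningAverage:
    """Track the latest value and the sample-weighted mean of all values."""

    def __init__(self):
        self.val = 0.0
        self.sum = 0.0
        self.count = 0
        self.avg = 0.0

    def update(self, val, n=1):
        # `n` is the number of samples the value was averaged over,
        # e.g. the batch size for a per-batch loss.
        self.val = val
        self.sum += val * n
        self.count += n
        self.avg = self.sum / self.count


meter = RunningAverage()
meter.update(0.5, n=4)
meter.update(1.0, n=4)
print(meter.val, meter.avg)
```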


--------------------------------------------------------------------------------
/mnemonics-training/2_eval/utils/imagenet/utils_dataset.py:
--------------------------------------------------------------------------------
 1 | import argparse
 2 | import os
 3 | import shutil
 4 | import time
 5 | import numpy as np
 6 | 
 7 | def split_images_labels(imgs):
 8 |     images = []
 9 |     labels = []
10 |     for item in imgs:
11 |         images.append(item[0])
12 |         labels.append(item[1])
13 |     return np.array(images), np.array(labels)
14 | 
15 | def merge_images_labels(images, labels):
16 |     images = list(images)
17 |     labels = list(labels)
18 |     assert(len(images)==len(labels))
19 |     imgs = []
20 |     for i in range(len(images)):
21 |         item = (images[i], labels[i])
22 |         imgs.append(item)
23 |     return imgs
24 | 


--------------------------------------------------------------------------------
/mnemonics-training/2_eval/utils/imagenet/utils_train.py:
--------------------------------------------------------------------------------
 1 | import argparse
 2 | import os
 3 | import shutil
 4 | import time
 5 | 
 6 | import torch
 7 | import torch.nn as nn
 8 | import torch.nn.parallel
 9 | import torch.backends.cudnn as cudnn
10 | import torch.distributed as dist
11 | import torch.optim
12 | import torch.utils.data
13 | import torch.utils.data.distributed
14 | import torchvision.transforms as transforms
15 | import torchvision.datasets as datasets
16 | import torchvision.models as models
17 | 
18 | def validate(val_loader, model, gpu=None):
19 |     batch_time = AverageMeter()
20 |     losses = AverageMeter()
21 |     top1 = AverageMeter()
22 |     top5 = AverageMeter()
23 | 
24 |     model.eval()
25 |     criterion = nn.CrossEntropyLoss().cuda(gpu)
26 | 
27 |     with torch.no_grad():
28 |         end = time.time()
29 |         for i, (input, target) in enumerate(val_loader):
30 |             if gpu is not None:
31 |                 input = input.cuda(gpu, non_blocking=True)
32 |             target = target.cuda(gpu, non_blocking=True)
33 | 
34 |             output = model(input)
35 |             loss = criterion(output, target)
36 | 
37 |             prec1, prec5 = accuracy(output, target, topk=(1, 5))
38 |             losses.update(loss.item(), input.size(0))
39 |             top1.update(prec1[0], input.size(0))
40 |             top5.update(prec5[0], input.size(0))
41 | 
42 |             batch_time.update(time.time() - end)
43 |             end = time.time()
44 | 
45 |             if i % 10 == 0:
46 |                 print('Test: [{0}/{1}]\t'
47 |                       'Time {batch_time.val:.3f} ({batch_time.avg:.3f})\t'
48 |                       'Loss {loss.val:.4f} ({loss.avg:.4f})\t'
49 |                       'Prec@1 {top1.val:.3f} ({top1.avg:.3f})\t'
50 |                       'Prec@5 {top5.val:.3f} ({top5.avg:.3f})'.format(
51 |                        i, len(val_loader), batch_time=batch_time, loss=losses,
52 |                        top1=top1, top5=top5))
53 | 
54 |         print(' * Prec@1 {top1.avg:.3f} Prec@5 {top5.avg:.3f}'
55 |               .format(top1=top1, top5=top5))
56 | 
57 |     return top1.avg
58 | 
59 | class AverageMeter(object):
60 |     """Computes and stores the average and current value"""
61 |     def __init__(self):
62 |         self.reset()
63 | 
64 |     def reset(self):
65 |         self.val = 0
66 |         self.avg = 0
67 |         self.sum = 0
68 |         self.count = 0
69 | 
70 |     def update(self, val, n=1):
71 |         self.val = val
72 |         self.sum += val * n
73 |         self.count += n
74 |         self.avg = self.sum / self.count
75 | 
76 | def accuracy(output, target, topk=(1,)):
77 |     with torch.no_grad():
78 |         maxk = max(topk)
79 |         batch_size = target.size(0)
80 | 
81 |         _, pred = output.topk(maxk, 1, True, True)
82 |         pred = pred.t()
83 |         correct = pred.eq(target.view(1, -1).expand_as(pred))
84 | 
85 |         res = []
86 |         for k in topk:
87 |             correct_k = correct[:k].view(-1).float().sum(0, keepdim=True)
88 |             res.append(correct_k.mul_(100.0 / batch_size))
89 |         return res
90 | 


--------------------------------------------------------------------------------
/mnemonics-training/2_eval/utils/incremental/__init__.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 | # coding=utf-8
3 | # for incremental train and eval
4 | 


--------------------------------------------------------------------------------
/mnemonics-training/2_eval/utils/incremental/compute_accuracy.py:
--------------------------------------------------------------------------------
 1 | #!/usr/bin/env python
 2 | # coding=utf-8
 3 | import torch
 4 | import torch.nn as nn
 5 | import torch.nn.functional as F
 6 | import torch.optim as optim
 7 | from torch.optim import lr_scheduler
 8 | import torchvision
 9 | from torchvision import datasets, models, transforms
10 | from torch.autograd import Variable
11 | import numpy as np
12 | import time
13 | import os
14 | import copy
15 | import argparse
16 | from PIL import Image
17 | from scipy.spatial.distance import cdist
18 | from sklearn.metrics import confusion_matrix
19 | from utils.misc import *
20 | from utils.imagenet.utils_dataset import merge_images_labels
21 | 
22 | def compute_accuracy(tg_model, tg_feature_model, evalloader, scale=None, print_info=True, device=None, cifar=True, imagenet=False, valdir=None):
23 |     if device is None:
24 |         device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
25 |     tg_model.eval()
26 |     tg_feature_model.eval()
27 | 
28 |     correct = 0
29 |     correct_icarl = 0
30 |     correct_icarl_cosine = 0
31 |     correct_icarl_cosine2 = 0
32 |     correct_ncm = 0
33 |     correct_maml = 0
34 |     total = 0
35 |     with torch.no_grad():
36 |         for batch_idx, (inputs, targets) in enumerate(evalloader):
37 |             inputs, targets = inputs.to(device), targets.to(device)
38 |             total += targets.size(0)
39 |             outputs = tg_model(inputs)
40 |             outputs = F.softmax(outputs, dim=1)
41 |             if scale is not None:
42 |                 assert(scale.shape[0] == 1)
43 |                 assert(outputs.shape[1] == scale.shape[1])
44 |                 outputs = outputs / scale.repeat(outputs.shape[0], 1).type(torch.FloatTensor).to(device)
45 |             _, predicted = outputs.max(1)
46 |             correct += predicted.eq(targets).sum().item()
47 | 
48 |     if print_info:
49 |         print("  top 1 accuracy           :\t\t{:.2f} %".format(100.*correct/total))    
50 |     acc = 100.*correct/total
51 |     return acc
52 | 


--------------------------------------------------------------------------------
/mnemonics-training/2_eval/utils/incremental/compute_confusion_matrix.py:
--------------------------------------------------------------------------------
 1 | #!/usr/bin/env python
 2 | # coding=utf-8
 3 | import torch
 4 | import torch.nn as nn
 5 | import torch.nn.functional as F
 6 | import torch.optim as optim
 7 | from torch.optim import lr_scheduler
 8 | import torchvision
 9 | from torchvision import datasets, models, transforms
10 | from torch.autograd import Variable
11 | import numpy as np
12 | import time
13 | import os
14 | import copy
15 | import argparse
16 | from PIL import Image
17 | from scipy.spatial.distance import cdist
18 | from sklearn.metrics import confusion_matrix
19 | from utils.misc import *
20 | 
21 | def compute_confusion_matrix(tg_model, tg_feature_model, class_means, evalloader, print_info=False, device=None):
22 |     if device is None:
23 |         device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
24 |     tg_model.eval()
25 |     tg_feature_model.eval()
26 | 
27 |     #evalset = torchvision.datasets.CIFAR100(root='./data', train=False,
28 |     #                                   download=False, transform=transform_test)
29 |     #evalset.test_data = input_data.astype('uint8')
30 |     #evalset.test_labels = input_labels
31 |     #evalloader = torch.utils.data.DataLoader(evalset, batch_size=128,
32 |     #    shuffle=False, num_workers=2)
33 | 
34 |     correct = 0
35 |     correct_icarl = 0
36 |     correct_ncm = 0
37 |     total = 0
38 |     num_classes = tg_model.fc.out_features
39 |     cm = np.zeros((3, num_classes, num_classes))
40 |     all_targets = []
41 |     all_predicted = []
42 |     all_predicted_icarl = []
43 |     all_predicted_ncm = []
44 |     with torch.no_grad():
45 |         for batch_idx, (inputs, targets) in enumerate(evalloader):
46 |             inputs, targets = inputs.to(device), targets.to(device)
47 |             total += targets.size(0)
48 |             all_targets.append(targets.cpu().numpy())
49 | 
50 |             outputs = tg_model(inputs)
51 |             _, predicted = outputs.max(1)
52 |             correct += predicted.eq(targets).sum().item()
53 |             all_predicted.append(predicted.cpu().numpy())
54 | 
55 | 
56 | 
57 |     cm[0, :, :] = confusion_matrix(np.concatenate(all_targets), np.concatenate(all_predicted), labels=np.arange(num_classes))
58 | 
59 |     if print_info:
60 |         print("  top 1 accuracy            :\t\t{:.2f} %".format( 100.*correct/total ))
61 |     return cm
62 | 


--------------------------------------------------------------------------------
/mnemonics-training/2_eval/utils/incremental/compute_features.py:
--------------------------------------------------------------------------------
 1 | #!/usr/bin/env python
 2 | # coding=utf-8
 3 | #!/usr/bin/env python
 4 | # coding=utf-8
 5 | import torch
 6 | import torch.nn as nn
 7 | import torch.nn.functional as F
 8 | import torch.optim as optim
 9 | from torch.optim import lr_scheduler
10 | import torchvision
11 | from torchvision import datasets, models, transforms
12 | from torch.autograd import Variable
13 | import numpy as np
14 | import time
15 | import os
16 | import copy
17 | import argparse
18 | from PIL import Image
19 | from scipy.spatial.distance import cdist
20 | from sklearn.metrics import confusion_matrix
21 | from utils.misc import *
22 | 
23 | def compute_features(tg_feature_model, evalloader, num_samples, num_features, device=None):
24 |     if device is None:
25 |         device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
26 |     tg_feature_model.eval()
27 |     features = np.zeros([num_samples, num_features])
28 |     start_idx = 0
29 |     with torch.no_grad():
30 |         for inputs, targets in evalloader:
31 |             inputs = inputs.to(device)
32 |             features[start_idx:start_idx+inputs.shape[0], :] = np.squeeze(tg_feature_model(inputs).cpu().numpy())
33 |             start_idx = start_idx+inputs.shape[0]
34 |     assert(start_idx==num_samples)
35 |     return features
36 | 


--------------------------------------------------------------------------------
/mnemonics-training/2_eval/utils/incremental/conv2d_mtl.py:
--------------------------------------------------------------------------------
 1 | import math
 2 | import torch
 3 | from torch.nn.parameter import Parameter
 4 | import torch.nn.functional as F
 5 | from torch.nn.modules.module import Module
 6 | from torch.nn.modules.utils import _single, _pair, _triple
 7 | 
 8 | 
 9 | class _ConvNdMtl(Module):
10 | 
11 |     def __init__(self, in_channels, out_channels, kernel_size, stride,
12 |                  padding, dilation, transposed, output_padding, groups, bias):
13 |         super(_ConvNdMtl, self).__init__()
14 |         if in_channels % groups != 0:
15 |             raise ValueError('in_channels must be divisible by groups')
16 |         if out_channels % groups != 0:
17 |             raise ValueError('out_channels must be divisible by groups')
18 |         self.in_channels = in_channels
19 |         self.out_channels = out_channels
20 |         self.kernel_size = kernel_size
21 |         self.stride = stride
22 |         self.padding = padding
23 |         self.dilation = dilation
24 |         self.transposed = transposed
25 |         self.output_padding = output_padding
26 |         self.groups = groups
27 |         if transposed:
28 |             self.weight = Parameter(torch.Tensor(
29 |                 in_channels, out_channels // groups, *kernel_size))
30 |             self.mtl_weight = Parameter(torch.ones(in_channels, out_channels // groups, 1, 1))
31 |         else:
32 |             self.weight = Parameter(torch.Tensor(
33 |                 out_channels, in_channels // groups, *kernel_size))
34 |             self.mtl_weight = Parameter(torch.ones(out_channels, in_channels // groups, 1, 1))
35 |         self.weight.requires_grad=False
36 |         if bias:
37 |             self.bias = Parameter(torch.Tensor(out_channels))
38 |             self.bias.requires_grad=False
39 |             self.mtl_bias = Parameter(torch.zeros(out_channels))
40 |         else:
41 |             self.register_parameter('bias', None)
42 |             self.register_parameter('mtl_bias', None)
43 |         self.reset_parameters()
44 | 
45 |     def reset_parameters(self):
46 |         n = self.in_channels
47 |         for k in self.kernel_size:
48 |             n *= k
49 |         stdv = 1. / math.sqrt(n)
50 |         self.weight.data.uniform_(-stdv, stdv)
51 |         self.mtl_weight.data.uniform_(1, 1)
52 |         if self.bias is not None:
53 |             self.bias.data.uniform_(-stdv, stdv)
54 |             self.mtl_bias.data.uniform_(0, 0)
55 | 
56 |     def extra_repr(self):
57 |         s = ('{in_channels}, {out_channels}, kernel_size={kernel_size}'
58 |              ', stride={stride}')
59 |         if self.padding != (0,) * len(self.padding):
60 |             s += ', padding={padding}'
61 |         if self.dilation != (1,) * len(self.dilation):
62 |             s += ', dilation={dilation}'
63 |         if self.output_padding != (0,) * len(self.output_padding):
64 |             s += ', output_padding={output_padding}'
65 |         if self.groups != 1:
66 |             s += ', groups={groups}'
67 |         if self.bias is None:
68 |             s += ', bias=False'
69 |         return s.format(**self.__dict__)
70 | 
71 | class Conv2dMtl(_ConvNdMtl):
72 | 
73 |     def __init__(self, in_channels, out_channels, kernel_size, stride=1,
74 |                  padding=0, dilation=1, groups=1, bias=True):
75 |         kernel_size = _pair(kernel_size)
76 |         stride = _pair(stride)
77 |         padding = _pair(padding)
78 |         dilation = _pair(dilation)
79 |         super(Conv2dMtl, self).__init__(
80 |             in_channels, out_channels, kernel_size, stride, padding, dilation,
81 |             False, _pair(0), groups, bias)
82 | 
83 |     def forward(self, input):
84 |         new_mtl_weight = self.mtl_weight.expand(self.weight.shape)
85 |         new_weight = self.weight.mul(new_mtl_weight)
86 |         if self.bias is not None:
87 |             new_bias = self.bias + self.mtl_bias
88 |         else:
89 |             new_bias = None
90 |         return F.conv2d(input, new_weight, new_bias, self.stride,
91 |                         self.padding, self.dilation, self.groups)
92 | 
93 | 


--------------------------------------------------------------------------------
/mnemonics-training/2_eval/utils/misc.py:
--------------------------------------------------------------------------------
  1 | #!/usr/bin/env python
  2 | # coding=utf-8
  3 | from __future__ import print_function, division
  4 | 
  5 | import torch
  6 | import torch.nn as nn
  7 | import torch.nn.init as init
  8 | from collections import OrderedDict
  9 | 
 10 | import numpy as np
 11 | import os
 12 | import os.path as osp
 13 | import sys
 14 | import time
 15 | import math
 16 | import subprocess
 17 | try:
 18 |     import cPickle as pickle
 19 | except ImportError:
 20 |     import pickle
 21 | 
 22 | def savepickle(data, file_path):
 23 |     mkdir_p(osp.dirname(file_path), delete=False)
 24 |     print('pickle into', file_path)
 25 |     with open(file_path, 'wb') as f:
 26 |         pickle.dump(data, f, pickle.HIGHEST_PROTOCOL)
 27 | 
 28 | def unpickle(file_path):
 29 |     with open(file_path, 'rb') as f:
 30 |         data = pickle.load(f)
 31 |     return data
 32 | 
 33 | def mkdir_p(path, delete=False, print_info=True):
 34 |     if path == '': return
 35 | 
 36 |     if delete:
 37 |         subprocess.call(('rm -r ' + path).split())
 38 |     if not osp.exists(path):
 39 |         if print_info:
 40 |             print('mkdir -p  ' + path)
 41 |         subprocess.call(('mkdir -p ' + path).split())
 42 | 
 43 | def get_mean_and_std(dataset):
 44 |     dataloader = torch.utils.data.DataLoader(dataset, batch_size=1, shuffle=True, num_workers=2)
 45 |     mean = torch.zeros(3)
 46 |     std = torch.zeros(3)
 47 |     print('==> Computing mean and std..')
 48 |     for inputs, targets in dataloader:
 49 |         for i in range(3):
 50 |             mean[i] += inputs[:,i,:,:].mean()
 51 |             std[i] += inputs[:,i,:,:].std()
 52 |     mean.div_(len(dataset))
 53 |     std.div_(len(dataset))
 54 |     return mean, std
 55 | 
 56 | def init_params(net):
 57 |     for m in net.modules():
 58 |         if isinstance(m, nn.Conv2d):
 59 |             init.kaiming_normal_(m.weight, mode='fan_out')
 60 |             if m.bias is not None:
 61 |                 init.constant_(m.bias, 0)
 62 |         elif isinstance(m, nn.BatchNorm2d):
 63 |             init.constant_(m.weight, 1)
 64 |             init.constant_(m.bias, 0)
 65 |         elif isinstance(m, nn.Linear):
 66 |             init.normal_(m.weight, std=1e-3)
 67 |             if m.bias is not None:
 68 |                 init.constant_(m.bias, 0)
 69 | 
 70 | import shutil
 71 | term_width = shutil.get_terminal_size(fallback=(80, 24)).columns
 72 | 
 73 | TOTAL_BAR_LENGTH = 65.
 74 | last_time = time.time()
 75 | begin_time = last_time
 76 | def progress_bar(current, total, msg=None):
 77 |     global last_time, begin_time
 78 |     if current == 0:
 79 |         begin_time = time.time()  # Reset for new bar.
 80 | 
 81 |     cur_len = int(TOTAL_BAR_LENGTH*current/total)
 82 |     rest_len = int(TOTAL_BAR_LENGTH - cur_len) - 1
 83 | 
 84 |     sys.stdout.write(' [')
 85 |     for i in range(cur_len):
 86 |         sys.stdout.write('=')
 87 |     sys.stdout.write('>')
 88 |     for i in range(rest_len):
 89 |         sys.stdout.write('.')
 90 |     sys.stdout.write(']')
 91 | 
 92 |     cur_time = time.time()
 93 |     step_time = cur_time - last_time
 94 |     last_time = cur_time
 95 |     tot_time = cur_time - begin_time
 96 | 
 97 |     L = []
 98 |     L.append('  Step: %s' % format_time(step_time))
 99 |     L.append(' | Tot: %s' % format_time(tot_time))
100 |     if msg:
101 |         L.append(' | ' + msg)
102 | 
103 |     msg = ''.join(L)
104 |     sys.stdout.write(msg)
105 |     for i in range(term_width-int(TOTAL_BAR_LENGTH)-len(msg)-3):
106 |         sys.stdout.write(' ')
107 | 
108 |     for i in range(term_width-int(TOTAL_BAR_LENGTH/2)+2):
109 |         sys.stdout.write('\b')
110 |     sys.stdout.write(' %d/%d ' % (current+1, total))
111 | 
112 |     if current < total-1:
113 |         sys.stdout.write('\r')
114 |     else:
115 |         sys.stdout.write('\n')
116 |     sys.stdout.flush()
117 | 
118 | def format_time(seconds):
119 |     days = int(seconds / 3600/24)
120 |     seconds = seconds - days*3600*24
121 |     hours = int(seconds / 3600)
122 |     seconds = seconds - hours*3600
123 |     minutes = int(seconds / 60)
124 |     seconds = seconds - minutes*60
125 |     secondsf = int(seconds)
126 |     seconds = seconds - secondsf
127 |     millis = int(seconds*1000)
128 | 
129 |     f = ''
130 |     i = 1
131 |     if days > 0:
132 |         f += str(days) + 'D'
133 |         i += 1
134 |     if hours > 0 and i <= 2:
135 |         f += str(hours) + 'h'
136 |         i += 1
137 |     if minutes > 0 and i <= 2:
138 |         f += str(minutes) + 'm'
139 |         i += 1
140 |     if secondsf > 0 and i <= 2:
141 |         f += str(secondsf) + 's'
142 |         i += 1
143 |     if millis > 0 and i <= 2:
144 |         f += str(millis) + 'ms'
145 |         i += 1
146 |     if f == '':
147 |         f = '0ms'
148 |     return f
149 | 


--------------------------------------------------------------------------------
/mnemonics-training/README.md:
--------------------------------------------------------------------------------
 1 | # Mnemonics Training
 2 | 
 3 | [![LICENSE](https://img.shields.io/badge/license-MIT-green?style=flat-square)](https://github.com/yaoyao-liu/class-incremental-learning/blob/master/LICENSE)
 4 | [![Python](https://img.shields.io/badge/python-3.6-blue.svg?style=flat-square&logo=python&color=3776AB)](https://www.python.org/)
 5 | [![PyTorch](https://img.shields.io/badge/pytorch-0.4.0-%237732a8?style=flat-square&logo=PyTorch&color=EE4C2C)](https://pytorch.org/)
 6 | [![Citations](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/yaoyao-liu/google-scholar/google-scholar-stats/gs_data_shieldsio_mnemonics.json&logo=Google%20Scholar&color=5087ec&style=flat-square&label=citations)](https://scholar.google.com/citations?view_op=view_citation&hl=en&user=Qi2PSmEAAAAJ&citation_for_view=Qi2PSmEAAAAJ:UeHWp8X0CEIC)
 7 | 
 8 | \[[PDF](https://arxiv.org/pdf/2002.10211.pdf)\] \[[Project Page](https://class-il.mpi-inf.mpg.de/mnemonics-training/)\]
 9 | 
10 | ## Requirements
11 | 
12 | See the required package versions [here](https://yyliu.net/files/mnemonics_packages.txt).
13 | 
14 | ## Download the Datasets
15 | 
16 | See the details [here](https://github.com/yaoyao-liu/class-incremental-learning/tree/main/adaptive-aggregation-networks#download-the-datasets).
17 | 
18 | 
19 | ## Running Experiments
20 | 
21 | ### Running experiments for baselines
22 | 
23 | ```bash
24 | cd ./mnemonics-training/1_train
25 | python main.py --method=baseline --nb_cl=10
26 | python main.py --method=baseline --nb_cl=5
27 | python main.py --method=baseline --nb_cl=2
28 | ```
29 | 
30 | ### Running experiments for our method
31 | 
32 | ```bash
33 | cd ./mnemonics-training/1_train
34 | python main.py --method=mnemonics --nb_cl=10
35 | python main.py --method=mnemonics --nb_cl=5
36 | python main.py --method=mnemonics --nb_cl=2
37 | ```
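
The six commands above can also be launched from one script. The following is a minimal sketch, not part of the repository: the helper names `build_commands` and `run_all` are hypothetical, and only the `--method` and `--nb_cl` flags are taken from the commands shown above. By default it only prints the commands (`dry_run=True`), so nothing is trained until you opt in.

```python
import subprocess
import sys

# Values taken from the commands listed in this README.
METHODS = ("baseline", "mnemonics")
NB_CL_VALUES = (10, 5, 2)


def build_commands(methods=METHODS, nb_cl_values=NB_CL_VALUES):
    """Return one argument list per (method, nb_cl) combination."""
    return [
        [sys.executable, "main.py", f"--method={method}", f"--nb_cl={nb_cl}"]
        for method in methods
        for nb_cl in nb_cl_values
    ]


def run_all(commands, cwd="./mnemonics-training/1_train", dry_run=True):
    """Print each command; execute it in `cwd` only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the sweep at the first failing run.
            subprocess.run(cmd, cwd=cwd, check=True)


# Dry run: lists the six commands without starting any training.
run_all(build_commands())
```

Call `run_all(build_commands(), dry_run=False)` from the repository root to actually start the runs one after another.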
38 | 
39 | ### Performance
40 | 
41 | #### Average accuracy (%)
42 | 
43 | | Method          | Dataset   | 5-phase     | 10-phase     | 25-phase    | 
44 | | ----------      | --------- | ----------  | ----------   |------------ |
45 | | [LwF](https://arxiv.org/abs/1606.09282)  | CIFAR-100 | 52.44  | 48.47   | 45.75 |
46 | | [LwF](https://arxiv.org/abs/1606.09282) w/ ours  | CIFAR-100 | 54.21  | 52.72   | 51.59 |
47 | | [iCaRL](https://arxiv.org/abs/1611.07725)  | CIFAR-100 | 58.03  | 53.01  | 48.47 |
48 | | [iCaRL](https://arxiv.org/abs/1611.07725) w/ ours | CIFAR-100 | 60.01  | 57.37   | 54.13 |
49 | 
50 | #### Forgetting rate (%, lower is better)
51 | 
52 | | Method          | Dataset   | 5-phase     | 10-phase     | 25-phase    | 
53 | | ----------      | --------- | ----------  | ----------   |------------ |
54 | | [LwF](https://arxiv.org/abs/1606.09282)  | CIFAR-100 | 45.02  | 42.50   | 39.86 |
55 | | [LwF](https://arxiv.org/abs/1606.09282) w/ ours  | CIFAR-100 | 40.00  | 36.50   | 34.25 |
56 | | [iCaRL](https://arxiv.org/abs/1611.07725)  | CIFAR-100 | 32.87  | 32.98 | 36.32 |
57 | | [iCaRL](https://arxiv.org/abs/1611.07725) w/ ours | CIFAR-100 | 25.93  | 26.92   | 28.92 |
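
The two metrics reported above can be computed from per-phase evaluation results roughly as follows. This is a sketch under assumptions that this README does not state explicitly: average accuracy is taken as the mean of the top-1 accuracies over all phases (initial phase included), and forgetting rate as the drop in accuracy on the initial-phase test data between the first and the last model. The function names and the numbers are hypothetical and are not results from the paper.

```python
def average_accuracy(phase_accuracies):
    """Mean top-1 accuracy (%) over all phases, initial phase included."""
    if not phase_accuracies:
        raise ValueError("need at least one phase accuracy")
    return sum(phase_accuracies) / len(phase_accuracies)


def forgetting_rate(initial_acc, final_acc):
    """Accuracy drop (%) on the initial-phase test data, first model vs. last model."""
    return initial_acc - final_acc


# Hypothetical per-phase accuracies, for illustration only.
accs = [80.0, 70.0, 60.0]
print(average_accuracy(accs))       # 70.0
print(forgetting_rate(80.0, 50.0))  # 30.0
```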
58 | 
59 | ## Citation
60 | 
61 | Please cite our paper if it is helpful to your work:
62 | 
63 | ```bibtex
64 | @inproceedings{liu2020mnemonics,
65 | author    = {Liu, Yaoyao and Su, Yuting and Liu, An{-}An and Schiele, Bernt and Sun, Qianru},
66 | title     = {Mnemonics Training: Multi-Class Incremental Learning without Forgetting},
67 | booktitle = {The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
68 | pages     = {12245--12254},
69 | year      = {2020}
70 | }
71 | ```
72 | 
73 | ## Acknowledgements
74 | 
75 | Our implementation uses the source code from the following repositories:
76 | 
77 | * [Learning a Unified Classifier Incrementally via Rebalancing](https://github.com/hshustc/CVPR19_Incremental_Learning)
78 | 
79 | * [iCaRL: Incremental Classifier and Representation Learning](https://github.com/srebuffi/iCaRL)
80 | 
81 | * [Dataset Distillation](https://github.com/SsnL/dataset-distillation)
82 | 
83 | * [Generative Teaching Networks](https://github.com/uber-research/GTN)
84 | 


--------------------------------------------------------------------------------