├── GDA-code.zip
└── README.md

/GDA-code.zip:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/zhiweihu1103/Logit-Distillation-GDA/main/GDA-code.zip
--------------------------------------------------------------------------------

/README.md:
--------------------------------------------------------------------------------
# Logit Distillation via Global Distribution Alignment
#### This repo provides the source code & data of our paper: Logit Distillation via Global Distribution Alignment.

## Dependencies
* `conda create -n gda python=3.7 -y`
* torch==1.11.0+cu113
* torchvision==0.12.0+cu113
* torchaudio==0.11.0+cu113
* timm==0.6.12

## Image Classification
### Preparation
1. Download the [CIFAR](https://www.cs.toronto.edu/~kriz/cifar.html) dataset from its website.
2. Put the dataset into `Image Classification/cache/data/cifar`.
3. Download the pre-trained weights from [Strong-to-Weak](https://github.com/megvii-research/mdistiller/releases/tag/checkpoints) and [Weak-to-Strong](https://github.com/ggjy/vision_weak_to_strong/releases/tag/cifar-ckpt-1).
4. Put the weights into `Image Classification/cache/ckpt/cifar`.

### Training the model
```sh
sh train.sh
```
**Note:**
1. We use vanilla KD as the base distiller and take ResNet56 as the teacher and ResNet20 as the student as an example. You can freely modify the variables defined at the beginning of `train.sh`.
2. The definitions of the different distillation losses can be found in `Image Classification/distillers`.
3. Training logs are written to the `logs` folder.

## Few-shot Learning
### Preparation
1. Download the [miniImageNet](https://github.com/gidariss/FewShotWithoutForgetting) dataset and link the folder into `Few-shot Learning/materials` with the name `mini-imagenet`.
2. You can set the dataset path and output path in `Few-shot Learning/init_env.py`.
3. When running the Python programs, use `--gpu` to specify the GPUs to run on (e.g. `--gpu 0,1`), as sketched below. For Classifier-Baseline, we train with 4 GPUs on miniImageNet; Meta-Baseline uses half as many GPUs.
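
As a rough illustration of how a `--gpu` flag of this kind is usually wired up (the exact handling inside the repository's scripts may differ), the comma-separated list is typically exported as `CUDA_VISIBLE_DEVICES` before PyTorch initializes CUDA:

```python
import argparse
import os

# Hypothetical sketch of typical --gpu handling; the repository's own
# train_classifier.py / train_meta.py may implement this differently.
parser = argparse.ArgumentParser()
parser.add_argument('--gpu', default='0', help='comma-separated GPU ids, e.g. "0,1"')
args, _ = parser.parse_known_args()

# Must be set before the first CUDA call so that torch only sees the listed devices.
os.environ['CUDA_VISIBLE_DEVICES'] = args.gpu
```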

### Training the model
#### Training Classifier-Baseline
* Training
```sh
python train_classifier.py --config config/classifier/train_classifier_mini.yaml --res_type resnet12_bottle --gpu 0,1,2,3
python train_classifier.py --config config/classifier/train_classifier_mini.yaml --res_type resnet18_bottle --gpu 0,1,2,3
python train_classifier.py --config config/classifier/train_classifier_mini.yaml --res_type resnet36_bottle --gpu 0,1,2,3
```
* Knowledge Distillation
```sh
python train_classifier.py --config config/classifier/train_classifier_mini_kd.yaml --res_type resnet36_bottle --teacher_res_type resnet12_bottle --gpu 0,1,2,3
python train_classifier.py --config config/classifier/train_classifier_mini_kd.yaml --res_type resnet36_bottle --teacher_res_type resnet18_bottle --gpu 0,1,2,3
```
#### Training Meta-Baseline
* Training
```sh
python train_meta.py --config config/meta/train_meta_mini.yaml --res_type resnet12 --gpu 0,1
python train_meta.py --config config/meta/train_meta_mini.yaml --res_type resnet18 --gpu 0,1
python train_meta.py --config config/meta/train_meta_mini.yaml --res_type resnet36 --gpu 0,1
```
* Knowledge Distillation (classifier teacher)
```sh
python train_meta.py --config config/meta/train_meta_mini_kd.yaml --res_type resnet36_bottle --teacher_res_type resnet12_bottle --gpu 0,1
python train_meta.py --config config/meta/train_meta_mini_kd.yaml --res_type resnet36_bottle --teacher_res_type resnet18_bottle --gpu 0,1
```
* Knowledge Distillation (meta teacher)
```sh
python train_meta.py --config config/meta/train_meta_mini_kd.yaml --res_type resnet36_bottle --teacher_res_type resnet12_bottle --teacher_meta_model --gpu 0,1
python train_meta.py --config config/meta/train_meta_mini_kd.yaml --res_type resnet36_bottle --teacher_res_type resnet18_bottle --teacher_meta_model --gpu 0,1
```
**Note:**
1. The definitions of the different distillation losses can be found in `Few-shot Learning/utils/__init__.py`.
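
For reference, the vanilla KD objective that these distillers build on is the temperature-scaled KL divergence between the teacher and student logits. Below is a minimal sketch under standard assumptions (Hinton-style KD), not the repository's exact implementation:

```python
import torch.nn.functional as F

def vanilla_kd_loss(student_logits, teacher_logits, temperature=4.0):
    """Hinton-style logit distillation: KL divergence between the
    temperature-softened teacher and student class distributions."""
    log_p_student = F.log_softmax(student_logits / temperature, dim=1)
    p_teacher = F.softmax(teacher_logits / temperature, dim=1)
    # The T^2 factor keeps gradient magnitudes comparable across temperatures.
    return F.kl_div(log_p_student, p_teacher, reduction='batchmean') * temperature ** 2
```

In practice this term is combined with the usual cross-entropy loss on the ground-truth labels; the temperature used here is an illustrative default, and the distillers shipped in this repository define their own losses on top of this basic objective.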