├── CLOC
│   ├── README.md
│   ├── code_offline
│   │   ├── __pycache__
│   │   │   ├── yfcc100m_dataset.cpython-37.pyc
│   │   │   └── yfcc100m_dataset.cpython-39.pyc
│   │   ├── main.py
│   │   └── yfcc100m_dataset.py
│   ├── code_online
│   │   ├── best_model
│   │   │   ├── __pycache__
│   │   │   │   ├── yfcc100m_dataset.cpython-37.pyc
│   │   │   │   └── yfcc100m_dataset.cpython-39.pyc
│   │   │   ├── groupNorm
│   │   │   │   ├── __pycache__
│   │   │   │   │   ├── group_norm.cpython-37.pyc
│   │   │   │   │   ├── group_norm.cpython-39.pyc
│   │   │   │   │   ├── resnet.cpython-37.pyc
│   │   │   │   │   └── resnet.cpython-39.pyc
│   │   │   │   ├── group_norm.py
│   │   │   │   └── resnet.py
│   │   │   ├── main_online_best_model.py
│   │   │   └── yfcc100m_dataset.py
│   │   └── no_PoLRS
│   │       ├── __pycache__
│   │       │   ├── yfcc100m_dataset.cpython-37.pyc
│   │       │   └── yfcc100m_dataset.cpython-39.pyc
│   │       ├── groupNorm
│   │       │   ├── __pycache__
│   │       │   │   ├── group_norm.cpython-37.pyc
│   │       │   │   ├── group_norm.cpython-39.pyc
│   │       │   │   ├── resnet.cpython-37.pyc
│   │       │   │   └── resnet.cpython-39.pyc
│   │       │   ├── group_norm.py
│   │       │   └── resnet.py
│   │       ├── main_online.py
│   │       └── yfcc100m_dataset.py
│   ├── data_preparation
│   │   └── download_images
│   │       └── download_images.py
│   ├── exp_BS
│   │   ├── eval_BS128.sh
│   │   ├── eval_BS256.sh
│   │   ├── eval_BS64.sh
│   │   ├── train_BS128.sh
│   │   ├── train_BS256.sh
│   │   └── train_BS64.sh
│   ├── exp_LR
│   │   ├── eval_PoLRS.sh
│   │   ├── eval_constant.sh
│   │   ├── eval_cosine.sh
│   │   ├── train_PoLRS.sh
│   │   ├── train_constant.sh
│   │   └── train_cosine.sh
│   ├── exp_RepBuf
│   │   ├── eval_39M.sh
│   │   ├── eval_40K.sh
│   │   ├── eval_4M.sh
│   │   ├── eval_ADRep.sh
│   │   ├── train_39M.sh
│   │   ├── train_40K.sh
│   │   ├── train_4M.sh
│   │   └── train_ADRep.sh
│   └── exp_best_model
│       ├── eval_offline.sh
│       ├── eval_online.sh
│       ├── train_offline.sh
│       └── train_online.sh
├── LICENSE
└── README.md

--------------------------------------------------------------------------------
/CLOC/README.md:
--------------------------------------------------------------------------------

# CLOC

Description
===========

![alt text](https://github.com/ZhipengCai/ZhipengCai.github.io/blob/master/papers/CLOC.png " ")

This repo contains the code for our ICCV 2021 paper: Online Continual Learning with Natural Distribution Shifts: An Empirical Study with Visual Data.

Authors: Zhipeng Cai, Ozan Sener, Vladlen Koltun

[[Paper link]](https://arxiv.org/pdf/2108.09020.pdf)


We perform an empirical study of online visual continual learning with natural distribution shifts.

1. We create a large-scale continual learning benchmark called Continual LOCalization (CLOC), which contains roughly 39 million images taken over 8 years. We construct a continual classification task using the time stamp and geo-location of each image. CLOC is suitable for benchmarking continual learning due to its large scale and natural distribution shifts. We provide the code to construct CLOC here. ([[A new host for the CLOC dataset is now available!]](https://github.com/hammoudhasan/CLDatasets)) If you just need the CLOC dataset, visit that link and follow the instructions there for a faster and easier download.

2. We define online continual learning (OCL) on CLOC, and distinguish learning efficacy (how fast the model adapts to new data) from information retention (how well the model remembers old knowledge). Previous studies mostly focus on offline continual learning and optimize for information retention. In this paper, we show that learning efficacy and information retention are somewhat conflicting, and we propose several strategies to optimize learning efficacy.
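Concretely, learning efficacy is measured by the average online accuracy: each incoming batch is used for evaluation first, and only then for training. A minimal sketch of this evaluate-then-adapt loop (the names below are illustrative, not code from this repo):

```
import torch

def average_online_accuracy(model, loader, criterion, optimizer, device='cuda'):
    # loader must yield batches in time order, as in CLOC
    correct, total = 0, 0
    for images, target in loader:
        images, target = images.to(device), target.to(device)
        # 1) evaluate on the incoming batch before any parameter update
        model.eval()
        with torch.no_grad():
            pred = model(images).argmax(dim=1)
            correct += (pred == target).sum().item()
            total += target.numel()
        # 2) then adapt to the same batch
        model.train()
        loss = criterion(model(images), target)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return 100.0 * correct / total
```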

We show that the average online accuracy of our OCL model can be on par with or better than the validation accuracy of supervised learning (SL) models given similar budgets, as shown in the bottom-right figure above.

The code in this repo is free for non-commercial academic use. Any commercial use is strictly
prohibited without the authors' consent. Please acknowledge the authors by citing:

```
@inproceedings{cai2021online,
  title={Online Continual Learning with Natural Distribution Shifts: An Empirical Study with Visual Data},
  author={Cai, Zhipeng and Sener, Ozan and Koltun, Vladlen},
  booktitle={International Conference on Computer Vision},
  year={2021}
}
```

Prerequisites
============
1. python 3

2. pytorch 1.7+

3. tensorboardX

4. numpy

We ran the experiments mainly on 4 Quadro RTX 6000 GPUs.

Usage
=====
1. Clone this repo.

2. Download the metadata of CLOC from [the google drive link](https://drive.google.com/file/d/1UdIZe_9rEemO2QukHw7bf6aDFV-RjAfc/view?usp=sharing). Decompress it into "CLOC/data_preparation/release", via:

```
mv metadata.tar.gz CLOC/data_preparation/
cd CLOC/data_preparation
tar -xvzf metadata.tar.gz
```

3. Download the images (download_images.py can download different parts of the dataset simultaneously and can resume after an unexpected termination; see the comments in the code for details), via:

```
cd CLOC/data_preparation/download_images
python download_images.py
```

4. Run the experiments:

Go to the exp_* folder of interest and run the experiment that you want to replicate. "exp_best_model" contains the code to train the proposed OCL model, i.e., using PoLRS, ADRep, and small batch sizes.

In each exp_* folder, there are pairs of scripts named "train_xxx.sh" and "eval_xxx.sh". Both can be run by simply typing:

```
bash xxx_xxx.sh
```

"train_xxx.sh" trains the OCL model and plots the average online accuracy.

"eval_xxx.sh" produces the backward transfer curve.

The time axes of different plots may differ by a scaling factor. To unify them, normalize each time axis into [0, 1] and then multiply every time point by the maximum number of images or the maximum wall-clock time (a small sketch is given at the end of this section).

5. Use tensorboard to monitor the results:

Go to each exp_xxx folder and run:
```
tensorboard --logdir=./
```
The output folder of individual experiments can be found in the first line of the output logs.
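The sketch mentioned in step 4, for unifying the time axes of different plots before comparing them (a minimal numpy version; the function and array names are illustrative, not part of this repo):

```
import numpy as np

def unify_time_axis(t, scale):
    # normalize a time axis into [0, 1], then rescale to a common unit,
    # e.g., the maximum number of images or the maximum wall-clock time
    t = np.asarray(t, dtype=np.float64)
    t01 = (t - t.min()) / (t.max() - t.min())
    return t01 * scale
```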

------------------------
Contact
------------------------

Name: Zhipeng Cai

Homepage: https://zhipengcai.github.io/

Email: czptc2h@gmail.com

Do not hesitate to contact the authors if you have any questions or find any bugs :)

--------------------------------------------------------------------------------
/CLOC/code_offline/__pycache__/yfcc100m_dataset.cpython-37.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/IntelLabs/continuallearning/ee7eb9e8550c4f74a6432475ebae37cc05021535/CLOC/code_offline/__pycache__/yfcc100m_dataset.cpython-37.pyc
--------------------------------------------------------------------------------
/CLOC/code_offline/__pycache__/yfcc100m_dataset.cpython-39.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/IntelLabs/continuallearning/ee7eb9e8550c4f74a6432475ebae37cc05021535/CLOC/code_offline/__pycache__/yfcc100m_dataset.cpython-39.pyc
--------------------------------------------------------------------------------
/CLOC/code_offline/main.py:
--------------------------------------------------------------------------------
import argparse
import os
import random
import shutil
import time
import warnings
import builtins
import math

import torch
import torch.nn as nn
import torch.nn.parallel
import torch.backends.cudnn as cudnn
import torch.distributed as dist
import torch.optim
import torch.multiprocessing as mp
import torch.utils.data
import torch.utils.data.distributed
import torchvision.transforms as transforms
import torchvision.datasets as datasets
import torchvision.models as models
from torch import randperm

import numpy as np
import sys
from tensorboardX import SummaryWriter
from yfcc100m_dataset import YFCC_CL_Dataset_offline_train, YFCC_CL_Dataset_offline_val

import psutil
import gc
import copy


model_names = sorted(name for name in models.__dict__
    if name.islower() and not name.startswith("__")
    and callable(models.__dict__[name]))

parser = argparse.ArgumentParser(description='PyTorch ImageNet Training')
parser.add_argument('-a', '--arch', metavar='ARCH', default='resnet50',
                    choices=model_names,
                    help='model architecture: ' +
                        ' | '.join(model_names) +
                        ' (default: resnet50)')
parser.add_argument('-j', '--workers', default=16, type=int, metavar='N',
                    help='number of data loading workers (default: 16)')
parser.add_argument('--epochs', default=90, type=int, metavar='N',
                    help='number of total epochs to run')
parser.add_argument('--start-epoch', default=0, type=int, metavar='N',
                    help='manual epoch number (useful on restarts)')
parser.add_argument('-b', '--batch-size', default=256, type=int,
                    metavar='N',
                    help='mini-batch size (default: 256), this is the total '
                         'batch size of all GPUs on the current node when '
                         'using Data Parallel or Distributed Data Parallel')

parser.add_argument('--lr', '--learning-rate', default=0.1, type=float,
                    metavar='LR', help='initial learning rate', dest='lr')
parser.add_argument('--momentum', default=0.9, type=float, metavar='M',
                    help='momentum')
parser.add_argument('--wd', '--weight-decay', default=1e-4, type=float,
                    metavar='W', help='weight decay (default: 1e-4)',
                    dest='weight_decay')
parser.add_argument('-p', '--print-freq', default=10, type=int,
                    metavar='N', help='print frequency (default: 10)')
parser.add_argument('-wf', '--write-freq', default=100, type=int,
                    metavar='N', help='tensorboard write frequency (default: 100)')
parser.add_argument('--resume', default='', type=str, metavar='PATH',
                    help='path to latest checkpoint (default: none)')
parser.add_argument('-e', '--evaluate', dest='evaluate', action='store_true',
                    help='evaluate model on validation set')
parser.add_argument('--pretrained', dest='pretrained', action='store_true',
                    help='use pre-trained model')
parser.add_argument('--world-size', default=-1, type=int,
                    help='number of nodes for distributed training')
parser.add_argument('--rank', default=-1, type=int,
                    help='node rank for distributed training')
parser.add_argument('--dist-url', default='tcp://224.66.41.62:23456', type=str,
                    help='url used to set up distributed training')
parser.add_argument('--dist-backend', default='nccl', type=str,
                    help='distributed backend')
parser.add_argument('--seed', default=None, type=int,
                    help='seed for initializing training. ')
parser.add_argument('--gpu', default=None, type=int,
                    help='GPU id to use.')
parser.add_argument('--multiprocessing-distributed', action='store_true',
                    help='Use multi-processing distributed training to launch '
                         'N processes per node, which has N GPUs. This is the '
                         'fastest way to use PyTorch for either single node or '
                         'multi node data parallel training')

# used for continual learning
parser.add_argument('--cell_id', default="/export/share/t1-datasets/yfcc100m_full_dataset_alt/metadata_geolocation/cellID_yfcc100m_metadata_with_labels_usedDataRatio0.05_t110000_t250.npy", type=str,
                    help='file that stores the cell IDs')
parser.add_argument('--data', default="/export/share/t1-datasets/yfcc100m_full_dataset_alt/metadata_geolocation/",
                    type=str, help='path to the training data')
parser.add_argument('--data_val', default="/export/share/t1-datasets/yfcc100m_full_dataset_alt/metadata_geolocation/yfcc100m_metadata_with_labels_usedDataRatio0.05_t110000_t250_valid_files_2004To2014_compact_val.csv",
                    type=str, help='path to the validation metadata')

parser.add_argument('--val_freq', default=10,
                    type=int, help='perform validation per [val_freq] epochs (in offline mode)')
parser.add_argument('--adjust_lr', default=0,
                    type=int, help='whether to adjust lr')
parser.add_argument('--num_passes', default=1,
                    type=int, help='number of passes of data during training (used for offline training)')
parser.add_argument('--use_aug', default=0,
                    type=int, help='whether to use data augmentation')
parser.add_argument('--use_val', default=0,
                    type=int, help='whether to use validation set')
parser.add_argument('--root', default="/export/share/t1-datasets/yfcc100m_full_dataset_alt/images/",
                    type=str, help='root of the image dataset (used to reduce memory consumption)')

parser.add_argument('--num_classes', default=500,
                    type=int, help='number of classes (used only for constructing the ring buffer, no need to set this value)')
parser.add_argument('--val_set_OOD', default=0,
                    type=int, help='(only for offline) whether to use an out-of-domain validation set')
parser.add_argument('--train_set_division_OOD1', default=10,
                    type=int, help='use the
first [1/xxx] training set at the time axis') 118 | 119 | parser.add_argument('--used_data_rate_start', default = 0.0, type = float, help = 'starting data rate') 120 | parser.add_argument('--used_data_rate_end', default = 1.0, type = float, help = 'ending data rate') 121 | 122 | parser.add_argument("--min_lr", default = 0.0, type = float, help = 'minimum learning rate, used to cut the cosine LR') 123 | 124 | # whether to save intermediate models 125 | parser.add_argument("--SaveInter", default = 0, type = int, help = 'whether to save intermediate models') 126 | 127 | best_acc1 = 0 128 | 129 | 130 | def main(): 131 | args = parser.parse_args() 132 | 133 | if args.seed is not None: 134 | random.seed(args.seed) 135 | torch.manual_seed(args.seed) 136 | cudnn.deterministic = True 137 | warnings.warn('You have chosen to seed training. ' 138 | 'This will turn on the CUDNN deterministic setting, ' 139 | 'which can slow down your training considerably! ' 140 | 'You may see unexpected behavior when restarting ' 141 | 'from checkpoints.') 142 | 143 | if args.gpu is not None: 144 | warnings.warn('You have chosen a specific GPU. This will completely ' 145 | 'disable data parallelism.') 146 | 147 | if args.dist_url == "env://" and args.world_size == -1: 148 | args.world_size = int(os.environ["WORLD_SIZE"]) 149 | 150 | args.distributed = args.world_size > 1 or args.multiprocessing_distributed 151 | 152 | ngpus_per_node = torch.cuda.device_count() 153 | if args.multiprocessing_distributed: 154 | # Since we have ngpus_per_node processes per node, the total world_size 155 | # needs to be adjusted accordingly 156 | args.world_size = ngpus_per_node * args.world_size 157 | # Use torch.multiprocessing.spawn to launch distributed processes: the 158 | # main_worker process function 159 | mp.spawn(main_worker, nprocs=ngpus_per_node, args=(ngpus_per_node, args)) 160 | else: 161 | # Simply call main_worker function 162 | main_worker(args.gpu, ngpus_per_node, args) 163 | 164 | 165 | def main_worker(gpu, ngpus_per_node, args): 166 | global best_acc1 167 | 168 | args.gpu = gpu 169 | 170 | # suppress printing if not master 171 | if args.multiprocessing_distributed and args.gpu != 0: 172 | def print_pass(*args): 173 | pass 174 | builtins.print = print_pass 175 | 176 | args.output_dir = create_output_dir(args) 177 | os.makedirs(args.output_dir, exist_ok = True) 178 | writer = SummaryWriter(args.output_dir) 179 | 180 | print("output_dir = {}".format(args.output_dir)) 181 | 182 | sys.stdout.flush() 183 | 184 | 185 | if args.gpu is not None: 186 | print("Use GPU: {} for training".format(args.gpu)) 187 | 188 | if args.distributed: 189 | if args.dist_url == "env://" and args.rank == -1: 190 | args.rank = int(os.environ["RANK"]) 191 | if args.multiprocessing_distributed: 192 | # For multiprocessing distributed training, rank needs to be the 193 | # global rank among all the processes 194 | args.rank = args.rank * ngpus_per_node + gpu 195 | dist.init_process_group(backend=args.dist_backend, init_method=args.dist_url, 196 | world_size=args.world_size, rank=args.rank) 197 | 198 | model, criterion, optimizer, num_classes = init_model(args, ngpus_per_node) 199 | 200 | args.num_classes = num_classes 201 | 202 | # init the full dataset 203 | train_set, val_set = init_dataset(args) 204 | cudnn.benchmark = True 205 | 206 | if train_set is not None: 207 | print("train_set.len = {}; val_set.len = {}".format(train_set.__len__(), val_set.__len__())) 208 | else: 209 | print("val_set.len = {}".format(val_set.__len__())) 210 | 211 | 212 | if 
args.evaluate: 213 | # optionally resume from a checkpoint 214 | if args.resume: 215 | if os.path.isfile(args.resume): 216 | args.start_epoch, _ = resume_offline(args, model, optimizer) 217 | else: 218 | raise RuntimeError("=> no checkpoint found at '{}'".format(args.resume)) 219 | 220 | print("do not perform training for evaluation mode, val_set = {}".format(val_set)) 221 | sys.stdout.flush() 222 | 223 | val_loader = init_val_loader(args, val_set) 224 | out_folder_eval = args.output_dir + '/epoch{}'.format(args.start_epoch) 225 | os.makedirs(out_folder_eval, exist_ok = True) 226 | writer = SummaryWriter(out_folder_eval) 227 | acc1, idx_test, label_test = validate_with_dis(val_loader, model, criterion, args, writer, args.start_epoch, print_over_time = True) 228 | # save idx_test and label_test 229 | fname_idx = out_folder_eval + '/idx_test.torchSave' 230 | fname_pred = out_folder_eval + '/pred_test.torchSave' 231 | print("saving files to {}".format(fname_idx)) 232 | torch.save(idx_test, fname_idx) 233 | torch.save(label_test, fname_pred) 234 | print("finish saving") 235 | else: 236 | train_offline(args, model, criterion, optimizer, train_set, val_set, ngpus_per_node, writer) 237 | 238 | 239 | def init_model(args, ngpus_per_node): 240 | cell_ids = np.load(args.cell_id, allow_pickle = True) 241 | num_classes = cell_ids.size + 1 # remember to +1, label 0 means all other locations that are not covered by the labels 242 | 243 | print("init with num_classes = {}".format(num_classes)) 244 | 245 | # create model 246 | if args.pretrained: 247 | print("=> using pre-trained model '{}'".format(args.arch)) 248 | model = models.__dict__[args.arch](pretrained=True) 249 | else: 250 | print("=> creating model '{}'".format(args.arch)) 251 | print("batch size too small ({} per GPU), using syncBN".format(int(args.batch_size / ngpus_per_node))) 252 | model = models.__dict__[args.arch](num_classes = num_classes, norm_layer = nn.SyncBatchNorm) 253 | 254 | if not torch.cuda.is_available(): 255 | print('using CPU, this will be slow') 256 | elif args.distributed: 257 | # For multiprocessing distributed, DistributedDataParallel constructor 258 | # should always set the single device scope, otherwise, 259 | # DistributedDataParallel will use all available devices. 
260 | if args.gpu is not None: 261 | torch.cuda.set_device(args.gpu) 262 | model.cuda(args.gpu) 263 | # When using a single GPU per process and per 264 | # DistributedDataParallel, we need to divide the batch size 265 | # ourselves based on the total number of GPUs we have 266 | args.batch_size = int(args.batch_size / ngpus_per_node) 267 | args.workers = int((args.workers + ngpus_per_node - 1) / ngpus_per_node) 268 | model = torch.nn.parallel.DistributedDataParallel(model, device_ids=[args.gpu]) 269 | else: 270 | model.cuda() 271 | # DistributedDataParallel will divide and allocate batch_size to all 272 | # available GPUs if device_ids are not set 273 | model = torch.nn.parallel.DistributedDataParallel(model) 274 | elif args.gpu is not None: 275 | torch.cuda.set_device(args.gpu) 276 | model = model.cuda(args.gpu) 277 | else: 278 | # DataParallel will divide and allocate batch_size to all available GPUs 279 | if args.arch.startswith('alexnet') or args.arch.startswith('vgg'): 280 | model.features = torch.nn.DataParallel(model.features) 281 | model.cuda() 282 | else: 283 | model = torch.nn.DataParallel(model).cuda() 284 | 285 | # define loss function (criterion) and optimizer 286 | criterion = nn.CrossEntropyLoss().cuda(args.gpu) 287 | 288 | optimizer = torch.optim.SGD(model.parameters(), args.lr, 289 | momentum=args.momentum, 290 | weight_decay=args.weight_decay) 291 | 292 | return model, criterion, optimizer, num_classes 293 | 294 | 295 | def init_dataset(args): 296 | normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406], 297 | std=[0.229, 0.224, 0.225]) 298 | 299 | trans = transforms.Compose([ 300 | transforms.RandomResizedCrop(224), 301 | transforms.RandomHorizontalFlip(), 302 | transforms.ToTensor(), 303 | normalize, 304 | ]) 305 | 306 | trans_test = transforms.Compose([ 307 | transforms.Resize(256), 308 | transforms.CenterCrop(224), 309 | transforms.ToTensor(), 310 | normalize, 311 | ]) 312 | 313 | print("data augmentation = {}; data augmentation for test batch = {}".format(trans, trans_test)) 314 | 315 | if args.evaluate: 316 | train_dataset = None 317 | val_dataset = YFCC_CL_Dataset_offline_val(args, 318 | transform = trans_test) 319 | else: 320 | train_dataset = YFCC_CL_Dataset_offline_train(args, 321 | transform = trans) 322 | val_dataset = YFCC_CL_Dataset_offline_val(args, 323 | transform = trans_test) 324 | 325 | return train_dataset, val_dataset 326 | 327 | 328 | def resume_offline(args, model, optimizer): 329 | print("=> loading checkpoint '{}'".format(args.resume)) 330 | if args.gpu is None: 331 | checkpoint = torch.load(args.resume) 332 | else: 333 | loc = 'cuda:{}'.format(args.gpu) 334 | checkpoint = torch.load(args.resume, map_location=loc) 335 | 336 | args.start_epoch = checkpoint['epoch'] 337 | best_acc1 = checkpoint['best_acc1'] 338 | 339 | model.load_state_dict(checkpoint['state_dict']) 340 | 341 | optimizer.load_state_dict(checkpoint['optimizer']) 342 | 343 | print("=> loaded checkpoint '{}' (epoch {})" 344 | .format(args.resume, checkpoint['epoch'])) 345 | 346 | return checkpoint['epoch'], checkpoint['best_acc1'] 347 | 348 | 349 | def init_val_loader(args, val_set): 350 | return torch.utils.data.DataLoader(val_set, batch_size=args.batch_size, shuffle=False, num_workers=args.workers, pin_memory=True) 351 | 352 | def validate_with_dis(val_loader, model, criterion, args, writer, epoch, print_over_time = False, writer_val = None): 353 | batch_time = AverageMeter('Time', ':6.3f') 354 | losses = AverageMeter('Loss', ':.4e') 355 | top1 = AverageMeter('Acc@1', ':6.2f') 
356 | top5 = AverageMeter('Acc@5', ':6.2f') 357 | 358 | top1_iter = AverageMeter('AccI@1', ':6.2f') 359 | top5_iter = AverageMeter('AccI@5', ':6.2f') 360 | 361 | progress = ProgressMeter( 362 | len(val_loader), 363 | [batch_time, losses, top1, top5, top1_iter, top5_iter], 364 | prefix='Test: ') 365 | 366 | if writer_val == None: 367 | writer_val = writer 368 | 369 | model.eval() 370 | second_per_week = 7*24*3600 371 | second_last = 0 372 | 373 | with torch.no_grad(): 374 | end = time.time() 375 | for i, (images, target, time_curr, idx) in enumerate(val_loader): 376 | if args.gpu is not None: 377 | images = images.cuda(args.gpu, non_blocking=True) 378 | target = target.cuda(args.gpu, non_blocking=True) 379 | idx = idx.cuda(args.gpu, non_blocking=True) 380 | # compute output 381 | output = model(images) 382 | loss = criterion(output, target) 383 | 384 | if i == 0: 385 | idx_test = idx.clone() 386 | _, pred_test = output.topk(5, dim = 1) 387 | else: 388 | idx_test = torch.cat((idx_test, idx.clone())) 389 | _, pred_tmp = output.topk(5, dim = 1) 390 | pred_test = torch.cat((pred_test, pred_tmp)) 391 | # measure accuracy and record loss 392 | acc1, acc5 = accuracy(output, target, topk=(1, 5)) 393 | losses.update(loss.item(), images.size(0)) 394 | top1.update(acc1[0], images.size(0)) 395 | top5.update(acc5[0], images.size(0)) 396 | top1_iter.update(acc1[0], images.size(0)) 397 | top5_iter.update(acc5[0], images.size(0)) 398 | 399 | # measure elapsed time 400 | batch_time.update(time.time() - end) 401 | end = time.time() 402 | 403 | if i % args.print_freq == 0: 404 | progress.display(i) 405 | sys.stdout.flush() 406 | 407 | if print_over_time and time_curr[-1] - second_last >= second_per_week: 408 | writer_val.add_scalar("val_acc1_avg", top1.avg, time_curr[-1]//second_per_week) 409 | writer_val.add_scalar("val_acc5_avg", top5.avg, time_curr[-1]//second_per_week) 410 | writer_val.add_scalar("val_acc1_perWeek", top1_iter.avg, time_curr[-1]//second_per_week) 411 | writer_val.add_scalar("val_acc5_perWeek", top5_iter.avg, time_curr[-1]//second_per_week) 412 | top1_iter.reset() 413 | top5_iter.reset() 414 | second_last = time_curr[-1] 415 | 416 | print(' * Acc@1 {top1.avg:.3f} Acc@5 {top5.avg:.3f}' 417 | .format(top1=top1, top5=top5)) 418 | 419 | writer.add_scalar("val_loss", losses.avg, epoch) 420 | writer.add_scalar("val_acc1", top1.avg, epoch) 421 | return top1.avg, idx_test, pred_test 422 | 423 | def validate(val_loader, model, criterion, args, writer, epoch, print_over_time = False, writer_val = None): 424 | batch_time = AverageMeter('Time', ':6.3f') 425 | losses = AverageMeter('Loss', ':.4e') 426 | top1 = AverageMeter('Acc@1', ':6.2f') 427 | top5 = AverageMeter('Acc@5', ':6.2f') 428 | 429 | top1_iter = AverageMeter('AccI@1', ':6.2f') 430 | top5_iter = AverageMeter('AccI@5', ':6.2f') 431 | 432 | progress = ProgressMeter( 433 | len(val_loader), 434 | [batch_time, losses, top1, top5, top1_iter, top5_iter], 435 | prefix='Test: ') 436 | 437 | if writer_val == None: 438 | writer_val = writer 439 | 440 | model.eval() 441 | second_per_week = 7*24*3600 442 | second_last = 0 443 | 444 | with torch.no_grad(): 445 | end = time.time() 446 | for i, (images, target, time_curr, idx) in enumerate(val_loader): 447 | if args.gpu is not None: 448 | images = images.cuda(args.gpu, non_blocking=True) 449 | target = target.cuda(args.gpu, non_blocking=True) 450 | 451 | # compute output 452 | output = model(images) 453 | loss = criterion(output, target) 454 | 455 | output_all = global_gather(output) 456 | target_all = 
global_gather(target) 457 | 458 | # measure accuracy and record loss 459 | acc1, acc5 = accuracy(output_all, target_all, topk=(1, 5)) 460 | losses.update(loss.item(), images.size(0)) 461 | top1.update(acc1[0], images.size(0)) 462 | top5.update(acc5[0], images.size(0)) 463 | top1_iter.update(acc1[0], images.size(0)) 464 | top5_iter.update(acc5[0], images.size(0)) 465 | 466 | 467 | # measure elapsed time 468 | batch_time.update(time.time() - end) 469 | end = time.time() 470 | 471 | 472 | if i % args.print_freq == 0: 473 | progress.display(i) 474 | sys.stdout.flush() 475 | 476 | if print_over_time and time_curr[-1] - second_last >= second_per_week: 477 | writer_val.add_scalar("val_acc1_avg", top1.avg, time_curr[-1]//second_per_week) 478 | writer_val.add_scalar("val_acc5_avg", top5.avg, time_curr[-1]//second_per_week) 479 | writer_val.add_scalar("val_acc1_perWeek", top1_iter.avg, time_curr[-1]//second_per_week) 480 | writer_val.add_scalar("val_acc5_perWeek", top5_iter.avg, time_curr[-1]//second_per_week) 481 | top1_iter.reset() 482 | top5_iter.reset() 483 | second_last = time_curr[-1] 484 | 485 | print(' * Acc@1 {top1.avg:.3f} Acc@5 {top5.avg:.3f}' 486 | .format(top1=top1, top5=top5)) 487 | 488 | # if args.gpu == 0: 489 | writer.add_scalar("val_loss", losses.avg, epoch) 490 | writer.add_scalar("val_acc1", top1.avg, epoch) 491 | return top1.avg 492 | 493 | def train_offline(args, model, criterion, optimizer, train_set, val_set, ngpus_per_node, writer): 494 | if args.resume: 495 | if os.path.isfile(args.resume): 496 | args.start_epoch, _ = resume_offline(args, model, optimizer) 497 | else: 498 | raise RuntimeError("=> no checkpoint found at '{}'".format(args.resume)) 499 | 500 | print("args.workers = {}".format(args.workers)) 501 | if args.use_val == 1: 502 | val_loader = torch.utils.data.DataLoader(val_set, 503 | batch_size=args.batch_size, shuffle=False, 504 | num_workers=args.workers, pin_memory=True) 505 | 506 | for epoch in range(args.start_epoch, args.epochs): 507 | train_loader, train_sampler = init_loader_v2(args, train_set, epoch) 508 | if args.distributed: 509 | train_sampler.set_epoch(0) 510 | 511 | if args.adjust_lr: 512 | adjust_learning_rate(optimizer, epoch, args, ngpus_per_node) 513 | 514 | writer.add_scalar("learning rate", get_lr(optimizer), epoch) 515 | 516 | # train for one epoch 517 | train_offline_one_iter(train_loader, model, criterion, optimizer, epoch, args, writer) 518 | 519 | if args.use_val == 1 and (epoch + 1) % args.val_freq == 0: 520 | # # evaluate on validation set 521 | os.makedirs(args.output_dir+"/eval_ep{}".format(epoch), exist_ok = True) 522 | writer_val = SummaryWriter(args.output_dir+"/eval_ep{}".format(epoch)) 523 | acc1 = validate(val_loader, model, criterion, args, writer_val, epoch, print_over_time = True, writer_val = writer_val) 524 | 525 | if not args.multiprocessing_distributed or (args.multiprocessing_distributed 526 | and args.rank % ngpus_per_node == 0): 527 | if epoch % 10 == 0: 528 | save_checkpoint({ 529 | 'epoch': epoch + 1, 530 | 'arch': args.arch, 531 | 'state_dict': model.state_dict(), 532 | 'best_acc1': 0, # currently not used 533 | 'optimizer' : optimizer.state_dict() 534 | }, 0, output_dir = args.output_dir, filename = 'checkpoint_ep{}.pth.tar'.format(epoch)) 535 | else: 536 | save_checkpoint({ 537 | 'epoch': epoch + 1, 538 | 'arch': args.arch, 539 | 'state_dict': model.state_dict(), 540 | 'best_acc1': 0, # currently not used 541 | 'optimizer' : optimizer.state_dict() 542 | }, 0, output_dir = args.output_dir) 543 | 544 | val_loader = 
init_val_loader(args, val_set)
    acc1 = validate(val_loader, model, criterion, args, writer, args.epochs, print_over_time = True, writer_val = writer)
    print("final average accuracy = {}".format(acc1))


def init_loader_v2(args, train_set, epoch):

    train_set._change_data_range()
    batchSize4Loader = args.batch_size

    if args.distributed:
        # we already have randomness during change_data_range()
        train_sampler = torch.utils.data.distributed.DistributedSampler(train_set, shuffle = False)
    else:
        train_sampler = None

    # train_sampler is None in the non-distributed case, so only report
    # its shuffle flag when a sampler actually exists
    if train_sampler is not None:
        print("sampler.shuffle = {}".format(train_sampler.shuffle))

    print("[initLoaderv2]: batchSize4Loader = {}".format(batchSize4Loader))
    train_loader = torch.utils.data.DataLoader(
        train_set, batch_size=batchSize4Loader, shuffle=(train_sampler is None),
        num_workers=args.workers, pin_memory=True, sampler=train_sampler)

    return train_loader, train_sampler



def train_offline_one_iter(train_loader, model, criterion, optimizer, epoch, args, writer):
    batch_time = AverageMeter('Time', ':6.3f')
    data_time = AverageMeter('Data', ':6.3f')
    losses = AverageMeter('Loss', ':.4e')
    top1 = AverageMeter('Acc@1', ':6.2f')
    top5 = AverageMeter('Acc@5', ':6.2f')

    progress = ProgressMeter(
        len(train_loader),
        [batch_time, data_time, losses, top1, top5],
        prefix="Epoch: [{}]".format(epoch))

    model.train()
    end = time.time()

    optimizer.zero_grad()

    for i, (images, target, _, index) in enumerate(train_loader):
        # measure data loading time
        data_time.update(time.time() - end)

        iter_curr = epoch*len(train_loader)+i

        batch_size = target.size(0)

        if args.gpu is not None:
            images = images.cuda(args.gpu, non_blocking=True)
            target = target.cuda(args.gpu, non_blocking=True)

        output = model(images)
        loss = criterion(output, target)

        acc1, acc5 = accuracy(output, target, topk=(1, 5))
        losses.update(loss.item(), images.size(0))
        top1.update(acc1[0], images.size(0))
        top5.update(acc5[0], images.size(0))

        loss.backward()

        if i % args.print_freq == 0:
            progress.display(i)

        if i % args.write_freq == 0:
            writer.add_scalar("train_loss_iter", losses.avg, iter_curr)
            writer.add_scalar("train_acc1_iter", top1.avg, iter_curr)
            writer.add_scalar("train_acc5_iter", top5.avg, iter_curr)

        optimizer.step() # update parameters of net
        optimizer.zero_grad() # reset gradient

        # measure elapsed time
        batch_time.update(time.time() - end)
        end = time.time()

    # if args.gpu == 0:
    writer.add_scalar("train_loss", losses.avg, epoch)
    writer.add_scalar("train_acc1", top1.avg, epoch)

def global_gather(x):
    all_x = [torch.ones_like(x)
             for _ in range(dist.get_world_size())]
    dist.all_gather(all_x, x, async_op=False)
    return torch.cat(all_x, dim=0)


def save_checkpoint(state, is_best, output_dir = '.', filename='checkpoint.pth.tar'):
    torch.save(state, output_dir+'/checkpoint.pth.tar')
    if filename != 'checkpoint.pth.tar':
        shutil.copyfile(output_dir+'/checkpoint.pth.tar', output_dir+'/'+filename)


class AverageMeter(object):
    """Computes and stores the average and current value"""
    def __init__(self, name,
fmt=':f'): 646 | self.name = name 647 | self.fmt = fmt 648 | self.reset() 649 | 650 | def reset(self): 651 | self.val = 0 652 | self.avg = 0 653 | self.sum = 0 654 | self.count = 0 655 | 656 | def update(self, val, n=1): 657 | self.val = val 658 | self.sum += val * n 659 | self.count += n 660 | self.avg = self.sum / self.count 661 | 662 | def __str__(self): 663 | fmtstr = '{name} {val' + self.fmt + '} ({avg' + self.fmt + '})' 664 | return fmtstr.format(**self.__dict__) 665 | 666 | 667 | class ProgressMeter(object): 668 | def __init__(self, num_batches, meters, prefix=""): 669 | self.batch_fmtstr = self._get_batch_fmtstr(num_batches) 670 | self.meters = meters 671 | self.prefix = prefix 672 | 673 | def display(self, batch): 674 | entries = [self.prefix + self.batch_fmtstr.format(batch)] 675 | entries += [str(meter) for meter in self.meters] 676 | # for test only 677 | print('\t'.join(entries)) 678 | 679 | def _get_batch_fmtstr(self, num_batches): 680 | num_digits = len(str(num_batches // 1)) 681 | fmt = '{:' + str(num_digits) + 'd}' 682 | return '[' + fmt + '/' + fmt.format(num_batches) + ']' 683 | 684 | 685 | 686 | def get_lr(optimizer): 687 | for param_group in optimizer.param_groups: 688 | return param_group['lr'] 689 | 690 | 691 | def adjust_learning_rate(optimizer, epoch, args, ngpus_per_node, iter_curr = 0): 692 | lr = args.lr 693 | 694 | lr *= 0.5 * (1. + math.cos(math.pi * epoch / args.epochs)) 695 | lr = max(args.min_lr, lr) 696 | # print("lr = {}".format(lr)) 697 | 698 | for param_group in optimizer.param_groups: 699 | param_group['lr'] = lr 700 | 701 | 702 | def accuracy(output, target, topk=(1,), cross_GPU = False): 703 | """Computes the accuracy over the k top predictions for the specified values of k""" 704 | with torch.no_grad(): 705 | if cross_GPU: 706 | output = global_gather(output) 707 | target = global_gather(target) 708 | 709 | maxk = max(topk) 710 | batch_size = target.size(0) 711 | 712 | _, pred = output.topk(maxk, 1, True, True) 713 | pred = pred.t() 714 | correct = pred.eq(target.reshape(1, -1).expand_as(pred)) 715 | 716 | res = [] 717 | for k in topk: 718 | correct_k = correct[:k].reshape(-1).float().sum(0, keepdim=True) 719 | res.append(correct_k.mul_(100.0 / batch_size)) 720 | return res 721 | 722 | 723 | def create_output_dir(args): 724 | 725 | output_dir = 'results_offline/'+args.arch+'_lr{}_epochs{}'.format(args.lr, args.epochs)+'_numPasses{}'.format(args.num_passes)+'_useVal{}'.format(args.use_val)+'_valFreq{}'.format(args.val_freq) 726 | 727 | output_dir += '_BS{}'.format(args.batch_size) 728 | 729 | if args.used_data_rate_start != 0.0: 730 | output_dir += "_start{}".format(args.used_data_rate_start) 731 | 732 | if args.used_data_rate_end != 1.0: 733 | output_dir += "_end{}".format(args.used_data_rate_end) 734 | 735 | if args.weight_decay != 1e-4: 736 | output_dir += '_WD{}'.format(args.weight_decay) 737 | 738 | if args.min_lr > 0.0: 739 | output_dir += '_minLr{}'.format(args.min_lr) 740 | 741 | if args.evaluate: 742 | output_dir += '/evaluate' 743 | 744 | return output_dir 745 | 746 | if __name__ == '__main__': 747 | main() -------------------------------------------------------------------------------- /CLOC/code_offline/yfcc100m_dataset.py: -------------------------------------------------------------------------------- 1 | import torch 2 | from torchvision.datasets.vision import StandardTransform 3 | from torch.utils.data import Dataset, IterableDataset 4 | from PIL import Image 5 | 6 | import os 7 | import os.path 8 | import csv 9 | 10 | import sys 11 | 
import math 12 | import random 13 | import matplotlib.pyplot as plt 14 | 15 | 16 | def has_file_allowed_extension(filename, extensions): 17 | """Checks if a file is an allowed extension. 18 | 19 | Args: 20 | filename (string): path to a file 21 | extensions (tuple of strings): extensions to consider (lowercase) 22 | 23 | Returns: 24 | bool: True if the filename ends with one of given extensions 25 | """ 26 | return filename.lower().endswith(extensions) 27 | 28 | 29 | def is_image_file(filename): 30 | """Checks if a file is an allowed image extension. 31 | 32 | Args: 33 | filename (string): path to a file 34 | 35 | Returns: 36 | bool: True if the filename ends with a known image extension 37 | """ 38 | return has_file_allowed_extension(filename, IMG_EXTENSIONS) 39 | 40 | 41 | def make_dataset(directory, class_to_idx, extensions=None, is_valid_file=None): 42 | instances = [] 43 | directory = os.path.expanduser(directory) 44 | both_none = extensions is None and is_valid_file is None 45 | both_something = extensions is not None and is_valid_file is not None 46 | if both_none or both_something: 47 | raise ValueError("Both extensions and is_valid_file cannot be None or not None at the same time") 48 | if extensions is not None: 49 | def is_valid_file(x): 50 | return has_file_allowed_extension(x, extensions) 51 | for target_class in sorted(class_to_idx.keys()): 52 | class_index = class_to_idx[target_class] 53 | target_dir = os.path.join(directory, target_class) 54 | if not os.path.isdir(target_dir): 55 | continue 56 | for root, _, fnames in sorted(os.walk(target_dir, followlinks=True)): 57 | for fname in sorted(fnames): 58 | path = os.path.join(root, fname) 59 | if is_valid_file(path): 60 | item = path, class_index 61 | instances.append(item) 62 | return instances 63 | 64 | 65 | def pil_loader(path): 66 | # open path as file to avoid ResourceWarning (https://github.com/python-pillow/Pillow/issues/835) 67 | with open(path, 'rb') as f: 68 | img = Image.open(f) 69 | return img.convert('RGB') 70 | 71 | 72 | def accimage_loader(path): 73 | import accimage 74 | try: 75 | return accimage.Image(path) 76 | except IOError: 77 | # Potentially a decoding problem, fall back to PIL.Image 78 | return pil_loader(path) 79 | 80 | 81 | def default_loader(path): 82 | from torchvision import get_image_backend 83 | if get_image_backend() == 'accimage': 84 | return accimage_loader(path) 85 | else: 86 | return pil_loader(path) 87 | 88 | 89 | IMG_EXTENSIONS = ('.jpg', '.jpeg', '.png', '.ppm', '.bmp', '.pgm', '.tif', '.tiff', '.webp') 90 | 91 | 92 | 93 | class YFCC_CL_Dataset_offline_val(Dataset): 94 | def __init__(self, args, loader = default_loader, extensions=IMG_EXTENSIONS, transform=None, 95 | target_transform=None): 96 | 97 | fname = args.data_val 98 | root = args.root 99 | 100 | print("YFCC_CL dataset loader = {}; extensions = {}".format(loader, extensions)) 101 | 102 | sys.stdout.flush() 103 | 104 | if isinstance(fname, torch._six.string_classes): 105 | fname = os.path.expanduser(fname) 106 | self.fname = fname 107 | 108 | self.transform = transform 109 | self.target_transform = target_transform 110 | 111 | self.labels, self.time_taken, self.user, self.store_loc = self._make_data(self.fname, root = root) 112 | if len(self.labels) == 0: 113 | msg = "Found 0 files in subfolders of: {}\n".format(self.fname) 114 | if extensions is not None: 115 | msg += "Supported extensions are: {}".format(",".join(extensions)) 116 | raise RuntimeError(msg) 117 | 118 | self.loader = loader 119 | self.extensions = extensions 120 | 
self.root = root 121 | self.batch_size = torch.cuda.device_count()*args.batch_size 122 | print("root = {}; time_taken (an example) = {}; time_taken.len = {}; batch_size = {}".format(root, self.time_taken[1000], len(self.time_taken), self.batch_size)) 123 | 124 | def _make_data(self, fname, root): 125 | # read data 126 | fval = open(fname, 'r') 127 | lines_val = fval.readlines() 128 | labels = [None] * len(lines_val) 129 | time = [None] * len(lines_val) 130 | user = [None] * len(lines_val) 131 | store_loc = [None] * len(lines_val) 132 | 133 | for i in range(len(lines_val)): 134 | line_splitted = lines_val[i].split(",") 135 | labels[i] = int(line_splitted[0]) 136 | time[i] = int(line_splitted[2]) 137 | user[i] = line_splitted[3] 138 | store_loc[i] = line_splitted[-1][:-1] 139 | return labels, time, user, store_loc 140 | 141 | def __getitem__(self, index): 142 | if self.root is not None: 143 | path = self.root + self.store_loc[index] 144 | else: 145 | path = self.store_loc[index] 146 | sample = self.loader(path) 147 | if self.transform is not None: 148 | sample = self.transform(sample) 149 | 150 | return sample, self.labels[index], self.time_taken[index], index 151 | 152 | 153 | def __len__(self): 154 | return len(self.labels) 155 | 156 | 157 | 158 | class YFCC_CL_Dataset_offline_train(Dataset): 159 | def __init__(self, args, loader = default_loader, extensions=IMG_EXTENSIONS, transform=None, 160 | target_transform=None): 161 | 162 | fname = args.data 163 | root = args.root 164 | 165 | print("YFCC_CL dataset loader = {}; extensions = {}".format(loader, extensions)) 166 | 167 | sys.stdout.flush() 168 | 169 | if isinstance(fname, torch._six.string_classes): 170 | fname = os.path.expanduser(fname) 171 | self.fname = fname 172 | 173 | # for backwards-compatibility 174 | self.transform = transform 175 | self.target_transform = target_transform 176 | 177 | self._make_data() 178 | self.used_data_start = int(len(self.labels)*args.used_data_rate_start) 179 | self.used_data_end = min(len(self.labels), int(len(self.labels)*args.used_data_rate_end)) 180 | self.data_size = self.used_data_end-self.used_data_start + 1 181 | self.data_size_per_epoch = int(args.num_passes*self.data_size)//int(args.epochs) 182 | 183 | if len(self.labels) == 0: 184 | msg = "Found 0 files in subfolders of: {}\n".format(self.fname) 185 | if extensions is not None: 186 | msg += "Supported extensions are: {}".format(",".join(extensions)) 187 | raise RuntimeError(msg) 188 | 189 | self.loader = loader 190 | self.extensions = extensions 191 | self.root = root 192 | self.batch_size = torch.cuda.device_count()*args.batch_size 193 | print("root = {}; time_taken (an example) = {}; time_taken.len = {}; batch_size = {}; self.used_data_start = {}; self.used_data_end = {}; self.data_size_per_epoch = {}".format(root, self.time[1000], len(self.time), self.batch_size, self.used_data_start, self.used_data_end, self.data_size_per_epoch)) 194 | 195 | def _make_data(self): 196 | # only read labels and times to save storage 197 | self.labels = torch.load(self.fname+'train_labels.torchSave') 198 | # labels = [None] * len(labels) 199 | self.time = torch.load(self.fname+'train_time.torchSave') 200 | self.user = [None] * len(self.labels) 201 | self.store_loc = [None] * len(self.labels) 202 | self.idx_data = [] 203 | 204 | 205 | def _change_data_range(self, idx_data = None): 206 | if idx_data is None: 207 | # generate idx_data 208 | idx_data = self.used_data_start+torch.randperm(self.data_size)[:self.data_size_per_epoch] 209 | self.idx_data = idx_data 210 
| # read user and store_locs 211 | self.user = [None] * len(self.labels) 212 | self.store_loc = [None] * len(self.labels) 213 | 214 | tmp_user = torch.load(self.fname+'train_user.torchSave') 215 | tmp_loc = torch.load(self.fname+'train_store_loc.torchSave') 216 | print("change data range to {}/{}".format(self.idx_data.min(), self.idx_data.max())) 217 | for i in range(len(self.idx_data)): 218 | self.idx_data[i] = min(len(self.labels)-1, self.idx_data[i]) 219 | self.user[self.idx_data[i]] = tmp_user[self.idx_data[i]] 220 | self.store_loc[self.idx_data[i]] = tmp_loc[self.idx_data[i]][:-1] 221 | 222 | def __getitem__(self, index): 223 | if self.root is not None: 224 | path = self.root + self.store_loc[self.idx_data[index]] 225 | else: 226 | path = self.store_loc[self.idx_data[index]] 227 | sample = self.loader(path) 228 | 229 | 230 | if self.transform is not None: 231 | sample = self.transform(sample) 232 | 233 | return sample, self.labels[self.idx_data[index]], self.time[self.idx_data[index]], self.idx_data[index] 234 | 235 | 236 | def __len__(self): 237 | return len(self.idx_data) 238 | 239 | 240 | -------------------------------------------------------------------------------- /CLOC/code_online/best_model/__pycache__/yfcc100m_dataset.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/IntelLabs/continuallearning/ee7eb9e8550c4f74a6432475ebae37cc05021535/CLOC/code_online/best_model/__pycache__/yfcc100m_dataset.cpython-37.pyc -------------------------------------------------------------------------------- /CLOC/code_online/best_model/__pycache__/yfcc100m_dataset.cpython-39.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/IntelLabs/continuallearning/ee7eb9e8550c4f74a6432475ebae37cc05021535/CLOC/code_online/best_model/__pycache__/yfcc100m_dataset.cpython-39.pyc -------------------------------------------------------------------------------- /CLOC/code_online/best_model/groupNorm/__pycache__/group_norm.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/IntelLabs/continuallearning/ee7eb9e8550c4f74a6432475ebae37cc05021535/CLOC/code_online/best_model/groupNorm/__pycache__/group_norm.cpython-37.pyc -------------------------------------------------------------------------------- /CLOC/code_online/best_model/groupNorm/__pycache__/group_norm.cpython-39.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/IntelLabs/continuallearning/ee7eb9e8550c4f74a6432475ebae37cc05021535/CLOC/code_online/best_model/groupNorm/__pycache__/group_norm.cpython-39.pyc -------------------------------------------------------------------------------- /CLOC/code_online/best_model/groupNorm/__pycache__/resnet.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/IntelLabs/continuallearning/ee7eb9e8550c4f74a6432475ebae37cc05021535/CLOC/code_online/best_model/groupNorm/__pycache__/resnet.cpython-37.pyc -------------------------------------------------------------------------------- /CLOC/code_online/best_model/groupNorm/__pycache__/resnet.cpython-39.pyc: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/IntelLabs/continuallearning/ee7eb9e8550c4f74a6432475ebae37cc05021535/CLOC/code_online/best_model/groupNorm/__pycache__/resnet.cpython-39.pyc -------------------------------------------------------------------------------- /CLOC/code_online/best_model/groupNorm/group_norm.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn.functional as F 3 | from torch.nn.modules.batchnorm import _BatchNorm 4 | 5 | 6 | def group_norm(input, group, running_mean, running_var, weight=None, bias=None, 7 | use_input_stats=True, momentum=0.1, eps=1e-5): 8 | r"""Applies Group Normalization for channels in the same group in each data sample in a 9 | batch. 10 | 11 | See :class:`~torch.nn.GroupNorm1d`, :class:`~torch.nn.GroupNorm2d`, 12 | :class:`~torch.nn.GroupNorm3d` for details. 13 | """ 14 | if not use_input_stats and (running_mean is None or running_var is None): 15 | raise ValueError('Expected running_mean and running_var to be not None when use_input_stats=False') 16 | 17 | b, c = input.size(0), input.size(1) 18 | if weight is not None: 19 | weight = weight.repeat(b) 20 | if bias is not None: 21 | bias = bias.repeat(b) 22 | 23 | def _instance_norm(input, group, running_mean=None, running_var=None, weight=None, 24 | bias=None, use_input_stats=None, momentum=None, eps=None): 25 | # Repeat stored stats and affine transform params if necessary 26 | if running_mean is not None: 27 | running_mean_orig = running_mean 28 | running_mean = running_mean_orig.repeat(b) 29 | if running_var is not None: 30 | running_var_orig = running_var 31 | running_var = running_var_orig.repeat(b) 32 | 33 | #norm_shape = [1, b * c / group, group] 34 | #print(norm_shape) 35 | # Apply instance norm 36 | input_reshaped = input.contiguous().view(1, int(b * c/group), group, *input.size()[2:]) 37 | 38 | out = F.batch_norm( 39 | input_reshaped, running_mean, running_var, weight=weight, bias=bias, 40 | training=use_input_stats, momentum=momentum, eps=eps) 41 | 42 | # Reshape back 43 | if running_mean is not None: 44 | running_mean_orig.copy_(running_mean.view(b, int(c/group)).mean(0, keepdim=False)) 45 | if running_var is not None: 46 | running_var_orig.copy_(running_var.view(b, int(c/group)).mean(0, keepdim=False)) 47 | 48 | return out.view(b, c, *input.size()[2:]) 49 | return _instance_norm(input, group, running_mean=running_mean, 50 | running_var=running_var, weight=weight, bias=bias, 51 | use_input_stats=use_input_stats, momentum=momentum, 52 | eps=eps) 53 | 54 | 55 | class _GroupNorm(_BatchNorm): 56 | def __init__(self, num_features, num_groups=1, eps=1e-5, momentum=0.1, 57 | affine=False, track_running_stats=False): 58 | self.num_groups = num_groups 59 | self.track_running_stats = track_running_stats 60 | super(_GroupNorm, self).__init__(int(num_features/num_groups), eps, 61 | momentum, affine, track_running_stats) 62 | 63 | def _check_input_dim(self, input): 64 | return NotImplemented 65 | 66 | def forward(self, input): 67 | self._check_input_dim(input) 68 | 69 | return group_norm( 70 | input, self.num_groups, self.running_mean, self.running_var, self.weight, self.bias, 71 | self.training or not self.track_running_stats, self.momentum, self.eps) 72 | 73 | 74 | class GroupNorm2d(_GroupNorm): 75 | r"""Applies Group Normalization over a 4D input (a mini-batch of 2D inputs 76 | with additional channel dimension) as described in the paper 77 | https://arxiv.org/pdf/1803.08494.pdf 78 | `Group Normalization`_ . 

    Args:
        num_features: :math:`C` from an expected input of size
            :math:`(N, C, H, W)`
        num_groups:
        eps: a value added to the denominator for numerical stability. Default: 1e-5
        momentum: the value used for the running_mean and running_var computation. Default: 0.1
        affine: a boolean value that when set to ``True``, this module has
            learnable affine parameters. Default: ``False``
        track_running_stats: a boolean value that when set to ``True``, this
            module tracks the running mean and variance, and when set to ``False``,
            this module does not track such statistics and always uses batch
            statistics in both training and eval modes. Default: ``False``

    Shape:
        - Input: :math:`(N, C, H, W)`
        - Output: :math:`(N, C, H, W)` (same shape as input)

    Examples:
        >>> # Without Learnable Parameters
        >>> m = GroupNorm2d(100, 4)
        >>> # With Learnable Parameters
        >>> m = GroupNorm2d(100, 4, affine=True)
        >>> input = torch.randn(20, 100, 35, 45)
        >>> output = m(input)

    """

    def _check_input_dim(self, input):
        if input.dim() != 4:
            raise ValueError('expected 4D input (got {}D input)'
                             .format(input.dim()))


class GroupNorm3d(_GroupNorm):
    """
    Assume the data format is (B, C, D, H, W)
    """
    def _check_input_dim(self, input):
        if input.dim() != 5:
            raise ValueError('expected 5D input (got {}D input)'
                             .format(input.dim()))


--------------------------------------------------------------------------------
/CLOC/code_online/best_model/groupNorm/resnet.py:
--------------------------------------------------------------------------------
import torch.nn as nn
import math
import torch.utils.model_zoo as model_zoo
from .group_norm import GroupNorm2d

__all__ = ['ResNet', 'resnet18', 'resnet34', 'resnet50', 'resnet101',
           'resnet152']


model_urls = {
    'resnet18': 'https://download.pytorch.org/models/resnet18-5c106cde.pth',
    'resnet34': 'https://download.pytorch.org/models/resnet34-333f7ec4.pth',
    'resnet50': 'https://download.pytorch.org/models/resnet50-19c8e357.pth',
    'resnet101': 'https://download.pytorch.org/models/resnet101-5d3b4d8f.pth',
    'resnet152': 'https://download.pytorch.org/models/resnet152-b121ed2d.pth',
}


def conv3x3(in_planes, out_planes, stride=1):
    """3x3 convolution with padding"""
    return nn.Conv2d(in_planes, out_planes, kernel_size=3, stride=stride,
                     padding=1, bias=False)

def norm2d(planes, num_channels_per_group=32):
    print("num_channels_per_group:{}".format(num_channels_per_group))
    if num_channels_per_group > 0:
        return GroupNorm2d(planes, num_channels_per_group, affine=True,
                           track_running_stats=False)
    else:
        return nn.BatchNorm2d(planes)


class BasicBlock(nn.Module):
    expansion = 1

    def __init__(self, inplanes, planes, stride=1, downsample=None,
                 group_norm=0):
        super(BasicBlock, self).__init__()
        self.conv1 = conv3x3(inplanes, planes, stride)
        self.bn1 = norm2d(planes, group_norm)
        self.relu = nn.ReLU(inplace=True)
        self.conv2 = conv3x3(planes, planes)
        self.bn2 = norm2d(planes, group_norm)
        self.downsample = downsample
        self.stride = stride

    def forward(self, x):
        residual = x

        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)

        out = self.conv2(out)
55 | out = self.bn2(out) 56 | 57 | if self.downsample is not None: 58 | residual = self.downsample(x) 59 | 60 | out += residual 61 | out = self.relu(out) 62 | 63 | return out 64 | 65 | 66 | class Bottleneck(nn.Module): 67 | expansion = 4 68 | 69 | def __init__(self, inplanes, planes, stride=1, downsample=None, 70 | group_norm=0): 71 | super(Bottleneck, self).__init__() 72 | self.conv1 = nn.Conv2d(inplanes, planes, kernel_size=1, bias=False) 73 | self.bn1 = norm2d(planes, group_norm) 74 | self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=stride, 75 | padding=1, bias=False) 76 | self.bn2 = norm2d(planes, group_norm) 77 | self.conv3 = nn.Conv2d(planes, planes * 4, kernel_size=1, bias=False) 78 | self.bn3 = norm2d(planes * 4, group_norm) 79 | self.relu = nn.ReLU(inplace=True) 80 | self.downsample = downsample 81 | self.stride = stride 82 | 83 | def forward(self, x): 84 | residual = x 85 | 86 | out = self.conv1(x) 87 | out = self.bn1(out) 88 | out = self.relu(out) 89 | 90 | out = self.conv2(out) 91 | out = self.bn2(out) 92 | out = self.relu(out) 93 | 94 | out = self.conv3(out) 95 | out = self.bn3(out) 96 | 97 | if self.downsample is not None: 98 | residual = self.downsample(x) 99 | 100 | out += residual 101 | out = self.relu(out) 102 | 103 | return out 104 | 105 | 106 | class ResNet(nn.Module): 107 | 108 | def __init__(self, block, layers, num_classes=1000, group_norm=0): 109 | self.inplanes = 64 110 | super(ResNet, self).__init__() 111 | self.conv1 = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3, 112 | bias=False) 113 | self.bn1 = norm2d(64, group_norm) 114 | self.relu = nn.ReLU(inplace=True) 115 | self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1) 116 | self.layer1 = self._make_layer(block, 64, layers[0], 117 | group_norm=group_norm) 118 | self.layer2 = self._make_layer(block, 128, layers[1], stride=2, 119 | group_norm=group_norm) 120 | self.layer3 = self._make_layer(block, 256, layers[2], stride=2, 121 | group_norm=group_norm) 122 | self.layer4 = self._make_layer(block, 512, layers[3], stride=2, 123 | group_norm=group_norm) 124 | self.avgpool = nn.AvgPool2d(7, stride=1) 125 | self.fc = nn.Linear(512 * block.expansion, num_classes) 126 | 127 | for m in self.modules(): 128 | if isinstance(m, nn.Conv2d): 129 | n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels 130 | m.weight.data.normal_(0, math.sqrt(2. 
/ n)) 131 | elif isinstance(m, nn.BatchNorm2d): 132 | m.weight.data.fill_(1) 133 | m.bias.data.zero_() 134 | elif isinstance(m, GroupNorm2d): 135 | m.weight.data.fill_(1) 136 | m.bias.data.zero_() 137 | 138 | for m in self.modules(): 139 | if isinstance(m, Bottleneck): 140 | m.bn3.weight.data.fill_(0) 141 | if isinstance(m, BasicBlock): 142 | m.bn2.weight.data.fill_(0) 143 | 144 | 145 | def _make_layer(self, block, planes, blocks, stride=1, group_norm=0): 146 | downsample = None 147 | if stride != 1 or self.inplanes != planes * block.expansion: 148 | downsample = nn.Sequential( 149 | nn.Conv2d(self.inplanes, planes * block.expansion, 150 | kernel_size=1, stride=stride, bias=False), 151 | norm2d(planes * block.expansion, group_norm), 152 | ) 153 | 154 | layers = [] 155 | layers.append(block(self.inplanes, planes, stride, downsample, 156 | group_norm)) 157 | self.inplanes = planes * block.expansion 158 | for i in range(1, blocks): 159 | layers.append(block(self.inplanes, planes, group_norm=group_norm)) 160 | 161 | return nn.Sequential(*layers) 162 | 163 | def forward(self, x): 164 | x = self.conv1(x) 165 | x = self.bn1(x) 166 | x = self.relu(x) 167 | x = self.maxpool(x) 168 | 169 | x = self.layer1(x) 170 | x = self.layer2(x) 171 | x = self.layer3(x) 172 | x = self.layer4(x) 173 | 174 | x = self.avgpool(x) 175 | x = x.view(x.size(0), -1) 176 | x = self.fc(x) 177 | 178 | return x 179 | 180 | 181 | def resnet18(pretrained=False, **kwargs): 182 | """Constructs a ResNet-18 model. 183 | 184 | Args: 185 | pretrained (bool): If True, returns a model pre-trained on ImageNet 186 | """ 187 | model = ResNet(BasicBlock, [2, 2, 2, 2], **kwargs) 188 | if pretrained: 189 | model.load_state_dict(model_zoo.load_url(model_urls['resnet18'])) 190 | return model 191 | 192 | 193 | def resnet34(pretrained=False, **kwargs): 194 | """Constructs a ResNet-34 model. 195 | 196 | Args: 197 | pretrained (bool): If True, returns a model pre-trained on ImageNet 198 | """ 199 | model = ResNet(BasicBlock, [3, 4, 6, 3], **kwargs) 200 | if pretrained: 201 | model.load_state_dict(model_zoo.load_url(model_urls['resnet34'])) 202 | return model 203 | 204 | 205 | def resnet50(pretrained=False, **kwargs): 206 | """Constructs a ResNet-50 model. 207 | 208 | Args: 209 | pretrained (bool): If True, returns a model pre-trained on ImageNet 210 | """ 211 | model = ResNet(Bottleneck, [3, 4, 6, 3], **kwargs) 212 | if pretrained: 213 | model.load_state_dict(model_zoo.load_url(model_urls['resnet50'])) 214 | return model 215 | 216 | 217 | def resnet101(pretrained=False, **kwargs): 218 | """Constructs a ResNet-101 model. 219 | 220 | Args: 221 | pretrained (bool): If True, returns a model pre-trained on ImageNet 222 | """ 223 | model = ResNet(Bottleneck, [3, 4, 23, 3], **kwargs) 224 | if pretrained: 225 | model.load_state_dict(model_zoo.load_url(model_urls['resnet101'])) 226 | return model 227 | 228 | 229 | def resnet152(pretrained=False, **kwargs): 230 | """Constructs a ResNet-152 model. 
231 | 232 | Args: 233 | pretrained (bool): If True, returns a model pre-trained on ImageNet 234 | """ 235 | model = ResNet(Bottleneck, [3, 8, 36, 3], **kwargs) 236 | if pretrained: 237 | model.load_state_dict(model_zoo.load_url(model_urls['resnet152'])) 238 | return model 239 | -------------------------------------------------------------------------------- /CLOC/code_online/best_model/yfcc100m_dataset.py: -------------------------------------------------------------------------------- 1 | import torch 2 | from torchvision.datasets.vision import StandardTransform 3 | from torch.utils.data import Dataset, IterableDataset 4 | from PIL import Image 5 | 6 | import os 7 | import os.path 8 | import csv 9 | 10 | import sys 11 | import math 12 | import random 13 | import matplotlib.pyplot as plt 14 | 15 | 16 | def has_file_allowed_extension(filename, extensions): 17 | """Checks if a file is an allowed extension. 18 | 19 | Args: 20 | filename (string): path to a file 21 | extensions (tuple of strings): extensions to consider (lowercase) 22 | 23 | Returns: 24 | bool: True if the filename ends with one of given extensions 25 | """ 26 | return filename.lower().endswith(extensions) 27 | 28 | 29 | def is_image_file(filename): 30 | """Checks if a file is an allowed image extension. 31 | 32 | Args: 33 | filename (string): path to a file 34 | 35 | Returns: 36 | bool: True if the filename ends with a known image extension 37 | """ 38 | return has_file_allowed_extension(filename, IMG_EXTENSIONS) 39 | 40 | 41 | def make_dataset(directory, class_to_idx, extensions=None, is_valid_file=None): 42 | instances = [] 43 | directory = os.path.expanduser(directory) 44 | both_none = extensions is None and is_valid_file is None 45 | both_something = extensions is not None and is_valid_file is not None 46 | if both_none or both_something: 47 | raise ValueError("Both extensions and is_valid_file cannot be None or not None at the same time") 48 | if extensions is not None: 49 | def is_valid_file(x): 50 | return has_file_allowed_extension(x, extensions) 51 | for target_class in sorted(class_to_idx.keys()): 52 | class_index = class_to_idx[target_class] 53 | target_dir = os.path.join(directory, target_class) 54 | if not os.path.isdir(target_dir): 55 | continue 56 | for root, _, fnames in sorted(os.walk(target_dir, followlinks=True)): 57 | for fname in sorted(fnames): 58 | path = os.path.join(root, fname) 59 | if is_valid_file(path): 60 | item = path, class_index 61 | instances.append(item) 62 | return instances 63 | 64 | 65 | def pil_loader(path): 66 | # open path as file to avoid ResourceWarning (https://github.com/python-pillow/Pillow/issues/835) 67 | with open(path, 'rb') as f: 68 | img = Image.open(f) 69 | return img.convert('RGB') 70 | 71 | 72 | def accimage_loader(path): 73 | import accimage 74 | try: 75 | return accimage.Image(path) 76 | except IOError: 77 | # Potentially a decoding problem, fall back to PIL.Image 78 | return pil_loader(path) 79 | 80 | 81 | def default_loader(path): 82 | from torchvision import get_image_backend 83 | if get_image_backend() == 'accimage': 84 | return accimage_loader(path) 85 | else: 86 | return pil_loader(path) 87 | 88 | 89 | IMG_EXTENSIONS = ('.jpg', '.jpeg', '.png', '.ppm', '.bmp', '.pgm', '.tif', '.tiff', '.webp') 90 | 91 | 92 | 93 | class YFCC_CL_Dataset_offline_val(Dataset): 94 | """A data loader for YFCC100M dataset 95 | """ 96 | def __init__(self, args, loader = default_loader, extensions=IMG_EXTENSIONS, transform=None, 97 | target_transform=None): 98 | 99 | fname = args.data_val 
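# Note: args.data_val points at a CSV metadata file; as _make_data below shows,
# each row stores the class label (column 0), the capture timestamp (column 2),
# the user ID (column 3) and, in the last column, the image path relative to args.root.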
100 | root = args.root
101 | 
102 | print("YFCC_CL dataset loader = {}; extensions = {}".format(loader, extensions))
103 | 
104 | sys.stdout.flush()
105 | 
106 | if isinstance(fname, torch._six.string_classes):
107 | fname = os.path.expanduser(fname)
108 | self.fname = fname
109 | 
110 | # for backwards-compatibility
111 | self.transform = transform
112 | self.target_transform = target_transform
113 | 
114 | self.labels, self.time_taken, self.user, self.store_loc = self._make_data(self.fname, root = root)
115 | if len(self.labels) == 0:
116 | msg = "Found 0 files in subfolders of: {}\n".format(self.fname)
117 | if extensions is not None:
118 | msg += "Supported extensions are: {}".format(",".join(extensions))
119 | raise RuntimeError(msg)
120 | 
121 | self.loader = loader
122 | self.extensions = extensions
123 | self.root = root
124 | 
125 | # for replay buffer
126 | self.batch_size = torch.cuda.device_count()*args.batch_size
127 | self.is_forward = True
128 | self.offset = 0
129 | print("root = {}; time_taken (an example) = {}; time_taken.len = {}; batch_size = {}".format(root, self.time_taken[1000], len(self.time_taken), self.batch_size))
130 | 
131 | def _make_data(self, fname, root):
132 | # read data (use a context manager so the file is closed promptly)
133 | with open(fname, 'r') as fval:
134 | lines_val = fval.readlines()
135 | labels = [None] * len(lines_val)
136 | time = [None] * len(lines_val)
137 | user = [None] * len(lines_val)
138 | store_loc = [None] * len(lines_val)
139 | 
140 | for i in range(len(lines_val)):
141 | line_splitted = lines_val[i].split(",")
142 | labels[i] = int(line_splitted[0])
143 | time[i] = int(line_splitted[2])
144 | user[i] = line_splitted[3]
145 | store_loc[i] = line_splitted[-1][:-1] # drop the trailing newline
146 | return labels, time, user, store_loc
147 | 
148 | def set_transfer_time_point(self, args, val_set, time_last, is_forward = True):
149 | # find the first idx whose timestamp is >= time_last
150 | self.is_forward = is_forward
151 | for i in range(len(self.time_taken)):
152 | if self.time_taken[i] >= time_last:
153 | print("[set_transfer_time_point]: time_last = {}; time[{}] = {}".format(time_last, i, self.time_taken[i]))
154 | self.offset = i
155 | return
156 | self.offset = len(self.time_taken) - 1
157 | 
158 | def __getitem__(self, index):
159 | if self.is_forward:
160 | index = min(len(self.labels) - 1, index + self.offset) # clamp to the last valid index
161 | else:
162 | index = max(0, self.offset - index)
163 | 
164 | if self.root is not None:
165 | path = self.root + self.store_loc[index]
166 | else:
167 | path = self.store_loc[index]
168 | sample = self.loader(path)
169 | if self.transform is not None:
170 | sample = self.transform(sample)
171 | 
172 | return sample, self.labels[index], self.time_taken[index], index
173 | 
174 | 
175 | def __len__(self):
176 | if self.is_forward:
177 | return len(self.labels) - self.offset
178 | else:
179 | return self.offset + 1
180 | 
181 | 
182 | class YFCC_CL_Dataset_online(Dataset):
183 | 
184 | def __init__(self, args, loader = default_loader, extensions=IMG_EXTENSIONS, transform=None, transform_RepBuf = None,
185 | target_transform=None, target_transform_RepBuf = None, trans_test = None):
186 | 
187 | fname = args.data
188 | root = args.root
189 | size_buf = args.size_replay_buffer
190 | 
191 | print("YFCC_CL dataset loader = {}; extensions = {}".format(loader, extensions))
192 | 
193 | sys.stdout.flush()
194 | 
195 | if isinstance(fname, torch._six.string_classes):
196 | fname = os.path.expanduser(fname)
197 | self.fname = fname
198 | 
199 | # for backwards-compatibility
200 | self.transform = transform
201 | self.transform_test = trans_test
202 | self.target_transform = target_transform
203 | self.transform_RepBuf = transform_RepBuf
204 | self.target_transform_RepBuf = target_transform_RepBuf
205 | 
206 | # valid initial and final index
207 | self.used_data_start = args.used_data_start
208 | self.used_data_end = args.used_data_end
209 | 
210 | self._make_data()
211 | self.data_size = len(self.labels)
212 | self.data_size_per_epoch = math.ceil(self.data_size/args.epochs) # an "epoch" is one consecutive chunk of the stream, not a full pass
213 | 
214 | self.batch_size = torch.cuda.device_count()*args.batch_size
215 | 
216 | # round the per-epoch data size up to a multiple of the total batch size
217 | if self.data_size_per_epoch % self.batch_size != 0:
218 | self.data_size_per_epoch = self.data_size_per_epoch - self.data_size_per_epoch % self.batch_size + self.batch_size
219 | 
220 | 
221 | self.loader = loader
222 | self.extensions = extensions
223 | self.root = root
224 | self.size_buf = size_buf
225 | 
226 | self.repBuf_sample_rate = 0.5 # fraction of each training batch drawn from the replay buffer
227 | self.sampling_strategy = args.sampling_strategy
228 | 
229 | self.NOSubBatch = args.NOSubBatch
230 | self.SubBatch_index_offset = int(self.batch_size/self.NOSubBatch)
231 | 
232 | self.gradient_steps_per_batch = int(args.gradient_steps_per_batch) # repBatch_rate is multiplied by this
233 | self.repBatch_rate = math.ceil(self.repBuf_sample_rate/(1-self.repBuf_sample_rate))
234 | 
235 | self.repType = args.ReplayType
236 | 
237 | print("[initData]: root = {}; time_taken (an example) = {}; time_taken.len = {}; batch_size = {}; size_buf = {}; repBuf_sample_rate = {}".format(root, self.time_taken[1000], len(self.time_taken), self.batch_size, self.size_buf, self.repBuf_sample_rate))
238 | print("[initData]: repBatch_rate = {}; NOSubBatch = {}; SubBatch_index_offset = {}".format(self.repBatch_rate, self.NOSubBatch, self.SubBatch_index_offset))
239 | print("[initData]: transform = {}; transform_RepBuf = {}".format(self.transform, self.transform_RepBuf))
240 | 
241 | def _make_data(self):
242 | # only read labels and times to save storage
243 | self.labels = torch.load(self.fname+'train_labels.torchSave')
244 | self.time_taken = torch.load(self.fname+'train_time.torchSave')
245 | self.user = torch.load(self.fname+'train_userID.torchSave')
246 | self.store_loc = [None] * len(self.labels)
247 | self.idx_data = []
248 | 
249 | def _change_data_range_FIFO(self):
250 | print("reading store location from {}".format(self.fname+'train_store_loc.torchSave'))
251 | tmp_loc = torch.load(self.fname+'train_store_loc.torchSave')
252 | print("tmp_loc.size = {}".format(len(tmp_loc)))
253 | for i in range(self.used_data_start, self.used_data_end):
254 | self.store_loc[i] = tmp_loc[i][:-1]
255 | if i % 1e5 == 0:
256 | print("store_loc[{}] = {}".format(i, self.store_loc[i]))
257 | sys.stdout.flush()
258 | 
259 | 
260 | def _set_data_idx(self, epoch):
261 | # setup training data idx
262 | self.offset = epoch*self.data_size_per_epoch
263 | size_curr_epoch = min(self.data_size - self.offset, self.data_size_per_epoch)
264 | size_curr_epoch = size_curr_epoch - (size_curr_epoch % self.batch_size)
265 | 
266 | batchSize = int(self.batch_size/self.NOSubBatch)
267 | self.data_idx = [None]* (size_curr_epoch *self.gradient_steps_per_batch)
268 | iter_total = size_curr_epoch//batchSize
269 | bsReplicated = batchSize*self.gradient_steps_per_batch
270 | for i in range(0, int(iter_total)):
271 | self.data_idx[i*bsReplicated:(i+1)*bsReplicated] = list(range(self.offset+i*batchSize, self.offset+(i+1)*batchSize))*self.gradient_steps_per_batch
272 | 
273 | 
274 | def _change_data_range(self, epoch = 0):
275 | self._set_data_idx(epoch)
276 | 
277 | self.used_data_start = max(0, epoch*self.data_size_per_epoch-self.size_buf-self.batch_size)
278 | self.used_data_end = min(len(self.labels), (epoch+1)*self.data_size_per_epoch+self.batch_size)
279 | print("change valid data range to: [{},{}]".format(self.used_data_start, self.used_data_end))
280 | 
281 | self.store_loc = [None] * len(self.labels)
282 | self._change_data_range_FIFO()
283 | 
284 | 
285 | def _sample_FIFO(self, index):
286 | if index < self.batch_size:
287 | return 0
288 | else:
289 | repBuf_idx = random.randint(max(0, index-self.size_buf-self.batch_size), index-self.batch_size)
290 | return repBuf_idx
291 | 
292 | 
293 | def _sample(self, index):
294 | return self._sample_FIFO(index)
295 | 
296 | def __getitem__(self, index):
297 | index = self.data_idx[index]
298 | index_pop = index
299 | 
300 | if index_pop < 0:
301 | index_pop = 0
302 | is_valid = torch.tensor(0)
303 | else:
304 | is_valid = torch.tensor(1)
305 | 
306 | num_batches = 1
307 | 
308 | target_pop = self.labels[index_pop]
309 | path_pop = self.root + self.store_loc[index_pop]
310 | sample_pop = self.loader(path_pop)
311 | if self.transform is not None:
312 | sample_pop = self.transform(sample_pop)
313 | if self.target_transform is not None:
314 | target_pop = self.target_transform(target_pop)
315 | 
316 | 
317 | sample_test = torch.zeros(self.NOSubBatch, sample_pop.size()[0], sample_pop.size()[1], sample_pop.size()[2])
318 | target_test = torch.zeros(self.NOSubBatch).long()
319 | test_idx = torch.zeros(self.NOSubBatch).long()
320 | time_taken_test = torch.zeros(self.NOSubBatch).long()
321 | user_test = torch.zeros(self.NOSubBatch).long()
322 | 
323 | # get the test data of full batch size
324 | for i in range(0, self.NOSubBatch):
325 | test_idx[i] = index+i*self.SubBatch_index_offset
326 | if test_idx[i] >= len(self.time_taken):
327 | test_idx[i] = len(self.time_taken) - 1
328 | 
329 | time_taken_test[i] = self.time_taken[test_idx[i]]
330 | target_test[i] = self.labels[test_idx[i]]
331 | user_test[i] = self.user[test_idx[i]]
332 | 
333 | path = self.root + self.store_loc[test_idx[i]]
334 | sample = self.loader(path)
335 | if self.transform_test is not None: # guard must match the transform applied below
336 | sample_test[i] = self.transform_test(sample)
337 | if self.target_transform is not None:
338 | target_test[i] = self.target_transform(target_test[i])
339 | 
340 | # randomly sample from repBuf
341 | sample_RepBuf = torch.zeros(self.repBatch_rate* num_batches, sample_pop.size()[0], sample_pop.size()[1], sample_pop.size()[2])
342 | target_RepBuf = torch.zeros(self.repBatch_rate* num_batches).long()
343 | repBuf_idx = torch.zeros(self.repBatch_rate * num_batches).long()
344 | 
345 | for i in range(0, int(self.repBatch_rate * num_batches)):
346 | repBuf_idx[i] = self._sample(index)
347 | path_RepBuf = self.root + self.store_loc[repBuf_idx[i]]
348 | target_RepBuf[i] = self.labels[repBuf_idx[i]]
349 | 
350 | sample_RepBuf_tmp = self.loader(path_RepBuf)
351 | if self.transform_RepBuf is not None:
352 | # print("performing to sample_RepBuf: {}".format(self.transform_RepBuf))
353 | sample_RepBuf[i] = self.transform_RepBuf(sample_RepBuf_tmp)
354 | if self.target_transform_RepBuf is not None:
355 | target_RepBuf[i] = self.target_transform_RepBuf(target_RepBuf[i])
356 | 
357 | return sample_test, target_test, user_test, time_taken_test, test_idx, sample_pop, target_pop, index_pop, sample_RepBuf, target_RepBuf, repBuf_idx
358 | 
359 | def __len__(self):
360 | return len(self.data_idx)
361 | 
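For reference, `_sample_FIFO` above draws replay indices uniformly from a sliding window that ends one full batch behind the current stream position. A minimal standalone sketch of that rule, with hypothetical `size_buf`/`batch_size` values that are not tied to the dataset class:

```python
import random

size_buf = 8     # stands in for args.size_replay_buffer
batch_size = 4   # stands in for the total batch size across GPUs

def sample_fifo(index):
    # mirror of _sample_FIFO: before the first full batch there is
    # nothing to replay yet, so fall back to index 0
    if index < batch_size:
        return 0
    # otherwise sample from the last size_buf stream indices that
    # precede the current (still unprocessed) batch
    return random.randint(max(0, index - size_buf - batch_size), index - batch_size)

print(sorted({sample_fifo(20) for _ in range(100)}))  # subset of [8, ..., 16]
```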
-------------------------------------------------------------------------------- /CLOC/code_online/no_PoLRS/__pycache__/yfcc100m_dataset.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/IntelLabs/continuallearning/ee7eb9e8550c4f74a6432475ebae37cc05021535/CLOC/code_online/no_PoLRS/__pycache__/yfcc100m_dataset.cpython-37.pyc -------------------------------------------------------------------------------- /CLOC/code_online/no_PoLRS/__pycache__/yfcc100m_dataset.cpython-39.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/IntelLabs/continuallearning/ee7eb9e8550c4f74a6432475ebae37cc05021535/CLOC/code_online/no_PoLRS/__pycache__/yfcc100m_dataset.cpython-39.pyc -------------------------------------------------------------------------------- /CLOC/code_online/no_PoLRS/groupNorm/__pycache__/group_norm.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/IntelLabs/continuallearning/ee7eb9e8550c4f74a6432475ebae37cc05021535/CLOC/code_online/no_PoLRS/groupNorm/__pycache__/group_norm.cpython-37.pyc -------------------------------------------------------------------------------- /CLOC/code_online/no_PoLRS/groupNorm/__pycache__/group_norm.cpython-39.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/IntelLabs/continuallearning/ee7eb9e8550c4f74a6432475ebae37cc05021535/CLOC/code_online/no_PoLRS/groupNorm/__pycache__/group_norm.cpython-39.pyc -------------------------------------------------------------------------------- /CLOC/code_online/no_PoLRS/groupNorm/__pycache__/resnet.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/IntelLabs/continuallearning/ee7eb9e8550c4f74a6432475ebae37cc05021535/CLOC/code_online/no_PoLRS/groupNorm/__pycache__/resnet.cpython-37.pyc -------------------------------------------------------------------------------- /CLOC/code_online/no_PoLRS/groupNorm/__pycache__/resnet.cpython-39.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/IntelLabs/continuallearning/ee7eb9e8550c4f74a6432475ebae37cc05021535/CLOC/code_online/no_PoLRS/groupNorm/__pycache__/resnet.cpython-39.pyc -------------------------------------------------------------------------------- /CLOC/code_online/no_PoLRS/groupNorm/group_norm.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn.functional as F 3 | from torch.nn.modules.batchnorm import _BatchNorm 4 | 5 | 6 | def group_norm(input, group, running_mean, running_var, weight=None, bias=None, 7 | use_input_stats=True, momentum=0.1, eps=1e-5): 8 | r"""Applies Group Normalization for channels in the same group in each data sample in a 9 | batch. 10 | 11 | See :class:`~torch.nn.GroupNorm1d`, :class:`~torch.nn.GroupNorm2d`, 12 | :class:`~torch.nn.GroupNorm3d` for details. 
13 | """
14 | if not use_input_stats and (running_mean is None or running_var is None):
15 | raise ValueError('Expected running_mean and running_var to be not None when use_input_stats=False')
16 | 
17 | b, c = input.size(0), input.size(1)
18 | if weight is not None:
19 | weight = weight.repeat(b)
20 | if bias is not None:
21 | bias = bias.repeat(b)
22 | 
23 | def _instance_norm(input, group, running_mean=None, running_var=None, weight=None,
24 | bias=None, use_input_stats=None, momentum=None, eps=None):
25 | # Repeat stored stats and affine transform params if necessary
26 | if running_mean is not None:
27 | running_mean_orig = running_mean
28 | running_mean = running_mean_orig.repeat(b)
29 | if running_var is not None:
30 | running_var_orig = running_var
31 | running_var = running_var_orig.repeat(b)
32 | 
33 | #norm_shape = [1, b * c / group, group]
34 | #print(norm_shape)
35 | # Apply instance norm
36 | input_reshaped = input.contiguous().view(1, int(b * c/group), group, *input.size()[2:])
37 | 
38 | out = F.batch_norm(
39 | input_reshaped, running_mean, running_var, weight=weight, bias=bias,
40 | training=use_input_stats, momentum=momentum, eps=eps)
41 | 
42 | # Reshape back
43 | if running_mean is not None:
44 | running_mean_orig.copy_(running_mean.view(b, int(c/group)).mean(0, keepdim=False))
45 | if running_var is not None:
46 | running_var_orig.copy_(running_var.view(b, int(c/group)).mean(0, keepdim=False))
47 | 
48 | return out.view(b, c, *input.size()[2:])
49 | return _instance_norm(input, group, running_mean=running_mean,
50 | running_var=running_var, weight=weight, bias=bias,
51 | use_input_stats=use_input_stats, momentum=momentum,
52 | eps=eps)
53 | 
54 | 
55 | class _GroupNorm(_BatchNorm):
56 | def __init__(self, num_features, num_groups=1, eps=1e-5, momentum=0.1,
57 | affine=False, track_running_stats=False):
58 | self.num_groups = num_groups
59 | self.track_running_stats = track_running_stats
60 | super(_GroupNorm, self).__init__(int(num_features/num_groups), eps,
61 | momentum, affine, track_running_stats)
62 | 
63 | def _check_input_dim(self, input):
64 | raise NotImplementedError # subclasses must implement the dimension check
65 | 
66 | def forward(self, input):
67 | self._check_input_dim(input)
68 | 
69 | return group_norm(
70 | input, self.num_groups, self.running_mean, self.running_var, self.weight, self.bias,
71 | self.training or not self.track_running_stats, self.momentum, self.eps)
72 | 
73 | 
74 | class GroupNorm2d(_GroupNorm):
75 | r"""Applies Group Normalization over a 4D input (a mini-batch of 2D inputs
76 | with additional channel dimension) as described in the paper
77 | `Group Normalization`_ (https://arxiv.org/pdf/1803.08494.pdf).
78 | 
79 | 
80 | Args:
81 | num_features: :math:`C` from an expected input of size
82 | :math:`(N, C, H, W)`
83 | num_groups: number of channels per normalization group (this is the value that norm2d in resnet.py passes in)
84 | eps: a value added to the denominator for numerical stability. Default: 1e-5
85 | momentum: the value used for the running_mean and running_var computation. Default: 0.1
86 | affine: a boolean value that when set to ``True``, this module has
87 | learnable affine parameters. Default: ``False``
88 | track_running_stats: a boolean value that when set to ``True``, this
89 | module tracks the running mean and variance, and when set to ``False``,
90 | this module does not track such statistics and always uses batch
91 | statistics in both training and eval modes.
Default: ``False`` 92 | 93 | Shape: 94 | - Input: :math:`(N, C, H, W)` 95 | - Output: :math:`(N, C, H, W)` (same shape as input) 96 | 97 | Examples: 98 | >>> # Without Learnable Parameters 99 | >>> m = GroupNorm2d(100, 4) 100 | >>> # With Learnable Parameters 101 | >>> m = GroupNorm2d(100, 4, affine=True) 102 | >>> input = torch.randn(20, 100, 35, 45) 103 | >>> output = m(input) 104 | 105 | """ 106 | 107 | def _check_input_dim(self, input): 108 | if input.dim() != 4: 109 | raise ValueError('expected 4D input (got {}D input)' 110 | .format(input.dim())) 111 | 112 | 113 | class GroupNorm3d(_GroupNorm): 114 | """ 115 | Assume the data format is (B, C, D, H, W) 116 | """ 117 | def _check_input_dim(self, input): 118 | if input.dim() != 5: 119 | raise ValueError('expected 5D input (got {}D input)' 120 | .format(input.dim())) 121 | 122 | 123 | -------------------------------------------------------------------------------- /CLOC/code_online/no_PoLRS/groupNorm/resnet.py: -------------------------------------------------------------------------------- 1 | import torch.nn as nn 2 | import math 3 | import torch.utils.model_zoo as model_zoo 4 | from .group_norm import GroupNorm2d 5 | 6 | __all__ = ['ResNet', 'resnet18', 'resnet34', 'resnet50', 'resnet101', 7 | 'resnet152'] 8 | 9 | 10 | model_urls = { 11 | 'resnet18': 'https://download.pytorch.org/models/resnet18-5c106cde.pth', 12 | 'resnet34': 'https://download.pytorch.org/models/resnet34-333f7ec4.pth', 13 | 'resnet50': 'https://download.pytorch.org/models/resnet50-19c8e357.pth', 14 | 'resnet101': 'https://download.pytorch.org/models/resnet101-5d3b4d8f.pth', 15 | 'resnet152': 'https://download.pytorch.org/models/resnet152-b121ed2d.pth', 16 | } 17 | 18 | 19 | def conv3x3(in_planes, out_planes, stride=1): 20 | """3x3 convolution with padding""" 21 | return nn.Conv2d(in_planes, out_planes, kernel_size=3, stride=stride, 22 | padding=1, bias=False) 23 | 24 | def norm2d(planes, num_channels_per_group=32): 25 | print("num_channels_per_group:{}".format(num_channels_per_group)) 26 | if num_channels_per_group > 0: 27 | return GroupNorm2d(planes, num_channels_per_group, affine=True, 28 | track_running_stats=False) 29 | else: 30 | return nn.BatchNorm2d(planes) 31 | 32 | 33 | class BasicBlock(nn.Module): 34 | expansion = 1 35 | 36 | def __init__(self, inplanes, planes, stride=1, downsample=None, 37 | group_norm=0): 38 | super(BasicBlock, self).__init__() 39 | self.conv1 = conv3x3(inplanes, planes, stride) 40 | self.bn1 = norm2d(planes, group_norm) 41 | self.relu = nn.ReLU(inplace=True) 42 | self.conv2 = conv3x3(planes, planes) 43 | self.bn2 = norm2d(planes, group_norm) 44 | self.downsample = downsample 45 | self.stride = stride 46 | 47 | def forward(self, x): 48 | residual = x 49 | 50 | out = self.conv1(x) 51 | out = self.bn1(out) 52 | out = self.relu(out) 53 | 54 | out = self.conv2(out) 55 | out = self.bn2(out) 56 | 57 | if self.downsample is not None: 58 | residual = self.downsample(x) 59 | 60 | out += residual 61 | out = self.relu(out) 62 | 63 | return out 64 | 65 | 66 | class Bottleneck(nn.Module): 67 | expansion = 4 68 | 69 | def __init__(self, inplanes, planes, stride=1, downsample=None, 70 | group_norm=0): 71 | super(Bottleneck, self).__init__() 72 | self.conv1 = nn.Conv2d(inplanes, planes, kernel_size=1, bias=False) 73 | self.bn1 = norm2d(planes, group_norm) 74 | self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=stride, 75 | padding=1, bias=False) 76 | self.bn2 = norm2d(planes, group_norm) 77 | self.conv3 = nn.Conv2d(planes, planes * 4, 
kernel_size=1, bias=False) 78 | self.bn3 = norm2d(planes * 4, group_norm) 79 | self.relu = nn.ReLU(inplace=True) 80 | self.downsample = downsample 81 | self.stride = stride 82 | 83 | def forward(self, x): 84 | residual = x 85 | 86 | out = self.conv1(x) 87 | out = self.bn1(out) 88 | out = self.relu(out) 89 | 90 | out = self.conv2(out) 91 | out = self.bn2(out) 92 | out = self.relu(out) 93 | 94 | out = self.conv3(out) 95 | out = self.bn3(out) 96 | 97 | if self.downsample is not None: 98 | residual = self.downsample(x) 99 | 100 | out += residual 101 | out = self.relu(out) 102 | 103 | return out 104 | 105 | 106 | class ResNet(nn.Module): 107 | 108 | def __init__(self, block, layers, num_classes=1000, group_norm=0): 109 | self.inplanes = 64 110 | super(ResNet, self).__init__() 111 | self.conv1 = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3, 112 | bias=False) 113 | self.bn1 = norm2d(64, group_norm) 114 | self.relu = nn.ReLU(inplace=True) 115 | self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1) 116 | self.layer1 = self._make_layer(block, 64, layers[0], 117 | group_norm=group_norm) 118 | self.layer2 = self._make_layer(block, 128, layers[1], stride=2, 119 | group_norm=group_norm) 120 | self.layer3 = self._make_layer(block, 256, layers[2], stride=2, 121 | group_norm=group_norm) 122 | self.layer4 = self._make_layer(block, 512, layers[3], stride=2, 123 | group_norm=group_norm) 124 | self.avgpool = nn.AvgPool2d(7, stride=1) 125 | self.fc = nn.Linear(512 * block.expansion, num_classes) 126 | 127 | for m in self.modules(): 128 | if isinstance(m, nn.Conv2d): 129 | n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels 130 | m.weight.data.normal_(0, math.sqrt(2. / n)) 131 | elif isinstance(m, nn.BatchNorm2d): 132 | m.weight.data.fill_(1) 133 | m.bias.data.zero_() 134 | elif isinstance(m, GroupNorm2d): 135 | m.weight.data.fill_(1) 136 | m.bias.data.zero_() 137 | 138 | for m in self.modules(): 139 | if isinstance(m, Bottleneck): 140 | m.bn3.weight.data.fill_(0) 141 | if isinstance(m, BasicBlock): 142 | m.bn2.weight.data.fill_(0) 143 | 144 | 145 | def _make_layer(self, block, planes, blocks, stride=1, group_norm=0): 146 | downsample = None 147 | if stride != 1 or self.inplanes != planes * block.expansion: 148 | downsample = nn.Sequential( 149 | nn.Conv2d(self.inplanes, planes * block.expansion, 150 | kernel_size=1, stride=stride, bias=False), 151 | norm2d(planes * block.expansion, group_norm), 152 | ) 153 | 154 | layers = [] 155 | layers.append(block(self.inplanes, planes, stride, downsample, 156 | group_norm)) 157 | self.inplanes = planes * block.expansion 158 | for i in range(1, blocks): 159 | layers.append(block(self.inplanes, planes, group_norm=group_norm)) 160 | 161 | return nn.Sequential(*layers) 162 | 163 | def forward(self, x): 164 | x = self.conv1(x) 165 | x = self.bn1(x) 166 | x = self.relu(x) 167 | x = self.maxpool(x) 168 | 169 | x = self.layer1(x) 170 | x = self.layer2(x) 171 | x = self.layer3(x) 172 | x = self.layer4(x) 173 | 174 | x = self.avgpool(x) 175 | x = x.view(x.size(0), -1) 176 | x = self.fc(x) 177 | 178 | return x 179 | 180 | 181 | def resnet18(pretrained=False, **kwargs): 182 | """Constructs a ResNet-18 model. 
183 | 
184 | Args:
185 | pretrained (bool): If True, returns a model pre-trained on ImageNet
186 | """
187 | model = ResNet(BasicBlock, [2, 2, 2, 2], **kwargs)
188 | if pretrained:
189 | model.load_state_dict(model_zoo.load_url(model_urls['resnet18']))
190 | return model
191 | 
192 | 
193 | def resnet34(pretrained=False, **kwargs):
194 | """Constructs a ResNet-34 model.
195 | 
196 | Args:
197 | pretrained (bool): If True, returns a model pre-trained on ImageNet
198 | """
199 | model = ResNet(BasicBlock, [3, 4, 6, 3], **kwargs)
200 | if pretrained:
201 | model.load_state_dict(model_zoo.load_url(model_urls['resnet34']))
202 | return model
203 | 
204 | 
205 | def resnet50(pretrained=False, **kwargs):
206 | """Constructs a ResNet-50 model.
207 | 
208 | Args:
209 | pretrained (bool): If True, returns a model pre-trained on ImageNet
210 | """
211 | model = ResNet(Bottleneck, [3, 4, 6, 3], **kwargs)
212 | if pretrained:
213 | model.load_state_dict(model_zoo.load_url(model_urls['resnet50']))
214 | return model
215 | 
216 | 
217 | def resnet101(pretrained=False, **kwargs):
218 | """Constructs a ResNet-101 model.
219 | 
220 | Args:
221 | pretrained (bool): If True, returns a model pre-trained on ImageNet
222 | """
223 | model = ResNet(Bottleneck, [3, 4, 23, 3], **kwargs)
224 | if pretrained:
225 | model.load_state_dict(model_zoo.load_url(model_urls['resnet101']))
226 | return model
227 | 
228 | 
229 | def resnet152(pretrained=False, **kwargs):
230 | """Constructs a ResNet-152 model.
231 | 
232 | Args:
233 | pretrained (bool): If True, returns a model pre-trained on ImageNet
234 | """
235 | model = ResNet(Bottleneck, [3, 8, 36, 3], **kwargs)
236 | if pretrained:
237 | model.load_state_dict(model_zoo.load_url(model_urls['resnet152']))
238 | return model
239 | 
--------------------------------------------------------------------------------
/CLOC/code_online/no_PoLRS/main_online.py:
--------------------------------------------------------------------------------
1 | import argparse
2 | import os
3 | import random
4 | import shutil
5 | import time
6 | import warnings
7 | import builtins
8 | import math
9 | 
10 | import torch
11 | import torch.nn as nn
12 | import torch.nn.parallel
13 | import torch.backends.cudnn as cudnn
14 | import torch.distributed as dist
15 | import torch.optim
16 | import torch.multiprocessing as mp
17 | import torch.utils.data
18 | import torch.utils.data.distributed
19 | import torchvision.transforms as transforms
20 | import torchvision.datasets as datasets
21 | import torchvision.models as models
22 | import groupNorm.resnet as models_GN
23 | from torch import randperm
24 | 
25 | import numpy as np
26 | import sys
27 | from tensorboardX import SummaryWriter
28 | from yfcc100m_dataset import YFCC_CL_Dataset_offline_val, YFCC_CL_Dataset_online
29 | 
30 | import psutil
31 | import gc
32 | 
33 | import copy
34 | 
35 | 
36 | model_names = sorted(name for name in models.__dict__
37 | if name.islower() and not name.startswith("__")
38 | and callable(models.__dict__[name]))
39 | 
40 | parser = argparse.ArgumentParser(description='PyTorch ImageNet Training')
41 | parser.add_argument('-a', '--arch', metavar='ARCH', default='resnet50',
42 | choices=model_names,
43 | help='model architecture: ' +
44 | ' | '.join(model_names) +
45 | ' (default: resnet50)')
46 | parser.add_argument('-j', '--workers', default=16, type=int, metavar='N',
47 | help='number of data loading workers (default: 16)')
48 | parser.add_argument('--epochs', default=90, type=int, metavar='N',
49 | help='number of total epochs to run (in the online setting, the stream is split into this many consecutive chunks)')
50 | parser.add_argument('--start-epoch', default=0, type=int, metavar='N',
51 | help='manual epoch number (useful on restarts)')
52 | parser.add_argument('-b', '--batch-size', default=256, type=int,
53 | metavar='N',
54 | help='mini-batch size (default: 256), this is the total '
55 | 'batch size of all GPUs on the current node when '
56 | 'using Data Parallel or Distributed Data Parallel')
57 | parser.add_argument('--gradient_accumulation_steps', default=1, type=int,
58 | help='number of gradient accumulation steps')
59 | 
60 | parser.add_argument('--lr', '--learning-rate', default=0.1, type=float,
61 | metavar='LR', help='initial learning rate', dest='lr')
62 | parser.add_argument('--momentum', default=0.9, type=float, metavar='M',
63 | help='momentum')
64 | parser.add_argument('--wd', '--weight-decay', default=1e-4, type=float,
65 | metavar='W', help='weight decay (default: 1e-4)',
66 | dest='weight_decay')
67 | parser.add_argument('-p', '--print-freq', default=10, type=int,
68 | metavar='N', help='print frequency (default: 10)')
69 | parser.add_argument('-wf', '--write-freq', default=100, type=int,
70 | metavar='N', help='tensorboard write frequency (default: 100)')
71 | parser.add_argument('--resume', default='', type=str, metavar='PATH',
72 | help='path to latest checkpoint (default: none)')
73 | parser.add_argument('-e', '--evaluate', dest='evaluate', action='store_true',
74 | help='evaluate model on validation set')
75 | parser.add_argument('--pretrained', dest='pretrained', action='store_true',
76 | help='use pre-trained model')
77 | parser.add_argument('--world-size', default=-1, type=int,
78 | help='number of nodes for distributed training')
79 | parser.add_argument('--rank', default=-1, type=int,
80 | help='node rank for distributed training')
81 | parser.add_argument('--dist-url', default='tcp://224.66.41.62:23456', type=str,
82 | help='url used to set up distributed training')
83 | parser.add_argument('--dist-backend', default='nccl', type=str,
84 | help='distributed backend')
85 | parser.add_argument('--seed', default=None, type=int,
86 | help='seed for initializing training. ')
87 | parser.add_argument('--gpu', default=None, type=int,
88 | help='GPU id to use.')
89 | parser.add_argument('--multiprocessing-distributed', action='store_true',
90 | help='Use multi-processing distributed training to launch '
91 | 'N processes per node, which has N GPUs.
This is the ' 92 | 'fastest way to use PyTorch for either single node or ' 93 | 'multi node data parallel training') 94 | 95 | parser.add_argument('--cell_id', default='../final_metadata/cellID_yfcc100m_metadata_with_labels_usedDataRatio0.05_t110000_t250.npy', type=str, 96 | help='file that store the cell IDS') 97 | 98 | parser.add_argument('--val_freq', default=10, 99 | type=int, help='perform validation per [val_freq] epochs (in offline mode)') 100 | parser.add_argument('--adjust_lr', default=0, 101 | type=int, help='whether to adjust lr') 102 | parser.add_argument('--use_val', default=0, 103 | type=int, help='whether to use validation set') 104 | 105 | parser.add_argument('--root', default="/export/share/Datasets/yfcc100m_full_dataset/images/", 106 | type=str, help='root to the image dataset, used to save the memory consumption') 107 | parser.add_argument('--data', default="/export/share/Datasets/yfcc100m_full_dataset/metadata_geolocation/", 108 | type=str, help='path to the training data') 109 | parser.add_argument('--data_val', default="/export/share/Datasets/yfcc100m_full_dataset/metadata_geolocation/yfcc100m_metadata_with_labels_usedDataRatio0.05_t110000_t250_valid_files_2004To2014_compact_val.csv", 110 | type=str, help='path to the metadata') 111 | 112 | parser.add_argument('--gradient_steps_per_batch', default=1, 113 | type=int, help='number of gradient steps per batch, used to compare against firehose paper') 114 | parser.add_argument('--size_replay_buffer', default=0, 115 | type=int, help='size of the experience replay buffer (per gpu, so if you have 8 gpus, each gpu will have size_replay_buffer number of samples in the buffer)') 116 | parser.add_argument('--sample_rate_from_buffer', default=0.5, 117 | type=float, help='how many samples are from the buffer') 118 | parser.add_argument('--sampling_strategy', default='FIFO', 119 | type=str, help='sampling strategy (FIFO, Reservoir, RingBuf)') 120 | parser.add_argument('--num_classes', default=500, 121 | type=int, help='number of classes (used only for constructing ring buffer, no need to set this value)') 122 | 123 | parser.add_argument('--weight_old_data', default=1.0, 124 | type=float, help='weight of the loss on old data, used to separate the effect of learning rate on old and new data') 125 | 126 | parser.add_argument('--NOSubBatch', default=1, 127 | type=int, help='separate each batch into consecutive sub-batches, used only in online mode, for testing the effect of batch size') 128 | 129 | 130 | parser.add_argument('--GN', default=0, 131 | type=int, help='number of channels per group. 
If it is 0, it means '
132 | 'batch norm instead of group-norm')
133 | 
134 | parser.add_argument('--used_data_start', default = 0, type = int, help = 'start index of the portion of the stream to use (0 starts from the beginning)')
135 | parser.add_argument('--used_data_end', default = -1, type = int, help = 'end index of the portion of the stream to use (-1 uses the stream until the end)')
136 | 
137 | # separate between pure-replay or mixed replay
138 | parser.add_argument("--ReplayType", default = 'mixRep',
139 | choices=['mixRep', 'pureRep'], help='Type of replay buffer')
140 | parser.add_argument("--min_lr", default = 0.0, type = float, help = 'minimum learning rate, used to cut the cosine LR')
141 | # whether to save intermediate models
142 | parser.add_argument("--SaveInter", default = 1, type = int, help = 'whether to save intermediate models')
143 | parser.add_argument("--ABS_performanceGap", default = 0.5, type = float, help = 'the allowed performance gap between the accuracy of old and new samples, hyper-param for adaptive buffer size (tuned through cross val)')
144 | parser.add_argument("--use_ADRep", default=1, type = int, help = 'whether to use ADRep')
145 | best_acc1 = 0
146 | 
147 | def main():
148 | args = parser.parse_args()
149 | 
150 | if args.seed is not None:
151 | random.seed(args.seed)
152 | torch.manual_seed(args.seed)
153 | cudnn.deterministic = True
154 | warnings.warn('You have chosen to seed training. '
155 | 'This will turn on the CUDNN deterministic setting, '
156 | 'which can slow down your training considerably! '
157 | 'You may see unexpected behavior when restarting '
158 | 'from checkpoints.')
159 | 
160 | if args.gpu is not None:
161 | warnings.warn('You have chosen a specific GPU. This will completely '
162 | 'disable data parallelism.')
163 | 
164 | if args.dist_url == "env://" and args.world_size == -1:
165 | args.world_size = int(os.environ["WORLD_SIZE"])
166 | 
167 | args.distributed = args.world_size > 1 or args.multiprocessing_distributed
168 | 
169 | ngpus_per_node = torch.cuda.device_count()
170 | if args.multiprocessing_distributed:
171 | # Since we have ngpus_per_node processes per node, the total world_size
172 | # needs to be adjusted accordingly
173 | args.world_size = ngpus_per_node * args.world_size
174 | # Use torch.multiprocessing.spawn to launch distributed processes: the
175 | # main_worker process function
176 | mp.spawn(main_worker, nprocs=ngpus_per_node, args=(ngpus_per_node, args))
177 | else:
178 | # Simply call main_worker function
179 | main_worker(args.gpu, ngpus_per_node, args)
180 | 
181 | 
182 | def main_worker(gpu, ngpus_per_node, args):
183 | global best_acc1
184 | args.gpu = gpu
185 | # suppress printing if not master
186 | if args.multiprocessing_distributed and args.gpu != 0:
187 | def print_pass(*args):
188 | pass
189 | builtins.print = print_pass
190 | 
191 | args.output_dir = create_output_dir(args)
192 | os.makedirs(args.output_dir, exist_ok = True)
193 | writer = SummaryWriter(args.output_dir)
194 | 
195 | print("output_dir = {}".format(args.output_dir))
196 | sys.stdout.flush()
197 | 
198 | 
199 | if args.gpu is not None:
200 | print("Use GPU: {} for training".format(args.gpu))
201 | 
202 | if args.distributed:
203 | if args.dist_url == "env://" and args.rank == -1:
204 | args.rank = int(os.environ["RANK"])
205 | if args.multiprocessing_distributed:
206 | # For multiprocessing distributed training, rank needs to be the
207 | # global rank among all the processes
208 | args.rank = args.rank * ngpus_per_node + gpu
209 | dist.init_process_group(backend=args.dist_backend,
init_method=args.dist_url, 210 | world_size=args.world_size, rank=args.rank) 211 | 212 | model, criterion, optimizer, num_classes = init_model(args, ngpus_per_node) 213 | 214 | args.num_classes = num_classes 215 | 216 | # init the full dataset 217 | train_set, val_set = init_dataset(args) 218 | cudnn.benchmark = True 219 | 220 | if args.evaluate: 221 | # optionally resume from a checkpoint 222 | if args.resume: 223 | if os.path.isfile(args.resume): 224 | args.start_epoch, _, _, _, _ = resume_online(args, model, optimizer) 225 | else: 226 | raise RuntimeError("=> no checkpoint found at '{}'".format(args.resume)) 227 | 228 | print("do not perform training for evaluation mode, val_set = {}".format(val_set)) 229 | sys.stdout.flush() 230 | 231 | out_folder_eval = args.output_dir + '/epoch{}'.format(args.start_epoch) 232 | os.makedirs(out_folder_eval, exist_ok = True) 233 | writer = SummaryWriter(out_folder_eval) 234 | print("[validation]: saving validation result to {}".format(out_folder_eval)) 235 | val_loader = init_val_loader(args, val_set) 236 | acc1 = validate(val_loader, model, criterion, args, writer, args.epochs, is_forward = 2) 237 | 238 | # do forward and backward transfer too 239 | # 1. read out time of current example 240 | time_taken = torch.load(args.data+'train_time.torchSave') 241 | idx_last = compute_idx_curr(len(time_taken), args.start_epoch, args, ngpus_per_node) 242 | if idx_last < 0: 243 | time_last = time_taken[0] - 1 244 | else: 245 | time_last = time_taken[min(idx_last, len(time_taken)-1)] 246 | 247 | # compute forward 248 | val_set.set_transfer_time_point(args, val_set, time_last, is_forward = True) 249 | val_loader = init_val_loader(args, val_set) 250 | out_folder_eval = args.output_dir + '/epoch{}/forward'.format(args.start_epoch) 251 | os.makedirs(out_folder_eval, exist_ok = True) 252 | writer = SummaryWriter(out_folder_eval) 253 | print("[forward transfer]: saving validation result to {}".format(out_folder_eval)) 254 | acc1 = validate(val_loader, model, criterion, args, writer, args.epochs, is_forward = 1, time_last = time_last) 255 | # backward transfer 256 | val_set.set_transfer_time_point(args, val_set, time_last, is_forward = False) 257 | val_loader = init_val_loader(args, val_set) 258 | out_folder_eval = args.output_dir + '/epoch{}/backward'.format(args.start_epoch) 259 | os.makedirs(out_folder_eval, exist_ok = True) 260 | writer = SummaryWriter(out_folder_eval) 261 | print("[backward transfer]: saving validation result to {}".format(out_folder_eval)) 262 | acc1 = validate(val_loader, model, criterion, args, writer, args.epochs, is_forward = 0, time_last = time_last) 263 | 264 | else: 265 | train_online(args, model, criterion, optimizer, train_set, val_set, ngpus_per_node, writer) 266 | 267 | def compute_idx_curr(data_size, epoch, args, ngpus_per_node): 268 | data_size_per_epoch = math.ceil(data_size/args.epochs) 269 | if data_size_per_epoch % (args.batch_size* ngpus_per_node) != 0: 270 | data_size_per_epoch = data_size_per_epoch - data_size_per_epoch % (args.batch_size* ngpus_per_node) + (args.batch_size* ngpus_per_node) 271 | return data_size_per_epoch * epoch - 1 272 | 273 | 274 | def init_model(args, ngpus_per_node): 275 | cell_ids = np.load(args.cell_id, allow_pickle = True) 276 | num_classes = cell_ids.size + 1 # remember to +1 277 | print("init with num_classes = {}".format(num_classes)) 278 | 279 | if args.GN == 0: 280 | if args.pretrained: 281 | print("=> using pre-trained model '{}'".format(args.arch)) 282 | model = 
models.__dict__[args.arch](pretrained=True) 283 | else: 284 | print("=> creating model '{}'".format(args.arch)) 285 | model = models.__dict__[args.arch](num_classes = num_classes, norm_layer = nn.SyncBatchNorm) 286 | else: 287 | if args.pretrained: 288 | print("=> using pre-trained model '{}'".format(args.arch)) 289 | model = models_GN.__dict__[args.arch](pretrained=True, 290 | group_norm=args.GN) 291 | else: 292 | print("=> creating model '{}'".format(args.arch)) 293 | model = models_GN.__dict__[args.arch](num_classes = num_classes, 294 | group_norm=args.GN) 295 | 296 | if not torch.cuda.is_available(): 297 | print('using CPU, this will be slow') 298 | elif args.distributed: 299 | # For multiprocessing distributed, DistributedDataParallel constructor 300 | # should always set the single device scope, otherwise, 301 | # DistributedDataParallel will use all available devices. 302 | if args.gpu is not None: 303 | torch.cuda.set_device(args.gpu) 304 | model.cuda(args.gpu) 305 | # When using a single GPU per process and per 306 | # DistributedDataParallel, we need to divide the batch size 307 | # ourselves based on the total number of GPUs we have 308 | args.batch_size = int(args.batch_size / ngpus_per_node) 309 | args.workers = int((args.workers + ngpus_per_node - 1) / ngpus_per_node) 310 | model = torch.nn.parallel.DistributedDataParallel(model, device_ids=[args.gpu]) 311 | else: 312 | model.cuda() 313 | # DistributedDataParallel will divide and allocate batch_size to all 314 | # available GPUs if device_ids are not set 315 | model = torch.nn.parallel.DistributedDataParallel(model) 316 | elif args.gpu is not None: 317 | torch.cuda.set_device(args.gpu) 318 | model = model.cuda(args.gpu) 319 | else: 320 | # DataParallel will divide and allocate batch_size to all available GPUs 321 | if args.arch.startswith('alexnet') or args.arch.startswith('vgg'): 322 | model.features = torch.nn.DataParallel(model.features) 323 | model.cuda() 324 | else: 325 | model = torch.nn.DataParallel(model).cuda() 326 | 327 | # define loss function (criterion) and optimizer 328 | criterion = nn.CrossEntropyLoss().cuda(args.gpu) 329 | optimizer = torch.optim.SGD(model.parameters(), args.lr, 330 | momentum=args.momentum, 331 | weight_decay=args.weight_decay) 332 | 333 | return model, criterion, optimizer, num_classes 334 | 335 | 336 | def init_dataset(args): 337 | normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406], 338 | std=[0.229, 0.224, 0.225]) 339 | 340 | 341 | trans = transforms.Compose([ 342 | transforms.RandomResizedCrop(224), 343 | transforms.RandomHorizontalFlip(), 344 | transforms.ToTensor(), 345 | normalize, 346 | ]) 347 | 348 | trans_test = transforms.Compose([ 349 | transforms.Resize(256), 350 | transforms.CenterCrop(224), 351 | transforms.ToTensor(), 352 | normalize, 353 | ]) 354 | 355 | print("data augmentation = {}; data augmentation for test batch = {}".format(trans, trans_test)) 356 | 357 | if args.evaluate: 358 | train_dataset = None 359 | val_dataset = YFCC_CL_Dataset_offline_val(args, 360 | transform = trans_test) 361 | else: 362 | train_dataset = YFCC_CL_Dataset_online(args, 363 | transform = trans, transform_RepBuf = trans, trans_test = trans_test) 364 | val_dataset = YFCC_CL_Dataset_offline_val(args, 365 | transform = trans_test) 366 | 367 | return train_dataset, val_dataset 368 | 369 | def resume_online(args, model, optimizer): 370 | print("=> loading checkpoint '{}'".format(args.resume)) 371 | 372 | if args.gpu is None: 373 | checkpoint = torch.load(args.resume) 374 | else: 375 | loc = 
'cuda:{}'.format(args.gpu)
376 | checkpoint = torch.load(args.resume, map_location=loc)
377 | 
378 | args.start_epoch = checkpoint['epoch']
379 | best_acc1 = checkpoint['best_acc1']
380 | 
381 | model.load_state_dict(checkpoint['state_dict'])
382 | optimizer.load_state_dict(checkpoint['optimizer'])
383 | online_fit_meters = checkpoint['online_fit_meters']
384 | 
385 | userID_last = checkpoint['userID_last']
386 | 
387 | print("=> loaded checkpoint '{}' (epoch {})"
388 | .format(args.resume, checkpoint['epoch']))
389 | if 'reservoir_buffer' in checkpoint:
390 | reservoir_buffer = checkpoint['reservoir_buffer']
391 | else:
392 | reservoir_buffer = None
393 | 
394 | args.size_replay_buffer = checkpoint['size_replay_buffer']
395 | return checkpoint['epoch'], checkpoint['best_acc1'], reservoir_buffer, online_fit_meters, userID_last
396 | 
397 | 
398 | def init_val_loader(args, val_set):
399 | return torch.utils.data.DataLoader(val_set, batch_size=args.batch_size, shuffle=False, num_workers=args.workers, pin_memory=True)
400 | 
401 | def init_loader_v2(args, train_set, epoch):
402 | train_set.size_buf = args.size_replay_buffer
403 | train_set._change_data_range(epoch = epoch)
404 | 
405 | if args.distributed:
406 | train_sampler = torch.utils.data.distributed.DistributedSampler(train_set, shuffle = False)
407 | else:
408 | train_sampler = None
409 | 
410 | batchSize4Loader = int(args.batch_size/args.NOSubBatch)
411 | train_loader = torch.utils.data.DataLoader(
412 | train_set, batch_size=batchSize4Loader, shuffle=(train_sampler is None),
413 | num_workers=args.workers, pin_memory=True, sampler=train_sampler)
414 | return train_loader, train_sampler
415 | 
416 | 
417 | def init_online_fit_meters():
418 | return [AverageMeter('LossOF', ':.4e'), AverageMeter('AccOF@1', ':6.2f'), AverageMeter('AccOF@5', ':6.2f'), AverageMeter('AccF@1', ':6.2f'), AverageMeter('AccO@1', ':6.2f')]
419 | 
420 | def train_online(args, model, criterion, optimizer, train_set, val_set, ngpus_per_node, writer):
421 | # init metrics for online fit
422 | online_fit_meters = init_online_fit_meters()
423 | 
424 | userID_last = -1
425 | 
426 | if args.resume:
427 | if os.path.isfile(args.resume):
428 | args.start_epoch, _, reservoir_buffer, online_fit_meters, userID_last = resume_online(args, model, optimizer)
429 | if reservoir_buffer is not None:
430 | train_set.buf_last = reservoir_buffer.cpu()
431 | else:
432 | raise RuntimeError("=> no checkpoint found at '{}'".format(args.resume))
433 | 
434 | for epoch in range(args.start_epoch, args.epochs):
435 | train_loader, train_sampler = init_loader_v2(args, train_set, epoch)
436 | 
437 | if args.distributed:
438 | train_sampler.set_epoch(0)
439 | 
440 | if args.adjust_lr:
441 | adjust_learning_rate(optimizer, epoch, args, ngpus_per_node)
442 | 
443 | writer.add_scalar("learning rate", get_lr(optimizer), epoch)
444 | 
445 | userID_last = train_MultiGD(train_loader, model, criterion, optimizer, epoch, args, writer, online_fit_meters, userID_last) # keep the updated user ID so it is saved in the checkpoint below
446 | 
447 | if not args.multiprocessing_distributed or (args.multiprocessing_distributed
448 | and args.rank % ngpus_per_node == 0):
449 | save_dict = {
450 | 'epoch': epoch + 1,
451 | 'arch': args.arch,
452 | 'size_replay_buffer': args.size_replay_buffer,
453 | 'state_dict': model.state_dict(),
454 | 'best_acc1': 0,
455 | 'optimizer' : optimizer.state_dict(),
456 | 'online_fit_meters': online_fit_meters,
457 | 'userID_last': userID_last,
458 | }
459 | 
460 | # if args.gpu == 0:
461 | if args.sampling_strategy == 'Reservoir':
462 | # store previous reservoir buffer
463 | save_dict['reservoir_buffer'] = train_set.buf_last
464 | 
465 | if (epoch + 1) % 10 == 0 and args.SaveInter:
466 | save_checkpoint(save_dict, 0, output_dir = args.output_dir, filename='checkpoint_ep{}.pth.tar'.format(epoch + 1))
467 | else:
468 | save_checkpoint(save_dict, 0, output_dir = args.output_dir)
469 | 
470 | if args.use_val == 1:
471 | val_loader = torch.utils.data.DataLoader(val_set,
472 | batch_size=args.batch_size, shuffle=False,
473 | num_workers=args.workers, pin_memory=True)
474 | acc1 = validate(val_loader, model, criterion, args, writer, args.epochs)
475 | print("final average accuracy = {}".format(acc1))
476 | 
477 | def find_next_test_album(user_ID, user_ID_last, idx_time_sorted):
478 | idx_ne = (user_ID[idx_time_sorted] != user_ID_last).nonzero().flatten()
479 | if idx_ne.numel() > 0:
480 | idx_end = 1
481 | for i in range(1, idx_ne.numel()):
482 | if user_ID[idx_time_sorted][idx_ne[0]] == user_ID[idx_time_sorted][idx_ne[i]]:
483 | idx_end += 1
484 | else:
485 | break
486 | return idx_ne[:idx_end]
487 | else:
488 | return None
489 | 
490 | def train_MultiGD(train_loader, model, criterion, optimizer, epoch, args, writer, online_fit_meters, userID_last):
491 | # these statistics are local, so we need a separate set of meters for online fit, forward and backward transfer.
492 | batch_time = AverageMeter('Time', ':6.3f')
493 | data_time = AverageMeter('Data', ':6.3f')
494 | 
495 | losses = AverageMeter('Loss', ':.4e')
496 | 
497 | top1 = AverageMeter('Acc@1', ':6.2f')
498 | top5 = AverageMeter('Acc@5', ':6.2f')
499 | 
500 | top1_future = AverageMeter('AccF@1', ':6.2f')
501 | top5_future = AverageMeter('AccF@5', ':6.2f')
502 | 
503 | top1_Rep = AverageMeter('Acc_old@1', ':6.2f')
504 | top5_Rep = AverageMeter('Acc_old@5', ':6.2f')
505 | 
506 | progress = ProgressMeter(
507 | len(train_loader),
508 | [batch_time, data_time, losses, top1, top1_future, top1_Rep, online_fit_meters[1], online_fit_meters[2]],
509 | prefix="Epoch: [{}]".format(epoch))
510 | 
511 | # switch to train mode
512 | model.train()
513 | end = time.time()
514 | 
515 | optimizer.zero_grad()
516 | 
517 | ngpus = torch.cuda.device_count()
518 | 
519 | rate_rep_sample = args.sample_rate_from_buffer/(1-args.sample_rate_from_buffer)
520 | 
521 | for i, (images, target, userID, time_taken, index,
522 | images_pop_tmp, target_pop_tmp, index_pop_tmp,
523 | images_from_buf_tmp, target_from_buf_tmp, index_buf_tmp) in enumerate(train_loader):
524 | # 1. images & target: test mini-batch (size = batch size)
525 | # 2. images_pop & target_pop: samples popped from validation buffer, used for training (size = batch size, set to "images & target" if batch_size*iter < val_buf_size, and we do normal SGD in this case)
526 | # 3.
images_from_buf and target_from_buf: samples from replay buffer, used for training (size = batch size * sample rate * number of GD steps per iter) 527 | 528 | # measure data loading time 529 | data_time.update(time.time() - end) 530 | 531 | # print("target = {}".format(target)) 532 | iter_curr = epoch*len(train_loader)/(args.gradient_steps_per_batch*args.NOSubBatch)+i//(args.gradient_steps_per_batch*args.NOSubBatch) 533 | batch_size = target_pop_tmp.numel() 534 | 535 | if i == 0: 536 | batch_size_ori = batch_size 537 | 538 | if i % (args.gradient_steps_per_batch*args.NOSubBatch) == 0: 539 | images = images.reshape(images.size()[0]*images.size()[1],images.size()[2], images.size()[3], images.size()[4])[:] 540 | target = target.reshape(1, -1)[0][:] 541 | userID = userID.reshape(1, -1)[0][:] 542 | time_taken = time_taken.reshape(1, -1)[0][:] 543 | index = index.reshape(1, -1)[0][:] 544 | 545 | # organize the format of samples from the replay buffer 546 | bs_from_buf = round(batch_size*rate_rep_sample) 547 | images_pop = images_pop_tmp 548 | target_pop = target_pop_tmp 549 | index_pop = index_pop_tmp 550 | 551 | images_from_buf = images_from_buf_tmp.reshape(images_from_buf_tmp.size()[0]*images_from_buf_tmp.size()[1],images_from_buf_tmp.size()[2], images_from_buf_tmp.size()[3], images_from_buf_tmp.size()[4])[:bs_from_buf] 552 | target_from_buf = target_from_buf_tmp.reshape(1, -1)[0][:bs_from_buf] 553 | index_buf = index_buf_tmp.reshape(1, -1)[0][:bs_from_buf] 554 | 555 | 556 | if i % (args.gradient_steps_per_batch * args.NOSubBatch) == 0: 557 | # do prediction on the new batch 558 | if args.gpu is not None: 559 | images = images.cuda(args.gpu, non_blocking=True) 560 | target = target.cuda(args.gpu, non_blocking=True) 561 | 562 | with torch.no_grad(): 563 | model.eval() 564 | output = model(images) 565 | output_all = global_gather(output) 566 | target_new = global_gather(target) 567 | 568 | acc1F, acc5F = accuracy(output_all, target_new, topk=(1, 5)) 569 | top1_future.update(acc1F[0]) 570 | top5_future.update(acc5F[0]) 571 | 572 | # compute online fit on the next album 573 | userID = global_gather(userID.cuda(args.gpu, non_blocking=True)) 574 | time_taken = global_gather(time_taken.cuda(args.gpu, non_blocking=True)) 575 | time_sorted, idx_sort = time_taken.sort() 576 | time_last = time_sorted[-1] 577 | idx_set = find_next_test_album(userID, userID_last, idx_sort) 578 | if idx_set is not None: 579 | output_album = output_all[idx_sort][idx_set] 580 | target_album = target_new[idx_sort][idx_set] 581 | acc1OF, acc5OF = accuracy(output_album, target_album, topk=(1,5)) 582 | online_fit_meters[1].update(acc1OF[0]) 583 | online_fit_meters[2].update(acc5OF[0]) 584 | 585 | # update userID_last 586 | userID_last = userID[idx_sort[-1]] 587 | 588 | model.train() 589 | 590 | images = images_pop.cuda(args.gpu, non_blocking=True) 591 | target = target_pop.cuda(args.gpu, non_blocking=True) 592 | 593 | # new code starts from here, do not support sub-batches for now 594 | if args.sample_rate_from_buffer > 0 and iter_curr > 1: 595 | images_from_buf = images_from_buf.cuda(args.gpu, non_blocking=True) 596 | target_from_buf = target_from_buf.cuda(args.gpu, non_blocking=True) 597 | images_merged, target_merged = merge_data_gpu(images_from_buf, target_from_buf, images, target, args) 598 | flag_merged = 1 599 | else: 600 | images_merged = images 601 | target_merged = target 602 | flag_merged = 0 603 | 604 | output = model(images_merged) 605 | loss = compute_loss(args, output, target_merged, criterion, target.numel()) 606 
| 607 | 608 | loss.backward() 609 | 610 | 611 | optimizer.step() # update parameters of net 612 | optimizer.zero_grad() # reset gradient 613 | 614 | if i % (args.gradient_steps_per_batch * args.NOSubBatch) == 0: 615 | 616 | losses.update(loss.item(), target_merged.numel()) 617 | output_all = global_gather(output) 618 | target_all = global_gather(target_merged) 619 | acc1, acc5 = accuracy(output_all, target_all, topk=(1, 5)) 620 | top1.update(acc1[0], target_all.numel()) 621 | top5.update(acc5[0], target_all.numel()) 622 | 623 | if target_merged.numel() > target.numel(): 624 | output_all = global_gather(output[target.numel():]) 625 | target_all = global_gather(target_merged[target.numel():]) 626 | acc1, acc5 = accuracy(output_all, target_all, topk=(1, 5)) 627 | 628 | top1_Rep.update(acc1[0], target_all.numel()) 629 | top5_Rep.update(acc5[0], target_all.numel()) 630 | 631 | # measure elapsed time 632 | batch_time.update(time.time() - end) 633 | end = time.time() 634 | 635 | if i % (args.print_freq * args.gradient_steps_per_batch * args.NOSubBatch) == 0: 636 | progress.display(i) 637 | # sys.stdout.flush() 638 | 639 | if i % (args.write_freq*args.gradient_steps_per_batch* args.NOSubBatch) == 0: 640 | writer.add_scalar("train_loss_iter", losses.avg, iter_curr) 641 | writer.add_scalar("train_acc1_iter", top1.avg, iter_curr) 642 | writer.add_scalar("train_acc5_iter", top5.avg, iter_curr) 643 | 644 | writer.add_scalar("avg_online_loss_iter", online_fit_meters[0].avg, iter_curr) 645 | writer.add_scalar("avg_online_acc1_iter", online_fit_meters[1].avg, iter_curr) 646 | writer.add_scalar("avg_online_acc5_iter", online_fit_meters[2].avg, iter_curr) 647 | 648 | writer.add_scalar("avg_online_loss_time_iter", online_fit_meters[0].avg, time_last) 649 | writer.add_scalar("avg_online_acc1_time_iter", online_fit_meters[1].avg, time_last) 650 | writer.add_scalar("avg_online_acc5_time_iter", online_fit_meters[2].avg, time_last) 651 | 652 | writer.add_scalar("train_acc1_old_iter", top1_Rep.avg, iter_curr) 653 | writer.add_scalar("train_acc5_old_iter", top5_Rep.avg, iter_curr) 654 | 655 | writer.add_scalar("train_acc1_future_iter", top1_future.avg, iter_curr) 656 | writer.add_scalar("train_acc5_future_iter", top5_future.avg, iter_curr) 657 | 658 | 659 | # if args.gpu == 0: 660 | writer.add_scalar("train_loss", losses.avg, epoch) 661 | writer.add_scalar("train_acc1", top1.avg, epoch) 662 | writer.add_scalar("train_acc5", top5.avg, epoch) 663 | 664 | writer.add_scalar("avg_online_loss_time", online_fit_meters[0].avg, time_last) 665 | writer.add_scalar("avg_online_acc1_time", online_fit_meters[1].avg, time_last) 666 | writer.add_scalar("avg_online_acc5_time", online_fit_meters[2].avg, time_last) 667 | 668 | writer.add_scalar("avg_online_loss_epoch", online_fit_meters[0].avg, epoch) 669 | writer.add_scalar("avg_online_acc1_epoch", online_fit_meters[1].avg, epoch) 670 | writer.add_scalar("avg_online_acc5_epoch", online_fit_meters[2].avg, epoch) 671 | 672 | writer.add_scalar("train_acc1_old", top1_Rep.avg, epoch) 673 | writer.add_scalar("train_acc5_old", top5_Rep.avg, epoch) 674 | 675 | writer.add_scalar("train_acc1_future", top1_future.avg, epoch) 676 | writer.add_scalar("train_acc5_future", top5_future.avg, epoch) 677 | 678 | writer.add_scalar("RepBuf_size", args.size_replay_buffer, epoch) 679 | if args.use_ADRep: 680 | if top1_Rep.avg - top1_future.avg > args.ABS_performanceGap: 681 | args.size_replay_buffer = int(args.size_replay_buffer*2) 682 | elif top1_Rep.avg - top1_future.avg < -args.ABS_performanceGap: 683 
| args.size_replay_buffer = int(args.size_replay_buffer/2) 684 | print("[ADRep]: changing replay buffer size to {} (rep vs future = {}/{})".format(args.size_replay_buffer, top1_Rep.avg, top1_future.avg)) 685 | 686 | 687 | return userID_last 688 | 689 | def validate(val_loader, model, criterion, args, writer, epoch, is_forward = 2, time_last = -1): 690 | batch_time = AverageMeter('Time', ':6.3f') 691 | losses = AverageMeter('Loss', ':.4e') 692 | top1 = AverageMeter('Acc@1', ':6.2f') 693 | top5 = AverageMeter('Acc@5', ':6.2f') 694 | 695 | top1_iter = AverageMeter('AccI@1', ':6.2f') 696 | top5_iter = AverageMeter('AccI@5', ':6.2f') 697 | 698 | top1_over_time = AverageMeter('Acc@1_time', ':6.2f') 699 | top5_over_time = AverageMeter('Acc@5_time', ':6.2f') 700 | 701 | progress = ProgressMeter( 702 | len(val_loader), 703 | [batch_time, losses, top1, top5, top1_iter, top5_iter, top1_over_time, top5_over_time], 704 | prefix='Test: ') 705 | 706 | model.eval() 707 | seconds_per_week = 24*7*3600 # one week of timestamp gap, used to bucket the time-based curves 708 | time_init = time_last 709 | with torch.no_grad(): 710 | end = time.time() 711 | for i, (images, target, time_curr, idx) in enumerate(val_loader): 712 | 713 | if args.gpu is not None: 714 | images = images.cuda(args.gpu, non_blocking=True) 715 | target = target.cuda(args.gpu, non_blocking=True) 716 | 717 | if i == 0: 718 | if time_last == -1: 719 | time_last = time_curr[0] 720 | idx_init = idx[0] 721 | 722 | # compute output 723 | output = model(images) 724 | loss = criterion(output, target) 725 | 726 | # measure accuracy and record loss 727 | acc1, acc5 = accuracy(output, target, topk=(1, 5)) 728 | losses.update(loss.item(), target.size(0)) 729 | top1.update(acc1[0], target.size(0)) 730 | top5.update(acc5[0], target.size(0)) 731 | 732 | top1_over_time.update(acc1[0], target.size(0)) 733 | top5_over_time.update(acc5[0], target.size(0)) 734 | top1_iter.update(acc1[0], target.size(0)) 735 | top5_iter.update(acc5[0], target.size(0)) 736 | 737 | # measure elapsed time 738 | batch_time.update(time.time() - end) 739 | end = time.time() 740 | 741 | 742 | if i % args.print_freq == 0: 743 | progress.display(i) 744 | sys.stdout.flush() 745 | 746 | time_latest = time_curr[-1] 747 | idx_latest = idx[-1] 748 | if is_forward == 0: 749 | time_gap_from_init = time_init - time_latest 750 | time_gap = time_last - time_latest 751 | idx_gap = idx_init - idx_latest 752 | else: 753 | time_gap_from_init = time_latest - time_init 754 | time_gap = time_latest - time_last 755 | idx_gap = idx_latest - idx_init 756 | 757 | if idx_gap % 10000 == 0 or i == (len(val_loader)-1): 758 | writer.add_scalar("val_acc1_iter", top1_iter.avg, idx_gap) 759 | writer.add_scalar("val_acc5_iter", top5_iter.avg, idx_gap) 760 | 761 | writer.add_scalar("transfer_top1_valIdx", top1.avg, idx_gap) 762 | writer.add_scalar("transfer_top5_valIdx", top5.avg, idx_gap) 763 | 764 | top1_iter.reset() 765 | top5_iter.reset() 766 | 767 | if time_gap > seconds_per_week: 768 | writer.add_scalar("transfer_top1_time", top1.avg, time_gap_from_init) 769 | writer.add_scalar("transfer_top5_time", top5.avg, time_gap_from_init) 770 | 771 | writer.add_scalar("val_acc1_over_time", top1_over_time.avg, time_gap_from_init) 772 | writer.add_scalar("val_acc5_over_time", top5_over_time.avg, time_gap_from_init) 773 | 774 | top1_over_time.reset() 775 | top5_over_time.reset() 776 | time_last = time_latest 777 | 778 | # TODO: this should also be done with the ProgressMeter 779 | print(' * Acc@1 {top1.avg:.3f} Acc@5 {top5.avg:.3f}' 780 | .format(top1=top1, top5=top5))
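# validate() can walk the stream backward or forward from the current training time point (is_forward), so the writer calls above double as backward/forward-transfer curves: one point every 10000 validation samples (idx_gap) and one point whenever the timestamp gap since the last point exceeds a week. The epoch-level scalars below summarize the whole pass.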
781 | 782 | # if args.gpu == 0: 783 | writer.add_scalar("val_loss", losses.avg, epoch) 784 | writer.add_scalar("val_acc1", top1.avg, epoch) 785 | return top1.avg 786 | 787 | 788 | def save_checkpoint(state, is_best, output_dir = '.', filename='checkpoint.pth.tar'): 789 | torch.save(state, output_dir+'/checkpoint.pth.tar') 790 | if filename != 'checkpoint.pth.tar': 791 | shutil.copyfile(output_dir+'/checkpoint.pth.tar', output_dir+'/'+filename) 792 | 793 | 794 | class AverageMeter(object): 795 | """Computes and stores the average and current value""" 796 | def __init__(self, name, fmt=':f'): 797 | self.name = name 798 | self.fmt = fmt 799 | self.reset() 800 | 801 | def reset(self): 802 | self.val = 0 803 | self.avg = 0 804 | self.sum = 0 805 | self.count = 0 806 | 807 | def update(self, val, n=1): 808 | self.val = val 809 | self.sum += val * n 810 | self.count += n 811 | self.avg = self.sum / self.count 812 | 813 | def ma_update(self, val, weight): 814 | print("[ma_update before]: avg = {}; weight = {}; val = {}".format(self.avg, weight, val)) 815 | if weight >= 1.0: 816 | self.update(val) 817 | else: 818 | self.val = val 819 | self.avg = self.avg * weight + val * (1 - weight) 820 | print("[ma_update after]: avg = {}".format(self.avg)) 821 | 822 | def __str__(self): 823 | fmtstr = '{name} {val' + self.fmt + '} ({avg' + self.fmt + '})' 824 | return fmtstr.format(**self.__dict__) 825 | 826 | 827 | class ProgressMeter(object): 828 | def __init__(self, num_batches, meters, prefix=""): 829 | self.batch_fmtstr = self._get_batch_fmtstr(num_batches) 830 | self.meters = meters 831 | self.prefix = prefix 832 | 833 | def display(self, batch): 834 | entries = [self.prefix + self.batch_fmtstr.format(batch)] 835 | entries += [str(meter) for meter in self.meters] 836 | # for test only 837 | print('\t'.join(entries)) 838 | 839 | def _get_batch_fmtstr(self, num_batches): 840 | num_digits = len(str(num_batches // 1)) 841 | fmt = '{:' + str(num_digits) + 'd}' 842 | return '[' + fmt + '/' + fmt.format(num_batches) + ']' 843 | 844 | 845 | def create_output_dir(args): 846 | output_dir = 'results_no_PoLRS/'+args.arch+'_lr{}_epochs{}'.format(args.lr, args.epochs)+'_bufSize{}'.format(args.size_replay_buffer) 847 | 848 | if args.sampling_strategy != 'FIFO': 849 | output_dir += '_sampleStrategy{}'.format(args.sampling_strategy) 850 | 851 | if args.adjust_lr != 0: 852 | output_dir += '_adjustLr{}'.format(args.adjust_lr) 853 | if args.adjust_lr == 2: 854 | output_dir += '_MinLr{}'.format(args.cyclic_min_lr) 855 | 856 | if args.ReplayType != 'mixRep': 857 | output_dir += '_{}'.format(args.ReplayType) 858 | 859 | if args.ABS_performanceGap != 0.5: 860 | output_dir += 'ABSPG{}'.format(args.ABS_performanceGap) 861 | 862 | if args.gradient_steps_per_batch > 1: 863 | output_dir+='_GDSteps{}'.format(args.gradient_steps_per_batch) 864 | 865 | output_dir += '_BS{}'.format(args.batch_size) 866 | 867 | if args.GN > 0: 868 | output_dir += '_GN{}'.format(args.GN) 869 | 870 | if args.NOSubBatch > 1: 871 | output_dir += '_NOSubBatch{}'.format(args.NOSubBatch) 872 | 873 | if args.weight_decay != 1e-4: 874 | output_dir += '_WD{}'.format(args.weight_decay) 875 | 876 | if args.min_lr > 0.0: 877 | output_dir += '_minLr{}'.format(args.min_lr) 878 | 879 | if args.use_ADRep <=0: 880 | output_dir += '_noADRep' 881 | 882 | if args.evaluate: 883 | output_dir += '/evaluate' 884 | 885 | 886 | return output_dir 887 | 888 | def global_gather(x): 889 | all_x = [torch.ones_like(x) 890 | for _ in range(dist.get_world_size())] 891 | 
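# torch.distributed.all_gather is collective: every rank must reach this call with a tensor of identical shape, and all_x[r] receives rank r's copy. A hypothetical 2-GPU example: if rank 0 holds tensor([1., 2.]) and rank 1 holds tensor([3., 4.]), global_gather returns tensor([1., 2., 3., 4.]) on both ranks.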
dist.all_gather(all_x, x, async_op=False) 892 | return torch.cat(all_x, dim=0) 893 | 894 | def compute_loss_CE(args, output, target, criterion, size_subBatch_new): 895 | lossF = criterion(output[:size_subBatch_new], target[:size_subBatch_new]) 896 | if target.numel() > size_subBatch_new and args.weight_old_data > 0.0 and args.sample_rate_from_buffer > 0.0: 897 | lossRep = criterion(output[size_subBatch_new:], target[size_subBatch_new:]) 898 | return (lossF*0.5+args.weight_old_data*lossRep*0.5) 899 | else: 900 | return lossF 901 | 902 | 903 | def compute_loss(args, output, target, criterion, size_subBatch_new): 904 | return compute_loss_CE(args, output, target, criterion, size_subBatch_new) 905 | 906 | def merge_data(image_buf, target_buf, image, target): 907 | # currently just simply merge them 908 | if image_buf.is_cuda: 909 | image_buf = image_buf.cpu() 910 | if target_buf.is_cuda: 911 | target_buf = target_buf.cpu() 912 | 913 | return torch.cat((image, image_buf)), torch.cat((target, target_buf)) 914 | 915 | def merge_data_gpu(image_buf, target_buf, image, target, args): 916 | if not image_buf.is_cuda: 917 | image_buf = image_buf.cuda(args.gpu, non_blocking=True) 918 | if not target_buf.is_cuda: 919 | target_buf = target_buf.cuda(args.gpu, non_blocking=True) 920 | 921 | return torch.cat((image, image_buf)), torch.cat((target, target_buf)) 922 | 923 | def get_lr(optimizer): 924 | for param_group in optimizer.param_groups: 925 | return param_group['lr'] 926 | 927 | 928 | def adjust_learning_rate(optimizer, epoch, args, ngpus_per_node, iter_curr = 0): 929 | lr = args.lr 930 | 931 | lr *= 0.5 * (1. + math.cos(math.pi * epoch / args.epochs)) 932 | lr = max(args.min_lr, lr) 933 | 934 | for param_group in optimizer.param_groups: 935 | param_group['lr'] = lr 936 | 937 | 938 | def accuracy(output, target, topk=(1,), cross_GPU = False): 939 | """Computes the accuracy over the k top predictions for the specified values of k""" 940 | with torch.no_grad(): 941 | if cross_GPU: 942 | output = global_gather(output) 943 | target = global_gather(target) 944 | 945 | maxk = max(topk) 946 | batch_size = target.size(0) 947 | 948 | _, pred = output.topk(maxk, 1, True, True) 949 | pred = pred.t() 950 | correct = pred.eq(target.reshape(1, -1).expand_as(pred)) 951 | 952 | res = [] 953 | for k in topk: 954 | correct_k = correct[:k].reshape(-1).float().sum(0, keepdim=True) 955 | res.append(correct_k.mul_(100.0 / batch_size)) 956 | return res 957 | 958 | 959 | if __name__ == '__main__': 960 | main() -------------------------------------------------------------------------------- /CLOC/code_online/no_PoLRS/yfcc100m_dataset.py: -------------------------------------------------------------------------------- 1 | import torch 2 | from torchvision.datasets.vision import StandardTransform 3 | from torch.utils.data import Dataset, IterableDataset 4 | from PIL import Image 5 | 6 | import os 7 | import os.path 8 | import csv 9 | 10 | import sys 11 | import math 12 | import random 13 | import matplotlib.pyplot as plt 14 | 15 | 16 | def has_file_allowed_extension(filename, extensions): 17 | """Checks if a file is an allowed extension. 18 | 19 | Args: 20 | filename (string): path to a file 21 | extensions (tuple of strings): extensions to consider (lowercase) 22 | 23 | Returns: 24 | bool: True if the filename ends with one of given extensions 25 | """ 26 | return filename.lower().endswith(extensions) 27 | 28 | 29 | def is_image_file(filename): 30 | """Checks if a file is an allowed image extension. 
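The check is case-insensitive, so '.JPG' and '.jpg' are treated alike.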
31 | 32 | Args: 33 | filename (string): path to a file 34 | 35 | Returns: 36 | bool: True if the filename ends with a known image extension 37 | """ 38 | return has_file_allowed_extension(filename, IMG_EXTENSIONS) 39 | 40 | 41 | def make_dataset(directory, class_to_idx, extensions=None, is_valid_file=None): 42 | instances = [] 43 | directory = os.path.expanduser(directory) 44 | both_none = extensions is None and is_valid_file is None 45 | both_something = extensions is not None and is_valid_file is not None 46 | if both_none or both_something: 47 | raise ValueError("Both extensions and is_valid_file cannot be None or not None at the same time") 48 | if extensions is not None: 49 | def is_valid_file(x): 50 | return has_file_allowed_extension(x, extensions) 51 | for target_class in sorted(class_to_idx.keys()): 52 | class_index = class_to_idx[target_class] 53 | target_dir = os.path.join(directory, target_class) 54 | if not os.path.isdir(target_dir): 55 | continue 56 | for root, _, fnames in sorted(os.walk(target_dir, followlinks=True)): 57 | for fname in sorted(fnames): 58 | path = os.path.join(root, fname) 59 | if is_valid_file(path): 60 | item = path, class_index 61 | instances.append(item) 62 | return instances 63 | 64 | 65 | def pil_loader(path): 66 | # open path as file to avoid ResourceWarning (https://github.com/python-pillow/Pillow/issues/835) 67 | with open(path, 'rb') as f: 68 | img = Image.open(f) 69 | return img.convert('RGB') 70 | 71 | 72 | def accimage_loader(path): 73 | import accimage 74 | try: 75 | return accimage.Image(path) 76 | except IOError: 77 | # Potentially a decoding problem, fall back to PIL.Image 78 | return pil_loader(path) 79 | 80 | 81 | def default_loader(path): 82 | from torchvision import get_image_backend 83 | if get_image_backend() == 'accimage': 84 | return accimage_loader(path) 85 | else: 86 | return pil_loader(path) 87 | 88 | 89 | IMG_EXTENSIONS = ('.jpg', '.jpeg', '.png', '.ppm', '.bmp', '.pgm', '.tif', '.tiff', '.webp') 90 | 91 | 92 | 93 | class YFCC_CL_Dataset_offline_val(Dataset): 94 | def __init__(self, args, loader = default_loader, extensions=IMG_EXTENSIONS, transform=None, 95 | target_transform=None): 96 | 97 | fname = args.data_val 98 | root = args.root 99 | 100 | print("YFCC_CL dataset loader = {}; extensions = {}".format(loader, extensions)) 101 | 102 | sys.stdout.flush() 103 | 104 | if isinstance(fname, torch._six.string_classes): 105 | fname = os.path.expanduser(fname) 106 | self.fname = fname 107 | 108 | self.transform = transform 109 | self.target_transform = target_transform 110 | 111 | self.labels, self.time_taken, self.user, self.store_loc = self._make_data(self.fname, root = root) 112 | if len(self.labels) == 0: 113 | msg = "Found 0 files in subfolders of: {}\n".format(self.fname) 114 | if extensions is not None: 115 | msg += "Supported extensions are: {}".format(",".join(extensions)) 116 | raise RuntimeError(msg) 117 | 118 | self.loader = loader 119 | self.extensions = extensions 120 | self.root = root 121 | 122 | self.batch_size = torch.cuda.device_count()*args.batch_size 123 | self.is_forward = True 124 | self.offset = 0 125 | 126 | print("root = {}; time_taken (an example) = {}; time_taken.len = {}; batch_size = {}".format(root, self.time_taken[1000], len(self.time_taken), self.batch_size)) 127 | 128 | def _make_data(self, fname, root): 129 | # read data 130 | fval = open(fname, 'r') 131 | lines_val = fval.readlines() 132 | labels = [None] * len(lines_val) 133 | time = [None] * len(lines_val) 134 | user = [None] * len(lines_val) 135 
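# each csv line is parsed below as: field 0 = class label, field 2 = unix timestamp at which the photo was taken, field 3 = uploader id, last field = image path relative to the dataset root (its trailing newline is stripped with [:-1])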
| store_loc = [None] * len(lines_val) 136 | 137 | for i in range(len(lines_val)): 138 | line_splitted = lines_val[i].split(",") 139 | labels[i] = int(line_splitted[0]) 140 | time[i] = int(line_splitted[2]) 141 | user[i] = line_splitted[3] 142 | store_loc[i] = line_splitted[-1][:-1] 143 | return labels, time, user, store_loc 144 | 145 | def set_transfer_time_point(self, args, val_set, time_last, is_forward = True): 146 | # find idx that is larger and closest to time_last 147 | self.is_forward = is_forward 148 | for i in range(len(self.time_taken)): 149 | if self.time_taken[i] >= time_last: 150 | print("[set_transfer_time_point]: time_last = {}; time[{}] = {}".format(time_last, i, self.time_taken[i])) 151 | self.offset = i 152 | return 153 | self.offset = len(self.time_taken) - 1 154 | 155 | def __getitem__(self, index): 156 | if self.is_forward: 157 | index = min(len(self.labels), index + self.offset) 158 | else: 159 | index = max(0, self.offset - index) 160 | 161 | if self.root is not None: 162 | path = self.root + self.store_loc[index] 163 | else: 164 | path = self.store_loc[index] 165 | 166 | sample = self.loader(path) 167 | if self.transform is not None: 168 | sample = self.transform(sample) 169 | 170 | return sample, self.labels[index], self.time_taken[index], index 171 | 172 | 173 | def __len__(self): 174 | if self.is_forward: 175 | return len(self.labels) - self.offset 176 | else: 177 | return self.offset + 1 178 | 179 | 180 | class YFCC_CL_Dataset_online(Dataset): 181 | def __init__(self, args, loader = default_loader, extensions=IMG_EXTENSIONS, transform=None, transform_RepBuf = None, 182 | target_transform=None, target_transform_RepBuf = None, trans_test = None): 183 | 184 | fname = args.data 185 | root = args.root 186 | size_buf = args.size_replay_buffer 187 | 188 | print("YFCC_CL dataset loader = {}; extensions = {}".format(loader, extensions)) 189 | 190 | sys.stdout.flush() 191 | 192 | if isinstance(fname, torch._six.string_classes): 193 | fname = os.path.expanduser(fname) 194 | self.fname = fname 195 | 196 | # for backwards-compatibility 197 | self.transform = transform 198 | self.transform_test = trans_test 199 | self.target_transform = target_transform 200 | self.transform_RepBuf = transform_RepBuf 201 | self.target_transform_RepBuf = target_transform_RepBuf 202 | 203 | # valid initial and final index 204 | self.used_data_start = args.used_data_start 205 | self.used_data_end = args.used_data_end 206 | 207 | print("[YFCC_CL_Dataset_ConGraDv4] trans_test = {}".format(self.transform_test)) 208 | 209 | self._make_data() 210 | self.data_size = len(self.labels) 211 | self.data_size_per_epoch = math.ceil(self.data_size/args.epochs) 212 | 213 | self.batch_size = torch.cuda.device_count()*args.batch_size 214 | 215 | if self.data_size_per_epoch % self.batch_size != 0: 216 | self.data_size_per_epoch = self.data_size_per_epoch - self.data_size_per_epoch % self.batch_size + self.batch_size 217 | 218 | 219 | self.loader = loader 220 | self.extensions = extensions 221 | self.root = root 222 | 223 | # for replay buffer 224 | self.size_buf = size_buf 225 | 226 | self.repBuf_sample_rate = 0.5 # use this later, test 0.5 case first 227 | self.sampling_strategy = args.sampling_strategy 228 | 229 | self.NOSubBatch = args.NOSubBatch 230 | self.SubBatch_index_offset = int(self.batch_size/self.NOSubBatch) 231 | 232 | self.gradient_steps_per_batch = int(args.gradient_steps_per_batch) # repBatch_rate is multiplied by this 233 | # ratio between the replay buffer and the new input batch (rounded up to 
an integer) 234 | self.repBatch_rate = math.ceil(self.repBuf_sample_rate/(1-self.repBuf_sample_rate)) 235 | 236 | self.repType = args.ReplayType 237 | 238 | if self.sampling_strategy == 'Reservoir': 239 | self.buf_out_dir = args.output_dir + '/reservoir_buf' 240 | os.makedirs(self.buf_out_dir, exist_ok = True) 241 | self.buf_last = None 242 | self.gpu = args.gpu 243 | 244 | if self.sampling_strategy == 'RingBuf': 245 | self.samples_per_class = math.floor(size_buf / args.num_classes) 246 | print("[ConGraDv4]: self.samples_per_class = {}".format(self.samples_per_class)) 247 | self.num_classes = args.num_classes 248 | self.buf_out_dir = args.output_dir + '/RingBuf_buf' 249 | os.makedirs(self.buf_out_dir, exist_ok = True) 250 | self.buf_last = None 251 | self.gpu = args.gpu 252 | 253 | print("[initData]: root = {}; time_taken (an example) = {}; time_taken.len = {}; batch_size = {}; size_buf = {}; repBuf_sample_rate = {}".format(root, self.time_taken[1000], len(self.time_taken), self.batch_size, self.size_buf, self.repBuf_sample_rate)) 254 | print("[initData]: repBatch_rate = {}; NOSubBatch = {}; SubBatch_index_offset = {}".format(self.repBatch_rate, self.NOSubBatch, self.SubBatch_index_offset)) 255 | print("[initData]: transform = {}; transform_RepBuf = {}".format(self.transform, self.transform_RepBuf)) 256 | 257 | def _make_data(self): 258 | self.labels = torch.load(self.fname+'train_labels.torchSave') 259 | self.time_taken = torch.load(self.fname+'train_time.torchSave') 260 | 261 | self.user = torch.load(self.fname+'train_userID.torchSave') 262 | self.store_loc = [None] * len(self.labels) 263 | self.idx_data = [] 264 | 265 | def _change_data_range_FIFO(self): 266 | print("reading store location from {}".format(self.fname+'train_store_loc.torchSave')) 267 | tmp_loc = torch.load(self.fname+'train_store_loc.torchSave') 268 | print("tmp_loc.size = {}".format(len(tmp_loc))) 269 | for i in range(self.used_data_start, self.used_data_end): 270 | self.store_loc[i] = tmp_loc[i][:-1] 271 | if i % 1e5 == 0: 272 | print("store_loc[{}] = {}".format(i, self.store_loc[i])) 273 | sys.stdout.flush() 274 | 275 | def _change_data_range_reservoir(self, idx_Reservoir_sample, buf_init, epoch = 0): 276 | tmp_loc = torch.load(self.fname+'train_store_loc.torchSave') 277 | i = 0 278 | 279 | for i in range(len(tmp_loc)): 280 | if (i >= self.used_data_start - self.batch_size and (self.used_data_end <= 0 or i < self.used_data_end)): 281 | self.store_loc[i] = tmp_loc[i][:-1] 282 | 283 | # create a new reservoir idx_set 284 | if i >= self.used_data_start and i % self.batch_size == 0: 285 | batch_num = i//self.batch_size 286 | batch_num_curr_epoch = (i-self.used_data_start)//self.batch_size 287 | 288 | buf_file_name = self.buf_out_dir + '/{}.buf'.format(batch_num_curr_epoch) 289 | buf_curr = self.buf_last.clone() 290 | 291 | if batch_num == 1: 292 | buf_curr = torch.tensor(list(range(i-self.batch_size, i))) 293 | elif batch_num > 1 and buf_curr.numel() < self.size_buf: 294 | buf_curr = torch.cat((buf_curr, torch.tensor(list(range(i-self.batch_size, i))))).unique() 295 | 296 | elif batch_num > 1: 297 | # do reservoir sampling when repBuf size reaches the maximum value 298 | reservoir_value = torch.randint(i, [self.batch_size]).unique() 299 | replace_idx = (reservoir_value < buf_curr.numel()).nonzero().flatten() 300 | if replace_idx.numel() > 0: 301 | buf_curr[reservoir_value[replace_idx]] = (i-replace_idx-1) 302 | 303 | if buf_curr.numel() > self.size_buf: 304 | buf_curr = buf_curr[:self.size_buf] 305 | 306 | self.buf_last =
buf_curr.clone() 307 | # save buffer 308 | if self.gpu == 0: 309 | torch.save(buf_curr, buf_file_name) 310 | 311 | 312 | elif idx_Reservoir_sample >=0 and idx_Reservoir_sample < buf_init.numel() and i == buf_init[idx_Reservoir_sample]: 313 | self.store_loc[i] = tmp_loc[i][:-1] 314 | idx_Reservoir_sample += 1 315 | 316 | 317 | if i % 1e5 == 0: 318 | print("store_loc[{}] = {}".format(i, self.store_loc[i])) 319 | sys.stdout.flush() 320 | 321 | if self.used_data_end > 0 and i >= self.used_data_end: 322 | break 323 | 324 | i += 1 325 | 326 | if self.gpu == 0: 327 | buf_last_file_name = self.buf_out_dir + '/last{}.buf'.format(epoch) 328 | torch.save(self.buf_last, buf_last_file_name) 329 | 330 | def _change_data_range_RingBuf(self, idx_RingBuf_sample, buf_init, buf_curr, idx_buf_curr, epoch): 331 | tmp_loc = torch.load(self.fname+'train_store_loc.torchSave') 332 | i = 0 333 | for i in range(len(tmp_loc)): 334 | if (i >= self.used_data_start - self.batch_size and (self.used_data_end <= 0 or i < self.used_data_end)): 335 | self.store_loc[i] = tmp_loc[i][:-1] 336 | 337 | if i >= self.used_data_start: 338 | # save buffer 339 | if i % self.batch_size == 0 and self.gpu == 0: 340 | self.buf_last = buf_curr[:] 341 | self.idx_buf_last = idx_buf_curr[:] 342 | batch_num = i//self.batch_size 343 | batch_num_curr_epoch = (i-self.used_data_start)//self.batch_size 344 | buf_file_name = self.buf_out_dir + '/{}.buf'.format(batch_num_curr_epoch) 345 | torch.save(buf_curr, buf_file_name) 346 | 347 | class_curr = self.labels[i] 348 | 349 | if buf_curr[class_curr] is None: 350 | buf_curr[class_curr] = torch.tensor([i]) 351 | idx_buf_curr[class_curr] = 0 352 | 353 | elif buf_curr[class_curr].numel() < self.samples_per_class: 354 | buf_curr[class_curr] = torch.cat((buf_curr[class_curr], torch.tensor([i]))) 355 | idx_buf_curr[class_curr] += 1 356 | 357 | else: 358 | idx_buf_curr[class_curr] = (idx_buf_curr[class_curr] + 1) % self.samples_per_class 359 | buf_curr[class_curr][idx_buf_curr[class_curr]] = i 360 | 361 | 362 | elif idx_RingBuf_sample >=0 and idx_RingBuf_sample < buf_init.numel() and i == buf_init[idx_RingBuf_sample]: 363 | self.store_loc[i] = tmp_loc[i][:-1] 364 | idx_RingBuf_sample += 1 365 | 366 | if i % 1e5 == 0: 367 | print("store_loc[{}] = {}".format(i, self.store_loc[i])) 368 | sys.stdout.flush() 369 | 370 | if self.used_data_end > 0 and i >= self.used_data_end: 371 | break 372 | 373 | i += 1 374 | 375 | if self.gpu == 0: 376 | buf_last_file_name = self.buf_out_dir + '/buf_last{}.buf'.format(epoch) 377 | buf_idx_last_file_name = self.buf_out_dir + '/buf_idx_last{}.buf'.format(epoch) 378 | 379 | torch.save(self.buf_last, buf_last_file_name) 380 | torch.save(self.idx_buf_last, buf_idx_last_file_name) 381 | 382 | def _set_data_idx(self, epoch): 383 | self.offset = epoch*self.data_size_per_epoch 384 | size_curr_epoch = min(self.data_size - self.offset, self.data_size_per_epoch) 385 | size_curr_epoch = size_curr_epoch - (size_curr_epoch % self.batch_size) 386 | 387 | batchSize = int(self.batch_size/self.NOSubBatch) 388 | self.data_idx = [None]* (size_curr_epoch *self.gradient_steps_per_batch) 389 | iter_total = size_curr_epoch//batchSize 390 | bsReplicated = batchSize*self.gradient_steps_per_batch 391 | for i in range(0, int(iter_total)): 392 | self.data_idx[i*bsReplicated:(i+1)*bsReplicated] = list(range(self.offset+i*batchSize, self.offset+(i+1)*batchSize))*self.gradient_steps_per_batch 393 | 394 | 395 | def _change_data_range(self, epoch = 0): 396 | self._set_data_idx(epoch) 397 | # compute data range to 
change 398 | if self.sampling_strategy == 'Reservoir' or self.sampling_strategy == 'RingBuf': 399 | self.used_data_start = epoch*self.data_size_per_epoch 400 | self.used_data_end = min(len(self.labels), (epoch+1)*self.data_size_per_epoch) 401 | else: 402 | self.used_data_start = max(0, epoch*self.data_size_per_epoch-self.size_buf-self.batch_size) 403 | self.used_data_end = min(len(self.labels), (epoch+1)*self.data_size_per_epoch+self.batch_size) 404 | 405 | print("change valid data range to: [{},{}]".format(self.used_data_start, self.used_data_end)) 406 | 407 | self.store_loc = [None] * len(self.labels) 408 | if self.sampling_strategy == 'Reservoir': 409 | if self.used_data_start == 0: 410 | self.buf_last = torch.tensor(list(range(self.batch_size))).long() 411 | buf_init = self.buf_last 412 | idx_Reservoir_sample = -1 413 | else: 414 | idx_Reservoir_sample = 0 415 | buf_last_file_name = self.buf_out_dir + '/last{}.buf'.format(epoch-1) 416 | self.buf_last = torch.load(buf_last_file_name) 417 | self.buf_last = self.buf_last.unique() 418 | buf_init, _ = self.buf_last.clone().flatten().sort() 419 | print("[reservoir]: idx_Reservoir_sample = {}; buf_init = {}".format(idx_Reservoir_sample, buf_init[:10])) 420 | self._change_data_range_reservoir(idx_Reservoir_sample, buf_init, epoch = epoch) 421 | 422 | elif self.sampling_strategy == 'RingBuf': 423 | # precompute the maximum index for each iteration 424 | print("[RingBuf]: entering ringBuf init stage") 425 | sys.stdout.flush() 426 | if self.used_data_start == 0: 427 | print("[RingBuf]: entering ringBuf init stage 1") 428 | self.buf_last = [None] * self.num_classes 429 | self.idx_buf_last = [-1] * self.num_classes 430 | buf_init = None 431 | idx_RingBuf_sample = -1 432 | else: 433 | print("[RingBuf]: entering ringBuf init stage 2") 434 | sys.stdout.flush() 435 | idx_RingBuf_sample = 0 436 | buf_last_file_name = self.buf_out_dir + '/buf_last{}.buf'.format(epoch-1) 437 | buf_idx_last_file_name = self.buf_out_dir + '/buf_idx_last{}.buf'.format(epoch-1) 438 | 439 | self.buf_last = torch.load(buf_last_file_name) 440 | self.idx_buf_last = torch.load(buf_idx_last_file_name) 441 | buf_init = None 442 | for i in range(0, len(self.buf_last)): 443 | if self.buf_last[i] is not None: 444 | if buf_init is None: 445 | buf_init = self.buf_last[i].clone() 446 | else: 447 | buf_init = torch.cat((buf_init, self.buf_last[i])) 448 | 449 | buf_init, _ = buf_init.flatten().sort() 450 | buf_init = buf_init.unique() 451 | print("[RingBuf]: idx_RingBuf_sample = {}; buf_init = {}".format(idx_RingBuf_sample, buf_init[:10])) 452 | sys.stdout.flush() 453 | buf_curr = self.buf_last[:] 454 | idx_buf_curr = self.idx_buf_last[:] 455 | self._change_data_range_RingBuf(idx_RingBuf_sample, buf_init, buf_curr, idx_buf_curr, epoch) 456 | else: 457 | self._change_data_range_FIFO() 458 | 459 | 460 | def _sample_FIFO(self, index): 461 | if index < self.batch_size: 462 | return 0 463 | else: 464 | repBuf_idx = random.randint(max(0, index-self.size_buf-self.batch_size), index-self.batch_size) 465 | return repBuf_idx 466 | 467 | def _sample_reservoir(self, index): 468 | batch_num = math.floor(index/self.batch_size) 469 | if batch_num == 0: 470 | return 0 471 | else: 472 | batch_num_curr_epoch = math.floor((index - self.used_data_start)/self.batch_size) 473 | buf_file_name = self.buf_out_dir + '/{}.buf'.format(batch_num_curr_epoch) 474 | buf_curr = torch.load(buf_file_name) 475 | repBuf_idx = buf_curr[random.randint(0, buf_curr.numel()-1)].item() 476 | return repBuf_idx 477 | 478 | def 
_sample_RingBuf(self, index): 479 | batch_num = math.floor(index/self.batch_size) 480 | if batch_num == 0: 481 | return 0 482 | else: 483 | batch_num_curr_epoch = math.floor((index - self.used_data_start)/self.batch_size) 484 | buf_file_name = self.buf_out_dir + '/{}.buf'.format(batch_num_curr_epoch) 485 | buf_curr = torch.load(buf_file_name) 486 | 487 | class_idx_all = torch.randperm(len(buf_curr)) 488 | class_idx = 0 489 | while buf_curr[class_idx_all[class_idx]] is None: 490 | class_idx += 1 491 | class_idx = class_idx_all[class_idx] 492 | repBuf_idx = buf_curr[class_idx][random.randint(0, buf_curr[class_idx].numel()-1)].item() 493 | return repBuf_idx 494 | 495 | 496 | def _sample(self, index): 497 | if self.sampling_strategy == 'FIFO': 498 | return self._sample_FIFO(index) 499 | elif self.sampling_strategy == 'RingBuf': 500 | return self._sample_RingBuf(index) 501 | elif self.sampling_strategy == 'Reservoir': 502 | return self._sample_reservoir(index) 503 | else: 504 | return self._sample_FIFO(index) 505 | 506 | def __getitem__(self, index): 507 | index = self.data_idx[index] 508 | 509 | if self.repType == 'mixRep': 510 | # mixed replay 511 | index_pop = index 512 | else: 513 | # pure replay based training 514 | index_pop = self._sample(index + self.batch_size) 515 | 516 | if index_pop < 0: 517 | index_pop = 0 518 | is_valid = torch.tensor(0) 519 | else: 520 | is_valid = torch.tensor(1) 521 | 522 | num_batches = 1 523 | 524 | 525 | target_pop = self.labels[index_pop] 526 | path_pop = self.root + self.store_loc[index_pop] 527 | sample_pop = self.loader(path_pop) 528 | if self.transform is not None: 529 | sample_pop = self.transform(sample_pop) 530 | if self.target_transform is not None: 531 | target_pop = self.target_transform(target_pop) 532 | 533 | sample_test = torch.zeros(self.NOSubBatch, sample_pop.size()[0], sample_pop.size()[1], sample_pop.size()[2]) 534 | target_test = torch.zeros(self.NOSubBatch).long() 535 | test_idx = torch.zeros(self.NOSubBatch).long() 536 | time_taken_test = torch.zeros(self.NOSubBatch).long() 537 | user_test = torch.zeros(self.NOSubBatch).long() 538 | 539 | # get the test data of full batch size 540 | for i in range(0, self.NOSubBatch): 541 | test_idx[i] = index+i*self.SubBatch_index_offset 542 | if test_idx[i] >= len(self.time_taken): 543 | test_idx[i] = len(self.time_taken) - 1 544 | 545 | time_taken_test[i] = self.time_taken[test_idx[i]] 546 | target_test[i] = self.labels[test_idx[i]] 547 | user_test[i] = self.user[test_idx[i]] 548 | 549 | path = self.root + self.store_loc[test_idx[i]] 550 | sample = self.loader(path) 551 | if self.transform is not None: 552 | sample_test[i] = self.transform_test(sample) 553 | if self.target_transform is not None: 554 | target_test[i] = self.target_transform(target_test[i]) 555 | 556 | # randomly sample from repBuf 557 | sample_RepBuf = torch.zeros(self.repBatch_rate* num_batches, sample_pop.size()[0], sample_pop.size()[1], sample_pop.size()[2]) 558 | target_RepBuf = torch.zeros(self.repBatch_rate* num_batches).long() 559 | repBuf_idx = torch.zeros(self.repBatch_rate * num_batches).long() 560 | 561 | if self.repType == 'mixRep': 562 | for i in range(0, int(self.repBatch_rate * num_batches)): 563 | repBuf_idx[i] = self._sample(index) 564 | path_RepBuf = self.root + self.store_loc[repBuf_idx[i]] 565 | target_RepBuf[i] = self.labels[repBuf_idx[i]] 566 | 567 | sample_RepBuf_tmp = self.loader(path_RepBuf) 568 | if self.transform_RepBuf is not None: 569 | sample_RepBuf[i] = self.transform_RepBuf(sample_RepBuf_tmp) 570 | if 
self.target_transform_RepBuf is not None: 571 | target_RepBuf[i] = self.target_transform_RepBuf(target_RepBuf[i]) 572 | 573 | return sample_test, target_test, user_test, time_taken_test, test_idx, sample_pop, target_pop, index_pop, sample_RepBuf, target_RepBuf, repBuf_idx 574 | 575 | 576 | 577 | def __len__(self): 578 | return len(self.data_idx) 579 | -------------------------------------------------------------------------------- /CLOC/data_preparation/download_images/download_images.py: -------------------------------------------------------------------------------- 1 | from datetime import datetime 2 | import sys 3 | import csv 4 | 5 | import io 6 | import random 7 | import shutil 8 | 9 | from multiprocessing import Pool 10 | import pathlib 11 | 12 | import requests 13 | from PIL import Image 14 | import time 15 | import os 16 | 17 | import wget 18 | 19 | def image_downloader(url_and_path:list): 20 | img_url = url_and_path[0] 21 | save_path = url_and_path[1] 22 | 23 | try: 24 | res = requests.get(img_url, stream=True) 25 | # count = 1 26 | # while res.status_code not in [ 301, 302, 303, 307, 308, 200 ] and count <= 5: 27 | # res = requests.get(img_url, stream=True) 28 | # # print(f'Retry: {count} {img_url}') 29 | # count += 1 30 | # checking the type for image 31 | if 'image' not in res.headers.get("content-type", ''): 32 | # print('ERROR: URL does not appear to be an image') 33 | return 0 34 | if not os.path.exists(save_path): 35 | os.makedirs(os.path.dirname(save_path), exist_ok = True) 36 | i = Image.open(io.BytesIO(res.content)) 37 | i.save(save_path) 38 | return 1 39 | 40 | except Exception: # treat network errors and broken images as a failed download 41 | return 0 42 | 43 | def test_function(number): 44 | return number*number 45 | 46 | def run_downloader(process:int, urls_and_paths:list): 47 | 48 | pool = Pool(process) # honor the requested number of worker processes 49 | 50 | results = pool.imap_unordered(image_downloader, urls_and_paths) 51 | total_number = 0 52 | success_number = 0 53 | 54 | for r in results: 55 | total_number = total_number + 1 56 | if r == 1: 57 | success_number = success_number + 1 58 | pool.close() # all results consumed; release the worker processes 59 | return success_number, total_number 60 | 61 | header = ['photoid', 'uid', 'unickname', 'datetaken', 'capturedevice', 'title', 'description', 'usertags','machinetags','longitude','latitude','accuracy', 'web page urls', 'downloadurl original', 'downloadurl medium size images (used)', 'local path'] 62 | 63 | 64 | # f_out = open('new_metadata.csv.zip', 'r') 65 | csv.field_size_limit(sys.maxsize) 66 | 67 | urls_and_paths = [] 68 | success_number_total = 0 69 | 70 | num_process = 5 # accelerate downloading using multiple processes 71 | current_start = 0 # first image index to process; set it to the last downloaded index to resume after an interrupted run 72 | batch_size = 200 # number of images to download simultaneously 73 | part_length = 50000000 # number of images per part when splitting the dataset across several concurrent downloaders 74 | current_part = 0 # index of the part that this downloader instance is responsible for 75 | current_end = part_length*(current_part+1) 76 | root_folder = 'dataset/' # root folder for the downloaded images; keep the trailing '/' and keep it consistent with the dataset folder used by the training code 77 | 78 | print("downloading files [{} , {}]".format(current_start, current_end)) 79 | sys.stdout.flush() 80 | fname = '../release/download_link_and_locations.csv' 81 | with
open(fname) as f: 82 | csv_reader = csv.reader(f, delimiter=',') 83 | line_count = 0 84 | for row in csv_reader: 85 | line_count += 1 86 | 87 | if line_count < current_start: 88 | continue 89 | elif line_count >= current_end: 90 | break 91 | 92 | downloadurl_m = row[0] 93 | local_path = root_folder + row[1] 94 | 95 | if line_count % batch_size == 0 : 96 | # print('line count = {}'.format(line_count)) 97 | # print('downloadurl_m = {}; local path = {}'.format(downloadurl_m, local_path)) 98 | urls_and_paths += [[downloadurl_m, local_path]] 99 | success_number, _ = run_downloader(num_process, urls_and_paths) 100 | success_number_total += success_number 101 | urls_and_paths = [] 102 | if line_count % 1000 == 0: 103 | print('success_rate = {}/{}'.format(success_number_total, line_count)) 104 | sys.stdout.flush() 105 | else: 106 | urls_and_paths += [[downloadurl_m, local_path]] 107 | 108 | 109 | print(f'Processed {line_count} lines.') 110 | -------------------------------------------------------------------------------- /CLOC/exp_BS/eval_BS128.sh: -------------------------------------------------------------------------------- 1 | #!/bin/sh 2 | 3 | python ../code_online/no_PoLRS/main_online.py \ 4 | -a resnet50 \ 5 | --lr 0.05 \ 6 | --weight-decay 0.0001 \ 7 | --batch-size 256 \ 8 | --NOSubBatch 2 \ 9 | --workers 32 \ 10 | --size_replay_buffer 40000 \ 11 | --val_freq 1000 \ 12 | --epochs 90 \ 13 | --adjust_lr 1 \ 14 | --SaveInter 1 \ 15 | --use_ADRep 0 \ 16 | --evaluate \ 17 | --resume 'results_no_PoLRS/resnet50_lr0.05_epochs90_bufSize40000_adjustLr1_BS256_NOSubBatch2_noADRep/checkpoint.pth.tar' \ 18 | --cell_id "../data_preparation/release/cellID_yfcc100m_metadata_with_labels_usedDataRatio0.05_t110000_t250.npy" \ 19 | --root "../data_preparation/release/dataset/images/" \ 20 | --data "../data_preparation/release/" \ 21 | --data_val "../data_preparation/release/yfcc100m_metadata_with_labels_usedDataRatio0.05_t110000_t250_valid_files_2004To2014_compact_val.csv" \ 22 | --dist-url 'tcp://127.0.0.1:11805' --dist-backend 'nccl' --world-size 1 --rank 0 --multiprocessing-distributed 23 | -------------------------------------------------------------------------------- /CLOC/exp_BS/eval_BS256.sh: -------------------------------------------------------------------------------- 1 | #!/bin/sh 2 | 3 | python ../code_online/no_PoLRS/main_online.py \ 4 | -a resnet50 \ 5 | --lr 0.05 \ 6 | --weight-decay 0.0001 \ 7 | --batch-size 256 \ 8 | --NOSubBatch 1 \ 9 | --workers 32 \ 10 | --size_replay_buffer 40000 \ 11 | --val_freq 1000 \ 12 | --epochs 90 \ 13 | --adjust_lr 1 \ 14 | --SaveInter 1 \ 15 | --use_ADRep 0 \ 16 | --evaluate \ 17 | --resume 'results_no_PoLRS/resnet50_lr0.05_epochs90_bufSize40000_adjustLr1_BS256_noADRep/checkpoint.pth.tar' \ 18 | --cell_id "../data_preparation/release/cellID_yfcc100m_metadata_with_labels_usedDataRatio0.05_t110000_t250.npy" \ 19 | --root "../data_preparation/release/dataset/images/" \ 20 | --data "../data_preparation/release/" \ 21 | --data_val "../data_preparation/release/yfcc100m_metadata_with_labels_usedDataRatio0.05_t110000_t250_valid_files_2004To2014_compact_val.csv" \ 22 | --dist-url 'tcp://127.0.0.1:11805' --dist-backend 'nccl' --world-size 1 --rank 0 --multiprocessing-distributed 23 | -------------------------------------------------------------------------------- /CLOC/exp_BS/eval_BS64.sh: -------------------------------------------------------------------------------- 1 | #!/bin/sh 2 | 3 | python ../code_online/no_PoLRS/main_online.py \ 4 | -a resnet50 \ 5 | --lr 0.05 \ 
6 | --weight-decay 0.0001 \ 7 | --batch-size 256 \ 8 | --NOSubBatch 4 \ 9 | --workers 32 \ 10 | --size_replay_buffer 40000 \ 11 | --val_freq 1000 \ 12 | --epochs 90 \ 13 | --adjust_lr 1 \ 14 | --SaveInter 1 \ 15 | --use_ADRep 0 \ 16 | --evaluate \ 17 | --resume 'results_no_PoLRS/resnet50_lr0.05_epochs90_bufSize40000_adjustLr1_BS256_NOSubBatch4_noADRep/checkpoint.pth.tar' \ 18 | --cell_id "../data_preparation/release/cellID_yfcc100m_metadata_with_labels_usedDataRatio0.05_t110000_t250.npy" \ 19 | --root "../data_preparation/release/dataset/images/" \ 20 | --data "../data_preparation/release/" \ 21 | --data_val "../data_preparation/release/yfcc100m_metadata_with_labels_usedDataRatio0.05_t110000_t250_valid_files_2004To2014_compact_val.csv" \ 22 | --dist-url 'tcp://127.0.0.1:11805' --dist-backend 'nccl' --world-size 1 --rank 0 --multiprocessing-distributed 23 | -------------------------------------------------------------------------------- /CLOC/exp_BS/train_BS128.sh: -------------------------------------------------------------------------------- 1 | #!/bin/sh 2 | 3 | python ../code_online/no_PoLRS/main_online.py \ 4 | -a resnet50 \ 5 | --lr 0.05 \ 6 | --weight-decay 0.0001 \ 7 | --batch-size 256 \ 8 | --NOSubBatch 2 \ 9 | --workers 32 \ 10 | --size_replay_buffer 40000 \ 11 | --val_freq 1000 \ 12 | --epochs 90 \ 13 | --adjust_lr 1 \ 14 | --SaveInter 1 \ 15 | --use_ADRep 0 \ 16 | --cell_id "../data_preparation/release/cellID_yfcc100m_metadata_with_labels_usedDataRatio0.05_t110000_t250.npy" \ 17 | --root "../data_preparation/release/dataset/images/" \ 18 | --data "../data_preparation/release/" \ 19 | --data_val "../data_preparation/release/yfcc100m_metadata_with_labels_usedDataRatio0.05_t110000_t250_valid_files_2004To2014_compact_val.csv" \ 20 | --dist-url 'tcp://127.0.0.1:11805' --dist-backend 'nccl' --world-size 1 --rank 0 --multiprocessing-distributed 21 | -------------------------------------------------------------------------------- /CLOC/exp_BS/train_BS256.sh: -------------------------------------------------------------------------------- 1 | #!/bin/sh 2 | 3 | python ../code_online/no_PoLRS/main_online.py \ 4 | -a resnet50 \ 5 | --lr 0.05 \ 6 | --weight-decay 0.0001 \ 7 | --batch-size 256 \ 8 | --NOSubBatch 1 \ 9 | --workers 32 \ 10 | --size_replay_buffer 40000 \ 11 | --val_freq 1000 \ 12 | --epochs 90 \ 13 | --adjust_lr 1 \ 14 | --SaveInter 1 \ 15 | --use_ADRep 0 \ 16 | --cell_id "../data_preparation/release/cellID_yfcc100m_metadata_with_labels_usedDataRatio0.05_t110000_t250.npy" \ 17 | --root "../data_preparation/release/dataset/images/" \ 18 | --data "../data_preparation/release/" \ 19 | --data_val "../data_preparation/release/yfcc100m_metadata_with_labels_usedDataRatio0.05_t110000_t250_valid_files_2004To2014_compact_val.csv" \ 20 | --dist-url 'tcp://127.0.0.1:11805' --dist-backend 'nccl' --world-size 1 --rank 0 --multiprocessing-distributed 21 | -------------------------------------------------------------------------------- /CLOC/exp_BS/train_BS64.sh: -------------------------------------------------------------------------------- 1 | #!/bin/sh 2 | 3 | python ../code_online/no_PoLRS/main_online.py \ 4 | -a resnet50 \ 5 | --lr 0.05 \ 6 | --weight-decay 0.0001 \ 7 | --batch-size 256 \ 8 | --NOSubBatch 4 \ 9 | --workers 32 \ 10 | --size_replay_buffer 40000 \ 11 | --val_freq 1000 \ 12 | --epochs 90 \ 13 | --adjust_lr 1 \ 14 | --SaveInter 1 \ 15 | --use_ADRep 0 \ 16 | --cell_id "../data_preparation/release/cellID_yfcc100m_metadata_with_labels_usedDataRatio0.05_t110000_t250.npy" \ 17 | 
--root "../data_preparation/release/dataset/images/" \ 18 | --data "../data_preparation/release/" \ 19 | --data_val "../data_preparation/release/yfcc100m_metadata_with_labels_usedDataRatio0.05_t110000_t250_valid_files_2004To2014_compact_val.csv" \ 20 | --dist-url 'tcp://127.0.0.1:11805' --dist-backend 'nccl' --world-size 1 --rank 0 --multiprocessing-distributed 21 | -------------------------------------------------------------------------------- /CLOC/exp_LR/eval_PoLRS.sh: -------------------------------------------------------------------------------- 1 | #!/bin/sh 2 | 3 | python ../code_online/best_model/main_online_best_model.py \ 4 | -a resnet50 \ 5 | --lr 0.05 \ 6 | --weight-decay 0.0001 \ 7 | --weight_old_data 1.0 \ 8 | --batch-size 256 \ 9 | --NOSubBatch 1 \ 10 | --workers 64 \ 11 | --gradient_steps_per_batch 1 \ 12 | --size_replay_buffer 40000 \ 13 | --epochs 90 \ 14 | --LR_adjust_intv 5 \ 15 | --use_ADRep 0 \ 16 | --evaluate \ 17 | --resume 'results_best_model/resnet50_lr0.05_epochs90_bufSize40000_LrAjIntv5_BS256_noADRep/checkpoint.pth.tar' \ 18 | --cell_id "../data_preparation/release/cellID_yfcc100m_metadata_with_labels_usedDataRatio0.05_t110000_t250.npy" \ 19 | --root "../data_preparation/release/dataset/images/" \ 20 | --data "../data_preparation/release/" \ 21 | --data_val "../data_preparation/release/yfcc100m_metadata_with_labels_usedDataRatio0.05_t110000_t250_valid_files_2004To2014_compact_val.csv" \ 22 | --dist-url 'tcp://127.0.0.1:23794' --dist-backend 'nccl' --world-size 1 --rank 0 --multiprocessing-distributed 23 | -------------------------------------------------------------------------------- /CLOC/exp_LR/eval_constant.sh: -------------------------------------------------------------------------------- 1 | #!/bin/sh 2 | 3 | 4 | python ../code_online/no_PoLRS/main_online.py \ 5 | -a resnet50 \ 6 | --lr 0.05 \ 7 | --weight-decay 0.0001 \ 8 | --batch-size 256 \ 9 | --NOSubBatch 1 \ 10 | --workers 32 \ 11 | --size_replay_buffer 40000 \ 12 | --val_freq 1000 \ 13 | --epochs 90 \ 14 | --adjust_lr 0 \ 15 | --SaveInter 1 \ 16 | --use_ADRep 0 \ 17 | --evaluate \ 18 | --resume 'results_no_PoLRS/resnet50_lr0.05_epochs90_bufSize40000_BS256_noADRep/checkpoint.pth.tar' \ 19 | --cell_id "../data_preparation/release/cellID_yfcc100m_metadata_with_labels_usedDataRatio0.05_t110000_t250.npy" \ 20 | --root "../data_preparation/release/dataset/images/" \ 21 | --data "../data_preparation/release/" \ 22 | --data_val "../data_preparation/release/yfcc100m_metadata_with_labels_usedDataRatio0.05_t110000_t250_valid_files_2004To2014_compact_val.csv" \ 23 | --dist-url 'tcp://127.0.0.1:11805' --dist-backend 'nccl' --world-size 1 --rank 0 --multiprocessing-distributed 24 | -------------------------------------------------------------------------------- /CLOC/exp_LR/eval_cosine.sh: -------------------------------------------------------------------------------- 1 | #!/bin/sh 2 | 3 | python ../code_online/no_PoLRS/main_online.py \ 4 | -a resnet50 \ 5 | --lr 0.05 \ 6 | --weight-decay 0.0001 \ 7 | --batch-size 256 \ 8 | --NOSubBatch 1 \ 9 | --workers 32 \ 10 | --size_replay_buffer 40000 \ 11 | --val_freq 1000 \ 12 | --epochs 90 \ 13 | --adjust_lr 1 \ 14 | --SaveInter 1 \ 15 | --use_ADRep 0 \ 16 | --evaluate \ 17 | --resume 'results_no_PoLRS/resnet50_lr0.05_epochs90_bufSize40000_adjustLr1_BS256_noADRep/checkpoint.pth.tar' \ 18 | --cell_id "../data_preparation/release/cellID_yfcc100m_metadata_with_labels_usedDataRatio0.05_t110000_t250.npy" \ 19 | --root "../data_preparation/release/dataset/images/" \ 20 | 
--data "../data_preparation/release/" \ 21 | --data_val "../data_preparation/release/yfcc100m_metadata_with_labels_usedDataRatio0.05_t110000_t250_valid_files_2004To2014_compact_val.csv" \ 22 | --dist-url 'tcp://127.0.0.1:11805' --dist-backend 'nccl' --world-size 1 --rank 0 --multiprocessing-distributed 23 | -------------------------------------------------------------------------------- /CLOC/exp_LR/train_PoLRS.sh: -------------------------------------------------------------------------------- 1 | #!/bin/sh 2 | 3 | python ../code_online/best_model/main_online_best_model.py \ 4 | -a resnet50 \ 5 | --lr 0.05 \ 6 | --weight-decay 0.0001 \ 7 | --weight_old_data 1.0 \ 8 | --batch-size 256 \ 9 | --NOSubBatch 1 \ 10 | --workers 64 \ 11 | --gradient_steps_per_batch 1 \ 12 | --size_replay_buffer 40000 \ 13 | --epochs 90 \ 14 | --LR_adjust_intv 5 \ 15 | --use_ADRep 0 \ 16 | --cell_id "../data_preparation/release/cellID_yfcc100m_metadata_with_labels_usedDataRatio0.05_t110000_t250.npy" \ 17 | --root "../data_preparation/release/dataset/images/" \ 18 | --data "../data_preparation/release/" \ 19 | --data_val "../data_preparation/release/yfcc100m_metadata_with_labels_usedDataRatio0.05_t110000_t250_valid_files_2004To2014_compact_val.csv" \ 20 | --dist-url 'tcp://127.0.0.1:23794' --dist-backend 'nccl' --world-size 1 --rank 0 --multiprocessing-distributed 21 | -------------------------------------------------------------------------------- /CLOC/exp_LR/train_constant.sh: -------------------------------------------------------------------------------- 1 | #!/bin/sh 2 | 3 | 4 | python ../code_online/no_PoLRS/main_online.py \ 5 | -a resnet50 \ 6 | --lr 0.05 \ 7 | --weight-decay 0.0001 \ 8 | --batch-size 256 \ 9 | --NOSubBatch 1 \ 10 | --workers 32 \ 11 | --size_replay_buffer 40000 \ 12 | --val_freq 1000 \ 13 | --epochs 90 \ 14 | --adjust_lr 0 \ 15 | --SaveInter 1 \ 16 | --use_ADRep 0 \ 17 | --cell_id "../data_preparation/release/cellID_yfcc100m_metadata_with_labels_usedDataRatio0.05_t110000_t250.npy" \ 18 | --root "../data_preparation/release/dataset/images/" \ 19 | --data "../data_preparation/release/" \ 20 | --data_val "../data_preparation/release/yfcc100m_metadata_with_labels_usedDataRatio0.05_t110000_t250_valid_files_2004To2014_compact_val.csv" \ 21 | --dist-url 'tcp://127.0.0.1:11805' --dist-backend 'nccl' --world-size 1 --rank 0 --multiprocessing-distributed 22 | -------------------------------------------------------------------------------- /CLOC/exp_LR/train_cosine.sh: -------------------------------------------------------------------------------- 1 | #!/bin/sh 2 | 3 | python ../code_online/no_PoLRS/main_online.py \ 4 | -a resnet50 \ 5 | --lr 0.05 \ 6 | --weight-decay 0.0001 \ 7 | --batch-size 256 \ 8 | --NOSubBatch 1 \ 9 | --workers 32 \ 10 | --size_replay_buffer 40000 \ 11 | --val_freq 1000 \ 12 | --epochs 90 \ 13 | --adjust_lr 1 \ 14 | --SaveInter 1 \ 15 | --use_ADRep 0 \ 16 | --cell_id "../data_preparation/release/cellID_yfcc100m_metadata_with_labels_usedDataRatio0.05_t110000_t250.npy" \ 17 | --root "../data_preparation/release/dataset/images/" \ 18 | --data "../data_preparation/release/" \ 19 | --data_val "../data_preparation/release/yfcc100m_metadata_with_labels_usedDataRatio0.05_t110000_t250_valid_files_2004To2014_compact_val.csv" \ 20 | --dist-url 'tcp://127.0.0.1:11805' --dist-backend 'nccl' --world-size 1 --rank 0 --multiprocessing-distributed 21 | -------------------------------------------------------------------------------- /CLOC/exp_RepBuf/eval_39M.sh: 
-------------------------------------------------------------------------------- 1 | #!/bin/sh 2 | 3 | python ../code_online/no_PoLRS/main_online.py \ 4 | -a resnet50 \ 5 | --lr 0.05 \ 6 | --weight-decay 0.0001 \ 7 | --batch-size 256 \ 8 | --NOSubBatch 1 \ 9 | --workers 32 \ 10 | --size_replay_buffer 40000000 \ 11 | --val_freq 1000 \ 12 | --epochs 90 \ 13 | --adjust_lr 1 \ 14 | --SaveInter 1 \ 15 | --use_ADRep 0 \ 16 | --evaluate \ 17 | --resume 'results_no_PoLRS/resnet50_lr0.05_epochs90_bufSize40000000_adjustLr1_BS256_noADRep/checkpoint.pth.tar' \ 18 | --cell_id "../data_preparation/release/cellID_yfcc100m_metadata_with_labels_usedDataRatio0.05_t110000_t250.npy" \ 19 | --root "../data_preparation/release/dataset/images/" \ 20 | --data "../data_preparation/release/" \ 21 | --data_val "../data_preparation/release/yfcc100m_metadata_with_labels_usedDataRatio0.05_t110000_t250_valid_files_2004To2014_compact_val.csv" \ 22 | --dist-url 'tcp://127.0.0.1:11805' --dist-backend 'nccl' --world-size 1 --rank 0 --multiprocessing-distributed 23 | -------------------------------------------------------------------------------- /CLOC/exp_RepBuf/eval_40K.sh: -------------------------------------------------------------------------------- 1 | #!/bin/sh 2 | 3 | python ../code_online/no_PoLRS/main_online.py \ 4 | -a resnet50 \ 5 | --lr 0.05 \ 6 | --weight-decay 0.0001 \ 7 | --batch-size 256 \ 8 | --NOSubBatch 1 \ 9 | --workers 32 \ 10 | --size_replay_buffer 40000 \ 11 | --val_freq 1000 \ 12 | --epochs 90 \ 13 | --adjust_lr 1 \ 14 | --SaveInter 1 \ 15 | --use_ADRep 0 \ 16 | --evaluate \ 17 | --resume 'results_no_PoLRS/resnet50_lr0.05_epochs90_bufSize40000_adjustLr1_BS256_noADRep/checkpoint.pth.tar' \ 18 | --cell_id "../data_preparation/release/cellID_yfcc100m_metadata_with_labels_usedDataRatio0.05_t110000_t250.npy" \ 19 | --root "../data_preparation/release/dataset/images/" \ 20 | --data "../data_preparation/release/" \ 21 | --data_val "../data_preparation/release/yfcc100m_metadata_with_labels_usedDataRatio0.05_t110000_t250_valid_files_2004To2014_compact_val.csv" \ 22 | --dist-url 'tcp://127.0.0.1:11805' --dist-backend 'nccl' --world-size 1 --rank 0 --multiprocessing-distributed 23 | -------------------------------------------------------------------------------- /CLOC/exp_RepBuf/eval_4M.sh: -------------------------------------------------------------------------------- 1 | #!/bin/sh 2 | 3 | python ../code_online/no_PoLRS/main_online.py \ 4 | -a resnet50 \ 5 | --lr 0.05 \ 6 | --weight-decay 0.0001 \ 7 | --batch-size 256 \ 8 | --NOSubBatch 1 \ 9 | --workers 32 \ 10 | --size_replay_buffer 4000000 \ 11 | --val_freq 1000 \ 12 | --epochs 90 \ 13 | --adjust_lr 1 \ 14 | --SaveInter 1 \ 15 | --use_ADRep 0 \ 16 | --evaluate \ 17 | --resume 'results_no_PoLRS/resnet50_lr0.05_epochs90_bufSize4000000_adjustLr1_BS256_noADRep/checkpoint.pth.tar' \ 18 | --cell_id "../data_preparation/release/cellID_yfcc100m_metadata_with_labels_usedDataRatio0.05_t110000_t250.npy" \ 19 | --root "../data_preparation/release/dataset/images/" \ 20 | --data "../data_preparation/release/" \ 21 | --data_val "../data_preparation/release/yfcc100m_metadata_with_labels_usedDataRatio0.05_t110000_t250_valid_files_2004To2014_compact_val.csv" \ 22 | --dist-url 'tcp://127.0.0.1:11805' --dist-backend 'nccl' --world-size 1 --rank 0 --multiprocessing-distributed 23 | -------------------------------------------------------------------------------- /CLOC/exp_RepBuf/eval_ADRep.sh: -------------------------------------------------------------------------------- 1 | 
#!/bin/sh 2 | 3 | python ../code_online/no_PoLRS/main_online.py \ 4 | -a resnet50 \ 5 | --lr 0.05 \ 6 | --weight-decay 0.0001 \ 7 | --batch-size 256 \ 8 | --NOSubBatch 1 \ 9 | --workers 32 \ 10 | --size_replay_buffer 40000 \ 11 | --val_freq 1000 \ 12 | --epochs 90 \ 13 | --adjust_lr 1 \ 14 | --SaveInter 1 \ 15 | --use_ADRep 1 \ 16 | --evaluate \ 17 | --resume 'results_no_PoLRS/resnet50_lr0.05_epochs90_bufSize40000_adjustLr1_BS256/checkpoint.pth.tar' \ 18 | --cell_id "../data_preparation/release/cellID_yfcc100m_metadata_with_labels_usedDataRatio0.05_t110000_t250.npy" \ 19 | --root "../data_preparation/release/dataset/images/" \ 20 | --data "../data_preparation/release/" \ 21 | --data_val "../data_preparation/release/yfcc100m_metadata_with_labels_usedDataRatio0.05_t110000_t250_valid_files_2004To2014_compact_val.csv" \ 22 | --dist-url 'tcp://127.0.0.1:11805' --dist-backend 'nccl' --world-size 1 --rank 0 --multiprocessing-distributed 23 | -------------------------------------------------------------------------------- /CLOC/exp_RepBuf/train_39M.sh: -------------------------------------------------------------------------------- 1 | #!/bin/sh 2 | 3 | python ../code_online/no_PoLRS/main_online.py \ 4 | -a resnet50 \ 5 | --lr 0.05 \ 6 | --weight-decay 0.0001 \ 7 | --batch-size 256 \ 8 | --NOSubBatch 1 \ 9 | --workers 32 \ 10 | --size_replay_buffer 40000000 \ 11 | --val_freq 1000 \ 12 | --epochs 90 \ 13 | --adjust_lr 1 \ 14 | --SaveInter 1 \ 15 | --use_ADRep 0 \ 16 | --cell_id "../data_preparation/release/cellID_yfcc100m_metadata_with_labels_usedDataRatio0.05_t110000_t250.npy" \ 17 | --root "../data_preparation/release/dataset/images/" \ 18 | --data "../data_preparation/release/" \ 19 | --data_val "../data_preparation/release/yfcc100m_metadata_with_labels_usedDataRatio0.05_t110000_t250_valid_files_2004To2014_compact_val.csv" \ 20 | --dist-url 'tcp://127.0.0.1:11805' --dist-backend 'nccl' --world-size 1 --rank 0 --multiprocessing-distributed 21 | -------------------------------------------------------------------------------- /CLOC/exp_RepBuf/train_40K.sh: -------------------------------------------------------------------------------- 1 | #!/bin/sh 2 | 3 | python ../code_online/no_PoLRS/main_online.py \ 4 | -a resnet50 \ 5 | --lr 0.05 \ 6 | --weight-decay 0.0001 \ 7 | --batch-size 256 \ 8 | --NOSubBatch 1 \ 9 | --workers 32 \ 10 | --size_replay_buffer 40000 \ 11 | --val_freq 1000 \ 12 | --epochs 90 \ 13 | --adjust_lr 1 \ 14 | --SaveInter 1 \ 15 | --use_ADRep 0 \ 16 | --cell_id "../data_preparation/release/cellID_yfcc100m_metadata_with_labels_usedDataRatio0.05_t110000_t250.npy" \ 17 | --root "../data_preparation/release/dataset/images/" \ 18 | --data "../data_preparation/release/" \ 19 | --data_val "../data_preparation/release/yfcc100m_metadata_with_labels_usedDataRatio0.05_t110000_t250_valid_files_2004To2014_compact_val.csv" \ 20 | --dist-url 'tcp://127.0.0.1:11805' --dist-backend 'nccl' --world-size 1 --rank 0 --multiprocessing-distributed 21 | -------------------------------------------------------------------------------- /CLOC/exp_RepBuf/train_4M.sh: -------------------------------------------------------------------------------- 1 | #!/bin/sh 2 | 3 | python ../code_online/no_PoLRS/main_online.py \ 4 | -a resnet50 \ 5 | --lr 0.05 \ 6 | --weight-decay 0.0001 \ 7 | --batch-size 256 \ 8 | --NOSubBatch 1 \ 9 | --workers 32 \ 10 | --size_replay_buffer 4000000 \ 11 | --val_freq 1000 \ 12 | --epochs 90 \ 13 | --adjust_lr 1 \ 14 | --SaveInter 1 \ 15 | --use_ADRep 0 \ 16 | --cell_id 
"../data_preparation/release/cellID_yfcc100m_metadata_with_labels_usedDataRatio0.05_t110000_t250.npy" \ 17 | --root "../data_preparation/release/dataset/images/" \ 18 | --data "../data_preparation/release/" \ 19 | --data_val "../data_preparation/release/yfcc100m_metadata_with_labels_usedDataRatio0.05_t110000_t250_valid_files_2004To2014_compact_val.csv" \ 20 | --dist-url 'tcp://127.0.0.1:11805' --dist-backend 'nccl' --world-size 1 --rank 0 --multiprocessing-distributed 21 | -------------------------------------------------------------------------------- /CLOC/exp_RepBuf/train_ADRep.sh: -------------------------------------------------------------------------------- 1 | #!/bin/sh 2 | 3 | python ../code_online/no_PoLRS/main_online.py \ 4 | -a resnet50 \ 5 | --lr 0.05 \ 6 | --weight-decay 0.0001 \ 7 | --batch-size 256 \ 8 | --NOSubBatch 1 \ 9 | --workers 32 \ 10 | --size_replay_buffer 40000 \ 11 | --val_freq 1000 \ 12 | --epochs 90 \ 13 | --adjust_lr 1 \ 14 | --SaveInter 1 \ 15 | --use_ADRep 1 \ 16 | --cell_id "../data_preparation/release/cellID_yfcc100m_metadata_with_labels_usedDataRatio0.05_t110000_t250.npy" \ 17 | --root "../data_preparation/release/dataset/images/" \ 18 | --data "../data_preparation/release/" \ 19 | --data_val "../data_preparation/release/yfcc100m_metadata_with_labels_usedDataRatio0.05_t110000_t250_valid_files_2004To2014_compact_val.csv" \ 20 | --dist-url 'tcp://127.0.0.1:11805' --dist-backend 'nccl' --world-size 1 --rank 0 --multiprocessing-distributed 21 | -------------------------------------------------------------------------------- /CLOC/exp_best_model/eval_offline.sh: -------------------------------------------------------------------------------- 1 | #!/bin/sh 2 | 3 | #python main_v3.py \ 4 | python ../code_offline/main.py \ 5 | -a resnet50 \ 6 | --lr 0.025 \ 7 | --workers 64 \ 8 | --batch-size 256 \ 9 | --adjust_lr 1 \ 10 | --num_passes 1 \ 11 | --use_aug 1 \ 12 | --use_val 1 \ 13 | --val_freq 1 \ 14 | --epochs 90 \ 15 | --used_data_rate_start 0.0 \ 16 | --used_data_rate_end 1.0 \ 17 | --evaluate \ 18 | --resume 'results_offline/resnet50_lr0.025_epochs90_numPasses1_useVal1_valFreq1_BS256/checkpoint.pth.tar' \ 19 | --cell_id "../data_preparation/release/cellID_yfcc100m_metadata_with_labels_usedDataRatio0.05_t110000_t250.npy" \ 20 | --root "../data_preparation/release/dataset/images/" \ 21 | --data "../data_preparation/release/" \ 22 | --data_val "../data_preparation/release/yfcc100m_metadata_with_labels_usedDataRatio0.05_t110000_t250_valid_files_2004To2014_compact_val.csv" \ 23 | --dist-url 'tcp://127.0.0.1:52176' --dist-backend 'nccl' --world-size 1 --rank 0 --multiprocessing-distributed 24 | 25 | #val_data_rate = 0.5/0.01 26 | # --resume '/mnt/beegfs/tier1/vcl-nfs-work/zcai/WorkSpace/continual_learning/training/code_github/Continual-Learning/results/resnet50_lr0.1_isOnline0_valDataRate0.001_adjustLr1_numPasses3_useAug1_useVal1_valFreq20/checkpoint.pth.tar' \ 27 | #/export/share/Datasets/yfcc100m_full_dataset/metadata_geolocation 28 | -------------------------------------------------------------------------------- /CLOC/exp_best_model/eval_online.sh: -------------------------------------------------------------------------------- 1 | #!/bin/sh 2 | 3 | 4 | python ../code_online/best_model/main_online_best_model.py \ 5 | -a resnet50 \ 6 | --lr 0.0125 \ 7 | --weight-decay 0.0001 \ 8 | --weight_old_data 1.0 \ 9 | --batch-size 256 \ 10 | --NOSubBatch 4 \ 11 | --workers 64 \ 12 | --gradient_steps_per_batch 1 \ 13 | --size_replay_buffer 40000 \ 14 | --epochs 90 \ 15 | 
--------------------------------------------------------------------------------
/CLOC/exp_best_model/train_offline.sh:
--------------------------------------------------------------------------------
#!/bin/sh

python ../code_offline/main.py \
-a resnet50 \
--lr 0.025 \
--workers 64 \
--batch-size 256 \
--adjust_lr 1 \
--num_passes 1 \
--use_aug 1 \
--use_val 1 \
--val_freq 1 \
--epochs 90 \
--used_data_rate_start 0.0 \
--used_data_rate_end 1.0 \
--cell_id "../data_preparation/release/cellID_yfcc100m_metadata_with_labels_usedDataRatio0.05_t110000_t250.npy" \
--root "../data_preparation/release/dataset/images/" \
--data "../data_preparation/release/" \
--data_val "../data_preparation/release/yfcc100m_metadata_with_labels_usedDataRatio0.05_t110000_t250_valid_files_2004To2014_compact_val.csv" \
--dist-url 'tcp://127.0.0.1:52176' --dist-backend 'nccl' --world-size 1 --rank 0 --multiprocessing-distributed

--------------------------------------------------------------------------------
/CLOC/exp_best_model/train_online.sh:
--------------------------------------------------------------------------------
#!/bin/sh

python ../code_online/best_model/main_online_best_model.py \
-a resnet50 \
--lr 0.0125 \
--weight-decay 0.0001 \
--weight_old_data 1.0 \
--batch-size 256 \
--NOSubBatch 4 \
--workers 64 \
--gradient_steps_per_batch 1 \
--size_replay_buffer 40000 \
--epochs 90 \
--LR_adjust_intv 5 \
--cell_id "../data_preparation/release/cellID_yfcc100m_metadata_with_labels_usedDataRatio0.05_t110000_t250.npy" \
--root "../data_preparation/release/dataset/images/" \
--data "../data_preparation/release/" \
--data_val "../data_preparation/release/yfcc100m_metadata_with_labels_usedDataRatio0.05_t110000_t250_valid_files_2004To2014_compact_val.csv" \
--dist-url 'tcp://127.0.0.1:23794' --dist-backend 'nccl' --world-size 1 --rank 0 --multiprocessing-distributed
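Note: all of these launchers use single-node distributed training (--multiprocessing-distributed with --world-size 1 --rank 0 and the NCCL backend), which spawns one worker process per visible GPU. Each script hard-codes its rendezvous port in --dist-url, so two runs that share a port cannot coexist on one machine. A hypothetical way to start a second concurrent run is to rewrite the port first (GNU sed syntax; 23794 is the port hard-coded in the scripts above):

```
# Hypothetical: give a second copy of the run its own rendezvous port.
sed -i 's/127.0.0.1:23794/127.0.0.1:23795/' train_online.sh   # GNU sed
bash train_online.sh
```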
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
MIT License

Copyright (c) 2021 Intel Labs

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# DISCONTINUATION OF PROJECT #
This project will no longer be maintained by Intel.
Intel has ceased development of and contributions to this project, including, but not limited to, maintenance, bug fixes, new releases, and updates.
Intel no longer accepts patches to this project.
If you have an ongoing need to use this project, are interested in independently developing it, or would like to maintain patches for the open source software community, please create your own fork of this project.

# continual learning

The "CLOC" folder contains the code for the ICCV 2021 paper: Online Continual Learning with Natural Distribution Shifts: An Empirical Study with Visual Data.

[[Paper link]](https://arxiv.org/pdf/2108.09020.pdf)

> reviewed: 12.19.2022 michaelbeale-il
--------------------------------------------------------------------------------