├── DSP ├── DR-TANet-main │ ├── LICENSE │ ├── README.md │ ├── TANet.py │ ├── TANet_element.py │ ├── __pycache__ │ │ ├── TANet.cpython-37.pyc │ │ ├── TANet_element.cpython-37.pyc │ │ ├── attention.cpython-37.pyc │ │ ├── datasets.cpython-37.pyc │ │ └── util.cpython-37.pyc │ ├── attention.py │ ├── data │ │ └── output │ │ │ └── vijaya.ramkumar │ │ │ └── sscdv2 │ │ │ └── DR-tanet │ │ │ └── alpha_100 │ │ │ └── new_sp_3k_nod │ │ │ └── DR-TANet_resnet50_ref │ │ │ └── vl_cmu_cd │ │ │ └── DR-TANet_resnet50_ref │ │ │ └── vl_cmu_cd │ │ │ └── eval_metrics(dataset).csv │ ├── datasets.py │ ├── eval.py │ ├── graph.py │ ├── img │ │ └── TANet_DR-TANet.png │ ├── split data.py │ ├── train.py │ └── util.py ├── config │ ├── __pycache__ │ │ └── option.cpython-37.pyc │ └── option.py ├── criterion │ ├── __pycache__ │ │ ├── ntxent.cpython-37.pyc │ │ └── sim_preserving_kd.cpython-37.pyc │ ├── ntxent.py │ └── sim_preserving_kd.py ├── dataset │ ├── CMU.py │ ├── PCD.py │ └── __pycache__ │ │ ├── CMU.cpython-37.pyc │ │ └── PCD.cpython-37.pyc ├── linear.py ├── modeling │ ├── backbone │ │ ├── __init__.py │ │ ├── __pycache__ │ │ │ ├── __init__.cpython-37.pyc │ │ │ ├── drn.cpython-37.pyc │ │ │ ├── mobilenet.cpython-37.pyc │ │ │ ├── resnet.cpython-37.pyc │ │ │ └── xception.cpython-37.pyc │ │ ├── data │ │ │ └── Digraph.gv │ │ ├── drn.py │ │ ├── mobilenet.py │ │ ├── resnet.py │ │ └── xception.py │ └── sync_batchnorm │ │ ├── __init__.py │ │ ├── __pycache__ │ │ ├── __init__.cpython-37.pyc │ │ ├── batchnorm.cpython-37.pyc │ │ ├── comm.cpython-37.pyc │ │ └── replicate.cpython-37.pyc │ │ ├── batchnorm.py │ │ ├── comm.py │ │ ├── replicate.py │ │ └── unittest.py ├── models │ ├── __pycache__ │ │ └── simclr.cpython-37.pyc │ └── simclr.py ├── mypath.py ├── optimizers │ ├── __pycache__ │ │ └── lars.cpython-37.pyc │ └── lars.py ├── supervised.py ├── train.py ├── transforms │ ├── __pycache__ │ │ └── simclr_transform.cpython-37.pyc │ └── simclr_transform.py └── util │ ├── COCO_loader │ ├── base_dataset.py │ ├── coco_uninet.py │ └── defaults.py │ ├── __pycache__ │ ├── dist_util.cpython-37.pyc │ ├── test.cpython-37.pyc │ ├── torchlist.cpython-37.pyc │ ├── train_util.cpython-37.pyc │ ├── transforms.cpython-37.pyc │ └── utils.cpython-37.pyc │ ├── dist_util.py │ ├── test.py │ ├── torchlist.py │ ├── train_util.py │ ├── transforms.py │ └── utils.py ├── LICENSE ├── README.md └── method.png /DSP/DR-TANet-main/LICENSE: -------------------------------------------------------------------------------- 1 | IT License 2 | 3 | Copyright (c) 2021 Shuo Chen 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /DSP/DR-TANet-main/README.md: -------------------------------------------------------------------------------- 1 | # Dynamic Receptive Temporal Attention Network for Street Scene Change Detection 2 | 3 | This is the official implementation of TANet and DR-TANet in "DR-TANet: Dynamic Receptive Temporal Attention Network for Street Scene Change Detection" (IEEE IV 2021). The preprint version is [here](https://arxiv.org/abs/2103.00879). 4 | 5 | ![img1](https://github.com/Herrccc/DR-TANet/blob/main/img/TANet:DR-TANet.png) 6 | 7 | ## Requirements 8 | 9 | - python 3.7+ 10 | - opencv 3.4.2+ 11 | - pytorch 1.2.0+ 12 | - torchvision 0.4.0+ 13 | - tqdm 4.51.0 14 | - tensorboardX 2.1 15 | 16 | ## Datasets 17 | 18 | Our network is tested on two datasets for street-view scene change detection. 19 | 20 | - 'PCD' dataset from [Change detection from a street image pair using CNN features and superpixel segmentation](http://www.vision.is.tohoku.ac.jp/files/9814/3947/4830/71-Sakurada-BMVC15.pdf). 21 | - You can find the information about how to get 'TSUNAMI', 'GSV' and preprocessed datasets for training and test [here](https://kensakurada.github.io/pcd_dataset.html). 22 | - 'VL-CMU-CD' dataset from [Street-View Change Detection with Deconvolutional Networks](http://www.robesafe.com/personal/roberto.arroyo/docs/Alcantarilla16rss.pdf). 23 | - 'VL-CMU-CD': [[googledrive]](https://drive.google.com/file/d/0B-IG2NONFdciOWY5QkQ3OUgwejQ/view?resourcekey=0-rEzCjPFmDFjt4UMWamV4Eg) 24 | - dataset for training and test in our work: [[googledrive]](https://drive.google.com/file/d/1GzQR9kQouH4_1PmFRTHl4dWTAzqz3ppH/view?usp=sharing) 25 | 26 | ## Training 27 | 28 | Start training with TANet on 'PCD' dataset. 29 | >The configurations for TANet 30 | >- local-kernel-size:1, attn-stride:1, attn-padding:0, attn-groups:4. 31 | >- local-kernel-size:3, attn-stride:1, attn-padding:1, attn-groups:4. 32 | >- local-kernel-size:5, attn-stride:1, attn-padding:2, attn-groups:4. 33 | >- local-kernel-size:7, attn-stride:1, attn-padding:3, attn-groups:4. 34 | 35 | python3 train.py --dataset pcd --datadir /path_to_dataset --checkpointdir /path_to_check_point_directory --max-epochs 100 --batch-size 16 --encoder-arch resnet18 --local-kernel-size 1 36 | 37 | Start training with DR-TANet on 'VL-CMU-CD' dataset. 38 | 39 | python3 train.py --dataset vl_cmu_cd --datadir /path_to_dataset --checkpointdir /path_to_check_point_directory --max-epochs 150 --batch-size 16 --encoder-arch resnet18 --epoch-save 25 --drtam --refinement 40 | 41 | ## Evaluating 42 | 43 | Start evaluating with DR-TANet on 'PCD' dataset. 
44 | 45 | python3 eval.py --dataset pcd --datadir /path_to_dataset --checkpointdir /path_to_check_point_directory --resultdir /path_to_save_eval_result --encoder-arch resnet18 --drtam --refinement --store-imgs 46 | 47 | -------------------------------------------------------------------------------- /DSP/DR-TANet-main/TANet.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | from util import upsample 4 | from TANet_element import * 5 | 6 | class TANet(nn.Module): 7 | 8 | def __init__(self, encoder_arch, local_kernel_size, stride, padding, groups, drtam, refinement, pretrain,sslpretrain, ssl_path): 9 | super(TANet, self).__init__() 10 | 11 | self.encoder1, channels = get_encoder(encoder_arch,ssl_path, pretrained=pretrain, sslpretrain= False) 12 | self.encoder2, _ = get_encoder(encoder_arch,ssl_path, pretrained=pretrain, sslpretrain= False) 13 | self.attention_module = get_attentionmodule(local_kernel_size, stride, padding, groups, drtam, refinement, channels) 14 | self.decoder = get_decoder(channels=channels) 15 | self.classifier = nn.Conv2d(channels[0], 2, 1, padding=0, stride=1) 16 | self.bn = nn.BatchNorm2d(channels[0]) 17 | self.relu = nn.ReLU(inplace=True) 18 | 19 | def forward(self, img): 20 | 21 | img_t0,img_t1 = torch.split(img,3,1) 22 | features_t0 = self.encoder1(img_t0) 23 | features_t1 = self.encoder2(img_t1) 24 | features = features_t0 + features_t1 25 | features_map = self.attention_module(features) 26 | pred_ = self.decoder(features_map) 27 | pred_ = upsample(pred_,[pred_.size()[2]*2, pred_.size()[3]*2]) 28 | pred_ = self.bn(pred_) 29 | pred_ = upsample(pred_,[pred_.size()[2]*2, pred_.size()[3]*2]) 30 | pred_ = self.relu(pred_) 31 | pred = self.classifier(pred_) 32 | 33 | return pred 34 | 35 | 36 | -------------------------------------------------------------------------------- /DSP/DR-TANet-main/__pycache__/TANet.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/NeurAI-Lab/DSP/45027a3702696dafd7018802619dde17c6da1ca8/DSP/DR-TANet-main/__pycache__/TANet.cpython-37.pyc -------------------------------------------------------------------------------- /DSP/DR-TANet-main/__pycache__/TANet_element.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/NeurAI-Lab/DSP/45027a3702696dafd7018802619dde17c6da1ca8/DSP/DR-TANet-main/__pycache__/TANet_element.cpython-37.pyc -------------------------------------------------------------------------------- /DSP/DR-TANet-main/__pycache__/attention.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/NeurAI-Lab/DSP/45027a3702696dafd7018802619dde17c6da1ca8/DSP/DR-TANet-main/__pycache__/attention.cpython-37.pyc -------------------------------------------------------------------------------- /DSP/DR-TANet-main/__pycache__/datasets.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/NeurAI-Lab/DSP/45027a3702696dafd7018802619dde17c6da1ca8/DSP/DR-TANet-main/__pycache__/datasets.cpython-37.pyc -------------------------------------------------------------------------------- /DSP/DR-TANet-main/__pycache__/util.cpython-37.pyc: -------------------------------------------------------------------------------- 
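The TANet module above consumes the t0/t1 frames stacked along the channel axis (the `torch.split(img, 3, 1)` at the top of `forward` separates them again) and emits a two-channel logit map from its final 1x1 classifier. A minimal, self-contained sketch of that input layout, not part of the repository; the batch size, image size and tensor names are illustrative only:

import torch

# Two RGB frames concatenated into one 6-channel tensor -- the layout TANet.forward expects.
img_t0 = torch.randn(4, 3, 256, 256)            # frame at time t0
img_t1 = torch.randn(4, 3, 256, 256)            # frame at time t1
pair = torch.cat((img_t0, img_t1), dim=1)       # (4, 6, 256, 256)

# forward() undoes the stacking with torch.split before feeding the two encoders.
back_t0, back_t1 = torch.split(pair, 3, dim=1)
assert torch.equal(back_t0, img_t0) and torch.equal(back_t1, img_t1)
# A constructed TANet would map `pair` to (batch, 2, H', W') logits; train.py then applies
# log_softmax + NLLLoss against the binary change mask.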
https://raw.githubusercontent.com/NeurAI-Lab/DSP/45027a3702696dafd7018802619dde17c6da1ca8/DSP/DR-TANet-main/__pycache__/util.cpython-37.pyc -------------------------------------------------------------------------------- /DSP/DR-TANet-main/attention.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import torch.nn.functional as F 4 | import torch.nn.init as init 5 | 6 | 7 | class Temporal_Attention(nn.Module): 8 | def __init__(self, in_channels, out_channels, kernel_size=1, stride=1, padding=0, 9 | groups=1, bias=False, refinement=False): 10 | super(Temporal_Attention, self).__init__() 11 | self.outc = out_channels 12 | self.kernel_size = kernel_size 13 | self.stride = stride 14 | self.padding = padding 15 | self.groups = groups 16 | self.refinement = refinement 17 | 18 | print('Attention Layer-kernel size:{0},stride:{1},padding:{2},groups:{3}...'.format(self.kernel_size,self.stride,self.padding,self.groups)) 19 | if self.refinement: 20 | print("Attention with refinement...") 21 | 22 | assert self.outc % self.groups == 0, 'out_channels should be divided by groups.' 23 | 24 | self.w_q = nn.Conv2d(in_channels, out_channels, kernel_size=1, bias=bias) 25 | self.w_k = nn.Conv2d(in_channels, out_channels, kernel_size=1, bias=bias) 26 | self.w_v = nn.Conv2d(in_channels, out_channels, kernel_size=1, bias=bias) 27 | 28 | 29 | #relative positional encoding... 30 | self.rel_h = nn.Parameter(torch.randn(self.outc // 2, 1, 1, self.kernel_size, 1), requires_grad = True) 31 | self.rel_w = nn.Parameter(torch.randn(self.outc // 2, 1, 1, 1, self.kernel_size), requires_grad = True) 32 | init.normal_(self.rel_h, 0, 1) 33 | init.normal_(self.rel_w, 0, 1) 34 | 35 | 36 | init.kaiming_normal_(self.w_q.weight, mode='fan_out', nonlinearity='relu') 37 | init.kaiming_normal_(self.w_k.weight, mode='fan_out', nonlinearity='relu') 38 | init.kaiming_normal_(self.w_v.weight, mode='fan_out', nonlinearity='relu') 39 | 40 | 41 | def forward(self, feature_map): 42 | 43 | fm_t0, fm_t1 = torch.split(feature_map, feature_map.size()[1]//2, 1) 44 | assert fm_t0.size() == fm_t1.size(), 'The size of feature maps of image t0 and t1 should be same.' 
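The forward pass below compares each query position taken from the t1 feature map against a kernel_size x kernel_size neighbourhood of the (padded) t0 feature map, gathered with Tensor.unfold. A small, self-contained sketch of that gathering step, not part of the repository, assuming kernel_size=3, stride=1, padding=1 and a toy feature map:

import torch
import torch.nn.functional as F

k, stride, pad = 3, 1, 1
fm_t0 = torch.randn(2, 8, 16, 16)                          # toy (B, C, H, W) features of image t0
padded = F.pad(fm_t0, [pad, pad, pad, pad])                # pad W and H so every position has a full window
windows = padded.unfold(2, k, stride).unfold(3, k, stride)
print(windows.shape)  # torch.Size([2, 8, 16, 16, 3, 3]): one 3x3 key/value patch per query position

The dot product between the query and these windows, followed by a softmax over the window dimension, is what the einsum-based weighting further down implements.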
45 | 46 | batch, _, h, w = fm_t0.size() 47 | 48 | 49 | padded_fm_t0 = F.pad(fm_t0, [self.padding, self.padding, self.padding, self.padding]) 50 | q_out = self.w_q(fm_t1) 51 | k_out = self.w_k(padded_fm_t0) 52 | v_out = self.w_v(padded_fm_t0) 53 | 54 | if self.refinement: 55 | 56 | padding = self.kernel_size 57 | padded_fm_col = F.pad(fm_t0, [0, 0, padding, padding]) 58 | padded_fm_row = F.pad(fm_t0, [padding, padding, 0, 0]) 59 | k_out_col = self.w_k(padded_fm_col) 60 | k_out_row = self.w_k(padded_fm_row) 61 | v_out_col = self.w_v(padded_fm_col) 62 | v_out_row = self.w_v(padded_fm_row) 63 | 64 | k_out_col = k_out_col.unfold(2, self.kernel_size * 2 + 1, self.stride) 65 | k_out_row = k_out_row.unfold(3, self.kernel_size * 2 + 1, self.stride) 66 | v_out_col = v_out_col.unfold(2, self.kernel_size * 2 + 1, self.stride) 67 | v_out_row = v_out_row.unfold(3, self.kernel_size * 2 + 1, self.stride) 68 | 69 | 70 | q_out_base = q_out.view(batch, self.groups, self.outc // self.groups, h, w, 1).repeat(1, 1, 1, 1, 1, self.kernel_size*self.kernel_size) 71 | q_out_ref = q_out.view(batch, self.groups, self.outc // self.groups, h, w, 1).repeat(1, 1, 1, 1, 1, self.kernel_size * 2 + 1) 72 | 73 | k_out = k_out.unfold(2, self.kernel_size, self.stride).unfold(3, self.kernel_size, self.stride) 74 | 75 | k_out_h, k_out_w = k_out.split(self.outc // 2, dim=1) 76 | k_out = torch.cat((k_out_h + self.rel_h, k_out_w + self.rel_w), dim=1) 77 | 78 | k_out = k_out.contiguous().view(batch, self.groups, self.outc // self.groups, h, w, -1) 79 | 80 | v_out = v_out.unfold(2, self.kernel_size, self.stride).unfold(3, self.kernel_size, self.stride) 81 | v_out = v_out.contiguous().view(batch, self.groups, self.outc // self.groups, h, w, -1) 82 | 83 | inter_out = (q_out_base * k_out).sum(dim=2) 84 | 85 | out = F.softmax(inter_out, dim=-1) 86 | out = torch.einsum('bnhwk,bnchwk -> bnchw', out, v_out).contiguous().view(batch, -1, h, w) 87 | 88 | if self.refinement: 89 | 90 | k_out_row = k_out_row.contiguous().view(batch, self.groups, self.outc // self.groups, h, w, -1) 91 | k_out_col = k_out_col.contiguous().view(batch, self.groups, self.outc // self.groups, h, w, -1) 92 | v_out_row = v_out_row.contiguous().view(batch, self.groups, self.outc // self.groups, h, w, -1) 93 | v_out_col = v_out_col.contiguous().view(batch, self.groups, self.outc // self.groups, h, w, -1) 94 | 95 | out_row = F.softmax((q_out_ref * k_out_row).sum(dim=2),dim=-1) 96 | out_col = F.softmax((q_out_ref * k_out_col).sum(dim=2),dim=-1) 97 | out += torch.einsum('bnhwk,bnchwk -> bnchw', out_row, v_out_row).contiguous().view(batch, -1, h, w) 98 | out += torch.einsum('bnhwk,bnchwk -> bnchw', out_col, v_out_col).contiguous().view(batch, -1, h, w) 99 | 100 | return out 101 | 102 | 103 | 104 | 105 | 106 | 107 | 108 | -------------------------------------------------------------------------------- /DSP/DR-TANet-main/data/output/vijaya.ramkumar/sscdv2/DR-tanet/alpha_100/new_sp_3k_nod/DR-TANet_resnet50_ref/vl_cmu_cd/DR-TANet_resnet50_ref/vl_cmu_cd/eval_metrics(dataset).csv: -------------------------------------------------------------------------------- 1 | set,ds_name,precision,recall,accuracy,f1-score 2 | -------------------------------------------------------------------------------- /DSP/DR-TANet-main/datasets.py: -------------------------------------------------------------------------------- 1 | import os 2 | import cv2 3 | import torch 4 | import numpy as np 5 | from torch.utils.data import Dataset 6 | from os.path import join as pjoin, splitext as spt 7 | import 
argparse 8 | 9 | def check_validness(f): 10 | return any([i in spt(f)[1] for i in ['jpg','png']]) 11 | 12 | class pcd(Dataset): 13 | 14 | def __init__(self,root): 15 | super(pcd, self).__init__() 16 | self.img_t0_root = pjoin(root,'t0') 17 | self.img_t1_root = pjoin(root,'t1') 18 | self.img_mask_root = pjoin(root,'mask') 19 | self.filename = list(spt(f)[0] for f in os.listdir(self.img_mask_root) if check_validness(f)) 20 | self.filename.sort() 21 | 22 | def __getitem__(self, index): 23 | 24 | fn = self.filename[index] 25 | fn_t0 = pjoin(self.img_t0_root,fn+'.jpg') 26 | fn_t1 = pjoin(self.img_t1_root,fn+'.jpg') 27 | fn_mask = pjoin(self.img_mask_root,fn+'.png') 28 | 29 | if os.path.isfile(fn_t0) == False: 30 | print('Error: File Not Found: ' + fn_t0) 31 | exit(-1) 32 | if os.path.isfile(fn_t1) == False: 33 | print('Error: File Not Found: ' + fn_t1) 34 | exit(-1) 35 | if os.path.isfile(fn_mask) == False: 36 | print('Error: File Not Found: ' + fn_mask) 37 | exit(-1) 38 | 39 | img_t0 = cv2.imread(fn_t0, 1) 40 | img_t1 = cv2.imread(fn_t1, 1) 41 | mask = cv2.imread(fn_mask, 0) 42 | 43 | w, h, c = img_t0.shape 44 | r = 286. / min(w, h) 45 | # resize images so that min(w, h) == 256 46 | img_t0_r = cv2.resize(img_t0, (int(r * w), int(r * h))) 47 | img_t1_r = cv2.resize(img_t1, (int(r * w), int(r * h))) 48 | mask_r = cv2.resize(mask, (int(r * w), int(r * h)))[:, :, np.newaxis] 49 | 50 | img_t0_r_ = np.asarray(img_t0_r).astype('f').transpose(2, 0, 1) / 128.0 - 1.0 51 | img_t1_r_ = np.asarray(img_t1_r).astype('f').transpose(2, 0, 1) / 128.0 - 1.0 52 | mask_r_ = np.asarray(mask_r>128).astype('f').transpose(2, 0, 1) 53 | 54 | crop_width = 256 55 | _, h, w = img_t0_r_.shape 56 | x_l = np.random.randint(0, w - crop_width) 57 | x_r = x_l + crop_width 58 | y_l = np.random.randint(0, h - crop_width) 59 | y_r = y_l + crop_width 60 | 61 | input_ = torch.from_numpy(np.concatenate((img_t0_r_[:, y_l:y_r, x_l:x_r], img_t1_r_[:, y_l:y_r, x_l:x_r]), axis=0)) 62 | mask_ = torch.from_numpy(mask_r_[:, y_l:y_r, x_l:x_r]).long() 63 | 64 | return input_,mask_ 65 | 66 | def __len__(self): 67 | return len(self.filename) 68 | 69 | def get_random_image(self): 70 | idx = np.random.randint(0,len(self)) 71 | return self.__getitem__(idx) 72 | 73 | 74 | class pcd_eval(Dataset): 75 | 76 | def __init__(self, root): 77 | super(pcd_eval, self).__init__() 78 | self.img_t0_root = pjoin(root, 't0') 79 | self.img_t1_root = pjoin(root, 't1') 80 | self.img_mask_root = pjoin(root, 'mask') 81 | self.filename = list(spt(f)[0] for f in os.listdir(self.img_mask_root) if check_validness(f)) 82 | self.filename.sort() 83 | 84 | def __getitem__(self, index): 85 | 86 | fn = self.filename[index] 87 | fn_t0 = pjoin(self.img_t0_root, fn + '.jpg') 88 | fn_t1 = pjoin(self.img_t1_root, fn + '.jpg') 89 | fn_mask = pjoin(self.img_mask_root, fn + '.png') 90 | 91 | if os.path.isfile(fn_t0) == False: 92 | print('Error: File Not Found: ' + fn_t0) 93 | exit(-1) 94 | if os.path.isfile(fn_t1) == False: 95 | print('Error: File Not Found: ' + fn_t1) 96 | exit(-1) 97 | if os.path.isfile(fn_mask) == False: 98 | print('Error: File Not Found: ' + fn_mask) 99 | exit(-1) 100 | 101 | img_t0 = cv2.imread(fn_t0, 1) 102 | img_t1 = cv2.imread(fn_t1, 1) 103 | mask = cv2.imread(fn_mask, 0) 104 | 105 | w, h, c = img_t0.shape 106 | w_r = int(256*max(w/256,1)) 107 | h_r = int(256*max(h/256,1)) 108 | # resize images so that min(w, h) == 256 109 | img_t0_r = cv2.resize(img_t0,(h_r,w_r)) 110 | img_t1_r = cv2.resize(img_t1,(h_r,w_r)) 111 | mask_r = cv2.resize(mask,(h_r,w_r))[:, :, 
np.newaxis] 112 | 113 | img_t0_r = np.asarray(img_t0_r).astype('f').transpose(2, 0, 1) / 128.0 - 1.0 114 | img_t1_r = np.asarray(img_t1_r).astype('f').transpose(2, 0, 1) / 128.0 - 1.0 115 | mask_r = np.asarray(mask_r > 128).astype('f').transpose(2, 0, 1) 116 | 117 | return img_t0_r, img_t1_r, mask_r, w, h, w_r, h_r 118 | 119 | def __len__(self): 120 | return len(self.filename) 121 | 122 | def get_random_image(self): 123 | idx = np.random.randint(0,len(self)) 124 | return self.__getitem__(idx) 125 | class vl_cmu_cd(Dataset): 126 | 127 | def __init__(self, root, num=1): 128 | super(vl_cmu_cd, self).__init__() 129 | self.img_t0_root = pjoin(root, 't0') 130 | self.img_t1_root = pjoin(root, 't1') 131 | self.img_mask_root = pjoin(root, 'mask') 132 | self.filename = list(spt(f)[0] for f in os.listdir(self.img_mask_root) if check_validness(f)) 133 | self.filename.sort() 134 | self.datanum = num 135 | 136 | def __getitem__(self, index): 137 | 138 | fn = self.filename[index] 139 | fn_t0 = pjoin(self.img_t0_root, fn + '.png') 140 | fn_t1 = pjoin(self.img_t1_root, fn + '.png') 141 | fn_mask = pjoin(self.img_mask_root, fn + '.png') 142 | 143 | if os.path.isfile(fn_t0) == False: 144 | print('Error: File Not Found: ' + fn_t0) 145 | exit(-1) 146 | if os.path.isfile(fn_t1) == False: 147 | print('Error: File Not Found: ' + fn_t1) 148 | exit(-1) 149 | if os.path.isfile(fn_mask) == False: 150 | print('Error: File Not Found: ' + fn_mask) 151 | exit(-1) 152 | 153 | img_t0 = cv2.imread(fn_t0, 1) 154 | img_t1 = cv2.imread(fn_t1, 1) 155 | mask = cv2.imread(fn_mask, 0) 156 | 157 | mask_r = mask[:, :, np.newaxis] 158 | 159 | img_t0_r = np.asarray(img_t0).astype('f').transpose(2, 0, 1) / 128.0 - 1.0 160 | img_t1_r = np.asarray(img_t1).astype('f').transpose(2, 0, 1) / 128.0 - 1.0 161 | mask_r_ = np.asarray(mask_r > 128).astype('f').transpose(2, 0, 1) 162 | 163 | 164 | input_ = torch.from_numpy(np.concatenate((img_t0_r, img_t1_r), axis=0)) 165 | mask_ = torch.from_numpy(mask_r_).long() 166 | 167 | return input_, mask_ 168 | 169 | def __len__(self): 170 | 171 | return round(self.datanum *len(self.filename)) 172 | 173 | def get_random_image(self): 174 | # num = self.datanum *len(self) 175 | idx = np.random.randint(0,len(self)) 176 | 177 | return self.__getitem__(idx) 178 | 179 | class vl_cmu_cd_eval(Dataset): 180 | 181 | def __init__(self, root): 182 | super(vl_cmu_cd_eval, self).__init__() 183 | self.img_root = pjoin(root, 'RGB') 184 | self.img_mask_root = pjoin(root, 'GT') 185 | self.filename = list(spt(f)[0] for f in os.listdir(self.img_mask_root) if check_validness(f)) 186 | self.filename.sort() 187 | 188 | 189 | def __getitem__(self, index): 190 | 191 | fn = self.filename[index] 192 | fn_t0 = pjoin(self.img_root, '1_{:02d}'.format(index) + '.png') 193 | fn_t1 = pjoin(self.img_root, '2_{:02d}'.format(index) + '.png') 194 | fn_mask = pjoin(self.img_mask_root, fn + '.png') 195 | 196 | if os.path.isfile(fn_t0) == False: 197 | print('Error: File Not Found: ' + fn_t0) 198 | exit(-1) 199 | if os.path.isfile(fn_t1) == False: 200 | print('Error: File Not Found: ' + fn_t1) 201 | exit(-1) 202 | if os.path.isfile(fn_mask) == False: 203 | print('Error: File Not Found: ' + fn_mask) 204 | exit(-1) 205 | 206 | img_t0 = cv2.imread(fn_t0, 1) 207 | img_t1 = cv2.imread(fn_t1, 1) 208 | mask = cv2.imread(fn_mask, 0) 209 | 210 | w, h, c = img_t0.shape 211 | w_r = int(256 * max(w / 256, 1)) 212 | h_r = int(256 * max(h / 256, 1)) 213 | 214 | img_t0_r = cv2.resize(img_t0, (w_r, h_r)) 215 | img_t1_r = cv2.resize(img_t1, (w_r, h_r)) 216 | 
mask_r = cv2.resize(mask, (h_r, w_r))[:, :, np.newaxis] 217 | 218 | img_t0_r_ = np.asarray(img_t0_r).astype('f').transpose(2, 0, 1) / 128.0 - 1.0 219 | img_t1_r_ = np.asarray(img_t1_r).astype('f').transpose(2, 0, 1) / 128.0 - 1.0 220 | mask_r_ = np.asarray(mask_r > 128).astype('f').transpose(2, 0, 1) 221 | 222 | return img_t0_r_, img_t1_r_, mask_r_, w, h, w_r, h_r 223 | 224 | def __len__(self): 225 | return len(self.filename) 226 | 227 | def get_random_image(self): 228 | idx = np.random.randint(0,len(self)) 229 | return self.__getitem__(idx) 230 | 231 | 232 | 233 | 234 | -------------------------------------------------------------------------------- /DSP/DR-TANet-main/eval.py: -------------------------------------------------------------------------------- 1 | import datasets 2 | from TANet import TANet 3 | import os 4 | import csv 5 | import cv2 6 | import torch 7 | import torch.nn as nn 8 | import numpy as np 9 | from os.path import join as pjoin 10 | from tqdm import tqdm 11 | import torch.nn.functional as F 12 | import argparse 13 | 14 | class Evaluate: 15 | 16 | def __init__(self): 17 | self.args = None 18 | self.set = None 19 | 20 | def eval(self): 21 | 22 | input = torch.from_numpy(np.concatenate((self.t0,self.t1),axis=0)).contiguous() 23 | input = input.view(1,-1,self.w_r,self.h_r) 24 | input = input.cuda() 25 | output= self.model(input) 26 | 27 | input = input[0].cpu().data 28 | img_t0 = input[0:3,:,:] 29 | img_t1 = input[3:6,:,:] 30 | img_t0 = (img_t0+1)*128 31 | img_t1 = (img_t1+1)*128 32 | output = output[0].cpu().data 33 | #mask_pred =F.softmax(output[0:2,:,:],dim=0)[0]*255 34 | mask_pred = np.where(F.softmax(output[0:2,:,:],dim=0)[0]>0.5, 255, 0) 35 | mask_gt = np.squeeze(np.where(self.mask==True,255,0),axis=0) 36 | if self.args.store_imgs: 37 | precision, recall, accuracy, f1_score = self.store_imgs_and_cal_matrics(img_t0,img_t1,mask_gt,mask_pred) 38 | else: 39 | precision, recall, accuracy, f1_score = self.cal_metrcis(mask_pred,mask_gt) 40 | return (precision, recall, accuracy, f1_score) 41 | 42 | 43 | def store_imgs_and_cal_matrics(self, t0, t1, mask_gt, mask_pred): 44 | 45 | w, h = self.w_r, self.h_r 46 | img_save = np.zeros((w * 2, h * 2, 3), dtype=np.uint8) 47 | img_save[0:w, 0:h, :] = np.transpose(t0.numpy(), (1, 2, 0)).astype(np.uint8) 48 | img_save[0:w, h:h * 2, :] = np.transpose(t1.numpy(), (1, 2, 0)).astype(np.uint8) 49 | img_save[w:w * 2, 0:h, :] = cv2.cvtColor(mask_gt.astype(np.uint8), cv2.COLOR_GRAY2RGB) 50 | img_save[w:w * 2, h:h * 2, :] = cv2.cvtColor(mask_pred.astype(np.uint8), cv2.COLOR_GRAY2RGB) 51 | 52 | if w != self.w_ori or h != self.h_ori: 53 | img_save = cv2.resize(img_save, (self.h_ori, self.w_ori)) 54 | 55 | fn_save = self.fn_img 56 | if not os.path.exists(self.dir_img): 57 | os.makedirs(self.dir_img) 58 | 59 | print('Writing' + fn_save + '......') 60 | cv2.imwrite(fn_save, img_save) 61 | 62 | if self.set is not None: 63 | f_metrics = open(pjoin(self.resultdir, "eval_metrics_set{0}(single_image).csv".format(self.set)), 'a+') 64 | else: 65 | f_metrics = open(pjoin(self.resultdir, "eval_metrics(single_image).csv"), 'a+') 66 | metrics_writer = csv.writer(f_metrics) 67 | fn = '{0}-{1:08d}'.format(self.ds,self.index) 68 | precision, recall, accuracy, f1_score = self.cal_metrcis(mask_pred,mask_gt) 69 | metrics_writer.writerow([fn, precision, recall, accuracy, f1_score]) 70 | f_metrics.close() 71 | return (precision, recall, accuracy, f1_score) 72 | 73 | def cal_metrcis(self,pred,target): 74 | 75 | temp = np.dstack((pred == 0, target == 0)) 76 | TP = 
sum(sum(np.all(temp,axis=2))) 77 | 78 | temp = np.dstack((pred == 0, target == 255)) 79 | FP = sum(sum(np.all(temp,axis=2))) 80 | 81 | temp = np.dstack((pred == 255, target == 0)) 82 | FN = sum(sum(np.all(temp, axis=2))) 83 | 84 | temp = np.dstack((pred == 255, target == 255)) 85 | TN = sum(sum(np.all(temp, axis=2))) 86 | 87 | precision = TP / (TP + FP) 88 | recall = TP / (TP + FN) 89 | accuracy = (TP + TN) / (TP + FP + FN + TN) 90 | f1_score = 2 * recall * precision / (precision + recall) 91 | 92 | return (precision, recall, accuracy, f1_score) 93 | 94 | def Init(self): 95 | 96 | if self.args.drtam: 97 | print('Dynamic Receptive Temporal Attention Network (DR-TANet)') 98 | model_name = 'DR-TANet' 99 | else: 100 | print('Temporal Attention Network (TANet)') 101 | model_name = 'TANet_k={0}'.format(self.args.local_kernel_size) 102 | 103 | model_name += ('_' + self.args.encoder_arch) 104 | 105 | print('Encoder:' + self.args.encoder_arch) 106 | 107 | if self.args.refinement: 108 | print('Adding refinement...') 109 | model_name += '_ref' 110 | 111 | self.resultdir = pjoin(self.args.resultdir, model_name, self.args.dataset) 112 | if not os.path.exists(self.resultdir): 113 | os.makedirs(self.resultdir) 114 | 115 | f_metrics = open(pjoin(self.resultdir, "eval_metrics(dataset).csv"), 'a+') 116 | metrics_writer = csv.writer(f_metrics) 117 | metrics_writer.writerow(['set', 'ds_name', 'precision', 'recall', 'accuracy', 'f1-score']) 118 | f_metrics.close() 119 | 120 | 121 | def run(self): 122 | 123 | if os.path.isfile(self.fn_model) is False: 124 | print("Error: Cannot read file ... " + self.fn_model) 125 | exit(-1) 126 | else: 127 | print("Reading model ... " + self.fn_model) 128 | 129 | self.model = TANet(self.args.encoder_arch, self.args.local_kernel_size, self.args.attn_stride, 130 | self.args.attn_padding, self.args.attn_groups, self.args.drtam, self.args.refinement, False,False,'None') 131 | 132 | 133 | # state_dic = {k.partition('module.')[2]:v for k,v in torch.load(self.fn_model).items()} 134 | if self.args.multi_gpu: 135 | self.model = nn.DataParallel(self.model) 136 | self.model.load_state_dict((torch.load(self.fn_model))) # 137 | self.model = self.model.cuda() 138 | self.model.eval() 139 | 140 | 141 | class evaluate_pcd(Evaluate): 142 | 143 | def __init__(self,arguments): 144 | super(evaluate_pcd,self).__init__() 145 | self.args = arguments 146 | 147 | def run(self, set): 148 | 149 | self.set = set 150 | self.dir_img = pjoin(self.resultdir, 'imgs', 'set{0:1d}'.format(self.set)) 151 | self.fn_model = pjoin(self.args.checkpointdir) #'set{0:1d}'.format(self.set), 'checkpointdir', '00120000.pth' 152 | super(evaluate_pcd,self).run() 153 | f_metrics = open(pjoin(self.resultdir, "eval_metrics(dataset).csv"), 'a+') 154 | metrics_writer = csv.writer(f_metrics) 155 | 156 | for ds in tqdm(['TSUNAMI','GSV']): 157 | test_loader = datasets.pcd_eval(pjoin(self.args.datadir,ds)) 158 | metrics = np.array([0,0,0,0], dtype='float64') 159 | img_cnt = len(test_loader) 160 | for idx in range(0,img_cnt): 161 | self.index = idx 162 | self.ds = ds 163 | self.fn_img = pjoin(self.dir_img, '{0}-{1:08d}.png'.format(self.ds, self.index)) 164 | self.t0,self.t1,self.mask,self.w_ori,self.h_ori,self.w_r,self.h_r = test_loader[idx] 165 | metrics += np.array(self.eval()) 166 | metrics_writer.writerow([self.set, ds, '%.3f' %(metrics[0] / img_cnt), '%.3f' %(metrics[1] / img_cnt), 167 | '%.3f' % (metrics[2] / img_cnt), '%.3f' %(metrics[3] / img_cnt)]) 168 | 169 | f_metrics.close() 170 | 171 | class evaluate_cmu(Evaluate): 172 | 
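cal_metrcis above builds a per-pixel confusion matrix by stacking the predicted and ground-truth masks depth-wise and requiring both conditions to hold, then derives precision, recall, accuracy and F1. A tiny, self-contained numeric check of that counting pattern, not from the repository; the 2x3 masks are made up:

import numpy as np

pred   = np.array([[0, 0, 255], [255, 0, 255]])   # toy predicted mask (values 0 / 255, as in eval.py)
target = np.array([[0, 255, 255], [0, 0, 255]])   # toy ground-truth mask

TP = np.all(np.dstack((pred == 0,   target == 0)),   axis=2).sum()   # 2
FP = np.all(np.dstack((pred == 0,   target == 255)), axis=2).sum()   # 1
FN = np.all(np.dstack((pred == 255, target == 0)),   axis=2).sum()   # 1
TN = np.all(np.dstack((pred == 255, target == 255)), axis=2).sum()   # 2

precision = TP / (TP + FP)                                  # 2/3
recall    = TP / (TP + FN)                                  # 2/3
accuracy  = (TP + TN) / (TP + FP + FN + TN)                 # 4/6
f1        = 2 * precision * recall / (precision + recall)   # 2/3
print(TP, FP, FN, TN, round(f1, 3))                         # 2 1 1 2 0.667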
173 | def __init__(self, arguments): 174 | super(evaluate_cmu, self).__init__() 175 | self.args = arguments 176 | 177 | def Init(self): 178 | super(evaluate_cmu,self).Init() 179 | self.ds = None 180 | self.index = 0 181 | self.dir_img = pjoin(self.resultdir, 'imgs') 182 | self.fn_img = pjoin(self.dir_img, '{0}-{1:08d}.png'.format(self.ds, self.index)) 183 | self.fn_model = pjoin(self.args.checkpointdir) #00070050 00035100 00070050.pth 184 | 185 | def eval(self): 186 | 187 | input = torch.from_numpy(np.concatenate((self.t0,self.t1),axis=0)).contiguous() 188 | input = input.view(1,-1,self.w_r,self.h_r) 189 | input = input.cuda() 190 | output= self.model(input) 191 | 192 | input = input[0].cpu().data 193 | img_t0 = input[0:3,:,:] 194 | img_t1 = input[3:6,:,:] 195 | img_t0 = (img_t0+1)*128 196 | img_t1 = (img_t1+1)*128 197 | output = output[0].cpu().data 198 | mask_pred = np.where(F.softmax(output[0:2,:,:],dim=0)[0]>0.5, 0, 255) 199 | mask_gt = np.squeeze(np.where(self.mask==True,255,0),axis=0) 200 | if self.args.store_imgs: 201 | precision, recall, accuracy, f1_score = self.store_imgs_and_cal_matrics(img_t0,img_t1,mask_gt,mask_pred) 202 | else: 203 | precision, recall, accuracy, f1_score = self.cal_metrcis(mask_pred,mask_gt) 204 | return (precision, recall, accuracy, f1_score) 205 | 206 | def run(self): 207 | super(evaluate_cmu, self).run() 208 | f_metrics = open(pjoin(self.resultdir, "eval_metrics(dataset).csv"), 'a+') 209 | metrics_writer = csv.writer(f_metrics) 210 | testdir = [0,6,7,9,12,23,24,25,27,28,32,34,36,38,39,45,47,48,50,56,58,60,61,64,66,69,76,77,81,82,85,92,93,94,95,97,100,106,107,112,113,117,119,120,125,129,132,134,135,139,142,144,145,150] 211 | img_cnt = 0 212 | metrics = np.array([0, 0, 0, 0], dtype='float64') 213 | for idx in testdir: 214 | test_loader = datasets.vl_cmu_cd_eval(pjoin(self.args.datadir, 'raw', '{:03d}'.format(idx))) 215 | img_cnt += len(test_loader) 216 | self.ds = idx 217 | for i in range(0, len(test_loader)): 218 | self.index = i 219 | self.fn_img = pjoin(self.dir_img, '{0}-{1:08d}.png'.format(self.ds, self.index)) 220 | self.t0, self.t1, self.mask, self.w_ori, self.h_ori, self.w_r, self.h_r = test_loader[i] 221 | metrics += np.array(self.eval()) 222 | metrics_writer.writerow(['%.3f' % (metrics[0] / img_cnt), '%.3f' % (metrics[1] / img_cnt), 223 | '%.3f' % (metrics[2] / img_cnt), '%.3f' % (metrics[3] / img_cnt)]) 224 | 225 | f_metrics.close() 226 | 227 | if __name__ =='__main__': 228 | 229 | parser = argparse.ArgumentParser(description='STRAT EVALUATING...') 230 | parser.add_argument('--dataset', type=str, default='pcd', required=True) 231 | parser.add_argument('--datadir',required=True) 232 | parser.add_argument('--resultdir',required=True) 233 | parser.add_argument('--checkpointdir',required=True) 234 | parser.add_argument('--encoder-arch', type=str, required=True) 235 | parser.add_argument('--local-kernel-size',type=int, default=1) 236 | parser.add_argument('--attn-stride', type=int, default=1) 237 | parser.add_argument('--attn-padding', type=int, default=0) 238 | parser.add_argument('--attn-groups', type=int, default=4) 239 | parser.add_argument('--drtam', action='store_true') 240 | parser.add_argument('--refinement', action='store_true') 241 | parser.add_argument('--store-imgs', action='store_true') 242 | parser.add_argument('--multi-gpu', action='store_true', help='processing with multi-gpus') 243 | 244 | if parser.parse_args().dataset == 'pcd': 245 | eval = evaluate_pcd(parser.parse_args()) 246 | eval.Init() 247 | for set in range(0,3): 248 | 
eval.run(set) 249 | elif parser.parse_args().dataset == 'vl_cmu_cd': 250 | eval = evaluate_cmu(parser.parse_args()) 251 | eval.Init() 252 | eval.run() 253 | else: 254 | print('Error: Cannot identify the dataset...(dataset: pcd or vl_cmu_cd)') 255 | exit(-1) -------------------------------------------------------------------------------- /DSP/DR-TANet-main/graph.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/NeurAI-Lab/DSP/45027a3702696dafd7018802619dde17c6da1ca8/DSP/DR-TANet-main/graph.py -------------------------------------------------------------------------------- /DSP/DR-TANet-main/img/TANet_DR-TANet.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/NeurAI-Lab/DSP/45027a3702696dafd7018802619dde17c6da1ca8/DSP/DR-TANet-main/img/TANet_DR-TANet.png -------------------------------------------------------------------------------- /DSP/DR-TANet-main/split data.py: -------------------------------------------------------------------------------- 1 | from collections import Counter 2 | from os.path import join as pjoin, splitext as spt 3 | import os 4 | import cv2 5 | from shutil import copyfile 6 | from pathlib import Path 7 | dict = {'1_00':0,'1_01':1,'1_02':2,'1_03':3,'1_04':4,'1_05':5,'1_06':6,'1_07':7,'1_08':8,'1_09':9,'1_10':10,'1_11':11,'1_12':12,'1_13':13,'1_14':14,'1_15':15,'1_16':16,'1_17':17,'1_18':18,'1_19':19} 8 | pt = '/data/input/datasets/VL-CMU-CD/struc_train/' 9 | path_txt = '/data/input/datasets/VL-CMU-CD/struc_train/train_50p_cmu.txt' 10 | label = '/data/input/datasets/VL-CMU-CD/vl_cmu_cd_binary_mask/vl_cmu_cd_binary_mask/train/mask_struc' 11 | # path = '/data/input/datasets/VL-CMU-CD/struc_train/train_split.txt' 12 | lst = [] 13 | count = 0 14 | count_dict = {} 15 | full_list = [] 16 | count = 0 17 | datapath = '/data/input/datasets/VL-CMU-CD/struc_train' 18 | labpath = '/data/input/datasets/VL-CMU-CD/vl_cmu_cd_binary_mask/vl_cmu_cd_binary_mask/train/mask_struc' 19 | sav = '/data/input/datasets/VL-CMU-CD/vl_cmu_cd_binary_mask/vl_cmu_cd_binary_mask/dtranet_vl_50pdata' 20 | nameing = 1 21 | for idx, did in enumerate(open(path_txt)): 22 | try: 23 | image1_name, image2_name, mask_name = did.strip("\n").split(' ') 24 | except ValueError: # Adhoc for test. 
25 | image_name = mask_name = did.strip("\n") 26 | extract_name = image1_name[image1_name.rindex('/') + 1: image1_name.rindex('.')] 27 | 28 | folder = image1_name.split('/') 29 | img1_file = os.path.join(pt, image1_name) 30 | img2_file = os.path.join(pt, image2_name) 31 | # items = len([name for name in os.listdir(fol_path)]) 32 | imgno = os.path.splitext(folder[2])[0] 33 | lbl_file = os.path.join(labpath, folder[1]) 34 | filename = list(spt(f)[0] for f in os.listdir(lbl_file)) 35 | filename.sort() 36 | lbl_file2 = os.path.join(lbl_file, filename[dict[imgno]]+'.png') 37 | 38 | print(img1_file) 39 | img_t0 = cv2.imread(img1_file, 1) 40 | img_t1 = cv2.imread(img2_file, 1) 41 | mask = cv2.imread(lbl_file2, 0) 42 | #rotate image 43 | image_t0_90 = cv2.rotate(img_t0, cv2.cv2.ROTATE_90_CLOCKWISE) 44 | image_t0_180 = cv2.rotate(image_t0_90, cv2.cv2.ROTATE_90_CLOCKWISE) 45 | image_t0_270 = cv2.rotate(image_t0_180, cv2.cv2.ROTATE_90_CLOCKWISE) 46 | image_t1_90 = cv2.rotate(img_t1, cv2.cv2.ROTATE_90_CLOCKWISE) 47 | image_t1_180 = cv2.rotate(image_t1_90, cv2.cv2.ROTATE_90_CLOCKWISE) 48 | image_t1_270 = cv2.rotate(image_t1_180, cv2.cv2.ROTATE_90_CLOCKWISE) 49 | mask_90 = cv2.rotate(mask, cv2.cv2.ROTATE_90_CLOCKWISE) 50 | mask_180 = cv2.rotate(mask_90, cv2.cv2.ROTATE_90_CLOCKWISE) 51 | mask_270 = cv2.rotate(mask_180, cv2.cv2.ROTATE_90_CLOCKWISE) 52 | print(pjoin(sav,'t0',str(nameing)+'.png')) 53 | cv2.imwrite(pjoin(sav,'t0',str(nameing)+'.png'), img_t0) 54 | cv2.imwrite(pjoin(sav,'t0',str(nameing+1)+'.png'), image_t0_90) 55 | cv2.imwrite(pjoin(sav,'t0',str(nameing+2)+'.png'), image_t0_180) 56 | cv2.imwrite(pjoin(sav,'t0',str(nameing+3)+'.png'), image_t0_270) 57 | cv2.imwrite(pjoin(sav, 't1', str(nameing) + '.png'), img_t1) 58 | cv2.imwrite(pjoin(sav, 't1', str(nameing + 1) + '.png'), image_t1_90) 59 | cv2.imwrite(pjoin(sav, 't1', str(nameing + 2) + '.png'), image_t1_180) 60 | cv2.imwrite(pjoin(sav, 't1', str(nameing + 3) + '.png'), image_t1_270) 61 | cv2.imwrite(pjoin(sav, 'mask', str(nameing) + '.png'), mask) 62 | cv2.imwrite(pjoin(sav, 'mask', str(nameing + 1) + '.png'), mask_90) 63 | cv2.imwrite(pjoin(sav, 'mask', str(nameing + 2) + '.png'), mask_180) 64 | cv2.imwrite(pjoin(sav, 'mask', str(nameing + 3) + '.png'), mask_270) 65 | 66 | nameing = nameing+4 67 | # with open(path) as g: 68 | # for line in g: 69 | # ls= line.split() 70 | # datapath = '/data/input/datasets/VL-CMU-CD/vl_cmu_cd_binary_mask/vl_cmu_cd_binary_mask/train/mask_900images' 71 | # formatpath = '/data/input/datasets/VL-CMU-CD/struc_train/gt_fold_rgb' 72 | # filename = list(spt(f)[0] for f in os.listdir(datapath) ) 73 | # filename.sort() 74 | # print(filename) 75 | # query_item = 0 76 | # for word in ls : 77 | # word1 = word.zfill(3) 78 | # fol_path = pjoin(formatpath, word1) 79 | # items = len([name for name in os.listdir(fol_path)]) 80 | # savepath = '/data/input/datasets/VL-CMU-CD/vl_cmu_cd_binary_mask/vl_cmu_cd_binary_mask/train/mask_struc' 81 | # Path(os.path.join(savepath, word1)).mkdir(parents=True, exist_ok=True) 82 | # for id in range(items): 83 | # q = query_item +id 84 | # copyfile(pjoin(datapath,filename[q]+'.png'), pjoin(savepath,word1,filename[q]+'.png')) 85 | # 86 | # query_item = items + query_item 87 | -------------------------------------------------------------------------------- /DSP/DR-TANet-main/train.py: -------------------------------------------------------------------------------- 1 | import os 2 | import csv 3 | import cv2 4 | import torch 5 | from TANet import TANet 6 | import numpy as np 7 | import 
datasets 8 | import torch.nn as nn 9 | import torch.nn.functional as F 10 | from tqdm import tqdm 11 | from os.path import join as pjoin 12 | from torch.utils.data import DataLoader 13 | from tensorboardX import SummaryWriter 14 | torch.cuda.empty_cache() 15 | import argparse 16 | 17 | 18 | class criterion_CEloss(nn.Module): 19 | def __init__(self,weight=None): 20 | super(criterion_CEloss, self).__init__() 21 | self.loss = nn.NLLLoss(weight) 22 | def forward(self,output,target): 23 | return self.loss(F.log_softmax(output, dim=1), target) 24 | 25 | class Train: 26 | 27 | def __init__(self): 28 | self.epoch = 0 29 | self.step = 0 30 | 31 | def train(self): 32 | 33 | weight = torch.ones(2) 34 | criterion = criterion_CEloss(weight.cuda()) 35 | optimizer = torch.optim.Adam(self.model.parameters(),lr=0.001,betas=(0.9,0.999)) 36 | lambda_lr = lambda epoch:(float)(self.args.max_epochs*len(self.dataset_train_loader)-self.step)/(float)(self.args.max_epochs*len(self.dataset_train_loader)) 37 | model_lr_scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer,lr_lambda=lambda_lr) 38 | 39 | f_loss = open(pjoin(self.checkpoint_save,"loss.csv"),'w') 40 | loss_writer = csv.writer(f_loss) 41 | 42 | self.visual_writer = SummaryWriter(os.path.join(self.checkpoint_save,'logs')) 43 | 44 | loss_item = [] 45 | 46 | max_step = self.args.max_epochs * len(self.dataset_train_loader) 47 | _,w,h = self.dataset_test.get_random_image()[0].shape 48 | img_tbx = np.zeros((max_step//self.args.step_test, 3, w*2, h*2), dtype=np.uint8) 49 | 50 | while self.epoch < self.args.max_epochs: 51 | 52 | for step,(inputs_train,mask_train) in enumerate(tqdm(self.dataset_train_loader)): 53 | self.model.train() 54 | inputs_train = inputs_train.cuda() 55 | mask_train = mask_train.cuda() 56 | output_train = self.model(inputs_train) 57 | optimizer.zero_grad() 58 | self.loss = criterion(output_train, mask_train[:,0]) 59 | loss_item.append(self.loss) 60 | self.loss.backward() 61 | optimizer.step() 62 | self.step += 1 63 | loss_writer.writerow([self.step,self.loss.item()]) 64 | self.visual_writer.add_scalar('loss',self.loss.item(),self.step) 65 | 66 | # if self.args.step_test>0 and self.step % self.args.step_test == 0: 67 | # print('testing...') 68 | # self.model.eval() 69 | # self.test(img_tbx) 70 | 71 | print('Loss for Epoch {}:{:.03f}'.format(self.epoch, sum(loss_item)/len(self.dataset_train_loader))) 72 | loss_item.clear() 73 | model_lr_scheduler.step() 74 | self.epoch += 1 75 | if self.args.epoch_save>0 and self.epoch % self.args.epoch_save == 0: 76 | self.checkpoint() 77 | 78 | self.visual_writer.add_images('cd_test',img_tbx,0, dataformats='NCHW') 79 | f_loss.close() 80 | self.visual_writer.close() 81 | 82 | def test(self,img_tbx): 83 | 84 | _, _, w_r, h_r = img_tbx.shape 85 | w_r //= 2 86 | h_r //= 2 87 | input, mask_gt = self.dataset_test.get_random_image() 88 | 89 | input = input.view(1, -1, h_r, w_r) 90 | input = input.cuda() 91 | output = self.model(input) 92 | 93 | input = input[0].cpu().data 94 | img_t0 = input[0:3, :, :] 95 | img_t1 = input[3:6, :, :] 96 | img_t0 = (img_t0 + 1) * 128 97 | img_t1 = (img_t1 + 1) * 128 98 | output = output[0].cpu().data 99 | mask_pred = np.where(F.softmax(output[0:2, :, :], dim=0)[0] > 0.5, 0, 255) 100 | mask_gt = np.squeeze(np.where(mask_gt == True, 255, 0), axis=0) 101 | self.store_result(img_t0, img_t1, mask_gt, mask_pred,img_tbx) 102 | 103 | def store_result(self, t0, t1, mask_gt, mask_pred, img_save): 104 | 105 | _, _, w, h = img_save.shape 106 | w //=2 107 | h //=2 108 | i = 
self.step//self.args.step_test - 1 109 | img_save[i, :, 0:w, 0:h] = t0.numpy().astype(np.uint8) 110 | img_save[i, :, 0:w, h:2 * h] = t1.numpy().astype(np.uint8) 111 | img_save[i, :, w:2 * w, 0:h] = np.transpose(cv2.cvtColor(mask_gt.astype(np.uint8), cv2.COLOR_GRAY2RGB),(2,0,1)).astype(np.uint8) 112 | img_save[i, :, w:2 * w, h:2 * h] = np.transpose(cv2.cvtColor(mask_pred.astype(np.uint8), cv2.COLOR_GRAY2RGB),(2,0,1)).astype(np.uint8) 113 | 114 | #img_save = np.transpose(img_save, (1, 0, 2)) 115 | 116 | def checkpoint(self): 117 | 118 | filename = '{:08d}.pth'.format(self.step) 119 | cp_path = pjoin(self.checkpoint_save,'checkpointdir') 120 | if not os.path.exists(cp_path): 121 | os.makedirs(cp_path) 122 | torch.save(self.model.state_dict(),pjoin(cp_path,filename)) 123 | print("Net Parameters in step:{:08d} were saved.".format(self.step)) 124 | 125 | def run(self): 126 | 127 | 128 | self.model = TANet(self.args.encoder_arch, self.args.local_kernel_size, self.args.attn_stride, 129 | self.args.attn_padding, self.args.attn_groups, self.args.drtam, self.args.refinement, self.args.pretrain, self.args.sslpretrain, self.args.ssl_path) 130 | 131 | if self.args.drtam: 132 | print('Dynamic Receptive Temporal Attention Network (DR-TANet)') 133 | else: 134 | print('Temporal Attention Network (TANet)') 135 | 136 | print('Encoder:' + self.args.encoder_arch) 137 | if self.args.refinement: 138 | print('Adding refinement...') 139 | 140 | if self.args.multi_gpu: 141 | self.model = nn.DataParallel(self.model).cuda() 142 | else: 143 | self.model = self.model.cuda() 144 | self.train() 145 | 146 | class train_pcd(Train): 147 | 148 | def __init__(self, arguments): 149 | super(train_pcd, self).__init__() 150 | self.args = arguments 151 | 152 | 153 | def Init(self,cvset): 154 | 155 | self.epoch = 0 156 | self.step = 0 157 | self.cvset = cvset 158 | if self.args.drtam: 159 | folder_name = 'DR-TANet' 160 | else: 161 | folder_name = 'TANet_k={}'.format(self.args.local_kernel_size) 162 | 163 | folder_name += ('_' + self.args.encoder_arch) 164 | if self.args.refinement: 165 | folder_name += '_ref' 166 | 167 | self.dataset_train_loader = DataLoader(datasets.pcd(pjoin(self.args.datadir, "set{}".format(self.cvset), "train")), 168 | num_workers=self.args.num_workers, batch_size=self.args.batch_size, 169 | shuffle=True) 170 | self.dataset_test = datasets.pcd(pjoin(self.args.datadir, 'set{}'.format(self.cvset), 'test')) 171 | self.checkpoint_save = pjoin(self.args.checkpointdir, folder_name, 'pcd', 'set{}'.format(self.cvset)) 172 | if not os.path.exists(self.checkpoint_save): 173 | os.makedirs(self.checkpoint_save) 174 | 175 | class train_cmu(Train): 176 | 177 | def __init__(self, arguments): 178 | super(train_cmu, self).__init__() 179 | self.args = arguments 180 | 181 | def Init(self): 182 | 183 | if self.args.drtam: 184 | folder_name = 'DR-TANet' 185 | else: 186 | folder_name = 'TANet_k={}'.format(self.args.local_kernel_size) 187 | 188 | folder_name += ('_' + self.args.encoder_arch) 189 | if self.args.refinement: 190 | folder_name += '_ref' 191 | 192 | self.dataset_train_loader = DataLoader(datasets.vl_cmu_cd(pjoin(self.args.datadir, "train"), self.args.data_num), 193 | num_workers=self.args.num_workers, batch_size=self.args.batch_size, 194 | shuffle=True) 195 | self.dataset_test = datasets.vl_cmu_cd(pjoin(self.args.datadir, 'test'), self.args.data_num ) 196 | self.checkpoint_save = pjoin(self.args.checkpointdir, folder_name, 'vl_cmu_cd') 197 | if not os.path.exists(self.checkpoint_save): 198 | 
os.makedirs(self.checkpoint_save) 199 | 200 | 201 | if __name__ =="__main__": 202 | parser = argparse.ArgumentParser(description="Arguments for training...") 203 | parser.add_argument('--dataset', type=str, default='pcd', required=True) 204 | parser.add_argument('--checkpointdir', required=True) 205 | parser.add_argument('--datadir', required=True) 206 | parser.add_argument('--multi-gpu',action='store_true',help='training with multi-gpus') 207 | parser.add_argument('--max-epochs', type=int, default=100) 208 | parser.add_argument('--num-workers', type=int, default=4) 209 | parser.add_argument('--batch-size', type=int, default=16) 210 | parser.add_argument('--epoch-save', type=int, default=20) 211 | parser.add_argument('--step-test', type=int, default=200) 212 | parser.add_argument('--encoder-arch', type=str, required=True) 213 | parser.add_argument('--local-kernel-size',type=int, default=1) 214 | parser.add_argument('--attn-stride', type=int, default=1) 215 | parser.add_argument('--attn-padding', type=int, default=0) 216 | parser.add_argument('--attn-groups', type=int, default=4) 217 | parser.add_argument('--drtam', action='store_true') 218 | parser.add_argument('--refinement', action='store_true') 219 | parser.add_argument('--ssl_path', type=str, help='[nb_pre,nb_nopre,bd_pre,bd_nopre]', required=True) 220 | parser.add_argument('--data_num', type=float, default=1.0) #[0.1,0.5,0.01] 221 | parser.add_argument('--pretrain', type=bool, required=True) 222 | parser.add_argument('--sslpretrain', type=bool, required=True) 223 | 224 | 225 | 226 | if parser.parse_args().dataset == 'pcd': 227 | train= train_pcd(parser.parse_args()) 228 | for set in range(0, 3): 229 | train.Init(set) 230 | train.run() 231 | elif parser.parse_args().dataset == 'vl_cmu_cd': 232 | train = train_cmu(parser.parse_args()) 233 | train.Init() 234 | train.run() 235 | else: 236 | print('Error: Cannot identify the dataset...(dataset: pcd or vl_cmu_cd)') 237 | exit(-1) 238 | 239 | 240 | 241 | 242 | 243 | 244 | 245 | 246 | -------------------------------------------------------------------------------- /DSP/DR-TANet-main/util.py: -------------------------------------------------------------------------------- 1 | import torch.nn as nn 2 | import torch.nn.functional as F 3 | 4 | __all__ = ['Upsample', 'upsample'] 5 | 6 | upsample = lambda x, size: F.interpolate(x, size, mode='bilinear', align_corners=False) 7 | 8 | 9 | class _BNReluConv(nn.Sequential): 10 | def __init__(self, num_maps_in, num_maps_out, k=3, batch_norm=True, bn_momentum=0.1, bias=False, dilation=1): 11 | super(_BNReluConv, self).__init__() 12 | if batch_norm: 13 | self.add_module('norm', nn.BatchNorm2d(num_maps_in, momentum=bn_momentum)) 14 | self.add_module('relu', nn.ReLU(inplace=batch_norm is True)) 15 | padding = k // 2 # same conv 16 | self.add_module('conv', nn.Conv2d(num_maps_in, num_maps_out, 17 | kernel_size=k, padding=padding, bias=bias, dilation=dilation)) 18 | 19 | 20 | class Upsample(nn.Module): 21 | def __init__(self, num_maps_in, skip_maps_in, num_maps_out, use_bn=True, k=3): 22 | super(Upsample, self).__init__() 23 | print(f'Upsample layer: in = {num_maps_in}, skip = {skip_maps_in}, out = {num_maps_out}') 24 | self.bottleneck = _BNReluConv(skip_maps_in, num_maps_in, k=1, batch_norm=use_bn) 25 | self.blend_conv = _BNReluConv(num_maps_in, num_maps_out, k=k, batch_norm=use_bn) 26 | 27 | def forward(self, x, skip): 28 | skip = self.bottleneck.forward(skip) 29 | skip_size = skip.size()[2:4] 30 | x = upsample(x, skip_size) 31 | x = x + skip 32 | x = 
self.blend_conv.forward(x) 33 | return x 34 | 35 | 36 | 37 | 38 | 39 | -------------------------------------------------------------------------------- /DSP/config/__pycache__/option.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/NeurAI-Lab/DSP/45027a3702696dafd7018802619dde17c6da1ca8/DSP/config/__pycache__/option.cpython-37.pyc -------------------------------------------------------------------------------- /DSP/config/option.py: -------------------------------------------------------------------------------- 1 | from util.utils import mkdir 2 | from util.dist_util import init_distributed_mode 3 | import argparse 4 | import torch 5 | 6 | 7 | class Options: 8 | def __init__(self): 9 | print("parsing..") 10 | parser = argparse.ArgumentParser(description="PyTorch Self-supervised Learning") 11 | parser.add_argument("--img_size", default=256, type=int, help="Image size(int) for RandomResizedCrop") # 224, 96 12 | # SSL specific settings 13 | parser.add_argument("--ssl_epochs", default=300, type=int, help="Number of epochs for training SSL") # 100, 200 14 | parser.add_argument("--ssl_model", default="simclr", type=str, help="SSL model") # simclr simclr_cd 15 | parser.add_argument("--backbone", default="resnet50", type=str, help="SSL backbone") # resnet18, resnet50 16 | parser.add_argument("--optimizer", default="lars", type=str, help="SSL optimizer") # adam, lars 17 | parser.add_argument("--ssl_dataset", default="CMU", type=str, help="SSL training dataset") # STL10, CIFAR100 18 | parser.add_argument("--ssl_batchsize", default=4, type=int, help="Batch size for SSL training ") # 32, 64, 128 19 | parser.add_argument("--temperature", default=0.5, type=float, help="Temperature parameter for NTXent loss used for SSL training") 20 | parser.add_argument("--ssl_lr", default=0.0003, type=float, help="Learning rate for SSL training") # 0.0003 21 | parser.add_argument("--n_proj", default=256, type=int, help="Projection head output size for SSL training") # 64, 128 22 | parser.add_argument("--ssl_normalize", type=bool, default=True, help="Normalize projection head output for SSL training") # True, False 23 | parser.add_argument("--scheduler", type=bool, default=True, help="Use CosineAnnealingLR for SSL training") # True, False 24 | parser.add_argument("--global_bn", type=bool, default=False, help="Use CosineAnnealingLR for SSL training") 25 | 26 | # zoom-in 27 | parser.add_argument("--zoom", type=bool, default=False, help="Whether to use zoom-in of cosine similarity in NT-Xent loss") 28 | parser.add_argument("--zoom_factor", default=10, type=int, help="Value of zoom-in in the zoom term") 29 | # Use margin in NT-Xent loss - EqCo 30 | parser.add_argument("--margin", type=bool, default=True, help="Whether to use margin in NT-Xent loss") 31 | parser.add_argument("--alpha", default=65536, type=int, help="Value of alpha in the margin term") 32 | 33 | # Use two backbones 34 | parser.add_argument("--m_backbone", type=bool, default=False, help="Whether to use momentum encoder") 35 | parser.add_argument("--m_update", type=float, default=0.990, help="Momentum update value (m)") 36 | parser.add_argument("--output_stride", type=int, default=16, help="outputstride (8 or 16)") 37 | parser.add_argument("--pre_train", type=bool, default=False, help="pretrain_enc") 38 | parser.add_argument("--encoder", type=str, default='resnet', help="resnet or vgg") 39 | parser.add_argument("--dense_cl", type=bool, default=True, help="Whether to use dense 
prediction") # True, False 40 | parser.add_argument("--copy_paste", type=bool, default=False, help="Whether to use copy paste aug") # True, False 41 | parser.add_argument("--barlow_twins", type=bool, default=True, help="Whether to use copy paste aug") # True, False 42 | parser.add_argument("--kd_loss", default=True, type=bool, help="kldiv") # kl, rkd,sp,wasserstein,fitnet, rka, rkda, rkd-kl, rkda-kl 43 | parser.add_argument("--kd_loss_2", default="sp", type=str, help="diff kd losses:rkd,sp,fitnet, rkd,rka,rkda") # kl, rkd,sp,wasserstein,fitnet, rka, rkda 44 | parser.add_argument("--alpha_kl", default=1000, type=float, help="Hyperparameter for KL-div") 45 | parser.add_argument("--alpha_sp", default=3000, type=float, help="Hyperparameter for similarity preserving") 46 | parser.add_argument("--alpha_inter_kd", default=100, type=float, help="Hyperparameter for inter and intra KL-div") 47 | parser.add_argument("--inter_kl", default=False, type=bool, help="calculate kl between to and t1 logits") # kl, rkd,sp,wasserstein,fitnet, rka, rkda, rkd-kl, rkda-kl 48 | parser.add_argument( 49 | "--nodiff_tc", action="store_true", default=False, help="do not reset weight each generation" 50 | ) 51 | 52 | parser.add_argument("--hidden_layer", type=int, default=512, help="hiddenlayer (512 or 1024)") 53 | parser.add_argument("--supervised_multihead", type=bool, default=True, help="Whether to use copy paste aug") # True, False 54 | 55 | # Different weighted loss functions 56 | parser.add_argument("--criterion_weight", nargs="*", type=int, default=[1, 0, 0, 0], 57 | help="Loss criterion weights for SSL training") # [1, 1000, 0, 0], [1, 0, 25, 50] 58 | # Directory 59 | parser.add_argument("--data_dir", default="/data/input/datasets/VL-CMU-CD/pcd", type=str, help="Directory to import data") # Absolute path 60 | parser.add_argument("--val_data_dir", default="/data/input/datasets/VL-CMU-CD/struc_test", type=str, help="Directory to import data") # Absolute path 61 | 62 | parser.add_argument("--save_dir", default="/volumes1/tmp", type=str, help="Directory to save log and model") # Absolute path /data/output/vijaya.ramkumar/sscd /volumes1/tmp /sscdv2/runs_1 63 | # testing SSL model 64 | parser.add_argument("--test_dataset", default="CMU", type=str, help="Dataset for testing SSL methods") # STL10, CIFAR10, ImageNet 65 | parser.add_argument("--test_data_dir", default="/data/input/datasets/VL-CMU-CD/struc_test", type=str, help="Directory to import data") # Absolute path 66 | parser.add_argument("--linear_batchsize", default=16, type=int, help="Test batch size for linear evaluation") # 32, 64, 128 67 | parser.add_argument("--linear_epochs", default=100, type=int, help="No.of epochs for Linear evaluation") # 100, 200 68 | parser.add_argument("--linear_classes", default=1, type=int, help="No.of classes for Linear evaluation") # 1 for binary classification 69 | parser.add_argument("--linear_lr", default=3e-4, type=float, help="Learning rate for Linear evaluation training") # 0.0003 70 | 71 | # testing SSL model 72 | parser.add_argument("--sup_dataset", default="CIFAR100", type=str) # STL10, CIFAR10, ImageNet 73 | parser.add_argument("--sup_data_dir", default="/volumes1/CIFAR100", type=str) # Absolute path 74 | parser.add_argument("--sup_batchsize", default=256, type=int) # 32, 64, 128 75 | parser.add_argument("--sup_lr", default=0.02, type=float) # 0.0003 76 | parser.add_argument("--sup_epochs", default=100, type=int) # 100 77 | 78 | # trained SSL model path 79 | parser.add_argument("--model_path", default=None, type=str, 
help="Saved SSL model path for transfer learning") # Absolute path 80 | 81 | # Distributed 82 | parser.add_argument("--distribute", type=bool, default=False, help="Distributed Data Parallel") # DistributedDataParallel 83 | parser.add_argument("--dist_url", type=str, default="env://") # Default URL for DistributedDataParallel 84 | 85 | # Visualizing Heatmap for test Images 86 | parser.add_argument("--bestcheckpoint", default='/data/output/vijaya.ramkumar/sscd/runs/resnet50_bs_2/Wed_May_12_17:00:10_2021/checkpoint_model_170_model1.pth', type=str) # '/data/output/vijaya.ramkumar/sscd/runs/resnet50_bs_8/Mon_Mar_22_16:59:24_2021/checkpoint_model_200_model1.pth' 87 | self.parser = parser 88 | 89 | def parse(self): 90 | args = self.parser.parse_args() 91 | mkdir(args.save_dir) 92 | args.device = torch.device("cuda" if torch.cuda.is_available() else "cpu") 93 | if args.distribute: 94 | init_distributed_mode(args) 95 | if args.ssl_lr is None: 96 | # In SimCLR, linear LR scaling = 0.3 * args.batchsize / 256 and square root LR scaling 0.075 × math.sqrt(BatchSize) 97 | args.ssl_lr = 0.03 * args.ssl_batchsize / 256 98 | # args.ssl_lr = 0.075 * math.sqrt(args.ssl_batchsize) 99 | return args 100 | -------------------------------------------------------------------------------- /DSP/criterion/__pycache__/ntxent.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/NeurAI-Lab/DSP/45027a3702696dafd7018802619dde17c6da1ca8/DSP/criterion/__pycache__/ntxent.cpython-37.pyc -------------------------------------------------------------------------------- /DSP/criterion/__pycache__/sim_preserving_kd.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/NeurAI-Lab/DSP/45027a3702696dafd7018802619dde17c6da1ca8/DSP/criterion/__pycache__/sim_preserving_kd.cpython-37.pyc -------------------------------------------------------------------------------- /DSP/criterion/ntxent.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | from util.utils import positive_mask 4 | import os 5 | import math 6 | import util.utils as utils 7 | import torch.nn.functional as F 8 | 9 | 10 | class NTXent(nn.Module): 11 | """ 12 | The Normalized Temperature-scaled Cross Entropy Loss 13 | Source: https://github.com/Spijkervet/SimCLR 14 | """ 15 | 16 | def __init__(self, args): 17 | super(NTXent, self).__init__() 18 | self.batch_size = args.ssl_batchsize 19 | self.margin = args.margin 20 | self.alpha = args.alpha 21 | self.temperature = args.temperature 22 | self.device = args.device 23 | self.mask = positive_mask(args.ssl_batchsize) 24 | self.criterion = nn.CrossEntropyLoss(reduction="sum") 25 | self.similarity_f = nn.CosineSimilarity(dim=2) 26 | self.N = 4 * self.batch_size 27 | self.zoom = args.zoom 28 | self.zoom_factor = args.zoom_factor 29 | self.writer = args.writer 30 | 31 | 32 | def forward(self, zx, zy, zx1, zy1, global_step): 33 | """ 34 | zx: projection output of batch zx 35 | zy: projection output of batch zy 36 | :return: normalized loss 37 | """ 38 | positive_samples, negative_samples = self.sample_no_dict(zx, zy, zx1, zy1) 39 | if self.margin: 40 | m = self.temperature * math.log(self.alpha / negative_samples.shape[1]) 41 | positive_samples = ((positive_samples * self.temperature) - m) / self.temperature 42 | 43 | labels = torch.zeros(self.N).to(positive_samples.device).long() 44 | logits = 
torch.cat((positive_samples, negative_samples), dim=1) 45 | loss = self.criterion(logits, labels) 46 | loss /= self.N 47 | 48 | return loss 49 | 50 | 51 | def sample_no_dict(self, zx, zy, zx1, zy1): 52 | """ 53 | Negative samples without dictionary 54 | """ 55 | # print(zx.shape) 56 | z = torch.cat((zx, zy, zx1,zy1), dim=0) 57 | sim = self.similarity_f(z.unsqueeze(1), z.unsqueeze(0)) / self.temperature 58 | # print(sim.shape,self.batch_size ) 59 | 60 | # Splitting the matrix into 4 blocks so as to count number of positive and negative samples 61 | sim_left, sim_right = torch.chunk(sim, 2, dim=1) 62 | sim_lu,sim_ll = torch.chunk(sim_left, 2, dim=0) 63 | sim_ru,sim_rl = torch.chunk(sim_right, 2, dim=0) 64 | # print(sim_lu.shape,self.batch_size ) 65 | 66 | # Extract positive samples from each block 67 | #sim_xy = torch.diag(sim, self.batch_size) 68 | pos_1 = torch.diag(sim_lu, self.batch_size) 69 | pos_2 = torch.diag(sim_lu, -self.batch_size) 70 | pos_3 = torch.diag(sim_rl, self.batch_size) 71 | pos_4 = torch.diag(sim_rl, -self.batch_size) 72 | # sim_yx = torch.diag(sim, -self.batch_size) 73 | positive_samples = torch.cat((pos_1, pos_2, pos_3, pos_4), dim=0).reshape(self.N, 1) 74 | 75 | # Extract negative samples 76 | neg_lu = sim_lu[self.mask].reshape(self.batch_size*2, 2*(self.batch_size-1) ) 77 | neg_rl = sim_rl[self.mask].reshape(self.batch_size*2, 2*(self.batch_size-1)) 78 | 79 | # Concatenating the extracted negatives from sim block left upper and right lower. 80 | neg_u = torch.cat((neg_lu, sim_ru), dim=1) 81 | neg_l = torch.cat((sim_ll, neg_rl), dim=1) 82 | negative_samples = torch.cat((neg_u, neg_l), dim=0) 83 | 84 | return positive_samples, negative_samples 85 | 86 | 87 | 88 | class BarlowTwinsLoss(torch.nn.Module): 89 | def __init__(self, device, lambda_param=5e-3): 90 | super(BarlowTwinsLoss, self).__init__() 91 | self.lambda_param = lambda_param 92 | self.device = device 93 | 94 | def forward(self, z_a: torch.Tensor, z_b: torch.Tensor): 95 | # normalize repr. along the batch dimension 96 | z_a_norm = (z_a - z_a.mean(0)) / z_a.std(0) # NxD 97 | z_b_norm = (z_b - z_b.mean(0)) / z_b.std(0) # NxD 98 | z_a_norm = z_a_norm.view(z_a_norm.size(0), z_a_norm.size(1)* z_a_norm.size(2)) 99 | z_b_norm = z_b_norm.view(z_b_norm.size(0), z_b_norm.size(1)*z_b_norm.size(2)) 100 | 101 | N = z_a.size(0) 102 | # D = z_a.size(1) 103 | D = z_a_norm.size(1) 104 | 105 | # print (z_a_norm.T.shape, z_b_norm.shape) 106 | # cross-correlation matrix 107 | # c= torch.einsum('yxb,bxy->xy', (z_a_norm.T, z_b_norm)) 108 | c = torch.mm(z_a_norm.T, z_b_norm) / N # DxD 109 | # print (c.shape) 110 | 111 | # loss 112 | c_diff = (c - torch.eye(D,device=self.device)).pow(2) # DxD 113 | # multiply off-diagonal elems of c_diff by lambda 114 | c_diff[~torch.eye(D, dtype=bool)] *= self.lambda_param 115 | loss = c_diff.sum() 116 | 117 | return loss 118 | 119 | 120 | class BarlowTwinsLoss_CD(torch.nn.Module): 121 | def __init__(self, args, lambda_param=5e-3): 122 | super(BarlowTwinsLoss_CD, self).__init__() 123 | self.lambda_param = lambda_param 124 | self.device = args.device 125 | self.dense_cl = args.dense_cl 126 | 127 | def forward(self, z_a: torch.Tensor, z_b: torch.Tensor,z_c: torch.Tensor, z_d: torch.Tensor): #, z_c: torch.Tensor, z_d: torch.Tensor 128 | 129 | # normalize repr. 
along the batch dimension 130 | z_a_norm = (z_a - z_a.mean(0)) / z_a.std(0) # NxD 131 | z_b_norm = (z_b - z_b.mean(0)) / z_b.std(0) # NxD 132 | z_c_norm = (z_c - z_c.mean(0)) / z_c.std(0) # NxD 133 | z_d_norm = (z_d - z_d.mean(0)) / z_d.std(0) # NxD 134 | 135 | N = z_a.size(0) 136 | if self.dense_cl == True: 137 | ## for dense activation 138 | z_a_norm = z_a_norm.view(z_a_norm.size(0), -1) 139 | z_b_norm = z_b_norm.view(z_b_norm.size(0), -1) 140 | z_c_norm = z_c_norm.view(z_c_norm.size(0), -1) 141 | z_d_norm = z_d_norm.view(z_d_norm.size(0), -1) 142 | D = z_a_norm.size(1) 143 | 144 | else: 145 | D = z_a.size(1) 146 | # print(z_a_norm.shape) 147 | # cross-correlation matrix 148 | c1 = torch.mm(z_a_norm.T, z_b_norm) / N # DxD 149 | # c2 = torch.mm(z_c_norm.T, z_d_norm) / N # DxD 150 | 151 | 152 | # loss 153 | c_diff1 = (c1 - torch.eye(D,device=self.device)).pow(2) # DxD 154 | # multiply off-diagonal elems of c_diff by lambda 155 | c_diff1[~torch.eye(D, dtype=bool)] *= self.lambda_param 156 | loss1 = c_diff1.sum() 157 | 158 | loss = loss1 159 | return loss 160 | 161 | -------------------------------------------------------------------------------- /DSP/criterion/sim_preserving_kd.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn.functional as F 3 | from torch import nn 4 | from torch.autograd import Variable 5 | 6 | criterion_MSE = nn.MSELoss(reduction='mean') 7 | 8 | 9 | def cross_entropy(y, labels): 10 | l_ce = F.cross_entropy(y, labels) 11 | return l_ce 12 | 13 | 14 | def distillation(student_scores, teacher_scores, T): 15 | 16 | p = F.log_softmax(student_scores / T, dim=1) 17 | q = F.softmax(teacher_scores / T, dim=1) 18 | 19 | l_kl = F.kl_div(p, q, size_average=False) * (T**2) / student_scores.shape[0] 20 | 21 | return l_kl 22 | 23 | 24 | class JSD(nn.Module): 25 | 26 | def __init__(self, args): 27 | super(JSD, self).__init__() 28 | self.dense= args.dense_cl 29 | def forward(self, net_1_logits, net_2_logits): 30 | if self.dense==True: 31 | net_1_logits = net_1_logits.view(net_1_logits.size(0), -1) 32 | net_2_logits = net_2_logits.view(net_2_logits.size(0), -1) 33 | 34 | 35 | net_1_probs = F.softmax(net_1_logits+ 1e-10, dim=1) 36 | net_2_probs = F.softmax(net_2_logits+ 1e-10, dim=1) 37 | 38 | total_m = 0.5 * (net_1_probs + net_2_probs) 39 | 40 | return 0.5 * (F.kl_div(F.log_softmax(net_1_logits, dim=1), total_m, reduction="batchmean") + 41 | F.kl_div(F.log_softmax(net_2_logits, dim=1), total_m, reduction="batchmean")) 42 | 43 | 44 | 45 | 46 | 47 | def fitnet_loss(A_t, A_s, rand=False, noise=0.1): 48 | """Given the activations for a batch of input from the teacher and student 49 | network, calculate the fitnet loss from the paper 50 | FitNets: Hints for Thin Deep Nets https://arxiv.org/abs/1412.6550 51 | 52 | Note: This function assumes that the number of channels and the spatial dimensions of 53 | the teacher and student activation maps are the same. 
54 | 55 | Parameters: 56 | A_t (4D tensor): activation maps from the teacher network of shape b x c x h x w 57 | A_s (4D tensor): activation maps from the student network of shape b x c x h x w 58 | 59 | Returns: 60 | l_fitnet (1D tensor): fitnet loss value 61 | """ 62 | if rand: 63 | rand_noise = torch.FloatTensor(A_t.shape).uniform_(1 - noise, 1 + noise) 64 | A_t = A_t * rand_noise 65 | 66 | return criterion_MSE(A_t, A_s) 67 | 68 | 69 | def at(x): 70 | return F.normalize(x.pow(2).mean(1).view(x.size(0), -1)) 71 | 72 | 73 | def at_loss(x, y, rand=False, noise=0.1): 74 | if rand: 75 | rand_noise = torch.FloatTensor(y.shape).uniform_(1 - noise, 1 + noise).cuda() 76 | y = y * rand_noise 77 | 78 | return (at(x) - at(y)).pow(2).mean() 79 | 80 | 81 | def FSP_loss(fea_t, short_t, fea_s, short_s, rand=False, noise=0.1): 82 | 83 | a, b, c, d = fea_t.size() 84 | feat = fea_t.view(a, b, c * d) 85 | a, b, c, d = short_t.size() 86 | shortt = short_t.view(a, b, c * d) 87 | G_t = torch.bmm(feat, shortt.permute(0, 2, 1)).div(c * d).detach() 88 | 89 | a, b, c, d = fea_s.size() 90 | feas = fea_s.view(a, b, c * d) 91 | a, b, c, d = short_s.size() 92 | shorts = short_s.view(a, b, c * d) 93 | G_s = torch.bmm(feas, shorts.permute(0, 2, 1)).div(c * d) 94 | 95 | return criterion_MSE(G_s, G_t) 96 | 97 | 98 | def similarity_preserving_loss(A_t, A_s): 99 | """Given the activations for a batch of input from the teacher and student 100 | network, calculate the similarity preserving knowledge distillation loss from the 101 | paper Similarity-Preserving Knowledge Distillation (https://arxiv.org/abs/1907.09682) 102 | equation 4 103 | 104 | Note: A_t and A_s must have the same batch size 105 | 106 | Parameters: 107 | A_t (4D tensor): activation maps from the teacher network of shape b x c1 x h1 x w1 108 | A_s (4D tensor): activation maps from the student network of shape b x c2 x h2 x w2 109 | 110 | Returns: 111 | l_sp (1D tensor): similarity preserving loss value 112 | """ 113 | 114 | # reshape the activations 115 | b1, c1, h1, w1 = A_t.shape 116 | b2, c2, h2, w2 = A_s.shape 117 | assert b1 == b2, 'Dim0 (batch size) of the activation maps must be compatible' 118 | 119 | Q_t = A_t.reshape([b1, c1 * h1 * w1]) 120 | Q_s = A_s.reshape([b2, c2 * h2 * w2]) 121 | 122 | # evaluate normalized similarity matrices (eq 3) 123 | G_t = torch.mm(Q_t, Q_t.t()) 124 | # G_t = G_t / G_t.norm(p=2) 125 | G_t = torch.nn.functional.normalize(G_t) 126 | 127 | G_s = torch.mm(Q_s, Q_s.t()) 128 | # G_s = G_s / G_s.norm(p=2) 129 | G_s = torch.nn.functional.normalize(G_s) 130 | 131 | # calculate the similarity preserving loss (eq 4) 132 | l_sp = (G_t - G_s).pow(2).mean() 133 | 134 | return l_sp 135 | 136 | def similarity_preserving_loss_cd(A_t, A_s, A_t1, A_s1 ): 137 | 138 | # reshape the activations 139 | b1, c1, h1, w1 = A_t.shape 140 | b2, c2, h2, w2 = A_s.shape 141 | assert b1 == b2, 'Dim0 (batch size) of the activation maps must be compatible' 142 | 143 | Q_t = A_t.reshape([b1, c1 * h1 * w1]) 144 | Q_s = A_s.reshape([b2, c2 * h2 * w2]) 145 | Q_t1 = A_t1.reshape([b1, c1 * h1 * w1]) 146 | Q_s1 = A_s1.reshape([b2, c2 * h2 * w2]) 147 | # evaluate normalized similarity matrices (eq 3) 148 | G_t = torch.mm(Q_t, Q_s.t()) 149 | # G_t = G_t / G_t.norm(p=2) 150 | G_t = torch.nn.functional.normalize(G_t) 151 | 152 | G_s = torch.mm(Q_t1, Q_s1.t()) 153 | # G_s = G_s / G_s.norm(p=2) 154 | G_s = torch.nn.functional.normalize(G_s) 155 | 156 | # calculate the similarity preserving loss (eq 4) 157 | l_sp = (G_t - G_s).pow(2).mean() 158 | 159 | return l_sp 160 | 
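# --- Illustrative usage sketch (added for clarity; not in the original file) ---
# Both similarity-preserving losses above take raw activation maps of shape
# (b, c, h, w), flatten them to (b, c*h*w) and compare row-normalised similarity
# matrices (eq. 3/4 of the SP-KD paper). The tensors below are hypothetical:
# for similarity_preserving_loss only the batch size has to match, while the
# *_cd variant additionally needs matching flattened dimensions, because it
# correlates the two streams directly via torch.mm(Q_t, Q_s.t()).
def _sp_loss_demo():
    A_t = torch.randn(8, 64, 32, 32)    # teacher activations, temporal view t0
    A_s = torch.randn(8, 64, 32, 32)    # student activations, temporal view t0
    A_t1 = torch.randn(8, 64, 32, 32)   # teacher activations, temporal view t1
    A_s1 = torch.randn(8, 64, 32, 32)   # student activations, temporal view t1

    l_sp = similarity_preserving_loss(A_t, A_s)                    # single-view SP loss
    l_sp_cd = similarity_preserving_loss_cd(A_t, A_s, A_t1, A_s1)  # cross-view variant
    return l_sp, l_sp_cd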
161 | 162 | 163 | class SlicedWassersteinDiscrepancy(nn.Module): 164 | """PyTorch adoption of https://github.com/apple/ml-cvpr2019-swd""" 165 | def __init__(self, mean=0, sd=1, device='cpu'): 166 | super(SlicedWassersteinDiscrepancy, self).__init__() 167 | self.dist = torch.distributions.Normal(mean, sd) 168 | self.device = device 169 | 170 | def forward(self, p1, p2): 171 | if p1.shape[1] > 1: 172 | # For data more than one-dimensional input, perform multiple random 173 | # projection to 1-D 174 | proj = self.dist.sample([p1.shape[1], 128]).to(self.device) 175 | proj *= torch.rsqrt(torch.sum(proj.pow(2), dim=0, keepdim=True)) 176 | 177 | p1 = torch.mm(p1, proj) 178 | p2 = torch.mm(p2, proj) 179 | 180 | p1, _ = torch.sort(p1, 0, descending=True) 181 | p2, _ = torch.sort(p2, 0, descending=True) 182 | 183 | wdist = (p1 - p2).pow(2).mean() 184 | 185 | return wdist 186 | 187 | 188 | class RKD(object): 189 | """ 190 | Wonpyo Park, Dongju Kim, Yan Lu, Minsu Cho. 191 | relational knowledge distillation. 192 | arXiv preprint arXiv:1904.05068, 2019. 193 | """ 194 | def __init__(self, device, eval_dist_loss=True, eval_angle_loss=False): 195 | super(RKD, self).__init__() 196 | self.device = device 197 | self.eval_dist_loss = eval_dist_loss 198 | self.eval_angle_loss = eval_angle_loss 199 | self.huber_loss = torch.nn.SmoothL1Loss() 200 | 201 | @staticmethod 202 | def distance_wise_potential(x): 203 | x_square = x.pow(2).sum(dim=-1) 204 | prod = torch.matmul(x, x.t()) 205 | distance = torch.sqrt( 206 | torch.clamp( torch.unsqueeze(x_square, 1) + torch.unsqueeze(x_square, 0) - 2 * prod, 207 | min=1e-12)) 208 | mu = torch.sum(distance) / torch.sum( 209 | torch.where(distance > 0., torch.ones_like(distance), 210 | torch.zeros_like(distance))) 211 | 212 | return distance / (mu + 1e-8) 213 | 214 | @staticmethod 215 | def angle_wise_potential(x): 216 | e = torch.unsqueeze(x, 0) - torch.unsqueeze(x, 1) 217 | e_norm = torch.nn.functional.normalize(e, dim=2) 218 | return torch.matmul(e_norm, torch.transpose(e_norm, -1, -2)) 219 | 220 | def eval_loss(self, source, target): 221 | 222 | # Flatten tensors 223 | source = source.reshape(source.shape[0], -1) 224 | target = target.reshape(target.shape[0], -1) 225 | 226 | # normalize 227 | source = torch.nn.functional.normalize(source, dim=1) 228 | target = torch.nn.functional.normalize(target, dim=1) 229 | 230 | distance_loss = torch.tensor([0.]).to(self.device) 231 | angle_loss = torch.tensor([0.]).to(self.device) 232 | 233 | if self.eval_dist_loss: 234 | distance_loss = self.huber_loss( 235 | self.distance_wise_potential(source), self.distance_wise_potential(target) 236 | ) 237 | 238 | if self.eval_angle_loss: 239 | angle_loss = self.huber_loss( 240 | self.angle_wise_potential(source), self.angle_wise_potential(target) 241 | ) 242 | 243 | return distance_loss, angle_loss -------------------------------------------------------------------------------- /DSP/dataset/CMU.py: -------------------------------------------------------------------------------- 1 | import torch 2 | from torch.utils.data.dataset import Dataset 3 | import numpy as np 4 | import os 5 | from PIL import Image 6 | import random 7 | from scipy.ndimage import gaussian_filter 8 | from torchvision import transforms 9 | from config.option import Options 10 | import matplotlib.pyplot as plt 11 | 12 | args = Options().parse() 13 | 14 | 15 | IMG_EXTENSIONS = [ 16 | '.jpg', '.JPG', '.jpeg', '.JPEG', 17 | '.png', '.PNG', '.ppm', '.PPM', '.bmp', '.BMP', 18 | ] 19 | 20 | def is_image_file(filename): 21 | 
print(filename) 22 | return any(filename.endswith(extension) for extension in IMG_EXTENSIONS) 23 | 24 | def pil_loader(path): 25 | # open path as file to avoid ResourceWarning (https://github.com/python-pillow/Pillow/issues/835) 26 | with open(path, 'rb') as f: 27 | with Image.open(f) as img: 28 | return img.convert('RGB') 29 | 30 | 31 | palette = [0, 0, 0,255,0,0] 32 | 33 | def colorize_mask(mask): 34 | # mask: numpy array of the mask 35 | new_mask = Image.fromarray(mask.astype(np.uint8)).convert('P') 36 | new_mask.putpalette(palette) 37 | 38 | return new_mask 39 | 40 | def get_pascal_labels(): 41 | return np.asarray([[0,0,0],[0,0,255]]) 42 | 43 | def decode_segmap(temp, plot=False): 44 | 45 | label_colours = get_pascal_labels() 46 | r = temp.copy() 47 | g = temp.copy() 48 | b = temp.copy() 49 | for l in range(0, 2): 50 | r[temp == l] = label_colours[l, 0] 51 | g[temp == l] = label_colours[l, 1] 52 | b[temp == l] = label_colours[l, 2] 53 | 54 | rgb = np.zeros((temp.shape[0], temp.shape[1], 3)) 55 | rgb[:, :, 0] = r 56 | rgb[:, :, 1] = g 57 | rgb[:, :, 2] = b 58 | #rgb = np.resize(rgb,(321,321,3)) 59 | if plot: 60 | plt.imshow(rgb) 61 | plt.show() 62 | else: 63 | return rgb 64 | 65 | 66 | class Dataset(Dataset): 67 | 68 | def __init__(self,data_path,split_flag, flag_type= 'ssl', transform=False, transform_med=None): 69 | self.size = args.img_size 70 | self.train_data_path = os.path.join(data_path, "struc_train") 71 | self.test_data_path = os.path.join(data_path, 'struc_test') 72 | self.img_txt_path = os.path.join(self.train_data_path, 'train_pair.txt') 73 | self.test_img_txt_path = os.path.join(self.test_data_path, 'test_pair.txt') 74 | print( self.test_img_txt_path) 75 | # # Load the text file containing image pair 76 | # self.imgs_path_list = np.loadtxt(self.img_txt_path,dtype=str) 77 | # self.test_imgs_path_list = np.loadtxt(self.test_img_txt_path,dtype=str) 78 | 79 | self.flag = split_flag 80 | self.flag_type = flag_type 81 | self.transform = transform 82 | self.transform_med = transform_med 83 | self.img_label_path_pairs = self.get_img_label_path_pairs() 84 | 85 | def get_img_label_path_pairs(self): 86 | 87 | img_label_pair_list = {} 88 | if self.flag =='train': 89 | for idx, did in enumerate(open(self.img_txt_path)): 90 | try: 91 | image1_name,image2_name,mask_name = did.strip("\n").split(' ') 92 | except ValueError: # Adhoc for test. 93 | image_name = mask_name = did.strip("\n") 94 | extract_name = image1_name[image1_name.rindex('/') +1: image1_name.rindex('.')] 95 | img1_file = os.path.join(self.train_data_path, image1_name) 96 | img2_file = os.path.join(self.train_data_path, image2_name) 97 | lbl_file = os.path.join(self.train_data_path, mask_name) 98 | img_label_pair_list.setdefault(idx, [img1_file,img2_file,lbl_file, image1_name, image2_name]) 99 | 100 | if self.flag == 'val': 101 | self.label_ext = '.png' 102 | for idx , did in enumerate(open(self.test_img_txt_path)): 103 | try: 104 | image1_name, image2_name, mask_name = did.strip("\n").split(' ') 105 | except ValueError: # Adhoc for test. 
106 | image_name = mask_name = did.strip("\n") 107 | # extract_name = image1_name[image1_name.rindex('/') +1: image1_name.rindex('.')] 108 | img1_file = os.path.join(self.test_data_path, image1_name) 109 | img2_file = os.path.join(self.test_data_path, image2_name) 110 | lbl_file = os.path.join(self.test_data_path, mask_name) 111 | img_label_pair_list.setdefault(idx, [img1_file, img2_file, lbl_file, image1_name, image2_name]) 112 | 113 | return img_label_pair_list 114 | 115 | def data_transform(self, img1,img2,lbl): 116 | rz = transforms.Compose([transforms.Resize(size=(512,512))]) 117 | img1 = rz(img1) 118 | img2 = rz(img2) 119 | lbl= transforms.ToPILImage()(lbl) 120 | lbl = rz(lbl) 121 | img1 = transforms.ToTensor()(img1) 122 | img2 = transforms.ToTensor()(img2) 123 | lbl = transforms.ToTensor()(lbl) 124 | #lbl_reverse = torch.from_numpy(lbl_reverse).long() 125 | return img1,img2,lbl 126 | 127 | def extract_instance(self, img1, img2, lbl): 128 | 129 | obj_mask = 1*(lbl >1) 130 | img1 = np.array(img1) 131 | gau_masks = gaussian_filter(obj_mask, sigma=1) 132 | gau_masks = np.reshape(gau_masks, (gau_masks.shape[0], gau_masks.shape[1], 1)) 133 | instance = img1* gau_masks 134 | transform = transforms.ToTensor() 135 | img1 = transform(img1) 136 | img2 = transform(img2) 137 | obj_mask = transform(obj_mask) 138 | instance = transform(instance) 139 | return img1, img2, obj_mask, instance 140 | 141 | def __getitem__(self, index): 142 | 143 | img1_path,img2_path,label_path,filename1, filename2 = self.img_label_path_pairs[index] 144 | # print(img1_path,filename1,filename2) 145 | ####### load images ############# 146 | img1 = Image.open(img1_path) 147 | img2 = Image.open(img2_path) 148 | 149 | label = Image.open(label_path) 150 | label = np.array(label, dtype=np.int32) 151 | 152 | height,width, d = np.array(img1,dtype= np.uint8).shape 153 | 154 | if self.transform_med != None: 155 | # normal simclr 156 | img1_0, img2_0 = self.transform_med(img1, img2) 157 | img1_1, img2_1= self.transform_med(img1,img2) 158 | 159 | if self.flag_type == 'ssl': 160 | return img1_0, img1_1, img2_0, img2_1, str(filename1), str(filename2), label 161 | elif self.flag_type == 'linear_eval': 162 | image_dict = {'pos1': img1_0, 'pos2': img1_1, 'neg1': img2_0, 'neg2': img2_1} 163 | type1, type2 = random.sample(list(image_dict.keys()), k=2) 164 | 165 | if any('pos' in s for s in [type1, type2]) and any('neg' in s for s in [type1, type2]): 166 | y = 1 167 | i1 = image_dict[type1] 168 | i2 = image_dict[type2] 169 | return i1, i2, y 170 | else: 171 | y = 0 172 | i1 = image_dict[type1] 173 | i2 = image_dict[type2] 174 | return i1, i2, y 175 | 176 | ####### load labels ############ 177 | if self.flag == 'train': 178 | label = Image.open(label_path) 179 | # if self.transform_med != None: # enable this during fine tuning 180 | # label = self.transform_med(label) 181 | label = np.array(label,dtype=np.int32) 182 | 183 | if self.flag == 'val': 184 | label = Image.open(label_path) 185 | # if self.transform_med != None: # enable this during fine tuning 186 | # label = self.transform_med(label) 187 | label = np.array(label,dtype=np.int32) 188 | 189 | if self.transform : 190 | img1, img2, label = self.data_transform(img1,img2,label) #self.extract_instance(img1, img2, label) 191 | 192 | return img1, img2, label 193 | 194 | 195 | else: 196 | return img1, img2, label 197 | def __len__(self): 198 | 199 | return len(self.img_label_path_pairs) 200 | 201 | 202 | 203 | -------------------------------------------------------------------------------- 
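# --- Illustrative usage sketch for the CMU Dataset above (not part of the repository) ---
# transform_med must be a callable that maps a PIL image pair (img1, img2) to two
# augmented views; __getitem__ calls it twice per pair when flag_type == 'ssl' and
# returns (img1_0, img1_1, img2_0, img2_1, name1, name2, mask). The dataset root and
# `pairwise_transform` below are placeholders standing in for the project's
# SimCLR-style augmentation (see transforms/simclr_transform.py).
from torchvision import transforms as T
from dataset.CMU import Dataset as CMUDataset

def pairwise_transform(img1, img2):
    # hypothetical stand-in: resize both temporal views and convert them to tensors
    tf = T.Compose([T.Resize((256, 256)), T.ToTensor()])
    return tf(img1), tf(img2)

ssl_set = CMUDataset("/path/to/VL-CMU-CD", split_flag="train", flag_type="ssl",
                     transform=False, transform_med=pairwise_transform)
img1_0, img1_1, img2_0, img2_1, name1, name2, mask = ssl_set[0]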
/DSP/dataset/PCD.py: -------------------------------------------------------------------------------- 1 | import torch 2 | from torch.utils.data.dataset import Dataset 3 | import numpy as np 4 | import os 5 | import scipy.io 6 | import scipy.misc as m 7 | from PIL import Image 8 | import random 9 | from scipy.ndimage import gaussian_filter 10 | from torchvision import transforms 11 | from config.option import Options 12 | import matplotlib.pyplot as plt 13 | 14 | args = Options().parse() 15 | 16 | 17 | class Dataset(Dataset): 18 | 19 | def __init__(self,data_path,split_flag, flag_type= 'ssl', transform=False, transform_med=None): 20 | self.size = args.img_size 21 | self.train_data_path = os.path.join(data_path, "struc_train") 22 | self.test_data_path = os.path.join(data_path, 'struc_test') 23 | self.img_txt_path = os.path.join(self.train_data_path, 'train_pair.txt') 24 | self.test_img_txt_path = os.path.join(self.test_data_path, 'test_pair.txt') 25 | self.flag = split_flag 26 | self.flag_type = flag_type 27 | self.transform = transform 28 | self.transform_med = transform_med 29 | self.img_label_path_pairs = self.get_img_label_path_pairs() 30 | 31 | def get_img_label_path_pairs(self): 32 | 33 | img_label_pair_list = {} 34 | if self.flag =='train': 35 | for idx, did in enumerate(open(self.img_txt_path)): 36 | try: 37 | image1_name,image2_name,mask_name = did.strip("\n").split(' ') 38 | except ValueError: # Adhoc for test. 39 | image_name = mask_name = did.strip("\n") 40 | extract_name = image1_name[image1_name.rindex('/') +1: image1_name.rindex('.')] 41 | img1_file = os.path.join(self.train_data_path, image1_name) 42 | img2_file = os.path.join(self.train_data_path, image2_name) 43 | lbl_file = os.path.join(self.train_data_path, mask_name) 44 | img_label_pair_list.setdefault(idx, [img1_file,img2_file,lbl_file, image1_name, image2_name]) 45 | 46 | if self.flag == 'val': 47 | self.label_ext = '.png' 48 | for idx , did in enumerate(open(self.test_img_txt_path)): 49 | try: 50 | image1_name, image2_name, mask_name = did.strip("\n").split(' ') 51 | except ValueError: # Adhoc for test. 52 | image_name = mask_name = did.strip("\n") 53 | # extract_name = image1_name[image1_name.rindex('/') +1: image1_name.rindex('.')] 54 | img1_file = os.path.join(self.test_data_path, image1_name) 55 | img2_file = os.path.join(self.test_data_path, image2_name) 56 | lbl_file = os.path.join(self.test_data_path, mask_name) 57 | img_label_pair_list.setdefault(idx, [img1_file, img2_file, lbl_file, image1_name, image2_name]) 58 | 59 | return img_label_pair_list 60 | 61 | def data_transform(self, img1,img2,lbl): 62 | rz = transforms.Compose([transforms.Resize(size=(512,512))]) 63 | img1 = rz(img1) 64 | img2 = rz(img2) 65 | lbl= transforms.ToPILImage()(lbl) 66 | lbl = rz(lbl) 67 | img1 = transforms.ToTensor()(img1) 68 | img2 = transforms.ToTensor()(img2) 69 | lbl = transforms.ToTensor()(lbl) 70 | #lbl_reverse = torch.from_numpy(lbl_reverse).long() 71 | return img1,img2,lbl 72 | 73 | def extract_instance(self, img1, img2, lbl): 74 | #USE THIS IF YOU WANT TO CREATE MORE IMAGES USING COPY PASTE AUGMENTTAION. 
75 | '''This extracts the instances belonging to the changed region and paste it on exsisting images to create new images.''' 76 | obj_mask = 1*(lbl >1) 77 | img1 = np.array(img1) 78 | gau_masks = gaussian_filter(obj_mask, sigma=1) 79 | gau_masks = np.reshape(gau_masks, (gau_masks.shape[0], gau_masks.shape[1], 1)) 80 | instance = img1* gau_masks 81 | transform = transforms.ToTensor() 82 | img1 = transform(img1) 83 | img2 = transform(img2) 84 | obj_mask = transform(obj_mask) 85 | instance = transform(instance) 86 | return img1, img2, obj_mask, instance 87 | 88 | def __getitem__(self, index): 89 | 90 | img1_path,img2_path,label_path,filename1, filename2 = self.img_label_path_pairs[index] 91 | ####### load images ############# 92 | img1 = Image.open(img1_path) 93 | img2 = Image.open(img2_path) 94 | # img1 = np.asarray(img1) 95 | # img2 = np.asarray(img2) 96 | 97 | label = Image.open(label_path) 98 | label = np.array(label, dtype=np.int32) 99 | 100 | height,width, d = np.array(img1,dtype= np.uint8).shape 101 | 102 | if self.transform_med != None: 103 | # normal simclr 104 | img1_0, img2_0 = self.transform_med(img1, img2) 105 | img1_1, img2_1= self.transform_med(img1, img2) 106 | # print(img1_1.shape) 107 | img1_0 = np.asarray(img1_0).astype("f").transpose(2, 0, 1) / 128.0 - 1.0 108 | img2_0 = np.asarray(img2_0).astype("f").transpose(2, 0, 1) / 128.0 - 1.0 109 | img1_1 = np.asarray(img1_1).astype("f").transpose(2, 0, 1) / 128.0 - 1.0 110 | img2_1 = np.asarray(img2_1).astype("f").transpose(2, 0, 1) / 128.0 - 1.0 111 | img1_0 = torch.from_numpy(img1_0).float() 112 | img1_1 = torch.from_numpy(img1_1).float() 113 | img2_0 = torch.from_numpy(img2_0).float() 114 | img2_1 = torch.from_numpy(img2_1).float() 115 | 116 | if self.flag_type == 'ssl': 117 | return img1_0, img1_1, img2_0, img2_1, str(filename1), str(filename2), label 118 | elif self.flag_type == 'linear_eval': 119 | image_dict = {'pos1': img1_0, 'pos2': img1_1, 'neg1': img2_0, 'neg2': img2_1} 120 | type1, type2 = random.sample(list(image_dict.keys()), k=2) 121 | 122 | if any('pos' in s for s in [type1, type2]) and any('neg' in s for s in [type1, type2]): 123 | y = 1 124 | i1 = image_dict[type1] 125 | i2 = image_dict[type2] 126 | return i1, i2, y 127 | else: 128 | y = 0 129 | i1 = image_dict[type1] 130 | i2 = image_dict[type2] 131 | return i1, i2, y 132 | 133 | ####### load labels ############ 134 | if self.flag == 'train': 135 | label = Image.open(label_path) 136 | # if self.transform_med != None: # enable this during fine tuning 137 | # label = self.transform_med(label) 138 | label = np.array(label,dtype=np.int32) 139 | 140 | if self.flag == 'val': 141 | label = Image.open(label_path) 142 | # if self.transform_med != None: # enable this during fine tuning 143 | # label = self.transform_med(label) 144 | label = np.array(label,dtype=np.int32) 145 | 146 | if self.transform : 147 | img1, img2, label = self.data_transform(img1,img2,label) #self.extract_instance(img1, img2, label) 148 | 149 | return img1, img2, label 150 | 151 | else: 152 | return img1, img2, label 153 | def __len__(self): 154 | 155 | return len(self.img_label_path_pairs) 156 | 157 | 158 | 159 | 160 | 161 | 162 | -------------------------------------------------------------------------------- /DSP/dataset/__pycache__/CMU.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/NeurAI-Lab/DSP/45027a3702696dafd7018802619dde17c6da1ca8/DSP/dataset/__pycache__/CMU.cpython-37.pyc 
-------------------------------------------------------------------------------- /DSP/dataset/__pycache__/PCD.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/NeurAI-Lab/DSP/45027a3702696dafd7018802619dde17c6da1ca8/DSP/dataset/__pycache__/PCD.cpython-37.pyc -------------------------------------------------------------------------------- /DSP/linear.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import sys 3 | sys.path.insert(0, '.') 4 | from models.simclr import SimCLR 5 | from util.test import test_all_datasets, initialize, testloaderSimCLR 6 | import numpy as np 7 | 8 | import warnings 9 | warnings.filterwarnings("ignore", category=UserWarning) 10 | 11 | np.random.seed(10) 12 | torch.manual_seed(10) 13 | 14 | 15 | if __name__ == '__main__': 16 | args, writer = initialize() 17 | simclr = SimCLR(args) 18 | state_dict = torch.load(args.model_path, map_location=args.device) 19 | simclr.load_state_dict(state_dict) 20 | simclr = simclr.cuda() 21 | test_all_datasets(args, writer, simclr) 22 | -------------------------------------------------------------------------------- /DSP/modeling/backbone/__init__.py: -------------------------------------------------------------------------------- 1 | from modeling.backbone import resnet, xception, drn, mobilenet 2 | 3 | def build_backbone(backbone, output_stride, BatchNorm): 4 | if backbone == 'resnet101': 5 | return resnet.ResNet101(output_stride, BatchNorm) 6 | if backbone == 'resnet50': 7 | return resnet.ResNet50(output_stride, BatchNorm) 8 | elif backbone == 'xception': 9 | return xception.AlignedXception(output_stride, BatchNorm) 10 | elif backbone == 'drn': 11 | return drn.drn_d_54(BatchNorm) 12 | elif backbone == 'mobilenet': 13 | return mobilenet.MobileNetV2(output_stride, BatchNorm) 14 | else: 15 | raise NotImplementedError 16 | -------------------------------------------------------------------------------- /DSP/modeling/backbone/__pycache__/__init__.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/NeurAI-Lab/DSP/45027a3702696dafd7018802619dde17c6da1ca8/DSP/modeling/backbone/__pycache__/__init__.cpython-37.pyc -------------------------------------------------------------------------------- /DSP/modeling/backbone/__pycache__/drn.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/NeurAI-Lab/DSP/45027a3702696dafd7018802619dde17c6da1ca8/DSP/modeling/backbone/__pycache__/drn.cpython-37.pyc -------------------------------------------------------------------------------- /DSP/modeling/backbone/__pycache__/mobilenet.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/NeurAI-Lab/DSP/45027a3702696dafd7018802619dde17c6da1ca8/DSP/modeling/backbone/__pycache__/mobilenet.cpython-37.pyc -------------------------------------------------------------------------------- /DSP/modeling/backbone/__pycache__/resnet.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/NeurAI-Lab/DSP/45027a3702696dafd7018802619dde17c6da1ca8/DSP/modeling/backbone/__pycache__/resnet.cpython-37.pyc -------------------------------------------------------------------------------- /DSP/modeling/backbone/__pycache__/xception.cpython-37.pyc: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/NeurAI-Lab/DSP/45027a3702696dafd7018802619dde17c6da1ca8/DSP/modeling/backbone/__pycache__/xception.cpython-37.pyc -------------------------------------------------------------------------------- /DSP/modeling/backbone/mobilenet.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn.functional as F 3 | import torch.nn as nn 4 | import math 5 | from modeling.sync_batchnorm.batchnorm import SynchronizedBatchNorm2d 6 | import torch.utils.model_zoo as model_zoo 7 | 8 | def conv_bn(inp, oup, stride, BatchNorm): 9 | return nn.Sequential( 10 | nn.Conv2d(inp, oup, 3, stride, 1, bias=False), 11 | BatchNorm(oup), 12 | nn.ReLU6(inplace=True) 13 | ) 14 | 15 | 16 | def fixed_padding(inputs, kernel_size, dilation): 17 | kernel_size_effective = kernel_size + (kernel_size - 1) * (dilation - 1) 18 | pad_total = kernel_size_effective - 1 19 | pad_beg = pad_total // 2 20 | pad_end = pad_total - pad_beg 21 | padded_inputs = F.pad(inputs, (pad_beg, pad_end, pad_beg, pad_end)) 22 | return padded_inputs 23 | 24 | 25 | class InvertedResidual(nn.Module): 26 | def __init__(self, inp, oup, stride, dilation, expand_ratio, BatchNorm): 27 | super(InvertedResidual, self).__init__() 28 | self.stride = stride 29 | assert stride in [1, 2] 30 | 31 | hidden_dim = round(inp * expand_ratio) 32 | self.use_res_connect = self.stride == 1 and inp == oup 33 | self.kernel_size = 3 34 | self.dilation = dilation 35 | 36 | if expand_ratio == 1: 37 | self.conv = nn.Sequential( 38 | # dw 39 | nn.Conv2d(hidden_dim, hidden_dim, 3, stride, 0, dilation, groups=hidden_dim, bias=False), 40 | BatchNorm(hidden_dim), 41 | nn.ReLU6(inplace=True), 42 | # pw-linear 43 | nn.Conv2d(hidden_dim, oup, 1, 1, 0, 1, 1, bias=False), 44 | BatchNorm(oup), 45 | ) 46 | else: 47 | self.conv = nn.Sequential( 48 | # pw 49 | nn.Conv2d(inp, hidden_dim, 1, 1, 0, 1, bias=False), 50 | BatchNorm(hidden_dim), 51 | nn.ReLU6(inplace=True), 52 | # dw 53 | nn.Conv2d(hidden_dim, hidden_dim, 3, stride, 0, dilation, groups=hidden_dim, bias=False), 54 | BatchNorm(hidden_dim), 55 | nn.ReLU6(inplace=True), 56 | # pw-linear 57 | nn.Conv2d(hidden_dim, oup, 1, 1, 0, 1, bias=False), 58 | BatchNorm(oup), 59 | ) 60 | 61 | def forward(self, x): 62 | x_pad = fixed_padding(x, self.kernel_size, dilation=self.dilation) 63 | if self.use_res_connect: 64 | x = x + self.conv(x_pad) 65 | else: 66 | x = self.conv(x_pad) 67 | return x 68 | 69 | 70 | class MobileNetV2(nn.Module): 71 | def __init__(self, output_stride=8, BatchNorm=None, width_mult=1., pretrained=True): 72 | super(MobileNetV2, self).__init__() 73 | block = InvertedResidual 74 | input_channel = 32 75 | current_stride = 1 76 | rate = 1 77 | interverted_residual_setting = [ 78 | # t, c, n, s 79 | [1, 16, 1, 1], 80 | [6, 24, 2, 2], 81 | [6, 32, 3, 2], 82 | [6, 64, 4, 2], 83 | [6, 96, 3, 1], 84 | [6, 160, 3, 2], 85 | [6, 320, 1, 1], 86 | ] 87 | 88 | # building first layer 89 | input_channel = int(input_channel * width_mult) 90 | self.features = [conv_bn(3, input_channel, 2, BatchNorm)] 91 | current_stride *= 2 92 | # building inverted residual blocks 93 | for t, c, n, s in interverted_residual_setting: 94 | if current_stride == output_stride: 95 | stride = 1 96 | dilation = rate 97 | rate *= s 98 | else: 99 | stride = s 100 | dilation = 1 101 | current_stride *= s 102 | output_channel = int(c * width_mult) 103 | for i in range(n): 104 | if i == 0: 105 | 
self.features.append(block(input_channel, output_channel, stride, dilation, t, BatchNorm)) 106 | else: 107 | self.features.append(block(input_channel, output_channel, 1, dilation, t, BatchNorm)) 108 | input_channel = output_channel 109 | self.features = nn.Sequential(*self.features) 110 | self._initialize_weights() 111 | 112 | if pretrained: 113 | self._load_pretrained_model() 114 | 115 | self.low_level_features = self.features[0:4] 116 | self.high_level_features = self.features[4:] 117 | 118 | def forward(self, x): 119 | low_level_feat = self.low_level_features(x) 120 | x = self.high_level_features(low_level_feat) 121 | return x, low_level_feat 122 | 123 | def _load_pretrained_model(self): 124 | pretrain_dict = model_zoo.load_url('http://jeff95.me/models/mobilenet_v2-6a65762b.pth') 125 | model_dict = {} 126 | state_dict = self.state_dict() 127 | for k, v in pretrain_dict.items(): 128 | if k in state_dict: 129 | model_dict[k] = v 130 | state_dict.update(model_dict) 131 | self.load_state_dict(state_dict) 132 | 133 | def _initialize_weights(self): 134 | for m in self.modules(): 135 | if isinstance(m, nn.Conv2d): 136 | # n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels 137 | # m.weight.data.normal_(0, math.sqrt(2. / n)) 138 | torch.nn.init.kaiming_normal_(m.weight) 139 | elif isinstance(m, SynchronizedBatchNorm2d): 140 | m.weight.data.fill_(1) 141 | m.bias.data.zero_() 142 | elif isinstance(m, nn.BatchNorm2d): 143 | m.weight.data.fill_(1) 144 | m.bias.data.zero_() 145 | 146 | if __name__ == "__main__": 147 | input = torch.rand(1, 3, 512, 512) 148 | model = MobileNetV2(output_stride=16, BatchNorm=nn.BatchNorm2d) 149 | output, low_level_feat = model(input) 150 | print(output.size()) 151 | print(low_level_feat.size()) 152 | -------------------------------------------------------------------------------- /DSP/modeling/backbone/resnet.py: -------------------------------------------------------------------------------- 1 | import math 2 | import torch.nn as nn 3 | import torch.utils.model_zoo as model_zoo 4 | from modeling.sync_batchnorm.batchnorm import SynchronizedBatchNorm2d 5 | 6 | 7 | 8 | 9 | __all__ = ['ResNet', 'resnet18', 'resnet34', 'resnet50', 'resnet101', 10 | 'resnet152', 'resnext50_32x4d', 'resnext101_32x8d', 11 | 'wide_resnet50_2', 'wide_resnet101_2'] 12 | 13 | 14 | model_urls = { 15 | 'resnet18': 'https://download.pytorch.org/models/resnet18-5c106cde.pth', 16 | 'resnet34': 'https://download.pytorch.org/models/resnet34-333f7ec4.pth', 17 | 'resnet50': 'https://download.pytorch.org/models/resnet50-19c8e357.pth', 18 | 'resnet101': 'https://download.pytorch.org/models/resnet101-5d3b4d8f.pth', 19 | 'resnet152': 'https://download.pytorch.org/models/resnet152-b121ed2d.pth', 20 | 'resnext50_32x4d': 'https://download.pytorch.org/models/resnext50_32x4d-7cdf4587.pth', 21 | 'resnext101_32x8d': 'https://download.pytorch.org/models/resnext101_32x8d-8ba56ff5.pth', 22 | 'wide_resnet50_2': 'https://download.pytorch.org/models/wide_resnet50_2-95faca4d.pth', 23 | 'wide_resnet101_2': 'https://download.pytorch.org/models/wide_resnet101_2-32ee1156.pth', 24 | } 25 | 26 | class Bottleneck(nn.Module): 27 | expansion = 4 28 | 29 | def __init__(self, inplanes, planes, stride=1, dilation=1, downsample=None, BatchNorm=None): 30 | super(Bottleneck, self).__init__() 31 | 32 | self.conv1 = nn.Conv2d(inplanes, planes, kernel_size=1, bias=False) 33 | self.bn1 = BatchNorm(planes) 34 | self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=stride, 35 | dilation=dilation, padding=dilation, bias=False) 36 | 
self.bn2 = BatchNorm(planes) 37 | self.conv3 = nn.Conv2d(planes, planes * 4, kernel_size=1, bias=False) 38 | self.bn3 = BatchNorm(planes * 4) 39 | self.relu = nn.ReLU(inplace=True) 40 | self.downsample = downsample 41 | self.stride = stride 42 | self.dilation = dilation 43 | 44 | def forward(self, x): 45 | residual = x 46 | 47 | out = self.conv1(x) 48 | out = self.bn1(out) 49 | out = self.relu(out) 50 | 51 | out = self.conv2(out) 52 | out = self.bn2(out) 53 | out = self.relu(out) 54 | 55 | out = self.conv3(out) 56 | out = self.bn3(out) 57 | 58 | if self.downsample is not None: 59 | residual = self.downsample(x) 60 | 61 | out += residual 62 | out = self.relu(out) 63 | 64 | return out 65 | 66 | class ResNet(nn.Module): 67 | 68 | def __init__(self, arch, block, layers, output_stride, BatchNorm, pretrained=False): 69 | self.inplanes = 64 70 | super(ResNet, self).__init__() 71 | self.arch = arch 72 | blocks = [1, 2, 4] 73 | if output_stride == 16: 74 | strides = [1, 2, 2, 1] 75 | dilations = [1, 1, 1, 2] 76 | elif output_stride == 8: 77 | strides = [1, 2, 1, 1] 78 | dilations = [1, 1, 2, 4] 79 | else: 80 | raise NotImplementedError 81 | 82 | # Modules 83 | self.conv1 = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3, 84 | bias=False) 85 | self.bn1 = BatchNorm(64) 86 | self.relu = nn.ReLU(inplace=True) 87 | self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1) 88 | 89 | self.layer1 = self._make_layer(block, 64, layers[0], stride=strides[0], dilation=dilations[0], BatchNorm=BatchNorm) 90 | self.layer2 = self._make_layer(block, 128, layers[1], stride=strides[1], dilation=dilations[1], BatchNorm=BatchNorm) 91 | self.layer3 = self._make_layer(block, 256, layers[2], stride=strides[2], dilation=dilations[2], BatchNorm=BatchNorm) 92 | self.layer4 = self._make_MG_unit(block, 512, blocks=blocks, stride=strides[3], dilation=dilations[3], BatchNorm=BatchNorm) 93 | # self.layer4 = self._make_layer(block, 512, layers[3], stride=strides[3], dilation=dilations[3], BatchNorm=BatchNorm) 94 | self._init_weight() 95 | 96 | 97 | if pretrained: 98 | self._load_pretrained_model(self.arch) 99 | 100 | def _make_layer(self, block, planes, blocks, stride=1, dilation=1, BatchNorm=None): 101 | downsample = None 102 | if stride != 1 or self.inplanes != planes * block.expansion: 103 | downsample = nn.Sequential( 104 | nn.Conv2d(self.inplanes, planes * block.expansion, 105 | kernel_size=1, stride=stride, bias=False), 106 | BatchNorm(planes * block.expansion), 107 | ) 108 | 109 | layers = [] 110 | layers.append(block(self.inplanes, planes, stride, dilation, downsample, BatchNorm)) 111 | self.inplanes = planes * block.expansion 112 | for i in range(1, blocks): 113 | layers.append(block(self.inplanes, planes, dilation=dilation, BatchNorm=BatchNorm)) 114 | 115 | return nn.Sequential(*layers) 116 | 117 | def _make_MG_unit(self, block, planes, blocks, stride=1, dilation=1, BatchNorm=None): 118 | downsample = None 119 | if stride != 1 or self.inplanes != planes * block.expansion: 120 | downsample = nn.Sequential( 121 | nn.Conv2d(self.inplanes, planes * block.expansion, 122 | kernel_size=1, stride=stride, bias=False), 123 | BatchNorm(planes * block.expansion), 124 | ) 125 | 126 | layers = [] 127 | layers.append(block(self.inplanes, planes, stride, dilation=blocks[0]*dilation, 128 | downsample=downsample, BatchNorm=BatchNorm)) 129 | self.inplanes = planes * block.expansion 130 | for i in range(1, len(blocks)): 131 | layers.append(block(self.inplanes, planes, stride=1, 132 | dilation=blocks[i]*dilation, 
BatchNorm=BatchNorm)) 133 | 134 | return nn.Sequential(*layers) 135 | 136 | def forward(self, input): 137 | x = self.conv1(input) 138 | x = self.bn1(x) 139 | x = self.relu(x) 140 | x = self.maxpool(x) 141 | 142 | x = self.layer1(x) 143 | low_level_feat = x 144 | x = self.layer2(x) 145 | x = self.layer3(x) 146 | x = self.layer4(x) 147 | return x, low_level_feat 148 | 149 | def _init_weight(self): 150 | for m in self.modules(): 151 | if isinstance(m, nn.Conv2d): 152 | n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels 153 | m.weight.data.normal_(0, math.sqrt(2. / n)) 154 | 155 | elif isinstance(m, SynchronizedBatchNorm2d): 156 | m.weight.data.fill_(1) 157 | m.bias.data.zero_() 158 | 159 | elif isinstance(m, nn.BatchNorm2d): 160 | m.weight.data.fill_(1) 161 | m.bias.data.zero_() 162 | 163 | 164 | def _load_pretrained_model(self, arch): 165 | 166 | pretrain_dict = model_zoo.load_url(model_urls[arch]) 167 | 168 | model_dict = {} 169 | state_dict = self.state_dict() 170 | for k, v in pretrain_dict.items(): 171 | if k in state_dict: 172 | model_dict[k] = v 173 | state_dict.update(model_dict) 174 | self.load_state_dict(state_dict) 175 | 176 | 177 | 178 | def ResNet101(output_stride, BatchNorm, pretrained=True): 179 | """Constructs a ResNet-101 model. 180 | Args: 181 | pretrained (bool): If True, returns a model pre-trained on ImageNet 182 | """ 183 | model = ResNet('resnet101', Bottleneck, [3, 4, 23, 3], output_stride, BatchNorm, pretrained=pretrained) 184 | return model 185 | 186 | def ResNet50( output_stride, BatchNorm, pretrained=False): 187 | r"""ResNet-50 model from 188 | `"Deep Residual Learning for Image Recognition" `_ 189 | Args: 190 | pretrained (bool): If True, returns a model pre-trained on ImageNet 191 | progress (bool): If True, displays a progress bar of the download to stderr 192 | quantize (bool): If True, return a quantized version of the model 193 | """ 194 | model = ResNet('resnet50', Bottleneck, [3, 4, 6, 3], output_stride, BatchNorm, pretrained=pretrained) 195 | return model 196 | 197 | 198 | 199 | def wide_ResNet50_2( output_stride, BatchNorm, pretrained=True): 200 | r"""Wide ResNet-50-2 model from 201 | `"Wide Residual Networks" `_ 202 | The model is the same as ResNet except for the bottleneck number of channels 203 | which is twice larger in every block. The number of channels in outer 1x1 204 | convolutions is the same, e.g. last block in ResNet-50 has 2048-512-2048 205 | channels, and in Wide ResNet-50-2 has 2048-1024-2048. 
206 | Args: 207 | pretrained (bool): If True, returns a model pre-trained on ImageNet 208 | progress (bool): If True, displays a progress bar of the download to stderr 209 | """ 210 | kwargs['width_per_group'] = 64 * 2 211 | return ResNet('wide_resnet50_2', Bottleneck, [3, 4, 6, 3], 212 | output_stride, BatchNorm, pretrained=pretrained) 213 | 214 | 215 | 216 | if __name__ == "__main__": 217 | import torch 218 | model = ResNet50(BatchNorm=nn.BatchNorm2d, pretrained=True, output_stride=8) #nn.BatchNorm2d, or Syncbatch norm 219 | input = torch.rand(1, 3, 512, 512) 220 | output, low_level_feat = model(input) 221 | print(output.size()) 222 | print(low_level_feat.size()) -------------------------------------------------------------------------------- /DSP/modeling/backbone/xception.py: -------------------------------------------------------------------------------- 1 | import math 2 | import torch 3 | import torch.nn as nn 4 | import torch.nn.functional as F 5 | import torch.utils.model_zoo as model_zoo 6 | from modeling.sync_batchnorm.batchnorm import SynchronizedBatchNorm2d 7 | 8 | def fixed_padding(inputs, kernel_size, dilation): 9 | kernel_size_effective = kernel_size + (kernel_size - 1) * (dilation - 1) 10 | pad_total = kernel_size_effective - 1 11 | pad_beg = pad_total // 2 12 | pad_end = pad_total - pad_beg 13 | padded_inputs = F.pad(inputs, (pad_beg, pad_end, pad_beg, pad_end)) 14 | return padded_inputs 15 | 16 | 17 | class SeparableConv2d(nn.Module): 18 | def __init__(self, inplanes, planes, kernel_size=3, stride=1, dilation=1, bias=False, BatchNorm=None): 19 | super(SeparableConv2d, self).__init__() 20 | 21 | self.conv1 = nn.Conv2d(inplanes, inplanes, kernel_size, stride, 0, dilation, 22 | groups=inplanes, bias=bias) 23 | self.bn = BatchNorm(inplanes) 24 | self.pointwise = nn.Conv2d(inplanes, planes, 1, 1, 0, 1, 1, bias=bias) 25 | 26 | def forward(self, x): 27 | x = fixed_padding(x, self.conv1.kernel_size[0], dilation=self.conv1.dilation[0]) 28 | x = self.conv1(x) 29 | x = self.bn(x) 30 | x = self.pointwise(x) 31 | return x 32 | 33 | 34 | class Block(nn.Module): 35 | def __init__(self, inplanes, planes, reps, stride=1, dilation=1, BatchNorm=None, 36 | start_with_relu=True, grow_first=True, is_last=False): 37 | super(Block, self).__init__() 38 | 39 | if planes != inplanes or stride != 1: 40 | self.skip = nn.Conv2d(inplanes, planes, 1, stride=stride, bias=False) 41 | self.skipbn = BatchNorm(planes) 42 | else: 43 | self.skip = None 44 | 45 | self.relu = nn.ReLU(inplace=True) 46 | rep = [] 47 | 48 | filters = inplanes 49 | if grow_first: 50 | rep.append(self.relu) 51 | rep.append(SeparableConv2d(inplanes, planes, 3, 1, dilation, BatchNorm=BatchNorm)) 52 | rep.append(BatchNorm(planes)) 53 | filters = planes 54 | 55 | for i in range(reps - 1): 56 | rep.append(self.relu) 57 | rep.append(SeparableConv2d(filters, filters, 3, 1, dilation, BatchNorm=BatchNorm)) 58 | rep.append(BatchNorm(filters)) 59 | 60 | if not grow_first: 61 | rep.append(self.relu) 62 | rep.append(SeparableConv2d(inplanes, planes, 3, 1, dilation, BatchNorm=BatchNorm)) 63 | rep.append(BatchNorm(planes)) 64 | 65 | if stride != 1: 66 | rep.append(self.relu) 67 | rep.append(SeparableConv2d(planes, planes, 3, 2, BatchNorm=BatchNorm)) 68 | rep.append(BatchNorm(planes)) 69 | 70 | if stride == 1 and is_last: 71 | rep.append(self.relu) 72 | rep.append(SeparableConv2d(planes, planes, 3, 1, BatchNorm=BatchNorm)) 73 | rep.append(BatchNorm(planes)) 74 | 75 | if not start_with_relu: 76 | rep = rep[1:] 77 | 78 | self.rep = 
nn.Sequential(*rep) 79 | 80 | def forward(self, inp): 81 | x = self.rep(inp) 82 | 83 | if self.skip is not None: 84 | skip = self.skip(inp) 85 | skip = self.skipbn(skip) 86 | else: 87 | skip = inp 88 | 89 | x = x + skip 90 | 91 | return x 92 | 93 | 94 | class AlignedXception(nn.Module): 95 | """ 96 | Modified Alighed Xception 97 | """ 98 | def __init__(self, output_stride, BatchNorm, 99 | pretrained=True): 100 | super(AlignedXception, self).__init__() 101 | 102 | if output_stride == 16: 103 | entry_block3_stride = 2 104 | middle_block_dilation = 1 105 | exit_block_dilations = (1, 2) 106 | elif output_stride == 8: 107 | entry_block3_stride = 1 108 | middle_block_dilation = 2 109 | exit_block_dilations = (2, 4) 110 | else: 111 | raise NotImplementedError 112 | 113 | 114 | # Entry flow 115 | self.conv1 = nn.Conv2d(3, 32, 3, stride=2, padding=1, bias=False) 116 | self.bn1 = BatchNorm(32) 117 | self.relu = nn.ReLU(inplace=True) 118 | 119 | self.conv2 = nn.Conv2d(32, 64, 3, stride=1, padding=1, bias=False) 120 | self.bn2 = BatchNorm(64) 121 | 122 | self.block1 = Block(64, 128, reps=2, stride=2, BatchNorm=BatchNorm, start_with_relu=False) 123 | self.block2 = Block(128, 256, reps=2, stride=2, BatchNorm=BatchNorm, start_with_relu=False, 124 | grow_first=True) 125 | self.block3 = Block(256, 728, reps=2, stride=entry_block3_stride, BatchNorm=BatchNorm, 126 | start_with_relu=True, grow_first=True, is_last=True) 127 | 128 | # Middle flow 129 | self.block4 = Block(728, 728, reps=3, stride=1, dilation=middle_block_dilation, 130 | BatchNorm=BatchNorm, start_with_relu=True, grow_first=True) 131 | self.block5 = Block(728, 728, reps=3, stride=1, dilation=middle_block_dilation, 132 | BatchNorm=BatchNorm, start_with_relu=True, grow_first=True) 133 | self.block6 = Block(728, 728, reps=3, stride=1, dilation=middle_block_dilation, 134 | BatchNorm=BatchNorm, start_with_relu=True, grow_first=True) 135 | self.block7 = Block(728, 728, reps=3, stride=1, dilation=middle_block_dilation, 136 | BatchNorm=BatchNorm, start_with_relu=True, grow_first=True) 137 | self.block8 = Block(728, 728, reps=3, stride=1, dilation=middle_block_dilation, 138 | BatchNorm=BatchNorm, start_with_relu=True, grow_first=True) 139 | self.block9 = Block(728, 728, reps=3, stride=1, dilation=middle_block_dilation, 140 | BatchNorm=BatchNorm, start_with_relu=True, grow_first=True) 141 | self.block10 = Block(728, 728, reps=3, stride=1, dilation=middle_block_dilation, 142 | BatchNorm=BatchNorm, start_with_relu=True, grow_first=True) 143 | self.block11 = Block(728, 728, reps=3, stride=1, dilation=middle_block_dilation, 144 | BatchNorm=BatchNorm, start_with_relu=True, grow_first=True) 145 | self.block12 = Block(728, 728, reps=3, stride=1, dilation=middle_block_dilation, 146 | BatchNorm=BatchNorm, start_with_relu=True, grow_first=True) 147 | self.block13 = Block(728, 728, reps=3, stride=1, dilation=middle_block_dilation, 148 | BatchNorm=BatchNorm, start_with_relu=True, grow_first=True) 149 | self.block14 = Block(728, 728, reps=3, stride=1, dilation=middle_block_dilation, 150 | BatchNorm=BatchNorm, start_with_relu=True, grow_first=True) 151 | self.block15 = Block(728, 728, reps=3, stride=1, dilation=middle_block_dilation, 152 | BatchNorm=BatchNorm, start_with_relu=True, grow_first=True) 153 | self.block16 = Block(728, 728, reps=3, stride=1, dilation=middle_block_dilation, 154 | BatchNorm=BatchNorm, start_with_relu=True, grow_first=True) 155 | self.block17 = Block(728, 728, reps=3, stride=1, dilation=middle_block_dilation, 156 | BatchNorm=BatchNorm, 
start_with_relu=True, grow_first=True) 157 | self.block18 = Block(728, 728, reps=3, stride=1, dilation=middle_block_dilation, 158 | BatchNorm=BatchNorm, start_with_relu=True, grow_first=True) 159 | self.block19 = Block(728, 728, reps=3, stride=1, dilation=middle_block_dilation, 160 | BatchNorm=BatchNorm, start_with_relu=True, grow_first=True) 161 | 162 | # Exit flow 163 | self.block20 = Block(728, 1024, reps=2, stride=1, dilation=exit_block_dilations[0], 164 | BatchNorm=BatchNorm, start_with_relu=True, grow_first=False, is_last=True) 165 | 166 | self.conv3 = SeparableConv2d(1024, 1536, 3, stride=1, dilation=exit_block_dilations[1], BatchNorm=BatchNorm) 167 | self.bn3 = BatchNorm(1536) 168 | 169 | self.conv4 = SeparableConv2d(1536, 1536, 3, stride=1, dilation=exit_block_dilations[1], BatchNorm=BatchNorm) 170 | self.bn4 = BatchNorm(1536) 171 | 172 | self.conv5 = SeparableConv2d(1536, 2048, 3, stride=1, dilation=exit_block_dilations[1], BatchNorm=BatchNorm) 173 | self.bn5 = BatchNorm(2048) 174 | 175 | # Init weights 176 | self._init_weight() 177 | 178 | # Load pretrained model 179 | if pretrained: 180 | self._load_pretrained_model() 181 | 182 | def forward(self, x): 183 | # Entry flow 184 | x = self.conv1(x) 185 | x = self.bn1(x) 186 | x = self.relu(x) 187 | 188 | x = self.conv2(x) 189 | x = self.bn2(x) 190 | x = self.relu(x) 191 | 192 | x = self.block1(x) 193 | # add relu here 194 | x = self.relu(x) 195 | low_level_feat = x 196 | x = self.block2(x) 197 | x = self.block3(x) 198 | 199 | # Middle flow 200 | x = self.block4(x) 201 | x = self.block5(x) 202 | x = self.block6(x) 203 | x = self.block7(x) 204 | x = self.block8(x) 205 | x = self.block9(x) 206 | x = self.block10(x) 207 | x = self.block11(x) 208 | x = self.block12(x) 209 | x = self.block13(x) 210 | x = self.block14(x) 211 | x = self.block15(x) 212 | x = self.block16(x) 213 | x = self.block17(x) 214 | x = self.block18(x) 215 | x = self.block19(x) 216 | 217 | # Exit flow 218 | x = self.block20(x) 219 | x = self.relu(x) 220 | x = self.conv3(x) 221 | x = self.bn3(x) 222 | x = self.relu(x) 223 | 224 | x = self.conv4(x) 225 | x = self.bn4(x) 226 | x = self.relu(x) 227 | 228 | x = self.conv5(x) 229 | x = self.bn5(x) 230 | x = self.relu(x) 231 | 232 | return x, low_level_feat 233 | 234 | def _init_weight(self): 235 | for m in self.modules(): 236 | if isinstance(m, nn.Conv2d): 237 | n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels 238 | m.weight.data.normal_(0, math.sqrt(2. 
/ n)) 239 | elif isinstance(m, SynchronizedBatchNorm2d): 240 | m.weight.data.fill_(1) 241 | m.bias.data.zero_() 242 | elif isinstance(m, nn.BatchNorm2d): 243 | m.weight.data.fill_(1) 244 | m.bias.data.zero_() 245 | 246 | 247 | def _load_pretrained_model(self): 248 | pretrain_dict = model_zoo.load_url('http://data.lip6.fr/cadene/pretrainedmodels/xception-b5690688.pth') 249 | model_dict = {} 250 | state_dict = self.state_dict() 251 | 252 | for k, v in pretrain_dict.items(): 253 | if k in state_dict: 254 | if 'pointwise' in k: 255 | v = v.unsqueeze(-1).unsqueeze(-1) 256 | if k.startswith('block11'): 257 | model_dict[k] = v 258 | model_dict[k.replace('block11', 'block12')] = v 259 | model_dict[k.replace('block11', 'block13')] = v 260 | model_dict[k.replace('block11', 'block14')] = v 261 | model_dict[k.replace('block11', 'block15')] = v 262 | model_dict[k.replace('block11', 'block16')] = v 263 | model_dict[k.replace('block11', 'block17')] = v 264 | model_dict[k.replace('block11', 'block18')] = v 265 | model_dict[k.replace('block11', 'block19')] = v 266 | elif k.startswith('block12'): 267 | model_dict[k.replace('block12', 'block20')] = v 268 | elif k.startswith('bn3'): 269 | model_dict[k] = v 270 | model_dict[k.replace('bn3', 'bn4')] = v 271 | elif k.startswith('conv4'): 272 | model_dict[k.replace('conv4', 'conv5')] = v 273 | elif k.startswith('bn4'): 274 | model_dict[k.replace('bn4', 'bn5')] = v 275 | else: 276 | model_dict[k] = v 277 | state_dict.update(model_dict) 278 | self.load_state_dict(state_dict) 279 | 280 | 281 | 282 | if __name__ == "__main__": 283 | import torch 284 | model = AlignedXception(BatchNorm=nn.BatchNorm2d, pretrained=True, output_stride=16) 285 | input = torch.rand(1, 3, 512, 512) 286 | output, low_level_feat = model(input) 287 | print(output.size()) 288 | print(low_level_feat.size()) 289 | -------------------------------------------------------------------------------- /DSP/modeling/sync_batchnorm/__init__.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # File : __init__.py 3 | # Author : Jiayuan Mao 4 | # Email : maojiayuan@gmail.com 5 | # Date : 27/01/2018 6 | # 7 | # This file is part of Synchronized-BatchNorm-PyTorch. 8 | # https://github.com/vacancy/Synchronized-BatchNorm-PyTorch 9 | # Distributed under MIT License. 
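#
# Usage note (a hedged sketch added for clarity; not part of the original package):
# the synchronized layers only take effect when the model is replicated with the
# callback-aware wrapper exported below, so that per-replica batch statistics are
# reduced across GPUs. A minimal example, assuming the backbone builder accepts the
# BatchNorm class (as modeling/backbone/resnet.py does):
#
#   from modeling.sync_batchnorm import SynchronizedBatchNorm2d, DataParallelWithCallback
#   from modeling.backbone.resnet import ResNet50
#
#   model = ResNet50(output_stride=16, BatchNorm=SynchronizedBatchNorm2d, pretrained=False)
#   model = DataParallelWithCallback(model.cuda(), device_ids=[0, 1])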
10 | 11 | from .batchnorm import SynchronizedBatchNorm1d, SynchronizedBatchNorm2d, SynchronizedBatchNorm3d 12 | from .replicate import DataParallelWithCallback, patch_replication_callback -------------------------------------------------------------------------------- /DSP/modeling/sync_batchnorm/__pycache__/__init__.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/NeurAI-Lab/DSP/45027a3702696dafd7018802619dde17c6da1ca8/DSP/modeling/sync_batchnorm/__pycache__/__init__.cpython-37.pyc -------------------------------------------------------------------------------- /DSP/modeling/sync_batchnorm/__pycache__/batchnorm.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/NeurAI-Lab/DSP/45027a3702696dafd7018802619dde17c6da1ca8/DSP/modeling/sync_batchnorm/__pycache__/batchnorm.cpython-37.pyc -------------------------------------------------------------------------------- /DSP/modeling/sync_batchnorm/__pycache__/comm.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/NeurAI-Lab/DSP/45027a3702696dafd7018802619dde17c6da1ca8/DSP/modeling/sync_batchnorm/__pycache__/comm.cpython-37.pyc -------------------------------------------------------------------------------- /DSP/modeling/sync_batchnorm/__pycache__/replicate.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/NeurAI-Lab/DSP/45027a3702696dafd7018802619dde17c6da1ca8/DSP/modeling/sync_batchnorm/__pycache__/replicate.cpython-37.pyc -------------------------------------------------------------------------------- /DSP/modeling/sync_batchnorm/comm.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # File : comm.py 3 | # Author : Jiayuan Mao 4 | # Email : maojiayuan@gmail.com 5 | # Date : 27/01/2018 6 | # 7 | # This file is part of Synchronized-BatchNorm-PyTorch. 8 | # https://github.com/vacancy/Synchronized-BatchNorm-PyTorch 9 | # Distributed under MIT License. 10 | 11 | import queue 12 | import collections 13 | import threading 14 | 15 | __all__ = ['FutureResult', 'SlavePipe', 'SyncMaster'] 16 | 17 | 18 | class FutureResult(object): 19 | """A thread-safe future implementation. Used only as one-to-one pipe.""" 20 | 21 | def __init__(self): 22 | self._result = None 23 | self._lock = threading.Lock() 24 | self._cond = threading.Condition(self._lock) 25 | 26 | def put(self, result): 27 | with self._lock: 28 | assert self._result is None, 'Previous result has\'t been fetched.' 29 | self._result = result 30 | self._cond.notify() 31 | 32 | def get(self): 33 | with self._lock: 34 | if self._result is None: 35 | self._cond.wait() 36 | 37 | res = self._result 38 | self._result = None 39 | return res 40 | 41 | 42 | _MasterRegistry = collections.namedtuple('MasterRegistry', ['result']) 43 | _SlavePipeBase = collections.namedtuple('_SlavePipeBase', ['identifier', 'queue', 'result']) 44 | 45 | 46 | class SlavePipe(_SlavePipeBase): 47 | """Pipe for master-slave communication.""" 48 | 49 | def run_slave(self, msg): 50 | self.queue.put((self.identifier, msg)) 51 | ret = self.result.get() 52 | self.queue.put(True) 53 | return ret 54 | 55 | 56 | class SyncMaster(object): 57 | """An abstract `SyncMaster` object. 
58 | - During the replication, as the data parallel will trigger an callback of each module, all slave devices should 59 | call `register(id)` and obtain an `SlavePipe` to communicate with the master. 60 | - During the forward pass, master device invokes `run_master`, all messages from slave devices will be collected, 61 | and passed to a registered callback. 62 | - After receiving the messages, the master device should gather the information and determine to message passed 63 | back to each slave devices. 64 | """ 65 | 66 | def __init__(self, master_callback): 67 | """ 68 | Args: 69 | master_callback: a callback to be invoked after having collected messages from slave devices. 70 | """ 71 | self._master_callback = master_callback 72 | self._queue = queue.Queue() 73 | self._registry = collections.OrderedDict() 74 | self._activated = False 75 | 76 | def __getstate__(self): 77 | return {'master_callback': self._master_callback} 78 | 79 | def __setstate__(self, state): 80 | self.__init__(state['master_callback']) 81 | 82 | def register_slave(self, identifier): 83 | """ 84 | Register an slave device. 85 | Args: 86 | identifier: an identifier, usually is the device id. 87 | Returns: a `SlavePipe` object which can be used to communicate with the master device. 88 | """ 89 | if self._activated: 90 | assert self._queue.empty(), 'Queue is not clean before next initialization.' 91 | self._activated = False 92 | self._registry.clear() 93 | future = FutureResult() 94 | self._registry[identifier] = _MasterRegistry(future) 95 | return SlavePipe(identifier, self._queue, future) 96 | 97 | def run_master(self, master_msg): 98 | """ 99 | Main entry for the master device in each forward pass. 100 | The messages were first collected from each devices (including the master device), and then 101 | an callback will be invoked to compute the message to be sent back to each devices 102 | (including the master device). 103 | Args: 104 | master_msg: the message that the master want to send to itself. This will be placed as the first 105 | message when calling `master_callback`. For detailed usage, see `_SynchronizedBatchNorm` for an example. 106 | Returns: the message to be sent back to the master device. 107 | """ 108 | self._activated = True 109 | 110 | intermediates = [(0, master_msg)] 111 | for i in range(self.nr_slaves): 112 | intermediates.append(self._queue.get()) 113 | 114 | results = self._master_callback(intermediates) 115 | assert results[0][0] == 0, 'The first result should belongs to the master.' 116 | 117 | for i, res in results: 118 | if i == 0: 119 | continue 120 | self._registry[i].result.put(res) 121 | 122 | for i in range(self.nr_slaves): 123 | assert self._queue.get() is True 124 | 125 | return results[0][1] 126 | 127 | @property 128 | def nr_slaves(self): 129 | return len(self._registry) 130 | -------------------------------------------------------------------------------- /DSP/modeling/sync_batchnorm/replicate.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # File : replicate.py 3 | # Author : Jiayuan Mao 4 | # Email : maojiayuan@gmail.com 5 | # Date : 27/01/2018 6 | # 7 | # This file is part of Synchronized-BatchNorm-PyTorch. 8 | # https://github.com/vacancy/Synchronized-BatchNorm-PyTorch 9 | # Distributed under MIT License. 
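# Replication-callback plumbing: every submodule that defines
# `__data_parallel_replicate__` receives a shared per-submodule context and its
# copy id right after `replicate()` runs; the synchronized BatchNorm layers use
# this hook to wire up their master/slave communication across GPU copies.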
10 | 11 | import functools 12 | 13 | from torch.nn.parallel.data_parallel import DataParallel 14 | 15 | __all__ = [ 16 | 'CallbackContext', 17 | 'execute_replication_callbacks', 18 | 'DataParallelWithCallback', 19 | 'patch_replication_callback' 20 | ] 21 | 22 | 23 | class CallbackContext(object): 24 | pass 25 | 26 | 27 | def execute_replication_callbacks(modules): 28 | """ 29 | Execute an replication callback `__data_parallel_replicate__` on each module created by original replication. 30 | The callback will be invoked with arguments `__data_parallel_replicate__(ctx, copy_id)` 31 | Note that, as all modules are isomorphism, we assign each sub-module with a context 32 | (shared among multiple copies of this module on different devices). 33 | Through this context, different copies can share some information. 34 | We guarantee that the callback on the master copy (the first copy) will be called ahead of calling the callback 35 | of any slave copies. 36 | """ 37 | master_copy = modules[0] 38 | nr_modules = len(list(master_copy.modules())) 39 | ctxs = [CallbackContext() for _ in range(nr_modules)] 40 | 41 | for i, module in enumerate(modules): 42 | for j, m in enumerate(module.modules()): 43 | if hasattr(m, '__data_parallel_replicate__'): 44 | m.__data_parallel_replicate__(ctxs[j], i) 45 | 46 | 47 | class DataParallelWithCallback(DataParallel): 48 | """ 49 | Data Parallel with a replication callback. 50 | An replication callback `__data_parallel_replicate__` of each module will be invoked after being created by 51 | original `replicate` function. 52 | The callback will be invoked with arguments `__data_parallel_replicate__(ctx, copy_id)` 53 | Examples: 54 | > sync_bn = SynchronizedBatchNorm1d(10, eps=1e-5, affine=False) 55 | > sync_bn = DataParallelWithCallback(sync_bn, device_ids=[0, 1]) 56 | # sync_bn.__data_parallel_replicate__ will be invoked. 57 | """ 58 | 59 | def replicate(self, module, device_ids): 60 | modules = super(DataParallelWithCallback, self).replicate(module, device_ids) 61 | execute_replication_callbacks(modules) 62 | return modules 63 | 64 | 65 | def patch_replication_callback(data_parallel): 66 | """ 67 | Monkey-patch an existing `DataParallel` object. Add the replication callback. 68 | Useful when you have customized `DataParallel` implementation. 69 | Examples: 70 | > sync_bn = SynchronizedBatchNorm1d(10, eps=1e-5, affine=False) 71 | > sync_bn = DataParallel(sync_bn, device_ids=[0, 1]) 72 | > patch_replication_callback(sync_bn) 73 | # this is equivalent to 74 | > sync_bn = SynchronizedBatchNorm1d(10, eps=1e-5, affine=False) 75 | > sync_bn = DataParallelWithCallback(sync_bn, device_ids=[0, 1]) 76 | """ 77 | 78 | assert isinstance(data_parallel, DataParallel) 79 | 80 | old_replicate = data_parallel.replicate 81 | 82 | @functools.wraps(old_replicate) 83 | def new_replicate(module, device_ids): 84 | modules = old_replicate(module, device_ids) 85 | execute_replication_callbacks(modules) 86 | return modules 87 | 88 | data_parallel.replicate = new_replicate -------------------------------------------------------------------------------- /DSP/modeling/sync_batchnorm/unittest.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # File : unittest.py 3 | # Author : Jiayuan Mao 4 | # Email : maojiayuan@gmail.com 5 | # Date : 27/01/2018 6 | # 7 | # This file is part of Synchronized-BatchNorm-PyTorch. 8 | # https://github.com/vacancy/Synchronized-BatchNorm-PyTorch 9 | # Distributed under MIT License. 
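# Small test helper: `as_numpy` unwraps tensors/Variables to numpy arrays and
# `TorchTestCase.assertTensorClose` asserts element-wise closeness with a
# readable diff message.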
10 | 11 | import unittest 12 | 13 | import numpy as np 14 | from torch.autograd import Variable 15 | 16 | 17 | def as_numpy(v): 18 | if isinstance(v, Variable): 19 | v = v.data 20 | return v.cpu().numpy() 21 | 22 | 23 | class TorchTestCase(unittest.TestCase): 24 | def assertTensorClose(self, a, b, atol=1e-3, rtol=1e-3): 25 | npa, npb = as_numpy(a), as_numpy(b) 26 | self.assertTrue( 27 | np.allclose(npa, npb, atol=atol), 28 | 'Tensor close check failed\n{}\n{}\nadiff={}, rdiff={}'.format(a, b, np.abs(npa - npb).max(), np.abs((npa - npb) / np.fmax(npa, 1e-5)).max()) 29 | ) 30 | -------------------------------------------------------------------------------- /DSP/models/__pycache__/simclr.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/NeurAI-Lab/DSP/45027a3702696dafd7018802619dde17c6da1ca8/DSP/models/__pycache__/simclr.cpython-37.pyc -------------------------------------------------------------------------------- /DSP/models/simclr.py: -------------------------------------------------------------------------------- 1 | import torch.nn as nn 2 | import torchvision 3 | import torch 4 | import torch.nn.functional as F 5 | from copy import deepcopy 6 | from modeling.backbone.resnet import ResNet50 7 | 8 | 9 | 10 | 11 | class SimCLR(nn.Module): 12 | def __init__(self, args): 13 | super(SimCLR, self).__init__() 14 | self.m_backbone = args.m_backbone 15 | # self.dense_head = args.dense_head 16 | self.m = args.m_update 17 | self.encoder_type = args.encoder 18 | self.dense_cl = args.dense_cl 19 | self.f = get_encoder(args.backbone, args.pre_train, args.output_stride, args.encoder) 20 | self.dense_neck = DenseCLNeck(in_channels=2048, hid_channels=512, out_channels=1, num_grid=None) 21 | 22 | self.pool = nn.AdaptiveAvgPool2d(1) 23 | 24 | 25 | # projection head 26 | self.g = nn.Sequential( 27 | nn.Linear(2048, args.hidden_layer, bias=False), 28 | nn.BatchNorm1d(args.hidden_layer), 29 | nn.ReLU(inplace=True), 30 | nn.Linear(args.hidden_layer, args.n_proj, bias=True) 31 | ) 32 | 33 | 34 | # Momentum Encoder 35 | if args.m_backbone: 36 | self.fm = deepcopy(self.f) 37 | self.gm = deepcopy(self.g) 38 | self.dense_m= deepcopy(self.dense_neck) 39 | for param in self.fm.parameters(): 40 | param.requires_grad = False 41 | for param in self.gm.parameters(): 42 | param.requires_grad = False 43 | for param in self.gm.parameters(): 44 | param.requires_grad = False 45 | 46 | def forward(self, x, y=None): 47 | x, _ = self.f(x) 48 | if self.dense_cl: 49 | out_x = self.dense_neck(x) 50 | else: 51 | feat_x = self.pool(x) 52 | feat_x = torch.flatten(feat_x, start_dim=1) 53 | out_x = self.g(feat_x) 54 | if y is not None: 55 | if self.m_backbone: 56 | with torch.no_grad(): # no gradient to keys 57 | self._momentum_update() 58 | y, _ = self.fm(y) 59 | if self.dense_cl: 60 | out_y = self.dense_neck(y) 61 | else: 62 | feat_y = self.pool(y) 63 | feat_y = torch.flatten(feat_y, start_dim=1) 64 | out_y = self.gm(feat_y) 65 | else: 66 | y, _ = self.f(y) 67 | if self.dense_cl: 68 | out_y = self.dense_neck(y) 69 | else: 70 | feat_y = self.pool(y) 71 | feat_y = torch.flatten(feat_y, start_dim=1) 72 | out_y = self.g(feat_y) 73 | 74 | return x, y, out_x, out_y 75 | else: 76 | return F.normalize(feat_x, dim=-1), F.normalize(out_x, dim=-1) 77 | 78 | @torch.no_grad() 79 | def _momentum_update(self): 80 | """ 81 | Momentum update of the key encoder 82 | """ 83 | for param_f, param_fm in zip(self.f.parameters(), self.fm.parameters()): 84 | param_fm.data = 
param_fm.data * self.m + param_f.data * (1. - self.m) 85 | for param_g, param_gm in zip(self.f.parameters(), self.fm.parameters()): 86 | param_gm.data = param_gm.data * self.m + param_g.data * (1. - self.m) 87 | 88 | 89 | class LinearEvaluation(nn.Module): 90 | """ 91 | Linear Evaluation model 92 | """ 93 | 94 | def __init__(self, n_features, n_classes): 95 | super(LinearEvaluation, self).__init__() 96 | self.model = nn.Linear(n_features, n_classes) 97 | 98 | def forward(self, x1, x2): 99 | df = torch.abs(x1 - x2) 100 | return self.model(df) 101 | 102 | 103 | class Identity(nn.Module): 104 | def __init__(self): 105 | super(Identity, self).__init__() 106 | 107 | def forward(self, x): 108 | return x 109 | 110 | 111 | def get_encoder(encoder, pre_train, output_stride, encoder_name): 112 | """ 113 | Get Resnet backbone 114 | """ 115 | 116 | class View(nn.Module): 117 | def __init__(self, shape=2048): 118 | super().__init__() 119 | self.shape = shape 120 | 121 | def forward(self, input): 122 | ''' 123 | Reshapes the input according to the shape saved in the view data structure. 124 | ''' 125 | batch_size = input.size(0) 126 | shape = (batch_size, self.shape) 127 | out = input.view(shape) 128 | return out 129 | 130 | def CMU_resnet50(): 131 | 132 | if encoder_name=='resnet': 133 | resnet = ResNet50(BatchNorm=nn.BatchNorm2d, pretrained=pre_train, output_stride=output_stride) 134 | return resnet 135 | else: 136 | vgg16 = deeplab_V2() 137 | return vgg16 138 | 139 | 140 | return { 141 | 'resnet18': torchvision.models.resnet18(pretrained=False), 142 | 'resnet50': CMU_resnet50() 143 | }[encoder] 144 | 145 | class DenseCLNeck(nn.Module): 146 | '''The non-linear neck in DenseCL. 147 | Single and dense in parallel: fc-relu-fc, conv-relu-conv 148 | ''' 149 | 150 | def __init__(self, 151 | in_channels, 152 | hid_channels, 153 | out_channels, 154 | num_grid=None): 155 | super(DenseCLNeck, self).__init__() 156 | 157 | self.avgpool = nn.AdaptiveAvgPool2d((1, 1)) 158 | self.mlp = nn.Sequential( 159 | nn.Linear(in_channels, hid_channels), nn.ReLU(inplace=True), 160 | nn.Linear(hid_channels, out_channels)) 161 | 162 | self.with_pool = num_grid != None 163 | if self.with_pool: 164 | self.pool = nn.AdaptiveAvgPool2d((num_grid, num_grid)) 165 | self.mlp2 = nn.Sequential( 166 | nn.Conv2d(in_channels, hid_channels, 1), nn.BatchNorm2d(hid_channels), nn.ReLU(inplace=True), 167 | nn.Conv2d(hid_channels, out_channels, 1)) 168 | self.avgpool2 = nn.AdaptiveAvgPool2d((1, 1)) 169 | 170 | 171 | 172 | def forward(self, x): 173 | 174 | x = self.mlp2(x) # sxs: bxdxsxs 175 | avgpooled_x2 = self.avgpool2(x) # 1x1: bxdx1x1 176 | # x = x.view(x.size(0), x.size(1), -1) # bxdxs^2 177 | # avgpooled_x2 = avgpooled_x2.view(avgpooled_x2.size(0), -1) # bxd 178 | return x 179 | 180 | 181 | 182 | if __name__ == "__main__": 183 | import torch 184 | model = SimCLR(a) 185 | input = torch.rand(1, 3, 512, 512) 186 | output, low_level_feat = model(input) 187 | print(output.size()) 188 | print(low_level_feat.size()) 189 | -------------------------------------------------------------------------------- /DSP/mypath.py: -------------------------------------------------------------------------------- 1 | class Path(object): 2 | @staticmethod 3 | def db_root_dir(dataset): 4 | if dataset == 'pascal': 5 | return '/path/to/datasets/VOCdevkit/VOC2012/' # folder that contains VOCdevkit/. 6 | elif dataset == 'sbd': 7 | return '/path/to/datasets/benchmark_RELEASE/' # folder that contains dataset/. 
8 | elif dataset == 'cityscapes': 9 | return '/path/to/datasets/cityscapes/' # foler that contains leftImg8bit/ 10 | elif dataset == 'coco': 11 | return '/path/to/datasets/coco/' 12 | elif dataset == 'CMU': 13 | return "/data/input/datasets/VL-CMU-CD/" #folder that contains data 14 | else: 15 | print('Dataset {} not available.'.format(dataset)) 16 | raise NotImplementedError 17 | -------------------------------------------------------------------------------- /DSP/optimizers/__pycache__/lars.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/NeurAI-Lab/DSP/45027a3702696dafd7018802619dde17c6da1ca8/DSP/optimizers/__pycache__/lars.cpython-37.pyc -------------------------------------------------------------------------------- /DSP/optimizers/lars.py: -------------------------------------------------------------------------------- 1 | """ 2 | @author: https://github.com/NVIDIA/apex/ 3 | """ 4 | 5 | import torch 6 | from torch import nn 7 | from torch.nn.parameter import Parameter 8 | 9 | class LARC(object): 10 | """ 11 | :class:`LARC` is a pytorch implementation of both the scaling and clipping variants of LARC, 12 | in which the ratio between gradient and parameter magnitudes is used to calculate an adaptive 13 | local learning rate for each individual parameter. The algorithm is designed to improve 14 | convergence of large batch training. 15 | 16 | See https://arxiv.org/abs/1708.03888 for calculation of the local learning rate. 17 | 18 | In practice it modifies the gradients of parameters as a proxy for modifying the learning rate 19 | of the parameters. This design allows it to be used as a wrapper around any torch.optim Optimizer. 20 | 21 | ``` 22 | model = ... 23 | optim = torch.optim.Adam(model.parameters(), lr=...) 24 | optim = LARC(optim) 25 | ``` 26 | 27 | It can even be used in conjunction with apex.fp16_utils.FP16_optimizer. 28 | 29 | ``` 30 | model = ... 31 | optim = torch.optim.Adam(model.parameters(), lr=...) 32 | optim = LARC(optim) 33 | optim = apex.fp16_utils.FP16_Optimizer(optim) 34 | ``` 35 | 36 | Args: 37 | optimizer: Pytorch optimizer to wrap and modify learning rate for. 38 | trust_coefficient: Trust coefficient for calculating the lr. See https://arxiv.org/abs/1708.03888 39 | clip: Decides between clipping or scaling mode of LARC. If `clip=True` the learning rate is set to `min(optimizer_lr, local_lr)` for each parameter. If `clip=False` the learning rate is set to `local_lr*optimizer_lr`. 
40 | eps: epsilon kludge to help with numerical stability while calculating adaptive_lr 41 | """ 42 | 43 | def __init__(self, optimizer, trust_coefficient=0.02, clip=True, eps=1e-8): 44 | self.optim = optimizer 45 | self.trust_coefficient = trust_coefficient 46 | self.eps = eps 47 | self.clip = clip 48 | 49 | def __getstate__(self): 50 | return self.optim.__getstate__() 51 | 52 | def __setstate__(self, state): 53 | self.optim.__setstate__(state) 54 | 55 | @property 56 | def state(self): 57 | return self.optim.state 58 | 59 | def __repr__(self): 60 | return self.optim.__repr__() 61 | 62 | @property 63 | def param_groups(self): 64 | return self.optim.param_groups 65 | 66 | @param_groups.setter 67 | def param_groups(self, value): 68 | self.optim.param_groups = value 69 | 70 | def state_dict(self): 71 | return self.optim.state_dict() 72 | 73 | def load_state_dict(self, state_dict): 74 | self.optim.load_state_dict(state_dict) 75 | 76 | def zero_grad(self): 77 | self.optim.zero_grad() 78 | 79 | def add_param_group(self, param_group): 80 | self.optim.add_param_group( param_group) 81 | 82 | def step(self): 83 | with torch.no_grad(): 84 | weight_decays = [] 85 | for group in self.optim.param_groups: 86 | # absorb weight decay control from optimizer 87 | weight_decay = group['weight_decay'] if 'weight_decay' in group else 0 88 | weight_decays.append(weight_decay) 89 | group['weight_decay'] = 0 90 | for p in group['params']: 91 | if p.grad is None: 92 | continue 93 | param_norm = torch.norm(p.data) 94 | grad_norm = torch.norm(p.grad.data) 95 | 96 | if param_norm != 0 and grad_norm != 0: 97 | # calculate adaptive lr + weight decay 98 | adaptive_lr = self.trust_coefficient * (param_norm) / (grad_norm + param_norm * weight_decay + self.eps) 99 | 100 | # clip learning rate for LARC 101 | if self.clip: 102 | # calculation of adaptive_lr so that when multiplied by lr it equals `min(adaptive_lr, lr)` 103 | adaptive_lr = min(adaptive_lr/group['lr'], 1) 104 | 105 | p.grad.data += weight_decay * p.data 106 | p.grad.data *= adaptive_lr 107 | 108 | self.optim.step() 109 | # return weight decay control to optimizer 110 | for i, group in enumerate(self.optim.param_groups): 111 | group['weight_decay'] = weight_decays[i] -------------------------------------------------------------------------------- /DSP/supervised.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import numpy as np 4 | import random 5 | from datetime import datetime 6 | from torch.optim import Adam, SGD 7 | from torch.optim.lr_scheduler import MultiStepLR 8 | import sys 9 | from time import ctime 10 | import os 11 | sys.path.insert(0, '.') 12 | from util.utils import logger, summary_writer, log 13 | from config.option import Options 14 | from models.simclr import SimCLR 15 | from util.test import testloaderSimCLR, test_all_datasets 16 | from util.utils import save_checkpoint 17 | from transforms.simclr_transform import SimCLRTransform 18 | 19 | np.random.seed(10) 20 | random.seed(10) 21 | torch.manual_seed(10) 22 | 23 | import warnings 24 | warnings.filterwarnings("ignore", category=UserWarning) 25 | 26 | 27 | def train_supervised(args, loader, model, criterion, optimizer, scheduler): 28 | """ 29 | Train supervised model 30 | """ 31 | loss_epoch, accuracy_epoch = 0, 0 32 | model.train() 33 | for i, (x, y) in enumerate(loader): 34 | x = x.to(args.device) 35 | y = y.to(args.device) 36 | 37 | _, output = model(x) 38 | loss = criterion(output, y) 39 | 40 | predicted = 
output.argmax(1) 41 | acc = (predicted == y).sum().item() / y.size(0) 42 | accuracy_epoch += acc 43 | 44 | optimizer.zero_grad() 45 | loss.backward() 46 | optimizer.step() 47 | scheduler.step() 48 | 49 | loss_epoch += loss.item() 50 | if i % 50 == 0: 51 | log(f"Batch [{i}/{len(loader)}]\t Loss: {loss.item()}\t Accuracy: {acc}") 52 | return loss_epoch, accuracy_epoch 53 | 54 | 55 | if __name__ == "__main__": 56 | args = Options().parse() 57 | log_dir = os.path.join(args.save_dir, "{}_bs_{}".format(args.backbone, args.sup_batchsize), 58 | ctime().replace(' ', '_')) 59 | writer = summary_writer(args, log_dir) 60 | logger(args) 61 | args.start_time = datetime.now() 62 | log("Starting at {}".format(datetime.now())) 63 | log("arguments parsed: {}".format(args)) 64 | criterion = nn.CrossEntropyLoss() 65 | 66 | model = SimCLR(args) 67 | model.cuda(args.device) 68 | transform = SimCLRTransform(size=args.img_size).sup_transform 69 | train_loader, val_loader, test_loader = testloaderSimCLR(args, args.sup_dataset, transform, args.sup_batchsize, args.sup_data_dir) 70 | optimizer = SGD(model.parameters(), lr=args.sup_lr, momentum=0.9, weight_decay=1e-5) 71 | scheduler = MultiStepLR(optimizer, milestones=[180], gamma=0.1) 72 | for epoch in range(1, args.sup_epochs + 1): 73 | # Train 74 | loss_epoch, accuracy_epoch = train_supervised(args, train_loader, model, criterion, optimizer, scheduler) 75 | log(f"Epoch [{epoch}/{args.sup_epochs}]\t Loss: {loss_epoch / len(train_loader)}\t Accuracy: {accuracy_epoch / len(train_loader)}") 76 | 77 | # Save checkpoint after every epoch 78 | path = save_checkpoint(state_dict=model.state_dict(), args=args, epoch=epoch, filename='checkpoint.pth'.format(epoch)) 79 | if os.path.exists: 80 | state_dict = torch.load(path, map_location=args.device) 81 | model.load_state_dict(state_dict) 82 | 83 | # Save the model at specific checkpoints 84 | if epoch % 10 == 0: 85 | if args.distribute: 86 | # Save DDP model's module 87 | save_checkpoint(state_dict=model.module.state_dict(), args=args, epoch=epoch, filename='checkpoint_model_{}.pth'.format(epoch)) 88 | else: 89 | save_checkpoint(state_dict=model.state_dict(), args=args, epoch=epoch, filename='checkpoint_model_{}.pth'.format(epoch)) 90 | 91 | writer.add_scalar("CrossEntropyLoss/train", loss_epoch / len(train_loader), epoch) 92 | 93 | # Test the supervised Model 94 | test_all_datasets(args, writer, model) 95 | -------------------------------------------------------------------------------- /DSP/train.py: -------------------------------------------------------------------------------- 1 | import torch 2 | print(torch.__version__) 3 | import numpy as np 4 | import random 5 | from datetime import datetime 6 | from torch.optim import Adam, SGD 7 | from torch.optim.lr_scheduler import CosineAnnealingLR 8 | import sys 9 | sys.path.insert(0, '.') 10 | from util.utils import logger, summary_writer, log 11 | from util.train_util import trainSSL, get_criteria 12 | from config.option import Options 13 | from models.simclr import SimCLR 14 | from optimizers.lars import LARC 15 | 16 | np.random.seed(10) 17 | random.seed(10) 18 | torch.manual_seed(10) 19 | 20 | import warnings 21 | warnings.filterwarnings("ignore", category=UserWarning) 22 | 23 | 24 | if __name__ == "__main__": 25 | args = Options().parse() 26 | args.writer = summary_writer(args) 27 | logger(args) 28 | args.start_time = datetime.now() 29 | log("Starting at {}".format(datetime.now())) 30 | log("arguments parsed: {}".format(args)) 31 | criterion = get_criteria(args) 32 | if 
args.ssl_model == 'simclr': 33 | model = SimCLR(args) 34 | if args.optimizer == 'lars': 35 | optimizer_= SGD(model.parameters(), lr=args.ssl_lr) 36 | optimizer = LARC(optimizer_) 37 | if args.scheduler: 38 | scheduler = CosineAnnealingLR(optimizer_, T_max=100, eta_min=3e-4) 39 | trainSSL(args, model, optimizer, criterion, args.writer, scheduler) 40 | else: 41 | optimizer = Adam(model.parameters(), lr=args.ssl_lr, weight_decay=1e-6) 42 | trainSSL(args, model, optimizer, criterion, args.writer) 43 | 44 | -------------------------------------------------------------------------------- /DSP/transforms/__pycache__/simclr_transform.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/NeurAI-Lab/DSP/45027a3702696dafd7018802619dde17c6da1ca8/DSP/transforms/__pycache__/simclr_transform.cpython-37.pyc -------------------------------------------------------------------------------- /DSP/transforms/simclr_transform.py: -------------------------------------------------------------------------------- 1 | from PIL import ImageFilter, Image 2 | import random 3 | from torchvision.transforms import transforms 4 | import numpy as np 5 | import cv2 6 | import skimage.exposure 7 | from scipy.ndimage import gaussian_filter 8 | import pickle 9 | from config.option import Options 10 | import albumentations as A 11 | from albumentations.pytorch.transforms import ToTensorV2 12 | args = Options().parse() 13 | 14 | from util.transforms import RandomChoice 15 | 16 | class SimCLRTransform(): 17 | """ 18 | Transform defined in SimCLR 19 | https://arxiv.org/pdf/2002.05709.pdf 20 | ] 21 | """ 22 | 23 | def __init__(self, size): 24 | # Normalize val dataset CMU 25 | # mean_val: TO= [0.34966046 0.33492374 0.3141161 ] T1= [0.27263916 0.27427372 0.26884845] 26 | # std_val : T0= [0.3798822 0.37294477 0.35809073] T1= [0.26939082 0.28229916 0.28446007] 27 | self.T0_mean = (0.33701816, 0.33383232, 0.3245374) 28 | self.T0_std = (0.26748696, 0.2733889, 0.27516264) 29 | self.T1_mean = (0.3782613, 0.36675948, 0.35721096) 30 | self.T1_std = (0.26745927, 0.2732622, 0.2772976) 31 | self.size = size 32 | self.copy_paste = args.copy_paste 33 | if self.size == 512 or self.size == 256: # CMU 34 | normalize = transforms.Normalize(mean=self.T0_mean, std=self.T0_std) 35 | self.train_transform = transforms.Compose( 36 | [ #transforms.RandomResizedCrop(size=size), 37 | transforms.Resize(size=(self.size,self.size)), 38 | ]) 39 | 40 | self.copy_paste_aug = copy_paste(sigma=3, affine=False, prob=0.5) 41 | self.train_transform2 = RandomChoice([ get_color_distortion(), 42 | transforms.RandomApply([GaussianBlur([.1, 2.])], p=1) 43 | ]) 44 | 45 | self.train_transform3 = transforms.Compose([ 46 | transforms.ToTensor(), 47 | # transforms.RandomErasing(p=0.4, scale=(0.09, 0.25), ratio=(0.3, 3.3)) 48 | normalize, 49 | # transforms.RandomErasing(p=0.5, scale=(0.09, 0.25), ratio=(0.3, 3.3)) 50 | # hide_patch(0.2), 51 | ]) 52 | self.train_transform_pcd = transforms.Compose([ 53 | transforms.ToTensor() 54 | 55 | ]) 56 | self.train_transform4 = RandomChoice([ 57 | transforms.RandomErasing(p=0.5, scale=(0.09, 0.25), ratio=(0.3, 3.3)) 58 | # hide_patch(1), 59 | ]) 60 | 61 | 62 | 63 | self.test_transform = transforms.Compose( 64 | [ 65 | transforms.Resize(size=(size, size)), 66 | transforms.ToTensor(), 67 | normalize 68 | ] 69 | ) 70 | 71 | self.sup_transform = transforms.Compose( 72 | [ 73 | transforms.RandomCrop(size=(size, size)), # transforms.RandomHorizontalFlip(), 74 | 
transforms.ToTensor(), 75 | normalize 76 | ] 77 | ) 78 | 79 | def __call__(self, x1, x2): 80 | aug1 = self.train_transform(x1) 81 | aug2 = self.train_transform(x2) 82 | # if self.copy_paste: 83 | # aug1, aug2 = self.copy_paste_aug(aug1, aug2) 84 | if args.ssl_dataset=='CMU': 85 | aug1, aug2 = self.train_transform2([aug1, aug2]) 86 | # aug2 = self.train_transform2(aug2) 87 | # aug1 = self.train_transform2(aug1) 88 | aug1 = self.train_transform3(aug1) 89 | aug2 = self.train_transform3(aug2) 90 | else: 91 | aug1, aug2 = self.train_transform2([aug1, aug2]) 92 | 93 | 94 | 95 | return aug1, aug2 96 | 97 | 98 | class GaussianBlur(object): 99 | """Gaussian blur augmentation """ 100 | 101 | def __init__(self, sigma=None): 102 | if sigma is None: 103 | sigma = [.1, 2.] 104 | self.sigma = sigma 105 | 106 | def __call__(self, x): 107 | sigma = random.uniform(self.sigma[0], self.sigma[1]) 108 | x = x.filter(ImageFilter.GaussianBlur(radius=sigma)) 109 | return x 110 | 111 | class hide_patch(object): 112 | """" Hide random part of the image """ 113 | 114 | def __init__(self, hide_prob=0.3): 115 | self.hide_prob = hide_prob 116 | self.skipsize = 20 117 | 118 | def __call__(self, img): 119 | s = img.shape 120 | wd = s[1] 121 | ht = s[2] 122 | 123 | # possible grid size, 0 means no hiding 124 | if wd ==224: 125 | grid_sizes = [15, 20, 25] 126 | else : 127 | grid_sizes = [33, 44, 55] 128 | 129 | # hiding probability 130 | 131 | # randomly choose one grid size 132 | grid_size = grid_sizes[random.randint(0, len(grid_sizes) - 1)] 133 | 134 | # hide the patches 135 | if grid_size != 0: 136 | for x in range(0, wd, grid_size): 137 | for y in range(0, ht, grid_size): 138 | x_end = min(wd, x + grid_size) 139 | y_end = min(ht, y + grid_size) 140 | if x <= self.skipsize: 141 | img[:, x:x_end, y:y_end] = 0 142 | 143 | if random.random() <= self.hide_prob: 144 | # patch_avg = img[:, x:x_end, y:y_end].mean() # activate this line if u want mean patch value 145 | img[:, x:x_end, y:y_end] = 0 # patch_avg 146 | 147 | return img 148 | 149 | 150 | 151 | 152 | 153 | def get_color_distortion(s=1.0): 154 | """ 155 | Color jitter from SimCLR paper 156 | @param s: is the strength of color distortion. 157 | """ 158 | 159 | color_jitter = transforms.ColorJitter(0.6*s, 0.6*s, 0.6*s, 0.2*s) 160 | rnd_color_jitter = transforms.RandomApply([color_jitter], p=0.7) 161 | rnd_gray = transforms.RandomGrayscale(p=0.2) 162 | color_distort = transforms.Compose([rnd_color_jitter, rnd_gray]) 163 | return color_distort 164 | 165 | 166 | class copy_paste(object): 167 | ''' Copy paste augumentation: arg: paste img, paste mask, img on which the obj to be pasted, gaussian blur(sigma) 168 | params: sigma = Gaussian blur radius 169 | blend = bool 170 | affine = bool 171 | instance_txt_path = path to the directory containing the instances list that needs to be pasted. 
172 | 173 | ''' 174 | 175 | 176 | def __init__(self, blend=True, sigma= 1, affine=True, prob=1): 177 | self.sigma = sigma 178 | self.blend = blend 179 | self.affine = affine 180 | self.prob = prob 181 | self.instance_txt_path = '/data/input/datasets/VL-CMU-CD/instance.txt' 182 | with open(self.instance_txt_path, 'rb') as fp: 183 | self.instance_list = pickle.load(fp) 184 | 185 | 186 | def __call__(self, copy_img, copy_img2): 187 | if random.random() <= self.prob: 188 | if self.instance_list: 189 | inst_name = random.choice(self.instance_list) 190 | instance = Image.open(inst_name) 191 | self.instance = instance 192 | 193 | if self.instance is not None: 194 | H,W = copy_img.size 195 | paste_img = transforms.Resize(size=(H, W))(self.instance) 196 | if self.affine == True: 197 | paste_img = transforms.RandomAffine(degrees=0, translate=(0.25, 0.25), scale=(0.8, 1.1), shear=0)(paste_img) 198 | gray_mask = transforms.Grayscale()(paste_img) 199 | binary_mask = np.asarray(gray_mask) 200 | binary_mask = 1.0 * (binary_mask > 0) 201 | # blur_binary_mask = skimage.exposure.rescale_intensity(blur_binary_mask) 202 | invert_mask = 1.0 * (np.logical_not(binary_mask).astype(int)) 203 | 204 | if self.blend == True: 205 | blur_invert_mask = gaussian_filter(invert_mask, sigma=self.sigma) 206 | blur_binary_mask = gaussian_filter(binary_mask, sigma=self.sigma) 207 | blur_invert_mask = np.expand_dims(blur_invert_mask, 2) # Expanding dims to match channels 208 | blur_binary_mask = np.expand_dims(blur_binary_mask, 2) 209 | blur_invert_mask = np.expand_dims(invert_mask, 2) # Expanding dims to match channels 210 | blur_binary_mask = np.expand_dims(binary_mask, 2) 211 | aug_image1 = (paste_img * blur_binary_mask) + (copy_img * blur_invert_mask) 212 | aug_image2 = (paste_img * blur_binary_mask) + (copy_img2 * blur_invert_mask) 213 | 214 | 215 | return Image.fromarray(np.uint8(aug_image1)), Image.fromarray(np.uint8(aug_image2)) 216 | else: 217 | return(copy_img), (copy_img2) 218 | 219 | 220 | 221 | 222 | 223 | 224 | 225 | 226 | 227 | 228 | 229 | 230 | -------------------------------------------------------------------------------- /DSP/util/COCO_loader/base_dataset.py: -------------------------------------------------------------------------------- 1 | from abc import abstractmethod 2 | import torch.utils.data as data 3 | import numpy as np 4 | from pycocotools.coco import COCO 5 | import os 6 | import logging 7 | import cv2 8 | import torch 9 | 10 | 11 | __all__ = ['BaseDataset'] 12 | 13 | 14 | class BaseDataset(data.Dataset): 15 | 16 | def __init__(self, root, split, cfg, mode=None, base_size=None, 17 | crop_size=None, ann_path=None, ann_file_format=None, 18 | has_inst_seg=True, **kwargs): 19 | self.root = root 20 | self._split = split 21 | self.mode = mode 22 | self.base_size = base_size if base_size is not None else 1024 23 | self.crop_size = crop_size if crop_size is not None else [512, 512] 24 | 25 | 26 | self.cfg = cfg 27 | 28 | 29 | if split == 'test': 30 | return 31 | 32 | if ann_path is None: 33 | ann_path = 'gtFine/annotations_coco_format_v1' 34 | if ann_file_format is None: 35 | ann_file_format = 'instances_%s.json' 36 | self.coco = COCO(os.path.join(root, ann_path, ann_file_format % split)) 37 | 38 | # Image paths is currently none to address test split length.. 
39 | # update image paths after initializing BaseDataset 40 | self.image_paths = None 41 | self.image_ids = list(self.coco.imgs.keys()) 42 | self.image_ids = sorted(self.image_ids) 43 | logging.info(f'Number of images in split {split} is {len(self.image_ids)}') 44 | 45 | ids = [] 46 | for img_id in self.image_ids: 47 | ann_ids = self.coco.getAnnIds(imgIds=img_id, iscrowd=None) 48 | anno = self.coco.loadAnns(ann_ids) 49 | if split == "train": 50 | if self.has_valid_annotation(anno): 51 | ids.append(img_id) 52 | else: 53 | ids.append(img_id) 54 | 55 | self.image_ids = ids 56 | logging.info(f'Number of images with valid annotations ' 57 | f'in split {split} is {len(self.image_ids)}') 58 | 59 | self.id_to_filename = dict() 60 | self.filename_to_id = dict() 61 | for i, ob in self.coco.imgs.items(): 62 | self.filename_to_id[ob['file_name']] = ob['id'] 63 | self.id_to_filename[ob['id']] = ob['file_name'] 64 | 65 | detect_ids = self.get_detect_ids() 66 | self.coco_id_to_contiguous_id = {coco_id: i for i, coco_id 67 | in enumerate(detect_ids)} 68 | self.contiguous_id_to_coco_id = {v: k for k, v in 69 | self.coco_id_to_contiguous_id.items()} 70 | 71 | self.key, self.segment_mapping = self.get_segment_mapping() 72 | 73 | self.has_inst_seg = has_inst_seg 74 | self.inst_encoding_type ='MEINST' 75 | 76 | 77 | 78 | @property 79 | def image_size(self): 80 | return self.crop_size 81 | 82 | @property 83 | def split(self): 84 | return self._split 85 | 86 | @split.setter 87 | def split(self, value): 88 | assert type(value) is str, 'Dataset split should be string' 89 | self._split = value 90 | 91 | @abstractmethod 92 | def get_detect_ids(self): 93 | pass 94 | 95 | @abstractmethod 96 | def get_segment_mapping(self): 97 | pass 98 | 99 | @abstractmethod 100 | def __getitem__(self, index): 101 | pass 102 | 103 | def __len__(self): 104 | if self.split == "test": 105 | return len(self.image_paths) 106 | return len(self.image_ids) 107 | 108 | @staticmethod 109 | def xywh2xyxy(box): 110 | x1, y1, w, h = box 111 | return [x1, y1, x1 + w, y1 + h] 112 | 113 | def get_img_info(self, index): 114 | image_id = self.image_ids[index] 115 | img_data = self.coco.imgs[image_id] 116 | return img_data 117 | 118 | 119 | 120 | @abstractmethod 121 | def ann_check_hooks(self, ann_obj): 122 | pass 123 | 124 | def get_annotation(self, index): 125 | image_id = self.image_ids[index] 126 | # TODO: optionally create segmentation masks... 
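        # Load every COCO annotation for this image, then keep only non-crowd
        # objects with a real box (and, when instance segmentation is enabled,
        # decode their masks via annToMask).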
127 | ann_ids = self.coco.getAnnIds(imgIds=image_id) 128 | loaded_anns = self.coco.loadAnns(ann_ids) 129 | 130 | bboxes, labels, inst_masks = [], [], [] 131 | for obj in loaded_anns: 132 | if obj.get('iscrowd', 0) == 0 and obj.get('real_box', True)\ 133 | and self.ann_check_hooks(obj): 134 | bboxes.append(self.xywh2xyxy(obj["bbox"])) 135 | labels.append(self.coco_id_to_contiguous_id[obj["category_id"]]) 136 | if self.has_inst_seg: 137 | inst_masks.append(self.coco.annToMask(obj)) 138 | 139 | bboxes = np.array(bboxes, np.float32).reshape((-1, 4)) 140 | labels = np.array(labels, np.int64).reshape((-1,)) 141 | 142 | # remove invalid boxes 143 | keep = (bboxes[:, 3] > bboxes[:, 1]) & (bboxes[:, 2] > bboxes[:, 0]) 144 | bboxes = bboxes[keep] 145 | labels = labels[keep] 146 | inst_masks = [inst_masks[idx] for idx, k in enumerate(keep) if k] 147 | 148 | rets = [bboxes, labels] 149 | if self.has_inst_seg: 150 | rets += [inst_masks] 151 | return rets 152 | 153 | @staticmethod 154 | def _has_only_empty_bbox(anno): 155 | return all(not (obj.get("iscrowd", 0) == 0 and 156 | obj.get("real_bbox", True)) for obj in anno) 157 | 158 | def has_valid_annotation(self, anno): 159 | # if it's empty, there is no annotation 160 | if len(anno) == 0: 161 | return False 162 | # if all boxes have close to zero area, there is no annotation 163 | if self._has_only_empty_bbox(anno): 164 | return False 165 | return True 166 | 167 | def add_area(self): 168 | for i, v in self.coco.anns.items(): 169 | v['area'] = v['bbox'][2] * v['bbox'][3] 170 | 171 | def segment_mask_transform(self, mask): 172 | mask = np.array(mask).astype('int32') 173 | if self.segment_mapping is not None: 174 | mask = self.segment_mask_to_contiguous(mask) 175 | return torch.from_numpy(mask).long() 176 | 177 | def segment_mask_to_contiguous(self, mask): 178 | values = np.unique(mask) 179 | for i in range(len(values)): 180 | assert (values[i] in self.segment_mapping) 181 | index = np.digitize(mask.ravel(), self.segment_mapping, right=True) 182 | return self.key[index].reshape(mask.shape) 183 | 184 | 185 | -------------------------------------------------------------------------------- /DSP/util/COCO_loader/coco_uninet.py: -------------------------------------------------------------------------------- 1 | import os 2 | import numpy as np 3 | from pycocotools import mask as coco_mask 4 | import logging 5 | from PIL import Image 6 | import dataset.CMU as CMU 7 | from scipy.ndimage import gaussian_filter 8 | from torchvision import transforms 9 | import torch 10 | import random 11 | from torch.utils.data import DataLoader 12 | 13 | 14 | 15 | from torch.utils.data import Dataset, DataLoader 16 | import matplotlib.pyplot as plt 17 | 18 | from util.COCO_loader.base_dataset import BaseDataset 19 | 20 | 21 | class COCOUninet(BaseDataset): 22 | NUM_CLASSES = {'segment': 21, 'detect': 81, 'inst_seg': 81} 23 | INSTANCE_NAMES = [] 24 | 25 | CAT_LIST = [0, 5, 2, 16, 9, 44, 6, 3, 17, 62, 21, 67, 18, 19, 4, 26 | 1, 64, 20, 63, 7, 72] 27 | 28 | def __init__(self, root=os.path.expanduser('/data/input/datasets/mscoco'), 29 | split='train', mode=None, cfg=None, **kwargs): 30 | year = str(2017) 31 | if year == "2017" and split == 'minival': 32 | split = 'val' 33 | super(COCOUninet, self).__init__( 34 | root, split, cfg, mode, ann_path='annotations', 35 | ann_file_format=f'instances_%s{year}.json', **kwargs) 36 | 37 | if self.split == "test": 38 | self.image_paths = get_image_paths(self.root, year=year) 39 | if len(self.image_paths) == 0: 40 | raise RuntimeError("Found 0 
images in subfolders of:" + self.root + "\n") 41 | return 42 | 43 | self.img_dir = os.path.join(root, rf'{split}{year}') 44 | self.add_area() 45 | 46 | @staticmethod 47 | def _has_only_empty_bbox(anno): 48 | return all(any(o <= 1 for o in obj["bbox"][2:]) for obj in anno) 49 | 50 | def ann_check_hooks(self, obj): 51 | return True 52 | 53 | def get_detect_ids(self): 54 | det_ids = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 13, 14, 15, 16, 17, 18, 55 | 19, 20, 21, 22, 23, 24, 25, 27, 28, 31, 32, 33, 34, 35, 36, 37, 56 | 38, 39, 40, 41, 42, 43, 44, 46, 47, 48, 49, 50, 51, 52, 53, 54, 57 | 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 67, 70, 72, 73, 74, 58 | 75, 76, 77, 78, 79, 80, 81, 82, 84, 85, 86, 87, 88, 89, 90] 59 | 60 | return det_ids 61 | 62 | def get_segment_mapping(self): 63 | key = None 64 | segment_mapping = None 65 | 66 | return key, segment_mapping 67 | 68 | def __getitem__(self, index): 69 | if self.split == "test": 70 | image = Image.open(self.image_paths[index]).convert('RGB') 71 | image = np.array(image) 72 | image, _ = self.transform(image) 73 | return image 74 | 75 | file_name = self.id_to_filename[self.image_ids[index]] 76 | image_path = os.path.join(self.img_dir, file_name) 77 | image = Image.open(image_path).convert('RGB') 78 | image = np.array(image) 79 | 80 | bboxes, labels, inst_masks = self.get_annotation(index) 81 | 82 | inst_list = [] 83 | for idx in range(len(inst_masks)): 84 | 85 | if labels[idx] == 3: 86 | class_name = labels[idx] 87 | masks = np.array(inst_masks[idx]) 88 | masks = np.reshape(masks, (masks.shape[0], masks.shape[1], 1)) 89 | gau_masks = gaussian_filter(masks, sigma=1) 90 | 91 | instance = image * gau_masks 92 | nzCount = instance.any(axis=-1).sum() 93 | print(nzCount) 94 | if nzCount > 10000 and nzCount < 60000: 95 | inst_list.append(instance) 96 | 97 | if labels[idx] == 3: 98 | return image, inst_list, gau_masks, class_name 99 | else: 100 | return image 101 | 102 | 103 | 104 | 105 | 106 | 107 | def get_image_paths(folder, split='test', year='2014'): 108 | def get_path_pairs(): 109 | img_paths = [] 110 | for root, directories, files in os.walk(img_folder): 111 | for filename in files: 112 | if filename.endswith(".jpg") or filename.endswith(".png"): 113 | im_path = os.path.join(root, filename) 114 | if os.path.isfile(im_path): 115 | img_paths.append(im_path) 116 | else: 117 | logging.info('cannot find the mask or image:', im_path) 118 | logging.info('Found {} images in the folder {}'.format(len(img_paths), img_folder)) 119 | return img_paths 120 | 121 | img_folder = os.path.join(folder, split + year) 122 | return get_path_pairs() 123 | 124 | def convert_togray (cd_img1, instance): 125 | img1 = instance 126 | b,c,H,W = cd_img1.shape 127 | size = (H,W) 128 | transform = transforms.ToPILImage() 129 | img1 = img1.squeeze(0) 130 | img1 = img1.permute(2, 0, 1) 131 | img1 = transform(img1) 132 | 133 | img1 = transforms.Resize(size=size)(img1) 134 | gray_instance = transforms.Grayscale()(img1) 135 | gray_instance, resized_instance = transforms.ToTensor()(gray_instance),transforms.ToTensor()(img1) 136 | 137 | return gray_instance.unsqueeze(0), resized_instance.unsqueeze(0) 138 | 139 | def image_copy_paste(img1, img2, instance, alpha, blend=True, sigma=1): 140 | if alpha is not None: 141 | gray_ins, instance = convert_togray(img1, instance) 142 | binarized = 1.0 * (gray_ins > 0) 143 | invert_binary = (~binarized).float() 144 | if blend: 145 | filtered_mask = gaussian_filter(invert_binary, sigma=1) 146 | filtered_mask = torch.Tensor(filtered_mask) 147 | aug_img1 
= instance + (img1 * filtered_mask) 148 | aug_img2 = instance + (img2 * filtered_mask) 149 | 150 | return aug_img1, aug_img2, instance 151 | 152 | def save_show_transformations(img, img2,instance, masks, name, path = '/data/input/datasets/VL-CMU-CD/instances'): 153 | 154 | img = np.squeeze(img) 155 | img2 = np.squeeze(img2) 156 | instance = np.squeeze(instance) 157 | masks = np.squeeze(masks) 158 | instance = instance.permute(1,2,0) 159 | img= np.transpose(img,(0,1, 2)) # from NCHW to NHWC 160 | instance= np.transpose(instance,(0,1, 2)) 161 | f, axarr = plt.subplots(1,4) 162 | axarr[0].imshow(img.permute(1,2,0)) 163 | axarr[1].imshow(img2.permute(1,2,0)) 164 | axarr[2].imshow(instance) 165 | axarr[3].imshow(masks) 166 | plt.show() 167 | 168 | 169 | instance = instance.numpy() 170 | rescaled = (255.0 / instance.max() * (instance - instance.min())).astype(np.uint8) 171 | Name_Formatted = ("%s" % (j)) + ".png" 172 | # file_path = os.path.join(path, Name_Formatted) 173 | # instance = Image.fromarray(rescaled) 174 | # instance.save(file_path) 175 | return file_path 176 | 177 | 178 | 179 | # COCO dataset loader 180 | coco_train_dataset = COCOUninet() 181 | train_loader_coco = DataLoader(coco_train_dataset, batch_size=1, shuffle=True, drop_last=True) 182 | # Change detection dataset loader 183 | TRAIN_DATA_PATH = "/data/input/datasets/VL-CMU-CD/struc_train" 184 | data_path = os.path.join(TRAIN_DATA_PATH, 'train_pair.txt') 185 | CD_train_dataset = CMU.Dataset(TRAIN_DATA_PATH, TRAIN_DATA_PATH, 186 | data_path, 'train', 'CD', transform=True, 187 | transform_med=None) 188 | train_loader_CD = DataLoader(CD_train_dataset, batch_size=1, shuffle=True, drop_last=True) 189 | 190 | def extract_ins_coco (): 191 | for j, batch_CD in enumerate(train_loader_CD): 192 | t0, t1, cd_labels, instance = batch_CD 193 | for i, batch in enumerate(train_loader_coco): 194 | if len(batch) >1: 195 | img, instance, masks, labels = batch 196 | if instance: 197 | for ins in instance: 198 | aug_t0, aug_t1, resized_instance = image_copy_paste(t0, t1, ins, masks) 199 | # show_transformations(img, ins, masks) 200 | save_show_transformations(aug_t0, resized_instance, masks) 201 | ins_path = [] 202 | for j, batch_CD in enumerate(train_loader_CD): 203 | t0, t1, cd_labels, ins = batch_CD 204 | file_path = save_show_transformations(t0, t1, ins, cd_labels, j) 205 | ins_path.append(file_path) 206 | 207 | 208 | -------------------------------------------------------------------------------- /DSP/util/COCO_loader/defaults.py: -------------------------------------------------------------------------------- 1 | from yacs.config import CfgNode as CN 2 | 3 | # ----------------------------------------------------------------------------- 4 | # Config definition 5 | # ----------------------------------------------------------------------------- 6 | _C = CN() 7 | 8 | # ----------------------------------------------------------------------------- 9 | # MODEL options 10 | # ----------------------------------------------------------------------------- 11 | _C.MODEL = CN() 12 | _C.MODEL.PRETRAINED_PATH = "/input/datasets/uninet/pytorch-config/FCOS_imprv_R_50_FPN_1x.pth" 13 | _C.MODEL.IS_FULL_MODEL = False 14 | _C.MODEL.LOAD_BACKBONE = False 15 | _C.MODEL.BACKBONE_NAME = 'backbone' 16 | _C.MODEL.NECK_NAMES = ['fpn', 'neck'] 17 | _C.MODEL.HEAD_NAME = 'head' 18 | _C.MODEL.USE_DCN = False 19 | 20 | # ----------------------------------------------------------------------------- 21 | # INPUT options 22 | # 
----------------------------------------------------------------------------- 23 | _C.INPUT = CN() 24 | 25 | # ---------------------------------------------------------------------------- # 26 | # Specific test options 27 | # ---------------------------------------------------------------------------- # 28 | _C.TEST = CN() 29 | # Number of detections per image 30 | _C.TEST.DETECTIONS_PER_IMG = 100 31 | 32 | # ---------------------------------------------------------------------------- # 33 | # Test-time augmentations for bounding box detection 34 | # See configs/test_time_aug/e2e_mask_rcnn_R-50-FPN_1x.yaml for an example 35 | # ---------------------------------------------------------------------------- # 36 | _C.TEST.BBOX_AUG = CN() 37 | # Enable test-time augmentation for bounding box detection if True 38 | _C.TEST.BBOX_AUG.ENABLED = False 39 | 40 | # --------------------------------------------------------------------------- # 41 | # Dataloader Options 42 | # ---------------------------------------------------------------------------- # 43 | _C.DATALOADER = CN() 44 | _C.DATALOADER.YEAR = 2014 45 | _C.DATALOADER.ANNOTATION_FOLDER = 'gtFine/annotations_coco_format_v1' 46 | _C.DATALOADER.ANN_FILE_FORMAT = 'instances_%s.json' 47 | # ImageNet mean and standard deviation.. 48 | _C.DATALOADER.MEAN = [.485, .456, .406] 49 | _C.DATALOADER.STD = [.229, .224, .225] 50 | _C.DATALOADER.TRAIN_TRANSFORMS = ['PreProcessBoxes', 'PadIfNeeded', 'ShiftScaleRotate', 'CropNonEmptyMaskIfExists', 51 | 'ResizeMultiScale', 'HorizontalFlip', 'ColorJitter', 'PostProcessBoxes', 52 | 'ConvertFromInts', 'ToTensor', 'Normalize'] 53 | _C.DATALOADER.VAL_TRANSFORMS = ['PreProcessBoxes', 'Resize', 'PostProcessBoxes', 54 | 'ToTensor', 'Normalize'] 55 | # Multi scale augmentation defaults.. 56 | _C.DATALOADER.MS_MULTISCALE_MODE = 'value' 57 | _C.DATALOADER.MS_RATIO_RANGE = [0.75, 1] 58 | _C.DATALOADER.PHOTOMETRIC_DISTORT_KWARGS = '{}' 59 | _C.DATALOADER.INST_SEG_ENCODING = 'MEINST' 60 | _C.DATALOADER.DEPTH_SCALE = 512. 61 | 62 | # ---------------------------------------------------------------------------- # 63 | # Task options 64 | # ---------------------------------------------------------------------------- # 65 | _C.TASKS = CN() 66 | _C.TASKS.TASK_TO_LOSS_NAME = '{\"detect\":"default",\"segment\":"default",\"depth\":"default",' \ 67 | '\"inst_depth\":"default",\"inst_seg\":"default"}' 68 | _C.TASKS.TASK_TO_LOSS_ARGS = '{}' 69 | _C.TASKS.TASK_TO_LOSS_KWARGS = '{}' 70 | _C.TASKS.TASK_TO_CALL_KWARGS = '{\"segment\":{\"ignore_index\":-1}}' 71 | _C.TASKS.TASK_TO_MIN_OR_MAX = '{\"detect\":1,\"segment\":1,\"depth\":-1,\"inst_depth\":-1,' \ 72 | ' \"inst_seg\":1}' 73 | _C.TASKS.ALL_LOSSES = ['detect_cls_loss', 'detect_reg_loss', 'detect_centerness_loss', 74 | 'segment_loss', 'depth_loss', 'inst_depth_l1_loss', 75 | 'inst_seg_loss'] 76 | _C.TASKS.LOSS_INIT_WEIGHTS = [1., 1., 1., 1., 1., 0.05, 1.] 
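# One entry per loss in TASKS.ALL_LOSSES above; the same ordering applies to
# LOSS_START_EPOCH on the next line.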
77 | _C.TASKS.LOSS_START_EPOCH = [1, 1, 1, 1, 1, 1, 1] 78 | _C.TASKS.USE_UNCERTAINTY_WEIGHTING = False 79 | 80 | # --------------------------------------------------------------------------- # 81 | # Backbone and encoder Options 82 | # ---------------------------------------------------------------------------- # 83 | _C.MODEL.ENCODER = CN() 84 | _C.MODEL.ENCODER.NUM_EN_FEATURES = 6 85 | _C.MODEL.ENCODER.OUT_CHANNELS_BEFORE_EXPANSION = 512 86 | _C.MODEL.ENCODER.FEAT_CHANNELS = [2048, 2048, 2048, 2048] 87 | _C.MODEL.ENCODER.USE_DCN = False 88 | 89 | # --------------------------------------------------------------------------- # 90 | # Decoder Options 91 | # ---------------------------------------------------------------------------- # 92 | _C.MODEL.DECODER = CN() 93 | _C.MODEL.DECODER.OUTPLANES = 64 94 | _C.MODEL.DECODER.MULTISCALE = False 95 | _C.MODEL.DECODER.ATTENTION = False 96 | _C.MODEL.DECODER.INSERT_MEAN_FEAT = False 97 | _C.MODEL.DECODER.INIT_WEIGHTS = False 98 | _C.MODEL.DECODER.USE_NECK_FEATURES = False 99 | 100 | # --------------------------------------------------------------------------- # 101 | # Object Detection Options 102 | # ---------------------------------------------------------------------------- # 103 | _C.MODEL.DET = CN() 104 | _C.MODEL.DET.HEAD_NAME = "FCOS" 105 | _C.MODEL.DET.FEATURE_CHANNELS = 256 106 | _C.MODEL.DET.FPN_STRIDES = [8, 16, 32] 107 | _C.MODEL.DET.WEIGHTS_PER_CLASS = [1] * 8 108 | _C.MODEL.DET.ATTENTION = False 109 | _C.MODEL.DET.CLS_LOSS_TYPE = 'focal_loss' 110 | # Focal loss parameter: alpha 111 | _C.MODEL.DET.LOSS_ALPHA = 0.25 112 | # Focal loss parameter: gamma 113 | _C.MODEL.DET.LOSS_GAMMA = 2.0 114 | _C.MODEL.DET.LOSS_BETA = 0.9999 115 | _C.MODEL.DET.PRIOR_PROB = 0.01 116 | 117 | # --------------------------------------------------------------------------- # 118 | # FCOS Options 119 | # ---------------------------------------------------------------------------- # 120 | _C.MODEL.FCOS = CN() 121 | _C.MODEL.FCOS.INFERENCE_TH = 0.05 122 | _C.MODEL.FCOS.NMS_TH = 0.6 123 | _C.MODEL.FCOS.PRE_NMS_TOP_N = 1000 124 | # the number of convolutions used in the cls and bbox tower 125 | _C.MODEL.FCOS.NUM_CONVS = 4 126 | # if CENTER_SAMPLING_RADIUS <= 0, it will disable center sampling 127 | _C.MODEL.FCOS.CENTER_SAMPLING_RADIUS = 0.0 128 | # IOU_LOSS_TYPE can be "iou", "linear_iou" or "giou" 129 | _C.MODEL.FCOS.IOU_LOSS_TYPE = "iou" 130 | _C.MODEL.FCOS.NORM_REG_TARGETS = False 131 | _C.MODEL.FCOS.CENTERNESS_ON_REG = False 132 | _C.MODEL.FCOS.USE_DCN_IN_TOWER = False 133 | _C.MODEL.FCOS.USE_NAS_HEAD = False 134 | _C.MODEL.FCOS.ATSS_TOPK = 9 135 | 136 | # --------------------------------------------------------------------------- # 137 | # OnetNet Options 138 | # ---------------------------------------------------------------------------- # 139 | _C.MODEL.ONENET = CN() 140 | _C.MODEL.ONENET.CLASS_WEIGHT = 1. 141 | _C.MODEL.ONENET.GIOU_WEIGHT = 1. 
142 | _C.MODEL.ONENET.L1_WEIGHT = 2.5 143 | _C.MODEL.ONENET.USE_NMS = False 144 | _C.MODEL.ONENET.NMS_TH = 0.5 145 | 146 | # --------------------------------------------------------------------------- # 147 | # Segmentation Options 148 | # ---------------------------------------------------------------------------- # 149 | _C.MODEL.SEG = CN() 150 | _C.MODEL.SEG.INPLANES = 64 151 | _C.MODEL.SEG.OUTPLANES = 64 152 | _C.MODEL.SEG.MULTISCALE = False 153 | _C.MODEL.SEG.ATTENTION = False 154 | 155 | # Depth Options 156 | # ---------------------------------------------------------------------------- # 157 | _C.MODEL.DEPTH = CN() 158 | _C.MODEL.DEPTH.INPLANES = 64 159 | _C.MODEL.DEPTH.OUTPLANES = 64 160 | _C.MODEL.DEPTH.ACTIVATION_FN = 'sigmoid' 161 | _C.MODEL.DEPTH.ATTENTION = False 162 | 163 | # --------------------------------------------------------------------------- # 164 | # Instance depth Options 165 | # ---------------------------------------------------------------------------- # 166 | _C.MODEL.INST_DEPTH = CN() 167 | _C.MODEL.INST_DEPTH.DEPTH_ON_REG = True 168 | 169 | # --------------------------------------------------------------------------- # 170 | # Instance segmentation Options 171 | # ---------------------------------------------------------------------------- # 172 | _C.MODEL.INST_SEG = CN() 173 | _C.MODEL.INST_SEG.HEAD_NAME = 'MEINST' 174 | 175 | # --------------------------------------------------------------------------- # 176 | # MEINST Options 177 | 178 | _C.MODEL.MEINST = CN() 179 | # share classification head and instance segmentation head.. 180 | _C.MODEL.MEINST.SHARE_CLS_INST_HEADS = False 181 | # share bounding box head and instance segmentation head.. 182 | _C.MODEL.MEINST.SHARE_BBOX_INST_HEADS = True 183 | # mask encoding type 184 | _C.MODEL.MEINST.ENCODING_TYPE = 'explicit' 185 | # is inverse sigmoid and sigmoid used for finding pca components 186 | _C.MODEL.MEINST.SIGMOID = True 187 | # is whiten used for finding pca components 188 | _C.MODEL.MEINST.WHITEN = True 189 | # path to pca params file 190 | _C.MODEL.MEINST.PCA_PATH = '' 191 | # number of components in the encoded mask 192 | _C.MODEL.MEINST.NUM_COMPONENTS = 60 193 | # dimension to which all instance masks are reshaped to 194 | _C.MODEL.MEINST.ENCODING_DIM = 28 195 | # add instance masks vizualized as segmentation masks to tensorboard 196 | _C.MODEL.MEINST.CREATE_PRED_MASK = False 197 | # visualize each instance separately.. 
198 | _C.MODEL.MEINST.VIZ_INSTANCES = True 199 | 200 | 201 | # --------------------------------------------------------------------------- # 202 | # CenterMask Options 203 | # ---------------------------------------------------------------------------- # 204 | 205 | _C.MODEL.CENTER_MASK = CN() 206 | _C.MODEL.CENTER_MASK.IN_FEATURES = ['p3', 'p4', 'p5'] 207 | _C.MODEL.CENTER_MASK.POOLER_RESOLUTION = 14 208 | _C.MODEL.CENTER_MASK.POOLER_SAMPLING_RATIO = 0 209 | _C.MODEL.CENTER_MASK.POOLER_TYPE = 'ROIAlignV2' 210 | _C.MODEL.CENTER_MASK.ASSIGN_CRITERION = 'ratio' 211 | _C.MODEL.CENTER_MASK.MASK_CONV_DIM = 128 212 | _C.MODEL.CENTER_MASK.MASK_NUM_CONV = 2 213 | _C.MODEL.CENTER_MASK.MASKIOU_CONV_DIM = 128 214 | _C.MODEL.CENTER_MASK.MASKIOU_NUM_CONV = 2 215 | _C.MODEL.CENTER_MASK.CLS_AGNOSTIC_MASK = False 216 | _C.MODEL.CENTER_MASK.MASKIOU_ON = False 217 | _C.MODEL.CENTER_MASK.MASKIOU_LOSS_WEIGHT = 1.0 218 | 219 | # --------------------------------------------------------------------------- # 220 | # Miscellaneous Options 221 | # ---------------------------------------------------------------------------- # 222 | 223 | _C.MISC = CN() 224 | _C.MISC.CITYS_INST_SEG_EVAL = False -------------------------------------------------------------------------------- /DSP/util/__pycache__/dist_util.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/NeurAI-Lab/DSP/45027a3702696dafd7018802619dde17c6da1ca8/DSP/util/__pycache__/dist_util.cpython-37.pyc -------------------------------------------------------------------------------- /DSP/util/__pycache__/test.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/NeurAI-Lab/DSP/45027a3702696dafd7018802619dde17c6da1ca8/DSP/util/__pycache__/test.cpython-37.pyc -------------------------------------------------------------------------------- /DSP/util/__pycache__/torchlist.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/NeurAI-Lab/DSP/45027a3702696dafd7018802619dde17c6da1ca8/DSP/util/__pycache__/torchlist.cpython-37.pyc -------------------------------------------------------------------------------- /DSP/util/__pycache__/train_util.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/NeurAI-Lab/DSP/45027a3702696dafd7018802619dde17c6da1ca8/DSP/util/__pycache__/train_util.cpython-37.pyc -------------------------------------------------------------------------------- /DSP/util/__pycache__/transforms.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/NeurAI-Lab/DSP/45027a3702696dafd7018802619dde17c6da1ca8/DSP/util/__pycache__/transforms.cpython-37.pyc -------------------------------------------------------------------------------- /DSP/util/__pycache__/utils.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/NeurAI-Lab/DSP/45027a3702696dafd7018802619dde17c6da1ca8/DSP/util/__pycache__/utils.cpython-37.pyc -------------------------------------------------------------------------------- /DSP/util/dist_util.py: -------------------------------------------------------------------------------- 1 | """ 2 | Distributed Data Parallel resources 3 | """ 4 | import torch 5 | import os 6 | import torch.distributed as 
dist 7 | 8 | 9 | def is_dist_avail_and_initialized(): 10 | if not dist.is_available(): 11 | return False 12 | if not dist.is_initialized(): 13 | return False 14 | return True 15 | 16 | 17 | def get_world_size(): 18 | if not is_dist_avail_and_initialized(): 19 | return 1 20 | return dist.get_world_size() 21 | 22 | 23 | def get_rank(): 24 | if not is_dist_avail_and_initialized(): 25 | return 0 26 | return dist.get_rank() 27 | 28 | 29 | def is_main_process(): 30 | return get_rank() == 0 31 | 32 | 33 | def setup_for_distributed(is_master): 34 | """ 35 | This function disables printing when not in master process 36 | """ 37 | import builtins as __builtin__ 38 | builtin_print = __builtin__.print 39 | 40 | def print(*args, **kwargs): 41 | force = kwargs.pop('force', False) 42 | if is_master or force: 43 | builtin_print(*args, **kwargs) 44 | 45 | __builtin__.print = print 46 | 47 | 48 | def init_distributed_mode(args): 49 | if 'RANK' in os.environ and 'WORLD_SIZE' in os.environ: 50 | args.rank = int(os.environ["RANK"]) 51 | args.world_size = int(os.environ['WORLD_SIZE']) 52 | args.gpu = int(os.environ['LOCAL_RANK']) 53 | elif 'SLURM_PROCID' in os.environ: 54 | args.rank = int(os.environ['SLURM_PROCID']) 55 | args.gpu = args.rank % torch.cuda.device_count() 56 | elif hasattr(args, "rank"): 57 | pass 58 | else: 59 | print('Not using distributed mode') 60 | args.distributed = False 61 | return 62 | 63 | args.distributed = True 64 | torch.cuda.set_device(args.gpu) 65 | args.dist_backend = 'nccl' 66 | print('| distributed init (rank {}): {}'.format(args.rank, args.dist_url), flush=True) 67 | torch.distributed.init_process_group(backend=args.dist_backend, init_method=args.dist_url, 68 | world_size=args.world_size, rank=args.rank) 69 | setup_for_distributed(args.rank == 0) -------------------------------------------------------------------------------- /DSP/util/torchlist.py: -------------------------------------------------------------------------------- 1 | """ 2 | @author Fahad Sarfraz 3 | """ 4 | import torch.utils.data as data 5 | 6 | from PIL import Image 7 | import os 8 | import os.path 9 | 10 | 11 | def default_loader(path): 12 | return Image.open(path).convert('RGB') 13 | 14 | 15 | def default_flist_reader(flist): 16 | """ 17 | flist format: impath label\nimpath label\n ...(same to caffe's filelist) 18 | """ 19 | imlist = [] 20 | with open(flist, 'r') as rf: 21 | for line in rf.readlines(): 22 | impath, imlabel = line.strip().split() 23 | imlist.append( (impath, int(imlabel)) ) 24 | 25 | return imlist 26 | 27 | 28 | class ImageFilelist(data.Dataset): 29 | def __init__(self, root, flist, transform=None, target_transform=None, 30 | flist_reader=default_flist_reader, loader=default_loader): 31 | self.root = root 32 | self.imlist = flist_reader(flist) 33 | self.transform = transform 34 | self.target_transform = target_transform 35 | self.loader = loader 36 | 37 | def __getitem__(self, index): 38 | impath, target = self.imlist[index] 39 | img = self.loader(os.path.join(self.root ,impath)) 40 | if self.transform is not None: 41 | img = self.transform(img) 42 | if self.target_transform is not None: 43 | target = self.target_transform(target) 44 | 45 | return img, target 46 | 47 | def __len__(self): 48 | return len(self.imlist) -------------------------------------------------------------------------------- /DSP/util/train_util.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | from datetime import datetime 4 | from tqdm 
import tqdm 5 | import numpy as np 6 | import torch.nn.functional as F 7 | import os 8 | from transforms.simclr_transform import SimCLRTransform 9 | from torch.utils.data import DataLoader 10 | from torch.utils.data.distributed import DistributedSampler 11 | from util.utils import save_checkpoint, log 12 | from criterion.ntxent import NTXent, BarlowTwinsLoss_CD 13 | from criterion.sim_preserving_kd import criterion_MSE,distillation,fitnet_loss,similarity_preserving_loss, RKD, similarity_preserving_loss_cd, JSD 14 | import dataset.CMU as CMU 15 | import dataset.PCD as PCD 16 | 17 | def get_criteria(args): 18 | """ 19 | Loss criterion / criteria selection for training 20 | """ 21 | if args.barlow_twins : 22 | criteria = {'Barlow': [BarlowTwinsLoss_CD(args)]} #BarlowTwinsLoss 23 | else: 24 | criteria = {'ntxent': [NTXent(args), args.criterion_weight[0]]} 25 | 26 | return criteria 27 | 28 | 29 | 30 | def write_scalar(writer, total_loss,total_loss_bl, total_loss_kd,total_loss_sp, loss_p_c, leng, epoch): 31 | """ 32 | Add Loss scalars to tensorboard 33 | """ 34 | writer.add_scalar("Total_Loss/train", total_loss/leng,epoch) 35 | writer.add_scalar("Total_Loss_bl", total_loss_bl/leng,epoch) 36 | writer.add_scalar("Total_kd loss/train", total_loss_kd/leng, epoch) 37 | writer.add_scalar("Total_sp loss/train", total_loss_sp/leng, epoch) 38 | 39 | 40 | for k in loss_p_c: 41 | writer.add_scalar("{}_Loss/train".format(k), loss_p_c[k] / leng, epoch) 42 | 43 | 44 | def trainloaderSimCLR(args): 45 | """ 46 | Load training data through DataLoader 47 | """ 48 | transform = SimCLRTransform(args.img_size) 49 | 50 | if args.ssl_dataset == 'CMU': 51 | DATA_PATH = os.path.join(args.data_dir) 52 | 53 | VAL_DATA_PATH = os.path.join(args.val_data_dir) 54 | 55 | 56 | train_dataset = CMU.Dataset(DATA_PATH, 57 | 'train', 'ssl', transform= False, #ssl 58 | transform_med = transform) 59 | # test_dataset = CMU.Dataset(VAL_DATA_PATH, 'val', transform=False, 60 | # transform_med=None) 61 | elif args.ssl_dataset == 'PCD': 62 | print('PCD dataset loaded') 63 | DATA_PATH = os.path.join(args.data_dir) 64 | 65 | VAL_DATA_PATH = os.path.join(args.val_data_dir) 66 | 67 | train_dataset = PCD.Dataset(DATA_PATH, 68 | 'train', 'ssl', transform=False, # ssl 69 | transform_med=transform) 70 | # Data Loader 71 | if args.distribute: 72 | train_sampler = DistributedSampler(train_dataset) 73 | train_loader = DataLoader(train_dataset, batch_size=args.ssl_batchsize,sampler=train_sampler, drop_last=True) 74 | else: 75 | train_loader = DataLoader(train_dataset, batch_size=args.ssl_batchsize, shuffle=True, drop_last=True) 76 | # val_loader = DataLoader(train_dataset, batch_size=args.ssl_batchsize, shuffle=True, drop_last=True) 77 | # test_loader = DataLoader(test_dataset, batch_size=1, shuffle=True, drop_last=True) 78 | 79 | log("Took {} time to load data!".format(datetime.now() - args.start_time)) 80 | return train_loader 81 | 82 | def various_distance( out_vec_t0, out_vec_t1, dist_flag='l2'): 83 | 84 | if dist_flag == 'l2': 85 | distance = F.pairwise_distance(out_vec_t0,out_vec_t1,p=2) 86 | if dist_flag == 'l1': 87 | distance = F.pairwise_distance(out_vec_t0,out_vec_t1,p=1) 88 | if dist_flag == 'cos': 89 | similarity = F.cosine_similarity(out_vec_t0, out_vec_t1) 90 | distance = 1 - 2 * similarity / np.pi 91 | return distance 92 | 93 | def train_one_epoch(args, train_loader, model, criteria, optimizer, scheduler, epoch): 94 | """ 95 | Train one epoch of SSL model 96 | 97 | """ 98 | # torch.autograd.set_detect_anomaly(True) 99 | 
loss_per_criterion = {} 100 | total_loss = 0 101 | total_sup_loss = 0 102 | total_loss_bl = 0 103 | total_loss_kd = 0 104 | total_loss_sp = 0 105 | 106 | for i, batch in enumerate(train_loader): 107 | p1, p2, n1, n2, f1,f2, label = batch # x, y = positive pair belonging to t0 images ; x1,y1 = positive pair belonging to t1 images 108 | p1 = p1.cuda(device=args.device) 109 | p2 = p2.cuda(device=args.device) 110 | n1 = n1.cuda(device=args.device) 111 | n2 = n2.cuda(device=args.device) 112 | label = label.cuda(device=args.device) 113 | label = label.float() 114 | optimizer.zero_grad() 115 | if args.barlow_twins == True: 116 | if args.dense_cl==True: 117 | xe, ye, zx, zy = model(p1, p2) 118 | x1e, y1e, zx1, zy1 = model(n1, n2) 119 | diff_feat0= torch.nn.functional.pairwise_distance(zx, zx1) 120 | diff_feat1 = torch.nn.functional.pairwise_distance(zy , zy1) 121 | diff_feat2 = torch.nn.functional.pairwise_distance(zx , zy1) 122 | diff_feat3 = torch.nn.functional.pairwise_distance(zy , zx1) 123 | else: 124 | xe, ye, zx, zy = model(p1, p2) 125 | x1e, y1e, zx1, zy1 = model(n1, n2) 126 | ## simple diff layer to get change map 127 | diff_feat0 = torch.abs(zx - zx1) 128 | diff_feat1 = torch.abs(zy - zy1) 129 | diff_feat2 = torch.abs(zx - zy1) 130 | diff_feat3 = torch.abs(zy - zx1) 131 | else: 132 | _, _, zx, zy = model(p1, p2) 133 | _, _, zx1, zy1 = model(n1, n2) 134 | # Multiple loss aggregation 135 | loss = torch.tensor(0).to(args.device) 136 | for k in criteria: 137 | global_step = epoch * len(train_loader) + i 138 | if args.barlow_twins == True: 139 | if args.nodiff_tc: 140 | loss_bl = criteria[k][0](zx, zx1, diff_feat2, diff_feat3) 141 | else: 142 | loss_bl = criteria[k][0](diff_feat0, diff_feat1, diff_feat2, diff_feat3 ) 143 | 144 | if args.kd_loss==True: 145 | jsd = JSD(args) 146 | # loss_kd_1 = distillation(zx, zy, T=4) 147 | loss_kd_1 = jsd(zx, zy) 148 | loss_kd_2 = jsd(zx1, zy1) 149 | intra_kd_loss = loss_kd_1 + loss_kd_2 150 | loss_sp = 0 151 | if args.inter_kl==True: 152 | loss_kd_3 = jsd(zx, zx1) 153 | loss_kd_4 = jsd(zy, zy1) 154 | inter_kd_loss = loss_kd_3 + loss_kd_4 155 | loss_kd = (args.alpha_kl * intra_kd_loss) + (args.alpha_inter_kd*inter_kd_loss) 156 | else: 157 | loss_kd = args.alpha_kl * intra_kd_loss 158 | loss = loss_bl + loss_kd 159 | if args.kd_loss_2 == 'fitnet': 160 | loss_ft_1 = fitnet_loss(A_t=xe, A_s=ye, rand=False, noise=0.1) 161 | loss_ft_2 = fitnet_loss(A_t=x1e, A_s=y1e, rand=False, noise=0.1) 162 | loss_sp = (args.alpha_sp*(loss_kd_1 + loss_kd_2)) 163 | loss = loss_bl + loss_kd + loss_sp 164 | elif args.kd_loss_2 == 'sp': 165 | loss_sp = ((args.alpha_sp)* similarity_preserving_loss_cd(xe, x1e, ye, y1e)) 166 | loss = loss_bl + loss_kd + loss_sp 167 | else: 168 | 169 | loss = loss_bl 170 | 171 | loss.backward() 172 | optimizer.step() 173 | if scheduler is not None: 174 | scheduler.step() 175 | if i % 50 == 0: 176 | 177 | log("Batch {}/{}. Loss: {}. Loss_bl: {}. Loss_kd: {}. Loss_sp{}. 
Time elapsed: {} ".format(i, len(train_loader), loss.item(),loss_bl.item(),loss_kd.item() 178 | ,loss_sp.item(),datetime.now() - args.start_time)) 179 | total_loss += loss.item() 180 | total_loss_bl += loss_bl.item() 181 | total_loss_kd += loss_kd.item() 182 | total_loss_sp += loss_sp.item() 183 | 184 | return total_loss, total_loss_bl, total_loss_kd, total_loss_sp, loss_per_criterion 185 | 186 | 187 | 188 | 189 | 190 | def trainSSL(args, model, optimizer, criteria, writer, scheduler=None): 191 | """ 192 | Train a SSL model 193 | """ 194 | if not args.visualize_heatmap : 195 | model.train() 196 | # Data parallel Functionality 197 | if torch.cuda.device_count() > 1: 198 | model = nn.DataParallel(model) 199 | log('Model converted to DataParallel model with {} cuda devices'.format(torch.cuda.device_count())) 200 | model = model.to(args.device) 201 | 202 | train_loader = trainloaderSimCLR(args) 203 | 204 | for epoch in tqdm(range(1, args.ssl_epochs + 1)): 205 | total_loss, total_loss_bl, total_loss_kd,total_loss_sp, loss_per_criterion = train_one_epoch(args, train_loader, model, criteria, optimizer, scheduler, epoch) 206 | 207 | write_scalar(writer, total_loss,total_loss_bl, total_loss_kd,total_loss_sp, loss_per_criterion, len(train_loader), epoch) 208 | log("Epoch {}/{}. Total Loss: {}. Time elapsed: {} ". 209 | format(epoch, args.ssl_epochs, total_loss / len(train_loader), total_loss_bl / len(train_loader),total_loss_kd / len(train_loader), datetime.now() - args.start_time)) 210 | 211 | 212 | # Save checkpoint after every epoch 213 | path = save_checkpoint(state_dict=model.state_dict(), args=args, epoch=epoch, filename='checkpoint.pth'.format(epoch)) 214 | if os.path.exists: 215 | state_dict = torch.load(path, map_location=args.device) 216 | model.load_state_dict(state_dict) 217 | 218 | # Save the model at specific checkpoints 219 | if epoch % 10 == 0: 220 | 221 | if torch.cuda.device_count() > 1: 222 | save_checkpoint(state_dict=model.module.state_dict(), args=args, epoch=epoch, 223 | filename='checkpoint_model_{}_model1.pth'.format(epoch)) 224 | 225 | else: 226 | save_checkpoint(state_dict=model.state_dict(), args=args, epoch=epoch, 227 | filename='checkpoint_model_{}_model1.pth'.format(epoch)) 228 | 229 | log("Total training time {}".format(datetime.now() - args.start_time)) 230 | 231 | 232 | writer.close() 233 | -------------------------------------------------------------------------------- /DSP/util/transforms.py: -------------------------------------------------------------------------------- 1 | from __future__ import division 2 | import torch 3 | import math 4 | import random 5 | from PIL import Image, ImageOps 6 | try: 7 | import accimage 8 | except ImportError: 9 | accimage = None 10 | import numpy as np 11 | import numbers 12 | import types 13 | import collections 14 | from torch import nn 15 | 16 | 17 | class Compose(object): 18 | """Composes several transforms together. 19 | 20 | Args: 21 | transforms (list of ``Transform`` objects): list of transforms to compose. 22 | 23 | Example: 24 | >>> transforms.Compose([ 25 | >>> transforms.CenterCrop(10), 26 | >>> transforms.ToTensor(), 27 | >>> ]) 28 | """ 29 | 30 | def __init__(self, transforms): 31 | self.transforms = transforms 32 | 33 | def __call__(self, img): 34 | for t in self.transforms: 35 | img = t(img) 36 | return img 37 | 38 | 39 | class ToTensor(object): 40 | """Convert a ``PIL.Image`` or ``numpy.ndarray`` to tensor. 
41 | 42 | Converts a PIL.Image or numpy.ndarray (H x W x C) in the range 43 | [0, 255] to a torch.FloatTensor of shape (C x H x W) in the range [0.0, 1.0]. 44 | """ 45 | 46 | def __call__(self, pic): 47 | """ 48 | Args: 49 | pic (PIL.Image or numpy.ndarray): Image to be converted to tensor. 50 | 51 | Returns: 52 | Tensor: Converted image. 53 | """ 54 | if isinstance(pic, np.ndarray): 55 | # handle numpy array 56 | img = torch.from_numpy(pic.transpose((2, 0, 1))) 57 | # backward compatibility 58 | return img.float().div(255) 59 | 60 | if accimage is not None and isinstance(pic, accimage.Image): 61 | nppic = np.zeros([pic.channels, pic.height, pic.width], dtype=np.float32) 62 | pic.copyto(nppic) 63 | return torch.from_numpy(nppic) 64 | 65 | # handle PIL Image 66 | if pic.mode == 'I': 67 | img = torch.from_numpy(np.array(pic, np.int32, copy=False)) 68 | elif pic.mode == 'I;16': 69 | img = torch.from_numpy(np.array(pic, np.int16, copy=False)) 70 | else: 71 | img = torch.ByteTensor(torch.ByteStorage.from_buffer(pic.tobytes())) 72 | # PIL image mode: 1, L, P, I, F, RGB, YCbCr, RGBA, CMYK 73 | if pic.mode == 'YCbCr': 74 | nchannel = 3 75 | elif pic.mode == 'I;16': 76 | nchannel = 1 77 | else: 78 | nchannel = len(pic.mode) 79 | img = img.view(pic.size[1], pic.size[0], nchannel) 80 | # put it from HWC to CHW format 81 | # yikes, this transpose takes 80% of the loading time/CPU 82 | img = img.transpose(0, 1).transpose(0, 2).contiguous() 83 | if isinstance(img, torch.ByteTensor): 84 | return img.float().div(255) 85 | else: 86 | return img 87 | 88 | 89 | class ToPILImage(object): 90 | """Convert a tensor to PIL Image. 91 | 92 | Converts a torch.*Tensor of shape C x H x W or a numpy ndarray of shape 93 | H x W x C to a PIL.Image while preserving the value range. 94 | """ 95 | 96 | def __call__(self, pic): 97 | """ 98 | Args: 99 | pic (Tensor or numpy.ndarray): Image to be converted to PIL.Image. 100 | 101 | Returns: 102 | PIL.Image: Image converted to PIL.Image. 103 | 104 | """ 105 | npimg = pic 106 | mode = None 107 | if isinstance(pic, torch.FloatTensor): 108 | pic = pic.mul(255).byte() 109 | if torch.is_tensor(pic): 110 | npimg = np.transpose(pic.numpy(), (1, 2, 0)) 111 | assert isinstance(npimg, np.ndarray), 'pic should be Tensor or ndarray' 112 | if npimg.shape[2] == 1: 113 | npimg = npimg[:, :, 0] 114 | 115 | if npimg.dtype == np.uint8: 116 | mode = 'L' 117 | if npimg.dtype == np.int16: 118 | mode = 'I;16' 119 | if npimg.dtype == np.int32: 120 | mode = 'I' 121 | elif npimg.dtype == np.float32: 122 | mode = 'F' 123 | else: 124 | if npimg.dtype == np.uint8: 125 | mode = 'RGB' 126 | assert mode is not None, '{} is not supported'.format(npimg.dtype) 127 | return Image.fromarray(npimg, mode=mode) 128 | 129 | 130 | class Normalize(object): 131 | """Normalize an tensor image with mean and standard deviation. 132 | 133 | Given mean: (R, G, B) and std: (R, G, B), 134 | will normalize each channel of the torch.*Tensor, i.e. 135 | channel = (channel - mean) / std 136 | 137 | Args: 138 | mean (sequence): Sequence of means for R, G, B channels respecitvely. 139 | std (sequence): Sequence of standard deviations for R, G, B channels 140 | respecitvely. 141 | """ 142 | 143 | def __init__(self, mean, std): 144 | self.mean = mean 145 | self.std = std 146 | 147 | def __call__(self, tensor): 148 | """ 149 | Args: 150 | tensor (Tensor): Tensor image of size (C, H, W) to be normalized. 151 | 152 | Returns: 153 | Tensor: Normalized image. 
154 | """ 155 | # TODO: make efficient 156 | for t, m, s in zip(tensor, self.mean, self.std): 157 | t.sub_(m).div_(s) 158 | return tensor 159 | 160 | 161 | class Scale(object): 162 | """Rescale the input PIL.Image to the given size. 163 | 164 | Args: 165 | size (sequence or int): Desired output size. If size is a sequence like 166 | (w, h), output size will be matched to this. If size is an int, 167 | smaller edge of the image will be matched to this number. 168 | i.e, if height > width, then image will be rescaled to 169 | (size * height / width, size) 170 | interpolation (int, optional): Desired interpolation. Default is 171 | ``PIL.Image.BILINEAR`` 172 | """ 173 | 174 | def __init__(self, size, interpolation=Image.BILINEAR): 175 | assert isinstance(size, int) or (isinstance(size, collections.Iterable) and len(size) == 2) 176 | self.size = size 177 | self.interpolation = interpolation 178 | 179 | def __call__(self, img): 180 | """ 181 | Args: 182 | img (PIL.Image): Image to be scaled. 183 | 184 | Returns: 185 | PIL.Image: Rescaled image. 186 | """ 187 | if isinstance(self.size, int): 188 | w, h = img.size 189 | if (w <= h and w == self.size) or (h <= w and h == self.size): 190 | return img 191 | if w < h: 192 | ow = self.size 193 | oh = int(self.size * h / w) 194 | return img.resize((ow, oh), self.interpolation) 195 | else: 196 | oh = self.size 197 | ow = int(self.size * w / h) 198 | return img.resize((ow, oh), self.interpolation) 199 | else: 200 | return img.resize(self.size, self.interpolation) 201 | 202 | 203 | class CenterCrop(object): 204 | """Crops the given PIL.Image at the center. 205 | 206 | Args: 207 | size (sequence or int): Desired output size of the crop. If size is an 208 | int instead of sequence like (h, w), a square crop (size, size) is 209 | made. 210 | """ 211 | 212 | def __init__(self, size): 213 | if isinstance(size, numbers.Number): 214 | self.size = (int(size), int(size)) 215 | else: 216 | self.size = size 217 | 218 | def __call__(self, img): 219 | """ 220 | Args: 221 | img (PIL.Image): Image to be cropped. 222 | 223 | Returns: 224 | PIL.Image: Cropped image. 225 | """ 226 | w, h = img.size 227 | th, tw = self.size 228 | x1 = int(round((w - tw) / 2.)) 229 | y1 = int(round((h - th) / 2.)) 230 | return img.crop((x1, y1, x1 + tw, y1 + th)) 231 | 232 | 233 | class Pad(object): 234 | """Pad the given PIL.Image on all sides with the given "pad" value. 235 | 236 | Args: 237 | padding (int or sequence): Padding on each border. If a sequence of 238 | length 4, it is used to pad left, top, right and bottom borders respectively. 239 | fill: Pixel fill value. Default is 0. 240 | """ 241 | 242 | def __init__(self, padding, fill=0): 243 | assert isinstance(padding, numbers.Number) 244 | assert isinstance(fill, numbers.Number) or isinstance(fill, str) or isinstance(fill, tuple) 245 | self.padding = padding 246 | self.fill = fill 247 | 248 | def __call__(self, img): 249 | """ 250 | Args: 251 | img (PIL.Image): Image to be padded. 252 | 253 | Returns: 254 | PIL.Image: Padded image. 255 | """ 256 | return ImageOps.expand(img, border=self.padding, fill=self.fill) 257 | 258 | 259 | class Lambda(object): 260 | """Apply a user-defined lambda as a transform. 261 | 262 | Args: 263 | lambd (function): Lambda/function to be used for transform. 
264 | """ 265 | 266 | def __init__(self, lambd): 267 | assert isinstance(lambd, types.LambdaType) 268 | self.lambd = lambd 269 | 270 | def __call__(self, img): 271 | return self.lambd(img) 272 | 273 | 274 | class RandomCrop(object): 275 | """Crop the given PIL.Image at a random location. 276 | 277 | Args: 278 | size (sequence or int): Desired output size of the crop. If size is an 279 | int instead of sequence like (h, w), a square crop (size, size) is 280 | made. 281 | padding (int or sequence, optional): Optional padding on each border 282 | of the image. Default is 0, i.e no padding. If a sequence of length 283 | 4 is provided, it is used to pad left, top, right, bottom borders 284 | respectively. 285 | """ 286 | 287 | def __init__(self, size, padding=0): 288 | if isinstance(size, numbers.Number): 289 | self.size = (int(size), int(size)) 290 | else: 291 | self.size = size 292 | self.padding = padding 293 | 294 | def __call__(self, img): 295 | """ 296 | Args: 297 | img (PIL.Image): Image to be cropped. 298 | 299 | Returns: 300 | PIL.Image: Cropped image. 301 | """ 302 | if self.padding > 0: 303 | img = ImageOps.expand(img, border=self.padding, fill=0) 304 | 305 | w, h = img.size 306 | th, tw = self.size 307 | if w == tw and h == th: 308 | return img 309 | 310 | if w < tw or h < th: 311 | return img.resize((tw, th), Image.BILINEAR) 312 | 313 | x1 = random.randint(0, w - tw) 314 | y1 = random.randint(0, h - th) 315 | return img.crop((x1, y1, x1 + tw, y1 + th)) 316 | 317 | 318 | class RandomHorizontalFlip(object): 319 | """Horizontally flip the given PIL.Image randomly with a probability of 0.5.""" 320 | 321 | def __call__(self, img): 322 | """ 323 | Args: 324 | img (PIL.Image): Image to be flipped. 325 | 326 | Returns: 327 | PIL.Image: Randomly flipped image. 328 | """ 329 | if random.random() < 0.5: 330 | return img.transpose(Image.FLIP_LEFT_RIGHT) 331 | return img 332 | 333 | 334 | 335 | class RandomChoice(nn.Module): 336 | def __init__(self, transforms): 337 | super().__init__() 338 | self.transforms = transforms 339 | 340 | def __call__(self, imgs): 341 | t = random.choice(self.transforms) 342 | return [t(img) for img in imgs] 343 | 344 | class RandomSizedCrop(object): 345 | """Crop the given PIL.Image to random size and aspect ratio. 346 | 347 | A crop of random size of (0.08 to 1.0) of the original size and a random 348 | aspect ratio of 3/4 to 4/3 of the original aspect ratio is made. This crop 349 | is finally resized to given size. 350 | This is popularly used to train the Inception networks. 351 | 352 | Args: 353 | size: size of the smaller edge 354 | interpolation: Default: PIL.Image.BILINEAR 355 | """ 356 | 357 | def __init__(self, size, interpolation=Image.BILINEAR): 358 | self.size = size 359 | self.interpolation = interpolation 360 | 361 | def __call__(self, img): 362 | for attempt in range(10): 363 | area = img.size[0] * img.size[1] 364 | target_area = random.uniform(0.08, 1.0) * area 365 | aspect_ratio = random.uniform(3. / 4, 4. 
/ 3) 366 | 367 | w = int(round(math.sqrt(target_area * aspect_ratio))) 368 | h = int(round(math.sqrt(target_area / aspect_ratio))) 369 | 370 | if random.random() < 0.5: 371 | w, h = h, w 372 | 373 | if w <= img.size[0] and h <= img.size[1]: 374 | x1 = random.randint(0, img.size[0] - w) 375 | y1 = random.randint(0, img.size[1] - h) 376 | 377 | img = img.crop((x1, y1, x1 + w, y1 + h)) 378 | assert(img.size == (w, h)) 379 | 380 | return img.resize((self.size, self.size), Image.BILINEAR) 381 | 382 | # Fallback 383 | scale = Scale(self.size, interpolation=self.interpolation) 384 | crop = CenterCrop(self.size) 385 | return crop(scale(img)) 386 | 387 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2022 NeurAI 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Differencing based Self-supervised pretraining for scene change detection (DSP) 2 | 3 | 4 | **This is the official code for COLLA 2022 Paper, ["Differencing based Self-supervised pretraining for scene change detection"](https://proceedings.mlr.press/v199/ramkumar22a.html) by [Vijaya Raghavan Thiruvengadathan Ramkumar](https://www.linkedin.com/in/vijayaraghavan95), [Elahe Arani](https://www.linkedin.com/in/elahe-arani-630870b2/) and [Bahram Zonooz](https://www.linkedin.com/in/bahram-zonooz-2b5589156/), where we propose a novel self-supervised pretraining architecture based on differencing, called DSP, for scene change detection.** 5 | 6 | ## Abstract 7 | 8 | 9 | Scene change detection (SCD), a crucial perception task, identifies changes by comparing scenes captured at different times. SCD is challenging due to noisy changes in illumination, seasonal variations, and perspective differences across a pair of views. Deep neural network-based solutions require a large quantity of annotated data, which is tedious and expensive to obtain. On the other hand, transfer learning from large datasets induces domain shift.
To address these challenges, we propose a novel Differencing self-supervised pretraining (DSP) method that uses feature differencing to learn discriminatory representations corresponding to the changed regions while simultaneously tackling the noisy changes by enforcing temporal invariance across views. Our experimental results on SCD datasets demonstrate the effectiveness of our method, specifically its robustness to differences in camera viewpoints and lighting conditions. Compared against the self-supervised Barlow Twins and the standard ImageNet pretraining that uses more than a million additional labeled images, DSP can surpass both without using any additional data. Our results also demonstrate the robustness of DSP to natural corruptions, distribution shift, and learning under limited labeled data. 10 | 11 | ![alt text](https://github.com/NeurAI-Lab/DSP/blob/main/method.png) 12 | 13 | For more details, please see the [Paper](https://arxiv.org/abs/2208.05838) and [Presentation](https://www.youtube.com/watch?v=kWUxxC5hjKw). 14 | 15 | ## Requirements 16 | 17 | - python 3.6+ 18 | - opencv 3.4.2+ 19 | - pytorch 1.6.0 20 | - torchvision 0.4.0+ 21 | - tqdm 4.51.0 22 | - tensorboardX 2.1 23 | 24 | ## Datasets 25 | 26 | Our network is tested on two datasets for street-view scene change detection. 27 | 28 | - 'PCD' dataset from [Change detection from a street image pair using CNN features and superpixel segmentation](http://www.vision.is.tohoku.ac.jp/files/9814/3947/4830/71-Sakurada-BMVC15.pdf). 29 | - You can find the information about how to get 'TSUNAMI', 'GSV' and preprocessed datasets for training and test [here](https://kensakurada.github.io/pcd_dataset.html). 30 | - 'VL-CMU-CD' dataset from [Street-View Change Detection with Deconvolutional Networks](http://www.robesafe.com/personal/roberto.arroyo/docs/Alcantarilla16rss.pdf). 31 | - 'VL-CMU-CD': [[googledrive]](https://drive.google.com/file/d/0B-IG2NONFdciOWY5QkQ3OUgwejQ/view?resourcekey=0-rEzCjPFmDFjt4UMWamV4Eg) 32 | 33 | ## Dataset Preprocessing 34 | 35 | - For DSP pretraining - handled in DSP/dataset/CMU.py and DSP/dataset/PCD.py 36 | - For finetuning and evaluation - Please follow the preprocessing method used by the official implementation of [{Dynamic Receptive Temporal Attention Network for Street Scene Change Detection paper}](https://github.com/Herrccc/DR-TANet) 37 | 38 | Dataset folder structure for VL-CMU-CD: 39 | ```bash 40 | ├── VL-CMU-CD 41 | │ ├── Image_T0 42 | │ ├── Image_T1 43 | │ ├── Ground Truth 44 | 45 | ``` 46 | 47 | ## SSL Training 48 | 49 | 50 | - For training 'DSP' on VL-CMU-CD dataset: 51 | ``` 52 | python3 DSP/train.py --ssl_batchsize 16 --ssl_epochs 500 --save_dir /outputs --data_dir /path/to/VL-CMU-CD --img_size 256 --n_proj 256 --hidden_layer 512 --output_stride 8 --pre_train False --m_backbone False --barlow_twins True --dense_cl False --kd_loss True --kd_loss_2 sp --inter_kl False --alpha_inter_kd 0 --alpha_sp 3000 --alpha_kl 100 53 | ``` 54 | 55 | 56 | ## Fine Tuning 57 | 58 | We evaluate random initialization, ImageNet-supervised, Barlow Twins, and DSP pretraining on DR-TANet. 59 | - Please follow the train and test procedure used by the official implementation of [{Dynamic Receptive Temporal Attention Network for Street Scene Change Detection paper}](https://github.com/Herrccc/DR-TANet) 60 | 61 | Start training with DR-TANet on 'VL-CMU-CD' dataset.
62 | 63 | python3 train.py --dataset vl_cmu_cd --datadir /path_to_dataset --checkpointdir /path_to_check_point_directory --max-epochs 150 --batch-size 16 --encoder-arch resnet50 --epoch-save 25 --drtam --refinement 64 | 65 | Start evaluating with DR-TANet on 'PCD' dataset. 66 | 67 | python3 eval.py --dataset pcd --datadir /path_to_dataset --checkpointdir /path_to_check_point_directory --resultdir /path_to_save_eval_result --encoder-arch resnet50 --drtam --refinement --store-imgs 68 | 69 | ## Evaluating the finetuned model 70 | 71 | Start evaluating with DR-TANet on 'PCD' dataset. 72 | 73 | python3 eval.py --dataset pcd --datadir /path_to_dataset --checkpointdir /path_to_check_point_directory --resultdir /path_to_save_eval_result --encoder-arch resnet18 --drtam --refinement --store-imgs 74 | 75 | ## Analysis 76 | We analyse our DSP model under 3 scenarios: **1. Robustness to Natural corruptions 2. Out-of-distribution data 3. Limited labeled data. For more details, please see the [Paper](https://arxiv.org/abs/2208.05838).** 77 | For Natural corruptions evaluation, please refer to the paper [{Benchmarking Neural Network Robustness to 78 | Common Corruptions and Surface Variations }](https://arxiv.org/pdf/1807.01697.pdf) 79 | 80 | And finally, for the ease of comparison, we have provided the model checkpoints for the DSP pretraining below: [google drive](https://drive.google.com/drive/folders/1UwFQ7NjXRwyfgfhFnX6_CPTm8hQ8AoFF?usp=sharing) 81 | 82 | 83 | ## Cite our work 84 | 85 | If you find the code useful in your research, please consider citing our paper: 86 | 87 |
88 | @inproceedings{ramkumar2022differencing,
89 |   title={Differencing based Self-supervised pretraining for Scene Change Detection},
90 |   author={Ramkumar, Vijaya Raghavan T and Arani, Elahe and Zonooz, Bahram},
91 |   booktitle={Conference on Lifelong Learning Agents},
92 |   pages={952--965},
93 |   year={2022},
94 |   organization={PMLR}
95 | }
96 | 
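
A note for readers scanning the dump: the DSP objective described in the README abstract is what `train_one_epoch` in `DSP/util/train_util.py` implements, i.e. a redundancy-reduction criterion applied to `torch.abs(zx - zx1)`-style difference features taken across the temporal pair (t0, t1) rather than to the raw projections. The snippet below is only a minimal, self-contained sketch of that idea; it is not the repository's `BarlowTwinsLoss_CD`, it keeps two of the four difference features used in the actual training loop, and the function names, the `lambd` weight, and the toy shapes are illustrative assumptions.

```python
import torch


def barlow_twins_style_loss(z_a: torch.Tensor, z_b: torch.Tensor, lambd: float = 5e-3) -> torch.Tensor:
    """Redundancy-reduction loss between two (N, D) batches of features."""
    n, d = z_a.shape
    # Standardise each feature dimension over the batch, as in Barlow Twins.
    z_a = (z_a - z_a.mean(0)) / (z_a.std(0) + 1e-6)
    z_b = (z_b - z_b.mean(0)) / (z_b.std(0) + 1e-6)
    c = (z_a.T @ z_b) / n                           # (D, D) cross-correlation matrix
    identity = torch.eye(d, device=c.device)
    on_diag = ((torch.diagonal(c) - 1) ** 2).sum()  # pull correlations of matching dims towards 1
    off_diag = ((c * (1 - identity)) ** 2).sum()    # decorrelate the remaining dims
    return on_diag + lambd * off_diag


def dsp_style_loss(z_t0_a, z_t0_b, z_t1_a, z_t1_b):
    """Apply the criterion to |t0 - t1| difference features instead of raw projections,
    mirroring the diff_feat* tensors built in train_one_epoch above."""
    diff_a = torch.abs(z_t0_a - z_t1_a)             # change-sensitive features, view a
    diff_b = torch.abs(z_t0_b - z_t1_b)             # change-sensitive features, view b
    return barlow_twins_style_loss(diff_a, diff_b)


if __name__ == "__main__":
    # Toy projections: batch of 16, 256-dim, matching --ssl_batchsize 16 and --n_proj 256 above.
    z_t0_a, z_t0_b = torch.randn(16, 256), torch.randn(16, 256)
    z_t1_a, z_t1_b = torch.randn(16, 256), torch.randn(16, 256)
    print(dsp_style_loss(z_t0_a, z_t0_b, z_t1_a, z_t1_b).item())
```

In the repository the same structure is further combined with the JSD-based distillation and similarity-preserving terms (`loss = loss_bl + loss_kd + loss_sp` in `train_one_epoch`), weighted by the `--alpha_kl` and `--alpha_sp` flags shown in the SSL training command above.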


--------------------------------------------------------------------------------
/method.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/NeurAI-Lab/DSP/45027a3702696dafd7018802619dde17c6da1ca8/method.png


--------------------------------------------------------------------------------