├── DSP
    ├── DR-TANet-main
    │   ├── LICENSE
    │   ├── README.md
    │   ├── TANet.py
    │   ├── TANet_element.py
    │   ├── __pycache__
    │   │   ├── TANet.cpython-37.pyc
    │   │   ├── TANet_element.cpython-37.pyc
    │   │   ├── attention.cpython-37.pyc
    │   │   ├── datasets.cpython-37.pyc
    │   │   └── util.cpython-37.pyc
    │   ├── attention.py
    │   ├── data
    │   │   └── output
    │   │   │   └── vijaya.ramkumar
    │   │   │       └── sscdv2
    │   │   │           └── DR-tanet
    │   │   │               └── alpha_100
    │   │   │                   └── new_sp_3k_nod
    │   │   │                       └── DR-TANet_resnet50_ref
    │   │   │                           └── vl_cmu_cd
    │   │   │                               └── DR-TANet_resnet50_ref
    │   │   │                                   └── vl_cmu_cd
    │   │   │                                       └── eval_metrics(dataset).csv
    │   ├── datasets.py
    │   ├── eval.py
    │   ├── graph.py
    │   ├── img
    │   │   └── TANet_DR-TANet.png
    │   ├── split data.py
    │   ├── train.py
    │   └── util.py
    ├── config
    │   ├── __pycache__
    │   │   └── option.cpython-37.pyc
    │   └── option.py
    ├── criterion
    │   ├── __pycache__
    │   │   ├── ntxent.cpython-37.pyc
    │   │   └── sim_preserving_kd.cpython-37.pyc
    │   ├── ntxent.py
    │   └── sim_preserving_kd.py
    ├── dataset
    │   ├── CMU.py
    │   ├── PCD.py
    │   └── __pycache__
    │   │   ├── CMU.cpython-37.pyc
    │   │   └── PCD.cpython-37.pyc
    ├── linear.py
    ├── modeling
    │   ├── backbone
    │   │   ├── __init__.py
    │   │   ├── __pycache__
    │   │   │   ├── __init__.cpython-37.pyc
    │   │   │   ├── drn.cpython-37.pyc
    │   │   │   ├── mobilenet.cpython-37.pyc
    │   │   │   ├── resnet.cpython-37.pyc
    │   │   │   └── xception.cpython-37.pyc
    │   │   ├── data
    │   │   │   └── Digraph.gv
    │   │   ├── drn.py
    │   │   ├── mobilenet.py
    │   │   ├── resnet.py
    │   │   └── xception.py
    │   └── sync_batchnorm
    │   │   ├── __init__.py
    │   │   ├── __pycache__
    │   │       ├── __init__.cpython-37.pyc
    │   │       ├── batchnorm.cpython-37.pyc
    │   │       ├── comm.cpython-37.pyc
    │   │       └── replicate.cpython-37.pyc
    │   │   ├── batchnorm.py
    │   │   ├── comm.py
    │   │   ├── replicate.py
    │   │   └── unittest.py
    ├── models
    │   ├── __pycache__
    │   │   └── simclr.cpython-37.pyc
    │   └── simclr.py
    ├── mypath.py
    ├── optimizers
    │   ├── __pycache__
    │   │   └── lars.cpython-37.pyc
    │   └── lars.py
    ├── supervised.py
    ├── train.py
    ├── transforms
    │   ├── __pycache__
    │   │   └── simclr_transform.cpython-37.pyc
    │   └── simclr_transform.py
    └── util
    │   ├── COCO_loader
    │       ├── base_dataset.py
    │       ├── coco_uninet.py
    │       └── defaults.py
    │   ├── __pycache__
    │       ├── dist_util.cpython-37.pyc
    │       ├── test.cpython-37.pyc
    │       ├── torchlist.cpython-37.pyc
    │       ├── train_util.cpython-37.pyc
    │       ├── transforms.cpython-37.pyc
    │       └── utils.cpython-37.pyc
    │   ├── dist_util.py
    │   ├── test.py
    │   ├── torchlist.py
    │   ├── train_util.py
    │   ├── transforms.py
    │   └── utils.py
├── LICENSE
├── README.md
└── method.png


/DSP/DR-TANet-main/LICENSE:
--------------------------------------------------------------------------------
 1 | IT License
 2 | 
 3 | Copyright (c) 2021 Shuo Chen
 4 | 
 5 | Permission is hereby granted, free of charge, to any person obtaining a copy
 6 | of this software and associated documentation files (the "Software"), to deal
 7 | in the Software without restriction, including without limitation the rights
 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 | 
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 | 
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 | 


--------------------------------------------------------------------------------
/DSP/DR-TANet-main/README.md:
--------------------------------------------------------------------------------
 1 | # Dynamic Receptive Temporal Attention Network for Street Scene Change Detection
 2 | 
 3 | This is the official implementation of TANet and DR-TANet in "DR-TANet: Dynamic Receptive Temporal Attention Network for Street Scene Change Detection" (IEEE IV 2021). The preprint version is [here](https://arxiv.org/abs/2103.00879).
 4 | 
 5 | ![img1](https://github.com/Herrccc/DR-TANet/blob/main/img/TANet:DR-TANet.png)
 6 | 
 7 | ## Requirements
 8 | 
 9 | - python 3.7+
10 | - opencv 3.4.2+
11 | - pytorch 1.2.0+
12 | - torchvision 0.4.0+
13 | - tqdm 4.51.0
14 | - tensorboardX 2.1
15 | 
16 | ## Datasets
17 | 
18 | Our network is tested on two datasets for street-view scene change detection. 
19 | 
20 | - 'PCD' dataset from [Change detection from a street image pair using CNN features and superpixel segmentation](http://www.vision.is.tohoku.ac.jp/files/9814/3947/4830/71-Sakurada-BMVC15.pdf). 
21 |   - You can find the information about how to get 'TSUNAMI', 'GSV' and preprocessed datasets for training and test [here](https://kensakurada.github.io/pcd_dataset.html).
22 | - 'VL-CMU-CD' dataset from [Street-View Change Detection with Deconvolutional Networks](http://www.robesafe.com/personal/roberto.arroyo/docs/Alcantarilla16rss.pdf).
23 |   -  'VL-CMU-CD': [[googledrive]](https://drive.google.com/file/d/0B-IG2NONFdciOWY5QkQ3OUgwejQ/view?resourcekey=0-rEzCjPFmDFjt4UMWamV4Eg)
24 |   -  dataset for training and test in our work: [[googledrive]](https://drive.google.com/file/d/1GzQR9kQouH4_1PmFRTHl4dWTAzqz3ppH/view?usp=sharing)
25 | 
26 | ## Training
27 | 
28 | Start training with TANet on 'PCD' dataset.
29 | >The configurations for TANet
30 | >- local-kernel-size:1, attn-stride:1, attn-padding:0, attn-groups:4.
31 | >- local-kernel-size:3, attn-stride:1, attn-padding:1, attn-groups:4.
32 | >- local-kernel-size:5, attn-stride:1, attn-padding:2, attn-groups:4.
33 | >- local-kernel-size:7, attn-stride:1, attn-padding:3, attn-groups:4.
34 | 
35 |     python3 train.py --dataset pcd --datadir /path_to_dataset --checkpointdir /path_to_check_point_directory --max-epochs 100 --batch-size 16 --encoder-arch resnet18 --local-kernel-size 1
36 | 
37 | Start training with DR-TANet on 'VL-CMU-CD' dataset.
38 | 
39 |     python3 train.py --dataset vl_cmu_cd --datadir /path_to_dataset --checkpointdir /path_to_check_point_directory --max-epochs 150 --batch-size 16 --encoder-arch resnet18 --epoch-save 25 --drtam --refinement
40 | 
41 | ## Evaluating
42 | 
43 | Start evaluating with DR-TANet on 'PCD' dataset.
44 | 
45 |     python3 eval.py --dataset pcd --datadir /path_to_dataset --checkpointdir /path_to_check_point_directory --resultdir /path_to_save_eval_result --encoder-arch resnet18 --drtam --refinement --store-imgs
46 |   
47 | 


--------------------------------------------------------------------------------
/DSP/DR-TANet-main/TANet.py:
--------------------------------------------------------------------------------
 1 | import torch
 2 | import torch.nn as nn
 3 | from util import upsample
 4 | from TANet_element import *
 5 | 
 6 | class TANet(nn.Module):
 7 | 
 8 |     def __init__(self, encoder_arch, local_kernel_size, stride, padding, groups, drtam, refinement, pretrain,sslpretrain, ssl_path):
 9 |         super(TANet, self).__init__()
10 | 
11 |         self.encoder1, channels = get_encoder(encoder_arch,ssl_path, pretrained=pretrain, sslpretrain= False)
12 |         self.encoder2, _ = get_encoder(encoder_arch,ssl_path,  pretrained=pretrain, sslpretrain= False)
13 |         self.attention_module = get_attentionmodule(local_kernel_size, stride, padding, groups, drtam, refinement, channels)
14 |         self.decoder = get_decoder(channels=channels)
15 |         self.classifier = nn.Conv2d(channels[0], 2, 1, padding=0, stride=1)
16 |         self.bn = nn.BatchNorm2d(channels[0])
17 |         self.relu = nn.ReLU(inplace=True)
18 | 
19 |     def forward(self, img):
20 | 
21 |         img_t0,img_t1 = torch.split(img,3,1)
22 |         features_t0 = self.encoder1(img_t0)
23 |         features_t1 = self.encoder2(img_t1)
24 |         features = features_t0 + features_t1
25 |         features_map = self.attention_module(features)
26 |         pred_ = self.decoder(features_map)
27 |         pred_ = upsample(pred_,[pred_.size()[2]*2, pred_.size()[3]*2])
28 |         pred_ = self.bn(pred_)
29 |         pred_ = upsample(pred_,[pred_.size()[2]*2, pred_.size()[3]*2])
30 |         pred_ = self.relu(pred_)
31 |         pred = self.classifier(pred_)
32 | 
33 |         return pred
34 | 
35 | 
36 | 


--------------------------------------------------------------------------------
/DSP/DR-TANet-main/__pycache__/TANet.cpython-37.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/NeurAI-Lab/DSP/45027a3702696dafd7018802619dde17c6da1ca8/DSP/DR-TANet-main/__pycache__/TANet.cpython-37.pyc


--------------------------------------------------------------------------------
/DSP/DR-TANet-main/__pycache__/TANet_element.cpython-37.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/NeurAI-Lab/DSP/45027a3702696dafd7018802619dde17c6da1ca8/DSP/DR-TANet-main/__pycache__/TANet_element.cpython-37.pyc


--------------------------------------------------------------------------------
/DSP/DR-TANet-main/__pycache__/attention.cpython-37.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/NeurAI-Lab/DSP/45027a3702696dafd7018802619dde17c6da1ca8/DSP/DR-TANet-main/__pycache__/attention.cpython-37.pyc


--------------------------------------------------------------------------------
/DSP/DR-TANet-main/__pycache__/datasets.cpython-37.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/NeurAI-Lab/DSP/45027a3702696dafd7018802619dde17c6da1ca8/DSP/DR-TANet-main/__pycache__/datasets.cpython-37.pyc


--------------------------------------------------------------------------------
/DSP/DR-TANet-main/__pycache__/util.cpython-37.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/NeurAI-Lab/DSP/45027a3702696dafd7018802619dde17c6da1ca8/DSP/DR-TANet-main/__pycache__/util.cpython-37.pyc


--------------------------------------------------------------------------------
/DSP/DR-TANet-main/attention.py:
--------------------------------------------------------------------------------
  1 | import torch
  2 | import torch.nn as nn
  3 | import torch.nn.functional as F
  4 | import torch.nn.init as init
  5 | 
  6 | 
  7 | class Temporal_Attention(nn.Module):
  8 |     def __init__(self, in_channels, out_channels, kernel_size=1, stride=1, padding=0,
  9 |                  groups=1, bias=False, refinement=False):
 10 |         super(Temporal_Attention, self).__init__()
 11 |         self.outc = out_channels
 12 |         self.kernel_size = kernel_size
 13 |         self.stride = stride
 14 |         self.padding = padding
 15 |         self.groups = groups
 16 |         self.refinement = refinement
 17 | 
 18 |         print('Attention Layer-kernel size:{0},stride:{1},padding:{2},groups:{3}...'.format(self.kernel_size,self.stride,self.padding,self.groups))
 19 |         if self.refinement:
 20 |             print("Attention with refinement...")
 21 | 
 22 |         assert self.outc % self.groups == 0, 'out_channels should be divided by groups.'
 23 | 
 24 |         self.w_q = nn.Conv2d(in_channels, out_channels, kernel_size=1, bias=bias)
 25 |         self.w_k = nn.Conv2d(in_channels, out_channels, kernel_size=1, bias=bias)
 26 |         self.w_v = nn.Conv2d(in_channels, out_channels, kernel_size=1, bias=bias)
 27 | 
 28 | 
 29 |         #relative positional encoding...
 30 |         self.rel_h = nn.Parameter(torch.randn(self.outc // 2, 1, 1, self.kernel_size, 1), requires_grad = True)
 31 |         self.rel_w = nn.Parameter(torch.randn(self.outc // 2, 1, 1, 1, self.kernel_size), requires_grad = True)
 32 |         init.normal_(self.rel_h, 0, 1)
 33 |         init.normal_(self.rel_w, 0, 1)
 34 | 
 35 | 
 36 |         init.kaiming_normal_(self.w_q.weight, mode='fan_out', nonlinearity='relu')
 37 |         init.kaiming_normal_(self.w_k.weight, mode='fan_out', nonlinearity='relu')
 38 |         init.kaiming_normal_(self.w_v.weight, mode='fan_out', nonlinearity='relu')
 39 | 
 40 | 
 41 |     def forward(self, feature_map):
 42 | 
 43 |         fm_t0, fm_t1 = torch.split(feature_map, feature_map.size()[1]//2, 1)
 44 |         assert fm_t0.size() == fm_t1.size(), 'The size of feature maps of image t0 and t1 should be same.'
 45 | 
 46 |         batch, _, h, w = fm_t0.size()
 47 | 
 48 | 
 49 |         padded_fm_t0 = F.pad(fm_t0, [self.padding, self.padding, self.padding, self.padding])
 50 |         q_out = self.w_q(fm_t1)
 51 |         k_out = self.w_k(padded_fm_t0)
 52 |         v_out = self.w_v(padded_fm_t0)
 53 | 
 54 |         if self.refinement:
 55 | 
 56 |             padding = self.kernel_size
 57 |             padded_fm_col = F.pad(fm_t0, [0, 0, padding, padding])
 58 |             padded_fm_row = F.pad(fm_t0, [padding, padding, 0, 0])
 59 |             k_out_col = self.w_k(padded_fm_col)
 60 |             k_out_row = self.w_k(padded_fm_row)
 61 |             v_out_col = self.w_v(padded_fm_col)
 62 |             v_out_row = self.w_v(padded_fm_row)
 63 | 
 64 |             k_out_col = k_out_col.unfold(2, self.kernel_size * 2 + 1, self.stride)
 65 |             k_out_row = k_out_row.unfold(3, self.kernel_size * 2 + 1, self.stride)
 66 |             v_out_col = v_out_col.unfold(2, self.kernel_size * 2 + 1, self.stride)
 67 |             v_out_row = v_out_row.unfold(3, self.kernel_size * 2 + 1, self.stride)
 68 | 
 69 | 
 70 |         q_out_base = q_out.view(batch, self.groups, self.outc // self.groups, h, w, 1).repeat(1, 1, 1, 1, 1, self.kernel_size*self.kernel_size)
 71 |         q_out_ref = q_out.view(batch, self.groups, self.outc // self.groups, h, w, 1).repeat(1, 1, 1, 1, 1, self.kernel_size * 2 + 1)
 72 |         
 73 |         k_out = k_out.unfold(2, self.kernel_size, self.stride).unfold(3, self.kernel_size, self.stride)
 74 | 
 75 |         k_out_h, k_out_w = k_out.split(self.outc // 2, dim=1)
 76 |         k_out = torch.cat((k_out_h + self.rel_h, k_out_w + self.rel_w), dim=1)
 77 | 
 78 |         k_out = k_out.contiguous().view(batch, self.groups, self.outc // self.groups, h, w, -1)
 79 | 
 80 |         v_out = v_out.unfold(2, self.kernel_size, self.stride).unfold(3, self.kernel_size, self.stride)
 81 |         v_out = v_out.contiguous().view(batch, self.groups, self.outc // self.groups, h, w, -1)
 82 | 
 83 |         inter_out = (q_out_base * k_out).sum(dim=2)
 84 | 
 85 |         out = F.softmax(inter_out, dim=-1)
 86 |         out = torch.einsum('bnhwk,bnchwk -> bnchw', out, v_out).contiguous().view(batch, -1, h, w)
 87 | 
 88 |         if self.refinement:
 89 | 
 90 |             k_out_row = k_out_row.contiguous().view(batch, self.groups, self.outc // self.groups, h, w, -1)
 91 |             k_out_col = k_out_col.contiguous().view(batch, self.groups, self.outc // self.groups, h, w, -1)
 92 |             v_out_row = v_out_row.contiguous().view(batch, self.groups, self.outc // self.groups, h, w, -1)
 93 |             v_out_col = v_out_col.contiguous().view(batch, self.groups, self.outc // self.groups, h, w, -1)
 94 | 
 95 |             out_row = F.softmax((q_out_ref * k_out_row).sum(dim=2),dim=-1)
 96 |             out_col = F.softmax((q_out_ref * k_out_col).sum(dim=2),dim=-1)
 97 |             out += torch.einsum('bnhwk,bnchwk -> bnchw', out_row, v_out_row).contiguous().view(batch, -1, h, w)
 98 |             out += torch.einsum('bnhwk,bnchwk -> bnchw', out_col, v_out_col).contiguous().view(batch, -1, h, w)
 99 | 
100 |         return out
101 | 
102 | 
103 | 
104 | 
105 | 
106 | 
107 | 
108 | 


--------------------------------------------------------------------------------
/DSP/DR-TANet-main/data/output/vijaya.ramkumar/sscdv2/DR-tanet/alpha_100/new_sp_3k_nod/DR-TANet_resnet50_ref/vl_cmu_cd/DR-TANet_resnet50_ref/vl_cmu_cd/eval_metrics(dataset).csv:
--------------------------------------------------------------------------------
1 | set,ds_name,precision,recall,accuracy,f1-score
2 | 


--------------------------------------------------------------------------------
/DSP/DR-TANet-main/datasets.py:
--------------------------------------------------------------------------------
  1 | import os
  2 | import cv2
  3 | import torch
  4 | import numpy as np
  5 | from torch.utils.data import Dataset
  6 | from os.path import join as pjoin, splitext as spt
  7 | import argparse
  8 | 
  9 | def check_validness(f):
 10 |     return any([i in spt(f)[1] for i in ['jpg','png']])
 11 | 
 12 | class pcd(Dataset):
 13 | 
 14 |     def __init__(self,root):
 15 |         super(pcd, self).__init__()
 16 |         self.img_t0_root = pjoin(root,'t0')
 17 |         self.img_t1_root = pjoin(root,'t1')
 18 |         self.img_mask_root = pjoin(root,'mask')
 19 |         self.filename = list(spt(f)[0] for f in os.listdir(self.img_mask_root) if check_validness(f))
 20 |         self.filename.sort()
 21 |         
 22 |     def __getitem__(self, index):
 23 |         
 24 |         fn = self.filename[index]
 25 |         fn_t0 = pjoin(self.img_t0_root,fn+'.jpg')
 26 |         fn_t1 = pjoin(self.img_t1_root,fn+'.jpg')
 27 |         fn_mask = pjoin(self.img_mask_root,fn+'.png')
 28 | 
 29 |         if os.path.isfile(fn_t0) == False:
 30 |             print('Error: File Not Found: ' + fn_t0)
 31 |             exit(-1)
 32 |         if os.path.isfile(fn_t1) == False:
 33 |             print('Error: File Not Found: ' + fn_t1)
 34 |             exit(-1)
 35 |         if os.path.isfile(fn_mask) == False:
 36 |             print('Error: File Not Found: ' + fn_mask)
 37 |             exit(-1)
 38 |         
 39 |         img_t0 = cv2.imread(fn_t0, 1)
 40 |         img_t1 = cv2.imread(fn_t1, 1)
 41 |         mask = cv2.imread(fn_mask, 0)
 42 | 
 43 |         w, h, c = img_t0.shape
 44 |         r = 286. / min(w, h)
 45 |         # resize images so that min(w, h) == 256
 46 |         img_t0_r = cv2.resize(img_t0, (int(r * w), int(r * h)))
 47 |         img_t1_r = cv2.resize(img_t1, (int(r * w), int(r * h)))
 48 |         mask_r = cv2.resize(mask, (int(r * w), int(r * h)))[:, :, np.newaxis]
 49 | 
 50 |         img_t0_r_ = np.asarray(img_t0_r).astype('f').transpose(2, 0, 1) / 128.0 - 1.0
 51 |         img_t1_r_ = np.asarray(img_t1_r).astype('f').transpose(2, 0, 1) / 128.0 - 1.0
 52 |         mask_r_ = np.asarray(mask_r>128).astype('f').transpose(2, 0, 1)
 53 | 
 54 |         crop_width = 256
 55 |         _, h, w = img_t0_r_.shape
 56 |         x_l = np.random.randint(0, w - crop_width)
 57 |         x_r = x_l + crop_width
 58 |         y_l = np.random.randint(0, h - crop_width)
 59 |         y_r = y_l + crop_width
 60 | 
 61 |         input_ = torch.from_numpy(np.concatenate((img_t0_r_[:, y_l:y_r, x_l:x_r], img_t1_r_[:, y_l:y_r, x_l:x_r]), axis=0))
 62 |         mask_ = torch.from_numpy(mask_r_[:, y_l:y_r, x_l:x_r]).long()
 63 |         
 64 |         return input_,mask_
 65 | 
 66 |     def __len__(self):
 67 |         return len(self.filename)
 68 | 
 69 |     def get_random_image(self):
 70 |         idx = np.random.randint(0,len(self))
 71 |         return self.__getitem__(idx)
 72 | 
 73 | 
 74 | class pcd_eval(Dataset):
 75 | 
 76 |     def __init__(self, root):
 77 |         super(pcd_eval, self).__init__()
 78 |         self.img_t0_root = pjoin(root, 't0')
 79 |         self.img_t1_root = pjoin(root, 't1')
 80 |         self.img_mask_root = pjoin(root, 'mask')
 81 |         self.filename = list(spt(f)[0] for f in os.listdir(self.img_mask_root) if check_validness(f))
 82 |         self.filename.sort()
 83 | 
 84 |     def __getitem__(self, index):
 85 | 
 86 |         fn = self.filename[index]
 87 |         fn_t0 = pjoin(self.img_t0_root, fn + '.jpg')
 88 |         fn_t1 = pjoin(self.img_t1_root, fn + '.jpg')
 89 |         fn_mask = pjoin(self.img_mask_root, fn + '.png')
 90 | 
 91 |         if os.path.isfile(fn_t0) == False:
 92 |             print('Error: File Not Found: ' + fn_t0)
 93 |             exit(-1)
 94 |         if os.path.isfile(fn_t1) == False:
 95 |             print('Error: File Not Found: ' + fn_t1)
 96 |             exit(-1)
 97 |         if os.path.isfile(fn_mask) == False:
 98 |             print('Error: File Not Found: ' + fn_mask)
 99 |             exit(-1)
100 | 
101 |         img_t0 = cv2.imread(fn_t0, 1)
102 |         img_t1 = cv2.imread(fn_t1, 1)
103 |         mask = cv2.imread(fn_mask, 0)
104 | 
105 |         w, h, c = img_t0.shape
106 |         w_r = int(256*max(w/256,1))
107 |         h_r = int(256*max(h/256,1))
108 |         # resize images so that min(w, h) == 256
109 |         img_t0_r = cv2.resize(img_t0,(h_r,w_r))
110 |         img_t1_r = cv2.resize(img_t1,(h_r,w_r))
111 |         mask_r = cv2.resize(mask,(h_r,w_r))[:, :, np.newaxis]
112 | 
113 |         img_t0_r = np.asarray(img_t0_r).astype('f').transpose(2, 0, 1) / 128.0 - 1.0
114 |         img_t1_r = np.asarray(img_t1_r).astype('f').transpose(2, 0, 1) / 128.0 - 1.0
115 |         mask_r = np.asarray(mask_r > 128).astype('f').transpose(2, 0, 1)
116 | 
117 |         return img_t0_r, img_t1_r, mask_r, w, h, w_r, h_r
118 | 
119 |     def __len__(self):
120 |         return len(self.filename)
121 | 
122 |     def get_random_image(self):
123 |         idx = np.random.randint(0,len(self))
124 |         return self.__getitem__(idx)
125 | class vl_cmu_cd(Dataset):
126 | 
127 |     def __init__(self, root, num=1):
128 |         super(vl_cmu_cd, self).__init__()
129 |         self.img_t0_root = pjoin(root, 't0')
130 |         self.img_t1_root = pjoin(root, 't1')
131 |         self.img_mask_root = pjoin(root, 'mask')
132 |         self.filename = list(spt(f)[0] for f in os.listdir(self.img_mask_root) if check_validness(f))
133 |         self.filename.sort()
134 |         self.datanum = num
135 | 
136 |     def __getitem__(self, index):
137 | 
138 |         fn = self.filename[index]
139 |         fn_t0 = pjoin(self.img_t0_root, fn + '.png')
140 |         fn_t1 = pjoin(self.img_t1_root, fn + '.png')
141 |         fn_mask = pjoin(self.img_mask_root, fn + '.png')
142 | 
143 |         if os.path.isfile(fn_t0) == False:
144 |             print('Error: File Not Found: ' + fn_t0)
145 |             exit(-1)
146 |         if os.path.isfile(fn_t1) == False:
147 |             print('Error: File Not Found: ' + fn_t1)
148 |             exit(-1)
149 |         if os.path.isfile(fn_mask) == False:
150 |             print('Error: File Not Found: ' + fn_mask)
151 |             exit(-1)
152 | 
153 |         img_t0 = cv2.imread(fn_t0, 1)
154 |         img_t1 = cv2.imread(fn_t1, 1)
155 |         mask = cv2.imread(fn_mask, 0)
156 | 
157 |         mask_r = mask[:, :, np.newaxis]
158 | 
159 |         img_t0_r = np.asarray(img_t0).astype('f').transpose(2, 0, 1) / 128.0 - 1.0
160 |         img_t1_r = np.asarray(img_t1).astype('f').transpose(2, 0, 1) / 128.0 - 1.0
161 |         mask_r_ = np.asarray(mask_r > 128).astype('f').transpose(2, 0, 1)
162 | 
163 | 
164 |         input_ = torch.from_numpy(np.concatenate((img_t0_r, img_t1_r), axis=0))
165 |         mask_ = torch.from_numpy(mask_r_).long()
166 | 
167 |         return input_, mask_
168 | 
169 |     def __len__(self):
170 | 
171 |         return round(self.datanum *len(self.filename))
172 | 
173 |     def get_random_image(self):
174 |         # num = self.datanum *len(self)
175 |         idx = np.random.randint(0,len(self))
176 | 
177 |         return self.__getitem__(idx)
178 | 
179 | class vl_cmu_cd_eval(Dataset):
180 | 
181 |     def __init__(self, root):
182 |         super(vl_cmu_cd_eval, self).__init__()
183 |         self.img_root = pjoin(root, 'RGB')
184 |         self.img_mask_root = pjoin(root, 'GT')
185 |         self.filename = list(spt(f)[0] for f in os.listdir(self.img_mask_root) if check_validness(f))
186 |         self.filename.sort()
187 | 
188 | 
189 |     def __getitem__(self, index):
190 | 
191 |         fn = self.filename[index]
192 |         fn_t0 = pjoin(self.img_root, '1_{:02d}'.format(index) + '.png')
193 |         fn_t1 = pjoin(self.img_root, '2_{:02d}'.format(index) + '.png')
194 |         fn_mask = pjoin(self.img_mask_root, fn + '.png')
195 | 
196 |         if os.path.isfile(fn_t0) == False:
197 |             print('Error: File Not Found: ' + fn_t0)
198 |             exit(-1)
199 |         if os.path.isfile(fn_t1) == False:
200 |             print('Error: File Not Found: ' + fn_t1)
201 |             exit(-1)
202 |         if os.path.isfile(fn_mask) == False:
203 |             print('Error: File Not Found: ' + fn_mask)
204 |             exit(-1)
205 | 
206 |         img_t0 = cv2.imread(fn_t0, 1)
207 |         img_t1 = cv2.imread(fn_t1, 1)
208 |         mask = cv2.imread(fn_mask, 0)
209 | 
210 |         w, h, c = img_t0.shape
211 |         w_r = int(256 * max(w / 256, 1))
212 |         h_r = int(256 * max(h / 256, 1))
213 | 
214 |         img_t0_r = cv2.resize(img_t0, (w_r, h_r))
215 |         img_t1_r = cv2.resize(img_t1, (w_r, h_r))
216 |         mask_r = cv2.resize(mask, (h_r, w_r))[:, :, np.newaxis]
217 | 
218 |         img_t0_r_ = np.asarray(img_t0_r).astype('f').transpose(2, 0, 1) / 128.0 - 1.0
219 |         img_t1_r_ = np.asarray(img_t1_r).astype('f').transpose(2, 0, 1) / 128.0 - 1.0
220 |         mask_r_ = np.asarray(mask_r > 128).astype('f').transpose(2, 0, 1)
221 | 
222 |         return img_t0_r_, img_t1_r_, mask_r_, w, h, w_r, h_r
223 | 
224 |     def __len__(self):
225 |         return len(self.filename)
226 | 
227 |     def get_random_image(self):
228 |         idx = np.random.randint(0,len(self))
229 |         return self.__getitem__(idx)
230 | 
231 | 
232 | 
233 | 
234 | 


--------------------------------------------------------------------------------
/DSP/DR-TANet-main/eval.py:
--------------------------------------------------------------------------------
  1 | import datasets
  2 | from TANet import TANet
  3 | import os
  4 | import csv
  5 | import cv2
  6 | import torch
  7 | import torch.nn as nn
  8 | import numpy as np
  9 | from os.path import join as pjoin
 10 | from tqdm import tqdm
 11 | import torch.nn.functional as F
 12 | import argparse
 13 | 
 14 | class Evaluate:
 15 | 
 16 |     def __init__(self):
 17 |         self.args = None
 18 |         self.set = None
 19 | 
 20 |     def eval(self):
 21 | 
 22 |         input = torch.from_numpy(np.concatenate((self.t0,self.t1),axis=0)).contiguous()
 23 |         input = input.view(1,-1,self.w_r,self.h_r)
 24 |         input = input.cuda()
 25 |         output= self.model(input)
 26 | 
 27 |         input = input[0].cpu().data
 28 |         img_t0 = input[0:3,:,:]
 29 |         img_t1 = input[3:6,:,:]
 30 |         img_t0 = (img_t0+1)*128
 31 |         img_t1 = (img_t1+1)*128
 32 |         output = output[0].cpu().data
 33 |         #mask_pred =F.softmax(output[0:2,:,:],dim=0)[0]*255
 34 |         mask_pred = np.where(F.softmax(output[0:2,:,:],dim=0)[0]>0.5, 255, 0)
 35 |         mask_gt = np.squeeze(np.where(self.mask==True,255,0),axis=0)
 36 |         if self.args.store_imgs:
 37 |             precision, recall, accuracy, f1_score = self.store_imgs_and_cal_matrics(img_t0,img_t1,mask_gt,mask_pred)
 38 |         else:
 39 |             precision, recall, accuracy, f1_score = self.cal_metrcis(mask_pred,mask_gt)
 40 |         return (precision, recall, accuracy, f1_score)
 41 | 
 42 | 
 43 |     def store_imgs_and_cal_matrics(self, t0, t1, mask_gt, mask_pred):
 44 | 
 45 |         w, h = self.w_r, self.h_r
 46 |         img_save = np.zeros((w * 2, h * 2, 3), dtype=np.uint8)
 47 |         img_save[0:w, 0:h, :] = np.transpose(t0.numpy(), (1, 2, 0)).astype(np.uint8)
 48 |         img_save[0:w, h:h * 2, :] = np.transpose(t1.numpy(), (1, 2, 0)).astype(np.uint8)
 49 |         img_save[w:w * 2, 0:h, :] = cv2.cvtColor(mask_gt.astype(np.uint8), cv2.COLOR_GRAY2RGB)
 50 |         img_save[w:w * 2, h:h * 2, :] = cv2.cvtColor(mask_pred.astype(np.uint8), cv2.COLOR_GRAY2RGB)
 51 | 
 52 |         if w != self.w_ori or h != self.h_ori:
 53 |             img_save = cv2.resize(img_save, (self.h_ori, self.w_ori))
 54 | 
 55 |         fn_save = self.fn_img
 56 |         if not os.path.exists(self.dir_img):
 57 |             os.makedirs(self.dir_img)
 58 | 
 59 |         print('Writing' + fn_save + '......')
 60 |         cv2.imwrite(fn_save, img_save)
 61 | 
 62 |         if self.set is not None:
 63 |             f_metrics = open(pjoin(self.resultdir, "eval_metrics_set{0}(single_image).csv".format(self.set)), 'a+')
 64 |         else:
 65 |             f_metrics = open(pjoin(self.resultdir, "eval_metrics(single_image).csv"), 'a+')
 66 |         metrics_writer = csv.writer(f_metrics)
 67 |         fn = '{0}-{1:08d}'.format(self.ds,self.index)
 68 |         precision, recall, accuracy, f1_score = self.cal_metrcis(mask_pred,mask_gt)
 69 |         metrics_writer.writerow([fn, precision, recall, accuracy, f1_score])
 70 |         f_metrics.close()
 71 |         return (precision, recall, accuracy, f1_score)
 72 | 
 73 |     def cal_metrcis(self,pred,target):
 74 | 
 75 |         temp = np.dstack((pred == 0, target == 0))
 76 |         TP = sum(sum(np.all(temp,axis=2)))
 77 | 
 78 |         temp = np.dstack((pred == 0, target == 255))
 79 |         FP = sum(sum(np.all(temp,axis=2)))
 80 | 
 81 |         temp = np.dstack((pred == 255, target == 0))
 82 |         FN = sum(sum(np.all(temp, axis=2)))
 83 | 
 84 |         temp = np.dstack((pred == 255, target == 255))
 85 |         TN = sum(sum(np.all(temp, axis=2)))
 86 | 
 87 |         precision = TP / (TP + FP)
 88 |         recall = TP / (TP + FN)
 89 |         accuracy = (TP + TN) / (TP + FP + FN + TN)
 90 |         f1_score = 2 * recall * precision / (precision + recall)
 91 | 
 92 |         return (precision, recall, accuracy, f1_score)
 93 | 
 94 |     def Init(self):
 95 | 
 96 |         if self.args.drtam:
 97 |             print('Dynamic Receptive Temporal Attention Network (DR-TANet)')
 98 |             model_name = 'DR-TANet'
 99 |         else:
100 |             print('Temporal Attention Network (TANet)')
101 |             model_name = 'TANet_k={0}'.format(self.args.local_kernel_size)
102 | 
103 |         model_name += ('_' + self.args.encoder_arch)
104 | 
105 |         print('Encoder:' + self.args.encoder_arch)
106 | 
107 |         if self.args.refinement:
108 |             print('Adding refinement...')
109 |             model_name += '_ref'
110 | 
111 |         self.resultdir = pjoin(self.args.resultdir, model_name, self.args.dataset)
112 |         if not os.path.exists(self.resultdir):
113 |             os.makedirs(self.resultdir)
114 | 
115 |         f_metrics = open(pjoin(self.resultdir, "eval_metrics(dataset).csv"), 'a+')
116 |         metrics_writer = csv.writer(f_metrics)
117 |         metrics_writer.writerow(['set', 'ds_name', 'precision', 'recall', 'accuracy', 'f1-score'])
118 |         f_metrics.close()
119 | 
120 | 
121 |     def run(self):
122 | 
123 |         if os.path.isfile(self.fn_model) is False:
124 |             print("Error: Cannot read file ... " + self.fn_model)
125 |             exit(-1)
126 |         else:
127 |             print("Reading model ... " + self.fn_model)
128 | 
129 |         self.model = TANet(self.args.encoder_arch, self.args.local_kernel_size, self.args.attn_stride,
130 |                            self.args.attn_padding, self.args.attn_groups, self.args.drtam, self.args.refinement, False,False,'None')
131 | 
132 | 
133 |         # state_dic = {k.partition('module.')[2]:v for k,v in torch.load(self.fn_model).items()}
134 |         if self.args.multi_gpu:
135 |             self.model = nn.DataParallel(self.model)
136 |         self.model.load_state_dict((torch.load(self.fn_model)))  #
137 |         self.model = self.model.cuda()
138 |         self.model.eval()
139 | 
140 | 
141 | class evaluate_pcd(Evaluate):
142 | 
143 |     def __init__(self,arguments):
144 |         super(evaluate_pcd,self).__init__()
145 |         self.args = arguments
146 | 
147 |     def run(self, set):
148 | 
149 |         self.set = set
150 |         self.dir_img = pjoin(self.resultdir, 'imgs', 'set{0:1d}'.format(self.set))
151 |         self.fn_model = pjoin(self.args.checkpointdir) #'set{0:1d}'.format(self.set), 'checkpointdir', '00120000.pth'
152 |         super(evaluate_pcd,self).run()
153 |         f_metrics = open(pjoin(self.resultdir, "eval_metrics(dataset).csv"), 'a+')
154 |         metrics_writer = csv.writer(f_metrics)
155 | 
156 |         for ds in tqdm(['TSUNAMI','GSV']):
157 |             test_loader = datasets.pcd_eval(pjoin(self.args.datadir,ds))
158 |             metrics = np.array([0,0,0,0], dtype='float64')
159 |             img_cnt = len(test_loader)
160 |             for idx in range(0,img_cnt):
161 |                 self.index = idx
162 |                 self.ds = ds
163 |                 self.fn_img = pjoin(self.dir_img, '{0}-{1:08d}.png'.format(self.ds, self.index))
164 |                 self.t0,self.t1,self.mask,self.w_ori,self.h_ori,self.w_r,self.h_r = test_loader[idx]
165 |                 metrics += np.array(self.eval())
166 |             metrics_writer.writerow([self.set, ds, '%.3f' %(metrics[0] / img_cnt), '%.3f' %(metrics[1] / img_cnt),
167 |                                      '%.3f' % (metrics[2] / img_cnt), '%.3f' %(metrics[3] / img_cnt)])
168 | 
169 |         f_metrics.close()
170 | 
171 | class evaluate_cmu(Evaluate):
172 | 
173 |     def __init__(self, arguments):
174 |         super(evaluate_cmu, self).__init__()
175 |         self.args = arguments
176 | 
177 |     def Init(self):
178 |         super(evaluate_cmu,self).Init()
179 |         self.ds = None
180 |         self.index = 0
181 |         self.dir_img = pjoin(self.resultdir, 'imgs')
182 |         self.fn_img = pjoin(self.dir_img, '{0}-{1:08d}.png'.format(self.ds, self.index))
183 |         self.fn_model = pjoin(self.args.checkpointdir)  #00070050  00035100 00070050.pth
184 | 
185 |     def eval(self):
186 | 
187 |         input = torch.from_numpy(np.concatenate((self.t0,self.t1),axis=0)).contiguous()
188 |         input = input.view(1,-1,self.w_r,self.h_r)
189 |         input = input.cuda()
190 |         output= self.model(input)
191 | 
192 |         input = input[0].cpu().data
193 |         img_t0 = input[0:3,:,:]
194 |         img_t1 = input[3:6,:,:]
195 |         img_t0 = (img_t0+1)*128
196 |         img_t1 = (img_t1+1)*128
197 |         output = output[0].cpu().data
198 |         mask_pred = np.where(F.softmax(output[0:2,:,:],dim=0)[0]>0.5, 0, 255)
199 |         mask_gt = np.squeeze(np.where(self.mask==True,255,0),axis=0)
200 |         if self.args.store_imgs:
201 |             precision, recall, accuracy, f1_score = self.store_imgs_and_cal_matrics(img_t0,img_t1,mask_gt,mask_pred)
202 |         else:
203 |             precision, recall, accuracy, f1_score = self.cal_metrcis(mask_pred,mask_gt)
204 |         return (precision, recall, accuracy, f1_score)
205 | 
206 |     def run(self):
207 |         super(evaluate_cmu, self).run()
208 |         f_metrics = open(pjoin(self.resultdir, "eval_metrics(dataset).csv"), 'a+')
209 |         metrics_writer = csv.writer(f_metrics)
210 |         testdir = [0,6,7,9,12,23,24,25,27,28,32,34,36,38,39,45,47,48,50,56,58,60,61,64,66,69,76,77,81,82,85,92,93,94,95,97,100,106,107,112,113,117,119,120,125,129,132,134,135,139,142,144,145,150]
211 |         img_cnt = 0
212 |         metrics = np.array([0, 0, 0, 0], dtype='float64')
213 |         for idx in testdir:
214 |             test_loader = datasets.vl_cmu_cd_eval(pjoin(self.args.datadir, 'raw', '{:03d}'.format(idx)))
215 |             img_cnt += len(test_loader)
216 |             self.ds = idx
217 |             for i in range(0, len(test_loader)):
218 |                 self.index = i
219 |                 self.fn_img = pjoin(self.dir_img, '{0}-{1:08d}.png'.format(self.ds, self.index))
220 |                 self.t0, self.t1, self.mask, self.w_ori, self.h_ori, self.w_r, self.h_r = test_loader[i]
221 |                 metrics += np.array(self.eval())
222 |         metrics_writer.writerow(['%.3f' % (metrics[0] / img_cnt), '%.3f' % (metrics[1] / img_cnt),
223 |                                  '%.3f' % (metrics[2] / img_cnt), '%.3f' % (metrics[3] / img_cnt)])
224 | 
225 |         f_metrics.close()
226 | 
227 | if __name__ =='__main__':
228 | 
229 |     parser = argparse.ArgumentParser(description='STRAT EVALUATING...')
230 |     parser.add_argument('--dataset', type=str, default='pcd', required=True)
231 |     parser.add_argument('--datadir',required=True)
232 |     parser.add_argument('--resultdir',required=True)
233 |     parser.add_argument('--checkpointdir',required=True)
234 |     parser.add_argument('--encoder-arch', type=str, required=True)
235 |     parser.add_argument('--local-kernel-size',type=int, default=1)
236 |     parser.add_argument('--attn-stride', type=int, default=1)
237 |     parser.add_argument('--attn-padding', type=int, default=0)
238 |     parser.add_argument('--attn-groups', type=int, default=4)
239 |     parser.add_argument('--drtam', action='store_true')
240 |     parser.add_argument('--refinement', action='store_true')
241 |     parser.add_argument('--store-imgs', action='store_true')
242 |     parser.add_argument('--multi-gpu', action='store_true', help='processing with multi-gpus')
243 | 
244 |     if parser.parse_args().dataset == 'pcd':
245 |         eval = evaluate_pcd(parser.parse_args())
246 |         eval.Init()
247 |         for set in range(0,3):
248 |             eval.run(set)
249 |     elif parser.parse_args().dataset == 'vl_cmu_cd':
250 |         eval = evaluate_cmu(parser.parse_args())
251 |         eval.Init()
252 |         eval.run()
253 |     else:
254 |         print('Error: Cannot identify the dataset...(dataset: pcd or vl_cmu_cd)')
255 |         exit(-1)


--------------------------------------------------------------------------------
/DSP/DR-TANet-main/graph.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/NeurAI-Lab/DSP/45027a3702696dafd7018802619dde17c6da1ca8/DSP/DR-TANet-main/graph.py


--------------------------------------------------------------------------------
/DSP/DR-TANet-main/img/TANet_DR-TANet.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/NeurAI-Lab/DSP/45027a3702696dafd7018802619dde17c6da1ca8/DSP/DR-TANet-main/img/TANet_DR-TANet.png


--------------------------------------------------------------------------------
/DSP/DR-TANet-main/split data.py:
--------------------------------------------------------------------------------
 1 | from collections import Counter
 2 | from os.path import join as pjoin, splitext as spt
 3 | import os
 4 | import cv2
 5 | from shutil import copyfile
 6 | from pathlib import Path
 7 | dict = {'1_00':0,'1_01':1,'1_02':2,'1_03':3,'1_04':4,'1_05':5,'1_06':6,'1_07':7,'1_08':8,'1_09':9,'1_10':10,'1_11':11,'1_12':12,'1_13':13,'1_14':14,'1_15':15,'1_16':16,'1_17':17,'1_18':18,'1_19':19}
 8 | pt = '/data/input/datasets/VL-CMU-CD/struc_train/'
 9 | path_txt = '/data/input/datasets/VL-CMU-CD/struc_train/train_50p_cmu.txt'
10 | label = '/data/input/datasets/VL-CMU-CD/vl_cmu_cd_binary_mask/vl_cmu_cd_binary_mask/train/mask_struc'
11 | # path = '/data/input/datasets/VL-CMU-CD/struc_train/train_split.txt'
12 | lst = []
13 | count = 0
14 | count_dict = {}
15 | full_list = []
16 | count = 0
17 | datapath = '/data/input/datasets/VL-CMU-CD/struc_train'
18 | labpath = '/data/input/datasets/VL-CMU-CD/vl_cmu_cd_binary_mask/vl_cmu_cd_binary_mask/train/mask_struc'
19 | sav = '/data/input/datasets/VL-CMU-CD/vl_cmu_cd_binary_mask/vl_cmu_cd_binary_mask/dtranet_vl_50pdata'
20 | nameing = 1
21 | for idx, did in enumerate(open(path_txt)):
22 |     try:
23 |         image1_name, image2_name, mask_name = did.strip("\n").split(' ')
24 |     except ValueError:  # Adhoc for test.
25 |         image_name = mask_name = did.strip("\n")
26 |     extract_name = image1_name[image1_name.rindex('/') + 1: image1_name.rindex('.')]
27 | 
28 |     folder = image1_name.split('/')
29 |     img1_file = os.path.join(pt, image1_name)
30 |     img2_file = os.path.join(pt, image2_name)
31 |     # items = len([name for name in os.listdir(fol_path)])
32 |     imgno = os.path.splitext(folder[2])[0]
33 |     lbl_file = os.path.join(labpath, folder[1])
34 |     filename = list(spt(f)[0] for f in os.listdir(lbl_file))
35 |     filename.sort()
36 |     lbl_file2 = os.path.join(lbl_file, filename[dict[imgno]]+'.png')
37 | 
38 |     print(img1_file)
39 |     img_t0 = cv2.imread(img1_file, 1)
40 |     img_t1 = cv2.imread(img2_file, 1)
41 |     mask = cv2.imread(lbl_file2, 0)
42 |     #rotate image
43 |     image_t0_90 = cv2.rotate(img_t0, cv2.cv2.ROTATE_90_CLOCKWISE)
44 |     image_t0_180 = cv2.rotate(image_t0_90, cv2.cv2.ROTATE_90_CLOCKWISE)
45 |     image_t0_270 = cv2.rotate(image_t0_180, cv2.cv2.ROTATE_90_CLOCKWISE)
46 |     image_t1_90 = cv2.rotate(img_t1, cv2.cv2.ROTATE_90_CLOCKWISE)
47 |     image_t1_180 = cv2.rotate(image_t1_90, cv2.cv2.ROTATE_90_CLOCKWISE)
48 |     image_t1_270 = cv2.rotate(image_t1_180, cv2.cv2.ROTATE_90_CLOCKWISE)
49 |     mask_90 = cv2.rotate(mask, cv2.cv2.ROTATE_90_CLOCKWISE)
50 |     mask_180 = cv2.rotate(mask_90, cv2.cv2.ROTATE_90_CLOCKWISE)
51 |     mask_270 = cv2.rotate(mask_180, cv2.cv2.ROTATE_90_CLOCKWISE)
52 |     print(pjoin(sav,'t0',str(nameing)+'.png'))
53 |     cv2.imwrite(pjoin(sav,'t0',str(nameing)+'.png'), img_t0)
54 |     cv2.imwrite(pjoin(sav,'t0',str(nameing+1)+'.png'), image_t0_90)
55 |     cv2.imwrite(pjoin(sav,'t0',str(nameing+2)+'.png'), image_t0_180)
56 |     cv2.imwrite(pjoin(sav,'t0',str(nameing+3)+'.png'), image_t0_270)
57 |     cv2.imwrite(pjoin(sav, 't1', str(nameing) + '.png'), img_t1)
58 |     cv2.imwrite(pjoin(sav, 't1', str(nameing + 1) + '.png'), image_t1_90)
59 |     cv2.imwrite(pjoin(sav, 't1', str(nameing + 2) + '.png'), image_t1_180)
60 |     cv2.imwrite(pjoin(sav, 't1', str(nameing + 3) + '.png'), image_t1_270)
61 |     cv2.imwrite(pjoin(sav, 'mask', str(nameing) + '.png'), mask)
62 |     cv2.imwrite(pjoin(sav, 'mask', str(nameing + 1) + '.png'), mask_90)
63 |     cv2.imwrite(pjoin(sav, 'mask', str(nameing + 2) + '.png'), mask_180)
64 |     cv2.imwrite(pjoin(sav, 'mask', str(nameing + 3) + '.png'), mask_270)
65 | 
66 |     nameing = nameing+4
67 | # with open(path) as g:
68 | #     for line in g:
69 | #         ls= line.split()
70 | # datapath = '/data/input/datasets/VL-CMU-CD/vl_cmu_cd_binary_mask/vl_cmu_cd_binary_mask/train/mask_900images'
71 | # formatpath = '/data/input/datasets/VL-CMU-CD/struc_train/gt_fold_rgb'
72 | # filename = list(spt(f)[0] for f in os.listdir(datapath) )
73 | # filename.sort()
74 | # print(filename)
75 | # query_item = 0
76 | # for word in ls :
77 | #     word1 = word.zfill(3)
78 | #     fol_path = pjoin(formatpath, word1)
79 | #     items = len([name for name in os.listdir(fol_path)])
80 | #     savepath = '/data/input/datasets/VL-CMU-CD/vl_cmu_cd_binary_mask/vl_cmu_cd_binary_mask/train/mask_struc'
81 | #     Path(os.path.join(savepath, word1)).mkdir(parents=True, exist_ok=True)
82 | #     for id in range(items):
83 | #         q = query_item +id
84 | #         copyfile(pjoin(datapath,filename[q]+'.png'), pjoin(savepath,word1,filename[q]+'.png'))
85 | #
86 | #     query_item = items + query_item
87 | 


--------------------------------------------------------------------------------
/DSP/DR-TANet-main/train.py:
--------------------------------------------------------------------------------
  1 | import os
  2 | import csv
  3 | import cv2
  4 | import torch
  5 | from TANet import TANet
  6 | import numpy as np
  7 | import datasets
  8 | import torch.nn as nn
  9 | import torch.nn.functional as F
 10 | from tqdm import tqdm
 11 | from os.path import join as pjoin
 12 | from torch.utils.data import DataLoader
 13 | from tensorboardX import SummaryWriter
 14 | torch.cuda.empty_cache()
 15 | import argparse
 16 | 
 17 | 
 18 | class criterion_CEloss(nn.Module):
 19 |     def __init__(self,weight=None):
 20 |         super(criterion_CEloss, self).__init__()
 21 |         self.loss = nn.NLLLoss(weight)
 22 |     def forward(self,output,target):
 23 |         return self.loss(F.log_softmax(output, dim=1), target)
 24 | 
 25 | class Train:
 26 | 
 27 |     def __init__(self):
 28 |         self.epoch = 0
 29 |         self.step = 0
 30 | 
 31 |     def train(self):
 32 | 
 33 |         weight = torch.ones(2)
 34 |         criterion = criterion_CEloss(weight.cuda())
 35 |         optimizer = torch.optim.Adam(self.model.parameters(),lr=0.001,betas=(0.9,0.999))
 36 |         lambda_lr = lambda epoch:(float)(self.args.max_epochs*len(self.dataset_train_loader)-self.step)/(float)(self.args.max_epochs*len(self.dataset_train_loader))
 37 |         model_lr_scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer,lr_lambda=lambda_lr)
 38 | 
 39 |         f_loss = open(pjoin(self.checkpoint_save,"loss.csv"),'w')
 40 |         loss_writer = csv.writer(f_loss)
 41 | 
 42 |         self.visual_writer = SummaryWriter(os.path.join(self.checkpoint_save,'logs'))
 43 | 
 44 |         loss_item = []
 45 | 
 46 |         max_step = self.args.max_epochs * len(self.dataset_train_loader)
 47 |         _,w,h = self.dataset_test.get_random_image()[0].shape
 48 |         img_tbx =  np.zeros((max_step//self.args.step_test, 3, w*2, h*2), dtype=np.uint8)
 49 | 
 50 |         while self.epoch < self.args.max_epochs:
 51 | 
 52 |             for step,(inputs_train,mask_train) in enumerate(tqdm(self.dataset_train_loader)):
 53 |                 self.model.train()
 54 |                 inputs_train = inputs_train.cuda()
 55 |                 mask_train = mask_train.cuda()
 56 |                 output_train = self.model(inputs_train)
 57 |                 optimizer.zero_grad()
 58 |                 self.loss = criterion(output_train, mask_train[:,0])
 59 |                 loss_item.append(self.loss)
 60 |                 self.loss.backward()
 61 |                 optimizer.step()
 62 |                 self.step += 1
 63 |                 loss_writer.writerow([self.step,self.loss.item()])
 64 |                 self.visual_writer.add_scalar('loss',self.loss.item(),self.step)
 65 | 
 66 |                 # if self.args.step_test>0 and self.step % self.args.step_test == 0:
 67 |                 #     print('testing...')
 68 |                 #     self.model.eval()
 69 |                 #     self.test(img_tbx)
 70 | 
 71 |             print('Loss for Epoch {}:{:.03f}'.format(self.epoch, sum(loss_item)/len(self.dataset_train_loader)))
 72 |             loss_item.clear()
 73 |             model_lr_scheduler.step()
 74 |             self.epoch += 1
 75 |             if self.args.epoch_save>0 and self.epoch % self.args.epoch_save == 0:
 76 |                 self.checkpoint()
 77 | 
 78 |         self.visual_writer.add_images('cd_test',img_tbx,0, dataformats='NCHW')
 79 |         f_loss.close()
 80 |         self.visual_writer.close()
 81 | 
 82 |     def test(self,img_tbx):
 83 | 
 84 |         _, _, w_r, h_r = img_tbx.shape
 85 |         w_r //= 2
 86 |         h_r //= 2
 87 |         input, mask_gt = self.dataset_test.get_random_image()
 88 | 
 89 |         input = input.view(1, -1, h_r, w_r)
 90 |         input = input.cuda()
 91 |         output = self.model(input)
 92 | 
 93 |         input = input[0].cpu().data
 94 |         img_t0 = input[0:3, :, :]
 95 |         img_t1 = input[3:6, :, :]
 96 |         img_t0 = (img_t0 + 1) * 128
 97 |         img_t1 = (img_t1 + 1) * 128
 98 |         output = output[0].cpu().data
 99 |         mask_pred = np.where(F.softmax(output[0:2, :, :], dim=0)[0] > 0.5, 0, 255)
100 |         mask_gt = np.squeeze(np.where(mask_gt == True, 255, 0), axis=0)
101 |         self.store_result(img_t0, img_t1, mask_gt, mask_pred,img_tbx)
102 | 
103 |     def store_result(self, t0, t1, mask_gt, mask_pred, img_save):
104 | 
105 |         _, _, w, h = img_save.shape
106 |         w //=2
107 |         h //=2
108 |         i = self.step//self.args.step_test - 1
109 |         img_save[i, :, 0:w, 0:h] = t0.numpy().astype(np.uint8)
110 |         img_save[i, :, 0:w, h:2 * h] = t1.numpy().astype(np.uint8)
111 |         img_save[i, :, w:2 * w, 0:h] = np.transpose(cv2.cvtColor(mask_gt.astype(np.uint8), cv2.COLOR_GRAY2RGB),(2,0,1)).astype(np.uint8)
112 |         img_save[i, :, w:2 * w, h:2 * h] = np.transpose(cv2.cvtColor(mask_pred.astype(np.uint8), cv2.COLOR_GRAY2RGB),(2,0,1)).astype(np.uint8)
113 | 
114 |         #img_save = np.transpose(img_save, (1, 0, 2))
115 | 
116 |     def checkpoint(self):
117 | 
118 |         filename = '{:08d}.pth'.format(self.step)
119 |         cp_path = pjoin(self.checkpoint_save,'checkpointdir')
120 |         if not os.path.exists(cp_path):
121 |             os.makedirs(cp_path)
122 |         torch.save(self.model.state_dict(),pjoin(cp_path,filename))
123 |         print("Net Parameters in step:{:08d} were saved.".format(self.step))
124 | 
125 |     def run(self):
126 | 
127 | 
128 |         self.model = TANet(self.args.encoder_arch, self.args.local_kernel_size, self.args.attn_stride,
129 |                            self.args.attn_padding, self.args.attn_groups, self.args.drtam, self.args.refinement, self.args.pretrain, self.args.sslpretrain, self.args.ssl_path)
130 | 
131 |         if self.args.drtam:
132 |             print('Dynamic Receptive Temporal Attention Network (DR-TANet)')
133 |         else:
134 |             print('Temporal Attention Network (TANet)')
135 | 
136 |         print('Encoder:' + self.args.encoder_arch)
137 |         if self.args.refinement:
138 |             print('Adding refinement...')
139 | 
140 |         if self.args.multi_gpu:
141 |             self.model = nn.DataParallel(self.model).cuda()
142 |         else:
143 |             self.model = self.model.cuda()
144 |         self.train()
145 | 
146 | class train_pcd(Train):
147 | 
148 |     def __init__(self, arguments):
149 |         super(train_pcd, self).__init__()
150 |         self.args = arguments
151 | 
152 | 
153 |     def Init(self,cvset):
154 | 
155 |         self.epoch = 0
156 |         self.step = 0
157 |         self.cvset = cvset
158 |         if self.args.drtam:
159 |             folder_name = 'DR-TANet'
160 |         else:
161 |             folder_name = 'TANet_k={}'.format(self.args.local_kernel_size)
162 | 
163 |         folder_name += ('_' + self.args.encoder_arch)
164 |         if self.args.refinement:
165 |             folder_name += '_ref'
166 | 
167 |         self.dataset_train_loader = DataLoader(datasets.pcd(pjoin(self.args.datadir, "set{}".format(self.cvset), "train")),
168 |                                           num_workers=self.args.num_workers, batch_size=self.args.batch_size,
169 |                                           shuffle=True)
170 |         self.dataset_test = datasets.pcd(pjoin(self.args.datadir, 'set{}'.format(self.cvset), 'test'))
171 |         self.checkpoint_save = pjoin(self.args.checkpointdir, folder_name, 'pcd', 'set{}'.format(self.cvset))
172 |         if not os.path.exists(self.checkpoint_save):
173 |             os.makedirs(self.checkpoint_save)
174 | 
175 | class train_cmu(Train):
176 | 
177 |     def __init__(self, arguments):
178 |         super(train_cmu, self).__init__()
179 |         self.args = arguments
180 | 
181 |     def Init(self):
182 | 
183 |         if self.args.drtam:
184 |             folder_name = 'DR-TANet'
185 |         else:
186 |             folder_name = 'TANet_k={}'.format(self.args.local_kernel_size)
187 | 
188 |         folder_name += ('_' + self.args.encoder_arch)
189 |         if self.args.refinement:
190 |             folder_name += '_ref'
191 | 
192 |         self.dataset_train_loader = DataLoader(datasets.vl_cmu_cd(pjoin(self.args.datadir, "train"), self.args.data_num),
193 |                                           num_workers=self.args.num_workers, batch_size=self.args.batch_size,
194 |                                           shuffle=True)
195 |         self.dataset_test = datasets.vl_cmu_cd(pjoin(self.args.datadir, 'test'), self.args.data_num )
196 |         self.checkpoint_save = pjoin(self.args.checkpointdir, folder_name, 'vl_cmu_cd')
197 |         if not os.path.exists(self.checkpoint_save):
198 |             os.makedirs(self.checkpoint_save)
199 | 
200 | 
201 | if __name__ =="__main__":
202 |     parser = argparse.ArgumentParser(description="Arguments for training...")
203 |     parser.add_argument('--dataset', type=str, default='pcd', required=True)
204 |     parser.add_argument('--checkpointdir', required=True)
205 |     parser.add_argument('--datadir', required=True)
206 |     parser.add_argument('--multi-gpu',action='store_true',help='training with multi-gpus')
207 |     parser.add_argument('--max-epochs', type=int, default=100)
208 |     parser.add_argument('--num-workers', type=int, default=4)
209 |     parser.add_argument('--batch-size', type=int, default=16)
210 |     parser.add_argument('--epoch-save', type=int, default=20)
211 |     parser.add_argument('--step-test', type=int, default=200)
212 |     parser.add_argument('--encoder-arch', type=str, required=True)
213 |     parser.add_argument('--local-kernel-size',type=int, default=1)
214 |     parser.add_argument('--attn-stride', type=int, default=1)
215 |     parser.add_argument('--attn-padding', type=int, default=0)
216 |     parser.add_argument('--attn-groups', type=int, default=4)
217 |     parser.add_argument('--drtam', action='store_true')
218 |     parser.add_argument('--refinement', action='store_true')
219 |     parser.add_argument('--ssl_path', type=str, help='[nb_pre,nb_nopre,bd_pre,bd_nopre]', required=True)
220 |     parser.add_argument('--data_num',   type=float, default=1.0)       #[0.1,0.5,0.01]
221 |     parser.add_argument('--pretrain', type=bool, required=True)
222 |     parser.add_argument('--sslpretrain', type=bool, required=True)
223 | 
224 | 
225 | 
226 |     if parser.parse_args().dataset == 'pcd':
227 |         train= train_pcd(parser.parse_args())
228 |         for set in range(0, 3):
229 |             train.Init(set)
230 |             train.run()
231 |     elif parser.parse_args().dataset == 'vl_cmu_cd':
232 |         train = train_cmu(parser.parse_args())
233 |         train.Init()
234 |         train.run()
235 |     else:
236 |         print('Error: Cannot identify the dataset...(dataset: pcd or vl_cmu_cd)')
237 |         exit(-1)
238 | 
239 | 
240 | 
241 | 
242 | 
243 | 
244 | 
245 | 
246 | 


--------------------------------------------------------------------------------
/DSP/DR-TANet-main/util.py:
--------------------------------------------------------------------------------
 1 | import torch.nn as nn
 2 | import torch.nn.functional as F
 3 | 
 4 | __all__ = ['Upsample', 'upsample']
 5 | 
 6 | upsample = lambda x, size: F.interpolate(x, size, mode='bilinear', align_corners=False)
 7 | 
 8 | 
 9 | class _BNReluConv(nn.Sequential):
10 |     def __init__(self, num_maps_in, num_maps_out, k=3, batch_norm=True, bn_momentum=0.1, bias=False, dilation=1):
11 |         super(_BNReluConv, self).__init__()
12 |         if batch_norm:
13 |             self.add_module('norm', nn.BatchNorm2d(num_maps_in, momentum=bn_momentum))
14 |         self.add_module('relu', nn.ReLU(inplace=batch_norm is True))
15 |         padding = k // 2  # same conv
16 |         self.add_module('conv', nn.Conv2d(num_maps_in, num_maps_out,
17 |                                           kernel_size=k, padding=padding, bias=bias, dilation=dilation))
18 | 
19 | 
20 | class Upsample(nn.Module):
21 |     def __init__(self, num_maps_in, skip_maps_in, num_maps_out, use_bn=True, k=3):
22 |         super(Upsample, self).__init__()
23 |         print(f'Upsample layer: in = {num_maps_in}, skip = {skip_maps_in}, out = {num_maps_out}')
24 |         self.bottleneck = _BNReluConv(skip_maps_in, num_maps_in, k=1, batch_norm=use_bn)
25 |         self.blend_conv = _BNReluConv(num_maps_in, num_maps_out, k=k, batch_norm=use_bn)
26 | 
27 |     def forward(self, x, skip):
28 |         skip = self.bottleneck.forward(skip)
29 |         skip_size = skip.size()[2:4]
30 |         x = upsample(x, skip_size)
31 |         x = x + skip
32 |         x = self.blend_conv.forward(x)
33 |         return x
34 | 
35 | 
36 | 
37 | 
38 | 
39 | 


--------------------------------------------------------------------------------
/DSP/config/__pycache__/option.cpython-37.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/NeurAI-Lab/DSP/45027a3702696dafd7018802619dde17c6da1ca8/DSP/config/__pycache__/option.cpython-37.pyc


--------------------------------------------------------------------------------
/DSP/config/option.py:
--------------------------------------------------------------------------------
  1 | from util.utils import mkdir
  2 | from util.dist_util import init_distributed_mode
  3 | import argparse
  4 | import torch
  5 | 
  6 | 
  7 | class Options:
  8 |     def __init__(self):
  9 |         print("parsing..")
 10 |         parser = argparse.ArgumentParser(description="PyTorch Self-supervised Learning")
 11 |         parser.add_argument("--img_size", default=256, type=int, help="Image size(int) for RandomResizedCrop")  # 224, 96
 12 |         # SSL specific settings
 13 |         parser.add_argument("--ssl_epochs", default=300, type=int, help="Number of epochs for training SSL")  # 100, 200
 14 |         parser.add_argument("--ssl_model", default="simclr", type=str, help="SSL model")  # simclr simclr_cd
 15 |         parser.add_argument("--backbone", default="resnet50", type=str, help="SSL backbone")  # resnet18, resnet50
 16 |         parser.add_argument("--optimizer", default="lars", type=str, help="SSL optimizer")  # adam, lars
 17 |         parser.add_argument("--ssl_dataset", default="CMU", type=str, help="SSL training dataset")  # STL10, CIFAR100
 18 |         parser.add_argument("--ssl_batchsize", default=4, type=int, help="Batch size for SSL training ")  # 32, 64, 128
 19 |         parser.add_argument("--temperature", default=0.5, type=float, help="Temperature parameter for NTXent loss used for SSL training")
 20 |         parser.add_argument("--ssl_lr", default=0.0003, type=float, help="Learning rate for SSL training")  # 0.0003
 21 |         parser.add_argument("--n_proj", default=256, type=int, help="Projection head output size for SSL training")  # 64, 128
 22 |         parser.add_argument("--ssl_normalize", type=bool, default=True, help="Normalize projection head output for SSL training") # True, False
 23 |         parser.add_argument("--scheduler", type=bool, default=True, help="Use CosineAnnealingLR for SSL training")  # True, False
 24 |         parser.add_argument("--global_bn", type=bool, default=False, help="Use CosineAnnealingLR for SSL training")
 25 | 
 26 |         # zoom-in
 27 |         parser.add_argument("--zoom", type=bool, default=False, help="Whether to use zoom-in of cosine similarity in NT-Xent loss")
 28 |         parser.add_argument("--zoom_factor", default=10, type=int, help="Value of zoom-in in the zoom term")
 29 |         # Use margin in NT-Xent loss - EqCo
 30 |         parser.add_argument("--margin", type=bool, default=True, help="Whether to use margin in NT-Xent loss")
 31 |         parser.add_argument("--alpha", default=65536, type=int, help="Value of alpha in the margin term")
 32 | 
 33 |         # Use two backbones
 34 |         parser.add_argument("--m_backbone", type=bool, default=False, help="Whether to use momentum encoder")
 35 |         parser.add_argument("--m_update", type=float, default=0.990, help="Momentum update value (m)")  
 36 |         parser.add_argument("--output_stride", type=int, default=16, help="outputstride (8 or 16)")
 37 |         parser.add_argument("--pre_train", type=bool, default=False, help="pretrain_enc")  
 38 |         parser.add_argument("--encoder", type=str, default='resnet', help="resnet or vgg")  
 39 |         parser.add_argument("--dense_cl", type=bool, default=True, help="Whether to use dense prediction")  # True, False
 40 |         parser.add_argument("--copy_paste", type=bool, default=False, help="Whether to use copy paste aug")  # True, False
 41 |         parser.add_argument("--barlow_twins", type=bool, default=True, help="Whether to use copy paste aug")  # True, False
 42 |         parser.add_argument("--kd_loss", default=True, type=bool, help="kldiv")  # kl, rkd,sp,wasserstein,fitnet, rka, rkda, rkd-kl, rkda-kl
 43 |         parser.add_argument("--kd_loss_2", default="sp", type=str, help="diff kd losses:rkd,sp,fitnet, rkd,rka,rkda")  # kl, rkd,sp,wasserstein,fitnet, rka, rkda
 44 |         parser.add_argument("--alpha_kl", default=1000, type=float, help="Hyperparameter for KL-div")
 45 |         parser.add_argument("--alpha_sp", default=3000, type=float, help="Hyperparameter for similarity preserving")
 46 |         parser.add_argument("--alpha_inter_kd", default=100, type=float, help="Hyperparameter for inter and intra KL-div")
 47 |         parser.add_argument("--inter_kl", default=False, type=bool, help="calculate kl between to and t1 logits")  # kl, rkd,sp,wasserstein,fitnet, rka, rkda, rkd-kl, rkda-kl
 48 |         parser.add_argument(
 49 |             "--nodiff_tc", action="store_true", default=False, help="do not reset weight each generation"
 50 |         )
 51 | 
 52 |         parser.add_argument("--hidden_layer", type=int, default=512, help="hiddenlayer (512 or 1024)")
 53 |         parser.add_argument("--supervised_multihead", type=bool, default=True, help="Whether to use copy paste aug")  # True, False
 54 | 
 55 |         # Different weighted loss functions
 56 |         parser.add_argument("--criterion_weight", nargs="*", type=int, default=[1, 0, 0, 0],
 57 |                             help="Loss criterion weights for SSL training")  # [1, 1000, 0, 0], [1, 0, 25, 50]
 58 |         # Directory
 59 |         parser.add_argument("--data_dir", default="/data/input/datasets/VL-CMU-CD/pcd", type=str, help="Directory to import data")  # Absolute path
 60 |         parser.add_argument("--val_data_dir", default="/data/input/datasets/VL-CMU-CD/struc_test", type=str, help="Directory to import data")  # Absolute path
 61 | 
 62 |         parser.add_argument("--save_dir", default="/volumes1/tmp", type=str, help="Directory to save log and model")  # Absolute path /data/output/vijaya.ramkumar/sscd /volumes1/tmp  /sscdv2/runs_1
 63 |         # testing SSL model
 64 |         parser.add_argument("--test_dataset", default="CMU", type=str, help="Dataset for testing SSL methods")  # STL10, CIFAR10, ImageNet
 65 |         parser.add_argument("--test_data_dir", default="/data/input/datasets/VL-CMU-CD/struc_test", type=str, help="Directory to import data")  # Absolute path
 66 |         parser.add_argument("--linear_batchsize", default=16, type=int, help="Test batch size for linear evaluation")  # 32, 64, 128
 67 |         parser.add_argument("--linear_epochs", default=100, type=int, help="No.of epochs for Linear evaluation")  # 100, 200
 68 |         parser.add_argument("--linear_classes", default=1, type=int, help="No.of classes for Linear evaluation")  # 1 for binary classification
 69 |         parser.add_argument("--linear_lr", default=3e-4, type=float, help="Learning rate for Linear evaluation training")  # 0.0003
 70 | 
 71 |         # testing SSL model
 72 |         parser.add_argument("--sup_dataset", default="CIFAR100", type=str)  # STL10, CIFAR10, ImageNet
 73 |         parser.add_argument("--sup_data_dir", default="/volumes1/CIFAR100", type=str)  # Absolute path
 74 |         parser.add_argument("--sup_batchsize", default=256, type=int)  # 32, 64, 128
 75 |         parser.add_argument("--sup_lr", default=0.02, type=float)  # 0.0003
 76 |         parser.add_argument("--sup_epochs", default=100, type=int)  # 100
 77 | 
 78 |         # trained SSL model path
 79 |         parser.add_argument("--model_path", default=None, type=str, help="Saved SSL model path for transfer learning")  # Absolute path
 80 | 
 81 |         # Distributed
 82 |         parser.add_argument("--distribute", type=bool, default=False, help="Distributed Data Parallel")  # DistributedDataParallel
 83 |         parser.add_argument("--dist_url", type=str, default="env://")  # Default URL for DistributedDataParallel
 84 | 
 85 |         # Visualizing Heatmap for test Images
 86 |         parser.add_argument("--bestcheckpoint", default='/data/output/vijaya.ramkumar/sscd/runs/resnet50_bs_2/Wed_May_12_17:00:10_2021/checkpoint_model_170_model1.pth', type=str)  # '/data/output/vijaya.ramkumar/sscd/runs/resnet50_bs_8/Mon_Mar_22_16:59:24_2021/checkpoint_model_200_model1.pth'
 87 |         self.parser = parser
 88 | 
 89 |     def parse(self):
 90 |         args = self.parser.parse_args()
 91 |         mkdir(args.save_dir)
 92 |         args.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
 93 |         if args.distribute:
 94 |             init_distributed_mode(args)
 95 |         if args.ssl_lr is None:
 96 |             # In SimCLR, linear LR scaling = 0.3 * args.batchsize / 256 and square root LR scaling  0.075 × math.sqrt(BatchSize)
 97 |             args.ssl_lr = 0.03 * args.ssl_batchsize / 256
 98 |             # args.ssl_lr = 0.075 * math.sqrt(args.ssl_batchsize)
 99 |         return args
100 | 


--------------------------------------------------------------------------------
/DSP/criterion/__pycache__/ntxent.cpython-37.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/NeurAI-Lab/DSP/45027a3702696dafd7018802619dde17c6da1ca8/DSP/criterion/__pycache__/ntxent.cpython-37.pyc


--------------------------------------------------------------------------------
/DSP/criterion/__pycache__/sim_preserving_kd.cpython-37.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/NeurAI-Lab/DSP/45027a3702696dafd7018802619dde17c6da1ca8/DSP/criterion/__pycache__/sim_preserving_kd.cpython-37.pyc


--------------------------------------------------------------------------------
/DSP/criterion/ntxent.py:
--------------------------------------------------------------------------------
  1 | import torch
  2 | import torch.nn as nn
  3 | from util.utils import positive_mask
  4 | import os
  5 | import math
  6 | import util.utils as utils
  7 | import torch.nn.functional as F
  8 | 
  9 | 
 10 | class NTXent(nn.Module):
 11 |     """
 12 |     The Normalized Temperature-scaled Cross Entropy Loss
 13 |     Source: https://github.com/Spijkervet/SimCLR
 14 |     """
 15 | 
 16 |     def __init__(self, args):
 17 |         super(NTXent, self).__init__()
 18 |         self.batch_size = args.ssl_batchsize
 19 |         self.margin = args.margin
 20 |         self.alpha = args.alpha
 21 |         self.temperature = args.temperature
 22 |         self.device = args.device
 23 |         self.mask = positive_mask(args.ssl_batchsize)
 24 |         self.criterion = nn.CrossEntropyLoss(reduction="sum")
 25 |         self.similarity_f = nn.CosineSimilarity(dim=2)
 26 |         self.N = 4 * self.batch_size
 27 |         self.zoom = args.zoom
 28 |         self.zoom_factor = args.zoom_factor
 29 |         self.writer = args.writer
 30 | 
 31 | 
 32 |     def forward(self, zx, zy, zx1, zy1,  global_step):
 33 |         """
 34 |         zx: projection output of batch zx
 35 |         zy: projection output of batch zy
 36 |         :return: normalized loss
 37 |         """
 38 |         positive_samples, negative_samples = self.sample_no_dict(zx, zy, zx1, zy1)
 39 |         if self.margin:
 40 |             m = self.temperature * math.log(self.alpha / negative_samples.shape[1])
 41 |             positive_samples = ((positive_samples * self.temperature) - m) / self.temperature
 42 | 
 43 |         labels = torch.zeros(self.N).to(positive_samples.device).long()
 44 |         logits = torch.cat((positive_samples, negative_samples), dim=1)
 45 |         loss = self.criterion(logits, labels)
 46 |         loss /= self.N
 47 | 
 48 |         return loss
 49 | 
 50 | 
 51 |     def sample_no_dict(self, zx, zy,  zx1, zy1):
 52 |         """
 53 |         Negative samples without dictionary
 54 |         """
 55 |         # print(zx.shape)
 56 |         z = torch.cat((zx, zy, zx1,zy1), dim=0)
 57 |         sim = self.similarity_f(z.unsqueeze(1), z.unsqueeze(0)) / self.temperature
 58 |         # print(sim.shape,self.batch_size )
 59 | 
 60 |         # Splitting the matrix into 4 blocks so as to count number of positive and negative samples
 61 |         sim_left, sim_right = torch.chunk(sim, 2, dim=1)
 62 |         sim_lu,sim_ll = torch.chunk(sim_left, 2, dim=0)
 63 |         sim_ru,sim_rl = torch.chunk(sim_right, 2, dim=0)
 64 |         # print(sim_lu.shape,self.batch_size )
 65 | 
 66 |         # Extract positive samples from each block
 67 |         #sim_xy = torch.diag(sim, self.batch_size)
 68 |         pos_1 = torch.diag(sim_lu, self.batch_size)
 69 |         pos_2 = torch.diag(sim_lu, -self.batch_size)
 70 |         pos_3 = torch.diag(sim_rl, self.batch_size)
 71 |         pos_4 = torch.diag(sim_rl, -self.batch_size)
 72 |         # sim_yx = torch.diag(sim, -self.batch_size)
 73 |         positive_samples = torch.cat((pos_1, pos_2, pos_3, pos_4), dim=0).reshape(self.N, 1)
 74 | 
 75 |         # Extract negative samples
 76 |         neg_lu = sim_lu[self.mask].reshape(self.batch_size*2, 2*(self.batch_size-1) )
 77 |         neg_rl = sim_rl[self.mask].reshape(self.batch_size*2, 2*(self.batch_size-1))
 78 | 
 79 |         # Concatenating the extracted negatives from sim block left upper and right lower.
 80 |         neg_u = torch.cat((neg_lu, sim_ru), dim=1)
 81 |         neg_l = torch.cat((sim_ll, neg_rl), dim=1)
 82 |         negative_samples = torch.cat((neg_u, neg_l), dim=0)
 83 | 
 84 |         return positive_samples, negative_samples
 85 | 
 86 | 
 87 | 
 88 | class BarlowTwinsLoss(torch.nn.Module):
 89 |     def __init__(self, device, lambda_param=5e-3):                                                                                          
 90 |         super(BarlowTwinsLoss, self).__init__()
 91 |         self.lambda_param = lambda_param
 92 |         self.device = device
 93 | 
 94 |     def forward(self, z_a: torch.Tensor, z_b: torch.Tensor):                                                                                            
 95 |         # normalize repr. along the batch dimension
 96 |         z_a_norm = (z_a - z_a.mean(0)) / z_a.std(0) # NxD
 97 |         z_b_norm = (z_b - z_b.mean(0)) / z_b.std(0) # NxD                                                                                                                                                                                                                                                           
 98 |         z_a_norm = z_a_norm.view(z_a_norm.size(0), z_a_norm.size(1)* z_a_norm.size(2))
 99 |         z_b_norm = z_b_norm.view(z_b_norm.size(0), z_b_norm.size(1)*z_b_norm.size(2))
100 | 
101 |         N = z_a.size(0)
102 |         # D = z_a.size(1)
103 |         D = z_a_norm.size(1)
104 | 
105 |         # print (z_a_norm.T.shape, z_b_norm.shape)
106 |         # cross-correlation matrix
107 |         # c= torch.einsum('yxb,bxy->xy', (z_a_norm.T, z_b_norm))
108 |         c = torch.mm(z_a_norm.T, z_b_norm) / N # DxD
109 |         # print (c.shape)
110 | 
111 |         # loss
112 |         c_diff = (c - torch.eye(D,device=self.device)).pow(2) # DxD
113 |         # multiply off-diagonal elems of c_diff by lambda
114 |         c_diff[~torch.eye(D, dtype=bool)] *= self.lambda_param
115 |         loss = c_diff.sum()
116 | 
117 |         return loss
118 | 
119 | 
120 | class BarlowTwinsLoss_CD(torch.nn.Module):
121 |     def __init__(self, args, lambda_param=5e-3):
122 |         super(BarlowTwinsLoss_CD, self).__init__()
123 |         self.lambda_param = lambda_param
124 |         self.device = args.device
125 |         self.dense_cl = args.dense_cl
126 | 
127 |     def forward(self, z_a: torch.Tensor, z_b: torch.Tensor,z_c: torch.Tensor, z_d: torch.Tensor):   #, z_c: torch.Tensor, z_d: torch.Tensor
128 | 
129 |         # normalize repr. along the batch dimension
130 |         z_a_norm = (z_a - z_a.mean(0)) / z_a.std(0)  # NxD
131 |         z_b_norm = (z_b - z_b.mean(0)) / z_b.std(0)  # NxD
132 |         z_c_norm = (z_c - z_c.mean(0)) / z_c.std(0)  # NxD
133 |         z_d_norm = (z_d - z_d.mean(0)) / z_d.std(0)  # NxD
134 | 
135 |         N = z_a.size(0)
136 |         if self.dense_cl == True:
137 |         ## for dense activation
138 |             z_a_norm = z_a_norm.view(z_a_norm.size(0),  -1)
139 |             z_b_norm = z_b_norm.view(z_b_norm.size(0),  -1)
140 |             z_c_norm = z_c_norm.view(z_c_norm.size(0), -1)
141 |             z_d_norm = z_d_norm.view(z_d_norm.size(0), -1)
142 |             D = z_a_norm.size(1)
143 | 
144 |         else:
145 |             D = z_a.size(1)
146 |         # print(z_a_norm.shape)
147 |         # cross-correlation matrix
148 |         c1 = torch.mm(z_a_norm.T, z_b_norm) / N # DxD
149 |         # c2 = torch.mm(z_c_norm.T, z_d_norm) / N # DxD
150 | 
151 | 
152 |         # loss
153 |         c_diff1 = (c1 - torch.eye(D,device=self.device)).pow(2) # DxD
154 |         # multiply off-diagonal elems of c_diff by lambda
155 |         c_diff1[~torch.eye(D, dtype=bool)] *= self.lambda_param
156 |         loss1 = c_diff1.sum()
157 | 
158 |         loss = loss1
159 |         return loss
160 | 
161 | 


--------------------------------------------------------------------------------
/DSP/criterion/sim_preserving_kd.py:
--------------------------------------------------------------------------------
  1 | import torch
  2 | import torch.nn.functional as F
  3 | from torch import nn
  4 | from torch.autograd import Variable
  5 | 
  6 | criterion_MSE = nn.MSELoss(reduction='mean')
  7 | 
  8 | 
  9 | def cross_entropy(y, labels):
 10 |     l_ce = F.cross_entropy(y, labels)
 11 |     return l_ce
 12 | 
 13 | 
 14 | def distillation(student_scores, teacher_scores, T):
 15 | 
 16 |     p = F.log_softmax(student_scores / T, dim=1)
 17 |     q = F.softmax(teacher_scores / T, dim=1)
 18 | 
 19 |     l_kl = F.kl_div(p, q, size_average=False) * (T**2) / student_scores.shape[0]
 20 | 
 21 |     return l_kl
 22 | 
 23 | 
 24 | class JSD(nn.Module):
 25 | 
 26 |     def __init__(self, args):
 27 |         super(JSD, self).__init__()
 28 |         self.dense= args.dense_cl
 29 |     def forward(self, net_1_logits, net_2_logits):
 30 |         if self.dense==True:
 31 |             net_1_logits = net_1_logits.view(net_1_logits.size(0), -1)
 32 |             net_2_logits = net_2_logits.view(net_2_logits.size(0), -1)
 33 | 
 34 | 
 35 |         net_1_probs = F.softmax(net_1_logits+ 1e-10, dim=1)
 36 |         net_2_probs = F.softmax(net_2_logits+ 1e-10, dim=1)
 37 | 
 38 |         total_m = 0.5 * (net_1_probs + net_2_probs)
 39 | 
 40 |         return 0.5 * (F.kl_div(F.log_softmax(net_1_logits, dim=1), total_m, reduction="batchmean") +
 41 |                             F.kl_div(F.log_softmax(net_2_logits, dim=1), total_m, reduction="batchmean"))
 42 | 
 43 | 
 44 | 
 45 | 
 46 | 
 47 | def fitnet_loss(A_t, A_s, rand=False, noise=0.1):
 48 |     """Given the activations for a batch of input from the teacher and student
 49 |     network, calculate the fitnet loss from the paper
 50 |     FitNets: Hints for Thin Deep Nets https://arxiv.org/abs/1412.6550
 51 | 
 52 |     Note: This function assumes that the number of channels and the spatial dimensions of
 53 |     the teacher and student activation maps are the same.
 54 | 
 55 |     Parameters:
 56 |         A_t (4D tensor): activation maps from the teacher network of shape b x c x h x w
 57 |         A_s (4D tensor): activation maps from the student network of shape b x c x h x w
 58 | 
 59 |     Returns:
 60 |         l_fitnet (1D tensor): fitnet loss value
 61 | """
 62 |     if rand:
 63 |         rand_noise =  torch.FloatTensor(A_t.shape).uniform_(1 - noise, 1 + noise)
 64 |         A_t = A_t * rand_noise
 65 | 
 66 |     return criterion_MSE(A_t, A_s)
 67 | 
 68 | 
 69 | def at(x):
 70 |     return F.normalize(x.pow(2).mean(1).view(x.size(0), -1))
 71 | 
 72 | 
 73 | def at_loss(x, y, rand=False, noise=0.1):
 74 |     if rand:
 75 |         rand_noise = torch.FloatTensor(y.shape).uniform_(1 - noise, 1 + noise).cuda()
 76 |         y = y * rand_noise
 77 | 
 78 |     return (at(x) - at(y)).pow(2).mean()
 79 | 
 80 | 
 81 | def FSP_loss(fea_t, short_t, fea_s, short_s, rand=False, noise=0.1):
 82 | 
 83 |     a, b, c, d = fea_t.size()
 84 |     feat = fea_t.view(a, b, c * d)
 85 |     a, b, c, d = short_t.size()
 86 |     shortt = short_t.view(a, b, c * d)
 87 |     G_t = torch.bmm(feat, shortt.permute(0, 2, 1)).div(c * d).detach()
 88 | 
 89 |     a, b, c, d = fea_s.size()
 90 |     feas = fea_s.view(a, b, c * d)
 91 |     a, b, c, d = short_s.size()
 92 |     shorts = short_s.view(a, b, c * d)
 93 |     G_s = torch.bmm(feas, shorts.permute(0, 2, 1)).div(c * d)
 94 | 
 95 |     return criterion_MSE(G_s, G_t)
 96 | 
 97 | 
 98 | def similarity_preserving_loss(A_t, A_s):
 99 |     """Given the activations for a batch of input from the teacher and student
100 |     network, calculate the similarity preserving knowledge distillation loss from the
101 |     paper Similarity-Preserving Knowledge Distillation (https://arxiv.org/abs/1907.09682)
102 |     equation 4
103 | 
104 |     Note: A_t and A_s must have the same batch size
105 | 
106 |     Parameters:
107 |         A_t (4D tensor): activation maps from the teacher network of shape b x c1 x h1 x w1
108 |         A_s (4D tensor): activation maps from the student network of shape b x c2 x h2 x w2
109 | 
110 |     Returns:
111 |         l_sp (1D tensor): similarity preserving loss value
112 | """
113 | 
114 |     # reshape the activations
115 |     b1, c1, h1, w1 = A_t.shape
116 |     b2, c2, h2, w2 = A_s.shape
117 |     assert b1 == b2, 'Dim0 (batch size) of the activation maps must be compatible'
118 | 
119 |     Q_t = A_t.reshape([b1, c1 * h1 * w1])
120 |     Q_s = A_s.reshape([b2, c2 * h2 * w2])
121 | 
122 |     # evaluate normalized similarity matrices (eq 3)
123 |     G_t = torch.mm(Q_t, Q_t.t())
124 |     # G_t = G_t / G_t.norm(p=2)
125 |     G_t = torch.nn.functional.normalize(G_t)
126 | 
127 |     G_s = torch.mm(Q_s, Q_s.t())
128 |     # G_s = G_s / G_s.norm(p=2)
129 |     G_s = torch.nn.functional.normalize(G_s)
130 | 
131 |     # calculate the similarity preserving loss (eq 4)
132 |     l_sp = (G_t - G_s).pow(2).mean()
133 | 
134 |     return l_sp
135 | 
136 | def similarity_preserving_loss_cd(A_t, A_s, A_t1, A_s1 ):
137 | 
138 |     # reshape the activations
139 |     b1, c1, h1, w1 = A_t.shape
140 |     b2, c2, h2, w2 = A_s.shape
141 |     assert b1 == b2, 'Dim0 (batch size) of the activation maps must be compatible'
142 | 
143 |     Q_t = A_t.reshape([b1, c1 * h1 * w1])
144 |     Q_s = A_s.reshape([b2, c2 * h2 * w2])
145 |     Q_t1 = A_t1.reshape([b1, c1 * h1 * w1])
146 |     Q_s1 = A_s1.reshape([b2, c2 * h2 * w2])
147 |     # evaluate normalized similarity matrices (eq 3)
148 |     G_t = torch.mm(Q_t, Q_s.t())
149 |     # G_t = G_t / G_t.norm(p=2)
150 |     G_t = torch.nn.functional.normalize(G_t)
151 | 
152 |     G_s = torch.mm(Q_t1, Q_s1.t())
153 |     # G_s = G_s / G_s.norm(p=2)
154 |     G_s = torch.nn.functional.normalize(G_s)
155 | 
156 |     # calculate the similarity preserving loss (eq 4)
157 |     l_sp = (G_t - G_s).pow(2).mean()
158 | 
159 |     return l_sp
160 | 
161 | 
162 | 
163 | class SlicedWassersteinDiscrepancy(nn.Module):
164 |     """PyTorch adoption of https://github.com/apple/ml-cvpr2019-swd"""
165 |     def __init__(self, mean=0, sd=1, device='cpu'):
166 |         super(SlicedWassersteinDiscrepancy, self).__init__()
167 |         self.dist = torch.distributions.Normal(mean, sd)
168 |         self.device = device
169 | 
170 |     def forward(self, p1, p2):
171 |         if p1.shape[1] > 1:
172 |             # For data more than one-dimensional input, perform multiple random
173 |             # projection to 1-D
174 |             proj = self.dist.sample([p1.shape[1], 128]).to(self.device)
175 |             proj *= torch.rsqrt(torch.sum(proj.pow(2), dim=0, keepdim=True))
176 | 
177 |             p1 = torch.mm(p1, proj)
178 |             p2 = torch.mm(p2, proj)
179 | 
180 |         p1, _ = torch.sort(p1, 0, descending=True)
181 |         p2, _ = torch.sort(p2, 0, descending=True)
182 | 
183 |         wdist = (p1 - p2).pow(2).mean()
184 | 
185 |         return wdist
186 | 
187 | 
188 | class RKD(object):
189 |     """
190 |     Wonpyo Park, Dongju Kim, Yan Lu, Minsu Cho.
191 |     relational knowledge distillation.
192 |     arXiv preprint arXiv:1904.05068, 2019.
193 |     """
194 |     def __init__(self, device, eval_dist_loss=True, eval_angle_loss=False):
195 |         super(RKD, self).__init__()
196 |         self.device = device
197 |         self.eval_dist_loss = eval_dist_loss
198 |         self.eval_angle_loss = eval_angle_loss
199 |         self.huber_loss = torch.nn.SmoothL1Loss()
200 | 
201 |     @staticmethod
202 |     def distance_wise_potential(x):
203 |         x_square = x.pow(2).sum(dim=-1)
204 |         prod = torch.matmul(x, x.t())
205 |         distance = torch.sqrt(
206 |             torch.clamp( torch.unsqueeze(x_square, 1) + torch.unsqueeze(x_square, 0) - 2 * prod,
207 |             min=1e-12))
208 |         mu = torch.sum(distance) / torch.sum(
209 |             torch.where(distance > 0., torch.ones_like(distance),
210 |                         torch.zeros_like(distance)))
211 | 
212 |         return distance / (mu + 1e-8)
213 | 
214 |     @staticmethod
215 |     def angle_wise_potential(x):
216 |         e = torch.unsqueeze(x, 0) - torch.unsqueeze(x, 1)
217 |         e_norm = torch.nn.functional.normalize(e, dim=2)
218 |         return torch.matmul(e_norm, torch.transpose(e_norm, -1, -2))
219 | 
220 |     def eval_loss(self, source, target):
221 | 
222 |         # Flatten tensors
223 |         source = source.reshape(source.shape[0], -1)
224 |         target = target.reshape(target.shape[0], -1)
225 | 
226 |         # normalize
227 |         source = torch.nn.functional.normalize(source, dim=1)
228 |         target = torch.nn.functional.normalize(target, dim=1)
229 | 
230 |         distance_loss = torch.tensor([0.]).to(self.device)
231 |         angle_loss = torch.tensor([0.]).to(self.device)
232 | 
233 |         if self.eval_dist_loss:
234 |             distance_loss = self.huber_loss(
235 |                 self.distance_wise_potential(source), self.distance_wise_potential(target)
236 |             )
237 | 
238 |         if self.eval_angle_loss:
239 |             angle_loss = self.huber_loss(
240 |                 self.angle_wise_potential(source), self.angle_wise_potential(target)
241 |             )
242 | 
243 |         return distance_loss, angle_loss


--------------------------------------------------------------------------------
/DSP/dataset/CMU.py:
--------------------------------------------------------------------------------
  1 | import torch
  2 | from torch.utils.data.dataset import Dataset
  3 | import numpy as np
  4 | import os
  5 | from PIL import Image
  6 | import random
  7 | from scipy.ndimage import gaussian_filter
  8 | from torchvision import transforms
  9 | from config.option import Options
 10 | import matplotlib.pyplot as plt
 11 | 
 12 | args = Options().parse()
 13 | 
 14 | 
 15 | IMG_EXTENSIONS = [
 16 |     '.jpg', '.JPG', '.jpeg', '.JPEG',
 17 |     '.png', '.PNG', '.ppm', '.PPM', '.bmp', '.BMP',
 18 | ]
 19 | 
 20 | def is_image_file(filename):
 21 |     print(filename)
 22 |     return any(filename.endswith(extension) for extension in IMG_EXTENSIONS)
 23 | 
 24 | def pil_loader(path):
 25 |     # open path as file to avoid ResourceWarning (https://github.com/python-pillow/Pillow/issues/835)
 26 |     with open(path, 'rb') as f:
 27 |         with Image.open(f) as img:
 28 |             return img.convert('RGB')
 29 | 
 30 | 
 31 | palette = [0, 0, 0,255,0,0]
 32 | 
 33 | def colorize_mask(mask):
 34 |     # mask: numpy array of the mask
 35 |     new_mask = Image.fromarray(mask.astype(np.uint8)).convert('P')
 36 |     new_mask.putpalette(palette)
 37 | 
 38 |     return new_mask
 39 | 
 40 | def get_pascal_labels():
 41 |     return np.asarray([[0,0,0],[0,0,255]])
 42 | 
 43 | def decode_segmap(temp, plot=False):
 44 | 
 45 |     label_colours = get_pascal_labels()
 46 |     r = temp.copy()
 47 |     g = temp.copy()
 48 |     b = temp.copy()
 49 |     for l in range(0, 2):
 50 |         r[temp == l] = label_colours[l, 0]
 51 |         g[temp == l] = label_colours[l, 1]
 52 |         b[temp == l] = label_colours[l, 2]
 53 | 
 54 |     rgb = np.zeros((temp.shape[0], temp.shape[1], 3))
 55 |     rgb[:, :, 0] = r
 56 |     rgb[:, :, 1] = g
 57 |     rgb[:, :, 2] = b
 58 |     #rgb = np.resize(rgb,(321,321,3))
 59 |     if plot:
 60 |         plt.imshow(rgb)
 61 |         plt.show()
 62 |     else:
 63 |         return rgb
 64 | 
 65 | 
 66 | class Dataset(Dataset):
 67 | 
 68 |     def __init__(self,data_path,split_flag, flag_type= 'ssl', transform=False, transform_med=None):
 69 |         self.size = args.img_size
 70 |         self.train_data_path = os.path.join(data_path, "struc_train")
 71 |         self.test_data_path = os.path.join(data_path, 'struc_test')
 72 |         self.img_txt_path =  os.path.join(self.train_data_path, 'train_pair.txt')
 73 |         self.test_img_txt_path =  os.path.join(self.test_data_path, 'test_pair.txt')
 74 |         print( self.test_img_txt_path)
 75 |         # # Load the text file containing image pair
 76 |         # self.imgs_path_list = np.loadtxt(self.img_txt_path,dtype=str)
 77 |         # self.test_imgs_path_list = np.loadtxt(self.test_img_txt_path,dtype=str)
 78 | 
 79 |         self.flag = split_flag
 80 |         self.flag_type = flag_type
 81 |         self.transform = transform
 82 |         self.transform_med = transform_med
 83 |         self.img_label_path_pairs = self.get_img_label_path_pairs()
 84 | 
 85 |     def get_img_label_path_pairs(self):
 86 | 
 87 |         img_label_pair_list = {}
 88 |         if self.flag =='train':
 89 |             for idx, did in enumerate(open(self.img_txt_path)):
 90 |                 try:
 91 |                     image1_name,image2_name,mask_name = did.strip("\n").split(' ')
 92 |                 except ValueError:  # Adhoc for test.
 93 |                     image_name = mask_name = did.strip("\n")
 94 |                 extract_name = image1_name[image1_name.rindex('/') +1: image1_name.rindex('.')]
 95 |                 img1_file = os.path.join(self.train_data_path, image1_name)
 96 |                 img2_file = os.path.join(self.train_data_path, image2_name)
 97 |                 lbl_file = os.path.join(self.train_data_path, mask_name)
 98 |                 img_label_pair_list.setdefault(idx, [img1_file,img2_file,lbl_file, image1_name, image2_name])
 99 | 
100 |         if self.flag == 'val':
101 |             self.label_ext = '.png'
102 |             for idx , did in enumerate(open(self.test_img_txt_path)):
103 |                 try:
104 |                     image1_name, image2_name, mask_name = did.strip("\n").split(' ')
105 |                 except ValueError:  # Adhoc for test.
106 |                     image_name = mask_name = did.strip("\n")
107 |                 # extract_name = image1_name[image1_name.rindex('/') +1: image1_name.rindex('.')]
108 |                 img1_file = os.path.join(self.test_data_path, image1_name)
109 |                 img2_file = os.path.join(self.test_data_path, image2_name)
110 |                 lbl_file = os.path.join(self.test_data_path, mask_name)
111 |                 img_label_pair_list.setdefault(idx, [img1_file, img2_file, lbl_file, image1_name, image2_name])
112 | 
113 |         return img_label_pair_list
114 | 
115 |     def data_transform(self, img1,img2,lbl):
116 |        rz = transforms.Compose([transforms.Resize(size=(512,512))])
117 |        img1 = rz(img1)
118 |        img2 = rz(img2)
119 |        lbl= transforms.ToPILImage()(lbl)
120 |        lbl = rz(lbl)
121 |        img1 = transforms.ToTensor()(img1)
122 |        img2 = transforms.ToTensor()(img2)
123 |        lbl = transforms.ToTensor()(lbl)
124 |         #lbl_reverse = torch.from_numpy(lbl_reverse).long()
125 |        return img1,img2,lbl
126 | 
127 |     def extract_instance(self, img1, img2, lbl):
128 | 
129 |         obj_mask = 1*(lbl >1)
130 |         img1 = np.array(img1)
131 |         gau_masks = gaussian_filter(obj_mask, sigma=1)
132 |         gau_masks = np.reshape(gau_masks, (gau_masks.shape[0], gau_masks.shape[1], 1))
133 |         instance = img1* gau_masks
134 |         transform = transforms.ToTensor()
135 |         img1 = transform(img1)
136 |         img2 = transform(img2)
137 |         obj_mask = transform(obj_mask)
138 |         instance = transform(instance)
139 |         return img1, img2, obj_mask, instance
140 | 
141 |     def __getitem__(self, index):
142 | 
143 |         img1_path,img2_path,label_path,filename1, filename2 = self.img_label_path_pairs[index]
144 |         # print(img1_path,filename1,filename2)
145 |         ####### load images #############
146 |         img1 = Image.open(img1_path)
147 |         img2 = Image.open(img2_path)
148 | 
149 |         label = Image.open(label_path)
150 |         label = np.array(label, dtype=np.int32)
151 | 
152 |         height,width, d = np.array(img1,dtype= np.uint8).shape
153 | 
154 |         if self.transform_med != None:
155 |             # normal simclr
156 |            img1_0, img2_0  = self.transform_med(img1, img2)
157 |            img1_1, img2_1= self.transform_med(img1,img2)
158 | 
159 |         if self.flag_type == 'ssl':
160 |             return img1_0, img1_1, img2_0, img2_1, str(filename1), str(filename2), label
161 |         elif self.flag_type == 'linear_eval':
162 |                image_dict = {'pos1': img1_0, 'pos2': img1_1, 'neg1': img2_0, 'neg2': img2_1}
163 |                type1, type2 = random.sample(list(image_dict.keys()), k=2)
164 | 
165 |                if any('pos' in s for s in [type1, type2]) and any('neg' in s for s in [type1, type2]):
166 |                    y = 1
167 |                    i1 = image_dict[type1]
168 |                    i2 = image_dict[type2]
169 |                    return i1, i2, y
170 |                else:
171 |                    y = 0
172 |                    i1 = image_dict[type1]
173 |                    i2 = image_dict[type2]
174 |                    return i1, i2, y
175 | 
176 |         ####### load labels ############
177 |         if self.flag == 'train':
178 |             label = Image.open(label_path)
179 |             # if self.transform_med != None:                # enable this during fine tuning
180 |             #     label = self.transform_med(label)
181 |             label = np.array(label,dtype=np.int32)
182 | 
183 |         if self.flag == 'val':
184 |             label = Image.open(label_path)
185 |             # if self.transform_med != None:                # enable this during fine tuning
186 |             #    label = self.transform_med(label)
187 |             label = np.array(label,dtype=np.int32)
188 | 
189 |         if self.transform :
190 |             img1, img2, label = self.data_transform(img1,img2,label) #self.extract_instance(img1, img2, label)
191 | 
192 |             return img1, img2, label
193 | 
194 | 
195 |         else:
196 |             return img1, img2, label
197 |     def __len__(self):
198 | 
199 |         return len(self.img_label_path_pairs)
200 | 
201 | 
202 | 
203 | 


--------------------------------------------------------------------------------
/DSP/dataset/PCD.py:
--------------------------------------------------------------------------------
  1 | import torch
  2 | from torch.utils.data.dataset import Dataset
  3 | import numpy as np
  4 | import os
  5 | import scipy.io
  6 | import scipy.misc as m
  7 | from PIL import Image
  8 | import random
  9 | from scipy.ndimage import gaussian_filter
 10 | from torchvision import transforms
 11 | from config.option import Options
 12 | import matplotlib.pyplot as plt
 13 | 
 14 | args = Options().parse()
 15 | 
 16 | 
 17 | class Dataset(Dataset):
 18 | 
 19 |     def __init__(self,data_path,split_flag, flag_type= 'ssl', transform=False, transform_med=None):
 20 |         self.size = args.img_size
 21 |         self.train_data_path = os.path.join(data_path, "struc_train")
 22 |         self.test_data_path = os.path.join(data_path, 'struc_test')
 23 |         self.img_txt_path =  os.path.join(self.train_data_path, 'train_pair.txt')
 24 |         self.test_img_txt_path =  os.path.join(self.test_data_path, 'test_pair.txt')
 25 |         self.flag = split_flag
 26 |         self.flag_type = flag_type
 27 |         self.transform = transform
 28 |         self.transform_med = transform_med
 29 |         self.img_label_path_pairs = self.get_img_label_path_pairs()
 30 | 
 31 |     def get_img_label_path_pairs(self):
 32 | 
 33 |         img_label_pair_list = {}
 34 |         if self.flag =='train':
 35 |             for idx, did in enumerate(open(self.img_txt_path)):
 36 |                 try:
 37 |                     image1_name,image2_name,mask_name = did.strip("\n").split(' ')
 38 |                 except ValueError:  # Adhoc for test.
 39 |                     image_name = mask_name = did.strip("\n")
 40 |                 extract_name = image1_name[image1_name.rindex('/') +1: image1_name.rindex('.')]
 41 |                 img1_file = os.path.join(self.train_data_path, image1_name)
 42 |                 img2_file = os.path.join(self.train_data_path, image2_name)
 43 |                 lbl_file = os.path.join(self.train_data_path, mask_name)
 44 |                 img_label_pair_list.setdefault(idx, [img1_file,img2_file,lbl_file, image1_name, image2_name])
 45 | 
 46 |         if self.flag == 'val':
 47 |             self.label_ext = '.png'
 48 |             for idx , did in enumerate(open(self.test_img_txt_path)):
 49 |                 try:
 50 |                     image1_name, image2_name, mask_name = did.strip("\n").split(' ')
 51 |                 except ValueError:  # Adhoc for test.
 52 |                     image_name = mask_name = did.strip("\n")
 53 |                 # extract_name = image1_name[image1_name.rindex('/') +1: image1_name.rindex('.')]
 54 |                 img1_file = os.path.join(self.test_data_path, image1_name)
 55 |                 img2_file = os.path.join(self.test_data_path, image2_name)
 56 |                 lbl_file = os.path.join(self.test_data_path, mask_name)
 57 |                 img_label_pair_list.setdefault(idx, [img1_file, img2_file, lbl_file, image1_name, image2_name])
 58 | 
 59 |         return img_label_pair_list
 60 | 
 61 |     def data_transform(self, img1,img2,lbl):
 62 |        rz = transforms.Compose([transforms.Resize(size=(512,512))])
 63 |        img1 = rz(img1)
 64 |        img2 = rz(img2)
 65 |        lbl= transforms.ToPILImage()(lbl)
 66 |        lbl = rz(lbl)
 67 |        img1 = transforms.ToTensor()(img1)
 68 |        img2 = transforms.ToTensor()(img2)
 69 |        lbl = transforms.ToTensor()(lbl)
 70 |         #lbl_reverse = torch.from_numpy(lbl_reverse).long()
 71 |        return img1,img2,lbl
 72 | 
 73 |     def extract_instance(self, img1, img2, lbl):
 74 |         #USE THIS IF YOU WANT TO CREATE  MORE IMAGES USING COPY PASTE AUGMENTTAION.
 75 |         '''This extracts the instances belonging to the changed region and paste it on exsisting images to create new images.'''
 76 |         obj_mask = 1*(lbl >1)
 77 |         img1 = np.array(img1)
 78 |         gau_masks = gaussian_filter(obj_mask, sigma=1)
 79 |         gau_masks = np.reshape(gau_masks, (gau_masks.shape[0], gau_masks.shape[1], 1))
 80 |         instance = img1* gau_masks
 81 |         transform = transforms.ToTensor()
 82 |         img1 = transform(img1)
 83 |         img2 = transform(img2)
 84 |         obj_mask = transform(obj_mask)
 85 |         instance = transform(instance)
 86 |         return img1, img2, obj_mask, instance
 87 | 
 88 |     def __getitem__(self, index):
 89 | 
 90 |         img1_path,img2_path,label_path,filename1, filename2 = self.img_label_path_pairs[index]
 91 |         ####### load images #############
 92 |         img1 = Image.open(img1_path)
 93 |         img2 = Image.open(img2_path)
 94 |         # img1 = np.asarray(img1)
 95 |         # img2 = np.asarray(img2)
 96 | 
 97 |         label = Image.open(label_path)
 98 |         label = np.array(label, dtype=np.int32)
 99 | 
100 |         height,width, d = np.array(img1,dtype= np.uint8).shape
101 | 
102 |         if self.transform_med != None:
103 |             # normal simclr
104 |            img1_0, img2_0  = self.transform_med(img1, img2)
105 |            img1_1, img2_1= self.transform_med(img1, img2)
106 |            # print(img1_1.shape)
107 |            img1_0 = np.asarray(img1_0).astype("f").transpose(2, 0, 1) / 128.0 - 1.0
108 |            img2_0 = np.asarray(img2_0).astype("f").transpose(2, 0, 1)  / 128.0 - 1.0
109 |            img1_1 = np.asarray(img1_1).astype("f").transpose(2, 0, 1) / 128.0 - 1.0
110 |            img2_1 = np.asarray(img2_1).astype("f").transpose(2, 0, 1)  / 128.0 - 1.0
111 |            img1_0 = torch.from_numpy(img1_0).float()
112 |            img1_1 = torch.from_numpy(img1_1).float()
113 |            img2_0 = torch.from_numpy(img2_0).float()
114 |            img2_1 = torch.from_numpy(img2_1).float()
115 | 
116 |         if self.flag_type == 'ssl':
117 |             return img1_0, img1_1, img2_0, img2_1, str(filename1), str(filename2), label
118 |         elif self.flag_type == 'linear_eval':
119 |                image_dict = {'pos1': img1_0, 'pos2': img1_1, 'neg1': img2_0, 'neg2': img2_1}
120 |                type1, type2 = random.sample(list(image_dict.keys()), k=2)
121 | 
122 |                if any('pos' in s for s in [type1, type2]) and any('neg' in s for s in [type1, type2]):
123 |                    y = 1
124 |                    i1 = image_dict[type1]
125 |                    i2 = image_dict[type2]
126 |                    return i1, i2, y
127 |                else:
128 |                    y = 0
129 |                    i1 = image_dict[type1]
130 |                    i2 = image_dict[type2]
131 |                    return i1, i2, y
132 | 
133 |         ####### load labels ############
134 |         if self.flag == 'train':
135 |             label = Image.open(label_path)
136 |             # if self.transform_med != None:                # enable this during fine tuning
137 |             #     label = self.transform_med(label)
138 |             label = np.array(label,dtype=np.int32)
139 | 
140 |         if self.flag == 'val':
141 |             label = Image.open(label_path)
142 |             # if self.transform_med != None:                # enable this during fine tuning
143 |             #    label = self.transform_med(label)
144 |             label = np.array(label,dtype=np.int32)
145 | 
146 |         if self.transform :
147 |             img1, img2, label = self.data_transform(img1,img2,label) #self.extract_instance(img1, img2, label)
148 | 
149 |             return img1, img2, label
150 | 
151 |         else:
152 |             return img1, img2, label
153 |     def __len__(self):
154 | 
155 |         return len(self.img_label_path_pairs)
156 | 
157 | 
158 | 
159 | 
160 | 
161 | 
162 | 


--------------------------------------------------------------------------------
/DSP/dataset/__pycache__/CMU.cpython-37.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/NeurAI-Lab/DSP/45027a3702696dafd7018802619dde17c6da1ca8/DSP/dataset/__pycache__/CMU.cpython-37.pyc


--------------------------------------------------------------------------------
/DSP/dataset/__pycache__/PCD.cpython-37.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/NeurAI-Lab/DSP/45027a3702696dafd7018802619dde17c6da1ca8/DSP/dataset/__pycache__/PCD.cpython-37.pyc


--------------------------------------------------------------------------------
/DSP/linear.py:
--------------------------------------------------------------------------------
 1 | import torch
 2 | import sys
 3 | sys.path.insert(0, '.')
 4 | from models.simclr import SimCLR
 5 | from util.test import test_all_datasets, initialize, testloaderSimCLR
 6 | import numpy as np
 7 | 
 8 | import warnings
 9 | warnings.filterwarnings("ignore", category=UserWarning)
10 | 
11 | np.random.seed(10)
12 | torch.manual_seed(10)
13 | 
14 | 
15 | if __name__ == '__main__':
16 |     args, writer = initialize()
17 |     simclr = SimCLR(args)
18 |     state_dict = torch.load(args.model_path, map_location=args.device)
19 |     simclr.load_state_dict(state_dict)
20 |     simclr = simclr.cuda()
21 |     test_all_datasets(args, writer, simclr)
22 | 


--------------------------------------------------------------------------------
/DSP/modeling/backbone/__init__.py:
--------------------------------------------------------------------------------
 1 | from modeling.backbone import resnet, xception, drn, mobilenet
 2 | 
 3 | def build_backbone(backbone, output_stride, BatchNorm):
 4 |     if backbone == 'resnet101':
 5 |         return resnet.ResNet101(output_stride, BatchNorm)
 6 |     if backbone == 'resnet50':
 7 |         return resnet.ResNet50(output_stride, BatchNorm)
 8 |     elif backbone == 'xception':
 9 |         return xception.AlignedXception(output_stride, BatchNorm)
10 |     elif backbone == 'drn':
11 |         return drn.drn_d_54(BatchNorm)
12 |     elif backbone == 'mobilenet':
13 |         return mobilenet.MobileNetV2(output_stride, BatchNorm)
14 |     else:
15 |         raise NotImplementedError
16 | 


--------------------------------------------------------------------------------
/DSP/modeling/backbone/__pycache__/__init__.cpython-37.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/NeurAI-Lab/DSP/45027a3702696dafd7018802619dde17c6da1ca8/DSP/modeling/backbone/__pycache__/__init__.cpython-37.pyc


--------------------------------------------------------------------------------
/DSP/modeling/backbone/__pycache__/drn.cpython-37.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/NeurAI-Lab/DSP/45027a3702696dafd7018802619dde17c6da1ca8/DSP/modeling/backbone/__pycache__/drn.cpython-37.pyc


--------------------------------------------------------------------------------
/DSP/modeling/backbone/__pycache__/mobilenet.cpython-37.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/NeurAI-Lab/DSP/45027a3702696dafd7018802619dde17c6da1ca8/DSP/modeling/backbone/__pycache__/mobilenet.cpython-37.pyc


--------------------------------------------------------------------------------
/DSP/modeling/backbone/__pycache__/resnet.cpython-37.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/NeurAI-Lab/DSP/45027a3702696dafd7018802619dde17c6da1ca8/DSP/modeling/backbone/__pycache__/resnet.cpython-37.pyc


--------------------------------------------------------------------------------
/DSP/modeling/backbone/__pycache__/xception.cpython-37.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/NeurAI-Lab/DSP/45027a3702696dafd7018802619dde17c6da1ca8/DSP/modeling/backbone/__pycache__/xception.cpython-37.pyc


--------------------------------------------------------------------------------
/DSP/modeling/backbone/mobilenet.py:
--------------------------------------------------------------------------------
  1 | import torch
  2 | import torch.nn.functional as F
  3 | import torch.nn as nn
  4 | import math
  5 | from modeling.sync_batchnorm.batchnorm import SynchronizedBatchNorm2d
  6 | import torch.utils.model_zoo as model_zoo
  7 | 
  8 | def conv_bn(inp, oup, stride, BatchNorm):
  9 |     return nn.Sequential(
 10 |         nn.Conv2d(inp, oup, 3, stride, 1, bias=False),
 11 |         BatchNorm(oup),
 12 |         nn.ReLU6(inplace=True)
 13 |     )
 14 | 
 15 | 
 16 | def fixed_padding(inputs, kernel_size, dilation):
 17 |     kernel_size_effective = kernel_size + (kernel_size - 1) * (dilation - 1)
 18 |     pad_total = kernel_size_effective - 1
 19 |     pad_beg = pad_total // 2
 20 |     pad_end = pad_total - pad_beg
 21 |     padded_inputs = F.pad(inputs, (pad_beg, pad_end, pad_beg, pad_end))
 22 |     return padded_inputs
 23 | 
 24 | 
 25 | class InvertedResidual(nn.Module):
 26 |     def __init__(self, inp, oup, stride, dilation, expand_ratio, BatchNorm):
 27 |         super(InvertedResidual, self).__init__()
 28 |         self.stride = stride
 29 |         assert stride in [1, 2]
 30 | 
 31 |         hidden_dim = round(inp * expand_ratio)
 32 |         self.use_res_connect = self.stride == 1 and inp == oup
 33 |         self.kernel_size = 3
 34 |         self.dilation = dilation
 35 | 
 36 |         if expand_ratio == 1:
 37 |             self.conv = nn.Sequential(
 38 |                 # dw
 39 |                 nn.Conv2d(hidden_dim, hidden_dim, 3, stride, 0, dilation, groups=hidden_dim, bias=False),
 40 |                 BatchNorm(hidden_dim),
 41 |                 nn.ReLU6(inplace=True),
 42 |                 # pw-linear
 43 |                 nn.Conv2d(hidden_dim, oup, 1, 1, 0, 1, 1, bias=False),
 44 |                 BatchNorm(oup),
 45 |             )
 46 |         else:
 47 |             self.conv = nn.Sequential(
 48 |                 # pw
 49 |                 nn.Conv2d(inp, hidden_dim, 1, 1, 0, 1, bias=False),
 50 |                 BatchNorm(hidden_dim),
 51 |                 nn.ReLU6(inplace=True),
 52 |                 # dw
 53 |                 nn.Conv2d(hidden_dim, hidden_dim, 3, stride, 0, dilation, groups=hidden_dim, bias=False),
 54 |                 BatchNorm(hidden_dim),
 55 |                 nn.ReLU6(inplace=True),
 56 |                 # pw-linear
 57 |                 nn.Conv2d(hidden_dim, oup, 1, 1, 0, 1, bias=False),
 58 |                 BatchNorm(oup),
 59 |             )
 60 | 
 61 |     def forward(self, x):
 62 |         x_pad = fixed_padding(x, self.kernel_size, dilation=self.dilation)
 63 |         if self.use_res_connect:
 64 |             x = x + self.conv(x_pad)
 65 |         else:
 66 |             x = self.conv(x_pad)
 67 |         return x
 68 | 
 69 | 
 70 | class MobileNetV2(nn.Module):
 71 |     def __init__(self, output_stride=8, BatchNorm=None, width_mult=1., pretrained=True):
 72 |         super(MobileNetV2, self).__init__()
 73 |         block = InvertedResidual
 74 |         input_channel = 32
 75 |         current_stride = 1
 76 |         rate = 1
 77 |         interverted_residual_setting = [
 78 |             # t, c, n, s
 79 |             [1, 16, 1, 1],
 80 |             [6, 24, 2, 2],
 81 |             [6, 32, 3, 2],
 82 |             [6, 64, 4, 2],
 83 |             [6, 96, 3, 1],
 84 |             [6, 160, 3, 2],
 85 |             [6, 320, 1, 1],
 86 |         ]
 87 | 
 88 |         # building first layer
 89 |         input_channel = int(input_channel * width_mult)
 90 |         self.features = [conv_bn(3, input_channel, 2, BatchNorm)]
 91 |         current_stride *= 2
 92 |         # building inverted residual blocks
 93 |         for t, c, n, s in interverted_residual_setting:
 94 |             if current_stride == output_stride:
 95 |                 stride = 1
 96 |                 dilation = rate
 97 |                 rate *= s
 98 |             else:
 99 |                 stride = s
100 |                 dilation = 1
101 |                 current_stride *= s
102 |             output_channel = int(c * width_mult)
103 |             for i in range(n):
104 |                 if i == 0:
105 |                     self.features.append(block(input_channel, output_channel, stride, dilation, t, BatchNorm))
106 |                 else:
107 |                     self.features.append(block(input_channel, output_channel, 1, dilation, t, BatchNorm))
108 |                 input_channel = output_channel
109 |         self.features = nn.Sequential(*self.features)
110 |         self._initialize_weights()
111 | 
112 |         if pretrained:
113 |             self._load_pretrained_model()
114 | 
115 |         self.low_level_features = self.features[0:4]
116 |         self.high_level_features = self.features[4:]
117 | 
118 |     def forward(self, x):
119 |         low_level_feat = self.low_level_features(x)
120 |         x = self.high_level_features(low_level_feat)
121 |         return x, low_level_feat
122 | 
123 |     def _load_pretrained_model(self):
124 |         pretrain_dict = model_zoo.load_url('http://jeff95.me/models/mobilenet_v2-6a65762b.pth')
125 |         model_dict = {}
126 |         state_dict = self.state_dict()
127 |         for k, v in pretrain_dict.items():
128 |             if k in state_dict:
129 |                 model_dict[k] = v
130 |         state_dict.update(model_dict)
131 |         self.load_state_dict(state_dict)
132 | 
133 |     def _initialize_weights(self):
134 |         for m in self.modules():
135 |             if isinstance(m, nn.Conv2d):
136 |                 # n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels
137 |                 # m.weight.data.normal_(0, math.sqrt(2. / n))
138 |                 torch.nn.init.kaiming_normal_(m.weight)
139 |             elif isinstance(m, SynchronizedBatchNorm2d):
140 |                 m.weight.data.fill_(1)
141 |                 m.bias.data.zero_()
142 |             elif isinstance(m, nn.BatchNorm2d):
143 |                 m.weight.data.fill_(1)
144 |                 m.bias.data.zero_()
145 | 
146 | if __name__ == "__main__":
147 |     input = torch.rand(1, 3, 512, 512)
148 |     model = MobileNetV2(output_stride=16, BatchNorm=nn.BatchNorm2d)
149 |     output, low_level_feat = model(input)
150 |     print(output.size())
151 |     print(low_level_feat.size())
152 | 


--------------------------------------------------------------------------------
/DSP/modeling/backbone/resnet.py:
--------------------------------------------------------------------------------
  1 | import math
  2 | import torch.nn as nn
  3 | import torch.utils.model_zoo as model_zoo
  4 | from modeling.sync_batchnorm.batchnorm import SynchronizedBatchNorm2d
  5 | 
  6 | 
  7 | 
  8 | 
  9 | __all__ = ['ResNet', 'resnet18', 'resnet34', 'resnet50', 'resnet101',
 10 |            'resnet152', 'resnext50_32x4d', 'resnext101_32x8d',
 11 |            'wide_resnet50_2', 'wide_resnet101_2']
 12 | 
 13 | 
 14 | model_urls = {
 15 |     'resnet18': 'https://download.pytorch.org/models/resnet18-5c106cde.pth',
 16 |     'resnet34': 'https://download.pytorch.org/models/resnet34-333f7ec4.pth',
 17 |     'resnet50': 'https://download.pytorch.org/models/resnet50-19c8e357.pth',
 18 |     'resnet101': 'https://download.pytorch.org/models/resnet101-5d3b4d8f.pth',
 19 |     'resnet152': 'https://download.pytorch.org/models/resnet152-b121ed2d.pth',
 20 |     'resnext50_32x4d': 'https://download.pytorch.org/models/resnext50_32x4d-7cdf4587.pth',
 21 |     'resnext101_32x8d': 'https://download.pytorch.org/models/resnext101_32x8d-8ba56ff5.pth',
 22 |     'wide_resnet50_2': 'https://download.pytorch.org/models/wide_resnet50_2-95faca4d.pth',
 23 |     'wide_resnet101_2': 'https://download.pytorch.org/models/wide_resnet101_2-32ee1156.pth',
 24 | }
 25 | 
 26 | class Bottleneck(nn.Module):
 27 |     expansion = 4
 28 | 
 29 |     def __init__(self, inplanes, planes, stride=1, dilation=1, downsample=None, BatchNorm=None):
 30 |         super(Bottleneck, self).__init__()
 31 | 
 32 |         self.conv1 = nn.Conv2d(inplanes, planes, kernel_size=1, bias=False)
 33 |         self.bn1 = BatchNorm(planes)
 34 |         self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=stride,
 35 |                                dilation=dilation, padding=dilation, bias=False)
 36 |         self.bn2 = BatchNorm(planes)
 37 |         self.conv3 = nn.Conv2d(planes, planes * 4, kernel_size=1, bias=False)
 38 |         self.bn3 = BatchNorm(planes * 4)
 39 |         self.relu = nn.ReLU(inplace=True)
 40 |         self.downsample = downsample
 41 |         self.stride = stride
 42 |         self.dilation = dilation
 43 | 
 44 |     def forward(self, x):
 45 |         residual = x
 46 | 
 47 |         out = self.conv1(x)
 48 |         out = self.bn1(out)
 49 |         out = self.relu(out)
 50 | 
 51 |         out = self.conv2(out)
 52 |         out = self.bn2(out)
 53 |         out = self.relu(out)
 54 | 
 55 |         out = self.conv3(out)
 56 |         out = self.bn3(out)
 57 | 
 58 |         if self.downsample is not None:
 59 |             residual = self.downsample(x)
 60 | 
 61 |         out += residual
 62 |         out = self.relu(out)
 63 | 
 64 |         return out
 65 | 
 66 | class ResNet(nn.Module):
 67 | 
 68 |     def __init__(self, arch, block, layers, output_stride, BatchNorm, pretrained=False):
 69 |         self.inplanes = 64
 70 |         super(ResNet, self).__init__()
 71 |         self.arch = arch
 72 |         blocks = [1, 2, 4]
 73 |         if output_stride == 16:
 74 |             strides = [1, 2, 2, 1]
 75 |             dilations = [1, 1, 1, 2]
 76 |         elif output_stride == 8:
 77 |             strides = [1, 2, 1, 1]
 78 |             dilations = [1, 1, 2, 4]
 79 |         else:
 80 |             raise NotImplementedError
 81 | 
 82 |         # Modules
 83 |         self.conv1 = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3,
 84 |                                 bias=False)
 85 |         self.bn1 = BatchNorm(64)
 86 |         self.relu = nn.ReLU(inplace=True)
 87 |         self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
 88 | 
 89 |         self.layer1 = self._make_layer(block, 64, layers[0], stride=strides[0], dilation=dilations[0], BatchNorm=BatchNorm)
 90 |         self.layer2 = self._make_layer(block, 128, layers[1], stride=strides[1], dilation=dilations[1], BatchNorm=BatchNorm)
 91 |         self.layer3 = self._make_layer(block, 256, layers[2], stride=strides[2], dilation=dilations[2], BatchNorm=BatchNorm)
 92 |         self.layer4 = self._make_MG_unit(block, 512, blocks=blocks, stride=strides[3], dilation=dilations[3], BatchNorm=BatchNorm)
 93 |         # self.layer4 = self._make_layer(block, 512, layers[3], stride=strides[3], dilation=dilations[3], BatchNorm=BatchNorm)
 94 |         self._init_weight()
 95 | 
 96 | 
 97 |         if pretrained:
 98 |             self._load_pretrained_model(self.arch)
 99 | 
100 |     def _make_layer(self, block, planes, blocks, stride=1, dilation=1, BatchNorm=None):
101 |         downsample = None
102 |         if stride != 1 or self.inplanes != planes * block.expansion:
103 |             downsample = nn.Sequential(
104 |                 nn.Conv2d(self.inplanes, planes * block.expansion,
105 |                           kernel_size=1, stride=stride, bias=False),
106 |                 BatchNorm(planes * block.expansion),
107 |             )
108 | 
109 |         layers = []
110 |         layers.append(block(self.inplanes, planes, stride, dilation, downsample, BatchNorm))
111 |         self.inplanes = planes * block.expansion
112 |         for i in range(1, blocks):
113 |             layers.append(block(self.inplanes, planes, dilation=dilation, BatchNorm=BatchNorm))
114 | 
115 |         return nn.Sequential(*layers)
116 | 
117 |     def _make_MG_unit(self, block, planes, blocks, stride=1, dilation=1, BatchNorm=None):
118 |         downsample = None
119 |         if stride != 1 or self.inplanes != planes * block.expansion:
120 |             downsample = nn.Sequential(
121 |                 nn.Conv2d(self.inplanes, planes * block.expansion,
122 |                           kernel_size=1, stride=stride, bias=False),
123 |                 BatchNorm(planes * block.expansion),
124 |             )
125 | 
126 |         layers = []
127 |         layers.append(block(self.inplanes, planes, stride, dilation=blocks[0]*dilation,
128 |                             downsample=downsample, BatchNorm=BatchNorm))
129 |         self.inplanes = planes * block.expansion
130 |         for i in range(1, len(blocks)):
131 |             layers.append(block(self.inplanes, planes, stride=1,
132 |                                 dilation=blocks[i]*dilation, BatchNorm=BatchNorm))
133 | 
134 |         return nn.Sequential(*layers)
135 | 
136 |     def forward(self, input):
137 |         x = self.conv1(input)
138 |         x = self.bn1(x)
139 |         x = self.relu(x)
140 |         x = self.maxpool(x)
141 | 
142 |         x = self.layer1(x)
143 |         low_level_feat = x
144 |         x = self.layer2(x)
145 |         x = self.layer3(x)
146 |         x = self.layer4(x)
147 |         return x, low_level_feat
148 | 
149 |     def _init_weight(self):
150 |         for m in self.modules():
151 |             if isinstance(m, nn.Conv2d):
152 |                 n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels
153 |                 m.weight.data.normal_(0, math.sqrt(2. / n))
154 | 
155 |             elif isinstance(m, SynchronizedBatchNorm2d):
156 |                 m.weight.data.fill_(1)
157 |                 m.bias.data.zero_()
158 | 
159 |             elif isinstance(m, nn.BatchNorm2d):
160 |                 m.weight.data.fill_(1)
161 |                 m.bias.data.zero_()
162 | 
163 | 
164 |     def _load_pretrained_model(self, arch):
165 | 
166 |         pretrain_dict = model_zoo.load_url(model_urls[arch])
167 | 
168 |         model_dict = {}
169 |         state_dict = self.state_dict()
170 |         for k, v in pretrain_dict.items():
171 |             if k in state_dict:
172 |                 model_dict[k] = v
173 |         state_dict.update(model_dict)
174 |         self.load_state_dict(state_dict)
175 | 
176 | 
177 | 
178 | def ResNet101(output_stride, BatchNorm, pretrained=True):
179 |     """Constructs a ResNet-101 model.
180 |     Args:
181 |         pretrained (bool): If True, returns a model pre-trained on ImageNet
182 |     """
183 |     model = ResNet('resnet101', Bottleneck, [3, 4, 23, 3], output_stride, BatchNorm, pretrained=pretrained)
184 |     return model
185 | 
186 | def ResNet50( output_stride, BatchNorm, pretrained=False):
187 |     r"""ResNet-50 model from
188 |     `"Deep Residual Learning for Image Recognition" <https://arxiv.org/pdf/1512.03385.pdf>`_
189 |     Args:
190 |         pretrained (bool): If True, returns a model pre-trained on ImageNet
191 |         progress (bool): If True, displays a progress bar of the download to stderr
192 |         quantize (bool): If True, return a quantized version of the model
193 |     """
194 |     model = ResNet('resnet50', Bottleneck, [3, 4, 6, 3], output_stride, BatchNorm, pretrained=pretrained)
195 |     return model
196 | 
197 | 
198 | 
199 | def wide_ResNet50_2( output_stride, BatchNorm, pretrained=True):
200 |     r"""Wide ResNet-50-2 model from
201 |     `"Wide Residual Networks" <https://arxiv.org/pdf/1605.07146.pdf>`_
202 |     The model is the same as ResNet except for the bottleneck number of channels
203 |     which is twice larger in every block. The number of channels in outer 1x1
204 |     convolutions is the same, e.g. last block in ResNet-50 has 2048-512-2048
205 |     channels, and in Wide ResNet-50-2 has 2048-1024-2048.
206 |     Args:
207 |         pretrained (bool): If True, returns a model pre-trained on ImageNet
208 |         progress (bool): If True, displays a progress bar of the download to stderr
209 |     """
210 |     kwargs['width_per_group'] = 64 * 2
211 |     return ResNet('wide_resnet50_2', Bottleneck, [3, 4, 6, 3],
212 |                    output_stride, BatchNorm, pretrained=pretrained)
213 | 
214 | 
215 | 
216 | if __name__ == "__main__":
217 |     import torch
218 |     model = ResNet50(BatchNorm=nn.BatchNorm2d, pretrained=True, output_stride=8)   #nn.BatchNorm2d, or Syncbatch norm
219 |     input = torch.rand(1, 3, 512, 512)
220 |     output, low_level_feat = model(input)
221 |     print(output.size())
222 |     print(low_level_feat.size())


--------------------------------------------------------------------------------
/DSP/modeling/backbone/xception.py:
--------------------------------------------------------------------------------
  1 | import math
  2 | import torch
  3 | import torch.nn as nn
  4 | import torch.nn.functional as F
  5 | import torch.utils.model_zoo as model_zoo
  6 | from modeling.sync_batchnorm.batchnorm import SynchronizedBatchNorm2d
  7 | 
  8 | def fixed_padding(inputs, kernel_size, dilation):
  9 |     kernel_size_effective = kernel_size + (kernel_size - 1) * (dilation - 1)
 10 |     pad_total = kernel_size_effective - 1
 11 |     pad_beg = pad_total // 2
 12 |     pad_end = pad_total - pad_beg
 13 |     padded_inputs = F.pad(inputs, (pad_beg, pad_end, pad_beg, pad_end))
 14 |     return padded_inputs
 15 | 
 16 | 
 17 | class SeparableConv2d(nn.Module):
 18 |     def __init__(self, inplanes, planes, kernel_size=3, stride=1, dilation=1, bias=False, BatchNorm=None):
 19 |         super(SeparableConv2d, self).__init__()
 20 | 
 21 |         self.conv1 = nn.Conv2d(inplanes, inplanes, kernel_size, stride, 0, dilation,
 22 |                                groups=inplanes, bias=bias)
 23 |         self.bn = BatchNorm(inplanes)
 24 |         self.pointwise = nn.Conv2d(inplanes, planes, 1, 1, 0, 1, 1, bias=bias)
 25 | 
 26 |     def forward(self, x):
 27 |         x = fixed_padding(x, self.conv1.kernel_size[0], dilation=self.conv1.dilation[0])
 28 |         x = self.conv1(x)
 29 |         x = self.bn(x)
 30 |         x = self.pointwise(x)
 31 |         return x
 32 | 
 33 | 
 34 | class Block(nn.Module):
 35 |     def __init__(self, inplanes, planes, reps, stride=1, dilation=1, BatchNorm=None,
 36 |                  start_with_relu=True, grow_first=True, is_last=False):
 37 |         super(Block, self).__init__()
 38 | 
 39 |         if planes != inplanes or stride != 1:
 40 |             self.skip = nn.Conv2d(inplanes, planes, 1, stride=stride, bias=False)
 41 |             self.skipbn = BatchNorm(planes)
 42 |         else:
 43 |             self.skip = None
 44 | 
 45 |         self.relu = nn.ReLU(inplace=True)
 46 |         rep = []
 47 | 
 48 |         filters = inplanes
 49 |         if grow_first:
 50 |             rep.append(self.relu)
 51 |             rep.append(SeparableConv2d(inplanes, planes, 3, 1, dilation, BatchNorm=BatchNorm))
 52 |             rep.append(BatchNorm(planes))
 53 |             filters = planes
 54 | 
 55 |         for i in range(reps - 1):
 56 |             rep.append(self.relu)
 57 |             rep.append(SeparableConv2d(filters, filters, 3, 1, dilation, BatchNorm=BatchNorm))
 58 |             rep.append(BatchNorm(filters))
 59 | 
 60 |         if not grow_first:
 61 |             rep.append(self.relu)
 62 |             rep.append(SeparableConv2d(inplanes, planes, 3, 1, dilation, BatchNorm=BatchNorm))
 63 |             rep.append(BatchNorm(planes))
 64 | 
 65 |         if stride != 1:
 66 |             rep.append(self.relu)
 67 |             rep.append(SeparableConv2d(planes, planes, 3, 2, BatchNorm=BatchNorm))
 68 |             rep.append(BatchNorm(planes))
 69 | 
 70 |         if stride == 1 and is_last:
 71 |             rep.append(self.relu)
 72 |             rep.append(SeparableConv2d(planes, planes, 3, 1, BatchNorm=BatchNorm))
 73 |             rep.append(BatchNorm(planes))
 74 | 
 75 |         if not start_with_relu:
 76 |             rep = rep[1:]
 77 | 
 78 |         self.rep = nn.Sequential(*rep)
 79 | 
 80 |     def forward(self, inp):
 81 |         x = self.rep(inp)
 82 | 
 83 |         if self.skip is not None:
 84 |             skip = self.skip(inp)
 85 |             skip = self.skipbn(skip)
 86 |         else:
 87 |             skip = inp
 88 | 
 89 |         x = x + skip
 90 | 
 91 |         return x
 92 | 
 93 | 
 94 | class AlignedXception(nn.Module):
 95 |     """
 96 |     Modified Alighed Xception
 97 |     """
 98 |     def __init__(self, output_stride, BatchNorm,
 99 |                  pretrained=True):
100 |         super(AlignedXception, self).__init__()
101 | 
102 |         if output_stride == 16:
103 |             entry_block3_stride = 2
104 |             middle_block_dilation = 1
105 |             exit_block_dilations = (1, 2)
106 |         elif output_stride == 8:
107 |             entry_block3_stride = 1
108 |             middle_block_dilation = 2
109 |             exit_block_dilations = (2, 4)
110 |         else:
111 |             raise NotImplementedError
112 | 
113 | 
114 |         # Entry flow
115 |         self.conv1 = nn.Conv2d(3, 32, 3, stride=2, padding=1, bias=False)
116 |         self.bn1 = BatchNorm(32)
117 |         self.relu = nn.ReLU(inplace=True)
118 | 
119 |         self.conv2 = nn.Conv2d(32, 64, 3, stride=1, padding=1, bias=False)
120 |         self.bn2 = BatchNorm(64)
121 | 
122 |         self.block1 = Block(64, 128, reps=2, stride=2, BatchNorm=BatchNorm, start_with_relu=False)
123 |         self.block2 = Block(128, 256, reps=2, stride=2, BatchNorm=BatchNorm, start_with_relu=False,
124 |                             grow_first=True)
125 |         self.block3 = Block(256, 728, reps=2, stride=entry_block3_stride, BatchNorm=BatchNorm,
126 |                             start_with_relu=True, grow_first=True, is_last=True)
127 | 
128 |         # Middle flow
129 |         self.block4  = Block(728, 728, reps=3, stride=1, dilation=middle_block_dilation,
130 |                              BatchNorm=BatchNorm, start_with_relu=True, grow_first=True)
131 |         self.block5  = Block(728, 728, reps=3, stride=1, dilation=middle_block_dilation,
132 |                              BatchNorm=BatchNorm, start_with_relu=True, grow_first=True)
133 |         self.block6  = Block(728, 728, reps=3, stride=1, dilation=middle_block_dilation,
134 |                              BatchNorm=BatchNorm, start_with_relu=True, grow_first=True)
135 |         self.block7  = Block(728, 728, reps=3, stride=1, dilation=middle_block_dilation,
136 |                              BatchNorm=BatchNorm, start_with_relu=True, grow_first=True)
137 |         self.block8  = Block(728, 728, reps=3, stride=1, dilation=middle_block_dilation,
138 |                              BatchNorm=BatchNorm, start_with_relu=True, grow_first=True)
139 |         self.block9  = Block(728, 728, reps=3, stride=1, dilation=middle_block_dilation,
140 |                              BatchNorm=BatchNorm, start_with_relu=True, grow_first=True)
141 |         self.block10 = Block(728, 728, reps=3, stride=1, dilation=middle_block_dilation,
142 |                              BatchNorm=BatchNorm, start_with_relu=True, grow_first=True)
143 |         self.block11 = Block(728, 728, reps=3, stride=1, dilation=middle_block_dilation,
144 |                              BatchNorm=BatchNorm, start_with_relu=True, grow_first=True)
145 |         self.block12 = Block(728, 728, reps=3, stride=1, dilation=middle_block_dilation,
146 |                              BatchNorm=BatchNorm, start_with_relu=True, grow_first=True)
147 |         self.block13 = Block(728, 728, reps=3, stride=1, dilation=middle_block_dilation,
148 |                              BatchNorm=BatchNorm, start_with_relu=True, grow_first=True)
149 |         self.block14 = Block(728, 728, reps=3, stride=1, dilation=middle_block_dilation,
150 |                              BatchNorm=BatchNorm, start_with_relu=True, grow_first=True)
151 |         self.block15 = Block(728, 728, reps=3, stride=1, dilation=middle_block_dilation,
152 |                              BatchNorm=BatchNorm, start_with_relu=True, grow_first=True)
153 |         self.block16 = Block(728, 728, reps=3, stride=1, dilation=middle_block_dilation,
154 |                              BatchNorm=BatchNorm, start_with_relu=True, grow_first=True)
155 |         self.block17 = Block(728, 728, reps=3, stride=1, dilation=middle_block_dilation,
156 |                              BatchNorm=BatchNorm, start_with_relu=True, grow_first=True)
157 |         self.block18 = Block(728, 728, reps=3, stride=1, dilation=middle_block_dilation,
158 |                              BatchNorm=BatchNorm, start_with_relu=True, grow_first=True)
159 |         self.block19 = Block(728, 728, reps=3, stride=1, dilation=middle_block_dilation,
160 |                              BatchNorm=BatchNorm, start_with_relu=True, grow_first=True)
161 | 
162 |         # Exit flow
163 |         self.block20 = Block(728, 1024, reps=2, stride=1, dilation=exit_block_dilations[0],
164 |                              BatchNorm=BatchNorm, start_with_relu=True, grow_first=False, is_last=True)
165 | 
166 |         self.conv3 = SeparableConv2d(1024, 1536, 3, stride=1, dilation=exit_block_dilations[1], BatchNorm=BatchNorm)
167 |         self.bn3 = BatchNorm(1536)
168 | 
169 |         self.conv4 = SeparableConv2d(1536, 1536, 3, stride=1, dilation=exit_block_dilations[1], BatchNorm=BatchNorm)
170 |         self.bn4 = BatchNorm(1536)
171 | 
172 |         self.conv5 = SeparableConv2d(1536, 2048, 3, stride=1, dilation=exit_block_dilations[1], BatchNorm=BatchNorm)
173 |         self.bn5 = BatchNorm(2048)
174 | 
175 |         # Init weights
176 |         self._init_weight()
177 | 
178 |         # Load pretrained model
179 |         if pretrained:
180 |             self._load_pretrained_model()
181 | 
182 |     def forward(self, x):
183 |         # Entry flow
184 |         x = self.conv1(x)
185 |         x = self.bn1(x)
186 |         x = self.relu(x)
187 | 
188 |         x = self.conv2(x)
189 |         x = self.bn2(x)
190 |         x = self.relu(x)
191 | 
192 |         x = self.block1(x)
193 |         # add relu here
194 |         x = self.relu(x)
195 |         low_level_feat = x
196 |         x = self.block2(x)
197 |         x = self.block3(x)
198 | 
199 |         # Middle flow
200 |         x = self.block4(x)
201 |         x = self.block5(x)
202 |         x = self.block6(x)
203 |         x = self.block7(x)
204 |         x = self.block8(x)
205 |         x = self.block9(x)
206 |         x = self.block10(x)
207 |         x = self.block11(x)
208 |         x = self.block12(x)
209 |         x = self.block13(x)
210 |         x = self.block14(x)
211 |         x = self.block15(x)
212 |         x = self.block16(x)
213 |         x = self.block17(x)
214 |         x = self.block18(x)
215 |         x = self.block19(x)
216 | 
217 |         # Exit flow
218 |         x = self.block20(x)
219 |         x = self.relu(x)
220 |         x = self.conv3(x)
221 |         x = self.bn3(x)
222 |         x = self.relu(x)
223 | 
224 |         x = self.conv4(x)
225 |         x = self.bn4(x)
226 |         x = self.relu(x)
227 | 
228 |         x = self.conv5(x)
229 |         x = self.bn5(x)
230 |         x = self.relu(x)
231 | 
232 |         return x, low_level_feat
233 | 
234 |     def _init_weight(self):
235 |         for m in self.modules():
236 |             if isinstance(m, nn.Conv2d):
237 |                 n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels
238 |                 m.weight.data.normal_(0, math.sqrt(2. / n))
239 |             elif isinstance(m, SynchronizedBatchNorm2d):
240 |                 m.weight.data.fill_(1)
241 |                 m.bias.data.zero_()
242 |             elif isinstance(m, nn.BatchNorm2d):
243 |                 m.weight.data.fill_(1)
244 |                 m.bias.data.zero_()
245 | 
246 | 
247 |     def _load_pretrained_model(self):
248 |         pretrain_dict = model_zoo.load_url('http://data.lip6.fr/cadene/pretrainedmodels/xception-b5690688.pth')
249 |         model_dict = {}
250 |         state_dict = self.state_dict()
251 | 
252 |         for k, v in pretrain_dict.items():
253 |             if k in state_dict:
254 |                 if 'pointwise' in k:
255 |                     v = v.unsqueeze(-1).unsqueeze(-1)
256 |                 if k.startswith('block11'):
257 |                     model_dict[k] = v
258 |                     model_dict[k.replace('block11', 'block12')] = v
259 |                     model_dict[k.replace('block11', 'block13')] = v
260 |                     model_dict[k.replace('block11', 'block14')] = v
261 |                     model_dict[k.replace('block11', 'block15')] = v
262 |                     model_dict[k.replace('block11', 'block16')] = v
263 |                     model_dict[k.replace('block11', 'block17')] = v
264 |                     model_dict[k.replace('block11', 'block18')] = v
265 |                     model_dict[k.replace('block11', 'block19')] = v
266 |                 elif k.startswith('block12'):
267 |                     model_dict[k.replace('block12', 'block20')] = v
268 |                 elif k.startswith('bn3'):
269 |                     model_dict[k] = v
270 |                     model_dict[k.replace('bn3', 'bn4')] = v
271 |                 elif k.startswith('conv4'):
272 |                     model_dict[k.replace('conv4', 'conv5')] = v
273 |                 elif k.startswith('bn4'):
274 |                     model_dict[k.replace('bn4', 'bn5')] = v
275 |                 else:
276 |                     model_dict[k] = v
277 |         state_dict.update(model_dict)
278 |         self.load_state_dict(state_dict)
279 | 
280 | 
281 | 
282 | if __name__ == "__main__":
283 |     import torch
284 |     model = AlignedXception(BatchNorm=nn.BatchNorm2d, pretrained=True, output_stride=16)
285 |     input = torch.rand(1, 3, 512, 512)
286 |     output, low_level_feat = model(input)
287 |     print(output.size())
288 |     print(low_level_feat.size())
289 | 


--------------------------------------------------------------------------------
/DSP/modeling/sync_batchnorm/__init__.py:
--------------------------------------------------------------------------------
 1 | # -*- coding: utf-8 -*-
 2 | # File   : __init__.py
 3 | # Author : Jiayuan Mao
 4 | # Email  : maojiayuan@gmail.com
 5 | # Date   : 27/01/2018
 6 | #
 7 | # This file is part of Synchronized-BatchNorm-PyTorch.
 8 | # https://github.com/vacancy/Synchronized-BatchNorm-PyTorch
 9 | # Distributed under MIT License.
10 | 
11 | from .batchnorm import SynchronizedBatchNorm1d, SynchronizedBatchNorm2d, SynchronizedBatchNorm3d
12 | from .replicate import DataParallelWithCallback, patch_replication_callback


--------------------------------------------------------------------------------
/DSP/modeling/sync_batchnorm/__pycache__/__init__.cpython-37.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/NeurAI-Lab/DSP/45027a3702696dafd7018802619dde17c6da1ca8/DSP/modeling/sync_batchnorm/__pycache__/__init__.cpython-37.pyc


--------------------------------------------------------------------------------
/DSP/modeling/sync_batchnorm/__pycache__/batchnorm.cpython-37.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/NeurAI-Lab/DSP/45027a3702696dafd7018802619dde17c6da1ca8/DSP/modeling/sync_batchnorm/__pycache__/batchnorm.cpython-37.pyc


--------------------------------------------------------------------------------
/DSP/modeling/sync_batchnorm/__pycache__/comm.cpython-37.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/NeurAI-Lab/DSP/45027a3702696dafd7018802619dde17c6da1ca8/DSP/modeling/sync_batchnorm/__pycache__/comm.cpython-37.pyc


--------------------------------------------------------------------------------
/DSP/modeling/sync_batchnorm/__pycache__/replicate.cpython-37.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/NeurAI-Lab/DSP/45027a3702696dafd7018802619dde17c6da1ca8/DSP/modeling/sync_batchnorm/__pycache__/replicate.cpython-37.pyc


--------------------------------------------------------------------------------
/DSP/modeling/sync_batchnorm/comm.py:
--------------------------------------------------------------------------------
  1 | # -*- coding: utf-8 -*-
  2 | # File   : comm.py
  3 | # Author : Jiayuan Mao
  4 | # Email  : maojiayuan@gmail.com
  5 | # Date   : 27/01/2018
  6 | #
  7 | # This file is part of Synchronized-BatchNorm-PyTorch.
  8 | # https://github.com/vacancy/Synchronized-BatchNorm-PyTorch
  9 | # Distributed under MIT License.
 10 | 
 11 | import queue
 12 | import collections
 13 | import threading
 14 | 
 15 | __all__ = ['FutureResult', 'SlavePipe', 'SyncMaster']
 16 | 
 17 | 
 18 | class FutureResult(object):
 19 |     """A thread-safe future implementation. Used only as one-to-one pipe."""
 20 | 
 21 |     def __init__(self):
 22 |         self._result = None
 23 |         self._lock = threading.Lock()
 24 |         self._cond = threading.Condition(self._lock)
 25 | 
 26 |     def put(self, result):
 27 |         with self._lock:
 28 |             assert self._result is None, 'Previous result has\'t been fetched.'
 29 |             self._result = result
 30 |             self._cond.notify()
 31 | 
 32 |     def get(self):
 33 |         with self._lock:
 34 |             if self._result is None:
 35 |                 self._cond.wait()
 36 | 
 37 |             res = self._result
 38 |             self._result = None
 39 |             return res
 40 | 
 41 | 
 42 | _MasterRegistry = collections.namedtuple('MasterRegistry', ['result'])
 43 | _SlavePipeBase = collections.namedtuple('_SlavePipeBase', ['identifier', 'queue', 'result'])
 44 | 
 45 | 
 46 | class SlavePipe(_SlavePipeBase):
 47 |     """Pipe for master-slave communication."""
 48 | 
 49 |     def run_slave(self, msg):
 50 |         self.queue.put((self.identifier, msg))
 51 |         ret = self.result.get()
 52 |         self.queue.put(True)
 53 |         return ret
 54 | 
 55 | 
 56 | class SyncMaster(object):
 57 |     """An abstract `SyncMaster` object.
 58 |     - During the replication, as the data parallel will trigger an callback of each module, all slave devices should
 59 |     call `register(id)` and obtain an `SlavePipe` to communicate with the master.
 60 |     - During the forward pass, master device invokes `run_master`, all messages from slave devices will be collected,
 61 |     and passed to a registered callback.
 62 |     - After receiving the messages, the master device should gather the information and determine to message passed
 63 |     back to each slave devices.
 64 |     """
 65 | 
 66 |     def __init__(self, master_callback):
 67 |         """
 68 |         Args:
 69 |             master_callback: a callback to be invoked after having collected messages from slave devices.
 70 |         """
 71 |         self._master_callback = master_callback
 72 |         self._queue = queue.Queue()
 73 |         self._registry = collections.OrderedDict()
 74 |         self._activated = False
 75 | 
 76 |     def __getstate__(self):
 77 |         return {'master_callback': self._master_callback}
 78 | 
 79 |     def __setstate__(self, state):
 80 |         self.__init__(state['master_callback'])
 81 | 
 82 |     def register_slave(self, identifier):
 83 |         """
 84 |         Register an slave device.
 85 |         Args:
 86 |             identifier: an identifier, usually is the device id.
 87 |         Returns: a `SlavePipe` object which can be used to communicate with the master device.
 88 |         """
 89 |         if self._activated:
 90 |             assert self._queue.empty(), 'Queue is not clean before next initialization.'
 91 |             self._activated = False
 92 |             self._registry.clear()
 93 |         future = FutureResult()
 94 |         self._registry[identifier] = _MasterRegistry(future)
 95 |         return SlavePipe(identifier, self._queue, future)
 96 | 
 97 |     def run_master(self, master_msg):
 98 |         """
 99 |         Main entry for the master device in each forward pass.
100 |         The messages were first collected from each devices (including the master device), and then
101 |         an callback will be invoked to compute the message to be sent back to each devices
102 |         (including the master device).
103 |         Args:
104 |             master_msg: the message that the master want to send to itself. This will be placed as the first
105 |             message when calling `master_callback`. For detailed usage, see `_SynchronizedBatchNorm` for an example.
106 |         Returns: the message to be sent back to the master device.
107 |         """
108 |         self._activated = True
109 | 
110 |         intermediates = [(0, master_msg)]
111 |         for i in range(self.nr_slaves):
112 |             intermediates.append(self._queue.get())
113 | 
114 |         results = self._master_callback(intermediates)
115 |         assert results[0][0] == 0, 'The first result should belongs to the master.'
116 | 
117 |         for i, res in results:
118 |             if i == 0:
119 |                 continue
120 |             self._registry[i].result.put(res)
121 | 
122 |         for i in range(self.nr_slaves):
123 |             assert self._queue.get() is True
124 | 
125 |         return results[0][1]
126 | 
127 |     @property
128 |     def nr_slaves(self):
129 |         return len(self._registry)
130 | 


--------------------------------------------------------------------------------
/DSP/modeling/sync_batchnorm/replicate.py:
--------------------------------------------------------------------------------
 1 | # -*- coding: utf-8 -*-
 2 | # File   : replicate.py
 3 | # Author : Jiayuan Mao
 4 | # Email  : maojiayuan@gmail.com
 5 | # Date   : 27/01/2018
 6 | #
 7 | # This file is part of Synchronized-BatchNorm-PyTorch.
 8 | # https://github.com/vacancy/Synchronized-BatchNorm-PyTorch
 9 | # Distributed under MIT License.
10 | 
11 | import functools
12 | 
13 | from torch.nn.parallel.data_parallel import DataParallel
14 | 
15 | __all__ = [
16 |     'CallbackContext',
17 |     'execute_replication_callbacks',
18 |     'DataParallelWithCallback',
19 |     'patch_replication_callback'
20 | ]
21 | 
22 | 
23 | class CallbackContext(object):
24 |     pass
25 | 
26 | 
27 | def execute_replication_callbacks(modules):
28 |     """
29 |     Execute an replication callback `__data_parallel_replicate__` on each module created by original replication.
30 |     The callback will be invoked with arguments `__data_parallel_replicate__(ctx, copy_id)`
31 |     Note that, as all modules are isomorphism, we assign each sub-module with a context
32 |     (shared among multiple copies of this module on different devices).
33 |     Through this context, different copies can share some information.
34 |     We guarantee that the callback on the master copy (the first copy) will be called ahead of calling the callback
35 |     of any slave copies.
36 |     """
37 |     master_copy = modules[0]
38 |     nr_modules = len(list(master_copy.modules()))
39 |     ctxs = [CallbackContext() for _ in range(nr_modules)]
40 | 
41 |     for i, module in enumerate(modules):
42 |         for j, m in enumerate(module.modules()):
43 |             if hasattr(m, '__data_parallel_replicate__'):
44 |                 m.__data_parallel_replicate__(ctxs[j], i)
45 | 
46 | 
47 | class DataParallelWithCallback(DataParallel):
48 |     """
49 |     Data Parallel with a replication callback.
50 |     An replication callback `__data_parallel_replicate__` of each module will be invoked after being created by
51 |     original `replicate` function.
52 |     The callback will be invoked with arguments `__data_parallel_replicate__(ctx, copy_id)`
53 |     Examples:
54 |         > sync_bn = SynchronizedBatchNorm1d(10, eps=1e-5, affine=False)
55 |         > sync_bn = DataParallelWithCallback(sync_bn, device_ids=[0, 1])
56 |         # sync_bn.__data_parallel_replicate__ will be invoked.
57 |     """
58 | 
59 |     def replicate(self, module, device_ids):
60 |         modules = super(DataParallelWithCallback, self).replicate(module, device_ids)
61 |         execute_replication_callbacks(modules)
62 |         return modules
63 | 
64 | 
65 | def patch_replication_callback(data_parallel):
66 |     """
67 |     Monkey-patch an existing `DataParallel` object. Add the replication callback.
68 |     Useful when you have customized `DataParallel` implementation.
69 |     Examples:
70 |         > sync_bn = SynchronizedBatchNorm1d(10, eps=1e-5, affine=False)
71 |         > sync_bn = DataParallel(sync_bn, device_ids=[0, 1])
72 |         > patch_replication_callback(sync_bn)
73 |         # this is equivalent to
74 |         > sync_bn = SynchronizedBatchNorm1d(10, eps=1e-5, affine=False)
75 |         > sync_bn = DataParallelWithCallback(sync_bn, device_ids=[0, 1])
76 |     """
77 | 
78 |     assert isinstance(data_parallel, DataParallel)
79 | 
80 |     old_replicate = data_parallel.replicate
81 | 
82 |     @functools.wraps(old_replicate)
83 |     def new_replicate(module, device_ids):
84 |         modules = old_replicate(module, device_ids)
85 |         execute_replication_callbacks(modules)
86 |         return modules
87 | 
88 |     data_parallel.replicate = new_replicate


--------------------------------------------------------------------------------
/DSP/modeling/sync_batchnorm/unittest.py:
--------------------------------------------------------------------------------
 1 | # -*- coding: utf-8 -*-
 2 | # File   : unittest.py
 3 | # Author : Jiayuan Mao
 4 | # Email  : maojiayuan@gmail.com
 5 | # Date   : 27/01/2018
 6 | #
 7 | # This file is part of Synchronized-BatchNorm-PyTorch.
 8 | # https://github.com/vacancy/Synchronized-BatchNorm-PyTorch
 9 | # Distributed under MIT License.
10 | 
11 | import unittest
12 | 
13 | import numpy as np
14 | from torch.autograd import Variable
15 | 
16 | 
17 | def as_numpy(v):
18 |     if isinstance(v, Variable):
19 |         v = v.data
20 |     return v.cpu().numpy()
21 | 
22 | 
23 | class TorchTestCase(unittest.TestCase):
24 |     def assertTensorClose(self, a, b, atol=1e-3, rtol=1e-3):
25 |         npa, npb = as_numpy(a), as_numpy(b)
26 |         self.assertTrue(
27 |                 np.allclose(npa, npb, atol=atol),
28 |                 'Tensor close check failed\n{}\n{}\nadiff={}, rdiff={}'.format(a, b, np.abs(npa - npb).max(), np.abs((npa - npb) / np.fmax(npa, 1e-5)).max())
29 |         )
30 | 


--------------------------------------------------------------------------------
/DSP/models/__pycache__/simclr.cpython-37.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/NeurAI-Lab/DSP/45027a3702696dafd7018802619dde17c6da1ca8/DSP/models/__pycache__/simclr.cpython-37.pyc


--------------------------------------------------------------------------------
/DSP/models/simclr.py:
--------------------------------------------------------------------------------
  1 | import torch.nn as nn
  2 | import torchvision
  3 | import torch
  4 | import torch.nn.functional as F
  5 | from copy import deepcopy
  6 | from modeling.backbone.resnet import ResNet50
  7 | 
  8 | 
  9 | 
 10 | 
 11 | class SimCLR(nn.Module):
 12 |     def __init__(self, args):
 13 |         super(SimCLR, self).__init__()
 14 |         self.m_backbone = args.m_backbone
 15 |         # self.dense_head = args.dense_head
 16 |         self.m = args.m_update
 17 |         self.encoder_type = args.encoder
 18 |         self.dense_cl = args.dense_cl
 19 |         self.f = get_encoder(args.backbone, args.pre_train, args.output_stride, args.encoder)
 20 |         self.dense_neck = DenseCLNeck(in_channels=2048, hid_channels=512, out_channels=1, num_grid=None)
 21 | 
 22 |         self.pool = nn.AdaptiveAvgPool2d(1)
 23 | 
 24 | 
 25 |         # projection head
 26 |         self.g = nn.Sequential(
 27 |                                 nn.Linear(2048, args.hidden_layer, bias=False),
 28 |                                 nn.BatchNorm1d(args.hidden_layer),
 29 |                                 nn.ReLU(inplace=True),
 30 |                                 nn.Linear(args.hidden_layer, args.n_proj, bias=True)
 31 |                                )
 32 | 
 33 | 
 34 |         # Momentum Encoder
 35 |         if args.m_backbone:
 36 |             self.fm = deepcopy(self.f)
 37 |             self.gm = deepcopy(self.g)
 38 |             self.dense_m= deepcopy(self.dense_neck)
 39 |             for param in self.fm.parameters():
 40 |                 param.requires_grad = False
 41 |             for param in self.gm.parameters():
 42 |                 param.requires_grad = False
 43 |             for param in self.gm.parameters():
 44 |                 param.requires_grad = False
 45 | 
 46 |     def forward(self, x, y=None):
 47 |         x, _ = self.f(x)
 48 |         if self.dense_cl:
 49 |             out_x = self.dense_neck(x)
 50 |         else:
 51 |             feat_x = self.pool(x)
 52 |             feat_x = torch.flatten(feat_x, start_dim=1)
 53 |             out_x = self.g(feat_x)
 54 |         if y is not None:
 55 |             if self.m_backbone:
 56 |                 with torch.no_grad():  # no gradient to keys
 57 |                     self._momentum_update()
 58 |                 y, _ = self.fm(y)
 59 |                 if self.dense_cl:
 60 |                     out_y = self.dense_neck(y)
 61 |                 else:
 62 |                     feat_y = self.pool(y)
 63 |                     feat_y = torch.flatten(feat_y, start_dim=1)
 64 |                     out_y = self.gm(feat_y)
 65 |             else:
 66 |                 y, _ = self.f(y)
 67 |                 if self.dense_cl:
 68 |                     out_y = self.dense_neck(y)
 69 |                 else:
 70 |                     feat_y = self.pool(y)
 71 |                     feat_y = torch.flatten(feat_y, start_dim=1)
 72 |                     out_y = self.g(feat_y)
 73 | 
 74 |             return x, y, out_x, out_y
 75 |         else:
 76 |             return F.normalize(feat_x, dim=-1), F.normalize(out_x, dim=-1)
 77 | 
 78 |     @torch.no_grad()
 79 |     def _momentum_update(self):
 80 |         """
 81 |         Momentum update of the key encoder
 82 |         """
 83 |         for param_f, param_fm in zip(self.f.parameters(), self.fm.parameters()):
 84 |             param_fm.data = param_fm.data * self.m + param_f.data * (1. - self.m)
 85 |         for param_g, param_gm in zip(self.f.parameters(), self.fm.parameters()):
 86 |             param_gm.data = param_gm.data * self.m + param_g.data * (1. - self.m)
 87 | 
 88 | 
 89 | class LinearEvaluation(nn.Module):
 90 |     """
 91 |     Linear Evaluation model
 92 |     """
 93 | 
 94 |     def __init__(self, n_features, n_classes):
 95 |         super(LinearEvaluation, self).__init__()
 96 |         self.model = nn.Linear(n_features, n_classes)
 97 | 
 98 |     def forward(self, x1, x2):
 99 |         df = torch.abs(x1 - x2)
100 |         return self.model(df)
101 | 
102 | 
103 | class Identity(nn.Module):
104 |     def __init__(self):
105 |         super(Identity, self).__init__()
106 | 
107 |     def forward(self, x):
108 |         return x
109 | 
110 | 
111 | def get_encoder(encoder, pre_train, output_stride, encoder_name):
112 |     """
113 |     Get Resnet backbone
114 |     """
115 | 
116 |     class View(nn.Module):
117 |         def __init__(self, shape=2048):
118 |             super().__init__()
119 |             self.shape = shape
120 | 
121 |         def forward(self, input):
122 |             '''
123 |             Reshapes the input according to the shape saved in the view data structure.
124 |             '''
125 |             batch_size = input.size(0)
126 |             shape = (batch_size, self.shape)
127 |             out = input.view(shape)
128 |             return out
129 | 
130 |     def CMU_resnet50():
131 | 
132 |         if encoder_name=='resnet':
133 |             resnet = ResNet50(BatchNorm=nn.BatchNorm2d, pretrained=pre_train, output_stride=output_stride)
134 |             return resnet
135 |         else:
136 |              vgg16 = deeplab_V2()
137 |              return vgg16
138 | 
139 | 
140 |     return {
141 |         'resnet18': torchvision.models.resnet18(pretrained=False),
142 |         'resnet50': CMU_resnet50()
143 |     }[encoder]
144 | 
145 | class DenseCLNeck(nn.Module):
146 |     '''The non-linear neck in DenseCL.
147 |         Single and dense in parallel: fc-relu-fc, conv-relu-conv
148 |     '''
149 | 
150 |     def __init__(self,
151 |                  in_channels,
152 |                  hid_channels,
153 |                  out_channels,
154 |                  num_grid=None):
155 |         super(DenseCLNeck, self).__init__()
156 | 
157 |         self.avgpool = nn.AdaptiveAvgPool2d((1, 1))
158 |         self.mlp = nn.Sequential(
159 |             nn.Linear(in_channels, hid_channels), nn.ReLU(inplace=True),
160 |             nn.Linear(hid_channels, out_channels))
161 | 
162 |         self.with_pool = num_grid != None
163 |         if self.with_pool:
164 |             self.pool = nn.AdaptiveAvgPool2d((num_grid, num_grid))
165 |         self.mlp2 = nn.Sequential(
166 |             nn.Conv2d(in_channels, hid_channels, 1), nn.BatchNorm2d(hid_channels), nn.ReLU(inplace=True),
167 |             nn.Conv2d(hid_channels, out_channels, 1))
168 |         self.avgpool2 = nn.AdaptiveAvgPool2d((1, 1))
169 | 
170 | 
171 | 
172 |     def forward(self, x):
173 | 
174 |         x = self.mlp2(x)  # sxs: bxdxsxs
175 |         avgpooled_x2 = self.avgpool2(x)  # 1x1: bxdx1x1
176 |         # x = x.view(x.size(0), x.size(1), -1)  # bxdxs^2
177 |         # avgpooled_x2 = avgpooled_x2.view(avgpooled_x2.size(0), -1)  # bxd
178 |         return  x
179 | 
180 | 
181 | 
182 | if __name__ == "__main__":
183 |     import torch
184 |     model = SimCLR(a)
185 |     input = torch.rand(1, 3, 512, 512)
186 |     output, low_level_feat = model(input)
187 |     print(output.size())
188 |     print(low_level_feat.size())
189 | 


--------------------------------------------------------------------------------
/DSP/mypath.py:
--------------------------------------------------------------------------------
 1 | class Path(object):
 2 |     @staticmethod
 3 |     def db_root_dir(dataset):
 4 |         if dataset == 'pascal':
 5 |             return '/path/to/datasets/VOCdevkit/VOC2012/'  # folder that contains VOCdevkit/.
 6 |         elif dataset == 'sbd':
 7 |             return '/path/to/datasets/benchmark_RELEASE/'  # folder that contains dataset/.
 8 |         elif dataset == 'cityscapes':
 9 |             return '/path/to/datasets/cityscapes/'     # foler that contains leftImg8bit/
10 |         elif dataset == 'coco':
11 |             return '/path/to/datasets/coco/'
12 |         elif dataset == 'CMU':
13 |             return "/data/input/datasets/VL-CMU-CD/"  #folder that contains data
14 |         else:
15 |             print('Dataset {} not available.'.format(dataset))
16 |             raise NotImplementedError
17 | 


--------------------------------------------------------------------------------
/DSP/optimizers/__pycache__/lars.cpython-37.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/NeurAI-Lab/DSP/45027a3702696dafd7018802619dde17c6da1ca8/DSP/optimizers/__pycache__/lars.cpython-37.pyc


--------------------------------------------------------------------------------
/DSP/optimizers/lars.py:
--------------------------------------------------------------------------------
  1 | """
  2 | @author: https://github.com/NVIDIA/apex/
  3 | """
  4 | 
  5 | import torch
  6 | from torch import nn
  7 | from torch.nn.parameter import Parameter
  8 | 
  9 | class LARC(object):
 10 |     """
 11 |     :class:`LARC` is a pytorch implementation of both the scaling and clipping variants of LARC,
 12 |     in which the ratio between gradient and parameter magnitudes is used to calculate an adaptive 
 13 |     local learning rate for each individual parameter. The algorithm is designed to improve
 14 |     convergence of large batch training.
 15 |      
 16 |     See https://arxiv.org/abs/1708.03888 for calculation of the local learning rate.
 17 | 
 18 |     In practice it modifies the gradients of parameters as a proxy for modifying the learning rate
 19 |     of the parameters. This design allows it to be used as a wrapper around any torch.optim Optimizer.
 20 | 
 21 |     ```
 22 |     model = ...
 23 |     optim = torch.optim.Adam(model.parameters(), lr=...)
 24 |     optim = LARC(optim)
 25 |     ```
 26 | 
 27 |     It can even be used in conjunction with apex.fp16_utils.FP16_optimizer.
 28 | 
 29 |     ```
 30 |     model = ...
 31 |     optim = torch.optim.Adam(model.parameters(), lr=...)
 32 |     optim = LARC(optim)
 33 |     optim = apex.fp16_utils.FP16_Optimizer(optim)
 34 |     ```
 35 | 
 36 |     Args:
 37 |         optimizer: Pytorch optimizer to wrap and modify learning rate for.
 38 |         trust_coefficient: Trust coefficient for calculating the lr. See https://arxiv.org/abs/1708.03888
 39 |         clip: Decides between clipping or scaling mode of LARC. If `clip=True` the learning rate is set to `min(optimizer_lr, local_lr)` for each parameter. If `clip=False` the learning rate is set to `local_lr*optimizer_lr`.
 40 |         eps: epsilon kludge to help with numerical stability while calculating adaptive_lr
 41 |     """
 42 | 
 43 |     def __init__(self, optimizer, trust_coefficient=0.02, clip=True, eps=1e-8):
 44 |         self.optim = optimizer
 45 |         self.trust_coefficient = trust_coefficient
 46 |         self.eps = eps
 47 |         self.clip = clip
 48 | 
 49 |     def __getstate__(self):
 50 |         return self.optim.__getstate__()
 51 | 
 52 |     def __setstate__(self, state):
 53 |         self.optim.__setstate__(state)
 54 | 
 55 |     @property
 56 |     def state(self):
 57 |         return self.optim.state
 58 | 
 59 |     def __repr__(self):
 60 |         return self.optim.__repr__()
 61 | 
 62 |     @property
 63 |     def param_groups(self):
 64 |         return self.optim.param_groups
 65 | 
 66 |     @param_groups.setter
 67 |     def param_groups(self, value):
 68 |         self.optim.param_groups = value
 69 |     
 70 |     def state_dict(self):
 71 |         return self.optim.state_dict()
 72 | 
 73 |     def load_state_dict(self, state_dict):
 74 |         self.optim.load_state_dict(state_dict)
 75 | 
 76 |     def zero_grad(self):
 77 |         self.optim.zero_grad()
 78 | 
 79 |     def add_param_group(self, param_group):
 80 |         self.optim.add_param_group( param_group)
 81 | 
 82 |     def step(self):
 83 |         with torch.no_grad():
 84 |             weight_decays = []
 85 |             for group in self.optim.param_groups:
 86 |                 # absorb weight decay control from optimizer
 87 |                 weight_decay = group['weight_decay'] if 'weight_decay' in group else 0
 88 |                 weight_decays.append(weight_decay)
 89 |                 group['weight_decay'] = 0
 90 |                 for p in group['params']:
 91 |                     if p.grad is None:
 92 |                         continue
 93 |                     param_norm = torch.norm(p.data)
 94 |                     grad_norm = torch.norm(p.grad.data)
 95 | 
 96 |                     if param_norm != 0 and grad_norm != 0:
 97 |                         # calculate adaptive lr + weight decay
 98 |                         adaptive_lr = self.trust_coefficient * (param_norm) / (grad_norm + param_norm * weight_decay + self.eps)
 99 | 
100 |                         # clip learning rate for LARC
101 |                         if self.clip:
102 |                             # calculation of adaptive_lr so that when multiplied by lr it equals `min(adaptive_lr, lr)`
103 |                             adaptive_lr = min(adaptive_lr/group['lr'], 1)
104 | 
105 |                         p.grad.data += weight_decay * p.data
106 |                         p.grad.data *= adaptive_lr
107 | 
108 |         self.optim.step()
109 |         # return weight decay control to optimizer
110 |         for i, group in enumerate(self.optim.param_groups):
111 |             group['weight_decay'] = weight_decays[i]


--------------------------------------------------------------------------------
/DSP/supervised.py:
--------------------------------------------------------------------------------
 1 | import torch
 2 | import torch.nn as nn
 3 | import numpy as np
 4 | import random
 5 | from datetime import datetime
 6 | from torch.optim import Adam, SGD
 7 | from torch.optim.lr_scheduler import MultiStepLR
 8 | import sys
 9 | from time import ctime
10 | import os
11 | sys.path.insert(0, '.')
12 | from util.utils import logger, summary_writer, log
13 | from config.option import Options
14 | from models.simclr import SimCLR
15 | from util.test import testloaderSimCLR, test_all_datasets
16 | from util.utils import save_checkpoint
17 | from transforms.simclr_transform import SimCLRTransform
18 | 
19 | np.random.seed(10)
20 | random.seed(10)
21 | torch.manual_seed(10)
22 | 
23 | import warnings
24 | warnings.filterwarnings("ignore", category=UserWarning)
25 | 
26 | 
27 | def train_supervised(args, loader, model, criterion, optimizer, scheduler):
28 |     """
29 |     Train supervised model
30 |     """
31 |     loss_epoch, accuracy_epoch = 0, 0
32 |     model.train()
33 |     for i, (x, y) in enumerate(loader):
34 |         x = x.to(args.device)
35 |         y = y.to(args.device)
36 | 
37 |         _, output = model(x)
38 |         loss = criterion(output, y)
39 | 
40 |         predicted = output.argmax(1)
41 |         acc = (predicted == y).sum().item() / y.size(0)
42 |         accuracy_epoch += acc
43 | 
44 |         optimizer.zero_grad()
45 |         loss.backward()
46 |         optimizer.step()
47 |         scheduler.step()
48 | 
49 |         loss_epoch += loss.item()
50 |         if i % 50 == 0:
51 |             log(f"Batch [{i}/{len(loader)}]\t Loss: {loss.item()}\t Accuracy: {acc}")
52 |     return loss_epoch, accuracy_epoch
53 | 
54 | 
55 | if __name__ == "__main__":
56 |     args = Options().parse()
57 |     log_dir = os.path.join(args.save_dir, "{}_bs_{}".format(args.backbone, args.sup_batchsize),
58 |                                 ctime().replace(' ', '_'))
59 |     writer = summary_writer(args, log_dir)
60 |     logger(args)
61 |     args.start_time = datetime.now()
62 |     log("Starting at  {}".format(datetime.now()))
63 |     log("arguments parsed: {}".format(args))
64 |     criterion = nn.CrossEntropyLoss()
65 | 
66 |     model = SimCLR(args)
67 |     model.cuda(args.device)
68 |     transform = SimCLRTransform(size=args.img_size).sup_transform
69 |     train_loader, val_loader, test_loader = testloaderSimCLR(args, args.sup_dataset, transform, args.sup_batchsize, args.sup_data_dir)
70 |     optimizer = SGD(model.parameters(), lr=args.sup_lr,  momentum=0.9, weight_decay=1e-5)
71 |     scheduler = MultiStepLR(optimizer, milestones=[180], gamma=0.1)
72 |     for epoch in range(1, args.sup_epochs + 1):
73 |         # Train
74 |         loss_epoch, accuracy_epoch = train_supervised(args, train_loader, model, criterion, optimizer, scheduler)
75 |         log(f"Epoch [{epoch}/{args.sup_epochs}]\t Loss: {loss_epoch / len(train_loader)}\t Accuracy: {accuracy_epoch / len(train_loader)}")
76 | 
77 |         # Save checkpoint after every epoch
78 |         path = save_checkpoint(state_dict=model.state_dict(), args=args, epoch=epoch, filename='checkpoint.pth'.format(epoch))
79 |         if os.path.exists:
80 |             state_dict = torch.load(path, map_location=args.device)
81 |             model.load_state_dict(state_dict)
82 | 
83 |         # Save the model at specific checkpoints
84 |         if epoch % 10 == 0:
85 |             if args.distribute:
86 |                 # Save DDP model's module
87 |                 save_checkpoint(state_dict=model.module.state_dict(), args=args, epoch=epoch, filename='checkpoint_model_{}.pth'.format(epoch))
88 |             else:
89 |                 save_checkpoint(state_dict=model.state_dict(), args=args, epoch=epoch, filename='checkpoint_model_{}.pth'.format(epoch))
90 | 
91 |         writer.add_scalar("CrossEntropyLoss/train", loss_epoch / len(train_loader), epoch)
92 | 
93 |     # Test the supervised Model
94 |     test_all_datasets(args, writer, model)
95 | 


--------------------------------------------------------------------------------
/DSP/train.py:
--------------------------------------------------------------------------------
 1 | import torch
 2 | print(torch.__version__)
 3 | import numpy as np
 4 | import random
 5 | from datetime import datetime
 6 | from torch.optim import Adam, SGD
 7 | from torch.optim.lr_scheduler import CosineAnnealingLR
 8 | import sys
 9 | sys.path.insert(0, '.')
10 | from util.utils import logger, summary_writer, log
11 | from util.train_util import trainSSL, get_criteria
12 | from config.option import Options
13 | from models.simclr import SimCLR
14 | from optimizers.lars import LARC
15 | 
16 | np.random.seed(10)
17 | random.seed(10)
18 | torch.manual_seed(10)
19 | 
20 | import warnings
21 | warnings.filterwarnings("ignore", category=UserWarning)
22 | 
23 | 
24 | if __name__ == "__main__":
25 |     args = Options().parse()
26 |     args.writer = summary_writer(args)
27 |     logger(args)
28 |     args.start_time = datetime.now()
29 |     log("Starting at  {}".format(datetime.now()))
30 |     log("arguments parsed: {}".format(args))
31 |     criterion = get_criteria(args)
32 |     if args.ssl_model == 'simclr':
33 |         model = SimCLR(args)
34 |         if args.optimizer == 'lars':
35 |             optimizer_= SGD(model.parameters(), lr=args.ssl_lr)
36 |             optimizer = LARC(optimizer_)
37 |             if args.scheduler:
38 |                 scheduler = CosineAnnealingLR(optimizer_, T_max=100, eta_min=3e-4)
39 |                 trainSSL(args, model, optimizer, criterion,  args.writer, scheduler)
40 |         else:
41 |             optimizer = Adam(model.parameters(), lr=args.ssl_lr, weight_decay=1e-6)
42 |             trainSSL(args, model, optimizer, criterion,  args.writer)
43 | 
44 | 


--------------------------------------------------------------------------------
/DSP/transforms/__pycache__/simclr_transform.cpython-37.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/NeurAI-Lab/DSP/45027a3702696dafd7018802619dde17c6da1ca8/DSP/transforms/__pycache__/simclr_transform.cpython-37.pyc


--------------------------------------------------------------------------------
/DSP/transforms/simclr_transform.py:
--------------------------------------------------------------------------------
  1 | from PIL import ImageFilter, Image
  2 | import random
  3 | from torchvision.transforms import transforms
  4 | import numpy as np
  5 | import cv2
  6 | import skimage.exposure
  7 | from scipy.ndimage import gaussian_filter
  8 | import pickle
  9 | from config.option import Options
 10 | import albumentations as A
 11 | from albumentations.pytorch.transforms import ToTensorV2
 12 | args = Options().parse()
 13 | 
 14 | from util.transforms import RandomChoice
 15 | 
 16 | class SimCLRTransform():
 17 |     """
 18 |     Transform defined in SimCLR
 19 |     https://arxiv.org/pdf/2002.05709.pdf
 20 |     ]
 21 |     """
 22 | 
 23 |     def __init__(self, size):
 24 |         # Normalize val dataset CMU
 25 |         # mean_val: TO= [0.34966046 0.33492374 0.3141161 ] T1= [0.27263916 0.27427372 0.26884845]
 26 |         # std_val : T0= [0.3798822  0.37294477 0.35809073] T1= [0.26939082 0.28229916 0.28446007]
 27 |         self.T0_mean = (0.33701816, 0.33383232, 0.3245374)
 28 |         self.T0_std = (0.26748696, 0.2733889,  0.27516264)
 29 |         self.T1_mean = (0.3782613,  0.36675948, 0.35721096)
 30 |         self.T1_std = (0.26745927, 0.2732622,  0.2772976)
 31 |         self.size = size
 32 |         self.copy_paste = args.copy_paste
 33 |         if self.size == 512 or self.size == 256:  # CMU
 34 |             normalize = transforms.Normalize(mean=self.T0_mean, std=self.T0_std)
 35 |             self.train_transform = transforms.Compose(
 36 |                 [    #transforms.RandomResizedCrop(size=size),
 37 |                      transforms.Resize(size=(self.size,self.size)),
 38 |                     ])
 39 | 
 40 |             self.copy_paste_aug = copy_paste(sigma=3, affine=False, prob=0.5)
 41 |             self.train_transform2 = RandomChoice([ get_color_distortion(),
 42 |                                                   transforms.RandomApply([GaussianBlur([.1, 2.])], p=1)
 43 |                                                                                                     ])
 44 | 
 45 |             self.train_transform3 = transforms.Compose([
 46 |                                                 transforms.ToTensor(),
 47 |                                                 # transforms.RandomErasing(p=0.4, scale=(0.09, 0.25), ratio=(0.3, 3.3))
 48 |                                                 normalize,
 49 |                                                 # transforms.RandomErasing(p=0.5, scale=(0.09, 0.25), ratio=(0.3, 3.3))
 50 |                                                 # hide_patch(0.2),
 51 |                                             ])
 52 |             self.train_transform_pcd = transforms.Compose([
 53 |                 transforms.ToTensor()
 54 | 
 55 |             ])
 56 |             self.train_transform4 = RandomChoice([
 57 |                 transforms.RandomErasing(p=0.5, scale=(0.09, 0.25), ratio=(0.3, 3.3))
 58 |                 # hide_patch(1),
 59 |             ])
 60 | 
 61 | 
 62 | 
 63 |             self.test_transform = transforms.Compose(
 64 |                 [
 65 |                     transforms.Resize(size=(size, size)),
 66 |                     transforms.ToTensor(),
 67 |                     normalize
 68 |                 ]
 69 |             )
 70 | 
 71 |             self.sup_transform = transforms.Compose(
 72 |                 [
 73 |                     transforms.RandomCrop(size=(size, size)),  # transforms.RandomHorizontalFlip(),
 74 |                     transforms.ToTensor(),
 75 |                     normalize
 76 |                 ]
 77 |             )
 78 | 
 79 |     def __call__(self, x1, x2):
 80 |         aug1 = self.train_transform(x1)
 81 |         aug2 = self.train_transform(x2)
 82 |         # if self.copy_paste:
 83 |         #     aug1, aug2 = self.copy_paste_aug(aug1, aug2)
 84 |         if args.ssl_dataset=='CMU':
 85 |             aug1, aug2 = self.train_transform2([aug1, aug2])
 86 |             # aug2 = self.train_transform2(aug2)
 87 |             # aug1 = self.train_transform2(aug1)
 88 |             aug1 = self.train_transform3(aug1)
 89 |             aug2 = self.train_transform3(aug2)
 90 |         else:
 91 |             aug1, aug2 = self.train_transform2([aug1, aug2])
 92 | 
 93 | 
 94 | 
 95 |         return aug1, aug2
 96 | 
 97 | 
 98 | class GaussianBlur(object):
 99 |     """Gaussian blur augmentation """
100 | 
101 |     def __init__(self, sigma=None):
102 |         if sigma is None:
103 |             sigma = [.1, 2.]
104 |         self.sigma = sigma
105 | 
106 |     def __call__(self, x):
107 |         sigma = random.uniform(self.sigma[0], self.sigma[1])
108 |         x = x.filter(ImageFilter.GaussianBlur(radius=sigma))
109 |         return x
110 | 
111 | class hide_patch(object):
112 |     """" Hide random part of the image """
113 | 
114 |     def __init__(self, hide_prob=0.3):
115 |         self.hide_prob = hide_prob
116 |         self.skipsize = 20
117 | 
118 |     def __call__(self, img):
119 |         s = img.shape
120 |         wd = s[1]
121 |         ht = s[2]
122 | 
123 |         # possible grid size, 0 means no hiding
124 |         if wd ==224:
125 |             grid_sizes = [15, 20, 25]
126 |         else :
127 |             grid_sizes = [33, 44, 55]
128 | 
129 |         # hiding probability
130 | 
131 |         # randomly choose one grid size
132 |         grid_size = grid_sizes[random.randint(0, len(grid_sizes) - 1)]
133 | 
134 |         # hide the patches
135 |         if grid_size != 0:
136 |             for x in range(0, wd, grid_size):
137 |                 for y in range(0, ht, grid_size):
138 |                     x_end = min(wd, x + grid_size)
139 |                     y_end = min(ht, y + grid_size)
140 |                     if x <= self.skipsize:
141 |                         img[:, x:x_end, y:y_end] = 0
142 | 
143 |                     if random.random() <= self.hide_prob:
144 |                         # patch_avg = img[:, x:x_end, y:y_end].mean()  # activate this line if u want mean patch value
145 |                         img[:, x:x_end, y:y_end] = 0      # patch_avg
146 | 
147 |         return img
148 | 
149 | 
150 | 
151 | 
152 | 
153 | def get_color_distortion(s=1.0):
154 |     """
155 |     Color jitter from SimCLR paper
156 |     @param s: is the strength of color distortion.
157 |     """
158 | 
159 |     color_jitter = transforms.ColorJitter(0.6*s, 0.6*s, 0.6*s, 0.2*s)
160 |     rnd_color_jitter = transforms.RandomApply([color_jitter], p=0.7)
161 |     rnd_gray = transforms.RandomGrayscale(p=0.2)
162 |     color_distort = transforms.Compose([rnd_color_jitter, rnd_gray])
163 |     return color_distort
164 | 
165 | 
166 | class copy_paste(object):
167 |     ''' Copy paste augumentation: arg: paste img, paste mask, img on which the obj to be pasted, gaussian blur(sigma)
168 |         params: sigma = Gaussian blur radius
169 |                 blend = bool
170 |                 affine =  bool
171 |                 instance_txt_path = path to the directory containing the instances list that needs to be pasted.
172 | 
173 |     '''
174 | 
175 | 
176 |     def __init__(self, blend=True, sigma= 1, affine=True, prob=1):
177 |         self.sigma = sigma
178 |         self.blend = blend
179 |         self.affine = affine
180 |         self.prob = prob
181 |         self.instance_txt_path = '/data/input/datasets/VL-CMU-CD/instance.txt'
182 |         with open(self.instance_txt_path, 'rb') as fp:
183 |             self.instance_list = pickle.load(fp)
184 | 
185 | 
186 |     def __call__(self, copy_img, copy_img2):
187 |         if random.random() <= self.prob:
188 |             if self.instance_list:
189 |                 inst_name = random.choice(self.instance_list)
190 |             instance = Image.open(inst_name)
191 |             self.instance = instance
192 | 
193 |             if self.instance is not None:
194 |                 H,W = copy_img.size
195 |                 paste_img = transforms.Resize(size=(H, W))(self.instance)
196 |                 if self.affine == True:
197 |                     paste_img = transforms.RandomAffine(degrees=0, translate=(0.25, 0.25), scale=(0.8, 1.1), shear=0)(paste_img)
198 |                 gray_mask = transforms.Grayscale()(paste_img)
199 |                 binary_mask = np.asarray(gray_mask)
200 |                 binary_mask = 1.0 * (binary_mask > 0)
201 |                 # blur_binary_mask = skimage.exposure.rescale_intensity(blur_binary_mask)
202 |                 invert_mask = 1.0 * (np.logical_not(binary_mask).astype(int))
203 | 
204 |                 if self.blend == True:
205 |                     blur_invert_mask = gaussian_filter(invert_mask, sigma=self.sigma)
206 |                     blur_binary_mask = gaussian_filter(binary_mask, sigma=self.sigma)
207 |                     blur_invert_mask = np.expand_dims(blur_invert_mask, 2)   # Expanding dims to match channels
208 |                     blur_binary_mask = np.expand_dims(blur_binary_mask, 2)
209 |                 blur_invert_mask = np.expand_dims(invert_mask, 2)  # Expanding dims to match channels
210 |                 blur_binary_mask = np.expand_dims(binary_mask, 2)
211 |                 aug_image1 = (paste_img * blur_binary_mask) + (copy_img * blur_invert_mask)
212 |                 aug_image2 = (paste_img * blur_binary_mask) + (copy_img2 * blur_invert_mask)
213 | 
214 | 
215 |             return Image.fromarray(np.uint8(aug_image1)), Image.fromarray(np.uint8(aug_image2))
216 |         else:
217 |             return(copy_img), (copy_img2)
218 | 
219 | 
220 | 
221 | 
222 | 
223 | 
224 | 
225 | 
226 | 
227 | 
228 | 
229 | 
230 | 


--------------------------------------------------------------------------------
/DSP/util/COCO_loader/base_dataset.py:
--------------------------------------------------------------------------------
  1 | from abc import abstractmethod
  2 | import torch.utils.data as data
  3 | import numpy as np
  4 | from pycocotools.coco import COCO
  5 | import os
  6 | import logging
  7 | import cv2
  8 | import torch
  9 | 
 10 | 
 11 | __all__ = ['BaseDataset']
 12 | 
 13 | 
 14 | class BaseDataset(data.Dataset):
 15 | 
 16 |     def __init__(self, root, split, cfg, mode=None, base_size=None,
 17 |                  crop_size=None, ann_path=None, ann_file_format=None,
 18 |                  has_inst_seg=True, **kwargs):
 19 |         self.root = root
 20 |         self._split = split
 21 |         self.mode = mode
 22 |         self.base_size = base_size if base_size is not None else 1024
 23 |         self.crop_size = crop_size if crop_size is not None else [512, 512]
 24 | 
 25 | 
 26 |         self.cfg = cfg
 27 | 
 28 | 
 29 |         if split == 'test':
 30 |             return
 31 | 
 32 |         if ann_path is None:
 33 |             ann_path = 'gtFine/annotations_coco_format_v1'
 34 |         if ann_file_format is None:
 35 |             ann_file_format = 'instances_%s.json'
 36 |         self.coco = COCO(os.path.join(root, ann_path, ann_file_format % split))
 37 | 
 38 |         # Image paths is currently none to address test split length..
 39 |         # update image paths after initializing BaseDataset
 40 |         self.image_paths = None
 41 |         self.image_ids = list(self.coco.imgs.keys())
 42 |         self.image_ids = sorted(self.image_ids)
 43 |         logging.info(f'Number of images in split {split} is {len(self.image_ids)}')
 44 | 
 45 |         ids = []
 46 |         for img_id in self.image_ids:
 47 |             ann_ids = self.coco.getAnnIds(imgIds=img_id, iscrowd=None)
 48 |             anno = self.coco.loadAnns(ann_ids)
 49 |             if split == "train":
 50 |                 if self.has_valid_annotation(anno):
 51 |                     ids.append(img_id)
 52 |             else:
 53 |                 ids.append(img_id)
 54 | 
 55 |         self.image_ids = ids
 56 |         logging.info(f'Number of images with valid annotations '
 57 |                      f'in split {split} is {len(self.image_ids)}')
 58 | 
 59 |         self.id_to_filename = dict()
 60 |         self.filename_to_id = dict()
 61 |         for i, ob in self.coco.imgs.items():
 62 |             self.filename_to_id[ob['file_name']] = ob['id']
 63 |             self.id_to_filename[ob['id']] = ob['file_name']
 64 | 
 65 |         detect_ids = self.get_detect_ids()
 66 |         self.coco_id_to_contiguous_id = {coco_id: i for i, coco_id
 67 |                                          in enumerate(detect_ids)}
 68 |         self.contiguous_id_to_coco_id = {v: k for k, v in
 69 |                                          self.coco_id_to_contiguous_id.items()}
 70 | 
 71 |         self.key, self.segment_mapping = self.get_segment_mapping()
 72 | 
 73 |         self.has_inst_seg = has_inst_seg
 74 |         self.inst_encoding_type ='MEINST'
 75 | 
 76 | 
 77 | 
 78 |     @property
 79 |     def image_size(self):
 80 |         return self.crop_size
 81 | 
 82 |     @property
 83 |     def split(self):
 84 |         return self._split
 85 | 
 86 |     @split.setter
 87 |     def split(self, value):
 88 |         assert type(value) is str, 'Dataset split should be string'
 89 |         self._split = value
 90 | 
 91 |     @abstractmethod
 92 |     def get_detect_ids(self):
 93 |         pass
 94 | 
 95 |     @abstractmethod
 96 |     def get_segment_mapping(self):
 97 |         pass
 98 | 
 99 |     @abstractmethod
100 |     def __getitem__(self, index):
101 |         pass
102 | 
103 |     def __len__(self):
104 |         if self.split == "test":
105 |             return len(self.image_paths)
106 |         return len(self.image_ids)
107 | 
108 |     @staticmethod
109 |     def xywh2xyxy(box):
110 |         x1, y1, w, h = box
111 |         return [x1, y1, x1 + w, y1 + h]
112 | 
113 |     def get_img_info(self, index):
114 |         image_id = self.image_ids[index]
115 |         img_data = self.coco.imgs[image_id]
116 |         return img_data
117 | 
118 | 
119 | 
120 |     @abstractmethod
121 |     def ann_check_hooks(self, ann_obj):
122 |         pass
123 | 
124 |     def get_annotation(self, index):
125 |         image_id = self.image_ids[index]
126 |         # TODO: optionally create segmentation masks...
127 |         ann_ids = self.coco.getAnnIds(imgIds=image_id)
128 |         loaded_anns = self.coco.loadAnns(ann_ids)
129 | 
130 |         bboxes, labels, inst_masks = [], [], []
131 |         for obj in loaded_anns:
132 |             if obj.get('iscrowd', 0) == 0 and obj.get('real_box', True)\
133 |                     and self.ann_check_hooks(obj):
134 |                 bboxes.append(self.xywh2xyxy(obj["bbox"]))
135 |                 labels.append(self.coco_id_to_contiguous_id[obj["category_id"]])
136 |                 if self.has_inst_seg:
137 |                     inst_masks.append(self.coco.annToMask(obj))
138 | 
139 |         bboxes = np.array(bboxes, np.float32).reshape((-1, 4))
140 |         labels = np.array(labels, np.int64).reshape((-1,))
141 | 
142 |         # remove invalid boxes
143 |         keep = (bboxes[:, 3] > bboxes[:, 1]) & (bboxes[:, 2] > bboxes[:, 0])
144 |         bboxes = bboxes[keep]
145 |         labels = labels[keep]
146 |         inst_masks = [inst_masks[idx] for idx, k in enumerate(keep) if k]
147 | 
148 |         rets = [bboxes, labels]
149 |         if self.has_inst_seg:
150 |             rets += [inst_masks]
151 |         return rets
152 | 
153 |     @staticmethod
154 |     def _has_only_empty_bbox(anno):
155 |         return all(not (obj.get("iscrowd", 0) == 0 and
156 |                         obj.get("real_bbox", True)) for obj in anno)
157 | 
158 |     def has_valid_annotation(self, anno):
159 |         # if it's empty, there is no annotation
160 |         if len(anno) == 0:
161 |             return False
162 |         # if all boxes have close to zero area, there is no annotation
163 |         if self._has_only_empty_bbox(anno):
164 |             return False
165 |         return True
166 | 
167 |     def add_area(self):
168 |         for i, v in self.coco.anns.items():
169 |             v['area'] = v['bbox'][2] * v['bbox'][3]
170 | 
171 |     def segment_mask_transform(self, mask):
172 |         mask = np.array(mask).astype('int32')
173 |         if self.segment_mapping is not None:
174 |             mask = self.segment_mask_to_contiguous(mask)
175 |         return torch.from_numpy(mask).long()
176 | 
177 |     def segment_mask_to_contiguous(self, mask):
178 |         values = np.unique(mask)
179 |         for i in range(len(values)):
180 |             assert (values[i] in self.segment_mapping)
181 |         index = np.digitize(mask.ravel(), self.segment_mapping, right=True)
182 |         return self.key[index].reshape(mask.shape)
183 | 
184 | 
185 | 


--------------------------------------------------------------------------------
/DSP/util/COCO_loader/coco_uninet.py:
--------------------------------------------------------------------------------
  1 | import os
  2 | import numpy as np
  3 | from pycocotools import mask as coco_mask
  4 | import logging
  5 | from PIL import Image
  6 | import dataset.CMU as CMU
  7 | from scipy.ndimage import gaussian_filter
  8 | from torchvision import transforms
  9 | import torch
 10 | import random
 11 | from torch.utils.data import DataLoader
 12 | 
 13 | 
 14 | 
 15 | from torch.utils.data import Dataset, DataLoader
 16 | import matplotlib.pyplot as plt
 17 | 
 18 | from util.COCO_loader.base_dataset import BaseDataset
 19 | 
 20 | 
 21 | class COCOUninet(BaseDataset):
 22 |     NUM_CLASSES = {'segment': 21, 'detect': 81, 'inst_seg': 81}
 23 |     INSTANCE_NAMES = []
 24 | 
 25 |     CAT_LIST = [0, 5, 2, 16, 9, 44, 6, 3, 17, 62, 21, 67, 18, 19, 4,
 26 |                 1, 64, 20, 63, 7, 72]
 27 | 
 28 |     def __init__(self, root=os.path.expanduser('/data/input/datasets/mscoco'),
 29 |                  split='train', mode=None, cfg=None, **kwargs):
 30 |         year = str(2017)
 31 |         if year == "2017" and split == 'minival':
 32 |             split = 'val'
 33 |         super(COCOUninet, self).__init__(
 34 |             root, split, cfg, mode, ann_path='annotations',
 35 |             ann_file_format=f'instances_%s{year}.json', **kwargs)
 36 | 
 37 |         if self.split == "test":
 38 |             self.image_paths = get_image_paths(self.root, year=year)
 39 |             if len(self.image_paths) == 0:
 40 |                 raise RuntimeError("Found 0 images in subfolders of:" + self.root + "\n")
 41 |             return
 42 | 
 43 |         self.img_dir = os.path.join(root, rf'{split}{year}')
 44 |         self.add_area()
 45 | 
 46 |     @staticmethod
 47 |     def _has_only_empty_bbox(anno):
 48 |         return all(any(o <= 1 for o in obj["bbox"][2:]) for obj in anno)
 49 | 
 50 |     def ann_check_hooks(self, obj):
 51 |         return True
 52 | 
 53 |     def get_detect_ids(self):
 54 |         det_ids = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 13, 14, 15, 16, 17, 18,
 55 |                    19, 20, 21, 22, 23, 24, 25, 27, 28, 31, 32, 33, 34, 35, 36, 37,
 56 |                    38, 39, 40, 41, 42, 43, 44, 46, 47, 48, 49, 50, 51, 52, 53, 54,
 57 |                    55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 67, 70, 72, 73, 74,
 58 |                    75, 76, 77, 78, 79, 80, 81, 82, 84, 85, 86, 87, 88, 89, 90]
 59 | 
 60 |         return det_ids
 61 | 
 62 |     def get_segment_mapping(self):
 63 |         key = None
 64 |         segment_mapping = None
 65 | 
 66 |         return key, segment_mapping
 67 | 
 68 |     def __getitem__(self, index):
 69 |         if self.split == "test":
 70 |             image = Image.open(self.image_paths[index]).convert('RGB')
 71 |             image = np.array(image)
 72 |             image, _ = self.transform(image)
 73 |             return image
 74 | 
 75 |         file_name = self.id_to_filename[self.image_ids[index]]
 76 |         image_path = os.path.join(self.img_dir, file_name)
 77 |         image = Image.open(image_path).convert('RGB')
 78 |         image = np.array(image)
 79 | 
 80 |         bboxes, labels, inst_masks = self.get_annotation(index)
 81 | 
 82 |         inst_list = []
 83 |         for idx in range(len(inst_masks)):
 84 | 
 85 |             if labels[idx] == 3:
 86 |                 class_name = labels[idx]
 87 |                 masks = np.array(inst_masks[idx])
 88 |                 masks = np.reshape(masks, (masks.shape[0], masks.shape[1], 1))
 89 |                 gau_masks = gaussian_filter(masks, sigma=1)
 90 | 
 91 |                 instance = image * gau_masks
 92 |                 nzCount = instance.any(axis=-1).sum()
 93 |                 print(nzCount)
 94 |                 if nzCount > 10000 and nzCount < 60000:
 95 |                     inst_list.append(instance)
 96 | 
 97 |         if labels[idx] == 3:
 98 |             return image, inst_list, gau_masks, class_name
 99 |         else:
100 |             return image
101 | 
102 | 
103 | 
104 | 
105 | 
106 | 
107 | def get_image_paths(folder, split='test', year='2014'):
108 |     def get_path_pairs():
109 |         img_paths = []
110 |         for root, directories, files in os.walk(img_folder):
111 |             for filename in files:
112 |                 if filename.endswith(".jpg") or filename.endswith(".png"):
113 |                     im_path = os.path.join(root, filename)
114 |                     if os.path.isfile(im_path):
115 |                         img_paths.append(im_path)
116 |                     else:
117 |                         logging.info('cannot find the mask or image:', im_path)
118 |         logging.info('Found {} images in the folder {}'.format(len(img_paths), img_folder))
119 |         return img_paths
120 | 
121 |     img_folder = os.path.join(folder, split + year)
122 |     return get_path_pairs()
123 | 
124 | def convert_togray (cd_img1, instance):
125 |     img1 = instance
126 |     b,c,H,W = cd_img1.shape
127 |     size = (H,W)
128 |     transform = transforms.ToPILImage()
129 |     img1  = img1.squeeze(0)
130 |     img1 = img1.permute(2, 0, 1)
131 |     img1 = transform(img1)
132 | 
133 |     img1 = transforms.Resize(size=size)(img1)
134 |     gray_instance = transforms.Grayscale()(img1)
135 |     gray_instance, resized_instance = transforms.ToTensor()(gray_instance),transforms.ToTensor()(img1)
136 | 
137 |     return gray_instance.unsqueeze(0), resized_instance.unsqueeze(0)
138 | 
139 | def image_copy_paste(img1, img2, instance, alpha, blend=True, sigma=1):
140 |     if alpha is not None:
141 |         gray_ins, instance = convert_togray(img1, instance)
142 |         binarized = 1.0 * (gray_ins > 0)
143 |         invert_binary = (~binarized).float()
144 |         if blend:
145 |             filtered_mask = gaussian_filter(invert_binary, sigma=1)
146 |             filtered_mask = torch.Tensor(filtered_mask)
147 |         aug_img1 = instance + (img1 * filtered_mask)
148 |         aug_img2 = instance + (img2 * filtered_mask)
149 | 
150 |     return aug_img1, aug_img2, instance
151 | 
152 | def save_show_transformations(img, img2,instance, masks, name, path = '/data/input/datasets/VL-CMU-CD/instances'):
153 | 
154 |     img = np.squeeze(img)
155 |     img2 = np.squeeze(img2)
156 |     instance = np.squeeze(instance)
157 |     masks = np.squeeze(masks)
158 |     instance = instance.permute(1,2,0)
159 |     img= np.transpose(img,(0,1, 2))   # from NCHW to NHWC
160 |     instance= np.transpose(instance,(0,1, 2))
161 |     f, axarr = plt.subplots(1,4)
162 |     axarr[0].imshow(img.permute(1,2,0))
163 |     axarr[1].imshow(img2.permute(1,2,0))
164 |     axarr[2].imshow(instance)
165 |     axarr[3].imshow(masks)
166 |     plt.show()
167 | 
168 | 
169 |     instance = instance.numpy()
170 |     rescaled = (255.0 / instance.max() * (instance - instance.min())).astype(np.uint8)
171 |     Name_Formatted = ("%s" % (j)) + ".png"
172 |     # file_path = os.path.join(path, Name_Formatted)
173 |     # instance = Image.fromarray(rescaled)
174 |     # instance.save(file_path)
175 |     return file_path
176 | 
177 | 
178 | 
179 | # COCO dataset loader
180 | coco_train_dataset = COCOUninet()
181 | train_loader_coco = DataLoader(coco_train_dataset, batch_size=1, shuffle=True, drop_last=True)
182 | # Change detection dataset loader
183 | TRAIN_DATA_PATH = "/data/input/datasets/VL-CMU-CD/struc_train"
184 | data_path = os.path.join(TRAIN_DATA_PATH, 'train_pair.txt')
185 | CD_train_dataset = CMU.Dataset(TRAIN_DATA_PATH, TRAIN_DATA_PATH,
186 |                                data_path, 'train', 'CD', transform=True,
187 |                             transform_med=None)
188 | train_loader_CD = DataLoader(CD_train_dataset, batch_size=1, shuffle=True, drop_last=True)
189 | 
190 | def extract_ins_coco ():
191 |     for j, batch_CD in enumerate(train_loader_CD):
192 |         t0, t1, cd_labels, instance = batch_CD
193 |         for i, batch in enumerate(train_loader_coco):
194 |             if len(batch) >1:
195 |                 img, instance, masks, labels = batch
196 |                 if instance:
197 |                     for ins in instance:
198 |                         aug_t0, aug_t1, resized_instance  = image_copy_paste(t0, t1, ins, masks)
199 |                         # show_transformations(img, ins, masks)
200 |                         save_show_transformations(aug_t0, resized_instance, masks)
201 | ins_path = []
202 | for j, batch_CD in enumerate(train_loader_CD):
203 |     t0, t1, cd_labels, ins = batch_CD
204 |     file_path = save_show_transformations(t0, t1, ins, cd_labels, j)
205 |     ins_path.append(file_path)
206 | 
207 | 
208 | 


--------------------------------------------------------------------------------
/DSP/util/COCO_loader/defaults.py:
--------------------------------------------------------------------------------
  1 | from yacs.config import CfgNode as CN
  2 | 
  3 | # -----------------------------------------------------------------------------
  4 | # Config definition
  5 | # -----------------------------------------------------------------------------
  6 | _C = CN()
  7 | 
  8 | # -----------------------------------------------------------------------------
  9 | # MODEL options
 10 | # -----------------------------------------------------------------------------
 11 | _C.MODEL = CN()
 12 | _C.MODEL.PRETRAINED_PATH = "/input/datasets/uninet/pytorch-config/FCOS_imprv_R_50_FPN_1x.pth"
 13 | _C.MODEL.IS_FULL_MODEL = False
 14 | _C.MODEL.LOAD_BACKBONE = False
 15 | _C.MODEL.BACKBONE_NAME = 'backbone'
 16 | _C.MODEL.NECK_NAMES = ['fpn', 'neck']
 17 | _C.MODEL.HEAD_NAME = 'head'
 18 | _C.MODEL.USE_DCN = False
 19 | 
 20 | # -----------------------------------------------------------------------------
 21 | # INPUT options
 22 | # -----------------------------------------------------------------------------
 23 | _C.INPUT = CN()
 24 | 
 25 | # ---------------------------------------------------------------------------- #
 26 | # Specific test options
 27 | # ---------------------------------------------------------------------------- #
 28 | _C.TEST = CN()
 29 | # Number of detections per image
 30 | _C.TEST.DETECTIONS_PER_IMG = 100
 31 | 
 32 | # ---------------------------------------------------------------------------- #
 33 | # Test-time augmentations for bounding box detection
 34 | # See configs/test_time_aug/e2e_mask_rcnn_R-50-FPN_1x.yaml for an example
 35 | # ---------------------------------------------------------------------------- #
 36 | _C.TEST.BBOX_AUG = CN()
 37 | # Enable test-time augmentation for bounding box detection if True
 38 | _C.TEST.BBOX_AUG.ENABLED = False
 39 | 
 40 | # --------------------------------------------------------------------------- #
 41 | # Dataloader Options
 42 | # ---------------------------------------------------------------------------- #
 43 | _C.DATALOADER = CN()
 44 | _C.DATALOADER.YEAR = 2014
 45 | _C.DATALOADER.ANNOTATION_FOLDER = 'gtFine/annotations_coco_format_v1'
 46 | _C.DATALOADER.ANN_FILE_FORMAT = 'instances_%s.json'
 47 | # ImageNet mean and standard deviation..
 48 | _C.DATALOADER.MEAN = [.485, .456, .406]
 49 | _C.DATALOADER.STD = [.229, .224, .225]
 50 | _C.DATALOADER.TRAIN_TRANSFORMS = ['PreProcessBoxes', 'PadIfNeeded', 'ShiftScaleRotate', 'CropNonEmptyMaskIfExists',
 51 |                                   'ResizeMultiScale', 'HorizontalFlip', 'ColorJitter', 'PostProcessBoxes',
 52 |                                   'ConvertFromInts', 'ToTensor', 'Normalize']
 53 | _C.DATALOADER.VAL_TRANSFORMS = ['PreProcessBoxes', 'Resize', 'PostProcessBoxes',
 54 |                                 'ToTensor', 'Normalize']
 55 | # Multi scale augmentation defaults..
 56 | _C.DATALOADER.MS_MULTISCALE_MODE = 'value'
 57 | _C.DATALOADER.MS_RATIO_RANGE = [0.75, 1]
 58 | _C.DATALOADER.PHOTOMETRIC_DISTORT_KWARGS = '{}'
 59 | _C.DATALOADER.INST_SEG_ENCODING = 'MEINST'
 60 | _C.DATALOADER.DEPTH_SCALE = 512.
 61 | 
 62 | # ---------------------------------------------------------------------------- #
 63 | # Task options
 64 | # ---------------------------------------------------------------------------- #
 65 | _C.TASKS = CN()
 66 | _C.TASKS.TASK_TO_LOSS_NAME = '{\"detect\":"default",\"segment\":"default",\"depth\":"default",' \
 67 |                              '\"inst_depth\":"default",\"inst_seg\":"default"}'
 68 | _C.TASKS.TASK_TO_LOSS_ARGS = '{}'
 69 | _C.TASKS.TASK_TO_LOSS_KWARGS = '{}'
 70 | _C.TASKS.TASK_TO_CALL_KWARGS = '{\"segment\":{\"ignore_index\":-1}}'
 71 | _C.TASKS.TASK_TO_MIN_OR_MAX = '{\"detect\":1,\"segment\":1,\"depth\":-1,\"inst_depth\":-1,' \
 72 |                               ' \"inst_seg\":1}'
 73 | _C.TASKS.ALL_LOSSES = ['detect_cls_loss', 'detect_reg_loss', 'detect_centerness_loss',
 74 |                        'segment_loss', 'depth_loss', 'inst_depth_l1_loss',
 75 |                        'inst_seg_loss']
 76 | _C.TASKS.LOSS_INIT_WEIGHTS = [1., 1., 1., 1., 1., 0.05, 1.]
 77 | _C.TASKS.LOSS_START_EPOCH = [1, 1, 1, 1, 1, 1, 1]
 78 | _C.TASKS.USE_UNCERTAINTY_WEIGHTING = False
 79 | 
 80 | # --------------------------------------------------------------------------- #
 81 | # Backbone and encoder Options
 82 | # ---------------------------------------------------------------------------- #
 83 | _C.MODEL.ENCODER = CN()
 84 | _C.MODEL.ENCODER.NUM_EN_FEATURES = 6
 85 | _C.MODEL.ENCODER.OUT_CHANNELS_BEFORE_EXPANSION = 512
 86 | _C.MODEL.ENCODER.FEAT_CHANNELS = [2048, 2048, 2048, 2048]
 87 | _C.MODEL.ENCODER.USE_DCN = False
 88 | 
 89 | # --------------------------------------------------------------------------- #
 90 | # Decoder Options
 91 | # ---------------------------------------------------------------------------- #
 92 | _C.MODEL.DECODER = CN()
 93 | _C.MODEL.DECODER.OUTPLANES = 64
 94 | _C.MODEL.DECODER.MULTISCALE = False
 95 | _C.MODEL.DECODER.ATTENTION = False
 96 | _C.MODEL.DECODER.INSERT_MEAN_FEAT = False
 97 | _C.MODEL.DECODER.INIT_WEIGHTS = False
 98 | _C.MODEL.DECODER.USE_NECK_FEATURES = False
 99 | 
100 | # --------------------------------------------------------------------------- #
101 | # Object Detection Options
102 | # ---------------------------------------------------------------------------- #
103 | _C.MODEL.DET = CN()
104 | _C.MODEL.DET.HEAD_NAME = "FCOS"
105 | _C.MODEL.DET.FEATURE_CHANNELS = 256
106 | _C.MODEL.DET.FPN_STRIDES = [8, 16, 32]
107 | _C.MODEL.DET.WEIGHTS_PER_CLASS = [1] * 8
108 | _C.MODEL.DET.ATTENTION = False
109 | _C.MODEL.DET.CLS_LOSS_TYPE = 'focal_loss'
110 | # Focal loss parameter: alpha
111 | _C.MODEL.DET.LOSS_ALPHA = 0.25
112 | # Focal loss parameter: gamma
113 | _C.MODEL.DET.LOSS_GAMMA = 2.0
114 | _C.MODEL.DET.LOSS_BETA = 0.9999
115 | _C.MODEL.DET.PRIOR_PROB = 0.01
116 | 
117 | # --------------------------------------------------------------------------- #
118 | # FCOS Options
119 | # ---------------------------------------------------------------------------- #
120 | _C.MODEL.FCOS = CN()
121 | _C.MODEL.FCOS.INFERENCE_TH = 0.05
122 | _C.MODEL.FCOS.NMS_TH = 0.6
123 | _C.MODEL.FCOS.PRE_NMS_TOP_N = 1000
124 | # the number of convolutions used in the cls and bbox tower
125 | _C.MODEL.FCOS.NUM_CONVS = 4
126 | # if CENTER_SAMPLING_RADIUS <= 0, it will disable center sampling
127 | _C.MODEL.FCOS.CENTER_SAMPLING_RADIUS = 0.0
128 | # IOU_LOSS_TYPE can be "iou", "linear_iou" or "giou"
129 | _C.MODEL.FCOS.IOU_LOSS_TYPE = "iou"
130 | _C.MODEL.FCOS.NORM_REG_TARGETS = False
131 | _C.MODEL.FCOS.CENTERNESS_ON_REG = False
132 | _C.MODEL.FCOS.USE_DCN_IN_TOWER = False
133 | _C.MODEL.FCOS.USE_NAS_HEAD = False
134 | _C.MODEL.FCOS.ATSS_TOPK = 9
135 | 
136 | # --------------------------------------------------------------------------- #
137 | # OnetNet Options
138 | # ---------------------------------------------------------------------------- #
139 | _C.MODEL.ONENET = CN()
140 | _C.MODEL.ONENET.CLASS_WEIGHT = 1.
141 | _C.MODEL.ONENET.GIOU_WEIGHT = 1.
142 | _C.MODEL.ONENET.L1_WEIGHT = 2.5
143 | _C.MODEL.ONENET.USE_NMS = False
144 | _C.MODEL.ONENET.NMS_TH = 0.5
145 | 
146 | # --------------------------------------------------------------------------- #
147 | # Segmentation Options
148 | # ---------------------------------------------------------------------------- #
149 | _C.MODEL.SEG = CN()
150 | _C.MODEL.SEG.INPLANES = 64
151 | _C.MODEL.SEG.OUTPLANES = 64
152 | _C.MODEL.SEG.MULTISCALE = False
153 | _C.MODEL.SEG.ATTENTION = False
154 | 
155 | # Depth Options
156 | # ---------------------------------------------------------------------------- #
157 | _C.MODEL.DEPTH = CN()
158 | _C.MODEL.DEPTH.INPLANES = 64
159 | _C.MODEL.DEPTH.OUTPLANES = 64
160 | _C.MODEL.DEPTH.ACTIVATION_FN = 'sigmoid'
161 | _C.MODEL.DEPTH.ATTENTION = False
162 | 
163 | # --------------------------------------------------------------------------- #
164 | # Instance depth Options
165 | # ---------------------------------------------------------------------------- #
166 | _C.MODEL.INST_DEPTH = CN()
167 | _C.MODEL.INST_DEPTH.DEPTH_ON_REG = True
168 | 
169 | # --------------------------------------------------------------------------- #
170 | # Instance segmentation Options
171 | # ---------------------------------------------------------------------------- #
172 | _C.MODEL.INST_SEG = CN()
173 | _C.MODEL.INST_SEG.HEAD_NAME = 'MEINST'
174 | 
175 | # --------------------------------------------------------------------------- #
176 | # MEINST Options
177 | 
178 | _C.MODEL.MEINST = CN()
179 | # share classification head and instance segmentation head..
180 | _C.MODEL.MEINST.SHARE_CLS_INST_HEADS = False
181 | # share bounding box head and instance segmentation head..
182 | _C.MODEL.MEINST.SHARE_BBOX_INST_HEADS = True
183 | # mask encoding type
184 | _C.MODEL.MEINST.ENCODING_TYPE = 'explicit'
185 | # is inverse sigmoid and sigmoid used for finding pca components
186 | _C.MODEL.MEINST.SIGMOID = True
187 | # is whiten used for finding pca components
188 | _C.MODEL.MEINST.WHITEN = True
189 | # path to pca params file
190 | _C.MODEL.MEINST.PCA_PATH = ''
191 | # number of components in the encoded mask
192 | _C.MODEL.MEINST.NUM_COMPONENTS = 60
193 | # dimension to which all instance masks are reshaped to
194 | _C.MODEL.MEINST.ENCODING_DIM = 28
195 | # add instance masks vizualized as segmentation masks to tensorboard
196 | _C.MODEL.MEINST.CREATE_PRED_MASK = False
197 | # visualize each instance separately..
198 | _C.MODEL.MEINST.VIZ_INSTANCES = True
199 | 
200 | 
201 | # --------------------------------------------------------------------------- #
202 | # CenterMask Options
203 | # ---------------------------------------------------------------------------- #
204 | 
205 | _C.MODEL.CENTER_MASK = CN()
206 | _C.MODEL.CENTER_MASK.IN_FEATURES = ['p3', 'p4', 'p5']
207 | _C.MODEL.CENTER_MASK.POOLER_RESOLUTION = 14
208 | _C.MODEL.CENTER_MASK.POOLER_SAMPLING_RATIO = 0
209 | _C.MODEL.CENTER_MASK.POOLER_TYPE = 'ROIAlignV2'
210 | _C.MODEL.CENTER_MASK.ASSIGN_CRITERION = 'ratio'
211 | _C.MODEL.CENTER_MASK.MASK_CONV_DIM = 128
212 | _C.MODEL.CENTER_MASK.MASK_NUM_CONV = 2
213 | _C.MODEL.CENTER_MASK.MASKIOU_CONV_DIM = 128
214 | _C.MODEL.CENTER_MASK.MASKIOU_NUM_CONV = 2
215 | _C.MODEL.CENTER_MASK.CLS_AGNOSTIC_MASK = False
216 | _C.MODEL.CENTER_MASK.MASKIOU_ON = False
217 | _C.MODEL.CENTER_MASK.MASKIOU_LOSS_WEIGHT = 1.0
218 | 
219 | # --------------------------------------------------------------------------- #
220 | # Miscellaneous Options
221 | # ---------------------------------------------------------------------------- #
222 | 
223 | _C.MISC = CN()
224 | _C.MISC.CITYS_INST_SEG_EVAL = False


--------------------------------------------------------------------------------
/DSP/util/__pycache__/dist_util.cpython-37.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/NeurAI-Lab/DSP/45027a3702696dafd7018802619dde17c6da1ca8/DSP/util/__pycache__/dist_util.cpython-37.pyc


--------------------------------------------------------------------------------
/DSP/util/__pycache__/test.cpython-37.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/NeurAI-Lab/DSP/45027a3702696dafd7018802619dde17c6da1ca8/DSP/util/__pycache__/test.cpython-37.pyc


--------------------------------------------------------------------------------
/DSP/util/__pycache__/torchlist.cpython-37.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/NeurAI-Lab/DSP/45027a3702696dafd7018802619dde17c6da1ca8/DSP/util/__pycache__/torchlist.cpython-37.pyc


--------------------------------------------------------------------------------
/DSP/util/__pycache__/train_util.cpython-37.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/NeurAI-Lab/DSP/45027a3702696dafd7018802619dde17c6da1ca8/DSP/util/__pycache__/train_util.cpython-37.pyc


--------------------------------------------------------------------------------
/DSP/util/__pycache__/transforms.cpython-37.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/NeurAI-Lab/DSP/45027a3702696dafd7018802619dde17c6da1ca8/DSP/util/__pycache__/transforms.cpython-37.pyc


--------------------------------------------------------------------------------
/DSP/util/__pycache__/utils.cpython-37.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/NeurAI-Lab/DSP/45027a3702696dafd7018802619dde17c6da1ca8/DSP/util/__pycache__/utils.cpython-37.pyc


--------------------------------------------------------------------------------
/DSP/util/dist_util.py:
--------------------------------------------------------------------------------
 1 | """
 2 | Distributed Data Parallel resources
 3 | """
 4 | import torch
 5 | import os
 6 | import torch.distributed as dist
 7 | 
 8 | 
 9 | def is_dist_avail_and_initialized():
10 |     if not dist.is_available():
11 |         return False
12 |     if not dist.is_initialized():
13 |         return False
14 |     return True
15 | 
16 | 
17 | def get_world_size():
18 |     if not is_dist_avail_and_initialized():
19 |         return 1
20 |     return dist.get_world_size()
21 | 
22 | 
23 | def get_rank():
24 |     if not is_dist_avail_and_initialized():
25 |         return 0
26 |     return dist.get_rank()
27 | 
28 | 
29 | def is_main_process():
30 |     return get_rank() == 0
31 | 
32 | 
33 | def setup_for_distributed(is_master):
34 |     """
35 |     This function disables printing when not in master process
36 |     """
37 |     import builtins as __builtin__
38 |     builtin_print = __builtin__.print
39 | 
40 |     def print(*args, **kwargs):
41 |         force = kwargs.pop('force', False)
42 |         if is_master or force:
43 |             builtin_print(*args, **kwargs)
44 | 
45 |     __builtin__.print = print
46 | 
47 | 
48 | def init_distributed_mode(args):
49 |     if 'RANK' in os.environ and 'WORLD_SIZE' in os.environ:
50 |         args.rank = int(os.environ["RANK"])
51 |         args.world_size = int(os.environ['WORLD_SIZE'])
52 |         args.gpu = int(os.environ['LOCAL_RANK'])
53 |     elif 'SLURM_PROCID' in os.environ:
54 |         args.rank = int(os.environ['SLURM_PROCID'])
55 |         args.gpu = args.rank % torch.cuda.device_count()
56 |     elif hasattr(args, "rank"):
57 |         pass
58 |     else:
59 |         print('Not using distributed mode')
60 |         args.distributed = False
61 |         return
62 | 
63 |     args.distributed = True
64 |     torch.cuda.set_device(args.gpu)
65 |     args.dist_backend = 'nccl'
66 |     print('| distributed init (rank {}): {}'.format(args.rank, args.dist_url), flush=True)
67 |     torch.distributed.init_process_group(backend=args.dist_backend, init_method=args.dist_url,
68 |                                          world_size=args.world_size, rank=args.rank)
69 |     setup_for_distributed(args.rank == 0)


--------------------------------------------------------------------------------
/DSP/util/torchlist.py:
--------------------------------------------------------------------------------
 1 | """
 2 | @author Fahad Sarfraz
 3 | """
 4 | import torch.utils.data as data
 5 | 
 6 | from PIL import Image
 7 | import os
 8 | import os.path
 9 | 
10 | 
11 | def default_loader(path):
12 |     return Image.open(path).convert('RGB')
13 | 
14 | 
15 | def default_flist_reader(flist):
16 |     """
17 |     flist format: impath label\nimpath label\n ...(same to caffe's filelist)
18 |     """
19 |     imlist = []
20 |     with open(flist, 'r') as rf:
21 |         for line in rf.readlines():
22 |             impath, imlabel = line.strip().split()
23 |             imlist.append( (impath, int(imlabel)) )
24 | 
25 |     return imlist
26 | 
27 | 
28 | class ImageFilelist(data.Dataset):
29 |     def __init__(self, root, flist, transform=None, target_transform=None,
30 |                  flist_reader=default_flist_reader, loader=default_loader):
31 |         self.root = root
32 |         self.imlist = flist_reader(flist)
33 |         self.transform = transform
34 |         self.target_transform = target_transform
35 |         self.loader = loader
36 | 
37 |     def __getitem__(self, index):
38 |         impath, target = self.imlist[index]
39 |         img = self.loader(os.path.join(self.root ,impath))
40 |         if self.transform is not None:
41 |             img = self.transform(img)
42 |         if self.target_transform is not None:
43 |             target = self.target_transform(target)
44 | 
45 |         return img, target
46 | 
47 |     def __len__(self):
48 |         return len(self.imlist)


--------------------------------------------------------------------------------
/DSP/util/train_util.py:
--------------------------------------------------------------------------------
  1 | import torch
  2 | import torch.nn as nn
  3 | from datetime import datetime
  4 | from tqdm import tqdm
  5 | import numpy as np
  6 | import torch.nn.functional as F
  7 | import os
  8 | from transforms.simclr_transform import SimCLRTransform
  9 | from torch.utils.data import DataLoader
 10 | from torch.utils.data.distributed import DistributedSampler
 11 | from util.utils import save_checkpoint, log
 12 | from criterion.ntxent import NTXent, BarlowTwinsLoss_CD
 13 | from criterion.sim_preserving_kd import criterion_MSE,distillation,fitnet_loss,similarity_preserving_loss, RKD, similarity_preserving_loss_cd, JSD
 14 | import dataset.CMU as CMU
 15 | import dataset.PCD as PCD
 16 | 
 17 | def get_criteria(args):
 18 |     """
 19 |     Loss criterion / criteria selection for training
 20 |     """
 21 |     if args.barlow_twins :
 22 |         criteria = {'Barlow': [BarlowTwinsLoss_CD(args)]}   #BarlowTwinsLoss
 23 |     else:
 24 |         criteria = {'ntxent': [NTXent(args), args.criterion_weight[0]]}
 25 | 
 26 |     return criteria
 27 | 
 28 | 
 29 | 
 30 | def write_scalar(writer, total_loss,total_loss_bl, total_loss_kd,total_loss_sp, loss_p_c, leng, epoch):
 31 |     """
 32 |     Add Loss scalars to tensorboard
 33 |     """
 34 |     writer.add_scalar("Total_Loss/train", total_loss/leng,epoch)
 35 |     writer.add_scalar("Total_Loss_bl",  total_loss_bl/leng,epoch)
 36 |     writer.add_scalar("Total_kd loss/train", total_loss_kd/leng, epoch)
 37 |     writer.add_scalar("Total_sp loss/train", total_loss_sp/leng, epoch)
 38 | 
 39 | 
 40 |     for k in loss_p_c:
 41 |         writer.add_scalar("{}_Loss/train".format(k), loss_p_c[k] / leng, epoch)
 42 | 
 43 | 
 44 | def trainloaderSimCLR(args):
 45 |     """
 46 |     Load training data through DataLoader
 47 |     """
 48 |     transform = SimCLRTransform(args.img_size)
 49 | 
 50 |     if args.ssl_dataset == 'CMU':
 51 |         DATA_PATH = os.path.join(args.data_dir)
 52 | 
 53 |         VAL_DATA_PATH = os.path.join(args.val_data_dir)
 54 | 
 55 | 
 56 |         train_dataset = CMU.Dataset(DATA_PATH,
 57 |                                          'train', 'ssl', transform= False,     #ssl
 58 |                                          transform_med = transform)
 59 |         # test_dataset = CMU.Dataset(VAL_DATA_PATH, 'val', transform=False,
 60 |         #                         transform_med=None)
 61 |     elif args.ssl_dataset == 'PCD':
 62 |         print('PCD dataset loaded')
 63 |         DATA_PATH = os.path.join(args.data_dir)
 64 | 
 65 |         VAL_DATA_PATH = os.path.join(args.val_data_dir)
 66 | 
 67 |         train_dataset = PCD.Dataset(DATA_PATH,
 68 |                                     'train', 'ssl', transform=False,  # ssl
 69 |                                     transform_med=transform)
 70 |     #  Data Loader
 71 |     if args.distribute:
 72 |         train_sampler = DistributedSampler(train_dataset)
 73 |         train_loader = DataLoader(train_dataset, batch_size=args.ssl_batchsize,sampler=train_sampler, drop_last=True)
 74 |     else:
 75 |         train_loader = DataLoader(train_dataset, batch_size=args.ssl_batchsize, shuffle=True, drop_last=True)
 76 |         # val_loader = DataLoader(train_dataset, batch_size=args.ssl_batchsize, shuffle=True, drop_last=True)
 77 |         # test_loader = DataLoader(test_dataset, batch_size=1, shuffle=True, drop_last=True)
 78 | 
 79 |     log("Took {} time to load data!".format(datetime.now() - args.start_time))
 80 |     return train_loader
 81 | 
 82 | def various_distance( out_vec_t0, out_vec_t1, dist_flag='l2'):
 83 | 
 84 |     if dist_flag == 'l2':
 85 |         distance = F.pairwise_distance(out_vec_t0,out_vec_t1,p=2)
 86 |     if dist_flag == 'l1':
 87 |         distance = F.pairwise_distance(out_vec_t0,out_vec_t1,p=1)
 88 |     if dist_flag == 'cos':
 89 |         similarity = F.cosine_similarity(out_vec_t0, out_vec_t1)
 90 |         distance = 1 - 2 * similarity / np.pi
 91 |     return distance
 92 | 
 93 | def train_one_epoch(args, train_loader, model, criteria, optimizer, scheduler, epoch):
 94 |     """
 95 |     Train one epoch of SSL model
 96 | 
 97 |     """
 98 |     # torch.autograd.set_detect_anomaly(True)
 99 |     loss_per_criterion = {}
100 |     total_loss = 0
101 |     total_sup_loss = 0
102 |     total_loss_bl = 0
103 |     total_loss_kd = 0
104 |     total_loss_sp = 0
105 | 
106 |     for i, batch in enumerate(train_loader):
107 |         p1, p2, n1, n2, f1,f2, label = batch    # x, y = positive pair belonging to t0 images ; x1,y1 = positive pair belonging to t1 images
108 |         p1 = p1.cuda(device=args.device)
109 |         p2 = p2.cuda(device=args.device)
110 |         n1 = n1.cuda(device=args.device)
111 |         n2 = n2.cuda(device=args.device)
112 |         label = label.cuda(device=args.device)
113 |         label = label.float()
114 |         optimizer.zero_grad()
115 |         if args.barlow_twins == True:
116 |             if args.dense_cl==True:
117 |                 xe, ye, zx, zy = model(p1, p2)
118 |                 x1e, y1e, zx1, zy1 = model(n1, n2)
119 |                 diff_feat0= torch.nn.functional.pairwise_distance(zx, zx1)
120 |                 diff_feat1 = torch.nn.functional.pairwise_distance(zy , zy1)
121 |                 diff_feat2 = torch.nn.functional.pairwise_distance(zx , zy1)
122 |                 diff_feat3 = torch.nn.functional.pairwise_distance(zy , zx1)
123 |             else:
124 |                 xe, ye, zx, zy = model(p1, p2)
125 |                 x1e, y1e, zx1, zy1 = model(n1, n2)
126 |                 ## simple diff layer to get change map
127 |                 diff_feat0 = torch.abs(zx - zx1)
128 |                 diff_feat1 = torch.abs(zy - zy1)
129 |                 diff_feat2 = torch.abs(zx - zy1)
130 |                 diff_feat3 = torch.abs(zy - zx1)
131 |         else:
132 |             _, _, zx, zy = model(p1, p2)
133 |             _, _, zx1, zy1 = model(n1, n2)
134 |         # Multiple loss aggregation
135 |         loss = torch.tensor(0).to(args.device)
136 |         for k in criteria:
137 |             global_step = epoch * len(train_loader) + i
138 |             if args.barlow_twins == True:
139 |                 if args.nodiff_tc:
140 |                     loss_bl = criteria[k][0](zx, zx1, diff_feat2, diff_feat3)
141 |                 else:
142 |                     loss_bl = criteria[k][0](diff_feat0, diff_feat1, diff_feat2, diff_feat3 )
143 | 
144 |                 if args.kd_loss==True:
145 |                     jsd = JSD(args)
146 |                     # loss_kd_1 = distillation(zx, zy, T=4)
147 |                     loss_kd_1 = jsd(zx, zy)
148 |                     loss_kd_2 = jsd(zx1, zy1)
149 |                     intra_kd_loss = loss_kd_1 + loss_kd_2
150 |                     loss_sp = 0
151 |                     if args.inter_kl==True:
152 |                         loss_kd_3 = jsd(zx, zx1)
153 |                         loss_kd_4 = jsd(zy, zy1)
154 |                         inter_kd_loss = loss_kd_3 + loss_kd_4
155 |                         loss_kd = (args.alpha_kl * intra_kd_loss) + (args.alpha_inter_kd*inter_kd_loss)
156 |                     else:
157 |                         loss_kd = args.alpha_kl * intra_kd_loss
158 |                     loss = loss_bl + loss_kd
159 |                     if args.kd_loss_2 == 'fitnet':
160 |                         loss_ft_1 = fitnet_loss(A_t=xe, A_s=ye, rand=False, noise=0.1)
161 |                         loss_ft_2 = fitnet_loss(A_t=x1e, A_s=y1e, rand=False, noise=0.1)
162 |                         loss_sp = (args.alpha_sp*(loss_kd_1 + loss_kd_2))
163 |                         loss = loss_bl + loss_kd + loss_sp
164 |                     elif args.kd_loss_2 == 'sp':
165 |                         loss_sp = ((args.alpha_sp)* similarity_preserving_loss_cd(xe, x1e, ye, y1e))
166 |                         loss = loss_bl + loss_kd + loss_sp
167 |             else:
168 | 
169 |                 loss = loss_bl
170 | 
171 |         loss.backward()
172 |         optimizer.step()
173 |         if scheduler is not None:
174 |             scheduler.step()
175 |         if i % 50 == 0:
176 | 
177 |             log("Batch {}/{}. Loss: {}. Loss_bl: {}. Loss_kd: {}. Loss_sp{}. Time elapsed: {} ".format(i, len(train_loader), loss.item(),loss_bl.item(),loss_kd.item()
178 |                                                                                       ,loss_sp.item(),datetime.now() - args.start_time))
179 |         total_loss += loss.item()
180 |         total_loss_bl += loss_bl.item()
181 |         total_loss_kd += loss_kd.item()
182 |         total_loss_sp += loss_sp.item()
183 | 
184 |     return total_loss, total_loss_bl, total_loss_kd, total_loss_sp, loss_per_criterion
185 | 
186 | 
187 | 
188 | 
189 | 
190 | def trainSSL(args, model, optimizer, criteria, writer, scheduler=None):
191 |     """
192 |     Train a SSL model
193 |     """
194 |     if not args.visualize_heatmap :
195 |         model.train()
196 |         # Data parallel Functionality
197 |         if torch.cuda.device_count() > 1:
198 |             model = nn.DataParallel(model)
199 |             log('Model converted to DataParallel model with {} cuda devices'.format(torch.cuda.device_count()))
200 |         model = model.to(args.device)
201 | 
202 |         train_loader = trainloaderSimCLR(args)
203 | 
204 |         for epoch in tqdm(range(1, args.ssl_epochs + 1)):
205 |             total_loss, total_loss_bl, total_loss_kd,total_loss_sp, loss_per_criterion = train_one_epoch(args, train_loader, model, criteria, optimizer, scheduler, epoch)
206 | 
207 |             write_scalar(writer, total_loss,total_loss_bl, total_loss_kd,total_loss_sp, loss_per_criterion, len(train_loader), epoch)
208 |             log("Epoch {}/{}. Total Loss: {}.   Time elapsed: {} ".
209 |                 format(epoch, args.ssl_epochs, total_loss / len(train_loader), total_loss_bl / len(train_loader),total_loss_kd / len(train_loader), datetime.now() - args.start_time))
210 | 
211 | 
212 |             # Save checkpoint after every epoch
213 |             path = save_checkpoint(state_dict=model.state_dict(), args=args, epoch=epoch, filename='checkpoint.pth'.format(epoch))
214 |             if os.path.exists:
215 |                 state_dict = torch.load(path, map_location=args.device)
216 |                 model.load_state_dict(state_dict)
217 | 
218 |             # Save the model at specific checkpoints
219 |             if epoch % 10 == 0:
220 | 
221 |                 if torch.cuda.device_count() > 1:
222 |                     save_checkpoint(state_dict=model.module.state_dict(), args=args, epoch=epoch,
223 |                                     filename='checkpoint_model_{}_model1.pth'.format(epoch))
224 | 
225 |                 else:
226 |                     save_checkpoint(state_dict=model.state_dict(), args=args, epoch=epoch,
227 |                                     filename='checkpoint_model_{}_model1.pth'.format(epoch))
228 | 
229 |         log("Total training time {}".format(datetime.now() - args.start_time))
230 | 
231 | 
232 |     writer.close()
233 | 


--------------------------------------------------------------------------------
/DSP/util/transforms.py:
--------------------------------------------------------------------------------
  1 | from __future__ import division
  2 | import torch
  3 | import math
  4 | import random
  5 | from PIL import Image, ImageOps
  6 | try:
  7 |     import accimage
  8 | except ImportError:
  9 |     accimage = None
 10 | import numpy as np
 11 | import numbers
 12 | import types
 13 | import collections
 14 | from torch import nn
 15 | 
 16 | 
 17 | class Compose(object):
 18 |     """Composes several transforms together.
 19 | 
 20 |     Args:
 21 |         transforms (list of ``Transform`` objects): list of transforms to compose.
 22 | 
 23 |     Example:
 24 |         >>> transforms.Compose([
 25 |         >>>     transforms.CenterCrop(10),
 26 |         >>>     transforms.ToTensor(),
 27 |         >>> ])
 28 |     """
 29 | 
 30 |     def __init__(self, transforms):
 31 |         self.transforms = transforms
 32 | 
 33 |     def __call__(self, img):
 34 |         for t in self.transforms:
 35 |             img = t(img)
 36 |         return img
 37 | 
 38 | 
 39 | class ToTensor(object):
 40 |     """Convert a ``PIL.Image`` or ``numpy.ndarray`` to tensor.
 41 | 
 42 |     Converts a PIL.Image or numpy.ndarray (H x W x C) in the range
 43 |     [0, 255] to a torch.FloatTensor of shape (C x H x W) in the range [0.0, 1.0].
 44 |     """
 45 | 
 46 |     def __call__(self, pic):
 47 |         """
 48 |         Args:
 49 |             pic (PIL.Image or numpy.ndarray): Image to be converted to tensor.
 50 | 
 51 |         Returns:
 52 |             Tensor: Converted image.
 53 |         """
 54 |         if isinstance(pic, np.ndarray):
 55 |             # handle numpy array
 56 |             img = torch.from_numpy(pic.transpose((2, 0, 1)))
 57 |             # backward compatibility
 58 |             return img.float().div(255)
 59 | 
 60 |         if accimage is not None and isinstance(pic, accimage.Image):
 61 |             nppic = np.zeros([pic.channels, pic.height, pic.width], dtype=np.float32)
 62 |             pic.copyto(nppic)
 63 |             return torch.from_numpy(nppic)
 64 | 
 65 |         # handle PIL Image
 66 |         if pic.mode == 'I':
 67 |             img = torch.from_numpy(np.array(pic, np.int32, copy=False))
 68 |         elif pic.mode == 'I;16':
 69 |             img = torch.from_numpy(np.array(pic, np.int16, copy=False))
 70 |         else:
 71 |             img = torch.ByteTensor(torch.ByteStorage.from_buffer(pic.tobytes()))
 72 |         # PIL image mode: 1, L, P, I, F, RGB, YCbCr, RGBA, CMYK
 73 |         if pic.mode == 'YCbCr':
 74 |             nchannel = 3
 75 |         elif pic.mode == 'I;16':
 76 |             nchannel = 1
 77 |         else:
 78 |             nchannel = len(pic.mode)
 79 |         img = img.view(pic.size[1], pic.size[0], nchannel)
 80 |         # put it from HWC to CHW format
 81 |         # yikes, this transpose takes 80% of the loading time/CPU
 82 |         img = img.transpose(0, 1).transpose(0, 2).contiguous()
 83 |         if isinstance(img, torch.ByteTensor):
 84 |             return img.float().div(255)
 85 |         else:
 86 |             return img
 87 | 
 88 | 
 89 | class ToPILImage(object):
 90 |     """Convert a tensor to PIL Image.
 91 | 
 92 |     Converts a torch.*Tensor of shape C x H x W or a numpy ndarray of shape
 93 |     H x W x C to a PIL.Image while preserving the value range.
 94 |     """
 95 | 
 96 |     def __call__(self, pic):
 97 |         """
 98 |         Args:
 99 |             pic (Tensor or numpy.ndarray): Image to be converted to PIL.Image.
100 | 
101 |         Returns:
102 |             PIL.Image: Image converted to PIL.Image.
103 | 
104 |         """
105 |         npimg = pic
106 |         mode = None
107 |         if isinstance(pic, torch.FloatTensor):
108 |             pic = pic.mul(255).byte()
109 |         if torch.is_tensor(pic):
110 |             npimg = np.transpose(pic.numpy(), (1, 2, 0))
111 |         assert isinstance(npimg, np.ndarray), 'pic should be Tensor or ndarray'
112 |         if npimg.shape[2] == 1:
113 |             npimg = npimg[:, :, 0]
114 | 
115 |             if npimg.dtype == np.uint8:
116 |                 mode = 'L'
117 |             if npimg.dtype == np.int16:
118 |                 mode = 'I;16'
119 |             if npimg.dtype == np.int32:
120 |                 mode = 'I'
121 |             elif npimg.dtype == np.float32:
122 |                 mode = 'F'
123 |         else:
124 |             if npimg.dtype == np.uint8:
125 |                 mode = 'RGB'
126 |         assert mode is not None, '{} is not supported'.format(npimg.dtype)
127 |         return Image.fromarray(npimg, mode=mode)
128 | 
129 | 
130 | class Normalize(object):
131 |     """Normalize an tensor image with mean and standard deviation.
132 | 
133 |     Given mean: (R, G, B) and std: (R, G, B),
134 |     will normalize each channel of the torch.*Tensor, i.e.
135 |     channel = (channel - mean) / std
136 | 
137 |     Args:
138 |         mean (sequence): Sequence of means for R, G, B channels respecitvely.
139 |         std (sequence): Sequence of standard deviations for R, G, B channels
140 |             respecitvely.
141 |     """
142 | 
143 |     def __init__(self, mean, std):
144 |         self.mean = mean
145 |         self.std = std
146 | 
147 |     def __call__(self, tensor):
148 |         """
149 |         Args:
150 |             tensor (Tensor): Tensor image of size (C, H, W) to be normalized.
151 | 
152 |         Returns:
153 |             Tensor: Normalized image.
154 |         """
155 |         # TODO: make efficient
156 |         for t, m, s in zip(tensor, self.mean, self.std):
157 |             t.sub_(m).div_(s)
158 |         return tensor
159 | 
160 | 
161 | class Scale(object):
162 |     """Rescale the input PIL.Image to the given size.
163 | 
164 |     Args:
165 |         size (sequence or int): Desired output size. If size is a sequence like
166 |             (w, h), output size will be matched to this. If size is an int,
167 |             smaller edge of the image will be matched to this number.
168 |             i.e, if height > width, then image will be rescaled to
169 |             (size * height / width, size)
170 |         interpolation (int, optional): Desired interpolation. Default is
171 |             ``PIL.Image.BILINEAR``
172 |     """
173 | 
174 |     def __init__(self, size, interpolation=Image.BILINEAR):
175 |         assert isinstance(size, int) or (isinstance(size, collections.Iterable) and len(size) == 2)
176 |         self.size = size
177 |         self.interpolation = interpolation
178 | 
179 |     def __call__(self, img):
180 |         """
181 |         Args:
182 |             img (PIL.Image): Image to be scaled.
183 | 
184 |         Returns:
185 |             PIL.Image: Rescaled image.
186 |         """
187 |         if isinstance(self.size, int):
188 |             w, h = img.size
189 |             if (w <= h and w == self.size) or (h <= w and h == self.size):
190 |                 return img
191 |             if w < h:
192 |                 ow = self.size
193 |                 oh = int(self.size * h / w)
194 |                 return img.resize((ow, oh), self.interpolation)
195 |             else:
196 |                 oh = self.size
197 |                 ow = int(self.size * w / h)
198 |                 return img.resize((ow, oh), self.interpolation)
199 |         else:
200 |             return img.resize(self.size, self.interpolation)
201 | 
202 | 
203 | class CenterCrop(object):
204 |     """Crops the given PIL.Image at the center.
205 | 
206 |     Args:
207 |         size (sequence or int): Desired output size of the crop. If size is an
208 |             int instead of sequence like (h, w), a square crop (size, size) is
209 |             made.
210 |     """
211 | 
212 |     def __init__(self, size):
213 |         if isinstance(size, numbers.Number):
214 |             self.size = (int(size), int(size))
215 |         else:
216 |             self.size = size
217 | 
218 |     def __call__(self, img):
219 |         """
220 |         Args:
221 |             img (PIL.Image): Image to be cropped.
222 | 
223 |         Returns:
224 |             PIL.Image: Cropped image.
225 |         """
226 |         w, h = img.size
227 |         th, tw = self.size
228 |         x1 = int(round((w - tw) / 2.))
229 |         y1 = int(round((h - th) / 2.))
230 |         return img.crop((x1, y1, x1 + tw, y1 + th))
231 | 
232 | 
233 | class Pad(object):
234 |     """Pad the given PIL.Image on all sides with the given "pad" value.
235 | 
236 |     Args:
237 |         padding (int or sequence): Padding on each border. If a sequence of
238 |             length 4, it is used to pad left, top, right and bottom borders respectively.
239 |         fill: Pixel fill value. Default is 0.
240 |     """
241 | 
242 |     def __init__(self, padding, fill=0):
243 |         assert isinstance(padding, numbers.Number)
244 |         assert isinstance(fill, numbers.Number) or isinstance(fill, str) or isinstance(fill, tuple)
245 |         self.padding = padding
246 |         self.fill = fill
247 | 
248 |     def __call__(self, img):
249 |         """
250 |         Args:
251 |             img (PIL.Image): Image to be padded.
252 | 
253 |         Returns:
254 |             PIL.Image: Padded image.
255 |         """
256 |         return ImageOps.expand(img, border=self.padding, fill=self.fill)
257 | 
258 | 
259 | class Lambda(object):
260 |     """Apply a user-defined lambda as a transform.
261 | 
262 |     Args:
263 |         lambd (function): Lambda/function to be used for transform.
264 |     """
265 | 
266 |     def __init__(self, lambd):
267 |         assert isinstance(lambd, types.LambdaType)
268 |         self.lambd = lambd
269 | 
270 |     def __call__(self, img):
271 |         return self.lambd(img)
272 | 
273 | 
274 | class RandomCrop(object):
275 |     """Crop the given PIL.Image at a random location.
276 | 
277 |     Args:
278 |         size (sequence or int): Desired output size of the crop. If size is an
279 |             int instead of sequence like (h, w), a square crop (size, size) is
280 |             made.
281 |         padding (int or sequence, optional): Optional padding on each border
282 |             of the image. Default is 0, i.e no padding. If a sequence of length
283 |             4 is provided, it is used to pad left, top, right, bottom borders
284 |             respectively.
285 |     """
286 | 
287 |     def __init__(self, size, padding=0):
288 |         if isinstance(size, numbers.Number):
289 |             self.size = (int(size), int(size))
290 |         else:
291 |             self.size = size
292 |         self.padding = padding
293 | 
294 |     def __call__(self, img):
295 |         """
296 |         Args:
297 |             img (PIL.Image): Image to be cropped.
298 | 
299 |         Returns:
300 |             PIL.Image: Cropped image.
301 |         """
302 |         if self.padding > 0:
303 |             img = ImageOps.expand(img, border=self.padding, fill=0)
304 | 
305 |         w, h = img.size
306 |         th, tw = self.size
307 |         if w == tw and h == th:
308 |             return img
309 | 
310 |         if w < tw or h < th:
311 |             return img.resize((tw, th), Image.BILINEAR)
312 | 
313 |         x1 = random.randint(0, w - tw)
314 |         y1 = random.randint(0, h - th)
315 |         return img.crop((x1, y1, x1 + tw, y1 + th))
316 | 
317 | 
318 | class RandomHorizontalFlip(object):
319 |     """Horizontally flip the given PIL.Image randomly with a probability of 0.5."""
320 | 
321 |     def __call__(self, img):
322 |         """
323 |         Args:
324 |             img (PIL.Image): Image to be flipped.
325 | 
326 |         Returns:
327 |             PIL.Image: Randomly flipped image.
328 |         """
329 |         if random.random() < 0.5:
330 |             return img.transpose(Image.FLIP_LEFT_RIGHT)
331 |         return img
332 | 
333 | 
334 | 
335 | class RandomChoice(nn.Module):
336 |     def __init__(self, transforms):
337 |        super().__init__()
338 |        self.transforms = transforms
339 | 
340 |     def __call__(self, imgs):
341 |         t = random.choice(self.transforms)
342 |         return [t(img) for img in imgs]
343 | 
344 | class RandomSizedCrop(object):
345 |     """Crop the given PIL.Image to random size and aspect ratio.
346 | 
347 |     A crop of random size of (0.08 to 1.0) of the original size and a random
348 |     aspect ratio of 3/4 to 4/3 of the original aspect ratio is made. This crop
349 |     is finally resized to given size.
350 |     This is popularly used to train the Inception networks.
351 | 
352 |     Args:
353 |         size: size of the smaller edge
354 |         interpolation: Default: PIL.Image.BILINEAR
355 |     """
356 | 
357 |     def __init__(self, size, interpolation=Image.BILINEAR):
358 |         self.size = size
359 |         self.interpolation = interpolation
360 | 
361 |     def __call__(self, img):
362 |         for attempt in range(10):
363 |             area = img.size[0] * img.size[1]
364 |             target_area = random.uniform(0.08, 1.0) * area
365 |             aspect_ratio = random.uniform(3. / 4, 4. / 3)
366 | 
367 |             w = int(round(math.sqrt(target_area * aspect_ratio)))
368 |             h = int(round(math.sqrt(target_area / aspect_ratio)))
369 | 
370 |             if random.random() < 0.5:
371 |                 w, h = h, w
372 | 
373 |             if w <= img.size[0] and h <= img.size[1]:
374 |                 x1 = random.randint(0, img.size[0] - w)
375 |                 y1 = random.randint(0, img.size[1] - h)
376 | 
377 |                 img = img.crop((x1, y1, x1 + w, y1 + h))
378 |                 assert(img.size == (w, h))
379 | 
380 |                 return img.resize((self.size, self.size), Image.BILINEAR)
381 | 
382 |         # Fallback
383 |         scale = Scale(self.size, interpolation=self.interpolation)
384 |         crop = CenterCrop(self.size)
385 |         return crop(scale(img))
386 | 
387 | 


--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
 1 | MIT License
 2 | 
 3 | Copyright (c) 2022 NeurAI
 4 | 
 5 | Permission is hereby granted, free of charge, to any person obtaining a copy
 6 | of this software and associated documentation files (the "Software"), to deal
 7 | in the Software without restriction, including without limitation the rights
 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 | 
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 | 
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
 1 | # Differencing based Self-supervised pretraining for scene change detection (DSP)
 2 | 
 3 | 
 4 | **This is the official code for COLLA 2022 Paper, ["Differencing based Self-supervised pretraining for scene change detection"](https://proceedings.mlr.press/v199/ramkumar22a.html) by [Vijaya Raghavan Thiruvengadathan Ramkumar](https://www.linkedin.com/in/vijayaraghavan95), [Elahe Arani](https://www.linkedin.com/in/elahe-arani-630870b2/) and [Bahram Zonooz](https://www.linkedin.com/in/bahram-zonooz-2b5589156/), where we propose a novel self-supervised pretraining architechture based on differenceing called DSP for scene change detection.**
 5 | 
 6 | ## Abstract
 7 | 
 8 | 
 9 | Scene change detection (SCD), a crucial perception task, identifies changes by comparing scenes captured at different times. SCD is challenging due to noisy changes in illumination, seasonal variations, and perspective differences across a pair of views. Deep neural network based solutions require a large quantity of annotated data which is tedious and expensive to obtain. On the other hand, transfer learning from large datasets induces domain shift. To address these challenges, we propose a novel Differencing self-supervised pretraining (DSP) method that uses feature differencing to learn discriminatory representations corresponding to the changed regions while simultaneously tackling the noisy changes by enforcing temporal invariance across views. Our experimental results on SCD datasets demonstrate the effectiveness of our method, specifically to differences in camera viewpoints and lighting conditions. Compared against the self-supervised Barlow Twins and the standard ImageNet pretraining that uses more than a million additional labeled images, DSP can surpass it without using any additional data. Our results also demonstrate the robustness of DSP to natural corruptions, distribution shift, and learning under limited labeled data.
10 | 
11 | ![alt text](https://github.com/NeurAI-Lab/DSP/blob/main/method.png)
12 | 
13 | For more details, please see the [Paper](https://arxiv.org/abs/2208.05838) and [Presentation](https://www.youtube.com/watch?v=kWUxxC5hjKw).
14 | 
15 | ## Requirements
16 | 
17 | - python 3.6+
18 | - opencv 3.4.2+
19 | - pytorch 1.6.0
20 | - torchvision 0.4.0+
21 | - tqdm 4.51.0
22 | - tensorboardX 2.1
23 | 
24 | ## Datasets
25 | 
26 | Our network is tested on two datasets for street-view scene change detection. 
27 | 
28 | - 'PCD' dataset from [Change detection from a street image pair using CNN features and superpixel segmentation](http://www.vision.is.tohoku.ac.jp/files/9814/3947/4830/71-Sakurada-BMVC15.pdf). 
29 |   - You can find the information about how to get 'TSUNAMI', 'GSV' and preprocessed datasets for training and test [here](https://kensakurada.github.io/pcd_dataset.html).
30 | - 'VL-CMU-CD' dataset from [Street-View Change Detection with Deconvolutional Networks](http://www.robesafe.com/personal/roberto.arroyo/docs/Alcantarilla16rss.pdf).
31 |   -  'VL-CMU-CD': [[googledrive]](https://drive.google.com/file/d/0B-IG2NONFdciOWY5QkQ3OUgwejQ/view?resourcekey=0-rEzCjPFmDFjt4UMWamV4Eg)
32 | 
33 | ## Dataset Preprocessing
34 | 
35 | - For DSP pretraining - included in the DSP--dataset--CMU.py/PCD.py
36 | - For finetuning and evaluation - Please follow the preprocessing method used by the official implementation of [{Dynamic Receptive Temporal Attention Network for Street Scene Change Detection paper}](https://github.com/Herrccc/DR-TANet) 
37 | 
38 | Dataset folder structure for VL-CMU-CD:
39 | ```bash
40 | ├── VL-CMU-CD
41 | │   ├── Image_T0
42 | │   ├── Image_T1
43 | │   ├── Ground Truth
44 | 
45 | ```
46 | 								
47 | ## SSL Training
48 | 
49 | 
50 | - For training 'DSP' on VL-CMU-CD dataset:
51 | ```
52 | python3 DSP/train.py --ssl_batchsize 16 --ssl_epochs 500 --save_dir /outputs --data_dir /path/to/VL-CMU-CD --img_size 256 --n_proj 256 --hidden_layer 512 --output_stride 8 --pre_train False --m_backbone False --barlow_twins True --dense_cl False --kd_loss True --kd_loss_2 sp --inter_kl False --alpha_inter_kd 0 --alpha_sp 3000 --alpha_kl 100
53 | ```
54 |  
55 | 
56 | ## Fine Tuning
57 | 
58 | We evaluate Rand, Imagenet supervised, Barlow twins, and DSP pretraining on DR-TANet.
59 | - Follow the Please follow the train and test procedure used by the official implementation of [{Dynamic Receptive Temporal Attention Network for Street Scene Change Detection paper}](https://github.com/Herrccc/DR-TANet) 
60 | 
61 | Start training with DR-TANet on 'VL-CMU-CD' dataset.
62 | 
63 |     python3 train.py --dataset vl_cmu_cd --datadir /path_to_dataset --checkpointdir /path_to_check_point_directory --max-epochs 150 --batch-size 16 --encoder-arch resnet50 --epoch-save 25 --drtam --refinement
64 | 
65 | Start evaluating with DR-TANet on 'PCD' dataset.
66 | 
67 |     python3 eval.py --dataset pcd --datadir /path_to_dataset --checkpointdir /path_to_check_point_directory --resultdir /path_to_save_eval_result --encoder-arch resnet50 --drtam --refinement --store-imgs
68 |   
69 | ## Evaluating the finetuned model
70 | 
71 | Start evaluating with DR-TANet on 'PCD' dataset.
72 | 
73 |     python3 eval.py --dataset pcd --datadir /path_to_dataset --checkpointdir /path_to_check_point_directory --resultdir /path_to_save_eval_result --encoder-arch resnet18 --drtam --refinement --store-imgs
74 |     
75 | ## Analysis
76 | We analyse our DSP model under 3 scenarios: **1. Robustness to Natural corruptions 2. Out-of-distribution data 3. Limited labeled data. For more details, please see the [Paper](https://arxiv.org/abs/2208.05838).** 
77 | For Natural corruptions evaluation, please refer to the paper [{Benchmarking Neural Network Robustness to
78 | Common Corruptions and Surface Variations }](https://arxiv.org/pdf/1807.01697.pdf) 
79 | 
80 | And finally, for the ease of comparison, we have provided the model checkpoints for the DSP pretraining below:  [google drive](https://drive.google.com/drive/folders/1UwFQ7NjXRwyfgfhFnX6_CPTm8hQ8AoFF?usp=sharing)
81 | 
82 | 
83 | ## Cite our work
84 | 
85 | If you find the code useful in your research, please consider citing our paper:
86 | 
87 | <pre>
88 | @inproceedings{ramkumar2022differencing,
89 |   title={Differencing based Self-supervised pretraining for Scene Change Detection},
90 |   author={Ramkumar, Vijaya Raghavan T and Arani, Elahe and Zonooz, Bahram},
91 |   booktitle={Conference on Lifelong Learning Agents},
92 |   pages={952--965},
93 |   year={2022},
94 |   organization={PMLR}
95 | }
96 | 


--------------------------------------------------------------------------------
/method.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/NeurAI-Lab/DSP/45027a3702696dafd7018802619dde17c6da1ca8/method.png


--------------------------------------------------------------------------------