├── LICENSE ├── README.md ├── dataset └── README.md ├── source ├── .gitignore ├── Datasets.py ├── SparseImageWarp.py ├── augment.py ├── batch_runner.py ├── config.yaml ├── default.yaml ├── environment.yml ├── main.py ├── model │ ├── Alexnet.py │ ├── CNN_GRU.py │ ├── STF.py │ ├── THAT.py │ ├── TransCNN.py │ ├── UniTS.py │ ├── Widar3.py │ ├── __init__.py │ ├── dual-dl.py │ ├── dual.py │ ├── laxcat.py │ ├── layer_maker.py │ ├── mann.py │ ├── norm.py │ ├── resnet.py │ ├── slnet.py │ ├── static_UniTS.py │ └── transformer_encoder.py ├── pytorchtools.py ├── requirements.txt ├── set_device.py ├── utils.py └── widar3 │ ├── all_5500_top6_1000_2560 │ ├── test_filename.npy │ ├── test_label.npy │ ├── train_filename.npy │ ├── train_label.npy │ ├── valid_filename.npy │ └── valid_label.npy │ ├── all_top6.txt │ └── small_1100_top6_1000_2560 │ ├── test_filename.npy │ ├── test_label.npy │ ├── train_filename.npy │ ├── train_label.npy │ ├── valid_filename.npy │ └── valid_label.npy └── ubicomp24-rfboost-final.pdf /README.md: -------------------------------------------------------------------------------- 1 | # RFBoost 2 | 3 | [![License: GPL v3](https://img.shields.io/badge/License-GPLv3-blue.svg)](https://www.gnu.org/licenses/gpl-3.0) 4 | 5 | Datasets and PyTorch code for **RFBoost: Understanding and Boosting Deep WiFi Sensing via Physical Data Augmentation**. 6 | 7 | ## Prerequisites 8 | 9 | - Clone this repo and download the preprocessed Widar3 dataset from the [link](https://connecthkuhk-my.sharepoint.com/:u:/g/personal/u3008874_connect_hku_hk/EQr23WGSqOlJqlfqf7j6ThQBKT45tbPCEpEgSV9wNhwVrg?e=tNNf3u) (password: hku-aiot-rfboost24). You can also download the raw data from the [Widar3 website](http://tns.thss.tsinghua.edu.cn/widar3.0/). 10 | ```bash 11 | unzip NPZ-pp.zip -d "dataset/NPZ-pp/" 12 | ``` 13 | - (Optional) Set up the cache path in `config.yaml`: 14 | ```yaml 15 | cache_folder: "/path/to/cache/dir/" 16 | ``` 17 | 18 | - Use Conda to manage the Python environment: 19 | ```bash 20 | # create the rfboost-pytorch2 environment 21 | conda env create -f environment.yml 22 | ``` 23 | 24 | ## How to Run 25 | 26 | 1. Start the batch runner with: 27 | ```bash 28 | python source/batch_runner.py 29 | ``` 30 | 2. If everything goes well, training logs are recorded in `./log///`, final results are available under `./record`, and TensorBoard logs are located at `./runs`. 31 | 32 | ## Supported methods 33 | 34 | The current version supports data augmentation methods for the Widar3 dataset and models using DFS input. In the `batch_runner.py` file, uncomment the method you want to use. Available options include "PCA", "All Subcarriers", "RDA", and "ISS-6". 35 | 36 | Note that customized augmentation methods are defined in `augment.py`. (TODO: We will refactor the definition logic in the future.) 37 | 38 | ## Files and Directories 39 | 40 | ### About `source/batch_runner.py`: task runner 41 | This is a multi-task queue that allows multiple augmentation combinations to be submitted at once for performance testing. Currently, you can adjust the Dataset, Model, default_window, augmentation, hyperparameters, and so on. 42 | 43 | By default, it uses the RFNet model and the Widar3 dataset with default parameters, testing Cross-RX evaluation. 44 | 45 | ### About `dataset/` & `source/Datasets.py`: Dataset and Splits 46 | The original data are stored in the `dataset/` directory, but different tasks need different data splits, so we save the split files in the `source//` folder. By default, `main.py` also supports K-fold cross-validation.
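For a quick sanity check of a split, the `.npy` files can be read directly. Below is a minimal sketch (assuming the repository root as the working directory and the `small_1100_top6_1000_2560` split shown in the tree above); it mirrors how `RFBoostDataset` in `source/Datasets.py` loads these files:

```python
import numpy as np

# Hypothetical quick inspection of one split; adjust the folder to the split you use.
root = "source/widar3/small_1100_top6_1000_2560/"
for part in ["train", "valid", "test"]:
    files = np.load(root + "{}_filename.npy".format(part))
    labels = np.load(root + "{}_label.npy".format(part), allow_pickle=True)
    print(part, len(files), "samples,", len(set(labels.tolist())), "classes")
```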
47 | 48 | ### About `source/augment.py`: Repo of augmentation methods 49 | Users can write their own augmentation rules in this file. 50 | 51 | ## Notes 52 | 53 | This repository is built upon the [UniTS repo](https://github.com/Shuheng-Li/UniTS-Sensory-Time-Series-Classification). We are grateful for their initial work. 54 | 55 | ## Citation 56 | 57 | ``` 58 | @article{hou2024rfboost, 59 | author = {Hou, Weiying and Wu, Chenshu}, 60 | title = {RFBoost: Understanding and Boosting Deep WiFi Sensing via Physical Data Augmentation}, 61 | year = {2024}, 62 | journal = {Proc. ACM Interact. Mob. Wearable Ubiquitous Technol.}, 63 | } 64 | ``` 65 | 66 | ## License 67 | 68 | This project is licensed under the GPL v3 License - see the [LICENSE](LICENSE) file for details. 69 | -------------------------------------------------------------------------------- /dataset/README.md: -------------------------------------------------------------------------------- 1 | ### Datasets 2 | Please see the description at https://github.com/aiot-lab/RFBoost for obtaining the processed datasets. 3 | -------------------------------------------------------------------------------- /source/.gitignore: -------------------------------------------------------------------------------- 1 | ../cache/* 2 | record 3 | runs 4 | log -------------------------------------------------------------------------------- /source/Datasets.py: -------------------------------------------------------------------------------- 1 | from ast import arg 2 | import random 3 | import numpy as np 4 | import os 5 | import torch 6 | from torch.utils.data import Dataset 7 | from sklearn.metrics import recall_score, f1_score, accuracy_score, confusion_matrix, roc_auc_score, roc_curve, average_precision_score 8 | from sklearn.utils import resample 9 | import scipy.io as sio 10 | from utils import * 11 | from augment import * 12 | import torch.nn.functional as F 13 | import tqdm 14 | import mat73 15 | 16 | from collections import Counter 17 | 18 | class RFBoostDataset(Dataset): 19 | def __init__(self, args, path_file, part, config=None, slice_idx=None): 20 | self.part = part 21 | self.args = args 22 | self.time_aug = args.time_aug 23 | self.freq_aug_list = [] 24 | self.freq_args_list = [] 25 | 26 | if args.freq_aug != [] and args.freq_aug != ['']: 27 | # e.g. 
["kmeans,4", "ms-top,4"] 28 | for freq_aug_one in args.freq_aug: 29 | # "kmeans,4" 30 | fda_policy, freq_args = freq_aug_one.split(',') 31 | freq_args = int(freq_args) 32 | for i in range(freq_args): 33 | self.freq_aug_list.append(fda_policy) 34 | self.freq_args_list.append((i, freq_args)) 35 | else: 36 | self.freq_args_list = [] 37 | 38 | # Augmentation parameters 39 | self.space_aug = args.space_aug 40 | 41 | self.n_fda = len(self.freq_aug_list) 42 | self.n_tda = len(self.time_aug) 43 | self.n_sda = len(self.space_aug) 44 | 45 | self.augment = Augmentation(args.default_stft_window, window_step=10) 46 | self.aug_ratio = self.n_fda + self.n_tda + self.n_sda 47 | 48 | if self.part in ["test"] and self.args.exp_test.startswith("rx"): 49 | # rx_sel: rx-0,2,4 50 | self.rx_candidate = self.args.exp_test.split('-')[1].split(',') 51 | elif self.part in ["train", "valid"] and self.args.exp_train_val.startswith("rx"): 52 | # rx_sel: rx-0,2,4 53 | self.rx_candidate = self.args.exp_train_val.split('-')[1].split(',') 54 | else: 55 | self.rx_candidate = ["all"] 56 | 57 | if self.args.dataset == "widar3": 58 | # args.data_path = "" 59 | root_folder = "./widar3/{}/".format(args.data_path) 60 | # Hard 61 | # which = "fold0" 62 | self.records = np.load(root_folder+"{}_filename.npy".format(self.part)) 63 | self.labels = np.load(root_folder+"{}_label.npy".format( self.part), allow_pickle=True) 64 | 65 | def apply_slice(self, slice_idx): 66 | if slice_idx is None: 67 | return 68 | 69 | self.records = [self.records[i] for i in slice_idx] 70 | # self.data_paths = [self.data_paths[i] for i in slice_idx] 71 | self.labels = [self.labels[i] for i in slice_idx] 72 | self.ms = [self.ms[i] for i in slice_idx] 73 | 74 | def label_dist_str(self): 75 | label_conter = Counter(self.labels) 76 | dist_str = "[" 77 | # sort by label 78 | for label, cnt in sorted(label_conter.items(), key=lambda x: int(x[0])): 79 | dist_str += "{}:{:} ".format(label, cnt) 80 | dist_str += "]" 81 | return dist_str 82 | def index_map(self, index, shape): 83 | # shape: [N, A, Rx] 84 | # return N_i, A_i, Rx_i of index 85 | N, A, Rx = shape 86 | N_i = index // (A*Rx) 87 | A_i = (index % (A*Rx)) // Rx 88 | Rx_i = (index % (A*Rx)) % Rx 89 | 90 | return N_i, A_i, Rx_i 91 | 92 | def __getitem__(self, index): 93 | """ 94 | index mapping to the original data 95 | len(time_aug) + len(freq_aug) + len(space_aug) = aug_ratio - 1 96 | 0... a1-1 .. a1+a2-1... a1+a2+a3-1... a1+a2+a3(original) 97 | 98 | 0 1 2 3 ... aug_ratio-1 99 | aug ... 100 | 2*aug 101 | 102 | ... 103 | (n-1)*aug ... 
n*aug_ratio-1 104 | 105 | """ 106 | # if augmentation is enabled 107 | if self.args.exp.startswith("imb-"): 108 | # don't use augmentation for all 109 | file_idx = index 110 | # original data 111 | aug_idx = self.aug_ratio 112 | else: 113 | file_idx, aug_idx, rx_idx = self.index_map(index, [len(self.records), self.aug_ratio, len(self.rx_candidate)]) 114 | 115 | # Get Spectrogram 116 | try: 117 | if self.args.dataset.startswith("widar3"): 118 | path = self.records[file_idx] 119 | label = self.labels[file_idx] 120 | try: 121 | if self.args.version == "norm-filter": 122 | data_path = "../dataset/NPZ-pp/{}/{}.npz".format(self.args.version, path) 123 | ms_path = "../dataset/NPZ-pp/{}-ms/{}.mat".format(self.args.version, path) 124 | csi_data, ms = np.load(data_path)['data'], sio.loadmat(ms_path)['ms'] 125 | elif self.args.version == "norm-filter-2024": 126 | data_path = "../dataset/NPZ-pp/{}/{}.mat".format(self.args.version, path) 127 | data_packed = sio.loadmat(data_path) 128 | csi_data, ms = data_packed['data'], data_packed['ms'] 129 | # label start from 0 130 | label = int(label) - 1 131 | 132 | except Exception as e: 133 | self.args.log(e) 134 | self.args.log("Error: {}".format(data_path)) 135 | raise Exception("Error: {}".format(data_path)) 136 | 137 | if self.rx_candidate is not None and self.rx_candidate[0] != "all": 138 | # rx_sel: rx-0,2,4 139 | rx_sel = int(self.rx_candidate[rx_idx]) 140 | 141 | csi_data = csi_data[:, :, rx_sel][:, :, np.newaxis] 142 | ms = ms[:, :, rx_sel][:, :, np.newaxis] 143 | 144 | # [T, F, Rx] -> [Rx, T, F] 145 | csi_data = np.transpose(csi_data, (2, 0, 1)) 146 | if ms is not None: 147 | # if ms has only 1 dim 148 | if ms.ndim == 1: 149 | ms = np.expand_dims(ms, axis=0) 150 | else: 151 | # # [W, F, Rx] -> [Rx, F, W] 152 | ms = np.transpose(ms, (2, 1, 0)) 153 | 154 | # with record_function("Get_DFS"): 155 | # Augmentation 156 | if self.part == 'train' or (self.part in ['test', 'valid'] and self.args.enable_test_aug): 157 | if aug_idx < self.n_tda: 158 | tda_idx = aug_idx 159 | dfs = [self.augment.time_augment(csi_data[i], ms[i], self.time_aug[tda_idx], args=self.args, agg_type="pca")[2] for i in range(csi_data.shape[0])] 160 | elif aug_idx < self.n_tda + self.n_fda: 161 | fda_idx = aug_idx - self.n_tda 162 | dfs = [self.augment.frequency_augment(csi_data[i], ms[i], self.freq_aug_list[fda_idx], self.freq_args_list[fda_idx][1], fda_th=self.freq_args_list[fda_idx][0], file_th=file_idx, rx_th=i, args=self.args)[2] for i in range(csi_data.shape[0])] 163 | elif aug_idx < self.n_tda + self.n_fda + self.n_sda: 164 | sda_idx = aug_idx - self.n_tda - self.n_fda 165 | dfs = [self.augment.space_augment(csi_data[i], ms[i], self.space_aug[sda_idx], args=self.args)[2] for i in range(csi_data.shape[0])] 166 | else: 167 | # error 168 | raise Exception("Error: {}".format(data_path)) 169 | else: 170 | fda_idx = aug_idx - self.n_tda 171 | dfs = [self.augment.frequency_augment(csi_data[i], ms[i], "all", self.freq_args_list[fda_idx][1], fda_th=self.freq_args_list[fda_idx][0], file_th=file_idx, rx_th=i, args=self.args)[2] for i in range(csi_data.shape[0])] 172 | 173 | if np.array(dfs).shape.__len__() != 3: 174 | # (1, 90, 121, 139) -> (1x90, 121, 139) 175 | dfs = np.array(dfs) 176 | dfs = dfs.reshape((dfs.shape[0]*dfs.shape[1], dfs.shape[2], dfs.shape[3])) 177 | 178 | # [Rx, F, W] 179 | dfs = np.array([pad_and_downsample(d, self.args.input_size) for d in dfs]) 180 | 181 | if self.args.model == 'Widar3': 182 | # [Rx, F, W] -> [W, Rx, F] 183 | dfs = dfs.transpose((2, 0, 1)) 184 | # [W, 
Rx, F] -> [W, 1, Rx, F] extend_dim 185 | dfs = np.expand_dims(dfs, axis=1) 186 | elif self.args.model in ["ResNet18", "AlexNet"]: 187 | # [Rx, F, W] -> [Rx, W, F] 188 | dfs = dfs.transpose((0, 2, 1)) 189 | elif self.args.model == "RFNet": 190 | # [Rx, F, W] -> [W, Rx, F] 191 | dfs = dfs.transpose((2, 0, 1)) 192 | # [W, Rx, F] -> [W, Rx*F] 193 | dfs = dfs.reshape((dfs.shape[0], -1)) 194 | 195 | elif self.args.model == "CNN_GRU": 196 | # [W, Rx*F] -> [W, 1, Rx*F] 197 | dfs = np.expand_dims(dfs, axis=1) 198 | 199 | # normalize 200 | dfs = (dfs - np.mean(dfs)) / np.std(dfs) 201 | 202 | if np.isnan(dfs).any(): 203 | print('nan') 204 | 205 | return dfs, label 206 | except Exception as e: 207 | self.args.log(e) 208 | self.args.log("Error: {}".format(data_path)) 209 | raise Exception("Error: {}".format(data_path)) 210 | 211 | def __len__(self): 212 | if self.args.exp.startswith("imb-"): 213 | return len(self.records) 214 | 215 | if self.part == 'train' or (self.part in ['test', 'valid'] and self.args.enable_test_aug): 216 | return len(self.records) * self.aug_ratio * len(self.rx_candidate) 217 | else: 218 | return len(self.records) * self.aug_ratio * len(self.rx_candidate) 219 | 220 | def eval(model, eval_loader, epoch, kind, args): 221 | y_pred = [] 222 | y_true = [] 223 | prob_all =[] 224 | eval_loss = 0 225 | iter_conter = 0 226 | with torch.no_grad(): 227 | model.eval() 228 | for i, (x, y) in enumerate(tqdm.tqdm(eval_loader)): 229 | iter_conter += 1 230 | x = x.cuda(non_blocking=True).float() 231 | y = y.cuda(non_blocking=True).long() 232 | 233 | out = model(x) 234 | prob = F.softmax(out, dim=1) 235 | prob_all.append(prob.cpu().numpy()) 236 | # eval_loss += args.loss_func(out, y) 237 | loss = args.loss_func(out, y) 238 | eval_loss += loss.cpu().item() 239 | 240 | pred = torch.argmax(out, dim = -1) 241 | y_pred += pred.cpu().tolist() 242 | y_true += y.cpu().tolist() 243 | 244 | eval_loss /= iter_conter 245 | 246 | C = eval_loader.dataset.aug_ratio 247 | B = len(y_true) // C 248 | # majority voting 249 | # [B*C] -> [B, C] 250 | y_pred = np.array(y_pred)[:B*C].reshape((B, C)) 251 | y_true = np.array(y_true)[:B*C].reshape((B, C)) 252 | prob_all = np.concatenate(prob_all, axis=0) 253 | prob_all = np.array(prob_all)[:B*C,:].reshape((B, C, -1)) 254 | # [B, C] -> [B] 255 | y_pred = [np.bincount(p).argmax() for p in y_pred] 256 | y_true = [np.bincount(p).argmax() for p in y_true] 257 | prob_all = np.array([np.mean(p, axis=0) for p in prob_all]) 258 | 259 | if args.num_labels == 2: 260 | # calculate ROC curve using out 261 | prob_all = np.concatenate(prob_all, axis=0) 262 | fpr, tpr, thresholds = roc_curve(y_true, prob_all[:, 1]) 263 | args.log("ROC fpr: {}, tpr: {}, thresholds: {}".format(fpr, tpr, thresholds)) 264 | auc = roc_auc_score(y_true, prob_all[:, 1], average='weighted') 265 | # fpr80 266 | fpr80 = fpr[np.where(tpr >= 0.8)[0][0]] 267 | fpr95 = fpr[np.where(tpr >= 0.95)[0][0]] 268 | fnr = 1 - tpr 269 | eer_threshold = thresholds[np.argmin(np.absolute(fnr - fpr))] 270 | eer_pred = prob_all[:, 1] >= eer_threshold 271 | y_pred = eer_pred.astype(int) 272 | auprc = average_precision_score(y_true, prob_all[:, 1], average='weighted') 273 | 274 | eval_acc = accuracy_score(y_true, y_pred) 275 | if args.num_labels == 2: 276 | eval_f1 = f1_score(y_true, y_pred, average='weighted') 277 | else: 278 | eval_f1 = f1_score(y_true, y_pred, labels=list(range(args.num_labels)),average='macro') 279 | """ 280 | draw a confusion matrix with TF, TN, FP, FN: 281 | GT\Pred |Positive | Negative 282 | 
------------------+----------- 283 | True | TP | FN # <- Fall 284 | False | FP | TN # <- Normal 285 | 286 | false alarm rate = FP / (FP + TN) 287 | miss alarm rate = FN / (FN + TP) 288 | detection rate = TP / (TP + FN) 289 | 290 | FPR = FP / (FP + TN) 291 | TPR = TP / (TP + FN) 292 | 293 | """ 294 | y_pred = np.array(y_pred) 295 | y_true = np.array(y_true) 296 | false_alarm_rate = np.sum((y_true == 0) & (y_pred == 1)) / np.sum(y_true == 0) 297 | miss_alarm_rate = np.sum((y_true == 1) & (y_pred == 0)) / np.sum(y_true == 1) 298 | detection_rate = 1 - miss_alarm_rate 299 | if kind == "valid": 300 | args.log("[epoch={}]Validation Accuracy : {:.7} Macro F1 : {:.7} Loss : {:.7}\n". 301 | format(epoch, str(eval_acc), str(eval_f1), str(eval_loss))) 302 | args.writer.add_scalar('accuracy/valid', eval_acc, epoch) 303 | args.writer.add_scalar('f1/valid', eval_f1, epoch) 304 | args.writer.add_scalar('loss/valid', eval_loss, epoch) 305 | args.writer.add_scalar('FAR/valid', false_alarm_rate, epoch) 306 | args.writer.add_scalar('MAR/valid', miss_alarm_rate, epoch) 307 | if args.num_labels == 2: 308 | args.writer.add_scalar('AUC/valid', auc, epoch) 309 | args.writer.add_scalar('AUPRC/valid', auprc, epoch) 310 | # detection_rate 311 | args.writer.add_scalar('DET/valid', detection_rate, epoch) 312 | # args.writer.add_scalar('EER/valid', eer, epoch) 313 | args.writer.add_scalar('FPR80/valid', fpr80, epoch) 314 | args.writer.add_scalar('FPR95/valid', fpr95, epoch) 315 | 316 | elif kind == "test": 317 | args.log("[epoch={}]Test Accuracy : {:.7} Macro F1 : {:.7} Loss : {:.7}\n". 318 | format(epoch, str(eval_acc), str(eval_f1), str(eval_loss))) 319 | args.writer.add_scalar('accuracy/test', eval_acc, epoch) 320 | args.writer.add_scalar('f1/test', eval_f1, epoch) 321 | args.writer.add_scalar('loss/test', eval_loss, epoch) 322 | args.writer.add_scalar('FAR/test', false_alarm_rate, epoch) 323 | args.writer.add_scalar('MAR/test', miss_alarm_rate, epoch) 324 | if args.num_labels == 2: 325 | args.writer.add_scalar('AUC/test', auc, epoch) 326 | args.writer.add_scalar('AUPRC/test', auprc, epoch) 327 | # args.writer.add_scalar('EER/valid', eer, epoch) 328 | args.writer.add_scalar('DET/test', detection_rate, epoch) 329 | args.writer.add_scalar('FPR80/test', fpr80, epoch) 330 | args.writer.add_scalar('FPR95/test', fpr95, epoch) 331 | 332 | 333 | # confusion matrix 334 | y_pred = np.array(y_pred) 335 | y_true = np.array(y_true) 336 | cm = confusion_matrix(y_true, y_pred, labels=list(range(args.num_labels))) 337 | cm = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis] 338 | print(cm) 339 | return eval_loss, eval_acc, eval_f1 340 | 341 | def train(model, train_loader, optimizer, epoch, args): 342 | args.w_all = [] 343 | y_pred = [] 344 | y_true = [] 345 | epoch_loss = 0 346 | iter_conter = 0 347 | x_counter = 0 348 | for i, (x, y) in enumerate(tqdm.tqdm(train_loader)): 349 | iter_conter += 1 350 | model.train() 351 | 352 | x = x.cuda(non_blocking=True).float() 353 | y = y.cuda(non_blocking=True).long() 354 | 355 | # [B, W, Rx*F] 356 | # x = x.permute(0, 3, 1, 2).contiguous().view(x.shape[0], -1, x.shape[1] * x.shape[2]) 357 | 358 | out = model(x) 359 | loss = args.loss_func(out, y) 360 | if np.isnan(loss.item()): 361 | print("Loss is NaN") 362 | # exit() 363 | 364 | optimizer.zero_grad() 365 | loss.backward() 366 | torch.nn.utils.clip_grad_norm_(model.parameters(), 2.0) 367 | optimizer.step() 368 | 369 | epoch_loss += loss.cpu().item() 370 | 371 | pred = torch.argmax(out, dim = -1) 372 | y_pred += pred.cpu().tolist() 373 | 
y_true += y.cpu().tolist() 374 | 375 | x_counter += len(x) 376 | if (i != 0 and (i+1) % (len(train_loader.dataset)//4*args.batch_size) == 0) or x_counter == (len(train_loader.dataset)-1): 377 | print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.7f}'.format( 378 | epoch, x_counter, len(train_loader.dataset), 379 | 100. * i / len(train_loader), loss.item())) 380 | 381 | epoch_loss /= iter_conter 382 | train_acc = accuracy_score(y_true, y_pred) 383 | train_f1 = f1_score(y_true, y_pred, labels=list(range(args.num_labels)),average='macro') 384 | 385 | args.writer.add_scalar('loss/train', epoch_loss, epoch) 386 | args.writer.add_scalar('accuracy/train', train_acc, epoch) 387 | args.writer.add_scalar('f1/train', train_f1, epoch) 388 | 389 | args.writer.add_scalar('lr', optimizer.param_groups[0]['lr'], epoch) 390 | 391 | args.log("End of Epoch : " + str(epoch) + " Loss(avg) : " + str(epoch_loss)) 392 | -------------------------------------------------------------------------------- /source/SparseImageWarp.py: -------------------------------------------------------------------------------- 1 | 2 | ################################################# 3 | ### THIS FILE WAS AUTOGENERATED! DO NOT EDIT! ### 4 | ################################################# 5 | # file to edit: dev_nb/SparseImageWarp.ipynb 6 | 7 | import torch 8 | 9 | def sparse_image_warp(img_tensor, 10 | source_control_point_locations, 11 | dest_control_point_locations, 12 | interpolation_order=2, 13 | regularization_weight=0.0, 14 | num_boundaries_points=0): 15 | device = img_tensor.device 16 | control_point_flows = (dest_control_point_locations - source_control_point_locations) 17 | 18 | # clamp_boundaries = num_boundary_points > 0 19 | # boundary_points_per_edge = num_boundary_points - 1 20 | batch_size, image_height, image_width = img_tensor.shape 21 | flattened_grid_locations = get_flat_grid_locations(image_height, image_width, device) 22 | 23 | # IGNORED FOR OUR BASIC VERSION... 
24 | # flattened_grid_locations = constant_op.constant( 25 | # _expand_to_minibatch(flattened_grid_locations, batch_size), image.dtype) 26 | 27 | # if clamp_boundaries: 28 | # (dest_control_point_locations, 29 | # control_point_flows) = _add_zero_flow_controls_at_boundary( 30 | # dest_control_point_locations, control_point_flows, image_height, 31 | # image_width, boundary_points_per_edge) 32 | 33 | flattened_flows = interpolate_spline( 34 | dest_control_point_locations, 35 | control_point_flows, 36 | flattened_grid_locations, 37 | interpolation_order, 38 | regularization_weight) 39 | 40 | dense_flows = create_dense_flows(flattened_flows, batch_size, image_height, image_width) 41 | 42 | warped_image = dense_image_warp(img_tensor, dense_flows) 43 | 44 | return warped_image, dense_flows 45 | 46 | def get_grid_locations(image_height, image_width, device): 47 | y_range = torch.linspace(0, image_height - 1, image_height, device=device) 48 | x_range = torch.linspace(0, image_width - 1, image_width, device=device) 49 | y_grid, x_grid = torch.meshgrid(y_range, x_range) 50 | return torch.stack((y_grid, x_grid), -1) 51 | 52 | def flatten_grid_locations(grid_locations, image_height, image_width): 53 | return torch.reshape(grid_locations, [image_height * image_width, 2]) 54 | 55 | def get_flat_grid_locations(image_height, image_width, device): 56 | y_range = torch.linspace(0, image_height - 1, image_height, device=device) 57 | x_range = torch.linspace(0, image_width - 1, image_width, device=device) 58 | y_grid, x_grid = torch.meshgrid(y_range, x_range) 59 | return torch.stack((y_grid, x_grid), -1).reshape([image_height * image_width, 2]) 60 | 61 | def create_dense_flows(flattened_flows, batch_size, image_height, image_width): 62 | # possibly .view 63 | return torch.reshape(flattened_flows, [batch_size, image_height, image_width, 2]) 64 | 65 | def interpolate_spline(train_points, train_values, query_points, order, regularization_weight=0.0,): 66 | # First, fit the spline to the observed data. 67 | w, v = solve_interpolation(train_points, train_values, order, regularization_weight) 68 | # Then, evaluate the spline at the query locations. 69 | query_values = apply_interpolation(query_points, train_points, w, v, order) 70 | 71 | return query_values 72 | 73 | def solve_interpolation(train_points, train_values, order, regularization_weight, eps=1e-7): 74 | device = train_points.device 75 | b, n, d = train_points.shape 76 | k = train_values.shape[-1] 77 | 78 | # First, rename variables so that the notation (c, f, w, v, A, B, etc.) 79 | # follows https://en.wikipedia.org/wiki/Polyharmonic_spline. 80 | # To account for python style guidelines we use 81 | # matrix_a for A and matrix_b for B. 82 | 83 | c = train_points 84 | f = train_values.float() 85 | 86 | matrix_a = phi(cross_squared_distance_matrix(c,c), order).unsqueeze(0) # [b, n, n] 87 | # if regularization_weight > 0: 88 | # batch_identity_matrix = array_ops.expand_dims( 89 | # linalg_ops.eye(n, dtype=c.dtype), 0) 90 | # matrix_a += regularization_weight * batch_identity_matrix 91 | 92 | # Append ones to the feature values for the bias term in the linear model. 93 | ones = torch.ones(n, dtype=train_points.dtype, device=device).view([-1, n, 1]) 94 | matrix_b = torch.cat((c, ones), 2).float() # [b, n, d + 1] 95 | 96 | # [b, n + d + 1, n] 97 | left_block = torch.cat((matrix_a, torch.transpose(matrix_b, 2, 1)), 1) 98 | 99 | num_b_cols = matrix_b.shape[2] # d + 1 100 | 101 | # In Tensorflow, zeros are used here. 
Pytorch solve fails with zeros for some reason we don't understand. 102 | # So instead we use very tiny randn values (variance of one, zero mean) on one side of our multiplication. 103 | lhs_zeros = torch.randn((b, num_b_cols, num_b_cols), device=device) *eps 104 | right_block = torch.cat((matrix_b, lhs_zeros), 105 | 1) # [b, n + d + 1, d + 1] 106 | lhs = torch.cat((left_block, right_block), 107 | 2) # [b, n + d + 1, n + d + 1] 108 | 109 | rhs_zeros = torch.zeros((b, d + 1, k), dtype=train_points.dtype, device=device).float() 110 | rhs = torch.cat((f, rhs_zeros), 1) # [b, n + d + 1, k] 111 | 112 | # Then, solve the linear system and unpack the results. 113 | X = torch.linalg.solve(lhs, rhs) 114 | w = X[:, :n, :] 115 | v = X[:, n:, :] 116 | return w, v 117 | 118 | def cross_squared_distance_matrix(x, y): 119 | """Pairwise squared distance between two (batch) matrices' rows (2nd dim). 120 | Computes the pairwise distances between rows of x and rows of y 121 | Args: 122 | x: [batch_size, n, d] float `Tensor` 123 | y: [batch_size, m, d] float `Tensor` 124 | Returns: 125 | squared_dists: [batch_size, n, m] float `Tensor`, where 126 | squared_dists[b,i,j] = ||x[b,i,:] - y[b,j,:]||^2 127 | """ 128 | x_norm_squared = torch.sum(torch.mul(x, x)) 129 | y_norm_squared = torch.sum(torch.mul(y, y)) 130 | 131 | x_y_transpose = torch.matmul(x.squeeze(0), y.squeeze(0).transpose(0,1)) 132 | 133 | # squared_dists[b,i,j] = ||x_bi - y_bj||^2 = x_bi'x_bi- 2x_bi'x_bj + x_bj'x_bj 134 | squared_dists = x_norm_squared - 2 * x_y_transpose + y_norm_squared 135 | 136 | return squared_dists.float() 137 | 138 | def phi(r, order): 139 | """Coordinate-wise nonlinearity used to define the order of the interpolation. 140 | See https://en.wikipedia.org/wiki/Polyharmonic_spline for the definition. 141 | Args: 142 | r: input op 143 | order: interpolation order 144 | Returns: 145 | phi_k evaluated coordinate-wise on r, for k = r 146 | """ 147 | EPSILON=torch.tensor(1e-10, device=r.device) 148 | # using EPSILON prevents log(0), sqrt0), etc. 149 | # sqrt(0) is well-defined, but its gradient is not 150 | if order == 1: 151 | r = torch.max(r, EPSILON) 152 | r = torch.sqrt(r) 153 | return r 154 | elif order == 2: 155 | return 0.5 * r * torch.log(torch.max(r, EPSILON)) 156 | elif order == 4: 157 | return 0.5 * torch.square(r) * torch.log(torch.max(r, EPSILON)) 158 | elif order % 2 == 0: 159 | r = torch.max(r, EPSILON) 160 | return 0.5 * torch.pow(r, 0.5 * order) * torch.log(r) 161 | else: 162 | r = torch.max(r, EPSILON) 163 | return torch.pow(r, 0.5 * order) 164 | 165 | def apply_interpolation(query_points, train_points, w, v, order): 166 | """Apply polyharmonic interpolation model to data. 167 | Given coefficients w and v for the interpolation model, we evaluate 168 | interpolated function values at query_points. 169 | Args: 170 | query_points: `[b, m, d]` x values to evaluate the interpolation at 171 | train_points: `[b, n, d]` x values that act as the interpolation centers 172 | ( the c variables in the wikipedia article) 173 | w: `[b, n, k]` weights on each interpolation center 174 | v: `[b, d, k]` weights on each input dimension 175 | order: order of the interpolation 176 | Returns: 177 | Polyharmonic interpolation evaluated at points defined in query_points. 178 | """ 179 | query_points = query_points.unsqueeze(0) 180 | # First, compute the contribution from the rbf term. 
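# rbf_term below is matmul(phi(pairwise squared distances), w): each query point receives a weighted sum of kernel responses over the interpolation centers, using the weights w returned by solve_interpolation.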
181 | pairwise_dists = cross_squared_distance_matrix(query_points.float(), train_points.float()) 182 | phi_pairwise_dists = phi(pairwise_dists, order) 183 | 184 | rbf_term = torch.matmul(phi_pairwise_dists, w) 185 | 186 | # Then, compute the contribution from the linear term. 187 | # Pad query_points with ones, for the bias term in the linear model. 188 | ones = torch.ones_like(query_points[..., :1]) 189 | query_points_pad = torch.cat(( 190 | query_points, 191 | ones 192 | ), 2).float() 193 | linear_term = torch.matmul(query_points_pad, v) 194 | 195 | return rbf_term + linear_term 196 | 197 | 198 | def dense_image_warp(image, flow): 199 | """Image warping using per-pixel flow vectors. 200 | Apply a non-linear warp to the image, where the warp is specified by a dense 201 | flow field of offset vectors that define the correspondences of pixel values 202 | in the output image back to locations in the source image. Specifically, the 203 | pixel value at output[b, j, i, c] is 204 | images[b, j - flow[b, j, i, 0], i - flow[b, j, i, 1], c]. 205 | The locations specified by this formula do not necessarily map to an int 206 | index. Therefore, the pixel value is obtained by bilinear 207 | interpolation of the 4 nearest pixels around 208 | (b, j - flow[b, j, i, 0], i - flow[b, j, i, 1]). For locations outside 209 | of the image, we use the nearest pixel values at the image boundary. 210 | Args: 211 | image: 4-D float `Tensor` with shape `[batch, height, width, channels]`. 212 | flow: A 4-D float `Tensor` with shape `[batch, height, width, 2]`. 213 | name: A name for the operation (optional). 214 | Note that image and flow can be of type tf.half, tf.float32, or tf.float64, 215 | and do not necessarily have to be the same type. 216 | Returns: 217 | A 4-D float `Tensor` with shape`[batch, height, width, channels]` 218 | and same type as input image. 219 | Raises: 220 | ValueError: if height < 2 or width < 2 or the inputs have the wrong number 221 | of dimensions. 222 | """ 223 | image = image.unsqueeze(3) # add a single channel dimension to image tensor 224 | batch_size, height, width, channels = image.shape 225 | device = image.device 226 | 227 | # The flow is defined on the image grid. Turn the flow into a list of query 228 | # points in the grid space. 229 | grid_x, grid_y = torch.meshgrid( 230 | torch.arange(width, device=device), torch.arange(height, device=device)) 231 | 232 | stacked_grid = torch.stack((grid_y, grid_x), dim=2).float() 233 | 234 | batched_grid = stacked_grid.unsqueeze(-1).permute(3, 1, 0, 2) 235 | 236 | query_points_on_grid = batched_grid - flow 237 | query_points_flattened = torch.reshape(query_points_on_grid, 238 | [batch_size, height * width, 2]) 239 | # Compute values at the query points, then reshape the result back to the 240 | # image grid. 241 | interpolated = interpolate_bilinear(image, query_points_flattened) 242 | interpolated = torch.reshape(interpolated, 243 | [batch_size, height, width, channels]) 244 | return interpolated 245 | 246 | def interpolate_bilinear(grid, 247 | query_points, 248 | name='interpolate_bilinear', 249 | indexing='ij'): 250 | """Similar to Matlab's interp2 function. 251 | Finds values for query points on a grid using bilinear interpolation. 252 | Args: 253 | grid: a 4-D float `Tensor` of shape `[batch, height, width, channels]`. 254 | query_points: a 3-D float `Tensor` of N points with shape `[batch, N, 2]`. 255 | name: a name for the operation (optional). 
256 | indexing: whether the query points are specified as row and column (ij), 257 | or Cartesian coordinates (xy). 258 | Returns: 259 | values: a 3-D `Tensor` with shape `[batch, N, channels]` 260 | Raises: 261 | ValueError: if the indexing mode is invalid, or if the shape of the inputs 262 | invalid. 263 | """ 264 | if indexing != 'ij' and indexing != 'xy': 265 | raise ValueError('Indexing mode must be \'ij\' or \'xy\'') 266 | 267 | 268 | shape = grid.shape 269 | if len(shape) != 4: 270 | msg = 'Grid must be 4 dimensional. Received size: ' 271 | raise ValueError(msg + str(grid.shape)) 272 | 273 | batch_size, height, width, channels = grid.shape 274 | 275 | shape = [batch_size, height, width, channels] 276 | query_type = query_points.dtype 277 | grid_type = grid.dtype 278 | grid_device = grid.device 279 | 280 | num_queries = query_points.shape[1] 281 | 282 | alphas = [] 283 | floors = [] 284 | ceils = [] 285 | index_order = [0, 1] if indexing == 'ij' else [1, 0] 286 | unstacked_query_points = query_points.unbind(2) 287 | 288 | for dim in index_order: 289 | queries = unstacked_query_points[dim] 290 | 291 | size_in_indexing_dimension = shape[dim + 1] 292 | 293 | # max_floor is size_in_indexing_dimension - 2 so that max_floor + 1 294 | # is still a valid index into the grid. 295 | max_floor = torch.tensor(size_in_indexing_dimension - 2, dtype=query_type, device=grid_device) 296 | min_floor = torch.tensor(0.0, dtype=query_type, device=grid_device) 297 | maxx = torch.max(min_floor, torch.floor(queries)) 298 | floor = torch.min(maxx, max_floor) 299 | int_floor = floor.long() 300 | floors.append(int_floor) 301 | ceil = int_floor + 1 302 | ceils.append(ceil) 303 | 304 | # alpha has the same type as the grid, as we will directly use alpha 305 | # when taking linear combinations of pixel values from the image. 306 | 307 | 308 | alpha = (queries - floor).clone().detach().type(grid_type) 309 | min_alpha = torch.tensor(0.0, dtype=grid_type, device=grid_device) 310 | max_alpha = torch.tensor(1.0, dtype=grid_type, device=grid_device) 311 | alpha = torch.min(torch.max(min_alpha, alpha), max_alpha) 312 | 313 | # Expand alpha to [b, n, 1] so we can use broadcasting 314 | # (since the alpha values don't depend on the channel). 315 | alpha = torch.unsqueeze(alpha, 2) 316 | alphas.append(alpha) 317 | 318 | flattened_grid = torch.reshape( 319 | grid, [batch_size * height * width, channels]) 320 | batch_offsets = torch.reshape( 321 | torch.arange(batch_size, device=grid_device) * height * width, [batch_size, 1]) 322 | 323 | # This wraps array_ops.gather. We reshape the image data such that the 324 | # batch, y, and x coordinates are pulled into the first dimension. 325 | # Then we gather. Finally, we reshape the output back. It's possible this 326 | # code would be made simpler by using array_ops.gather_nd. 
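# Each (y, x) query below is turned into a linear index (batch_offset + y * width + x) into the grid flattened to [batch * height * width, channels].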
327 | def gather(y_coords, x_coords, name): 328 | linear_coordinates = batch_offsets + y_coords * width + x_coords 329 | gathered_values = torch.gather(flattened_grid.t(), 1, linear_coordinates) 330 | return torch.reshape(gathered_values, 331 | [batch_size, num_queries, channels]) 332 | 333 | # grab the pixel values in the 4 corners around each query point 334 | top_left = gather(floors[0], floors[1], 'top_left') 335 | top_right = gather(floors[0], ceils[1], 'top_right') 336 | bottom_left = gather(ceils[0], floors[1], 'bottom_left') 337 | bottom_right = gather(ceils[0], ceils[1], 'bottom_right') 338 | 339 | interp_top = alphas[1] * (top_right - top_left) + top_left 340 | interp_bottom = alphas[1] * (bottom_right - bottom_left) + bottom_left 341 | interp = alphas[0] * (interp_bottom - interp_top) + interp_top 342 | 343 | return interp -------------------------------------------------------------------------------- /source/batch_runner.py: -------------------------------------------------------------------------------- 1 | import os, sys 2 | import set_device 3 | try: 4 | os.environ["CUDA_VISIBLE_DEVICES"] = set_device.get_gpu_ids(int(os.environ["NGPU"])) 5 | except: 6 | try: 7 | os.environ["CUDA_VISIBLE_DEVICES"] = set_device.get_gpu_ids(0) 8 | except: 9 | os.environ["CUDA_VISIBLE_DEVICES"] = "0" 10 | from datetime import datetime 11 | import os, time 12 | import argparse 13 | from utils import compute_mean_and_conf_interval 14 | from contextlib import redirect_stdout 15 | from main import main 16 | import io 17 | import traceback 18 | 19 | class Recorder(object): 20 | def __init__(self, path, data_names=None, metric_names=None, method_names=None, model_names=None, description=None): 21 | self.start_time = datetime.now() 22 | self.path = path.replace('.txt', '_{}.txt'.format(self.start_time.strftime('%Y%m%d_%H%M%S'))) 23 | 24 | self.data_names = data_names 25 | self.metric_names = metric_names 26 | self.method_names = method_names 27 | self.model_names = model_names 28 | self.description = description 29 | 30 | self.result = [[[[-1.0 for _ in range(len(metric_names))] for _ in range(len(method_names))] for _ in range(len(data_names))] for _ in range(len(model_names))] 31 | 32 | 33 | 34 | def __enter__(self): 35 | return self 36 | 37 | def record(self, data_key, metric_key, method_key, model_key, value): 38 | # find key in data_names 39 | data_idx = self.data_names.index(data_key) 40 | # find key in metric_names 41 | metric_idx = self.metric_names.index(metric_key) 42 | # find key in method_names 43 | method_idx = self.method_names.index(method_key) 44 | # find key in third_dim_names 45 | model_idx = self.model_names.index(model_key) 46 | # record value 47 | self.result[model_idx][data_idx][method_idx][metric_idx] = value 48 | 49 | 50 | def dump(self): 51 | with open(self.path, 'w+') as log_file: 52 | log_file.write(f'start from:{datetime.now()}\n') 53 | # write description 54 | log_file.write(f'description: {self.description}\n') 55 | # basic settings 56 | log_file.write(f'\n') 57 | for model_idx, model_name in enumerate(self.model_names): 58 | log_file.write(f'{model_name}:\n\n') 59 | # header 60 | log_file.write("{:^20}|".format("Method")) 61 | for method in self.method_names: 62 | log_file.write(f"{method:^63}|") 63 | log_file.write("\n") 64 | log_file.write("{:^20}|".format("Metrics")) 65 | for _ in range(len(self.method_names)): 66 | for metric in self.metric_names: 67 | log_file.write(f"{metric:^7}|") 68 | log_file.write("\n") 69 | log_file.write("\n") 70 | # body 71 | for data_idx, 
data_name in enumerate(self.data_names): 72 | log_file.write(f'{data_name:<20}|') 73 | for method_idx, method in enumerate(self.method_names): 74 | for metric_idx, _ in enumerate(self.metric_names): 75 | metric_value = self.result[model_idx][data_idx][method_idx][metric_idx] 76 | # metric_value within [0, 1] 77 | if metric_value > 0 and metric_value <= 1: 78 | metric_value *= 100 79 | log_file.write(f"{metric_value:^7.3f}|") 80 | log_file.write('\n') 81 | log_file.write('\n') 82 | 83 | log_file.write(f'record dumped at:{datetime.now()}\n') 84 | log_file.write(f'elapsed:{datetime.now() - self.start_time}\n') 85 | 86 | log_file.flush() 87 | 88 | def __exit__(self, exc_type, exc_value, traceback): 89 | self.dump() 90 | return False 91 | 92 | 93 | if __name__ == "__main__": 94 | 95 | # exec bash "python main.py --dataset=widar3 --version=4 --es_patience=50 --model=ResNet" 96 | default_method_param = {"time_aug":"", "freq_aug":"", "space_aug":""} 97 | datasets = [] 98 | models = [] 99 | # models += ["MaCNN"] 100 | # models = ["Widar3"] 101 | # models = ["ResNet18"] 102 | # models = ["AlexNet"] 103 | # models += ["CNN_GRU"] 104 | # models += ["THAT"] 105 | # models += ["LaxCat"] 106 | # models += ["ResNet"] 107 | models += ["RFNet"] 108 | # models += ["SLNet"] 109 | datasets = [] 110 | datasets += ["widar3"] 111 | win_sizes = [256] 112 | versions = ["norm-filter"] 113 | 114 | seeds = [0] 115 | data_path = "small_1100_top6_1000_2560" 116 | # descripbe purpose of the experiment in record 117 | exp_description = "DFS" 118 | 119 | # dataset x version 120 | datasets_versions = ["{}-{}".format(dataset, version) for dataset in datasets for version in versions] 121 | 122 | methods = { 123 | # use all subcarriers 124 | # "all": {**default_method_param, "freq_aug":"all,90"}, 125 | # ISS-6 126 | # "subband-ms-top1,6": {**default_method_param, "freq_aug":"subband-ms-top1,6"}, 127 | # spec_augment 128 | # "all+time-wrap5-mask-time-20-mask-freq-20": {**default_method_param, "freq_aug":"all+time-wrap5-mask-time-20-mask-freq-20,180"}, 129 | # RDA 130 | "ISS6TDAx3_FDA+kmeans6top2insubband3+motion-aware-random-shift-50-and-mask-guassion-75": {**default_method_param, "freq_aug":"ISS6TDAx3_FDA+kmeans6top2insubband3+motion-aware-random-shift-50-and-mask-guassion-75,60"}, 131 | } 132 | 133 | parser = argparse.ArgumentParser() 134 | # K-fold, k_fold set to 5, use first fold only, don't change. 
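# Hypothetical example (not part of the original script): to queue another FDA policy,
# an entry such as the following could be added to the `methods` dict above:
#   "kmeans,4": {**default_method_param, "freq_aug": "kmeans,4"},
# where the name before the comma selects the policy in augment.py and the integer after it
# sets how many augmented copies are generated per sample (see RFBoostDataset.__init__ in Datasets.py).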
135 | parser.add_argument('--k_fold', type=int, default=5) 136 | parser.add_argument('--max_fold', type=int, default=5) 137 | parser.add_argument('--enable_es', action = 'store_true', default=True) 138 | parser.add_argument('--exp', type=str, default="") 139 | args = parser.parse_args() 140 | loader = "dfs" #"acf" 141 | test_ratio = 0.5 # don't change, we use same testset 142 | epochs = 50 143 | # "rx-0,2,4": use each RX (RX-0, RX-2, RX-4 in Widar3) as individual sample 144 | args.exp_train_val = "rx-0,2,4" 145 | args.exp_test = "rx-1,3,5" 146 | 147 | all_in_one = False 148 | enable_test_aug = True 149 | args.enable_es = False 150 | args.max_fold = 1 151 | enable_balance = False 152 | # test if /record exists 153 | if not os.path.exists('./record'): 154 | os.mkdir('./record') 155 | # args.max_fold = 1 156 | # args.enable_es = False 157 | # begin to run experiments 158 | for seed in seeds: 159 | for dfs_win_size in win_sizes: 160 | with Recorder(path="record/results-Wsize-{}-seed-{}-max_fold-{}-ratio-{}-{}.txt".format(dfs_win_size, seed, args.max_fold, test_ratio, args.exp), 161 | description=exp_description, 162 | data_names=datasets_versions, 163 | metric_names=["l_Acc", "l_Acc_e", "l_F1", "l_F1_e", "a_Acc", "a_Acc_e", "a_F1", "a_F1_e"], 164 | method_names=list(methods.keys()), 165 | model_names=models) as recorder: 166 | 167 | for model in models: 168 | for dataset in datasets: 169 | for version in versions: 170 | for method_name, method_param in methods.items(): 171 | dataset_version = dataset + "-" + version 172 | cmd = "~/anaconda3/envs/rfboost/bin/python main.py"\ 173 | " --dataset={dataset}"\ 174 | " --version={version}"\ 175 | " --k_fold={k_fold}"\ 176 | " --model={model}"\ 177 | " --time_aug {time_aug}"\ 178 | " --freq_aug {freq_aug}"\ 179 | " --space_aug {space_aug}"\ 180 | " --seed={seed}"\ 181 | " --dfs_win_size={dfs_win_size}"\ 182 | " --max_fold={max_fold}"\ 183 | " --exp={exp}"\ 184 | " --exp_train_val={exp_train_val}"\ 185 | " --exp_test={exp_test}"\ 186 | " --epochs={epochs}"\ 187 | " --test_ratio={test_ratio}"\ 188 | " --data_path={data_path}"\ 189 | .format(dataset=dataset, 190 | version=version, 191 | k_fold=args.k_fold, 192 | model=model, 193 | dfs_win_size=dfs_win_size, 194 | max_fold=args.max_fold, 195 | seed=seed, 196 | exp=args.exp, 197 | exp_train_val=args.exp_train_val, 198 | exp_test=args.exp_test, 199 | epochs=epochs, 200 | enable_test_aug=enable_test_aug, 201 | test_ratio=test_ratio, 202 | data_path=data_path, 203 | **method_param) 204 | 205 | cmd += " --enable_es" if args.enable_es else "" 206 | cmd += " --enable_balance" if enable_balance else "" 207 | cmd += " --enable_test_aug" if enable_test_aug else "" 208 | cmd += " --all_in_one" if all_in_one else "" 209 | print(cmd) 210 | 211 | cmd_dict = { 212 | "dataset":dataset, 213 | "version":version, 214 | "k_fold":args.k_fold, 215 | "model":model, 216 | "time_aug":'' if method_param["time_aug"] == '' else [float(m) for m in method_param["time_aug"].split(" ")], 217 | "freq_aug":'' if method_param["freq_aug"] == '' else method_param["freq_aug"].split(" "), 218 | "space_aug":method_param["space_aug"], 219 | "max_fold":args.max_fold, 220 | "dfs_win_size":dfs_win_size, 221 | "seed":seed, 222 | "epochs":epochs, 223 | "enable_es":args.enable_es, 224 | "enable_balance":enable_balance, 225 | "enable_test_aug":enable_test_aug, 226 | "all_in_one":all_in_one, 227 | "exp":args.exp, 228 | "exp_train_val":args.exp_train_val, 229 | "exp_test":args.exp_test, 230 | "test_ratio":test_ratio, 231 | "loader": loader, 232 | 
"data_path": data_path 233 | } 234 | 235 | # timer 236 | start_time = time.time() 237 | # try: 238 | l_acc_list = [] 239 | l_f1_list = [] 240 | a_acc_list = [] 241 | a_f1_list = [] 242 | 243 | output = io.StringIO() 244 | # redirect stdout to string 245 | with redirect_stdout(output): 246 | try: 247 | main(cmd_dict) 248 | except Exception as e: 249 | traceback.print_exc() 250 | print(output.getvalue()) 251 | continue 252 | # read stdout 253 | output_lines = output.getvalue().splitlines() 254 | for line in output_lines: 255 | if "Best Loss: [Test_Acc, Test_F1]" in line: 256 | l_acc = float(line.split("->")[-1].split(",")[0].strip()) 257 | l_f1 = float(line.split("->")[-1].split(",")[1].strip()) 258 | l_acc_list.append(l_acc) 259 | l_f1_list.append(l_f1) 260 | if "Best Acc: [Test_Acc, Test_F1]" in line: 261 | a_acc = float(line.split("->")[-1].split(",")[0].strip()) 262 | a_f1 = float(line.split("->")[-1].split(",")[1].strip()) 263 | a_acc_list.append(a_acc) 264 | a_f1_list.append(a_f1) 265 | if len(l_acc_list) == 0: 266 | continue 267 | 268 | print(l_acc_list, l_f1_list) 269 | print(a_acc_list, a_f1_list) 270 | l_acc, l_acc_err = compute_mean_and_conf_interval(l_acc_list) 271 | l_f1, l_f1_err = compute_mean_and_conf_interval(l_f1_list) 272 | recorder.record(dataset_version, "l_Acc", method_name, model, l_acc) 273 | recorder.record(dataset_version, "l_Acc_e", method_name, model, l_acc_err) 274 | recorder.record(dataset_version, "l_F1", method_name, model, l_f1) 275 | recorder.record(dataset_version, "l_F1_e", method_name, model, l_f1_err) 276 | 277 | a_acc, a_acc_err = compute_mean_and_conf_interval(a_acc_list) 278 | a_f1, a_f1_err = compute_mean_and_conf_interval(a_f1_list) 279 | recorder.record(dataset_version, "a_Acc", method_name, model, a_acc) 280 | recorder.record(dataset_version, "a_Acc_e", method_name, model, a_acc_err) 281 | recorder.record(dataset_version, "a_F1", method_name, model, a_f1) 282 | recorder.record(dataset_version, "a_F1_e", method_name, model, a_f1_err) 283 | 284 | recorder.dump() 285 | print("--- %s seconds ---" % (time.time() - start_time)) -------------------------------------------------------------------------------- /source/config.yaml: -------------------------------------------------------------------------------- 1 | cache_folder: "/path/to/cache" -------------------------------------------------------------------------------- /source/default.yaml: -------------------------------------------------------------------------------- 1 | # Model config 2 | layer_num: 1 3 | window_list: [7, 16, 32, 48, 64, 80, 96, 112, 128] 4 | stride_list: [3, 8, 16, 24, 32, 40, 48, 56, 64] 5 | k_list: [3, 8, 16, 24, 24, 32, 32, 40, 40] 6 | hidden_channel: 48 7 | 8 | -------------------------------------------------------------------------------- /source/environment.yml: -------------------------------------------------------------------------------- 1 | name: rfboost-pytorch2 2 | channels: 3 | - conda-forge 4 | - defaults 5 | dependencies: 6 | - _libgcc_mutex=0.1=conda_forge 7 | - _openmp_mutex=4.5=2_gnu 8 | - asttokens=2.2.1=pyhd8ed1ab_0 9 | - backcall=0.2.0=pyh9f0ad1d_0 10 | - backports=1.0=pyhd8ed1ab_3 11 | - backports.functools_lru_cache=1.6.5=pyhd8ed1ab_0 12 | - bzip2=1.0.8=h7f98852_4 13 | - ca-certificates=2023.7.22=hbcca054_0 14 | - comm=0.1.4=pyhd8ed1ab_0 15 | - debugpy=1.6.8=py310hc6cd4ac_0 16 | - decorator=5.1.1=pyhd8ed1ab_0 17 | - executing=1.2.0=pyhd8ed1ab_0 18 | - importlib-metadata=6.8.0=pyha770c72_0 19 | - importlib_metadata=6.8.0=hd8ed1ab_0 20 | - 
ipykernel=6.25.1=pyh71e2992_0 21 | - ipython=8.14.0=pyh41d4057_0 22 | - jedi=0.19.0=pyhd8ed1ab_0 23 | - jupyter_client=8.3.0=pyhd8ed1ab_0 24 | - jupyter_core=5.3.1=py310hff52083_0 25 | - ld_impl_linux-64=2.40=h41732ed_0 26 | - libffi=3.4.2=h7f98852_5 27 | - libgcc-ng=13.1.0=he5830b7_0 28 | - libgomp=13.1.0=he5830b7_0 29 | - libnsl=2.0.0=h7f98852_0 30 | - libsodium=1.0.18=h36c2ea0_1 31 | - libsqlite=3.42.0=h2797004_0 32 | - libstdcxx-ng=13.1.0=hfd8a6a1_0 33 | - libuuid=2.38.1=h0b41bf4_0 34 | - libzlib=1.2.13=hd590300_5 35 | - matplotlib-inline=0.1.6=pyhd8ed1ab_0 36 | - ncurses=6.4=hcb278e6_0 37 | - nest-asyncio=1.5.6=pyhd8ed1ab_0 38 | - openssl=3.1.2=hd590300_0 39 | - packaging=23.1=pyhd8ed1ab_0 40 | - parso=0.8.3=pyhd8ed1ab_0 41 | - pexpect=4.8.0=pyh1a96a4e_2 42 | - pickleshare=0.7.5=py_1003 43 | - pip=23.2.1=pyhd8ed1ab_0 44 | - platformdirs=3.10.0=pyhd8ed1ab_0 45 | - prompt-toolkit=3.0.39=pyha770c72_0 46 | - prompt_toolkit=3.0.39=hd8ed1ab_0 47 | - psutil=5.9.5=py310h1fa729e_0 48 | - ptyprocess=0.7.0=pyhd3deb0d_0 49 | - pure_eval=0.2.2=pyhd8ed1ab_0 50 | - pygments=2.16.1=pyhd8ed1ab_0 51 | - python=3.10.5=ha86cf86_0_cpython 52 | - python-dateutil=2.8.2=pyhd8ed1ab_0 53 | - python_abi=3.10=3_cp310 54 | - pyzmq=25.1.1=py310h5bbb5d0_0 55 | - readline=8.2=h8228510_1 56 | - setuptools=68.0.0=pyhd8ed1ab_0 57 | - six=1.16.0=pyh6c4a22f_0 58 | - sqlite=3.42.0=h2c6b66d_0 59 | - stack_data=0.6.2=pyhd8ed1ab_0 60 | - tk=8.6.12=h27826a3_0 61 | - tornado=6.3.3=py310h2372a71_0 62 | - traitlets=5.9.0=pyhd8ed1ab_0 63 | - typing-extensions=4.7.1=hd8ed1ab_0 64 | - typing_extensions=4.7.1=pyha770c72_0 65 | - wcwidth=0.2.6=pyhd8ed1ab_0 66 | - wheel=0.41.1=pyhd8ed1ab_0 67 | - xz=5.2.6=h166bdaf_0 68 | - zeromq=4.3.4=h9c3ff4c_1 69 | - zipp=3.16.2=pyhd8ed1ab_0 70 | - pip=23.3.1 71 | - pip: 72 | - absl-py==1.4.0 73 | - async-timeout==4.0.3 74 | - cachetools==5.3.1 75 | - certifi==2023.7.22 76 | - charset-normalizer==3.2.0 77 | - cmake==3.27.2 78 | - contourpy==1.1.0 79 | - cycler==0.11.0 80 | - filelock==3.12.2 81 | - fonttools==4.42.0 82 | - google-auth==2.22.0 83 | - google-auth-oauthlib==1.0.0 84 | - grpcio==1.57.0 85 | - h5py==3.9.0 86 | - idna==3.4 87 | - jinja2==3.1.2 88 | - joblib==1.3.2 89 | - kiwisolver==1.4.4 90 | - lightning-utilities==0.9.0 91 | - lit==16.0.6 92 | - markdown==3.4.4 93 | - markupsafe==2.1.3 94 | - mat73==0.60 95 | - matplotlib==3.7.2 96 | - mpmath==1.3.0 97 | - networkx==3.1 98 | - numpy==1.25.2 99 | - nvidia-cublas-cu11==11.10.3.66 100 | - nvidia-cuda-cupti-cu11==11.7.101 101 | - nvidia-cuda-nvrtc-cu11==11.7.99 102 | - nvidia-cuda-runtime-cu11==11.7.99 103 | - nvidia-cudnn-cu11==8.5.0.96 104 | - nvidia-cufft-cu11==10.9.0.58 105 | - nvidia-curand-cu11==10.2.10.91 106 | - nvidia-cusolver-cu11==11.4.0.1 107 | - nvidia-cusparse-cu11==11.7.4.91 108 | - nvidia-nccl-cu11==2.14.3 109 | - nvidia-nvtx-cu11==11.7.91 110 | - oauthlib==3.2.2 111 | - pandas==2.0.3 112 | - pillow==10.0.0 113 | - protobuf==3.19.6 114 | - pyasn1==0.5.0 115 | - pyasn1-modules==0.3.0 116 | - pyparsing==3.0.9 117 | - pytz==2023.3 118 | - pyyaml==6.0.1 119 | - requests==2.31.0 120 | - requests-oauthlib==1.3.1 121 | - rsa==4.9 122 | - scikit-learn==1.3.0 123 | - scipy==1.11.2 124 | - seaborn==0.13.0 125 | - sympy==1.12 126 | - tensorboard==2.14.0 127 | - tensorboard-data-server==0.7.1 128 | - tensorboard-plugin-wit==1.8.1 129 | - threadpoolctl==3.2.0 130 | - torch==2.0.1 131 | - torch-tb-profiler==0.4.1 132 | - torchaudio==2.0.2 133 | - torchmetrics==1.0.3 134 | - torchsummary==1.5.1 135 | - torchvision==0.15.2 136 | - 
tqdm==4.66.1 137 | - triton==2.0.0 138 | - tzdata==2023.3 139 | - urllib3==1.26.16 140 | - werkzeug==2.3.7 141 | prefix: ~/anaconda3/envs/rfboost-pytorch2 142 | -------------------------------------------------------------------------------- /source/model/Alexnet.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | 4 | class AlexNet(nn.Module): 5 | def __init__(self, input_channel, num_classes: int = 1000, dropout: float = 0.5) -> None: 6 | super().__init__() 7 | self.features = nn.Sequential( 8 | nn.Conv2d(input_channel, 64, kernel_size=11, stride=4, padding=2), 9 | nn.ReLU(inplace=True), 10 | nn.MaxPool2d(kernel_size=3, stride=2), 11 | nn.Conv2d(64, 192, kernel_size=5, padding=2), 12 | nn.ReLU(inplace=True), 13 | nn.MaxPool2d(kernel_size=3, stride=2), 14 | nn.Conv2d(192, 384, kernel_size=3, padding=1), 15 | nn.ReLU(inplace=True), 16 | nn.Conv2d(384, 256, kernel_size=3, padding=1), 17 | nn.ReLU(inplace=True), 18 | nn.Conv2d(256, 256, kernel_size=3, padding=1), 19 | nn.ReLU(inplace=True), 20 | nn.MaxPool2d(kernel_size=3, stride=2), 21 | ) 22 | self.avgpool = nn.AdaptiveAvgPool2d((6, 6)) 23 | self.classifier = nn.Sequential( 24 | nn.Dropout(p=dropout), 25 | nn.Linear(256 * 6 * 6, 4096), 26 | nn.ReLU(inplace=True), 27 | nn.Dropout(p=dropout), 28 | nn.Linear(4096, 4096), 29 | nn.ReLU(inplace=True), 30 | nn.Linear(4096, num_classes), 31 | ) 32 | 33 | def forward(self, x: torch.Tensor) -> torch.Tensor: 34 | x = self.features(x) 35 | x = self.avgpool(x) 36 | x = torch.flatten(x, 1) 37 | x = self.classifier(x) 38 | return x 39 | 40 | def main(): 41 | input = torch.zeros((128, 1, 256, 121)).cuda() 42 | model = AlexNet(input_channel = 1, num_classes = 6).cuda() 43 | o = model(input) 44 | # print parameters 45 | for name, param in model.named_parameters(): 46 | print(name, param.size()) 47 | 48 | print(o.size()) 49 | 50 | if __name__ == '__main__': 51 | main() -------------------------------------------------------------------------------- /source/model/CNN_GRU.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import numpy as np 3 | import torch.nn as nn 4 | import torch.nn.functional as F 5 | from torchsummary import summary 6 | from torchvision.models import AlexNet 7 | class CNN_GRU(nn.Module): 8 | # 1d-CNN + GRU 9 | def __init__(self, input_channel, input_size, num_label, n_gru_hidden_units=128, f_dropout_ratio=0.5, batch_first=True): 10 | super(CNN_GRU, self).__init__() 11 | # [@, T, C, F] 12 | 13 | # [@, C, F] 14 | 15 | # self.cnn = ResNet(121, input_channel, num_label) 16 | # self.cnn.fc = nn.Linear(128, 128) 17 | 18 | # self.cnn = nn.Sequential( 19 | 20 | # nn.Conv1d(input_channel, 32, kernel_size=3, stride=1, padding=1), 21 | 22 | # nn.ReLU(inplace=True), 23 | 24 | # nn.AdaptiveMaxPool1d(64), 25 | 26 | # nn.Flatten(), 27 | 28 | # nn.Linear(32 * 64, 128), 29 | 30 | # nn.ReLU(inplace=True), 31 | 32 | # nn.Dropout(f_dropout_ratio), 33 | 34 | # nn.Linear(128, 128), 35 | 36 | # nn.ReLU(inplace=True), 37 | 38 | # ) 39 | 40 | 41 | # [@, T, 64]​ 42 | 43 | # self.dropout2 = nn.Dropout(f_dropout_ratio) 44 | 45 | # [Pytorch] DO NOT USE SOFTMAX HERE!!!​ 46 | 47 | # self.dense3 = nn.Linear(n_gru_hidden_units, num_label) 48 | 49 | self.cnn = nn.Sequential( 50 | nn.Conv1d(input_channel, 16, kernel_size=5, stride=1, padding="same"), 51 | # do layer norm 52 | nn.LayerNorm([16, input_size]), 53 | nn.ReLU(inplace=True), 54 | nn.MaxPool1d(kernel_size=2), 55 | 56 | nn.Conv1d(16, 
32, kernel_size=3, stride=1, padding="same"), 57 | nn.LayerNorm([32, input_size // 2]), 58 | nn.ReLU(inplace=True), 59 | nn.MaxPool1d(kernel_size=2), 60 | 61 | nn.Conv1d(32, 64, kernel_size=3, stride=1, padding="same"), 62 | nn.LayerNorm([64, input_size // 4]), 63 | nn.ReLU(inplace=True), 64 | nn.MaxPool1d(kernel_size=2), 65 | 66 | # nn.Conv1d(64, 128, kernel_size=3, stride=1, padding=1), 67 | 68 | # nn.ReLU(inplace=True), 69 | # nn.MaxPool1d(kernel_size=2), 70 | 71 | # nn.Conv1d(128, 256, kernel_size=3, stride=1, padding=1), 72 | 73 | # nn.ReLU(inplace=True), 74 | nn.AdaptiveAvgPool1d(4), 75 | nn.Flatten(), 76 | 77 | nn.Linear(64 * 4, 128), 78 | nn.ReLU(inplace=True), 79 | nn.Dropout(f_dropout_ratio), 80 | nn.Linear(128, 128), 81 | nn.ReLU(inplace=True), 82 | 83 | ) 84 | # [N, T, C, F] 85 | 86 | self.cnn_ln = nn.LayerNorm(128) 87 | 88 | # [@, T, 64] 89 | self.rnn = nn.GRU(input_size=128, hidden_size=n_gru_hidden_units, batch_first=batch_first) 90 | # self.rnn = nn.LSTM(input_size=256, hidden_size=n_gru_hidden_units//2, num_layers=3, batch_first=batch_first, bidirectional=True, dropout=f_dropout_ratio) 91 | 92 | self.classifier = nn.Sequential( 93 | nn.Dropout(p=f_dropout_ratio), 94 | nn.Linear(n_gru_hidden_units, 64), 95 | nn.ReLU(inplace=True), 96 | nn.Dropout(p=f_dropout_ratio), 97 | nn.Linear(64, num_label), 98 | ) 99 | 100 | # for m in self.modules(): 101 | # if type(m) == nn.GRU: 102 | # # GRU: weight: orthogonal, recurrent_kernel: glorot_uniform, bias: zero_initializer 103 | # nn.init.orthogonal_(m.weight_ih_l0) 104 | # nn.init.orthogonal_(m.weight_hh_l0) 105 | # nn.init.zeros_(m.bias_ih_l0) 106 | # nn.init.zeros_(m.bias_hh_l0) 107 | # elif type(m) == nn.LSTM: 108 | # # LSTM: weight: orthogonal, recurrent_kernel: glorot_uniform, bias: zero_initializer 109 | # nn.init.orthogonal_(m.weight_ih_l0) 110 | # nn.init.orthogonal_(m.weight_hh_l0) 111 | # nn.init.zeros_(m.bias_ih_l0) 112 | # nn.init.zeros_(m.bias_hh_l0) 113 | 114 | # # init forget gate bias to 1 115 | # nn.init.constant_(m.bias_ih_l0[n_gru_hidden_units:2*n_gru_hidden_units], 1) 116 | # nn.init.constant_(m.bias_hh_l0[n_gru_hidden_units:2*n_gru_hidden_units], 1) 117 | 118 | # elif type(m) == nn.Linear: 119 | # # Linear: weight: orthogonal, bias: zero_initializer 120 | # nn.init.orthogonal_(m.weight) 121 | # nn.init.zeros_(m.bias) 122 | # elif type(m) == nn.Conv1d: 123 | # # Conv2d: weight: orthogonal, bias: zero_initializer 124 | # nn.init.orthogonal_(m.weight) 125 | # nn.init.zeros_(m.bias) 126 | 127 | def forward(self, input): 128 | # [@, T, C] 129 | cnn_out_list = [self.cnn(input[:, t, :, :]) for t in range(input.size(1))] 130 | cnn_out = torch.stack(cnn_out_list, dim=1) 131 | # layer normalization 132 | # cnn_out = self.cnn_ln(cnn_out) 133 | # [@, T, 128] 134 | out, _ = self.rnn(cnn_out) 135 | x = out[:, -1, :] 136 | x = self.classifier(x) 137 | 138 | # x = self.dropout2(x) 139 | # x = self.dense3(x) 140 | 141 | return x 142 | 143 | def main(): 144 | # [@, T, 1, F] 145 | input = torch.zeros((16, 256, 90, 128)).cuda() 146 | model = CNN_GRU(input_channel = 90, input_size = 121, num_label = 6, n_gru_hidden_units=128, f_dropout_ratio=0.5).cuda() 147 | o = model(input) 148 | summary(model, input_size=input.size()[1:]) 149 | print(o.size()) 150 | 151 | if __name__ == '__main__': 152 | main() -------------------------------------------------------------------------------- /source/model/STF.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import torch 3 | import 
torchcomplex.nn as nn 4 | from torch.autograd import Variable 5 | import torchcomplex.nn.functional as F 6 | import torch.nn.functional as rF 7 | import random 8 | import copy 9 | import torchvision 10 | 11 | 12 | ACT_DOMAIN = 'freq' 13 | FILTER_FLAG = True 14 | FREQ_CONV_FLAG = True 15 | 16 | #GLOBAL_KERNEL_SIZE = 32 17 | 18 | def complex_glorot_uniform(c_in, c_out_total, fft_list, fft_n, use_bias=True, name='complex_mat'): 19 | c_out = int(c_out_total/len(fft_list)) 20 | kernel = torch.empty((1, 1, int(c_in * c_out), int(fft_n))).cuda() 21 | kernel = torch.nn.init.xavier_uniform_(kernel) 22 | kernel_complex_org = torch.fft.fft(torch.complex(kernel, 0.*kernel)) 23 | kernel_complex_org = kernel_complex_org.transpose(1, 2) 24 | kernel_complex_org = kernel_complex_org[:,:, :, :int(fft_n/2)+1] 25 | kernel_complex_dict = {} 26 | for fft_elem in fft_list: 27 | if fft_elem != fft_n: 28 | transforms = torchvision.transforms.Resize((1, int(fft_elem/2)+1) ) 29 | kernel_complex_r = transforms(kernel_complex_org.real).transpose(1, 3) 30 | kernel_complex_i = transforms(kernel_complex_org.imag).transpose(1, 3) 31 | kernel_complex_dict[fft_elem] = torch.complex(kernel_complex_r, kernel_complex_i).reshape(( 32 | 1, 1, int(fft_elem/2)+1, int(c_in), int(c_out))).detach() 33 | 34 | elif fft_elem == fft_n: 35 | kernel_complex_dict[fft_elem] = kernel_complex_org.reshape(( 36 | 1, 1, int(fft_elem/2)+1, int(c_in), int(c_out))).detach() 37 | kernel_complex_dict[fft_elem].requires_grad = True 38 | #print(kernel_complex_dict[fft_elem].size()) 39 | 40 | bias_complex_r = torch.zeros((c_out*len(fft_list)), requires_grad = True).detach() 41 | bias_complex_i = torch.zeros((c_out*len(fft_list)), requires_grad = True).detach() 42 | bias_complex = torch.complex(bias_complex_r, bias_complex_i).cuda() 43 | return kernel_complex_dict, bias_complex 44 | 45 | 46 | def zero_interp(in_patch, ratio, out_fft_n): 47 | # patch_atten with shape (BATCH_SIZE, seg_num, ffn/2+1, c_in) 48 | in_patch = in_patch.unsqueeze(3) 49 | in_patch_zero = torch.tile(torch.zeros_like(in_patch), 50 | [1, 1, 1, ratio-1, 1]) 51 | in_patch = torch.reshape(torch.cat((in_patch, in_patch_zero), 3), 52 | (in_patch.size(0), in_patch.size(1), -1, in_patch.size(-1)) ) 53 | return in_patch[:,:,:out_fft_n,:] 54 | 55 | 56 | def atten_merge(patch, kernel, bias): 57 | ## patch with shape (BATCH_SIZE, seg_num, ffn/2+1, c_in, ratio) 58 | ## kernel with shape (1, 1, 1, 1, ratio, ratio) 59 | ## bias with shpe (ratio) 60 | patch_atten = torch.sum(patch.unsqueeze(-1) * kernel, dim = 4) 61 | # patch_atten with shape (BATCH_SIZE, seg_num, ffn/2+1, c_in, ratio) 62 | patch_atten = torch.abs(patch_atten + bias) 63 | patch_atten = F.softmax(patch_atten, dim = -1) 64 | patch_atten = torch.complex(patch_atten, 0*patch_atten) 65 | return torch.sum(patch*patch_atten, dim = 4) 66 | 67 | 68 | def complex_merge(merge_ratio): 69 | kernel_r = torch.zeros((1, 1, 1, 1, merge_ratio, merge_ratio ), requires_grad=True).cuda() 70 | kernel_i = torch.zeros((1, 1, 1, 1, merge_ratio, merge_ratio ), requires_grad=True).cuda() 71 | kernel_complex = torch.complex(kernel_r, kernel_i).detach() 72 | 73 | bias_r = torch.zeros((merge_ratio), requires_grad = True).cuda() 74 | bias_i = torch.zeros((merge_ratio), requires_grad = True).cuda() 75 | bias_complex = torch.complex(bias_r, bias_i).detach() 76 | 77 | return kernel_complex, bias_complex 78 | 79 | 80 | 81 | 82 | 83 | class STFlayer(nn.Module): 84 | def __init__(self, fft_list, f_step_list, kernel_len_list, dilation_len_list, c_in, c_out, 85 | 
out_fft_list=[0], ser_size=0, pooling=False, BATCH_SIZE = 64): 86 | super(STFlayer, self).__init__() 87 | if pooling: 88 | assert len(fft_list) == len(out_fft_list) 89 | self.fft_n_list = out_fft_list 90 | else: 91 | self.fft_n_list = fft_list 92 | self.fft_list = fft_list 93 | self.pooling = pooling 94 | self.kernel_len_list = kernel_len_list 95 | self.dilation_len_list = dilation_len_list 96 | self.c_in = c_in 97 | self.c_out = c_out 98 | self.BATCH_SIZE = BATCH_SIZE 99 | 100 | KERNEL_FFT = fft_list[1] 101 | BASIC_LEN = kernel_len_list[0] 102 | self.FFT_L_SIZE = len(self.fft_n_list) 103 | 104 | self.patch_kernel_dict, self.patch_bias = complex_glorot_uniform(c_in, c_out, self.fft_n_list, 105 | KERNEL_FFT) 106 | 107 | self.layerNorms = nn.ModuleList([nn.BatchNorm2d(BATCH_SIZE) for i in range(len(self.fft_n_list))]) 108 | self.convLayers = nn.ModuleList([nn.Conv2d(c_in, int(c_out/len(self.fft_n_list)), (kernel_len_list[i], 1), 109 | stride = 1, padding = 0, dilation = dilation_len_list[i]) for i in range(len(self.kernel_len_list)) ] ) 110 | 111 | self.merge_kernel = [] 112 | self.merge_bias = [] 113 | 114 | for fft_idx, fft_n in enumerate(self.fft_n_list): 115 | self.merge_kernel.append([]) 116 | self.merge_bias.append([]) 117 | for fft_idx2, tar_fft_n in enumerate(self.fft_n_list): 118 | if tar_fft_n >= fft_n: 119 | time_ratio = int(tar_fft_n / fft_n) 120 | kernel, bias = complex_merge(time_ratio) 121 | self.merge_kernel[fft_idx].append(kernel) 122 | self.merge_bias[fft_idx].append(bias) 123 | else: 124 | self.merge_kernel[fft_idx].append(0) 125 | self.merge_bias[fft_idx].append(0) 126 | 127 | 128 | def forward(self, inputs): 129 | # inputs with shape (batch, c_in, time_len) 130 | if (inputs.size(0)!=self.BATCH_SIZE): 131 | print("err") 132 | pass 133 | patch_fft_list = [] 134 | patch_mask_list = [] 135 | for idx in range(len(self.fft_n_list)): 136 | patch_fft_list.append(0.) 
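# NOTE: per target FFT resolution, patch_fft_list accumulates the STFT patches
# contributed by each resolution branch below, while patch_mask_list keeps the
# masks of bins already filled by coarser, zero-interpolated branches so that
# later contributions are not double-counted.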
137 | patch_mask_list.append([]) 138 | 139 | inputs = inputs.reshape((inputs.size(0) * inputs.size(1), -1)) 140 | #print(inputs.size()) 141 | for fft_idx, fft_n in enumerate(self.fft_n_list): 142 | # patch_fft with shape (batch, c_in, seg_num, fft_n//2+1) 143 | if self.pooling: 144 | in_f_step = self.fft_list[fft_idx] 145 | else: 146 | in_f_step = fft_n 147 | #f_step = in_f_step 148 | patch_fft = torch.stft(inputs, n_fft = in_f_step, hop_length = in_f_step, return_complex = True, onesided = False).transpose(1, 2) 149 | patch_fft = patch_fft.reshape(self.BATCH_SIZE, -1, patch_fft.size(-2), patch_fft.size(-1)) 150 | patch_fft = patch_fft[:,:,:,:int(fft_n/2)+1] 151 | 152 | patch_fft = patch_fft[:,:,:-1,:] 153 | patch_fft = patch_fft.permute(0, 2, 3, 1) 154 | 155 | ## patch_fft with shape (batch, seg_num, fft_n//2+1, c_in) 156 | for fft_idx2, tar_fft_n in enumerate(self.fft_n_list): 157 | if tar_fft_n < fft_n: 158 | continue 159 | elif tar_fft_n == fft_n: 160 | patch_mask = torch.ones_like(patch_fft) 161 | for exist_mask in patch_mask_list[fft_idx2]: 162 | patch_mask = patch_mask - exist_mask 163 | patch_fft_list[fft_idx2] = patch_fft_list[fft_idx2] + patch_mask * patch_fft 164 | else: 165 | time_ratio = int(tar_fft_n / fft_n) 166 | patch_fft_mod = torch.reshape(patch_fft, 167 | (patch_fft.size(0), -1, time_ratio, patch_fft.size(-2), patch_fft.size(-1) ) ) 168 | 169 | patch_fft_mod = patch_fft_mod.permute(0, 1, 3, 4, 2) 170 | patch_fft_mod = atten_merge(patch_fft_mod, self.merge_kernel[fft_idx][fft_idx2], 171 | self.merge_bias[fft_idx][fft_idx2]) * float(time_ratio) 172 | 173 | patch_mask = torch.ones_like(patch_fft_mod) 174 | patch_mask = zero_interp(patch_mask, time_ratio, int(tar_fft_n/2)+1) 175 | 176 | for exist_mask in patch_mask_list[fft_idx2]: 177 | patch_mask = patch_mask - exist_mask 178 | 179 | patch_mask_list[fft_idx2].append(patch_mask) 180 | patch_fft_mod = zero_interp(patch_fft_mod, time_ratio, int(tar_fft_n/2)+1) 181 | patch_fft_list[fft_idx2] = patch_fft_list[fft_idx2] + patch_mask * patch_fft_mod 182 | 183 | patch_time_list = [] 184 | for fft_idx, fft_n in enumerate(self.fft_n_list): 185 | # f_step = f_step_list[fft_idx] 186 | k_len = self.kernel_len_list[fft_idx] 187 | d_len = self.dilation_len_list[fft_idx] 188 | paddings = [int((k_len*d_len-d_len)/2), int((k_len*d_len-d_len)/2)] 189 | 190 | patch_fft = patch_fft_list[fft_idx] 191 | # patch_fft with shape (batch, seg_num, fft_n//2+1, c_in) 192 | patch_fft = patch_fft.permute(3, 0, 1, 2)#.reshape(patch_fft.size(0), -1, patch_fft.size(-1)) 193 | # patch_fft with shape (c_in, batch, seg_num, fft_n//2+1) 194 | patch_fft = self.layerNorms[fft_idx](patch_fft) 195 | patch_fft = patch_fft.permute(1, 2, 3, 0) 196 | ## patch_fft with shape (batch, seg_num, fft_n//2+1, c_in) 197 | patch_fft_r, patch_fft_i = patch_fft.real, patch_fft.imag 198 | 199 | if FREQ_CONV_FLAG: 200 | ## spectral padding 201 | real_pad_l = torch.flip(patch_fft_r[:,:,1:1+paddings[0],:], [2]) 202 | real_pad_r = torch.flip(patch_fft_r[:,:,-1-paddings[1]:-1,:], [2]) 203 | patch_fft_r = torch.cat((real_pad_l, patch_fft_r, real_pad_r), 2) 204 | 205 | imag_pad_l = torch.flip(patch_fft_i[:,:,1:1+paddings[0],:], [2]) 206 | imag_pad_r = torch.flip(patch_fft_i[:,:,-1-paddings[1]:-1,:], [2]) 207 | patch_fft_i = torch.cat((-imag_pad_l, patch_fft_i, -imag_pad_r), 2) 208 | 209 | patch_fft = torch.complex(patch_fft_r, patch_fft_i) 210 | #print(patch_fft.size()) 211 | patch_fft = patch_fft.transpose(1, 3) 212 | # patch_fft with shape (batch, , fft_n//2+1, c_in) 213 | patch_fft = 
self.convLayers[fft_idx](patch_fft) 214 | patch_fft = patch_fft.transpose(1, 3) 215 | 216 | if FILTER_FLAG: 217 | patch_kernel = self.patch_kernel_dict[fft_n] 218 | patch_fft = torch.complex(patch_fft_r, patch_fft_i) 219 | patch_fft = torch.tile(patch_fft.unsqueeze(4), (1, 1, 1, 1, int(self.c_out/self.FFT_L_SIZE)) ) 220 | 221 | 222 | patch_fft_out = patch_fft * patch_kernel 223 | patch_fft = torch.sum(patch_fft_out, 3) 224 | 225 | patch_out = patch_fft 226 | if ACT_DOMAIN == 'freq': 227 | patch_out = F.crelu(patch_out) 228 | 229 | # patch_fft with shape (batch, seg_num, fft_n//2+1, c_out) 230 | patch_fft_fin = patch_out.permute((0, 3, 2, 1)).reshape(patch_out.size(0) * patch_out.size(3), -1, patch_out.size(1)) 231 | patch_fft_fin = torch.cat((patch_fft_fin, patch_fft_fin[:, :, 0].unsqueeze(-1) ), -1) 232 | patch_time = torch.istft(patch_fft_fin, n_fft = fft_n, hop_length = fft_n) 233 | 234 | patch_time = patch_time.reshape(self.BATCH_SIZE, -1, patch_time.size(-1)).transpose(1, 2) 235 | #print(patch_time.size()) 236 | patch_time_list.append(patch_time) 237 | 238 | patch_time_final = torch.cat(patch_time_list, 2) 239 | 240 | if FILTER_FLAG: 241 | patch_time_final = patch_time_final + self.patch_bias.real 242 | 243 | if ACT_DOMAIN == 'time': 244 | patch_time_final = rF.relu(patch_time_final) 245 | 246 | return patch_time_final.transpose(1, 2) 247 | 248 | 249 | 250 | class STFNet(nn.Module): 251 | def __init__(self, args): 252 | super(STFNet, self).__init__() 253 | GEN_FFT_N = args.GEN_FFT_N 254 | GEN_FFT_N2 = args.GEN_FFT_N2 255 | GEN_FFT_STEP = args.GEN_FFT_STEP 256 | FILTER_LEN = args.FILTER_LEN 257 | DILATION_LEN = args.DILATION_LEN 258 | SENSOR_AXIS = args.SENSOR_AXIS 259 | SERIES_SIZE = args.input_size 260 | SERIES_SIZE2 = int(0.75 * args.input_size) 261 | GEN_C_OUT = 72 262 | OUT_DIM = args.num_labels 263 | SENSOR_TYPE = args.sensor_type 264 | 265 | self.SENSOR_TYPE = SENSOR_TYPE 266 | self.SENSOR_AXIS = SENSOR_AXIS 267 | assert GEN_C_OUT % SENSOR_TYPE == 0 268 | 269 | self.layers1 = nn.ModuleList([]) 270 | self.layers2 = nn.ModuleList([]) 271 | self.layers3 = nn.ModuleList([]) 272 | self.drop_layers1 = nn.ModuleList([]) 273 | self.drop_layers2 = nn.ModuleList([]) 274 | self.drop_layers3 = nn.ModuleList([]) 275 | for i in range(SENSOR_TYPE): 276 | self.layers1.append(STFlayer(GEN_FFT_N, GEN_FFT_STEP, FILTER_LEN, DILATION_LEN, 277 | int(args.input_channel / args.sensor_type), GEN_C_OUT, ser_size=SERIES_SIZE, BATCH_SIZE = args.batch_size)) 278 | self.layers2.append(STFlayer(GEN_FFT_N, GEN_FFT_STEP, FILTER_LEN, DILATION_LEN, 279 | GEN_C_OUT, GEN_C_OUT, ser_size=SERIES_SIZE, BATCH_SIZE = args.batch_size)) 280 | self.layers3.append(STFlayer(GEN_FFT_N, GEN_FFT_STEP, FILTER_LEN, DILATION_LEN, 281 | GEN_C_OUT, int(GEN_C_OUT / SENSOR_TYPE), ser_size=SERIES_SIZE, BATCH_SIZE = args.batch_size)) 282 | self.drop_layers1.append(torch.nn.Dropout(p = 0.2)) 283 | self.drop_layers2.append(torch.nn.Dropout(p = 0.2)) 284 | self.drop_layers3.append(torch.nn.Dropout(p = 0.2)) 285 | 286 | 287 | self.sensor_layer1 = STFlayer(GEN_FFT_N, GEN_FFT_STEP, FILTER_LEN, DILATION_LEN, 288 | int((GEN_C_OUT/2)/len(GEN_FFT_N))*len(GEN_FFT_N)*2, GEN_C_OUT, 289 | out_fft_list=GEN_FFT_N2, ser_size=SERIES_SIZE2, pooling=True , BATCH_SIZE = args.batch_size) 290 | 291 | self.sensor_layer2 = STFlayer(GEN_FFT_N, GEN_FFT_STEP, FILTER_LEN, DILATION_LEN, 292 | GEN_C_OUT, GEN_C_OUT, ser_size=SERIES_SIZE2, BATCH_SIZE = args.batch_size) 293 | 294 | self.sensor_layer3 = STFlayer(GEN_FFT_N, GEN_FFT_STEP, FILTER_LEN, DILATION_LEN, 295 | 
GEN_C_OUT, GEN_C_OUT, ser_size=SERIES_SIZE2, BATCH_SIZE = args.batch_size) 296 | 297 | self.sensor_drop1 = torch.nn.Dropout(p = 0.2) 298 | self.sensor_drop2 = torch.nn.Dropout(p = 0.2) 299 | self.sensor_drop3 = torch.nn.Dropout(p = 0.2) 300 | self.linear = torch.nn.Linear(GEN_C_OUT, OUT_DIM) 301 | 302 | 303 | def forward(self, inputs): 304 | inputs = inputs.transpose(1, 2) 305 | BATCH_SIZE = inputs.size(0) 306 | #inputs: BATCH_SIZE * 9(acc*3, gyro*3, mag*3) * 5 * L 307 | inputs = torch.reshape(inputs, (BATCH_SIZE, -1, self.SENSOR_TYPE, self.SENSOR_AXIS, inputs.size(-1)) ) 308 | splits = list(torch.split(inputs, 1, 2)) 309 | for i in range(len(splits)): 310 | splits[i] = torch.reshape(splits[i], (BATCH_SIZE, -1, splits[i].size(-1)) ) 311 | #print(splits[i].size()) 312 | splits[i] = self.drop_layers1[i]( self.layers1[i]( splits[i])) 313 | splits[i] = self.drop_layers2[i]( self.layers2[i]( splits[i])) 314 | splits[i] = self.drop_layers3[i]( self.layers3[i]( splits[i])) 315 | 316 | out = torch.cat(splits, 1) 317 | 318 | out = self.sensor_drop1( self.sensor_layer1(out)) 319 | out = self.sensor_drop2( self.sensor_layer2(out)) 320 | out = self.sensor_drop3( self.sensor_layer3(out)) 321 | 322 | out = torch.mean(out, -1) 323 | out = self.linear(out) 324 | return out 325 | 326 | 327 | if __name__ == '__main__': 328 | model = STFNet().cuda() 329 | print(sum(p.numel() for p in model.parameters() if p.requires_grad)) 330 | 331 | i = torch.zeros((2, 6, 512), requires_grad = True).cuda() 332 | p = model(i) 333 | print(p.size()) -------------------------------------------------------------------------------- /source/model/THAT.py: -------------------------------------------------------------------------------- 1 | import torch 2 | from model.transformer_encoder import Transformer 3 | from model.TransCNN import HARTransformer, Gaussian_Position 4 | import torch.nn.functional as F 5 | 6 | class HARTrans(torch.nn.Module): 7 | def __init__(self, args): 8 | super(HARTrans, self).__init__() 9 | self.softmax = torch.nn.LogSoftmax(dim=1) 10 | self.transformer = HARTransformer(args.input_channel, args.hlayers, args.hheads, int(args.input_size / args.sample)) 11 | self.args = args 12 | self.kernel_num = 128 13 | self.kernel_num_v = 16 14 | self.filter_sizes = [10, 40] 15 | self.filter_sizes_v = [2, 4] 16 | self.pos_encoding = Gaussian_Position(args.input_channel, int(args.input_size / args.sample), args.K) 17 | 18 | if args.vlayers == 0: 19 | self.v_transformer = None 20 | self.dense = torch.nn.Linear(args.input_channel, args.num_labels) 21 | else: 22 | self.v_transformer = Transformer(args.input_size, args.vlayers, args.vheads) 23 | self.dense = torch.nn.Linear(self.kernel_num * len(self.filter_sizes) + self.kernel_num_v * len(self.filter_sizes_v), args.num_labels) 24 | 25 | self.dense2 = torch.nn.Linear(self.kernel_num * len(self.filter_sizes), args.num_labels) 26 | self.dropout_rate = 0.5 27 | self.dropout = torch.nn.Dropout(self.dropout_rate) 28 | self.encoders = [] 29 | self.encoder_v = [] 30 | for i, filter_size in enumerate(self.filter_sizes): 31 | enc_attr_name = "encoder_%d" % i 32 | self.__setattr__(enc_attr_name, 33 | torch.nn.Conv1d(in_channels= args.input_channel, 34 | out_channels=self.kernel_num, 35 | kernel_size=filter_size).to('cuda') 36 | ) 37 | self.encoders.append(self.__getattr__(enc_attr_name)) 38 | for i, filter_size in enumerate(self.filter_sizes_v): 39 | enc_attr_name_v = "encoder_v_%d" % i 40 | self.__setattr__(enc_attr_name_v, 41 | torch.nn.Conv1d(in_channels= args.input_size, 42 | 
out_channels=self.kernel_num_v, 43 | kernel_size=filter_size).to('cuda') 44 | ) 45 | self.encoder_v.append(self.__getattr__(enc_attr_name_v)) 46 | 47 | def _aggregate(self, o, v=None): 48 | enc_outs = [] 49 | enc_outs_v = [] 50 | for encoder in self.encoders: 51 | f_map = encoder(o.transpose(-1, -2)) 52 | enc_ = F.relu(f_map) 53 | k_h = enc_.size()[-1] 54 | enc_ = F.max_pool1d(enc_, kernel_size=k_h) 55 | enc_ = enc_.squeeze(dim=-1) 56 | enc_outs.append(enc_) 57 | encoding = self.dropout(torch.cat(enc_outs, 1)) 58 | q_re = F.relu(encoding) 59 | if self.v_transformer is not None: 60 | for encoder in self.encoder_v: 61 | f_map = encoder(v.transpose(-1, -2)) 62 | enc_ = F.relu(f_map) 63 | k_h = enc_.size()[-1] 64 | enc_ = F.max_pool1d(enc_, kernel_size=k_h) 65 | enc_ = enc_.squeeze(dim=-1) 66 | enc_outs_v.append(enc_) 67 | encoding_v = self.dropout(torch.cat(enc_outs_v, 1)) 68 | v_re = F.relu(encoding_v) 69 | q_re = torch.cat((q_re, v_re), dim=1) 70 | return q_re 71 | 72 | def forward(self, data): 73 | d1 = data.size(dim=0) 74 | d3 = data.size(2) 75 | x = data.unsqueeze(-2) 76 | x = data.view(d1, -1, self.args.sample, d3) 77 | x = torch.sum(x, dim=-2).squeeze(-2) 78 | x = torch.div(x, self.args.sample) 79 | x = self.pos_encoding(x) 80 | x = self.transformer(x) 81 | 82 | if self.v_transformer is not None: 83 | y = data.view(-1, self.args.input_size, self.args.SENSOR_AXIS, int(self.args.input_channel / self.args.SENSOR_AXIS)) 84 | y = torch.sum(y, dim=-2).squeeze(-2) 85 | y = y.transpose(-1, -2) 86 | y = self.v_transformer(y) 87 | re = self._aggregate(x, y) 88 | predict = self.dense(re) 89 | else: 90 | re = self._aggregate(x) 91 | predict = self.dense2(re) 92 | 93 | return predict 94 | 95 | 96 | class TransCNN(torch.nn.Module): 97 | def __init__(self, args): 98 | super(TransCNN, self).__init__() 99 | self.softmax = torch.nn.LogSoftmax(dim=1) 100 | self.transformer = Transformer(90, args.hlayers, 9) 101 | self.args = args 102 | self.kernel_num = 128 103 | self.kernel_num_v = 16 104 | self.filter_sizes = [10, 40] 105 | self.filter_sizes_v = [2, 4] 106 | 107 | if args.vlayers == 0: 108 | self.v_transformer = None 109 | self.dense = torch.nn.Linear(90, 7) 110 | else: 111 | self.v_transformer = Transformer(2000, args.vlayers, 200) 112 | self.dense = torch.nn.Linear(self.kernel_num * len(self.filter_sizes) + self.kernel_num_v * len(self.filter_sizes_v), 7) 113 | 114 | self.dense2 = torch.nn.Linear(self.kernel_num * len(self.filter_sizes), 7) 115 | self.dropout_rate = 0.5 116 | self.dropout = torch.nn.Dropout(self.dropout_rate) 117 | self.encoders = [] 118 | self.encoder_v = [] 119 | for i, filter_size in enumerate(self.filter_sizes): 120 | enc_attr_name = "encoder_%d" % i 121 | self.__setattr__(enc_attr_name, 122 | torch.nn.Conv1d(in_channels=90, 123 | out_channels=self.kernel_num, 124 | kernel_size=filter_size).to('cuda') 125 | ) 126 | self.encoders.append(self.__getattr__(enc_attr_name)) 127 | for i, filter_size in enumerate(self.filter_sizes_v): 128 | enc_attr_name_v = "encoder_v_%d" % i 129 | self.__setattr__(enc_attr_name_v, 130 | torch.nn.Conv1d(in_channels=2000, 131 | out_channels=self.kernel_num_v, 132 | kernel_size=filter_size).to('cuda') 133 | ) 134 | self.encoder_v.append(self.__getattr__(enc_attr_name_v)) 135 | 136 | def _aggregate(self, o, v=None): 137 | enc_outs = [] 138 | enc_outs_v = [] 139 | for encoder in self.encoders: 140 | f_map = encoder(o.transpose(-1, -2)) 141 | enc_ = F.relu(f_map) 142 | k_h = enc_.size()[-1] 143 | enc_ = F.max_pool1d(enc_, kernel_size=k_h) 144 | enc_ = 
enc_.squeeze(dim=-1) 145 | enc_outs.append(enc_) 146 | encoding = self.dropout(torch.cat(enc_outs, 1)) 147 | q_re = F.relu(encoding) 148 | if self.v_transformer is not None: 149 | for encoder in self.encoder_v: 150 | f_map = encoder(v.transpose(-1, -2)) 151 | enc_ = F.relu(f_map) 152 | k_h = enc_.size()[-1] 153 | enc_ = F.max_pool1d(enc_, kernel_size=k_h) 154 | enc_ = enc_.squeeze(dim=-1) 155 | enc_outs_v.append(enc_) 156 | encoding_v = self.dropout(torch.cat(enc_outs_v, 1)) 157 | v_re = F.relu(encoding_v) 158 | q_re = torch.cat((q_re, v_re), dim=1) 159 | return q_re 160 | 161 | def forward(self, data): 162 | d1 = data.size(dim=0) 163 | d3 = data.size(2) 164 | x = data.unsqueeze(-2) 165 | x = data.view(d1, -1, self.args.sample, d3) 166 | x = torch.sum(x, dim=-2).squeeze(-2) 167 | x = torch.div(x, self.args.sample) 168 | x = self.transformer(x) 169 | 170 | if self.v_transformer is not None: 171 | y = data.view(-1, 2000, 3, 30) 172 | y = torch.sum(y, dim=-2).squeeze(-2) 173 | y = y.transpose(-1, -2) 174 | y = self.v_transformer(y) 175 | re = self._aggregate(x, y) 176 | predict = self.softmax(self.dense(re)) 177 | else: 178 | re = self._aggregate(x) 179 | predict = self.softmax(self.dense2(re)) 180 | 181 | return predict 182 | 183 | 184 | class TransformerM(torch.nn.Module): 185 | def __init__(self, args): 186 | super(TransformerM, self).__init__() 187 | self.args = args 188 | self.softmax = torch.nn.LogSoftmax(dim=1) 189 | self.transformer = Transformer(90, args.hlayers, 9) 190 | 191 | if args.vlayers == 0: 192 | self.v_transformer = None 193 | self.dense = torch.nn.Linear(90, 7) 194 | else: 195 | self.v_transformer = Transformer(2000, args.vlayers, 200) 196 | self.linear = torch.nn.Linear(2000, 7) 197 | self.dense = torch.nn.Linear(90, 7) 198 | 199 | #self.linear = torch.nn.Linear(2000, args.com_dim) 200 | self.cls = torch.nn.Parameter(torch.zeros([1, 1, 90], dtype=torch.float, requires_grad=True)) 201 | self.sep = torch.nn.Parameter(torch.zeros([1, 1, 90], dtype=torch.float, requires_grad=True)) 202 | torch.nn.init.xavier_uniform_(self.cls, gain=1) 203 | torch.nn.init.xavier_uniform_(self.sep, gain=1) 204 | 205 | def fusion(self, x, y): 206 | y = self.softmax(self.linear(y)) 207 | x = self.softmax(self.dense(x)) 208 | predict = x + y 209 | return predict 210 | 211 | def forward(self, data): 212 | d1 = data.size(dim=0) 213 | d3 = data.size(2) 214 | x = data.unsqueeze(-2) 215 | x = data.view(d1, -1, self.args.sample, d3) 216 | x = torch.sum(x, dim=-2).squeeze(-2) 217 | x = torch.div(x, self.args.sample) 218 | #x = torch.cat((self.cls.repeat(d1, 1, 1), x), dim=1) 219 | dx = x.size(1) 220 | x = self.transformer(x) 221 | x = torch.div(torch.sum(x, dim=1).squeeze(dim=1), dx) 222 | #x = x[:, 0, :] 223 | if self.v_transformer is not None: 224 | y = data.view(-1, 2000, 3, 30) 225 | y = torch.sum(y, dim=-2).squeeze(-2) 226 | y = y.transpose(-1, -2) 227 | d2 = y.size(1) 228 | y = self.v_transformer(y) 229 | dy = y.size(1)*3 230 | y = torch.div(torch.sum(y, dim=1).squeeze(dim=1), dy) 231 | predict = self.fusion(x, y) 232 | else: 233 | predict = self.softmax(self.dense(x)) 234 | 235 | return predict -------------------------------------------------------------------------------- /source/model/TransCNN.py: -------------------------------------------------------------------------------- 1 | import torch.nn as nn 2 | import torch 3 | from torch.nn.init import * 4 | import torch.nn.functional as F 5 | import numpy as np 6 | import torch 7 | import torch.nn as nn 8 | import torch.nn.functional as F 9 | 
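# NOTE: this module provides the transformer building blocks (Encoder,
# MultiHeadedAttention, Gaussian_Position, HARTransformer) consumed by the
# THAT model in THAT.py; the duplicated torch/nn/F imports above are
# redundant but harmless.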
import math, copy, time 10 | 11 | 12 | def clones(module, N): 13 | "Produce N identical layers." 14 | return nn.ModuleList([copy.deepcopy(module) for _ in range(N)]) 15 | 16 | 17 | class Encoder(nn.Module): 18 | "Core encoder is a stack of N layers" 19 | 20 | def __init__(self, layer, N): 21 | super(Encoder, self).__init__() 22 | self.layers = clones(layer, N) 23 | self.norm = LayerNorm(layer.size) 24 | 25 | def forward(self, x, mask=None): 26 | "Pass the input (and mask) through each layer in turn." 27 | for layer in self.layers: 28 | x = layer(x, mask) 29 | return self.norm(x) 30 | 31 | 32 | class PositionwiseFeedForward(nn.Module): 33 | def __init__(self, d_model, d_ff, dropout=0.1): 34 | super(PositionwiseFeedForward, self).__init__() 35 | self.w_1 = nn.Linear(d_model, d_ff) 36 | self.w_2 = nn.Linear(d_ff, d_model) 37 | self.dropout = nn.Dropout(dropout) 38 | 39 | def forward(self, x): 40 | return self.w_2(self.dropout(F.relu(self.w_1(x)))) 41 | 42 | 43 | class LayerNorm(nn.Module): 44 | "Construct a layernorm module (See citation for details)." 45 | 46 | def __init__(self, features, eps=1e-6): 47 | super(LayerNorm, self).__init__() 48 | self.a_2 = nn.Parameter(torch.ones(features)) 49 | self.b_2 = nn.Parameter(torch.zeros(features)) 50 | self.eps = eps 51 | 52 | def forward(self, x): 53 | mean = x.mean(-1, keepdim=True) 54 | std = x.std(-1, keepdim=True) 55 | return self.a_2 * (x - mean) / (std + self.eps) + self.b_2 56 | 57 | 58 | class SublayerConnection(nn.Module): 59 | """ 60 | A residual connection followed by a layer norm. 61 | Note for code simplicity the norm is first as opposed to last. 62 | """ 63 | 64 | def __init__(self, size, dropout): 65 | super(SublayerConnection, self).__init__() 66 | self.norm = LayerNorm(size) 67 | self.dropout = nn.Dropout(dropout) 68 | 69 | def forward(self, x, sublayer): 70 | "Apply residual connection to any sublayer with the same size." 71 | return x + self.dropout(sublayer(self.norm(x))) 72 | 73 | 74 | def attention(query, key, value, mask=None, dropout=None): 75 | "Compute 'Scaled Dot Product Attention'" 76 | d_k = query.size(-1) 77 | scores = torch.matmul(query, key.transpose(-2, -1)) \ 78 | / math.sqrt(d_k) 79 | if mask is not None: 80 | scores = scores.masked_fill(mask == 0, -1e9) 81 | p_attn = F.softmax(scores, dim=-1) 82 | if dropout is not None: 83 | p_attn = dropout(p_attn) 84 | return torch.matmul(p_attn, value), p_attn 85 | 86 | 87 | class HAR_CNN(nn.Module): 88 | "Implements CNN equation." 
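# NOTE: HAR_CNN is the convolutional feed-forward used inside the encoder
# layers in place of the plain PositionwiseFeedForward above: each branch runs
# a 1-D convolution over the sequence axis with same-style padding, then
# BatchNorm, dropout and ReLU, and the branch outputs are averaged. The
# hard-coded divisor 3 in forward() assumes the default filters=[1, 3, 5];
# torch.div(..., len(self.filter_sizes)) would generalize it.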
89 | def __init__(self, d_model, d_ff, filters, dropout=0.1): 90 | super(HAR_CNN, self).__init__() 91 | self.kernel_num = int(d_ff) 92 | self.filter_sizes = filters 93 | self.dropout = nn.Dropout(dropout) 94 | self.bn = nn.BatchNorm1d(d_model) 95 | self.encoders = [] 96 | for i, filter_size in enumerate(self.filter_sizes): 97 | enc_attr_name = "encoder_%d" % i 98 | self.__setattr__(enc_attr_name, 99 | torch.nn.Conv1d( 100 | in_channels=d_model, 101 | out_channels=self.kernel_num, 102 | kernel_size=filter_size, 103 | padding=int((filter_size-1)/2)) 104 | ) 105 | self.encoders.append(self.__getattr__(enc_attr_name)) 106 | 107 | def forward(self, x): 108 | enc_outs = [] 109 | for encoder in self.encoders: 110 | f_map = encoder(x.transpose(-1, -2)) 111 | enc_ = f_map 112 | #enc_ = F.relu(f_map) 113 | #k_h = enc_.size()[-1] 114 | #enc_ = F.max_pool1d(enc_, kernel_size=k_h) 115 | #enc_ = enc_.squeeze(dim=-1) 116 | enc_ = F.relu(self.dropout(self.bn(enc_))) 117 | enc_outs.append(enc_.unsqueeze(dim=1)) 118 | re = torch.div(torch.sum(torch.cat(enc_outs, 1), dim=1), 3) 119 | encoding = re 120 | #encoding = self.dropout(torch.cat(enc_outs, 1)) 121 | #q_re = F.relu(encoding) 122 | return encoding.transpose(-1, -2) 123 | 124 | class EncoderLayer(nn.Module): 125 | "Encoder is made up of self-attn and feed forward (defined below)" 126 | 127 | def __init__(self, size, self_attn, feed_forward, dropout): 128 | super(EncoderLayer, self).__init__() 129 | self.self_attn = self_attn 130 | self.feed_forward = feed_forward 131 | self.sublayer = clones(SublayerConnection(size, dropout), 2) 132 | self.size = size 133 | 134 | def forward(self, x, mask=None): 135 | "Follow Figure 1 (left) for connections." 136 | x = self.sublayer[0](x, lambda x: self.self_attn(x, x, x, mask)) 137 | return self.sublayer[1](x, self.feed_forward) 138 | 139 | 140 | def attention_with_pos(query, key, value, pos_k, pos_v, mask=None, dropout=None): 141 | "Compute 'Scaled Dot Product Attention'" 142 | d_k = query.size(-1) 143 | scores = torch.matmul(query, key.transpose(-2, -1)) \ 144 | / math.sqrt(d_k) 145 | if mask is not None: 146 | scores = scores.masked_fill(mask == 0, -1e9) 147 | p_attn = F.softmax(scores, dim=-1) 148 | if dropout is not None: 149 | p_attn = dropout(p_attn) 150 | return torch.matmul(p_attn, value), p_attn 151 | 152 | 153 | class MultiHeadedAttention(nn.Module): 154 | def __init__(self, h, d_model, dropout=0.1): 155 | "Take in model size and number of heads." 156 | super(MultiHeadedAttention, self).__init__() 157 | assert d_model % h == 0 158 | # We assume d_v always equals d_k 159 | self.d_k = d_model // h 160 | self.h = h 161 | self.linears = clones(nn.Linear(d_model, d_model), 4) 162 | self.attn = None 163 | self.dropout = nn.Dropout(p=dropout) 164 | #relative positional encoding 165 | #self.k = 20 166 | #self.pos_k = torch.zeros([self.k, d_model], dtype=torch.float, requires_grad=True) 167 | #self.pos_v = torch.zeros([self.k, d_model], dtype=torch.float, requires_grad=True) 168 | #nn.init.xavier_uniform_(self.pos_k, gain=1) 169 | #nn.init.xavier_uniform_(self.pos_v, gain=1) 170 | 171 | def get_rel_pos(self, x): 172 | return max(self.k*-1, min(self.k, x)) 173 | 174 | def forward(self, query, key, value, mask=None): 175 | "Implements Figure 2" 176 | if mask is not None: 177 | # Same mask applied to all h heads. 
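# NOTE (shapes): query/key/value enter as [batch, seq_len, d_model]; the linear
# projections and view below reshape them to [batch, h, seq_len, d_k] with
# d_k = d_model // h, the attention weights are [batch, h, seq_len, seq_len],
# and the final transpose/view concatenates the heads back to
# [batch, seq_len, d_model].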
178 | mask = mask.unsqueeze(1) 179 | nbatches = query.size(0) 180 | 181 | # 1) Do all the linear projections in batch from d_model => h x d_k 182 | query, key, value = \ 183 | [l(x).view(nbatches, -1, self.h, self.d_k).transpose(1, 2) 184 | for l, x in zip(self.linears, (query, key, value))] 185 | 186 | # 2) Apply attention on all the projected vectors in batch. 187 | x, self.attn = attention(query, key, value, mask=mask, 188 | dropout=self.dropout) 189 | 190 | # 3) "Concat" using a view and apply a final linear. 191 | x = x.transpose(1, 2).contiguous() \ 192 | .view(nbatches, -1, self.h * self.d_k) 193 | return self.linears[-1](x) 194 | 195 | def normal_pdf(pos, mu, sigma): 196 | a = pos - mu 197 | log_p = -1*torch.mul(a, a)/(2*sigma) - torch.log(sigma)/2 198 | return F.softmax(log_p, dim=1) 199 | 200 | 201 | def get_pe(d_model, max_len=5000): 202 | # Compute the positional encodings once in log space. 203 | pe = torch.zeros(max_len, d_model) 204 | position = torch.arange(0, max_len).unsqueeze(1) 205 | div_term = torch.exp(torch.arange(0, d_model, 2) * 206 | -(math.log(10000.0) / d_model)) 207 | pe[:, 0::2] = torch.sin(position * div_term) 208 | pe[:, 1::2] = torch.cos(position * div_term) 209 | #pe.requires_grad = False 210 | return pe 211 | 212 | 213 | class Gaussian_Position(nn.Module): 214 | def __init__(self, d_model, total_size, K=10): 215 | super(Gaussian_Position, self).__init__() 216 | #self.embedding = get_pe(d_model, K).to('cuda') 217 | #self.register_buffer('pe', self.embedding) 218 | self.embedding = nn.Parameter(torch.zeros([K, d_model], dtype=torch.float), requires_grad=True) 219 | nn.init.xavier_uniform_(self.embedding, gain=1) 220 | self.positions = torch.tensor([i for i in range(total_size)], requires_grad=False).unsqueeze(1).repeat(1, K).to('cuda') 221 | s = 0.0 222 | interval = total_size / K 223 | mu = [] 224 | for _ in range(K): 225 | mu.append(nn.Parameter(torch.tensor(s, dtype=torch.float), requires_grad=True)) 226 | s = s + interval 227 | self.mu = nn.Parameter(torch.tensor(mu, dtype=torch.float).unsqueeze(0), requires_grad=True) 228 | self.sigma = nn.Parameter(torch.tensor([torch.tensor([50.0], dtype=torch.float, requires_grad=True) for _ in range(K)]).unsqueeze(0)) 229 | 230 | def forward(self, x): 231 | M = normal_pdf(self.positions, self.mu, self.sigma) 232 | pos_enc = torch.matmul(M, self.embedding) 233 | #print(M) 234 | return x + pos_enc.unsqueeze(0).repeat(x.size(0), 1, 1) 235 | 236 | 237 | class PositionalEncoding(nn.Module): 238 | "Implement the PE function." 239 | 240 | def __init__(self, d_model, dropout, max_len=5000): 241 | super(PositionalEncoding, self).__init__() 242 | self.dropout = nn.Dropout(p=dropout) 243 | 244 | # Compute the positional encodings once in log space. 
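# NOTE: this is the standard sinusoidal encoding
#   PE[pos, 2i]   = sin(pos / 10000^(2i / d_model))
#   PE[pos, 2i+1] = cos(pos / 10000^(2i / d_model));
# the exp/log expression below is just a numerically stable way of computing
# 1 / 10000^(2i / d_model).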
245 | pe = torch.zeros(max_len, d_model) 246 | position = torch.arange(0, max_len).unsqueeze(1) 247 | div_term = torch.exp(torch.arange(0, d_model, 2) * 248 | -(math.log(10000.0) / d_model)) 249 | pe[:, 0::2] = torch.sin(position * div_term) 250 | pe[:, 1::2] = torch.cos(position * div_term) 251 | pe.requires_grad = False 252 | #pe = pe.unsqueeze(0) 253 | self.register_buffer('pe', pe) 254 | 255 | def forward(self, x): 256 | #x = x + self.pe[:x.size(0), :]#Variable(self.pe[:x.size(0), :], requires_grad=False) 257 | x = x + self.pe[:x.size(1), :].unsqueeze(0).repeat(x.size(0), 1, 1) ## modified by Bing to adapt to batch 258 | return self.dropout(x) 259 | 260 | class HARTransformer(nn.Module): 261 | 262 | def __init__(self, hidden_dim, N, H, total_size, filters=[1, 3, 5]): 263 | super(HARTransformer, self).__init__() 264 | #self.pos_encoding = get_pe(hidden_dim) 265 | self.model = Encoder( 266 | EncoderLayer(hidden_dim, MultiHeadedAttention(H, hidden_dim), 267 | HAR_CNN(hidden_dim, hidden_dim, filters) 268 | , 0.1), 269 | N 270 | ) 271 | 272 | def forward(self, x, mask=None): 273 | #x = self.pos_encoding(x) 274 | return self.model(x, mask) -------------------------------------------------------------------------------- /source/model/UniTS.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import numpy as np 3 | import torch.nn as nn 4 | import torch.nn.functional as F 5 | import random 6 | import copy 7 | import math 8 | import matplotlib.pyplot as plt 9 | 10 | 11 | 12 | 13 | class FIC(nn.Module): 14 | def __init__(self, window_size, stride): 15 | super(FIC, self).__init__() 16 | self.window_size = window_size 17 | self.k = int(window_size / 2) 18 | self.conv = nn.Conv1d(in_channels = 1, out_channels = 2 * int(window_size / 2), kernel_size = window_size, 19 | stride = stride, padding = 0, bias = False) 20 | self.init() 21 | 22 | def forward(self, x): 23 | # x: B * C * L 24 | B, C = x.size(0), x.size(1) 25 | 26 | x = torch.reshape(x, (B * C, -1)).unsqueeze(1) 27 | x = self.conv(x) 28 | x = torch.reshape(x, (B, C, -1, x.size(-1))) 29 | return x # B * C * fc * L 30 | 31 | 32 | def init(self): 33 | ''' 34 | Fourier weights initialization 35 | ''' 36 | basis = torch.tensor([math.pi * 2 * j / self.window_size for j in range(self.window_size)]) 37 | weight = torch.zeros((self.k * 2, self.window_size)) 38 | for i in range(self.k * 2): 39 | f = int(i / 2) + 1 40 | if i % 2 == 0: 41 | weight[i] = torch.cos(f * basis) 42 | else: 43 | weight[i] = torch.sin(-f * basis) 44 | self.conv.weight = torch.nn.Parameter( weight.unsqueeze(1), requires_grad=True) 45 | 46 | 47 | 48 | class TSEnc(nn.Module): 49 | def __init__(self, window_size, stride, k): 50 | super(TSEnc, self).__init__() 51 | ''' 52 | virtual filter choose 2 * k most important channels 53 | ''' 54 | self.k = k 55 | self.window_size = window_size 56 | self.FIC = FIC(window_size = window_size, stride = stride).cuda() 57 | self.RPC = nn.Conv1d(1, 2*k, kernel_size = window_size, stride = stride) 58 | 59 | 60 | def forward(self, x): 61 | x = x.permute(0, 2, 1) 62 | # fic # 63 | h_f = self.FIC(x) 64 | 65 | # virtual filter # 66 | h_f_pos, idx_pos = (torch.abs(h_f)).topk(2*self.k, dim = -2, largest = True, sorted = True) 67 | o_f_pos = torch.cat( (h_f_pos, idx_pos.type(torch.Tensor).to(h_f_pos.device )) , -2) 68 | 69 | # rpc # 70 | B, C = x.size(0), x.size(1) 71 | x = torch.reshape(x, (B*C, -1)).unsqueeze(1) 72 | o_t = self.RPC(x) 73 | o_t = torch.reshape(o_t, (B, C, -1, o_t.size(-1))) 74 | 75 | o = 
torch.cat((o_f_pos, o_t), -2) 76 | return o 77 | 78 | 79 | class resConv1dBlock(nn.Module): 80 | def __init__(self, in_channels, kernel_size, stride, layer_num): 81 | super(resConv1dBlock, self).__init__() 82 | ''' 83 | ResNet 1d convolution block 84 | ''' 85 | self.layer_num = layer_num 86 | self.conv1 = nn.ModuleList([ 87 | nn.Conv1d(in_channels = in_channels, out_channels = 2 * in_channels, kernel_size = kernel_size, stride = stride, padding = int((kernel_size - 1) / 2) ) 88 | for i in range(layer_num)]) 89 | 90 | self.bn1 = nn.ModuleList([ 91 | nn.BatchNorm1d(2 * in_channels) 92 | for i in range(layer_num)]) 93 | 94 | self.conv2 = nn.ModuleList([ 95 | nn.Conv1d(in_channels = 2 * in_channels, out_channels = in_channels, kernel_size = kernel_size, stride = stride, padding = int((kernel_size - 1) / 2) ) 96 | for i in range(layer_num)]) 97 | 98 | self.bn2 = nn.ModuleList([ 99 | nn.BatchNorm1d(in_channels) 100 | for i in range(layer_num)]) 101 | 102 | def forward(self, x): 103 | for i in range(self.layer_num): 104 | tmp = F.relu(self.bn1[i](self.conv1[i](x))) 105 | x = F.relu(self.bn2[i](self.conv2[i](tmp)) + x) 106 | return x 107 | 108 | 109 | class UniTS(nn.Module): 110 | def __init__(self, input_size, sensor_num, layer_num, 111 | window_list, stride_list, k_list, out_dim, hidden_channel = 128): 112 | super(UniTS, self).__init__() 113 | assert len(window_list) == len(stride_list) 114 | assert len(window_list) == len(k_list) 115 | self.hidden_channel = hidden_channel 116 | self.window_list = window_list 117 | 118 | self.ts_encoders = nn.ModuleList([ 119 | TSEnc(window_list[i], stride_list[i], k_list[i]) for i in range(len(window_list)) 120 | ]) 121 | self.num_frequency_channel = [6 * k_list[i] for i in range(len(window_list))] 122 | self.current_size = [1 + int((input_size - window_list[i]) / stride_list[i]) for i in range(len(window_list))] 123 | # o.size(): B * C * num_frequency_channel * current_size 124 | self.multi_channel_fusion = nn.ModuleList([nn.ModuleList() for _ in range(len(window_list))]) 125 | self.conv_branches = nn.ModuleList([nn.ModuleList() for _ in range(len(window_list))]) 126 | self.bns = nn.ModuleList([nn.BatchNorm1d(self.hidden_channel) for _ in range(len(window_list))]) 127 | 128 | self.multi_channel_fusion = nn.ModuleList([nn.Conv2d(in_channels = sensor_num, out_channels = self.hidden_channel, 129 | kernel_size = (self.num_frequency_channel[i], 1), stride = (1, 1) ) for i in range(len(window_list) ) ]) 130 | self.end_linear = nn.ModuleList([]) 131 | 132 | for i in range(len(window_list)): 133 | scale = 1 134 | while self.current_size[i] >= 3: 135 | self.conv_branches[i].append( 136 | resConv1dBlock(in_channels = self.hidden_channel * scale, 137 | kernel_size = 3, stride = 1, layer_num = layer_num) 138 | ) 139 | if scale < 2: 140 | # scale up the hidden dims for ResNet only once 141 | self.conv_branches[i].append( 142 | nn.Conv1d(in_channels = self.hidden_channel * scale, out_channels = self.hidden_channel *2* scale, kernel_size = 1, stride = 1) 143 | ) 144 | scale *= 2 145 | 146 | self.conv_branches[i].append(nn.AvgPool1d(kernel_size = 2)) 147 | self.current_size[i] = 1 + int((self.current_size[i] - 2) / 2) 148 | 149 | self.end_linear.append( 150 | nn.Linear(self.hidden_channel * self.current_size[i] * scale, self.hidden_channel) 151 | ) 152 | self.classifier = nn.Linear(self.hidden_channel* len(self.window_list), out_dim) 153 | 154 | def forward(self, x): 155 | #x: B * L * C 156 | multi_scale_x = [] 157 | B = x.size(0) 158 | C = x.size(2) 159 | 160 | for i in 
range(len(self.current_size)): 161 | tmp = self.ts_encoders[i](x) 162 | #tmp: B * C * fc * L' 163 | tmp = F.relu(self.bns[i](self.multi_channel_fusion[i](tmp).squeeze(2))) 164 | 165 | for j in range(len(self.conv_branches[i])): 166 | tmp = self.conv_branches[i][j](tmp) 167 | tmp = tmp.view(B,-1) 168 | # tmp : B * l' 169 | tmp = F.relu(self.end_linear[i](tmp)) 170 | multi_scale_x.append(tmp) 171 | 172 | x = torch.cat(multi_scale_x, -1) 173 | x = self.classifier(x) 174 | return x 175 | 176 | 177 | 178 | 179 | def main(): 180 | stft_m = STFTNet(input_size = 256, sensor_num = 6, layer_num = 1, 181 | window_list = [16, 32, 48], stride_list = [8, 16, 24], k_list = [6, 8, 10], out_dim = 4, hidden_channel = 32).cuda() 182 | 183 | total_params = sum(p.numel() for p in stft_m.parameters()) 184 | print(f'{total_params:,} total parameters.') 185 | 186 | 187 | x = torch.zeros(3, 256, 6).cuda() 188 | output = stft_m(x) 189 | print(output.size()) 190 | 191 | 192 | if __name__ == '__main__': 193 | main() 194 | -------------------------------------------------------------------------------- /source/model/Widar3.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import numpy as np 3 | import torch.nn as nn 4 | import torch.nn.functional as F 5 | from torchsummary import summary 6 | from torchvision.models import AlexNet 7 | 8 | class Widar3(nn.Module): 9 | # CNN + GRU 10 | def __init__(self, input_shape, input_channel, num_label, n_gru_hidden_units=128, f_dropout_ratio=0.5, batch_first=True): 11 | super(Widar3, self).__init__() 12 | self.num_label = num_label 13 | self.n_gru_hidden_units = n_gru_hidden_units 14 | self.f_dropout_ratio = f_dropout_ratio 15 | # [@, T_MAX, 1, 20, 20] 16 | self.input_shape = input_shape 17 | self.input_time = input_shape[1] 18 | self.input_channel = input_shape[2] 19 | self.input_x, self.input_y = input_shape[3], input_shape[4] 20 | 21 | # self.cnn = AlexNet(weights=None).features 22 | # self.cnn. 23 | 24 | self.Tconv1_out_channel=16 25 | self.Tconv1_kernel_size=5 26 | self.Tconv1_stride=1 27 | 28 | self.Tdense1_out = 64 29 | self.Tdense2_out = 64 30 | 31 | self.cnn = nn.Sequential( 32 | nn.Conv2d(input_channel, self.Tconv1_out_channel, self.Tconv1_kernel_size, self.Tconv1_stride), 33 | nn.ReLU(inplace=True), 34 | nn.MaxPool2d(kernel_size=(2,2)), 35 | nn.Flatten(), 36 | nn.Linear(self.Tconv1_out_channel * ((self.input_x - 4) // 2) * ((self.input_y - 4) // 2), self.Tdense1_out), 37 | nn.ReLU(inplace=True), 38 | nn.Dropout(f_dropout_ratio), 39 | nn.Linear(self.Tdense1_out, self.Tdense2_out), 40 | nn.ReLU(inplace=True), 41 | ) 42 | self.gru = nn.GRU(input_size=self.Tdense2_out, hidden_size=n_gru_hidden_units, batch_first=batch_first) 43 | self.dropout2 = nn.Dropout(f_dropout_ratio) 44 | # [Pytorch] DO NOT USE SOFTMAX HERE!!! 
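# NOTE: the model returns raw logits. Assuming training uses
# nn.CrossEntropyLoss (the loss is not shown in this file), log-softmax is
# applied inside the loss, so adding a softmax here (as the commented-out
# Keras reference at the end of this file does) would apply it twice.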
45 | self.dense3 = nn.Linear(n_gru_hidden_units, num_label) 46 | 47 | def forward(self, input): 48 | # [@, T_MAX, 1, 20, 20] 49 | cnn_out_list = [self.cnn(input[:, t, :, :, :]) for t in range(self.input_time)] 50 | cnn_out = torch.stack(cnn_out_list, dim=1) 51 | # [@, T_MAX, 64] 52 | out, _ = self.gru(cnn_out) 53 | x = out[:, -1, :] 54 | x = self.dropout2(x) 55 | x = self.dense3(x) 56 | # x = F.relu(x) 57 | return x 58 | 59 | def main(): 60 | input = torch.zeros((4, 38, 1, 20, 20)).cuda() 61 | model = Widar3(input.shape, input_channel = 1, num_label = 6, n_gru_hidden_units=128, f_dropout_ratio=0.5).cuda() 62 | o = model(input) 63 | summary(model, input_size=input.size()[1:]) 64 | print(o.size()) 65 | 66 | if __name__ == '__main__': 67 | main() 68 | 69 | # import torch 70 | # import numpy as np 71 | # import torch.nn as nn 72 | # import torch.nn.functional as F 73 | # from torchsummary import summary 74 | # # Tensorflow 75 | # # model_input = Input(shape=input_shape, dtype='float32', name='name_model_input') # (@,T_MAX,20,20,1) 76 | # # # Feature extraction part 77 | # # x = TimeDistributed(Conv2D(16,kernel_size=(5,5),activation='relu',data_format='channels_last',\ 78 | # # input_shape=input_shape))(model_input) # (@,T_MAX,20,20,1)=>(@,T_MAX,16,16,16) 79 | # # x = TimeDistributed(MaxPooling2D(pool_size=(2,2)))(x) # (@,T_MAX,16,16,16)=>(@,T_MAX,8,8,16) 80 | # # x = TimeDistributed(Flatten())(x) # (@,T_MAX,8,8,16)=>(@,T_MAX,8*8*16) 81 | # # x = TimeDistributed(Dense(64,activation='relu'))(x) # (@,T_MAX,8*8*16)=>(@,T_MAX,64) 82 | # # x = TimeDistributed(Dropout(f_dropout_ratio))(x) 83 | # # x = TimeDistributed(Dense(64,activation='relu'))(x) # (@,T_MAX,64)=>(@,T_MAX,64) 84 | # # x = GRU(n_gru_hidden_units,return_sequences=False)(x) # (@,T_MAX,64)=>(@,128) 85 | # # x = Dropout(f_dropout_ratio)(x) 86 | # # model_output = Dense(n_class, activation='softmax', name='name_model_output')(x) # (@,128)=>(@,n_class) 87 | 88 | # # # Model compiling 89 | # # model = Model(inputs=model_input, outputs=model_output) 90 | # # model.compile(optimizer=keras.optimizers.RMSprop(lr=f_learning_rate), 91 | # # loss='categorical_crossentropy', 92 | # # metrics=['accuracy'] 93 | # # ) 94 | 95 | # # Pytorch 96 | # # class TimeDistributed(nn.Module): 97 | # # def __init__(self, module, batch_first=True): 98 | # # super(TimeDistributed, self).__init__() 99 | # # self.module = module 100 | # # self.batch_first = batch_first 101 | 102 | # # def forward(self, x): 103 | # # if len(x.size()) <= 2: 104 | # # return self.module(x) 105 | # # # Input: (batch, time, channels, x, y) 106 | 107 | # # # split time dimension 108 | # # x_split = x.split(1, dim=1) 109 | # # x_split = [e.squeeze(1) for e in x_split] 110 | # # # Apply the module on each timestep 111 | # # outputs = [self.module(x_t) for x_t in x_split] 112 | # # # # Combine the output tensors back into a single tensor 113 | # # return torch.stack(outputs, dim=1) if self.batch_first else torch.stack(outputs, dim=2) 114 | 115 | # class TimeDistributed(nn.Module): 116 | # def __init__(self, module, batch_first=True): 117 | # super(TimeDistributed, self).__init__() 118 | # self.module = module 119 | # self.batch_first = batch_first 120 | 121 | # def forward(self, x): 122 | # assert len(x.size()) > 2 123 | 124 | # # reshape input data --> (samples * timesteps, input_size) 125 | # # squash timesteps 126 | # batch_size, time_steps, C, H, W = x.size() 127 | # c_in = x.view(batch_size * time_steps, C, H, W) 128 | # c_out = self.module(c_in) 129 | # r_in = c_out.view(batch_size, 
time_steps, -1) 130 | # if self.batch_first is False: 131 | # r_in = r_in.permute(1, 0, 2) 132 | # return r_in 133 | 134 | 135 | # class Widar3(nn.Module): 136 | # def __init__(self, input_shape, input_channel, num_label, n_gru_hidden_units=128, f_dropout_ratio=0.5, batch_first=True): 137 | # super(Widar3, self).__init__() 138 | # self.num_label = num_label 139 | # self.n_gru_hidden_units = n_gru_hidden_units 140 | # self.f_dropout_ratio = f_dropout_ratio 141 | # # [@, T_MAX, 1, 20, 20] 142 | # self.input_shape = input_shape 143 | # self.input_time = input_shape[1] 144 | # self.input_channel = input_shape[2] 145 | # self.input_x, self.input_y = input_shape[3], input_shape[4] 146 | 147 | # self.Tconv1_out_channel=16 148 | # self.Tconv1_kernel_size=5 149 | # self.Tconv1_stride=1 150 | 151 | # self.Tdense1_out = 64 152 | # self.Tdense2_out = 64 153 | 154 | # data_shape = input_shape 155 | # self.Tconv1 = TimeDistributed(nn.Conv2d(input_channel, self.Tconv1_out_channel, self.Tconv1_kernel_size, self.Tconv1_stride), batch_first=batch_first) 156 | # data_shape = [data_shape[0], data_shape[1], self.Tconv1_out_channel, data_shape[3] - 4, data_shape[4] - 4] 157 | # self.Tmaxpool1 = TimeDistributed(nn.MaxPool2d(kernel_size=(2,2))) 158 | # data_shape = [data_shape[0], data_shape[1], data_shape[2], data_shape[3] // 2, data_shape[4] // 2] 159 | # self.Tflatten1 = TimeDistributed(nn.Flatten()) 160 | # data_shape = [data_shape[0], data_shape[1], data_shape[2] * data_shape[3] * data_shape[4]] 161 | # self.Tdense1 = TimeDistributed(nn.Linear(data_shape[2], self.Tdense1_out)) 162 | # data_shape = [data_shape[0], data_shape[1], self.Tdense1_out] 163 | # self.Tdropout1 = TimeDistributed(nn.Dropout(f_dropout_ratio)) 164 | # data_shape = [data_shape[0], data_shape[1], data_shape[2]] 165 | # self.Tdense2 = TimeDistributed(nn.Linear(self.Tdense1_out, self.Tdense2_out)) 166 | # data_shape = [data_shape[0], data_shape[1], self.Tdense2_out] 167 | # self.gru = nn.GRU(input_size=self.Tdense2_out, hidden_size=n_gru_hidden_units, batch_first=batch_first) 168 | 169 | # self.dropout2 = nn.Dropout(f_dropout_ratio, inplace=False) 170 | # # self.dense3 = nn.Sequential(nn.Linear(n_gru_hidden_units, num_label), nn.Softmax(dim=1)) 171 | # # DO NOT USE SOFTMAX 172 | # self.dense3 = nn.Linear(n_gru_hidden_units, num_label) 173 | # data_shape = [data_shape[0], num_label] 174 | 175 | 176 | # # initialize weights 177 | # for m in self.modules(): 178 | # if type(m) == nn.GRU: 179 | # # GRU: weight: orthogonal, recurrent_kernel: glorot_uniform, bias: zero_initializer 180 | # nn.init.orthogonal_(m.weight_ih_l0) 181 | # nn.init.orthogonal_(m.weight_hh_l0) 182 | # nn.init.zeros_(m.bias_ih_l0) 183 | # nn.init.zeros_(m.bias_hh_l0) 184 | # elif type(m) == nn.Linear: 185 | # # Linear: weight: orthogonal, bias: zero_initializer 186 | # nn.init.orthogonal_(m.weight) 187 | # nn.init.zeros_(m.bias) 188 | # elif type(m) == nn.Conv2d: 189 | # # Conv2d: weight: orthogonal, bias: zero_initializer 190 | # nn.init.orthogonal_(m.weight) 191 | # nn.init.zeros_(m.bias) 192 | 193 | 194 | 195 | 196 | 197 | # def forward(self, input): 198 | # # [@, T_MAX, 1, 20, 20] -> [@, T_MAX, 16, 16, 16] 199 | # x = self.Tconv1(input) 200 | # # Relu 201 | # x = F.relu(x) 202 | # # [@, T_MAX, 16, 16, 16] -> [@, T_MAX, 16, 8, 8] 203 | # x = self.Tmaxpool1(x) 204 | # # [@, T_MAX, 16, 8, 8] -> [@, T_MAX, 16*8*8] 205 | # x = self.Tflatten1(x) 206 | # # [@, T_MAX, 16*8*8] -> [@, T_MAX, 64] 207 | # x = self.Tdense1(x) 208 | # # Relu 209 | # x = F.relu(x) 210 | # # [@, T_MAX, 
64] -> [@, T_MAX, 64] 211 | # x = self.Tdropout1(x) 212 | # # [@, T_MAX, 64] -> [@, T_MAX, 64] 213 | # x = self.Tdense2(x) 214 | # # Relu 215 | # x = F.relu(x) 216 | # # [@, T_MAX, 64] -> [@, 128] 217 | # # keras.layers.GRU(n_gru_hidden_units,return_sequences=False)(x) 218 | # x = self.gru(x)[0][:, -1, :] 219 | # x = x.squeeze(0) 220 | # # [@, 128] -> [@, 128] 221 | # x = self.dropout2(x) 222 | # # [@, 128] -> [@, n_class] 223 | # x = self.dense3(x) 224 | # # check any nan 225 | # if torch.isnan(x).any(): 226 | # print('nan') 227 | # print(x) 228 | # exit() 229 | # return x 230 | 231 | def main(): 232 | input = torch.zeros((4, 38, 1, 200, 20)).cuda() 233 | model = Widar3(input.shape, input_channel = 1, num_label = 6, n_gru_hidden_units=128, f_dropout_ratio=0.5).cuda() 234 | o = model(input) 235 | summary(model, input_size=input.size()[1:]) 236 | print(o.size()) 237 | 238 | # if __name__ == '__main__': 239 | # main() -------------------------------------------------------------------------------- /source/model/__init__.py: -------------------------------------------------------------------------------- 1 | from model.UniTS import UniTS 2 | from model.THAT import HARTrans 3 | from model.dual import RFNet 4 | from model.resnet import ResNet 5 | from model.mann import MaDNN, MaCNN 6 | from model.laxcat import LaxCat 7 | from model.static_UniTS import static_UniTS 8 | from model.Widar3 import Widar3 9 | from model.Alexnet import AlexNet 10 | from model.CNN_GRU import CNN_GRU 11 | from model.slnet import SLNet -------------------------------------------------------------------------------- /source/model/dual-dl.py: -------------------------------------------------------------------------------- 1 | #------------ 2 | # Author: Shuya Ding 3 | # Date: Sep 2020 4 | #------------ 5 | import torch 6 | import torch.nn as nn 7 | import torch.nn.functional as F 8 | import torch.optim as optim 9 | import torchvision.models as models 10 | from torch.nn.utils import weight_norm 11 | from torchsummary import summary 12 | 13 | cnn = models.resnet18(weights=None) 14 | cnn = torch.nn.Sequential(*(list(cnn.children())[:-1])) 15 | output_dim = 512 16 | 17 | def block(name, in_feat, out_feat, win_len): 18 | # if name == 'IN': 19 | # layers = [nn.Linear(in_feat, out_feat)] 20 | # layers.append(nn.InstanceNorm1d(cfg.win_len, 0.8)) 21 | # elif name == 'SN': 22 | # layers = [SpectralNorm(nn.Linear(in_feat, out_feat))] 23 | 24 | if name == 'BN': 25 | # INPUT: N*win_len*in_feat 26 | layers = [nn.Linear(in_feat, out_feat)] 27 | layers.append(nn.BatchNorm1d(win_len, 0.8)) 28 | 29 | # elif name == 'ADAIN': 30 | # layers = [nn.Linear(in_feat, out_feat)] 31 | # layers.append(AdaptiveInstanceNorm1d(cfg.win_len)) 32 | else: 33 | layers = [nn.Linear(in_feat, out_feat)] 34 | layers.append(nn.LeakyReLU(0.2, inplace=True)) 35 | return layers 36 | 37 | 38 | class RFNet(nn.Module): 39 | def __init__(self, num_classes, input_channel, win_len, model = cnn): 40 | super(RFNet, self).__init__() 41 | self.fuse = Fusion() 42 | self.model = model 43 | self.hid_dim = 512 44 | 45 | self.fc = nn.Sequential(*block('BN', input_channel * 2, self.hid_dim, win_len)) 46 | self.lstm_time = nn.LSTM(input_channel, self.hid_dim//2) 47 | self.lstm_freq = nn.LSTM(input_channel, self.hid_dim//2) 48 | 49 | 50 | 51 | self.classifier = nn.Linear(output_dim, num_classes) 52 | self.attention = weight_norm(BiAttention( 53 | time_features=self.hid_dim//2, 54 | freq_features=self.hid_dim//2, 55 | mid_features=self.hid_dim, 56 | glimpses=1, 57 | drop=0.5,), 
name='h_weight', dim=None) 58 | 59 | self.apply_attention = ApplyAttention( 60 | time_features=self.hid_dim//2, 61 | freq_features=self.hid_dim//2, 62 | mid_features=self.hid_dim//2, 63 | glimpses=1, 64 | num_obj=512, 65 | drop=0.2, 66 | ) 67 | self.cnn1 = torch.nn.Conv2d(2, 3, kernel_size=3, stride=1, padding=1) 68 | self.fc1 = FCNet(self.hid_dim//2, self.hid_dim//2, 'relu', 0.4) 69 | self.fc2 = FCNet(self.hid_dim//2, self.hid_dim//2, 'relu', 0.4) 70 | self.fc3 = FCNet(self.hid_dim, self.hid_dim, drop = 0.4) 71 | def forward(self,x): 72 | """ 73 | x: sample_size * 512 * 60 74 | """ 75 | 76 | # Time: sample_size * 512 * 60 77 | bs, win_len, dim = x.shape 78 | 79 | 80 | # Freq: sample_size * 512 * 60 81 | x_freq = torch.fft.fft(x.permute(0,2,1).reshape(-1,win_len), dim = -1) 82 | x_real_freq = x_freq.real.reshape(bs,dim,win_len).permute(0,2,1) 83 | x_img_freq = x_freq.imag.reshape(bs,dim,win_len).permute(0,2,1) 84 | x_absolute = torch.sqrt((x_real_freq**2) + (x_img_freq**2)) # sample_size * 512 * 60 85 | 86 | 87 | del x_img_freq, x_real_freq, x_freq 88 | torch.cuda.empty_cache() 89 | 90 | 91 | # Cat + FC 92 | combined = torch.cat([x,x_absolute],-1) # sample_size * 512 * (60*2) 93 | combined = self.fc(combined) # sample_size * 512 * hid 94 | 95 | 96 | 97 | 98 | # CNN Compute 99 | heat_map = combined.view(bs,win_len,self.hid_dim//2,2).permute(0,3,2,1) 100 | heat_map = self.cnn1(heat_map) 101 | feat = self.model(heat_map).squeeze(-1).squeeze(-1) 102 | 103 | del heat_map, combined 104 | torch.cuda.empty_cache() 105 | 106 | 107 | 108 | # Involve attention 109 | time = self.lstm_time(x)[0] #bs, win_len, self.hid_dim // 2 110 | freq = self.lstm_freq(x_absolute)[0] 111 | 112 | del x, x_absolute 113 | torch.cuda.empty_cache() 114 | 115 | atten, logits = self.attention(time, freq) 116 | time, freq = self.apply_attention(time, freq, atten, logits) 117 | 118 | 119 | # Time-Tube 120 | x = self.fc1(time[:,-1,:]) 121 | 122 | # Freq-Tube 123 | x_absolute = self.fc2(freq[:,-1,:]) 124 | 125 | del freq,time 126 | torch.cuda.empty_cache() 127 | 128 | feat = self.fc3(torch.cat([x,x_absolute],-1)) + feat 129 | # Classifier Outputs 130 | pred = self.classifier(feat) 131 | 132 | # del x, x_absolute 133 | torch.cuda.empty_cache() 134 | 135 | return pred#, [x,x_absolute,feat] 136 | 137 | 138 | class FCNet(nn.Module): 139 | def __init__(self, in_size, out_size, activate=None, drop=0.0): 140 | super(FCNet, self).__init__() 141 | self.lin = weight_norm(nn.Linear(in_size, out_size), dim=None) 142 | 143 | self.drop_value = drop 144 | self.drop = nn.Dropout(drop) 145 | 146 | # in case of using upper character by mistake 147 | self.activate = activate.lower() if (activate is not None) else None 148 | if activate == 'relu': 149 | self.ac_fn = nn.ReLU() 150 | elif activate == 'sigmoid': 151 | self.ac_fn = nn.Sigmoid() 152 | elif activate == 'tanh': 153 | self.ac_fn = nn.Tanh() 154 | 155 | 156 | def forward(self, x): 157 | if self.drop_value > 0: 158 | x = self.drop(x) 159 | 160 | x = self.lin(x) 161 | 162 | if self.activate is not None: 163 | x = self.ac_fn(x) 164 | return x 165 | 166 | 167 | class Fusion(nn.Module): 168 | """ Crazy multi-modal fusion: negative squared difference minus relu'd sum 169 | """ 170 | def __init__(self): 171 | super().__init__() 172 | 173 | def forward(self, x, y): 174 | # found through grad student descent ;) 175 | return - (x - y)**2 + F.relu(x + y) 176 | 177 | 178 | class BiAttention(nn.Module): 179 | def __init__(self, time_features, freq_features, mid_features, glimpses, drop=0.0): 180 | 
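# NOTE: BiAttention computes the joint time/frequency attention used by RFNet.
# Both LSTM feature streams are projected into a shared
# (mid_features * hidden_aug)-dimensional space, combined with the learned
# h_weight / h_bias into logits of shape [batch, glimpses, time_num, freq_num],
# and the softmax is taken over the flattened time x freq grid.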
super(BiAttention, self).__init__() 181 | self.hidden_aug = 3 182 | self.glimpses = glimpses 183 | self.lin_time = FCNet(time_features, int(mid_features * self.hidden_aug), activate='relu', drop=drop/2.5) # let self.lin take care of bias 184 | self.lin_freq = FCNet(freq_features, int(mid_features * self.hidden_aug), activate='relu', drop=drop/2.5) 185 | 186 | self.h_weight = nn.Parameter(torch.Tensor(1, glimpses, 1, int(mid_features * self.hidden_aug)).normal_()) 187 | self.h_bias = nn.Parameter(torch.Tensor(1, glimpses, 1, 1).normal_()) 188 | 189 | self.drop = nn.Dropout(drop) 190 | 191 | def forward(self, time, freq): 192 | """ 193 | time = batch, time_num, dim 194 | freq = batch, freq_num, dim 195 | """ 196 | time_num = time.size(1) 197 | freq_num = freq.size(1) 198 | 199 | time_ = self.lin_time(time).unsqueeze(1) # batch, 1, time_num, dim 200 | freq_ = self.lin_freq(freq).unsqueeze(1) # batch, 1, q_num, dim 201 | time_ = self.drop(time_) 202 | 203 | del time, freq 204 | torch.cuda.empty_cache() 205 | 206 | 207 | h_ = time_ * self.h_weight # broadcast: batch x glimpses x time_num x dim 208 | logits = torch.matmul(h_, freq_.transpose(2,3)) # batch x glimpses x time_num x freq_num 209 | 210 | del h_, freq_ 211 | torch.cuda.empty_cache() 212 | 213 | logits = logits + self.h_bias 214 | 215 | 216 | torch.cuda.empty_cache() 217 | 218 | atten = F.softmax(logits.view(-1, self.glimpses, time_num * freq_num), 2) 219 | return atten.view(-1, self.glimpses, time_num, freq_num), logits 220 | 221 | 222 | class ApplyAttention(nn.Module): 223 | def __init__(self, time_features, freq_features, mid_features, glimpses, num_obj, drop=0.0): 224 | super(ApplyAttention, self).__init__() 225 | self.glimpses = glimpses 226 | layers = [] 227 | for g in range(self.glimpses): 228 | layers.append(ApplySingleAttention(time_features, freq_features, mid_features, num_obj, drop)) 229 | self.glimpse_layers = nn.ModuleList(layers) 230 | 231 | def forward(self, time, freq, atten, logits): 232 | """ 233 | time = batch, time_num, dim 234 | freq = batch, freq_num, dim 235 | atten: batch x glimpses x time_num x freq_num 236 | logits: batch x glimpses x time_num x freq_num 237 | """ 238 | time_num = time.shape[1] 239 | freq_num = freq.shape[1] 240 | for g in range(self.glimpses): 241 | atten_h_freq, atten_h_time = self.glimpse_layers[g](time, freq, atten[:,g,:,:], logits[:,g,:,:]) 242 | time = atten_h_time + time 243 | freq = atten_h_freq + freq 244 | del atten_h_time, atten_h_freq 245 | torch.cuda.empty_cache() 246 | return time, freq 247 | 248 | class ApplySingleAttention(nn.Module): 249 | def __init__(self, time_features, freq_features, mid_features, num_obj, drop=0.0): 250 | super(ApplySingleAttention, self).__init__() 251 | self.lin_time = FCNet(time_features, time_features, activate='relu', drop=drop) 252 | self.lin_freq = FCNet(freq_features, freq_features, activate='relu', drop=drop) 253 | 254 | def forward(self, time, freq, atten, logits): 255 | """ 256 | time = batch, time_num, dim 257 | freq = batch, freq_num , dim 258 | 259 | atten: batch x time_num x freq_num 260 | logits: batch x time_num x freq_num 261 | """ 262 | atten_h_time = self.lin_time((time.permute(0,2,1) @ atten).permute(0,2,1)) 263 | del time 264 | torch.cuda.empty_cache() 265 | atten_h_freq = self.lin_freq((freq.permute(0,2,1) @ atten).permute(0,2,1)) 266 | del freq, atten 267 | torch.cuda.empty_cache() 268 | 269 | return atten_h_time, atten_h_freq 270 | 271 | 272 | 273 | def main(): 274 | # [@, T, 1, F] 275 | input = torch.zeros((16, 256, 121, 
90)).cuda() 276 | num_labels = 6 277 | input_channel = 121 278 | input_size = 256 279 | 280 | fc_layer = nn.Linear(90, 1, bias=False) 281 | 282 | class Squeeze(nn.Module): 283 | def __init__(self): 284 | super(Squeeze, self).__init__() 285 | 286 | def forward(self, x): 287 | return x.squeeze(-1) 288 | 289 | model = nn.Sequential( 290 | fc_layer.cuda(), 291 | Squeeze().cuda(), 292 | RFNet(num_classes = num_labels, input_channel = input_channel, win_len = input_size).cuda() 293 | ) 294 | 295 | # model = RFNet(num_classes = num_labels, input_channel = input_channel, win_len = input_size).cuda() 296 | o = model(input) 297 | # summary(model, input_size=input.size()[1:]) 298 | print(o.size()) 299 | 300 | if __name__ == '__main__': 301 | main() -------------------------------------------------------------------------------- /source/model/dual.py: -------------------------------------------------------------------------------- 1 | #------------ 2 | # Author: Shuya Ding 3 | # Date: Sep 2020 4 | #------------ 5 | import torch 6 | import torch.nn as nn 7 | import torch.nn.functional as F 8 | import torch.optim as optim 9 | import utils as utils 10 | import torchvision.models as models 11 | from torch.nn.utils import weight_norm 12 | 13 | 14 | cnn = models.resnet18(weights=None) 15 | cnn = torch.nn.Sequential(*(list(cnn.children())[:-1])) 16 | output_dim = 512 17 | 18 | def block(name, in_feat, out_feat, win_len): 19 | # if name == 'IN': 20 | # layers = [nn.Linear(in_feat, out_feat)] 21 | # layers.append(nn.InstanceNorm1d(cfg.win_len, 0.8)) 22 | # elif name == 'SN': 23 | # layers = [SpectralNorm(nn.Linear(in_feat, out_feat))] 24 | 25 | if name == 'BN': 26 | # INPUT: N*win_len*in_feat 27 | layers = [nn.Linear(in_feat, out_feat)] 28 | layers.append(nn.BatchNorm1d(win_len, 0.8)) 29 | 30 | # elif name == 'ADAIN': 31 | # layers = [nn.Linear(in_feat, out_feat)] 32 | # layers.append(AdaptiveInstanceNorm1d(cfg.win_len)) 33 | else: 34 | layers = [nn.Linear(in_feat, out_feat)] 35 | layers.append(nn.LeakyReLU(0.2, inplace=True)) 36 | return layers 37 | 38 | 39 | class RFNet(nn.Module): 40 | def __init__(self, num_classes, input_channel, win_len, model = cnn): 41 | super(RFNet, self).__init__() 42 | self.fuse = Fusion() 43 | self.model = model 44 | self.hid_dim = 512 45 | 46 | self.fc = nn.Sequential(*block('BN', input_channel * 2, self.hid_dim, win_len)) 47 | self.lstm_time = nn.LSTM(input_channel, self.hid_dim//2) 48 | self.lstm_freq = nn.LSTM(input_channel, self.hid_dim//2) 49 | 50 | 51 | 52 | self.classifier = nn.Linear(output_dim, num_classes) 53 | self.attention = weight_norm(BiAttention( 54 | time_features=self.hid_dim//2, 55 | freq_features=self.hid_dim//2, 56 | mid_features=self.hid_dim, 57 | glimpses=1, 58 | drop=0.5,), name='h_weight', dim=None) 59 | 60 | self.apply_attention = ApplyAttention( 61 | time_features=self.hid_dim//2, 62 | freq_features=self.hid_dim//2, 63 | mid_features=self.hid_dim//2, 64 | glimpses=1, 65 | num_obj=512, 66 | drop=0.2, 67 | ) 68 | self.cnn1 = torch.nn.Conv2d(2, 3, kernel_size=3, stride=1, padding=1) 69 | self.fc1 = FCNet(self.hid_dim//2, self.hid_dim//2, 'relu', 0.4) 70 | self.fc2 = FCNet(self.hid_dim//2, self.hid_dim//2, 'relu', 0.4) 71 | self.fc3 = FCNet(self.hid_dim, self.hid_dim, drop = 0.4) 72 | def forward(self,x, till_layer="final"): 73 | """ 74 | x: sample_size * 512 * 60 75 | """ 76 | 77 | # Time: sample_size * 512 * 60 78 | bs, win_len, dim = x.shape 79 | 80 | 81 | # Freq: sample_size * 512 * 60 82 | x_freq = torch.fft.fft(x.permute(0,2,1).reshape(-1,win_len), dim 
= -1) 83 | x_real_freq = x_freq.real.reshape(bs,dim,win_len).permute(0,2,1) 84 | x_img_freq = x_freq.imag.reshape(bs,dim,win_len).permute(0,2,1) 85 | x_absolute = torch.sqrt((x_real_freq**2) + (x_img_freq**2)) # sample_size * 512 * 60 86 | 87 | 88 | del x_img_freq, x_real_freq, x_freq 89 | torch.cuda.empty_cache() 90 | 91 | 92 | # Cat + FC 93 | combined = torch.cat([x,x_absolute],-1) # sample_size * 512 * (60*2) 94 | combined = self.fc(combined) # sample_size * 512 * hid 95 | 96 | 97 | 98 | 99 | # CNN Compute 100 | heat_map = combined.view(bs,win_len,self.hid_dim//2,2).permute(0,3,2,1) 101 | heat_map = self.cnn1(heat_map) 102 | feat = self.model(heat_map).squeeze(-1).squeeze(-1) 103 | 104 | del heat_map, combined 105 | torch.cuda.empty_cache() 106 | 107 | 108 | 109 | # Involve attention 110 | time = self.lstm_time(x)[0] #bs, win_len, self.hid_dim // 2 111 | freq = self.lstm_freq(x_absolute)[0] 112 | 113 | del x, x_absolute 114 | torch.cuda.empty_cache() 115 | 116 | atten, logits = self.attention(time, freq) 117 | time, freq = self.apply_attention(time, freq, atten, logits) 118 | 119 | 120 | # Time-Tube 121 | x = self.fc1(time[:,-1,:]) 122 | 123 | # Freq-Tube 124 | x_absolute = self.fc2(freq[:,-1,:]) 125 | 126 | del freq,time 127 | torch.cuda.empty_cache() 128 | 129 | feat = self.fc3(torch.cat([x,x_absolute],-1)) + feat 130 | if till_layer == "feat": 131 | return feat 132 | 133 | # Classifier Outputs 134 | pred = self.classifier(feat) 135 | 136 | # del x, x_absolute 137 | torch.cuda.empty_cache() 138 | 139 | return pred#, [x,x_absolute,feat] 140 | 141 | 142 | class FCNet(nn.Module): 143 | def __init__(self, in_size, out_size, activate=None, drop=0.0): 144 | super(FCNet, self).__init__() 145 | self.lin = weight_norm(nn.Linear(in_size, out_size), dim=None) 146 | 147 | self.drop_value = drop 148 | self.drop = nn.Dropout(drop) 149 | 150 | # in case of using upper character by mistake 151 | self.activate = activate.lower() if (activate is not None) else None 152 | if activate == 'relu': 153 | self.ac_fn = nn.ReLU() 154 | elif activate == 'sigmoid': 155 | self.ac_fn = nn.Sigmoid() 156 | elif activate == 'tanh': 157 | self.ac_fn = nn.Tanh() 158 | 159 | 160 | def forward(self, x): 161 | if self.drop_value > 0: 162 | x = self.drop(x) 163 | 164 | x = self.lin(x) 165 | 166 | if self.activate is not None: 167 | x = self.ac_fn(x) 168 | return x 169 | 170 | 171 | class Fusion(nn.Module): 172 | """ Crazy multi-modal fusion: negative squared difference minus relu'd sum 173 | """ 174 | def __init__(self): 175 | super().__init__() 176 | 177 | def forward(self, x, y): 178 | # found through grad student descent ;) 179 | return - (x - y)**2 + F.relu(x + y) 180 | 181 | 182 | class BiAttention(nn.Module): 183 | def __init__(self, time_features, freq_features, mid_features, glimpses, drop=0.0): 184 | super(BiAttention, self).__init__() 185 | self.hidden_aug = 3 186 | self.glimpses = glimpses 187 | self.lin_time = FCNet(time_features, int(mid_features * self.hidden_aug), activate='relu', drop=drop/2.5) # let self.lin take care of bias 188 | self.lin_freq = FCNet(freq_features, int(mid_features * self.hidden_aug), activate='relu', drop=drop/2.5) 189 | 190 | self.h_weight = nn.Parameter(torch.Tensor(1, glimpses, 1, int(mid_features * self.hidden_aug)).normal_()) 191 | self.h_bias = nn.Parameter(torch.Tensor(1, glimpses, 1, 1).normal_()) 192 | 193 | self.drop = nn.Dropout(drop) 194 | 195 | def forward(self, time, freq): 196 | """ 197 | time = batch, time_num, dim 198 | freq = batch, freq_num, dim 199 | """ 200 | 
time_num = time.size(1) 201 | freq_num = freq.size(1) 202 | 203 | time_ = self.lin_time(time).unsqueeze(1) # batch, 1, time_num, dim 204 | freq_ = self.lin_freq(freq).unsqueeze(1) # batch, 1, q_num, dim 205 | time_ = self.drop(time_) 206 | 207 | del time, freq 208 | torch.cuda.empty_cache() 209 | 210 | 211 | h_ = time_ * self.h_weight # broadcast: batch x glimpses x time_num x dim 212 | logits = torch.matmul(h_, freq_.transpose(2,3)) # batch x glimpses x time_num x freq_num 213 | 214 | del h_, freq_ 215 | torch.cuda.empty_cache() 216 | 217 | logits = logits + self.h_bias 218 | 219 | 220 | torch.cuda.empty_cache() 221 | 222 | atten = F.softmax(logits.view(-1, self.glimpses, time_num * freq_num), 2) 223 | return atten.view(-1, self.glimpses, time_num, freq_num), logits 224 | 225 | 226 | class ApplyAttention(nn.Module): 227 | def __init__(self, time_features, freq_features, mid_features, glimpses, num_obj, drop=0.0): 228 | super(ApplyAttention, self).__init__() 229 | self.glimpses = glimpses 230 | layers = [] 231 | for g in range(self.glimpses): 232 | layers.append(ApplySingleAttention(time_features, freq_features, mid_features, num_obj, drop)) 233 | self.glimpse_layers = nn.ModuleList(layers) 234 | 235 | def forward(self, time, freq, atten, logits): 236 | """ 237 | time = batch, time_num, dim 238 | freq = batch, freq_num, dim 239 | atten: batch x glimpses x time_num x freq_num 240 | logits: batch x glimpses x time_num x freq_num 241 | """ 242 | time_num = time.shape[1] 243 | freq_num = freq.shape[1] 244 | for g in range(self.glimpses): 245 | atten_h_freq, atten_h_time = self.glimpse_layers[g](time, freq, atten[:,g,:,:], logits[:,g,:,:]) 246 | time = atten_h_time + time 247 | freq = atten_h_freq + freq 248 | del atten_h_time, atten_h_freq 249 | torch.cuda.empty_cache() 250 | return time, freq 251 | 252 | class ApplySingleAttention(nn.Module): 253 | def __init__(self, time_features, freq_features, mid_features, num_obj, drop=0.0): 254 | super(ApplySingleAttention, self).__init__() 255 | self.lin_time = FCNet(time_features, time_features, activate='relu', drop=drop) 256 | self.lin_freq = FCNet(freq_features, freq_features, activate='relu', drop=drop) 257 | 258 | def forward(self, time, freq, atten, logits): 259 | """ 260 | time = batch, time_num, dim 261 | freq = batch, freq_num , dim 262 | 263 | atten: batch x time_num x freq_num 264 | logits: batch x time_num x freq_num 265 | """ 266 | atten_h_time = self.lin_time((time.permute(0,2,1) @ atten).permute(0,2,1)) 267 | del time 268 | torch.cuda.empty_cache() 269 | atten_h_freq = self.lin_freq((freq.permute(0,2,1) @ atten).permute(0,2,1)) 270 | del freq, atten 271 | torch.cuda.empty_cache() 272 | 273 | return atten_h_time, atten_h_freq 274 | -------------------------------------------------------------------------------- /source/model/laxcat.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import numpy as np 3 | import matplotlib.pyplot as plt 4 | import torch.nn as nn 5 | import torch.nn.functional as F 6 | import random 7 | import copy 8 | import math 9 | 10 | 11 | def dense_interpolation(x, M): 12 | #x: B * C * L 13 | u = [0 for i in range(M)] 14 | for t in range(1, 1 + x.size(2)): 15 | s = M * t / x.size(2) 16 | for m in range(1, M + 1): 17 | w = (1 - abs(s - m) / M)**2 18 | u[m - 1] += w * (x[:, :, t - 1].unsqueeze(-1)) 19 | return torch.cat(u, -1) 20 | 21 | 22 | class input_attention_layer(nn.Module): 23 | def __init__(self, p, j): 24 | super(input_attention_layer, self).__init__() 25 
| self.weight1 = nn.Parameter(torch.ones(p, 1), requires_grad = True) 26 | self.bias1 = nn.Parameter(torch.zeros(j), requires_grad = True) 27 | self.weight2 = nn.Parameter(torch.ones(j, p), requires_grad = True) 28 | self.bias2 = nn.Parameter(torch.zeros(p), requires_grad = True) 29 | 30 | def forward(self, x): 31 | #x: B * p * j * l 32 | l = x.size(3) 33 | h = [0 for i in range(l)] 34 | x = x.transpose(1, 3) 35 | #x: B * l * j * p 36 | for i in range(l): 37 | tmp = F.relu(torch.matmul(x[:, i, :, :], self.weight1).squeeze(-1) + self.bias1) 38 | tmp = F.relu(torch.matmul(tmp, self.weight2) + self.bias2) 39 | #tmp: B * p 40 | attn = F.softmax(tmp, -1).unsqueeze(1) 41 | h[i] = torch.sum(attn * x[:, i, :, :], -1) 42 | h[i] = h[i].unsqueeze(-1) #unsqueeze for cat 43 | #B * j 44 | return torch.cat(h, -1) 45 | #B * j * l 46 | class temporal_attention_layer(nn.Module): 47 | def __init__(self, j, l): 48 | super(temporal_attention_layer, self).__init__() 49 | self.weight1 = nn.Parameter(torch.ones(l, 1), requires_grad = True) 50 | self.bias1 = nn.Parameter(torch.zeros(j), requires_grad = True) 51 | self.weight2 = nn.Parameter(torch.ones(j, l), requires_grad = True) 52 | self.bias2 = nn.Parameter(torch.zeros(l), requires_grad = True) 53 | 54 | def forward(self, x): 55 | #x: B * j * l 56 | tmp = F.relu(torch.matmul(x, self.weight1).squeeze(-1) + self.bias1) 57 | tmp = F.relu(torch.matmul(tmp, self.weight2) + self.bias2) 58 | attn = F.softmax(tmp, -1).unsqueeze(1) 59 | #attn: B * 1 * l 60 | x = torch.sum(attn * x, -1) 61 | return x 62 | 63 | 64 | 65 | class LaxCat(nn.Module): 66 | def __init__(self, input_size, input_channel, num_label, hidden_dim = 32, kernel_size = 64, stride = 16): 67 | super(LaxCat, self).__init__() 68 | l = int((input_size - kernel_size) / stride) + 1 69 | self.Conv1 = nn.ModuleList([ 70 | nn.Conv1d(1, hidden_dim, kernel_size = kernel_size, stride = stride) for _ in range(input_channel)]) 71 | 72 | self.variable_attn = input_attention_layer(p = input_channel, j = hidden_dim) 73 | self.temporal_attn = temporal_attention_layer(j = hidden_dim, l = l) 74 | self.fc = nn.Linear(hidden_dim, num_label) 75 | 76 | def forward(self, x): 77 | x = x.transpose(1, 2) 78 | x = list(x.split(1, 1)) 79 | for i in range(len(x)): 80 | x[i] = self.Conv1[i](x[i]).unsqueeze(-1) 81 | 82 | x = torch.cat(x, -1).permute(0, 3, 1, 2) 83 | #x = F.relu(self.Conv1(x)).reshape(B, C, -1, x.size(0)) 84 | x = self.variable_attn(x) 85 | x = self.temporal_attn(x) 86 | return self.fc(x) 87 | 88 | 89 | 90 | 91 | def main(): 92 | stft_m = LaxCat(input_size = 256, input_channel = 726, num_label = 6).cuda() 93 | 94 | total_params = sum(p.numel() for p in stft_m.parameters()) 95 | print(f'{total_params:,} total parameters.') 96 | 97 | #train_STFT_model(stft_m, window_size = 64, K = 16) 98 | x = torch.zeros(64, 256, 6).cuda() 99 | output = stft_m(x) 100 | print(output.size()) 101 | 102 | 103 | if __name__ == '__main__': 104 | main() 105 | -------------------------------------------------------------------------------- /source/model/layer_maker.py: -------------------------------------------------------------------------------- 1 | import torch.nn as nn 2 | from .norm import GlobalLayerNorm, CumulativeLayerNorm1d 3 | 4 | def choose_nonlinear(name, **kwargs): 5 | if name == 'relu': 6 | nonlinear = nn.ReLU() 7 | elif name == 'sigmoid': 8 | nonlinear = nn.Sigmoid() 9 | elif name == 'softmax': 10 | assert 'dim' in kwargs, "dim is expected for softmax." 
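        # softmax is the only option here that takes kwargs, e.g. choose_nonlinear('softmax', dim=-1) -> nn.Softmax(dim=-1)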
11 | nonlinear = nn.Softmax(**kwargs) 12 | elif name == 'tanh': 13 | nonlinear = nn.Tanh() 14 | elif name == 'leaky-relu': 15 | nonlinear = nn.LeakyReLU() 16 | else: 17 | raise NotImplementedError("Invalid nonlinear function is specified. Choose 'relu' instead of {}.".format(name)) 18 | 19 | return nonlinear 20 | 21 | def choose_rnn(name, **kwargs): 22 | if name == 'rnn': 23 | rnn = nn.RNN(**kwargs) 24 | elif name == 'lstm': 25 | rnn = nn.LSTM(**kwargs) 26 | elif name == 'gru': 27 | rnn = nn.GRU(**kwargs) 28 | else: 29 | raise NotImplementedError("Invalid RNN is specified. Choose 'rnn', 'lstm', or 'gru' instead of {}.".format(name)) 30 | 31 | return rnn 32 | 33 | def choose_layer_norm(name, num_features, causal=False, eps=1e-12, **kwargs): 34 | if name == 'cLN': 35 | layer_norm = CumulativeLayerNorm1d(num_features, eps=eps) 36 | elif name == 'gLN': 37 | if causal: 38 | raise ValueError("Global Layer Normalization is NOT causal.") 39 | layer_norm = GlobalLayerNorm(num_features, eps=eps) 40 | elif name in ['BN', 'batch', 'batch_norm']: 41 | n_dims = kwargs.get('n_dims') or 1 42 | if n_dims == 1: 43 | layer_norm = nn.BatchNorm1d(num_features, eps=eps) 44 | elif n_dims == 2: 45 | layer_norm = nn.BatchNorm2d(num_features, eps=eps) 46 | else: 47 | raise NotImplementedError("n_dims is expected 1 or 2, but give {}.".format(n_dims)) 48 | else: 49 | raise NotImplementedError("Not support {} layer normalization.".format(name)) 50 | 51 | return layer_norm -------------------------------------------------------------------------------- /source/model/mann.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import torch.nn.functional as F 4 | 5 | class MaDNN(nn.Module): 6 | def __init__(self, input_size, input_channel, num_label): 7 | super(MaDNN, self).__init__() 8 | self.Linear1 = nn.ModuleList([nn.Linear(input_size, 128) for _ in range(input_channel)]) 9 | self.Linear2 = nn.ModuleList([nn.Linear(128, 128) for _ in range(input_channel)]) 10 | self.Linear3 = nn.ModuleList([nn.Linear(128, 128) for _ in range(input_channel)]) 11 | self.Linear4 = nn.ModuleList([nn.Linear(128, num_label) for _ in range(input_channel)]) 12 | 13 | self.Linear = nn.Linear(input_channel * num_label, num_label) 14 | def forward(self, x): 15 | input_channel = x.size(2) 16 | x = list(x.split(1, 2)) 17 | for i in range(input_channel): 18 | x[i] = torch.sigmoid(self.Linear1[i](x[i].squeeze(-1))) 19 | x[i] = torch.sigmoid(self.Linear2[i](x[i])) 20 | x[i] = torch.sigmoid(self.Linear3[i](x[i])) 21 | x[i] = torch.sigmoid(self.Linear4[i](x[i])) 22 | x = torch.cat(x, dim = -1) 23 | return self.Linear(x) 24 | 25 | class MaCNN(nn.Module): 26 | def __init__(self, input_size, input_channel, num_label, sensor_num): 27 | super(MaCNN, self).__init__() 28 | self.in_channel = int(input_channel / sensor_num) 29 | self.start_conv = nn.ModuleList([nn.Conv1d(self.in_channel, 128, kernel_size = 3, stride = 1, padding = 1) for _ in range(sensor_num)]) 30 | self.conv1 = nn.ModuleList([nn.Conv1d(128, 128, kernel_size = 3, stride = 1, padding = 1) for _ in range(sensor_num)]) 31 | self.pool1 = nn.ModuleList([nn.AvgPool1d(kernel_size = 2) for _ in range(sensor_num)]) 32 | self.conv2 = nn.ModuleList([nn.Conv1d(128, 128, kernel_size = 3, stride = 1, padding = 1) for _ in range(sensor_num)]) 33 | self.pool2 = nn.ModuleList([nn.AvgPool1d(kernel_size = 2)for _ in range(sensor_num)]) 34 | self.conv3 = nn.ModuleList([nn.Conv1d(128, 128, kernel_size = 3, stride = 1, padding = 1) for _ 
in range(sensor_num)]) 35 | self.pool3 = nn.ModuleList([nn.AvgPool1d(kernel_size = 2)for _ in range(sensor_num)]) 36 | self.conv4 = nn.ModuleList([nn.Conv1d(128, 128, kernel_size = 3, stride = 1, padding = 1) for _ in range(sensor_num)]) 37 | self.pool4 = nn.ModuleList([nn.AvgPool1d(kernel_size = 2)for _ in range(sensor_num)]) 38 | self.conv5 = nn.ModuleList([nn.Conv1d(128, 128, kernel_size = 3, stride = 1, padding = 1) for _ in range(sensor_num)]) 39 | self.pool5 = nn.ModuleList([nn.AvgPool1d(kernel_size = 2)for _ in range(sensor_num)]) 40 | 41 | 42 | self.end_conv = nn.ModuleList([nn.Conv1d(128, 1, kernel_size = 1, stride = 1) for _ in range(sensor_num)]) 43 | self.Linear = nn.Linear(int(input_size / 32) * sensor_num, num_label) 44 | 45 | 46 | def forward(self, x): 47 | x = x.reshape(x.size(0), x.size(1), -1, self.in_channel).transpose(1, 3) 48 | sensor_num = x.size(2) 49 | x = list(x.split(1, 2)) 50 | for i in range(sensor_num): 51 | x[i] = F.relu(self.start_conv[i](x[i].squeeze(2))) 52 | 53 | x[i] = F.relu(self.pool1[i](self.conv1[i](x[i]))) 54 | x[i] = F.relu(self.pool2[i](self.conv2[i](x[i]))) 55 | x[i] = F.relu(self.pool3[i](self.conv3[i](x[i]))) 56 | x[i] = F.relu(self.pool4[i](self.conv4[i](x[i]))) 57 | x[i] = F.relu(self.pool5[i](self.conv5[i](x[i]))) 58 | x[i] = F.relu(self.end_conv[i](x[i])).squeeze(1) 59 | 60 | x[i] = x[i].view(x[i].size(0), -1) 61 | x = torch.cat(x, dim = -1) 62 | return self.Linear(x) 63 | 64 | 65 | def main(): 66 | input = torch.zeros((4, 256, 726)).cuda() 67 | model = MaCNN(input_size = 256, input_channel = 726, num_label = 6, sensor_num = 6).cuda() 68 | o = model(input) 69 | print(o.size()) 70 | 71 | 72 | 73 | if __name__ == '__main__': 74 | main() 75 | -------------------------------------------------------------------------------- /source/model/norm.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | 4 | EPS = 1e-12 5 | 6 | """ 7 | Global layer normalization 8 | See "Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech Separation" 9 | https://arxiv.org/abs/1809.07454 10 | """ 11 | class GlobalLayerNorm(nn.Module): 12 | def __init__(self, num_features, eps=EPS): 13 | super().__init__() 14 | 15 | self.num_features = num_features 16 | self.eps = eps 17 | 18 | self.norm = nn.GroupNorm(1, num_features, eps=eps) 19 | 20 | def forward(self, input): 21 | """ 22 | Args: 23 | input (batch_size, C, *) 24 | Returns: 25 | output (batch_size, C, *) 26 | """ 27 | output = self.norm(input) 28 | 29 | return output 30 | 31 | def __repr__(self): 32 | s = '{}'.format(self.__class__.__name__) 33 | s += '({num_features}, eps={eps})' 34 | 35 | return s.format(**self.__dict__) 36 | 37 | """ 38 | Cumulative layer normalization 39 | See "Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech Separation" 40 | https://arxiv.org/abs/1809.07454 41 | """ 42 | class CumulativeLayerNorm1d(nn.Module): 43 | def __init__(self, num_features, eps=EPS): 44 | super().__init__() 45 | 46 | self.num_features = num_features 47 | self.eps = eps 48 | 49 | self.gamma = nn.Parameter(torch.Tensor(1, num_features, 1)) 50 | self.beta = nn.Parameter(torch.Tensor(1, num_features, 1)) 51 | 52 | self._reset_parameters() 53 | 54 | def _reset_parameters(self): 55 | self.gamma.data.fill_(1) 56 | self.beta.data.zero_() 57 | 58 | def forward(self, input): 59 | """ 60 | Args: 61 | input (batch_size, C, T) or (batch_size, C, S, chunk_size): 62 | Returns: 63 | output (batch_size, C, T) or 
(batch_size, C, S, chunk_size): same shape as the input 64 | """ 65 | eps = self.eps 66 | 67 | n_dim = input.dim() 68 | 69 | if n_dim == 3: 70 | batch_size, C, T = input.size() 71 | elif n_dim == 4: 72 | batch_size, C, S, chunk_size = input.size() 73 | T = S * chunk_size 74 | input = input.view(batch_size, C, T) 75 | else: 76 | raise ValueError("Only support 3D or 4D input, but given {}D".format(input.dim())) 77 | 78 | step_sum = input.sum(dim=1) # -> (batch_size, T) 79 | input_pow = input**2 80 | step_pow_sum = input_pow.sum(dim=1) # -> (batch_size, T) 81 | cum_sum = torch.cumsum(step_sum, dim=1) # -> (batch_size, T) 82 | cum_squared_sum = torch.cumsum(step_pow_sum, dim=1) # -> (batch_size, T) 83 | 84 | cum_num = torch.arange(C, C*(T+1), C, dtype=torch.float) # -> (T, ): [C, 2*C, ..., T*C] 85 | cum_mean = cum_sum / cum_num # (batch_size, T) 86 | cum_squared_mean = cum_squared_sum / cum_num 87 | cum_var = cum_squared_mean - cum_mean**2 88 | 89 | cum_mean = cum_mean.unsqueeze(dim=1) 90 | cum_var = cum_var.unsqueeze(dim=1) 91 | 92 | output = (input - cum_mean) / (torch.sqrt(cum_var) + eps) * self.gamma + self.beta 93 | 94 | if n_dim == 4: 95 | output = output.view(batch_size, C, S, chunk_size) 96 | 97 | return output 98 | 99 | def __repr__(self): 100 | s = '{}'.format(self.__class__.__name__) 101 | s += '({num_features}, eps={eps})' 102 | 103 | return s.format(**self.__dict__) 104 | 105 | if __name__ == '__main__': 106 | batch_size, C, T = 2, 3, 5 107 | causal = True 108 | 109 | norm = GlobalLayerNorm(C) 110 | print(norm) 111 | 112 | input = torch.arange(batch_size*C*T, dtype=torch.float).view(batch_size, C, T) 113 | output = norm(input) 114 | print(input) 115 | print(output) -------------------------------------------------------------------------------- /source/model/resnet.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import numpy as np 3 | import torch.nn as nn 4 | import torch.nn.functional as F 5 | import random 6 | import copy 7 | import math 8 | 9 | class resConv1dBlock(nn.Module): 10 | def __init__(self, in_channels, out_channels, kernel_size, stride, layer_num): 11 | super(resConv1dBlock, self).__init__() 12 | self.layer_num = layer_num 13 | self.conv1 = nn.ModuleList([ 14 | nn.Conv1d(in_channels = in_channels, out_channels = 2 * in_channels, kernel_size = kernel_size, stride = stride, padding = int((kernel_size - 1) / 2) ) 15 | for i in range(layer_num)]) 16 | 17 | self.bn1 = nn.ModuleList([ 18 | nn.BatchNorm1d(2 * in_channels) 19 | for i in range(layer_num)]) 20 | 21 | self.conv2 = nn.ModuleList([ 22 | nn.Conv1d(in_channels = 2 * in_channels, out_channels = out_channels, kernel_size = kernel_size, stride = stride, padding = int((kernel_size - 1) / 2) ) 23 | for i in range(layer_num)]) 24 | 25 | self.bn2 = nn.ModuleList([ 26 | nn.BatchNorm1d(out_channels) 27 | for i in range(layer_num)]) 28 | 29 | def forward(self, x): 30 | for i in range(self.layer_num): 31 | tmp = F.relu(self.bn1[i](self.conv1[i](x))) 32 | x = F.relu(self.bn2[i](self.conv2[i](tmp)) + x) 33 | return x 34 | 35 | 36 | class ResNet(nn.Module): 37 | def __init__(self, input_size, input_channel, num_label): 38 | super(ResNet, self).__init__() 39 | self.conv1 = nn.Conv1d(input_channel, 64, kernel_size = 1, stride = 1) 40 | self.res1 = resConv1dBlock(64, 64, kernel_size = 3, stride = 1, layer_num = 3) 41 | self.pool1 = nn.AvgPool1d(kernel_size = 2) 42 | 43 | self.conv2 = nn.Conv1d(64, 128, kernel_size = 1, stride = 1) 44 | self.res2 = resConv1dBlock(128, 128, 
kernel_size = 3, stride = 1, layer_num = 4) 45 | self.pool2 = nn.AvgPool1d(kernel_size = 2) 46 | 47 | self.conv3 = nn.Conv1d(128, 256, kernel_size = 1, stride = 1) 48 | self.res3 = resConv1dBlock(256, 256, kernel_size = 3, stride = 1, layer_num = 7) 49 | self.pool3 = nn.AvgPool1d(kernel_size = 2) 50 | 51 | self.conv4 = nn.Conv1d(256, 128, kernel_size = 1, stride = 1) 52 | self.res4 = resConv1dBlock(128, 128, kernel_size = 3, stride = 1, layer_num = 4) 53 | self.pool = nn.AvgPool1d(kernel_size = int(input_size / 8)) 54 | 55 | self.fc = nn.Linear(128, num_label) 56 | 57 | def forward(self, x, til_layer="final"): 58 | x = x.transpose(1, 2) 59 | x = F.relu(self.conv1(x)) 60 | x = self.pool1(self.res1(x)) 61 | if til_layer == 'res1': 62 | return x 63 | x = F.relu(self.conv2(x)) 64 | x = self.pool2(self.res2(x)) 65 | if til_layer == 'res2': 66 | return x 67 | x = F.relu(self.conv3(x)) 68 | x = self.pool3(self.res3(x)) 69 | if til_layer == 'res3': 70 | return x 71 | x = F.relu(self.conv4(x)) 72 | x = self.pool(self.res4(x)) 73 | if til_layer == 'res4': 74 | return x 75 | x = x.view(x.size(0), -1) 76 | return self.fc(x) 77 | 78 | def main(): 79 | input = torch.zeros((4, 256, 45)).cuda() 80 | model = ResNet(input_size = 256, input_channel = 45, num_label = 6).cuda() 81 | o = model(input) 82 | print(o.size()) 83 | 84 | if __name__ == '__main__': 85 | main() -------------------------------------------------------------------------------- /source/model/slnet.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import numpy as np 3 | import torch.nn as nn 4 | import torch.nn.functional as F 5 | from torchsummary import summary 6 | from torchvision.models import AlexNet 7 | 8 | import os, sys, math 9 | import numpy as np 10 | import scipy.io as scio 11 | import torch, torchvision 12 | import torch.nn as nn 13 | from torch import sigmoid 14 | from torch.fft import fft, ifft 15 | from torch.nn.functional import relu 16 | from torch import sigmoid 17 | 18 | class m_Linear(nn.Module): 19 | def __init__(self, size_in, size_out): 20 | super().__init__() 21 | self.size_in, self.size_out = size_in, size_out 22 | 23 | # Creation 24 | self.weights_real = nn.Parameter(torch.randn(size_in, size_out, dtype=torch.float32)) 25 | self.weights_imag = nn.Parameter(torch.randn(size_in, size_out, dtype=torch.float32)) 26 | self.bias = nn.Parameter(torch.randn(2, size_out, dtype=torch.float32)) 27 | 28 | # Initialization 29 | nn.init.xavier_uniform_(self.weights_real, gain=1) 30 | nn.init.xavier_uniform_(self.weights_imag, gain=1) 31 | nn.init.zeros_(self.bias) 32 | 33 | def swap_real_imag(self, x): 34 | # [@,*,2,Hout] 35 | # [real, imag] => [-1*imag, real] 36 | h = x # [@,*,2,Hout] 37 | h = h.flip(dims=[-2]) # [@,*,2,Hout] [real, imag]=>[imag, real] 38 | h = h.transpose(-2,-1) # [@,*,Hout,2] 39 | h = h * torch.tensor([-1,1]).cuda() # [@,*,Hout,2] [imag, real]=>[-1*imag, real] 40 | h = h.transpose(-2,-1) # [@,*,2,Hout] 41 | 42 | return h 43 | 44 | def forward(self, x): 45 | # x: [@,*,2,Hin] 46 | # Note: torch.mm function doesn't support broadcasting 47 | 48 | h = x # [@,*,2,Hin] 49 | 50 | h1 = torch.matmul(h, self.weights_real) # [@,*,2,Hout] 51 | h2 = torch.matmul(h, self.weights_imag) # [@,*,2,Hout] 52 | h2 = self.swap_real_imag(h2) # [@,*,2,Hout] 53 | h = h1 + h2 # [@,*,2,Hout] 54 | h = torch.add(h, self.bias) # [@,*,2,Hout]+[2,Hout]=>[@,*,2,Hout] 55 | 56 | return h 57 | 58 | class m_cconv3d(nn.Module): 59 | def __init__(self, D_cnt, F_bins, T_bins): 60 | 
super().__init__() 61 | # input: (@,2,D_cnt,F_bins,T_bins) 62 | # output: (@,2,D_cnt,F_bins,T_bins) 63 | # kernel: (D_cnt,F_bins,T_bins) 64 | # bias: (D_cnt,1,T_bins) 65 | 66 | # Parameters offloading 67 | self.D_cnt, self.F_bins, self.T_bins = D_cnt, F_bins, T_bins 68 | 69 | # Parameter initialization 70 | self.fc_1 = nn.Linear(self.F_bins, self.F_bins) 71 | self.fc_2 = nn.Linear(self.F_bins, self.F_bins) 72 | 73 | # Meta matrix for DFT/IDFT 74 | F_range = torch.arange(self.F_bins) 75 | phase = -2 * math.pi * (F_range * F_range.reshape((self.F_bins,1))) / self.F_bins 76 | self.DFT_COS_M = torch.cos(phase).cuda() 77 | self.DFT_SIN_M = torch.sin(phase).cuda() 78 | 79 | def _mul_complex(self, a, b): 80 | # a: (@,2,D,T,F), b: (@,2,D,T,F) 81 | h1 = a * b # (@,2,D,T,F) 82 | ret_real = h1[:,0,:] - h1[:,1,:] # (@,D,T,F) 83 | 84 | h2 = a * b.flip(dims=[1]) # (@,2,D,T,F) 85 | ret_imag = h2[:,0,:] + h2[:,1,:] # (@,D,T,F) 86 | ret = torch.stack((ret_real,ret_imag), axis=1) # (@,D,T,F)=>(@,2,D,T,F) 87 | return ret 88 | 89 | def _fft_complex(self, x): 90 | return self._dft_complex(x) 91 | 92 | def _ifft_complex(self, x): 93 | return self._idft_complex(x) 94 | 95 | def _dft_complex(self, x): 96 | h = x # (@,2,D,T,F) 97 | 98 | ret_real = torch.matmul(h[:,0,:], self.DFT_COS_M) - torch.matmul(h[:,1,:], self.DFT_SIN_M) 99 | ret_imag = torch.matmul(h[:,0,:], self.DFT_SIN_M) + torch.matmul(h[:,1,:], self.DFT_COS_M) 100 | 101 | ret = torch.stack((ret_real, ret_imag), axis=1) # (@,D,T,F)=>(@,2,D,T,F) 102 | return ret 103 | 104 | def _idft_complex(self, x): 105 | h = x # (@,2,D,T,F) 106 | 107 | ret_real = torch.matmul(h[:,0,:], self.DFT_COS_M)/self.F_bins + torch.matmul(h[:,1,:], self.DFT_SIN_M)/self.F_bins 108 | ret_imag = torch.matmul(h[:,0,:], -1 * self.DFT_SIN_M)/self.F_bins + torch.matmul(h[:,1,:], self.DFT_COS_M)/self.F_bins 109 | 110 | ret = torch.stack((ret_real, ret_imag), axis=1) # (@,D,T,F)=>(@,2,D,T,F) 111 | return ret 112 | 113 | def _cconv(self, x, y): 114 | # cconv with 'N*fft(ifft(a)*ifft(b))' 115 | x_ifft = self._ifft_complex(x) # (@,2,D,T,F) 116 | y_ifft = self._ifft_complex(y) # (@,2,D,T,F) 117 | h = self._mul_complex(x_ifft, y_ifft) # (@,2,D,T,F) 118 | ret = self.F_bins * self._fft_complex(h) # (@,2,D,T,F) 119 | return ret 120 | 121 | def forward(self, x): 122 | h = x # (@,2,D,F,T) 123 | 124 | h = h.permute(0,1,2,4,3) # (@,2,D,F,T)=>(@,2,D,T,F) 125 | 126 | w = h # (@,2,D,T,F) 127 | w = self.fc_1(w) # (@,2,D,T,F) 128 | w = self.fc_2(w) # (@,2,D,T,F) 129 | w = torch.nn.Softmax(dim=-1)(w) 130 | h2 = self._cconv(h,w) # (@,2,D,T,F) 131 | 132 | output = h2.permute(0,1,2,4,3) # (@,2,D,T,F)=>(@,2,D,F,T) 133 | return output 134 | 135 | class m_Filtering(nn.Module): 136 | def __init__(self, D_cnt, F_bins, T_bins): 137 | super().__init__() 138 | # input: (@,2,D_cnt,F_bins,T_bins) 139 | # output: (@,2,D_cnt,F_bins,T_bins) 140 | 141 | # Parameters offloading 142 | self.D_cnt, self.F_bins, self.T_bins = D_cnt, F_bins, T_bins 143 | 144 | # Parameter initialization 145 | self.fc_1 = nn.Linear(self.F_bins, self.F_bins) 146 | 147 | def forward(self, x): 148 | h = x # (@,2,D,F,T) 149 | 150 | # Self-attention to generate filter weights 151 | w = torch.linalg.norm(h,dim=1) # (@,2,D,F,T)=>(@,D,F,T) 152 | w = w.permute(0,1,3,2) # (@,D,F,T)=>(@,D,T,F) 153 | w = sigmoid(self.fc_1(w)) # (@,D,T,F) 154 | w = w.permute(0,1,3,2) # (@,D,T,F)=>(@,D,F,T) 155 | w = w.unsqueeze(dim=1) # (@,D,F,T)=>(@,1,D,F,T) 156 | 157 | output = h * w # (@,2,D,F,T) 158 | return output 159 | 160 | class m_pconv3d(nn.Module): 161 | def 
__init__(self, in_channels, out_channels, kernel_size, stride, is_front_pconv_layer): 162 | super().__init__() 163 | # input: (@,2,C_in,D,F,T) ~ e.g., (@,2,1,6,121,T) 164 | # output: (@,2,C_out,D_out,F_out,T_out) 165 | 166 | # Parameters offloading 167 | self.in_channels = in_channels 168 | self.out_channels = out_channels 169 | self.kernel_size = kernel_size 170 | self.stride = stride 171 | self.is_front_pconv_layer = is_front_pconv_layer 172 | 173 | # Conventional Convolution Initialization 174 | self.conv3d = nn.Conv3d(in_channels=self.in_channels, out_channels=self.out_channels,\ 175 | kernel_size=self.kernel_size, stride=self.stride) 176 | 177 | def _polarize(self, h, F): 178 | # Abandon original phase of F 179 | # Polarize zone: [-pi/2, pi/2] 180 | h = h.permute(0,2,3,5,1,4) # (@,2,C_in,D,F,T)=>(@,C_in,D,T,2,F) 181 | h = torch.linalg.norm(h,dim=4) # (@,C_in,D,T,2,F)=>(@,C_in,D,T,F) 182 | cos_matrix = torch.cos(torch.linspace(-1*math.pi/1, 1*math.pi/1, F)) # (F,) 183 | sin_matrix = torch.sin(torch.linspace(-1*math.pi/1, 1*math.pi/1, F)) # (F,) 184 | 185 | h_cos = h * cos_matrix.cuda() # (@,C_in,D,T,F) 186 | h_sin = h * sin_matrix.cuda() # (@,C_in,D,T,F) 187 | 188 | h = torch.stack((h_cos,h_sin), axis=1) # (@,C_in,D,T,F)=>(@,2,C_in,D,T,F) 189 | 190 | h_polarized = h.permute(0,1,2,3,5,4) # (@,2,C_in,D,T,F)=>(@,2,C_in,D,F,T) 191 | return h_polarized 192 | 193 | def forward(self, x): 194 | h = x # (@,2,C_in,D,F,T) ~ e.g., (@,2,1,6,121,T) 195 | [D,F,T] = x.shape[3:] 196 | 197 | # Polarize for the front pconv layer 198 | if self.is_front_pconv_layer: 199 | h = self._polarize(h, F) # (@,2,C_in,D,F,T) 200 | 201 | # Conventional 3D Convolution 202 | h = h.reshape((-1,self.in_channels,D,F,T)) # (@,2,C_in,D,F,T)=>(@*2,C_in,D,F,T) 203 | h = self.conv3d(h) # (@*2,C_in,D,F,T)=>(@*2,C_out,D_out,F_out,T_out) 204 | output = h.reshape((-1,2)+h.shape[1:]) # (@*2,C_out,D_out,F_out,T_out)=>(@,2,C_out,D_out,F_out,T_out) 205 | return output 206 | 207 | 208 | 209 | class SLNet(nn.Module): 210 | # Need customization 211 | def __init__(self, input_shape, class_num): 212 | super(SLNet, self).__init__() 213 | self.input_shape = input_shape # [2,R,6,121,T_MAX] 214 | self.IN_CHANNEL = input_shape[1] 215 | self.NUM_RX = input_shape[2] 216 | self.T_MAX = input_shape[4] 217 | self.class_num = class_num 218 | 219 | # pconv+FC 220 | self.complex_fc_1 = m_Linear(32*7, 128) 221 | self.complex_fc_2 = m_Linear(128, 64) 222 | self.fc_1 = nn.Linear(32*7, 128) 223 | self.fc_2 = nn.Linear(128, 64) 224 | self.fc_3 = nn.Linear(64, 32) 225 | self.fc_4 = nn.Linear(self.NUM_RX*(self.T_MAX-8)*32, 256) 226 | self.fc_5 = nn.Linear(256, 128) 227 | self.fc_out = nn.Linear(128, self.class_num) 228 | self.dropout_1 = nn.Dropout(p=0.2) 229 | self.dropout_2 = nn.Dropout(p=0.3) 230 | self.dropout_3 = nn.Dropout(p=0.4) 231 | self.dropout_4 = nn.Dropout(p=0.2) 232 | self.dropout_5 = nn.Dropout(p=0.2) 233 | self.pconv3d_1 = m_pconv3d(in_channels=self.IN_CHANNEL,out_channels=16,kernel_size=[1,5,5],stride=[1,1,1],is_front_pconv_layer=True) 234 | self.pconv3d_2 = m_pconv3d(in_channels=16,out_channels=32,kernel_size=[1,5,5],stride=[1,1,1],is_front_pconv_layer=False) 235 | self.mpooling3d_1 = nn.MaxPool3d(kernel_size=[1,3,1],stride=[1,3,1]) 236 | self.mpooling3d_2 = nn.MaxPool3d(kernel_size=[1,5,1],stride=[1,5,1]) 237 | 238 | def forward(self, x): 239 | h = x # [@,2,R,C,F,T] ~ (@,2,1,6,121,T_MAX) or (@,2,3,6,121,T_MAX) 240 | 241 | # pconv 242 | h = self.pconv3d_1(h) # (@,2,R,6,121,T_MAX)=>(@,2,16,6,117,T_MAX-4) 243 | h = 
h.reshape((-1,16,self.NUM_RX,117,self.T_MAX-4)) # (@,2,16,6,117,T_MAX-4)=>(@*2,16,6,117,T_MAX-4) 244 | h = self.mpooling3d_1(h) # (@*2,16,6,117,T_MAX-4)=>(@*2,16,6,39,T_MAX-4) 245 | h = h.reshape((-1,2,16,self.NUM_RX,39,self.T_MAX-4)) # (@*2,16,6,39,T_MAX-4)=>(@,2,16,6,39,T_MAX-4) 246 | 247 | h = self.pconv3d_2(h) # (@,2,16,6,39,T_MAX-4)=>(@,2,32,6,35,T_MAX-8) 248 | h = h.reshape((-1,32,self.NUM_RX,35,self.T_MAX-8)) # (@,2,32,6,35,T_MAX-8)=>(@*2,32,6,35,T_MAX-8) 249 | h = self.mpooling3d_2(h) # (@*2,32,6,35,T_MAX-8)=>(@*2,32,6,7,T_MAX-8) 250 | h = h.reshape((-1,2,32,self.NUM_RX,7,self.T_MAX-8)) # (@*2,32,6,7,T_MAX-8)=>(@,2,32,6,7,T_MAX-8) 251 | 252 | # Complex FC 253 | h = h.permute(0,3,5,1,2,4) # (@,2,32,6,7,T_MAX-8)=>(@,6,T_MAX-8,2,32,7) 254 | h = h.reshape((-1,self.NUM_RX,self.T_MAX-8,2,32*7)) # (@,6,T_MAX-8,2,32,7)=>(@,6,T_MAX-8,2,32*7) 255 | h = self.dropout_1(h) 256 | h = self.complex_fc_1(h) # (@,6,T_MAX-8,2,32*7)=>(@,6,T_MAX-8,2,128) 257 | h = self.dropout_2(h) 258 | h = self.complex_fc_2(h) # (@,6,T_MAX-8,2,128)=>(@,6,T_MAX-8,2,64) 259 | 260 | # FC 261 | h = torch.linalg.norm(h,dim=3) # (@,6,T_MAX-8,2,64)=>(@,6,T_MAX-8,64) 262 | h = relu(self.fc_3(h)) # (@,6,T_MAX-8,64)=>(@,6,T_MAX-8,32) 263 | h = h.reshape((-1,self.NUM_RX*(self.T_MAX-8)*32)) # (@,6,T_MAX-8,32)=>(@,6*(T_MAX-8)*32) 264 | h = self.dropout_3(h) 265 | h = relu(self.fc_4(h)) # (@,6*(T_MAX-8)*32)=>(@,256) 266 | h = self.dropout_4(h) 267 | h = relu(self.fc_5(h)) # (@,256)=>(@,128) 268 | h = self.dropout_5(h) 269 | output = self.fc_out(h) # (@,128)=>(@,n_class) (No need for activation when using CrossEntropyLoss) 270 | 271 | return output 272 | 273 | def main(): 274 | # [@, T, 1, F] 275 | input = torch.zeros((8,2,1,1,121,256)).cuda() 276 | model = SLNet(input.shape[1:], class_num = 6).cuda() 277 | o = model(input) 278 | summary(model, input_size=input.size()[1:]) 279 | print(o.size()) 280 | 281 | if __name__ == '__main__': 282 | main() -------------------------------------------------------------------------------- /source/model/static_UniTS.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import numpy as np 3 | import torch.nn as nn 4 | import torch.nn.functional as F 5 | import random 6 | import copy 7 | import math 8 | import matplotlib.pyplot as plt 9 | 10 | 11 | 12 | 13 | class FIC(nn.Module): 14 | def __init__(self, window_size, stride, k = 0): 15 | super(FIC, self).__init__() 16 | if k == 0: 17 | k = int(window_size / 2) 18 | self.k = k 19 | self.window_size = window_size 20 | self.stride = stride 21 | #self.conv = nn.Conv1d(in_channels = 1, out_channels = 2 * k, kernel_size = window_size, 22 | # stride = stride, padding = 0, bias = False) 23 | #self.init() 24 | 25 | def forward(self, x): 26 | # x: B * C * L 27 | B, C = x.size(0), x.size(1) 28 | x = torch.reshape(x, (B * C, -1)) 29 | x = torch.stft(x, n_fft = self.window_size, hop_length = self.stride, onesided = True, center = False, return_complex = True) 30 | x = x[:, 1:, :] 31 | x = torch.stack([x.real, x.imag], dim = 2) 32 | x = x.permute(0, 2, 1, 3).reshape(B, C, -1, x.size(-1)) 33 | return x # B * C * fc * L 34 | 35 | 36 | 37 | 38 | 39 | class TSEnc(nn.Module): 40 | def __init__(self, window_size, stride, k): 41 | super(TSEnc, self).__init__() 42 | self.k = k 43 | self.window_size = window_size 44 | self.FIC = FIC(window_size = window_size, stride = stride).cuda() 45 | self.RPC = nn.Conv1d(1, 2*k, kernel_size = window_size, stride = stride) 46 | 47 | 48 | def forward(self, x): 49 | x = x.permute(0, 2, 
1) 50 | h_f = self.FIC(x) 51 | #print(h_f.size()) 52 | 53 | #print(self.k) 54 | h_f_pos, idx_pos = (torch.abs(h_f)).topk(2*self.k, dim = -2, largest = True, sorted = True) 55 | o_f_pos = torch.cat( (h_f_pos, idx_pos.type(torch.Tensor).to(h_f_pos.device )) , -2) 56 | 57 | #print(o_f_pos.size()) 58 | #h_f_neg, idx_neg = (-F.relu(-h_f)).topk(self.k, dim = -2, largest = False, sorted = True) 59 | #o_f_neg = torch.cat( (h_f_neg, idx_neg.type(torch.Tensor).to(h_f_neg.device )) , -2) 60 | 61 | 62 | B, C = x.size(0), x.size(1) 63 | x = torch.reshape(x, (B*C, -1)).unsqueeze(1) 64 | o_t = self.RPC(x) 65 | o_t = torch.reshape(o_t, (B, C, -1, o_t.size(-1))) 66 | 67 | o = torch.cat((o_t, o_f_pos), -2) 68 | #print(o.size()) 69 | return o 70 | 71 | 72 | class resConv1dBlock(nn.Module): 73 | def __init__(self, in_channels, kernel_size, stride, layer_num): 74 | super(resConv1dBlock, self).__init__() 75 | self.layer_num = layer_num 76 | self.conv1 = nn.ModuleList([ 77 | nn.Conv1d(in_channels = in_channels, out_channels = 2 * in_channels, kernel_size = kernel_size, stride = stride, padding = int((kernel_size - 1) / 2) ) 78 | for i in range(layer_num)]) 79 | 80 | self.bn1 = nn.ModuleList([ 81 | nn.BatchNorm1d(2 * in_channels) 82 | for i in range(layer_num)]) 83 | 84 | self.conv2 = nn.ModuleList([ 85 | nn.Conv1d(in_channels = 2 * in_channels, out_channels = in_channels, kernel_size = kernel_size, stride = stride, padding = int((kernel_size - 1) / 2) ) 86 | for i in range(layer_num)]) 87 | 88 | self.bn2 = nn.ModuleList([ 89 | nn.BatchNorm1d(in_channels) 90 | for i in range(layer_num)]) 91 | 92 | def forward(self, x): 93 | for i in range(self.layer_num): 94 | tmp = F.relu(self.bn1[i](self.conv1[i](x))) 95 | x = F.relu(self.bn2[i](self.conv2[i](tmp)) + x) 96 | return x 97 | 98 | 99 | class static_UniTS(nn.Module): 100 | def __init__(self, input_size, sensor_num, layer_num, 101 | window_list, stride_list, k_list, out_dim, hidden_channel = 128): 102 | super(static_UniTS, self).__init__() 103 | assert len(window_list) == len(stride_list) 104 | assert len(window_list) == len(k_list) 105 | self.hidden_channel = hidden_channel 106 | self.window_list = window_list 107 | 108 | self.ts_encoders = nn.ModuleList([ 109 | TSEnc(window_list[i], stride_list[i], k_list[i]) for i in range(len(window_list)) 110 | ]) 111 | self.num_frequency_channel = [6 * k_list[i] for i in range(len(window_list))] 112 | self.current_size = [1 + int((input_size - window_list[i]) / stride_list[i]) for i in range(len(window_list))] 113 | # o.size(): B * C * num_frequency_channel * current_size 114 | self.multi_channel_fusion = nn.ModuleList([nn.ModuleList() for _ in range(len(window_list))]) 115 | self.conv_branches = nn.ModuleList([nn.ModuleList() for _ in range(len(window_list))]) 116 | self.end_linear = nn.ModuleList([]) 117 | self.bns = nn.ModuleList([nn.BatchNorm1d(self.hidden_channel) for _ in range(len(window_list))]) 118 | 119 | self.multi_channel_fusion = nn.ModuleList([nn.Conv2d(in_channels = sensor_num, out_channels = self.hidden_channel, 120 | kernel_size = (self.num_frequency_channel[i], 1), stride = (1, 1) ) for i in range(len(window_list) ) ]) 121 | #self.drop = nn.ModuleList([nn.Dropout(0.5) for i in range(len(window_list))]) 122 | 123 | for i in range(len(window_list)): 124 | scale = 1 125 | while self.current_size[i] >= 3: 126 | self.conv_branches[i].append( 127 | resConv1dBlock(in_channels = self.hidden_channel * scale, 128 | kernel_size = 3, stride = 1, layer_num = layer_num) 129 | ) 130 | if scale < 2: 131 | 
self.conv_branches[i].append( 132 | nn.Conv1d(in_channels = self.hidden_channel * scale, out_channels = self.hidden_channel *2* scale, kernel_size = 1, stride = 1) 133 | ) 134 | scale *= 2 135 | 136 | self.conv_branches[i].append(nn.AvgPool1d(kernel_size = 2)) 137 | self.current_size[i] = 1 + int((self.current_size[i] - 2) / 2) 138 | 139 | print(self.current_size[i], end = ' ') 140 | print('') 141 | self.end_linear.append( 142 | nn.Linear(self.hidden_channel * self.current_size[i] * scale, self.hidden_channel) 143 | ) 144 | 145 | self.classifier = nn.Linear(self.hidden_channel * len(self.window_list), out_dim) 146 | 147 | def forward(self, x): 148 | #x: B * L * C 149 | multi_scale_x = [] 150 | B = x.size(0) 151 | C = x.size(2) 152 | 153 | for i in range(len(self.current_size)): 154 | tmp = self.ts_encoders[i](x) 155 | #tmp: B * C * fc * L' 156 | #print(tmp.size()) 157 | tmp = F.relu(self.bns[i](self.multi_channel_fusion[i](tmp).squeeze(2))) 158 | 159 | for j in range(len(self.conv_branches[i])): 160 | tmp = self.conv_branches[i][j](tmp) 161 | tmp = tmp.view(B,-1) 162 | # tmp : B * l' 163 | tmp = F.relu(self.end_linear[i](tmp)) 164 | #tmp = self.drop2(tmp) 165 | multi_scale_x.append(tmp) 166 | 167 | x = torch.cat(multi_scale_x, -1) #multi scale fusion 168 | x = self.classifier(x) 169 | return x 170 | 171 | 172 | 173 | 174 | def main(): 175 | stft_m = STFTNet(input_size = 256, sensor_num = 6, layer_num = 1, 176 | window_list = [16, 32, 48], stride_list = [8, 16, 24], k_list = [6, 8, 10], out_dim = 4, hidden_channel = 32).cuda() 177 | 178 | total_params = sum(p.numel() for p in stft_m.parameters()) 179 | print(f'{total_params:,} total parameters.') 180 | 181 | 182 | x = torch.zeros(3, 256, 6).cuda() 183 | output = stft_m(x) 184 | print(output.size()) 185 | 186 | 187 | if __name__ == '__main__': 188 | main() 189 | -------------------------------------------------------------------------------- /source/model/transformer_encoder.py: -------------------------------------------------------------------------------- 1 | import torch.nn as nn 2 | import torch 3 | from torch.nn.init import * 4 | import torch.nn.functional as F 5 | import numpy as np 6 | import torch 7 | import torch.nn as nn 8 | import torch.nn.functional as F 9 | import math, copy, time 10 | from torch.autograd import Variable 11 | 12 | 13 | def clones(module, N): 14 | "Produce N identical layers." 15 | return nn.ModuleList([copy.deepcopy(module) for _ in range(N)]) 16 | 17 | 18 | class Embeddings(nn.Module): 19 | def __init__(self, d_model, vocab): 20 | super(Embeddings, self).__init__() 21 | self.lut = nn.Embedding(vocab, d_model) 22 | self.d_model = d_model 23 | 24 | def forward(self, x): 25 | return self.lut(x) * math.sqrt(self.d_model) 26 | 27 | class Encoder(nn.Module): 28 | "Core encoder is a stack of N layers" 29 | def __init__(self, layer, N): 30 | super(Encoder, self).__init__() 31 | self.layers = clones(layer, N) 32 | self.norm = LayerNorm(layer.size) 33 | 34 | def forward(self, x, mask=None): 35 | "Pass the input (and mask) through each layer in turn." 
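        # each layer is an EncoderLayer (self-attention + position-wise FFN, both wrapped in
        # pre-norm residual SublayerConnections); a final LayerNorm is applied after the stack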
36 | for layer in self.layers: 37 | x = layer(x, mask) 38 | return self.norm(x) 39 | 40 | 41 | class PositionwiseFeedForward(nn.Module): 42 | def __init__(self, d_model, d_ff, dropout=0.1): 43 | super(PositionwiseFeedForward, self).__init__() 44 | self.w_1 = nn.Linear(d_model, d_ff) 45 | self.w_2 = nn.Linear(d_ff, d_model) 46 | self.dropout = nn.Dropout(dropout) 47 | 48 | def forward(self, x): 49 | return self.w_2(self.dropout(F.relu(self.w_1(x)))) 50 | 51 | 52 | class LayerNorm(nn.Module): 53 | "Construct a layernorm module (See citation for details)." 54 | def __init__(self, features, eps=1e-6): 55 | super(LayerNorm, self).__init__() 56 | self.a_2 = nn.Parameter(torch.ones(features)) 57 | self.b_2 = nn.Parameter(torch.zeros(features)) 58 | self.eps = eps 59 | 60 | def forward(self, x): 61 | mean = x.mean(-1, keepdim=True) 62 | std = x.std(-1, keepdim=True) 63 | return self.a_2 * (x - mean) / (std + self.eps) + self.b_2 64 | 65 | 66 | class SublayerConnection(nn.Module): 67 | """ 68 | A residual connection followed by a layer norm. 69 | Note for code simplicity the norm is first as opposed to last. 70 | """ 71 | def __init__(self, size, dropout): 72 | super(SublayerConnection, self).__init__() 73 | self.norm = LayerNorm(size) 74 | self.dropout = nn.Dropout(dropout) 75 | 76 | def forward(self, x, sublayer): 77 | "Apply residual connection to any sublayer with the same size." 78 | return x + self.dropout(sublayer(self.norm(x))) 79 | 80 | def attention(query, key, value, mask=None, dropout=None): 81 | "Compute 'Scaled Dot Product Attention'" 82 | d_k = query.size(-1) 83 | scores = torch.matmul(query, key.transpose(-2, -1)) \ 84 | / math.sqrt(d_k) 85 | if mask is not None: 86 | scores = scores.masked_fill(mask == 0, -1e9) 87 | p_attn = F.softmax(scores, dim = -1) 88 | if dropout is not None: 89 | p_attn = dropout(p_attn) 90 | return torch.matmul(p_attn, value), p_attn 91 | 92 | class PositionwiseFeedForward(nn.Module): 93 | "Implements FFN equation." 94 | def __init__(self, d_model, d_ff, dropout=0.1): 95 | super(PositionwiseFeedForward, self).__init__() 96 | self.w_1 = nn.Linear(d_model, d_ff) 97 | self.w_2 = nn.Linear(d_ff, d_model) 98 | self.dropout = nn.Dropout(dropout) 99 | 100 | def forward(self, x): 101 | return self.w_2(self.dropout(F.relu(self.w_1(x)))) 102 | 103 | class EncoderLayer(nn.Module): 104 | "Encoder is made up of self-attn and feed forward (defined below)" 105 | def __init__(self, size, self_attn, feed_forward, dropout): 106 | super(EncoderLayer, self).__init__() 107 | self.self_attn = self_attn 108 | self.feed_forward = feed_forward 109 | self.sublayer = clones(SublayerConnection(size, dropout), 2) 110 | self.size = size 111 | 112 | def forward(self, x, mask=None): 113 | "Follow Figure 1 (left) for connections." 114 | x = self.sublayer[0](x, lambda x: self.self_attn(x, x, x, mask)) 115 | return self.sublayer[1](x, self.feed_forward) 116 | 117 | class MultiHeadedAttention(nn.Module): 118 | def __init__(self, h, d_model, dropout=0.1): 119 | "Take in model size and number of heads." 120 | super(MultiHeadedAttention, self).__init__() 121 | assert d_model % h == 0 122 | # We assume d_v always equals d_k 123 | self.d_k = d_model // h 124 | self.h = h 125 | self.linears = clones(nn.Linear(d_model, d_model), 4) 126 | self.attn = None 127 | self.dropout = nn.Dropout(p=dropout) 128 | 129 | def forward(self, query, key, value, mask=None): 130 | "Implements Figure 2" 131 | if mask is not None: 132 | # Same mask applied to all h heads. 
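            # (batch, L_q, L_k) -> (batch, 1, L_q, L_k) so the mask broadcasts over the h head dimension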
133 | mask = mask.unsqueeze(1) 134 | nbatches = query.size(0) 135 | 136 | # 1) Do all the linear projections in batch from d_model => h x d_k 137 | query, key, value = \ 138 | [l(x).view(nbatches, -1, self.h, self.d_k).transpose(1, 2) 139 | for l, x in zip(self.linears, (query, key, value))] 140 | 141 | # 2) Apply attention on all the projected vectors in batch. 142 | x, self.attn = attention(query, key, value, mask=mask, 143 | dropout=self.dropout) 144 | 145 | # 3) "Concat" using a view and apply a final linear. 146 | x = x.transpose(1, 2).contiguous() \ 147 | .view(nbatches, -1, self.h * self.d_k) 148 | return self.linears[-1](x) 149 | 150 | class Transformer(nn.Module): 151 | def __init__(self, hidden_dim, N, H): 152 | super(Transformer, self).__init__() 153 | #self. pos_encoding = PositionalEncoding(hidden_dim, 0.1) 154 | self.model = Encoder( 155 | EncoderLayer(hidden_dim, MultiHeadedAttention(H, hidden_dim), 156 | PositionwiseFeedForward(hidden_dim, hidden_dim*4) 157 | , 0.1), 158 | N 159 | ) 160 | 161 | def forward(self, x, mask=None): 162 | #x = self.pos_encoding(x) 163 | return self.model(x, mask) 164 | 165 | class PositionalEncoding(nn.Module): 166 | "Implement the PE function." 167 | 168 | def __init__(self, d_model, dropout, max_len=5000): 169 | super(PositionalEncoding, self).__init__() 170 | self.dropout = nn.Dropout(p=dropout) 171 | 172 | # Compute the positional encodings once in log space. 173 | pe = torch.zeros(max_len, d_model) 174 | position = torch.arange(0, max_len).unsqueeze(1) 175 | div_term = torch.exp(torch.arange(0, d_model, 2) * 176 | -(math.log(10000.0) / d_model)) 177 | pe[:, 0::2] = torch.sin(position * div_term) 178 | pe[:, 1::2] = torch.cos(position * div_term) 179 | pe.requires_grad = False 180 | #pe = pe.unsqueeze(0) 181 | self.register_buffer('pe', pe) 182 | 183 | def forward(self, x): 184 | #x = x + self.pe[:x.size(0), :]#Variable(self.pe[:x.size(0), :], requires_grad=False) 185 | x = x + self.pe[:x.size(1), :].unsqueeze(0).repeat(x.size(0), 1, 1) ## modified by Bing to adapt to batch 186 | return self.dropout(x) 187 | 188 | 189 | class PositionalEncoding_for_BERT(nn.Module): 190 | "Implement the PE function." 191 | 192 | def __init__(self, d_model, dropout, max_len=5000): 193 | super(PositionalEncoding_for_BERT, self).__init__() 194 | self.dropout = nn.Dropout(p=dropout) 195 | 196 | # Compute the positional encodings once in log space. 197 | pe = torch.zeros(max_len, d_model) 198 | position = torch.arange(0, max_len).unsqueeze(1) 199 | div_term = torch.exp(torch.arange(0, d_model, 2) * 200 | -(math.log(10000.0) / d_model)) 201 | pe[:, 0::2] = torch.sin(position * div_term) 202 | pe[:, 1::2] = torch.cos(position * div_term) 203 | pe.requires_grad = False 204 | #pe = torch.mul(pe, 0.2) 205 | #pe = pe.unsqueeze(0) 206 | self.register_buffer('pe', pe) 207 | 208 | def forward(self, x): 209 | #x = x + self.pe[:x.size(0), :]#Variable(self.pe[:x.size(0), :], requires_grad=False) 210 | return self.pe[:x.size(0), :] 211 | ''' 212 | def make_model(src_vocab, tgt_vocab, N=6, 213 | d_model=200, d_ff=800, h=8, dropout=0.1): 214 | "Helper: Construct a model from hyperparameters." 
215 | c = copy.deepcopy 216 | attn = MultiHeadedAttention(h, d_model) # 多头attention 217 | ff = PositionwiseFeedForward(d_model, d_ff, dropout) # FFN 218 | position = PositionalEncoding(d_model, dropout) # 位置向量 219 | model = EncoderDecoder( 220 | Encoder(EncoderLayer(d_model, c(attn), c(ff), dropout), N), 221 | Decoder(DecoderLayer(d_model, c(attn), c(attn), c(ff), dropout), N), 222 | nn.Sequential(Embeddings(d_model, src_vocab), c(position)), 223 | nn.Sequential(Embeddings(d_model, tgt_vocab), c(position)), 224 | Generator(d_model, tgt_vocab)) 225 | 226 | # This was important from their code. 227 | # Initialize parameters with Glorot / fan_avg. 228 | for p in model.parameters(): 229 | if p.dim() > 1: 230 | nn.init.xavier_uniform(p) 231 | return model 232 | 233 | ''' 234 | 235 | class BertEmbeddings(nn.Module): 236 | """Construct the embeddings from word, position and token_type embeddings. 237 | """ 238 | 239 | def __init__(self, hidden_dim, vocab_size): 240 | super().__init__() 241 | self.word_embeddings = nn.Embedding(vocab_size, hidden_dim, padding_idx=0) 242 | self.position_embeddings = PositionalEncoding_for_BERT(hidden_dim, 0.1) 243 | self.token_type_embeddings = nn.Embedding(2, hidden_dim) 244 | 245 | # self.LayerNorm is not snake-cased to stick with TensorFlow model variable name and be able to load 246 | # any TensorFlow checkpoint file 247 | self.LayerNorm = LayerNorm(hidden_dim) 248 | self.dropout = nn.Dropout(0.1) 249 | 250 | def forward(self, input_ids=None, token_type_ids=None): 251 | inputs_embeds = self.word_embeddings(input_ids) 252 | position_embeddings = self.position_embeddings(inputs_embeds) 253 | token_type_embeddings = self.token_type_embeddings(token_type_ids) 254 | 255 | embeddings = inputs_embeds + position_embeddings + token_type_embeddings 256 | embeddings = self.LayerNorm(embeddings) 257 | embeddings = self.dropout(embeddings) 258 | return embeddings -------------------------------------------------------------------------------- /source/pytorchtools.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import torch 3 | import time 4 | class EarlyStopping: 5 | """Early stops the training if validation loss doesn't improve after a given patience.""" 6 | def __init__(self, patience=7, verbose=False, delta=0, best_path='checkpoint.pt', best_acc_path='', end_model_path='', trace_func=print, es_enabled=True): 7 | """ 8 | Args: 9 | patience (int): How long to wait after last time validation loss improved. 10 | Default: 7 11 | verbose (bool): If True, prints a message for each validation loss improvement. 12 | Default: False 13 | delta (float): Minimum change in the monitored quantity to qualify as an improvement. 14 | Default: 0 15 | best_path (str): Path for the Best model to be saved to. 16 | Default: 'checkpoint.pt' 17 | trace_func (function): trace print function. 
18 | Default: ear 19 | """ 20 | self.patience = patience 21 | self.verbose = verbose 22 | self.counter = 0 23 | self.best_score = None 24 | self.early_stop = False 25 | self.val_loss_min = np.Inf 26 | self.delta = delta 27 | self.best_path = best_path 28 | self.best_acc_path = best_acc_path 29 | self.best_acc = 0 30 | self.best_f1 = 0 31 | self.trace_func = trace_func 32 | self.es_enabled = es_enabled 33 | 34 | self.best_acc = None 35 | self.best_f1 = None 36 | 37 | self.best_ep_loss = None 38 | self.best_ep_acc = None 39 | # start timer 40 | self.time_start = time.time() 41 | 42 | self.end_model_path = end_model_path 43 | 44 | def __call__(self, val_loss, val_acc, val_f1, ep, model): 45 | 46 | score = -val_loss 47 | 48 | if self.best_acc is None: 49 | self.best_acc = val_acc 50 | self.best_f1 = val_f1 51 | self.best_ep_acc = ep 52 | self.save_checkpoint(None, model, self.best_acc_path) 53 | elif self.best_acc < val_acc: 54 | self.best_acc = val_acc 55 | self.best_f1 = val_f1 56 | self.best_ep_acc = ep 57 | self.save_checkpoint(None, model, self.best_acc_path) 58 | 59 | if self.best_score is None: 60 | self.best_score = score 61 | self.save_checkpoint(val_loss, model, self.best_path) 62 | self.best_ep_loss = ep 63 | elif score < self.best_score + self.delta: 64 | if self.es_enabled: 65 | self.counter += 1 66 | self.trace_func(f'EarlyStopping counter: {self.counter} out of {self.patience}') 67 | if self.counter >= self.patience: 68 | self.early_stop = True 69 | else: 70 | self.trace_func("No update on validset. But Early stopping is disabled") 71 | else: 72 | self.best_score = score 73 | self.save_checkpoint(val_loss, model, self.best_path) 74 | self.best_ep_loss = ep 75 | self.counter = 0 76 | 77 | # self.save_checkpoint(, model, self.end_model_path) 78 | 79 | 80 | 81 | def save_checkpoint(self, val_loss, model, path): 82 | '''Saves model when validation loss decrease.''' 83 | if val_loss is not None: 84 | if self.verbose: 85 | self.trace_func(f'Validation loss decreased ({self.val_loss_min:.6f} --> {val_loss:.6f}). Saving model ...') 86 | self.val_loss_min = val_loss 87 | else: 88 | self.trace_func(f'Better Acc. 
Saving model ...') 89 | torch.save(model.state_dict(), path) 90 | -------------------------------------------------------------------------------- /source/requirements.txt: -------------------------------------------------------------------------------- 1 | absl-py==1.4.0 2 | async-timeout==4.0.3 3 | cachetools==5.3.1 4 | certifi==2023.7.22 5 | charset-normalizer==3.2.0 6 | cmake==3.27.2 7 | contourpy==1.1.0 8 | cycler==0.11.0 9 | filelock==3.12.2 10 | fonttools==4.42.0 11 | google-auth==2.22.0 12 | google-auth-oauthlib==1.0.0 13 | grpcio==1.57.0 14 | h5py==3.9.0 15 | idna==3.4 16 | jinja2==3.1.2 17 | joblib==1.3.2 18 | kiwisolver==1.4.4 19 | lightning-utilities==0.9.0 20 | lit==16.0.6 21 | markdown==3.4.4 22 | markupsafe==2.1.3 23 | mat73==0.60 24 | matplotlib==3.7.2 25 | mpmath==1.3.0 26 | networkx==3.1 27 | numpy==1.25.2 28 | nvidia-cublas-cu11==11.10.3.66 29 | nvidia-cuda-cupti-cu11==11.7.101 30 | nvidia-cuda-nvrtc-cu11==11.7.99 31 | nvidia-cuda-runtime-cu11==11.7.99 32 | nvidia-cudnn-cu11==8.5.0.96 33 | nvidia-cufft-cu11==10.9.0.58 34 | nvidia-curand-cu11==10.2.10.91 35 | nvidia-cusolver-cu11==11.4.0.1 36 | nvidia-cusparse-cu11==11.7.4.91 37 | nvidia-nccl-cu11==2.14.3 38 | nvidia-nvtx-cu11==11.7.91 39 | oauthlib==3.2.2 40 | pandas==2.0.3 41 | pillow==10.0.0 42 | protobuf==3.19.6 43 | pyasn1==0.5.0 44 | pyasn1-modules==0.3.0 45 | pyparsing==3.0.9 46 | pytz==2023.3 47 | pyyaml==6.0.1 48 | requests==2.31.0 49 | requests-oauthlib==1.3.1 50 | rsa==4.9 51 | scikit-learn==1.3.0 52 | scipy==1.11.2 53 | seaborn==0.13.0 54 | sympy==1.12 55 | tensorboard==2.14.0 56 | tensorboard-data-server==0.7.1 57 | tensorboard-plugin-wit==1.8.1 58 | threadpoolctl==3.2.0 59 | torch==2.0.1 60 | torch-tb-profiler==0.4.1 61 | torchaudio==2.0.2 62 | torchmetrics==1.0.3 63 | torchsummary==1.5.1 64 | torchvision==0.15.2 65 | tqdm==4.66.1 66 | triton==2.0.0 67 | tzdata==2023.3 68 | urllib3==1.26.16 69 | werkzeug==2.3.7 -------------------------------------------------------------------------------- /source/set_device.py: -------------------------------------------------------------------------------- 1 | import os 2 | def get_gpu_ids(i): 3 | gpu_ids = [] 4 | gpu_info = os.popen("nvidia-smi -L").readlines() 5 | for line in gpu_info: 6 | # print(line) 7 | ids = line.split("UUID: ")[-1].strip(" ()\n") 8 | if ids.startswith("GPU"): 9 | continue 10 | # print(ids) 11 | gpu_ids.append(ids) 12 | # print("gpu_ids:", gpu_ids) 13 | return gpu_ids[i] -------------------------------------------------------------------------------- /source/utils.py: -------------------------------------------------------------------------------- 1 | import os 2 | import csv 3 | import yaml 4 | import numpy as np 5 | import matplotlib.pyplot as plt 6 | import time 7 | # from torch.autograd.profiler import profile, record_function, ProfilerActivity 8 | from sklearn.cluster import KMeans 9 | 10 | # for UniTS only, copy from it 11 | def read_data(args, config, part="train"): 12 | if part not in ["train", "test", "all", "train_valid", "valid"]: 13 | raise ValueError("part must be either train or test") 14 | 15 | path = os.path.join('../dataset', args.dataset) 16 | if part == "train": 17 | x_train = np.load(os.path.join(path, 'x_train.npy')) 18 | y_train = np.load(os.path.join(path, 'y_train.npy')).astype('int64').tolist() 19 | elif part == "test": 20 | x_test = np.load(os.path.join(path, 'x_test.npy')) 21 | y_test = np.load(os.path.join(path, 'y_test.npy')).astype('int64').tolist() 22 | elif part == "all": 23 | x_all = np.load(os.path.join(path, 
'x_all.npy')) 24 | y_all = np.load(os.path.join(path, 'y_all.npy')).astype('int64').tolist() 25 | np.random.seed(args.seed) 26 | 27 | if args.exp == 'noise': # Robustness test (noise) 28 | if part == "train": 29 | for i in range(len(x_train)): 30 | for j in range(x_train.shape[2]): 31 | noise = np.random.normal(1,1 , size= x_train[i][:, j].shape) 32 | x_train[i][:, j] = x_train[i][:, j] + noise * args.ratio * np.mean(np.absolute(x_train[i][:, j] )) 33 | if part == "test": 34 | for i in range(len(x_test)): 35 | for j in range(x_test.shape[2]): 36 | noise = np.random.normal(1, 1, size= x_test[i][:, j].shape) 37 | x_test[i][:, j] = x_test[i][:, j] + noise * args.ratio * np.mean(np.absolute(x_test[i][:, j] )) 38 | 39 | elif args.exp == 'missing_data': # Robustness test (missing value) 40 | if part == "train": 41 | for i in range(len(x_train)): 42 | for j in range(x_train.shape[2]): 43 | mask = np.random.random(x_train[i][:, j].shape) >= args.ratio 44 | x_train[i][:, j] = x_train[i][:, j] * mask 45 | if part == "test": 46 | for i in range(len(x_test)): 47 | for j in range(x_test.shape[2]): 48 | mask = np.random.random(x_test[i][:, j].shape) >= args.ratio 49 | x_test[i][:, j] = x_test[i][:, j] * mask 50 | if part == "train": 51 | args.num_labels = max(y_train) + 1 52 | summary = [0 for i in range(args.num_labels)] 53 | for i in y_train: 54 | summary[i] += 1 55 | args.log("Label num cnt: "+ str(summary)) 56 | args.log("Training size: " + str(len(y_train))) 57 | return list(x_train), y_train 58 | 59 | if part == "test": 60 | args.log("Testing size: " + str(len(y_test))) 61 | return list(x_test), y_test 62 | 63 | if part == "all": 64 | args.log("All size: " + str(len(y_all))) 65 | return list(x_all), y_all 66 | 67 | class AttrDict(dict): 68 | def __init__(self, *args, **kwargs): 69 | super(AttrDict, self).__init__(*args, **kwargs) 70 | self.__dict__ = self 71 | 72 | def read_config(path): 73 | return AttrDict(yaml.load(open(path, 'r'), Loader=yaml.FullLoader)) 74 | 75 | def logging(file): 76 | def write_log(s): 77 | print(s) 78 | with open(file, 'a') as f: 79 | f.write(s+'\n') 80 | return write_log 81 | 82 | def set_up_logging(args, config): 83 | log = logging(os.path.join(args.log_path, args.model+'.txt')) 84 | for k, v in config.items(): 85 | log("%s:\t%s\n" % (str(k), str(v))) 86 | return log 87 | 88 | import scipy.stats as st 89 | 90 | def compute_mean_and_conf_interval(accuracies, confidence=.95): 91 | accuracies = np.array(accuracies) 92 | n = len(accuracies) 93 | if n <= 1: 94 | return accuracies[0], -1 95 | # st.sem() computes the standard error of the mean 96 | m, se = np.mean(accuracies), st.sem(accuracies) 97 | # ppf = Percent point function of student's t distribution 98 | h = se * st.t.ppf((1 + confidence) / 2., n-1) 99 | return m, h 100 | # retry with exception 101 | def read_npz_data(path): 102 | for trial_i in range(10): 103 | try: 104 | data = np.load(path) 105 | return data["data"], data["ms"] 106 | except: 107 | print("attempt {}".format(trial_i)) 108 | # sleep 1 ms 109 | time.sleep(0.001) 110 | continue 111 | print("problem reading {}".format(path)) 112 | raise Exception("Failed to read data") 113 | 114 | def read_any_data(path): 115 | if path.endswith(".npz"): 116 | return read_npz_data(path) 117 | elif path.endswith(".npy"): 118 | return np.load(path, allow_pickle=True) 119 | elif path.endswith(".mat"): 120 | try: 121 | import scipy.io as sio 122 | return sio.loadmat(path) 123 | except: 124 | import mat73 125 | return mat73.loadmat(path) 126 | else: 127 | raise 
Exception("Unsupported data format") 128 | # generate doppler freq spectrum for each subcarrier 129 | import numpy as np 130 | import scipy.signal as signal 131 | from sklearn.decomposition import PCA 132 | import hashlib 133 | import json 134 | 135 | def stringfy_data(csi_data, window_size, window_step, agg_type): 136 | mean = np.mean(np.reshape(csi_data,(-1, 1))) 137 | var = np.var(np.reshape(csi_data,(-1, 1))) 138 | data_str = "{},{},{},{},{}".format(str(mean), str(var), str(window_size), str(window_step), agg_type) 139 | # if log == False: 140 | data_str += ",no_log" 141 | return data_str 142 | 143 | # MATLAB code 144 | # nFFT = 2^(nextpow2(length(y))+1); 145 | # F = fft(y,nFFT); 146 | # F = F.*conj(F); 147 | # acf = ifft(F); 148 | # acf = acf(1:(numLags+1)); % Retain nonnegative lags 149 | # acf = real(acf); 150 | # acf = acf/acf(1); % Normalize 151 | # write python code below: 152 | def nextpow2(i): 153 | return np.ceil(np.log2(i)) 154 | def autocorr(x): 155 | nFFT = int(2**(nextpow2(len(x))+1)) 156 | F = np.fft.fft(x,nFFT) 157 | F = F*np.conj(F) 158 | acf = np.fft.ifft(F) 159 | acf = acf[0:len(x)] # Retain nonnegative lags 160 | acf = np.real(acf) 161 | acf = acf/acf[0] # Normalize 162 | return acf 163 | 164 | def get_acf_(csi_data, channel_gain, samp_rate = 1000, window_size = 128, nfft=1000, window_step = 10, agg_type = 'pca', cache_folder=None): 165 | if agg_type == "pca": 166 | pca = PCA(n_components=1) 167 | pca_coef = pca.fit_transform(np.absolute(np.transpose(csi_data, [1,0]))) 168 | # [T,1] 169 | csi_data_agg = np.dot(csi_data, pca_coef[:, 0]) 170 | else: 171 | csi_data_agg = csi_data 172 | spectrogram = [] 173 | # calculate autocorrelation function 174 | for i in range(0, csi_data_agg.shape[0] - window_size, window_step): 175 | # remove mean 176 | window = csi_data_agg[i:i+window_size] - np.mean(csi_data_agg[i:i+window_size]) 177 | csi_agg_acf = autocorr(window) 178 | spectrogram.append(csi_agg_acf) 179 | # spectrogram = spectrogram / spectrogram.max() 180 | return None, None, spectrogram 181 | 182 | def get_dfs_(csi_data, channel_gain, samp_rate = 1000, window_size = 256, nfft=1000, window_step = 10, agg_type = 'ms', n_pca=1, log=False, cache_folder=None): 183 | """ 184 | :param csi_data: csi data in the form of a list of numpy arrays 185 | ms: channel gain of each subcarrier 186 | """ 187 | # start_time = time.time() 188 | data_str = stringfy_data(csi_data, window_size, window_step, agg_type) 189 | # print("dump_string: ", time.time() - start_time) 190 | 191 | # start_time = time.time() 192 | hash_data = hashlib.md5(data_str.encode("utf-8")).hexdigest() 193 | # print("hash: ", time.time() - start_time) 194 | 195 | try: 196 | # disable 197 | # assert cache_folder is None 198 | assert type(cache_folder) == str 199 | # if the file exists, load it 200 | data = np.load(cache_folder + hash_data + '.npz', allow_pickle=True) 201 | freq_bin = data['freq_bin'] 202 | ticks = data['ticks'] 203 | doppler_spectrum = data['doppler_spectrum'] 204 | except: 205 | # if the file does not exist, compute it 206 | # with record_function("compute_DFS"): 207 | half_rate = samp_rate / 2 208 | uppe_stop = 60 209 | freq_bins_unwrap = np.concatenate((np.arange(0, half_rate, 1) / samp_rate, np.arange(-half_rate, 0, 1) / samp_rate)) 210 | freq_lpf_sele = np.logical_and(np.less_equal(freq_bins_unwrap,(uppe_stop / samp_rate)),np.greater_equal(freq_bins_unwrap,(-uppe_stop / samp_rate))) 211 | freq_lpf_positive_max = 60 212 | 213 | if agg_type == 'pca' and csi_data.shape[1] >= 1: 214 | pca = 
PCA(n_components=n_pca) 215 | pca_coef = pca.fit_transform(np.absolute(np.transpose(csi_data, [1,0]))) 216 | # [T,1] 217 | csi_data_agg = np.dot(csi_data, pca_coef) 218 | # always report the last pca component 219 | csi_data_agg = csi_data_agg[:,-1] 220 | elif agg_type == 'ms': 221 | # L1-normalize ms 222 | csi_data_agg = csi_data 223 | 224 | # DC removal 225 | csi_data_agg = csi_data_agg - np.mean(csi_data_agg, axis=0) 226 | noverlap = window_size - window_step 227 | freq, ticks, freq_time_prof_allfreq = signal.stft(csi_data_agg, fs=samp_rate, nfft=samp_rate, 228 | window=('gaussian', window_size), nperseg=window_size, noverlap=noverlap, 229 | return_onesided=False, 230 | padded=True) 231 | 232 | freq_time_prof_allfreq = np.array(freq_time_prof_allfreq) 233 | freq_time_prof = freq_time_prof_allfreq[freq_lpf_sele, :] 234 | 235 | if log: 236 | doppler_spectrum = np.log10(np.square(np.abs(freq_time_prof)) + 1e-20) + 20 237 | else: 238 | # DO NOT USE widar3 version, will introduce interference in the frequency axis. making empty timeslots too large 239 | # doppler_spectrum = np.divide(abs(freq_time_prof), np.sum(abs(freq_time_prof), axis=0), out=np.zeros(freq_time_prof.shape), where=abs(freq_time_prof) != 0) 240 | # cal signal’s energy 241 | doppler_spectrum = np.square(np.abs(freq_time_prof)) 242 | # doppler_spectrum = np.divide(abs(doppler_spectrum), np.sum(abs(doppler_spectrum), axis=0), out=np.zeros(doppler_spectrum.shape), where=abs(doppler_spectrum) != 0) 243 | # doppler_spectrum = freq_time_prof 244 | # freq_bin = 0:freq_lpf_positive_max - 1 * freq_lpf_negative_min:-1] 245 | freq_bin = np.array(freq)[freq_lpf_sele] 246 | 247 | # shift the doppler spectrum to the center of the frequency bins 248 | # freq_time_prof_allfreq = [0, 1, 2 ... -2, -1] 249 | doppler_spectrum = np.roll(doppler_spectrum, freq_lpf_positive_max, axis=0) 250 | freq_bin = np.roll(freq_bin, freq_lpf_positive_max) 251 | 252 | if cache_folder is not None and os.path.exists(cache_folder): 253 | try: 254 | np.savez(cache_folder + hash_data + '.npz', freq_bin=freq_bin, ticks=ticks, doppler_spectrum=doppler_spectrum) 255 | except: 256 | pass 257 | 258 | return freq_bin, ticks, doppler_spectrum 259 | 260 | def get_dfs(csi_data, channel_gain, samp_rate = 1000, window_size = 256, nfft=1000, window_step = 10, agg_type="ms", n_pca=1, spec_type = 'dfs', log=False, cache_folder=None): 261 | # dfs: [F, W] 262 | # acf: [F, W] 263 | if spec_type == "dfs": 264 | spec = get_dfs_(csi_data, channel_gain, samp_rate, window_size, nfft, window_step, agg_type, n_pca, log, cache_folder)[2] 265 | elif spec_type == "acf": 266 | spec = get_acf_(csi_data, channel_gain, samp_rate, 128, nfft, window_step, agg_type, cache_folder)[2] 267 | spec = np.array(spec).transpose() 268 | elif spec_type == "acf+dfs": 269 | dfs = get_dfs_(csi_data, channel_gain, samp_rate, window_size, nfft, window_step, agg_type, n_pca, log, cache_folder)[2] 270 | acf = get_acf_(csi_data, channel_gain, samp_rate, 128, nfft, window_step, agg_type, cache_folder)[2] 271 | 272 | dfs = np.array(dfs) 273 | acf = np.array(acf).transpose() 274 | # padding acf dim 1 275 | if dfs.shape[1] > acf.shape[1]: 276 | acf = np.pad(acf, ((0,0),(0, dfs.shape[1] - acf.shape[1])), 'constant') 277 | # padding acf dim 0 to 132 278 | acf = np.pad(acf, ((0, 132 - acf.shape[0]),(0,0)), 'constant') 279 | spec = np.concatenate((dfs, acf), axis=0) 280 | return None, None, spec 281 | 282 | 283 | def show_dfs(f, t, Zxx, ax): 284 | ax.pcolormesh(t, f, np.abs(Zxx), vmin = 0, vmax = 0.1, cmap = 'jet') 285 | 
ax.set_ylabel('Frequency [Hz]') 286 | ax.set_xlabel('Time [sec]') 287 | ax.grid(False) 288 | 289 | def pad_data(data, pad_to_length): 290 | data_padded = np.zeros((np.array(data).shape[0], pad_to_length)) 291 | data_padded[:, :np.array(data).shape[1]] = data 292 | return data_padded 293 | 294 | def downsample_data(data, rate, target_length): 295 | l = len(data) 296 | data = data[:, ::rate] 297 | data_downsampled = data[:, :target_length] 298 | return data_downsampled 299 | 300 | def pad_and_downsample(data, target_length, t=None, axis=1): 301 | # default axis = 1 302 | if axis == 0: 303 | data = np.transpose(data, [1,0]) 304 | y_len = np.array(data).shape[1] 305 | # more left 306 | down_rate = y_len // target_length 307 | if down_rate == 0: 308 | down_rate = 1 309 | # fewer left 310 | up_rate = down_rate + 1 311 | 312 | down_cropped_loss = int(np.ceil(y_len / down_rate)) - target_length 313 | up_pad_loss = target_length - int(np.ceil(y_len / up_rate)) 314 | 315 | if y_len > target_length: 316 | if down_cropped_loss <= up_pad_loss: 317 | out = downsample_data(data, down_rate, target_length) 318 | if t is not None: 319 | out_t = t[::down_rate] 320 | out_t = out_t[:target_length] 321 | else: 322 | out = downsample_data(data, up_rate, target_length) 323 | out = pad_data(out, target_length) 324 | if t is not None: 325 | out_t = t[::up_rate] 326 | avg_interval = (out_t[-1] - out_t[0]) / (len(out_t) - 1) 327 | # pad t to target_length 328 | pad_t = list(out_t[-1]+(np.arange(0, target_length - len(out_t), 1) + 1) * avg_interval) 329 | out_t = np.concatenate((out_t, pad_t)) 330 | 331 | elif y_len < target_length: 332 | out = pad_data(data, target_length) 333 | if t is not None: 334 | avg_interval = (t[-1] - t[0]) / (len(t) - 1) 335 | # pad t to target_length 336 | pad_t = list(t[-1]+(np.arange(0, target_length - len(t), 1) + 1) * avg_interval) 337 | out_t = np.concatenate((t, pad_t)) 338 | else: 339 | out = data 340 | if t is not None: 341 | out_t = t 342 | 343 | # transpose back 344 | if axis == 0: 345 | out = np.transpose(out, [1,0]) 346 | 347 | if t is not None: 348 | if axis == 0: 349 | out_t = np.transpose(out_t, [1,0]) 350 | return out, out_t 351 | else: 352 | return out 353 | 354 | # SLNet 355 | def complex_array_to_2_channel_float_array(data_complex): 356 | # data_complex(complex128/float64)=>data_float: [R,6,121,T_MAX]=>[2,R,6,121,T_MAX] 357 | data_complex = data_complex.astype('complex64') 358 | data_real = data_complex.real 359 | data_imag = data_complex.imag 360 | data_2_channel_float = np.stack((data_real, data_imag), axis=0) 361 | return data_2_channel_float -------------------------------------------------------------------------------- /source/widar3/all_5500_top6_1000_2560/test_filename.npy: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aiot-lab/RFBoost/f7b71a7a42ef26f51dceb7c8bae6685d202aae17/source/widar3/all_5500_top6_1000_2560/test_filename.npy -------------------------------------------------------------------------------- /source/widar3/all_5500_top6_1000_2560/test_label.npy: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aiot-lab/RFBoost/f7b71a7a42ef26f51dceb7c8bae6685d202aae17/source/widar3/all_5500_top6_1000_2560/test_label.npy -------------------------------------------------------------------------------- /source/widar3/all_5500_top6_1000_2560/train_filename.npy: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/aiot-lab/RFBoost/f7b71a7a42ef26f51dceb7c8bae6685d202aae17/source/widar3/all_5500_top6_1000_2560/train_filename.npy -------------------------------------------------------------------------------- /source/widar3/all_5500_top6_1000_2560/train_label.npy: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aiot-lab/RFBoost/f7b71a7a42ef26f51dceb7c8bae6685d202aae17/source/widar3/all_5500_top6_1000_2560/train_label.npy -------------------------------------------------------------------------------- /source/widar3/all_5500_top6_1000_2560/valid_filename.npy: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aiot-lab/RFBoost/f7b71a7a42ef26f51dceb7c8bae6685d202aae17/source/widar3/all_5500_top6_1000_2560/valid_filename.npy -------------------------------------------------------------------------------- /source/widar3/all_5500_top6_1000_2560/valid_label.npy: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aiot-lab/RFBoost/f7b71a7a42ef26f51dceb7c8bae6685d202aae17/source/widar3/all_5500_top6_1000_2560/valid_label.npy -------------------------------------------------------------------------------- /source/widar3/small_1100_top6_1000_2560/test_filename.npy: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aiot-lab/RFBoost/f7b71a7a42ef26f51dceb7c8bae6685d202aae17/source/widar3/small_1100_top6_1000_2560/test_filename.npy -------------------------------------------------------------------------------- /source/widar3/small_1100_top6_1000_2560/test_label.npy: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aiot-lab/RFBoost/f7b71a7a42ef26f51dceb7c8bae6685d202aae17/source/widar3/small_1100_top6_1000_2560/test_label.npy -------------------------------------------------------------------------------- /source/widar3/small_1100_top6_1000_2560/train_filename.npy: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aiot-lab/RFBoost/f7b71a7a42ef26f51dceb7c8bae6685d202aae17/source/widar3/small_1100_top6_1000_2560/train_filename.npy -------------------------------------------------------------------------------- /source/widar3/small_1100_top6_1000_2560/train_label.npy: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aiot-lab/RFBoost/f7b71a7a42ef26f51dceb7c8bae6685d202aae17/source/widar3/small_1100_top6_1000_2560/train_label.npy -------------------------------------------------------------------------------- /source/widar3/small_1100_top6_1000_2560/valid_filename.npy: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aiot-lab/RFBoost/f7b71a7a42ef26f51dceb7c8bae6685d202aae17/source/widar3/small_1100_top6_1000_2560/valid_filename.npy -------------------------------------------------------------------------------- /source/widar3/small_1100_top6_1000_2560/valid_label.npy: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aiot-lab/RFBoost/f7b71a7a42ef26f51dceb7c8bae6685d202aae17/source/widar3/small_1100_top6_1000_2560/valid_label.npy 
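The split folders above (e.g. `source/widar3/small_1100_top6_1000_2560/`) pair each `*_filename.npy` with a `*_label.npy`. The snippet below is an illustrative sketch, not part of the repo: it assumes each filename array lists sample identifiers and each label array holds the matching integer labels in the same order, and simply loads one split to check that the two arrays are aligned. Run it from the repository root, and adjust `split_dir` (a hypothetical path for this example) if your files live elsewhere.
```python
import numpy as np
from collections import Counter

# Hypothetical location; point this at whichever split you downloaded or generated.
split_dir = "source/widar3/small_1100_top6_1000_2560"

train_files = np.load(f"{split_dir}/train_filename.npy", allow_pickle=True)
train_labels = np.load(f"{split_dir}/train_label.npy", allow_pickle=True)

# The two arrays are expected to be index-aligned: label i belongs to filename i.
assert len(train_files) == len(train_labels)
print(f"{len(train_files)} training samples")
print("label counts:", Counter(np.asarray(train_labels).ravel().tolist()))
```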
-------------------------------------------------------------------------------- /ubicomp24-rfboost-final.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aiot-lab/RFBoost/f7b71a7a42ef26f51dceb7c8bae6685d202aae17/ubicomp24-rfboost-final.pdf --------------------------------------------------------------------------------
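As a closing example, the snippet below sketches how the DFS pipeline in `source/utils.py` can be exercised on its own. It is a minimal sketch under stated assumptions, not the project's training path: the synthetic single-subcarrier CSI, the 1000 Hz sampling rate, and the 256 output time bins are all choices made for illustration; `get_dfs` and `pad_and_downsample` are the functions defined in `utils.py` above, and the script is assumed to run from `source/` so that `utils` is importable.
```python
import numpy as np
from utils import get_dfs, pad_and_downsample

samp_rate = 1000  # Hz; matches the default used throughout utils.py
t = np.arange(2 * samp_rate) / samp_rate
# Synthetic CSI: a static path plus one moving path producing a ~20 Hz Doppler line.
csi = 1.0 + 0.5 * np.exp(2j * np.pi * 20 * t)

# get_dfs returns (None, None, spectrogram) when spec_type="dfs".
_, _, spec = get_dfs(csi, None, samp_rate=samp_rate, window_size=256,
                     window_step=10, agg_type="ms", spec_type="dfs",
                     cache_folder=None)
spec = np.array(spec)
print("DFS shape (freq bins x time bins):", spec.shape)

# Fix the time axis to a constant number of bins before feeding a model.
spec_fixed = pad_and_downsample(spec, 256, axis=1)
print("after pad_and_downsample:", spec_fixed.shape)
```
The same call also accepts `spec_type="acf"` or `"acf+dfs"` to obtain the autocorrelation-based spectrogram variants defined alongside `get_dfs_` in `utils.py`.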