├── LICENSE
├── README.md
├── algorithm
│   ├── MTF-LSTM-SP-test.py
│   ├── MTF-LSTM-SP.py
│   ├── MTF-LSTM-test.py
│   └── MTF-LSTM.py
├── data_process
│   └── NGSIM
│       ├── add_v_a
│       │   └── add_v_a.py
│       ├── final_DP
│       │   └── final_DP.py
│       ├── merge_data
│       │   └── merge_data.py
│       ├── preprocess
│       │   └── preprocess.py
│       └── trajectory_denoise
│           └── trajectory_denoise.py
└── img
    ├── NGSIM_data.png
    ├── N_step1.png
    ├── N_step2.png
    ├── N_step3.png
    └── N_step4.png

/LICENSE:
--------------------------------------------------------------------------------
BSD 3-Clause License

Copyright (c) 2022, Fanghz_Colin
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

* Redistributions of source code must retain the above copyright notice, this
  list of conditions and the following disclaimer.

* Redistributions in binary form must reproduce the above copyright notice,
  this list of conditions and the following disclaimer in the documentation
  and/or other materials provided with the distribution.

* Neither the name of the copyright holder nor the names of its
  contributors may be used to endorse or promote products derived from
  this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED.
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# MTF-LSTM

## Introduction
This is the code for the paper "Vehicle Trajectory Prediction Based on Mixed Teaching Force Long Short-term Memory", implemented in PyTorch.

## Package
numpy 1.23.4
torch 1.10.1
sklearn 0.0
scikit-learn 0.24.2


## Data Process
The dataset used in this paper is the NGSIM US101 and I-80 section data.
We provide the raw data, processed data, and trained models on Baidu Cloud Disk: https://pan.baidu.com/s/17j0gR-vVW2chDv0JAZlJZQ
Extraction code: xklg

### NGSIM data processing
The NGSIM data processing flow is shown in the following figure:

![image](./img/NGSIM_data.png)

Step 1: Trajectory denoising. Put the US101 and I-80 raw data into the following folder and run "trajectory_denoise.py".

![image](./img/N_step1.png)

Step 2: Filter features; run "preprocess.py".

![image](./img/N_step2.png)

Step 3: Add new features; run "add_v_a.py".

![image](./img/N_step3.png)

Step 4: Extract the required 8 s trajectory sequences using the sliding-window method; run "final_DP.py".

![image](./img/N_step4.png)

Step 5: Final consolidation of the US101 and I-80 datasets.
To keep the data balanced and make full use of the dataset, 10 groups of datasets are randomly sampled, and each group is split into training, test, and validation sets at a ratio of 6:2:2; run "merge_data.py".

## Model training and testing

MTF-LSTM model training: run "MTF-LSTM.py"

MTF-LSTM-SP model training: run "MTF-LSTM-SP.py"

The trained MTF-LSTM and MTF-LSTM-SP models from the paper are saved in the folder /algorithm/models and can be run directly to see the training results.
Because the models take up a lot of storage space, they are hosted on the cloud disk and can be downloaded through the link above.

## Citation of papers

Huazhen. Fang, Li. Liu, Xiaofeng. Xiao, Qing. Gu, and Yu. Meng, “Vehicle trajectory prediction based on mixed teaching force long short-term memory,” Journal of Transportation Systems Engineering and Information Technology, vol. 23, no. 04, pp. 80–87, 2023.

DOI: 10.16097/j.cnki.1009-6744.2023.04.009

## Postscript

If you have any questions, please feel free to email fhz_colin@xs.ustb.edu.cn.
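
The 6:2:2 split described in Step 5 can be sketched as below. This is only a minimal illustration and not the contents of "merge_data.py": the function name `split_622`, its `seed` parameter, and the use of Python's standard `random` module are assumptions made for the example.

```python
import random


def split_622(num_samples, seed=0):
    """Hypothetical helper: shuffle sample indices and split them 6:2:2."""
    indices = list(range(num_samples))
    random.Random(seed).shuffle(indices)  # seeded so the split is repeatable

    # Integer arithmetic avoids floating-point rounding in the split sizes.
    n_train = num_samples * 6 // 10
    n_test = num_samples * 2 // 10

    train_idx = indices[:n_train]
    test_idx = indices[n_train:n_train + n_test]
    valid_idx = indices[n_train + n_test:]  # the remainder goes to validation
    return train_idx, test_idx, valid_idx


if __name__ == "__main__":
    # Ten groups, each drawn with a different seed (an assumption about
    # how the groups are sampled).
    for group in range(10):
        train_idx, test_idx, valid_idx = split_622(1000, seed=group)
        print(group, len(train_idx), len(test_idx), len(valid_idx))
```

Indexing the input and label arrays with the same index lists keeps each sequence aligned with its label across the three sets.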
59 | -------------------------------------------------------------------------------- /algorithm/MTF-LSTM-SP-test.py: -------------------------------------------------------------------------------- 1 | """ 2 | @Author: Fhz 3 | @Create Date: 2022/4/18 15:37 4 | @File: LSTM_encoder_decoder.py 5 | @Description: 6 | @Modify Person Date: 7 | """ 8 | import torch 9 | import random 10 | import numpy as np 11 | import torch.nn as nn 12 | import matplotlib.pyplot as plt 13 | from torch.utils.data import DataLoader, TensorDataset 14 | 15 | 16 | def feature_scaling(x_seq): 17 | x_min = x_seq.min() 18 | x_max = x_seq.max() 19 | if x_min == x_max: 20 | x_new = x_min * np.ones(shape=x_seq.shape) 21 | else: 22 | x_new = (2 * x_seq - (x_max + x_min)) / (x_max - x_min) 23 | return x_new, x_min, x_max 24 | 25 | 26 | def de_feature_scaling(x_new, x_min, x_max): 27 | x_ori = np.ones(shape=(len(x_max), 40, 44)) 28 | for i in range(len(x_max)): 29 | for j in range(3): 30 | if x_min[i, j] == x_max[i, j]: 31 | x_ori[i, :, j] = x_min[i, j] 32 | else: 33 | x_ori[i, :, j] = (x_new[i, :, j] * (x_max[i, j] - x_min[i, j]) + x_max[i, j] + x_min[i, j]) / 2 34 | 35 | return x_ori 36 | 37 | 38 | def data_diff(data): 39 | data_diff = np.diff(data) 40 | data_0 = data[0] 41 | return data_0, data_diff 42 | 43 | 44 | def de_data_diff(data_0, data_diff): 45 | data = np.ones(shape=(len(data_diff), 40, 44)) 46 | data[:, 0, :] = data_0 47 | for i in range(39): 48 | data[:, i + 1, :] = data[:, i, :] + data_diff[:, i, :] 49 | 50 | return data 51 | 52 | 53 | def dataNormal(seq): 54 | seq_len = len(seq) 55 | seq_norm = np.zeros(shape=(seq_len, 39, 44)) 56 | seq_norm_feature = np.zeros(shape=(seq_len, 3, 44)) 57 | 58 | for i in range(seq_len): 59 | for j in range(44): 60 | seq_tmp = seq[i, :, j] # initial seq 61 | seq_tmp_FS, seq_tmp_min, seq_tmp_max = feature_scaling(seq_tmp) # feature scaling 62 | seq_tmp_0, seq_tmp_diff = data_diff(seq_tmp_FS) # seq diff 63 | seq_norm[i, :, j] = seq_tmp_diff # store norm 
data 64 | 65 | # store norm feature data 66 | seq_norm_feature[i, 0, j] = seq_tmp_min 67 | seq_norm_feature[i, 1, j] = seq_tmp_max 68 | seq_norm_feature[i, 2, j] = seq_tmp_0 69 | 70 | return seq_norm, seq_norm_feature 71 | 72 | 73 | def get_train_dataset(train_data, batch_size): 74 | # Predict 2 s; slide a dynamic window over the data for generalization 75 | x = train_data[:, :14, :] 76 | y = train_data[:, 14:, :] 77 | 78 | x_data = torch.from_numpy(x.copy()) 79 | y_data = torch.from_numpy(y.copy()) 80 | 81 | x_data = x_data.to(torch.float32) 82 | y_data = y_data.to(torch.float32) 83 | 84 | train_dataset = TensorDataset(x_data, y_data) 85 | train_loader = DataLoader(train_dataset, batch_size=batch_size, drop_last=True, shuffle=True) 86 | 87 | return train_loader 88 | 89 | 90 | def get_test_dataset(test_data, test_seq_NF, batch_size): 91 | x_data = torch.from_numpy(test_data.copy()) 92 | x_data = x_data.to(torch.float32) 93 | 94 | y_data = torch.from_numpy(test_seq_NF.copy()) 95 | y_data = y_data.to(torch.float32) 96 | 97 | test_dataset = TensorDataset(x_data, y_data) 98 | test_loader = DataLoader(test_dataset, batch_size=batch_size, drop_last=True, shuffle=True) 99 | 100 | return test_loader 101 | 102 | 103 | def LoadData(num): 104 | train_x = np.load(file="datasets/X_train_{}.npy".format(num)) 105 | train_y = np.load(file="datasets/y_train_{}.npy".format(num)) 106 | 107 | test_x = np.load(file="datasets/X_test_{}.npy".format(num)) 108 | test_y = np.load(file="datasets/y_test_{}.npy".format(num)) 109 | 110 | valid_x = np.load(file="datasets/X_valid_{}.npy".format(num)) 111 | valid_y = np.load(file="datasets/y_valid_{}.npy".format(num)) 112 | 113 | return train_x, train_y, test_x, test_y, valid_x, valid_y 114 | 115 | 116 | class lstm_encoder(nn.Module): 117 | def __init__(self, input_size, hidden_size, num_layers=4): 118 | super(lstm_encoder, self).__init__() 119 | self.num_layers = num_layers 120 | self.input_size = input_size 121 | self.hidden_size = hidden_size 122 | self.lstm = nn.LSTM(batch_first=True, 123 
| input_size=self.input_size, 124 | hidden_size=self.hidden_size, 125 | num_layers=self.num_layers, 126 | dropout=0.2 127 | ) 128 | 129 | def forward(self, input): 130 | lstm_out, self.hidden = self.lstm(input) 131 | return lstm_out, self.hidden 132 | 133 | 134 | class lstm_decoder(nn.Module): 135 | def __init__(self, input_size, hidden_size, num_layers=4): 136 | super(lstm_decoder, self).__init__() 137 | self.num_layers = num_layers 138 | self.input_size = input_size 139 | self.hidden_size = hidden_size 140 | self.lstm = nn.LSTM(batch_first=True, 141 | input_size=self.input_size, 142 | hidden_size=self.hidden_size, 143 | num_layers=self.num_layers, 144 | dropout=0.2 145 | ) 146 | self.fc = nn.Linear(self.hidden_size, self.input_size) 147 | 148 | def forward(self, input, encoder_hidden_states): 149 | lstm_out, self.hidden = self.lstm(input.unsqueeze(1), encoder_hidden_states) 150 | output = self.fc(lstm_out.squeeze(1)) 151 | return output, self.hidden 152 | 153 | 154 | class MyLstm(nn.Module): 155 | def __init__(self, input_size=44, hidden_size=128, batch_size=1000, target_len=25, TR=0.1): 156 | super(MyLstm, self).__init__() 157 | self.input_size = input_size 158 | self.hidden_size = hidden_size 159 | self.batch_size = batch_size 160 | self.target_len = target_len 161 | self.TR = TR 162 | 163 | self.encoder = lstm_encoder(input_size=self.input_size, hidden_size=self.hidden_size) 164 | self.decoder = lstm_decoder(input_size=self.input_size, hidden_size=self.hidden_size) 165 | 166 | def forward(self, input, target, training_prediction="recursive"): 167 | 168 | encoder_output, encoder_hidden = self.encoder(input) 169 | decoder_input = input[:, -1, :] 170 | decoder_hidden = encoder_hidden 171 | 172 | outputs = torch.zeros(input.shape[0], self.target_len, input.shape[2]) 173 | 174 | if training_prediction == "recursive": 175 | # recursive 176 | for t in range(self.target_len): 177 | decoder_output, decoder_hidden = self.decoder(decoder_input, decoder_hidden) 178 | 
outputs[:, t, :] = decoder_output 179 | decoder_input = decoder_output 180 | 181 | if training_prediction == "teacher_forcing": 182 | # teacher_forcing 183 | for t in range(self.target_len): 184 | decoder_output, decoder_hidden = self.decoder(decoder_input, decoder_hidden) 185 | outputs[:, t, :] = decoder_output 186 | decoder_input = target[:, t, :] 187 | 188 | if training_prediction == "mixed_teacher_forcing": 189 | # mixed_teacher_forcing 190 | teacher_forcing_ratio = self.TR 191 | for t in range(self.target_len): 192 | decoder_output, decoder_hidden = self.decoder(decoder_input, decoder_hidden) 193 | outputs[:, t, :] = decoder_output 194 | 195 | if random.random() < teacher_forcing_ratio: 196 | decoder_input = target[:, t, :] 197 | else: 198 | decoder_input = decoder_output 199 | 200 | return outputs 201 | 202 | 203 | class RMSELoss(torch.nn.Module): 204 | def __init__(self): 205 | super(RMSELoss,self).__init__() 206 | 207 | def forward(self,x,y): 208 | criterion = nn.MSELoss(reduction="none") 209 | loss_sum = torch.sqrt(torch.sum(criterion(x, y), axis=-1)) 210 | loss = torch.mean(loss_sum) 211 | 212 | return loss 213 | 214 | 215 | class RMSEsum(torch.nn.Module): 216 | def __init__(self): 217 | super(RMSEsum,self).__init__() 218 | 219 | def forward(self,x,y): 220 | criterion = nn.MSELoss(reduction="none") 221 | loss_sum = torch.sqrt(torch.sum(criterion(x, y), axis=-1)) 222 | 223 | return loss_sum 224 | 225 | 226 | if __name__ == '__main__': 227 | 228 | result44 = np.zeros(shape=(10, 9, 3, 7)) 229 | 230 | for dataset_num in range(10): 231 | seq_train, y_train, seq_test, y_test, seq_valid, y_valid = LoadData(dataset_num) 232 | 233 | seq_train = seq_train[:, ::2, :] 234 | seq_test = seq_test[:, ::2, :] 235 | seq_valid = seq_valid[:, ::2, :] 236 | 237 | 238 | 239 | x_norm_train, x_norm_train_feature = dataNormal(seq_train) 240 | x_norm_test, x_norm_test_feature = dataNormal(seq_test) 241 | x_norm_valid, x_norm_valid_feature = dataNormal(seq_valid) 242 | 243 | 
batch_size = 1024 244 | epochs = 1 245 | learning_rate = 0.001 246 | 247 | train_loader = get_train_dataset(x_norm_train, batch_size) 248 | test_loader = get_train_dataset(x_norm_test, batch_size) 249 | valid_loader = get_test_dataset(x_norm_valid, x_norm_valid_feature, batch_size) 250 | 251 | for tr in range(9): 252 | TR = (tr + 1)/10 253 | 254 | for times in range(3): 255 | model_name = "models_all_data/LSTM_ED_D{}_R{}_T{}.pkl".format(dataset_num, tr+1, times+1) 256 | print(model_name) 257 | 258 | model = MyLstm(TR=TR) 259 | rmse_loss = RMSELoss() 260 | rmse_sum = RMSEsum() 261 | optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate, weight_decay=0.0001) 262 | 263 | device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu") 264 | model.to(device) 265 | print(device) 266 | 267 | loss_train = [] 268 | loss_test = [] 269 | loss_test_history = 1000 270 | 271 | MR1 = 0 272 | MR2 = 0 273 | for i in range(epochs): 274 | 275 | 276 | if i == epochs - 1: 277 | # valid loader 278 | loss_tmp = [] 279 | loss_1s_tmp = [] 280 | loss_2s_tmp = [] 281 | loss_3s_tmp = [] 282 | loss_4s_tmp = [] 283 | loss_5s_tmp = [] 284 | 285 | for batch_idx, (x_seq, x_seq_NF) in enumerate(valid_loader): 286 | x_data = x_seq.to(device) 287 | x_data = x_data.to(torch.float32) 288 | 289 | x_seq_NF = x_seq_NF.to(device) 290 | x_seq_NF = x_seq_NF.to(torch.float32) 291 | 292 | # muti-steps prediction 293 | x_data_ori = x_data.clone() 294 | x_tmp = x_data[:, :14, :] 295 | y_tmp = x_data[:, 14:, :] 296 | # print(model_name) 297 | 298 | model.load_state_dict(torch.load(model_name)) 299 | 300 | with torch.no_grad(): 301 | pred = model(x_tmp, y_tmp, training_prediction="recursive") 302 | pred = pred.to(device) 303 | # loss = mse_loss(y_tmp[:, :, :2], pred[:, :, :2]) 304 | # loss_tmp.append(loss.item()) 305 | 306 | x_data[:, 14:, :] = pred 307 | x_seq_NF = x_seq_NF.cpu().numpy() 308 | 309 | pred_seq_np = x_data.cpu().numpy() 310 | pred_seq_dediff = de_data_diff(x_seq_NF[:, 2, :], 
pred_seq_np) 311 | pred_seq_ori = de_feature_scaling(pred_seq_dediff, x_seq_NF[:, 0, :], x_seq_NF[:, 1, :]) 312 | 313 | x_data_ori_np = x_data_ori.cpu().numpy() 314 | x_data_ori_dediff = de_data_diff(x_seq_NF[:, 2, :], x_data_ori_np) 315 | x_data_oo = de_feature_scaling(x_data_ori_dediff, x_seq_NF[:, 0, :], x_seq_NF[:, 1, :]) 316 | 317 | pred_seq_ori_torch = torch.from_numpy(pred_seq_ori) 318 | x_data_oo_torch = torch.from_numpy(x_data_oo) 319 | 320 | pred_seq_ori_torch = pred_seq_ori_torch.to(torch.float32) 321 | x_data_oo_torch = x_data_oo_torch.to(torch.float32) 322 | 323 | # Get RMSE_loss of each prediction step 324 | loss_1s = rmse_loss(x_data_oo_torch[:, 19, :2], pred_seq_ori_torch[:, 19, :2]) 325 | loss_2s = rmse_loss(x_data_oo_torch[:, 24, :2], pred_seq_ori_torch[:, 24, :2]) 326 | loss_3s = rmse_loss(x_data_oo_torch[:, 29, :2], pred_seq_ori_torch[:, 29, :2]) 327 | loss_4s = rmse_loss(x_data_oo_torch[:, 34, :2], pred_seq_ori_torch[:, 34, :2]) 328 | loss_5s = rmse_loss(x_data_oo_torch[:, 39, :2], pred_seq_ori_torch[:, 39, :2]) 329 | 330 | loss = rmse_loss(x_data_oo_torch[:, 15:, :2], pred_seq_ori_torch[:, 15:, :2]) 331 | 332 | loss_1s_tmp.append(loss_1s) 333 | loss_2s_tmp.append(loss_2s) 334 | loss_3s_tmp.append(loss_3s) 335 | loss_4s_tmp.append(loss_4s) 336 | loss_5s_tmp.append(loss_5s) 337 | 338 | loss_tmp.append(loss.item()) 339 | 340 | rmse_5s_ = rmse_sum(x_data_oo_torch[:, 39, :2], pred_seq_ori_torch[:, 39, :2]) 341 | rmse_5s = np.array(rmse_5s_) 342 | 343 | MR1 = MR1 + np.sum(rmse_5s>=2) 344 | MR2 = MR2 + np.sum(rmse_5s<=2) 345 | 346 | loss_1s_tmp_np = np.array(loss_1s_tmp) 347 | loss_2s_tmp_np = np.array(loss_2s_tmp) 348 | loss_3s_tmp_np = np.array(loss_3s_tmp) 349 | loss_4s_tmp_np = np.array(loss_4s_tmp) 350 | loss_5s_tmp_np = np.array(loss_5s_tmp) 351 | 352 | loss_1s_mean = loss_1s_tmp_np.mean() 353 | loss_2s_mean = loss_2s_tmp_np.mean() 354 | loss_3s_mean = loss_3s_tmp_np.mean() 355 | loss_4s_mean = loss_4s_tmp_np.mean() 356 | loss_5s_mean = 
loss_5s_tmp_np.mean() 357 | 358 | loss_tmp_np = np.array(loss_tmp) 359 | loss_mean = loss_tmp_np.mean() 360 | 361 | result44[dataset_num, tr, times, 0] = loss_1s_mean 362 | result44[dataset_num, tr, times, 1] = loss_2s_mean 363 | result44[dataset_num, tr, times, 2] = loss_3s_mean 364 | result44[dataset_num, tr, times, 3] = loss_4s_mean 365 | result44[dataset_num, tr, times, 4] = loss_5s_mean 366 | result44[dataset_num, tr, times, 5] = loss_mean 367 | result44[dataset_num, tr, times, 6] = MR2/(MR1+MR2) 368 | 369 | np.save(file="LSTM_5_result.npy", arr=result44) 370 | 371 | -------------------------------------------------------------------------------- /algorithm/MTF-LSTM-SP.py: -------------------------------------------------------------------------------- 1 | """ 2 | @Author: Fhz 3 | @Create Date: 2022/11/6 15:37 4 | @File: LSTM_encoder_decoder.py 5 | @Description: 6 | @Modify Person Date: 7 | """ 8 | import torch 9 | import random 10 | import numpy as np 11 | import torch.nn as nn 12 | from torch.utils.data import DataLoader, TensorDataset 13 | 14 | 15 | def feature_scaling(x_seq): 16 | x_min = x_seq.min() 17 | x_max = x_seq.max() 18 | if x_min == x_max: 19 | x_new = x_min * np.ones(shape=x_seq.shape) 20 | else: 21 | x_new = (2 * x_seq - (x_max + x_min)) / (x_max - x_min) 22 | return x_new, x_min, x_max 23 | 24 | 25 | def de_feature_scaling(x_new, x_min, x_max): 26 | x_ori = np.ones(shape=(len(x_max), 40, 44)) 27 | for i in range(len(x_max)): 28 | for j in range(3): 29 | if x_min[i, j] == x_max[i, j]: 30 | x_ori[i, :, j] = x_min[i, j] 31 | else: 32 | x_ori[i, :, j] = (x_new[i, :, j] * (x_max[i, j] - x_min[i, j]) + x_max[i, j] + x_min[i, j]) / 2 33 | 34 | return x_ori 35 | 36 | 37 | def data_diff(data): 38 | data_diff = np.diff(data) 39 | data_0 = data[0] 40 | return data_0, data_diff 41 | 42 | 43 | def de_data_diff(data_0, data_diff): 44 | data = np.ones(shape=(len(data_diff), 40, 44)) 45 | data[:, 0, :] = data_0 46 | for i in range(39): 47 | data[:, i + 1, :] = 
data[:, i, :] + data_diff[:, i, :] 48 | 49 | return data 50 | 51 | 52 | def dataNormal(seq): 53 | seq_len = len(seq) 54 | seq_norm = np.zeros(shape=(seq_len, 39, 44)) 55 | seq_norm_feature = np.zeros(shape=(seq_len, 3, 44)) 56 | 57 | for i in range(seq_len): 58 | for j in range(44): 59 | seq_tmp = seq[i, :, j] # initial seq 60 | seq_tmp_FS, seq_tmp_min, seq_tmp_max = feature_scaling(seq_tmp) # feature scaling 61 | seq_tmp_0, seq_tmp_diff = data_diff(seq_tmp_FS) # seq diff 62 | seq_norm[i, :, j] = seq_tmp_diff # store norm data 63 | 64 | # store norm feature data 65 | seq_norm_feature[i, 0, j] = seq_tmp_min 66 | seq_norm_feature[i, 1, j] = seq_tmp_max 67 | seq_norm_feature[i, 2, j] = seq_tmp_0 68 | 69 | return seq_norm, seq_norm_feature 70 | 71 | 72 | def get_train_dataset(train_data, batch_size): 73 | x = train_data[:, :14, :] 74 | y = train_data[:, 14:, :] 75 | 76 | x_data = torch.from_numpy(x.copy()) 77 | y_data = torch.from_numpy(y.copy()) 78 | 79 | x_data = x_data.to(torch.float32) 80 | y_data = y_data.to(torch.float32) 81 | 82 | train_dataset = TensorDataset(x_data, y_data) 83 | train_loader = DataLoader(train_dataset, batch_size=batch_size, drop_last=True, shuffle=True) 84 | 85 | return train_loader 86 | 87 | 88 | def get_test_dataset(test_data, test_seq_NF, batch_size): 89 | x_data = torch.from_numpy(test_data.copy()) 90 | x_data = x_data.to(torch.float32) 91 | 92 | y_data = torch.from_numpy(test_seq_NF.copy()) 93 | y_data = y_data.to(torch.float32) 94 | 95 | test_dataset = TensorDataset(x_data, y_data) 96 | test_loader = DataLoader(test_dataset, batch_size=batch_size, drop_last=True, shuffle=True) 97 | 98 | return test_loader 99 | 100 | 101 | def LoadData(num): 102 | train_x = np.load(file="../data_process/NGSIM/merge_data/X_train_{}.npy".format(num)) 103 | train_y = np.load(file="../data_process/NGSIM/merge_data/y_train_{}.npy".format(num)) 104 | 105 | test_x = np.load(file="../data_process/NGSIM/merge_data/X_test_{}.npy".format(num)) 106 | test_y = 
np.load(file="../data_process/NGSIM/merge_data/y_test_{}.npy".format(num)) 107 | 108 | valid_x = np.load(file="../data_process/NGSIM/merge_data/X_valid_{}.npy".format(num)) 109 | valid_y = np.load(file="../data_process/NGSIM/merge_data/y_valid_{}.npy".format(num)) 110 | 111 | return train_x, train_y, test_x, test_y, valid_x, valid_y 112 | 113 | 114 | class lstm_encoder(nn.Module): 115 | def __init__(self, input_size, hidden_size, num_layers=4): 116 | super(lstm_encoder, self).__init__() 117 | self.num_layers = num_layers 118 | self.input_size = input_size 119 | self.hidden_size = hidden_size 120 | self.lstm = nn.LSTM(batch_first=True, 121 | input_size=self.input_size, 122 | hidden_size=self.hidden_size, 123 | num_layers=self.num_layers, 124 | dropout=0.2 125 | ) 126 | 127 | def forward(self, input): 128 | lstm_out, self.hidden = self.lstm(input) 129 | return lstm_out, self.hidden 130 | 131 | 132 | class lstm_decoder(nn.Module): 133 | def __init__(self, input_size, hidden_size, num_layers=4): 134 | super(lstm_decoder, self).__init__() 135 | self.num_layers = num_layers 136 | self.input_size = input_size 137 | self.hidden_size = hidden_size 138 | self.lstm = nn.LSTM(batch_first=True, 139 | input_size=self.input_size, 140 | hidden_size=self.hidden_size, 141 | num_layers=self.num_layers, 142 | dropout=0.2 143 | ) 144 | self.fc = nn.Linear(self.hidden_size, self.input_size) 145 | 146 | def forward(self, input, encoder_hidden_states): 147 | lstm_out, self.hidden = self.lstm(input.unsqueeze(1), encoder_hidden_states) 148 | output = self.fc(lstm_out.squeeze(1)) 149 | return output, self.hidden 150 | 151 | 152 | class MyLstm(nn.Module): 153 | def __init__(self, input_size=44, hidden_size=128, target_len=25, TR=0.1): 154 | super(MyLstm, self).__init__() 155 | self.input_size = input_size 156 | self.hidden_size = hidden_size 157 | self.target_len = target_len 158 | self.TR = TR 159 | 160 | self.encoder = lstm_encoder(input_size=self.input_size, hidden_size=self.hidden_size) 
161 | self.decoder = lstm_decoder(input_size=self.input_size, hidden_size=self.hidden_size) 162 | 163 | def forward(self, input, target, training_prediction="recursive"): 164 | 165 | encoder_output, encoder_hidden = self.encoder(input) 166 | decoder_input = input[:, -1, :] 167 | decoder_hidden = encoder_hidden 168 | 169 | outputs = torch.zeros(input.shape[0], self.target_len, input.shape[2]) 170 | 171 | if training_prediction == "recursive": 172 | # recursive 173 | for t in range(self.target_len): 174 | decoder_output, decoder_hidden = self.decoder(decoder_input, decoder_hidden) 175 | outputs[:, t, :] = decoder_output 176 | decoder_input = decoder_output 177 | 178 | if training_prediction == "teacher_forcing": 179 | # teacher_forcing 180 | for t in range(self.target_len): 181 | decoder_output, decoder_hidden = self.decoder(decoder_input, decoder_hidden) 182 | outputs[:, t, :] = decoder_output 183 | decoder_input = target[:, t, :] 184 | 185 | if training_prediction == "mixed_teacher_forcing": 186 | # mixed_teacher_forcing 187 | teacher_forcing_ratio = self.TR 188 | for t in range(self.target_len): 189 | decoder_output, decoder_hidden = self.decoder(decoder_input, decoder_hidden) 190 | outputs[:, t, :] = decoder_output 191 | 192 | if random.random() < teacher_forcing_ratio: 193 | decoder_input = target[:, t, :] 194 | else: 195 | decoder_input = decoder_output 196 | 197 | return outputs 198 | 199 | 200 | if __name__ == '__main__': 201 | 202 | for dataset_num in range(5): 203 | seq_train, y_train, seq_test, y_test, seq_valid, y_valid = LoadData(dataset_num) 204 | 205 | seq_train = seq_train[:, ::2, :] 206 | seq_test = seq_test[:, ::2, :] 207 | seq_valid = seq_valid[:, ::2, :] 208 | 209 | x_norm_train, x_norm_train_feature = dataNormal(seq_train) 210 | x_norm_test, x_norm_test_feature = dataNormal(seq_test) 211 | x_norm_valid, x_norm_valid_feature = dataNormal(seq_valid) 212 | 213 | batch_size = 1024 214 | epochs = 100 215 | learning_rate = 0.001 216 | 217 | 
train_loader = get_train_dataset(x_norm_train, batch_size) 218 | test_loader = get_train_dataset(x_norm_test, batch_size) 219 | valid_loader = get_test_dataset(x_norm_valid, x_norm_valid_feature, batch_size) 220 | 221 | for tr in range(9): 222 | TR = (tr + 1)/10 223 | 224 | for times in range(3): 225 | model_name = "LSTM_ED_D{}_R{}_T{}.pkl".format(dataset_num, tr+1, times+1) 226 | print(model_name) 227 | 228 | model = MyLstm(TR=TR) 229 | mse_loss = nn.MSELoss(reduction='sum') 230 | mse_loss1 = nn.MSELoss() 231 | optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate, weight_decay=0.0001) 232 | 233 | device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu") 234 | model.to(device) 235 | print(device) 236 | 237 | loss_train = [] 238 | loss_test = [] 239 | loss_test_history = 100 240 | for i in range(epochs): 241 | 242 | # trian loader 243 | loss_tmp = [] 244 | for batch_idx, (x_seq, y_label) in enumerate(train_loader): 245 | x_seq = x_seq.to(device) 246 | y_label = y_label.to(device) 247 | 248 | x_seq = x_seq.to(torch.float32) 249 | y_label = y_label.to(torch.float32) 250 | 251 | pred = model(x_seq, y_label, training_prediction="mixed_teacher_forcing") 252 | pred = pred.to(device) 253 | loss = mse_loss(y_label[:, :, :2], pred[:, :, :2]) 254 | 255 | optimizer.zero_grad() 256 | loss.backward() 257 | optimizer.step() 258 | 259 | loss_tmp.append(loss.item()) 260 | 261 | loss_tmp_np = np.array(loss_tmp) 262 | loss_train_mean = loss_tmp_np.mean() 263 | print("Epoch:{},Training loss: {}".format(i, loss_train_mean)) 264 | loss_train.append(loss_train_mean) 265 | 266 | # test loader 267 | loss_tmp = [] 268 | for batch_idx, (x_seq, y_label) in enumerate(test_loader): 269 | x_seq = x_seq.to(device) 270 | y_label = y_label.to(device) 271 | 272 | x_seq = x_seq.to(torch.float32) 273 | y_label = y_label.to(torch.float32) 274 | 275 | with torch.no_grad(): 276 | pred = model(x_seq, y_label, training_prediction="mixed_teacher_forcing") 277 | pred = 
pred.to(device) 278 | loss = mse_loss(y_label[:, :, :2], pred[:, :, :2]) 279 | 280 | loss_tmp.append(loss.item()) 281 | 282 | loss_tmp_np = np.array(loss_tmp) 283 | loss_test_mean = loss_tmp_np.mean() 284 | print("Epoch:{},Testing loss: {}".format(i, loss_test_mean)) 285 | loss_test.append(loss_test_mean) 286 | 287 | if loss_test_history > loss_test_mean: 288 | if loss_test_mean / loss_train_mean < 1.05: 289 | print("The model is not over-fitting, save it.") 290 | torch.save(model.state_dict(), model_name) 291 | loss_test_history = loss_test_mean 292 | 293 | if i == epochs - 1: 294 | # valid loader 295 | loss_tmp = [] 296 | loss_1s_tmp = [] 297 | loss_2s_tmp = [] 298 | loss_3s_tmp = [] 299 | loss_4s_tmp = [] 300 | loss_5s_tmp = [] 301 | 302 | for batch_idx, (x_seq, x_seq_NF) in enumerate(valid_loader): 303 | x_data = x_seq.to(device) 304 | x_data = x_data.to(torch.float32) 305 | 306 | x_seq_NF = x_seq_NF.to(device) 307 | x_seq_NF = x_seq_NF.to(torch.float32) 308 | 309 | # muti-steps prediction 310 | x_data_ori = x_data.clone() 311 | x_tmp = x_data[:, :14, :] 312 | y_tmp = x_data[:, 14:, :] 313 | 314 | model.load_state_dict(torch.load(model_name)) 315 | 316 | with torch.no_grad(): 317 | pred = model(x_tmp, y_tmp, training_prediction="recursive") 318 | 319 | pred = pred.to(device) 320 | loss = mse_loss(y_tmp[:, :, :2], pred[:, :, :2]) 321 | 322 | loss_tmp.append(loss.item()) 323 | 324 | x_data[:, 14:, :] = pred 325 | x_seq_NF = x_seq_NF.cpu().numpy() 326 | 327 | pred_seq_np = x_data.cpu().numpy() 328 | pred_seq_dediff = de_data_diff(x_seq_NF[:, 2, :], pred_seq_np) 329 | pred_seq_ori = de_feature_scaling(pred_seq_dediff, x_seq_NF[:, 0, :], x_seq_NF[:, 1, :]) 330 | 331 | x_data_ori_np = x_data_ori.cpu().numpy() 332 | x_data_ori_dediff = de_data_diff(x_seq_NF[:, 2, :], x_data_ori_np) 333 | x_data_oo = de_feature_scaling(x_data_ori_dediff, x_seq_NF[:, 0, :], x_seq_NF[:, 1, :]) 334 | 335 | pred_seq_ori_torch = torch.from_numpy(pred_seq_ori) 336 | x_data_oo_torch = 
torch.from_numpy(x_data_oo) 337 | 338 | pred_seq_ori_torch = pred_seq_ori_torch.to(torch.float32) 339 | x_data_oo_torch = x_data_oo_torch.to(torch.float32) 340 | 341 | # Get RMSE_loss of each prediction step 342 | loss_1s = mse_loss1(x_data_oo_torch[:, 19, :2], pred_seq_ori_torch[:, 19, :2]) ** 0.5 343 | loss_2s = mse_loss1(x_data_oo_torch[:, 24, :2], pred_seq_ori_torch[:, 24, :2]) ** 0.5 344 | loss_3s = mse_loss1(x_data_oo_torch[:, 29, :2], pred_seq_ori_torch[:, 29, :2]) ** 0.5 345 | loss_4s = mse_loss1(x_data_oo_torch[:, 34, :2], pred_seq_ori_torch[:, 34, :2]) ** 0.5 346 | loss_5s = mse_loss1(x_data_oo_torch[:, 39, :2], pred_seq_ori_torch[:, 39, :2]) ** 0.5 347 | 348 | loss_1s_tmp.append(loss_1s) 349 | loss_2s_tmp.append(loss_2s) 350 | loss_3s_tmp.append(loss_3s) 351 | loss_4s_tmp.append(loss_4s) 352 | loss_5s_tmp.append(loss_5s) 353 | 354 | loss_1s_tmp_np = np.array(loss_1s_tmp) 355 | loss_2s_tmp_np = np.array(loss_2s_tmp) 356 | loss_3s_tmp_np = np.array(loss_3s_tmp) 357 | loss_4s_tmp_np = np.array(loss_4s_tmp) 358 | loss_5s_tmp_np = np.array(loss_5s_tmp) 359 | 360 | loss_1s_mean = loss_1s_tmp_np.mean() 361 | loss_2s_mean = loss_2s_tmp_np.mean() 362 | loss_3s_mean = loss_3s_tmp_np.mean() 363 | loss_4s_mean = loss_4s_tmp_np.mean() 364 | loss_5s_mean = loss_5s_tmp_np.mean() 365 | 366 | loss_tmp_np = np.array(loss_tmp) 367 | loss_valid_mean = loss_tmp_np.mean() 368 | print("Epoch:{},valid loss: {}".format(i, loss_valid_mean)) 369 | -------------------------------------------------------------------------------- /algorithm/MTF-LSTM-test.py: -------------------------------------------------------------------------------- 1 | """ 2 | @Author: Fhz 3 | @Create Date: 2022/4/18 15:37 4 | @File: LSTM_encoder_decoder.py 5 | @Description: 6 | @Modify Person Date: 7 | """ 8 | import torch 9 | import random 10 | import numpy as np 11 | import torch.nn as nn 12 | import matplotlib.pyplot as plt 13 | from torch.utils.data import DataLoader, TensorDataset 14 | 15 | 16 | def 
feature_scaling(x_seq): 17 | x_min = x_seq.min() 18 | x_max = x_seq.max() 19 | if x_min == x_max: 20 | x_new = x_min * np.ones(shape=x_seq.shape) 21 | else: 22 | x_new = (2 * x_seq - (x_max + x_min)) / (x_max - x_min) 23 | return x_new, x_min, x_max 24 | 25 | 26 | def de_feature_scaling(x_new, x_min, x_max): 27 | x_ori = np.ones(shape=(len(x_max), 80, 44)) 28 | for i in range(len(x_max)): 29 | for j in range(3): 30 | if x_min[i, j] == x_max[i, j]: 31 | x_ori[i, :, j] = x_min[i, j] 32 | else: 33 | x_ori[i, :, j] = (x_new[i, :, j] * (x_max[i, j] - x_min[i, j]) + x_max[i, j] + x_min[i, j]) / 2 34 | 35 | return x_ori 36 | 37 | 38 | def data_diff(data): 39 | data_diff = np.diff(data) 40 | data_0 = data[0] 41 | return data_0, data_diff 42 | 43 | 44 | def de_data_diff(data_0, data_diff): 45 | data = np.ones(shape=(len(data_diff), 80, 44)) 46 | data[:, 0, :] = data_0 47 | for i in range(79): 48 | data[:, i + 1, :] = data[:, i, :] + data_diff[:, i, :] 49 | 50 | return data 51 | 52 | 53 | def dataNormal(seq): 54 | seq_len = len(seq) 55 | seq_norm = np.zeros(shape=(seq_len, 79, 44)) 56 | seq_norm_feature = np.zeros(shape=(seq_len, 3, 44)) 57 | 58 | for i in range(seq_len): 59 | for j in range(44): 60 | seq_tmp = seq[i, :, j] # initial seq 61 | seq_tmp_FS, seq_tmp_min, seq_tmp_max = feature_scaling(seq_tmp) # feature scaling 62 | seq_tmp_0, seq_tmp_diff = data_diff(seq_tmp_FS) # seq diff 63 | seq_norm[i, :, j] = seq_tmp_diff # store norm data 64 | 65 | # store norm feature data 66 | seq_norm_feature[i, 0, j] = seq_tmp_min 67 | seq_norm_feature[i, 1, j] = seq_tmp_max 68 | seq_norm_feature[i, 2, j] = seq_tmp_0 69 | 70 | return seq_norm, seq_norm_feature 71 | 72 | 73 | def get_train_dataset(train_data, batch_size): 74 | # Predict 2 s; slide a dynamic window over the data for generalization 75 | x = train_data[:, :29, :] 76 | y = train_data[:, 29:, :] 77 | 78 | x_data = torch.from_numpy(x.copy()) 79 | y_data = torch.from_numpy(y.copy()) 80 | 81 | x_data = x_data.to(torch.float32) 82 | y_data = y_data.to(torch.float32) 83 | 84 
| train_dataset = TensorDataset(x_data, y_data) 85 | train_loader = DataLoader(train_dataset, batch_size=batch_size, drop_last=True, shuffle=True) 86 | 87 | return train_loader 88 | 89 | 90 | def get_test_dataset(test_data, test_seq_NF, batch_size): 91 | x_data = torch.from_numpy(test_data.copy()) 92 | x_data = x_data.to(torch.float32) 93 | 94 | y_data = torch.from_numpy(test_seq_NF.copy()) 95 | y_data = y_data.to(torch.float32) 96 | 97 | test_dataset = TensorDataset(x_data, y_data) 98 | test_loader = DataLoader(test_dataset, batch_size=batch_size, drop_last=True, shuffle=True) 99 | 100 | return test_loader 101 | 102 | 103 | def LoadData(num): 104 | train_x = np.load(file="datasets/X_train_{}.npy".format(num)) 105 | train_y = np.load(file="datasets/y_train_{}.npy".format(num)) 106 | 107 | test_x = np.load(file="datasets/X_test_{}.npy".format(num)) 108 | test_y = np.load(file="datasets/y_test_{}.npy".format(num)) 109 | 110 | valid_x = np.load(file="datasets/X_valid_{}.npy".format(num)) 111 | valid_y = np.load(file="datasets/y_valid_{}.npy".format(num)) 112 | 113 | return train_x, train_y, test_x, test_y, valid_x, valid_y 114 | 115 | 116 | class lstm_encoder(nn.Module): 117 | def __init__(self, input_size, hidden_size, num_layers=4): 118 | super(lstm_encoder, self).__init__() 119 | self.num_layers = num_layers 120 | self.input_size = input_size 121 | self.hidden_size = hidden_size 122 | self.lstm = nn.LSTM(batch_first=True, 123 | input_size=self.input_size, 124 | hidden_size=self.hidden_size, 125 | num_layers=self.num_layers, 126 | dropout=0.2 127 | ) 128 | 129 | def forward(self, input): 130 | lstm_out, self.hidden = self.lstm(input) 131 | return lstm_out, self.hidden 132 | 133 | 134 | class lstm_decoder(nn.Module): 135 | def __init__(self, input_size, hidden_size, num_layers=4): 136 | super(lstm_decoder, self).__init__() 137 | self.num_layers = num_layers 138 | self.input_size = input_size 139 | self.hidden_size = hidden_size 140 | self.lstm = 
nn.LSTM(batch_first=True, 141 | input_size=self.input_size, 142 | hidden_size=self.hidden_size, 143 | num_layers=self.num_layers, 144 | dropout=0.2 145 | ) 146 | self.fc = nn.Linear(self.hidden_size, self.input_size) 147 | 148 | def forward(self, input, encoder_hidden_states): 149 | lstm_out, self.hidden = self.lstm(input.unsqueeze(1), encoder_hidden_states) 150 | output = self.fc(lstm_out.squeeze(1)) 151 | return output, self.hidden 152 | 153 | 154 | class MyLstm(nn.Module): 155 | def __init__(self, input_size=44, hidden_size=128, batch_size=1000, target_len=50, TR=0.1): 156 | super(MyLstm, self).__init__() 157 | self.input_size = input_size 158 | self.hidden_size = hidden_size 159 | self.batch_size = batch_size 160 | self.target_len = target_len 161 | self.TR = TR 162 | 163 | self.encoder = lstm_encoder(input_size=self.input_size, hidden_size=self.hidden_size) 164 | self.decoder = lstm_decoder(input_size=self.input_size, hidden_size=self.hidden_size) 165 | 166 | def forward(self, input, target, training_prediction="recursive"): 167 | 168 | encoder_output, encoder_hidden = self.encoder(input) 169 | decoder_input = input[:, -1, :] 170 | decoder_hidden = encoder_hidden 171 | 172 | outputs = torch.zeros(input.shape[0], self.target_len, input.shape[2]) 173 | 174 | if training_prediction == "recursive": 175 | # recursive 176 | for t in range(self.target_len): 177 | decoder_output, decoder_hidden = self.decoder(decoder_input, decoder_hidden) 178 | outputs[:, t, :] = decoder_output 179 | decoder_input = decoder_output 180 | 181 | if training_prediction == "teacher_forcing": 182 | # teacher_forcing 183 | for t in range(self.target_len): 184 | decoder_output, decoder_hidden = self.decoder(decoder_input, decoder_hidden) 185 | outputs[:, t, :] = decoder_output 186 | decoder_input = target[:, t, :] 187 | 188 | if training_prediction == "mixed_teacher_forcing": 189 | # mixed_teacher_forcing 190 | teacher_forcing_ratio = self.TR 191 | for t in range(self.target_len): 192 | 
decoder_output, decoder_hidden = self.decoder(decoder_input, decoder_hidden) 193 | outputs[:, t, :] = decoder_output 194 | 195 | if random.random() < teacher_forcing_ratio: 196 | decoder_input = target[:, t, :] 197 | else: 198 | decoder_input = decoder_output 199 | 200 | return outputs 201 | 202 | 203 | class RMSELoss(torch.nn.Module): 204 | def __init__(self): 205 | super(RMSELoss,self).__init__() 206 | 207 | def forward(self,x,y): 208 | criterion = nn.MSELoss(reduction="none") 209 | loss_sum = torch.sqrt(torch.sum(criterion(x, y), axis=-1)) 210 | loss = torch.mean(loss_sum) 211 | 212 | return loss 213 | 214 | 215 | class RMSEsum(torch.nn.Module): 216 | def __init__(self): 217 | super(RMSEsum,self).__init__() 218 | 219 | def forward(self,x,y): 220 | criterion = nn.MSELoss(reduction="none") 221 | loss_sum = torch.sqrt(torch.sum(criterion(x, y), axis=-1)) 222 | 223 | return loss_sum 224 | 225 | 226 | if __name__ == '__main__': 227 | 228 | result44 = np.zeros(shape=(10, 9, 3, 7)) 229 | 230 | for dataset_num in range(10): 231 | seq_train, y_train, seq_test, y_test, seq_valid, y_valid = LoadData(dataset_num) 232 | 233 | x_norm_train, x_norm_train_feature = dataNormal(seq_train) 234 | x_norm_test, x_norm_test_feature = dataNormal(seq_test) 235 | x_norm_valid, x_norm_valid_feature = dataNormal(seq_valid) 236 | 237 | batch_size = 1024 238 | epochs = 1 239 | learning_rate = 0.001 240 | 241 | train_loader = get_train_dataset(x_norm_train, batch_size) 242 | test_loader = get_train_dataset(x_norm_test, batch_size) 243 | valid_loader = get_test_dataset(x_norm_valid, x_norm_valid_feature, batch_size) 244 | 245 | for tr in range(9): 246 | TR = (tr + 1)/10 247 | 248 | for times in range(3): 249 | model_name = "models_all_data/LSTM_ED10_D{}_R{}_T{}.pkl".format(dataset_num, tr+1, times+1) 250 | print(model_name) 251 | 252 | model = MyLstm(TR=TR) 253 | rmse_loss = RMSELoss() 254 | rmse_sum = RMSEsum() 255 | optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate, 
weight_decay=0.0001) 256 | 257 | device = torch.device("cuda:1" if torch.cuda.is_available() else "cpu") 258 | model.to(device) 259 | print(device) 260 | 261 | loss_train = [] 262 | loss_test = [] 263 | loss_test_history = 1000 264 | 265 | MR1 = 0 266 | MR2 = 0 267 | for i in range(epochs): 268 | 269 | 270 | if i == epochs - 1: 271 | # valid loader 272 | loss_tmp = [] 273 | loss_1s_tmp = [] 274 | loss_2s_tmp = [] 275 | loss_3s_tmp = [] 276 | loss_4s_tmp = [] 277 | loss_5s_tmp = [] 278 | 279 | for batch_idx, (x_seq, x_seq_NF) in enumerate(valid_loader): 280 | x_data = x_seq.to(device) 281 | x_data = x_data.to(torch.float32) 282 | 283 | x_seq_NF = x_seq_NF.to(device) 284 | x_seq_NF = x_seq_NF.to(torch.float32) 285 | 286 | # multi-step prediction 287 | x_data_ori = x_data.clone() 288 | x_tmp = x_data[:, :29, :] 289 | y_tmp = x_data[:, 29:, :] 290 | # print(model_name) 291 | 292 | model.load_state_dict(torch.load(model_name)) 293 | 294 | with torch.no_grad(): 295 | pred = model(x_tmp, y_tmp, training_prediction="recursive") 296 | pred = pred.to(device) 297 | # loss = mse_loss(y_tmp[:, :, :2], pred[:, :, :2]) 298 | # loss_tmp.append(loss.item()) 299 | 300 | x_data[:, 29:, :] = pred 301 | x_seq_NF = x_seq_NF.cpu().numpy() 302 | 303 | pred_seq_np = x_data.cpu().numpy() 304 | pred_seq_dediff = de_data_diff(x_seq_NF[:, 2, :], pred_seq_np) 305 | pred_seq_ori = de_feature_scaling(pred_seq_dediff, x_seq_NF[:, 0, :], x_seq_NF[:, 1, :]) 306 | 307 | x_data_ori_np = x_data_ori.cpu().numpy() 308 | x_data_ori_dediff = de_data_diff(x_seq_NF[:, 2, :], x_data_ori_np) 309 | x_data_oo = de_feature_scaling(x_data_ori_dediff, x_seq_NF[:, 0, :], x_seq_NF[:, 1, :]) 310 | 311 | pred_seq_ori_torch = torch.from_numpy(pred_seq_ori) 312 | x_data_oo_torch = torch.from_numpy(x_data_oo) 313 | 314 | pred_seq_ori_torch = pred_seq_ori_torch.to(torch.float32) 315 | x_data_oo_torch = x_data_oo_torch.to(torch.float32) 316 | 317 | # Get RMSE_loss of each prediction step 318 | loss_1s = 
rmse_loss(x_data_oo_torch[:, 39, :2], pred_seq_ori_torch[:, 39, :2]) 319 | loss_2s = rmse_loss(x_data_oo_torch[:, 49, :2], pred_seq_ori_torch[:, 49, :2]) 320 | loss_3s = rmse_loss(x_data_oo_torch[:, 59, :2], pred_seq_ori_torch[:, 59, :2]) 321 | loss_4s = rmse_loss(x_data_oo_torch[:, 69, :2], pred_seq_ori_torch[:, 69, :2]) 322 | loss_5s = rmse_loss(x_data_oo_torch[:, 79, :2], pred_seq_ori_torch[:, 79, :2]) 323 | 324 | loss = rmse_loss(x_data_oo_torch[:, 30:, :2], pred_seq_ori_torch[:, 30:, :2]) 325 | 326 | loss_1s_tmp.append(loss_1s) 327 | loss_2s_tmp.append(loss_2s) 328 | loss_3s_tmp.append(loss_3s) 329 | loss_4s_tmp.append(loss_4s) 330 | loss_5s_tmp.append(loss_5s) 331 | 332 | loss_tmp.append(loss.item()) 333 | 334 | rmse_5s_ = rmse_sum(x_data_oo_torch[:, 79, :2], pred_seq_ori_torch[:, 79, :2]) 335 | rmse_5s = np.array(rmse_5s_) 336 | 337 | MR1 = MR1 + np.sum(rmse_5s>=2) 338 | MR2 = MR2 + np.sum(rmse_5s<=2) 339 | 340 | loss_1s_tmp_np = np.array(loss_1s_tmp) 341 | loss_2s_tmp_np = np.array(loss_2s_tmp) 342 | loss_3s_tmp_np = np.array(loss_3s_tmp) 343 | loss_4s_tmp_np = np.array(loss_4s_tmp) 344 | loss_5s_tmp_np = np.array(loss_5s_tmp) 345 | 346 | loss_1s_mean = loss_1s_tmp_np.mean() 347 | loss_2s_mean = loss_2s_tmp_np.mean() 348 | loss_3s_mean = loss_3s_tmp_np.mean() 349 | loss_4s_mean = loss_4s_tmp_np.mean() 350 | loss_5s_mean = loss_5s_tmp_np.mean() 351 | 352 | loss_tmp_np = np.array(loss_tmp) 353 | loss_mean = loss_tmp_np.mean() 354 | 355 | result44[dataset_num, tr, times, 0] = loss_1s_mean 356 | result44[dataset_num, tr, times, 1] = loss_2s_mean 357 | result44[dataset_num, tr, times, 2] = loss_3s_mean 358 | result44[dataset_num, tr, times, 3] = loss_4s_mean 359 | result44[dataset_num, tr, times, 4] = loss_5s_mean 360 | result44[dataset_num, tr, times, 5] = loss_mean 361 | result44[dataset_num, tr, times, 6] = MR2/(MR1+MR2) 362 | 363 | np.save(file="LSTM_10_result.npy", arr=result44) 364 | 365 | 
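Note on the evaluation above: before any error is computed, the script undoes two normalization steps (first differences via de_data_diff, then min-max scaling to [-1, 1] via de_feature_scaling) and counts how many 5 s endpoint errors fall within a threshold of 2. The following is a minimal sketch of that round trip for a single 1-D feature, not the repository's code. The names normalize, denormalize and within_threshold_rate are hypothetical, and the vectorized cumulative sum is an assumption swapped in for the explicit loop used in the script.

```python
import numpy as np


def normalize(seq):
    """Scale a 1-D sequence to [-1, 1], then take first differences.

    Hypothetical helper: mirrors feature_scaling followed by data_diff.
    Returns the first scaled value, the differences, and the min/max
    needed to invert the transform.
    """
    x_min, x_max = seq.min(), seq.max()
    if x_min == x_max:
        # Constant sequence: no range to scale by, so keep the value itself.
        scaled = np.full(seq.shape, x_min, dtype=float)
    else:
        scaled = (2 * seq - (x_max + x_min)) / (x_max - x_min)
    return scaled[0], np.diff(scaled), x_min, x_max


def denormalize(first, diffs, x_min, x_max):
    """Invert normalize: cumulative sum, then undo the min-max scaling."""
    scaled = np.concatenate(([first], first + np.cumsum(diffs)))
    if x_min == x_max:
        return np.full(scaled.shape, x_min, dtype=float)
    return (scaled * (x_max - x_min) + x_max + x_min) / 2


def within_threshold_rate(truth_xy, pred_xy, threshold=2.0):
    """Fraction of samples whose Euclidean (x, y) error is <= threshold."""
    err = np.linalg.norm(truth_xy - pred_xy, axis=-1)
    return float(np.mean(err <= threshold))


if __name__ == "__main__":
    seq = np.array([3.0, 4.5, 7.0, 6.0])
    first, diffs, lo, hi = normalize(seq)
    print(np.allclose(denormalize(first, diffs, lo, hi), seq))
```

The value stored in the last slot of result44 is MR2/(MR1+MR2), that is, the share of endpoint errors that are at most 2, which is the quantity within_threshold_rate approximates in this sketch. A sample with an error of exactly 2 is counted in both MR1 and MR2 in the script, so the two can differ slightly on such ties.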
-------------------------------------------------------------------------------- /algorithm/MTF-LSTM.py: -------------------------------------------------------------------------------- 1 | """ 2 | @Author: Fhz 3 | @Create Date: 2022/11/6 15:37 4 | @File: LSTM_encoder_decoder.py 5 | @Description: 6 | @Modify Person Date: 7 | """ 8 | import torch 9 | import random 10 | import numpy as np 11 | import torch.nn as nn 12 | from torch.utils.data import DataLoader, TensorDataset 13 | 14 | 15 | def feature_scaling(x_seq): 16 | x_min = x_seq.min() 17 | x_max = x_seq.max() 18 | if x_min == x_max: 19 | x_new = x_min * np.ones(shape=x_seq.shape) 20 | else: 21 | x_new = (2 * x_seq - (x_max + x_min)) / (x_max - x_min) 22 | return x_new, x_min, x_max 23 | 24 | 25 | def de_feature_scaling(x_new, x_min, x_max): 26 | x_ori = np.ones(shape=(len(x_max), 80, 44)) 27 | for i in range(len(x_max)): 28 | for j in range(3): 29 | if x_min[i, j] == x_max[i, j]: 30 | x_ori[i, :, j] = x_min[i, j] 31 | else: 32 | x_ori[i, :, j] = (x_new[i, :, j] * (x_max[i, j] - x_min[i, j]) + x_max[i, j] + x_min[i, j]) / 2 33 | 34 | return x_ori 35 | 36 | 37 | def data_diff(data): 38 | data_diff = np.diff(data) 39 | data_0 = data[0] 40 | return data_0, data_diff 41 | 42 | 43 | def de_data_diff(data_0, data_diff): 44 | data = np.ones(shape=(len(data_diff), 80, 44)) 45 | data[:, 0, :] = data_0 46 | for i in range(79): 47 | data[:, i + 1, :] = data[:, i, :] + data_diff[:, i, :] 48 | 49 | return data 50 | 51 | 52 | def dataNormal(seq): 53 | seq_len = len(seq) 54 | seq_norm = np.zeros(shape=(seq_len, 79, 44)) 55 | seq_norm_feature = np.zeros(shape=(seq_len, 3, 44)) 56 | 57 | for i in range(seq_len): 58 | for j in range(44): 59 | seq_tmp = seq[i, :, j] # initial seq 60 | seq_tmp_FS, seq_tmp_min, seq_tmp_max = feature_scaling(seq_tmp) # feature scaling 61 | seq_tmp_0, seq_tmp_diff = data_diff(seq_tmp_FS) # seq diff 62 | seq_norm[i, :, j] = seq_tmp_diff # store norm data 63 | 64 | # store norm feature data 65 | 
seq_norm_feature[i, 0, j] = seq_tmp_min 66 | seq_norm_feature[i, 1, j] = seq_tmp_max 67 | seq_norm_feature[i, 2, j] = seq_tmp_0 68 | 69 | return seq_norm, seq_norm_feature 70 | 71 | 72 | def get_train_dataset(train_data, batch_size): 73 | x = train_data[:, :29, :] 74 | y = train_data[:, 29:, :] 75 | 76 | x_data = torch.from_numpy(x.copy()) 77 | y_data = torch.from_numpy(y.copy()) 78 | 79 | x_data = x_data.to(torch.float32) 80 | y_data = y_data.to(torch.float32) 81 | 82 | train_dataset = TensorDataset(x_data, y_data) 83 | train_loader = DataLoader(train_dataset, batch_size=batch_size, drop_last=True, shuffle=True) 84 | 85 | return train_loader 86 | 87 | 88 | def get_test_dataset(test_data, test_seq_NF, batch_size): 89 | x_data = torch.from_numpy(test_data.copy()) 90 | x_data = x_data.to(torch.float32) 91 | 92 | y_data = torch.from_numpy(test_seq_NF.copy()) 93 | y_data = y_data.to(torch.float32) 94 | 95 | test_dataset = TensorDataset(x_data, y_data) 96 | test_loader = DataLoader(test_dataset, batch_size=batch_size, drop_last=True, shuffle=True) 97 | 98 | return test_loader 99 | 100 | 101 | def LoadData(num): 102 | train_x = np.load(file="../data_process/NGSIM/merge_data/X_train_{}.npy".format(num)) 103 | train_y = np.load(file="../data_process/NGSIM/merge_data/y_train_{}.npy".format(num)) 104 | 105 | test_x = np.load(file="../data_process/NGSIM/merge_data/X_test_{}.npy".format(num)) 106 | test_y = np.load(file="../data_process/NGSIM/merge_data/y_test_{}.npy".format(num)) 107 | 108 | valid_x = np.load(file="../data_process/NGSIM/merge_data/X_valid_{}.npy".format(num)) 109 | valid_y = np.load(file="../data_process/NGSIM/merge_data/y_valid_{}.npy".format(num)) 110 | 111 | return train_x, train_y, test_x, test_y, valid_x, valid_y 112 | 113 | 114 | class lstm_encoder(nn.Module): 115 | def __init__(self, input_size, hidden_size, num_layers=4): 116 | super(lstm_encoder, self).__init__() 117 | self.num_layers = num_layers 118 | self.input_size = input_size 119 | 
self.hidden_size = hidden_size 120 | self.lstm = nn.LSTM(batch_first=True, 121 | input_size=self.input_size, 122 | hidden_size=self.hidden_size, 123 | num_layers=self.num_layers, 124 | dropout=0.2 125 | ) 126 | 127 | def forward(self, input): 128 | lstm_out, self.hidden = self.lstm(input) 129 | return lstm_out, self.hidden 130 | 131 | 132 | class lstm_decoder(nn.Module): 133 | def __init__(self, input_size, hidden_size, num_layers=4): 134 | super(lstm_decoder, self).__init__() 135 | self.num_layers = num_layers 136 | self.input_size = input_size 137 | self.hidden_size = hidden_size 138 | self.lstm = nn.LSTM(batch_first=True, 139 | input_size=self.input_size, 140 | hidden_size=self.hidden_size, 141 | num_layers=self.num_layers, 142 | dropout=0.2 143 | ) 144 | self.fc = nn.Linear(self.hidden_size, self.input_size) 145 | 146 | def forward(self, input, encoder_hidden_states): 147 | lstm_out, self.hidden = self.lstm(input.unsqueeze(1), encoder_hidden_states) 148 | output = self.fc(lstm_out.squeeze(1)) 149 | return output, self.hidden 150 | 151 | 152 | class MyLstm(nn.Module): 153 | def __init__(self, input_size=44, hidden_size=128, target_len=50, TR=0.1): 154 | super(MyLstm, self).__init__() 155 | self.input_size = input_size 156 | self.hidden_size = hidden_size 157 | self.target_len = target_len 158 | self.TR = TR 159 | 160 | self.encoder = lstm_encoder(input_size=self.input_size, hidden_size=self.hidden_size) 161 | self.decoder = lstm_decoder(input_size=self.input_size, hidden_size=self.hidden_size) 162 | 163 | def forward(self, input, target, training_prediction="recursive"): 164 | 165 | encoder_output, encoder_hidden = self.encoder(input) 166 | decoder_input = input[:, -1, :] 167 | decoder_hidden = encoder_hidden 168 | 169 | outputs = torch.zeros(input.shape[0], self.target_len, input.shape[2]) 170 | 171 | if training_prediction == "recursive": 172 | # recursive 173 | for t in range(self.target_len): 174 | decoder_output, decoder_hidden = self.decoder(decoder_input, 
decoder_hidden) 175 | outputs[:, t, :] = decoder_output 176 | decoder_input = decoder_output 177 | 178 | if training_prediction == "teacher_forcing": 179 | # teacher_forcing 180 | for t in range(self.target_len): 181 | decoder_output, decoder_hidden = self.decoder(decoder_input, decoder_hidden) 182 | outputs[:, t, :] = decoder_output 183 | decoder_input = target[:, t, :] 184 | 185 | if training_prediction == "mixed_teacher_forcing": 186 | # mixed_teacher_forcing 187 | teacher_forcing_ratio = self.TR 188 | for t in range(self.target_len): 189 | decoder_output, decoder_hidden = self.decoder(decoder_input, decoder_hidden) 190 | outputs[:, t, :] = decoder_output 191 | 192 | if random.random() < teacher_forcing_ratio: 193 | decoder_input = target[:, t, :] 194 | else: 195 | decoder_input = decoder_output 196 | 197 | return outputs 198 | 199 | 200 | if __name__ == '__main__': 201 | 202 | # Loop over datasets 0-9 203 | for dataset_num in range(10): 204 | seq_train, y_train, seq_test, y_test, seq_valid, y_valid = LoadData(dataset_num) 205 | 206 | x_norm_train, x_norm_train_feature = dataNormal(seq_train) 207 | x_norm_test, x_norm_test_feature = dataNormal(seq_test) 208 | x_norm_valid, x_norm_valid_feature = dataNormal(seq_valid) 209 | 210 | batch_size = 1024 211 | epochs = 100 212 | learning_rate = 0.001 213 | 214 | train_loader = get_train_dataset(x_norm_train, batch_size) 215 | test_loader = get_train_dataset(x_norm_test, batch_size) 216 | valid_loader = get_test_dataset(x_norm_valid, x_norm_valid_feature, batch_size) 217 | 218 | # Loop over teacher forcing ratios, 0.1-0.9 219 | for tr in range(9): 220 | TR = (tr + 1)/10 221 | 222 | for times in range(3): 223 | model_name = "models_all_data/LSTM_ED10_D{}_R{}_T{}.pkl".format(dataset_num, tr+1, times+1) 224 | print(model_name) 225 | 226 | model = MyLstm(TR=TR) 227 | mse_loss = nn.MSELoss(reduction='sum') 228 | mse_loss1 = nn.MSELoss() 229 | optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate, weight_decay=0.0001) 230 | 231 | device = 
torch.device("cuda:1" if torch.cuda.is_available() else "cpu") 232 | model.to(device) 233 | print(device) 234 | 235 | loss_train = [] 236 | loss_test = [] 237 | loss_test_history = 1000 238 | for i in range(epochs): 239 | 240 | # train loader 241 | loss_tmp = [] 242 | for batch_idx, (x_seq, y_label) in enumerate(train_loader): 243 | x_seq = x_seq.to(device) 244 | y_label = y_label.to(device) 245 | 246 | x_seq = x_seq.to(torch.float32) 247 | y_label = y_label.to(torch.float32) 248 | 249 | pred = model(x_seq, y_label, training_prediction="mixed_teacher_forcing") 250 | pred = pred.to(device) 251 | loss = mse_loss(y_label[:, :, :2], pred[:, :, :2]) 252 | 253 | optimizer.zero_grad() 254 | loss.backward() 255 | optimizer.step() 256 | 257 | loss_tmp.append(loss.item()) 258 | 259 | loss_tmp_np = np.array(loss_tmp) 260 | loss_train_mean = loss_tmp_np.mean() 261 | print("Epoch:{},Training loss: {}".format(i, loss_train_mean)) 262 | loss_train.append(loss_train_mean) 263 | 264 | # test loader 265 | loss_tmp = [] 266 | for batch_idx, (x_seq, y_label) in enumerate(test_loader): 267 | x_seq = x_seq.to(device) 268 | y_label = y_label.to(device) 269 | 270 | x_seq = x_seq.to(torch.float32) 271 | y_label = y_label.to(torch.float32) 272 | 273 | with torch.no_grad(): 274 | pred = model(x_seq, y_label, training_prediction="mixed_teacher_forcing") 275 | pred = pred.to(device) 276 | loss = mse_loss(y_label[:, :, :2], pred[:, :, :2]) 277 | 278 | loss_tmp.append(loss.item()) 279 | 280 | loss_tmp_np = np.array(loss_tmp) 281 | loss_test_mean = loss_tmp_np.mean() 282 | print("Epoch:{},Testing loss: {}".format(i, loss_test_mean)) 283 | loss_test.append(loss_test_mean) 284 | 285 | if loss_test_history > loss_test_mean: 286 | if loss_test_mean / loss_train_mean < 1.05: 287 | print("The model is not over-fitting, save it.") 288 | torch.save(model.state_dict(), model_name) 289 | loss_test_history = loss_test_mean 290 | 291 | if i == epochs - 1: 292 | # valid loader 293 | loss_tmp = [] 294 | 
loss_1s_tmp = [] 295 | loss_2s_tmp = [] 296 | loss_3s_tmp = [] 297 | loss_4s_tmp = [] 298 | loss_5s_tmp = [] 299 | 300 | for batch_idx, (x_seq, x_seq_NF) in enumerate(valid_loader): 301 | x_data = x_seq.to(device) 302 | x_data = x_data.to(torch.float32) 303 | 304 | x_seq_NF = x_seq_NF.to(device) 305 | x_seq_NF = x_seq_NF.to(torch.float32) 306 | 307 | # multi-step prediction 308 | x_data_ori = x_data.clone() 309 | x_tmp = x_data[:, :29, :] 310 | y_tmp = x_data[:, 29:, :] 311 | print(model_name) 312 | 313 | model.load_state_dict(torch.load(model_name)) 314 | 315 | with torch.no_grad(): 316 | pred = model(x_tmp, y_tmp, training_prediction="recursive") 317 | 318 | pred = pred.to(device) 319 | loss = mse_loss(y_tmp[:, :, :2], pred[:, :, :2]) 320 | 321 | loss_tmp.append(loss.item()) 322 | 323 | x_data[:, 29:, :] = pred 324 | x_seq_NF = x_seq_NF.cpu().numpy() 325 | 326 | pred_seq_np = x_data.cpu().numpy() 327 | pred_seq_dediff = de_data_diff(x_seq_NF[:, 2, :], pred_seq_np) 328 | pred_seq_ori = de_feature_scaling(pred_seq_dediff, x_seq_NF[:, 0, :], x_seq_NF[:, 1, :]) 329 | 330 | x_data_ori_np = x_data_ori.cpu().numpy() 331 | x_data_ori_dediff = de_data_diff(x_seq_NF[:, 2, :], x_data_ori_np) 332 | x_data_oo = de_feature_scaling(x_data_ori_dediff, x_seq_NF[:, 0, :], x_seq_NF[:, 1, :]) 333 | 334 | pred_seq_ori_torch = torch.from_numpy(pred_seq_ori) 335 | x_data_oo_torch = torch.from_numpy(x_data_oo) 336 | 337 | pred_seq_ori_torch = pred_seq_ori_torch.to(torch.float32) 338 | x_data_oo_torch = x_data_oo_torch.to(torch.float32) 339 | 340 | # Get RMSE_loss of each prediction step 341 | loss_1s = mse_loss1(x_data_oo_torch[:, 39, :2], pred_seq_ori_torch[:, 39, :2]) ** 0.5 342 | loss_2s = mse_loss1(x_data_oo_torch[:, 49, :2], pred_seq_ori_torch[:, 49, :2]) ** 0.5 343 | loss_3s = mse_loss1(x_data_oo_torch[:, 59, :2], pred_seq_ori_torch[:, 59, :2]) ** 0.5 344 | loss_4s = mse_loss1(x_data_oo_torch[:, 69, :2], pred_seq_ori_torch[:, 69, :2]) ** 0.5 345 | loss_5s = 
mse_loss1(x_data_oo_torch[:, 79, :2], pred_seq_ori_torch[:, 79, :2]) ** 0.5 346 | 347 | loss_1s_tmp.append(loss_1s) 348 | loss_2s_tmp.append(loss_2s) 349 | loss_3s_tmp.append(loss_3s) 350 | loss_4s_tmp.append(loss_4s) 351 | loss_5s_tmp.append(loss_5s) 352 | 353 | loss_1s_tmp_np = np.array(loss_1s_tmp) 354 | loss_2s_tmp_np = np.array(loss_2s_tmp) 355 | loss_3s_tmp_np = np.array(loss_3s_tmp) 356 | loss_4s_tmp_np = np.array(loss_4s_tmp) 357 | loss_5s_tmp_np = np.array(loss_5s_tmp) 358 | 359 | loss_1s_mean = loss_1s_tmp_np.mean() 360 | loss_2s_mean = loss_2s_tmp_np.mean() 361 | loss_3s_mean = loss_3s_tmp_np.mean() 362 | loss_4s_mean = loss_4s_tmp_np.mean() 363 | loss_5s_mean = loss_5s_tmp_np.mean() 364 | 365 | loss_tmp_np = np.array(loss_tmp) 366 | loss_valid_mean = loss_tmp_np.mean() 367 | print("Epoch:{},valid loss: {}".format(i, loss_valid_mean)) 368 | -------------------------------------------------------------------------------- /data_process/NGSIM/add_v_a/add_v_a.py: -------------------------------------------------------------------------------- 1 | """ 2 | @Author: Fhz 3 | @Create Date: 2022/11/6 20:50 4 | @File: add_v_a.py 5 | @Description: 6 | @Modify Person Date: 7 | """ 8 | import csv 9 | import numpy as np 10 | import pandas as pd 11 | 12 | 13 | def addVA(path_final, path_add): 14 | dataS = pd.read_csv(path_final) 15 | max_vehicle = np.max(dataS.Vehicle_ID.unique()) 16 | print("The max vehicle ID is: {}".format(max_vehicle)) 17 | 18 | f = open(path_add, 'w', newline='', encoding='utf-8') 19 | csv_writer = csv.writer(f) 20 | csv_writer.writerow(["Vehicle_ID", 21 | "Global_Time", 22 | "Local_X", 23 | "Local_Y", 24 | "v_x", 25 | "v_y", 26 | "a_x", 27 | "a_y", 28 | "Lane_ID", 29 | "Heading_Angle", 30 | "Left_Label", 31 | "Right_Label", 32 | "Lane_Change_Label"]) 33 | 34 | for i in range(int(max_vehicle)): 35 | frame_ori = dataS[dataS.Vehicle_ID == i + 1] 36 | if len(frame_ori) == 0: 37 | print("The vehicle ID {} is empty.".format(i + 1)) 38 | continue 39 | 
else: 40 | print("Process vehicle ID {}.".format(i + 1)) 41 | frame_np = frame_ori.values 42 | X_np = frame_np[:, 2] 43 | Y_np = frame_np[:, 3] 44 | v_x = 10 * np.diff(X_np) 45 | v_y = 10 * np.diff(Y_np) 46 | a_x = 10 * np.diff(v_x) 47 | a_y = 10 * np.diff(v_y) 48 | for ii in range(len(a_x)): 49 | csv_writer.writerow([frame_np[ii, 0], 50 | frame_np[ii, 1], 51 | frame_np[ii, 2], 52 | frame_np[ii, 3], 53 | v_x[ii], 54 | v_y[ii], 55 | a_x[ii], 56 | a_y[ii], 57 | frame_np[ii, 6], 58 | frame_np[ii, 7], 59 | frame_np[ii, 8], 60 | frame_np[ii, 9], 61 | frame_np[ii, 10] 62 | ]) 63 | f.close() 64 | 65 | 66 | 67 | if __name__ == '__main__': 68 | path_final = ["../preprocess/trajectory_2783_Final_label.csv", 69 | "../preprocess/trajectory_1914_Final_label.csv", 70 | "../preprocess/trajectory_1317_Final_label.csv", 71 | "../preprocess/trajectory_0400_Final_label.csv", 72 | "../preprocess/trajectory_0500_Final_label.csv", 73 | "../preprocess/trajectory_0515_Final_label.csv"] 74 | 75 | path_add = ["trajectory_2783_add.csv", 76 | "trajectory_1914_add.csv", 77 | "trajectory_1317_add.csv", 78 | "trajectory_0400_add.csv", 79 | "trajectory_0500_add.csv", 80 | "trajectory_0515_add.csv"] 81 | 82 | for i in range(len(path_final)): 83 | addVA(path_final[i], path_add[i]) 84 | -------------------------------------------------------------------------------- /data_process/NGSIM/final_DP/final_DP.py: -------------------------------------------------------------------------------- 1 | """ 2 | @Author: Fhz 3 | @Create Date: 2022/11/6 20:51 4 | @File: final_DP.py 5 | @Description: 6 | @Modify Person Date: 7 | """ 8 | import pandas as pd 9 | import numpy as np 10 | import time 11 | 12 | 13 | def getTargetVehicle(path): 14 | dataS = pd.read_csv(path) 15 | veh_list = dataS.Vehicle_ID.unique() 16 | veh_left = [] 17 | veh_center = [] 18 | veh_right = [] 19 | 20 | for veh_id in veh_list: 21 | frame_ori = dataS[dataS.Vehicle_ID == veh_id] 22 | LC_list = frame_ori.Lane_Change_Label.unique() 23 | for 
LC_id in LC_list: 24 | if LC_id == 0: 25 | veh_left.append(veh_id) 26 | elif LC_id == 1: 27 | veh_center.append(veh_id) 28 | else: 29 | veh_right.append(veh_id) 30 | 31 | veh_new = list(set(veh_left + veh_right)) 32 | veh_new_np = np.array(veh_new) 33 | veh_new_np.sort() 34 | 35 | return veh_new_np 36 | 37 | 38 | class featureExtract(): 39 | def __init__(self, path, veh_id, X_length): 40 | super(featureExtract, self).__init__() 41 | self.path = path 42 | self.veh_id = veh_id 43 | self.X_length = X_length 44 | self.dataS = self.getVehicleIDData() 45 | 46 | def getVehicleIDData(self): 47 | dataS = pd.read_csv(self.path) 48 | frame_ori = dataS[dataS.Vehicle_ID == self.veh_id] 49 | GT_list = frame_ori.Global_Time.unique() 50 | GT_min = np.min(GT_list) 51 | GT_max = np.max(GT_list) 52 | 53 | frame_time = dataS[dataS.Global_Time >= GT_min] 54 | frame_time_1 = frame_time[frame_time.Global_Time <= GT_max] 55 | 56 | return frame_time_1 57 | 58 | def getData(self, veh_id): 59 | ''' 60 | :param veh_id: vehicle ID 61 | :return: get feature data of veh_id 62 | ''' 63 | AvailableTime = self.getAvailableTime(veh_id) 64 | 65 | print("*****Getting vehicle {} feature data*****".format(veh_id)) 66 | length_time = len(AvailableTime) 67 | 68 | # All characteristic parameters have 44 dimensions in total 69 | # Dimension 0-5 target vehicle: (x, y, v_x, v_y, a_x, a_y) 70 | # Dimension 6-41 are features of surrounding vehicles 71 | # (left-front left-rear center-front center-rear right-front right-rear) 72 | # (delta_x, delta_y, v_x, v_y, a_x, a_y) * 6 73 | # Dimension 42-43 left and right lane flag positions 74 | 75 | X = 1000 * np.ones(shape=(length_time, self.X_length, 44)) 76 | 77 | # Driving intention recognition label 78 | y = 1000 * np.ones(shape=(length_time, 1)) 79 | 80 | for i in range(length_time): 81 | 82 | # target vehicle feature writing 83 | self_condition = self.getCondition(veh_id, AvailableTime[i]) 84 | X[i, :, :6] = self_condition 85 | 86 | # Surrounding vehicles feature 
writing 87 | surround_vehicles = self.getOtherVehicles(veh_id, AvailableTime[i]) 88 | 89 | # left-front vehicle 90 | if surround_vehicles[0] == 10000: # if this position doesn't have vehicle 91 | X[i, :, 6] = 3 * np.ones(shape=(self.X_length)) 92 | X[i, :, 7] = 1000 * np.ones(shape=(self.X_length)) 93 | X[i, :, 8] = self_condition[:, 2] 94 | X[i, :, 9] = self_condition[:, 3] 95 | X[i, :, 10] = self_condition[:, 4] 96 | X[i, :, 11] = self_condition[:, 5] 97 | else: 98 | surround_condition = self.getCondition(surround_vehicles[0], AvailableTime[i]) 99 | X[i, :, 6] = surround_condition[:, 0] - self_condition[:, 0] 100 | X[i, :, 7] = surround_condition[:, 1] - self_condition[:, 1] 101 | X[i, :, 8] = surround_condition[:, 2] 102 | X[i, :, 9] = surround_condition[:, 3] 103 | X[i, :, 10] = surround_condition[:, 4] 104 | X[i, :, 11] = surround_condition[:, 5] 105 | 106 | # left-rear vehicle 107 | if surround_vehicles[1] == 10000: 108 | X[i, :, 12] = 3 * np.ones(shape=(self.X_length)) 109 | X[i, :, 13] = 1000 * np.ones(shape=(self.X_length)) 110 | X[i, :, 14] = self_condition[:, 2] 111 | X[i, :, 15] = self_condition[:, 3] 112 | X[i, :, 16] = self_condition[:, 4] 113 | X[i, :, 17] = self_condition[:, 5] 114 | else: 115 | surround_condition = self.getCondition(surround_vehicles[1], AvailableTime[i]) 116 | X[i, :, 12] = surround_condition[:, 0] - self_condition[:, 0] 117 | X[i, :, 13] = surround_condition[:, 1] - self_condition[:, 1] 118 | X[i, :, 14] = surround_condition[:, 2] 119 | X[i, :, 15] = surround_condition[:, 3] 120 | X[i, :, 16] = surround_condition[:, 4] 121 | X[i, :, 17] = surround_condition[:, 5] 122 | 123 | # center-front vehicle 124 | if surround_vehicles[2] == 10000: 125 | X[i, :, 18] = 0 * np.ones(shape=(self.X_length)) 126 | X[i, :, 19] = 1000 * np.ones(shape=(self.X_length)) 127 | X[i, :, 20] = self_condition[:, 2] 128 | X[i, :, 21] = self_condition[:, 3] 129 | X[i, :, 22] = self_condition[:, 4] 130 | X[i, :, 23] = self_condition[:, 5] 131 | else: 132 | 
surround_condition = self.getCondition(surround_vehicles[2], AvailableTime[i]) 133 | X[i, :, 18] = surround_condition[:, 0] - self_condition[:, 0] 134 | X[i, :, 19] = surround_condition[:, 1] - self_condition[:, 1] 135 | X[i, :, 20] = surround_condition[:, 2] 136 | X[i, :, 21] = surround_condition[:, 3] 137 | X[i, :, 22] = surround_condition[:, 4] 138 | X[i, :, 23] = surround_condition[:, 5] 139 | 140 | # center-rear vehicle 141 | if surround_vehicles[3] == 10000: 142 | X[i, :, 24] = 0 * np.ones(shape=(self.X_length)) 143 | X[i, :, 25] = 1000 * np.ones(shape=(self.X_length)) 144 | X[i, :, 26] = self_condition[:, 2] 145 | X[i, :, 27] = self_condition[:, 3] 146 | X[i, :, 28] = self_condition[:, 4] 147 | X[i, :, 29] = self_condition[:, 5] 148 | else: 149 | surround_condition = self.getCondition(surround_vehicles[3], AvailableTime[i]) 150 | X[i, :, 24] = surround_condition[:, 0] - self_condition[:, 0] 151 | X[i, :, 25] = surround_condition[:, 1] - self_condition[:, 1] 152 | X[i, :, 26] = surround_condition[:, 2] 153 | X[i, :, 27] = surround_condition[:, 3] 154 | X[i, :, 28] = surround_condition[:, 4] 155 | X[i, :, 29] = surround_condition[:, 5] 156 | 157 | # right-front vehicle 158 | if surround_vehicles[4] == 10000: 159 | X[i, :, 30] = -3 * np.ones(shape=(self.X_length)) 160 | X[i, :, 31] = 1000 * np.ones(shape=(self.X_length)) 161 | X[i, :, 32] = self_condition[:, 2] 162 | X[i, :, 33] = self_condition[:, 3] 163 | X[i, :, 34] = self_condition[:, 4] 164 | X[i, :, 35] = self_condition[:, 5] 165 | else: 166 | surround_condition = self.getCondition(surround_vehicles[4], AvailableTime[i]) 167 | X[i, :, 30] = surround_condition[:, 0] - self_condition[:, 0] 168 | X[i, :, 31] = surround_condition[:, 1] - self_condition[:, 1] 169 | X[i, :, 32] = surround_condition[:, 2] 170 | X[i, :, 33] = surround_condition[:, 3] 171 | X[i, :, 34] = surround_condition[:, 4] 172 | X[i, :, 35] = surround_condition[:, 5] 173 | 174 | # right-rear vehicle 175 | if surround_vehicles[5] == 10000: 
176 | X[i, :, 36] = -3 * np.ones(shape=(self.X_length)) 177 | X[i, :, 37] = 1000 * np.ones(shape=(self.X_length)) 178 | X[i, :, 38] = self_condition[:, 2] 179 | X[i, :, 39] = self_condition[:, 3] 180 | X[i, :, 40] = self_condition[:, 4] 181 | X[i, :, 41] = self_condition[:, 5] 182 | else: 183 | surround_condition = self.getCondition(surround_vehicles[5], AvailableTime[i]) 184 | X[i, :, 36] = surround_condition[:, 0] - self_condition[:, 0] 185 | X[i, :, 37] = surround_condition[:, 1] - self_condition[:, 1] 186 | X[i, :, 38] = surround_condition[:, 2] 187 | X[i, :, 39] = surround_condition[:, 3] 188 | X[i, :, 40] = surround_condition[:, 4] 189 | X[i, :, 41] = surround_condition[:, 5] 190 | 191 | for ii in range(self.X_length): 192 | frame_t = self.getOneState(veh_id, AvailableTime[i] - ii) 193 | # Left and right lane flag 194 | X[i, ii, 42] = int(frame_t.Left_Label) 195 | X[i, ii, 43] = int(frame_t.Right_Label) 196 | 197 | # 3s trajectory 198 | if ii == 29: 199 | y[i] = int(frame_t.Lane_Change_Label) 200 | 201 | return X, y 202 | 203 | def getAvailableTime(self, veh_id): 204 | ''' 205 | :param veh_id: vehicle ID 206 | :return: get available time of data 207 | ''' 208 | dataS = self.dataS 209 | frame_ori = dataS[dataS.Vehicle_ID == veh_id] 210 | 211 | Available_time = [] 212 | 213 | for i in range(len(frame_ori)): 214 | t_tmp = float(frame_ori.iloc[i, 1]) 215 | if i >= self.X_length - 1: 216 | Available_time.append(t_tmp) 217 | 218 | return Available_time 219 | 220 | def getCondition(self, veh_id, t_tmp): 221 | ''' 222 | :param veh_id: vehicle ID 223 | :param t_tmp: time stamp 224 | :return: 225 | ''' 226 | dataS = self.dataS 227 | condition = np.zeros(shape=(self.X_length, 6)) 228 | 229 | frame_ori = dataS[dataS.Vehicle_ID == veh_id] 230 | for i in range(self.X_length): 231 | frame = frame_ori[frame_ori.Global_Time == t_tmp - i] 232 | if not frame.empty: 233 | frame_history = frame 234 | else: 235 | frame = frame_history 236 | 237 | frame = frame[['Local_X', 
                           'Local_Y', 'v_x', 'v_y', 'a_x', 'a_y']]
            condition[i, :] = frame

        return condition

    def getOtherVehicles(self, veh_id, t_tmp):
        """
        :param veh_id: vehicle ID
        :param t_tmp: time stamp
        :return: Get surrounding vehicle ID of veh_id in t_tmp.
        """
        frame_ori = self.dataS
        frame_vehicle = frame_ori[frame_ori.Vehicle_ID == veh_id]
        frame_self = frame_vehicle[frame_vehicle.Global_Time == t_tmp]

        self_index = frame_self.index.tolist()[0]

        frame_t = frame_ori[frame_ori.Global_Time == t_tmp]
        frame_surround = frame_t.drop(self_index)

        # Method of getting surrounding vehicles
        # step 1: Get all vehicle IDs at the current time
        # step 2: Pass the first round of screening of adjacent lanes (lateral direction)
        # step 3: Filter the second round through the dynamic window (longitudinal direction)

        lane_self = float(frame_self.Lane_ID)
        y_value = float(frame_self.Local_Y)
        x_value = float(frame_self.Local_X)

        # dynamic window value
        distance = 60
        # Delete vehicle IDS outside the dynamic window
        frame_surround = frame_surround[frame_surround['Local_Y'] < y_value + distance]
        frame_surround = frame_surround[frame_surround['Local_Y'] > y_value - distance]

        self_left_label = float(frame_self.Left_Label)
        self_right_label = float(frame_self.Right_Label)

        # Get left lane ID
        if self_left_label:
            lane_left = lane_self - 1
        else:
            lane_left = 100

        # Get right lane ID
        if self_right_label:
            lane_right = lane_self + 1
        else:
            lane_right = 100

        # surround vehicles
        surround_vehicles = 10000 * np.ones(6)

        # left lane
        frame_left = frame_surround[frame_surround['Lane_ID'] == lane_left]
        # right lane
        frame_right = frame_surround[frame_surround['Lane_ID'] == lane_right]
        # center lane
        frame_center = frame_surround[frame_surround['Lane_ID'] == lane_self]

        # get left lane vehicle IDs
        if not frame_left.empty:
            delta_y_pos = 100
            delta_y_neg = -100

            for i in range(len(frame_left)):
                y_tmp = float(frame_left.iloc[i, 3])

                delta_y = y_tmp - y_value

                if delta_y > 0:
                    if delta_y < delta_y_pos:
                        delta_y_pos = delta_y
                        surround_vehicles[0] = frame_left.iloc[i, 0]
                else:
                    if delta_y > delta_y_neg:
                        delta_y_neg = delta_y
                        surround_vehicles[1] = frame_left.iloc[i, 0]

        # get center lane vehicle IDs
        if not frame_center.empty:
            delta_y_pos = 100
            delta_y_neg = -100

            for i in range(len(frame_center)):
                y_tmp = float(frame_center.iloc[i, 3])

                delta_y = y_tmp - y_value

                if delta_y > 0:
                    if delta_y < delta_y_pos:
                        delta_y_pos = delta_y
                        surround_vehicles[2] = frame_center.iloc[i, 0]
                else:
                    if delta_y > delta_y_neg:
                        delta_y_neg = delta_y
                        surround_vehicles[3] = frame_center.iloc[i, 0]

        # get right lane vehicle IDs
        if not frame_right.empty:
            delta_y_pos = 100
            delta_y_neg = -100

            for i in range(len(frame_right)):
                y_tmp = float(frame_right.iloc[i, 3])

                delta_y = y_tmp - y_value

                if delta_y > 0:
                    if delta_y < delta_y_pos:
                        delta_y_pos = delta_y
                        surround_vehicles[4] = frame_right.iloc[i, 0]
                else:
                    if delta_y > delta_y_neg:
                        delta_y_neg = delta_y
                        surround_vehicles[5] = frame_right.iloc[i, 0]

        return surround_vehicles

    def getOneState(self, veh_id, t_tmp):
        """
        :param veh_id: vehicle ID
        :param t_tmp: time stamp
        :return: data of veh_id in t_tmp
        """
        frame_ori = self.dataS
        frame_vehicle = frame_ori[frame_ori.Vehicle_ID == veh_id]
        frame_t = frame_vehicle[frame_vehicle.Global_Time == t_tmp]

        return
frame_t


if __name__ == '__main__':

    path_in = ["../add_v_a/trajectory_2783_add.csv",
               "../add_v_a/trajectory_1914_add.csv",
               "../add_v_a/trajectory_1317_add.csv",
               "../add_v_a/trajectory_0400_add.csv",
               "../add_v_a/trajectory_0500_add.csv",
               "../add_v_a/trajectory_0515_add.csv"]

    path_X_out = ["X_data_2783.npy",
                  "X_data_1914.npy",
                  "X_data_1317.npy",
                  "X_data_0400.npy",
                  "X_data_0500.npy",
                  "X_data_0515.npy"]

    path_y_out = ["y_data_2783.npy",
                  "y_data_1914.npy",
                  "y_data_1317.npy",
                  "y_data_0400.npy",
                  "y_data_0500.npy",
                  "y_data_0515.npy"]

    for i in range(len(path_in)):
        veh_list = getTargetVehicle(path_in[i])

        print("*****The length of {} veh_list is:{}*****".format(i, len(veh_list)))

        X_length = 80

        X = []
        y = []
        for veh_id in veh_list:

            print("*****Start process veh_id {}*****".format(veh_id))
            start_time = time.time()
            FE = featureExtract(path_in[i], veh_id, X_length)
            X_tmp, y_tmp = FE.getData(veh_id)
            if len(y) > 0:
                X = np.vstack([X, X_tmp])
                y = np.vstack([y, y_tmp])
            else:
                X = X_tmp
                y = y_tmp

            end_time = time.time()
            print("*****End process veh_id {}*****".format(veh_id))
            print("*****time cost: {}*****".format(end_time-start_time))
            print()

        # Save the processed data into a new file
        np.save(file=path_X_out[i], arr=X)
        np.save(file=path_y_out[i], arr=y)

--------------------------------------------------------------------------------
/data_process/NGSIM/merge_data/merge_data.py:
--------------------------------------------------------------------------------
"""
@Author: Fhz
@Create Date: 2022/11/6 20:55
@File: merge_data.py
@Description:
@Modify Person Date:
"""
import numpy as np
from sklearn import model_selection

def loadAllData(X_path, Y_path):

    X_Data = np.load(file=X_path)
    y = np.load(file=Y_path)
    X = X_Data[:, ::-1, :]
    print("The length of total data is: {}".format(len(y)))

    x_0 = []
    x_1 = []
    x_2 = []
    y_0 = []
    y_1 = []
    y_2 = []

    for i in range(len(y)):
        y_tmp = y[i]
        if y_tmp == 0:
            x_0.append(X[i])
            y_0.append(y[i])
        elif y_tmp == 1:
            x_1.append(X[i])
            y_1.append(y[i])
        elif y_tmp == 2:
            x_2.append(X[i])
            y_2.append(y[i])

    left_length = len(y_0)
    center_length = len(y_1)
    right_length = len(y_2)

    min_length = min([left_length, center_length, right_length])

    print("The length of left length is: {}".format(left_length))
    print("The length of center length is: {}".format(center_length))
    print("The length of right length is: {}".format(right_length))

    if min_length/left_length == 1:
        x_0_test = x_0
        y_0_test = y_0
    else:
        x_0_train, x_0_test, y_0_train, y_0_test = model_selection.train_test_split(x_0, y_0, test_size=min_length/left_length)

    if min_length/center_length == 1:
        x_1_test = x_1
        y_1_test = y_1
    else:
        x_1_train, x_1_test, y_1_train, y_1_test = model_selection.train_test_split(x_1, y_1, test_size=min_length/center_length)

    if min_length/right_length == 1:
        x_2_test = x_2
        y_2_test = y_2
    else:
        x_2_train, x_2_test, y_2_train, y_2_test = model_selection.train_test_split(x_2, y_2, test_size=min_length/right_length)

    print("The length of left length is: {}".format(len(y_0_test)))
    print("The length of center length is: {}".format(len(y_1_test)))
    print("The length of right length is: {}".format(len(y_2_test)))

    # save x_data
    x_0_data = np.array(x_0_test)
    x_1_data = np.array(x_1_test)
    x_2_data = np.array(x_2_test)

    X_DATA = np.vstack([x_0_data, x_1_data])
    X_DATA = np.vstack([X_DATA,
                        x_2_data])

    # save y_data
    y_0_data = np.array(y_0_test)
    y_1_data = np.array(y_1_test)
    y_2_data = np.array(y_2_test)

    Y_DATA = np.vstack([y_0_data, y_1_data])
    Y_DATA = np.vstack([Y_DATA, y_2_data])

    return X_DATA, Y_DATA


def getEqualNum(x, y, test_size):
    x_0 = []
    x_1 = []
    x_2 = []
    y_0 = []
    y_1 = []
    y_2 = []

    for i in range(len(y)):
        y_tmp = y[i]
        if y_tmp == 0:
            x_0.append(x[i])
            y_0.append(y[i])
        elif y_tmp == 1:
            x_1.append(x[i])
            y_1.append(y[i])
        elif y_tmp == 2:
            x_2.append(x[i])
            y_2.append(y[i])

    print("length of left lane change: {}".format(len(y_0)))
    print("length of lane keep: {}".format(len(y_1)))
    print("length of right lane change: {}".format(len(y_2)))

    x_0 = np.array(x_0)
    y_0 = np.array(y_0)
    x_1 = np.array(x_1)
    y_1 = np.array(y_1)
    x_2 = np.array(x_2)
    y_2 = np.array(y_2)

    x_train_0, x_test_0, y_train_0, y_test_0 = model_selection.train_test_split(x_0, y_0, test_size=test_size)
    x_train_1, x_test_1, y_train_1, y_test_1 = model_selection.train_test_split(x_1, y_1, test_size=test_size)
    x_train_2, x_test_2, y_train_2, y_test_2 = model_selection.train_test_split(x_2, y_2, test_size=test_size)

    x_train = np.vstack((x_train_0, x_train_1))
    x_train = np.vstack((x_train, x_train_2))
    y_train = np.vstack((y_train_0, y_train_1))
    y_train = np.vstack((y_train, y_train_2))
    x_test = np.vstack((x_test_0, x_test_1))
    x_test = np.vstack((x_test, x_test_2))
    y_test = np.vstack((y_test_0, y_test_1))
    y_test = np.vstack((y_test, y_test_2))

    return x_train, x_test, y_train, y_test


def mergeDataset(index, num, X_path, Y_path):
    X_train = "X_train_{}.npy".format(index)
    y_train = "y_train_{}.npy".format(index)
    X_test = "X_test_{}.npy".format(index)
    y_test = "y_test_{}.npy".format(index)
    X_valid = "X_valid_{}.npy".format(index)
    y_valid = "y_valid_{}.npy".format(index)

    X = []
    y = []

    for i in range(num):
        print("Start process data: {}".format(i + 1))

        X_tmp, Y_tmp = loadAllData(X_path[i], Y_path[i])
        if len(y) > 0:
            X = np.vstack([X, X_tmp])
            y = np.vstack([y, Y_tmp])
        else:
            X = X_tmp
            y = Y_tmp

    seq_train_test, seq_valid, y_train_test, y_valid_seq = getEqualNum(X, y, test_size=0.2)
    seq_train, seq_test, y_train_seq, y_test_seq = getEqualNum(seq_train_test, y_train_test, test_size=0.25)

    np.save(file=X_train, arr=seq_train)
    np.save(file=y_train, arr=y_train_seq)
    np.save(file=X_test, arr=seq_test)
    np.save(file=y_test, arr=y_test_seq)
    np.save(file=X_valid, arr=seq_valid)
    np.save(file=y_valid, arr=y_valid_seq)


if __name__ == '__main__':

    num = 6

    X = ["../final_DP/X_data_0400.npy",
         "../final_DP/X_data_0500.npy",
         "../final_DP/X_data_0515.npy",
         "../final_DP/X_data_1317.npy",
         "../final_DP/X_data_1914.npy",
         "../final_DP/X_data_2783.npy"]

    y = ["../final_DP/y_data_0400.npy",
         "../final_DP/y_data_0500.npy",
         "../final_DP/y_data_0515.npy",
         "../final_DP/y_data_1317.npy",
         "../final_DP/y_data_1914.npy",
         "../final_DP/y_data_2783.npy"]

    for i in range(10):
        mergeDataset(i, num, X, y)

--------------------------------------------------------------------------------
/data_process/NGSIM/preprocess/preprocess.py:
--------------------------------------------------------------------------------
"""
@Author: Fhz
@Create Date: 2022/11/6 20:48
@File: preprocess.py
@Description:
@Modify Person Date:
"""
import pandas as pd
import numpy as np
import math
import csv


class preprocess():
    def
        __init__(self, path_ori, path_final):
        super(preprocess, self).__init__()
        self.path_ori = path_ori
        self.path_final = path_final
        self.dataRefresh = self.RefreshData()
        self.laneChangeLable = self.getLaneChangeLabel()

    def unitConversion(self, frame):
        '''
        :param frame: data with unit feet
        :return: data with unit meter
        '''
        ft_to_m = 0.3048

        frame.loc[:, 'Global_Time'] = frame.loc[:, 'Global_Time'] / 100
        for strs in ["Global_X", "Global_Y", "Local_X", "Local_Y", "v_Length", "v_Width", "v_Vel"]:
            frame.loc[:, strs] = frame.loc[:, strs] * ft_to_m
        # NOTE: v_Vel was already scaled by ft_to_m in the loop above,
        # so the factor is applied a second time here.
        frame.loc[:, 'v_Vel'] = frame.loc[:, 'v_Vel'] * ft_to_m * 3.6

        return frame

    def getHeadingAngle(self, s_state, e_state):
        '''
        :param s_state: start state
        :param e_state: end state
        :return: heading Angle
        '''
        headingAngle = math.atan2((e_state[0] - s_state[0]), (e_state[1] - s_state[1]))
        headingAngle = headingAngle * 180 / math.pi

        return headingAngle

    def getLeftLabel(self, lane_id):
        '''
        :param lane_id: lane ID
        :return: Determine whether there is a lane on the left. (0-no, 1-yes)
        '''
        if 1 < lane_id < 7:
            return 1  # Lane_Id : 2-6
        else:
            return 0  # Lane_Id : 1,7,8

    def getRightLabel(self, lane_id):
        '''
        :param lane_id: lane ID
        :return: Determine whether there is a lane on the right. (0-no, 1-yes)
        '''
        if lane_id < 6:
            return 1  # Lane_Id : 1-5
        else:
            return 0  # Lane_Id : 6,7,8

    def RefreshData(self):
        '''
        :return: Remove unwanted dimensions.
                 Add new dimensions (Heading_Angle, Left_Label, Right_Label, Lane_Change_Label)
        '''
        data_new = pd.DataFrame(columns=["Vehicle_ID",
                                         "Global_Time",
                                         "Local_X",
                                         "Local_Y",
                                         "v_Vel",
                                         "v_Acc",
                                         "Lane_ID",
                                         "Heading_Angle",
                                         "Left_Label",
                                         "Right_Label",
                                         "Lane_Change_Label"
                                         ])

        data_tmp = pd.DataFrame(columns=["Vehicle_ID",
                                         "Global_Time",
                                         "Local_X",
                                         "Local_Y",
                                         "v_Vel",
                                         "v_Acc",
                                         "Lane_ID",
                                         "Heading_Angle",
                                         "Left_Label",
                                         "Right_Label",
                                         "Lane_Change_Label"
                                         ])

        dataS = pd.read_csv(self.path_ori)
        max_vehiclenum = np.max(dataS.Vehicle_ID.unique())
        max_vehiclenum = int(max_vehiclenum)
        print(max_vehiclenum)

        for i in range(max_vehiclenum + 1):
            frame_ori = dataS[dataS.Vehicle_ID == i]
            if len(frame_ori) == 0:
                print("The vehicle of ID {} is empty".format(i))
                continue

            frame_ori = self.unitConversion(frame_ori)
            t_first = np.min(frame_ori.Global_Time.unique())
            print("Vehicle ID: {}, length of data: {}".format(i, len(frame_ori)))

            for j in range(len(frame_ori)):
                t_tmp = t_first + j
                frame = frame_ori[frame_ori.Global_Time == t_tmp]
                x_value = float(frame.Local_X)
                y_value = float(frame.Local_Y)

                if j < len(frame_ori) - 1:
                    frame_1 = frame_ori[frame_ori.Global_Time == t_tmp + 1]
                    x_value_1 = float(frame_1.Local_X)
                    y_value_1 = float(frame_1.Local_Y)

                    s_state = [x_value, y_value]
                    e_state = [x_value_1, y_value_1]

                    heading_angle = self.getHeadingAngle(s_state, e_state)

                lane_id = int(frame.Lane_ID)
                left_label = self.getLeftLabel(lane_id)
                right_label = self.getRightLabel(lane_id)

                data_tmp.loc[1] = [frame.iloc[0, 0],
                                   frame.iloc[0, 3],
                                   frame.iloc[0, 4],
                                   frame.iloc[0, 5],
                                   frame.iloc[0, 11],
                                   frame.iloc[0, 12],
                                   frame.iloc[0, 13],
                                   heading_angle,
                                   left_label,
                                   right_label,
                                   1]  # Initial Lane_Change_Label is set as 1.

                # DataFrame.append was removed in pandas 2.0; pd.concat is the equivalent call.
                data_new = pd.concat([data_new, data_tmp], ignore_index=True)

        return data_new

    def getLaneChangeLabel(self):
        '''
        :return: Add lane change label
        '''

        dataS = self.dataRefresh
        max_vehiclenum = np.max(dataS.Vehicle_ID.unique())
        max_vehiclenum = int(max_vehiclenum)
        print(max_vehiclenum)

        # Store label data
        label_storage = []

        for i in range(max_vehiclenum + 1):
            frame_ori = dataS[dataS.Vehicle_ID == i]
            if len(frame_ori) == 0:
                continue

            t_first = np.min(frame_ori.Global_Time.unique())
            print("Vehicle ID: {}, length of data: {}".format(i, len(frame_ori)))

            lane_change_time = []  # lane change time stamp
            t_history = t_first  # history lane change time stamp
            for j in range(len(frame_ori) - 1):
                t_tmp = t_first + j
                frame = frame_ori[frame_ori.Global_Time == t_tmp]
                frame_1 = frame_ori[frame_ori.Global_Time == t_tmp + 1]

                lane_id = float(frame.Lane_ID)
                lane_id_1 = float(frame_1.Lane_ID)
                label_end = 1

                # Store lane change time stamp
                if lane_id > lane_id_1:  # left lane change
                    print("Vehicle ID: {}, time stamp: {}, lane change label: {}".format(i, t_tmp, 0))
                    lane_change_time.append([t_history, t_tmp, 0])
                    t_history = t_tmp
                    label_end = 0
                elif lane_id < lane_id_1:
                    print("Vehicle ID: {}, time stamp: {}, lane change label: {}".format(i, t_tmp, 2))
                    lane_change_time.append([t_history, t_tmp, 2])
                    t_history = t_tmp
                    label_end = 2

            lane_change_time.append([t_history, t_first + len(frame_ori) - 1, label_end])

            if len(lane_change_time) == 1:
                continue
            else:
                ### lane_change_time: First point, index from back to front
                t0, t1, label0 = lane_change_time[0]
                t0 = int(t0)
                t1 =
                     int(t1)

                # Reduce the area within 40 steps
                if t1 - t0 > 40:
                    t0 = t1 - 40

                count_heading = 0
                if label0 == 0:
                    for tmp in range(t1, t0 - 1, -1):
                        frame_heading = frame_ori[frame_ori.Global_Time == tmp]
                        if frame_heading.iloc[0, 7] > -1:  # left heading angle threshold
                            count_heading = count_heading + 1
                            if count_heading >= 3:
                                break
                            else:
                                label_storage.append([i, tmp, label0])
                        else:
                            label_storage.append([i, tmp, label0])
                            count_heading = 0

                elif label0 == 2:
                    for tmp in range(t1 + 1, t0 - 1, -1):
                        frame_heading = frame_ori[frame_ori.Global_Time == tmp]
                        if frame_heading.iloc[0, 7] < 1:  # right heading angle threshold
                            count_heading = count_heading + 1
                            if count_heading >= 3:
                                break
                            else:
                                label_storage.append([i, tmp, label0])
                        else:
                            label_storage.append([i, tmp, label0])
                            count_heading = 0

                ### lane_change_time: middle point
                if len(lane_change_time) > 2:
                    for o in range(1, len(lane_change_time) - 1):
                        t_0_no_use, t_1_no_use, label_0 = lane_change_time[o - 1]
                        t_0, t_1, label_1_no_use = lane_change_time[o]

                        t_0 = int(t_0)
                        t_1 = int(t_1)
                        # Reduce the area within 40 steps
                        # Front half area, indexed from front to back
                        if t_1 - t_0 > 40:
                            t1 = t_0 + 40

                        count_heading = 0
                        if label_0 == 0:
                            for tmp in range(t_0, t1 + 1):
                                frame_heading = frame_ori[frame_ori.Global_Time == tmp]
                                if frame_heading.iloc[0, 7] > -1:
                                    count_heading = count_heading + 1
                                    if count_heading >= 3:
                                        break
                                    else:
                                        label_storage.append([i, tmp, label0])
                                else:
                                    label_storage.append([i, tmp, label0])
                                    count_heading = 0

                        elif label_0 == 2:
                            for tmp in range(t_0, t1 + 1):
                                frame_heading = frame_ori[frame_ori.Global_Time == tmp]
                                if frame_heading.iloc[0, 7] < 1:
                                    count_heading = count_heading + 1
                                    if count_heading >= 3:
                                        break
                                    else:
                                        label_storage.append([i, tmp, label0])
                                else:
                                    label_storage.append([i, tmp, label0])
                                    count_heading = 0

                        # Reduce the area within 40 steps
                        # Second half area, indexed from back to front
                        if t_1 - t_0 > 40:
                            t0 = t_1 - 40

                        count_heading = 0
                        if label_0 == 0:
                            for tmp in range(t_1, t0 - 1, -1):
                                frame_heading = frame_ori[frame_ori.Global_Time == tmp]
                                if frame_heading.iloc[0, 7] > -1:
                                    count_heading = count_heading + 1
                                    if count_heading >= 3:
                                        break
                                    else:
                                        label_storage.append([i, tmp, label0])
                                else:
                                    label_storage.append([i, tmp, label0])
                                    count_heading = 0

                        elif label_0 == 2:
                            for tmp in range(t_1, t0 - 1, -1):
                                frame_heading = frame_ori[frame_ori.Global_Time == tmp]
                                if frame_heading.iloc[0, 7] < 1:
                                    count_heading = count_heading + 1
                                    if count_heading >= 3:
                                        break
                                    else:
                                        label_storage.append([i, tmp, label0])
                                else:
                                    label_storage.append([i, tmp, label0])
                                    count_heading = 0

                ### lane_change_time: Final point
                t_0_no_use, t_1_no_use, label_0 = lane_change_time[len(lane_change_time) - 2]
                t_0, t_1, label_1_no_use = lane_change_time[len(lane_change_time) - 1]

                t_0 = int(t_0)
                t_1 = int(t_1)

                # Reduce the area within 40 steps, Front half area
                if t_1 - t_0 > 40:
                    t1 = t_0 + 40

                count_heading = 0
                if label_0 == 0:
                    for tmp in range(t_0, t1 + 1):
                        frame_heading = frame_ori[frame_ori.Global_Time == tmp]
                        if frame_heading.iloc[0, 7] > -1:
                            count_heading = count_heading + 1
                            if count_heading >= 3:
                                break
                            else:
                                label_storage.append([i, tmp, label0])
                        else:
                            label_storage.append([i, tmp, label0])
                            count_heading = 0

                elif label_0 == 2:
                    for tmp in range(t_0, t1
                                     + 1):
                        frame_heading = frame_ori[frame_ori.Global_Time == tmp]
                        if frame_heading.iloc[0, 7] < 1:
                            count_heading = count_heading + 1
                            if count_heading >= 3:
                                break
                            else:
                                label_storage.append([i, tmp, label0])
                        else:
                            label_storage.append([i, tmp, label0])
                            count_heading = 0

            lane_change_time = []

        # Remove duplicate data
        label_storage_new = []
        for label_tmp in label_storage:
            if label_tmp not in label_storage_new:
                label_storage_new.append(label_tmp)

        data_new = pd.DataFrame(columns=["Vehicle_ID", "Global_Time", "lane_change_label"])
        data_tmp = pd.DataFrame(columns=["Vehicle_ID", "Global_Time", "lane_change_label"])

        for ii in range(len(label_storage_new)):
            # Read from the de-duplicated list; the original indexed label_storage here.
            data_tmp.loc[1] = label_storage_new[ii]
            # DataFrame.append was removed in pandas 2.0; pd.concat is the equivalent call.
            data_new = pd.concat([data_new, data_tmp], ignore_index=True)

        return data_new

    def replaceLabel(self):
        '''
        :return: replace "xxx_addLabel.csv" lane change label with "xxx_label.csv".
                 store the result to new file "xxx_Final_label.csv".
        '''

        dataS = self.dataRefresh
        dataS_1 = self.laneChangeLable

        ID_lists = dataS_1.Vehicle_ID.unique()

        f = open(self.path_final, 'w', newline='', encoding='utf-8')
        csv_writer = csv.writer(f)
        csv_writer.writerow(["Vehicle_ID",
                             "Global_Time",
                             "Local_X",
                             "Local_Y",
                             "v_Vel",
                             "v_Acc",
                             "Lane_ID",
                             "Heading_Angle",
                             "Left_Label",
                             "Right_Label",
                             "Lane_Change_Label"])

        for i in range(len(dataS)):
            dataS_tmp = dataS.iloc[i, :]
            veh_id = dataS_tmp.iloc[0]
            if veh_id not in ID_lists:
                csv_writer.writerow(dataS_tmp)
            else:
                time_tmp = dataS_tmp.iloc[1]
                dataS_1_tmp = dataS_1[dataS_1.Vehicle_ID == veh_id]
                dataS_1_tmp1 = dataS_1_tmp[dataS_1_tmp.Global_Time == time_tmp]
                if len(dataS_1_tmp1) == 0:
                    csv_writer.writerow(dataS_tmp)
                else:
                    csv_writer.writerow([dataS_tmp.iloc[0],
                                         dataS_tmp.iloc[1],
                                         dataS_tmp.iloc[2],
                                         dataS_tmp.iloc[3],
                                         dataS_tmp.iloc[4],
                                         dataS_tmp.iloc[5],
                                         dataS_tmp.iloc[6],
                                         dataS_tmp.iloc[7],
                                         dataS_tmp.iloc[8],
                                         dataS_tmp.iloc[9],
                                         dataS_1_tmp1.iloc[0, 2]])

            if i % 10000 == 0:
                print("Written: {}".format(i))

        f.close()


if __name__ == '__main__':

    path_ori = ["../trajectory_denoise/trajectories_2783_denoise.csv",
                "../trajectory_denoise/trajectories_1914_denoise.csv",
                "../trajectory_denoise/trajectories_1317_denoise.csv",
                "../trajectory_denoise/trajectories_0400_denoise.csv",
                "../trajectory_denoise/trajectories_0500_denoise.csv",
                "../trajectory_denoise/trajectories_0515_denoise.csv"]

    path_final = ["trajectory_2783_Final_label.csv",
                  "trajectory_1914_Final_label.csv",
                  "trajectory_1317_Final_label.csv",
                  "trajectory_0400_Final_label.csv",
                  "trajectory_0500_Final_label.csv",
                  "trajectory_0515_Final_label.csv"]

    for i in range(6):
        Pre = preprocess(path_ori[i], path_final[i])
        print("********************* Start Process {} file ********************* ".format(i))
        Pre.replaceLabel()
--------------------------------------------------------------------------------
/data_process/NGSIM/trajectory_denoise/trajectory_denoise.py:
--------------------------------------------------------------------------------
"""
@Author: Fhz
@Create Date: 2022/11/6 20:47
@File: trajectory_denoise.py
@Description:
@Modify Person Date:
"""
import csv
import copy
import pywt
import numpy as np
import pandas as pd


def denoiseData(data):

    data_denoise = copy.deepcopy(data)
    vx = 10 * np.diff(data)
    Wavelet = 'sym11'  # filter function

    for i in range(3):
        coffs = pywt.wavedec(vx, Wavelet, level=5)

        ca = coffs[0]  # approximation coefficients
        cd1 = coffs[1]  # details coefficients
        cd2 = coffs[2]
        cd3 = coffs[3]
        cd4 = coffs[4]
        cd5 = coffs[5]

        cdd1 = 0 * cd1
        cdd2 = 0 * cd2
        cdd3 = 0 * cd3
        cdd4 = 0 * cd4

        coffs_re = []
        coffs_re.append(ca)
        coffs_re.append(cdd1)
        coffs_re.append(cdd2)
        coffs_re.append(cdd3)
        coffs_re.append(cdd4)
        coffs_re.append(cd5)

        vx = pywt.waverec(coffs_re, Wavelet)

    vx = vx[:len(data) - 1]
    for j in range(len(vx)):
        data_denoise[j+1] = data_denoise[j] + vx[j] / 10

    return data_denoise


def reWriteData(path_ori, path_denoise):
    for i in range(len(path_ori)):
        print("******Start process data {} *******".format(i))
        dataS = pd.read_csv(path_ori[i])
        veh_id_max = np.max(dataS.Vehicle_ID.unique())
        print("The max vehicle id is {}".format(veh_id_max))

        f = open(path_denoise[i], 'w', newline='', encoding='utf-8')
        csv_writer = csv.writer(f)
        csv_writer.writerow(["Vehicle_ID",
                             "Frame_ID",
                             "Total_Frames",
                             "Global_Time",
                             "Local_X",
                             "Local_Y",
                             "Global_X",
                             "Global_Y",
                             "v_Length",
                             "v_Width",
                             "v_Class",
                             "v_Vel",
                             "v_Acc",
                             "Lane_ID",
                             "Preceeding",
                             "Following",
                             "Space_Hdwy",
                             "Time_Hdwy"])

        for j in range(veh_id_max):
            veh_id = j + 1
            frame_ori = dataS[dataS.Vehicle_ID == veh_id]
            if len(frame_ori) == 0:
                print("******vehicle_id {} is empty. *******".format(veh_id))
                continue
            else:
                frame_values = frame_ori.values
                x = frame_values[:, 4]
                x_denoise = denoiseData(x)
                frame_values[:, 4] = x_denoise
                csv_writer.writerows(frame_values)
                print("******vehicle_id {} is written. *******".format(veh_id))

        f.close()
        print("******End process data {} *******".format(i))


def test(path_ori, path_denoise):
    for i in range(len(path_ori)):
        data_ori = pd.read_csv(path_ori[i])
        data_denoise = pd.read_csv(path_denoise[i])

        print("The length of initial data {} is {}".format(i, len(data_ori)))
        print("The length of denoise data {} is {}".format(i, len(data_denoise)))


if __name__ == '__main__':
    path_ori = ["trajectories-0750am-0805am.csv",
                "trajectories-0805am-0820am.csv",
                "trajectories-0820am-0835am.csv",
                "trajectories-0400-0415.csv",
                "trajectories-0500-0515.csv",
                "trajectories-0515-0530.csv"]

    path_denoise = ["trajectories_2783_denoise.csv",
                    "trajectories_1914_denoise.csv",
                    "trajectories_1317_denoise.csv",
                    "trajectories_0400_denoise.csv",
                    "trajectories_0500_denoise.csv",
                    "trajectories_0515_denoise.csv"]

    reWriteData(path_ori, path_denoise)
    # test(path_ori, path_denoise)

--------------------------------------------------------------------------------
/img/NGSIM_data.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ColinFanghz/mtf-lstm/b99e0dbff63cea1075131a4edcca01e92abf0bc8/img/NGSIM_data.png
--------------------------------------------------------------------------------
/img/N_step1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ColinFanghz/mtf-lstm/b99e0dbff63cea1075131a4edcca01e92abf0bc8/img/N_step1.png
--------------------------------------------------------------------------------
/img/N_step2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ColinFanghz/mtf-lstm/b99e0dbff63cea1075131a4edcca01e92abf0bc8/img/N_step2.png
--------------------------------------------------------------------------------
/img/N_step3.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ColinFanghz/mtf-lstm/b99e0dbff63cea1075131a4edcca01e92abf0bc8/img/N_step3.png
--------------------------------------------------------------------------------
/img/N_step4.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ColinFanghz/mtf-lstm/b99e0dbff63cea1075131a4edcca01e92abf0bc8/img/N_step4.png
--------------------------------------------------------------------------------
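Note on the surrounding-vehicle search in final_DP.py: getOtherVehicles keeps only vehicles within a 60 m longitudinal window and then, per lane, picks the closest vehicle ahead of and behind the target. The sketch below restates that selection for a single lane in plain Python, without pandas. It is an illustrative simplification and not the repository's code: the function name `nearest_front_rear`, the `(vehicle_id, local_y)` tuple layout and the `NO_VEHICLE` constant are assumptions made for this example; only the 10000 sentinel and the 60 m window are taken from the script.

```python
NO_VEHICLE = 10000  # sentinel used in final_DP.py for an empty slot


def nearest_front_rear(ego_y, lane_vehicles, window=60.0):
    """Return (front_id, rear_id) of the closest vehicles around ego_y.

    lane_vehicles is an iterable of (vehicle_id, local_y) pairs for one lane
    (hypothetical layout chosen for this sketch). Vehicles whose longitudinal
    gap is not strictly inside the window are ignored.
    """
    front_id, rear_id = NO_VEHICLE, NO_VEHICLE
    best_front, best_rear = window, -window

    for veh_id, local_y in lane_vehicles:
        delta_y = local_y - ego_y
        if delta_y > 0:
            # vehicle ahead: keep the smallest positive gap
            if delta_y < best_front:
                best_front, front_id = delta_y, veh_id
        elif delta_y > best_rear:
            # vehicle behind (or level): keep the gap closest to zero
            best_rear, rear_id = delta_y, veh_id
    return front_id, rear_id


if __name__ == "__main__":
    lane = [(11, 130.0), (12, 105.0), (13, 92.5), (14, 20.0)]
    print(nearest_front_rear(100.0, lane))  # -> (12, 13)
```

Vehicle 14 is 80 m behind the target and therefore outside the window, so it is never considered; a lane with no vehicle inside the window yields the sentinel in both slots, which is the case the feature-filling code in getData checks with `== 10000`.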