├── Arpit_Aggarwal_Report.pdf ├── Arpit_Aggarwal_ppt.pdf ├── LICENSE.md ├── README.md ├── Report.pdf ├── code ├── __pycache__ │ ├── convnet.cpython-37.pyc │ ├── replay_buffer.cpython-37.pyc │ ├── td3.cpython-37.pyc │ └── utils.cpython-37.pyc ├── convnet.py ├── envs │ ├── __init__.py │ ├── __pycache__ │ │ ├── __init__.cpython-37.pyc │ │ ├── robot_target_tracking_env_interface.cpython-37.pyc │ │ └── robot_target_tracking_env_sensors.cpython-37.pyc │ ├── convnet.py │ ├── robot_target_tracking_env.py │ ├── robot_target_tracking_env_interface.py │ └── robot_target_tracking_env_sensors.py ├── eval.py ├── replay_buffer.py ├── td3.py ├── train.py └── utils.py ├── outputs ├── 7.png ├── case1_1.jpg ├── case1_1.png ├── case1_2.jpg ├── case1_2.png ├── case1_3.png ├── case1_4.png ├── case2_1.jpg ├── case2_1.png ├── case2_2.jpg ├── case2_2.png ├── case2_3.png ├── case2_4.png ├── case3_1.jpg ├── case3_1.png ├── case3_2.jpg ├── case3_2.png ├── case3_3.png ├── case3_4.png ├── case4_1.jpg ├── case4_1.png ├── case4_2.jpg ├── case4_2.png ├── case4_3.png ├── case4_4.png ├── greedy_reward.png ├── qlearning_reward.png ├── reward.png ├── reward_1.png ├── reward_sensors_1_targets_1.png ├── reward_sensors_1_targets_2.png ├── reward_sensors_1_targets_3.png ├── reward_sensors_1_targets_4.png ├── sample.png └── sample1.png └── scripts ├── __pycache__ └── utils.cpython-37.pyc ├── bh_main.py ├── ekf_main.py ├── qlearning.py └── utils.py /Arpit_Aggarwal_Report.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arp95/robot_target_tracking/662f2559893866c73717874ad8e2e5cfee78c731/Arpit_Aggarwal_Report.pdf -------------------------------------------------------------------------------- /Arpit_Aggarwal_ppt.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arp95/robot_target_tracking/662f2559893866c73717874ad8e2e5cfee78c731/Arpit_Aggarwal_ppt.pdf -------------------------------------------------------------------------------- /LICENSE.md: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2019 Arpit Aggarwal 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: 6 | 7 | The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. 8 | 9 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 10 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Multi-robot Target Tracking 2 | 3 | [![Packagist](https://img.shields.io/packagist/l/doctrine/orm.svg)](LICENSE.md) 4 | --- 5 | 6 | 7 | ### Authors 8 | Arpit Aggarwal 9 | Lifeng Zhou 10 | 11 | 12 | ### Introduction to the Project 13 | Robot-target tracking is the task of localizing the targets by using mobile robots. The task is to determine an optimal robot trajectory to minimize the uncertainty in the targets position. First, the task was comparing the Extended Kalman Filter(EKF) and Bayesian Histogram(BH) techniques for creating a heatmap of the environment, where the bright region denoted the total uncertainty in the targets position. Next, the task was using RL technique to determine the optimal action to be taken by the robot to minimize the total uncertainty in the heatmap. 14 | 15 | 16 | ### Results 17 | 18 | Actor-Critic Model Training (robots=1, targets=1) 19 | ![](outputs/reward_sensors_1_targets_1.png) 20 | 21 | Actor-Critic Model Training (robots=1, targets=2) 22 | ![](outputs/reward_sensors_1_targets_2.png) 23 | 24 | Actor-Critic Model Training (robots=1, targets=3) 25 | ![](outputs/reward_sensors_1_targets_3.png) 26 | 27 | Actor-Critic Model Training (robots=1, targets=4) 28 | ![](outputs/reward_sensors_1_targets_4.png) 29 | 30 | 31 | ### Software Required 32 | To run the .py files, use Python 3. Standard Python 3 libraries like OpenAI Gym, PyTorch, Numpy, and matplotlib are used. 33 | 34 | 35 | ### Credits 36 | The following links were helpful for this project: 37 | 1. https://github.com/ksengin/active-target-localization/ 38 | 2. https://ksengin.github.io/active-target-localization/ 39 | -------------------------------------------------------------------------------- /Report.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arp95/robot_target_tracking/662f2559893866c73717874ad8e2e5cfee78c731/Report.pdf -------------------------------------------------------------------------------- /code/__pycache__/convnet.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arp95/robot_target_tracking/662f2559893866c73717874ad8e2e5cfee78c731/code/__pycache__/convnet.cpython-37.pyc -------------------------------------------------------------------------------- /code/__pycache__/replay_buffer.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arp95/robot_target_tracking/662f2559893866c73717874ad8e2e5cfee78c731/code/__pycache__/replay_buffer.cpython-37.pyc -------------------------------------------------------------------------------- /code/__pycache__/td3.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arp95/robot_target_tracking/662f2559893866c73717874ad8e2e5cfee78c731/code/__pycache__/td3.cpython-37.pyc -------------------------------------------------------------------------------- /code/__pycache__/utils.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arp95/robot_target_tracking/662f2559893866c73717874ad8e2e5cfee78c731/code/__pycache__/utils.cpython-37.pyc -------------------------------------------------------------------------------- /code/convnet.py: -------------------------------------------------------------------------------- 1 | # reference: https://github.com/ksengin/active-target-localization/blob/master/target_localization/models/convnet.py 2 | import os 3 | 4 | import torch 5 | import torch.nn as nn 6 | import torch.nn.init as init 7 | import torch.nn.functional as F 8 | import torchvision 9 | 10 | 11 | def _init_weights(module: nn.Module): 12 | if type(module) == nn.Linear: 13 | torch.nn.init.xavier_uniform(module.weight) 14 | module.bias.data.fill_(0.01) 15 | 16 | 17 | class ConvNet(nn.Module): 18 | """ 19 | The ConvNet class defines a convolutional neural net with a desired output dimension. 20 | """ 21 | 22 | def __init__(self, out_dim: int = 128, pretrained=True): 23 | """ 24 | Create a convolutional network. 25 | """ 26 | super(ConvNet, self).__init__() 27 | 28 | self.act_fn = F.relu 29 | self.model_ft = torchvision.models.resnet18(pretrained=pretrained) 30 | self.model_ft.conv1 = nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3, bias=False) 31 | num_ftrs = self.model_ft.fc.out_features 32 | 33 | self.model_ft.apply(_init_weights) 34 | self.fc1 = nn.Linear(num_ftrs, num_ftrs) 35 | self.fc2 = nn.Linear(num_ftrs, num_ftrs) 36 | self.fc3 = nn.Linear(num_ftrs, out_dim) 37 | 38 | 39 | def forward(self, x: torch.tensor): 40 | x = self.act_fn(self.model_ft(x)) 41 | x = self.act_fn(self.fc1(x)) 42 | x = self.act_fn(self.fc2(x)) 43 | fc_out = self.fc3(x) 44 | return fc_out 45 | -------------------------------------------------------------------------------- /code/envs/__init__.py: -------------------------------------------------------------------------------- 1 | from envs.robot_target_tracking_env_interface import RobotTargetTrackingEnvInterface 2 | -------------------------------------------------------------------------------- /code/envs/__pycache__/__init__.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arp95/robot_target_tracking/662f2559893866c73717874ad8e2e5cfee78c731/code/envs/__pycache__/__init__.cpython-37.pyc -------------------------------------------------------------------------------- /code/envs/__pycache__/robot_target_tracking_env_interface.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arp95/robot_target_tracking/662f2559893866c73717874ad8e2e5cfee78c731/code/envs/__pycache__/robot_target_tracking_env_interface.cpython-37.pyc -------------------------------------------------------------------------------- /code/envs/__pycache__/robot_target_tracking_env_sensors.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arp95/robot_target_tracking/662f2559893866c73717874ad8e2e5cfee78c731/code/envs/__pycache__/robot_target_tracking_env_sensors.cpython-37.pyc -------------------------------------------------------------------------------- /code/envs/convnet.py: -------------------------------------------------------------------------------- 1 | # reference: https://github.com/ksengin/active-target-localization/blob/master/target_localization/models/convnet.py 2 | import os 3 | 4 | import torch 5 | import torch.nn as nn 6 | import torch.nn.init as init 7 | import torch.nn.functional as F 8 | import torchvision 9 | 10 | 11 | def _init_weights(module: nn.Module): 12 | if type(module) == nn.Linear: 13 | torch.nn.init.xavier_uniform(module.weight) 14 | module.bias.data.fill_(0.01) 15 | 16 | 17 | class ConvNet(nn.Module): 18 | """ 19 | The ConvNet class defines a convolutional neural net with a desired output dimension. 20 | """ 21 | 22 | def __init__(self, out_dim: int = 128, pretrained=False): 23 | """ 24 | Create a convolutional network. 25 | """ 26 | super(ConvNet, self).__init__() 27 | 28 | self.act_fn = F.relu 29 | self.model_ft = torchvision.models.resnet18(pretrained=pretrained) 30 | self.model_ft.conv1 = nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3, bias=False) 31 | num_ftrs = self.model_ft.fc.out_features 32 | 33 | self.model_ft.apply(_init_weights) 34 | self.fc1 = nn.Linear(num_ftrs, num_ftrs) 35 | self.fc2 = nn.Linear(num_ftrs, num_ftrs) 36 | self.fc3 = nn.Linear(num_ftrs, out_dim) 37 | 38 | 39 | def forward(self, x: torch.tensor): 40 | x = self.act_fn(self.model_ft(x)) 41 | x = self.act_fn(self.fc1(x)) 42 | x = self.act_fn(self.fc2(x)) 43 | fc_out = self.fc3(x) 44 | return fc_out 45 | -------------------------------------------------------------------------------- /code/envs/robot_target_tracking_env.py: -------------------------------------------------------------------------------- 1 | """ 2 | OpenAI Gym environment with robots and targets 3 | 4 | Environment source code: https://github.com/ksengin/active-target-localization/blob/master/target_localization/envs/tracking_waypoints_env.py 5 | Modifications: Modified the existing environment with number of robots > 1 and the targets moving case 6 | """ 7 | 8 | 9 | # header files 10 | import os 11 | import copy 12 | import numpy as np 13 | import torch 14 | import torch.nn as nn 15 | import torch.nn.functional as F 16 | import torchvision 17 | 18 | import gym 19 | from gym import error, spaces 20 | from gym.utils import seeding 21 | import matplotlib.pyplot as plt 22 | import matplotlib.cm as cm 23 | from matplotlib.patches import Ellipse 24 | import matplotlib.transforms as transforms 25 | from math import pi, cos, sin 26 | from scipy.stats import multivariate_normal 27 | from convnet import * 28 | 29 | 30 | # OpenAI gym environment class 31 | class RobotTargetTrackingEnv(gym.GoalEnv): 32 | 33 | def __init__(self): 34 | """ 35 | Init method for the environment 36 | """ 37 | self.device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu") 38 | self.seed() 39 | self.len_workspace = 20 40 | self.workspace = np.array([[0, self.len_workspace], [0, self.len_workspace]]) 41 | self.sigma_meas = 1.0 42 | self.time_step = 1 43 | 44 | self.action_space = spaces.Box(-np.pi, np.pi, shape=(1,), dtype='float32') 45 | self.convnet = ConvNet() 46 | 47 | 48 | def env_parametrization(self, num_targets=4, num_sensors=1, target_motion_omegas=None, meas_model='range'): 49 | """ 50 | Function for parametrizing the environment 51 | """ 52 | self.x1_list = [] 53 | self.y1_list = [] 54 | self.x2_list = [] 55 | self.y2_list = [] 56 | self.x3_list = [] 57 | self.y3_list = [] 58 | self.x4_list = [] 59 | self.y4_list = [] 60 | self.time_step = 1 61 | self.num_targets = num_targets 62 | self.true_targets_radii = torch.rand(self.num_targets)*5.0+2.0 63 | self.true_targets_pos = (torch.rand(self.num_targets, 2)*self.len_workspace) 64 | self.initial_true_targets_pos = self.true_targets_pos.clone() 65 | self.estimated_targets_mean = self.true_targets_pos.clone() 66 | self.estimated_targets_var = torch.zeros(self.num_targets, 2, 2) 67 | for index in range(0, self.num_targets): 68 | self.estimated_targets_var[index] = torch.tensor([[1.0, 0.0], [0.0, 1.0]]) 69 | self.target_motion_omegas = torch.rand(self.num_targets)*100.0 70 | 71 | self.heatmap = torch.zeros(self.len_workspace, self.len_workspace) 72 | for index in range(0, self.num_targets): 73 | x = np.linspace(0, self.len_workspace, self.len_workspace) 74 | y = np.linspace(0, self.len_workspace, self.len_workspace) 75 | X, Y = np.meshgrid(x, y) 76 | pos = np.empty(X.shape + (2,)) 77 | pos[:, :, 0] = X; pos[:, :, 1] = Y 78 | rv = multivariate_normal(self.estimated_targets_mean[index], self.estimated_targets_var[index]) 79 | self.heatmap += rv.pdf(pos) 80 | image = F.interpolate(self.heatmap, (256, 256), mode='bilinear') 81 | true_obs = self.convnet(image).squeeze() 82 | #true_obs = self.heatmap.flatten() 83 | 84 | self.robot_movement_x = [] 85 | self.robot_movement_y = [] 86 | self.num_sensors = num_sensors 87 | self.sensors_pos = torch.zeros(self.num_sensors, 2) 88 | for index in range(0, self.num_sensors): 89 | rand_angle = torch.rand(1)*2*np.pi 90 | self.sensors_pos[index] = self.true_targets_pos.mean(0)+(torch.sqrt(torch.rand(1)+0.5)*(self.len_workspace/2)*(torch.tensor([torch.cos(rand_angle), torch.sin(rand_angle)]))) 91 | if(self.sensors_pos[index, 0] >= self.len_workspace): 92 | self.sensors_pos[index, 0] -= (self.sensors_pos[index, 0]-self.len_workspace+1) 93 | if(self.sensors_pos[index, 0] <= 0): 94 | self.sensors_pos[index, 0] = (-self.sensors_pos[index, 0]+1) 95 | if(self.sensors_pos[index, 1] >= self.len_workspace): 96 | self.sensors_pos[index, 1] -= (self.sensors_pos[index, 1]-self.len_workspace+1) 97 | if(self.sensors_pos[index, 1] <= 0): 98 | self.sensors_pos[index, 1] = (-self.sensors_pos[index, 1]+1) 99 | 100 | self.robot_movement_x.append(float(self.sensors_pos[0, 0])) 101 | self.robot_movement_y.append(float(self.sensors_pos[0, 1])) 102 | self.x1_list.append(float(self.true_targets_pos[0, 0])) 103 | self.y1_list.append(float(self.true_targets_pos[0, 1])) 104 | self.x2_list.append(float(self.true_targets_pos[1, 0])) 105 | self.y2_list.append(float(self.true_targets_pos[1, 1])) 106 | self.x3_list.append(float(self.true_targets_pos[2, 0])) 107 | self.y3_list.append(float(self.true_targets_pos[2, 1])) 108 | self.x4_list.append(float(self.true_targets_pos[3, 0])) 109 | self.y4_list.append(float(self.true_targets_pos[3, 1])) 110 | 111 | self.meas_model = meas_model 112 | if self.meas_model == 'bearing': 113 | self.sigma_meas = 0.2 114 | self.normal_dist_1d_torch = lambda x, mu, sgm: 1.0 / (np.sqrt(2 * np.pi*sgm**2)) * torch.exp(-0.5 / sgm**2 * (np.pi-torch.abs(torch.abs(x-mu) - np.pi))**2) 115 | else: 116 | self.sigma_meas = 1.0 117 | self.normal_dist_1d_torch = lambda x, mu, sgm: 1.0 / (np.sqrt(2 * np.pi*sgm**2)) * np.exp(-0.5 / sgm**2 * (np.abs(x - mu)**2)) 118 | 119 | self.state = torch.cat((self.sensors_pos[0], torch.tensor(true_obs).float())) 120 | self.observation_space = spaces.Box(-np.inf, np.inf, shape=self.state.shape, dtype='float32') 121 | 122 | 123 | def step(self, action, step_size): 124 | """ 125 | Function to update the environment 126 | """ 127 | self._set_action(action, step_size) 128 | 129 | self.time_step = self.time_step + 1 130 | self.update_true_targets_pos() 131 | self.update_estimated_targets_pos() 132 | 133 | self.robot_movement_x.append(float(self.sensors_pos[0, 0])) 134 | self.robot_movement_y.append(float(self.sensors_pos[0, 1])) 135 | self.x1_list.append(float(self.true_targets_pos[0, 0])) 136 | self.y1_list.append(float(self.true_targets_pos[0, 1])) 137 | self.x2_list.append(float(self.true_targets_pos[1, 0])) 138 | self.y2_list.append(float(self.true_targets_pos[1, 1])) 139 | self.x3_list.append(float(self.true_targets_pos[2, 0])) 140 | self.y3_list.append(float(self.true_targets_pos[2, 1])) 141 | self.x4_list.append(float(self.true_targets_pos[3, 0])) 142 | self.y4_list.append(float(self.true_targets_pos[3, 1])) 143 | 144 | self.heatmap = torch.zeros(self.len_workspace, self.len_workspace) 145 | for index in range(0, self.num_targets): 146 | x = np.linspace(0, self.len_workspace, self.len_workspace) 147 | y = np.linspace(0, self.len_workspace, self.len_workspace) 148 | X, Y = np.meshgrid(x, y) 149 | pos = np.empty(X.shape + (2,)) 150 | pos[:, :, 0] = X; pos[:, :, 1] = Y 151 | rv = multivariate_normal(self.estimated_targets_mean[index], self.estimated_targets_var[index]) 152 | self.heatmap += rv.pdf(pos) 153 | true_obs = self.heatmap.flatten() 154 | 155 | done = False 156 | reward = None 157 | reward, done = self.compute_reward() 158 | if(self.time_step>1000 or float(self.sensors_pos[0, 0])<=0 or float(self.sensors_pos[0, 1])<=0 or float(self.sensors_pos[0, 0]) >= self.len_workspace or float(self.sensors_pos[0, 1])>=self.len_workspace): 159 | done = True 160 | 161 | self.state = torch.cat((self.sensors_pos[0], torch.tensor(true_obs).float())) 162 | return self.state, reward, done, None, self.estimated_targets_var 163 | 164 | 165 | def reset(self, **kwargs): 166 | """ 167 | Function to reset the environment 168 | """ 169 | self.x1_list = [] 170 | self.y1_list = [] 171 | self.x2_list = [] 172 | self.y2_list = [] 173 | self.x3_list = [] 174 | self.y3_list = [] 175 | self.x4_list = [] 176 | self.y4_list = [] 177 | self.time_step = 1 178 | self.true_targets_radii = torch.rand(self.num_targets)*5.0+2.0 179 | self.true_targets_pos = (torch.rand(self.num_targets, 2)*self.len_workspace) 180 | self.initial_true_targets_pos = self.true_targets_pos.clone() 181 | self.estimated_targets_mean = self.true_targets_pos.clone() 182 | self.estimated_targets_var = torch.zeros(self.num_targets, 2, 2) 183 | for index in range(0, self.num_targets): 184 | self.estimated_targets_var[index] = torch.tensor([[1, 0], [0, 1]]) 185 | self.target_motion_omegas = torch.rand(self.num_targets)*100.0 186 | 187 | self.robot_movement_x = [] 188 | self.robot_movement_y = [] 189 | self.sensors_pos = torch.zeros(self.num_sensors, 2) 190 | for index in range(0, self.num_sensors): 191 | rand_angle = torch.rand(1)*2*np.pi 192 | self.sensors_pos[index] = self.true_targets_pos.mean(0)+(torch.sqrt(torch.rand(1)+0.5)*(self.len_workspace/2)*(torch.tensor([torch.cos(rand_angle), torch.sin(rand_angle)]))) 193 | if(self.sensors_pos[index, 0] >= self.len_workspace): 194 | self.sensors_pos[index, 0] -= (self.sensors_pos[index, 0] - self.len_workspace + 1) 195 | if(self.sensors_pos[index, 0] <= 0): 196 | self.sensors_pos[index, 0] = (-self.sensors_pos[index, 0] + 1) 197 | if(self.sensors_pos[index, 1] >= self.len_workspace): 198 | self.sensors_pos[index, 1] -= (self.sensors_pos[index, 1] - self.len_workspace + 1) 199 | if(self.sensors_pos[index, 1] <= 0): 200 | self.sensors_pos[index, 1] = (-self.sensors_pos[index, 1] + 1) 201 | 202 | self.robot_movement_x.append(float(self.sensors_pos[0, 0])) 203 | self.robot_movement_y.append(float(self.sensors_pos[0, 1])) 204 | self.x1_list.append(float(self.true_targets_pos[0, 0])) 205 | self.y1_list.append(float(self.true_targets_pos[0, 1])) 206 | self.x2_list.append(float(self.true_targets_pos[1, 0])) 207 | self.y2_list.append(float(self.true_targets_pos[1, 1])) 208 | self.x3_list.append(float(self.true_targets_pos[2, 0])) 209 | self.y3_list.append(float(self.true_targets_pos[2, 1])) 210 | self.x4_list.append(float(self.true_targets_pos[3, 0])) 211 | self.y4_list.append(float(self.true_targets_pos[3, 1])) 212 | 213 | self.heatmap = torch.zeros(self.len_workspace, self.len_workspace) 214 | for index in range(0, self.num_targets): 215 | x = np.linspace(0, self.len_workspace, self.len_workspace) 216 | y = np.linspace(0, self.len_workspace, self.len_workspace) 217 | X, Y = np.meshgrid(x, y) 218 | pos = np.empty(X.shape + (2,)) 219 | pos[:, :, 0] = X; pos[:, :, 1] = Y 220 | rv = multivariate_normal(self.estimated_targets_mean[index], self.estimated_targets_var[index]) 221 | self.heatmap += rv.pdf(pos) 222 | true_obs = self.heatmap.flatten() 223 | 224 | self.state = torch.cat((self.sensors_pos[0], torch.tensor(true_obs).float())) 225 | self.observation_space = spaces.Box(-np.inf, np.inf, shape=self.state.shape, dtype='float32') 226 | return self.state, self.sensors_pos, self.estimated_targets_mean, self.true_targets_radii, self.target_motion_omegas 227 | 228 | 229 | def close(self): 230 | """ 231 | Function to close the environment 232 | """ 233 | pass 234 | 235 | 236 | def render(self): 237 | """ 238 | Function for rendering the environment 239 | """ 240 | heatmap = torch.zeros(256, 256) 241 | for index in range(0, self.num_targets): 242 | x = np.linspace(0, self.len_workspace, 256) 243 | y = np.linspace(0, self.len_workspace, 256) 244 | X, Y = np.meshgrid(x, y) 245 | pos = np.empty(X.shape + (2,)) 246 | pos[:, :, 0] = X; pos[:, :, 1] = Y 247 | rv = multivariate_normal(self.estimated_targets_mean[index], self.estimated_targets_var[index]) 248 | heatmap += rv.pdf(pos) 249 | x = np.linspace(0, self.len_workspace, 256) 250 | y = np.linspace(0, self.len_workspace, 256) 251 | X, Y = np.meshgrid(x, y) 252 | plt.cla() 253 | plt.title("Time step = " + str(self.time_step)) 254 | plt.xlabel("x") 255 | plt.ylabel("y") 256 | plt.xlim(0, self.len_workspace) 257 | plt.ylim(0, self.len_workspace) 258 | plt.contourf(X, Y, heatmap, cmap=cm.inferno) 259 | plt.plot(self.x1_list, self.y1_list, 'b--') 260 | plt.plot(self.x1_list[len(self.x1_list)-1], self.y1_list[len(self.y1_list)-1], 'o', c='b', marker='*') 261 | plt.plot(self.x2_list, self.y2_list, 'b--') 262 | plt.plot(self.x2_list[len(self.x2_list)-1], self.y2_list[len(self.y2_list)-1], 'o', c='b', marker='*') 263 | plt.plot(self.x3_list, self.y3_list, 'b--') 264 | plt.plot(self.x3_list[len(self.x3_list)-1], self.y3_list[len(self.y3_list)-1], 'o', c='b', marker='*') 265 | plt.plot(self.x4_list, self.y4_list, 'b--') 266 | plt.plot(self.x4_list[len(self.x4_list)-1], self.y4_list[len(self.y4_list)-1], 'o', c='b', marker='*') 267 | if(len(self.robot_movement_x)<8): 268 | plt.plot(self.robot_movement_x, self.robot_movement_y, 'r--') 269 | else: 270 | plt.plot(self.robot_movement_x[-8:], self.robot_movement_y[-8:], 'r--') 271 | plt.scatter(self.robot_movement_x[len(self.robot_movement_x)-1], self.robot_movement_y[len(self.robot_movement_y)-1], color='r', marker='D') 272 | plt.savefig("/home/arpitdec5/Desktop/robot_target_tracking/s2/"+str(self.time_step)+".png") 273 | #plt.show() 274 | 275 | 276 | # reference: https://matplotlib.org/3.1.0/gallery/statistics/confidence_ellipse.html 277 | def get_correlated_dataset(self, n, dependency, mu, scale): 278 | latent = np.random.randn(n, 2) 279 | dependent = latent.dot(dependency) 280 | scaled = dependent * scale 281 | scaled_with_offset = scaled + mu 282 | return (scaled_with_offset[:, 0], scaled_with_offset[:, 1]) 283 | 284 | 285 | def compute_reward(self): 286 | """ 287 | Function for computing the reward 288 | """ 289 | val = 0.0 290 | for index in range(0, self.num_targets): 291 | val += np.log(np.linalg.det(self.estimated_targets_var[index])) 292 | return -val, False 293 | 294 | 295 | # reference: https://matplotlib.org/3.1.0/gallery/statistics/confidence_ellipse.html 296 | def confidence_ellipse(self, x, y, ax, n_std=3.0, facecolor='none', **kwargs): 297 | """ 298 | Inputs: 299 | x: the x-coordinate datapoints 300 | y: the y-coordinate datapoints 301 | """ 302 | if x.size != y.size: 303 | raise ValueError("x and y must be the same size") 304 | 305 | cov = np.cov(x, y) 306 | pearson = cov[0, 1]/np.sqrt(cov[0, 0] * cov[1, 1]) 307 | 308 | ell_radius_x = np.sqrt(1 + pearson) 309 | ell_radius_y = np.sqrt(1 - pearson) 310 | ellipse = Ellipse((0, 0), width=ell_radius_x * 2, height=ell_radius_y * 2, facecolor=facecolor, **kwargs) 311 | scale_x = np.sqrt(cov[0, 0]) * n_std 312 | mean_x = np.mean(x) 313 | 314 | scale_y = np.sqrt(cov[1, 1]) * n_std 315 | mean_y = np.mean(y) 316 | 317 | transf = transforms.Affine2D().rotate_deg(45).scale(scale_x, scale_y).translate(mean_x, mean_y) 318 | ellipse.set_transform(transf + ax.transData) 319 | return ax.add_patch(ellipse) 320 | 321 | 322 | def _reset_sim(self, **kwargs): 323 | raise NotImplementedError() 324 | 325 | 326 | def _get_obs(self): 327 | """ 328 | Get observation function 329 | Returns the noisy relative measurement depending on the measurement model (bearing or range) 330 | """ 331 | targets_pos = self.true_targets_pos 332 | sensor_pos = self.sensors_pos[0] 333 | 334 | if self.meas_model == 'bearing': 335 | true_measurement = torch.atan2(targets_pos[:, 1] - sensor_pos[1], targets_pos[:, 0] - sensor_pos[0]) 336 | elif self.meas_model == 'range': 337 | true_measurement = torch.norm(targets_pos - sensor_pos, p=2, dim=1) 338 | 339 | true_obs = true_measurement + (self.sigma_meas * torch.randn(targets_pos.shape[0])) 340 | return true_obs 341 | 342 | 343 | def get_estimated_obs(self): 344 | """ 345 | Get observation function 346 | Returns the noisy relative measurement depending on the measurement model (bearing or range) 347 | """ 348 | targets_pos = self.estimated_targets_mean 349 | sensor_pos = self.sensors_pos[0] 350 | 351 | if self.meas_model == 'bearing': 352 | true_measurement = torch.atan2(targets_pos[:, 1] - sensor_pos[1], targets_pos[:, 0] - sensor_pos[0]) 353 | elif self.meas_model == 'range': 354 | true_measurement = torch.norm(targets_pos - sensor_pos, p=2, dim=1) 355 | return true_measurement 356 | 357 | 358 | def update_true_targets_pos(self): 359 | """ 360 | Function to update the true target positions when time step increases (assuming circular target motions) 361 | """ 362 | for index in range(0, self.num_targets): 363 | self.true_targets_pos[index] = torch.tensor([self.true_targets_radii[index]*np.cos((self.time_step-1)/float(self.target_motion_omegas[index]))+float(self.initial_true_targets_pos[index, 0])-self.true_targets_radii[index], self.true_targets_radii[index]*np.sin((self.time_step-1)/float(self.target_motion_omegas[index]))+float(self.initial_true_targets_pos[index, 1])]) 364 | 365 | 366 | def update_estimated_targets_pos(self): 367 | """ 368 | Function to update the estimated target positions when time step increases (using ekf) 369 | """ 370 | true_obs = self._get_obs() 371 | for index in range(0, self.num_targets): 372 | z_true = np.zeros((1, 1)) 373 | z_true[0, 0] = true_obs[index] 374 | q_matrix = 0.2 * np.eye(2) 375 | x_matrix = np.array([[float(self.estimated_targets_mean[index, 0])], [float(self.estimated_targets_mean[index, 1])]]) 376 | sigma_matrix = self.estimated_targets_var[index].numpy() + q_matrix 377 | 378 | z_pred = np.zeros((1, 1)) 379 | h_matrix = np.zeros((1, 2)) 380 | for index1 in range(0, 1): 381 | z_pred[index1][0] = np.linalg.norm([[x_matrix[0, 0] - float(self.sensors_pos[0, 0])], [x_matrix[1, 0] - float(self.sensors_pos[0, 1])]], 2) 382 | h_matrix[index1, 0] = (-1.0 / z_pred[index1]) * (float(self.sensors_pos[0, 0]) - x_matrix[0, 0]) 383 | h_matrix[index1, 1] = (-1.0 / z_pred[index1]) * (float(self.sensors_pos[0, 1]) - x_matrix[1, 0]) 384 | 385 | res = (z_true - z_pred) 386 | r_matrix = 1.0 * 1.0 * np.eye(1) 387 | s_matrix = np.matmul(np.matmul(h_matrix, sigma_matrix), h_matrix.T) + r_matrix 388 | k_matrix = np.matmul(np.matmul(sigma_matrix, h_matrix.T), np.linalg.inv(s_matrix)) 389 | 390 | x_matrix_tplus1 = x_matrix + (np.matmul(k_matrix, res)) 391 | sigma_matrix_tplus1 = np.matmul(np.matmul((np.eye(2) - np.matmul(k_matrix, h_matrix)), sigma_matrix), (np.eye(2) - np.matmul(k_matrix, h_matrix)).T) + np.matmul(np.matmul(k_matrix, r_matrix), k_matrix.T) 392 | target_xhat_tplus1 = x_matrix_tplus1[0, 0] 393 | target_yhat_tplus1 = x_matrix_tplus1[1, 0] 394 | 395 | self.estimated_targets_mean[index, 0] = target_xhat_tplus1 396 | self.estimated_targets_mean[index, 1] = target_yhat_tplus1 397 | self.estimated_targets_var[index] = torch.tensor(sigma_matrix_tplus1) 398 | 399 | 400 | def _get_true_target_position(self): 401 | """ 402 | Function to return the true target positions. 403 | """ 404 | return self.true_targets_pos 405 | 406 | 407 | def _get_estimated_target_position(self): 408 | """ 409 | Function to return the estimated target positions. 410 | """ 411 | return self.estimated_targets_pos 412 | 413 | 414 | def get_posterior_map(self): 415 | return self.posterior_map 416 | 417 | 418 | def _set_action(self, action, step_size): 419 | """ 420 | Applies the given action to the sensor. 421 | """ 422 | action = torch.tensor(action).float() 423 | vector = step_size * torch.tensor([torch.cos(action), torch.sin(action)]) 424 | sensor_pos = self.sensors_pos[0] + vector 425 | if(sensor_pos[0] >= self.len_workspace): 426 | sensor_pos[0] = self.sensors_pos[0, 0] 427 | if(sensor_pos[1] >= self.len_workspace): 428 | sensor_pos[1] = self.sensors_pos[0, 1] 429 | self.sensors_pos[0] = sensor_pos 430 | 431 | 432 | def seed(self, seed=None): 433 | """ 434 | Function that returns a random seed using OpenAI Gym seeding. 435 | """ 436 | _, seed = seeding.np_random(seed) 437 | return [seed] 438 | 439 | 440 | def _is_success(self, achieved_goal, desired_goal): 441 | raise NotImplementedError() 442 | 443 | 444 | def _sample_goal(self): 445 | return torch.rand(2) 446 | 447 | 448 | def _env_setup(self, initial_qpos): 449 | pass 450 | 451 | 452 | def _viewer_setup(self): 453 | pass 454 | 455 | 456 | def _render_callback(self): 457 | pass 458 | 459 | 460 | def _step_callback(self): 461 | pass 462 | -------------------------------------------------------------------------------- /code/envs/robot_target_tracking_env_interface.py: -------------------------------------------------------------------------------- 1 | # header files needed 2 | from gym import utils 3 | from envs.robot_target_tracking_env_sensors import RobotTargetTrackingEnv 4 | import numpy as np 5 | import torch 6 | 7 | 8 | class RobotTargetTrackingEnvInterface(RobotTargetTrackingEnv, utils.EzPickle): 9 | def __init__(self): 10 | RobotTargetTrackingEnv.__init__(self) 11 | utils.EzPickle.__init__(self) 12 | -------------------------------------------------------------------------------- /code/envs/robot_target_tracking_env_sensors.py: -------------------------------------------------------------------------------- 1 | """ 2 | OpenAI Gym environment with robots and targets 3 | 4 | Environment source code: https://github.com/ksengin/active-target-localization/blob/master/target_localization/envs/tracking_waypoints_env.py 5 | Modifications: Modified the existing environment with number of robots > 1 and the targets moving case 6 | """ 7 | 8 | 9 | # header files 10 | import os 11 | import copy 12 | import numpy as np 13 | import time 14 | import torch 15 | import torch.nn as nn 16 | import torch.nn.functional as F 17 | import torchvision 18 | 19 | import gym 20 | from gym import error, spaces 21 | from gym.utils import seeding 22 | import matplotlib.pyplot as plt 23 | import matplotlib.cm as cm 24 | from matplotlib.patches import Ellipse 25 | import matplotlib.transforms as transforms 26 | from math import pi, cos, sin 27 | from scipy.stats import multivariate_normal 28 | from convnet import * 29 | 30 | 31 | # OpenAI gym environment class 32 | class RobotTargetTrackingEnv(gym.GoalEnv): 33 | 34 | def __init__(self): 35 | """ 36 | Init method for the environment 37 | """ 38 | self.device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu") 39 | self.seed() 40 | self.len_workspace = 20 41 | self.workspace = np.array([[0, self.len_workspace], [0, self.len_workspace]]) 42 | self.sigma_meas = 1.0 43 | self.time_step = 1 44 | 45 | self.action_space = spaces.Box(-np.pi, np.pi, shape=(1,), dtype='float32') 46 | self.convnet = ConvNet() 47 | 48 | 49 | def env_parametrization(self, num_targets=2, num_sensors=2, target_motion_omegas=None, meas_model='range', include_cnn=True): 50 | """ 51 | Function for parametrizing the environment 52 | """ 53 | self.include_cnn = include_cnn 54 | self.x1_list = [] 55 | self.y1_list = [] 56 | self.x2_list = [] 57 | self.y2_list = [] 58 | self.x3_list = [] 59 | self.y3_list = [] 60 | self.x4_list = [] 61 | self.y4_list = [] 62 | self.time_step = 1 63 | self.num_targets = num_targets 64 | self.true_targets_radii = torch.rand(self.num_targets)*5.0+2.0 65 | self.true_targets_pos = (torch.rand(self.num_targets, 2)*self.len_workspace) 66 | self.initial_true_targets_pos = self.true_targets_pos.clone() 67 | self.estimated_targets_mean = self.true_targets_pos.clone() 68 | self.estimated_targets_var = torch.zeros(self.num_targets, 2, 2) 69 | for index in range(0, self.num_targets): 70 | self.estimated_targets_var[index] = torch.tensor([[1.0, 0.0], [0.0, 1.0]]) 71 | self.target_motion_omegas = torch.rand(self.num_targets)*100.0 72 | 73 | self.heatmap = torch.zeros(self.len_workspace, self.len_workspace) 74 | for index in range(0, self.num_targets): 75 | x = np.linspace(0, self.len_workspace, self.len_workspace) 76 | y = np.linspace(0, self.len_workspace, self.len_workspace) 77 | X, Y = np.meshgrid(x, y) 78 | pos = np.empty(X.shape + (2,)) 79 | pos[:, :, 0] = X; pos[:, :, 1] = Y 80 | rv = multivariate_normal(self.estimated_targets_mean[index], self.estimated_targets_var[index]) 81 | self.heatmap += rv.pdf(pos) 82 | if include_cnn: 83 | image = F.interpolate(self.heatmap.unsqueeze(0).unsqueeze(0), (256, 256), mode='bilinear', align_corners=True) 84 | image = image.float() 85 | true_obs = self.convnet(image).squeeze() 86 | else: 87 | true_obs = self.heatmap.flatten() 88 | 89 | self.robot_movement_x_1 = [] 90 | self.robot_movement_y_1 = [] 91 | self.robot_movement_x_2 = [] 92 | self.robot_movement_y_2 = [] 93 | self.robot_movement_x_3 = [] 94 | self.robot_movement_y_3 = [] 95 | self.robot_movement_x_4 = [] 96 | self.robot_movement_y_4 = [] 97 | self.num_sensors = num_sensors 98 | self.sensors_pos = torch.zeros(self.num_sensors, 2) 99 | for index in range(0, self.num_sensors): 100 | rand_angle = torch.rand(1)*2*np.pi 101 | self.sensors_pos[index] = self.true_targets_pos.mean(0) + (torch.sqrt(torch.rand(1)+0.5)*(self.len_workspace/2)*(torch.tensor([torch.cos(rand_angle), torch.sin(rand_angle)]))) 102 | if(self.sensors_pos[index, 0]>=self.len_workspace): 103 | self.sensors_pos[index, 0] -= (self.sensors_pos[index, 0]-self.len_workspace+1) 104 | if(self.sensors_pos[index, 0]<=0): 105 | self.sensors_pos[index, 0] = (-self.sensors_pos[index, 0]+1) 106 | if(self.sensors_pos[index, 1]>=self.len_workspace): 107 | self.sensors_pos[index, 1] -= (self.sensors_pos[index, 1]-self.len_workspace+1) 108 | if(self.sensors_pos[index, 1]<=0): 109 | self.sensors_pos[index, 1] = (-self.sensors_pos[index, 1]+1) 110 | 111 | self.robot_movement_x_1.append(float(self.sensors_pos[0, 0])) 112 | self.robot_movement_y_1.append(float(self.sensors_pos[0, 1])) 113 | self.robot_movement_x_2.append(float(self.sensors_pos[1, 0])) 114 | self.robot_movement_y_2.append(float(self.sensors_pos[1, 1])) 115 | #self.robot_movement_x_3.append(float(self.sensors_pos[2, 0])) 116 | #self.robot_movement_y_3.append(float(self.sensors_pos[2, 1])) 117 | #self.robot_movement_x_4.append(float(self.sensors_pos[3, 0])) 118 | #self.robot_movement_y_4.append(float(self.sensors_pos[3, 1])) 119 | self.x1_list.append(float(self.true_targets_pos[0, 0])) 120 | self.y1_list.append(float(self.true_targets_pos[0, 1])) 121 | self.x2_list.append(float(self.true_targets_pos[1, 0])) 122 | self.y2_list.append(float(self.true_targets_pos[1, 1])) 123 | #self.x3_list.append(float(self.true_targets_pos[2, 0])) 124 | #self.y3_list.append(float(self.true_targets_pos[2, 1])) 125 | #self.x4_list.append(float(self.true_targets_pos[3, 0])) 126 | #self.y4_list.append(float(self.true_targets_pos[3, 1])) 127 | 128 | self.meas_model = meas_model 129 | if self.meas_model == 'bearing': 130 | self.sigma_meas = 0.2 131 | self.normal_dist_1d_torch = lambda x, mu, sgm: 1.0/(np.sqrt(2*np.pi*sgm**2))*torch.exp(-0.5/sgm**2*(np.pi-torch.abs(torch.abs(x-mu) - np.pi))**2) 132 | else: 133 | self.sigma_meas = 1.0 134 | self.normal_dist_1d_torch = lambda x, mu, sgm: 1.0/(np.sqrt(2*np.pi*sgm**2))*np.exp(-0.5/sgm**2*(np.abs(x-mu)**2)) 135 | self.state = torch.cat((self.sensors_pos[0], self.sensors_pos[1], torch.tensor(true_obs).float())) 136 | self.observation_space = spaces.Box(-np.inf, np.inf, shape=self.state.shape, dtype='float32') 137 | 138 | 139 | def step(self, action, step_size): 140 | """ 141 | Function to update the environment 142 | """ 143 | start = time.time() 144 | self._set_action(action, step_size) 145 | 146 | self.time_step = self.time_step + 1 147 | self.update_true_targets_pos() 148 | self.update_estimated_targets_pos() 149 | 150 | self.robot_movement_x_1.append(float(self.sensors_pos[0, 0])) 151 | self.robot_movement_y_1.append(float(self.sensors_pos[0, 1])) 152 | self.robot_movement_x_2.append(float(self.sensors_pos[1, 0])) 153 | self.robot_movement_y_2.append(float(self.sensors_pos[1, 1])) 154 | #self.robot_movement_x_3.append(float(self.sensors_pos[2, 0])) 155 | #self.robot_movement_y_3.append(float(self.sensors_pos[2, 1])) 156 | #self.robot_movement_x_4.append(float(self.sensors_pos[3, 0])) 157 | #self.robot_movement_y_4.append(float(self.sensors_pos[3, 1])) 158 | self.x1_list.append(float(self.true_targets_pos[0, 0])) 159 | self.y1_list.append(float(self.true_targets_pos[0, 1])) 160 | self.x2_list.append(float(self.true_targets_pos[1, 0])) 161 | self.y2_list.append(float(self.true_targets_pos[1, 1])) 162 | #self.x3_list.append(float(self.true_targets_pos[2, 0])) 163 | #self.y3_list.append(float(self.true_targets_pos[2, 1])) 164 | #self.x4_list.append(float(self.true_targets_pos[3, 0])) 165 | #self.y4_list.append(float(self.true_targets_pos[3, 1])) 166 | 167 | self.heatmap = torch.zeros(self.len_workspace, self.len_workspace) 168 | for index in range(0, self.num_targets): 169 | x = np.linspace(0, self.len_workspace, self.len_workspace) 170 | y = np.linspace(0, self.len_workspace, self.len_workspace) 171 | X, Y = np.meshgrid(x, y) 172 | pos = np.empty(X.shape + (2,)) 173 | pos[:, :, 0] = X; pos[:, :, 1] = Y 174 | rv = multivariate_normal(self.estimated_targets_mean[index], self.estimated_targets_var[index]) 175 | self.heatmap += rv.pdf(pos) 176 | if self.include_cnn: 177 | image = F.interpolate(self.heatmap.unsqueeze(0).unsqueeze(0), (256, 256), mode='bilinear', align_corners=True) 178 | image = image.float() 179 | true_obs = self.convnet(image).squeeze() 180 | else: 181 | true_obs = self.heatmap.flatten() 182 | 183 | done = False 184 | reward = None 185 | reward, done = self.compute_reward() 186 | for index in range(0, self.num_sensors): 187 | if(self.time_step>200 or float(self.sensors_pos[index, 0]) <= 0 or float(self.sensors_pos[index, 1]) <= 0 or float(self.sensors_pos[index, 0]) >= self.len_workspace or float(self.sensors_pos[index, 1]) >= self.len_workspace): 188 | done = True 189 | 190 | self.state = torch.cat((self.sensors_pos[0], self.sensors_pos[1], torch.tensor(true_obs).float())) 191 | end = time.time() - start 192 | return self.state, reward, done, None, self.estimated_targets_var 193 | 194 | 195 | def reset(self, **kwargs): 196 | """ 197 | Function to reset the environment 198 | """ 199 | self.x1_list = [] 200 | self.y1_list = [] 201 | self.x2_list = [] 202 | self.y2_list = [] 203 | self.x3_list = [] 204 | self.y3_list = [] 205 | self.x4_list = [] 206 | self.y4_list = [] 207 | self.time_step = 1 208 | self.true_targets_radii = torch.rand(self.num_targets)*5.0+2.0 209 | self.true_targets_pos = (torch.rand(self.num_targets, 2)*self.len_workspace) 210 | self.initial_true_targets_pos = self.true_targets_pos.clone() 211 | self.estimated_targets_mean = self.true_targets_pos.clone() 212 | self.estimated_targets_var = torch.zeros(self.num_targets, 2, 2) 213 | for index in range(0, self.num_targets): 214 | self.estimated_targets_var[index] = torch.tensor([[1, 0], [0, 1]]) 215 | self.target_motion_omegas = torch.rand(self.num_targets)*100.0 216 | 217 | self.robot_movement_x_1 = [] 218 | self.robot_movement_y_1 = [] 219 | self.robot_movement_x_2 = [] 220 | self.robot_movement_y_2 = [] 221 | self.robot_movement_x_3 = [] 222 | self.robot_movement_y_3 = [] 223 | self.robot_movement_x_4 = [] 224 | self.robot_movement_y_4 = [] 225 | self.sensors_pos = torch.zeros(self.num_sensors, 2) 226 | for index in range(0, self.num_sensors): 227 | rand_angle = torch.rand(1)*2*np.pi 228 | self.sensors_pos[index] = self.true_targets_pos.mean(0)+(torch.sqrt(torch.rand(1)+0.5)*(self.len_workspace/2)*(torch.tensor([torch.cos(rand_angle), torch.sin(rand_angle)]))) 229 | if(self.sensors_pos[index, 0]>=self.len_workspace): 230 | self.sensors_pos[index, 0] -= (self.sensors_pos[index, 0]-self.len_workspace+1) 231 | if(self.sensors_pos[index, 0]<=0): 232 | self.sensors_pos[index, 0] = (-self.sensors_pos[index, 0]+1) 233 | if(self.sensors_pos[index, 1]>=self.len_workspace): 234 | self.sensors_pos[index, 1] -= (self.sensors_pos[index, 1]-self.len_workspace+1) 235 | if(self.sensors_pos[index, 1]<=0): 236 | self.sensors_pos[index, 1] = (-self.sensors_pos[index, 1]+1) 237 | 238 | self.robot_movement_x_1.append(float(self.sensors_pos[0, 0])) 239 | self.robot_movement_y_1.append(float(self.sensors_pos[0, 1])) 240 | self.robot_movement_x_2.append(float(self.sensors_pos[1, 0])) 241 | self.robot_movement_y_2.append(float(self.sensors_pos[1, 1])) 242 | #self.robot_movement_x_3.append(float(self.sensors_pos[2, 0])) 243 | #self.robot_movement_y_3.append(float(self.sensors_pos[2, 1])) 244 | #self.robot_movement_x_4.append(float(self.sensors_pos[3, 0])) 245 | #self.robot_movement_y_4.append(float(self.sensors_pos[3, 1])) 246 | self.x1_list.append(float(self.true_targets_pos[0, 0])) 247 | self.y1_list.append(float(self.true_targets_pos[0, 1])) 248 | self.x2_list.append(float(self.true_targets_pos[1, 0])) 249 | self.y2_list.append(float(self.true_targets_pos[1, 1])) 250 | #self.x3_list.append(float(self.true_targets_pos[2, 0])) 251 | #self.y3_list.append(float(self.true_targets_pos[2, 1])) 252 | #self.x4_list.append(float(self.true_targets_pos[3, 0])) 253 | #self.y4_list.append(float(self.true_targets_pos[3, 1])) 254 | 255 | self.heatmap = torch.zeros(self.len_workspace, self.len_workspace) 256 | for index in range(0, self.num_targets): 257 | x = np.linspace(0, self.len_workspace, self.len_workspace) 258 | y = np.linspace(0, self.len_workspace, self.len_workspace) 259 | X, Y = np.meshgrid(x, y) 260 | pos = np.empty(X.shape + (2,)) 261 | pos[:, :, 0] = X; pos[:, :, 1] = Y 262 | rv = multivariate_normal(self.estimated_targets_mean[index], self.estimated_targets_var[index]) 263 | self.heatmap += rv.pdf(pos) 264 | if self.include_cnn: 265 | image = F.interpolate(self.heatmap.unsqueeze(0).unsqueeze(0), (256, 256), mode='bilinear', align_corners=True) 266 | image = image.float() 267 | true_obs = self.convnet(image).squeeze() 268 | else: 269 | true_obs = self.heatmap.flatten() 270 | 271 | self.state = torch.cat((self.sensors_pos[0], self.sensors_pos[1], torch.tensor(true_obs).float())) 272 | self.observation_space = spaces.Box(-np.inf, np.inf, shape=self.state.shape, dtype='float32') 273 | return self.state, self.sensors_pos, self.estimated_targets_mean, self.true_targets_radii, self.target_motion_omegas 274 | 275 | 276 | def set_env(self, sensors_pos, estimated_targets_mean, true_targets_radii, target_motion_omegas): 277 | self.x1_list = [] 278 | self.y1_list = [] 279 | self.x2_list = [] 280 | self.y2_list = [] 281 | self.x3_list = [] 282 | self.y3_list = [] 283 | self.x4_list = [] 284 | self.y4_list = [] 285 | self.time_step = 1 286 | self.true_targets_radii = true_targets_radii 287 | self.true_targets_pos = estimated_targets_mean 288 | self.initial_true_targets_pos = self.true_targets_pos.clone() 289 | self.estimated_targets_mean = self.true_targets_pos.clone() 290 | self.estimated_targets_var = torch.zeros(self.num_targets, 2, 2) 291 | for index in range(0, self.num_targets): 292 | self.estimated_targets_var[index] = torch.tensor([[1, 0], [0, 1]]) 293 | self.target_motion_omegas = target_motion_omegas 294 | 295 | self.robot_movement_x_1 = [] 296 | self.robot_movement_y_1 = [] 297 | self.robot_movement_x_2 = [] 298 | self.robot_movement_y_2 = [] 299 | self.robot_movement_x_3 = [] 300 | self.robot_movement_y_3 = [] 301 | self.robot_movement_x_4 = [] 302 | self.robot_movement_y_4 = [] 303 | self.sensors_pos = sensors_pos 304 | 305 | self.robot_movement_x_1.append(float(self.sensors_pos[0, 0])) 306 | self.robot_movement_y_1.append(float(self.sensors_pos[0, 1])) 307 | self.robot_movement_x_2.append(float(self.sensors_pos[1, 0])) 308 | self.robot_movement_y_2.append(float(self.sensors_pos[1, 1])) 309 | #self.robot_movement_x_3.append(float(self.sensors_pos[2, 0])) 310 | #self.robot_movement_y_3.append(float(self.sensors_pos[2, 1])) 311 | #self.robot_movement_x_4.append(float(self.sensors_pos[3, 0])) 312 | #self.robot_movement_y_4.append(float(self.sensors_pos[3, 1])) 313 | self.x1_list.append(float(self.true_targets_pos[0, 0])) 314 | self.y1_list.append(float(self.true_targets_pos[0, 1])) 315 | self.x2_list.append(float(self.true_targets_pos[1, 0])) 316 | self.y2_list.append(float(self.true_targets_pos[1, 1])) 317 | #self.x3_list.append(float(self.true_targets_pos[2, 0])) 318 | #self.y3_list.append(float(self.true_targets_pos[2, 1])) 319 | #self.x4_list.append(float(self.true_targets_pos[3, 0])) 320 | #self.y4_list.append(float(self.true_targets_pos[3, 1])) 321 | 322 | self.heatmap = torch.zeros(self.len_workspace, self.len_workspace) 323 | for index in range(0, self.num_targets): 324 | x = np.linspace(0, self.len_workspace, self.len_workspace) 325 | y = np.linspace(0, self.len_workspace, self.len_workspace) 326 | X, Y = np.meshgrid(x, y) 327 | pos = np.empty(X.shape + (2,)) 328 | pos[:, :, 0] = X; pos[:, :, 1] = Y 329 | rv = multivariate_normal(self.estimated_targets_mean[index], self.estimated_targets_var[index]) 330 | self.heatmap += rv.pdf(pos) 331 | if self.include_cnn: 332 | image = F.interpolate(self.heatmap.unsqueeze(0).unsqueeze(0), (256, 256), mode='bilinear', align_corners=True) 333 | image = image.float() 334 | true_obs = self.convnet(image).squeeze() 335 | else: 336 | true_obs = self.heatmap.flatten() 337 | 338 | self.state = torch.cat((self.sensors_pos[0], self.sensors_pos[1], torch.tensor(true_obs).float())) 339 | self.observation_space = spaces.Box(-np.inf, np.inf, shape=self.state.shape, dtype='float32') 340 | return self.state, self.sensors_pos, self.estimated_targets_mean, self.true_targets_radii, self.target_motion_omegas 341 | 342 | 343 | def close(self): 344 | """ 345 | Function to close the environment 346 | """ 347 | pass 348 | 349 | 350 | def render(self): 351 | """ 352 | Function for rendering the environment 353 | """ 354 | heatmap = torch.zeros(256, 256) 355 | for index in range(0, self.num_targets): 356 | x = np.linspace(0, self.len_workspace, 256) 357 | y = np.linspace(0, self.len_workspace, 256) 358 | X, Y = np.meshgrid(x, y) 359 | pos = np.empty(X.shape + (2,)) 360 | pos[:, :, 0] = X; pos[:, :, 1] = Y 361 | rv = multivariate_normal(self.estimated_targets_mean[index], self.estimated_targets_var[index]) 362 | heatmap += rv.pdf(pos) 363 | x = np.linspace(0, self.len_workspace, 256) 364 | y = np.linspace(0, self.len_workspace, 256) 365 | X, Y = np.meshgrid(x, y) 366 | plt.cla() 367 | plt.title("Time step = " + str(self.time_step)) 368 | plt.xlabel("x") 369 | plt.ylabel("y") 370 | plt.xlim(0, self.len_workspace) 371 | plt.ylim(0, self.len_workspace) 372 | plt.contourf(X, Y, heatmap, cmap=cm.inferno) 373 | plt.plot(self.x1_list, self.y1_list, 'b--') 374 | plt.plot(self.x1_list[len(self.x1_list)-1], self.y1_list[len(self.y1_list)-1], 'o', c='b', marker='*') 375 | plt.plot(self.x2_list, self.y2_list, 'b--') 376 | plt.plot(self.x2_list[len(self.x2_list)-1], self.y2_list[len(self.y2_list)-1], 'o', c='b', marker='*') 377 | #plt.plot(self.x3_list, self.y3_list, 'b--') 378 | #plt.plot(self.x3_list[len(self.x3_list)-1], self.y3_list[len(self.y3_list)-1], 'o', c='b', marker='*') 379 | #plt.plot(self.x4_list, self.y4_list, 'b--') 380 | #plt.plot(self.x4_list[len(self.x4_list)-1], self.y4_list[len(self.y4_list)-1], 'o', c='b', marker='*') 381 | if(len(self.robot_movement_x_1)<8): 382 | plt.plot(self.robot_movement_x_1, self.robot_movement_y_1, 'r--') 383 | else: 384 | plt.plot(self.robot_movement_x_1[-8:], self.robot_movement_y_1[-8:], 'r--') 385 | plt.scatter(self.robot_movement_x_1[len(self.robot_movement_x_1) - 1], self.robot_movement_y_1[len(self.robot_movement_y_1) - 1], color='r', marker='D') 386 | if(len(self.robot_movement_x_2)<8): 387 | plt.plot(self.robot_movement_x_2, self.robot_movement_y_2, 'r--') 388 | else: 389 | plt.plot(self.robot_movement_x_2[-8:], self.robot_movement_y_2[-8:], 'r--') 390 | plt.scatter(self.robot_movement_x_2[len(self.robot_movement_x_2) - 1], self.robot_movement_y_2[len(self.robot_movement_y_2) - 1], color='r', marker='D') 391 | #if(len(self.robot_movement_x_3) < 8): 392 | # plt.plot(self.robot_movement_x_3, self.robot_movement_y_3, 'r--') 393 | #else: 394 | # plt.plot(self.robot_movement_x_3[-8:], self.robot_movement_y_3[-8:], 'r--') 395 | #plt.scatter(self.robot_movement_x_3[len(self.robot_movement_x_3) - 1], self.robot_movement_y_3[len(self.robot_movement_y_3) - 1], color='r', marker='D') 396 | #if(len(self.robot_movement_x_4) < 8): 397 | # plt.plot(self.robot_movement_x_4, self.robot_movement_y_4, 'r--') 398 | #else: 399 | # plt.plot(self.robot_movement_x_4[-8:], self.robot_movement_y_4[-8:], 'r--') 400 | #plt.scatter(self.robot_movement_x_4[len(self.robot_movement_x_4) - 1], self.robot_movement_y_4[len(self.robot_movement_y_4) - 1], color='r', marker='D') 401 | plt.savefig("/home/arpitdec5/Desktop/robot_target_tracking/s2/" + str(self.time_step) + ".png") 402 | #plt.show() 403 | 404 | 405 | # reference: https://matplotlib.org/3.1.0/gallery/statistics/confidence_ellipse.html 406 | def get_correlated_dataset(self, n, dependency, mu, scale): 407 | latent = np.random.randn(n, 2) 408 | dependent = latent.dot(dependency) 409 | scaled = dependent * scale 410 | scaled_with_offset = scaled + mu 411 | return (scaled_with_offset[:, 0], scaled_with_offset[:, 1]) 412 | 413 | 414 | def compute_reward(self): 415 | """ 416 | Function for computing the reward 417 | """ 418 | val = 0.0 419 | for index in range(0, self.num_targets): 420 | val += np.log(np.linalg.det(self.estimated_targets_var[index])) 421 | return -val, False 422 | 423 | 424 | def _reset_sim(self, **kwargs): 425 | raise NotImplementedError() 426 | 427 | 428 | def _get_obs(self): 429 | """ 430 | Get observation function 431 | Returns the noisy relative measurement depending on the measurement model (bearing or range) 432 | """ 433 | targets_pos = self.true_targets_pos 434 | sensor_pos = self.sensors_pos 435 | 436 | if self.meas_model=='bearing': 437 | true_measurement = torch.zeros(self.num_sensors, self.num_targets) 438 | for index in range(0, self.num_sensors): 439 | true_measurement[index] = torch.atan2(targets_pos[:, 1]-sensor_pos[index, 1], targets_pos[:, 0]-sensor_pos[index, 0]) 440 | elif self.meas_model=='range': 441 | true_measurement = torch.zeros(self.num_sensors, self.num_targets) 442 | for index in range(0, self.num_sensors): 443 | true_measurement[index] = torch.norm(targets_pos-sensor_pos[index], p=2, dim=1) 444 | 445 | true_obs = true_measurement+(self.sigma_meas*torch.randn(sensor_pos.shape[0], targets_pos.shape[0])) 446 | return true_obs 447 | 448 | 449 | def get_estimated_obs(self): 450 | """ 451 | Get observation function 452 | Returns the noisy relative measurement depending on the measurement model (bearing or range) 453 | """ 454 | targets_pos = self.estimated_targets_mean 455 | sensor_pos = self.sensors_pos 456 | 457 | if self.meas_model == 'bearing': 458 | true_measurement = torch.zeros(self.num_sensors, self.num_targets) 459 | for index in range(0, self.num_sensors): 460 | true_measurement[index] = torch.atan2(targets_pos[:, 1]-sensor_pos[index, 1], targets_pos[:, 0]-sensor_pos[index, 0]) 461 | elif self.meas_model == 'range': 462 | for index in range(0, self.num_sensors): 463 | true_measurement[index] = torch.norm(targets_pos-sensor_pos[index], p=2, dim=1) 464 | return true_measurement 465 | 466 | 467 | def update_true_targets_pos(self): 468 | """ 469 | Function to update the true target positions when time step increases (assuming circular target motions) 470 | """ 471 | for index in range(0, self.num_targets): 472 | self.true_targets_pos[index] = torch.tensor([self.true_targets_radii[index]*np.cos((self.time_step-1)/float(self.target_motion_omegas[index]))+float(self.initial_true_targets_pos[index, 0])-self.true_targets_radii[index], self.true_targets_radii[index]*np.sin((self.time_step-1)/float(self.target_motion_omegas[index]))+float(self.initial_true_targets_pos[index, 1])]) 473 | 474 | 475 | def update_estimated_targets_pos(self): 476 | """ 477 | Function to update the estimated target positions when time step increases (using ekf) 478 | """ 479 | true_obs = self._get_obs() 480 | for index in range(0, self.num_targets): 481 | z_true = np.zeros((self.num_sensors, 1)) 482 | for sensor_index in range(0, self.num_sensors): 483 | z_true[sensor_index, 0] = true_obs[sensor_index, index] 484 | q_matrix = 0.2*np.eye(2) 485 | x_matrix = np.array([[float(self.estimated_targets_mean[index, 0])], [float(self.estimated_targets_mean[index, 1])]]) 486 | sigma_matrix = self.estimated_targets_var[index].numpy()+q_matrix 487 | 488 | z_pred = np.zeros((self.num_sensors, 1)) 489 | h_matrix = np.zeros((self.num_sensors, 2)) 490 | for sensor_index in range(0, self.num_sensors): 491 | z_pred[sensor_index, 0] = np.linalg.norm([[x_matrix[0, 0]-float(self.sensors_pos[sensor_index, 0])], [x_matrix[1, 0]-float(self.sensors_pos[sensor_index, 1])]], 2) 492 | h_matrix[sensor_index, 0] = (-1.0/z_pred[sensor_index])*(float(self.sensors_pos[sensor_index, 0])-x_matrix[0, 0]) 493 | h_matrix[sensor_index, 1] = (-1.0/z_pred[sensor_index])*(float(self.sensors_pos[sensor_index, 1])-x_matrix[1, 0]) 494 | 495 | res = (z_true-z_pred) 496 | r_matrix = 1.0*1.0*np.eye(self.num_sensors) 497 | s_matrix = np.matmul(np.matmul(h_matrix, sigma_matrix), h_matrix.T)+r_matrix 498 | k_matrix = np.matmul(np.matmul(sigma_matrix, h_matrix.T), np.linalg.inv(s_matrix)) 499 | 500 | x_matrix_tplus1 = x_matrix+(np.matmul(k_matrix, res)) 501 | sigma_matrix_tplus1 = np.matmul(np.matmul((np.eye(2)-np.matmul(k_matrix, h_matrix)), sigma_matrix), (np.eye(2)-np.matmul(k_matrix, h_matrix)).T)+np.matmul(np.matmul(k_matrix, r_matrix), k_matrix.T) 502 | target_xhat_tplus1 = x_matrix_tplus1[0, 0] 503 | target_yhat_tplus1 = x_matrix_tplus1[1, 0] 504 | 505 | self.estimated_targets_mean[index, 0] = target_xhat_tplus1 506 | self.estimated_targets_mean[index, 1] = target_yhat_tplus1 507 | self.estimated_targets_var[index] = torch.tensor(sigma_matrix_tplus1) 508 | 509 | 510 | def _get_true_target_position(self): 511 | """ 512 | Function to return the true target positions. 513 | """ 514 | return self.true_targets_pos 515 | 516 | 517 | def _get_estimated_target_position(self): 518 | """ 519 | Function to return the estimated target positions. 520 | """ 521 | return self.estimated_targets_mean 522 | 523 | 524 | def get_posterior_map(self): 525 | return self.posterior_map 526 | 527 | 528 | def _set_action(self, action, step_size): 529 | """ 530 | Applies the given action to the sensor. 531 | """ 532 | for index in range(0, self.num_sensors): 533 | curr_action = torch.tensor(action[index]).float() 534 | vector = step_size[index]*torch.tensor([torch.cos(curr_action), torch.sin(curr_action)]) 535 | sensor_pos = self.sensors_pos[index]+vector 536 | if(sensor_pos[0]>=self.len_workspace): 537 | sensor_pos[0] = self.sensors_pos[index, 0] 538 | if(sensor_pos[1]>=self.len_workspace): 539 | sensor_pos[1] = self.sensors_pos[index, 1] 540 | self.sensors_pos[index] = sensor_pos 541 | 542 | 543 | def seed(self, seed=None): 544 | """ 545 | Function that returns a random seed using OpenAI Gym seeding. 546 | """ 547 | _, seed = seeding.np_random(seed) 548 | return [seed] 549 | 550 | 551 | def _is_success(self, achieved_goal, desired_goal): 552 | raise NotImplementedError() 553 | 554 | 555 | def _sample_goal(self): 556 | return torch.rand(2) 557 | 558 | 559 | def _env_setup(self, initial_qpos): 560 | pass 561 | 562 | 563 | def _viewer_setup(self): 564 | pass 565 | 566 | 567 | def _render_callback(self): 568 | pass 569 | 570 | 571 | def _step_callback(self): 572 | pass 573 | -------------------------------------------------------------------------------- /code/eval.py: -------------------------------------------------------------------------------- 1 | # header files 2 | import gym 3 | import math 4 | import random 5 | import numpy as np 6 | import matplotlib 7 | import matplotlib.pyplot as plt 8 | from collections import namedtuple 9 | from itertools import count 10 | from PIL import Image 11 | from gym.envs.registration import register 12 | import logging 13 | import torch 14 | import torch.nn as nn 15 | import torch.optim as optim 16 | import torch.nn.functional as F 17 | import torchvision.transforms as T 18 | from td3 import * 19 | from replay_buffer import * 20 | from utils import * 21 | 22 | 23 | logger = logging.getLogger(__name__) 24 | register( 25 | id='RobotTargetTrackingEnv-v0', 26 | entry_point='envs:RobotTargetTrackingEnvInterface', 27 | ) 28 | 29 | # initialise environment 30 | device = torch.device("cuda" if torch.cuda.is_available() else "cpu") 31 | env_name = "RobotTargetTrackingEnv-v0" 32 | env_1 = gym.make(env_name) 33 | env_1.env_parametrization() 34 | env_1.reset() 35 | env_2 = gym.make(env_name) 36 | env_2.env_parametrization(include_cnn=False) 37 | env_2.reset() 38 | 39 | # create TD3 object and load optimal policy 40 | state_dim = env_1.observation_space.shape[0] 41 | action_dim = env_1.action_space.shape[0] 42 | max_action = float(env_1.action_space.high[0]) 43 | policy_1 = TD3(0.0005, state_dim, 4, max_action) 44 | policy_1.load_actor("/home/arpitdec5/Desktop/robot_target_tracking/", "sample_model_sensors_2_targets_4") 45 | state_dim = env_2.observation_space.shape[0] 46 | action_dim = env_2.action_space.shape[0] 47 | max_action = float(env_2.action_space.high[0]) 48 | policy_2 = TD3(0.0005, state_dim, 4, max_action) 49 | policy_2.load_actor("/home/arpitdec5/Desktop/robot_target_tracking/", "model_sensors_2_targets_4") 50 | 51 | # eval loop 52 | greedy_cov = [] 53 | rl1_cov = [] 54 | rl2_cov = [] 55 | step = [] 56 | ratio = [] 57 | for index in range(0, 500): 58 | # init environment 1 59 | state, sensors, targets, radii, omegas = env_1.reset() 60 | state_2, sensors_2, targets_2, radii_2, omegas_2 = env_2.set_env(sensors, targets, radii, omegas) 61 | step.append(index+1) 62 | print(index) 63 | 64 | ############################ Greedy Algo ####################################### 65 | ''' 66 | # init ekf variables 67 | mean_1 = np.asarray([float(targets[0, 0]), float(targets[0, 1])]) 68 | x_true_1, y_true_1 = float(targets[0, 0]), float(targets[0, 1]) 69 | var_1 = [[1, 0], [0, 1]] 70 | init_pos_1 = [float(targets[0, 0]) - float(radii[0]), float(targets[0, 1])] 71 | mean_2 = np.asarray([float(targets[1, 0]), float(targets[1, 1])]) 72 | x_true_2, y_true_2 = float(targets[1, 0]), float(targets[1, 1]) 73 | var_2 = [[1, 0], [0, 1]] 74 | init_pos_2 = [float(targets[1, 0]) - float(radii[1]), float(targets[1, 1])] 75 | mean_3 = np.asarray([float(targets[2, 0]), float(targets[2, 1])]) 76 | x_true_3, y_true_3 = float(targets[2, 0]), float(targets[2, 1]) 77 | var_3 = [[1, 0], [0, 1]] 78 | init_pos_3 = [float(targets[2, 0]) - float(radii[2]), float(targets[2, 1])] 79 | mean_4 = np.asarray([float(targets[3, 0]), float(targets[3, 1])]) 80 | x_true_4, y_true_4 = float(targets[3, 0]), float(targets[3, 1]) 81 | var_4 = [[1, 0], [0, 1]] 82 | init_pos_4 = [float(targets[3, 0]) - float(radii[3]), float(targets[3, 1])] 83 | robots_x = [] 84 | robots_y = [] 85 | robots_id = [] 86 | for index in range(0, len(sensors)): 87 | robots_x.append(float(sensors[index, 0])) 88 | robots_y.append(float(sensors[index, 1])) 89 | robots_id.append(index+1) 90 | map_height = 20 91 | map_width = 20 92 | action_radius = 1.0 93 | robot_movement_x = [robots_x[0]] 94 | robot_movement_y = [robots_y[0]] 95 | true_target_x_1 = [x_true_1] 96 | true_target_y_1 = [y_true_1] 97 | prev_target_x_1 = x_true_1 98 | prev_target_y_1 = y_true_1 99 | true_target_x_2 = [x_true_2] 100 | true_target_y_2 = [y_true_2] 101 | prev_target_x_2 = x_true_2 102 | prev_target_y_2 = y_true_2 103 | true_target_x_3 = [x_true_3] 104 | true_target_y_3 = [y_true_3] 105 | prev_target_x_3 = x_true_3 106 | prev_target_y_3 = y_true_3 107 | true_target_x_4 = [x_true_4] 108 | true_target_y_4 = [y_true_4] 109 | prev_target_x_4 = x_true_4 110 | prev_target_y_4 = y_true_4 111 | prev_robot_x = robots_x[0] 112 | prev_robot_y = robots_y[0] 113 | 114 | # estimate target position after each time step 115 | avg_val_greedy = 0.0 116 | i=0 117 | for t in range(2, 200): 118 | i += 1 119 | 120 | # update target position 121 | target_x_mean_1, target_y_mean_1, var_1, x_true_1, y_true_1 = extended_kalman_filter(mean_1[0], mean_1[1], var_1, robots_x, robots_y, robots_id, t, init_pos_1[0], init_pos_1[1], 1, float(omegas[0]), float(radii[0])) 122 | mean_1 = np.asarray([target_x_mean_1, target_y_mean_1]) 123 | target_x_mean_2, target_y_mean_2, var_2, x_true_2, y_true_2 = extended_kalman_filter(mean_2[0], mean_2[1], var_2, robots_x, robots_y, robots_id, t, init_pos_2[0], init_pos_2[1], 1, float(omegas[1]), float(radii[1])) 124 | mean_2 = np.asarray([target_x_mean_2, target_y_mean_2]) 125 | target_x_mean_3, target_y_mean_3, var_3, x_true_3, y_true_3 = extended_kalman_filter(mean_3[0], mean_3[1], var_3, robots_x, robots_y, robots_id, t, init_pos_3[0], init_pos_3[1], 1, float(omegas[2]), float(radii[2])) 126 | mean_3 = np.asarray([target_x_mean_3, target_y_mean_3]) 127 | target_x_mean_4, target_y_mean_4, var_4, x_true_4, y_true_4 = extended_kalman_filter(mean_4[0], mean_4[1], var_4, robots_x, robots_y, robots_id, t, init_pos_4[0], init_pos_4[1], 1, float(omegas[3]), float(radii[3])) 128 | mean_4 = np.asarray([target_x_mean_4, target_y_mean_4]) 129 | 130 | # update true target position 131 | true_target_x_1.append(x_true_1) 132 | true_target_y_1.append(y_true_1) 133 | true_target_x_2.append(x_true_2) 134 | true_target_y_2.append(y_true_2) 135 | true_target_x_3.append(x_true_3) 136 | true_target_y_3.append(y_true_3) 137 | true_target_x_4.append(x_true_4) 138 | true_target_y_4.append(y_true_4) 139 | 140 | # add robot position for rendering 141 | robot_movement_x.append(robots_x[0]) 142 | robot_movement_y.append(robots_y[0]) 143 | 144 | # plot map 145 | #render_ekf([target_x_mean_1, target_y_mean_1], [target_x_mean_2, target_y_mean_2], var_1, var_2, t, true_target_x_1, true_target_y_1, true_target_x_2, true_target_y_2, robot_movement_x, robot_movement_y) 146 | 147 | # update robot position 148 | next_robot_x, next_robot_y, val = update_robot_pos_ekf(robots_x[0], robots_y[0], [target_x_mean_1, target_x_mean_2, target_x_mean_3, target_x_mean_4], [target_y_mean_1, target_y_mean_2, target_y_mean_3, target_y_mean_4], [var_1, var_2, var_3, var_4], [prev_target_x_1, prev_target_x_2, prev_target_x_3, prev_target_x_4], [prev_target_y_1, prev_target_y_2, prev_target_y_3, prev_target_y_4], action_radius, map_height, map_width, t+1, prev_robot_x, prev_robot_y) 149 | avg_val_greedy += np.linalg.det(var_1) + np.linalg.det(var_2) + np.linalg.det(var_3) + np.linalg.det(var_4) 150 | prev_robot_x = robots_x[0] 151 | prev_robot_y = robots_y[0] 152 | robots_x[0] = next_robot_x 153 | robots_y[0] = next_robot_y 154 | prev_target_x_1 = target_x_mean_1 155 | prev_target_y_1 = target_y_mean_1 156 | prev_target_x_2 = target_x_mean_2 157 | prev_target_y_2 = target_y_mean_2 158 | prev_target_x_3 = target_x_mean_3 159 | prev_target_y_3 = target_y_mean_3 160 | prev_target_x_4 = target_x_mean_4 161 | prev_target_y_4 = target_y_mean_4 162 | avg_val_greedy = avg_val_greedy/i 163 | greedy_cov.append(avg_val_greedy) 164 | ''' 165 | ################################################################################# 166 | 167 | 168 | ############################ RL_1 Algo ####################################### 169 | average_cov_rl_1 = 0.0 170 | cov_rl = 0.0 171 | i = 0 172 | target_1 = [] 173 | target_2 = [] 174 | target_3 = [] 175 | target_4 = [] 176 | for iter in range(0, 100): 177 | i += 1 178 | action_1, step_size_1, action_2, step_size_2 = policy_1.select_action(state) 179 | next_state, reward, done, _, var = env_1.step([action_1, action_2], [step_size_1, step_size_2]) 180 | state = next_state 181 | #env_1.render() 182 | #env_1.close() 183 | 184 | #average_cov_rl_1 += np.linalg.det(var[0])+np.linalg.det(var[1])+np.linalg.det(var[2])+np.linalg.det(var[3]) 185 | average_cov_rl_1 += np.linalg.det(var[0])+np.linalg.det(var[1]) 186 | #average_cov_rl += np.linalg.det(var[0]) 187 | #cov_rl = np.linalg.det(var[0]) 188 | for index in range(0, len(var)): 189 | if(index==0): 190 | target_1.append(np.linalg.det(var[0])) 191 | elif(index==1): 192 | target_2.append(np.linalg.det(var[1])) 193 | elif(index==2): 194 | target_3.append(np.linalg.det(var[2])) 195 | else: 196 | target_4.append(np.linalg.det(var[3])) 197 | if(done): 198 | break 199 | average_cov_rl_1 = average_cov_rl_1/i 200 | rl1_cov.append(average_cov_rl_1) 201 | 202 | ############################ RL_2 Algo ####################################### 203 | average_cov_rl_2 = 0.0 204 | cov_rl = 0.0 205 | i = 0 206 | target_1 = [] 207 | target_2 = [] 208 | target_3 = [] 209 | target_4 = [] 210 | for iter in range(0, 100): 211 | i += 1 212 | action_1, step_size_1, action_2, step_size_2 = policy_2.select_action(state_2) 213 | next_state, reward, done, _, var = env_2.step([action_1, action_2], [step_size_1, step_size_2]) 214 | state_2 = next_state 215 | #env_2.render() 216 | #env_2.close() 217 | 218 | #average_cov_rl_2 += np.linalg.det(var[0])+np.linalg.det(var[1])+np.linalg.det(var[2])+np.linalg.det(var[3]) 219 | average_cov_rl_2 += np.linalg.det(var[0])+np.linalg.det(var[1]) 220 | #average_cov_rl += np.linalg.det(var[0]) 221 | #cov_rl = np.linalg.det(var[0]) 222 | for index in range(0, len(var)): 223 | if(index==0): 224 | target_1.append(np.linalg.det(var[0])) 225 | elif(index==1): 226 | target_2.append(np.linalg.det(var[1])) 227 | elif(index==2): 228 | target_3.append(np.linalg.det(var[2])) 229 | else: 230 | target_4.append(np.linalg.det(var[3])) 231 | if(done): 232 | break 233 | average_cov_rl_2 = average_cov_rl_2/i 234 | rl2_cov.append(average_cov_rl_2) 235 | ratio.append(average_cov_rl_1/average_cov_rl_2) 236 | 237 | 238 | # plot curve 239 | plt.cla() 240 | plt.title("Plot for scenario: sensors=2 and targets=4") 241 | plt.xlabel("Data(Ratio of Avg. Determinant of covariance matrix(RL_1/RL_2))") 242 | plt.ylabel("Bin Count") 243 | plt.hist(ratio, density=False, bins=30) 244 | plt.savefig("/home/arpitdec5/Desktop/robot_target_tracking/ratio_episode_curve_sensors_2_targets_4.png") 245 | plt.cla() 246 | plt.title("Plot for scenario: sensors=2 and targets=4") 247 | plt.xlabel("Episodes") 248 | plt.ylabel("Avg. Determinant of covariance matrix") 249 | plt.plot(step, rl2_cov, label='RL_2 Algo') 250 | plt.plot(step, rl1_cov, label='RL_1 Algo') 251 | plt.legend() 252 | plt.savefig("/home/arpitdec5/Desktop/robot_target_tracking/det_episode_curve_sensors_2_targets_4.png") 253 | -------------------------------------------------------------------------------- /code/replay_buffer.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import torch 3 | 4 | # References: 5 | # 1. https://github.com/ksengin/active-target-localization/blob/master/target_localization/util/replay_buffer.py 6 | 7 | 8 | class ReplayBuffer(object): 9 | def __init__(self, state_dim, action_dim, max_size=100000): 10 | self.buffer = [] 11 | self.max_size = int(max_size) 12 | self.state_dim = state_dim 13 | self.action_dim = action_dim 14 | self.size = 0 15 | 16 | def add(self, transition): 17 | self.size +=1 18 | # transiton is tuple of (state, action, reward, next_state, done) 19 | self.buffer.append(transition) 20 | 21 | def sample(self, batch_size): 22 | if self.size > self.max_size: 23 | del self.buffer[:self.size//5] 24 | self.size = len(self.buffer) 25 | 26 | indices = np.random.randint(0, len(self.buffer), size=batch_size) 27 | 28 | states = torch.zeros((batch_size, self.state_dim)) 29 | actions = torch.zeros((batch_size, self.action_dim)) 30 | next_states = torch.zeros((batch_size, self.state_dim)) 31 | rewards = torch.zeros((batch_size, 1)) 32 | dones = torch.zeros((batch_size, 1)) 33 | 34 | for t, idx in enumerate(indices): 35 | s, a, r, s_, d = self.buffer[idx] 36 | states[t] = s 37 | actions[t] = a 38 | rewards[t] = r 39 | next_states[t] = s_ 40 | dones[t] = torch.tensor(d) 41 | 42 | return states, actions, rewards, next_states, dones 43 | -------------------------------------------------------------------------------- /code/td3.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import torch.nn.functional as F 4 | import torch.optim as optim 5 | from itertools import chain 6 | 7 | # Implementation of Twin Delayed Deep Deterministic Policy Gradients (TD3) 8 | # Paper: https://arxiv.org/abs/1802.09477 9 | # References: 10 | # 1. https://github.com/sfujim/TD3/blob/master/TD3.py 11 | # 2. https://github.com/nikhilbarhate99/TD3-PyTorch-BipedalWalker-v2/blob/master/TD3.py 12 | 13 | 14 | class Actor(nn.Module): 15 | def __init__(self, state_dim, action_dim, max_action): 16 | super(Actor, self).__init__() 17 | 18 | self.l1 = nn.Linear(state_dim, 1024) 19 | self.l2 = nn.Linear(1024, 1024) 20 | self.l3 = nn.Linear(1024, action_dim) 21 | 22 | self.max_action = max_action 23 | 24 | def forward(self, state): 25 | a = F.relu(self.l1(state)) 26 | a = F.relu(self.l2(a)) 27 | a = self.l3(a) 28 | a[:, 0] = torch.tanh(a[:, 0]) * self.max_action 29 | a[:, 1] = torch.sigmoid(a[:, 1]) 30 | a[:, 2] = torch.tanh(a[:, 2]) * self.max_action 31 | a[:, 3] = torch.sigmoid(a[:, 3]) 32 | return a 33 | 34 | class Critic(nn.Module): 35 | def __init__(self, state_dim, action_dim): 36 | super(Critic, self).__init__() 37 | 38 | self.l1 = nn.Linear(state_dim + action_dim, 1024) 39 | self.l2 = nn.Linear(1024, 1024) 40 | self.l3 = nn.Linear(1024, action_dim) 41 | 42 | def forward(self, state, action): 43 | q = F.relu(self.l1(torch.cat([state, action], 1))) 44 | q = F.relu(self.l2(q)) 45 | q = self.l3(q) 46 | return q 47 | 48 | class TD3(object): 49 | def __init__(self, lr, state_dim, action_dim, max_action, joint_critic_optim: bool = False): 50 | 51 | self.device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu") 52 | 53 | self.actor = Actor(state_dim, action_dim, max_action).to(self.device) 54 | self.actor_target = Actor(state_dim, action_dim, max_action).to(self.device) 55 | self.actor_target.load_state_dict(self.actor.state_dict()) 56 | self.actor_optimizer = optim.Adam(self.actor.parameters(), lr=lr) 57 | 58 | self.critic_1 = Critic(state_dim, action_dim).to(self.device) 59 | self.critic_1_target = Critic(state_dim, action_dim).to(self.device) 60 | self.critic_1_target.load_state_dict(self.critic_1.state_dict()) 61 | self.critic_1_optimizer = optim.Adam(self.critic_1.parameters(), lr=lr) 62 | 63 | self.critic_2 = Critic(state_dim, action_dim).to(self.device) 64 | self.critic_2_target = Critic(state_dim, action_dim).to(self.device) 65 | self.critic_2_target.load_state_dict(self.critic_2.state_dict()) 66 | self.critic_2_optimizer = optim.Adam(self.critic_2.parameters(), lr=lr) 67 | 68 | self.joint_critic_optim = joint_critic_optim 69 | if joint_critic_optim: 70 | self.critic_optim = optim.Adam(chain(self.critic_1.parameters(), self.critic_2.parameters()), lr=lr) 71 | 72 | self.max_action = max_action 73 | self.critic_loss = F.mse_loss 74 | 75 | def select_action(self, state): 76 | state = state.reshape(1, -1).to(self.device) 77 | return self.actor(state).cpu().data.flatten() 78 | 79 | def update(self, replay_buffer, n_iter, batch_size, gamma, tau, policy_noise, policy_delay): 80 | 81 | for idx in range(n_iter): 82 | # Sample a batch of transitions from the replay buffer 83 | state, action_, reward, next_state, done = replay_buffer.sample(batch_size) 84 | state = state.to(self.device) 85 | action = action_.to(self.device) 86 | reward = reward.to(self.device) 87 | next_state = next_state.to(self.device) 88 | done = done.to(self.device) 89 | 90 | # Select next action according to target policy 91 | noise = torch.normal(0, policy_noise, action.shape).to(self.device) 92 | next_action = (self.actor_target(next_state) + noise) 93 | next_action[:, 0] = next_action[:, 0].clamp(-self.max_action, self.max_action) 94 | next_action[:, 2] = next_action[:, 2].clamp(-self.max_action, self.max_action) 95 | 96 | # Compute target Q-value 97 | target_Q1 = self.critic_1_target(next_state, next_action) 98 | target_Q2 = self.critic_2_target(next_state, next_action) 99 | target_Q = torch.min(target_Q1, target_Q2) 100 | target_Q = reward + ((1-done) * gamma * target_Q).detach() 101 | 102 | # Optimize the critic networks 103 | if not self.joint_critic_optim: 104 | current_Q1 = self.critic_1(state, action) 105 | loss_Q1 = self.critic_loss(current_Q1, target_Q) 106 | self.critic_1_optimizer.zero_grad() 107 | loss_Q1.backward() 108 | self.critic_1_optimizer.step() 109 | 110 | current_Q2 = self.critic_2(state, action) 111 | loss_Q2 = self.critic_loss(current_Q2, target_Q) 112 | self.critic_2_optimizer.zero_grad() 113 | loss_Q2.backward() 114 | self.critic_2_optimizer.step() 115 | 116 | else: # optimization as in the original implementation 117 | current_Q1 = self.critic_1(state, action) 118 | current_Q2 = self.critic_2(state, action) 119 | loss_Q = self.critic_loss(current_Q1, target_Q) + self.critic_loss(current_Q2, target_Q) 120 | self.critic_optim.zero_grad() 121 | loss_Q.backward() 122 | self.critic_optim.step() 123 | 124 | # Delayed policy updates 125 | if idx % policy_delay == 0: 126 | # Compute actor loss 127 | actor_loss = -self.critic_1(state, self.actor(state)).mean() 128 | 129 | # Optimize the actor 130 | self.actor_optimizer.zero_grad() 131 | actor_loss.backward() 132 | self.actor_optimizer.step() 133 | 134 | # Update the frozen target models 135 | for param, target_param in zip(self.actor.parameters(), self.actor_target.parameters()): 136 | target_param.data.copy_(tau * target_param.data + (1-tau) * param.data) 137 | 138 | for param, target_param in zip(self.critic_1.parameters(), self.critic_1_target.parameters()): 139 | target_param.data.copy_(tau * target_param.data + (1-tau) * param.data) 140 | 141 | for param, target_param in zip(self.critic_2.parameters(), self.critic_2_target.parameters()): 142 | target_param.data.copy_(tau * target_param.data + (1-tau) * param.data) 143 | 144 | 145 | def save(self, directory, name): 146 | torch.save(self.actor.state_dict(), f'{directory}/{name}_actor.pth') 147 | torch.save(self.actor_target.state_dict(), f'{directory}/{name}_actor_target.pth') 148 | 149 | torch.save(self.critic_1.state_dict(), f'{directory}/{name}_crtic_1.pth') 150 | torch.save(self.critic_1_target.state_dict(), f'{directory}/{name}_critic_1_target.pth') 151 | 152 | torch.save(self.critic_2.state_dict(), f'{directory}/{name}_crtic_2.pth') 153 | torch.save(self.critic_2_target.state_dict(), f'{directory}/{name}_critic_2_target.pth') 154 | 155 | def load(self, directory, name): 156 | self.load_actor(directory, name) 157 | 158 | self.critic_1.load_state_dict(torch.load(f'{directory}/{name}_crtic_1.pth')) 159 | self.critic_1_target.load_state_dict(torch.load(f'{directory}/{name}_critic_1_target.pth')) 160 | 161 | self.critic_2.load_state_dict(torch.load(f'{directory}/{name}_crtic_2.pth')) 162 | self.critic_2_target.load_state_dict(torch.load(f'{directory}/{name}_critic_2_target.pth')) 163 | 164 | 165 | def load_actor(self, directory, name): 166 | self.actor.load_state_dict(torch.load(f'{directory}/{name}_actor.pth')) 167 | self.actor_target.load_state_dict(torch.load(f'{directory}/{name}_actor_target.pth')) 168 | -------------------------------------------------------------------------------- /code/train.py: -------------------------------------------------------------------------------- 1 | # header files 2 | import gym 3 | import math 4 | import random 5 | import numpy as np 6 | import matplotlib 7 | import matplotlib.pyplot as plt 8 | from collections import namedtuple 9 | from itertools import count 10 | from PIL import Image 11 | from gym.envs.registration import register 12 | import logging 13 | import torch 14 | import torch.nn as nn 15 | import torch.optim as optim 16 | import torch.nn.functional as F 17 | import torchvision.transforms as T 18 | from td3 import * 19 | from replay_buffer import * 20 | 21 | 22 | logger = logging.getLogger(__name__) 23 | register( 24 | id='RobotTargetTrackingEnv-v0', 25 | entry_point='envs:RobotTargetTrackingEnvInterface', 26 | ) 27 | 28 | # initialise environment 29 | device = torch.device("cuda" if torch.cuda.is_available() else "cpu") 30 | env_name = "RobotTargetTrackingEnv-v0" 31 | env = gym.make(env_name) 32 | env.env_parametrization() 33 | env.reset() 34 | 35 | # constants 36 | lr = 0.0005 37 | epochs = 1000 38 | iters = 100 39 | 40 | # create TD3 object 41 | state_dim = env.observation_space.shape[0] 42 | action_dim = env.action_space.shape[0] 43 | max_action = float(env.action_space.high[0]) 44 | policy = TD3(lr, state_dim, 4, max_action) 45 | replay_buffer = ReplayBuffer(state_dim=state_dim, action_dim=4) 46 | mean_reward, ep_reward = 0, 0 47 | 48 | # training 49 | e = [] 50 | r = [] 51 | m_e = [] 52 | m_r = [] 53 | g_e = [] 54 | g_r = [] 55 | mean_reward = 0 56 | for epoch in range(0, epochs): 57 | state, _, _, _, _ = env.reset() 58 | for iter in range(0, iters): 59 | action_1, step_size_1, action_2, step_size_2 = policy.select_action(state) + torch.normal(0, 0.1, size=env.action_space.shape) 60 | action_1 = action_1.clamp(env.action_space.low.item(), env.action_space.high.item()) 61 | action_2 = action_2.clamp(env.action_space.low.item(), env.action_space.high.item()) 62 | 63 | next_state, reward, done, _, _ = env.step([action_1, action_2], [step_size_1, step_size_2]) 64 | replay_buffer.add((state, torch.tensor([action_1, step_size_1, action_2, step_size_2]), reward, next_state, np.float(done))) 65 | state = next_state 66 | 67 | mean_reward += reward 68 | ep_reward += reward 69 | if done: 70 | break 71 | 72 | # update policy 73 | policy.update(replay_buffer, iter, 100, 0.99, 0.99, 0.2, 2) 74 | 75 | # save actor-critic models 76 | if(epoch>100 and epoch%10==0): 77 | policy.save("/home/arpitdec5/Desktop/robot_target_tracking/", "sample_1_model_sensors_2_targets_2") 78 | 79 | # print reward 80 | print() 81 | print("Epoch: " + str(epoch)) 82 | print("Reward: " + str(ep_reward)) 83 | print() 84 | e.append(epoch) 85 | r.append(ep_reward) 86 | ep_reward = 0 87 | 88 | # mean reward 89 | if(epoch%50==0 and epoch>0): 90 | m_e.append(epoch) 91 | m_r.append(mean_reward/50.0) 92 | mean_reward = 0 93 | 94 | 95 | # plot epoch vs reward curve 96 | plt.xlabel("Episodes") 97 | plt.ylabel("Reward") 98 | plt.ylim(0, 1000) 99 | plt.plot(e, r, c='blue', label='Cumulative Reward') 100 | plt.plot(m_e, m_r, c='orange', label='Mean Reward') 101 | #plt.plot(g_e, g_r, c='red', label='Greedy Algorithm') 102 | plt.legend() 103 | plt.savefig("/home/arpitdec5/Desktop/robot_target_tracking/sample_1_reward_sensors_2_targets_2.png") 104 | -------------------------------------------------------------------------------- /code/utils.py: -------------------------------------------------------------------------------- 1 | # header files needed 2 | import numpy as np 3 | import matplotlib.pyplot as plt 4 | import matplotlib.cm as cm 5 | from scipy.stats import multivariate_normal 6 | from matplotlib.patches import Ellipse 7 | import matplotlib.transforms as transforms 8 | from math import pi, cos, sin 9 | from random import random 10 | import math 11 | import torch 12 | 13 | 14 | def render_ekf(estimated_targets_mean_1, estimated_targets_mean_2, estimated_targets_var_1, estimated_targets_var_2, time_step, x_list_1, y_list_1, x_list_2, y_list_2, robot_movement_x, robot_movement_y): 15 | """ 16 | Function for rendering the environment 17 | """ 18 | heatmap = torch.zeros(256, 256) 19 | for index in range(0, 1): 20 | x = np.linspace(0, 20, 256) 21 | y = np.linspace(0, 20, 256) 22 | X, Y = np.meshgrid(x, y) 23 | pos = np.empty(X.shape + (2,)) 24 | pos[:, :, 0] = X; pos[:, :, 1] = Y 25 | rv_1 = multivariate_normal(estimated_targets_mean_1, estimated_targets_var_1) 26 | rv_2 = multivariate_normal(estimated_targets_mean_2, estimated_targets_var_2) 27 | heatmap += rv_1.pdf(pos) 28 | heatmap += rv_2.pdf(pos) 29 | x = np.linspace(0, 20, 256) 30 | y = np.linspace(0, 20, 256) 31 | X, Y = np.meshgrid(x, y) 32 | plt.cla() 33 | plt.title("Time step = " + str(time_step)) 34 | plt.xlabel("x") 35 | plt.ylabel("y") 36 | plt.xlim(0, 20) 37 | plt.ylim(0, 20) 38 | plt.contourf(X, Y, heatmap, cmap=cm.inferno) 39 | plt.plot(x_list_1, y_list_1, 'b--') 40 | plt.plot(x_list_2, y_list_2, 'b--') 41 | plt.plot(x_list_1[len(x_list_1) - 1], y_list_1[len(y_list_1) - 1], 'o', c='b', marker='*') 42 | plt.plot(x_list_2[len(x_list_2) - 1], y_list_2[len(y_list_2) - 1], 'o', c='b', marker='*') 43 | #plt.plot(self.x2_list, self.y2_list, 'b--') 44 | #plt.plot(self.x2_list[len(self.x2_list) - 1], self.y2_list[len(self.y2_list) - 1], 'o', c='b', marker='*') 45 | #plt.plot(self.x3_list, self.y3_list, 'b--') 46 | #plt.plot(self.x3_list[len(self.x3_list) - 1], self.y3_list[len(self.y3_list) - 1], 'o', c='b', marker='*') 47 | #plt.plot(self.x4_list, self.y4_list, 'b--') 48 | #plt.plot(self.x4_list[len(self.x4_list) - 1], self.y4_list[len(self.y4_list) - 1], 'o', c='b', marker='*') 49 | if(len(robot_movement_x) < 8): 50 | plt.plot(robot_movement_x, robot_movement_y, 'r--') 51 | else: 52 | plt.plot(robot_movement_x[-8:], robot_movement_y[-8:], 'r--') 53 | plt.scatter(robot_movement_x[len(robot_movement_x) - 1], robot_movement_y[len(robot_movement_y) - 1], color='r', marker='D') 54 | plt.savefig("/home/arpitdec5/Desktop/robot_target_tracking/s1/" + str(time_step) + ".png") 55 | #plt.show() 56 | 57 | 58 | # plot the heatmap 59 | def render(t, x_mesh, y_mesh, belief_map, x_target, y_target, robot_movement_x, robot_movement_y): 60 | """ 61 | Inputs: 62 | t: time step 63 | x_mesh: the x-coordinates 64 | y_mesh: the y-coordinates 65 | belief_map: the map containing the probabilities 66 | x_target: the target x-coordinate 67 | y_target: the target y-coordinate 68 | robot_movement_x: the list of robot paths x-coordinates 69 | robot_movement_y: the list of robot paths y-coordinates 70 | """ 71 | plt.cla() 72 | plt.title("Stationary robot(Target moving fast)") 73 | plt.xlabel("x") 74 | plt.ylabel("y") 75 | plt.contourf(x_mesh, y_mesh, belief_map, cmap=cm.inferno) 76 | plt.plot(x_target, y_target, 'b--', label='true target motion') 77 | plt.plot(x_target[len(x_target) - 1], y_target[len(y_target) - 1], 'o', c='b', marker='*') 78 | if(len(robot_movement_x) < 8): 79 | plt.plot(robot_movement_x, robot_movement_y, 'r--') 80 | else: 81 | plt.plot(robot_movement_x[-8:], robot_movement_y[-8:], 'r--') 82 | plt.scatter(robot_movement_x[len(robot_movement_x) - 1], robot_movement_y[len(robot_movement_y) - 1], color='r', marker='D', label='robot') 83 | plt.savefig("/home/arpitdec5/Desktop/robot_target_tracking/s1/" + str(t) + ".png") 84 | #plt.show() 85 | 86 | 87 | # compute bayesian histogram for 'm' targets and given robot position 88 | def compute_bayesian_histogram(targets_x_true, targets_y_true, robot_x, robot_y, belief_map_height, belief_map_width, stepsize_map, sigma_bayesian_hist=1.0): 89 | """ 90 | Inputs: 91 | targets_x_true: the true position of target x-coordinate 92 | targets_y_true: the true position of target y-coordinate 93 | robot_x: the position of robot x-coordinate 94 | robot_y: the position of robot y-coordinate 95 | belief_map_height: the environment dimensions, height 96 | belief_map_width: the environment dimensions, width 97 | stepsize_map: equal to 0.1 98 | sigma_bayesian_hist: equal to 1 99 | 100 | Outputs: 101 | bayesian_hist: the belief map of dimensions (belief_map_height, belief_map_width) containing probabilities 102 | """ 103 | noise = sigma_bayesian_hist * np.random.randn(1, 1) 104 | bayesian_hist = np.zeros((belief_map_height, belief_map_width)) 105 | for index in range(0, len(targets_x_true)): 106 | estimated = np.sqrt((targets_x_true[index] - robot_x)**2 + (targets_y_true[index] - robot_y)**2) + noise[0][0] 107 | for index1 in range(0, belief_map_height): 108 | for index2 in range(0, belief_map_width): 109 | true = np.sqrt(((index1*stepsize_map) - robot_x)**2 + ((index2*stepsize_map) - robot_y)**2) 110 | bayesian_hist[index1, index2] += 1.0 / (np.sqrt(2 * np.pi * sigma_bayesian_hist**2)) * np.exp(-0.5 / sigma_bayesian_hist**2 * (np.abs(true - estimated)**2)) 111 | return bayesian_hist 112 | 113 | 114 | # get target estimate 115 | def get_target_position(t, x_true, y_true): 116 | """ 117 | Inputs: 118 | t: time step 119 | x_true: the true position of target x-coordinate 120 | y_true: the true position of target y-coordinate 121 | 122 | Outputs: 123 | (x_true, y_true): the target position at next time step 124 | """ 125 | omega = 33 126 | x_true = 2*np.cos((t-1) / omega) + 10 127 | y_true = 2*np.sin((t-1) / omega) + 12 128 | return (x_true, y_true) 129 | 130 | 131 | # reference: https://matplotlib.org/3.1.0/gallery/statistics/confidence_ellipse.html 132 | def confidence_ellipse(x, y, ax, n_std=3.0, facecolor='none', **kwargs): 133 | """ 134 | Inputs: 135 | x: the x-coordinate datapoints 136 | y: the y-coordinate datapoints 137 | """ 138 | if x.size != y.size: 139 | raise ValueError("x and y must be the same size") 140 | 141 | cov = np.cov(x, y) 142 | pearson = cov[0, 1]/np.sqrt(cov[0, 0] * cov[1, 1]) 143 | # Using a special case to obtain the eigenvalues of this 144 | # two-dimensionl dataset. 145 | ell_radius_x = np.sqrt(1 + pearson) 146 | ell_radius_y = np.sqrt(1 - pearson) 147 | ellipse = Ellipse((0, 0), 148 | width=ell_radius_x * 2, 149 | height=ell_radius_y * 2, 150 | facecolor=facecolor, 151 | **kwargs) 152 | 153 | # Calculating the stdandard deviation of x from 154 | # the squareroot of the variance and multiplying 155 | # with the given number of standard deviations. 156 | scale_x = np.sqrt(cov[0, 0]) * n_std 157 | mean_x = np.mean(x) 158 | 159 | # calculating the stdandard deviation of y ... 160 | scale_y = np.sqrt(cov[1, 1]) * n_std 161 | mean_y = np.mean(y) 162 | 163 | transf = transforms.Affine2D() \ 164 | .rotate_deg(45) \ 165 | .scale(scale_x, scale_y) \ 166 | .translate(mean_x, mean_y) 167 | 168 | ellipse.set_transform(transf + ax.transData) 169 | return ax.add_patch(ellipse) 170 | 171 | 172 | # reference: https://matplotlib.org/3.1.0/gallery/statistics/confidence_ellipse.html 173 | def get_correlated_dataset(n, dependency, mu, scale): 174 | latent = np.random.randn(n, 2) 175 | dependent = latent.dot(dependency) 176 | scaled = dependent * scale 177 | scaled_with_offset = scaled + mu 178 | # return x and y of the new, correlated dataset 179 | return scaled_with_offset[:, 0], scaled_with_offset[:, 1] 180 | 181 | 182 | # code for the filter 183 | def extended_kalman_filter(target_xhat_t, target_yhat_t, target_sigma_t, robots_x, robots_y, robots_id, t, x, y, flag, omega, radii): 184 | """ 185 | Inputs: 186 | target_xhat_t: the estimated target position x-coordinate 187 | target_yhat_t: the estimated target position y-coordinate 188 | robots_x: the position of robots x-coordinate 189 | robots_y: the position of robots y-coordinate 190 | robots_id: the ids of robots 191 | t: the time step 192 | 193 | Outputs: 194 | (target_xhat_tplus1, target_yhat_tplus1, sigma_matrix_tplus1, x_true, y_true): the predicted target position 195 | """ 196 | # get z_true using true target motion 197 | sigma_z = 1.0 198 | x_true = radii*np.cos((t-1)/omega)+x 199 | if flag: 200 | y_true = radii*np.sin((t-1)/omega)+y 201 | else: 202 | y_true = -radii*np.sin((t-1) / omega)+y 203 | noise = sigma_z*np.random.randn(1000, 1) 204 | 205 | z_true = np.zeros((len(robots_x), 1)) 206 | for index in range(0, len(robots_x)): 207 | z_true[index][0] = np.linalg.norm([[robots_x[index] - x_true], [robots_y[index] - y_true]], 2) + noise[robots_id[index]] 208 | 209 | # filter code 210 | q_matrix = 0.2 * np.eye(2) 211 | x_matrix = np.array([[target_xhat_t], [target_yhat_t]]) 212 | sigma_matrix = target_sigma_t + q_matrix 213 | 214 | z_pred = np.zeros((len(robots_x), 1)) 215 | h_matrix = np.zeros((len(robots_x), 2)) 216 | for index in range(0, len(robots_x)): 217 | z_pred[index][0] = np.linalg.norm([[x_matrix[0][0] - robots_x[index]], [x_matrix[1][0] - robots_y[index]]], 2) 218 | h_matrix[index][0] = (-1.0 / z_pred[index]) * (robots_x[index] - x_matrix[0][0]) 219 | h_matrix[index][1] = (-1.0 / z_pred[index]) * (robots_y[index] - x_matrix[1][0]) 220 | 221 | res = (z_true - z_pred) 222 | r_matrix = sigma_z * sigma_z * np.eye(len(robots_x)) 223 | s_matrix = np.matmul(np.matmul(h_matrix, sigma_matrix), h_matrix.T) + r_matrix 224 | k_matrix = np.matmul(np.matmul(sigma_matrix, h_matrix.T), np.linalg.inv(s_matrix)) 225 | 226 | x_matrix_tplus1 = x_matrix + (np.matmul(k_matrix, res)) 227 | sigma_matrix_tplus1 = np.matmul(np.matmul((np.eye(2) - np.matmul(k_matrix, h_matrix)), sigma_matrix), (np.eye(2) - np.matmul(k_matrix, h_matrix)).T) + np.matmul(np.matmul(k_matrix, r_matrix), k_matrix.T) 228 | target_xhat_tplus1 = x_matrix_tplus1[0][0] 229 | target_yhat_tplus1 = x_matrix_tplus1[1][0] 230 | return (target_xhat_tplus1, target_yhat_tplus1, sigma_matrix_tplus1, x_true, y_true) 231 | 232 | 233 | # plot gaussian 234 | def plot_gaussian(gauss): 235 | x, y = np.mgrid[0:25:100j, 0:25:100j] 236 | z = np.dstack((x, y)) 237 | plt.contourf(x, y, gauss.pdf(z)) 238 | #plt.show() 239 | 240 | 241 | # save gaussian 242 | def save_gaussian(gauss, path): 243 | x, y = np.mgrid[0:25:100j, 0:25:100j] 244 | z = np.dstack((x, y)) 245 | plt.contourf(x, y, gauss.pdf(z)) 246 | plt.savefig(path) 247 | 248 | 249 | # plot confidence ellipse 250 | def plot_ellipse(x, y, mean, x_list, y_list, target_x_mean, target_y_mean, path, robot_x, robot_y, robot_movement_x, robot_movement_y): 251 | fig, ax_nstd = plt.subplots(figsize=(6, 6)) 252 | ax_nstd.axvline(c='grey', lw=1) 253 | ax_nstd.axhline(c='grey', lw=1) 254 | confidence_ellipse(x, y, ax_nstd, n_std=1, edgecolor='firebrick') 255 | ax_nstd.scatter(mean[0], mean[1], c='b', s=1) 256 | ax_nstd.legend() 257 | plt.title("Greedy algorithm using EKF(Target moving slow)") 258 | plt.xlabel("x") 259 | plt.ylabel("y") 260 | plt.xlim(0, 20) 261 | plt.ylim(0, 20) 262 | plt.plot(x_list, y_list, 'b--') 263 | plt.scatter(x_list[len(x_list)-1], y_list[len(y_list)-1], color='b', marker='*') 264 | plt.scatter([target_x_mean], [target_y_mean], color='b', marker="s") 265 | if(len(robot_movement_x) < 8): 266 | plt.plot(robot_movement_x, robot_movement_y, 'r--') 267 | else: 268 | plt.plot(robot_movement_x[-8:], robot_movement_y[-8:], 'r--') 269 | plt.scatter(robot_x, robot_y, color='r', marker='D') 270 | plt.plot([robot_x, target_x_mean], [robot_y, target_y_mean], color='r') 271 | plt.savefig(path) 272 | plt.cla() 273 | plt.close() 274 | 275 | 276 | def points_in_circle_np(radius, x0=0, y0=0): 277 | """ 278 | Inputs: 279 | radius: the radius of the circular region around the current robot position 280 | x0: the x-coordinate of the robot position 281 | y0: the y-coordinate of the robot position 282 | 283 | Outputs: 284 | points: the action set to be used for deciding the robot trajectory 285 | """ 286 | thetas = [0.0, 15.0, 30.0, 45.0, 60.0, 75.0, 90.0, 105.0, 120.0, 135.0, 150.0, 165.0, 180.0, 195.0, 210.0, 225.0, 240.0, 255.0, 270.0, 285.0, 300.0, 315.0, 330.0] 287 | points = [] 288 | for theta in thetas: 289 | theta = (theta * pi) / 180.0 290 | points.append((float(x0 + cos(theta) * radius), float(y0 + sin(theta) * radius))) 291 | return points 292 | 293 | 294 | # choose optimal action 295 | def update_robot_pos_ekf(robot_x, robot_y, target_x, target_y, var, prev_target_x, prev_target_y, radius, map_height, map_width, t, prev_robot_x, prev_robot_y): 296 | """ 297 | Inputs: 298 | robot_x: the position of robot x-coordinate 299 | robot_y: the position of robot y-coordinate 300 | target_x: the position of target x-coordinate 301 | target_y: the position of target y-coordinate 302 | var: the uncertainty in target position 303 | prev_target_x: the previous position of target x-coordinate 304 | prev_target_y: the previous position of target y-coordinate 305 | radius: the radius of the circular region around the current robot position 306 | map_height: the environment dimensions, height 307 | map_width: the environment dimensions, width 308 | t: the time step 309 | 310 | Outputs: 311 | (best_action[0], best_action[1]): updated robot position 312 | """ 313 | action_set = points_in_circle_np(radius, robot_x, robot_y) 314 | alpha_opt = -1000000 315 | best_action = (robot_x, robot_y) 316 | for action in action_set: 317 | curr_robot_x = action[0] 318 | curr_robot_y = action[1] 319 | val = 0 320 | for index in range(0, len(target_x)): 321 | if(curr_robot_x >= 1 and curr_robot_x <= (map_height-1) and curr_robot_y >= 1 and curr_robot_y <= (map_width-1)): 322 | t_v_i = ((robot_x - target_x[index])*(robot_x - target_x[index])) + ((robot_y - target_y[index])*(robot_y - target_y[index])) 323 | t_v_j = ((curr_robot_x - target_x[index])*(curr_robot_x - target_x[index])) + ((curr_robot_y - target_y[index])*(curr_robot_y - target_y[index])) 324 | t_m1 = (target_y[index] - robot_y) / (target_x[index] - robot_x + 1e-9) 325 | t_m2 = (target_y[index] - curr_robot_y) / (target_x[index] - curr_robot_x + 1e-9) 326 | value = np.abs(t_m1*t_m2 + 1) 327 | t_angle1 = np.arctan2(t_m1-t_m2, 1+t_m1*t_m2) 328 | t_angle2 = np.arctan2(t_m2-t_m1, 1+t_m1*t_m2) 329 | t_val = max((t_v_i*t_v_j*np.sin(t_angle1)*np.sin(t_angle1)), (t_v_i*t_v_j*np.sin(t_angle2)*np.sin(t_angle2))) 330 | t1_v_i = ((prev_robot_x - prev_target_x[index])*(prev_robot_x - prev_target_x[index])) + ((prev_robot_y - prev_target_y[index])*(prev_robot_y - prev_target_y[index])) 331 | t1_v_j = ((robot_x - prev_target_x[index])*(robot_x - prev_target_x[index])) + ((robot_y - prev_target_y[index])*(robot_y - prev_target_y[index])) 332 | t1_m1 = (prev_target_y[index] - prev_robot_y) / (prev_target_x[index] - prev_robot_x + 1e-9) 333 | t1_m2 = (prev_target_y[index] - robot_y) / (prev_target_x[index] - robot_x + 1e-9) 334 | t1_angle1 = np.arctan2(t1_m1-t1_m2, 1+t1_m1*t1_m2) 335 | t1_angle2 = np.arctan2(t1_m2-t1_m1, 1+t1_m1*t1_m2) 336 | t1_val = max((t1_v_i*t1_v_j*np.sin(t1_angle1)*np.sin(t1_angle1)), (t1_v_i*t1_v_j*np.sin(t1_angle2)*np.sin(t1_angle2))) 337 | t2_v_i = ((prev_robot_x - prev_target_x[index])*(prev_robot_x - prev_target_x[index])) + ((prev_robot_y - prev_target_y[index])*(prev_robot_y - prev_target_y[index])) 338 | t2_v_j = ((curr_robot_x - prev_target_x[index])*(curr_robot_x - prev_target_x[index])) + ((curr_robot_y - prev_target_y[index])*(curr_robot_y - prev_target_y[index])) 339 | t2_m1 = (prev_target_y[index] - prev_robot_y) / (prev_target_x[index] - prev_robot_x + 1e-9) 340 | t2_m2 = (prev_target_y[index] - curr_robot_y) / (prev_target_x[index] - curr_robot_x + 1e-9) 341 | t2_angle1 = np.arctan2(t2_m1-t2_m2, 1+t2_m1*t2_m2) 342 | t2_angle2 = np.arctan2(t2_m2-t2_m1, 1+t2_m1*t2_m2) 343 | t2_val = max((t2_v_i*t2_v_j*np.sin(t2_angle1)*np.sin(t2_angle1)), (t2_v_i*t2_v_j*np.sin(t2_angle2)*np.sin(t2_angle2))) 344 | val += t1_val + t2_val + t_val 345 | if(val > alpha_opt): 346 | alpha_opt = val 347 | best_action = action 348 | return (best_action[0], best_action[1], 0) 349 | 350 | 351 | # choose optimal action 352 | def update_robot_pos(robot_x, robot_y, target_x, target_y, prev_target_x, prev_target_y, radius, map_height, map_width, prev_robot_x, prev_robot_y): 353 | """ 354 | Inputs: 355 | robot_x: the position of robot x-coordinate 356 | robot_y: the position of robot y-coordinate 357 | target_x: the position of target x-coordinate 358 | target_y: the position of target y-coordinate 359 | prev_target_x: the previous position of target x-coordinate 360 | prev_target_y: the previous position of target y-coordinate 361 | radius: the radius of the circular region around the current robot position 362 | map_height: the environment dimensions, height 363 | map_width: the environment dimensions, width 364 | 365 | Outputs: 366 | (best_action[0], best_action[1]): updated robot position 367 | """ 368 | action_set = points_in_circle_np(radius, robot_x, robot_y) 369 | alpha_opt = -10000000 370 | dist_opt = 10000000 371 | best_action = (0, 0) 372 | for action in action_set: 373 | curr_robot_x = action[0] 374 | curr_robot_y = action[1] 375 | if(curr_robot_x >= 1 and curr_robot_x < (map_height-1) and curr_robot_y >= 1 and curr_robot_y < (map_width-1)): 376 | m1 = (prev_target_y - robot_y) / (prev_target_x - robot_x + 1e-8) 377 | m2 = (target_y - curr_robot_y) / (target_x - curr_robot_x + 1e-8) 378 | t_v_i = ((robot_x - target_x)*(robot_x - target_x)) + ((robot_y - target_y)*(robot_y - target_y)) 379 | t_v_j = ((curr_robot_x - target_x)*(curr_robot_x - target_x)) + ((curr_robot_y - target_y)*(curr_robot_y - target_y)) 380 | t_m1 = (target_y - robot_y) / (target_x - robot_x + 1e-9) 381 | t_m2 = (target_y - curr_robot_y) / (target_x - curr_robot_x + 1e-9) 382 | t_angle1 = np.arctan2(t_m1-t_m2, 1+t_m1*t_m2) 383 | t_angle2 = np.arctan2(t_m2-t_m1, 1+t_m1*t_m2) 384 | t_val = max((t_v_i*t_v_j*np.sin(t_angle1)*np.sin(t_angle1)), (t_v_i*t_v_j*np.sin(t_angle2)*np.sin(t_angle2))) 385 | t1_v_i = ((prev_robot_x - prev_target_x)*(prev_robot_x - prev_target_x)) + ((prev_robot_y - prev_target_y)*(prev_robot_y - prev_target_y)) 386 | t1_v_j = ((robot_x - prev_target_x)*(robot_x - prev_target_x)) + ((robot_y - prev_target_y)*(robot_y - prev_target_y)) 387 | t1_m1 = (prev_target_y - prev_robot_y) / (prev_target_x - prev_robot_x + 1e-9) 388 | t1_m2 = (prev_target_y - robot_y) / (prev_target_x - robot_x + 1e-9) 389 | t1_angle1 = np.arctan2(t1_m1-t1_m2, 1+t1_m1*t1_m2) 390 | t1_angle2 = np.arctan2(t1_m2-t1_m1, 1+t1_m1*t1_m2) 391 | t1_val = max((t1_v_i*t1_v_j*np.sin(t1_angle1)*np.sin(t1_angle1)), (t1_v_i*t1_v_j*np.sin(t1_angle2)*np.sin(t1_angle2))) 392 | t2_v_i = ((prev_robot_x - prev_target_x)*(prev_robot_x - prev_target_x)) + ((prev_robot_y - prev_target_y)*(prev_robot_y - prev_target_y)) 393 | t2_v_j = ((curr_robot_x - prev_target_x)*(curr_robot_x - prev_target_x)) + ((curr_robot_y - prev_target_y)*(curr_robot_y - prev_target_y)) 394 | t2_m1 = (prev_target_y - prev_robot_y) / (prev_target_x - prev_robot_x + 1e-9) 395 | t2_m2 = (prev_target_y - curr_robot_y) / (prev_target_x - curr_robot_x + 1e-9) 396 | t2_angle1 = np.arctan2(t2_m1-t2_m2, 1+t2_m1*t2_m2) 397 | t2_angle2 = np.arctan2(t2_m2-t2_m1, 1+t2_m1*t2_m2) 398 | t2_val = max((t2_v_i*t2_v_j*np.sin(t2_angle1)*np.sin(t2_angle1)), (t2_v_i*t2_v_j*np.sin(t2_angle2)*np.sin(t2_angle2))) 399 | val = t1_val + t2_val + t_val 400 | if(val > alpha_opt): 401 | alpha_opt = val 402 | best_action = action 403 | return (best_action[0], best_action[1]) 404 | -------------------------------------------------------------------------------- /outputs/7.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arp95/robot_target_tracking/662f2559893866c73717874ad8e2e5cfee78c731/outputs/7.png -------------------------------------------------------------------------------- /outputs/case1_1.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arp95/robot_target_tracking/662f2559893866c73717874ad8e2e5cfee78c731/outputs/case1_1.jpg -------------------------------------------------------------------------------- /outputs/case1_1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arp95/robot_target_tracking/662f2559893866c73717874ad8e2e5cfee78c731/outputs/case1_1.png -------------------------------------------------------------------------------- /outputs/case1_2.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arp95/robot_target_tracking/662f2559893866c73717874ad8e2e5cfee78c731/outputs/case1_2.jpg -------------------------------------------------------------------------------- /outputs/case1_2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arp95/robot_target_tracking/662f2559893866c73717874ad8e2e5cfee78c731/outputs/case1_2.png -------------------------------------------------------------------------------- /outputs/case1_3.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arp95/robot_target_tracking/662f2559893866c73717874ad8e2e5cfee78c731/outputs/case1_3.png -------------------------------------------------------------------------------- /outputs/case1_4.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arp95/robot_target_tracking/662f2559893866c73717874ad8e2e5cfee78c731/outputs/case1_4.png -------------------------------------------------------------------------------- /outputs/case2_1.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arp95/robot_target_tracking/662f2559893866c73717874ad8e2e5cfee78c731/outputs/case2_1.jpg -------------------------------------------------------------------------------- /outputs/case2_1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arp95/robot_target_tracking/662f2559893866c73717874ad8e2e5cfee78c731/outputs/case2_1.png -------------------------------------------------------------------------------- /outputs/case2_2.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arp95/robot_target_tracking/662f2559893866c73717874ad8e2e5cfee78c731/outputs/case2_2.jpg -------------------------------------------------------------------------------- /outputs/case2_2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arp95/robot_target_tracking/662f2559893866c73717874ad8e2e5cfee78c731/outputs/case2_2.png -------------------------------------------------------------------------------- /outputs/case2_3.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arp95/robot_target_tracking/662f2559893866c73717874ad8e2e5cfee78c731/outputs/case2_3.png -------------------------------------------------------------------------------- /outputs/case2_4.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arp95/robot_target_tracking/662f2559893866c73717874ad8e2e5cfee78c731/outputs/case2_4.png -------------------------------------------------------------------------------- /outputs/case3_1.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arp95/robot_target_tracking/662f2559893866c73717874ad8e2e5cfee78c731/outputs/case3_1.jpg -------------------------------------------------------------------------------- /outputs/case3_1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arp95/robot_target_tracking/662f2559893866c73717874ad8e2e5cfee78c731/outputs/case3_1.png -------------------------------------------------------------------------------- /outputs/case3_2.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arp95/robot_target_tracking/662f2559893866c73717874ad8e2e5cfee78c731/outputs/case3_2.jpg -------------------------------------------------------------------------------- /outputs/case3_2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arp95/robot_target_tracking/662f2559893866c73717874ad8e2e5cfee78c731/outputs/case3_2.png -------------------------------------------------------------------------------- /outputs/case3_3.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arp95/robot_target_tracking/662f2559893866c73717874ad8e2e5cfee78c731/outputs/case3_3.png -------------------------------------------------------------------------------- /outputs/case3_4.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arp95/robot_target_tracking/662f2559893866c73717874ad8e2e5cfee78c731/outputs/case3_4.png -------------------------------------------------------------------------------- /outputs/case4_1.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arp95/robot_target_tracking/662f2559893866c73717874ad8e2e5cfee78c731/outputs/case4_1.jpg -------------------------------------------------------------------------------- /outputs/case4_1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arp95/robot_target_tracking/662f2559893866c73717874ad8e2e5cfee78c731/outputs/case4_1.png -------------------------------------------------------------------------------- /outputs/case4_2.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arp95/robot_target_tracking/662f2559893866c73717874ad8e2e5cfee78c731/outputs/case4_2.jpg -------------------------------------------------------------------------------- /outputs/case4_2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arp95/robot_target_tracking/662f2559893866c73717874ad8e2e5cfee78c731/outputs/case4_2.png -------------------------------------------------------------------------------- /outputs/case4_3.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arp95/robot_target_tracking/662f2559893866c73717874ad8e2e5cfee78c731/outputs/case4_3.png -------------------------------------------------------------------------------- /outputs/case4_4.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arp95/robot_target_tracking/662f2559893866c73717874ad8e2e5cfee78c731/outputs/case4_4.png -------------------------------------------------------------------------------- /outputs/greedy_reward.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arp95/robot_target_tracking/662f2559893866c73717874ad8e2e5cfee78c731/outputs/greedy_reward.png -------------------------------------------------------------------------------- /outputs/qlearning_reward.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arp95/robot_target_tracking/662f2559893866c73717874ad8e2e5cfee78c731/outputs/qlearning_reward.png -------------------------------------------------------------------------------- /outputs/reward.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arp95/robot_target_tracking/662f2559893866c73717874ad8e2e5cfee78c731/outputs/reward.png -------------------------------------------------------------------------------- /outputs/reward_1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arp95/robot_target_tracking/662f2559893866c73717874ad8e2e5cfee78c731/outputs/reward_1.png -------------------------------------------------------------------------------- /outputs/reward_sensors_1_targets_1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arp95/robot_target_tracking/662f2559893866c73717874ad8e2e5cfee78c731/outputs/reward_sensors_1_targets_1.png -------------------------------------------------------------------------------- /outputs/reward_sensors_1_targets_2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arp95/robot_target_tracking/662f2559893866c73717874ad8e2e5cfee78c731/outputs/reward_sensors_1_targets_2.png -------------------------------------------------------------------------------- /outputs/reward_sensors_1_targets_3.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arp95/robot_target_tracking/662f2559893866c73717874ad8e2e5cfee78c731/outputs/reward_sensors_1_targets_3.png -------------------------------------------------------------------------------- /outputs/reward_sensors_1_targets_4.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arp95/robot_target_tracking/662f2559893866c73717874ad8e2e5cfee78c731/outputs/reward_sensors_1_targets_4.png -------------------------------------------------------------------------------- /outputs/sample.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arp95/robot_target_tracking/662f2559893866c73717874ad8e2e5cfee78c731/outputs/sample.png -------------------------------------------------------------------------------- /outputs/sample1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arp95/robot_target_tracking/662f2559893866c73717874ad8e2e5cfee78c731/outputs/sample1.png -------------------------------------------------------------------------------- /scripts/__pycache__/utils.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arp95/robot_target_tracking/662f2559893866c73717874ad8e2e5cfee78c731/scripts/__pycache__/utils.cpython-37.pyc -------------------------------------------------------------------------------- /scripts/bh_main.py: -------------------------------------------------------------------------------- 1 | # header files needed 2 | import numpy as np 3 | import matplotlib.pyplot as plt 4 | import matplotlib.cm as cm 5 | from scipy.stats import multivariate_normal 6 | from matplotlib.patches import Ellipse 7 | from utils import * 8 | 9 | 10 | # user-defined variables 11 | x_true = 12 12 | y_true = 12 13 | robots_x = [1] 14 | robots_y = [1] 15 | robots_id = [1] 16 | map_height = 20 17 | map_width = 20 18 | stepsize_map = 0.1 19 | sigma_bayesian_hist = 1.0 20 | action_radius = 2 21 | 22 | # variables 23 | x_mesh, y_mesh = np.mgrid[0:map_height:stepsize_map, 0:map_width:stepsize_map] 24 | belief_map = np.ones((int(map_height / stepsize_map), int(map_width / stepsize_map))) 25 | robot_movement_x = [] 26 | robot_movement_y = [] 27 | true_target_x = [] 28 | true_target_y = [] 29 | true_target_x.append(x_true) 30 | true_target_y.append(y_true) 31 | robot_movement_x.append(robots_x[0]) 32 | robot_movement_y.append(robots_y[0]) 33 | prev_target_x = x_true 34 | prev_target_y = y_true 35 | prev_robot_x = robots_x[0] 36 | prev_robot_y = robots_y[0] 37 | 38 | # plotting for t=1 39 | bayesian_hist = compute_bayesian_histogram([x_true], [y_true], robots_x[0], robots_y[0], int(belief_map.shape[0]), int(belief_map.shape[1]), stepsize_map, sigma_bayesian_hist) 40 | belief_map = belief_map * bayesian_hist 41 | belief_map = belief_map / belief_map.sum() 42 | render(1, x_mesh, y_mesh, belief_map, true_target_x, true_target_y, robot_movement_x, robot_movement_y) 43 | 44 | 45 | # estimate target position after each time step 46 | for t in range(2, 200): 47 | 48 | # update target position and add target to list 49 | x_true, y_true = get_target_position(t, x_true, y_true) 50 | true_target_x.append(x_true) 51 | true_target_y.append(y_true) 52 | 53 | # compute bayesian histogram and update belief map 54 | bayesian_hist = compute_bayesian_histogram([x_true], [y_true], robots_x[0], robots_y[0], int(belief_map.shape[0]), int(belief_map.shape[1]), stepsize_map) 55 | belief_map = belief_map * bayesian_hist 56 | belief_map = belief_map / belief_map.sum() 57 | 58 | # plot map 59 | render(t, x_mesh, y_mesh, belief_map, true_target_x, true_target_y, robot_movement_x, robot_movement_y) 60 | 61 | # update robot position 62 | #next_robot_x, next_robot_y = update_robot_pos(robots_x[0], robots_y[0], x_true, y_true, prev_target_x, prev_target_y, action_radius, map_height, map_width, prev_robot_x, prev_robot_y) 63 | 64 | # add robot position for rendering 65 | prev_target_x = x_true 66 | prev_target_y = y_true 67 | prev_robot_x = robots_x[0] 68 | prev_robot_y = robots_y[0] 69 | #robots_x[0] = next_robot_x 70 | #robots_y[0] = next_robot_y 71 | robot_movement_x.append(robots_x[0]) 72 | robot_movement_y.append(robots_y[0]) 73 | -------------------------------------------------------------------------------- /scripts/ekf_main.py: -------------------------------------------------------------------------------- 1 | # header files needed 2 | import numpy as np 3 | import matplotlib.pyplot as plt 4 | import matplotlib.cm as cm 5 | from scipy.stats import multivariate_normal 6 | from matplotlib.patches import Ellipse 7 | from utils import * 8 | 9 | 10 | # user-defined variables 11 | mean_1 = np.asarray([10, 10]) 12 | x_true_1 = 10 13 | y_true_1 = 10 14 | var_1 = [[1, 0], [0, 1]] 15 | init_pos_1 = [8, 10] 16 | mean_2 = np.asarray([12, 5]) 17 | x_true_2 = 12 18 | y_true_2 = 5 19 | var_2 = [[1, 0], [0, 1]] 20 | init_pos_2 = [10, 5] 21 | robots_x = [12.5] 22 | robots_y = [2.5] 23 | robots_id = [1] 24 | map_height = 20 25 | map_width = 20 26 | stepsize_map = 0.1 27 | action_radius = 1 28 | 29 | # variables 30 | x_g, y_g = np.mgrid[0:map_height:stepsize_map, 0:map_width:stepsize_map] 31 | points = np.dstack((x_g, y_g)) 32 | robot_movement_x = [robots_x[0]] 33 | robot_movement_y = [robots_y[0]] 34 | true_target_x_1 = [x_true_1] 35 | true_target_y_1 = [y_true_1] 36 | prev_target_x_1 = x_true_1 37 | prev_target_y_1 = y_true_1 38 | true_target_x_2 = [x_true_2] 39 | true_target_y_2 = [y_true_2] 40 | prev_target_x_2 = x_true_2 41 | prev_target_y_2 = y_true_2 42 | prev_robot_x = robots_x[0] 43 | prev_robot_y = robots_y[0] 44 | 45 | 46 | # estimate target position after each time step 47 | net_val = 0 48 | avg_val = 0 49 | for t in range(2, 200): 50 | 51 | # update target position 52 | target_x_mean_1, target_y_mean_1, var_1, x_true_1, y_true_1 = extended_kalman_filter(mean_1[0], mean_1[1], var_1, robots_x, robots_y, robots_id, t, init_pos_1[0], init_pos_1[1], 0) 53 | mean_1 = np.asarray([target_x_mean_1, target_y_mean_1]) 54 | target_x_mean_2, target_y_mean_2, var_2, x_true_2, y_true_2 = extended_kalman_filter(mean_2[0], mean_2[1], var_2, robots_x, robots_y, robots_id, t, init_pos_2[0], init_pos_2[1], 1) 55 | mean_2 = np.asarray([target_x_mean_2, target_y_mean_2]) 56 | 57 | # update true target position 58 | true_target_x_1.append(x_true_1) 59 | true_target_y_1.append(y_true_1) 60 | true_target_x_2.append(x_true_2) 61 | true_target_y_2.append(y_true_2) 62 | 63 | # add robot position for rendering 64 | robot_movement_x.append(robots_x[0]) 65 | robot_movement_y.append(robots_y[0]) 66 | 67 | # update belief map 68 | #gauss = multivariate_normal(mean, var) 69 | #prob_ekf = gauss.pdf(points) 70 | #scale = 2, 2 71 | #x, y = get_correlated_dataset(500, var, (mean[0], mean[1]), scale) 72 | 73 | # plot map 74 | render_ekf([target_x_mean_1, target_y_mean_1], [target_x_mean_2, target_y_mean_2], var_1, var_2, t, true_target_x_1, true_target_y_1, true_target_x_2, true_target_y_2, robot_movement_x, robot_movement_y) 75 | #plot_ellipse(x, y, mean, true_target_x, true_target_y, target_x_mean, target_y_mean, "/home/arpitdec5/Desktop/robot_target_tracking/s2/" + str(t) + ".png", robots_x[0], robots_y[0], robot_movement_x, robot_movement_y) 76 | 77 | # update robot position 78 | next_robot_x, next_robot_y, val = update_robot_pos_ekf(robots_x[0], robots_y[0], [target_x_mean_1, target_x_mean_2], [target_y_mean_1, target_y_mean_2], [var_1, var_2], [prev_target_x_1, prev_target_x_2], [prev_target_y_1, prev_target_y_2], action_radius, map_height, map_width, t+1, prev_robot_x, prev_robot_y) 79 | net_val = np.linalg.det(var_1) + np.linalg.det(var_2) 80 | avg_val += net_val 81 | prev_robot_x = robots_x[0] 82 | prev_robot_y = robots_y[0] 83 | robots_x[0] = next_robot_x 84 | robots_y[0] = next_robot_y 85 | prev_target_x_1 = target_x_mean_1 86 | prev_target_y_1 = target_y_mean_1 87 | prev_target_x_2 = target_x_mean_2 88 | prev_target_y_2 = target_y_mean_2 89 | 90 | print(net_val) 91 | print(avg_val/200) 92 | print("Done") 93 | -------------------------------------------------------------------------------- /scripts/qlearning.py: -------------------------------------------------------------------------------- 1 | # header files 2 | import gym 3 | import math 4 | import random 5 | import numpy as np 6 | import matplotlib 7 | import matplotlib.pyplot as plt 8 | from collections import namedtuple 9 | from itertools import count 10 | from PIL import Image 11 | from gym.envs.registration import register 12 | import logging 13 | import torch 14 | import torch.nn as nn 15 | import torch.optim as optim 16 | import torch.nn.functional as F 17 | import torchvision.transforms as T 18 | from td3 import * 19 | from replay_buffer import * 20 | 21 | 22 | logger = logging.getLogger(__name__) 23 | register( 24 | id='RobotTargetTrackingEnv-v0', 25 | entry_point='envs:RobotTargetTrackingEnvInterface', 26 | ) 27 | 28 | # initialise environment 29 | device = torch.device("cuda" if torch.cuda.is_available() else "cpu") 30 | env_name = "RobotTargetTrackingEnv-v0" 31 | env = gym.make(env_name) 32 | env.env_parametrization() 33 | env.reset() 34 | 35 | 36 | # constants 37 | LR = 0.1 38 | DISCOUNT = 0.95 39 | epsilon = 0.5 40 | EPOCHS = 1 41 | START_EPSILON_DECAYING = 1 42 | END_EPSILON_DECAYING = EPOCHS // 2 43 | epsilon_decay = epsilon / (END_EPSILON_DECAYING - START_EPSILON_DECAYING) 44 | 45 | # train qlearning 46 | q_table = np.random.uniform(low=-1, high=1, size=(11, 11, 20, 8)) 47 | epoch_reward = 0 48 | e = [] 49 | r = [] 50 | 51 | def get_state(state): 52 | new_state = np.array([0, 0, 0]) 53 | if(state[0]%2 == 0): 54 | new_state[0] = int(state[0] / 2.0) 55 | else: 56 | new_state[0] = int((state[0] - 1) / 2.0) + 1 57 | 58 | if(state[1]%2 == 0): 59 | new_state[1] = int(state[1] / 2.0) 60 | else: 61 | new_state[1] = int((state[1] - 1) / 2.0) + 1 62 | 63 | if(state[2]%3 == 0): 64 | new_state[2] = int(state[2] / 3.0) 65 | else: 66 | new_state[2] = int((state[2] - 1) / 3.0) + 1 67 | return new_state 68 | 69 | 70 | for epoch in range(0, EPOCHS): 71 | discrete_state = env.reset() 72 | state = get_state(discrete_state.numpy()) 73 | done = False 74 | while not done: 75 | if(np.random.random() > epsilon): 76 | action = np.argmax(q_table[(state[0], state[1], state[2])]) 77 | else: 78 | action = np.random.randint(0, 8) 79 | action_index = action 80 | if(action == 0): 81 | action = [0] 82 | elif(action == 1): 83 | action = [np.pi / 4] 84 | elif(action == 2): 85 | action = [np.pi / 2] 86 | elif(action == 3): 87 | action = [3*np.pi / 4] 88 | elif(action == 4): 89 | action = [np.pi] 90 | elif(action == 5): 91 | action = [225*np.pi / 180] 92 | elif(action == 6): 93 | action = [3*np.pi / 2] 94 | else: 95 | action = [315*np.pi / 180] 96 | 97 | new_state, reward, done, _ = env.step(action) 98 | epoch_reward += reward 99 | new_state = get_state(new_state.numpy()) 100 | 101 | env.render() 102 | 103 | #if(epoch%100 == 0): 104 | # env.render() 105 | 106 | if not done: 107 | max_future_q = np.max(q_table[(new_state[0], new_state[1], new_state[2])]) 108 | current_q = q_table[(state[0], state[1], state[2], action_index)] 109 | new_q = ((1 - LR) * current_q) + (LR * (reward + DISCOUNT * max_future_q)) 110 | q_table[(state[0], state[1], state[2], action_index)] = new_q 111 | state = new_state 112 | 113 | if(END_EPSILON_DECAYING >= epoch >= START_EPSILON_DECAYING): 114 | epsilon -= epsilon_decay 115 | if(epoch%1000 == 0): 116 | e.append(epoch) 117 | r.append(epoch_reward) 118 | print("Epoch Reward:") 119 | print(epoch_reward) 120 | print() 121 | epoch_reward = 0 122 | env.close() 123 | 124 | # plot reward curve 125 | plt.xlabel("Episodes") 126 | plt.ylabel("Avg. Reward") 127 | plt.plot(e, r) 128 | plt.savefig("/home/arpitdec5/Desktop/qlearning_reward.png") 129 | 130 | 131 | # test qlearning 132 | #discrete_state = env.reset() 133 | #state = get_state(discrete_state.numpy()) 134 | #done = False 135 | #while not done: 136 | # action = np.argmax(q_table[(state[0], state[1], state[2])]) 137 | # action_index = action 138 | # if(action == 0): 139 | # action = [0] 140 | # elif(action == 1): 141 | # action = [np.pi / 4] 142 | # elif(action == 2): 143 | # action = [np.pi / 2] 144 | # elif(action == 3): 145 | # action = [3*np.pi / 4] 146 | # elif(action == 4): 147 | # action = [np.pi] 148 | # elif(action == 5): 149 | # action = [225*np.pi / 180] 150 | # elif(action == 6): 151 | # action = [3*np.pi / 2] 152 | # else: 153 | # action = [315*np.pi / 180] 154 | # 155 | # new_state, reward, done, _ = env.step(action) 156 | # new_state = get_state(new_state.numpy()) 157 | # state = new_state 158 | # 159 | # env.render() 160 | # env.close() 161 | -------------------------------------------------------------------------------- /scripts/utils.py: -------------------------------------------------------------------------------- 1 | # header files needed 2 | import numpy as np 3 | import matplotlib.pyplot as plt 4 | import matplotlib.cm as cm 5 | from scipy.stats import multivariate_normal 6 | from matplotlib.patches import Ellipse 7 | import matplotlib.transforms as transforms 8 | from math import pi, cos, sin 9 | from random import random 10 | import math 11 | import torch 12 | 13 | 14 | def render_ekf(estimated_targets_mean_1, estimated_targets_mean_2, estimated_targets_var_1, estimated_targets_var_2, time_step, x_list_1, y_list_1, x_list_2, y_list_2, robot_movement_x, robot_movement_y): 15 | """ 16 | Function for rendering the environment 17 | """ 18 | heatmap = torch.zeros(256, 256) 19 | for index in range(0, 1): 20 | x = np.linspace(0, 20, 256) 21 | y = np.linspace(0, 20, 256) 22 | X, Y = np.meshgrid(x, y) 23 | pos = np.empty(X.shape + (2,)) 24 | pos[:, :, 0] = X; pos[:, :, 1] = Y 25 | rv_1 = multivariate_normal(estimated_targets_mean_1, estimated_targets_var_1) 26 | rv_2 = multivariate_normal(estimated_targets_mean_2, estimated_targets_var_2) 27 | heatmap += rv_1.pdf(pos) 28 | heatmap += rv_2.pdf(pos) 29 | x = np.linspace(0, 20, 256) 30 | y = np.linspace(0, 20, 256) 31 | X, Y = np.meshgrid(x, y) 32 | plt.cla() 33 | plt.title("Time step = " + str(time_step)) 34 | plt.xlabel("x") 35 | plt.ylabel("y") 36 | plt.xlim(0, 20) 37 | plt.ylim(0, 20) 38 | plt.contourf(X, Y, heatmap, cmap=cm.inferno) 39 | plt.plot(x_list_1, y_list_1, 'b--') 40 | plt.plot(x_list_2, y_list_2, 'b--') 41 | plt.plot(x_list_1[len(x_list_1) - 1], y_list_1[len(y_list_1) - 1], 'o', c='b', marker='*') 42 | plt.plot(x_list_2[len(x_list_2) - 1], y_list_2[len(y_list_2) - 1], 'o', c='b', marker='*') 43 | #plt.plot(self.x2_list, self.y2_list, 'b--') 44 | #plt.plot(self.x2_list[len(self.x2_list) - 1], self.y2_list[len(self.y2_list) - 1], 'o', c='b', marker='*') 45 | #plt.plot(self.x3_list, self.y3_list, 'b--') 46 | #plt.plot(self.x3_list[len(self.x3_list) - 1], self.y3_list[len(self.y3_list) - 1], 'o', c='b', marker='*') 47 | #plt.plot(self.x4_list, self.y4_list, 'b--') 48 | #plt.plot(self.x4_list[len(self.x4_list) - 1], self.y4_list[len(self.y4_list) - 1], 'o', c='b', marker='*') 49 | if(len(robot_movement_x) < 8): 50 | plt.plot(robot_movement_x, robot_movement_y, 'r--') 51 | else: 52 | plt.plot(robot_movement_x[-8:], robot_movement_y[-8:], 'r--') 53 | plt.scatter(robot_movement_x[len(robot_movement_x) - 1], robot_movement_y[len(robot_movement_y) - 1], color='r', marker='D') 54 | plt.savefig("/home/arpitdec5/Desktop/robot_target_tracking/s1/" + str(time_step) + ".png") 55 | #plt.show() 56 | 57 | 58 | # plot the heatmap 59 | def render(t, x_mesh, y_mesh, belief_map, x_target, y_target, robot_movement_x, robot_movement_y): 60 | """ 61 | Inputs: 62 | t: time step 63 | x_mesh: the x-coordinates 64 | y_mesh: the y-coordinates 65 | belief_map: the map containing the probabilities 66 | x_target: the target x-coordinate 67 | y_target: the target y-coordinate 68 | robot_movement_x: the list of robot paths x-coordinates 69 | robot_movement_y: the list of robot paths y-coordinates 70 | """ 71 | plt.cla() 72 | plt.title("Stationary robot(Target moving fast)") 73 | plt.xlabel("x") 74 | plt.ylabel("y") 75 | plt.contourf(x_mesh, y_mesh, belief_map, cmap=cm.inferno) 76 | plt.plot(x_target, y_target, 'b--', label='true target motion') 77 | plt.plot(x_target[len(x_target) - 1], y_target[len(y_target) - 1], 'o', c='b', marker='*') 78 | if(len(robot_movement_x) < 8): 79 | plt.plot(robot_movement_x, robot_movement_y, 'r--') 80 | else: 81 | plt.plot(robot_movement_x[-8:], robot_movement_y[-8:], 'r--') 82 | plt.scatter(robot_movement_x[len(robot_movement_x) - 1], robot_movement_y[len(robot_movement_y) - 1], color='r', marker='D', label='robot') 83 | plt.savefig("/home/arpitdec5/Desktop/robot_target_tracking/s1/" + str(t) + ".png") 84 | #plt.show() 85 | 86 | 87 | # compute bayesian histogram for 'm' targets and given robot position 88 | def compute_bayesian_histogram(targets_x_true, targets_y_true, robot_x, robot_y, belief_map_height, belief_map_width, stepsize_map, sigma_bayesian_hist=1.0): 89 | """ 90 | Inputs: 91 | targets_x_true: the true position of target x-coordinate 92 | targets_y_true: the true position of target y-coordinate 93 | robot_x: the position of robot x-coordinate 94 | robot_y: the position of robot y-coordinate 95 | belief_map_height: the environment dimensions, height 96 | belief_map_width: the environment dimensions, width 97 | stepsize_map: equal to 0.1 98 | sigma_bayesian_hist: equal to 1 99 | 100 | Outputs: 101 | bayesian_hist: the belief map of dimensions (belief_map_height, belief_map_width) containing probabilities 102 | """ 103 | noise = sigma_bayesian_hist * np.random.randn(1, 1) 104 | bayesian_hist = np.zeros((belief_map_height, belief_map_width)) 105 | for index in range(0, len(targets_x_true)): 106 | estimated = np.sqrt((targets_x_true[index] - robot_x)**2 + (targets_y_true[index] - robot_y)**2) + noise[0][0] 107 | for index1 in range(0, belief_map_height): 108 | for index2 in range(0, belief_map_width): 109 | true = np.sqrt(((index1*stepsize_map) - robot_x)**2 + ((index2*stepsize_map) - robot_y)**2) 110 | bayesian_hist[index1, index2] += 1.0 / (np.sqrt(2 * np.pi * sigma_bayesian_hist**2)) * np.exp(-0.5 / sigma_bayesian_hist**2 * (np.abs(true - estimated)**2)) 111 | return bayesian_hist 112 | 113 | 114 | # get target estimate 115 | def get_target_position(t, x_true, y_true): 116 | """ 117 | Inputs: 118 | t: time step 119 | x_true: the true position of target x-coordinate 120 | y_true: the true position of target y-coordinate 121 | 122 | Outputs: 123 | (x_true, y_true): the target position at next time step 124 | """ 125 | omega = 33 126 | x_true = 2*np.cos((t-1) / omega) + 10 127 | y_true = 2*np.sin((t-1) / omega) + 12 128 | return (x_true, y_true) 129 | 130 | 131 | # reference: https://matplotlib.org/3.1.0/gallery/statistics/confidence_ellipse.html 132 | def confidence_ellipse(x, y, ax, n_std=3.0, facecolor='none', **kwargs): 133 | """ 134 | Inputs: 135 | x: the x-coordinate datapoints 136 | y: the y-coordinate datapoints 137 | """ 138 | if x.size != y.size: 139 | raise ValueError("x and y must be the same size") 140 | 141 | cov = np.cov(x, y) 142 | pearson = cov[0, 1]/np.sqrt(cov[0, 0] * cov[1, 1]) 143 | # Using a special case to obtain the eigenvalues of this 144 | # two-dimensionl dataset. 145 | ell_radius_x = np.sqrt(1 + pearson) 146 | ell_radius_y = np.sqrt(1 - pearson) 147 | ellipse = Ellipse((0, 0), 148 | width=ell_radius_x * 2, 149 | height=ell_radius_y * 2, 150 | facecolor=facecolor, 151 | **kwargs) 152 | 153 | # Calculating the stdandard deviation of x from 154 | # the squareroot of the variance and multiplying 155 | # with the given number of standard deviations. 156 | scale_x = np.sqrt(cov[0, 0]) * n_std 157 | mean_x = np.mean(x) 158 | 159 | # calculating the stdandard deviation of y ... 160 | scale_y = np.sqrt(cov[1, 1]) * n_std 161 | mean_y = np.mean(y) 162 | 163 | transf = transforms.Affine2D() \ 164 | .rotate_deg(45) \ 165 | .scale(scale_x, scale_y) \ 166 | .translate(mean_x, mean_y) 167 | 168 | ellipse.set_transform(transf + ax.transData) 169 | return ax.add_patch(ellipse) 170 | 171 | 172 | # reference: https://matplotlib.org/3.1.0/gallery/statistics/confidence_ellipse.html 173 | def get_correlated_dataset(n, dependency, mu, scale): 174 | latent = np.random.randn(n, 2) 175 | dependent = latent.dot(dependency) 176 | scaled = dependent * scale 177 | scaled_with_offset = scaled + mu 178 | # return x and y of the new, correlated dataset 179 | return scaled_with_offset[:, 0], scaled_with_offset[:, 1] 180 | 181 | 182 | # code for the filter 183 | def extended_kalman_filter(target_xhat_t, target_yhat_t, target_sigma_t, robots_x, robots_y, robots_id, t, x, y, flag, omega, radii): 184 | """ 185 | Inputs: 186 | target_xhat_t: the estimated target position x-coordinate 187 | target_yhat_t: the estimated target position y-coordinate 188 | robots_x: the position of robots x-coordinate 189 | robots_y: the position of robots y-coordinate 190 | robots_id: the ids of robots 191 | t: the time step 192 | 193 | Outputs: 194 | (target_xhat_tplus1, target_yhat_tplus1, sigma_matrix_tplus1, x_true, y_true): the predicted target position 195 | """ 196 | # get z_true using true target motion 197 | sigma_z = 1.0 198 | x_true = radii*np.cos((t-1)/omega)+x 199 | if flag: 200 | y_true = radii*np.sin((t-1)/omega)+y 201 | else: 202 | y_true = -radii*np.sin((t-1) / omega)+y 203 | noise = sigma_z*np.random.randn(1000, 1) 204 | 205 | z_true = np.zeros((len(robots_x), 1)) 206 | for index in range(0, len(robots_x)): 207 | z_true[index][0] = np.linalg.norm([[robots_x[index] - x_true], [robots_y[index] - y_true]], 2) + noise[robots_id[index]] 208 | 209 | # filter code 210 | q_matrix = 0.2 * np.eye(2) 211 | x_matrix = np.array([[target_xhat_t], [target_yhat_t]]) 212 | sigma_matrix = target_sigma_t + q_matrix 213 | 214 | z_pred = np.zeros((len(robots_x), 1)) 215 | h_matrix = np.zeros((len(robots_x), 2)) 216 | for index in range(0, len(robots_x)): 217 | z_pred[index][0] = np.linalg.norm([[x_matrix[0][0] - robots_x[index]], [x_matrix[1][0] - robots_y[index]]], 2) 218 | h_matrix[index][0] = (-1.0 / z_pred[index]) * (robots_x[index] - x_matrix[0][0]) 219 | h_matrix[index][1] = (-1.0 / z_pred[index]) * (robots_y[index] - x_matrix[1][0]) 220 | 221 | res = (z_true - z_pred) 222 | r_matrix = sigma_z * sigma_z * np.eye(len(robots_x)) 223 | s_matrix = np.matmul(np.matmul(h_matrix, sigma_matrix), h_matrix.T) + r_matrix 224 | k_matrix = np.matmul(np.matmul(sigma_matrix, h_matrix.T), np.linalg.inv(s_matrix)) 225 | 226 | x_matrix_tplus1 = x_matrix + (np.matmul(k_matrix, res)) 227 | sigma_matrix_tplus1 = np.matmul(np.matmul((np.eye(2) - np.matmul(k_matrix, h_matrix)), sigma_matrix), (np.eye(2) - np.matmul(k_matrix, h_matrix)).T) + np.matmul(np.matmul(k_matrix, r_matrix), k_matrix.T) 228 | target_xhat_tplus1 = x_matrix_tplus1[0][0] 229 | target_yhat_tplus1 = x_matrix_tplus1[1][0] 230 | return (target_xhat_tplus1, target_yhat_tplus1, sigma_matrix_tplus1, x_true, y_true) 231 | 232 | 233 | # plot gaussian 234 | def plot_gaussian(gauss): 235 | x, y = np.mgrid[0:25:100j, 0:25:100j] 236 | z = np.dstack((x, y)) 237 | plt.contourf(x, y, gauss.pdf(z)) 238 | #plt.show() 239 | 240 | 241 | # save gaussian 242 | def save_gaussian(gauss, path): 243 | x, y = np.mgrid[0:25:100j, 0:25:100j] 244 | z = np.dstack((x, y)) 245 | plt.contourf(x, y, gauss.pdf(z)) 246 | plt.savefig(path) 247 | 248 | 249 | # plot confidence ellipse 250 | def plot_ellipse(x, y, mean, x_list, y_list, target_x_mean, target_y_mean, path, robot_x, robot_y, robot_movement_x, robot_movement_y): 251 | fig, ax_nstd = plt.subplots(figsize=(6, 6)) 252 | ax_nstd.axvline(c='grey', lw=1) 253 | ax_nstd.axhline(c='grey', lw=1) 254 | confidence_ellipse(x, y, ax_nstd, n_std=1, edgecolor='firebrick') 255 | ax_nstd.scatter(mean[0], mean[1], c='b', s=1) 256 | ax_nstd.legend() 257 | plt.title("Greedy algorithm using EKF(Target moving slow)") 258 | plt.xlabel("x") 259 | plt.ylabel("y") 260 | plt.xlim(0, 20) 261 | plt.ylim(0, 20) 262 | plt.plot(x_list, y_list, 'b--') 263 | plt.scatter(x_list[len(x_list)-1], y_list[len(y_list)-1], color='b', marker='*') 264 | plt.scatter([target_x_mean], [target_y_mean], color='b', marker="s") 265 | if(len(robot_movement_x) < 8): 266 | plt.plot(robot_movement_x, robot_movement_y, 'r--') 267 | else: 268 | plt.plot(robot_movement_x[-8:], robot_movement_y[-8:], 'r--') 269 | plt.scatter(robot_x, robot_y, color='r', marker='D') 270 | plt.plot([robot_x, target_x_mean], [robot_y, target_y_mean], color='r') 271 | plt.savefig(path) 272 | plt.cla() 273 | plt.close() 274 | 275 | 276 | def points_in_circle_np(radius, x0=0, y0=0): 277 | """ 278 | Inputs: 279 | radius: the radius of the circular region around the current robot position 280 | x0: the x-coordinate of the robot position 281 | y0: the y-coordinate of the robot position 282 | 283 | Outputs: 284 | points: the action set to be used for deciding the robot trajectory 285 | """ 286 | thetas = [0.0, 15.0, 30.0, 45.0, 60.0, 75.0, 90.0, 105.0, 120.0, 135.0, 150.0, 165.0, 180.0, 195.0, 210.0, 225.0, 240.0, 255.0, 270.0, 285.0, 300.0, 315.0, 330.0] 287 | points = [] 288 | for theta in thetas: 289 | theta = (theta * pi) / 180.0 290 | points.append((float(x0 + cos(theta) * radius), float(y0 + sin(theta) * radius))) 291 | return points 292 | 293 | 294 | # choose optimal action 295 | def update_robot_pos_ekf(robot_x, robot_y, target_x, target_y, var, prev_target_x, prev_target_y, radius, map_height, map_width, t, prev_robot_x, prev_robot_y): 296 | """ 297 | Inputs: 298 | robot_x: the position of robot x-coordinate 299 | robot_y: the position of robot y-coordinate 300 | target_x: the position of target x-coordinate 301 | target_y: the position of target y-coordinate 302 | var: the uncertainty in target position 303 | prev_target_x: the previous position of target x-coordinate 304 | prev_target_y: the previous position of target y-coordinate 305 | radius: the radius of the circular region around the current robot position 306 | map_height: the environment dimensions, height 307 | map_width: the environment dimensions, width 308 | t: the time step 309 | 310 | Outputs: 311 | (best_action[0], best_action[1]): updated robot position 312 | """ 313 | action_set = points_in_circle_np(radius, robot_x, robot_y) 314 | alpha_opt = -1000000 315 | best_action = (robot_x, robot_y) 316 | for action in action_set: 317 | curr_robot_x = action[0] 318 | curr_robot_y = action[1] 319 | val = 0 320 | for index in range(0, len(target_x)): 321 | if(curr_robot_x >= 1 and curr_robot_x <= (map_height-1) and curr_robot_y >= 1 and curr_robot_y <= (map_width-1)): 322 | t_v_i = ((robot_x - target_x[index])*(robot_x - target_x[index])) + ((robot_y - target_y[index])*(robot_y - target_y[index])) 323 | t_v_j = ((curr_robot_x - target_x[index])*(curr_robot_x - target_x[index])) + ((curr_robot_y - target_y[index])*(curr_robot_y - target_y[index])) 324 | t_m1 = (target_y[index] - robot_y) / (target_x[index] - robot_x + 1e-9) 325 | t_m2 = (target_y[index] - curr_robot_y) / (target_x[index] - curr_robot_x + 1e-9) 326 | value = np.abs(t_m1*t_m2 + 1) 327 | t_angle1 = np.arctan2(t_m1-t_m2, 1+t_m1*t_m2) 328 | t_angle2 = np.arctan2(t_m2-t_m1, 1+t_m1*t_m2) 329 | t_val = max((t_v_i*t_v_j*np.sin(t_angle1)*np.sin(t_angle1)), (t_v_i*t_v_j*np.sin(t_angle2)*np.sin(t_angle2))) 330 | t1_v_i = ((prev_robot_x - prev_target_x[index])*(prev_robot_x - prev_target_x[index])) + ((prev_robot_y - prev_target_y[index])*(prev_robot_y - prev_target_y[index])) 331 | t1_v_j = ((robot_x - prev_target_x[index])*(robot_x - prev_target_x[index])) + ((robot_y - prev_target_y[index])*(robot_y - prev_target_y[index])) 332 | t1_m1 = (prev_target_y[index] - prev_robot_y) / (prev_target_x[index] - prev_robot_x + 1e-9) 333 | t1_m2 = (prev_target_y[index] - robot_y) / (prev_target_x[index] - robot_x + 1e-9) 334 | t1_angle1 = np.arctan2(t1_m1-t1_m2, 1+t1_m1*t1_m2) 335 | t1_angle2 = np.arctan2(t1_m2-t1_m1, 1+t1_m1*t1_m2) 336 | t1_val = max((t1_v_i*t1_v_j*np.sin(t1_angle1)*np.sin(t1_angle1)), (t1_v_i*t1_v_j*np.sin(t1_angle2)*np.sin(t1_angle2))) 337 | t2_v_i = ((prev_robot_x - prev_target_x[index])*(prev_robot_x - prev_target_x[index])) + ((prev_robot_y - prev_target_y[index])*(prev_robot_y - prev_target_y[index])) 338 | t2_v_j = ((curr_robot_x - prev_target_x[index])*(curr_robot_x - prev_target_x[index])) + ((curr_robot_y - prev_target_y[index])*(curr_robot_y - prev_target_y[index])) 339 | t2_m1 = (prev_target_y[index] - prev_robot_y) / (prev_target_x[index] - prev_robot_x + 1e-9) 340 | t2_m2 = (prev_target_y[index] - curr_robot_y) / (prev_target_x[index] - curr_robot_x + 1e-9) 341 | t2_angle1 = np.arctan2(t2_m1-t2_m2, 1+t2_m1*t2_m2) 342 | t2_angle2 = np.arctan2(t2_m2-t2_m1, 1+t2_m1*t2_m2) 343 | t2_val = max((t2_v_i*t2_v_j*np.sin(t2_angle1)*np.sin(t2_angle1)), (t2_v_i*t2_v_j*np.sin(t2_angle2)*np.sin(t2_angle2))) 344 | val += t1_val + t2_val + t_val 345 | if(val > alpha_opt): 346 | alpha_opt = val 347 | best_action = action 348 | return (best_action[0], best_action[1], value) 349 | 350 | 351 | # choose optimal action 352 | def update_robot_pos(robot_x, robot_y, target_x, target_y, prev_target_x, prev_target_y, radius, map_height, map_width, prev_robot_x, prev_robot_y): 353 | """ 354 | Inputs: 355 | robot_x: the position of robot x-coordinate 356 | robot_y: the position of robot y-coordinate 357 | target_x: the position of target x-coordinate 358 | target_y: the position of target y-coordinate 359 | prev_target_x: the previous position of target x-coordinate 360 | prev_target_y: the previous position of target y-coordinate 361 | radius: the radius of the circular region around the current robot position 362 | map_height: the environment dimensions, height 363 | map_width: the environment dimensions, width 364 | 365 | Outputs: 366 | (best_action[0], best_action[1]): updated robot position 367 | """ 368 | action_set = points_in_circle_np(radius, robot_x, robot_y) 369 | alpha_opt = -10000000 370 | dist_opt = 10000000 371 | best_action = (0, 0) 372 | for action in action_set: 373 | curr_robot_x = action[0] 374 | curr_robot_y = action[1] 375 | if(curr_robot_x >= 1 and curr_robot_x < (map_height-1) and curr_robot_y >= 1 and curr_robot_y < (map_width-1)): 376 | m1 = (prev_target_y - robot_y) / (prev_target_x - robot_x + 1e-8) 377 | m2 = (target_y - curr_robot_y) / (target_x - curr_robot_x + 1e-8) 378 | t_v_i = ((robot_x - target_x)*(robot_x - target_x)) + ((robot_y - target_y)*(robot_y - target_y)) 379 | t_v_j = ((curr_robot_x - target_x)*(curr_robot_x - target_x)) + ((curr_robot_y - target_y)*(curr_robot_y - target_y)) 380 | t_m1 = (target_y - robot_y) / (target_x - robot_x + 1e-9) 381 | t_m2 = (target_y - curr_robot_y) / (target_x - curr_robot_x + 1e-9) 382 | t_angle1 = np.arctan2(t_m1-t_m2, 1+t_m1*t_m2) 383 | t_angle2 = np.arctan2(t_m2-t_m1, 1+t_m1*t_m2) 384 | t_val = max((t_v_i*t_v_j*np.sin(t_angle1)*np.sin(t_angle1)), (t_v_i*t_v_j*np.sin(t_angle2)*np.sin(t_angle2))) 385 | t1_v_i = ((prev_robot_x - prev_target_x)*(prev_robot_x - prev_target_x)) + ((prev_robot_y - prev_target_y)*(prev_robot_y - prev_target_y)) 386 | t1_v_j = ((robot_x - prev_target_x)*(robot_x - prev_target_x)) + ((robot_y - prev_target_y)*(robot_y - prev_target_y)) 387 | t1_m1 = (prev_target_y - prev_robot_y) / (prev_target_x - prev_robot_x + 1e-9) 388 | t1_m2 = (prev_target_y - robot_y) / (prev_target_x - robot_x + 1e-9) 389 | t1_angle1 = np.arctan2(t1_m1-t1_m2, 1+t1_m1*t1_m2) 390 | t1_angle2 = np.arctan2(t1_m2-t1_m1, 1+t1_m1*t1_m2) 391 | t1_val = max((t1_v_i*t1_v_j*np.sin(t1_angle1)*np.sin(t1_angle1)), (t1_v_i*t1_v_j*np.sin(t1_angle2)*np.sin(t1_angle2))) 392 | t2_v_i = ((prev_robot_x - prev_target_x)*(prev_robot_x - prev_target_x)) + ((prev_robot_y - prev_target_y)*(prev_robot_y - prev_target_y)) 393 | t2_v_j = ((curr_robot_x - prev_target_x)*(curr_robot_x - prev_target_x)) + ((curr_robot_y - prev_target_y)*(curr_robot_y - prev_target_y)) 394 | t2_m1 = (prev_target_y - prev_robot_y) / (prev_target_x - prev_robot_x + 1e-9) 395 | t2_m2 = (prev_target_y - curr_robot_y) / (prev_target_x - curr_robot_x + 1e-9) 396 | t2_angle1 = np.arctan2(t2_m1-t2_m2, 1+t2_m1*t2_m2) 397 | t2_angle2 = np.arctan2(t2_m2-t2_m1, 1+t2_m1*t2_m2) 398 | t2_val = max((t2_v_i*t2_v_j*np.sin(t2_angle1)*np.sin(t2_angle1)), (t2_v_i*t2_v_j*np.sin(t2_angle2)*np.sin(t2_angle2))) 399 | val = t1_val + t2_val + t_val 400 | if(val > alpha_opt): 401 | alpha_opt = val 402 | best_action = action 403 | return (best_action[0], best_action[1]) 404 | --------------------------------------------------------------------------------