├── imgs
│   ├── NAF.png
│   ├── NAF_vs_DDPG.png
│   ├── NAF_vs_DDPG_LL.png
│   └── NAF_vs_DDPG_LL_.png
├── LICENSE
├── README.md
├── networks.py
├── naf.py
├── replay_buffer.py
├── agent.py
├── results
│   ├── NAF_pendulum.csv
│   ├── DDPG_pendulum.csv
│   ├── NAF_LL.csv
│   └── DDPG_LL.csv
└── NAF.ipynb

/imgs/NAF.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/BY571/Normalized-Advantage-Function-NAF-/HEAD/imgs/NAF.png
--------------------------------------------------------------------------------
/imgs/NAF_vs_DDPG.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/BY571/Normalized-Advantage-Function-NAF-/HEAD/imgs/NAF_vs_DDPG.png
--------------------------------------------------------------------------------
/imgs/NAF_vs_DDPG_LL.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/BY571/Normalized-Advantage-Function-NAF-/HEAD/imgs/NAF_vs_DDPG_LL.png
--------------------------------------------------------------------------------
/imgs/NAF_vs_DDPG_LL_.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/BY571/Normalized-Advantage-Function-NAF-/HEAD/imgs/NAF_vs_DDPG_LL_.png
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 | 
3 | Copyright (c) 2020 Sebastian Dittert
4 | 
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 | 
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 | 
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 | 
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | 
2 | # Normalized Advantage Function (NAF)
3 | 
4 | PyTorch implementation of the NAF algorithm based on the paper: [Continuous Deep Q-Learning with Model-based Acceleration](https://arxiv.org/abs/1603.00748).
5 | 
6 | Two versions are implemented:
7 | 1. Jupyter notebook version
8 | 2. Script version (results tracking with [wandb](https://wandb.ai))
9 | 
10 | ### Recently added: prioritized experience replay (PER) and n-step bootstrapping
11 | 
12 | To run the script version: `python naf.py`
13 | 
14 | with the arguments:
15 | 
16 | '-env' : Name of the environment (default: Pendulum-v0)
17 | '-info' : Name of the Experiment (default: Experiment-1)
18 | '-f', --frames : Number of training frames (default: 40000)
19 | '-mem' : Replay buffer size (default: 100000)
20 | '-b', --batch_size : Batch size (default: 256)
21 | '-l', --layer_size : Neural Network layer size (default: 256)
22 | '-g', --gamma : Discount factor gamma (default: 0.99)
23 | '-t', --tau : Soft update factor tau (default: 0.005)
24 | '-lr', --learning_rate : Learning rate (default: 1e-3)
25 | '-u', --update_every : Update the network every x steps (default: 1)
26 | '-n_up', --n_updates : Number of updates per update step (default: 1)
27 | '-s', --seed : Random seed (default: 0)
28 | '-per', choices=[0,1] : Use prioritized experience replay (default: 0)
29 | '-nstep' : n-step bootstrapping (default: 1)
30 | '-d2rl' : Use the D2RL deep dense network architecture if set to 1 (default: 0)
31 | '--eval_every' : Evaluate the current policy every X frames (default: 5000)
32 | '--eval_runs' : Number of evaluation runs - performance is averaged over all runs (default: 2)
33 | '--clip_grad' : Gradient clipping norm (default: 1.0)
34 | '--loss', choices=[mse, huber] : Loss type, MSE or Huber (default: mse)
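35 | 
36 | For example, a run with prioritized replay and 3-step bootstrapping could be started with (the experiment name is arbitrary):
37 | 
38 |     python naf.py -env Pendulum-v0 -per 1 -nstep 3 -info PER-nstep-run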
39 | 
40 | ![alttext](/imgs/NAF.png)
41 | 
42 | In the paper, the authors compare NAF with DDPG and report faster and more stable learning: *We show that, in comparison to recently proposed deep actor-critic algorithms, our method tends to learn faster and acquires more accurate policies.*
43 | 
44 | To verify and support this claim, I tested NAF on Pendulum-v0 and LunarLanderContinuous-v2 and compared it with the results of my implementation of [DDPG](https://github.com/BY571/DDPG).
45 | 
46 | **The results shown do not include the model-based acceleration! Only the base NAF algorithm was tested.**
47 | 
48 | ![alttext](/imgs/NAF_vs_DDPG.png)
49 | 
50 | ![alttext](/imgs/NAF_vs_DDPG_LL_.png)
51 | 
52 | Indeed, the results show faster and more stable learning!
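53 | 
54 | For reference, the core computation the networks implement is the NAF decomposition Q(s, a) = V(s) + A(s, a) with the quadratic advantage A(s, a) = -1/2 (a - μ(s))ᵀ P(s) (a - μ(s)), where P(s) = L(s)L(s)ᵀ is built from a lower-triangular matrix predicted by the network. A minimal, self-contained sketch (the function name and tensor shapes are illustrative, not part of this repo's API):
55 | 
56 | ```python
57 | import torch
58 | 
59 | def naf_q_value(mu, entries, V, action):
60 |     """NAF quadratic advantage: mu (batch, A), entries (batch, A*(A+1)/2),
61 |     V (batch, 1), action (batch, A)."""
62 |     batch_size, action_size = mu.shape
63 |     L = torch.zeros(batch_size, action_size, action_size)
64 |     tril = torch.tril_indices(action_size, action_size)
65 |     L[:, tril[0], tril[1]] = entries       # fill the lower triangle
66 |     L.diagonal(dim1=1, dim2=2).exp_()      # keep the diagonal positive
67 |     P = L @ L.transpose(2, 1)              # state-dependent, positive-definite P(s)
68 |     diff = (action - mu).unsqueeze(-1)
69 |     A = -0.5 * (diff.transpose(2, 1) @ P @ diff).squeeze(-1)
70 |     return V + A                           # Q(s, a) = V(s) + A(s, a)
71 | ```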
72 | 
73 | ## TODO:
74 | - Test with Double Q-nets (like SAC)
75 | - Test with Entropy Regularization (like SAC)
76 | - Test with a REDQ Q-Net ensemble
77 | 
78 | 
79 | 
80 | Feel free to use this code for your own projects or research:
81 | 
82 | ```
83 | @misc{dittert2020naf,
84 |   author = {Dittert, Sebastian},
85 |   title = {PyTorch Implementation of Normalized Advantage Function},
86 |   year = {2020},
87 |   publisher = {GitHub},
88 |   journal = {GitHub repository},
89 |   howpublished = {\url{https://github.com/BY571/NAF}},
90 | }
91 | ```
--------------------------------------------------------------------------------
/networks.py:
--------------------------------------------------------------------------------
1 | import torch
2 | import torch.nn as nn
3 | from torch.distributions import MultivariateNormal
4 | 
5 | 
6 | 
7 | class DeepNAF(nn.Module):
8 |     def __init__(self, state_size, action_size, layer_size, seed):
9 |         super(DeepNAF, self).__init__()
10 |         self.seed = torch.manual_seed(seed)
11 |         self.input_shape = state_size
12 |         self.action_size = action_size
13 | 
14 |         concat_size = state_size+layer_size
15 | 
16 |         self.fc1 = nn.Linear(self.input_shape, layer_size)
17 |         self.bn1 = nn.BatchNorm1d(layer_size)
18 |         self.fc2 = nn.Linear(concat_size, layer_size)
19 |         self.bn2 = nn.BatchNorm1d(layer_size)
20 |         self.fc3 = nn.Linear(concat_size, layer_size)
21 |         self.bn3 = nn.BatchNorm1d(layer_size)
22 |         self.fc4 = nn.Linear(concat_size, layer_size)
23 |         self.bn4 = nn.BatchNorm1d(layer_size)
24 |         self.action_values = nn.Linear(layer_size, action_size)
25 |         self.value = nn.Linear(layer_size, 1)
26 |         self.matrix_entries = nn.Linear(layer_size, int(self.action_size*(self.action_size+1)/2))
27 | 
28 |     def forward(self, input_, action=None):
29 |         """Returns a sampled exploration action, Q(s, a) if an action is given,
30 |         the state value V(s) and the greedy action mu(s).
31 |         """
32 | 
33 |         x = torch.relu(self.fc1(input_))
34 |         x = self.bn1(x)
35 |         x = torch.relu(self.fc2(torch.cat([x, input_], dim=1)))
36 |         x = self.bn2(x)
37 |         x = torch.relu(self.fc3(torch.cat([x, input_], dim=1)))
38 |         x = self.bn3(x)
39 |         x = torch.relu(self.fc4(torch.cat([x, input_], dim=1)))
40 |         x = self.bn4(x)
41 |         action_value = torch.tanh(self.action_values(x))
42 |         entries = torch.tanh(self.matrix_entries(x))
43 |         V = self.value(x)
44 | 
45 |         action_value = action_value.unsqueeze(-1)
46 | 
47 |         # create lower-triangular matrix
48 |         L = torch.zeros((input_.shape[0], self.action_size, self.action_size)).to(input_.device)
49 | 
50 |         # get lower triangular indices
51 |         tril_indices = torch.tril_indices(row=self.action_size, col=self.action_size, offset=0)
52 | 
53 |         # fill matrix with entries
54 |         L[:, tril_indices[0], tril_indices[1]] = entries
55 |         L.diagonal(dim1=1, dim2=2).exp_()
56 | 
57 |         # calculate state-dependent, positive-definite matrix P = L @ L.T (as in the NAF paper)
58 |         P = torch.matmul(L, L.transpose(2, 1))
59 | 
60 |         Q = None
61 |         if action is not None:
62 | 
63 |             # calculate Advantage: A = -0.5 (a - mu)^T P (a - mu)
64 |             A = (-0.5 * torch.matmul(torch.matmul((action.unsqueeze(-1) - action_value).transpose(2, 1), P), (action.unsqueeze(-1) - action_value))).squeeze(-1)
65 | 
66 |             Q = A + V
67 | 
68 | 
69 |         # add noise to action mu by sampling from N(mu, P^-1):
70 |         dist = MultivariateNormal(action_value.squeeze(-1), torch.inverse(P))
71 |         action = dist.sample()
72 |         action = torch.clamp(action, min=-1, max=1)
73 | 
74 |         return action, Q, V, action_value
75 | 
76 | class NAF(nn.Module):
77 |     def __init__(self, state_size, action_size, layer_size, seed):
78 |         super(NAF, self).__init__()
79 |         self.seed = torch.manual_seed(seed)
80 |         self.input_shape = state_size
81 |         self.action_size = action_size
82 | 
83 |         self.head_1 = nn.Linear(self.input_shape, layer_size)
84 |         self.bn1 = nn.BatchNorm1d(layer_size)
85 |         self.ff_1 = nn.Linear(layer_size, layer_size)
86 |         self.bn2 = nn.BatchNorm1d(layer_size)
87 |         self.action_values = nn.Linear(layer_size, action_size)
88 |         self.value = nn.Linear(layer_size, 1)
89 |         self.matrix_entries = nn.Linear(layer_size, int(self.action_size*(self.action_size+1)/2))
90 | 
91 | 
92 | 
93 |     def forward(self, input_, action=None):
94 |         """Returns a sampled exploration action, Q(s, a) if an action is given,
95 |         the state value V(s) and the greedy action mu(s).
96 |         """
97 | 
98 |         x = torch.relu(self.head_1(input_))
99 |         x = self.bn1(x)
100 |         x = torch.relu(self.ff_1(x))
101 |         x = self.bn2(x)
102 |         action_value = torch.tanh(self.action_values(x))
103 |         entries = torch.tanh(self.matrix_entries(x))
104 |         V = self.value(x)
105 | 
106 |         action_value = action_value.unsqueeze(-1)
107 | 
108 |         # create lower-triangular matrix
109 |         L = torch.zeros((input_.shape[0], self.action_size, self.action_size)).to(input_.device)
110 | 
111 |         # get lower triangular indices
112 |         tril_indices = torch.tril_indices(row=self.action_size, col=self.action_size, offset=0)
113 | 
114 |         # fill matrix with entries
115 |         L[:, tril_indices[0], tril_indices[1]] = entries
116 |         L.diagonal(dim1=1, dim2=2).exp_()
117 | 
118 |         # calculate state-dependent, positive-definite matrix P = L @ L.T (as in the NAF paper)
119 |         P = torch.matmul(L, L.transpose(2, 1))
120 | 
121 |         Q = None
122 |         if action is not None:
123 | 
124 |             # calculate Advantage: A = -0.5 (a - mu)^T P (a - mu)
125 |             A = (-0.5 * torch.matmul(torch.matmul((action.unsqueeze(-1) - action_value).transpose(2, 1), P), (action.unsqueeze(-1) - action_value))).squeeze(-1)
126 | 
127 |             Q = A + V
128 | 
129 |         # add noise to action mu by sampling from N(mu, P^-1):
130 |         dist = MultivariateNormal(action_value.squeeze(-1), torch.inverse(P))
131 |         action = dist.sample()
132 |         action = torch.clamp(action, min=-1, max=1)
133 |         return action, Q, V, action_value
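134 | 
135 | # Usage sketch (illustrative, not called anywhere in this repo): for a batch of
136 | # states s of shape (batch, state_size):
137 | #   net = NAF(state_size, action_size, layer_size=256, seed=0)
138 | #   action, Q, V, mu = net(s)        # Q is None when no action is passed
139 | #   action, Q, V, mu = net(s, a)     # Q(s, a) = V(s) + A(s, a) for given actions a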
--------------------------------------------------------------------------------
/naf.py:
--------------------------------------------------------------------------------
1 | import torch
2 | import numpy as np
3 | import json
4 | import random
5 | 
6 | from collections import deque
7 | import time
8 | import gym
9 | import pybullet_envs
10 | 
11 | import argparse
12 | #import wandb
13 | from torch.utils.tensorboard import SummaryWriter
14 | from agent import NAF_Agent
15 | 
16 | 
17 | def evaluate(frame, eval_runs):
18 |     scores = []
19 |     with torch.no_grad():
20 |         for i in range(eval_runs):
21 |             state = test_env.reset()
22 |             score = 0
23 |             done = 0
24 |             while not done:
25 |                 action = agent.act_without_noise(state)
26 |                 state, reward, done, _ = test_env.step(action)
27 |                 score += reward
28 |                 if done:
29 |                     scores.append(score)
30 |                     break
31 | 
32 |     #wandb.log({"Reward": np.mean(scores), "Step": frame})
33 |     writer.add_scalar("Reward", np.mean(scores), frame)
34 | 
35 | def timer(start, end):
36 |     """ Helper to print training time """
37 |     hours, rem = divmod(end-start, 3600)
38 |     minutes, seconds = divmod(rem, 60)
39 |     print("\nTraining Time:  {:0>2}:{:0>2}:{:05.2f}".format(int(hours), int(minutes), seconds))
40 | 
41 | 
42 | def run(args):
43 |     """NAF training loop.
44 | 
45 |     Params
46 |     ======
47 |         args: parsed command-line arguments (see the argparse setup below)
48 |     """
49 |     frames = args.frames
50 |     eval_every = args.eval_every
51 |     eval_runs = args.eval_runs
52 |     scores = []                        # list containing scores from each episode
53 |     scores_window = deque(maxlen=100)  # last 100 scores
54 |     frame = 0
55 |     i_episode = 0
56 |     state = env.reset()
57 |     score = 0
58 |     evaluate(0, eval_runs)
59 |     for frame in range(1, frames+1):
60 |         action = agent.act(state)
61 | 
62 |         next_state, reward, done, _ = env.step(action)
63 |         agent.step(state, action, reward, next_state, done)
64 | 
65 |         state = next_state
66 |         score += reward
67 | 
68 |         if frame % eval_every == 0:
69 |             evaluate(frame, eval_runs)
70 | 
71 |         if done:
72 |             scores_window.append(score)  # save most recent score
73 |             scores.append(score)         # save most recent score
74 |             print('\rEpisode {}\tFrame [{}/{}] \tAverage Score: {:.2f}'.format(i_episode, frame, frames, np.mean(scores_window)), end="")
75 |             if i_episode % 100 == 0:
76 |                 print('\rEpisode {}\tFrame [{}/{}] \tAverage Score: {:.2f}'.format(i_episode, frame, frames, np.mean(scores_window)))
77 |             i_episode += 1
78 |             state = env.reset()
79 |             score = 0
80 | 
81 | 
82 | 
83 | if __name__ == "__main__":
84 | 
85 |     parser = argparse.ArgumentParser()
86 |     parser.add_argument("-info", type=str, default="Experiment-1",
87 |                         help="Name of the Experiment (default: Experiment-1)")
88 |     parser.add_argument('-env', type=str, default="Pendulum-v0",
89 |                         help='Name of the environment (default: Pendulum-v0)')
90 |     parser.add_argument('-f', "--frames", type=int, default=40000,
91 |                         help='Number of training frames (default: 40000)')
92 |     parser.add_argument("--eval_every", type=int, default=5000,
93 |                         help="Evaluate the current policy every X steps (default: 5000)")
94 |     parser.add_argument("--eval_runs", type=int, default=2,
95 |                         help="Number of evaluation runs - averaging the evaluation performance over all runs (default: 2)")
96 |     parser.add_argument('-mem', type=int, default=100000,
97 |                         help='Replay buffer size (default: 100000)')
98 |     parser.add_argument('-per', type=int, choices=[0,1], default=0,
99 |                         help='Use prioritized experience replay (default: 0)')
100 |     parser.add_argument('-b', "--batch_size", type=int, default=256,
101 |                         help='Batch size (default: 256)')
102 |     parser.add_argument('-nstep', type=int, default=1,
103 |                         help='n-step bootstrapping (default: 1)')
104 |     parser.add_argument("-d2rl", type=int, choices=[0,1], default=0,
105 |                         help="Use the D2RL deep dense NN architecture if set to 1 (default: 0)")
106 |     parser.add_argument('-l', "--layer_size", type=int, default=256,
107 |                         help='Neural Network layer size (default: 256)')
108 |     parser.add_argument('-g', "--gamma", type=float, default=0.99,
109 |                         help='Discount factor gamma (default: 0.99)')
110 |     parser.add_argument('-t', "--tau", type=float, default=0.005,
111 |                         help='Soft update factor tau (default: 0.005)')
112 |     parser.add_argument('-lr', "--learning_rate", type=float, default=1e-3,
113 |                         help='Learning rate (default: 1e-3)')
114 |     parser.add_argument('-u', "--update_every", type=int, default=1,
115 |                         help='Update the network every x steps (default: 1)')
116 |     parser.add_argument('-n_up', "--n_updates", type=int, default=1,
117 |                         help='Number of updates per update step (default: 1)')
118 |     parser.add_argument('-s', "--seed", type=int, default=0,
119 |                         help='Random seed (default: 0)')
120 |     parser.add_argument("--clip_grad", type=float, default=1.0, help="Clip gradients (default: 1.0)")
121 |     parser.add_argument("--loss", type=str, choices=["mse", "huber"], default="mse", help="Choose loss type MSE or Huber loss (default: mse)")
default="mse", help="Choose loss type MSE or Huber loss (default: mse)") 122 | 123 | args = parser.parse_args() 124 | #wandb.init(project="naf", name=args.info) 125 | #wandb.config.update(args) 126 | writer = SummaryWriter("runs/"+args.info) 127 | device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu") 128 | print("Using ", device) 129 | 130 | 131 | env = gym.make(args.env) #CartPoleConti 132 | test_env = gym.make(args.env) 133 | 134 | seed = args.seed 135 | np.random.seed(seed) 136 | env.seed(seed) 137 | test_env.seed(seed+1) 138 | action_size = env.action_space.shape[0] 139 | state_size = env.observation_space.shape[0] 140 | 141 | agent = NAF_Agent(state_size=state_size, 142 | action_size=action_size, 143 | device=device, 144 | args= args, 145 | writer=writer) 146 | 147 | 148 | 149 | t0 = time.time() 150 | run(args) 151 | t1 = time.time() 152 | 153 | timer(t0, t1) 154 | torch.save(agent.qnetwork_local.state_dict(), "NAF_"+args.info+"_.pth") 155 | # save parameter 156 | with open('runs/'+args.info+".json", 'w') as f: 157 | json.dump(args.__dict__, f, indent=2) -------------------------------------------------------------------------------- /replay_buffer.py: -------------------------------------------------------------------------------- 1 | 2 | import torch 3 | import numpy as np 4 | import random 5 | from collections import deque, namedtuple 6 | 7 | 8 | class ReplayBuffer: 9 | """Fixed-size buffer to store experience tuples.""" 10 | 11 | def __init__(self, buffer_size, batch_size, device, seed, gamma, nstep): 12 | """Initialize a ReplayBuffer object. 13 | Params 14 | ====== 15 | buffer_size (int): maximum size of buffer 16 | batch_size (int): size of each training batch 17 | seed (int): random seed 18 | """ 19 | self.device = device 20 | self.memory = deque(maxlen=buffer_size) 21 | self.batch_size = batch_size 22 | self.experience = namedtuple("Experience", field_names=["state", "action", "reward", "next_state", "done"]) 23 | self.seed = random.seed(seed) 24 | self.gamma = gamma 25 | self.n_step = nstep 26 | self.n_step_buffer = deque(maxlen=nstep) 27 | 28 | def add(self, state, action, reward, next_state, done): 29 | """Add a new experience to memory.""" 30 | 31 | self.n_step_buffer.append((state, action, reward, next_state, done)) 32 | if len(self.n_step_buffer) == self.n_step: 33 | state, action, reward, next_state, done = self.calc_multistep_return() 34 | 35 | e = self.experience(state, action, reward, next_state, done) 36 | self.memory.append(e) 37 | 38 | def calc_multistep_return(self): 39 | Return = 0 40 | for idx in range(1): 41 | Return += self.gamma**idx * self.n_step_buffer[idx][2] 42 | 43 | return self.n_step_buffer[0][0], self.n_step_buffer[0][1], Return, self.n_step_buffer[-1][3], self.n_step_buffer[-1][4] 44 | 45 | 46 | 47 | def sample(self): 48 | """Randomly sample a batch of experiences from memory.""" 49 | experiences = random.sample(self.memory, k=self.batch_size) 50 | 51 | states = torch.from_numpy(np.stack([e.state for e in experiences if e is not None])).float().to(self.device) 52 | actions = torch.from_numpy(np.vstack([e.action for e in experiences if e is not None])).long().to(self.device) 53 | rewards = torch.from_numpy(np.vstack([e.reward for e in experiences if e is not None])).float().to(self.device) 54 | next_states = torch.from_numpy(np.stack([e.next_state for e in experiences if e is not None])).float().to(self.device) 55 | dones = torch.from_numpy(np.vstack([e.done for e in experiences if e is not 
56 | 
57 |         return (states, actions, rewards, next_states, dones)
58 | 
59 |     def __len__(self):
60 |         """Return the current size of internal memory."""
61 |         return len(self.memory)
62 | 
63 | class PrioritizedReplay(object):
64 |     """
65 |     Proportional Prioritization
66 |     """
67 |     def __init__(self, capacity, batch_size, seed, gamma=0.99, n_step=1, alpha=0.6, beta_start=0.4, beta_frames=100000):
68 |         self.alpha = alpha
69 |         self.beta_start = beta_start
70 |         self.beta_frames = beta_frames
71 |         self.frame = 1  # for beta calculation
72 |         self.batch_size = batch_size
73 |         self.capacity = capacity
74 |         self.buffer = []
75 |         self.pos = 0
76 |         self.priorities = np.zeros((capacity,), dtype=np.float32)
77 |         self.seed = np.random.seed(seed)
78 |         self.n_step = n_step
79 |         self.n_step_buffer = deque(maxlen=self.n_step)
80 |         self.gamma = gamma
81 | 
82 |     def calc_multistep_return(self):
83 |         Return = 0
84 |         for idx in range(self.n_step):
85 |             Return += self.gamma**idx * self.n_step_buffer[idx][2]
86 | 
87 |         return self.n_step_buffer[0][0], self.n_step_buffer[0][1], Return, self.n_step_buffer[-1][3], self.n_step_buffer[-1][4]
88 | 
89 |     def beta_by_frame(self, frame_idx):
90 |         """
91 |         Linearly increases beta from beta_start to 1 over time from 1 to beta_frames.
92 | 
93 |         3.4 ANNEALING THE BIAS (Paper: PER)
94 |         We therefore exploit the flexibility of annealing the amount of importance-sampling
95 |         correction over time, by defining a schedule on the exponent beta
96 |         that reaches 1 only at the end of learning. In practice, we linearly anneal beta from its initial value beta_start to 1.
97 |         """
98 |         return min(1.0, self.beta_start + frame_idx * (1.0 - self.beta_start) / self.beta_frames)
99 | 
100 |     def add(self, state, action, reward, next_state, done):
101 |         assert state.ndim == next_state.ndim
102 |         state = np.expand_dims(state, 0)
103 |         next_state = np.expand_dims(next_state, 0)
104 | 
105 |         # n_step calc
106 |         self.n_step_buffer.append((state, action, reward, next_state, done))
107 |         if len(self.n_step_buffer) == self.n_step:
108 |             state, action, reward, next_state, done = self.calc_multistep_return()
109 | 
110 |         max_prio = self.priorities.max() if self.buffer else 1.0  # new samples get max priority so they are replayed at least once
111 | 
112 |         if len(self.buffer) < self.capacity:
113 |             self.buffer.append((state, action, reward, next_state, done))
114 |         else:
115 |             # overwrite the oldest entry: pos always points at the oldest slot
116 |             # because it cycles through the buffer positions modulo capacity
117 |             self.buffer[self.pos] = (state, action, reward, next_state, done)
118 | 
119 |         self.priorities[self.pos] = max_prio
120 |         self.pos = (self.pos + 1) % self.capacity  # advance pos and wrap around to 0 once capacity is reached
121 | 
122 |     def sample(self):
123 |         N = len(self.buffer)
124 |         if N == self.capacity:
125 |             prios = self.priorities
126 |         else:
127 |             prios = self.priorities[:self.pos]
128 | 
129 |         # calc P = p^alpha / sum(p^alpha)
130 |         probs = prios ** self.alpha
131 |         P = probs/probs.sum()
132 | 
133 |         # sample indices according to the probabilities P
134 |         indices = np.random.choice(N, self.batch_size, p=P)
135 |         samples = [self.buffer[idx] for idx in indices]
136 | 
137 |         beta = self.beta_by_frame(self.frame)
138 |         self.frame += 1
139 | 
140 |         # compute importance-sampling weights
141 |         weights = (N * P[indices]) ** (-beta)
142 |         # normalize weights
143 |         weights /= weights.max()
144 |         weights = np.array(weights, dtype=np.float32)
145 | 
146 |         states, actions, rewards, next_states, dones = zip(*samples)
147 |         return np.concatenate(states), actions, rewards, np.concatenate(next_states), dones, indices, weights
148 | 
149 |     def update_priorities(self, batch_indices, batch_priorities):
150 |         for idx, prio in zip(batch_indices, batch_priorities):
151 |             self.priorities[idx] = prio
152 | 
153 |     def __len__(self):
154 |         return len(self.buffer)
--------------------------------------------------------------------------------
/agent.py:
--------------------------------------------------------------------------------
1 | from replay_buffer import ReplayBuffer, PrioritizedReplay
2 | from networks import NAF, DeepNAF
3 | 
4 | import torch
5 | import torch.nn as nn
6 | from torch.nn.utils import clip_grad_norm_
7 | import numpy as np
8 | import torch.optim as optim
9 | import random
10 | 
11 | class NAF_Agent():
12 |     """Interacts with and learns from the environment."""
13 | 
14 |     def __init__(self,
15 |                  state_size,
16 |                  action_size,
17 |                  device,
18 |                  args,
19 |                  writer):
20 |         """Initialize an Agent object.
21 | 
22 |         Params
23 |         ======
24 |             state_size (int): dimension of each state
25 |             action_size (int): dimension of each action
26 |             device (str): device that is used for the compute
27 |             args: parsed hyperparameters, including:
28 |                 layer_size (int): size of the hidden layers
29 |                 batch_size (int): size of the training batch
30 |                 mem (int): size of the replay memory
31 |                 learning_rate (float): learning rate
32 |                 tau (float): soft update factor for the target network
33 |                 gamma (float): discount factor
34 |                 update_every (int): update frequency
35 |             writer: TensorBoard SummaryWriter used for logging
36 |         """
37 |         self.state_size = state_size
38 |         self.action_size = action_size
39 |         self.seed = random.seed(args.seed)
40 |         self.device = device
41 |         self.TAU = args.tau
42 |         self.GAMMA = args.gamma
43 |         self.nstep = args.nstep
44 |         self.UPDATE_EVERY = args.update_every
45 |         self.NUPDATES = args.n_updates
46 |         self.BATCH_SIZE = args.batch_size
47 |         self.Q_updates = 0
48 |         self.per = args.per
49 |         self.clip_grad = args.clip_grad
50 | 
51 |         self.action_step = 4
52 |         self.last_action = None
53 | 
54 |         # Q-Network
55 |         if args.d2rl == 0:
56 |             self.qnetwork_local = NAF(state_size, action_size, args.layer_size, args.seed).to(device)
57 |             self.qnetwork_target = NAF(state_size, action_size, args.layer_size, args.seed).to(device)
58 |         else:
59 |             self.qnetwork_local = DeepNAF(state_size, action_size, args.layer_size, args.seed).to(device)
60 |             self.qnetwork_target = DeepNAF(state_size, action_size, args.layer_size, args.seed).to(device)
61 | 
62 |         #wandb.watch(self.qnetwork_local)
63 |         self.writer = writer
64 |         self.optimizer = optim.Adam(self.qnetwork_local.parameters(), lr=args.learning_rate)
65 |         print(self.qnetwork_local)
66 | 
67 |         # Replay memory
68 |         if args.per == 1:
69 |             print("Using Prioritized Experience Replay")
70 |             self.memory = PrioritizedReplay(capacity=args.mem,
71 |                                             batch_size=args.batch_size,
72 |                                             seed=args.seed,
73 |                                             gamma=args.gamma,
74 |                                             n_step=self.nstep,
75 |                                             beta_frames=args.frames)
76 |         else:
77 |             print("Using Regular Experience Replay")
78 |             self.memory = ReplayBuffer(buffer_size=args.mem,
79 |                                        batch_size=args.batch_size,
80 |                                        device=self.device,
81 |                                        seed=args.seed,
82 |                                        gamma=args.gamma,
83 |                                        nstep=args.nstep)
84 | 
85 |         # define loss (per-sample, so PER importance weights can be applied)
86 |         if args.loss == "mse":
87 |             self.loss = nn.MSELoss(reduction='none')
88 |         elif args.loss == "huber":
89 |             self.loss = nn.SmoothL1Loss(reduction='none')
90 |         else:
91 |             raise ValueError("Loss is not defined, choose between mse and huber!")
92 | 
93 |         # Initialize time step (for updating every UPDATE_EVERY steps)
94 |         self.t_step = 0
95 | 
96 |     def step(self, state, action, reward, next_state, done):
97 |         # Save experience in replay memory
98 |         self.memory.add(state, action, reward, next_state, done)
99 | 
100 |         # Learn every UPDATE_EVERY time steps.
101 |         self.t_step = (self.t_step + 1) % self.UPDATE_EVERY
102 |         if self.t_step == 0:
103 |             # If enough samples are available in memory, get random subset and learn
104 |             if len(self.memory) > self.BATCH_SIZE:
105 |                 Q_losses = []
106 |                 for _ in range(self.NUPDATES):
107 |                     experiences = self.memory.sample()
108 |                     if self.per == 1:
109 |                         loss = self.learn_per(experiences)
110 |                     else:
111 |                         loss = self.learn(experiences)
112 |                     self.Q_updates += 1
113 |                     Q_losses.append(loss)
114 |                 self.writer.add_scalar("Q_loss", np.mean(Q_losses), self.Q_updates)
115 |                 #wandb.log({"Q_loss": np.mean(Q_losses), "Optimization step": self.Q_updates})
116 | 
117 |     def act_without_noise(self, state):
118 |         state = torch.from_numpy(state).float().to(self.device)
119 | 
120 |         self.qnetwork_local.eval()
121 |         with torch.no_grad():
122 |             _, _, _, action = self.qnetwork_local(state.unsqueeze(0))
123 |         self.qnetwork_local.train()
124 |         return action.cpu().squeeze().numpy().reshape((self.action_size,))
125 | 
126 |     def act(self, state):
127 |         """Calculate the (noisy) action for exploration.
128 | 
129 |         Params
130 |         ======
131 |             state (array_like): current state
132 | 
133 |         """
134 | 
135 |         state = torch.from_numpy(state).float().to(self.device)
136 | 
137 |         self.qnetwork_local.eval()
138 |         with torch.no_grad():
139 |             action, _, _, _ = self.qnetwork_local(state.unsqueeze(0))
140 |         self.qnetwork_local.train()
141 |         return action.cpu().squeeze().numpy().reshape((self.action_size,))
142 | 
143 | 
144 | 
145 |     def learn(self, experiences):
146 |         """Update value parameters using given batch of experience tuples.
147 |         Params
148 |         ======
149 |             experiences (Tuple[torch.Tensor]): tuple of (s, a, r, s', done) tuples
150 |         """
151 | 
152 |         states, actions, rewards, next_states, dones = experiences
153 | 
154 |         # get the Value for the next state from the target model
155 |         with torch.no_grad():
156 |             _, _, V_, _ = self.qnetwork_target(next_states)
157 | 
158 |         # Compute Q targets for current states
159 |         V_targets = rewards + (self.GAMMA**self.nstep * V_ * (1 - dones))
160 | 
161 |         # Get expected Q values from local model
162 |         _, Q, _, _ = self.qnetwork_local(states, actions)
163 | 
164 |         # Compute loss (self.loss is per-sample, so reduce with mean)
165 |         loss = self.loss(Q, V_targets).mean()
166 | 
167 |         # Minimize the loss
168 |         self.optimizer.zero_grad()
169 |         loss.backward()
170 |         clip_grad_norm_(self.qnetwork_local.parameters(), self.clip_grad)
171 |         self.optimizer.step()
172 | 
173 |         # ------------------- update target network ------------------- #
174 |         self.soft_update(self.qnetwork_local, self.qnetwork_target)
175 | 
176 |         return loss.detach().cpu().numpy()
177 | 
178 |     def learn_per(self, experiences):
179 |         """Update value parameters using given batch of experience tuples.
180 |         Params
181 |         ======
182 |             experiences (Tuple[torch.Tensor]): tuple of (s, a, r, s', done) tuples
183 |         """
184 |         self.optimizer.zero_grad()
185 |         states, actions, rewards, next_states, dones, idx, weights = experiences
186 | 
187 |         states = torch.FloatTensor(states).to(self.device)
188 |         next_states = torch.FloatTensor(np.float32(next_states)).to(self.device)
189 |         actions = torch.FloatTensor(actions).to(self.device)  # continuous actions stay floats
190 |         rewards = torch.FloatTensor(rewards).to(self.device).unsqueeze(1)
191 |         dones = torch.FloatTensor(dones).to(self.device).unsqueeze(1)
192 |         weights = torch.FloatTensor(weights).unsqueeze(1).to(self.device)
193 | 
194 |         # get the Value for the next state from the target model
195 |         with torch.no_grad():
196 |             _, _, V_, _ = self.qnetwork_target(next_states)
197 | 
198 |         # Compute Q targets for current states
199 |         V_targets = rewards + (self.GAMMA**self.nstep * V_ * (1 - dones))
200 | 
201 |         # Get expected Q values from local model
202 |         _, Q, _, _ = self.qnetwork_local(states, actions)
203 | 
204 |         # Compute per-sample loss, weighted by the importance-sampling weights
205 |         td_error = Q - V_targets
206 |         loss = (self.loss(Q, V_targets)*weights).mean()
207 |         # Minimize the loss
208 |         loss.backward()
209 |         clip_grad_norm_(self.qnetwork_local.parameters(), self.clip_grad)
210 |         self.optimizer.step()
211 | 
212 |         # ------------------- update target network ------------------- #
213 |         self.soft_update(self.qnetwork_local, self.qnetwork_target)
214 |         # update PER priorities with the absolute TD errors
215 |         self.memory.update_priorities(idx, abs(td_error.data.cpu().numpy()))
216 | 
217 | 
218 |         return loss.detach().cpu().numpy()
219 | 
220 |     def soft_update(self, local_model, target_model):
221 |         """Soft update model parameters.
222 |         θ_target = τ*θ_local + (1 - τ)*θ_target
223 |         Params
224 |         ======
225 |             local_model (PyTorch model): weights will be copied from
226 |             target_model (PyTorch model): weights will be copied to
227 |             tau (float): interpolation parameter
228 |         """
229 |         for target_param, local_param in zip(target_model.parameters(), local_model.parameters()):
230 |             target_param.data.copy_(self.TAU*local_param.data + (1.0-self.TAU)*target_param.data)
--------------------------------------------------------------------------------
/results/NAF_pendulum.csv:
--------------------------------------------------------------------------------
1 | "Step","gallant-jazz-50 - Reward","whole-sun-49 - Reward","scarlet-snowball-48 - Reward"
2 | "0","-1507.4474344546124","-1368.7675074571011","-1723.585816526964"
3 | "1","-1091.0162390203282","-1329.820391945355","-1186.8036595947508"
4 | "2","-1743.9052623485663","-1417.3089761405877","-1795.231718263501"
5 | "3","-1513.2903096661187","-1767.8798589128141","-1232.6779380932007"
6 | "4","-1083.8022953107632","-1395.6238459909316","-1781.6781821289096"
7 | "5","-1736.2560900185342","-1806.0401786657017","-1762.8010300242288"
8 | "6","-1399.66438803644","-1681.2422548838251","-1200.827114469524"
9 | "7","-1333.1557184019384","-1606.6010769558998","-1781.4601888882014"
10 | "8","-1393.586502171023","-1083.0367704377293","-953.3639201771228"
11 | "9","-1109.4187774196448","-1441.293756439769","-1737.7028515832274"
12 | "10","-1326.9293914261402","-1162.7092883694456","-1260.8724012965124"
13 | "11","-1572.805440318419","-1478.3363736959777","-858.6804642066253"
14 | "12","-1069.4452607517837","-1649.4832800430006","-1272.118618186267"
15 | "13","-1524.5968052122973","-1304.9512089109296","-1155.526460839424"
16 | "14","-1428.3828174378577","-1151.1379483318353","-973.0556867430162"
17 | 
"15","-1291.4385881793978","-1482.1859404224913","-1163.1362269798906" 18 | "16","-1285.1487646668545","-1168.5253663684198","-962.7688907266163" 19 | "17","-645.271648171834","-1046.2991483617377","-1042.7335714976673" 20 | "18","-1394.8203580045092","-865.888096290279","-1353.3343230599105" 21 | "19","-960.6787257007894","-1101.6861010378152","-859.3448071183794" 22 | "20","-1098.4478573613665","-1220.0327734472453","-969.6986201467674" 23 | "21","-1284.169963576704","-971.655409111762","-1583.6119669772745" 24 | "22","-1419.4245036122559","-779.3439279548702","-1468.9276077723823" 25 | "23","-1145.63356463456","-1122.5872946513794","-1334.655695484695" 26 | "24","-620.896610603594","-743.4845591766414","-765.2094313168549" 27 | "25","-617.3295949902641","-939.8218512825927","-1238.7708423281276" 28 | "26","-1222.2072841860602","-790.7494400016424","-1005.7822004921151" 29 | "27","-368.2205762821809","-842.5504280601738","-895.967980542981" 30 | "28","-1112.1082966035158","-511.05835211600186","-1147.7688357087955" 31 | "29","-738.2126527069581","-634.4764862474137","-1100.1891599663309" 32 | "30","-247.42403482957252","-379.7140199270052","-853.5465304390718" 33 | "31","-640.0695152861509","-742.5803675816342","-855.6701377397093" 34 | "32","-988.9899666971801","-253.308364621425","-508.97327260652526" 35 | "33","-994.9987429104684","-634.8736858959721","-996.3797981564633" 36 | "34","-1038.5698117378424","-121.8197951847826","-736.0703880898933" 37 | "35","-121.63886293790631","-615.026654278416","-621.3489512910409" 38 | "36","-869.9188627161307","-249.40632647932807","-1011.6697063492841" 39 | "37","-1073.0484166711599","-372.05969548161386","-125.81691424325402" 40 | "38","-124.30936700249387","-619.7134862633474","-124.36577442325864" 41 | "39","-716.367521612195","-373.8695275227008","-239.5887835287034" 42 | "40","-126.52222965820194","-123.44629734408855","-894.3760519349203" 43 | "41","-0.8929087412025605","-862.1997271340089","-1352.2167732570003" 44 | "42","-925.1456456510406","-482.6986848523718","-250.15895174806417" 45 | "43","-496.0942536469977","-364.05356355139634","-1.0710859895874485" 46 | "44","-1058.335395017558","-250.8434309724132","-125.37393920545436" 47 | "45","-123.86184011918574","-772.3980171748874","-121.81253967933098" 48 | "46","-371.878302529098","-0.753748082895409","-356.6810835069645" 49 | "47","-248.23347416198337","-962.978665244682","-121.4768404621585" 50 | "48","-606.517951810138","-0.8220186531340494","-127.02559258124339" 51 | "49","-1.5111097275490957","-733.9915349338153","-126.59323187395242" 52 | "50","-124.85684677446079","-126.61160147380727","-730.0611206246601" 53 | "51","-479.8082847100355","-0.9686898579414708","-122.98228651413439" 54 | "52","-126.03459759191124","-361.9488363809253","-234.83341322270846" 55 | "53","-1289.6003862399643","-360.48919449040113","-244.1780046404139" 56 | "54","-733.6799981764867","-1022.5858027035848","-245.23401223114527" 57 | "55","-250.13522568457566","-245.6837277726886","-745.4030319457632" 58 | "56","-126.88244275290998","-717.8349899830002","-125.32855692059432" 59 | "57","-366.979833239877","-123.85322169777622","-362.4084733141734" 60 | "58","-987.9029040109177","-126.51918693334923","-244.86072529006947" 61 | "59","-126.56400013576484","-125.72001830847613","-235.16934060247576" 62 | "60","-234.28921627818525","-736.3225859045626","-124.42996173630026" 63 | "61","-359.60718243730213","-245.42519196877646","-242.16186080292238" 64 | 
"62","-480.3482483946308","-901.3738881928023","-237.23204039668" 65 | "63","-361.18040092355267","-242.26906435596067","-243.58944310182386" 66 | "64","-126.39461422640106","-363.209537568953","-469.77051184818174" 67 | "65","-1.4188653566448866","-362.9234431470242","-121.72442491777068" 68 | "66","-121.48339685026905","-121.99446684952098","-126.54467155382714" 69 | "67","-125.08729731601655","-758.9148549422006","-620.1125935753583" 70 | "68","-368.43007720755287","-603.4629357765995","-591.6425395243759" 71 | "69","-245.3861156280137","-742.4471503802384","-238.0485250785491" 72 | "70","-126.87693648956338","-121.24764680304072","-120.76266028990185" 73 | "71","-127.3247796901256","-618.1962705407259","-251.07955210943507" 74 | "72","-740.1793297283326","-240.95760095709468","-590.7462600506879" 75 | "73","-477.2917330830815","-122.02981459947132","-125.64237876868448" 76 | "74","-250.2869949225074","-126.13973692886712","-357.1279153878531" 77 | "75","-126.52412322991484","-365.8654386608434","-125.52253463977249" 78 | "76","-249.0394840703672","-750.7536355848999","-351.44032416656415" 79 | "77","-374.3194147595046","-121.60916210824423","-123.7280761312089" 80 | "78","-799.6329155499478","-966.5991188340737","-241.73603867838787" 81 | "79","-128.12935583746483","-479.36914229492334","-125.10722685224538" 82 | "80","-240.15879695370327","-869.88249605427","-760.3552544841629" 83 | "81","-606.4229095550924","-126.40122391360758","-126.15268831744051" 84 | "82","-366.7885168534799","-485.05586057508106","-868.2668268713544" 85 | "83","-2.2443387506372865","-668.5659605408877","-126.92393258865616" 86 | "84","-762.7564228613464","-123.62253975773667","-249.53400653565348" 87 | "85","-127.80606165831338","-235.0715664882357","-123.5111739595756" 88 | "86","-363.59760524327663","-126.03360816716227","-964.2201878559458" 89 | "87","-245.88726817835513","-238.90291407726727","-239.29965538084878" 90 | "88","-252.61479807317733","-123.50174355914297","-125.16594346486384" 91 | "89","-150.56531325885777","-123.45620483387897","-120.63181460371449" 92 | "90","-376.3780675475113","-612.1526277623375","-123.46581944122842" 93 | "91","-603.0909305640891","-125.89024038686775","-247.00326067952128" 94 | "92","-244.52024200077565","-236.74194717202246","-485.925361096398" 95 | "93","-539.9632246702932","-123.02816271230004","-251.0335223333369" 96 | "94","-608.8065835323217","-126.89359763706676","-251.69572209955857" 97 | "95","-359.46945487267317","-236.59373006170284","-631.8675431446825" 98 | "96","-279.61692961693126","-246.51674020330367","-126.0320993976593" 99 | "97","-607.279467546911","-123.54655435496976","-358.4280194795308" 100 | "98","-523.2361691214063","-126.31583498557822","-346.4812373698662" 101 | "99","-252.66498349053552","-886.3744512155583","-722.3988202664649" 102 | "100","-378.8365234870072","-477.4602110253436","-491.65237685014097" 103 | "101","-760.4428415435684","-366.84653708434274","-364.4373331261735" 104 | "102","-758.7662102401788","-812.9734440089073","-628.878683288903" 105 | "103","-746.3809079313393","-1.728440252723113","-615.2269106552452" 106 | "104","-364.1019064325824","-234.93555324936372","-748.0680348671704" 107 | "105","-372.28006227724757","-127.40397564805","-150.69691171345994" 108 | "106","-480.17788055493344","-126.97138218013684","-369.0581438130998" 109 | "107","-361.2752228530327","-122.34891843526778","-485.5371659729911" 110 | "108","-268.62117816015893","-741.3229187011736","-121.55617020442693" 111 | 
"109","-740.76657304107","-123.74577976783415","-624.2283041899306" 112 | "110","-842.6143032598803","-126.81563157441114","-127.02249005993028" 113 | "111","-343.78633790043114","-240.7245199260935","-126.42006751752275" 114 | "112","-253.25742357879503","-725.9693976817605","-478.6859009704507" 115 | "113","-375.75748816925733","-241.83188488466664","-245.386861574502" 116 | "114","-127.10649590874188","-253.4920192754294","-245.4803357457932" 117 | "115","-855.7770281201945","-856.7434798626956","-722.8319601347065" 118 | "116","-258.5539453077163","-733.4123563710385","-615.9734633659884" 119 | "117","-251.98170978855453","-602.1772893506514","-250.65694477999824" 120 | "118","-628.0928203621542","-254.63894830078036","-361.9502422969117" 121 | "119","-741.9310487265315","-624.278480543888","-126.49448238166302" 122 | "120","-122.51319324418384","-238.01307222801694","-606.5763214911248" 123 | "121","-751.0230926378633","-316.21636833928227","-267.9429403541137" 124 | "122","-373.8918326691211","-738.5048390733832","-405.46765364363074" 125 | "123","-252.33804435782136","-3.7289719410849647","-627.6035083131197" 126 | "124","-254.2331209854943","-250.2781619332873","-374.3312685604425" 127 | "125","-241.58983118690512","-248.971678599174","-423.7590110049731" 128 | "126","-393.5372651847977","-371.13239997245427","-127.29996680612555" 129 | "127","-363.99654939860835","-363.562823275475","-236.90520750975847" 130 | "128","-354.04126666039207","-254.3101719455957","-249.28281793350365" 131 | "129","-619.1011914084496","-131.805410601048","-518.6712576328836" 132 | "130","-359.7425162817349","-379.0530415397763","-127.82709809123061" 133 | "131","-363.00961528787286","-483.7986079896173","-486.29899849716463" 134 | "132","-375.14229821892735","-492.8011774939644","-124.54901351397056" 135 | "133","-248.75391490400068","-254.0591921903828","-367.3275887862448" 136 | "134","-244.98216217669645","-822.4358798919892","-625.0411501597636" 137 | "135","-254.28109367242845","-377.3571986118354","-759.8825532375758" 138 | "136","-359.6219459610358","-494.95994368396316","-367.7953525143513" 139 | "137","-364.02311083786935","-364.2193786423973","-602.9079343278323" 140 | "138","-244.15399886341936","-741.9953403426082","-488.6507146982818" 141 | "139","-253.74288577901544","-604.3155797827948","-478.395704028197" 142 | "140","-623.2443864081838","-494.46198020575747","-358.7595592381024" 143 | "141","-251.2229895596289","-962.5887588249536","-604.3684605144838" 144 | "142","-210.42071565485605","-375.6608306818358","-772.2430537104963" 145 | "143","-256.11445716465266","-377.8680877837245","-372.7307215233145" 146 | "144","-252.0485890978244","-372.88654902024183","-247.28398735460095" 147 | "145","-694.7222103805866","-488.03778003868996","-247.7609540059671" 148 | "146","-744.3355959059716","-678.4740461924457","-252.72402499819722" 149 | "147","-366.8950712926468","-252.74963133664863","-244.97033851437243" 150 | "148","-254.41162138259693","-728.3699115824627","-238.29496423255063" 151 | "149","-253.52026111148777","-374.23047366700393","-726.9043540826042" 152 | "150","undefined","undefined","-738.9563163731026" 153 | "151","undefined","undefined","-614.5920750839446" 154 | "152","undefined","undefined","-79.9585902605596" 155 | "153","undefined","undefined","-361.0391165909497" 156 | "154","undefined","undefined","-494.2984972803201" 157 | "155","undefined","undefined","-365.2025948442079" 158 | "156","undefined","undefined","-362.28832832207144" 159 | 
"157","undefined","undefined","-375.4668023804831" 160 | "158","undefined","undefined","-239.36075313275416" 161 | "159","undefined","undefined","-375.4795055121578" 162 | "160","undefined","undefined","-475.67522428166114" 163 | "161","undefined","undefined","-492.5259601270471" 164 | "162","undefined","undefined","-718.6021075269705" 165 | "163","undefined","undefined","-370.775464321146" 166 | "164","undefined","undefined","-374.66036800571464" 167 | "165","undefined","undefined","-251.56765861228314" 168 | "166","undefined","undefined","-493.421400134398" 169 | "167","undefined","undefined","-845.0140577914403" 170 | "168","undefined","undefined","-294.3627243838008" 171 | "169","undefined","undefined","-371.5683423511469" 172 | "170","undefined","undefined","-718.3235861826339" 173 | "171","undefined","undefined","-125.1280327870788" 174 | "172","undefined","undefined","-252.11868807030476" 175 | "173","undefined","undefined","-680.6874271084845" 176 | "174","undefined","undefined","-573.8361135328558" 177 | "175","undefined","undefined","-725.9262877393135" 178 | "176","undefined","undefined","-252.7268290096643" 179 | "177","undefined","undefined","-755.4840993877016" 180 | "178","undefined","undefined","-375.9374048581323" -------------------------------------------------------------------------------- /results/DDPG_pendulum.csv: -------------------------------------------------------------------------------- 1 | "Step","distinctive-hill-4 - Reward","gentle-elevator-3 - Reward","different-firefly-2 - Reward" 2 | "0","-1014.8908703377958","-1234.6515419213115","-1293.1580551700167" 3 | "1","-1228.5823469580098","-1748.4348159056608","-1256.2260351822067" 4 | "2","-1650.888066475382","-1120.7598504448088","-1343.6308841010407" 5 | "3","-1650.1720090113251","-1438.4851112275107","-1323.7121862279116" 6 | "4","-870.2751215810749","-1347.803749431169","-1318.3932832099197" 7 | "5","-1745.9542931182166","-1732.943533328439","-1087.5799149922643" 8 | "6","-1704.1108615013434","-1332.2651816570499","-1756.3667660463975" 9 | "7","-1598.1912060928419","-1417.2920644665276","-1730.8651763982054" 10 | "8","-1754.6189805574213","-1073.7557883184222","-1501.1191311421044" 11 | "9","-1631.919625582375","-1060.6231099677295","-1734.497703829973" 12 | "10","-891.0556494942305","-1318.6570227623636","-1448.0368675444463" 13 | "11","-1699.0961792394073","-1726.768623977622","-1425.685305632699" 14 | "12","-1661.9071904723348","-1210.0819289283695","-1477.9706664032094" 15 | "13","-1647.3647976408724","-1308.0466138321042","-1729.0097220466168" 16 | "14","-1721.7175268726469","-1394.5874045792102","-1554.1752944648383" 17 | "15","-1538.3779107291355","-1731.1864549043614","-1680.0336692209335" 18 | "16","-1610.4664753499255","-1621.4790667854697","-1686.389033855811" 19 | "17","-1609.9566874185798","-1495.1909732211213","-1621.3391714260517" 20 | "18","-1610.9182896711059","-1630.3672278877173","-1761.8438700983531" 21 | "19","-1742.590067294859","-1293.579349540004","-1565.8977922127865" 22 | "20","-1547.6087103590298","-1655.9335991646674","-1680.0894437825689" 23 | "21","-1733.971774365266","-1606.9453390886451","-1641.284599632336" 24 | "22","-1467.6631451831959","-1538.8754241738343","-1737.2453850536535" 25 | "23","-1416.3078870591048","-1645.6598372452331","-1451.4258552392805" 26 | "24","-1552.1046107954867","-1431.7030288539029","-1495.56103552205" 27 | "25","-1681.8203027848406","-1399.8577062279721","-1485.8771501690755" 28 | 
"26","-1681.8227450840188","-1536.3403330347178","-1696.0715918336666" 29 | "27","-1502.129128145834","-1401.8427780210136","-1507.793752543204" 30 | "28","-1655.1686347366685","-1352.6543045081469","-1656.680456277359" 31 | "29","-1403.2515599680698","-1411.8179893394486","-1741.3822690826526" 32 | "30","-1496.7948749994846","-1253.2269618503117","-1453.23145551119" 33 | "31","-1710.3461063728982","-1227.4507899412727","-1485.5573356666916" 34 | "32","-1530.1536826780805","-1471.3127807034086","-1717.7944032725018" 35 | "33","-1132.3894766289955","-1292.5894958428598","-1317.6119564930452" 36 | "34","-1357.1737276540769","-1515.2620053849087","-1324.0950503395406" 37 | "35","-1329.4395302067305","-379.97140734092824","-1307.7621888872666" 38 | "36","-1457.528546332126","-1139.243127833966","-1382.3891410969727" 39 | "37","-1404.1955402213237","-1083.6892064909023","-1231.3338050812304" 40 | "38","-0.6408999767368159","-979.5268650792688","-1196.4467156789533" 41 | "39","-1306.30952120033","-1083.511206220124","-1303.5627933242786" 42 | "40","-1321.0233037466194","-1121.9641662940076","-1266.824575348421" 43 | "41","-1.5339299939905462","-1378.7451384928152","-1177.852341997491" 44 | "42","-1240.6517086563235","-1077.9878021135128","-1117.9526979556763" 45 | "43","-1200.4386794239424","-610.415726808469","-1177.7403681974392" 46 | "44","-1337.3520826985882","-1314.4189502478478","-1252.0525045806817" 47 | "45","-1156.427919872121","-1137.4020868396226","-1368.4623077931515" 48 | "46","-1116.2344053490438","-975.7366927593253","-1047.4229372199557" 49 | "47","-1138.2197052958916","-1080.2732516194849","-1121.0454952011614" 50 | "48","-1132.0434626147646","-1133.380325007254","-1296.0016802276564" 51 | "49","-1115.9822756702922","-256.73414348314674","-930.7716297597548" 52 | "50","-1013.7482754050733","-987.3634395038162","-1268.4717286292962" 53 | "51","-1135.8188396581038","-1226.5678189832154","-1016.5827746377354" 54 | "52","-1140.5928853459843","-835.933160459834","-880.1508165985448" 55 | "53","-1755.2545540077103","-1260.2007132393094","-1053.1300349401574" 56 | "54","-1366.1630127980793","-995.9482902423484","-1059.170345900024" 57 | "55","-1135.8513811176","-550.0734112642594","-1224.5417627805518" 58 | "56","-968.9553921534618","-935.1551792774169","-1205.9299833984894" 59 | "57","-1219.9277027682597","-1584.2105432603355","-1261.0263417144265" 60 | "58","-1132.5575153669217","-893.1915450262937","-991.2819617783136" 61 | "59","-1307.466214214299","-846.1427260918325","-1089.9417725705118" 62 | "60","-1290.871530519115","-1131.7491827097213","-559.4916934731575" 63 | "61","-999.1684137806492","-789.5346229865194","-984.0233828026737" 64 | "62","-899.870123803834","-878.7597247328788","-596.5550867555428" 65 | "63","-1251.3300197422845","-908.3859458964148","-274.7380399122305" 66 | "64","-998.44437054126","-772.4037877636262","-922.7477119048294" 67 | "65","-1337.9615123792635","-1102.3868109164287","-1233.0262743442995" 68 | "66","-1369.785636573198","-916.7029932369031","-981.745225116003" 69 | "67","-1274.2820265502364","-920.1465868593798","-953.4291012862631" 70 | "68","-1173.710296800721","-1063.6844861062455","-1001.4416873265482" 71 | "69","-380.24006242301647","-903.4818764492245","-990.2456985786142" 72 | "70","-1268.7380404814883","-975.720310490906","-1015.0212006866686" 73 | "71","-752.2588808266108","-888.9471254677297","-842.0562341654468" 74 | "72","-632.030394036545","-925.4373689970606","-984.525754234405" 75 | 
"73","-715.0567544506468","-1072.0987645541109","-905.8580646768436" 76 | "74","-1109.6489527000786","-971.1192965322409","-1136.907709793159" 77 | "75","-754.6892859402535","-887.0722154217104","-867.5849674698842" 78 | "76","-1178.3733058476923","-891.496327659339","-864.5380196405459" 79 | "77","-1162.3929981833858","-953.9834026939282","-879.3178938015781" 80 | "78","-126.18975889608632","-1009.7709953620043","-373.8398158923067" 81 | "79","-0.4822774162267329","-1013.4140882788087","-753.0206682588873" 82 | "80","-1154.490428073833","-1040.6632998768323","-494.74565612844975" 83 | "81","-1009.8794224055296","-900.0296101035956","-376.10818117299243" 84 | "82","-1031.041230643868","-1014.8616587583489","-742.1470110049436" 85 | "83","-253.19912194802222","-1047.0276784183732","-500.76629329239535" 86 | "84","-626.0441122553259","-1065.1218900185409","-892.5339234547456" 87 | "85","-882.4128799105716","-1065.9579819931148","-498.32630071824343" 88 | "86","-1118.5063695036383","-1067.7863027944845","-738.1586781747395" 89 | "87","-128.91614726978767","-1044.4367912586351","-367.3574740732487" 90 | "88","-1361.6066295763178","-1035.395114655505","-482.89253825714144" 91 | "89","-251.6572839135534","-978.0658887501103","-944.2190832625134" 92 | "90","-1032.7325103799462","-1032.8878348384633","-616.331897290388" 93 | "91","-621.8058043144922","-1053.809607952315","-610.8300770786523" 94 | "92","-877.9851574426241","-1014.9737391317753","-380.95552055732907" 95 | "93","-780.1796797989417","-968.7196207758694","-379.88525117429657" 96 | "94","-772.1581212643387","-1053.080656229371","-816.3325715422277" 97 | "95","-1028.2635843586106","-1031.7023348423247","-737.3066936892914" 98 | "96","-1734.2811930088467","-1023.9050513444753","-613.3831246384843" 99 | "97","-126.70084905456496","-1038.4747443515994","-719.0201777563341" 100 | "98","-2.721048015400562","-982.291679434724","-470.4679965348458" 101 | "99","-373.7255355683881","-1051.9998843711578","-1009.5589297740689" 102 | "100","-129.23949239727457","-904.701658663281","-501.11227884244636" 103 | "101","-2.411681595055098","-881.5999545564415","-761.275133373026" 104 | "102","-127.50051147720444","-886.848190769963","-621.6208583111463" 105 | "103","-1729.073849372669","-902.6629668324613","-500.9556030221846" 106 | "104","-1730.4805045179182","-758.0244018089164","-620.8876214477017" 107 | "105","-811.1790950438697","-756.4644481749871","-622.968283785174" 108 | "106","-126.10197565072214","-841.3462672032948","-616.9808647589098" 109 | "107","-273.39662912690545","-664.6114402234855","-617.8995983324573" 110 | "108","-374.2700052300167","-825.0049744219834","-502.4860194571064" 111 | "109","-502.9296095329982","-635.4916967663823","-621.4225083203901" 112 | "110","-128.1221875615817","-633.844869144544","-503.13034058063624" 113 | "111","-779.9118854330351","-630.8491075681764","-618.176964256477" 114 | "112","-370.85659964088126","-630.1590603503691","-740.3489744566169" 115 | "113","-498.3240668645451","-745.2222333483546","-501.655361908556" 116 | "114","-495.5305294109646","-632.1461223950423","-503.69975756244014" 117 | "115","-495.8137388948931","-713.3129274114458","-383.5883652830756" 118 | "116","-379.18858382464873","-850.0498123503323","-617.8195684458378" 119 | "117","-246.60244453302647","-898.3920108923751","-725.7122690739051" 120 | "118","-127.36901447774515","-880.7778402058325","-505.7342645015156" 121 | "119","-496.11654924972146","-630.7992356393751","-256.3387894950304" 122 | 
"120","-818.4625298629466","-627.8254476011572","-504.5648311223169" 123 | "121","-128.16040963111217","-628.7111178564513","-506.0958870370358" 124 | "122","-500.4972235391573","-760.7869994141167","-688.6888816782961" 125 | "123","-368.7889460418552","-756.0634409661885","-501.8785406737819" 126 | "124","-4.123982556288994","-532.751966168271","-505.91395562641907" 127 | "125","-130.285061604441","-629.8181023679791","-743.8910831256695" 128 | "126","-127.44992278124855","-727.9248832058818","-506.68835898372834" 129 | "127","-704.0209220386806","-513.9131590730371","-521.1568688263317" 130 | "128","-371.58465379028115","-745.9638926596058","-388.1566371196859" 131 | "129","-126.24481231769204","-581.7428801544935","-626.2999528327326" 132 | "130","-499.5563756387748","-507.8744560542396","-744.2392369989179" 133 | "131","-510.4935298288138","-741.6247165245802","-590.6597279956263" 134 | "132","-517.0293650623928","-747.9219618421271","-741.5548052733326" 135 | "133","-3.053966945930094","-734.8424550461597","-593.4836358307125" 136 | "134","-246.23027632634142","-505.57261512747937","-380.6870856738731" 137 | "135","-5.5143908403169695","-252.28839977029534","-755.0853921623545" 138 | "136","-771.6121696123007","-369.216354283389","-754.6482321647215" 139 | "137","-775.8594607463924","-485.2686048787671","-725.9204883316273" 140 | "138","-992.9202205668672","-502.27725680047814","-500.2970624880725" 141 | "139","-872.0168937117568","-715.1087590148088","-724.0188023672675" 142 | "140","-251.9545326620248","-738.5586860030983","-740.7420062988649" 143 | "141","-373.9701852438977","-497.54513641549426","-510.54384670752546" 144 | "142","-130.1061470117315","-501.9153600035508","-742.4689378867871" 145 | "143","-383.0728199597281","-482.1846124587627","-765.4667523082695" 146 | "144","-877.9605353125075","-255.23601705467587","-501.28542466343737" 147 | "145","-125.86250834017832","-380.55481268711355","-1002.4801770986641" 148 | "146","-248.93158219377045","-380.31662374672436","-763.06266781101" 149 | "147","-365.8973088051606","-376.99593378809584","-870.831945685353" 150 | "148","-488.451222818248","-609.0215479130525","-703.5946525548594" 151 | "149","-375.84739894375787","-128.43098730449802","-769.3442444752812" 152 | "150","-1.6405792684104108","-570.5622080678031","-875.6150503105503" 153 | "151","-369.58105485323637","-128.22047407675439","-746.302004180539" 154 | "152","-753.3105806932209","-497.67908760156007","-848.8512132088528" 155 | "153","-127.52568117757139","-249.15289455448544","-623.43173325632" 156 | "154","-689.3974463744178","-127.613903992894","-358.6589280020239" 157 | "155","-366.8047954470363","-2.1746109735538033","-871.5932355847069" 158 | "156","-246.06576043562822","-244.24968428097577","-511.45031908497776" 159 | "157","-127.31660796967752","-244.2917503732065","-857.9146390771301" 160 | "158","-488.29760000483714","-127.04638142137861","-875.4510803870846" 161 | "159","-127.22268097765445","-127.71096352623637","-881.7624568519443" 162 | "160","-494.36634168882375","-244.45262861103635","-809.7516177017757" 163 | "161","-382.39712211760076","-361.2800033888769","-908.4122808353352" 164 | "162","-639.0904597430848","-246.457777127788","-1028.5190874103787" 165 | "163","-123.91200098632223","-124.51245784370175","-878.9751158575763" 166 | "164","-126.60000074907356","-124.02893728433646","-670.7855462587232" 167 | "165","-247.92410485708828","-127.80444858005406","-1011.4968306587824" 168 | 
"166","-123.8354119445842","-772.5475085108203","-902.7104007623808" 169 | "167","-127.69678924765138","-250.34490957101673","-775.9727927530975" 170 | "168","-360.31058015005226","-123.9727434276785","-991.8251816520441" 171 | "169","-127.82498664845247","-364.9506950162843","-512.1098284842739" 172 | "170","-1.8933979828898588","-246.21496868021237","-1001.869722492" 173 | "171","-545.9210175715328","-363.5258360378142","-1006.6113155109227" 174 | "172","-128.5242208501573","-723.5725349381087","-1097.02983129944" 175 | "173","-359.51421456382366","-364.87254428977496","-787.2752406480375" 176 | "174","-126.78445560703422","-1308.8894129590974","-1011.4989906931795" 177 | "175","-541.67372591121","-1.0028825228582026","-1005.9540972055559" 178 | "176","-493.7980393633462","-918.1006745305118","-1020.3326401681294" 179 | "177","-1.0789737318074064","-1732.7255321219754","-1036.7150574346115" 180 | "178","-612.1588178287927","-1135.5252260879263","-1012.4945781969345" 181 | "179","-125.8525026949102","-235.1734412310436","-1026.8287424187897" 182 | "180","-121.78765658566633","-610.4207847025076","-1018.3569890031512" 183 | "181","-121.5912384661951","-939.5156318208992","-1029.358751109285" 184 | "182","-125.0029291996708","-987.5796452282589","-776.4680540723322" 185 | "183","-614.145760115657","-934.7835973722363","-1052.2884270265333" 186 | "184","-246.66769476299706","-118.97777885985995","-896.8803569828824" 187 | "185","-125.32078553488462","-931.110706873264","-1001.3695580312467" 188 | "186","-126.02441342323962","-891.8602043684","-902.6141531374911" 189 | "187","-123.14742549793891","-0.4012107991134946","-1028.7541812722511" 190 | "188","-627.5252806845468","-0.5432344714607146","-987.7368766164466" 191 | "189","-480.29428089398897","-0.3561983269047018","-1036.08898437535" 192 | "190","-383.34380467609583","-0.24450276518238717","-907.445212975036" 193 | "191","-243.45221267084827","-126.21167534074766","-1002.2472826453231" 194 | "192","-504.72742892766985","-126.03234018143127","-1035.9078808714655" 195 | "193","-0.5115398278926065","-360.5906868689056","-960.5018986280361" 196 | "194","-510.04276732410307","-380.8943753917122","-1025.6365511308807" 197 | "195","-248.34102710870053","-491.21651206980636","-1038.0441675426935" 198 | "196","-120.21063081792218","-254.85443282149933","-991.7366348400476" 199 | "197","-123.86507377636339","-650.80290151881","-934.6932209629218" 200 | "198","-367.68408038694673","-126.38537065810877","-1015.3105437747133" 201 | "199","-620.6265141649352","-615.7467062862117","-1007.812899989842" -------------------------------------------------------------------------------- /results/NAF_LL.csv: -------------------------------------------------------------------------------- 1 | "Step","hearty-sponge-54 - Reward","swift-dust-53 - Reward","glamorous-waterfall-52 - Reward" 2 | "0","-71.69764803802343","-291.6938154404446","-232.26251711749805" 3 | "1","-541.785113395403","-450.1837500132589","-534.184977937134" 4 | "2","-366.5903162499464","-394.8652897273437","-606.9814550518843" 5 | "3","-236.473184403032","-569.4546601954753","-624.4889053668571" 6 | "4","-190.4393449409066","-347.43991124298645","-675.6672871360818" 7 | "5","-317.6488551626255","-335.4730453500336","-291.09397165768655" 8 | "6","-480.7661145814981","-334.41167371008817","-600.7621425780221" 9 | "7","-197.15492189029106","-232.1206766520433","-306.03218695440603" 10 | "8","-175.40903082249275","-57.01187879116268","-208.9237906318925" 11 | 
"9","-186.32910709783317","-78.79335262994562","-74.17447087226697" 12 | "10","-303.51182601677976","-235.51333227823494","-248.13581312645027" 13 | "11","-239.47322406294745","-12.12636892577882","-165.22116349961945" 14 | "12","-174.42763003518263","-111.83029932926357","-142.5249596650128" 15 | "13","53.43136112987415","-139.3030922869666","-89.22073055588095" 16 | "14","-274.140712234675","-123.89705038120442","-343.2262929069797" 17 | "15","-217.3835081106759","-75.63240023101514","-265.2704862732581" 18 | "16","-39.80275240967397","-120.79955050588242","-154.5289667765548" 19 | "17","-112.66970988880162","-185.3426660556774","-90.2264579879523" 20 | "18","24.818168298506766","-241.33889486643372","-28.199729576764188" 21 | "19","-101.12629537581702","-57.118885279145786","13.244761530521595" 22 | "20","-84.50012903967131","-119.95959404869647","80.54147305020119" 23 | "21","-101.0516873873609","-41.645229562093405","-222.10621705318277" 24 | "22","-133.88162965527027","-67.77479093307383","-8.627425995517447" 25 | "23","-52.916092223277175","18.349501331816242","-13.7080508463674" 26 | "24","-5.540931442184371","-156.82550757224527","-209.13028361602267" 27 | "25","-65.49809086975219","-23.08050046794603","-24.173702051146506" 28 | "26","-47.49223511222782","-57.7859532090756","-78.6356832865353" 29 | "27","-31.934838220101796","23.528013451751278","-63.53722872028158" 30 | "28","-65.4391604276704","-64.72899877798874","-32.634185233282" 31 | "29","-63.57478559561951","-96.58258390121402","-32.987892350658655" 32 | "30","-90.11329301413706","-226.9820280878752","-44.11070675435227" 33 | "31","-43.692860582520424","-89.7149785878379","-40.49782191433995" 34 | "32","-32.082074813532685","-68.56079046508881","5.30911228688716" 35 | "33","-24.47388807201583","-165.1528427288667","42.443475056644296" 36 | "34","-134.93089038736744","-270.35842363037295","51.252698772587195" 37 | "35","-99.54329608883296","-50.232061969406146","-71.07639651855649" 38 | "36","-74.7896434169972","-130.05047260664546","65.51246826493986" 39 | "37","-123.64760946433174","-115.39948025390105","10.719901682049056" 40 | "38","-138.20077158445142","-96.77032470151752","-29.32839045704368" 41 | "39","-168.77563555206547","-27.356991839858505","97.85218953624108" 42 | "40","-127.83959522728908","-12.744817681186907","9.952919416804093" 43 | "41","-339.5143032154342","-77.11398457486817","-106.8014701273565" 44 | "42","-65.27619737006981","34.75545681623856","-22.007451763396286" 45 | "43","-56.908363964230865","-50.25404452590398","81.07425712875016" 46 | "44","63.510326939586946","-83.91794485517718","104.47021025163204" 47 | "45","-44.00555605676851","-76.40845662829031","-110.52004322679187" 48 | "46","41.78685967591207","-76.89335002501784","-59.25568648154801" 49 | "47","-89.29179548665876","-54.83215695756564","-32.26683224616184" 50 | "48","-47.63257346896588","-19.938096698047566","9.941565215970357" 51 | "49","-53.273731358682056","-52.46792355327812","-26.241935427431134" 52 | "50","12.532550428702617","-30.923257504312982","-16.494224720168177" 53 | "51","9.258378894749356","-30.867998692392387","84.63518149932655" 54 | "52","-259.4708538241011","-64.85959069976312","97.57774875098116" 55 | "53","48.16400566322288","-42.90248694282412","92.10338212188645" 56 | "54","76.59452541938641","-31.760695983082233","60.80584761683444" 57 | "55","-72.15568948963663","-13.18644585303436","65.78134952913007" 58 | "56","25.805539179382798","-28.062717158527686","116.5625561315064" 59 | 
"57","65.65862157111471","7.7060148529590125","-8.186611568873845" 60 | "58","54.5699723095856","-31.22263304145787","67.42656304334461" 61 | "59","33.025346750215434","23.384930933173052","126.32288369207404" 62 | "60","45.944713230913294","0.9475051549144262","54.31101323969931" 63 | "61","112.43680667648127","-30.69334925217595","-53.92296009622311" 64 | "62","0.39351139123287737","58.57668259240208","70.44399937681595" 65 | "63","41.93221518145479","39.6801993450812","46.50095079558376" 66 | "64","50.46545880237349","20.033215676066153","-91.64193311166696" 67 | "65","-30.307791233060485","58.260612894165774","37.59432158555404" 68 | "66","56.79176153552939","51.44504776747688","43.34001479381801" 69 | "67","79.64369165736959","26.070822261861007","-39.80247631669081" 70 | "68","17.687329148648374","30.79536633423954","-39.81122727408331" 71 | "69","43.59537063034602","43.9827830986708","24.98872920431353" 72 | "70","31.64476446810329","9.76172065063003","-58.92586581989944" 73 | "71","-195.5750848575666","35.2072778840149","-180.23475154147044" 74 | "72","-34.40782944543301","4.03183431221855","2.943633019153282" 75 | "73","-68.26096049873465","37.770572518103606","80.38488649415946" 76 | "74","57.518466807017475","57.258739629560246","60.664585245243636" 77 | "75","66.73594439331738","44.714050885737976","129.0842872117239" 78 | "76","-171.927727482154","48.30688855053768","108.58820524059894" 79 | "77","11.652809776074944","61.957373155721335","90.5765766120385" 80 | "78","51.545193191161104","66.03980682329286","112.32145691545276" 81 | "79","47.121306646296176","-30.340546044342446","134.10695631524035" 82 | "80","44.50242356318762","-57.16732681486123","88.88241428485016" 83 | "81","62.094569985067615","5.061849558701425","90.73454251408236" 84 | "82","4.3144985383633525","65.79741403615803","23.807448231922137" 85 | "83","64.68640403145614","44.67996478525876","14.671066595169961" 86 | "84","66.95569933586924","12.460514124101275","36.636899364328855" 87 | "85","32.1530072885227","-303.2749446339088","9.594200810249305" 88 | "86","18.129233387734402","-8.154891978199743","-16.111162582238762" 89 | "87","20.62635363163804","-9.27130766632736","-215.68796931194134" 90 | "88","-36.58113397042307","55.76050004820466","48.48040691130271" 91 | "89","50.512789708393946","58.805926974027194","22.21273733290549" 92 | "90","-71.92460823217253","59.39722959207387","80.64161075612243" 93 | "91","36.95820802321768","-15.356939054129356","37.804840705067186" 94 | "92","-70.38401617871841","-75.97125481350685","22.453585128561926" 95 | "93","69.19067615314171","102.61037725878805","97.68003771929067" 96 | "94","-307.1955206206049","-28.18751285911788","33.41057200460477" 97 | "95","-316.9350937373267","73.37180069110035","89.97535803434279" 98 | "96","-237.20543491205007","43.00121446124036","69.50480228359149" 99 | "97","69.23758495670202","62.88880939149944","120.97507275945951" 100 | "98","-69.3740086740027","50.09806115038106","-12.881406813497335" 101 | "99","42.60601046574539","106.11973884669453","-25.15512339295667" 102 | "100","-4.034586391096468","25.353620535820653","10.138360600404752" 103 | "101","66.99829944134814","45.2621913389375","44.347767928189015" 104 | "102","-327.71035734588236","75.66264098876918","76.49570015131033" 105 | "103","90.39194013560576","53.9225865847396","0.027488666481090718" 106 | "104","105.35825395228838","60.85956672986016","29.588448624075113" 107 | "105","-17.56061722326062","-158.61115004793191","23.77889483835952" 108 | 
"106","66.84148741857133","60.008902561190546","64.07709354793143" 109 | "107","80.37629991550102","67.90727397214344","2.041120654176069" 110 | "108","74.27999491583061","50.658204271725026","-235.59730842484058" 111 | "109","74.74200415006355","20.32763123375331","-144.0141821367423" 112 | "110","55.605627724836324","59.21964368005195","-192.97248579550458" 113 | "111","-27.275100813093065","-174.1933436936868","11.072626655679429" 114 | "112","22.95357550798505","-204.30002389205146","43.10413165453054" 115 | "113","-24.380122233174923","51.48697652679519","53.86877588234768" 116 | "114","37.62730970031893","91.52365952353091","-210.22861308022993" 117 | "115","61.30473693490092","-75.24620913445813","78.65424286484762" 118 | "116","118.92375544174187","-53.501646676251355","61.778984182422" 119 | "117","37.39442251659474","-81.44710174525225","76.66290371463019" 120 | "118","46.11164526555129","9.26265426892263","-27.862954620774147" 121 | "119","75.49737601413969","76.95830472756067","58.72204273481951" 122 | "120","1.6115135269878493","-21.273460553205474","57.479295506443535" 123 | "121","105.55274848703844","64.65567324119314","55.09509883231458" 124 | "122","-29.48497925858912","16.48921386129396","44.687814344970604" 125 | "123","0.743107195262553","84.34679588913663","76.86989634907484" 126 | "124","38.0772251422305","98.33553668744477","66.0399258176603" 127 | "125","5.752011895391192","26.361478901996364","17.218895720856594" 128 | "126","86.30845854471328","-10.76557256634456","66.43484800977204" 129 | "127","85.78695263657643","-127.23632173552974","-5.601463730516031" 130 | "128","9.377072267423657","35.68911251651312","71.70479684542929" 131 | "129","67.94410149295419","25.175597058009984","67.96382090130476" 132 | "130","81.69729846988457","-337.2163961891769","1.1335253724696344" 133 | "131","51.66124489466532","14.20573011623685","40.67800965884592" 134 | "132","40.29642742711442","9.264747236691719","104.78096422535278" 135 | "133","57.68344321843838","-32.475092723742094","102.7264735624355" 136 | "134","0.9735517403102545","68.21778063143067","82.67993581065264" 137 | "135","85.91702147045879","37.415143641427285","123.44617100304433" 138 | "136","41.93134450553518","58.509516849218414","5.201862539650136" 139 | "137","15.9499670636759","41.47248112994001","-1.5494374475555048" 140 | "138","117.87533582562605","-226.10201403711665","127.14451275947202" 141 | "139","55.38629016740985","20.442374359391824","92.70697216199838" 142 | "140","104.96027611618248","45.168950288242385","28.40684081942956" 143 | "141","82.99102964455331","40.22600580745382","108.84464140532137" 144 | "142","-21.063481759719977","-35.687081329965125","45.65450164648457" 145 | "143","-29.293847597119395","30.77418371983086","62.19541909111202" 146 | "144","112.64106882691733","-8.43636615553519","45.090650125837016" 147 | "145","13.65625198873282","-21.27451832370403","-163.45104692826436" 148 | "146","39.83851727835636","16.241192491876642","102.19059574655782" 149 | "147","80.98170438335904","57.81466096269153","-5.330404411810818" 150 | "148","73.72004366299458","78.07466847753483","39.20440073860245" 151 | "149","67.62542253112262","57.912509348964655","23.36576867782074" 152 | "150","51.641230813062265","111.6512115199854","39.47834104110063" 153 | "151","53.06364123041021","62.644775924138216","-5.213189056556047" 154 | "152","33.181671501478036","-13.463356364695093","25.324845462569918" 155 | "153","18.929000040913678","77.99146932210073","7.777591941087408" 156 | 
"154","2.9763345765414613","-57.859069506390796","-45.689542639648394" 157 | "155","24.1001546573723","36.73364875186488","108.85943854025578" 158 | "156","41.683146206743174","81.8005600644232","81.46872044312349" 159 | "157","16.74691642027554","71.41798582627398","37.40443273095178" 160 | "158","31.235497649713807","19.992002575548796","-291.0534273677455" 161 | "159","-14.091021002401263","62.36740295163857","62.29996730313236" 162 | "160","22.8378712693292","10.866946112723937","46.324657593086755" 163 | "161","47.86247123936184","55.220633245151035","0.51930586798224" 164 | "162","99.32011088272114","-9.076456557594","62.415179176796116" 165 | "163","42.244417928564474","65.20109318852248","12.027612427434974" 166 | "164","1.2806345066212117","9.093029816366592","-5.650323349679454" 167 | "165","-30.983091646519654","26.10046373827801","101.02852186780653" 168 | "166","56.64547349939792","57.28089819941644","107.06930248347213" 169 | "167","34.599357358203775","-4.338009932610436","10.79261273579769" 170 | "168","80.3359880626031","105.87998806909926","-0.4070414870349879" 171 | "169","17.013434286345557","60.82558915848031","20.573234974391156" 172 | "170","92.73726154717488","-102.69130481327792","110.18155846364637" 173 | "171","87.39585350846373","19.267728363839225","10.26958562076004" 174 | "172","-4.25193752477648","89.16176230832274","52.81212247055885" 175 | "173","82.75551675302832","-50.15488856942074","85.45277128325525" 176 | "174","-0.7062983889534564","29.889572065186684","-45.006610506349496" 177 | "175","46.169132880473796","-7.026515510245389","-27.95766056099646" 178 | "176","-7.884417952967539","16.8245826146169","62.70085051375204" 179 | "177","19.535322730809085","108.13837210880585","53.4416827734454" 180 | "178","26.21307699789837","2.9012372548412486","48.41126207200463" 181 | "179","65.8352497976442","6.953948114690348","97.97380189708497" 182 | "180","49.479497566530355","-0.0667589833458635","18.81343692927844" 183 | "181","53.52125975773231","11.353059465045249","44.78734981508347" 184 | "182","44.11110967633235","-14.54897095293262","53.557944827646054" 185 | "183","125.90153777153898","-191.3664747714745","2.899851741731979" 186 | "184","125.49286289432152","-7.893538143947154","29.292260872685517" 187 | "185","19.375823864466682","56.17838580644791","21.44752670611676" 188 | "186","128.25247308678792","-7.045760681687881","-31.330727250339606" 189 | "187","103.08797235151113","-226.4018347594501","25.877522633558883" 190 | "188","42.09565226182639","-28.306610980597952","124.51362355292802" 191 | "189","-140.74195433208166","-22.161517666222252","16.406182859646407" 192 | "190","15.37300778258944","11.081986569397017","7.393396737081545" 193 | "191","25.827125843097235","81.1352320894161","32.79281269873201" 194 | "192","25.063007712389336","24.440866564529827","12.840965910878394" 195 | "193","53.645465121455345","87.44381732913773","42.2263151412312" 196 | "194","77.45860259728536","40.67447360757993","82.11756690318622" 197 | "195","53.677891435931926","29.52986864896218","-12.392316131937278" 198 | "196","64.32660892308938","undefined","108.12549267962252" 199 | "197","116.95409774811712","undefined","34.09563711689404" 200 | "198","99.68412875488876","undefined","5.678937869392115" 201 | "199","-7.820449840373016","undefined","24.205380875216434" 202 | "200","48.23303226111768","undefined","26.32809933783851" 203 | "201","53.19529548349627","undefined","-2.030630004711597" 204 | "202","51.480119343731445","undefined","48.829655798804794" 205 | 
"203","88.88876082346155","undefined","110.02519406611964" 206 | "204","94.94606471681622","undefined","29.947185471255807" 207 | "205","23.723290647910005","undefined","46.75455934470986" 208 | "206","81.94284622872975","undefined","-4.08149050780932" 209 | "207","25.78591805760682","undefined","113.14389997906392" 210 | "208","18.79331716969783","undefined","-123.55778949254874" 211 | "209","87.4506249289859","undefined","4.081155372556566" 212 | "210","-135.67873546036623","undefined","72.95192813411344" 213 | "211","81.24030499938827","undefined","40.72753563424877" 214 | "212","94.77785946805199","undefined","62.012828069563724" 215 | "213","82.46804909901014","undefined","-23.390688260211277" 216 | "214","40.42248815861709","undefined","-58.64505079379795" 217 | "215","86.82127458970105","undefined","82.22848105097334" 218 | "216","-23.351795123721587","undefined","30.495516365295828" 219 | "217","37.98762700386641","undefined","52.40071656495099" 220 | "218","51.817642208068605","undefined","-15.564007877997696" 221 | "219","71.22810721638157","undefined","30.36048543002522" 222 | "220","75.31787296191665","undefined","2.7425988199038613" 223 | "221","undefined","undefined","69.38536535074661" 224 | "222","undefined","undefined","82.27444660718164" 225 | "223","undefined","undefined","94.09775570442413" 226 | "224","undefined","undefined","50.582161905914006" 227 | "225","undefined","undefined","62.77876550400319" 228 | "226","undefined","undefined","-9.617879512920453" 229 | "227","undefined","undefined","47.24040231471557" 230 | "228","undefined","undefined","13.989276007718644" 231 | "229","undefined","undefined","-6.495324894668215" 232 | "230","undefined","undefined","33.297504665898764" 233 | "231","undefined","undefined","89.21445298704509" 234 | "232","undefined","undefined","-15.085702215366325" -------------------------------------------------------------------------------- /NAF.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": 225, 6 | "metadata": {}, 7 | "outputs": [], 8 | "source": [ 9 | "import torch\n", 10 | "import torch.nn as nn\n", 11 | "import torch.nn.functional as F\n", 12 | "import numpy as np\n", 13 | "import torch.optim as optim\n", 14 | "\n", 15 | "from torch.nn.utils import clip_grad_norm_\n", 16 | "import random\n", 17 | "import math\n", 18 | "from torch.utils.tensorboard import SummaryWriter\n", 19 | "from collections import deque, namedtuple\n", 20 | "import time\n", 21 | "import gym\n", 22 | "import copy\n", 23 | "\n", 24 | "from torch.distributions import MultivariateNormal, Normal\n", 25 | "#torch.autograd.set_detect_anomaly(True)" 26 | ] 27 | }, 28 | { 29 | "cell_type": "code", 30 | "execution_count": 252, 31 | "metadata": {}, 32 | "outputs": [], 33 | "source": [ 34 | "class NAF(nn.Module):\n", 35 | " def __init__(self, state_size, action_size,layer_size, seed):\n", 36 | " super(NAF, self).__init__()\n", 37 | " self.seed = torch.manual_seed(seed)\n", 38 | " self.input_shape = state_size\n", 39 | " self.action_size = action_size\n", 40 | " \n", 41 | " self.head_1 = nn.Linear(self.input_shape, layer_size)\n", 42 | " self.bn1 = nn.BatchNorm1d(layer_size)\n", 43 | " self.ff_1 = nn.Linear(layer_size, layer_size)\n", 44 | " self.bn2 = nn.BatchNorm1d(layer_size)\n", 45 | " self.action_values = nn.Linear(layer_size, action_size)\n", 46 | " self.value = nn.Linear(layer_size, 1)\n", 47 | " self.matrix_entries = nn.Linear(layer_size, 
int(self.action_size*(self.action_size+1)/2))\n",
48 | "        \n",
49 | "\n",
50 | "        \n",
51 | "    def forward(self, input_, action=None):\n",
52 | "        \"\"\"\n",
53 | "        Returns a sampled action, Q(s,a) (if an action is given) and V(s).\n",
54 | "        \"\"\"\n",
55 | "        # note: bn1/bn2 are defined in __init__ but not applied in this forward pass\n",
56 | "        x = torch.relu(self.head_1(input_))\n",
57 | "        x = torch.relu(self.ff_1(x))\n",
58 | "        action_value = torch.tanh(self.action_values(x))\n",
59 | "        entries = torch.tanh(self.matrix_entries(x))\n",
60 | "        V = self.value(x)\n",
61 | "        \n",
62 | "        action_value = action_value.unsqueeze(-1)\n",
63 | "        \n",
64 | "        # create lower-triangular matrix\n",
65 | "        L = torch.zeros((input_.shape[0], self.action_size, self.action_size)).to(device)\n",
66 | "\n",
67 | "        # get lower triangular indices\n",
68 | "        tril_indices = torch.tril_indices(row=self.action_size, col=self.action_size, offset=0)\n",
69 | "\n",
70 | "        # fill matrix with entries and exponentiate the diagonal so it is strictly positive\n",
71 | "        L[:, tril_indices[0], tril_indices[1]] = entries\n",
72 | "        L.diagonal(dim1=1, dim2=2).exp_()\n",
73 | "\n",
74 | "        # calculate the state-dependent, positive-definite matrix P = L L^T (matrix product, as in the paper)\n",
75 | "        P = L @ L.transpose(2, 1)\n",
76 | "        \n",
77 | "        Q = None\n",
78 | "        if action is not None:\n",
79 | "\n",
80 | "            # calculate Advantage: A(s,a) = -0.5 * (a - mu)^T P (a - mu)\n",
81 | "            A = (-0.5 * torch.matmul(torch.matmul((action.unsqueeze(-1) - action_value).transpose(2, 1), P), (action.unsqueeze(-1) - action_value))).squeeze(-1)\n",
82 | "\n",
83 | "            Q = A + V\n",
84 | "        \n",
85 | "        \n",
86 | "        # add noise to action mu by sampling from N(mu, P^-1):\n",
87 | "        dist = MultivariateNormal(action_value.squeeze(-1), torch.inverse(P))\n",
88 | "        #dist = Normal(action_value.squeeze(-1), 1)\n",
89 | "        action = dist.sample()\n",
90 | "        action = torch.clamp(action, min=-1, max=1)\n",
91 | "        #action = action_value.squeeze(-1)\n",
92 | "        \n",
93 | "        return action, Q, V\n",
94 | "        "
95 | ]
96 | },
97 | {
98 | "cell_type": "code",
99 | "execution_count": 253,
100 | "metadata": {},
101 | "outputs": [],
102 | "source": [
103 | "class ReplayBuffer:\n",
104 | "    \"\"\"Fixed-size buffer to store experience tuples.\"\"\"\n",
105 | "\n",
106 | "    def __init__(self, buffer_size, batch_size, device, seed, gamma):\n",
107 | "        \"\"\"Initialize a ReplayBuffer object.\n",
108 | "        Params\n",
109 | "        ======\n",
110 | "            buffer_size (int): maximum size of buffer\n",
111 | "            batch_size (int): size of each training batch\n",
112 | "            seed (int): random seed\n",
113 | "        \"\"\"\n",
114 | "        self.device = device\n",
115 | "        self.memory = deque(maxlen=buffer_size)\n",
116 | "        self.batch_size = batch_size\n",
117 | "        self.experience = namedtuple(\"Experience\", field_names=[\"state\", \"action\", \"reward\", \"next_state\", \"done\"])\n",
118 | "        self.seed = random.seed(seed)\n",
119 | "        self.gamma = gamma\n",
120 | "\n",
121 | "        self.n_step_buffer = deque(maxlen=1)  # n-step horizon is fixed to 1 in this notebook version\n",
122 | "    \n",
123 | "    def add(self, state, action, reward, next_state, done):\n",
124 | "        \"\"\"Add a new experience to memory.\"\"\"\n",
125 | "\n",
126 | "        self.n_step_buffer.append((state, action, reward, next_state, done))\n",
127 | "        if len(self.n_step_buffer) == 1:\n",
128 | "            state, action, reward, next_state, done = self.calc_multistep_return()\n",
129 | "\n",
130 | "        e = self.experience(state, action, reward, next_state, done)\n",
131 | "        self.memory.append(e)\n",
132 | "    \n",
133 | "    def calc_multistep_return(self):\n",
134 | "        Return = 0\n",
135 | "        for idx in range(1):  # range(1) because the n-step horizon is 1 here\n",
136 | "            Return += self.gamma**idx * self.n_step_buffer[idx][2]\n",
137 | "        \n",
138 | "        return self.n_step_buffer[0][0], self.n_step_buffer[0][1], Return, self.n_step_buffer[-1][3], self.n_step_buffer[-1][4]\n",
139 | "    \n",
140 | "    \n",
\n", 141 | " \n", 142 | " def sample(self):\n", 143 | " \"\"\"Randomly sample a batch of experiences from memory.\"\"\"\n", 144 | " experiences = random.sample(self.memory, k=self.batch_size)\n", 145 | "\n", 146 | " states = torch.from_numpy(np.stack([e.state for e in experiences if e is not None])).float().to(self.device)\n", 147 | " actions = torch.from_numpy(np.vstack([e.action for e in experiences if e is not None])).long().to(self.device)\n", 148 | " rewards = torch.from_numpy(np.vstack([e.reward for e in experiences if e is not None])).float().to(self.device)\n", 149 | " next_states = torch.from_numpy(np.stack([e.next_state for e in experiences if e is not None])).float().to(self.device)\n", 150 | " dones = torch.from_numpy(np.vstack([e.done for e in experiences if e is not None]).astype(np.uint8)).float().to(self.device)\n", 151 | " \n", 152 | " return (states, actions, rewards, next_states, dones)\n", 153 | "\n", 154 | " def __len__(self):\n", 155 | " \"\"\"Return the current size of internal memory.\"\"\"\n", 156 | " return len(self.memory)" 157 | ] 158 | }, 159 | { 160 | "cell_type": "code", 161 | "execution_count": 254, 162 | "metadata": {}, 163 | "outputs": [], 164 | "source": [ 165 | "class OUNoise:\n", 166 | " \"\"\"Ornstein-Uhlenbeck process.\"\"\"\n", 167 | "\n", 168 | " def __init__(self, size, seed, mu=0., theta=0.15, sigma=0.2):\n", 169 | " \"\"\"Initialize parameters and noise process.\"\"\"\n", 170 | " self.mu = mu * np.ones(size)\n", 171 | " self.theta = theta\n", 172 | " self.sigma = sigma\n", 173 | " self.seed = random.seed(seed)\n", 174 | " self.reset()\n", 175 | "\n", 176 | " def reset(self):\n", 177 | " \"\"\"Reset the internal state (= noise) to mean (mu).\"\"\"\n", 178 | " self.state = copy.copy(self.mu)\n", 179 | "\n", 180 | " def sample(self):\n", 181 | " \"\"\"Update internal state and return it as a noise sample.\"\"\"\n", 182 | " x = self.state\n", 183 | " dx = self.theta * (self.mu - x) + self.sigma * np.array([random.random() for i in range(len(x))])\n", 184 | " self.state = x + dx\n", 185 | " return self.state\n" 186 | ] 187 | }, 188 | { 189 | "cell_type": "code", 190 | "execution_count": 255, 191 | "metadata": {}, 192 | "outputs": [], 193 | "source": [ 194 | "class DQN_Agent():\n", 195 | " \"\"\"Interacts with and learns from the environment.\"\"\"\n", 196 | "\n", 197 | " def __init__(self,\n", 198 | " state_size,\n", 199 | " action_size,\n", 200 | " layer_size,\n", 201 | " BATCH_SIZE,\n", 202 | " BUFFER_SIZE,\n", 203 | " LR,\n", 204 | " TAU,\n", 205 | " GAMMA,\n", 206 | " UPDATE_EVERY,\n", 207 | " NUPDATES,\n", 208 | " device,\n", 209 | " seed):\n", 210 | " \"\"\"Initialize an Agent object.\n", 211 | " \n", 212 | " Params\n", 213 | " ======\n", 214 | " state_size (int): dimension of each state\n", 215 | " action_size (int): dimension of each action\n", 216 | " Network (str): dqn network type\n", 217 | " layer_size (int): size of the hidden layer\n", 218 | " BATCH_SIZE (int): size of the training batch\n", 219 | " BUFFER_SIZE (int): size of the replay memory\n", 220 | " LR (float): learning rate\n", 221 | " TAU (float): tau for soft updating the network weights\n", 222 | " GAMMA (float): discount factor\n", 223 | " UPDATE_EVERY (int): update frequency\n", 224 | " device (str): device that is used for the compute\n", 225 | " seed (int): random seed\n", 226 | " \"\"\"\n", 227 | " self.state_size = state_size\n", 228 | " self.action_size = action_size\n", 229 | " self.seed = random.seed(seed)\n", 230 | " self.device = device\n", 231 | " self.TAU = TAU\n", 
232 | " self.GAMMA = GAMMA\n", 233 | " self.UPDATE_EVERY = UPDATE_EVERY\n", 234 | " self.NUPDATES = NUPDATES\n", 235 | " self.BATCH_SIZE = BATCH_SIZE\n", 236 | " self.Q_updates = 0\n", 237 | "\n", 238 | "\n", 239 | " self.action_step = 4\n", 240 | " self.last_action = None\n", 241 | "\n", 242 | " # Q-Network\n", 243 | " self.qnetwork_local = NAF(state_size, action_size,layer_size, seed).to(device)\n", 244 | " self.qnetwork_target = NAF(state_size, action_size,layer_size, seed).to(device)\n", 245 | "\n", 246 | " self.optimizer = optim.Adam(self.qnetwork_local.parameters(), lr=LR)\n", 247 | " print(self.qnetwork_local)\n", 248 | " \n", 249 | " # Replay memory\n", 250 | " self.memory = ReplayBuffer(BUFFER_SIZE, BATCH_SIZE, self.device, seed, self.GAMMA)\n", 251 | " \n", 252 | " # Initialize time step (for updating every UPDATE_EVERY steps)\n", 253 | " self.t_step = 0\n", 254 | " \n", 255 | " # Noise process\n", 256 | " self.noise = OUNoise(action_size, seed)\n", 257 | " \n", 258 | " def step(self, state, action, reward, next_state, done, writer):\n", 259 | " # Save experience in replay memory\n", 260 | " self.memory.add(state, action, reward, next_state, done)\n", 261 | " \n", 262 | " # Learn every UPDATE_EVERY time steps.\n", 263 | " self.t_step = (self.t_step + 1) % self.UPDATE_EVERY\n", 264 | " if self.t_step == 0:\n", 265 | " # If enough samples are available in memory, get random subset and learn\n", 266 | " if len(self.memory) > self.BATCH_SIZE:\n", 267 | " Q_losses = []\n", 268 | " for _ in range(self.NUPDATES):\n", 269 | " experiences = self.memory.sample()\n", 270 | " loss = self.learn(experiences)\n", 271 | " self.Q_updates += 1\n", 272 | " Q_losses.append(loss)\n", 273 | " writer.add_scalar(\"Q_loss\", np.mean(Q_losses), self.Q_updates)\n", 274 | "\n", 275 | " def act(self, state):\n", 276 | " \"\"\"Calculating the action\n", 277 | " \n", 278 | " Params\n", 279 | " ======\n", 280 | " state (array_like): current state\n", 281 | " \n", 282 | " \"\"\"\n", 283 | "\n", 284 | " state = torch.from_numpy(state).float().to(self.device)\n", 285 | "\n", 286 | " self.qnetwork_local.eval()\n", 287 | " with torch.no_grad():\n", 288 | " action, _, _ = self.qnetwork_local(state.unsqueeze(0))\n", 289 | " #action = action.cpu().squeeze().numpy() + self.noise.sample()\n", 290 | " #action = np.clip(action, -1,1)[0]\n", 291 | " self.qnetwork_local.train()\n", 292 | " return action.cpu().squeeze().numpy()\n", 293 | "\n", 294 | "\n", 295 | "\n", 296 | " def learn(self, experiences):\n", 297 | " \"\"\"Update value parameters using given batch of experience tuples.\n", 298 | " Params\n", 299 | " ======\n", 300 | " experiences (Tuple[torch.Tensor]): tuple of (s, a, r, s', done) tuples \n", 301 | " \"\"\"\n", 302 | " self.optimizer.zero_grad()\n", 303 | " states, actions, rewards, next_states, dones = experiences\n", 304 | "\n", 305 | " # get the Value for the next state from target model\n", 306 | " with torch.no_grad():\n", 307 | " _, _, V_ = self.qnetwork_target(next_states)\n", 308 | "\n", 309 | " # Compute Q targets for current states \n", 310 | " V_targets = rewards + (self.GAMMA * V_ * (1 - dones))\n", 311 | " \n", 312 | " # Get expected Q values from local model\n", 313 | " _, Q, _ = self.qnetwork_local(states, actions)\n", 314 | "\n", 315 | " # Compute loss\n", 316 | " loss = F.mse_loss(Q, V_targets)\n", 317 | " \n", 318 | " # Minimize the loss\n", 319 | " loss.backward()\n", 320 | " clip_grad_norm_(self.qnetwork_local.parameters(),1)\n", 321 | " self.optimizer.step()\n", 322 | "\n", 323 | " # 
------------------- update target network ------------------- #\n",
324 | "        self.soft_update(self.qnetwork_local, self.qnetwork_target)\n",
325 | "        \n",
326 | "        self.noise.reset()\n",
327 | "        \n",
328 | "        return loss.detach().cpu().numpy()\n",
329 | "\n",
330 | "    def soft_update(self, local_model, target_model):\n",
331 | "        \"\"\"Soft update model parameters.\n",
332 | "        θ_target = τ*θ_local + (1 - τ)*θ_target\n",
333 | "        Params\n",
334 | "        ======\n",
335 | "            local_model (PyTorch model): weights will be copied from\n",
336 | "            target_model (PyTorch model): weights will be copied to\n",
337 | "            tau (float): interpolation parameter\n",
338 | "        \"\"\"\n",
339 | "        for target_param, local_param in zip(target_model.parameters(), local_model.parameters()):\n",
340 | "            target_param.data.copy_(self.TAU*local_param.data + (1.0-self.TAU)*target_param.data)\n"
341 | ]
342 | },
343 | {
344 | "cell_type": "code",
345 | "execution_count": null,
346 | "metadata": {
347 | "scrolled": true
348 | },
349 | "outputs": [
350 | {
351 | "name": "stdout",
352 | "output_type": "stream",
353 | "text": [
354 | "Using cpu\n",
355 | "NAF(\n",
356 | "  (head_1): Linear(in_features=4, out_features=256, bias=True)\n",
357 | "  (bn1): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
358 | "  (ff_1): Linear(in_features=256, out_features=256, bias=True)\n",
359 | "  (bn2): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
360 | "  (action_values): Linear(in_features=256, out_features=1, bias=True)\n",
361 | "  (value): Linear(in_features=256, out_features=1, bias=True)\n",
362 | "  (matrix_entries): Linear(in_features=256, out_features=1, bias=True)\n",
363 | ")\n",
364 | "Episode 0\tFrame 30\tAverage Score: 30.000\n",
365 | "Episode 100\tFrame 1355\tAverage Score: 13.255\n",
366 | "Episode 200\tFrame 3446\tAverage Score: 20.911\n",
367 | "Episode 300\tFrame 8908\tAverage Score: 54.622\n",
368 | "Episode 400\tFrame 18557\tAverage Score: 96.499\n",
369 | "Episode 495\tFrame 29919 \tAverage Score: 118.37Training time: 6.84min\n"
370 | ]
371 | }
372 | ],
373 | "source": [
374 | "def run(frames=1000):\n",
375 | "    \"\"\"NAF training loop.\n",
376 | "\n",
377 | "    Params\n",
378 | "    ======\n",
379 | "        frames (int): number of training frames\n",
380 | "    \"\"\"\n",
381 | "    scores = []                        # list containing scores from each episode\n",
382 | "    scores_window = deque(maxlen=100)  # last 100 scores\n",
383 | "    frame = 0\n",
384 | "    i_episode = 0\n",
385 | "    state = env.reset()\n",
386 | "    score = 0\n",
387 | "    for frame in range(1, frames+1):\n",
388 | "        action = agent.act(state)\n",
389 | "\n",
390 | "        next_state, reward, done, _ = env.step(np.array([action]))\n",
391 | "        agent.step(state, action, reward, next_state, done, writer)\n",
392 | "\n",
393 | "        state = next_state\n",
394 | "        score += reward\n",
395 | "\n",
396 | "        if done:\n",
397 | "            scores_window.append(score)  # save most recent score\n",
398 | "            scores.append(score)         # save most recent score\n",
399 | "            writer.add_scalar(\"Reward\", score, i_episode)\n",
400 | "            writer.add_scalar(\"Average100\", np.mean(scores_window), i_episode)\n",
401 | "            print('\\rEpisode {}\\tFrame {}\\tAverage Score: {:.2f}'.format(i_episode, frame, np.mean(scores_window)), end=\"\")\n",
402 | "            if i_episode % 100 == 0:\n",
403 | "                print('\\rEpisode {}\\tFrame {}\\tAverage Score: {:.2f}'.format(i_episode, frame, np.mean(scores_window)))\n",
404 | "            i_episode += 1\n",
405 | "            state = env.reset()\n",
406 | "            score = 0\n",
407 | "\n",
408 | "    return np.mean(scores_window)\n",
409 | "\n",
410 | "\n",
"\n", 411 | " if __name__ == \"__main__\":\n", 412 | "\n", 413 | " writer = SummaryWriter(\"runs/\"+\"NAF_test_128_relu_Multinorm\")\n", 414 | " seed = 0\n", 415 | " BUFFER_SIZE = 100000\n", 416 | " BATCH_SIZE = 128\n", 417 | " GAMMA = 0.99\n", 418 | " TAU = 1e-3\n", 419 | " LR = 1e-3\n", 420 | " UPDATE_EVERY = 1\n", 421 | " NUPDATES = 1\n", 422 | " device = torch.device(\"cuda:0\" if torch.cuda.is_available() else \"cpu\")\n", 423 | " print(\"Using \", device)\n", 424 | "\n", 425 | "\n", 426 | " np.random.seed(seed)\n", 427 | " env = gym.make(\"CartPoleConti-v0\")\n", 428 | "\n", 429 | "\n", 430 | " env.seed(seed)\n", 431 | " action_size = env.action_space.shape[0]\n", 432 | " state_size = env.observation_space.shape[0]\n", 433 | "\n", 434 | " agent = DQN_Agent(state_size=state_size, \n", 435 | " action_size=action_size,\n", 436 | " layer_size=256,\n", 437 | " BATCH_SIZE=BATCH_SIZE, \n", 438 | " BUFFER_SIZE=BUFFER_SIZE, \n", 439 | " LR=LR, \n", 440 | " TAU=TAU, \n", 441 | " GAMMA=GAMMA, \n", 442 | " UPDATE_EVERY=UPDATE_EVERY,\n", 443 | " NUPDATES=NUPDATES,\n", 444 | " device=device, \n", 445 | " seed=seed)\n", 446 | "\n", 447 | "\n", 448 | "\n", 449 | " t0 = time.time()\n", 450 | " final_average100 = run(frames = 30000)\n", 451 | " t1 = time.time()\n", 452 | "\n", 453 | " print(\"Training time: {}min\".format(round((t1-t0)/60,2)))\n", 454 | " torch.save(agent.qnetwork_local.state_dict(), \"NAF_test\"+\".pth\")\n" 455 | ] 456 | }, 457 | { 458 | "cell_type": "code", 459 | "execution_count": null, 460 | "metadata": {}, 461 | "outputs": [], 462 | "source": [] 463 | } 464 | ], 465 | "metadata": { 466 | "kernelspec": { 467 | "display_name": "Python 3", 468 | "language": "python", 469 | "name": "python3" 470 | }, 471 | "language_info": { 472 | "codemirror_mode": { 473 | "name": "ipython", 474 | "version": 3 475 | }, 476 | "file_extension": ".py", 477 | "mimetype": "text/x-python", 478 | "name": "python", 479 | "nbconvert_exporter": "python", 480 | "pygments_lexer": "ipython3", 481 | "version": "3.6.5" 482 | } 483 | }, 484 | "nbformat": 4, 485 | "nbformat_minor": 4 486 | } 487 | -------------------------------------------------------------------------------- /results/DDPG_LL.csv: -------------------------------------------------------------------------------- 1 | "Step","worthy-wind-9 - Reward","clear-durian-8 - Reward","LunarLander-1 - Reward" 2 | "0","-656.2349704596153","-1232.446077639847","-871.4527711327439" 3 | "1","-606.9315599562417","-362.6075537678644","-333.91199722412466" 4 | "2","-299.07582736625227","-844.225879318735","-535.6846897929556" 5 | "3","-465.7596706112027","-493.7617585153136","-1093.4060117177776" 6 | "4","-490.1870611527766","-581.1545150372802","-1116.775846542623" 7 | "5","-413.4591762838069","-544.0897545032333","-1598.685981887482" 8 | "6","-761.8122018668427","-303.4457055440265","-990.0868279146364" 9 | "7","-398.0264880676308","-432.8616580003226","-773.8972784717948" 10 | "8","-480.96190878263565","-441.97838755570723","-1012.1765510551536" 11 | "9","-545.5534428922854","-578.1313055203682","-771.7187547176396" 12 | "10","-380.67460910749276","-479.51201239685577","-1236.9446198898643" 13 | "11","-416.66191559049633","-635.0375688481822","-1101.3505752418014" 14 | "12","-387.6562503438423","-469.9530577664599","-759.422913413222" 15 | "13","-428.4898728698861","-520.4831803967079","-1218.1367667622005" 16 | "14","-512.068703972246","-346.1314929020717","-1318.5039917899858" 17 | "15","-380.1183863696718","-750.2892463672686","-860.2880577828657" 18 | 
"16","-358.00917111783417","-778.3620073655693","-835.2100792250128" 19 | "17","-499.1096543830869","-471.64513163442257","-760.2101111395513" 20 | "18","-477.79514695051506","-299.42564362130435","-1245.5428052469624" 21 | "19","-952.5310538632411","-445.24351083045684","-1049.0965546359998" 22 | "20","-405.4357209995182","-440.0069827086226","-811.3693915444396" 23 | "21","-544.1317526913774","-333.048974926405","-778.0548426914352" 24 | "22","-494.5687164843776","-985.99038786499","-802.5826389849589" 25 | "23","-488.5649862412076","-361.6786772215805","-738.9191849225593" 26 | "24","-757.4705297219882","-495.0027817970854","-827.5370392511484" 27 | "25","-325.64703777934494","-525.0886056693319","-703.0924219156386" 28 | "26","-543.7962528587169","-334.39351534293013","-542.4648247323651" 29 | "27","-521.9020644257199","-435.46696867892575","-640.3763961647466" 30 | "28","-410.0136939654431","-474.82000209507964","-696.7727465688231" 31 | "29","-543.933779906307","-550.8068548959058","-774.0089992563345" 32 | "30","-544.0007219205792","-696.9452413440076","-1033.6719968573984" 33 | "31","-466.69823664915384","-753.530681279615","-882.9730006393881" 34 | "32","-614.1358765483768","-589.2051736254609","-742.6433092325998" 35 | "33","-368.0245931730946","-481.55618952412703","-768.0779648702174" 36 | "34","-558.9677735406351","-563.2712555533844","-810.4658831079795" 37 | "35","-652.1573519515572","-467.405072291665","-753.5850597156152" 38 | "36","-368.12775224148993","-579.8037799973308","-1354.4487169633733" 39 | "37","-287.85018008910686","-770.4280914240117","-744.5887282723962" 40 | "38","-567.081059947277","-255.02091813661707","-676.0776951089255" 41 | "39","-387.22579364411484","-682.1188773208833","-662.6398526759023" 42 | "40","-547.2488724463165","-332.4927245493347","-580.705266980735" 43 | "41","-684.0132469458943","-399.61343528214877","-731.9042962226848" 44 | "42","-501.893942163492","-666.1628373434576","-770.832230749414" 45 | "43","-377.28291194093293","-766.7141471604739","-665.5880538219366" 46 | "44","-616.5791048749268","-407.77027609072616","-593.9492907153638" 47 | "45","-338.6334061927238","-538.0314982859259","-777.156161776279" 48 | "46","-289.586331329984","-480.19684612521166","-689.5284666274706" 49 | "47","-719.8065839236473","-597.1484172426283","-678.1161153887589" 50 | "48","-524.8354691073605","-843.4816939704706","-750.608060253753" 51 | "49","-480.73967449413567","-507.6418622359064","-690.9635858666207" 52 | "50","-369.09464303691345","-477.8419350357239","-604.4650598071274" 53 | "51","-688.4450446683236","-359.21759305809763","-668.224392417359" 54 | "52","-301.0234108121557","-857.8778957627284","-715.0524889709849" 55 | "53","-481.1658239751068","-327.2894379524002","-653.2393039775792" 56 | "54","-985.6510293112174","-498.0468665263525","-755.313372645047" 57 | "55","-615.0906442999467","-494.9374009150675","-597.6298745797784" 58 | "56","-646.8581034836575","-512.0134322887893","-592.696970843904" 59 | "57","-428.51884813079226","-661.2874102711837","-657.1158043793669" 60 | "58","-478.9865258002469","-659.7092305513728","-642.9674731762944" 61 | "59","-489.9223786425172","-611.9164487634131","-408.7076619099698" 62 | "60","-523.0442004866052","-429.42566604132423","-698.1800350562156" 63 | "61","-489.4860734187111","-469.55733573501686","-413.0732749886968" 64 | "62","-601.3344580583746","-339.97378990344043","-396.4854893059337" 65 | "63","-486.37154322359163","-439.3049705624169","-642.1733256271386" 66 | 
"64","-520.6983919493027","-395.2936206883074","-667.2918431335422" 67 | "65","-611.3199413291693","-438.1714840546313","-744.7384333849802" 68 | "66","-517.1914650181401","-514.4512861627954","-626.5474337346997" 69 | "67","-550.4694377045046","-853.2134108093971","-447.98958746164635" 70 | "68","-514.3029767294804","-402.6815453865903","-619.9341637907661" 71 | "69","-727.222639351051","-311.630519230941","-405.89137346250266" 72 | "70","-435.6974329085411","-305.3967015878957","-661.4648375518844" 73 | "71","-555.2514766622103","-401.95328790168884","-445.16408636970334" 74 | "72","-551.6852647725816","-465.4041669250543","-559.7946188822552" 75 | "73","-390.9188217948009","-494.9409099640185","-598.357837979254" 76 | "74","-522.1227080739695","-475.1508259117603","-556.5418617575851" 77 | "75","-782.4384142913484","-800.2819298780769","-684.3048348793327" 78 | "76","-664.6973128093464","-850.6623090246273","-608.5401491168935" 79 | "77","-780.3177640297124","-416.63786164133285","-732.2239101166672" 80 | "78","-467.1588790171026","-572.738090122171","-595.3769547154736" 81 | "79","-450.6718420319592","-584.6370067555893","-657.1320984297377" 82 | "80","-298.70025500862687","-345.399791630802","-691.5184116270381" 83 | "81","-509.4457068481158","-557.3530270778464","-708.0988387713472" 84 | "82","-448.79795742737724","-731.1021967264305","-586.0724368073056" 85 | "83","-331.49417786878723","-629.4079835823451","-687.9373583457736" 86 | "84","-823.649430893865","-478.34039314264385","-618.1388641080662" 87 | "85","-725.7844207759705","-512.3137689065752","-673.3568135473901" 88 | "86","-320.0079955282289","-590.4601247140961","-564.0249374709142" 89 | "87","-283.1975801566473","-547.6465010654608","-493.61080907056686" 90 | "88","-474.4800861881888","-407.620092281521","-414.0512018974532" 91 | "89","-323.1602082968675","-442.70460530006704","-761.4813480357855" 92 | "90","-579.0652843716615","-296.94083983103747","-565.488469325459" 93 | "91","-692.4226270421058","-545.5653359326715","-719.9993738988571" 94 | "92","-546.4437511527767","-393.6434014882429","-613.9500312040021" 95 | "93","-755.5637893498994","-462.0751225298443","-719.4302842000519" 96 | "94","-487.0725890486235","-793.915426600328","-773.7376792349552" 97 | "95","-380.67441518790935","-555.4791453032058","-634.1587565434025" 98 | "96","-335.3575716033878","-479.6420341351927","-729.1387114339555" 99 | "97","-761.3626959963369","-791.1790015297664","-781.9609468810324" 100 | "98","-678.189043677128","-123.50891456835522","-495.2479367731881" 101 | "99","-485.6380371404441","-735.5754284191216","-769.5090273814137" 102 | "100","-474.28133061372125","-424.9368658155397","-666.4729427136721" 103 | "101","-490.9268513527006","-562.0168647332132","-525.7823103380333" 104 | "102","-805.6604558640412","-637.6083146750465","-512.0781126725656" 105 | "103","-358.6041241479978","-378.41052102881804","-447.1859923670597" 106 | "104","-482.9796251181012","-435.4409394585782","-646.4511063731119" 107 | "105","-490.6576655814637","-522.4356918106175","-786.3717388224788" 108 | "106","-619.0924291144104","-468.6365865501658","-621.8670198140128" 109 | "107","-508.5397688171766","-953.4546652287181","-715.3703323707778" 110 | "108","-308.6660364171721","-379.82217603590607","-829.7172038611664" 111 | "109","-289.52476658577496","-525.2966594466163","-720.2901031819084" 112 | "110","-390.7532844954275","-964.6463651759207","-739.3553490683041" 113 | "111","-1103.6592741156196","-505.76612541472576","-720.9931541963614" 114 | 
"112","-463.24032857488316","-725.7367249690103","-763.0816029101907" 115 | "113","-614.9114653778513","-792.5419707418632","-710.0947396829081" 116 | "114","-780.0134172109538","-443.0569563971642","-560.3584155315893" 117 | "115","-353.15525199731826","-524.1569416968375","-590.2245109369758" 118 | "116","-773.0136278499403","-565.016940914567","-604.595077631602" 119 | "117","-489.3059582866033","-530.2823476295551","-834.7553628104603" 120 | "118","-501.1286656392006","-437.54202665954153","-776.58731439591" 121 | "119","-696.5351199680094","-430.46408871521385","-570.3477650759518" 122 | "120","-759.4398634766363","-517.8352655141771","-754.0232172353019" 123 | "121","-575.593781730606","-535.0841637009349","-434.27313790594866" 124 | "122","-432.1722090506599","-327.15918668340623","-411.4137099400599" 125 | "123","-295.32684893290985","-563.4823763452666","-494.923757143851" 126 | "124","-513.4344055140198","-536.7544747680977","-250.5549932636558" 127 | "125","-424.24841763675784","-548.7466459035206","-490.3531668769138" 128 | "126","-428.3881360747998","-431.63175350433255","-449.5844821112869" 129 | "127","-521.6644735385273","-600.3226395757581","-371.2732236300409" 130 | "128","-485.78914083537563","-681.7847617886011","-364.35578615792923" 131 | "129","-609.5648431588505","-441.5856619995626","-295.12166508429766" 132 | "130","-478.9379907266574","-410.1725523741017","-270.6694426389655" 133 | "131","-522.8849791430778","-488.2466317003905","-473.9173780407686" 134 | "132","-623.099458190983","-773.3594051672519","-375.8500422124422" 135 | "133","-484.27714983288564","-858.2032717602127","-479.4553963452504" 136 | "134","-870.5141845952616","-780.5205672148847","-484.1416888189992" 137 | "135","-530.7019337182662","-466.90895823998096","-477.5158855573395" 138 | "136","-907.4400343912099","-683.0616184570803","-404.8149477497501" 139 | "137","-263.12973084873806","-920.8669134899671","-179.21584298201066" 140 | "138","-125.125121507385","-781.0932422413844","-364.30206719769353" 141 | "139","-181.59473967461633","-509.86140527740986","-534.3886004149905" 142 | "140","-360.8400683119522","-629.611258880177","-583.6566947388371" 143 | "141","-172.94258050423232","-502.87048306556244","-509.4952244969877" 144 | "142","-103.1033063174891","-542.7377153528857","-115.7188476682305" 145 | "143","-224.98623908698994","-759.8802018638888","-377.65535632433586" 146 | "144","0.60095907488747","-890.0235792044041","-304.0032944837502" 147 | "145","-204.06016062302072","-430.9554924935217","-440.79120769289375" 148 | "146","-271.21284054927844","-698.7913706547465","-160.2216824618052" 149 | "147","-155.90419668533076","-561.8158048275375","-154.02486920468334" 150 | "148","-261.4111960788215","-516.2984135784983","-184.52066253446407" 151 | "149","-261.3614146345793","-718.292961710627","-303.9429053387993" 152 | "150","-191.40027565107363","-612.0304367927585","-273.12493813020683" 153 | "151","-160.33217024376842","-900.555984064431","-519.8433763128652" 154 | "152","-203.68573788928418","-518.5104646738508","-317.4007316816266" 155 | "153","-197.92707839372565","-748.8410873808318","-49.05807389799265" 156 | "154","-174.1698926046582","-386.5055448726194","-112.87409651653853" 157 | "155","-263.8050047390137","-547.1637284194983","-39.20102858544662" 158 | "156","-101.80422415552863","-521.4555776098387","3.6019076048900445" 159 | "157","-163.62859847079983","-499.09378769045776","-36.21731863714108" 160 | "158","-130.39388046229715","-499.7059432623454","-110.28815035540698" 161 | 
"159","-136.54661987248636","-468.9421326213592","12.153111895998478" 162 | "160","-157.02078699154714","-398.8652201296378","-221.53427862027098" 163 | "161","-116.90861935463708","-752.6930248508831","177.79200051772744" 164 | "162","-151.20583535130953","-826.9610551236725","-258.45684129011954" 165 | "163","-130.08230703610946","-524.8522207795693","-215.75991640652128" 166 | "164","-117.61811930100401","-811.5600646687445","-175.42580014478904" 167 | "165","-115.89298353288223","-521.6731812741011","-56.5985313663803" 168 | "166","-123.51902798537469","-476.80292695177525","-49.770585727478206" 169 | "167","-140.3998239594918","-526.5112490987301","-85.17443560506558" 170 | "168","-120.53487494858172","-574.588948822335","-130.1312324602852" 171 | "169","-138.8769894385513","-809.6827888642387","-323.8505782148742" 172 | "170","-109.16916533335187","-359.9511344681253","-325.184070417828" 173 | "171","-117.73475741525063","-628.1695789714016","-207.96895254165798" 174 | "172","-127.0562076787084","-493.54439799706415","-38.66578115500117" 175 | "173","-132.47637693817012","-408.17753383963895","-237.46836603306107" 176 | "174","-127.43918426758614","-637.8412233910685","-158.97624740000632" 177 | "175","-134.53155503807426","-440.2015545603285","-107.79634373593102" 178 | "176","-122.22428082507071","-658.4160520442437","-108.35653301903618" 179 | "177","-117.03025607773942","-907.683572238813","-181.63229388342023" 180 | "178","-112.42340974821711","-391.2220066346642","-136.0638739636085" 181 | "179","-128.44469056954216","-2152.670500438436","-180.40051013919708" 182 | "180","-121.85967289470554","-424.67097100236606","-109.32978339591097" 183 | "181","-103.58674194073164","-417.89649077670447","-53.922237378181386" 184 | "182","-128.9406405479034","-465.036052276672","-176.62890774126066" 185 | "183","-114.60398518988661","-488.4899531863053","-125.69372179424067" 186 | "184","-118.98443369032833","-809.5316279388953","-171.11733430191833" 187 | "185","-136.8306037839402","-634.8757622154034","-156.06507643827223" 188 | "186","-96.02454053817655","-542.5719460443144","-164.16585082417797" 189 | "187","-130.91825972765255","-447.77971257556123","-276.68940045518445" 190 | "188","-144.7904040706659","-964.0911079255602","-194.55149896312338" 191 | "189","-138.72566524525848","-444.1100176495516","-161.0378899788443" 192 | "190","-98.10174281210453","-537.9876993962997","-247.6769432885764" 193 | "191","-116.65666898690394","-680.9490146078881","-110.90267700366104" 194 | "192","-108.67809094384404","-516.2378003383224","-138.18359979225033" 195 | "193","-116.53574225524189","-545.0313902786235","-119.05801165146347" 196 | "194","-100.74592726425655","-525.5771271033414","-167.78500257201515" 197 | "195","-95.00419669102365","-867.2103160644854","-130.12230098443732" 198 | "196","-106.79439443902308","-669.711581425577","-137.8499864469149" 199 | "197","-137.8456619720501","-999.7687522461538","-101.6288894024622" 200 | "198","-122.49677513557175","-378.33200086355475","-143.95010763489748" 201 | "199","-131.33677056184465","-297.92461209742055","-131.17744479257772" 202 | "200","-147.48811846454532","-451.81673825378573","-197.5538168605541" 203 | "201","-122.93392215002703","-772.1303693680135","154.99859516301066" 204 | "202","-143.7646892414896","-568.9061413330601","-306.0534250590429" 205 | "203","-91.99606431952701","-600.78919528831","-214.4021177584915" 206 | "204","-111.20244214010896","-941.5397453832841","-138.2369710822254" 207 | 
"205","-116.89442947517232","-474.8768331767758","-180.3222333896975" 208 | "206","-121.23241859964654","-716.1451180363291","-118.01234542885085" 209 | "207","-111.81676502931823","-542.4090802661374","-135.5173014533819" 210 | "208","-101.38775204516402","-554.5516980893548","-110.28738505251249" 211 | "209","-114.9813362317602","-1007.5460501749815","-126.94626205642264" 212 | "210","-135.13525916778207","-502.8005034072977","-131.21219090161512" 213 | "211","-125.96357107115352","-441.9238333547441","-98.85423823045893" 214 | "212","-99.9388448179653","-529.3910670842095","-110.186403912855" 215 | "213","-125.39597131805654","-450.69072585422947","-108.56469121038913" 216 | "214","-142.10537905845752","-532.1108964178857","-125.35738070596545" 217 | "215","-124.85092939576798","-368.7578102111931","-134.14099212330373" 218 | "216","-120.74007907133615","-628.1724971334033","-62.372524299154215" 219 | "217","-126.76199055403649","-617.7069919703612","-127.06457227754302" 220 | "218","-119.18461901725566","-815.133560214356","-175.0704291797204" 221 | "219","-113.33334177826652","-355.3666224096724","-86.94868613392262" 222 | "220","-121.18891721424347","-1004.2293480632435","-153.59475328202072" 223 | "221","-120.47747334402735","-787.1223036165562","-132.06954850234087" 224 | "222","-102.27505294774231","-655.0931677664347","-220.24492215021513" 225 | "223","-129.97593755839284","-511.96036450424","-189.74470112462723" 226 | "224","-105.84807393669203","-512.0013224556395","-106.74347253047007" 227 | "225","-81.905792053873","-481.4548296614937","-85.69459974056771" 228 | "226","-123.69444199707436","-512.4344683202103","-181.66348246244215" 229 | "227","-129.0446207517882","-460.0524578741994","-154.52641802196797" 230 | "228","-121.7273757080151","-467.3431624505533","-164.67272671837458" 231 | "229","-96.37944990637891","-681.6618085987433","-120.38166965944377" 232 | "230","-134.62269433510608","-643.254843252703","-207.40657525875488" 233 | "231","-96.9672120444743","-836.631805623723","-90.85392299962487" 234 | "232","-125.56209044348235","-515.2884965079511","-108.71513636283817" 235 | "233","-150.26773645808362","-795.9828701243089","-131.4096343484792" 236 | "234","-134.27858983689632","-673.2742407453065","-102.72736231993444" 237 | "235","-108.70095884651093","-600.2707314124677","-200.65985692035736" 238 | "236","-96.06867153765718","-424.17299611056023","-120.25619524596773" 239 | "237","-111.39990012782302","-569.616962380268","-112.93899879799473" 240 | "238","-135.98263613441154","-765.4973692022082","-89.54543079343428" 241 | "239","-97.2958065157226","-619.8497461698526","-107.04509634253897" 242 | "240","-129.66142367286713","-456.7699104708709","-173.25309113406803" 243 | "241","-109.81487640550439","-919.9611911973003","-82.41738130435304" 244 | "242","-109.66510981267294","-707.9974090740034","-194.79042511622566" 245 | "243","-126.97158715720388","-594.5028366811187","-177.15321744861058" 246 | "244","-122.49152175367323","-422.5603039952455","-116.0657113063903" 247 | "245","-138.6200806006812","-472.90266538375874","-82.41933872467304" 248 | "246","-127.48331952508003","-697.587850033909","-177.4389981588664" 249 | "247","-97.5372429448577","-322.7559139745363","-136.28795854035548" 250 | "248","-87.8076035440195","-471.14932090317615","-125.37542794816372" 251 | "249","-122.44787422662365","-348.605738305443","-102.49385857027126" 252 | "250","-116.51285137131327","-466.624846451787","-190.49758165532808" 253 | 
"251","-111.96697061891541","-496.1704501611751","-206.49976948672747" 254 | "252","-121.1029575651828","-579.562297989145","-245.43282884464625" 255 | "253","-119.9922603722093","-377.7292617823259","-95.53117265634123" 256 | "254","-121.14311046610062","-472.4452912729493","-175.13361965226244" 257 | "255","-126.2524476335177","-503.4901920506746","-85.14325835823517" 258 | "256","-121.30377983734468","-443.9552522964548","-142.5256092282851" 259 | "257","-105.02663033980649","-496.04331580350333","-79.46348279207582" 260 | "258","-118.48012759015378","-553.2455962399229","-152.563789239736" 261 | "259","-111.04369438237912","-329.18041992995825","-189.00083603607862" 262 | "260","-116.08937792376527","-590.969050799881","-154.60124850095943" 263 | "261","-142.12723990035272","-495.9099366240238","-67.8867342349637" 264 | "262","-122.0145244006904","-768.61881734702","-85.65537059262816" 265 | "263","-1105.4215800706552","-506.6473235931961","-88.76463678202147" 266 | "264","-696.4885471546122","-391.13866511560695","-141.6259113575208" 267 | "265","-1387.2527961076769","-618.8828920414521","-119.30658602194083" 268 | "266","-1244.758839270255","-465.0950250017259","-106.90878282187879" 269 | "267","-312.0995316499294","-396.557086560863","-128.86149502523125" 270 | "268","-113.71111377377414","-377.93117747318036","-64.61481190495566" 271 | "269","-302.78224438086886","-478.5174940442825","-76.8609365706423" 272 | "270","-107.4636037153955","-483.8262593684138","-63.87120185991526" 273 | "271","-282.2411135930122","-470.0970413764786","-86.4171541550933" 274 | "272","-114.39541055004534","-770.6770277148695","-89.61887644950502" 275 | "273","-347.3088544091886","-813.6249758850307","-85.95347886649044" 276 | "274","-96.32871385292724","-467.0848683389119","-40.741219229570284" 277 | "275","-171.86899592401522","-750.4433125701037","-90.40761411409008" 278 | "276","-219.33467903760524","-457.74543148464187","-205.93017830806264" 279 | "277","-212.77398292233414","-381.83884394851424","-57.0254575492817" 280 | "278","-185.65879306149748","-503.10651881065525","-59.54759961914881" 281 | "279","-253.70608373385184","-770.6373401068967","-103.84309588050434" 282 | "280","-264.8150665225087","-734.6565471161141","-59.56767653289642" 283 | "281","-153.76392535694322","-746.8967656178219","-38.08927883094939" 284 | "282","-206.3657572650061","-584.6795740923124","-49.14650504638085" 285 | "283","-80.87176152477348","-625.9960757415747","-1.993757679987565" 286 | "284","-324.93589747880696","-558.0805155375509","-59.6561681658538" 287 | "285","-402.9939657979853","-400.05066588223605","-8.25553965897273" 288 | "286","-316.84437639717225","-506.1068044803519","-47.664180872565666" 289 | "287","-224.68642118694356","-577.565039442808","-51.78220066837252" 290 | "288","-171.78861895680967","-731.6318394933686","-31.058545637082908" 291 | "289","-277.5494013444843","-492.0681179856491","-84.28132619211914" 292 | "290","-185.456204305152","-479.5890196251803","-73.63544683533324" 293 | "291","-191.13181900378487","-437.83893876888715","-28.386657043089805" 294 | "292","-289.0516597775918","-413.46449275003766","-64.11090567458879" 295 | "293","-198.864231067241","-719.233723799087","-108.82941816932355" 296 | "294","-310.5984342121935","-507.29886303577996","-64.57853172754412" 297 | "295","-113.30875256873063","-905.5318973813063","-39.49491584310586" 298 | "296","-199.2454400889278","-593.3517970913795","-24.214273712168538" 299 | "297","-115.62119522148706","-439.049221154147","-41.23340708718757" 
300 | "298","-133.245089529561","-560.737524185464","-92.28564105628462" 301 | "299","-171.38699549105118","-411.3652328476084","-150.96863793064995" 302 | "300","-193.83317827841688","-837.422788520665","-43.29851995866316" 303 | "301","-204.07096452991507","-542.5949372231667","-5.458917414816712" 304 | "302","-203.39711273057063","-422.68799164912645","-75.84883167789444" 305 | "303","-161.63870856270302","-450.61296147614246","-29.87581497753667" 306 | "304","-258.95692371033624","-380.9184367982521","-90.23094493866257" 307 | "305","-162.9224220020841","-420.0846363481108","-21.523500100903973" 308 | "306","-279.0041072627488","-490.6655556915556","-69.83592150719636" 309 | "307","-194.2392979047792","-720.6136912888961","-90.27350211952839" 310 | "308","-256.797493465445","-604.1675472400839","-75.90629577499264" 311 | "309","-161.3016140021087","-483.0990797433246","-45.301793176454595" 312 | "310","-152.35873123771134","-429.9413771387372","-71.12975009469112" 313 | "311","-134.53204832815942","-497.4310907752141","-77.84349954357457" 314 | "312","-205.5629219187476","-819.8798922178632","-47.473917002062635" 315 | "313","-202.60131763880656","-381.71797795853104","-46.340675345688375" 316 | "314","-171.40401799515476","-518.3305447603067","-29.103704072967307" 317 | "315","-98.97931591244358","-378.97832896721985","-121.57307246647581" 318 | "316","-131.5781987248954","-996.3762444397994","-197.4699782658342" 319 | "317","-277.8780721201448","-723.8774724176352","-99.97765502156547" 320 | "318","-120.26298364536459","-564.8714309160829","-151.9558351973933" 321 | "319","-168.66526326387384","-698.9831432974378","-167.14608162029708" 322 | "320","-139.22820030887482","-810.1527644073607","-179.70142288644314" 323 | "321","-163.2329553327505","-507.7538773092271","-204.90745809025026" 324 | "322","-169.02371975389008","-562.0057307343068","-149.49624469116387" 325 | "323","-114.75045088732247","-462.0289195029545","-187.4855401166493" 326 | "324","-163.82001382864493","-813.3067382166356","-218.5622424611636" 327 | "325","-119.88395551522109","-841.6311898293313","-159.35249801678276" 328 | "326","-125.29361182161153","-835.9480626815241","-128.3604752514929" 329 | "327","-102.67821630211643","-1045.846814357782","-69.2820292098925" 330 | "328","-112.18779705797192","-771.6799419262793","-188.74909681309845" 331 | "329","-107.860725215474","-624.7331141145635","-30.906016524429255" 332 | "330","-127.94358669677375","-567.0884361431658","-60.6412134050449" 333 | "331","-232.42302599289536","-419.1549075879055","-61.16422702237969" 334 | "332","-117.6135746994279","-469.4036733451284","-107.24147027025414" 335 | "333","-260.9363518313247","-816.094423398598","-141.71301028721751" 336 | "334","-103.02395285598237","-582.303808885699","-65.39315754130975" 337 | "335","-78.11896893218095","-512.1813581962497","-68.97051402269584" 338 | "336","-86.85982936241825","-730.5136794162034","-39.94211387552685" 339 | "337","-208.64807571133835","-485.4206893069436","-64.39594970180455" 340 | "338","-134.13326373937525","-469.7609099923807","-73.11532838184986" 341 | "339","-151.75853763178057","-483.3715169383872","-34.13449932508156" 342 | "340","-86.27932534688354","-546.4959142741681","-84.63581096808147" 343 | "341","-115.9389358486199","-588.4133890383822","-35.71913371005276" 344 | "342","-209.25024100163753","-641.1213532386299","-87.1421726711027" 345 | "343","-103.12464948610368","-510.8668682002961","-60.74243089553574" 346 | 
"344","-100.6475094687794","-774.5895051946923","-9.95333150952496" 347 | "345","-195.7054699058398","-1021.8713548003683","-6.7324139997995065" 348 | "346","-114.20565300119294","-451.21820450037774","-20.215792405394513" 349 | "347","-143.4358352072806","-594.3137972511611","-17.764145611514788" 350 | "348","-167.18248004914363","-620.192961435053","-22.118953538298978" 351 | "349","-170.29365099390003","-525.53759488739","-94.39716988302573" 352 | "350","-244.18474161982786","-473.8394610451011","-7.491707954537739" 353 | "351","-110.74034705653668","-856.6223825480339","5.792005640147377" 354 | "352","-155.14081325534377","-682.3277702383568","-17.702860570456618" 355 | "353","-316.15949108776977","-507.3910752695628","-33.51257810424855" 356 | "354","-244.4319514797828","-326.5689252280305","-58.38022566348539" 357 | "355","-256.25136767491983","-664.3508768240554","-28.28552669479689" 358 | "356","-233.52728783015166","-646.1183401917162","-49.309514530255406" 359 | "357","-99.01582326485756","-487.0961872466745","-28.803269317294752" 360 | "358","-89.14229439789742","-402.1841925564481","-41.055972814319354" 361 | "359","-228.41996039631604","-688.1146297809754","-37.608322487516055" 362 | "360","-117.99224361525734","-501.27197760001394","-39.618616449487234" 363 | "361","-119.72741155186166","-495.5251647895783","-52.05999946971883" 364 | "362","-157.71889516127857","-676.910185517625","-37.09677303587994" 365 | "363","-104.13853155414256","-825.3983350729663","-10.757174268604475" 366 | "364","-116.47941276573188","-615.8414726703667","-53.06882591770582" 367 | "365","-77.57923599458849","-620.3654854634165","5.315627738472863" 368 | "366","-70.48978129245775","-586.0900487619023","-40.930024084014754" 369 | "367","-66.86433779362157","-685.0564987900971","-7.414629159475513" 370 | "368","-79.3453645126797","-768.7153977613852","-46.04434030051223" 371 | "369","-83.17557461872279","-334.5267727611688","-24.149179243108883" 372 | "370","-20.32494293808319","-414.44330598899603","-26.855813908496113" 373 | "371","-66.37029513103677","-436.84717295242126","-13.048732854943207" 374 | "372","-38.856508609054075","-815.535637467654","-55.59158782821737" 375 | "373","-34.03125533477822","-720.4387925817306","7.2565342826592705" 376 | "374","-20.029013599947792","-468.97482762036697","-16.81415463879616" 377 | "375","-45.19408815702155","-552.7956620681183","-4.414663809143664" 378 | "376","-50.005819033325224","-391.3058595842497","-12.41019268640974" 379 | "377","-44.77635951274023","-505.3140611763797","42.4954227019313" 380 | "378","-65.34520035098417","-611.8847139043174","-36.30796617736363" 381 | "379","-37.97923040791106","-828.6837718206027","0.19836932923768746" 382 | "380","-41.18032849742078","-496.1434560982864","-31.315422664528263" 383 | "381","-43.76724661509971","-542.9539338121269","-67.96669677615736" 384 | "382","-26.77937298096558","-442.29004160309665","2.163251433653567" 385 | "383","-25.803358280347243","-464.8230429023401","-46.76416321720819" 386 | "384","-68.04836242869683","-361.53006372566676","-28.315756259091973" 387 | "385","-20.455861365100528","-600.4846633213997","27.697244351282716" 388 | "386","-50.81274369805723","-535.5702471907898","19.72846166153021" 389 | "387","-1.6161591219077733","-470.7673944903151","124.80050295210981" 390 | "388","-60.25698784641041","-814.3508868238794","4.922413189267674" 391 | "389","-19.245972011046884","-734.9667650693248","171.02239580788302" 392 | "390","-50.46800637976639","-490.85339140105447","-35.9936675535251" 
393 | "391","-28.9327726145071","-507.4625906669949","-5.794972012663628" 394 | "392","-32.401893128258145","-605.4133957842948","-22.174357489518833" 395 | "393","-52.65914947082675","-591.2860794817226","-2.470431182419912" 396 | "394","-61.3829648406038","-742.4725964501757","158.4191606794114" 397 | "395","-76.02055687828876","-589.8932590051209","-8.130240542634645" 398 | "396","-58.002162582521336","-556.2724223505963","-17.54650314201965" 399 | "397","-48.31450851580901","-676.220058886519","-22.84119980301641" 400 | "398","-64.77576353717805","-597.3010675454013","35.39961473196922" 401 | "399","-28.060973543348585","-882.2993888211636","-4.080508722209659" 402 | "400","-40.34413431995646","-514.2310890575891","74.67570302216036" 403 | "401","-30.461077446865637","-452.5284941939874","227.0928728462498" 404 | "402","-15.383421732149625","-687.3963128479048","-71.92399187779566" 405 | "403","-57.6742779606214","-531.0511809778992","-9.055565588041008" 406 | "404","-40.31607725265238","-631.3027474426835","10.554945258987885" 407 | "405","-50.23297935139562","-474.6979405497094","17.859156309773464" 408 | "406","-64.81309502295998","-632.5016064511909","-5.229961557666251" 409 | "407","-52.55246585968658","-552.8311864403436","210.44131528475043" 410 | "408","-16.371623587242606","-553.7656662205573","-56.18393117976357" 411 | "409","-26.00824344626847","-443.50412109454305","-41.02492603294875" 412 | "410","-15.401705654767309","-634.1252053665046","-16.450680025104752" 413 | "411","-63.728907543116485","-699.7770812405283","34.45475416144776" 414 | "412","-41.881864685363894","-451.0461702646835","-59.93996776464613" 415 | "413","-41.09737453853849","-458.71534281798904","268.4635738521242" 416 | "414","-59.999356648005765","-946.1522767086775","-23.02921881344588" 417 | "415","-3.594949434811869","-492.6593080752948","169.09469909570123" 418 | "416","-22.35083026836393","-486.8851147090147","162.81067860794448" 419 | "417","-45.55506573682645","-818.344753984829","-7.350571280645257" 420 | "418","-81.00918026541397","-570.9753355856159","142.17130967547868" 421 | "419","-55.250245684011695","-303.9600422892337","-31.871106695545443" 422 | "420","-33.986907782662264","-113.85161866962423","204.41891837711805" 423 | "421","-25.02328069601059","-210.32344956289566","71.9546996865765" 424 | "422","228.31097327499486","-154.8443696622453","155.12459732437298" 425 | "423","21.733155579069734","-208.40587258163487","221.0770969359061" 426 | "424","166.36645116975808","-112.32256961213756","-32.1687975583352" 427 | "425","-62.54952646128077","-28.88517439999103","145.00713240729232" 428 | "426","-14.271312259178249","-129.9568463546012","-52.89741828481915" 429 | "427","193.9066260834591","-116.2795905859229","235.7076070596775" 430 | "428","190.7631959959968","-136.4867534456679","-12.931283557817892" 431 | "429","152.42420323329213","-138.28853658230014","-41.0092151282292" 432 | "430","114.70363814453484","-149.62025970033255","8.285515228556235" 433 | "431","-32.9272127957748","-149.31537991157413","3.290437450451279" 434 | "432","123.62706495393863","-110.68843069241217","-0.17469413784847632" 435 | "433","136.39961879547496","-153.83739892396","-42.83861928023374" 436 | "434","238.53525939888019","-134.88927032865956","-17.954881333535795" 437 | "435","24.460090245361258","-215.5546206813955","-18.926505858347184" 438 | "436","-136.95148770331173","-131.9227293116401","10.956282809149414" 439 | "437","91.6899115888946","-167.67577133469942","-6.140864761719513" 440 | 
"438","-36.420992483539905","-91.58870668456491","19.467534227394935" 441 | "439","10.789953086596682","-7.701466230348117","34.01573742170001" 442 | "440","-7.589530789759531","-134.51677002698273","-64.41272847018995" 443 | "441","300.373337509498","-44.709507326935395","-48.50950260871421" 444 | "442","-169.13265672531182","-181.3182738310307","-65.70721474585872" 445 | "443","-28.61881416216232","-100.9871068978708","-68.79554058159991" 446 | "444","-85.54373948321744","13.401723039821448","-50.3787254748126" 447 | "445","165.6738807622475","-135.0404635501491","-107.66294654736176" 448 | "446","-17.787023047549923","-121.23449038300085","-31.998788090204357" 449 | "447","-44.75381906872528","-144.22263804341407","-63.17190504142027" 450 | "448","235.57137226008422","-61.687105067450624","-49.16769945487362" 451 | "449","-48.169153977339775","-132.3844649799394","-64.17834698085768" 452 | "450","242.7158832503788","-151.621559614207","-79.01150706503297" 453 | "451","195.41475276677198","-113.84817354687279","-137.20715841159017" 454 | "452","249.84138052430734","-115.22654752647466","-121.41220531488983" 455 | "453","220.72133646197648","-96.30774808722686","-55.60338665505672" 456 | "454","176.8437827258354","-168.1681165169249","67.0687996860548" 457 | "455","187.30780392232646","-121.31650692030728","12.567296651476212" 458 | "456","17.118866437400992","-54.091815704501855","17.80579335154017" 459 | "457","-22.638418889911843","-101.37259140881565","191.1488706723373" 460 | "458","-56.45314842335906","-67.20567852286828","270.3355233216005" 461 | "459","-32.9410686726903","-114.58034823988488","60.87336856599305" 462 | "460","137.9360023399709","-61.60132329537801","82.17991171652592" 463 | "461","-19.48714155504129","-127.23331496884859","204.82532996145085" 464 | "462","-80.43438194649264","-202.51969818515633","170.8816693756544" 465 | "463","-97.6345902714274","-132.627476815242","239.41467695122503" 466 | "464","-389.7782980786916","-131.3746602032105","-45.54157780766415" 467 | "465","-38.28790120463172","-158.2579125379866","161.53875002871405" 468 | "466","-29.18983089252657","-244.09430627349312","-1.4444666450379797" 469 | "467","196.0219354131228","-140.655574997795","66.37565678481388" 470 | "468","194.25450728410613","-153.6227687283179","63.708280306679285" 471 | "469","228.63269335872758","-345.22183121886053","210.06460309162946" 472 | "470","-17.630329880656433","-140.03597336967886","-49.70165901069397" 473 | "471","239.9593110485199","-127.74036650490365","221.2305512568043" 474 | "472","181.77999203138035","-92.20537598260552","4.3572583868986206" 475 | "473","248.57810321420402","-143.14989284126779","183.38986534216943" 476 | "474","-20.43363466658822","-111.7513824711005","140.99704609145613" 477 | "475","276.45769127803163","-277.53261541308996","185.38244596237442" 478 | "476","-18.079578612276663","-167.99982337979276","249.61795192899666" 479 | "477","272.0444250785255","-140.50265750735076","3.6339277210242225" 480 | "478","16.903775992714344","-95.80630672744287","223.8060446594819" 481 | "479","-1.5791943064432132","-139.59546391659595","214.5808821729068" 482 | "480","24.02425447808028","13.655041518288058","5.435267387620348" 483 | "481","230.78113200147718","-112.2336193772226","193.39149630428113" 484 | "482","51.04191078442739","-55.21779380588093","163.45853512218588" 485 | "483","262.39532158240127","-33.30887881063569","167.51249703447553" 486 | "484","251.58015750960035","-220.2645206662217","-39.04768162858325" 487 | 
"485","270.0784526181658","-62.876608259770705","20.39663510406213" 488 | "486","213.77093275500948","-121.18977044934243","197.62344690577936" 489 | "487","255.66600555774357","-156.90616936058967","236.5455702250492" 490 | "488","271.55068863370764","-114.0888515810501","288.86068162743493" 491 | "489","-8.764365149518584","-152.91098523895005","205.88869740678314" 492 | "490","236.7079285899572","-126.62512014790022","284.99553141239926" 493 | "491","309.8959881295474","-168.50784761314026","30.030185101321706" 494 | "492","243.91525901832125","-124.91335495790216","152.54280774849346" 495 | "493","226.48971100444265","-124.87623448313303","255.6585262100562" 496 | "494","267.6423484149683","-162.49106527740543","249.22654417694986" 497 | "495","249.75486932968468","-133.4524020310006","195.5864389194873" 498 | "496","122.56134723775487","-141.25909165706238","-3.8139402530956934" 499 | "497","-45.751993251553486","-133.0838684091823","214.3075874299788" 500 | "498","252.11569149944086","-140.7121010865905","285.0875924254426" 501 | "499","27.313761023145474","-171.18140270377893","87.97061641978738" --------------------------------------------------------------------------------