├── README.md
├── p0_taxi-v2
│   ├── README.md
│   ├── agent.py
│   ├── images
│   │   ├── all_perf.png
│   │   ├── expected_sarsa_algo.png
│   │   ├── expected_sarsa_perf.png
│   │   ├── expected_sarsa_update_rule.png
│   │   ├── sarsa_algo.png
│   │   ├── sarsa_perf.png
│   │   ├── sarsa_update_rule.png
│   │   ├── sarsamax_algo.png
│   │   ├── sarsamax_perf.png
│   │   ├── sarsamax_update_rule.png
│   │   ├── taxi-game.gif
│   │   └── taxi_game_gif.gif
│   ├── main.py
│   └── monitor.py
├── p1_navigation
│   ├── Navigation.ipynb
│   ├── README.md
│   ├── config.json
│   ├── dqn_agent.py
│   ├── main.py
│   ├── model.py
│   ├── report.pdf
│   ├── requirements.txt
│   ├── saved
│   │   └── DQN_exp
│   │       └── model_trained_solved.pth
│   └── utils.py
├── p2_continuous_control
│   ├── Continuous_Control.ipynb
│   ├── README.md
│   ├── config.json
│   ├── ddpg_agent.py
│   ├── images
│   │   └── reacher_gif.gif
│   ├── models.py
│   ├── report.pdf
│   ├── requirements.txt
│   └── saved
│       └── DDPG_exp
│           ├── checkpoint_actor_solved.pth
│           └── checkpoint_critic_solved.pth
└── p3_collab_compet
    ├── DDPGAgents.py
    ├── OUNoise.py
    ├── README.md
    ├── ReplayBuffer.py
    ├── Tennis.ipynb
    ├── config.json
    ├── images
    │   └── tennis_gif.gif
    ├── report.pdf
    ├── requirements.txt
    ├── saved
    │   └── DDPGAgents_exp
    │       ├── checkpoint_actor_solved.pth
    │       └── checkpoint_critic_solved.pth
    └── utils.py
/README.md:
--------------------------------------------------------------------------------
1 | # Deep Reinforcement Learning Nanodegree Udacity
2 |
3 | This repository contains project files for Udacity's Deep Reinforcement Learning Nanodegree program.
4 |
5 | ## Projects
6 |
7 | ### Reinforcement Learning
8 | >[P0_Taxi](https://github.com/vmelan/DRLND-udacity/tree/master/p0_taxi-v2)
9 |
10 | For this project, we use the OpenAI Gym Taxi-v2 environment to design an algorithm to teach a taxi agent to navigate a small gridworld using Reinforcement Learning methods.
11 |
12 | ### Navigation
13 | >[P1_Navigation](https://github.com/vmelan/DRLND-udacity/tree/master/p1_navigation)
14 |
15 | For this project, we train an agent to navigate and collect bananas in a large, square world using a Deep Q-Network (DQN).
16 |
17 | ### Continuous Control
18 | >[P2_Continuous Control](https://github.com/vmelan/DRLND-udacity/tree/master/p2_continuous_control)
19 |
20 | For this project, we train a double-jointed arm agent to follow a target location using Deep Deterministic Policy
21 | Gradient (DDPG).
22 |
23 | ### Collaboration and Competition
24 | >[P3_Collaboration Competition](https://github.com/vmelan/DRLND-udacity/tree/master/p3_collab_compet)
25 |
26 | For this project, we train a pair of agents to play tennis using DDPG with a shared replay buffer.
27 |
28 |
--------------------------------------------------------------------------------
/p0_taxi-v2/README.md:
--------------------------------------------------------------------------------
1 | # Project: OpenAI Gym's Taxi-v2 Task
2 |
3 | For this project, we use the OpenAI Gym Taxi-v2 environment to design an
4 | algorithm to teach a taxi agent to navigate a small gridworld.
5 |
6 |
7 |
8 |
9 |
10 | ## Problem Statement
11 | This problem comes from the paper Hierarchical Reinforcement
12 | Learning with the MAXQ Value Function Decomposition by Tom Dietterich (link
13 | to the paper: https://arxiv.org/pdf/cs/9905014.pdf), Section 3.1, A Motivating Example.
14 |
15 | There are four specially-designated locations in
16 | this world, marked as R(ed), B(lue), G(reen), and Y(ellow). The taxi problem is episodic. In
17 | each episode, the taxi starts in a randomly-chosen square. There is a passenger at one of the
18 | four locations (chosen randomly), and that passenger wishes to be transported to one of the four
19 | locations (also chosen randomly). The taxi must go to the passenger’s location (the “source”), pick
20 | up the passenger, go to the destination location (the “destination”), and put down the passenger
21 | there. (To keep things uniform, the taxi must pick up and drop off the passenger even if he/she
22 | is already located at the destination!) The episode ends when the passenger is deposited at the
23 | destination location.
24 |
25 | There are six primitive actions in this domain: (a) four navigation actions that move the taxi
26 | one square North, South, East, or West, (b) a Pickup action, and (c) a Putdown action. Each action
27 | is deterministic. There is a reward of −1 for each action and an additional reward of +20 for
28 | successfully delivering the passenger. There is a reward of −10 if the taxi attempts to execute the
29 | Putdown or Pickup actions illegally. If a navigation action would cause the taxi to hit a wall, the
30 | action is a no-op, and there is only the usual reward of −1.
31 |
32 | We seek a policy that maximizes the total reward per episode. There are 500 possible states:
33 | 25 squares, 5 locations for the passenger (counting the four starting locations and the taxi), and 4
34 | destinations.
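
As a quick sanity check of these numbers (a minimal sketch, using the same Gym API as `main.py` and `monitor.py`):

```python
import gym

env = gym.make('Taxi-v2')
print(env.observation_space.n)  # 500 discrete states (25 squares x 5 passenger locations x 4 destinations)
print(env.action_space.n)       # 6 primitive actions
```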
35 |
36 | ## Files
37 | - `agent.py`: Agent class in which we will develop our reinforcement learning methods
38 | - `monitor.py`: The `interact` function tests how well the agent learns from interaction with the environment
39 | - `main.py`: Main file to run the project and check the performance of the agent
40 |
41 | ## Temporal-Difference (TD) Control Methods
42 | While **Monte Carlo** approaches require running the agent through an entire episode before updating any estimates,
43 | this is not viable for **continuing** tasks that have no terminal state, nor for **episodic** tasks when we
44 | do not want to wait until the end of an episode before improving the agent's estimates.
45 |
46 | This is where Temporal-Difference (TD) control methods step in: they update estimates based in part
47 | on other learned estimates, without waiting for the final outcome. As such, TD methods update the
48 | **Q-table** after every time step.
49 |
50 | ### Sarsa
51 | The Sarsa update rule is the following:
52 |
53 |
54 |
55 |
56 |
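For reference, the Sarsa update in standard notation (step size α, discount factor γ) is:

```latex
Q(S_t, A_t) \leftarrow Q(S_t, A_t) + \alpha \left[ R_{t+1} + \gamma \, Q(S_{t+1}, A_{t+1}) - Q(S_t, A_t) \right]
```
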
57 | Notice that the action-value update uses the **S**tate, **A**ction, **R**eward, next **S**tate and next **A**ction, hence the name
58 | of the algorithm, **Sarsa(0)** or simply **Sarsa**.
59 |
60 |
61 |
62 |
63 |
64 | Here is the performance of Sarsa on the Taxi task :
65 |
66 |
67 |
68 | The average reward over the last 100 episodes keeps improving until roughly the 2000th episode, where it finally
69 | reaches convergence and stops improving.
70 |
71 | ### Expected Sarsa
72 | The Expected Sarsa update rule is the following:
73 |
74 |
75 |
76 |
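For reference, with π denoting the epsilon-greedy policy used to weight the next-state action values, the Expected Sarsa update in standard notation is:

```latex
Q(S_t, A_t) \leftarrow Q(S_t, A_t) + \alpha \left[ R_{t+1} + \gamma \sum_{a} \pi(a \mid S_{t+1}) \, Q(S_{t+1}, a) - Q(S_t, A_t) \right]
```
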
77 | Expected Sarsa uses the expected value of the next state-action pair, where the expectation takes into account the probability that the agent selects each possible action from the next state.
78 |
79 |
80 |
81 |
82 |
83 | Here is the performance of Expected Sarsa on the Taxi task :
84 |
85 |
86 |
87 | The resulting graph is noisier than Sarsa's, since the update averages over all the possible
88 | actions in the next state. Convergence takes more time, and the improvement is gradual, albeit small.
89 |
90 | ### Sarsamax (or Q-Learning)
91 | The Sarsamax (or Q-Learning) update rule is the following:
92 |
93 |
94 |
95 |
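For reference, the Sarsamax (Q-Learning) update in standard notation replaces the sampled (or expected) next action value with a maximum over actions:

```latex
Q(S_t, A_t) \leftarrow Q(S_t, A_t) + \alpha \left[ R_{t+1} + \gamma \max_{a} Q(S_{t+1}, a) - Q(S_t, A_t) \right]
```
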
96 | In Sarsamax, the update rule attempts to approximate the optimal action-value function
97 | at every time step.
98 |
99 |
100 |
101 |
102 | Here is the performance of Sarsamax on the Taxi task:
103 |
104 |
105 |
106 | Sarsamax is smoother and follows the same trend as Sarsa.
107 |
108 | ### Overview
109 | Sarsa, Expected Sarsa and Sarsamax have each been trained for 20,000 episodes, and we can visualize
110 | their performance in the same graph:
111 |
112 |
113 |
--------------------------------------------------------------------------------
/p0_taxi-v2/agent.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | from collections import defaultdict
3 |
4 | class Agent:
5 |
6 | def __init__(self, nA=6):
7 | """ Initialize agent.
8 |
9 | Params
10 | ======
11 | - nA: number of actions available to the agent
12 | """
13 | self.nA = nA
14 | self.Q = defaultdict(lambda: np.zeros(self.nA))
15 |
16 | self.eps = 1.0
17 | self.eps_decay = 0.99
18 | self.eps_min = 0.005
19 |
20 | self.alpha = 0.1
21 | self.gamma = 0.9
22 |
23 | def get_policy(self, Q_s):
24 | """ Obtain the action probabilities corresponding to epsilon-greedy policies """
25 | self.eps = max(self.eps*self.eps_decay, self.eps_min)
26 | policy_s = np.ones(self.nA) * (self.eps / self.nA)
27 | best_a = np.argmax(Q_s)
28 | policy_s[best_a] = 1 - self.eps + (self.eps / self.nA)
29 |
30 | return policy_s
31 |
32 | def select_action(self, state):
33 | """ Given the state, select an action.
34 |
35 | Params
36 | ======
37 | - state: the current state of the environment
38 |
39 | Returns
40 | =======
41 | - action: an integer, compatible with the task's action space
42 | """
43 | policy_s = self.get_policy(self.Q[state])
44 | action = np.random.choice(np.arange(self.nA), p=policy_s)
45 |
46 | return action
47 |
48 | def step(self, state, action, reward, next_state, done):
49 | """ Update the agent's knowledge, using the most recently sampled tuple.
50 |
51 | Params
52 | ======
53 | - state: the previous state of the environment
54 | - action: the agent's previous choice of action
55 | - reward: last reward received
56 | - next_state: the current state of the environment
57 | - done: whether the episode is complete (True or False)
58 |
59 | """
60 | ## Using update rule of Sarsamax (Q-Learning)
61 |
62 | if not done:
63 | self.Q[state][action] = self.Q[state][action] + self.alpha * (reward + (self.gamma * np.max(self.Q[next_state])) - self.Q[state][action])
64 | if done:
65 | self.Q[state][action] = self.Q[state][action] + self.alpha * (reward + self.gamma * 0 - self.Q[state][action])
66 |
67 |
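Note: `main.py` constructs the agent as `Agent(method=...)`, whereas the class above takes no `method` argument and always applies the Sarsamax update. A minimal sketch (not the author's implementation) of how `step` could dispatch on a hypothetical `self.method` attribute, assuming `__init__` were extended to `def __init__(self, nA=6, method='Q-Learning')`:

```python
    def step(self, state, action, reward, next_state, done):
        """ Update Q for the sampled (s, a, r, s') tuple, dispatching on self.method. """
        if done:
            target = reward  # no future return once the episode has ended
        elif self.method == 'Sarsa':
            # sample the next action from the current epsilon-greedy policy
            policy_next = self.get_policy(self.Q[next_state])
            next_action = np.random.choice(np.arange(self.nA), p=policy_next)
            target = reward + self.gamma * self.Q[next_state][next_action]
        elif self.method == 'Expected Sarsa':
            # expectation of Q over the epsilon-greedy policy
            policy_next = self.get_policy(self.Q[next_state])
            target = reward + self.gamma * np.dot(policy_next, self.Q[next_state])
        else:  # 'Q-Learning' / Sarsamax, matching the update in the file above
            target = reward + self.gamma * np.max(self.Q[next_state])
        self.Q[state][action] += self.alpha * (target - self.Q[state][action])
```

One caveat: `get_policy` also decays epsilon as a side effect, so calling it inside `step` decays epsilon faster than the original class does.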
--------------------------------------------------------------------------------
/p0_taxi-v2/images/all_perf.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/vmelan/DRLND-udacity/8d2a38d2894b0f692fc6ffbe652eb94440b516de/p0_taxi-v2/images/all_perf.png
--------------------------------------------------------------------------------
/p0_taxi-v2/images/expected_sarsa_algo.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/vmelan/DRLND-udacity/8d2a38d2894b0f692fc6ffbe652eb94440b516de/p0_taxi-v2/images/expected_sarsa_algo.png
--------------------------------------------------------------------------------
/p0_taxi-v2/images/expected_sarsa_perf.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/vmelan/DRLND-udacity/8d2a38d2894b0f692fc6ffbe652eb94440b516de/p0_taxi-v2/images/expected_sarsa_perf.png
--------------------------------------------------------------------------------
/p0_taxi-v2/images/expected_sarsa_update_rule.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/vmelan/DRLND-udacity/8d2a38d2894b0f692fc6ffbe652eb94440b516de/p0_taxi-v2/images/expected_sarsa_update_rule.png
--------------------------------------------------------------------------------
/p0_taxi-v2/images/sarsa_algo.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/vmelan/DRLND-udacity/8d2a38d2894b0f692fc6ffbe652eb94440b516de/p0_taxi-v2/images/sarsa_algo.png
--------------------------------------------------------------------------------
/p0_taxi-v2/images/sarsa_perf.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/vmelan/DRLND-udacity/8d2a38d2894b0f692fc6ffbe652eb94440b516de/p0_taxi-v2/images/sarsa_perf.png
--------------------------------------------------------------------------------
/p0_taxi-v2/images/sarsa_update_rule.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/vmelan/DRLND-udacity/8d2a38d2894b0f692fc6ffbe652eb94440b516de/p0_taxi-v2/images/sarsa_update_rule.png
--------------------------------------------------------------------------------
/p0_taxi-v2/images/sarsamax_algo.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/vmelan/DRLND-udacity/8d2a38d2894b0f692fc6ffbe652eb94440b516de/p0_taxi-v2/images/sarsamax_algo.png
--------------------------------------------------------------------------------
/p0_taxi-v2/images/sarsamax_perf.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/vmelan/DRLND-udacity/8d2a38d2894b0f692fc6ffbe652eb94440b516de/p0_taxi-v2/images/sarsamax_perf.png
--------------------------------------------------------------------------------
/p0_taxi-v2/images/sarsamax_update_rule.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/vmelan/DRLND-udacity/8d2a38d2894b0f692fc6ffbe652eb94440b516de/p0_taxi-v2/images/sarsamax_update_rule.png
--------------------------------------------------------------------------------
/p0_taxi-v2/images/taxi-game.gif:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/vmelan/DRLND-udacity/8d2a38d2894b0f692fc6ffbe652eb94440b516de/p0_taxi-v2/images/taxi-game.gif
--------------------------------------------------------------------------------
/p0_taxi-v2/images/taxi_game_gif.gif:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/vmelan/DRLND-udacity/8d2a38d2894b0f692fc6ffbe652eb94440b516de/p0_taxi-v2/images/taxi_game_gif.gif
--------------------------------------------------------------------------------
/p0_taxi-v2/main.py:
--------------------------------------------------------------------------------
1 | from agent import Agent
2 | from monitor import interact
3 | import gym
4 | import numpy as np
5 | import matplotlib.pyplot as plt
6 |
7 | import sys
8 | from collections import defaultdict
9 | import time
10 |
11 | def plot_performance(num_episodes, avg_rewards, label, disp_plot=True):
12 | plt.plot(np.linspace(0, num_episodes, len(avg_rewards),endpoint=False), np.asarray(avg_rewards), label=label)
13 | plt.xlabel('Episode Number')
14 | plt.ylabel('Average Reward (Over Next %d Episodes)' % (100))
15 | plt.title(label + " " + "performance")
16 | if disp_plot: plt.show()
17 |
18 | def plot_all_performances(num_episodes, all_avg_rewards, title):
19 | for (avg_reward, method) in zip(all_avg_rewards, ['Sarsa', 'Expected Sarsa', 'Sarsamax (Q-Learning)']):
20 | plot_performance(num_episodes, avg_reward, method, disp_plot=False)
21 | plt.title(title)
22 | plt.legend(loc='best')
23 | plt.show()
24 |
25 | def main():
26 | env = gym.make('Taxi-v2')
27 | num_episodes = 20000
28 |
29 | ## Sarsa
30 | agent = Agent(method='Sarsa')
31 | sarsa_avg_rewards, sarsa_best_avg_reward = interact(env, agent, num_episodes=num_episodes)
32 | plot_performance(num_episodes, sarsa_avg_rewards, "Sarsa", disp_plot=True)
33 |
34 | # ## Expected Sarsa
35 | agent = Agent(method='Expected Sarsa')
36 | exp_sarsa_avg_rewards, exp_sarsa_best_avg_reward = interact(env, agent, num_episodes=num_episodes)
37 | plot_performance(num_episodes, exp_sarsa_avg_rewards, "Expected Sarsa", disp_plot=True)
38 |
39 | ## Q-Learning
40 | agent = Agent(method='Q-Learning')
41 | sarsamax_avg_rewards, sarsamax_best_avg_reward = interact(env, agent, num_episodes=num_episodes)
42 | plot_performance(num_episodes, sarsamax_avg_rewards, "Sarsamax (Q-Learning)", disp_plot=True)
43 |
44 | ## All performances
45 | plot_all_performances(num_episodes, [sarsa_avg_rewards, exp_sarsa_avg_rewards, sarsamax_avg_rewards],
46 | title="Comparison of Temporal Difference control methods")
47 |
48 | if __name__ == '__main__':
49 | main()
--------------------------------------------------------------------------------
/p0_taxi-v2/monitor.py:
--------------------------------------------------------------------------------
1 | from collections import deque
2 | import sys
3 | import math
4 | import numpy as np
5 |
6 | def interact(env, agent, num_episodes=20000, window=100):
7 | # def interact(env, agent, num_episodes=1000, window=100):
8 |
9 | """ Monitor agent's performance.
10 |
11 | Params
12 | ======
13 |     - env: instance of OpenAI Gym's Taxi-v2 environment
14 | - agent: instance of class Agent (see Agent.py for details)
15 | - num_episodes: number of episodes of agent-environment interaction
16 | - window: number of episodes to consider when calculating average rewards
17 |
18 | Returns
19 | =======
20 | - avg_rewards: deque containing average rewards
21 | - best_avg_reward: largest value in the avg_rewards deque
22 | """
23 | # initialize average rewards
24 | avg_rewards = deque(maxlen=num_episodes)
25 | # initialize best average reward
26 | best_avg_reward = -math.inf
27 | # initialize monitor for most recent rewards
28 | samp_rewards = deque(maxlen=window)
29 | # for each episode
30 | for i_episode in range(1, num_episodes+1):
31 | # begin the episode
32 | state = env.reset()
33 | # initialize the sampled reward
34 | samp_reward = 0
35 | while True:
36 | # agent selects an action
37 | action = agent.select_action(state)
38 | # agent performs the selected action
39 | next_state, reward, done, _ = env.step(action)
40 | # agent performs internal updates based on sampled experience
41 | agent.step(state, action, reward, next_state, done)
42 | # update the sampled reward
43 | samp_reward += reward
44 | # update the state (s <- s') to next time step
45 | state = next_state
46 | if done:
47 | # save final sampled reward
48 | samp_rewards.append(samp_reward)
49 | break
50 | if (i_episode >= 100):
51 | # get average reward from last 100 episodes
52 | avg_reward = np.mean(samp_rewards)
53 | # append to deque
54 | avg_rewards.append(avg_reward)
55 | # update best average reward
56 | if avg_reward > best_avg_reward:
57 | best_avg_reward = avg_reward
58 | # monitor progress
59 | print("\rEpisode {}/{} || Best average reward {}".format(i_episode, num_episodes, best_avg_reward), end="")
60 | sys.stdout.flush()
61 | # check if task is solved (according to OpenAI Gym)
62 | if best_avg_reward >= 9.7:
63 | print('\nEnvironment solved in {} episodes.'.format(i_episode), end="")
64 | break
65 | if i_episode == num_episodes: print('\n')
66 |
67 | return avg_rewards, best_avg_reward
--------------------------------------------------------------------------------
/p1_navigation/README.md:
--------------------------------------------------------------------------------
1 | # Project : Navigation
2 |
3 | ## Description
4 | For this project, we train an agent to navigate and collect bananas in a large,
5 | square world.
6 |
7 | ## Problem statement
8 | A reward of +1 is provided for collecting a yellow banana, and a reward of -1 is provided
9 | for collecting a blue banana. Thus, the goal of the agent is to collect
10 | as many yellow bananas as possible while avoiding blue bananas.
11 |
12 | The state space has 37 dimensions, and contains the agent's velocity, along
13 | with ray-based perception of objects around the agent's forward
14 | direction. Given this information, the agent has to learn how to best select
15 | actions.
16 | Four discrete actions are available, corresponding to:
17 | - `0` - move forward
18 | - `1` - move backward
19 | - `2` - turn left
20 | - `3` - turn right
21 | The task is episodic, and in order to solve the environment, the
22 | agent must get an average score of +13 over 100 consecutive episodes.
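
For illustration, the solving criterion amounts to a 100-episode rolling mean of the scores, mirroring the check done later in `main.py` (a minimal sketch; the function name is illustrative):

```python
import numpy as np
from collections import deque

scores_window = deque(maxlen=100)  # keeps only the 100 most recent episode scores

def is_solved(episode_score):
    """Record an episode score and report whether the +13 criterion is met."""
    scores_window.append(episode_score)
    return len(scores_window) == 100 and np.mean(scores_window) >= 13.0
```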
23 |
24 | ## Files
25 | - `Navigation.ipynb`: Notebook used to control and train the agent
26 | - `main.py`: Main script used to control and train the agent for experimentation
27 | - `dqn_agent.py`: Create an Agent class that interacts with and learns from the environment
28 | - `model.py`: Q-network class used to map state to action values
29 | - `config.json`: Configuration file to store variables and paths
30 | - `utils.py`: Helper functions
31 | - `report.pdf`: Technical report
32 |
33 | ## Dependencies
34 | To be able to run this code, you will need an environment with Python 3. The dependencies
35 | are listed in the `requirements.txt` file, and you can install them
36 | using the following command:
37 | ```
38 | pip install -r requirements.txt
39 | ```
40 |
41 | Furthermore, you need to download the environment from one of the links below. You only need to select
42 | the environment that matches your operating system:
43 | - Linux : [link](https://s3-us-west-1.amazonaws.com/udacity-drlnd/P1/Banana/Banana_Linux.zip)
44 | - MAC OSX : [link](https://s3-us-west-1.amazonaws.com/udacity-drlnd/P1/Banana/Banana.app.zip)
45 | - Windows : [link](https://s3-us-west-1.amazonaws.com/udacity-drlnd/P1/Banana/Banana_Windows_x86_64.zip)
46 |
47 | ## Running
48 | Run the cells in the notebook `Navigation.ipynb` to train an agent that solves our required
49 | task of collecting bananas.
--------------------------------------------------------------------------------
/p1_navigation/config.json:
--------------------------------------------------------------------------------
1 | {
2 | "exp_name": "DQN_exp",
3 | "cuda": false,
4 | "gpu": 0,
5 |
6 | "optimizer": {
7 | "optimizer_type": "Adam",
8 | "betas": [0.9, 0.999],
9 | "optimizer_params": {
10 | "lr": 0.0005,
11 | "eps": 1e-7,
12 | "weight_decay": 0
13 | }
14 | },
15 |
16 | "GLIE": {
17 | "eps_start": 1.0,
18 | "eps_end": 0.005,
19 | "eps_decay": 0.999
20 | },
21 |
22 | "DQN": {
23 | "gamma": 0.99,
24 | "tau": 1e-3,
25 | "update_every": 4,
26 | "buffer_size": 5e4
27 | },
28 |
29 | "architecture": {
30 | "hidden_layers_units": [500, 200, 100],
31 | "use_dropout": false,
32 | "dropout_proba": 0.5
33 | },
34 |
35 | "trainer": {
36 | "num_episodes": 2000,
37 | "batch_size": 32,
38 | "max_timesteps_per_ep": 1000,
39 | "save_dir": "./saved/",
40 | "save_trained_name": "model_trained",
41 | "save_freq": 500,
42 | "verbose": 1
43 | }
44 |
45 | }
--------------------------------------------------------------------------------
/p1_navigation/dqn_agent.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | import random
3 | from collections import namedtuple, deque
4 | import logging
5 |
6 | from model import QNetwork
7 |
8 | import torch
9 | import torch.nn.functional as F
10 | import torch.optim as optim
11 | from utils import pick_device
12 |
13 | import pdb
14 |
15 | class Agent():
16 | """ Agent used to interact with and learns from the environment """
17 |
18 | def __init__(self, state_size, action_size, config):
19 | """ Initialize an Agent object """
20 |
21 | self.state_size = state_size
22 | self.action_size = action_size
23 | self.config = config
24 |
25 | # logging for this class
26 | self.logger = logging.getLogger(self.__class__.__name__)
27 |
28 | # gpu support
29 | self.device = pick_device(config, self.logger)
30 |
31 | ## Q-Networks
32 | self.qnetwork_local = QNetwork(state_size, action_size, config).to(self.device)
33 | self.qnetwork_target = QNetwork(state_size, action_size, config).to(self.device)
34 |
35 | ## Get optimizer for local network
36 | self.optimizer = getattr(optim, config["optimizer"]["optimizer_type"])(
37 | self.qnetwork_local.parameters(),
38 | betas=tuple(config["optimizer"]["betas"]),
39 | **config["optimizer"]["optimizer_params"])
40 |
41 | ## Replay memory
42 | self.memory = ReplayBuffer(
43 | config=config,
44 | action_size=action_size,
45 | buffer_size=int(config["DQN"]["buffer_size"]),
46 | batch_size=config["trainer"]["batch_size"]
47 | )
48 |
49 | ## Initialize time step (for update every `update_every` steps)
50 | self.t_step = 0
51 |
52 |
53 | def step(self, state, action, reward, next_state, done):
54 |
55 | # Save experience in replay memory
56 | self.memory.add(state, action, reward, next_state, done)
57 |
58 | # Learn every `update_every` time steps
59 | self.t_step = (self.t_step + 1) % self.config["DQN"]["update_every"]
60 | if (self.t_step == 0):
61 | # If enough samples are available in memory, get random subset and learn
62 | if len(self.memory) > self.config["trainer"]["batch_size"]:
63 | experiences = self.memory.sample()
64 | self.learn(experiences, self.config["DQN"]["gamma"])
65 |
66 |
67 |
68 | def act(self, state, epsilon):
69 | """ Returns actions for given state as per current policy """
70 | # pdb.set_trace()
71 |
72 | # Convert state to tensor
73 | state = torch.from_numpy(state).float().unsqueeze(0).to(self.device)
74 |
75 | ## Evaluation mode
76 | self.qnetwork_local.eval()
77 | with torch.no_grad():
78 | # Forward pass of local qnetwork
79 | action_values = self.qnetwork_local.forward(state)
80 |
81 | ## Training mode
82 | self.qnetwork_local.train()
83 | # Epsilon-greedy action selection
84 | if random.random() > epsilon:
85 | # Choose the best action (exploitation)
86 | return np.argmax(action_values.cpu().data.numpy())
87 | else:
88 | # Choose random action (exploration)
89 | return random.choice(np.arange(self.action_size))
90 |
91 |
92 | def learn(self, experiences, gamma):
93 | """ Update value parameters using given batch of experience tuples """
94 |
95 | states, actions, rewards, next_states, dones = experiences
96 |
97 | ## TD target
98 | # Get max predicted Q-values (for next states) from target model
99 | # Q_targets_next = torch.argmax(self.qnetwork_target(next_states).detach(), dim=1).unsqueeze(1)
100 | Q_targets_next = self.qnetwork_target(next_states).detach().max(1)[0].unsqueeze(1)
101 | Q_targets_next = Q_targets_next.type(torch.FloatTensor)
102 |
103 | # Compute Q-targets for current states
104 | Q_targets = rewards + (gamma * Q_targets_next * (1 - dones))
105 |
106 | ## old value
107 | # Get expected Q-values from local model
108 | Q_expected = torch.gather(self.qnetwork_local(states), dim=1, index=actions)
109 |
110 | # Compute loss
111 | loss = F.mse_loss(Q_expected, Q_targets)
112 | # Minimize loss
113 | self.optimizer.zero_grad()
114 | loss.backward()
115 | self.optimizer.step()
116 |
117 | # update target network with a soft update
118 | self.soft_update(self.qnetwork_local, self.qnetwork_target, self.config["DQN"]["tau"])
119 |
120 |
121 |
122 | def soft_update(self, local_model, target_model, tau):
123 | """
124 | Soft update model parameters
125 | θ_target = τ*θ_local + (1 - τ)*θ_target
126 |
127 | Parameters
128 | ----------
129 | local_model (PyTorch model): weights will be copied from
130 | target_model (PyTorch model): weights will be copied to
131 | tau (float): interpolation parameter
132 | """
133 |
134 | for target_param, local_param in zip(target_model.parameters(), local_model.parameters()):
135 | target_param.data.copy_(tau*local_param.data + (1.0 - tau)*target_param.data)
136 |
137 |
138 | class ReplayBuffer():
139 | """ Fixed-size buffer to store experience tuples """
140 |
141 | def __init__(self, config, action_size, buffer_size, batch_size):
142 | """ Initialize a ReplayBuffer object """
143 |
144 | self.config = config
145 | self.action_size = action_size
146 | self.memory = deque(maxlen=buffer_size)
147 | self.batch_size = batch_size
148 | self.experience = namedtuple("Experience",
149 | field_names=["state", "action", "reward", "next_state", "done"])
150 |
151 | # logging for this class
152 | self.logger = logging.getLogger(self.__class__.__name__)
153 |
154 | # gpu support
155 | self.device = pick_device(config, self.logger)
156 |
157 |
158 | def add(self, state, action, reward, next_state, done):
159 | """ Add a new experience to memory """
160 | e = self.experience(state, action, reward, next_state, done)
161 | self.memory.append(e)
162 |
163 | def sample(self):
164 | """ Randomly sample a batch of experiences from memory """
165 | experiences = random.sample(self.memory, k=self.batch_size)
166 |
167 | states = torch.from_numpy(
168 | np.vstack([e.state for e in experiences if e is not None])
169 | ).float().to(self.device)
170 | actions = torch.from_numpy(
171 | np.vstack([e.action for e in experiences if e is not None])
172 | ).long().to(self.device)
173 | rewards = torch.from_numpy(
174 | np.vstack([e.reward for e in experiences if e is not None])
175 | ).float().to(self.device)
176 | next_states = torch.from_numpy(
177 | np.vstack([e.next_state for e in experiences if e is not None])
178 | ).float().to(self.device)
179 | dones = torch.from_numpy(
180 | np.vstack([e.done for e in experiences if e is not None]).astype(np.uint8)
181 | ).float().to(self.device)
182 |
183 | return (states, actions, rewards, next_states, dones)
184 |
185 |
186 | def __len__(self):
187 | """ Return the current size of internal memory """
188 | return len(self.memory)
189 |
--------------------------------------------------------------------------------
/p1_navigation/main.py:
--------------------------------------------------------------------------------
1 | import json
2 | import logging
3 | import torch
4 | import numpy as np
5 | from collections import deque
6 | from dqn_agent import Agent
7 | from utils import ensure_dir
8 | import matplotlib.pyplot as plt
9 | from unityagents import UnityEnvironment
10 |
11 | plt.ion()
12 |
13 |
14 | def dqn(agent,
15 | brain_name,
16 | config,
17 | n_episodes,
18 | max_timesteps_per_ep,
19 | eps_start,
20 | eps_end,
21 | eps_decay
22 | ):
23 |
24 | """
25 | Deep Q-Learning
26 | """
27 | logger = logging.getLogger('dqn') # logger
28 | flag = False # When environment is technically solved
29 | # Save path
30 | save_path = config["trainer"]["save_dir"] + config["exp_name"] + "/"
31 | ensure_dir(save_path)
32 | scores = [] # list containing scores from each episodes
33 | scores_window = deque(maxlen=100)
34 | epsilon = eps_start # init epsilon
35 |
36 | for i_episode in range(1, n_episodes + 1):
37 | # reset the environment
38 | env_info = env.reset(train_mode=True)[brain_name]
39 | # get the current state
40 | state = env_info.vector_observations[0]
41 | score = 0
42 | for t in range(max_timesteps_per_ep):
43 | # choose action based on epsilon-greedy policy
44 | action = agent.act(state, epsilon)
45 | # send the action to the environment
46 | env_info = env.step(action)[brain_name]
47 | # get the next state
48 | next_state = env_info.vector_observations[0]
49 | # get the reward
50 | reward = env_info.rewards[0]
51 | # see if episode has finished
52 | done = env_info.local_done[0]
53 | # step
54 | agent.step(state, action, reward, next_state, done)
55 | # cumulative rewards into score variable
56 | score += reward
57 | # get next_state and set it to state
58 | state = next_state
59 |
60 | if done:
61 | break
62 |
63 | # Update epsilon
64 | epsilon = max(eps_decay*epsilon, eps_end)
65 |
66 | # save most recent score
67 | scores.append(score)
68 | scores_window.append(score)
69 |
70 | logger.info('\rEpisode {}\tAverage Score: {:.3f}'.format(i_episode, np.mean(scores_window)))
71 |
72 | if (i_episode % 100 == 0):
73 | logger.info("\rEpisode {}\tAverage Score: {:.3f}".format(i_episode, \
74 | np.mean(scores_window)))
75 |
76 |         # Save occasionally
77 | if (i_episode % config["trainer"]["save_freq"] == 0):
78 |
79 | torch.save(agent.qnetwork_local.state_dict(), save_path +
80 | config["trainer"]["save_trained_name"] + "_" + str(i_episode) + ".pth")
81 |
82 | # Check if environment solved (if not already)
83 | if not flag:
84 | if (np.mean(scores_window) >= 13.0):
85 | logger.info('\nEnvironment solved in {:d} episodes!\tAverage Score: {:.3f}'.format(
86 | i_episode-100, np.mean(scores_window)))
87 | # Save solved model
88 | torch.save(agent.qnetwork_local.state_dict(), save_path +
89 | config["trainer"]["save_trained_name"] + "_solved.pth")
90 | flag = True
91 |
92 | return scores
93 |
94 | if __name__ == '__main__':
95 | # Configure logging for all loggers
96 | logging.basicConfig(level=logging.INFO, format='')
97 |
98 | # Load config file
99 | with open("config.json", "r") as f:
100 | config = json.load(f)
101 |
102 | # Start the environment
103 | env = UnityEnvironment(file_name="./Banana_Windows_x86_64/Banana.exe")
104 |
105 | # get the default brain
106 | brain_name = env.brain_names[0]
107 | brain = env.brains[brain_name]
108 |
109 | # Create agent
110 | agent = Agent(state_size=37, action_size=4, config=config)
111 |
112 | # Train the agent
113 | scores = dqn(agent=agent,
114 | brain_name=brain_name,
115 | config=config,
116 | n_episodes=config["trainer"]["num_episodes"],
117 | max_timesteps_per_ep=config["trainer"]["max_timesteps_per_ep"],
118 | eps_start=config["GLIE"]["eps_start"],
119 | eps_end=config["GLIE"]["eps_end"],
120 | eps_decay=config["GLIE"]["eps_decay"]
121 | )
122 |
123 | # Close the environment
124 | env.close()
--------------------------------------------------------------------------------
/p1_navigation/model.py:
--------------------------------------------------------------------------------
1 | import torch
2 | import torch.nn as nn
3 | import torch.nn.functional as F
4 |
5 |
6 | class QNetwork(nn.Module):
7 | """ Policy model that maps state to actions """
8 |
9 | def __init__(self, state_size, action_size, config):
10 | """ Initialize parameters and build model """
11 | super().__init__()
12 |
13 |
14 | # self.fc1 = nn.Linear(state_size, fc1_units)
15 | # self.fc2 = nn.Linear(fc1_units, fc2_units)
16 | # self.fc3 = nn.Linear(fc2_units, action_size)
17 |
18 | self.config = config
19 | # Retrieve variable from config file
20 | hidden_layers_units = config["architecture"]["hidden_layers_units"]
21 | dropout_proba = config["architecture"]["dropout_proba"]
22 |
23 | # Add the first layer
24 | self.layers = nn.ModuleList([nn.Linear(state_size, hidden_layers_units[0])])
25 |
26 | # Add a variable number of more hidden layers
27 | layer_sizes = zip(hidden_layers_units[:-1], hidden_layers_units[1:])
28 | self.layers.extend([nn.Linear(h1, h2) for h1, h2 in layer_sizes])
29 |
30 | # Add last layer
31 | self.output = nn.Linear(hidden_layers_units[-1], action_size)
32 |
33 | # Dropout
34 | self.dropout = nn.Dropout(p=dropout_proba)
35 |
36 | def forward(self, x):
37 | """ Forward pass """
38 | # x = F.relu(self.fc1(state))
39 | # x = F.relu(self.fc2(x))
40 | # x = self.fc3(x)
41 |
42 | for layer in self.layers:
43 | x = F.relu(layer(x))
44 | if self.config["architecture"]["use_dropout"]:
45 | x = self.dropout(x)
46 |
47 | x = self.output(x)
48 |
49 | return x
50 |
51 | if __name__ == '__main__':
52 | import json
53 | with open("config.json", "r") as f:
54 | config = json.load(f)
55 |
56 | net = QNetwork(state_size=37, action_size=4, config=config)
57 | print("net:", net)
58 |
--------------------------------------------------------------------------------
/p1_navigation/report.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/vmelan/DRLND-udacity/8d2a38d2894b0f692fc6ffbe652eb94440b516de/p1_navigation/report.pdf
--------------------------------------------------------------------------------
/p1_navigation/requirements.txt:
--------------------------------------------------------------------------------
1 | matplotlib
2 | numpy>=1.11.0
3 | jupyter
4 | unityagents==0.4.0
5 | torch==0.4.0
6 | ipykernel
7 |
--------------------------------------------------------------------------------
/p1_navigation/saved/DQN_exp/model_trained_solved.pth:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/vmelan/DRLND-udacity/8d2a38d2894b0f692fc6ffbe652eb94440b516de/p1_navigation/saved/DQN_exp/model_trained_solved.pth
--------------------------------------------------------------------------------
/p1_navigation/utils.py:
--------------------------------------------------------------------------------
1 | import os
2 | import torch
3 |
4 |
5 | def pick_device(config, logger):
6 | """ Pick device """
7 | if config["cuda"] and not torch.cuda.is_available():
8 | logger.warning("Warning: There's no CUDA support on this machine,"
9 | "training is performed on cpu.")
10 | device = torch.device("cpu")
11 | elif not config["cuda"] and torch.cuda.is_available():
12 | logger.info("Training is performed on cpu by user's choice")
13 | device = torch.device("cpu")
14 | elif not config["cuda"] and not torch.cuda.is_available():
15 | logger.info("Training on cpu")
16 | device = torch.device("cpu")
17 | else:
18 | logger.info("Training on gpu")
19 | device = torch.device("cuda:" + str(config["gpu"]))
20 |
21 | return device
22 |
23 | def ensure_dir(path):
24 | if not os.path.exists(path):
25 | os.makedirs(path)
--------------------------------------------------------------------------------
/p2_continuous_control/README.md:
--------------------------------------------------------------------------------
1 | # Project : Continuous Control
2 |
3 | ## Description
4 | For this project, we train a double-jointed arm agent to follow a target location.
5 |
6 |
7 |
8 |
9 |
10 | ## Problem Statement
11 | A reward of +0.1 is provided for each step that the agent's hand is in the goal location.
12 | Thus, the goal of the agent is to maintain its position at
13 | the target location for as many
14 | steps as possible.
15 |
16 | The observation space consists of 33 variables corresponding to position,
17 | rotation, velocity, and angular velocities of the arm.
18 | Each action is a vector with four numbers, corresponding to torque
19 | applicable to two joints. Every
20 | entry in the action vector should be a number between -1 and 1.
21 |
22 | The task is episodic, with 1000 timesteps per episode. In order to solve
23 | the environment, the agent must get an average score of +30 over 100 consecutive
24 | episodes.
25 |
26 | ## Files
27 | - `Continuous_Control.ipynb`: Notebook used to control and train the agent
28 | - `ddpg_agent.py`: Create an Agent class that interacts with and learns from the environment
29 | - `models.py`: Actor and Critic classes
30 | - `config.json`: Configuration file to store variables and paths
31 | - `utils.py`: Helper functions
32 | - `report.pdf`: Technical report
33 |
34 | ## Dependencies
35 | To be able to run this code, you will need an environment with Python 3. The dependencies
36 | are listed in the `requirements.txt` file, and you can install them
37 | using the following command:
38 | ```
39 | pip install -r requirements.txt
40 | ```
41 |
42 | Furthermore, you need to download the environment from one of the links below. You only need to select
43 | the environment that matches your operating system:
44 | - Linux : [link](https://s3-us-west-1.amazonaws.com/udacity-drlnd/P2/Reacher/one_agent/Reacher_Linux.zip)
45 | - MAC OSX : [link](https://s3-us-west-1.amazonaws.com/udacity-drlnd/P2/Reacher/Reacher.app.zip)
46 | - Windows : [link](https://s3-us-west-1.amazonaws.com/udacity-drlnd/P2/Reacher/Reacher_Windows_x86_64.zip)
47 |
48 | ## Running
49 | Run the cells in the notebook `Continuous_Control.ipynb` to train an agent that solves our required
50 | task of moving the double-jointed arm.
--------------------------------------------------------------------------------
/p2_continuous_control/config.json:
--------------------------------------------------------------------------------
1 | {
2 | "exp_name": "DDPG_exp",
3 | "cuda": true,
4 | "gpu": 0,
5 |
6 | "optimizer_actor": {
7 | "optimizer_type": "Adam",
8 | "betas": [0.9, 0.999],
9 | "optimizer_params": {
10 | "lr": 1e-4,
11 | "eps": 1e-7,
12 | "weight_decay": 0
13 | }
14 | },
15 |
16 | "optimizer_critic": {
17 | "optimizer_type": "Adam",
18 | "betas": [0.9, 0.999],
19 | "optimizer_params": {
20 | "lr": 1e-4,
21 | "eps": 1e-7,
22 | "weight_decay": 0
23 | }
24 | },
25 |
26 | "DDPG": {
27 | "gamma": 0.99,
28 | "tau": 0.001,
29 | "buffer_size": 10e6
30 | },
31 |
32 | "architecture": {
33 | "fc1_units": 250,
34 | "fc2_units": 100
35 | },
36 |
37 | "trainer" : {
38 | "num_episodes": 500,
39 | "batch_size": 128,
40 | "save_dir": "./saved/",
41 | "save_freq": 200
42 | }
43 | }
--------------------------------------------------------------------------------
/p2_continuous_control/ddpg_agent.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | import random
3 | import copy
4 | import logging
5 | from collections import namedtuple, deque
6 | from models import Actor, Critic
7 | import torch
8 | import torch.nn.functional as F
9 | import torch.optim as optim
10 | from utils import pick_device
11 |
12 |
13 | class Agent():
14 | """ Agent used to interact with and learns from the environment """
15 |
16 | def __init__(self, state_size, action_size, config):
17 | """ Initialize an agent object """
18 |
19 | self.state_size = state_size
20 | self.action_size = action_size
21 | self.config = config
22 |
23 | # logging for this class
24 | self.logger = logging.getLogger(self.__class__.__name__)
25 |
26 | # gpu support
27 | self.device = pick_device(config, self.logger)
28 |
29 | ## Actor local and target networks
30 | self.actor_local = Actor(state_size, action_size, config).to(self.device)
31 |         self.actor_target = Actor(state_size, action_size, config).to(self.device)
32 |         self.actor_optimizer = getattr(optim, config["optimizer_actor"]["optimizer_type"])(
33 |             self.actor_local.parameters(),
34 |             betas=tuple(config["optimizer_actor"]["betas"]),
35 |             **config["optimizer_actor"]["optimizer_params"])
36 |
37 | ## Critic local and target networks
38 | self.critic_local = Critic(state_size, action_size, config).to(self.device)
39 |         self.critic_target = Critic(state_size, action_size, config).to(self.device)
40 |         self.critic_optimizer = getattr(optim, config["optimizer_critic"]["optimizer_type"])(
41 |             self.critic_local.parameters(),
42 |             betas=tuple(config["optimizer_critic"]["betas"]),
43 |             **config["optimizer_critic"]["optimizer_params"])
44 |
45 | ## Noise process
46 | self.noise = OUNoise(action_size)
47 |
48 | ## Replay memory
49 |         self.memory = ReplayBuffer(
50 | config=config,
51 | action_size=action_size,
52 | buffer_size=int(config["DDPG"]["buffer_size"]),
53 | batch_size=config["trainer"]["batch_size"]
54 | )
55 |
56 |
57 | def step(self, state, action, reward, next_state, done):
58 | """ Save experience in replay memory,
59 | and use random sample from buffer to learn """
60 |
61 | # Save experience in replay memory
62 | self.memory.add(state, action, reward, next_state, done)
63 |
64 | # learn every timestep as long as enough samples are available in memory
65 | if len(self.memory) > self.config["trainer"]["batch_size"]:
66 | experiences = self.memory.sample()
67 | self.learn(experiences, self.config["DDPG"]["gamma"])
68 |
69 |
70 | def act(self, state):
71 | """ Returns actions for given state as per current policy """
72 |
73 | # Convert state to tensor
74 | state = torch.from_numpy(state).float().to(self.device)
75 |
76 | ## Evaluation mode
77 | self.actor_local.eval()
78 | with torch.no_grad():
79 | # Forward pass of local actor network
80 |             action_values = self.actor_local.forward(state).cpu().data.numpy()
81 |
82 | ## Training mode
83 | self.actor_local.train()
84 | # Add noise to improve exploration to our actor policy
85 | action_values += self.noise.sample()
86 | # Clip action to stay in the range [-1, 1] for our task
87 | action_values = np.clip(action_values, -1, 1)
88 |
89 | return action_values
90 |
91 |
92 | def learn(self, experiences, gamma):
93 | """ Update value parameters using given batch of experience tuples """
94 |
95 | states, actions, rewards, next_states, dones = experiences
96 |
97 | ## Update actor (policy) network using the sampled policy gradient
98 | # Compute actor loss
99 | actions_pred = self.actor_local.forward(states)
100 | actor_loss = -self.critic_local.forward(states, actions_pred).mean()
101 | # Minimize the loss
102 | self.actor_optimizer.zero_grad()
103 | actor_loss.backward()
104 | self.actor_optimizer.step()
105 |
106 | ## Update critic (value) network
107 | # Get predicted next-state actions and Q-values from target models
108 | actions_next = self.actor_target.forward(next_states)
109 | Q_targets_next = self.critic_target.forward(next_states, actions_next)
110 | # Compute Q-targets for current states
111 | Q_targets = rewards + (gamma * Q_targets_next * (1 - dones))
112 | # Get expected Q-values from local critic model
113 |         Q_expected = self.critic_local.forward(states, actions)
114 | # Compute loss
115 | critic_loss = F.mse_loss(Q_expected, Q_targets)
116 | # Minimize the loss
117 | self.critic_optimizer.zero_grad()
118 | critic_loss.backward()
119 | self.critic_optimizer.step()
120 |
121 |
122 | ## Update target networks with a soft update
123 |         self.soft_update(self.actor_local, self.actor_target, self.config["DDPG"]["tau"])
124 | self.soft_update(self.critic_local, self.critic_target, self.config["DDPG"]["tau"])
125 |
126 |
127 | def soft_update(self, local_model, target_model, tau):
128 | """ Soft update model parameters,
129 | improves the stability of learning """
130 |
131 |         for target_param, local_param in zip(target_model.parameters(), local_model.parameters()):
132 | target_param.data.copy_(tau*local_param.data + (1.0 - tau)*target_param.data)
133 |
134 |
135 |
136 | class OUNoise():
137 | """ Ornstein-Uhlenbeck process """
138 |
139 | def __init__(self, size, mu=0.0, theta=0.15, sigma=0.2):
140 | """ Initialize parameters and noise process """
141 | self.mu = mu * np.ones(size)
142 | self.theta = theta
143 | self.sigma = sigma
144 | self.reset()
145 |
146 | def reset(self):
147 | """ Reset the interal state (= noise) to mean (mu). """
148 | self.state = copy.copy(self.mu)
149 |
150 | def sample(self):
151 | """ Update internal state and return it as a noise sample """
152 | x = self.state
153 | dx = self.theta * (self.mu - x) + self.sigma * np.array([random.random() for i in range(len(x))])
154 | self.state = x + dx
155 |
156 | return self.state
157 |
158 |
159 |
160 |
161 | class ReplayBuffer():
162 | """ Fixed-size buffer to store experience tuples """
163 |
164 | def __init__(self, config, action_size, buffer_size, batch_size):
165 | """ Initialize a ReplayBuffer object """
166 |
167 | self.config = config
168 | self.action_size = action_size
169 | self.memory = deque(maxlen=buffer_size)
170 | self.batch_size = batch_size
171 | self.experience = namedtuple("Experience",
172 | field_names=["state", "action", "reward", "next_state", "done"])
173 |
174 | # logging for this class
175 | self.logger = logging.getLogger(self.__class__.__name__)
176 |
177 | # gpu support
178 | self.device = pick_device(config, self.logger)
179 |
180 |
181 | def add(self, state, action, reward, next_state, done):
182 | """ Add a new experience to memory """
183 | e = self.experience(state, action, reward, next_state, done)
184 | self.memory.append(e)
185 |
186 |
187 | def sample(self):
188 | """ Randomly sample a batch of experiences from memory """
189 | experiences = random.sample(self.memory, k=self.batch_size)
190 |
191 | states = torch.from_numpy(
192 | np.vstack([e.state for e in experiences if e is not None])
193 | ).float().to(self.device)
194 | actions = torch.from_numpy(
195 | np.vstack([e.action for e in experiences if e is not None])
196 | ).float().to(self.device)
197 | rewards = torch.from_numpy(
198 |             np.vstack([e.reward for e in experiences if e is not None])
199 | ).float().to(self.device)
200 | next_states = torch.from_numpy(
201 | np.vstack([e.next_state for e in experiences if e is not None])
202 | ).float().to(self.device)
203 | dones = torch.from_numpy(
204 |             np.vstack([e.done for e in experiences if e is not None]).astype(np.uint8)
205 | ).float().to(self.device)
206 |
207 | return (states, actions, rewards, next_states, dones)
208 |
209 |
210 | def __len__(self):
211 | """ Return the current size of internal memory """
212 | return len(self.memory)
--------------------------------------------------------------------------------
/p2_continuous_control/images/reacher_gif.gif:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/vmelan/DRLND-udacity/8d2a38d2894b0f692fc6ffbe652eb94440b516de/p2_continuous_control/images/reacher_gif.gif
--------------------------------------------------------------------------------
/p2_continuous_control/models.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | import torch
3 | import torch.nn as nn
4 | import torch.nn.functional as F
5 |
6 |
7 | class Actor(nn.Module):
8 | """ Actor (Policy) model """
9 |
10 | def __init__(self, state_size, action_size, config):
11 | """ Initalize parameters and build model """
12 |
13 | super(Actor, self).__init__()
14 | fc1_units = config["architecture"]["fc1_units"]
15 | fc2_units = config["architecture"]["fc2_units"]
16 |
17 |         self.fc1 = nn.Linear(in_features=state_size, out_features=fc1_units)
18 |         self.fc2 = nn.Linear(in_features=fc1_units, out_features=fc2_units)
19 |         self.fc3 = nn.Linear(in_features=fc2_units, out_features=action_size)
20 | # weights initialization
21 | for m in self.modules():
22 | if isinstance(m, nn.Linear):
23 | # FC layers have weights initialized with Glorot
24 |                 nn.init.xavier_uniform_(m.weight, gain=1)
25 |
26 | def forward(self, state):
27 | """ Build an actor (policy) network that maps states to actions """
28 | x = F.relu(self.fc1(state))
29 | x = F.relu(self.fc2(x))
30 | x = F.tanh(self.fc3(x)) # outputs are in the range [-1, 1]
31 |
32 | return x
33 |
34 |
35 | class Critic(nn.Module):
36 | """ Critic (Value) Model """
37 |
38 | def __init__(self, state_size, action_size, config):
39 | """ Initialize parameters and build model """
40 | super(Critic, self).__init__()
41 |
42 | fc1_units = config["architecture"]["fc1_units"]
43 | fc2_units = config["architecture"]["fc2_units"]
44 |
45 | self.fc1 = nn.Linear(in_features=state_size, out_features=fc1_units)
46 | self.fc2 = nn.Linear(in_features=fc1_units + action_size,
47 | out_features=fc2_units)
48 |         self.fc3 = nn.Linear(in_features=fc2_units, out_features=1)
49 |
50 | # weights initialization
51 | for m in self.modules():
52 | if isinstance(m, nn.Linear):
53 | # FC layers have weights initialized with Glorot
54 |                 nn.init.xavier_uniform_(m.weight, gain=1)
55 |
56 | def forward(self, state, action):
57 | """ Build a critic (value) network that maps
58 | (state, action) pairs -> Q-values """
59 | x = F.relu(self.fc1(state))
60 | x = F.relu(self.fc2(torch.cat([x, action], dim=1))) # add action too for the mapping
61 | x = F.relu(self.fc3(x))
62 |
63 | return x
64 |
65 |
--------------------------------------------------------------------------------
/p2_continuous_control/report.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/vmelan/DRLND-udacity/8d2a38d2894b0f692fc6ffbe652eb94440b516de/p2_continuous_control/report.pdf
--------------------------------------------------------------------------------
/p2_continuous_control/requirements.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/vmelan/DRLND-udacity/8d2a38d2894b0f692fc6ffbe652eb94440b516de/p2_continuous_control/requirements.txt
--------------------------------------------------------------------------------
/p2_continuous_control/saved/DDPG_exp/checkpoint_actor_solved.pth:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/vmelan/DRLND-udacity/8d2a38d2894b0f692fc6ffbe652eb94440b516de/p2_continuous_control/saved/DDPG_exp/checkpoint_actor_solved.pth
--------------------------------------------------------------------------------
/p2_continuous_control/saved/DDPG_exp/checkpoint_critic_solved.pth:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/vmelan/DRLND-udacity/8d2a38d2894b0f692fc6ffbe652eb94440b516de/p2_continuous_control/saved/DDPG_exp/checkpoint_critic_solved.pth
--------------------------------------------------------------------------------
/p3_collab_compet/DDPGAgents.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | import logging
3 | from models import Actor, Critic
4 | from ReplayBuffer import ReplayBuffer
5 | from OUNoise import OUNoise
6 | import torch
7 | import torch.nn.functional as F
8 | import torch.optim as optim
9 | from utils import pick_device
10 |
11 | import pdb
12 |
13 | class DDPGAgents():
14 | """ Agent used to interact with and learns from the environment """
15 |
16 | def __init__(self, state_size, action_size, config):
17 | """ Initialize an agent object """
18 |
19 | self.state_size = state_size
20 | self.action_size = action_size
21 | self.config = config
22 |
23 | # retrieve number of agents
24 | self.num_agents = config["DDPG"]["num_agents"]
25 |
26 | # logging for this class
27 | self.logger = logging.getLogger(self.__class__.__name__)
28 |
29 | # gpu support
30 | self.device = pick_device(config, self.logger)
31 |
32 | ## Actor local and target networks
33 | self.actor_local = Actor(state_size, action_size, config).to(self.device)
34 | self.actor_target = Actor(state_size, action_size, config).to(self.device)
35 | self.actor_optimizer = getattr(optim, config["optimizer_actor"]["optimizer_type"])(
36 | self.actor_local.parameters(),
37 | betas=tuple(config["optimizer_actor"]["betas"]),
38 | **config["optimizer_actor"]["optimizer_params"])
39 |
40 | ## Critic local and target networks
41 | self.critic_local = Critic(state_size, action_size, config).to(self.device)
42 | self.critic_target = Critic(state_size, action_size, config).to(self.device)
43 | self.critic_optimizer = getattr(optim, config["optimizer_critic"]["optimizer_type"])(
44 | self.critic_local.parameters(),
45 | betas=tuple(config["optimizer_critic"]["betas"]),
46 | **config["optimizer_critic"]["optimizer_params"])
47 |
48 | ## Noise process
49 | self.noise = OUNoise((self.num_agents, action_size))
50 |
51 | ## Replay memory
52 | self.memory = ReplayBuffer(
53 | config=config,
54 | action_size=action_size,
55 | buffer_size=int(config["DDPG"]["buffer_size"]),
56 | batch_size=config["trainer"]["batch_size"]
57 | )
58 |
59 |
60 | def step(self, state, action, reward, next_state, done):
61 | """ Save experience in replay memory,
62 | and use random sample from buffer to learn """
63 |
64 | # Save experience in replay memory shared by all agents
65 | for agent in range(self.num_agents):
66 | self.memory.add(state[agent, :],
67 | action[agent, :],
68 | reward[agent],
69 | next_state[agent, :],
70 | done[agent]
71 | )
72 |
73 | # learn every timestep as long as enough samples are available in memory
74 | if len(self.memory) > self.config["trainer"]["batch_size"]:
75 | experiences = self.memory.sample()
76 | self.learn(experiences, self.config["DDPG"]["gamma"])
77 |
78 |
79 | def act(self, states, add_noise=False):
80 | """ Returns actions for given state as per current policy """
81 |
82 |         # Convert states to tensor
83 | states = torch.from_numpy(states).float().to(self.device)
84 |
85 | # prepare actions numpy array for all agents
86 | actions = np.zeros((self.num_agents, self.action_size))
87 |
88 | ## Evaluation mode
89 | self.actor_local.eval()
90 | with torch.no_grad():
91 | # Forward pass of local actor network
92 | for agent, state in enumerate(states):
93 | action_values = self.actor_local.forward(state).cpu().data.numpy()
94 | actions[agent, :] = action_values
95 |
96 | # pdb.set_trace()
97 | ## Training mode
98 | self.actor_local.train()
99 | if add_noise:
100 | # Add noise to improve exploration to our actor policy
101 | # action_values += torch.from_numpy(self.noise.sample()).type(torch.FloatTensor).to(self.device)
102 | actions += self.noise.sample()
103 | # Clip action to stay in the range [-1, 1] for our task
104 | actions = np.clip(actions, -1, 1)
105 |
106 | return actions
107 |
108 |
109 | def learn(self, experiences, gamma):
110 | """ Update value parameters using given batch of experience tuples """
111 |
112 | states, actions, rewards, next_states, dones = experiences
113 |
114 | ## Update actor (policy) network using the sampled policy gradient
115 | # Compute actor loss
116 | actions_pred = self.actor_local.forward(states)
117 | actor_loss = -self.critic_local.forward(states, actions_pred).mean()
118 | # Minimize the loss
119 | self.actor_optimizer.zero_grad()
120 | actor_loss.backward()
121 | self.actor_optimizer.step()
122 |
123 | ## Update critic (value) network
124 | # Get predicted next-state actions and Q-values from target models
125 | actions_next = self.actor_target.forward(next_states)
126 | Q_targets_next = self.critic_target.forward(next_states, actions_next)
127 | # Compute Q-targets for current states
128 | Q_targets = rewards + (gamma * Q_targets_next * (1 - dones))
129 | # Get expected Q-values from local critic model
130 | Q_expected = self.critic_local.forward(states, actions)
131 | # Compute loss
132 | critic_loss = F.mse_loss(Q_expected, Q_targets)
133 | # Minimize the loss
134 | self.critic_optimizer.zero_grad()
135 | critic_loss.backward()
136 | self.critic_optimizer.step()
137 |
138 |
139 | ## Update target networks with a soft update
140 | self.soft_update(self.actor_local, self.actor_target, self.config["DDPG"]["tau"])
141 | self.soft_update(self.critic_local, self.critic_target, self.config["DDPG"]["tau"])
142 |
143 |
144 | def soft_update(self, local_model, target_model, tau):
145 | """ Soft update model parameters,
146 | improves the stability of learning """
147 |
148 | for target_param, local_param in zip(target_model.parameters(), local_model.parameters()):
149 | target_param.data.copy_(tau*local_param.data + (1.0 - tau)*target_param.data)
150 |
151 |
152 | def reset(self):
153 | """ Reset noise """
154 | self.noise.reset()
155 |
156 |
157 |
158 |
159 |
160 |
161 |
--------------------------------------------------------------------------------
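
A note on the update rule in `soft_update` above: each target parameter is nudged toward its local counterpart with interpolation factor `tau`, i.e. `theta_target <- tau * theta_local + (1 - tau) * theta_target`. A minimal, standalone sketch (using made-up one-weight layers rather than the project's Actor/Critic networks) to verify the rule:

```python
import torch
import torch.nn as nn

tau = 0.001  # same value as config["DDPG"]["tau"]

# Two tiny stand-in "networks" with a single weight each (illustrative values only)
local = nn.Linear(1, 1, bias=False)
target = nn.Linear(1, 1, bias=False)
with torch.no_grad():
    local.weight.fill_(1.0)
    target.weight.fill_(0.0)

# Same update as DDPGAgents.soft_update: theta_target <- tau*theta_local + (1 - tau)*theta_target
for target_param, local_param in zip(target.parameters(), local.parameters()):
    target_param.data.copy_(tau * local_param.data + (1.0 - tau) * target_param.data)

print(target.weight.item())  # ~0.001: the target network drifts slowly toward the local network
```
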
/p3_collab_compet/OUNoise.py:
--------------------------------------------------------------------------------
1 | import random
2 | import numpy as np
3 | import copy
4 |
5 | class OUNoise():
6 | """ Ornstein-Uhlenbeck process """
7 |
8 | def __init__(self, size, mu=0.0, theta=0.15, sigma=0.2):
9 | """ Initialize parameters and noise process """
10 | self.mu = mu * np.ones(size)
11 | self.theta = theta
12 | self.sigma = sigma
13 | self.reset()
14 |
15 | def reset(self):
16 | """ Reset the internal state (= noise) to mean (mu). """
17 | self.state = copy.copy(self.mu)
18 |
19 | def sample(self):
20 | """ Update internal state and return it as a noise sample """
21 | x = self.state
22 | 		dx = self.theta * (self.mu - x) + self.sigma * np.random.standard_normal(self.mu.shape)
23 | self.state = x + dx
24 |
25 | return self.state
--------------------------------------------------------------------------------
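
A minimal sketch of how the noise process above is consumed by the agent (the shapes mirror the two-agent Tennis setup, and the zero `actions` array is a stand-in for the actor's output):

```python
import numpy as np
from OUNoise import OUNoise  # the file above

num_agents, action_size = 2, 2
noise = OUNoise((num_agents, action_size))     # one correlated noise value per agent and action dim
actions = np.zeros((num_agents, action_size))  # stand-in for actions produced by the actor network

for _ in range(3):
    noisy_actions = np.clip(actions + noise.sample(), -1, 1)  # keep actions in [-1, 1]
    print(noisy_actions)

noise.reset()  # reset the process to its mean at the start of each episode
```
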
/p3_collab_compet/README.md:
--------------------------------------------------------------------------------
1 | # Project: Collaboration and Competition
2 |
3 | ## Description
4 | For this project, we train a pair of agents to play tennis.
5 |
6 |
7 |
8 |
9 |
10 | ## Problem Statement
11 | A reward of +0.1 is provided each time an agent hits the ball over the net.
12 | A reward of -0.01 is provided if an agent lets the ball hit the ground or hits the ball out of bounds.
13 | Thus, the goal of each agent is to keep the ball in play.
14 |
15 | The observation space consists of 24 variables corresponding to the position and velocity of the ball and racket. Each agent receives its own, local observation. Two continuous actions are available, corresponding to movement toward (or away from) the net, and jumping.
16 |
17 | The task is episodic. After each episode, the maximum of the two agents' scores is recorded. In order to solve
18 | the environment, this maximum score must average at least +0.5 over 100 consecutive
19 | episodes.
20 |
21 | ## Files
22 | - `Tennis.ipynb`: Notebook used to control and train the agent
23 | - `DDPGAgents.py`: Defines the DDPGAgents class that interacts with and learns from the environment
24 | - `ReplayBuffer.py`: Replay Buffer class to store the experiences
25 | - `OUNoise.py`: Ornstein-Uhlenbeck noise for the actor to improve exploration
26 | - `model.py`: Actor and Critic classes
27 | - `config.json`: Configuration file to store variables and paths
28 | - `utils.py`: Helper functions
29 | - `report.pdf`: Technical report
30 |
31 | ## Dependencies
32 | To be able to run this code, you will need an environment with Python 3. The
33 | dependencies are listed in the `requirements.txt` file, and you can install them
34 | using the following command:
35 | ```
36 | pip install -r requirements.txt
37 | ```
38 |
39 | Furthermore, you need to download the environment from one of the links below. You need only select
40 | the environment that matches your operating system:
41 | - Linux : [link](https://s3-us-west-1.amazonaws.com/udacity-drlnd/P3/Tennis/Tennis_Linux.zip)
42 | - MAC OSX : [link](https://s3-us-west-1.amazonaws.com/udacity-drlnd/P3/Tennis/Tennis.app.zip)
43 | - Windows : [link](https://s3-us-west-1.amazonaws.com/udacity-drlnd/P3/Tennis/Tennis_Windows_x86_64.zip)
44 |
45 | ## Running
46 | Run the cells in the notebook `Tennis.ipynb` to train the agents that solve our required
47 | task of keeping the ball in play in the Tennis environment.
--------------------------------------------------------------------------------
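
To make the solve criterion above concrete, here is a minimal sketch of the bookkeeping (the per-agent episode scores are made up): after each episode the maximum of the two agents' scores is recorded, and the environment counts as solved once the 100-episode rolling average of that value reaches +0.5.

```python
import numpy as np
from collections import deque

scores_window = deque(maxlen=100)  # rolling window over the last 100 episode scores

# made-up per-episode scores: (agent_0, agent_1)
episode_scores = [(0.0, 0.1), (0.5, 0.4), (0.6, 0.7)]

for agent_scores in episode_scores:
    scores_window.append(np.max(agent_scores))  # keep the better of the two agents

solved = len(scores_window) >= 100 and np.mean(scores_window) >= 0.5
print(round(np.mean(scores_window), 3), solved)
```
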
/p3_collab_compet/ReplayBuffer.py:
--------------------------------------------------------------------------------
1 | import logging
2 | import torch
3 | from collections import namedtuple, deque
4 | import random
5 | from utils import pick_device
6 | import numpy as np
7 |
8 | class ReplayBuffer():
9 | """ Fixed-size buffer to store experience tuples """
10 |
11 | def __init__(self, config, action_size, buffer_size, batch_size):
12 | """ Initialize a ReplayBuffer object """
13 |
14 | self.config = config
15 | self.action_size = action_size
16 | self.memory = deque(maxlen=buffer_size)
17 | self.batch_size = batch_size
18 | self.experience = namedtuple("Experience",
19 | field_names=["state", "action", "reward", "next_state", "done"])
20 |
21 | # logging for this class
22 | self.logger = logging.getLogger(self.__class__.__name__)
23 |
24 | # gpu support
25 | self.device = pick_device(config, self.logger)
26 |
27 |
28 | def add(self, state, action, reward, next_state, done):
29 | """ Add a new experience to memory """
30 | e = self.experience(state, action, reward, next_state, done)
31 | self.memory.append(e)
32 |
33 |
34 | def sample(self):
35 | """ Randomly sample a batch of experiences from memory """
36 | experiences = random.sample(self.memory, k=self.batch_size)
37 |
38 | states = torch.from_numpy(
39 | np.vstack([e.state for e in experiences if e is not None])
40 | ).float().to(self.device)
41 | actions = torch.from_numpy(
42 | np.vstack([e.action for e in experiences if e is not None])
43 | ).float().to(self.device)
44 | rewards = torch.from_numpy(
45 | np.vstack([e.reward for e in experiences if e is not None])
46 | ).float().to(self.device)
47 | next_states = torch.from_numpy(
48 | np.vstack([e.next_state for e in experiences if e is not None])
49 | ).float().to(self.device)
50 | dones = torch.from_numpy(
51 | np.vstack([e.done for e in experiences if e is not None]).astype(np.uint8)
52 | ).float().to(self.device)
53 |
54 | return (states, actions, rewards, next_states, dones)
55 |
56 |
57 | def __len__(self):
58 | """ Return the current size of internal memory """
59 | return len(self.memory)
--------------------------------------------------------------------------------
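
A minimal, CPU-only sketch of filling and sampling the buffer above (the dummy transitions and the reduced `config` dictionary are made up for illustration; only the `cuda`/`gpu` keys needed by `pick_device` are supplied):

```python
import logging
import numpy as np
from ReplayBuffer import ReplayBuffer  # the file above

logging.basicConfig(level=logging.INFO)
config = {"cuda": False, "gpu": 0}  # force CPU for this sketch

state_size, action_size = 24, 2
buffer = ReplayBuffer(config=config, action_size=action_size,
                      buffer_size=int(1e5), batch_size=4)

# store a handful of dummy transitions, one per call to add()
for _ in range(10):
    state = np.random.randn(state_size)
    action = np.random.uniform(-1, 1, action_size)
    next_state = np.random.randn(state_size)
    buffer.add(state, action, 0.0, next_state, False)

states, actions, rewards, next_states, dones = buffer.sample()
print(states.shape, actions.shape, rewards.shape)  # torch.Size([4, 24]) torch.Size([4, 2]) torch.Size([4, 1])
```
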
/p3_collab_compet/Tennis.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "# Collaboration and Competition\n",
8 | "\n",
9 | "---\n",
10 | "\n",
11 | "You are welcome to use this coding environment to train your agent for the project. Follow the instructions below to get started!\n",
12 | "\n",
13 | "### 1. Start the Environment\n",
14 | "\n",
15 | "Run the next code cell to install a few packages. This line will take a few minutes to run!"
16 | ]
17 | },
18 | {
19 | "cell_type": "code",
20 | "execution_count": 1,
21 | "metadata": {},
22 | "outputs": [],
23 | "source": [
24 | "!pip -q install ./python"
25 | ]
26 | },
27 | {
28 | "cell_type": "markdown",
29 | "metadata": {},
30 | "source": [
31 | "The environment is already saved in the Workspace and can be accessed at the file path provided below. "
32 | ]
33 | },
34 | {
35 | "cell_type": "code",
36 | "execution_count": 2,
37 | "metadata": {},
38 | "outputs": [
39 | {
40 | "name": "stderr",
41 | "output_type": "stream",
42 | "text": [
43 | "INFO:unityagents:\n",
44 | "'Academy' started successfully!\n",
45 | "Unity Academy name: Academy\n",
46 | " Number of Brains: 1\n",
47 | " Number of External Brains : 1\n",
48 | " Lesson number : 0\n",
49 | " Reset Parameters :\n",
50 | "\t\t\n",
51 | "Unity brain name: TennisBrain\n",
52 | " Number of Visual Observations (per agent): 0\n",
53 | " Vector Observation space type: continuous\n",
54 | " Vector Observation space size (per agent): 8\n",
55 | " Number of stacked Vector Observation: 3\n",
56 | " Vector Action space type: continuous\n",
57 | " Vector Action space size (per agent): 2\n",
58 | " Vector Action descriptions: , \n"
59 | ]
60 | }
61 | ],
62 | "source": [
63 | "from unityagents import UnityEnvironment\n",
64 | "import numpy as np\n",
65 | "\n",
66 | "env = UnityEnvironment(file_name=\"/data/Tennis_Linux_NoVis/Tennis\")"
67 | ]
68 | },
69 | {
70 | "cell_type": "markdown",
71 | "metadata": {},
72 | "source": [
73 | "Environments contain **_brains_** which are responsible for deciding the actions of their associated agents. Here we check for the first brain available, and set it as the default brain we will be controlling from Python."
74 | ]
75 | },
76 | {
77 | "cell_type": "code",
78 | "execution_count": 3,
79 | "metadata": {},
80 | "outputs": [],
81 | "source": [
82 | "# get the default brain\n",
83 | "brain_name = env.brain_names[0]\n",
84 | "brain = env.brains[brain_name]"
85 | ]
86 | },
87 | {
88 | "cell_type": "code",
89 | "execution_count": 4,
90 | "metadata": {},
91 | "outputs": [
92 | {
93 | "data": {
94 | "text/plain": [
95 | "('TennisBrain', )"
96 | ]
97 | },
98 | "execution_count": 4,
99 | "metadata": {},
100 | "output_type": "execute_result"
101 | }
102 | ],
103 | "source": [
104 | "brain_name, brain"
105 | ]
106 | },
107 | {
108 | "cell_type": "markdown",
109 | "metadata": {},
110 | "source": [
111 | "### 2. Examine the State and Action Spaces\n",
112 | "\n",
113 | "Run the code cell below to print some information about the environment."
114 | ]
115 | },
116 | {
117 | "cell_type": "code",
118 | "execution_count": 5,
119 | "metadata": {},
120 | "outputs": [
121 | {
122 | "name": "stdout",
123 | "output_type": "stream",
124 | "text": [
125 | "Number of agents: 2\n",
126 | "Size of each action: 2\n",
127 | "There are 2 agents. Each observes a state with length: 24\n",
128 | "The state for the first agent looks like: \n",
129 | " [ 0. 0. 0. 0. 0. 0. 0.\n",
130 | " 0. 0. 0. 0. 0. 0. 0.\n",
131 | " 0. 0. -6.65278625 -1.5 -0. 0.\n",
132 | " 6.83172083 6. -0. 0. ]\n"
133 | ]
134 | }
135 | ],
136 | "source": [
137 | "# reset the environment\n",
138 | "env_info = env.reset(train_mode=True)[brain_name]\n",
139 | "\n",
140 | "# number of agents \n",
141 | "num_agents = len(env_info.agents)\n",
142 | "print('Number of agents:', num_agents)\n",
143 | "\n",
144 | "# size of each action\n",
145 | "action_size = brain.vector_action_space_size\n",
146 | "print('Size of each action:', action_size)\n",
147 | "\n",
148 | "# examine the state space \n",
149 | "states = env_info.vector_observations\n",
150 | "state_size = states.shape[1]\n",
151 | "print('There are {} agents. Each observes a state with length: {}'.format(states.shape[0], state_size))\n",
152 | "print('The state for the first agent looks like: \\n', states[0])"
153 | ]
154 | },
155 | {
156 | "cell_type": "markdown",
157 | "metadata": {},
158 | "source": [
159 | "### 3. Take Random Actions in the Environment\n",
160 | "\n",
161 | "In the next code cell, you will learn how to use the Python API to control the agent and receive feedback from the environment.\n",
162 | "\n",
163 | "Note that **in this coding environment, you will not be able to watch the agents while they are training**, and you should set `train_mode=True` to restart the environment."
164 | ]
165 | },
166 | {
167 | "cell_type": "code",
168 | "execution_count": 6,
169 | "metadata": {
170 | "scrolled": false
171 | },
172 | "outputs": [],
173 | "source": [
174 | "# for i in range(5): # play game for 5 episodes\n",
175 | "# env_info = env.reset(train_mode=False)[brain_name] # reset the environment \n",
176 | "# states = env_info.vector_observations # get the current state (for each agent)\n",
177 | "# scores = np.zeros(num_agents) # initialize the score (for each agent)\n",
178 | "# while True:\n",
179 | "# actions = np.random.randn(num_agents, action_size) # select an action (for each agent)\n",
180 | "# actions = np.clip(actions, -1, 1) # all actions between -1 and 1\n",
181 | "# env_info = env.step(actions)[brain_name] # send all actions to tne environment\n",
182 | "# next_states = env_info.vector_observations # get next state (for each agent)\n",
183 | "# rewards = env_info.rewards # get reward (for each agent)\n",
184 | "# dones = env_info.local_done # see if episode finished\n",
185 | "# scores += env_info.rewards # update the score (for each agent)\n",
186 | "# states = next_states # roll over states to next time step\n",
187 | "# if np.any(dones): # exit loop if episode finished\n",
188 | "# break\n",
189 | "# print('Total score (averaged over agents) this episode: {}'.format(np.mean(scores)))"
190 | ]
191 | },
192 | {
193 | "cell_type": "markdown",
194 | "metadata": {},
195 | "source": [
196 | "When finished, you can close the environment."
197 | ]
198 | },
199 | {
200 | "cell_type": "code",
201 | "execution_count": 7,
202 | "metadata": {},
203 | "outputs": [],
204 | "source": [
205 | "# env.close()"
206 | ]
207 | },
208 | {
209 | "cell_type": "markdown",
210 | "metadata": {},
211 | "source": [
212 | "### 4. It's Your Turn!\n",
213 | "\n",
214 | "Now it's your turn to train your own agent to solve the environment! A few **important notes**:\n",
215 | "- When training the environment, set `train_mode=True`, so that the line for resetting the environment looks like the following:\n",
216 | "```python\n",
217 | "env_info = env.reset(train_mode=True)[brain_name]\n",
218 | "```\n",
219 | "- To structure your work, you're welcome to work directly in this Jupyter notebook, or you might like to start over with a new file! You can see the list of files in the workspace by clicking on **_Jupyter_** in the top left corner of the notebook.\n",
220 | "- In this coding environment, you will not be able to watch the agents while they are training. However, **_after training the agents_**, you can download the saved model weights to watch the agents on your own machine! "
221 | ]
222 | },
223 | {
224 | "cell_type": "code",
225 | "execution_count": 8,
226 | "metadata": {},
227 | "outputs": [],
228 | "source": [
229 | "## code to keep session awake in Udacity workspace\n",
230 | "import signal\n",
231 | "\n",
232 | "from contextlib import contextmanager\n",
233 | "\n",
234 | "import requests\n",
235 | "\n",
236 | "\n",
237 | "DELAY = INTERVAL = 4 * 60 # interval time in seconds\n",
238 | "MIN_DELAY = MIN_INTERVAL = 2 * 60\n",
239 | "KEEPALIVE_URL = \"https://nebula.udacity.com/api/v1/remote/keep-alive\"\n",
240 | "TOKEN_URL = \"http://metadata.google.internal/computeMetadata/v1/instance/attributes/keep_alive_token\"\n",
241 | "TOKEN_HEADERS = {\"Metadata-Flavor\":\"Google\"}\n",
242 | "\n",
243 | "\n",
244 | "def _request_handler(headers):\n",
245 | " def _handler(signum, frame):\n",
246 | " requests.request(\"POST\", KEEPALIVE_URL, headers=headers)\n",
247 | " return _handler\n",
248 | "\n",
249 | "\n",
250 | "@contextmanager\n",
251 | "def active_session(delay=DELAY, interval=INTERVAL):\n",
252 | " \"\"\"\n",
253 | " Example:\n",
254 | "\n",
255 | "    from workspace_utils import active_session\n",
256 | "\n",
257 | " with active_session():\n",
258 | " # do long-running work here\n",
259 | " \"\"\"\n",
260 | " token = requests.request(\"GET\", TOKEN_URL, headers=TOKEN_HEADERS).text\n",
261 | " headers = {'Authorization': \"STAR \" + token}\n",
262 | " delay = max(delay, MIN_DELAY)\n",
263 | " interval = max(interval, MIN_INTERVAL)\n",
264 | " original_handler = signal.getsignal(signal.SIGALRM)\n",
265 | " try:\n",
266 | " signal.signal(signal.SIGALRM, _request_handler(headers))\n",
267 | " signal.setitimer(signal.ITIMER_REAL, delay, interval)\n",
268 | " yield\n",
269 | " finally:\n",
270 | " signal.signal(signal.SIGALRM, original_handler)\n",
271 | " signal.setitimer(signal.ITIMER_REAL, 0)\n",
272 | "\n",
273 | "\n",
274 | "def keep_awake(iterable, delay=DELAY, interval=INTERVAL):\n",
275 | " \"\"\"\n",
276 | " Example:\n",
277 | "\n",
278 | " from workspace_utils import keep_awake\n",
279 | "\n",
280 | " for i in keep_awake(range(5)):\n",
281 | " # do iteration with lots of work here\n",
282 | " \"\"\"\n",
283 | " with active_session(delay, interval): yield from iterable"
284 | ]
285 | },
286 | {
287 | "cell_type": "code",
288 | "execution_count": 9,
289 | "metadata": {},
290 | "outputs": [],
291 | "source": [
292 | "## Watch changes and reload automatically\n",
293 | "% load_ext autoreload\n",
294 | "% autoreload 2"
295 | ]
296 | },
297 | {
298 | "cell_type": "code",
299 | "execution_count": 10,
300 | "metadata": {},
301 | "outputs": [],
302 | "source": [
303 | "import pdb\n",
304 | "import json\n",
305 | "import numpy as np \n",
306 | "import torch \n",
307 | "from collections import deque\n",
308 | "from DDPGAgents import DDPGAgents\n",
309 | "from utils import ensure_dir\n",
310 | "import matplotlib.pyplot as plt\n",
311 | "\n",
312 | "import logging\n",
313 | "logging.basicConfig(level=logging.INFO, format='')\n",
314 | "\n",
315 | "with open(\"config.json\", \"r\") as f: \n",
316 | " config = json.load(f)"
317 | ]
318 | },
319 | {
320 | "cell_type": "code",
321 | "execution_count": 11,
322 | "metadata": {},
323 | "outputs": [
324 | {
325 | "name": "stderr",
326 | "output_type": "stream",
327 | "text": [
328 | "INFO:DDPGAgents:Training on gpu\n",
329 | "INFO:ReplayBuffer:Training on gpu\n"
330 | ]
331 | },
332 | {
333 | "name": "stdout",
334 | "output_type": "stream",
335 | "text": [
336 | "Episode 100\tAverage Score: 0.000\n",
337 | "Episode 200\tAverage Score: 0.000\n",
338 | "Episode 300\tAverage Score: 0.000\n",
339 | "Episode 400\tAverage Score: 0.000\n",
340 | "Episode 500\tAverage Score: 0.000\n",
341 | "Episode 600\tAverage Score: 0.002\n",
342 | "Episode 700\tAverage Score: 0.000\n",
343 | "Episode 800\tAverage Score: 0.000\n",
344 | "Episode 900\tAverage Score: 0.000\n",
345 | "Episode 1000\tAverage Score: 0.000\n",
346 | "Episode 1100\tAverage Score: 0.011\n",
347 | "Episode 1200\tAverage Score: 0.019\n",
348 | "Episode 1300\tAverage Score: 0.006\n",
349 | "Episode 1400\tAverage Score: 0.029\n",
350 | "Episode 1500\tAverage Score: 0.060\n",
351 | "Episode 1600\tAverage Score: 0.235\n",
352 | "Episode 1700\tAverage Score: 0.133\n",
353 | "Episode 1800\tAverage Score: 0.168\n",
354 | "Episode 1900\tAverage Score: 0.324\n",
355 | "Episode 1909\tAverage Score: 0.503\n",
356 | "Environment solved in 1809 episodes!\tAverage Score: 0.503\n"
357 | ]
358 | }
359 | ],
360 | "source": [
361 | "agent = DDPGAgents(state_size=24, action_size=2, config=config)\n",
362 | "brain_name = env.brain_names[0]\n",
363 | "\n",
364 | "def ddpg(agent, \n",
365 | " brain_name, \n",
366 | " config, \n",
367 | " n_episodes=config[\"trainer\"][\"num_episodes\"]\n",
368 | " ):\n",
369 | " \"\"\" Deep Deterministic Policy Gradient \"\"\"\n",
370 | " \n",
371 | " # Set logger for this function\n",
372 | " logger = logging.getLogger(\"ddpg\")\n",
373 | " \n",
374 | " # number of agents\n",
375 | " num_agents = config[\"DDPG\"][\"num_agents\"]\n",
376 | " \n",
377 | " max_t = 1000\n",
378 | " \n",
379 | " flag = False # When environment is technically solved\n",
380 | " # Save path \n",
381 | " save_path = config[\"trainer\"][\"save_dir\"] + config[\"exp_name\"] + \"/\"\n",
382 | " ensure_dir(save_path)\n",
383 | " scores = [] # list containing scores from each episodes \n",
384 | " scores_window = deque(maxlen=100)\n",
385 | " \n",
386 | " for i_episode in keep_awake(range(1, n_episodes + 1)):\n",
387 | " # reset the environment\n",
388 | " env_info = env.reset(train_mode=True)[brain_name]\n",
389 | " \n",
390 | " # reset noise\n",
391 | " agent.reset()\n",
392 | " \n",
393 | " # get the current state\n",
394 | " state = env_info.vector_observations\n",
395 | "\n",
396 | " # score of the agents\n",
397 | " score = np.zeros(num_agents)\n",
398 | " \n",
399 | " for t in range(max_t):\n",
400 | " # choose actions\n",
401 | " action = agent.act(state)\n",
402 | " # send the actions to the environment \n",
403 | " env_info = env.step(action)[brain_name]\n",
404 | " # get the next state\n",
405 | " next_state = env_info.vector_observations\n",
406 | " # get the rewards\n",
407 | " rewards = env_info.rewards\n",
408 | " # see if episode has finished\n",
409 | " dones = env_info.local_done\n",
410 | " # step \n",
411 | " agent.step(state, action, rewards, next_state, dones)\n",
412 | " # accumulate rewards into score variable\n",
413 | " score += rewards\n",
414 | " # get next_state and set it to state\n",
415 | " state = next_state\n",
416 | " \n",
417 | " if any(dones): \n",
418 | " break\n",
419 | " \n",
420 | "        # save most recent scores (max over the two agents)\n",
421 | " scores.append(np.max(score))\n",
422 | " scores_window.append(np.max(score))\n",
423 | " \n",
424 | " print('\\rEpisode {}\\tAverage Score: {:.3f}'.format(i_episode, np.mean(scores_window)), end=\"\")\n",
425 | " \n",
426 | " if (i_episode % 100 == 0):\n",
427 | " print(\"\\rEpisode {}\\tAverage Score: {:.3f}\".format(i_episode, \\\n",
428 | " np.mean(scores_window)))\n",
429 | " \n",
430 | "        # Save occasionally \n",
431 | " if (i_episode % config[\"trainer\"][\"save_freq\"] == 0):\n",
432 | " torch.save(agent.actor_local.state_dict(), save_path + \n",
433 | " \"checkpoint_actor_\" + str(i_episode) + \".pth\")\n",
434 | " torch.save(agent.critic_local.state_dict(), save_path + \n",
435 | " \"checkpoint_critic_\" + str(i_episode) + \".pth\")\n",
436 | " \n",
437 | "        # Check if environment solved \n",
438 | " if not flag:\n",
439 | " if (np.mean(scores_window) >= 0.5):\n",
440 | " print(\"\\nEnvironment solved in {:d} episodes!\\tAverage Score: {:.3f}\".format(\n",
441 | " i_episode-100, np.mean(scores_window)))\n",
442 | " # Save solved model \n",
443 | " torch.save(agent.actor_local.state_dict(), save_path + \n",
444 | " \"checkpoint_actor_solved.pth\")\n",
445 | " torch.save(agent.critic_local.state_dict(), save_path + \n",
446 | " \"checkpoint_critic_solved.pth\")\n",
447 | " flag = True\n",
448 | " \n",
449 | " break\n",
450 | " \n",
451 | " return scores\n",
452 | " \n",
453 | "scores = ddpg(agent=agent, \n",
454 | " brain_name=brain_name, \n",
455 | " config=config)"
456 | ]
457 | },
458 | {
459 | "cell_type": "code",
460 | "execution_count": 12,
461 | "metadata": {},
462 | "outputs": [],
463 | "source": [
464 | "env.close()"
465 | ]
466 | },
467 | {
468 | "cell_type": "code",
469 | "execution_count": 13,
470 | "metadata": {},
471 | "outputs": [
472 | {
473 | "data": {
474 | "image/png": "<base64-encoded PNG omitted: plot of the episode score versus episode number during training>",
475 | "text/plain": [
476 | ""
477 | ]
478 | },
479 | "metadata": {
480 | "needs_background": "light"
481 | },
482 | "output_type": "display_data"
483 | }
484 | ],
485 | "source": [
486 | "# plot the scores\n",
487 | "fig = plt.figure(figsize=(16, 12))\n",
488 | "ax = fig.add_subplot(111)\n",
489 | "plt.plot(np.arange(len(scores)), scores)\n",
490 | "plt.xlabel('Episode number')\n",
491 | "plt.ylabel('Score')\n",
492 | "plt.show()"
493 | ]
494 | }
495 | ],
496 | "metadata": {
497 | "kernelspec": {
498 | "display_name": "Python 3",
499 | "language": "python",
500 | "name": "python3"
501 | },
502 | "language_info": {
503 | "codemirror_mode": {
504 | "name": "ipython",
505 | "version": 3
506 | },
507 | "file_extension": ".py",
508 | "mimetype": "text/x-python",
509 | "name": "python",
510 | "nbconvert_exporter": "python",
511 | "pygments_lexer": "ipython3",
512 | "version": "3.6.3"
513 | }
514 | },
515 | "nbformat": 4,
516 | "nbformat_minor": 2
517 | }
518 |
--------------------------------------------------------------------------------
/p3_collab_compet/config.json:
--------------------------------------------------------------------------------
1 | {
2 | "exp_name": "DDPGAgents_exp",
3 | "cuda": true,
4 | "gpu": 0,
5 |
6 | "optimizer_actor": {
7 | "optimizer_type": "Adam",
8 | "betas": [0.9, 0.999],
9 | "optimizer_params": {
10 | "lr": 1e-4,
11 | "eps": 1e-7,
12 | "weight_decay": 0
13 | }
14 | },
15 |
16 | "optimizer_critic": {
17 | "optimizer_type": "Adam",
18 | "betas": [0.9, 0.999],
19 | "optimizer_params": {
20 | "lr": 1e-3,
21 | "eps": 1e-7,
22 | "weight_decay": 0
23 | }
24 | },
25 |
26 | "DDPG": {
27 | "num_agents": 2,
28 | "gamma": 0.99,
29 | "tau": 0.001,
30 | "buffer_size": 10e6
31 | },
32 |
33 | "architecture": {
34 | "fc1_units": 250,
35 | "fc2_units": 100
36 | },
37 |
38 | "trainer" : {
39 | "num_episodes": 15000,
40 | "batch_size": 128,
41 | "save_dir": "./saved/",
42 | "save_freq": 1000
43 | }
44 | }
--------------------------------------------------------------------------------
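
The two optimizer sections above are consumed generically by `DDPGAgents`: the optimizer class is looked up by name with `getattr(optim, ...)` and `optimizer_params` is unpacked as keyword arguments. A minimal sketch of that pattern, assuming it is run from the `p3_collab_compet` folder and using a throwaway linear layer in place of the actor network:

```python
import json
import torch.nn as nn
import torch.optim as optim

with open("config.json", "r") as f:
    config = json.load(f)

net = nn.Linear(24, 2)  # stand-in for the Actor network (state_size=24, action_size=2)

opt_cfg = config["optimizer_actor"]
optimizer = getattr(optim, opt_cfg["optimizer_type"])(  # e.g. optim.Adam
    net.parameters(),
    betas=tuple(opt_cfg["betas"]),
    **opt_cfg["optimizer_params"])  # lr, eps, weight_decay

print(optimizer)  # Adam with lr=1e-4, betas=(0.9, 0.999), eps=1e-7, weight_decay=0
```
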
/p3_collab_compet/images/tennis_gif.gif:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/vmelan/DRLND-udacity/8d2a38d2894b0f692fc6ffbe652eb94440b516de/p3_collab_compet/images/tennis_gif.gif
--------------------------------------------------------------------------------
/p3_collab_compet/report.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/vmelan/DRLND-udacity/8d2a38d2894b0f692fc6ffbe652eb94440b516de/p3_collab_compet/report.pdf
--------------------------------------------------------------------------------
/p3_collab_compet/requirements.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/vmelan/DRLND-udacity/8d2a38d2894b0f692fc6ffbe652eb94440b516de/p3_collab_compet/requirements.txt
--------------------------------------------------------------------------------
/p3_collab_compet/saved/DDPGAgents_exp/checkpoint_actor_solved.pth:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/vmelan/DRLND-udacity/8d2a38d2894b0f692fc6ffbe652eb94440b516de/p3_collab_compet/saved/DDPGAgents_exp/checkpoint_actor_solved.pth
--------------------------------------------------------------------------------
/p3_collab_compet/saved/DDPGAgents_exp/checkpoint_critic_solved.pth:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/vmelan/DRLND-udacity/8d2a38d2894b0f692fc6ffbe652eb94440b516de/p3_collab_compet/saved/DDPGAgents_exp/checkpoint_critic_solved.pth
--------------------------------------------------------------------------------
/p3_collab_compet/utils.py:
--------------------------------------------------------------------------------
1 | import os
2 | import torch
3 |
4 |
5 | def pick_device(config, logger):
6 | """ Pick device """
7 | if config["cuda"] and not torch.cuda.is_available():
8 | 		logger.warning("Warning: There's no CUDA support on this machine, "
9 | "training is performed on cpu.")
10 | device = torch.device("cpu")
11 | elif not config["cuda"] and torch.cuda.is_available():
12 | logger.info("Training is performed on cpu by user's choice")
13 | device = torch.device("cpu")
14 | elif not config["cuda"] and not torch.cuda.is_available():
15 | logger.info("Training on cpu")
16 | device = torch.device("cpu")
17 | else:
18 | logger.info("Training on gpu")
19 | device = torch.device("cuda:" + str(config["gpu"]))
20 |
21 | return device
22 |
23 | def ensure_dir(path):
24 | if not os.path.exists(path):
25 | os.makedirs(path)
--------------------------------------------------------------------------------